Monorepos in Python
Monorepos work differently from the usual one-repository-per-project setup. There you create one repository (or a few) for your entire company. It might have hundreds or thousands of employees working on multiple projects within that single repo. Famously, Google, Meta, Microsoft, and Airbnb all employ very large monorepos with varying strategies of coordination.
On this episode, we have David Vujic here to give us his perspective on monorepos as well as highlight an architectural pattern and set of tools for accomplishing this in Python.
Episode Deep Dive
Guests introduction and background
David Vujic is a seasoned Python developer and contributor to open-source projects, particularly around monorepo tooling. He has worked in various teams and companies spanning design, web, Clojure, and Python back-ends. His expertise in functional programming and architecture led him to explore how monorepos could be effectively managed in Python, in particular by adapting the Polylith architecture from Clojure to the Python ecosystem. David has a natural curiosity for how code can be structured, deployed, and integrated seamlessly across multiple projects.
What to Know If You’re New to Python
If you’re just getting started with Python and want to understand monorepos and advanced architecture patterns, the core language concepts such as modules, packages, and virtual environments will really help. Knowing how Python imports and organizes files on disk, as well as having some experience working with pip or poetry, will make following discussions about managing dependencies and building code in a monorepo setting much clearer.
It’s also helpful to have a basic understanding of source control (especially Git) so you can relate to the conversation about partial clones, shallow clones, and other advanced Git operations.
Key points and takeaways
1. Why Monorepos Matter
Monorepos gather all of a team’s or company’s code into a single repository, providing a unified way to manage shared libraries, dependencies, and inter-project changes. This is notably different from housing every service or library in its own isolated repo. The conversation highlighted major tech companies like Google, Meta, Microsoft, and Airbnb using giant monorepos successfully and how they handle massive scale.
2. Microservices vs. Monolith vs. Monorepo
Monolithic applications bundle all functionality into one application that is deployed as a unit, whereas microservices split that functionality into standalone services. Monorepos, however, are about unified code storage rather than single or multiple deployments. The discussion clarified that having a monorepo does not mean you are building a monolithic application: small services can still exist but share a single source-control home.
- Links and tools:
- GitHub (for hosting many monorepos)
3. The Polylith Architecture for Python
Originally popularized in the Clojure world, Polylith organizes code into small “components” that can be composed into different “projects” and “bases.” Components are reusable Lego-like bricks (functions or sets of related functions) so that you can share them across multiple deployments in one repository. This architecture encourages a clean separation of concerns and drastically reduces code duplication.
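As a minimal sketch of the component and base idea, assuming hypothetical component names and function signatures (none of them come from the episode), the following shows two small function-based components and a thin base that composes them. In a real Polylith workspace these would live in separate directories; they are collapsed into one file here so the sketch runs on its own.

```python
# Hypothetical "greeting" component: plain functions, no framework code.
def make_greeting(name: str) -> str:
    """Return a greeting for the given name."""
    cleaned = name.strip() or "world"
    return f"Hello, {cleaned}!"


# Hypothetical "log" component, reusable by any base.
def format_log(event: str, detail: str) -> str:
    """Return a single log line for an event."""
    return f"[{event}] {detail}"


# Hypothetical base: the thin entry point that composes components.
# Another base (for example a CLI) could reuse the same two components.
def handle_request(payload: dict) -> dict:
    """Build a response by combining the two components above."""
    message = make_greeting(payload.get("name", ""))
    return {"message": message, "log": format_log("greet", message)}


if __name__ == "__main__":
    print(handle_request({"name": "David"}))
```

The point of the split is that the base holds only wiring, so a new service can be assembled from existing components instead of copying their code.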
4. Poetry Polylith Plugin
David built a plugin for Poetry (the Python package and dependency manager) to handle monorepos in a Polylith style. This plugin helps with building artifacts (wheels and source distributions) that assemble only the relevant “components” per project without dragging the entire monorepo along. It also includes CLI commands to display a workspace overview, show dependency usage, and more.
5. Partial Clones, Shallow Clones, and Sparse Checkouts
Managing large repositories often requires advanced Git operations. Partial clones (git clone --filter=blob:none), shallow clones (git clone --depth=1), and sparse checkouts let developers avoid downloading unneeded history or directories. This can speed up CI builds and make local development more efficient.
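The three Git techniques above can be sketched as small Python helpers that build the corresponding commands. The helper names, the repository URL, and the directory names are assumptions made for illustration; the Git flags themselves (--filter=blob:none, --depth, and the sparse-checkout set subcommand) are standard Git options.

```python
import subprocess


def partial_clone_cmd(url: str, dest: str) -> list[str]:
    """Clone the full commit history but fetch file contents (blobs) lazily."""
    return ["git", "clone", "--filter=blob:none", url, dest]


def shallow_clone_cmd(url: str, dest: str, depth: int = 1) -> list[str]:
    """Clone only the most recent `depth` commits."""
    return ["git", "clone", f"--depth={depth}", url, dest]


def sparse_checkout_cmd(directories: list[str]) -> list[str]:
    """Limit the working tree to the given directories (run inside the clone)."""
    return ["git", "sparse-checkout", "set", *directories]


def run(cmd: list[str], cwd: str = ".") -> None:
    """Execute a command and raise if Git exits with an error."""
    subprocess.run(cmd, cwd=cwd, check=True)


# Example usage (hypothetical URL and paths, not executed here):
# run(partial_clone_cmd("https://example.com/company/monorepo.git", "monorepo"))
# run(sparse_checkout_cmd(["components/greeting", "bases/api"]), cwd="monorepo")
```

Building the argument lists separately from running them keeps the commands easy to inspect or log in a CI script before anything touches the network.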
6. Versioning in Monorepos
A key advantage of monorepos is the ability to synchronize changes across projects. Rather than tagging multiple separate repos, you maintain a single version history. Changes affecting common libraries are caught in continuous integration, ensuring no project is left behind on incompatible versions. Still, teams can isolate older components or create backward-compatible new ones if the entire codebase can’t be updated immediately.
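One way a CI job might catch changes to shared code is to work out which projects use a changed component and test only those. The sketch below is an illustration under stated assumptions: the project names, the component names, and the components/<name>/ directory layout are all hypothetical, and real tooling would derive the mapping from the workspace rather than hard-code it.

```python
# Hypothetical mapping of deployable projects to the components they use.
PROJECT_COMPONENTS = {
    "billing_api": {"auth", "invoices", "log"},
    "report_job": {"invoices", "log"},
    "signup_service": {"auth", "email"},
}


def changed_components(changed_files: list[str], root: str = "components") -> set[str]:
    """Extract component names from paths such as components/<name>/core.py."""
    names = set()
    for path in changed_files:
        parts = path.split("/")
        if len(parts) >= 2 and parts[0] == root:
            names.add(parts[1])
    return names


def affected_projects(changed_files: list[str]) -> list[str]:
    """Return the projects that use at least one changed component."""
    changed = changed_components(changed_files)
    return sorted(
        project
        for project, components in PROJECT_COMPONENTS.items()
        if components & changed
    )


if __name__ == "__main__":
    print(affected_projects(["components/auth/core.py", "README.md"]))
```

Because every project lives in the same repository, a single commit that changes a component can trigger tests for each project listed, which is how an incompatible change surfaces before it is merged.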
7. Dependency Management for Multiple Apps
With monorepos, you can standardize dependency versions across many services while still selectively installing or building only what each app needs. Tools like Poetry or Pants can define per-project constraints without losing track of shared modules. Properly structuring your code means minimal duplication and robust update paths.
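To make the idea of standardized versions concrete, here is a small sketch that reports packages pinned to different versions in different projects. It is not how Poetry or Pants work internally; the function name, project names, and version pins are illustrative assumptions, and in practice the pins would be read from each project's configuration.

```python
from collections import defaultdict


def find_version_conflicts(
    projects: dict[str, dict[str, str]],
) -> dict[str, dict[str, list[str]]]:
    """Return packages that are pinned to more than one version.

    The result maps package -> version -> projects using that version.
    """
    usage: dict = defaultdict(lambda: defaultdict(list))
    for project, pins in projects.items():
        for package, version in pins.items():
            usage[package][version].append(project)
    return {
        package: {version: sorted(names) for version, names in versions.items()}
        for package, versions in usage.items()
        if len(versions) > 1
    }


if __name__ == "__main__":
    # Hypothetical per-project pins, for illustration only.
    pins = {
        "billing_api": {"requests": "2.31.0", "pydantic": "2.6.0"},
        "report_job": {"requests": "2.31.0"},
        "signup_service": {"requests": "2.28.2", "pydantic": "2.6.0"},
    }
    print(find_version_conflicts(pins))
```

A check like this could run in CI so that version drift between apps is flagged as soon as it is introduced.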
8. Editor and Dev Environment Setup
Monorepos can overwhelm some editors if they index an entire codebase. Techniques like partial clones and dedicated workspace settings help. David emphasized using advanced editors such as PyCharm or Emacs configured for large Python projects, which automatically detect how files interrelate and provide better refactoring tools.
9. Polylith vs. Other Solutions
From user feedback, some companies attempt submodules or other microservice patterns to share code, but these can become complicated to maintain. Polylith aims to strike a balance between microservices and monolithic structures by emphasizing composability. While other approaches like Nx or Lerna (in the JavaScript world) exist, Polylith’s method is particularly Python-friendly and encourages pure Python packaging standards.
10. Functional Programming Influences on Python Architecture
David’s background with Clojure and functional programming shaped how he sees code organization. Concepts like stateless functions, minimized shared mutable state, and building up applications from “pure” components work cleanly even in Python. Many monorepo best practices, such as small, composable blocks, mirror functional programming approaches.
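A short sketch of what a stateless, "pure" building block can look like in Python follows. The Cart type and add_item function are hypothetical names chosen for illustration; the technique shown is simply an immutable dataclass plus a function that returns a new value instead of mutating shared state.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cart:
    """Immutable value object: fields cannot be reassigned after creation."""
    items: tuple[str, ...] = ()


def add_item(cart: Cart, item: str) -> Cart:
    """Return a new cart with the item appended; the original is untouched."""
    return replace(cart, items=cart.items + (item,))


if __name__ == "__main__":
    empty = Cart()
    full = add_item(empty, "book")
    print(empty, full)
```

Functions written this way depend only on their arguments, which makes them easy to test in isolation and safe to reuse across several projects in the same repository.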
Interesting quotes and stories
“If you're going to build a new service, you just pick the components you already have, combine them in a base, and that’s it. You don’t need to copy-paste code anymore.” – David explaining how Polylith drastically reduces duplication in monorepos.
“People often confuse the idea of a monorepo with a monolith, but they’re not the same at all.” – Emphasizing that a single repository can still release multiple independent services or microservices.
“When you want to change a function signature, you can instantly see every place it's used. That's the beauty of a monorepo.” – Showing how easy cross-project refactoring can be in a unified codebase.
Key definitions and terms
- Monorepo: A single repository containing multiple projects or services, often with shared libraries.
- Monolith vs. Microservices vs. Monorepos: Monoliths bundle all features into one deployment; microservices break them into smaller deployable units; monorepos are about shared source control rather than deployment boundaries.
- Polylith: A software architecture approach using reusable “components” that can be composed into multiple “projects” and “bases,” especially effective in monorepos.
- Partial Clone: A Git clone that omits certain file content (like large blobs in history) until needed.
- Shallow Clone: A Git clone with limited commit history (e.g., --depth=1 for just the latest commit).
- Sparse Checkout: A Git feature allowing you to check out only specific directories or files from a repository.
Learning resources
If you want to go deeper into Python fundamentals, monorepos, and code organization:
- Python for Absolute Beginners – Great if you’re new to Python and want to quickly become confident writing packages and modules.
- Up and Running with Git – Master Git concepts such as partial clones, shallow clones, and advanced merges—all crucial for large monorepos.
- Poetry Documentation – Official docs for managing Python dependencies and packaging.
- Polylith Docs – Detailed explanation of the Polylith architecture approach and tooling.
Overall takeaway
Monorepos offer significant advantages for teams that need to share code, maintain consistent dependencies, and refactor quickly. While many think monorepos imply one big monolithic app, the conversation clarifies that it’s more about unified code storage than deployment style. Tools like Polylith, partial/sparse Git clones, and plugins for Poetry can streamline the developer experience even in large codebases. The episode underscores how small, composable building blocks, plus automated checking and CI, enable a cleaner, more maintainable Python codebase—regardless of how many apps you ultimately deploy.
Links from the show
David on Mastodon: @davidvujic@mastodon.nu
Monorepo definition: wikipedia.org
git-sizer tool for large repos: github.com
git partial clones: docs.gitlab.com
git sparse checkout: git-scm.com
Polylith architecture: polylith.gitbook.io
Article: A simple & scalable Python project structure: davidvujic.blogspot.com
The last Python Architecture you will ever need?: davidvujic.blogspot.com
python-polylith plugin for poetry: github.com
Watch this episode on YouTube: youtube.com
Episode transcripts: talkpython.fm