I just finished watching Why Google Stores Billions of Lines of Code in a Single Repository and honestly, while it looks intriguing, it also looks horrible.
Have you run into issues? Did you love it? How was it?
Thanks for the insight. Are there any tools that you used at your company that you’d recommend? Did you encounter any open-source CI/CD for monorepos that worked?
I discovered JOSH, which was intriguing to put in front of existing source forges, but I don’t know of any source forges that support monorepos by design. GitHub and GitLab are multirepo for sure, and shoehorning a monorepo into that, like Nix did with nixpkgs, is cumbersome.
We use them at Meta. It’s easier to interact with other parts of the codebase, but it doesn’t play well with libraries so you end up redoing a lot of stuff in-house.
I would only recommend a monorepo if you’re a company with at least 5,000+ engineers and can dedicate significant time to internal infra.
I would only recommend a monorepo if you’re a company with at least 5,000+ engineers and can dedicate significant time to internal infra.
It’s funny because at least one FANG company does not use a monorepo and has no problem doing without one, in spite of being at the same scale as Facebook or perhaps even larger.
I wonder why anyone would feel compelled to suggest adopting a monorepo in a setting that makes them far harder to use and maintain.
Is it Amazon because they did a really good job at keeping teams separate (via APIs)?
it doesn’t play well with libraries
What do you mean by that? Is it that versioning of libraries isn’t possible, meaning an update to the interface requires updating all dependent apps/libs?
Updating a library in a monorepo means copying it all over and hoping the lib update didn’t break someone else’s code, whereas updating a library normally would never break anything, and you can let people update on their own cadence.
I set up a monorepo that had a library used by several different projects. It was my first foray into DevOps and we had this problem.
I decided to version and release the library whenever a change was merged to it on the trunk. Other projects would depend on one of those versions and could be updated at their own pace. There was a lot of hidden complexity and many gotchas, so we needed some rules to make it functional. It worked well once those were sorted out.
One rule we needed was that changes to the library had to be merged and released prior to any downstream project that relied on those changes. This made a lot of sense from certain perspectives, but it annoyed developers. They couldn’t simply open a single PR containing both changes. This had a huge positive impact on the codebase over time IMO, but that’s a different story.
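To make that rule concrete, here is a rough sketch of the kind of CI check it implies. It is an illustration, not the actual setup: the `Pin` type, the `unreleased_pins` helper, and the project and library names are all made up, and it assumes each project pins an exact library version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pin:
    """A downstream project's dependency on one exact library version (hypothetical)."""
    project: str
    library: str
    version: str


def unreleased_pins(pins, released):
    """Return the pins that point at a library version that has not been released yet.

    `released` maps a library name to the set of versions already published.
    """
    return [p for p in pins if p.version not in released.get(p.library, set())]


# Illustrative data only: corelib 1.2.0 has been written but not released.
released = {"corelib": {"1.0.0", "1.1.0"}}
pins = [
    Pin("billing", "corelib", "1.1.0"),
    Pin("search", "corelib", "1.2.0"),
]

for pin in unreleased_pins(pins, released):
    print(f"{pin.project} pins {pin.library}=={pin.version}, which is not released yet")
```

A check like this would fail the downstream PR until the library release exists, which is exactly why a single PR touching both sides could not pass.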
How is it done at Meta? Always compile and depend on latest? Is the library copied into different projects, or did you just mean you had to update several projects whenever the library’s interfaces changed?
I think it mostly has to do with how coupled your code modules are. If you have a lot of tightly coupled modules/libraries/apps/etc, then it makes sense to put them in the same repo so that changes that ultimately have a large blast radius can be handled within a single repo instead of spanning many repos.
And that’s just a judgement call based on code organization and team organization.
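One way to put a number on "blast radius" is to walk the dependency graph backwards from the module that changed. The following is a minimal sketch under assumptions: the module names and the `deps` mapping are invented for illustration, and a real build system would supply this graph itself.

```python
from collections import deque


def blast_radius(deps, changed):
    """Return every module that depends, directly or transitively, on `changed`.

    `deps` maps each module to the list of modules it depends on.
    """
    # Invert the graph: for each module, who depends on it?
    dependents = {}
    for module, requirements in deps.items():
        for req in requirements:
            dependents.setdefault(req, set()).add(module)

    # Breadth-first walk over the inverted edges.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


# Hypothetical module layout.
deps = {
    "app_web": ["lib_auth", "lib_ui"],
    "app_admin": ["lib_auth"],
    "lib_auth": ["lib_core"],
    "lib_ui": [],
    "lib_core": [],
}

print(sorted(blast_radius(deps, "lib_core")))  # ['app_admin', 'app_web', 'lib_auth']
```

If most changes come back with a result that spans several would-be repositories, that is a hint the modules are coupled tightly enough to live together.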
I’m inclined to interpret monorepos as an anti-pattern intended to mask fundamental problems in the way an organization structures its releases and dependency management.
It all boils down to being an artificial versioning constraint at the expense of autonomy and developer experience.
Huge multinationals don’t have a problem in organizing all their projects as independent (and sometimes multiple) source code repositories per project. What’s wrong with these small one-bus software shops that fail to do that when they operate at a scale that’s orders of magnitude smaller?
I’ve been a big fan of monorepos because it leads to more consistent style and coding across the whole company. It makes the code more transparent so you can see what’s going on with the rest of the company, too, which helps reduce code islands and duplicated work. It enables me to build everything from source, which helps catch bugs that would only show up in prod due to version drift. It also means that I can do massive refactorings across the company without breaking anything.
That said, tooling is slowly improving for decentralized repos, so some of these may be doable on git now/soon.
(…) you can see what’s going on with the rest of the company, too.
That’s a huge security problem.
Edit: for those who are downvoting this post, please explain why you believe that granting anyone in the organization full access to all the projects used across all organizations does not represent a security problem.