EXPEDIA GROUP TECHNOLOGY — ENGINEERING

Merging Code Quickly and Safely in a Monorepo

Building an open source tool to overcome challenges with a monorepo setup

Dan Adajian
Expedia Group Technology
6 min read · Aug 8, 2023



Continuous integration (CI) is critical to ensuring our code works. It offers a source of truth on top of which developers can build new and exciting things. But how well does CI scale? Many large tech organizations boast about using a monorepo to maximize visibility and consolidate dependencies. What we rarely hear about is what developers actually go through to integrate (or merge) their code into such a repository.

I’d like to use an anecdote to explore the tradeoff between merging code quickly and merging it safely, and to show when there is actually no tradeoff between the two at all.

Merging safely

What does merging safely mean? When we merge code into a repository’s main branch, we want to make sure everything works together in perfect harmony. If Developer A merges a change to a file, and Developer B then tries to merge a change to that same file, Git will help us by flagging a merge conflict for Developer B. This forces Developer B to reconcile their change with the change Developer A has already made.

[Figure: Merge conflict. Developer B experiences a merge conflict after Developer A has merged a code change to master.]

But what if Developer B wants to make a change to a different file that would break the CI once Developer A’s change is introduced?

The typical solution when using GitHub would be to enforce a branch protection rule that forces Developer B to incorporate Developer A’s change prior to merging. Since Developer A’s change is already merged in this scenario, we can think of this as requiring all pull requests (PRs) to be up to date with the main branch. This removes the vulnerability of an unsafe merge since each new change must pass the CI when up to date with the main branch, prior to being able to merge.

[Figure: Branch protection rule. Developer B is forced to incorporate Developer A’s change in order to merge to master.]

Merging quickly

Now that we are merging safely, there is a new problem: Getting a PR merged is slow. And if there are tons of developers who want to make changes at once, it becomes even slower!

At Expedia Group™️, we experienced this slowness firsthand with one of our largest codebases, called “shared UI.” This monorepo has over a thousand contributors and dozens of contributions per day. We valued merge safety because we needed the CI to pass all the time in order to be trustworthy. So, to achieve merge safety, we required all PRs to be up to date with the main branch prior to merge.

We typically had upward of 20 PRs that were ready to merge simultaneously, and each PR would run its CI checks once it received the necessary approvals. Whichever PR passed its checks first would get merged, but this immediately blocked every other PR from merging: every open PR was now out of date with the main branch, so developers were forced to update their branch with a new commit, triggering a whole new set of PR checks. This process would repeat for each new PR that got merged, pitting PR authors against each other to ship code!
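To get a feel for how quickly this adds up, here is a back-of-the-envelope sketch in Python. It is a deliberately simplified model, not a measurement from our repo: it assumes that in every round all remaining PRs run their full CI, exactly one of them merges, and all checks pass. The function name is made up for illustration.

```python
def ci_runs_when_racing(ready_prs: int) -> int:
    """Count the CI runs needed to merge every PR under the simplified model.

    Assumptions (illustrative only): in each round every remaining PR runs
    its checks, exactly one PR merges, and the others must update with the
    main branch and run their checks again.
    """
    total_runs = 0
    remaining = ready_prs
    while remaining > 0:
        total_runs += remaining  # every PR still open runs its checks this round
        remaining -= 1           # one PR wins the race and merges
    return total_runs


print(ci_runs_when_racing(20))  # 20 + 19 + ... + 1 = 210
```

Under these assumptions, 20 ready PRs cost 210 CI runs instead of 20, and the last author in line reruns their checks 20 times.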

[Figure: The battle to merge. Many PRs struggle to merge simultaneously and are constantly forced to update with the main branch.]

So, we implemented a merge queue (similar to GitHub’s merge queue feature) that forced all PRs to merge one at a time, one after the other. Each time a PR was merged, the next PR in line would be forced to update with the main branch and run all the CI checks. Once the checks passed, the PR could merge; and the process would repeat with the next PR in the queue.

[Figure: Merge queue. Multiple PRs wait in a queue, each having to incorporate and test against every new change.]

However, what happened if the checks failed for the first PR in the queue? In this case, every other PR in the queue was blocked until either the test failure was resolved, or until the PR author removed themselves from the queue entirely. Sometimes this happened quickly, but more often than not, the PR author was offline or just forgot to watch their CI checks. And nobody wanted to be the merge queue monitor because that is a very boring job! As a result, the size of the merge queue became enormous, and it sometimes took days for PRs to get merged. We were still merging safely, but definitely not quickly.
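The blocking behavior is easy to see in a toy model. The Python sketch below is not our queue implementation; it assumes a strictly ordered queue in which each entry is a hypothetical (PR name, checks pass) pair, and it stops at the first PR whose checks fail.

```python
from collections import deque
from typing import Deque, List, Tuple


def drain_queue(queue: Deque[Tuple[str, bool]]) -> Tuple[List[str], List[str]]:
    """Merge PRs strictly in order and stop at the first failing PR.

    Each entry is (pr_name, checks_pass). Returns (merged, still_waiting).
    """
    merged: List[str] = []
    while queue:
        name, checks_pass = queue[0]
        if not checks_pass:
            break  # the head of the queue blocks everyone behind it
        queue.popleft()
        merged.append(name)
    return merged, [name for name, _ in queue]


queue = deque([("pr-1", True), ("pr-2", False), ("pr-3", True), ("pr-4", True)])
merged, waiting = drain_queue(queue)
print(merged)   # ['pr-1']
print(waiting)  # ['pr-2', 'pr-3', 'pr-4']
```

Here pr-3 and pr-4 are perfectly healthy, yet they cannot merge until someone fixes pr-2 or removes it from the queue.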

Merging safely and quickly

After months of trying to manage and mitigate the problem, we realized something: Our repo is made up of loosely coupled modules. And in many cases, these modules are fully decoupled from each other. This means when we make a change to one module, the change is totally irrelevant to many other modules in the repo, and could never break any of these modules as a result. The vast majority of changes to this repo are to individual packages or to small groups of packages. So why were we blindly forcing every single change to incorporate the previous one, even if some didn’t actually need this?

We discovered we could make our merge process far more intelligent by simply inspecting the nature of each prospective change. Rather than assuming any change could break anything, we should:

  • compile a list of all the modules that the change could possibly break
  • allow the branch to merge if those modules are up to date
  • prevent the branch from merging if those modules are out of date

Thus, the merge safety check was born!
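As a rough sketch of that rule in Python: the helper names, the module paths, and the file names below are all hypothetical, and the sketch approximates “modules the change could possibly break” as “modules the change touches,” ignoring any dependencies between modules.

```python
from typing import Sequence, Set


def modules_touched(files: Sequence[str], module_paths: Sequence[str]) -> Set[str]:
    """Return the module paths that contain at least one of the given files."""
    return {
        module
        for module in module_paths
        for file_path in files
        if file_path.startswith(module.rstrip("/") + "/")
    }


def can_merge(
    pr_files: Sequence[str],
    files_changed_on_main: Sequence[str],
    module_paths: Sequence[str],
) -> bool:
    """Allow the merge only if no module the PR touches is out of date."""
    changed_modules = modules_touched(pr_files, module_paths)
    outdated_modules = modules_touched(files_changed_on_main, module_paths)
    return changed_modules.isdisjoint(outdated_modules)


MODULES = ["packages/button", "packages/map"]  # hypothetical module layout

# The PR touches the button module; main has only moved ahead in the map module.
print(can_merge(["packages/button/index.ts"], ["packages/map/Map.tsx"], MODULES))  # True

# The PR touches the button module, and main has also changed that module.
print(can_merge(["packages/button/index.ts"], ["packages/button/styles.css"], MODULES))  # False
```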

[Figure: Merge safety check. Four PRs are able to merge independently of one another.]

How it works

The merge safety check is a custom GitHub Action that we have added to our open-sourced suite of GitHub helpers. Here’s how it works:

First, you provide some inputs. One important input is paths, which lists the file paths of each decoupled module in your project. Another useful input is override_filter_paths, which lists paths that, if changed by a PR, should fail the check no matter what and force an update with the main branch (typically things like dependency management files or global shared directories).

When running on a PR, the helper first grabs the PR’s changed files using the GitHub API. Then, it uses the GitHub API again to determine which files have changed on the main branch that the PR’s branch has not yet incorporated (the files on which the branch is out of date). Finally, it checks whether any of the PR’s changed files appear in that list of outdated files. If any of them do, it fails the check and prevents the PR from merging; otherwise, it passes!
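The core decision can be sketched in a few lines of Python. This is an illustration of the logic described above, not the helper’s actual source: the two GitHub API calls are replaced by plain lists, the function names and return shape are invented, and the sketch assumes that a branch with no outdated files passes even when it touches an override path (since it has nothing left to update).

```python
from typing import Sequence, Tuple


def _is_under(file_path: str, root: str) -> bool:
    """True if file_path is root itself or lives inside the root directory."""
    root = root.rstrip("/")
    return file_path == root or file_path.startswith(root + "/")


def merge_safety_check(
    changed_files: Sequence[str],
    outdated_files: Sequence[str],
    override_filter_paths: Sequence[str] = (),
) -> Tuple[bool, str]:
    """Return (passes, reason) for a PR.

    changed_files: files the PR changes (stand-in for the first API call).
    outdated_files: files changed on main that the branch lacks (second call).
    """
    if not outdated_files:
        return True, "branch is up to date with main"

    touched_overrides = sorted(
        f for f in changed_files
        if any(_is_under(f, root) for root in override_filter_paths)
    )
    if touched_overrides:
        return False, "touches override path: " + ", ".join(touched_overrides)

    stale = sorted(set(changed_files) & set(outdated_files))
    if stale:
        return False, "changed files are outdated: " + ", ".join(stale)

    return True, "no changed file is outdated"


ok, reason = merge_safety_check(
    changed_files=["packages/button/index.ts"],
    outdated_files=["packages/map/Map.tsx"],
    override_filter_paths=["package.json", "shared"],
)
print(ok, reason)  # True no changed file is outdated
```

Returning a reason alongside the boolean is a design choice of this sketch; it makes it easy to tell a PR author why they need to update their branch.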

With this tool, we are now able to efficiently prevent out-of-date branches from merging without making pull requests wait on one another. Each author can merge their PR whenever they want, just as long as the files they are touching are sufficiently up to date!

Conclusion

At Expedia Group, we feel strongly about empowering our developers to ship code fast and fearlessly, and our merge safety check has certainly done that so far. I am excited to see the progress we make while equipped with this newfound speed and confidence, and I would encourage anyone using GitHub Actions in a monorepo to give this helper a try!

Learn more about life at Expedia Group
