If you've been in software development for a while, you've likely come across several people, both online and in the real world, advocating for this or that branching strategy. "GitFlow is the only way!" says one developer. "The GitHub flow is more than enough for a typical software team," says another. The number of such strategies seems to have been going up steadily over the last few years, and so has the number of their supporters. All of that discussion is useless, though, if a critical definition is missing: What is a branching strategy? And why do you need one?
As it turns out, it all comes down to economics. Lower the cost of something, and people will do it more often. That's Economics 101. Apply that to our context, and one implication is that version control systems that make branching and merging easier will encourage developers to use more branches. Whether this is a good or a bad thing depends on several factors. What really matters is that you make an informed decision. You might use a few branches or a lot of them, but that decision must come from a sound branching strategy.
Branching in Git: An Overview
Git has become the de facto standard for version control systems in the software industry. Like it or hate it, the tool created by Linus Torvalds doesn't seem to be going anywhere soon. So it makes sense for us to kick off the post by doing a quick overview of branching (and merging) from Git's perspective. Besides, if it weren't for Git and other decentralized version control systems, you probably wouldn't be reading this post right now. If "branching strategy" has become a topic people argue about, that's due to the popularity of systems that make the whole branching/merging business way cheaper than it was before. So, branching in Git. What's so special about it? What makes it so easy? Keep reading to find out.
At its core, Git is a simple mechanism. It's just a bunch of objects that have special relationships with one another. The most critical Git objects are the blob, the tree, and the commit. For brevity's sake, let's focus just on the commit here. So, what's a commit in Git? Think of it as a sort of archive or snapshot of the state of the files in a repository at a given instant. Sure, there's more to it than this. There's also associated metadata, such as the identities of the committer and the author (they're not always the same) and the timestamp for the commit.
But one commit alone isn't that useful. What really makes commits valuable is their relationships to other commits. Commits point to other commits, which are called parents. A commit can have zero, one, or more parents, with one being the most common number. A commit with no parents is a root commit. A commit with more than one parent is a merge commit. Finally, a commit with a single parent is a regular commit. I hope that's clear. But in case it's not, let's put it as explicitly as possible: commits are a very big deal in Git. With that out of the way, we're ready for the next section.
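To make the parent relationships concrete, here is a minimal Python sketch. It is a toy model and not Git's actual object format: the Commit class, its fields, and the kind helper are names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Commit:
    """Toy stand-in for a Git commit: a snapshot, some metadata, and parent links."""
    snapshot: str                      # stands in for the state of the files
    author: str
    committer: str                     # not always the same person as the author
    parents: Tuple["Commit", ...] = ()


def kind(commit: Commit) -> str:
    """Classify a commit by how many parents it has."""
    if len(commit.parents) == 0:
        return "root"
    if len(commit.parents) == 1:
        return "regular"
    return "merge"


# A tiny hypothetical history: one root, two commits built on it, one merge.
root = Commit("v1", "ann", "ann")
fix = Commit("v2", "bob", "ann", parents=(root,))
feature = Commit("v2-feature", "ann", "ann", parents=(root,))
merged = Commit("v3", "ann", "ann", parents=(fix, feature))

print([kind(c) for c in (root, fix, merged)])
```

Following the parents field from any commit walks back through history, which is all that "history" means in this model.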
Branches Are Just Names for Commits
Summarizing the last section, you've learned that a Git commit is an object that stores a snapshot of the state of a bunch of files in a given instant in time. The commit also points to a number of other commits, which are its parents. Those relationships generate the notion of history and revision. That's pretty much the reason why we care about all of this. So, where do branches fit in this picture? People use all sorts of descriptions to try and explain what a branch is. Sometimes those descriptions do more harm than good, making it sound like the concept is way harder than it really is. So, here comes the easiest Git branch definition you're bound to come across: a branch in Git is just a named reference to a commit. That's it. With that in mind, it becomes easier to understand why branching and merging are usually way easier in Git than in centralized tools. Branch creation is a trivial process. It doesn't involve creating a new folder and copying every single file in the project to this folder. The whole process amounts to creating a new named reference—which, in practice, is an extremely lightweight text file. Merges also become cheaper under this model. The most common type of merge, called fast-forward, really amounts to updating the current branch to point to a different commit. It's an incredibly fast operation.
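The idea that a branch is only a name pointing at a commit can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how Git is implemented: the commit ids, the parents and branches dictionaries, and the helper functions are all hypothetical.

```python
# Each commit id maps to the ids of its parents (a toy, linear history).
parents = {
    "a1": [],
    "b2": ["a1"],
    "c3": ["b2"],
}

# A branch is just a name that points at one commit.
branches = {"master": "b2"}


def create_branch(name: str, start: str) -> None:
    """Creating a branch only records a new name; no files are copied."""
    branches[name] = branches[start]


def is_ancestor(older: str, newer: str) -> bool:
    """Return True if `older` is reachable from `newer` by following parents."""
    stack, seen = [newer], set()
    while stack:
        commit = stack.pop()
        if commit == older:
            return True
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return False


def fast_forward(target: str, source: str) -> bool:
    """Move `target` to `source`'s commit when no real merge is needed."""
    if is_ancestor(branches[target], branches[source]):
        branches[target] = branches[source]
        return True
    return False  # histories diverged; a merge commit would be required


create_branch("feature", "master")  # feature now points at b2
branches["feature"] = "c3"          # pretend a commit was made on feature
print(fast_forward("master", "feature"), branches)
```

In this model, both creating a branch and fast-forwarding one are single dictionary updates, which mirrors why the text calls these operations cheap.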
What Is a Branching Strategy? Why Would You Need One?
As we've seen, in Git's implementation, commits are a big deal. Branches aren't that big of a deal: they're just named references to commits. Under this model, branching becomes cheaper. And as is bound to happen, once the price of something goes down, the demand for it goes up. Left unchecked, branching can get out of hand. We need an orderly, controlled way of dealing with branches, and that's precisely what a branching strategy is. It's a set of rules and conventions that stipulate:
When a developer should branch
From which branch they should branch off
When they should merge back
To which branch they should merge back
It might not sound like much, but try coming up with a workflow that covers the four points above while keeping it easy to understand, use, and teach to others. That's no easy feat!
Overview of Branching Strategies
We've covered a lot of ground up to this point. With the definition in place, let's walk through some of the most common branching strategies.
Branching Strategy #0: Centralized Workflow
You might think it doesn't make much sense to use a decentralized version control system to organize a centralized workflow. And you might have a point. But as it turns out, a more centralized approach appeals to a large portion of developers, especially those with experience in tools such as Subversion. Like Subversion, this workflow employs a central repository that serves as the authoritative source for the current good state of the application. Forget about "trunk." In Git, the default branch has traditionally been called master (newer setups often name it main). Under Git's Centralized Workflow, all changes are committed to the master branch.
Branching Strategy #1: GitHub Flow
The GitHub flow is, unsurprisingly, the branching strategy favored at GitHub. It proposes a set of simple rules that must be followed:
Code in master is deployable at all times.
When you want to start working on a new task, create a new branch off of master and give it a descriptive name.
Commit to that branch locally and regularly send your work to the same-named branch on the server.
Open a pull request when you feel your changes are ready to be merged (or even if you aren't so sure, but would like some feedback).
After the new feature is reviewed and approved, you can merge it into master.
Once your changes are merged and pushed to master, you can and should deploy immediately.
According to the GitHub flow, before you start working on something new, be it a bug fix or a new feature, you should create a new branch off of the master branch and give it a nice, descriptive name. You then start working on your task, adding commits to your newly created branch as you go. You should also continuously push your commits to the branch on the server with the same name. When you think the branch is ready to be merged, you open a pull request. After at least one other person has reviewed and approved your changes, they're ready to be merged into the master branch. The GitHub flow is also known for encouraging continuous delivery (CD). As soon as your changes are merged, you should deploy to production.
Branching Strategy #2: GitFlow
The previous section covered GitHub flow, which is supposed to be a very lightweight branching strategy for Git. You can't say the same about GitFlow. It lies on the opposite end of the spectrum, being a more heavyweight process. So, how does GitFlow work? In short, GitFlow relies on two long-lived branches and some short-lived ones. The permanent ones are good old "master" and the new kid on the block, "develop." The state of "master" should always be pristine; it reflects the last "good," stable version that's in production. "Develop," on the other hand, is always potentially unstable. This is the branch where, well, development happens. And how does it happen? Through the supporting branches, which fall into one of the three following categories:
Feature branches are the ones developers create to work on new features. They should always branch off "develop." After the feature is complete, the developer should merge it back into "develop."
Release branches allow for the preparation of a new release. They let the developer perform minor bug fixes and prepare metadata for the release. Since this work is done in a separate branch, "develop" is free to receive features intended for the next release. When the release branch is stable enough to ship, it should be merged into "master," and the resulting commit on "master" should be tagged with the version number so it can be easily found in the future. The release branch should also be merged back into "develop" so the fixes made on it aren't lost.
Hotfix branches are also meant to prepare for a release in production. The difference, though, is that this time the release wasn't planned. Instead, it's due to necessity: a critical bug in production that must be dealt with swiftly. The idea is that work on new features can continue as usual while, at the same time, someone is preparing a fix for the critical production problem. Hotfix branches should be created from "master," since that branch reflects the current state of the application in production. Once the fix is done and ready to go, the branch should be merged into "master." But don't just merge it into "master": it's also vital that it's merged into "develop," because future releases will need those corrections as well.
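GitFlow's answers to "where do I branch from?" and "where do I merge back?" can be written down as a small lookup table. The Python sketch below is one possible encoding, not part of any GitFlow tool. The BranchRule type, the rule_for helper, and the "feature/", "release/", "hotfix/" name prefixes are assumptions made for illustration. It follows the usual GitFlow description, in which features return to develop and both release and hotfix branches are merged into master as well as develop.

```python
from typing import Dict, List, NamedTuple


class BranchRule(NamedTuple):
    branch_from: str        # branch to create the supporting branch from
    merge_into: List[str]   # branches that must receive the finished work


# One rule per kind of supporting branch.
GITFLOW: Dict[str, BranchRule] = {
    "feature": BranchRule("develop", ["develop"]),
    "release": BranchRule("develop", ["master", "develop"]),
    "hotfix": BranchRule("master", ["master", "develop"]),
}


def rule_for(branch_name: str) -> BranchRule:
    """Look up the rule from an assumed 'kind/description' branch name."""
    kind = branch_name.split("/", 1)[0]
    if kind not in GITFLOW:
        raise ValueError(f"not a GitFlow supporting branch: {branch_name}")
    return GITFLOW[kind]


rule = rule_for("hotfix/login-crash")
print(rule.branch_from, rule.merge_into)
```

Writing the rules down this way also makes the cost of the strategy visible: three kinds of branches, two sources, and two merge targets to remember.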
Branching Strategy #3: The Forking Workflow
Under the forking workflow, each developer has two Git repositories: a local one and a server-side one. If you're used to contributing to open-source software projects, you're probably familiar with this strategy. Its main benefit is that contributions can be integrated without everybody pushing to a single central repository. Developers push to their own server-side repositories, and only the project maintainer can push to the official repository. That way, the maintainer can accept commits from any developer without granting them write access to the official repository.
To start working under the forking workflow, a developer typically forks the "true" repository to create a copy of it on the server. This copy serves as their personal public repository: no one else can push to it, but others can pull changes from it. Once the server-side copy exists, the developer runs "git clone" to get a local copy of it. When developers want to publish a local commit, they push it to their own repository instead of the official one. Then they submit a pull request to the official repository, which lets the project maintainer know that an update is ready to be integrated. The pull request can also serve as a discussion thread if there are problems with the contributed code.
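The permission structure behind the forking workflow can be illustrated with a toy Python model. It is only a sketch: the Repo class, its push and fork methods, and the tuple used as a "pull request" are hypothetical stand-ins, not the API of Git or of any hosting service.

```python
class Repo:
    """Toy hosted repository: anyone may read it, only the owner may push."""

    def __init__(self, owner: str) -> None:
        self.owner = owner
        self.commits: list = []

    def push(self, user: str, commit: str) -> None:
        if user != self.owner:
            raise PermissionError(f"{user} cannot push to {self.owner}'s repository")
        self.commits.append(commit)

    def fork(self, new_owner: str) -> "Repo":
        """Create a server-side copy owned by someone else."""
        copy = Repo(new_owner)
        copy.commits = list(self.commits)
        return copy


official = Repo("maintainer")
official.push("maintainer", "initial commit")

# The contributor forks the official repository and publishes work to the fork.
fork = official.fork("contributor")
fork.push("contributor", "fix typo")

# A pull request is modeled here as (source repository, commit).
pull_request = (fork, "fix typo")

# After review, the maintainer integrates the commit into the official repository.
source, commit = pull_request
if commit in source.commits:
    official.push("maintainer", commit)

print(official.commits)
```

The contributor never needs write access to the official repository; the only push into it is made by the maintainer.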
Picking the Right Branching Strategy for You
Git is a flexible yet powerful tool. It allows developers to work using a vast variety of workflows and strategies. But it's not always easy to navigate all of the available options, particularly for beginners at Git. So, given all these options, which branching strategy should you pick for your team and your project? Well, it's not possible to give a correct one-size-fits-all answer, so I'm not even going to try. What I'm going to do instead is offer a couple of suggestions:
Start as simple as possible. Advance to more sophisticated approaches when the need emerges, but not before that.
Consider picking a strategy that reduces the different "types" of branches available for developers to choose from.
Consider using feature flagging, which can also reduce some of the need people have for excessive branching.
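As a rough illustration of the last suggestion, here is a minimal feature flag in Python. It assumes flags are read from environment variables; the FEATURE_ prefix, the flag name, and both functions are hypothetical, and real projects often rely on a dedicated flagging service or library instead.

```python
import os


def flag_enabled(name: str, default: bool = False) -> bool:
    """Read a flag from the environment, e.g. FEATURE_NEW_CHECKOUT=1."""
    raw = os.environ.get(f"FEATURE_{name.upper()}")
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def checkout_banner() -> str:
    """Unfinished work lives on the main line but stays hidden behind the flag."""
    if flag_enabled("new_checkout"):
        return "Try our new checkout (beta)"
    return "Checkout"


print(checkout_banner())
```

With a flag like this, the unfinished checkout code can be merged early and often without a long-lived branch, and it is switched on only when it is ready for production.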
Branching Strategy Is a Must
Git makes it easier than ever to branch and merge. The implication is that people will then create a lot more branches. They shouldn't necessarily do that, but they probably will. And since that's the case, it's our responsibility to educate these people. Just because you can easily create a lot of branches, it doesn't follow that you need to create them. Prefer approaches that rely on less branching—and fewer branches—and feature flags to conceal features that aren't production ready.