In today's post we're going to discuss branching in Git. As usual the accompanying video tutorial is above if that's more your style. In our last post we introduced you to Git and briefly touched on how it functions and some of the problems it resolves. We saw how Git maintains a history of our files as we commit changes. We also saw how Git commits are linked together and how Sourcetree was able to draw a little graph line between our two commits, starting with the most recent commit and moving backward to our first commit. You may have also noticed in our last post that there was a little label that said
master that always stayed on our most recent commit.
This little label is called a branch pointer. Currently we have one branch, the
master branch. A Git repository must have at least one branch or else the repository would be largely useless. The reason for this is that branch pointers are references to specific commits, and without them we wouldn't be able to keep track of our commit history.
Remember from our last post when I said that it draws the commit graph backward from the most recent commit to the oldest commit? What I meant was that it draws the commit graph backward from the most recent commit that it's aware of. In the above screenshot you can see that Git is aware of the two commits we created last time. This is because of the
master branch pointer. We currently have the
master branch checked out. As we make commits with a branch checked out, that branch's pointer will move to point at each new commit we make. Sourcetree is then able to draw our commit history graph by starting at each branch pointer and working backward. To demonstrate this more clearly let's create another branch by selecting our most recent commit and clicking the "Branch" button at the top.
I named my new branch
new-feature. After clicking "Create Branch" you should be able to see that we now have a new branch pointer that is also pointing at the same commit as our
master branch pointer.
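If it helps to see the idea outside of Sourcetree, here is a minimal Python sketch of branches as pointers. This is a toy model and not how Git is actually implemented: the `Commit` and `Repo` classes and the "first commit" message are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    message: str
    parent: Optional["Commit"] = None  # the very first commit has no parent


class Repo:
    """Toy model: a branch is just a name that points at a commit."""

    def __init__(self) -> None:
        self.branches: dict[str, Optional[Commit]] = {"master": None}
        self.checked_out = "master"

    def commit(self, message: str) -> Commit:
        # The new commit's parent is whatever the checked-out branch points at.
        new = Commit(message, self.branches[self.checked_out])
        # Only the checked-out branch pointer moves to the new commit.
        self.branches[self.checked_out] = new
        return new

    def branch(self, name: str) -> None:
        # A new branch starts at the current commit and gets checked out,
        # mirroring the Sourcetree behavior shown in this post.
        self.branches[name] = self.branches[self.checked_out]
        self.checked_out = name


repo = Repo()
repo.commit("first commit")
repo.commit("kitty supplies")
repo.branch("new-feature")

# Both pointers refer to the very same commit object.
print(repo.branches["master"] is repo.branches["new-feature"])  # True
```

Creating the branch did not copy any commits. It only added a second name for a commit that already existed.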
Notice also that by default we checked out our new branch. You can tell which branch is currently checked out by looking on the left-hand side of Sourcetree and noting which branch name is bolded. What this means is that any new commits we make will cause our new branch pointer to move to point to them, but the same will not be true for branches we do not have checked out, such as
master. Let's make a new commit so we can see this. I'm going to add a couple items to my grocery list and commit those changes now that my
new-feature branch is checked out.
You can see that I've modified my grocery list to add "Trash Bags" and "Ketchup" to the list and committed those changes. You should notice that your new branch pointer moves to point to your new commit, but the
master branch pointer stays pointing at the previous commit. Notice also that the commit history graph is still a straight line starting with our most recent commit on our new branch. My
new-feature branch is pointed to the commit I just did, and it has a reference to the parent commit that came before it. That parent commit is still pointed to by the
master branch pointer and that commit has a reference to its parent commit, which was our very first commit.
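The parent links described above can be sketched in a few lines of Python. This is only a toy model: the `Commit` class, the `history` helper, and the "first commit" label are assumptions for illustration, and the other messages are shortened versions of the commits in this post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    message: str
    parent: Optional["Commit"] = None


def history(tip: Optional[Commit]) -> list[str]:
    """Walk parent links backward, starting from a branch pointer."""
    messages = []
    while tip is not None:
        messages.append(tip.message)
        tip = tip.parent
    return messages


first = Commit("first commit")
kitty = Commit("kitty supplies", first)
trash = Commit("trash bags", kitty)

# master still points at the older commit; new-feature moved forward.
branches = {"master": kitty, "new-feature": trash}

print(history(branches["new-feature"]))  # ['trash bags', 'kitty supplies', 'first commit']
print(history(branches["master"]))       # ['kitty supplies', 'first commit']
```

Starting from `master`, the walk never reaches the "trash bags" commit, because links only go from a commit back to its parent.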
Because of the way our branch pointers are now, both branches are aware of our first two commits. If you travel backward from the commit on our new branch you will run into both commits that the
master branch is aware of. However, the
master branch is not aware of our latest commit because it is still pointing to the commit before that one. Let's dive even deeper. The next thing we'll do is check out the
master branch again and make a new commit against it. To do that we will double-click "master" in the left-hand column. You should see it turn bold to indicate we now have the
master branch checked out.
Now that we have
master checked out again let's again open up our text file in Notepad and add more items to our grocery list. However, upon opening up the file you might be confused. What happened to "Trash Bags" and "Ketchup"?
Those two items vanished from our grocery list because we checked out the
master branch. Remember when I said
master was not aware of our most recent commit? Only our
new-feature branch is aware of the commit containing those changes. So until that branch gets merged back into
master anyone working on the
master branch isn't going to be aware of those changes. This is the reason I named my branch
new-feature, in order to demonstrate what branches are often used for. If I'm working on a feature in code but I'm not quite done with it yet, I don't really want to commit half-done changes that might break the app for any teammates who are also working on the same project. If I committed my half-done changes to master then they might pull down those changes into their own repository and have a hard time working around my busted code (we'll cover more about pushing and pulling commits to and from a remote repository in the next tutorial post).
By creating a branch I'm able to do all my work on that branch, committing changes as I go even though the code is still a work in progress and the app might not function properly until I'm done. Then when I'm all finished with my feature I can go ahead and merge all the commits on that feature branch into the
master branch so that it's aware of them. Now when my colleagues pull down my updates to the
master branch they can be reasonably confident that the app is in a working state, provided I didn't accidentally introduce any bugs of course ;)
Now that we know why our file is missing the work we committed on our other branch, let's go ahead and make some changes and commit them to the
master branch anyway.
Before we head over to the history view, let's talk about what we just did. Our
new-feature branch has one commit on it that our
master branch is unaware of. Now we are about to make a commit against the
master branch. All commits (except the very first one in the repo) have a reference to a parent commit. What commit is this commit going to reference as its parent? The same one that our commit on our
new-feature branch references as its parent. This is because our
master branch is currently still pointed at that commit. So when our
master branch pointer updates to point to our new commit we're about to make, it's going to draw a line from that new commit back to the same commit our
new-feature branch commit is already drawing a line to. I know that's confusing, but all should be made clear in the next screenshot.
If you look at our graph that Sourcetree is kindly drawing for us, you can see that it now forks. This is because of what I described above. Both the "dental items" commit and the "trash bags" commit hold a reference to the same parent commit. That is why both commits draw a line back to the "kitty supplies" commit.
That parent commit is now what is referred to as a common ancestor. In other words, it is the point where the two branches diverge. It is the last commit both branches have in common when going through each branch's history. The
new-feature branch pointer is pointed at the "trash bags" commit. That commit's parent is the "kitty supplies" commit. And that commit's parent is the first commit we made to the repository. So
new-feature's history looks a bit different from the
master branch's history which begins with the "dental items" commit. That commit references the "kitty supplies" commit. And again that commit references our very first commit in the repository.
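Here is one way the common ancestor could be found in a toy Python model. It assumes every commit has at most one parent, which is true of our repository so far. The `Commit` class and both helper functions are hypothetical, and real Git uses its own machinery for this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    message: str
    parent: Optional["Commit"] = None


def ancestors(tip: Optional[Commit]) -> list[Commit]:
    """Return the tip and everything behind it, newest first."""
    chain = []
    while tip is not None:
        chain.append(tip)
        tip = tip.parent
    return chain


def common_ancestor(a: Commit, b: Commit) -> Optional[Commit]:
    """Return the newest commit that appears in both histories."""
    seen = set(ancestors(a))
    for commit in ancestors(b):
        if commit in seen:
            return commit
    return None


first = Commit("first commit")
kitty = Commit("kitty supplies", first)
dental = Commit("dental items", kitty)  # tip of master
trash = Commit("trash bags", kitty)     # tip of new-feature

print(common_ancestor(dental, trash).message)  # kitty supplies
```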
This is why it's important to understand that Git tracks history by looking at all its branch pointers and then drawing a graph backward from that point. If you deleted one of those branches right now then you'd lose the pointer, which would cause Git to lose reference to the commits that were only tracked by that branch. Technically the commits would still be there in the Git database, but they would be lost to us since no branch references them anymore. Those commits would be referred to as dangling commits. Dangling commits are a piece of our commit chain that exists and forks off of a common ancestor somewhere, but with no branch pointer to reference them they just sit in the Git database by themselves.
There are ways to discover dangling commits and create a new branch pointer to reference them, but that's outside the scope of this tutorial. I just bring it up to try and help build a complete picture for how commit history works in Git. The commits themselves forge the links between one another and build a commit chain. Branches are just pointers that refer to those commits so we can keep track of where we are on the commit tree.
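To make "dangling" a little more concrete, here is a rough Python sketch that treats a commit as dangling when no branch pointer can reach it. The `Commit` class and `reachable` helper are made up for this example, and real Git has other kinds of references that this toy model ignores.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    message: str
    parent: Optional["Commit"] = None


def reachable(branches: dict[str, Commit]) -> set[Commit]:
    """Collect every commit found by walking back from each branch pointer."""
    found = set()
    for tip in branches.values():
        while tip is not None and tip not in found:
            found.add(tip)
            tip = tip.parent
    return found


first = Commit("first commit")
kitty = Commit("kitty supplies", first)
dental = Commit("dental items", kitty)
trash = Commit("trash bags", kitty)

all_commits = {first, kitty, dental, trash}
branches = {"master": dental, "new-feature": trash}

# Deleting the branch removes the pointer, not the commit itself.
del branches["new-feature"]

dangling = all_commits - reachable(branches)
print(sorted(c.message for c in dangling))  # ['trash bags']
```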
Now you know why they are called "branches". You can think of the
master branch as the trunk of the tree, and all other branches usually extend out from there. The goal of most branches, though, is to eventually be merged back into
master. You can see in the above diagram that there is an unnamed blue branch that forks off of a commit on
master, contains three additional commits, and then comes back to the
master branch. That's called a branch merge.
How do we merge one branch into another?
Merging one branch into another usually creates what's called a merge commit. It's a normal commit like any other except for two aspects of it that make it unique. First, the merge commit contains all the changes from all the commits on the branch that's being merged. So if the branch being merged has multiple commits that are tracked by it, the merge commit will contain all the changes from all those commits combined into one commit. Second, the merge commit is the only kind of commit that holds a reference to two parent commits.
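As a rough illustration of the "two parents" idea, here is a toy Python sketch. It only tracks the shape of the commit graph, not file contents or conflicts, and the `Commit` class, the `merge` function, and the generated merge message are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    message: str
    parents: tuple["Commit", ...] = ()  # a merge commit has two entries here


def merge(branches: dict[str, Commit], into: str, other: str) -> Commit:
    """Create a merge commit on `into` whose parents are both branch tips."""
    merged = Commit(f"Merge {other} into {into}", (branches[into], branches[other]))
    # Only the branch we merged into moves; the other pointer stays put.
    branches[into] = merged
    return merged


first = Commit("first commit")
kitty = Commit("kitty supplies", (first,))
dental = Commit("dental items", (kitty,))
trash = Commit("trash bags", (kitty,))

branches = {"master": dental, "new-feature": trash}
merged = merge(branches, into="master", other="new-feature")

print([p.message for p in merged.parents])  # ['dental items', 'trash bags']
print(branches["new-feature"].message)      # trash bags
```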
Let's demonstrate this by merging our
new-feature branch back into
master. To do this you first have to check out the branch you want to merge into. So let's check out the
master branch by double-clicking it on the left side if it's not checked out already. Then just click the "Merge" button at the top.
It's easy to get confused here. Remember that you always check out the branch you want to merge into and then you pick the branch/commit that you want to merge, not the other way around. For this example we'll pick the commit that our
new-feature branch is pointed at and then we'll click "OK". In this example we're going to get a message telling us there is a conflict.
What is a conflict and how do we resolve it?
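A conflict happens when two commits change the same part of a file in different ways and Git can't decide which version to keep. We get one here because the latest commit on the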
master branch adds a couple things to the grocery list (dental supplies) and the commit we are merging into master also added a couple things to the grocery list file (trash bags and ketchup). Both of these commits modify the same lines in the file. There are fancy tools for helping resolve merge conflicts, but for this demo we're going to do it manually. Click the "Close" button on the dialog box that popped up and take a look at the file status view.
If you go to the file status view after closing the conflict alert you'll notice a few things. First, you'll notice that the file appears to be both "staged" and "unstaged" as it appears in both areas. This is because parts of the file merged successfully and were added to the staging area ready to be committed as part of the merge commit. The parts that are conflicted are unstaged, waiting for you to fix the conflict and stage the change. Second, you can see in the diff view on the right that the file contains some very odd looking text.
<<<<<<< HEAD
- Toothbrush
- Toothpaste
=======
- Trash Bags
- Ketchup
>>>>>>> new-feature
Git has separated the two different versions of this file, allowing you to edit the file to what you think it should look like. Both sections are divided by the line filled with equals signs
=======. The bottom section contains the changes we tried to merge in from our
new-feature branch. This is denoted by the line just after that section
>>>>>>> new-feature. The first section contains the changes to those same lines of the file on our current branch, which happens to be
master. You might be wondering why it says
<<<<<<< HEAD instead of
<<<<<<< master. In Git the
HEAD is a unique pointer that describes what commit is currently checked out in the repository.
In Git you can actually check out any commit you want, not just branches. In fact, when you check out a branch you're actually just checking out the commit that branch pointer is currently pointing to. The
HEAD of your repository is a pointer that points to whatever commit you currently have checked out. In the case of this example we checked out the
master branch so we could start to merge the
new-feature branch into it. When checking out the
master branch we actually just checked out the commit the
master branch pointer was pointed to, which updated our
HEAD pointer to point to that commit as well. This is why the first section of conflicted code says
<<<<<<< HEAD. It's just letting you know that the change in that first section is the change that is currently present on the commit you have checked out, which is the commit the
master branch pointer is pointed to.
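One way to picture this in code: in the toy Python sketch below, HEAD is either the name of a branch or a commit itself, and resolving it always ends at a commit. That representation, along with the `Commit` class and `resolve_head` function, is an assumption I'm making for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Commit:
    message: str
    parent: Optional["Commit"] = None


def resolve_head(head: Union[str, Commit], branches: dict[str, Commit]) -> Commit:
    """Return the commit that is currently checked out."""
    if isinstance(head, str):
        # A branch is checked out: follow its pointer to a commit.
        return branches[head]
    # A commit was checked out directly.
    return head


first = Commit("first commit")
kitty = Commit("kitty supplies", first)
dental = Commit("dental items", kitty)
trash = Commit("trash bags", kitty)
branches = {"master": dental, "new-feature": trash}

print(resolve_head("master", branches).message)  # dental items
print(resolve_head(kitty, branches).message)     # kitty supplies
```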
In this case we know we want both the changes in our
master and the changes being merged in from
new-feature, so we can simply remove all the conflict demarcation markup.
This is the super manual way to resolve a conflict in Git. We are using our text editor to manually remove the conflict markup and keep both sets of changes to the file. Once we stage and commit this then the merge commit will be created.
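For the curious, the manual edit we just did amounts to something like this small Python sketch. The `keep_both_sides` function is hypothetical and deliberately naive: it drops every marker line and keeps everything else, which is only the right answer when you really do want both sets of changes.

```python
def keep_both_sides(text: str) -> str:
    """Drop the conflict marker lines and keep every other line."""
    markers = ("<<<<<<< ", "=======", ">>>>>>> ")
    kept = [line for line in text.splitlines() if not line.startswith(markers)]
    return "\n".join(kept)


conflicted = """<<<<<<< HEAD
- Toothbrush
- Toothpaste
=======
- Trash Bags
- Ketchup
>>>>>>> new-feature"""

print(keep_both_sides(conflicted))
```

Running it prints the four grocery items with no markers left, which is what our file looks like after the manual cleanup.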
You can see that we already have a nice little commit message ready to go. Git knows we are currently in the middle of a merge resolving conflicts and has already prepared a commit message in anticipation. Let's go ahead and click commit and then head over to our history view for a look.
You can see that after our merge we now have a merge commit. The merge commit contains all the changes from the new-feature branch and also holds a reference to two parent commits. This is why Sourcetree draws lines from our most recent commit back to both the commit on
master and the commit on new-feature.
Notice that our
new-feature branch pointer is still pointing to a commit on that branch. That's because all we did was merge that branch's changes into the
master branch by creating a merge commit and moving the
master branch pointer to point to that commit. That's how
master is aware of the merge but
new-feature is not. At this point we could actually delete the
new-feature branch. All its changes are now contained within a commit on
master and we even have reference to all the commits on that branch via the merge commit if we really need to go through that history for some reason in the future. The
new-feature branch pointer is kind of pointless now unless we want to check it out again and make more commits against it. If we do want to make more commits against our
new-feature branch, notice that any further commits would reference the commit it currently points to.
What is a fast-forward merge?
Let's say we wanted to keep our
new-feature branch but
master has made some additional commits since we last used it and we'd rather our
new-feature branch fork from those commits further up the tree. To demonstrate this I will first make a couple commits to master.
Now that we have a couple more commits on
master let's go ahead and check out our
new-feature branch and merge those commits from master into it. To do that simply double-click the
new-feature branch on the left-hand side. After that click "Merge" to begin merging.
We want our
new-feature branch to be up to date with the latest commits from
master so select the latest commit and click "OK".
So what just happened? Well if you take a look at the
new-feature branch pointer you'll see that it jumped up to point to the same commit as the
master branch pointer. You might have expected a merge commit to be created. What actually happened was called a fast-forward merge. This happened because the
master branch was already aware of all the commits that the
new-feature branch was aware of, thanks to our earlier merge into
master. Since we didn't make any additional changes to our
new-feature branch, Git was able to just follow the tree upward to the most recent commit on
master. In other words, Git was able to just fast-forward the
new-feature branch pointer to point it at the most recent commit.
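Here is a rough Python sketch of how that decision could be made in a toy model. It is not Git's real implementation: the `Commit` class, both functions, and the "more items" commit messages are made up for illustration, and conflicts are ignored entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    message: str
    parents: tuple["Commit", ...] = ()


def is_ancestor(older: Commit, newer: Commit) -> bool:
    """True if `older` can be reached from `newer` by following parent links."""
    stack, seen = [newer], set()
    while stack:
        current = stack.pop()
        if current == older:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(current.parents)
    return False


def merge(branches: dict[str, Commit], into: str, other: str) -> str:
    ours, theirs = branches[into], branches[other]
    if is_ancestor(theirs, ours):
        return "already up to date"  # nothing new to bring in
    if is_ancestor(ours, theirs):
        branches[into] = theirs      # fast-forward: just move the pointer
        return "fast-forward"
    # The histories diverged, so a merge commit with two parents is needed.
    branches[into] = Commit(f"Merge {other} into {into}", (ours, theirs))
    return "merge commit"


first = Commit("first commit")
kitty = Commit("kitty supplies", (first,))
dental = Commit("dental items", (kitty,))
trash = Commit("trash bags", (kitty,))
branches = {"master": dental, "new-feature": trash}

print(merge(branches, into="master", other="new-feature"))  # merge commit

# Two more commits on master (hypothetical messages).
more1 = Commit("more items 1", (branches["master"],))
more2 = Commit("more items 2", (more1,))
branches["master"] = more2

print(merge(branches, into="new-feature", other="master"))  # fast-forward
print(branches["new-feature"] is branches["master"])        # True
```

The first merge needs a merge commit because the two branches had diverged. The second one is a fast-forward because everything on `new-feature` was already part of `master`'s history.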
That's it for today! I know it might be a lot to take in, but in the future when we start actually writing code I'll be walking you through the steps we take as we interact through Git. With time and practice it will all start to come together more. Thanks so much for reading. See you in part 3 :)