Stacked Diffs: Keeping Phabricator Diffs Small

Kurtis Nusbaum
7 min readJul 12, 2016

--

In my previous post, I argued that Large Diffs Are Hurting Your Ability To Ship. Now I’m going to show you how to keep your diffs small using Phabricator.

Don’t use Phabricator? This workflow has been ported to Github in Stacked Pull Requests: Keeping Github Diffs Small.

Here’s the basic strategy:

  1. Break up a large change into a series of individual commits
  2. Stack the commits on top of one another.
  3. Create Phabricator Diffs for each commit.

Let’s walk through a real world example.

1: Create the First Commit

Time to make the first bit of our change. Starting on master, we’ll checkout a branch for the new feature, add a new class to our project, make our first commit, and sent it out for review:

If we look at our git log we can now see the new branch with the new commit:

And if we checkout Phabricator, we’ll see our diff is up for review:

2: Commits 2 Through N

We continue coding, making an additional logical change. For the sake of this example, let’s say the class uses functionality in the NewClass.java file we added in the first commit:

Here’s what our git log looks like now:

Running arc diff HEAD^now sends up our second commit (and only our second commit) to differential:

We now have two commits in differential, that can both be reviewed independently. When your reviewers are looking at Part 1, they can focus solely on the changes introduced by NewClass.java. When your reviewers are looking at Part 2, they can just focus on the additional functionality that is added by the DependentClass.java file. The benefits here are the same you get with abstraction, the reviewer has to hold fewer things in their head.

3: Address Feedback

Now let’s say a reviewer caught a mistake in your Part 1 diff and we need to go back and fix it.

Let’s do this using an interactive rebase. While we’re still on the my_feature branch we’ll run the following:

Doing this will drop us into a text editor where we can opt to edit commits between our current commit and the master commit. Since we want to edit the first commit, we’ll follow the instructions given by git and switch the pick associated with the first commit to edit. We’ll then save the file.

If you’re not familiar with an interactive rebase here’s essentially what’s happening. Git rewinds all of your progress, uncommiting all the changes you’ve made. Then it plays back each of those changes in the order you specify in the file above. If any of the commits are marked with edit, git will stop at the commit and let you make changes. In fact, this is exactly what git tells you once you start this rebase:

And if we take a look at our git log now, we’ll see that our HEAD is actually pointing to this commit:

So following git’s instructions, let’s make our fix, amend, upload the fixes to Differential, and continue the rebase:

Once our rebase is finished, we’ll be right back where we started, at the tip of the new_feature branch:

And we see that our diff is now waiting to get re-reviewed:

4: Land the Diff

Alright, Part 1 has been approved and is ready to land:

We want to specifically land Part 1 of the branch we’re on so we use it’s git refhash when running arc land . Also, we still have commits on the branch that we’re actively editing. So instead of deleting the branch, we tell arc that we’d like to keep it for now.

Your diff has now landed! Our git log now looks like this:

You’ll notice that the Part 1 commit seems to appear twice. Let’s clean this up by rebasing our my_feature branch onto the new master. The new master should include the commit we just landed, so after the rebase we’ll see that our my_feature branch actually only include the Part 2 commit. This is pretty nice. After each land and then rebase, our branch will now only include the commits that haven’t yet been landed.

After doing this, we’ll see that our git log is nice and tidy:

5: Rinse and Repeat

If we’d had more than just two diffs in our stack we would just keep repeating the same processes in steps 3 and 4. Once all but the last diff in a stack has been landed, we can just run arc land my_feature to land the last commit.

Provisos, Pro-tips, and Parting Notes

  • You’ll see I always have to type arc diff HEAD^(I never use the standard arc diff ). Because of this, I’ve actually created a bash alias for doing my diffs so I never forget the HEAD^ part: alias ad="arc diff HEAD^"
  • I was pretty explicit and verbose with a lot of the commands above for the purposes of explaining exactly what was going on. As you get more comfortable with the work flow, I highly recommend setting up scripts and git aliases to help you do all of this. For example, since this work flow involves often doing an interactive rebase starting from master, I’ve aliased git rebase -i master to simply git rim.
  • If you’re using Harbormaster and Jenkins for CI, you’ll want to setup a staging area, other wise your stacked diffs will fail your integration tests (see appendix for how to do this).
  • If you’re ever unsure about what code is going to be sent up to Differential when you run arc diff HEAD^, just run arc diff HEAD^ -preview.
  • If you ever screw up a rebase, git reflog is your salvation.
  • Sometimes you’ll amend during a rebase and that will cause a conflict further up in the stack when you continue the rebase. When resolving these conflicts be sure to only use git rebase --continue. Don’t do an amend when resolving these conflicts. I’ve made that mistake before. It’s no fun.
  • When doing an interactive rebase, only edit one commit at a time. Doing more than one is an invitation for forget where you are in the stack and screw something up. I’ve made this mistake before. It’s no fun.
  • Always finish a rebase. If you walk away from your computer in the middle of a rebase, you may come back to it later, forget you’re in a rebase, and then try to change branches. This can seriously screw up what your repository looks like and confuse the hell out of you. I’ve made this mistake before. It’s no fun.

Special Note

I want to give a special thanks to all of the mentors at Facebook who taught me this workflow and to all the folks at Uber who have helped me refine it. Special thanks to Grayson Koonce for his massive help with editing this post.

Kurtis Nusbaum is a Mobile/Backend Developer at Uber and an avid long distance runner. He can be found on Facebook, Twitter, and LinkedIn. Here’s his github. If Kurtis seems like the kind of guy with whom you’d like to work, shoot him an e-mail at kcommiter@gmail.com.

Appendix

Rebasing before landing

When landing an individual commit in your stack, there’s a chance there might be a conflict when landing. In order to avoid any conflict issues, we update master and rebase our branch to ensure a clean land.

Setting up a staging area

  1. find your repository in diffusion, e.g. https://code.mycomany.com/diffusion/GRREF/
  2. click “edit repository”
  3. copy the “remote uri” value
  4. edit the “staging area”, and paste in the remote uri value
  5. save!

--

--

Kurtis Nusbaum
Kurtis Nusbaum

Written by Kurtis Nusbaum

Software Engineer, Long Distance Runner, Diversity Advocate, YIMBY, github.com/klnusbaum kcommiter@gmail.com

Responses (15)