Bisecting git with git

Recently, I was talking to a friend about how great git-bisect is, and our conversation reminded me of the time I used it to track down a problem in git itself: an old version of git on a CI server was erroneously reporting a merge conflict when none existed.

I had been working on an issue where I needed to move a submodule around to change the way the code was being compiled. Basically, I had an initial folder structure like:

.
└── source
    ├── common
    │   └── driver.c
    ├── platform_foo
    │   ├── component_specific.c
    │   └── submodule
    └── platform_bar
        └── component_specific.c

I had done some porting work so that submodule could now be shared by foo and bar. To make it available to both, I needed to move it so that the tree looked like this:

.
└── source
    ├── common
    │   ├── driver.c
    │   └── submodule
    ├── platform_foo
    │   └── component_specific.c
    └── platform_bar
        └── component_specific.c

This seemed like a simple task: move the submodule, make whatever changes were needed on top of that to get everything compiling, and then submit a pull request.

Making the changes wasn't too difficult. All I had to do was ensure that things were compiling, and that the checksums of the release build artifacts hadn't changed. Little did I know that I'd run into completely unrelated problems that would prevent me from merging.

Initially, my merge really was conflicting, because other people working on the project were making changes that overlapped with mine. But when I had fixed those conflicts and the merge still reported conflicts, I suspected that something was up.

My suspicions were confirmed when I merged my branch into develop manually on my local machine and it worked, but the same merge failed on a build server my team had access to. I talked to a co-worker, and we pulled the logs from the CI server to see if we could figure out what was going on.

At the same time, he attempted the merge locally, and his attempt failed. I pointed out that I had used git mv to move the submodule, and my version of git was recent, so it should be able to handle that correctly. This prompted my co-worker to check his version of git, and we noticed that the git versions that the two of us were using were different (v2.10.2 and v2.14.1).

At this point, I thought that maybe I could get away with just grepping through the release notes between those two versions, and then perhaps something would jump out at me. Nothing did, though.

Spoiler alert: it was in the changelog for 2.11.1. I just didn't make the connection when reading through the output from rg.

  • An empty directory in a working tree that can simply be nuked used to interfere while merging or cherry-picking a change to create a submodule directory there, which has been fixed.

After a while, I wasn't getting anywhere looking through the release notes, so I decided that this would be perfect for git bisect.

Bisecting is pretty straightforward: if you understand how binary search works, this is essentially the same principle. By telling git whether each commit it checks out is good or bad, we can find the commit at which a change in behaviour was introduced.
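To make the binary search analogy concrete, here is a toy sketch that involves no git at all. Commits are just the numbers 1 to 16, and the `is_bad` function and `FIRST_BAD` value are made up for illustration; they stand in for "build this commit and run the test".

```shell
#!/bin/sh
# Toy model of bisection. Commits are numbered 1..N; is_bad() is a
# hypothetical stand-in for "check out this commit, build it, run the test".
FIRST_BAD=11   # made-up commit at which the behaviour changed
N=16

is_bad() {
    [ "$1" -ge "$FIRST_BAD" ]
}

lo=1      # known good commit
hi=$N     # known bad commit
steps=0

# Invariant: lo is good and hi is bad. Halve the gap until they are adjacent.
while [ $((hi - lo)) -gt 1 ]; do
    mid=$(( (lo + hi) / 2 ))
    steps=$((steps + 1))
    if is_bad "$mid"; then
        hi=$mid
    else
        lo=$mid
    fi
done

# Prints: first bad commit: 11 (found in 4 steps)
echo "first bad commit: $hi (found in $steps steps)"
```

Sixteen candidates take four checks instead of up to fifteen, which is why bisecting stays practical even across thousands of commits. With git, the loop looks like this: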

# Start bisecting
git bisect start

# Give it the good commit hash
git bisect good ${GOOD_COMMIT_HASH}

# Give it the bad commit hash
git bisect bad ${BAD_COMMIT_HASH}

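# Now build and run your tests against the commit that git checked out

# Mark the current commit as good or bad (run only one of these), then
# repeat the build-and-mark step until git reports the first bad commit
git bisect good   # the tests passed
git bisect bad    # the tests failed

# When you're done, go back to where you started
git bisect reset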
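One wrinkle worth noting: when you are hunting for the commit that fixed a bug rather than the one that introduced it, the usual labels are backwards, because the older commits are the broken ones. git bisect can rename its terms to keep that straight, and `git bisect run` can do the marking automatically. Below is a minimal sketch in a throwaway repository; the `status` and `counter` files and the commit numbering are invented for the demo and stand in for actually building git and attempting the merge.

```shell
#!/bin/sh
set -eu

# Throwaway repository; the identity is set per-command so no global config is needed.
repo=$(mktemp -d)
cd "$repo"
g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }

g init -q .

# Ten commits. The made-up "status" file flips from "broken" to "fixed" at commit 7.
i=1
while [ "$i" -le 10 ]; do
    if [ "$i" -ge 7 ]; then echo fixed > status; else echo broken > status; fi
    echo "$i" > counter            # guarantees every commit has a change
    g add status counter
    g commit -q -m "commit $i"
    i=$((i + 1))
done

# Rename the terms: "broken" is the old state, "fixed" is the new state.
g bisect start --term-old=broken --term-new=fixed >/dev/null
g bisect broken "$(g rev-list --max-parents=0 HEAD)" >/dev/null
g bisect fixed HEAD >/dev/null

# bisect run marks a commit as the old term when the command exits 0 and as
# the new term when it exits 1, so the check succeeds while the bug is present.
g bisect run sh -c 'grep -q broken status' >/dev/null

# The new-term ref is left pointing at the first commit in the new state.
culprit=$(g log -1 --format=%s refs/bisect/fixed)
echo "first fixed commit: $culprit"

g bisect reset >/dev/null 2>&1
```

In this demo the script should report commit 7, the first one where the `status` file says `fixed`. For a real hunt, the `grep` would be replaced by a script that builds the checked-out commit and reproduces the failure.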

I installed all the dependencies needed to build git from source on my laptop, and then ended up bisecting to 5423d2e7005eca89481d3137569b2b96b4d133ff.

$ git show 5423d2e7005eca89481d3137569b2b96b4d133ff
# commit 5423d2e7005eca89481d3137569b2b96b4d133ff
# Author: David Turner <[email protected]>
# Date:   Mon Nov 7 13:31:31 2016 -0500
# 
#     submodules: allow empty working-tree dirs in merge/cherry-pick
# 
#     When a submodule is being merged or cherry-picked into a working
#     tree that already contains a corresponding empty directory, do not
#     record a conflict.
# 
#     One situation where this bug appears is:
# 
#     - Commit 1 adds a submodule
#     - Commit 2 removes that submodule and re-adds it into a subdirectory
#            (sub1 to sub1/sub1).
#     - Commit 3 adds an unrelated file.
# 
#     Now the user checks out commit 1 (first deinitializing the submodule),
#     and attempts to cherry-pick commit 3.  Previously, this would fail,
#     because the incoming submodule sub1/sub1 would falsely conflict with
#     the empty sub1 directory.
# 
#     This patch ignores the empty sub1 directory, fixing the bug.  We only
#     ignore the empty directory if the object being emplaced is a
#     submodule, which expects an empty directory.
# 
#     Signed-off-by: David Turner <[email protected]>
#     Signed-off-by: Junio C Hamano <[email protected]>

Bingo. I was running into the exact bug that this commit fixed.

At this point, I had a couple of options:

  1. Update git on the cluster
  2. Separate my PR into multiple commits

After asking around, I learned that some changes in newer git versions had broken some CI steps, so our Developer Infrastructure team had put off the upgrade for the time being. As such, updating git on the cluster wasn't feasible.

I also didn't want to break the build, so I ended up working around this by splitting up my change into 3 separate changes:

  1. Add the new submodule to common
  2. Point build system to common
  3. Remove the old submodule from foo
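Roughly, those three changes map onto git commands like the ones below. This is a sketch in a throwaway repository rather than a record of the real commands: the local `lib` repository stands in for the real submodule, `build.cfg` is a made-up placeholder for the build system change, and the `protocol.file.allow` setting is only there so that newer git versions will clone a submodule from a local path.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
# Per-command identity; protocol.file.allow lets the demo use a local submodule.
g() {
    git -c user.name=demo -c user.email=demo@example.invalid \
        -c protocol.file.allow=always "$@"
}

# Stand-in for the real submodule repository.
g init -q lib
echo "shared code" > lib/README
g -C lib add README
g -C lib commit -q -m "lib: initial commit"

# Stand-in for the project, starting from the original layout.
g init -q project
cd project
g submodule --quiet add "$work/lib" source/platform_foo/submodule
g commit -q -m "Starting layout: submodule under platform_foo"

# Change 1: add the submodule in its new location; the old one stays put.
g submodule --quiet add "$work/lib" source/common/submodule
g commit -q -m "Add the new submodule to common"

# Change 2: point the build at the new location (placeholder file).
echo "SUBMODULE_DIR = source/common/submodule" > build.cfg
g add build.cfg
g commit -q -m "Point build system to common"

# Change 3: remove the old submodule, which nothing uses any more.
g rm -q source/platform_foo/submodule
g commit -q -m "Remove the old submodule from foo"

g log --format=%s
```

Each change leaves the tree buildable on its own, and none of them creates a submodule in a path where another one was just removed.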

I had previously read blog posts about people running into issues with the tools or programs they use, and then going down a rabbit hole of debugging to find the solution. That always seemed impressive to me, and I never expected to run into something similar myself. While I didn't actually fix the underlying bug (thanks to the git developers, who already had), I managed to solve a problem that had apparently stumped another co-worker, who had tried something similar about a month earlier and then given up. This experience gave me more confidence in my ability to jump into source code and figure out what the problem is, rather than relying blindly on documentation.