on
git checkout a GitHub Pull Request
Sometimes, reviewing code is easier done with the full diff
in front of you.
Web UIs like GitHub, Phabricator, or Stash all have their strengths, but often
times I find myself opening up my editor and following along. This allows me
to better understand the scope of the changes being made, and lets me navigate
using an interface I'm more confident in—making me feel more productive.
I can then easily jump around the code base, allowing me to understand what
effects their diff
will have on other code, look up things I'm not 100% sure
about, and so on.
As it turns out, GitHub provides an abstraction to Pull Requests via a
refs/pull/
object. If you're not familiar with Git internals, a ref
is
simply a file in .git/refs
, which points to a commit hash.
The process is detailed in this GitHub Help page. Essentially, all you need to do is:
# The PR ID of the PR you're looking to checkout.
#
# Example: "https://github.com/<user>/<repo>/pull/<id>"
PR_ID=42
# Fetch the ref and create the branch pr-${PR_ID}
git fetch origin pull/"${PR_ID}"/head:pr-"${PR_ID}"
# Now checkout the branch
git checkout pr-"${PR_ID}"
You can use this to do some neat stuff besides using it to browse the contents of a Pull Request. Anything that you would normally script on your local repository, you can now script—even with Pull Requests made from a GitHub fork! Pull Requests can now just be treated as a normal branch.
For example, recently I had a potential use case on a project I was working on,
where we were introducing a code formatter requirement (think clang-format
,
black
, gofmt
, rustfmt
, etc.). I won't spend time arguing about the merits
of having (or not using) a formatter here—the point here is to illustrate
one way refs/pull/
can be used.
Adding a code formatter, and subsequently enforcing it on a code base, typically
requires that all the code would be formatted in a giant patch. This could have
been performed incrementally on each diff
, but our tooling wasn't advanced
enough to do so. As there were non-trivial PRs open, we also didn't want to
cause undue churn and introduce a bunch of merge conflicts. This not only
wastes everybody's time, but results in some annoyance as developers are forced
to resolve all these conflicts. Thus, we decided to keep things simple and only
format files that weren't currently modified in any open PRs.
On the CI servers, in order to ensure that files weren't being "unformatted"
after the giant format patch, we introduced checks that would fail the CI job
if files were being committed that would result in a diff
should the
formatter run again. This would remind the developer to format their code
before requesting a Code Review.
However, for the files that were currently modified by a PR, the idea was to relax the formatting requirements. Once those PRs were merged, the plan was to incrementally enable the CI checks on those files when they wouldn't cause conflicts, until we reached 100% coverage. We could take advantage of the ability to exclude certain files from the formatters, and so we needed to generate an exclusion list of all the files currently modified by an open PR.
You could do it using a bash
script that looks something like this:
#!/bin/bash
# A list of Active PR IDs, which can be fetched via
# GitHub's API (or a JavaScript selector on the page)
ACTIVE_PRS=(42 121 311)
# The output file to write to
MODIFIED_FILES_OUTPUT="exclusion_list.txt"
# Truncate the file before writing to it
echo "" >| "${MODIFIED_FILES_OUTPUT}"
for PR in "${ACTIVE_PRS[@]}"; do
git fetch origin pull/"${PR}"/head:pr-"${PR}"
git diff --name-only develop...pr-"${PR}" | tee -a "${MODIFIED_FILES_OUTPUT}"
done
# Dedupe the modified files
sort "${MODIFIED_FILES_OUTPUT}" -o "${MODIFIED_FILES_OUTPUT}" -u
This takes advantage of the ...
diff
syntax, which finds the the last
common commit with the first commit given to the command, and then diff
s the
second commit against that.
It's basically equivalent performing the following:
git diff $(git-merge-base A B) B
As files in the exclusion list were formatted, they could be removed, thus
enabling the CI checks for it. Once the exclusion list reached 0
files, then
we knew that we had successfully formatted all the files in the repository.
In the end, we just ended up biting the bullet and doing a giant reformat over a holiday (with lots of warning beforehand). But if you ever need to do something like this, perhaps this might be useful to avoid people tearing their hair out.