This is the second post in a series about asynchronous collaboration. By asynchronous, I mean that people don’t work at the same time. It’s common on distributed teams, especially across time zones. You can see all the posts so far on this series right here.
In the previous post, we talked about asynchronous limitations on code reviews:
Code reviews: we know they make our code better. How do we do code reviews? There’s less agreement on that.
I love pairing for immediate code review, but it requires synchronous communication. For asynchronous code review, we need another option.
Enter the pull request. In this more prevalent approach to code reviews, a programmer keeps a change set on a separate branch, then submits a request to merge it into the main branch. Another programmer reviews the code in that request prior to merge.
Then I shared what I include in my pull request submissions. In this post, we talk about the other side of the interaction: how I do pull request reviews.
Strap in folks, because you need to hear this, and you’re not going to like it.
Why most pull request reviews suck
I mentioned that my favorite form of code review is pairing, but that pull requests work better for asynchronous communication. Pull request reviews don’t have to be inferior to pairing, but they usually are because of the way reviewers do them.
In pair programming, if a programmer has a suggestion for how to change the code, the interaction goes like this:
Programmer 1: “What if we tried [X]?”
Programmer 2: “Hmmm. Interesting. Let’s try it. [*types]”
Programmer 1: “May I take the keyboard to demonstrate? [*starts typing] it might look something like…”
We have, in pairing, an expectation is that the pair will participate in implementing any alternative that they suggest. Meanwhile, in pull request reviews, I see a lot of this crap:
Original Developer: [*Implements Code]
Reviewer: “I don’t like [X] about this solution. [*blocks merge]”
or the only slightly less atrocious:
Original Developer: [*Implements Code]
Reviewer: “I don’t like [X] about this solution. Have you tried [*vague description of an alternative]?”
This is not better:
Original Developer: [*Implements Code]
Reviewer: “I don’t like [X] about this solution. We might do instead something like this [*pseudocode, or code that doesn’t compile]. You get the idea.”
And these, by the way, are civil examples. I’ve seen this exact thing done regularly except with insults tacked onto beginning like “No offense, but I cannot imagine why somebody would even think to implement it like [X].”
In fact, the state of pull request quality is so bad that most advice I’ve read on how to do a good review only addresses the idea that you shouldn’t insult someone’s code. To be frank, it’s absurd that we have to explain that to grown adults. The only other advice I regularly see says “try pointing out the things you like about their code!” Yes, do that. But that’s still setting a really low bar. The above examples aren’t mean at all, and they’re still bad. We must demand more from adult professional software engineers than “don’t be mean” and “try being nice.”
Why are the above examples so bad?
They’re bad because the reviewer is failing to participate in the development of sound code. Instead, they are dictating their untested ideas from their ivory tower of reviewership. They are delegating responsibility to the original developer to do both the feasibility test and the implementation of those ideas. They’re recording their idea in such a way that they get credit if it works out, but the original implementer undertakes all the risk that it doesn’t.
Because if these ideas, that they fired from the hip, don’t work? It’s the original developer who lost time chasing them down—not the reviewer. This is disrespectful of the original developer’s time, a waste of the reviewer’s knowledge, and a great way to criticize other people’s work without examining one’s own limitations.
In fact, in most pull request reviews, it is dead obvious to me that the reviewer did not pull down or run the code at all. This is flatly unacceptable.
This means that, if the original developer became unavailable to merge this code, the reviewer would be no more prepared to take over development than they were before they reviewed it. Context transfer has failed.
Here’s the proof that pull request reviews could be extremely effective: they are still sometimes useful, even when they’re this bad. So why do we do them this badly, when there is so much potential for them to change our code for the better?
Stop doing these distant, cursory code reviews. You (yes, you) are better than this.
Instead, reviewers: put your money where your mouth is.
Don’t like something about this branch? Pull that sucker down. Test out an alternative yourself. Fix it and push a commit for the original developer to review. This is how open source contribution works, and it’s how collaboration on your distributed team ought to work, too. Here’s a pull request where I did that. The original reviewer did most of the work, and I submitted a change back to him before final merge.
Once again, if the reviewer lacks sufficient context to modify the code themselves, the pull request has failed on the goal of context transfer. The original submitter should transfer more context in the pull request description until this is possible. Here’s how they can do that.
I understand that other factors may prevent a reviewer from pushing their own changes. Maybe they don’t have push access. Maybe they don’t want the original developer to feel like they’re taking over their plum project, or doubting their abilities. I get it. In these cases, I still pull down and try out my own suggestions, then provide the changes, code verbatim, to the original developer in my comments.
An Example of That
All examples come from this pull request on a Python script. Context: The author knows Python, but isn’t an expert on the particular libraries used. I have used the libraries, so as the reviewer, I do my best to make that knowledge useful to the original developer.
See? Code. Copy-pastable code. I know this code works because I tried it in the script and saw it work with my eyeballs. Here’s a slightly more involved example:
For context: the
tqdm library is named for the Arabic word for “progress” (تقدّم, pronounced ‘tiqdam’) . The verbatim change that I am suggesting here propagates to a few places in the code, so I suggest changes on those places as well rather than forcing the original developer to figure that out by themselves.
In fact, there’s one more change we need here:
My review should save time and frustration, not create it.
How far should this go? In one pull request review at a previous job, I needed to:
- pull down the code,
- notice that something was running slowly,
- confirm with the original developer that the current speed was unacceptable,
- benchmark the code to determine what’s running slowly,
- identify an inefficient query,
- try a different query,
- benchmark the difference, and
- submit the results in a PR comment.
It was on a private repo, so I can’t show it to you, but I want to emphasize how far the reviewer’s commitment to participation in the problem-solving can go. Had I pushed all that work back on the original developer, what would have happened?
- They would resent me
- I would have assumed (rather than understood) whether performance were important at this part of the app
- I would be no more prepared to work on the app that I was before
- My skills at benchmarking and my knowledge of queries would be wasted, and the original developer would instead have to do a bunch of Googling
- The reviewer appreciated my help
- I understood the role of performance in the app
- I could run the whole app on my machine and make changes to it
- I got to cement my knowledge by explaining something to someone else.
Wow, Chelsea. This sounds like a massive demand on the reviewer’s time.
The fact that this takes so much time illustrates precisely why pairing is efficient. There is immediate context transfer between two people working together, so all of this work happens in a condensed amount of time relative to the amount of work being done. I’ve written a post on why pairing works if you’re interested, but folks who argue that pairing costs twice as much for the same code are not factoring in the time it would take to do a marginally useful code review.
But if pull requests are the only option available to your team, a reviewer spending this time is still a better use of resources than having the original developer chase down solutions suggested by lazy, delegating commentary on their pull request.
Other things I try to do in pull request reviews
In addition to pulling down, running, and modifying the code myself, I deploy a few other tactics to make my participation in a code review as useful as possible. Here are examples of some of those, from the same pull request as the screenshots above.
1. Where additional documentation would be relevant to the changes I am suggesting, I provide links to that documentation. I do not make the original developer go and chase down these things on their own.
2. I spend time asking questions. I want to have full context on this code. Remember: if something prevents the original developer from continuing to develop this code, it becomes my responsibility. I must be prepared to handle that responsibility. And I’m not judging another developer’s choices until I have asked all my questions.
3. I’m pointing out the hidden gems in the code that might otherwise be overlooked. This particular issue gets a lot of people working with
pandas, so I’m pointing it out for the benefit of developers on the project who have a lot of experience, but possibly not specifically with
In addition, I point out when the resources that the original developer finds about something they used might be worth reading with a grain of salt. This is how a review becomes more useful to the original developer than Googling around:
4. I judge my reviewing skills on whether my review could be supplanted by:
- A personal proofread
- A Google search
- (God forbid) nothing
I want to make sure my review provides value that other resources don’t.
5. Just as important as feedback about what should change is feedback about what shouldn’t change.
6. Finally, in mentioning conventions, I point out that they’re exactly that—conventions. I’ve talked a little bit about the nature of conventions in the footnote on this post, and I think it’s important to call them out for what they are: cultural agreements that may or may not last.
Pull requests have the potential to compete with pairing in their effectiveness as a code review tool. Usually they can’t compete, though, because we have normalized pull requests in which reviewers fail to undertake a participatory role in building the solution with the original developer. I’m not talking about edge cases where the programming version of Dr. House comes in to consult on thorny cases. I’m talking about the vast majority of code reviews where the reviewer is another developer on the same team as the original developer.
A maximally effective pull request suggests solutions…in code. It points out what’s working and what’s not, and links to documentation where useful. It highlights laudable work by the original developer, and asks questions before making assumptions or judgments. It explains the reasoning behind suggestions, whether that reasoning affects functionality or adheres to convention. In short, it demands the reviewer’s full participation in finishing the solution that the original developer started. And it prepares the reviewer to take responsibility for this code in the event that the original developer were unable to complete it.
If you liked this piece, you might also like:
The remote work series (warning: if this post was too salty for you, don’t read that)
The listening series (another thing not to read without a cup of…tea)
The SICP Series (completely unrelated, but this is my blog and I get to plug my esoteric programming endeavors if I want to)