Were the world business-as-usual, I’d be in Portland today.
My Portland plans: eat donuts, gorge my eyeballs on arresting Cascadia views, and speak at RailsConf in Jennifer Tu's "Exported Expertise" track.
Then, things changed. We’re all riding out a pandemic at home. RailsConf changed, too…to RailsConf Couch Edition. About half of the original speakers recorded their talks from their homes. RailsConf published our talks, for you, for free!
Here’s mine. Approximate transcript interspersed with slides below.
Approximate Transcript with Slides
This talk is called Debugging: Techniques for Uncertain Times. It’s by Chelsea Troy…which is…me (meta-commentary: Yep, I know I opened this kinda awkwardly. It’s weird to give a talk to no one, folks).
Before I was a software engineer, I was almost everything else.
I coached rowing at a high school in Miami. I blogged for a startup whose business model turned out to be illegal. I tended a bar, performed standup comedy, danced with fire on haunted riverboats, and edited a quack psychology magazine. I did open source investigations of international crime rings.
That all sounds very fun and exciting…in hindsight. But at the time it wasn’t a fun journey of self-discovery: I was compelled to adapt at frequent intervals in order to stay afloat. I got into software engineering for job security—not out of passion for programming.
But some of the coping mechanisms I learned from those frequent adaptations followed me into the programming world. It turns out, the skills that equip us to deal with rapid, substantial changes to our lives also make us calmer, more effective debuggers.
Debugging, in my opinion, doesn’t get the attention it deserves from the programming community. We imagine it as an amorphous skill, one that we rarely teach, for which we have no apparent praxis or pedagogy.
Instead, we teach people how to write features: how to build something new in the software that we know, when we understand what the code is doing, when we have certainty.
I suspect you’ve watched a video or two about programming. If I didn’t know better, I’d say you’re watching one right now! This one doesn’t deal in code examples, but I suspect you’ve seen demos where speakers share code on their screens, or demonstrate how to do something in a code base during a video recording.
Here’s the dirty secret—and I suspect you already know it. When we sling demos onstage or upload them to YouTube, that’s definitely not the first time we’ve written that code. We’ve probably written a feature like that one in production before. Then we modify it, to make it fit in a talk or video. Then we practice. Over, and over, and over. We need to minimize all mistakes and error messages. We learn to avoid every rake. Just for that code. And sometimes in recording we still mess it up. We pause the recording, we back up, and we do it again. Until it’s perfect.
We know what we’re writing. And that’s what gets modeled in programming education.
But that’s not the case when we’re writing code on the job. In fact, many of us spend most of our time on the job writing something that’s a bit different from anything we’ve done before. If we had done this exact thing before, our clients would be using the off-the-shelf solution, not paying our exorbitant rates to have it done custom. We spend the lion’s share of our time outside the comfort zone of code we understand.
Debugging feels hard, in part, because we take the skills that we learn from feature building, in the context of certainty, and attempt to apply them in a new context: one where we don’t understand what our code is doing, where we are surrounded by uncertainty.
And that is the first thing we need to debug effectively:
We need to acknowledge that we do not already understand the behavior of our code. This sounds like an obvious detail, but we often get it wrong. And it adds stress that makes it harder for us to find the problem.
Because we’ve only seen models where the programmer knew what was going on. We’re supposed to do that—we’re supposed to know what’s going on! And we don’t! We better hurry up and get out of this mess quick! But speed is precisely the enemy with insidious bugs. We’ll get to why later.
I struggled with this same thing through my decade of odd jobs. I felt inadequate, unfit for adulthood, because I didn’t know how to do my taxes or find my next gig or say the right thing to my family or make my life meaningful. And how would I have known those things? But it topped off all my personal struggles with a generous helping of insecurity, guilt, and inadequacy that drove me to run away from issues rather than address them.
But failing enough times over a long enough period made me realize: not understanding is normal. Or at least, it’s my normal. So I learned to notice and acknowledge my insecurity, and not let it dictate my actions. When my feelings of inadequacy screeched at me to speed up, that’s when I most needed to slow down. To figure out what, exactly, I wasn’t getting. To get out of progress mode and into investigation mode.
And that is the second thing we need to debug effectively:
We need to switch modes when we debug, from focusing on progress to focusing on investigation.
The most common debugging strategy I see looks something like this:
where we try our best idea first, and if that doesn’t work, our second best idea, and so forth.
I call this The Standard Strategy. If we understand the behavior of our code, then this is often the quickest way to diagnose the bug. So it’s a useful strategy.
The problem arises when we don’t understand the behavior of our code and we keep repeating this strategy as if we do. We hurt our own cause by operating as if we understand the code when we don’t.
The less we understand the behavior of our code, the lower the correlation between the things we think are causing the bug and the thing that’s really causing the bug, and the weaker this strategy becomes. So we get this:
where we circle among ideas that don’t work because we’re not sure what’s going on.
Once we have established that we do not understand the behavior of our code, we need to stop focusing on fixing the problem, and instead ask questions that help us find the problem. And by “the problem,” I mean the specific invalid assumption we’re making about this code. The precise place, that is, where we are wrong.
Let me show you a couple of examples of how we might do that.
We could use a Binary Search Strategy:
In this strategy, we assume that the code path follows a single-threaded, linear flow from the beginning of execution (where we run the code) to the end of execution, or when the bug happens.
We choose a spot more or less in the middle of that and run tests on the pieces that would contribute to the bug. Now by test, I don’t necessarily mean an automated test, though that’s one way to do this.
By test, in this case, I mean the process of getting feedback, as fast as possible, on whether our assumptions about the state of the system at this point match the values in the code.
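As a sketch, consider a hypothetical three-step pipeline (the method names and values here are invented for illustration, not from any real code base). We pick a spot near the midpoint of the code path and get fast feedback on whether the state there matches our assumptions:

```ruby
# Hypothetical pipeline with a bug somewhere: a report total comes out wrong.
def load_orders
  [{ total: 10 }, { total: 25 }, { total: nil }]
end

def drop_invalid(orders)
  orders.reject { |order| order[:total].nil? }
end

def sum_totals(orders)
  orders.sum { |order| order[:total] }
end

orders   = load_orders
filtered = drop_invalid(orders)

# Binary search: test the state at roughly the midpoint of the code path.
# Assumption to check: every order in `filtered` has a numeric total.
p filtered # fast feedback on whether the assumption holds

# If `filtered` looks right, the defect lives in the second half of the
# path (sum_totals onward); if not, we search the first half instead.
total = sum_totals(filtered)
p total
```

Each check cuts the territory we have to search roughly in half, which is the whole appeal of the strategy.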
Because it’s not just that insidious bugs come from inaccurate assumptions. It’s deeper than that: insidiousness as a characteristic of bugs comes from inaccurate assumptions. We’re looking in the code when the problem is rooted in our understanding. It takes an awfully long time to find something when we’re looking in the wrong place.
It’s hard for us to detect when our assumptions about a system are wrong because it’s hard for us to detect when we’re making assumptions at all. Assumptions, by definition, describe things we’re taking for granted. They include all the details into which we are not putting thought. We’re sure that that variable has to be present at this point. I mean, the way this whole thing is built, it has to be. Have we checked? Well, uh, no. Never thought to do that. Never thought of this as an assumption—it’s just the truth. But is it?
This is where that fast feedback becomes useful. We can stop, create a list of our assumptions, and then use the instruments at our disposal to test them.
- Automated Tests are one such instrument. Tests allow us to run a series of small feedback loops simultaneously. We can check lots of paths through our code quickly and all at once. Tests aren’t inherently a more “moral” way to develop software or some baloney like that. They just do really well on the key metric that matters to us: the tight feedback loop.
- Manual run-throughs are another instrument. Developers start doing this almost as soon as they start to write code, and we continue to do it when we want to check things out.
- Break Points: We can stop the code at a specific line, then open a console to look at the variables in scope at that point. We can even run methods in scope from the command line to see what happens.
- Print Statements: If break points aren’t working, or if the code is multithreaded or asynchronous in such a way that we don’t know whether the buggy code will run before or after our break point, print statements come in really handy.
- Logging: For deployed code or code where we cannot access standard out, we may need to use more robust logging instead. Bonus: a more permanent logging framework within our code can help us diagnose issues after the fact.
- Changing small things: If I think I know how a variable works, I can change its value a little bit and see if the program reacts the way I expect it to. This helps to establish my understanding of what’s in scope and which code is affecting what. Here’s where assumption detection comes into play. We’re likely to thoughtlessly assume that we know things at this point: that variable x should be this, that that class should be instantiated, et cetera. This is where insidious bugs hide: in the stuff we’re not checking.
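Here’s a minimal sketch of that last instrument in action (the method and values are hypothetical). We’re unsure whether a discount `rate` means a fraction or a percent, so we change one small input and watch how the program reacts:

```ruby
# Hypothetical method we suspect: does `rate` mean a fraction or a percent?
def discounted_price(price, rate)
  price - (price * rate)
end

# Change one small input and compare against what each assumption predicts.
p discounted_price(100, 0.10) # expect 90 if rate is a fraction
p discounted_price(100, 10)   # expect 90 if rate is a percent out of 100
# The second call returns -900, so `rate` must be a fraction: one
# assumption confirmed, one eliminated.
```

Two tiny runs, two assumptions tested, in less time than it takes to reread the whole method and guess.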
And this is the third thing we need to debug effectively:
The ability to identify what is the truth, and what is our perspective.
I cannot tell you how many things, in those early years of my independent life, I knew beyond a shadow of a doubt to be true. And maybe, just maybe, in a tiny fraction of cases, I was even right! But in all the other cases, learning to differentiate between my views and empirical evidence, and learning to reconsider my perspectives, has been my key to leveling up everywhere in my life.
So let’s whip out our programming journals and try an exercise that will help us learn to detect and question our assumptions.
At each step represented by a rounded box in the debugging flow chart, we write down what step of the process we are checking, and then we make a list for assumptions and a list for checks.
The “Given” section attempts to explicitly state our assumptions: the things we are not checking. The “Checking” section lists the things we are checking—and we can mark each one with a checkmark or an X depending on whether they produce what we expect.
This exercise seems tedious, right up until we’ve checked every possible place in the code and all of our checks are working, but the bug still happens. At that point, it’s time to go back and assess our “Given”s, one by one. I recommend keeping these notes. How often do bugs that thwart us for long periods of time end up hiding in our assumptions? What can we learn from this about spotting our assumptions, and which of our assumptions run the highest risk of being incorrect?
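One way to sketch such an entry (all names and values below are hypothetical) is to keep the “Given” list as comments and make the “Checking” list executable, so each item gives us fast feedback instead of a guess:

```ruby
# Debugging notes for one step of the flow: "filter out invalid orders".
#
# Given (assumed, not checked):
#   - upstream always hands us an Array of Hashes
#   - every Hash has a :total key
filtered = [{ total: 10 }, { total: 25 }] # state captured at this step

# Checking (each line fails loudly if the expectation doesn't hold):
raise "X: a nil total survived the filter" if filtered.any? { |o| o[:total].nil? }
raise "X: totals are not numeric" unless filtered.all? { |o| o[:total].is_a?(Numeric) }
puts "both checks pass; if the bug persists, revisit the Givens"
```

When every check passes and the bug remains, the comment block at the top is exactly the list of suspects to interrogate next.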
At each check, whether we find something amiss or not, with a binary search we should reduce the problem space by half. Hopefully, this way, we can find the cause of our insidious bug in relatively few steps.
There are cases where binary search won’t work: namely, cases where the code path does not follow a single-threaded, linear flow from the beginning of execution to the end.
In this case, we may need to trace the entire code path from beginning to end ourselves.
But the concept remains: we explicitly list our assumptions and checks to investigate our code like an expert witness, to gather answers that lead us to the defect.
We are training our brains to spot our own assumptions. We know it’s working if our “Given” lists start getting longer. We especially remember to include givens that weren’t what we thought when we hunted down previous bugs. It is specifically this intuition that we are building when we get better at debugging through practice. However, because we do not deliberately practice it, nor generalize the skill to other languages and frameworks, our disorganized approach to learning debugging from experience tends to limit our skills to the stacks we have written in.
By identifying common patterns in the assumptions we tend to make that end up being wrong (and causing bugs), we can improve our language-agnostic debugging intuition.
This is the final thing we need to debug effectively:
The ability to see how the things we’re doing now serve longer term goals.
I’m an expert at stressing myself out about things. I started young; on a trip to Disney World, my mom remembers changing my diaper on a bench as I cried “Careful! Careful!” afraid she’d let me roll off. I continued my winning streak of stress through high school, where I decided that college acceptances would determine my fate in life, and afterward, as I continued to sensationalize the results of tests, sports competitions, and job interviews as make-or-break moments.
I have since learned to see no particular moment as make-or-break. I have taken the power back from my evaluators. If I go to an interview now, my goals are to meet someone and to learn something. Whether or not I get the job, I came out with more understanding. In that sense, I have succeeded. Everything is in service to something else that’s coming, so that even if I fail, I have taken a step forward.
In the same way, every insidious bug presents a golden opportunity to teach us something. Maybe we hate what we learn. That’s okay. We know it now, and can use it to save trouble for someone else later. Or maybe we learn something deep and insightful, that we can carry with us to other code bases, to other work places, or maybe even home to our hobbies and our loved ones.
But either way, we get to hone our skills at conversing with code, and with navigating uncertainty in our lives.
We can practice acknowledging what we don’t understand, learning to slow down, differentiating our views from a shared reality, and finding ways to keep moving forward.
And spending time on those skills is a pretty good investment.
If you liked this piece, you might also like:
This post on my entrée into live coding (in case you’re interested in real-time programming demonstrations)
This series about Structure and Interpretation of Computer Programs—in which I share what I learned in a week-long course on the classic book
The listening series—Unrelated to a specific programming problem, but hopefully useful 🙂