Hello! Welcome to the tech debt series. This is not a series where I recommend hopping in partway through, in part because I almost immediately replace the term “tech debt” with the term “maintenance load.” Half of what I say makes no sense if you’re still subscribing to some misconceptions about what tech debt is, which we addressed in Part 1. You’re currently looking at Part 3, and the terminology in this piece is going to make a lot more sense if you start with Part 1 and Part 2, then come back.
This is the third in a three-part series.
In the last post, we talked about the role of code stewardship in avoiding the accrual of maintenance load on your code base. But keeping your maintenance load from growing any further isn’t helpful—and might be impossible—if you’re already at your maintenance limit now. So…
How do we reduce maintenance load?
That’s all fine and good, Chelsea, but as we have established, my team is already at its maintenance limit. How do I f@&ing fix this?
– You, probably
I’ll tell you what helps, but I’ll also say this: the more the maintenance load outstrips the current team, the harder it is to get back to a good spot.
Recovering from this situation takes some commitment and some rethinking of assumptions as a software business. Big opportunities here:
1. Force features to earn their keep.
Measure the return on investment of the features in the product, and if a feature is producing a low return on investment, dedicate time for the developers to remove it. Do this regularly, not just when your team is breaking under their maintenance load. Also…
2. Regularly equip developers to suggest streamlining options.
Often, simpler code and better functionality are not actually at odds with each other—they’re just at odds with each other in the very short term, in the very current code base.
Give developers time and space to pop up one level of abstraction and say “How can we do this with fewer edge cases in a way that works better for our constituents?”
A developer who is skilled at this can often find simpler solutions that also work better than what we currently have—improving the code base’s functionality, robustness, and maintainability simultaneously.
Allow me to provide some illustrative examples of this in practice (previously listed here):
…a few simple solutions to complex problems that I’m especially proud of:
– I solved a thorny issue with a sometimes camouflaged, sometimes unreliable delete button by ripping it out and making it so you could click anywhere on the object being deleted to delete it (Here’s the PR if you want more details)
– I resolved the issue of a laggy metronome in React-Native, where computationally expensive operations can result in delays for scheduled tasks. The app happened to also need to play back tracks at various multiples of their original speed. So I recorded a sound file of a metronome at 120 BPM, calculated what speed to play it by dividing the BPM the musician requested by 120, and wrote one method that shelled out to expo-av to play accurate-to-tempo sound files for both the metronome and the backtracks.
– I resolved an issue of regulating how often a person should be allowed to log their mood by realizing the issue was fake (assumed foil to “let’s make sure they log it at least this often”) and letting them log as often as they want (more details here).

– Me, originally in a different piece, but I don’t want you to have to click around to understand what streamlining is.
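The tempo arithmetic in the metronome example is small enough to sketch. This is a minimal illustration in Python, not the code from the app: the function name playback_rate and the constant BASE_BPM are made up for this example, and the actual audio playback (which the app handed to expo-av) is left out entirely. It only shows the division described above: requested BPM over the 120 BPM of the recording.

```python
BASE_BPM = 120  # tempo of the single recorded metronome file, per the example above


def playback_rate(requested_bpm: float, base_bpm: float = BASE_BPM) -> float:
    """Return the speed multiplier that makes a base_bpm recording sound at requested_bpm.

    Hypothetical helper for illustration; the name and signature are assumptions.
    """
    if requested_bpm <= 0 or base_bpm <= 0:
        raise ValueError("tempos must be positive")
    return requested_bpm / base_bpm


if __name__ == "__main__":
    # Print the multiplier a player would need for a few requested tempos.
    for bpm in (60, 90, 120, 180):
        print(f"{bpm} BPM -> play the recording at {playback_rate(bpm):.2f}x")
```

The appeal of the approach is that there is no scheduling logic left to lag: the recording carries the timing, and the only thing the app computes is one ratio. Presumably the same ratio covers a backtrack if you pass the track's original tempo as base_bpm, though that detail is my assumption here rather than something the original example spells out.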
I discuss the skill of streamlining in more detail right here apropos of Michael Feathers’ work on edge-free programming. That piece lists specific streamlining heuristics that developers can think about while designing, implementing, or maintaining a system. I’ll also mention that this exercise, inspired and guided by Hillel Wayne, helped me form my mental model about streamlining.
Worth noting: the skill of streamlining is wonderful, but developers with this skill also need the time to use it, and the business needs to listen to the suggestions they come up with as a result. Most models of the software negotiation process show the business asking for something and the tech side saying yes or no. What the use of this skill warrants is a conversation about how to get to yes. That requires developers to think beyond the existing boundaries and limitations of the system, and it also requires product people to sometimes let go of their protective instincts for the current implementation in favor of a conversation about that implementation’s goals and other ways to reach them.
Right, so clearly we could do a whole post series on this, but that’s not this series. Onward.
3. Recover lost context.
Oy vey. Okay, here we go.
Context Loss: The silent killer of tech projects
Only one of the nine rewrites I have worked on was replacing a fundamentally broken code base.
All of the others—all of them, million and billion dollar projects—were victims of context loss. People didn’t document decisions, and then they left, and now this code theoretically works fine, but no one here knows how it works, so they can’t add a needed feature, and if it ever broke, they’d be screwed. This is a team dynamic problem related to chronically under-valuing and under-rewarding time spent on documentation and context sharing.
In my experience, this is what kills projects. I have seen some shitty code, but I have seen a lot less shitty code than I have seen code that’s pretty much fine, but no one knows how it works anymore. Just because no one knows how it works doesn’t make it shitty code, any more than a treadmill without an instruction manual is automatically a shitty treadmill. Unusable for the current owner? Maybe. But it would theoretically work fine if the owner could just figure out how it works.[1]
This is where the skill of forensic software analysis is important.
When detectives investigate a crime, they bring along a crime scene investigator to piece together what happened based on what’s still at the scene. If you’re trying to un-scary abandoned houses in your code base and recover the context your team has lost, you need someone who can play this role for a software project.[2]
This person understands how to follow paths through the code, navigate the comments, mine the git history, and draw from historical programming practices to figure out:
- who worked on a code base (in order to understand when context got lost, not to blame people)
- what they intended for a code base to do
- when those intentions came into effect
- how the developers tried to implement those intentions
- why they implemented them the way that they did.
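To make the “who” and “when” questions a little more concrete, here is a rough sketch of mining git history for one file. It is a starting point under stated assumptions, not a full forensic method: the function names read_log and author_timelines are made up for this example, and it only uses standard git log options (--follow, --date=short, and a --format string with tab-separated fields).

```python
import subprocess

# Tab-separated fields: author name, author date (YYYY-MM-DD), commit subject.
LOG_FORMAT = "%an%x09%ad%x09%s"


def read_log(repo_path: str, file_path: str) -> str:
    """Run git log for one file, following renames, and return the raw output.

    Hypothetical helper; requires git on the PATH and a real repository.
    """
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--follow", "--date=short",
         f"--format={LOG_FORMAT}", "--", file_path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def author_timelines(log_text: str) -> dict:
    """Map each author to their first commit date, last commit date, and commit count."""
    timelines = {}
    for line in log_text.splitlines():
        if not line.strip():
            continue
        # Split at most twice so a tab inside the subject stays in the subject.
        author, date, _subject = line.split("\t", 2)
        entry = timelines.setdefault(
            author, {"first": date, "last": date, "commits": 0}
        )
        # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
        entry["first"] = min(entry["first"], date)
        entry["last"] = max(entry["last"], date)
        entry["commits"] += 1
    return timelines
```

Calling author_timelines(read_log(".", "path/to/file")) on a real repository gives you, per author, the window in which they touched that file. If the person with most of the commits stopped committing years ago, that is a decent hint about when the context walked out the door. The “what,” “how,” and “why” still take reading commit messages, diffs, and comments; this only narrows down where to look.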
Here’s why that’s valuable: this developer can recover lost context in your code base. They can walk into the abandoned house and fix the broken window. They can even recommence feature development in code bases that the team/business thinks of as frozen in time because no one has the familiarity to change them. Provided this developer has communication and documentation skills in addition to forensic software analysis skills, they can also share what they are learning with the team so that the rest of the team also regains the ability to maintain this code.
It’s common for teams to respond to the idea of iterating on an abandoned house by refusing, saying that that code base needs to be rebuilt from scratch. I’d say, not necessarily. Forensic software analysis takes time—but that’s the cost of context loss, not the fault of the dev trying to recover the context. It can take months even for a dev who is good at it already. But it still takes less time, often, than replicating the existing feature set of a complex code base from nothing.
Like code stewardship, forensic software analysis is a whole, difficult skill, completely separate from writing feature code. Also like code stewardship or streamlining, “how to do forensic software analysis” requires its own whole separate blog posts and resources. I promise that I am working on this, but in the meantime, here’s what I recommend to a developer who wants to improve at it:
- Skim Working Effectively with Legacy Code, by Michael Feathers, and keep a reference copy
- Try out commit tracing on a code base you’d like to understand better
- Learn from hackers: reverse engineering an app requires skills that transfer to forensic software analysis. I like Malware Unicorn’s resource on reverse engineering Windows malware or ragingrock’s workshop on reverse engineering an Android app.
Caveat: Your org needs skill and will to address maintenance load.
Developers having code stewardship, streamlining, or forensic software analysis skills is a separate thing from the business giving them the room to use those skills.
Both the tech team having the skills and the organization valuing the use of those skills are necessary to curb the growth of maintenance load and ultimately reduce it.
But hopefully, if that’s something you’d like to do, something from this series will help you do it.
[1] This, to me, is what it means for code to be “legacy code.” Some people use that term to mean “undocumented code,” some use it to mean “untested code,” some say it in place of just “old code,” and yes, some basically assume it’s all “shitty code.” To me, it’s code with a maintenance load related to context loss.
[2] Given my affinity for the crime scene investigator analogy, you can imagine how excited I was to purchase a copy of a book I found called Your Code as a Crime Scene, which I hoped would be about forensic software analysis. Unfortunately, the book didn’t fulfill my hopes in this regard.
If you liked this piece, you might also like:
The debugging category (I linked this above, too, so if you’ve been clickin’ links, you have this already)
The refactoring category (this I only linked in the footnotes)
The risk analysis workshop (4 out of 5 “Jimi Hendrix of [insert programming language here]”s approve!)