Hello! Welcome to the tech debt series. This is not a series where I recommend hopping in right in the middle, in part because I almost immediately replace the term “tech debt” with the term “maintenance load.” Half of what I say makes no sense if you’re still subscribing to some misconceptions about what tech debt is, which we addressed in Part 1.
This is the second post in a three-part series:
In the last post, after debunking a few misconceptions about technical debt and where it comes from, I proposed that we measure software maintenance requirements in terms of ongoing development effort. I described how this maintenance load increases faster for some teams than others with two example cases: a “yikes” case and an “average” case. We ended on the topic of technical bankruptcy, when teams take on the exorbitant expense of rewriting their code from scratch because, for a couple of years, it allows them to feel like they’re on top of their maintenance load, until it gets out of control again. We ended on a question:
So is maintenance load just destined to get out of control? – Me, earlier
And my answer is: not necessarily.
We covered the yikes case and we covered the average case. Now let’s leap towards a case in which maintenance load doesn’t grow. What might that look like?
Here I’ll discuss an example: Explosion, a software company founded by Ines Montani and Matthew Honnibal. This company maintains a developer tool called Prodigy, a SaaS product called Prodigy Teams in development, and an open-source natural language processing library called spaCy. spaCy gets help from a vibrant open source community, but Explosion’s full-time development power is roughly four devs, plus the part-time contributions of the founders (who are also doing founder stuff). So let’s say five full-time devs on three disparate products, one of which is about six years old, one of which is about three years old, and one of which has been in development for two-ish years. If all of these were “average case” projects, Explosion would have a maintenance load of five and a half full-time devs right now. But nope: they continue to add features with maybe five devs, and as of this writing they’re not hiring.
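For what it’s worth, that five-and-a-half figure is consistent with a simple reading of the “average case” from Part 1. Here’s a rough sketch; the half-a-dev-per-year growth rate is my assumption for illustration, not a number anyone at Explosion has published:

```python
# Assumption (mine, for illustration): the "average case" adds roughly
# half a full-time developer of maintenance load per year a project exists.
project_ages_years = [6, 3, 2]   # roughly: spaCy, Prodigy, Prodigy Teams
growth_per_year = 0.5            # devs of maintenance load per project-year

maintenance_load = sum(project_ages_years) * growth_per_year
print(maintenance_load)  # 5.5 full-time devs
```

Under that assumption, eleven project-years of “average case” accumulation would fully consume the team, which is exactly the situation Explosion has avoided.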
In this talk, Ines discusses how they built the company and how they make decisions that impact the product. I have already mentioned that the team maintains three products without having grown appreciably in size. There are a few other pieces here that I want to tease out:
- “Bus count is a myth” – the talk suggests that one need not necessarily worry about what would happen if some number of developers got hit by a bus.
- “Authorship gets things done; not design by committee” – the talk suggests, correctly, that 20 people working on a novel would be inefficient, and that projects benefit from one person (or perhaps a duo with a good relationship) taking charge.
Here’s a critical piece of context:
Ines and Matthew, the cofounders of this company, both come from a background of maintaining an open source project. I have worked as an employee at a few companies, a consultant at many, and a maintainer of several open source projects. So I can say with confidence that that context matters for a few reasons:
1. In the open source world, projects tend to follow individual contributors, not companies.
It’s common for a developer who maintains a high-profile open source project to work at several different companies, taking their open-source work with them wherever they go. This is not the same as an organization’s proprietary project, which an employee stops working on if they leave. The churn on high-profile, successful, well-maintained open source projects is decidedly lower than the churn on the average tech team. This is important because it means that…
2. Open source projects have fewer churn events than tech teams at private companies.
Ines says in the talk that if four developers at Explosion got hit by a bus, the company would have negative developers. But the bus thing is really a stand-in, isn’t it? When we say “will your project be dead in the water if someone gets hit by a bus,” what we really mean is “will your project be dead in the water if some number of developers leave?”
Ines and Matthew are both founding contributors to all of these projects, presumably with no intention of leaving them. So Explosion has not experienced, and probably has relatively low risk of experiencing, a context loss event where someone walks out with all the product knowledge—much lower risk than a not-self-employed tech team where the members do not feel personal, lifetime, technical authorship of the code.
So, frankly, Explosion gets to ignore a risk that a lot of tech teams cannot ignore, which is the loss of a bunch of undocumented context that leaves whole swaths of a code base a total mystery to the remaining maintainers.
That said, Explosion’s roots in open source contribution also mean something else:
3. Explosion’s team has higher-than-average code stewardship skills.
Code stewardship is a whole, difficult skill that is completely separate from writing feature code, which is what most developers are a) trained to do and b) rewarded for by the business. Developers are neither prepared nor incentivized to test, document, or communicate, and they’re also not empowered to rethink or remove features. – Me, earlier
Developers are trained to write features. Almost all online tutorials are “How to make X feature in Y framework from scratch.” Academic computer science education focuses, similarly, on building from nothing. Developers get excited about building from nothing because they get to make all the decisions. Very little education, tutelage, or mentorship focuses on reading other people’s code or maintaining existing code bases.1
Developers also get recognition, raises, and promotions for shipping features. “How to be a 10x Developer” pieces tend to focus hard on “ship, ship, ship.” Your shipping makes it hard for others to wade through the code base? That’s their problem. The client advocates who love the shipping aren’t looking at the code base, and if they can’t see it, they struggle to prioritize it as much as the features they can see, particularly in the short term.
But open source contributors’ clients are literally other developers.
These people can see the code. In fact, for open-source projects to keep up with their clients’ needs, often the projects need to be accessible enough for the clients themselves to contribute code. So testing and documentation get much higher priority than in private projects, because part of the point is that absolute strangers need to be able not only to run the code, but also to add to or fix the code, with little (and ideally zero) intervention from the maintainers. An open source project’s zero-onboarding requirement serves as an automatic hedge against context loss and maintenance load.
A visit to the spaCy code base confirms the team’s skill in this area:
- The documentation is clear and complete even by popular open-source project standards. I got spaCy up and running on my machine and was able to use models with it, from only the docs, in about four minutes (not counting the download time to get it).
- Not only are the existing docs complete, but the maintainers communicate standard patterns for other people to discuss the project or seek help with it. They clearly state which communication channels will and won’t work (don’t email them, for example), and they explain why (discussions in public fora can help more people, for example).
- There are clear, concisely named, well-circumscribed commits, which I talk about here, and which is rare even on open source projects. I believe commit messages to be an invaluable source of temporally organized documentation.
- I checked the last dozen substantive commits (I subjectively judged “substantive” to mean “looks like it has to do with the library’s core functionality, based on the commit message”). They all include tests that I can read to understand pretty quickly what changed.
- There are contributor guidelines, and recent commits from the core maintainers follow those guidelines. It’s easy to get lazy with this when you’ve worked on a code base for the better part of a decade, but they don’t.
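None of the following is spaCy’s actual code; it’s just a generic sketch of the pattern those commits follow, where a small behavior change lands alongside a test that states the new behavior plainly:

```python
# Hypothetical example, not spaCy code: imagine a commit titled
# "Strip surrounding whitespace when tokenizing" that ships this change...
def tokenize(text):
    # New behavior: surrounding whitespace no longer produces empty tokens.
    return [tok for tok in text.strip().split(" ") if tok]

# ...alongside a test that documents exactly what changed.
def test_tokenize_ignores_surrounding_whitespace():
    assert tokenize("  hello world ") == ["hello", "world"]

test_tokenize_ignores_surrounding_whitespace()
print("test passed")
```

A reader who knows nothing about the code base can learn what the commit did from the test alone, which is exactly the kind of temporally organized documentation I’m describing.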
To be really thorough, I would have had to confirm that I could make a full-slice change to this repo. I didn’t. However, I’m fairly confident that I could. That’s a thing tech debt can prevent: the maintainers themselves changing core functionality with ease, let alone an internet rando like me. What I see here is high proficiency in the technical skill of code stewardship.2
These skills of code stewardship—testing, documenting, communicating—they atrophy, or sometimes never germinate, in the privately employed developer chasing their 10x title. But they contribute to the longevity of open source projects, and they can contribute to the longevity of private projects, too.
Also worth noting is that, since open source contributors act as authors of their code bases, they are empowered to make decisions about rethinking or removing features. I talk about the value of this practice more in this piece (my take on the 10x developer trend), but in order to streamline or remove features, developers need permission from the product and the business. They often don’t have it. On open source projects, by contrast, the devs are usually in charge. They can say “I changed these two APIs so now they both use the same backend. No, I didn’t ask any of you. I promise, it works better now. You’re welcome!” Such a missive would go over poorly in most private companies.
How do we eliminate maintenance load growth?
- Normalize and value the practice of writing tests. Include time in the development budget for testing. Celebrate the occasions when something would have broken, but the tests caught it. When tests get flaky, prioritize giving them attention and fixing them. Whether a team is just getting started with testing or trying to re-prioritize testing, I might recommend doing this workshop with the team.
- Normalize and value the practice of writing documentation, give developers time to do it, and reward developers who do. Measure the value of documentation by recording the time that the documentation author needed to figure out how to do something, comparing it to the time that same task took when another developer did it with documentation, and multiplying the difference by every developer who gets to use the documentation. I think the skills of out-of-team documentation transfer to in-team communication and vice versa, so I might recommend thinking about these things for developers who want to build documentation skills.
- Train, expect, and reward developers to communicate their choices effectively with the team, including socializing big changes and giving one another feedback.
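The documentation-value measurement described above is simple arithmetic. Here’s a minimal sketch; the function name and the numbers are hypothetical, purely for illustration:

```python
def documentation_value(author_minutes, reader_minutes_with_docs, num_readers):
    """Estimate the total time a piece of documentation saves.

    author_minutes: time the author originally spent figuring out the task.
    reader_minutes_with_docs: time the same task took a developer with the docs.
    num_readers: how many developers get to use the documentation.
    """
    saved_per_reader = author_minutes - reader_minutes_with_docs
    return saved_per_reader * num_readers

# Hypothetical numbers: the author spent 90 minutes working out a deploy
# quirk, a teammate with the resulting doc needed 10, and 8 developers
# will hit the same task.
print(documentation_value(90, 10, 8))  # 640 minutes saved
```

Even rough numbers like these make the payoff visible to the people deciding whether documentation time fits in the budget.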
Of course, keeping your maintenance load from growing any further isn’t helpful—and might be impossible—if you’re already at your maintenance limit now. So how do we reduce the existing maintenance load tout de suite and get back on firm footing? I’ll talk about that in the next post.
1. This complete underrepresentation of resources on the parts of the development process besides feature additions is one of the things I’m hoping to help fix by live streaming. On the live streams, I talk through the software decision-making process, debug stuff live while explaining what I’m doing, and explore code bases that I didn’t write to figure out how to make a change. I do rundowns of all the live streaming sessions and then put them here. I also specifically wrote about debugging a bunch here, talk about a strategic approach to refactoring here (with a bunch more examples here), and recorded this workshop about identifying, assessing, prioritizing, and addressing risks in your software system. Why yes, I do think about this a lot. Why do you ask?
2. Part of the reason we’re in this predicament of tech debt everywhere has to do with the skills of code stewardship—including communication and documentation—getting relegated to the “non-technical” pile and subsequently undervalued. It’s not just thoughtful or nice of developers to document their work. It’s a technical skill, it’s difficult to do well, and as we’ve established, it’s expensive if not done.
If you liked this piece, you might also like:
The debugging category (I linked this above, too, so if you’ve been clickin’ links, you have this already)
The refactoring category (this I only linked in the footnotes)
The risk analysis workshop (4 out of 5 “Jimi Hendrix of [insert programming language here]”s approve!)