In December, I took a course in which we attempted to implement the Raft algorithm from this paper. In this series, I share memorable insights from the course (here’s where you can see all the posts so far).
In the previous post, we left off here:
I hate your elifs, Chelsea. Don’t you know about objects?
Have you read this blog? Of course I know about objects. But software doesn’t usually model objects, as we’ve discussed before.
So how do we bridge the gap between the infinitely nuanced abstract world and our limited options for representing it in code?
Here, I am keeping all my functionality in one place until the data I have collected on the problem reaches the threshold at which the abstractions I define are satisfactorily likely to be accurate. Pulling out separate responsibilities from a preexisting blob is much less bamboozling than attempting to move responsibilities between a suite of inaccurately separated concerns. So, until I understand the problem better, the code looks like this. And you know what? It’s pretty legible.
In the next post, you’ll see some separated concerns begin to emerge.
So far, we’ve written some naughty code.
- server.py does lots of unrelated things
- Nary a test
- Weird behavioral issues to iron out
Now that something works, we’ve gathered enough insights to start separating responsibilities.
First of all, everything we own lives in the root directory. We’ll start making a few boxes to put things in (this commit). I made four to start with:
- src: for our Server, Client, KeyValueStore, and helper methods in message_pass and config.
- logs: for the server registry and the log files where our servers write the commands they have received.
- exercises: for the state machine exercise that we did with the traffic lights, mostly.
- tests: it’s time to start getting this system under test.
Why aren’t there already tests?
One of the pieces of advice I received for embarking on this project was to manage complexity by splitting up objects and leaning on automated testing. And I think that’s fabulous advice for folks who have taken a crack at solving this problem, or a problem like it, before.
I, however, have not solved a lot of problems like this. I don’t typically work with socket servers or store structured data for persistence outside some kind of database. And I’m not starting from a framework here like Rails, Django, Android, or Spring, which come equipped with opinions on where to put various types of code in the project. I started with zilch: an empty project.
Tests serve three functions.
1. Documentation. Tests offer us miniature stories to explain how to use our API and what to expect it to do. They’re therefore immensely valuable for storing context and transferring it to other developers. But I started out working on this code alone. All the context lived in my head as I wrapped my brain around new concepts. I’m not fast enough to write so much software in a few days that I can’t remember how to use it. I wouldn’t put down an untested project that I wanted any hope of picking back up; I’d need documentation for that. But at the early stages of this project, I didn’t need it.
2. Regression Prevention. Tests (particularly integration tests, system tests, and risk-oriented tests) allow us to run a series of small feedback loops simultaneously, so we can check lots of paths through our code. That becomes critical in applications where we might break something without noticing. Up until now, my little system has had pretty limited functionality. Everything’s broken. I notice it all.
3. Design Aids. Avdi Grimm sees unit test-driven development as a tool to write clean, well-circumscribed APIs. Coraline Ehmke describes why this works: in a test, we get to articulate our wishful, greedy, best-case vision of how we’d like the API to read and act. After that, we flip to the implementation to execute on that vision. Without that vision, the lazy slobs in us might take over and implement the code in whatever way is most convenient for our current selves, the writers. But the test holds us accountable to our vision as readers. And code gets read much, much more than it gets written.
As I started this project, though, I did no design. I threw everything into three files because I didn’t yet have enough data about the problem to accurately separate the concerns. Even among the three files I started with, I ran into churn early on. Should the server write to its logs, or should the key-value store do it? Are the properties of a cluster emergent from the behavior of individual servers, or are they enforced by some kind of cluster manager like we had with the traffic lights? Had I invested a lot of time in separating concerns early, I’d often have found that I had separated them incorrectly. Re-separating concerns is more frustrating than separating concerns from a single place, and having to move and rewrite unit tests adds to the frustration.
Look, I have to tell you a dirty secret.
Ready? I slung code for the most dogmatically TDD shop in the industry for years. Seriously, programmers love to joke that Labs is a cult, and the fact that we clapped at the end of standup didn’t help our case. You can think of us as the ultimate TDD conservatory. And even we didn’t TDD novel problems from the get-go.
Instead, we did:
1. A spike: experimental blobs of functionality to get something working.
2. Delete it all (well, truth be told, comment it out to reference later).
3. Color-code or otherwise demarcate the comments into hypothetical concerns.
4. Pick a concern. Write a test. Consider how we’d like its API to look and work.
5. Watch the test fail. Make it pass. Look for simplifications that allow it to still pass.
6. Repeat steps 4 and 5 until the concerns are done.
If you start with #4, it’s a short road to Circular Dependency Land.
The dirty secret is, TDD isn’t for completely novel problems. We need some information first about the risks and responsibilities inherent to our system.
But forging on without tests has its consequences.
As I developed my test-less laboratory of spaghetti code, I also developed alongside it a checklist of things to manually retry each time I changed something. I caught regressions this way. Since this code had only maybe two execution paths, it didn’t take long to do the manual thing, and I was poking around manually anyway after each change. As software scales, this doesn’t (although keeping a manual regression checklist to run before a release is a good idea for an application of any size).
Were I to do this project again, I might get to an echo server and then write a single, happy-path system test for the echo server. I might update the system test as I added functionality to the project, leaving unit tests until the separation of concerns stage as I did here.
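As a rough sketch of what that single happy-path system test might look like: the echo server below is a hypothetical stand-in written for this example (it is not the project's Server class), and the names run_echo_server and TestEchoServer are my own assumptions.

```python
import socket
import threading
import unittest


def run_echo_server(listener):
    """Hypothetical echo server: accept one connection and echo one message."""
    connection, _ = listener.accept()
    with connection:
        data = connection.recv(1024)
        connection.sendall(data)


class TestEchoServer(unittest.TestCase):
    def test_happy_path(self):
        # Port 0 asks the OS for any free port, so the test cannot collide
        # with a server that is already running on a fixed port.
        listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        port = listener.getsockname()[1]

        # Run the server in the background so the test can act as the client.
        server = threading.Thread(
            target=run_echo_server, args=(listener,), daemon=True
        )
        server.start()

        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(b"hello")
            response = client.recv(1024)

        server.join(timeout=5)
        listener.close()
        self.assertEqual(response, b"hello")
```

A test like this exercises the whole path (socket in, socket out) without knowing anything about how the server is organized inside, which is why it can survive the later reshuffling of concerns.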
I tried out a new thing on these tests that I haven’t done in previous test suites I’ve written. I’ll show you my normal way, and then the new thing, and we’ll see what you think.
Typically when I write unit tests, I attempt to use only the API method that I am testing in the test itself. I set everything else up by some other means. Here’s an example (from this commit):
Take a look at the test for get. Rather than using set to set Sibyl to cruelty, I assign the data variable to a hash myself. This isolates each part of the API to its test so that, if something fails, I know exactly what part of the API has broken.
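For readers without the commit open, here is a minimal sketch of the isolation idea. The KeyValueStore below is a hypothetical stand-in, not the code from the commit; only the names get, set, data, and the Sibyl example come from this post, and the method bodies are assumptions.

```python
import unittest


class KeyValueStore:
    """Hypothetical stand-in for the store; only its shape matters here."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key, "")

    def set(self, key, value):
        self.data[key] = value


class TestKeyValueStore(unittest.TestCase):
    def test_get(self):
        kvs = KeyValueStore()
        # Arrange state directly instead of calling set(), so this test
        # can only fail if get() itself is broken.
        kvs.data = {"Sibyl": "cruelty"}
        self.assertEqual(kvs.get("Sibyl"), "cruelty")

    def test_set(self):
        kvs = KeyValueStore()
        kvs.set("Sibyl", "cruelty")
        # Assert against the underlying dict rather than going through get().
        self.assertEqual(kvs.data, {"Sibyl": "cruelty"})
```

Each test touches exactly one method on the subject under test; everything else is set up or checked through the data attribute.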
Similarly, the only method I call on the subject under test in the write_to_log test is
But how about these tests?
Now I’m testing some of the other, more complex methods in the class, and I’m using the write_to_log() method even though it isn’t being tested. For now I’m calling this approach “incremental story,” and I haven’t decided if I like it yet.
Here’s what it buys me:
- A simpler, more expressive story. This flow is perhaps easier to follow, such that a developer could read this test to understand what the class is supposed to do.
- A check on the legibility of my method names: If I do this and the methods aren’t named well, the story becomes harder to read, not easier.
And here’s what I lose:
- These tests could fail because either .write_to_logs() or the method they’re supposed to test has changed.
For now, I’m okay with this drawback because I can look at the test that tests write_to_logs in isolation. If it fails, that’s the method that failed. If it passes, a failure in the other tests probably has to do with the method they’re supposed to test. For this reason, I think I only want to call a method in the body of another method’s test once it already has an isolated test of its own.
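Here is a sketch of what an “incremental story” test might look like. Everything in it is an assumption made for illustration: the KeyValueStore class, the catch_up method, the log format, and the file name are hypothetical, and only the idea of calling write_to_log from another method's test comes from this post.

```python
import os
import tempfile
import unittest


class KeyValueStore:
    """Hypothetical store that appends commands to a log and can replay them."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}

    def write_to_log(self, command):
        with open(self.log_path, "a") as log:
            log.write(command + "\n")

    def catch_up(self):
        # Hypothetical method: rebuild state by replaying logged "set" commands.
        with open(self.log_path) as log:
            for line in log:
                operation, key, value = line.strip().split(" ", 2)
                if operation == "set":
                    self.data[key] = value


class TestCatchUp(unittest.TestCase):
    def setUp(self):
        self.directory = tempfile.TemporaryDirectory()
        self.log_path = os.path.join(self.directory.name, "test_log.txt")

    def tearDown(self):
        self.directory.cleanup()

    def test_catch_up_replays_logged_commands_in_order(self):
        kvs = KeyValueStore(self.log_path)
        # The story: commands get logged, then a fresh store catches up.
        # write_to_log() is not under test here; it has its own isolated test.
        kvs.write_to_log("set Sibyl cruelty")
        kvs.write_to_log("set Sibyl kindness")

        fresh = KeyValueStore(self.log_path)
        fresh.catch_up()

        # The later command wins because the log is replayed in order.
        self.assertEqual(fresh.data, {"Sibyl": "kindness"})
```

The setup reads like a sequence of events rather than a hand-built log file, which is the benefit; the cost is that a bug in write_to_log would also turn this test red.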
Pulling Things Out of the Server
Finally, our server is a little cumbersome to test. We have to spin it up, regale it with messages, and check the responses. I’d therefore like the server to focus exclusively on managing sockets, which means pulling out everything else it does.
So I know how I want to separate the socket concern. That doesn’t mean I know how to separate any of the other concerns in the server. For now, I pull everything else (in this commit) into another file that serves as a grab bag of methods with a temporarily profane name (sorry).
In retrospect I might not name the file a cuss, but I do think that “obviously not the final name” is a fine quality for the file name to have right now. This is not the final separation of concerns. It is, rather, the act of separating a single concern—socket management—from everything else, with the acknowledgment that work remains to understand and organize the everything else part.
In this commit, the server goes from 139 lines to 80. It possesses three non-constructor methods:
- start (to start its listening socket)
- handle_client (to receive incoming connections)
- tell (to send responses)
These are the only things I want the server to do.
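To make that shape concrete, here is a minimal sketch of a sockets-only server, under some loud assumptions: this is not the code from the commit, the respond callable is a hypothetical stand-in for the grab bag file, and the constructor arguments and threading choices are mine. Only the three method names come from this post.

```python
import socket
import threading


class Server:
    """Sketch of a server that manages sockets and nothing else.

    `respond` is a hypothetical callable standing in for the grab bag file:
    it takes the decoded request string and returns the response string.
    """

    def __init__(self, respond, host="127.0.0.1", port=0):
        self.respond = respond
        self.listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        # Binding here (port 0 means "any free port") lets callers read
        # self.port before the server starts listening.
        self.listener.bind((host, port))
        self.port = self.listener.getsockname()[1]

    def start(self):
        # Start listening, then hand each new connection to its own thread.
        # This loop blocks forever, so callers run it in a background thread.
        self.listener.listen()
        while True:
            connection, _ = self.listener.accept()
            threading.Thread(
                target=self.handle_client, args=(connection,), daemon=True
            ).start()

    def handle_client(self, connection):
        # Read requests until the client disconnects (recv returns b"").
        with connection:
            while True:
                request = connection.recv(1024)
                if not request:
                    break
                self.tell(connection, self.respond(request.decode("utf-8")))

    def tell(self, connection, message):
        connection.sendall(message.encode("utf-8"))
```

Because the server only knows about a callable, everything behind that callable can be unit tested without opening a socket at all, and the server itself can be tested with a trivial respond function.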
Oh, the places we can go!
From here, we can dive into:
- Writing system tests that spin up servers and clients
- Writing unit tests for the things in our grab bag file
- Forging ahead with something Raftier
We’ll forge ahead with something Raftier and come back to some of these other things: you’re here for Raft, and those other things get more interesting once we move a little further into Raft.