Recently I did a Behind the Scenes post about recording a screencast for RubyTapas—bite-size videos (4-7 minutes) to share intermediate and advanced-level skills and ideas with programmers.
My screencast covers risk-oriented testing, which I’ve applied to the design of applications in a few languages. I got permission to share it with you!
Episode Transcript
We tend to divide automated tests into two categories: unit tests and integration tests.
We maximize unit test coverage during the development cycle and then, once the code is written, we add a few integration tests to make sure everything works together.
But sometimes we can get a better return on our time and effort by identifying the riskiest parts of our code and prioritizing tests to fit that risk.
Let’s look at an example. Suppose you work for WalletPal, a website that helps people manage their receipts. The original build is a monolithic Rails app that stores people’s receipt data in a relational database.
WalletPal would like to migrate the receipt data into a document database inside a new app. The frontend on the original app will now fetch the migrated data via an HTTP API.
You’re in charge of rewriting that data layer—and the frontend itself should not change.
You extract the ActiveRecord calls to the relational database out of the controllers and into a data source class, and you namespace it with the term Repositories. You’ll write another class with the same interface, and you’ll namespace that one Services. You’ll switch out the repository for the service class when it’s time to get the data from the API app instead of from the local database.
Once you have switched all your data dependencies to the new API, the receipts table in the relational database will be deleted.
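To make that setup concrete, here is a minimal sketch of what the two data sources might look like. The class and method names (ReceiptDataSource, all, find) and the RECEIPT_API_URL environment variable are placeholders of mine, not code from the episode:

```ruby
require "json"
require "net/http"
require "ostruct"

module Repositories
  # Wraps the ActiveRecord calls that used to live in the controllers.
  class ReceiptDataSource
    def all
      Receipt.all
    end

    def find(id)
      Receipt.find(id)
    end
  end
end

module Services
  # Same interface, but fetches receipts from the new API app over HTTP.
  class ReceiptDataSource
    def initialize(base_url: ENV.fetch("RECEIPT_API_URL"))
      @base_url = base_url
    end

    def all
      get("/receipts").map { |attrs| OpenStruct.new(attrs) }
    end

    def find(id)
      OpenStruct.new(get("/receipts/#{id}"))
    end

    private

    def get(path)
      JSON.parse(Net::HTTP.get(URI(@base_url + path)))
    end
  end
end
```

The controllers can then take whichever data source is currently in play, so the eventual switch from Repositories to Services is a one-line change.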
How do you test through this fairly large refactor?
A common testing approach is to maximize coverage with unit tests.
But unit tests tend to test the framework, and they tend to slide toward checking our implementations rather than our outcomes. That makes it harder for us to refactor later.
We could also write an end-to-end test that visits a URL on the WalletPal app and asserts that the appropriate receipts show up.
Integration tests, for their part, leave us with a long feedback loop. Also, configuration issues on our local machines can cause them to fail for reasons that don’t signal real problems in the code.
What if we could find an approach that fell somewhere in the middle…that is, an approach that used automated tests to define and catch the riskiest cases without micro-managing all the cases and without requiring a full buildout to show progress?
Let’s try it.
Step 1: Make a risk profile of the system we’re building.
Let’s return to our picture of our system with the changes we’d like to make:
This diagram can help us visualize the risks associated with our refactor. Let’s go through our diagram and ask the question: what could go wrong here?
Now, for each of these things that could go wrong, I want to answer three questions:
- Would the outcome be catastrophic if this went wrong?
- Is this likely to go wrong?
- If this goes wrong, is it likely to sneak through QA and deployment?
Step 2: Plan automated tests for the riskiest items.
I’m focused on placing automated tests around problems that are somewhat likely to happen, somewhat likely to go uncaught, and somewhat catastrophic if allowed to stay wrong.
So, for example, the API server going down is somewhat catastrophic, but it’s not likely to go uncaught. It makes itself known the moment the engineer, the designer, or QA visits the site.
So an automated test that requires the collaborating server to be up is not that useful from a risk profile perspective.
But what if that server serves data that looks kind of valid but is, in fact, inaccurate?
What if, say, a weird character in the text of the receipt messes up the JSON body in the trip over to the new server, so the new server’s version of this receipt only shows the portion of the items that appeared on the receipt before the weird character?
It still looks like a receipt, but it’s missing items. That could go uncaught unless QA is looking very closely. That kind of data equivalence is an excellent candidate for automated testing.
What are our riskiest items in this refactor?
To me, here are the top two in order of risk:
- Something gets messed up in the data import, so the document database version looks OK at first glance but is not, in fact, equivalent to the original relational data.
- When we switch from the repository data source to the service data source, the service data source is missing methods, or returns a result that doesn’t respond to the methods called on it in the controller or view.
Let’s check our data in production to make sure that we have equivalent receipts in both places.
This test connects to my API app, makes requests, and compares the results to the local database. We check an example attribute called number on the receipt; you could do this same thing with any and all receipt attributes.
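Here’s a rough sketch of what that check could look like, assuming the Services::ReceiptDataSource shape from the earlier sketch; the file path and the data_equivalence tag are invented for illustration:

```ruby
# spec/data_checks/receipt_equivalence_spec.rb
require "rails_helper"

# Not part of the everyday suite; run it deliberately before a risky step:
#   bundle exec rspec spec/data_checks --tag data_equivalence
RSpec.describe "receipt data equivalence", :data_equivalence do
  let(:api) { Services::ReceiptDataSource.new }

  it "serves the same receipt number from the API as from the local database" do
    Receipt.find_each do |local_receipt|
      remote_receipt = api.find(local_receipt.id)

      expect(remote_receipt.number).to eq(local_receipt.number),
        "Receipt #{local_receipt.id}: expected #{local_receipt.number.inspect}, " \
        "got #{remote_receipt.number.inspect} from the API"
    end
  end
end
```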
This test will take a while to run, and we’re not running it all the time: we run it only when we’re about to do something risky, like switching the data source the app uses or deleting the receipts table in the relational database. Compared to no test at all, it is indeed slow. Compared to having QA manually check every record to get this same level of confidence? It’s blazing fast.
Now let’s talk about our other top risk: API inconsistency between the two data sources.
We write a test to compare the method lists of our two data sources.
In statically typed languages, you wouldn’t assert this with a test. Instead, you would have both data sources adhere to an interface, and then the code wouldn’t compile if the things that these tests are testing weren’t true. In this case, though, we want something there to make sure we don’t miss any critical methods in our development process.
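In Ruby, one way to express that check is to diff the public instance methods of the two classes (again using the placeholder names from the earlier sketch):

```ruby
RSpec.describe "data source interfaces" do
  it "exposes every repository method on the service data source" do
    repository_methods = Repositories::ReceiptDataSource.instance_methods(false)
    service_methods    = Services::ReceiptDataSource.instance_methods(false)

    # Fails during development if the service class is missing something
    # the controllers or views call on the repository.
    expect(service_methods).to include(*repository_methods)
  end
end
```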
Let’s also write a test to compare the return values of the .all() method in our two data sources.
It’s worth noting that, instead of asserting equality between the two data payloads here, I want to assert that, regardless of the data itself, both return values respond to certain requests: namely, each, the method we use on this return value in the view. I also check that the items in the returned collections respond to number, an attribute on Receipt that we’ll also want to call in the view.
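A sketch of that duck-typing check, under the same assumed class names:

```ruby
RSpec.describe "data source return values" do
  [Repositories::ReceiptDataSource, Services::ReceiptDataSource].each do |source_class|
    it "#{source_class}.all returns an enumerable of receipt-like items" do
      result = source_class.new.all

      # The view iterates over the collection with .each...
      expect(result).to respond_to(:each)

      # ...and calls .number on each receipt it renders.
      result.each do |receipt|
        expect(receipt).to respond_to(:number)
      end
    end
  end
end
```

In a real suite you’d probably stub the HTTP calls behind the service data source so this example stays fast and deterministic.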
Unit tests help us drive out intra-class functionality and build confidence in our incremental changes.
Integration tests help us ensure that our inter-class and inter-app configuration works as a system.
But neither of these test types presents a panacea for helping us save time and avoid worry:
unit tests are relatively slow to drive out, and they tend to end up testing our framework and micro-managing our implementation choices. Integration tests give us a long feedback loop, and they tend to produce so many false positives that developers start ignoring their results.
Instead, we can consider the risks present in our system as a whole, then write a test harness that mitigates the largest risks and communicates those risks to the rest of the team.
This risk-focused perspective, over time, makes it easier for you and your teammates to spot and preempt the kinds of bugs that could become headaches later.
If you liked this post, you might also like:
The series that starts here on process design in software (also Avdi-influenced)
This cornucopia of test-oriented posts – including testing for Android and iOS!
This post on how iterations might add value to your business (or not)