Risk-Oriented Testing (From RubyTapas Screencast)

Reading Time: 11 minutes


Recently I did a Behind the Scenes post about recording a screencast with RubyTapas.

RubyTapas specializes in bite-size videos (4-7 minutes) to share intermediate and advanced-level skills and ideas with programmers.

The examples are implemented in Ruby, so the videos are generally thought of as “for Rubyists.” However, nowadays the majority of the videos present concepts that are useful to any app developer as long as they can understand Ruby as an example language. I have implemented lessons I learned from those videos in Ruby, Python, Java, Swift, and on one auspicious occasion, APL.

My screencast for the show covers risk-oriented testing, which I’ve applied to the design of applications in a few languages. The screencast originally appeared on RubyTapas in September, and 90 days later I’m allowed to share the material with you. So here it is!

 

Episode Script

two circles

We tend to divide automated tests into two categories:

circles; "unit tests", "integration tests"

unit tests and integration tests.

highlight: unit circle

We maximize unit test coverage during the development cycle and then, once the code is written,

highlight integration circle

we fit in a few integration tests to make sure everything hangs together.

two circles, overlapping

But sometimes we can get a better return on our time and effort by identifying the riskiest parts of our code and prioritizing tests to fit that risk.

rdb only architecture diagram

Let’s look at an example. Suppose you work for WalletPal, a website that helps people manage their receipts. The original build is a monolithic Rails app that stores people’s receipt data in a relational database.

both dbs architecture diagram

WalletPal would like to migrate the receipt data into a document database inside a new app. The frontend on the original app will now fetch the migrated data via an HTTP API. You’re in charge of rewriting that data layer—and the frontend itself should not change.

module Repositories
  class ReceiptDataSource
    def index
      Receipt.all
    end
  end
end

You extract the ActiveRecord calls to the relational database out of the controllers and into a data source class, and you namespace it with the term Repositories.
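
For context, here is a minimal sketch of what the controller presumably looked like before the extraction, with the ActiveRecord call inline. This "before" state isn’t shown in the episode, so treat it as an assumption:

# Hypothetical "before" state: the controller talks to ActiveRecord directly,
# which is exactly the coupling the data source class removes.
class ReceiptsController < ApplicationController
  def index
    @receipts = Receipt.all
  end
end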

module Services
  class ReceiptDataSource
    def index
      ActiveResource::Receipt.all
    end
  end
end

You’ll write another class with the same interface, and you’ll namespace that one Services.

class ReceiptsController < ApplicationController
  before_action :set_data_source

  def set_data_source
    @data_source = Repositories::ReceiptDataSource.new
  end
end

You’ll switch out the repository for the service class when it’s time to get the data from the API app instead of from the local database.

class ReceiptsController < ApplicationController
  before_action :set_data_source

  def set_data_source
    @data_source = Services::ReceiptDataSource.new
  end
end
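
To see why the swap is invisible to the rest of the app, here is a hypothetical index action (not shown in the episode) that only talks to whatever @data_source was set:

class ReceiptsController < ApplicationController
  before_action :set_data_source

  # The action depends only on the data source's interface,
  # so switching Repositories for Services doesn't change it.
  def index
    @receipts = @data_source.index
  end
end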

Once you have switched all your data dependencies to the new API, the receipts table in the relational database will be deleted.

class DropReceiptsTable < ActiveRecord::Migration
  def up
    drop_table :receipts
  end

  def down
    raise ActiveRecord::IrreversibleMigration
  end
end

How do you test through this fairly large refactor?

review about unit tests

A common testing approach is to maximize coverage with unit tests. But unit tests tend to test the framework, and they tend to slide toward checking our implementations rather than our outcomes. That makes it harder for us to refactor later.

review about integration tests

We could also write an end-to-end test that visits a URL on the WalletPal app and asserts that the appropriate receipts show up. Integration tests, for their own part, leave us with a long feedback loop. Also, configuration issues on our local machines can cause them to fail for reasons that don’t signal real problems in the code.

Risk Oriented tests

What if we could find an approach that fell somewhere in the middle…that is, an approach that used automated tests to define and catch the riskiest cases without micro-managing all the cases and without requiring a full buildout to show progress?

Let’s try it.

Our first step is to make a risk profile of the system we’re building.

show our DI architecture diagram again

Let’s return to our picture of our system with the changes we’d like to make:

overlay some risks on the architecture diagram

This diagram can help us visualize the risks associated with our refactor. Let’s go through our diagram and ask the question: what could go wrong here?

Now, for each of these things that could go wrong, I want to answer three questions:

questions

  1. Would the outcome be catastrophic if this went wrong?
  2. Is this likely to go wrong?
  3. If this goes wrong, is it likely to sneak through QA and deployment?
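
One way to make the prioritization concrete, though it isn’t part of the episode, is to score each risk on those three questions and rank by the product. The risk names and scores below are hypothetical:

# Hypothetical scoring sketch: 1 (low) to 3 (high) on each question.
risks = {
  "API server is down"                 => { catastrophic: 3, likely: 2, sneaks_past_qa: 1 },
  "Imported data is subtly inaccurate" => { catastrophic: 3, likely: 2, sneaks_past_qa: 3 },
  "Service data source missing method" => { catastrophic: 2, likely: 2, sneaks_past_qa: 2 }
}

# Rank the risks by their combined score, highest first.
risks
  .sort_by { |_name, scores| -scores.values.reduce(:*) }
  .each { |name, scores| puts "#{name}: #{scores.values.reduce(:*)}" }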

show the risks + architecture picture again

Step 2: Plan automated tests for the riskiest items.

overlay 1s for this risk on the diagram

I’m focused on placing automated tests around problems that are somewhat likely to happen,

overlay 2s for this risk on the diagram

somewhat likely to go uncaught,

overlay 3s for this risk on the diagram

and somewhat catastrophic if allowed to stay wrong.

highlight the API server risk on the diagram

So, for example, the API server going down is somewhat catastrophic, but it’s not likely to go uncaught. It makes itself known the moment the engineer, the designer, or QA visits the site. So an automated test that requires the collaborating server to be up is not that useful from a risk profile perspective.

un-highlight the API server risk on the diagram

But what if that server serves data that looks kind of valid but is, in fact, inaccurate?

receipt

What if, say,

point at the weird character in the json

a weird character in the text of the receipt messes up the JSON body in the trip over to the new server

receipt

…so the new server’s version of this receipt only shows the portion of the items that appeared on the receipt before the weird character?

It still looks like a receipt, but it’s missing items. That could go uncaught unless QA is looking very closely. That kind of data equivalence is an excellent candidate for automated testing.
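
A check along these lines would catch that kind of silent truncation. This is a sketch for some receipt_id, and it assumes a get(id) lookup on each data source and an items collection on receipts, neither of which has appeared yet:

# Hypothetical: a truncated item list looks fine at a glance, but an
# item-count comparison across the two stores fails loudly.
repository_receipt = Repositories::ReceiptDataSource.new.get(receipt_id)
service_receipt = Services::ReceiptDataSource.new.get(receipt_id)

expect(service_receipt.items.length).to eq(repository_receipt.items.length)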

risk + arch diagram

What are our riskiest items in this refactor?

To me, here are the top two in order of risk:

highlight that risk

  1. Something gets messed up in the data import, so the document database version looks OK at first glance but is not, in fact, equivalent to the original relational data.

second risk

  2. When we switch from the repository data source to the service data source, the service data source is missing methods or returns a result that doesn’t respond to the methods called on it in the controller or view.

RSpec.describe ReceiptDataSourceEquivalence do
  before(:all) do
    @repository_receipt_data_source ||= Repositories::ReceiptDataSource.new(environment: Rails.env)
    @service_receipt_data_source ||= Services::ReceiptDataSource.new(environment: Rails.env)

    @local_records = @repository_receipt_data_source.all.collect(&:id)
    @api_records = @service_receipt_data_source.all.collect(&:id)
    @on_records = Array(@local_records & @api_records)
  end

  describe "receipts list call" do
    it "provides receipts with the same set of ids, each of which has equivalent numbers" do
      expect(Set.new(@local_records)).to eq(Set.new(@api_records))
    end
  end

  describe "receipts detail call" do
    it "given an id, provides a receipt with the same attributes from local database or API" do
      @on_records.each do |id|
        local_record = @repository_receipt_data_source.get(id)
        api_record = @service_receipt_data_source.get(id)

        expect(local_record.number).to eq(api_record.number)
      end
    end
  end
end

Let’s check our data in production to make sure that we have equivalent receipts in both places.

@service_receipt_data_source ||= Services::ReceiptDataSource.new(environment: Rails.env)

This test connects to my API app,

@local_records = @repository_receipt_data_source.all.collect(&:id)
@api_records = @service_receipt_data_source.all.collect(&:id)

makes requests,

expect(Set.new(@local_records)).to eq(Set.new(@api_records))

and compares the results to the local database.

expect(local_record.number).to eq(api_record.number)

We check an example attribute called number on the receipt.

You could do this same thing with any and all receipt attributes.
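
One way to extend the check, not shown in the episode, is to loop over a list of attribute names inside the detail-call example. The attributes besides number are hypothetical here:

# Hypothetical: compare several attributes at once instead of just number.
attributes_to_compare = %i[number date vendor_name total]

@on_records.each do |id|
  local_record = @repository_receipt_data_source.get(id)
  api_record = @service_receipt_data_source.get(id)

  attributes_to_compare.each do |attribute|
    expect(local_record.public_send(attribute)).to eq(api_record.public_send(attribute))
  end
end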

This test will take a while to run, and we’re not running this all the time: we’re running it in the particular case that we prepare to do something risky, like switch the data source that the app is using or delete the receipts table in the relational database. In that case, compared to no test, it is indeed slow. Compared to having QA manually check every record to get this same level of confidence? It’s blazing fast.
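
One way to keep this slow spec out of the everyday run, though the episode doesn’t cover it, is to tag it and exclude that tag by default. The tag and environment variable names below are made up:

# spec/spec_helper.rb
RSpec.configure do |config|
  # Skip the slow cross-database checks unless explicitly requested.
  config.filter_run_excluding :data_equivalence unless ENV["CHECK_DATA_EQUIVALENCE"]
end

# The equivalence spec opts in with the tag:
RSpec.describe ReceiptDataSourceEquivalence, :data_equivalence do
  # ...
end

Then CHECK_DATA_EQUIVALENCE=1 bundle exec rspec runs the equivalence checks right before a risky step, like switching data sources or dropping the receipts table.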

Now let’s talk about our other top risk: API inconsistency between the two data sources.

require 'rails_helper'

RSpec.describe Services::ReceiptDataSource do
  describe "equivalency" do
    it "has all the methods present in the repository library" do
      expect(
        Repositories::ReceiptDataSource.instance_methods - Services::ReceiptDataSource.instance_methods
      ).to be_empty
    end
  end

  ...

end

We write a test to compare the method lists of our two data sources.

In a statically typed language, you wouldn’t assert this with a test. Instead, you’d have both data sources implement a shared interface, and the code wouldn’t compile if that guarantee didn’t hold. In Ruby, though, we want a test in place to make sure we don’t miss any critical methods during development.
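
In Ruby, a shared example group can approximate that interface check. This is a sketch under the assumption that both data sources are meant to expose the same methods; the method names mirror the ones used elsewhere in this refactor:

# A shared example group acts as a lightweight "interface" spec.
RSpec.shared_examples "a receipt data source" do
  it { is_expected.to respond_to(:all) }
  it { is_expected.to respond_to(:get) }
end

RSpec.describe Repositories::ReceiptDataSource do
  subject { described_class.new }
  it_behaves_like "a receipt data source"
end

RSpec.describe Services::ReceiptDataSource do
  subject { described_class.new }
  it_behaves_like "a receipt data source"
end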

require 'rails_helper'

RSpec.describe ReceiptDataSourceTransition do
  describe "get_receipts" do
    before do
      Receipt.delete_all
    end

    it "gets similarly structured responses from both repository and service" do
      # Subjects Under Test
      @repository = Repositories::ReceiptDataSource.new
      @service = Services::ReceiptDataSource.new
      @data_sources = [@repository, @service]

      # Given
      FactoryGirl.create(:receipt, number: 11.40)
      FactoryGirl.create(:receipt, number: 22.50)
      stub_request(:get, "#{ENV['RECEIPT_ENDPOINT']}/receipts.json").
        to_return(body: [
          {"uuid": "324-asd-423-fsd", "relational_id": 1, "number": 11.40},
          {"uuid": "asd-ewr-456-676", "relational_id": 2, "number": 22.50}
        ].to_json)

      # When
      @results = @data_sources.map { |ds| ds.get_receipts }

      # Then
      @results.each do |result|
        expect(result).to respond_to(:each)
        expect(result.length).to eq(2)
        expect(result.first).to respond_to(:number)
      end
    end

    ...

  end
end

Let’s also write a test to compare the return values of the get_receipts method in our two data sources.

expect(result).to respond_to(:each)

It’s worth noting that, instead of asserting equality between the two data payloads here, I want to assert that, regardless of the data itself, both return values respond to certain messages, namely each, the method we call on this return value in the view.

expect(result.first).to respond_to(:number)

I also check that the items in the returned collections respond to number, an attribute on Receipt that we’ll also want to call in the view.

two labeled circles, unit tests darker

Unit tests help us drive out intra-class functionality and build confidence in our incremental changes.

two labeled circles, integration tests darker

Integration tests help us ensure that our inter-class and inter-app configuration works as a system.

two labeled circles, neither one darker

But neither of these test types presents a panacea for helping us save time and avoid worry:

add a turtle to unit test circle

unit tests are relatively slow to drive out,

add magnifying glass to unit test circle

and they tend to end up testing our framework and micro-managing our implementation choices.

add telescope to integration test circle

Integration tests give us a long feedback loop,

add injured emoji to integration test circle

and they tend to produce so many false positives that developers start ignoring their results.

a risk tests circle appears between the unit and integration test circles

Instead, we can consider the risks present in our system as a whole, then write a test harness that mitigates the largest risks and communicates those risks to the rest of the team. This risk-focused perspective, over time, makes it easier for you and your teammates to spot and preempt the kinds of bugs that could become headaches later.

If you liked this post, you might also like:

The series that starts here on process design in software (also Avdi-influenced)

This cornucopia of test-oriented posts – including testing for Android and iOS!

This post on how iterations might add value to your business (or not)
