Risk-Oriented Testing (From RubyTapas Screencast)

Recently I did a Behind the Scenes post about recording a screencast for RubyTapas—bite-size videos (4-7 minutes) to share intermediate and advanced-level skills and ideas with programmers.

My screencast covers risk-oriented testing, which I’ve applied to the design of applications in a few languages. I got permission to share it with you!

Episode Transcript

We tend to divide automated tests into two categories: unit tests and integration tests.

We maximize unit test coverage during the development cycle and then, once the code is written, we fit a few integration tests to make sure everything stays together.

But sometimes we can get a better return on our time and effort by identifying the riskiest parts of our code and prioritizing tests to fit that risk.

Let’s look at an example. Suppose you work for WalletPal, a website that helps people manage their receipts. The original build is a monolithic Rails app that stores people’s receipt data in a relational database.

WalletPal would like to migrate the receipt data into a document database inside a new app. The frontend on the original app will now fetch the migrated data via an HTTP API.

You’re in charge of rewriting that data layer—and the frontend itself should not change.

module Repositories
  class ReceiptDataSource
    def index
      Receipt.all
    end
  end
end

You extract the ActiveRecord calls to the relational database out of the controllers and into a data source class, and you namespace it with the term Repositories.

module Services
  class ReceiptDataSource
    def index
      ActiveResource::Receipt.all
    end
  end
end

You’ll write another class with the same interface, and you’ll namespace that one Services.

class ReceiptsController < ApplicationController
  before_action :set_data_source

  def set_data_source
    @data_source = Repositories::ReceiptDataSource.new
  end
end

You’ll switch out the repository for the service class when it’s time to get the data from the API app instead of from the local database.

class ReceiptsController < ApplicationController
  before_action :set_data_source

  def set_data_source
    @data_source = Services::ReceiptDataSource.new
  end
end
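
One way to make that switch without editing the controller by hand is to pick the class from configuration. This is only a minimal sketch, not code from the episode: the RECEIPT_DATA_SOURCE variable and the data_source_for helper are hypothetical, and the two classes are in-memory stand-ins for the real ones.

```ruby
# In-memory stand-ins for the two data sources. The real classes would
# talk to ActiveRecord and to the HTTP API respectively.
module Repositories
  class ReceiptDataSource
    def index
      [{ id: 1, number: 11.40 }]
    end
  end
end

module Services
  class ReceiptDataSource
    def index
      [{ id: 1, number: 11.40 }]
    end
  end
end

# Hypothetical helper: maps a setting to a data source instance.
# Unknown or missing values fall back to the local repository.
def data_source_for(setting)
  case setting
  when "service" then Services::ReceiptDataSource.new
  else Repositories::ReceiptDataSource.new
  end
end

# RECEIPT_DATA_SOURCE is an assumed environment variable name.
data_source = data_source_for(ENV["RECEIPT_DATA_SOURCE"])
puts data_source.class
```

Because both classes expose the same interface, the caller never needs to know which one it received.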

Once you have switched all your data dependencies to the new API, the receipts table in the relational database will be deleted.

class DropReceiptsTable < ActiveRecord::Migration
  def up
    drop_table :receipts
  end

  def down
    raise ActiveRecord::IrreversibleMigration
  end
end

How do you test through this fairly large refactor?

A common testing approach is to maximize coverage with unit tests.

But unit tests tend to test the framework, and they tend to slide toward checking our implementations rather than our outcomes. That makes it harder for us to refactor later.

We could also write an end-to-end test that visits a URL on the WalletPal app and asserts that the appropriate receipts show up.

Integration tests, for their own part, leave us with a long feedback loop. Also, configuration issues on our local machines can cause them to fail for reasons that don’t signal real problems in the code.

What if we could find an approach that fell somewhere in the middle…that is, an approach that used automated tests to define and catch the riskiest cases without micro-managing all the cases and without requiring a full buildout to show progress?

Let’s try it.

Step 1: Make a risk profile of the system we’re building.

Let’s return to our picture of our system with the changes we’d like to make:

[Diagram: the system with the changes we’d like to make]

This diagram can help us visualize the risks associated with our refactor. Let’s go through our diagram and ask the question: what could go wrong here?

Now, for each of these things that could go wrong, I want to answer three questions:

  1. Would the outcome be catastrophic if this went wrong?
  2. Is this likely to go wrong?
  3. If this goes wrong, is it likely to sneak through QA and deployment?

Step 2: Plan automated tests for the riskiest items.

I’m focused on placing automated tests around problems that are somewhat likely to happen, somewhat likely to go uncaught, and somewhat catastrophic if allowed to stay wrong.

So, for example, the API server going down is somewhat catastrophic, but it’s not likely to go uncaught. It makes itself known the moment the engineer, the designer, or QA visits the site.

So an automated test that requires the collaborating server to be up is not that useful from a risk profile perspective.

But what if that server serves data that looks kind of valid but is, in fact, inaccurate?

What if, say, a weird character in the text of the receipt messes up the JSON body in the trip over to the new server, so that the new server’s version of this receipt only shows the portion of the items that appeared on the receipt before the weird character?

It still looks like a receipt, but it’s missing items. That could go uncaught unless QA is looking very closely. That kind of data equivalence is an excellent candidate for automated testing.
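
The three questions can be turned into a rough ranking. The sketch below is purely illustrative: the Risk struct, the 1-to-3 scale, the choice to multiply the answers, and the individual scores are all assumptions of this example rather than anything prescribed in the episode.

```ruby
# Each risk is scored 1 (low) to 3 (high) on the three questions:
# how catastrophic, how likely, and how likely to slip through unnoticed.
Risk = Struct.new(:name, :severity, :likelihood, :stealth) do
  # Multiplying means a risk only ranks high when it scores high on
  # all three questions; one low answer pulls the product down.
  def score
    severity * likelihood * stealth
  end
end

# Illustrative scores only.
risks = [
  Risk.new("API server is down",                3, 2, 1),
  Risk.new("Imported data silently truncated",  3, 2, 3),
  Risk.new("Service class is missing a method", 2, 2, 2)
]

ranked = risks.sort_by { |risk| -risk.score }
ranked.each { |risk| puts format("%2d  %s", risk.score, risk.name) }
```

With these made-up numbers, the server outage lands at the bottom: it is severe, but it cannot hide, so a dedicated automated test buys little.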

What are our riskiest items in this refactor?

To me, here are the top two in order of risk:

  1. Something gets messed up in the data import, so the document database version looks OK at first glance but is not, in fact, equivalent to the original relational data.
  2. When we switch from the repository data source to the service data source, the services data source is missing methods or returns a result that doesn’t respond to methods called on it in the controller or view.

require 'rails_helper'
require 'set'

RSpec.describe "ReceiptDataSourceEquivalence" do
  # before(:all) runs once for this group and shares its instance variables
  # with the examples; before(:suite) is only valid inside RSpec.configure.
  before(:all) do
    @repository_receipt_data_source ||= Repositories::ReceiptDataSource.new(environment: Rails.env)
    @service_receipt_data_source ||= Services::ReceiptDataSource.new(environment: Rails.env)

    @local_records = @repository_receipt_data_source.all.collect(&:id)
    @api_records = @service_receipt_data_source.all.collect(&:id)
    # ids present in both sources
    @on_records = Array(@local_records & @api_records)
  end

  describe "receipts list call" do
    it "provides receipts with the same set of ids, each of which have equivalent numbers" do
      expect(Set.new(@local_records)).to eq(Set.new(@api_records))
    end
  end

  describe "receipts detail call" do
    it "given an id, provides a receipt with the same attributes from local database or API" do
      @on_records.each do |id|
        local_record = @repository_receipt_data_source.get(id)
        api_record = @service_receipt_data_source.get(id)

        expect(local_record.number).to eq(api_record.number)
      end
    end
  end
end

Let’s check our data in production to make sure that I have equivalent receipts in both places.

@service_receipt_data_source ||= Services::ReceiptDataSource.new(environment: Rails.env)

This test connects to my API app,

@local_records = @repository_receipt_data_source.all.collect(&:id)
@api_records = @service_receipt_data_source.all.collect(&:id)

makes requests,

expect(Set.new(@local_records)).to eq(Set.new(@api_records))

and compares the results to the local database.

expect(local_record.number).to eq(api_record.number)

We check an example attribute called number on the receipt.

You could do this same thing with any and all receipt attributes.
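
Here is a hedged sketch of that idea in plain Ruby. The attribute_mismatches helper and the sample attribute names (number, merchant, items) are hypothetical, and Struct objects stand in for the records the two data sources would return.

```ruby
# Stand-in for a receipt returned by either data source.
ReceiptRecord = Struct.new(:id, :number, :merchant, :items)

# Returns the names of the attributes whose values differ between
# the two records.
def attribute_mismatches(local_record, api_record, attributes)
  attributes.reject do |attribute|
    local_record.public_send(attribute) == api_record.public_send(attribute)
  end
end

local_record = ReceiptRecord.new(1, 11.40, "Café Olé", ["coffee", "bagel", "juice"])
# Simulates an import that lost everything after an unusual character.
api_record   = ReceiptRecord.new(1, 11.40, "Caf", ["coffee"])

mismatches = attribute_mismatches(local_record, api_record, %i[number merchant items])
puts mismatches.inspect # prints [:merchant, :items]
```

Inside a spec, asserting that this list is empty would produce a failure message that names the diverging attributes instead of stopping at the first one.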

This test will take a while to run, but we’re not running it all the time: we run it only when we’re preparing to do something risky, like switching the data source that the app is using or deleting the receipts table in the relational database. In that case, compared to no test, it is indeed slow. Compared to having QA manually check every record to get this same level of confidence? It’s blazing fast.

Now let’s talk about our other top risk: API inconsistency between the two data sources.

require 'rails_helper'

RSpec.describe Services::ReceiptDataSource do
  describe "equivalency" do
    it "has all the methods present in the repository library" do
      expect(
        Repositories::ReceiptDataSource.instance_methods - Services::ReceiptDataSource.instance_methods
      ).to be_empty
    end
  end

  ...

end

We write a test to compare the method lists of our two data sources.

In statically typed languages, you wouldn’t assert this with a test. Instead, you would have both data sources adhere to an interface, and then the code wouldn’t compile if the things that these tests are testing weren’t true. In this case, though, we want something there to make sure we don’t miss any critical methods in our development process.
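
The method-list comparison can be tightened a little. As a sketch under assumptions (the missing_or_mismatched helper and the two stand-in classes are made up for illustration), the following also flags methods that exist in both classes but accept a different number of arguments:

```ruby
# Stand-in classes; only the method signatures matter here.
class RepositorySource
  def index; end
  def get(id); end
end

class ServiceSource
  def index; end
  def get; end # same name, but no id parameter
end

# Lists the public methods defined directly on `reference` that
# `candidate` either lacks or defines with a different arity.
def missing_or_mismatched(reference, candidate)
  reference.public_instance_methods(false).reject do |name|
    candidate.method_defined?(name) &&
      candidate.instance_method(name).arity == reference.instance_method(name).arity
  end
end

puts missing_or_mismatched(RepositorySource, ServiceSource).inspect # prints [:get]
```

This is still weaker than a compiler-checked interface, since arity says nothing about argument meaning or return types, but it catches one more class of drift for very little code.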

require 'rails_helper'

RSpec.describe "ReceiptDataSourceTransition" do
  describe "get_receipts" do
    before do
      Receipt.delete_all
    end

    it "gets similarly structured responses from both repository and service" do
      #Subjects Under Test
      @repository = Repositories::ReceiptDataSource.new
      @service = Services::ReceiptDataSource.new
      @data_sources = [@repository, @service]

      #Given
      FactoryGirl.create(:receipt, number: 11.40)
      FactoryGirl.create(:receipt, number: 22.50)
      stub_request(:get, "#{ENV['RECEIPT_ENDPOINT']}/receipts.json").
      to_return(body: [
        {"uuid": "324-asd-423-fsd", "relational_id": 1, "number": 11.40},
        {"uuid": "asd-ewr-456-676", "relational_id": 2, "number":  22.50}
      ].to_json)

      #When
      @results = @data_sources.map {|ds| ds.get_receipts }

      #Then
      @results.each do |result|
        expect(result).to respond_to(:each)
        expect(result.length).to eq(2)
        expect(result.first).to respond_to(:number)
      end
    end

    ...

  end
end

Let’s also write a test to compare the return values of the get_receipts method in our two data sources.

expect(result).to respond_to(:each)

It’s worth noting that, instead of asserting equality between the two data payloads here, I want to assert that, regardless of the data itself, both return values respond to certain requests — namely, each — the method we use on this return value in the view.

expect(result.first).to respond_to(:number)

I also check that the items in the returned collections respond to number, an attribute on Receipt that we’ll also want to call in the view.
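
The same duck-typing idea can be shown outside RSpec. In this sketch the view_contract_errors helper and the two Struct types are hypothetical. The point is that differently shaped objects pass as long as they answer the messages the view sends.

```ruby
# Hypothetical helper: collects the ways a result fails the contract
# the view relies on (iterable, items answer `number`).
def view_contract_errors(result)
  errors = []
  errors << "result does not respond to each" unless result.respond_to?(:each)
  first = result.respond_to?(:first) ? result.first : nil
  errors << "items do not respond to number" unless first.respond_to?(:number)
  errors
end

# Two different record shapes, as the database and the API might return.
LocalReceipt = Struct.new(:id, :number)
ApiReceipt   = Struct.new(:uuid, :relational_id, :number)

from_database = [LocalReceipt.new(1, 11.40)]
from_api      = [ApiReceipt.new("324-asd-423-fsd", 1, 11.40)]
malformed     = [{ "number" => 11.40 }] # a Hash has no `number` method

[from_database, from_api, malformed].each do |result|
  puts view_contract_errors(result).inspect
end
```

The first two results produce no errors even though their classes differ. The raw hash fails because the view would call number on it as a method.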

Unit tests help us drive out intra-class functionality and build confidence in our incremental changes.

Integration tests help us ensure that our inter-class and inter-app configuration works as a system.

But neither of these test types presents a panacea for helping us save time and avoid worry: unit tests are relatively slow to drive out, and they tend to end up testing our framework and micro-managing our implementation choices. Integration tests give us a long feedback loop, and they tend to produce so many false positives that developers start ignoring their results.

Instead, we can consider the risks present in our system as a whole, then write a test harness that mitigates the largest risks and communicates those risks to the rest of the team.

This risk-focused perspective, over time, makes it easier for you and your teammates to spot and preempt the kinds of bugs that could become headaches later.
