Improving Your API Design Skills with Open Source Examples

Reading Time: 9 minutes

Long before I became a programmer, I studied at a public fine arts conservatory. My specialty was visual arts: I drew and I painted.

My first week there, the program director slung several pieces of fabric over a pile of disassembled skeletons, then told us all to draw it. My second week there, he plinked us in the hallway for four hours and instructed us—you guessed it—to draw the hallway.

We don’t have a good word for the skill that this exercise develops. The mantra for drawing-from-life is to “really look” at your subject. The point isn’t just to stare at it. The point, rather, is to do the visual equivalent of listening (which we’ve talked about extensively right here). You’re supposed to look past what you expect to see, or what you assume is there, and to question those assumptions. You’re supposed to let the subject show you how it doesn’t match those assumptions. For example, you know that the tiles on the hallway extend all the way into the distance. But you can’t actually see them all the way into the distance, so in a realistic drawing you should not draw them all the way into the distance.

When we notice these differences between our expectations and reality, we are surprised. The brain works by making predictions, and it learns by noting when its predictions don’t match the results. Surprise triggers memories. The learning exercises I design, for myself and my students, often focus on trying things out and then looking up (or talking about) how others might solve the same problem. This helps the lessons stick in memory.

Allow me to propose a programming exercise that correlates to the drawing-from-life exercise.

This quarter, I decided to have my Python Programming students replicate functions in the widely-used open source Python libraries pytest and pandas. I thought that, since I needed some kind of substrate problems for them to solve, I might give them the kind of problems that open-source maintainers had faced in building tools that my students would use. Then, when it was time for my students to learn the tools, they would have a head start from studying how those tools were built and what the tools could do.

I expected my students to learn from it. I did not expect to learn so much from it myself.

I’ll walk you through a smallish-scope example below, but I feel compelled to say that I’m thoroughly enjoying my “API studies” on pytest and pandas.1 I’ll show you here my current favorite study I’ve done: method chaining in Pandas. Here’s a usage example for my implementation from scratch next to the same operation in the real pandas library:

The actual pandas implementation (admittedly faster and more robust 🙂 )

My favorite thing about this exercise is that it requires almost no ramp-up time, provided you elect to study a library you already use all the time. In my Python work, for example, I use pytest and pandas a lot, so I already know large portions of their APIs. I don’t have to research these libraries before jumping into the exercise: I can start from what I already know.

The goal: Replicate the behavior of a function in an open-source library.

How do we get from plain, vanilla code to a function that just works for tens of thousands of client developers? Sometimes a library’s functionality can seem like magic. Although the solution sometimes involves metaprogramming, it’s never (well, rarely 😉 ) actual magic.

Take pytest‘s parametrization decorator, for example. Here’s what it looks like, straight from the documentation:

I can’t say I love this decorator—I think its legibility is suspect. But it’s a useful tool for running the same set of assertions on a collection of inputs for giving students lots and lots of examples to understand how a function they’re about to write is supposed to work.

If I were to rewrite that from scratch, I could do something like this:

def parametrize(inputs):
    def decorator(function):
        for example in inputs:
            print(f"Running {function.__name__} with args {example}")
            except Exception as e:
                print("  " + str(e))
    return decorator

And then call it like so:

data = [
    ("x", "x"),
    ("2", "2"),
    ("2", "21"),
    (2, 5)

def test_one_two_equal(one, two):

To get something like this, that runs test_one_two_equal on each of the sets of inputs in data:

Once I’ve tried solving the problem on my own, here’s the fun part: I can compare my solution to the real one that is actually published in pytest. At the time of this writing, it’s right here. I found it by Googling “github pytest” and clicking on the first link (which took me to their github repo), then using Github’s search function to find “def parametrize” (in quotes) and selecting the option “search in repository”. Like this:

The actual solution is much longer than mine—128 lines—but a lot of that is documentation. Minus the documentation, much of the remainder handles edge cases that I didn’t think about. What if the examples pertain to different signatures for this function? What if parametrize decorates the same function multiple times? What if one of the arguments is callable? Et cetera.

In the end, rather than calling the decorated function on each of the argument sets like I do, the pytest implementation adds the assembled calls to a collection, then assigns that collection to a privately-marked (in Python an underscore says “you could modify this, but you’re not really supposed to”) attribute on an instance of the Metafunc class that the parametrize method belongs to. Here’s the assignment:

And here is the initializer on Metafunc, with _calls highlighted in blue:

Those calls all get summarily run from a PyCollector class:

That collector function gets called from a function, that gets called from a function, that gets called from a function, et cetera et cetera, until you pop out at the top in the _main method in

That method is a privately-labeled method that gets wrapped and called from this public method:

Which gets called from the public main method in the pytest initializer:

I figure this stuff out by navigating through code locally in my IDE (Jetbrains, CMD+B to see definition, OPTION+F7 to see references), but as of sometime in 2020 you can do it directly in Github too by clicking on the function name and tabbing between Definitions and References on the popover like so:

This functionality presented me with enough code to wade through that I might not have done it without a pointed interest in whether I “got the implementation right” or not (kinda seems like I didn’t). But in the process, I learned a lot about how pytest detects, assembles, and then runs test examples despite not asking clients to organize all their test methods in subclasses of some explicit TestCase object.

That particular API decision on pytest’s part has piqued my curiosity for some time. It differs from what most xUnit implementations do, including Python’s unittest:

Peep that unittest.TestCase subclass.

A superclass test runner is also exactly what I would have done, and in fact did, for my students’ test runner, as you can see right here. It’s a little meaty in its final form, but if you’re interested in that test runner implementation, this is the exercise I have my students do to help them understand test runners (in convenient Jupyter notebook format!). Once they’ve done that, this is the exercise (another notebook!) where they add improvements to my test runner.

Were I to write a test runner for actual use, I think I would still follow the xUnit pattern. But now, I wouldn’t follow it just because it’s the only one I understand. I have taken the time to explore an alternative. I would no longer make that decision just to avoid facing my own uncertainty.

Exploring the real implementations of the APIs I have cheaply copied perpetually underscores one big lesson: there is a big difference between making an API work for one person and making it work for lots of different people. Getting the sucker to work is like 1% of the code. The other 99% is split about evenly between handling errors, accounting for edge cases, and docstrings. Perhaps that should serve as inspiration for me to do more of the same in my own code.


  1. When I say “API” here, I do not mean specifically “a means to get data from a server over an HTTP connection.” I mean it in the more general sense—the application programming interface. The set of functions and attributes that a library exposes publicly for its clients to use.

If you liked this piece, you might also like:

The rest of my pieces about strategies and tactics for leveling up as a technologist

This walkthrough of the time and space efficiency tradeoffs in an in-memory database implementation

The series on designing graceful processes in software, which starts right here

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.