I’ve heard a lot of opinions about API design. This is great—I love opinions.
I want folks to inform their opinions with context on the history of API design. We seem to be in this weird continuous loop of folks pontificating based on whatever has come out in the last two years. And because those opinions are missing a lot of history, they also make a lot of assumptions that prove inaccurate. Some examples:
REST is ye olde standard, ’tis THE WAY to fetch data from thine API!
GraphQL is the newest, hottest idea in fetching data through an API!
When I’m writing an API, I have to choose between the SOAP and the REST protocols (unless I use GraphQL).
I’m gonna take us all on a journey through computing history so that this stops happening.
So first of all:
In the beginning, there was no SOAP or REST.
If you’ve never heard of either SOAP or REST, don’t worry: we’re starting this story before either of those existed, at the birth of the internet.
Two nights before All Hallows’ Eve, 1969: Grad student programmers Charley Kline and Bill Duvall linked two computers in California, one at UCLA and one at the Stanford Research Institute, and transmitted the message “L,” then “O,” from one to the other. Then the system crashed. They rebooted it, tried again, succeeded in transmitting the full message “L-O-G-I-N,” and then went out for burgers. And thus, they sent the first message across the internet!
Kline’s regret about the message: he should have come up with something more poetic. What’s even funnier about this is that Neil Armstrong had uttered his history-making “One small step for a man, one giant leap for mankind” just three months before. Kline theoretically had both precedent and inspiration to come up with something better than “LOGIN,” but here we are.
So the transmission medium was born. I’ll let Wikipedia help me fast-forward through its childhood and awkward teen years:
The Internet protocol suite (TCP/IP) was developed…in the 1970s and became the standard networking protocol on the ARPANET.
Commercial Internet service providers (ISPs) began to emerge in the very late 1980s. The ARPANET was decommissioned in 1990.
ARPANET was commemorated, by the way, with this rousing elegy by internet pioneer Vinton G. Cerf. Cerf, clearly, had more poetic inclination than Charley Kline 😂.
Limited private connections to parts of the Internet by officially commercial entities emerged in several American cities by late 1989 and 1990, and the NSFNET was decommissioned in 1995, removing the last restrictions on the use of the Internet to carry commercial traffic.
With the rise of privately (and eventually publicly) available connections to the internet, the programmers doing the connecting needed to agree on a protocol so that their apps could talk to each other.
CORBA Predates SOAP and REST
The first widespread attempt at such a protocol predates both SOAP and REST. Instead, the frontrunner in 1991, 22 years after the “L-O” moment, was a heavy-handed approach called the Common Object Request Broker Architecture: CORBA.
CORBA came from the Object Management Group, a nonprofit technology standards consortium whose acronym is, in fact, OMG.
CORBA’s pattern separates the client (code running on one machine) from the server (code running on a different machine), and it centers on making remote procedure calls (RPC) from the client to a proxy that connects to code on the server. The client then waits for a response from the proxy and runs client-side code to translate that response into something it can understand. It doesn’t concern itself with the details of the server network. Before this, clients and servers spoke language-specific binary protocols. Now a C++ app could talk to a Java app!
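To make the client/proxy/server split concrete, here is a minimal sketch of the pattern in Python. It is not CORBA itself: real CORBA describes interfaces in its own interface definition language and uses a binary wire format, whereas this sketch swaps in JSON and a direct function call for the network hop. The names `Stub`, `Skeleton`, and `add` are made up for illustration.

```python
import json
from typing import Any, Callable, Dict


class Skeleton:
    """Server side: unpacks a request and calls the real implementation."""

    def __init__(self) -> None:
        self._procedures: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, func: Callable[..., Any]) -> None:
        self._procedures[name] = func

    def handle(self, raw_request: bytes) -> bytes:
        request = json.loads(raw_request.decode("utf-8"))
        result = self._procedures[request["procedure"]](*request["args"])
        return json.dumps({"result": result}).encode("utf-8")


class Stub:
    """Client side: forwards each call to the server and translates the reply."""

    def __init__(self, transport: Callable[[bytes], bytes]) -> None:
        self._transport = transport

    def call(self, procedure: str, *args: Any) -> Any:
        payload = {"procedure": procedure, "args": list(args)}
        raw_request = json.dumps(payload).encode("utf-8")
        raw_response = self._transport(raw_request)  # the client waits here
        return json.loads(raw_response.decode("utf-8"))["result"]


def add(a: int, b: int) -> int:
    """The 'remote' procedure. It could live in any language on the server."""
    return a + b


skeleton = Skeleton()
skeleton.register("add", add)

# Assumption: a plain function call stands in for the network between machines.
stub = Stub(transport=skeleton.handle)
print(stub.call("add", 2, 3))
```

The client only ever touches `stub.call`. It never learns where `add` lives or what language it is written in, which is the part of the idea that stuck around.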
If you’re writing code in 2018, maybe this all sounds humdrum: client-server, remote calls, proxies, apps in different languages are all just part of using APIs, right? Sure, now—but they weren’t always. CORBA put a lot of that in place for the rest of us. So when you read war stories that bash CORBA, keep that in mind.
So what sucked about CORBA? Frankly, the same thing that sucks about any API layer:
Talking to each other required meeting and agreeing on interfaces. Updates to interfaces required updates on both sides. The process to make updates was costly, since it involved multiple people meeting in a room and hammering out these changes.
Nowadays, we sometimes do a little better at this interface coordination thing than the teams writing CORBA did back in the day. But a) to be honest not even really, and b) the CORBA protocol wasn’t a bad first attempt at interface coordination, and it gave later protocols something to build on.
But what came next?
SOAP Improves on CORBA (sort of)
In the ’90s Microsoft had the reputation that Google has now of swallowing up the nascent work of smaller software groups. And the opportunity was ripe for them. Greg Turnquist explains:
The OMG, the consortium responsible for the CORBA spec, had gaps not covered by the spec…To cover these gaps, every vendor had proprietary extensions. The biggest one was Iona, an Irish company that at one time held 80% of the CORBA marketshare. We knew them as “I-own-ya” given their steep price.
CORBA was supposed to be cross-vendor supported, but it wasn’t. You bought all middleware from the same vendor. Something clicked, and LOTS of customers dropped Iona. This galvanized the rise of [Simple Object Access Protocol] SOAP.
SOAP still relied on RPC, but it boasted a more lightweight list of Stuff to Codify™ than CORBA did. Instead of bespoke procedures to translate requests and responses on both the client and server, client and server would now agree on a common interface codified in a language called Extensible Markup Language, or XML. XML looks kind of like the HTML used to define headers, numbered lists, and other formatting tropes for text on the internet, but developers can extend the available tag list with their own bespoke tags. A SOAP client/server pair would agree on the exact flavor and format of tags for their communication with an XML document written in Web Services Description Language, or WSDL (pronounced ‘WIZ-dull’).
The most important thing to know about SOAP: you hit one endpoint with one HTTP verb and various request bodies to get what you want. The original SOAP from Microsoft was locked to XML as its data transfer format. XML was, at the time, also a new thing largely influenced by Microsoft.
The requests look something like this:
I finally put this in a gist because the WordPress html editor kept deleting my XML tags. Curses!
That’s a request body. Those words in pointy brackets are the XML part.
This equivalent REST request shows the same body information formatted as JSON:
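Here is a rough stand-in for the kind of thing those two request bodies contain, built in Python. It is a hypothetical illustration, not the original gist: the `AddUser` operation, the `Name` field, and the `http://example.com/users` namespace are all invented. Only the SOAP 1.1 envelope namespace is real.

```python
import json
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
EXAMPLE_NS = "http://example.com/users"  # hypothetical service namespace


def build_soap_body(name: str) -> str:
    """Wrap a made-up AddUser operation in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_ENV)
    ET.register_namespace("u", EXAMPLE_NS)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    operation = ET.SubElement(body, f"{{{EXAMPLE_NS}}}AddUser")
    ET.SubElement(operation, f"{{{EXAMPLE_NS}}}Name").text = name
    return ET.tostring(envelope, encoding="unicode")


def build_json_body(name: str) -> str:
    """The same information as a JSON request body."""
    return json.dumps({"name": name})


soap_request = build_soap_body("Ada")
json_request = build_json_body("Ada")
print(soap_request)
print(json_request)
```

Both strings carry one fact, a user's name. The XML version spends more characters saying so, but it also names its own vocabulary through the namespaces.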
People love to kvetch and moan about how complicated the SOAP spec is because you have to load up your request inside of an XML request body. The Microsoft team tried to make it more palatable by adopting the skeuomorphic envelope as the name for the request body, but that seems to have backfired. Look: the fact that it’s called an envelope doesn’t make it anything more than the formatting of a request body.
A lot of people hate XML. The gossip: XML isn’t legible. I don’t buy this. You see it up there in the example: it’s verbose, but it’s legible.
Instead, I think this is what happened:
When SOAP started in early 1998, there was no schema language or type system for XML (in fact, XML 1.0 had just become a full Recommendation that quarter).
So no one had object mappers into or out of XML, which meant they had to do object mapping by hand. Also, because this was among the first standardized protocols for passing structured data over HTTP, no one had much familiarity with object mapping at all.
Have you ever seen a programmer take a crack at a problem they have never seen the likes of before in their lives? Now, imagine thousands of programmers all doing that at once. You’re gonna get some weird approaches. I have worked on one legacy project with a hand-rolled XML object mapper that loops through the objects one at a time and, therefore, is quite slow. I have worked on another legacy project in which the programmers assembled XML requests by string concatenating XML tags with their data. The concatenation program is the second most beautiful trainwreck I have seen in my programming tenure, and the XML it produces is…not valid. It’s almost valid. But it’s just far enough away from valid that it’s a pain in the neck to parse into objects.
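I can't show you that project's code, but here is one common way string-concatenated XML ends up almost valid, sketched in Python. This is an assumption about the general failure mode, not a reconstruction of what that team did. The `company` tag and the sample name are made up.

```python
import xml.etree.ElementTree as ET


def concat_xml(company: str) -> str:
    # The hand-rolled approach: glue tags and data together as strings.
    return "<company>" + company + "</company>"


def built_xml(company: str) -> str:
    # Let a library build the element so special characters get escaped.
    element = ET.Element("company")
    element.text = company
    return ET.tostring(element, encoding="unicode")


def is_well_formed(document: str) -> bool:
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


name = "Smith & Sons"
print(is_well_formed(concat_xml(name)))  # False: the bare "&" breaks the parser
print(is_well_formed(built_xml(name)))   # True: the library wrote "&amp;"
```

The concatenated version works fine right up until the data contains an ampersand or an angle bracket, which is exactly why this kind of bug survives for years.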
But those programs were written before we had fancy mappers. Now we have fancy mappers, and XML is no longer some kind of untameable beast. Turns out computers are pretty good at picking out text from rigid predefined patterns, even if said text is unappealing to our discerning human eyeballs.*
*Speaking of which, I cannot help but notice the glaring inconsistency in the fact that natural language processing is the sexiest of the sexy to programmers right now, but XML parsing is thought to be a pain in the ass. I work on NLP problems. 98% of the work is cleaning up text data with a combination of parsers, regexes, conditionals, and cussing.
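For a sense of what a "fancy mapper" does, here is a tiny sketch in Python using only the standard library. Real mappers handle nesting, types, and schemas. This one is a toy that assumes flat elements with string values, and the `User` class with its `name` and `email` fields is made up for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields


@dataclass
class User:
    name: str
    email: str


def to_xml(user: User) -> str:
    """Object to XML: one child element per dataclass field."""
    root = ET.Element("user")
    for field in fields(user):
        ET.SubElement(root, field.name).text = str(getattr(user, field.name))
    return ET.tostring(root, encoding="unicode")


def from_xml(document: str) -> User:
    """XML to object: read each field's element by name."""
    root = ET.fromstring(document)
    values = {f.name: root.findtext(f.name, default="") for f in fields(User)}
    return User(**values)


original = User(name="Ada", email="ada@example.com")
round_tripped = from_xml(to_xml(original))
print(round_tripped == original)
```

Nobody has to look at the angle brackets in between, which is the whole point.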
Why are we talking about the origins of excessive XML hatred? You’ll see when we get to Part 3. In the meantime, if you take one thing away from Part 1, please let it be that SOAP-inspired protocols make requests to one URL with different request bodies to get different results. This becomes relevant later in our story, roundabout 2015 or so.
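If that takeaway feels abstract, here is a minimal sketch of it in Python: one handler that receives every request and decides what to do by looking inside the body. No real web server is involved, `handle_post` stands in for whatever sits behind the single URL, and the `GetGreeting` and `GetFarewell` operations are invented for the example.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def get_greeting(params: ET.Element) -> str:
    return "Hello, " + params.findtext("Name", default="stranger")


def get_farewell(params: ET.Element) -> str:
    return "Goodbye, " + params.findtext("Name", default="stranger")


# Hypothetical operations, keyed by the tag name found inside the Body.
OPERATIONS: Dict[str, Callable[[ET.Element], str]] = {
    "GetGreeting": get_greeting,
    "GetFarewell": get_farewell,
}


def handle_post(request_body: str) -> str:
    """The single endpoint: every request arrives here, whatever it wants."""
    envelope = ET.fromstring(request_body)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    if body is None or len(body) == 0:
        raise ValueError("request has no SOAP Body with an operation in it")
    operation = body[0]  # the first child of Body names the operation
    return OPERATIONS[operation.tag](operation)


GREETING_REQUEST = f"""<soap:Envelope xmlns:soap="{SOAP_ENV}">
  <soap:Body><GetGreeting><Name>Ada</Name></GetGreeting></soap:Body>
</soap:Envelope>"""

FAREWELL_REQUEST = f"""<soap:Envelope xmlns:soap="{SOAP_ENV}">
  <soap:Body><GetFarewell><Name>Ada</Name></GetFarewell></soap:Body>
</soap:Envelope>"""

print(handle_post(GREETING_REQUEST))
print(handle_post(FAREWELL_REQUEST))
```

Same URL, same verb, two different bodies, two different results.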
Check out the next section, The Arrival of REST, to find out what happened next.
And while we’re at it, I’d like to introduce you to:
This fantastic series from Giovanni Navaria on the birth of the internet. This series was one of my favorite reads from researching my own series, and it goes into more depth about an early period of the web’s development that my series only skims.