Raft 2: Setting Up Socket Servers

Reading Time: 8 minutes

In November, I took Dave Beazley’s week-long course on The Structure and Interpretation of Computer Programs.

In December, I took Rafting Trip, in which we attempted to implement the Raft algorithm from this paper. In this series, I’ll share memorable insights from the course (here’s where you can see all the posts so far).

I explained a little bit about why Raft exists and how it works in the previous post, and we’ll get deeper into it later. For now, you need to know that the algorithm operates on a collection of servers, some number of which may or may not be up and running, all of which should maintain a consistent set of data to serve up to their clients.

So if we’re gonna try to build this thing, we will need:

  1. Some servers
  2. A set of data to keep consistent
  3. The ability to recover data if the server goes down
  4. Some clients

In this post, I’ll go over 1, 2, and 3. Number 4 will be the topic of the next post.

The Servers

I’m doing this with socket servers. What’s a socket server? I’ll show you. I am keeping all the details of this Raft project inside one repository, and I’m trying to maintain clearly named, well-circumscribed commits. The first commit implements an echo server and an echo client, both with sockets. I annotated the commit right here if you’d like to copy and paste code or navigate my comments with a screen reader. I’ll also post screenshots right here to save you a click, if you prefer.

First, we have the server:

[Screenshots: the echo server code]

So there are four main steps: initialize the socket, bind it to an address, tell it to listen, and then open a loop to accept any incoming connections (and, finally, close each connection when we are done).
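
If you’d rather read text than squint at screenshots, here’s a minimal sketch of those four steps. The names and details are my own reconstruction, not necessarily the commit’s exact code:

    import socket

    # Step 1: initialize the socket (IPv4, TCP).
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

    # Step 2: bind it to an address. The post uses port 1000; on some
    # systems, ports below 1024 require elevated privileges.
    server_address = ('localhost', 1000)
    print('starting up on %s port %s' % server_address)
    sock.bind(server_address)

    # Step 3: tell it to listen for incoming connections.
    sock.listen(1)

    # Step 4: loop, accepting connections and echoing back whatever arrives.
    while True:
        connection, client_address = sock.accept()
        try:
            print('connection from %s' % str(client_address))
            while True:
                data = connection.recv(1024)
                if not data:
                    break
                connection.sendall(data)  # echo the bytes back
        finally:
            # Finally, close the connection when we are done.
            connection.close()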

By contrast, here is a socket for the client:

[Screenshots: the echo client code]

So this time we have three steps: initialize the socket, connect it to an address where another socket is listening, and then send over some data (and listen for a response). Finally, we close the connection when we are done.
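
And a sketch of the client side, under the same caveat that the details are mine, not the commit’s:

    import socket

    # Step 1: initialize the socket.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

    # Step 2: connect to the address where the server socket is listening.
    server_address = ('localhost', 1000)
    print('connecting to %s port %s' % server_address)
    sock.connect(server_address)

    try:
        # Step 3: send over some data and listen for a response.
        # An empty line ends the session.
        while True:
            message = input('Type your message: ')
            if not message:
                break
            sock.sendall(message.encode())
            data = sock.recv(1024)
            print('received: %s' % data.decode())
    finally:
        # Close the connection when we are done.
        sock.close()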

You can check out this code, fire up a terminal, open two windows, and navigate to the raft directory in each window. Then, if you run python echo_server.py in one window, you should see:

starting up on localhost port 1000

Now, run python echo_client.py in the other window. In the client window, you should see connecting to localhost port 1000, followed by Type your message:. In the server window you should see connection from (localhost, [some port]). This is your client. Hooray!

Now, whatever you type into the client window, you should receive back as a message from the server.

And if you’d like to learn more about the Python socket API, I recommend this piece that covers the basics.

A Set of Data to Keep Consistent

Now we need these servers to be managing a set of data. This could be any set of data. We’re going to use a key value store. In Python this type of data store is called a dictionary. We’ll make it possible for clients to tell the server to put a new key and value in the store (set), fetch a value out of the store by its key (get), and delete a key and its value from the store (delete). Here’s the annotated commit for this. Again, screenshots:

[Screenshots: the key value store server and client code]
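
To give you the flavor in text form, the command dispatch might look something like this. The commit’s actual parsing and response strings surely differ in their details (the “key not found” and “invalid command” responses here are my inventions):

    import json

    data_store = {}

    def respond_to(command):
        # Split on single spaces, so extra spaces between tokens
        # will break parsing.
        tokens = command.strip().split(' ')
        if tokens[0] == 'set' and len(tokens) == 3:
            data_store[tokens[1]] = tokens[2]
            return 'key %s set to %s' % (tokens[1], tokens[2])
        if tokens[0] == 'get' and len(tokens) == 2:
            return data_store.get(tokens[1], 'key not found')
        if tokens[0] == 'delete' and len(tokens) == 2:
            data_store.pop(tokens[1], None)
            return 'key %s deleted' % tokens[1]
        if tokens[0] == 'data':
            return json.dumps(data_store)
        return 'invalid command'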

You can check out this commit and use the same steps that you used for the echo server to get everything running. Then you can type things into the client like set chelsea rules, and you should then be able to type in get chelsea and receive the response rules. If you type in data you should get back {"chelsea": "rules"}. Just be sure to avoid extraneous spaces:

[Screenshot: a client session showing a command with extraneous spaces]

Recovering Data if the Server Goes Down

Currently, our key value store lives only in memory on our server. So if the server goes down and comes back up, that data will be lost. We want that data to persist—for the server to be able to recover its data. There are a few ways we can do this, but for now, we’ll do it by writing any data store commands to a file. Any time the server starts up, it will read this file and execute all the commands in order to return its data to its prior state, much in the way that web frameworks run all the database migrations in order to arrive at the most updated version of a database schema.
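
For instance, after a client ran a few commands, the log file might contain something like this (a hypothetical example, assuming one command per line):

    set chelsea rules
    set dave teaches
    delete dave

Replaying those three lines in order rebuilds a store containing only chelsea: rules.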

As we have continued writing this code, it has become fairly clear that the server code (for coordinating the socket connection) is a separate set of responsibilities from the management of the key value store. So we move the key value store management code into its own methods that will live in an object called KeyValueStore.

So we have a method on KeyValueStore that accepts a command, executes that command to update (or read from) the data store, and returns a response for the server to pass to the client:

[Screenshot: the KeyValueStore method that executes a command]

The server writes all commands to a log before sending them to the KeyValueStore to be executed:

[Screenshot: the server writing each command to a log before executing it]
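
In sketch form, that handling path might look like this. I’m calling the store’s method execute_command and the log file logs.txt; both names are my assumptions, not necessarily the commit’s:

    def handle(command, store, log_path='logs.txt'):
        # Write the command to the log first, so it survives a crash...
        with open(log_path, 'a') as f:
            f.write(command + '\n')
        # ...then hand it to the KeyValueStore to be executed.
        return store.execute_command(command)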

And each time the server boots up, it initializes a KeyValueStore and runs a method on the store called catch_up():

[Screenshot: the server boot-up code initializing a KeyValueStore and calling catch_up()]

That catch_up() method reads from the logs and executes the commands to return the KeyValueStore's data attribute to its previous state:

[Screenshot: the catch_up() method]
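
Putting it all together, here’s a consolidated sketch of the store with persistence. Apart from catch_up(), which the post names, the names and file layout are my assumptions:

    import json
    import os

    class KeyValueStore:
        def __init__(self, log_path='logs.txt'):
            self.data = {}
            self.log_path = log_path

        def execute_command(self, command):
            # The same dispatch as the respond_to sketch earlier,
            # now operating on self.data.
            tokens = command.strip().split(' ')
            if tokens[0] == 'set' and len(tokens) == 3:
                self.data[tokens[1]] = tokens[2]
                return 'key %s set to %s' % (tokens[1], tokens[2])
            if tokens[0] == 'get' and len(tokens) == 2:
                return self.data.get(tokens[1], 'key not found')
            if tokens[0] == 'delete' and len(tokens) == 2:
                self.data.pop(tokens[1], None)
                return 'key %s deleted' % tokens[1]
            if tokens[0] == 'data':
                return json.dumps(self.data)
            return 'invalid command'

        def catch_up(self):
            # Replay every logged command, in order, to return
            # self.data to its state before the server went down.
            if not os.path.exists(self.log_path):
                return
            with open(self.log_path) as f:
                for line in f:
                    if line.strip():
                        self.execute_command(line)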

If you check out this commit, you should be able to do all the same things that you did at the “A Set of Data to Keep Consistent” stage, except now you should also be able to shut down and boot up the server at will without losing your precious data entries like chelsea: rules.

Some Clients

We’re doing pretty good, but we still need to take care of a detail. Currently, each of our servers will only accept connections from one client at a time. We would like them each to connect to multiple clients. Why? Because Raft achieves fault tolerance, in part, by electing one server the “leader” of the cluster, and all client requests are redirected to that leader. Any of the servers in the cluster could be elected leader, so each of them needs to be able to do this.

Why does each server only accept connections from one client at a time? And how would we go about changing that? We’ll talk about it in the next post.

If you liked this piece, you might also like:

The SICP series (based on another Dave Beazley course)

The Crafting Interpreters Series (ongoing)

The stuff in the (brand new!) “Debugging” category (I haz a proud)

 
