SICP 5: Modularity, Objects, and State

Reading Time: 7 minutes

Screen Shot 2019-11-05 at 10.35.37 AM

I recently took Dave Beazley’s week-long course on The Structure and Interpretation of Computer ProgramsIn this series, I’ll share memorable insights from the course (here’s where you can see all the posts so far).

The first two chapters of SICP emphasize the idea that all modular programming amounts to substitution. We substitute placeholder names for the values that we pass into functions and then run the code with those values.

In this post, we’ll turn to the third chapter of the book, which introduces mutable state: the ability to change the value of an existing variable.

We can do that in scheme with the set! operation:


; 3.1
(define (make-counter value)
(define (incr x)
(set! value (+ value x))
value)
incr)
(define counter (make-counter 100))
; > (counter 100)
; 200
; > (counter -1)
; 199
; > (counter 10)
; 209
; > (counter -150)
; 59

view raw

state.scm

hosted with ❤ by GitHub

Once we add mutable state, the substitution model we have used so far no longer works.

What changes when we add mutable state to our programs?

 Mutability Change #1: Rather than rely on substitution of passed-in variables, we must introduce the idea of scopes.

bird scope

Suppose we want to build an interpreter in Python to evaluate and execute scheme code. We’d want it to evaluate typical Scheme expressions like these:


assert seval(23, {}) == 23
assert seval("x", {"x": 23}) == 23
assert seval(("+", 1, 2), env) == 3
assert seval(("+", 1, ("*", 2, 3)), env) == 7
seval(("define", "x", 13), env) == 7
assert seval(("x"), env) == 13
assert seval(("if", ("<", 2, 3), 4, 5), env) == 4
assert seval(("if", (">", 2, 3), 4, 5), env) == 5
assert substitute(('*', ('+', 'x', 'y'), ('x',)), 'x', 2) == ('*', ('+', 2, 'y'), (2,))
factorial = ('define', 'factorial',
('lambda', ('n',), ('if', ('=', 'n', 1), 1, ('*', 'n', ('factorial', ('-', 'n', 1))))))
seval(factorial, env)
seval(('define', 'n', 5), env)
result = seval(('fact', 'n'), env)
assert result == 120

view raw

scheme.py

hosted with ❤ by GitHub

We write essentially a giant conditional for the evaluation and application of basic syntax (that link will show you the whole interpreter, and I’ll only inline the code directly relevant to our discussion here).

We want programmers to be able to create and execute procedures. We create a small class to represent the parts of a procedure, and we create an instance of it to represent each procedure in the code:


class Procedure:
def __init__(self, parameters, expressions, env):
self.parameters = parameters
self.expressions = expressions
self.env = env
def __call__(self, *args):
for expression in self.expressions: # Substitute the arguments for parameter names
for names, values in zip(self.parameters, args):
expression = substitute(expression, names, values)
result = seval(expression, # Evaluate the resulting expression
dict(self.env)) # makes a copy of env
return result

view raw

scheme.py

hosted with ❤ by GitHub

For every expression in the procedure, we take the parameters (the names that represent passed-in values in a function’s definition) and the arguments (the values passed in for those names when the function is called), zip them together, and then, on line 10, call the substitute method, which lives at the top level of our interpreter and looks like this:


def substitute(exp, name, value):
if exp == name:
return value
elif isinstance(exp, tuple):
return tuple([substitute(part, name, value) for part in exp])
else:
return exp #Unchanged

view raw

scheme.py

hosted with ❤ by GitHub

It’s recursively substituting values for names before returning the fully substituted expression.

When it’s time to add scope, this goes away, and we must re-implement our interpreter to keep track of environments (again, full interpreter at link).

Our substitute method goes away. Now, each time we find ourself within the scope of a new procedure, we need to keep track of the environment at the time that the procedure was created, and additionally keep track of a new environment for the scope of this specific procedure.


class Procedure:
def __init__(self, parameters, expressions, env):
self.parameters = parameters
self.expressions = expressions
self.env = env # PROCEDURE MUST REMEMBER THE ENVIRONMENT AT THE TIME THAT IT WAS DEFINED
def __call__(self, *args):
# Args are the argument values
# Make a new scope for the local variables
local_env = {}
# Bind the parameter names to values
for name, value in zip(self.parameters, args):
local_env[name] = value
for expression in self.expressions:
result = seval(expression, {**self.env, **local_env})
return result

view raw

scheme.py

hosted with ❤ by GitHub

For nested expressions, our nested environments are recursively passed through, first to seval (which evaluates expressions), then back into a new environment if evaluation reveals another procedure.

A quick note: You’ll notice on line 17 that we’re slapping our new local env on top of a copy of the existing environment, and for a huge environment that will be really space inefficient. We’ve talked on this blog before about a similar situation: tracking nested transactions in an implementation of an in-memory database. If this were a real interpreter and not a toy for learning purposes, I might turn to a solution like that one. Note that the nested scopes wouldn’t face that annoying deletion snafu we faced with the database: local scopes can overwrite the values of variables from broader scopes, but they don’t delete them.

 Mutability Change #2: “is” and “equals”.

pair fo jays

The question of “is” (value equality) versus “equals” (reference equality) is only a problem when you introduce mutability. If created variables cannot be changed, it doesn’t matter if objects have the same reference or not. Their values are not going to change. So if they’re initialized at the same value, they might as well be the same.

This is why Swift has an identity operator === that you can use for classes, but not for structs. Structs don’t change after they are created and are always uniquely referenced, so to equate them we can only implement the equality operator ==, which does value equality for primitives and is intended to be used for value equality in compound objects, too.

 Mutability Change #3: Order matters.

birds on a wire

Results only depend on the sequence of operations when variables can be reassigned. Before that, things can happen in any order, or concurrently, and the result is the same. “Stateless” functions (kind of a misnomer, the state is passed in, it’s not completely absent) make sense in cases of concurrent or asynchronous calls for exactly this reason.

When we mix state and potential concurrency, we can have the same piece of data getting modified at the same time in two different ways. Programmers usually resolve this with various types of mutex lock that prevent changes to data that the program thinks is currently being changed somewhere else.

Can we have state without reassignment or mutability?

Sort of. Suppose we have two operations: one that counts up, and one that counts down.


(define (countdown n)
(cond ((= n 0) 'countdown-done)
(else
(display "Down ")
(displayln n)
(call-soon (lambda () (countdown ( n 1)))))))
(define (up stop)
(define (iter x)
(cond ((> x stop) 'up-done)
(else
(display "Up ")
(displayln x)
(call-soon (lambda () (iter (+ x 1)))))))
(iter 0))
(countdown 5)
; Down 5
; Down 4
; Down 3
; Down 2
; Down 1
(up 5)
; Up 1
; Up 2
; Up 3
; Up 4
; Up 5

view raw

concurrency.scm

hosted with ❤ by GitHub

We want to run these two operations concurrently.

What if we were to implement a queue, shove operations from each of these functions onto that queue, and then pop them off and execute them one by one?

Our queue might look like this:


; 3.22
(define (make-queue)
(let ((front-ptr '())
(rear-ptr '()))
(define (empty?)
(null? front-ptr))
(define (insert! item)
(let ((new-pair (mcons item '())))
(cond ((empty?)
(set! front-ptr new-pair)
(set! rear-ptr new-pair)
)
(else
(set-mcdr! rear-ptr new-pair)
(set! rear-ptr new-pair)))))
(define (front)
(cond ((empty?) (error "Empty queue"))
(else (mcar front-ptr))))
(define (rear)
(cond ((empty?) (error "Empty queue"))
(else (mcar rear-ptr))))
(define (delete!)
(cond ((empty?) (error "Empty queue"))
(else (set! front-ptr (mcar front-ptr)))))
(define (dispatch msg)
(cond ((eq? msg 'insert!) insert!)
((eq? msg 'delete!) (delete!))
((eq? msg 'empty?) (empty?))
((eq? msg 'front) (front))
((eq? msg 'rear) (rear))
(else (error "Bad message"))))
dispatch
))

view raw

queue.scm

hosted with ❤ by GitHub

And then instead of executing each operation as it’s created, we put it on a queue. We define a run method to pop things off the queue and execute them in order:


(define ready-to-call (make-queue))
(define (call-soon proc)
((ready-to-call 'insert!) proc))
(define (run)
(cond ((ready-to-call 'empty?) 'done)
(else
(let ((proc (ready-to-call 'front)))
(proc) ; execute the 0-argument lambda
(ready-to-call 'delete!)))))
(define (countdown n)
(cond ((= n 0) 'countdown-done)
(else
(display "Down ")
(displayln n)
(call-soon (lambda () (countdown ( n 1)))))))
(define (up stop)
(define (iter x)
(cond ((> x stop) 'up-done)
(else
(display "Up ")
(displayln x)
(call-soon (lambda () (iter (+ x 1)))))))
(iter 0))
(call-soon (lambda () (countdown 3)))
(call-soon (lambda () (up 3)))
(run)
; Down 3
; Up 1
; Down 2
; Up 2
; Down 1
; Up 3

view raw

queue.scm

hosted with ❤ by GitHub

What have we done here? We have delayed our procedures’ execution. Scheme has keywords delay and force to do this explicitly in programs as well.

We can take this concept even further with something like streams. Streams in scheme function much like a list, with an important difference: the car (first element) of the stream is evaluated, but the cdr (rest of it) is not.

From Dave:

“Stream processing lets us model systems that have state without ever using assignment or mutable data.”

Because all but the first element is not evaluated, we can even create infinite streams. These streams reference themselves in their cdrs, and that’s perfectly allowed.


(define ones (stream-cons 1 ones)) ;<—has not evalutated ones yet, as ones is a stream
(stream-ref ones 1) ;–>1
(stream-ref ones 10) ;–>1
(stream-ref ones 37) ;–>1 (an infinite stream of ones)
(define integers (stream-cons 1 (add-streams ones integers))); <–counts up from 1
(define fibonacci (stream-cons 0 (stream-cons 1 (add-streams (stream-rest fibonacci) fibonacci)))) ; <–Displays fibonacci numbers

view raw

streams.scm

hosted with ❤ by GitHub

If you try to do this with an ordinary pair, the Scheme interpreter will not let you, insisting that you have referenced a value before assigning it.

In the next post, we’ll talk more about delayed evaluation—specifically, about the implementation details of building an interpreter for a Scheme like-language that defaults to lazy evaluation instead of eager evaluation.

If you liked this piece, you might also like:

The Leveling Up Series

The Crafting Interpreters Series (which includes some examples of techniques from the leveling up series)

The Techtivism Series (when I finish the listening series and get around to publishing the techtivism series)

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.