I recently took Dave Beazley’s week-long course on The Structure and Interpretation of Computer Programs. In this series, I’ll share memorable insights from the course (here’s where you can see all the posts so far).
The first two chapters of SICP emphasize the idea that all modular programming amounts to substitution. We substitute placeholder names for the values that we pass into functions and then run the code with those values.
In this post, we’ll turn to the third chapter of the book, which introduces mutable state: the ability to change the value of an existing variable.
We can do that in scheme with the set!
operation:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
; 3.1 | |
(define (make-counter value) | |
(define (incr x) | |
(set! value (+ value x)) | |
value) | |
incr) | |
(define counter (make-counter 100)) | |
; > (counter 100) | |
; 200 | |
; > (counter -1) | |
; 199 | |
; > (counter 10) | |
; 209 | |
; > (counter -150) | |
; 59 |
Once we add mutable state, the substitution model we have used so far no longer works.
What changes when we add mutable state to our programs?
Mutability Change #1: Rather than rely on substitution of passed-in variables, we must introduce the idea of scopes.
Suppose we want to build an interpreter in Python to evaluate and execute scheme code. We’d want it to evaluate typical Scheme expressions like these:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
assert seval(23, {}) == 23 | |
assert seval("x", {"x": 23}) == 23 | |
assert seval(("+", 1, 2), env) == 3 | |
assert seval(("+", 1, ("*", 2, 3)), env) == 7 | |
seval(("define", "x", 13), env) == 7 | |
assert seval(("x"), env) == 13 | |
assert seval(("if", ("<", 2, 3), 4, 5), env) == 4 | |
assert seval(("if", (">", 2, 3), 4, 5), env) == 5 | |
assert substitute(('*', ('+', 'x', 'y'), ('x',)), 'x', 2) == ('*', ('+', 2, 'y'), (2,)) | |
factorial = ('define', 'factorial', | |
('lambda', ('n',), ('if', ('=', 'n', 1), 1, ('*', 'n', ('factorial', ('-', 'n', 1)))))) | |
seval(factorial, env) | |
seval(('define', 'n', 5), env) | |
result = seval(('fact', 'n'), env) | |
assert result == 120 |
We write essentially a giant conditional for the evaluation and application of basic syntax (that link will show you the whole interpreter, and I’ll only inline the code directly relevant to our discussion here).
We want programmers to be able to create and execute procedures. We create a small class to represent the parts of a procedure, and we create an instance of it to represent each procedure in the code:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
class Procedure: | |
def __init__(self, parameters, expressions, env): | |
self.parameters = parameters | |
self.expressions = expressions | |
self.env = env | |
def __call__(self, *args): | |
for expression in self.expressions: # Substitute the arguments for parameter names | |
for names, values in zip(self.parameters, args): | |
expression = substitute(expression, names, values) | |
result = seval(expression, # Evaluate the resulting expression | |
dict(self.env)) # makes a copy of env | |
return result |
For every expression in the procedure, we take the parameters (the names that represent passed-in values in a function’s definition) and the arguments (the values passed in for those names when the function is called), zip them together, and then, on line 10, call the substitute
method, which lives at the top level of our interpreter and looks like this:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
def substitute(exp, name, value): | |
if exp == name: | |
return value | |
elif isinstance(exp, tuple): | |
return tuple([substitute(part, name, value) for part in exp]) | |
else: | |
return exp #Unchanged |
It’s recursively substituting values for names before returning the fully substituted expression.
When it’s time to add scope, this goes away, and we must re-implement our interpreter to keep track of environments (again, full interpreter at link).
Our substitute
method goes away. Now, each time we find ourself within the scope of a new procedure, we need to keep track of the environment at the time that the procedure was created, and additionally keep track of a new environment for the scope of this specific procedure.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
class Procedure: | |
def __init__(self, parameters, expressions, env): | |
self.parameters = parameters | |
self.expressions = expressions | |
self.env = env # PROCEDURE MUST REMEMBER THE ENVIRONMENT AT THE TIME THAT IT WAS DEFINED | |
def __call__(self, *args): | |
# Args are the argument values | |
# Make a new scope for the local variables | |
local_env = {} | |
# Bind the parameter names to values | |
for name, value in zip(self.parameters, args): | |
local_env[name] = value | |
for expression in self.expressions: | |
result = seval(expression, {**self.env, **local_env}) | |
return result |
For nested expressions, our nested environments are recursively passed through, first to seval
(which evaluates expressions), then back into a new environment if evaluation reveals another procedure.
A quick note: You’ll notice on line 17 that we’re slapping our new local env on top of a copy of the existing environment, and for a huge environment that will be really space inefficient. We’ve talked on this blog before about a similar situation: tracking nested transactions in an implementation of an in-memory database. If this were a real interpreter and not a toy for learning purposes, I might turn to a solution like that one. Note that the nested scopes wouldn’t face that annoying deletion snafu we faced with the database: local scopes can overwrite the values of variables from broader scopes, but they don’t delete them.
Mutability Change #2: “is” and “equals”.
The question of “is” (value equality) versus “equals” (reference equality) is only a problem when you introduce mutability. If created variables cannot be changed, it doesn’t matter if objects have the same reference or not. Their values are not going to change. So if they’re initialized at the same value, they might as well be the same.
This is why Swift has an identity operator ===
that you can use for classes, but not for structs. Structs don’t change after they are created and are always uniquely referenced, so to equate them we can only implement the equality operator ==
, which does value equality for primitives and is intended to be used for value equality in compound objects, too.
Mutability Change #3: Order matters.
Results only depend on the sequence of operations when variables can be reassigned. Before that, things can happen in any order, or concurrently, and the result is the same. “Stateless” functions (kind of a misnomer, the state is passed in, it’s not completely absent) make sense in cases of concurrent or asynchronous calls for exactly this reason.
When we mix state and potential concurrency, we can have the same piece of data getting modified at the same time in two different ways. Programmers usually resolve this with various types of mutex lock that prevent changes to data that the program thinks is currently being changed somewhere else.
Can we have state without reassignment or mutability?
Sort of. Suppose we have two operations: one that counts up, and one that counts down.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
(define (countdown n) | |
(cond ((= n 0) 'countdown-done) | |
(else | |
(display "Down ") | |
(displayln n) | |
(call-soon (lambda () (countdown (– n 1))))))) | |
(define (up stop) | |
(define (iter x) | |
(cond ((> x stop) 'up-done) | |
(else | |
(display "Up ") | |
(displayln x) | |
(call-soon (lambda () (iter (+ x 1))))))) | |
(iter 0)) | |
(countdown 5) | |
; Down 5 | |
; Down 4 | |
; Down 3 | |
; Down 2 | |
; Down 1 | |
(up 5) | |
; Up 1 | |
; Up 2 | |
; Up 3 | |
; Up 4 | |
; Up 5 |
We want to run these two operations concurrently.
What if we were to implement a queue, shove operations from each of these functions onto that queue, and then pop them off and execute them one by one?
Our queue might look like this:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
; 3.22 | |
(define (make-queue) | |
(let ((front-ptr '()) | |
(rear-ptr '())) | |
(define (empty?) | |
(null? front-ptr)) | |
(define (insert! item) | |
(let ((new-pair (mcons item '()))) | |
(cond ((empty?) | |
(set! front-ptr new-pair) | |
(set! rear-ptr new-pair) | |
) | |
(else | |
(set-mcdr! rear-ptr new-pair) | |
(set! rear-ptr new-pair))))) | |
(define (front) | |
(cond ((empty?) (error "Empty queue")) | |
(else (mcar front-ptr)))) | |
(define (rear) | |
(cond ((empty?) (error "Empty queue")) | |
(else (mcar rear-ptr)))) | |
(define (delete!) | |
(cond ((empty?) (error "Empty queue")) | |
(else (set! front-ptr (mcar front-ptr))))) | |
(define (dispatch msg) | |
(cond ((eq? msg 'insert!) insert!) | |
((eq? msg 'delete!) (delete!)) | |
((eq? msg 'empty?) (empty?)) | |
((eq? msg 'front) (front)) | |
((eq? msg 'rear) (rear)) | |
(else (error "Bad message")))) | |
dispatch | |
)) | |
And then instead of executing each operation as it’s created, we put it on a queue. We define a run
method to pop things off the queue and execute them in order:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
(define ready-to-call (make-queue)) | |
(define (call-soon proc) | |
((ready-to-call 'insert!) proc)) | |
(define (run) | |
(cond ((ready-to-call 'empty?) 'done) | |
(else | |
(let ((proc (ready-to-call 'front))) | |
(proc) ; execute the 0-argument lambda | |
(ready-to-call 'delete!))))) | |
(define (countdown n) | |
(cond ((= n 0) 'countdown-done) | |
(else | |
(display "Down ") | |
(displayln n) | |
(call-soon (lambda () (countdown (– n 1))))))) | |
(define (up stop) | |
(define (iter x) | |
(cond ((> x stop) 'up-done) | |
(else | |
(display "Up ") | |
(displayln x) | |
(call-soon (lambda () (iter (+ x 1))))))) | |
(iter 0)) | |
(call-soon (lambda () (countdown 3))) | |
(call-soon (lambda () (up 3))) | |
(run) | |
; Down 3 | |
; Up 1 | |
; Down 2 | |
; Up 2 | |
; Down 1 | |
; Up 3 |
What have we done here? We have delayed our procedures’ execution. Scheme has keywords delay
and force
to do this explicitly in programs as well.
We can take this concept even further with something like streams. Streams in scheme function much like a list, with an important difference: the car
(first element) of the stream is evaluated, but the cdr
(rest of it) is not.
From Dave:
“Stream processing lets us model systems that have state without ever using assignment or mutable data.”
Because all but the first element is not evaluated, we can even create infinite streams. These streams reference themselves in their cdr
s, and that’s perfectly allowed.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
(define ones (stream-cons 1 ones)) ;<—has not evalutated ones yet, as ones is a stream | |
(stream-ref ones 1) ;–>1 | |
(stream-ref ones 10) ;–>1 | |
(stream-ref ones 37) ;–>1 (an infinite stream of ones) | |
(define integers (stream-cons 1 (add-streams ones integers))); <–counts up from 1 | |
(define fibonacci (stream-cons 0 (stream-cons 1 (add-streams (stream-rest fibonacci) fibonacci)))) ; <–Displays fibonacci numbers |
If you try to do this with an ordinary pair, the Scheme interpreter will not let you, insisting that you have referenced a value before assigning it.
In the next post, we’ll talk more about delayed evaluation—specifically, about the implementation details of building an interpreter for a Scheme like-language that defaults to lazy evaluation instead of eager evaluation.
If you liked this piece, you might also like:
The Crafting Interpreters Series (which includes some examples of techniques from the leveling up series)
The Techtivism Series (when I finish the listening series and get around to publishing the techtivism series)