Last week, someone asked this question on Twitter and I thought I’d answer:
I use this as one of my examples in my “WTF Python” introductory lecture…the first lecture I do in my Python Programming class (more on that lecture here).
I’ll start with what’s happening, move on to how to avoid it, then finish with when to use it.
Why does this happen?
Python evaluates default arguments once, when it executes the def statement and places the function into the environment. Not each time it runs that function. So every call that falls back on a default gets the same object, with the same id it had at function definition time.
No problem when your defaults are immutable. But a list is mutable. That means, when you change the list set in this function as the default argument, it’s gonna keep using that changed list in future calls that don’t set that argument.
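Here's a minimal sketch of that in action. The names `make_default` and `collect` are made up for illustration; the only real machinery is `__defaults__`, which is where a function object stores its evaluated defaults.

```python
evaluations = []


def make_default():
    # Records every time the default expression is actually evaluated.
    evaluations.append("evaluated")
    return []


# The default expression runs right here, when the def statement executes.
def collect(item, bucket=make_default()):
    bucket.append(item)
    return bucket


# Snapshot taken before collect() has ever been called.
evaluated_before_any_call = list(evaluations)

first = collect("a")
second = collect("b")

print(evaluated_before_any_call)         # the default was already built
print(first is second)                   # both calls handed back one object
print(collect.__defaults__[0] is first)  # and it lives on the function itself
```

The default expression ran exactly once, before any call, and both calls mutated that same list.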
If I understand correctly, this was an oversight in the original implementation, but it has been kept because:
- There’s a way around it, and Python strives to have “one right way”
- It’s well known, so people were already using the aforementioned workaround
- Also, people were relying on this behavior in their programs, so changing it would break their code.
Zooming out: The Python maintainers have a very different risk calculation to make about changing behavior than, I dunno, you or I do on a pet project, because their project is in active use by 8.2 million people whose code they need to not break.
So how do I get around this behavior?
    def my_function(default_list=None):
        if default_list is None:
            default_list = []
        ...
This explicitly sets that default to a fresh, new list at function run time. Another shorthand you’ll see for the above is “if not default_list”. I don’t use that for three reasons:
1. The above way is more explicit about what is happening because it doesn’t bank on the reader remembering that None is falsey.
2. [ ] is also falsey, so “if not default_list” will reassign that name to a new empty list if someone passes in an empty list as the argument. Usually that’s just pointless, but it silently breaks any caller who expects the function to mutate the list they passed in.
3. “” and 0 are ALSO falsey, so this check won’t raise in some cases where someone is passing in the wrong kind of argument.
I like my code to tell people when they’re doing things wrong before they have to go investigate and then change half their program over it.
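To make the difference between the two checks concrete, here's a small sketch. Both function names are hypothetical; they differ only in the guard.

```python
def add_with_is_none(item, target=None):
    if target is None:  # only replaces a missing argument
        target = []
    target.append(item)
    return target


def add_with_not(item, target=None):
    if not target:  # also replaces [], "", 0, and anything else falsey
        target = []
    target.append(item)
    return target


# An empty list passed to the explicit version gets mutated, as the caller expects.
mine = []
add_with_is_none("x", mine)

# The falsey check swaps in a new list, so the caller's list never changes.
theirs = []
add_with_not("x", theirs)

# A wrong-typed argument slides through the falsey check without complaint...
silently_accepted = add_with_not("x", 0)

# ...while the explicit check lets the mistake surface right away.
try:
    add_with_is_none("x", 0)
    raised = False
except AttributeError:
    raised = True
```

After this runs, `mine` holds the item, `theirs` is still empty, and only the `is None` version complained about the `0`.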
So we talked about why this happens and how to avoid it.
When does it make sense to use this behavior?
Honestly I don’t use this in prod code. I set an instance variable if I want an object to hang onto state.
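For comparison, a rough sketch of the instance variable version. `MessageCollector` is a made-up class, not anything from a real codebase.

```python
class MessageCollector:
    def __init__(self):
        # Created at instantiation, so each object gets its own list.
        self.messages = []

    def record(self, msg):
        self.messages.append(msg)
        return self.messages


first_collector = MessageCollector()
second_collector = MessageCollector()

first_collector.record("Hi!")
first_collector.record("Hello!")
second_collector.record("Only one")
```

The state is visible on the object and nothing leaks between `first_collector` and `second_collector`.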
I do use it occasionally in mocks. Here’s how:
- The object I am mocking mutates some state that I cannot access through an existing instance variable.
- I don’t want to add an instance var to an object JUST to test it, but I want to confirm something happened in this function…
- Across multiple calls to the function.
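    class Logger(Loggable):
        def log(self, msg):
            ...  # prod behavior

    class MockLogger(Loggable):
        # The mutable default is deliberate: it collects messages across calls.
        def log(self, msg, messages=[]):
            messages.append(msg)
            return messages

    mock_logger = MockLogger()
    mock_logger.log("Hi!")
    msgs = mock_logger.log("Hello!")
    # msgs is now ["Hi!", "Hello!"]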
This can be useful later when I have a test like:
    def test_dispatch_object__logs_to_snowplow():
        dispatch.function_that_calls_log_twice()
        assert dispatch.messages[0] == "First message I need sent"
        assert dispatch.messages[1] == "Second message I need sent"
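One catch with a mock like this: the default list belongs to the function, so it outlives any single test and is shared by every instance. Here's a sketch of one way to reset it between tests, assuming a standalone `MockLogger` (no base class, to keep the example self-contained) and a hypothetical helper named `reset_mock_logger`.

```python
class MockLogger:
    def log(self, msg, messages=[]):
        messages.append(msg)
        return messages


def reset_mock_logger():
    # __defaults__ holds the single shared list; clear it in place
    # rather than rebinding, so the function keeps using the same object.
    MockLogger.log.__defaults__[0].clear()


# State leaks across instances, because the list lives on the function.
MockLogger().log("from test one")
leaked = list(MockLogger().log("also from test one"))

# After a reset, the next test starts from an empty list.
reset_mock_logger()
after_reset = MockLogger().log("from test two")
```

Calling something like `reset_mock_logger()` in test setup keeps one test's messages from showing up in the next test's assertions.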
In general, I’d say it’s a little sneaky, and if we were all starting over with Python, maybe we’d make this not work this way.
But as it is, that’s why it happens, how to get around it, and a use case.