Swift Question: Dictionaries with Keys of Arbitrary Type

Reading Time: 5 minutes

I teach a Mobile Software Development class, and my students ask excellent questions. Sometimes I cannot answer a question thoroughly, concisely, and accurately on the spot. So I defer those questions and answer them after class in follow-ups added to the class resources. I’ll also post some of those questions and answers here, because they’re that good.

Question: We learned about dictionaries: a data structure similar to arrays, except that an array’s indices are incrementing integers, while a dictionary’s indices are arbitrary.

The examples, though, all showed dictionaries whose keys were all of the same type (usually strings).

Can you have keys of different types in Swift?

So, you can. This is what that might look like:

var dict = [AnyHashable: Any]()

dict = [
    "Hello" : 1,
    1: "one",
    [ 1, 2, 3 ]: "Some Array"
]

dict[1]             // "one"
dict["Hello"]   // 1
dict[[1,2,3]]    // "Some Array"

Before explaining what’s happening here, a disclaimer: I have not run into cases where it would have been wise to do this. There are two reasons: a legibility reason and a computational reason.

Legibility reason for uniform keys: We use a dictionary so we can have keys that are meaningful to us. If I have an Array that represents complex data like the attributes of a bird, maybe there’s a value at bird[0], but its index doesn’t tell me what it means. If I have a dictionary representing that bird, bird["name"] tells me what the value means.

If my bird dictionary were like:

let bird = [
 "name" : "Oriole",
 1: "Eastern Montana and Southern Ontario",
 ["Orange", "Black"] : "feather colors"
]

There’s some info there about orioles, but it would be frustrating for a developer to figure out what it means and how to access it.

So although it is theoretically true that a hash can have anything (well, almost anything—we’ll get to that in a second) as its keys, in practice the keys are usually uniformly strings, or in some cases, uniformly instances of a more complex data type. You don’t usually have one dictionary with keys of more than one type, though.

Computational reason for uniform keys: Notice that our dictionary above has the following type: [AnyHashable:Any]

What’s that about?

First, the value can be anything: that includes a class instance, a struct instance, a function, or a closure (anonymous function).

The key, though, must conform to the Hashable protocol.

The What?

“Protocols” in Swift are similar to interfaces in Java, but more flexible. A protocol can define a set of methods to which any implementor must conform.

The type AnyHashable covers any object in Swift, from either the standard library or your own code, that implements the Hashable protocol, which requires that the object implement a hashing function that represents the object as a unique integer. That unique integer is the key that, under the hood, Swift is using to access the value. Does that sound exactly like an array? Bingo. The Dictionary is implemented with an internal array. It just gets its integer indices from functions applied to the key values we chose, rather than just incrementing the integers from zero.

Does a hash function just pick random numbers? No. The same value passed into a hash function should always produce the same integer. The integers, though, should be evenly distributed across the array. The value associated with the key is stored in a bucket (array) at the index of its hashed key. As long as the hashing function produces a unique integer for each of your keys, there will only ever be one value in an index’s bucket. The dictionary will return that value to you.

When two different keys hash to the same integer, you get a hash collision. If hashing functions and hash collisions sound super interesting to you, here’s a longer illustrated explanation.

hashing
Image from this explanation.

Different languages handle hash collisions for dictionary keys in different ways, but many of them deposit both values in the same bucket at the underlying array index. Now, to get something out of this dictionary, there are two steps instead of one: first, get the bucket at the hashed key’s index. Then, figure out which of the things in that bucket is the thing that the user wanted.

This is actually a difficult problem, and we raise the risk of it when we combine different types of keys that implement different hash functions in the same dictionary. It’s easier to coordinate one hashing function to produce unique integers than to hope that several different ones, which might each have their own method of producing the hash, to produce integers that are unique to their objects and unique also from each other’s objects.*

*Given the number of integers available and the size of most dictionaries we use in practical programming, you personally will probably not experience a hash collision even with a dictionary of disparate key types, particularly if all your keys are Swift types. You’d have to work pretty hard in most cases—implement your own hashable types with deliberately bad hashing functions, and then use those as keys—to get a large likelihood a hash collision. So the computational reason here is pretty far behind the legibility reason in terms of priority.

But the legibility reason alone suffices in most cases to keep the types on a dictionary’s keys uniform.

If you liked this Q&A, you might also like:

The three-part series on teaching a programming course

The behind the scenes series

The leveling-up series on advancing your own skills (which will soon be an audiobook read by yours truly. Preorder here).

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.