'What is the advantage of using a immutable data structure in a map in Clojure?

I am in the second chapter of the Programming Clojure book, and I came across this paragraph -

Because Clojure data structures are immutable and implement hashCode correctly, any Clojure data structure can be a key in a map.

I cannot understand how the feature mentioned in the above quote would be advantageous. I would appreciate it if someone could help me understand this using an example or point me to the right resources.



Solution 1:[1]

This can be useful when forming a data structure that has composite keys i.e. keys that consist of more than one piece of information.

As a simple example, say we have a graph with vertices :a :b and :c and we wish to have a data structure which enables lookup of the cost metric associated with any edge. We can use a Clojure map where each key is a set:

(def cost {#{:a :b} 5
           #{:b :c} 6
           #{:c :a} 2})

We can now look up the cost associated with any edge:

(get cost #{:c :b})  ; => 6

Solution 2:[2]

In order to answer your question there are two things we need to discuss:

  1. Hash Tables
  2. Value vs Reference

In a hash table (and I'm grossly oversimplifying here) you take a "key" and you run it through a hashing function that converts that key in a unique* identifier that is then associated with an address in memory which holds a particular "value". This is the underlying abstraction for higher-level associative data structures like Clojure maps or Python dictionaries: you give me the key, I hash it, look up the address, give you back the thing.

In order for this scheme to work, the same key has to always hash to the same value for some definition of "same", otherwise you couldn't get the thing back out of the data structure.

In Java (and many other languages) the definition of "same" boils down to reference:

public class Foo {
  int x = 5;

  public static void main(String[] args) {
    Foo myObj = new Foo();
    Foo myOtherObj = new Foo();
    myObj == myOtherObj; // false
    myObj.x = 6;
    myObj == myObj; // true
  }
}

Even though myObj and myOtherObj both hold the value 5 at the point of comparison, they are not equal according to the rules of Java, because they are different references. That last comparison, which looks non-sensical if you've never worked in a different model, highlights the problem: when we create myObj it has a value of 5, but at one point in time it has a value of 6. Is it still the same? Again, Java says yes.

And when we get to hashing something that is a potential key for a hash table that distinction matters: how do we maintain a consistent thing to feed the hash function? Is it the value (x) the container holds or the container (Foo instance)?

Python takes an interesting approach here:

ls = [1, 2] # list
tp = (1, 2) # tuple
st = set()  # empty set

st.add(tp) # fine
st.add((1, 2)) # still only has 1 element, (1, 2) == (1, 2)
st.add(ls) # Error: unhashable type list!

In Python, you can't use mutable objects as set members or dictionary keys, because they are saying "the meaning of this thing changes" so it is unsuitable for a hashed key. But you can use any immutable type as a hash key (even an immutable container). Note that (1, 2) == (1, 2), unlike in Java where two containers holding the same values are still compared on reference. Python compares mutable types by reference, but immutable types by value.

But in Clojure, everything** is immutable. Everything can be used as a hash key, because everything has a consistent value through time that can be fed to the hash function:

(def x [1 2])

(def y { x 5 })

(get y x)     ; 5
(get y [1 2]) ; 5

When we lookup the vector bound to x in the map bound to y we get 5, since vectors are immutable we don't have to worry about identity. We don't have to pass around a reference like we do in Java, we can just create the value and use it as a lookup key.

* They're not entirely unique, per the pigeonhole principle unless the hashed output is at least as large as the input you will have collisions where two keys hash to the same values. But for our purposes here, they're unique.

** Not quite everything, but close enough.

Solution 3:[3]

Input

(def my-cat "meaw")
(def my-dog "baw")
(def my-pets {my-cat "Luna"
              my-dog "Lucky"})

Output

(get my-pets my-cat) ;=> "Luna"
(:key my-pets my-cat) ;=> "meaw"
(get my-pets "baw") ;=> "Lucky"
(get my-pets "meaw") ;=> "Luna"

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3 stindrago