from IPython.core.display import HTML

HTML(open("custom.html", "r").read())
Creative Commons License This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Copyright (C) 2014-2023 Scientific IT Services of ETH Zurich,
Contributing Authors: Uwe Schmitt, Mikolaj Rybniski

7 Dictionaries¶

Dictionaries, aka "hash tables" or "look up tables" allow presentation of two column tables. The map a key to a value.

For example:

first name
(key)
family name
(value)
monty python
curt cobain

To implement this table in Python we write

first_to_family_name = {
    "monty": "python",
    "curt": "cobain",
}
print(first_to_family_name)
{'monty': 'python', 'curt': 'cobain'}

To lookup up a value use brackets:

print(first_to_family_name["monty"])
python

You can insert new values or overwrite existing values like this:

first_to_family_name["uwe"] = "schmitt"
print(first_to_family_name)
{'monty': 'python', 'curt': 'cobain', 'uwe': 'schmitt'}

Size of a dictionary:

print(len(first_to_family_name))
3

Left column of table are "keys":

print(first_to_family_name.keys())
dict_keys(['monty', 'curt', 'uwe'])

Right column are "values":

print(first_to_family_name.values())
dict_values(['python', 'cobain', 'schmitt'])

Comment: .keys() and .values() return iterators which look like a list and partially behave like a list. So you can use for to iterate over keys and values, but you can not append to them.

The empty dictionary is {}:

d = {}
print(d)
print(len(d))
{}
0

Restrictions¶

  • values of dictionaries may have arbitrary type, even dictionaries.
  • keys must be immutable (so: no lists, but tuples, int, float, str, bool, ...)

Lookup of non existing keys:

print(first_to_family_name["jesus"])
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[10], line 1
----> 1 print(first_to_family_name["jesus"])

KeyError: 'jesus'

in order to test if a value appears as an key of a dictionary use in: only checks keys, not values !

print("jesus" in first_to_family_name.keys())
# equivalently: "jesus" in first_to_family_name
False

or use error-safe get() method of the dictionary which will return None if the key does not exist

print(first_to_family_name.get("monty"))
print(first_to_family_name.get("jesus"))
python
None

Dictionaries may have different types for keys and values:

what_a_mess = {3: 9, 5: {25: 125}, 1.2: (1, 2), "four": [4]}
print(what_a_mess)
{3: 9, 5: {25: 125}, 1.2: (1, 2), 'four': [4]}
print(what_a_mess[5])
print(what_a_mess[5][25])
{25: 125}
125

A few comments on dictionaries:

  • writing and reading from a dictionary is extremely fast, also for huge dictionaries.
  • dictionaries are very helpful in many places
  • dictionaries are heavily used within the Python interpreter.

Computing a histogram using dictionaries¶

This is an example usage how to use a dictionary to create a histogram of numbers when the actual numbers are not known from the beginning.

The histogram for the numbers [4, 4, 2, 4, 3] is

number count
2 1
3 1
4 3

And here is a function which will compute this histogram as a dictionary:

def compute_histogram(elements):
    histogram = {}
    for element in elements:
        if element not in histogram.keys():
            histogram[element] = 1
        else:
            # fetch count for element from dict, increase by one and write back:
            histogram[element] = histogram[element] + 1
            # or shorter:
            # histogram[element] += 1
    return histogram


print(compute_histogram([4, 4, 2, 4, 3]))
{4: 3, 2: 1, 3: 1}
How does this work?¶
  1. In the first iteration, element == 4, histogram is empty and the if check is True. So we write the key-value pair (4, 1) into the dictionary. So after the first iteration we have the following histogram: | key | value | |--------|-------| | 4 | 1 |

  2. In the second iteration, element == 4, the if check is False. In this case we increase the value corresponding to the key 1 by 1. After the second iteration we have the following histogram: | key | value | |--------|-------| | 4 | 2 |

  3. In the third iteration, element == 2, the if check is True and we write a new row into the "table" as we did in the first iteration. So after the third iteration we have the following histogram: | key | value | |--------|-------| | 4 | 2 | | 2 | 1 |

  4. And so on...

Sets*¶

The basic function of a set is to answer "is a given value contained in a given set ?".

This allows answering questions like

  • what is the intersection of two sets ?
  • what is the union of two sets ?
  • what is the set difference between two sets ?
  • etc

Properties:

  • Elements in a set have no order
  • one element can only occur once
  • membership lookup is very fast

sets: http://www.learnpython.org/en/Sets

# set construction, two notations for the same operation:
a = {1, 2, 3, 4}
b = set((2, 4, 6))
print(a, b)
{1, 2, 3, 4} {2, 4, 6}
# set intersection, two notations for the same operation:

print(a & b)
print(a.intersection(b))
{2, 4}
{2, 4}
# set union, two notations for the same operation:

print(a | b)
print(a.union(b))
{1, 2, 3, 4, 6}
{1, 2, 3, 4, 6}
# set difference, two notations for the same operation:

print(a - b)
print(a.difference(b))
{1, 3}
{1, 3}

Sets can be used to effiently compute unique elements in a collection, e.g.:

def only_unique_elements(li):
    return len(li) == len(set(li))


print(only_unique_elements([1, 2, 3, 4]))
print(only_unique_elements([1, 2, 3, 4, 1]))
True
False
def count_unique_elements(li):
    return len(set(li))


print(count_unique_elements([1, 2, 3, 4]))
print(count_unique_elements([1, 2, 3, 4, 1]))
4
4

Exercises block 7¶

  • Reread the examples above carefully.

Check question¶

Try to answer the determine the final value of x using pen and paper, you can then use Python to check your result.

    a = {1: {2: [3, 4]}, 2: [5], "x": "abc", 7.1 : (1, 2, 3)}
    x = 0
    for k in a.keys():
        x += len(a[k])


  • Type, run and try to understand the compute_histogram example.

Programming exercise¶

  1. Create a dictionary which maps numbers 1 to 10 to their squares: start with an empty dictionary and use range to loop over numbers 1 to 100 (inclusive).

Homework¶

  1. Write a function which takes dictionary with integer keys and removes the entries where the key is an even number. Please recall the discussion about transforming lists at the end of chapter 6.

Optional exercises*¶

  1. Try to figure out what the following code does. It might help to add print statements, to introduce temporary variables and to use Pythons help system:
def do_something(z):
    cc = {}
    for x in z.split():
        cc[x] = cc.get(x, 0) + 1
    return cc


print(do_something("love love me doo"))
{'love': 2, 'me': 1, 'doo': 1}
def something_other(z):
    c = []
    for w in z.split(","):
        if len(w) % 2 == 0:
            c.append(w)
    c.sort()
    return "-".join(c)


print(something_other("hi,what is,this,good,for"))
good-hi-this
  1. Write a function which inverts a dictionary. So it maps values to keys of a given dictionary. This works only if the values are unique.
  1. Now assume the values are not unique and write a function which creates a dictionary mapping values to a list of keys from the given dictionary. Eg for input {1: 2, 2: 2, 3: 4} the function returns {2: [1, 2], 4: [3]}.