CS 111 f21 — Dictionaries

1 Review

Write the definition for a Date class. It should have instance variables for the year, the month, and the day. Implement the following methods (a sample solution is in footnote 1):

  • __init__(self, year, month, day): construct a Date object with the given year, month, and day
  • __repr__(self): return a string representation of the date
  • __eq__(self, other): return True if other is the same date as self
  • __gt__(self, other): return True if self is a later date than other

2 Dictionaries

2.1 Motivation

  • what if I was running for president?
    • I would need to create a volunteer database that matches names with emails
    • too many people to have a separate variable for each
  • today we're going to talk about how to represent this kind of data efficiently

2.2 Operations

  • each entry in our database is a pair: a key (name) and a value (email address)
  • contains: key in database (is this volunteer signed up?)
  • write: database[key] = value (update email on record)
  • read: print(database[key]) (display email on record)

2.3 What About Using a List?

  • can we provide these operations using a list? Work with your neighbor to sketch out how you would do this
    • maintain a list of tuples (key, value)
      • append new tuples as new keys are added, replace when overwritten
    • each operation involves a search through the list
      • i.e., we have to look up the index of the key every time
    • problem: each operation takes a number of steps proportional to the size of the database
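
The list-of-tuples approach above can be sketched roughly as follows. The function names (list_contains, list_write, list_read) and the second and third email addresses are made up for illustration, and returning None for a missing key is an assumption, not something the notes specify.

```python
# A list of (key, value) tuples standing in for the volunteer database.
# All function names here are hypothetical, chosen for this sketch.

def list_contains(database, key):
    # search the whole list for a tuple with a matching key
    for k, _ in database:
        if k == key:
            return True
    return False

def list_write(database, key, value):
    # replace the tuple if the key already exists, otherwise append a new one
    for i in range(len(database)):
        if database[i][0] == key:
            database[i] = (key, value)
            return
    database.append((key, value))

def list_read(database, key):
    # search for the key and return its value
    for k, v in database:
        if k == key:
            return v
    return None  # assumption: None signals a missing key

database = []
list_write(database, "Bilbo", "bb@bagend.co.uk")
list_write(database, "Frodo", "frodo@bagend.co.uk")
list_write(database, "Bilbo", "bilbo@rivendell.org")  # overwrites the first entry
print(list_contains(database, "Bilbo"))  # prints True
print(list_read(database, "Bilbo"))      # prints bilbo@rivendell.org
```

Every function contains a loop over the list, which is where the cost proportional to the size of the database comes from.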

2.4 Dictionary to the Rescue

  • dict in Python, also called a hash table or hash map
    • a big idea in Computer Science
  • still have a list underneath; the key idea is that we also have something called a hash function
    • function from possible keys to indices
  • write: hash key, write value to corresponding index
  • read: hash key, read value at corresponding index
  • contains: hash key, check if value present at corresponding index
  • now each of these operations takes roughly the same number of steps no matter how many entries are in our database
  • syntax: use the key just like you would an index for a list
database = {}  # initializes an empty dictionary
# creates the entry in the dictionary for the key "Bilbo" with the value on the right side of the =
database["Bilbo"] = "bb@bagend.co.uk"  
if "Bilbo" in database:  # use in to check if a key exists in the dictionary, returns True or False
    print("Bilbo's email:", database["Bilbo"])
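
As a small follow-up sketch, the same syntax covers the "update email on record" operation from section 2.2: assigning to a key that already exists overwrites its value instead of adding a second entry. The second email address is made up for illustration.

```python
database = {}
database["Bilbo"] = "bb@bagend.co.uk"
database["Bilbo"] = "bilbo@rivendell.org"  # same key, so the old value is overwritten

print(database["Bilbo"])    # prints bilbo@rivendell.org
print("Frodo" in database)  # prints False, since no entry has this key
print(len(database))        # prints 1, since there is still only one entry
```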

2.5 Hash function

  • this magic hash function seems to be doing all the work
    • it comes from a class's __hash__ method; call hash(value) to get the hash of that value
  • intricacies are outside the scope of 111, but here's the basic idea
    • inside, dictionaries have a list of fixed length
    • compute hash(key), then take it modulo the length of the list (this turns it into a valid list index)
  • leaving out: what if two keys hash to the same index? what if the internal list gets full?
    • take CS 201!
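# string hashes are randomized each time Python starts, so the string values below are only examples
hash(4)         # returns 4
hash("4")       # returns e.g. -1282485687280376630
hash(4.1)       # returns 230584300921368580 (on a 64-bit build)
hash("Python")  # returns e.g. -6798175930716008987
hash("python")  # returns e.g. -1824268576744008373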
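
To make the basic idea concrete, here is a toy sketch of a hash table built on a fixed-length list. The class name ToyDict and its method names are hypothetical, and this is not how Python's dict is actually implemented: the sketch deliberately ignores the two questions left out above, so a colliding key overwrites the old entry and the list never grows.

```python
# Toy hash table: a sketch of the basic idea only.
# It does not handle collisions and never resizes (those are CS 201 topics).
class ToyDict:
    def __init__(self, length=8):
        # internal list of fixed length; None marks an empty slot
        self.slots = [None] * length

    def _index(self, key):
        # hash the key, then take it modulo the list length to get a valid index
        return hash(key) % len(self.slots)

    def write(self, key, value):
        # store the pair at the key's index, overwriting whatever was there
        self.slots[self._index(key)] = (key, value)

    def contains(self, key):
        entry = self.slots[self._index(key)]
        return entry is not None and entry[0] == key

    def read(self, key):
        entry = self.slots[self._index(key)]
        if entry is None or entry[0] != key:
            return None  # assumption: None signals a missing key
        return entry[1]

db = ToyDict()
db.write("Bilbo", "bb@bagend.co.uk")
print(db.contains("Bilbo"))  # prints True
print(db.read("Bilbo"))      # prints bb@bagend.co.uk

# hash(4) is 4 and hash(12) is 12, and both are 4 modulo 8: a collision
db2 = ToyDict()
db2.write(4, "four")
db2.write(12, "twelve")
print(db2.contains(4))       # prints False, the entry for 4 was overwritten
```

None of the methods loops over the list, which is why the number of steps does not depend on how many entries are stored. The last three lines show what goes wrong when two keys hash to the same index.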

Footnotes:

1
# Object practice exercise defining a class to represent a date
# Aaron Bauer, CS 111, Fall 2021
class Date:
    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day

    def __repr__(self):
        return str(self.year) + "/" + str(self.month) + "/" + str(self.day)

    def __eq__(self, other):
        # two Dates are equal if all their instance variables are equal
        return self.year == other.year and self.month == other.month and self.day == other.day

    def __gt__(self, other):
        if self.year > other.year:
            return True
        if self.year == other.year:
            if self.month > other.month:
                return True
            if self.month == other.month:
                return self.day > other.day
        return False

    def __ge__(self, other):
        # now that we have equals and greater than implemented
        # we can easily do greater than or equal to
        return self > other or self == other

d1 = Date(2021, 10, 31)
d2 = Date(2021, 11, 5)
d3 = Date(1988, 12, 1)
d4 = Date(2030, 1, 1)

print(d1)  # should print 2021/10/31
print(d1 == Date(2021, 10, 31))  # should print True
print(d1 > d2)  # should print False
print(d1 < d3)  # should print False
print(d4 >= d2)  # should print True