CS 111 f21 — Dictionaries
1 Review
Write the definition for a Date class. It should have instance variables for the year, the month, and the day. Implement the following methods (see footnote 1 for a sample solution):
- __init__(self, year, month, day): construct a Date object with the given year, month, and day
- __repr__(self): return a string representation of the date
- __eq__(self, other): return True if other is the same date as self
- __gt__(self, other): return True if self is a later date than other
2 Dictionaries
2.1 Motivation
- what if I was running for president?
- I would need to create a volunteer database that matches names with emails
- too many people to have a separate variable for each
- today we're going to talk about how to represent this kind of data efficiently
2.2 Operations
- each entry in our database is a pair: a key (name) and a value (email address)
- contains: key in database (is this volunteer signed up?)
- write: database[key] = value (update email on record)
- read: print(database[key]) (display email on record)
2.3 What About Using a List?
- can we provide these operations using a list? Work with your neighbor to sketch out how you would do this
- maintain a list of tuples (key, value)
- append new tuples as new keys are added, replace when overwritten
- each operation involves a search through the list
- i.e., we have to look up the index of the key every time
- problem: each operation takes a number of steps proportional to the size of the database
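The list approach sketched above could look something like the following. This is only one possible version: the function names (find_index, contains, write, read) and the extra email addresses are made up for illustration.

```python
# List-of-tuples database: every operation searches the list from the start.

def find_index(database, key):
    # return the position of key in the list, or -1 if it is absent
    for i in range(len(database)):
        if database[i][0] == key:
            return i
    return -1

def contains(database, key):
    return find_index(database, key) != -1

def write(database, key, value):
    i = find_index(database, key)
    if i == -1:
        database.append((key, value))   # new key: append a new tuple
    else:
        database[i] = (key, value)      # existing key: replace the tuple

def read(database, key):
    i = find_index(database, key)
    if i == -1:
        raise KeyError(key)
    return database[i][1]

volunteers = []
write(volunteers, "Bilbo", "bb@bagend.co.uk")
write(volunteers, "Frodo", "frodo@example.com")   # made-up address
write(volunteers, "Bilbo", "bilbo@example.com")   # overwrites the first entry
print(contains(volunteers, "Frodo"))   # True
print(read(volunteers, "Bilbo"))       # bilbo@example.com
print(len(volunteers))                 # 2
```

Notice that contains, write, and read all call find_index, which may have to walk the whole list. That loop is where the cost proportional to the size of the database comes from.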
2.4 Dictionary to the Rescue
- dict in Python, also called a hash table or hash map
- a big idea in Computer Science
- still have a list, key idea is we have something called a hash function
- function from possible keys to indices
- write: hash key, write value to corresponding index
- read: hash key, read value at corresponding index
- contains: hash key, check if value present at corresponding index
- now all these operations take roughly the same number of steps no matter how many things are in our database
- syntax: use the key just like you would an index for a list
    database = {}  # initializes an empty dictionary
    # creates the entry in the dictionary for the key "Bilbo" with the value on the right side of the =
    database["Bilbo"] = "bb@bagend.co.uk"
    if "Bilbo" in database:  # use in to check if a key exists in the dictionary, returns True or False
        print("Bilbo's email:", database["Bilbo"])
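As a slightly longer sketch, here are the three operations from section 2.2 on a real dict, including what happens when a key is overwritten or missing. The extra names and addresses are made up for illustration.

```python
database = {}
database["Bilbo"] = "bb@bagend.co.uk"
database["Frodo"] = "frodo@example.com"   # made-up address
database["Bilbo"] = "bilbo@example.com"   # writing to an existing key replaces the old value

print("Frodo" in database)     # True
print("Gandalf" in database)   # False
print(database["Bilbo"])       # bilbo@example.com
print(len(database))           # 2

# reading a key that is not in the dictionary raises a KeyError,
# so check with in first
if "Gandalf" in database:
    print(database["Gandalf"])
else:
    print("Gandalf has not signed up")
```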
2.5 Hash function
- this magic hash function seems to be doing all the work
- a class's __hash__ method; use hash(value) to get the hash of that value
- intricacies are outside the scope of 111, but here's the basic idea
- inside, dictionaries have a list of fixed length
- compute hash(key), then take it modulo the length of the list (turns it into a valid list index)
- leaving out: what if two keys hash to the same index? what if the internal list gets full?
- take CS 201!
    hash(4)         # returns 4
    hash("4")       # returns -1282485687280376630
    hash(4.1)       # returns 230584300921368580
    hash("Python")  # returns -6798175930716008987
    hash("python")  # returns -1824268576744008373
    # note: string hashes change from one run of Python to the next, so yours will differ
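To make the basic idea concrete, here is a toy sketch of a table built on a fixed-length list. It is not how Python's dict is actually implemented. ToyTable and toy_hash are hypothetical names, and toy_hash is a deliberately simple stand-in for hash() so that the indices come out the same on every run. The sketch simply gives up when two keys land on the same index, since handling that case is left for CS 201.

```python
# Toy hash table: a sketch of the idea only, not Python's real dict.

def toy_hash(key):
    # Hypothetical stand-in for hash(): add up the character codes of a string.
    # Unlike hash() on strings, this gives the same answer on every run.
    total = 0
    for ch in key:
        total = total + ord(ch)
    return total

class ToyTable:
    def __init__(self, length):
        # the internal list has a fixed length; None marks an empty slot
        self.slots = [None] * length

    def index_for(self, key):
        # modulo turns the hash into a valid index for self.slots
        return toy_hash(key) % len(self.slots)

    def write(self, key, value):
        i = self.index_for(key)
        if self.slots[i] is not None and self.slots[i][0] != key:
            # two different keys landed on the same index;
            # dealing with this is the part these notes leave out
            raise ValueError("collision at index " + str(i))
        self.slots[i] = (key, value)

    def contains(self, key):
        entry = self.slots[self.index_for(key)]
        return entry is not None and entry[0] == key

    def read(self, key):
        if not self.contains(key):
            raise KeyError(key)
        return self.slots[self.index_for(key)][1]

table = ToyTable(8)
table.write("Bilbo", "bb@bagend.co.uk")
table.write("Frodo", "frodo@example.com")   # made-up address
print(table.index_for("Bilbo"))    # 0, because toy_hash("Bilbo") is 488 and 488 % 8 is 0
print(table.contains("Frodo"))     # True
print(table.contains("Gandalf"))   # False
print(table.read("Bilbo"))         # bb@bagend.co.uk
```

None of write, contains, or read loops over the slots. Each one computes a single index and goes straight there, which is why the number of steps does not grow with the number of entries.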
Footnotes:
1
    # Object practice exercise defining a class to represent a date
    # Aaron Bauer, CS 111, Fall 2021

    class Date:
        def __init__(self, year, month, day):
            self.year = year
            self.month = month
            self.day = day

        def __repr__(self):
            return str(self.year) + "/" + str(self.month) + "/" + str(self.day)

        def __eq__(self, other):
            # two Dates are equal if all their instance variables are equal
            return self.year == other.year and self.month == other.month and self.day == other.day

        def __gt__(self, other):
            if self.year > other.year:
                return True
            if self.year == other.year:
                if self.month > other.month:
                    return True
                if self.month == other.month:
                    return self.day > other.day
            return False

        def __ge__(self, other):
            # now that we have equals and greater than implemented
            # we can easily do greater than or equal to
            return self > other or self == other

    d1 = Date(2021, 10, 31)
    d2 = Date(2021, 11, 5)
    d3 = Date(1988, 12, 1)
    d4 = Date(2030, 1, 1)
    print(d1)  # should print 2021/10/31
    print(d1 == Date(2021, 10, 31))  # should print True
    print(d1 > d2)  # should print False
    print(d1 < d3)  # should print False
    print(d4 >= d2)  # should print True