CS 111 w20 lecture 20 outline

1 Review

Write a class definition for SortedList. It should be a subclass of the built-in Python list. Its constructor starts with def __init__(self, seq=tuple()):. Implement the constructor and the append method such that the list is always in sorted order.
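
One possible solution sketch, not the only valid answer. It assumes the standard-library bisect module is fair game (the exercise does not mention it); a hand-written loop that finds the insertion point would work just as well.

```python
import bisect


class SortedList(list):
    """A list that keeps its items in ascending order."""

    def __init__(self, seq=tuple()):
        # Sort the initial items once, then hand them to the list constructor.
        super().__init__(sorted(seq))

    def append(self, item):
        # Insert item at the position that keeps the list sorted,
        # instead of adding it at the end.
        bisect.insort(self, item)


numbers = SortedList([3, 1, 2])
numbers.append(0)
print(numbers)  # [0, 1, 2, 3]
```

Only the constructor and append are overridden, as the exercise asks. Other inherited methods such as insert or extend can still put the list out of order.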

2 Dictionaries

2.1 Motivation

  • what if I was running for president?
    • I would need to create a volunteer database that matches names with emails
    • too many people to have a separate variable for each
  • today we're going to talk about how to represent this kind of data efficiently

2.2 Operations

  • each entry in our database is a pair: a key (name) and a value (email address)
  • contains: key in database (is this volunteer signed up?)
  • write: database[key] = value (update email on record)
  • read: print(database[key]) (display email on record)

2.3 What About Using a List?

  • can we provide these operations using a list? Work with your neighbor to sketch out how you would do this
    • maintain a list of tuples (key, value)
      • append new tuples as new keys are added, replace when overwritten
    • each operation involves a search through the list
      • i.e., we have to “look up” the index of the key every time
    • problem: operations take number of steps proportional to size of database
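
The list approach above can be sketched as follows. The function names (find_index, contains, write, read) are made up for illustration. Note that every operation calls find_index, which may walk the whole list.

```python
def find_index(database, key):
    """Return the index of the tuple holding key, or -1 if it is absent."""
    for i, (k, _) in enumerate(database):
        if k == key:
            return i
    return -1


def contains(database, key):
    return find_index(database, key) != -1


def write(database, key, value):
    i = find_index(database, key)
    if i == -1:
        database.append((key, value))  # new key: append a new tuple
    else:
        database[i] = (key, value)     # existing key: replace the tuple


def read(database, key):
    i = find_index(database, key)
    if i == -1:
        raise KeyError(key)
    return database[i][1]


database = []  # list of (key, value) tuples
write(database, "Bilbo", "bb@bagend.co.uk")
if contains(database, "Bilbo"):
    print("Bilbo's email:", read(database, "Bilbo"))
```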

2.4 Dictionary to the Rescue

  • dict in Python, also called a hash table or hash map
    • a "big idea" in Computer Science
  • still have a list, key idea is we have something called a hash function
    • function from possible keys to indices
  • write: hash key, write value to corresponding index
  • read: hash key, read value at corresponding index
  • contains: hash key, check if value present at corresponding index
  • now each of these operations takes about the same number of steps no matter how many entries are in our database
  • syntax: use the key just like you would an index for a list
database = {}  # initializes an empty dictionary
# creates the entry in the dictionary for the key "Bilbo" with the value on the right side of the =
database["Bilbo"] = "bb@bagend.co.uk"
if "Bilbo" in database:  # use in to check if a key exists in the dictionary, returns True or False
    print("Bilbo's email:", database["Bilbo"])

2.5 Hash function

  • this magic hash function seems to be doing all the work
    • defined by a class's __hash__ method; call hash(value) to get the hash of a value
  • intricacies are outside the scope of 111, but here's the basic idea
    • inside, dictionaries have a list of fixed length
    • compute hash(key), then take it modulo the length of the list (turns it into a valid list index)
  • leaving out: what if two keys hash to the same index? what if the internal list gets full?
    • take CS 201!
# example output from one run; str hashes change each time Python starts, so yours will differ
hash(4)         # returns 4
hash("4")       # returns -1282485687280376630
hash(4.1)       # returns 230584300921368580
hash("Python")  # returns -6798175930716008987
hash("python")  # returns -1824268576744008373
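
The basic idea above can be sketched as a toy class. This is not how Python's dict is actually implemented. ToyDict and its method names are hypothetical, and the sketch deliberately ignores the two questions left for CS 201: if two keys land on the same index, the newer one simply overwrites the older one, and the internal list never grows.

```python
class ToyDict:
    """Toy hash table: a fixed-length list indexed by hash(key) % length."""

    def __init__(self, size=8):
        self.slots = [None] * size  # internal list of fixed length

    def _index(self, key):
        # In Python, % with a positive length always gives 0 <= index < length,
        # even when hash(key) is negative.
        return hash(key) % len(self.slots)

    def write(self, key, value):
        self.slots[self._index(key)] = (key, value)

    def contains(self, key):
        entry = self.slots[self._index(key)]
        return entry is not None and entry[0] == key

    def read(self, key):
        entry = self.slots[self._index(key)]
        if entry is None or entry[0] != key:
            raise KeyError(key)
        return entry[1]


database = ToyDict()
database.write("Bilbo", "bb@bagend.co.uk")
if database.contains("Bilbo"):
    print("Bilbo's email:", database.read("Bilbo"))
```

No operation searches the list: each one computes a single index and looks only there.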