Hashing II

Table of Contents

1 Reading

Read Sections 15.4.1, 15.4.2, and 15.4.5 of Bailey.

2 Video

Here is a video lecture for the slides below.

The video contains sections on

  1. Hash Tables: Review (0:05)
  2. Collision resolution (1:32)
  3. Collision-avoidance (2:05)
  4. More on prime table size (3:57)
  5. Handling Collisions: Separate Chaining (5:31)
  6. Thoughts on chaining (7:18)
  7. Time vs. space (constant factors only here) (9:25)
  8. More rigorous chaining analysis (10:03)
  9. Alternative: Use empty space in the table (13:02)
  10. Probing hash tables (14:49)
  11. Other operations (16:25)
  12. (Primary) Clustering (21:21)
  13. Analysis of Linear Probing (22:34)
  14. Quadratic probing (24:39)
  15. Quadratic Probing Example (26:00)
  16. Another Quadratic Probing Example (27:01)
  17. From Bad News to Good News (28:36)
  18. Rehashing (29:16)

The Panopto viewer has table of contents entries for these sections: https://carleton.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=bfdfd6c6-36b5-4a8a-bbb7-acc401222432

3 Slides

You can download the slides in pdf or PowerPoint.

4 Practice Problems1

  1. What is the load factor (\(\lambda\)) of a hash table?
  2. What is a hash collision?
  3. In a hash table is it possible to have a load factor of 2?
  4. Is a constant-time performance guaranteed for hash tables?
  5. What is the final state of a hash table of size 10 after adding 35, 2, 15, 80, 42, 95, and 66? Assume that we are using the standard "mod" hash function shown in the video/slides and linear probing for collision resolution. Do not perform any resizing or rehashing. Draw the entire array and the contents of each index.
  6. If separate chaining is used for collision resolution, and the same elements from the previous problem (35, 2, 15, 80, 42, 95, and 66) are added to a hash table of size 10, what is the final state of the hash table? Do not perform any resizing or rehashing. Draw the entire array and the contents of each index.
  7. Suppose we have a hash set that uses the standard "mod" hash function and uses linear probing for collision resolution. The starting hash table length is 11, and the table does not rehash during this problem. If we begin with an empty set, what will be the final state of the hash table after the following elements are added and removed? Draw the entire array and the contents of each index. Write "X" in any index in which an element is removed and not replaced by another element. Also write the size, capacity, and load factor of the final hash table.
    set.add(4);
    set.add(52);
    set.add(50);
    set.add(39);
    set.add(29);
    set.remove(4);
    set.remove(52);
    set.add(70);
    set.add(82);
    set.add(15);
    set.add(18);
    

5 Learning Block

  • Questions?
  • Form questions
  • Text generation demo
  • Implementing a Counter
  • Midterm course feedback

Footnotes:

1

Solutions:

  1. It is the percentage of the table entries that contain key-value pairs.
  2. A collision occurs when two different keys attempt to hash to the same location. This can occur when the two keys directly map to the same index, or as the result of probing due to previous collisions for one or both of the keys.
  3. Yes. If the hash table uses external chaining, the size of the array can be smaller than the number of key-value pairs.
  4. It is not guaranteed. In practice, if the load factor is kept low, the number of comparisons can be expected to be very low.
  5. After adding the elements, the hash table's state is the following:
       0   1   2   3   4   5   6   7   8   9
    [ 80,  0,  2, 42,  0, 35, 15, 95, 66,  0]
    
  6. After adding the elements, the hash table's state is the following:
       0   1   2   3   4   5   6   7   8   9
    [  |,  /,  |,  /,  /,  |,  |,  /,  /,  /]
      80      42          95  66
               |           |
               2          15
                           |
                          35
    
  7. After adding the elements, the hash table's state is the following:
       0   1   2   3   4   5   6   7   8   9  10
    [  0,  0,  0,  0, 70, 82, 50, 39, 15, 29, 18]
    size = 7
    capacity = 11
    load factor = 0.636