Hashing I
Table of Contents
1 Reading
Read Sections 15.4.3 and 15.4.4 of Bailey. Ignore figures 15.11 and 15.12, those relate to the the performance analysis section that will be reading for Friday's lesson.
2 Video
Here is a video lecture for the slides below.
The video contains sections on
- Motivating Hash Tables (0:05)
- Hash Tables: Turn Key Into Array Index (2:51)
- Hash Tables: More Keys Than Spots (4:18)
- Hash functions (5:59)
- Hashing integers (7:07)
- Who hashes what? (11:04)
- Okay, what about keys that aren't ints? (14:28)
- Specializing hash functions (18:19)
- Verifying String hashCode (19:17)
- Hashing and comparing (22:04)
- Equal Objects Must Hash the Same (23:13)
- Example: CalendarDate (25:54)
- Conclusions and notes on hashing (36:05)
The Panopto viewer has table of contents entries for these sections: https://carleton.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=f9b503fc-1d68-48b8-9449-acc201760ef6
3 Slides
You can download the slides in pdf or PowerPoint.
4 Practice Problems1
- What is hashing, and why is hashing a good way of implementing a map?
- Is it possible for a hash table to have two entries with equal keys?
- Is it possible for a hash table to have two entries with equal values?
- For Java strings consisting of only one character,
hashCode
simply returns the value of thatchar
. Suppose we have a hash table with an array of length 20. What will the array look like after the following keys have beenput
: "s", "a", "n", "d", "w", "o", "r", "m"? For reference here is a table ofchar
values in Java:Dec = Decimal Value Char = Character '5' has the int value 53 if we write '5'-'0' it evaluates to 53-48, or the int 5 if we write char c = 'B'+32; then c stores 'b' Dec Char Dec Char Dec Char Dec Char --------- --------- --------- ---------- 0 NUL (null) 32 SPACE 64 @ 96 ` 1 SOH (start of heading) 33 ! 65 A 97 a 2 STX (start of text) 34 " 66 B 98 b 3 ETX (end of text) 35 # 67 C 99 c 4 EOT (end of transmission) 36 $ 68 D 100 d 5 ENQ (enquiry) 37 % 69 E 101 e 6 ACK (acknowledge) 38 & 70 F 102 f 7 BEL (bell) 39 ' 71 G 103 g 8 BS (backspace) 40 ( 72 H 104 h 9 TAB (horizontal tab) 41 ) 73 I 105 i 10 LF (NL line feed, new line) 42 * 74 J 106 j 11 VT (vertical tab) 43 + 75 K 107 k 12 FF (NP form feed, new page) 44 , 76 L 108 l 13 CR (carriage return) 45 - 77 M 109 m 14 SO (shift out) 46 . 78 N 110 n 15 SI (shift in) 47 / 79 O 111 o 16 DLE (data link escape) 48 0 80 P 112 p 17 DC1 (device control 1) 49 1 81 Q 113 q 18 DC2 (device control 2) 50 2 82 R 114 r 19 DC3 (device control 3) 51 3 83 S 115 s 20 DC4 (device control 4) 52 4 84 T 116 t 21 NAK (negative acknowledge) 53 5 85 U 117 u 22 SYN (synchronous idle) 54 6 86 V 118 v 23 ETB (end of trans. block) 55 7 87 W 119 w 24 CAN (cancel) 56 8 88 X 120 x 25 EM (end of medium) 57 9 89 Y 121 y 26 SUB (substitute) 58 : 90 Z 122 z 27 ESC (escape) 59 ; 91 [ 123 { 28 FS (file separator) 60 < 92 \ 124 | 29 GS (group separator) 61 = 93 ] 125 } 30 RS (record separator) 62 > 94 ^ 126 ~ 31 US (unit separator) 63 ? 95 _ 127 DEL
- For each of the following possible
hashCode
implementations: Is the following a legalhashCode
method for aPoint
class, according to the requirement that equivalent points should have the same hash code? Does it distribute the hash codes well between objects? Why or why not?public int hashCode() { return x * y; }
public int hashCode() { return 42; }
public int hashCode() { Random rng = new Random(); return rng.nextInt(); }
- Write a
hashCode
method for aDate
class, whose fields are a year, month, and day, as integers. Follow the general requirement that equivalent objects should have the same hash code. - Write a
hashCode
method for aStudent
class, whose fields are a name (a string), age (an integer), student ID number (an integer), and favorite data structure (a string). Follow the general requirement that equivalent objects should have the same hash code.
5 Learning Block
- Discuss in learning pods (share confusions, practice problems 1, 2, and 3)
- Questions?
- Form questions
- Text generation demo
- Practice problems 5, 6, and 7 in pods
Footnotes:
1
Solutions:
- Hashing is a process of mapping element values to integer indexes and storing the elements at those indexes in an array. Hashing is a good way of implementing a map because it provides theoretically O(1) runtime for adding, removing, and searching a set.
- No. If the keys were equal, the second key inserted would have found the first (i.e., been hashed to the same index), and thus not have been inserted.
- Yes, there is no restriction that values must be unique. Many keys can be associated with the same values.
internal array: +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ |"d"| | | | | | | | |"m"|"n"|"o"| | |"r"|"s"| |"a"| |"w"| +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 <--- index
- This is a legal hashCode implementation, according to the requirement. It distributes hash codes somewhat well, but it has certain groups of points that will collide with each other unnecessarily (for example, every point with an x-coordinate or y-coordinate of 0 will have the same hash code).
- This is a legal hashCode implementation, according to the requirement. But it distributes hash codes extremely poorly; literally it could not be worse in that respect, because every point has the same hash code. It's important to note that it is still a technically correct implementation, though it works very poorly in terms of hash table performance.
- This is not a legal hashCode implementation, according to the contract. It does not consistently return the same hash code for the same point object state.
hashCode
method for aDate
class (the constant multipliers for each component are somewhat arbitrary):public int hashCode() { return 31 * year + 31 * 31 * month + day; }
hashCode
method for aStudent
class (the constant multipliers for each component are somewhat arbitrary):public int hashCode() { int result = 17; result = result * 31 + name.hashCode(); result = result * 31 + age; result = result * 31 + studentID; result = result * 31 + favorite.hashCode(); return result; }