CSC 212 – Data Structures Lecture 26: Hash Tables.
patience-parker -
CSC 212 – Data Structures
Lecture 26:
Hash Tables
Question of the Day
Two English words change their pronunciation when their first letter is capitalized. What are they?
Polish/polish
Reading/reading
Entry Interface
ADT representing search data
Each Entry is a key-value pair
key is what we have and use…
… but we actually want the value
public interface Entry<K,V> {
    public K key();
    public V value();
}
Map ADT
Represents searchable Collections
Data items are entries (key-value pairs)
Instances map keys to values
Keys contained in at most one entry
So each key mapped to at most one value
Values may be in multiple entries
So many keys may refer to same value
Basis of searching data structures
Map Interface
public interface Map<K,V> extends Collection {
    public V put(K key, V val) throws InvalidKeyException;
    public V get(K key) throws InvalidKeyException;
    public V remove(K key) throws InvalidKeyException;
    public Iterable<K> keys();
    public Iterable<V> values();
    public Iterable<Entry<K,V>> entries();
}
PositionList-Based Implementation
PositionList holds entries in any order
Independent of PositionList’s implementation
Relies on methods defined by Interface
[Figure: a Map backed by a PositionList; each Position holds an Entry. Entry labels shown: 9 c, 6 c, 5 c, 8 c]
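The list-based idea above can be sketched roughly as follows. This is only an illustration: the course's PositionList is not shown in the slides, so java.util.ArrayList stands in for it, and the names ListMapSketch and Pair are hypothetical. Null keys and InvalidKeyException are left out to keep it short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a PositionList-based Map.
// ArrayList replaces PositionList, which is an assumption of this sketch.
class ListMapSketch<K, V> {
    // Simple key-value pair, mirroring Entry's key() and value().
    private static class Pair<K, V> {
        final K key;
        V value;
        Pair(K key, V value) { this.key = key; this.value = value; }
    }

    // Entries are kept in insertion order; no ordering by key is assumed.
    private final List<Pair<K, V>> entries = new ArrayList<>();

    // Linear scan over all entries: this is what makes every operation O(n).
    private Pair<K, V> find(K key) {
        for (Pair<K, V> p : entries) {
            if (p.key.equals(key)) return p;
        }
        return null;
    }

    // Stores (key, value); returns the previous value, or null if key was absent.
    public V put(K key, V value) {
        Pair<K, V> p = find(key);
        if (p != null) {
            V old = p.value;
            p.value = value;
            return old;
        }
        entries.add(new Pair<>(key, value));
        return null;
    }

    // Returns the value mapped to key, or null if there is none.
    public V get(K key) {
        Pair<K, V> p = find(key);
        return p == null ? null : p.value;
    }

    // Removes the entry for key; returns its value, or null if key was absent.
    public V remove(K key) {
        Pair<K, V> p = find(key);
        if (p == null) return null;
        entries.remove(p);
        return p.value;
    }

    public int size() { return entries.size(); }
}
```

Every method goes through find, so get, put, and remove all scan the list in the worst case.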
Map Performance
Want simple & fast implementation
Google: Search speed measured in TB/s
List-based Map: get, remove, put take O(n) time
Would love to use arrays
Implementation is easy
Insertion, access, and removal in O(1) time
But ranks or array indices are ints, not K
Hashing To The Rescue
For each key, hash function computes integer from 0 to N - 1
For example, h(x) = x mod N
Value h(x) is “hash value” of x
Hash table stores all the entries
(Nothing to do with eateries in Amsterdam)
Really just an array of size N
Goal is storing entry (k, v) at index h(k)
1st (good) implementation of a Map
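One way to get from an arbitrary key type K to an array index is sketched below. It assumes the key's Object.hashCode() supplies the integer x, which the slides do not specify; the class name HashIndexSketch and the method indexFor are hypothetical. Math.floorMod is used instead of % because Java's % can return a negative result when x is negative.

```java
class HashIndexSketch {
    // Computes h(key) = hashCode(key) mod n, always in the range 0 .. n-1.
    // Using hashCode() as the integer for the key is an assumption of this sketch.
    static int indexFor(Object key, int n) {
        return Math.floorMod(key.hashCode(), n);
    }
}
```

For an Integer key the hash code is the value itself, so indexFor(12, 10) gives 2, and indexFor(-1, 10) gives 9 rather than -1.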
Hash Table Example
Stores instances of Entry<SSN,Name>
Array has 10,000 indices
Hash function is h(x) = x mod 10,000
What if we execute the call: put(212710001, “Ike Oh”);
Index   Key          Name
0       (empty)
1       0256120001   “Jay Doe”
2       9811010002   “Bob Dole”
3       (empty)
4       4512290004   “Jill Roe”
…
9997    (empty)
9998    2007519998   “Rhi Smith”
9999    (empty)
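The arithmetic behind this example can be checked with a few lines of Java. This is a sketch under the slide's stated hash function; the class name SsnExample is hypothetical. Keys are held as long because some exceed the int range, and the key 0256120001 is written without its leading zero because a leading 0 would make a Java literal octal.

```java
class SsnExample {
    static final int N = 10_000; // table size from the slide

    // h(x) = x mod 10,000; keys here are non-negative, so % is sufficient.
    static int h(long x) {
        return (int) (x % N);
    }

    public static void main(String[] args) {
        // The four keys already in the table, followed by the new key 212710001.
        long[] keys = {256120001L, 9811010002L, 4512290004L, 2007519998L, 212710001L};
        for (long k : keys) {
            System.out.println(k + " -> index " + h(k));
        }
    }
}
```

The new key 212710001 hashes to index 1, which is already occupied by the entry for “Jay Doe”.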
Collisions
Name for when keys hash to same index
Ideal hash spreads keys out equally and evenly
Limit/avoid collisions
But also want to keep table small
But good hash hard to find
Depends on what you have to work with
Even harder to make a good hash
Could try to work with collisions
Bucket Arrays
Each item in array is itself a List
“Chain” whenever there is a collision
Nothing to do with road rage
Instead, just add new Entry onto List
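A bucket array with chaining might look roughly like the sketch below. It is not the course's implementation: java.util.LinkedList stands in for the List in each bucket, the index is assumed to be hashCode() mod the table size, and the names ChainedMapSketch, Pair, and indexOf are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class ChainedMapSketch<K, V> {
    // Simple key-value pair, mirroring Entry's key() and value().
    private static class Pair<K, V> {
        final K key;
        V value;
        Pair(K key, V value) { this.key = key; this.value = value; }
    }

    // The bucket array: one chain (list) per index.
    private final List<LinkedList<Pair<K, V>>> buckets = new ArrayList<>();

    ChainedMapSketch(int capacity) {
        for (int i = 0; i < capacity; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    // Assumed hash: hashCode() mod table size, kept non-negative by floorMod.
    int indexOf(K key) {
        return Math.floorMod(key.hashCode(), buckets.size());
    }

    // On a collision the new pair is simply appended to the chain at that index.
    public V put(K key, V value) {
        LinkedList<Pair<K, V>> chain = buckets.get(indexOf(key));
        for (Pair<K, V> p : chain) {
            if (p.key.equals(key)) {
                V old = p.value;
                p.value = value;
                return old;
            }
        }
        chain.add(new Pair<>(key, value));
        return null;
    }

    // Only the one chain selected by the hash is searched.
    public V get(K key) {
        for (Pair<K, V> p : buckets.get(indexOf(key))) {
            if (p.key.equals(key)) return p.value;
        }
        return null;
    }

    public V remove(K key) {
        Iterator<Pair<K, V>> it = buckets.get(indexOf(key)).iterator();
        while (it.hasNext()) {
            Pair<K, V> p = it.next();
            if (p.key.equals(key)) {
                it.remove();
                return p.value;
            }
        }
        return null;
    }
}
```

With a capacity of 10,000 and Integer keys, 256120001 and 212710001 both land at index 1, and both remain retrievable because they share that bucket's chain.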
Bucket Arrays
But what if have really bad hash?
Suppose always hash to same index
All entries now in single List
Back to O(n) execution times
(Also get bad case of the munchies)
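The effect of a bad hash can be seen by counting how long the longest chain gets. The sketch below is illustrative only; the class name BadHashDemo and the method longestChain are hypothetical, and keys are plain ints for simplicity.

```java
import java.util.function.IntUnaryOperator;

class BadHashDemo {
    // Returns the length of the longest chain after hashing every key
    // into a table of n buckets using the supplied hash function h.
    static int longestChain(int[] keys, int n, IntUnaryOperator h) {
        int[] counts = new int[n]; // counts[i] = number of keys in bucket i
        int max = 0;
        for (int k : keys) {
            int i = Math.floorMod(h.applyAsInt(k), n);
            counts[i]++;
            max = Math.max(max, counts[i]);
        }
        return max;
    }
}
```

For the keys 0 through 99 and 10 buckets, the hash k -> k spreads the keys evenly and the longest chain has 10 entries, while the constant hash k -> 7 puts all 100 keys into one chain, so a lookup has to scan the whole list again.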
Your Turn
Get back into groups and do activity
Before Next Lecture…
Keep up with your reading!
Complete Week #10 Assignment
Review Programming Assignment #3