Optimal Fast Hashing
![Page 1: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/1.jpg)
Optimal Fast Hashing
Yossi Kanizo (Technion, Israel)
Joint work with Isaac Keslassy (Technion, Israel) and David Hay (Hebrew Univ., Israel)
![Page 2: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/2.jpg)
Hash Tables for Networking Devices
Hash tables and hash-based structures are often used in high-speed devices:
Heavy-hitter flow identification
Flow state keeping
Flow counter management
Virus signature scanning
IP address lookup algorithms
![Page 3: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/3.jpg)
Hash tables
In theory, hash tables are particularly suitable: O(1) memory accesses per operation (element insertion/query/deletion) for reasonable load.
But in practice, there is a big difference between an average of 1.1 memory accesses per operation and an average of 4.
Why not just 1 memory access? Collisions.
![Page 4: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/4.jpg)
Hash Tables for Networking Devices
Collisions are unavoidable → wasted memory accesses
For load ≤ 1, let a and d be the average and worst-case time (number of memory accesses) per element insertion.
Objective: minimize a and d
[Figure: memory with buckets 1 to 9]
![Page 5: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/5.jpg)
Why We Care
On-chip memory: memory accesses → power consumption
Off-chip memory: memory accesses → lost on/off-chip pin capacity
Datacenters: memory accesses → network & server load
Parallelism does not help reduce these costs: d serial or parallel memory accesses have the same cost.
![Page 6: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/6.jpg)
Traditional Hash Table Schemes
Example 1: linked lists (chaining)
[Figure: memory with buckets 1 to 9; elements 1 to 9 stored in linked lists hanging off the buckets]
![Page 7: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/7.jpg)
Traditional Hash Table Schemes
Example 1: linked lists (chaining)
Example 2: linear probing (open addressing)
Problem: the worst-case time cannot be bounded by a constant d
[Figure: memory with buckets 1 to 9; elements placed by linear probing]
![Page 8: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/8.jpg)
High-Speed Hardware
Enable overflows: if time exceeds d → overflow list
Can be stored in expensive CAM
Otherwise, overflow elements = lost elements
Bucket contains h elements
E.g.: 128-bit memory word → h=4 elements of 32 bits
Assumption: access cost (read & write word) = 1 cycle
[Figure: memory with buckets 1 to 9 of height h; overflow element 9 stored in the CAM]
![Page 9: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/9.jpg)
Possible Settings
Static setting: insertions and queries only.
Dynamic setting: insertions, deletions, and queries.
Generalized setting: balancing between the buckets' load.
![Page 10: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/10.jpg)
Problem Formulation
[Figure: memory with buckets 1 to 9 of height h; overflow element 9 stored in the CAM]
Given the average a and the worst-case d number of memory accesses per operation, minimize the overflow rate.
![Page 11: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/11.jpg)
Example: Power of d-Random Choices
d hash functions: pick the least loaded bucket. Break ties uniformly at random (u.a.r.) [Azar et al.]
Intuition: can reach low … but average time a = worst-case time d → wasted memory accesses
[Figure: memory with buckets 1 to 9 of height h; overflow element 9 stored in the CAM]
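The d-random rule can be sketched as a small simulation. This is a minimal sketch, not the authors' code: the function names, the default parameters, and the modelling of the d hash values as independent uniform draws are all assumptions made for illustration. It shows why the average cost a equals d here: every insertion reads all d candidate buckets before choosing one.

```python
import random

def insert_d_random(table, h, d, rng):
    """Insert one element; return (stored, memory_accesses)."""
    m = len(table)
    # Assumption: the d hash values are modelled as independent uniform draws.
    choices = [rng.randrange(m) for _ in range(d)]
    loads = [table[b] for b in choices]      # all d buckets are read
    least = min(loads)
    ties = [b for b, load in zip(choices, loads) if load == least]
    target = rng.choice(ties)                # break ties uniformly at random
    if table[target] < h:
        table[target] += 1
        return True, d
    return False, d                          # every choice is full: overflow

def simulate_d_random(m=1000, h=4, d=4, load=0.95, seed=1):
    """Return (overflow fraction, average memory accesses per insertion)."""
    rng = random.Random(seed)
    table = [0] * m                          # table[b] = occupancy of bucket b
    n = round(load * m * h)
    overflow = accesses = 0
    for _ in range(n):
        stored, cost = insert_d_random(table, h, d, rng)
        overflow += 0 if stored else 1
        accesses += cost
    return overflow / n, accesses / n
```

Calling `simulate_d_random(d=4)` and `simulate_d_random(d=1)` side by side gives a rough feel for the trade-off: more choices lower the overflow fraction, but the average number of accesses grows to d.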
![Page 12: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/12.jpg)
Other Examples
d-left [Vöcking]: same as d-random, but break ties to the left.
Cuckoo [Pagh et al.]: whenever a collision occurs, moves stored elements to their other choices. Typically uses much more than d memory accesses on average.
![Page 13: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/13.jpg)
Outline
Static Case
Overflow Lower Bound
Optimal Schemes: SIMPLE, GREEDY, MHT
Dynamic Case
Comparison with Static Case
Overflow Lower Bound
Overflow Fraction Depending on d
![Page 14: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/14.jpg)
Overflow Lower Bound
Objective: given any online scheme with average a and worst-case d, find a lower bound on the overflow.
[Figure: h=4, load=n/(mh)=0.95, fixed d; region annotated "No scheme can achieve" (capacity region)]
![Page 15: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/15.jpg)
Overflow Lower Bound
Result: closed-form lower-bound formula
Given n elements in m buckets of height h: [Equation]
Valid also for non-uniform hashes
For n=m and h=1, we get simply $e^{-a}$
Defines a capacity region for high-throughput hashing
![Page 16: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/16.jpg)
Lower-Bound Example
[h=4, load=n/(mh)=0.95]
For a 3% overflow rate, throughput can be at most 1/a = 2/3 of the memory rate.
![Page 17: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/17.jpg)
Overflow Lower Bound
Example: the d-left scheme achieves low overflow, but at a high average memory access rate a.
[h=4, load=n/(mh)=0.95, m=5,000]
![Page 18: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/18.jpg)
The SIMPLE Scheme
SIMPLE scheme: single hash function
Looks like a truncated linked list
[Figure: memory with buckets 1 to 9 of height h; overflow element 9 stored in the CAM]
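A minimal sketch of the SIMPLE idea, assuming a salted built-in `hash` as the single hash function and a plain Python list standing in for the CAM. The class and method names are hypothetical and not taken from the slides. Each insertion touches exactly one bucket; if that bucket already holds h elements, the element goes to the overflow list.

```python
import random

class SimpleTable:
    """Single hash function, buckets of height h, overflow list (CAM stand-in)."""

    def __init__(self, m, h, seed=0):
        self.m, self.h = m, h
        self.buckets = [[] for _ in range(m)]
        self.cam = []                            # overflow elements
        # Assumption: a random salt plus Python's hash() models one hash function.
        self.salt = random.Random(seed).getrandbits(64)

    def _bucket(self, key):
        return hash((self.salt, key)) % self.m

    def insert(self, key):
        """Return True if stored in memory, False if sent to the overflow list."""
        bucket = self.buckets[self._bucket(key)]  # the only memory access
        if len(bucket) < self.h:
            bucket.append(key)
            return True
        self.cam.append(key)
        return False

    def query(self, key):
        return key in self.buckets[self._bucket(key)] or key in self.cam
```

For example, `t = SimpleTable(m=100, h=4)` followed by `t.insert(k)` for 380 integer keys fills the table to a load of 0.95; `len(t.cam) / 380` is then the observed overflow fraction for that run.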
![Page 19: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/19.jpg)
Performance of SIMPLE Scheme
[h=4, load=0.95, m=5,000]
The lower bound can actually be achieved for a=1
![Page 20: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/20.jpg)
The GREEDY Scheme
Using uniform hashes, try to insert each element greedily until it is either inserted or d memory accesses have been used
[Figure: memory with buckets 1 to 9 of height h; overflow element 9 stored in the CAM; example with d=2]
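The greedy rule can be sketched as follows. This is an illustrative simulation under stated assumptions: each hash value is modelled as an independent uniform draw, the function names and defaults are hypothetical, and the sketch does not impose any separate budget on the average a. An element stops at the first non-full bucket, so its cost is the number of buckets it probed, at most d.

```python
import random

def greedy_insert(table, h, d, rng):
    """Probe up to d buckets in order; return (stored, memory_accesses)."""
    m = len(table)
    for attempt in range(1, d + 1):
        b = rng.randrange(m)        # assumption: uniform, independent hash value
        if table[b] < h:
            table[b] += 1
            return True, attempt    # stored at the first bucket with room
    return False, d                 # all d probes hit full buckets: overflow

def simulate_greedy(m=1000, h=4, d=4, load=0.95, seed=1):
    """Return (overflow fraction, average memory accesses per insertion)."""
    rng = random.Random(seed)
    table = [0] * m
    n = round(load * m * h)
    overflow = accesses = 0
    for _ in range(n):
        stored, cost = greedy_insert(table, h, d, rng)
        overflow += 0 if stored else 1
        accesses += cost
    return overflow / n, accesses / n
```

Unlike a scheme that always reads all d candidates, the average cost returned here sits between 1 and d, because early insertions usually succeed on the first probe.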
![Page 21: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/21.jpg)
Performance of GREEDY Scheme
[d=4, h=4, load=0.95, m=5,000]
The GREEDY scheme is always optimal until the cut-off point $a_{co}$
![Page 22: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/22.jpg)
Performance of GREEDY Scheme
[d=4, h=4, load=0.95, m=5,000]
Overflow rate worse than 4-left, but better throughput (1/a)
![Page 23: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/23.jpg)
The MHT Scheme
MHT (Multi-Level Hash Table) [Broder & Karlin]: d successive subtables with their d hash functions
[Figure: memory with buckets 1 to 7 of height h, split into 1st, 2nd, and 3rd subtables; overflow element 9 stored in the CAM]
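A minimal sketch of the multi-level idea, assuming geometrically shrinking subtables with a hypothetical ratio of 0.5 and hash values modelled as uniform draws. The function names and the sizing rule are illustrative choices, not the sizes derived by the authors. An element tries subtable 1 with its first hash function, then subtable 2, and so on, and overflows only if all d candidate buckets are full.

```python
import random

def make_subtable_sizes(m, d, ratio=0.5):
    """Split about m buckets into d subtables with geometrically falling sizes."""
    # Assumption: the ratio 0.5 is an arbitrary illustrative value.
    weights = [ratio ** i for i in range(d)]
    total = sum(weights)
    return [max(1, int(m * w / total)) for w in weights]

def mht_insert(subtables, h, rng):
    """Try the subtables in order; return (stored, memory_accesses)."""
    for level, sub in enumerate(subtables, start=1):
        b = rng.randrange(len(sub))     # assumption: uniform hash into subtable
        if sub[b] < h:
            sub[b] += 1
            return True, level
    return False, len(subtables)        # every level was full: overflow

def simulate_mht(m=1000, h=4, d=4, load=0.95, seed=1):
    """Return (overflow fraction, average memory accesses per insertion)."""
    rng = random.Random(seed)
    sizes = make_subtable_sizes(m, d)
    subtables = [[0] * size for size in sizes]
    n = round(load * sum(sizes) * h)
    overflow = accesses = 0
    for _ in range(n):
        stored, cost = mht_insert(subtables, h, rng)
        overflow += 0 if stored else 1
        accesses += cost
    return overflow / n, accesses / n
```

With `m=1000` and `d=4` the helper yields subtables of 533, 266, 133, and 66 buckets. Changing `ratio` is a simple way to explore how the subtable sizing affects the balance between overflow and average accesses.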
![Page 24: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/24.jpg)
Performance of MHT Scheme
Optimality of MHT until cut-off point aco(MHT)
Proof that subtable sizes fall geometrically
Confirmed in simulations
[d=4, h=4, load=0.95, m=5,000]
Overflow rate close to 4-left, with much better throughput (1/a)
![Page 25: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/25.jpg)
![Page 26: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/26.jpg)
Dynamic vs. Static
Dynamic hash tables are harder to model than static ones [Kirsch et al.]
But past studies show the same asymptotic behavior with infinite buckets (insertions only vs. alternations):
traditional hashing using linked lists: maximum bucket size of approx. log n / log log n [Gonnet, 1981]
d-random, d-left schemes: maximum bucket size of log log n / log 2 + O(1) [Azar et al., 1994; Vöcking, 1999]
As a designer, using the static model seems natural, even if real-life devices have finite buckets.
![Page 27: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/27.jpg)
Degradation with Finite Buckets
Finite buckets are used. Surprising result: degradation in performance
[Figure: buckets 1 to 4 with finite versus infinite bucket size; H(1) = 3 and H(2) = 3]
Element “2” is lost although its corresponding bucket is empty
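One way to read this example is as a short operation sequence: insert element 1, insert element 2 into the same bucket, then delete element 1. The sketch below replays that sequence with bucket capacity 1 and with unbounded buckets. The operation order, the function name `run`, and the tuple format are assumptions made for illustration; the slide only states the hash values and the outcome.

```python
def run(ops, capacity):
    """Replay (op, key, bucket) tuples; capacity=None models infinite buckets."""
    buckets = {}
    lost = set()
    for op, key, bucket in ops:
        slot = buckets.setdefault(bucket, [])
        if op == "insert":
            if capacity is None or len(slot) < capacity:
                slot.append(key)
            else:
                lost.add(key)           # bucket full: the element overflows
        elif op == "delete" and key in slot:
            slot.remove(key)
    return buckets, lost

# Assumed sequence: both elements hash to bucket 3, then element 1 is deleted.
ops = [("insert", 1, 3), ("insert", 2, 3), ("delete", 1, 3)]
finite_buckets, finite_lost = run(ops, capacity=1)
infinite_buckets, infinite_lost = run(ops, capacity=None)
```

With finite buckets, bucket 3 ends up empty while element 2 sits in the lost set. With infinite buckets, element 2 is still stored in bucket 3 after the deletion, which is why the two models diverge once deletions are allowed.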
![Page 28: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/28.jpg)
Comparing Static and Dynamic
Static setting: insertions only
n = number of elements
m = number of buckets
Dynamic setting: alternations between element insertions and deletions of randomly chosen elements
fixed load of c = n / (mh)
Fair comparison: given an average number of memory accesses a, minimize the overflow fraction.
![Page 29: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/29.jpg)
Overflow Lower Bound
Overflow lower bound of [Equation] where r = ach.
Also holds for non-uniformly distributed hash functions (under some constraints).
The lower bound is tight (SIMPLE, GREEDY).
![Page 30: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/30.jpg)
Numerical Example
For h=1 and c=1 (100% load) we get a lower bound of 1/(1+a).
To get an overflow fraction of 1%, one needs at least 99 memory accesses per element. This is infeasible for high-speed networking devices.
Compare this to a tight upper bound of $e^{-a}$ in the static case [Kanizo et al., INFOCOM 2009], which needs only ~4.6 memory accesses.
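The two figures quoted above follow from inverting each bound for a target overflow fraction. The short sketch below does that arithmetic for h=1 and c=1; the function names are hypothetical, and the expressions used are only the two stated on this slide, 1/(1+a) for the dynamic case and e^{-a} for the static case.

```python
import math

def min_accesses_dynamic(target_overflow):
    """Smallest a with 1 / (1 + a) <= target_overflow (h=1, c=1)."""
    return 1.0 / target_overflow - 1.0

def min_accesses_static(target_overflow):
    """Smallest a with exp(-a) <= target_overflow (h=1, c=1)."""
    return -math.log(target_overflow)

dynamic_a = min_accesses_dynamic(0.01)   # about 99 accesses per element
static_a = min_accesses_static(0.01)     # about 4.6 accesses per element
```

The gap grows quickly as the target shrinks: the dynamic requirement scales like 1/target, while the static one scales only like log(1/target).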
![Page 31: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/31.jpg)
![Page 32: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/32.jpg)
Overflow Fraction Depending on d
So far, we relaxed the constraint on d. We treated n elements with an average of a memory accesses each as n·a distinct elements.
To take into account d, we must consider each element along with its own hash values.
![Page 33: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/33.jpg)
Graph Theory Approach
Consider a bipartite graph.
Left vertices = elements
Right vertices = buckets (assume h=1)
Edge = the bucket is one of the element's d choices
![Page 34: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/34.jpg)
Graph Theory Approach
We get a random bipartite graph where each left vertex has degree d.
Expected maximum size matching = expected number of elements that can be inserted into the table, which gives a lower bound on the overflow fraction.
We derived an explicit expression for d=2.
Upper bound can be achieved by Cuckoo hashing (equivalent to finding maximum size matching).
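The matching view can be illustrated with a small experiment: draw a random bipartite graph in which each element has d bucket choices, compute a maximum matching with the standard augmenting-path method, and count the unmatched elements as overflow. This is a sketch under assumptions: the d choices are drawn independently and uniformly (so repeats are possible), h=1, and all function names are hypothetical. It does not reproduce the explicit d=2 expression mentioned above.

```python
import random

def random_choices_graph(n, m, d, rng):
    """adj[u] lists the d bucket choices of element u."""
    # Assumption: choices are independent uniform draws; repeats may occur.
    return [[rng.randrange(m) for _ in range(d)] for _ in range(n)]

def maximum_matching(adj, m):
    """Size of a maximum matching, found with augmenting paths (h=1)."""
    owner = [-1] * m                    # owner[b] = element stored in bucket b

    def try_place(u, seen):
        for b in adj[u]:
            if b in seen:
                continue
            seen.add(b)
            # Take a free bucket, or relocate its current owner elsewhere.
            if owner[b] == -1 or try_place(owner[b], seen):
                owner[b] = u
                return True
        return False

    return sum(1 for u in range(len(adj)) if try_place(u, set()))

def overflow_fraction(n=300, m=300, d=2, seed=1):
    rng = random.Random(seed)
    adj = random_choices_graph(n, m, d, rng)
    return 1 - maximum_matching(adj, m) / n
```

The relocation step inside `try_place` mirrors what Cuckoo hashing does when it moves a stored element to one of its other choices, which is why the two are described as equivalent above. The recursion depth is bounded by the number of elements, so the defaults stay well within Python's default recursion limit.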
![Page 35: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/35.jpg)
Summary
We found lower and upper bounds on the achievable overflow fraction both for the static and dynamic cases.
Static models are not necessarily exact with dynamic hash tables.
Improved lower bound for d=2 and a characterization of the performance of Cuckoo hashing.
![Page 36: Optimal Fast Hashing](https://reader035.fdocuments.net/reader035/viewer/2022062323/56816743550346895ddbf848/html5/thumbnails/36.jpg)
Thank you.