Hash Tables With Finite Buckets Are Less Resistant to Deletions Yossi Kanizo (Technion, Israel) Joint work with David Hay (Columbia U. and Hebrew U.) and Isaac Keslassy (Technion)
Date posted: 19-Dec-2015


Hash Tables With Finite Buckets Are Less Resistant to Deletions

Yossi Kanizo (Technion, Israel)

Joint work with David Hay (Columbia U. and Hebrew U.) and Isaac Keslassy (Technion)

Page 2:

Hash Tables for Networking Devices

Hash tables and hash-based structures are often used in high-speed devices:
• Heavy-hitter flow identification
• Flow state keeping
• Flow counter management
• Virus signature scanning
• IP address lookup algorithms

In many applications, elements are also deleted (a.k.a. dynamic hash tables).

Page 3:

Dynamic vs. Static

Dynamic hash tables are harder to model than static ones (insertions only) [Kirsch et al.].

Past studies show the same asymptotic behavior with infinite buckets (insertions only vs. alternations):
• Traditional hashing using linked lists: maximum bucket size of approx. log n / log log n [Gonnet, 1981]
• d-random, d-left schemes: maximum bucket size of log log n / log 2 + O(1) [Azar et al., 1994; Vöcking, 1999]

Using the static model seems natural.

Page 4:

High-Speed Hardware

• A bucket is a memory word that contains h elements
  • E.g.: 128-bit memory word → h=4 elements of 32 bits
  • Assumption: access cost (read & write word) = 1 cycle

• Enable overflows: after d memory accesses → overflow list
  • Can be stored in an expensive CAM
  • Otherwise, overflow elements = lost elements
  • Overflow fraction γ = fraction of elements that end up in the overflow list

[Figure: memory buckets of h elements each, holding elements 1–8; element 9 overflows to the CAM]

Page 5:

Degradation with Finite Buckets

Finite buckets are used, which causes a degradation in performance.

[Figure: buckets 1–4 with finite and infinite bucket sizes; H(1) = 3, H(2) = 3]

Element “2” is not stored although its corresponding bucket is empty

Page 6:

Degradation with Finite Buckets

What we had is:
• Insert element “1”
• Insert element “2”
• Remove element “1”

Equivalent to only inserting element “2” in the static case

[Figure: buckets 1–4 with finite and infinite bucket sizes, and element “2”]
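The three-operation example can be replayed in a few lines. This is only an illustrative sketch: the `replay` helper, the dictionary-based buckets, and the use of `float("inf")` for infinite buckets are assumptions made here, not part of the slides.

```python
def replay(ops, bucket_of, h):
    """Replay (op, element) pairs on buckets of capacity h, with no move-back."""
    stored = {}        # bucket index -> set of elements held in that bucket
    overflow = set()   # elements that found their bucket full on insertion
    for op, x in ops:
        bucket = stored.setdefault(bucket_of[x], set())
        if op == "insert":
            if len(bucket) < h:
                bucket.add(x)
            else:
                overflow.add(x)
        else:  # "remove"
            bucket.discard(x)
            overflow.discard(x)
    return stored, overflow


H = {1: 3, 2: 3}  # H(1) = 3, H(2) = 3, as in the example
dynamic_ops = [("insert", 1), ("insert", 2), ("remove", 1)]

finite = replay(dynamic_ops, H, h=1)                # "2" stays in the overflow list
infinite = replay(dynamic_ops, H, h=float("inf"))   # "2" ends up in bucket 3
static = replay([("insert", 2)], H, h=1)            # static case: only "2" is inserted
```

With infinite buckets the dynamic sequence ends in the same state as the static one; with h = 1 it does not, because element “2” was diverted while bucket 3 was still full.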

Page 7:

Simulations

[h=1, load=n/(mh)=1, d = 2]

Page 8:

Comparing Static and Dynamic

Static setting: insertions only
• n = number of elements
• m = number of buckets

Dynamic setting: alternations between element insertions and deletions of randomly chosen elements
• Fixed load of c = n / (mh)

Fair comparison: given an average number of memory accesses a, minimize the overflow fraction γ.
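The two settings can be compared with a small Monte Carlo sketch for the single-choice scheme (a = 1). The function names, the use of independent uniform random bucket choices in place of a real hash function, and the number of alternation rounds are assumptions made for illustration.

```python
import random


def static_overflow(m, h, c, seed=0):
    """Static setting: insert n = c*m*h elements once, report the overflow fraction."""
    rng = random.Random(seed)
    n = int(c * m * h)
    load = [0] * m
    lost = 0
    for _ in range(n):
        b = rng.randrange(m)          # stand-in for a uniform hash function
        if load[b] < h:
            load[b] += 1
        else:
            lost += 1
    return lost / n


def dynamic_overflow(m, h, c, rounds=20, seed=0):
    """Dynamic setting: delete a random element, then insert a new one, at fixed load c."""
    rng = random.Random(seed)
    n = int(c * m * h)
    load = [0] * m
    where = [None] * n                # bucket of each element, None if overflowed

    def insert(i):
        b = rng.randrange(m)
        if load[b] < h:
            load[b] += 1
            where[i] = b
        else:
            where[i] = None

    for i in range(n):
        insert(i)
    for _ in range(rounds * n):       # alternations: one deletion, one insertion
        i = rng.randrange(n)
        if where[i] is not None:
            load[where[i]] -= 1
        insert(i)
    return sum(w is None for w in where) / n


if __name__ == "__main__":
    print("static :", static_overflow(5000, 1, 1.0))
    print("dynamic:", dynamic_overflow(5000, 1, 1.0))
```

For h = 1 and c = 1 the dynamic run should settle at a visibly higher overflow fraction than the static one, though the exact values vary with the seed and table size.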

Page 9:

Why We Care about Average Number of Memory Accesses?

On-chip memory: memory accesses → power consumption

Off-chip memory: memory accesses → lost on/off-chip pin capacity

Datacenters: memory accesses → network & server load

Parallelism does not help reduce these costs: d serial or parallel memory accesses have the same cost.

Page 10:

From Discrete to Fluid Model

Discrete model
• Models the system accurately, but induces complex interactions between the elements

Approximation using a fluid model
• Based on differential equations with an infinite number of elements and buckets.
• Elements stay in the system for an exponentially distributed duration of average 1.
  • Bucket departure rate is proportional to its occupancy. Upon departure, a new element arrives.
  • Arrival rate is constant (fixed load in the system).
  • Assuming uniformly distributed hash functions, the bucket arrival rate is n / m = ch.

Page 11:

Main Results

Case Study: Single choice hashing scheme

Lower bound on overflow fraction

Mitigating the degradation in performance.

Page 12:

Case Study: Analysis of Single Choice Hashing Scheme

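Departure rate is proportional to bucket occupancy; arrival rate is constant.

We show that the limit of the discrete Markov chain is the fluid model.
• Intuition: no dependency between the buckets because of the single choice. No “complex interaction”.

Bucket occupancy distribution (fraction of buckets holding i elements) is
$$p_i = \frac{(ch)^i / i!}{\sum_{j=0}^{h} (ch)^j / j!}, \quad i = 0, 1, \dots, h$$

The overflow fraction is (Erlang-B formula)
$$\gamma = \frac{(ch)^h / h!}{\sum_{j=0}^{h} (ch)^j / j!}$$

[Figure: discrete Markov chain of a bucket's occupancy with states 0, 1, 2, …, h; forward transition labels 1/m·(1-1/n), 1/m·(1-2/n), 1/m·(1-3/n), …, 1/m·(1-h/n); backward transition labels (1-1/m)·1/n, (1-1/m)·2/n, (1-1/m)·3/n, …, h/n]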

Page 13:

Case Study: Numerical Example

For bucket size h=1, we get γ = c/(1 + c).

In case of 100% load (c=1):
• dynamic: 50%
• static: 36.79% [Kanizo et al., INFOCOM 2009]

In case of 10% load (c=0.1):
• dynamic: 9.1%
• static: 4.84%

As the load → 0, dynamic systems have twice the overflow fraction of static systems.
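These figures can be reproduced numerically. The sketch below evaluates the Erlang-B formula with offered load ch for the dynamic single-choice scheme. For the static side it uses the expression 1 - (1 - e^(-c))/c, which is an assumption made here for h = 1 (the share of n = cm elements left over when each of m buckets keeps at most one); it is not stated on the slides, but it matches the 36.79% and 4.84% values quoted above. All function names are illustrative.

```python
import math


def erlang_b(r, h):
    """Erlang-B blocking probability for offered load r and capacity h."""
    terms = [r ** i / math.factorial(i) for i in range(h + 1)]
    return terms[-1] / sum(terms)


def dynamic_single_choice(c, h=1):
    """Overflow fraction of dynamic single-choice hashing (bucket arrival rate ch)."""
    return erlang_b(c * h, h)


def static_single_choice_h1(c):
    """Assumed static overflow fraction for h = 1: 1 - (1 - e^-c) / c."""
    return 1.0 + math.expm1(-c) / c   # expm1(-c) = e^-c - 1, accurate for small c


if __name__ == "__main__":
    for c in (1.0, 0.1, 1e-4):
        dyn, sta = dynamic_single_choice(c), static_single_choice_h1(c)
        print(f"c={c}: dynamic={dyn:.4%} static={sta:.4%} ratio={dyn / sta:.3f}")
```

As c shrinks, the printed ratio approaches 2, which is the factor-of-two gap stated for vanishing load.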

Page 14:

Main Results

Case Study: Single choice hashing scheme

Lower bound on overflow fraction

Mitigating the degradation in performance.

Page 15:

Overflow Lower Bound

Objective: given any online scheme with average a, find a lower bound on the overflow fraction γ.

We use the fluid model:
• Element arrival rate is ch = n / m.
• Hashing rate per element is a.
• In the best case, all memory accesses are used to store elements.

Page 16:

Overflow Lower Bound

Overflow lower bound of
$$\gamma \ge 1 - a + a \cdot \frac{r^h / h!}{\sum_{j=0}^{h} r^j / j!}$$
where r = ach.

Also holds for non-uniformly distributed hash functions (under some constraints).

Page 17:

Numerical Example

For bucket size h=1, lower bound of 1 - a/(1 + ac).
• 100% load (c=1) implies a lower bound of 1/(1 + a).

To get an overflow fraction of 1%, one needs at least 99 memory accesses per element.
• Infeasible for high-speed networking devices.

Compared to a tight upper bound of e^(-a) in the static case [Kanizo et al., INFOCOM 2009]: need ~4.6 memory accesses.
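The arithmetic behind the 99 and ~4.6 figures is a direct inversion of the two bounds at c = 1 and h = 1. The helper names below are illustrative only.

```python
import math


def dynamic_lower_bound_h1(a, c=1.0):
    """Dynamic lower bound on the overflow fraction for h = 1: 1 - a / (1 + a*c)."""
    return 1.0 - a / (1.0 + a * c)


def accesses_needed_dynamic(target):
    """Smallest a with 1 / (1 + a) <= target (h = 1, c = 1)."""
    return 1.0 / target - 1.0


def accesses_needed_static(target):
    """Smallest a with e^(-a) <= target (static case, h = 1, c = 1)."""
    return math.log(1.0 / target)


if __name__ == "__main__":
    target = 0.01
    print("dynamic:", accesses_needed_dynamic(target))   # about 99
    print("static :", accesses_needed_static(target))    # about 4.6
```

The dynamic requirement grows like 1/γ, while the static one grows only like ln(1/γ), which is why the gap widens so quickly as the target overflow fraction gets smaller.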

Page 18:

The Lower Bound is Tight

Single choice hashing scheme
• Optimal for a = 1

Multiple choice: try to insert each element greedily until it is either inserted or d trials have been made.
• Optimal for a larger number of memory accesses, depending on system parameters.
• Example: h = 4, c = 1, d = 4. Multiple choice is optimal for a ≤ 2.19.
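A rough simulation of the greedy multiple-choice scheme under alternating deletions and insertions is sketched below. It reports the measured average number of memory accesses per insertion and the fraction of elements sitting in the overflow list. The function names, the modelling of each hash choice as an independent uniform random bucket, and the snapshot-based overflow measurement are assumptions for illustration, not the authors' simulator.

```python
import random


def greedy_insert(load, h, d, rng):
    """Try up to d random buckets; return (bucket or None, number of accesses used)."""
    for tries in range(1, d + 1):
        b = rng.randrange(len(load))      # stand-in for the tries-th hash function
        if load[b] < h:
            load[b] += 1
            return b, tries
    return None, d                        # all d trials failed: element overflows


def simulate(m, h, c, d, rounds=10, seed=0):
    """Return (average accesses per insertion, overflow fraction) at fixed load c."""
    rng = random.Random(seed)
    n = int(c * m * h)
    load = [0] * m
    where = [greedy_insert(load, h, d, rng)[0] for _ in range(n)]
    accesses = 0
    steps = rounds * n
    for _ in range(steps):
        i = rng.randrange(n)              # delete a randomly chosen element
        if where[i] is not None:
            load[where[i]] -= 1
        where[i], tries = greedy_insert(load, h, d, rng)   # insert a new one
        accesses += tries
    overflow = sum(w is None for w in where) / n
    return accesses / steps, overflow


if __name__ == "__main__":
    for d in (1, 4):
        a, gamma = simulate(m=500, h=4, c=1.0, d=d)
        print(f"d={d}: a={a:.2f}, overflow fraction={gamma:.3f}")
```

With d = 1 this reduces to the single-choice scheme (a = 1 exactly); raising d trades extra memory accesses for a lower overflow fraction.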

Page 19:

Main Results

Case Study: Single choice hashing scheme

Lower bound on overflow fraction

Mitigating the degradation in performance.

Page 20:

Moving Back Elements

Recall the example from the beginning

[Figure: buckets 1–4 with finite and infinite bucket sizes, and element “2”]

Element “2” is not stored although its corresponding bucket is empty

Page 21:

Moving Back Elements

• Overflow elements are stored in the CAM.
• Moving back elements from the CAM to the buckets:
  • We cannot check every element in the CAM upon a deletion.
  • Store the hash values along with the elements in the CAM.
  • Upon departure, check if an element can be moved back.
• Can be combined with any hashing insertion scheme.
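A minimal sketch of the move-back idea for the single-choice scheme follows. The `MoveBackTable` class and its fields are hypothetical names chosen here; the CAM is modelled as a dictionary keyed by the stored hash value, which stands in for a content-addressable lookup, so a deletion only inspects the overflow elements that hash to the freed bucket.

```python
from collections import defaultdict


class MoveBackTable:
    """Single-choice hash table with a CAM that stores hash values (illustrative)."""

    def __init__(self, m, h, hash_fn):
        self.m, self.h = m, h
        self.hash_fn = hash_fn                      # hypothetical: key -> integer
        self.buckets = [set() for _ in range(m)]    # each bucket holds at most h keys
        self.cam = defaultdict(set)                 # hash value -> overflowed keys

    def insert(self, key):
        b = self.hash_fn(key) % self.m
        if len(self.buckets[b]) < self.h:
            self.buckets[b].add(key)
        else:
            self.cam[b].add(key)                    # keep the hash value with the key

    def delete(self, key):
        b = self.hash_fn(key) % self.m
        if key in self.buckets[b]:
            self.buckets[b].remove(key)
            if self.cam[b]:                         # an overflow element hashes here
                self.buckets[b].add(self.cam[b].pop())   # move it back
        else:
            self.cam[b].discard(key)


# The example from the beginning, with H(1) = H(2) = 3 and h = 1 (zero-based buckets).
table = MoveBackTable(m=4, h=1, hash_fn=lambda key: 3)
table.insert(1)
table.insert(2)      # bucket 3 is full, so "2" goes to the CAM
table.delete(1)      # frees bucket 3, and "2" is moved back
```

After every operation, a bucket is full whenever the CAM still holds an element that hashes to it, so the table ends in the same state as inserting only the surviving elements.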

Page 22:

Evaluation

Single choice hashing scheme
• Performance is exactly as in the static case.

Multiple choice hashing scheme
• Performance is better than the static case, albeit with more memory accesses.

[h=4, d=1]

Page 23:

Wrap-up

Initial simulation results show a degradation in performance.

We found lower and upper bounds on the achievable overflow fraction.

We compared them with upper bounds of the static case.

Mitigating the degradation in performance.

Also in the paper:
• Simulations with synthetic data
• Other dynamic models
• Trace-driven simulations

Page 24:

Thank you.