Advanced Hash Algorithms with Key Bits Duplication for IP Address Lookup
Authors: Christopher Martinez and Wei-Ming Lin
Publisher: 2009 Fifth International Conference on Networking and Services
Presenter: Han-Chen Chen
Date: 2009/10/7
Outline
Introduction
Some Non-Duplication XOR Folding Algorithms
Bit-Duplication XOR Hashing Approaches
Minimal IDC Duplication
Performance
Introduction
Preprocess and improve XOR hashing algorithms.
Reduce MSL (Maximum Search Length) and ASL (Average Search Length).
[Figure: a hash function maps each data item to an m-bit index into a bucket table with 2^m entries; data items that map to the same entry collide.]
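MSL & ASL
[Figure: data items pass through the hash function into a table of 8 buckets (0 to 7); the figure carries six search-length labels: 2, 1, 3, 1, 1, 1.]
MSL = 3
ASL = (2+1+3+1+1+1)/6 = 1.5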
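A minimal sketch of how the two metrics could be computed. It assumes the six labels in the figure (2, 1, 3, 1, 1, 1) are the occupancies of six non-empty buckets and that ASL averages over non-empty buckets only, which is what the divisor 6 suggests. The function name `search_lengths` and the placement of the two empty buckets are illustrative, not taken from the slides.

```python
def search_lengths(bucket_sizes):
    """Return (MSL, ASL) for a list of per-bucket entry counts.

    Assumption: ASL is averaged over non-empty buckets only.
    """
    occupied = [n for n in bucket_sizes if n > 0]
    if not occupied:
        return 0, 0.0
    msl = max(occupied)                  # longest chain
    asl = sum(occupied) / len(occupied)  # mean chain length
    return msl, asl


# Eight buckets; which two are empty is an illustrative choice.
buckets = [2, 1, 3, 0, 1, 1, 0, 1]
msl, asl = search_lengths(buckets)
print(msl, asl)  # 3 1.5
```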
XOR Hashing (Group XOR)
Random XORing process (m=3)

DB Entry   Bit Position
           7 6 5 4 3 2 1 0
#0         1 0 1 0 1 0 1 0
#1         0 0 1 0 0 1 1 1
#2         0 1 1 0 0 1 0 0
#3         0 0 1 0 0 1 0 1
#4         0 0 1 0 0 1 0 1
#5         1 0 1 1 0 1 0 1
#6         0 1 0 1 1 1 0 0
#7         0 1 1 0 0 0 1 1

[Figure: the eight bit positions are randomly assigned to three XOR groups, producing hash bits A, B, and C.]
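The random XORing step can be sketched as follows, using the eight 8-bit DB entries from this slide. The slide does not say how bits are assigned to groups, so the sketch assumes a shuffled round-robin partition; `make_groups` and `group_xor_hash` are hypothetical helper names.

```python
import random

DB = [0b10101010, 0b00100111, 0b01100100, 0b00100101,
      0b00100101, 0b10110101, 0b01011100, 0b01100011]


def make_groups(n_bits, m, seed=0):
    """Randomly partition bit positions 0..n_bits-1 into m groups.

    Assumption: positions are shuffled, then dealt round-robin, so
    group sizes differ by at most one.
    """
    rng = random.Random(seed)
    positions = list(range(n_bits))
    rng.shuffle(positions)
    return [positions[i::m] for i in range(m)]


def group_xor_hash(key, groups):
    """XOR the key bits inside each group; one hash bit per group."""
    index = 0
    for group in groups:
        bit = 0
        for pos in group:
            bit ^= (key >> pos) & 1
        index = (index << 1) | bit  # first group becomes the top bit
    return index


groups = make_groups(n_bits=8, m=3)
for number, entry in enumerate(DB):
    print(f"#{number} {entry:08b} -> bucket {group_xor_hash(entry, groups)}")
```

Entries #3 and #4 are identical, so they land in the same bucket under any grouping.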
Non-Duplication XOR Hashing (In-order XOR Hashing)
Sort the bit positions by d value.

Before sorting:
DB Entry   Bit Position
           7 6 5 4 3 2 1 0
#0         1 0 1 0 1 0 1 0
#1         0 0 1 0 0 1 1 1
#2         0 1 1 0 0 1 0 0
#3         0 0 1 0 0 1 0 1
#4         0 0 1 0 0 1 0 1
#5         1 0 1 1 0 1 0 1
#6         0 1 0 1 1 1 0 0
#7         0 1 1 0 0 0 1 1
d =        4 2 6 4 4 4 2 2

After sorting:
DB Entry   Bit Position
           6 1 0 7 4 3 2 5
#0         0 1 0 1 0 1 0 1
#1         0 1 1 0 0 0 1 1
#2         1 0 0 0 0 0 1 1
#3         0 0 1 0 0 0 1 1
#4         0 0 1 0 0 0 1 1
#5         0 0 1 1 1 0 1 1
#6         1 0 0 0 1 1 1 0
#7         1 1 1 0 0 0 0 1
d =        2 2 2 4 4 4 4 6

[Figure: the sorted bit positions feed three XOR gates, producing hash bits A, B, and C.]
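The slide lists the d values without defining d. The sketch below assumes d is the absolute difference between the number of 0s and the number of 1s in a bit column; under that assumption it reproduces both the d row and the sorted bit order shown on the slide. The helper name `d_value` is illustrative.

```python
DB = [0b10101010, 0b00100111, 0b01100100, 0b00100101,
      0b00100101, 0b10110101, 0b01011100, 0b01100011]
N_BITS = 8


def d_value(entries, pos):
    """|#zeros - #ones| in bit column `pos` (assumed definition of d)."""
    ones = sum((entry >> pos) & 1 for entry in entries)
    zeros = len(entries) - ones
    return abs(zeros - ones)


positions = list(range(N_BITS - 1, -1, -1))  # 7, 6, ..., 0 as on the slide
d = {pos: d_value(DB, pos) for pos in positions}

# sorted() is stable, so ties keep their original left-to-right order.
order = sorted(positions, key=lambda pos: d[pos])

print([d[pos] for pos in positions])  # [4, 2, 6, 4, 4, 4, 2, 2]
print(order)                          # [6, 1, 0, 7, 4, 3, 2, 5]
```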
Three Simple Bit-Duplication XOR Hashing Approaches
Self-Duplication
Exchange-Duplication
Cycle-Duplication
Self-Duplication
Bucket size: m = 4
Key bits in order of d value (small to big): A B C D W X Y Z
XOR rows:
A B C D
W X Y Z
A A A A
Deficiency: Nullification (A ⊕ A = 0)
Exchange-Duplication
Bucket size: m = 4
Key bits in order of d value (small to big): A B C D W X Y Z
XOR rows:
A B C D
W X Y Z
B A A A
Deficiency: Downgrade (A ⊕ B = B ⊕ A)
Cycle-Duplication
Bucket size: m = 4
Key bits in order of d value (small to big): A B C D W X Y Z
XOR rows:
A B C D
W X Y Z
D A B C
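The three duplication approaches can be compared with a small sketch. It assumes each hash bit is the XOR of one column across the rows shown on the slides (A B C D, W X Y Z, and the duplicated row); this is a reading of the figures, not a confirmed implementation. `fold_xor` and `duplicated_hash` are hypothetical names.

```python
def fold_xor(rows):
    """XOR the rows column by column; each row is a list of m bits."""
    out = [0] * len(rows[0])
    for row in rows:
        for i, bit in enumerate(row):
            out[i] ^= bit
    return out


def duplicated_hash(key_bits, scheme):
    """key_bits: 2*m bits ordered by d value, small to big."""
    m = len(key_bits) // 2
    first, second = key_bits[:m], key_bits[m:]
    if scheme == "self":
        dup = [first[0]] * m                     # A A A A
    elif scheme == "exchange":
        dup = [first[1]] + [first[0]] * (m - 1)  # B A A A
    elif scheme == "cycle":
        dup = [first[-1]] + first[:-1]           # D A B C
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return fold_xor([first, second, dup])


key = [1, 0, 0, 0, 0, 0, 0, 0]  # A = 1, every other bit 0
print(duplicated_hash(key, "self"))      # [0, 1, 1, 1]
print(duplicated_hash(key, "exchange"))  # [1, 1, 1, 1]
print(duplicated_hash(key, "cycle"))     # [1, 1, 0, 0]
```

Under self-duplication the first hash bit no longer depends on A, which is the nullification shown on the slide. Under exchange-duplication the first two hash bits both contain A ⊕ B, so their XOR depends only on W and X.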
MSL & ASL on Randomly Generated Data Sets
[Figure: two plots, Maximum Search Length and Average Search Length, each plotting length against m.]
Self-duplication reduces both MSL and ASL by 15%. Cycle-duplication reduces MSL by 50% and ASL by 27%.
Minimal IDC Duplication (1/4)
IDC: Induced Duplication Correlation
Bucket size: m = 4
Duplication times: X = 2
[Figure: three example layouts over bit0 to bit3, each stacking the row A B C D with cyclically shifted copies chosen from D A B C, C D A B, and B C D A.]
Given m, how many times can the key bits be duplicated without causing the downgrading problem? Or, in order to duplicate X times without the downgrading problem, what is the minimal m required?
Minimal IDC Duplication (2/4)
m = 7, X = 2
bit0 bit1 bit2 bit3 bit4 bit5 bit6
A    B    C    D    E    F    G
G    A    B    C    D    E    F
E    F    G    A    B    C    D
[Figure: bit A appears in three columns alongside the other bits X1 to X6.]
Proof: $m \ge (X+1)^2 - (X+1) + 1 = X^2 + X + 1$, that is, $m \ge X(X+1) + 1$.
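One way to read the bound is as a counting argument: the original row plus X shifted copies give (X+1)·X ordered pairs of shifts, and if every pairwise shift difference must be distinct modulo m, then m − 1 ≥ X(X+1). The brute-force sketch below tests that reading for small X. The distinct-difference criterion and the function names are assumptions for illustration; the search only checks X = 1, 2, 3 and makes no claim about larger X.

```python
from itertools import combinations


def bound(x):
    """Lower bound on m from the slide: X * (X + 1) + 1."""
    return x * (x + 1) + 1


def differences_distinct(shifts, m):
    """True if all ordered differences (s - t) mod m, s != t, are distinct."""
    seen = set()
    for s in shifts:
        for t in shifts:
            if s == t:
                continue
            diff = (s - t) % m
            if diff in seen:
                return False
            seen.add(diff)
    return True


def smallest_m(x):
    """Smallest m admitting x shifted copies, plus one valid shift set."""
    m = x + 1
    while True:
        for rest in combinations(range(1, m), x):
            shifts = (0,) + rest
            if differences_distinct(shifts, m):
                return m, shifts
        m += 1


for x in (1, 2, 3):
    print(x, bound(x), smallest_m(x))
# 1 3 (3, (0, 1))
# 2 7 (7, (0, 1, 3))
# 3 13 (13, (0, 1, 3, 9))
```

For X = 2 the search returns shifts 0, 1, and 3 with m = 7, which matches the three rows shown on this slide.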
Minimal IDC Duplication (3/4)
$D_{ij} = \min\big((s_i - s_j) \bmod m,\ (s_j - s_i) \bmod m\big)$
IDC is avoided when $D_{ij} \ne D_{kl}$, $\forall\, i, j, k, l$, $0 \le i, j, k, l \le m$, and $(i, j) \ne (k, l)$.
m = 7, X = 2
bit0 bit1 bit2 bit3 bit4 bit5 bit6
A    B    C    D    E    F    G
G    A    B    C    D    E    F
F    G    A    B    C    D    E
Minimal IDC Duplication (4/4)
m = 13, X = 3: $13 \ge 3 \cdot (3 + 1) + 1$, ok!
$D_{01} = \min((0-1) \bmod 13,\ (1-0) \bmod 13) = 1$
$D_{13} = \min((1-3) \bmod 13,\ (3-1) \bmod 13) = 2$
$D_{03} = \min((0-3) \bmod 13,\ (3-0) \bmod 13) = 3$
$D_{09} = \min((0-9) \bmod 13,\ (9-0) \bmod 13) = 4$
$D_{19} = \min((1-9) \bmod 13,\ (9-1) \bmod 13) = 5$
$D_{39} = \min((3-9) \bmod 13,\ (9-3) \bmod 13) = 6$
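The distance check can be sketched directly from the definition of D on slide (3/4). The shift sets below are read off the rows shown on the slides: shifts 0, 1, 3 for slide (2/4), shifts 0, 1, 2 for slide (3/4), and shifts 0, 1, 3, 9 for the m = 13 example. The function names are illustrative.

```python
from itertools import combinations


def circular_distance(si, sj, m):
    """D_ij = min((si - sj) mod m, (sj - si) mod m)."""
    return min((si - sj) % m, (sj - si) % m)


def distances(shifts, m):
    """Map each unordered pair of shifts to its circular distance."""
    return {(a, b): circular_distance(a, b, m)
            for a, b in combinations(shifts, 2)}


def all_distinct(shifts, m):
    """True if no two pairs of shifts share the same distance."""
    values = list(distances(shifts, m).values())
    return len(values) == len(set(values))


print(distances((0, 1, 3, 9), 13))
# {(0, 1): 1, (0, 3): 3, (0, 9): 4, (1, 3): 2, (1, 9): 5, (3, 9): 6}
print(all_distinct((0, 1, 3), 7))  # True
print(all_distinct((0, 1, 2), 7))  # False, since D01 = D12 = 1
```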
Performance
MSL & ASL on Randomly Generated Data Sets
[Figure: two plots of search length against m.]
Thank you for listening