CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees
-
Upload
halla-morin -
Category
Documents
-
view
34 -
download
0
description
Transcript of CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees
![Page 1: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/1.jpg)
CSE 326: Data StructuresLecture #13
Extendible Hashing and Splay Trees
Alon Halevy
Spring Quarter 2001
![Page 2: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/2.jpg)
Extendible Hashing
• Hashing technique for huge data sets– optimizes to reduce disk accesses
– each hash bucket fits on one disk block
– better than B-Trees if order is not important
• Table contains– buckets, each fitting in one disk block, with the data
– a directory that fits in one disk block used to hash to the correct bucket
![Page 3: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/3.jpg)
001 010 011 110 111 101
Extendible Hash Table• Directory contains entries labeled by k bits plus a
pointer to the bucket with all keys starting with its bits• Each block contains keys+data matching on the first
j k bits
000 100
(2)00001000110010000110
(2)010010101101100
(3)1000110011
(3)101011011010111
(2)11001110111110011110
directory for k = 3
![Page 4: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/4.jpg)
Inserting (easy case)
001 010 011 110 111 101000 100
(2)00001000110010000110
(2)010010101101100
(3)1000110011
(3)101011011010111
(2)11001110111110011110
insert(11011)001 010 011 110 111 101000 100
(2)00001000110010000110
(2)010010101101100
(3)1000110011
(3)101011011010111
(2)110011110011110
![Page 5: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/5.jpg)
Splitting
001 010 011 110 111 101000 100
(2)00001000110010000110
(2)010010101101100
(3)1000110011
(3)101011011010111
(2)11001110111110011110
insert(11000)
001 010 011 110 111 101000 100
(2)00001000110010000110
(2)010010101101100
(3)1000110011
(3)101011011010111
(3)110001100111011
(3)1110011110
![Page 6: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/6.jpg)
Rehashing
01 10 1100
(2)01101
(2)10000100011001110111
(2)1100111110
insert(10010)
001 010 011 110 111 101000 100
No room to insert and no adoption!
Expand directory
Now, it’s just a normal split.
![Page 7: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/7.jpg)
Rehash of Hashing• Hashing is a great data structure for storing unordered data
that supports insert, delete, and find• Both separate chaining (open) and open addressing
(closed) hashing are useful– separate chaining flexible– closed hashing uses less storage, but performs badly with load
factors near 1– extendible hashing for very large disk-based data
• Hashing pros and cons+ very fast+ simple to implement, supports insert, delete, find- lazy deletion necessary in open addressing, can waste storage- does not support operations dependent on order: min, max, range
![Page 8: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/8.jpg)
Recall: AVL Tree Dictionary Data Structure
4
121062
115
8
14137 9
• Binary search tree properties– binary tree property
– search tree property
• Balance property– balance of every node is:
-1 b 1– result:
• depth is (log n)
15
![Page 9: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/9.jpg)
Splay Trees
“blind” rebalancing – no height info kept• amortized time for all operations is O(log n)• worst case time is O(n)• insert/find always rotates node to the root!
– Good locality – most common keys move high in tree
![Page 10: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/10.jpg)
Idea
17
10
92
5
3
You’re forced to make a really deep access:
Since you’re down there anyway,fix up a lot of deep nodes!
![Page 11: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/11.jpg)
Splay Operations: Find
• Find(x)1. do a normal BST search to find n such that
n->key = x
2. move n to root by series of zig-zag and zig-zig rotations, followed by a final zig if necessary
![Page 12: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/12.jpg)
Zig-Zag*
g
Xp
Y
n
Z
W
*This is just a double rotation
n
Y
g
W
p
ZX
Helped
Unchanged
Hurt
![Page 13: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/13.jpg)
Zig-Zig
n
Z
Y
p
X
g
W
g
W
X
p
Y
n
Z
![Page 14: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/14.jpg)
Zig
p
X
n
Y
Z
n
Z
p
Y
X
root root
![Page 15: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/15.jpg)
Why Splaying Helps• Node n and its children are always helped (raised)• Except for final zig, nodes that are hurt by a zig-
zag or zig-zig are later helped by a rotation higher up the tree!
• Result: – shallow (zig) nodes may increase depth by one or two– helped nodes may decrease depth by a large amount
• If a node n on the access path is at depth d before the splay, it’s at about depth d/2 after the splay– Exceptions are the root, the child of the root, and the
node splayed
![Page 16: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/16.jpg)
Locality
• Assume m n access in a tree of size n– Total amortized time O(m log n)
– O(log n) per access on average
• Gets better when you only access k distinct items in the m accesses.– Exercise.
![Page 17: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/17.jpg)
Splaying Example
2
1
3
4
5
6
Find(6)
2
1
3
6
5
4
zig-zig
![Page 18: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/18.jpg)
Still Splaying 6
zig-zig2
1
3
6
5
4
1
6
3
2 5
4
![Page 19: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/19.jpg)
Almost There, Stay on Target
zig
1
6
3
2 5
4
6
1
3
2 5
4
![Page 20: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/20.jpg)
Splay Again
Find(4)
zig-zag
6
1
3
2 5
4
6
1
4
3 5
2
![Page 21: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/21.jpg)
Example Splayed Out
zig-zag
6
1
4
3 5
2
61
4
3 5
2
![Page 22: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/22.jpg)
Splay Tree Summary
• All operations are in amortized O(log n) time• Splaying can be done top-down; better because:
– only one pass– no recursion or parent pointers necessary
• Invented by Sleator and Tarjan (1985), now widely used in place of AVL trees
• Splay trees are very effective search trees– relatively simple– no extra fields required– excellent locality properties: frequently accessed keys are cheap to
find
![Page 23: CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees](https://reader036.fdocuments.net/reader036/viewer/2022062517/568137d1550346895d9f6fd8/html5/thumbnails/23.jpg)
Coming Up
• Project 3: implement a “smart” web server.
• Heaps!
• Disjoint Sets
• Graphs and more.