Binary tree
CPS216: Data-intensive Computing Systems
Operators for Data Access (contd.)
Shivnath Babu
Insertion: Primitives
  Inserting into a leaf node
  Splitting a leaf node
  Splitting an internal node
  Splitting root node
Splitting an Internal Node
[Figure, three panels: key 59 enters an internal node holding 40, 54, 66, 74, 84, whose children include the ranges [54, 59), [59, 66) and [66, 74). The node splits: 40, 54, 59 stay on the left, 74, 84 go to the right, and 66 moves up into the parent (which holds 21 and 99), so the two halves cover [21, 66) and [66, 99).]
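
The split of an internal node can be sketched in a few lines. This is a minimal illustration, not the course's implementation: the `InternalNode` class, the `split_internal` name, and the choice of the middle key as the separator are assumptions made for the example. The point it shows is that the separator key moves up to the parent rather than being copied into either half.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class InternalNode:
    # Hypothetical node layout: len(children) == len(keys) + 1.
    keys: List[int] = field(default_factory=list)
    children: List[object] = field(default_factory=list)


def split_internal(node: InternalNode) -> Tuple[InternalNode, int, InternalNode]:
    """Split an overfull internal node into (left, separator, right).

    The separator is removed from the node and handed to the caller,
    which is expected to insert it into the parent.
    """
    mid = len(node.keys) // 2          # assumed convention: middle key moves up
    separator = node.keys[mid]
    left = InternalNode(node.keys[:mid], node.children[:mid + 1])
    right = InternalNode(node.keys[mid + 1:], node.children[mid + 1:])
    return left, separator, right


# Keys from the figure after 59 has been added; child names are placeholders.
overfull = InternalNode([40, 54, 59, 66, 74, 84],
                        ["c0", "c1", "c2", "c3", "c4", "c5", "c6"])
left, separator, right = split_internal(overfull)
print(left.keys, separator, right.keys)
```

With the keys from the figure, the left half keeps 40, 54, 59, the right half keeps 74, 84, and 66 is returned for insertion into the parent.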
Deletion: Primitives
  Delete key from a leaf
  Redistribute keys between sibling leaves
  Merge a leaf into its sibling
  Redistribute keys between two sibling internal nodes
  Merge an internal node into its sibling
Merge Internal Node into Sibling
[Figure, two panels. First panel: keys 41, 48, 52, 63, 74 with key 59 above the child ranges [52, 59) and [59, 63). Second panel: 74 is gone and 59 is shown together with 41, 48, 52, 63.]
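
A merge of two sibling internal nodes can be sketched as follows. This is an illustration under assumptions: the `Node` class and `merge_internal` function are hypothetical, and the example layout (left sibling 41, 48, 52; right sibling left with only 63; separator 59 in the parent) is one reading of the figure, not something the slide states. The sketch shows the key step: the separator is pulled down from the parent and placed between the two siblings' keys.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical internal node: len(children) == len(keys) + 1.
    keys: List[int] = field(default_factory=list)
    children: List[object] = field(default_factory=list)


def merge_internal(left: Node, separator: int, right: Node) -> Node:
    """Merge `right` into `left`, pulling `separator` down from the parent.

    The caller is responsible for removing `separator` and the pointer
    to `right` from the parent afterwards.
    """
    return Node(left.keys + [separator] + right.keys,
                left.children + right.children)


# Assumed layout for illustration; child names are placeholders.
left = Node([41, 48, 52], ["a", "b", "c", "d"])
right = Node([63], ["e", "f"])
merged = merge_internal(left, 59, right)
print(merged.keys)
```

Because the parent loses a key in the process, the parent itself may underflow, which is why merging can propagate upward.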
B-Tree Roadmap
  B-Tree
    Recap
    Insertion (recap)
    Deletion
    Construction
    Efficiency
  B-Tree variants
  Hash-based Indexes
B-Tree Construction
  Input keys (unsorted): 75 97 21 41 57 15 11 13 48 34 62 81
  Scan of the sorted keys: 11 13 15 21 34 41 48 57 62 75 81 97
B-Tree Construction
  Why is sort-based construction better than insertion-based construction?
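
To make the question concrete, here is a minimal sketch of sort-based (bottom-up) construction. The function name `bulk_load`, the dictionary node layout, and the parameters `leaf_capacity` and `fanout` are assumptions for illustration. One commonly cited advantage it makes visible: each node is filled once during a single scan of the sorted keys, whereas insertion-based construction walks from the root to a leaf for every key.

```python
from typing import Dict, List, Tuple


def bulk_load(keys: List[int], leaf_capacity: int, fanout: int) -> Tuple[Dict, int]:
    """Build a tree bottom-up from keys; return (root, height).

    Simplified sketch: the last node on a level may be underfull,
    which a real implementation would rebalance.
    """
    if not keys:
        raise ValueError("need at least one key")
    if leaf_capacity < 1 or fanout < 2:
        raise ValueError("leaf_capacity must be >= 1 and fanout >= 2")

    ordered = sorted(keys)
    # Leaf level: pack consecutive sorted keys into leaves.
    level = [{"keys": ordered[i:i + leaf_capacity], "children": []}
             for i in range(0, len(ordered), leaf_capacity)]
    mins = [leaf["keys"][0] for leaf in level]  # smallest key under each node
    height = 1

    # Build each upper level from groups of `fanout` nodes below it.
    while len(level) > 1:
        next_level, next_mins = [], []
        for i in range(0, len(level), fanout):
            group, group_mins = level[i:i + fanout], mins[i:i + fanout]
            # Separator keys: smallest key of every child except the first.
            next_level.append({"keys": group_mins[1:], "children": group})
            next_mins.append(group_mins[0])
        level, mins = next_level, next_mins
        height += 1
    return level[0], height


root, height = bulk_load([75, 97, 21, 41, 57, 15, 11, 13, 48, 34, 62, 81],
                         leaf_capacity=3, fanout=4)
print(root["keys"], height)
```

With the twelve keys from the slide, three keys per leaf, and a fanout of four, the sketch produces four full leaves under a single root.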
Cost of B-Tree Operations
  Height of B-Tree: H
  Assume no duplicates
  Question: what is the random I/O cost of:
    Insertion:
    Deletion:
    Equality search:
    Range search:
Height of B-Tree
  Number of keys: N
  B-Tree parameter: n
  $\text{Height} \approx \log_n N = \frac{\log N}{\log n}$
  In practice: 2-3 levels
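
The height estimate is easy to check numerically. The sketch below treats n as the fanout and returns the smallest h with n^h >= N, which is the ceiling of log_n N; the function name and the sample values of N and n are assumptions chosen for illustration. Integer arithmetic is used instead of `math.log` to avoid floating-point rounding at exact powers.

```python
def estimated_height(num_keys: int, n: int) -> int:
    """Smallest h >= 1 such that n**h >= num_keys (roughly ceil(log_n N))."""
    if num_keys < 1 or n < 2:
        raise ValueError("need num_keys >= 1 and n >= 2")
    height, capacity = 0, 1
    while capacity < num_keys:
        capacity *= n      # one more level multiplies reachable keys by n
        height += 1
    return max(height, 1)  # a non-empty tree has at least one level


for num_keys, n in [(10_000, 100), (1_000_000, 100), (1_000_000_000, 1000)]:
    print(num_keys, n, estimated_height(num_keys, n))
```

For these sample sizes the estimate stays at two or three levels, in line with the "in practice" remark on the slide.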
Question: How do you pick parameter n?
  1. Ignore inserts and deletes
  2. Optimize for equality searches
  3. Assume no duplicates
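
The slide leaves this as an open question. One common textbook approach, offered here as an assumption rather than as the lecture's answer, is to make a node exactly one disk block: an equality search then costs one block read per level, so the largest n that fits in a block gives the shortest tree. The block, key, and pointer sizes below are illustrative values, and `max_keys_per_node` is a hypothetical helper.

```python
def max_keys_per_node(block_size: int, key_size: int, pointer_size: int) -> int:
    """Largest n with n keys and n + 1 pointers fitting in one block.

    Solves n * key_size + (n + 1) * pointer_size <= block_size for n.
    """
    if block_size < key_size + 2 * pointer_size:
        raise ValueError("block too small for even one key and two pointers")
    return (block_size - pointer_size) // (key_size + pointer_size)


# Assumed sizes: 4096-byte blocks, 4-byte keys, 8-byte pointers.
n = max_keys_per_node(block_size=4096, key_size=4, pointer_size=8)
print(n)  # 340
```

With these assumed sizes, 340 keys and 341 pointers occupy 4088 bytes, and one more key would no longer fit in the block.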
Roadmap
  B-Tree
  B-Tree variants
    Sparse Index
    Duplicate Keys
  Hash-based Indexes
Roadmap
  B-Tree
  B-Tree variants
  Hash-based Indexes
    Static Hash Table
    Extensible Hash Table
    Linear Hash Table
Hash-Based Indexes
  Adaptations of main memory hash tables
  Support equality searches
  No range searches
Main Memory Hash Table
  h(key) = key % 8
  [Figure: an array of 8 buckets, each pointing to a chain of keys]
  bucket 0: 32, 48
  bucket 1: (null)
  bucket 2: 10
  bucket 3: 27, 75
  bucket 4: (null)
  bucket 5: 21
  bucket 6: (null)
  bucket 7: 55
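
The figure can be reproduced with a small chained hash table. This is a minimal sketch: the class name `ChainedHashTable` is hypothetical, and Python lists stand in for the linked lists a main-memory implementation would typically use. Only the hash function h(key) = key % 8 and the key values come from the slide.

```python
from typing import List


class ChainedHashTable:
    """Hash table with one chain of keys per bucket."""

    def __init__(self, num_buckets: int = 8) -> None:
        self.buckets: List[List[int]] = [[] for _ in range(num_buckets)]

    def h(self, key: int) -> int:
        # Hash function from the slide when num_buckets == 8.
        return key % len(self.buckets)

    def insert(self, key: int) -> None:
        chain = self.buckets[self.h(key)]
        if key not in chain:       # no duplicate keys
            chain.append(key)

    def contains(self, key: int) -> bool:
        # Equality search: only one chain needs to be examined.
        return key in self.buckets[self.h(key)]


table = ChainedHashTable()
for key in [32, 10, 48, 27, 75, 21, 55]:
    table.insert(key)
for bucket_id, chain in enumerate(table.buckets):
    print(bucket_id, chain)
```

An equality search touches a single chain, while a range search would have to visit every bucket, because hashing does not preserve key order.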
Adapting to disk
  1 Hash Bucket = 1 Block
    All keys that hash to bucket stored in the block
    Intuition: keys in a bucket usually accessed together
    No need for linked lists of keys …
    … but need linked list of blocks (overflow blocks)
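
The bucket-per-block layout with overflow blocks can be simulated in memory. This is a sketch under assumptions: the `Block` and `StaticHashFile` classes are hypothetical, block capacity is counted in keys rather than bytes, and `blocks_read` merely counts the blocks a lookup would touch as a stand-in for disk I/O.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Block:
    capacity: int                         # max keys per block (assumed unit)
    keys: List[int] = field(default_factory=list)
    overflow: Optional["Block"] = None    # next block in the bucket's chain


class StaticHashFile:
    """Fixed number of buckets; each bucket is a chain of blocks."""

    def __init__(self, num_buckets: int, block_capacity: int) -> None:
        self.block_capacity = block_capacity
        self.buckets = [Block(block_capacity) for _ in range(num_buckets)]

    def insert(self, key: int) -> None:
        block = self.buckets[key % len(self.buckets)]
        # Walk to the first block with free space, adding an overflow
        # block at the end of the chain if every block is full.
        while len(block.keys) >= block.capacity:
            if block.overflow is None:
                block.overflow = Block(self.block_capacity)
            block = block.overflow
        block.keys.append(key)

    def lookup(self, key: int) -> Tuple[bool, int]:
        """Return (found, blocks_read)."""
        block: Optional[Block] = self.buckets[key % len(self.buckets)]
        blocks_read = 0
        while block is not None:
            blocks_read += 1
            if key in block.keys:
                return True, blocks_read
            block = block.overflow
        return False, blocks_read


hash_file = StaticHashFile(num_buckets=4, block_capacity=2)
for key in [0, 4, 8, 12, 16]:   # all hash to bucket 0
    hash_file.insert(key)
print(hash_file.lookup(0), hash_file.lookup(16), hash_file.lookup(1))
```

Within a block the keys are simply scanned, so no per-key linked list is needed; the cost of a lookup grows only with the length of the overflow chain.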
Adapting to disk
  Bucket Id to Disk Address mapping
    Contiguous blocks
    Store mapping in main memory
      Too large?
      Dynamic: Linear and Extensible hash tables
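
The two mapping options on this slide can be contrasted in a short sketch. All names and numbers here are assumptions for illustration: `contiguous_address` computes an address arithmetically when the bucket blocks are laid out back to back, while `directory_address` looks the address up in an in-memory list, which costs memory proportional to the number of buckets.

```python
from typing import List


def contiguous_address(bucket_id: int, base_address: int, block_size: int) -> int:
    """Address of a bucket's block when buckets occupy contiguous blocks."""
    return base_address + bucket_id * block_size


def directory_address(bucket_id: int, directory: List[int]) -> int:
    """Address of a bucket's block via an in-memory mapping table."""
    return directory[bucket_id]


# Assumed layout: blocks of 4096 bytes starting at byte offset 1_000_000.
print(contiguous_address(3, base_address=1_000_000, block_size=4096))

# Assumed directory: blocks may live anywhere on disk.
directory = [81_920, 4_096, 1_048_576, 20_480]
print(directory_address(2, directory))
```

The arithmetic mapping needs no memory but fixes the number of buckets in advance; the directory allows blocks to be placed anywhere at the cost of keeping the table in memory.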