Post on 03-Oct-2015
description
B+-TreesAdapted from Mike Franklin
Example Tree IndexIndex entries: they direct search for data entries in leaves.Example where each node can hold 2 entries;
ISAMIndexed Sequential Access MethodSimilar to what we discussed in the last class10*15*20*27*33*37*40*46*51*55*63*203351634097*RootOverflowPagesLeafPagesPrimary41*IndexPages
Example B+ TreeSearch begins at root, and key comparisons direct it to a leaf.Search for 5*, 15*, all data entries >= 24* ... Based on the search for 15*, we know it is not in the tree!
B+ Tree - PropertiesBalancedEvery node except root must be at least full.Order: the minimum number of keys/pointers in a non-leaf nodeFanout of a node: the number of pointers out of the node
B+ Trees in PracticeTypical order: 100. Typical fill-factor: 67%.average fanout = 133Typical capacities:Height 3: 1333 = 2,352,637 entriesHeight 4: 1334 = 312,900,700 entriesCan often hold top levels in buffer pool:Level 1 = 1 page = 8 KbytesLevel 2 = 133 pages = 1 MbyteLevel 3 = 17,689 pages = 133 MBytes
B+ Trees: SummarySearching:logd(n) Where d is the order, and n is the number of entriesInsertion:Find the leaf to insert intoIf full, split the node, and adjust index accordinglySimilar cost as searchingDeletionFind the leaf nodeDeleteMay not remain half-full; must adjust the index accordingly
Insert 23*No splitting required.
Insert 8*
Example B+ Tree - Inserting 8* Notice that root was split, leading to increase in height. In this example, we can avoid split by re-distributing entries; however, this is usually not done in practice.
Data vs. Index Page Split (from previous example of inserting 8)Observe how minimum occupancy is guaranteed in both leaf and index pg splits.Note difference between copy-up and push-up; be sure you understand the reasons for this.Data Page SplitIndex Page Split
Delete 19*
Delete 20* ...
Delete 19* and 20* ...Deleting 19* is easy.Deleting 20* is done with re-distribution. Notice how middle key is copied up.Further deleting 24* results in more drastic changes
Delete 24* ...No redistribution fromneighbors possible
Deleting 24*Must merge.Observe `toss of index entry (on right), and `pull down of index entry (below).3022*27*29*33*34*38*39*2*3*7*14*16*22*27*29*33*34*38*39*5*8*Root3013517
Example of Non-leaf Re-distributionTree is shown below during deletion of 24*. (What could be a possible initial tree?)In contrast to previous example, can re-distribute entry from left child of root to right child. Root13517202230
After Re-distributionIntuitively, entries are re-distributed by `pushing through the splitting entry in the parent node.It suffices to re-distribute index entry with key 20; weve re-distributed 17 as well for illustration.14*16*33*34*38*39*22*27*29*17*18*20*21*7*5*8*2*3*Root13517302022
Primary vs Secondary IndexNote: We were assuming the data items were in sorted orderThis is called primary indexSecondary index:Built on an attribute that the file is not sorted on.
A Secondary B+-Tree index14162227291718202175823Root135173020222* 16* 5* 39*
Primary vs Secondary IndexNote: We were assuming the data items were in sorted orderThis is called primary indexSecondary index:Built on an attribute that the file is not sorted on.Can have many different indexes on the same file.
MoreHash-based IndexesStatic HashingExtendible HashingRead on your own.Linear HashingGrid-filesR-Trees etc
6101013121313151316171818