Indexing and B+-Trees By Kenneth Cheung CS 157B TR 07:30-08:45 Professor Lee.

Introduction to Indexing
Goal: make it easier to look up data.
Done by keeping a sorted, condensed version of the data (search-key values plus pointers to the records).
Searching, and finding the place for an insertion, becomes easier.

Factors of Indices
1. Access type
2. Access time
3. Insertion time
4. Deletion time
5. Space overhead

Clustering Index
An index whose search key also defines the sequential order of the file.

Index-sequential files
Files ordered sequentially on a search key.

Index Record
(aka index entry) Holds a search-key value and pointers to the records with that value.

Pointer
Identifies a disk block and an offset within that disk block.

Dense Index
An index record appears for every search-key value in the file. Records with the same search-key value are stored together, in search-key order.
Faster access time, but higher space overhead.

Sparse Index
An index record appears for only some of the search-key values. To find a record, the system finds the index entry with the largest search-key value that is less than or equal to the given search-key value, then starts at the record that entry points to and scans forward through the file until it finds the desired record.
Lower space overhead, but higher access time.
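
The sparse-index lookup rule can be sketched in a few lines of Python. This is only an illustration: the block layout, the sample keys, and the names `blocks`, `sparse_keys`, and `sparse_lookup` are assumptions, not part of the slides.

```python
from bisect import bisect_right

# Hypothetical sorted data file, split into blocks of (search_key, payload) records.
blocks = [
    [(10, "a"), (20, "b"), (30, "c")],
    [(40, "d"), (50, "e"), (60, "f")],
    [(70, "g"), (80, "h"), (90, "i")],
]

# Sparse index: one entry per block, holding the first search key in that block.
sparse_keys = [block[0][0] for block in blocks]


def sparse_lookup(key):
    """Return the payload stored under key, or None if no such record exists."""
    # Largest index entry whose search key is <= the requested key.
    pos = bisect_right(sparse_keys, key) - 1
    if pos < 0:
        return None  # key is smaller than every indexed key
    # Scan forward through the file from the block that entry points to.
    for block in blocks[pos:]:
        for record_key, payload in block:
            if record_key == key:
                return payload
            if record_key > key:
                return None  # passed the place where key would have been
    return None
```

The forward scan is the price of the smaller index: the index only narrows the search to a starting block.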

Larger Databases
Build a sparse (outer) index on top of the clustering (inner) index, giving 2 levels of indices.
A multilevel index needs fewer block reads per lookup than a binary search over one large index.
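
A rough back-of-the-envelope comparison, with made-up numbers: the record count and the entries-per-block figure below are assumptions chosen for illustration, not values from the slides.

```python
import math

# Illustrative numbers only; none of them come from the slides.
records = 1_000_000        # records in the file, one dense index entry each
entries_per_block = 100    # index entries that fit in one disk block

# Blocks occupied by the inner (dense) index.
inner_blocks = math.ceil(records / entries_per_block)

# Binary search over the inner index reads about log2(blocks) blocks.
binary_search_reads = math.ceil(math.log2(inner_blocks))

# Sparse outer index: one entry per block of the inner index.
outer_blocks = math.ceil(inner_blocks / entries_per_block)

print(inner_blocks, binary_search_reads, outer_blocks)
```

With these numbers the outer index is small enough to keep in memory, so a lookup would search it there and then read a single inner-index block from disk, instead of the block reads a binary search would need.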

Index Update (Insertion)
A. Look up the search-key value of the record being inserted.
B. If the index record stores pointers to all records with that search-key value, add a pointer to the new record to the index record.
C. Otherwise, the index record stores a pointer to only the first record with that search-key value, and the new record is placed after the other records with the same value.

Index Update (Insertion into Sparse Indices)
For sparse indices, if the system creates a new block, it must add the first search-key value in the new block to the index.
If the new record has the least search-key value in its block, the index record pointing to that block is updated; otherwise the index is left unchanged.
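
A minimal sketch of sparse-index maintenance on insertion, assuming one index entry per block. The block capacity, the split-in-half policy, and the name `insert_key` are assumptions made for the example.

```python
from bisect import bisect_right, insort

BLOCK_CAPACITY = 3  # assumed number of records per block


def insert_key(blocks, index, key):
    """Insert key into a sorted, blocked file and keep the sparse index in step.

    blocks: list of sorted lists of keys (the data file)
    index:  list holding the first key of each block (one entry per block)
    """
    if not blocks:
        blocks.append([key])
        index.append(key)
        return
    # Block whose index entry is the largest one <= key (or the first block).
    pos = max(bisect_right(index, key) - 1, 0)
    block = blocks[pos]
    insort(block, key)
    if len(block) > BLOCK_CAPACITY:
        # Overflow: move the upper half into a new block ...
        mid = len(block) // 2
        new_block = block[mid:]
        del block[mid:]
        blocks.insert(pos + 1, new_block)
        # ... and add the new block's first search-key value to the index.
        index.insert(pos + 1, new_block[0])
    # If the new record became the least key in its block, this updates the entry;
    # otherwise it rewrites the same value.
    index[pos] = block[0]
```

Inserting a key that lands in the middle of a block with free space leaves the index untouched, which is why sparse indices are cheap to maintain.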

Deletion
A. Look up the record.
B. If the index is dense and the deleted record was the only record with its search-key value, delete that search-key value from the index.
C. If the index record stores pointers to all records with the same search-key value, remove the pointer to the deleted record from the index record.

Deletion (cont’d)
D. If the index record stores a pointer to only the first record with the search-key value and that first record is deleted, the pointer is updated to point to the following record.
E. If the index is sparse and does not contain the search-key value of the deleted record, the index remains the same.

Deletion (cont’d)
F. If the index is sparse and the deleted record was the only record with its search-key value, the system replaces the corresponding index record with an index record for the next search-key value (in search-key order). If the next search-key value already has an index entry, the entry is deleted instead of being replaced.

Deletion (cont’d)
G. If the index record for the search-key value points to the record being deleted, the pointer is updated to point to the next record with the same search-key value.
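
The dense-index cases B and C can be sketched as follows, assuming the index is modelled as a dictionary whose entries store pointers to all records with a given search-key value. The record ids and the name `delete_record` are hypothetical.

```python
def delete_record(dense_index, key, record_id):
    """Remove record_id from a dense index; return True if something was deleted.

    dense_index maps each search-key value to the list of ids of all
    records with that value.
    """
    pointers = dense_index.get(key)
    if pointers is None or record_id not in pointers:
        return False  # nothing to delete
    # Case C: drop the pointer to the deleted record.
    pointers.remove(record_id)
    if not pointers:
        # Case B: it was the only record with this search-key value.
        del dense_index[key]
    return True
```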

Secondary Indices
A. Secondary indices are dense and point to all records.
B. The records they point to are not stored sequentially in the secondary search-key order, and the search key need not be a candidate key.
C. If a file with several indices is updated, then every index must be updated as well.

B+-Trees
An alternative to Binary Search Trees.

Conditions of a B+-Tree
A. Search-key values in a node are K1, K2, ..., Kn-1
B. Pointers in a node are P1, P2, ..., Pn
C. Search-key values are kept in sorted order

Conditions (cont’d)
D. In a leaf, pointer Pi points to a file record with search-key value Ki, or to a bucket of pointers to such records.
E. A node may hold more than 2 pointers, up to n (a binary tree node holds at most 2).
F. Some search-key values are stored redundantly: they appear in non-leaf nodes as well as in leaves.

Buckets
Buckets are used only if the search key does not form a candidate key and the file is not stored in search-key order.

Leaves
A. Each leaf holds up to n-1 search-key values.
B. Pointer Pn of each leaf chains the leaf nodes together in search-key order.
C. Non-leaf nodes form a multilevel sparse index on the leaf nodes.

Leaves (cont’d)
D. A non-leaf node holds between ceil(n/2) and n pointers.
E. The number of pointers in a node is the fan-out of the node.
F. The root is exempt from the lower bound: it may hold fewer than ceil(n/2) pointers, but it must hold at least 2 unless it is the only node in the tree.
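
The occupancy limits can be written down as a small helper. The function name `occupancy_bounds` is an assumption, and the leaf minimum of ceil((n-1)/2) values is the usual textbook convention rather than something stated on the slides.

```python
import math


def occupancy_bounds(n, kind):
    """Return (minimum, maximum) occupancy for a B+-tree node with at most n pointers.

    kind is "leaf" (counts search-key values) or "internal" / "root"
    (both count child pointers). The root bound assumes the root is not
    the only node in the tree.
    """
    if kind == "leaf":
        return math.ceil((n - 1) / 2), n - 1
    if kind == "internal":
        return math.ceil(n / 2), n
    if kind == "root":
        return 2, n
    raise ValueError(f"unknown node kind: {kind}")
```

For example, with n = 4 a leaf holds 2 to 3 values and an internal node holds 2 to 4 pointers.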

Queries for finding V
A. To find search-key value V, start at the root.
B. In the current node, look for the smallest search-key value greater than V.
C. If such a value Ki is found, follow pointer Pi to another node. If no such value exists, follow the last pointer in the node.

Queries (cont’d)
D. The process repeats going down the tree until a leaf node is reached. In the leaf, if some search-key value Ki equals V, pointer Pi leads to the desired record or bucket.
E. If there is no K that equals V at the leaf, then no such record exists.
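
The lookup procedure can be sketched as follows. The `Node` class, its field names, and the tiny three-leaf tree in the usage note are assumptions made for illustration; leaf-to-leaf chain pointers are left out to keep the sketch short.

```python
from bisect import bisect_right


class Node:
    def __init__(self, keys, pointers, is_leaf):
        self.keys = keys          # sorted search-key values
        self.pointers = pointers  # child nodes (non-leaf) or records (leaf)
        self.is_leaf = is_leaf


def find(root, v):
    """Return the record stored under search-key value v, or None."""
    node = root
    while not node.is_leaf:
        # Position of the smallest key greater than v; if there is none,
        # this is len(keys), which selects the last pointer in the node.
        i = bisect_right(node.keys, v)
        node = node.pointers[i]
    # At the leaf: look for a key equal to v.
    for key, record in zip(node.keys, node.pointers):
        if key == v:
            return record
    return None
```

A possible usage, with hypothetical record labels:

```python
leaf1 = Node([5, 10], ["r5", "r10"], True)
leaf2 = Node([20, 30], ["r20", "r30"], True)
leaf3 = Node([40, 50], ["r40", "r50"], True)
root = Node([20, 40], [leaf1, leaf2, leaf3], False)
print(find(root, 30))  # looks in leaf2
```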

B+-tree Insertion
A. First look up the search-key value.
B. If the search-key value already exists in the leaf node, add the new record to the file and, if necessary, add a pointer to it in the bucket.
C. If the search-key value does not exist, insert it into the leaf node in sorted position, insert the new record into the file, and create a new bucket and pointer if necessary.

Insertion (cont’d)
D. If the search-key value is not present and there is no room in the leaf node, split the node.
E. After the split, the first ceil(n/2) search-key values stay in the original leaf and the rest move to the new leaf, so each of the two leaves has a new greatest or least search-key value.
F. After a split, insert an entry for the new node into the parent, and repeat the splitting process up the tree whenever a node becomes too full.
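
A sketch of the leaf-level part of insertion, assuming leaves are plain sorted key lists and the common textbook convention that the first ceil(n/2) values stay in the original leaf. The name `insert_into_leaf` and the return shape are assumptions; propagating the separator into the parent is not shown.

```python
from bisect import insort
import math


def insert_into_leaf(leaf_keys, key, n):
    """Insert key into a sorted leaf that may hold at most n - 1 values.

    Returns (left_keys, right_keys, separator). right_keys and separator
    are None when no split was needed; otherwise separator is the smallest
    key of the new right leaf, which has to be inserted into the parent.
    """
    keys = list(leaf_keys)
    insort(keys, key)
    if len(keys) <= n - 1:
        return keys, None, None
    # Overflow: the leaf now has n values. The first ceil(n/2) stay,
    # the rest move to a new leaf.
    cut = math.ceil(n / 2)
    left, right = keys[:cut], keys[cut:]
    return left, right, right[0]
```

With n = 4, inserting 25 into a full leaf [10, 20, 30] would give leaves [10, 20] and [25, 30], with 25 passed up to the parent.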

B+-Tree Deletion
A. Look up the record and remove it from the file.
B. If no bucket was associated with its search-key value, remove the search-key value from the leaf node.
C. If a bucket was associated with it and the bucket is now empty, remove the search-key value from the leaf node.

Deletion (cont’d)
D. If the deletion leaves a node with too few pointers, transfer its pointers to a sibling node, then delete the now-empty node.
E. If transferring the pointers would give the sibling too many pointers, redistribute the pointers between the two nodes instead. In either case, the parent of the two nodes needs its pointers and search-key values adjusted.
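
The choice between merging and redistributing can be sketched for two neighbouring leaves. This is a simplification under stated assumptions: leaves are plain sorted key lists, redistribution splits the keys evenly rather than moving a single entry, and the name `rebalance_leaves` is hypothetical. Updating the parent is left to the caller.

```python
import math


def rebalance_leaves(left, right, n):
    """Fix an underfull leaf using its adjacent sibling.

    left and right are the sorted key lists of two neighbouring leaves
    (left holds the smaller keys); n is the maximum number of pointers.
    Returns (new_left, new_right, separator). After a merge, new_right and
    separator are None and the parent must drop its entry for the right
    leaf; after a redistribution, the parent must replace the key between
    the two leaves with separator.
    """
    combined = left + right
    if len(combined) <= n - 1:
        # Everything fits in one leaf: merge and delete the right node.
        return combined, None, None
    # Too many keys for one leaf: share them between the two leaves.
    cut = math.ceil(len(combined) / 2)
    new_left, new_right = combined[:cut], combined[cut:]
    return new_left, new_right, new_right[0]
```

With n = 4, leaves [10] and [20, 30] would merge into [10, 20, 30], while [10] and [20, 30, 40] would be redistributed into [10, 20] and [30, 40].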

B+-Tree File Organization
A. Leaf nodes store records instead of pointers to records.
B. Insertion and deletion happen the same way as in a B+-tree index.
C. When inserting, the system adds the record to the block if there is enough space; otherwise it splits the block.
D. Any split will propagate upward if necessary.

Bibliography
Silberschatz, Abraham, Henry F. Korth, and S. Sudarshan. Database System Concepts. 5th Ed. Boston: McGraw Hill, 2002. Ch 12.