Binary tree

70
1 CPS216: Data-intensive CPS216: Data-intensive Computing Systems Computing Systems Operators for Data Operators for Data Access (contd.) Access (contd.) Shivnath Babu Shivnath Babu

description

Binary trees

Transcript of Binary tree

1

CPS216: Data-intensive CPS216: Data-intensive Computing SystemsComputing Systems

Operators for Data Access Operators for Data Access (contd.)(contd.)

Shivnath BabuShivnath Babu

2

Insertion in a B-TreeInsertion in a B-Tree

49

49

n = 2

15 36

Insert: 62

3

Insertion in a B-TreeInsertion in a B-Tree

49

49

n = 2

15 36

Insert: 62

62

4

Insertion in a B-TreeInsertion in a B-Tree

49

49

n = 2

15 36 62

Insert: 50

5

Insertion in a B-TreeInsertion in a B-Tree

49

49

n = 2

15 36 50

Insert: 50

62

62

6

Insertion in a B-TreeInsertion in a B-Tree

49

49

n = 2

15 36 50

Insert: 75

62

62

7

Insertion in a B-TreeInsertion in a B-Tree

49

49

n = 2

15 36 50

Insert: 75

62

62

75

8

InsertionInsertion

9

InsertionInsertion

10

InsertionInsertion

11

InsertionInsertion

12

InsertionInsertion

13

InsertionInsertion

14

InsertionInsertion

15

InsertionInsertion

16

InsertionInsertion

17

InsertionInsertion

18

InsertionInsertion

19

Insertion: PrimitivesInsertion: Primitives

Inserting into a leaf nodeInserting into a leaf node Splitting a leaf nodeSplitting a leaf node Splitting an internal nodeSplitting an internal node Splitting root nodeSplitting root node

20

Inserting into a Leaf Inserting into a Leaf NodeNode

54 57 60 62

58

21

Inserting into a Leaf Inserting into a Leaf NodeNode

54 57 60 62

58

22

Inserting into a Leaf Inserting into a Leaf NodeNode

54 57 60 62

58

58

23

61

54 57 60 6258

54 66

Splitting a Leaf NodeSplitting a Leaf Node

24

61

54 57 60 6258

54 66

Splitting a Leaf NodeSplitting a Leaf Node

25

61

54 57 61 6258

54 66

60

Splitting a Leaf NodeSplitting a Leaf Node

26

61

54 57 61 6258

54 66

60

59

Splitting a Leaf NodeSplitting a Leaf Node

27

61

54 57 61 6258

54 66

60

59

Splitting a Leaf NodeSplitting a Leaf Node

59

54 6640

[ 59, 66)[54, 59)

74 84

9921 ……

[66,74)

Splitting an Internal NodeSplitting an Internal Node

59

54 6640 74 84

9921 ……

[ 59, 66)[54, 59) [66,74)

Splitting an Internal NodeSplitting an Internal Node

5954

66

40 74 84

9921 ……

[66, 99)

[ 59, 66)[54, 59)

[21,66)

[66,74)

Splitting an Internal NodeSplitting an Internal Node

54 6640 74 84

59

[ 59, 66)[54, 59) [66,74)

Splitting the RootSplitting the Root

54 6640 74 84

59

[ 59, 66)[54, 59) [66,74)

Splitting the RootSplitting the Root

54

66

40 74 8459

[ 59, 66)[54, 59) [66,74)

Splitting the RootSplitting the Root

34

DeletionDeletion

35

DeletionDeletion

redistribute

36

DeletionDeletion

37

Deletion - IIDeletion - II

Deletion - IIDeletion - II

merge

39

Deletion - IIDeletion - II

40

Deletion - IIDeletion - II

41

Deletion - IIDeletion - II

42

Deletion - IIDeletion - II

merge

Not needed

43

Deletion - IIDeletion - II

44

Deletion: PrimitivesDeletion: Primitives

Delete key from a leafDelete key from a leaf Redistribute keys between sibling Redistribute keys between sibling

leavesleaves Merge a leaf into its siblingMerge a leaf into its sibling Redistribute keys between two Redistribute keys between two

sibling internal nodessibling internal nodes Merge an internal node into its Merge an internal node into its

siblingsibling

45

Merge Leaf into SiblingMerge Leaf into Sibling

54 58 64 68 72 75

67 85…72

46

Merge Leaf into SiblingMerge Leaf into Sibling

54 58 64 68 75

67…72 85

47

Merge Leaf into SiblingMerge Leaf into Sibling

54 58 64 68 75

67…72 85

48

Merge Leaf into SiblingMerge Leaf into Sibling

54 58 64 68 75

…72 85

49

Merge Internal Node into Merge Internal Node into SiblingSibling

41 48 52 63 74

59

[52, 59) [59,63)

……

50

Merge Internal Node into Merge Internal Node into SiblingSibling

41 48 52 63

59

[52, 59) [59,63)

59

……

51

B-Tree RoadmapB-Tree Roadmap

B-TreeB-Tree RecapRecap Insertion (recap)Insertion (recap) DeletionDeletion ConstructionConstruction EfficiencyEfficiency

B-Tree variantsB-Tree variants Hash-based IndexesHash-based Indexes

52

QuestionQuestion

How does insertion-based constructionperform?

53

B-Tree ConstructionB-Tree Construction

11 1315 21 344148 57 6275 81 97

Sort

B-Tree ConstructionB-Tree Construction

75 9721 41 571511 13 4834 62 81

Scan

75 81 9711 13 15 21 34 41 48 57 62

B-Tree ConstructionB-Tree Construction

21 48 75

11 13 15 21 34 41 48 57 62 75 81 97

Scan

56

B-Tree ConstructionB-Tree Construction

Why is sort-based construction better thaninsertion-based one?

57

Cost of B-Tree Cost of B-Tree OperationsOperations

Height of B-Tree: HHeight of B-Tree: H Assume no duplicatesAssume no duplicates Question: what is the random I/O Question: what is the random I/O

cost of:cost of: Insertion:Insertion: Deletion:Deletion: Equality search:Equality search: Range Search: Range Search:

58

Height of B-TreeHeight of B-Tree

Number of keys: NNumber of keys: N B-Tree parameter: nB-Tree parameter: n

Height ≈ log N = Height ≈ log N = nn

log Nlog N

log nlog n

In practice: 2-3 levelsIn practice: 2-3 levels

59

Question: How do you pick parameter n? Question: How do you pick parameter n?

1.1. Ignore inserts and deletesIgnore inserts and deletes2.2. Optimize for equality searchesOptimize for equality searches3.3. Assume no duplicatesAssume no duplicates

60

RoadmapRoadmap

B-TreeB-Tree B-Tree variantsB-Tree variants

Sparse IndexSparse Index Duplicate KeysDuplicate Keys

Hash-based IndexesHash-based Indexes

61

RoadmapRoadmap

B-TreeB-Tree B-Tree variantsB-Tree variants Hash-based IndexesHash-based Indexes

Static Hash TableStatic Hash Table Extensible Hash TableExtensible Hash Table Linear Hash TableLinear Hash Table

62

Hash-Based IndexesHash-Based Indexes

Adaptations of main memory hash Adaptations of main memory hash tablestables

Support equality searchesSupport equality searches No range searchesNo range searches

Indexing Problem (recap)Indexing Problem (recap)

a1

2a

ia

na

A = val

Index Keysrecord pointers

64

Main Memory Hash Main Memory Hash TableTable

buckets

32

(null)

(null)

(null)

(null)

(null)

10

48

27 75

21

55

0

3

1

2

4

5

6

7

keyh (key)

h (key) = key % 8

65

Adapting to diskAdapting to disk

1 Hash Bucket = 1 Block1 Hash Bucket = 1 Block All keys that hash to bucket stored in All keys that hash to bucket stored in

the blockthe block Intuition: keys in a bucket usually Intuition: keys in a bucket usually

accessed togetheraccessed together No need for linked lists of keys …No need for linked lists of keys …

66

Adapting to DiskAdapting to Disk

How do we handle this?

67

Adapting to diskAdapting to disk

1 Hash Bucket = 1 Block1 Hash Bucket = 1 Block All keys that hash to bucket stored in All keys that hash to bucket stored in

the blockthe block Intuition: keys in a bucket usually Intuition: keys in a bucket usually

accessed togetheraccessed together No need for linked lists of keys …No need for linked lists of keys … … … but need linked list of blocks but need linked list of blocks

((overflow blocksoverflow blocks))

68

Adapting to DiskAdapting to Disk

69

Adapting to diskAdapting to disk

Bucket Id Bucket Id Disk Address mapping Disk Address mapping Contiguous blocksContiguous blocks Store mapping in main memoryStore mapping in main memory

Too large?Too large? Dynamic Dynamic Linear and Extensible Linear and Extensible

hash tableshash tables

70

Beware of claims that assume 1 I/O Beware of claims that assume 1 I/O for hash tables and 3 I/Os for B-Tree!!for hash tables and 3 I/Os for B-Tree!!