Chapter 10 Search Structures

Instructors: C. Y. Tang and J. S. Roger Jang

All the material is integrated from the textbook "Fundamentals of Data Structures in C", with some supplements from the slides of Prof. Hsin-Hsi Chen (NTU).


Page 2: Chapter 10 Search Structures

Balance Matters

Binary search trees can be degenerate.

If you insert keys in sorted order using the insertion algorithm introduced before, you’ll obtain a degenerate BST: every node has at most one child, so the tree is effectively a linked list.

Search takes O(n) time in such cases.
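To make the degenerate case concrete, here is a minimal Python sketch. The names `Node`, `insert`, `height`, and `build` are illustrative and not taken from the textbook; `insert` is a plain, unbalanced BST insertion, assumed to match the algorithm referred to above.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Plain (unbalanced) BST insertion; returns the root."""
    if root is None:
        return Node(key)
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = Node(key)
                break
            cur = cur.left
        elif key > cur.key:
            if cur.right is None:
                cur.right = Node(key)
                break
            cur = cur.right
        else:
            break  # key already present
    return root


def height(root):
    """Number of levels in the tree (0 for an empty tree)."""
    h = 0
    level = [] if root is None else [root]
    while level:
        h += 1
        level = [c for n in level for c in (n.left, n.right) if c is not None]
    return h


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


print(height(build(range(1, 101))))          # 100: sorted input gives a chain
print(height(build([4, 2, 6, 1, 3, 5, 7])))  # 3: same algorithm, friendlier order
```

With 100 keys inserted in sorted order, the height equals the number of keys, so a search may visit every node.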

Page 3: Chapter 10 Search Structures

Balanced Binary Search Trees

There are binary search trees that guarantee balance.

Balance factor of a node: (height of left subtree) – (height of right subtree)

An AVL tree is a binary search tree in which every node has a balance factor of -1, 0, or 1.
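The definition can be checked mechanically. The sketch below is illustrative only: `Node`, `height`, `balance_factor`, and `is_avl_balanced` are assumed names, the height of an empty subtree is taken to be 0, and only the balance condition is tested (not the BST ordering of the keys).

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(node):
    """Height of a subtree; an empty subtree has height 0 (an assumption)."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def balance_factor(node):
    """(height of left subtree) - (height of right subtree)."""
    return height(node.left) - height(node.right)


def is_avl_balanced(node):
    """True if every node has a balance factor of -1, 0, or 1."""
    if node is None:
        return True
    return (abs(balance_factor(node)) <= 1
            and is_avl_balanced(node.left)
            and is_avl_balanced(node.right))


chain = Node(1, None, Node(2, None, Node(3)))   # degenerate: 1 -> 2 -> 3
balanced = Node(2, Node(1), Node(3))

print(balance_factor(chain), is_avl_balanced(chain))        # -2 False
print(balance_factor(balanced), is_avl_balanced(balanced))  # 0 True
```

Recomputing heights at every node is quadratic in the worst case; practical AVL implementations usually store the height or balance factor in each node instead.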

Page 4: Chapter 10 Search Structures

AVL Trees

Balance is maintained by the insertion and deletion algorithms. Both take O(log n) time.

For example, if an insertion unbalances the tree, a rotation is performed to restore balance.

For details, please refer to your textbook.
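The slides defer the rotation details to the textbook. As a rough illustration only, the sketch below shows one standard single rotation (a right rotation, which repairs the case where the left child's left subtree has grown too tall). The names `Node`, `height`, and `rotate_right` are assumptions, and a full AVL insertion would also need the mirror-image rotation and the two double rotations.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


def rotate_right(y):
    """Rotate the subtree rooted at y to the right and return its new root.

    The left child x moves up, y becomes x's right child, and x's old
    right subtree becomes y's left subtree, so key order is preserved.
    """
    x = y.left
    y.left = x.right
    x.right = y
    return x


unbalanced = Node(3, Node(2, Node(1)))   # left-leaning chain 3 -> 2 -> 1
root = rotate_right(unbalanced)

print(root.key, root.left.key, root.right.key)  # 2 1 3
print(height(root))                             # 2
```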

Page 5: Chapter 10 Search Structures

Comparing Some Structures

Page 6: Chapter 10 Search Structures

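2-3 Trees

Each internal node has degree 2 or 3. A node of degree 2 (a 2-node) holds one key; a node of degree 3 (a 3-node) holds two keys.

[Figure: an example 2-3 tree in which the links below the key 40 are labeled "< 40" and "> 40".]

All external nodes are at the same level.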
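These two structural conditions can be expressed as a small checker. This is a sketch under assumptions: the class `Node23` (a list of one or two keys plus a list of children, empty for a leaf) and the functions `leaf_depths` and `is_23_shape` are illustrative names, and the ordering of keys is not verified here.

```python
class Node23:
    def __init__(self, keys, children=None):
        self.keys = list(keys)                  # one or two keys
        self.children = list(children or [])    # empty for a leaf


def leaf_depths(node, depth=0):
    """Set of depths at which leaves occur below node."""
    if not node.children:
        return {depth}
    depths = set()
    for child in node.children:
        depths |= leaf_depths(child, depth + 1)
    return depths


def is_23_shape(root):
    """Check degrees (2 or 3) and that all leaves are at the same level."""
    def ok(node):
        if len(node.keys) not in (1, 2):
            return False
        if node.children and len(node.children) != len(node.keys) + 1:
            return False
        return all(ok(child) for child in node.children)

    return ok(root) and len(leaf_depths(root)) == 1


good = Node23([40], [Node23([10, 20]), Node23([80])])
lopsided = Node23([40], [Node23([20], [Node23([10]), Node23([30])]), Node23([80])])

print(is_23_shape(good))      # True
print(is_23_shape(lopsided))  # False: leaves at two different levels
```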

Page 7: Chapter 10 Search Structures

2-3 Trees

The number of elements is between $2^h - 1$ and $3^h - 1$, where h is the height of the tree.

So if there are n elements, the height is between $\log_3(n+1)$ and $\log_2(n+1)$.

Hence to search in a 2-3 tree, you need O(log n) time.
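For a feel of the numbers, the following sketch evaluates these bounds with integer arithmetic. The function names `element_bounds` and `height_bounds` are illustrative; `height_bounds` returns the smallest and largest integer heights consistent with n elements.

```python
def element_bounds(h):
    """Fewest and most elements in a 2-3 tree of height h."""
    return 2 ** h - 1, 3 ** h - 1


def height_bounds(n):
    """Smallest and largest integer heights h with 2^h - 1 <= n <= 3^h - 1."""
    lo = 0
    while 3 ** lo - 1 < n:          # all 3-nodes: the shortest possible tree
        lo += 1
    hi = 0
    while 2 ** (hi + 1) - 1 <= n:   # all 2-nodes: the tallest possible tree
        hi += 1
    return lo, hi


print(element_bounds(3))         # (7, 26)
print(height_bounds(26))         # (3, 4)
print(height_bounds(1_000_000))  # (13, 19)
```

Even with a million elements the height stays below 20, which is what makes the O(log n) search bound useful in practice.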

Page 8: Chapter 10 Search Structures

Search in a 2-3 Tree

The search algorithm is similar to that for a BST.

At each node with keys k1 and k2 (suppose we are searching for x):

Go to the left child if x < k1.

Go to the middle child if x > k1 and x < k2.

Go to the right child if x > k2.
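The three-way branching can be sketched as follows. The representation is an assumption made for illustration: a hypothetical `Node23` class holding a sorted list of one or two keys and a list of children (empty for a leaf). The search stops successfully when x equals a key in the current node.

```python
class Node23:
    def __init__(self, keys, children=None):
        self.keys = list(keys)                  # one or two keys, sorted
        self.children = list(children or [])    # empty for a leaf


def search(node, x):
    """Return the node containing x, or None if x is not in the tree."""
    while node is not None:
        if x in node.keys:
            return node
        if not node.children:
            return None                         # reached a leaf without finding x
        # Count the keys smaller than x: 0 -> left, 1 -> middle (or right
        # child of a 2-node), 2 -> right child of a 3-node.
        i = sum(x > k for k in node.keys)
        node = node.children[i]
    return None


leaf_b = Node23([10, 20])
leaf_c = Node23([60, 70])
leaf_d = Node23([90, 95])
root = Node23([50, 80], [leaf_b, leaf_c, leaf_d])

print(search(root, 70) is leaf_c)  # True
print(search(root, 65))            # None
```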

Page 9: Chapter 10 Search Structures

Insertion Into A 2-3 Tree

To insert 70: First we search for 70. It is not in the tree, and the leaf node encountered during the search is node C. Since there is only one key there, we directly insert 70 into node C.

Page 10: Chapter 10 Search Structures

Now we want to insert 30. The leaf we encounter is B.

B is full, so we must create a new node D. The keys involved in this operation are 10 and 20 (the elements in B) and 30 (the element to be inserted).

Largest (30): put into D.

Smallest (10): remains in B.

Median (20): inserted into A, the parent of B.

Add a link from A to D.
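The key bookkeeping of a split is just a three-way sort. A minimal sketch, with `split_3node` as an illustrative name:

```python
def split_3node(x, z, y):
    """Split a full node holding keys x and z when key y arrives.

    Returns (smallest, median, largest): the smallest key stays in the old
    node, the largest goes into the new node, and the median is passed up
    to the parent.
    """
    smallest, median, largest = sorted((x, y, z))
    return smallest, median, largest


stays, goes_up, new_node = split_3node(10, 20, 30)
print(stays, goes_up, new_node)  # 10 20 30
```

Here 10 stays in B, 20 is inserted into A, and 30 goes into the new node D, as in the example above.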

Page 11: Chapter 10 Search Structures

Now we want to insert 60. We encounter leaf node C when we search for 60.

Node C is full, so we create node E to hold max{70, 80, 60} = 80.

min{70, 80, 60} = 60 remains in C.

The median, 70, will be inserted into A.

But A is also full, so a new node F will be created. F has children C (which holds 60) and E (which holds 80).

[Figure: A = (20 | 40) with children B = (10), D = (30), C = (60), and E = (80); 70 is to be inserted into A.]

Page 12: Chapter 10 Search Structures

[Figure: A = (20 | 40) with children B = (10), D = (30), and C = (60); E = (80) has been created; 70 is to be inserted into A.]

But A is also full, so we create node F to hold max{20, 40, 70} = 70. F has children C and E.

min{20, 40, 70} = 20 remains in A.

med{20, 40, 70} = 40 should be inserted into the parent of A. But A has no parent, so we create G to hold 40. G has children A and F.

Page 13: Chapter 10 Search Structures

Split a 3-Node

Inserting y into a 3-node B causes a split.

[Figure: the 3-node B = (x | z), with children C, D, and E and parent A, splits into B = (min) and a new node G = (max). F is a node that does not have a parent yet.]

min, max, and med are the minimum, maximum, and median of {x, y, z}, respectively.

med will be inserted into A.

Page 14: Chapter 10 Search Structures

Split

Observe that this pattern repeats.

q is initialized to null. At that time p is a leaf.

[Figure: before the split, p = (x | z) has children ch1(p), ch2(p), ch3(p), parent parent(p), and a pending link q. After the split, p = (min) and a new node holds max. The new node is the next q, parent(p) is the next p, and med is the next y.]

The position to insert the link to q depends on the situation.

Page 15: Chapter 10 Search Structures

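Split

Split is simpler when p is the root.

[Figure: the root p = (x | z), with children ch1(p), ch2(p), ch3(p) and a pending link q, splits into p = (min) and a new node holding max. A new root holding med becomes the parent of both.]

The position to insert the link to q depends on the situation.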

Page 16: Chapter 10 Search Structures

Insertion Algorithm

We are to insert key y into tree t.

First, search for y in t. As you visit each node, push it onto a stack to facilitate finding parents later.

Assume that y is not in t (otherwise we need not insert).

Let p be the leaf node we encountered in the search. If we pop a node from the stack, we obtain the parent of p (assuming that p itself is not pushed onto the stack).

Page 17: Chapter 10 Search Structures

Insertion Algorithm

Initialize q to null.

If p is a 2-node, then simply insert y into p. Put q immediately to the right of y. That is, if w is the key originally in p, then we have two cases:

[Figure: if w < y, p = (w | y) with children nil, nil, q = nil; if y < w, p = (y | w) with children nil, q = nil, nil.]

And we’re done!

Page 18: Chapter 10 Search Structures

Insertion Algorithm

If p is a 3-node, then split p.

[Figure: the leaf p = (x | z), whose children are all nil and with q = nil, splits into p = (min) and a new node holding max. The new node is the next q, parent(p) is the next p, and med is the next y.]

Then let p = parent(p), let q be the new node holding max, and let y = med. We now consider the insertion of the new y into the new p.

Page 19: Chapter 10 Search Structures

Insertion Algorithm

In the remaining process, if p is a 2-node, then simply insert y into p and update the links as follows:

[Figure: if w < y, p = (w | y) with children a, b, q; if y < w, p = (y | w) with children a, q, b.]

And we’re done!

Page 20: Chapter 10 Search Structures

Insertion Algorithm

If p is a 3-node, then split. We then continue by inserting the new y into the new p.

[Figure: p = (x | z), with children ch1(p), ch2(p), ch3(p) and a pending link q, splits into p = (min) and a new node holding max. The new node is the next q, parent(p) is the next p, and med is the next y.]

The position to insert the link to q depends on the situation.

Page 21: Chapter 10 Search Structures

Insertion Algorithm

If p (a 3-node) is the root, then the split is done in the manner stated before. We’re done after this.

[Figure: the root p = (x | z), with children ch1(p), ch2(p), ch3(p) and a pending link q, splits into p = (min) and a new node holding max. A new root holding med becomes the parent of both.]

The position to insert the link to q depends on the situation.
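Putting the steps together, here is one possible Python sketch of the insertion algorithm. It is not the textbook's code: `Node23`, `insert`, and `levels` are illustrative names, nodes are assumed to hold a sorted key list and a child list (empty for a leaf), and the starting tree A = (40), B = (10 | 20), C = (80) is reconstructed from the text of the worked example because the original figure is not available.

```python
class Node23:
    def __init__(self, keys, children=None):
        self.keys = list(keys)                  # one or two keys, sorted
        self.children = list(children or [])    # empty for a leaf


def insert(root, y):
    """Insert key y and return the (possibly new) root."""
    if root is None:
        return Node23([y])
    # Search for y, pushing every visited internal node onto a stack.
    stack, p = [], root
    while True:
        if y in p.keys:
            return root                         # y is already in the tree
        if not p.children:
            break                               # p is the leaf where y belongs
        stack.append(p)
        p = p.children[sum(y > k for k in p.keys)]
    q = None                                    # pending link to a new node
    while True:
        i = sum(y > k for k in p.keys)          # position of y among p's keys
        p.keys.insert(i, y)
        if q is not None:
            p.children.insert(i + 1, q)         # q goes immediately right of y
        if len(p.keys) <= 2:
            return root                         # p was a 2-node: done
        # p was a 3-node: min stays in p, max moves to q, med goes up.
        smallest, med, largest = p.keys
        q = Node23([largest], p.children[2:])
        p.keys, p.children = [smallest], p.children[:2]
        y = med
        if not stack:
            return Node23([med], [p, q])        # p was the root: new root
        p = stack.pop()


def levels(root):
    """Key lists of all nodes, level by level (for inspection)."""
    out, level = [], [root]
    while level:
        out.append([n.keys for n in level])
        level = [c for n in level for c in n.children]
    return out


tree = Node23([40], [Node23([10, 20]), Node23([80])])
for key in (70, 30, 60):
    tree = insert(tree, key)
print(levels(tree))
# [[[40]], [[20], [70]], [[10], [30], [60], [80]]]
```

The final shape agrees with the worked example: G = (40) is the new root with children A = (20) and F = (70). Inserting q at index i + 1 is how this sketch resolves "the position to insert the link to q depends on the situation".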

Page 22: Chapter 10 Search Structures

Correctness of Insertion

Note that all keys in part B, including y and the keys in q, lie between u and v.

Because we followed the middle link of parent(p) during the search in the example below, the input key (to be inserted) falls between u and v.

Besides the (input) key to insert, all keys in B were originally there and fall between u and v.

[Figure: parent(p) = (u | v) with subtrees A, B, and C. p is the root of the middle subtree B, with children ch1(p), ch2(p), ch3(p) and a pending link q; y is to be inserted into p.]

Page 23: Chapter 10 Search Structures

Correctness of Insertion

So the global relationship is preserved. As for the local relationship among the keys, the insertion actions clearly maintain this property.

[Figure: after the insertion, p = (w | y) with children ch1(p), ch2(p), and q, under parent(p) = (u | v).]

Page 24: Chapter 10 Search Structures

Correctness of Insertion

You should use induction as well as these observations to give a more rigorous proof.

p and q are always 2-3 trees after each iteration.

Page 25: Chapter 10 Search Structures

Time Complexity of Insertion

At each level, the algorithm takes O(1) time.

There are O(log n) levels. So insertion takes O(log n) time.

Page 26: Chapter 10 Search Structures

Deletion From A 2-3 Tree

Deletion of any element can be transformed into deletion of a leaf element.

To delete 50, we replace 50 by 60 or 20, and then delete the corresponding leaf element (60 or 20).

60 is the leftmost leaf element in the right subtree of 50.

20 is the rightmost leaf element in the left subtree of 50.

[Figure: 50 in the root has been replaced by 20, giving the root (20 | 80); the copy of 20 in the leaf remains to be deleted.]

Use the algorithm presented later to delete 20 from the leaf.
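Finding the two candidate replacements is a matter of walking down one edge of a subtree. In this sketch, `Node23`, `rightmost_leaf_key`, `leftmost_leaf_key`, and `replacement_candidates` are illustrative names, and the tree (root (50 | 80) with leaves (10 | 20), (60 | 70), (90 | 95)) is the one implied by the text of this example.

```python
class Node23:
    def __init__(self, keys, children=None):
        self.keys = list(keys)                  # one or two keys, sorted
        self.children = list(children or [])    # empty for a leaf


def rightmost_leaf_key(node):
    """Largest key in the subtree: follow the last child down to a leaf."""
    while node.children:
        node = node.children[-1]
    return node.keys[-1]


def leftmost_leaf_key(node):
    """Smallest key in the subtree: follow the first child down to a leaf."""
    while node.children:
        node = node.children[0]
    return node.keys[0]


def replacement_candidates(node, key):
    """For a key stored in internal node `node`, return the two leaf keys
    that may replace it: (predecessor, successor)."""
    i = node.keys.index(key)
    return (rightmost_leaf_key(node.children[i]),
            leftmost_leaf_key(node.children[i + 1]))


root = Node23([50, 80], [Node23([10, 20]), Node23([60, 70]), Node23([90, 95])])
print(replacement_candidates(root, 50))  # (20, 60)
print(replacement_candidates(root, 80))  # (70, 90)
```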

Page 27: Chapter 10 Search Structures

Deletion From A 2-3 Tree

Delete 70 (in C). This case is straightforward, as the resulting C is non-empty.

Page 28: Chapter 10 Search Structures

Deletion From A 2-3 Tree

Delete 90 (in D). This is also simple; a shift of 95 in D suffices.

Page 29: Chapter 10 Search Structures

Deletion From A 2-3 Tree

Delete 60 (in C). C becomes empty. The left sibling of C is a 3-node. Hence (rotation): move 50 from A to C, and move 20 from B to A.

Page 30: Chapter 10 Search Structures

Deletion From A 2-3 Tree

Delete 95 (from D): D becomes empty. Its left sibling C is a 2-node, hence (combine): Move 80 from A to C. Delete D.

Page 31: Chapter 10 Search Structures

Deletion From A 2-3 Tree

Delete 50 (in C). Simply shift.

Page 32: Chapter 10 Search Structures

Deletion From A 2-3 Tree

Delete 10 (in B): B becomes empty. The right sibling of B is a 2-node, hence (combine): move 20 from A to B, and move 80 from C to B.

The parent A, which is also the root, is now empty. Hence simply let B be the new root.

Page 33: Chapter 10 Search Structures

Rotation and Combine

When a deletion in node p leaves p empty:

Let r be the parent of p.

If p is the left child of r, then let q be the right sibling of p. Otherwise, let q be the left sibling of p.

If q is a 3-node, then rotate.

If q is a 2-node, then combine.
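As a rough end-to-end illustration, the sketch below implements deletion with rotation when the chosen sibling q is a 3-node and combine when q is a 2-node, as in the worked example on the preceding slides, and then replays that example. It is an assumption-laden sketch rather than the textbook's code: `Node23`, `fix_empty`, `delete`, `delete_key`, and `levels` are illustrative names, an internal key is always replaced by its successor (the slides allow either neighbor), and a combine always merges into the left node of the pair.

```python
class Node23:
    def __init__(self, keys, children=None):
        self.keys = list(keys)                  # one or two keys, sorted
        self.children = list(children or [])    # empty for a leaf


def fix_empty(r, i):
    """Repair the empty child p = r.children[i] by rotation or combine."""
    p = r.children[i]
    j = i + 1 if i == 0 else i - 1    # q: right sibling of a left child, else left
    q = r.children[j]
    s = min(i, j)                     # index in r of the key between p and q
    if len(q.keys) == 2:              # q is a 3-node: rotation
        if j > i:                     # borrow through r from the right
            p.keys.append(r.keys[s])
            r.keys[s] = q.keys.pop(0)
            if q.children:
                p.children.append(q.children.pop(0))
        else:                         # borrow through r from the left
            p.keys.insert(0, r.keys[s])
            r.keys[s] = q.keys.pop()
            if q.children:
                p.children.insert(0, q.children.pop())
    else:                             # q is a 2-node: combine
        left, right = (p, q) if j > i else (q, p)
        left.keys = left.keys + [r.keys.pop(s)] + right.keys
        left.children = left.children + right.children
        del r.children[max(i, j)]     # drop the right node of the pair


def delete(node, key):
    """Delete key below node; return True if node is left with no keys."""
    if key in node.keys and not node.children:
        node.keys.remove(key)         # leaf: remove the key directly
        return not node.keys
    if key in node.keys:
        # Internal key: overwrite it with its successor (leftmost leaf key
        # of the subtree to its right), then delete that leaf key instead.
        i = node.keys.index(key) + 1
        leaf = node.children[i]
        while leaf.children:
            leaf = leaf.children[0]
        key = leaf.keys[0]
        node.keys[i - 1] = key
    elif not node.children:
        return False                  # key is not in the tree
    else:
        i = sum(key > k for k in node.keys)
    if delete(node.children[i], key):
        fix_empty(node, i)
    return not node.keys


def delete_key(root, key):
    """Delete key and return the (possibly new) root."""
    if delete(root, key):             # the root itself became empty
        return root.children[0] if root.children else None
    return root


def levels(root):
    """Key lists of all nodes, level by level (for inspection)."""
    out, level = [], [root]
    while level:
        out.append([n.keys for n in level])
        level = [c for n in level for c in n.children]
    return out


tree = Node23([50, 80], [Node23([10, 20]), Node23([60, 70]), Node23([90, 95])])
for k in (70, 90, 60, 95, 50, 10):
    tree = delete_key(tree, k)
print(levels(tree))  # [[[20, 80]]]
```

After the six deletions only the node (20 | 80) remains and it is the root, matching the last step of the example.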

Page 34: Chapter 10 Search Structures

Rotation

If p is the left child of r:

(“?” means “don’t care”.) Observe the correctness.

Page 35: Chapter 10 Search Structures

Rotation

If p is the middle child of r.

Page 36: Chapter 10 Search Structures

Rotation

If p is the right child of r.

Page 37: Chapter 10 Search Structures

Combine

If p is the left child of r:

Case 1: r is a 2-node. r becomes empty, so we set p to be r and continue by considering whether to rotate or combine at the new p. If r is the root, then the combined node becomes the new root instead.

Page 38: Chapter 10 Search Structures

Combine

If p is the left child of r:

Case 2: r is a 3-node.

Page 39: Chapter 10 Search Structures

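Combine

If p is the middle child of r:

Case 1: r is a 2-node. Continue to handle the empty r as before.

[Figure: before, r = (w) has left child q = (y), whose children are a and b, and the empty child p, whose only child is c. After, r is empty and its single child p = (y | w) has children a, b, and c.]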

Page 40: Chapter 10 Search Structures

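Combine

If p is the middle child of r:

Case 2: r is a 3-node.

[Figure: before, r = (w | x) has left child q = (y), whose children are a and b, the empty middle child p, whose only child is c, and right child d. After, r = (x) has children p = (y | w), whose children are a, b, and c, and d.]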

Page 41: Chapter 10 Search Structures

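Combine

If p is the right child of r:

[Figure: before, r = (w | x) has left child a, middle child q = (y), whose children are b and c, and the empty right child p, whose only child is d. After, r = (w) has children a and p = (y | x), whose children are b, c, and d.]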

Page 42: Chapter 10 Search Structures

Correctness of Deletion

Observe that if a combine results in a new empty node, then that node must have the following appearance (r with one tail): it has no keys and exactly one child.

r will become p in the next iteration.

In the (left-hand side) pictures we’ve seen, p has the above appearance, so the same operations are applicable.

Page 43: Chapter 10 Search Structures

Correctness of Deletion

We begin with p being a leaf. At that time, the children of p are all null. So rotation/combine as illustrated in the previous figures are also applicable.

Correctness of other parts should be clear.

Page 44: Chapter 10 Search Structures

Time Complexity of Deletion

At each level: O(1) time. Rotation/combine need O(1) time.

#levels: O(log n). Total: O(log n) time.