CS 221
Analysis of Algorithms
Ordered Dictionaries and Search Trees
Portions of these slides come from Michael Goodrich and Roberto Tamassia,
Algorithm Design: Foundations, Analysis and Internet Examples, 2002, John Wiley & Sons,
and from www.wikipedia.org.
Reading material
Goodrich and Tamassia, 2002:
  Chapter 2, section 2.5, pages 114-137 (see also section 2.6)
  Chapter 3, section 3.1, pages 141-151
Wikipedia: http://en.wikipedia.org/wiki/AVL_trees
In the previous episode…
…we defined a data structure which we called a dictionary. It was…
a container to hold multiple objects or, in Goodrich and Tamassia’s terminology, “items”
each item = a (key, element) pair
element = a “piece” of data
  think: name, address, phone number
key = a value we associate with the element to help us find, retrieve, delete, etc. an element
  think: RDBMS autoincrement key, student ID#
Dictionaries
Up until now we looked at unordered dictionaries
container for (k,e) pairs but… in no particular order
  Logfiles
  Hash Tables
Dictionaries
A terminology note for purposes of our discussion:
A linear unordered dictionary = logfile
A linear ordered dictionary = lookup table
Game Time
Twenty Questions
One person thinks of an object that can be any person, place or thing… and does not disclose the selected object until it is specifically identified by the other players…
All other players take turns asking Yes/No questions in an attempt to identify the mystery object
Game Time
Twenty Questions
An efficient problem solving strategy is to ask questions for which the answers will optimally narrow the size of the problem space (possible solutions)
for example, Q: Is it a person? A: Yes
…we just eliminated all places and non-human objects from the solution set
Game Time
Twenty Questions
Size of problem?
  N = ??? large ~∞
Yes/No attack makes this a binary search problem…
So, what size of problem space can we effectively search?
  2^20 (twenty yes/no answers distinguish at most 2^20 = 1,048,576 candidates)
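As a quick check of the 2^20 figure, here is a minimal sketch. It assumes each answer splits the remaining candidates exactly in half; the function name `questions_needed` is made up for this illustration.

```python
def questions_needed(n: int) -> int:
    """Smallest q with 2**q >= n: yes/no questions needed to single out one of n candidates."""
    if n < 1:
        raise ValueError("need at least one candidate")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without floating point
    return (n - 1).bit_length()


print(2 ** 20)                        # 1048576
print(questions_needed(2 ** 20))      # 20
print(questions_needed(2 ** 20 + 1))  # 21
```

One candidate more than 2^20 already needs a 21st question, which is why 2^20 is the largest problem space that twenty perfect questions can cover.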
Game Time
Twenty Questions
Something to think about…
N is conceivably much larger than 2^20
So, how is it that we can usually solve this problem in 20 steps or fewer, i.e. correctly identify the mystery object?
Dictionaries
Ordered Dictionaries
suppose the items in a dictionary are ordered (sorted), e.g. low to high
Would that make a difference in terms of
  size()
  isEmpty()
  findElement()
  insertItem()
  removeItem()
Dictionaries
Ordered Dictionaries
suppose we implement an ordered dictionary as a linear data structure, or more specifically a vector
items are in the vector in key order
we gain considerable efficiency because we can visit D[x], where x is a rank, in O(1) time
Can we achieve the same findElement() time if the ordered dictionary were implemented as a linked list?
Binary Search
Binary search performs operation findElement(k) on a dictionary implemented by means of an array-based sequence, sorted by key
  similar to the high-low game
  at each step, the number of candidate items is halved
  terminates after O(log n) steps
Example: findElement(7)
  0 1 3 4 5 7 8 9 11 14 16 18 19
Figure: the sorted array above is shown at four successive steps of the search; the markers l, m and h give the low, middle and high positions of the current search range.
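The halving steps can be sketched as follows. This is an illustrative version, not the textbook's code: the names `find_element`, `low`, `mid` and `high` are chosen here to mirror the l, m, h markers, and the key list is the example array read with its leading 0 (an assumption about the extracted figure).

```python
def find_element(keys, k):
    """Binary search over a list sorted by key.

    Returns (index, probes), where index is the rank of k or None if k is absent,
    and probes is the number of middle positions inspected.
    """
    low, high = 0, len(keys) - 1
    probes = 0
    while low <= high:
        mid = (low + high) // 2
        probes += 1
        if k < keys[mid]:
            high = mid - 1      # discard the upper half
        elif k > keys[mid]:
            low = mid + 1       # discard the lower half
        else:
            return mid, probes
    return None, probes


keys = [0, 1, 3, 4, 5, 7, 8, 9, 11, 14, 16, 18, 19]
print(find_element(keys, 7))   # (5, 4)
print(find_element(keys, 6))   # (None, 4)
```

Key 7 is found on the fourth probe, which matches the four rows of the example.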
Binary Search
Lookup tables are not very efficient for dynamic data (lots of insertItem and removeElement operations)
Lookup tables are efficient for dictionaries where the predominant access is findElement, with relatively few inserts or removes
  credit card authorizations, code translation tables, …

Method             Logfile   Lookup Table
findElement        O(n)      O(log n)
insertItem         O(1)      O(n)
removeElement      O(n)      O(n)
closestKeyBefore   O(n)      O(log n)
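The lookup-table column can be illustrated with a sorted Python list and the standard `bisect` module. This is a sketch under assumptions: the class and method names are invented for the example, keys are taken to be distinct, and the closest-key-before operation is assumed to return the largest key less than or equal to k.

```python
import bisect


class LookupTable:
    """Ordered dictionary kept as two parallel lists sorted by key."""

    def __init__(self):
        self._keys = []
        self._elems = []

    def find_element(self, k):
        i = bisect.bisect_left(self._keys, k)         # O(log n) binary search
        if i < len(self._keys) and self._keys[i] == k:
            return self._elems[i]
        return None

    def insert_item(self, k, e):
        i = bisect.bisect_left(self._keys, k)         # O(log n) to find the slot
        self._keys.insert(i, k)                       # O(n) to shift later items
        self._elems.insert(i, e)

    def remove_element(self, k):
        i = bisect.bisect_left(self._keys, k)
        if i < len(self._keys) and self._keys[i] == k:
            self._keys.pop(i)                         # O(n) to close the gap
            return self._elems.pop(i)
        return None

    def closest_key_before(self, k):
        i = bisect.bisect_right(self._keys, k)        # O(log n)
        return self._keys[i - 1] if i > 0 else None


table = LookupTable()
for key in (5, 1, 9):
    table.insert_item(key, f"item-{key}")
print(table.find_element(9))        # item-9
print(table.closest_key_before(6))  # 5
```

Finding the slot is cheap; the O(n) cost of insertItem and removeElement comes from shifting the items that follow it.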
Binary Search Tree
Binary tree for holding (k,e) items, such that…
each internal node v stores an element e with key k
keys stored in the left subtree of v are <= the key of v
keys stored in the right subtree of v are >= the key of v
external nodes store no elements… only a placeholder (NULL_NODE)
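Binary Search Tree
Every key in a node's left subtree is less than (or equal to) that node's key
Every key in a node's right subtree is greater than (or equal to) that node's key
All leaf nodes hold no items
Figure: example binary search tree; keys as extracted, top to bottom: 58 / 31, 90 / 25, 42 / 12, 36 / 62 / 75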
Search
Algorithm findElement(k, v)
  if T.isExternal(v)
    return NO_SUCH_KEY
  if k < key(v)
    return findElement(k, T.leftChild(v))
  else if k = key(v)
    return element(v)
  else { k > key(v) }
    return findElement(k, T.rightChild(v))
Figure: example search tree with root 6 and keys 2, 9, 1, 4, 8
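A runnable sketch of the search algorithm, assuming a small hand-rolled `Node` class in which an external node is a node with no key. The helper `internal` and the element strings are made up for the example; the tree uses keys 6, 2, 9, 1, 4, 8.

```python
NO_SUCH_KEY = object()   # sentinel returned when the key is absent


class Node:
    """Tree node; a node with key None plays the role of an external placeholder."""

    def __init__(self, key=None, elem=None, left=None, right=None):
        self.key, self.elem = key, elem
        self.left, self.right = left, right

    def is_external(self):
        return self.key is None


def internal(key, elem, left=None, right=None):
    """Build an internal node, filling missing children with external placeholders."""
    return Node(key, elem,
                left if left is not None else Node(),
                right if right is not None else Node())


def find_element(k, v):
    if v.is_external():
        return NO_SUCH_KEY
    if k < v.key:
        return find_element(k, v.left)
    elif k == v.key:
        return v.elem
    else:  # k > v.key
        return find_element(k, v.right)


root = internal(6, "six",
                internal(2, "two", internal(1, "one"), internal(4, "four")),
                internal(9, "nine", internal(8, "eight")))
print(find_element(4, root))                  # four
print(find_element(5, root) is NO_SUCH_KEY)   # True
```

Each recursive call moves one level down, so the search visits at most one node per level of the tree.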
removeElement(k) – simple case
To perform operation removeElement(k), we search for key k
Assume key k is in the tree, and let v be the node storing k
If node v has a leaf child w, we remove v and w from the tree with operation removeAboveExternal(w)
Example: remove 4
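Figure: before the removal the tree holds keys 6, 2, 9, 1, 4, 8, 5, with v and w marking the node storing 4 and its leaf child; after the removal it holds keys 6, 2, 9, 1, 5, 8.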
removeElement(k) – more complicated case
We consider the case where the key k to be removed is stored at a node v whose children are both internal
  we find the internal node w that follows v in an inorder traversal
  we copy key(w) into node v
  we remove node w and its left child z (which must be a leaf) by means of operation removeAboveExternal(z)
Example: remove 3
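Figure: before the removal the tree holds keys 3, 1, 8, 6, 9, 5, 2, with labels v, w and z as in the text; after the removal node v holds 5 and the tree holds keys 5, 1, 8, 6, 9, 2.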
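Both removal cases can be sketched in one recursive function. This is a simplified illustration rather than the slides' removeAboveExternal: it uses `None` in place of external placeholder nodes, assumes distinct keys, and the names `insert_item`, `remove_element`, `build` and `inorder` are chosen for the example.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None    # None stands in for an external placeholder
        self.right = None


def insert_item(v, key):
    if v is None:
        return Node(key)
    if key < v.key:
        v.left = insert_item(v.left, key)
    else:
        v.right = insert_item(v.right, key)
    return v


def remove_element(v, key):
    """Return the root of the subtree rooted at v after removing key."""
    if v is None:
        return None
    if key < v.key:
        v.left = remove_element(v.left, key)
    elif key > v.key:
        v.right = remove_element(v.right, key)
    elif v.left is None:        # simple case: v has a leaf child
        return v.right
    elif v.right is None:       # simple case, mirrored
        return v.left
    else:                       # both children internal
        w = v.right
        while w.left is not None:   # w follows v in an inorder traversal
            w = w.left
        v.key = w.key               # copy key(w) into v
        v.right = remove_element(v.right, w.key)   # w has a leaf left child
    return v


def build(keys):
    root = None
    for k in keys:
        root = insert_item(root, k)
    return root


def inorder(v):
    return [] if v is None else inorder(v.left) + [v.key] + inorder(v.right)


simple = remove_element(build([6, 2, 9, 1, 4, 8, 5]), 4)
print(inorder(simple))          # [1, 2, 5, 6, 8, 9]

harder = remove_element(build([3, 1, 8, 2, 6, 9, 5]), 3)
print(harder.key, inorder(harder))   # 5 [1, 2, 5, 6, 8, 9]
```

In the second call the root's key 3 is overwritten by its inorder successor 5, and the node that originally held 5 is then removed by the simple case.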
Binary Search Tree Performance
Consider a dictionary with n items implemented by means of a binary search tree of height h
  the space used is O(n)
  methods findElement, insertItem and removeElement take O(h) time
The height h is O(n) in the worst case and O(log n) in the best case
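The gap between the worst and best case shows up even for seven keys. The sketch below assumes a plain unbalanced insert and measures height as the length of the longest path down to a placeholder, with `None` standing in for external nodes; all names are illustrative.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert_item(v, key):
    """Plain binary search tree insertion with no rebalancing."""
    if v is None:
        return Node(key)
    if key < v.key:
        v.left = insert_item(v.left, key)
    else:
        v.right = insert_item(v.right, key)
    return v


def height(v):
    """Longest path from v down to an external (None) position."""
    return 0 if v is None else 1 + max(height(v.left), height(v.right))


def build(keys):
    root = None
    for k in keys:
        root = insert_item(root, k)
    return root


print(height(build([1, 2, 3, 4, 5, 6, 7])))   # 7: sorted input degenerates into a list
print(height(build([4, 2, 6, 1, 3, 5, 7])))   # 3: same keys, friendlier order
```

The same seven keys give height n when inserted in sorted order and about log n when the middle keys arrive first.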
Balanced Trees
When a path in a tree gets very long relative to other paths in the tree… the tree is unbalanced
In fact, in its extreme form an unbalanced tree is a linear list.
So, to achieve optimal performance… you need to keep the tree balanced
AVL Trees
we want to maintain a balanced tree
recall: height of a node v = length of the longest path from v to an external node
We want to maintain the principle that, for every internal node v, the heights of the children of v differ by at most 1
  this is the Height-Balance Property
AVL Trees
Balance Factor = h(right_subtree) - h(left_subtree)
Balanced tree: |h(right_subtree) - h(left_subtree)| ∈ {0, 1}, i.e. Balance Factor ∈ {-1, 0, 1}
Tree with Balance Factor ∉ {-1, 0, 1}
  Unbalanced Tree
  Must be rebalanced
Balance Factor exists for every node v except (trivially) external nodes
AVL Trees
If Balance Factor ∈ {-1, 0, 1}: tree is balanced and does not need restructuring
If Balance Factor = -2 or 2: tree is unbalanced and needs restructuring
restructuring is done by a process called rotation
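The balance factor can be computed directly from the definition. This is a minimal sketch using the slides' sign convention (right height minus left height); the `Node` class and function names are assumptions of the example, and heights are recomputed on every call for clarity rather than cached.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left      # None stands in for an external node
        self.right = right


def height(v):
    return 0 if v is None else 1 + max(height(v.left), height(v.right))


def balance_factor(v):
    return height(v.right) - height(v.left)


def is_height_balanced(v):
    """True if every node in the subtree has a balance factor of -1, 0 or 1."""
    if v is None:
        return True
    return (abs(balance_factor(v)) <= 1
            and is_height_balanced(v.left)
            and is_height_balanced(v.right))


chain = Node(1, right=Node(2, right=Node(3)))   # 1 -> 2 -> 3, all right children
bushy = Node(2, Node(1), Node(3))

print(balance_factor(chain))       # 2
print(is_height_balanced(chain))   # False
print(is_height_balanced(bushy))   # True
```

The three-node chain already has a balance factor of 2 at its root, so it is the smallest tree that needs restructuring.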
AVL Trees
Rotation
Four types, but they form symmetrical pairs
  Left Single Rotation
  Right Single Rotation
  Left Double Rotation
  Right Double Rotation
Since the left and right versions are symmetrical, we only consider single and double rotation
AVL Trees
Rotation if BF = 2
AVL Trees
Binary trees that maintain the Height-Balance Property are called AVL trees
the name comes from the inventors G.M. Adelson-Velsky and E.M. Landis, who described them in a paper entitled “An Algorithm for the Organization of Information”
AVL Trees
Figure: Unbalanced Tree and Balanced Tree
from:http://en.wikipedia.org/wiki/AVL_trees
AVL Trees
Balance Factor (BF) = h(right_subtree) - h(left_subtree)
If BF ∈ {-1, 0, 1} then tree balanced (do nothing)
If BF ∉ {-1, 0, 1} then tree unbalanced (must be restructured)
Restructuring done by rotation
AVL Trees
Rotation
four cases, but pairs are symmetrical
  left single rotation
  right single rotation
  left double rotation
  right double rotation
since the pairs are symmetric, we only examine single and double
AVL Trees - Insertion Rotation
Rotation
If BF > 2, unbalance occurred further down in the right subtree
  Recursively walk down the subtree until |BF| = 2
If BF < -2, unbalance occurred further down in the left subtree
  Recursively walk down the subtree until |BF| = 2
AVL Trees - Insertion Rotation
Rotation
If BF = 2, unbalance occurred in the right subtree
  Recursively walk down the subtree until |BF| = 2
If BF = -2, unbalance occurred in the left subtree
  Recursively walk down the subtree until |BF| = 2
AVL Trees - Insertion Rotation
Rotation
If BF = 2, unbalance occurred in the right subtree
  Step down to the subtree to find where the insertion occurred
If BF = -2, unbalance occurred in the left subtree
  Step down to the subtree to find where the insertion occurred
AVL Trees - Insertion
Rotation (shown for BF = 2; the BF = -2 case is the mirror image)
If BF at subtree = 1
  insertion occurred on right leaf node
  single rotation required
If BF at subtree = -1
  insertion occurred on left leaf node
  double rotation required
AVL Trees - Insertion
Rotation See
http://en.wikipedia.org/wiki/AVL_trees
AVL Trees - Insertion
Performance
  rotations: O(1)
  Recall h(T) maintained at O(log n)
  insertItem: O(log n)
  balanced tree: priceless
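The insertion rules above can be put together in a compact sketch. This is one common recursive formulation, not necessarily the one the slides had in mind: it keeps a cached height in each node, uses BF = h(right) - h(left), rebalances on the way back up from the insertion point, and all names are chosen for the example.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1          # height of a node whose children are both external


def height(v):
    return v.height if v is not None else 0


def update(v):
    v.height = 1 + max(height(v.left), height(v.right))


def balance_factor(v):
    return height(v.right) - height(v.left)


def rotate_left(v):
    """Single rotation: the right child r becomes the subtree root. O(1)."""
    r = v.right
    v.right, r.left = r.left, v
    update(v)
    update(r)
    return r


def rotate_right(v):
    """Mirror image of rotate_left. O(1)."""
    l = v.left
    v.left, l.right = l.right, v
    update(v)
    update(l)
    return l


def insert(v, key):
    """Insert key below v and return the (possibly new) root of the subtree."""
    if v is None:
        return Node(key)
    if key < v.key:
        v.left = insert(v.left, key)
    else:
        v.right = insert(v.right, key)
    update(v)
    bf = balance_factor(v)
    if bf == 2:                               # right subtree too tall
        if balance_factor(v.right) < 0:       # inner grandchild: double rotation
            v.right = rotate_right(v.right)
        return rotate_left(v)
    if bf == -2:                              # left subtree too tall
        if balance_factor(v.left) > 0:
            v.left = rotate_left(v.left)
        return rotate_right(v)
    return v


root = None
for k in range(1, 8):          # sorted input, the worst case for a plain BST
    root = insert(root, k)
print(root.key, root.height)   # 4 3
```

Only nodes on the path from the new key back to the root are touched, and each is handled in constant time, which is where the O(log n) bound for insertItem comes from.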
Bounded-depth Search Trees
Search efficiency in a tree is related to the depth of the tree
Can use depth-bounded trees to create ordered dictionaries that run in O(log n) time for search and update
Multi-way Search Trees
Remember Binary Search Trees
  any node v can have at most 2 children
  what if we get rid of that rule?
Suppose a node could have multiple children (>2)
Terminology: if v has d children, v is a d-node
Multi-way Search Trees
Multi-way Search Tree T
Each internal node must have at least two children, i.e. each internal node is a d-node with d ≥ 2
Internal nodes store collections of items (k,e)
Each d-node stores d-1 items
Special keys k0 = -∞ and kd = ∞
External nodes are only placeholders
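To make the node layout concrete, here is a hedged sketch of a d-node and a search over such a tree. The slide gives only the layout, so the ordering rule used here (child i holds keys between the (i-1)-th and i-th stored key) is the usual multi-way search tree convention and is an assumption, as are the class and function names. The special keys -∞ and ∞ are implicit in the `bisect` call rather than stored.

```python
import bisect


class MultiWayNode:
    """A d-node: d-1 sorted keys with their elements, and d children."""

    def __init__(self, keys, elems, children=None):
        self.keys = keys
        self.elems = elems
        # None stands in for an external placeholder child
        self.children = children if children is not None else [None] * (len(keys) + 1)


def find_element(v, k):
    while v is not None:
        i = bisect.bisect_left(v.keys, k)     # first stored key that is >= k
        if i < len(v.keys) and v.keys[i] == k:
            return v.elems[i]
        v = v.children[i]                     # descend between keys i-1 and i
    return None


root = MultiWayNode(
    [10, 20], ["ten", "twenty"],
    [MultiWayNode([5], ["five"]),
     MultiWayNode([15], ["fifteen"]),
     MultiWayNode([25, 30], ["twenty-five", "thirty"])],
)
print(find_element(root, 15))   # fifteen
print(find_element(root, 12))   # None
```

The root here is a 3-node holding two items; the leaf holding 25 and 30 is also a 3-node, with three placeholder children.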