The Selection Problem — tcs/ds/lecture9.pdf · 2009-10-13
The Selection Problem
The selection problem:
- given an integer k and a list x1, ..., xn of n elements
- find the k-th smallest element in the list
Example: the 3rd smallest element of the following list is 6
7 4 9 6 2
An O(n · log n) solution:
- sort the list (O(n · log n))
- pick the k-th element of the sorted list (O(1))
2 4 6 7 9
Can we find the k -th smallest element faster?
Quick-Select
Quick-select of the k-th smallest element in the list S:
- based on the prune-and-search paradigm
- Prune: pick a random element x (the pivot) from S and split S into:
  L, the elements < x; E, the elements == x; G, the elements > x

7 2 1 9 6 5 3 8        (pivot x = 5)
2 1 3   5   7 9 6 8
  L     E      G

- partitioning into L, E and G works precisely as for quick-sort
- Search:
  - if k ≤ |L| then return quickSelect(k, L)
  - if |L| < k ≤ |L| + |E| then return x
  - if k > |L| + |E| then return quickSelect(k − |L| − |E|, G)
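The prune and search steps above can be sketched in Python (an illustrative sketch; the slides call the procedure quickSelect, the rest of the naming is my own):

```python
import random

def quick_select(k, s):
    """Return the k-th smallest element (1-indexed) of the list s."""
    pivot = random.choice(s)                # Prune: pick a random pivot x
    smaller = [e for e in s if e < pivot]   # L: elements < x
    equal   = [e for e in s if e == pivot]  # E: elements == x
    greater = [e for e in s if e > pivot]   # G: elements > x
    if k <= len(smaller):                   # Search: answer lies in L
        return quick_select(k, smaller)
    if k <= len(smaller) + len(equal):      # answer is the pivot itself
        return pivot
    return quick_select(k - len(smaller) - len(equal), greater)
```

For instance, `quick_select(3, [7, 4, 9, 6, 2])` returns 6, matching the earlier example.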
Quick-Select Visualization
Quick-select can be displayed by a sequence of nodes:
- each node represents a recursive call and stores: k, the sequence, and the pivot element
k = 5, S = (7 4 9 3 2 6 5 1 8)
k = 2, S = (7 4 9 6 5 8)
k = 2, S = (7 4 6 5)
k = 1, S = (7 6 5)
found 5
Quick-Select: Running Time
The worst-case running time is O(n²):
- if the pivot is always the minimal or maximal element

The expected running time is O(n) (compare with quick-sort):
- with probability 1/2 the recursive call is good: the remaining subproblem has size at most (3/4) · n
- T(n) ≤ b · a · n + T((3/4) · n)
- a is the number of time steps for partitioning, per element
- b is the expected number of calls until a good call
- b = 2 (the average number of coin tosses until heads shows up)
- Unfolding the recurrence gives a geometric series:

T(n) ≤ 2 · a · n + T((3/4) · n)
     ≤ 2 · a · n + 2 · a · (3/4) · n + 2 · a · (3/4)² · n + ...
     = 8 · a · n ∈ O(n)   (geometric series: Σ (3/4)^i = 4)
Quick-Select: Median of Medians
We can do selection in O(n) worst-case time.
Idea: recursively use select itself to find a good pivot:
- divide S into ⌈n/5⌉ sets of (at most) 5 elements
- find the median of each set (a baby median)
- recursively use select to find the median of the medians
[Figure: the list arranged as ⌈n/5⌉ sorted columns of 5 elements; the baby medians form the middle row, and the shaded regions mark the minimum sizes of L and G.]
The minimal size of L and G is 0.3 · n.
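The construction described above can be sketched in Python (an illustrative sketch of the slides' description; the function names select and median_of_medians are my own):

```python
def median_of_medians(s):
    """Choose a good pivot deterministically: the median of the baby medians."""
    if len(s) <= 5:
        return sorted(s)[len(s) // 2]
    # divide s into groups of (at most) 5 and take each group's median
    babies = [sorted(s[i:i + 5])[len(s[i:i + 5]) // 2]
              for i in range(0, len(s), 5)]
    # recursively use select itself to find the median of the medians
    return select(len(babies) // 2 + 1, babies)

def select(k, s):
    """k-th smallest element (1-indexed) of s, worst-case linear time."""
    if len(s) == 1:
        return s[0]
    pivot = median_of_medians(s)
    smaller = [e for e in s if e < pivot]   # L
    equal   = [e for e in s if e == pivot]  # E
    greater = [e for e in s if e > pivot]   # G
    if k <= len(smaller):
        return select(k, smaller)
    if k <= len(smaller) + len(equal):
        return pivot
    return select(k - len(smaller) - len(equal), greater)
```

The only change from randomized quick-select is the pivot choice; the guaranteed 0.3 · n / 0.7 · n split is what makes the worst case linear.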
Quick-Select: Median of Medians
We know:
- the minimal size of L and G is 0.3 · n
- thus the maximal size of L and G is 0.7 · n

Let b ∈ N such that:
- partitioning a list of size n takes at most b · n time,
- finding the baby medians takes at most b · n time,
- the base case n ≤ 1 takes at most b time.

We derive a recurrence equation for the time complexity:

T(n) = b                                        if n ≤ 1
T(n) = T(0.7 · n) + T(0.2 · n) + 2 · b · n      if n > 1

(the T(0.2 · n) term is the recursive select on the n/5 baby medians)
We will see how to solve recurrence equations. . .
Fundamental Techniques
We have considered algorithms for solving special problems.
Now we consider a few fundamental techniques:
- Divide-and-Conquer
- The Greedy Method
- Dynamic Programming
Divide-and-Conquer
Divide-and-Conquer is a general algorithm design paradigm:
- Divide: divide the input S into k ≥ 2 disjoint subsets S1, ..., Sk
- Recur: recursively solve the subproblems S1, ..., Sk
- Conquer: combine the solutions for S1, ..., Sk into a solution for S
(the base cases of the recursion are problems of size 0 or 1)

We have already seen examples:
- merge-sort
- quick-sort
- bottom-up heap construction

Focus now: analysing time complexity by recurrence equations.
Methods for Solving Recurrence Equations
We consider 4 methods for solving recurrence equations:
- Iterative substitution method,
- Recursion tree method,
- Guess-and-test method, and
- Master method.
Iterative Substitution
The iterative substitution technique works as follows:
- iteratively apply the recurrence equation to itself
- hope to find a pattern
Example
T(n) = 1              if n ≤ 1
T(n) = T(n − 1) + 2   if n > 1
We start with T (n) and apply the recursive case:
T(n) = T(n − 1) + 2
     = T(n − 2) + 4
     = T(n − k) + 2 · k
For k = n − 1 we reach the base case T (n − k) = 1 thus:
T (n) = 1 + 2 · (n − 1) = 2 · n − 1
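The closed form can be sanity-checked by evaluating the recurrence directly (a small sketch):

```python
def T(n):
    """The recurrence T(n) = T(n - 1) + 2 with T(n) = 1 for n <= 1."""
    return 1 if n <= 1 else T(n - 1) + 2

# the pattern found by iterative substitution: T(n) = 2n - 1
for n in range(1, 100):
    assert T(n) == 2 * n - 1
```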
Merge-Sort Review
Merge-sort of a list S with n elements works as follows:
- Divide: divide S into two lists S1, S2 of ≈ n/2 elements
- Recur: recursively sort S1, S2
- Conquer: merge S1 and S2 into a sorting of S

Let b ∈ N such that:
- merging two lists of size n/2 takes at most b · n time, and
- the base case n ≤ 1 takes at most b time

We obtain the following recurrence equation for merge-sort:

T(n) = b                     if n ≤ 1
T(n) = 2 · T(n/2) + b · n    if n > 1
We search for a closed solution of the equation, that is:
- T(n) = ...   where T(n) does not occur on the right side
Example: Merge-Sort
T(n) = b                     if n ≤ 1
T(n) = 2 · T(n/2) + b · n    if n > 1

We assume that n is a power of 2: n = 2^k (that is, k = log2 n)
- allowed since we are interested in the asymptotic behaviour

We start with T(n) and apply the recursive case:

T(n) = 2 · T(n/2) + b · n
     = 2 · (2 · T(n/2²) + b · (n/2)) + b · n
     = 2² · T(n/2²) + 2 · b · n
     = 2³ · T(n/2³) + 3 · b · n
     ...
     = 2^k · T(n/2^k) + k · b · n
     = n · b + (log2 n) · b · n
Thus T (n) = b · (n + n · log2 n) ∈ O(n log2 n).
The Recursion Tree Method
The recursion tree method is a visual approach:
- draw the recursion tree and hope to find a pattern

T(n) = b                     if n ≤ 2
T(n) = 3 · T(n/3) + b · n    if n > 2

For a node with input size k, the work at this node is b · k.

depth   nodes   size
0       1       n
1       3       n/3
i       3^i     n/3^i

Thus the work at depth i is 3^i · b · (n/3^i) = b · n.
The height of the tree is log3 n. Thus T(n) is O(n · log3 n).
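The pattern read off the tree can be checked numerically on powers of 3 (a sketch, taking b = 1 for simplicity):

```python
def T(n, b=1):
    """The recurrence T(n) = 3*T(n/3) + b*n with T(n) = b for n <= 2."""
    return b if n <= 2 else 3 * T(n // 3, b) + b * n

# each of the log3(n) levels contributes b*n work, plus b*n for the leaves:
for k in range(1, 10):
    n = 3 ** k
    assert T(n) == n * (k + 1)   # = b * n * (log3(n) + 1)
```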
Guess-and-Test Method
The guess-and-test method works as follows:
- we guess a solution (or an upper bound)
- we prove that the guess is correct by induction

Example

T(n) = 1                 if n ≤ 1
T(n) = T(n/2) + 2 · n    if n > 1

Guess: T(n) ≤ 2 · n
- for n = 1 it holds: T(n) = 1 ≤ 2 · 1
- for n > 1 we have:

T(n) = T(n/2) + 2 · n ≤ 2 · (n/2) + 2 · n = 3 · n

Wrong guess: we cannot make 3 · n smaller than or equal to 2 · n.
Example, continued
T(n) = 1                 if n ≤ 1
T(n) = T(n/2) + 2 · n    if n > 1

New guess: T(n) ≤ 4 · n
- for n = 1 it holds: T(n) = 1 ≤ 4 · 1
- for n > 1 we have:

T(n) = T(n/2) + 2 · n
     ≤ 4 · (n/2) + 2 · n   (by induction hypothesis)
     = 4 · n

This time the guess was good: 4 · n ≤ 4 · n. Thus T(n) ≤ 4 · n.
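The confirmed guess can also be checked numerically (a sketch, on powers of two):

```python
def T(n):
    """The recurrence T(n) = T(n/2) + 2n with T(n) = 1 for n <= 1."""
    return 1 if n <= 1 else T(n // 2) + 2 * n

# the proved bound T(n) <= 4n holds for every power of two we try:
for k in range(20):
    n = 2 ** k
    assert T(n) <= 4 * n
```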
Example: Quick-Select with Median of Medians
T(n) = b                                        if n ≤ 1
T(n) = T(0.7 · n) + T(0.2 · n) + 2 · b · n      if n > 1

Guess: T(n) ≤ 20 · b · n
- for n = 1 it holds: T(n) = b ≤ 20 · b · 1
- for n > 1 we have:

T(n) = T(0.7 · n) + T(0.2 · n) + 2 · b · n
     ≤ 0.7 · 20 · b · n + 0.2 · 20 · b · n + 2 · b · n   (by IH)
     = 0.9 · 20 · b · n + 2 · b · n = 18 · b · n + 2 · b · n
     = 20 · b · n

Thus the guess was good.
This shows that quick-select with median of medians is O(n).
Master Method
Many divide-and-conquer recurrence equations have the form:

T(n) = c                     if n < d
T(n) = a · T(n/b) + f(n)     if n ≥ d

Theorem (The Master Theorem)
1. if f(n) is O(n^(log_b a − ε)) for some ε > 0, then T(n) is Θ(n^(log_b a))
2. if f(n) is Θ(n^(log_b a) · log^k n), then T(n) is Θ(n^(log_b a) · log^(k+1) n)
3. if f(n) is Ω(n^(log_b a + ε)) for some ε > 0, then T(n) is Θ(f(n)), provided a · f(n/b) ≤ δ · f(n) for some δ < 1.
Master Method: Example 1
Example
T (n) = 4 · T (n/2) + n
Solution: log_b a = log2 4 = 2 and f(n) = n is O(n^(2−ε)), thus case 1 says T(n) is Θ(n²).
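For recurrences whose f(n) is a pure polynomial n^p (as in Examples 1, 4 and 5 below), the case analysis can be mechanised; for polynomial f the regularity condition of case 3 holds automatically. A small sketch, with a function name of my own:

```python
import math

def master_case_polynomial(a, b, p):
    """Classify T(n) = a*T(n/b) + n**p by the master theorem (polynomial f only)."""
    crit = math.log(a, b)            # the critical exponent log_b a
    if abs(p - crit) < 1e-9:         # case 2 with k = 0: f(n) is Theta(n^crit)
        return f"Theta(n^{crit:g} log n)"
    if p < crit:                     # case 1: f(n) is O(n^(crit - eps))
        return f"Theta(n^{crit:g})"
    return f"Theta(n^{p:g})"         # case 3: f(n) is Omega(n^(crit + eps))

print(master_case_polynomial(4, 2, 1))   # Example 1: prints "Theta(n^2)"
```

The tolerance around the critical exponent guards against floating-point noise in log_b a (e.g. log3 9).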
Master Method: Example 2
Example
T (n) = 2 · T (n/2) + n · log n
Solution: log_b a = log2 2 = 1 and f(n) = n · log n is Θ(n · log¹ n), thus case 2 says T(n) is Θ(n · log² n).
Master Method: Example 3
Example
T (n) = T (n/3) + n · log n
Solution: log_b a = log3 1 = 0 and f(n) = n · log n is Ω(n^(0+ε)), thus case 3 says T(n) is Θ(n · log n).
Master Method: Example 4
Example
T (n) = 8 · T (n/2) + n2
Solution: log_b a = log2 8 = 3 and f(n) = n² is O(n^(3−ε)), thus case 1 says T(n) is Θ(n³).
Master Method: Example 5
Example
T (n) = 9 · T (n/3) + n3
Solution: log_b a = log3 9 = 2 and f(n) = n³ is Ω(n^(2+ε)), thus case 3 says T(n) is Θ(n³).
Master Method: Example 6
Example
T(n) = T(n/2) + 1   (binary search)
Solution: log_b a = log2 1 = 0 and f(n) = 1 is Θ(n^0 · log⁰ n), thus case 2 says T(n) is Θ(log n).
Master Method: Example 7
Example
T(n) = 2 · T(n/2) + log n   (heap construction)
Solution: log_b a = log2 2 = 1 and f(n) = log n is O(n^(1−ε)), thus case 1 says T(n) is Θ(n).