CS 3343: Analysis of Algorithms Lecture 24: Graph searching, Topological sort.
Transcript of CS 3343: Analysis of Algorithms
04/19/23 1
CS 3343: Analysis of Algorithms
Review for final
04/19/23 2
Final Exam
• Closed book exam
• Coverage: the whole semester
• Cheat sheet: you are allowed one letter-size sheet, both sides
• Monday, Dec 16, 3:15 – 5:45pm
• Basic calculator (no graphing) allowed
04/19/23 3
Final Exam: Study Tips
• Study tips:
  – Study each lecture
  – Study the homework and homework solutions
  – Study the midterm exams
• Re-make your previous cheat sheets
04/19/23 4
Topics covered (1)
By reverse chronological order:
• Graph algorithms
  – Representations
  – MST (Prim's, Kruskal's)
  – Shortest path (Dijkstra's)
  – Running time analysis with different implementations
• Greedy algorithms
  – Unit-profit restaurant location problem
  – Fractional knapsack problem
  – Prim's and Kruskal's are also examples of greedy algorithms
  – How to show that certain greedy choices are optimal
04/19/23 5
Topics covered (2)
• Dynamic programming
  – LCS
  – Restaurant location problem
  – Shortest path problem on a grid
  – Other problems
  – How to define a recurrence and use dynamic programming to solve it
• Binary heap and priority queue
  – Heapify, buildHeap, insert, extractMax, changeKey
  – Running time
04/19/23 6
Topics covered (3)
• Order statistics
  – Rand-Select
  – Worst-case linear-time selection
  – Running time analysis
• Sorting algorithms
  – Insertion sort
  – Merge sort
  – Quick sort
  – Heap sort
  – Linear-time sorting: counting sort, radix sort
  – Stability of sorting algorithms
  – Worst-case and expected running time analysis
  – Memory requirement of sorting algorithms
04/19/23 7
Topics covered (4)
• Analysis
  – Order of growth
  – Asymptotic notation, basic definitions
    • Limit method
    • L'Hôpital's rule
    • Stirling's formula
  – Best case, worst case, average case
• Analyzing non-recursive algorithms
  – Arithmetic series
  – Geometric series
• Analyzing recursive algorithms
  – Defining the recurrence
  – Solving the recurrence
    • Recursion tree (iteration) method
    • Substitution method
    • Master theorem
04/19/23 8
Review for finals
• In chronological order
• Only the more important concepts
  – Very likely to appear in your final
• Not meant to be exhaustive
04/19/23 9
Asymptotic notations
• O: Big-Oh
• Ω: Big-Omega
• Θ: Theta
• o: Small-oh
• ω: Small-omega
• Intuitively:
  – O is like ≤, o is like <
  – Ω is like ≥, ω is like >
  – Θ is like =
04/19/23 10
Big-Oh
• Math:
  – O(g(n)) = {f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all n > n0}
  – Or: lim n→∞ g(n)/f(n) > 0 (if the limit exists)
• Engineering:
  – g(n) grows at least as fast as f(n)
  – g(n) is an asymptotic upper bound of f(n)
• Intuitively it is like f(n) ≤ g(n)
04/19/23 11
Big-Oh
• Claim: f(n) = 3n² + 10n + 5 ∈ O(n²)
• Proof:
  3n² + 10n + 5 ≤ 3n² + 10n² + 5n² when n > 1
                ≤ 18n² when n > 1
  Therefore,
  • Let c = 18 and n0 = 1
  • We have f(n) ≤ c·n² for all n > n0
  • By definition, f(n) ∈ O(n²)
04/19/23 12
Big-Omega
• Math:
  – Ω(g(n)) = {f(n): there exist positive constants c and n0 such that 0 ≤ c·g(n) ≤ f(n) for all n > n0}
  – Or: lim n→∞ f(n)/g(n) > 0 (if the limit exists)
• Engineering:
  – f(n) grows at least as fast as g(n)
  – g(n) is an asymptotic lower bound of f(n)
• Intuitively it is like g(n) ≤ f(n)
04/19/23 13
Big-Omega
• f(n) = n2 / 10 = Ω(n)
• Proof: f(n) = n2 / 10, g(n) = n– g(n) = n ≤ n2 / 10 = f(n) when n > 10
– Therefore, c = 1 and n0 = 10
04/19/23 14
Theta
• Math:
  – Θ(g(n)) = {f(n): there exist positive constants c1, c2, and n0 such that c1·g(n) ≤ f(n) ≤ c2·g(n) for all n > n0}
  – Or: lim n→∞ f(n)/g(n) = c, with 0 < c < ∞
  – Or: f(n) = O(g(n)) and f(n) = Ω(g(n))
• Engineering:
  – f(n) grows in the same order as g(n)
  – g(n) is an asymptotically tight bound of f(n)
• Intuitively it is like f(n) = g(n)
• Θ(1) means constant time.
04/19/23 15
Theta
• Claim: f(n) = 2n2 + n = Θ (n2)
• Proof:– We just need to find three constants c1, c2,
and n0 such that
– c1n2 ≤ 2n2+n ≤ c2n2 for all n > n0
– A simple solution is c1 = 2, c2 = 3, and n0 = 1
04/19/23 16
Using limits to compare orders of growth
lim n→∞ f(n)/g(n) = 0          ⟹  f(n) ∈ o(g(n))
lim n→∞ f(n)/g(n) = c > 0      ⟹  f(n) ∈ Θ(g(n))
lim n→∞ f(n)/g(n) = ∞          ⟹  f(n) ∈ ω(g(n))
lim n→∞ f(n)/g(n) < ∞          ⟹  f(n) ∈ O(g(n))
lim n→∞ f(n)/g(n) > 0          ⟹  f(n) ∈ Ω(g(n))
04/19/23 17
• Compare 2ⁿ and 3ⁿ
• lim n→∞ 2ⁿ/3ⁿ = lim n→∞ (2/3)ⁿ = 0
• Therefore, 2ⁿ ∈ o(3ⁿ), and 3ⁿ ∈ ω(2ⁿ)
04/19/23 18
L'Hôpital's rule
If both lim n→∞ f(n) and lim n→∞ g(n) go to ∞, then
lim n→∞ f(n)/g(n) = lim n→∞ f′(n)/g′(n)
04/19/23 19
• Compare n^0.5 and log n
• lim n→∞ n^0.5 / log n = ?
• (n^0.5)′ = 0.5·n^(−0.5)
• (log n)′ = 1/n
• lim n→∞ (0.5·n^(−0.5)) / (1/n) = lim n→∞ 0.5·n^0.5 = ∞
• Therefore, log n ∈ o(n^0.5)
04/19/23 20
Stirling's formula
n! ≈ √(2πn) · (n/e)ⁿ
i.e., n! = Θ(n^(n+1/2) · e^(−n)), up to a constant factor
04/19/23 21
• Compare 2ⁿ and n!
• lim n→∞ 2ⁿ / n! = lim n→∞ 2ⁿ / (c·n^(n+1/2)·e^(−n)) = lim n→∞ (1/c)·n^(−1/2)·(2e/n)ⁿ = 0
• Therefore, 2ⁿ = o(n!)
04/19/23 22
More advanced dominance ranking
04/19/23 23
General plan for analyzing time efficiency of a non-recursive algorithm
• Decide parameter (input size)
• Identify most executed line (basic operation)
• worst-case = average-case?
• T(n) = i ti
• T(n) = Θ (f(n))
04/19/23 24
Analysis of insertion sort
Statement                                   cost   times
InsertionSort(A, n) {
  for j = 2 to n {                           c1     n
    key = A[j]                               c2     n−1
    i = j − 1                                c3     n−1
    while (i > 0) and (A[i] > key) {         c4     S
      A[i+1] = A[i]                          c5     S−(n−1)
      i = i − 1                              c6     S−(n−1)
    }
    A[i+1] = key                             c7     n−1
  }
}
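For reference, a minimal runnable version of the insertion sort pseudocode above (Python, 0-based indexing instead of the 1-based pseudocode):

def insertion_sort(a):
    # Sort list a in place; mirrors the pseudocode above, but 0-based.
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Shift elements of the sorted prefix that are larger than key.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a

# Example: insertion_sort([5, 2, 4, 6, 1, 3]) -> [1, 2, 3, 4, 5, 6]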
04/19/23 25
Best case
• Array already sorted
(Diagram: A[1..i] is the sorted prefix; A[j] is the key.)
Inner loop stops when A[i] ≤ key, or i = 0
S = Σ_{j=2}^{n} 1 = n − 1 = Θ(n)
04/19/23 26
Worst case
• Array originally in reverse order
(Diagram: A[1..i] is the sorted prefix; A[j] is the key.)
Inner loop stops when A[i] ≤ key
S = Σ_{j=2}^{n} j = 2 + 3 + … + n = n(n+1)/2 − 1 = Θ(n²)
04/19/23 27
Average case
• Array in random order
(Diagram: A[1..i] is the sorted prefix; A[j] is the key.)
Inner loop stops when A[i] ≤ key
E[S] = Σ_{j=2}^{n} j/2 = (1/2)·Σ_{j=2}^{n} j = n(n+1)/4 − 1/2 = Θ(n²)
04/19/23 28
Find the order of growth for sums
• How to find out the actual order of growth?– Remember some formulas– Learn how to guess and prove
T(n) = Σ_{j=1}^{n} j = Θ(n²)
T(n) = Σ_{j=1}^{n} log(j) = ?
T(n) = Σ_{j=1}^{n} 2^j = ?
T(n) = Σ_{j=1}^{n} j·2^j = ?
…
04/19/23 29
Arithmetic series
• An arithmetic series is a sequence of numbers such that the difference of any two successive members of the sequence is a constant.
e.g.: 1, 2, 3, 4, 5
or 10, 12, 14, 16, 18, 20• In general:
Recursive definition:  a_j = a_{j−1} + d
Closed form (explicit formula):  a_j = a_1 + (j − 1)·d
04/19/23 30
Sum of arithmetic series
If a1, a2, …, an is an arithmetic series, then
Σ_{i=1}^{n} a_i = n·(a_1 + a_n) / 2
04/19/23 31
Geometric series
• A geometric series is a sequence of numbers such that the ratio between any two successive members of the sequence is a constant.
e.g.: 1, 2, 4, 8, 16, 32
or 10, 20, 40, 80, 160
or 1, ½, ¼, 1/8, 1/16
• In general:
  Recursive definition:  a_j = a_{j−1}·r
  Closed form (explicit formula):  a_j = a_1·r^(j−1), or equivalently a_j = a_0·r^j
04/19/23 32
Sum of geometric series
Σ_{i=0}^{n} r^i = (1 − r^(n+1)) / (1 − r)    if r < 1
Σ_{i=0}^{n} r^i = (r^(n+1) − 1) / (r − 1)    if r > 1
Σ_{i=0}^{n} r^i = n + 1                       if r = 1

Examples:
lim n→∞ Σ_{i=0}^{n} (1/2)^i = lim n→∞ (1 − (1/2)^(n+1)) / (1 − 1/2) = 2
Σ_{i=0}^{n} 2^i = 2^(n+1) − 1 = Θ(2ⁿ)
04/19/23 33
Important formulas
Σ_{i=1}^{n} 1 = n = Θ(n)
Σ_{i=1}^{n} i = n(n+1)/2 = Θ(n²)
Σ_{i=1}^{n} i² = n(n+1)(2n+1)/6 = Θ(n³)
Σ_{i=1}^{n} i^k = Θ(n^(k+1))
Σ_{i=0}^{n} r^i = (r^(n+1) − 1)/(r − 1) = Θ(rⁿ)   (r ≠ 1)
Σ_{i=1}^{n} i·2^i = (n − 1)·2^(n+1) + 2 = Θ(n·2ⁿ)
Σ_{i=1}^{n} lg i = Θ(n lg n)
Remember them, or remember where to find them!
04/19/23 34
Sum manipulation rules
Σ_{i=m}^{n} a_i = Σ_{i=m}^{x} a_i + Σ_{i=x+1}^{n} a_i
Σ_{i} c·a_i = c·Σ_{i} a_i
Σ_{i} (a_i + b_i) = Σ_{i} a_i + Σ_{i} b_i
Example:
Σ_{i=1}^{n} (4 + 2^i) = Σ_{i=1}^{n} 4 + Σ_{i=1}^{n} 2^i = 4n + (2^(n+1) − 2) = Θ(2ⁿ)
04/19/23 35
Recursive algorithms
• General idea:
  – Divide a large problem into smaller ones
    • By a constant ratio
    • By a constant or some variable
  – Solve each smaller one recursively or explicitly
  – Combine the solutions of smaller ones to form a solution for the original problem
⟹ Divide and Conquer
04/19/23 36
How to analyze the time-efficiency of a recursive algorithm?
• Express the running time on input of size n as a function of the running time on smaller problems
04/19/23 37
Analyzing merge sort
MERGE-SORT A[1 . . n]
1. If n = 1, done.                                        Θ(1)
2. Recursively sort A[1 . . ⌈n/2⌉] and A[⌈n/2⌉+1 . . n].   2T(n/2)
3. "Merge" the 2 sorted lists.                             f(n)
Sloppiness: should be T(⌈n/2⌉) + T(⌊n/2⌋), but it turns out not to matter asymptotically.
04/19/23 38
Analyzing merge sort
1. Divide: Trivial.
2. Conquer: Recursively sort 2 subarrays.
3. Combine: Merge two sorted subarrays
T(n) = 2·T(n/2) + f(n) + Θ(1)
(2 = # subproblems; n/2 = subproblem size; f(n) = work dividing and combining)
1. What is the time for the base case?  Constant.
2. What is f(n)?  The merging work, Θ(n).
3. What is the growth order of T(n)?
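A minimal runnable merge sort (Python), matching the recurrence T(n) = 2T(n/2) + Θ(n); a sketch, not the lecture's exact pseudocode:

def merge_sort(a):
    # Base case: a list of length 0 or 1 is already sorted (Θ(1)).
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])    # T(n/2)
    right = merge_sort(a[mid:])   # T(n/2)
    # Merge the two sorted halves in Θ(n) time.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

# Example: merge_sort([5, 2, 4, 6, 1, 3]) -> [1, 2, 3, 4, 5, 6]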
04/19/23 39
Solving recurrence
• Running time of many algorithms can be expressed in one of the following two recursive forms
T(n) = a·T(n/b) + f(n)
or
T(n) = a·T(n − b) + f(n)
Challenge: how to solve the recurrence to get a closed form, e.g. T(n) = Θ (n2) or T(n) = Θ(nlgn), or at least some bound such as T(n) = O(n2)?
04/19/23 40
Solving recurrence
1. Recurrence tree (iteration) method- Good for guessing an answer
2. Substitution method- Generic method, rigid, but may be hard
3. Master method- Easy to learn, useful in limited cases only
- Some tricks may help in other cases
04/19/23 41
The master method
The master method applies to recurrences of the form
T(n) = a T(n/b) + f (n) ,
where a ≥ 1, b > 1, and f is asymptotically positive.
1. Divide the problem into a subproblems, each of size n/b
2. Conquer the subproblems by solving them recursively.
3. Combine subproblem solutions
Divide + combine takes f(n) time.
04/19/23 42
Master theorem
T(n) = a·T(n/b) + f(n)
CASE 1: f(n) = O(n^(log_b a − ε))                                   ⟹  T(n) = Θ(n^(log_b a)).
CASE 2: f(n) = Θ(n^(log_b a))                                        ⟹  T(n) = Θ(n^(log_b a) · log n).
CASE 3: f(n) = Ω(n^(log_b a + ε)) and a·f(n/b) ≤ c·f(n), c < 1       ⟹  T(n) = Θ(f(n)).
Key: compare f(n) with n^(log_b a)
e.g., merge sort: T(n) = 2T(n/2) + Θ(n); a = 2, b = 2, n^(log_b a) = n
⟹ CASE 2 ⟹ T(n) = Θ(n log n).
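Not from the lecture, but as a quick self-check for recurrences of the special form T(n) = a·T(n/b) + Θ(n^k), a small hypothetical helper (Python) can report which case applies (for polynomial f the Case 3 regularity condition holds automatically):

import math

def master_case(a, b, k):
    # Classify T(n) = a*T(n/b) + Theta(n^k) by comparing k with log_b(a).
    crit = math.log(a, b)
    if math.isclose(k, crit):
        return "Case 2: T(n) = Theta(n^%g log n)" % k
    if k < crit:
        return "Case 1: T(n) = Theta(n^%g)" % crit
    return "Case 3: T(n) = Theta(n^%g)" % k

# Examples from the following slides:
# master_case(2, 2, 1) -> Case 2 (merge sort)
# master_case(4, 2, 1) -> Case 1, Theta(n^2)
# master_case(1, 2, 1) -> Case 3, Theta(n)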
04/19/23 43
Case 1
Compare f(n) with n^(log_b a):
f(n) = O(n^(log_b a − ε)) for some constant ε > 0.
⟹ f(n) grows polynomially slower than n^(log_b a) (by an n^ε factor).
Solution: T(n) = Θ(n^(log_b a)), i.e., a·T(n/b) dominates
e.g. T(n) = 2T(n/2) + 1
T(n) = 4 T(n/2) + n
T(n) = 2T(n/2) + log n
T(n) = 8T(n/2) + n2
04/19/23 44
Case 3
Compare f(n) with n^(log_b a):
f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0.
⟹ f(n) grows polynomially faster than n^(log_b a) (by an n^ε factor).
Solution: T(n) = Θ(f(n)), i.e., f(n) dominates
e.g. T(n) = T(n/2) + n
T(n) = 2 T(n/2) + n2
T(n) = 4T(n/2) + n3
T(n) = 8T(n/2) + n4
04/19/23 45
Case 2
Compare f(n) with n^(log_b a):
f(n) = Θ(n^(log_b a)).
⟹ f(n) and n^(log_b a) grow at a similar rate.
Solution: T(n) = Θ(n^(log_b a) · log n)
e.g. T(n) = T(n/2) + 1
T(n) = 2 T(n/2) + n
T(n) = 4T(n/2) + n2
T(n) = 8T(n/2) + n3
04/19/23 46
Recursion tree
Solve T(n) = 2T(n/2) + dn, where d > 0 is constant.
(Figure, built level by level on the original slides: the root costs dn; level 1 has two nodes costing dn/2 each; level 2 has four nodes costing dn/4 each; the leaves are Θ(1).)
Height: h = log n
Each level sums to dn
#leaves = n ⟹ the leaf level contributes Θ(n)
Total = Θ(n log n)
04/19/23 57
Substitution method
1. Guess the form of the solution:(e.g. using recursion trees, or expansion)
2. Verify by induction (inductive step).
The most general method to solve a recurrence (prove O and Ω separately):
04/19/23 58
Proof by substitution
• Recurrence: T(n) = 2T(n/2) + n.
• Guess: T(n) = O(n log n) (e.g., by the recursion-tree method).
• To prove, we have to show T(n) ≤ c·n·log n for some c > 0 and for all n > n0.
• Proof by induction: assume it is true for T(n/2); prove that it is also true for T(n). This means:
  • Fact: T(n) = 2T(n/2) + n
  • Assumption: T(n/2) ≤ (c·n/2)·log(n/2)
  • Need to prove: T(n) ≤ c·n·log(n)
04/19/23 59
Proof
• Fact: T(n) = 2T(n/2) + n
• Assumption: T(n/2) ≤ (c·n/2)·log(n/2)
• Need to prove: T(n) ≤ c·n·log(n)
• Proof: substitute T(n/2) into the recurrence:
  T(n) = 2T(n/2) + n ≤ c·n·log(n/2) + n
  ⟹ T(n) ≤ c·n·log n − c·n + n
  ⟹ T(n) ≤ c·n·log n (if we choose c ≥ 1).
04/19/23 60
Proof by substitution
• Recurrence: T(n) = 2T(n/2) + n.
• Guess: T(n) = Ω(n log n).
• To prove, we have to show T(n) ≥ c·n·log n for some c > 0 and for all n > n0.
• Proof by induction: assume it is true for T(n/2); prove that it is also true for T(n). This means:
  • Fact: T(n) = 2T(n/2) + n
  • Assumption: T(n/2) ≥ (c·n/2)·log(n/2)
  • Need to prove: T(n) ≥ c·n·log(n)
04/19/23 61
Proof
• Fact: T(n) = 2T(n/2) + n
• Assumption: T(n/2) ≥ (c·n/2)·log(n/2)
• Need to prove: T(n) ≥ c·n·log(n)
• Proof: substitute T(n/2) into the recurrence:
  T(n) = 2T(n/2) + n ≥ c·n·log(n/2) + n
  ⟹ T(n) ≥ c·n·log n − c·n + n
  ⟹ T(n) ≥ c·n·log n (if we choose c ≤ 1).
04/19/23 62
Quick sort
Quicksort an n-element array:
1. Divide: Partition the array into two subarrays around a pivot x such that elements in the lower subarray ≤ x ≤ elements in the upper subarray.
2. Conquer: Recursively sort the two subarrays.
3. Combine: Trivial.
(Diagram: [ elements ≤ x | x | elements ≥ x ])
Key: Linear-time partitioning subroutine.
04/19/23 63
Partition
• All the action takes place in the partition() function
  – Rearranges the subarray in place
  – End result: two subarrays
    • All values in the first subarray ≤ all values in the second
  – Returns the index of the "pivot" element separating the two subarrays
(Diagram: A[p..q−1] ≤ x, A[q] = x, A[q+1..r] ≥ x)
04/19/23 64
Partition Code
Partition(A, p, r)
  x = A[p]          // pivot is the first element
  i = p
  j = r + 1
  while (TRUE) {
    repeat i++ until A[i] > x or i >= j
    repeat j-- until A[j] < x or j < i
    if (i < j) Swap(A[i], A[j])
    else break
  }
  swap(A[p], A[j])
  return j
What is the running time of partition()?
partition() runs in O(n) time
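A compact runnable quicksort (Python). Note this sketch uses the simpler Lomuto-style partition (pivot = last element) rather than the Hoare-style pseudocode above, so the index handling differs; the asymptotic behavior is the same:

def partition(a, p, r):
    # Lomuto partition: pivot is the last element; returns its final index.
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1

def quicksort(a, p=0, r=None):
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)    # O(n) partitioning
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)
    return a

# Example: quicksort([6, 10, 5, 8, 13, 3, 2, 11]) -> [2, 3, 5, 6, 8, 10, 11, 13]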
04/19/23 65
x = 6 (pivot)
 p                         r
 6 10 5 8 13 3 2 11     (i and j scan toward each other)
 6 2 5 8 13 3 10 11     (swap 10 and 2)
 6 2 5 3 13 8 10 11     (swap 8 and 3)
 6 2 5 3 13 8 10 11     (i and j cross)
 3 2 5 6 13 8 10 11     (swap pivot A[p] with A[j]; q = position of 6)
04/19/23 66
6 10 5 8 11 3 2 13
3 2 5 6 11 8 10 13
2 3 5 6 8 10 11 13
2 3 5 6 10 8 11 13
2 3 5 6 8 10 11 13
04/19/23 67
Quicksort Runtimes
• Best-case runtime: Tbest(n) ∈ O(n log n)
• Worst-case runtime: Tworst(n) ∈ O(n²)
• Worse than mergesort? Why is it called quicksort then?
• Its average runtime: Tavg(n) ∈ O(n log n)
• Better yet, the expected runtime of randomized quicksort is O(n log n)
04/19/23 68
Randomized quicksort
• Randomly choose an element as the pivot
  – Every time we need to do a partition, throw a die to decide which element to use as the pivot
  – Each element has probability 1/n of being selected
Partition(A, p, r)
  d = random()                       // a random number between 0 and 1
  index = p + floor((r-p+1) * d)     // p <= index <= r
  swap(A[p], A[index])
  x = A[p]
  i = p
  j = r + 1
  while (TRUE) {
    …
  }
04/19/23 69
Running time of randomized quicksort
• The expected running time is an average of all cases
T(n) =
  T(0) + T(n−1) + dn      if 0 : n−1 split,
  T(1) + T(n−2) + dn      if 1 : n−2 split,
  …
  T(n−1) + T(0) + dn      if n−1 : 0 split
Taking the expectation over all (equally likely) splits:
E[T(n)] = (1/n)·Σ_{k=0}^{n−1} [T(k) + T(n−1−k)] + Θ(n) = Θ(n log n)
04/19/23 70
Heaps
• In practice, heaps are usually implemented as arrays:
16
14 10
8 7 9 3
2 4 1
16 14 10 8 7 9 3 2 4 1
04/19/23 71
Heaps
• To represent a complete binary tree as an array:
  – The root node is A[1]
  – Node i is A[i]
  – The parent of node i is A[i/2] (note: integer division)
  – The left child of node i is A[2i]
  – The right child of node i is A[2i + 1]
16
14 10
8 7 9 3
2 4 1
A = 16 14 10 8 7 9 3 2 4 1
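These index rules translate directly into code; a small sketch (Python, keeping the slide's 1-based indexing, so A[0] is unused):

def parent(i):
    return i // 2        # parent of node i

def left(i):
    return 2 * i         # left child of node i

def right(i):
    return 2 * i + 1     # right child of node i

# Example with the heap above (index 0 unused):
# A = [None, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
# left(2) == 4, so the left child of 14 is A[4] == 8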
04/19/23 72
The Heap Property
• Heaps also satisfy the heap property: A[Parent(i)] ≥ A[i] for all nodes i > 1
– In other words, the value of a node is at most the value of its parent
– The value of a node should be greater than or equal to both its left and right children
• And all of its descendents
– Where is the largest element in a heap stored?
04/19/23 73
Heap Operations: Heapify()
Heapify(A, i)
{ // precondition: subtrees rooted at l and r are heaps
  l = Left(i); r = Right(i);
  if (l <= heap_size(A) && A[l] > A[i])
    largest = l;
  else
    largest = i;
  if (r <= heap_size(A) && A[r] > A[largest])
    largest = r;                      // among A[l], A[i], A[r], which one is largest?
  if (largest != i) {                 // if there is a violation, fix it
    Swap(A, i, largest);
    Heapify(A, largest);
  }
} // postcondition: subtree rooted at i is a heap
04/19/23 74
Heapify() Example
16
4 10
14 7 9 3
2 8 1
16 4 10 14 7 9 3 2 8 1A =
04/19/23 75
Heapify() Example
16
4 10
14 7 9 3
2 8 1
16 10 14 7 9 3 2 8 1A = 4
04/19/23 76
Heapify() Example
16
4 10
14 7 9 3
2 8 1
16 10 7 9 3 2 8 1A = 4 14
04/19/23 77
Heapify() Example
16
14 10
4 7 9 3
2 8 1
16 14 10 7 9 3 2 8 1A = 4
04/19/23 78
Heapify() Example
16
14 10
4 7 9 3
2 8 1
16 14 10 7 9 3 2 1A = 4 8
04/19/23 79
Heapify() Example
16
14 10
8 7 9 3
2 4 1
16 14 10 8 7 9 3 2 1A = 4
04/19/23 80
Heapify() Example
16
14 10
8 7 9 3
2 4 1
16 14 10 8 7 9 3 2 4 1A =
04/19/23 81
Analyzing Heapify(): Formal
• T(n) ≤ T(2n/3) + Θ(1)
• By Case 2 of the Master Theorem, T(n) = O(lg n)
• Thus, Heapify() takes logarithmic time
04/19/23 82
Heap Operations: BuildHeap()
• We can build a heap in a bottom-up manner by running Heapify() on successive subarrays– Fact: for array of length n, all elements in range
A[n/2 + 1 .. n] are heaps (Why?)– So:
• Walk backwards through the array from n/2 to 1, calling Heapify() on each node.
• Order of processing guarantees that the children of node i are heaps when i is processed
04/19/23 83
BuildHeap()
// given an unsorted array A, make A a heap
BuildHeap(A)
{
heap_size(A) = length(A);
for (i = length[A]/2 downto 1)
    Heapify(A, i);
}
04/19/23 84
BuildHeap() Example
• Work through exampleA = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7}
4
1 3
2 16 9 10
14 8 7
04/19/23 85
4
1 3
2 16 9 10
14 8 7
04/19/23 86
4
1 3
14 16 9 10
2 8 7
04/19/23 87
4
1 10
14 16 9 3
2 8 7
04/19/23 88
4
16 10
14 7 9 3
2 8 1
04/19/23 89
16
14 10
8 7 9 3
2 4 1
04/19/23 90
Analyzing BuildHeap(): Tight
• To Heapify() a subtree takes O(h) time where h is the height of the subtree– h = O(lg m), m = # nodes in subtree– The height of most subtrees is small
• Fact: an n-element heap has at most ⌈n/2^(h+1)⌉ nodes of height h
• CLR 7.3 uses this fact to prove that BuildHeap() takes O(n) time
04/19/23 91
Heapsort Example
• Work through exampleA = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7}
4
1 3
2 16 9 10
14 8 7
4 1 3 2 16 9 10 14 8 7A =
04/19/23 92
Heapsort Example
• First: build a heap
16
14 10
8 7 9 3
2 4 1
16 14 10 8 7 9 3 2 4 1A =
04/19/23 93
Heapsort Example
• Swap last and first
1
14 10
8 7 9 3
2 4 16
1 14 10 8 7 9 3 2 4 16A =
04/19/23 94
Heapsort Example
• Last element sorted
1
14 10
8 7 9 3
2 4 16
1 14 10 8 7 9 3 2 4 16A =
04/19/23 95
Heapsort Example
• Restore heap on remaining unsorted elements
14
8 10
4 7 9 3
2 1 16 Heapify
14 8 10 4 7 9 3 2 1 16A =
04/19/23 96
Heapsort Example
• Repeat: swap new last and first
1
8 10
4 7 9 3
2 14 16
1 8 10 4 7 9 3 2 14 16A =
04/19/23 97
Heapsort Example
• Restore heap
10
8 9
4 7 1 3
2 14 16
10 8 9 4 7 1 3 2 14 16A =
04/19/23 98
Heapsort Example
• Repeat
9
8 3
4 7 1 2
10 14 16
9 8 3 4 7 1 2 10 14 16A =
04/19/23 99
Heapsort Example
• Repeat
8
7 3
4 2 1 9
10 14 16
8 7 3 4 2 1 9 10 14 16A =
04/19/23 100
Heapsort Example
• Repeat
1
2 3
4 7 8 9
10 14 16
1 2 3 4 7 8 9 10 14 16A =
04/19/23 101
Analyzing Heapsort
• The call to BuildHeap() takes O(n) time
• Each of the n - 1 calls to Heapify() takes O(lg n) time
• Thus the total time taken by HeapSort() = O(n) + (n − 1)·O(lg n) = O(n) + O(n lg n) = O(n lg n)
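A compact runnable heapsort (Python) that mirrors BuildHeap followed by repeated extractions; a sketch using 0-based indexing, so child indices differ from the 1-based slides:

def heapify(a, n, i):
    # Sift a[i] down within a[0:n] so the subtree rooted at i is a max-heap.
    largest, l, r = i, 2 * i + 1, 2 * i + 2
    if l < n and a[l] > a[largest]:
        largest = l
    if r < n and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        heapify(a, n, largest)

def heapsort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # BuildHeap: O(n)
        heapify(a, n, i)
    for end in range(n - 1, 0, -1):       # n-1 extractions, O(lg n) each
        a[0], a[end] = a[end], a[0]
        heapify(a, end, 0)
    return a

# Example: heapsort([4, 1, 3, 2, 16, 9, 10, 14, 8, 7]) -> [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]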
04/19/23 102
HeapExtractMax Example
16
14 10
8 7 9 3
2 4 1
16 14 10 8 7 9 3 2 4 1A =
04/19/23 103
HeapExtractMax Example
Swap first and last, then remove last
1
14 10
8 7 9 3
2 4 16
14 10 8 7 9 3 2 4 16A = 1
04/19/23 104
HeapExtractMax Example
Heapify
14
8 10
4 7 9 3
2 1
10 7 9 3 2 16A =
16
14 8 4 1
04/19/23 105
HeapChangeKey Example
Increase key
16
14 10
8 7 9 3
2 4 1
16 14 10 8 7 9 3 2 4 1A =
04/19/23 106
HeapChangeKey Example
Increase key
16
14 10
15 7 9 3
2 4 1
16 14 10 7 9 3 2 4 1A = 15
04/19/23 107
HeapChangeKey Example
Increase key
16
15 10
14 7 9 3
2 4 1
16 10 7 9 3 2 4 1A = 1415
04/19/23 108
HeapInsert Example
HeapInsert(A, 17)
16
14 10
8 7 9 3
2 4 1
16 14 10 8 7 9 3 2 4 1A =
04/19/23 109
HeapInsert Example
HeapInsert(A, 17)
16
14 10
8 7 9 3
2 4 1
16 14 10 8 7 9 3 2 4 1A =
-∞
-∞
-∞ makes it a valid heap
04/19/23 110
HeapInsert Example
HeapInsert(A, 17)
16
14 10
8 7 9 3
2 4 1
16 10 8 9 3 2 4 1A =
17
1714 7
Now call changeKey
04/19/23 111
HeapInsert Example
HeapInsert(A, 17)
17
16 10
8 14 9 3
2 4 1
17 10 8 9 3 2 4 1A =
7
716 14
04/19/23 112
• Heapify: Θ(log n)
• BuildHeap: Θ(n)
• HeapSort: Θ(n log n)
• HeapMaximum: Θ(1)
• HeapExtractMax: Θ(log n)
• HeapChangeKey: Θ(log n)
• HeapInsert: Θ(log n)
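The priority-queue operations above can be sketched on the same array representation (Python, 0-based; assumes the heapify helper from the heapsort sketch earlier):

def extract_max(a):
    # Swap root with last element, shrink the heap, and re-heapify: O(log n).
    a[0], a[-1] = a[-1], a[0]
    top = a.pop()
    heapify(a, len(a), 0)
    return top

def increase_key(a, i, key):
    # Raise a[i] to key and bubble it up toward the root: O(log n).
    assert key >= a[i]
    a[i] = key
    while i > 0 and a[(i - 1) // 2] < a[i]:
        a[i], a[(i - 1) // 2] = a[(i - 1) // 2], a[i]
        i = (i - 1) // 2

def heap_insert(a, key):
    # Append -infinity (a valid heap element) and raise it to key: O(log n).
    a.append(float("-inf"))
    increase_key(a, len(a) - 1, key)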
04/19/23 113
Counting sort
1. Initialize:
   for i = 1 to k
     do C[i] = 0
2. Count:
   for j = 1 to n
     do C[A[j]] = C[A[j]] + 1            ⊳ C[i] = |{key = i}|
3. Compute running sums:
   for i = 2 to k
     do C[i] = C[i] + C[i−1]             ⊳ C[i] = |{key ≤ i}|
4. Re-arrange:
   for j = n downto 1
     do B[C[A[j]]] = A[j]
        C[A[j]] = C[A[j]] − 1
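A runnable counting sort sketch (Python) for keys in 1..k, following the four loops above and preserving stability:

def counting_sort(A, k):
    n = len(A)
    C = [0] * (k + 1)                # 1. initialize counts (index 0 unused)
    for key in A:                    # 2. count occurrences of each key
        C[key] += 1
    for i in range(2, k + 1):        # 3. running sums: C[i] = #{keys <= i}
        C[i] += C[i - 1]
    B = [0] * n
    for j in range(n - 1, -1, -1):   # 4. place elements right to left (stable)
        B[C[A[j]] - 1] = A[j]
        C[A[j]] -= 1
    return B

# Example: counting_sort([4, 1, 3, 4, 3], k=4) -> [1, 3, 3, 4, 4]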
04/19/23 114
Counting sort
A:  4 1 3 4 3
B:  (empty)
    index: 1 2 3 4 5
C:  1 0 2 2        (counts, indices 1..4)
C′: 1 1 3 5        (running sums)
3. for i = 2 to k
     do C[i] = C[i] + C[i−1]    ⊳ C[i] = |{key ≤ i}|
04/19/23 115
Loop 4: re-arrange
A:  4 1 3 4 3
B:  _ _ 3 _ _       (A[5] = 3 placed at position C[3])
    index: 1 2 3 4 5
C:  1 1 3 5
C′: 1 1 3 5
4. for j = n downto 1
     do B[C[A[j]]] = A[j]
        C[A[j]] = C[A[j]] − 1
04/19/23 116
Analysis
1. for i = 1 to k
     do C[i] = 0                         Θ(k)
2. for j = 1 to n
     do C[A[j]] = C[A[j]] + 1            Θ(n)
3. for i = 2 to k
     do C[i] = C[i] + C[i−1]             Θ(k)
4. for j = n downto 1
     do B[C[A[j]]] = A[j]
        C[A[j]] = C[A[j]] − 1            Θ(n)
Total: Θ(n + k)
04/19/23 117
Stable sorting
Counting sort is a stable sort: it preserves the input order among equal elements.
A: 4 1 3 4 3
B: 1 3 3 4 4
Why is this important? What other algorithms have this property?
04/19/23 118
Radix sort
• Similar to sorting the address books
• Treat each digit as a key
• Start from the least significant bit
(Figure: a column of multi-digit numbers; digit positions run from most significant on the left to least significant on the right.)
04/19/23 119
Time complexity
• Sort each of the d digits by counting sort
• Total cost: d·(n + k)
  – k = 10
  – Total cost: Θ(dn)
• Partition the d digits into groups of 3
  – Total cost: (n + 10³)·d/3
• We work with binaries rather than decimals
  – Partition a binary number into groups of r bits
  – Total cost: (n + 2^r)·d/r
  – Choose r = log n
  – Total cost: dn / log n
  – Compare with dn log n
• Catch: faster than quicksort only when n is very large
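An LSD radix sort sketch (Python) that applies a stable bucket pass to each decimal digit, least significant first; digit count d and base k = 10 as on the slide:

def radix_sort(nums, d):
    # Sort non-negative integers with at most d decimal digits.
    for pos in range(d):                    # least significant digit first
        buckets = [[] for _ in range(10)]   # stable buckets for digits 0..9
        for x in nums:
            digit = (x // 10 ** pos) % 10
            buckets[digit].append(x)
        nums = [x for b in buckets for x in b]
    return nums

# Example: radix_sort([329, 457, 657, 839, 436, 720, 355], d=3)
# -> [329, 355, 436, 457, 657, 720, 839]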
04/19/23 120
Randomized selection algorithm
RAND-SELECT(A, p, q, i)              ⊳ i-th smallest of A[p . . q]
  if p = q and i > 1 then error!
  r = RAND-PARTITION(A, p, q)
  k = r − p + 1                       ⊳ k = rank(A[r])
  if i = k then return A[r]
  if i < k
    then return RAND-SELECT(A, p, r − 1, i)
    else return RAND-SELECT(A, r + 1, q, i − k)
(Diagram: A[p..r−1] ≤ A[r] ≤ A[r+1..q]; A[r] has rank k.)
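A runnable sketch of randomized selection (Python), using a Lomuto-style partition with a random pivot in place of RAND-PARTITION; i is a 1-based rank as on the slide:

import random

def rand_select(a, p, q, i):
    # Return the i-th smallest element of a[p..q] (1-based rank within the range).
    if p == q:
        return a[p]
    # Choose a random pivot and partition (Lomuto scheme) around it.
    r = random.randint(p, q)
    a[r], a[q] = a[q], a[r]
    x, s = a[q], p - 1
    for j in range(p, q):
        if a[j] <= x:
            s += 1
            a[s], a[j] = a[j], a[s]
    a[s + 1], a[q] = a[q], a[s + 1]
    pivot_pos = s + 1
    k = pivot_pos - p + 1              # rank of the pivot within a[p..q]
    if i == k:
        return a[pivot_pos]
    if i < k:
        return rand_select(a, p, pivot_pos - 1, i)
    return rand_select(a, pivot_pos + 1, q, i - k)

# Example: rand_select([7, 10, 5, 8, 11, 3, 2, 13], 0, 7, 6) -> 10 (the 6th smallest)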
04/19/23 121
Example
Select the i = 6th smallest:
  7 10 5 8 11 3 2 13      (pivot = 7)
Partition:
  3 2 5 7 11 8 10 13      ⟹ k = 4
Select the 6 − 4 = 2nd smallest recursively (in the upper part).
04/19/23 122
Complete example: select the 6th smallest element.
  7 10 5 8 11 3 2 13        i = 6
  3 2 5 7 11 8 10 13        k = 4; recurse on the upper part with i = 6 − 4 = 2
  11 8 10 13 → 10 8 11 13   k = 3; i = 2 < k, recurse on the lower part
  10 8 → 8 10               k = 2; i = 2 = k, the answer is 10
Note: here we always used the first element as the pivot to do the partition (instead of rand-partition).
04/19/23 123
Intuition for analysis
Lucky:
  T(n) = T(9n/10) + Θ(n)
  n^(log_{10/9} 1) = n⁰ = 1 ⟹ CASE 3 ⟹ T(n) = Θ(n)
Unlucky:
  T(n) = T(n − 1) + Θ(n) = Θ(n²)    (arithmetic series)
  Worse than sorting!
(All our analyses today assume that all elements are distinct.)
04/19/23 124
Running time of randomized selection
• For upper bound, assume ith element always falls in larger side of partition
• The expected running time is an average of all cases
T(n) ≤
  T(max(0, n−1)) + n      if 0 : n−1 split,
  T(max(1, n−2)) + n      if 1 : n−2 split,
  …
  T(max(n−1, 0)) + n      if n−1 : 0 split
Taking the expectation:
E[T(n)] ≤ (1/n)·Σ_{k=0}^{n−1} T(max(k, n−1−k)) + Θ(n) = Θ(n)
04/19/23 125
Worst-case linear-time selection
SELECT(i, n)
1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
3. Partition around the pivot x. Let k = rank(x).
4. if i = k then return x
   elseif i < k
     then recursively SELECT the i-th smallest element in the lower part
   else recursively SELECT the (i − k)-th smallest element in the upper part
   (Step 4 is the same as in RAND-SELECT.)
04/19/23 126
Developing the recurrence
(Same SELECT pseudocode as above, annotated with costs:)
  Step 1: Θ(n)
  Step 2: T(n/5)
  Step 3: Θ(n)
  Step 4: T(7n/10 + 3)
Total: T(n)
04/19/23 127
Solving the recurrence
T(n) = T(n/5) + T(7n/10 + 3) + Θ(n)
Assumption: T(k) ≤ c·k for all k < n
T(n) ≤ c·n/5 + c·(7n/10 + 3) + Θ(n)
     = 9c·n/10 + 3c + Θ(n)
     ≤ 19c·n/20 + Θ(n)       if n ≥ 60 (so that 3c ≤ c·n/20)
     ≤ c·n                    if c ≥ 20 and n ≥ 60
04/19/23 128
Elements of dynamic programming
• Optimal sub-structures
  – Optimal solutions to the original problem contain optimal solutions to sub-problems
• Overlapping sub-problems
  – Some sub-problems appear in many solutions
04/19/23 129
Two steps to dynamic programming
• Formulate the solution as a recurrence relation of solutions to subproblems.
• Specify an order to solve the subproblems so you always have what you need.
04/19/23 130
Optimal subpaths
• Claim: if a path start→goal is optimal, any sub-path start→x, x→goal, or x→y, where x and y are on the optimal path, is also a shortest path.
• Proof by contradiction:
  – If the subpath between x and y is not the shortest, we can replace it with a shorter one, which reduces the total length of the new path ⟹ the optimal path from start to goal is not the shortest ⟹ contradiction!
  – Hence, the subpath x→y must be the shortest among all paths from x to y
(Diagram: start —a→ x —b→ y —c→ goal, with an alternative x —b′→ y.)
a + b + c is shortest; if b′ < b then a + b′ + c < a + b + c
04/19/23 131
Dynamic programming illustration
(Figure: a grid with edge costs; S at the top-left, G at the bottom-right; each cell stores F(i, j), the shortest distance from S to that cell.)
F(i, j) = min { F(i−1, j) + dist(i−1, j, i, j),  F(i, j−1) + dist(i, j−1, i, j) }
04/19/23 132
Trace back
(Figure: the same grid; the optimal path is recovered by walking backwards from G, at each step choosing the predecessor that achieves the minimum.)
04/19/23 133
Longest Common Subsequence
• Given two sequences x[1 . . m] and y[1 . . n], find a longest subsequence common to them both.
x: A B C B D A B
y: B D C A B A
“a” not “the”
BCBA = LCS(x, y)
functional notation, but not a function
04/19/23 134
Optimal substructure
• Notice that the LCS problem has optimal substructure: parts of the final solution are solutions of subproblems.– If z = LCS(x, y), then any prefix of z is an LCS of a prefix of
x and a prefix of y.
• Subproblems: “find LCS of pairs of prefixes of x and y”
(Figure: x of length m, y of length n, and z = LCS(x, y); a prefix of z aligns with prefixes x[1..i] and y[1..j].)
04/19/23 135
Finding length of LCS
• Let c[i, j] be the length of LCS(x[1..i], y[1..j])=> c[m, n] is the length of LCS(x, y)
• If x[m] = y[n]c[m, n] = c[m-1, n-1] + 1
• If x[m] != y[n]c[m, n] = max { c[m-1, n], c[m, n-1] }
(Figure: x of length m and y of length n, compared at their last characters.)
04/19/23 136
DP Algorithm
• Key: find out the correct order to solve the sub-problems• Total number of sub-problems: m * n
c[i, j] = c[i−1, j−1] + 1                  if x[i] = y[j]
        = max{ c[i−1, j], c[i, j−1] }       otherwise
(Figure: the (m+1) × (n+1) table c[i, j], i = 0..m and j = 0..n, filled row by row.)
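A runnable LCS sketch (Python) implementing this recurrence bottom-up and then tracing back to recover one LCS (the traceback idea is described on a later slide):

def lcs(x, y):
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    # Trace back from c[m][n] to recover one LCS.
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1]); i -= 1; j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))

# Example: lcs("ABCB", "BDCAB") -> "BCB" (length 3)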
04/19/23 137
LCS Example
X = ABCB; m = |X| = 4
Y = BDCAB; n = |Y| = 5
Allocate array c[0..4, 0..5]; set row 0 and column 0 to 0:
  for i = 1 to m: c[i,0] = 0
  for j = 1 to n: c[0,j] = 0
Then fill the table row by row with
  if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1
  else            c[i,j] = max( c[i-1,j], c[i,j-1] )
Final table (rows i = 0..4 for A, B, C, B; columns j = 0..5 for B, D, C, A, B):
  0 0 0 0 0 0
  0 0 0 0 1 1
  0 1 1 1 1 2
  0 1 1 2 2 2
  0 1 1 2 2 3
The length of the LCS is c[4, 5] = 3.
04/19/23 147
LCS Algorithm Running Time
• LCS algorithm calculates the values of each entry of the array c[m,n]
• So what is the running time?
O(m*n)
since each c[i,j] is calculated in constant time, and there are m*n elements in the array
04/19/23 148
How to find actual LCS
• The algorithm just found the length of LCS, but not LCS itself.• How to find the actual LCS?• For each c[i,j] we know how it was acquired:
• A match happens only when the first equation is taken
• So we can start from c[m,n] and go backwards, remember x[i] whenever c[i,j] = c[i-1, j-1]+1.
(For example, if c[i−1, j−1] = 2 and x[i] = y[j], then c[i, j] = c[i−1, j−1] + 1 = 2 + 1 = 3.)
c[i, j] = c[i−1, j−1] + 1                  if x[i] = y[j]
        = max( c[i, j−1], c[i−1, j] )       otherwise
04/19/23 149
Finding LCS
(Figure: the filled c table with the traceback path highlighted; each diagonal step at a match contributes one letter.)
Time for trace back: O(m + n).
LCS (reversed order): B C B
LCS (straight order): B C B   (this string happens to be a palindrome)
04/19/23 151
LCS as a longest path problem
(Figure: an (m+1) × (n+1) grid graph; diagonal edges of weight 1 wherever characters match; the LCS length is the longest path from the top-left to the bottom-right corner.)
The resulting table of longest-path values:
  0 0 0 0 0 0
  0 0 0 0 1 1
  0 1 1 1 1 2
  0 1 1 2 2 2
  0 1 1 2 2 3
04/19/23 153
Restaurant location problem 1
• You work in the fast food business• Your company plans to open up new restaurants in
Texas along I-35
• Towns along the highway called t1, t2, …, tn
• A restaurant at ti has an estimated annual profit pi
• No two restaurants can be located within 10 miles of each other due to some regulation
• Your boss wants to maximize the total profit• You want a big bonus
10 mile
04/19/23 154
A DP algorithm
• Suppose you’ve already found the optimal solution
• It will either include tn or not include tn
• Case 1: tn not included in optimal solution
– Best solution same as best solution for t1 , …, tn-1
• Case 2: tn included in optimal solution
– Best solution is pn + best solution for t1 , …, tj , where j < n is the largest index so that dist(tj, tn) ≥ 10
04/19/23 155
Recurrence formulation
• Let S(i) be the total profit of the optimal solution when the first i towns are considered (not necessarily selected)– S(n) is the optimal solution to the complete problem
S(n) = max { S(n−1),  S(j) + pn }     where j < n and dist(tj, tn) ≥ 10
Generalize:
S(i) = max { S(i−1),  S(j) + pi }     where j < i and dist(tj, ti) ≥ 10
Number of sub-problems: n. Boundary condition: S(0) = 0.
Dependency: S(i) depends on S(i−1) and on S(j) for earlier j.
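A direct implementation sketch of this recurrence (Python; the input lists pos and profit are hypothetical names for town mile markers and profits):

def max_profit(pos, profit, min_dist=10):
    # pos[i] is the mile marker of town i, profit[i] its profit; both sorted by pos.
    n = len(pos)
    S = [0] * (n + 1)                    # S[i]: best profit using the first i towns
    for i in range(1, n + 1):
        best = S[i - 1]                  # case 1: town i not selected
        j = i - 1
        while j >= 1 and pos[i - 1] - pos[j - 1] < min_dist:
            j -= 1                       # rightmost town far enough to the left
        best = max(best, S[j] + profit[i - 1])   # case 2: town i selected
        S[i] = best
    return S[n]

# With the example on the next slide (gaps 5,2,2,6,6,6,3,10,7 and profits
# 6,7,9,8,3,3,2,4,12,5) the optimal total is 26.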
04/19/23 156
Example
Distance (mi):  5 2 2 6 6 6 3 10 7
Profit (100k):  6 7 9 8 3 3 2 4 12 5
S(i):           6 7 9 9 10 12 12 14 26 26
• Natural greedy 1: 6 + 3 + 4 + 12 = 25
• Natural greedy 2: 12 + 9 + 3 = 24
S(i) = max { S(i−1),  S(j) + pi }     where j < i and dist(tj, ti) ≥ 10
(A dummy town with S(0) = 0 serves as the boundary condition.)
Optimal: 26
04/19/23 157
Complexity
• Time: Θ(nk), where k is the maximum number of towns that are within 10 miles to the left of any town
  – In the worst case, Θ(n²)
  – Can be improved to Θ(n) with some preprocessing tricks
• Memory: Θ(n)
04/19/23 158
Knapsack problem
Three versions:
0-1 knapsack problem: take each item or leave it
Fractional knapsack problem: items are divisible
Unbounded knapsack problem: unlimited supplies of each item.
Which one is easiest to solve?
•Each item has a value and a weight•Objective: maximize value•Constraint: knapsack has a weight
limitation
We study the 0-1 problem today.
04/19/23 159
Formal definition (0-1 problem)
• Knapsack has weight limit W• Items labeled 1, 2, …, n (arbitrarily)
• Items have weights w1, w2, …, wn
– Assume all weights are integers
– For practical reason, only consider wi < W
• Items have values v1, v2, …, vn
• Objective: find a subset of items, S, such that Σ_{i∈S} wi ≤ W and Σ_{i∈S} vi is maximal among all such (feasible) subsets
04/19/23 160
A DP algorithm
• Suppose you've found the optimal solution S
• Case 1: item n is included
• Case 2: item n is not included
Total weight limit:W
wn
Total weight limit:W
Find an optimal solution using items 1, 2, …, n-1 with weight limit W - wn
wn
Find an optimal solution using items 1, 2, …, n-1 with weight limit W
04/19/23 161
Recursive formulation
• Let V[i, w] be the optimal total value when items 1, 2, …, i are considered for a knapsack with weight limit w
=> V[n, W] is the optimal solution
V[n, W] = max { V[n−1, W−wn] + vn,  V[n−1, W] }
Generalize:
V[i, w] = max { V[i−1, w−wi] + vi    (item i is taken),
                V[i−1, w]            (item i not taken) }
        = V[i−1, w]                   if wi > w (item i cannot be taken)
Boundary condition: V[i, 0] = 0, V[0, w] = 0. Number of sub-problems = ?
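A bottom-up sketch of this recurrence (Python); weights and values passed as parallel lists:

def knapsack_01(w, v, W):
    # V[i][j]: best value using items 1..i with weight limit j.
    n = len(w)
    V = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(W + 1):
            V[i][j] = V[i - 1][j]                       # item i not taken
            if w[i - 1] <= j:                           # item i fits
                V[i][j] = max(V[i][j], V[i - 1][j - w[i - 1]] + v[i - 1])
    return V[n][W]

# With the example on the next slides: weights (2,4,3,5,2,6), values (2,3,3,6,4,9),
# W = 10 -> optimal value 15 (items 6, 5, 1).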
04/19/23 162
Example
• n = 6 (# of items)
• W = 10 (weight limit)
• Items (weight, value): (2, 2), (4, 3), (3, 3), (5, 6), (2, 4), (6, 9)
04/19/23 163
(Figure: the (n+1) × (W+1) table V[i, w] for w = 0..10 and i = 0..6, with row 0 and column 0 initialized to 0. Each cell V[i, w] is computed from V[i−1, w] and V[i−1, w−wi]:)
V[i, w] = max { V[i−1, w−wi] + vi   (item i is taken),
                V[i−1, w]           (item i not taken) }
        = V[i−1, w]                  if wi > w (item i not taken)
04/19/23 164
(Figure: the completed DP table for w = 0..10 and items i = 1..6 with (wi, vi) = (2,2), (4,3), (3,3), (5,6), (2,4), (6,9); the bottom-right entry gives the optimal value V[6, 10] = 15.)
04/19/23 165
(Figure: the same table with the trace-back path marked.)
Items taken: 6, 5, 1
Weight: 6 + 2 + 2 = 10
Value: 9 + 4 + 2 = 15
Optimal value: 15
04/19/23 166
Time complexity
• Θ(nW)
• Polynomial?
  – Pseudo-polynomial
  – Works well if W is small
• Consider the following items (weight, value): (10, 5), (15, 6), (20, 5), (18, 6)
• Weight limit 35
  – Optimal solution: items 2, 4 (value = 12). Enumerating: 2⁴ = 16 subsets
  – Dynamic programming: fill up a 4 × 35 = 140-entry table
• What's the problem?
  – Many entries are unused: no such weight combination
  – Top-down may be better
04/19/23 167
Longest increasing subsequence
• Given a sequence of numbers: 1 2 5 3 2 9 4 9 3 5 6 8
• Find a longest subsequence that is non-decreasing
  – E.g. 1 2 5 9
  – It has to be a subsequence of the original list
  – It has to be in sorted order
  ⟹ It is a subsequence of the sorted list
Original list: 1 2 5 3 2 9 4 9 3 5 6 8
Sorted:        1 2 2 3 3 4 5 5 6 8 9 9
LCS:           1 2 3 4 5 6 8
04/19/23 168
Events scheduling problem
• A list of events to schedule (or shows to see)
  – ei has start time si and finishing time fi
  – Indexed such that fi < fj if i < j
• Each event has a value vi
• Schedule to make the largest value
  – You can attend only one event at any time
• Very similar to the restaurant location problem
  – Sort events according to their finish time
  – Consider: whether the last event is included or not
(Figure: events e1..e9 laid out on a time line, possibly overlapping.)
04/19/23 169
Events scheduling problem
(Figure: the same time line of events e1..e9, with s7, f7, s8, f8, s9, f9 marked.)
• V(i) is the optimal value that can be achieved when the first i events are considered
• V(n) = max { V(n−1)         (en not selected),
               V(j) + vn       (en selected) }
  where j < n and fj < sn
04/19/23 170
Coin change problem
• Given some denomination of coins (e.g., 2, 5, 7, 10), decide if it is possible to make change for a value (e.g, 13), or minimize the number of coins
• Version 1: Unlimited number of coins for each denomination– Unbounded knapsack problem
• Version 2: Use each denomination at most once– 0-1 Knapsack problem
04/19/23 171
Use DP algorithm to solve new problems
• Directly map a new problem to a known problem
• Modify an algorithm for a similar task
• Design your own
  – Think about the problem recursively
  – The optimal solution to a larger problem can be computed from the optimal solutions of one or more subproblems
  – These sub-problems can be solved in a certain manageable order
  – Works nicely for naturally ordered data such as strings, trees, and some special graphs
  – Trickier for general graphs
• The textbook has some very good exercises.
04/19/23 172
Unit-profit restaurant location problem
• Now the objective is to maximize the number of new restaurants (subject to the distance constraint)– In other words, we assume that each
restaurant makes the same profit, no matter where it is opened
10 mile
04/19/23 173
A DP Algorithm
• Exactly as before, but pi = 1 for all i
S(i) = max { S(i−1),  S(j) + pi }     where j < i and dist(tj, ti) ≥ 10
becomes
S(i) = max { S(i−1),  S(j) + 1 }      where j < i and dist(tj, ti) ≥ 10
04/19/23 174
Greedy algorithm for restaurant location problem
select t1
d = 0;
for (i = 2 to n)
d = d + dist(ti, ti-1);
if (d >= min_dist)
select ti
d = 0;
end
end
(Example: gaps 5, 2, 2, 6, 6, 6, 3, 10, 7 between towns; the running distance d resets to 0 each time a town is selected.)
04/19/23 175
Complexity
• Time: Θ(n)
• Memory: – Θ(n) to store the input– Θ(1) for greedy selection
04/19/23 176
Optimal substructure• Claim 1: if A = [m1, m2, …, mk] is the optimal solution to the
restaurant location problem for a set of towns [t1, …, tn]
– m1 < m2 < … < mk are indices of the selected towns
– Then B = [m2, m3, …, mk] is the optimal solution to the sub-problem [tj, …, tn], where tj is the first town that are at least 10 miles to the right of tm1
• Proof by contradiction: suppose B is not the optimal solution to the sub-problem, which means there is a better solution B′ to the sub-problem
  – Then A′ = m1 || B′ gives a better solution than A = m1 || B ⟹ A is not optimal ⟹ contradiction ⟹ B is optimal
(Diagram: A = [m1, m2, …, mk]; B is A without m1; B′ is an imagined better solution to the sub-problem.)
04/19/23 177
Greedy choice property
• Claim 2: for the uniform-profit restaurant location problem, there is an optimal solution that chooses t1
• Proof by contradiction: suppose that no optimal solution can be obtained by choosing t1
– Say the first town chosen by the optimal solution S is ti, i > 1
– Replace ti with t1 will not violate the distance constraint, and the total profit remains the same => S’ is an optimal solution
– Contradiction– Therefore claim 2 is valid
S
S’
04/19/23 178
Fractional knapsack problem
0-1 knapsack problem: take each item or leave it
Fractional knapsack problem: items are divisible
Unbounded knapsack problem: unlimited supplies of each item.
Which one is easiest to solve?
•Each item has a value and a weight•Objective: maximize value•Constraint: knapsack has a weight
limitation
We can solve the fractional knapsack problem using greedy algorithm
04/19/23 179
Greedy algorithm for fractional knapsack problem
• Compute the value/weight ratio for each item
• Sort items by their value/weight ratio into decreasing order
  – Call the remaining item with the highest ratio the most valuable item (MVI)
• Iteratively:
  – If the weight limit cannot be reached by adding the MVI, select the MVI
  – Otherwise select the MVI partially, up to the weight limit
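A greedy sketch for the fractional knapsack (Python), taking whole items in decreasing value/weight order and a fraction of the last one:

def fractional_knapsack(items, W):
    # items: list of (weight, value) pairs; W: weight limit.
    total = 0.0
    remaining = W
    # Consider the most valuable item (highest value/weight ratio) first.
    for weight, value in sorted(items, key=lambda it: it[1] / it[0], reverse=True):
        if remaining <= 0:
            break
        take = min(weight, remaining)        # whole item, or a fraction of it
        total += value * (take / weight)
        remaining -= take
    return total

# With the example below: items (2,2),(4,3),(3,3),(5,6),(2,4),(6,9), W = 10
# -> 4 + 9 + (2/5)*6 = 15.4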
04/19/23 180
Example
• Weight limit: 10
item   Weight (LB)   Value ($)   $ / LB
  1        2             2         1
  2        4             3         0.75
  3        3             3         1
  4        5             6         1.2
  5        2             4         2
  6        6             9         1.5
04/19/23 181
Example• Weight limit: 10
• Take item 5– 2 LB, $4
• Take item 6– 8 LB, $13
• Take 2 LB of item 4– 10 LB, 15.4
item Weight (LB)
Value ($)
$ / LB
5 2 4 2
6 6 9 1.5
4 5 6 1.2
1 2 2 1
3 3 3 1
2 4 3 0.75
04/19/23 182
Why is greedy algorithm for fractional knapsack problem valid?
• Claim: the optimal solution must contain the MVI as much as possible (either up to the weight limit or until the MVI is exhausted)
• Proof by contradiction: suppose that the optimal solution does not use all available MVI (i.e., there are still w (w < W) pounds of MVI left while we choose other items)
  – We can replace w pounds of less valuable items with MVI
  – The total weight is the same, but the value is higher than the "optimal"
  – Contradiction
04/19/23 183
Graphs
• A graph G = (V, E)
  – V = set of vertices
  – E = set of edges ⊆ V × V
  – Thus |E| = O(|V|²)
1
2 4
3
Vertices: {1, 2, 3, 4}
Edges: {(1, 2), (2, 3), (1, 3), (4, 3)}
04/19/23 184
Graphs: Adjacency Matrix
• Example:
1
2 4
3
A 1 2 3 4
1 0 1 1 0
2 0 0 1 0
3 0 0 0 0
4 0 0 1 0
How much storage does the adjacency matrix require? A: O(V²)
04/19/23 185
Graphs: Adjacency List
• Adjacency list: for each vertex v ∈ V, store a list of vertices adjacent to v
• Example:– Adj[1] = {2,3}– Adj[2] = {3}– Adj[3] = {}– Adj[4] = {3}
• Variation: can also keep a list of edges coming into vertex
1
2 4
3
04/19/23 186
Kruskal's algorithm: example
(Figure, built up over several slides: a graph on vertices a–h with edge weights, listed in increasing order:
 c-d: 3, b-f: 5, b-a: 6, f-e: 7, b-d: 8, f-g: 9, d-e: 10, a-f: 12, b-c: 14, e-h: 15.
Edges are considered in this order; an edge is added to the tree unless it would create a cycle. The resulting MST uses c-d, b-f, b-a, f-e, b-d, f-g, and e-h.)
04/19/23 197
Time complexity
• Depends on the implementation
• Pseudocode:
  sort all edges according to weights              Θ(m log m) = Θ(m log n)
  T = {}; tree(v) = v for all v
  for each edge (u, v)                              m edges
    if tree(u) != tree(v)
      T = T ∪ {(u, v)}
      union(tree(u), tree(v))                       time per edge: naïve Θ(n), better Θ(log n) using set union
Overall time complexity:
  Naïve: Θ(nm)
  Better implementation: Θ(m log n)
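A runnable Kruskal sketch (Python) with a simple union-find (path compression only), following the pseudocode above:

def kruskal(n, edges):
    # n: number of vertices (0..n-1); edges: list of (weight, u, v).
    parent = list(range(n))

    def find(x):                        # set representative, with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):       # consider edges in increasing weight order
        ru, rv = find(u), find(v)
        if ru != rv:                    # different trees: adding (u, v) makes no cycle
            parent[ru] = rv
            mst.append((u, v, w))
    return mst

# Example: kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)])
# -> [(0, 1, 1), (1, 2, 2), (2, 3, 4)]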
04/19/23 198
Prim's algorithm: example
(Figure, built up over several slides: the same graph on vertices a–h. Starting from c with key[c] = 0 and all other keys ∞, the algorithm repeatedly calls ExtractMin on the priority queue and then ChangeKey on the neighbors of the extracted vertex, until the queue is empty. The extraction order is c, d, b, f, a, e, g, h, with keys 0, 3, 8, 5, 6, 7, 9, 15 at extraction, yielding the same MST as Kruskal's algorithm.)
04/19/23 213
Complete Prim's Algorithm
MST-Prim(G, w, r)
  Q = V[G]
  for each u ∈ Q
    key[u] = ∞
  key[r] = 0
  T = {}
  while (Q not empty)
    u = ExtractMin(Q)                  // called Θ(n) times (n vertices)
    for each v ∈ Adj[u]
      if (v ∈ Q and w(u,v) < key[v])
        T = T ∪ {(u, v)}
        ChangeKey(v, w(u,v))           // called Θ(m) times in total (not Θ(n²))
Overall running time: Θ(m log n), where log n is the cost per ExtractMin/ChangeKey
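A runnable Prim sketch (Python) using heapq as the priority queue; instead of ChangeKey it pushes updated keys and skips stale entries, which gives the same Θ(m log n) bound:

import heapq

def prim(adj, r):
    # adj: {u: [(v, w), ...]} adjacency list of an undirected graph; r: start vertex.
    key = {u: float("inf") for u in adj}
    key[r] = 0
    in_tree = set()
    parent = {r: None}
    pq = [(0, r)]
    mst = []
    while pq:
        k, u = heapq.heappop(pq)           # ExtractMin
        if u in in_tree:
            continue                        # stale entry: u was already extracted
        in_tree.add(u)
        if parent[u] is not None:
            mst.append((parent[u], u, k))
        for v, w in adj[u]:
            if v not in in_tree and w < key[v]:
                key[v] = w                  # "ChangeKey": push the smaller key
                parent[v] = u
                heapq.heappush(pq, (w, v))
    return mst

# Example: prim({'a': [('b', 1), ('c', 4)], 'b': [('a', 1), ('c', 2)],
#                'c': [('a', 4), ('b', 2)]}, 'a') -> [('a', 'b', 1), ('b', 'c', 2)]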
04/19/23 214
Summary
• Kruskal's algorithm
  – Θ(m log n)
  – Possibly Θ(m + n log n) with counting sort
• Prim's algorithm
  – With a priority queue: Θ(m log n)
    • Assumes the graph is represented by an adjacency list
  – With a distance array: Θ(n²)
    • Adjacency list or adjacency matrix
  – For sparse graphs the priority queue wins
  – For dense graphs the distance array may be better
04/19/23 215
Dijkstra's algorithm
(Figure, built up over several slides: a weighted graph on vertices a–i with source e. Starting from key[e] = 0 and all other keys ∞, each ExtractMin finalizes the closest remaining vertex, and its neighbors are relaxed with key[u] + w(u,v). The successive distance arrays for (a, b, c, d, e, f, g, h, i) are:
  ∞ 14 7 5 0 ∞ ∞ ∞ ∞
  11 11 7 5 0 ∞ ∞ ∞ ∞
  9 11 7 5 0 ∞ ∞ ∞ ∞
  9 11 7 5 0 12 ∞ ∞ 17
  9 11 7 5 0 12 ∞ 20 17
  9 11 7 5 0 12 ∞ 19 17
  9 11 7 5 0 12 18 18 17
Final shortest distances from e: a = 9, b = 11, c = 7, d = 5, e = 0, f = 12, g = 18, h = 18, i = 17.)
04/19/23 224
Prim's Algorithm (for comparison)
MST-Prim(G, w, r)
  Q = V[G]
  for each u ∈ Q
    key[u] = ∞
  key[r] = 0
  T = {}
  while (Q not empty)
    u = ExtractMin(Q)
    for each v ∈ Adj[u]
      if (v ∈ Q and w(u,v) < key[v])
        T = T ∪ {(u, v)}
        ChangeKey(v, w(u,v))
Overall running time: Θ(m log n), where log n is the cost per ChangeKey
04/19/23 225
Dijkstra's Algorithm
Dijkstra(G, w, r)
  Q = V[G]
  for each u ∈ Q
    key[u] = ∞
  key[r] = 0
  T = {}
  while (Q not empty)
    u = ExtractMin(Q)
    for each v ∈ Adj[u]
      if (v ∈ Q and key[u] + w(u,v) < key[v])
        T = T ∪ {(u, v)}
        ChangeKey(v, key[u] + w(u,v))
Overall running time: Θ(m log n), where log n is the cost per ChangeKey
The running time of Dijkstra's algorithm is the same as Prim's algorithm.
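A runnable Dijkstra sketch (Python), again using heapq with lazy deletion in place of ChangeKey; the only change from the Prim sketch is relaxing with dist[u] + w instead of w alone:

import heapq

def dijkstra(adj, r):
    # adj: {u: [(v, w), ...]} with non-negative weights; r: source vertex.
    dist = {u: float("inf") for u in adj}
    dist[r] = 0
    done = set()
    pq = [(0, r)]
    while pq:
        d, u = heapq.heappop(pq)            # ExtractMin
        if u in done:
            continue                         # stale entry
        done.add(u)
        for v, w in adj[u]:
            if v not in done and d + w < dist[v]:
                dist[v] = d + w              # "ChangeKey" with key[u] + w(u,v)
                heapq.heappush(pq, (dist[v], v))
    return dist

# Example: dijkstra({'a': [('b', 1), ('c', 4)], 'b': [('a', 1), ('c', 2)],
#                    'c': [('a', 4), ('b', 2)]}, 'a') -> {'a': 0, 'b': 1, 'c': 3}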
04/19/23 226
Good luck with your final!