CSE 326: Data Structures Topic 18: The Algorhythmics

CSE 326: Data Structures Topic 18: The Algorhythmics Luke McDowell Summer Quarter 2003


Page 1: CSE 326: Data Structures Topic 18: The Algorhythmics

CSE 326: Data Structures

Topic 18: The Algorhythmics

Luke McDowell

Summer Quarter 2003

Page 2: CSE 326: Data Structures Topic 18: The Algorhythmics

Today’s Outline

• Greedy

• Divide & Conquer

• Dynamic Programming

Page 3: CSE 326: Data Structures Topic 18: The Algorhythmics

Greedy Algorithms

Repeat until problem is solved:

– Consider possible next steps

– Choose best-looking alternative and commit to it

Greedy algorithms are normally fast and simple.

They do not always find the optimal solution, but they are sometimes appropriate as a heuristic or as an approximation to the optimal solution.

Page 4: CSE 326: Data Structures Topic 18: The Algorhythmics

Greed in Action

Page 5: CSE 326: Data Structures Topic 18: The Algorhythmics

The Grocery Bagging Problem

• You are an environmentally-conscious grocery bagger at QFC

• You would like to minimize the total number of bags needed to pack each customer’s items.

Items (mostly junk food): sizes s1, s2, …, sN (0 < si ≤ 1)

Grocery bags: size of each bag = 1

Page 6: CSE 326: Data Structures Topic 18: The Algorhythmics

Optimal Grocery Bagging: An Example

• Example: Items = 0.5, 0.2, 0.7, 0.8, 0.4, 0.1, 0.3

– How many bags of size 1 are required?

• Can find optimal solution through exhaustive search

– Search all combinations of N items using 1 bag, 2 bags, etc.

– Runtime?
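The exhaustive search can be sketched as a backtracking routine that tries every bag for every item. This is only an illustration: the names `minBagsExhaustive`, `search`, and `kCapacity` are made up here, sizes are stored in tenths of a bag so that integer arithmetic avoids rounding problems, and a simple cutoff skips branches that cannot beat the best packing found so far. The number of assignments it may explore grows exponentially with N.

```cpp
#include <cstddef>
#include <vector>

// Assumption: sizes are given in tenths of a bag, so one bag holds 10 units.
static const int kCapacity = 10;

// Try item `idx` in every bag that is already open, then in a brand-new bag.
static void search(const std::vector<int>& items, std::size_t idx,
                   std::vector<int>& loads, int& best) {
    // Cutoff: this branch already uses at least as many bags as the best packing.
    if (static_cast<int>(loads.size()) >= best) return;
    if (idx == items.size()) {  // every item is packed
        best = static_cast<int>(loads.size());
        return;
    }
    for (std::size_t b = 0; b < loads.size(); ++b) {
        if (loads[b] + items[idx] <= kCapacity) {
            loads[b] += items[idx];
            search(items, idx + 1, loads, best);
            loads[b] -= items[idx];  // undo and try the next bag
        }
    }
    loads.push_back(items[idx]);  // open a new bag for this item
    search(items, idx + 1, loads, best);
    loads.pop_back();
}

// Returns the minimum number of bags needed for `items`.
int minBagsExhaustive(const std::vector<int>& items) {
    int best = static_cast<int>(items.size()) + 1;  // worse than one bag per item
    std::vector<int> loads;
    search(items, 0, loads, best);
    return best;
}
```

For the example items, written as {5, 2, 7, 8, 4, 1, 3}, the total size is 30 tenths, so at least 3 bags are needed, and the search finds a packing with exactly 3, such as {8, 2}, {7, 3}, {5, 4, 1}.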

Page 7: CSE 326: Data Structures Topic 18: The Algorhythmics

Bin Packing

• General problem: Given N items of sizes s1, s2, …, sN (0 < si ≤ 1), pack these items in the least number of bins of size 1.

• The general bin packing problem is NP-complete

– Reductions: All NP-problems → SAT → 3SAT → 3DM → PARTITION → Bin Packing (see Garey & Johnson, 1979)

Items: sizes s1, s2, …, sN (0 < si ≤ 1)

Bins: size of each bin = 1

Page 8: CSE 326: Data Structures Topic 18: The Algorhythmics

Greedy Solution?

Items = 0.5, 0.2, 0.7, 0.8, 0.4, 0.1, 0.3
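The slide does not say which greedy rule is intended. One common candidate is "first fit": take the items in the given order and put each one into the first bag that still has room. The sketch below assumes that rule; the function name `firstFit` and the choice to express sizes in tenths of a bag (integers, to avoid floating-point rounding) are illustrative assumptions.

```cpp
#include <vector>

// First fit (assumed greedy rule): place each item in the first bag with room,
// opening a new bag only when no existing bag can hold it.
// Sizes and capacity are in the same integer units (e.g. tenths of a bag).
int firstFit(const std::vector<int>& items, int capacity) {
    std::vector<int> loads;  // current fill level of each open bag
    for (int size : items) {
        bool placed = false;
        for (int& load : loads) {
            if (load + size <= capacity) {
                load += size;  // commit to this bag and never revisit the choice
                placed = true;
                break;
            }
        }
        if (!placed) loads.push_back(size);
    }
    return static_cast<int>(loads.size());
}
```

On the example items {5, 2, 7, 8, 4, 1, 3} with capacity 10, first fit ends with bags filled to 8, 10, 8, and 4, so it uses 4 bags even though 3 are enough. It is fast and simple, but not optimal here.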

Page 9: CSE 326: Data Structures Topic 18: The Algorhythmics

Another Greedy Solution?

Items = 0.5, 0.2, 0.7, 0.8, 0.4, 0.1, 0.3
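A second greedy rule that is often tried is to sort the items from largest to smallest before packing, then put each one into the first bag with room ("first fit decreasing"). Whether this is the rule the slide had in mind is an assumption; `firstFitDecreasing` is an illustrative name, and sizes are again integers in tenths of a bag.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// First fit decreasing (assumed greedy rule): handle the largest items first.
int firstFitDecreasing(std::vector<int> items, int capacity) {
    // Sort a local copy in descending order.
    std::sort(items.begin(), items.end(), std::greater<int>());
    std::vector<int> loads;  // current fill level of each open bag
    for (int size : items) {
        bool placed = false;
        for (int& load : loads) {
            if (load + size <= capacity) {
                load += size;
                placed = true;
                break;
            }
        }
        if (!placed) loads.push_back(size);
    }
    return static_cast<int>(loads.size());
}
```

On the example items {5, 2, 7, 8, 4, 1, 3} with capacity 10, this rule fills three bags exactly (8+2, 7+3, 5+4+1), which happens to be optimal for this input. In general it is still a heuristic and can use more bags than necessary.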

Page 10: CSE 326: Data Structures Topic 18: The Algorhythmics

Divide & Conquer

• Divide problem into multiple smaller parts

• Solve smaller parts

– Solve base cases directly

– Otherwise, solve subproblems recursively

• Merge solutions together (Conquer!)

Often leads to elegant and simple recursive implementations.
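Merge sort is a familiar instance of this pattern and is used here only as an illustration; it is not taken from the slides. The sketch splits the input in half, sorts each half recursively, and merges the two sorted halves. The name `mergeSort` is an assumption.

```cpp
#include <algorithm>
#include <vector>

// Divide & conquer sketch: returns a sorted copy of `a`.
std::vector<int> mergeSort(const std::vector<int>& a) {
    if (a.size() <= 1) return a;  // base case: solved directly

    // Divide: split into two halves and solve each recursively.
    const auto mid = a.begin() + a.size() / 2;
    std::vector<int> left = mergeSort(std::vector<int>(a.begin(), mid));
    std::vector<int> right = mergeSort(std::vector<int>(mid, a.end()));

    // Conquer: merge the two sorted halves into one sorted result.
    std::vector<int> out(a.size());
    std::merge(left.begin(), left.end(), right.begin(), right.end(), out.begin());
    return out;
}
```

The two recursive calls work on disjoint halves, so no subproblem is ever solved twice. That is the situation in which plain divide & conquer is already efficient.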

Page 11: CSE 326: Data Structures Topic 18: The Algorhythmics

Divide & Conquer in Action

Page 12: CSE 326: Data Structures Topic 18: The Algorhythmics

Fibonacci Numbers

F(n) = F(n - 1) + F(n - 2)

F(0) = 1, F(1) = 1

int fib(int n) {
    if (n <= 1)
        return 1;
    else
        return fib(n - 1) + fib(n - 2);
}

Divide & Conquer

Page 13: CSE 326: Data Structures Topic 18: The Algorhythmics

Fibonacci Numbers

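F(n) = F(n - 1) + F(n - 2), with F(0) = 1 and F(1) = 1, computed by the same recursive fib(n) as on the previous page (Divide & Conquer).

[Figure: recursion tree for fib(6). F6 calls F5 and F4; the same subproblems are recomputed many times (F4 twice, F3 three times, F2 five times).]

Runtime: exponential in n (the number of calls grows like F(n) itself)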
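One way to see the cost of the plain recursive version is to count its calls. The sketch below mirrors the recursion of fib(n) but returns the number of calls instead of the Fibonacci value; `fibCalls` is an illustrative name, not part of the slides.

```cpp
// Number of calls made by the plain recursive fib(n), counting the call itself.
long long fibCalls(int n) {
    if (n <= 1)
        return 1;  // a base case is a single call
    // One call for this level, plus all calls made by the two subproblems.
    return 1 + fibCalls(n - 1) + fibCalls(n - 2);
}
```

With F(0) = F(1) = 1, the count works out to 2·F(n) - 1: 25 calls for n = 6 and 21891 calls for n = 20, so the work grows as fast as the Fibonacci numbers themselves.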

Page 14: CSE 326: Data Structures Topic 18: The Algorhythmics

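Fibonacci Numbers

F(n) = F(n - 1) + F(n - 2)

F(0) = 1, F(1) = 1

#define MAX_N 45  /* assumption: F(45) is the largest value that fits in a 32-bit int */

int fib(int n) {  /* assumes 0 <= n <= MAX_N */
    /* A static array persists across calls and starts out all zeros; 0 means "not computed yet" */
    static int fibs[MAX_N + 1];

    if (n <= 1)
        return 1;
    if (fibs[n] == 0)
        fibs[n] = fib(n - 1) + fib(n - 2);
    return fibs[n];
}

Memoized

Runtime: O(n), since each fibs[n] is computed only once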

Page 15: CSE 326: Data Structures Topic 18: The Algorhythmics

Memoizing/Dynamic Programming

• Define problem in terms of smaller subproblems

• Solve and record solution for base cases

• Build solutions for subproblems up from solutions to smaller subproblems

Can improve runtime of divide & conquer algorithms that have shared subproblems with optimal substructure.

Usually involves a table of subproblem solutions.
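As a small illustration of the three steps, here is a bottom-up table for the Fibonacci numbers, using the convention F(0) = F(1) = 1 from the earlier slides. The name `fibBottomUp` is an assumption, and the sketch assumes n ≥ 0 and a result that fits in a `long long`.

```cpp
#include <vector>

// Bottom-up dynamic programming: fill the table from the base cases upward.
long long fibBottomUp(int n) {
    // Record the base cases: every entry starts at 1, so table[0] = table[1] = 1.
    std::vector<long long> table(n + 2, 1);
    // Build each larger solution from the two smaller ones already in the table.
    for (int i = 2; i <= n; ++i)
        table[i] = table[i - 1] + table[i - 2];
    return table[n];
}
```

Each entry is computed once from entries that are already stored, so the loop runs in O(n) time. Only the last two entries are ever read, so the table could be replaced by two variables.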

Page 16: CSE 326: Data Structures Topic 18: The Algorhythmics

Dynamic Programming in Action

• Sequence Alignment

• Fibonacci numbers

• All pairs shortest path

• Optimal Binary Search Tree

• Matrix multiplication

Page 17: CSE 326: Data Structures Topic 18: The Algorhythmics

Sequence Alignment

Biology 101:

Page 18: CSE 326: Data Structures Topic 18: The Algorhythmics

DNA Sequence

• String using letters (nucleotides): A,C,G,T

For example: ACGGGCATTATCGTA

• DNA can mutate!

Change a letter: ACGGGCAT → ACGTGCAT

Insert a letter: ACGGGCAT → ACGGGCAAT

Delete a letter: ACGGGCAT → ACGGGAT

• A few mutations make sequences “different”, but “similar”

• Similar sequences often have similar functions

Page 19: CSE 326: Data Structures Topic 18: The Algorhythmics

What is Sequence Alignment?

• Use underscores (_) or wildcards to match up 2 sequences

• The “best alignment” of 2 sequences is an alignment which minimizes the number of “underscores”

• For example: ACCCGTTT and TCCCTTT

Best alignment (3 underscores):

A_CCCGTTT

_TCCC_TTT

Page 20: CSE 326: Data Structures Topic 18: The Algorhythmics

Solutions

Naïve solution

• Try all possible alignments

• Running time: exponential

Dynamic Programming Solution

• Create a table

• Table(x,y): # errors in best alignment for first x letters of string 1, and first y letters of string 2

• Running time: polynomial (one constant-time step per table entry)

Page 21: CSE 326: Data Structures Topic 18: The Algorhythmics

Example Alignment

• Suppose we have already determined the best alignment for:

– First x letters of string1 with first y-1 letters of string2

– First x-1 letters of string1 with first y-1 letters of string2

– First x-1 letters of string1 with first y letters of string2

If (string1[x] == string2[y]) then TABLE[x,y] = TABLE[x-1,y-1]

Else TABLE[x,y] = min(1+TABLE[x,y-1], 1+TABLE[x-1,y])

Match ACCGTTAG with ACTGTTAA. The last letters (G and A) differ, so one of them must be matched with an underscore:

(1) match ‘G’ with ‘_’: 1 + align(ACCGTTA,ACTGTTAA)

(2) match ‘A’ with ‘_’: 1 + align(ACCGTTAG,ACTGTTA)
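The recurrence can be written directly as a recursive function before any table is introduced. The sketch below assumes the base case that aligning against an empty prefix costs one underscore per remaining letter, which the slide does not state explicitly; `alignNaive` is an illustrative name. Without a table it repeats subproblems and takes exponential time in the worst case.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Underscores in the best alignment of the first x letters of a
// with the first y letters of b (plain recursion, no table).
int alignNaive(const std::string& a, const std::string& b,
               std::size_t x, std::size_t y) {
    // Assumed base cases: against an empty prefix, every letter needs an underscore.
    if (x == 0) return static_cast<int>(y);
    if (y == 0) return static_cast<int>(x);
    if (a[x - 1] == b[y - 1])  // last letters match: no new underscore
        return alignNaive(a, b, x - 1, y - 1);
    // Otherwise one of the two last letters is matched with an underscore.
    return 1 + std::min(alignNaive(a, b, x - 1, y),
                        alignNaive(a, b, x, y - 1));
}
```

Calling `alignNaive("ACCCGTTT", "TCCCTTT", 8, 7)` yields 3, the same count as the three-underscore alignment of those two strings.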

Page 22: CSE 326: Data Structures Topic 18: The Algorhythmics

Example GGCAT and TGCAA

Table after initializing the top row and leftmost column (columns: GGCAT, rows: TGCAA):

          (empty)   G   G   C   A   T
(empty)      0      1   2   3   4   5
T            1
G            2
C            3
A            4
A            5

Page 23: CSE 326: Data Structures Topic 18: The Algorhythmics

Example GGCAT and TGCAA

Completed table (columns: GGCAT, rows: TGCAA):

          (empty)   G   G   C   A   T
(empty)      0      1   2   3   4   5
T            1      2   3   4   5   4
G            2      1   2   3   4   5
C            3      2   3   2   3   4
A            4      3   4   3   2   3
A            5      4   5   4   3   4

Best alignment (4 underscores):

T_GCAA_

_GGCA_T

Page 24: CSE 326: Data Structures Topic 18: The Algorhythmics

Pseudocode (bottom-up)

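int align(String X, String Y, TABLE[0..x,0..y]) {
    // x = length of X, y = length of Y; X[i] is the i-th letter (1-based)
    int i, j;
    // Initialize top row and leftmost column (row 0 and column 0 are the empty prefix)
    for (i = 0; i <= x; ++i) TABLE[i,0] = i;
    for (j = 0; j <= y; ++j) TABLE[0,j] = j;

    for (i = 1; i <= x; ++i) {
        for (j = 1; j <= y; ++j) {
            if (X[i] == Y[j])
                TABLE[i,j] = TABLE[i-1,j-1];
            else
                TABLE[i,j] = min(TABLE[i-1,j], TABLE[i,j-1]) + 1;
        }
    }
    return TABLE[x,y];
}

runtime: O(xy), one constant-time step per table entry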
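A runnable version of the bottom-up pseudocode might look like the following. It is a sketch under a few assumptions: the table has one extra row and column (index 0) for the empty prefix, as in the GGCAT/TGCAA example table; strings are 0-indexed, so letter i of the pseudocode is `x[i - 1]`; and the name `alignBottomUp` is made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Underscores in the best alignment of x and y, filled in bottom-up.
int alignBottomUp(const std::string& x, const std::string& y) {
    const std::size_t m = x.size();
    const std::size_t n = y.size();
    // table[i][j]: best alignment of the first i letters of x with the first j of y.
    std::vector<std::vector<int>> table(m + 1, std::vector<int>(n + 1, 0));

    // Top row and leftmost column: aligning against the empty prefix.
    for (std::size_t i = 0; i <= m; ++i) table[i][0] = static_cast<int>(i);
    for (std::size_t j = 0; j <= n; ++j) table[0][j] = static_cast<int>(j);

    for (std::size_t i = 1; i <= m; ++i) {
        for (std::size_t j = 1; j <= n; ++j) {
            if (x[i - 1] == y[j - 1])
                table[i][j] = table[i - 1][j - 1];  // letters match
            else
                table[i][j] = std::min(table[i - 1][j], table[i][j - 1]) + 1;
        }
    }
    return table[m][n];
}
```

`alignBottomUp("GGCAT", "TGCAA")` returns 4, the bottom-right entry of the example table. The two nested loops visit each of the (m+1)(n+1) entries once, which gives the O(mn) running time.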

Page 25: CSE 326: Data Structures Topic 18: The Algorhythmics

Pseudocode (top-down)

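int align(String X, String Y, TABLE[0..x,0..y]) {
    // x = length of X, y = length of Y
    if (x == 0 || y == 0) {
        TABLE[x,y] = x + y;  // base case: align against the empty prefix
        return TABLE[x,y];
    }

    Compute TABLE[x-1,y-1] if necessary
    Compute TABLE[x-1,y] if necessary
    Compute TABLE[x,y-1] if necessary

    if (X[x] == Y[y])
        TABLE[x,y] = TABLE[x-1,y-1];
    else
        TABLE[x,y] = min(TABLE[x-1,y], TABLE[x,y-1]) + 1;

    return TABLE[x,y];
}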
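The "compute if necessary" steps can be made concrete with a memo table that marks entries not yet computed. The sketch below assumes -1 as that marker (valid answers are never negative), treats the empty prefix as the base case, and uses the illustrative names `alignMemo` and `alignTopDown`.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// memo[i][j] caches the answer for the first i letters of x and the
// first j letters of y; -1 means "not computed yet" (assumed sentinel).
static int alignMemo(const std::string& x, const std::string& y,
                     std::size_t i, std::size_t j,
                     std::vector<std::vector<int>>& memo) {
    if (i == 0) return static_cast<int>(j);  // base case: empty prefix of x
    if (j == 0) return static_cast<int>(i);  // base case: empty prefix of y
    int& slot = memo[i][j];
    if (slot >= 0) return slot;  // already computed: reuse it
    if (x[i - 1] == y[j - 1])
        slot = alignMemo(x, y, i - 1, j - 1, memo);
    else
        slot = 1 + std::min(alignMemo(x, y, i - 1, j, memo),
                            alignMemo(x, y, i, j - 1, memo));
    return slot;
}

// Top-down entry point: allocates the memo table and solves the full problem.
int alignTopDown(const std::string& x, const std::string& y) {
    std::vector<std::vector<int>> memo(
        x.size() + 1, std::vector<int>(y.size() + 1, -1));
    return alignMemo(x, y, x.size(), y.size(), memo);
}
```

Unlike the bottom-up loop, this version only fills the entries that the recursion actually reaches, but each entry is still computed at most once, so the running time stays proportional to the table size.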

Page 26: CSE 326: Data Structures Topic 18: The Algorhythmics

Dynamic Programming Wrap-Up

• Re-use expensive computations

• Store optimal solutions to sub-problems in table

• Use optimal solutions to sub-problems to determine optimal solution to slightly larger problem