Introduction to Algorithmslopresti/Courses/2007-08/CSE308-408... · 2007-09-05 · Introduction to...

CSE 308-408 · Bioinformatics: Issues and Algorithms Lopresti · Fall 2007 · Lecture 3 - 1 - Bioinformatics: Issues and Algorithms CSE 308-408 Fall 2007 Lecture 3 Introduction to Algorithms


Bioinformatics: Issues and Algorithms

CSE 308-408 • Fall 2007 • Lecture 3

Introduction to Algorithms


Administrative issues

• Homework #1 has been posted on Blackboard and is due next Tuesday (Sept. 11) at 5:00 pm. Submit your work online using the Blackboard Assignment function.

CSE Department Distinguished Seminar Series
Topic: “Architecture of Product Lines”
Speaker: Dr. David M. Weiss, Avaya Laboratories
Location: Packard Lab 466
Date: Thurs., Sept. 6, 4:00 pm – 5:00 pm
Reception @ 3:30 pm in Packard Lobby


Algorithms

Questions to answer (starting today):

• What is an algorithm?
• What is the difference between an algorithm and a program?
• What makes one algorithm better than another?

Skills to develop (over the course of the semester):

• Reading and understanding the description of an algorithm.
• Translating a textual description into working program code.
• Writing about an algorithm so others can understand it. *

* Even if you've studied algorithms before, this one is probably new to you. In the context of this course, it's a question of some importance to your final grade.


Algorithms

What is an algorithm?

We may have some vague ideas about answers to these questions, but for our purposes we need to be rigorous.

An algorithm is a sequence of well-defined operations that solve a particular formal problem of interest.

Does that shed light on the matter? Or raise more questions?

• What is a well-defined operation?
• What does solve mean?
• What exactly is a formal problem?


Algorithms

Another viewpoint: an algorithm is a "black box" that transforms inputs (a problem statement) into outputs (a solution to the problem).

[Figure: a specific problem instance, drawn from the universe of possible problems of a given type, goes into the Algorithm box, which produces a solution to that problem instance.]

Note that the algorithm must be ready to solve any instance!


Operations

What is a well-defined operation?
• Goal is to specify the algorithm so that it can be run on any computer, without limiting it to a specific architecture.
• Hence, operations should be abstract, but must reflect what can be done on real-world machines.
• Often called “pseudo-code” (like a programming language).

Some examples of typical operations:

Assignment
Format: a ← b
Effect: sets the variable a to the value of b. *

* Note: for convenience, we'll sometimes write '=' instead of '←'.


More operations

Arithmetic
Format: a + b, a – b, a * b, a / b, a^b
Effect: addition, subtraction, multiplication, division, and exponentiation of numbers.

Conditional
Format: if A is true
            B
        else
            C
Effect: If statement A is true, executes operation B, otherwise executes operation C.


More operations

We are allowed to create more complex functions by combining simpler operations:

Example: MAX(a,b)
    if a > b
        return a
    else
        return b
Effect: Returns the maximum of a and b.

Example: DIST(x1, y1, x2, y2)
    dx ← (x2 – x1)^2
    dy ← (y2 – y1)^2
    return SQRT(dx + dy)
Effect: Returns the Euclidean distance between points (x1,y1) and (x2,y2).
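To see how such pseudo-code might look as a real program, here is a minimal Python sketch of MAX and DIST. The function names max_of and dist are illustrative choices (not from the slides), and math.sqrt stands in for the pseudo-code's SQRT.

```python
import math


def max_of(a, b):
    """Return the larger of a and b, mirroring the MAX pseudo-code."""
    if a > b:
        return a
    else:
        return b


def dist(x1, y1, x2, y2):
    """Return the Euclidean distance between (x1, y1) and (x2, y2)."""
    dx = (x2 - x1) ** 2  # squared horizontal difference
    dy = (y2 - y1) ** 2  # squared vertical difference
    return math.sqrt(dx + dy)
```

For example, dist(0, 0, 3, 4) gives 5.0, the hypotenuse of a 3-4-5 right triangle.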


A simple algorithm

Is our MAX function an algorithm?

Example: MAX(a,b)
    if a > b
        return a
    else
        return b

Hmmm ... we said an algorithm was a sequence of well-defined operations that solve a particular formal problem of interest.

• Are the operations well-defined? YES.
• Is the problem formally specified? YES.
• Does MAX always solve the problem? YES.

So MAX is an algorithm!


How about this?

Let's try making MAX a little simpler ...

Example: MAX2(a,b)
    return a

• Are the operations well-defined? YES.
• Is the problem formally specified? YES.
• Does MAX2 always solve the problem? NO. This version only works about half the time on average.

While this is a silly example, it highlights an important point. We generally insist that our algorithms work all of the time.

Techniques that provide no guarantees are called heuristics.
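A small Python sketch can make the failure of MAX2 concrete. The test pairs below are arbitrary values chosen for illustration; Python's built-in max serves as the reference answer.

```python
def max2(a, b):
    """The 'simplified' MAX2: always returns its first argument."""
    return a


# Arbitrary illustrative inputs, not from the slides.
pairs = [(1, 2), (2, 1), (5, 5), (0, 9)]

# Collect every input on which max2 disagrees with the true maximum.
failures = [(a, b) for a, b in pairs if max2(a, b) != max(a, b)]
```

Here failures contains exactly the pairs whose second element is larger, which is why MAX2 cannot be called an algorithm for the maximum problem.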


Brief digression ...

Soon, we'll be learning the Perl programming language.

Why are we using pseudo-code? Why not use Perl?

• Generality ⇒ don't tie the algorithm to a specific language.
• It's traditional to use pseudo-code for describing algorithms. (I.e., you'll see this when you read about algorithms.)
• A real runnable program requires a few more details that, for now, will only cloud the discussion.


More operations

FOR loops
Format: for i ← a to b
            B
Effect: Sets i to a and executes operation B. Repeats for i = a + 1, a + 2, ..., b – 1, b.

WHILE loops
Format: while A is true
            B
Effect: Checks condition A. If true, then executes operation B. Checks A again; if true, executes B again. Repeats until A is not true.
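One detail worth seeing in a real language: the pseudo-code FOR loop includes both endpoints a and b, whereas Python's range stops one short of its upper argument. The sketch below is illustrative only; the function names and the halving example are assumptions made for the demonstration.

```python
def for_loop_values(a, b):
    """Values visited by the pseudo-code loop 'for i <- a to b' (inclusive)."""
    visited = []
    for i in range(a, b + 1):  # b + 1 because range excludes its upper bound
        visited.append(i)
    return visited


def count_halvings(n):
    """WHILE loop example: count how many times n can be halved before reaching 1."""
    steps = 0
    while n > 1:  # condition A is re-checked before every execution of the body
        n = n // 2
        steps += 1
    return steps
```

Note that the WHILE body may run zero times: count_halvings(1) never enters the loop.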


More operations

Array access
Format: a_i - or - a[i]
Effect: Returns the ith item of array a = (a_1, ..., a_i, ..., a_n).

Note: it's sometimes convenient to assume that the first item in an array is stored at index 0 instead of index 1. In that case, we'll write (a_0, ..., a_i, ..., a_{n-1}) or maybe a[0], a[1], ..., a[n-1].

Example: if a = (1, 8, 2, 5, 7) then a_2 = 8, a_5 = 7, etc.

Arrays can be n-dimensional. E.g., if

        1 8 2 5 7
    a = 6 2 9 4 5
        3 7 0 9 8

then a_{2,3} = 9, etc.
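Because Python lists start at index 0, translating 1-based pseudo-code requires an offset. The following sketch uses the example values from this slide; the helper names item and item2 are hypothetical.

```python
a = [1, 8, 2, 5, 7]

grid = [
    [1, 8, 2, 5, 7],
    [6, 2, 9, 4, 5],
    [3, 7, 0, 9, 8],
]


def item(seq, i):
    """Return the i-th item using the slide's 1-based convention."""
    return seq[i - 1]


def item2(matrix, row, col):
    """Return the entry at 1-based (row, col) of a two-dimensional array."""
    return matrix[row - 1][col - 1]
```

With these helpers, item(a, 2) is 8 and item2(grid, 2, 3) is 9, matching the slide, while the raw Python expressions would be a[1] and grid[1][2].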


Another simple algorithm

Consider the (computational) problem of making a copy of a given DNA sequence.

Formal problem statement:

String Duplication Problem: Given a string of letters, return a copy.
Input: A string s = (s_1, s_2, ..., s_n) of length n, as an array of characters.
Output: A string representing a copy of s.

Algorithm that will solve all instances of the problem:

STRINGCOPY(s,n)
    for i ← 1 to n
        t_i = s_i
    return t
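A minimal Python sketch of STRINGCOPY follows, assuming the string is held as a list of characters (as the problem statement specifies) and using 0-based indices. The name string_copy and the sample sequence are illustrative.

```python
def string_copy(s):
    """Return a new list holding the same characters as s, copied one at a time."""
    n = len(s)
    t = [None] * n  # allocate the output array
    for i in range(n):  # indices 0 .. n-1 correspond to 1 .. n in the pseudo-code
        t[i] = s[i]
    return t


original = list("GATTACA")  # illustrative DNA sequence
duplicate = string_copy(original)
```

The result is equal to the input but is a distinct object, so later changes to one do not affect the other.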


A slightly harder problem

Consider making change in a financial transaction optimally.

What does “optimally” mean?
• Least annoying way (from customer's perspective).

What does “least annoying” mean?
• Fewest number of coins.
• (Could also be lightest weight, most “useful” coins, etc.)

Say you go to McDonald's for dinner and your bill totals $19.23. You hand the cashier a $20 bill. Your change could be:

• 3 quarters + 2 pennies
• 77 pennies
• etc.


The change problem

United States Change Problem: Convert some amount of money into the fewest number of U.S. coins.
Input: An amount of money, M, in cents.
Output: Smallest number of quarters q, dimes d, nickels n, and pennies p whose value adds up to M. (I.e., 25q + 10d + 5n + p = M and q + d + n + p is as small as possible.)

General approach to the solution (conventional wisdom):
• Provide as much as possible using the largest denomination.
• Make up the remainder as much as possible using the next largest.
• Etc.


The change problem

A solution to the US change problem:

USCHANGE(M)
    while M > 0
        c ← Largest coin no bigger than M
        Give coin with denomination c to customer
        M ← M - c

This works, doesn't it? (Do you see any potential problems?)
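The loop-based USCHANGE can be sketched in Python as follows. The tuple US_COINS and the choice to return a list of coins (rather than "give" them to a customer) are assumptions made so the result can be inspected.

```python
US_COINS = (25, 10, 5, 1)  # quarter, dime, nickel, penny, in cents


def us_change(m):
    """Greedy change: repeatedly hand out the largest coin no bigger than m."""
    given = []
    while m > 0:
        c = max(coin for coin in US_COINS if coin <= m)
        given.append(c)  # "give coin with denomination c to customer"
        m -= c
    return given
```

For the 77 cents of change in the earlier example, us_change(77) returns [25, 25, 25, 1, 1], i.e. 3 quarters and 2 pennies.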


Let's get a bit more general

Change Problem: Convert some amount of money M into given denominations, using the smallest number of coins.

Input: An amount of money, M, and an array of d denominations c = (c_1, c_2, ..., c_d), in decreasing order of value.

Output: A list of d integers i_1, i_2, ..., i_d such that c_1 i_1 + c_2 i_2 + ... + c_d i_d = M and i_1 + i_2 + ... + i_d is as small as possible.

Looks good! Now can't we just adapt our existing algorithm?


The change problem

Slightly more detailed solution:

USCHANGE(M)
    r ← M
    q ← r / 25
    r ← r – 25 * q
    d ← r / 10
    r ← r – 10 * d
    n ← r / 5
    r ← r – 5 * n
    p ← r
    return (q, d, n, p)

(Division is assumed to return an integer result.)

This works, but it seems awfully specific, doesn't it?


A more general change problem

Adapting our previous algorithm:

BETTERCHANGE(M,c,d)
    r ← M
    for k ← 1 to d
        i_k ← r / c_k
        r ← r – c_k * i_k
    return (i_1, i_2, ..., i_d)

Is this algorithm correct?

In other words, does it work for all possible inputs?


Is this algorithm correct?

BETTERCHANGE(M,c,d)
    r ← M
    for k ← 1 to d
        i_k ← r / c_k
        r ← r – c_k * i_k
    return (i_1, i_2, ..., i_d)

No – this algorithm will not work for all cases!

Consider making change for 40 cents:
• Today we would use one quarter, one dime, and one nickel.
• In the past, however, the US had a 20 cent coin ...
• ... in this case, optimal change would be two such coins.
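The 40-cent counterexample can be reproduced with a short Python sketch of BETTERCHANGE. The function name and the variable names below are illustrative; integer division (//) plays the role of the pseudo-code's integer-valued "/".

```python
def better_change(m, c):
    """Greedy change for denominations c, listed in decreasing order of value."""
    r = m
    counts = []
    for denom in c:
        k = r // denom  # as many coins of this denomination as fit
        counts.append(k)
        r -= denom * k
    return counts


# Denominations including the historical 20-cent coin.
old_coins = [25, 20, 10, 5, 1]
greedy = better_change(40, old_coins)  # one quarter, one dime, one nickel
optimal = [0, 2, 0, 0, 0]              # two 20-cent coins
```

The greedy answer uses 3 coins while the optimal answer uses 2, so BETTERCHANGE returns valid change but not always the fewest coins.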


Fibonacci

Leonardo of Pisa, better known as Fibonacci, has been called the "greatest European mathematician of the middle ages." In one of Fibonacci's books, he introduces a problem for his readers to use to practice their arithmetic:

A pair of rabbits are put in a field. If rabbits take a month to become mature and then produce a new pair every month after that, how many pairs will there be in twelve months' time?

He assumes the rabbits do not escape and none die.

Our problem: compute how many rabbit pairs there are after n months.


Fibonacci

Let's look at what happens ...

[Figure: rabbit pairs month by month: 1 pair, 1 pair, 2 pairs, 3 pairs, 5 pairs. Each month shows the same rabbits as the month before plus the new babies.]


A Fibonacci algorithm

How do we compute the number of rabbit pairs after n months?
Note that this number equals:
• # of rabbit pairs at n – 1 months ... plus ...
• # of mature rabbit pairs at n – 1 months (they'll have babies)
But this last value is the same as the # of rabbit pairs at n – 2 months.

Example: FIBONACCI(n)
    F_1 ← 1
    F_2 ← 1
    for i ← 3 to n
        F_i ← F_{i-1} + F_{i-2}
    return F_n
Effect: Computes the nth Fibonacci number.
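A minimal Python sketch of the iterative FIBONACCI pseudo-code is shown below. It assumes n is at least 1 and pads index 0 of the list so that positions line up with the 1-based subscripts; the function name is illustrative.

```python
def fibonacci(n):
    """Return the n-th Fibonacci number (n >= 1), with F_1 = F_2 = 1."""
    f = [0, 1, 1]  # f[0] is unused padding so that f[i] matches F_i
    for i in range(3, n + 1):
        f.append(f[i - 1] + f[i - 2])
    return f[n]
```

The loop performs n – 2 additions for n ≥ 3, so the work grows linearly with n.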


Recursion

FIBONACCI is an example of a simple iterative algorithm. Another way to view this is that computing FIBONACCI for n months can be done by calling FIBONACCI for n – 1 months and for n – 2 months and summing the two values together.

Example: RECURSIVEFIBONACCI(n)
    if n = 1 or n = 2
        return 1
    else
        a ← RECURSIVEFIBONACCI(n – 1)
        b ← RECURSIVEFIBONACCI(n – 2)
        return a + b

This technique is known as recursion.


Recursion

Recursion is a powerful and useful concept. It doesn't always lead to efficient algorithms, however. Consider calls to RECURSIVEFIBONACCI:

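[Figure: tree of calls made by RECURSIVEFIBONACCI(n), level by level]

Level 0: n
Level 1: n-1, n-2
Level 2: n-2, n-3, n-3, n-4
Level 3: n-3, n-4, n-4, n-5, n-4, n-5, n-5, n-6

Many redundant calls, much wasted computation.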
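To get a feel for the wasted work, the sketch below counts how many times RECURSIVEFIBONACCI is invoked. The one-element list used as a call counter is an illustrative device, not part of the original pseudo-code.

```python
def recursive_fibonacci(n, counter):
    """Recursive Fibonacci that also tallies every call in counter[0]."""
    counter[0] += 1
    if n == 1 or n == 2:
        return 1
    a = recursive_fibonacci(n - 1, counter)
    b = recursive_fibonacci(n - 2, counter)
    return a + b


calls = [0]
value = recursive_fibonacci(10, calls)
```

Computing the 10th Fibonacci number (55) this way takes 109 calls, whereas the iterative version needs only 8 additions.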


Towers of Hanoi

Consider now the following problem ...

Move the three disks from the blue peg to the red peg subject to:
• May only move one disk at a time.
• May never place a larger disk over a smaller one.

Towers of Hanoi Problem
Input: An integer n.
Output: A sequence of moves that will solve the n-disk Towers of Hanoi puzzle.


Towers of Hanoi

How can we solve this?

Reduce it to an easier problem ... just one disk.

Just move the disk!


Towers of Hanoi

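How to apply this?

(1) Solve the n-1 disk problem, moving the smaller disks onto the unused peg.
(2) Move the large disk to the destination peg.
(3) Solve the n-1 disk problem again, moving the smaller disks onto the destination peg.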


Towers of Hanoi

Expressing this as a recursive algorithm ...

HANOI(n, fromPeg, toPeg)
    if n = 1
        output “Move disk from peg fromPeg to peg toPeg”
        return
    unusedPeg ← 6 – fromPeg – toPeg
    HANOI(n – 1, fromPeg, unusedPeg)
    output “Move disk from peg fromPeg to peg toPeg”
    HANOI(n – 1, unusedPeg, toPeg)
    return

This simple algorithm solves any Towers of Hanoi problem, but not necessarily quickly ...
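Here is a minimal Python sketch of HANOI. As in the pseudo-code, the pegs are assumed to be numbered 1, 2, and 3, so that 6 minus the other two gives the unused peg. Collecting moves in a list instead of printing them, and the wrapper solve_hanoi, are illustrative choices.

```python
def hanoi(n, from_peg, to_peg, moves):
    """Append the moves that transfer n disks from from_peg to to_peg."""
    if n == 1:
        moves.append((from_peg, to_peg))
        return
    unused_peg = 6 - from_peg - to_peg  # pegs are numbered 1, 2, 3
    hanoi(n - 1, from_peg, unused_peg, moves)  # clear the way
    moves.append((from_peg, to_peg))           # move the largest disk
    hanoi(n - 1, unused_peg, to_peg, moves)    # restack on top of it


def solve_hanoi(n, from_peg=1, to_peg=3):
    """Return the full list of (source, destination) moves for n disks."""
    moves = []
    hanoi(n, from_peg, to_peg, moves)
    return moves
```

For two disks, solve_hanoi(2) yields [(1, 2), (1, 3), (2, 3)]; the three-disk puzzle takes 7 moves.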


Efficiency

How can we analyze the efficiency of this algorithm?
By counting the number of basic operations performed.

Let T(n) represent the number of disk moves for HANOI(n). Then:

    T(1) = 1
    T(n) = 2 * T(n – 1) + 1

Add 1 to both sides: T(n) + 1 = 2 * (T(n – 1) + 1)

Let U(n) = T(n) + 1, then: U(n) = 2 * U(n – 1), with U(1) = 2

So: U(n) = 2 * U(n – 1) = 2 * 2 * U(n – 2) = ... = 2^n

So T(n) = 2^n – 1 and HANOI(n) is an exponential algorithm.
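As a sanity check on the derivation, the recurrence for T(n) can be evaluated directly and compared with the closed form. This is only a numerical spot check over a small range, not a proof; the function name is illustrative.

```python
def hanoi_moves(n):
    """Evaluate T(n) from the recurrence T(1) = 1, T(n) = 2 * T(n - 1) + 1."""
    if n == 1:
        return 1
    return 2 * hanoi_moves(n - 1) + 1


# Compare the recurrence with the closed form 2^n - 1 for n = 1 .. 20.
matches = all(hanoi_moves(n) == 2 ** n - 1 for n in range(1, 21))
```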


Efficiency

HANOI takes exponential time. What does this mean?

n    T(n) = 2^n – 1
1    1
2    3
3    7
4    15
5    31
6    63
7    127
8    255
9    511

Hmm ... not so good.


Big-O notation

We need a way to express the runtime of an algorithm without worrying about the nitty gritty details of a specific implementation.

T(n) = 2^n – 1

Example of a “nitty gritty” detail: the “– 1” here doesn't really contribute anything.

Instead, we write that HANOI has a runtime of O(2^n), which is read as “order two to the n.”

More formally: we write that a function f(n) is O(g(n)) if f(n) grows no faster than g(n).

In other words, there exist constants c and n0 such that for all values of n ≥ n0, we have f(n) ≤ c * g(n).
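The formal definition can be illustrated by testing candidate constants over a finite range. The helper below is a hypothetical sketch: checking finitely many values can refute a choice of constants but cannot prove the bound for all n.

```python
def holds_up_to(f, g, c, n0, n_max):
    """Check f(n) <= c * g(n) for every integer n with n0 <= n <= n_max."""
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))


def t(n):
    return 2 ** n - 1  # number of moves made by HANOI


def g(n):
    return 2 ** n


# With c = 1 and n0 = 1 the inequality T(n) <= c * 2^n holds on the tested range.
ok = holds_up_to(t, g, c=1, n0=1, n_max=50)

# 2^n is not bounded by 1 * n, so this check fails.
not_ok = holds_up_to(g, lambda n: n, c=1, n0=1, n_max=50)
```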


Big-O and Big-Ω

Big-O notation is a worst-case view of the world. It tells us that a function grows no faster than some other function, but not whether it grows much slower.

For example, the function f(n) = 2n + 2 is O(n). But it's also:
• O(n^2)
• O(n^3), etc.
We say that big-O notation provides an upper bound on the growth of a function, but the bound is not necessarily tight.

Big-omega notation (Ω) provides the corresponding lower bound on the growth of a function. I.e., a function f(n) is Ω(g(n)) if f(n) grows no slower than g(n).

In general, we care mostly about upper bounds.


Growth rates of various functions

In the long run, constants and low-order terms don't matter:

[Figure: plot of runtime for functions with various growth rates.]

Eventually O(n^2) beats O(n^3). In the end, O(log n) beats them both.

We always express the runtime of an algorithm relative to its highest-order term.


Tractable vs. intractable problems

When is a problem “solved”? Are some harder than others?

• So far, we've discussed the time complexity of algorithms. But we haven't discussed the complexity of problems. The difference is key.

Before going further, we need a notion of “hard” and “easy”.

• A problem is considered solved (“easy”) if we know an efficient algorithm. Efficient here means polynomial time.
• A problem is unsolved if we know no such algorithm.
• A problem is hard if no such algorithm exists.
• Note that “unsolved” and “hard” aren't the same thing.


Relativized complexity

Say we don't know a good algorithm for a problem. Two cases:

(1) No such algorithm exists.
(2) It exists, but we're just not smart enough to find it.

We'd like to prove (1), but often that's impossible. If we can't prove (1), then we're forced to admit (2) may be the case.

Idea: prove that if we could solve our problem, that would also provide a solution to another, well-known problem that other people have been working on for a long time without success.

If a lot of smart people have been working for a long time, this increases our confidence no such algorithm exists (and, hence, we look a little less stupid ourselves).


NP-completeness

A class of problems that smart people have been working on for a long time with little success are the NP-complete problems.

These are problems which might have an efficient solution (a polynomial time algorithm), but no one has been able to find it.

All NP-complete problems are related: solve one, solve them all!

To prove a problem is hard, reduce a known NP-complete problem to it:

[Figure: Known NP-complete problem → (efficient transformation) → Our new problem → Posited solution]


Taxonomy of algorithm design techniques

Your IBA book is organized in terms of general methodologies:

Exhaustive search: enumerate all possible solutions and look for the best one.
Branch and bound: eliminate search paths that obviously won't be productive.
Dynamic programming: build up solutions for larger problems from smaller ones.
Divide-and-conquer: break the problem into pieces which are easier to solve, then combine.
Machine learning: collect statistics over time and use these to solve the current problem.
Randomized algorithms: used to overcome certain kinds of worst-case scenarios.

[Margin note on slide: “Not covered in this course.”]


Hints on reading, writing, and talking about algorithms

• Realize that you may have to read the description several times before you truly understand the algorithm.

• Strive to understand the algorithm deeply. Try to learn it well enough that you could actually implement it.

• Understand (and be able to explain) the model of the problem that the algorithm is designed to solve.

• Know whether the algorithm is exact or a heuristic. If the latter, figure out the kinds of cases where it breaks.

• Understand the efficiency of the algorithm so that you are able to express it in convincing terms.

• Use examples to illustrate your presentation. Choose your examples carefully to point out various important issues.


Wrap-up

Remember:
• Come to class having done the readings.
• Check Blackboard regularly for updates.
• If enrolled in CSE 408, let me know which lecture topic you wish to scribe by Friday, Sept. 7. Send me several choices, keeping in mind that our schedule may shift somewhat if we fall behind for some reason.

Readings for next time:
• BB&P Chapters 3-4 (introduction to Perl).