
Computational complexity

The cost of computing

“Complexity” in the algorithm analysis context means the cost of a program's execution (running time, memory, ...), rather than the cost of creating the program (number of statements, development time).

In this context, less-complex programs may require more development time.

Complexity: an introduction

Time complexity: the time required to solve a problem of a particular size.

Space complexity: analysis of the computer memory required.

Several questions can be asked about a particular algorithm:

1. Is it correct? That is, does it do what it claims to do when given valid input data?
2. How long does it take to run?
3. How much space is needed to execute the algorithm? That is, how much storage is needed to hold data, arrays, program variables, etc.?

Functions associated with a program

Consider a program with one natural number as input.

Two functions that can be associated with the program are:
f(n), the function computed by the program
T(n), the running time of the program

Possible size measures for T(n)

Total number of bits used to encode the input.

Number of data values in the input (e.g. size of an array)

The second is viewed as an approximation to the first.

Primitive Operations

These are operations which we don't further decompose in our analysis.

They are considered the fundamental building blocks of the algorithm, e.g. +, *, -, /, if( ).

Typically, the time they take is assumed to be constant. This assumption is not always valid.

Is multiplication really “primitive”?

In doing arithmetic on arbitrarily-large numbers, the size of the numerals representing those numbers may have a definite effect on the time required.

2 x 2
vs.
26378491562329846 x 786431258901237

Size of Numerals

For a single number n as input, the size of the corresponding numeral is typically on the order of log(n).

e.g., decimal numeral encoding: size(n) = #digits of n ≈ ⌈log10(n)⌉ (exactly ⌊log10(n)⌋ + 1 for n ≥ 1)

⌈x⌉ = smallest integer ≥ x (called the “ceiling” of x)
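As a quick sanity check of this formula, here is a minimal Java sketch (the class name NumeralSize and the sample values are illustrative, not from the slides):

    // Sketch: compare the digit count of a numeral with the log10 formula.
    // For n >= 1, #digits(n) = floor(log10(n)) + 1, which is on the order of log n.
    public class NumeralSize {
        public static void main(String[] args) {
            long[] samples = {1, 9, 10, 99, 100, 1000, 123456789L};
            for (long n : samples) {
                int digits = String.valueOf(n).length();          // count the characters
                int viaLog = (int) Math.floor(Math.log10(n)) + 1; // floor(log10 n) + 1
                System.out.println(n + " has " + digits + " digits; formula gives " + viaLog);
            }
        }
    }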

Asymptotic Analysis

Typically in measuring complexity, we look at the growth-rate of the time function rather than the value itself.

There is therefore a tendency to pay less attention to constant factors.

This is called asymptotic analysis. It is only one part of the story; sometimes constants are important.

Step Counting

Using exact running time to measure an algorithm requires calibration based on the type of machine, clock rate, etc.

Instead, we usually just count steps taken in the algorithm.

Often we will assume primitives take one step each.

This usually gives us an accurate view of the growth rate.

Straight-Line Code

x = x + 1;

v = x / z;

w = x + v;

3 operations, therefore 3 steps.

Loop Code

for( int i = 0; i < n; i++ )

{

x = x + 1;

v = x / z;

w = x + v;

}

n iterations x 5 steps + 1 = 5n + 1 steps (the loop test i < n and the increment i++ count as steps too).
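To make this count concrete, here is a small Java sketch (hypothetical, not from the slides) that instruments the loop above with an explicit step counter:

    // Sketch: count the steps of the loop above explicitly. Each loop test,
    // each increment, and each of the three body assignments is one step.
    public class StepCount {
        public static void main(String[] args) {
            int n = 10;
            double x = 1, v, w, z = 2;
            long steps = 1;                      // the initialization "int i = 0"
            for (int i = 0; i < n; i++) {
                x = x + 1;
                v = x / z;
                w = x + v;
                steps += 5;                      // test + increment + 3 assignments
            }
            System.out.println("n = " + n + ", steps = " + steps); // prints 5n + 1 = 51
        }
    }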

Non-Numeric Loop Control

for( Polylist L = A; !L.isEmpty(); L = L.rest() )

{

... loop body ...

}

Number of iterations = A.length()

Recursive Code

int fac(int n) { return n == 0 ? 1 : n * fac(n - 1); }

2n + 1 steps, if we count each multiplication and subtraction (plus the base-case test) as a step.

Steps are involved in the overhead for function calls too. The number of such steps would be proportional to n in this case.

Time and space requirements

Ex/ Suppose we are given a set X consisting of n elements, some labeled “red” and some labeled “black”.

We would like to find the number of subsets of X that contain at least one red item.

A set with n elements has 2^n subsets, so a brute-force enumeration takes at least 2^n steps. This is infeasible except for small values of n.
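A brute-force Java sketch of this (the boolean array marking which elements are red is an assumed encoding); the outer loop runs 2^n times, which is exactly the infeasibility noted above:

    // Sketch: count subsets containing at least one red element by
    // enumerating all 2^n subsets as bitmasks. Feasible only for small n.
    public class RedSubsets {
        public static void main(String[] args) {
            boolean[] red = {true, false, true, false}; // red[i]: is element i red?
            int n = red.length;
            long count = 0;
            for (long mask = 0; mask < (1L << n); mask++) {   // all 2^n subsets
                for (int i = 0; i < n; i++) {
                    if ((mask & (1L << i)) != 0 && red[i]) {  // subset contains red item i
                        count++;
                        break;
                    }
                }
            }
            System.out.println(count + " of " + (1L << n) + " subsets contain a red item");
        }
    }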

Finding the largest value in a finite sequence

The algorithm searches the sequence S(1), S(2), …, S(N) for the largest value and returns it as LARGE.

1. [Initialize] Set I := 1, LARGE := S(1) // I = index of the current element under examination
2. [Find a larger value] If S(I) > LARGE, then LARGE := S(I) // if a larger value is found, update LARGE
3. [Terminate?] If I = N, then STOP // the largest value is LARGE
4. [Continue search] Set I := I + 1 and go to Step 2

Execution on 24, 56, 19, 99, 41:

I=1: LARGE = 24
I=2: 56 > 24, LARGE = 56
I=3: 19 < 56
I=4: 99 > 56, LARGE = 99
I=5: 41 < 99; done
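A Java rendering of this pseudocode might look like the following sketch (using a 0-indexed array in place of the 1-indexed S(1), …, S(N)):

    // Sketch of the find-largest algorithm above, 0-indexed.
    public class FindLargest {
        static int largest(int[] s) {
            int large = s[0];                 // Step 1: initialize
            for (int i = 1; i < s.length; i++) {
                if (s[i] > large) {           // Step 2: larger value found?
                    large = s[i];             // update LARGE
                }
            }                                 // Steps 3-4: terminate / continue
            return large;
        }
        public static void main(String[] args) {
            System.out.println(largest(new int[]{24, 56, 19, 99, 41})); // prints 99
        }
    }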

Time/space

To measure space requirements, determine the amount of memory required to hold variables, array elements, etc.

To measure time requirements, count the number of instructions executed (the number of times each loop is executed, or the number of comparisons, for instance).

Usually, we are interested in the complexity for input selected from a particular class.

We might select the input of size n that gives the best time or space performance.

We look at the best case and the worst case, and also the average case.

Definition 4

Let f and g be functions on {1, 2, 3, …}. Write f(n) = O(g(n)), and say f(n) is of order at most g(n), if there exists a positive constant C such that |f(n)| <= C|g(n)| for all but finitely many positive integers n.

Ex/ n^2 + n + 3 <= 3n^2 + 3n^2 + 3n^2 = 9n^2; with C = 9, n^2 + n + 3 = O(n^2).

Ex/ 1 + 2 + … + n = n(n+1)/2 = n^2/2 + n/2 <= n^2/2 + n^2/2 = n^2; deduce 1 + 2 + … + n = O(n^2).

Ex/ 3n^3 + 6n^2 - 4n + 2 = ? (answered on the Definition 3 slide below)

“O” Notation

“O” is the letter “Oh” (for “order”). This is “big-Oh”; “little-oh” has a different meaning.

It is used to express upper bounds on running time (number of steps).

T ∈ O(f) means that (∃c)(∀n) T(n) ≤ c·f(n).

(Think of O(f) as the set of functions that grow no faster than a constant times f.)

“O” Notation

Constants are irrelevant in “O” comparisons:

If g is a function, then any function n => d*g(n), where d is a constant, is in O(g).

Examples:
n => 2.5*n^2 ∈ O(n => n^2)
n => 100000000*n^2 ∈ O(n => n^2)

Notation Abuse

It is customary to drop the “n =>” and just use the body expression of the anonymous function.

Examples:
O(n^2) instead of O(n => n^2)
O(n) instead of O(n => n)
O(1) instead of O(n => 1)

Notation Abuse

Examples:
2.5*n^2 ∈ O(n^2)
100000000*n ∈ O(n)
10000000000 ∈ O(1)

Definition 3

ANS: bounding each of the four terms by 6n^3 gives 3n^3 + 6n^2 - 4n + 2 <= 6n^3 + 6n^3 + 6n^3 + 6n^3 = 24n^3, so it is O(n^3).

If an algorithm requires f(n) units of time to terminate for an input of size n, and f(n) = O(g(n)), then the time required by the algorithm is of order at most g(n), or O(g(n)).

Ex/ Suppose an algorithm is known to require exactly n^2 + n + 3 units of memory for an input of size n. Since n^2 + n + 3 is O(n^2), the space required by this algorithm is O(n^2).

Searching an unordered sequence

The algorithm searches the sequence S(1), S(2), …, S(N) for the first occurrence of KEY and returns its location I. If KEY is not found, the algorithm returns the value 0.

1. [Initialize] Set I := 1
2. [Not found?] If I > N, set I := 0 and terminate (search unsuccessful)
3. [Test] If KEY = S(I), then terminate (search successful)
4. [Continue search] Set I := I + 1 and go to Step 2
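A Java sketch of this search (0-indexed, with -1 playing the role of the pseudocode's 0 for “not found”):

    // Sketch of the unordered-sequence search above, 0-indexed.
    public class LinearSearch {
        static int find(int[] s, int key) {
            for (int i = 0; i < s.length; i++) { // Step 2: past the end? (I > N)
                if (s[i] == key) {               // Step 3: test
                    return i;                    // search successful
                }
            }                                    // Step 4: continue the search
            return -1;                           // not found (pseudocode returns 0)
        }
        public static void main(String[] args) {
            int[] s = {24, 56, 19, 99, 41};
            System.out.println(find(s, 99));     // prints 3 (position of 99)
            System.out.println(find(s, 7));      // prints -1 (not found)
        }
    }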

Analysis of Searching an unordered sequence

Best case analysis: if S(1) = KEY, Step 3 is executed once.

Best case runtime: O(1).

Worst case analysis: KEY is not in the sequence, and Step 3 is executed N times.

Worst case runtime: O(N).

Analysis of Searching an unordered sequence

Average: If KEY is found in the Ith position, Step 3 is executed I times

If KEY not in sequence, Step 3 is executed N times

Hence, the average is
((1 + 2 + 3 + … + N) + N) / (N + 1) = (N(N+1)/2 + N) / (N+1)
<= (N(N+1)/2 + N + 1) / (N+1) = N/2 + 1 <= N + 1 <= 2N,
so the average case is O(N).
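A quick empirical check of this average (a hypothetical sketch, not from the slides; it totals the Step 3 executions for KEY in each of the N positions plus the not-found case):

    // Sketch: average number of Step 3 executions over the N+1 cases
    // (KEY at position 1..N, plus KEY absent). Matches (N(N+1)/2 + N)/(N+1).
    public class AverageSearch {
        public static void main(String[] args) {
            int n = 1000;
            long totalTests = 0;
            for (int pos = 1; pos <= n; pos++) totalTests += pos; // found at pos: pos tests
            totalTests += n;                                      // not found: n tests
            double average = (double) totalTests / (n + 1);
            System.out.println("average Step 3 executions: " + average
                + " (about n/2 + 1 = " + (n / 2.0 + 1) + ")");
        }
    }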

Analysis of Searching an unordered sequence

Normally, choose the best (smallest) g(N) for O(g(N)).

Beware: compare 300N and 5N^2. When N = 5, the second is smaller (125 vs. 1500); when N = 500, the first is smaller (150,000 vs. 1,250,000).

Big Oh                Name
O(1)                  Constant
O(log_2(log_2(N)))    Log log
O(log_2(N))           Logarithmic
O(N)                  Linear
O(N log_2(N))         N log N
O(N^2)                Quadratic
O(N^3)                Cubic
O(N^k)                Polynomial
O(m^N)                Exponential
O(N!)                 Factorial

Examples

void method1(int n) {
    for (int i = 1; i <= n; i++)
        System.out.println(n);
}

method1's loop runs from 1 to n, executing its print statement n times: n steps to complete, so O(n).

Examples

void method2(int n) {
    for (int i = 0; i < 2*n; i++)
        for (int j = 0; j < n; j++)
            System.out.println(i*j);
}

method2 has two loops, inner and outer. For every fixed i in the outer loop, the inner loop runs from j = 0 to j = n-1 in steps of 1, so for every fixed i there are n steps in the inner loop. The outer loop runs from i = 0 to i = 2n-1, so it executes 2n times. Together with the inner loop, the inner call to System.out.println executes 2n * n = 2n^2 times: O(n^2).

Examples

void method5(int n) {
    for (int i = -2; i < n; i++)
        for (int j = 4; j < n-3; j++)
            System.out.println(j+i);
}

Two loops, as in method2: the outer loop executes (n+2) times and the inner loop executes (n-7) times, so the inner System.out.println executes (n+2)*(n-7) times: O(n^2).

void method6(int n) {
    for (int i = 0; i < 10; i++)
        System.out.println(i*n);
}

The loop body executes exactly 10 times, independent of n: O(1).

Examples

int method8(int n) {
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += i;
    return sum;
}

method8 adds the numbers 0 through n-1 in n steps: O(n).

Examples--recursive

void method7(int n) {
    if (n > 2) {
        method7(n-2);
        System.out.println(n);
    }
}

Ex/ method7(10) calls method7(8), then method7(6), then method7(4), and so on: roughly n/2 calls in all, so O(n).

Examples--recursive

void method3(int n) {
    if (n > 0) {
        method3(n-1);
        System.out.println(n);
        method3(n-1);
    }
}

method3(1) calls method3(0) twice, so if n = 1 the method makes two recursive calls.

method3(2) calls method3(1) twice, each of which calls method3(0) twice (as we already figured out); therefore the method makes 2 * 2 = 4 such bottom-level calls.

method3(3) calls method3(2) twice. That call, as we already saw, makes 4 bottom-level calls, so method3(3) makes 2 * 4 = 8 of them.

method3(4) calls method3(3) twice. From before, that makes 8 such calls, so method3(4) makes 2 * 8 = 16.

Now the pattern should be clear: if the input to method3 is n, the method makes on the order of 2^n recursive calls. Therefore, this method is O(2^n). Again, instead of relying on “pattern recognition”, one could use the principle of induction to prove mathematically that this is indeed the proper order of growth.

Induction proof for the complexity of method3

Base case: let n = 1; method3(1) calls method3(0), prints 1, calls method3(0) again, then returns: 2^1 = 2 steps.

Induction hypothesis: assume method3(n) takes 2^n steps.

Induction step: method3(n+1) makes 2 calls to method3(n), and this takes 2^n + 2^n = 2 * 2^n = 2^(n+1) steps.
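One can also check this empirically. The sketch below (a hypothetical instrumented variant, not from the slides) counts how often the recursion bottoms out at n = 0; the count is exactly 2^n, matching the induction:

    // Sketch: count how often method3's recursion reaches the base case.
    // The count doubles each time n grows by 1, i.e. it is 2^n.
    public class Method3Count {
        static long baseCalls = 0;

        static void method3(int n) {
            if (n > 0) {
                method3(n - 1);
                // the println from the original method is omitted here
                method3(n - 1);
            } else {
                baseCalls++;   // recursion bottomed out
            }
        }

        public static void main(String[] args) {
            for (int n = 1; n <= 10; n++) {
                baseCalls = 0;
                method3(n);
                System.out.println("n = " + n + ": " + baseCalls + " base-case calls");
            }
        }
    }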

Examples--recursive

void method4(int n) {
    if (n > 0) {
        System.out.println(n);
        method4(n / 2);
    }
}

method4(16) makes 4 recursive calls to itself (with 8, 4, 2, 1); if n = 64, it calls itself with 32, 16, 8, 4, 2, 1.

If the input is a power of 2, say n = 2^k, the method calls itself k times. n = 2^k means k = log_2(n), so this is O(log(n)).

Summary

method1: single loop--O(n)
method2: double loop--O(n^2)
method3: double recursion--O(2^n)
method4: recursion with half the input--O(log(n))
method5: double loop--O(n^2)
method6: no dependence on n--O(1)
method7: simple recursion--O(n)
method8: single loop--O(n)

Analysis

A Maple plot comparing the growth rates of these functions:

plot({2^x,x^3,log_2(x),x,1},x=-10..10,y=-10..10);

Why it Matters: Running Time as a Function of Complexity

Values of various functions vs. values of the argument n:

            n =       10        100        1000       10000      100000     1000000
log n             3.3219     6.6438      9.9658      13.287      16.609      19.931
log^2 n           10.361     44.140      99.317      176.54      275.85      397.24
sqrt n             3.162         10      31.622         100      316.22        1000
n                     10        100        1000       10000      100000     1000000
n log n           33.219     664.38      9965.8      132877    1.66*10^6   1.99*10^7
n^1.5               31.6       10^3    31.6*10^3      10^6     31.6*10^6      10^9
n^2                  100       10^4        10^6       10^8        10^10       10^12
n^3                 1000       10^6        10^9       10^12       10^15       10^18
2^n                 1024      10^30      10^301     10^3010    10^30103    10^301030
n!               3628800  9.3*10^157   10^2567     10^35659   10^456573   10^5565710

Even a computer a trillion times faster won't help in the lower-right region of this table.

Allowable Problem Size as a Function of Available Time

Increase in the size of problem that can be run, based on an increase in allowed time, assuming the algorithm runs problem size 1 in time 1:

Time multiple:        10        100       1000      10000     100000    1000000
log n               1024      10^30     10^300    10^3000   10^30000  10^300000
log^2 n                8       1024     3*10^9   1.2*10^30  1.5*10^95 1.1*10^301
sqrt n               100       10^4       10^6      10^8      10^10      10^12
n                     10       10^2       10^3      10^4       10^5       10^6
n log n              4.5         22        140       1000    7.7*10^3   6.2*10^4
n^1.5                  4         21        100        460       2100      10000
n^2                    3         10         32        100        320       1000
n^3                    2          4         10         21         46        100
2^n                    3          6          9         13         16         19
n!                     3          4          6          7          8          9

Doubling Input Size

Complexity      Doubling the input causes execution time to...
O(1)            stay the same
O(log n)        increase by an additive constant
O(n^(1/2))      increase by a factor of sqrt(2)
O(n)            double
O(n log n)      double, plus increase by a constant factor times n
O(n^2)          increase by a factor of 4
O(n^k)          increase by a factor of 2^k
O(k^n)          square

Black-box vs. White-box Complexity

Black-box: we have a copy of the program but no source code. We can run it for different sizes of input.

White-box (aka “clear box”): we have the code. We can analyze it.

Black-Box Complexity

Run the program for different sizes of data set; try to get a fix on the growth rate.

What sizes? One approach is to repeatedly double the input size until testing becomes infeasible (e.g., it takes too long, or the program breaks).

Black-Box Complexity

Run on sizes 32, 64, 128, 256, 512, … or 10, 100, 1000, 10000, …

For each n, get the time T(n). How can we estimate the order of the run time (e.g., O(n^2), O(n^3), etc.)?

Black-Box Complexity

Suppose we are trying to establish the correctness of the hypothesis T(n) ∈ O(f(n)).

From the definition, we know this means there is a constant c such that, for all n, T(n) < c·f(n).

If our hypothesis is correct, then we expect, for all n, T(n)/f(n) < c.

We can simply examine this ratio.

Black-Box Complexity

If we see T(n)/f(n) < c, then the hypothesis T(n) ∈ O(f(n)) is supported.

If T(n)/f(n) is approximately constant as n increases, then the bound appears to be tight.

If T(n)/f(n) decreases as n increases, the bound is loose.

If T(n)/f(n) increases, we don't have a bound.
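A minimal Java sketch of this ratio test, assuming the hypothesis f(n) = n^2. The quadratic blackBox method is a stand-in for the real black-box program, and real timings are noisier than this suggests (JIT warm-up, clock granularity):

    // Sketch: black-box ratio test. Time the program at doubling input
    // sizes and print T(n)/f(n) for the hypothesis f(n) = n^2.
    public class RatioTest {
        // Stand-in for the black-box program: clearly O(n^2) work.
        static long blackBox(int n) {
            long acc = 0;
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    acc += i ^ j;
            return acc;
        }

        public static void main(String[] args) {
            for (int n = 1000; n <= 16000; n *= 2) {
                long t0 = System.nanoTime();
                blackBox(n);
                long t = System.nanoTime() - t0;
                double ratio = t / ((double) n * n);   // T(n) / f(n)
                System.out.printf("n = %6d  T(n) = %10d ns  T(n)/n^2 = %.4f%n", n, t, ratio);
            }
        }
    }

If the printed ratio settles near a constant as n doubles, the O(n^2) hypothesis is supported; if it keeps growing, we do not have a bound.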

Algorithm: find the largest element in a sequence of integers

1. Set temp_max = the first element in the sequence (temp_max holds the largest integer seen at every stage)
2. Compare the next integer to temp_max; if it is larger than temp_max, set temp_max equal to that integer
3. Repeat step 2 if there are more integers
4. Stop when no integers are left; temp_max is the largest element in the sequence

Analysis of finding the max element of a set with n elements

2 “comparisons” for each term: one to determine that the end of the list has not been reached, and another to determine whether to update the temporary maximum.

2(n-1): 2 comparisons each for the second through the nth element.

1 more comparison to exit the loop.

2(n-1) + 1 = 2n - 1. Hence the order of the algorithm is polynomial; specifically, it is “linear”: O(n).

Consider a linear search for a particular element in a list.

Worst case analysis, counting comparisons: each step of the loop makes 2 comparisons, one to see if we are at the end of the list and one to compare the element with the terms of the list.

One comparison is made outside the loop; if the element is found at position i, 2i + 1 comparisons are made inside.

At most 2n + 2 comparisons are made when the element is not in the list (2n in the loop, 1 to exit the loop, and 1 outside the loop).

Once again this is polynomial; specifically, it is “linear”: O(n).