Data Structures and Algorithms - Vrije Universiteit …tcs/ds/slides.pdfData Structures and...

Data Structures and Algorithms

Jörg Endrullis

Vrije Universiteit AmsterdamDepartment of Computer Science

Section Theoretical Computer Science

2009–2010

http://www.few.vu.nl/~tcs/ds/

http://www.few.vu.nl/~tcs/ds/

The Lecture

Lectures:I Tuesday 11:00 – 12:45

(weeks 36–42 in WN-KC159)

I Wednesday 9:00 – 10:45(weeks 36–41 in WN-Q105, 42 in WN-Q112)

Exercise classes:I Thursday 11:00 – 12:45 (weeks 36–42 in WN-M607(50))

I Thursday 13:30 – 15:15 (weeks 36–42 in WN-M623(50))

I Thursday 15:30 – 17:15 (weeks 36–42 in WN-C121)

Exam on 19.10.2009 at 8:45 – 11:30.

Retake exam on 12.01.2010 at 18:30 – 21:15.

Important Information: Voortentamen

Voortentamen (pre-exam):I Tuesday 29 september, 11:00 – 12:45 in WN-KC159

I The participation is not obligatory, but recommended.I The result can influence the final grade only positively:

I Let V be the pre-exam grade, and T the exam grade.The final grade is calculated by:

max(T ,2T + V

3)

Thus if the final exam grade T is higher than the pre-examgrade V , then only the final grade T counts. But if the pre-examV is higher then it counts with 1

3 .

Contact Persons

I lecture:I Jörg [email protected]

I exercise classes:I Dennis [email protected]

I Michel [email protected]

I Atze van der [email protected]

Literature

Introduction

DefinitionAn algorithm is a list of instructions for completing a task.

I Algorithms are central to computer science.I Important aspects when designing algorithms:

I correctness

I termination

I efficiency

Evaluation of Algorithms

I Algorithms with equal functionality may have hugedifferences in efficiency (complexity)

I Important measures:I time complexity

I space (memory) complexity

I Time and memory usage increase with size of the input:

input size

time

best case

average case

worst case

10 20 30 40

2s

4s

6s

8s

10s

Time Complexity

I Running time of programs depends on various factors:I input for the program

I compiler

I speed of the computer (CPU, memory)

I time complexity of the used algorithm

I Average case often difficult to determine.I We focus on worst case running time:

I easier to analyse

I usually the crucial measure for applications

Methods for Analysing Time Complexity

Experiments (measurements on a certain machine):I requires implementation

I comparisons require equal hardware and software

I experiments may miss out imporant inputs

Calculation of complexity for idealised computer model:I requires exact counting

I computer model: Random Access Machine (RAM)

Asymptotic complexity estimation depending on input size n:I allows for approximations

I logarithmic (log2 n), linear (n), . . . , exponential (an), . . .

Random Access Machine (RAM)

A Random Access Machine is a CPU connected to a memory:I potentially unbounded number of memory cells

I each memory cell can store an arbitrary number

CPU Memory

I primitive operations are executed in constant time

I memory cells can be accessed with one primitive operation

Pseudo-code

We use Pseudo-code to describe algorithms:I programming-like, hight-level description

I independent from specific programming languageI primitive operations:

I assigning a value to a variable

I calling a method

I arithmetic operations, comparing two numbers

I array access

I returning from a method

Counting Primitive Operations: arrayMax

We analyse the worst case complexity of arrayMax.

Algorithm arrayMax(A,n):Input: An array storing n ≥ 1 integers.Output: The maximum element in A.

currentMax = A[0]

for i = 1 to n − 1 doif currentMax < A[i] then

currentMax = A[i]donereturn currentMax

1 + 11 + n + (n − 1)·

1 + 11 + 11 + 1

1

Hence we have a worst case time complexity:

T (n) = 2 + 1 + n + (n − 1) · 6 + 1= 7 · n − 2

Counting Primitive Operations: arrayMax

We analyse the best case complexity of arrayMax.


currentMax = A[0]



1 + 11 + n + (n − 1)·

1 + 101 + 1

1

Hence we have a best case time complexity:

T (n) = 2 + 1 + n + (n − 1) · 4 + 1= 5 · n

Important Growth Functions

Examples of growth functions, ordered by speed of growth:

I log n

I n

I n · log n

I n2

I n3

I an

logarithmic growth

linear growth

(n · log n)-growth

quadratic growth

cubic growth

exponential growth (a > 1)

We recall the definition of logarithm:

loga b = c such that ac = b

Speed of Growth, Pictorial

x

f (x)

f (x) = log x

f (x) = x

f (x) = x log x

f (x) = x2

f (x) = ex

Speed of Growth

n 10 100 1000 104 105 106

log2 n 3 7 10 13 17 20n 10 100 1000 104 105 106

n · log n 30 700 13000 105 106 107

n2 100 10000 16 108 1010 1012

n3 1000 16 19 1012 1015 1018

2n 1024 1030 10300 103000 1030000 10300000

Note that log2 n growth very slow, whereas 2n growth explosive!

Time depending on Problem Size

Computation time assuming that 1 step takes 1µs (0.000001s).

n 10 100 1000 104 105 106

log n < < < < < 0.00002sn < < 0.001s 0.01s 0.1s 1s

n · log n < < 0.013s 0.1s 1s 10sn2 < 0.01s 1s 100s 3h 1000hn3 0.001s 1s 1000s 1000h 100y 105y2n 0.001s 1023y > > > >

Here < means ‘fast’ (< 0.001s), and > ‘more than 10300 years’.

A problem for which there exist only exponential algorithms areusually considered intractable.

Search in Arrays

We analyse the worst case complexity of search.

Algorithm search(A,n, x):Input: An array storing n ≥ 1 integers.Output: Returns true if A contains x and false, otherwise.

for i = 0 to n − 1 doif A[i] == x then

return truedonereturn false

1 + (n + 1) + n·1 + 1

(not taken for worst case)1 + 1

1


T (n) = 1 + (n + 1) + n · 4 + 1= 5 · n + 3

Search in Arrays

We analyse the best case complexity of search.

Algorithm search(A,n, x):Input: An array storing n ≥ 1 integers.Output: Returns true if A contains x and false, otherwise.

for i = 0 to n − 1 doif A[i] == x then

return truedonereturn false

1 + 11 + 11


T (n) = 5

Search in Sorted Arrays: Binary Search

Algorithm binSearch(A,n, x):Input: An array storing n integers in ascending order.Output: Returns true if A contains x and false, otherwise.

low = 0high = n − 1while low ≤ high do

mid = b(low + high)/2cy = A[mid]

if x < y then high = mid − 1if x == y then return trueif x > y then low = mid + 1

donereturn false

1 2 2 4 6 7 8 11 13 15 16

low highmid

x = 8y = 7






donereturn false

1 2 2 4 6 7 8 11 13 15 16

highlow mid

x = 8y = 13






donereturn false

1 2 2 4 6 7 8 11 13 15 16

low highmid

x = 8y = 8


We analyse the worst case complexity of binSearch.


. . .donereturn false

12number of loops · (1+

. . .)

1

The analyse the maximal number of loops L for minimal lists A:

L A1 1 x = 22 1 2 x = 33 1 2 3 4 x = 54 1 2 3 4 5 6 7 8 x = 9

For a list A of length n we need (1 + log2 n) loops in worst case.


We analyse the worst case complexity of binSearch.




donereturn false

121+(1+ log2 n) ·(1+

421 (false)1 (false)3 (true) )

1


T (n) = 5 + (1 + log2 n) · 12= 17 + 12 · log2 n


We analyse the best case complexity of binSearch.




donereturn false

121+

421 (false)2 (true)


T (n) = 13

Summary: Counting Primitive Operations

Assumption (Random Access Machine):I primitive operations are executed in constant time

Allows for analysis of worst, best and average case complexity.(average analysis requires probability distribution for the inputs)

Disadvantages

I exact counting is cumbersome and error-proneI in the real world:

I different operations take different time

I computers have different CPU/memory speeds

Estimating Running Time

Assume we have an algorithm that performs (input size n):I n2 additions, and

I n multiplications.

Assume that we have computers A and B for which:I A needs 3s per addition and 7s per multiplication,

I B needs 1s per addition and 10s per multiplication.

Then the algorithm runs on A with time complexity:

T (n) = 3 · n2 + 7 · n

and on computer B with:

T (n) = 1 · n2 + 10 · n

The constant factors may differ on different computers.

Estimating Running Time

LetI A be an algorithm,

I T (n) the time complexity for Random Access Machines,

I TC(n) the time complexity on a (real) computer C.

For every computer C there exist factors a, b > 0 such that:

a · T (n) ≤ TC(n) ≤ b · T (n)

We can choose:I a = time for the fastest primitive operation on C

I b = time for the slowest primitive operation on C

Thus idealized computation is precise up to constant factors:I changing hardware/software affects only constant factors

The Role of Constant Factors

Problem size that can be solved in 1 hour:

current computer 100× faster 10000× faster

n N1 100 · N1 10000 · N1

n2 N2 10 · N2 100 · N2

n3 N3 4.6 · N3 21.5 · N3

2n N4 N4 + 7 N4 + 13

A 10000× faster computer is equal to:I normal computer with 10000h time, or

I 10000 times smaller constant factors in T (n): 110000 · T (n).

The growth rate of running time (e.g. n,n2,n3, 2n,. . . ) is moreimportant than constant factors.

Methods for Analysing Time Complexity

Experiments (measurements on a certain machine):I requires implementation

I comparisons require equal hardware and software

I experiments may miss out imporant inputs

Calculation of complexity for idealised computer model:I requires exact counting

I computer model: Random Access Machine (RAM)

Asymptotic complexity estimation depending on input size n:I allows for approximations

I linear (n), quadratic (n2), exponential (an), . . .

Asymptotic estimation

Magnitude of growth of complexity depending on input size n.

Mostly used: upper bounds big-Oh-notation (worst case).

Definition (Big-Oh-notation)Given functions f (n) and g(n), we say that f (n) is O(g(n)),denoted f (n) ∈ O(g(n)), if there exist c, n0 such that:

∀n ≥ n0 : f (n) ≤ c · g(n)

In words: the growth rate of f (n) is ≤ to the growth rate of g(n).

Example (2n + 1 ∈ O(n))We need to find c, n0 such that for all n ≥ n0 we have

2n + 1 ≤ c · n

We choose c = 3, n0 = 1. Then 3 · n = 2n + n ≥ 2n + 1.

Examples

Example (7n − 2 ∈ O(n))We need to find c, n0 such that for all n ≥ n0 we have

7n − 2 ≤ c · n

We choose c = 7, n0 = 1.

Example (3n3 + 5n2 + 2 ∈ O(n3))We need to find c, n0 such that for all n ≥ n0 we have

3n3 + 5n2 + 2 ≤ c · n3

We choose c = 4, n0 = 7, then

4 · n3 = 3n3 + n3 ≥ 3n3 + 7n2

= 3n3 + 5n2 + 2n2 ≥ 3n3 + 5 · n2 + 2

Examples

Example (7 ∈ O(1))We need to find c, n0 such that for all n ≥ n0 we have

7 ≤ c · 1

Take c = 7, n0 = 1.

In general: O(1) means ‘constant time’.

Example (n2 6∈ O(n))Assume there would exist c, n0 such that for all n ≥ n0 we have

n2 ≤ c · n

Take an arbitrary n ≥ c + 1, then

n2 ≥ (c + 1) · n > c · n

This contradicts n2 ≤ c · n.

Big-Oh Rules

I If f (n) is a polynomial of degree d , then f (n) ∈ O(nd).I More general, we can drop:

I constant factors, and

I lower order terms: ( means higher order/growth rate)

3n 2n n5 n2 n · log2 n n log2 n log2 log2 n

Example

5n6 + 3n2 + 2n7 + n ∈ O(n7)

Example

n200 + 3 · 2n + 50n3 + log2 n ∈ O(2n)

Asymptotic estimation: arrayMax

We analyse the worst case complexity of arrayMax.


currentMax = A[0]



1 (constant time)(n − 1)·

1

1

Hence we have a (worst case) time complexity:

T (n) ∈ O(1 + (n − 1) · 1 + 1)

= O(n)

Asymptotic estimation: prefixAverage

We analyse the complexity of prefixAverage.

Algorithm prefixAverage(A,n):Input: An array storing n ≥ 1 integers.Output: An array B of length n such that for all i < n:

B[i ] = 1i+1 · (A[0] + A[1] + . . .+ A[i ])

B = new array of length nfor i = 0 to n − 1 do

sum = 0for j = 0 to i do

sum = sum + A[j]doneB[i] = sum/(i + 1)

donereturn B

nnn · 1(1 + 2 + . . .+ n)·

1

n · 1

1

The (worst case) time complexity is: T (n) ∈ O(n2)

Asymptotic estimation: prefixAverage2

We analyse the complexity of prefixAverage2.

Algorithm prefixAverage2(A,n):Input: An array storing n ≥ 1 integers.Output: An array B of length n such that for all i < n:

B[i ] = 1i+1 · (A[0] + A[1] + . . .+ A[i ])

B = new array of length nsum = 0for i = 0 to n − 1 do

sum = sum + A[i]B[i] = sum/(i + 1)

donereturn B

n1n ·

1

1

The (worst case) time complexity is:

T (n) ∈ O(n)

Efficiency for Small Problem Size

Asymptotic complexity (big-O-notation) speaks about big n:I For small n, an algorithm with good asymptotic complexity

can be slower than one with bad complexity.

ExampleWe have

I 1000 · n ∈ O(n) is asymptotically better than n2 ∈ O(n2)

but for n = 10:I 1000 · n = 10000 is much slower than n2 = 100

Relatives of Big-Oh

Lower bounds: big-Omega-notation.

Definition (Big-Omega-notation)f (n) is Ω(g(n)), denoted f (n) ∈ Ω(g(n)), if there exist c, n0 s.t.:

∀n ≥ n0 : f (n) ≥ c · g(n)

Exact bound (lower and upper): big-Theta-notation.

Definition (Big-Theta-notation)f (n) is Θ(g(n)), denoted f (n) ∈ Θ(g(n)) if:

f (n) ∈ O(g(n)) and f (n) ∈ Ω(g(n))

Abstract Data Types

DefinitionAn abstract data type (ADT) is an abstraction of a data type.An ADT specifies:

I data stored

I operations on the data

I error conditions associated with operations

The choice of data types is important for efficient algorithms.

The Stack ADT

I stores arbitrary objects: last-in first-out principle (LIFO)I main operations:

I push(o): insert the object o at the top of the stack

I pop(): removes & returns the object on the top of the stack.An error occurs (exception is thrown) if the stack is empty.

21

5

21

21

53

push(3)pop()

returns 5

I auxiliary operations:I isEmpty(): indicates whether the stack is empty

I size(): returns number of elements in the stack

I top(): returns the top element without removing it.An error occurs if the stack is empty.

Applications of Stacks

Direct applications:I page-visited history in a web browser

I undo-sequence in a text editor

I chain of method calls in the Java Virtual MachineIndirect applications:

I auxiliary data structure for algorithms

I component of other data structures

Stack in the Jave Virtual Machine (JVM)

JVM keeps track of the chain of active methods with a stack:I stores local variables and program counter

main() int i = 5;foo(i);

foo(int j) int k;k = j+1;bar(k);

bar(int m) ...

barPC = 1m = 6

fooPC = 3j = 5k = 6

mainPC = 2i = 5

Array-based Implementation of Stacks

I we fix the maximal size N of the stack in advance

I elements are added to the array from left to right

I variable t points to the top of the stack

S 3 2 5 4 1 4

0 1 2 . . . t

I initially t = 0, the stack is empty


size():return t + 1

push(o):if size() == N then

throw FullStackExceptionelse

t = t + 1S[t ] = o

(throws an exception if the stack is full)

S 3 2 5 4 1 4

0 1 2 . . . t


pop():if size() == 0 then

throw EmptyStackExceptionelse

o = S[t ]t = t − 1return o

(throws an exception if the stack is emtpy)

S 3 2 5 4 1 4

0 1 2 . . . t

Array-based Implementation of Stacks, Summary

Performace:I every operation runs in O(1)

Limitations:I maximum size of the stack must be choosen a priory

I trying to add more than N elements causes an exceptionThese limitations do not hold for stacks in general:

I only for this specific implementation

The Queue ADT

I stores arbitrary objects: first-in first-out principle (FIFO)I main operations:

I enqueue(o): insert the object o at the end of the queue

I dequeue(): removes & returns the object at the beginningof the queue. An error occurs if the queue is empty.

21721

3217

enqueue(3)dequeue()

returns 7

I auxiliary operations:I isEmpty(): indicates whether the queue is empty

I size(): returns number of elements in the queue

I front(): returns the first element without removing it.An error occurs if the queue is empty.

Applications of Queues

Direct applications:I waiting lists (bureaucracy)

I access to shared resources (CPU, printer, . . . )Indirect applications:

I auxiliary data structure for algorithms


Array-based Implementation of Queues

I we fix the maximal size N − 1 of the queue in advanceI uses array of size N in circular fashion:

I variable f points to the front of the queue

I variable r points one behind the rear of the queue(that is, array location r is kept empty)

Q 3 2 5 4 1

0 1 2 . . . f r

normal configuration

Q 3 25 4 1fr

wrapped configuration

Qfr

empty queue (f == r)


size():return (N + r − f ) mod N

isEmpty():return size() == 0

Q 3 2 5 4 1

0 1 2 . . . f r


Q 3 25 4 1fr



enqueue(o):if size() == N − 1 then

throw FullQueueExceptionelse

Q[r ] = or = (r + 1) mod N

(throws an exception if the queue is full)

Q 3 2 5 4 1

0 1 2 . . . f r


Q 3 25 4 1fr



dequeue():if size() == 0 then

throw EmptyQueueExceptionelse

o = Q[f ]f = (f + 1) mod Nreturn o

(throws an exception if the queue is emtpy)

Q 3 2 5 4 1

0 1 2 . . . f r


Q 3 25 4 1fr


What is a Tree?

I stores elements in a hierarchical structure

I consists of nodes in parent-child relation

I children are ordered (we speak about ordered trees)

root

left

ping

0 1

right

2 pong

3

4

Applications of trees:I file systems, databasesI organization charts

Tree Terminology

I Root: node without parent (A)I Inner node: at least one child (A, B, C, D, F)I Leaf (external node): no children (H, I, E, J, G)I Depth of a node: length of the path to the rootI Height of the tree: maximum depth of any node (3)

A

B

D

H I

C

E F

J

G

I Ancestors:I A is parent of CI B is grandparent of H

I Descendant:I C is child of AI H is grandchild of B

I Substree:I node + all descendants

The Tree ADT

I Accessor operations:I root(): returns root of the treeI children(v): returns the list of children of node vI parent(v): returns parent of node v.

An error occurs (exception is thrown) if v is root.

I Generic operations:I size(): returns number of nodes in the treeI isEmpty(): indicates whether the tree is emptyI elements(): returns set of all the nodes of the treeI positions(): returns set of all positions of the tree.

I Query methods:I isInternal(v): indicates whether v is an inner nodeI isExternal(v): indicates whether v is a leafI isRoot(v): indicates whether v is the root node

Preorder Traversal

A traversal visits the nodes of a tree in systematic manner.

In a preorder traversal:I a node is visited before its descendants.

preOrder(v):visit(v)for each child w of v do

preOrder(w)

A

B

D

H I

C

E F

J

G

1

2

3

4 5

6

7 8

9

10

Postorder Traversal

In a postorder traversal:I a node is visited after its descendants.

postOrder(v):for each child w of v do

postOrder(w)visit(v)

A

B

D

H I

C

E F

J

G

1 2

3

4

5

6

7 8

9

10

Binary Trees

A binary tree is a tree with the property:I each inner node has exactly two children

We call the children of inner nodes left and right child.

Applications of binary trees:I arithmetic expressions

I decision processes

I searching

A

B

D E

H I

C

F G

Binary Trees and Arithmetic Expressions

Binary trees can represent arithmetic expressions:I inner nodes are operators

I leafs are operands

Example: (3× (a − 2)) + (b/4)

+

×

3 −

a 2

/

b 4

(postorder traversal can be used to evaluate an expression)

Binary Decision Trees

Binary tree can represent a decision process:I inner nodes are questions with yes/no answers

I leafs are decisions

Example:Question A?

Question B?

Result 1

yes

Question C?

Result 2

yes

Result 3

no

no

yes

Question D?

Result 4

yes

Result 5

no

no

Properties of Binary Trees

Notation:I n number of nodes

I i number of internal nodes

I e number of leafs

I h height

Properties:I e = i + 1

I n = 2e − 1

I h ≤ i

I h ≤ (n − 1)/2

I e ≤ 2h

I h ≥ log2 e

I h ≥ log2(n + 1) − 1

Binary Tree ADT

I Inherits all methods from Tree ADT.

I Additional methods:I leftChild(v): returns the left child of vI rightChild(v): returns the right child of vI sibling(v): returns the sibling of v

(exceptions are thrown if left/right child or sibling do not exist)

Inorder Traversal

In a inorder traversal:I a node is visited after its left and before its right subtree.

Application: drawing binary treesI x(v) = inorder rank v

I y(v) = depth of v

inOrder(v):if isInternal(v) then

inOrder(leftChild(v))visit(v)if isInternal(v) then

inOrder(rightChild(v))

A

B

D E

H I

C

F G1

2

3

4

5

6

7

8

9

Inorder Traversal: Printing Arithmetic Expressions

A specialization of the inorder traversal for printing expressions:

printExpression(v):if isInternal(v) then

print(‘(’)inOrder(leftChild(v))

print(v.element())if isInternal(v) then

inOrder(rightChild(v))print(‘)’)

+

×

3 −

a 2

/

b 4

Example: printExpression(v) of the above tree yields

((3× (a − 2)) + (b/4))

Postorder Traversal: Evaluating Expressions

Postorder traversal for evaluating arithmetic expressions:

eval(v):if isExternal(v) then

return v.element()else

x = eval(leftChild(v))y = eval(rightChild(v)) = operator stored at vreturn x y

+

×

3 −

4 2

/

8 4

Euler Tour Traversal

I Generic traversal of a binary tree:I preorder, postorder, inorder are special cases

I Walk around the tree, each node is visited tree times:I on the left (preorder)

I from below (inorder)

I on the right (postorder)

+

×

3 −

5 2

/

8 4

L R

B

Vector ADT

A Vector stores a list of elements:I Access via rank/index.

Accessor methods:elementAtRank(r),

Update methods:replaceAtRank(r, o),insertAtRank(r, o),removeAtRank(r)

Generic methods:size(), isEmtpy()

Here r is of type integer, n, m are nodes, o is an object (data).

List ADT

A List consists of a sequence of nodes:I The nodes store arbitrary objects.

I Access via before/after relation between nodes.

element 1 element 2 element 3 element 4

Accessor methods:first(), last(),before(n), after(n)

Query methods:isFirst(n), isLast(n)

Generic methods:size(), isEmtpy()

Update methods:replaceElement(n, o),swapElements(n,m),insertBefore(n, o),insertAfter(n, o),insertFirst(o),insertLast(o),remove(n)

Here r is of type integer, n, m are nodes, o is an object (data).

Sequence ADT

I Sequence ADT is the union of Vector and List ADT:I inherits all methods

I Additional methods:I atRank(r): returns the node at rank r

I rankOf(n): returns the rank of node n

I Elements can by accessed by:I rank/index, or

I navigation between nodes

RemarksThe distinction between Vector, List and Sequence is artificial:

I every element in a list naturally has a rankImportant are the different access methods:

I access via rank/index, or navigation between nodes

Singly Linked List

I A singly linked list provides an implementation of List ADT.

I Each node stores:I element (data)

I link to the next node node

next

element

I Variables first and optional last point to the first/last node.

I Optional variable size stores the size of the list.

∅

first last


Singly Linked List: size

If we store the size on the variable size, then:size():

return sizeRunning time: O(1).

If we do not store the size on a variable, then:size():

s = 0m = firstwhile m != ∅ do

s = s + 1m = m.next

donereturn s

Running time: O(n).

Singly Linked List: insertAfter

insertAfter(n, o):x = new node with element ox.next = n.nextn.next = xif n == last then last = xsize = size + 1

∅

A B C

∅

A B C

o

n

n

x

last

last

first

first

Running time: O(1).

Singly Linked List: insertFirst

insertFirst(o):x = new node with element ox.next = firstfirst = xif last == ∅ then last = xsize = size + 1

∅

A B C

∅

A B Co

first

first, x

Running time: O(1).

Singly Linked List: insertBefore

insertBefore(n, o):if n == first then

insertFirst(o)

elsem = firstwhile m.next != n do

m = m.nextdoneinsertAfter(m, o)

(error check m != ∅ has been left out for simplicity)

We need to find the node m before n:I requires a search through the list

Running time: O(n) where n the number of elements in the list.(worst case we have to search through the whole list)

Singly Linked List: Performance

Operation Worst case Complexitysize, isEmpty O(1) 1

first, last, after O(1) 2

before O(n)

replaceElement, swapElements O(1)

insertFirst, insertLast O(1) 2

insertAfter O(1)

insertBefore O(n)

remove O(n) 3

atRank, rankOf, elemAtRank O(n)

replaceAtRank O(n)

insertAtRank, removeAtRank O(n)

1 size needs O(n) if we do not store the size on a variable.2 last and insertLast need O(n) if we have no variable last.3 remove(n) runs in best case in O(1) if n == first.

Stack with a Singly Linked List

We can implement a stack with a singly linked list:I top element is stored at the first node

Each stack operation runs in O(1) time:

I push(o):insertFirst(o)

I pop():o = first().elementremove(first())return o

∅

first (top)


Queue with a Singly Linked List

We can implement a queue with a singly linked list:I front element is stored at the first node

I rear element is stored at the last node

Each queue operation runs in O(1) time:

I enqueue(o):insertLast(o)

I dequeue():o = first().elementremove(first())return o

∅

first (top) last (rear)


Doubly Linked List

I A doubly linked list provides an implementation of List ADT.

I Each node stores:I element (data)

I link to the next node

I link to the previous node node

prev next

element

I Special header and trailer nodes.

I Variable size storing the size of the list.

header trailer


Doubly Linked List: insertAfter

insertAfter(n, o):m = n.nextx = new node with element ox.prev = nx.next = mn.next = xm.prev = xsize = size + 1

A B C

A B Co

n

n mx

Doubly Linked List: insertBefore

insertBefore(n, o):insertAfter(n.prev, o)

A B C

A B C

o

n

n.prev n

x

Doubly Linked List: remove

remove(n):p = n.prevq = n.nextp.next = qq.prev = psize = size − 1

A B C D

A B C

o

n

p q

n

Doubly Linked List: Performance

Operation Worst case Complexitysize, isEmpty O(1)

first, last, after O(1)

before O(1)


insertFirst, insertLast O(1)

insertAfter O(1)

insertBefore O(1)

remove O(1)

atRank, rankOf, elemAtRank O(n)

replaceAtRank O(n)


Now all operations of List ADT are O(1):I only operations accessing via index/rank are O(n)

Amortization

Amortization is a tool for understanding algorithm complexity ifsteps have widely varying performance:

I analyses running time of a series of operations

I takes into account interaction between different operations

The amortized running time of an operation in a series ofoperations is the worst-case running time of the series dividedby the number of operations.

Example:I N operations in 1s,

I 1 operation in NsAmortized running time:

(N · 1 + N)/(N + 1) ≤ 2

That is: O(1) per operation. operations

time

N

N

Amortization Techniques: Accounting Method

The accounting method uses a scheme of credits and debitsto keep track of the running time of operations in a sequence:

I we pay one cyber-dollar for a constant computation time

We charge for every operation some amount of cyber-dollars:I this amount is the amortized running time of this operation

When an operation is executed we need enough cyber-dollarsto pay for its running time.

operations

time

$ $ $ $ $ $ $ $ $$ $ $ $ $ $ $ $ $

N

N

We charge:I 2 cyber-dollars per operation

The first N operations consume 1dollar each. For the last operationwe have N dollars saved.

Amortized O(1) per operation.

Amortization, Example: Clearable Table

Clearable Table supports operations:I add(o) for adding an object to the table

I clear() for emptying the table

Implementation using an array:I add(o) ∈ O(1)

I clear() ∈ O(m) where m is number of elements in the table(clear() removes one by one every entry in the table)

Amortization, Example: Clearable Table

time

clear() clear() clear()

$ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $$ $ $$ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $

We charge 2 per operation:I add consumes 1; thus we save 1 per add.

Whenever m elements are in the list, we have saved m dollars:I we use the m dollars to pay for clear ∈ O(m)

Thus amortized costs per operation are 2 dollars, that is, O(1).

Array-based Implementation of Lists

Uses array in circular fashion (as for queues):I variable f points to first position

I variable l points one behind last position

A

A B C

f l

Adding elements runs in O(1) as long as the array is not full.

When the array is full:I we need to allocate a larger array B

I copy all elements from A to B

I set A = B (that is, from then on work further with B)

Array-based Implementation: Constant Increase

Each time the array is full we increase the size by k elements:I creating B of size n + k and copying n elements is O(n)

Example: k = 3

operations

time

Every k -th insert operation we need to resize the array.

Worst-case complexity for a sequence of m insert operations:bm/kc∑

i=1

k · i ∈ O(n2)

Average costs for each operation in the sequence: O(n).

Array-based Implementation: Doubling the Size

Each time the array is full we double the size:I creating B of size 2n and copying n elements is O(n)

operations

time

$$

$$

$$$$$$

$$$$$$$$$$$$$$$

$ $ $ $$ $ $

$ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $

Array-based Implementation: Doubling the Size

Each time the array is full we double the size:I insert costs O(1) if the array is not full

I insert costs O(n) if the array is full (for rezise n 7→ 2n)

We charge 3 cyber dollars per insert.

After doubling the size n 7→ 2n:I we have at least n times insert’s without resize

I we save 2n cyber-dollars before the next resizeThus we have 2n cyber-dollars for the next resize 2n 7→ 4n.

Hence the amortized costs for insert is 3: O(1).

(can also be used for fast queue and stack implementations)

Array-based Lists: Performance

The performance under assumption of the doubling strategy.

Operation Amortized Complexitysize, isEmpty O(1)

first, last, after O(1)

before O(1)


insertFirst, insertLast O(1)

insertAfter O(n)

insertBefore O(n)

remove O(n)

atRank, rankOf, elemAtRank O(1)

replaceAtRank O(1)


Comparison of the Amortized Complexities

Operation Singly Doubly Arraysize, isEmpty O(1) O(1) O(1)

first, last, after O(1) O(1) O(1)

before O(n) O(1) O(1)

replaceElement, swapElements O(1) O(1) O(1)

insertFirst, insertLast O(1) O(1) O(1)

insertAfter O(1) O(1) O(n)

insertBefore O(n) O(1) O(n)

remove O(n) O(1) O(n)

atRank, rankOf, elemAtRank O(n) O(n) O(1)

replaceAtRank O(n) O(n) O(1)

insertAtRank, removeAtRank O(n) O(n) O(n)

I singly linked lists (Singly)I doubly linked lists (Doubly)I array-based implementation with doubling strategy (Array)

Iterators

I Iterator allows to traverse the elements in a list or set.I Iterator ADT provides the following methods:

I object(): returns the current object

I hasNext(): indicates whether there are more elements

I nextObject(): goes to the next object and returns it

Priority Queue ADT

A priority queue stores a list of items:I the items are pairs (key, element)

I the key represens the priority: smaller key = higher priority

Main methods:I insertItem(k, o): inserts element o with key k

I removeMin(): removes item (k, o) with smallest key andreturns its element o

Additional methods:I minKey(): returns but does not remove the smallest key

I minElement(): returns but does not remove the elementwith the smallest key

I size(), isEmpty()

Applications of Priority Queues

Direct applications:I bandwidth management in routers

I task scheduling (execution of tasks in priority order)Indirect applications:

I auxiliary data structure for algorithms, e.g.:I shortest path in graphs


Priority Queues: Order on the Keys

I Keys can be arbitrary objects with a total order.

I Two distinct items in a queue may have the same key.

A relation ≤ is called order if for all x , y , z:I reflexivity: x ≤ x

I transitivity: x ≤ y and y ≤ z implies x ≤ z

I antisymmetry: x ≤ y and y ≤ x implies x = y

Implementation of the order: external class or functionI Comparator ADT

I isLessThan(x, y), isLessOrEqualTo(x, y)

I isEqual(x, y)

I isGreaterThan(x, y), isGreaterOrEqualTo(x, y)

I isComparable(x)

Sorting with Priority Queues

We can use a priority queue for soring as follows:I Insert all elements e stepwise with insertItem(e, e)

I Remove the elements in sorted order using removeMin()

Algorithm PriorityQueueSort(A,C):Input: List A, Comparator COutput: List A sorted in ascending order

P = new priority queue with comparator Cwhile ¬A.isEmpty() do

e = A.remove(A.first())P.insertItem(e, e)

while ¬P.isEmpty() doA.insertLast(P.removeMin())

Running time depends on:I implementation of the priority queue

List-based Priority Queues

Implementation with an unsorted List:

5 7 3 4 1I Performance:

I insertItem is O(1) since we can insert in front or end

I removeMin, minKey and minElement are O(n) since wehave to search the whole list for the smallest key

Implementation with an sorted List:

1 3 4 5 7I Performance:

I insertItem is O(n)

I for singly/doubly linked list O(n) for finding the insert position

I for array-based list O(n) for insertAtRank

I removeMin, minKey and minElement are O(1) since thesmallest key is in front of the list

Sorting

I Given is a sequence of pairs

(k1, e1), (k2, e2), . . . , (kn, en)

of elements ei with keys ki and an order ≤ on the keys.

4

e1

2

e2

3

e3

1

e4

5

e5

I We search a permutation π of the pairs such that the keyskπ(1) ≤ kπ(2) ≤ . . . ≤ kπ(n) are in ascending order.

1

e4

2

e2

3

e3

4

e1

5

e5

Selection-Sort

Selection-sort takes an unsorted list A and sorts as follows:I search the smallest element A and swap it with the first

I afterwards continue sorting the remainder of A

5 4 3 7 1

1 4 3 7 5

1 3 4 7 5

1 3 4 7 5

1 3 4 5 7

1 3 4 5 7

Unsorted Part

Sorted Part

Minimal Element

Selection-Sort: Properties

I Time complexity (best, average and worst-case) O(n2):

n + (n − 1) + . . .+ 1 =n2 + n

2∈ O(n2)

(caused by searching the minimal element)

I Selection-sort is an in-place sorting algorithm.

In-Place AlgorithmAn algorithm is in-place if apart from space for input data onlya constant amount of space is used: space complexity O(1).

Stable Sorting Algorithms

Stable Sorting AlgorithmA sorting algorithm is called stable if the order of items withequal key is preserved.

Example: not stable

3 2 2

A B C2 2 3

C B A2 2 3

C B AB, C exchanged order,although they have equal keys

Example: stable

3 2 2

A B C2 3 2

B A C2 2 3

B C A

Stable Sorting Algorithms

Applications of stable sorting:I preserving original order of elements with equal key

For example:I we have an alphabetically sorted list of names

I we want to sort by date of birth while keeping alphabeticalorder for persons with same birthday

Selection-sort is stable ifI we always select the first (leftmost) minimal element

Insertion-Sort

Selection-sort takes an unsorted list A and sorts as follows:I distinguishes a sorted and unsorted part of A

I in each step we remove an element from the unsortedpart, and insert it at the correct position in the sorted part

5 3 4 7 1

5 3 4 7 1

3 5 4 7 1

3 4 5 7 1

3 4 5 7 1

1 3 4 5 7

Unsorted Part

Sorted Part

Minimal Element

Insertion-Sort: Complexity

I Time complexity worst-case O(n2):

1 + 2 + . . .+ n =n2 + n

2∈ O(n2)

(searching insertion position together with inserting)I Time complexity best-case O(n):

I if list is already sorted, and

I we start searching insertion position from the end

I More general: time complexity O(n · (n − d + 1))

I if the first d elements are already sorted

I Insertion-sort is an in-place sorting algorithm:I space complexity O(1)

Insertion-Sort: Properties

I Simple implementation.I Efficient for:

I small lists

I big lists of which a large prefix is already sorted

I Insertion-sort is stable if:I we always pick the first element from the unsorted part

I we always insert behind all elements with equal keys

I Insertion-sort can be used online:I does not need all data at once,

I can sort a list while receiving it

Heaps

A heap is a binary tree storing keys at its inner nodes such that:

I if A is a parent of B, then key(A) ≤ key(B)

I the heap is a complete binary tree: let h be the heap heightI for i = 0, . . . , h − 1 there are 2i nodes at depth i

I at depth h − 1 the inner nodes are left of the external nodes

We call the rightmost inner node at depth h − 1 the ‘last node’.

2

5

9 7

6

last node

Height of Heaps

TheoremA heap storing n keys height O(log2 n).

Proof.A heap of height h contains 2i nodes at every depthi = 0, . . . , h − 2 and at least one node at depth h − 1. Thusn ≥ 1 + 2 + 4 + . . . 2h−2 + 1 = 2h−1. Hence h ≤ 1 + log2 n.

10

21

2ii

1h − 1

depth keys

Heaps and Priority Queues

We can use a heap to implement a priority queue:I inner nodes store (key, element) pair

I variable last points to the last node

(2,Sue)(5,Pat) (6,Mark)

(7,Anna)

(9, Jeff )

last

I For convenience, in the sequel, we only show the keys.

Heaps: Insertion

The insertion of key k into the heap consits of 3 steps:I Find the insertion node z (the new last node).

I Expand z to an internal node and store k at z.

I Restore the heap-order property (see following slides).

2

5

9 7

6

insertion node z

Example: insertion of key 1 (without restoring heap-order)

2

5

9 7

6

1z

Heaps: Insertion, Upheap

After insertion of k the heap-order may be violated.

We restore the heap-order using the upheap algorithm:I we swap k upwards along the path to the root

as long as the parent of k has a larger keyTime complexity is O(log2 n) since the heap height is O(log2 n).

2

5

9 7

6

1

Now the heap-order property is restored.

Heaps: Insertion, Upheap

After insertion of k the heap-order may be violated.

We restore the heap-order using the upheap algorithm:I we swap k upwards along the path to the root

as long as the parent of k has a larger keyTime complexity is O(log2 n) since the heap height is O(log2 n).

1

5

9 7

2

6

Now the heap-order property is restored.

Heaps: Insert, Finding the Insertion Position

An algorithm for finding the insertion position (new last node):I start from the current last node

I while the current node is a right child, go to the parent node

I if the current node is a left child, go to the right child

I while the current node has a left child, go to the left childTime complexity is O(log2 n) since the heap height is O(log2 n).(we walk at most at most once completely up and down again)

Heaps: Removal of the Root

The removal of the root consits of 3 steps:I Replace the root key with the key of the last node w.

I Compress w and its children into a leaf.

I Restore the heap-order property (see following slides).

2

5

9 7

6w

Example: removal of the root (without restoring heap-order)

7

5

9

6w

Heaps: Removal, Downheap

Replacing the root key by k may violate the heap-order.

We restore the heap-order using the downheap algorithm:I we swap k with its smallest child

as long as a child of k has a smaller keyTime complexity is O(log2 n) since the heap height is O(log2 n).

7

5

9

6

Now the heap-order property is restored. The new last nodecan be found similar to finding the insertion position (but nowwalk against the clock direction).

Heaps: Removal, Downheap

Replacing the root key by k may violate the heap-order.

We restore the heap-order using the downheap algorithm:I we swap k with its smallest child

as long as a child of k has a smaller keyTime complexity is O(log2 n) since the heap height is O(log2 n).

5

7

9

6

Now the heap-order property is restored. The new last nodecan be found similar to finding the insertion position (but nowwalk against the clock direction).

Heaps: Removal, Finding the New Last Node

After the removal we have to find the new last node:I start from the old last node (which has been remove)

I while the current node is a left child, go to the parent node

I if the current node is a right child, go to the left child

I while the current node has an right child which is not a leaf,go to the right child

Time complexity is O(log2 n) since the heap height is O(log2 n).(we walk at most at most once completely up and down again)

removed

Heap-Sort

We implement a priority queue by means of a heap:I insertItem(k, e) corresponds to adding (k, e) to the heap

I removeMin() corresponds to removing the root of the heapPerformance:

I insertItem(k, e), and removeMin() run in O(log2 n) time

I size(), isEmtpy(), minKey(), and minElement() are O(1)

Heap-sort is O(n log2 n)

Using a heap-based priority queue we can sort a list of nelements in O(n · log2 n) time (n times insert + n times removal).

Thus heap-sort is much faster than quadratic sorting algorithms(e.g. selection sort).

Vector-based Heap Implementation

We can represent a heap with n keys by a vector of size n + 1:

2

5

9 7

6

0 1 2 3 4 52 5 6 9 7

I The root node has rank 1 (cell at rank 0 is not used).I For a node at rank i :

I the left child is at rank 2i

I the right child is at rank 2i + 1

I the parent (if i > 1) is located at rank bi/2c

I Leafs and links between the nodes are not stored explicitly.

Vector-based Heap Implementation, continued

We can represent a heap with n keys by a vector of size n + 1:

2

5

9 7

6

0 1 2 3 4 52 5 6 9 7

I The last element in the heap has rank n, thus:I insertItem corresponds to inserting at rank n + 1

I removeMin corresponds to removing at rank n

I Yields in-place heap-sort (space complexity O(1)):I uses a max-heap (largest element on top)

Merging two Heaps

We are given two heaps h1, h2 and a key k:I create a new heap with root k and h1, h2 as children

I we perform downheap to restore the heap-order

2

6 5

h1 3

4 5

h2 k = 7

7

2

6 5

3

4 5

Merging two Heaps

We are given two heaps h1, h2 and a key k:I create a new heap with root k and h1, h2 as children

I we perform downheap to restore the heap-order

2

6 5

h1 3

4 5

h2 k = 7

2

5

6 7

3

4 5

Bottom-up Heap Construction

We have n keys and want to construct a heap from them.

Possibility one:I start from empty heap and use n times insert

I needs O(n log2 n) time

Possibility two: bottom-up heap constructionI for simplicity we assume n = 2h − 1 (for some h)

I take 2h−1 elements and turn them into heaps of size 1I for phase i = 1, . . . , log2 n:

I merge the heaps of size 2i − 1 to heaps of size 2i+1 − 1

2i − 1 2i − 1merge

2i − 1 2i − 1

2i+1 − 1

Bottom-up Heap Construction, Example

We construct a heap from the following 24 − 1 = 15 elements:

16,15,4,12,6,9,23,20,25,5,11,27,7,8,10

16 15 4 12 6 9 23 20



16,15,4,12,6,9,23,20,25,5,11,27,7,8,10

25

16 15

5

4 12

11

6 9

27

23 20



16,15,4,12,6,9,23,20,25,5,11,27,7,8,10

15

16 25

4

5 12

6

11 9

20

23 27



16,15,4,12,6,9,23,20,25,5,11,27,7,8,10

7

15

16 25

4

5 12

8

6

11 9

20

23 27



16,15,4,12,6,9,23,20,25,5,11,27,7,8,10

4

15

16 25

5

7 12

6

8

11 9

20

23 27



16,15,4,12,6,9,23,20,25,5,11,27,7,8,10

10

4

15

16 25

5

7 12

6

8

11 9

20

23 27

We are ready: this is the final heap.



16,15,4,12,6,9,23,20,25,5,11,27,7,8,10

4

5

15

16 25

7

10 12

6

8

11 9

20

23 27

We are ready: this is the final heap.

Bottom-up Heap Construction, Performance

Visualization of the worst-case of the construction:

I displays the longest possible heapdown paths(may not be the actual path, but maximal length)

I each edge is traversed at most once

I we have 2n edges hence the time complexity is O(n)

I faster than n successive insertions

Dictionary ADT

Dictionary ADT models:I a searchable collection of key-element items.

I search via the key

Main operations are: searching, inserting, deleting itemsI findElement(k): returns the element with key k if the

dictionary contains such an element(otherwise returns special element No_Such_Key)

I insertItem(k, e): inserts (k, e) into the dictionary

I removeElement(k): like findElement(k) but additionallyremoves the item if present

I size(), isEmpty()

I keys(), elements()

Log File

A log file is a dictionary implemented based on an unsorted list:I Performance:

I insertItem is O(1) (can insert anywhere, e.g. end)

I findElement, and removeElement are O(n)(have to search the whole sequence in worst case)

I Efficient for small dictionaries or if insertions are muchmore frequent than search and removal.(e.g. access log of a workstation)

Ordered Dictionaries

Ordered Dictionaries:I Keys are assumed to come from a total order.I New operations:

I closestKeyBefore(k)

I closestElementBefore(k)

I closestKeyAfter(k)

I closestElementAfter(k)

Binary Search

Binary search performs findElement(k) on sorted arrays:I each step the search space is halved

I time complexity is O(log2 n)

I for pseudo code and complexity analysis see lecture 1

Example: findElement(7)

0 1 3 5 7 8 9 11 15 16 18low high

Binary Search





0 1 3 5 7 8 9 11 15 16 18low high

8

Binary Search





0 1 3 5 7 8 9 11 15 16 18low high

8

0 1 3 5 7 8 9 11 15 16 18low high

Binary Search





0 1 3 5 7 8 9 11 15 16 18low high

8

0 1 3 5 7 8 9 11 15 16 18low high

3

Binary Search





0 1 3 5 7 8 9 11 15 16 18low high

8

0 1 3 5 7 8 9 11 15 16 18low high

3

0 1 3 5 7 8 9 11 15 16 18low high

Binary Search





0 1 3 5 7 8 9 11 15 16 18low high

8

0 1 3 5 7 8 9 11 15 16 18low high

3

0 1 3 5 7 8 9 11 15 16 18low high5

Binary Search





0 1 3 5 7 8 9 11 15 16 18low high

8

0 1 3 5 7 8 9 11 15 16 18low high

3

0 1 3 5 7 8 9 11 15 16 18low high5

0 1 3 5 7 8 9 11 15 16 18lowhigh

Binary Search





0 1 3 5 7 8 9 11 15 16 18low high

8

0 1 3 5 7 8 9 11 15 16 18low high

3

0 1 3 5 7 8 9 11 15 16 18low high5

0 1 3 5 7 8 9 11 15 16 18lowhigh

7

Lookup Table

A lookup table is a (ordered) dictionary based on a sorted array:I Performance:

I findElement takes O(log2 n) using binary search

I insertItem is O(n) (shifting n/2 items worst case)

I removeElement takes O(n) (shifting n/2 items worst case)

I Efficient for small dictionaries or if search are much morefrequent than insertion and removal.(e.g. user authentication with password)

Data Structure for TreesA node is represented by an object storing:

I an element

I link to the parent node

I list of links to children nodes

A

B C

E F

D

∅

A

∅

B C

∅

D

∅

E

∅

F

Data Structure for Binary TreesA node is represented by an object storing:

I an element

I link to the parent node

I left child

I right child

A

B C

D E

∅

A∅ ∅

B C∅ ∅

D

∅ ∅

EBinary trees can also be represented by a Vector, see heaps.

Binary Search Tree

A binary search tree is a binary tree such that:I The inner nodes store keys (or key-element pairs).

(leafs are empty, and usually left out in literature)I For every node n:

I the left subtree of n contains only keys < key(n)

I the right subtree of n contains only keys > key(n)

6

2

1 4

9

8

I Inorder traversal visits the keys in increasing order.

Binary Search Tree: Searching

Searching for key k in a binary search tree t works as follows:I if the root of t is an inner node with key k ′:

I if k == k ′, then return the element stored at the root

I if k < k ′, then search in the left subtree

I if k > k ′, then search in the right subtree

I if t is a leaf, then return No_Such_Key (key not found)


6

2

1 4

9

8

<

>

Binary Search Tree: Minimal Key

Finding the minimal key in a binary search tree:I walk left until the the left child is a leafminKey(n):

if isExternal(n) thenreturn No_Such_Key

if isExternal(leftChild(n)) thenreturn key(n)

elsereturn minKey(leftChild(n))

Example: for the tree below, minKey() returns 1

6

2

1 4

9

8

Binary Search Tree: Maximal Key

Finding the maximal key in a binary search tree:I walk right until the the right child is a leafmaxKey(n):

if isExternal(n) thenreturn No_Such_Key

if isExternal(rightChild(n)) thenreturn key(n)

elsereturn maxKey(rightChild(n))

Example: for the tree below, maxKey() returns 9

6

2

1 4

9

8

Binary Search Tree: Insertion

Insertion of key k in a binary search tree t:I search for k and remember the leaf w where we ended up

(assuming that we did not find k)

I insert k at node w, and expand w to an inner node

Example: insert(5)

6

2

1 4

9

8

<

>

>

6

2

1 4

5

9

8

Binary Search Tree: Deletion above External

The algorithm removeAboveExternal(n) removes a node forwhich at least one child is a leaf (external node):

I if the left child of n is a leaf, replace n by its right subtree

I if the right child of n is a leaf, replace n by its left subtree

Example: removeAboveExternal(9)

6

2

1 4

9

8

7

remove6

2

1 4

8

7

Binary Search Tree: Deletion

The algorithm remove(n) removes a node:I if at least one child is a leaf then removeAboveExternal(n)

I if both children of n are internal nodes, then:I find the minimal node m in the right subtree of n

I replace the key of n by the key of m

I remove the node m using removeAboveExternal(m)

(works also with the maximal node of the left subtree)

Example: remove(6)

6

2

1 4

9

8

7

remove7

2

1 4

9

8

Performance of Binary Search Trees

Binary search tree storing n keys, and height h:I the space used is O(n)

I findElement, insertItem, removeElement take O(h) time

The height h is O(n) in worst case and O(log2 n) in best case:I findElement, insertItem, and removeElement

take O(n) time in worst case

AVL Trees

AVL trees are binary search trees such that for every innernode n the heights of the children differ at most by 1.

I AVL trees are often called height-balanced.

3

0

2

8

6

4 7

9

4

2

1

1 1

12

3

I Heights of the subtrees are displayed above the nodes.

AVL Trees: Balance Factor

The balance factor of a node is the height of its right subtreeminus the height of its left subtree.

3

0

2

8

6

4 7

9

1

1

0

0 0

00

-1

I Above the nodes we display the balance factor.

I Nodes with balance factor -1,0, or 1 are called balanced.

The Height of AVL Trees

The height of an AVL tree storing n keys is O(log n).

Proof.Let n(h) be the minimal number of inner nodes for height h.

I we have n(1) = 1 and n(2) = 2

I we know

n(h) = 1 + n(h − 1) + n(h − 2)

> 2 · n(h − 2) since n(h − 1) > n(h − 2)

> 4 · n(h − 4)

> 8 · n(h − 6)

n(h) > 2i · n(h − 2i) by induction

≥ 2h/2

Thus h ≤ 2 · log2 n(h).

AVL Trees: Insertion

Insertion of a key k in AVL trees works in the following steps:I We insert k as for binary search trees:

I let the inserted node be w

I After insertion we might need to rebalance the tree:I the balance of the ancestors of w may be affected

I nodes with balance factor -2 or 2 need rebalancing

Example: insertItem(1) (without rebalancing)

3

0

2

8

-1

1

0

03

0

2

1

8

-2

0

2

-1

0 insert

rebalance

AVL Trees: Rebalancing after Insertion

After insertion the AVL tree may need to be rebalanced:I walk from the inserted node to the root

I we rebalance the first node with balance factor -2 or 2

There are only the following 4 cases (2 modulo symmetry):

Left Left -2

-1

A

h + 1

inserted

B

h

C

h

Left Right -2

1

A

h

B

h + 1

inserted

C

h

Right Right 2

1A

h B

h

C

h + 1

inserted

Right Left 2

-1A

h B

h + 1

inserted

C

h

AVL Trees: Rebalancing, Case Left Left

The case ‘Left Left’ requires a right rotation:

x

y

A

h + 1

inserted

B

h

C

h

-2

-1y

xA

h + 1

inserted

B

h

C

h

0

0rotate right

Example: Case Left Left

Example: insertItem(0)

7

4

2

1 3

6

8

90

0

0

0 0

-1 1

-1 insert 7

4

2

1

0

3

6

8

9

0

-1

-1

0

0 0

-2 1

-2

7

2

1

0

4

3 6

8

90

-1

0

0 0

00

1

-1

rotate right 2, 4

AVL Trees: Rebalancing, Case Right Right

The case ‘Right Right’ requires a left rotation:(this case is symmetric to the case ‘Left Left’)

x

yA

h B

h

C

h + 1

inserted

2

1y

x

A

h

B

h

C

h + 1

inserted

0

0rotate left

Example: Case Right Right


3

2 5

4 8

0

0

0

0

1 insert 3

2 5

4 8

7

0

-1

0

1

0

2

5

3

2 4

8

70

-1

0

0

0

0rotate right 3, 5

AVL Trees: Rebalancing, Case Left Right

The case ‘Left Right’ requires a left and a right rotation:

x

y

zA

h B

hinserted

C

h − 1

D

h

-2

1

-1

x

z

y

A

h

B

hinserted

C

h − 1

D

h

-2

-2

0

rotate left y, z

z

y x

A

h

B

hinserted

C

h − 1

D

h

0

0 1rotate right x, z

(insertion in C or z instead of B works exactly the same)

Example: Case Left Right


7

4

2 6

80 0

0 0

-1 insert 7

4

2 6

5

80 -1

1 0

-2

0

7

6

4

2 5

8

0

-2

0

0

-2

0

rotate left 4, 66

4

2 5

7

80

0

0

0

1

0

rotate right 6, 7

AVL Trees: Rebalancing, Case Right Left

The case ‘Right Left’ requires a right and a left rotation:

x

y

zA

h

B

h − 1

C

hinserted

D

h

2

-1

1

x

z

yA

h B

h − 1 C

hinserted

D

h

2

2

0

rotate right y, z

z

x y

A

h

B

h − 1

C

hinserted

D

h

0

0-1

rotate left x, z

(this case is symmetric to the case ‘Left Right’)

Example: Case Right Left


5

8

1

0insert 5

8

7

2

-1

0

5

7

8

2

0

1

rotate right 7, 87

5 80 0

0rotate left 5, 7

AVL Trees: Rebalancing after Deletion

After deletion the AVL tree may need to be rebalanced:I walk from the inserted node to the root

I we rebalance all nodes with balance factor -2 or 2

There are only the following 6 cases (2 of which are new):

Left Left -2

-1

A

h + 1

B

h

C

hdeleted

Left Right -2

1

A

h

B

h + 1

C

hdeleted

Left -2

0

A

h + 1

B

h + 1

C

hdeleted

Right Right 2

1A

hdeleted

B

h

C

h + 1

Right Left 2

-1A

hdeleted

B

h + 1

C

h

Right 2

0A

hdeleted

B

h + 1

C

h + 1

AVL Trees: Rebalancing, Case Left

The case ‘Left’ requires a right rotation:

x

y

A

h + 1

B

h + 1

C

hdeleted

-2

0y

xA

h + 1 B

h + 1

C

hdeleted

1

-1rotate right

Example: Case Left

Example: remove(9)

7

4

2

1 3

6

5

8

90

0

0

-1 0

0 1

-1

0

remove 7

4

2

1 3

6

5

8

0

0

0

-10

0 0

-2

0

4

2

1 3

7

6

5

80

0

0 -1

0

1

0

-1

0

rotate right 4, 7

AVL Trees: Rebalancing, Case Right

The case ‘Right’ requires a left rotation:

x

yA

hdeleted

B

h + 1

C

h + 1

2

0y

x

A

hdeleted

B

h + 1

C

h + 1

-1

1rotate left

Example: Case Right

Example: remove(4)

6

4 8

7 90

00

1

0

remove 6

8

7 90

0

2

0

8

6

7

90

-1

1

0

rotate left 6, 8

AVL Trees: Performance

A single rotation (restructure) is O(1):I assuming a linked-structure binary tree

Insertion runs in O(log n):I initial find is O(log n)

I rebalancing and update of heights is O(log n)

(at most 2 rotations are needed, no further rebalancing)

Deletion runs in O(log n):I initial find is O(log n)

I rebalancing and update of heights is O(log n)

I rebalancing may decrease the height of the subtree, thusfurther rebalancing above the node may be necessary

Lookup (find) is O(log n) (height of the tree is O(log n)).

Example from Last Years Exam

8

5

3

2

1

4

6

7

10

9 11

11

I Show with pictures step for step how item 9 is removed.I Indicate in every picture:

I which node is not balanced, and

I which nodes are involved in rotation.

Dictionaries: Performance Overview

Dictionary methods:

search insert removeLog File O(n) O(1) O(n)

Lookup Table O(log2 n) O(n) O(n)

AVL Tree O(log2 n) O(log2 n) O(log2 n)

Ordered dictionary methods:

closestAfter closestBeforeLog File O(n) O(n)

Lookup Table O(log2 n) O(log2 n)

AVL Tree O(log2 n) O(log2 n)

I Log File corresponds to unsorted list.

I Lookup Table corresponds to unsorted list

Hash Functions and Hash Tables

A hash function h maps keys of a given type to integers in afixed interval [0, . . . ,N − 1]. We call h(x) hash value of x .

Examples:I h(x) = x mod N

is a hash function for integer keys

I h((x , y)) = (5 · x + 7 · y) mod Nis a hash function for pairs of integers

h(x) = x mod 5key element

01 6 tea2 2 coffee34 14 chocolate

A hash table consists of:I hash function h

I an array (called table) of size N

The idea is to store item (k, e) at index h(k).

Hash Tables: Example 1

Example: phone book with table size N = 5I hash function h(w) = (length of the word w) mod 5

0

1

2

3

4

(Alice, 020598555)

(Sue, 060011223)

(John, 020123456)

Alice

John

Sue

I Ideal case: one access for find(k) (that is, O(1)).I Problem: collisions

I Where to store Joe (collides with Sue)?

I This is an example of a bad hash function:I Lots of collisions even if we make the table size N larger.

Hash Tables: Example 2

A dictionary based on a hash table for:I items (social security number, name)

I 700 persons in the database

We choose a hash table of size N = 1000 with:I hash function h(x) = last three digits of x

0

1

2

3...

997

998

999

(025-611-001, Mr. X)

(987-067-002, Brad Pit)

(431-763-997, Alan Turing)

(007-007-999, James Bond)

Collisions

Collisionsoccur when different elements are mapped to the same cell:

I Keys k1, k2 with h(k1) = h(k2) are said to collide.

0

1

2

3...

(025-611-001, Mr. X)

(987-067-002, Brad Pit) (123-456-002, Dipsy)?

Different possibilities of handing collisions:I chaining,

I linear probing,

I double hashing, . . .

Collisions continued

Usual setting:I The set of keys is much larger than the available memory.

I Hence collisions are unavoidable.

How probable are collisions:I We have a party with p persons. What is the probability that

at least 2 persons have birthday the same day (N = 365).

I Probability for no collision:

q(p,N) =NN· N − 1

N· · · N − p + 1

N

=(N − 1) · (N − 2) · · · (N − p + 1)

Np−1

I Already for p ≥ 23 the probability for collisions is > 0.5.

Hashing: Efficiency Factors

The efficiency of hashing depends on various factors:I hash function

I type of the keys: integers, strings,. . .

I distribution of the actually used keys

I occupancy of the hash table (how full is the hash table)

I method of collision handling

The load factor α of a hash table is the ratio n/N, that is, thenumber of elements in the table divided by size of the table.

High load factor α ≥ 0.85 has negative effect on efficiency:I lots of collisions

I low efficiency due to collision overhead

What is a good Hash Function?

Hash fuctions should have the following properties:I Fast computation of the hash value (O(1)).I Hash values should be distributed (nearly) uniformly:

I Every has value (cell in the hash table) has equal probabilty.

I This should hold even if keys are non-uniformly distributed.

The goal of a hash function is:I ‘disperse’ the keys in an apparently random way

Example (Hash Function for Strings in Python)We dispay python hash values modulo 997:

h(‘a ′) = 535 h(‘b ′) = 80 h(‘c ′) = 618 h(‘d ′) = 163h(‘ab ′) = 354 h(‘ba ′) = 979 . . .

At least at first glance they look random.

Hash Code Map and Compression Map

Hash function is usually specified as composition of:I hash code map: h1 : keys→ integers

I compression map: h2 : integers→ [0, . . . ,N − 1]

The hash code map is appied before the compression map:I h(x) = h2(h1(x)) is the composed hash function

The compression map usually is of the form h2(x) = x mod N:

I The actual work is done by the hash code map.

I What are good N to choose? . . . see following slides

Compression Map: Example

We revisit the example (social security number, name):I hash function h(x) = x as number mod 1000

Assume the last digit is always 0 or 1 indicating male/femal.

0123456789

101112 ...

(025-611-000, Mr. X)(987-067-001, Ms. X)

(431-763-010, Alan Turing)(007-011-011, Madonna)

Then 80% of the cells in the table stay unused! Bad hash!

Compression Map: Division Remainder

A better hash function for ‘social security number’:I hash function h(x) = x as number mod 997

I e.g. h(025 − 611 − 000) = 025611000 mod 997 = 409

Why 997? Because 997 is a prime number!I Let the hash function be of the form h(x) = x mod N.

I Assume the keys are distributed in equidistance ∆ < N:

ki = z + i · ∆

We get a collision if:

ki mod N = kj mod N⇐⇒ z + i · ∆ mod N = z + j · ∆ mod N⇐⇒ i = j + m · N (for some m ∈ Z)

Thus a prime maximizes the distance of keys with collisions!

Hash Code Maps

What if the keys are not integers?I Integer cast: interpret the bits of the key as integer.

a0001

b0010

c0011 000100100011 = 291

What if keys are longer than 32/64 bit Integers?I Component sum:

I partition the bits of the key into parts of fixed length

I combine the components to one integer using sum(other combinations are possible, e.g. bitwise xor, . . . )

1001010 | 0010111 | 01100001001010 + 0010111 + 0110000 = 74 + 23 + 48 = 145

Hash Code Maps, continued

Other possible hash code maps:I Polynomial accumulation:

I partition the bits of the key into parts of fixed length

a0a1a2 . . . an

I take as hash value the value of the polynom:

a0 + a1 · z + a2 · z2 . . . an · zn

I especially suitable for strings (e.g. z = 33 has at most 6collisions for 50.000 english words)

I Mid-square method:I pick m bits from the middle of x2

I Random method:I take x as seed for random number generator

Collision Handling: Chaining

Chaining: each cell of the hash table points to a linked list ofelements that are mapped to this cell.

I colliding items are stored outside of the table

I simple but requires additional memory outside of the table

Example: keys = birthdays, elements = namesI hash function: h(x) = (month of birth) mod 5

0

1

2

3

4

(01.01., Sue) ∅

(12.03., John) (16.08., Madonna) ∅

Worst-case: everything in one cell, that is, linear list.

Collision Handling: Linear Probing

Open addressing:I the colliding items are placed in a different cell of the table

Linear probing:I colliding items stored in the next (circularly) available cell

I testing if cells are free is called ‘probing’

Example: h(x) = x mod 13I we insert: 18, 41, 22, 44, 59, 32, 31, 73

0 1 2 3 4 5 6 7 8 9 10 11 121841 2244 59 32 31 73

Colliding items might lump together causing new collisions.

Linear Probing: Search

Searching for a key k (findElement(k)) works as follows:I Start at cell h(k), and probe consecutive locations until:

I an item with key k is found, or

I an empty cell is found, or

I all N cells have been probed unsuccessfully.

findElement(k):i = h(k)p = 0while p < N do

c = A[i]if c == ∅ then return No_Such_Keyif c.key == k then return c.elementi = (i + 1) mod Np = p + 1

return No_Such_Key

Linear Probing: Deleting

Deletion remove(k) is expensive:I Removing 15, all consecutive elements have to be moved:

0 1 2 3 4 5 6 7 8 9 10 11 1215 2 3 4 5 6 7

0 1 2 3 4 5 6 7 8 9 10 11 122 3 4 5 6 7

To avoid the moving we introduce a special element Available:I Instead of deleting, we replace items by Available (A).

0 1 2 3 4 5 6 7 8 9 10 11 1215 2 3 4 5 6 7A

I From time to time we need to ‘clean up’:I remove all Available and reorder items

Linear Probing: Inserting

Inserting insertItem(k, o):I Start at cell h(k), probe consecutive elements until:

I empty or Available cell is found, then store item here, or

I all N cells have been probed (table full, throw exception)

0 1 2 3 4 5 6 7 8 9 10 11 1216 17 4 5 6 7A A

Example: insert(3) in the above table yields (h(x) = x mod 13)

0 1 2 3 4 5 6 7 8 9 10 11 1216 17 4 3 6 7A

Important: for findElement cells with Available are treated asfilled, that is, the search continues.

Linear Probing: Possible Extensions

Disadvantages of linear probing:I Colliding items lump together, causing:

I longer sequences of probes

I reduced performance

Possible improvements/ modifications:I instead of probing successive elements,

compute the i-th probing index hi depending on i and k:

hi(k) = h(k) + f (i , k)

Examples:I Fixed increment c: hi(k) = h(k) + c · i .I Changing directions: hi(k) = h(k) + c · i · (−1)i .

I Double hashing: hi(k) = h(k) + i · h ′(k).

Double Hashing

Double hashing uses a secondary hash function d(k):I Handles collisions by placing items in the first available cell

h(k) + j · d(k)

for j = 0,1, . . . ,N − 1.

I The function d(k) always be > 0 and < N.

I The size of the table N should be a prime.

Double Hashing: Example

We use double hashing with:I N = 13

I h(k) = k mod 13

I d(k) = 7 − (k mod 7)

k h(k) d(k) Probes18 5 3 541 2 1 222 9 6 944 5 5 5, 1059 7 4 732 6 3 631 5 4 5,9,073 8 4 8

0 1 2 3 4 5 6 7 8 9 10 11 121841 22 44593231 73

Performance of Hashing

In worst case insertion, lookup and removal take O(n) time:I occurs when all keys collide (end up in one cell)

The load factor α = n/N affects the performace:I Assuming that the hash values are like random numbers,

it can be shown that the expected number of probes is:

1/(1 − α)

α

f (x)

1/(1 − α)

0.2 0.4 0.6 0.8 1

5

10

15

20

Performance of Hashing

In worst case insertion, lookup and removal take O(n) time:I occurs when all keys collide (end up in one cell)

The load factor α = n/N affects the performace:I Assuming that the hash values are like random numbers,

it can be shown that the expected number of probes is:

1/(1 − α)

In practice hashing is very fast as long as α < 0.85:I O(1) expected running time for all Dictionary ADT methods

Applications of hash tables:I small databases

I compilers

I browser caches

Universal Hashing

No hash function is good in general:I there always exist keys that are mapped to the same value

Hence no single hash function h can be proven to be good.

However, we can consider a set of hash functions H.(assume that keys are from the interval [0,M − 1])

We say that H is universal (good) if for all keys 0 ≤ i 6= j < M:

probability(h(i) = h(j)) ≤ 1N

for h randomly selected from H.

Universal Hashing: Example

The following set of hash functions H is universal:I Choose a prime p betwen M and 2 ·M.

I Let H consist of the functions

h(k) = ((a · k + b) mod p) mod N

for 0 < a < p and 0 ≤ b < p.

Proof Sketch.Let 0 ≤ i 6= j < M. For every i ′ 6= j ′ < p there exist unique a,bsuch that i ′ = a · i + b mod p and j ′ = a · i + b mod p. Thusevery pair (i ′, j ′) with i ′ 6= j ′ has equal probability. Consequentlythe probability for i ′ mod N = j ′ mod N is ≤ 1

N .

Comparison AVL Trees vs. Hash Tables

Dictionary methods:

search insert removeAVL Tree O(log2 n) O(log2 n) O(log2 n)

Hash Table O(1) 1 O(1) 1 O(1) 1

1 expected running time of hash tables, worst-case is O(n).

Ordered dictionary methods:

closestAfter closestBeforeAVL Tree O(log2 n) O(log2 n)

Hash Table O(n + N) O(n + N)

Examples, when to use AVL trees instead of hash tables:1. if you need to be sure about worst-case performance2. if keys are imprecise (e.g. measurements),

e.g. find the closest key to 3.24: closestTo(3.72)

Sorting Algorithms

We have already seen:I Selection-sort

I Insertion-sort

I Heap-sort

We will see:I Bubble-sort

I Merge-sort

I Quick-sort

We will show that:I O(n · log n) is optimal for comparison based sorting.

Bubble-Sort

The basic idea of bubble-sort is as follows:I exchange neighboring elements that are in wrong order

I stops when no elements were exchanged

bubbleSort(A):n = length(A)

swapped = truewhile swapped == true do

swapped = falsefor i = 0 to n − 2 do

if A[i ] > A[i + 1] thenswap(A[i ],A[i + 1])

swapped = truedonen = n − 1

done

Bubble-Sort: Example

5 4 3 7 1

4 5 3 7 1

4 3 5 7 1

4 3 5 7 1

4 3 5 1 7

3 4 5 1 7

4 3 5 1 7

4 3 1 5 7

3 4 1 5 7

3 1 4 5 7

1 3 4 5 7

Unsorted Part

Sorted Part

Bubble-Sort: Properties

Time complexity:I worst-case:

(n − 1) + . . .+ 1 =(n − 1)2 + n − 1

2∈ O(n2)

(caused by sorting an inverse sorted list)

I best-case: O(n)

Bubble-sort is:I slow

I in-place

Divide-and-Conquer

Divide-and-Conquer is a general algorithm design paradigm:I Divide: divide the input S into two disjoint subsets S1, S2

I Recur: recursively solve the subproblems S1, S2

I Conquer: combine solutions for S1, S2 to a solution for S(the base case of the recursion are problems of size 0 or 1)

Example: merge-sort

7 2 | 9 4 7→ 2 4 7 9

7 | 2 7→ 2 7

7 7→ 7 2 7→ 2

9 | 4 7→ 4 9

9 7→ 9 4 7→ 4

I | indicates the splitting point

I 7→ indicates merging of the sub-solutions

Merge Sort

Merge-sort of a list S with n elements works as follows:I Divide: divide S into two lists S1, S2 of ≈ n/2 elements

I Recur: recursively sort S1, S2

I Conquer: merge S1 and S2 into a sorting of S

Algorithm mergeSort(S,C):Input: a list S of n elements and a comparator COutput: the list S sorted according to C

if size(S) > 1 then(S1,S2) = partition S into size bn/2c and dn/2emergeSort(S1,C)

mergeSort(S2,C)

S = merge(S1,S2,C)

Merging two Sorted Sequences

Algorithm merge(A,B,C):Input: sorted lists A, BOutput: sorted lists containing the elements of A and B

S = empty listwhile ¬A.isEmtpy() or ¬B.isEmtpy() do

if A.first().element < B.first().element thenS.insertLast(A.remove(A.first()))

elseS.insertLast(B.remove(B.first()))

donewhile ¬A.isEmtpy() do S.insertLast(A.remove(A.first()))while ¬B.isEmtpy() do S.insertLast(B.remove(B.first()))

Performance:I Merging two sorted lists of length about n/2 is O(n) time.

(for singly linked lists, double linked lists, and arrays)

Merge-Sort Tree

An execution of merge-sort can be displayed in a binary tree:I each node represents recursive call and stores:

I unsorted sequence before execution, its partition

I sorted sequence after execution

I leaves are calls on subsequences of size 0 or 1

a n e x | a m p l e 7→ a a e e l m n p x

a n | e x 7→ a e n x

a | n 7→ a n

a 7→ a n 7→ n

e | x 7→ e x

e 7→ e x 7→ x

a m | p l e 7→ a e l m p

a | m 7→ a m

a 7→ a m 7→ m

p | l e 7→ e l p

p 7→ p l | e 7→ e l

l 7→ l e 7→ e

Merge-Sort: Example Execution

7 1 2 9 | 6 5 3 8 7→ 1 2 3 5 6 7 8 9

7 1 | 2 9 7→ 1 2 7 9

7 | 1 7→ 1 7

7 7→ 7 1 7→ 1

2 | 9 7→ 2 9

2 7→ 2 9 7→ 9

6 5 | 3 8 7→ 3 5 6 8

6 | 5 7→ 5 6

6 7→ 6 5 7→ 5

3 | 8 7→ 3 8

3 7→ 3 8 7→ 8

Finished merge-sort tree.

Merge-Sort: Running Time

The height h of the merge-sort tree is O(log2 n):I each recursive call splits the sequence in half

The work all nodes together at depth i is O(n):I partitioning, and merging of 2i sequences of n/2i

I 2i+1 ≤ n recursive calls

10 n

21 n/2

2ii n/2i

. . .. . . . . .

depth nodes size

Thus the worst-case running time is O(n · log2 n).

Quick-Sort

Quick-sort of a list S with n elements works as follows:I Divide: pick random element x (pivot) from S and split S in:

I L elements less than x

I E elements equal than x

I G elements greater than x

7 2 1 9 6 5 3 85

I Recur: recursively sort L, and G

2 1 3 5 7 9 6 8

L E G

I Conquer: join L, E , and G

1 2 3 5 6 7 8 9

Quick-Sort: The Partitioning

The partitioning runs in O(n) time:I we traverse S and compare every element y with x

I depending on the comparison insert y in L, E or G

Algorithm partition(S,p):Input: a list S of n elements, and positon p of the pivotOutput: list L, E , G of lists less, equal or greater than pivot

L,E ,G = empty listsx = S.elementAtRank(p)

while ¬S.isEmpty() doy = S.remove(S.first())if y < x then L.insertLast(y)

if y == x then E .insertLast(y)

if y > x then G.insertLast(y)

donereturn L, E , G

Quick-Sort Tree

An execution of quick-sort can be displayed in a binary tree:I each node represents recursive call and stores:

I unsorted sequence before execution, and its pivot

I sorted sequence after execution

I leaves are calls on subsequences of size 0 or 1

1 6 2 9 4 0 7→ 0 1 2 4 6 9

1 2 0 7→ 0 1 2

0 7→ 0 2 7→ 2

6 9 7→ 6 9

9 7→ 9

Quick-Sort: Example

8 2 9 3 1 5 7 6 4 7→ 1 2 3 4 5 6 7 8 9

2 3 1 5 4 7→ 1 2 3 4 5

2 3 5 4 7→ 2 3 4 5

2 7→ 2 5 4 7→ 4 5

4 7→ 4

8 9 7 7→ 7 8 9

7 7→ 7 9 7→ 9

Quick-Sort: Worst-Case Running Time

The worst-case running time occurs when:I the pivot is always the minimal or maximal element

I then one L and G has size n − 1, the other size 0

Then the running time is O(n2):

n + (n − 1) + (n − 2) + . . .+ 1 ∈ O(n2)

n

0 n − 1

0 n − 2

0 ...1

0 0

Quick-Sort: Average Running Time

Consider a recursive call on a list of size m:I Good call: if both L and G are each less then 3

4 · s size

I Bad call: one of L and G is greater than 34 · s size

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

bad calls good calls bad calls

A good call has probability 12 :

I half of the pivots give rise to good calls

Quick-Sort: Average Running Time

For a node at depth i, we expect (average):I i/2 ancestors are good calls

I the size of the sequence is ≤ (3/4)i/2 · nAs a consequence:

I for a node at depth 2 · log3/4 n the expected input size is 1

I the expected height of the quick-sort tree is O(log n)

O(log n)

O(n)

O(n)

O(n)

sa

sb

sd

. . . . . .

se

. . . . . .

sc

sf

. . . . . .

sg

. . . . . .

The amount of work at depth i is O(n).Thus the expected (average) running time is O(n · log n).

In-Place Quick-Sort

Quick-Sort can be sorted in-place (but then non-stable):

Algorithm inPlaceQuickSort(A, l, r):Input: list A, indices l and rOutput: list A where elements from index l to r are sorted

if l ≥ r then returnp = A[r] (take rightmost element as pivot)l ′ = l and r ′ = rwhile l ′ ≤ r ′ do

while l ′ ≤ r ′ and A[l ′] ≤ p do l ′ = l ′ + 1 (find > p)while l ′ ≤ r ′ and A[r ′] ≥ p do r ′ = r ′ − 1 (find < p)if l ′ < r ′ then swap(A[l ′],A[r ′]) (swap < p with > p)

doneswap(A[r],A[l ′]) (put pivot into the right place)inPlaceQuickSort(A, l, l ′ − 1) (sort left part)inPlaceQuickSort(A, l ′ + 1, r) (sort right part)

Considered in-place although recursion needs O(log n) space.

In-Place Quick-Sort: Example

5 8 3 7 1 6

5 1 3 7 8 6

5 1 3 6 8 7

1 5 3 6 8 7

1 3 5 6 8 7

1 3 5 6 8 7

1 3 5 6 8 7

1 3 5 6 7 8

1 3 5 6 7 8

Unsorted Part

Sorted Part

Pivot

Sorting: Lower Bound

Many sorting algorithms are comparison based:I sort by comparing pairs of objects

I Examples: selection-sort, insertion-sort, bubble-sort,heap-sort, merge-sort, quick-sort,. . .

xi < xj ?noyes

No comparison based sorting algorithm can be faster than

Ω(n · log n)

time (worst-case).

We will prove this lower bound on the next slides. . .

Sorting: Lower Bound (Decision Tree)

We will only count comparisons (sufficient for lower bound):I Assume input is a permutation of the numbers 1, 2, . . . , n.

I Every execution corresponds to a path in the decision tree:

xa < xb?

xc < xd?

xg < xh?

. . . . . .

xi < xj?

. . . . . .

xe < xf ?

xk < xl?

. . . . . .

xm < xo?

. . . . . .

I Algorithm itself maybe does not have this tree structure,but this is the maximal information gained by the algorithm.

Sorting: Lower Bound (Leaves)

Every leaf corresponds to exactly one input permutation:I application of the same swapping steps on two different

input permutations, e.g. . . . ,6,. . . ,7,. . . and . . . ,7,. . . ,6,. . . ,yields different results (not both results can be sorted)

xa < xb?

xc < xd?

xg < xh?

. . . . . .

xi < xj?

. . . . . .

xe < xf ?

xk < xl?

. . . . . .

xm < xo?

. . . . . .

Sorting: Lower Bound (Height of the Decision Tree)

The height of the tree is a lower bound on the running time:I There are n! = n · (n − 1) · · · 1 permutations of 1, 2, . . . , n.

I Thus the height of the tree is at least: log2(n!).

n!

log2(n!)

xa < xb?

xc < xd?

xg < xh?

. . . . . .

xi < xj?

. . . . . .

xe < xf ?

xk < xl?

. . . . . .

xm < xo?

. . . . . .

Sorting: Lower Bound

Hence any comparison-based sorting algorithm takes at least

log2(n!) ≥ log2(n/2)n/2

=n2

log2n2

∈ Ω(n · log n)

time in worst-case.

Summary of Comparison-Based Sorting Algorithms

Algorithm Time Notes

selection-sort O(n2) I slow (but good for small lists)I in-place, stableI insertion-sort good for online

sorting and nearly sorted lists

insertion-sort O(n2)

bubble-sort O(n2)

heap-sort O(n · log2 n)I in-place, not stable, fastI good for large inputs (1K − 1M)

merge-sort O(n · log2 n)

I fast, stable, usually not in-placeI sequential data accessI good for large inputs (> 1M)

quick-sort O(n · log2 n)

(expected)I in-place, randomized, not stableI fastest, good for huge inputs

Quick-sort usually performs fastest, although worst-case O(n2).

Sorting: Comparison of Runtime

Algorithm 25.000sorted

100.000sorted

25.000not sorted

100.000not sorted

selection-sort 1.1 19.4 1.1 19.5

insertion-sort 0 0 1.1 19.6

bubble-sort 0 0 5.5 89.8

Algorithm 5 millionsorted

20 millionsorted

5 millionnot sorted

20 millionnot sorted

insertion-sort 0.03 0.13 timeout timeout

heap-sort 3.6 15.6 8.3 42.2

merge-sort 2.5 10.5 3.7 16.1

quick-sort 0.5 2.2 2.0 8.7

I Source: Gumm, Sommer Einführung in die Informatik.

Bucket-Sort

Let S be a list of n key-element items with keys in [0,N − 1].

Bucket-sort uses the keys as indices into auxiliary array B:I the elements of B are lists, so-called bucketsI Phase 1:

I empty S by moving each item (k, e) into its bucket B[k]

I Phase 2:I for i = 0, . . . ,N − 1 move the items of B[k] to the end of S

Performance:I phase 1 takes O(n) time

I phase 2 takes O(n + N) timeThus bucket-sort is O(n + N).

Bucket-Sort: Example

I key range [0,9]

7,d 1,c 3,a 7,g 3,b 7,e

I Phase 1: filling the buckets

B

1 2 3 4 5 6 7 8 9

1,c 3,a 3,b 7,d 7,g 7,e

I Phase 2: emptying the buckets into the list

1,c 3,a 3,b 7,d 7,g 7,e

Bucket-Sort: Properties and Extensions

The keys are used as indices for an array, thus:I keys should be numbers from [0,N − 1]

I no external comparator

Bucket-sort is a stable sorting algorithm.

Extensions:I can be extended to an arbitrary (fixed) finite set of keys D

(e.g. the names of the 50 U.S. states)

I sort D and compute the rank rankOf(k) of each element

I put item (k, e) into bucket B[rankOf(k)]

Bucket-sort runs in O(n + N) time:I very efficient if keys come from a small intervall [0,N − 1]

(or in the extended version from a small set D)

Lexicographic Order

A d-tuple is a sequence of d keys (k1, k2, . . . , kd):I ki is called the i-th dimension of the tuple

Example: (2,5,1) as point in 3-dimensional space

The lexicographic order of d tuples is recursively defined:

(x1, x2, . . . , xd) < (y1, y2, . . . , yd)⇐⇒x1 < y1 ∨ (x1 = y1 ∧ (x2, . . . , xd) < (y2, . . . , yd))

That is, the tuples are first compared by dimension 1, then 2,. . .

Lexicographic-Sort

Lexicographic-sort sorts a list of d-tuples in lexicographic order:

I Let Ci be comparator comparing tuples by i-th dimension.

I Let stableSort be a stable sorting algorithm.

Lexicographic-sort executes d-times stableSort, thus:I let T (n) be the running time of stableSort

I then lexicographic-sort runs in O(d · T (n))

Algorithm lexicographicSort(S):Input: a list S of d-tuplesOutput: list S sorted in lexicographic order

for i = d downto 1 dostableSort(S,Ci)

done

Lexicographic-Sort: Example

(7,4,6) (5,1,5) (2,0,6) (5,1,4) (2,1,4)

(5,1,4) (2,1,4) (5,1,5) (7,4,6) (2,0,6) dimension 3

(2,0,6) (5,1,4) (2,1,4) (5,1,5) (7,4,6) dimension 2

(2,0,6) (2,1,4) (5,1,4) (5,1,5) (7,4,6) dimension 1

Number representations

We can write numbers in different numeral systems, e.g.:I 4310, that is, 43 in decimal system (base 10)

I 1010112, that is, 43 in binary system (base 2)

I 11213, that is, 43 represented base 3

For every base b ≥ 2 and every number m there exist uniquedigits 0 ≤ d0, . . . ,dl < b such that:

m = dl · bl + dl−1 · bl−1 + . . .+ d1 · b1 + d0 · b0

and if l > 0 then dl 6= 0.

Example

43 = 4310 = 4 · 101 + 3 · 100

= 1010112 = 1 · 25 + 0 · 24 + 1 · 23 + 0 · 22 + 1 · 21 + 1 · 20

= 11213 = 1 · 33 + 1 · 32 + 2 · 31 + 1 · 30

Radix-Sort

Radix-sort is specialization of lexicographic-sort:I uses bucket-sort as stable soring algorithm

I is applicable if tuples consists of integers from [0,N − 1]

I runs in O(d · (n + N)) time

Sorting integers of fixed bit-length d in linear time:I consider a list of n d-bit integers xd−1xd−2 . . . x0 (base 2)

I thus each integer is a d-tuple (xd−1, xd−2, . . . , x0)

I apply radix sort with N = 2

I the runtime is O(d · n)

For example, we can sort 32-bit integers in linear time.

Example

We sort the following list of 4-bit integers:

1001 0010 1101 0001 1110

0010 1110 1001 1101 0001

1001 1101 0001 0010 1110

1001 0001 0010 1101 1110

0001 0010 1001 1101 1110

Exercise C-4.14

Suppose we are given a sequence S of n elements each ofwhich is an integer from [0,n2 − 1]. Describe a simple methodfor sorting S in O(n) time.

I Each number from [0,n2 − 1] can be represented by a twodigit number in the number system with base n.

(n − 1) · n + (n − 1) = n2 − 1

I Conversion of each element into base-n is O(1).(O(n) for the whole list).

I Then use radix-sort to sort in O(2 · n), that is, O(n) time.

The Selection Problem

The selection problem:I given an integer k and a list x1, . . . , xn of n elements

I find the k -th smallest element in the list

Example: the 3rd smallest element of the following list is 6

7 4 9 6 2

An O(n · log n) solution:I sort the list (O(n · log n))

I pick the k -th element of the sorted list (O(1))

2 4 6 7 9

Can we find the k -th smallest element faster?

Quick-Select

Quick-select of the k-th smallest element in the list S:I based on prune-and-search paradigm

I Prune: pick random element x (pivot) from S and split S in:I L elements < x, E elements == x, G elements > x

7 2 1 9 6 5 3 85

2 1 3 5 7 9 6 8

L E GI partitioning into L, E and G works precisely as for quick-sort

I Search:I if k ≤ |L| then return quickSelect(k,L)

I if |L| < k ≤ |L| + |E | then return x

I if k > |L| + |E | then return quickSelect(k − |L| − |E |,G)

Quick-Select Visualization

Quick-select can be displayed by a sequence of nodes:I each node represents recursive call and stores: k, the

sequence, and the pivot element

k = 5, S = (7 4 9 3 2 6 5 1 8)

k = 2, S = (7 4 9 6 5 8)

k = 2, S = (7 4 6 5)

k = 1, S = (7 6 5)

found 5

Quick-Select: Running Time

The worst-case running is O(n2) time:I if the pivot is always the minimal or maximal element

The expected running time is O(n) (compare with quick-sort):I with probability 0.5 the recursive call is good: (3/4)n size

I T (n) ≤ b · a · n + T (34n)

I a is the time steps for partitioning per element

I b is the expected number of calls until a good call

I b = 2 (average number of coins to toss until head shows up)

I Let m = 2 · a · n, then

T (n) ≤ 2 · a · n + T ((3/4)n)

≤ 2 · a · n + 2 · a · (3/4) · n + 2 · a · (3/4)2 · n . . .= 8 · a · n ∈ O(n) (geometric series)

Quick-Select: Median of Medians

We can do selection in O(n) worst-case time.

Idea: recursively use select itself to find a good pivot:I divide S into n/5 sets of 5 elements

I find a median in each set (baby median)

I recursively use select to find the median of the medians

1

2

3

45

1

2

3

45

1

2

3

45

1

2

3

45

1

2

3

45

1

2

3

45

1

2

3

45

1

2

3

45

1

2

3

45

min. size of L

min. size of G

3

The minimal size of L and G is 0.3 · n.

Quick-Select: Median of Medians

We know:I The minimal size of L and G is 0.3 · n.

I Thus the maximal size of L and G is 0.7 · n.Let b ∈ N such that:

I partitioning of a list of size n takes at most b · n time,

I finding the baby medians takes at most b · n time,

I the base case n ≤ 1 takes at most b time.

We derive a recurrence equation for the time complexity:

T (n) =

b if n ≤ 1T (0.7 · n) + T (0.2 · n) + 2 · b · n if n > 1

We will see how to solve recurrence equations. . .

Fundamental Techniques

We have considered algorithms for solving special problems.Now we consider a few fundamental techniques:

I Devide-and-Conquer

I The Greedy Method

I Dynamic Programming

Divide-and-Conquer

Divide-and-Conquer is a general algorithm design paradigm:I Divide: divide input S into k ≥ 2 disjoint subsets S1,. . . ,Sk

I Recur: recursively solve the subproblems S1,. . . ,Sk

I Conquer: combine solutions for S1,. . . ,Sk to solution for S(the base case of the recursion are problems of size 0 or 1)

We have already seen examples:I merge sort

I quick sort

I bottom-up heap construction

Focus now: analysing time complexity by recurrence equations.

Methods for Solving Recurrence Equation

We consider 3 methods for solving recurrence equation:I Iterative substitution method,

I Recursion tree method,

I Guess-and-test method, and

I Master method.

Iterative Substitution

Iterative substitution technique works as follows:I iteratively apply the recurrence equations to itself

I hope to find a pattern

Example

T (n) =

1 if n ≤ 1T (n − 1) + 2 if n > 1

We start with T (n) and apply the recursive case:

T (n) = T (n − 1) + 2= T (n − 2) + 4= T (n − k) + 2 · k

For k = n − 1 we reach the base case T (n − k) = 1 thus:

T (n) = 1 + 2 · (n − 1) = 2 · n − 1

Merge-Sort Review

Merge-sort of a list S with n elements works as follows:I Divide: divide S into two lists S1, S2 of ≈ n/2 elementsI Recur: recursively sort S1, S2

I Conquer: merge S1 and S2 into a sorting of SLet b ∈ N such that:

I merging two lists of size n/2 takes at most b · n time, and

I the base case n ≤ 1 takes at most b time

We obtain the following recurrence equation for merge-sort:

T (n) =

b if n ≤ 12 · T (n/2) + b · n if n > 1

We search a closed solution for the equation, that is:I T (n) = . . . where T (n) does not occur in the right side

Example: Merge-Sort

T (n) =

b if n ≤ 12 · T (n/2) + b · n if n > 1

We assume that n is a power of 2: n = 2k (that is, k = log2 n)I allowed since we are interested in asymptotic behaviour

We start with T (n) and apply the recursive case:

T (n) = 2 · T (n/2) + b · n= 2 ·

(2 · T (n/22) + b · (n/2)

)+ b · n

= 22 · T (n/22) + 2 · b · n= 23 · T (n/23) + 3 · b · n= 2k · T (n/2k) + k · b · n= n · b + (log2 n) · b · n

Thus T (n) = b · (n + n · log2 n) ∈ O(n log2 n).

The Recursion Tree Method

The recursion tree method is a visual approach:I draw the recursion tree and hope to find a pattern

T (n) =

b if n ≤ 23 · T (n/3) + b · n if n > 2

For a node with input size k: work at this node is b · k.

10 n

31 n/3

3ii n/3i

depth nodes size

Thus the work at depth i is 3i · b · n/3i = b · n.The height of the tree is log3 n. Thus T (n) is O(n · log3 n).

Guess-and-Test Method

The guess-and-test method works as follows:I we guess a solution (or an upper bound)

I we prove that the solution is true by induction

Example

T (n) =

1 if n ≤ 1T (n/2) + 2 · n if n > 1

Guess: T (n) ≤ 2 · nI for n = 1 it holds T (n) = 1 ≤ 2 · 1I for n > 1 we have:

T (n) = T (n/2) + 2 · n ≤ 2 · n/2 + 2 · n = 3 · n

Wrong guess: we cannot make 3 · n smaller or equal to 2 · n.

Example, continued

T (n) =

1 if n ≤ 1T (n/2) + 2 · n if n > 1

New guess: T (n) ≤ 4 · nI for n = 1 it holds T (n) = 1 ≤ 4 · 1I for n > 1 we have:

T (n) = T (n/2) + 2 · n≤ 4 · n/2 + 2 · n (by induction hypothesis)= 4 · n

This time the guess was good: 4 · n ≤ 4 · n. Thus T (n) ≤ 4 · n.

Example: Quick-Select with Median of Median

T (n) =

b if n ≤ 1T (0.7 · n) + T (0.2 · n) + 2 · b · n if n > 1

Guess: T (n) ≤ 20 · b · nI for n = 1 it holds T (n) = b ≤ 20 · b · 1I for n > 1 we have:

T (n) = T (0.7 · n) + T (0.2 · n) + 2 · b · n≤ 0.7 · 20 · b · n + 0.2 · 20 · b · n + 2 · b · n (by IH)= 0.9 · 20 · b · n + 2 · b · n = 18 · b · n + 2 · b · n= 20 · b · n

Thus the guess was good.This shows that quick-select with median of median is O(n).

Master method

Many divide-and-conquer recurrence equations have the form:

T (n) =

c if n < daT (n/b) + f (n) if n ≥ d

Theorem (The Master Theorem)

1. if f (n) is O(nlogb a−ε), then T (n) is Θ(nlogb a)

2. if f (n) is Θ(nlogb a logk n), then T (n) is Θ(nlogb a logk+1 n)

3. if f (n) is Ω(nlogb a+ε), then T (n) is Θ(f (n)), providedaf (n/b) ≤ δf (n) for some δ < 1.

Master Method: Example 1

T (n) =






Example

T (n) = 4 · T (n/2) + n

Solution: logb a = 2, thus case 1 says T (n) = Θn2.


T (n) =






Example

T (n) = 2 · T (n/2) + n · log n

Solution: logb a = 1, thus case 2 says T (n) = Θn log2 n.


T (n) =






Example

T (n) = T (n/3) + n · log n

Solution: logb a = 0, thus case 3 says T (n) = Θn log n.


T (n) =






Example

T (n) = 8 · T (n/2) + n2



T (n) =






Example

T (n) = 9 · T (n/3) + n3



T (n) =






Example

T (n) = T (n/2) + 1 binary search

Solution: logb a = 0, thus case 2 says T (n) = Θlog n.


T (n) =






Example

T (n) = 2 · T (n/2) + log n heap construction

Solution: logb a = 0, thus case 1 says T (n) = Θn.

Integer Multiplication

Algorithm to multiply two n-bit integers A and B:I Divide: split A, B in n/2 higher-order and lower-order bits

A = Ah · 2n/2 + A`B = Bh · 2n/2 + B`

We can define A · B as follows:

A · B = (Ah · 2n/2 + A`) · (Bh · 2n/2 + A`)

= Ah · Bh · 2n + Ah · B` · 2n/2 + A` · Bh · 2n/2 + A` · B`

So, T (n) = 4 · T (n/2) + n wich implies T (n) ∈ O(n2).

Can we do better? . . .

Integer Multiplication, continued

Algorithm to multiply two n-bit integers A and B:I Divide: split A, B in n/2 higher-order and lower-order bits

A = Ah · 2n/2 + A`B = Bh · 2n/2 + B`

We use a different way to multiply the parts:

A · B = Ah · Bh · 2n + (Ah · B` + A` · Bh) · 2n/2 + A` · B`= Ah · Bh · 2n

+ [(Ah − A`) · (B` − Bh) + Ah · Bh + A` · B`] · 2n/2

+ A` · B`

Here T (n) = 3T (n/2) + n, hence T (n) ∈ O(nlog2 3) = O(n1.585).(note that we need to calculate Ah · Bh and A` · B` only once)

The Greedy Method

An optimization problem consists ofI a set of configurations (different choices) , and

I an objective function (a score assigned to configurations).We search a configuration that maximizes (or minimizes) theobjective function.

The greedy method for solving optimization problems:I tries to find the global optimum (or come close to it) by

iteratively selecting the locally optimal choice.

That is, in every step we choose the currently best possible.

Making Change

Problem: return coins for a given amount of money.I configurations: coins returned and money left to return

I objective function: minimize the number of coins returned

Greedy solution: always return the largest coin possible.

ExampleThe coins are valued 8, 4, 1. We give change for 18:

I we pick 8, 10 left to pay



I we pick 1, 0 left to payThus the greedy solution is 8,8,1,1.

Making Change, continued

ExampleThe coins are valued 4, 3, 1. We give change for 6:



I we pick 1, 0 left to payThus the greedy solution is 4,1,1.

But the optimal solution would be 3,3:I choosing currently best not always leads to global optimum

The above example does not have the greedy-choice property:

A problem has the greedy-choice property if beginning fromthe start configuration the sequence of locally optimal choicesleads to the globally optimal solution.

The Fractional Knapsack Problem

Given: a set S of n items where each item i has:I bi = a positive benefit

I wi = a positive weight

Fractional Knapsack Problem

I choose fractions xi ≤ wi of items with maximal total benefit:∑i∈S

bi ·xi

wi(objective to maximize)

I with weight at most W :∑i∈S

xi ≤W (constraint)

Example

You found a treasure with:I 50kg jewels, value 1 Million Euros

I 1kg chewing gum, value 20 Euros

I 5kg diamonds, value 5 million Euros

I 10kg gold, value 500.000 EurosYour backpack can carry only 20kg!

What do you take with you?I Of course the highest value per weight:

I 5kg diamonds (value/kg = 1 million Euro)

I 10kg gold (value/kg = 0.05 million Euro)

I 5kg jewels (value/kg = 0.02 million Euro)

Your backpack is filled with the maximum value 5.6 million Euro.

The Fractional Knapsack Algorithm

Greedy choice:I keep taking item with the highest benefit to weight ratio: bi

wi

I run time: O(n log n) (for sorting by this ratio)

fractionalKnapsack(S,~b, ~w ,W ):for each i ∈ S do

xi = 0vi = bi/wi

sort S such that elements are descending w.r.t. viwhile W > 0 do

i = S.remove(S.first())xi = min(wi ,W )

W = W − xidone

The Fractional Knapsack Algorithm

Correctness: suppose there is a better solution A.I Then A did not always choose the highest vj .

I Thus there exist items i , j in A such that:

xi > 0 vi < vj xj < wj

Let

a = min(xi ,wj − xj)

But then we could replace amount a of item i with item j ,which would increase the total benefit.

Thus the solution cannot be better than the greedy choice.

The fract. knapsack problem has the greedy-choice property!

Task Scheduling

Given: a set T of n tasks, each having:I a start time si

I a finish time fi (where si < fi )

Goal: perform all tasks using a minimal number of machines

time1 2 3 4 5 6 7 8 9

Machine 1Machine 2Machine 3

Task Scheduling Algorithm

Greedy choice:I keep taking the task with smallest start time

I assign task to free machine if possible(if all machines are busy take a new machine)

taskSchedule(T ,~s,~f ):m = 0 number of machinessort T such that elements are ascending w.r.t. siwhile ¬T .isEmpty() do

i = T .remove(T .first())if a machine j ≤ m has time for task i then

schedule i on machine jelse

m = m + 1schedule i on machine m

done

Example

Given: a set T of tasks:

[1,4], [1,3], [2,5], [3,7], [4,7], [6,9], [7,8]

(ordered by starting time)

time1 2 3 4 5 6 7 8 9

Machine 1Machine 2Machine 3

Task Scheduling Algorithm

Correctness: suppose there is a better schedule.I Assume:

I the greedy algorithm uses m machines,

I there exists a schedule using less than m machines.

I Let i be the first task scheduled on machine m.

I Then on each of the machines 1, . . . ,m − 1 there runs atask with starting time ≤ si and finishing time fi > si .

I All these task conflict with i and with each other.Hence, there cannot be a schedule with less than m machines.

The Greedy Method: Applications

Further applications of greedy algorithms:I string compression (construction of Huffman codes)

I shortest path in graphs

I minimal spanning trees

I . . .

Dynamic Programming

Dynamic programming is a algorithm design paradigm:I used for optimization problems

I solving complex problems by splitting into smaller parts

I optimal solutions of subproblemsare compbined to globally optimal solution

Sounds a bit like divide-and-conquer, but:I not necessarily recursive

I subproblems are allowed to overlap

I a suitable definition of subproblems can be choosen

Also does not tell much. . . we consider examples. . .

Maximum Subarray Problem

Given: array A of integers (can be negative).Goal: find the maximal sum A[i ] + . . .+ A[i + j ] of a subarray.

Algorithm maxSubarray(A, n):Input: array A containing n integersOutput: maximum subarray sum

max = 0for left = 0 to n − 1 do

sum = 0for right = left to n − 1 do

sum = sum + A[right]if sum > max then max = sum

donedonereturn max

The naive algorithm is O(n2).

Maximum Subarray Algorithm

How to split into subproblems?I let B[r ] be the maximum sum of a subarray ending at rank r

Let B[0] = max(A[0],0).We can compute B[r ] from B[r − 1] as follows:

B[r ] = max(0, B[r − 1] + A[r ])

That is, the maximal subarray ending at r is either:I the maximal subarray ending at r − 1 plus element A[r ], or

I the empty subarray ending at r .

The maximum subarray sum is the maximum of the B[i ]’s.This gives rise to a linear time algorithm. . .

Maximum Subarray Algorithm

Kadane’s algorithm for the maximum subarray problem:

Algorithm maxSubarray(A, n):Input: array A containing n integersOutput: maximum subarray sum

B = new array of length nB[0] = max(A[0],0)

max = B[0]

for r = 1 to n − 1 doB[r ] = max(0,B[r − 1] + A[r ])max = max(max,B[r ])

donereturn max

This algorithm computes the maximal subarray sum in O(n).

Making Change

Problem: return coins for a given amount of money.I configurations: coins returned and money left to return

I objective function: minimize the number of coins returnedLet C be the set of coins, e.g. C = 1,3,4.

How to split into subproblems?I let B[w ] be the best solution for making change for w

(that is the minimal number of coins for making w)

Let B[0] = 0. Compute B[w ] from B[0],. . . ,B[w − 1] as follows:

B[w ] = 1 + min B[w − c] | c ∈ C s.t. c ≤ w

That is, for the minimal number of coins for w we need:I a coin c from C plus the minimal number of coins for w − c

Example

Let C = 1,3,4, we make change for 6.

Let B[0] = ∅.I B[1] = 1 + max B[0] (for c = 1) = 1

I B[2] = 1 + max B[1] (for c = 1) = 2

I B[3] = 1 + max B[2] (for c = 1),B[0] + 1 (for c = 3) = 1

I B[4] = 1 + max B[3] (c = 1),B[1] (c = 3),B[0] (c = 4) = 1

I B[5] = 1 + max B[4] (c = 1),B[2] (c = 3),B[1] (c = 4) = 2

I B[6] = 1 + max B[5] (c = 1),B[3] (c = 3),B[2] (c = 4) = 2Thus we need 2 coins to make change for 6.

If we want to know which coins are needed to make change forw , we have to remember which coin was added in each step.(for example: coin 3 was choosen for B[6], and 3 for B[3])

The 0/1 Knapsack Problem

Given: a set S of n items where each item i has:I bi = a positive benefit

I wi = a positive weight

0/1 Knapsack Problem

I choose items T ⊆ S with maximal total benefit:∑i∈T

bi (objective to maximize)

I with weight at most W :∑i∈T

wi ≤W (constraint)

Thus now each item is either accepted or rejected entirely.

Example

You found a treasure with:I a 7kg gold bar,

I a 5kg gold bar, and

I a 4kg gold bar.The value of the gold coincides with the weight.Your backpack can carry only 10kg!

We cannot split the gold bars. What do you take with you?I The greedy approach would take the 7kg gold bar.

I However, better is: 5kg, 4kg with total weight 9kg.

The 0/1 Knapsack Algorithm

How to split into subproblems?I number items from 1,. . . ,n

I let Sk be the the set of elements 1,. . . ,k

I let B[k,w ] be the best selection from Sk with weight ≤ wThus the B[k,w ] are the solutions of subproblems.

We compute B[k,w ] from B[k − 1,0],. . . ,B[k − 1,w ] as follows:

B[k,w ] =

B[k − 1,w ], if wk > wmax(B[k − 1,w ],B[k − 1,w − wk] + bk) otherwise

That is, the best subset of Sk with weight ≤ w is either:I the best subset of Sk−1 with weight ≤ w , or

I the best subset of Sk−1 with weight ≤ w − wk plus item k

Example

We have:I item 1 with weight 3 and benefit 9 (b1/w1 = 3)

I item 2 with weight 2 and benefit 5 (b2/w2 = 2.5)

I item 3 with weight 2 and benefit 5 (b3/w3 = 2.5)Maximal total weight is w = 4!

We run the algorithm:I B[0,0] = 0, B[0,1] = 0, B[0,2] = 0, B[0,3] = 0, B[0,4] = 0

I B[1,0] = 0, B[1,1] = 0, B[1,2] = 0, B[1,3] = 9, B[1,4] = 9

I B[2,0] = 0, B[2,1] = 0, B[2,2] = 5, B[2,3] = 9, B[2,4] = 9

I B[3,0] = 0, B[3,1] = 0, B[3,2] = 5, B[3,3] = 9, B[3,4] = 10Thus the best benefit for weight 4 is B[3,4] = 10.

Review Matrix Multiplication

Let A, B be matrices:I let A have dimension d × e

I let B have dimension e × fThen C = A · B has dimension d × f .

The (naive) algorithm for computing C = A · B:

C[r , c] =

e−1∑i=0

A[r , i ] · B[i , c]

takes O(d · e · f ) time.(we multiply every row of A with every column of B)

Matrix Chain-Products

Matrix chain-product:I Compute A0 · A1 · · ·An−1.

I Let Ai have dimension di × di+1.Problem: how to put parenthesize?

Example:I B is 3× 100 (height × width)

I C is 100× 5

I D is 5× 5Then

I (B · C) · D takes 1500 + 75 = 1575 operations

I B · (C · D) takes 2500 + 1500 = 4000 operations

Attempt 1: Enumeration Approach

Idea of the enumeration (brute-force) approach:I Try all possible ways to parenthesize A0 · · ·An−1.

I Calculate the number of operations for each of them.

I Pick the best solution.

Running time:I As many ways to parenthesize as binary trees with n

leaves.

I This is exponential, almost 4n.Thus the algorithm is terribly slow.

Attempt 2: Greedy Approach

Idea: select the product that uses the fewest operations.

Example where the greedy approach fails:I A is 2× 1

I B is 1× 2

I C is 2× 3Then:

I A · B takes 4 operations

I B · C takes 6 operationsThus the greedy methods picks (A · B) · C. However:

I (A · B) · C takes 4 + 12 = 16 operations

I A · (B · C) takes 6 + 6 = 12 operations

Thus the greedy method does not yield the optimal result.

Solution: Dynamic Programming

What are the subproblems?I best parenthesization of Ai · Ai+1 · · ·Aj .

Let N[i , j ] be the number of operations for this subproblem.

The global optimum can be defined from optimal subproblems:I Recall that Ai has dimension di × di+1.

I Thus we can define:

N[i , j ] = mini≤k<j

N[i , k ] + N[k + 1, j ] + di · dk+1 · dj+1

Note that the subproblems are can overlap:

I e.g. N[1,4] and N[2,6] overlap

Solution: Dynamic Programming Algorithm

Matrix chain-products with running time O(n3):I We start with N[i , i ] = 0 (nothing to multiply there).

I Then for s = 2,3, . . . we compute problems of size s.(that is, the problems N[i , i + s − 1])

matrixChain(d0,d1, . . . ,dn−1):for i = 0 to n − 1 do N[i, i] = 0 donefor s = 2 to n do

for i = 0 to n − s doj = i + s − 1N[i, j] = +∞for k = i to j − 1 do

ops = N[i, k] + N[k + 1, j] + di · dk+1 · dj+1N[i, j] = min(N[i, j], ops)

donedone

done

Example

Let d0 = 2, d1 = 1, d2 = 2, d3 = 4, d4 = 3.I A0 is 2× 1, A1 is 1× 2, A2 is 2× 4, A3 is 4× 3.

We run the algorithm:I N[0,0] = 0, N[1,1] = 0, N[2,2] = 0, N[3,3] = 0

I N[0,1] = 4, N[1,2] = 8, N[2,3] = 24

I N[0,2] = min(0 + 8 + 8,4 + 0 + 16) = 16,N[1,3] = min(0 + 24 + 6,8 + 0 + 12) = 20

I N[0,4] = min(0 + 20 + 6,4 + 24 + 12,16 + 0 + 24) = 26

Thus the optimal solution is: A0 · ((A1 · A2) · A3).

Graphs

V

U X Z

W

Y

a

c

d

b

e

f

g

h

i

j

Graphs

A graph is a pair (V ,E) where:I V is a set of nodes, called vertices

I E is a set of pairs of nodes, called edgesBoth the nodes and edges can store elements (labels).

U X Zb

h

i

j

Example:I Nodes: U, X , Z

I Edges: (U,X ), (X ,Z ), (Z ,X ), (Z ,Z )

Edge Types

Directed edge:I ordered pair of vertices (u, v)

I first vertex x is the origin

I second vertex y is the destinationDirected graph: all edges are directed.

Amsterdam

London

flight A001

Undirected edge:I unordered pair of vertices (u, v)

Undirected graph: all edges are undirected.

Amsterdam

London

357.52 km

Applications of Graphs

I Electronic circuits.I Transportation networks:

I street network (Google maps)

I flight network

I Computer networks.

I . . .

Graphs: Terminology

I Start- and endpoint:I U is start-point of a

I V is end-point of a

I Edges:I a is outgoing edge of U

I a is incoming edge of V

I Degree of a vertex(number of edges):

I The degree deg(X ) of X is 5

I Self-loops:I j is a self-loop

V

U X Z

W

Y

a

c

d

b

e

f

g

h

i

j

Graphs: Terminology (continued)

I Path:I sequence of alternating

vertices and edges

I begins and ends with vertes

I each edge is preceded byits start-point and followedby its end-point

I e.g. U c W e X g Y f W d V

I Cycle:I path the first and last vertex

are equal

V

U X Z

W

Y

a

c

d

b

e

f

g

h

i

j

A path is simple if all vertices are distinct (except first & last).

Implementing Graphs

Vertex stores:I element

Edge stores:I element

I start-point and end-point

Edge List StructureGraph is stored as list of vertexes and list of edges.

Adjacency List StructureGraph is stored as list of vertexes. Each vertex stores:

I element

I list of (outgoing and incoming) edges

Implementing Graphs: Performance

Assume that we have a graph with n vertices, and m edges.

Edge List Adjacency ListoutgoingEdges(v) O(m) O(deg(v))areAdjacent(v,w) O(m) O(min(deg(v),deg(w)))

insertVertex(o) O(1) O(1)

insertEdge(v,w, o) O(1) O(1)

removeVertex(v) O(m) O(deg(v))insertEdge(e) O(1) O(1)

Weighted Graphs

V

U X Z

W

Y

5

1

2

3

6

3

1

2

5

3

In a weighted graph each edge has an associated numbercalled weight.

The weight can represent distances, costs,. . .

Shortest Path Problem

Given: weighted graph and vertices u, vI find the shortest path between u and v

I shortest means ‘minimal total weight’

Example: shortest path from U to Z has weight 8

V

U X Z

W

Y

5

1

2

3

6

3

1

2

5

3

Dijkstra’s Algorithm

The distance of a vertex v from a vertex s:I minimal weight of a path from v to s

Dijkstra’s algorithm computes distances (shortest paths):I from a given start vertex s to all other verteces

I assumptions: the edge weights are non-negative

Dijkstra’s algorithm:I For every vertex v we store the estimated distance d(v).

(initially d(s) = 0 and d(v) =∞ for all other vertices v)I We start with the set of vertices S = s and in every step:

I add to S the vertex v outside of S with smallest d(v)

I update distances of direct neighbours n of v:d(n) = min(d(n), d(v) + weight of the edge from v to n)

Dijkstra’s Algorithm: Example

Example: shortest paths from U

V

U X Z

W

Y

5

1

2

3

6

3

1

2

5

30

∞

∞

∞

∞

∞

First we add U to S, and update the distances neighbours of U.



V

U X Z

W

Y

5

1

2

3

6

3

1

2

5

30

5

1

∞

∞

∞

Now W is the closest node outside S:I we add W to S and update the distances of its neighbours



V

U X Z

W

Y

5

1

2

3

6

3

1

2

5

30

3

1

7

∞

∞

Now V is the closest node outside S:I we add V to S and update the distances of its neighbours



V

U X Z

W

Y

5

1

2

3

6

3

1

2

5

30

3

1

6

∞

∞

Now X is the closest node outside S:I we add X to S and update the distances of its neighbours



V

U X Z

W

Y

5

1

2

3

6

3

1

2

5

30

3

1

6

7

8

Now Y is the closest node outside S:I we add Y to S and update the distances of its neighbours



V

U X Z

W

Y

5

1

2

3

6

3

1

2

5

30

3

1

6

7

8

Now Z is the closest node outside S:I we add Z to S and update the distances of its neighbours



V

U X Z

W

Y

5

1

2

3

6

3

1

2

5

30

3

1

6

7

8

We obtain a tree of all shortest paths from U to all other nodes.(e.g. the shortest path from U to Z has length 8)


Assume that we have a graph with n vertices, and m edges.I A priority queue stores (heap) vertices outside of S:

I key: distance

I element: vertex

I Each vertex v stores distance d(v) and link to heap node.

Algorithm on the following slide. . .


DijkstraDistances(G, s):Q = new heap-based priority queue(set all distances to∞ except for s with distance 0)for each v ∈ G.vertices() do

if v == s then v.setDistance(0) else v.setDistance(∞)

h = Q.insert(v.getDistance(), v) (returns heap node)v.setHeapNode(h)

while ¬Q.isEmtpy() dov = Q.removeMin()

(update the distance of all neighbours)for each e ∈ v.outgoingEdges() do

w = e.endPoint()d = v.getDistance() + e.getWeight()if d < w.getDistance() then

w.setDistance(d)

Q.replaceKey(w.getHeapNode(), d)

done

Dijkstra’s Algorithm: Performance

Assume that we have a graph with n vertices, and m edges.I The first loop is executed n times:

I n times inserting in priority queue (O(log n))

I The second loop is executed n times:I n times removing from the priority queue (O(log n))

I The third loop is executed for every edge once (m times):I m times updating a key in the priority queue (O(log n))

In total we have: (n · log n) + (n · log n) + (m · log n).I O((n + m) · log n)

(if we use the adjacency list structure)

Dijkstras algorithm is O(m · log n) (graph is connected).

Why Dijkstra’s Algorithm Works

Dijkstras algorithm is based on the greedy method:I it adds vertices by increasing size

Suppose it would not work:I Let x be the closest vertex with wrongly assigned distance.

I Let y be the previous node on the shortest path from s to x.I Then y is closer, and has correct distance:

I thus y has been processed before x, and

I then x must have been assigned the correct distance.

Hence there cannot be a wrong vertex. The algorithm works!

Why It Doesn’t Work For Negative Weights

Example: shortest path from X

X

Y Z V

7 5

-31

0

∞ ∞ ∞

We have processed all nodes, but V has the wrong distance!

The problem with negative weights problem is:I The nodes on a shortest path do not always have

increasing distance.



X

Y Z V

7 5

-31

0

7 5 ∞






X

Y Z V

7 5

-31

0

7 5 6






X

Y Z V

7 5

-31

0

7 4 6




Bellman-Ford Algorithm

Bellman-Ford algorithm:I works with negative weights

I iteration i finds shortest paths of length i

BellmanFord(G, s):for each v ∈ G.vertices() do

if v == s then v.setDistance(0) else v.setDistance(∞)

for i = 1 to n − 1 dofor each e ∈ G.edges() do

v = e.startPoint()w = e.endPoint()d = v.getDistance() + e.getWeight()if d < w.getDistance() then

w.setDistance(d)

done

Example

The nodes are labeled with the values d(v).

0

∞∞ ∞∞ ∞

56

7

-2 4

12

2-4

0

65 7

∞ ∞

56

7

-2 4

1 22

-4

0

64 7

6 9

56

7

-2 4

12

2-4

0

54 7

5 9

56

7

-2 4

12

2-4

0

53 7

5 9

56

7

-2 4

12

2-4

0

53 7

4 9

56

7

-2 4

12

2-4

All-Pair Shortest Paths

Find the distance between every pair of vertices:I n calls to Dijkstra’s algorithm is O(nm log n) time

I n calls to Bellman-Ford is O(n2m) timeWe can do it in O(n3) as follows (assumes nodes are 1, . . . ,n):

AllPair(G):for each vertex pairs (i, j) do

if i != j and (i, j) is an edge in G thenD0[i, j] = weight of edge (i, j)

elseif i == j then D0[i, j] = 0 else D0[i, j] =∞

for k = 1 to n dofor i = 1 to n do

for j = 1 to n doDk[i, j] = min(Dk−1[i, j],Dk−1[i, k] + Dk−1[k, j])

return Dn;

Example

1 2

3 4

1

2

1

53

1 2 3 41 0 1 5 ∞2 ∞ 0 ∞ 23 ∞ ∞ 0 ∞4 3 ∞ 1 0

k = 1

1 2 3 41 0 1 5 ∞2 ∞ 0 ∞ 23 ∞ ∞ 0 ∞4 3 4 1 0

k = 2

1 2 3 41 0 1 5 32 ∞ 0 ∞ 23 ∞ ∞ 0 ∞4 3 4 1 0

k = 3

1 2 3 41 0 1 5 32 ∞ 0 ∞ 23 ∞ ∞ 0 ∞4 3 4 1 0

k = 4

1 2 3 41 0 1 4 32 5 0 3 23 ∞ ∞ 0 ∞4 3 4 1 0

Minimum Spanning Trees

Spanning tree T of a weighted graph G:I T contains all nodes and a subset of the edges of G

I T is a treeSpanning tree is minimal if the total edge weight is minimal.

Application:I Communication networks.

I Transportation networks.X

Y

ZU

V

W

P

8

9 6

3

54

1 10

7

2

Cycle Property

Cycle property:I Let T be a minimal spanning tree for a weighted graph G.

I Let e be an edge that is not in T .

I Let C be the cyle formed by e and T .Then for every edge f of C we have weight(f ) ≤ weight(e).

X

Y

ZU

V

W

P

8

9 6

3

54

1 10

7

2

Proof:I assume weight(f ) > weight(e)

I replacing f by e in T would yieldbetter spanning tree

Example: 8 ≥ 4,1,6.

Partition Property

Partition property:I Consider a partition of the vertices in A and B.

I Let e be a minimum edge between A and B.Then there is a minimum spanning tree containing e.

A

B

X

Y

ZU

V

W

P

4

9 6

3

56

1 10

7

2

Proof: let T be an MSTI if e 6∈ T then it must create a

cycle C with T ; let f be anedge of C between A and B

I by the cycle propertyweight(f ) ≤ weight(e)

I thus weight(f ) = weight(e)

I replacing f with e yields MST

Prim-Jarnik’s Algorithm

Similar to Dijkstra’s algorithm:I we start with a node s, S = s

I build the MST while stepwise extending a set SBut something is changed:

I nodes do not store the distance to to s, but instead

I nodes store the distance to the cloud S

Prim-Jarnik’s algorithm: pick a node sI For every vertex v we store the distance d(v) to S.

(initially d(s) = 0 and d(v) =∞ for all other vertices v)I We start with the set of vertices S = s and in every step:

I add to S the vertex v outside of S with smallest d(v)

I update distances of direct neighbours n of v:d(n) = min(d(n), weight of the edge from v to n)(whenever d(n) is changed we do n.setParent(v))

Example

We compute the minimal spanning tree starting from U:

V

U X Z

W

Y

1

3

4

2

2

1

3

5

0

∞

∞

∞

∞

∞

Prim-Jarnik’s algorithm is O(m · log n).

(same analysis as Dijkstra’s algorithm)

Example


V

U X Z

W

Y

1

3

4

2

2

1

3

5

0

1

3

∞

∞

∞



Example


V

U X Z

W

Y

1

3

4

2

2

1

3

5

0

1

3

2

∞

∞



Example


V

U X Z

W

Y

1

3

4

2

2

1

3

5

0

1

2

2

3

5



Example


V

U X Z

W

Y

1

3

4

2

2

1

3

5

0

1

2

2

1

5



Strings

A string is a sequence of characters. Let S be a string of size m:

I A substring of S is a string of the form S[i . . . j].

I A prefix of S is a string of the form S[0 . . . i].

I A suffix of S is a string of the form S[i . . . (m − 1)].

The alphabet Σ is the set of possible characters.

ExampleLet S = ‘‘acaabca ′′:

I “abc” is a substring of S

I “acaa” is a prefix of S

I “ca” is a suffix of S

Pattern Matching

The pattern matching problem:I given: strings T (text) and P (pattern)

I task: find a substring of T equal to P

Example: find P = “ake” in T = “Hey there, wake up!”.

Applications:I text editors

I search engines

I biological reasearch

I . . .

Brute-Force Algorithm

The brute-force algorithm:I compares P with text T for each possible shift of P

Let n be the size of T and m the size of P.bruteForceMatch(T ,P):

for pos = 0 to n − m domatch = truefor i = 0 to m − 1 do

if P[i] != T [pos + i] thenmatch = falsebreak the for-i-loop

if match then return pos (match at position pos)return −1 (no match)

Worst case complexity: O(n · m)

I Example for worst-case: T = aaa . . . aah, P = aaah

Boyer-Moore Heuristics

Boyer-Moore’s pattern matching algorithm:I Looking-glass heuristic:

I compare P with substring of T backwards

I Character-jump heuristic: if mismatch occurs at T [i] = cI if P does not contain c, shift P to align P[0] with T [i + 1]

I else, shift P to align the last occurrence of c in P with T [i]

Example: we search the pattern P = “rithm” in

a p a t t e r n m a t c h i n g a l g o r i t h m

r i t h m

t

m1

r i t h m

e

m2

r i t h m

a

m3

r i t h m

n

m4

r i t h m

g

m5

r i t h m

h

m6

r i t h m

m

m7

h

h8

t

t9

i

i10

r

r11

Last-Occurrence Function

The last-occurrence function L(c):I L(c) maps letters c ∈ Σ to integers L(c) where:

I L(c) is the largest index i such that P[i] = c, or

I L(c) is −1 if no such index exists.

Example:I Σ = a,b, c,d

I P = ‘‘abacab ′′

c a b c dL(c) 4 5 3 -1

The last-occurence function can be computed in O(m + s) time:I where m is the size of P and s the size of Σ

I L(c) can be represented as array where the numericalcodes of letters c ∈ Σ are the index

Boyer-Moore Algorithm

BoyerMooreMatch(T ,P, Σ):L = lastOccurrenceFunction(P, Σ)

pos = 0while pos ≤ n − m do

match = truefor i = m − 1 downto 0 do

if P[i] != T [pos + i] thenmatch = false(align last occurrence, but move at least 1)pos = pos + max(1, i − L(T [pos + i]))break the for-i-loop

doneif match then return pos (match at position pos)

donereturn −1 (no match)

Example

Example:I Σ = a,b, c,d

I P = ‘‘abacab ′′

c a b c dL(c) 4 5 3 -1

a b a c a a b c a d a b a c a b a a b b

a b a c a b

a

b1

a b a c a b

b

b2

a

a3

a

c4

a b a c a b

c

b5

a b a c a b

d

b6

a b a c a b

b

b7

a

a8

c

c9

a

a10

b

b11

a

a12

Boyer-Moore: Analysis

The worst case time complexity of Boyer-Moore is O(n · m + s)I Example of the worst-case: T = “aaa. . . a” and P = “baaa“

The worst-case:I may occur in images or DNA sequences

I is unlikely in English text

Boyer-Moore’s algorithm isI significantly faster than brute-force for English text

Knuth-Morris-Pratt (KMP) Algorithm

Knuth-Morris-Pratt’s algorithm:I compares the pattern left-to-right

I shifts pattern more intelligently than brute-force algorithm

When a mismatch occurs: what is the most we can shift thepattern to avoid redundant comparisons?

I Answer: largest prefix of P[0 . . . i] that is suffix of P[1 . . . i](where i is the last compared position)

. . a b a a b x . . . . .

a b a a b aa b a a b

a b a a b x

b

a b a a b aa bi

No need to compare again. Resume comparing here.

KMP Failure (Shift) Function

The Knuth-Morris-Pratt’s failure function F (i):I the algorithm preprocesses the pattern to compute F (i)

I F (i) is the largest j such that P[0 . . . j] is suffix of P[1 . . . i]

Example: P = ”abaaba“

i 0 1 2 3 4 5P[i] a b a a b aF (i) 0 0 1 1 2 3

. . a b a a b x . . . . .

a b a a b aa b a a b

a b a a b x

b

a b a a b aa bi

F (i − 1)

Knuth-Morris-Pratt Algorithm

KMPMatch(T ,P):F = failureFunction(P)

pos = 0i = 0while pos ≤ n − m do

if P[i] != T [pos + i] thenif i > 0 then

pos = pos + i − F (i − 1)

elsepos = pos + 1

i = F (i − 1)

elsei = i + 1

if i == m then return pos (match at position pos)donereturn −1 (no match)

Example

a b a a b c a b a b a a b a a b a a b b

a0

b1

a2

a3

b4

a5

a

a1

b

b2

a

a3

a

a4

b

b5

c

a6

a0

b1

a2

a3

b4

a5

a b

c

a7

a0

b1

a2

a3

b4

a5

c

a8

a0

b1

a2

a3

b4

a5

a

a9

b

b10

a

a11

b

a12

a0

b1

a2

a3

b4

a5

a

b

b13

a

a14

a

a15

b

b16

a

a17

i 0 1 2 3 4 5F (i) 0 0 1 1 2 3

Knuth-Morris-Pratt: Analysis

Knuth-Morris-Pratt’s algorithm runs in O(n + m) time:I Failure function can computed in O(m).I In each iteration of the while-loop:

I either i increases by 1, or

I pos increases by at least the amount i decreases.

I Thus the while-loop is executed at most 2n times.

Binary Character Encoding

Given: alphabet A1,A2, . . . ,An.I find binary codes c(Ai) of the letters Ai

Encode letters as binary numbers of bit length dlog2 ne:I e.g. alphabet A,B,C

I c(A) = 00, c(B) = 01, c(C) = 11

I ACBA = 00110100

More efficient encodings:I c(A) = 0, c(B) = 10, c(C) = 11

I ACBA = 011100

The coding should be unambiguous. Bad example:I Let c(A) = 10, for c(B) = 01, and for c(C) = 0.

I Then both BC and CA yield the 010.The code must be prefix-free: no c(Ai) is prefix of a c(Aj)!

Huffman code

Given:I Alphabet A1,A2, . . . ,An.

I Probabilities 0 ≤ p(Ai) ≤ 1 for every letter Ai .

Problem:I Find binary codes c(Ai) of the letters Ai .

I Expected length for the codes, that is,∑1≤i≤n

p(Ai) · |c(Ai)|

should be minimal.

Huffman Algorithm

Huffman Algorithm:I Create for every letter Ai a tree consisting only of Ai .I Search the two trees with the lowest sum of probabilities:

I merge these two trees with a new node at the root

I Repeat the last step until only one tree is left.

Example:

p(A) = .15, p(B) = .2, p(C) = .15, p(D) = .4, p(E) = .1

A B C D E.15 .2 .15 .4 .1A B D

C0

E1

.15 .2 .4 .25D

A0

B1

C0

E1

.4 .25.35D

A0

B1

0

C0

E1

1

.6

D0

A0

B1

0

C0

E1

1

1

.6 c(A) = 100c(B) = 101c(C) = 110c(D) = 0c(E) = 111

average code length:3 · 0.6 + 1 · 0.4 = 2.2

Data Structures and Algorithms - Vrije Universiteit …tcs/ds/slides.pdfData Structures and...

Documents

Transcript of Data Structures and Algorithms - Vrije Universiteit …tcs/ds/slides.pdfData Structures and...