Theory of Algorithms: Divide and Conquer

James Gain and Edwin Blake {jgain | edwin} @cs.uct.ac.za

Department of Computer Science

University of Cape Town

August - October 2004

Objectives

To introduce the divide-and-conquer mind set

To show a variety of divide-and-conquer solutions:

Merge Sort

Quick Sort

Multiplication of Large Integers

Strassen’s Matrix Multiplication

Closest Pair and Convex Hull by Divide-and-Conquer

To discuss the strengths and weaknesses of a divide-and-conquer strategy

Divide-and-Conquer

Best known algorithm design strategy:

1. Divide instance of problem into two or more smaller instances

2. Solve smaller instances recursively

3. Obtain solution to original (larger) instance by combining these solutions

The Master Theorem applies: recurrences are of the form $T(n) = aT(n/b) + f(n)$, with $a \geq 1$, $b \geq 2$

Silly Example (Addition): $a_0 + \dots + a_{n-1} = (a_0 + \dots + a_{n/2-1}) + (a_{n/2} + \dots + a_{n-1})$
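The addition example can be sketched in a few lines of Python. This is only an illustration: the function name `dc_sum` and the half-open index convention are assumptions, not part of the slides. Its recurrence is T(n) = 2T(n/2) + 1, which the Master Theorem places in Θ(n), so it is no faster than a simple loop.

```python
def dc_sum(a, lo=0, hi=None):
    """Sum a[lo:hi] by splitting the range in half and adding the two partial sums."""
    if hi is None:
        hi = len(a)
    if hi - lo == 0:          # empty range
        return 0
    if hi - lo == 1:          # a single element is its own sum
        return a[lo]
    mid = (lo + hi) // 2      # divide
    return dc_sum(a, lo, mid) + dc_sum(a, mid, hi)  # conquer and combine


print(dc_sum([3, 1, 4, 1, 5, 9, 2, 6]))  # 31
```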

Divide-and-Conquer Illustrated

[Figure: a problem of size n is divided into subproblem 1 and subproblem 2, each of size n/2; a solution to subproblem 1 and a solution to subproblem 2 are combined into a solution to the original problem]

Mergesort

Algorithm:

1. Split A[1..n] in half and copy each half into arrays B[1..⌈n/2⌉] and C[1..⌊n/2⌋]

2. Recursively MergeSort arrays B and C

3. Merge sorted arrays B and C into array A

Merging:

Repeat until no elements remain in one of B or C:

1. Compare the first elements in the remaining parts of B and C

2. Copy the smaller into A, incrementing the index of the corresponding array

Once all elements in one of B or C have been processed, copy the remaining unprocessed elements from the other array into A
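A minimal Python sketch of the split, recurse and merge steps follows. It makes two assumptions: it returns a new sorted list rather than copying back into A, and it gives the first half the extra element when n is odd.

```python
def merge(b, c):
    """Merge two sorted lists b and c into one sorted list."""
    a = []
    i = j = 0
    # Repeat until one of the two lists is exhausted.
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:
            a.append(b[i])
            i += 1
        else:
            a.append(c[j])
            j += 1
    # Copy whatever remains of the other list.
    a.extend(b[i:])
    a.extend(c[j:])
    return a


def merge_sort(a):
    """Return a sorted copy of a."""
    if len(a) <= 1:
        return list(a)
    mid = (len(a) + 1) // 2      # assumption: first half gets the extra element
    b = merge_sort(a[:mid])      # sort the two halves recursively
    c = merge_sort(a[mid:])
    return merge(b, c)           # combine


print(merge_sort([7, 2, 1, 6, 4]))  # [1, 2, 4, 6, 7]
```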

Mergesort Example

Split: 7 2 1 6 4

Split: 7 2 1 | 6 4

Split: 7 2 | 1 | 6 | 4

Split: 7 | 2

Merge: 2 7

Merge: 1 2 7 | 4 6

Merge: 1 2 4 6 7

Efficiency of Mergesort

Recurrence: $C(n) = 2C(n/2) + C_{merge}(n)$ for $n > 1$, $C(1) = 0$

$C_{merge}(n) = n - 1$ in the worst case

All cases have same efficiency: $\Theta(n \log n)$

Number of comparisons is close to theoretical minimum for comparison-based sorting: $\log_2 n! \approx n \log_2 n - 1.44n$

Space requirement: $\Theta(n)$ (NOT in-place)

Can be implemented without recursion (bottom-up)
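One way to implement the bottom-up variant is sketched below. The function name `merge_sort_bottom_up` and the choice to build a fresh list on every pass are illustrative assumptions. It merges runs of width 1, then 2, then 4, and so on, until a single run covers the whole list.

```python
def merge_sort_bottom_up(a):
    """Return a sorted copy of a without using recursion."""
    a = list(a)
    n = len(a)
    width = 1                      # current length of the sorted runs
    while width < n:
        merged = []
        # Merge adjacent pairs of runs: a[lo:lo+width] and a[lo+width:lo+2*width].
        for lo in range(0, n, 2 * width):
            b = a[lo:lo + width]
            c = a[lo + width:lo + 2 * width]
            i = j = 0
            while i < len(b) and j < len(c):
                if b[i] <= c[j]:
                    merged.append(b[i])
                    i += 1
                else:
                    merged.append(c[j])
                    j += 1
            merged.extend(b[i:])
            merged.extend(c[j:])
        a = merged
        width *= 2                 # runs are now twice as long
    return a
```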

Quicksort

Select a pivot (partitioning element)

Rearrange the list into two sublists:

All elements positioned before the pivot are ≤ the pivot

Those positioned after the pivot are > the pivot

Requires a pivoting algorithm

Exchange the pivot with the last element in the first sublist

The pivot is now in its final position

QuickSort the two sublists

[Figure: array with pivot p, the sublist with A[i] ≤ p and the sublist with A[i] > p]

The Partition Algorithm
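The pseudocode for this slide did not survive in the transcript. As a stand-in, here is a sketch of a two-index partition that takes the first element as pivot and scans with indices i and j, which matches the i/j markers in the quicksort example. The function names are assumptions. With this scheme, elements equal to the pivot may end up on either side of it.

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] around the pivot a[lo]; return the pivot's final index."""
    p = a[lo]
    i, j = lo, hi + 1
    while True:
        # Move i right until it finds an element >= pivot (or runs off the end).
        i += 1
        while i <= hi and a[i] < p:
            i += 1
        # Move j left until it finds an element <= pivot (stops at a[lo] at the latest).
        j -= 1
        while a[j] > p:
            j -= 1
        if i >= j:                 # indices have crossed: partitioning is done
            break
        a[i], a[j] = a[j], a[i]
    a[lo], a[j] = a[j], a[lo]      # put the pivot into its final position
    return j


def quicksort(a, lo=0, hi=None):
    """Sort the list a in place and return it."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        s = partition(a, lo, hi)
        quicksort(a, lo, s - 1)
        quicksort(a, s + 1, hi)
    return a


print(quicksort([5, 3, 1, 9, 8, 2, 7]))  # [1, 2, 3, 5, 7, 8, 9]
```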

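Quicksort Example

5 3 1 9 8 2 7 (pivot 5)

5 3 1 9 8 2 7 (i stops at 9, j stops at 2: exchange)

5 3 1 2 8 9 7 (j stops at 2, i stops at 8: the indices have crossed)

2 3 1 5 8 9 7 (pivot exchanged with A[j]; 5 is in its final position)

Recursive Call Quicksort (2 3 1) & Quicksort (8 9 7)

2 3 1 (i stops at 3, j stops at 1: exchange)

2 1 3 (j stops at 1, i stops at 3: the indices have crossed)

1 2 3 (pivot 2 exchanged with A[j])

Recursive Call Quicksort (1) & Quicksort (3)

8 9 7 (i stops at 9, j stops at 7: exchange)

8 7 9 (j stops at 7, i stops at 9: the indices have crossed)

7 8 9 (pivot 8 exchanged with A[j])

Recursive Call Quicksort (7) & Quicksort (9)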

Worst Case Efficiency of Quicksort

In the worst case all splits are completely skewed

For instance, an already sorted list! One subarray is empty, the other is reduced by only one:

Make n+1 comparisons

Exchange pivot with itself

Quicksort left = Ø, right = A[1..n-1]

$C_{worst}(n) = (n+1) + n + \dots + 3 = \frac{(n+1)(n+2)}{2} - 3 \in \Theta(n^2)$

[Figure: array with pivot p, the sublist with A[i] ≤ p and the sublist with A[i] > p]
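The closed form for the worst-case count follows from the arithmetic series. Written out, as a sketch of the missing step:

```latex
C_{worst}(n) = (n+1) + n + \dots + 3
             = \sum_{k=1}^{n+1} k \;-\; (1 + 2)
             = \frac{(n+1)(n+2)}{2} - 3
             \in \Theta(n^2)
```

For example, with n = 7 the sum is 8 + 7 + 6 + 5 + 4 + 3 = 33, and the formula gives (8 · 9)/2 - 3 = 33.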

General Efficiency of Quicksort

Efficiency Cases:

Best: split in the middle, $\Theta(n \log n)$

Worst: sorted array, $\Theta(n^2)$

Average: random arrays, $\Theta(n \log n)$

Improvements (in combination 20-25% faster):

Better pivot selection: median-of-three partitioning avoids the worst case in sorted files

Switch to Insertion sort on small subfiles

Elimination of recursion

Considered the method of choice for internal sorting for large files (n ≥ 10000)
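Two of these improvements, median-of-three pivot selection and switching to insertion sort on small subfiles, might be combined as in the sketch below. The cutoff value `CUTOFF = 10` and all function names are hypothetical choices for illustration; the slides do not specify them, and recursion is not eliminated here.

```python
CUTOFF = 10  # hypothetical threshold below which insertion sort takes over


def median_of_three(a, lo, hi):
    """Return the index of the median of a[lo], a[mid] and a[hi]."""
    mid = (lo + hi) // 2
    candidates = sorted([(a[lo], lo), (a[mid], mid), (a[hi], hi)])
    return candidates[1][1]


def insertion_sort(a, lo, hi):
    """Sort a[lo..hi] in place by insertion."""
    for k in range(lo + 1, hi + 1):
        v = a[k]
        m = k - 1
        while m >= lo and a[m] > v:
            a[m + 1] = a[m]
            m -= 1
        a[m + 1] = v


def quicksort_tuned(a, lo=0, hi=None):
    """Sort a in place using median-of-three pivots and an insertion-sort cutoff."""
    if hi is None:
        hi = len(a) - 1
    if hi - lo + 1 <= CUTOFF:          # small (or empty) subfile
        insertion_sort(a, lo, hi)
        return a
    m = median_of_three(a, lo, hi)
    a[lo], a[m] = a[m], a[lo]          # move the chosen pivot to the front
    p = a[lo]
    i, j = lo, hi + 1
    while True:                        # two-index partition around p
        i += 1
        while i <= hi and a[i] < p:
            i += 1
        j -= 1
        while a[j] > p:
            j -= 1
        if i >= j:
            break
        a[i], a[j] = a[j], a[i]
    a[lo], a[j] = a[j], a[lo]
    quicksort_tuned(a, lo, j - 1)
    quicksort_tuned(a, j + 1, hi)
    return a
```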

QuickHull Algorithm

Inspired by Quicksort

1. Sort points by increasing x-coordinate values

2. Identify leftmost and rightmost extreme points P1 and P2 (part of hull)

3. Compute upper hull:

Find point Pmax that is farthest away from line P1P2

Quickhull the points to the left of line P1Pmax

Quickhull the points to the left of line PmaxP2

4. Similarly compute lower hull

[Figure: points P1, Pmax and P2, with the regions to the left (L) of lines P1Pmax and PmaxP2]

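Finding the Furthest Point

Given three points in the plane $p_1 = (x_1, y_1)$, $p_2 = (x_2, y_2)$, $p_3 = (x_3, y_3)$

Area of triangle $\triangle p_1 p_2 p_3 = \frac{1}{2} |D|$

$$D = \begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix} = x_1 y_2 + x_3 y_1 + x_2 y_3 - x_3 y_2 - x_2 y_1 - x_1 y_3$$

Properties of D:

Positive iff $p_3$ is to the left of the directed line $p_1 p_2$

$|D|$ is proportional to the distance of $p_3$ from line $p_1 p_2$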
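Using the determinant both as a left-of-line test and as a distance measure, QuickHull can be sketched as below. The function names, the use of (x, y) tuples, and the clockwise output order starting from the leftmost point are assumptions made for this illustration. Points lying exactly on a hull edge are not reported as vertices.

```python
def det(p1, p2, p3):
    """Determinant D: positive iff p3 lies to the left of the directed line p1 -> p2."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return x1 * y2 + x3 * y1 + x2 * y3 - x3 * y2 - x2 * y1 - x1 * y3


def hull_side(p1, p2, points):
    """Hull vertices strictly to the left of p1 -> p2, ordered from p1 towards p2."""
    left = [p for p in points if det(p1, p2, p) > 0]
    if not left:
        return []
    # The point with the largest D is farthest from the line, so it is on the hull.
    pmax = max(left, key=lambda p: det(p1, p2, p))
    return hull_side(p1, pmax, left) + [pmax] + hull_side(pmax, p2, left)


def quickhull(points):
    """Return the convex hull vertices in clockwise order, starting at the leftmost point."""
    pts = sorted(set(points))          # sort by increasing x (then y)
    if len(pts) < 3:
        return pts
    p1, p2 = pts[0], pts[-1]           # leftmost and rightmost extreme points
    upper = hull_side(p1, p2, pts)     # upper hull
    lower = hull_side(p2, p1, pts)     # lower hull, same routine with the line reversed
    return [p1] + upper + [p2] + lower


pts = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3), (3, 1)]
print(quickhull(pts))  # [(0, 0), (0, 4), (4, 4), (4, 0)]
```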

Efficiency of Quickhull

Finding the point farthest away from line P1P2 is linear in the number of points

This gives the same efficiency as quicksort:

Worst case: $\Theta(n^2)$

Average case: $\Theta(n \log n)$

If an initial sort is required, this can be accomplished in $\Theta(n \log n)$, with no increase in asymptotic efficiency class

Alternative Divide-and-Conquer Convex Hull: Graham’s scan and DCHull

Also $\Theta(n \log n)$ but with lower coefficients

Strassen’s Matrix Multiplication

Strassen observed [1969] that the product of two matrices can be computed as follows:

$$
\begin{bmatrix} C_{00} & C_{01} \\ C_{10} & C_{11} \end{bmatrix}
= \begin{bmatrix} A_{00} & A_{01} \\ A_{10} & A_{11} \end{bmatrix}
\begin{bmatrix} B_{00} & B_{01} \\ B_{10} & B_{11} \end{bmatrix}
= \begin{bmatrix} M_1 + M_4 - M_5 + M_7 & M_3 + M_5 \\ M_2 + M_4 & M_1 + M_3 - M_2 + M_6 \end{bmatrix}
$$

Op Count:

Each of M1, …, M7 requires 1 mult and 1 or 2 add/sub

Total = 7 mults and 18 add/sub

Compared with brute force, which requires 8 mults and 4 add/sub

n/2 × n/2 submatrices are computed recursively by the same method
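The definitions of M1 to M7 were on the slide image and did not survive in the transcript. The sketch below uses the standard Strassen products, which agree with the combination formulas for C00 to C11 given on the slide. It assumes square matrices stored as lists of lists with a power-of-two size; the helper names are illustrative.

```python
def mat_add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def mat_sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(X):
    """Split a square matrix into its four n/2 x n/2 quadrants."""
    h = len(X) // 2
    return ([r[:h] for r in X[:h]], [r[h:] for r in X[:h]],
            [r[:h] for r in X[h:]], [r[h:] for r in X[h:]])


def strassen(A, B):
    """Multiply two n x n matrices (n a power of 2) using 7 recursive products."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    A00, A01, A10, A11 = split(A)
    B00, B01, B10, B11 = split(B)
    # Standard Strassen products (assumed; not shown in the transcript).
    M1 = strassen(mat_add(A00, A11), mat_add(B00, B11))
    M2 = strassen(mat_add(A10, A11), B00)
    M3 = strassen(A00, mat_sub(B01, B11))
    M4 = strassen(A11, mat_sub(B10, B00))
    M5 = strassen(mat_add(A00, A01), B11)
    M6 = strassen(mat_sub(A10, A00), mat_add(B00, B01))
    M7 = strassen(mat_sub(A01, A11), mat_add(B10, B11))
    # Combine exactly as on the slide.
    C00 = mat_add(mat_sub(mat_add(M1, M4), M5), M7)
    C01 = mat_add(M3, M5)
    C10 = mat_add(M2, M4)
    C11 = mat_add(mat_sub(mat_add(M1, M3), M2), M6)
    top = [l + r for l, r in zip(C00, C01)]
    bottom = [l + r for l, r in zip(C10, C11)]
    return top + bottom


print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```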

Efficiency of Strassen’s Algorithm

If n is not a power of 2, matrices can be padded with zeros

Number of multiplications: $M(n) = 7M(n/2)$ for $n > 1$, $M(1) = 1$

Set $n = 2^k$, apply backward substitution

$M(2^k) = 7M(2^{k-1}) = 7[7M(2^{k-2})] = 7^2 M(2^{k-2}) = \dots = 7^k$

$M(n) = 7^{\log_2 n} = n^{\log_2 7} \approx n^{2.807}$

Number of additions: $A(n) \in \Theta(n^{\log_2 7})$
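The addition count can be justified with the Master Theorem. The recurrence below is not stated on the slide; it is a sketch that follows from the 18 additions and subtractions of n/2 × n/2 matrices performed at each level:

```latex
A(n) = 7A(n/2) + 18\left(\frac{n}{2}\right)^2 \text{ for } n > 1, \quad A(1) = 0

a = 7,\; b = 2,\; f(n) \in \Theta(n^2),\; a > b^2
\;\Longrightarrow\; A(n) \in \Theta(n^{\log_2 7})
```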

Other algorithms come closer to the lower limit of $n^2$ multiplications, but are even more complicated

Strengths and Weaknesses of Divide-and-Conquer

Strengths:

Generally improves on Brute Force by one base efficiency class

Easy to analyse using the Master Theorem

Weaknesses:

Often requires recursion, which introduces overheads

Can be inapplicable and inferior to simpler algorithmic solutions