Stephen P. Carl - CS 242: Recursive Sorting Algorithms. Reading: Chapter 5.


Page 1:

Recursive Sorting Algorithms

Reading: Chapter 5

Page 2:

Divide and Conquer

• Divide and Conquer is a special case of problem decomposition.

• A “divide and conquer” algorithm:
– divides a problem into subproblems,
– conquers the subproblems (by solving them), and
– combines the obtained solutions in some way to arrive at the solution to the complete problem.
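The three steps above can be seen in a tiny example. This sketch is my own illustration, not from the slides: it finds the maximum of an array by dividing, conquering each half, and combining the two answers.

```java
public class DncMax {
    // Divide-and-conquer maximum of a[first..last] (illustrative example).
    static int max(int[] a, int first, int last) {
        if (first == last) return a[first];          // base case: one element
        int mid = (first + last) / 2;                // divide
        int leftMax  = max(a, first, mid);           // conquer left half
        int rightMax = max(a, mid + 1, last);        // conquer right half
        return Math.max(leftMax, rightMax);          // combine the solutions
    }

    public static void main(String[] args) {
        int[] a = {3, 7, 8, 1, 5, 2, 4, 6};
        System.out.println(max(a, 0, a.length - 1)); // 8
    }
}
```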

Page 3:

Sorting by Divide and Conquer

The two best-known sorting algorithms that use this strategy are Mergesort and Quicksort.

– Mergesort: subdivides the problem by dividing the sequence to be sorted in two and recursively sorting these subsequences. It then combines the solutions obtained using the merge procedure.

– Quicksort: subdivides the problem by dividing the sequence using a more sophisticated technique called partitioning; the subsequences are then recursively subdivided in the same way until eventually the sorted sequence is obtained.

Page 4:

Mergesort

// Pre: 0 <= first, last < N (size of array a)
// Post: a[0] <= a[1] <= … <= a[N-1]
public static void MergeSort(Object a[], int first, int last)
{
    // base case: first >= last (nothing left to sort)
    if (first < last) {   // recursive step
        int mid = (first + last) / 2;
        MergeSort(a, first, mid);
        MergeSort(a, mid+1, last);
        merge(a, first, mid, last);
    }
}

Page 5:

Split:
[3 7 8 1 5 2 4 6]                  1 array segment, 8 unsorted elements
[3 7 8 1] [5 2 4 6]                2 array segments, 4 unsorted per segment
[3 7] [8 1] [5 2] [4 6]            4 array segments, 2 unsorted per segment
[3] [7] [8] [1] [5] [2] [4] [6]    8 array segments, each sorted by default

Merge:
[3 7] [1 8] [2 5] [4 6]            4 array segments, 2 sorted elements
[1 3 7 8] [2 4 5 6]                2 array segments, 4 sorted elements
[1 2 3 4 5 6 7 8]                  1 array segment, 8 sorted elements - done!

Page 6:

The Merge Algorithm

// Pre: a[first] to a[mid] and a[mid+1] to a[last] are sorted
// Post: a[first] through a[last] are sorted
protected void merge(Object a[], int first, int mid, int last)

// create a temporary array
// I <- first, J <- mid+1
// do the following:
//     if (a[I] <= a[J])
//         copy a[I] to temp array
//         increment I
//     else
//         copy a[J] to temp array
//         increment J
//     endif
// while I < mid+1 and J < last+1

Page 7:

Merge Algorithm continued...

// if I < mid+1

// copy elements from I to mid into temp array

// else if J < last + 1

// copy elements from J to last into temp array

// end if

// copy temp array to original (first to last)

// end merge algorithm
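The comment outline above can be fleshed out into working code. A minimal sketch, using int[] for brevity (the slides use Object[], whose elements cannot be compared without a Comparator); the class name MergeDemo is my own:

```java
import java.util.Arrays;

public class MergeDemo {
    // Merge a[first..mid] and a[mid+1..last], both sorted, into one sorted run.
    static void merge(int[] a, int first, int mid, int last) {
        int[] temp = new int[last - first + 1];
        int i = first, j = mid + 1, k = 0;
        // Copy the smaller front element of either run until one run empties.
        while (i <= mid && j <= last) {
            if (a[i] <= a[j]) temp[k++] = a[i++];
            else              temp[k++] = a[j++];
        }
        // Copy whatever remains in the non-empty run.
        while (i <= mid)  temp[k++] = a[i++];
        while (j <= last) temp[k++] = a[j++];
        // Copy the temp array back to the original segment.
        for (k = 0; k < temp.length; k++) a[first + k] = temp[k];
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 7, 8, 2, 4, 5, 6};      // two sorted halves
        merge(a, 0, 3, 7);
        System.out.println(Arrays.toString(a));  // [1, 2, 3, 4, 5, 6, 7, 8]
    }
}
```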

Page 8:

Recursive Properties of Mergesort

• Is MergeSort linear or tree recursion?

• Is MergeSort tail recursive?

• In general, the space complexity of a recursive algorithm includes the stack space required, which is proportional to the maximum number of calls active during execution of the function.

Page 9:

Efficiency of Mergesort

MergeSort is tree recursion, so we must ask whether its performance in terms of the number of recursive calls is as awful as Fibonacci or the choose function. But notice that the recursive calls do no redundant work, as was the case with those other functions.

Sequence of size 8 : 15 calls to MergeSort

Sequence of size 16: 31 calls to MergeSort

Sequence of size 32: 63 calls to MergeSort
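The counts above follow the pattern 2n − 1 calls for a sequence of size n, which can be checked by instrumenting the recursion with a counter. A small sketch (the class and method names are my own; the merge step is omitted because it does not affect the call count):

```java
public class CallCounter {
    static int calls = 0;

    // Same recursion shape as MergeSort; we only count the calls.
    static void countCalls(int first, int last) {
        calls++;
        if (first < last) {
            int mid = (first + last) / 2;
            countCalls(first, mid);
            countCalls(mid + 1, last);
        }
    }

    static int callsFor(int n) {
        calls = 0;
        countCalls(0, n - 1);
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(callsFor(8));   // 15
        System.out.println(callsFor(16));  // 31
        System.out.println(callsFor(32));  // 63
    }
}
```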

Page 10:

QuickSort

Idea: assume we know which value in a sequence will end up in or near the middle of the sequence once it has been sorted; call this value the pivot.

To quicksort the sequence:
– Place all values less than the pivot value to the “left” of the pivot value, in any order.
– Place all values greater than the pivot value to the “right” of the pivot value, again in any order.
– Place the pivot value in its correct position in the sequence.
– Now, quicksort the left and right subsequences.

Page 11:

Outline of QuickSort

// Pre: low is a valid index into array a AND
//      (high - low) < size of array a
// Post: a[0] <= a[1] <= … <= a[N-1]
public void quicksort(Object a[], int low, int high)
{
    int pivot_position;

    if (low < high) {   // base case: low >= high
        pivot_position = partition(a, low, high);
        quicksort(a, low, pivot_position-1);
        quicksort(a, pivot_position+1, high);
    }
}

Page 12:

Discussion

In the preceding, pivot_position is the index of the pivot, calculated by calling partition; this call also rearranges the elements in the array. The process repeats recursively on each subsequence, until each subarray contains but a single element.

So after calling partition we have the following:

[ Elements < pivot | pivot | Elements >= pivot ]
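The slides leave partition's body unspecified. One common scheme that produces exactly this arrangement is the Lomuto partition, sketched below as an illustration (my own code, not the textbook's; it uses a[high] as the pivot and int[] for brevity):

```java
import java.util.Arrays;

public class PartitionDemo {
    // Lomuto partition: uses a[high] as the pivot and returns the
    // pivot's final index; smaller values end up left of it.
    static int partition(int[] a, int low, int high) {
        int pivot = a[high];
        int i = low - 1;                       // end of the "< pivot" region
        for (int j = low; j < high; j++) {
            if (a[j] < pivot) {
                i++;
                int t = a[i]; a[i] = a[j]; a[j] = t;
            }
        }
        // Drop the pivot into its final position.
        int t = a[i + 1]; a[i + 1] = a[high]; a[high] = t;
        return i + 1;
    }

    public static void main(String[] args) {
        int[] a = {3, 7, 8, 1, 5, 2, 4, 6};
        int p = partition(a, 0, a.length - 1);
        // Everything left of index p is < 6; everything right is >= 6.
        System.out.println(p + " " + Arrays.toString(a));
    }
}
```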

Page 13:

How to Choose the Pivot?

Problem: there is no way to ensure the assumption that we know the pivot value; we have to guess. Note:

• Quicksort works well only if the pivot value is chosen properly; what happens if we guess wrong and end up with the value that comes first in the sorted order?

• On average, choosing the first element in the sequence works okay, because there’s just a small chance that it will be the first value when sorted. This is sometimes called the lazy method.

• A better technique is to choose the median value from the first, middle and last elements in the unsorted sequence. This technique is called median-of-three.
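The median-of-three choice can be sketched as a small helper. This is my own illustration, not the textbook's code:

```java
public class PivotDemo {
    // Return the index (low, mid, or high) holding the median of the
    // three sampled values — the median-of-three pivot choice.
    static int medianOfThree(int[] a, int low, int high) {
        int mid = (low + high) / 2;
        if (a[low] > a[mid]) {
            if (a[mid] > a[high]) return mid;               // high < mid < low
            else return (a[low] > a[high]) ? high : low;    // mid is smallest
        } else {
            if (a[low] > a[high]) return low;               // high < low <= mid
            else return (a[mid] > a[high]) ? high : mid;    // low is smallest
        }
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6, 7};
        // On sorted input the lazy method would pick a[0] = 1, the worst
        // possible pivot; median-of-three picks the middle element instead.
        System.out.println(medianOfThree(sorted, 0, 6)); // 3
    }
}
```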

Page 14:

Analysis of the Quicksort Algorithm

• First, make some simplifying assumptions:
– Input array has size equal to some power of 2
– Pivot element is part of one half of the array

• Best Case: each time, an array of size n = 2^k is divided into two exactly equal parts of size n/2.

• Pattern of calls to partition:
– 1 call with an array of size n
– 2 calls with arrays of size n/2
– 4 calls with arrays of size n/4
– and so on until ...
– 2^(k-1) calls with arrays of size 2

• The number of levels in the call tree is k

Page 15:

Quicksort Analysis, continued

• We’re counting calls to partition because it contains the critical operations.

• An upper bound on the number of critical operations for each level is the size of the entire array, n, because no more than n comparisons can be done by the calls to partition over the entire array.

• So, an upper bound on Quicksort is: number of critical ops = n * k. But k is log (base 2) of n, so: number of critical ops = n * log2 n.

• Computational Complexity is O(n log n).

Page 16:

Quicksort Analysis: Worst Case

In the worst case, partition chooses a bad pivot and divides the array into subarrays of size 1 and n-1 elements. This occurs when using the ‘lazy’ method and the array is already sorted or reverse sorted.

• How many calls to partition in this case? (n - 1) + (n - 2) + (n - 3) + … + 1 = n(n - 1)/2 (by Gauss’ formula for summations)

• Result: for worst case input, Quicksort isn’t so quick! In fact, it has quadratic computational complexity.

• Average case: just as good as best case!

Page 17:

Efficiency of Divide and Conquer

Mergesort always divides the array in half, so its time complexity is equivalent to Quicksort's in the best case, but it takes twice as much space (why?).

Algorithms based on divide and conquer can be faster (often much faster) than non-recursive quadratic-time algorithms like insertion sort. The reason is that operations are applied to the input in fewer passes. For example, the merge operation can be viewed as being called on the entire array, but only log n times.

Page 18:

Efficiency Continued

Try applying divide and conquer just once to a quadratic sorting algorithm such as insertion sort; is there any savings?

• Insertion sort is quadratic, so for an input of size N = 8 there will be about 64 comparisons total.

• Instead, divide the array into 2 halves, apply insertion sort to each half, then combine the results using merge.

• For N = 8, two insertion sorts on N = 4 take about 16 + 16 comparisons, plus one merge, which is about 8 comparisons. Total = 40 comparisons.

• The savings are even better for larger values of N.
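This arithmetic can be checked by counting comparisons directly. A rough sketch, my own code: it runs insertion sort on a reverse-sorted array (its worst case) both ways. Note that actual worst-case counts are n(n-1)/2 rather than the slide's rounder n^2 estimate, so the numbers differ from 64 and 40, but the hybrid still wins.

```java
public class HybridDemo {
    static int comparisons = 0;

    // Insertion sort on a[first..last], counting element comparisons.
    static void insertionSort(int[] a, int first, int last) {
        for (int i = first + 1; i <= last; i++) {
            int key = a[i], j = i - 1;
            while (j >= first) {
                comparisons++;
                if (a[j] <= key) break;   // key belongs after a[j]
                a[j + 1] = a[j];          // shift larger element right
                j--;
            }
            a[j + 1] = key;
        }
    }

    // Merge sorted runs a[first..mid] and a[mid+1..last], counting comparisons.
    static void merge(int[] a, int first, int mid, int last) {
        int[] temp = new int[last - first + 1];
        int i = first, j = mid + 1, k = 0;
        while (i <= mid && j <= last) {
            comparisons++;
            if (a[i] <= a[j]) temp[k++] = a[i++]; else temp[k++] = a[j++];
        }
        while (i <= mid)  temp[k++] = a[i++];
        while (j <= last) temp[k++] = a[j++];
        for (k = 0; k < temp.length; k++) a[first + k] = temp[k];
    }

    public static void main(String[] args) {
        int[] a = {8, 7, 6, 5, 4, 3, 2, 1};
        comparisons = 0;
        insertionSort(a, 0, 7);            // plain insertion sort on all 8
        int plain = comparisons;

        int[] b = {8, 7, 6, 5, 4, 3, 2, 1};
        comparisons = 0;
        insertionSort(b, 0, 3);            // sort each half...
        insertionSort(b, 4, 7);
        merge(b, 0, 3, 7);                 // ...then merge
        int hybrid = comparisons;

        System.out.println("plain: " + plain + ", hybrid: " + hybrid);
    }
}
```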