QuickSort (Ch. 7)

Like Merge-Sort, based on the three-step process of divide-and-conquer.

Input: An array A[1…n] of comparable elements, together with the starting and ending indices of the portion of A to sort.

Output: A permutation of A such that elements are in ascending order

QuickSort

Divide: Rearrange the array A[p..r] into two (possibly empty) subarrays A[p..q-1] and A[q+1..r] such that each element of A[p..q-1] is ≤ A[q] and each element of A[q+1..r] is > A[q].

Conquer: Sort the two subarrays by recursive calls to QuickSort.

Combine: No work is needed to combine subarrays since they are sorted in-place.

To sort the subarray A[p…r] (initially, p = 1 and r = A.length):

QuickSort

Quicksort(A, p, r)
    if p < r
        q = Partition(A, p, r)
        Quicksort(A, p, q-1)
        Quicksort(A, q+1, r)

Partition(A, p, r)
1. x = A[r]                  // choose pivot
2. i = p - 1
3. for j = p to r - 1
4.     if A[j] ≤ x
5.         i = i + 1
6.         swap A[i] and A[j]
7. swap A[i+1] and A[r]
8. return i + 1

Initial call: Quicksort(A, 1, A.length)

What does Partition do? What is running time of Partition?

Note that Partition always selects the last element A[r] in the subarray A[p…r] as the pivot.
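The two procedures above can be sketched in Python as follows. This is a minimal illustration rather than the slides' own code: it assumes 0-based indexing (so the initial call covers indices 0 through len(a) - 1), and the function names partition and quicksort are chosen here for illustration.

```python
def partition(a, p, r):
    """Partition a[p..r] (inclusive) around the pivot a[r]; return the pivot's final index."""
    x = a[r]                      # choose the last element as the pivot
    i = p - 1                     # a[p..i] holds the elements <= x found so far
    for j in range(p, r):         # j runs from p to r - 1
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]   # place the pivot between the two regions
    return i + 1


def quicksort(a, p=0, r=None):
    """Sort a[p..r] in place; with the defaults, the whole list is sorted."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)


data = [2, 8, 7, 1, 3, 5, 6, 4]
quicksort(data)
print(data)  # [1, 2, 3, 4, 5, 6, 7, 8]
```

The loop in partition touches each of the r - p elements before the pivot exactly once, which is why Partition runs in time linear in the size of the subarray.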

Partition

In line 1, x is set to the value contained in A[r] (the pivot).

i is set to the index just before p, and j starts at p.

j scans the array segment from p to r-1; whenever A[j] ≤ x, i is incremented and A[i] is exchanged with A[j].

When j = r, the loop ends and the value at A[r] is exchanged with the value at A[i+1]. This puts the pivot between the values ≤ x and those > x, so the item at position i+1 is in its correct sorted position. The index i+1 is returned because that position no longer needs to be considered.

Partition

Loop invariant:

As Partition executes, the array is divided into 4 regions, some possibly empty, such that

1. All values in A[p…i] are ≤ pivot.

2. All values in A[i+1…j-1] are > pivot.

3. A[r] = pivot.

Although not needed as part of the loop invariant, the 4th region is A[j…r-1], whose entries have not yet been examined, so we don’t know how they compare to the pivot.
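One way to make the three invariant conditions concrete is to assert them at the start of every loop iteration. The sketch below assumes 0-based indexing; the name partition_checked and the sample array are illustrative choices, not part of the slides.

```python
def partition_checked(a, p, r):
    """Partition a[p..r] around a[r], asserting the loop invariant at each iteration."""
    x = a[r]
    i = p - 1
    for j in range(p, r):
        # Condition 1: every value in a[p..i] is <= pivot.
        assert all(a[k] <= x for k in range(p, i + 1))
        # Condition 2: every value in a[i+1..j-1] is > pivot.
        assert all(a[k] > x for k in range(i + 1, j))
        # Condition 3: a[r] still holds the pivot.
        assert a[r] == x
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


sample = [2, 8, 7, 1, 3, 5, 6, 4]   # arbitrary example input
q = partition_checked(sample, 0, len(sample) - 1)
print(q, sample)  # 3 [2, 1, 3, 4, 7, 5, 6, 8]
```

The assertions cost extra time on every iteration, so a check like this is useful for study or testing rather than for production sorting.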

Partition

Example on an 8-element array:

[Figure not preserved: steps 1 to 5 of the example]
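Since the slide's figure is not available here, a trace can be regenerated in code. The sketch below records the array and the indices i and j before the loop, after each iteration, and after the final swap. The input array is an assumption chosen for illustration and may differ from the one on the original slide; indices are 0-based.

```python
def partition_trace(a, p, r):
    """Run Partition on a[p..r] and collect (array snapshot, i, j) after every step."""
    x = a[r]
    i = p - 1
    steps = [(list(a), i, p)]            # state before the first iteration
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
        steps.append((list(a), i, j + 1))  # j + 1 is the next index to examine
    a[i + 1], a[r] = a[r], a[i + 1]
    steps.append((list(a), i, None))     # after the final swap, j is no longer used
    return i + 1, steps


q, steps = partition_trace([2, 8, 7, 1, 3, 5, 6, 4], 0, 7)
for number, (snapshot, i, j) in enumerate(steps, start=1):
    print(number, snapshot, "i =", i, "j =", j)
print("pivot index:", q)
```

For an 8-element array this yields nine snapshots: one initial state, seven loop iterations, and one final swap.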

Partition

[Figure not preserved: steps 6 to 9 of the example]

The index j disappears after step 7 because it is no longer needed once the for loop is exited.

Formal Correctness of Partition

As the Partition procedure executes, the array is partitioned into four regions, some of which may be empty.

Loop invariant: At the beginning of each iteration of the for loop (lines 3-6), for any array index k,

1. If p ≤ k ≤ i, then A[k] ≤ x.
2. If i+1 ≤ k ≤ j-1, then A[k] > x.
3. If k = r, then A[k] = x.

The fourth region is A[j…r-1], whose entries have not yet been examined.


Partition: Correctness

Initialization: Before the loop starts, i = p - 1 and j = p (i = 0 and j = 1 in the initial call). All the conditions of the loop invariant are satisfied because A[r] is the pivot, and the subarrays A[p…i] (A[1…0]) and A[i+1…j-1] (A[1…0]) are empty.

Maintenance: In each iteration of the loop, either A[j] > x or A[j] ≤ x.

In the first case, j is incremented and cond. 2 holds for A[j-1] with no other changes.

In the second case, i is incremented, A[i] and A[j] are swapped, and then j is incremented. Cond. 1 holds for A[i] after the swap. By the inductive hypothesis, the item moved into A[j-1] by the swap (when A[i+1…j-1] is nonempty) was > x, so cond. 2 still holds at the end of the iteration.

Termination: At termination, j = r and A[p…r] has been partitioned into 3 sets: A[p…i] ≤ pivot, A[i+1…r-1] > pivot, and A[r] = pivot.

The last 2 lines of Partition move the pivot element from the highest position in the subarray to between the 2 subarrays by swapping A[i+1] and A[r].

Quicksort Running Time

The running time of quicksort depends on the partitioning of subarrays.

• If the subarrays are balanced, then quicksort can run as fast as merge sort.

• If the subarrays are unbalanced, then quicksort can run as slowly as insertion sort.

Quicksort Running Time

Worst case occurs when subarrays are completely unbalanced, when there are 0 elements in one subarray and n-1 elements in the other.

We get the recurrence T(n) = T(n-1) + T(0) + Θ(n) = T(n-1) + Θ(n), which solves to T(n) = Θ(n²).

This is the same running time as the worst case of insertion sort.

What input instance would cause the worst-case running time of quicksort to occur?

When the array is already sorted in ascending order.
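The quadratic behavior on sorted input can be checked by counting how often the test A[j] ≤ x runs. The sketch below is an illustration under stated assumptions: it uses 0-based indexing and an explicit stack instead of recursion (so that deep, unbalanced splits do not hit Python's recursion limit), and the name quicksort_comparisons is hypothetical.

```python
def quicksort_comparisons(a):
    """Sort a in place with last-element pivots; return how many a[j] <= x tests ran."""
    count = 0
    stack = [(0, len(a) - 1)]     # pending (p, r) subarrays, replacing the recursive calls
    while stack:
        p, r = stack.pop()
        if p < r:
            x = a[r]
            i = p - 1
            for j in range(p, r):
                count += 1        # one comparison against the pivot
                if a[j] <= x:
                    i += 1
                    a[i], a[j] = a[j], a[i]
            a[i + 1], a[r] = a[r], a[i + 1]
            q = i + 1
            stack.append((p, q - 1))
            stack.append((q + 1, r))
    return count


n = 200
print(quicksort_comparisons(list(range(n))), n * (n - 1) // 2)  # 19900 19900
```

On an already-sorted array every pivot is the maximum of its subarray, so each call peels off one element and the counts sum to (n-1) + (n-2) + … + 1 = n(n-1)/2.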

Quicksort Running Time

Best case occurs when subarrays are completely balanced and each subarray has ≤ n/2 elements.

We get the recurrence T(n) = 2T(n/2) + Θ(n), which solves to T(n) = Θ(n lg n) by Case 2 of the master theorem.

This is the same running time as merge sort.

Quicksort Average-case

The average case of quicksort’s running time is much closer to the best case than it is to the worst case.

Imagine that Partition always produces a 9-to-1 split.

Then we get the recurrence T(n) ≤ T(9n/10) + T(n/10) + Θ(n), whose solution is O(n lg n).

The recursion tree is like the one for T(n) = T(n/3) + T(2n/3) + O(n).

Any split of constant proportionality will yield a recursion tree of depth Θ(lg n).
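The depth claim for the 9-to-1 split can be worked out directly. Along the longest root-to-leaf path the subproblem size shrinks by a factor of 9/10 per level, and every level of the tree does at most cn work in total, where c is a constant assumed here to bound the partitioning cost per element:

```latex
\[
\left(\tfrac{9}{10}\right)^{d} n = 1
\;\Longrightarrow\;
d = \log_{10/9} n = \frac{\lg n}{\lg (10/9)} \approx 6.58\,\lg n = \Theta(\lg n)
\]
\[
T(n) \le c\,n \left(\log_{10/9} n + 1\right) = O(n \lg n)
\]
```

The same argument works for any constant split ratio; only the constant in front of lg n changes.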

Quicksort Average-case

Intuition: Some splits will be close to balanced and others close to unbalanced, so good and bad splits will be randomly distributed in the recursion tree.

The running time will be (asymptotically) bad only if there are many bad splits in a row.

• A bad split followed by a good split results in a good partitioning after one extra step.

Quicksort with Good Average Running Time

How can we modify Quicksort to get good average-case behavior on all inputs?

2 techniques:

1. Randomly permute the input prior to running Quicksort. This produces a tree of possible executions, most of which finish fast.

2. Choose the pivot randomly at each partitioning step instead of always choosing the element in the highest array position.

Randomized-Partition(A, p, r)
    i = Random(p, r)
    swap A[r] and A[i]
    return Partition(A, p, r)

Randomized-Quicksort(A, p, r)
    if p < r
        q = Randomized-Partition(A, p, r)
        Randomized-Quicksort(A, p, q-1)
        Randomized-Quicksort(A, q+1, r)
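The second technique can be sketched in Python as follows. This is an illustration under stated assumptions: indices are 0-based, Random(p, r) is modeled with random.randint (which includes both endpoints), and the function names are chosen here rather than taken from the slides.

```python
import random


def partition(a, p, r):
    """Partition a[p..r] around the pivot a[r]; return the pivot's final index."""
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def randomized_partition(a, p, r):
    """Move a uniformly random element of a[p..r] to position r, then partition."""
    i = random.randint(p, r)          # inclusive on both ends
    a[r], a[i] = a[i], a[r]
    return partition(a, p, r)


def randomized_quicksort(a, p=0, r=None):
    """Sort a[p..r] in place using random pivots."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = randomized_partition(a, p, r)
        randomized_quicksort(a, p, q - 1)
        randomized_quicksort(a, q + 1, r)


data = list(range(20))                # already sorted: worst case for the fixed pivot
randomized_quicksort(data)
print(data == list(range(20)))        # True
```

Only the pivot choice changes; Partition itself is reused unmodified, which keeps the earlier correctness argument intact.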

Quicksort with Good Average Running Time

Randomization of Quicksort stops any specific type of array from causing worst-case behavior. For example, an already-sorted array causes worst-case behavior in non-randomized Quicksort but not in Randomized-Quicksort.

Average-case Analysis of Randomized-Quicksort

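• Rename the elements of A as z1, z2, …, zn, with zi being the i-th smallest element, and define Zij = {zi, zi+1, …, zj} to be the set of elements between zi and zj, inclusive.

• Each pair of elements is compared at most once, because elements are compared only to the pivot and the pivot is not used again.

• Let Xij = I{ zi is compared to zj }.

• Since each pair is compared at most once, the total number of comparisons is

\[ X = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij} \]

• Taking the expectations of both sides, and using Lemma 5.1 and linearity of expectation:

\[ E[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[X_{ij}] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \Pr\{z_i \text{ is compared to } z_j\} \]

• But what is Pr{zi is compared to zj}?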

Average-case Analysis of Randomized-Quicksort

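Observations:

• Numbers in separate partitions are never compared.

• If the first pivot chosen from Zij is neither zi nor zj, then zi and zj land in separate partitions and are never compared.

• The probability that zi is compared to zj is the probability that either one is the first pivot chosen from Zij.

• There are j - i + 1 elements in Zij, and the probability that any particular one is the first to be chosen as the pivot is 1/(j - i + 1).

Thus Pr{zi is compared to zj} = Pr{zi is the first pivot chosen from Zij} + Pr{zj is the first pivot chosen from Zij} = 2/(j-i+1). Substituting this into the expression for E[X] and letting k = j - i:

\[ E[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j-i+1} = \sum_{i=1}^{n-1} \sum_{k=1}^{n-i} \frac{2}{k+1} < \sum_{i=1}^{n-1} \sum_{k=1}^{n} \frac{2}{k} = \sum_{i=1}^{n-1} O(\lg n) = O(n \lg n) \]

So the expected running time of Randomized-Quicksort is O(n lg n) when elements in A are distinct.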
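The double sum for the expected number of comparisons can also be evaluated exactly for small n, which gives a feel for how it grows next to n lg n. The sketch below is illustrative only; the function name expected_comparisons is hypothetical, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
import math


def expected_comparisons(n):
    """Exact value of the sum over 1 <= i < j <= n of 2 / (j - i + 1)."""
    return sum(Fraction(2, j - i + 1)
               for i in range(1, n)
               for j in range(i + 1, n + 1))


print(expected_comparisons(3))  # 8/3

for n in (10, 100, 200):
    exact = float(expected_comparisons(n))
    print(n, round(exact, 2), round(n * math.log2(n), 2))
```

For n = 3 the three pairs contribute 1 + 1 + 2/3 = 8/3, matching the formula 2/(j-i+1) term by term.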