Post on 14-Jan-2016
CSS342: Sorting Algorithms 1
CSS342: Sorting Algorithms
Professor: Munehiro Fukuda
Why Do We Desperately Need Efficient Sorting Algorithms?
• Data must be sorted before we run the following programs:
  – Search algorithms such as binary search and interpolation search
  – Many computational geometry/graphics algorithms such as the convex hull
• We always or frequently need to sort the following data:
  – Dictionary
  – White/yellow pages
  – Student grades
Topics
Day 1: Lecture
• Selection Sort: worst/average O(n^2)
• Bubble Sort: worst/average O(n^2)
• Insertion Sort: worst/average O(n^2)
• Shell Sort: worst O(n^2), average O(n^(3/2))
• Merge Sort: worst/average O(n log n)
• Quick Sort: worst O(n^2), average O(n log n)
• Radix Sort: worst/average O(n)
Day 2: Lab Work
• Partial Quick Sort
Homework Assignment
• Non-recursive Semi-In-Place Merge Sort
Selection Sort
Initial array:    29 10 14 37 13
After 1st swap:   29 10 14 13 37
After 2nd swap:   13 10 14 29 37
After 3rd swap:   13 10 14 29 37
After 4th swap:   10 13 14 29 37
Scan item 0 to size-1, locate the largest item, and swap it with the rightmost item.
Scan item 0 to size-2, locate the 2nd largest item, and swap it with the 2nd rightmost item.
Scan item 0 to size-3, locate the 3rd largest item, and swap it with the 3rd rightmost item.
Scan item 0 to size-4, locate the 4th largest item, and swap it with the 4th rightmost item.
Scan item 0 to size-5, locate the 5th largest item, and swap it with the 5th rightmost item.
Selection Sort
template <class Object>
void selectionSort( vector<Object> & a ) {
    for ( int last = a.size( ) - 1; last >= 1; --last ) {
        int indexSoFar = 0;   // Index of largest item found so far.
                              // Assume 0th item is the largest first
        for ( int i = 1; i <= last; ++i ) {
            if ( a[i] > a[indexSoFar] )
                indexSoFar = i;
        }
        // indexSoFar points to the largest item at this point
        swap( a[indexSoFar], a[last] );
    }
}
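As a usage sketch, the same loop can record the array after every swap, which makes it easy to check the routine against a hand trace. The helper name selectionSortTrace is hypothetical and not part of the lecture code.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper (not from the slides): runs the selection sort loop
// and returns a snapshot of the array after each swap.
template <class Object>
std::vector<std::vector<Object>> selectionSortTrace(std::vector<Object> &a) {
    std::vector<std::vector<Object>> snapshots;
    for (int last = static_cast<int>(a.size()) - 1; last >= 1; --last) {
        int indexSoFar = 0;                  // index of the largest item found so far
        for (int i = 1; i <= last; ++i)
            if (a[i] > a[indexSoFar])
                indexSoFar = i;
        std::swap(a[indexSoFar], a[last]);   // move the largest item to the right end
        snapshots.push_back(a);              // record the array after this swap
    }
    return snapshots;
}
```

Calling selectionSortTrace on the five items 29 10 14 37 13 yields four snapshots, one per swap, and leaves the vector sorted.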
Efficiency of Selection Sort
                  Array             Comparisons          Swaps
Initial array     29 10 14 37 13
After 1st swap    29 10 14 13 37    N-1 (=4)             1
After 2nd swap    13 10 14 29 37    N-2 (=3)             1
After 3rd swap    13 10 14 29 37    N-3 (=2)             1
After 4th swap    10 13 14 29 37    N-4 (=1)             1
Total                               n(n-1)/2 = O(n^2)    n-1 = O(n)
Overall: O(n^2)
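The totals follow from summing the per-pass counts. As a short derivation, with n the number of items (n = 5 in the example):

```latex
\begin{align*}
\text{comparisons} &= (n-1) + (n-2) + \dots + 1 = \sum_{k=1}^{n-1} k = \frac{n(n-1)}{2} = O(n^2), \\
\text{swaps} &= \underbrace{1 + 1 + \dots + 1}_{n-1\ \text{passes}} = n - 1 = O(n).
\end{align*}
```

For n = 5 this gives 10 comparisons and 4 swaps, so the comparisons dominate the running time.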
Bubble Sort
Initial array:   29 10 14 37 13
Pass 1:          10 29 14 37 13   (29 > 10: swap)
                 10 14 29 37 13   (29 > 14: swap)
                 10 14 29 37 13   (29 < 37: no swap)
                 10 14 29 13 37   (37 > 13: swap; 37 is in its final place)
Pass 2:          10 14 29 13 37   (10 < 14: no swap)
                 10 14 29 13 37   (14 < 29: no swap)
                 10 14 13 29 37   (29 > 13: swap; 29 is in its final place)
Pass 3:          10 14 13 29 37   (10 < 14: no swap)
                 10 13 14 29 37   (14 > 13: swap; 14 is in its final place)
Pass 4:          10 13 14 29 37   (10 < 13: no swap)
Bubble Sort
#include <iostream>
#include <vector>
#include <string>
#include <utility>   // swap
using namespace std;

template <class Object>
void bubbleSort( vector<Object> & a ) {
    bool swapOccurred = true;   // true when swaps occur

    for ( int pass = 1; ( pass < a.size( ) ) && swapOccurred; ++pass ) {
        swapOccurred = false;   // swaps have not occurred at the beginning
        for ( int i = 0; i < a.size( ) - pass; ++i ) {   // a bubble(i) goes from 0 to size - pass
            if ( a[i] > a[i + 1] ) {
                swap( a[i], a[i + 1] );   // swap a[index] with a[nextIndex]
                swapOccurred = true;      // a swap has occurred
            }
        }
    }
}
Efficiency of Bubble Sort
Pass      Comparisons   Swaps (worst case)
1         N-1           N-1
2         N-2           N-2
…         …             …
N-1       1             1
Total     O(n^2)        O(n^2)
Overall: O(n^2)
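The worst-case counts, and the effect of the swapOccurred flag, can be checked with an instrumented copy of bubble sort. This is only a sketch: BubbleCounts and bubbleSortCounted are hypothetical names introduced here, not part of the lecture code.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical counters (not part of the lecture code).
struct BubbleCounts {
    long comparisons = 0;
    long swaps = 0;
};

// Bubble sort with early exit, instrumented to count comparisons and swaps.
template <class Object>
BubbleCounts bubbleSortCounted(std::vector<Object> &a) {
    BubbleCounts counts;
    bool swapOccurred = true;
    for (std::size_t pass = 1; pass < a.size() && swapOccurred; ++pass) {
        swapOccurred = false;                      // no swap seen yet in this pass
        for (std::size_t i = 0; i < a.size() - pass; ++i) {
            ++counts.comparisons;
            if (a[i] > a[i + 1]) {
                std::swap(a[i], a[i + 1]);
                ++counts.swaps;
                swapOccurred = true;
            }
        }
    }
    return counts;
}
```

A reversed five-item array needs 4 + 3 + 2 + 1 = 10 comparisons and 10 swaps, while an already sorted array stops after a single pass of 4 comparisons and no swaps.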
Insertion Sort
Sorted | Unsorted        Action
29 | 10 14 37 13         Copy 10 into unsortedTop
__ 29 | 14 37 13         Shift 29
10 29 | 14 37 13         Insert 10; copy 14
10 __ 29 | 37 13         Shift 29
10 14 29 | 37 13         Insert 14; copy 37
10 14 29 37 | 13         Shift nothing; 37 stays in place
10 14 29 37 | 13         Copy 13
10 __ 14 29 37 |         Shift 37, 29, and 14
10 13 14 29 37 |         Insert 13
Insertion Sort
template <class Object>
void SortedList<Object>::insertionSort( ) {
for ( int unsorted = 1; unsorted < array.size( ); ++unsorted ) {
// Assume the 0th item is sorted. Unsorted items start from the 1st item
Object unsortedTop = array[unsorted]; // Copy the top of unsorted group
int i;
for ( i = unsorted - 1; ( i >= 0 ) && (array[i] > unsortedTop ); --i )
// Upon a successful comparison, shift array[i] to the right
array[i + 1] = array[i];
// insert the unsorted top into the sorted group;
array[i + 1] = unsortedTop;
}
}
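The SortedList class that owns this member function is not shown in the slides. As a minimal sketch for trying the algorithm on its own, the same loop can be written as a free function over a std::vector; the free-function form is an assumption made here for illustration.

```cpp
#include <vector>

// Free-function variant of the member function above (assumption: the data
// lives in a std::vector rather than in the SortedList class).
template <class Object>
void insertionSort(std::vector<Object> &array) {
    const int size = static_cast<int>(array.size());
    for (int unsorted = 1; unsorted < size; ++unsorted) {
        Object unsortedTop = array[unsorted];   // copy the top of the unsorted group
        int i;
        for (i = unsorted - 1; i >= 0 && array[i] > unsortedTop; --i)
            array[i + 1] = array[i];            // shift larger items to the right
        array[i + 1] = unsortedTop;             // insert into the sorted group
    }
}
```

Sorting the vector 29 10 14 37 13 with this function produces 10 13 14 29 37.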
Efficiency of Insertion Sort
Item inserted   Comparisons   Insertions   Shifts
1st             1             1            1
2nd             2             1            2
3rd             3             1            3
(N-1)th         N-1 (=4)      1            N-1 (=4)
Total           O(n^2)        O(n)         O(n^2)
Overall: O(n^2) + O(n) + O(n^2) = O(n^2) in the worst case
ShellSort
• The idea is to perform an insertion sort among items that are a gap apart.
• This reduces the large amount of data movement.
Example (17 items, indices 0 to 16):
81 94 11 96 12 35 17 95 28 58 41 75 15 85 87 38 20
• gap = 17/2 = 8 (the gap is initially the size divided by 2): insertion-sort each group of items that are 8 apart.
• gap = 8/2.2 = 3 (the divisor 2.2 is chosen for practical performance): insertion-sort each group of items that are 3 apart.
• gap = 3/2.2 = 1: insertion-sort the whole, nearly sorted array.
ShellSort
template <class Comparable>
void shellsort( vector<Comparable> &a )
{
    for ( int gap = a.size( ) / 2; gap > 0;
          gap = ( gap == 2 ) ? 1 : int( gap / 2.2 ) ) {
        for ( int i = gap; i < a.size( ); i++ ) {
            Comparable tmp = a[i];
            int j = i;

            for ( ; j >= gap && tmp < a[j - gap]; j -= gap )
                a[j] = a[j - gap];   // shift the larger item one gap to the right
            a[j] = tmp;
        }
    }
}
Example with the 17-item array (indices 0 to 16) and gap = 17/2 = 8:
Assume i = a.size( ) - 1 = 16, so tmp = a[16] = 20.
Shift a[16 - 8] if it is larger than tmp.
Shift a[16 - 8 * 2] if it is larger than tmp.
Store tmp in the slot that was opened last.
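To see which gaps the loop actually visits, the gap update can be pulled out into a small helper. This is a sketch only: shellGaps is a hypothetical name introduced here, and the shellsort below simply iterates over its result.

```cpp
#include <vector>

// Hypothetical helper: lists the gaps visited for n items, starting at n/2
// and dividing by 2.2, with gap 2 forced to 1 so the final pass is not skipped.
std::vector<int> shellGaps(int n) {
    std::vector<int> gaps;
    for (int gap = n / 2; gap > 0;
         gap = (gap == 2) ? 1 : static_cast<int>(gap / 2.2))
        gaps.push_back(gap);
    return gaps;
}

// Shellsort driven by the gap list above.
template <class Comparable>
void shellsort(std::vector<Comparable> &a) {
    const int n = static_cast<int>(a.size());
    for (int gap : shellGaps(n)) {
        for (int i = gap; i < n; ++i) {
            Comparable tmp = a[i];
            int j = i;
            for (; j >= gap && tmp < a[j - gap]; j -= gap)
                a[j] = a[j - gap];   // shift the larger item one gap to the right
            a[j] = tmp;
        }
    }
}
```

For 17 items the gaps are 8, 3, and 1. The special case for gap == 2 matters because int(2 / 2.2) is 0, which would end the loop before the final pass with gap 1.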
Efficiency of ShellSort
• Performance
  – Worst case: O(N^2)
  – Average case:
    • O(N^(3/2)) when dividing by 2
    • O(N^(5/4)) or O(N^(7/6)) when dividing by 2.2
  – Proof:
    • A long-standing open problem
Sorting Algorithms
• O(n^2): Selection Sort, Bubble Sort, Insertion Sort, Shell Sort
  (Shell's average case depends on the increment.)
• O(n log n): Merge Sort, Quick Sort
  – Use a recursive solution.
  – Take advantage of a tree's log(n) characteristics.
Mergesort (with an auxiliary temporary array)
Assuming that we already have two sorted arrays, how can we merge them into one sorted array?
First sorted array:    1 4 8 13 14 20 25
Second sorted array:   2 3 5 7 11 23
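One way to answer the question is to walk both arrays from the left and always take the smaller front item. The sketch below does this for two separate vectors; mergeSorted is a hypothetical helper name, and the lecture's own merge works on two halves of one array instead.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: merges two already sorted vectors into a new one.
template <class Comparable>
std::vector<Comparable> mergeSorted(const std::vector<Comparable> &x,
                                    const std::vector<Comparable> &y) {
    std::vector<Comparable> result;
    result.reserve(x.size() + y.size());
    std::size_t i = 0, j = 0;
    while (i < x.size() && j < y.size()) {       // both inputs still have items
        if (x[i] < y[j]) result.push_back(x[i++]);
        else             result.push_back(y[j++]);
    }
    while (i < x.size()) result.push_back(x[i++]);   // leftovers of x
    while (j < y.size()) result.push_back(y[j++]);   // leftovers of y
    return result;
}
```

Each item is copied exactly once, so merging m and n items takes O(m + n) steps.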
Mergesort (with an auxiliary temporary array)
template <class Comparable>
void merge( vector<Comparable> &a, int first, int mid, int last ) {
    vector<Comparable> tempArray( a.size( ) );
    int first1 = first;   int last1 = mid;     // first sorted half:  a[first..mid]
    int first2 = mid + 1; int last2 = last;    // second sorted half: a[mid+1..last]

    int index = first1;
    for ( ; ( first1 <= last1 ) && ( first2 <= last2 ); ++index ) {
        if ( a[first1] < a[first2] ) {
            tempArray[index] = a[first1];
            ++first1;
        } else {
            tempArray[index] = a[first2];
            ++first2;
        }
    }
    for ( ; first1 <= last1; ++first1, ++index )    // copy what is left of the first half
        tempArray[index] = a[first1];
    for ( ; first2 <= last2; ++first2, ++index )    // copy what is left of the second half
        tempArray[index] = a[first2];
    for ( index = first; index <= last; ++index )   // copy the merged items back
        a[index] = tempArray[index];
}
Mergesort (from bottom to top: conquer)
38 | 16 | 27 | 39 | 12 | 17 | 24 | 5
16 38 | 27 39 | 12 17 | 5 24
16 27 38 39 | 5 12 17 24
5 12 16 17 24 27 38 39
Now, how can we split the array into single items in the first place?
Mergesort (from top to bottom: divide)
theArray:   38 16 27 39 12 17 24 5           first … last, mid = (first + last)/2
Divide:     38 16 27 39 | 12 17 24 5         each half has its own first, last, and mid = (first + last)/2
Divide:     38 16 | 27 39 | 12 17 | 24 5
Keep dividing while first < last.
Mergesort (final view)
Divide (while first < last, with mid = (first + last)/2):
38 16 27 39 12 17 24 5
38 16 27 39 | 12 17 24 5
38 16 | 27 39 | 12 17 | 24 5
38 | 16 | 27 | 39 | 12 | 17 | 24 | 5
Conquer (merge):
16 38 | 27 39 | 12 17 | 5 24
16 27 38 39 | 5 12 17 24
5 12 16 17 24 27 38 39
template <class Comparable>
void mergesort( vector<Comparable> &a, int first, int last ) {
    if ( first < last ) {
        int mid = ( first + last ) / 2;
        mergesort( a, first, mid );       // sort the first half
        mergesort( a, mid + 1, last );    // sort the second half
        merge( a, first, mid, last );     // merge the two sorted halves
    }
}
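For experimenting with the recursion, the two routines can be combined into one self-contained sketch. One deliberate difference from the lecture code, made here as an assumption for brevity: the temporary array holds only the items being merged rather than a.size( ) items.

```cpp
#include <cstddef>
#include <vector>

// Merges the sorted runs a[first..mid] and a[mid+1..last] through a temporary
// array that holds only the items being merged.
template <class Comparable>
void merge(std::vector<Comparable> &a, int first, int mid, int last) {
    std::vector<Comparable> tempArray;
    tempArray.reserve(last - first + 1);
    int first1 = first;      // next unread item of a[first..mid]
    int first2 = mid + 1;    // next unread item of a[mid+1..last]
    while (first1 <= mid && first2 <= last) {
        if (a[first1] < a[first2]) tempArray.push_back(a[first1++]);
        else                       tempArray.push_back(a[first2++]);
    }
    while (first1 <= mid)  tempArray.push_back(a[first1++]);
    while (first2 <= last) tempArray.push_back(a[first2++]);
    for (std::size_t k = 0; k < tempArray.size(); ++k)
        a[first + k] = tempArray[k];             // copy the merged run back
}

// Recursive mergesort over a[first..last].
template <class Comparable>
void mergesort(std::vector<Comparable> &a, int first, int last) {
    if (first < last) {
        int mid = (first + last) / 2;
        mergesort(a, first, mid);
        mergesort(a, mid + 1, last);
        merge(a, first, mid, last);
    }
}
```

Calling mergesort(a, 0, a.size( ) - 1) on 38 16 27 39 12 17 24 5 sorts the eight items in three levels of merging.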
Mergesort (Efficiency Analysis)
Start:     38 | 16 | 27 | 39 | 12 | 17 | 24 | 5
Level 1:   16 38 | 27 39 | 12 17 | 5 24
Level 2:   16 27 38 39 | 5 12 17 24
Level 3:   5 12 16 17 24 27 38 39
Level   # pairs of arrays   # comparisons in a pair (at most)   # copies in a pair
1       4                   1                                   2 * 2
2       2                   3                                   4 * 2
3       1                   7                                   8 * 2
x       n/2^x               2^x - 1                             2^x * 2
At level x, # nodes in each pair = 2^x.
At level x, # major operations = n/2^x * (3 * 2^x - 1) = O(3n).
# levels = log n, where n = # array elements (if n is a power of 2).
# levels = log n + 1 if n is not a power of 2.
# operations = O(3n) * (log n + 1) = O(3 n log n) = O(n log n).
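Summing the per-level cost over all levels gives the total directly. A short derivation, assuming n is a power of 2 so that there are exactly log2 n levels:

```latex
\begin{align*}
\sum_{x=1}^{\log_2 n} \frac{n}{2^x}\left(3 \cdot 2^x - 1\right)
  &= \sum_{x=1}^{\log_2 n} \left(3n - \frac{n}{2^x}\right) \\
  &= 3n \log_2 n - (n - 1) \\
  &= O(n \log n).
\end{align*}
```

For the 8-item example this is 3 * 8 * 3 - 7 = 65 major operations at most: 20 at level 1, 22 at level 2, and 23 at level 3.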
Quicksort (A Partition about a Pivot)
Items:            13 81 92 43 65 31 57 26 75 0
Select a pivot:   65
Partition:        13 43 31 57 26 0 | 65 | 81 92 75
                  smaller items    pivot  larger items
Quicksort (Code Overview)
template <class Comparable>
void quicksort( vector<Comparable> &a, int first, int last ) {
    int pivotIndex;   // after partition, pivotIndex points to a pivot

    if ( first < last ) {
        partition( a, first, last, pivotIndex );
        quicksort( a, first, pivotIndex - 1 );
        quicksort( a, pivotIndex + 1, last );
    }
}
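Quicksort (Partitioning Algorithm)
During partitioning, a[first..last] is divided into four regions:
• p = a[first]: the pivot
• S1 = a[first+1..lastS1]: items < p
• S2 = a[lastS1+1..firstUnknown-1]: items >= p
• unknown = a[firstUnknown..last]: items not yet examined
Repeat moving each element in the unknown region to S1 or S2 until the unknown region is empty.
Initial state: S1 and S2 are empty, so lastS1 = first, firstUnknown = first + 1, and everything after the pivot is unknown.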
Quicksort (Moving a New Unknown into S1)
• If the new item a[firstUnknown] < p, it belongs in S1.
• Increment lastS1 and swap a[firstUnknown] with a[lastS1], the leftmost item of S2.
• Increment firstUnknown: S1 has grown by one item, and S2 has moved one position to the right.
Quicksort (Moving a New Unknown into S2)
• If the new item a[firstUnknown] >= p, it belongs in S2.
• The item is already adjacent to S2, so no swap is needed.
• Increment firstUnknown: S2 has grown by one item.
Quicksort (Partitioning Code)
template <class Comparable>
void partition( vector<Comparable> &a, int first, int last, int &pivotIndex ) {
    choosePivot( a, first, last );   // choose a pivot and place it in a[first]
    Comparable pivot = a[first];
    int lastS1 = first;
    int firstUnknown = first + 1;

    for ( ; firstUnknown <= last; ++firstUnknown )
        if ( a[firstUnknown] < pivot ) {   // item from unknown belongs in S1
            ++lastS1;
            swap( a[firstUnknown], a[lastS1] );
        }
        // else item from unknown belongs in S2
    swap( a[first], a[lastS1] );   // place the pivot between S1 and S2
    pivotIndex = lastS1;
}
Quicksort (Example)
27 28 12 39 26 16    Original array (pivot = 27)
27 28 12 39 26 16    firstUnknown = 1 (points to 28); 28 belongs in S2
27 28 12 39 26 16    S1 is empty; 12 belongs in S1, so swap 28 and 12
27 12 28 39 26 16    39 belongs in S2
27 12 28 39 26 16    26 belongs in S1, so swap 28 and 26
27 12 26 39 28 16    16 belongs in S1, so swap 39 and 16
27 12 26 16 28 39    S1 and S2 are determined
16 12 26 27 28 39    Place the pivot between S1 and S2
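The partitioning steps can be reproduced with a self-contained sketch. The slides leave choosePivot undefined, so this version assumes the simplest possible choice, the first item of the segment, which is also the pivot used in the example.

```cpp
#include <utility>
#include <vector>

// Partitions a[first..last] around the pivot a[first] and reports the pivot's
// final position through pivotIndex. Assumption: the first item serves as the
// pivot, standing in for the choosePivot step that the slides leave undefined.
template <class Comparable>
void partition(std::vector<Comparable> &a, int first, int last, int &pivotIndex) {
    Comparable pivot = a[first];
    int lastS1 = first;
    for (int firstUnknown = first + 1; firstUnknown <= last; ++firstUnknown) {
        if (a[firstUnknown] < pivot) {               // item belongs in S1
            ++lastS1;
            std::swap(a[firstUnknown], a[lastS1]);
        }                                            // otherwise it stays in S2
    }
    std::swap(a[first], a[lastS1]);                  // pivot goes between S1 and S2
    pivotIndex = lastS1;
}

// Recursive quicksort over a[first..last].
template <class Comparable>
void quicksort(std::vector<Comparable> &a, int first, int last) {
    if (first < last) {
        int pivotIndex;
        partition(a, first, last, pivotIndex);
        quicksort(a, first, pivotIndex - 1);
        quicksort(a, pivotIndex + 1, last);
    }
}
```

A single call to partition on 27 28 12 39 26 16 leaves the pivot 27 at index 3 with 16 12 26 on its left and 28 39 on its right; quicksort then sorts each side recursively.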
Quicksort (Efficiency Analysis)
• Worst case: If the pivot is the smallest item in the array segment, S1 will remain empty.
  – S2 decreases in size by only 1 at each recursive call.
  – Level 1 requires n-1 comparisons.
  – Level 2 requires n-2 comparisons.
  – Thus, (n-1) + (n-2) + … + 2 + 1 = n(n-1)/2 = O(n^2)
  – Then, how can we select the best pivot?
• Average case: S1 and S2 contain roughly the same number of items.
  – log n or log n + 1 levels of recursion occur.
  – Each level requires fewer than n comparisons.
  – Thus, at most (n-1) * (log n + 1) = O(n log n)
Mergesort versus Quicksort
             Worst case   Average case
Mergesort    n log n      n log n
Quicksort    n^2          n log n
Then, why do we need Quicksort?
Reasons:
1. Mergesort requires item-copying operations from the array a to the temp array and vice versa.
2. A worst-case situation is not typical.
Then, why do we need Mergesort?
Reason:
If you sort a linked list, no item-copying operations are necessary.
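The linked-list case can be sketched with std::list, whose splice and merge member functions relink nodes instead of copying items. The function name mergeSortList is hypothetical and introduced here only for illustration.

```cpp
#include <iterator>
#include <list>

// Merge sort on a linked list: the list is split and merged purely by
// relinking nodes, so no item is ever copied.
template <class Comparable>
void mergeSortList(std::list<Comparable> &items) {
    if (items.size() < 2)
        return;
    std::list<Comparable> right;
    auto mid = std::next(items.begin(), items.size() / 2);
    right.splice(right.begin(), items, mid, items.end());   // move the second half
    mergeSortList(items);                                    // sort the first half
    mergeSortList(right);                                    // sort the second half
    items.merge(right);                                      // relink in sorted order
}
```

Because nodes are only relinked, a pointer taken to an item before sorting still refers to the same item afterwards.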
Radix Sort (Algorithm Overview)
0123 2154 0222 0004 0283 1560 1061 2150               Original integers
(1560 2150) (1061) (0222) (0123 0283) (2154 0004)     Grouped by 4th digit
1560 2150 1061 0222 0123 0283 2154 0004               Combined
(0004) (0222 0123) (2150 2154) (1560 1061) (0283)     Grouped by 3rd digit
0004 0222 0123 2150 2154 1560 1061 0283               Combined
(0004 1061) (0123 2150 2154) (0222 0283) (1560)       Grouped by 2nd digit
0004 1061 0123 2150 2154 0222 0283 1560               Combined
(0004 0123 0222 0283) (1061 1560) (2150 2154)         Grouped by 1st digit
0004 0123 0222 0283 1061 1560 2150 2154               Combined (sorted)
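The grouping and combining steps can be sketched for non-negative decimal integers as follows. The names radixPass and radixSort are hypothetical, and the number of digits is passed in by the caller as an assumption rather than computed.

```cpp
#include <array>
#include <vector>

// One grouping-and-combining step: items are grouped by the decimal digit
// selected by divisor (1 = last digit, 10 = the digit before it, ...) and the
// ten groups are concatenated in order 0..9. Assumes non-negative integers.
std::vector<int> radixPass(const std::vector<int> &items, int divisor) {
    std::array<std::vector<int>, 10> groups;
    for (int item : items)
        groups[(item / divisor) % 10].push_back(item);   // keeps arrival order
    std::vector<int> combined;
    combined.reserve(items.size());
    for (const auto &group : groups)
        combined.insert(combined.end(), group.begin(), group.end());
    return combined;
}

// Radix sort for integers with at most `digits` decimal digits.
std::vector<int> radixSort(std::vector<int> items, int digits) {
    int divisor = 1;
    for (int d = 0; d < digits; ++d, divisor *= 10)
        items = radixPass(items, divisor);
    return items;
}
```

Note that the values are written without leading zeros in C++ (123 rather than 0123), since a leading zero would mark an octal literal. Each pass keeps items with the same digit in their arrival order, which is what makes the later passes preserve the work of the earlier ones.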
Radix Sort (Efficiency Analysis)
• Each grouping step requires n shuffles.
• The number of grouping and combining steps equals the number of digits.
  – In the previous example, it is 4.
• Thus, for k-digit numbers, the performance is:
  – k * n = O(n), where k is independent of n
• Disadvantages:
  – Need to compare digits at the same position rather than whole items.
  – Need to accommodate 10 groups for numbers
  – Need to accommodate 27 groups for strings (alphabet + blank)
A Comparison of Sorting Algorithms
                  Worst case   Average case
Selection sort    n^2          n^2
Bubble sort       n^2          n^2
Insertion sort    n^2          n^2
Shell sort        n^2          n^(3/2) or n^(5/4), depending on the increment
Mergesort         n log n      n log n
Quicksort         n^2          n log n
Radix sort        n            n
Treesort          n^2          n log n      (studied in CSS343)
Heapsort          n log n      n log n      (studied in CSS343)
Question: do we really need to always use mergesort or quicksort?
Lab Work
• Partial Quicksort
  – Find the top k items
  – Find the bottom k items
  – Find the median
• Key Idea:
  – Focus on only one of partition[first, pivot - 1] or partition[pivot, last], whichever fits the requirement: top k, bottom k, or middle.
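The key idea can be sketched as a loop that partitions once and then continues only in the side that contains the wanted position. This is a minimal sketch under stated assumptions, not the lab solution: partitionAroundFirst and selectKth are hypothetical names, the pivot is simply the first item of the segment, and k is a 0-based position in sorted order.

```cpp
#include <utility>
#include <vector>

// Partitions a[first..last] around a[first] and returns the pivot's final
// index (same scheme as the lecture's partition, with the first item as pivot).
template <class Comparable>
int partitionAroundFirst(std::vector<Comparable> &a, int first, int last) {
    Comparable pivot = a[first];
    int lastS1 = first;
    for (int firstUnknown = first + 1; firstUnknown <= last; ++firstUnknown) {
        if (a[firstUnknown] < pivot) {
            ++lastS1;
            std::swap(a[firstUnknown], a[lastS1]);
        }
    }
    std::swap(a[first], a[lastS1]);
    return lastS1;
}

// Returns the item that would sit at index k if a were fully sorted.
// Precondition: 0 <= k < a.size(). Works on a copy of the input.
template <class Comparable>
Comparable selectKth(std::vector<Comparable> a, int k) {
    int first = 0;
    int last = static_cast<int>(a.size()) - 1;
    while (first < last) {
        int pivotIndex = partitionAroundFirst(a, first, last);
        if (k == pivotIndex)
            break;                       // the pivot is the wanted item
        if (k < pivotIndex)
            last = pivotIndex - 1;       // continue in the smaller items only
        else
            first = pivotIndex + 1;      // continue in the larger items only
    }
    return a[k];
}
```

With the six items 27 28 12 39 26 16, selectKth(a, 2) returns 26, the lower median. When the loop ends, every item left of position k is no larger than a[k], so the same loop also isolates the bottom k items.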
Programming Assignment
• In-Place Sorting
  – Sort data items only in the original array. Example: Quick Sort
  – Impractical for Merge Sort
• Non-Recursive, Semi-In-Place Merge Sort
  – Using a loop rather than recursion.
  – Using only one additional temporary array.
  – Moving data from the original to the temporary array or vice versa at each stage:
    orig:   8 5 4 1 7 2 6 3
    temp:   5 8 | 1 4 | 2 7 | 3 6
    orig:   1 4 5 8 | 2 3 6 7
    temp:   1 2 3 4 5 6 7 8