

Internal Sorting 2

S. Thiel

Department of Computer Science & Software Engineering, Concordia University

July 11, 2018


Outline

- Sorting
- Quicksort
- Mergesort
- Heapsort
- Radix Sort
- References


Sorting

- Three general-purpose sorting algorithms
- They sort by comparing
- Quicksort
- Mergesort
- Heapsort


Quicksort

- Pick a pivot
- Partition around the pivot
  - Items that are smaller/equal go before
  - Items that are larger go after
- Recursively apply Quicksort on each partition


Quicksort 2

- Quicksort works by partitioning an input in two, then sorting each part recursively
- The partition is made around a chosen “pivot”
- The left partition only has elements smaller than or equal to the “pivot”
- The right partition only has elements bigger than the “pivot”
- Each “partition” step takes Θ(n) operations
- How many “partition” steps are needed?
- Actually, each “partition” step takes gradually fewer operations... why?


Quicksort Analysis

- In the best case: Θ(n log n)
- In the average case: Θ(n log n)
- In the worst case: Θ(n²)
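The best and worst cases can be read off two recurrences. This is a sketch under the usual assumptions: a partition step over n elements costs Θ(n), the best case splits the input evenly, and the worst case peels off only the pivot each time. It also hints at why later partition steps are cheaper: each one only touches its own subarray.

```latex
\begin{align*}
T_{\text{best}}(n)  &= 2\,T(n/2) + \Theta(n) = \Theta(n \log n) \\
T_{\text{worst}}(n) &= T(n-1) + \Theta(n) = \Theta\Big(\sum_{k=1}^{n} k\Big) = \Theta(n^2)
\end{align*}
```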


Quicksort Properties 1

Quicksort...

- is a divide and conquer algorithm
- works best with good pivot selection
- is recursive
- puts a pivot in place every pass
- is in-place


Quicksort Properties 2

- Is it stable?
- Some neat optimizations to make it fast and stable with many duplicates
  - ... might be a bit slower otherwise


Quicksort flavors

- There are two "partitioning schemes"
  - Hoare (my preference)
    - two scanning indices
    - move towards each other till they swap or cross
    - swap when both point at an element on the wrong side (inversion)
  - Lomuto
    - two indices, but only one is scanning
    - swap out-of-place items to the beginning
    - less efficient (Internet says 3x as slow, let's see why)
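To make the comparison concrete, here is a minimal sketch of both schemes side by side. The class name `PartitionSchemes`, the `swaps` counter, and the pivot choices (last element for Lomuto, middle element for Hoare) are assumptions made for illustration, not taken from the slides. Counting swaps is one rough way to see the cost difference: Lomuto swaps for every element that is not larger than the pivot, while Hoare swaps only when both indices sit on misplaced elements.

```java
import java.util.Arrays;

final class PartitionSchemes {
    static long swaps = 0; // hypothetical counter, only for comparing the two schemes

    static void swap(int[] a, int x, int y) {
        int t = a[x]; a[x] = a[y]; a[y] = t;
        swaps++;
    }

    // Lomuto: j scans, i marks the end of the "<= pivot" region.
    // Returns the final index of the pivot.
    static int lomuto(int[] a, int lo, int hi) {
        int pivot = a[hi];
        int i = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] <= pivot) { swap(a, i, j); i++; }
        }
        swap(a, i, hi);
        return i;
    }

    // Hoare: two indices move towards each other and swap inversions.
    // Returns j such that every element of a[lo..j] <= every element of a[j+1..hi].
    static int hoare(int[] a, int lo, int hi) {
        int pivot = a[lo + (hi - lo) / 2];
        int i = lo - 1, j = hi + 1;
        while (true) {
            do { i++; } while (a[i] < pivot);
            do { j--; } while (a[j] > pivot);
            if (i >= j) return j;
            swap(a, i, j);
        }
    }

    static void sortLomuto(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int p = lomuto(a, lo, hi);
        sortLomuto(a, lo, p - 1);   // pivot at p is already in place
        sortLomuto(a, p + 1, hi);
    }

    static void sortHoare(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int p = hoare(a, lo, hi);
        sortHoare(a, lo, p);        // split point p stays in the left part
        sortHoare(a, p + 1, hi);
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 9, 1, 5, 6, 0, 3};
        int[] a = data.clone(), b = data.clone();
        swaps = 0; sortLomuto(a, 0, a.length - 1);
        System.out.println("Lomuto: " + Arrays.toString(a) + " swaps=" + swaps);
        swaps = 0; sortHoare(b, 0, b.length - 1);
        System.out.println("Hoare:  " + Arrays.toString(b) + " swaps=" + swaps);
    }
}
```

Running `main` on larger inputs and comparing the two counters gives a feel for the gap, though swap counts are only part of the running time.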


Quicksort Standard Approaches [1, p.242]

- Picking a pivot often uses median of three
- Diversion to Insertion Sort for small partitions
- Diversion to Heapsort if it looks like Θ(n²) is happening (Introsort, Musser)
- Tail recursion
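Three of these approaches fit in a short sketch: median-of-three pivot selection, a cutoff to Insertion Sort, and replacing one recursive call with a loop. The class name `TunedQuicksort`, the `CUTOFF` value of 8, and the use of a Lomuto-style partition are assumptions chosen for brevity; the Introsort diversion to Heapsort is left out.

```java
import java.util.Arrays;

final class TunedQuicksort {
    static final int CUTOFF = 8; // assumed threshold; real libraries tune this value

    static void sort(int[] a) { sort(a, 0, a.length - 1); }

    static void sort(int[] a, int lo, int hi) {
        // Loop instead of a second recursive call (tail-recursion elimination).
        while (hi - lo + 1 > CUTOFF) {
            int p = partition(a, lo, hi);
            if (p - lo < hi - p) {       // recurse on the smaller side only
                sort(a, lo, p - 1);
                lo = p + 1;
            } else {
                sort(a, p + 1, hi);
                hi = p - 1;
            }
        }
        insertionSort(a, lo, hi);        // small partitions go to Insertion Sort
    }

    // Orders a[lo], a[mid], a[hi] and returns mid, which then holds the median.
    static int medianOfThree(int[] a, int lo, int hi) {
        int mid = lo + (hi - lo) / 2;
        if (a[mid] < a[lo]) swap(a, mid, lo);
        if (a[hi] < a[lo]) swap(a, hi, lo);
        if (a[hi] < a[mid]) swap(a, hi, mid);
        return mid;
    }

    // Lomuto-style partition around the median-of-three pivot.
    static int partition(int[] a, int lo, int hi) {
        swap(a, medianOfThree(a, lo, hi), hi);   // stick pivot at end
        int pivot = a[hi];
        int i = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] <= pivot) { swap(a, i, j); i++; }
        }
        swap(a, i, hi);                          // put pivot in place
        return i;
    }

    static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= lo && a[j] > key) { a[j + 1] = a[j]; j--; }
            a[j + 1] = key;
        }
    }

    static void swap(int[] a, int x, int y) { int t = a[x]; a[x] = a[y]; a[y] = t; }

    public static void main(String[] args) {
        int[] data = {9, 4, 7, 1, 8, 2, 6, 3, 5, 0, 11, 10};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

Recursing on the smaller side first keeps the stack depth logarithmic even when the splits are uneven.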


Standard Implementation 1

    void qsort(int[] A, int i, int j) {       // Quicksort
      int pivotindex = findpivot(A, i, j);
      swap(A, pivotindex, j);                 // Stick pivot at end
      // k will be the first position in the right subarray
      int k = partition(A, i-1, j, A[j]);
      swap(A, k, j);                          // Put pivot in place
      if ((k-i) > 1) qsort(A, i, k-1);        // Sort left
      if ((j-k) > 1) qsort(A, k+1, j);        // Sort right
    }


Standard Implementation 2

    int partition(int[] A, int l, int r, int pivot) {
      do {                   // Move bounds inward until they meet
        while (A[++l] < pivot);
        while ((r != 0) && (A[--r] > pivot));
        swap(A, l, r);       // Swap out-of-place values
      } while (l < r);       // Stop when they cross
      swap(A, l, r);         // Reverse last, wasted swap
      return l;              // Return first position in right partition
    }
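The two slides leave `findpivot` and `swap` undefined. A runnable assembly might look like the sketch below; the class name `SlideQuicksort`, the `sort` wrapper, and the choice of the middle index in `findpivot` are assumptions added here, while `qsort` and `partition` follow the slide code.

```java
import java.util.Arrays;

final class SlideQuicksort {
    // Hypothetical entry point: guards against arrays too short to partition.
    static void sort(int[] A) {
        if (A.length > 1) qsort(A, 0, A.length - 1);
    }

    static void qsort(int[] A, int i, int j) {
        int pivotindex = findpivot(A, i, j);
        swap(A, pivotindex, j);                 // stick pivot at end
        int k = partition(A, i - 1, j, A[j]);   // k: first position in the right subarray
        swap(A, k, j);                          // put pivot in place
        if ((k - i) > 1) qsort(A, i, k - 1);    // sort left
        if ((j - k) > 1) qsort(A, k + 1, j);    // sort right
    }

    static int partition(int[] A, int l, int r, int pivot) {
        do {                                    // move bounds inward until they meet
            while (A[++l] < pivot) { }
            while ((r != 0) && (A[--r] > pivot)) { }
            swap(A, l, r);                      // swap out-of-place values
        } while (l < r);                        // stop when they cross
        swap(A, l, r);                          // reverse last, wasted swap
        return l;
    }

    // Assumption: use the middle position as the pivot.
    static int findpivot(int[] A, int i, int j) {
        return i + (j - i) / 2;
    }

    static void swap(int[] A, int x, int y) {
        int t = A[x]; A[x] = A[y]; A[y] = t;
    }

    public static void main(String[] args) {
        int[] data = {72, 6, 57, 88, 60, 42, 83, 73, 48, 85};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

The pivot parked at `A[j]` doubles as a sentinel for the left scan, which is why the first `while` needs no bounds check.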


Mergesort

- Splits the input into halves repeatedly
- Merges the halves back together


Mergesort Analysis

- In the best case: Θ(n log n)
- In the average case: Θ(n log n)
- In the worst case: Θ(n log n)


Figure: A Mergesort example [2].


Standard Implementation 1

    // Needs java.util.List and java.util.LinkedList
    List<Integer> mergesort(List<Integer> inlist) {
      if (inlist.size() <= 1) return inlist;
      int mid = inlist.size() / 2;
      // Copy each half: subList returns a view, and merge removes elements
      List<Integer> L1 = new LinkedList<>(inlist.subList(0, mid));
      // The end index of subList is exclusive, so use size(), not size()-1
      List<Integer> L2 = new LinkedList<>(inlist.subList(mid, inlist.size()));
      return merge(mergesort(L1), mergesort(L2));
    }


Standard Implementation 2

    List<Integer> merge(List<Integer> L1, List<Integer> L2) {
      List<Integer> L = new LinkedList<>();
      while (!L1.isEmpty() && !L2.isEmpty()) {
        if (L1.get(0) <= L2.get(0))     // <= takes from L1 on ties
          L.add(L1.remove(0));
        else L.add(L2.remove(0));
      }
      if (L1.isEmpty()) L.addAll(L2);   // Append whatever is left over
      if (L2.isEmpty()) L.addAll(L1);
      return L;
    }


Mergesort with Lists

- The code above looks nice with Lists
- Finding the halfway point of a List is costly
  - How costly? I say 2n. Why?
- We can alternate and skip finding halfway points for 1n
- We can use a List-of-Lists and just start merging sublists depth-first?
- We can use a List-of-Lists approach finding inherent structure first?
  - Is the cost of merging lists of varying sizes worth it?
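The "alternate" idea can be sketched as follows: instead of walking to the midpoint, deal the elements into two lists in turn, which takes a single pass. The class name `AlternatingMergesort` and the use of `java.util.LinkedList` are assumptions for illustration; note that this version consumes its input list.

```java
import java.util.Arrays;
import java.util.LinkedList;

final class AlternatingMergesort {
    // Sorts by dealing elements alternately into two lists, then merging.
    // The input list is emptied in the process.
    static LinkedList<Integer> sort(LinkedList<Integer> in) {
        if (in.size() <= 1) return in;
        LinkedList<Integer> a = new LinkedList<>();
        LinkedList<Integer> b = new LinkedList<>();
        boolean toA = true;
        while (!in.isEmpty()) {              // one pass, no midpoint search
            (toA ? a : b).addLast(in.removeFirst());
            toA = !toA;
        }
        return merge(sort(a), sort(b));
    }

    static LinkedList<Integer> merge(LinkedList<Integer> x, LinkedList<Integer> y) {
        LinkedList<Integer> out = new LinkedList<>();
        while (!x.isEmpty() && !y.isEmpty()) {
            out.addLast(x.getFirst() <= y.getFirst() ? x.removeFirst() : y.removeFirst());
        }
        out.addAll(x);                       // at most one of these is non-empty
        out.addAll(y);
        return out;
    }

    public static void main(String[] args) {
        LinkedList<Integer> data = new LinkedList<>(Arrays.asList(4, 1, 3, 1, 5, 9, 2, 6));
        System.out.println(sort(data));
    }
}
```

One trade-off worth noticing: dealing alternately interleaves the original positions, so equal keys can come out in a different relative order than they went in. Splitting at the midpoint does not have that problem.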


Mergesort with Arrays

- Using two arrays solves most Array-related issues
- Using two arrays implements pretty smoothly
- There is no cost to finding the halfway point
- You need an empty array, unlike Quicksort or Mergesort with lists
- Can you benefit from sorting existing runs?
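A two-array Mergesort might look like the sketch below. The class name `ArrayMergesort` and the "skip the merge when the halves are already in order" check are assumptions added for illustration; the check is one simple way to profit from existing runs, since already-sorted input then needs only one comparison per merge.

```java
import java.util.Arrays;

final class ArrayMergesort {
    static void sort(int[] a) {
        int[] temp = new int[a.length];      // the second, initially empty array
        sort(a, temp, 0, a.length - 1);
    }

    private static void sort(int[] a, int[] temp, int lo, int hi) {
        if (lo >= hi) return;
        int mid = lo + (hi - lo) / 2;        // halfway point is plain index arithmetic
        sort(a, temp, lo, mid);
        sort(a, temp, mid + 1, hi);
        if (a[mid] <= a[mid + 1]) return;    // halves already form one run: nothing to merge

        System.arraycopy(a, lo, temp, lo, hi - lo + 1);
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if (i > mid)                a[k] = temp[j++];   // left half exhausted
            else if (j > hi)            a[k] = temp[i++];   // right half exhausted
            else if (temp[j] < temp[i]) a[k] = temp[j++];   // strictly smaller: take right
            else                        a[k] = temp[i++];   // ties go left, keeping it stable
        }
    }

    public static void main(String[] args) {
        int[] data = {36, 20, 17, 13, 28, 14, 23, 15};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```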


Mergesort with existing runs

- Does the best case change?


Heapsort

- Build a heap in Θ(n)
- Take the value off the heap: Θ(1)
- Re-settle the heap: Θ(log n)


Heapsort Analysis

- In the best case: Θ(n log n)
- In the average case: Θ(n log n)
- In the worst case: Θ(n log n)


Heapsort Properties

- Can be done in-place
- Generally considered slower than Quicksort
- Effective when only the first few values in a list are needed
- Shaffer and most others show Top-Down Heapsort
- Bottom-Up Heapsort is twice as fast, faster with a bit extra memory
- Works well when you have more data than fits in memory
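The "first few values" point can be illustrated with the standard library's heap. This is a sketch only: the class and method names (`SmallestK`, `smallest`) are hypothetical, and it leans on `java.util.PriorityQueue` rather than a hand-written heap. The heap is built once and only k values are removed, so the remaining elements are never fully sorted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

final class SmallestK {
    // Returns the k smallest values in ascending order (fewer if the input is shorter).
    static List<Integer> smallest(List<Integer> values, int k) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(values); // min-heap over all values
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < k && !heap.isEmpty(); i++) {
            out.add(heap.poll());   // take the minimum off the heap, then it re-settles
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(smallest(Arrays.asList(7, 2, 9, 4, 1, 8), 3));
    }
}
```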


Heap navigation

- We can use math to navigate the heap
  - iParent(i) = floor((i − 1)/2)
  - iLeftChild(i) = 2 ∗ i + 1
  - iRightChild(i) = 2 ∗ i + 2
- http://faculty.simpson.edu/lydia.sinapova/www/cmsc250/LN250_Weiss/L13-HeapSortEx.htm
- https://en.wikipedia.org/wiki/Heapsort
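With those three index formulas, an in-place Heapsort over a plain array fits in a few lines. The sketch below assumes a max-heap and a sift-down build; the class name `HeapsortSketch` and the helper names are illustrative, not from the slides.

```java
import java.util.Arrays;

final class HeapsortSketch {
    // Index arithmetic from the slide. Java integer division truncates,
    // which matches floor((i - 1) / 2) for every i >= 1.
    static int parent(int i)     { return (i - 1) / 2; }
    static int leftChild(int i)  { return 2 * i + 1; }
    static int rightChild(int i) { return 2 * i + 2; }

    static void sort(int[] a) {
        int n = a.length;
        // Build a max-heap: sift down every internal node, last parent first.
        for (int i = parent(n - 1); i >= 0; i--) siftDown(a, i, n);
        // Repeatedly move the maximum to the end and shrink the heap.
        for (int end = n - 1; end > 0; end--) {
            swap(a, 0, end);        // take the value off the heap
            siftDown(a, 0, end);    // re-settle the remaining heap
        }
    }

    // Restores the heap property for the subtree rooted at i, within a[0..size-1].
    static void siftDown(int[] a, int i, int size) {
        while (leftChild(i) < size) {
            int child = leftChild(i);
            int right = rightChild(i);
            if (right < size && a[right] > a[child]) child = right; // pick the larger child
            if (a[i] >= a[child]) return;
            swap(a, i, child);
            i = child;
        }
    }

    static void swap(int[] a, int x, int y) { int t = a[x]; a[x] = a[y]; a[y] = t; }

    public static void main(String[] args) {
        int[] data = {73, 6, 57, 88, 60, 42, 83, 72, 48, 85};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```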


Two Types of Radix Sort

- Most Significant Digit looks good, but actually can be bad
- Least Significant Digit looks weird, but actually good
- LSD Radix sort is how most of us sort cards


LSD Radix Sort Analysis

- In the best case: Θ(n)
- In the average case: Θ(n)
- In the worst case: Θ(n)


MSD Radix Sort Analysis

- In the best case: Θ(n)
- In the average case: Θ(L)
- In the worst case: Θ(L)
- What the heck is L?


LSD Radix Sort Code

Algorithm 1 LSD Radix Sort

Require: input, val()
Ensure: sorted input

    /* Initialization */
    buf ← initArray[input.length]
    counts ← initArray[passes][bucketCount]

    /* perform initial counting pass */
    for i in input do
        for j = 0 to passes − 1 do
            counts[j][bitsFor(val(i), j)]++
        end for
    end for

    /* convert counts to indices */
    for j = 0 to passes − 1 do
        convertToIndices(counts[j])
    end for

    /* deal to buffer based on current radix */
    for j = 0 to passes − 1 do
        for i in input do
            buf[counts[j][bitsFor(val(i), j)]++] = i
        end for
        swap(input, buf)
    end for
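The pseudocode above maps fairly directly onto Java. The following is a sketch under stated assumptions: keys are non-negative `int` values, each pass looks at 8 bits (so 4 passes and 256 buckets), and `digit` plays the role of `bitsFor(val(i), j)`. The class name `LsdRadixSort` and these constants are illustrative choices, not part of the slides.

```java
import java.util.Arrays;

final class LsdRadixSort {
    static final int BITS = 8;              // assumed radix width per pass
    static final int BUCKETS = 1 << BITS;   // 256 buckets
    static final int PASSES = 32 / BITS;    // 4 passes cover a 32-bit int

    // Extracts the digit used in the given pass (pass 0 is least significant).
    static int digit(int v, int pass) {
        return (v >>> (pass * BITS)) & (BUCKETS - 1);
    }

    // Sorts non-negative ints in ascending order.
    static void sort(int[] input) {
        int[] buf = new int[input.length];
        int[][] counts = new int[PASSES][BUCKETS];

        // Initial counting pass: one histogram per digit position.
        for (int v : input)
            for (int j = 0; j < PASSES; j++) counts[j][digit(v, j)]++;

        // Convert counts to starting indices (exclusive prefix sums).
        for (int j = 0; j < PASSES; j++) {
            int sum = 0;
            for (int b = 0; b < BUCKETS; b++) {
                int c = counts[j][b];
                counts[j][b] = sum;
                sum += c;
            }
        }

        // Deal to the buffer based on the current digit, then swap roles.
        int[] src = input, dst = buf;
        for (int j = 0; j < PASSES; j++) {
            for (int v : src) dst[counts[j][digit(v, j)]++] = v;
            int[] t = src; src = dst; dst = t;
        }
        // After an odd number of passes the result would sit in buf; copy it back.
        if (src != input) System.arraycopy(src, 0, input, 0, input.length);
    }

    public static void main(String[] args) {
        int[] data = {170, 45, 75, 90, 802, 24, 2, 66};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

Each dealing pass is stable, which is what lets later (more significant) passes preserve the order established by earlier ones.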


LSD Radix Sort Example

Figure: A Radix Sort Example.


LSD Radix Sort Analysis

- If the size of the keys grows with n, not so good
  - we go back to Θ(n log n)
- But! If that is the case, then comparison sorts become:
  - Θ(n log n log n). Why?
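One common way to read this argument, sketched under the assumption that keys are written with k digits in a fixed base r and that all n keys are distinct:

```latex
\begin{align*}
k &\ge \lceil \log_r n \rceil
  && \text{digits needed to tell } n \text{ distinct keys apart} \\
T_{\text{LSD}}(n) &= \Theta\big(k\,(n + r)\big) = \Theta(n \log n)
  && \text{when } k = \Theta(\log n) \text{ and } r \text{ is constant} \\
T_{\text{cmp}}(n) &= \Theta(n \log n) \cdot \Theta(k) = \Theta(n \log n \cdot \log n)
  && \text{a comparison may read up to } k \text{ digits}
\end{align*}
```

Under this reading the two kinds of sort stay a factor of log n apart, as long as comparisons are charged by key length rather than treated as constant time.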


References

[1] Clifford A. Shaffer. Data Structures and Algorithm Analysis in Java. 2013.

[2] Wikipedia. Mergesort. https://en.wikipedia.org/wiki/Merge_sort, May 2017.