ITEC 2620A Introduction to Data Structures Instructor: Prof. Z. Yang Course Website: zyang/itec...

Post on 29-Dec-2015


ITEC 2620A Introduction to Data Structures

Instructor: Prof. Z. Yang
Course Website: http://people.math.yorku.ca/~zyang/itec2620a.htm
Office: TEL 3049

Sorting


Key Points

• Non-recursive sorting algorithms
  – Selection Sort
  – Insertion Sort
  – Best, Average, and Worst cases

Sorting

• Why is sorting important?
  – Easier to search sorted data sets
  – Searching and sorting are primary problems of computer science
• Sorting in general
  – Arrange a set of elements by their “keys” in increasing/decreasing order.
• Example: How would we sort a deck of cards?

Selection Sort

• Find the smallest unsorted value, and move it into position.
• What do you do with what was previously there?
  – “swap” it to where the smallest value was
  – sorting can work on the original array
• Example

Pseudocode

• Loop through all elements (have to put n elements into place)
  – for loop
• Loop through all remaining elements and find smallest
  – initialization, for loop, and branch
• Swap smallest element into correct place
  – single method
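The selection sort pseudocode can be sketched in Java roughly as follows. This is a minimal illustration, not the course's reference code: the class name SelectionSortSketch and the helper swap are assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical class name, used only for this illustration.
class SelectionSortSketch {
    // Swap the elements at positions i and j (the "single method" step).
    static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    static void selectionSort(int[] a) {
        // Outer for loop: one pass per element that must be put into place.
        for (int i = 0; i < a.length - 1; i++) {
            // Initialization: assume the first unsorted element is the smallest.
            int smallest = i;
            // Inner for loop and branch: scan the remaining elements.
            for (int j = i + 1; j < a.length; j++) {
                if (a[j] < a[smallest]) {
                    smallest = j;
                }
            }
            // Swap the smallest unsorted value into its final position.
            swap(a, i, smallest);
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 4, 1, 3};
        selectionSort(data);
        System.out.println(Arrays.toString(data)); // [1, 2, 3, 4, 5]
    }
}
```

The sort works on the original array, so no second array is needed.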

Insertion Sort

• Get the next value, and push it over until it is “semi” sorted.
  – elements in selection sort are in their final position
  – elements in insertion sort can still move
    • which do you use to organize a pile of paper?
• Example

Pseudocode

• Loop through all elements (have to put n elements into place)
  – for loop
• Loop through all sorted elements, and swap until slot is found
  – while loop
  – swap method
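A possible Java rendering of the insertion sort pseudocode is shown below. It is a sketch under assumptions: the class name InsertionSortSketch and the helper swap are illustrative names, and the swap-based inner loop follows the "swap until slot is found" wording rather than the shifting variant found in many textbooks.

```java
import java.util.Arrays;

// Hypothetical class name, used only for this illustration.
class InsertionSortSketch {
    // Swap the elements at positions i and j (the "swap method").
    static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    static void insertionSort(int[] a) {
        // Outer for loop: take the next unsorted value.
        for (int i = 1; i < a.length; i++) {
            int j = i;
            // While loop: push the value left through the sorted part
            // until its slot is found.
            while (j > 0 && a[j] < a[j - 1]) {
                swap(a, j, j - 1);
                j--;
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 4, 1, 3};
        insertionSort(data);
        System.out.println(Arrays.toString(data)); // [1, 2, 3, 4, 5]
    }
}
```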

Bubble Sort

• Slow and unintuitive…useless
• Relay race
  – pair-wise swaps until smaller value is found
  – smaller value is then swapped up
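The slide describes bubble sort only loosely, so the following is one plausible reading rather than the course's version: each pass walks from the end of the array toward the front, and whichever value is currently smallest gets handed forward by pair-wise swaps. The class name BubbleSortSketch is an assumption.

```java
import java.util.Arrays;

// Hypothetical class name, used only for this illustration.
class BubbleSortSketch {
    static void bubbleSort(int[] a) {
        // After pass i, the smallest remaining value sits at position i.
        for (int i = 0; i < a.length - 1; i++) {
            // Walk from the end toward position i, comparing neighbours.
            for (int j = a.length - 1; j > i; j--) {
                // If the later value is smaller, swap it up; otherwise the
                // smaller value already in front takes over the "relay".
                if (a[j] < a[j - 1]) {
                    int tmp = a[j];
                    a[j] = a[j - 1];
                    a[j - 1] = tmp;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 4, 1, 3};
        bubbleSort(data);
        System.out.println(Arrays.toString(data)); // [1, 2, 3, 4, 5]
    }
}
```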

Cost of Sorting

• Selection Sort
  – Is there a best, worst, and average case?
    • two for loops always the same
  – n elements in outer loop
  – n-1, n-2, n-3, …, 2, 1 elements in inner loop
    • average n/2 elements for each pass of the outer loop
  – n * n/2 compares
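To check the selection sort estimate, the sketch below counts compares directly. The exact count is (n-1) + (n-2) + … + 1 = n(n-1)/2, which the slide rounds to n * n/2. The class and method names are assumptions for illustration.

```java
// Hypothetical class name, used only for this illustration.
class SelectionSortCompares {
    // Sorts the array with selection sort and returns the number of
    // element-to-element compares that were made.
    static long countCompares(int[] a) {
        long compares = 0;
        for (int i = 0; i < a.length - 1; i++) {
            int smallest = i;
            for (int j = i + 1; j < a.length; j++) {
                compares++; // one compare per inner-loop step, whatever the data
                if (a[j] < a[smallest]) {
                    smallest = j;
                }
            }
            int tmp = a[i];
            a[i] = a[smallest];
            a[smallest] = tmp;
        }
        return compares;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        int[] reversed = {10, 9, 8, 7, 6, 5, 4, 3, 2, 1};
        System.out.println(countCompares(sorted));   // 45
        System.out.println(countCompares(reversed)); // 45
    }
}
```

Both inputs give the same count, which is why selection sort has no distinct best or worst case.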

Cost of Sorting (Cont’d)

• Insertion Sort
  – Worst – same as selection sort, next element swaps till end
    • n * n/2 compares
  – Best – next element is already sorted, no swaps
    • n * 1 compares
  – Average – like a linear search, the next element passes about 50% of the sorted values, or n/4 on average
    • n * n/4 compares
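The best and worst cases of insertion sort can be observed by counting compares on sorted and reversed input. This is an illustrative sketch; the class and method names are assumptions. On exactly n elements the best case is n-1 compares and the worst case is n(n-1)/2, matching the rounded n * 1 and n * n/2 figures.

```java
// Hypothetical class name, used only for this illustration.
class InsertionSortCompares {
    // Sorts the array with insertion sort and returns the number of
    // element-to-element compares that were made.
    static long countCompares(int[] a) {
        long compares = 0;
        for (int i = 1; i < a.length; i++) {
            int j = i;
            while (j > 0) {
                compares++;
                if (a[j] >= a[j - 1]) {
                    break; // slot found, stop pushing this value left
                }
                int tmp = a[j];
                a[j] = a[j - 1];
                a[j - 1] = tmp;
                j--;
            }
        }
        return compares;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        int[] reversed = {10, 9, 8, 7, 6, 5, 4, 3, 2, 1};
        System.out.println(countCompares(sorted));   // 9  (best case)
        System.out.println(countCompares(reversed)); // 45 (worst case)
    }
}
```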

Value of Sorting

• Current cost of sorting is roughly n^2 compares
• Toronto phone book
  – 2 million records
  – 4 trillion compares to sort
• Linear search
  – 1 million compares on average
• Binary search
  – about 20 compares (log2 of 2 million is roughly 21)
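The phone book figures can be reproduced with a few lines of arithmetic. The sketch below assumes the rough cost models from the slides (n^2 compares to sort, n/2 for an average linear search, and the number of halvings for binary search); the class and method names are illustrative.

```java
// Hypothetical class name, used only for this illustration.
class PhoneBookCosts {
    // Rough quadratic sorting cost: n^2 compares.
    static long sortCompares(long n) {
        return n * n;
    }

    // Average linear search: about half the records are examined.
    static long averageLinearSearchCompares(long n) {
        return n / 2;
    }

    // Number of halvings needed to narrow n records down to one,
    // i.e. the smallest k with 2^k >= n.
    static int binarySearchCompares(long n) {
        return 64 - Long.numberOfLeadingZeros(n - 1);
    }

    public static void main(String[] args) {
        long n = 2_000_000L;
        System.out.println(sortCompares(n));                // 4000000000000
        System.out.println(averageLinearSearchCompares(n)); // 1000000
        System.out.println(binarySearchCompares(n));        // 21
    }
}
```

The exact halving count is 21, which the slide rounds to 20; either way it is tiny next to a million.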

Trade-Offs

• Write a method, or re-implement each time?

• Buy a parking pass, or pay cash each time?

• Sort in advance, or do linear search each time?

• Trade-offs are an important part of program design
  – which component should you optimize?
  – is the cost of optimization worth the savings?

Complexity Analysis


Key Points

• Analysis of non-recursive algorithms
  – Estimation
  – Complexity Analysis
  – Big-Oh Notation

Factors in Cost Estimation

• Does the program’s execution depend on the input?
  – Math.max(a, b);
    • always processes two numbers → constant time
  – maxValue(anArray);
    • processes n numbers → varies with array size
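The slides name maxValue(anArray) without showing its body. One way it might be written is sketched below, assuming a non-empty int array; the class name MaxValueSketch and the method body are assumptions. Math.max is the standard Java library call.

```java
// Hypothetical class name, used only for this illustration.
class MaxValueSketch {
    // Assumed implementation: returns the largest value in a non-empty array.
    static int maxValue(int[] anArray) {
        int max = anArray[0];
        // One Math.max call per remaining element, so the work grows with n.
        for (int i = 1; i < anArray.length; i++) {
            max = Math.max(max, anArray[i]); // each call is constant time
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(Math.max(3, 7));                    // 7
        System.out.println(maxValue(new int[]{3, 7, 1, 9, 4})); // 9
    }
}
```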

Value of Cost Estimation

• Constant time programs
  – run once, always the same…
  – estimation not really required
• Variable time programs
  – run once
  – future runs depend on relative size of input
    • based on what function?

Cost Analysis

• Consider the following code:

sum = 0;
for (i = 1; i <= n; i++)
    for (j = 1; j <= n; j++)
        sum++;

• It takes longer when n is larger.
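Wrapping the nested loop in a method makes the growth easy to see: sum++ runs n * n times, so doubling n makes the count four times larger. The class and method names below are assumptions added for the illustration.

```java
// Hypothetical class name, used only for this illustration.
class NestedLoopCost {
    // Runs the nested loop from the slide and returns the final value of sum.
    static long countIncrements(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= n; j++) {
                sum++;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(countIncrements(10)); // 100
        System.out.println(countIncrements(20)); // 400
        System.out.println(countIncrements(40)); // 1600
    }
}
```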

Asymptotic Analysis

• “What is the ultimate growth rate of an algorithm as a function of its input size?”

• “If the problem size doubles, approximately how much longer will it take?”

• Quadratic
• Linear (linear search)
• Logarithmic (binary search)
• Exponential

Big-Oh Notation

• Big-Oh represents the “order of” the cost function
  – ignoring all constants, find the largest function of n in the cost function
• Selection Sort
  – n * n/2 compares + n swaps
    • O(n^2)
• Linear Search
  – n compares + n increments + 1 initialization
    • O(n)
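The linear search cost terms can be matched to the parts of a loop. The sketch below is an assumed implementation (the slides give only the operation counts), with comments marking where each term comes from in the worst case.

```java
// Hypothetical class name, used only for this illustration.
class LinearSearchSketch {
    // Returns the index of target in a, or -1 if it is not present.
    static int linearSearch(int[] a, int target) {
        for (int i = 0;               // 1 initialization
                 i < a.length;
                 i++) {               // up to n increments
            if (a[i] == target) {     // up to n compares
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] data = {4, 8, 15, 16, 23, 42};
        System.out.println(linearSearch(data, 23)); // 4
        System.out.println(linearSearch(data, 5));  // -1
    }
}
```

Dropping the constants from n + n + 1 leaves n as the largest function, hence O(n).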

Simplifying Conventions

• Only focus on the largest function of n

• Ignore smaller terms
• Ignore constants

Examples

• Example 1:
  – Matrix multiplication
    A(n×m) * B(m×n) = C(n×n)
• Example 2:

selectionSort(a);
for (int i = 0; i < n; i++)
    binarySearch(i, a);
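Both examples can be worked through in code. The sketch below is one possible reading, not the course's solution: multiply uses the textbook triple loop, giving n * n * m steps, or O(n^2 * m); sortThenCountHits does one O(n^2) sort followed by n binary searches at O(log n) each, and O(n^2 + n log n) simplifies to O(n^2). All class and method names are assumptions, and java.util.Arrays.binarySearch stands in for the slide's binarySearch.

```java
import java.util.Arrays;

// Hypothetical class name, used only for this illustration.
class ComplexityExamples {
    // Example 1: C (n x n) = A (n x m) * B (m x n).
    static int[][] multiply(int[][] a, int[][] b) {
        int n = a.length;
        int m = b.length;
        int[][] c = new int[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                for (int k = 0; k < m; k++) {
                    c[i][j] += a[i][k] * b[k][j]; // runs n * n * m times
                }
            }
        }
        return c;
    }

    static void selectionSort(int[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            int smallest = i;
            for (int j = i + 1; j < a.length; j++) {
                if (a[j] < a[smallest]) {
                    smallest = j;
                }
            }
            int tmp = a[i];
            a[i] = a[smallest];
            a[smallest] = tmp;
        }
    }

    // Example 2: sort once, then search for each of 0..n-1.
    // Returns how many of those values were found.
    static int sortThenCountHits(int[] a) {
        selectionSort(a);                           // O(n^2)
        int hits = 0;
        for (int i = 0; i < a.length; i++) {        // n iterations
            if (Arrays.binarySearch(a, i) >= 0) {   // O(log n) each
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2, 3}, {4, 5, 6}};
        int[][] b = {{7, 8}, {9, 10}, {11, 12}};
        System.out.println(Arrays.deepToString(multiply(a, b))); // [[58, 64], [139, 154]]
        System.out.println(sortThenCountHits(new int[]{3, 0, 7, 1})); // 3
    }
}
```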

Trade-Offs and Limitations

• “What is the dominant term in the cost function?”
  – What if the constant term is larger than n?
• What happens if both algorithms have the same complexity?
  – Selection sort and Insertion sort are both O(n^2)
• Constants can matter
  – Same complexity (obvious) and different complexity (problem size)