New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4:...

22
Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an integer , , find the th smallest element. When , it is called the median problem. Example: Given , the th smallest element is 19. Question: How do you solve this problem? 1

Transcript of New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4:...

Page 1: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

Lecture 4: The Linear Time Selection

Selection Problem

Given a sequence of numbers������������������

, and aninteger � , � � � � � , find the � th smallest element.When ��� ��������� , it is called the median problem.

Example: Given� � ����� �! � �#" � �#$ � % � �#"!" , the & th

smallest element is 19.

Question: How do you solve this problem?

1

Page 2: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

First Solution: Selection by sorting

Step 1: Sort the elements in ascending order with anyalgorithm of complexity � � � ����� ��� .

Step 2: Return the � th element of the sorted array.

The complexity of this solution is � � � ��� ��� .Question: Can we do better?

Answer: YES, but we need to recall Partition( � � � ���)

used in Quicksort!

2

Page 3: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

Second Solution : Linear running time in average

Recall of Partition( � � � ���)

Definition: Rearrange the array � � � � � ��� into two (pos-sibly empty) subarrays � � � � ��� � � � and � � � � � � � ���such that

� ��� � � � � ��� � ��� �for any

� � � � � � � and� � � � � � �

.

x

p rq

x x

x = A[r]

(1) The original � � ��� is used as the pivot.(2) It is a deterministic algorithm.(3) The element for the

�th position is found!

Note that this partition is different from the partition weused in COMP 171.

3

Page 4: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

The Idea of Partition( � � � ���)

ip j rx

x x unrestricted

(1) Initially � � ��� � � � � � � � � � .

(2) Increase�

by 1 each time to find a place for � � � � .At the same time increase � when necessary.

(3) The procedure stops when� � �

.

4

Page 5: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

One Iteration of the Procedure Partition

ip j rx

x x

>x

jp i

x x

x

r

rp

x x

i j

x< x

p i j r

x x

(A) A[j] > x

(B) A[j] < x

(A) Only increase�

by 1.

(B) ��� � � � . � � � ��� � � � � . ��

� � � .

5

Page 6: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

The Operation of Partition( ��������� ): Example

i

j

j

j

p i j

p i j

p i j

p i j, r

p i j, r

r

r

r

r

r

r

rp, j

p, i

p, i

p, i

2 8 7 41 3 5 6

2

2

2

2

2

2

2

2

1

1

1

1

1 3

3

3

3

8

8

8

8

8

8

8

7

7

7

7

7

7

5

5

5

6

6

8 7 1 3 5 6 4

7 1 3 5 6 4

1 3 5 6 4

3 5 6 4

5 6 4

6 4

4

4

(1)

(2)

(3)

(4)

(5)

(6)

(9)

(7)

(8)

6

Page 7: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

The Partition( � � � ���) Algorithm

Partition(A, p, r){// A[r] is the pivot elementx = A[r];i = p-1;for (j = p to r-1) {if (A[j] <= x) {i = i+1;exchange A[i] and A[j]

}}

// put pivot in positionexchange A[i+1] and A[r]// q = i+1return i+1;}

7

Page 8: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

The Running Time of Partition( � � � ���)

comparison of array elementsassignment, addition, comparison of loop variables

Partition( � � � � �):

� � � � ��� 1� � � � � 1for

� � �to

� � � ��� � � � �if � � � � � � � � � � �

� � � � � � � � � � �exchange � � � ��� � � � � � �� � � � �

exchange � � � � � � � � � ��� 3return � � � 1

Total: � � � � � and � ��� � � � � � � ���

Running time is � � � � � � , that is, linear in the lengthof the array � � � � � ��� .

8

Page 9: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

Randomized-Partition( � � � � �)

The Idea: In the algorithm Partition( � � � � �), � � �� is al-

ways used as the pivot � to partition the array � � � � � �� .In the algorithm Randomized-Partition( � � � ���

), we ran-domly choose an

�,� � � � �

, and use � � � � as pivot.

Randomized-Partition(A, p, r){j = random(p, r);exchange A[r] and A[j]Partition(A, p, r);}

Remark: random(�,

�) is a pseudorandom-number

generator that returns a random number between�

and�.

9

Page 10: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

Randomized-Select( � � � � � � � ), � � � � � � � � �Problem: Select the � th smallest element in � � � � � ��� ,where � � � � � � � � � .

Solution: Apply Randomized-Partition( ��� � � � ), getting

p q r

k =kth element

q−p+1

Case 1: ��� � , pivot is the solution.

Case 2: ��� � , the � th smallest element in ��� ��� ��� must be the� th smallest element in ��� ���� ��

�� .

Case 3: ��� � , the � th smallest element in ��� ��� ��� must be the� ��� ��� th smallest element in ���� ����� ��� .

If necessary, recursively call the same procedure to the subar-

ray.

10

Page 11: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

Randomized-Select( � � � � � � � ), � � � � � � � � �if� � � �

return � � � �� � Randomized-Partition ��� � � ��� �� � � � � � �if � � � �

the pivot is the answerreturn � � ��

else if � �

return Randomized-Select ��� � � � � � � � � �else

return Randomized-Select ��� � � � � � � � � � � �

Remark: To find the � th smallest element in � � � � � � � ,call Randomized-Select( � � � � � � � ).

11

Page 12: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

Running Time of Randomized-Select( � � � � � � � )Let � � � � � � be the average number of comparisons ofarray elements for � � � � � .

Then � � � � � � � " and for � � � we get

� � � � � � � � � � � � initial partition� �� ��� � ��

� �� � � � � � � � � � � recursion,

� �� � �� � � �

� � � � � � � � � � recursion,� � �

We will prove by induction on � that

� � � � � � & �for all � and � .

12

Page 13: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

Proof that � � � � � � & �Induction basis: � � � � � � � " & � � �Induction step: Assume that � � � � � � & �for all � � and � � � � � . Then� � � � � ��

����

����

������ �

� � � � � � �� ��� �

�� ������ � � �

�� � ���

�����

�� �

������ �

� � � � ��� �

�� ������ � � �

����

�����

�� � � � � �� �

� � � � � ����� � �

� � � ���� � � � � � �

��� �

�����

�� � � ��� � �

�� � � � � � � �� � �

�� �

13

Page 14: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

Proof that � � � � � � & �

� � � � � � � � � � ��� � � � �

where

� � � � � � � � � � � � � &%� � & � � � & � � ���� � � � � � &!� � & � � � � � "

����� � � � � � � "for � � � � � � ����� . Hence

� � � � � � � � � � � ����� � � � � � &!� � �for all � . Therefore

� � � � � � � � � � � � � & � ��

&%� �

14

Page 15: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

Running Time of Randomized-Select( � � � � � � � )We proved that � � � � � � &%� . Since � � � � � � � � � � ,we have in particular that

� � � � � � � � � ��� �

15

Page 16: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

Randomized-Quicksort Algorithm

We make use of the Randomized-Partition idea to de-velop a new version of quicksort.

Randomized-Quicksort(A, p, r){if (p < r) {q = Randomized-Partition(A, p, r);Randomized-Quicksort(A, p, q-1);Randomized-Quicksort(A, q+1, r);}}

Does it run faster than the original version of quick-sort?

16

Page 17: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

Running Time of the Randomized-Quicksort

Results :

Worst Case: � � � � � � � � � � .Average Case: � � � � � � � � ��� ��� .Clearly, the worst case is still � � � � � , what about theaverage case?

17

Page 18: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

Average running time of Randomized-Quicksort

Key observations:

� The running time of (randomized) quicksort is dom-inated by the time spent in (randomized) partition.In the partition procedure, the time is dominatedby the number of key comparisons.

� When a pivot is selected, the pivot is comparedwith every other elements, then the elements arepartitioned into two parts accordingly.

� Elements in different partition are NEVER com-pared with each other in all operations.

Tricks: We find the expected number of comparisonsfor all randomized-partition calls.

18

Page 19: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

Average running time of Randomized-Quicksort

Let � be the input array which is a permutation of the� distinct elements � � � � �� � � � .

Let�

be the total number of comparisons performedin ALL calls to randomized-partition. Let

���� be the

number of comparisons between � � and � � , observethat

���� can only be 0 or 1. Our goal is to compute

the expected value of�

, i.e.,

� � � � � � ����

� ��

� � � �� �

����

����

� ��

� � � �� � � � ���

����

� ��

� � � �� ��� � � � � is compared to � �

� � �� � � � � � is not compared to � �

� � " �

����

� ��

� � � �� � � � � � is compared to � �

19

Page 20: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

Average running time of Randomized-Quicksort

It remains to show how to find � � � � � is compared to � ��.

For � � � � � � � , let � ��� � � � �� � � � � �������� � �

(remember � � � � �

� ����� � � ).

Key observations:

� If � � or � � is selected as a pivot BEFORE any el-ements in

� � � �� � � � � � �������� � ���

���, � � and � � will

be compared.

� Conversely, if any element in � ��� other then � � or� � is selected as a pivot before � � and � � , � � and� � will be placed in DIFFERENT partitions, andhence they will NOT compare with each other inALL randomized-partition calls.

� ANY element other than the elements in � ��� hasno effect to � � � � � is compared to � �

�.

20

Page 21: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

Average running time of Randomized-Quicksort

It remains to find the probability that � � or � � is the firstpivot chosen from � ��� .

� � � � � is compared to � ��

� � � � � � or � � is the first pivot chosen from � ����

� � � � � � is the first pivot chosen from � ����

� � � � � � is the first pivot chosen from � ����

� �� � � � �

� �� � � � �

� �� � � � �

21

Page 22: New Lecture 4: The Linear Time Selectiondekai/271/notes/L04/L04.pdf · 2003. 9. 10. · Lecture 4: The Linear Time Selection Selection Problem Given a sequence of numbers , and an

Average running time of Randomized-Quicksort

Putting everything together, we have

� � � � ����

� ��

� � � �� �� � � � �

����

� ���� �

� �� �

� � �

���

� ��

� �� � �

����

� �� � � ��� � �

� � � � ��� ���

Hence, the expected number of comparisons is � � � ��� � � ,which is the average running time of Randomized-Quicksort.

22