Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

62
Quickselect under Yaroslavskiy’s Dual-Pivoting Algorithm Sebastian Wild Markus E. Nebel Hosam Mahmoud [email protected] Computer Science Department 28 May 2013 AofA 2013 Menorca Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 1 / 17

description

I gave this talk at the 24th International Meeting on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2013) on Menorca (Spain). A paper covering the analyses of this talk (and some more!) has been submitted. Also, in the talk, I refer to the previous speaker at the conference, my advisor Markus Nebel - corresponding results can be found in an earlier talk of mine: slideshare.net/sebawild/average-case-analysis-of-java-7s-dual-pivot-quicksort Check my website for preprints of papers and my other talks: wwwagak.cs.uni-kl.de/sebastian-wild.html

Transcript of Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Page 1: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Quickselect underYaroslavskiy’s Dual-Pivoting Algorithm

Sebastian Wild Markus E. Nebel Hosam [email protected]

Computer Science Department

28 May 2013AofA 2013Menorca

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 1 / 17

Page 2: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Selection

Consider the following problem

The Selection Problem

Given an unsorted array A of n data elements,find the m-th smallest element.

Applications

estimate distribution of data collection in statisticsworld rankings in sports

Algorithms

trivial solution: sort(A); return A[m]; O(n logn)linear worst case possible: Median-of-Medians (Blum et al.)but rather slow in practice (compared to sorting!)Hoare 1961: Quicksort idea usable for selection

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 2 / 17

Page 3: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

From Quicksort to Quickselect

Sort whole array by Quicksort

Markus Nebel just showed:Yaroslavskiy’s partitioning can speed up Quicksort

Does success of dual partitioning in sorting carry over to selection?

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17

Page 4: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

From Quicksort to Quickselect

Can prune many branches

Markus Nebel just showed:Yaroslavskiy’s partitioning can speed up Quicksort

Does success of dual partitioning in sorting carry over to selection?

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17

Page 5: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

From Quicksort to Quickselect

Can prune many branches Dual Pivot Quicksort

Markus Nebel just showed:Yaroslavskiy’s partitioning can speed up Quicksort

Does success of dual partitioning in sorting carry over to selection?

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17

Page 6: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

From Quicksort to Quickselect

Can prune many branches Prune even more branches

Markus Nebel just showed:Yaroslavskiy’s partitioning can speed up Quicksort

Does success of dual partitioning in sorting carry over to selection?

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17

Page 7: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

From Quicksort to Quickselect

Can prune many branches Prune even more branches

Markus Nebel just showed:Yaroslavskiy’s partitioning can speed up Quicksort

Does success of dual partitioning in sorting carry over to selection?

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17

Page 8: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Notation

Analyse (random) number of data comparisons C(m)n

to select from random permutation

P < Q: (random) ranks of the two pivots

subarray sizes determined by P and Q:

left P rightQmiddle

J1 := P − 1 J2 := Q− P − 1 J3 := n−Q

selection continues recursively in

left subarray, if m < P

middle subarray, if P < m < Q

right subarray, if Q < m

or terminates if m = P or m = Q

corresponding events

E1 := {m < P}

E2 := {P < m < Q}

E3 := {Q < m}

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 4 / 17

Page 9: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Notation

Analyse (random) number of data comparisons C(m)n

to select from random permutation

P < Q: (random) ranks of the two pivots

subarray sizes determined by P and Q:

left P rightQmiddle

J1 := P − 1 J2 := Q− P − 1 J3 := n−Q

selection continues recursively in

left subarray, if m < P

middle subarray, if P < m < Q

right subarray, if Q < m

or terminates if m = P or m = Q

corresponding events

E1 := {m < P}

E2 := {P < m < Q}

E3 := {Q < m}

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 4 / 17

Page 10: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Notation

Analyse (random) number of data comparisons C(m)n

to select from random permutation

P < Q: (random) ranks of the two pivots

subarray sizes determined by P and Q:

left P rightQmiddle

J1 := P − 1 J2 := Q− P − 1 J3 := n−Q

selection continues recursively in

left subarray, if m < P

middle subarray, if P < m < Q

right subarray, if Q < m

or terminates if m = P or m = Q

corresponding events

E1 := {m < P}

E2 := {P < m < Q}

E3 := {Q < m}

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 4 / 17

Page 11: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Quickselect Recurrence

Yaroslavskiy’s partitioning preserves randomness

can set up recurrence for the distribution of C(m)n :

C(m)n

D= Tn +

C(m)J1

if m lies in left subarray

C(m−P)J2

if m lies in middle subarray

C(m−Q)J3

if m lies in right subarray

= Tn + 1E1 · C(m)J1

+ 1E2 · C(m−P)J2

+ 1E3 · C(m−Q)J3

Tn is the (random) number of comparisons during partitioning

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 5 / 17

Page 12: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Getting Rid of m

Complication: second parameter m in recurrence

C(m)n

D= Tn + 1E1 · C

(m)J1

+ 1E2 · C(m−P)J2

+ 1E3 · C(m−Q)J3

consider simpler quantities

1 Random Rank: m =MnD= Uniform {1, . . . , n}

abbreviate Cn := C(Mn)n

← focus in talk

CnD= Tn+ 1E1

CJ1 + 1E2CJ2 + 1E3

CJ3

2 Extreme Rank: m = 1abbreviate Cn := C

(1)n

CnD= Tn + CP−1

explicit solution for expectations possible (generating functions)But: Requires tedious computations

More elegant shortcut to asymptotics of E[Cn] and E[Cn]?Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 6 / 17

Page 13: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Getting Rid of m

Complication: second parameter m in recurrence

C(m)n

D= Tn + 1E1 · C

(m)J1

+ 1E2 · C(m−P)J2

+ 1E3 · C(m−Q)J3

consider simpler quantities

1 Random Rank: m =MnD= Uniform {1, . . . , n}

abbreviate Cn := C(Mn)n

← focus in talk

CnD= Tn+ 1E1

CJ1 + 1E2CJ2 + 1E3

CJ3

2 Extreme Rank: m = 1abbreviate Cn := C

(1)n

CnD= Tn + CP−1

explicit solution for expectations possible (generating functions)But: Requires tedious computations

More elegant shortcut to asymptotics of E[Cn] and E[Cn]?Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 6 / 17

Page 14: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Getting Rid of m

Complication: second parameter m in recurrence

C(m)n

D= Tn + 1E1 · C

(m)J1

+ 1E2 · C(m−P)J2

+ 1E3 · C(m−Q)J3

consider simpler quantities

1 Random Rank: m =MnD= Uniform {1, . . . , n}

abbreviate Cn := C(Mn)n

← focus in talk

CnD= Tn+ 1E1

CJ1 + 1E2CJ2 + 1E3

CJ3

2 Extreme Rank: m = 1abbreviate Cn := C

(1)n

CnD= Tn + CP−1

explicit solution for expectations possible (generating functions)But: Requires tedious computations

More elegant shortcut to asymptotics of E[Cn] and E[Cn]?Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 6 / 17

Page 15: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Convergence

Idea: Computation easy if we can drop index n

i. e. when recurrence “in equilibrium”

For that, we need stochastic convergence of Cn

0 100 200 300 400 500 600

99.9% massn = 10

n = 20

n = 30

n = 40

Cn

But:E[Cn] grows with n

high mass region moves with n

}Cn does not converge

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 7 / 17

Page 16: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Convergence

Idea: Computation easy if we can drop index n

i. e. when recurrence “in equilibrium”

Try normalized variables C∗n :=

Cn

ninstead:

0 2 4 6 8 10 12 14 16 18

99.9% massn = 10

n = 20

n = 30

n = 40

Cn/n

E[C∗n] seems constant

range of high mass looks bounded

}C∗n might converge

Works because of property of Quickselect: E[Cn] = O(√

Var(Cn))

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 7 / 17

Page 17: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

The Contraction Method

Rewrite recurrence: (Write as sum)

CnD= Tn + 1E1CJ1 + 1E2CJ2 + 1E3CJ3

Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.

have to show: (all convergences in L1-norm)

1 T∗n → T∗

2 1Ei

Jin→ A∗i

. . . upcoming

3∑3i=1 E

[|A∗i |

]< 1

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17

Page 18: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

The Contraction Method

Rewrite recurrence: (Divide by n)

CnD= Tn +

3∑i=1

1EiCJi

Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.

have to show: (all convergences in L1-norm)

1 T∗n → T∗

2 1Ei

Jin→ A∗i

. . . upcoming

3∑3i=1 E

[|A∗i |

]< 1

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17

Page 19: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

The Contraction Method

Rewrite recurrence: (Rearrange)

Cn

n

D=

Tn

n+

3∑i=1

1EiCJin

· JiJi

Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.

have to show: (all convergences in L1-norm)

1 T∗n → T∗

2 1Ei

Jin→ A∗i

. . . upcoming

3∑3i=1 E

[|A∗i |

]< 1

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17

Page 20: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

The Contraction Method

Rewrite recurrence: (use normalized variables)

Cn

n

D=

Tn

n+

3∑i=1

1Ei

Jin·CJiJi

Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.

have to show: (all convergences in L1-norm)

1 T∗n → T∗

2 1Ei

Jin→ A∗i

. . . upcoming

3∑3i=1 E

[|A∗i |

]< 1

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17

Page 21: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

The Contraction Method

Recurrence for C∗n

C∗n

D= T∗n +

3∑i=1

1Ei

Jin· C∗

Jiwith T∗n :=

Tn

n

Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.

have to show: (all convergences in L1-norm)

1 T∗n → T∗

2 1Ei

Jin→ A∗i

. . . upcoming

3∑3i=1 E

[|A∗i |

]< 1

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17

Page 22: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

The Contraction Method

Recurrence for C∗n

C∗nD= T∗n +

3∑i=1

1Ei

Jin︸ ︷︷ ︸ · C∗Ji

↓ ↓

T∗ A∗i

Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.

have to show: (all convergences in L1-norm)

1 T∗n → T∗

2 1Ei

Jin→ A∗i

. . . upcoming

3∑3i=1 E

[|A∗i |

]< 1

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17

Page 23: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

The Contraction Method

Recurrence for C∗n

C∗nD= T∗n +

3∑i=1

1Ei

Jin︸ ︷︷ ︸ · C∗Ji

↓ ↓3∑i=1

E[|A∗i |

]< 1

T∗ A∗i

Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.

have to show: (all convergences in L1-norm)

1 T∗n → T∗

2 1Ei

Jin→ A∗i

. . . upcoming

3∑3i=1 E

[|A∗i |

]< 1

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17

Page 24: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

The Contraction Method

Recurrence for C∗n

C∗nD= T∗n +

3∑i=1

1Ei

Jin︸ ︷︷ ︸ · C∗Ji

↓ ↓ ↓ ↓3∑i=1

E[|A∗i |

]< 1

C∗D= T∗ +

3∑i=1

A∗i · C∗

Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.

have to show: (all convergences in L1-norm)

1 T∗n → T∗

2 1Ei

Jin→ A∗i

. . . upcoming

3∑3i=1 E

[|A∗i |

]< 1

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17

Page 25: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

The Contraction Method

“Equilibrium” equation for C∗

C∗D= T∗ +

3∑i=1

A∗i · C∗

Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.

have to show: (all convergences in L1-norm)

1 T∗n → T∗

2 1Ei

Jin→ A∗i

. . . upcoming

3∑3i=1 E

[|A∗i |

]< 1

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17

Page 26: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

The Contraction Method

“Equilibrium” equation for C∗

(Take expectation on both sides)

C∗D= T∗ +

3∑i=1

A∗i · C∗

Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.

have to show: (all convergences in L1-norm)

1 T∗n → T∗

2 1Ei

Jin→ A∗i

. . . upcoming

3∑3i=1 E

[|A∗i |

]< 1

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17

Page 27: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

The Contraction Method

“Equilibrium” equation for C∗

(and solve for E C∗)

E C∗ = E T∗+3∑i=1

EA∗i · E C∗

Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.

have to show: (all convergences in L1-norm)

1 T∗n → T∗

2 1Ei

Jin→ A∗i

. . . upcoming

3∑3i=1 E

[|A∗i |

]< 1

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17

Page 28: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

The Contraction Method

“Equilibrium” equation for C∗

(and solve for E C∗)

E C∗ =E T∗

1−∑3i=1 EA∗i

Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.

have to show: (all convergences in L1-norm)

1 T∗n → T∗

2 1Ei

Jin→ A∗i

. . . upcoming

3∑3i=1 E

[|A∗i |

]< 1

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17

Page 29: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

The Contraction Method

“Equilibrium” equation for C∗

(and solve for E C∗)

E C∗ =E T∗

1−∑3i=1 EA∗i

Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.

have to show: (all convergences in L1-norm)

1 T∗n → T∗

2 1Ei

Jin→ A∗i . . . upcoming

3∑3i=1 E

[|A∗i |

]< 1

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17

Page 30: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Probabilistic Model

Need “convenient” probabilistic model:same probability space for finite n and limit case

Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1 first two = pivots values

Our equivalent model:

1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)

2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)

3 Continue recursively

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17

Page 31: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Probabilistic Model

Need “convenient” probabilistic model:same probability space for finite n and limit case

Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1 first two = pivots values

Our equivalent model:

1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)

2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)

3 Continue recursively

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17

Page 32: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Probabilistic Model

Need “convenient” probabilistic model:same probability space for finite n and limit case

Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values

Our equivalent model:

1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)

2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)

3 Continue recursively

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17

Page 33: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Probabilistic Model

Need “convenient” probabilistic model:same probability space for finite n and limit case

Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values

Our equivalent model:

1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)

2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)

3 Continue recursively

0 1

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17

Page 34: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Probabilistic Model

Need “convenient” probabilistic model:same probability space for finite n and limit case

Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values

Our equivalent model:

1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)

2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)

3 Continue recursively

0 1

S1 S2 S3

induces pivot values: U1 = S1, U2 = S1 + S20

0.51 0 0.5 1

0

0.5

1

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17

Page 35: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Probabilistic Model

Need “convenient” probabilistic model:same probability space for finite n and limit case

Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values

Our equivalent model:

1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)

2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)

3 Continue recursively

0 1

S1 S2 S3

U1 U2

induces pivot values: U1 = S1, U2 = S1 + S20

0.51 0 0.5 1

0

0.5

1

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17

Page 36: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Probabilistic Model

Need “convenient” probabilistic model:same probability space for finite n and limit case

Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values

Our equivalent model:

1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)

2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)

3 Continue recursively

0 1

S1 S2 S3

U1 U2J1 J2 J3

S1 = Pr(small) , S2 = Pr(medium) , S3 = Pr(large)

n− 2 elements to draw conditional on S

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17

Page 37: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Probabilistic Model

Need “convenient” probabilistic model:same probability space for finite n and limit case

Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values

Our equivalent model:

1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)

2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)

3 Continue recursively

0 1

S1 S2 S3

U1 U2J1 J2 J3

S1 = Pr(small) , S2 = Pr(medium) , S3 = Pr(large)

n− 2 elements to draw conditional on S

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17

Page 38: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Probabilistic Model

Need “convenient” probabilistic model:same probability space for finite n and limit case

Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values

Our equivalent model:

1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)

2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)

3 Continue recursively

0 1

S1 S2 S3

U1 U2J1 J2 J3

S1 = Pr(small) , S2 = Pr(medium) , S3 = Pr(large)

n− 2 elements to draw conditional on S

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17

Page 39: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Probabilistic Model

Need “convenient” probabilistic model:same probability space for finite n and limit case

Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values

Our equivalent model:

1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)

2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)

3 Continue recursively

0 1

S1 S2 S3

U1 U2J1 J2 J3

S1 = Pr(small) , S2 = Pr(medium) , S3 = Pr(large)

n− 2 elements to draw conditional on S

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17

Page 40: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Probabilistic Model

Need “convenient” probabilistic model:same probability space for finite n and limit case

Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values

Our equivalent model:

1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)

2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)

3 Continue recursively

0 1U1U1 U2 U2

Continue of sub-interval with smaller n

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17

Page 41: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Probabilistic Model

Need “convenient” probabilistic model:same probability space for finite n and limit case

Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values

Our equivalent model:

1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)

2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)

3 Continue recursively

0 1U1U1 U2 U2

Continue of sub-interval with smaller n

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17

Page 42: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Distribution of Tn — Repetition1 T∗n → T∗

Invariant:< p `

→> qg

←p 6 ◦ 6 q k

k’s range K G

Tn = number of comparisons in first partitioning step

∼ n one for every element

+ J2 second for all medium elements

+ l@K second for large elements at C ′k

+ s@ G second for small elements at C ′g

Markus Nebel just showed:

l@KD= Hypergeometric(n− 2; J3 , J1 + J2 )

}conditional on J

s@ GD= Hypergeometric(n− 2; J1 , J3 )

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 10 / 17

Page 43: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Distribution of Tn — Repetition1 T∗n → T∗

Invariant:< p `

→> qg

←p 6 ◦ 6 q k

k’s range K G

Tn = number of comparisons in first partitioning step

∼ n one for every element

+ J2 second for all medium elements

+ l@K second for large elements at C ′k

+ s@ G second for small elements at C ′g

Markus Nebel just showed:

l@KD= Hypergeometric(n− 2; J3 , J1 + J2 )

}conditional on J

s@ GD= Hypergeometric(n− 2; J1 , J3 )

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 10 / 17

Page 44: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Distribution of Tn — Repetition1 T∗n → T∗

Invariant:< p `

→> qg

←p 6 ◦ 6 q k

k’s range K G

Tn = number of comparisons in first partitioning step

∼ n one for every element

+ J2 second for all medium elements

+ l@K second for large elements at C ′k

+ s@ G second for small elements at C ′g

Markus Nebel just showed:

l@KD= Hypergeometric(n− 2; J3 , J1 + J2 )

}conditional on J

s@ GD= Hypergeometric(n− 2; J1 , J3 )

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 10 / 17

Page 45: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Convergence of T ∗n1 T∗n → T∗

T∗n =Tn

n= 1+

J2

n+l@K

n+s@ G

n

J2

n→ S2 by law of large numbers, since E[Ji | S] = n · Si

Multinomial and Hypergeometric concentrated around mean

l@K

n→ E

[l@K

n

∣∣∣∣ S]

by Chernoff bound arguments

= E[E[l@K | J

]n

∣∣∣∣ S]

= E[J3 ( J1 + J2 )

n(n− 2)

∣∣∣∣ S]

∼ S3 · (S1 + S2)s@ G

n→ S1 · S3 similar

T∗n → T∗ = 1+ S2 + S3(2S1 + S2)

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 11 / 17

Page 46: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Convergence of T ∗n1 T∗n → T∗

T∗n =Tn

n= 1+

J2

n+l@K

n+s@ G

n

J2

n→ S2 by law of large numbers, since E[Ji | S] = n · Si

Multinomial and Hypergeometric concentrated around mean

l@K

n→ E

[l@K

n

∣∣∣∣ S]

by Chernoff bound arguments

= E[E[l@K | J

]n

∣∣∣∣ S]

= E[J3 ( J1 + J2 )

n(n− 2)

∣∣∣∣ S]

∼ S3 · (S1 + S2)s@ G

n→ S1 · S3 similar

T∗n → T∗ = 1+ S2 + S3(2S1 + S2)

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 11 / 17

Page 47: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Convergence of T ∗n1 T∗n → T∗

T∗n =Tn

n= 1+

J2

n+l@K

n+s@ G

n

J2

n→ S2 by law of large numbers, since E[Ji | S] = n · Si

Multinomial and Hypergeometric concentrated around mean

l@K

n→ E

[l@K

n

∣∣∣∣ S]

by Chernoff bound arguments

= E[E[l@K | J

]n

∣∣∣∣ S]

= E[J3 ( J1 + J2 )

n(n− 2)

∣∣∣∣ S]

∼ S3 · (S1 + S2)s@ G

n→ S1 · S3 similar

T∗n → T∗ = 1+ S2 + S3(2S1 + S2)

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 11 / 17

Page 48: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Convergence of Coefficients & Contraction Condition

2 1EiJin → A∗i

standard computation:

1E1

J1

n→ 1F1 · S1 with F1 = {V < S1}

i = 2, 3 similar

3∑3i=1 E

[|A∗i |

]< 1

E[1F1S1

]= 16

by symmetry:3∑i=1

E[|A∗i |

]= 12 < 1

We (finally) proved convergence C∗n→ C∗.

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 12 / 17

Page 49: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Convergence of Coefficients & Contraction Condition

2 1EiJin → A∗i

standard computation:

1E1

J1

n→ 1F1 · S1 with F1 = {V < S1}

i = 2, 3 similar

3∑3i=1 E

[|A∗i |

]< 1

E[1F1S1

]= 16

by symmetry:3∑i=1

E[|A∗i |

]= 12 < 1

We (finally) proved convergence C∗n→ C∗.

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 12 / 17

Page 50: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Convergence of Coefficients & Contraction Condition

2 1EiJin → A∗i

standard computation:

1E1

J1

n→ 1F1 · S1 with F1 = {V < S1}

i = 2, 3 similar

3∑3i=1 E

[|A∗i |

]< 1

E[1F1S1

]= 16

by symmetry:3∑i=1

E[|A∗i |

]= 12 < 1

We (finally) proved convergence C∗n→ C∗.

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 12 / 17

Page 51: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Results — Expectations

Computing expectations and plugging in . . .

E T∗ = 1912

Recall:

E C∗ =ET∗

1−∑3

i=1 EA∗i

∑3i=1 EA∗i =

12

} E C∗ = 19

6 = 3.16

L1 convergence E Cn ∼ 3.16n

similarly: E Cn ∼ 2.375n

Comparisons Swaps

0

1

2

3

+5.55%

+100%

random rank

Classic Quickselect

Yaroslavskiy Select

Comparisons Swaps

0

1

2

+18.8%

+125%

extreme rank

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 13 / 17

Page 52: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Results — Expectations

Computing expectations and plugging in . . .

E T∗ = 1912

Recall:

E C∗ =ET∗

1−∑3

i=1 EA∗i

∑3i=1 EA∗i =

12

} E C∗ = 19

6 = 3.16

L1 convergence E Cn ∼ 3.16n

similarly: E Cn ∼ 2.375n

Comparisons Swaps

0

1

2

3

+5.55%

+100%

random rank

Classic Quickselect

Yaroslavskiy Select

Comparisons Swaps

0

1

2

+18.8%

+125%

extreme rank

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 13 / 17

Page 53: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Results — Expectations

Computing expectations and plugging in . . .

E T∗ = 1912

Recall:

E C∗ =ET∗

1−∑3

i=1 EA∗i

∑3i=1 EA∗i =

12

} E C∗ = 19

6 = 3.16

L1 convergence E Cn ∼ 3.16n

similarly: E Cn ∼ 2.375n

Comparisons Swaps

0

1

2

3

+5.55%

+100%

random rank

Classic Quickselect

Yaroslavskiy Select

Comparisons Swaps

0

1

2

+18.8%

+125%

extreme rank

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 13 / 17

Page 54: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Results — Expectations

Computing expectations and plugging in . . .

E T∗ = 1912

Recall:

E C∗ =ET∗

1−∑3

i=1 EA∗i

∑3i=1 EA∗i =

12

} E C∗ = 19

6 = 3.16

L1 convergence E Cn ∼ 3.16n

similarly: E Cn ∼ 2.375n

Comparisons Swaps

0

1

2

3

+5.55%

+100%

random rank

Classic Quickselect

Yaroslavskiy Select

Comparisons Swaps

0

1

2

+18.8%

+125%

extreme rank

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 13 / 17

Page 55: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Pivot Sampling

successful optimization of Quicksort/-select: median of threeHere: generalized pivot selection scheme

1 Choose random sample of k elements2 Sort the sample3 Select pivots s. t. in sorted sample

t1 smaller ,

t2 between and

t3 larger than pivots

Example withk = 8 and t = (3, 2, 1):

U1 U2

t1 t2 t3

Our stochastic model nicely generalizes to pivot sampling:

only the distribution of spacings S changes!

S D= Dirichlet(t1 + 1, t2 + 1, t3 + 1) density: f(s) =

st11 s

t22 s

t33

B

contraction conditions remain valid

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 14 / 17

Page 56: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Pivot Sampling

successful optimization of Quicksort/-select: median of threeHere: generalized pivot selection scheme

1 Choose random sample of k elements2 Sort the sample3 Select pivots s. t. in sorted sample

t1 smaller ,

t2 between and

t3 larger than pivots

Example withk = 8 and t = (3, 2, 1):

U1 U2

t1 t2 t3

Our stochastic model nicely generalizes to pivot sampling:

only the distribution of spacings S changes!

S D= Dirichlet(t1 + 1, t2 + 1, t3 + 1) density: f(s) =

st11 s

t22 s

t33

B

contraction conditions remain valid

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 14 / 17

Page 57: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Results — Pivot Sampling

General result: Cn ∼1+ t2+1

k+1 +(2(t1+1)+(t2+1))(t3+1)

(k+1)2

1 −∑3i=1

(ti+1)2

(k+1)2

· n

inconvenient for interpretation . . .

Limiting case for large sample:ti

k→ αi as k→∞

M= take exact quantiles as pivots

Cn ∼1+ α2 + (2α1 + α2)α3

1 −∑3i=1α

2i

· n

0 0.2 0.4 0.6 0.8 1

0

0.2

0.4

0.6

0.8

1

α1

α2

optimal choice is not symmetric

classic Quickselect with median of 2t+ 1:2.5n for t = 1 and 2.3n for t = 2

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 15 / 17

Page 58: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Results — Pivot Sampling

General result: Cn ∼1+ t2+1

k+1 +(2(t1+1)+(t2+1))(t3+1)

(k+1)2

1 −∑3i=1

(ti+1)2

(k+1)2

· n

inconvenient for interpretation . . .

Limiting case for large sample:ti

k→ αi as k→∞

M= take exact quantiles as pivots

Cn ∼1+ α2 + (2α1 + α2)α3

1 −∑3i=1α

2i

· n

0 0.2 0.4 0.6 0.8 1

0

0.2

0.4

0.6

0.8

1

α1

α2

optimal choice is not symmetric

classic Quickselect with median of 2t+ 1:2.5n for t = 1 and 2.3n for t = 2

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 15 / 17

Page 59: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Results — Pivot Sampling

General result: Cn ∼1+ t2+1

k+1 +(2(t1+1)+(t2+1))(t3+1)

(k+1)2

1 −∑3i=1

(ti+1)2

(k+1)2

· n

inconvenient for interpretation . . .

Limiting case for large sample:ti

k→ αi as k→∞

M= take exact quantiles as pivots

Cn ∼1+ α2 + (2α1 + α2)α3

1 −∑3i=1α

2i

· n

0 0.2 0.4 0.6 0.8 1

0

0.2

0.4

0.6

0.8

1

α1

α2

optimal choice is not symmetric

classic Quickselect with median of 2t+ 1:2.5n for t = 1 and 2.3n for t = 2

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 15 / 17

Page 60: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Results — Pivot Sampling

General result: Cn ∼1+ t2+1

k+1 +(2(t1+1)+(t2+1))(t3+1)

(k+1)2

1 −∑3i=1

(ti+1)2

(k+1)2

· n

inconvenient for interpretation . . .

Limiting case for large sample:ti

k→ αi as k→∞

M= take exact quantiles as pivots

Cn ∼1+ α2 + (2α1 + α2)α3

1 −∑3i=1α

2i

· n

0 0.2 0.4 0.6 0.8 1

0

0.2

0.4

0.6

0.8

1

2.4654

2.5

α1

α2

optimal choice is not symmetric

classic Quickselect with median of 2t+ 1:2.5n for t = 1 and 2.3n for t = 2

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 15 / 17

Page 61: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

Conclusion

Practitioners’ Lesson:Yaroslavskiy’s dual-pivoting not helpful for selection

Analyzers’ Lesson:expected costs of Quickselect derivable by contraction methodpivot sampling included naturally

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 16 / 17

Page 62: Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

References & Further Reading

M. Blum, R. W. Floyd, V. Pratt, R. L. Rivest and R. E. Tarjan.Time bounds for selection.J. of Comp. and System Sc., 7(4):448–461, 1973.

C. A. R. Hoare.Algorithm 65: Find.Communications of the ACM, 4(7):321–322, July 1961.

H.-K. Hwang and T.-H. Tsai.Quickselect and Dickman function.Combinatorics, Probability and Computing, 11, 2000.

Hosam M. Mahmoud.Sorting: A distribution theory.John Wiley & Sons, Hoboken, NJ, USA, 2000.

H. M. Mahmoud, R. Modarres and R. T. Smythe.Analysis of quickselect : an algorithm for order statistics.Informatique théorique et appl., 29(4):255–276, 1995.

C. Martínez, D. Panario and A. Viola.Adaptive sampling strategies for quickselects.ACM Transactions on Algorithms, 6(3):1–45, June 2010.

C. Martínez and S. Roura.Optimal Sampling Strategies in Quicksort and Quickselect.SIAM Journal on Computing, 31(3):683, 2001.

R. Neininger.On a multivariate contraction method for randomrecursive structures with applications to Quicksort.Random Structures & Alg., 19(3-4):498–524, 2001.

R. Neininger and L. Rüschendorf.A General Limit Theorem for Recursive Algorithms andCombinatorial Structures.The Annals of Applied Probability, 14(1):378–418, 2004.

A. Panholzer and H. Prodinger.A generating functions approach for the analysis ofgrand averages for multiple QUICKSELECT.Random Structures & Alg., 13(3-4):189–209, 1998.

U. Rösler and L. Rüschendorf.The contraction method for recursive algorithms.Algorithmica, 29(1):3–33, 2001.

S. Wild.Java 7’s Dual Pivot Quicksort.Master thesis, University of Kaiserslautern, 2012.

S. Wild, M. E. Nebel and H. Mahmoud.Analysis of Quickselect underYaroslavskiy’s Dual-Pivoting Algorithm.Algorithmica, in preparation.

S. Wild, M. E. Nebel and R. Neininger.Average Case and Distributional Analysis ofJava 7’s Dual Pivot Quicksort.ACM Transactions on Algorithms, submitted.

V. Yaroslavskiy.Dual-Pivot Quicksort.2009.

Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 17 / 17