Engineering Java 7’s Dual Pivot Quicksort Using MaLiJAn
Sebastian Wild, Markus E. Nebel, Raphael Reitzig, Ulrich Laube
[wild, nebel, r_reitzi, laube]@cs.uni-kl.de
Computer Science Department, University of Kaiserslautern
January 7, 2013, Meeting on Algorithm Engineering & Experiments 2013
Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 1 / 23
Background
Since Java 7: new dual pivot Quicksort in the JRE library
  Basic algorithm by Vladimir Yaroslavskiy
  Optimizations by Jon Bentley, Joshua Bloch and others (see java.core-libs.devel mailing list)
  Motivated by experience with classic Quicksort
  Validated by running time benchmarks
In this talk:
  Can we exploit special properties of dual pivot Quicksort?
  Can we get more insight than running time measurements?
. . . stay tuned
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
Invariant during partitioning (ℓ and k move right, g moves left):
  < p | ℓ→ | p ≤ ◦ ≤ q | k→ | ? | ←g | > q
Step by step on the input 3 5 1 8 4 7 2 9 6:
  3 5 1 8 4 7 2 9 6   Select two elements as pivots (p = 3, q = 6). Only the value relative to the pivots counts.
  3 5 1 8 4 7 2 9 6   A[k] (5) is medium ⇒ go on.
  3 5 1 8 4 7 2 9 6   A[k] (1) is small ⇒ swap to left: swap small element to left end.
  3 1 5 8 4 7 2 9 6   A[k] (8) is large ⇒ find swap partner: g skips over large elements. Swap.
  3 1 5 2 4 7 8 9 6   A[k] (2) is old A[g], small ⇒ swap to left.
  3 1 2 5 4 7 8 9 6   A[k] (4) is medium ⇒ go on.
  3 1 2 5 4 7 8 9 6   A[k] (7) is large ⇒ find swap partner: g skips over large elements.
  3 1 2 5 4 7 8 9 6   g and k have crossed! Swap pivots in place.
  2 1 3 5 4 6 8 9 7   Partitioning done! Recursively sort three sublists.
  1 2 3 4 5 6 7 8 9   Done.
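The walkthrough above can be sketched in Java as follows. This is a minimal illustration, not the JRE code: it assumes the pivots are simply the two outermost elements and leaves out the pivot sampling, the insertion sort cutoff and the equal-key handling of the library implementation. The class name DualPivotSketch and the variable names l, k, g are chosen here to mirror the slide.

```java
import java.util.Arrays;

/** Sketch of a Yaroslavskiy-style dual pivot Quicksort; pivots are the two outermost elements (assumption). */
final class DualPivotSketch {

    static void sort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    private static void sort(int[] a, int left, int right) {
        if (right <= left) return;
        if (a[left] > a[right]) swap(a, left, right);   // ensure p <= q
        int p = a[left], q = a[right];
        int l = left + 1;   // a[left+1 .. l-1]  holds elements < p
        int g = right - 1;  // a[g+1 .. right-1] holds elements > q
        int k = l;          // a[l .. k-1]       holds elements with p <= x <= q
        while (k <= g) {
            int t = a[k];
            if (t < p) {                        // small: swap to the left end
                a[k] = a[l]; a[l] = t; l++;
            } else if (t > q) {                 // large: find a swap partner from the right
                while (a[g] > q && k < g) g--;  // g skips over large elements
                if (a[g] < p) {                 // partner is small: move it to the left end
                    a[k] = a[l]; a[l] = a[g]; l++;
                } else {                        // partner is medium (or g has reached k)
                    a[k] = a[g];
                }
                a[g] = t; g--;
            }
            k++;
        }
        swap(a, left, l - 1);    // swap pivots in place
        swap(a, right, g + 1);
        sort(a, left, l - 2);    // recursively sort three sublists
        sort(a, l, g);
        sort(a, g + 2, right);
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    public static void main(String[] args) {
        int[] a = {3, 5, 1, 8, 4, 7, 2, 9, 6};
        sort(a);
        System.out.println(Arrays.toString(a)); // prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```

On the example input, the first partitioning step of this sketch produces 2 1 3 5 4 6 8 9 7, the same intermediate array as on the slide.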
Control Flow Graph of Partitioning Loop
Basic blocks (bc = number of Bytecode instructions in the block):
  Block  bc  Content
  1       3  k ≤ g ?
  2       7  t := A[k]; t < p ?
  3      12  A[k] := A[ℓ]; A[ℓ] := t; ℓ := ℓ + 1
  4       3  t > q ?
  5       5  A[g] > q ?
  6       3  k < g ?
  7       2  g := g − 1
  8       5  A[g] < p ?
  9      14  A[k] := A[ℓ]; A[ℓ] := A[g]; ℓ := ℓ + 1
  10      6  A[k] := A[g]
  11      5  A[g] := t; g := g − 1
  12      2  k := k + 1
[Figure: control flow graph over blocks 1 to 12; each of the conditional blocks 1, 2, 4, 5, 6 and 8 has one yes edge and one no edge]
Cycles of the partitioning loop (highlighted one by one in the control flow graph):
  Cycle  A[k]    A[g]    ∆(g − k)  Bytecode instructions
  1      small   n/a     1         24
  2      medium  n/a     1         15
  3      large   large   1         10
  4      large   small   2         44
  5      large   medium  2         36
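The bytecode totals of the five cycles can be checked against the per-block counts. The sketch below sums the bc values along one block sequence per cycle. The block sequences are an assumption reconstructed here from the block contents (the highlighted paths of the slides are not part of the transcript); they do reproduce the stated totals 24, 15, 10, 44 and 36. Dividing by ∆(g − k) is one possible normalization, also an assumption, under which Cycle 3 is the cheapest and Cycle 1 the most expensive cycle.

```java
/** Sums per-block bytecode counts along the cycles of the partitioning loop. */
final class CycleCosts {
    // Bytecode count of basic blocks 1..12 as listed on the slide (index 0 unused).
    static final int[] BC = {0, 3, 7, 12, 3, 5, 3, 2, 5, 14, 6, 5, 2};

    // Assumed block sequence of each cycle (reconstructed, not taken from the slides).
    static final int[][] CYCLE_BLOCKS = {
        {1, 2, 3, 12},                // Cycle 1: A[k] small
        {1, 2, 4, 12},                // Cycle 2: A[k] medium
        {5, 6, 7},                    // Cycle 3: A[k] large, A[g] large (inner loop)
        {1, 2, 4, 5, 8, 9, 11, 12},   // Cycle 4: A[k] large, A[g] small
        {1, 2, 4, 5, 8, 10, 11, 12},  // Cycle 5: A[k] large, A[g] medium
    };

    // Decrease of g - k per cycle, as given on the slides.
    static final int[] DELTA = {1, 1, 1, 2, 2};

    /** Total bytecodes of one pass through the given cycle (1-based cycle number). */
    static int cost(int cycle) {
        int sum = 0;
        for (int block : CYCLE_BLOCKS[cycle - 1]) sum += BC[block];
        return sum;
    }

    /** Bytecodes per unit of progress in g - k (hypothetical normalization). */
    static double costPerStep(int cycle) {
        return (double) cost(cycle) / DELTA[cycle - 1];
    }

    public static void main(String[] args) {
        for (int c = 1; c <= 5; c++) {
            System.out.println("Cycle " + c + ": " + cost(c) + " bytecodes, "
                    + costPerStep(c) + " per step");
        }
    }
}
```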
Asymmetry
Algorithm is asymmetric:
  Cycles have different cost ⇒ would rather execute cheap ones often
  Cycles are chosen by the classes small, medium or large
  Probability for the classes depends on the pivot values
  ⇒ Maybe we can “influence pivot values accordingly”?
Pivot Sampling
Well-known optimization for classic Quicksort: median-of-three ⇒ pivot closer to median of whole list
In JRE7 Quicksort implementation: natural extension for 2 pivots:
  tertiles-of-five ⇒ pivots closer to tertiles of whole list
9 other possibilities to pick p and q out of 5 elements
[Figure: diagrams of the pivot selection schemes on a sample of 5 elements]
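Picking two pivots from a sample of five can be sketched as below. The method takes the ranks (i, j) of the two pivots within the sorted sample, counted from 1, so tertiles-of-five corresponds to ranks (2, 4). The sample positions (five equidistant elements) and all names are assumptions made for this illustration, not the positions used in the JRE, and the sketch does not assume any particular meaning for the label JRE7(1,3) used elsewhere in the talk.

```java
import java.util.Arrays;

/** Chooses two pivots as order statistics of a sample of five elements. */
final class PivotSampling {

    /**
     * Returns {p, q}: the i-th and j-th smallest (1-based, i < j) of five
     * equidistant sample elements of a[left..right]. Requires at least 5 elements.
     */
    static int[] choosePivots(int[] a, int left, int right, int i, int j) {
        int n = right - left + 1;
        if (n < 5 || i < 1 || j > 5 || i >= j) throw new IllegalArgumentException();
        int step = (n - 1) / 4;               // hypothetical choice of sample positions
        int[] sample = new int[5];
        for (int s = 0; s < 5; s++) sample[s] = a[left + s * step];
        Arrays.sort(sample);
        return new int[] {sample[i - 1], sample[j - 1]};
    }

    /** Number of ways to pick two ranks i < j out of 5. */
    static int countSchemes() {
        int count = 0;
        for (int i = 1; i <= 5; i++)
            for (int j = i + 1; j <= 5; j++) count++;
        return count;
    }

    public static void main(String[] args) {
        int[] a = {9, 1, 8, 2, 7, 3, 6, 4, 5};
        System.out.println(Arrays.toString(choosePivots(a, 0, a.length - 1, 2, 4)));
        System.out.println(countSchemes() + " schemes in total");
    }
}
```

There are 10 rank pairs in total, which matches the slide: tertiles-of-five plus 9 other possibilities.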
Optimizing Pivot Sampling
Which are “good” pivot selection schemes? Is the symmetric choice best possible?
Need an objective function to optimize
Typical approaches to judge efficiency:
  (A) Count the number of basic operations. (Here: number of executed Java Bytecode instructions.)
  (B) Measure total running time.
Relative performance of pivot sampling schemes compared to tertiles-of-five (JRE7):
  Pivot Selection Scheme   (A) bytecodes [1]     (B) running time [2]
  [scheme diagram]         +5.14%                +0.80%
  JRE7(1,3)                −1.85%                −0.44%
  [scheme diagram]         +3.34%                −0.42%
  [scheme diagram]         n/a (stack overflow!) +10.6%
  [scheme diagram]         +2.48%                +2.73%
  [scheme diagram]         +11.3%                +3.31%
  [scheme diagram]         +12.7%                +3.29%
  [scheme diagram]         +16.4%                +2.48%
  [scheme diagram]         +39.0%                +5.87%
[1] Average number of executed bytecodes on almost sorted lists of length 10^5.
[2] Average running time on random permutations of length 10^6.
Methods
Model and Method
What made JRE7(1,3) faster than JRE7?
. . . hard to tell from total time/bytecodes.
Need a more detailed model of the program.
Idea: Decompose along the control flow graph!
  View the program as a Markov chain over basic blocks
  Termination via an absorbing state
  Transition i → j has probability $p^{(n)}_{i \to j}$, depending on input size n
  Visiting block i incurs constant cost c(i)
  Total cost is the sum of block costs
Expected cost of the program = expected cost of a run of the Markov chain
The latter is easy to compute
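One standard way to compute the expected cost of a run of an absorbing Markov chain is to solve the linear system x = c + Q x, where Q holds the transition probabilities between the non-absorbing blocks and x(i) is the expected total cost when starting in block i. The sketch below does this with Gauss-Jordan elimination. It is an illustration of the model on this slide, not MaLiJAn's implementation; the class and method names are hypothetical.

```java
/** Expected total cost of a run of an absorbing Markov chain with per-visit block costs. */
final class MarkovCost {

    /**
     * q[i][j]: probability of moving from block i to block j (non-absorbing blocks only);
     * the remaining probability 1 - sum_j q[i][j] leads to the absorbing state.
     * c[i]: cost of one visit of block i.
     * Returns x with x[i] = expected total cost of a run started in block i.
     */
    static double[] expectedCosts(double[][] q, double[] c) {
        int n = c.length;
        double[][] m = new double[n][n + 1];            // augmented matrix [I - Q | c]
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) m[i][j] = (i == j ? 1.0 : 0.0) - q[i][j];
            m[i][n] = c[i];
        }
        for (int col = 0; col < n; col++) {             // Gauss-Jordan with partial pivoting
            int piv = col;
            for (int r = col + 1; r < n; r++)
                if (Math.abs(m[r][col]) > Math.abs(m[piv][col])) piv = r;
            double[] tmp = m[col]; m[col] = m[piv]; m[piv] = tmp;
            if (Math.abs(m[col][col]) < 1e-12)
                throw new ArithmeticException("singular system: chain does not terminate");
            for (int r = 0; r < n; r++) {
                if (r == col) continue;
                double f = m[r][col] / m[col][col];
                for (int j = col; j <= n; j++) m[r][j] -= f * m[col][j];
            }
        }
        double[] x = new double[n];
        for (int i = 0; i < n; i++) x[i] = m[i][n] / m[i][i];
        return x;
    }

    public static void main(String[] args) {
        // Toy loop: block 0 (cost 3) continues to block 1 with probability 0.75, else terminates;
        // block 1 (cost 2) always returns to block 0.
        double[][] q = {{0.0, 0.75}, {1.0, 0.0}};
        double[] c = {3.0, 2.0};
        System.out.println(expectedCosts(q, c)[0]);
    }
}
```

In the toy loop, block 0 is visited 4 times and block 1 is visited 3 times in expectation, so the expected cost from block 0 is 4 · 3 + 3 · 2 = 18.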
Maximum Likelihood Analysis
How to determine block costs and transition probabilities?
Transition Probabilities
  Count transitions in executions on sample data ⇒ allows arbitrary input distributions!
  Take the relative frequency as estimate for $p^{(n)}_{i \to j}$
  Extrapolate $p^{(n)}_{i \to j}$ to a function $p_{i \to j}(n)$ in n
Block Costs
  We consider two cost measures:
  (A) bc(i) = number of Bytecode instructions in block i
  (B) t(i) = running time of block i
All steps are automated in our tool MaLiJAn [3]
[3] http://wwwagak.cs.uni-kl.de/malijan.html
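The counting step can be sketched as follows: given a trace of visited basic blocks from one execution, the relative frequency of each observed transition serves as the estimate of its probability. The trace format (an int array of 0-based block ids) and the names are assumptions for this illustration. The extrapolation to a function in n is not covered here.

```java
/** Estimates Markov chain transition probabilities from an observed block trace. */
final class TransitionEstimator {

    /**
     * trace: sequence of visited block ids in 0..numBlocks-1 (hypothetical format).
     * Returns p with p[i][j] = (number of transitions i -> j) / (number of transitions leaving i);
     * rows of blocks that are never left stay all zero.
     */
    static double[][] estimate(int[] trace, int numBlocks) {
        long[][] counts = new long[numBlocks][numBlocks];
        for (int s = 0; s + 1 < trace.length; s++) {
            counts[trace[s]][trace[s + 1]]++;
        }
        double[][] p = new double[numBlocks][numBlocks];
        for (int i = 0; i < numBlocks; i++) {
            long rowSum = 0;
            for (long cnt : counts[i]) rowSum += cnt;
            if (rowSum == 0) continue;
            for (int j = 0; j < numBlocks; j++) p[i][j] = (double) counts[i][j] / rowSum;
        }
        return p;
    }

    public static void main(String[] args) {
        double[][] p = estimate(new int[] {0, 1, 0, 1, 0, 2}, 3);
        System.out.println("p(0 -> 1) = " + p[0][1] + ", p(0 -> 2) = " + p[0][2]);
    }
}
```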
Block Sampling
Running times t(i) in (B) are typically a few nanoseconds ⇒ direct measurement not possible.
Idea: Sampling Based Approach
[Figure: timeline of executed basic blocks (each a few ns long) with sampling points at regular intervals (µs apart)]
  At regular intervals, store the current basic block (concurrently)
  We observe only ≈ 1‰ of all blocks ⇒ repeat execution
  Relative frequencies of observed samples approach the relative running time contribution of blocks.
  Count in a separate run how often block i gets executed in total
  Together, this allows us to compute t(i)
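The last step can be written out as a small computation. Assuming the total running time T of the sampled execution is known, the share of samples that hit block i estimates the fraction of T spent in block i, and dividing by the number of executions of block i gives t(i). The formula and all names below are a reading of the slide, not MaLiJAn's code.

```java
/** Derives per-block running times from sample counts and execution counts. */
final class BlockSampling {

    /**
     * totalTimeNs: total running time of the sampled execution (assumed to be known).
     * samples[i]:  how often block i was observed by the sampler.
     * executions[i]: how often block i was executed in total (counted in a separate run).
     * Returns t with t[i] = totalTimeNs * samples[i] / totalSamples / executions[i].
     */
    static double[] blockTimes(double totalTimeNs, long[] samples, long[] executions) {
        long totalSamples = 0;
        for (long s : samples) totalSamples += s;
        double[] t = new double[samples.length];
        for (int i = 0; i < t.length; i++) {
            if (totalSamples == 0 || executions[i] == 0) continue;  // no data for this block
            double share = (double) samples[i] / totalSamples;       // relative time contribution
            t[i] = share * totalTimeNs / executions[i];
        }
        return t;
    }

    public static void main(String[] args) {
        double[] t = blockTimes(1000.0, new long[] {30, 10}, new long[] {100, 50});
        System.out.println("t(0) = " + t[0] + " ns, t(1) = " + t[1] + " ns");
    }
}
```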
A Decent Word of Caution
1 Determining the current block adds a small systematic error.
2 Java Specialty: Just-in-time Compilation
  Running time is heavily influenced by the HotSpot JIT compiler
  JIT collects profiling information at the beginning
  ⇒ First input determines which optimizations are found
. . . more details in the paper
Input Distributions
We consider 2 different input distributions:
1 Random Permutations
  well-studied in literature
2 Almost Sorted Lists
  Random model by Brodal et al. [4]:
  A[i] chosen i.i.d. uniform in [i − d, i + d] for constant d (here d = 100)
[4] G. Brodal, R. Fagerberg, G. Moruz: On the Adaptiveness of Quicksort, J. Exp. Algorithmics 12 (2008), pp. 3.2:1–3.2:20
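A generator for the almost sorted model might look like this. The sketch assumes integer keys, 0-based indices and java.util.Random as the source of randomness; none of these details are stated on the slide.

```java
import java.util.Random;

/** Generates almost sorted inputs: A[i] is uniform in [i - d, i + d], independently for each i. */
final class AlmostSorted {

    static int[] generate(int n, int d, Random rnd) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            // nextInt(2d + 1) is uniform on 0..2d, so the result is uniform on i-d..i+d
            a[i] = i - d + rnd.nextInt(2 * d + 1);
        }
        return a;
    }

    public static void main(String[] args) {
        int[] a = generate(20, 3, new Random());
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

Every element ends up at most 2d positions away from its place in the sorted order, which is what makes such inputs almost sorted for constant d.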
Results
Asymptotic Expected Costs
  Measure          Algorithm   Random Permutations   Almost Sorted Lists
  Bytecodes (A)    JRE7        19.40 n ln n + 51 n   15.10 n ln n + 68 n
                   JRE7(1,3)   18.73 n ln n + 62 n   13.52 n ln n + 85 n
  time -Xcomp (B)  JRE7        20.10 n ln n + 26 n   11.95 n ln n + 54 n
                   JRE7(1,3)   19.95 n ln n + 32 n   11.09 n ln n + 64 n
  time warmup (B)  JRE7        10.02 n ln n +  9 n    5.52 n ln n + 13 n
                   JRE7(1,3)   11.39 n ln n + 15 n    5.38 n ln n + 19 n
[Figure: log. plots, normalized by n ln n, of the Bytecode counts of JRE7 and JRE7(1,3) for n from 10^5 to 10^8, together with the fitted functions (random permutations: 19.40 n ln n + 51 n and 18.73 n ln n + 62 n)]
model fits data well!
⇒ asymptotically, JRE7(1,3) executes fewer Bytecodes!
Can we explain why?
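Because JRE7(1,3) has the smaller leading coefficient but the larger linear term, the fitted functions cross at some input size. For two models a1 n ln n + b1 n and a2 n ln n + b2 n, equating them gives ln n = (b2 − b1) / (a1 − a2). The sketch below evaluates this for the fitted Bytecode counts on random permutations; it only does arithmetic on the table entries and says nothing about the measured data. The class name CostModel is hypothetical.

```java
/** Fitted cost function of the form a * n ln n + b * n. */
final class CostModel {
    final double a;
    final double b;

    CostModel(double a, double b) {
        this.a = a;
        this.b = b;
    }

    double cost(double n) {
        return a * n * Math.log(n) + b * n;
    }

    /** Input size n at which both models predict the same cost (requires x.a != y.a). */
    static double crossover(CostModel x, CostModel y) {
        return Math.exp((y.b - x.b) / (x.a - y.a));
    }

    public static void main(String[] args) {
        CostModel jre7 = new CostModel(19.40, 51);    // Bytecodes, random permutations
        CostModel jre713 = new CostModel(18.73, 62);
        System.out.println("crossover at n = " + crossover(jre7, jre713));
    }
}
```

For these two models the crossover lies at roughly 1.3 · 10^7 elements; below it the model for JRE7 predicts fewer Bytecodes, above it the model for JRE7(1,3) does.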
Cycle Costs
[Figure: bar chart of the costs of Cycles 1 to 5 in Bytecodes (bc); vertical axis from 0 to 1, labeled “· cost(Cycle 5)”; further column groups for -Xcomp and with warmup]
In #Bytecodes:
  Cycle 3 cheapest
  Cycle 1 most expensive
of all cycles
Asymptotic Cycle Frequencies
[Figure: bar chart of the asymptotic frequencies of Cycles 1 to 5 for JRE7 and JRE7(1,3), on random permutations and on almost sorted lists; vertical axis from 0 to 0.4, labeled “· n ln n + O(n)”]
JRE7(1,3) executes the cheap Cycle 3 more often and the expensive Cycle 1 less often than JRE7.
⇒ Asymptotically, fewer executed Bytecodes!
Running Time Results
How about running time?
HotSpot JIT compiler has two modes:
  -Xcomp   JIT compiler without profiling information
  warmup   profiling JIT with warmup on fixed input ⇒ trigger JIT compilation
⇒ Do Block Sampling for both modes
Should we expect the same block running times?
. . . stay tuned
Cycle Costs
[Figure: bar chart of the costs of Cycles 1 to 5; vertical axis from 0 to 1, labeled “· cost(Cycle 5)”; columns: bc, then t for JRE7 and JRE7(1,3) with -Xcomp, then t for JRE7 with warmup]
measures agree qualitatively
but: smaller difference
Asymptotic Expected Costs
  Measure          Algorithm   Random Permutations   Almost Sorted Lists
  Bytecodes (A)    JRE7        19.40 n ln n + 51 n   15.10 n ln n + 68 n
                   JRE7(1,3)   18.73 n ln n + 62 n   13.52 n ln n + 85 n
  time -Xcomp (B)  JRE7        20.10 n ln n + 26 n   11.95 n ln n + 54 n
                   JRE7(1,3)   19.95 n ln n + 32 n   11.09 n ln n + 64 n
  time warmup (B)  JRE7        10.02 n ln n +  9 n    5.52 n ln n + 13 n
                   JRE7(1,3)   11.39 n ln n + 15 n    5.38 n ln n + 19 n
[Figure: log. plots of the running times for n from 10^5 to 10^8, on random permutations and on almost sorted lists, one pair of plots per JIT mode]
JIT without profiling
  ⇒ asymptotically, JRE7(1,3) faster!
JIT with profiling and warmup
  ⇒ asymptotically, JRE7(1,3) slower!
What changes with profiling enabled?
Cycle Costs
[Figure: bar chart of the costs of Cycles 1 to 5; vertical axis from 0 to 1, labeled “· cost(Cycle 5)”; columns: bc, then t for JRE7 and JRE7(1,3) with -Xcomp, then t for JRE7 and JRE7(1,3) with warmup]
measures agree qualitatively
except for JRE7(1,3) with profiling JIT!
⇒ For JRE7(1,3), the code created by the profiling JIT for Cycle 3 is much slower than for JRE7!
⇒ That’s the place to focus future research on.
Conclusion
Summary
  Java 7’s dual pivot Quicksort is highly asymmetric.
  JRE7(1,3) executes fewer Bytecodes than JRE7.
  Almost sorted inputs amplify the impact of pivot sampling.
  Oracle’s profiling JIT compiler creates different code for JRE7(1,3), which potentially overcompensates gains.
  Control flow graph decomposition supported by MaLiJAn makes differences in code efficiency directly visible.
Open Problems
  ? What causes different costs for Cycle 3?
  ? Are the differences idiosyncrasies of Java / Oracle’s JRE?
  ? Performance of JRE7(1,3) on other inputs, especially with equal keys?