ScalaMeter 2012

ScalaMeter Performance regression testing framework

Aleksandar Prokopec, Josh Suereth

List(1 to 100000: _*).map(x => x * x)

class List[+T] extends Seq[T] { // implementation 1 }

List(1 to 100000: _*).map(x => x * x)

First example

def measure() { val buffer = mutable.ArrayBuffer(0 until 2000000: _*) val start = System.currentTimeMillis() var sum = 0 buffer.foreach(sum += _) val end = System.currentTimeMillis() println(end - start) }

First example

def measure() { val buffer = mutable.ArrayBuffer(0 until 2000000: _*) val start = System.currentTimeMillis() var sum = 0 buffer.foreach(sum += _) val end = System.currentTimeMillis() println(end - start) } measure()

First example

def measure() { val buffer = mutable.ArrayBuffer(0 until 2000000: _*) val start = System.currentTimeMillis() var sum = 0 buffer.foreach(sum += _) val end = System.currentTimeMillis() println(end - start) } measure()

The warmup problem

def measure() { val buffer = mutable.ArrayBuffer(0 until 2000000: _*) val start = System.currentTimeMillis() var sum = 0 buffer.foreach(sum += _) val end = System.currentTimeMillis() println(end - start) } measure() measure()

26 ms, 11 ms

The warmup problem

def measure() { val buffer = mutable.ArrayBuffer(0 until 2000000: _*) val start = System.currentTimeMillis() var sum = 0 buffer.foreach(sum += _) val end = System.currentTimeMillis() println(end - start) } measure() measure()

26 ms, 11 ms

Why? Mainly: - JIT compilation - dynamic optimization

The warmup problem

45 ms, 10 ms

26 ms, 11 ms

The warmup problem

def measure2() { val buffer = mutable.ArrayBuffer(0 until 4000000: _*) val start = System.currentTimeMillis() buffer.map(_ + 1) val end = System.currentTimeMillis() println(end - start) }

The warmup problem

241, 238, 235, 236, 234

The warmup problem

241, 238, 235, 236, 234, 429

The warmup problem

241, 238, 235, 236, 234, 429, 209

The warmup problem

241, 238, 235, 236, 234, 429, 209, 194, 195, 195

The warmup problem

Bottomline: benchmark has to be repeated until the running time becomes “stable”. The number of repetitions is not known in advance.

241, 238, 235, 236, 234, 429, 209, 194, 195, 195

Warming up the JVM

241, 238, 235, 236, 234, 429, 209, 194, 195, 195, 194, 194, 193, 194, 196, 195, 195

Can this be automated? Idea: measure variance of the running times. When it becomes sufficiently small, the test has stabilized.

The interference problem

val buffer = ArrayBuffer(0 until 900000: _*) buffer.map(_ + 1) val buffer = ListBuffer(0 until 900000: _*) buffer.map(_ + 1)

Lets measure the first map 3 times with 7 repetitions:

61, 54, 54, 54, 55, 55, 56 186, 54, 54, 54, 55, 54, 53 54, 54, 53, 53, 53, 54, 51

Now, lets measure the list buffer map in between:

61, 54, 54, 54, 55, 55, 56 186, 54, 54, 54, 55, 54, 53 54, 54, 53, 53, 53, 54, 51

59, 54, 54, 54, 54, 54, 54 44, 36, 36, 36, 35, 36, 36 45, 45, 45, 45, 44, 46, 45 18, 17, 18, 18, 17, 292, 16 45, 45, 44, 44, 45, 45, 44

Now, lets measure the list buffer map in between:

61, 54, 54, 54, 55, 55, 56 186, 54, 54, 54, 55, 54, 53 54, 54, 53, 53, 53, 54, 51

59, 54, 54, 54, 54, 54, 54 44, 36, 36, 36, 35, 36, 36 45, 45, 45, 45, 44, 46, 45 18, 17, 18, 18, 17, 292, 16 45, 45, 44, 44, 45, 45, 44

Using separate JVM

Bottomline: always run the tests in a new JVM.

Using separate JVM

Bottomline: always run the tests in a new JVM. This may not reflect a real-world scenario, but it gives a good idea of how different several alternatives are.

Using separate JVM

Bottomline: always run the tests in a new JVM. It results in a reproducible, more stable measurement.

The List.map example

val list = (0 until 2500000).toList list.map(_ % 2 == 0)

The List.map example

37, 38, 37, 1175, 38, 37, 37, 37, 37, …, 38, 37, 37, 37, 37, 465, 35, 35, …

The garbage collection problem

37, 38, 37, 1175, 38, 37, 37, 37, 37, …, 38, 37, 37, 37, 37, 465, 35, 35, …

This benchmark triggers GC cycles!

37, 38, 37, 1175, 38, 37, 37, 37, 37, …, 38, 37, 37, 37, 37, 465, 35, 35, … -> mean: 47 ms

37, 37, 37, 647, 37, 36, 38, 37, 36, …, 36, 37, 36, 37, 36, 37, 534, 36, 33, … -> mean: 39 ms

37, 38, 37, 1175, 38, 37, 37, 37, 37, …, 38, 37, 37, 37, 37, 465, 35, 35, … -> mean: 47 ms

37, 37, 37, 647, 37, 36, 38, 37, 36, …, 36, 37, 36, 37, 36, 37, 534, 36, 33, … -> mean: 39 ms

Solutions: -repeat A LOT of times –an accurate mean, but takes A LONG time

Solutions: -repeat A LOT of times –an accurate mean, but takes A LONG time -ignore the measurements with GC – gives a reproducible value, and less measurements

- how to do this?

- manually - verbose:gc

Automatic GC detection

- manually - verbose:gc - automatically using callbacks in JDK7

37, 37, 37, 647, 37, 36, 38, 37, 36, …, 36, 37, 36, 37, 36, 37, 534, 36, 33, …

Automatic GC detection

- manually - verbose:gc - automatically using callbacks in JDK7

37, 37, 37, 647, 37, 36, 38, 37, 36, …, 36, 37, 36, 37, 36, 37, 534, 36, 33, …

raises a GC event

The runtime problem

- there are other runtime events beside GC – e.g. JIT compilation, dynamic optimization, etc.

- these take time, but cannot be determined accurately

The runtime problem

- these take time, but cannot be determined accurately - heap state also influences memory allocation patterns

and performance

The runtime problem

and performance

val list = (0 until 4000000).toList list.groupBy(_ % 10)

(allocation intensive)

The runtime problem

and performance

120, 121, 122, 118, 123, 794, 109, 111, 115, 113, 110

The runtime problem

and performance

120, 121, 122, 118, 123, 794, 109, 111, 115, 113, 110

affects the mean – 116 ms vs 178 ms

Outlier elimination

120, 121, 122, 118, 123, 794, 109, 111, 115, 113, 110

Outlier elimination

120, 121, 122, 118, 123, 794, 109, 111, 115, 113, 110

109, 110, 111, 113, 115, 118, 120, 121, 122, 123, 794

Outlier elimination

120, 121, 122, 118, 123, 794, 109, 111, 115, 113, 110

109, 110, 111, 113, 115, 118, 120, 121, 122, 123, 794

inspect tail and its variance contribution

109, 110, 111, 113, 115, 118, 120, 121, 122, 123

Outlier elimination

120, 121, 122, 118, 123, 794, 109, 111, 115, 113, 110

109, 110, 111, 113, 115, 118, 120, 121, 122, 123, 794

inspect tail and its variance contribution

109, 110, 111, 113, 115, 118, 120, 121, 122, 123

109, 110, 111, 113, 115, 118, 120, 121, 122, 123, 124

redo the measurement

Doing all this manually

ScalaMeter

Does all this analysis automatically, highly configurable.

Plus, it detects performance regressions.

And generates reports.

ScalaMeter example

object ListTest extends PerformanceTest.Microbenchmark {

A range of predefined benchmark types

ScalaMeter example

object ListTest extends PerformanceTest.Microbenchmark { val sizes = Gen.range("size”)(500000, 1000000, 100000)

Generators provide input data for tests

ScalaMeter example

object ListTest extends PerformanceTest.Microbenchmark { val sizes = Gen.range("size”)(500000, 1000000, 100000) val lists = for (sz <- sizes) yield (0 until sz).toList

Generators can be composed a la ScalaCheck

ScalaMeter example

object ListTest extends PerformanceTest.Microbenchmark { val sizes = Gen.range("size”)(500000, 1000000, 100000) val lists = for (sz <- sizes) yield (0 until sz).toList using(lists) in { xs => xs.groupBy(_ % 10) } }

Concise syntax to specify and group tests

ScalaMeter example

object ListTest extends PerformanceTest.Microbenchmark { val sizes = Gen.range("size”)(500000, 1000000, 100000) val lists = for (sz <- sizes) yield (0 until sz).toList measure method “groupBy” in { using(lists) in { xs => xs.groupBy(_ % 10) } using(ranges) in { xs => xs.groupBy(_ % 10) } } }

Automatic regression testing

using(lists) in { xs => var sum = 0 xs.foreach(x => sum += x) }

[info] Test group: foreach [info] - foreach.Test-0 measurements: [info] - at size -> 2000000, 1 alternatives: passed [info] (ci = <7.28, 8.22>, significance = 1.0E-10)

using(lists) in { xs => var sum = 0 xs.foreach(x => sum += math.sqrt(x)) }

[info] Test group: foreach [info] - foreach.Test-0 measurements: [info] - at size -> 2000000, 2 alternatives: failed [info] (ci = <14.57, 15.38>, significance = 1.0E-10) [error] Failed confidence interval test: <-7.85, -6.60> [error] Previous (mean = 7.75, stdev = 0.44, ci = <7.28, 8.22>) [error] Latest (mean = 14.97, stdev = 0.38, ci = <14.57, 15.38>)

- configurable: ANOVA (analysis of variance) or confidence interval testing

- can apply noise to make unstable tests more solid - various policies on keeping the result history

Report generation

Tutorials online!

http://axel22.github.com/scalameter

Questions?

ScalaMeter 2012

Software

Transcript of ScalaMeter 2012

Voorjaar 2012 - Printemps 2012 - Spring 2012

› wp-content › uploads › 2014 › 12 › 201415_Predlog… · 2013 2010 2013 2012 2012 2010 2014 2013 2012 2010 2011 2010 2010 2012 2012 2010 2012 2014 2010 2012 2012 2010 2012

Monroe County, New York · May 16, 2012 June 8, 2012 June 15, 2012 June 20, 2012 June 29, 2012 July 18, 2012 August 15, 2012 September 7, 2012 September 19, 2012 September 28, 2012

BANCO DE LA REPÚBLICA SUBGERENCIA DE ......2012 b. Febrero/ 2012 c. Marzo/ 2012 d. Abril/ 2012 e. Mayo/ 2012 f. Junio/ 2012 g. Julio/ 2012 h. Agosto/ 2012 i. Septiembre/ 2012 j. Octubre

2012 ic 2012

AQUATOX Model Calibration for the LBR TP TMDL · 2/26/2014 · 1/12/2012 2/11/2012 3/12/2012 4/11/2012 5/11/2012 6/10/2012 7/10/2012 8/9/2012 9/8/2012 10/8/2012 11/7/2012 12/7/2012

ドラゴンソースe-Magazine · ドラゴンソースe-Magazine ... 0.386 . 3750 2011 7 " _L-þ!d . 370* A n . No.i 2012 2012 2011 2012 2012 2012 2012 2012 2012 2012 2012

· PDF fileDate April 26, 2012 May 25, 2012 June 13, 2012 June 14, 2012 June 15, 2012 June 19, 2012 July 30, 2012 Aug. 19, 2012 Aug. 30, 2012 sept. 10, 2012

calendari act. 23 d'abril act. dia 4 maig › ca › files › doc1395 › calendari_act... · 2013 2012 2012 2012 2012 2012 2012 2012 2012 2012 2012 2012 2012 2012 2012 2012 2012

GRY W INTERNECIE - SPIDOR...GRY W INTERNECIE RAPORT Styczeń 2012 Luty 2012 Marzec 2012 Kwiecień 2012 Maj 2012 Czerwiec 2012 Lipiec 2012 Sierpień 2012 Wrzesień 2012 Październik

Estrazione dispositivi medici ai sensi dell'art. 57 comma ... · 22/08/2012 26/11/2012 13/11/2012 05/07/2012 02/07/2012 02/07/2012 05/07/2012 12/07/2012 03/07/2012 02/07/2012 03/07/2012

I-J2012 - YAMATO AUTOMOTIVE · 2012. 7. 18. · I-J2012 edition 2012 - Auflage 2012 - édition 2012 - uitgave 2012 - edizione 2012 - izdanje 2012 wydanie 2012 - издание 2012

PROOCCEESSS CCHHAAN NGGEE CNO …...F1508 June 2012 Aug 2012Dec F1517 June 2012 Aug 2012 Dec 2012 EP2SGX125 F1508 June 2012 Aug 2012 Dec 2012 EP2SGX130 F1508 June 2012 Aug 2012 Dec

2012: URBIO 2012

Elastomer Catalog 2012 Elastomer Catalog 2012 Elastomer Catalog 2012 Elastomer Catalog 2012 Elastomer Catalog 2012 Elastomer Catalog 2012 Elastomer Catalog 2012 abcd effefe adada

October 2012 Data Release - Fannie Mae · 2020. 5. 7. · January 2012 1.0 February 2012 0.8 March 2012 0.9 April 2012 1.3 May 2012 1.4 June 2012 2.0 July 2012 1.7 August 2012 1.6

€¦ · XLS file · Web view · 2017-09-231.032 2012 30 39 51. 2012 0.86 2012 28 5 1457 1465. 2012 3.6909999999999998 2/3/2012 2012 85 8. 2012 6.2130000000000001 5/3/2012 2012

Länsrapport 2012 Blekinge län - folkhalsomyndigheten.se...Karlshamn 2012 2012 Karlskrona 2012 2012 Olofström 2012 Ronneby 2012 2012 Sölvesborg 2012 2012 Procent Antal Ja, både

s-space.snu.ac.krs-space.snu.ac.kr/bitstream/10371/98564/1/3. 예산... · 2019. 4. 29. · 2012 2013 2012 2013 2012 2013 2012 2013 2012 2013 2012 2013 2012 tll 2013 2012 2013 2012

2012 · 2016-03-23 · 2012. 2012. 2012. 2012. 2012. 2012. 2012. 2012. 2012. UNIVERSIDADE FEDERAL OE MINAS GERAIS ARQUIVOS ... abrigo GO JA 01, por projeto Paranaïba -Serranõpo-