Tales About Scala Performance
© Copyright Performize-IT LTD.
About Me
My name: Haim Yadid
Hard to pronounce
Luckily it is meaningful:
Haim => Life, Yadid => Friend
hybrid nick: lifey
:: ::::::this :: Nil::
Performize-IT
Optimizing Software since 2007
Performance Bottlenecks
Crashes
GC Tuning
Training & Mentoring
OutOfMemory
Concurrency
Contact Me
http://il.linkedin.com/in/haimyadid
www.performize-it.com
blog.performize-it.com
https://github.com/lifey
@lifeyx
Once Upon A Time
Benchmarks by Google
So we are done
So what is this talk about?
Best practices? Micro benchmarks?
Understanding:
  How to find performance problems
  How to solve them
  How to reach a well-performing production system
Prerequisites:
  Familiarity with the JVM
  Basic knowledge of Scala
Performance is all about
Methodology
Monitoring
Hotspots, isolation, analysis, solution
Tools are your Best Friends for this task
Scala Runs on the JVM
All JVM capabilities and tools still apply
Take your best friends with you
Premature Optimization
I shall not optimize prematurely
Monitoring the JVM
Java Management Extensions (JMX)
  On the same machine (attach)
  Remotely, via command-line parameters
Tools
  JConsole
  JVisualVM
  Mission Control
Remote Monitoring - JMX
Add these parameters to the command line of the profiled application:
  -Dcom.sun.management.jmxremote
  -Dcom.sun.management.jmxremote.port=<port>
  -Dcom.sun.management.jmxremote.authenticate=false
  -Dcom.sun.management.jmxremote.ssl=false
Authentication and SSL are recommended; refer to http://java.sun.com/j2se/1.5.0/docs/guide/management/agent.html
Production
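The figures that JConsole or JVisualVM plot over JMX can also be read from inside the process. A minimal sketch, assuming only the standard java.lang.management API; the object and method names (HeapSnapshot, usedAndMaxBytes) are made up for illustration:

```scala
import java.lang.management.ManagementFactory

object HeapSnapshot {
  // Reads the heap usage from the platform MemoryMXBean, the same MBean
  // that JMX clients query remotely.
  def usedAndMaxBytes(): (Long, Long) = {
    val heap = ManagementFactory.getMemoryMXBean.getHeapMemoryUsage
    (heap.getUsed, heap.getMax) // getMax is -1 when no maximum is defined
  }

  def main(args: Array[String]): Unit = {
    val (used, max) = usedAndMaxBytes()
    println(s"heap used=$used bytes, max=$max bytes")
  }
}
```

This is handy for a quick sanity check in a test or a log line; for continuous monitoring the external tools remain the better fit.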
A Tale about a Stack
Your First Scala Function
Functional programming: recursion
Easy to understand
Probably your first program in Scala will look like:
def sumOfSquares(st: Int, end: Int): Int = {
  // A recursive method needs an explicit result type
  if (st > end) 0
  else st * st + sumOfSquares(st + 1, end)
}
And your first exception will be:
java.lang.StackOverflowError
  at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:8)
  at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)
  at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)
  at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)
  at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)
  at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)
Tail Recursion
The recursive call to the function must be the value returned
Not tail recursive, because the multiplication runs after the call returns:
if (number == 1) 1 else number * factorial(number - 1)
Favor tail recursion
The JVM does not optimize recursion
  Meaning an extra call for every iteration
  Limit on recursion depth
The Scala compiler can optimize tail recursion!
import scala.annotation.tailrec

// The accumulator carries the running sum, so the recursive call is the last action
@tailrec
def sumOfSquares(st: Int, end: Int, sum: Int = 0): Int = {
  if (st > end) sum
  else sumOfSquares(st + 1, end, sum + st * st)
}
@tailrec Annotation
A compile-time directive
Fails compilation if the tail-recursion optimization cannot be applied
Use whenever tail recursion is mandatory for performance or functionality
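To see the effect, here is a small self-contained sketch of the accumulator version. It uses Long instead of Int so that a million iterations do not overflow the sum; the object name TailRecDemo is an assumption made for the example:

```scala
import scala.annotation.tailrec

object TailRecDemo {
  // The recursive call is the last action, so scalac compiles it to a loop
  // and the recursion depth no longer depends on the thread stack size.
  @tailrec
  def sumOfSquares(st: Long, end: Long, sum: Long = 0L): Long =
    if (st > end) sum
    else sumOfSquares(st + 1, end, sum + st * st)

  def main(args: Array[String]): Unit =
    println(sumOfSquares(1, 1000000)) // one million iterations, constant stack
}
```

If the last expression is changed to something like `st * st + sumOfSquares(st + 1, end)`, the @tailrec annotation turns the mistake into a compile error instead of a runtime StackOverflowError.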
Stack Size
Ranges from 256k to 1024k
Depending on platform and JVM version
What is it on your system?
java -XX:+PrintFlagsFinal -version |& grep ThreadStackSize
Tune the thread stack size to your needs
Example: -Xss1312k
Production
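Besides -Xss, which sets the default for new threads, a single thread can request its own stack size through the four-argument java.lang.Thread constructor. The sketch below is an illustration under assumptions: the names (BigStackThread, runWithStack) are invented, and the JVM documents the requested size as a hint that some platforms may round or ignore.

```scala
import java.util.concurrent.atomic.AtomicInteger

object BigStackThread {
  // Deliberately NOT tail recursive: every level needs its own stack frame.
  def depth(n: Int): Int = if (n == 0) 0 else 1 + depth(n - 1)

  // Runs depth(levels) on a thread that requests stackBytes of stack.
  // Returns -1 if the worker died (for example with a StackOverflowError).
  def runWithStack(levels: Int, stackBytes: Long): Int = {
    val result = new AtomicInteger(-1)
    val body = new Runnable {
      override def run(): Unit = result.set(depth(levels))
    }
    val worker = new Thread(null, body, "big-stack", stackBytes)
    worker.start()
    worker.join()
    result.get
  }

  def main(args: Array[String]): Unit =
    println(runWithStack(100000, 256L * 1024 * 1024))
}
```

This keeps a deep recursive computation from forcing a large stack on every thread in the process.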
Stacks in Scala
The Scala stack is just like the Java stack
jstack is your best friend
Scala terminology may be obscured
E.g. List will look like $colon$colon
JStack
Part of the JDK
Dumps stack traces of all live threads
Synopsis: jstack -l <pid>
Use when you want to
  Get a snapshot of program activity
  Detect deadlocks
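The same information jstack prints is available in-process through the standard ThreadMXBean. A rough sketch, assuming nothing beyond java.lang.management; the names ThreadDump, dump and deadlockedThreadIds are made up here, and the output format only loosely imitates jstack:

```scala
import java.lang.management.ManagementFactory

object ThreadDump {
  private val threads = ManagementFactory.getThreadMXBean

  // One block per live thread: name, state, then its stack frames.
  def dump(): String =
    threads.dumpAllThreads(false, false).map { info =>
      val frames = info.getStackTrace.map(f => "    at " + f).mkString("\n")
      "\"" + info.getThreadName + "\" " + info.getThreadState + "\n" + frames
    }.mkString("\n\n")

  // findDeadlockedThreads returns null when no deadlock is detected.
  def deadlockedThreadIds(): Seq[Long] =
    Option(threads.findDeadlockedThreads()).map(_.toSeq).getOrElse(Seq.empty)

  def main(args: Array[String]): Unit = {
    println(dump())
    println(s"deadlocked thread ids: ${deadlockedThreadIds()}")
  }
}
```

In a Scala program the frames will show the mangled names mentioned above, such as $colon$colon and $anonfun.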
Takipi’s Stackifier
www.stackifier.com
Humpty Dumpty sat on a heap,
Humpty Dumpty had an OutOfMemory flip.
All the king’s horses and all the king’s men
Couldn’t put Humpty together again
Heap
Max Used
In a Perfect World.....
Heap (or PermGen) is depleted
-XX:+HeapDumpOnOutOfMemoryError
Scala code does not have a larger memory footprint
Scala code may have a larger PermGen footprint
Production
MAT
MAT - Memory Analyzer Tool
A very powerful tool for analyzing heap dumps
Use to investigate:
  Memory leaks
  OutOfMemory errors
  Memory footprint
Alternatives
  YourKit / JProbe / JProfiler (commercial)
  VisualVM (JDK)
  jhat (JDK)
MAT-name-resolver
Add-on for MAT
Helps MAT understand Scala
Developed by Iulian Dragos from Typesafe
GitHub project: https://github.com/dragos/MAT-name-resolver
List[Int] ?
OutOfMemory Perm Space
Class bytecode resides in PermGen
Scala will use more perm space
You can write a small piece of code which will create a lot of bytecode
@ScalaSignature
@ScalaSignature(bytes="...
Metadata needed for:
  Reflection
  Compilation
Larger class files
More classes
Each closure is actually a JVM class
Implicit conversions are classes
Companion objects are also classes
Well
object ClosureExample extends App {
  val f = (x: Int) => x * x
  println(s"closure ${f(5)}")
}
ClosureExample.class
package com.performizeit.scalapeno.demos;

import scala.Function0;
import scala.Function1;
import scala.LowPriorityImplicits;
import scala.Predef.;
import scala.StringContext;
import scala.reflect.ScalaSignature;
import scala.runtime.AbstractFunction0;
import scala.runtime.BoxedUnit;
import scala.runtime.BoxesRunTime;

@ScalaSignature(bytes="\006\001\035:Q!\001\002\t\002-\tab\0217pgV\024X-\022=b[BdWM\003\002\004\t\005)A-Z7pg*\021QAB\001\ng\016\fG.\0319f]>T!a\002\005\002\031A,'OZ8s[&TX-\033;\013\003%\t1aY8n\007\001\001\"\001D\007\016\003\t1QA\004\002\t\002=\021ab\0217pgV\024X-\022=b[BdWmE\002\016!Y\001\"!\005\013\016\003IQ\021aE\001\006g\016\fG.Y\005\003+I\021a!\0218z%\0264\007CA\t\030\023\tA\"CA\002BaBDQAG\007\005\002m\ta\001P5oSRtD#A\006\t\017ui!\031!C\001=\005\ta-F\001 !\021\t\002E\t\022\n\005\005\022\"!\003$v]\016$\030n\03482!\t\t2%\003\002%%\t\031\021J\034;\t\r\031j\001\025!\003 \003\t1\007\005")
public final class ClosureExample
{
  public static void main(String[] paramArrayOfString) { ClosureExample..MODULE$.main(paramArrayOfString); }

  public static void delayedInit(Function0<BoxedUnit> paramFunction0) { ClosureExample..MODULE$.delayedInit(paramFunction0); }

  public static String[] args() { return ClosureExample..MODULE$.args(); }

  public static void scala$App$_setter_$executionStart_$eq(long paramLong) { ClosureExample..MODULE$.scala$App$_setter_$executionStart_$eq(paramLong); }

  public static long executionStart() { return ClosureExample..MODULE$.executionStart(); }

  public static Function1<Object, Object> f() { return ClosureExample..MODULE$.f(); }

  public static class delayedInit$body extends AbstractFunction0
  {
    private final ClosureExample. $outer;

    public final Object apply()
    {
      this.$outer.f_$eq(new ClosureExample..anonfun.1());
      Predef..MODULE$.println(new StringContext(Predef..MODULE$.wrapRefArray((Object[])new String[] { "closure ", "" })).s(Predef..MODULE$.genericWrapArray(new Object[] { BoxesRunTime.boxToInteger(this.$outer.f().apply$mcII$sp(5)) })));
      return BoxedUnit.UNIT;
    }

    public delayedInit$body(ClosureExample. $outer) { }
  }
}
ClosureExample$.class
package com.performizeit.scalapeno.demos;

import scala.App;
import scala.App.class;
import scala.DelayedInit;
import scala.Function0;
import scala.Function1;
import scala.Serializable;
import scala.collection.mutable.ListBuffer;
import scala.runtime.AbstractFunction1.mcII.sp;
import scala.runtime.BoxedUnit;

public final class ClosureExample$ implements App
{
  public static final MODULE$;
  private Function1<Object, Object> f;
  private final long executionStart;
  private String[] scala$App$$_args;
  private final ListBuffer<Function0<BoxedUnit>> scala$App$$initCode;

  static { new (); }

  public long executionStart() { return this.executionStart; }
  public String[] scala$App$$_args() { return this.scala$App$$_args; }
  public void scala$App$$_args_$eq(String[] x$1) { this.scala$App$$_args = x$1; }
  public ListBuffer<Function0<BoxedUnit>> scala$App$$initCode() { return this.scala$App$$initCode; }
  public void scala$App$_setter_$executionStart_$eq(long x$1) { this.executionStart = x$1; }
  public void scala$App$_setter_$scala$App$$initCode_$eq(ListBuffer x$1) { this.scala$App$$initCode = x$1; }
  public String[] args() { return App.class.args(this); }
  public void delayedInit(Function0<BoxedUnit> body) { App.class.delayedInit(this, body); }
  public void main(String[] args) { App.class.main(this, args); }
  public Function1<Object, Object> f() { return this.f; }
  public void f_$eq(Function1 x$1) { this.f = x$1; }
ClosureExample$$anonfun$1.class
package com.performizeit.scalapeno.demos;

import scala.Serializable;
import scala.runtime.AbstractFunction1.mcII.sp;

public final class ClosureExample$$anonfun$1 extends AbstractFunction1.mcII.sp implements Serializable
{
  public static final long serialVersionUID = 0L;

  public final int apply(int x) { return apply$mcII$sp(x); }
  public int apply$mcII$sp(int x) { return x * x; }
}
ClosureExample$delayedInit$body.class
package com.performizeit.scalapeno.demos;

import scala.Function1;
import scala.LowPriorityImplicits;
import scala.Predef.;
import scala.StringContext;
import scala.runtime.AbstractFunction0;
import scala.runtime.BoxedUnit;
import scala.runtime.BoxesRunTime;

public final class ClosureExample$delayedInit$body extends AbstractFunction0
{
  private final ClosureExample. $outer;

  public final Object apply()
  {
    this.$outer.f_$eq(new ClosureExample..anonfun.1());
    Predef..MODULE$.println(new StringContext(Predef..MODULE$.wrapRefArray((Object[])new String[] { "closure ", "" })).s(Predef..MODULE$.genericWrapArray(new Object[] { BoxesRunTime.boxToInteger(this.$outer.f().apply$mcII$sp(5)) })));
    return BoxedUnit.UNIT;
  }

  public ClosureExample$delayedInit$body(ClosureExample. $outer) {
@specialized
Generics are implemented by type erasure
For primitive types this means:
  Boxing/unboxing
  Performance hit
  Large memory footprint
The @specialized annotation enables specialized implementations
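A small sketch of what this looks like in code. The names (SpecializedDemo, Box, runtimeClassName) are invented for the example, and it assumes a Scala 2 compiler: there, the Int instance is expected to be backed by a generated class whose name carries an `$mcI$sp` suffix, in the same way the closure dump above shows `apply$mcII$sp`. Other compiler versions may ignore the annotation.

```scala
object SpecializedDemo {
  // Restricting specialization to Int and Double keeps the number of
  // generated classes small; the unspecialized Box[T] still exists.
  class Box[@specialized(Int, Double) T](val value: T)

  // The runtime class reveals whether a specialized variant was picked.
  def runtimeClassName(box: Box[_]): String = box.getClass.getName

  def main(args: Array[String]): Unit = {
    println(runtimeClassName(new Box(42)))     // primitive Int payload
    println(runtimeClassName(new Box("text"))) // reference payload, generic class
  }
}
```

Listing the primitive types explicitly is one way to get the unboxed fast path without multiplying class files for every type parameter combination.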
What about code cache?
The code cache holds the optimized (JIT-compiled) machine code
It should be large enough to hold it
If you need more PermGen, you may need more code cache
-XX:ReservedCodeCacheSize=<size>
Monitor it via JMX
Production
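Monitoring the code cache and the class metadata area via JMX comes down to reading the non-heap memory pools. A minimal sketch using the standard MemoryPoolMXBean API; the object name NonHeapPools is an assumption, and the pool names printed differ between JVM versions (for example a single "Code Cache" pool on older JVMs, several code heap pools on newer ones):

```scala
import java.lang.management.{ManagementFactory, MemoryType}

object NonHeapPools {
  // One line per non-heap pool (code cache, PermGen or Metaspace, ...).
  def report(): Seq[String] = {
    val pools = ManagementFactory.getMemoryPoolMXBeans
    (0 until pools.size)
      .map(i => pools.get(i))
      .filter(_.getType == MemoryType.NON_HEAP)
      .flatMap { pool =>
        // getUsage returns null if the pool is no longer valid
        Option(pool.getUsage)
          .map(u => s"${pool.getName}: used=${u.getUsed} max=${u.getMax}")
          .toList
      }
  }

  def main(args: Array[String]): Unit = report().foreach(println)
}
```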
@specialized Nightmare
class SpecializeNightmare {
  trait S1[@specialized A, @specialized B] {
    def f(p1: A): Unit
  }
}
Generates 165 classes
Don’t try with 3,4,5
OutOfMemory Perm Gen Space
Congrats, you have a PermGen OOM
-XX:MaxPermSize=1024m
(Or -J-XX:MaxPermSize=1024m if you use the Scala command line)
Production
Oh dear! Oh dear! I shall be too late!
-optimise
A scalac command-line parameter
Performs optimizations of bytecode
  Inlining
  Boxing/unboxing elimination
  etc.
Improves performance
Slower compilation
Production
Inlining
Scala uses information it has at compile time to know which methods can be inlined
It can do a better job than the JVM
Automatic when you use -optimise
Production
Inlining Visibility
At the Scala compiler level
Add -Ylog:inline to see what was inlined
scalac -optimise -Ylog:inline -d ../bin com/performizeit/scalapeno/demos/ClosureExampleInline.scala |& grep inlined
[log inliner] inlined com.performizeit.scalapeno.demos.ClosureExampleInline.<init> // 1 inlined: com.performizeit.scalapeno.demos.ClosureExampleInline.delayedInit
[log inliner] inlined com.performizeit.scalapeno.demos.ClosureExampleInline$$anonfun$f$1.apply // 1 inlined: com.performizeit.scalapeno.demos.anonfun$f$1.apply$mcII$sp
Inlining Visibility JVM
JIT compiler options
Not recommended for production
  -XX:+PrintCompilation
  -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining
! Prod
@inline
You may direct the compiler to inline a method
Usually you will not need it: the compiler will do it anyway
Or the JVM will do it anyway
No real need to clutter the code....
@inline final def f = (x: Int) => x*x
Member accessors
Get/Set
  Getters for val fields
  Getters & setters for var fields
Will you pay for this?
Nope!
JVM inlines accessor methods (by default)
If you insist on a penalty: -XX:-UseFastAccessorMethods
Parallel Collections
ParArray
ParVector
mutable.ParHashMap
mutable.ParHashSet
immutable.ParHashMap
immutable.ParHashSet
ParRange
ParTrieMap
Parallel Collections
Apply only when a location is a hotspot
Very easy to use
Behind the scenes: Fork/Join framework (Java 6)
Dangerous when code:
  has side effects
  is non-associative
Easy to use
val v = Vector(Range(0, 10000000)).flatten
v.par.map(_ + 1)
Only when proven to improve
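Why non-associative operations are dangerous can be shown without any threads. The sketch below does not use .par; it only simulates what a parallel reduce does (reduce each chunk independently, then combine the partial results), with invented names AssociativityDemo and chunkedReduce:

```scala
object AssociativityDemo {
  // Simulated parallel reduce: each chunk is reduced on its own and the
  // partial results are then combined with the same operator.
  def chunkedReduce(xs: Seq[Int], chunkSize: Int)(op: (Int, Int) => Int): Int =
    xs.grouped(chunkSize).map(_.reduce(op)).reduce(op)

  def main(args: Array[String]): Unit = {
    val xs = 1 to 8
    println(xs.reduce(_ + _))            // 36
    println(chunkedReduce(xs, 4)(_ + _)) // 36: addition is associative
    println(xs.reduce(_ - _))            // -34
    println(chunkedReduce(xs, 4)(_ - _)) // 8: subtraction is not
  }
}
```

With a real parallel collection the chunk boundaries are chosen by the scheduler, so a non-associative operator can give a different answer from run to run.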
Profiler - JVisualVM
Part of the JDK
A profiler
Use when you want to
  Identify hotspots
  Analyze memory allocation bottlenecks
Alternatives
  YourKit (commercial)
  JProbe (commercial)
  JProfiler (commercial)
Sampling vs Instrumentation
Sampling - sample application threads and stack traces to get statistics
Instrumentation - modify bytecode to record times and invocation counts
Scala Stacks revisited
while (true) {
  var a = List(Range(0, 1000)).flatten
  // println(a)
  for (i <- 1 to 10) {
    a = a :+ i
    println(a.last)
  }
}
Garbage Collection
Immutability
Immutability may cause more object allocation
Not necessarily a performance hit
  Short-lived objects
  GC handles them efficiently
  Escape analysis
Parallelization!!!
VisualVM (allocation hotspots)
Find locations where
  large amounts of bytes are being allocated
  large numbers of objects are being allocated
Large (im)mutable state
You have a huge graph which changes gradually
It eventually ends up in the Old Generation
A small change may have a huge impact on state
That may screw up GC
GC Visibility
GC can be visualized partially through JMX
The best way to get the whole picture is GC logs
  -Xloggc:<log file name>
  -XX:+PrintGCDetails
  -XX:+PrintGCDateStamps
Java 7 supports a “rolling appender”
  -XX:+UseGCLogFileRotation
  -XX:NumberOfGCLogFiles=<#files>
  -XX:GCLogFileSize=<number>M
Prod
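The partial view that JMX gives of GC boils down to per-collector counters. A minimal sketch with the standard GarbageCollectorMXBean API; GcCounters and snapshot are hypothetical names, and the collector names returned depend on which GC the JVM is running:

```scala
import java.lang.management.ManagementFactory

object GcCounters {
  // (collector name, collections so far, accumulated collection time in ms).
  // A count or time of -1 means the collector does not report that value.
  def snapshot(): Seq[(String, Long, Long)] = {
    val beans = ManagementFactory.getGarbageCollectorMXBeans
    (0 until beans.size).map { i =>
      val gc = beans.get(i)
      (gc.getName, gc.getCollectionCount, gc.getCollectionTime)
    }
  }

  def main(args: Array[String]): Unit =
    snapshot().foreach { case (name, count, millis) =>
      println(s"$name: collections=$count, time=${millis}ms")
    }
}
```

These totals say nothing about individual pauses, which is why the GC log remains the better source for the whole picture.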
GCViewer
Analyzes GC logs
Use when:
  Experiencing GC problems
  Is GC efficient? (throughput)
  Does GC stop the application? (pause time)
Alternatives
  Censum (commercial)
And They Lived Happily Ever After
slides /: (_ + _)
Don’t be afraid of Scala
You will be able to optimize large scale apps
Optimize where needed
You need to (Java =>) Scala Yourself
ATM - Know Java to optimize Scala
Q&A
The End