JVM Performance Tuning

Post on 23-Jun-2015


Java Virtual Machine tuning guide. It describes the JVM memory model and how to tune it.

JavaStudy Network

Daehyub Cho

JVM [Java Virtual Machine] Performance Tuning

AGENDA

1. Basic concept of JVM tuning
2. HotSpot compiler
3. Threading model
4. Memory model

Basic Concept of JVM Tuning

Basics of performance tuning

1. Decide what performance level is "good enough"
2. Test and measure
   • Scenario based
   • Stress tool (LoadRunner)
   • Profiling tool (JProbe, etc.)
3. Profile the application to find bottlenecks
4. Tune
   • Application *
   • Middleware [WAS]
   • OS
   • JVM
5. Return to step 2 [feedback]

JVM Tuning

• Can improve performance by about 10~20%
• Find the appropriate parameters for your application
  – HotSpot compiler options
  – Thread model options *
  – GC and memory-related options **
• Changing parameters is risky
  – Needs more testing and feedback
  – Reference: spec.org

HotSpot Compiler

JVM Layout

• HotSpot from JDK 1.3

[Diagram: JVM layout. The HotSpot compiler (client compiler or server compiler) sits alongside the VM, which provides the runtime, GC, interpreter, threading & locking, ...]

HotSpot compiler

• JIT (Just-In-Time compiler)
  – Compiles bytecode to native code
  – Compiles according to fixed optimization rules (no "thinking")
  – At execution/installation
• HotSpot
  – Compiles bytecode to native code
  – "Thinks": tries to find where optimization can take place
  – Adaptive optimization at runtime

HotSpot Detection

• Hotspot detection
• Method inlining
• Dynamic deoptimization

HotSpot Detection and Method Inlining

• Literal constants are folded

    int foo = 9 * 10;    →    int foo = 90;

• String concatenation is sometimes folded

    String foo = "Hello " + (9 * 10);    →    String foo = "Hello 90";

• Constant fields are inlined

    public class A { public static final int VALUE = 99; }
    public class B { static int VALUE2 = A.VALUE; }

  After compiling class B:

    public class B { static int VALUE2 = 99; }
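
The folding rules above can be observed from plain Java. The sketch below is an illustration, not part of the original slides; the class and method names are made up. It relies on the fact that a compile-time constant string is interned, so a folded concatenation is the very same object as the equivalent literal, while a concatenation involving a non-final variable is built at run time.

```java
final class ConstantFoldingDemo {
    // Both operands are compile-time constants, so the expression is folded
    // into the single literal "Hello 90", which is interned.
    static boolean foldedAtCompileTime() {
        String folded = "Hello " + (9 * 10);
        return folded == "Hello 90"; // identity comparison is intentional here
    }

    // 'n' is not final, so this is not a constant expression and a new
    // String object is created at run time.
    static boolean runtimeConcatIsSameObject() {
        int n = 90;
        String built = "Hello " + n;
        return built == "Hello 90"; // identity comparison is intentional here
    }

    public static void main(String[] args) {
        System.out.println("folded at compile time: " + foldedAtCompileTime());
        System.out.println("runtime concat is same object: " + runtimeConcatIsSameObject());
    }
}
```

The identity comparison (==) is used only to make the folding visible; application code should compare strings with equals().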

HotSpot detection / Method Inlining

• Dead code branches are eliminated

public class A {
    static final boolean DEBUG = false;
    public void methodA() {
        if (DEBUG) System.out.println("DEBUG MODE");
        System.out.println("Say Hello");
    } // methodA
} // class A

↓

public class A {
    static final boolean DEBUG = false;
    public void methodA() {
        System.out.println("Say Hello");
    } // methodA
} // class A

HotSpot Client compiler

• Java option: -client
• Focused on simple and fast start-up
• 3-phase compiler
  – HIR (High-level Intermediate Representation)
  – LIR (Low-level Intermediate Representation)
  – Machine code
• It focuses on local code quality and does very few global optimizations, since those are often the most expensive in terms of compile time
• It inlines any method that has no exception handlers or synchronization, and it supports deoptimization for debugging and inlining

HotSpot Server compiler

• Java option: -server
• Focused on optimization
• SSA (Static Single Assignment)-based IR

HotSpot compiler options

• HotSpot compile options
  – -XX:MaxInlineSize=<size>
    • Integer specifying the maximum number of bytecode instructions in a method that gets inlined.
  – -XX:FreqInlineSize=<size>
    • Integer specifying the maximum number of bytecode instructions in a frequently executed method that gets inlined.
  – -Xint
    • Interpreter only (no JIT compilation)
  – -XX:+PrintCompilation
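
A small program can be used to watch these options at work. The sketch below is only an illustration (class and method names are invented): it calls a tiny method from a hot loop, which makes that method a natural candidate for compilation and inlining.

```java
final class InlineCandidateDemo {
    // Very small method: a typical inlining candidate.
    static long square(long x) {
        return x * x;
    }

    // Hot loop that calls square() once per iteration.
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += square(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        long result = 0;
        // Repeat the work so the loop becomes hot enough to be compiled.
        for (int round = 0; round < 50; round++) {
            result = sumOfSquares(100_000);
        }
        System.out.println(result);
    }
}
```

Running it as `java -XX:+PrintCompilation InlineCandidateDemo` should list methods as they are compiled, and `java -Xint InlineCandidateDemo` runs the same code in the interpreter only, which gives a rough feel for what the compiler contributes. The exact output format depends on the JVM version.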

Threading Model

• Thread model
  – Java is a multithreaded programming language
  – Native thread model from JDK 1.2
• Thread mapping (M:N and 1:1)
• Thread synchronization

[Diagram: Java threads in the Java application pass through the JVM to operating system thread handling: thread scheduling and lock management (synchronization)]

Solaris M:N Thread Model

[Diagram: Java application and Java threads (JVM) on top of Solaris threads, LWPs and kernel threads (Solaris OS, OS kernel), M:N mapping]

• Solaris M:N thread model
  – Thread-based synchronization
  – LWP-based synchronization

           Thread-based sync             LWP-based sync
JDK 1.2    N/A                           Default
JDK 1.3    Default                       -XX:+UseLWPSynchronization
JDK 1.4    -XX:-UseLWPSynchronization    Default

Solaris 1:1 Thread Model

[Diagram: Java application and Java threads (JVM) on top of Solaris threads, LWPs and kernel threads (Solaris OS, OS kernel), 1:1 mapping]

• Solaris 1:1 thread model
  – Bound threads
  – Alternate libthread

           Bound threads            Alternate libthread *
JDK 1.2    N/A                      export LD_LIBRARY_PATH=/usr/lib/lwp
JDK 1.3    -XX:+UseBoundThreads     export LD_LIBRARY_PATH=/usr/lib/lwp
JDK 1.4    -XX:+UseBoundThreads     export LD_LIBRARY_PATH=/usr/lib/lwp

※ In Solaris 9 the alternate libthread is the default; do not add /usr/lib/lwp to LD_LIBRARY_PATH.

JVM Performance Test on Solaris

< Solaris 8 with JVM 1.3 >

Architecture  CPUs  Threads   Model                 % diff in throughput (against standard model)
Sparc         30    400/2000  Standard              ---
Sparc         30    400/2000  LWP Synchronization   215%/800%
Sparc         30    400/2000  Bound Threads         -10%/-80%
Sparc         30    400/2000  Alternate One-to-one  275%/900%
Sparc         4     400/2000  Standard              ---
Sparc         4     400/2000  LWP Synchronization   30%/60%
Sparc         4     400/2000  Bound Threads         -5%/-45%
Sparc         4     400/2000  Alternate One-to-one  30%/50%
Sparc         2     400/2000  Standard              ---
Sparc         2     400/2000  LWP Synchronization   0%/25%
Sparc         2     400/2000  Bound Threads         -30%/-40%
Sparc         2     400/2000  Alternate One-to-one  -10%/0%
Intel         4     400/2000  Standard              ---
Intel         4     400/2000  LWP Synchronization   25%/60%
Intel         4     400/2000  Bound Threads         0%/-10%
Intel         4     400/2000  Alternate One-to-one  20%/60%
Intel         2     400/2000  Standard              ---
Intel         2     400/2000  LWP Synchronization   15%/45%
Intel         2     400/2000  Bound Threads         -10%/-15%
Intel         2     400/2000  Alternate One-to-one  15%/35%

JVM Performance Test on Solaris

[Graph: performance test results for the table above]

Memory Model

Memory Tuning

• Garbage collection
• JVM memory layout
• Garbage collection model
• Server VM and client VM
• Garbage collection measurement & analysis
• Tuning garbage collection

Generational Garbage Collection

JVM Memory Layout

• New/Young: recently created objects
• Old: long-lived objects
• Perm: JVM classes and methods

[Diagram: New/Young (Eden, SS1, SS2) and Old form the total heap size used by the application; Perm is used by the JVM]
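
The layout above can be inspected from inside a running JVM through the standard java.lang.management API. The following is a minimal sketch (the class name is made up); pool names differ between collectors and JVM versions, for example the permanent generation no longer exists as "Perm" on Java 8 and later.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

final class MemoryPoolsDemo {
    // Returns one line per memory pool: type, name and current usage in KB.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is not valid
            long usedKb = (usage == null) ? -1 : usage.getUsed() / 1024;
            lines.add(pool.getType() + "\t" + pool.getName() + "\t" + usedKb + "K used");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

On a generational collector the output typically contains separate pools for eden, survivor and old space, which correspond to the areas named on the slide.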

Garbage Collection

• Garbage collection
  – Collects unused Java objects
  – Cleans up memory
  – Minor GC
    • Collects memory in the New/Young generation
  – Major GC (Full GC)
    • Collects memory in the Old generation

Minor GC

• Minor collection
  – New/Young generation
  – Copy and scavenge
  – Very fast

[Diagram: 1st minor GC over Eden, the survivor spaces and Old. Live objects are copied to the survivor area. Legend: new object, garbage, live object]

[Diagram: 2nd minor GC. Legend: new object, garbage, live object]

[Diagram: 3rd minor GC. Objects are moved to the old space when they become tenured. Legend: new object, garbage, live object]

Major GC

• Major collection
  – Old generation
  – Mark and compact
  – Slow
• 1st phase: goes through the entire heap and marks, so that unreachable objects can be identified
• 2nd phase: unreachable objects are removed and the heap is compacted

[Diagram: major GC over Eden, SS1, SS2 and Old in three steps: initial state, mark the objects to be removed, compact]

Server option versus Client option

• -XX:NewRatio=2 (1.3), -Xmn128m (1.4), -XX:NewSize=<size> -XX:MaxNewSize=<size>

GC Tuning Parameter

• Memory tuning parameters
  – Perm size: -XX:MaxPermSize=64m
  – Total heap size: -ms512m -mx512m (equivalent to -Xms512m -Xmx512m)
  – New size
    • -XX:NewRatio=2 (old/new size ratio)
    • -XX:NewSize=128m
    • -Xmn128m (JDK 1.4)
  – Survivor size: -XX:SurvivorRatio=64 (eden/survivor)
  – Heap ratio
    • -XX:MaxHeapFreeRatio=70
    • -XX:MinHeapFreeRatio=40
  – Survivor ratio
    • -XX:TargetSurvivorRatio=50
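
As a worked example of how the two ratios translate into sizes, the sketch below computes approximate generation sizes. It assumes the usual reading of the flags: NewRatio is old/young, so young = heap / (NewRatio + 1), and SurvivorRatio is eden/survivor with two survivor spaces, so each survivor = young / (SurvivorRatio + 2). The class and method names are illustrative, and a real JVM rounds these values to its own alignment.

```java
final class GenerationSizes {
    // -XX:NewRatio=N means old:young = N:1, so young is 1/(N+1) of the heap.
    static long youngSize(long heap, int newRatio) {
        return heap / (newRatio + 1);
    }

    // -XX:SurvivorRatio=R means eden:survivor = R:1; there are two survivor
    // spaces, so each one is 1/(R+2) of the young generation.
    static long survivorSize(long young, int survivorRatio) {
        return young / (survivorRatio + 2);
    }

    // Eden is what remains of the young generation after both survivor spaces.
    static long edenSize(long young, int survivorRatio) {
        return young - 2 * survivorSize(young, survivorRatio);
    }

    public static void main(String[] args) {
        long heapKb = 512L * 1024; // -ms512m -mx512m
        int newRatio = 2;          // -XX:NewRatio=2
        int survivorRatio = 64;    // -XX:SurvivorRatio=64
        long youngKb = youngSize(heapKb, newRatio);
        System.out.println("young    = " + youngKb + "K");
        System.out.println("old      = " + (heapKb - youngKb) + "K");
        System.out.println("survivor = " + survivorSize(youngKb, survivorRatio) + "K each");
        System.out.println("eden     = " + edenSize(youngKb, survivorRatio) + "K");
    }
}
```

With a 512 MB heap and NewRatio=2, roughly one third of the heap goes to the young generation; with SurvivorRatio=64 almost all of that is eden and each survivor space is small.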

Support for -XX Options

• Options that begin with -X are nonstandard (not guaranteed to be supported on all VM implementations), and are subject to change without notice in subsequent releases of the Java 2 SDK.

• Because the -XX options have specific system requirements for correct operation and may require privileged access to system configuration parameters, they are not recommended for casual use. These options are also subject to change without notice.
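
Since these options change between releases, it can help to confirm which nonstandard options a running JVM actually received. The sketch below uses the standard RuntimeMXBean for that; the class and the helper method are illustrative names, not an established utility.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class JvmOptionsDemo {
    // Keeps only nonstandard options, i.e. those starting with -X (which includes -XX).
    static List<String> nonStandardOptions(List<String> jvmArgs) {
        List<String> result = new ArrayList<>();
        for (String arg : jvmArgs) {
            if (arg.startsWith("-X")) {
                result.add(arg);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to the main class.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        for (String option : nonStandardOptions(jvmArgs)) {
            System.out.println(option);
        }
    }
}
```

Started with, for example, `java -Xmn128m -XX:NewRatio=2 JvmOptionsDemo`, it should print those two options; started without any, it prints nothing.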

Garbage Collection Model

• New types of GC
  – Default collector
  – Parallel GC for the young generation (JDK 1.4)
  – Concurrent GC for the old generation (JDK 1.4)
  – Incremental low-pause collector (Train GC)

Parallel GC

• Parallel GC
  – Improves GC performance
  – For the young generation (minor GC)
  – More than 4 CPUs and 256 MB of physical memory required

[Diagram: threads over time during a young generation GC, default GC vs. parallel GC]

Parallel GC

• Two parallel collectors
  – Low-pause: -XX:+UseParNewGC
    • Near-real-time or pause-dependent applications
    • Works with
      – Mark-and-compact collector
      – Concurrent old-area collector
  – Throughput: -XX:+UseParallelGC
    • Enterprise or throughput-oriented applications
    • Works only with the mark-and-compact collector

Parallel GC

• Throughput collector
  – -XX:+UseParallelGC
  – -XX:ParallelGCThreads=<desired number>
  – -XX:+UseAdaptiveSizePolicy
    • Adaptive resizing of the young generation

Parallel GC

• Throughput collector
  – AggressiveHeap
    • Enabled by -XX:+AggressiveHeap
    • Inspects machine resources and attempts to set various parameters to be optimal for long-running, memory-intensive jobs
      – Useful on machines with more than 4 CPUs and more than 256 MB of memory
      – Useful for server applications
      – Do not use with -ms and -mx
    • Example (HP Itanium, 1.4.2): java -XX:+ServerApp -XX:+AggressiveHeap -Xmn3400m spec.jbb.JBBmain -propfile Test1

Concurrent GC

• Concurrent GC
  – Reduces the pause time needed to collect the old generation
  – For the old generation (Full GC)
  – Enabled by -XX:+UseConcMarkSweepGC

[Diagram: threads over time during an old generation GC, default GC vs. concurrent GC]

Incremental GC

• Incremental GC
  – Enabled by -Xincgc (from JDK 1.3)
  – Collects the old generation whenever the young generation is collected
  – Reduces the pause time for collecting the old generation
  – Disadvantages
    • Young generation GC occurs more frequently
    • More resources are needed
    • Do not use with -XX:+UseParallelGC or -XX:+UseParNewGC

Incremental GC

• Incremental GC

[Diagram: default GC vs. incremental GC across the young and old generations. Default GC: a Full GC occurs after many minor GCs. Incremental GC: the old generation is collected during each minor GC]

Incremental GC

• Incremental GC: -client -XX:+PrintGCDetails -Xincgc -ms32m -mx32m

[GC [DefNew: 540K->35K(576K), 0.0053557 secs][Train: 3495K->3493K(32128K), 0.0043531 secs] 4036K->3529K(32704K), 0.0099856 secs]
[GC [DefNew: 547K->64K(576K), 0.0048216 secs][Train: 3529K->3540K(32128K), 0.0058683 secs] 4041K->3604K(32704K), 0.0109779 secs]
[GC [DefNew: 575K->64K(576K), 0.0164904 secs] 4116K->3670K(32704K), 0.0169019 secs]
[GC [DefNew: 576K->64K(576K), 0.0057541 secs][Train: 3671K->3651K(32128K), 0.0051286 secs] 4182K->3715K(32704K), 0.0113042 secs]
[GC [DefNew: 575K->56K(576K), 0.0114559 secs] 4227K->3745K(32704K), 0.0191390 secs]
[Full GC [Train MSC: 3689K->3280K(32128K), 0.0909523 secs] 4038K->3378K(32704K), 0.0910213 secs]
[GC [DefNew: 502K->64K(576K), 0.0173220 secs][Train: 3329K->3329K(32128K), 0.0066279 secs] 3782K->3393K(32704K), 0.0325125 secs

Annotations: DefNew = young generation GC; Train = old generation GC performed within a minor GC; the last time on each line = total minor GC time; Full GC = full collection

Sun JVM 1.4.1 on Windows OS

Mark-compact              Better throughput
Incremental GC (Train)    Better pause
Parallel GC               Best throughput
Concurrent GC             Best pause

Garbage Collection Measurement

• -verbosegc (all platforms)
• -XX:+PrintGCDetails (JDK 1.4)
• -Xverbosegc (HP)
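
Besides these command-line flags, collection counts and accumulated GC time can also be read from inside the application through the standard GarbageCollectorMXBean interface. The following is a minimal sketch (class and method names are made up); collector names and the number of collections depend on the JVM and its settings.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcCountersDemo {
    // Sums the collection counts of all collectors; a count of -1 means "undefined".
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Allocate short-lived objects so that some minor collections are likely.
        byte[][] sink = new byte[64][];
        for (int i = 0; i < 500_000; i++) {
            sink[i % sink.length] = new byte[1024];
        }
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
        System.out.println("total collections: " + totalCollections());
    }
}
```

Typically one bean reports the young generation collector and another the old generation collector, which maps onto the minor GC / major GC split described earlier.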

Garbage Collection Measurement

• -verbosegc

[GC 40549K->20909K(64768K), 0.0484179 secs]
[GC 41197K->21405K(64768K), 0.0411095 secs]
[GC 41693K->22995K(64768K), 0.0846190 secs]
[GC 43283K->23672K(64768K), 0.0492838 secs]
[Full GC 43960K->1749K(64768K), 0.1452965 secs]
[GC 22037K->2810K(64768K), 0.0310949 secs]
[GC 23098K->3657K(64768K), 0.0469624 secs]
[GC 23945K->4847K(64768K), 0.0580108 secs]

Reading an entry such as [Full GC 43960K->1749K(64768K), 0.1452965 secs]:

• Full GC: a full collection (plain "GC" is a minor collection)
• 43960K: heap size before GC
• 1749K: heap size after GC
• 64768K: total heap size
• 0.1452965 secs: GC time
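
The fields of a -verbosegc entry can also be extracted in Java. The sketch below is a hypothetical helper, not a tool from the slides; it assumes exactly the one-entry-per-line format of the JDK 1.4 era log shown above and returns null for anything else.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class VerboseGcLine {
    // Matches "[GC 40549K->20909K(64768K), 0.0484179 secs]" and the "[Full GC ...]" variant.
    private static final Pattern ENTRY = Pattern.compile(
            "\\[(Full )?GC (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    final boolean full;   // true for a Full GC entry
    final long beforeK;   // heap size before GC
    final long afterK;    // heap size after GC
    final long totalK;    // total heap size
    final double seconds; // GC time

    private VerboseGcLine(boolean full, long beforeK, long afterK, long totalK, double seconds) {
        this.full = full;
        this.beforeK = beforeK;
        this.afterK = afterK;
        this.totalK = totalK;
        this.seconds = seconds;
    }

    // Returns the parsed entry, or null if the line is not in the expected format.
    static VerboseGcLine parse(String line) {
        Matcher m = ENTRY.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new VerboseGcLine(m.group(1) != null,
                Long.parseLong(m.group(2)), Long.parseLong(m.group(3)),
                Long.parseLong(m.group(4)), Double.parseDouble(m.group(5)));
    }

    // Memory freed by this collection, in KB.
    long reclaimedK() {
        return beforeK - afterK;
    }

    public static void main(String[] args) {
        VerboseGcLine entry = parse("[GC 40549K->20909K(64768K), 0.0484179 secs]");
        System.out.println("full=" + entry.full + " alive=" + entry.afterK
                + "K freed=" + entry.reclaimedK() + "K time=" + entry.seconds);
    }
}
```

Newer JVMs use a different log format, so the pattern would need to be adapted there.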

GC Log analysis using AWK script

• Awk script (gc.awk)

BEGIN {
    printf("Minor\tMajor\tAlive\tFreed\n");
}
{
    # Minor GC entry: [GC <before>K-><after>K(<total>K), <time> secs]
    if (substr($0, 1, 4) == "[GC ") {
        split($0, array, " ");
        printf("%s\t0.0\t", array[3])          # minor GC time

        split(array[2], barray, "K")
        before = barray[1]
        after = substr(barray[2], 3)           # strip the leading "->"
        reclaim = before - after
        printf("%s\t%s\n", after, reclaim)
    }

    # Full GC entry: [Full GC <before>K-><after>K(<total>K), <time> secs]
    if (substr($0, 1, 9) == "[Full GC ") {
        split($0, array, " ");
        printf("0.0\t%s\t", array[4])          # major GC time

        split(array[3], barray, "K")
        before = barray[1]
        after = substr(barray[2], 3)
        reclaim = before - after
        printf("%s\t%s\n", after, reclaim)
    }
    next;
}

※ Usage (gc.log is the -verbosegc output, one entry per line)

% awk -f gc.awk gc.log

Minor       Major       Alive       Freed
0.0484179   0.0         20909       19640
0.0411095   0.0         21405       19792
0.0846190   0.0         22995       18698
0.0492838   0.0         23672       19611
0.0         0.1452965   1749        42211
0.0310949   0.0         2810        19227
0.0469624   0.0         3657        19441
0.0580108   0.0         4847        19098

GC Log analysis using AWK script

[Graph: GC time]

GC Log analysis using HPJtune

※ http://www.hp.com/products1/unix/java/java2/hpjtune/index.html

GC Log analysis using AWK script

[Graph: GC amount]

Garbage Collection Tuning

• GC tuning
  – Find the most important factor
    • Low pause? Or high performance?
    • Select the appropriate GC model (a new model carries risk!)
  – Select "server" or "client"
  – Find the appropriate heap size by reviewing the GC log
  – Find the ratio of young to old generation

Garbage Collection Tuning

• GC tuning
  – Full GC is the most important factor in GC tuning
    • How frequent? How long?
    • Short and frequent: decrease old space
    • Long and occasional: increase old space
    • Short and occasional: decrease throughput by load balancing
  – Fix the heap size
    • Set "ms" and "mx" to the same value
    • Removes shrinking and growing overhead
  – Don't
    • Don't make the heap bigger than physical memory (it will swap)
    • Don't make the new generation bigger than half the heap
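
A quick way to see whether the heap can still grow is to compare the committed and maximum sizes reported by java.lang.Runtime. This is only a rough sketch with made-up names; the values reported by Runtime are approximate and vary with the collector, so treat the result as an indication rather than an exact check of the "ms"/"mx" settings.

```java
final class HeapSizeCheck {
    // How far the heap could still grow: maximum size minus currently committed size.
    static long growthHeadroom(long committedBytes, long maxBytes) {
        return maxBytes - committedBytes;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long committed = rt.totalMemory(); // memory currently reserved for the heap
        long max = rt.maxMemory();         // upper limit the heap may grow to
        System.out.println("committed = " + committed / 1024 + "K");
        System.out.println("max       = " + max / 1024 + "K");
        System.out.println("headroom  = " + growthHeadroom(committed, max) / 1024 + "K");
    }
}
```

When "ms" and "mx" are set to the same value the headroom should stay at or near zero for the lifetime of the process.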

[Screenshot: Jmeter / Threads Histogram]

[Screenshot: Jmeter / Threads Group Histogram]

Example

[Chart: Old area before tuning. Time markers: 2004-01-08 7:14 PM; 2004-01-09 around 8 AM; 2004-01-09 around 7 PM; Friday business hours; 2004-01-10 around 10 AM; 2004-01-10 around 6 PM. Peak time: 52000~56000 sec, about one hour from 9 o'clock]

Example

At peak time, old generation GC takes 4~8 sec, which can cause the application to hang.

[Chart: GC time before tuning]

Example

[Two charts over the same time axis, from the 12th 03:38 AM to the 17th 10:58 PM, with the weekend and the Mon, Tue, Thu and Fri office hours marked. One is captioned "After AP tuned: GC time"]

Summary

JVM Tuning Summary

• Determine the JVM performance goal
• Gather statistics on your application
• Select the HotSpot compiler
• Tune the heap
• Check the threading model
• Feedback

More Tips

Thread dump

• Thread dump
  – Triggered by
    • Unix: kill -3 [JAVA PID]
    • Windows: Ctrl+Break
  – Snapshot of a Java application
  – Can be used to diagnose hang-ups and slow-downs
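
A similar snapshot can be produced from inside the application with Thread.getAllStackTraces(), which is handy when sending a signal to the process is not possible. The sketch below is an illustration with made-up names; its output is simpler than a real JVM thread dump and does not include lock information.

```java
import java.util.Map;

final class ThreadDumpDemo {
    // Builds a textual snapshot of all live threads and their current stack frames.
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            out.append('"').append(t.getName()).append('"')
               .append(" daemon=").append(t.isDaemon())
               .append(" prio=").append(t.getPriority())
               .append(" state=").append(t.getState())
               .append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

Taking several snapshots a few seconds apart and comparing them usually shows which threads are stuck in the same place.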

Thread dump example

• Thread dump when slowdown in WAS

"ExecuteThread: '232' for queue: 'default'" daemon prio=5 tid=0x573ca630 nid=0xd2c waiting for monitor entry [0x5cebf000..0x5cebfdb8]
    at java.util.Hashtable.get(Hashtable.java:314)
    at java.util.ListResourceBundle.handleGetObject(ListResourceBundle.java:122)
    at java.util.ResourceBundle.getObject(ResourceBundle.java:371)
    at java.util.ResourceBundle.getObject(ResourceBundle.java:374)
    at java.text.DateFormatSymbols.initializeData(DateFormatSymbols.java:483)
    at java.text.DateFormatSymbols.<init>(DateFormatSymbols.java:99)
    at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:275)
    at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:264)
    at XXX.uv.com.cm.CmDateTimeUtil.getCurrentTime(CmDateTimeUtil.java:88)
    at XXX.uv.com.util.CmLog.setFileLog(CmLog.java:171)
    at XXX.uv.com.jsp.EjbJspBase.service(EjbJspBase.java:371)
    at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:265)
    at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:200)
    at weblogic.servlet.internal.WebAppServletContext.invokeServlet(WebAppServletContext.java:2546)
    at weblogic.servlet.internal.ServletRequestImpl.execute(ServletRequestImpl.java:2260)
    at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:139)
    at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:120)

"ExecuteThread: '231' for queue: 'default'" daemon prio=5 tid=0x573f9a60 nid=0x13a8 waiting for monitor entry [0x5ce7f000..0x5ce7fdb8]
    at java.util.Hashtable.get(Hashtable.java:314)
    at java.text.DecimalFormatSymbols.initialize(DecimalFormatSymbols.java:333)
    at java.text.DecimalFormatSymbols.<init>(DecimalFormatSymbols.java:55)
    at java.text.NumberFormat.getInstance(NumberFormat.java:565)
    at java.text.NumberFormat.getInstance(NumberFormat.java:324)
    at java.text.SimpleDateFormat.initialize(SimpleDateFormat.java:327)
    at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:276)
    at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:264)
    at XXX.uv.com.cm.CmDateTimeUtil.getCurrentTime(CmDateTimeUtil.java:88)
    at XXX.uv.com.cm.CmDateTimeUtil.getCurrentTime(CmDateTimeUtil.java:67)
    at XXX.uv.com.datastu.DateTime.setCurrentTime(DateTime.java:190)
    at XXX.uv.com.jsp.EjbJspBase.service(EjbJspBase.java:239)
    at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:265)
    at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:200)
    at weblogic.servlet.internal.WebAppServletContext.invokeServlet(WebAppServletContext.java:2546)
    at weblogic.servlet.internal.ServletRequestImpl.execute(ServletRequestImpl.java:2260)
    at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:139)
    at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:120)

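Both threads in the dump are waiting for a monitor inside Hashtable.get while constructing a new SimpleDateFormat on every call. One possible way to reduce that contention, sketched here under assumptions, is to create the formatter once per thread and reuse it. The class name and the date pattern below are hypothetical and not taken from the application in the dump; ThreadLocal.withInitial requires Java 8 or later.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

final class CachedDateFormat {
    // One formatter per thread: SimpleDateFormat is not thread-safe, and creating
    // a new instance for every call repeats the locale data lookups seen in the dump.
    // The pattern is an assumption chosen for illustration.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

    // Formats the given date with the calling thread's cached formatter.
    static String format(Date date) {
        return FORMAT.get().format(date);
    }

    // Formats the current time.
    static String currentTime() {
        return format(new Date());
    }

    public static void main(String[] args) {
        System.out.println(currentTime());
    }
}
```

Whether this is the right fix for a given application should be confirmed by taking thread dumps again after the change.
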
• Profiling CPU usage / HP-UX
  – HP-UX: Glance + thread dump

[Screenshot: HP Glance. Press "G" for thread monitoring]

• Profiling CPU usage / HP-UX

Glance thread monitoring: CPU load of thread 15999 is 17.7%

Java thread dump:

"Application Manager Thread" prio=8 tid=0x002a6c00 nid=62 lwp_id=15999 waiting on monitor [0x64bce000..0x64bce4b8]
    at java.lang.Thread.sleep(Native Method)
    at weblogic.management.mbeans.custom.ApplicationManager$ApplicationPoller.run(ApplicationManager.java:1137)

Thread 15999 is working on weblogic.management.mbeans.custom.ApplicationManager (ApplicationManager.java:1137)

• Other tools
  – Profile with Java options
  – Analyze using HP Jmeter
  – JProbe
  – Stress test
    • LoadRunner
    • MS Stress (free)

• Related URLs
  – Java Thread: http://java.sun.com/docs/hotspot/threads/threads.htm
  – Java Performance: http://java.sun.com/docs/hotspot/PerformanceFAQ.html
  – Java Thread: http://www.javaworld.com/javaworld/jw-09-1998/jw-09-threads.html
  – Pick up performance with generational GC: http://www.javaworld.com/javaworld/jw-01-2002/jw-0111-hotspotgc.html
  – JVM 1.4 GC Tuning: http://java.sun.com/docs/hotspot/gc1.4.2/index.html
  – HP Jmeter, Jtune, Jconfig: http://www.hp.com/products1/unix/java/developers/index.html
  – SPECjvm98
  – SPECjAppServer2001/2002

Thank you