Large Data Volumes and Garbage Collection in Java
What is it all about?
• Automatic memory management, how it works
• Why the JVM needs Stop-the-World pauses
• Tuning GC in HotSpot JVM
Automatic memory management
Languages with automatic memory management
• Java, JavaScript, Erlang, Haskell, Python, PHP, C#, Ruby, Perl, Smalltalk, OCaml, Lisp, Scala, ML, Go, D, … and counting
Languages without automatic memory management
• C, C++, Pascal/Delphi, Objective-C
• Anything else, anyone?
How to manage memory?
Garbage – a data structure (object) in memory that is unreachable for the program.
How to find garbage?
• Reference counting
• Object graph traversal
• Do not collect garbage at all
Reference counting
+ Simple
+ No Stop-the-World pauses required
– Cannot collect cyclic graphs
– 15-30% CPU overhead
– Pretty bad for multi-core systems
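The cyclic-graph limitation of reference counting can be illustrated with a toy model. This is a minimal sketch, not how any real runtime is implemented; the names `RcNode` and `RefCountDemo` are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy reference-counted object (hypothetical, for illustration only). */
final class RcNode {
    final String name;
    int refCount = 0;
    boolean freed = false;
    final List<RcNode> fields = new ArrayList<>();

    RcNode(String name) { this.name = name; }

    /** Store a reference to target in this object: increments target's count. */
    void addRef(RcNode target) {
        fields.add(target);
        target.refCount++;
    }

    /** Drop one incoming reference; free the object when the count reaches zero. */
    static void release(RcNode node) {
        if (--node.refCount == 0) {
            node.freed = true;
            for (RcNode child : node.fields) {
                release(child);
            }
            node.fields.clear();
        }
    }
}

final class RefCountDemo {
    /** True if a two-object cycle stays unfreed after its last external reference is dropped. */
    static boolean cycleLeaks() {
        RcNode a = new RcNode("a");
        RcNode b = new RcNode("b");
        a.refCount++;       // external (root) reference to a
        a.addRef(b);        // a -> b
        b.addRef(a);        // b -> a, closing the cycle
        RcNode.release(a);  // drop the root reference; a's count is still 1
        return !a.freed && !b.freed;
    }

    /** True if an acyclic chain a -> b is fully freed once the root reference is dropped. */
    static boolean chainIsFreed() {
        RcNode a = new RcNode("a");
        RcNode b = new RcNode("b");
        a.refCount++;
        a.addRef(b);
        RcNode.release(a);
        return a.freed && b.freed;
    }
}
```

In the cycle case both counters stay at 1 forever, although nothing outside the cycle can reach either object.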
Object graph traversal
• Roots
  – Static fields
  – Local variables (stack frames)
• Reachable objects – alive
• Unreachable objects – garbage
In general, the graph should not be mutated during traversal. As a consequence, the application has to be frozen while the runtime is collecting garbage.
Garbage collection: Algorithms
Copy collection
• Traverse the object graph and copy reachable objects to another space
• Mark the old space as free
Mark / Sweep
• Traverse the object graph and mark reachable objects
• Scan (sweep) the whole memory and “free” unmarked objects
Mark / Sweep / Compact
• … mark … sweep …
• Relocate live objects to defragment free space
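The Mark / Sweep steps can be sketched on a toy heap. This is an illustrative model only: `ToyHeap` and `Obj` are hypothetical names, and the "heap" is a plain Java list rather than real memory.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy heap with a mark phase and a sweep phase (hypothetical, for illustration). */
final class ToyHeap {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    final List<Obj> allObjects = new ArrayList<>(); // everything allocated and not yet swept
    final List<Obj> roots = new ArrayList<>();

    Obj alloc(String name) {
        Obj o = new Obj(name);
        allObjects.add(o);
        return o;
    }

    /** Mark: traverse the graph from the roots; only reachable objects are visited. */
    Set<Obj> mark() {
        Set<Obj> marked = new HashSet<>();
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (marked.add(o)) {
                pending.addAll(o.refs);
            }
        }
        return marked;
    }

    /** Sweep: scan the whole heap and drop unmarked objects; returns the number reclaimed. */
    int sweep(Set<Obj> marked) {
        int before = allObjects.size();
        allObjects.retainAll(marked);
        return before - allObjects.size();
    }
}
```

Note that the mark phase touches only live objects, while the sweep phase walks every object in the heap.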
Garbage collection: Economics
S – whole heap size
L – size of live objects
Copy collection throughput: $c \cdot \frac{S - L}{L}$
Mark / Sweep throughput: $c_1 \cdot \frac{S - L}{L} + c_2 \cdot \frac{S - L}{S}$
For all algorithms based on reference reachability, GC efficiency is inversely proportional to the amount of live objects.
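A small worked example of the inverse proportion. It assumes efficiency is measured as garbage reclaimed per unit of live data visited, (S − L) / L, with the constant factor left out; the class and method names are hypothetical.

```java
/** Back-of-the-envelope GC efficiency, assuming efficiency ~ (S - L) / L. */
final class GcEconomics {
    /**
     * @param heapSize S, whole heap size (any unit)
     * @param liveSize L, size of live objects (same unit)
     * @return garbage reclaimed per unit of live data traversed
     */
    static double relativeEfficiency(double heapSize, double liveSize) {
        if (liveSize <= 0 || liveSize > heapSize) {
            throw new IllegalArgumentException("need 0 < L <= S");
        }
        return (heapSize - liveSize) / liveSize;
    }
}
```

With a 10 GiB heap, 1 GiB of live data gives a ratio of 9, while 5 GiB of live data gives a ratio of 1: the same collector does nine times less useful work per byte traversed.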
Garbage collection: Generational approach
[Figure: Heap demography. Death rate (byte/sec) plotted against object age (axis extending to ∞), with the young GC period, the old GC period, and the total amount of garbage marked.]
WHAT DO WE HAVE IN THE TOOLBOX?
Garbage collection: Terms dictionary
• Stop-the-world (STW) pause – a pause of all application threads
• Compacting algorithms – can move objects in memory to defragment free space
• Parallel collection – uses multiple cores to reduce STW time
• Concurrent collection – collection algorithms that work in parallel with application threads
Garbage collection: Throughput vs low latency
Throughput algorithms
– minimize the total time of program execution
– economically efficient CPU utilization
Low pause algorithms
– minimize the time of an individual STW pause
– may use background (concurrent) collection
– may use incremental collection
Oracle HotSpot JVM
Throughput algorithms
• Parallel GC (-XX:+UseParallelOldGC)
  Young: Copy collector
  Old: Parallel Mark Sweep Compact
Low pause algorithms
• Concurrent Mark Sweep (-XX:+UseConcMarkSweepGC)
  Young: Copy collector
  Old: Mark Sweep
  – not compacting (prone to fragmentation)
  – most work is done in the background
  – young collections are STW
Oracle HotSpot JVM
Low pause algorithms
• Garbage First – G1 (-XX:+UseG1GC)
  Young: Copy collector
  Old: Incremental copy collector
  – incremental – more STW pauses, but shorter ones
  – collects regions with more garbage first
  – compacting, but had problems with large objects
G1 – the algorithm of the future, hopefully not forever
– bad throughput
– pauses are normally twice as long as with CMS
Garbage collection: Generational approach
Young space collection
• High throughput
• Low memory utilization
Promotion
• Eden (nursery) -> Survivor (keep) space -> Old space
Old space collection
• Better memory utilization
• Orders of magnitude lower throughput
Memory barrier
• JVM “tracks” references from old to young space
Oracle’s HotSpot JVM
Default (serial) collector
• Young: Serial copy collector, Old: serial MSC
Parallel scavenge / Parallel old GC
• Young: Parallel copy collector, Old: serial MSC or parallel MSC
Concurrent mark sweep (CMS)
• Young: Serial or parallel copy collector, Old: concurrent mark sweep
G1 (garbage first)
• Young: Copy collector (region based), Old: Incremental MSC
http://blog.ragozin.info/2011/07/hotspot-jvm-garbage-collection-options.html
Oracle’s HotSpot JVM
Young collector                | Old collector                           | JVM option
Serial (DefNew)                | Serial Mark-Sweep-Compact               | -XX:+UseSerialGC
Parallel scavenge (PSYoungGen) | Serial Mark-Sweep-Compact (PSOldGen)    | -XX:+UseParallelGC
Parallel scavenge (PSYoungGen) | Parallel Mark-Sweep-Compact (ParOldGen) | -XX:+UseParallelOldGC
Serial (DefNew)                | Concurrent Mark Sweep                   | -XX:+UseConcMarkSweepGC -XX:-UseParNewGC
Parallel (ParNew)              | Concurrent Mark Sweep                   | -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
G1                             | G1                                      | -XX:+UseG1GC
http://blog.ragozin.info/2011/09/hotspot-jvm-garbage-collection-options.html
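To check which collector pair a running JVM actually selected, one option is the standard `java.lang.management` API. This is a small sketch; the class name `ActiveCollectors` is hypothetical, and the bean names printed depend on the JVM vendor, version, and the options in effect.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Lists the garbage collectors registered in the current JVM. */
final class ActiveCollectors {
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(gc.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated time (ms) since JVM start
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Running it with different `-XX:+Use…GC` options is a quick way to confirm that an option took effect.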
Oracle’s JRockit JVM
-Xgc: option              | Generational | Mark       | Sweep/Compact
genconcon or gencon       | Yes          | concurrent | incremental
singleconcon or singlecon | No           | concurrent | incremental
genconpar                 | Yes          | concurrent | parallel
singleconpar              | No           | concurrent | parallel
genparpar or genpar       | Yes          | parallel   | parallel
singleparpar or singlepar | No           | parallel   | parallel
genparcon                 | Yes          | parallel   | incremental
singleparcon              | No           | parallel   | incremental
http://blog.ragozin.info/2011/07/jrockit-gc-in-action.html
Azul Zing JVM
• Generational GC
• Young – Concurrent mark sweep compact (MSC)
• Old – Concurrent mark sweep compact (MSC)
Azul Zing can relocate objects in memory without an STW pause. The secret is a read barrier.
Requires special Linux kernel modules to run.
JVM HEAP SIZE AND PAUSES
Concurrent Mark Sweep
Initial mark – Stop-The-World
• Collect root references (thread stacks) and mark them gray
Concurrent mark – concurrent
• Do three color marking until the grays are exhausted
• Mark all black objects on dirty regions as gray (by card table)
• Repeat
Remark – Stop-The-World
• Final remark
Sweep – concurrent
• Scan the heap and reclaim white objects
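The three color marking mentioned in the phases above can be sketched as a single-threaded loop. This is a simplified illustration with hypothetical names (`TriColor`, `Node`); it leaves out everything that makes CMS concurrent, including re-graying via the card table and the remark pause.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Single-threaded three color marking on a toy object graph (illustrative only). */
final class TriColor {
    enum Color { WHITE, GRAY, BLACK }

    static final class Node {
        Color color = Color.WHITE;
        final List<Node> refs = new ArrayList<>();
    }

    /** Marks everything reachable from the roots; returns how many nodes ended up BLACK. */
    static int mark(List<Node> roots) {
        Deque<Node> gray = new ArrayDeque<>();
        for (Node r : roots) {              // "initial mark": roots become gray
            if (r.color == Color.WHITE) {
                r.color = Color.GRAY;
                gray.push(r);
            }
        }
        int black = 0;
        while (!gray.isEmpty()) {           // run until the grays are exhausted
            Node n = gray.pop();
            for (Node child : n.refs) {
                if (child.color == Color.WHITE) {
                    child.color = Color.GRAY;
                    gray.push(child);
                }
            }
            n.color = Color.BLACK;          // all outgoing references have been visited
            black++;
        }
        return black;
    }
}
```

Objects still WHITE after marking are the ones a sweep phase would reclaim.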
Cost structure of pauses (CMS): Summary of pauses
MOVING OUT OF HEAP
Direct memory buffers
java.nio.ByteBuffer.allocateDirect()
Pro
• Memory is allocated out of heap
• Memory is deallocated when the ByteBuffer is collected
• Cross-platform, native Java
Con
• Fragmentation of non-heap memory
• Memory is deallocated only when the ByteBuffer is collected
• Complicated multi-threaded programming
• Limited by -XX:MaxDirectMemorySize=<value>
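A minimal sketch of storing data in a direct buffer with the standard `java.nio.ByteBuffer` API. The class name `OffHeapExample` and the 1024-byte size are arbitrary choices for the example.

```java
import java.nio.ByteBuffer;

/** Writes and reads a value through an off-heap (direct) buffer. */
final class OffHeapExample {
    /** Stores a long at offset 0 of a freshly allocated direct buffer and reads it back. */
    static long roundTrip(long value) {
        ByteBuffer buf = ByteBuffer.allocateDirect(1024); // memory lives outside the Java heap
        buf.putLong(0, value);                            // absolute put, position is unchanged
        return buf.getLong(0);
    }

    /** True if the JVM really handed out a direct buffer. */
    static boolean isOffHeap() {
        return ByteBuffer.allocateDirect(16).isDirect();
    }
}
```

Only the small `ByteBuffer` wrapper object lives on the heap; the payload itself is not scanned or copied by the collector, but it is released only after that wrapper is collected.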
RTSJ
Scoped memory
• Objects can be allocated in chosen memory areas
• Scoped and immortal areas are not garbage collected
• Scoped areas can be released as a whole area
• Cross references between areas are limited, and this limitation is enforced at run time
Unsafe Java
sun.misc.Unsafe
• Unsafe.allocateMemory(…)
• Unsafe.reallocateMemory(…)
• Unsafe.freeMemory(…)
YOUNG COLLECTION
Memory spaces in HotSpot JVM
Memory geometry
• Young space: -XX:NewSize=<n> -XX:MaxNewSize=<n>
• Survivor space: Young space / (-XX:SurvivorRatio=<n> + 2)
• Young + old space: -Xms<n> -Xmx<n>
• Permanent space: -XX:PermSize=<n> -XX:MaxPermSize=<n>
Heap layout: Eden | Survivor 1 | Survivor 2 | Tenured; Permanent
* G1 has the same set of spaces, but they are dynamic sets of regions rather than contiguous address ranges
How does young collection work?
Collect root references
• Stack frame references
• References from other spaces (tenured + permanent): does it mean scanning old space?
Traverse the object graph
• Visit only live objects
• Copy live objects to another region of young space or to old space
Consider the whole Eden and the old survivor space to be free memory
A write barrier is required to effectively collect references from old to young space.
How young collection works: Collecting root references
[Figure: heap layout Eden | S1 | S2 | Tenured, with dirty cards marked]
Collect roots for young GC
• Scan thread stacks
• Scan dirty pages in old space
Card marking barrier
Each 512 bytes of heap is associated with a flag (card). Once a reference is written to memory, the associated card is marked dirty.
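The card marking barrier can be sketched as a byte array indexed by address shifted right by 9 bits (512 = 2^9). This is a toy model with hypothetical names; it uses 1 for "dirty" as an arbitrary encoding, and the real barrier is emitted by the JIT rather than being a Java method call.

```java
import java.util.Arrays;

/** Toy card table: one byte per 512 bytes of heap (illustrative only). */
final class CardTable {
    static final int CARD_SHIFT = 9; // 2^9 = 512 bytes per card
    private final byte[] cards;

    CardTable(long heapBytes) {
        // Round up so a partially filled last card is still covered
        cards = new byte[(int) ((heapBytes + 511) >>> CARD_SHIFT)];
    }

    /** Write barrier: called whenever a reference field at heapOffset is written. */
    void onReferenceWrite(long heapOffset) {
        cards[(int) (heapOffset >>> CARD_SHIFT)] = 1;
    }

    boolean isDirty(long heapOffset) {
        return cards[(int) (heapOffset >>> CARD_SHIFT)] != 0;
    }

    /** Number of dirty cards, that is, 512-byte areas the young collector would have to scan. */
    int dirtyCount() {
        int n = 0;
        for (byte c : cards) {
            if (c != 0) n++;
        }
        return n;
    }

    /** Reset all cards to clean. */
    void clear() {
        Arrays.fill(cards, (byte) 0);
    }
}
```

A young collection then only needs to scan the old-space areas whose cards are dirty instead of the whole old space.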
How young collection works: Copying live objects
[Figure: heap layout Eden | S1 | S2 | Tenured, with dirty cards marked]
• Collect roots for young GC
• Clean cards
• Recursive copy of live objects (only live objects are traversed)
The card table is reset just before the copy collector starts to move objects.
How young collection works: Collection finished
[Figure: heap layout Eden | S1 | S2 | Tenured, with dirty cards marked]
Since every object in young space has been relocated, a clean card means that there are no references to young space in that particular 512 bytes of heap.
Thread local allocation blocks: TLAB in HotSpot JVM
• Each thread preallocates a block in Eden
• The thread allocates new objects in its TLAB
• When the TLAB is full, a new TLAB is allocated
• If an object does not fit in a TLAB
  • Allocate in Eden space
  • If it does not fit in Eden (or exceeds -XX:PretenureSizeThreshold)
    • Allocate in old space
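The TLAB allocation path above can be sketched as bump-pointer allocation. This is a toy model under stated assumptions: `ToyEden` and `ToyTlab` are hypothetical, addresses are plain offsets, the remainder of a retired TLAB is simply wasted, and the fall-through to old space is represented by returning -1.

```java
/** Shared Eden space with a synchronized bump pointer (illustrative only). */
final class ToyEden {
    private final long capacity;
    private long top = 0;

    ToyEden(long capacity) { this.capacity = capacity; }

    /** Returns the offset of the allocated block, or -1 if Eden is exhausted. */
    synchronized long allocate(long size) {
        if (top + size > capacity) return -1;
        long addr = top;
        top += size;
        return addr;
    }
}

/** Per-thread allocation block carved out of Eden; allocation inside it needs no locking. */
final class ToyTlab {
    private final ToyEden eden;
    private final long tlabSize;
    private long top = 0;
    private long end = 0; // top == end means "no usable TLAB yet"

    ToyTlab(ToyEden eden, long tlabSize) {
        this.eden = eden;
        this.tlabSize = tlabSize;
    }

    /** Returns the offset of the new object in Eden, or -1 if Eden cannot satisfy the request. */
    long allocate(long size) {
        if (size > tlabSize) {
            return eden.allocate(size);       // does not fit in any TLAB: go to Eden directly
        }
        if (top + size > end) {               // current TLAB is full: request a new one
            long fresh = eden.allocate(tlabSize);
            if (fresh < 0) return -1;
            top = fresh;
            end = fresh + tlabSize;
        }
        long addr = top;                      // bump the thread-local pointer
        top += size;
        return addr;
    }
}
```

The common case is two comparisons and an addition on thread-local state, which is why allocation of small objects is cheap.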
Young collection stop-the-world
Total STW time
• Collect roots
  – Scan thread stacks
  – Scan dirty cards
    Read card table ~ $S_{heap}$
    Scan pages marked as dirty ~ …
• Copy live objects
• Process special references
* You can use -XX:+PrintGCTaskTimeStamps to analyze the time of individual phases
* You can use -XX:+PrintReferenceGC to analyze reference processing times
OLD SPACE COLLECTION
HotSpot: Old space collection
Stop-the-World Mark-Sweep-Compact
• Single threaded
• Multithreaded
Concurrent Mark Sweep (CMS)
• Background collection of old space
G1 (Garbage First)
• Incremental Stop-the-World collection
HotSpot: Old space collection – Concurrent Mark Sweep
HotSpot’s CMS (Concurrent Mark Sweep)
• Does not compact
• Prone to fragmentation
• Uses separate free lists for each object size
• Uses statistics to manage fragmentation
• Introduces 2 short STW phases
HotSpot: Old space collection – Incremental collection
HotSpot’s G1
• Space is divided into regions
• Regions can be collected individually
• Write barrier tracks references between regions
• A subset of regions is collected during an STW pause
  Live objects are “evacuated” to other regions
• Young collections – all Eden regions are collected
• Partial collection – a few old regions are collected
• Global marking is used to estimate the live population
Concurrent Mark Sweep: Three color marking
[Figure sequence (4 slides): three color marking of an object graph starting from the roots]
Concurrent Marking Artifacts: SATB barrier example
[Figure sequence (7 slides): objects A, B, C, D, the current GC position, and a reference queue that holds B and D and ends up empty]
Patching OpenJDK: Serial collector gain
http://aragozin.blogspot.com/2011/07/openjdk-patch-cutting-down-gc-pause.html
Patching OpenJDK: CMS collector gain
http://aragozin.blogspot.com/2011/07/openjdk-patch-cutting-down-gc-pause.html
Concurrent Mark Sweep: Full GC
Concurrent mode failure
If the background collection cannot free memory fast enough, CMS will perform a Stop-The-World single-threaded Mark-Sweep-Compact.
Promotion failure
Due to fragmentation, old space may not have a contiguous block of memory to accommodate a promoted object even if free space is available. CMS will perform a Stop-The-World single-threaded Mark-Sweep-Compact to defragment memory.
TUNING AND TROUBLESHOOTING
Common reasons for long STW
[Times: user=0.53 sys=0.06, real=0.15 secs]
• Full GC
• OS swapping
• Too many survivors in young space
• Long reference processing
• JNI delays
• Long CMS initial mark / remark
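When hunting for long pauses it can help to pull the `[Times: …]` entries out of a GC log and compare CPU time with wall-clock time. A minimal sketch, assuming the log line has exactly the shape shown above; the class `GcTimes` and its method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Parses a "[Times: user=… sys=…, real=… secs]" entry from a GC log line. */
final class GcTimes {
    private static final Pattern TIMES = Pattern.compile(
            "\\[Times: user=([0-9.]+) sys=([0-9.]+), real=([0-9.]+) secs\\]");

    final double user;
    final double sys;
    final double real;

    private GcTimes(double user, double sys, double real) {
        this.user = user;
        this.sys = sys;
        this.real = real;
    }

    static GcTimes parse(String line) {
        Matcher m = TIMES.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("no [Times: ...] entry in: " + line);
        }
        return new GcTimes(
                Double.parseDouble(m.group(1)),
                Double.parseDouble(m.group(2)),
                Double.parseDouble(m.group(3)));
    }

    /** CPU seconds consumed per wall-clock second of the pause. */
    double cpuToWallRatio() {
        return (user + sys) / real;
    }
}
```

A ratio well above 1 suggests several GC threads working in parallel. A ratio well below 1 means the pause spent most of its time waiting rather than computing, which may point at OS-level causes such as swapping.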
CMS check list
• jdk6u22 - jdk6u26 – broken free lists logic
• -XX:CMSWaitDuration=…
• -XX:+CMSScavengeBeforeRemark
• -XX:-CMSConcurrentMTEnabled
• Consider CMS for permanent space
• Size your heap: -Xmn / -Xms / -Xmx
  Expected data + young space + CMS overhead
  CMS overhead ~30% of expected data
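The sizing rule can be turned into quick arithmetic. A sketch only: the 30% figure is the rule of thumb from the check list, and the class and method names are hypothetical.

```java
/** Heap size estimate for CMS: expected data + young space + CMS overhead. */
final class CmsHeapSizing {
    static final double CMS_OVERHEAD_FRACTION = 0.30; // "~30% of expected data"

    /** All sizes in megabytes. */
    static long suggestedHeapMb(long expectedDataMb, long youngSpaceMb) {
        long overheadMb = Math.round(expectedDataMb * CMS_OVERHEAD_FRACTION);
        return expectedDataMb + youngSpaceMb + overheadMb;
    }
}
```

For example, 10000 MB of expected data with a 1000 MB young space gives 10000 + 1000 + 3000 = 14000 MB as a starting point for -Xms / -Xmx, with -Xmn set to the young space size.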
Tuning young collection
Eden size
• too small – frequent YGC, objects are promoted to old space early
• too large – more long-lived objects need to be copied
Survivor space size
• too small – overflow, objects are prematurely promoted
• too large – memory is wasted
Tenuring threshold
• higher – objects are kept in young space for longer
• higher – more objects in young space, more copy time
Tuning young collection
Eden size
• -XX:MaxNewSize=<n> -XX:NewSize=<n>
• Eden size = new size – 2 * survivor space size
Survivor space size
• -XX:SurvivorRatio=<n>
• Survivor space size = new size / (survivor ratio + 2)
Tenuring threshold
• -XX:MaxTenuringThreshold=<n>
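A quick way to sanity-check young space geometry before setting the options. This sketch assumes HotSpot's documented meaning of -XX:SurvivorRatio as the ratio of Eden to one survivor space, so each survivor space is new size / (ratio + 2); the JVM also aligns sizes internally, so real values may differ slightly. The class name is hypothetical.

```java
/** Approximate Eden and survivor sizes for a given new size and survivor ratio. */
final class YoungGeometry {
    final long survivorBytes;
    final long edenBytes;

    private YoungGeometry(long survivorBytes, long edenBytes) {
        this.survivorBytes = survivorBytes;
        this.edenBytes = edenBytes;
    }

    /**
     * @param newSizeBytes  value of -XX:NewSize / -XX:MaxNewSize
     * @param survivorRatio value of -XX:SurvivorRatio (Eden : one survivor space)
     */
    static YoungGeometry of(long newSizeBytes, int survivorRatio) {
        if (newSizeBytes <= 0 || survivorRatio <= 0) {
            throw new IllegalArgumentException("sizes and ratio must be positive");
        }
        long survivor = newSizeBytes / (survivorRatio + 2);
        long eden = newSizeBytes - 2 * survivor; // Eden = new size - 2 * survivor space size
        return new YoungGeometry(survivor, eden);
    }
}
```

With a new size of 1000 MB and a survivor ratio of 8, each survivor space gets about 100 MB and Eden about 800 MB.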
Tuning young collection
Small heap sizes
• Balance tenuring threshold / survivor space to keep objects in the limited young space for longer
Large heap sizes (4 GB and greater)
• Limit the tenuring threshold to avoid an increase in copy time
• Limit survivor space to avoid accidental long young collections
• Increase Eden size instead of increasing the tenuring threshold
Tuning young collection
• GC tuning is based on the application's allocation pattern
• If the application's allocation pattern changes, you are in trouble
• In practice, applications always have different “modes of operation”
• GC tuning is choosing the lesser evil
Diagnostics
Surviving with huge heap
• CMS is very good in terms of pauses
  You can reliably keep pauses under 50–150 ms on 30–50 GiB heaps
• Fragmentation threat
  Not a big deal for server-type applications
  XML processing is a GC disaster
• Very narrow GC comfort zone
  If you tune for the “long run”, you are likely to have pauses during initial loads / bulk refreshes