VMIL keynote : Lessons from a production JVM runtime developer
The good, the good enough, and the things we wish we had done better
Lessons from a Production JVM Runtime Developer
Mark Stoodley, “Production JVM (J9) Runtime Developer” at IBM
Project co-lead for Eclipse OMR
Important disclaimers
• THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY.
• WHILST EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED.
• ALL PERFORMANCE DATA INCLUDED IN THIS PRESENTATION HAVE BEEN GATHERED IN A CONTROLLED ENVIRONMENT. YOUR OWN TEST RESULTS MAY VARY BASED ON HARDWARE, SOFTWARE OR INFRASTRUCTURE DIFFERENCES.
• ALL DATA INCLUDED IN THIS PRESENTATION ARE MEANT TO BE USED ONLY AS A GUIDE.
• IN ADDITION, THE INFORMATION CONTAINED IN THIS PRESENTATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM, WITHOUT NOTICE.
• IBM AND ITS AFFILIATED COMPANIES SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION.
• NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF:
– CREATING ANY WARRANT OR REPRESENTATION FROM IBM, ITS AFFILIATED COMPANIES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS
Some Production Runtime Characteristics
Complete implementation
Robust quality even under high stress
Scalable performance
Reliable and stable service
Wide variety of deployed workloads
Sounds pretty constrained
It is, but it’s still possible to innovate
J9: Production JVM for ~18 years, still vibrant
• Java (SE) releases: Java 1.4.2, 5.0, 6, 6.1, 7, 7.1, 8, Java.next
• Some technology highlights:
– Cooperative suspend (1999)
– Diagnostic abilities: e.g. limit files, per method options (1999)
– Full optimization while supporting type accurate GC (1999)
– AOT (rom-able) compilation for Java (1999)
– Adaptive compilation (cold, warm, hot, very hot, scorching) (1999)
– Aggressive runtime native code patching (2000)
– Invocation and time-based compilation triggers (2000)
– JIT profiling infrastructure and optimizations (2001)
– Speculative class hierarchy based inlining and optimization (2001)
– Fairly complete set of classical compiler optimizations and dataflow analyses (2001)
– Java-specific optimizations like “check” removal (2001)
– Java debug support (2001)
– Escape analysis and stack allocation (2001)
– Automatic lock coarsening (2002)
– Multiple code caches (2005)
– Real-time Specification for Java (AOT and JIT) (2005)
– Shared classes (2005)
– Asynchronous compilation (2006)
– Interpreter profiling (2006)
– Dynamic AOT compilation for Java (2006)
– Hot Code Replacement support (2007)
– Compressed references (2007)
– Multiple compilation threads (2010)
– On stack replacement (2013)
– Transactional Memory (2013)
– Packed objects (2013)
– Multitenancy (2013)
– Auto SIMD (2014)
– Auto GPU (2014)
– Heuristic tuning and retuning (1999 – ongoing)
• Performance metrics that have been or are actively tracked:
– Latency (elapsed time)
– Throughput (operations/sec)
– Start-up time
– Ramp-up time
– CPU consumption
– Resource consumption at idle
– Compilation time
– Memory footprint
– JIT library size
– Incremental pauses
• Diagnostic facilities
– Direct Dump Reader, Snap files, verbose trace files, -Xtrace, -Xdump, …
– JIT logs, JIT limit files, per-method JIT option sets
– Core analyzer tool
– Health Center, GC Memory Visualizer, Memory Analysis tool, …
• Hardware platforms that are or have been supported:
– ME: ARM32, X86 (IA32), MIPS, POWER, SH4
– 32-bit SE: ARM, POWER, X86, Z
– 64-bit SE: POWER, X86, Z
– Hard real-time (RTSJ compliant): IA32
• Hardware exploitation highlights:
– Efficient CPU instruction sequences
– Managing different kinds of hardware registers
– Exploiting hardware data type support
– Cryptographic, compression acceleration
– Character conversion loop recognition and acceleration
– Atomic locking and other synchronization optimization
– Simultaneous Multi Threading
– Transactional Memory
– SIMD (Single instruction multiple data)
– GPU (Graphics processing unit)
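The adaptive compilation levels and invocation-based triggers named in the technology highlights can be pictured with a small counter-driven sketch. This is only an illustration: the `MethodInfo` struct, the `record_invocation` function, and the threshold values are hypothetical and are not taken from J9, whose real heuristics also use time-based sampling and ongoing tuning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Optimization levels as named on the slide, plus an interpreted start state. */
typedef enum { INTERPRETED, COLD, WARM, HOT, VERY_HOT, SCORCHING } OptLevel;

/* Hypothetical per-method bookkeeping; not J9's real data structure. */
typedef struct {
    uint32_t invocations;
    OptLevel level;
} MethodInfo;

/* Illustrative thresholds only: kThreshold[level] is the invocation count
 * at which a method leaves that level for the next one. */
static const uint32_t kThreshold[] = { 10, 100, 1000, 10000, 100000 };

/* Count one invocation and return the level the method should now run at.
 * A real runtime would queue a (re)compilation when the level changes. */
static OptLevel record_invocation(MethodInfo *m)
{
    assert(m != NULL);
    m->invocations++;
    while (m->level < SCORCHING && m->invocations >= kThreshold[m->level]) {
        m->level = (OptLevel)(m->level + 1);
    }
    return m->level;
}
```

With these made-up thresholds a method stays interpreted for its first nine calls, is promoted to cold on the tenth, and reaches warm on the hundredth.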
Production Runtime: J9 JVM
[Diagram: J9 JVM architecture. Labels shown: Operating system; Native applications; OS-specific calls; Calls to C libraries; Virtual machine; Garbage collector; Interpreter; Exception handler; Class loader; Thread model; Pluggable components that dynamically load into the virtual machine; JVM Profiler; Debugger; TR JIT; Port Library (file IO, sockets, memory allocation); VM Interface; JCL natives; Zip, fdlibm; JNI; Java calls; JNI, INL, Fastcall; Java VM Classes; Uses one of many Java platform configurations: SE 8, SE 7, SE 6, SE 5, CDC, MIDP, CLDC]
Two open source projects from J9 JVM!
[Diagram: runtime stacks. Labels shown: OpenJDK; HotSpot; Eclipse OMR; OpenJ9; OMR; IBM SDK for Java; + IBMisms; GC; JIT; Diag; Port]
Proven adaptable technology in the open for rapid innovation and collaboration across multiple language communities
Java community open innovation and collaboration, deep platform exploitation for X86 & IBM hardware platforms (OpenPOWER, LinuxONE)
Long term support, quick response for problems, and other forms of IBM customer specific engagement
Communities Beyond Java: Eclipse OMR
[Diagram: language runtimes layered on OMR. Labels shown: Ruby?; Python?; SOM?; COBOL; PL/I; Emulator; Invent Your Own Language!]
Eclipse OMR Mission
Build an open reusable language runtime foundation for the cloud
• To accelerate cloud platform advancement and innovation
• In full cooperation with existing language communities
• Via a diverse community of people interested in language runtimes
– Professional developers
– Researchers
– Students
– Hobbyists
Eclipse OMR technology components
port          platform abstraction (porting) library
thread        cross platform pthread-like threading library
vm            APIs to manage per-interpreter and per-thread contexts
gc            garbage collection framework for managed heaps
compiler      extensible compiler framework
jitbuilder    WIP project to simplify bring up for a new JIT compiler
omrtrace      library for publishing trace events for monitoring/diagnostics
omrsigcompat  signal handling compatibility library
example       demonstration code to show how a language runtime might consume OMR components, also used for testing
fvtest        language independent test framework built on the example glue so that components can be tested outside of a language runtime, uses Google Test 1.7 framework
+ a few others
~800 KLOC at this point, more components coming!
Lessons (finally!)
Lesson #1: Build a platform port and thread libraries
(keep the “ifdef soup” in one place)
Easiest if you start with more than one platform
OMR platform porting library: omr/port
• Cross platform (Linux, OSX, Windows, AIX, zOS, etc.) “thin” wrap for:
– Time, Process, Memory: allocate, free, reserve, pages, numa
– CPU, Environment, Files and Permissions, Shared libraries
– Terminal (tty), Strings, Locales, Signals
– Etc.
• Port library actually a struct containing many function pointers, e.g.
    uintptr_t (*time_hires_clock)(struct OMRPortLibrary *portLibrary);
    uintptr_t (*sysinfo_get_pid)(struct OMRPortLibrary *portLibrary);
    intptr_t (*sysinfo_get_CPU_utilization)(struct OMRPortLibrary *portLibrary, struct J9SysinfoCPUTime *cpuTime);
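To make the "struct of function pointers" idea concrete, here is a miniature sketch in the same shape. `MiniPortLibrary`, `mini_port_init`, and the two `mini_*` implementations are hypothetical names invented for this example; the real OMRPortLibrary has many more entries and different implementations. Note how the only platform `#if` lives inside the port layer, which is the point of Lesson #1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#if defined(_WIN32)
#include <process.h>   /* _getpid */
#else
#include <unistd.h>    /* getpid */
#endif

/* Hypothetical miniature port library: a table of function pointers. */
struct MiniPortLibrary {
    uintptr_t (*time_hires_clock)(struct MiniPortLibrary *portLibrary);
    uintptr_t (*sysinfo_get_pid)(struct MiniPortLibrary *portLibrary);
};

/* Stand-in clock: processor time in CLOCKS_PER_SEC ticks (standard C). */
static uintptr_t mini_time_hires_clock(struct MiniPortLibrary *portLibrary)
{
    (void)portLibrary;
    return (uintptr_t)clock();
}

/* The platform difference is hidden here, not in the callers. */
static uintptr_t mini_sysinfo_get_pid(struct MiniPortLibrary *portLibrary)
{
    (void)portLibrary;
#if defined(_WIN32)
    return (uintptr_t)_getpid();
#else
    return (uintptr_t)getpid();
#endif
}

/* Fill in the table; callers only ever go through the pointers. */
static void mini_port_init(struct MiniPortLibrary *portLibrary)
{
    assert(portLibrary != NULL);
    portLibrary->time_hires_clock = mini_time_hires_clock;
    portLibrary->sysinfo_get_pid = mini_sysinfo_get_pid;
}
```

A caller would write `port.sysinfo_get_pid(&port)` and never see which operating system call sits behind it.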
OMR thread library: omr/thread
• Cross platform (Linux, OSX, Windows, AIX, zOS, etc.) library for threads and synchronization
• E.g. semaphores, mutexes, threads, policies, priorities, interrupts, thread local storage, CPU consumption, affinity, sleeping, waiting, fork where supported, etc.
• Very similar to pthreads, e.g.
    intptr_t omrthread_create(omrthread_t *handle, uintptr_t stacksize,
        uintptr_t priority, uintptr_t suspend, omrthread_entrypoint_t entrypoint, void *entryarg);
• Also has 3-tier spin lock from Java: fast for short hold times
– Inner: atomic cmpxchg, middle: spin without yielding, outer: yield, fail: block
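The tiers of the spin lock (compare-and-exchange, spin without yielding, yield, then block) can be sketched with C11 atomics and POSIX threads. This is not the omrthread implementation: `TieredLock`, its functions, and the spin counts are assumptions made for illustration, and the blocking tier here simply uses a mutex and condition variable.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Illustrative spin counts; real values would be tuned per platform. */
enum { OUTER_SPINS = 8, MIDDLE_SPINS = 32, BUSY_WAIT_ITERS = 64 };

typedef struct {
    atomic_int held;      /* 0 = free, 1 = held */
    atomic_int waiters;   /* threads currently in the blocking tier */
    pthread_mutex_t mu;   /* used only by the blocking tier */
    pthread_cond_t cv;
} TieredLock;

static void tiered_lock_init(TieredLock *l)
{
    atomic_init(&l->held, 0);
    atomic_init(&l->waiters, 0);
    pthread_mutex_init(&l->mu, NULL);
    pthread_cond_init(&l->cv, NULL);
}

/* Inner tier: a single atomic compare-and-exchange attempt. */
static int try_acquire(TieredLock *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&l->held, &expected, 1);
}

static void tiered_lock_acquire(TieredLock *l)
{
    assert(l != NULL);
    for (int outer = 0; outer < OUTER_SPINS; outer++) {
        for (int middle = 0; middle < MIDDLE_SPINS; middle++) {
            if (try_acquire(l)) return;
            /* Middle tier: brief busy-wait without yielding the CPU. */
            for (volatile int i = 0; i < BUSY_WAIT_ITERS; i++) { }
        }
        sched_yield();    /* Outer tier: let another thread run. */
    }
    /* All spinning failed: block until a release wakes us. */
    pthread_mutex_lock(&l->mu);
    atomic_fetch_add(&l->waiters, 1);
    while (!try_acquire(l)) {
        pthread_cond_wait(&l->cv, &l->mu);
    }
    atomic_fetch_sub(&l->waiters, 1);
    pthread_mutex_unlock(&l->mu);
}

static void tiered_lock_release(TieredLock *l)
{
    atomic_store(&l->held, 0);
    /* Only pay for the mutex when someone is actually blocked. */
    if (atomic_load(&l->waiters) > 0) {
        pthread_mutex_lock(&l->mu);
        pthread_cond_signal(&l->cv);
        pthread_mutex_unlock(&l->mu);
    }
}
```

For short hold times the lock is normally won in the inner or middle tier, so the uncontended and lightly contended paths never touch the mutex.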
Lesson #2: Create a fast event pub/sub framework
e.g. JIT can listen for class (un)load and class loader events
e.g. build verbose GC functionality on top of it
OMR event hooks: omr/util/hookable
• Create event descriptions (including parameters) in XML
• Hookgen tool generates headers automatically
• Easy to trigger an event:
    TRIGGER_J9HOOK_MM_OMR_GC_CYCLE_START(
        _extensions->omrHookInterface,
        env->getOmrVMThread(),
        omrtime_hires_clock(),
        J9HOOK_MM_OMR_GC_CYCLE_START,
        /* other args */);
• Easy to register an event hook (e.g. from GC verbose output logger):
    (*_mmPrivateHooks)->J9HookRegister(_mmPrivateHooks,
        J9HOOK_MM_PRIVATE_SYSTEM_GC_START, verboseHandlerSystemGCStart, /* user data args */);
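The register/trigger pattern behind the event hooks can be sketched in a few dozen lines of C. Everything named here (`HookInterface`, `hook_register`, `hook_trigger`, the event ids, and the fixed table sizes) is hypothetical and far simpler than omr/util/hookable, which generates its event definitions from XML.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_EVENTS = 8, MAX_LISTENERS = 4 };

/* Hypothetical event ids for this sketch. */
enum { EVENT_GC_CYCLE_START = 0, EVENT_CLASS_UNLOAD = 1 };

typedef void (*HookFn)(int eventId, void *eventData, void *userData);

typedef struct {
    HookFn fn[MAX_EVENTS][MAX_LISTENERS];
    void *userData[MAX_EVENTS][MAX_LISTENERS];
    int count[MAX_EVENTS];
} HookInterface;

static void hook_init(HookInterface *h)
{
    memset(h, 0, sizeof *h);
}

/* Subscribe fn to eventId. Returns 0 on success, -1 if the id is
 * invalid, fn is NULL, or the listener table for that event is full. */
static int hook_register(HookInterface *h, int eventId, HookFn fn, void *userData)
{
    if (eventId < 0 || eventId >= MAX_EVENTS || fn == NULL) return -1;
    if (h->count[eventId] == MAX_LISTENERS) return -1;
    int slot = h->count[eventId]++;
    h->fn[eventId][slot] = fn;
    h->userData[eventId][slot] = userData;
    return 0;
}

/* Publish an event. With no listeners this is just one counter check,
 * which is what keeps triggering cheap on hot paths. */
static void hook_trigger(HookInterface *h, int eventId, void *eventData)
{
    assert(eventId >= 0 && eventId < MAX_EVENTS);
    for (int i = 0; i < h->count[eventId]; i++) {
        h->fn[eventId][i](eventId, eventData, h->userData[eventId][i]);
    }
}

/* Example listener: counts events in the int that userData points to. */
static void count_events(int eventId, void *eventData, void *userData)
{
    (void)eventId;
    (void)eventData;
    ++*(int *)userData;
}
```

A verbose GC logger built this way would register one listener per GC event and format its output inside the callback, leaving the collector itself unaware of logging.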
Lesson #3: Invest in diagnostic tools
Don’t just debug with printf’s
And grub around in core files
OMR diagnostics
• DDRGen: builds a representation of internal VM data structures
– Reads DWARF/debug output generated by compiler
– Also scrapes header files to learn about macros for, say, bit flags
– Still under development/refactoring from Java (jdmpview)
• Trace engine
– Similar to events, can define trace points with XML
– Can set verbosity so not all trace points are on by default (saves startup!)
• GC verbose logs
– Leverage event hooks to generate files readable by IBM Health Center and GC Memory Visualizer tool
• JIT compiler logs, limit files, and option sets
– Detail logging output with numbered opts and transformations
– Can use command-line options to enable/disable numbered opts/transformations
– Limit files/options can enable compiling only certain methods, or exclude them
– Option sets can specify JIT diagnostic options for only certain methods
Lesson #4: Do not only use language level tests
For all components, but especially for the JIT
Too many variables not under language control
Need to improve here
See omr/fvtest/<component>
Lesson #5: Please don’t build yet another GC
There are way too many of them already!
And they’re all different
OMR Garbage Collector: omr/gc
• Highly parallel, scalable garbage collector
– Exploits multiple cores
– Balances work for multiple threads
• Rock solid automatic memory management for language runtimes
– Used for over a decade in the IBM J9 enterprise caliber Java Virtual Machine
• Mark/Sweep GC pause times (depends on live data set size):
– ~0.5 millisecond for small (2-4 MB) heaps
– ~5 milliseconds for heaps at 10s of MBs
• Integrating Mark/Sweep GC into an existing runtime should be <100 lines of code
• Can then add even more advanced capabilities incrementally
– Compaction
– Generational
– Concurrent
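For readers who have not seen one, a mark/sweep collection can be sketched over a tiny fixed pool of objects. This is a textbook toy, not the OMR collector: `Heap`, `Obj`, `heap_alloc`, and `collect` are hypothetical names, the mark phase is recursive and single-threaded, and there is no compaction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { POOL_SIZE = 8 };

/* A toy object with up to two outgoing references. */
typedef struct Obj {
    bool in_use;
    bool marked;
    struct Obj *ref[2];
} Obj;

typedef struct {
    Obj pool[POOL_SIZE];
} Heap;

static void heap_init(Heap *h)
{
    memset(h, 0, sizeof *h);
}

/* Hand out the first free slot, or NULL when the pool is exhausted. */
static Obj *heap_alloc(Heap *h)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!h->pool[i].in_use) {
            Obj *o = &h->pool[i];
            o->in_use = true;
            o->marked = false;
            o->ref[0] = o->ref[1] = NULL;
            return o;
        }
    }
    return NULL;
}

/* Mark phase: flag everything reachable from o (cycles are safe
 * because already-marked objects are skipped). */
static void mark(Obj *o)
{
    if (o == NULL || o->marked) return;
    o->marked = true;
    mark(o->ref[0]);
    mark(o->ref[1]);
}

/* Mark from the roots, then sweep: free every unmarked object and
 * clear marks for the next cycle. Returns the number of objects freed. */
static size_t collect(Heap *h, Obj **roots, size_t nroots)
{
    assert(h != NULL);
    for (size_t i = 0; i < nroots; i++) mark(roots[i]);
    size_t freed = 0;
    for (size_t i = 0; i < POOL_SIZE; i++) {
        Obj *o = &h->pool[i];
        if (o->in_use && !o->marked) {
            o->in_use = false;
            freed++;
        }
        o->marked = false;
    }
    return freed;
}
```

The pause time of a collector shaped like this grows with the amount of live data traced in the mark phase, which matches the slide's remark that pause times depend on live data set size.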
“Lesson” #6: There’s a new open source JIT in town in OMR
60+ optimizations and analyses
Create your own optimization sequences
New JitBuilder library to simplify IL generation
Final bit of advice: JIT and interpreter have each other’s backs
JIT: make simple stuff fast, let interpreter do hard stuff
Interpreter: be simple and get everything 100% right, let JIT make the right things fast
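One way to picture this division of labour: compiled code handles only the common, simple case and hands anything unusual back to the interpreter. The sketch below is an assumption-laden toy, not OMR or J9 code. The `Value` type, `interp_add`, `jit_add_fast`, and the choice to promote overflowing integer sums to doubles are all invented for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { T_INT, T_DOUBLE } Tag;

typedef struct {
    Tag tag;
    union { int i; double d; } u;
} Value;

static Value make_int(int i)       { Value v; v.tag = T_INT; v.u.i = i; return v; }
static Value make_double(double d) { Value v; v.tag = T_DOUBLE; v.u.d = d; return v; }
static double as_double(Value v)   { return v.tag == T_INT ? (double)v.u.i : v.u.d; }

/* Interpreter path: slow but complete. Handles mixed types and
 * promotes an overflowing int sum to double (a made-up language rule). */
static Value interp_add(Value a, Value b)
{
    if (a.tag == T_INT && b.tag == T_INT) {
        long long sum = (long long)a.u.i + b.u.i;
        if (sum >= INT_MIN && sum <= INT_MAX) return make_int((int)sum);
    }
    return make_double(as_double(a) + as_double(b));
}

/* Stand-in for compiled code: only int + int without overflow.
 * Returns false to signal "too hard, ask the interpreter". */
static bool jit_add_fast(Value a, Value b, Value *out)
{
    assert(out != NULL);
    if (a.tag != T_INT || b.tag != T_INT) return false;
    int x = a.u.i, y = b.u.i;
    if ((y > 0 && x > INT_MAX - y) || (y < 0 && x < INT_MIN - y)) return false;
    *out = make_int(x + y);
    return true;
}

/* Dispatch: try the fast path, fall back to the interpreter. */
static Value add(Value a, Value b)
{
    Value result;
    if (jit_add_fast(a, b, &result)) return result;
    return interp_add(a, b);
}
```

Because the fast path never tries to reproduce the hard cases, the two paths cannot disagree on them: the interpreter remains the single source of truth for semantics.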
Pilot projects using OMR in existing runtimes
• Use port, thread, hook, GC, JIT, and method profiling from OMR
• Capabilities from IBM J9 migrated to:
– CRuby, CPython, SOM++
• Method profiling via IBM Health Center
• Fast, Scalable Garbage Collection
– Verbose GC output for free
– Enables exact same GC visualization and insights for all languages
– E.g. enhancement
– CRuby: all object memory moved onto managed heap
• Just In Time compilers
– Compiled code with focus on compatibility
* = not yet open source
Method Profiling for Ruby
Scalable high performance Garbage Collection
<cycle-start id="2" type="global" contextid="0" timestamp="2015-08-05T17:21:58.105" intervalms="5066.731" />
<gc-start id="3" type="global" contextid="2" timestamp="2015-08-05T17:21:58.105">
  <mem-info id="4" free="596848" total="4194304" percent="14">
    <mem type="tenure" free="596848" total="4194304" percent="14" />
  </mem-info>
</gc-start>
<allocation-stats totalBytes="3596216" >
  <allocated-bytes non-tlh="720016" tlh="2876200" />
</allocation-stats>
<gc-op id="5" type="mark" timems="4.881" contextid="2" timestamp="2015-08-05T17:21:58.110">
  <trace-info objectcount="8914" scancount="7208" scanbytes="288320" />
</gc-op>
<gc-op id="8" type="sweep" timems="0.688" contextid="2" timestamp="2015-08-05T17:21:58.111" />
<gc-end id="9" type="global" contextid="2" durationms="5.896" usertimems="7.999" systemtimems="1.999" timestamp="2015-08-05T17:21:58.111" activeThreads="2">
  <mem-info id="10" free="2508160" total="4194304" percent="59">
    <mem type="tenure" free="2508160" total="4194304" percent="59" micro-fragmented="297048" macro-fragmented="723458" />
  </mem-info>
</gc-end>
<cycle-end id="11" type="global" contextid="2" timestamp="2015-08-05T17:21:58.111" />
Q: Does this verbose GC output come from Java, Ruby, Python, or SOM++?
A: Could be any one of them!
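The numbers in the sample log can be cross-checked with a little arithmetic. The helpers below are hypothetical and only encode what the sample itself suggests: that `percent` is the free fraction of the heap truncated to a whole number (59.8% is shown as 59), and that `totalBytes` is the sum of the `non-tlh` and `tlh` allocation counts. Neither rule is taken from OMR documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Free space as a whole-number percentage, truncated (an assumption
 * inferred from the sample, where 59.8% is reported as 59). */
static unsigned percent_free(uint64_t freeBytes, uint64_t totalBytes)
{
    assert(totalBytes > 0);
    return (unsigned)((freeBytes * 100u) / totalBytes);
}

/* Bytes made available by one collection cycle. */
static uint64_t bytes_reclaimed(uint64_t freeBefore, uint64_t freeAfter)
{
    assert(freeAfter >= freeBefore);
    return freeAfter - freeBefore;
}
```

Applied to the log, the 4 MB heap goes from 596848 free bytes (14%) to 2508160 free bytes (59%), so this cycle reclaimed 1911312 bytes in a reported 5.896 ms.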
Same GC visualization & insight for all languages
Proof point: Just in Time Compilers
• CRuby, CPython, SOM++ do not have JIT compilers
• Our efforts to date have high focus on compatibility:
– Compile native instructions for methods and blocks
– Avoid big changes to how existing runtimes work (ease adoption)
– Consistent behaviour for compiled code vs interpreted code
– No restrictions on native code used by extension modules: we can run Rails!
– No benchmark tuning or specials, no profile exploitation (yet)
• Recent focus has been 100% on open sourcing JIT
– Not on language ports
[Chart: Ruby Bench9000 Micro Benchmarks. Y-axis: Speedup Relative to Interpreter (1x, 2x, 3x)]
OpenJ9 is also coming!
We’re working on it around our next IBM SDK for Java release
Connecting production runtimes and research
• OMR and OpenJ9: production runtime technology in open source projects
– IBM developers working directly in open source projects
– Not research: this is how IBM builds its runtimes going forward
• Opportunity for researchers and runtime developers to work side by side
– Flexible licensing (EPL 1.0 or AL 2.0)
– Level playing field
• OMR and OpenJ9 technology could become test bed for runtimes research
– Early days: APIs are still evolving and improving around solid technology base
– Realistic path for research work to become active production code
– Ideally, path for research to influence more than one language runtime
Interesting links and contact points
• Mark Stoodley [email protected] @mstoodle
• Mailing List [email protected]
– Sign up at https://dev.eclipse.org/mailman/listinfo/omr-dev
• Eclipse OMR Web Site https://www.eclipse.org/omr
• Eclipse OMR DeveloperWorks Open site https://developer.ibm.com/open/omr/
• Eclipse OMR Github project https://github.com/eclipse/omr
• IBM SDK for Java Docker images https://hub.docker.com/r/ibmcom/ibmjava/
• Ruby+OMR Technology Preview Github project with Docker images for Linux on LinuxONE, OpenPOWER, and X86 https://github.com/rubyomr-preview/rubyomr-preview
• SOM++ with OMR GC and JitBuilder https://github.com/charliegracie/SOMpp/tree/OSCON2016