Post on 10-Aug-2020
Alibaba Dragonwell JDK: Towards a Java Runtime for Cloud Computing
Xiaoming Gu, Alibaba JVM Team
1
2019 2019
Alibaba Infrastructure
...
System Software (OS / JVM / Virtualization)
Resource Scheduling / Cluster Management / Container
Database / Storage / Middleware / Computing Platform
2
Singles’ Day Shopping Festival
Peak #transactions per second (more than 100x in 7 years)

Year   Alibaba cloud platform   Alibaba payment service
2011   3,200                    1,200
2012   14,000                   3,850
2013   42,000                   15,000
2014   80,000                   38,000
2015   140,000                  86,000
2016   175,000                  120,000
2017   325,000                  256,000
3
• A customized downstream of OpenJDK with free LTS
• https://github.com/alibaba/dragonwell8
• Preview now and GA soon
• Will be the recommended JDK on Alibaba Cloud
• Plan to release updates every 3 months
4
Key Customizations in Alibaba Dragonwell JDK
• Java Flight Recorder (JFR)
  • Low-overhead profiling framework
  • Backported JFR from OpenJDK 11 to Alibaba Dragonwell 8
  • Upstreaming to OpenJDK 8 in progress
• JWarmUp
  • Reduce startup time by reusing Just-in-Time (JIT) compilation info from a previous run
  • Upstreaming to OpenJDK in progress
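As an illustration of what a low-overhead profiling framework looks like from application code, here is a minimal sketch using the `jdk.jfr` API as it exists in OpenJDK 11. It assumes the Dragonwell 8 backport exposes the same package. The event name `demo.RequestHandled` and the class `JfrSketch` are hypothetical and not from the talk.

```java
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class JfrSketch {
    // Hypothetical application-defined event; JFR records one entry per commit().
    @Name("demo.RequestHandled")
    @Label("Request Handled")
    static class RequestHandled extends Event {
        @Label("Path")
        String path;
    }

    // Records `requests` custom events, dumps them to a temp file,
    // and returns how many of them can be read back from that file.
    static long recordAndCount(int requests) {
        try {
            Path file = Files.createTempFile("jfr-sketch", ".jfr");
            try (Recording recording = new Recording()) {
                recording.start();
                for (int i = 0; i < requests; i++) {
                    RequestHandled event = new RequestHandled();
                    event.begin();
                    event.path = "/item/" + i;
                    event.commit();
                }
                recording.stop();
                recording.dump(file); // must happen before the recording is closed
            }
            List<RecordedEvent> events = RecordingFile.readAllEvents(file);
            Files.deleteIfExists(file);
            return events.stream()
                    .filter(e -> e.getEventType().getName().equals("demo.RequestHandled"))
                    .count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("events read back: " + recordAndCount(3));
    }
}
```

In production the recording would more commonly be started from the command line or a diagnostic command than from code; the programmatic form is used here only to keep the sketch self-contained.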
5
AppAOT
• A challenge in the cloud
  • CPU utilization is high during JVM startup
  • Caused by excessive JIT compilations
6
AppAOT
• Ahead-of-Time (AOT) compilation
• AOT in OpenJDK
  • jaotc --output libHelloWorld.so HelloWorld.class
  • java -XX:AOTLibrary=./libHelloWorld.so HelloWorld
• Saves the CPU time otherwise spent on JIT compilations
• Limitation: AOT code can only be loaded when the JVM starts
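For reference, the two commands operate on a class such as the following minimal sketch. Only the class name `HelloWorld` comes from the slide; the body is an assumption. Note that `jaotc` shipped as an experimental tool in OpenJDK 9 through 16 and was removed in JDK 17, so the commands apply to those releases only.

```java
// HelloWorld.java: a stand-in for the class named in the jaotc command.
// The body is illustrative; only the class name comes from the slide.
public class HelloWorld {
    // Kept separate from main so it can be called and checked on its own.
    static String greeting(String name) {
        return "Hello, " + name + "!";
    }

    public static void main(String[] args) {
        // Prints "Hello, World!" when run without arguments.
        System.out.println(greeting(args.length > 0 ? args[0] : "World"));
    }
}
```

Compile it with `javac HelloWorld.java` first; `jaotc` then takes the resulting `HelloWorld.class` as input.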
7
AppAOT
• Enhanced from AOT: dynamically load/unload AOT code with the support of custom class loaders
• Java API
  • AppAOTController.loadAOTLibraryForLoader(ClassLoader loader, String library)
  • AppAOTController.unloadAOTLibraryForLoader(ClassLoader loader)
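A platform might call this API at app deploy and undeploy time, roughly as in the sketch below. Because `AppAOTController` is only available in Dragonwell and its package is not given on the slide, the sketch uses a hypothetical `AotController` interface that mirrors the two method signatures, plus a recording fake so it runs on any JDK. The names `deploy`, `undeploy`, `DeployedApp`, and the library path are all illustrative.

```java
import java.util.ArrayList;
import java.util.List;

public class AppAotDeploySketch {
    /** Hypothetical stand-in mirroring the two AppAOTController methods on the slide. */
    interface AotController {
        void loadAOTLibraryForLoader(ClassLoader loader, String library);
        void unloadAOTLibraryForLoader(ClassLoader loader);
    }

    /** One deployed app: its name plus the class loader its AOT code is bound to. */
    static final class DeployedApp {
        final String name;
        final ClassLoader loader;

        DeployedApp(String name, ClassLoader loader) {
            this.name = name;
            this.loader = loader;
        }
    }

    static DeployedApp deploy(AotController aot, String name, String aotLibrary) {
        // A real platform would build a URLClassLoader over the app's .jar;
        // an empty custom loader is enough to show the call sequence.
        ClassLoader loader = new ClassLoader(AppAotDeploySketch.class.getClassLoader()) {};
        aot.loadAOTLibraryForLoader(loader, aotLibrary);
        return new DeployedApp(name, loader);
    }

    static void undeploy(AotController aot, DeployedApp app) {
        // Release the AOT code together with the app's class loader.
        aot.unloadAOTLibraryForLoader(app.loader);
    }

    /** Recording fake: logs calls instead of touching the JVM. */
    static final class RecordingController implements AotController {
        final List<String> calls = new ArrayList<>();

        public void loadAOTLibraryForLoader(ClassLoader loader, String library) {
            calls.add("load " + library);
        }

        public void unloadAOTLibraryForLoader(ClassLoader loader) {
            calls.add("unload");
        }
    }

    public static void main(String[] args) {
        RecordingController aot = new RecordingController();
        DeployedApp app = deploy(aot, "App1", "./libApp1.so");
        undeploy(aot, app);
        System.out.println(aot.calls); // prints [load ./libApp1.so, unload]
    }
}
```

On Dragonwell, the interface calls would presumably be replaced by the static `AppAOTController` methods listed on the slide.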
8
AppAOT
• Reduce AOT code size
  • Use static analysis to construct call graph
  • Generate compilation method list from call graph with unreachable methods removed
  • Do AOT compilations for methods on the compilation method list only
• Results from an example app
  • 50% reduction in code size
  • 90% of actually executed methods covered
[Figure: call graph starting from the app entry method, with nodes foo::m1 (which calls bar::m1) and foo::m2, and the method list derived from it.]
9
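The pruning step can be sketched as a reachability walk over the call graph. This is a minimal illustration, not the analysis used in Dragonwell: the graph is given as a plain map, the entry name `app::main` is hypothetical, and `foo::m2` is assumed to be the unreachable method in the slide's figure.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CompilationListSketch {
    /** Breadth-first walk from the entry method; anything never visited is dropped. */
    static List<String> reachableMethods(Map<String, List<String>> callGraph, String entry) {
        Set<String> seen = new LinkedHashSet<>(); // keeps discovery order stable
        Deque<String> work = new ArrayDeque<>();
        seen.add(entry);
        work.add(entry);
        while (!work.isEmpty()) {
            String method = work.poll();
            for (String callee : callGraph.getOrDefault(method, Collections.<String>emptyList())) {
                if (seen.add(callee)) {
                    work.add(callee);
                }
            }
        }
        return new ArrayList<>(seen);
    }

    /** Call graph modeled on the slide's figure (edges partly assumed). */
    static Map<String, List<String>> demoGraph() {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("app::main", Arrays.asList("foo::m1"));
        graph.put("foo::m1", Arrays.asList("bar::m1"));
        graph.put("bar::m1", Collections.<String>emptyList());
        graph.put("foo::m2", Collections.<String>emptyList()); // no caller: pruned
        return graph;
    }

    public static void main(String[] args) {
        // prints [app::main, foo::m1, bar::m1]
        System.out.println(reachableMethods(demoGraph(), "app::main"));
    }
}
```

A static call graph can miss targets reached only through reflection or dynamic loading, which is consistent with the slide reporting 90% rather than 100% coverage of executed methods; methods not on the list would still be handled by the JIT.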
AppAOT
• Use case
[Figure: normal case (with JIT). A Common App Platform hosts App 1, App 2, App 3, ..., App N, each deployed as a .jar.]
10
AppAOT
• Use case
[Figure: with AppAOT. The Common App Platform hosts AOT’ed App 1, AOT’ed App 2, AOT’ed App 3, ..., AOT’ed App N, each deployed as a .so.]
11
AppAOT
• Results
  • AppAOT uses less CPU at app startup time
  • Peak performance of AppAOT is similar to JIT
[Figure: CPU utilization over time for AppAOT (aot_tiered) and JIT (jit).]
12
Elastic Heap
• Multiple Java applications running together on the same host
• Memory is a shared resource
• Memory consumption changes over the course of a run
• Dynamically resize heap on demand
13
Elastic Heap
• Garbage collection (GC)
  • Automatic memory management on heap
  • Reclaim the space occupied by dead objects
• Intuition
  • Increase heap size when GC happens more often
  • Decrease heap size when GC happens less often
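The intuition can be written down as a simple feedback rule. The sketch below is not Dragonwell's Elastic Heap implementation: the thresholds, the step size, and the method name `nextHeapTargetMb` are all hypothetical, chosen only to show the direction of each adjustment. In a running JVM the GC frequency could be sampled from `GarbageCollectorMXBean.getCollectionCount()`.

```java
public class ElasticHeapPolicySketch {
    // Hypothetical tuning knobs, not values from the talk.
    static final int HIGH_GC_PER_MINUTE = 30; // above this, the heap is too tight
    static final int LOW_GC_PER_MINUTE = 5;   // below this, the heap has slack
    static final long STEP_MB = 256;          // amount to grow or shrink per decision

    /**
     * Returns the next heap target in MB: grow when GC is frequent,
     * shrink when GC is rare, and always stay within [minMb, maxMb].
     */
    static long nextHeapTargetMb(long currentMb, int gcPerMinute, long minMb, long maxMb) {
        long target = currentMb;
        if (gcPerMinute > HIGH_GC_PER_MINUTE) {
            target = currentMb + STEP_MB;
        } else if (gcPerMinute < LOW_GC_PER_MINUTE) {
            target = currentMb - STEP_MB;
        }
        return Math.max(minMb, Math.min(maxMb, target));
    }

    public static void main(String[] args) {
        System.out.println(nextHeapTargetMb(1024, 40, 512, 2048)); // prints 1280
        System.out.println(nextHeapTargetMb(1024, 2, 512, 2048));  // prints 768
        System.out.println(nextHeapTargetMb(1024, 10, 512, 2048)); // prints 1024
    }
}
```

The gap between the two thresholds acts as a dead band, so a service with a steady GC rate does not oscillate between growing and shrinking.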
14
Elastic Heap
• Use case
[Figure: online service with higher memory pressure, offline service with lower memory pressure (before midnight). Two panels, "before traffic peak" and "shortly after traffic peak", each showing the online service and the offline service.]
15
Elastic Heap
[Figure: online service with lower memory pressure, offline service with higher memory pressure (before midnight). Two panels, "after traffic peak for a while" and "shortly after traffic peak", each showing the online service and the offline service.]
16
Elastic Heap
• Results from an online service
[Figure: CPU utilization and memory usage over time, annotated with "traffic peak starts", "traffic peak ends", "acquire memory from OS", and "return memory back to OS".]
17
Elastic Heap
• Both OpenJDK and our solution are based on Garbage-First (G1) GC
• The OpenJDK solution resizes the heap only at full GC and concurrent cycles
• Young GC is covered in our solution
• Heap resizing is more prompt because Young GC happens more frequently
• Heap resizing is concurrent
18
Summary
• Challenges in the cloud
  • Excessive CPU utilization caused by JIT compilations
    • AppAOT
  • Better memory sharing across JVM instances on the same host
    • Elastic heap
19
Q & A
20