Optimizing Communicating Event-Loop Languages with Truffle
Stefan Marr, Hanspeter Mössenböck
AGERE! Workshop, October 26, 2015
2
SOMNS: A Platform for Concurrency Research
Initial Goals
• Safety
  – Guaranteed Isolation
  – No Low-Level Data Races
• Deadlock Freedom
• Performance Competitive with Java
3
Communicating Event Loops
à la E Programming Language
4
Communicating Event Loops
Actor A Actor B
Actor Principle
5
Communicating Event Loops
Actor A Actor B
But, Actor Not First-Class
6
Communicating Event Loops
Actor A Actor B
Actors Contain Objects
7
Communicating Event Loops
Actor A Actor B
Objects Can Have Far-References
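The model on the preceding slides can be illustrated with a small sketch. It is not SOMNS code: the names Actor, FarRef, send, and run_until_idle are assumptions made for this example, and the event loop runs single-threaded for simplicity. It shows an actor that contains an object, and a far reference that only permits asynchronous sends handled later by the owner's event loop.

```python
from collections import deque


class Actor:
    """Hypothetical actor: an event loop with a mailbox that owns its objects."""

    def __init__(self, name):
        self.name = name
        self.mailbox = deque()

    def run_until_idle(self):
        # Process one message at a time; only this loop touches the objects.
        processed = 0
        while self.mailbox:
            target, selector, args = self.mailbox.popleft()
            getattr(target, selector)(*args)
            processed += 1
        return processed


class FarRef:
    """Hypothetical far reference: supports asynchronous sends only."""

    def __init__(self, owner, target):
        self._owner = owner
        self._target = target

    def send(self, selector, *args):
        # Enqueue in the owner's mailbox instead of calling the object directly.
        self._owner.mailbox.append((self._target, selector, args))


class Counter:
    def __init__(self):
        self.count = 0

    def increment(self, by):
        self.count += by


actor_b = Actor("B")
counter = Counter()              # object contained in actor B
ref = FarRef(actor_b, counter)   # would be held by an object in actor A

ref.send("increment", 2)
ref.send("increment", 3)
count_before = counter.count     # nothing has run yet
handled = actor_b.run_until_idle()
```

The sends return immediately; the counter only changes once actor B's loop processes its mailbox.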
8
public class PingPong new = Benchmark <: Value (
  class Ping new: cnt with: pongAct = (
    private pingsLeft ::= cnt.     (* mutable slot *)
    private pongAct = pongAct.     (* immutable slot *)
  )(
Newspeak
A Class-based Language
Dynamically Typed
No Global/Static State
9
Newspeak
public ping = (
  pongAct <-: ping: self.
  pingsLeft := pingsLeft - 1.
)
Communicating Event-Loop Actors
10
Newspeak
With Spec:
Newspeak Programming Language Draft Specification, Version 0.095
http://bracha.org/newspeak-spec.pdf
11
SOMNS: Built on Truffle
[Figure: AST diagram with the nodes if, cnt, cnt:=, +, 1, and 0]
Truffle’s Self-Optimization Approach:
[1] Würthinger, T.; Wöß, A.; Stadler, L.; Duboscq, G.; Simon, D. & Wimmer, C. (2012), Self-Optimizing AST Interpreters, Proceedings of the 8th Dynamic Languages Symposium.
JIT Compiled Native Code
Self-Optimized AST
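The self-optimization idea from [1] can be illustrated with a toy interpreter. This is a simplified sketch and not Truffle's API: the classes Node, UninitializedAdd, IntAdd, GenericAdd, and the replace helper are all assumptions made for this example. An addition node rewrites itself in the tree according to the operand types it observes, and falls back to a generic node when its type guard fails.

```python
class Node:
    parent = None

    def adopt(self, child):
        child.parent = self
        return child

    def replace(self, new_node):
        # Swap this node for new_node in the parent's child slots.
        parent = self.parent
        for name, value in list(vars(parent).items()):
            if value is self:
                setattr(parent, name, new_node)
        new_node.parent = parent
        return new_node


class Literal(Node):
    def __init__(self, value):
        self.value = value

    def execute(self):
        return self.value


env = {"cnt": 1}  # hypothetical variable store


class ReadVar(Node):
    def __init__(self, name):
        self.name = name

    def execute(self):
        return env[self.name]


class Add(Node):
    def __init__(self, left, right):
        self.left = self.adopt(left)
        self.right = self.adopt(right)


class GenericAdd(Add):
    def execute(self):
        return self.left.execute() + self.right.execute()


class IntAdd(Add):
    def execute(self):
        l, r = self.left.execute(), self.right.execute()
        if not (isinstance(l, int) and isinstance(r, int)):
            # Guard failed: rewrite to the generic version.
            self.replace(GenericAdd(self.left, self.right))
        return l + r


class UninitializedAdd(Add):
    def execute(self):
        l, r = self.left.execute(), self.right.execute()
        both_ints = isinstance(l, int) and isinstance(r, int)
        cls = IntAdd if both_ints else GenericAdd
        self.replace(cls(self.left, self.right))  # specialize on first use
        return l + r


class Root(Node):
    def __init__(self, body):
        self.body = self.adopt(body)

    def execute(self):
        return self.body.execute()


root = Root(UninitializedAdd(Literal(1), ReadVar("cnt")))
first = root.execute()
kind_after_ints = type(root.body).__name__
env["cnt"] = 0.5
second = root.execute()
kind_after_float = type(root.body).__name__
```

In a real system the stabilized tree is what gets handed to the JIT compiler; the sketch only shows the rewriting step.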
12
SOMNS: A Fast Newspeak
SOMNS versus Java (Graal Compiler)
On average 1.65x slower (min. −3%, max. 2.6x)
[Figure: Runtime Factor Normalized to Java; lower is better]
13
SOMNS vs. JVM Actor Libraries
Savina Benchmarks
[Figure: Runtime Factor Normalized to Scalaz; lower is better]
[2] Imam, S. M. & Sarkar, V. (2014), Savina - An Actor Benchmark Suite: Enabling Empirical Evaluation of Actor Libraries, Proceedings of the 4th AGERE! Workshop, ACM.
14
TWO OPTIMIZATIONS
Enforcing Isolation
Asynchronous Sends
15
Enforcing Isolation
16
Enforcing Isolation
if (isMutableObject(arg[i])) {
  return farReference(arg[i]);
} else if (isValueObject(arg[i])) {
  return arg[i];
} else if (isFarReference(arg[i]) && toCurrentActor(arg[i])) {
  ...
} else if (isFarReference...
} else if (isPromise(arg[i])...
...
public traverse: t col: start to: end = ( (* ... *) )
worker <-: traverse: table col: 1 to: 10
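The first two cases of the check above can be written as a runnable sketch. All names here (Actor, FarReference, Table, wrap_for_send) are assumptions for illustration, and anything that is neither a value nor a far reference is treated as a mutable object. The far-reference and promise cases are elided on the slide, so the sketch does not guess at them.

```python
class Actor:
    def __init__(self, name):
        self.name = name


class FarReference:
    def __init__(self, owner, target):
        self.owner = owner
        self.target = target


class Table:
    """Stand-in for a mutable object owned by the sending actor."""

    def __init__(self):
        self.rows = []


# Assumption: these immutable Python types play the role of value objects.
VALUE_TYPES = (int, float, str, bool, type(None))


def is_value_object(arg):
    return isinstance(arg, VALUE_TYPES)


def wrap_for_send(arg, sender):
    """Decide how one argument crosses the actor boundary."""
    if isinstance(arg, FarReference):
        # These cases are elided on the slide and are not reconstructed here.
        raise NotImplementedError("far-reference cases elided in the source")
    if is_value_object(arg):
        return arg                      # values are passed as they are
    return FarReference(sender, arg)    # mutable objects become far references


sender = Actor("A")
table = Table()
# Arguments of: worker <-: traverse: table col: 1 to: 10
wrapped = [wrap_for_send(a, sender) for a in (table, 1, 10)]
```

Running every argument of every send through such a chain of checks is the cost that the specialization on the following slides removes.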
17
Optimistic AST Specialization
worker <-: traverse: table col: 1 to: 10
[Figure: AST of the async send. Wrapper nodes: WrapArg, WrapArg, WrapArg, WrapArg. Child nodes: ReadVar worker, ReadVar table, Literal 1, Literal 10]
18
Optimistic AST Specialization
worker <-: traverse: table col: 1 to: 10
[Figure: AST of the async send. Wrapper nodes: UnwrapFarRef, WrapArg, WrapArg, WrapArg. Child nodes: ReadVar worker, ReadVar table, Literal 1, Literal 10]
19
Optimistic AST Specialization
worker <-: traverse: table col: 1 to: 10
[Figure: AST of the async send. Wrapper nodes: UnwrapFarRef, IsValue, WrapArg, WrapArg. Child nodes: ReadVar worker, ReadVar table, Literal 1, Literal 10]
20
Optimistic AST Specialization
worker <-: traverse: table col: 1 to: 10
[Figure: AST of the async send. Wrapper nodes: UnwrapFarRef, IsValue. Child nodes: ReadVar worker, ReadVar table, Literal 1, Literal 10]
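The step from a generic WrapArg node to a specialized IsValue node can be sketched as follows. This is an illustration under assumptions, not the SOMNS implementation: ArgSlot, UninitializedWrap, GenericWrap, and is_value are hypothetical names, and far references are modeled as plain tuples. Each argument position starts unspecialized, rewrites itself after the first observed argument, and falls back to the generic checks if its guard ever fails.

```python
def is_value(arg):
    # Assumption: these immutable Python types stand in for value objects.
    return isinstance(arg, (int, float, str, bool, type(None)))


class GenericWrap:
    """Runs the full chain of checks on every send."""

    def execute(self, slot, arg, sender):
        if is_value(arg):
            return arg
        return ("far-ref", sender, arg)


class IsValue:
    """Optimistic node: expects a value and checks only that."""

    def execute(self, slot, arg, sender):
        if is_value(arg):
            return arg
        slot.node = GenericWrap()       # guard failed: fall back
        return slot.node.execute(slot, arg, sender)


class UninitializedWrap:
    """Specializes the slot based on the first argument it sees."""

    def execute(self, slot, arg, sender):
        slot.node = IsValue() if is_value(arg) else GenericWrap()
        return slot.node.execute(slot, arg, sender)


class ArgSlot:
    """One argument position of a send site, holding its current node."""

    def __init__(self):
        self.node = UninitializedWrap()

    def wrap(self, arg, sender):
        return self.node.execute(self, arg, sender)


slot = ArgSlot()
first = slot.wrap(1, "A")
node_after_value = type(slot.node).__name__
second = slot.wrap([1, 2], "A")
node_after_mutable = type(slot.node).__name__
```

As long as a send site keeps seeing the same kind of argument, only the single guard runs, which is what allows the compiler to produce compact code for it.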
21
Impact on Microbenchmarks
Speedup Factor over Unoptimized Version
public class With10Args new = Benchmark (
  private aValue = Value new.
  private obj ::= Object new.
  public benchmark = (
    self <-: a1: aValue a2: Object new a3: obj a4: Benchmark a5: aValue
             a6: 0 a7: 7 a8: 8 a9: #eee a0: '33'.
  )
)
22
Method Lookup for Asynchronous Sends
Event-Loop: Single Point of Reception
[Figure: objects A1, B, A2, and C with methods do, get, set, and do]
a1 <-: do
b <-: get
a2 <-: set
c <-: do
Megamorphic Method Invocations
while (true) {
  msg = mailbox.receive()
  mthd = msg.obj.getClass().lookup(msg.selector())
  mthd.invoke(msg.obj, msg.args)
}
23
Optimization: Send-site Caching
a1 <-: do
actor(a1) <-: fun(o) { o.do() }
Code Transformation
Introduces Inline Cache
24
With Send-site Caching
[Figure: objects A1, B, A2, and C with methods do, get, set, and do]
a1 <-: do
b <-: get
a2 <-: set
c <-: do
while (true) {
  msg = mailbox.receive()
  msg.fun.invoke(msg.obj, msg.args)
}
fun1(.) {…}
fun2(.) {…}
fun3(.) {…}
fun4(.) {…}
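The effect of send-site caching can be sketched in a few lines. This is an illustration under assumptions rather than the SOMNS implementation: SendSite, its monomorphic cache, and event_loop are hypothetical names. Each send site packages its own function into the message, so the lookup result is cached per site and the event loop itself performs no lookup.

```python
from collections import deque


class SendSite:
    """Hypothetical send site with a monomorphic inline cache."""

    def __init__(self, selector):
        self.selector = selector
        self.cached_class = None
        self.cached_method = None
        self.lookups = 0

    def invoke(self, obj, args):
        cls = type(obj)
        if cls is not self.cached_class:    # cache miss: look up and remember
            self.cached_method = getattr(cls, self.selector)
            self.cached_class = cls
            self.lookups += 1
        return self.cached_method(obj, *args)

    def send(self, mailbox, obj, *args):
        # The message carries this site's function, not just a selector.
        mailbox.append((self.invoke, obj, args))


def event_loop(mailbox):
    results = []
    while mailbox:
        fun, obj, args = mailbox.popleft()
        results.append(fun(obj, args))      # no method lookup in the loop
    return results


class A:
    def do(self):
        return "A.do"


class B:
    def get(self):
        return "B.get"


mailbox = deque()
a1, b = A(), B()
site_do = SendSite("do")     # corresponds to: a1 <-: do
site_get = SendSite("get")   # corresponds to: b <-: get

for _ in range(3):
    site_do.send(mailbox, a1)
for _ in range(2):
    site_get.send(mailbox, b)

results = event_loop(mailbox)
```

Five messages are processed with one lookup per send site, whereas the loop on the earlier slide would perform a lookup for every message at a single megamorphic call site.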
25
Impact on Microbenchmarks
Speedup Factor over Unoptimized Version
public count = (
  cnt := cnt + 1.
  cnt = iterations
    ifTrue: [ completionPP resolve: cnt ]
    ifFalse: [ self <-: count ]
)
26
Impact on Microbenchmarks
Speedup Factor over Unoptimized Version
public calc: a and: b = (
  | r |
  r := a * b + b + b - a.
  r := r - (a * a * b).
  ^ r
)
public benchmark = (
  1 to: numIter do: [:i |
    self <-: calc: 2 and: 4.
    self <-: calc: 1.2 and: 3.3.
  ].
)
27
SOMNS: Fast and Scalable
• Platform for Concurrency Research
• Initial optimizations
  – Send-site Caching + Isolation
[Figure: Runtime Factor Normalized to Scalaz; lower is better]