A new execution model for Nashorn in Java 9
Transcript of A new execution model for Nashorn in Java 9
Nashorn for Java 9
Marcus Lagergren
Quickly Bootstrapping Nashorn with a New Execution Model
The 2nd Annual VM Tech Summit
Stockholm, February 8, 2016 Part of JFokus 2016 Speakers wanted!
marcus[at]lagergren.net
@lagergren
Safe Harbor Statement "THE FOLLOWING IS INTENDED TO OUTLINE OUR
GENERAL PRODUCT DIRECTION. IT IS INTENDED FOR INFORMATION PURPOSES ONLY, AND MAY NOT BE INCORPORATED INTO ANY CONTRACT. IT IS NOT A COMMITMENT TO DELIVER ANY MATERIAL, CODE, OR
FUNCTIONALITY, AND SHOULD NOT BE RELIED UPON IN MAKING PURCHASING DECISION. THE DEVELOPMENT,
RELEASE, AND TIMING OF ANY FEATURES OR FUNCTIONALITY DESCRIBED FOR ORACLE'S PRODUCTS REMAINS AT THE SOLE DISCRETION OF ORACLE."
Agenda
• Dynamic languages on the JVM – Why?
• Nashorn
• Performance
• Optimistic Types, Steady State Performance
• Startup and Warmup Performance
• Nashorn Java 9 execution architecture
• Future Work
Running Alien Languages on the JVM
Alien Languages on The JVM
• Not just dynamic languages
• Any non-Java languages
• “Alien” languages
Fantom
Fortress
BeanShell
Jaskell
ANTLR
JudoScript
ABCL
Erjang
X10
myForth
jdart
jgo
Nice
Gosu
Jacl
Alien Languages
JavaScript
• Automatic memory management
• State of the art JIT optimizations
• Native threading capability
• Hybridization – (javax.script, JSR-223)
• Man-decades of high tech – Already in the JVM
Why “Alien” Languages on the JVM?
[Figure: chart of code bases by size; axis: Lines, 0 to 500,000]
The Nashorn Project
• Invokedynamic POC
• Compliant JavaScript runtime
• Open
• Fast*
• Hybrid – JSR-223, javax.script
• Extensible
Nashorn Goals, 2010-
* At first comparable in performance to native implementations, in domains where it matters.
• Toolbox for other dynamic languages on top of the JVM
– Dynalink
– TypeScript thesis
– JRuby 9000 synergy
– “the invokedynamic way of language implementation”
Long Term Nashorn Goals
What Does “Performance” Mean?
Total Performance = Execution Time + Runtime Overhead
[JavaScript & native time – brought down by: invokedynamic optimizations, incremental inlining, field access time minimization, efficient native code implementation, type specialization, optimistic type guesses, JIT optimizations of bytecode]
The goal of 8u60 (main functionality, like --optimistic-types, still disabled by default)
The stretch goal for 9
[increase BC/Nashorn JIT speed, minimize relinking of call sites/bytecode regeneration, tiered JIT recompilation, class installation speed, time spent in GC etc – MAKE STARTUP FASTER]
Total Performance = Execution Time + Warmup/Steady State Overhead + Runtime Overhead
Especially important: Time to reach steady state!
Use cases: frequent restarts, REPL, redeployments, evals
8u60 has lazy compilation & code caching that helps some for runs n…, n > 1
Generating Code That Runs Faster: Optimistic Types
[for even more in depth info, see my JVMLS presentations from 2013 and 2014]
Optimistic Types
http://openjdk.java.net/jeps/196
Attack Execution Time
• invokedynamic implementation
– JVM
– java.lang.invoke implementation
• Boxing, boxing everywhere
– In the libraries
– Representation of generated code
– Insufficient escape analysis, or even opportunities for it
Optimistic Types
function() { return a + b;}
public static f(ScriptFunction;Object;)Object;
 0 aload 0
 1 invokevirtual ScriptFunction.getScope()ScriptObject;
 4 astore 2
 5 aload 2
 6 invokedynamic dyn:getProp|getElem|getMtd:a(Object;)Object;
11 aload 2
12 invokedynamic dyn:getProp|getElem|getMtd:b(Object;)Object;
17 invokestatic ScriptRuntime.ADD(Object;Object;)Object;
20 areturn
Optimistic Types
public static f(ScriptFunction;Object;)I
 0 aload 0
 1 invokevirtual ScriptFunction.getScope()ScriptObject;
 4 astore 2
 5 aload 2
 6 invokedynamic dyn:getProp|getElem|getMtd:a(Object;)I
11 istore 3
12 iload 3
13 aload 2
14 invokedynamic dyn:getProp|getElem|getMtd:b(Object;)I
19 invokedynamic iadd(II)I // intrinsified to add, jo
24 ireturn
Optimistic Types
try {
    operation; // get a, get b or iadd
} catch (final UnwarrantedOptimismException e) {
    throw new RewriteException(e.getLocalVariables(), e.getProgramPoint());
}
Optimistic Types
• Use whatever static types there are
• Guess the rest
• Take a continuation and recompile if wrong
int → long → double → Object (pessimistic)
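The guess-then-widen idea above can be sketched in plain Java. This is only an illustration under stated assumptions: the class `OptimisticAdd`, the exception `UnwarrantedOptimism`, and the boolean flag that stands in for "recompile" are hypothetical names, not Nashorn code. The sketch assumes `int` first and falls back to `double` for good once the guess has failed.

```java
// Hypothetical sketch of optimistic typing; not Nashorn's implementation.
final class OptimisticAdd {
    /** Signals a failed int assumption and carries the already computed wider result. */
    static final class UnwarrantedOptimism extends RuntimeException {
        final Object widerResult;

        UnwarrantedOptimism(Object widerResult) {
            super(null, null, false, false); // control flow only, so no stack trace
            this.widerResult = widerResult;
        }
    }

    /** Optimistic version: assumes the sum fits in an int. */
    static int addInt(int a, int b) {
        try {
            return Math.addExact(a, b);
        } catch (ArithmeticException overflow) {
            throw new UnwarrantedOptimism((double) a + (double) b);
        }
    }

    /** Pessimistic version, used once the guess has been invalidated. */
    static double addDouble(double a, double b) {
        return a + b;
    }

    private boolean intAssumptionValid = true;

    Object add(int a, int b) {
        if (intAssumptionValid) {
            try {
                return addInt(a, b);
            } catch (UnwarrantedOptimism e) {
                intAssumptionValid = false; // stands in for "recompile with a wider type"
                return e.widerResult;       // continue with the value computed so far
            }
        }
        return addDouble(a, b);
    }
}
```

Once `add` has overflowed a single time, every later call at this site takes the `double` path, which mirrors the observation that a type guess is usually invalidated at most once.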
Optimistic Types
• Retain primitive storage if possible
– Dual fields, later VarHandles/TaggedArrays
– Method specialization
• Add specialized versions of native methods
Optimistic Types
@Function(arity = 2,
attributes = Attribute.NOT_ENUMERABLE,
where = Where.CONSTRUCTOR)
public static double max(final Object self, final Object... args) {
switch (args.length) {
case 0:
return Double.NEGATIVE_INFINITY;
case 1:
return JSType.toNumber(args[0]);
default:
double res = JSType.toNumber(args[0]);
for (int i = 1; i < args.length; i++) {
res = Math.max(res, JSType.toNumber(args[i]));
}
return res;
}
}
Optimistic Types
@SpecializedFunction
public static int max(final Object self, final int x, final int y) {
return Math.max(x, y);
}
@SpecializedFunction
public static long max(final Object self, final long x, final long y) {
return Math.max(x, y);
}
@SpecializedFunction
public static double max(final Object self, final double x, final double y) {
return Math.max(x, y);
}
@SpecializedFunction
public static double max(final Object self, final Object x, final Object y) {
return Math.max(JSType.toNumber(x), JSType.toNumber(y));
}
@SpecializedFunction
public static double max(final Object self) {
return Double.NEGATIVE_INFINITY;
}
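How a runtime might pick among such specializations can be sketched with `java.lang.invoke`. The selection rule here (first candidate, narrowest to widest, whose parameters accept the call site's argument types) is an assumption made for illustration; the class `MaxSpecializations` and its helper `accepts` are hypothetical and do not reproduce Nashorn's linker logic.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.List;

// Hypothetical sketch: choose the narrowest applicable overload of max.
final class MaxSpecializations {
    static int max(int x, int y) { return Math.max(x, y); }

    static double max(double x, double y) { return Math.max(x, y); }

    static double max(Object x, Object y) {
        return Math.max(((Number) x).doubleValue(), ((Number) y).doubleValue());
    }

    /** Candidate signatures, ordered from narrowest to widest. */
    private static final List<MethodType> CANDIDATES = List.of(
        MethodType.methodType(int.class, int.class, int.class),
        MethodType.methodType(double.class, double.class, double.class),
        MethodType.methodType(double.class, Object.class, Object.class));

    /** Returns a handle to the first candidate that accepts both argument types. */
    static MethodHandle select(Class<?> arg0, Class<?> arg1) {
        for (MethodType type : CANDIDATES) {
            if (accepts(type.parameterType(0), arg0) && accepts(type.parameterType(1), arg1)) {
                try {
                    return MethodHandles.lookup().findStatic(MaxSpecializations.class, "max", type);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        throw new IllegalArgumentException("no applicable specialization");
    }

    private static boolean accepts(Class<?> param, Class<?> arg) {
        if (param == arg) {
            return true;
        }
        if (param == double.class) {
            return arg == int.class; // int widens to double without loss
        }
        return param == Object.class; // generic fallback; the caller would box primitives
    }
}
```

A call site seen with `(int, int)` gets the `int` version, `(int, double)` gets the `double` version, and anything else falls through to the generic `Object` version.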
Stable Runtime Performance
The Cost of Steady State Performance
Startup and warmup time until steady state. Bytecode generation time.
Memory usage / GC overhead.
Startup Time With and Without Optimistic Types
[Figure: bar chart; series: 8u60, jit and 9, jit]
Bytecodes generated during Startup With and Without Optimistic Types
[Figure: bar chart; series: 8u60, jit and 9, jit]
# classes generated during Startup With and Without Optimistic Types
[Figure: bar chart; series: 8u60, jit and 9, jit]
Nightmare Use Cases
• Dynamically evaluated throwaway code
• Lots of relinking
– Lots of type invalidation (special case of the above)
function mtd() {
    // x is an Object, but starts out assumed int
    // 9 times, 9 indy call sites,
    // 9 continuations, 9 recompilations
    //
    // Or x is just a getter with side effects or
    // whatever – JavaScript: anything goes
    return x * x * x * x * x * x * x * x * x;
}
• And from Java…
private static void block2() {
    get("TRACE");
    eval("\"semicolon\" in TRACE");
    get("itemToString");
    eval("this.tab.tokLitName");
    eval("try{TRACE(\"P\",\"\\\"processing item %s token %s\\\\n\\\"\",\"itemToString(i)\",\"this.tab.tokLitName(this.currentToken)\");} catch (e){e.rhinoException;}");
    get("STATENR");
    eval("\"semicolon\" in STATENR");
    eval("try{STATENR(\"i\");} catch (e){e.rhinoException;}");
    // … goes on forever ..
(Optimistic) Recompilation Cost
• The worst part
– Grows non-linearly with type matrix size
– Grows non-linearly with script size
Trickle down effects – It’s even worse
Stable Runtime Performance
• Optimistic types definitely get us runtime performance in steady state
• But they also make HotSpot unfeasibly slow to warm up
– (and also: the bigger the method, the worse the optimizations)
Startup & Warmup Overhead
• Observation: most type guesses are invalidated once, and are then correct forever
– ~95-99% of the time in Octane
• Do we really need to generate so much new code?
Startup & Warmup Overhead
• Observation: even if call sites aren’t monomorphic, for small polymorphism a guard tree still works fine
• Gets rid of most relinks
if (propertyMap == propertyMap1) { … }
else if (propertyMap == propertyMap2) { … }
…
else { … megamorphic slow dispatch }
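A guard tree of this shape can be built from standard method handles with `MethodHandles.guardWithTest`. The sketch below is an assumption-laden illustration: `GuardTree`, the two stand-in "property map" objects, and the handler names are hypothetical, and a real call site would also relink and cap the tree depth.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical sketch of a two-level guard tree with a megamorphic fallback.
final class GuardTree {
    // Stand-ins for property maps; identity comparison plays the role of the guard.
    static final Object MAP_POINT = new Object();
    static final Object MAP_RECT = new Object();

    static boolean hasMap(Object expected, Object actual) { return expected == actual; }

    static String fastPoint(Object map) { return "point fast path"; }

    static String fastRect(Object map) { return "rect fast path"; }

    static String megamorphic(Object map) { return "slow dispatch"; }

    /** Builds: if (map == POINT) fastPoint else if (map == RECT) fastRect else megamorphic. */
    static MethodHandle build() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType handler = MethodType.methodType(String.class, Object.class);
            MethodHandle test = lookup.findStatic(GuardTree.class, "hasMap",
                MethodType.methodType(boolean.class, Object.class, Object.class));
            MethodHandle slow = lookup.findStatic(GuardTree.class, "megamorphic", handler);
            MethodHandle rectOrSlow = MethodHandles.guardWithTest(
                MethodHandles.insertArguments(test, 0, MAP_RECT),
                lookup.findStatic(GuardTree.class, "fastRect", handler),
                slow);
            return MethodHandles.guardWithTest(
                MethodHandles.insertArguments(test, 0, MAP_POINT),
                lookup.findStatic(GuardTree.class, "fastPoint", handler),
                rectOrSlow);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Invokes the tree for a receiver's map, wrapping the checked Throwable. */
    static String dispatch(MethodHandle tree, Object map) {
        try {
            return (String) tree.invokeExact(map);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Adding a new receiver shape then means wrapping the existing tree in one more guard instead of invalidating and relinking the whole call site.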
Startup & Warmup Overhead
• Let’s assume that steady state performance is indeed good enough (for now)
– So for steady state, optimistic types are definitely a performance success.
• Attacking startup and warmup time
Previous Work
• Lazy Compilation (jdk 8u40) alleviates some of this
• Code Caching (jdk 8u40) alleviates some of this too, for runs 2..n
– Optimistic Type Caching (jdk 8u60)
Attacking Startup & Warmup Overhead?
• Attacking startup and warmup time
• First try: Tier the JIT?
• Profiling JIT level with pessimistic, non faulting code?
Tier the JIT?
• Pro: very little new code needs to be written.
• Pro: we already generate arbitrary level of pessimism on demand for RewriteExceptions
• Pro: no significant test matrix growth
Tier the JIT?
• Con: Type pollution
– Too wide values written to scopes
• Con: (dealbreaker) code generation overhead
– We don’t get away from the root cause of overhead
– Smallest compile unit is a class with a method
– System dictionary locks
– Byte code verification
– Various other class registration horrors
• (also a problem in lambda forms or any woven bytecode)
• It looks like JIT overhead is expensive
• We don’t get close to Rhino interpreter startup with the tiered JIT approach
• We also miss optimistic types in scope, by never using their narrowest form
– Has to be corrected afterwards
– Too slow & complex
Attacking Startup & Warmup Overhead?
Our Nemesis is Bytecode Land
More Bytecode == More Pain
AST profiling?
• Don’t add more bytecode generation
• Execute AST once?
– Collect types and use them for first JIT if called again
• Or keep executing AST until something is hot
– only then send it to the JIT.
Interpreter, you say?
Yep. Thread local. Runtime reentrant. Trampolined. Can transition to JIT.
Runtime overhead
package jdk.nashorn.internal.ir;
public abstract class Node {
// …
public abstract Object interpret(Frame frame) throws Throwable;
// …
}
Example: IfNode
package jdk.nashorn.internal.ir;
public class IfNode extends Statement implements JoinPredecessor {
@Override
public Object interpret(final Frame frame) throws Throwable {
interpreterEnter(frame);
try {
Object result = ScriptRuntime.UNDEFINED;
if (JSType.toBoolean(test.interpret(frame))) {
assert pass != null;
result = pass.interpret(frame);
} else if (fail != null) {
result = fail.interpret(frame);
}
return interpreterReturn(frame, result);
} finally {
interpreterLeave(frame);
}
}
}
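For readers who want to run something, the same pattern (every IR node knows how to interpret itself against a frame) can be reduced to a few lines of standalone Java. Everything here is a hypothetical miniature: `MiniInterpreter`, its `Frame`, and the three node types are illustration only, and JavaScript truthiness is simplified to "equals `Boolean.TRUE`".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical miniature of an AST interpreter; not the Nashorn IR.
final class MiniInterpreter {
    /** Interpreter state: here only local variables. */
    static final class Frame {
        final Map<String, Object> locals = new HashMap<>();
    }

    interface Node {
        Object interpret(Frame frame);
    }

    static final class Literal implements Node {
        private final Object value;

        Literal(Object value) { this.value = value; }

        @Override
        public Object interpret(Frame frame) { return value; }
    }

    static final class Ident implements Node {
        private final String name;

        Ident(String name) { this.name = name; }

        @Override
        public Object interpret(Frame frame) { return frame.locals.get(name); }
    }

    static final class If implements Node {
        private final Node test;
        private final Node pass;
        private final Node fail; // may be null, like an if without else

        If(Node test, Node pass, Node fail) {
            this.test = test;
            this.pass = pass;
            this.fail = fail;
        }

        @Override
        public Object interpret(Frame frame) {
            // Simplification: only Boolean.TRUE counts as true.
            if (Boolean.TRUE.equals(test.interpret(frame))) {
                return pass.interpret(frame);
            }
            return fail != null ? fail.interpret(frame) : null;
        }
    }
}
```

No bytecode or class is generated for any of this, which is the property the talk is after for code that runs only once or twice.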
Control flow through Exceptions
• InterpreterException
– ControlFlowException
• JumpException (inlined finallies)
• LabelledControlFlowException
– BreakException
– ContinueException
– ReturnException
Example: WhileNode
package jdk.nashorn.internal.ir;
public class WhileNode extends LoopNode {
@Override
public Object interpret(final Frame frame) throws Throwable {
final Label label = frame.getCurrentLabel();
Object result = ScriptRuntime.UNDEFINED;
while (test == null || JSType.toBoolean(test.interpret(frame))) {
try {
checkOSR(frame); // Might need to transition to JIT on backedge
result = body.interpret(frame);
} catch (final BreakException e) {
if (e.matchesLabel(label)) {
break;
}
throw e;
} catch (final ContinueException e) {
if (e.matchesLabel(label)) {
continue;
}
throw e; //not my continue
}
}
return result;
}
}
CompiledFunction
• Represents one version of a JavaScript or Native function
– Contains invoker method handle
– Potentially also constructor method handle
– Is of a certain type (e.g. specialized on params or generic)
• A ScriptFunctionData has >= 0 CompiledFunctions.
Trampolines
• Basically add a CompiledFunction subclass
– InterpretedFunction
– CallNode.interpret returns a ScriptFunction that is really a trampoline
– Trampoline: “interpret yourself” when invoked
• Add a Node.interpret(Frame frame) method to the IR Node
– Frame is an interpreter state (locals/scope)
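The trampoline idea can be illustrated independently of Nashorn: callers hold one stable callable, and what runs behind it can switch from "interpret the body" to compiled code. The names `TrampolineSketch`, `Body`, and `installCompiled` are hypothetical, and the "compiled" code here is just another Java function standing in for JIT output.

```java
import java.util.function.Function;

// Hypothetical sketch of a trampoline; not the InterpretedFunction from the talk.
final class TrampolineSketch {
    /** Stand-in for an IR function body that can interpret itself. */
    interface Body {
        Object interpret(Object[] args);
    }

    /** Callers keep a reference to this object; its target may change over time. */
    static final class Trampoline {
        private final Body body;
        private volatile Function<Object[], Object> compiled; // null until compiled code exists

        Trampoline(Body body) {
            this.body = body;
        }

        Object invoke(Object... args) {
            Function<Object[], Object> fast = compiled;
            // "Interpret yourself" unless a compiled version has been installed.
            return fast != null ? fast.apply(args) : body.interpret(args);
        }

        void installCompiled(Function<Object[], Object> code) {
            this.compiled = code;
        }
    }
}
```

Because the switch happens inside the trampoline, existing call sites do not need to be relinked when a function moves from interpreted to compiled execution in this sketch.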
Duplicated Functionality?
• But then we need to implement a lot of code that already exists in the JIT
• Just think of reimplementing link logic for all indy calls!!!
– Native calls, JavaScript calls, different guards, nested receiver checks, call site reuse
– Even through Dynalink
– Infeasible to duplicate codegen logic for this
– And the testing! My god!
• And that’s just one thing!
Reuse Link Logic
protected GuardedInvocation findSetMethod(final CallSiteDescriptor desc, final LinkRequest request) {
    final String name = desc.getNameToken(CallSiteDescriptor.NAME_OPERAND);

    if (request.isCallSiteUnstable() || hasWithScope()) {
        return findMegaMorphicSetMethod(desc, name);
    }

    final boolean explicitInstanceOfCheck = explicitInstanceOfCheck(desc, request);

    /*
     * If doing property set on a scope object, we should stop proto search on the first
     * non-scope object. Without this, for example, when assigning "toString" on global scope,
     * we'll end up assigning it on its proto - which is Object.prototype.toString !!
     *
     * toString = function() { print("global toString"); } // don't affect Object.prototype.toString
     */
    FindProperty find = findProperty(name, true, this);

    // If it's not a scope search, then we don't want any inherited properties except those with user defined accessors.
    if (find != null && find.isInherited() && !(find.getProperty() instanceof UserAccessorProperty)) {
        // We should still check if inherited data property is not writable
        if (isExtensible() && !find.getProperty().isWritable()) {
            return createEmptySetMethod(desc, explicitInstanceOfCheck, "property.not.writable", true);
        }
        // Otherwise, forget the found property unless this is a scope callsite and the owner is a scope object as well.
        if (!NashornCallSiteDescriptor.isScope(desc) || !find.getOwner().isScope()) {
            find = null;
        }
    }

    if (find != null) {
        if (!find.getProperty().isWritable() && !NashornCallSiteDescriptor.isDeclaration(desc)) {
            if (NashornCallSiteDescriptor.isScope(desc) && find.getProperty().isLexicalBinding()) {
                throw typeError("assign.constant", name); // Overwriting ES6 const should throw also in non-strict mode.
            }
            // Existing, non-writable property
            return createEmptySetMethod(desc, explicitInstanceOfCheck, "property.not.writable", true);
        }
    } else {
        if (!isExtensible()) {
            return createEmptySetMethod(desc, explicitInstanceOfCheck, "object.non.extensible", false);
        }
    }

    final GuardedInvocation inv = new SetMethodCreator(this, find, desc, request).createGuardedInvocation(findBuiltinSwitchPoint(name));

    final GlobalConstants globalConstants = getGlobalConstants();
    if (globalConstants != null) {
        final GuardedInvocation cinv = globalConstants.findSetMethod(find, this, inv, desc, request);
        if (cinv != null) {
            return cinv;
        }
    }

    return inv;
}
Example: linking a set method for a ScriptObject
Link Logic
• InterpreterAccessor
– IndexNode, AccessNode, IdentNode
– get, set methods (take a Frame)
– interpret calls get
– Lookup delegates to findGetMethod, findSetMethod
• InterpreterCall
– Lookup delegates to findCallMethod, findNewMethod
• InterpreterCallable
– interpret, creates/gets a ScriptFunction
• Trampolined to call invoke, possibly wormhole
– call method (actual invocation)
Reusing Link Logic
package jdk.nashorn.internal.ir;
public class AccessNode extends BaseNode {
@Override
public Object interpret(final Frame frame) throws Throwable {
interpreterEnter(frame);
try {
return get(frame);
} finally {
interpreterLeave(frame);
}
}
@Override
public Object get(final Frame frame, final Object interpretedBase) throws Throwable {
try {
// lookupGetter uses existing ScriptObject/Dynalink link logic (set is analogous)
final CallSite cs = lookupGetter(frame, getterType(), interpretedBase, 0);
final Object value = cs.getTarget().invokeExact(interpretedBase);
return interpreterReturn(frame, value);
} catch (final ECMAException e) {
if (e.hasScriptStackTrace()) { //has stack trace been rewritten
throw e;
}
throw e.rewriteStackTrace(frame);
}
}
}
Interpreter?
• The link logic reuse actually makes us end up with relatively little new code!
• For most logic, we can just use ScriptObject and ScriptRuntime functions that already exist for slow cases
• And add type narrowing
Interpreter Speed; Observations
• Startup is significantly faster (even early in the project)
• But execution is 5-100 times slower than executing optimized warmed up written code
• We get automatic type profiles before JITting
• We need to transition from interpreted to JITted code fairly quickly
– Time-to-steady-state must not suffer from fast startup
Transitioning to JIT; Determinism
• Right now we are using “number of invocations” as the only JIT metric
– No explicit bytecode
– MH return value filter counter
• Tests are then deterministic
• We are rather aggressive in transferring to JIT code as
– It doesn’t take long to do a stable type profile (1-2 executions)
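One way to read "MH return value filter counter" is that invocations are counted by a filter attached to the method handle, so no counting bytecode is generated. The sketch below shows that reading with the standard `MethodHandles.filterReturnValue`; the class `InvocationCounter`, the `square` target, and the `count` filter are hypothetical, and whether Nashorn wires its counter exactly this way is an assumption.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: count invocations with a return value filter.
final class InvocationCounter {
    /** Stand-in for an interpreted function. */
    static Object square(Object x) {
        return ((Integer) x) * ((Integer) x);
    }

    /** Filter: passes the return value through unchanged and bumps the counter. */
    static Object count(AtomicInteger counter, Object value) {
        counter.incrementAndGet();
        return value;
    }

    /** Wraps square so that every normal return increments the given counter. */
    static MethodHandle counted(AtomicInteger counter) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle target = lookup.findStatic(InvocationCounter.class, "square",
                MethodType.methodType(Object.class, Object.class));
            MethodHandle filter = lookup.findStatic(InvocationCounter.class, "count",
                MethodType.methodType(Object.class, AtomicInteger.class, Object.class));
            return MethodHandles.filterReturnValue(target, MethodHandles.insertArguments(filter, 0, counter));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Invokes the handle, wrapping the checked Throwable. */
    static Object call(MethodHandle handle, Object arg) {
        try {
            return handle.invoke(arg);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

A runtime could compare the counter against a fixed threshold after each call and hand the function to the JIT once it is reached. Note that in this sketch the filter only runs on normal return, so calls that throw are not counted.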
Transitioning to JIT; Loops
• At n backedge executions, have the interpreter throw a RewriteException
– Contains in type map (no new code)
– Works just like in the JIT for a too wide type
• Very little code
– Logic for all this already exists in optimistic JIT
Transitioning to JIT; Loops
• Need OSR support
• Reuse Program Point concept from optimistic types
• Add an “invisible” optimistic program point
– LoopNode implements Optimistic
– LoopNode.getProgramPoint()
Technicalities: Stack trace and Security Issues
The Stack Trace Problem
• A stack trace is not a special case in the JIT
– All bytecode has line number and file name information
• An interpreter would contain Node.interpret methods instead of the correct JavaScript line numbers
– Need a special case to rewrite stack traces from Interpreter mode to script code
– Doable
The Stack Trace Problem
function f() {
    print(a);
}
function g() {
    f();
}
g();
The Stack Trace Problem
stacktrace.js:2 ReferenceError: "a" is not defined
at jdk.nashorn.internal.runtime.ECMAErrors.error(ECMAErrors.java:66)
at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:331)
at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:304)
at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:291)
at jdk.nashorn.internal.runtime.ScriptObject.noSuchProperty(ScriptObject.java:2400)
at jdk.nashorn.internal.runtime.ScriptObject.findGetMethod(ScriptObject.java:2005)
at jdk.nashorn.internal.objects.Global.findGetMethod(Global.java:2428)
at jdk.nashorn.internal.runtime.ScriptObject.lookup(ScriptObject.java:1866)
at jdk.nashorn.internal.runtime.linker.NashornLinker.getGuardedInvocation(NashornLinker.java:104)
at jdk.nashorn.internal.runtime.linker.NashornLinker.getGuardedInvocation(NashornLinker.java:98)
at jdk.internal.dynalink.support.CompositeTypeBasedGuardingDynamicLinker.getGuardedInvocation(CompositeTypeBasedGuardingDynamicLinker.java:176)
at jdk.internal.dynalink.support.CompositeGuardingDynamicLinker.getGuardedInvocation(CompositeGuardingDynamicLinker.java:124)
at jdk.internal.dynalink.support.LinkerServicesImpl.getGuardedInvocation(LinkerServicesImpl.java:154)
at jdk.internal.dynalink.DynamicLinker.relink(DynamicLinker.java:253)
at jdk.nashorn.internal.scripts.Script$Recompilation$3$stacktrace.f(stacktrace.js:2)
at jdk.nashorn.internal.scripts.Script$Recompilation$2$31$stacktrace.g(stacktrace.js:5)
at jdk.nashorn.internal.scripts.Script$Recompilation$1$stacktrace.:program(stacktrace.js:7)
at jdk.nashorn.internal.runtime.ScriptFunctionData.invoke(ScriptFunctionData.java:772)
at jdk.nashorn.internal.runtime.ScriptFunction.invoke(ScriptFunction.java:267)
at jdk.nashorn.internal.runtime.ScriptRuntime.applyThrow(ScriptRuntime.java:434)
at jdk.nashorn.internal.runtime.ScriptRuntime.apply(ScriptRuntime.java:411)
at jdk.nashorn.tools.Shell.apply(Shell.java:410)
at jdk.nashorn.tools.Shell.runScripts(Shell.java:339)
at jdk.nashorn.tools.Shell.run(Shell.java:172)
at jdk.nashorn.tools.Shell.main(Shell.java:136)
at jdk.nashorn.tools.Shell.main(Shell.java:112)
The Stack Trace Problem
stacktrace.js:4 ReferenceError: "a" is not defined
at jdk.nashorn.internal.runtime.ECMAErrors.error(ECMAErrors.java:66)
at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:331)
at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:304)
at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:291)
at jdk.nashorn.internal.runtime.ScriptObject.noSuchProperty(ScriptObject.java:2400)
at jdk.nashorn.internal.runtime.ScriptObject.findGetMethod(ScriptObject.java:2005)
at jdk.nashorn.internal.objects.Global.findGetMethod(Global.java:2428)
at jdk.nashorn.internal.runtime.ScriptObject.lookup(ScriptObject.java:1866)
at jdk.nashorn.internal.runtime.linker.NashornLinker.getGuardedInvocation(NashornLinker.java:104)
at jdk.nashorn.internal.runtime.linker.NashornLinker.getGuardedInvocation(NashornLinker.java:98)
at jdk.internal.dynalink.support.CompositeTypeBasedGuardingDynamicLinker.getGuardedInvocation(CompositeTypeBasedGuardingDynamicLinker.java:176)
at jdk.internal.dynalink.support.CompositeGuardingDynamicLinker.getGuardedInvocation(CompositeGuardingDynamicLinker.java:124)
at jdk.internal.dynalink.support.LinkerServicesImpl.getGuardedInvocation(LinkerServicesImpl.java:154)
at jdk.internal.dynalink.DynamicLinker.relink(DynamicLinker.java:253)
at jdk.nashorn.internal.ir.IdentNode.get(IdentNode.java:422)
at jdk.nashorn.internal.ir.IdentNode.interpret(IdentNode.java:400)
at jdk.nashorn.internal.runtime.interpreter.Interpreter.interpret(Interpreter.java:395)
at jdk.nashorn.internal.ir.CallNode$1.interpretArguments(CallNode.java:473)
at jdk.nashorn.internal.ir.CallNode$1.execute(CallNode.java:449)
at jdk.nashorn.internal.runtime.interpreter.ExceptionInterpreterOperation.run(ExceptionInterpreterOperation.java:32)
at jdk.nashorn.internal.ir.CallNode.interpret(CallNode.java:640)
at jdk.nashorn.internal.ir.ExpressionStatement.interpret(ExpressionStatement.java:102)
at jdk.nashorn.internal.ir.Block.interpret(Block.java:653)
at jdk.nashorn.internal.ir.FunctionNode.invoke(FunctionNode.java:1486)
at jdk.nashorn.internal.ir.FunctionNode.call(FunctionNode.java:1519)
at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:625)
at jdk.nashorn.internal.ir.CallNode$1.invokeCallSite(CallNode.java:511)
at jdk.nashorn.internal.ir.CallNode$1.execute(CallNode.java:449)
at jdk.nashorn.internal.runtime.interpreter.ExceptionInterpreterOperation.run(ExceptionInterpreterOperation.java:32)
at jdk.nashorn.internal.ir.CallNode.interpret(CallNode.java:640)
at jdk.nashorn.internal.ir.ExpressionStatement.interpret(ExpressionStatement.java:102)
at jdk.nashorn.internal.ir.Block.interpret(Block.java:653)
at jdk.nashorn.internal.ir.FunctionNode.invoke(FunctionNode.java:1486)
at jdk.nashorn.internal.ir.FunctionNode.call(FunctionNode.java:1519)
at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:625)
at jdk.nashorn.internal.ir.CallNode$1.invokeCallSite(CallNode.java:511)
at jdk.nashorn.internal.ir.CallNode$1.execute(CallNode.java:449)
at jdk.nashorn.internal.runtime.interpreter.ExceptionInterpreterOperation.run(ExceptionInterpreterOperation.java:32)
at jdk.nashorn.internal.ir.CallNode.interpret(CallNode.java:640)
at jdk.nashorn.internal.ir.BinaryNode.interpret(BinaryNode.java:633)
at jdk.nashorn.internal.ir.ExpressionStatement.interpret(ExpressionStatement.java:102)
at jdk.nashorn.internal.ir.Block.interpret(Block.java:653)
at jdk.nashorn.internal.ir.FunctionNode.invoke(FunctionNode.java:1486)
at jdk.nashorn.internal.ir.FunctionNode.call(FunctionNode.java:1519)
at jdk.nashorn.internal.runtime.interpreter.InterpreterCallable.wormholeInterpreterCall(InterpreterCallable.java:101)
at jdk.nashorn.internal.scripts.Wormhole$\=stacktrace\,js.wormholeInterpreterCall(stacktrace.js:4)
at jdk.nashorn.internal.runtime.ScriptFunctionData.invoke(ScriptFunctionData.java:763)
at jdk.nashorn.internal.runtime.ScriptFunction.invoke(ScriptFunction.java:267)
at jdk.nashorn.internal.runtime.ScriptRuntime.applyThrow(ScriptRuntime.java:434)
at jdk.nashorn.internal.runtime.ScriptRuntime.apply(ScriptRuntime.java:411)
at jdk.nashorn.tools.Shell.apply(Shell.java:410)
at jdk.nashorn.tools.Shell.runScripts(Shell.java:339)
at jdk.nashorn.tools.Shell.run(Shell.java:172)
at jdk.nashorn.tools.Shell.main(Shell.java:136)
at jdk.nashorn.tools.Shell.main(Shell.java:112)
The Stack Trace Problem
stacktrace.js:2 ReferenceError: "a" is not defined
at jdk.nashorn.internal.objects.Global.findGetMethod(Global.java:2428)
at jdk.internal.dynalink.support.CompositeTypeBasedGuardingDynamicLinker.getGuardedInvocation(CompositeTypeBasedGuardingDynamicLinker.java:176)
at jdk.internal.dynalink.support.CompositeGuardingDynamicLinker.getGuardedInvocation(CompositeGuardingDynamicLinker.java:124)
at jdk.internal.dynalink.support.LinkerServicesImpl.getGuardedInvocation(LinkerServicesImpl.java:154)
at jdk.internal.dynalink.DynamicLinker.relink(DynamicLinker.java:253)
at jdk.nashorn.internal.scripts.Script$Interpreted$.f(stacktrace.js:2)
at jdk.nashorn.internal.scripts.Script$Interpreted$.g(stacktrace.js:5)
at jdk.nashorn.internal.scripts.Script$Interpreted$.:program(stacktrace.js:7)
at jdk.nashorn.tools.Shell.apply(Shell.java:410)
at jdk.nashorn.tools.Shell.runScripts(Shell.java:339)
at jdk.nashorn.tools.Shell.run(Shell.java:172)
at jdk.nashorn.tools.Shell.main(Shell.java:136)
at jdk.nashorn.tools.Shell.main(Shell.java:112)
The Stack Trace Problem
• Any NativeError trace passed out of the interpreter needs to be trapped in e.g. FunctionNode.interpret and potentially rewritten
• nasgen tool needs to support interpreter state in some cases, e.g. NativeError
– Added @needsInterpreterFrame=[true|false] annotation
Security Problems
• The JIT code uses a MethodHandles.lookup() reachable only from the jdk.nashorn.internal.scripts package.
• If we use a lookup from the IR package, it is too privileged
– The interpreter needs a Source.getLookup() method that returns the restricted lookup.
– The lookup is written to the Source, whenever we enter a source we haven’t found before, through a “wormhole” in the script package.
• One wormhole method per Source is all that’s required
Security Problems
public static wormholeInterpreterCall(InterpreterCallable;Frame;ScriptFunction;Object;[Object;)Object;
 0 aload 0
 1 invokeinterface InterpreterCallable.getSource()Source;
 6 dup
// CHECK IF SOURCE ALREADY HAS LOOKUP
 7 invokevirtual Source.hasLookup()Z
10 ifne 22
// NO – GET ONE WITH SCRIPT PACKAGE PRIVILEGE
13 invokestatic MethodHandles.lookup()MethodHandles$Lookup;
16 invokevirtual Source.setLookup(MethodHandles$Lookup;)V
19 goto 23
22 pop
// MARSHALL PARAMETERS TO REAL INVOCATION
23 aload 0
24 aload 1
25 aload 2
26 aload 3
27 aload 4
29 invokestatic InterpreterCallable.wormholeInterpreterCall(InterpreterCallable;Frame;ScriptFunction;Object;[Object;)Object;
32 areturn
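The bytecode above can be read as the following Java-level sketch. It is an illustration only: the Source and Wormhole classes here are simplified stand-ins (assumptions, not the Nashorn classes), and the interpreter entry point is modeled as a plain Function. What it shows is the one property the wormhole relies on: MethodHandles.lookup() is caller sensitive, so the lookup stored in the Source carries the access rights of the wormhole class rather than those of the interpreter.

```java
import java.lang.invoke.MethodHandles;
import java.util.function.Function;

final class WormholeSketch {

    /** Hypothetical stand-in for Nashorn's Source; only the lookup slot is modeled. */
    static final class Source {
        private volatile MethodHandles.Lookup lookup;

        boolean hasLookup() { return lookup != null; }
        void setLookup(MethodHandles.Lookup lookup) { this.lookup = lookup; }
        MethodHandles.Lookup getLookup() { return lookup; }
    }

    /** Plays the role of the generated per-Source wormhole class in the scripts package. */
    static final class Wormhole {
        static Object wormholeInterpreterCall(Source source, Function<Source, Object> interpreter) {
            if (!source.hasLookup()) {
                // MethodHandles.lookup() is caller sensitive: the result carries the
                // access rights of Wormhole, not those of the interpreter classes.
                source.setLookup(MethodHandles.lookup());
            }
            // Forward to the real interpreter entry point (modeled as a Function here).
            return interpreter.apply(source);
        }
    }

    public static void main(String[] args) {
        Source source = new Source();
        Object lookupClass = Wormhole.wormholeInterpreterCall(
                source, s -> s.getLookup().lookupClass().getSimpleName());
        System.out.println("lookup class: " + lookupClass); // prints "lookup class: Wormhole"
    }
}
```

The lookup is installed at most once per Source, which matches the "one wormhole method per Source" point above.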
Interpreter Optimizations
Avoiding too much link time
• CallSite caching
• Can actually be used to solve problems that we don’t detect in the JIT
function f() {
    g(); // new bootstrap/lookup, store in cache
    g(); // no need to link separately, reuse g()
    g(); // -"-
    g(); // -"-
    g(); // -"-
    g = function() { return 17; } // invalidate
    g(); // new bootstrap/lookup
}
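A minimal Java sketch of the caching behavior in the JavaScript example above. The class and its methods (define, call, link) are hypothetical and far simpler than Dynalink's relinking machinery; the sketch only shows that repeated calls reuse one linked target and that reassigning the callee invalidates the cached entry.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.util.HashMap;
import java.util.Map;

final class CallSiteCacheSketch {
    private final Map<String, MethodHandle> globals = new HashMap<>(); // current definitions
    private final Map<String, MethodHandle> linked = new HashMap<>();  // cache of linked targets
    int linkCount; // how many times the expensive link path ran

    /** Assigning to a global invalidates whatever was linked for that name. */
    void define(String name, MethodHandle target) {
        globals.put(name, target);
        linked.remove(name);
    }

    /** Calls a zero-argument function by name, linking only on a cache miss. */
    Object call(String name) {
        MethodHandle target = linked.get(name);
        if (target == null) {
            target = link(name);
            linked.put(name, target);
        }
        try {
            return target.invoke();
        } catch (Throwable t) {
            throw new IllegalStateException("call to " + name + " failed", t);
        }
    }

    /** Stands in for the bootstrap/lookup work done when a call site is first linked. */
    private MethodHandle link(String name) {
        linkCount++;
        MethodHandle target = globals.get(name);
        if (target == null) {
            throw new IllegalStateException(name + " is not defined");
        }
        return target;
    }

    public static void main(String[] args) {
        CallSiteCacheSketch scope = new CallSiteCacheSketch();
        scope.define("g", MethodHandles.constant(Object.class, "first"));
        for (int i = 0; i < 5; i++) {
            scope.call("g"); // linked once, then served from the cache
        }
        scope.define("g", MethodHandles.constant(Object.class, 17)); // invalidate
        Object result = scope.call("g"); // linked again
        System.out.println(result + " after " + scope.linkCount + " links"); // prints "17 after 2 links"
    }
}
```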
Runtime overhead
• Add known set of specialized, non-invalidated CallSites already linked in current scope
• InterpreterAccessor
– isCachedCallSite(…)
• InterpreterCall
– isCachedCallSite(…)
“Code Shape Overhead” – Program Points
• Nashorn concept: program point
• Used to identify a program point in a method when re-JITting
• A lot of JIT-only transforms change code shape
– Splitting, Inlining Finallies, Folding (interpreter doesn’t need that)
– Program points are assigned very late
“Code Shape Overhead” – Program Points
• Should avoid JIT-only transforms in interpreted mode for speed
– Splitting, Lowering etc.
• Still need to preserve code shape to correctly map program point information
• Alternative “fuzzier” program point representation
– Tuple (# in expression, source position)?
Background Processing
• This is a main strength of having two code execution environments
• We can do (even speculative) JITting in the background early
• Even (non-explicit) multithreaded if we want
– Balancing heuristics
• java.util.concurrent.Future<CompiledFunction>
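One way to picture the Future<CompiledFunction> idea is the sketch below: a function keeps running in the interpreter until its background compilation has finished, then switches over. Everything here is an assumption made for illustration; in particular CompiledFunction is a tiny stand-in class, the "interpreter" is a plain Java method, and no balancing heuristics are modeled.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.IntUnaryOperator;

final class BackgroundJitSketch {

    /** Hypothetical stand-in for Nashorn's CompiledFunction. */
    static final class CompiledFunction {
        final IntUnaryOperator code;
        CompiledFunction(IntUnaryOperator code) { this.code = code; }
    }

    private final Future<CompiledFunction> compiled;
    int interpretedCalls;
    int compiledCalls;

    /** Submits the compile task immediately; callers never wait for it. */
    BackgroundJitSketch(ExecutorService jitPool, Callable<CompiledFunction> compileTask) {
        this.compiled = jitPool.submit(compileTask);
    }

    int call(int x) {
        if (compiled.isDone()) {
            try {
                int result = compiled.get().code.applyAsInt(x); // get() does not block once done
                compiledCalls++;
                return result;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // fall back to the interpreter below
            } catch (ExecutionException e) {
                // compilation failed; keep interpreting
            }
        }
        interpretedCalls++;
        return interpret(x);
    }

    /** Stands in for AST interpretation of the same function. */
    private static int interpret(int x) { return x + x; }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService jitPool = Executors.newSingleThreadExecutor();
        CountDownLatch release = new CountDownLatch(1); // holds the "JIT" back for the demo
        BackgroundJitSketch f = new BackgroundJitSketch(jitPool, () -> {
            release.await();
            return new CompiledFunction(x -> 2 * x);
        });
        int early = f.call(21); // compile still pending: interpreted
        release.countDown();
        jitPool.shutdown();
        jitPool.awaitTermination(10, TimeUnit.SECONDS);
        int late = f.call(21);  // compile finished: compiled code
        System.out.println(early + " " + late
                + " interpreted=" + f.interpretedCalls + " compiled=" + f.compiledCalls);
    }
}
```

The latch exists only to make the demo deterministic; a real runtime would simply let the compile task run whenever a core is free.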
Current Results
• Tests are clean
• We are JavaScript compliant in “interpreter only” and “mixed” modes.
• Startup performance is significantly better than before
• Initial footprint / code generation time is much lower
– It is important to go to JIT quickly
– Type info is usually already correct
JEP
• A JEP is coming, and is moving through the process
• Will be made public shortly
Bytes of Bytecode Generated During Startup
[Bar chart: y-axis 0 to 20; bars for 8u60, jit; 9, jit; 9, interpreter]
# Classes Generated During Startup
[Bar chart: y-axis 0 to 4.5; bars for 8u60, jit; 9, jit; 9, interpreter]
Startup Time With and Without Optimistic Types
[Bar chart: y-axis 0 to 4; bars for 8u60, jit; 9, jit; 9, interpreter; 9, interpreter, no code shape]
Ongoing & Future Work
Call and Allocation Site Profiling
• Execution Overhead
– Allocation site profiling
– Call site profiling
Allocation Profiling
• Simple to do
• Eliminate optimistic data structure invalidations
function vector() {
    return new Array(); // defaults to optimistic int
}
var vectors = [];
for (var i = 0; i < 1e6; i++) {
    vectors.push(new vector().push("string"));
}
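The effect of allocation site profiling on the loop above can be sketched in Java as follows. This is an assumed, simplified model: the Kind enum, AllocationSite and ProfiledArray are hypothetical names, and a real runtime tracks far more than one element kind. The point is that the site remembers the widest representation it has seen, so only the first array pays for an invalidation.

```java
final class AllocationSiteSketch {

    /** Element representations, ordered from narrowest to widest. */
    enum Kind { INT, DOUBLE, OBJECT }

    /** One "new Array()" location in the program, with its profile. */
    static final class AllocationSite {
        Kind predicted = Kind.INT; // optimistic default
        int invalidations;         // how often an array had to be widened

        ProfiledArray allocate() {
            return new ProfiledArray(this, predicted);
        }
    }

    static final class ProfiledArray {
        private final AllocationSite site;
        Kind kind;

        ProfiledArray(AllocationSite site, Kind kind) {
            this.site = site;
            this.kind = kind;
        }

        void push(Object value) {
            Kind needed = value instanceof Integer ? Kind.INT
                    : value instanceof Double ? Kind.DOUBLE
                    : Kind.OBJECT;
            if (needed.compareTo(kind) > 0) {
                kind = needed;        // widen the representation (the costly step)
                site.invalidations++;
                if (needed.compareTo(site.predicted) > 0) {
                    site.predicted = needed; // feed the observation back to the site
                }
            }
            // storing the element itself is omitted in this sketch
        }
    }

    public static void main(String[] args) {
        AllocationSite site = new AllocationSite();
        for (int i = 0; i < 1000; i++) {
            site.allocate().push("string");
        }
        System.out.println("invalidations: " + site.invalidations); // prints "invalidations: 1"
    }
}
```

Without the feedback to the site, every one of the arrays would start as optimistic int and be invalidated by its first push.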
Partial Compilation
• Partial compilation
• Reuse compiled Nodes*
– Hang on to MethodHandles by “Signature”
  • Parameter types
  • Parameter values
– A lot less code
* Potential path profile pollution issues that need to be worked around, similar to those in LambdaForm caching
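package jdk.nashorn.internal.ir;

import java.lang.invoke.MethodHandle;
import java.util.WeakHashMap;

public abstract class Node {
    // compiled code for this node, keyed by call signature
    protected WeakHashMap<Signature, MethodHandle> code;
    // …
    public abstract Object interpret(Frame frame);
    // …
}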
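How a node might consult such a cache is sketched below. This is a guess at the shape, not the Nashorn implementation: Signature is a hypothetical key holding only parameter types, Object[] stands in for Frame, and the "compiler" just looks up Math.addExact. The first execution under a signature is interpreted and compiled; later executions reuse the cached MethodHandle.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Arrays;
import java.util.List;
import java.util.WeakHashMap;

final class NodeCacheSketch {

    /** Hypothetical cache key; only parameter types are modeled here. */
    static final class Signature {
        private final List<Class<?>> parameterTypes;

        Signature(Class<?>... parameterTypes) {
            this.parameterTypes = Arrays.asList(parameterTypes.clone());
        }
        @Override public boolean equals(Object o) {
            return o instanceof Signature && ((Signature) o).parameterTypes.equals(parameterTypes);
        }
        @Override public int hashCode() { return parameterTypes.hashCode(); }
    }

    abstract static class Node {
        protected final WeakHashMap<Signature, MethodHandle> code = new WeakHashMap<>();
        int compilations;

        /** Object[] stands in for the Frame type. */
        abstract Object interpret(Object[] frame);
        abstract MethodHandle compile(Signature signature);

        Object execute(Signature signature, Object... args) {
            MethodHandle compiled = code.get(signature);
            if (compiled == null) {
                Object result = interpret(args);         // answer now from the interpreter
                code.put(signature, compile(signature)); // keep compiled code for next time
                compilations++;
                return result;
            }
            try {
                return compiled.invokeWithArguments(args);
            } catch (Throwable t) {
                throw new IllegalStateException(t);
            }
        }
    }

    static final class AddNode extends Node {
        @Override Object interpret(Object[] frame) {
            return (Integer) frame[0] + (Integer) frame[1];
        }
        @Override MethodHandle compile(Signature signature) {
            // A real compiler would specialize on the signature; this sketch always
            // returns the int version.
            try {
                return MethodHandles.lookup().findStatic(Math.class, "addExact",
                        MethodType.methodType(int.class, int.class, int.class));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        Signature ints = new Signature(int.class, int.class); // held strongly by this local
        Node add = new AddNode();
        Object first = add.execute(ints, 1, 2);  // interpreted, then compiled
        Object second = add.execute(ints, 3, 4); // served by the cached MethodHandle
        System.out.println(first + " " + second + " compilations=" + add.compilations);
    }
}
```

Because the map is a WeakHashMap, an entry survives only while something else still references its key, so compiled code for signatures nobody uses any more can be reclaimed.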
Partial Compilation
• Enables partial evaluation
– with MethodHandles.constant and MethodHandle combinators
– Truffle style behavior without requiring a modified VM
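A small illustration of what specializing on a known parameter value could look like with the standard java.lang.invoke combinators. The power function and the specialize helper are invented for this sketch. Binding an argument with MethodHandles.insertArguments does not by itself rewrite the code; the hope (not a guarantee) is that the JIT can constant-fold through the bound handle, while the exponent-zero case is folded explicitly to MethodHandles.constant.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class PartialEvalSketch {

    /** Generic code: both operands are only known at call time. */
    static int power(int exponent, int base) {
        int result = 1;
        for (int i = 0; i < exponent; i++) {
            result *= base;
        }
        return result;
    }

    private static final MethodHandle POWER;
    static {
        try {
            POWER = MethodHandles.lookup().findStatic(PartialEvalSketch.class, "power",
                    MethodType.methodType(int.class, int.class, int.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Returns a handle of type (int)int specialized for a known exponent. */
    static MethodHandle specialize(int exponent) {
        if (exponent == 0) {
            // The whole call folds to a constant; the base argument is ignored.
            MethodHandle one = MethodHandles.constant(int.class, 1);
            return MethodHandles.dropArguments(one, 0, int.class);
        }
        // Bind the known value as the first argument of the generic code.
        return MethodHandles.insertArguments(POWER, 0, exponent);
    }

    static int apply(MethodHandle specialized, int base) {
        try {
            return (int) specialized.invokeExact(base);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        MethodHandle square = specialize(2);
        MethodHandle alwaysOne = specialize(0);
        System.out.println(apply(square, 7) + " " + apply(alwaysOne, 7)); // prints "49 1"
    }
}
```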
The Peeks and The Pokes (but safely)
• Not interpreter specific
• VarHandles
– Fast getters and setters
– No extra storage for primitive/object versions
– No bounds checks (e.g. spill pool, ArrayData, TypedArrays – who needs Unsafe?)
– (sun.misc.TaggedArray)
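For orientation, this is what getter and setter access through VarHandles looks like with the API as it shipped in Java 9. The Obj class and its fields are hypothetical. Note one difference from the wish list above: the standard array element VarHandle is bounds checked and throws ArrayIndexOutOfBoundsException on a bad index, so this sketch shows the access style, not the removal of bounds checks.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

final class VarHandleSketch {

    /** Hypothetical script object with one primitive field and a spill array. */
    static final class Obj {
        int length;
        final Object[] spill = new Object[4];
    }

    // Field accessor: acts as both getter and setter for Obj.length.
    static final VarHandle LENGTH;
    // Element accessor for any Object[]; usable for the spill array.
    static final VarHandle SPILL_ELEMENT = MethodHandles.arrayElementVarHandle(Object[].class);

    static {
        try {
            LENGTH = MethodHandles.lookup().findVarHandle(Obj.class, "length", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    public static void main(String[] args) {
        Obj o = new Obj();
        LENGTH.set(o, 3);                          // setter
        SPILL_ELEMENT.set(o.spill, 1, "spilled");  // indexed setter
        int length = (int) LENGTH.get(o);          // getter
        Object value = SPILL_ELEMENT.get(o.spill, 1);
        System.out.println(length + " " + value);  // prints "3 spilled"
    }
}
```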
JFR Integration; Events
• Dynalink
– Relinking callsites
– Megamorphic callsites
• Language agnostic
– Multi language data generated
– Type changes
• Language specific
– Array like object layout, packed / sparse
– ScriptObject layouts
Parallelism
• The more cores, the more we can speculatively work (e.g. JIT) in the background
• Speculative parallel background processing
• java.util.concurrent.Future<CompiledFunction>
Leveraging JDK Changes
• Improvements and speedups of java.lang.invoke
• Improvements (or removal) of LambdaForms
• We love Project Valhalla, we think
– VarHandles complete
– The minimum compile unit - can it shrink?
• Could spring from ClassDynamic?
Research
• Multi language framework
– Multi Platform Typeless IR (JRuby 9000 style)
– TypeScript
• “On implementing multiple pluggable dynamic language frontends on the JVM, using the Nashorn runtime” [Gabrielsson, Lagergren, Szegedi]
• Pluggable VM frontends
Questions? Demos? Beer?
Twitter: @lagergren
E-mail: marcus [at] lagergren.net