JVM: A Platform for Multiple Languages
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.
Krystal Mo
Member of Technical Staff
HotSpot JVM Compiler Team
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Once there was a time…
Source: http://www.tiobe.com
Java
Oh wait…
Source: http://www.tiobe.com
Java™
But really, what we meant was…
Source: http://www.tiobe.com
Java™
Fortunately, clearer minds prevail
Language Implementations on JVM
Fantom
Fortress
(and many more…)
BeanShell
Jaskell
ANTLR
JudoScript
ABCL
Erjang
X10
myForth
C
jdart
jgo
Nice
Gosu
Jacl
Agenda
Why make a language on JVM
Language features by emulation
What we did in JDK 7
Building the future
Why Make a Language At All? (1)
Syntax
– the “easy” part
– pick one that fits your eyes
Semantics and Capabilities
– static vs. dynamic
– sequential vs. parallel
– …one that fits the problem domain
Why Make a Language At All? (2)
Versus writing a library
Language can use alternative syntax
– whereas a library has to adhere to some host language
Language can impose more restrictions
– e.g. controlling capability
– whereas a library has no control over the host language’s capabilities
Why on JVM?
Mature low-level services
– Dynamic (“JIT”) compilation
– Garbage collection
– Threading
– Debugging Support
Cross-platform
Vast array of libraries
JVM Strengths
Compiler Optimizations
compiler tactics
– delayed compilation, tiered compilation, on-stack replacement, delayed reoptimization, program dependence graph representation, static single assignment representation
proof-based techniques
– exact type inference, memory value inference, memory value tracking, constant folding, reassociation, operator strength reduction, null check elimination, type test strength reduction, type test elimination, algebraic simplification, common subexpression elimination, integer range typing
flow-sensitive rewrites
– conditional constant propagation, dominating test detection, flow-carried type narrowing, dead code elimination
language-specific techniques
– class hierarchy analysis, devirtualization, symbolic constant propagation, autobox elimination, escape analysis, lock elision, lock fusion, de-reflection
speculative (profile-based) techniques
– optimistic nullness assertions, optimistic type assertions, optimistic type strengthening, optimistic array length strengthening, untaken branch pruning, optimistic N-morphic inlining, branch frequency prediction, call frequency prediction
memory and placement transformation
– expression hoisting, expression sinking, redundant store elimination, adjacent store fusion, card-mark elimination, merge-point splitting
loop transformations
– loop unrolling, loop peeling, safepoint elimination, iteration range splitting, range check elimination, loop vectorization
global code shaping
– inlining (graph integration), global code motion, heat-based code layout, switch balancing, throw inlining
control flow graph transformation
– local code scheduling, local code bundling, delay slot filling, graph-coloring register allocation, linear scan register allocation, live range splitting, copy coalescing, constant splitting, copy removal, address mode matching, instruction peepholing, DFA-based code generator
Why in Java?
Instead of C/C++?
Robustness: Runtime exceptions not fatal
Reflection: Annotations instead of macros
Tooling: Java IDEs speed up the development process
etc.
Ease of Development
Excellent Tooling Support
Good IDEs
Good Profilers
Good tooling for developing parsers and other language support
ANTLR
Developing a Language on JVM
Syntax
• Mature parser libraries
• e.g. ANTLR, JavaCC
Semantics
• (your work goes here)
• Backed by various libraries
• e.g. ASM, dynalink
Low-level Details
• Backed by JVM
Case Study: Writing a Compiler in Java
Using Reflection

for (IfNode n : graph.getNodes(IfNode.class)) { ... }

class CompareNode extends FloatingNode implements ValueNumberable, Canonicalizable {
    @Input ValueNode x;
    @Input ValueNode y;
    @Data Condition condition;

    public Node canonical(CanonicalizerTool t) {
        return this;
    }
}
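To make the "annotations instead of macros" idea concrete, here is a minimal sketch of how a framework could discover annotated fields through reflection. The `Input` annotation, `DemoCompareNode` and `InputScanner` are hypothetical names made up for this illustration; they are not the actual classes behind the slide.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical marker annotation; RUNTIME retention makes it visible to reflection.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Input {}

// Hypothetical node class, loosely modelled on the slide's CompareNode.
class DemoCompareNode {
    @Input Object x;
    @Input Object y;
    String condition; // not annotated, so the scanner skips it
}

class InputScanner {
    // Returns the names of all fields of 'type' that carry @Input.
    // Sorted, because getDeclaredFields() does not guarantee an order.
    static List<String> inputNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            if (f.isAnnotationPresent(Input.class)) {
                names.add(f.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(inputNames(DemoCompareNode.class)); // prints [x, y]
    }
}
```

A generic graph walker can then treat every `@Input` field as an edge without any per-class boilerplate, which is the role macros would play in C/C++.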
Agenda
Why make a language on JVM
Language features by emulation
What we did in JDK 7
Building the future
It can always be done
Even without direct native support from JVM
Java and JVM provide a rich set of primitives to build on
Almost any language feature can be implemented on JVM
– albeit not necessarily efficiently
David Wheeler
“All problems in computer science can be solved by another level of indirection.”
Kevlin Henney
“… except for the problem of too many layers of indirection.”
Case Study: A Bytecode Interpreter in Java
“Double interpretation”

Java Source Program
    i = j + 1

Bytecode
    iload_2
    iconst_1
    iadd
    istore_1

Bytecode Interpreter in Java
    while (true) {
        byte opcode = code[pc++];
        switch (opcode) {
        // ...
        case ILOAD_2:
            int i = locals[2];
            stack[sp++] = i;
            break;
        // ...
        }
    }

Host JVM (also an interpreter)
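The interpreter loop on this slide can be fleshed out into a small runnable sketch that executes exactly the four bytecodes shown for `i = j + 1`. The class name `MiniInterpreter`, the fixed stack size and the extra `RETURN` case used to stop the loop are assumptions of this sketch; the opcode numbers are the real JVM encodings.

```java
class MiniInterpreter {
    // JVM opcode values for the instructions used on the slide, plus return.
    static final int ICONST_1 = 0x04;
    static final int ILOAD_2  = 0x1c;
    static final int ISTORE_1 = 0x3c;
    static final int IADD     = 0x60;
    static final int RETURN   = 0xb1;

    // Interprets 'code' against the given local variable array.
    static void run(byte[] code, int[] locals) {
        int[] stack = new int[8]; // small fixed operand stack, enough for this demo
        int sp = 0;
        int pc = 0;
        while (true) {
            int opcode = code[pc++] & 0xff; // bytes are signed in Java
            switch (opcode) {
                case ILOAD_2:
                    stack[sp++] = locals[2];
                    break;
                case ICONST_1:
                    stack[sp++] = 1;
                    break;
                case IADD: {
                    int right = stack[--sp];
                    int left = stack[--sp];
                    stack[sp++] = left + right;
                    break;
                }
                case ISTORE_1:
                    locals[1] = stack[--sp];
                    break;
                case RETURN:
                    return;
                default:
                    throw new IllegalStateException("unsupported opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        byte[] code = {
            (byte) ILOAD_2, (byte) ICONST_1, (byte) IADD, (byte) ISTORE_1, (byte) RETURN
        };
        int[] locals = {0, 0, 41}; // j lives in slot 2, i in slot 1
        run(code, locals);
        System.out.println(locals[1]); // prints 42
    }
}
```

Every guest instruction costs a fetch, a switch dispatch and several array accesses on the host, and when the host JVM itself interprets this loop, each of those steps is interpreted again. That is the "double interpretation" the slide refers to.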
Case Study: A JVM in Java
Ideally, redundant indirections are squeezed out

Java Source Program
    i = j + 1

Bytecode
    iload_2
    iconst_1
    iadd
    istore_1

Compiler in Java

Host Machine
    lea eax, [edx+1]
Alternative Method Dispatching
e.g. prototype-based dispatch, metaclass, etc.
Emulate with reflection
– Custom lookup / binding
– Then java.lang.reflect.Method.invoke()
– Reflective invocation overhead
  – Security checking
  – Argument boxing / unboxing
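The pattern of "custom lookup, then `Method.invoke()`" can be sketched as follows. `ReflectiveDispatch`, `lookup` and `send` are hypothetical names, and the lookup rule (first public method with a matching name and argument count) is a deliberately naive stand-in for whatever binding rules a real language would apply.

```java
import java.lang.reflect.Method;

class ReflectiveDispatch {
    // Custom lookup: first public method with a matching name and arity.
    // With overloads of equal arity the choice is arbitrary in this sketch.
    static Method lookup(Object receiver, String name, int arity) {
        for (Method m : receiver.getClass().getMethods()) {
            if (m.getName().equals(name) && m.getParameterCount() == arity) {
                return m;
            }
        }
        throw new IllegalArgumentException("no method " + name + "/" + arity);
    }

    // Dynamic "message send": arguments arrive boxed in an Object[].
    static Object send(Object receiver, String name, Object... args) {
        try {
            return lookup(receiver, name, args.length).invoke(receiver, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(send("hello", "length"));           // prints 5
        System.out.println(send("hello", "concat", " world")); // prints hello world
    }
}
```

The overheads listed above are visible here: the `int` result of `length()` comes back as a boxed `Integer`, arguments travel in an `Object[]`, and every call repeats the lookup and the access checks inside `invoke`.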
Tail-calls
Often seen in functional languages
Emulate with trampoline loop
Special case:
– Direct tail-recursions can easily be transformed into loops
– e.g. Scala does this
Case Study: tail-call
Rewrite into trampoline

static int a() { return b(); }
static int b() { return c(); }
static int c() { return 42; }

static int trampolineLoop(Task t) {
    Context ctx = new Context();
    while (t != null) {
        t = t.invoke(ctx);
    }
    return ctx.value;
}
static Task a(Context ctx) { return new Task(#b); }
static Task b(Context ctx) { return new Task(#c); }
static Task c(Context ctx) { ctx.value = 42; return null; }
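The trampoline above leaves `Task` and `Context` undefined and uses `#b` as shorthand for a reference to method `b`. A runnable sketch, assuming Java 8 or later and writing the reference as `Trampoline::b`, might look like this. Treating `Task` as a functional interface and the extra `countDown` example are assumptions added for illustration.

```java
class Trampoline {
    // Carries the final result out of the chain of calls.
    static final class Context {
        int value;
    }

    // One step of work; returns the next step, or null when the computation is done.
    interface Task {
        Task invoke(Context ctx);
    }

    static int trampolineLoop(Task t) {
        Context ctx = new Context();
        while (t != null) {
            t = t.invoke(ctx); // each "tail call" returns to this loop first
        }
        return ctx.value;
    }

    static Task a(Context ctx) { return Trampoline::b; }
    static Task b(Context ctx) { return Trampoline::c; }
    static Task c(Context ctx) { ctx.value = 42; return null; }

    // A deep chain of tail calls that would overflow the stack if written as plain recursion.
    static Task countDown(int n) {
        return ctx -> {
            if (n == 0) {
                ctx.value = 42;
                return null;
            }
            return countDown(n - 1);
        };
    }

    public static void main(String[] args) {
        System.out.println(trampolineLoop(Trampoline::a));       // prints 42
        System.out.println(trampolineLoop(countDown(1_000_000))); // prints 42
    }
}
```

The stack depth stays constant because each step returns before the next one starts. The price is one `Task` object per call, which is the emulation overhead this section is about.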
Case Study: tail-recursion
Rewrite into loop

static int fib(int n) {
    return fibInner(n, 0, 1);
}
static int fibInner(int n, int a, int b) {
    if (n < 2) return b;
    return fibInner(n - 1, b, a + b);
}

static int fib(int n) {
    int a = 0, b = 1;
    while (n >= 2) {
        n = n - 1;
        int temp = a + b;
        a = b;
        b = temp;
    }
    return b;
}
Coroutines
Emulate with threads
– Can implement full (“stackful”) coroutine semantics
– Often use thread pooling as an optimization
– Waste (virtual) memory
– Could leak memory
– e.g. used by JRuby on stock JVMs
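A minimal sketch of the thread-based emulation: the coroutine body runs on its own thread and hands values over through a `SynchronousQueue`. `ThreadGenerator`, `yieldValue`, `range` and `demo` are hypothetical names for this illustration, and this is not how JRuby implements it. Note also that the body runs ahead to its next yield instead of being strictly suspended.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.function.Consumer;

// Hypothetical generator type: the coroutine body gets a thread (and a full stack) of its own.
class ThreadGenerator<T> {
    private final SynchronousQueue<T> handoff = new SynchronousQueue<>();

    // 'body' receives a yield function and may call it from any depth of nested calls.
    ThreadGenerator(Consumer<Consumer<T>> body) {
        Thread worker = new Thread(() -> body.accept(this::yieldValue));
        worker.setDaemon(true); // an abandoned generator must not keep the JVM alive
        worker.start();
    }

    private void yieldValue(T value) {
        try {
            handoff.put(value); // blocks until the consumer calls next()
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Blocks until the body yields its next value (forever, if the body has finished).
    T next() {
        try {
            return handoff.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Helper that yields from inside a nested call: the "stackful" part.
    static void range(int from, int to, Consumer<Integer> emit) {
        for (int i = from; i <= to; i++) {
            emit.accept(i);
        }
    }

    static ThreadGenerator<Integer> demo() {
        return new ThreadGenerator<Integer>(emit -> {
            range(1, 2, emit);
            range(10, 11, emit);
        });
    }

    public static void main(String[] args) {
        ThreadGenerator<Integer> g = demo();
        for (int i = 0; i < 4; i++) {
            System.out.println(g.next()); // prints 1, 2, 10, 11
        }
    }
}
```

Each generator reserves a whole thread stack, and a generator that is dropped before it finishes leaves its thread blocked in `put` indefinitely. These are the memory waste and leak risks listed above.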
Coroutines
Emulate with Finite State Machines
– Compile-time transformation
– Can only implement “stackless coroutines”
  – Can only yield from the main method
– e.g. C# does this with its iterator
– e.g. there’s a coroutines library for Java that does the same
Case Study: C#’s iterator
Original source

static IEnumerable<int> GetNaturals() {
    int i = 1;
    while (true) {
        yield return i++;
    }
}
Case Study: C#’s iterator
Transformed into FSM (simplified from actual generated code)

static IEnumerable<int> GetNaturals() {
    return new NaturalsIterator(0);
}

sealed class NaturalsIterator : IEnumerable<int>, IEnumerable,
        IEnumerator<int>, IEnumerator, IDisposable {
    int _current, _state, _i;

    public NaturalsIterator(int state) {
        _state = state;
    }

    int IEnumerator<int>.Current {
        get { return _current; }
    }

    bool IEnumerator.MoveNext() {
        switch (_state) {
        case 0: _i = 1; break;
        case 1: break;
        default: return false;
        }
        _current = _i++;
        _state = 1;
        return true;
    }
}
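The same state-machine rewrite can be written by hand against Java's `Iterator` interface. This is only a sketch of what a compiler or coroutine library would generate; the class name and state numbering simply mirror the C# version above.

```java
import java.util.Iterator;

// Hand-written state machine equivalent to: int i = 1; while (true) yield i++;
class NaturalsIterator implements Iterator<Integer> {
    private int state = 0; // 0 = not started, 1 = suspended at the yield
    private int i;         // the local variable, hoisted into a field

    @Override
    public boolean hasNext() {
        return true; // the sequence is infinite
    }

    @Override
    public Integer next() {
        switch (state) {
            case 0:
                i = 1;     // code before the loop runs once
                state = 1;
                break;
            case 1:
                break;     // resume inside the loop
            default:
                throw new IllegalStateException("unknown state " + state);
        }
        return i++;
    }

    public static void main(String[] args) {
        NaturalsIterator it = new NaturalsIterator();
        System.out.println(it.next()); // prints 1
        System.out.println(it.next()); // prints 2
        System.out.println(it.next()); // prints 3
    }
}
```

Only the locals of this one method are saved between calls, so a yield can only happen directly in the transformed method. That is why the approach is limited to stackless coroutines.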
Reference Counting
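Emulate with reified reference
– Boxing overhead
– You really don’t want to do this…

import java.util.concurrent.atomic.AtomicInteger;

public class CountedReference<T extends AutoCloseable> {
    // Atomic counter: "refCount++" on a volatile field is not thread-safe.
    private final AtomicInteger refCount = new AtomicInteger(1);
    private volatile T target; // not final, because release() clears it

    public CountedReference(T target) {
        this.target = target;
    }

    public T addRef() {
        refCount.incrementAndGet();
        return target;
    }

    public void release() throws Exception {
        // Object.finalize() is protected and cannot be called from here,
        // so the target is disposed through close() instead.
        if (refCount.decrementAndGet() == 0 && target != null) {
            target.close();
            target = null;
        }
    }
}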
Infinite Precision Integer
Emulate with java.math.BigInteger
– Boxing overhead
  – Performance impact
  – Heap bloat
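A small sketch of the trade-off: the same factorial written once with a primitive `long` and once with `java.math.BigInteger`. The class and method names are made up for this example.

```java
import java.math.BigInteger;

class Factorials {
    // Fast and allocation-free, but silently wraps around beyond 20!.
    static long factorialLong(int n) {
        long result = 1;
        for (int i = 2; i <= n; i++) {
            result *= i;
        }
        return result;
    }

    // Arbitrary precision, but every multiply allocates a new immutable object.
    static BigInteger factorialBig(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(factorialLong(20)); // prints 2432902008176640000
        System.out.println(factorialLong(21)); // overflows: prints a negative number
        System.out.println(factorialBig(21));  // prints 51090942171709440000
    }
}
```

A language whose integers are all arbitrary precision pays the allocation cost even for small values, unless its implementation falls back to primitives where it can prove that is safe.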
Closer to the Metal
Reducing redundant emulation
Fewer indirections
Better performance
The “J” in JVM
Geared towards Java semantics
Semantics match => good perf, easy to impl
Closer to Java => closer to the metal on JVM
The “J” in JVM
Should optimize for non-Java features, too
Languages on JVM shouldn’t be forced to be like Java to be performant
Improve the VM to accommodate non-Java language features
– In turn, benefits Java itself, e.g. lambdas
Agenda
Why make a language on JVM
Language features by emulation
What we did in JDK 7
Building the future
Why dynamic languages?
Fast turnaround time for simple programs
– no compile step required
– direct interpretation possible
– loose binding to the environment
Data-driven programming
– program shape can change along with data shape
– radically open-ended code (plugins, aspects, closures)
Dynamic languages are here to stay
Source: http://www.tiobe.com
What slows down a JVM
Non-Java languages require special call sites.
– e.g.: Smalltalk message sending (no static types).
– e.g.: JavaScript or Ruby method call (different lookup rules).
In the past, special calls required simulation overheads
– ...such as reflection and/or extra levels of lookup and indirection
– ...which have inhibited JIT optimizations.
Result: Pain for non-Java developers.
Enter Java 7.
Key Features
New bytecode instruction: invokedynamic.
– Linked reflectively, under user control.
– User-visible object: java.lang.invoke.CallSite
– Dynamic call sites can be linked and relinked, dynamically.
New unit of behavior: method handle
– The content of a dynamic call site is a method handle.
– Method handles are function pointers for the JVM.
– (Or if you like, each MH implements a single-method interface.)
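Method handles and relinkable call sites can be tried out from plain Java through the `java.lang.invoke` API, without emitting an `invokedynamic` instruction. In the sketch below, `CallSiteDemo`, `hello`, `bye` and `relinkDemo` are hypothetical names; `MethodHandles.Lookup.findStatic`, `MutableCallSite`, `dynamicInvoker` and `setTarget` are the real API.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

class CallSiteDemo {
    static String hello(String name) { return "Hello, " + name; }
    static String bye(String name)   { return "Bye, " + name; }

    // Links a call site to hello(), calls it, relinks it to bye(), and calls it again.
    static String[] relinkDemo() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class, String.class);

            // Method handles: typed, directly invocable references to methods.
            MethodHandle hello = lookup.findStatic(CallSiteDemo.class, "hello", type);
            MethodHandle bye = lookup.findStatic(CallSiteDemo.class, "bye", type);

            // The content of the call site is a method handle.
            MutableCallSite site = new MutableCallSite(hello);
            MethodHandle invoker = site.dynamicInvoker();

            String first = (String) invoker.invokeExact("JVM");
            site.setTarget(bye); // relink the same call site
            String second = (String) invoker.invokeExact("JVM");
            return new String[] {first, second};
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        String[] results = relinkDemo();
        System.out.println(results[0]); // prints Hello, JVM
        System.out.println(results[1]); // prints Bye, JVM
    }
}
```

Unlike `Method.invoke()`, `invokeExact` passes arguments with their static types, so no boxing or `Object[]` packing is involved.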
Dynamic Program Composition
Dynamic Call Sites
– A dynamic call site is created for each invokedynamic call bytecode
Method Handles
– Each call site is bound to one or more method handles, which point back to bytecoded methods
Bytecodes
– Bytecodes are created by Java compilers or dynamic runtimes
JVM JIT
– The JVM seamlessly integrates execution, optimizing to native code as necessary
Passing the burden to the JVM
Non-Java languages require special call sites.
In the past, special calls required simulation overheads
Now, invokedynamic call sites are fully user-configurable
– ...and are fully optimizable by the JIT.
Result: Much simpler code for language implementors
– ...and new leverage for the JIT.
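"User-configurable" means that a language runtime supplies a bootstrap method which decides what each call site is linked to. Real `invokedynamic` instructions are emitted by a compiler or bytecode library, so the sketch below calls a bootstrap-shaped method by hand to show the linkage step. `BootstrapDemo`, `bootstrap` and `callThroughSite` are hypothetical names, and the selection rule (treat the first argument as the receiver and resolve the name on its static type) is only an example.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class BootstrapDemo {
    // Same shape as an invokedynamic bootstrap method: it receives the caller's lookup,
    // the name and the type recorded at the call site, and returns the linked CallSite.
    static CallSite bootstrap(MethodHandles.Lookup lookup, String name, MethodType type)
            throws ReflectiveOperationException {
        // Language-specific selection logic would go here. This sketch resolves 'name'
        // as a virtual method on the static type of the first argument.
        Class<?> receiver = type.parameterType(0);
        MethodHandle target = lookup.findVirtual(receiver, name, type.dropParameterTypes(0, 1));
        return new ConstantCallSite(target);
    }

    // Simulates what the JVM does on first execution of a call site: bootstrap, then invoke.
    static String callThroughSite(String name, String arg) {
        try {
            MethodType type = MethodType.methodType(String.class, String.class);
            CallSite site = bootstrap(MethodHandles.lookup(), name, type);
            return (String) site.dynamicInvoker().invokeExact(arg);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callThroughSite("toUpperCase", "indy")); // prints INDY
        System.out.println(callThroughSite("trim", "  indy  "));    // prints indy
    }
}
```

With a real `invokedynamic` instruction the bootstrap runs once per call site, and later executions jump straight to the linked target, which the JIT can inline like an ordinary call.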
What’s in a method call? (before invokedynamic)
Source code | Bytecode | Linking | Executing
Naming: Identifiers; Utf8 constants; JVM “dictionary”
Selecting: Scopes; Class names; Loaded classes; V-table lookup
Adapting: Argument conversion; C2I / I2C adapters; Receiver narrowing
Calling: Jump with arguments
What’s in a method call? (using invokedynamic)
Source code | Bytecode | Linking | Executing
Naming: ∞; ∞; ∞; ∞
Selecting: ∞; Bootstrap methods; Bootstrap method call; ∞
Adapting: ∞; Method handles; ∞
Calling: Jump with arguments
Charles Nutter
JRuby Lead, Red Hat
“Invokedynamic is the most important addition to Java in years. It will change the face of the platform.”
Agenda
Why make a language on JVM
Language features by emulation
What we did in JDK 7
Building the future
Loose ends in the Java 7 API
Method handle introspection (reflection)
Generalized proxies (more than single-method intfs)
Class hierarchy analysis (override notification)
Smaller issues:
– usability (MethodHandle.toString, polymorphic bindTo)
– sharp corners (MethodHandle.invokeWithArguments)
– repertoire (tryFinally, more fold/spread/collect options)
Integration with other APIs (java.lang.reflect)
Support for Lambda in OpenJDK8
More transforms for SAM types (as needed).
Faster bindTo operation to create bound MHs
– No JNI calls.
– Maybe multiple-value bindTo.
Faster inexact invoke (as needed).
Let’s continue building our “future VM”
Da Vinci Machine Project: an open source incubator for JVM futures
Contains code fragments (patches).
Movement to OpenJDK requires:
– a standard (e.g., JSR 292)
– a feature release plan (7 vs. 8 vs. ...)
bsd-port for developer friendliness.
[email protected]
http://hg.openjdk.java.net/mlvm/mlvm/hotspot/
Current Da Vinci Machine Patches
MLVM patches
meth: method handles implementation
indy: invokedynamic
coro: lightweight coroutines (Lukas Stadler)
inti: interface injection (Tobias Ivarsson)
tailc: hard tail call optimization (Arnold Schwaighofer)
tuple: integrating tuple types (Michael Barker)
hotswap: online general class schema updates (Thomas Wuerthinger)
anonk: anonymous classes; lightweight bytecode loading
Caveat: Change is hard and slow
(especially the “last 20%”)
Hacking code is relatively simple.
Removing bugs is harder.
Verifying is difficult (millions of users).
Integrating to a giant system is very hard.
– interpreter, multiple compilers
– managed heap (multiple GC algos.)
– debugging, monitoring, profiling machinery
– security interactions
Specifying is hard (the last 20%...).
Running process is time-consuming.
Further Reading
Multi-Language VM (MLVM) Project on OpenJDK
JVM Language Summit
JSR 292 Cookbook
References
VM Optimizations for Language Designers, John Pampuch, JVM Language Summit 2008
Method Handles and Beyond, Some basis vectors, John Rose, JVM Language Summit 2011