Graal and Truffle: One VM to Rule Them All
Thomas Wuerthinger Oracle Labs @thomaswue 12-December-2013, at ETH Zurich
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. 2
Disclaimer
The following is intended to provide some insight into a line of research in Oracle Labs. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described in connection with any Oracle product or service remains at the sole discretion of Oracle. Any views expressed in this presentation are my own and do not necessarily reflect the views of Oracle.
Agenda
§ One VM to Rule Them All?
§ Dynamic Compilation
§ Graal Compiler
§ Truffle System
§ Q&A
One Language to Rule Them All? Let’s ask a search engine…
One Language to Rule Them All? Let’s ask Stack Overflow…
Relative Speed of Programming Languages
(as measured by the Computer Language Benchmarks Game, ~1y ago)
Goal: One VM for all languages means interoperability and being able to choose the best language for the task!
Agenda
§ One VM to Rule Them All?
§ Dynamic Compilation
§ Graal Compiler
§ Truffle System
§ Q&A
Static versus Dynamic Compilation (1)
§ Static (or ahead-of-time) Compilation
– Compilation happens before the program is run.
– Can include profiling feedback from sample application runs.
§ Dynamic (or just-in-time) Compilation
– Compilation happens while the program is running.
– Baseline execution (interpreter or simple compiler) gathers profiling feedback.
– Optimization => Deoptimization => Reoptimization cycles.
– On-stack replacement (OSR) to switch between the tiers (two or more execution modes).
Static versus Dynamic Compilation (2)
§ Static (or ahead-of-time) Compilation
– Fast start-up, because compilation and profiling are not part of application execution time.
– Predictable performance, as only the source program affects the generated machine code.
§ Dynamic (or just-in-time) Compilation
– Can exploit exact target platform properties when generating machine code.
– Profiling feedback captures part of the application behavior and increases code quality.
– The deoptimization capabilities allow the optimized code to be incomplete and/or use aggressive speculation.
– Can use assumptions about the current state of the system (e.g., loaded classes) in the generated code.
Profiling Feedback for Java
§ Branch probabilities
– Never-taken branches can be omitted.
– Exact probabilities allow if-cascade reordering.
§ Loop frequencies
– Guide loop unrolling and loop-invariant code motion.
§ Type profile
– Optimize instanceof and checkcast type checks (i.e., speculate that only a specific set of types occurs).
– Optimize virtual calls or interface calls.
Profiling feedback only helps when the program behavior during the observed period matches the overall program behavior.
Static Single Assignment (SSA) Form
§ Every variable is assigned only once.
§ Phis capture values coming from different control flow branches.
§ Commonly used in compilers as it simplifies optimizations and traversal along the def-use and use-def chains.

Original code:
  ...
  if (condition) {
    x = value1 + value2;
  } else {
    x = value2;
  }
  return x;

SSA form:
  ...
  if (condition) {
    x1 = value1 + value2;
  } else {
    x2 = value2;
  }
  x3 = phi(x1, x2);
  return x3;
Agenda
§ One VM to Rule Them All?
§ Dynamic Compilation
§ Graal Compiler
§ Truffle System
§ Q&A
Graal is an …
... extensible,
dynamic compiler using
object-oriented Java programming,
a graph intermediate representation,
and Java snippets.
HotSpotVM versus GraalVM
HotSpot VM: Compilation Queue, Compiler Interface, Client and Server compilers (C++)
Graal VM: HotSpot Compilation Queue, Compiler Interface, Graal compiler (Java)
30k LOC 120k LOC 60k LOC
Why Java?
Tooling: Java IDEs speed up the development process.
Robustness: Runtime exceptions not fatal.
Reflection: Annotations instead of macros.
Meta-Evaluation: IR subgraph expressible in Java code.
Extensibility: No language barrier to the application.
Snippets for Graph Construction
Expression as snippet:
  int max(int a, int b) {
    if (a > b) return a;
    else return b;
  }

Manual construction:
  Node max(ValueNode a, ValueNode b) {
    // a > b is expressed as b < a
    IfNode ifNode = new IfNode(new IntegerLessThanNode(b, a));
    ifNode.trueSuccessor().setNext(new ReturnNode(a));
    ifNode.falseSuccessor().setNext(new ReturnNode(b));
    return ifNode;
  }
Data Code
Lowering
§ Replace one node with multiple other nodes.
– New nodes provide a more detailed description of semantics.
– New nodes can be optimized and moved separately.
§ General Java lowerings
– Example: Replace an array store with null check, bounds check, store check, write operation.
§ VM-specific lowerings
– Example: Replace a monitorenter with the code dependent on the locking schemes used by the VM.

  if (array != null && index >= 0 && index < array.length
      && canAssign(array.getClass().getComponentType(), value)) {
    *(array + 16 + index*8) = value;
  } else {
    deoptimize;
  }
Gradual Lowering
[Chart: nodes per bytecode (y-axis from 0 to 3) for Graal, Client, and Server at four stages: after parsing, after optimizations, after lowering, and before code emission.]
Numbers obtained while running the DaCapo benchmark suite.
Extensibility
abstract class Phase { abstract void run(Graph g); }
• Multiple Target Platforms (AMD64, SPARC, PTX, HSAIL)
• Multiple Runtimes (HotSpot and Maxine)
• Adding new types of Nodes
• Adding new compiler Phases
for (IfNode n : graph.getNodes(IfNode.class)) { ... }
Compiler has about 100 different individual modules.
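The Phase abstraction and the typed node iteration above can be sketched with a toy graph. This is only an illustration under stated assumptions: ToyNode, ToyConstant, ToyAdd, ToyGraph, ToyPhase, and ConstantFoldingPhase are hypothetical names, not Graal classes, and the graph is a flat node list rather than a real IR.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-ins for compiler IR classes; these are NOT the Graal classes.
abstract class ToyNode {}

final class ToyConstant extends ToyNode {
    final int value;
    ToyConstant(int value) { this.value = value; }
}

final class ToyAdd extends ToyNode {
    final ToyNode left;
    final ToyNode right;
    ToyAdd(ToyNode left, ToyNode right) { this.left = left; this.right = right; }
}

final class ToyGraph {
    final List<ToyNode> nodes = new ArrayList<>();

    <T extends ToyNode> T add(T node) {
        nodes.add(node);
        return node;
    }

    // Typed iteration in the spirit of graph.getNodes(IfNode.class).
    // Returns a copy, so a phase may modify the graph while iterating.
    <T extends ToyNode> List<T> getNodes(Class<T> type) {
        List<T> result = new ArrayList<>();
        for (ToyNode n : nodes) {
            if (type.isInstance(n)) {
                result.add(type.cast(n));
            }
        }
        return result;
    }
}

abstract class ToyPhase {
    abstract void run(ToyGraph g);
}

// Example phase: replaces each add of two constants with a single constant.
// Simplification: users of the replaced node are not updated.
final class ConstantFoldingPhase extends ToyPhase {
    @Override
    void run(ToyGraph g) {
        for (ToyAdd add : g.getNodes(ToyAdd.class)) {
            if (add.left instanceof ToyConstant && add.right instanceof ToyConstant) {
                int folded = ((ToyConstant) add.left).value + ((ToyConstant) add.right).value;
                g.nodes.set(g.nodes.indexOf(add), new ToyConstant(folded));
            }
        }
    }
}
```

A new optimization is then just another ToyPhase subclass that is run over the graph, which is the extension point the slide describes.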
Graph IR
[Diagram: graph IR for the code below, with the nodes If, Begin, End, Merge, Add, Phi, and Return, and the inputs condition, value1, and value2.]

  ...
  if (condition) {
    result = value1 + value2;
  } else {
    result = value2;
  }
  return result;
• Static single assignment (SSA) form with def-use and use-def edges.
• Program dependence graph (sea of nodes), but with explicit
distinction between control flow and data flow edges.
• Graph visualization tools: IdealGraphVisualizer and c1visualizer.
Guards
int get(x) { return x.field; }
Guards
int get(x) { if (cond) return x.field; else return 0; }
Eliding Exception Edges
Operation          Potential   Actual    Ratio
Invoke               1296646    14454    1.11%
BoundsCheck           166770      498    0.30%
NullCheck            1525061      686    0.04%
OutOfMemory           110078        0    0.00%
CheckCast              99192        0    0.00%
DivRem                  6082        0    0.00%
MonitorNullCheck       33631        0    0.00%
TOTAL                3237460    15638    0.48%
Numbers obtained while running the DaCapo benchmark suite.
Graal GPU Backends
[Diagram: Java bytecodes and the Truffle AST (JavaScript, Ruby, Python, …) feed into Graal IR, which targets the PTX and HSAIL backends.]
Java Peak Performance
§ SPECjvm2008
[Bar chart, y-axis from 0 to 120: Client 76, Graal 100, Server 114.]
Configuration: Intel Core i7-3770 @ 3.4 GHz, 4 cores, 8 threads, 16 GB RAM. Comparison against HotSpot changeset tag hs25-b37 from June 13, 2013.
Scala Peak Performance
§ Scala-DaCapo Benchmark Suite
[Bar chart, y-axis from 0 to 120: Client 61, Graal 100, Server 106.]
Configuration: Intel Core i7-3770 @ 3.4 GHz, 4 cores, 8 threads, 16 GB RAM. Comparison against HotSpot changeset tag hs25-b37 from June 13, 2013.
Your Compiler Extension?
§ Graal Resources
– http://openjdk.java.net/projects/graal/
– https://wiki.openjdk.java.net/display/Graal/Main

  $ hg clone http://hg.openjdk.java.net/graal/graal
  $ cd graal
  $ ./mx.sh --vm graal build
  $ ./mx.sh ideinit
  $ ./mx.sh --vm graal vm

§ Graal License: GPLv2
Agenda
§ One VM to Rule Them All?
§ Dynamic Compilation
§ Graal Compiler
§ Truffle System
§ Q&A
“Write Your Own Language”
Current situation:
Prototype a new language
Parser and language work to build syntax tree (AST), AST interpreter
Write a “real” VM
In C/C++, still using AST interpreter, spend a lot of time implementing runtime system, GC, …
People start using it
Define a bytecode format and write bytecode interpreter
People complain about performance
Write a JIT compiler
Improve the garbage collector
Performance is still bad

How it should be:
Prototype a new language in Java
Parser and language work to build syntax tree (AST)
Execute using AST interpreter
People start using it
And it is already fast
Truffle: System Structure
Layer                           Written by:             Written in:
Guest Language Application      Application Developer   Guest Language
Guest Language Implementation   Language Developer      Managed Host Language
Host Services                   VM Expert               Managed Host Language or Unmanaged Language
OS                              OS Expert               Unmanaged Language (typically C or C++)
Speculate and Optimize …
[Diagram: AST Interpreter Uninitialized Nodes => Node Rewriting for Profiling Feedback => AST Interpreter Rewritten Nodes => Compilation using Partial Evaluation => Compiled Code]
Node Transitions: U = Uninitialized, I = Integer, D = Double, S = String, G = Generic
Partial Evaluation
§ Example function:
– f(x, y) = x + y + 1
§ Partial evaluation of example function:
– g(y) = f(1, y) = 1 + y + 1 = y + 2
§ Interpreter function:
– f(program, arguments) = calculations to interpret the program
§ Partial evaluation of interpreter function (first Futamura projection):
– g(arguments) = f(#specificProgram, arguments) = compiled version of #specificProgram that takes arguments as parameters
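The relation between an interpreter f(program, arguments) and its specialized form g(arguments) can be illustrated with a small sketch. It assumes a made-up postfix mini-language ("a0" and "a1" name arguments, "+" adds), and the specialize method is a hand-written stand-in that does all program-dependent work once; it is not an automatic partial evaluator and not how Truffle is implemented. The class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.ToIntFunction;

final class FutamuraSketch {
    // Interpreter: f(program, arguments). Tokenizing and dispatch on the
    // token kind are repeated on every call.
    static int interpret(String program, int[] args) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String token : program.trim().split("\\s+")) {
            if (token.equals("+")) {
                int right = stack.pop();
                int left = stack.pop();
                stack.push(left + right);
            } else if (token.startsWith("a")) {
                stack.push(args[Integer.parseInt(token.substring(1))]);
            } else {
                stack.push(Integer.parseInt(token));
            }
        }
        return stack.pop();
    }

    // Specializer: g = specialize(program). Everything that depends only on
    // the program happens here, once; the result depends only on the arguments.
    static ToIntFunction<int[]> specialize(String program) {
        Deque<ToIntFunction<int[]>> stack = new ArrayDeque<>();
        for (String token : program.trim().split("\\s+")) {
            if (token.equals("+")) {
                ToIntFunction<int[]> right = stack.pop();
                ToIntFunction<int[]> left = stack.pop();
                stack.push(args -> left.applyAsInt(args) + right.applyAsInt(args));
            } else if (token.startsWith("a")) {
                int index = Integer.parseInt(token.substring(1));
                stack.push(args -> args[index]);
            } else {
                int constant = Integer.parseInt(token);
                stack.push(args -> constant);
            }
        }
        return stack.pop();
    }
}
```

With the program "a0 a1 + 1 +" (the example function x + y + 1), interpret and the specialized function compute the same result for the same arguments; the specialized version simply no longer looks at the program text.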
… and Deoptimize and Reoptimize!
[Diagram: Deoptimization to AST Interpreter => Node Rewriting to Update Profiling Feedback => Recompilation using Partial Evaluation]
Object add(Object a, Object b) {
if(a instanceof Integer && b instanceof Integer) {
return (int)a + (int)b;
} else if (a instanceof String && b instanceof String) {
return (String)a + (String)b;
} else {
return genericAdd(a, b);
}
}
Object add(Object a,
Object b) {
return genericAdd(a, b);
}
int add(int a,
int b) {
return a + b;
}
String add(String a,
String b) {
return a + b;
}
Node Implementation
  class IAddNode extends BinaryNode {
    int executeInt(Frame f) throws UnexpectedResult {
      int a;
      try {
        a = left.executeInt(f);
      } catch (UnexpectedResult ex) {
        throw rewrite(f, ex.result, right.execute(f));
      }
      int b;
      try {
        b = right.executeInt(f);
      } catch (UnexpectedResult ex) {
        throw rewrite(f, a, ex.result);
      }
      try {
        return Math.addExact(a, b);
      } catch (ArithmeticException ex) {
        throw rewrite(f, a, b);
      }
    }
  }
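The rewriting idea can also be shown without the Truffle classes. The following is a minimal plain-Java sketch, assuming a single add site whose node replaces itself when its speculation fails; AddSite, AddNode, UninitializedAdd, IntAdd, and GenericAdd are hypothetical names, and the UnexpectedResult exception protocol of the slide is replaced here by a direct field update.

```java
// Holds the current node for one add operation in the tree.
final class AddSite {
    AddNode node = new UninitializedAdd();

    Object add(Object a, Object b) {
        return node.execute(this, a, b);
    }
}

abstract class AddNode {
    abstract Object execute(AddSite site, Object a, Object b);
}

// First execution: pick a specialization based on the observed operand types.
final class UninitializedAdd extends AddNode {
    @Override
    Object execute(AddSite site, Object a, Object b) {
        site.node = (a instanceof Integer && b instanceof Integer) ? new IntAdd() : new GenericAdd();
        return site.node.execute(site, a, b);
    }
}

// Speculates on int operands without overflow; rewrites to generic otherwise.
final class IntAdd extends AddNode {
    @Override
    Object execute(AddSite site, Object a, Object b) {
        if (a instanceof Integer && b instanceof Integer) {
            int x = (Integer) a;
            int y = (Integer) b;
            try {
                return Math.addExact(x, y);
            } catch (ArithmeticException overflow) {
                // Speculation failed: fall through and rewrite.
            }
        }
        site.node = new GenericAdd();
        return site.node.execute(site, a, b);
    }
}

// Handles strings and numbers; other operand types are not supported here.
final class GenericAdd extends AddNode {
    @Override
    Object execute(AddSite site, Object a, Object b) {
        if (a instanceof String || b instanceof String) {
            return String.valueOf(a) + b;
        }
        return ((Number) a).doubleValue() + ((Number) b).doubleValue();
    }
}
```

After an int overflow the site stays generic, so the node type doubles as profiling feedback: it records which cases the site has actually seen.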
Specializing FSA
[Diagram: finite state automaton with the states Uninitialized, Double, String, and Generic.]
Truffle DSL
  @Specialization(rewriteOn=ArithmeticException.class)
  int addInt(int a, int b) {
    return Math.addExact(a, b);
  }

  @Specialization
  double addDouble(double a, double b) {
    return a + b;
  }

  @Generic
  Object addGeneric(Frame f, Object a, Object b) {
    // Handling of String omitted for simplicity.
    Number aNum = Runtime.toNumber(f, a);
    Number bNum = Runtime.toNumber(f, b);
    return Double.valueOf(aNum.doubleValue() + bNum.doubleValue());
  }
Inline Caching
[Diagram: a call site moving through the states uninitialized, monomorphic, polymorphic, and megamorphic; node labels U, S, and G.]
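The four call-site states can be sketched as a small cache keyed by receiver class. This is an illustrative sketch only: InlineCache, its LIMIT of two entries, and the lookup method standing in for an expensive method lookup are all assumptions, not part of the Truffle API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class InlineCache {
    static final int LIMIT = 2; // hypothetical cache size before giving up

    private static final class Entry {
        final Class<?> type;
        final Function<Object, String> target;

        Entry(Class<?> type, Function<Object, String> target) {
            this.type = type;
            this.target = target;
        }
    }

    private final List<Entry> entries = new ArrayList<>();
    private boolean megamorphic;
    int slowLookups; // counts full lookups, for illustration only

    String state() {
        if (megamorphic) {
            return "megamorphic";
        }
        switch (entries.size()) {
            case 0: return "uninitialized";
            case 1: return "monomorphic";
            default: return "polymorphic";
        }
    }

    String call(Object receiver) {
        Class<?> type = receiver.getClass();
        if (!megamorphic) {
            for (Entry e : entries) {
                if (e.type == type) {
                    return e.target.apply(receiver); // cache hit: no lookup
                }
            }
            if (entries.size() < LIMIT) {
                Entry e = new Entry(type, lookup(type));
                entries.add(e);
                return e.target.apply(receiver);
            }
            // Too many receiver types: stop caching at this site.
            megamorphic = true;
            entries.clear();
        }
        return lookup(type).apply(receiver);
    }

    // Stand-in for an expensive method lookup.
    private Function<Object, String> lookup(Class<?> type) {
        slowLookups++;
        if (type == Integer.class) {
            return r -> "int:" + r;
        }
        if (type == String.class) {
            return r -> "string:" + r;
        }
        return r -> "object:" + r;
    }
}
```

A site that only ever sees one receiver type pays for a single lookup; once it becomes megamorphic, every call goes through the full lookup again.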
Truffle API Compiler Directives
§ Guards
  if (condition) {
    // some code that is only valid if condition is true
  } else {
    CompilerDirectives.transferToInterpreter();
  }
§ Assumptions
  Assumption assumption = Truffle.getRuntime().createAssumption();

  assumption.check();
  // some code that is only valid if assumption is true

  assumption.invalidate();
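How an assumption protects speculative code can be mimicked in plain Java. The sketch below assumes a global variable whose value is cached by a read node until the variable is reassigned; SketchAssumption, GlobalVariable, and CachedGlobalRead are hypothetical stand-ins, not the Truffle Assumption API, and the slow path here only models what a transfer to the interpreter followed by re-specialization would do.

```java
// Plain-Java stand-in for an assumption: valid until invalidated.
final class SketchAssumption {
    private volatile boolean valid = true;

    boolean isValid() { return valid; }

    void invalidate() { valid = false; }
}

final class GlobalVariable {
    private Object value;
    private SketchAssumption unchanged = new SketchAssumption();

    GlobalVariable(Object value) { this.value = value; }

    Object get() { return value; }

    SketchAssumption unchangedAssumption() { return unchanged; }

    void set(Object newValue) {
        value = newValue;
        unchanged.invalidate();             // existing speculations are now wrong
        unchanged = new SketchAssumption(); // fresh assumption for later speculation
    }
}

final class CachedGlobalRead {
    private final GlobalVariable global;
    private Object cachedValue;
    private SketchAssumption assumption;
    int slowPathCount; // for illustration only

    CachedGlobalRead(GlobalVariable global) { this.global = global; }

    Object execute() {
        if (assumption != null && assumption.isValid()) {
            return cachedValue; // fast path: only valid while the assumption holds
        }
        // Slow path: re-read the variable and speculate again.
        slowPathCount++;
        assumption = global.unchangedAssumption();
        cachedValue = global.get();
        return cachedValue;
    }
}
```

The write side pays for the invalidation, so the read side can stay cheap as long as the variable is not reassigned.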
Performance Number Disclaimers
§ All Truffle numbers reflect the current development snapshot.
– Subject to change at any time (hopefully improve).
– You have to know a benchmark to understand why it is slow or fast.
§ We are not claiming to have complete language implementations.
– JavaScript: quite complete, passing 99.8% of ECMAScript262 tests
– Ruby: passing >45% of RubySpec language tests
– R: early prototype
§ We measure against the latest versions of competitors.
§ We measure peak performance (i.e., giving each benchmark enough iterations to warm up before starting measurement).
§ Benchmarks that are not shown
– may not run at all, or
– may not run fast
Peak Performance: JavaScript Speedup relative to V8
[Bar chart, y-axis from 0.0 to 3.0: speedup relative to V8 for Truffle and SpiderMonkey on richards, deltablue, crypto, raytrace, navier-stokes, splay, earley-boyer, box2d, gbemu, and the composite score.]
Selection of benchmarks from Google's Octane benchmark suite v1.0
Peak Performance: Ruby Speedup relative to JRuby 1.7.5
[Bar chart, y-axis from 0 to 16: speedup relative to JRuby 1.7.5 for MRI 2.0.0, Topaz, and Truffle.]
Peak Performance: R Speedup relative to GNU R
[Bar chart, y-axis from 0.0 to 100.0: speedup of the Truffle R implementation per benchmark; benchmark names are not recoverable from the transcript.]
Language Implementations
Simple Language
JavaScript
R
Python
Ruby
Smalltalk
C
Java
Your language?
Your Language?
§ Truffle API Resources
– http://openjdk.java.net/projects/graal/
– https://wiki.openjdk.java.net/display/Graal/Truffle+FAQ+and+Guidelines

  $ hg clone http://hg.openjdk.java.net/graal/graal
  $ cd graal
  $ ./mx.sh --vm server build
  $ ./mx.sh ideinit
  $ ./mx.sh --vm server unittest SumTest

§ Truffle API License: GPLv2 with Classpath Exception
Acknowledgements
Oracle Labs: Laurent Daynès, Erik Eckstein, Michael Haupt, Peter Kessler, Christos Kotselidis, David Leibs, Roland Schatz, Chris Seaton, Doug Simon, Michael Van De Vanter, Christian Wimmer, Christian Wirth, Mario Wolczko, Thomas Würthinger, Laura Hill (Manager)
Interns: Danilo Ansaloni, Daniele Bonetta, Shams Imam, Stephen Kell, Gregor Richards, Rifat Shariyar
JKU Linz: Prof. Hanspeter Mössenböck, Gilles Duboscq, Matthias Grimmer, Christian Häubl, Josef Haider, Christian Humer, Christian Huber, Manuel Rigger, Lukas Stadler, Bernhard Urban, Andreas Wöß
University of Edinburgh: Christophe Dubach, Juan José Fumero Alfonso, Ranjeet Singh, Toomas Remmelg
LaBRI: Floréal Morandat
University of California, Irvine: Prof. Michael Franz, Codrut Stancu, Gulfem Savrun Yeniceri, Wei Zhang
Purdue University: Prof. Jan Vitek, Tomas Kalibera, Petr Maj, Lei Zhao
T. U. Dortmund: Prof. Peter Marwedel, Helena Kotthaus, Ingo Korb
University of California, Davis: Prof. Duncan Temple Lang, Nicholas Ulle
Q/A
http://openjdk.java.net/projects/graal/
@thomaswue