Graal and Truffle: One VM to Rule Them All

Graal and Truffle: One VM to Rule Them All Thomas Wuerthinger Oracle Labs @thomaswue 12-December-2013, at ETH Zurich

description

Graal is a dynamic meta-circular research compiler for Java that is designed for extensibility and modularity. One of its main distinguishing elements is the handling of optimistic assumptions obtained via profiling feedback and the representation of deoptimization guards in the compiled code. Truffle is a self-optimizing runtime system on top of Graal that uses partial evaluation to derive compiled code from interpreters. Truffle is suitable for creating high-performance implementations of dynamic languages with only moderate effort. The presentation includes a description of the Truffle multi-language API and performance comparisons of the current prototype Truffle language implementations (JavaScript, Ruby, and R) against other implementations in the industry. Both Graal and Truffle are open source and are themselves research platforms in the area of virtual machine and programming language implementation (http://openjdk.java.net/projects/graal/).


Copyright © 2013, Oracle and/or its affiliates. All rights reserved.

Disclaimer

The following is intended to provide some insight into a line of research in Oracle Labs. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described in connection with any Oracle product or service remains at the sole discretion of Oracle. Any views expressed in this presentation are my own and do not necessarily reflect the views of Oracle.


Agenda

§  One VM to Rule Them All?

§  Dynamic Compilation

§  Graal Compiler

§  Truffle System

§  Q&A


One Language to Rule Them All? Let’s ask a search engine…


One Language to Rule Them All? Let’s ask Stack Overflow…


Relative Speed of Programming Languages

(as measured by the Computer Language Benchmarks Game, ~1y ago)

Goal: One VM for all languages means interoperability and being able to choose the best language for the task!


Agenda

§  One VM to Rule Them All?

§  Dynamic Compilation

§  Graal Compiler

§  Truffle System

§  Q&A


Static versus Dynamic Compilation (1)

§ Static (or ahead-of-time) Compilation
–  Compilation happens before the program is run.
–  Can include profiling feedback from sample application runs.

§ Dynamic (or just-in-time) Compilation
–  Compilation happens while the program is running.
–  Baseline execution (interpreter or simple compiler) gathers profiling feedback.
–  Optimization => Deoptimization => Reoptimization cycles.
–  On-stack replacement (OSR) to switch between the tiers (two or more execution modes).


Static versus Dynamic Compilation (2)

§ Static (or ahead-of-time) Compilation
–  Fast start-up, because compilation and profiling are not part of application execution time.
–  Predictable performance, as only the source program affects the generated machine code.

§ Dynamic (or just-in-time) Compilation
–  Can exploit exact target platform properties when generating machine code.
–  Profiling feedback captures part of the application behavior and increases code quality.
–  The deoptimization capabilities allow the optimized code to be incomplete and/or use aggressive speculation.
–  Can use assumptions about the current state of the system (e.g., loaded classes) in the generated code.


Profiling Feedback for Java

§ Branch probabilities
–  Never-taken branches can be omitted.
–  Exact probabilities allow if-cascade reordering.

§ Loop frequencies
–  Guide loop unrolling and loop-invariant code motion.

§ Type profile
–  Optimize instanceof and checkcast type checks (i.e., speculate that only a specific set of types occurs).
–  Optimize virtual calls or interface calls.

Profiling feedback only helps when the program behavior during the observed period matches the overall program behavior.
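To make the idea of a type profile concrete, here is a minimal sketch of what a baseline tier might record at one call site. The class `TypeProfileSketch` and its methods are hypothetical names chosen for illustration; they are not HotSpot or Graal APIs, and a real VM stores such profiles in a far more compact form.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical call-site type profile; the names are illustrative, not HotSpot or Graal APIs.
final class TypeProfileSketch {
    private final Map<Class<?>, Integer> counts = new HashMap<>();
    private int total;

    // Called by the baseline tier each time the call site executes.
    void record(Object receiver) {
        counts.merge(receiver.getClass(), 1, Integer::sum);
        total++;
    }

    // The single observed receiver type, or null if zero or several types were seen.
    Class<?> monomorphicType() {
        return counts.size() == 1 ? counts.keySet().iterator().next() : null;
    }

    // Observed fraction of executions with the given receiver type.
    double probability(Class<?> type) {
        return total == 0 ? 0.0 : counts.getOrDefault(type, 0) / (double) total;
    }
}
```

A compiler could speculate on `monomorphicType()` to devirtualize the call, guarded by a type check that deoptimizes when a different type shows up later.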


Static Single Assignment (SSA) Form

§ Every variable is assigned only once.
§ Phis capture values coming from different control flow branches.
§ Commonly used in compilers as it simplifies optimizations and traversal along the def-use and use-def chains.

Before conversion to SSA form:

    ...
    if (condition) {
        x = value1 + value2;
    } else {
        x = value2;
    }
    return x;

In SSA form:

    ...
    if (condition) {
        x1 = value1 + value2;
    } else {
        x2 = value2;
    }
    x3 = phi(x1, x2);
    return x3;


Agenda

§  One VM to Rule Them All?

§  Dynamic Compilation

§  Graal Compiler

§  Truffle System

§  Q&A


Graal is an extensible, dynamic compiler using object-oriented Java programming, a graph intermediate representation, and Java snippets.


HotSpotVM versus GraalVM

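[Diagram: in HotSpot, the Compilation Queue feeds the Compiler Interface, which drives the Client and Server compilers (C++). In the Graal VM, the same HotSpot Compilation Queue and Compiler Interface drive Graal (Java). Code sizes shown on the slide, in order: 30k LOC, 120k LOC, 60k LOC.]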


Why Java?

Tooling: Java IDEs speed up the development process.

Robustness: Runtime exceptions not fatal.

Reflection: Annotations instead of macros.

Meta-Evaluation: IR subgraph expressible in Java code.

Extensibility: No language barrier to the application.


Snippets for Graph Construction

Manual construction:

    Node max(ValueNode a, ValueNode b) {
        // a > b is expressed as b < a
        IfNode ifNode = new IfNode(new IntegerLessThanNode(b, a));
        ifNode.trueSuccessor().setNext(new ReturnNode(a));
        ifNode.falseSuccessor().setNext(new ReturnNode(b));
        return ifNode;
    }

Expression as snippet:

    int max(int a, int b) {
        if (a > b) return a;
        else return b;
    }

Data Code


Lowering

§ Replace one node with multiple other nodes.
–  New nodes provide a more detailed description of semantics.
–  New nodes can be optimized and moved separately.

§ General Java lowerings
–  Example: Replace an array store with null check, bounds check, store check, write operation.

§ VM-specific lowerings
–  Example: Replace a monitorenter with code that depends on the locking schemes used by the VM.

    if (array != null
            && index >= 0
            && index < array.length
            && canAssign(array.getClass().getComponentType(), value)) {
        *(array + 16 + index*8) = value;
    } else {
        deoptimize;
    }
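The lowered array store above can be approximated in plain Java. This is only a sketch under stated assumptions: `LoweredArrayStoreSketch`, `Deoptimize`, and `canAssign` are hypothetical names, the deoptimization is modeled as an exception, and the final write is an ordinary Java array store (which the JVM still checks on its own) rather than the raw memory write shown on the slide.

```java
// Sketch of the lowered array store from the slide; all names here are hypothetical.
final class LoweredArrayStoreSketch {
    // Stands in for "deoptimize": a real VM would transfer control to the interpreter.
    static final class Deoptimize extends RuntimeException {
        Deoptimize(String reason) {
            super(reason);
        }
    }

    static void store(Object[] array, int index, Object value) {
        if (array != null                                   // null check
                && index >= 0 && index < array.length       // bounds check
                && canAssign(array.getClass().getComponentType(), value)) { // store check
            array[index] = value;                           // write operation
        } else {
            throw new Deoptimize("array store check failed");
        }
    }

    // A null value can be stored into any reference array.
    static boolean canAssign(Class<?> componentType, Object value) {
        return value == null || componentType.isInstance(value);
    }
}
```

Because each check is a separate condition, a compiler working at this level can remove or move them individually, for example hoisting the null check out of a loop.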


Gradual Lowering

[Chart: nodes per bytecode (y-axis, 0 to 3) for Graal, Client, and Server at four stages: after parsing, after optimizations, after lowering, before code emission.]

Numbers obtained while running the DaCapo benchmark suite.


Extensibility

abstract class Phase { abstract void run(Graph g); }

•  Multiple Target Platforms (AMD64, SPARC, PTX, HSAIL)

•  Multiple Runtimes (HotSpot and Maxine)

•  Adding new types of Nodes

•  Adding new compiler Phases

for (IfNode n : graph.getNodes(IfNode.class)) { ... }

Compiler has about 100 different individual modules.
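As a rough illustration of how the `Phase` class and the typed node iteration above fit together, the following sketch defines toy versions of `Graph`, `Node`, and `IfNode`. These are stand-ins written for this example only; they share names with the slide but are not the real Graal classes, and `CountIfsPhase` is a hypothetical phase.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-ins for the compiler classes named on the slide; not the real Graal API.
final class PhaseSketch {
    static abstract class Node { }
    static final class IfNode extends Node { }
    static final class ReturnNode extends Node { }

    static final class Graph {
        private final List<Node> nodes = new ArrayList<>();

        <T extends Node> T add(T node) {
            nodes.add(node);
            return node;
        }

        // Returns all nodes of the requested type, in insertion order.
        <T extends Node> List<T> getNodes(Class<T> type) {
            List<T> result = new ArrayList<>();
            for (Node n : nodes) {
                if (type.isInstance(n)) {
                    result.add(type.cast(n));
                }
            }
            return result;
        }
    }

    static abstract class Phase {
        abstract void run(Graph g);
    }

    // A new phase is just another subclass; this one only counts IfNodes.
    static final class CountIfsPhase extends Phase {
        int ifCount;

        @Override
        void run(Graph g) {
            ifCount = 0;
            for (IfNode n : g.getNodes(IfNode.class)) {
                ifCount++;
            }
        }
    }
}
```

Adding a new node type or phase in this style requires no change to existing classes, which is the kind of extensibility the slide describes.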


Graph IR

[Diagram: graph IR for the code below, with the nodes If (input: condition), a Begin and End per branch, Merge, Phi, Add (inputs: value1, value2), and Return.]

    ...
    if (condition) {
        result = value1 + value2;
    } else {
        result = value2;
    }
    return result;

•  Static single assignment (SSA) form with def-use and use-def edges.

•  Program dependence graph (sea of nodes), but with explicit distinction between control flow and data flow edges.

•  Graph visualization tools: IdealGraphVisualizer and c1visualizer.


Guards

int get(x) { return x.field; }


Guards

int get(x) { if (cond) return x.field; else return 0; }


Eliding Exception Edges

Exception source     Potential    Actual    Actual/Potential
Invoke                 1296646     14454    1.11%
BoundsCheck             166770       498    0.30%
NullCheck              1525061       686    0.04%
OutOfMemory             110078         0    0.00%
CheckCast                99192         0    0.00%
DivRem                    6082         0    0.00%
MonitorNullCheck         33631         0    0.00%
TOTAL                  3237460     15638    0.48%

[Diagram: Operation nodes with an exception edge to a Catch node.]

Numbers obtained while running the DaCapo benchmark suite.


Graal GPU Backends

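[Diagram: JavaScript, Ruby, Python, … are represented as Truffle ASTs. Truffle ASTs and Java bytecodes are both translated to Graal IR, which the GPU backends compile to PTX and HSAIL.]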


Java Peak Performance

§ SPECjvm2008

[Bar chart, y-axis 0 to 120: Client 76, Graal 100, Server 114.]

Configuration: Intel Core i7-3770 @ 3.4 GHz, 4 cores, 8 threads, 16 GB RAM. Comparison against HotSpot changeset tag hs25-b37 from June 13, 2013.


Scala Peak Performance

§ Scala-Dacapo Benchmark Suite

[Bar chart, y-axis 0 to 120: Client 61, Graal 100, Server 106.]

Configuration: Intel Core i7-3770 @ 3.4 GHz, 4 cores, 8 threads, 16 GB RAM. Comparison against HotSpot changeset tag hs25-b37 from June 13, 2013.


Your Compiler Extension?

§ Graal Resources

http://openjdk.java.net/projects/graal/

https://wiki.openjdk.java.net/display/Graal/Main

[email protected]

    $ hg clone http://hg.openjdk.java.net/graal/graal
    $ cd graal
    $ ./mx.sh --vm graal build
    $ ./mx.sh ideinit
    $ ./mx.sh --vm graal vm

§ Graal License: GPLv2


Agenda

§  One VM to Rule Them All?

§  Dynamic Compilation

§  Graal Compiler

§  Truffle System

§  Q&A


“Write Your Own Language”

Current situation:

§ Prototype a new language
–  Parser and language work to build syntax tree (AST), AST interpreter
§ Write a “real” VM
–  In C/C++, still using AST interpreter, spend a lot of time implementing runtime system, GC, …
§ People start using it
§ Define a bytecode format and write bytecode interpreter
§ People complain about performance
§ Write a JIT compiler, improve the garbage collector
§ Performance is still bad

How it should be:

§ Prototype a new language in Java
–  Parser and language work to build syntax tree (AST)
–  Execute using AST interpreter
§ People start using it
§ And it is already fast


Truffle: System Structure

Layer                            Written by:              Written in:
Guest Language Application       Application Developer    Guest Language
Guest Language Implementation    Language Developer       Managed Host Language
Host Services                    VM Expert                Managed Host Language or Unmanaged Language
OS                               OS Expert                Unmanaged Language (typically C or C++)


Speculate and Optimize …

[Diagram: an AST interpreter starts with uninitialized nodes. Node rewriting for profiling feedback turns them into specialized nodes in the AST interpreter. Compilation using partial evaluation then produces compiled code.]

Node transitions are shown between the states Uninitialized (U), Integer (I), Double (D), String (S), and Generic (G).


Partial Evaluation

§ Example function:
–  f(x, y) = x + y + 1

§ Partial evaluation of example function:
–  g(y) = f(1, y) = 1 + y + 1 = y + 2

§ Interpreter function:
–  f(program, arguments) = calculations to interpret the program

§ Partial evaluation of interpreter function (first Futamura projection):
–  g(arguments) = f(#specificProgram, arguments) = compiled version of #specificProgram that takes arguments as parameters
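The two examples above can be written out in Java to show what a partial evaluator is expected to produce. This is a hand-worked sketch, not Truffle's partial evaluator: the class `PartialEvaluationSketch`, the two-instruction `Op` language, and the method `residual` are all hypothetical, and the residual code is written by hand to show the result of unrolling the interpreter loop for one fixed program.

```java
// Hand-worked illustration of partial evaluation; every name here is hypothetical.
final class PartialEvaluationSketch {
    // Example function from the slide: f(x, y) = x + y + 1.
    static int f(int x, int y) {
        return x + y + 1;
    }

    // f specialized for x = 1, after constant folding: g(y) = y + 2.
    static int g(int y) {
        return y + 2;
    }

    // A tiny instruction set for a toy interpreter.
    enum Op { INC, DOUBLE }

    // Interpreter function: f(program, argument).
    static int interpret(Op[] program, int argument) {
        int value = argument;
        for (Op op : program) {
            switch (op) {
                case INC:
                    value = value + 1;
                    break;
                case DOUBLE:
                    value = value * 2;
                    break;
            }
        }
        return value;
    }

    // Residual code for the fixed program {INC, DOUBLE}: the loop and the
    // dispatch are gone, only the computation on the argument remains.
    static int residual(int argument) {
        return (argument + 1) * 2;
    }
}
```

For any argument, `interpret(new Op[] {Op.INC, Op.DOUBLE}, argument)` and `residual(argument)` compute the same value; the second is the "compiled version" of that specific program.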


… and Deoptimize and Reoptimize!

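[Diagram: deoptimization from compiled code back to the AST interpreter; node rewriting to update profiling feedback, in which some Integer (I) nodes become Double (D) nodes while Generic (G) nodes stay unchanged; recompilation using partial evaluation.]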


    Object add(Object a, Object b) {
        if (a instanceof Integer && b instanceof Integer) {
            return (int)a + (int)b;
        } else if (a instanceof String && b instanceof String) {
            return (String)a + (String)b;
        } else {
            return genericAdd(a, b);
        }
    }

    Object add(Object a, Object b) {
        return genericAdd(a, b);
    }

    int add(int a, int b) {
        return a + b;
    }

    String add(String a, String b) {
        return a + b;
    }


Node Implementation

    class IAddNode extends BinaryNode {
        int executeInt(Frame f) throws UnexpectedResult {
            int a;
            try {
                a = left.executeInt(f);
            } catch (UnexpectedResult ex) {
                // Left operand is not an int: rewrite this node.
                throw rewrite(f, ex.result, right.execute(f));
            }

            int b;
            try {
                b = right.executeInt(f);
            } catch (UnexpectedResult ex) {
                // Right operand is not an int: rewrite this node.
                throw rewrite(f, a, ex.result);
            }

            try {
                return Math.addExact(a, b);
            } catch (ArithmeticException ex) {
                // Integer overflow: rewrite this node.
                throw rewrite(f, a, b);
            }
        }
    }
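The rewriting behavior of such a node can be tried out with a much smaller, self-contained model. The sketch below is not the Truffle API: `NodeRewritingSketch`, `AddNode`, and `AddImpl` are hypothetical, the node swaps an implementation field instead of replacing itself in a tree, and the generic case handles only `Number` operands (strings are left out, as on the Truffle DSL slide).

```java
// Simplified model of a self-rewriting add node; not the Truffle API.
final class NodeRewritingSketch {
    interface AddImpl {
        Object add(AddNode site, Object a, Object b);
    }

    static final class AddNode {
        AddImpl impl = UNINITIALIZED;

        Object execute(Object a, Object b) {
            return impl.add(this, a, b);
        }
    }

    // Generic state: handles any Number operands using double arithmetic.
    static final AddImpl GENERIC = (site, a, b) ->
            ((Number) a).doubleValue() + ((Number) b).doubleValue();

    // Integer state: fast path for ints, rewrites to GENERIC on other types or overflow.
    static final AddImpl INTEGER = (site, a, b) -> {
        if (a instanceof Integer && b instanceof Integer) {
            int x = (Integer) a;
            int y = (Integer) b;
            try {
                return Math.addExact(x, y);
            } catch (ArithmeticException overflow) {
                // Fall through to the rewrite below.
            }
        }
        site.impl = GENERIC;
        return site.execute(a, b);
    };

    // Uninitialized state: picks a specialization based on the first operands seen.
    static final AddImpl UNINITIALIZED = (site, a, b) -> {
        site.impl = (a instanceof Integer && b instanceof Integer) ? INTEGER : GENERIC;
        return site.execute(a, b);
    };
}
```

After the first integer addition the node stays in the Integer state; an overflow or a non-integer operand moves it to the Generic state, and it never moves back, which keeps the number of rewrites bounded.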


Specializing FSA

[Diagram: finite-state automaton of node specializations with the states Uninitialized, Double, String, and Generic.]


Truffle DSL

    @Specialization(rewriteOn=ArithmeticException.class)
    int addInt(int a, int b) {
        return Math.addExact(a, b);
    }

    @Specialization
    double addDouble(double a, double b) {
        return a + b;
    }

    @Generic
    Object addGeneric(Frame f, Object a, Object b) {
        // Handling of String omitted for simplicity.
        Number aNum = Runtime.toNumber(f, a);
        Number bNum = Runtime.toNumber(f, b);
        return Double.valueOf(aNum.doubleValue() + bNum.doubleValue());
    }


Inline Caching

[Diagram: call-site cache states from left to right: uninitialized, monomorphic, polymorphic, megamorphic. The nodes shown are labeled U, S, and G.]


Method Inlining


Truffle API Compiler Directives

§ Guards

    if (condition) {
        // some code that is only valid if condition is true
    } else {
        CompilerDirectives.transferToInterpreter();
    }

§ Assumptions

    Assumption assumption = Truffle.getRuntime().createAssumption();

    assumption.check();
    // some code that is only valid if assumption is true

    assumption.invalidate();
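To show how the check/invalidate pattern might be used, the following sketch models an assumption with a plain boolean flag. It deliberately does not use the Truffle classes: `AssumptionSketch`, its nested `Assumption`, `InvalidAssumption`, `Global`, and `ReadGlobal` are hypothetical, single-threaded stand-ins. In a real system the compiled code would carry no check at all and invalidation would trigger deoptimization instead.

```java
// Toy model of the assumption pattern; these classes are not the Truffle API.
final class AssumptionSketch {
    static final class InvalidAssumption extends Exception { }

    static final class Assumption {
        private boolean valid = true;

        void check() throws InvalidAssumption {
            if (!valid) {
                throw new InvalidAssumption();
            }
        }

        void invalidate() {
            valid = false;
        }
    }

    // A global variable that is assumed to stay unchanged between writes.
    static final class Global {
        private int value;
        private Assumption stable = new Assumption();

        Global(int value) {
            this.value = value;
        }

        void set(int newValue) {
            value = newValue;
            stable.invalidate();       // readers that cached the old value must give up
            stable = new Assumption(); // fresh assumption for the new value
        }
    }

    // A read node that caches the value for as long as the assumption holds.
    static final class ReadGlobal {
        private final Global global;
        private Assumption assumption;
        private int cached;
        int slowPaths;

        ReadGlobal(Global global) {
            this.global = global;
            respecialize();
        }

        private void respecialize() {
            assumption = global.stable;
            cached = global.value;
            slowPaths++;
        }

        int execute() {
            try {
                assumption.check();
            } catch (InvalidAssumption e) {
                respecialize(); // slow path: re-read and re-cache
            }
            return cached;
        }
    }
}
```

The write in `Global.set` pays the cost of invalidation so that every read can stay on the cached fast path, which is a reasonable trade when writes are rare.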


Performance Number Disclaimers

§ All Truffle numbers reflect the current development snapshot.
–  Subject to change at any time (hopefully improve)
–  You have to know a benchmark to understand why it is slow or fast

§ We are not claiming to have complete language implementations.
–  JavaScript: quite complete, passing 99.8% of ECMAScript262 tests
–  Ruby: passing >45% of RubySpec language tests
–  R: early prototype

§ We measure against latest versions of competitors.

§ We measure peak performance (i.e., giving each benchmark enough iterations to warm up before starting measurement).

§ Benchmarks that are not shown
–  may not run at all, or
–  may not run fast


Peak Performance: JavaScript Speedup relative to V8

[Bar chart: speedup relative to V8 (y-axis 0.0 to 3.0) for Truffle and SpiderMonkey on richards, deltablue, crypto, raytrace, navier-stokes, splay, earley-boyer, box2d, gbemu, and the composite score. Data labels in extraction order (pairing with bars not recoverable): 1.0, 1.5, 0.6, 0.7, 0.9, 2.6, 0.5, 1.4, 0.8, 1.0, 0.7, 0.8, 1.0, 0.7, 1.1, 1.6, 0.6, 1.1, 1.2, 0.9.]

Selection of benchmarks from Google's Octane benchmark suite v1.0


Peak Performance: Ruby Speedup relative to JRuby 1.7.5

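[Bar chart: speedup relative to JRuby 1.7.5 (y-axis 0 to 16) for MRI 2.0.0, Topaz, and Truffle. Benchmark names were not preserved. Data labels in extraction order (pairing with bars not recoverable): 0.4, 0.2, 0.5, 0.7, 1.7, 0.8, 0.6, 0.3, 1.7, 14, 4.7, 2.7, 4.9, 1.0, 0.7, 2.7, 14, 4.5, 1.1, 1.8, 1.7.]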


Peak Performance: R Speedup relative to GNU R

[Bar chart: speedup relative to GNU R (y-axis 0.0 to 100.0). Benchmark names were not preserved. Data labels in extraction order: 2.1, 2.0, 2.7, 94, 0.8, 22, 38, 24, 14, 39, 23.]


Language Implementations

§ Simple Language
§ JavaScript
§ R
§ Python
§ Ruby
§ Smalltalk
§ C
§ Java
§ Your language?


Your Language?

§ Truffle API Resources

http://openjdk.java.net/projects/graal/

https://wiki.openjdk.java.net/display/Graal/Truffle+FAQ+and+Guidelines

[email protected]

    $ hg clone http://hg.openjdk.java.net/graal/graal
    $ cd graal
    $ ./mx.sh --vm server build
    $ ./mx.sh ideinit
    $ ./mx.sh --vm server unittest SumTest

§ Truffle API License: GPLv2 with Classpath Exception


Acknowledgements

Oracle Labs: Laurent Daynès, Erik Eckstein, Michael Haupt, Peter Kessler, Christos Kotselidis, David Leibs, Roland Schatz, Chris Seaton, Doug Simon, Michael Van De Vanter, Christian Wimmer, Christian Wirth, Mario Wolczko, Thomas Würthinger, Laura Hill (Manager)

Oracle Labs Interns: Danilo Ansaloni, Daniele Bonetta, Shams Imam, Stephen Kell, Gregor Richards, Rifat Shariyar

JKU Linz: Prof. Hanspeter Mössenböck, Gilles Duboscq, Matthias Grimmer, Christian Häubl, Josef Haider, Christian Humer, Christian Huber, Manuel Rigger, Lukas Stadler, Bernhard Urban, Andreas Wöß

University of Edinburgh: Christophe Dubach, Juan José Fumero Alfonso, Ranjeet Singh, Toomas Remmelg

LaBRI: Floréal Morandat

University of California, Irvine: Prof. Michael Franz, Codrut Stancu, Gulfem Savrun Yeniceri, Wei Zhang

Purdue University: Prof. Jan Vitek, Tomas Kalibera, Petr Maj, Lei Zhao

T. U. Dortmund: Prof. Peter Marwedel, Helena Kotthaus, Ingo Korb

University of California, Davis: Prof. Duncan Temple Lang, Nicholas Ulle


Q/A

http://openjdk.java.net/projects/graal/

[email protected]

@thomaswue
