Java Jit. Compilation and optimization by Andrey Kovalenko
![Page 1: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/1.jpg)
JAVA JIT
Compilation and optimization
![Page 2: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/2.jpg)
2
AGENDA
o What is JIT
o Types - Client, Server, Tiered
o Main optimization approaches
o JIT tuning
o Conclusions
![Page 3: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/3.jpg)
3
WHAT IS JIT
o Just-In-Time compiler
o Compilation done during execution of a program (at run time) rather than prior to execution
o First presented in 1960 in LISP
o Java, .NET, JS…
o Oracle HotSpot, IBM J9, Azul…
![Page 4: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/4.jpg)
4
WHAT IS JIT
o JIT separates optimization from SD (just update the JVM instead of improving the code; the JVM tunes for your platform)
o JIT'ing requires profiling
  • Because you don't want to JIT everything
o Profiling allows better code-gen
  • Inline what's hot
  • Loop unrolling, range-check elimination, etc.
  • Branch prediction, spill-code-gen, scheduling
![Page 5: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/5.jpg)
5
HOTSPOT JIT CLIENT (C1) WORKFLOW
[Diagram: Java source → bytecode compiler → bytecode → JIT compiler (at run time, after 1.5K invocations) → optimized code]
![Page 6: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/6.jpg)
6
JIT CLIENT (C1)
o Produces compilations quickly
o Generated code runs relatively slowly
![Page 7: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/7.jpg)
7
HOTSPOT JIT SERVER (C2) WORKFLOW
[Diagram: Java source → bytecode compiler → bytecode → run time. At run time: profiler, HotSpot info, JIT compiler (optimization) producing optimized native code after 10K invocations, and JIT compiler (deoptimization).]
![Page 8: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/8.jpg)
8
HOTSPOT JIT SERVER (C2)
o Produces compilations slowly (long warm-up)
o Generated code runs fast
o Profiler guided
o Speculative
![Page 9: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/9.jpg)
9
HOTSPOT JIT TIERED (C1 + C2)
o Available from Java 7
o Default in Java 8
o Best of C1 and C2 approaches
o Level 0 = Interpreter
o Levels 1-3 = C1
  • #1: C1 without profiling
  • #2: C1 with basic profiling (invocations)
  • #3: C1 with full profiling (~35% overhead)
o Level 4 = C2
![Page 10: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/10.jpg)
10
KEYS FOR JIT VERSION
o -client
o -server (-d64)
o -server (-d64) -XX:+TieredCompilation
![Page 11: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/11.jpg)
11
DEFAULT JIT VERSION
| Install bits | -client | -server | -d64 |
|---|---|---|---|
| Linux 32-bit | 32-bit client compiler | 32-bit server compiler | Error |
| Linux 64-bit | 64-bit server compiler | 64-bit server compiler | 64-bit server compiler |
| Mac OS X | 64-bit server compiler | 64-bit server compiler | 64-bit server compiler |
| Windows 32-bit | 32-bit client compiler | 32-bit server compiler | Error |
| Windows 64-bit | 64-bit server compiler | 64-bit server compiler | 64-bit server compiler |

| OS | Default compiler |
|---|---|
| Windows, 32-bit, any number of CPUs | -client |
| Windows, 64-bit, any number of CPUs | -server |
| MacOS, any number of CPUs | -server |
| Linux/Solaris, 32-bit, 1 CPU | -client |
| Linux/Solaris, 32-bit, 2 or more CPUs | -server |
| Linux, 64-bit, any number of CPUs | -server |

*In Java 8 the server compiler is the default in any of these cases
Information about default compiler
% java -version
java version "1.7.0"
Java(TM) SE Runtime Environment (build 1.7.0-b147)
Java HotSpot(TM) Server VM (build 21.0-b17, mixed mode)
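The same information that `java -version` prints can also be read from inside a running program. The sketch below is only an illustration: the class name `VmInfo` is hypothetical, and `java.vm.info` is a HotSpot-provided property that may be absent on other JVMs.

```java
// Sketch: report which VM (for example a "Server VM") and which mode are in use.
class VmInfo {
    // Builds a one-line description from system properties.
    static String describe() {
        String name = System.getProperty("java.vm.name");       // standard property
        String version = System.getProperty("java.vm.version"); // standard property
        String info = System.getProperty("java.vm.info");       // HotSpot-specific, may be null elsewhere
        return name + " (build " + version + ", " + info + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

On a HotSpot server VM the printed name is expected to contain "Server VM", matching the last line of the `java -version` output.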
![Page 12: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/12.jpg)
12
OPTIMIZATIONS IN HOTSPOT JVM
• compiler tactics
  – delayed compilation
  – tiered compilation
  – on-stack replacement
  – delayed reoptimization
  – program dependence graph representation
  – static single assignment representation
• proof-based techniques
  – exact type inference
  – memory value inference
  – memory value tracking
  – constant folding
  – reassociation
  – operator strength reduction
  – null check elimination
  – type test strength reduction
  – type test elimination
  – algebraic simplification
  – common subexpression elimination
  – integer range typing
• flow-sensitive rewrites
  – conditional constant propagation
  – dominating test detection
  – flow-carried type narrowing
  – dead code elimination
• language-specific techniques
  – class hierarchy analysis
  – devirtualization
  – symbolic constant propagation
  – autobox elimination
  – escape analysis
  – lock elision
  – lock fusion
  – de-reflection
• speculative (profile-based) techniques
  – optimistic nullness assertions
  – optimistic type assertions
  – optimistic type strengthening
  – optimistic array length strengthening
  – untaken branch pruning
  – optimistic N-morphic inlining
  – branch frequency prediction
  – call frequency prediction
• memory and placement transformation
  – expression hoisting
  – expression sinking
  – redundant store elimination
  – adjacent store fusion
  – card-mark elimination
  – merge-point splitting
• loop transformations
  – loop unrolling
  – loop peeling
  – safepoint elimination
  – iteration range splitting
  – range check elimination
  – loop vectorization
• global code shaping
  – inlining (graph integration)
  – global code motion
  – heat-based code layout
  – switch balancing
  – throw inlining
• control flow graph transformation
  – local code scheduling
  – local code bundling
  – delay slot filling
  – graph-coloring register allocation
  – linear scan register allocation
  – live range splitting
  – copy coalescing
  – constant splitting
  – copy removal
  – address mode matching
  – instruction peepholing
  – DFA-based code generator
![Page 13: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/13.jpg)
13
INLINING – MOTHER OF OPTIMIZATION
*Uses JVM devirtualization if needed. Frequency and size matter.

Before:
int addAll(int max) {
    int accum = 0;
    for (int i = 0; i < max; i++) {
        accum = add(accum, i);   // call to a small, hot method
    }
    return accum;
}
int add(int a, int b) { return a + b; }

After:
int addAll(int max) {
    int accum = 0;
    for (int i = 0; i < max; i++) {
        accum = accum + i;       // body of add() inlined at the call site
    }
    return accum;
}
int add(int a, int b) { return a + b; }
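To watch the inlining decision for this slide's example, the methods can be wrapped in a small runnable class. This is a minimal sketch: the class name `InlineDemo` and the round count are assumptions chosen so that `addAll` is invoked often enough to be considered hot.

```java
// Sketch: a small hot method (add) called from a loop, as on the slide.
class InlineDemo {
    static int add(int a, int b) {
        return a + b;
    }

    static int addAll(int max) {
        int accum = 0;
        for (int i = 0; i < max; i++) {
            accum = add(accum, i);
        }
        return accum;
    }

    public static void main(String[] args) {
        long total = 0;
        // ASSUMPTION: 20,000 rounds is simply "more than the invocation thresholds
        // quoted in the slides" so that addAll gets compiled.
        for (int round = 0; round < 20_000; round++) {
            total += addAll(1_000);
        }
        System.out.println(total); // printing keeps the result live
    }
}
```

Running it with `java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining -XX:+PrintCompilation InlineDemo` should list `add` among the inlining decisions for `addAll`; the exact log format depends on the JVM version.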
![Page 14: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/14.jpg)
14
OSR – ON-STACK REPLACEMENT
o Running method never exits?
o But it's getting really hot?
o Generally means loops, back-branching
o Compile and replace while running
o Not typically useful in large systems
o Looks great on benchmarks!
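The situation described above (a method entered once that stays hot because of its loop) can be reproduced with a few lines. This is a sketch only; the class name `OsrDemo` and the iteration count are assumptions.

```java
// Sketch: a method that is invoked once but spends a long time in one loop.
class OsrDemo {
    // The invocation counter stays at 1; only the loop's back-branches make it hot.
    static long spin(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        // ASSUMPTION: 200 million iterations is just "long enough" for the JIT to step in.
        System.out.println(spin(200_000_000));
    }
}
```

With `-XX:+PrintCompilation`, on-stack-replacement compilations are typically flagged with a `%` marker next to the method name, which makes them easy to tell apart from ordinary compilations.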
![Page 15: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/15.jpg)
15
ESCAPE ANALYSIS
o Object is referenced only inside some loop; no other code can ever access that object?
o It needn't get a synchronization lock when calling the methods working with the object
o It needn't store the fields in memory; it can keep those values in registers
o Similarly, it can keep the object's references in a register
![Page 16: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/16.jpg)
16
ESCAPE ANALYSIS
import java.math.BigInteger;
import java.util.ArrayList;

public class Factorial {
    private BigInteger factorial;
    private int n;

    public Factorial(int n) {
        this.n = n;
    }

    public synchronized BigInteger getFactorial() {
        if (factorial == null)
            factorial = ...;   // computation elided on the slide
        return factorial;
    }
}

// Client code: each Factorial object is used only inside one loop iteration
ArrayList<BigInteger> list = new ArrayList<BigInteger>();
for (int i = 0; i < 100; i++) {
    Factorial factorial = new Factorial(i);
    list.add(factorial.getFactorial());
}
![Page 17: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/17.jpg)
17
ESCAPE ANALYSIS (SIMPLE CASE)
o It needn't get a synchronization lock when calling the getFactorial() method.
o It needn't store the field n in memory; it can keep that value in a register.
o It can just keep track of the individual fields of the object.
o Sometimes it needn't execute it at all.
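A runnable variant of the Factorial example makes the "object never escapes the loop" condition concrete. This is a sketch under stated assumptions: the slide elides how the factorial is computed, so the product loop below is a stand-in, and the names `EscapeDemo` and `firstFactorials` are hypothetical.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Sketch: the Factorial object is created, used and dropped inside one loop iteration.
class EscapeDemo {
    static final class Factorial {
        private BigInteger factorial;
        private final int n;

        Factorial(int n) {
            this.n = n;
        }

        // ASSUMPTION: the slide elides the computation; a simple product loop stands in for it.
        synchronized BigInteger getFactorial() {
            if (factorial == null) {
                BigInteger result = BigInteger.ONE;
                for (int k = 2; k <= n; k++) {
                    result = result.multiply(BigInteger.valueOf(k));
                }
                factorial = result;
            }
            return factorial;
        }
    }

    static List<BigInteger> firstFactorials(int count) {
        List<BigInteger> list = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // Only the BigInteger result is stored in the list; the Factorial
            // instance itself is never visible outside this iteration.
            Factorial factorial = new Factorial(i);
            list.add(factorial.getFactorial());
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(firstFactorials(100).get(5));
    }
}
```

Because no other thread can ever see the `Factorial` instance, the lock taken by `getFactorial()` is a candidate for elision once the method is inlined. On HotSpot, running once with `-XX:-DoEscapeAnalysis` gives a baseline to compare against, though any measured difference will depend on the JVM version and workload.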
![Page 18: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/18.jpg)
19
JIT TUNING (THESE MIGHT SAVE YOU)

o -client, -server or -XX:+TieredCompilation
o -XX:ReservedCodeCacheSize=, -XX:InitialCodeCacheSize=
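Before changing the code cache sizes it helps to see how much of the cache is actually in use. The sketch below reads this through the standard `java.lang.management` API. The class name `CodeCacheInfo` is hypothetical, and matching pools by the substring "Code" is an assumption, since pool names differ between JVM versions.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Sketch: list the memory pools that hold JIT-compiled code and their usage.
class CodeCacheInfo {
    static List<String> describeCodePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // ASSUMPTION: code cache pools are non-heap pools whose name contains "Code".
            boolean looksLikeCodeCache = pool.getType() == MemoryType.NON_HEAP
                    && pool.getName().contains("Code");
            if (!looksLikeCodeCache) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() returns -1 when no maximum is defined for the pool.
            lines.add(pool.getName() + ": used=" + usage.getUsed()
                    + " bytes, max=" + usage.getMax() + " bytes");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeCodePools()) {
            System.out.println(line);
        }
    }
}
```

If the used figure approaches the maximum, raising `-XX:ReservedCodeCacheSize=` is the setting from this slide to look at.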
![Page 19: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/19.jpg)
20
JIT TUNING
o -XX:CompileThreshold= number of invocations that triggers compilation
o -XX:CICompilerCount= number of compiler threads
o -XX:FreqInlineSize= maximum bytecode size of a hot method that will be inlined (default value 325 bytes)
o -XX:MaxInlineSize= methods smaller than this will be inlined anyway (default value 35 bytes)
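The effective values of these options can be checked from inside the JVM before tuning anything. This is a sketch that assumes a HotSpot-based JVM, because `com.sun.management.HotSpotDiagnosticMXBean` is HotSpot-specific. The class name `JitFlags` is hypothetical, and `FreqInlineSize` is the HotSpot name of the hot-method inline limit.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Sketch: read the current values of a few JIT-related VM options.
class JitFlags {
    // Returns the option's value as a string, or "n/a" if this JVM does not know the option.
    static String flagValue(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        if (bean == null) {
            return "n/a";
        }
        try {
            return bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            return "n/a"; // thrown when no option with that name exists
        }
    }

    public static void main(String[] args) {
        String[] names = {"CompileThreshold", "CICompilerCount", "FreqInlineSize", "MaxInlineSize"};
        for (String name : names) {
            System.out.println(name + " = " + flagValue(name));
        }
    }
}
```

Comparing the output with and without a flag on the command line is a quick way to confirm that a tuning option was picked up.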
![Page 20: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/20.jpg)
21
WANT TO GET MORE DETAILS?
(BE CAREFUL WITH USING THEM IN PRODUCTION)

o -XX:+UnlockDiagnosticVMOptions - enables diagnostic options such as LogCompilation, PrintAssembly and PrintInlining
o -XX:+TraceClassLoading
o -XX:+LogCompilation
o -XX:+PrintAssembly
o -XX:+PrintCompilation - info about compiled methods
o -XX:+PrintInlining - info about inlining decisions
o -XX:CompileCommand=… - to control compilation policy
![Page 21: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/21.jpg)
22
WANT TO GET MORE DETAILS? – LOGS
![Page 22: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/22.jpg)
23
WANT TO GET MORE DETAILS – JITWATCH, JSTAT
![Page 23: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/23.jpg)
24
CONCLUSIONS
o KISS, SOLID, DRY, YAGNI: all the well-known principles make it easier for the JIT to do its job
o Your code will be compiled, optimized and deoptimized
o There are a lot of different algorithms inside the JVM to do it
o You need to reserve memory for compiled code (the CodeCache, a non-heap area separate from Metaspace/PermGen)
o To reach full performance, the JVM needs to warm up
o Micro benchmarks lie to you. All the time
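The warm-up point can be observed by timing the same work repeatedly in one JVM. The sketch below is an illustration, not a benchmark harness: the class name `WarmUpDemo`, the workload and the round counts are arbitrary assumptions, and the timings it prints are exactly the kind of micro-benchmark numbers that should not be trusted in isolation.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

// Sketch: time identical rounds of work to see early rounds run slower than later ones.
class WarmUpDemo {
    // Arbitrary deterministic workload.
    static long work(int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) {
            acc += (i ^ (i >>> 3)) % 13;
        }
        return acc;
    }

    // Returns the elapsed nanoseconds of each round.
    static long[] timeRounds(int rounds, int n) {
        long[] nanos = new long[rounds];
        long sink = 0;
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            sink += work(n);
            nanos[r] = System.nanoTime() - start;
        }
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // keeps the result live so the work is not discarded
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = timeRounds(10, 5_000_000);
        for (int r = 0; r < nanos.length; r++) {
            System.out.println("round " + r + ": " + nanos[r] / 1_000 + " us");
        }
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit != null && jit.isCompilationTimeMonitoringSupported()) {
            System.out.println(jit.getName() + " spent "
                    + jit.getTotalCompilationTime() + " ms compiling");
        }
    }
}
```

The first rounds usually include interpretation and compilation time, so later rounds tend to be faster; how large the gap is depends on the JVM, its flags and the machine.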
![Page 24: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/24.jpg)
25
WHAT WE DIDN’T TOUCH
o Deoptimization
o Specific benchmarks for compilers
o Specific compiled code examples
o …
![Page 25: Java Jit. Compilation and optimization by Andrey Kovalenko](https://reader035.fdocuments.in/reader035/viewer/2022062400/5882f4f41a28ab3f1e8b666d/html5/thumbnails/25.jpg)
26
Q&A