Lifecycle of JIT-compiled code
Ivan Krylov Azul Systems
Is this the right talk?
• JVM Internals and Performance Talk:
• General Interest
• Got a performance critical app
• You have a reproducible case
• Have a way to collect JVM logs
Scope and Agenda
• Code flow inside JVM
• Profiles that JVM collects
• Deoptimization
• Four ways to control compilation
Code Pipeline in JVM
Sources in .java → Bytecodes → Machine code from the interpreter → Machine code from a compiler
• Javac: 1) static verification; 2) compilation with near-zero optimizations
• Runtime: 1) dynamic verification; 2) linking
• One of N compilers, governed by the runtime: next tier, deoptimization, live profile data
• JAOT in JDK 9
Everything can be different
• Bytecodes may not come from javac
  • Any JVM language; jasm-like tools; Byte Buddy
• There may be no interpreter
  • JRockit VM
• There may be only an interpreter
  • Zero port or -Xint
• May be pure AOT (no bytecodes, no JIT)
  • Pure AOT can't implement the entire Java SE standard
  • Hence a hybrid approach: Excelsior JET, Graal-based AOT (in the works)
Template Interpreter - a tier 0 compiler
• Machine code creation
• From asm (masm) templates
• Populated with addresses in the VM
• Generated at start-up
• Maintains invocation and back-branch counters
• Mixed native/Java stack frame
• All bytecodes in a single blob with gotos
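To make the two counters concrete, here is a toy sketch of a bytecode loop that bumps an invocation counter on entry and a back-branch counter on every backward jump. It is not how HotSpot works internally: the real template interpreter is generated machine code, and the opcodes, class name and program layout below are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy switch-based interpreter; opcodes and names are illustrative only.
final class ToyInterpreter {
    static final int PUSH = 0, LADD = 1, DEC_JNZ = 2, RET = 3;

    long invocationCount; // bumped on every method entry
    long backEdgeCount;   // bumped on every backward branch taken

    // Program layout: an opcode followed by its operand, if it has one.
    long run(long[] code, long loopCounter) {
        invocationCount++;
        Deque<Long> stack = new ArrayDeque<>();
        int pc = 0;
        while (true) {
            switch ((int) code[pc]) {
                case PUSH:
                    stack.push(code[pc + 1]);
                    pc += 2;
                    break;
                case LADD: {
                    long b = stack.pop();
                    long a = stack.pop();
                    stack.push(a + b);
                    pc += 1;
                    break;
                }
                case DEC_JNZ:
                    // Decrement the loop counter and jump back while it is non-zero.
                    if (--loopCounter != 0) {
                        backEdgeCount++;
                        pc = (int) code[pc + 1];
                    } else {
                        pc += 2;
                    }
                    break;
                case RET:
                    return stack.pop();
                default:
                    throw new IllegalStateException("bad opcode at " + pc);
            }
        }
    }

    public static void main(String[] args) {
        // acc = 0; do { acc += 5; } while (--n != 0); return acc;
        long[] code = { PUSH, 0, PUSH, 5, LADD, DEC_JNZ, 2, RET };
        ToyInterpreter vm = new ToyInterpreter();
        System.out.println("result = " + vm.run(code, 4));
        System.out.println("invocations = " + vm.invocationCount
                + ", back edges = " + vm.backEdgeCount);
    }
}
```

A runtime could compare such counters against thresholds to decide when a method or a loop is hot enough to hand to a compiler.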
Template Interpreter - ladd example
0x000000010a59a920: mov    (%rsp),%rax         ; entry if %rax doesn't contain the first operand
0x000000010a59a924: add    $0x10,%rsp
0x000000010a59a928: mov    (%rsp),%rdx         ; entry otherwise
0x000000010a59a92c: add    $0x10,%rsp
0x000000010a59a930: add    %rdx,%rax
0x000000010a59a933: movzbl 0x1(%r13),%ebx      ; read next bytecode & dispatch
0x000000010a59a938: inc    %r13
0x000000010a59a93b: movabs $0x10a0f6600,%r10
0x000000010a59a945: jmpq   *(%r10,%rbx,8)
Diagram labels: №1, №2, return value, bytecode counter, next bytecode, method start 0x000000010a0f6600
What is a profile?
• It is all about counters
• Inaccurate but indicative
• If a counter overflows, it stays at the saturated level
• Some counters with properties
• Actual types observed, etc.
https://www.flickr.com/photos/icemanphoto/1739731523
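The "stays at saturated level" behaviour can be sketched as a counter that stops at a limit instead of wrapping around. This is only an illustration in Java; HotSpot's real counters are C++ fields, and the class name and limit here are hypothetical.

```java
// Hypothetical saturating counter, for illustration only.
final class SaturatingCounter {
    private final int limit;
    private int value;

    SaturatingCounter(int limit) {
        this.limit = limit;
    }

    // Deliberately not synchronized: this sketch accepts occasional lost updates,
    // in the spirit of "inaccurate but indicative".
    void increment() {
        if (value < limit) {
            value++;
        }
    }

    int value() {
        return value;
    }

    boolean saturated() {
        return value == limit;
    }

    public static void main(String[] args) {
        SaturatingCounter c = new SaturatingCounter(3);
        for (int i = 0; i < 10; i++) {
            c.increment();
        }
        System.out.println("value=" + c.value() + " saturated=" + c.saturated());
    }
}
```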
Profile is... methodCounters.hpp
• interpreter_invocation_count & invocation_counter
• interpreter_throwout_count
• backedge_counter
• number_of_breakpoints (for debugging)
• nmethod_age
• interpreter_invocation_limit & interpreter_backward_branch_limit
• interpreter_profile_limit
• invoke_mask
• backedge_mask
• rate & prev_time
• highest_comp_level & highest_osr_comp_level
Anything else needed?
    import java.text.DateFormat;
    import java.text.SimpleDateFormat;
    import java.util.Date;
    import java.util.LinkedList;
    import java.util.List;

    class A {
        static List<String> list = new LinkedList<>();
        // Static initializer: runs once, when A is initialized
        static {
            DateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss");
            list.add(dateFormat.format(new Date()));
        }
        static List<String> getList() { return list; }
    }

    class B {
        String process(String s, int i) {
            if (i < A.list.size())
                return s.concat(A.list.get(i));
            else
                return s;
        }
    }

• B.process() may be compiled only after A is initialized
• A may be initialized only at the first invocation of B.process() (if B.process() is the only user of A)
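The ordering described above can be observed directly. The sketch below, with hypothetical names (InitOrderDemo, Holder), records events to show that the static initializer of the holder class runs only when process() first touches it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo of lazy class initialization; names are illustrative.
final class InitOrderDemo {
    static final List<String> events = new ArrayList<>();

    static class Holder {
        static final List<String> list = new ArrayList<>();
        static {
            // Runs on the first active use of Holder, not at program start.
            events.add("Holder.<clinit>");
            list.add("payload");
        }
    }

    static String process(String s, int i) {
        events.add("process");
        // First access to Holder.list triggers Holder's initialization.
        return (i < Holder.list.size()) ? s.concat(Holder.list.get(i)) : s;
    }

    public static void main(String[] args) {
        events.add("main start");
        System.out.println(process("x-", 0));
        System.out.println(events);
    }
}
```

The "process" event is recorded before "Holder.<clinit>", which is the situation a JIT has to handle if it compiles the method before the class it uses has been initialized.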
Is SimpleEnum simple?
    public enum SimpleEnum { One};

    >ll SimpleEnum.java
    -rw-r--r-- 1 ivan staff 35 Feb 23 21:30 SimpleEnum.java
    >ll SimpleEnum.class
    -rw-r--r-- 1 ivan staff 732 Feb 23 21:30 SimpleEnum.class
    Compiled from "SimpleEnum.java"
    public final class SimpleEnum extends java.lang.Enum<SimpleEnum> {
      public static final SimpleEnum One;

      public static SimpleEnum[] values();
        Code:
           0: getstatic     #1   // Field $VALUES:[LSimpleEnum;
           3: invokevirtual #2   // Method "[LSimpleEnum;".clone:()Ljava/lang/Object;
           6: checkcast     #3   // class "[LSimpleEnum;"
           9: areturn

      public static SimpleEnum valueOf(java.lang.String);
        Code:
           0: ldc           #4   // class SimpleEnum
           2: aload_0
           3: invokestatic  #5   // Method java/lang/Enum.valueOf:(Ljava/lang/Class;Ljava/lang/String;)Ljava/lang/Enum;
           6: checkcast     #4   // class SimpleEnum
           9: areturn

      static {};
        Code:
           0: new           #4   // class SimpleEnum
           3: dup
           4: ldc           #7   // String One
           6: iconst_0
           7: invokespecial #8   // Method "<init>":(Ljava/lang/String;I)V
          10: putstatic     #9   // Field One:LSimpleEnum;
          13: iconst_1
          14: anewarray     #4   // class SimpleEnum
          17: dup
          18: iconst_0
          19: getstatic     #9   // Field One:LSimpleEnum;
          22: aastore
          23: putstatic     #1   // Field $VALUES:[LSimpleEnum;
          26: return
    }
The Rumor: Inlining is important
• Hide the costs of preparing arguments
• Hide the costs of marshaling the result
• Code is co-located (I-cache friendly)
• Makes executable smaller in some cases
• Unlocks a bunch of optimizations
• Because the scope of a compilation block is bigger
• Easy to test: disable inlining with -XX:-Inline and observe an XX% perf loss
• Use -XX:+PrintInlining to see which methods are inlined and where
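A small program to try the two flags on might look like the sketch below. The class and method names are hypothetical, and the timing is a rough wall-clock measurement rather than a proper benchmark, so treat any numbers it prints as indicative only.

```java
// Hypothetical micro-example: a tiny accessor called from a hot loop.
final class InlineDemo {
    static final class Point {
        private final int x;

        Point(int x) {
            this.x = x;
        }

        // Small accessor: an obvious inlining candidate.
        int getX() {
            return x;
        }
    }

    static long sum(Point[] points, int rounds) {
        long total = 0;
        for (int r = 0; r < rounds; r++) {
            for (Point p : points) {
                total += p.getX();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Point[] points = new Point[1_000];
        for (int i = 0; i < points.length; i++) {
            points[i] = new Point(i);
        }
        long start = System.nanoTime();
        long result = sum(points, 20_000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("sum=" + result + " in " + elapsedMs + " ms");
    }
}
```

Run it once normally and once with -XX:-Inline to compare. On HotSpot builds where PrintInlining is a diagnostic option, it also needs -XX:+UnlockDiagnosticVMOptions, for example: java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining InlineDemo.java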
Inlining that didn’t happen
• Inlining the callee would make the caller too big
• Inlining too deep / recursive inlining
• Callee has exception handlers? (InlineMethodsWithExceptionHandlers)
• Callee is synchronized (InlineSynchronizedMethods)
• Class of the callee is not initialized
• Callee has unbalanced monitors
• Caller has jsr bytecode (!)
  http://cliffhacks.blogspot.ru/2008/02/java-6-tryfinally-compilation-without.html
• Overrulable: the checks that name a flag in parentheses
Did you see tiers?
    495    7     n 0   jdk.internal.reflect.Reflection::getCallerClass (native) (static)
    495    8   b   2   java.util.Properties::getProperty (49 bytes)
    497    9   b   2   java.util.concurrent.ConcurrentHashMap::get (162 bytes)
    500   10   b   3   java.lang.String::hashCode (48 bytes)
    503   11   b   4   java.lang.String::hashCode (48 bytes)
    507   10       3   java.lang.String::hashCode (48 bytes) made not entrant
    507   12   b   2   java.lang.StringLatin1::hashCode (42 bytes)
    509   13   b   3   java.lang.Boolean::<clinit> (31 bytes)
    510   14   b   4   java.lang.Boolean::<clinit> (31 bytes)
    510   13       3   java.lang.Boolean::<clinit> (31 bytes) made not entrant
    511   15   b   3   java.lang.Boolean::<init> (10 bytes)
    511   16   b   4   java.lang.Boolean::<init> (10 bytes)
    513   15       3   java.lang.Boolean::<init> (10 bytes) made not entrant
    513   17   b   2   java.lang.Object::<init> (1 bytes)
    513   18     n 0   java.lang.Class::getPrimitiveClass (native) (static)
    514   19   b   3   java.lang.Boolean::parseBoolean (19 bytes)
    514   20   b   4   java.lang.Boolean::parseBoolean (19 bytes)
    516   19       3   java.lang.Boolean::parseBoolean (19 bytes) made not entrant
    516   21   b   2   java.lang.String::isLatin1 (19 bytes)
    517   22   b   2   java.lang.Integer::parseInt (259 bytes)
    533   23   b   3   java.lang.Character::<clinit> (25 bytes)
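To pull the tier out of such a log, a line can be split into its columns. The sketch below assumes the layout seen above: a timestamp, a compile id, an optional run of attribute letters, a one-digit tier, the method name, and trailing text. CompilationLine is a hypothetical helper, not a JDK class, and the regular expression is a guess at the format rather than a specification of it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that splits one compilation log line into columns.
final class CompilationLine {
    // timestamp, compile id, optional attribute letters, tier, method, trailing text
    private static final Pattern LINE = Pattern.compile(
            "^\\s*(\\d+)\\s+(\\d+)\\s+([%s!bn]*)\\s*(\\d)\\s+(\\S+)(.*)$");

    final long timestamp;
    final int compileId;
    final String flags;
    final int tier;
    final String method;
    final boolean madeNotEntrant;

    private CompilationLine(Matcher m) {
        timestamp = Long.parseLong(m.group(1));
        compileId = Integer.parseInt(m.group(2));
        flags = m.group(3);
        tier = Integer.parseInt(m.group(4));
        method = m.group(5);
        madeNotEntrant = m.group(6).contains("made not entrant");
    }

    static CompilationLine parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("unrecognized line: " + line);
        }
        return new CompilationLine(m);
    }

    public static void main(String[] args) {
        String[] sample = {
            "500 10 b 3 java.lang.String::hashCode (48 bytes)",
            "503 11 b 4 java.lang.String::hashCode (48 bytes)",
            "507 10 3 java.lang.String::hashCode (48 bytes) made not entrant"
        };
        for (String s : sample) {
            CompilationLine l = parse(s);
            System.out.println(l.method + " id=" + l.compileId + " tier=" + l.tier
                    + (l.madeNotEntrant ? " (made not entrant)" : ""));
        }
    }
}
```

In the sample, compile id 10 is the tier 3 version of String::hashCode; it is made not entrant shortly after the tier 4 version (id 11) appears.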
Tiered compilation
4 - Full Optimisation - C2 (fastest)
3 - Full Profile - C1, incl. MethodData (slow)
2 - Limited Profile - C1, + invocation & backedge counters (mediocre)
1 - Simple Profile - C1 (fast)
0 - Interpreter (slowest)
Deoptimizations happen because…
• … the compiler does “optimistic” (or “heroic”) optimizations
• … something AOT can’t do
• & the “Whole World” is not developed at the time of compilation
Not all opts are speculative
Deterministic opts
• Constant propagation
• Loop invariant
• …
Speculative w/ fallback
• Biased locking
• NUMA-aware allocation
• TSX transactions
• Uncommon trap
• …
Speculative w/ hard stop
• CHA invalidation
• Effectively final fields modified
• …
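CHA (class hierarchy analysis) invalidation can be provoked with a sketch like the one below: the hot loop sees only one concrete subclass during warm-up, and a second subclass is loaded afterwards. All names are hypothetical. Whether a given JVM actually devirtualizes the call and later deoptimizes depends on its version and flags, so the program only sets up the situation; it does not prove anything by itself.

```java
// Hypothetical demo: a second subclass appears after the hot loop has warmed up.
final class ChaDemo {
    static abstract class Shape {
        abstract int sides();
    }

    static final class Square extends Shape {
        int sides() {
            return 4;
        }
    }

    static final class Triangle extends Shape {
        int sides() {
            return 3;
        }
    }

    static long totalSides(Shape s, int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += s.sides(); // virtual call; one loaded implementor during warm-up
        }
        return total;
    }

    public static void main(String[] args) {
        long warm = 0;
        for (int i = 0; i < 20_000; i++) {
            warm += totalSides(new Square(), 100);
        }
        System.out.println("warm-up total: " + warm);
        // First use of Triangle: the class is loaded only now.
        System.out.println("triangle total: " + totalSides(new Triangle(), 100));
    }
}
```

Running it with -XX:+PrintCompilation and looking for "made not entrant" next to totalSides is one way to check what a particular JVM did.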
Implicit code
• Mandatory operations from JVMLS
• Zeroing of objects
• Out-of-bounds checks
• Divide by 0 in math
• Null pointer protection
• Checkcasts
• …
    class C {
        int val;

        C[] kInverseVector(int k, C[] v) {
            C[] result = new C[v.length];
            for (int i = 0; i < v.length; i++) {
                result[i] = new C();
                result[i].val = k / v[i].val;
            }
            return result;
        }
    }
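The checks hidden in kInverseVector can be spelled out by hand. The sketch below is a rough source-level rendering of them, not what a compiler emits; the exact order of checks in the original expression follows the language rules and may differ slightly from this hand-written version. ImplicitChecks and kInverseVectorExplicit are hypothetical names.

```java
// Hypothetical rewrite of kInverseVector with the implicit checks made explicit.
final class ImplicitChecks {
    static final class C {
        int val;
    }

    static C[] kInverseVectorExplicit(int k, C[] v) {
        if (v == null) {
            throw new NullPointerException();           // null check for v.length
        }
        C[] result = new C[v.length];                   // new array is zeroed (all null)
        for (int i = 0; i < v.length; i++) {
            C fresh = new C();                          // new object is zeroed (val == 0)
            if (i < 0 || i >= result.length) {
                throw new ArrayIndexOutOfBoundsException(i); // range check for result[i]
            }
            result[i] = fresh;
            if (i < 0 || i >= v.length) {
                throw new ArrayIndexOutOfBoundsException(i); // range check for v[i]
            }
            C elem = v[i];
            if (elem == null) {
                throw new NullPointerException();       // null check for v[i].val
            }
            if (elem.val == 0) {
                throw new ArithmeticException("/ by zero"); // divide-by-zero check
            }
            fresh.val = k / elem.val;
        }
        return result;
    }

    public static void main(String[] args) {
        C two = new C();
        two.val = 2;
        System.out.println(kInverseVectorExplicit(8, new C[] { two })[0].val);
    }
}
```

Most of these checks never fail in a well-behaved program, which is why a JIT has an incentive to make them as cheap as possible on the common path.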
Per-bci deoptimization reasons
• Null Check (null object or div 0)
• Null Assert
• Range Check (OOB)
• Class Check (Unexpected class)
• Array Check (Unexpected array class)
• Intrinsic operand
• Bimorphic inlining failed
Per-method deoptimization reasons
• Unloaded class
• Uninitialized class
• Unreached code
• Unhandled exception
• Unexpected Constraint
• Div0 check
• Age (tier threshold reached)
• Predicate failed
• Loop limit check
• Speculate class check
• Speculate Null Check
• RTM state change
• Unstable if
• Unstable fused if
Uncommon trap happened. Now what?
• 1-off event?
• A change in a pattern?
• Keep old around?
• Recompile? Never compile?
• What the next invocation will be like?
    class C {
        int val;

        C[] kInverseVector(int k, C[] v) {
            C[] result = new C[v.length];
            for (int i = 0; i < v.length; i++) {
                result[i] = new C();
                result[i].val = k / v[i].val;
            }
            return result;
        }
    }
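One way to set up the "1-off event" question is to warm the method up on well-behaved input and then pass a single bad element. The sketch below does that with hypothetical names (TrapDemo, warmUpThenSurprise). It only guarantees the Java-level behaviour, an ArithmeticException on the rare path; whether an uncommon trap and a recompilation happen underneath is up to the JVM.

```java
// Hypothetical demo: many normal calls, then one input that hits a rare path.
final class TrapDemo {
    static final class C {
        int val;

        C() {
        }

        C(int val) {
            this.val = val;
        }
    }

    static C[] kInverseVector(int k, C[] v) {
        C[] result = new C[v.length];
        for (int i = 0; i < v.length; i++) {
            result[i] = new C();
            result[i].val = k / v[i].val;
        }
        return result;
    }

    // Returns 1 if the surprising input raised ArithmeticException, 0 otherwise.
    static int warmUpThenSurprise(int warmups) {
        C[] good = { new C(1), new C(2), new C(4) };
        for (int i = 0; i < warmups; i++) {
            kInverseVector(8, good);        // common path: no zero divisors, no nulls
        }
        C[] bad = { new C(1), new C(0) };   // rare path: division by zero
        try {
            kInverseVector(8, bad);
            return 0;
        } catch (ArithmeticException e) {
            return 1;
        }
    }

    public static void main(String[] args) {
        System.out.println("rare path taken: " + warmUpThenSurprise(50_000));
    }
}
```

Running with -XX:+PrintCompilation before and after the bad call shows how a particular JVM answered the questions on this slide for this method.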
How to control a compiler?
I. Legacy CompileCommand
• Provided statically via
  • Command line: -XX:CompileCommand
  • .hotspot_compiler or -XX:CompileCommandFile=/path/to/thefile
• Dynamically
  • jcmd Compiler.directives_add
• Syntax: command package/Class methodName
• Actions:
  • dontinline or exclude or excludec2 or excludec1

    exclude java/lang/Thread setPriority
    dontinline java/lang/String charAt
II. Compiler Control in Java 9
    {
      // matching several patterns requires an array
      match: ["steve.*", "alex.*"],

      c2: {
        Enable: false,        // Ignore this directive for c2.
        BreakAtExecute: true  // This will not be applied since Enable is false above
      },

      // applies to all compilers
      // + force inline, - don't inline
      inline: ["+java/util.*", "-com/sun.*"],
      PrintInlining: true
    }
Common Compiler Control Directives
• Enable
• bool Exclude
• bool Inline
• bool BreakAtExecute
• bool BreakAtCompile
• bool Log
• bool PrintAssembly
• bool PrintInlining
• bool PrintNMethods
• bool ReplayInline
• bool DumpReplay
• bool DumpInline
• bool CompilerDirectivesIgnoreCompileCommands
• bool PrintOptoAssembly
• bool PrintIntrinsics
• bool TraceOptoOutput
• bool TraceSpilling
• bool CloneMapDebug
• bool IGVPrintLevel
C2 Compiler Control Directives
• BlockLayoutByFrequency
• bool TraceOptoPipelining
• bool Vectorize
• bool VectorizeDebug
• intx MaxNodeLimit
• intx DisableIntrinsics
III. java.lang.Compiler
• OpenJDK / Oracle don’t support it.
• IBM J9 supports it: http://www.ibm.com/support/knowledgecenter/en/SSSTCZ_2.0.0/com.ibm.rt.doc.20/realtime/rt_jit.html
• Zing has a slightly different flavour: http://docs.azul.com/zing/Zing_UserGuide/#Zing_UserGuide/Zing_AT_ReadyNow_EnsureCriticalMethodsareCompiled_CompilerAPI.htm
• Oracle wants to deprecate it. See https://bugs.openjdk.java.net/browse/JDK-4285505

    {
        Compiler.enable(); // ensure compiler is active
        Compiler.command("{com.mycompany.*}(compile)");
        System.out.println("Now let’s wait till compilation is done");
        Compiler.command("waitOnCompilationQueue");
        System.out.println("Compilation is complete");
        Compiler.disable(); // turn the compiler off
    }
IV. ReadyNow
• Between AOT and JIT
• Designed to compile sooner than normal JIT
• “Your JIT’s yesterday’s mistakes won’t hurt you today”
• Designed to deoptimize less often than normal JIT
• Works well in good “scenarios”
• Designed to never produce invalid code
• The fallback is the normal JIT
Three requirements of ReadyNow
I. Profiles must be adequate
II. Profile needs to be applicable
III. All necessary types are initialized or …
1. all the ones we need for fastpath
2. can force eager initialization in case of no/trivial static initializers
Recording a profile
[Diagram: CT1, CT2, CT3, profile]
Using a profile
[Diagram: CT1, CT2, CT3]
Bonus: JDK9 Ahead of Time Compilation
• Targeting fast startup
• Can interoperate with JIT
• Only Linux x64
• Only java.base is guaranteed
• Must use the same JVM flags
• Not for dynamically generated classes
• No reflection support
• No instrumentation support

    jaotc --output libjava.base.so --module java.base
    jaotc --output libHelloWorld.so HelloWorld.class
    java -XX:AOTLibrary=./libHelloWorld.so HelloWorld
• Zing: A better JVM for the enterprise
  • Azul’s enterprise JVM focused on better metrics
  • Consistent performance - not just fast, always fast
  • Eliminate GC as a concern for enterprise apps
  • Very wide operating range
    • From human-sensitive app responsiveness to low-latency trading
    • From microservices to huge in-memory apps
  • Eliminates an entire class of engineering workarounds common in Java
• Zulu Embedded: When you need embedded support
  • 100% open source, based on OpenJDK
  • Certified Java SE compliant and compatible
  • Verified “non-contaminating” open source license
  • Identical metrics to OpenJDK and Oracle Java SE
  • World-class support offerings
  • Support for Linux & Windows; x86, ARM32, ARM64, PPC32, MIPS
Ivan Krylov www.azul.com Azul Systems ivan @ azul.com