Lifting The Veil – Reading Java Byte Code During Lunchtime
Alexander Shopov
Cisco Lunch&Learn
Alexander Shopov
By day: Software Engineer at Cisco
By night: OSS contributor
Coordinator of Bulgarian Gnome TP
Contacts:
E-mail: [email protected]: [email protected]: http://www.linkedin.com/in/alshopov
Google: Just search “al_shopov”
Please Learn And Share
License: CC-BY v3.0
Creative Commons Attribution v3.0
Disclaimer
My opinions, knowledge and experience!
Not my employer's.
Contents
● Why read?
● How to read?
  ● JVM Internals;
  ● JVM Data Types;
  ● JVM Opcodes.
● Let's read some code.
● What next?
Why Read Byte code?
● Understand your platform
● It is interesting and not too hard
● How does Java function? How does X function?
● Job interviews
● Catch compiler bugs/optimizations
● Learn to read before you write
● Source may not correspond to binary
● C/C++ people know their assembler
● Java language evolution vs. Java platform evolution
Bad News And Good News
Bad: We will be reading assembler
Good: Easiest assembler in the world
What Is The JVM?
● Stack-based, byte-oriented virtual machine without registers, easily implementable on 32-bit hardware.
● 206 (<256) instructions that are easy to group; there is no need to remember them all.
● Some leeway in implementations (even with Oracle).
Dramatis Personæ
● The JVM
● The threads
● The frames
● The stacks – LIFO
● The local variables – array of slots
● The runtime constant pool – array of values
● The bytecode – the instructions
● Class files – serialized form of constants and bytecode
Enter JVM
JVM OS process
Enter Threads
[Diagram: threads A, B, C and D inside the JVM process]
Enter Frames
[Diagram: threads A, B, C and D with four stacks of frames: F0–F4, F0–F2, F0–F1 and F0–F3]
Enter Frames, Really!
[Diagram: the same four stacks of frames, shown without the thread labels]
What Is A Frame Actually?
[Diagram: a single frame, F0]
Let's Peek Inside A Frame
[Diagram: frame F0, enlarged]
Enter Local Variables
[Diagram: frame F0 with the local variables, an array of slots numbered 0 1 2 3 4 5 6 …]
Enter Stack
[Diagram: frame F0 with the local variables and the stack]
Enter Pool Of Constants
[Diagram: frame F0 with the pool of constants, the local variables and the stack]
Where Is The Code?
[Diagram, built up over four slides: frame F0 (pool of constants, local variables, stack) sits in the JVM heap next to the class; the class holds the method code, and the PC points into it. On the last slide a local variable slot holds the value 6.]
Load
[Diagram: the value 6 is copied from the local variables onto the stack]
And…
[Diagram: the stack now holds 6 and 8]
Store
[Diagram: 8 is written into the local variables, which now hold 6 and 8]
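The Load / And… / Store slides can be replayed with a toy model of a frame. This is only a sketch: the class name FrameSketch and its methods are invented for illustration, and a real frame is managed by the JVM, not by user code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of one frame: an array of local variable slots plus a LIFO stack.
class FrameSketch {
    final int[] locals;
    final Deque<Integer> stack = new ArrayDeque<>();

    FrameSketch(int maxLocals) {
        locals = new int[maxLocals];
    }

    // Like iload: copy a local variable slot onto the top of the stack.
    void load(int slot) {
        stack.push(locals[slot]);
    }

    // Like bipush: push a constant onto the stack.
    void push(int constant) {
        stack.push(constant);
    }

    // Like istore: pop the top of the stack into a local variable slot.
    void store(int slot) {
        locals[slot] = stack.pop();
    }

    public static void main(String[] args) {
        FrameSketch frame = new FrameSketch(2);
        frame.locals[0] = 6; // slot 0 already holds 6, as on the slides
        frame.load(0);       // stack: 6
        frame.push(8);       // stack: 6 8
        frame.store(1);      // slot 1 becomes 8, stack: 6
        System.out.println(frame.locals[0] + " " + frame.locals[1]);
    }
}
```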
JVM Datatypes
● Primitive types
  ● Java { numeric – integral: byte (±8), short (±16), int (±32), long (±64), char (+16); floating point: float (±32), double (±64); boolean (int or byte) }
  ● returnAddress – pointers to the opcodes of JVM (jumps, loops)
● Reference types
  ● class, array, interface
  ● null
JVM Datatypes Descriptors
Java type      Type descriptor
boolean        Z
char           C
byte           B
short          S
int            I
float          F
long           J
double         D
Object         Ljava/lang/Object;
byte[]         [B
String[][]     [[Ljava/lang/String;
void           V
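The table above follows a simple recursive rule, which can be written out by hand. This is a sketch for illustration only: DescriptorSketch and descriptorOf are made-up names, not part of any JDK API.

```java
// Builds a JVM type descriptor for a Class object, following the table above.
class DescriptorSketch {
    static String descriptorOf(Class<?> type) {
        if (type.isArray()) {
            // One '[' per array dimension, then the element type.
            return "[" + descriptorOf(type.getComponentType());
        }
        if (type == void.class) return "V";
        if (type == boolean.class) return "Z";
        if (type == char.class) return "C";
        if (type == byte.class) return "B";
        if (type == short.class) return "S";
        if (type == int.class) return "I";
        if (type == float.class) return "F";
        if (type == long.class) return "J";
        if (type == double.class) return "D";
        // Reference types: 'L', the class name with '/' instead of '.', then ';'.
        return "L" + type.getName().replace('.', '/') + ";";
    }

    public static void main(String[] args) {
        System.out.println(descriptorOf(byte[].class));
        System.out.println(descriptorOf(String[][].class));
        System.out.println(descriptorOf(Object.class));
    }
}
```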
JVM Method Descriptors
Source code method declaration       Method descriptor
void m1(int i, double d, float f)    (IDF)V
byte[] m2(String s)                  (Ljava/lang/String;)[B
Object m3(int[][][] i)               ([[[I)Ljava/lang/Object;
boolean[] m4()                       ()[Z
long m5(Object, Long)                (Ljava/lang/Object;Ljava/lang/Long;)J
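Method descriptors can also be checked from Java itself with java.lang.invoke.MethodType. The sketch below assumes Java 7 or later; the class MethodDescriptorSketch and its helper names are invented for this example.

```java
import java.lang.invoke.MethodType;

class MethodDescriptorSketch {
    // Return type first, then the parameter types, as MethodType expects.
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    // The other direction: parse a descriptor and look at its return type.
    static Class<?> returnTypeOf(String descriptor) {
        // A null class loader means the system class loader is used.
        return MethodType.fromMethodDescriptorString(descriptor, null).returnType();
    }

    public static void main(String[] args) {
        System.out.println(descriptor(void.class, int.class, double.class, float.class));
        System.out.println(descriptor(byte[].class, String.class));
        System.out.println(descriptor(boolean[].class));
        System.out.println(returnTypeOf("(Ljava/lang/Object;Ljava/lang/Long;)J"));
    }
}
```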
206 instructions
DON'T PANIC!
Level 1 – Do Nothing/1
● nop
Level 2 – Load Constants/20
● aconst_null
● iconst_m1, iconst_0, iconst_1, iconst_2, iconst_3, iconst_4, iconst_5
● lconst_0, lconst_1
● fconst_0, fconst_1, fconst_2
● dconst_0, dconst_1
● bipush, sipush – 1, 2 bytes
● ldc, ldc_w, ldc2_w – load from index in constant pool; 1, 2, 2 bytes for index
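Which of these instructions shows up for an int constant depends on its size. The sketch below mirrors what javac typically emits; ConstantLoadSketch and pushInstructionFor are hypothetical names, and in a real listing ldc carries a constant pool index rather than the value itself.

```java
class ConstantLoadSketch {
    // Picks the shortest instruction that can push the given int constant.
    static String pushInstructionFor(int value) {
        if (value == -1) {
            return "iconst_m1";
        }
        if (value >= 0 && value <= 5) {
            return "iconst_" + value;
        }
        if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) {
            return "bipush " + value; // operand is one signed byte
        }
        if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) {
            return "sipush " + value; // operand is two bytes
        }
        // Anything larger lives in the constant pool (shown here by value, not by index).
        return "ldc " + value;
    }

    public static void main(String[] args) {
        int[] samples = {-1, 0, 5, 6, 127, 128, 32767, 32768};
        for (int sample : samples) {
            System.out.println(sample + " -> " + pushInstructionFor(sample));
        }
    }
}
```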
Level 3 – Load Variables/33
● iload, lload, fload, dload, aload
● iload_0, iload_1, iload_2, iload_3, lload_0, lload_1, lload_2, lload_3, fload_0, fload_1, fload_2, fload_3, dload_0, dload_1, dload_2, dload_3, aload_0, aload_1, aload_2, aload_3
● iaload, laload, faload, daload, aaload, baload, caload, saload – consume reference to array and int index in it
Level 4 – Conversions/15
● i2l, i2f, i2d, l2i, l2f, l2d, f2i, f2l, f2d, d2i, d2l, d2f, i2b, i2c, i2s
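A few of these conversions behave in ways that are easy to forget. In the sketch below each method body is a single Java cast, which compiles to the conversion instruction named by the method; the class name ConversionSketch is made up for this example.

```java
class ConversionSketch {
    static byte i2b(int v)   { return (byte) v; }  // keeps the low 8 bits, sign-extended
    static char i2c(int v)   { return (char) v; }  // keeps the low 16 bits, zero-extended
    static short i2s(int v)  { return (short) v; } // keeps the low 16 bits, sign-extended
    static int l2i(long v)   { return (int) v; }   // keeps the low 32 bits
    static int d2i(double v) { return (int) v; }   // rounds toward zero, NaN becomes 0

    public static void main(String[] args) {
        System.out.println(i2b(200));
        System.out.println((int) i2c(-1));
        System.out.println(i2s(40000));
        System.out.println(l2i(1L << 32));
        System.out.println(d2i(Double.NaN));
        System.out.println(d2i(1e20));
    }
}
```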
Level 6 – Maths/37
● iadd, ladd, fadd, dadd, isub, lsub, fsub, dsub, imul, lmul, fmul, dmul, idiv, ldiv, fdiv, ddiv, irem, lrem, frem, drem, ineg, lneg, fneg, dneg, ishl, lshl, ishr, lshr, iushr, lushr, iand, land, ior, lor, ixor, lxor
● iinc – increment local variable #index by signed byte const
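The shift, remainder and division instructions have a few sharp edges. Each method below wraps the Java operator that compiles to the instruction it is named after; MathsSketch is a made-up name for this sketch.

```java
class MathsSketch {
    static int ishr(int v, int n)  { return v >> n; }  // arithmetic shift, keeps the sign
    static int iushr(int v, int n) { return v >>> n; } // logical shift, fills with zeros
    static int ishl(int v, int n)  { return v << n; }  // only the low 5 bits of n are used
    static int irem(int a, int b)  { return a % b; }   // result takes the sign of the dividend
    static int idiv(int a, int b)  { return a / b; }   // rounds toward zero, throws on b == 0

    public static void main(String[] args) {
        System.out.println(ishr(-8, 1));
        System.out.println(iushr(-8, 1));
        System.out.println(ishl(1, 33));
        System.out.println(irem(-7, 3));
        System.out.println(idiv(-7, 2));
    }
}
```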
Level 7 – Stores/33
● istore, lstore, fstore, dstore, astore, istore_0, istore_1, istore_2, istore_3, lstore_0, lstore_1, lstore_2, lstore_3, fstore_0, fstore_1, fstore_2, fstore_3, dstore_0, dstore_1, dstore_2, dstore_3, astore_0, astore_1, astore_2, astore_3, iastore, lastore, fastore, dastore, aastore, bastore, castore, sastore
Level 8 – No-branch Comparisons/5
● lcmp, fcmpl, fcmpg, dcmpl, dcmpg (beware NaN)
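The "beware NaN" note is about the only difference between the l and g variants: what they push when an operand is NaN. Below is a plain Java model of the two float instructions; CompareSketch is a hypothetical name, and the code imitates the instructions rather than invoking them.

```java
class CompareSketch {
    // fcmpl: pushes 1, 0 or -1, and -1 when either operand is NaN.
    static int fcmpl(float a, float b) {
        if (a > b) return 1;
        if (a == b) return 0;
        if (a < b) return -1;
        return -1; // unordered: at least one operand is NaN
    }

    // fcmpg: same as fcmpl, except that NaN gives 1.
    static int fcmpg(float a, float b) {
        if (a > b) return 1;
        if (a == b) return 0;
        if (a < b) return -1;
        return 1; // unordered: at least one operand is NaN
    }

    public static void main(String[] args) {
        System.out.println(fcmpl(Float.NaN, 1f));
        System.out.println(fcmpg(Float.NaN, 1f));
        System.out.println(fcmpl(2f, 1f));
    }
}
```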
Level 9 – Objects/15
● getstatic, putstatic
● getfield, putfield
● invokevirtual, invokespecial, invokestatic, invokeinterface
● new, newarray, anewarray
● arraylength
● athrow
● checkcast, instanceof (difference is treatment of null)
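The different treatment of null by checkcast and instanceof is visible from plain Java. In this small sketch (NullCastSketch is a made-up name) the cast compiles to checkcast and the test to instanceof.

```java
class NullCastSketch {
    // instanceof: null is never an instance of anything.
    static boolean isString(Object o) {
        return o instanceof String;
    }

    // checkcast: null passes every cast, a wrong type throws ClassCastException.
    static String asString(Object o) {
        return (String) o;
    }

    public static void main(String[] args) {
        System.out.println(isString(null));
        System.out.println(asString(null));
        System.out.println(isString("text"));
    }
}
```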
Level 10 – Return/6
● ireturn, lreturn, freturn, dreturn, areturn, return
165 of 206
80%
We Have Enough Mana/Resources!
Let's dive in bytecode!
Enter Bytecode
javap – your only true friend now
javap -classpath PATH -p -c -l -s CLASS
Example 1
public static int whatIsThis(int, int, int);
  Signature: (III)I
  Code:
     0: iload_0
     1: iload_1
     2: iadd
     3: istore_3
     4: iload_3
     5: iload_2
     6: iadd
     7: istore_3
     8: iload_3
     9: ireturn
[Diagram sequence: frame F0 in the JVM heap steps through the code, starting with 3, 7 and 4 in local variables 0, 1 and 2]
Instruction     Stack after   Local variables 0 1 2 3 after
(start)         (empty)       3 7 4 -
0: iload_0      3             3 7 4 -
1: iload_1      3 7           3 7 4 -
2: iadd         10            3 7 4 -
3: istore_3     (empty)       3 7 4 10
4: iload_3      10            3 7 4 10
5: iload_2      10 4          3 7 4 10
6: iadd         14            3 7 4 10
7: istore_3     (empty)       3 7 4 14
8: iload_3      14            3 7 4 14
9: ireturn
Example 1
public static int whatIsThis(int a, int b, int c) {
    int result = a + b;
    result += c;
    return result;
}
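The walk through Example 1 can be replayed with a toy interpreter that understands only the four kinds of instruction in the listing. This is a sketch under obvious simplifications: Example1Interpreter is an invented class, instructions are plain strings, and nothing here resembles how a real JVM is built.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class Example1Interpreter {
    // Runs the instructions in order against the given local variable slots.
    static int run(String[] code, int[] locals) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String instruction : code) {
            switch (instruction) {
                case "iload_0": stack.push(locals[0]); break;
                case "iload_1": stack.push(locals[1]); break;
                case "iload_2": stack.push(locals[2]); break;
                case "iload_3": stack.push(locals[3]); break;
                case "iadd":    stack.push(stack.pop() + stack.pop()); break;
                case "istore_3": locals[3] = stack.pop(); break;
                case "ireturn": return stack.pop();
                default:
                    throw new IllegalArgumentException("Not covered by this sketch: " + instruction);
            }
        }
        throw new IllegalStateException("Code ended without ireturn");
    }

    static int whatIsThis(int a, int b, int c) {
        String[] code = {
            "iload_0", "iload_1", "iadd", "istore_3",
            "iload_3", "iload_2", "iadd", "istore_3",
            "iload_3", "ireturn"
        };
        // Arguments arrive in slots 0 to 2; slot 3 is the local variable.
        return run(code, new int[] {a, b, c, 0});
    }

    public static void main(String[] args) {
        System.out.println(whatIsThis(3, 7, 4));
    }
}
```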
Example 2
public static int whatIsThis(int, int, int);
  Signature: (III)I
  Code:
     0: iload_0
     1: iload_1
     2: iadd
     3: iload_2
     4: iadd
     5: ireturn
Example 2
public static int whatIsThis(int a, int b, int c) {
    return a + b + c;
}
Example 3
public static int whatIsThis(int, float, double);
  Signature: (IFD)I
  Code:
     0: iload_0
     1: i2f
     2: fload_1
     3: fadd
     4: f2d
     5: dload_2
     6: dadd
     7: d2i
     8: ireturn
  LineNumberTable:
    line 6: 0
  LocalVariableTable:
    Start  Length  Slot  Name  Signature
        0       9     0     a  I
        0       9     1     b  F
        0       9     2     c  D
Example 3
public static int whatIsThis(int a, float b, double c) {
    return (int) (a + b + c);
}
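The i2f at offset 1 is not free: a float cannot represent every int exactly. The sketch below reuses the method from Example 3 inside a made-up class, Example3Sketch, with inputs chosen to make the rounding visible.

```java
class Example3Sketch {
    // Same body as Example 3: int is widened to float, then the sum to double.
    static int whatIsThis(int a, float b, double c) {
        return (int) (a + b + c);
    }

    public static void main(String[] args) {
        // 16777217 is 2^24 + 1, the first positive int a float cannot hold exactly.
        System.out.println(whatIsThis(16_777_217, 0f, 0.0));
        System.out.println(whatIsThis(1, 2.5f, 3.25));
    }
}
```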
Example 4
public static void main(java.lang.String[]);
  Code:
     0: getstatic     #16  // Field java/lang/System.out:Ljava/io/PrintStream;
     3: ldc           #22  // String There
     5: invokevirtual #24  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
     8: return
More verbosity
javap -v -classpath PATH -p -c -l -s CLASS
Example 4
Constant pool:
   #1 = Class        #2        // org/kambanaria/readbytecode/bgoug/Example4
   #2 = Utf8         org/kambanaria/readbytecode/bgoug/Example4
   …
  #16 = Fieldref     #17.#19   // java/lang/System.out:Ljava/io/PrintStream;
  #17 = Class        #18       // java/lang/System
  #18 = Utf8         java/lang/System
  #19 = NameAndType  #20:#21   // out:Ljava/io/PrintStream;
  #20 = Utf8         out
  #21 = Utf8         Ljava/io/PrintStream;
   …
  #22 = String       #23       // There
  #23 = Utf8         There
  #24 = Methodref    #25.#27   // java/io/PrintStream.println:(Ljava/lang/String;)V
   …
Example 4
public static void main(String[] args) {
    System.out.println("There");
}
// Hello There!
Example 4
     0: getstatic     #16    getstatic = 0xb2, 16 = 0x00 10
     3: ldc           #22    ldc = 0x12, 22 = 0x16
     5: invokevirtual #24    invokevirtual = 0xb6, 24 = 0x00 18
     8: return               return = 0xb1
b2 00 10 12 16 b6 00 18 b1
od -t x1 Example4.class | tail -6
0001000 00 0e 00 0f 00 01 00 07 00 00 00 37 00 02 00 01
0001020 00 00 00 09 b2 00 10 12 16 b6 00 18 b1 00 00 00
0001040 02 00 0a 00 00 00 0a 00 02 00 00 00 07 00 08 00
0001060 08 00 0b 00 00 00 0c 00 01 00 00 00 09 00 1e 00
0001100 1f 00 00 00 01 00 20 00 00 00 02 00 21
0001115
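Going from the nine bytes back to the listing is mechanical. The sketch below decodes only the four opcodes that occur in Example 4; TinyDisassembler is an invented name, and a real disassembler would need the full opcode table and the constant pool.

```java
class TinyDisassembler {
    static String disassemble(int[] code) {
        StringBuilder out = new StringBuilder();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc];
            out.append(pc).append(": ");
            switch (opcode) {
                case 0xb2: // getstatic: two-byte constant pool index
                    out.append("getstatic #").append((code[pc + 1] << 8) | code[pc + 2]);
                    pc += 3;
                    break;
                case 0x12: // ldc: one-byte constant pool index
                    out.append("ldc #").append(code[pc + 1]);
                    pc += 2;
                    break;
                case 0xb6: // invokevirtual: two-byte constant pool index
                    out.append("invokevirtual #").append((code[pc + 1] << 8) | code[pc + 2]);
                    pc += 3;
                    break;
                case 0xb1: // return: no operands
                    out.append("return");
                    pc += 1;
                    break;
                default:
                    throw new IllegalArgumentException(
                            "Opcode not covered by this sketch: 0x" + Integer.toHexString(opcode));
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        int[] code = {0xb2, 0x00, 0x10, 0x12, 0x16, 0xb6, 0x00, 0x18, 0xb1};
        System.out.print(disassemble(code));
    }
}
```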
Example 5
public char[] whatIsThis();
  Code:
     0: aload_0
     1: getfield      #12  // Field content:[C
     4: areturn

public static void main(java.lang.String[]);
  Code:
     0: getstatic     #22  // Field java/lang/System.out:Ljava/io/PrintStream;
     3: new           #1   // class org/kambanaria/readbytecode/bgoug/Example5
     6: dup
     7: invokespecial #28  // Method "<init>":()V
    10: invokevirtual #29  // Method whatIsThis:()[C
    13: invokestatic  #31  // Method java/util/Arrays.toString:([C)Ljava/lang/String;
    16: invokevirtual #37  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
    19: return
Example 5
public char[] whatIsThis() {
    return this.content;
}
Example 5
public static void main(String[] args) {
    System.out.println(
        Arrays.toString(
            new Example5()
                .whatIsThis()));
}
Level 11 – Stack/9
● pop: a ➔ (empty)
● pop2: ba ➔ (empty)
● dup: a ➔ aa
● dup_x1: ba ➔ aba
● dup_x2: cba ➔ acba
● dup2: ba ➔ baba
● dup2_x1: cba ➔ bacba
● dup2_x2: dcba ➔ badcba
● swap: ba ➔ ab
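The before and after pictures of the stack instructions can be checked with a few string operations. In this sketch the stack is a string with the top of the stack as its rightmost character, and every character stands for one single-slot value; StackOpsSketch and apply are invented names.

```java
class StackOpsSketch {
    // Applies one stack instruction to a stack written as a string (top is rightmost).
    static String apply(String op, String stack) {
        int n = stack.length();
        switch (op) {
            case "pop":     return stack.substring(0, n - 1);
            case "pop2":    return stack.substring(0, n - 2);
            case "dup":     return stack + stack.charAt(n - 1);
            case "dup_x1":  return stack.substring(0, n - 2) + stack.charAt(n - 1) + stack.substring(n - 2);
            case "dup_x2":  return stack.substring(0, n - 3) + stack.charAt(n - 1) + stack.substring(n - 3);
            case "dup2":    return stack + stack.substring(n - 2);
            case "dup2_x1": return stack.substring(0, n - 3) + stack.substring(n - 2) + stack.substring(n - 3);
            case "dup2_x2": return stack.substring(0, n - 4) + stack.substring(n - 2) + stack.substring(n - 4);
            case "swap":    return stack.substring(0, n - 2) + stack.charAt(n - 1) + stack.charAt(n - 2);
            default:
                throw new IllegalArgumentException("Unknown stack instruction: " + op);
        }
    }

    public static void main(String[] args) {
        System.out.println(apply("dup_x1", "ba"));
        System.out.println(apply("dup2_x2", "dcba"));
        System.out.println(apply("swap", "ba"));
    }
}
```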
Example 6
public void whatIsThis(java.lang.String);
  Code:
     0: aload_1
     1: ifnonnull     12
     4: new           #18  // class java/lang/NullPointerException
     7: dup
     8: invokespecial #20  // Method java/lang/NullPointerException."<init>":()V
    11: athrow
    12: aload_0
    13: aload_1
    14: putfield      #21  // Field s:Ljava/lang/String;
    17: return
Example 6
public void whatIsThis(String s) {
    if (null == s) {
        throw new NullPointerException();
    }
    this.s = s;
}
Level 12 – conditions, branches, loops/19
● ifeq, ifne, iflt, ifge, ifgt, ifle
● if_icmpeq, if_icmpne, if_icmplt, if_icmpge, if_icmpgt, if_icmple
● if_acmpeq, if_acmpne
● ifnull, ifnonnull
● goto, jsr, ret
193 of 206
94%
Example 7
public static int parse(java.lang.String);
  Code:
     0: aload_0
     1: invokestatic  #16  // Method java/lang/Integer.parseInt:(Ljava/lang/String;)I
     4: ireturn
     5: astore_1
     6: iconst_0
     7: ireturn
  Exception table:
     from    to  target  type
        0     4       5  Class java/lang/NumberFormatException
public static int parse(String s) {
    try {
        return Integer.parseInt(s);
    } catch (NumberFormatException e) {
        return 0;
    }
}
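The exception table in Example 7 is just data that the JVM searches when something is thrown. Below is a sketch of that lookup, assuming the usual reading of the columns (from is inclusive, to is exclusive); ExceptionTableSketch, Row and findHandler are made-up names.

```java
class ExceptionTableSketch {
    // One row of the exception table.
    static final class Row {
        final int from;   // first guarded code offset (inclusive)
        final int to;     // end of the guarded range (exclusive)
        final int target; // code offset of the handler
        final Class<? extends Throwable> type;

        Row(int from, int to, int target, Class<? extends Throwable> type) {
            this.from = from;
            this.to = to;
            this.target = target;
            this.type = type;
        }
    }

    // Returns the handler offset, or -1 when the exception leaves this frame.
    static int findHandler(Row[] table, int pc, Throwable thrown) {
        for (Row row : table) {
            if (pc >= row.from && pc < row.to && row.type.isInstance(thrown)) {
                return row.target;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Row[] table = { new Row(0, 4, 5, NumberFormatException.class) };
        // parseInt is invoked at offset 1, inside the guarded range 0 to 4.
        System.out.println(findHandler(table, 1, new NumberFormatException("abc")));
        System.out.println(findHandler(table, 1, new NullPointerException()));
    }
}
```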
Example 8
public class org.kambanaria.readbytecode.bgoug.Example8 {
  static final boolean $assertionsDisabled;

  static {};
    Code:
       0: ldc           #1   // class org/kambanaria/readbytecode/bgoug/Example8
       2: invokevirtual #10  // Method java/lang/Class.desiredAssertionStatus:()Z
       5: ifne          12
       8: iconst_1
       9: goto          13
      12: iconst_0
      13: putstatic     #16  // Field $assertionsDisabled:Z
      16: return
public class Example8 {
    private static String repeat(String s) {
        assert s != null;
        return s + s;
    }
}
Example 8
  private static java.lang.String repeat(java.lang.String);
    Code:
       0: getstatic     #16  // Field $assertionsDisabled:Z
       3: ifne          18
       6: aload_0
       7: ifnonnull     18
      10: new           #28  // class java/lang/AssertionError
      13: dup
      14: invokespecial #30  // Method java/lang/AssertionError."<init>":()V
      17: athrow
      18: new           #31  // class java/lang/StringBuilder
      21: dup
      22: aload_0
      23: invokestatic  #33  // Method java/lang/String.valueOf:(Ljava/lang/Object;)Ljava/lang/String;
      26: invokespecial #39  // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
      29: aload_0
      30: invokevirtual #42  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      33: invokevirtual #46  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      36: areturn
}
Now You Know
Beware Asserts In Public Methods!
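The warning follows from the bytecode of Example 8: the check runs only when the $assertionsDisabled flag is false, and assertions are off unless the JVM is started with -ea. The sketch below writes the compiled logic out by hand; AssertSketch is a made-up class, and the flag is passed as a parameter only so that both paths can be tried in one run.

```java
class AssertSketch {
    // Mirrors the static initializer: computed once, when the class is initialized.
    static final boolean ASSERTIONS_DISABLED = !AssertSketch.class.desiredAssertionStatus();

    // Hand-written equivalent of the compiled repeat() from Example 8.
    static String repeat(String s, boolean assertionsDisabled) {
        if (!assertionsDisabled && s == null) {
            throw new AssertionError();
        }
        return s + s;
    }

    public static void main(String[] args) {
        System.out.println("Assertions disabled in this JVM: " + ASSERTIONS_DISABLED);
        // With assertions disabled the null slips through and is concatenated.
        System.out.println(repeat(null, true));
    }
}
```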
Example 9
package org.kambanaria.readbytecode.bgoug;

public class Example9 {
    public class Inner {}

    public static void main(String[] args) throws Exception {
        Example9 exmpl = Example9.class.newInstance();
        Inner innr = Inner.class.newInstance();
    }
}
java -cp bin/ org.kambanaria.readbytecode.bgoug.Example9
Exception in thread "main" java.lang.InstantiationException: org.kambanaria.readbytecode.bgoug.Example9$Inner
    at java.lang.Class.newInstance0(Class.java:357)
    at java.lang.Class.newInstance(Class.java:325)
    at org.kambanaria.readbytecode.bgoug.Example9.main(Example9.java:9)
Example 9
public class org.kambanaria.readbytecode.bgoug.Example9 {
  public OKRB.Example9();
    Code:
       0: aload_0
       1: invokespecial #8   // Method java/lang/Object."<init>":()V
       4: return
  …
}
public class org.kambanaria.readbytecode.bgoug.Example9$Inner {
  final OKRB.Example9 this$0;
  public OKRB.Example9$Inner(OKRB.Example9);
    Code:
       0: aload_0
       1: aload_1
       2: putfield      #10  // Field this$0:Lorg/kambanaria/readbytecode/bgoug/Example9;
       5: aload_0
       6: invokespecial #12  // Method java/lang/Object."<init>":()V
       9: return
}
Example 9
package org.kambanaria.readbytecode.bgoug;

public class Example9 {
    public class Inner {}

    public static void main(String[] args) throws Exception {
        Example9 exmpl = new Example9();
        Inner innr = exmpl.new Inner();
    }
}
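The bytecode shows that the constructor of Inner takes the enclosing instance as an argument, so reflection also works once that argument is supplied. A sketch, with Example9Sketch standing in for the class on the slides and the helper names invented for illustration:

```java
import java.lang.reflect.Constructor;

class Example9Sketch {
    public class Inner {}

    // Looks up the constructor the compiler generated: Inner(Example9Sketch).
    static Inner newInner(Example9Sketch outer) {
        try {
            Constructor<Inner> constructor = Inner.class.getDeclaredConstructor(Example9Sketch.class);
            return constructor.newInstance(outer);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Inner has no zero-argument constructor, so a no-argument lookup fails.
    static boolean hasNoArgConstructor() {
        try {
            Inner.class.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasNoArgConstructor());
        System.out.println(newInner(new Example9Sketch()) != null);
    }
}
```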
Further resources
● Oracle: The JVM Specification, Java SE 7 Edition
● A. Arhipov: Java Bytecode For Discriminating Developers
● Wikipedia: Java Bytecode Instruction Listings
● S. H. Park: Understanding JVM Internals
● C. McGlone: Looking "Under the Hood" with javap
● P. Haggar: Java bytecode
● C. Nutter: JVM Bytecode for Dummies
Presentation background
● Alexander Wilms: Hexagons