Java Garbage Collection, Byte Code James Atlas August 7, 2008.


Page 1:

Java Garbage Collection, Byte Code

James Atlas

August 7, 2008

Page 2:

Review

• Java Security
  - JVM
  - Cryptography

Page 3:

Schedule

• Today
  - Java Garbage Collection
  - Java Bytecode
• Tuesday
  - Review
• Thursday
  - Final (5-7PM)

Page 4:

Java Garbage Collection

Page 5:

Garbage Collection

• What is garbage and how can we deal with it?
• Garbage collection schemes
  - Reference Counting
  - Mark and Sweep
  - Stop and Copy
• A comparison

Page 6:

How Objects are Created in Java

• An object is created in Java by applying the new operator.
• When new is applied, the JVM does the following:
  - allocates memory;
  - assigns fields their default values;
  - runs the constructor;
  - returns a reference.
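The steps above can be observed from inside a constructor. The following is a minimal sketch, not part of the original slides; the class `ObjectCreationDemo`, the nested `Student` class, and the field names are hypothetical names chosen for illustration.

```java
public class ObjectCreationDemo {
    /** Hypothetical class used only to illustrate the creation steps. */
    static class Student {
        int credits;                 // default value: 0
        String name;                 // default value: null
        final String defaultsSeen;   // what the constructor saw on entry

        Student(String name) {
            // Memory is already allocated and the fields hold their defaults here.
            defaultsSeen = credits + "/" + this.name;
            // The constructor body then assigns the real values.
            this.name = name;
            this.credits = 15;
        }
    }

    public static void main(String[] args) {
        Student s = new Student("john");       // new returns a reference
        System.out.println(s.defaultsSeen);    // prints 0/null
        System.out.println(s.name + " " + s.credits);
    }
}
```

The `defaultsSeen` field records the state before the constructor body assigns anything, which is why it shows the default values rather than the final ones.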

Page 7:

How Java Reclaims Object Memory

• Java does not provide the programmer any means to destroy objects explicitly
• The advantages are
  - No dangling reference problem in Java
  - Easier programming
  - Far fewer memory leaks (an object that stays reachable but is never used again can still leak)

Page 8:

What is Garbage?

• Garbage: unreferenced objects

  Student john = new Student();
  Student jeff = new Student();
  john = jeff;

• The object originally referenced by john becomes garbage, because it is now an unreferenced object

[Figure: john and jeff both point to the jeff object; nothing points to the john object]

Page 9:

What is Garbage Collection?

• What is garbage collection?
  - Finding garbage and reclaiming the memory allocated to it.
• When is the garbage collection process invoked?
  - When the total memory allocated to a Java program exceeds some threshold.
• Is a running program affected by garbage collection?
  - Yes, depending on the algorithm. Typically the program suspends during garbage collection.

Page 10:

Strategies for Handling Garbage

• Modern societies produce an excessive amount of waste.
• What is the solution?
  - Reduce
  - Reuse
  - Recycle
• The same applies to Java!

Page 11:

Reduce Garbage

• A Java program that does not create any objects does not create garbage.
• Objects used until the end of the program do not become garbage.
• Reducing the number of objects used will reduce the amount of garbage generated.

Page 12:

Reuse Garbage

• Reuse objects instead of generating new ones.

  for (int i = 0; i < 1000000; ++i) {
      SomeClass obj = new SomeClass(i);
      System.out.println(obj);
  }

• This program generates one million objects and prints them out.

  SomeClass obj = new SomeClass();
  for (int i = 0; i < 1000000; ++i) {
      obj.setInt(i);
      System.out.println(obj);
  }

• Using only one object and implementing the setInt() method, we dramatically reduce the garbage generated.
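The difference between the two loops can be made measurable by counting constructor calls. This is a sketch under assumptions: `SomeClass` below is a hypothetical stand-in for the class on the slide, with a `created` counter and a `getInt()` accessor added purely for the demonstration.

```java
public class ReuseDemo {
    /** Hypothetical stand-in for the slide's SomeClass; counts constructor calls. */
    static class SomeClass {
        static int created = 0;
        private int value;

        SomeClass(int value) { this.value = value; created++; }
        void setInt(int value) { this.value = value; }
        int getInt() { return value; }
    }

    /** Allocates a fresh object on every iteration. */
    static long sumWithFreshObjects(int n) {
        long sum = 0;
        for (int i = 0; i < n; ++i) {
            SomeClass obj = new SomeClass(i);
            sum += obj.getInt();
        }
        return sum;
    }

    /** Allocates one object and mutates it on every iteration. */
    static long sumWithReusedObject(int n) {
        SomeClass obj = new SomeClass(0);
        long sum = 0;
        for (int i = 0; i < n; ++i) {
            obj.setInt(i);
            sum += obj.getInt();
        }
        return sum;
    }

    public static void main(String[] args) {
        SomeClass.created = 0;
        sumWithFreshObjects(1000);
        System.out.println("fresh:  " + SomeClass.created + " objects");  // 1000
        SomeClass.created = 0;
        sumWithReusedObject(1000);
        System.out.println("reused: " + SomeClass.created + " objects");  // 1
    }
}
```

Both methods compute the same sum; only the number of allocations differs.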

Page 13:

Recycle Garbage

• Don't leave unused objects for the garbage collector.
  - Put them instead in a container to be searched when an object is needed.
• Advantage: reduces garbage generation.
• Disadvantage: puts more overhead on the programmer.
• Can anyone think of what design pattern this represents?
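One common name for the container approach described above is an object pool. The sketch below is illustrative only; `Buffer`, `BufferPool`, `acquire`, and `release` are hypothetical names, and a real pool would also need to reset object state and handle thread safety.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PoolDemo {
    /** Hypothetical pooled object. */
    static class Buffer {
        final byte[] data = new byte[1024];
    }

    /** Keeps released buffers in a container and hands them out again. */
    static class BufferPool {
        private final Deque<Buffer> free = new ArrayDeque<>();
        int allocations = 0;

        Buffer acquire() {
            Buffer b = free.poll();        // null when the pool is empty
            if (b == null) {
                b = new Buffer();          // allocate only when nothing can be recycled
                allocations++;
            }
            return b;
        }

        void release(Buffer b) {
            free.push(b);                  // keep it for later instead of leaving it as garbage
        }
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool();
        Buffer first = pool.acquire();
        pool.release(first);
        Buffer second = pool.acquire();
        System.out.println(first == second);     // true: the same object was recycled
        System.out.println(pool.allocations);    // 1
    }
}
```

The extra bookkeeping in `acquire` and `release` is the programmer overhead the slide mentions.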

Page 14:

Garbage Collection Strategies

• Reference Counting
• Mark and Sweep
• Stop and Copy

Page 15:

Reference Counting Garbage Collection

• Main idea: add a reference count field to every object.
• This field is updated whenever the number of references to the object changes.
• Example

  Object p = new Integer(57);
  Object q = p;

[Figure: p and q both reference the Integer 57 object, whose refCount = 2]

Page 16:

Reference Counting (cont'd)

• The reference count update for a reference assignment (e.g. p = q) can be implemented as follows

  if (p != q)
  {
      if (p != null)
          --p.refCount;
      p = q;
      if (p != null)
          ++p.refCount;
  }

• Example:

  Object p = new Integer(57);
  Object q = new Integer(99);
  p = q;

[Figure: after the assignment, the 57 object has refCount = 0; p and q both reference the 99 object, whose refCount = 2]
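The update rule above can be simulated in ordinary Java. This is only a sketch: the JVM does not expose reference counts, so `Counted` and `assign` are hypothetical helpers that model the bookkeeping by hand.

```java
public class RefCountDemo {
    /** Hypothetical wrapper that carries an explicit reference count. */
    static class Counted {
        final Object value;
        int refCount = 0;

        Counted(Object value) { this.value = value; }
    }

    /**
     * Models the assignment "p = q" from the slide and returns the new value of p.
     * The count of the old target goes down and the count of the new target goes up.
     */
    static Counted assign(Counted p, Counted q) {
        if (p != q) {
            if (p != null) {
                --p.refCount;
            }
            p = q;
            if (p != null) {
                ++p.refCount;
            }
        }
        return p;
    }

    public static void main(String[] args) {
        Counted fiftySeven = new Counted(57);
        Counted ninetyNine = new Counted(99);

        Counted p = assign(null, fiftySeven);   // Object p = new Integer(57);
        Counted q = assign(null, ninetyNine);   // Object q = new Integer(99);
        p = assign(p, q);                       // p = q;

        System.out.println(fiftySeven.refCount);  // 0: the 57 object is now garbage
        System.out.println(ninetyNine.refCount);  // 2: referenced by both p and q
    }
}
```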

Page 17:

Reference Counting (cont'd)

• What about indirect references?
• We can still use reference counting, provided we consider all references to an object, including references from other objects.

  Object p = new Association(new Integer(57), new Integer(99));

Page 18:

Reference Counting (cont'd)

• When does reference counting fail?
• When head is assigned null, the first object's reference count becomes 1, not zero
• Reference counting fails whenever the data structure contains a cycle of references

[Figure: head points to the first of three ListElements; each element's next points to another element so that the three form a cycle; each element shows refCount = 1]
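The failure case can be reproduced with hand-maintained counts. This sketch assumes a circular list of three elements, as the figure suggests; `ListElement`, `setNext`, and `buildCycleAndDropHead` are hypothetical names, and the counts are updated manually because Java itself does not use reference counting.

```java
public class CycleDemo {
    /** Hypothetical list node with an explicit reference count. */
    static class ListElement {
        ListElement next;
        int refCount;
    }

    /** Stores 'to' in from.next and counts the new reference (assumes from.next was null). */
    static void setNext(ListElement from, ListElement to) {
        from.next = to;
        to.refCount++;
    }

    /** Builds a three-element circular list, then drops the only external reference. */
    static ListElement[] buildCycleAndDropHead() {
        ListElement a = new ListElement();
        ListElement b = new ListElement();
        ListElement c = new ListElement();

        ListElement head = a;      // head references the first element
        a.refCount++;

        setNext(a, b);
        setNext(b, c);
        setNext(c, a);             // closes the cycle: a is now referenced twice

        head = null;               // the program can no longer reach the list
        a.refCount--;

        return new ListElement[] { a, b, c };
    }

    public static void main(String[] args) {
        for (ListElement e : buildCycleAndDropHead()) {
            // Every count is still 1, so a pure reference counter never frees the list.
            System.out.println(e.refCount);
        }
    }
}
```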

Page 19:

Reference Counting (cont'd)

• Advantages and Disadvantages
  + Garbage is easily identified.
  + Garbage can be collected incrementally.
  - Every object should have a reference count field.
  - Overhead for updating reference count fields.
  - It fails in the case of cyclic references.
  - It does not defragment the heap.

Page 20:

Mark-and-Sweep Garbage Collection

• It is the first garbage collection algorithm that is able to reclaim garbage even for cyclic data structures.
• The mark-and-sweep algorithm consists of two phases:
  - mark phase
  - sweep phase

  for each root variable r
      mark(r);
  sweep();

Page 21:

Mark and Sweep (cont'd)

  void sweep() {
      for each Object p in the heap
      {
          if (p.marked)
              p.marked = false;
          else
              heap.release(p);
      }
  }
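The two phases can be tried out on a toy heap. The following is a simulation sketch, not how a real JVM collector is written: the "heap" is just a list, `Obj`, `alloc`, `mark`, `sweep`, and `collect` are hypothetical names, and the recursive `mark` shown here is one plausible form of the mark phase that the transcript does not spell out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class MarkSweepDemo {
    /** Hypothetical heap object with a mark bit and outgoing references. */
    static class Obj {
        final String name;
        boolean marked;
        final List<Obj> refs = new ArrayList<>();

        Obj(String name) { this.name = name; }
    }

    /** Allocates an object and registers it in the simulated heap. */
    static Obj alloc(List<Obj> heap, String name) {
        Obj o = new Obj(name);
        heap.add(o);
        return o;
    }

    /** Mark phase: flags everything reachable from p; the marked check stops cycles. */
    static void mark(Obj p) {
        if (p == null || p.marked) {
            return;
        }
        p.marked = true;
        for (Obj child : p.refs) {
            mark(child);
        }
    }

    /** Sweep phase: releases unmarked objects and clears marks for the next cycle. */
    static void sweep(List<Obj> heap) {
        Iterator<Obj> it = heap.iterator();
        while (it.hasNext()) {
            Obj p = it.next();
            if (p.marked) {
                p.marked = false;
            } else {
                it.remove();
            }
        }
    }

    static void collect(List<Obj> roots, List<Obj> heap) {
        for (Obj r : roots) {
            mark(r);
        }
        sweep(heap);
    }

    public static void main(String[] args) {
        List<Obj> heap = new ArrayList<>();
        Obj a = alloc(heap, "a");
        Obj b = alloc(heap, "b");
        Obj c = alloc(heap, "c");
        Obj d = alloc(heap, "d");
        a.refs.add(b);              // a -> b is reachable from the root
        c.refs.add(d);
        d.refs.add(c);              // c <-> d is an unreachable cycle

        List<Obj> roots = new ArrayList<>();
        roots.add(a);
        collect(roots, heap);
        System.out.println(heap.size());   // 2: the cycle was reclaimed
    }
}
```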

Page 22:

Mark and Sweep (cont'd)

• Advantages
  - It correctly identifies and collects garbage even in the presence of reference cycles.
  - No overhead in manipulating references.
• Disadvantages
  - The program suspends while garbage collecting.
  - It does not defragment the heap.

Page 23:

Stop-and-Copy Garbage Collection

• This algorithm collects garbage and defragments the heap.
• The heap is divided into two regions: active and inactive.
• When the memory in the active region is exhausted, the program is suspended and:
  - Live objects are copied to the inactive region contiguously
  - The active and inactive regions reverse their roles
• The algorithm

  for each root variable r
      r = copy(r, inactiveHeap);
  swap(activeHeap, inactiveHeap);

Page 24:

Stop-and-Copy Garbage Collection (cont'd)

  Object copy(Object p, Heap destination)
  {
      if (p == null)
          return null;
      if (p.forward == null)
      {
          q = destination.newInstance(p.class);
          p.forward = q;
          for each field f in p
          {
              if (f is primitive type)
                  q.f = p.f;
              else
                  q.f = copy(p.f, destination);
          }
          q.forward = null;
      }
      return p.forward;
  }

[Figure: head, with the active and inactive regions; the inactive region holds the copies A', B', C', each with a null forward field]
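The copy routine above can be exercised on a small simulated heap. This is a sketch under assumptions: each region is modeled as a plain list, objects have a single reference field `next`, and `Obj` and `copy` are hypothetical names rather than JVM internals.

```java
import java.util.ArrayList;
import java.util.List;

public class StopAndCopyDemo {
    /** Hypothetical heap object with one reference field and a forwarding pointer. */
    static class Obj {
        final String name;
        Obj next;
        Obj forward;     // set once the object has been copied

        Obj(String name) { this.name = name; }
    }

    /** Copies p and everything reachable from it into the destination region. */
    static Obj copy(Obj p, List<Obj> destination) {
        if (p == null) {
            return null;
        }
        if (p.forward == null) {
            Obj q = new Obj(p.name);
            destination.add(q);
            p.forward = q;                     // recorded before recursing, so cycles terminate
            q.next = copy(p.next, destination);
        }
        return p.forward;
    }

    public static void main(String[] args) {
        List<Obj> active = new ArrayList<>();
        List<Obj> inactive = new ArrayList<>();

        Obj a = new Obj("A");
        Obj b = new Obj("B");
        Obj c = new Obj("C");
        Obj garbage = new Obj("D");            // never referenced from a root
        a.next = b;
        b.next = c;
        c.next = a;                            // cyclic list
        active.add(a);
        active.add(garbage);
        active.add(b);
        active.add(c);

        Obj head = copy(a, inactive);          // head is the only root

        // Swap the roles of the two regions.
        List<Obj> tmp = active;
        active = inactive;
        inactive = tmp;

        System.out.println(active.size());                  // 3: D was not copied
        System.out.println(head.next.next.next == head);    // true: the cycle is preserved
    }
}
```

Because only live objects are appended to the destination, they end up contiguous, which is the defragmenting effect the slides describe.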

Page 25:

Stop-and-Copy Garbage Collection (cont'd)

• Advantages
  - It works for cyclic data structures.
  - It defragments the heap.
• Disadvantages
  - All live objects are copied when the garbage collector is invoked; it does not work incrementally.
  - It requires twice as much memory as the program actually uses.

Page 26:

Java Garbage Collector History

• 1.0-1.3: mark-and-sweep
• 1.3: three memory spaces
  1. Permanent space: used for JVM class and method objects
  2. Old object space: used for objects that have been around a while
  3. New (young) object space: used for newly created objects
     • Also broken into Eden, Survivor1 and Survivor2
  - allowed different techniques for each space

Page 27:

Java Garbage Collector History (cont'd)

• 1.3 techniques
  - Copy-compaction: used for new object space.
  - Mark-compact: used in old object space.
    • Similar to mark and sweep, mark-compact first marks all reachable objects; in the second phase, the surviving objects are compacted together.
  - Incremental garbage collection (optional)
    • Incremental GC creates a new middle section in the heap, which is divided into multiple trains. Garbage is reclaimed from each train one at a time. This provides shorter, more frequent pauses for garbage collection, but it can decrease overall application performance.
• still all "stop-the-world" techniques

Page 28:

Java Garbage Collector History (cont'd)

• 1.4.1 introduced parallel GC algorithms

  Collector             | Generation  | Stop the world | Multithreaded | Concurrent
  Copying               | Young       | X              |               |
  *Parallel copying     | Young       | X              | X             |
  *Parallel scavenging  | Young       | X              | X             |
  Incremental           | see Note 1  | X              |               |
  Mark-compact          | Old         | X              |               |
  *Concurrent           | Old         | see Note 2     |               | X

  Note 1: Subdivides the new generation to create an additional middle generation
  Note 2: Uses stop-the-world approach for two of its six phases

Page 29:

JVM Internals

• The architecture
  - JVM is an abstract concept
  - Sun just specified the interface
  - implementation details depend on the specific product (Sun JDK, IBM JDK, Blackdown)
• Java bytecode, the internal language
  - independent of CPU type (bytecode)
  - stack-oriented, object-oriented, type-safe

Page 30:

Runtime view on a JVM

[Figure: Classloader; JVM runtime; Native methods; Runtime data storage containing Method Area (Classes), Heap (Objects), Stack Frames, PC registers, and Native method stacks]

Page 31:

Runtime data

• Frame:
  - Saves the runtime state of an execution thread, and therefore holds the information for method execution (program counter)
  - All frames of a thread are managed in a stack

Page 32:

Runtime data

• Method area
  - Runtime information of the class file
  - Type information
  - Constant pool
  - Method information
  - Field information
  - Class static fields
  - Reference to the classloader of the class
  - Reference to reflection anchor (Class)

Page 33:

The Constant Pool

• The "constant pool" is a heterogeneous array of data. Each entry in the constant pool can be one of the following:
  - string, class or interface name, reference to a field or method, numeric value, constant String value
• No other part of the class file makes specific references to strings, classes, fields, or methods. All references to constants, names of methods, and fields are via lookup into the constant pool.

Page 34:

The Class File Structure

[Figure: class file layout]
  HEADER
  CONSTANT-POOL
  ACCESS FLAGS (Final, Native, Private, Protected, ...)
  INTERFACES
  FIELDS
  METHODS
  ATTRIBUTES

• You can use a class dumper like javap -c or DumpClass to analyze these inner details

Page 35:

javap -c -verbose example

• Our Chicken.java class

Page 36:

The Class File Format

• Java class files are brought into the JVM via the classloader
• The class file is basically just a plain byte array, following the rules of the bytecode verifier.
• All 16-bit and 32-bit quantities are formed by reading in two or four 8-bit bytes, respectively, and joining them together in big-endian format.
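Big-endian reading is what `java.io.DataInputStream` does, so the first fields of a class file can be decoded with it. The sketch below is an illustration with assumed names (`ClassHeaderDemo`, `readHeader`); it parses a hand-written 10-byte prefix rather than a real class file, and reads only the magic number, minor and major version, and constant pool count.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ClassHeaderDemo {
    /** Returns { magic, minorVersion, majorVersion, constantPoolCount }. */
    static int[] readHeader(byte[] classBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            int magic = in.readInt();                 // four bytes, big-endian
            int minor = in.readUnsignedShort();       // two bytes, big-endian
            int major = in.readUnsignedShort();
            int constantPoolCount = in.readUnsignedShort();
            return new int[] { magic, minor, major, constantPoolCount };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hand-written sample prefix: magic 0xCAFEBABE, version 50.0, constant pool count 16.
        byte[] sample = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
            0x00, 0x00,
            0x00, 0x32,
            0x00, 0x10
        };
        int[] header = readHeader(sample);
        System.out.println(Integer.toHexString(header[0]));   // cafebabe
        System.out.println(header[2] + "." + header[1]);      // 50.0
        System.out.println(header[3]);                        // 16
    }
}
```

Passing the bytes of an actual `.class` file to `readHeader` should work the same way, since these fields always come first.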

Page 37:

Methods and Fields

• The type of a field or method is indicated by a string called its signature.
• Fields may have an additional attribute giving the field's initial value.
• Methods have an additional CODE attribute giving the Java bytecode for executing that method.
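To see what such type strings look like, the standard `java.lang.invoke.MethodType` class can print the descriptor string for a given return type and parameter list. This is a small sketch added for illustration; `DescriptorDemo` and `descriptor` are hypothetical names, and it assumes Java 7 or later.

```java
import java.lang.invoke.MethodType;

public class DescriptorDemo {
    /** Builds the JVM descriptor string for a method with the given types. */
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // void m(int, String)
        System.out.println(descriptor(void.class, int.class, String.class)); // (ILjava/lang/String;)V
        // int m()
        System.out.println(descriptor(int.class));                           // ()I
        // long[] m(double)
        System.out.println(descriptor(long[].class, double.class));          // (D)[J
    }
}
```

Primitive types map to single letters, class types to `L...;` with slashes instead of dots, and each array dimension adds a leading `[`.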

Page 38:

The CODE Attribute

• maximum stack space
• maximum number of local variables
• The actual bytecode for executing the method.
• A table of exception handlers, each giving
  - start and end offset into the bytecodes,
  - an exception type, and
  - the offset of a handler for the exception

Page 39:

Bytecode Basics

Page 40:

The JVM types

• JVM types and their prefixes
  - Byte → b
  - Short → s
  - Integer → i (Java booleans are mapped to JVM ints!)
  - Long → l
  - Character → c
  - Single float → f
  - Double float → d
  - References → a (to classes, interfaces, arrays)
• These prefixes are used in opcodes (iadd, astore, ...)

Page 41:

The JVM Instruction Mnemonics

• Shuffling (pop, swap, dup, ...)
• Calculating (iadd, isub, imul, idiv, ineg, ...)
• Conversion (d2i, i2b, d2f, i2s, ...)
• Local storage operation (iload, istore, ...)
• Array operation (arraylength, newarray, ...)
• Object management (get/putfield, invokevirtual, new)
• Push operation (aconst_null, iconst_m1, ...)
• Control flow (nop, goto, jsr, ret, tableswitch, ...)
• Threading (monitorenter, monitorexit, ...)
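A few of these mnemonics can be seen by compiling very small methods and inspecting them with javap -c. The sketch below is illustrative; the class and method names are hypothetical, and the instruction sequences in the comments are what javac typically emits for such static methods, so the exact listing may differ between compilers.

```java
public class BytecodeDemo {
    // Typical bytecode for this static method:
    //   iload_0   push the first int argument
    //   iload_1   push the second int argument
    //   iadd      pop both, push their sum
    //   ireturn   return the int on top of the stack
    static int add(int a, int b) {
        return a + b;
    }

    // Typical bytecode for this static method:
    //   dload_0   push the double argument
    //   d2i       convert double to int (truncates toward zero)
    //   ireturn
    static int truncate(double d) {
        return (int) d;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));        // 5
        System.out.println(truncate(3.9));    // 3
    }
}
```

Note how the type prefixes (i for int, d for double) appear in every load, arithmetic, conversion, and return instruction.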

Page 42:

Bytecode

• Each Java bytecode (JBC) opcode is followed by zero or more bytes of additional operand information.
• Table lookup instructions (tableswitch, lookupswitch) have a flexible length
• The wide operation extension allows the base operations to use "large" operands
• No self-modifying code
• No branching to arbitrary locations: only to the beginning of instructions, limited to the scope of the current method (enforced by the verifier!)

Page 43:

Bytecode (Reverse) Engineering

Page 44:

Bytecode Engineering tools

• Obfuscators
  - Remove/manipulate all information that can be used for reverse engineering
• Native compilers
  - "Real" compile of Java bytecodes to native instructions (x86/sparc)
• Build your own bytecode
  - Programmatic generation
  - Manipulate classfiles with an API

Page 45:

Obfuscators: Techniques used

• Identifier name mangling
  - The JVM does not need useful names for methods and fields
  - They can be renamed to single-letter identifiers
• Constant pool name mangling
  - Decrypts constant pool entries at runtime
• Control flow obfuscation
  - Insertion of phantom variables, stack scrambling
  - Insertion of ghost branch instructions that rely on the phantom variables' default values and never execute

Page 46:

Obfuscators: Problems with Obfuscation

• Constant value mangling adds processing overhead: every retrieval from the constant pool goes through an extra call to a "deobfuscatename" method
• Dynamic class loading may break: classes get new names, so reflection calls like Class.forName("Account") fail because the class "Account" is now known by its obfuscated name "b16"!
• And: obfuscation breaks patterns that JIT engines can recognize for optimization
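The dynamic class loading problem shows up as a ClassNotFoundException at run time. The sketch below illustrates only the lookup behavior; `ForNameDemo` and `canLoad` are hypothetical names, and it assumes no class named Account is on the classpath, which mimics the situation after an obfuscator has renamed it.

```java
public class ForNameDemo {
    /** Returns true if a class with exactly this name can be loaded. */
    static boolean canLoad(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            // The string no longer matches any class name.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canLoad("java.lang.String"));  // true
        // Assumed to be absent here, as it would be once renamed to something like "b16".
        System.out.println(canLoad("Account"));
    }
}
```

Because the name is an ordinary string, the compiler cannot flag the mismatch; it only surfaces when the lookup runs.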

Page 47:

Protecting the Source Code: Native Compilers

• Convert Java bytecode to C
• Generate an executable via a normal C build
  - Fast execution
  - Additional decompilation effort needed
  - Long turnaround times
  - Even for small Java programs you get monster-size executable files (67mb source for Viva.java) from some commercial products
  - The transformed program may then be vulnerable to buffer overflows and off-by-ones

Page 48:

Bytecode Reverse Engineering

• Decompilation
  - Get source code from class files
• Graphical analysis
  - Rebuild the logical control flow
• Disassembly
  - Get symbolic bytecode from class files

Page 49:

JAD

• Java Decompiler
  - Free for personal use
  - JADClipse plugin for Eclipse: allows you to browse .class files