Java 6 Decompiler

Java 6 Decompiler Joshua Cranmer


Page 1: Java 6 Decompiler

Java 6 Decompiler
Joshua Cranmer

Page 2: Java 6 Decompiler

Why decompile?

Source code may be lost while the compiled code survives
Examples:
  Accidentally deleted source code (happened to me!)
  Need to patch abandonware (happened to me!)
  Security analysis (has not happened to me)

Page 3: Java 6 Decompiler

Myths of decompilation

Decompilers are illegal
  Just as legal as BitTorrent
  If so, then why does IDA Pro exist?
Decompilation is impossible
  The undecidable step is actually pre-disassembly (code vs. data)
Decompilation is impractical
  Based on the notion that it merely undoes the steps a compiler does

Page 4: Java 6 Decompiler

Steps of decompilation

Signature recovery
  Simple parser
  Newer features make this more difficult
Stack analysis and variable recovery
  “Simple” without optimization or arbitrary scoping
Trivial decompilation
  Example: fadd -> +
Control flow graph recovery
  Most difficult portion
  Direct translation impossible in some circumstances
Post-decompilation transformation
  Changes legal syntax to sensible syntax

Page 5: Java 6 Decompiler

Signature Recovery

Signatures are stored like (Ljava/lang/Object;I)V
Generics use a syntax like (TE;)V
Proposed Java 7 features are crazier
Enums, annotations, etc. use specific bits or binary JVM attributes (relatively simple)
Completed in Q1
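To illustrate the "simple parser" for the non-generic case, here is a minimal sketch that turns a method descriptor such as (Ljava/lang/Object;I)V into a Java-like signature. The class name DescriptorSketch and the method toJavaSignature are hypothetical and not taken from the decompiler presented here; generic signatures such as (TE;)V are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; handles plain JVM method descriptors.
final class DescriptorSketch {
    private final String desc;
    private int pos;

    private DescriptorSketch(String desc) {
        this.desc = desc;
    }

    /** Converts e.g. "(Ljava/lang/Object;I)V" into "void name(java.lang.Object, int)". */
    static String toJavaSignature(String name, String descriptor) {
        if (descriptor.isEmpty() || descriptor.charAt(0) != '(') {
            throw new IllegalArgumentException("not a method descriptor: " + descriptor);
        }
        DescriptorSketch p = new DescriptorSketch(descriptor);
        p.pos = 1; // skip '('
        List<String> params = new ArrayList<>();
        while (p.desc.charAt(p.pos) != ')') {
            params.add(p.readType());
        }
        p.pos++; // skip ')'
        String returnType = p.readType();
        return returnType + " " + name + "(" + String.join(", ", params) + ")";
    }

    /** Reads one field type starting at pos and advances past it. */
    private String readType() {
        char tag = desc.charAt(pos++);
        switch (tag) {
            case 'B': return "byte";
            case 'C': return "char";
            case 'D': return "double";
            case 'F': return "float";
            case 'I': return "int";
            case 'J': return "long";
            case 'S': return "short";
            case 'Z': return "boolean";
            case 'V': return "void";
            case '[': return readType() + "[]"; // array: element type, then brackets
            case 'L': {
                int end = desc.indexOf(';', pos);
                if (end < 0) {
                    throw new IllegalArgumentException("unterminated class name in " + desc);
                }
                String className = desc.substring(pos, end).replace('/', '.');
                pos = end + 1;
                return className;
            }
            default:
                throw new IllegalArgumentException("unknown type tag: " + tag);
        }
    }
}
```

For example, DescriptorSketch.toJavaSignature("put", "(Ljava/lang/Object;I)V") yields "void put(java.lang.Object, int)".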

Page 6: Java 6 Decompiler

Stack Analysis

Used to infer information about variables and undo some optimizations
Uses Static Single Assignment (a “variable” can only be assigned once)
Variables are not presently unified, which makes the output ugly
Most work done in late Q1 and Q2
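The idea can be sketched for straight-line code: simulate the operand stack symbolically, and give every store a fresh version of the variable so that each name is assigned only once. This is an illustrative sketch, not the decompiler's actual code. The class StackSketch is hypothetical, and the textual instruction format ("fload 0" with a space-separated slot) is a simplification of real bytecode. Branches, which would need merging of versions, are not handled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: symbolic stack simulation with SSA-style variable versions.
final class StackSketch {
    /** Turns a straight-line instruction sequence into Java-like statements. */
    static List<String> decompile(String... instructions) {
        Deque<String> stack = new ArrayDeque<>();        // symbolic operand stack
        Map<String, Integer> versions = new HashMap<>(); // slot -> current version
        List<String> statements = new ArrayList<>();
        for (String insn : instructions) {
            String[] parts = insn.split(" ");
            switch (parts[0]) {
                case "iload":
                case "fload":
                    stack.push(name(parts[1], versions));
                    break;
                case "iadd":
                case "fadd":
                    pushBinary(stack, "+"); // trivial decompilation: fadd -> +
                    break;
                case "imul":
                case "fmul":
                    pushBinary(stack, "*");
                    break;
                case "istore":
                case "fstore": {
                    String value = stack.pop();
                    versions.merge(parts[1], 1, Integer::sum); // fresh version per store
                    statements.add(name(parts[1], versions) + " = " + value + ";");
                    break;
                }
                case "ireturn":
                case "freturn":
                    statements.add("return " + stack.pop() + ";");
                    break;
                default:
                    throw new IllegalArgumentException("unsupported instruction: " + insn);
            }
        }
        return statements;
    }

    private static String name(String slot, Map<String, Integer> versions) {
        return "v" + slot + "_" + versions.getOrDefault(slot, 0);
    }

    private static void pushBinary(Deque<String> stack, String operator) {
        String right = stack.pop(); // the right operand was pushed last
        String left = stack.pop();
        stack.push("(" + left + " " + operator + " " + right + ")");
    }
}
```

Names such as v0_0 and v0_1 refer to the same local slot. Leaving them separate is the kind of ugliness that unifying variables would remove.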

Page 7: Java 6 Decompiler

Control Flow Graph Reconstruction

Hardest part of decompiling
Worked on during Q2, Q3, and Q4
Basic algorithm: create blocks and unify
Only unifications currently supported are if-else blocks
Could not complete because loops proved too difficult to get working

Page 8: Java 6 Decompiler

Example of CFG Reconstruction

The following is an if-else block recovery:

    A
   / \
  B   C
   \ /
    D

<block A>
if <expression> {
  <block B>
} else {
  <block C>
}
<block D>
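The diamond recovery above can be sketched as a pattern match on the graph: a block with two successors that both flow into one shared join block. This is a minimal sketch under stated assumptions, not the presented implementation. The classes CfgSketch and Block are hypothetical, and the first successor is assumed to be the branch taken when the condition is true.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: recognizes the A -> (B, C) -> D diamond and emits if-else.
final class CfgSketch {
    /** A basic block with an optional branch condition and its successors. */
    static final class Block {
        final String label;
        final String condition;       // null unless the block ends in a conditional branch
        final List<Block> successors; // assumption: index 0 is the "true" branch

        Block(String label, String condition, Block... successors) {
            this.label = label;
            this.condition = condition;
            this.successors = Arrays.asList(successors);
        }
    }

    /** Returns structured source if head starts a diamond, otherwise null. */
    static String recoverIfElse(Block head) {
        if (head.condition == null || head.successors.size() != 2) {
            return null;
        }
        Block thenBlock = head.successors.get(0);
        Block elseBlock = head.successors.get(1);
        if (thenBlock.successors.size() != 1 || elseBlock.successors.size() != 1) {
            return null;
        }
        Block join = thenBlock.successors.get(0);
        if (join != elseBlock.successors.get(0)) {
            return null; // the two branches do not meet in the same block
        }
        return "<block " + head.label + ">\n"
             + "if (" + head.condition + ") {\n"
             + "  <block " + thenBlock.label + ">\n"
             + "} else {\n"
             + "  <block " + elseBlock.label + ">\n"
             + "}\n"
             + "<block " + join.label + ">\n";
    }
}
```

A fuller version would replace the matched blocks with a single unified block and repeat until nothing more matches, which is one way to read "create blocks and unify". Loops would need additional patterns for back edges.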

Page 9: Java 6 Decompiler

Post-decompilation Transformation

Not implemented
Idea is to take certain recognizable blocks of code and refactor them into common expressions
Examples:
  Object.class (before Java 5)
  Inner class private accessors
  Bridge code
  String concatenation
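As an illustration of the string concatenation case: javac for Java 5 and 6 compiles a + b on strings into a new StringBuilder().append(a).append(b).toString() chain, so a post-pass can fold that chain back into +. The sketch below works on a tiny expression tree. The class ConcatSketch and its Expr type are hypothetical, since this step was not implemented in the decompiler, and the sketch ignores corner cases such as operand types and operator precedence.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: folds StringBuilder append chains back into "+" expressions.
final class ConcatSketch {
    /** Either a leaf (text set) or a single-argument method call on a receiver. */
    static final class Expr {
        final String text;   // leaf source text, e.g. "name" or "new StringBuilder()"
        final Expr receiver; // call target, null for leaves
        final String method; // method name, null for leaves
        final Expr argument; // null for zero-argument calls

        private Expr(String text, Expr receiver, String method, Expr argument) {
            this.text = text;
            this.receiver = receiver;
            this.method = method;
            this.argument = argument;
        }

        static Expr leaf(String text) {
            return new Expr(text, null, null, null);
        }

        static Expr call(Expr receiver, String method, Expr argument) {
            return new Expr(null, receiver, method, argument);
        }
    }

    /** Renders an expression, rewriting recognized append chains on the way. */
    static String render(Expr e) {
        if (e.text != null) {
            return e.text;
        }
        String concat = asConcatenation(e);
        if (concat != null) {
            return concat;
        }
        String arg = e.argument == null ? "" : render(e.argument);
        return render(e.receiver) + "." + e.method + "(" + arg + ")";
    }

    /** Returns "a + b + ..." if e is new StringBuilder().append(a)...toString(), else null. */
    private static String asConcatenation(Expr e) {
        if (!"toString".equals(e.method) || e.argument != null) {
            return null;
        }
        Deque<String> operands = new ArrayDeque<>();
        Expr cur = e.receiver;
        while (cur.text == null && "append".equals(cur.method) && cur.argument != null) {
            operands.push(render(cur.argument)); // walking outside-in, so push restores source order
            cur = cur.receiver;
        }
        if (!"new StringBuilder()".equals(cur.text) || operands.isEmpty()) {
            return null;
        }
        return String.join(" + ", operands);
    }
}
```

Calls that do not match the pattern, such as obj.toString(), are rendered unchanged.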

Page 10: Java 6 Decompiler

Future work

Code is a horrible internal mess
Probably switch to building off of other open-source projects
Better type analysis and unification (especially generics)
Allow the code, especially CFG recovery, to be generalized for other types of decompilation
???
Profit
Send any and all questions to [email protected]