Java 6 Decompiler
Joshua Cranmer
Why decompile?
Source code may be lost while the compiled code survives
Examples:
  Accidentally deleted source code (happened to me!)
  Need to patch abandonware (happened to me!)
  Security analysis (has not happened to me)
Myths of decompilation
Decompilers are illegal
  Just as legal as BitTorrent
  If so, then why does IDA Pro exist?
Decompilation is impossible
  The undecidable step is actually pre-disassembly (code vs. data)
Decompilation is impractical
  Based on the notion that it merely undoes the steps a compiler does
Steps of decompilation
Signature recovery
  Simple parser
  Newer features make this more difficult
Stack analysis and variable recovery
  “Simple” without optimization or arbitrary scoping
Trivial decompilation
  Example: fadd -> +
Control flow graph recovery
  Most difficult portion
  Direct translation impossible in some circumstances
Post-decompilation transformation
  Changes legal syntax to sensible syntax
Signature Recovery
Signatures are stored like (Ljava/lang/Object;I)V
Generics use a syntax like (TE;)V
Proposed Java 7 features are crazier
Enums, annotations, etc. use specific bits or binary JVM attributes (relatively simple)
Completed in Q1
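To make the descriptor format concrete, here is a minimal sketch of a recursive-descent parser for plain method descriptors such as (Ljava/lang/Object;I)V. The class and method names (DescriptorSketch, describe) are made up for illustration and are not taken from the decompiler described in these slides. Generic signatures such as (TE;)V are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: renders a JVM method descriptor as Java-like text. */
final class DescriptorSketch {
    private final String desc;
    private int pos = 0;

    private DescriptorSketch(String desc) {
        this.desc = desc;
    }

    /** Example: "(Ljava/lang/Object;I)V" becomes "void (java.lang.Object, int)". */
    static String describe(String descriptor) {
        DescriptorSketch p = new DescriptorSketch(descriptor);
        if (p.desc.charAt(p.pos++) != '(') {
            throw new IllegalArgumentException("expected '(' in " + descriptor);
        }
        List<String> params = new ArrayList<String>();
        while (p.desc.charAt(p.pos) != ')') {
            params.add(p.readType());
        }
        p.pos++; // skip ')'
        String returnType = p.readType();

        StringBuilder out = new StringBuilder(returnType).append(" (");
        for (int i = 0; i < params.size(); i++) {
            if (i > 0) {
                out.append(", ");
            }
            out.append(params.get(i));
        }
        return out.append(')').toString();
    }

    /** Reads one field type starting at pos and advances past it. */
    private String readType() {
        char c = desc.charAt(pos++);
        switch (c) {
            case 'B': return "byte";
            case 'C': return "char";
            case 'D': return "double";
            case 'F': return "float";
            case 'I': return "int";
            case 'J': return "long";
            case 'S': return "short";
            case 'Z': return "boolean";
            case 'V': return "void";
            case '[': return readType() + "[]"; // one array dimension
            case 'L': {
                // Class names run up to ';' and use '/' as the package separator.
                int end = desc.indexOf(';', pos);
                String name = desc.substring(pos, end).replace('/', '.');
                pos = end + 1;
                return name;
            }
            default:
                throw new IllegalArgumentException("bad type char '" + c + "' in " + desc);
        }
    }
}
```

Calling DescriptorSketch.describe("(Ljava/lang/Object;I)V") yields "void (java.lang.Object, int)". Malformed input is only loosely checked here.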
Stack Analysis
Used to infer information about variables and undo some optimizations
Uses Static Single Assignment (a “variable” can only be assigned once)
Variables are not presently unified, which makes the output ugly
Most work done in late Q1 and Q2
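A rough sketch of the idea, assuming a toy textual instruction format ("fload 1", "fconst_2", "fadd", "fmul", "fstore 3") rather than real class-file bytes: simulate the operand stack with expression strings, and give each store to a local slot a fresh SSA-style name. The class StackSketch and the vSLOT_VERSION naming scheme are invented for this example. Version 0 stands for whatever the slot held on entry.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: symbolic execution of a few float instructions. */
final class StackSketch {

    static List<String> decompile(String... code) {
        Deque<String> stack = new ArrayDeque<String>();                 // operand stack of expressions
        Map<String, Integer> version = new HashMap<String, Integer>(); // slot -> latest SSA version
        List<String> statements = new ArrayList<String>();

        for (String insn : code) {
            String[] parts = insn.split(" ");
            String op = parts[0];
            if (op.equals("fload")) {
                stack.push(currentName(parts[1], version));
            } else if (op.startsWith("fconst_")) {
                stack.push(op.substring("fconst_".length()) + ".0f");
            } else if (op.equals("fadd") || op.equals("fmul")) {
                // The right operand was pushed last, so it is popped first.
                String right = stack.pop();
                String left = stack.pop();
                String symbol = op.equals("fadd") ? " + " : " * ";
                stack.push("(" + left + symbol + right + ")");
            } else if (op.equals("fstore")) {
                String value = stack.pop();
                Integer old = version.get(parts[1]);
                int next = (old == null) ? 1 : old + 1;
                version.put(parts[1], next);
                // Each store defines a brand-new name: single assignment.
                statements.add("float v" + parts[1] + "_" + next + " = " + value + ";");
            } else {
                throw new IllegalArgumentException("unsupported instruction: " + insn);
            }
        }
        return statements;
    }

    private static String currentName(String slot, Map<String, Integer> version) {
        Integer v = version.get(slot);
        return "v" + slot + "_" + (v == null ? 0 : v);
    }
}
```

For the sequence fload 1, fload 2, fadd, fstore 3, fload 3, fconst_2, fmul, fstore 3, this produces two statements defining v3_1 and v3_2. Since the two versions of slot 3 are never merged back into one variable, the output shows the kind of ugliness that missing unification causes.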
Control Flow Graph Reconstruction
Hardest part of decompiling
Worked on during Q2, Q3, and Q4
Basic algorithm: create blocks and unify
Only unifications currently supported are if-else blocks
Couldn’t complete because loops were too difficult to get working
Example of CFG Reconstruction
The following is an if-else block recovery:

    A
   / \
  B   C
   \ /
    D

<block A>
if <expression> {
    <block B>
} else {
    <block C>
}
<block D>
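One way the diamond above might be recognized, as a minimal sketch: represent the graph as a map from block name to successor list, and check that the head has two successors which both fall through to the same join block. The class DiamondSketch, its unify method, and the map representation are assumptions for illustration, not the decompiler's actual data structures. A real pass would also need to verify that B and C have no other predecessors, which is omitted here.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: unify an A -> {B, C} -> D diamond into if-else text. */
final class DiamondSketch {

    /**
     * succ maps each block name to its successors; a conditional block
     * lists [then-target, else-target]. Returns null if no diamond starts at head.
     */
    static String unify(String head, String condition, Map<String, List<String>> succ) {
        List<String> out = succ.get(head);
        if (out == null || out.size() != 2) {
            return null; // not a two-way branch
        }
        String thenBlock = out.get(0);
        String elseBlock = out.get(1);
        List<String> thenOut = succ.get(thenBlock);
        List<String> elseOut = succ.get(elseBlock);
        if (thenOut == null || elseOut == null
                || thenOut.size() != 1 || elseOut.size() != 1
                || !thenOut.get(0).equals(elseOut.get(0))) {
            return null; // the arms do not rejoin at a single block
        }
        String join = thenOut.get(0);
        return "<block " + head + ">\n"
                + "if " + condition + " {\n"
                + "    <block " + thenBlock + ">\n"
                + "} else {\n"
                + "    <block " + elseBlock + ">\n"
                + "}\n"
                + "<block " + join + ">";
    }
}
```

With successors A -> [B, C], B -> [D], C -> [D], calling unify("A", "<expression>", succ) reproduces the template shown above. Any other shape returns null, which is where loop handling would have to take over.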
Post-decompilation Transformation
Not implemented
Idea is to take certain recognizable blocks of code and refactor them into common expressions
Examples:
  Object.class (before Java 5)
  Inner class private accessors
  Bridge code
  String concatenation
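Since this step was not implemented, the following is only a guess at what the string concatenation case could look like. It uses an invented miniature expression tree (ConcatSketch.Expr, with leaf and call nodes) and rewrites a new StringBuilder().append(...).append(...).toString() chain into a + expression. A real pass would need type information: if neither of the first two operands is a String, the rewritten + would mean arithmetic, so this sketch is not safe as written.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: fold StringBuilder append chains back into "+". */
final class ConcatSketch {

    /** Either a leaf holding source text, or a call receiver.method(arg). */
    static final class Expr {
        final String text;   // non-null only for leaves
        final Expr receiver; // non-null only for calls
        final String method;
        final Expr arg;      // null for zero-argument calls

        private Expr(String text, Expr receiver, String method, Expr arg) {
            this.text = text;
            this.receiver = receiver;
            this.method = method;
            this.arg = arg;
        }
    }

    static Expr leaf(String text) {
        return new Expr(text, null, null, null);
    }

    static Expr call(Expr receiver, String method, Expr arg) {
        return new Expr(null, receiver, method, arg);
    }

    /** Renders an expression, preferring the "+" form where the pattern matches. */
    static String render(Expr e) {
        if (e.text != null) {
            return e.text;
        }
        String concat = asConcat(e);
        if (concat != null) {
            return concat;
        }
        return render(e.receiver) + "." + e.method + "(" + (e.arg == null ? "" : render(e.arg)) + ")";
    }

    /** Returns "a + b + ..." if e is toString() on an append chain, else null. */
    private static String asConcat(Expr e) {
        if (e.text != null || !"toString".equals(e.method) || e.arg != null) {
            return null;
        }
        List<String> parts = new ArrayList<String>();
        Expr cur = e.receiver;
        // Walk inward through .append(x) calls, collecting arguments last-to-first.
        while (cur.text == null && "append".equals(cur.method) && cur.arg != null) {
            parts.add(render(cur.arg));
            cur = cur.receiver;
        }
        if (parts.isEmpty() || !"new StringBuilder()".equals(cur.text)) {
            return null;
        }
        Collections.reverse(parts);
        StringBuilder out = new StringBuilder(parts.get(0));
        for (int i = 1; i < parts.size(); i++) {
            out.append(" + ").append(parts.get(i));
        }
        return out.toString();
    }
}
```

Building the tree for new StringBuilder().append("x=").append(x).toString() and passing it to render gives "x=" + x, while unrelated calls such as list.size() are rendered unchanged.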
Future work
Code is a horrible internal mess
Probably switch to building off of other open-source projects
Better type analysis and unification (especially generics)
Allow CFG recovery in particular to be generalized for other types of decompilation
???
Profit
Send any and all questions to