Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty...

45
Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary Program Rewriting with Diablo Bjorn De Sutter Ghent University PLDI06 , June 06, 2006
  • date post

    19-Dec-2015
  • Category

    Documents

  • view

    214
  • download

    1

Transcript of Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty...

Page 1: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11Engineering Sciences Faculty – Electronics and Information Systems Department

p. 1

Binary Program Rewriting

with DiabloBjorn De Sutter

Ghent University

PLDI06 , June 06, 2006

Page 2: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 2

Credits

• Bruno De Bus• Dominique Chanet• Ludo Van Put• Matias Madou• Bertrand Anckaert• Koen De Bosschere

Page 3: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 3

Overview

• Some background• Diablo

– Extensibility– Retargetability– Reliability

Page 4: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 4

Overview

• INTRODUCTION (45 min)• DATASTRUCTURES (1 hr)• ANALYSES AND

TRANSFORMATIONS (45 min)

• BACKENDS (30 min)

Page 5: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 5

Program Development

AssemblyCode

AssemblyCode

compilercompiler

AssemblyCode

AssemblyCode

AssemblyCode

AssemblyCode

Assemblycode

Assemblycode

Code 1Code 1

Data 1Data 1

Meta 1Meta 1

Code 1Code 1

Data 1Data 1

Meta 1Meta 1

Code 1Code 1

Data 1Data 1

Meta 1Meta 1

assemblerassembler

ExecutableExecutable linkerlinker

BroncodeBroncodeprogrammingprogrammingBroncodeBroncodeBroncodeBroncodeSource codeSource code

Page 6: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 6

Linking

LinkerLinker

Data 1Data 1

Code 1Code 1

Meta 1Meta 1

Code 2Code 2

Data 2Data 2

Meta 2Meta 2

Code 3Code 3

Data 3Data 3

Meta 3Meta 3

startprintstop

Data 1Data 1

Code 1Code 1

Code 2Code 2

Code 3Code 3

Data 2Data 2

Data 3Data 3

Page 7: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 7

Program Development

AssemblyCode

AssemblyCode

compilercompiler

AssemblyCode

AssemblyCode

AssemblyCode

AssemblyCode

Assemblycode

Assemblycode

Code 1Code 1

Data 1Data 1

Meta 1Meta 1

Code 1Code 1

Data 1Data 1

Meta 1Meta 1

Code 1Code 1

Data 1Data 1

Meta 1Meta 1

assemblerassembler

ExecutableExecutable linkerlinker

BroncodeBroncodeprogrammingprogrammingBroncodeBroncodeBroncodeBroncodeSource codeSource code

CompactedExecutableCompactedExecutable

Squeeze++Squeeze++

Page 8: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 8

ExecExec

Additional optimization opportunities?

Squeeze++Squeeze++

ExecExec

MetaMeta

linkerlinker

Page 9: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 9

Code size reduction obtained with optimization and code abstraction

media int fp C++0%

10%

20%

30%

40%

50%

60%

70%

Results (Squeeze++)

[De Sutter, De Bus and De Bosschere, ACM TOPLAS, Sept 2005]

Page 10: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 10

ExecExec

ProblemsSqueeze++Squeeze++

ExecExec

MetaMeta

linkerlinker

Only for the Alpha architecture.Only for compaction/optimizationSmall change implies days of debugging

Not retargetableNot extensibleNot reliable

Page 11: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 11

Other rewriters?

ExecExec

OMOM

CodeCode

DataData

MetaMeta

CodeCode

DataData

MetaMeta

CodeCode

DataData

MetaMeta

Same problems:- not retargetable- not extensible- not reliable

Page 12: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 12

Spike TruSpike Tru

EpoxieEpoxieWall92

OM mipsOM mipsSrivastava92

MahlerMahlerWall89

OM alphaOM alphaSrivastava94

OMOM

ATOMATOMSrivastava94

Spike NTSpike NTCohn97

OM's evolution

Page 13: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 13

AltoAlto

Squeeze++Squeeze++

SqueezeSqueeze PltoPlto

IltoIltoNNOTNNOT

Alto's Evolution

Page 14: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 14

static binary rewriting is useful…

ApplicationsApplications

- Optimization, compaction- Instrumentation- Obfuscation

- Program understanding, visualisation- Debugging- ...

ProblemsProblems

- Not retargetable- Not extensible- Not reliable

but it is a bit problematic…

Page 15: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 15

Overview

• Some background• Diablo

– Extensibility– Retargetability– Reliability

Page 16: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 16

Pre-link or post-link?

+ More meta information = more aggressive transformations

- No program overview

+ More meta information = more aggressive transformations

- No program overview

Pre-link

- Less meta information = more conservative transformations

+ Whole-program overview

- Less meta information = more conservative transformations

+ Whole-program overview

Post-link

+ More meta information+ Whole-program overview

- More implementation work

+ More meta information+ Whole-program overview

- More implementation work

Link - time

ExecExec

DiabloDiablo

CodeCode

DataData

MetaMeta

CodeCode

DataData

MetaMeta

CodeCode

DataData

MetaMeta

ExecExec

Page 17: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 17

Extensibility – The problem

compactor

instrumentator

obfuscator performance

optimizer

Page 18: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 18

ExecExec

DiabloDiablo

CodeCode

DataData

MetaMeta

CodeCode

DataData

MetaMeta

CodeCode

DataData

MetaMeta

Frontendoptimization

Frontendoptimization

ExecExec

Instructions

Creates InstructionsSchedules Instructions...

ROFs

Reads object filesLinks themWrites a program...

Analyses

livenessconstant propagation....

Optimizations

dead code eliminationinlining....

CFG

Creates functionsRemoves Basic Blocks....

Designed for extensibilityFrontend

instrumentationFrontend

instrumentation

Page 19: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 19

CodeCode

DataData

MetaMeta

CodeCode

DataData

MetaMeta

CodeCode

DataData

MetaMeta

linkinglinkingCodeCode

DataData

CodeCode

DataData

CodeCode

DataData

Operation at different levels

DataData

DataData

DataData

disassemblydisassemblyCFG constructionCFG construction

ExecExec

Layout; assemblingLayout; assembling

FrontendFrontend

FrontendFrontendFrontendFrontend

Page 20: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 20

Linux kernel specialization frontendLinux kernel specialization frontend

Application optimization and compaction frontend Application optimization and compaction frontend

ExtensibilityDiabloDiablo

kDiablokDiablo

LCTES'04

LCTES'05

Instrumentation frontendInstrumentation frontendFITFITPASTE'04

Interactive binary program editorInteractive binary program editorLancetLancetPASTE'05

Steganography frontendSteganography frontendStiloStiloICISC'04

Interactive program obfuscatorInteractive program obfuscatorLocoLocoPEPM'05

Page 21: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 21

Overview

• Some background• Diablo

– Extensibility– Retargetability– Reliability

Page 22: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 22

Object formatsObject formats

LinkersLinkers ArchitecturesArchitectures

Retargetability - The problem

Retargetability

Page 23: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 23

CodeCode

DataData

MetaMeta

CodeCode

DataData

MetaMeta

CodeCode

DataData

MetaMeta

CodeCode

DataData

CodeCode

DataData

CodeCode

DataData

Operation at different levels

DataData

DataData

DataData

ExecExec

Layout; assemblingLayout; assembling

FrontendFrontend

FrontendFrontendFrontendFrontend

linkinglinking

disassemblydisassembly

CFG constructionCFG construction

Page 24: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 24

CodeCode

DataData

MetaMeta

CodeCode

DataData

MetaMeta

CodeCode

DataData

MetaMeta

linkinglinkingCodeCode

DataData

CodeCode

DataData

CodeCode

DataData

Retargeting multiple ROF & Linkers

DataData

DataData

DataData

disassemblydisassembly

CFG constructionCFG construction

ExecExec

Layout; assemblingLayout; assembling

CodeCode

DataData

MetaMeta

CodeCode

DataData

MetaMeta

CodeCode

DataData

MetaMeta

Object file backend

Object file backend

Linker map & linker script

Linker map & linker script

ArchitectureBackend

ArchitectureBackend

Layout script & layout code

Layout script & layout codeExe backendExe backend

Page 25: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 25

Major implemented backends

DiabloDiablo

Diablo, FIT, kDiablo, LancetDiablo, FIT, kDiablo, LancetARMARM

MIPS32MIPS32

LCTES'04

ESA'04

Diablo, FIT, Stilo, kDiablo, Lancet, LocoDiablo, FIT, Stilo, kDiablo, Lancet, Locox86x86

Diablo, FITDiablo, FITAlphaAlpha

DiabloDiabloIA64IA64Europar'04

LCTES'05

Page 26: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 26

Overview

• Some background• Diablo

– Extensibility– Retargetability– Reliability

Page 27: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 27

ReliabilityReliability

CFG construction

CFG construction

RelocationRelocation Data flow analysis

Data flow analysis

Reliability - The

Problem

Conservative for safetyAggressive,

precise enough

to be useful

Page 28: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 28

CFG Construction

Potential problems:

- Indirect control flow transfers - Code that is treated as data - Unrealizable paths (procedures)

Solutions:

- Use relocation information: identifies computable addresses - Use pattern matching: identifies known address computations - Use knowledge on compiler-generated code

Potential problems:

- Indirect control flow transfers - Code that is treated as data - Unrealizable paths (procedures)

Solutions:

- Use relocation information: identifies computable addresses - Use pattern matching: identifies known address computations - Use knowledge on compiler-generated code

Potential problems:

- Differentiate data from code - Detect self-modifying code - Detect unrewritable code

Solutions:

- Section information - Symbols annotate data in code (ARM ABI) - Self-modifying code in data: no problem at this point - True self-modifying code: look at system calls and protection

Potential problems:

- Differentiate data from code - Detect self-modifying code - Detect unrewritable code

Solutions:

- Section information - Symbols annotate data in code (ARM ABI) - Self-modifying code in data: no problem at this point - True self-modifying code: look at system calls and protection

Disassembling Conservatively modelling control flow

Page 29: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 29

Detecting Data0x0080: 0x34900abc0x0084: 0x767688900x0088: 0xe25437ff0x008c: 0xc00234990x0090: 0x000012340x0094: 0x858599bd0x0098: 0x9912f4d70x009c: 0xde8723c40x00a0: 0x000001200x00a4: 0x0000012c0x00a8: 0x000000d00x00ac: 0x000002480x00b0: 0x000002100x00b4: 0x6798fedc

0x0080: mov r2, 0x0a00x0084: cmp r1, $00x0088: jl 0x0b40x008c: cmp r1, 50x0090: jge 0x0b40x0094: add r1, r2, r10x0098: ldr r1, [r1]0x009c: jmp r10x00a0: 0x000001200x00a4: 0x0000012c0x00a8: 0x000000d00x00ac: 0x000002480x00b0: 0x000002100x00b4: mov r3, r5

$code

$code

$data

Solution: add mapping symbolsSolution: add mapping symbols

Page 30: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 30

Detecting Control Flow Targets

0x0080: 0x34900abc0x0084: 0x767688900x0088: 0xe25437ff0x008c: 0xc00234990x0090: 0x000012340x0094: 0x858599bd0x0098: 0x9912f4d70x009c: 0xde8723c40x00a0: 0x000001200x00a4: 0x0000012c0x00a8: 0x000000d00x00ac: 0x000002480x00b0: 0x000002100x00b4: 0x6798fedc

0x0080: mov r2, 0x0a00x0084: cmp r1, $00x0088: jl 0x0b40x008c: cmp r1, 50x0090: jge 0x0b40x0094: add r1, r2, r10x0098: ldr r1, [r1]0x009c: jmp r10x00a0: 0x000001200x00a4: 0x0000012c0x00a8: 0x000000d00x00ac: 0x000002480x00b0: 0x000002100x00b4: mov r3, r5...0x00d0: add r4, r6, r6...0x0120: ldr r4, [r5]

$code

$code

$data

Direct control flow: trivialDirect control flow: trivial

Indirect control flow: only to code-addresses that are targets of relocations!

jmp r1

Problem: what about unrelocated computations on code- addresses?

Page 31: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 31

Control Flow: Pattern Matching

0x0080: 0x34900abc0x0084: 0x767688900x0088: 0xe25437ff0x008c: 0xc00234990x0090: 0x000012340x0094: 0x858599bd0x0098: 0x9912f4d70x009c: 0xde8723c40x00a0: 0x000001200x00a4: 0x0000012c0x00a8: 0x000000d00x00ac: 0x000002480x00b0: 0x000002100x00b4: 0x6798fedc

0x0080: mov r2, 0x0a00x0084: cmp r1, $00x0088: jl 0x0b40x008c: cmp r1, 50x0090: jge 0x0b40x0094: add r1, r2, r10x0098: ldr r1, [r1]0x009c: jmp r10x00a0: 0x000001200x00a4: 0x0000012c0x00a8: 0x000000d00x00ac: 0x000002480x00b0: 0x000002100x00b4: mov r3, r5...0x00d0: add r4, r6, r6...0x0120: ldr r4, [r5]

$code

$code

$data

Use pattern matching to improve accuracy of control flow graph: disallow computations on code-addresses that are not part of a recognized pattern

Use pattern matching to improve accuracy of control flow graph: disallow computations on code-addresses that are not part of a recognized pattern

Page 32: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 32

Pattern Matching: example0x0080: 0x34900abc0x0084: 0x767688900x0088: 0xe25437ff0x008c: 0xc00234990x0090: 0x000012340x0094: 0x858599bd0x0098: 0x9912f4d70x009c: 0xde8723c40x00a0: 0x000001200x00a4: 0x0000012c0x00a8: 0x000000d00x00ac: 0x000002480x00b0: 0x000002100x00b4: 0x6798fedc

0x0080: mov r2, 0x0a00x0084: cmp r1, $00x0088: jl 0x0b40x008c: cmp r1, 50x0090: jge 0x0b40x0094: add r1, r2, r10x0098: ldr r1, [r1]0x009c: jmp r10x00a0: 0x000001200x00a4: 0x0000012c0x00a8: 0x000000d00x00ac: 0x000002480x00b0: 0x000002100x00b4: mov r3, r5...0x00d0: add r4, r6, r6...0x0120: ldr r4, [r5]

$code

$code

$data

switch statement:

lower bounds check

upper bounds check

switch jump

jump table

switch statement:

lower bounds check

upper bounds check

switch jump

jump table

Page 33: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 33

Pattern Matching: example 20x0080: 0x34900abc0x0084: 0x767688900x0088: 0xe25437ff0x008c: 0xc00234990x0090: 0x000012340x0094: 0x858599bd0x0098: 0x9912f4d70x009c: 0xde8723c40x00a0: 0x000001200x00a4: 0x0000012c0x00a8: 0x000000d00x00ac: 0x000002480x00b0: 0x000002100x00b4: 0x6798fedc

0x0080: mov r2, 0x09c0x0084: cmp r1, $00x0088: jl 0x0b40x008c: cmp r1, $50x0090: jge 0x0b40x0094: add r1, r2, r10x0098: jmp r10x009c: jmp 0x01200x00a0: jmp 0x012c0x00a4: jmp 0x00d00x00a8: jmp 0x02480x00ac: jmp 0x02100x00b0: nop0x00b4: mov r3, r5...0x00d0: add r4, r6, r6...0x0120: ldr r4, [r5]

$codeswitch statement 2: address table is replaced by a series of direct jumps to the switch cases.

unrecognized pattern!

switch statement 2: address table is replaced by a series of direct jumps to the switch cases.

unrecognized pattern!

Solution: add pattern to Diablo

Page 34: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 34

Procedure Calls and ReturnsFunction calls and returns are often just “special” indirect jumps: not recognizing them makes the flow graph much too conservative

Solution: use pattern matching to recognize them: - rely on ABI - rely on compiler conventions

Function calls and returns are often just “special” indirect jumps: not recognizing them makes the flow graph much too conservative

Solution: use pattern matching to recognize them: - rely on ABI - rely on compiler conventions

ARM indirect procedure call:

mov r14, pcmov pc, r2

ARM procedure return:

mov pc, r14

orldr pc, [r13], #4

orldmia r13, {r4-r7,r15}!

Page 35: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 35

ReliabilityReliability

CFG construction

CFG construction

RelocationRelocation Data flow analysis

Data flow analysis

Reliability - The Problem

Page 36: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 36

Data flow analyses

Problem

- Difficult to analyse - Necessary to improve precision - Especially for C++-like languages (calls through function pointers)

Solution

- rely on calling-conventions - use symbol information - use mapping symbols - use source code information - use stack unwind information

Problem

- Difficult to analyse - Necessary to improve precision - Especially for C++-like languages (calls through function pointers)

Solution

- rely on calling-conventions - use symbol information - use mapping symbols - use source code information - use stack unwind information

Stack analysis

Page 37: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 40

ReliabilityReliability

CFG construction

CFG construction

RelocationRelocation Data flow analysis

Data flow analysis

Reliability - The Problem

Page 38: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 41

Relocation

Problem

- How to write a correct program? - How to layout data? - How to update pointers? - How to update addresses?

Observation - Most “strange” requirements come from linker manipulations

Solution - make relocations expressive - make relocations first class objects - let transformations update relocations - use linker scripts

Problem

- How to write a correct program? - How to layout data? - How to update pointers? - How to update addresses?

Observation - Most “strange” requirements come from linker manipulations

Solution - make relocations expressive - make relocations first class objects - let transformations update relocations - use linker scripts

Producing binary program again

Page 39: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 42

Overview

• Some background• Diablo

– Extensibility– Retargetability– Reliability (no, really now)

Page 40: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 43

ReliabilityReliability

CFG construction

CFG construction

RelocationRelocation Data flow analysis

Data flow analysis

Reliability - The Real Problem

Conservative for safetyAggressive,

precise enough

to be useful

Page 41: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 44

ReliabilityReliability

CFG construction

CFG construction

Reliability - The Real Problem

Conservative for safetyAggressive,

precise enough

to be useful

Solution: limit most conservative assumptions to

parts of program

Page 42: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 45

CodeCode

DataData

MetaMeta

CodeCode

DataData

MetaMeta

CodeCode

DataData

MetaMeta

CodeCode

DataData

CodeCode

DataData

CodeCode

DataData

Limit imprecision to some parts

DataData

DataData

DataData

ExecExec

Page 43: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 46

CodeCode

DataData

MetaMeta

CodeCode

DataData

MetaMeta

CodeCode

DataData

MetaMeta

CodeCode

DataData

CodeCode

DataData

CodeCode

DataData

Limit imprecision to some parts

DataData

DataData

DataData

ExecExec

CodeCode

Page 44: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 47

What program parts?• Sections from object files

– only refer to each other via symbols– special code addresses identified by

relocations– extend relocations where necessary

• no relaxation• annotate PIC code with relocations if

necessary• mark data• ...

Page 45: Binary Program Rewriting with Diablo – Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department p. 1 Binary.

Binary Program Rewriting with Diablo - Bjorn De Sutter – 2006-6-11 Engineering Sciences Faculty – Electronics and Information Systems Department

p. 48

When/why does this work?• Under separate compilation

– Partial-separate compilation– Compiler-generated code only, not manually-

written assembler

• Compiler needs to maintain conventions• Assembly writers do not know compiler-

generated code– Because multiple compiler versions are

available

• Whenever imprecision could become viral, the linker (rewriter) is informed!