Automatically Proving the Correctness of Compiler Optimizations
Sorin Lerner, Todd Millstein, Craig Chambers
University of Washington
Goal: correct compilers
• The compiler is usually part of the trusted computing base.
• “But I use gcc, and it works great!”
gcc-bugs mailing list
Searched for “incorrect” and “wrong” in the gcc-bugs mailing list. Some of the results:
• c/9525: incorrect code generation on SSE2 intrinsics
• target/7336: [ARM] With -Os option, gcc incorrectly computes the elimination offset
• optimization/9325: wrong conversion of constants: (int)(float)(int)(INT_MAX)
• optimization/6537: For -O (but not -O2 or -O0) incorrect assembly is generated
• optimization/6891: G++ generates incorrect code when -Os is used
• optimization/8613: [3.2/3.3/3.4 regression] -O2 optimization generates wrong code
• target/9732: PPC32: Wrong code with -O2 -fPIC
• c/8224: Incorrect joining of signed and unsigned division
• …
And this is only for February 2003! On a mature compiler!
Testing
[Diagram: Source → compiler → Compiled Prog; run the compiled program on an input and DIFF its output against the expected output.]
• No correctness guarantees:
– neither for the compiled program
– nor for the compiler
• To get benefits, must:
– run over many inputs
– compile many test cases
Verify each compilation
[Diagram: Source → compiler → Compiled Prog, with a Semantic DIFF between the source and the compiled program.]
• Translation validation [Pnueli et al 98, Necula 00]
• Credible compilation [Rinard 99]
• Compiler can still have bugs.
• Compile time increases.
• “Semantic Diff” is hard.
Proving the whole compiler correct
[Diagram: Source → compiler → Compiled Prog; the compiler itself is given to a correctness checker.]
Correctness checker
• Option 1: Prove compiler correct by hand.
• Proofs are long…
• And hard.
• Compilers are proven correct as written on paper. What about the implementation?
[Diagram: paper proofs and the correctness checker, with the connection between them labeled “Link?”]
Our Approach
• Our approach: prove compiler correct automatically.
[Diagram: the compiler is given to an Automatic Theorem Prover.]
This seems really hard!
[Diagram: Automatic Theorem Prover; task of proving compiler correct. Labels:]
– Complexity that an automatic theorem prover can handle.
– Complexity of proving a compiler correct.
Making the problem easier
[Diagram: Automatic Theorem Prover; task of proving optimizer correct.]
• Only prove optimizer correct.
• Trust front-end and code-generator.
• Write optimizations in Cobalt, a domain-specific language.
• Separate correctness from profitability.
• Factor out the hard and common parts of the proof, and prove them once by hand.
Results
• Cobalt language
– realistic C-like IL
– implemented const prop and folding, branch folding, CSE, PRE, DAE, partial DAE, and simple forms of points-to analyses
• Correctness checker for Cobalt opts
– using the Simplify theorem prover
• Execution engine for Cobalt opts
– in the Whirlwind compiler
Caveats
• May not be able to express your opt in Cobalt:
– no interprocedural optimizations for now.
– optimizations that build complicated data structures may be difficult to express.
• A sound Cobalt optimization may be rejected by the correctness checker.
• Trusted computing base (TCB) includes:
– front-end and code-generator, execution engine, correctness checker, proofs done by hand once
Outline
• Overview
• Forward optimizations (see paper for backwards)
– Example: constant propagation
– Strategy for proving forward optimizations sound
• Profitability heuristics
• Pure analyses
Constant Prop (straight-line code)
y := 5    (statement y := 5)
...       (statements that don’t define y)
x := y    (statement x := y) REPLACE with x := 5
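The straight-line case can be sketched in a few lines of Python. This is only an illustration, not Cobalt: it assumes statements are encoded as (target, source) pairs whose source is either an int constant or a variable name, and the function name const_prop_straight_line is made up for the example. Unlike the single rewrite shown on the slide, the sketch also remembers constants it has just propagated.

```python
def const_prop_straight_line(stmts):
    """Rewrite x := y to x := c when y := c reaches it with no redefinition of y.

    stmts is a list of (target, source) pairs; source is an int constant
    or a variable name (str). This encoding is an assumption of the sketch.
    """
    known = {}   # variable -> constant it is currently known to hold
    out = []
    for target, source in stmts:
        if isinstance(source, str) and source in known:
            source = known[source]       # the REPLACE step: x := y becomes x := 5
        if isinstance(source, int):
            known[target] = source       # "statement y := 5" starts a region for y
        else:
            known.pop(target, None)      # target redefined by a non-constant
        out.append((target, source))
    return out


program = [("y", 5), ("z", "w"), ("x", "y")]
print(const_prop_straight_line(program))
```

If y is redefined by a non-constant between y := 5 and x := y, the entry for y is dropped and x := y is left alone.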
Adding arbitrary control flow
[Diagram: a CFG in which y := 5 appears on the paths reaching x := y; x := y is REPLACED by x := 5.]
if statement y := 5
is followed by statements that don’t define y
until statement x := y
then transform statement to x := 5
Constant prop in English
if statement y := 5
is followed by statements that don’t define y
until statement x := y
then transform statement to x := 5
Constant prop in Cobalt
stmt(Y := C)
followed by ¬mayDef(Y)
until X := Y ⇒ X := C
• stmt(Y := C) and ¬mayDef(Y) are boolean expressions evaluated at nodes in the CFG.
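One way to read the “followed by … until” pattern on a CFG is as a forward must-analysis: the fact Y == C has to hold on every path reaching the node X := Y. The Python sketch below illustrates that reading only; it is not the Cobalt execution engine. The CFG encoding (stmts, succs, entry), the (target, source) statement format, and the name const_prop_cfg are assumptions made for the example, and mayDef is approximated by “assigns to the variable directly”.

```python
def const_prop_cfg(stmts, succs, entry):
    """Apply "y := c followed by no-defs-of-y until x := y" on a CFG.

    stmts: node id -> (target, source), source an int constant or a variable name
    succs: node id -> list of successor node ids
    Returns a new node -> statement mapping with matching x := y rewritten.
    """
    preds = {n: [] for n in stmts}
    for n, targets in succs.items():
        for t in targets:
            preds[t].append(n)

    out_facts = {n: None for n in stmts}   # None means "not computed yet"

    def facts_in(n):
        # Facts that hold on every path reaching n (intersection over predecessors).
        incoming = [out_facts[p] for p in preds[n] if out_facts[p] is not None]
        if n == entry:
            incoming.append({})            # nothing is known on program entry
        if not incoming:
            return None
        result = dict(incoming[0])
        for other in incoming[1:]:
            result = {v: c for v, c in result.items() if other.get(v) == c}
        return result

    def transfer(n, facts):
        target, source = stmts[n]
        facts = {v: c for v, c in facts.items() if v != target}  # target is redefined
        if isinstance(source, int):
            facts[target] = source         # stmt(Y := C) opens a region for Y
        return facts

    changed = True
    while changed:                         # iterate to a fixpoint
        changed = False
        for n in stmts:
            incoming = facts_in(n)
            if incoming is None:
                continue
            new = transfer(n, incoming)
            if new != out_facts[n]:
                out_facts[n] = new
                changed = True

    result = dict(stmts)
    for n, (target, source) in stmts.items():
        incoming = facts_in(n)
        if incoming and isinstance(source, str) and source in incoming:
            result[n] = (target, incoming[source])   # X := Y becomes X := C
    return result


# Diamond-shaped CFG: both branches assign y := 5 before x := y.
stmts = {0: ("t", "u"), 1: ("y", 5), 2: ("y", 5), 3: ("x", "y")}
succs = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(const_prop_cfg(stmts, succs, 0)[3])
```

If one branch assigned y := 6 instead, the intersection at node 3 would be empty and x := y would be left unchanged.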
Outline
• Overview
• Forward optimizations (see paper for backwards)
– Example: constant propagation
– Strategy for proving forward optimizations sound
• Profitability heuristics
• Pure analyses
Proving correctness automatically
[Diagram: the CFG from before, with y := 5 on the paths reaching x := y, which is replaced by x := 5.]
• Witnessing region
• Invariant: y == 5
Constant prop revisited
stmt(Y := C)
followed by ¬mayDef(Y)
until X := Y ⇒ X := C
with witness Y == C
Ask a theorem prover to show:
1. A statement satisfying stmt(Y := C) establishes Y == C
2. A statement satisfying ¬mayDef(Y) maintains Y == C
3. The statements X := Y and X := C have the same semantics in a program state satisfying Y == C
Generalize to any forward optimization
ψ1
followed by ψ2
until s ⇒ s’
with witness P
Ask a theorem prover to show:
1. A statement satisfying ψ1 establishes P
2. A statement satisfying ψ2 maintains P
3. The statements s and s’ have the same semantics in a program state satisfying P
We showed by hand once that these conditions imply correctness.
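To make the three obligations concrete, here is a small Python sketch that checks them for constant propagation by brute-force enumeration over a tiny toy language. This is not how the correctness checker works (the slides say it uses the Simplify theorem prover); the variable set, value range, statement encoding, and all function names below are assumptions chosen so that exhaustive checking is feasible.

```python
from itertools import product

VARS = ("x", "y", "z")
VALUES = (0, 1, 2)

def all_states():
    # Every assignment of a value to each variable.
    for vals in product(VALUES, repeat=len(VARS)):
        yield dict(zip(VARS, vals))

def all_statements():
    # Every statement "target := source" with a variable or constant source.
    for target in VARS:
        for source in VARS + VALUES:
            yield (target, source)

def execute(stmt, state):
    target, source = stmt
    new_state = dict(state)
    new_state[target] = state[source] if isinstance(source, str) else source
    return new_state

def may_def(stmt, var):
    # No pointers in this toy language: only a direct assignment defines var.
    return stmt[0] == var

def establishes():
    # Obligation 1: Y := C establishes the witness Y == C.
    return all(execute((y, c), s)[y] == c
               for y in VARS for c in VALUES for s in all_states())

def maintains(guard=may_def):
    # Obligation 2: a statement with "not guard(stmt, Y)" maintains Y == C.
    return all(execute(stmt, s)[y] == c
               for y in VARS for c in VALUES
               for s in all_states() if s[y] == c
               for stmt in all_statements() if not guard(stmt, y))

def same_semantics():
    # Obligation 3: X := Y and X := C agree in any state where Y == C.
    return all(execute((x, y), s) == execute((x, c), s)
               for x in VARS for y in VARS for c in VALUES
               for s in all_states() if s[y] == c)

print(establishes(), maintains(), same_semantics())
# A guard that never reports a definition breaks obligation 2:
print(maintains(guard=lambda stmt, var: False))
```

The last line shows the kind of mistake the obligations are meant to catch: with an unsound guard, some statement overwrites Y inside the region and the invariant no longer holds.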
Outline
• Overview
• Forward optimizations (see paper for backwards)
• Profitability heuristics
• Pure analyses
Profitability heuristics
• Optimization correct ⇒ safe to perform any subset of the matching transformations.
• So far, all transformations were also profitable.
• In some cases, many transformations are legal, but only a few are profitable.
The two pieces of an optimization
ψ1
followed by ψ2
until s ⇒ s’
with witness P
filtered through choose
• Transformation pattern:
– defines which transformations are legal.
• Profitability heuristic:
– describes which of the legal transformations to actually perform.
– does not affect soundness.
– can be written in a language of the user’s choice.
• This way of factoring an optimization is crucial to our ability to prove optimizations sound automatically.
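The split between the two pieces can be sketched as follows. This is a hypothetical Python illustration, not the Cobalt execution engine: apply_optimization, legal_const_prop, and the choose_* functions are invented names, and the legal set is hard-coded where a real pattern matcher would compute it. Intersecting the heuristic's answer with the legal set is one simple way to make sure the heuristic cannot affect soundness.

```python
def apply_optimization(program, legal_transformations, choose):
    """Perform only the legal transformations that the heuristic selects.

    program: dict node -> statement
    legal_transformations: function program -> dict node -> replacement statement
    choose: function (legal, program) -> iterable of nodes to transform
    """
    legal = legal_transformations(program)
    # Intersect with the legal set so a buggy heuristic cannot break soundness.
    chosen = set(choose(legal, program)) & set(legal)
    result = dict(program)
    for node in chosen:
        result[node] = legal[node]
    return result


program = {0: "y := 5", 1: "x := y", 2: "z := y"}

def legal_const_prop(p):
    # Stand-in for the transformation pattern; hard-coded for illustration.
    return {1: "x := 5", 2: "z := 5"}

def choose_all(legal, p):
    return legal.keys()

def choose_first(legal, p):
    return [min(legal)]

def choose_buggy(legal, p):
    return [0, 1]          # node 0 is not a legal transformation site

print(apply_optimization(program, legal_const_prop, choose_all))
print(apply_optimization(program, legal_const_prop, choose_first))
print(apply_optimization(program, legal_const_prop, choose_buggy))
```

Because choose only selects among transformations already known to be legal, it can be arbitrary code and never needs to be reasoned about by the correctness checker.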
Profitability heuristic example: PRE
• PRE as code duplication followed by CSE
• Code duplication: a copy of x := a + b is placed in the else branch.
a := ...;
b := ...;
if (...) {
  a := ...;
  x := a + b;
} else {
  ...
  x := a + b;
}
x := a + b;
• CSE: the final x := a + b becomes x := x.
• Self-assignment removal: x := x is deleted.
Profitability heuristic example: PRE
a := ...;
b := ...;
if (...) {
a := ...;
x := a + b;
} else {
...
}
x := a + b;
[Diagram labels: legal placements of x := a + b; profitable placement.]
Outline
• Overview
• Forward optimizations (see paper for backwards)
• Profitability heuristics
• Pure analyses
Constant prop revisited (again)
stmt(Y := C)
followed by ¬mayDef(Y)
until X := Y ⇒ X := C
with witness Y == C
mayDef in Cobalt
• Very conservative!
• Can we do better?
• mayPntTo is a pure analysis.
• It computes dataflow info, but performs no transformations.
mayPntTo in Cobalt
stmt(decl X)
followed by ¬stmt(... := &X)
defines addrNotTaken(X)
with witness “no location in the store points to X”
mayPntTo(X,Y) ≜ ¬addrNotTaken(Y)
[Diagram: a path from decl X to a statement s.]
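A pure analysis of this shape can be sketched for straight-line code in Python. The sketch is an illustration under stated assumptions: statements are tuples such as ("decl", "x") and ("addr", "p", "x") (the latter standing for p := &x), labels are attached to the program point after each statement, and the names addr_not_taken_labels and may_pnt_to are made up for the example. It only computes labels and performs no transformation.

```python
def addr_not_taken_labels(stmts):
    """For each program point, the set of variables labeled addrNotTaken.

    stmts is a list of tuples; ("decl", x) declares x and ("addr", p, x)
    stands for p := &x. Any other statement leaves the labels unchanged.
    labels[i] holds the variables labeled after statement i.
    """
    labels = []
    current = set()
    for stmt in stmts:
        if stmt[0] == "decl":
            current = current | {stmt[1]}    # region for X starts at decl X
        elif stmt[0] == "addr":
            current = current - {stmt[2]}    # ... := &X ends the region for X
        labels.append(current)
    return labels

def may_pnt_to(labels, point, x, y):
    # mayPntTo(X, Y) is defined as "not addrNotTaken(Y)"; X plays no role,
    # which is what makes this points-to information so coarse.
    return y not in labels[point]


stmts = [
    ("decl", "x"),
    ("decl", "y"),
    ("decl", "p"),
    ("addr", "p", "x"),      # p := &x
    ("assign", "y", 5),      # y := 5
]
labels = addr_not_taken_labels(stmts)
print(may_pnt_to(labels, 4, "p", "x"))
print(may_pnt_to(labels, 4, "p", "y"))
```

A less conservative mayDef could consult these labels: a store through a pointer only needs to be treated as a possible definition of Y when mayPntTo reports that something may point to Y.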
Future work
• Improving expressiveness
– interprocedural optimizations
– one-to-many and many-to-many transformations
• Inferring the witness
• Generate specialized compiler binary from the Cobalt sources.
Summary and Conclusion
• Optimizations written in a domain-specific language can be proven correct automatically.
• Our correctness checker found several subtle bugs in Cobalt optimizations.
• A good step towards proving compilers correct automatically.