Verifying a compiler: Why? How? How far?
Xavier Leroy
INRIA Paris-Rocquencourt
CGO 2011
X. Leroy (INRIA) Verifying a compiler CGO 2011 1 / 57
Compiler verification:
Why?
Can you trust your compiler?
Source program → Compiler → Executable machine code (≠ ?)
The miscompilation issue: Bugs in the compiler can lead to incorrect machine code being generated from a correct source program.
Miscompilation happens
NULLSTONE isolated defects [in integer division] in twelve of twenty commercially available compilers that were evaluated.
http://www.nullstone.com/htmls/category/divide.htm
We tested thirteen production-quality C compilers and, for each, found situations in which the compiler generated incorrect code for accessing volatile variables.
E. Eide & J. Regehr, EMSOFT 2008
To improve the quality of C compilers, we created Csmith, a randomized test-case generation tool, and spent three years using it to find compiler bugs. During this period we reported more than 325 previously unknown bugs to compiler developers. Every compiler we tested was found to crash and also to silently generate wrong code when presented with valid input.
X. Yang, Y. Chen, E. Eide & J. Regehr, PLDI 2011
Exhibit A: GCC bug #323
Title: optimized code gives strange floating point results.
#include <stdio.h>
void test(double x, double y)
{
  double y2 = x + 1.0; // computed in 80 bits, not rounded to 64 bits
  if (y != y2) printf("error\n");
}
int main(void)
{
  double x = .012;
  double y = x + 1.0; // computed in 80 bits, rounded to 64 bits
  test(x, y);
  return 0;
}
Why it is a bug: C99 allows intermediate results to be computed with excess precision, but requires them to be rounded at assignments.
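The discrepancy can be reproduced without x87 hardware. The sketch below uses Python for illustration only. It models round-to-nearest-even at a given significand width with exact rationals; the helper `round_to_precision` is a hypothetical name, not part of any library. It compares the sum kept with a 64-bit significand (the 80-bit x87 format) with the same sum rounded to a 53-bit significand (a 64-bit double).

```python
from fractions import Fraction


def round_to_precision(value: Fraction, bits: int) -> Fraction:
    """Round a positive rational to `bits` significand bits, ties to even."""
    # Find e such that 2**e <= value < 2**(e + 1).
    e = value.numerator.bit_length() - value.denominator.bit_length()
    if Fraction(2) ** e > value:
        e -= 1
    # Scale so that the significand becomes an integer of `bits` bits.
    scale = Fraction(2) ** (bits - 1 - e)
    return Fraction(round(value * scale)) / scale  # round() is half-to-even


x = Fraction(0.012)        # exact value of the double nearest to .012
exact_sum = x + 1          # x + 1.0 with unbounded precision

y_extended = round_to_precision(exact_sum, 64)  # value held in an x87 register
y_double = round_to_precision(exact_sum, 53)    # value after storing to a double

print("extended precision holds the exact sum:", y_extended == exact_sum)
print("double rounding matches hardware add:  ", y_double == Fraction(0.012 + 1.0))
print("the two values differ:                 ", y_double != y_extended)
```

If `y` is rounded at the assignment but `y2` is left in a register, the comparison `y != y2` in `test` sees two different values, which is the behaviour the bug report describes.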
Exhibit A: GCC bug #323
Reported in 2000.
Dozens of duplicates.
More than 150 comments.
Still not acknowledged as a bug.
“Addressed” in 2009 (in GCC 4.5) via the flag -fexcess-precision=standard.
Responsible for PHP’s strtod() function not terminating on some inputs . . .
. . . causing denial of service on many Web sites.
Are miscompilation bugs a problem?
For ordinary software:
Compiler-introduced bugs are negligible compared with the bugs in the program itself.
Programmers rarely run into them.
When they do, debugging is very hard.
Are miscompilation bugs a problem?
For critical software validated by testing only:
Good testing should find all bugs, even those compiler-introduced.
Optimizations can complicate test plans.
Are miscompilation bugs a problem?
For critical software validated by review, analysis & testing: (e.g. DO-178 in avionics)
Manual reviews of (representative fragments of) generated assembly.
Turning all optimizations off to get traceability.
Reduced usefulness of formal verification.
Miscompilation and formal verification
[Diagram: Executable model (Simulink, SCADE, etc.) → code generator → Source code (C) → compiler → Executable machine code. Model checking, program proof and static analysis apply upstream of the compiler.]
The guarantees obtained (so painfully!) by source-level formal verification may not carry over to the executable code . . .
A solution? Verified compilers
With a regular compiler:
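[Diagram: Source program → compiler → Executable. Formal verification applies to the source program; its relation to the executable is an open question (?).]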
A solution? Verified compilers
With a formally verified compiler:
[Diagram: Source program → compiler → Executable, related by observational equivalence. Formal verification applies to the source program.]
The properties formally established on the source program carry over to the executable.
Formal verification of compilers
A radical solution to the miscompilation problem:
Apply program proof to the compiler itself to prove that it preserves the semantics of the source code.
After all, compilers are complicated programs with a simple specification:
If compilation succeeds, the generated code should behave as prescribed by the semantics of the source program.
An old idea. . .
Mathematical Aspects of Computer Science, 1967
An old idea. . .
Machine Intelligence (7), 1972.
Compiler verification:
How far are we today?
(X. Leroy, Formal verification of a realistic compiler, CACM 07/2009)
The CompCert project (X. Leroy, S. Blazy, et al.)
Develop and prove correct a realistic compiler, usable for critical embedded software.
Source language: a very large subset of C.
Target language: PowerPC/ARM/x86 assembly.
Generates reasonably compact and fast code ⇒ careful code generation; some optimizations.
Note: compiler written from scratch, along with its proof; not trying to prove an existing compiler.
The subset of C supported
Supported natively:
Types: integers, floats, arrays, pointers, struct, union.
Expressions: all of C, including pointer arithmetic.
Control: if/then/else, loops, goto, regular switch.
Functions, including recursive functions and function pointers.
Dynamic allocation (malloc and free).
Volatile accesses.
The subset of C supported
Not supported at all:
The long long and long double types.
Unstructured switch (Duff’s device), longjmp/setjmp.
Variable-arity functions.
Supported through (unproved!) expansion after parsing:
Block-scoped variables.
typedef.
Bit-fields.
Assignment between struct or union.
Passing struct or union by value.
The formally verified part of the compiler
CompCert C → Clight: side-effects out of expressions
Clight → C#minor: type elimination, loop simplifications
C#minor → Cminor: stack allocation of “&” variables
Cminor → CminorSel: instruction selection
CminorSel → RTL: CFG construction, expr. decomp.
RTL → LTL: register allocation (IRC)
LTL → LTLin: linearization of the CFG
LTLin → Linear: spilling, reloading, calling conventions
Linear → Mach: layout of stack frames
Mach → Asm: asm code generation
Optimizations: constant prop., CSE, tail calls, (LCM), (Software pipelining), (Instruction scheduling)
Formally verified in Coq
After 50 000 lines of Coq and 4 person-years of effort:
Theorem transf_c_program_is_refinement:
  forall p tp,
  transf_c_program p = OK tp ->
  (forall beh, exec_C_program p beh -> not_wrong beh) ->
  (forall beh, exec_asm_program tp beh -> exec_C_program p beh).
Behaviors beh = termination / divergence / going wrong, plus the trace of I/O operations (syscalls, volatile accesses).
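To make the shape of the statement concrete, here is a toy reading of it in Python (used for illustration only), where the predicates exec_C_program and exec_asm_program are replaced by finite sets of behaviors. The names `Behavior` and `theorem_holds` are hypothetical, and finite sets are only a stand-in for the Coq definitions.

```python
from typing import FrozenSet, NamedTuple, Tuple


class Behavior(NamedTuple):
    outcome: str             # "terminates", "diverges" or "goes_wrong"
    trace: Tuple[str, ...]   # observable I/O events, in order


def theorem_holds(source: FrozenSet[Behavior],
                  compiled: FrozenSet[Behavior]) -> bool:
    """Refinement: if no source behavior goes wrong, every behavior of the
    compiled code must be one of the source behaviors."""
    if any(b.outcome == "goes_wrong" for b in source):
        return True  # hypothesis not met: the theorem promises nothing
    return compiled <= source


# A source program with two allowed behaviors (e.g. unspecified evaluation order).
src = frozenset({Behavior("terminates", ("print 1",)),
                 Behavior("terminates", ("print 2",))})

picks_one = frozenset({Behavior("terminates", ("print 1",))})
invents_one = frozenset({Behavior("terminates", ("print 3",))})

print(theorem_holds(src, picks_one))    # compiler may pick one allowed behavior
print(theorem_holds(src, invents_one))  # but may not invent a new one
```

The inclusion goes from compiled code to source, not the other way round: the compiler is allowed to reduce nondeterminism.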
The whole CompCert compiler
[Diagram: C source → (parsing, construction of an AST; type-checking, de-sugaring) → AST C → (verified compiler) → AST Asm → (printing of asm syntax) → Assembly → (assembling, linking) → Executable. Auxiliary components attached to the verified compiler: type reconstruction, graph coloring, code linearization heuristics.]
Legend: Proved in Coq (extracted to Caml) / Not proved (hand-written in Caml); Part of the TCB / Not part of the TCB.
Performance of generated code (on a PowerPC G5 processor)
[Bar chart: execution time of the code produced by gcc -O0, CompCert, gcc -O1 and gcc -O3 on the benchmarks AES, Almabench, Binarytrees, Fannkuch, FFT, Knucleotide, Nbody, Qsort, Raytracer, Spectral and VMach.]
Status
Complete source & proofs available for evaluation and research purposes:
http://compcert.inria.fr/
(+ research papers)
Compiler runs on / produces code for {Linux, MacOSX, Cygwin} / {PPC, ARM, x86}.
Tested on small benchmarks (up to 3000 LOC), real-world avionics codes, and by random testing.
As of early 2011, the under-development version of CompCert is the only compiler we have tested for which Csmith cannot find wrong-code errors. This is not for lack of trying: we have devoted about six CPU-years to the task. The apparent unbreakability of CompCert supports a strong argument that developing compiler optimizations within a proof framework, where safety checks are explicit and machine-checked, has tangible benefits for compiler users.
X. Yang, Y. Chen, E. Eide & J. Regehr, PLDI 2011
Compiler verification:
How?
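Verification patterns (for each compilation pass)
Verified transformation: the transformation itself is formally verified.
Verified translation validation: the transformation is not verified; a formally verified validator checks each of its results and rejects (×) those it cannot validate.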
Verified translation validation:
Less to prove (if validator simpler than transformation).
Strong soundness guarantees, but no completeness in general.
Validator reusable for several variants of an optimization.
An example of a verified transformation:
register allocation by graph coloring
(X. Leroy, A formally verified compiler back-end, §8, J. Autom. Reasoning 43(4))
(S. Blazy, B. Robillard, A. W. Appel, Formal verification of coalescing graph-coloring register allocation, ESOP 2010)
Starting point: a CFG for a Register Transfer Language
s = 0.0
i = 0
if (i >= size)
a = i << 2
b = load(tbl, a)
c = float(b)
s = s +f c
i = i + 1
d = float(size)
e = s /f d
return(e)
Algorithm 1: liveness analysis (backward dataflow analysis)
Lout(p) = ⋃ { transf(Lout(s), instr-at(s)) | s successor of p }
Instruction          Lout (live after the instruction)
s = 0.0              [tbl size s]
i = 0                [tbl size i s]
if (i >= size)       [tbl size i s]
a = i << 2           [tbl size i s a]
b = load(tbl, a)     [tbl size i s b]
c = float(b)         [tbl size i s c]
s = s +f c           [tbl size i s]
i = i + 1            [tbl size i s]
d = float(size)      [s d]
e = s /f d           [e]
return(e)
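The dataflow equation above can be solved by a plain worklist iteration, in the spirit of Kildall's algorithm. The sketch below is in Python (for illustration only). The node numbering and the CFG edges are assumptions reconstructed from the listing: the loop test exits to `d = float(size)` and `i = i + 1` jumps back to the test.

```python
# node -> (defined variable or None, used variables, successor nodes)
CFG = {
    1: ("s", set(), [2]),                # s = 0.0
    2: ("i", set(), [3]),                # i = 0
    3: (None, {"i", "size"}, [4, 9]),    # if (i >= size)
    4: ("a", {"i"}, [5]),                # a = i << 2
    5: ("b", {"tbl", "a"}, [6]),         # b = load(tbl, a)
    6: ("c", {"b"}, [7]),                # c = float(b)
    7: ("s", {"s", "c"}, [8]),           # s = s +f c
    8: ("i", {"i"}, [3]),                # i = i + 1
    9: ("d", {"size"}, [10]),            # d = float(size)
    10: ("e", {"s", "d"}, [11]),         # e = s /f d
    11: (None, {"e"}, []),               # return(e)
}


def transf(live_after, node):
    """Variables live before `node`, given those live after it."""
    defined, used, _ = CFG[node]
    return (live_after - {defined}) | used


def liveness():
    """Least solution of Lout(p) = U { transf(Lout(s), s) | s successor of p }."""
    live_out = {p: set() for p in CFG}
    worklist = list(CFG)
    while worklist:
        p = worklist.pop()
        new = set()
        for s in CFG[p][2]:
            new |= transf(live_out[s], s)
        if new != live_out[p]:
            live_out[p] = new
            # Lout of every predecessor of p depends on Lout(p): revisit them.
            worklist.extend(q for q in CFG if p in CFG[q][2])
    return live_out


LOUT = liveness()
for node in sorted(LOUT):
    print(node, sorted(LOUT[node]))
```

On this CFG the computed sets coincide with the live sets listed on the slide.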
Algorithm 2: construct interference graph
For each instruction p : r := . . ., add edges between r and Lout(p) \ {r}.
(+ Chaitin’s special case for moves.) (+ Recording of preferences.)
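As a minimal sketch of the edge rule (in Python, for illustration only), the function below builds the interference graph of the running example from the live-out sets of the previous slide. It leaves out Chaitin's special case for moves and the recording of preferences; the example contains no move instruction.

```python
ALL = {"tbl", "size", "i", "s"}

# (variable defined by the instruction or None, its live-out set)
INSTRS = [
    ("s", {"tbl", "size", "s"}),   # s = 0.0
    ("i", ALL),                    # i = 0
    (None, ALL),                   # if (i >= size)
    ("a", ALL | {"a"}),            # a = i << 2
    ("b", ALL | {"b"}),            # b = load(tbl, a)
    ("c", ALL | {"c"}),            # c = float(b)
    ("s", ALL),                    # s = s +f c
    ("i", ALL),                    # i = i + 1
    ("d", {"s", "d"}),             # d = float(size)
    ("e", {"e"}),                  # e = s /f d
    (None, set()),                 # return(e)
]


def interference_graph(instrs):
    """For each p : r := ..., add an edge between r and every variable of
    Lout(p) other than r. Edges are unordered pairs."""
    edges = set()
    for defined, live_out in instrs:
        if defined is None:
            continue
        for other in live_out - {defined}:
            edges.add(frozenset((defined, other)))
    return edges


GRAPH = interference_graph(INSTRS)
print(len(GRAPH), "edges")
print("a and b interfere:", frozenset(("a", "b")) in GRAPH)
print("s and d interfere:", frozenset(("s", "d")) in GRAPH)
```

Since `a` is dead once `b` is defined, the two never interfere and can share a register.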
[Figure: interference graph over the variables tbl, size, i, s, a, b, c, d, e.]
Algorithm 3: Coloring of the interference graph
Construct a function φ : Variable → Register + Stackslot such that φ(x) ≠ φ(y) if x and y interfere.
We use the Iterated Register Coalescing heuristic by George & Appel.
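Iterated Register Coalescing is too long to show here. As a stand-in, the Python sketch below (for illustration only) uses a much simpler greedy heuristic, which is not the algorithm used in CompCert. It only illustrates the contract on φ: interfering variables get different locations, and a variable goes to a fresh stack slot when no register is free. It also ignores register classes (integer vs. float), which a real allocator must respect.

```python
VARS = ["tbl", "size", "i", "s", "a", "b", "c", "d", "e"]

# Interference edges of the running example.
EDGES = ([("s", "tbl"), ("s", "size"), ("i", "tbl"), ("i", "size"), ("i", "s")]
         + [(v, w) for v in ("a", "b", "c") for w in ("tbl", "size", "i", "s")]
         + [("d", "s")])


def greedy_color(variables, edges, registers):
    """Map each variable to a register or, failing that, a fresh stack slot."""
    neighbours = {v: set() for v in variables}
    for x, y in edges:
        neighbours[x].add(y)
        neighbours[y].add(x)
    phi = {}
    next_slot = 0
    # Color the most constrained variables first (a common heuristic).
    for v in sorted(variables, key=lambda v: (-len(neighbours[v]), v)):
        taken = {phi[n] for n in neighbours[v] if n in phi}
        free = [r for r in registers if r not in taken]
        if free:
            phi[v] = free[0]
        else:
            phi[v] = f"stack{next_slot}"   # spill: slots are never shared
            next_slot += 1
    return phi


phi_many = greedy_color(VARS, EDGES, ["r1", "r2", "r3", "r4", "r5"])
phi_few = greedy_color(VARS, EDGES, ["r1", "r2"])
print(phi_many)
print(phi_few)
```

With only two registers some variables are spilled, because s, i and tbl interfere pairwise.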
[Figure: the interference graph over tbl, size, i, s, a, b, c, d, e, before and after coloring.]
Algorithm 4: Rewriting the code
Replace all variables x by their color φ(x). (Spilling & reloading are done in a later pass.)
f1 = 0.0
r3 = 0
if (r3 >= r2)
r4 = r3 << 2
r4 = load(r1, r4)
f2 = float(r4)
f1 = f1 +f f2
r3 = r3 + 1
f2 = float(r2)
f1 = f1 /f f2
return(f1)
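The rewriting step is a plain substitution. In the Python sketch below (for illustration only), the mapping `PHI` is read off by comparing the rewritten code above with the original listing; the textual representation of instructions and the helper `rewrite` are assumptions made for the example.

```python
import re

PHI = {"tbl": "r1", "size": "r2", "i": "r3", "a": "r4", "b": "r4",
       "s": "f1", "c": "f2", "d": "f2", "e": "f1"}

ORIGINAL = [
    "s = 0.0",
    "i = 0",
    "if (i >= size)",
    "a = i << 2",
    "b = load(tbl, a)",
    "c = float(b)",
    "s = s +f c",
    "i = i + 1",
    "d = float(size)",
    "e = s /f d",
    "return(e)",
]


def rewrite(instr, phi):
    """Replace each identifier that is a variable by its color; operators
    and keywords (load, float, if, return, the f suffix) are left alone."""
    return re.sub(r"[A-Za-z_]\w*",
                  lambda m: phi.get(m.group(0), m.group(0)),
                  instr)


REWRITTEN = [rewrite(instr, PHI) for instr in ORIGINAL]
print("\n".join(REWRITTEN))
```

Note that `a` and `b` share r4, `c` and `d` share f2, and `s` and `e` share f1, which is legal because none of these pairs interfere.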
What needs to be proved? Part 1: proofs of algorithms
Liveness analysis: show that the mapping Lout computed by Kildall’s fixpoint algorithm satisfies the inequations
Lout(p) ⊇ transf(Lout(s), instr-at(s)) if s successor of p
Construction of the interference graph: show that the final graph G contains all expected edges, e.g.
p : x := . . . ∧ y ≠ x ∧ y ∈ Lout(p) ⟹ (x, y) ∈ G
Coloring of the interference graph: show that (x, y) ∈ G ⟹ φ(x) ≠ φ(y)
Either by direct proof (Blazy, Robillard, Appel) or by verified validation:
Validator: enumerate all edges (x, y) of G and abort if φ(x) = φ(y)
Correctness proof for the validator: trivial.
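The validator described above fits in a few lines. The Python sketch below (for illustration only) wraps an arbitrary, untrusted allocator with the check; the names `validate_coloring` and `checked_allocation` are hypothetical, and the coloring `PHI` is the one used in the rewritten code of the running example.

```python
# Interference edges of the running example.
EDGES = ([("s", "tbl"), ("s", "size"), ("i", "tbl"), ("i", "size"), ("i", "s")]
         + [(v, w) for v in ("a", "b", "c") for w in ("tbl", "size", "i", "s")]
         + [("d", "s")])

PHI = {"tbl": "r1", "size": "r2", "i": "r3", "a": "r4", "b": "r4",
       "s": "f1", "c": "f2", "d": "f2", "e": "f1"}


def validate_coloring(edges, phi):
    """True iff no interference edge joins two variables with the same location."""
    return all(phi[x] != phi[y] for x, y in edges)


def checked_allocation(untrusted_allocator, edges):
    """Run any allocator, however complex, and keep its result only if the
    validator accepts it. Only this function and the validator need a proof."""
    phi = untrusted_allocator(edges)
    if not validate_coloring(edges, phi):
        raise ValueError("register allocation rejected by the validator")
    return phi


def buggy_allocator(edges):
    """A deliberately wrong allocator: everything goes to r1."""
    return {v: "r1" for edge in edges for v in edge}


print(validate_coloring(EDGES, PHI))
try:
    checked_allocation(buggy_allocator, EDGES)
except ValueError as err:
    print("rejected:", err)
```

The validator is sound but says nothing about quality: a coloring that spills every variable would also pass.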
What needs to be proved? Part 2: semantic preservation
What does “x is live at p” mean, semantically?
Hmmm . . .
What does “x is dead at p” mean, semantically?
That the program behaves the same regardless of the value of x at point p.
Invariant
Let E : variable → value be the values of variables at point p in the original program. Let R : location → value be the values of locations at point p in the transformed program.
E and R agree at p, written p ⊢ E ≈ R, iff
E(x) = R(φ(x)) for all x live before point p
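The invariant is easy to state executably. The Python sketch below (for illustration only) checks agreement at the point just before `d = float(size)` in the running example, where s and size are live. The concrete values in `E` and `R` are made up for the example, and `PHI` is the coloring read off the rewritten code.

```python
PHI = {"tbl": "r1", "size": "r2", "i": "r3", "a": "r4", "b": "r4",
       "s": "f1", "c": "f2", "d": "f2", "e": "f1"}


def agree(live_before, phi, E, R):
    """p |- E ~ R: E(x) = R(phi(x)) for every x live before point p."""
    return all(E[x] == R[phi[x]] for x in live_before)


# Hypothetical state on loop exit, just before `d = float(size)`.
E = {"tbl": 1000, "size": 3, "i": 3, "s": 7.5, "a": 8, "b": 5, "c": 5.0}
R = {"r1": 1000, "r2": 3, "r3": 3, "r4": 5, "f1": 7.5, "f2": 5.0}

print(agree({"s", "size"}, PHI, E, R))  # agreement on the live variables
print(agree({"a"}, PHI, E, R))          # a is dead here: r4 now holds b
```

The second check fails, and that is fine: `a` shares r4 with `b`, so its value was overwritten, but the invariant only constrains live variables.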
Proving semantic preservation
Show a simulation diagram of the form
p, E, M     --- p ⊢ E ≈ R ---      p, R, M
   | t                                : t
   v                                  v
p′, E′, M′  ... p′ ⊢ E′ ≈ R′ ...   p′, R′, M′
(solid lines: hypotheses; dotted lines: conclusions)
Hypotheses: left, a transition in the original code; top, the invariant (register agreement) before the transition.
Conclusions: right, one transition in the transformed code; bottom, the invariant after the transition.
Semantic preservation for whole executions
Original:    S1 (initial state) -ε-> S2 -ν1-> S3 -ν2-> S4 -ε-> S5 (final state)
Transformed: T1 (initial state) -ε-> T2 -ν1-> T3 -ν2-> T4 -ε-> T5 (final state)
The invariant relates each pair of states Si, Ti.
Proves that the original program and the transformed program have the same behavior (the trace t = ν1.ν2).
An example of verified translation validation:
software pipelining
(J.-B. Tristan and X. Leroy, A simple, verified validator for software pipelining, POPL 2010)
Software pipelining
Original loop (4 iterations): each iteration executes LD LD MUL ADD ST.
Pipelined loop [instruction order as extracted from the figure, whose two-dimensional layout is lost]:
Prologue: LD LD MUL LD LD MUL LD ADD LD
Steady state: MUL LD ST ADD LD
Epilogue: MUL ST ADD ST ADD ST
The pipeliner as a black box
pipeline(B, i, N) = (P, S, E, µ, δ)
Inputs: B = original loop body; i = loop index variable; N = loop bound variable.
Outputs: P = prologue; S = steady state; E = epilogue; µ = initiation interval; δ = unrolling factor.
Effect on the loop variable i:
B: +1    P: +µ    S: +δ    E: +0
The corresponding loop transformation
Before:
in:
  i := 0
  while (i < N) { B }
out
After:
in:
  i := 0
  if (N ≥ µ) {
    M := N − µ;  M := M / δ;  M := M × δ;  M := M + µ
    P
    while (i < M) { S }
    E
  }
  while (i < N) { B }
out
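The arithmetic on M can be checked by running both control skeletons with the blocks abstracted to their effect on i. The Python sketch below (for illustration only) records which blocks execute; the function names are hypothetical and the integer division is taken to be the usual truncating one on non-negative operands.

```python
def original_loop(N):
    """Block sequence executed by the original loop."""
    i, trace = 0, []
    while i < N:
        trace.append("B")
        i += 1
    return trace


def pipelined_loop(N, mu, delta):
    """Block sequence executed by the transformed loop."""
    i, trace = 0, []
    if N >= mu:
        M = ((N - mu) // delta) * delta + mu
        trace.append("P")
        i += mu                 # P advances i by mu
        while i < M:
            trace.append("S")
            i += delta          # S advances i by delta
        trace.append("E")       # E leaves i unchanged
    while i < N:                # remaining iterations use the original body
        trace.append("B")
        i += 1
    return trace


def iterations_covered(trace, mu, delta):
    """Number of original iterations accounted for by a block sequence."""
    weight = {"B": 1, "P": mu, "S": delta, "E": 0}
    return sum(weight[block] for block in trace)


print(pipelined_loop(10, 2, 3))
print(iterations_covered(pipelined_loop(10, 2, 3), 2, 3))
```

For N = 10, µ = 2, δ = 3 this gives M = 8, hence N = µ + n × δ + m with n = 2 and m = 2.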
How to validate it?
For a fixed number of iterations N = µ + n × δ + m, consider the unrollings of the two loops:
Original loop: B^N (= N copies of B)
Pipelined loop: P; S^n; E; B^m
If n, m are known at compile-time, we can use symbolic evaluation to show semantic equivalence between these two basic blocks.
Basics of symbolic evaluation
Interpret instructions as substitutions:
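α(x := y + z) = [x ↦ y + z]
Interpret basic blocks by composing substitutions:
α(y := x + 1; store(u, y); z := load(u); x := y ∗ z) =
  y ↦ x + 1
  Mem ↦ store(Mem, u, x + 1)
  z ↦ load(store(Mem, u, x + 1), u)
  x ↦ (x + 1) ∗ load(store(Mem, u, x + 1), u)
[Figure: the same substitution drawn as a DAG from the initial values of x, u and Mem to the final values of y, x, Mem and z.]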
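A minimal sketch of symbolic evaluation, in Python (for illustration only): terms are nested tuples, a state maps each assigned resource (variables and the pseudo-variable Mem) to a term over initial values, and a block is evaluated by folding its instructions. The instruction encoding and the names `alpha`, `step` and `term` are assumptions made for the example.

```python
def lookup(state, name):
    """Current symbolic value of a resource; unassigned ones keep their initial value."""
    return state.get(name, ("init", name))


def term(state, operand):
    """Integers are constants; strings are variable names."""
    if isinstance(operand, int):
        return ("const", operand)
    return lookup(state, operand)


def step(state, instr):
    """Apply one instruction to a symbolic state, returning the new state."""
    new = dict(state)
    kind = instr[0]
    if kind == "op":                    # ("op", dest, operator, arg1, arg2)
        _, dest, op, a, b = instr
        new[dest] = (op, term(state, a), term(state, b))
    elif kind == "load":                # ("load", dest, addr)
        _, dest, addr = instr
        new[dest] = ("load", lookup(state, "Mem"), term(state, addr))
    elif kind == "store":               # ("store", addr, src)
        _, addr, src = instr
        new["Mem"] = ("store", lookup(state, "Mem"),
                      term(state, addr), term(state, src))
    else:
        raise ValueError(f"unknown instruction {instr!r}")
    return new


def alpha(block):
    """Symbolic evaluation of a basic block: composition of substitutions."""
    state = {}
    for instr in block:
        state = step(state, instr)
    return state


# y := x + 1; store(u, y); z := load(u); x := y * z
BLOCK = [("op", "y", "+", "x", 1), ("store", "u", "y"),
         ("load", "z", "u"), ("op", "x", "*", "y", "z")]

for resource, value in alpha(BLOCK).items():
    print(resource, "->", value)
```

Two blocks that differ only in the order of independent instructions get the same state, so comparing the dictionaries with `==` is enough to validate such a reordering.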
Validation by comparison of symbolic evaluations
Fundamental property: two basic blocks $B_1$ and $B_2$ that have the same symbolic evaluation ($\alpha(B_1) = \alpha(B_2)$) are semantically equivalent.
Special care must be taken for:
Comparing modulo algebraic laws and load/store properties:
$(e + 1) + 2 = e + 3$
$\mathrm{load}(\mathrm{store}(m, e_1, e_2), e_3) = \mathrm{load}(m, e_3)$ if $e_1 \neq e_3$
Tracking operations that can fail at run-time (integer division).
Excluding fresh temporaries (introduced by Modulo Variable Expansion) from the comparison.
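A hedged Python sketch of comparison modulo the two laws above: expressions are normalized before being compared syntactically. The tuple encoding is hypothetical. Two assumptions go beyond the slide: addresses count as provably different only when they share a base and differ in constant offset (each access is taken to cover one cell, with no partial overlap), and a load from the address just stored returns the stored value.

```python
def const(n):
    return ("const", n)


def split(e):
    """View an address expression as (base, constant offset)."""
    if e[0] == "const":
        return (None, e[1])
    if e[0] == "+" and e[2][0] == "const":
        return (e[1], e[2][1])
    return (e, 0)


def distinct(e1, e3):
    """Conservative test: same base, different constant offsets."""
    b1, o1 = split(e1)
    b3, o3 = split(e3)
    return b1 == b3 and o1 != o3


def normalize(e):
    """Rewrite an expression bottom-up using the algebraic and load/store laws."""
    if e[0] in ("init", "const"):
        return e
    head, *args = e
    args = [normalize(a) for a in args]
    if head == "+":
        a, b = args
        if a[0] == "const" and b[0] == "const":
            return const(a[1] + b[1])
        # (e + c1) + c2  ->  e + (c1 + c2)
        if b[0] == "const" and a[0] == "+" and a[2][0] == "const":
            return normalize(("+", a[1], const(a[2][1] + b[1])))
        return ("+", a, b)
    if head == "load":
        mem, addr = args
        if mem[0] == "store":
            _, m, e1, e2 = mem
            if e1 == addr:               # assumed extra law: read back the stored value
                return e2
            if distinct(e1, addr):       # load(store(m, e1, e2), e3) -> load(m, e3)
                return normalize(("load", m, addr))
        return ("load", mem, addr)
    return (head, *args)
```

When the two addresses cannot be compared (for example two unrelated variables), the load is left untouched, so the check stays conservative.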
Checking semantic equivalence of the two loops
Original loop: $B^{\mu + n \times \delta + m}$
Pipelined loop: $P; S^n; E; B^m$
How to check semantic equivalence for all counts $n, m$?
Example with $n = 2$:
Original: $B^\mu;\ B^\delta;\ B^\delta;\ B^m$
Pipelined: $P;\ S;\ S;\ E;\ B^m$
Checking semantic equivalence
Consider the case $n = m = 0$ and $N = \mu$:
Original: $B^\mu$
Pipelined: $P;\ E$
First (necessary) condition: $P; E \approx B^\mu$
(with $\approx$ denoting semantic equivalence).
Checking semantic equivalence
What if we exit the pipelined loop one step earlier?
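(I.e. $n \mapsto n - 1$ and $m \mapsto m + \delta$, with $N$ unchanged.)
Original: $B^\mu;\ B^\delta;\ B^\delta;\ B^m$
Pipelined: $P;\ S;\ S;\ E;\ B^m$
Pipelined, exiting one step earlier: $P;\ S;\ E;\ B^\delta;\ B^m$
Second condition: $S; E \approx E; B^\delta$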
The validation algorithm
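$$
\begin{array}{lll}
\mathrm{validate}\ (i, N, B)\ (P, S, E, \mu, \delta)\ \theta = & & \\
\quad \alpha(B^{\mu}) \approx_\theta \alpha(P; E) & & \text{(A)} \\
\quad \wedge\ \alpha(E; B^{\delta}) \approx_\theta \alpha(S; E) & & \text{(B)} \\
\quad \wedge\ \alpha(B) \sqsubseteq \theta & & \text{(C)} \\
\quad \wedge\ \alpha(B)(i) = i + 1 & & \text{(D)} \\
\quad \wedge\ \alpha(P)(i) = i + \mu & & \text{(E)} \\
\quad \wedge\ \alpha(S)(i) = i + \delta & & \text{(F)} \\
\quad \wedge\ \alpha(E)(i) = i & & \text{(G)} \\
\quad \wedge\ \alpha(B)(N) = \alpha(P)(N) = \alpha(S)(N) = \alpha(E)(N) = N & & \text{(H)}
\end{array}
$$
(Where $\theta$ is the set of observed variables, $\alpha$ is symbolic evaluation, $\approx_\theta$ a syntactic equivalence, and $\sqsubseteq$ a condition on free variables.)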
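The conditions (A) to (H) can be sketched in Python for register-only blocks. This is an illustrative toy, not the verified Coq validator: instructions are hypothetical `(dst, op, a, b)` tuples, memory is omitted, the only algebraic law applied is folding `(e + c1) + c2`, and condition (C) is read as "the observed variables of `B` depend only on observed variables", which is an assumption about the meaning of $\sqsubseteq$.

```python
def add(e, c):
    """Smart constructor folding (e + c1) + c2 into e + (c1 + c2)."""
    if e[0] == "+" and e[2][0] == "const":
        return ("+", e[1], ("const", e[2][1] + c))
    return ("+", e, ("const", c))


def alpha(block):
    """Symbolic evaluation of a list of (dst, op, a, b) instructions."""
    state = {}

    def get(v):
        return ("const", v) if isinstance(v, int) else state.get(v, ("init", v))

    for dst, op, a, b in block:
        ea, eb = get(a), get(b)
        if op == "+" and eb[0] == "const":
            state[dst] = add(ea, eb[1])
        else:
            state[dst] = (op, ea, eb)
    return state


def value(sym, v):
    return sym.get(v, ("init", v))


def free_vars(e):
    if e[0] == "init":
        return {e[1]}
    if e[0] == "const":
        return set()
    return set().union(*(free_vars(a) for a in e[1:]))


def equiv(s1, s2, theta):
    """Syntactic equality of symbolic values, restricted to observed variables."""
    return all(value(s1, v) == value(s2, v) for v in theta)


def validate(i, N, B, P, S, E, mu, delta, theta):
    aB, aP, aS, aE = alpha(B), alpha(P), alpha(S), alpha(E)
    init_i = ("init", i)
    return (
        equiv(alpha(B * mu), alpha(P + E), theta)                        # (A)
        and equiv(alpha(E + B * delta), alpha(S + E), theta)             # (B)
        and all(free_vars(value(aB, v)) <= theta for v in theta)         # (C), assumed reading
        and value(aB, i) == add(init_i, 1)                               # (D)
        and value(aP, i) == add(init_i, mu)                              # (E)
        and value(aS, i) == add(init_i, delta)                           # (F)
        and value(aE, i) == init_i                                       # (G)
        and all(value(a, N) == ("init", N) for a in (aB, aP, aS, aE))    # (H)
    )


# Toy loop body: x := i * a; s := s + x; i := i + 1
B = [("x", "*", "i", "a"), ("s", "+", "s", "x"), ("i", "+", "i", 1)]
# Hand-written pipelining with mu = delta = 1: the multiply runs one iteration ahead.
P = [("t", "*", "i", "a"), ("i", "+", "i", 1)]
S = [("s", "+", "s", "t"), ("t", "*", "i", "a"), ("i", "+", "i", 1)]
E = [("s", "+", "s", "t")]
theta = {"s", "i", "a", "N"}
ok = validate("i", "N", B, P, S, E, 1, 1, theta)
```

Dropping the increment of `i` from `S` makes both (B) and (F) fail, so a steady state that forgets to advance the counter is rejected.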
Soundness of the validator
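$\mathrm{validate}\ (i, N, B)\ (P, S, E, \mu, \delta)\ \theta = \mathtt{true}$
⇓
$\forall n, m,\ \alpha(B^{\mu + \delta \times n + m}) \approx_\theta \alpha(P; S^n; E; B^m) \sqsubseteq \theta$
⇓
[Diagram: starting from states $(R_1, M)$ and $(R_2, M)$ that agree on $\theta$, the execution of the original loop, unrolled, reaches $(R'_1, M')$ and the execution of the pipelined loop, unrolled, reaches $(R'_2, M')$; the final states agree on $\theta$.]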
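An informal sketch of why conditions (A) and (B) give the first implication, ignoring the restriction to $\theta$. The role of conditions (C) to (H) is assumed here to be what makes these steps compose when equivalence only holds on observed variables; the slide does not spell this out.

```latex
\begin{aligned}
S^{n}; E &\approx E; B^{\delta n}
  && \text{by induction on } n \text{, using (B) at each step} \\
P; S^{n}; E; B^{m} &\approx P; E; B^{\delta n}; B^{m} \\
  &\approx B^{\mu}; B^{\delta n}; B^{m}
  && \text{using (A)} \\
  &= B^{\mu + \delta n + m}
\end{aligned}
```

The induction step is $S^{n+1}; E = S^{n}; S; E \approx S^{n}; E; B^{\delta} \approx E; B^{\delta n}; B^{\delta}$.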
Summary
$\alpha(B^{\mu}) \approx_\theta \alpha(P; E) \ \wedge\ \alpha(E; B^{\delta}) \approx_\theta \alpha(S; E)$
A surprisingly simple validator for a difficult optimization.
So simple it could go into regular, non-verified compilers.
The validator uses completely different concepts from the optimization. (No RAW/WAR/WAW dependencies; no modulo variable expansion; etc.)
The validator is incomplete in theory, but appears complete against published modulo-scheduling algorithms.
Compiler verification:
How far can we go?
Current status
At this stage of the CompCert experiment, the initial goal, proving correct a nontrivial compiler, appears feasible.
(Within the limitations of today’s proof assistants such as Coq.)
This opens up many directions for future work.
Directions for future work
Higher assurance:
Prove more of the unproved parts (e.g. parsing, bit-field emulation).
Formal connections with verification tools.
Formal connections with hardware (microarchitecture) verification.
Other source languages:
Reactive languages (M. Pouzet, M. Pantel, M. Strecker).
Functional languages (Z. Dargaye, A. Tolmach).
Elements of C++ (T. Ramananandro, G. Dos Reis).
Connections with run-time system verification (A. Tolmach et al).
Directions for future work
Shared-memory concurrency:
Verified Software Toolchain (A. Appel et al).
CompCertTSO (P. Sewell et al).
Shorter, easier proofs:
More proof automation.
Generic execution engine for optimizations (S. Lerner & Z. Tatlock)
More optimizations!
Done in CompCert and related experiments:
Verified transformations: constant propagation; CSE over extended basic blocks; register allocation with trivial spilling.
Verified translation validation: Lazy Code Motion; trace scheduling; software pipelining; register allocation with advanced spilling and splitting.
Much remains to be done, in particular:
Loop optimizations.
SSA-based optimizations.
Verified translation validation as the path of least resistance?
SSA and what to prove about it
Plan A: formalize and reason about SSA semantics.
RTL → (into SSA) → SSA → (SSA optims) → SSA → (out of SSA) → RTL
Plan B: use SSA in untrusted implementations; validate over RTL.
RTL → (into SSA) → SSA → (SSA optim) → SSA → (out of SSA) → RTL
Validator: checks the pair (input RTL × output RTL).
Is SSA just an algorithmic device? (Faster, simpler, more powerful optims.) Or does SSA have deep semantic properties as well?
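The shape of Plan B can be sketched in Python: the optimizer is arbitrary untrusted code, and only the validator comparing input and output needs to be trusted (or proved). This is a toy over straight-line code with a hypothetical `(dst, op, a, b)` instruction encoding, not RTL control-flow graphs. Falling back to the unoptimized code on validation failure is an assumption; a compiler could equally well abort.

```python
def alpha(block):
    """Symbolic evaluation of straight-line code."""
    state = {}

    def get(v):
        return ("const", v) if isinstance(v, int) else state.get(v, ("init", v))

    for dst, op, a, b in block:
        state[dst] = (op, get(a), get(b))
    return state


def validate(before, after, observed):
    """Accept only if both versions compute the same observed values."""
    sb, sa = alpha(before), alpha(after)
    return all(sb.get(v, ("init", v)) == sa.get(v, ("init", v)) for v in observed)


def validated_pass(optimize, observed):
    """Wrap an untrusted optimizer so that its output is used only when validated."""
    def run(code):
        candidate = optimize(code)
        return candidate if validate(code, candidate, observed) else code
    return run


def drop_tmp(code):
    """A plausible optimizer: remove assignments to the unobserved variable tmp."""
    return [ins for ins in code if ins[0] != "tmp"]


def drop_last(code):
    """A buggy optimizer: silently drops the last instruction."""
    return code[:-1]


code = [("tmp", "*", "x", 2), ("y", "+", "x", 1), ("z", "*", "y", "y")]
observed = {"y", "z"}
good = validated_pass(drop_tmp, observed)(code)
bad = validated_pass(drop_last, observed)(code)
```

The correct optimizer's output is kept, while the buggy one is detected and its output discarded, without the validator knowing anything about how either optimizer works.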
Loop optimizations
Optimizations such as loop interchange or blocking appear difficult to prove by elementary simulation arguments.
A promising approach: validation a posteriori with the polyhedral model. (Work in progress by A. Pilkiewicz; see also Ph. Clauss et al.)
[Diagram: original loop nest → (untrusted optimizer) → optimized loop nest. Polyhedral extraction turns the original loop nest into polyhedral model 1 and the optimized loop nest into polyhedral model 2; the validator compares the two models.]
In closing...
The formal verification of compilers and other development and verification tools
... is a fascinating challenge,
... appears within reach,
... and has practical importance for critical software.
Much remains to be done. Feel free to join!