Type-Based Verification of Type-Based Verification of Assembly Language for Compiler Assembly Language for Compiler
DebuggingDebugging
Bor-Yuh Evan Chang Adam ChlipalaGeorge C. Necula Robert R. Schneck
University of California, Berkeley
TLDI 2005Long Beach, California
January 10, 2005
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
2
Problem and MotivationProblem and Motivation
• Type check assembly codeassembly code produced by existingexisting compilers for a Java-like language
• Validate that certifying compilation aids compiler debugging– using undergraduate compilers course as a
test bed
• Introduce undergraduates to using compiler technology for software quality
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
3
Problem and MotivationProblem and Motivation• Starting Point: Bytecode verification
– Effective, simple, and very accessible– But does not work for assembly code
• Why assembly code?– Native code compilers are more complex– Also debug JITs
• Why existing compilers?– Force to make the verifier flexible– Better accessibility to students (i.e., require little
or no annotations)– More test cases to validate “certified compilation
for debugging”
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
4
Debugging Compilers with Debugging Compilers with VerificationVerification
Run
Compile Run
Compile
r Test
Case
Compile
r Test
Case
Compiled
Program Test
Inputs
Stressed Compilers Student
Compile Verify
Compile
r Test
Case
Compile
r Test
Case
Relaxed Compilers Student
Ð^K^V^H^P^L^V^H0^L^V^Hp^L^V^H°^L^V^Hð^L^V^H^P^
M^V^H0^M^V^Hp^M^V^H^Ð^M^V^Hð^
M^V^
Ð^K^V^H^P^L^V^H0^L^V^Hp^L^V^H°^L^V^Hð^L^V^H^P^
M^V^H0^M^V^Hp^M^V^H^Ð^M^V^Hð^
M^V^
Line 2147: Type error – argument -8(SP0) :
unknown, which is not <= nonnull String.
Line 2147: Type error – argument -8(SP0) :
unknown, which is not <= nonnull String.
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
5
ChallengesChallengesclass P {
int f;int m() { … }
}class C extends P
{int m() { … }
}
…P p = new P();P c = new C();c.m();…
invokevirtual P.m()
branch (= rc 0) Labort
rtmp := m[rc + 8]
rtmp := m[rtmp +
12]
rarg0 := rc
rra := &Lret
jump [rtmp]
Lret:
invokevirtual P.m()
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
6
ChallengesChallengesclass P {
int f;int m() { … }
}class C extends P
{int m() { … }
}
…P p = new P();P c = new C();c.m();…
invokevirtual P.m()
branch (= rc 0) Labort
rtmp := m[rc + 8]
rtmp := m[rtmp + 12]
rarg0 := rc
rra := &Lret
jump [rtmp]
Lret:
hrc : P, … i
hrc : nonnull P, …i
hrtmp : disp(P), …i
hrtmp : meth(P,12), …i
hrarg0 : P, …i
hrrv : int, …i
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
7
ChallengesChallengesclass P {
int f;int m() { … }
}class C extends P
{int m() { … }
}
…P p = new P();P c = new C();c.m();…
invokevirtual P.m()
branch (= rc 0) Labort
rtmp := m[rc + 8]
rtmp := m[rtmp + 12]
rarg0 := rrpp
rra := &Lret
jump [rtmp]
Lret:
hrc : P, … i
hrc : nonnull P, …i
hrtmp : disp(P), …i
hrtmp : meth(P,12), …i
hrarg0 : PP, …i
hrrv : int, …i
unsoundunsound
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
8
ChallengesChallengesclass P {
int f;int m() { … }
}class C extends
P {int m() { … }
}
…P p = new P();P c = new C();int f = p.f;p.m();x = f + 1;
branch (= rp 0) Labort
rtmp := rp + 12
rf := m[rtmp]
branch (= rp 0) Labort
rtmp := m[rp + 8]
rtmp := m[rtmp + 12]
rarg0 := rp
rra := &Lret
jump [rtmp]
Lret:rx := rf + 1
reorderingand
optimization
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
9
ChallengesChallenges
branch (= rp 0) Labort
rtmp := rp + 12
rf := m[rtmp]
rx := rf + 1
branch (= rp 0) Labort
rtmp := m[rtmp - 4]
rtmp := m[rtmp + 12]
rarg0 := rp
rra := &Lret
jump [rtmp]
Lret:
hrtmp : int ptr, … i
“funny”pointer
arithmetic
class P {int f;int m() { … }
}class C extends
P {int m() { … }
}
…P p = new P();P c = new C();int f = p.f;p.m();x = f + 1;
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
10
OutlineOutline
• Overview• Abstract State
– Types– Join algorithm
• Verification Procedure• Educational Experience• Concluding Remarks
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
11
OverviewOverview
• To maximize accessibility (i.e., minimize annotations)– infer types at intermediate program
points using abstract interpretation– willingness to have a limited amount of
specialization to a compilation strategy• To handle assembly code
– use (simple) dependent types• To be less sensitive to the compilation
strategy– assign types to intermediate values lazily
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
12
Abstract StateAbstract State
• Keep intermediate expressions in symbolic form
• Symbolic valuesSymbolic values abstract irrelevant expressions keeping only type information
• Intermediate expressions track many dependencies between registers
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
13
Why Symbolic Values?Why Symbolic Values?
• Used to track dependencies between registers
• Value state form convenient for abstract interpretation
r1 = r2 Æ r1 = r3 + 4
r1 = + 4, r2 = + 4, r3 =
r1 := r2
[r1 ! (r2)]
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
14
ValuesValues
• Define a normalization of expressions to values
• Values give the expressions of interest– For our case, address computation and
comparisons with constants– Would vary depending on the source language or
compiler of interest– Not all equal expressions normalize to the same
value, so value forms must be chosen carefully
• Registers are equal if they map to the same values (after normalization)
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
15
TypesTypes
• Primitive Types and Object References
• Types for Run-time Structures
Tracks that the value is the dispatch table
of object
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
16
JoinJoin
h r1 = +4, r2 = +4, r3 = D : disp(), : nonnull exactly Ci
h r1 = +4, r2 = +8, r3 = D : disp(), : nonnull Ci
h r1 = +4, r2 = , r3 = D : disp(), : nonnull C, : >i (,) !
(,) !
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
17
JoinJoin
• Joining of type state basically given by subtyping– subtyping lattice still fairly simple – dictated
mostly by the class inheritance hierachy• Values are (fancy) labels for equivalence
classes of registers– lattice of conjunctions of equalities on
registers– first normalize expressions to values for join– each distinct pair of values in the input state
gives an equivalence class in the result state
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
18
OutlineOutline
• Overview• Abstract State
– Types– Join algorithm
• Verification Procedure• Educational Experience• Concluding Remarks
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
19
Verification ExampleVerification Exampleclass P {
int f;int m() { … }
}class C extends
P{ … }
branch (= rp 0) Labort
rtmp := rp + 12
rf := m[rtmp]
rx := rf + 1
rtmp := m[rtmp - 4]
rtmp := m[rtmp + 12]
rarg0 := rp
rra := &Lret
jump [rtmp]
Lret:
Typing Rule
hrtmp = + 12 D …i
hrp = D : Pi
h… D : nonnull Pi
hrf = D : inti
hrx = + 1 D : inti
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
20
Verification ExampleVerification Exampleclass P {
int f;int m() { … }
}class C extends
P{ … }
branch (= rp 0) Labort
rtmp := rp + 12
rf := m[rtmp]
rx := rf + 1
rtmp := m[rtmp - 4]
rtmp := m[rtmp + 12]
rarg0 := rp
rra := &Lret
jump [rtmp]
Lret:
hrtmp = + 12 D …i
hrp = D : Pi
h… D : nonnull Pi
Typing Rule
hrf = D : inti
hrx = + 1 D : inti
hrtmp = D : disp()i
hrtmp = D : meth(,12)i
hrarg0 = D …i
hrrv = D : inti
Calling meth(,12)
checks rarg0 =
+ 12 – 4 + 8
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
21
OutlineOutline
• Overview• Abstract State
– Types– Join algorithm
• Verification Procedure• Educational Experience• Concluding Remarks
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
22
Comparing Compiler DevelopmentComparing Compiler Development
• Good compilations increase: 53% to 75%
0%
20%
40%
60%
80%
100%
2002 2003 2004
test passed, type safe(observably correct)
test passed, type safe butCoolaid failed (incompleteness)
test passed, type error (hiddentype error)
test failed, type error (visible typeerror)
test failed, type safe (semanticerror)
Without CoolaidWithout Coolaid With CoolaidWith Coolaid
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
23
0%
20%
40%
60%
80%
100%
2002 2003 2004
test passed, type safe(observably correct)
test passed, type safe butCoolaid failed (incompleteness)
test passed, type error (hiddentype error)
test failed, type error (visible typeerror)
test failed, type safe (semanticerror)
Comparing Compiler DevelopmentComparing Compiler Development
• Type errors decrease: 44% to 19%– even 29% to 15% if only consider those that are visible
in testing
Without CoolaidWithout Coolaid With CoolaidWith Coolaid
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
24
0%
20%
40%
60%
80%
100%
2002 2003 2004
test passed, type safe(observably correct)
test passed, type safe butCoolaid failed (incompleteness)
test passed, type error (hiddentype error)
test failed, type error (visible typeerror)
test failed, type safe (semanticerror)
Comparing Compiler DevelopmentComparing Compiler Development
• False alarms: 6% to 1%– price for ease of use
Without CoolaidWithout Coolaid With CoolaidWith Coolaid
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
25
0.0%
5.0%
10.0%
15.0%
20.0%
25.0%
30.0%
35.0%
40.0%
45.0%
0–29 30–34 35–39 40–44 45–49
number of tests passed
perc
enta
ge o
f co
mpi
lers
2002 (without Coolaid)
2003 (without Coolaid)
2004 (with Coolaid)
Student PerspectiveStudent Perspective
• 0: “counter-productive” 6: “can’t imagine being without it”
• A favorite comment:“I would be totally lost without Coolaid. I learn best when I am using it hands-on …. I was able to really understand stack conventions and optimizations and to appreciate them.”
0
5
10
15
20
25
0 1 2 3 4 5 6
usefulness rating
num
ber
of g
roup
s
• Traditional testing procedure– 49 test cases / inputs
• Mean scores:– 33 (67%) in 2002– 34 (69%) in 2003– 39 (79%) in 2004
1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging
26
ConclusionConclusion
• Extended data-flow based type inference of intermediate languages to assembly code– Able to retrofit existing compilers– Minimal annotations for accessibility
• Provided experimental confirmation that certifying compilation helps early compiler debugging
• Introduced undergraduates to the idea of improving software quality with program analysis
Top Related