Modular Machine Code Verification
Zhaozhong Ni
Advisor: Zhong Shao
Committee: Zhong Shao, Paul Hudak, Carsten Schürmann, David Walker
Department of Computer Science, Yale University
Nov. 29, 2006
PhD Thesis Defense
19 Lines of Code on Every PC
swapcontext:
; store old context
mov eax, [esp+4]
mov [eax+0], OK
mov [eax+4], ebx
mov [eax+8], ecx
mov [eax+12], edx
mov [eax+16], esi
mov [eax+20], edi
mov [eax+24], ebp
mov [eax+28], esp
; load new context
mov eax, [esp+8]
mov esp, [eax+28]
mov ebp, [eax+24]
mov edi, [eax+20]
mov esi, [eax+16]
mov edx, [eax+12]
mov ecx, [eax+8]
mov ebx, [eax+4]
mov eax, [eax+0]
ret
19 Lines of Code in Every ms
swapcontext:
Runs thousands of times per second
Used by assembly, C, MSIL, JVML, etc.
Basis of multi-tasking, OS, and software
Safety and correctness taken for granted
19 Lines of Code Looks Simple
swapcontext:
[Figure: the registers eax, ebx, ecx, edx, esi, edi, ebp, esp; an "old" and a "new" context with slots labeled a1..a8, b1..b8, and OK; code containing "call swapcontext" with return points retp and retp']
19 Lines of Code Proven Hard
swapcontext:
Simple code, complex reasoning!
stack / heap / memory mutation
procedure call / first-class code pointer
protection / polymorphism
Lack specification and verification that are
formal (machine checkable in sound logic)
general (allows all possible usage of context)
realistic (usable from assembly and C level)
Outline
Introduction
The XCAP Framework
Mini Thread Library
Connect XCAP to TAL
Conclusion
Software Reliability
Bugs are costly
Especially important for
mission-critical software
consumer electronics software
internet software
Test-Patch Approach
Works most of the time
Gives no guarantee
Could make things worse
[Figure: flowchart with the steps test, debug, and create patch, and a yes/no decision "pre-release?"]
Language-based Approach
Uses types and other formal specifications
Excludes all bugs in certain categories
illegal command, overflow, dangling pointer, etc.
Successful and popular
ML, Java, C#, etc.
Reached virtual machine code level
JVML, MSIL, TIL, TAL, etc.
Meta-theorems can make guarantees
Traditional Assumptions
Types are for application software
you cannot write an OS without (void *)
Types are for high-level languages
not much to talk about 89 84 24 07 5B CD 15
Types are only for “no blue screen”
how about “variable x is a prime number”
Type safety is bad for performance
turn off array-bound checking before release
Program Specification
bool prime (int n) {
assert (n > 1);
for (int i = 2; i < n; i ++)
// n mod 2,…,i-1 ≠ 0
if (n % i == 0)
return false;
// n mod 2,…,n-1 ≠ 0
return true;
}
syntactic types
machine-logical specifications
meta-logical specifications
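The three annotation levels on this slide can be made concrete with a small C sketch. Here the loop invariant that the slide writes as a comment is checked at run time by a hypothetical helper, `no_divisor_below`; a verifier would prove the same invariant statically instead of executing it. The helper name is an assumption for illustration, and the precondition used is `n > 1`, so 1 is never reported as prime.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true when none of 2, ..., bound-1 divides n.
   It is the executable form of the invariant "n mod 2,...,bound-1 != 0". */
static bool no_divisor_below(int n, int bound) {
    for (int d = 2; d < bound; d++)
        if (n % d == 0)
            return false;
    return true;
}

/* bool and int are the syntactic types; the asserts play the role of
   the logical specifications. */
bool prime(int n) {
    assert(n > 1);                        /* precondition */
    for (int i = 2; i < n; i++) {
        assert(no_divisor_below(n, i));   /* invariant: n mod 2,...,i-1 != 0 */
        if (n % i == 0)
            return false;
    }
    assert(no_divisor_below(n, n));       /* n mod 2,...,n-1 != 0 */
    return true;
}
```

Calling `prime(7)` passes every check and returns true, while `prime(9)` returns false at `i == 3`. The run-time checks only test the inputs that are actually executed, which is the gap that static verification closes.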
Machine Code Verification
Motivations
everything goes down to binary
high-level safety efforts lost in compilation
critical code directly written in low level
Challenges
Expressiveness
Modularity
Goals
both user and system level code
modular specification + certification
Proof-Carrying Code
Code Proof
Checker
Meta theory
Specification
Proposed 10 years ago [Necula & Lee]
machine code
machine checkable proof
Foundational PCC
Code Proof
Checker
Meta theory
Specification
Proposed by [Appel]
mathematical logic checker
mathematical logic theory
Approaches to PCC
Type-based PCC
TAL [Morrisett98], Touchstone PCC [Colby00], Syntactic FPCC [Hamid02], FTAL [Crary03], LTAL [Chen03], …
Modular
Generate proof easily
Type safety
Logic-based PCC
Original PCC [Necula98], Semantic FPCC [Appel01], CAP [Yu03], Open Verifier [Chang05], CCAP/CMAP [Yu04, Feng05], …
Expressive
Advanced properties
Good interoperability
PCC After 10 Years
In principle, can verify any machine code!
In reality, many programs are not verified.
For some code, we do not know HOW!
Code Proof
Checker
Meta theory
Specification
User-level Code: List Append
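[Figure: list x with cells a1, a2, …, an and list y with cells b1, b2, …, bn ending in nil, together with a chain of closures labeled k, rk, rk1, rk2, …, rkn-2, rkn-1 and e1, e2, …, en-1]
Adapted from [Reynolds02]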
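User-level Code: List Append
[Figure: second view of the list-append diagram, with cells a1, a2, …, an-1, an and b1, b2, …, bn-1, the pointers x and y, nil, and the closure chain labeled k, rk, rk1, rk2, …, rkn-2, rkn-1 and e1, e2, …, en-1]
Adapted from [Reynolds02]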
User-level Code: List Append
Feature                                                                  Type-based   Logic-based
Inductive definitions (correctness of list append)                       -            +
Strong update (separation logic; allocation, de-allocation, mutation)    -            +
Embedded code pointers (continuation)                                    +            -
Impredicative polymorphisms (closure)                                    +            -
Adapted from [Reynolds02]
ECP Problem w. Hoare Logic
Embedded code pointers (ECP)
Examples: computed GOTOs, higher-order functions, indirect jumps, continuations, return addresses
“… are difficult to describe in … Hoare logic” [Reynolds02]
Previous approaches
Ignore ECP [Necula98, Yu04]
Limit ECP specifications to types [Hamid04]
Sacrifice modularity [Yu03]
Use complex indexed semantic models [Appel01]
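To make the term concrete, here is a minimal C sketch of an embedded code pointer: a closure record that stores a function pointer next to its environment, and a caller that jumps through it. The names `closure`, `apply`, `add_env`, `mul_env`, and the two demo functions are illustrative assumptions and do not come from the thesis.

```c
#include <assert.h>

/* A closure pairs an embedded code pointer with its environment. */
struct closure {
    int (*code)(void *env, int arg);  /* the embedded code pointer */
    void *env;                        /* data the code expects */
};

/* One possible target: adds the integer stored in the environment. */
static int add_env(void *env, int arg) {
    return *(int *)env + arg;
}

/* Another target with the same signature: multiplies instead. */
static int mul_env(void *env, int arg) {
    return *(int *)env * arg;
}

/* The caller cannot name its target statically; it performs an indirect
   call through whatever pointer the closure holds. */
int apply(const struct closure *k, int arg) {
    return k->code(k->env, arg);
}

/* Hypothetical demos: two closures built over the same kind of environment. */
int demo_add(int base, int arg) {
    struct closure k = { add_env, &base };
    return apply(&k, arg);
}

int demo_mul(int base, int arg) {
    struct closure k = { mul_env, &base };
    return apply(&k, arg);
}
```

A specification for `apply` has to state what the code behind `k->code` requires of `env` and `arg`, without knowing which function it is. Describing that requirement is the part that the quoted remark says is difficult in Hoare logic.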
The XCAP Framework [POPL’06]
A logic-based PCC framework
modular verification of machine code
supports ECP without compromise
Support both system and user code
Consists of
target machine (not fixed)
assertion language (consistency)
inference rules (soundness)
Target Machine
Dynamic Semantics
Certified Assembly Programming [Yu03, Hamid04, Yu04, Feng05]
Hoare logic in CPS
Use general predicate logic for assertions
example:
Mechanized in a proof assistant (Coq)
Extensions made: CCAP, CMAP, etc.
How CAP Certifies Instructions
How CAP Certifies Programs
…
The ECP Problem
cptr(f, a) = ?
Previous Approach
Internalize Hoare-derivation for ECP
Circularity!
Stratification [OHearn97, Naumann01]
Works for simple case
Hard for assembly
Hard for polymorphism
Step-Indexing [Appel01, Appel02, Schneck03]
Works for polymorphism
Heavyweight
Not standard Hoare logic
CAP’s Approach
Specify ECP by checking against code spec
Verify all code specs are indeed valid
Modularity problem
The XCAP Approach
Specify ECP independent of code spec
Check ECP against global code spec
Verify global code spec is indeed valid
Extended Propositions
XCAP Rules
How XCAP Works with ECP
(SEQ)
(ECP)
(JMP)
(JD)
Verification of append()
[Figure: a linked list l that stores ls, with cells nth(ls, 1), nth(ls, 2), …, nth(ls, n) linked through l', l'', … and terminated by NULL; labels r0, r1, f, z, env, cnt. The annotations read: "continuation code block", "environment of type aenv", "linked list stores ls", and "continuation code block expecting an environment aenv and a list stores ls".]
Impredicative Polymorphisms
Important for ECP
Naïve interpretation function fails
[Figure: cnt is a continuation code block expecting an environment of type aenv and a list that stores ls; labels env and l]
New Interpretation
Soundness of interpretation
Interpretation
Consistency
Recursive Specification
Simple recursive data structures
linked list, queue, stack, tree, etc.
supported via inductive definition of Prop
Complex recursive structures with ECP
object (self refers to the entire object)
threading invariant (each thread assumes others)
Recursive specification
Memory Mutation
Strong update
special conjunction (p * q) in separation logic
directly definable in Prop and PropX
explicit alias control, popular in system level
Weak update (general reference)
mutable reference (int ref) in ML
managed data pointers (int __gc*) in .NET
rely on GC to recycle memory
popular in user level
Weak Update
Reference cell
Interpretation
Record macro
Implementation in Coq
PropX can share similar tactics with Prop
Target machine 341 lines
PropX, interpretation, and consistency 1733 lines
XCAP with soundness 444 lines
CAP with soundness 402 lines
CAP to XCAP translation with proof 543 lines
Separation logic and lemmas 300 lines
append() example 1718 lines
Why Thread Library?
Concurrent verification
primitives’ correctness is assumed
primitives are not really “primitive”!
poor portability due to lack of formal spec
Core of OS kernel
assignment 1 of OS course
written in C and Assembly
requires both safety and efficiency
A Mini Thread Library
Modeled after Pth
Non-preemptive user level threads
Written in (subset of) x86 assembly
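[Figure: machine state with the registers eax, ebx, ecx, edx, esi, edi, ebp, esp, the pc, and the flags cf and zf; memory regions Code, Static, Heap (with the system break), an unavailable region, and Stack; labels H, F, R, ss_start, ss_end, hp_start]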
Threading Model
[Figure: a Current Thread, a Scheduler, and Ready Threads, each with its own Stack, alongside Static data and the Heap; transitions labeled start(), spawn(), yield(), exit(), swapcontext(), queue_insert(), queue_delete()]
Modules and Interfaces
machine context module
typedef struct mctx_st *mctx_t;
struct mctx_st { int eax, ebx, ecx, edx, esi, edi, ebp, esp; };
void loadcontext (mctx_t mctx);
void makecontext (mctx_t mctx, char *sp, void *lnk, void *func, void *arg);
void swapcontext (mctx_t old, mctx_t new);
memory module
void *flist = NULL;
void *malloc (int size);
void free (void *ptr);
queue module
typedef struct node_st *node_t;
struct node_st { node_t next; };
void queue_insert (node_t *q, node_t t);
node_t queue_delete (node_t *q);
threading module
typedef enum { READY, DEAD } mth_state_t;
typedef struct mth_st *mth_t;
struct mth_st { mth_t next; mth_state_t state; struct mctx_st mctx; };
mth_t mth_current, mth_rq;
mctx_t mctx_sched;
void mth_start (int stacksize, void *(*main)(void *), void *arg);
void mth_spawn (int stacksize, void *(*func)(void *), void *arg);
void mth_exit (void *value);
void mth_yield (void);
void mth_scheduler (void);
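The queue interface above is small enough to sketch in full. The following is one possible C implementation, assuming a FIFO discipline over a singly linked list whose head pointer is `*q`; the slides give only the signatures, so the bodies are an assumption and may differ from the verified code.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node_st *node_t;
struct node_st { node_t next; };

/* Append t at the tail of the list whose head pointer is *q. */
void queue_insert(node_t *q, node_t t) {
    t->next = NULL;
    while (*q != NULL)
        q = &(*q)->next;   /* advance to the last next field */
    *q = t;
}

/* Remove and return the head of the queue, or NULL if it is empty. */
node_t queue_delete(node_t *q) {
    node_t head = *q;
    if (head != NULL) {
        *q = head->next;
        head->next = NULL;
    }
    return head;
}
```

Because `struct mth_st` starts with a `next` field, a thread control block can be linked into such a queue by viewing it as a `node_t`; that reading is suggested by the declarations but is not stated on the slide.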
Verify That 19 Lines of Code
Step 1: specify machine context
Step 2: specify function call/return
Step 3: specify swapcontext()
Step 4: prove it!
Machine Context
typedef struct mctx_st *mctx_t;
struct mctx_st { int eax, ebx, ecx, edx, esi, edi, ebp, esp; };
[Figure: machine context mctx with the slots retv, bx, cx, dx, si, di, bp, sp; labels cs, ret, public, private]
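The displacements used by the assembly (`[eax+0]` through `[eax+28]`) depend on this record having eight 4-byte slots in declaration order. The sketch below checks that layout at compile time. It uses `int32_t` rather than `int` so the check also holds on platforms where `int` is not 32 bits; that substitution and the helper `mctx_slot_offset` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit register slots, in the order the slide lists them. */
struct mctx_st {
    int32_t eax, ebx, ecx, edx, esi, edi, ebp, esp;
};
typedef struct mctx_st *mctx_t;

/* Compile-time checks (C11) that each field sits at the displacement
   the assembly uses, e.g. mov [eax+28], esp. */
_Static_assert(offsetof(struct mctx_st, eax) == 0,  "eax slot");
_Static_assert(offsetof(struct mctx_st, ebx) == 4,  "ebx slot");
_Static_assert(offsetof(struct mctx_st, ecx) == 8,  "ecx slot");
_Static_assert(offsetof(struct mctx_st, edx) == 12, "edx slot");
_Static_assert(offsetof(struct mctx_st, esi) == 16, "esi slot");
_Static_assert(offsetof(struct mctx_st, edi) == 20, "edi slot");
_Static_assert(offsetof(struct mctx_st, ebp) == 24, "ebp slot");
_Static_assert(offsetof(struct mctx_st, esp) == 28, "esp slot");
_Static_assert(sizeof(struct mctx_st) == 32, "eight 4-byte slots");

/* Hypothetical helper: byte offset of the i-th register slot (0..7). */
size_t mctx_slot_offset(int i) {
    return (size_t)i * sizeof(int32_t);
}
```

If a compiler reordered or padded the fields, the build would fail instead of the context switch silently corrupting registers.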
Function Call / Return
[Figure: stack layout around a function call, with regions labeled excess space, local storage, return address, argument 1, argument 2, …, argument n, and caller frames; esp is marked]
swapcontext()
void swapcontext (mctx_t old, mctx_t new);
mov eax, [esp+4]
mov [eax+ 0], OK
mov [eax+ 4], ebx
mov [eax+ 8], ecx
mov [eax+12], edx
mov [eax+16], esi
mov [eax+20], edi
mov [eax+24], ebp
mov [eax+28], esp
mov eax, [esp+8]
mov esp, [eax+28]
mov ebp, [eax+24]
mov edi, [eax+20]
mov esi, [eax+16]
mov edx, [eax+12]
mov ecx, [eax+ 8]
mov ebx, [eax+ 4]
mov eax, [eax+ 0]
ret
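The register-level effect of these 19 instructions can be modeled in C on a simulated register file. This is a sketch of the intended behavior only: it ignores the stack reads that fetch the two arguments and the final `ret`, it treats `esp` as an opaque number, and the value `0` for `OK` is an assumption, since the slides name the constant without defining it. The names `regs` and `swapcontext_model` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { OK = 0 };  /* assumed value; the slides only name the constant */

/* Saved machine context, one slot per register. */
struct mctx_st { int32_t eax, ebx, ecx, edx, esi, edi, ebp, esp; };

/* Simulated register file. */
struct regs { int32_t eax, ebx, ecx, edx, esi, edi, ebp, esp; };

/* Effect of swapcontext(old, new) on registers and on the old context. */
void swapcontext_model(struct regs *r, struct mctx_st *old,
                       const struct mctx_st *new_ctx) {
    /* store old context: the eax slot receives OK, not the live eax,
       because eax is used as a scratch register by the routine */
    old->eax = OK;
    old->ebx = r->ebx;
    old->ecx = r->ecx;
    old->edx = r->edx;
    old->esi = r->esi;
    old->edi = r->edi;
    old->ebp = r->ebp;
    old->esp = r->esp;

    /* load new context, eax last as in the assembly */
    r->esp = new_ctx->esp;
    r->ebp = new_ctx->ebp;
    r->edi = new_ctx->edi;
    r->esi = new_ctx->esi;
    r->edx = new_ctx->edx;
    r->ecx = new_ctx->ecx;
    r->ebx = new_ctx->ebx;
    r->eax = new_ctx->eax;
}
```

Under this model, a thread that is later resumed from `old` sees `OK` in `eax`, which matches the C calling convention of returning a result in that register. The model says nothing about the stack or code pointers, which is where the real verification effort lies.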
Other Context Routines
void loadcontext (mctx_t mctx);
void makecontext (mctx_t mctx, char *sp, void *lnk, void *func, void *arg);
Thread Control Block
typedef struct mth_st *mth_t;
struct mth_st { mth_t next; mth_state_t state; struct mctx_st mctx; };
[Figure: a thread control block mth with the fields next, state, and machine context; a queue q of such blocks linked through next and terminated by NULL]
Threading Invariant
[Figure: mctx_sched with the scheduler context (sched); mth_cur with the current thread (cur); mth_rq with the ready threads; label st]
Threading Routines
void mth_yield (void);
mth_t mth_spawn (int stacksize, void *(*func)(void *), void *arg);
void mth_scheduler (void);
Implementation
40,000 lines of Coq code
Where does the complexity come from?
lemma library: large and reusable
x86 machine: finite integer
embedding: de Bruijn indices
engineering: limited proof re-use
target code: this is the kernel of software!
Typed Assembly Language
TAL [Morrisett et al]
Top-level typing judgment
Target of type-preserving compilation
For user and simple system level code
TAL to XCAP Translation (1)
Translation of value types
TAL to XCAP Translation (2)
Translation of preconditions
Translation of code heap types
Translation of data heap types
Typing Preservation
Application Scenario
[Figure: software stack with user application, library, device driver, OS kernel, and firmware, annotated with TAL and XCAP]
Summarizing XCAP
Support user-level machine code
demonstrated by type-preserving translation
Support system-level machine code
demonstrated by mini thread library
Support modular machine code verification
modular as type
expressive as logic
Other Work
A syntactic approach to FPCC [LICS’02]
Simple type safety, no need of indexed model
Stack-based control abstractions [PLDI’06]
utilizes the fixed ECP pattern to simplify things
An open framework for FPCC [TLDI’07]
allows different verification styles in a system
Some Future Directions
Add logic power to higher level languages
C and C#, certifying compilation
Certify safe “unsafe” code
garbage collector, preemptive thread library, device driver, etc.
Consider other properties
correctness, liveness, security, etc.
Build tools for productivity
concrete syntax and parser, large lemma libraries, etc.
Thank You!