Page 1: Shape Analysis for Low-level Code

Shape Analysis for Low-level Code

Hongseok Yang (Seoul National University)

(Joint work with Cristiano Calcagno, Dino Distefano and Peter O’Hearn)

Page 2: Shape Analysis for Low-level Code

Dream

Automatically verify the memory safety of systems code, such as device drivers and memory managers.

Challenges:
1. Pointer arithmetic.
2. Scalability.
3. Concurrency.

Page 3: Shape Analysis for Low-level Code

Our Analyzer

Handles programs for dynamic memory management.

Experimental results (Pentium 3.2GHz, 4GB).

Found a hidden assumption of the K&R memory manager. These are “fixed” versions.

Proved memory safety and even partial correctness.

Page 4: Shape Analysis for Low-level Code

Sample Analysis Result

Program: ans = malloc_bestfit_acyclic(n);

Precondition: n≥2 ∧ mls(freep,0)

Postcondition:
(ans=0 ∧ n≥2 ∧ mls(freep,0))
∨ (n≥2 ∧ nd(ans,q’,n) * mls(freep,0))
∨ (n≥2 ∧ nd(ans,q’,n) * mls(freep,q’) * mls(q’,0))

Page 5: Shape Analysis for Low-level Code

Hidden Assumption in K&R Malloc/Free

[Figure: memory layout diagram with address labels 0 and 220, and regions Global Vars, Stack, Heap.]

Page 10: Shape Analysis for Low-level Code

Multiword Lists

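[Figure: a multiword list starting at lp, with nodes at addresses 15, 18 and 24. Each node has a Link Field and a Size Field; the last node has link nil and size 2.]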

Page 11: Shape Analysis for Low-level Code

Coalescing

[Figure: the multiword list, with pointer p.]

p = lp;
while (p!=0) {
  local q = *p;
  if (p + *(p+1) == q) {
    *(p+1) = *(p+1) + *(q+1);
    *p = *q;
  } else {
    p = q;
  }
}
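The coalescing loop can be traced on a concrete heap. The following is a minimal Python sketch, assuming memory is modelled as a dict from addresses to values, with the link field in cell p, the size field in cell p+1, and 0 standing for nil. The function names and the sample addresses and sizes are illustrative and are not taken from the slides.

```python
def coalesce(mem, lp):
    """Merge adjacent nodes of a multiword list in place.

    Assumed layout (a modelling choice of this sketch): mem[p] is the
    link field, mem[p + 1] is the size field, and 0 stands for nil.
    """
    p = lp
    while p != 0:
        q = mem[p]
        if p + mem[p + 1] == q:
            # Node q starts right after node p: absorb it into p.
            mem[p + 1] = mem[p + 1] + mem[q + 1]
            mem[p] = mem[q]
        else:
            p = q
    return mem


def nodes(mem, lp):
    """Return (address, size) for each node reachable from lp."""
    out, p = [], lp
    while p != 0:
        out.append((p, mem[p + 1]))
        p = mem[p]
    return out


# Illustrative heap with nodes at 5, 15, 18 and 24 (made-up values).
heap = {5: 15, 6: 3, 15: 18, 16: 3, 18: 24, 19: 5, 24: 0, 25: 2}
coalesce(heap, 5)
print(nodes(heap, 5))
```

In this sample heap only the nodes at 15 and 18 are adjacent (15 + 3 = 18), so they are the only pair that gets merged.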

Page 18: Shape Analysis for Low-level Code

Coalescing

[Figure: the list after coalescing, with cell values 15, 3, 24, 8, nil, 2 and node addresses 15 and 24; p = 0.]

p = lp;
while (p!=0) {
  local q = *p;
  if (p + *(p+1) == q) {
    *(p+1) = *(p+1) + *(q+1);
    *p = *q;
  } else {
    p = q;
  }
}

Nodeful High-level View

Nodeful High-level View

Nodeless Low-level View

Complex numerical relationships are used only for reconstructing a high-level view.

Page 19: Shape Analysis for Low-level Code

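Separation Logic

blk(p+2,p+5)

nd(p,q,5) =def (p↦q) * (p+1↦5) * blk(p+2,p+5)

mls(p,q)

[Figures: a block of cells from p+2 to p+5; a node at p holding q and 5 and ending at p+5; a multiword list from p to q with cells labelled 3, 4, 2.]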
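The definition of nd can be checked against a concrete heap. The sketch below is a Python model written for illustration, assuming a heap is a dict from addresses to values and reading blk(a,b) as "exactly the cells a, …, b-1, with arbitrary contents". The function names are hypothetical. The separating conjunction * is modelled by splitting the heap into disjoint parts, one per conjunct.

```python
def blk(heap, a, b):
    """blk(a, b): the heap is exactly the cells a, a+1, ..., b-1."""
    return set(heap) == set(range(a, b))


def points_to(heap, a, v):
    """a |-> v: the heap is the single cell a, holding v."""
    return heap == {a: v}


def nd(heap, p, q, n):
    """nd(p, q, n) = (p |-> q) * (p+1 |-> n) * blk(p+2, p+n).

    The three conjuncts must hold on disjoint parts of the heap that
    together make up the whole heap.
    """
    head = {a: heap[a] for a in (p,) if a in heap}
    size = {a: heap[a] for a in (p + 1,) if a in heap}
    rest = {a: v for a, v in heap.items() if a not in (p, p + 1)}
    return (points_to(head, p, q)
            and points_to(size, p + 1, n)
            and blk(rest, p + 2, p + n))


# A node of size 5 at address 10 whose link field holds 20.
example = {10: 20, 11: 5, 12: 7, 13: 7, 14: 7}
print(nd(example, 10, 20, 5))
```

Because each conjunct fixes its own footprint, the split of the heap is determined and no search over partitions is needed in this sketch.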

Page 20: Shape Analysis for Low-level Code

Symbolic Heaps

∃x’,y’. (P1 ∧ P2 ∧ … ∧ Pn) ∧ (H1 * H2 * … * Hm)

where
P ::= E=F | E≤F | E!=F | …
H ::= E↦F | blk(E,F) | mls(E,F) | nd(E,F,G) | …
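One way to hold such formulas in a program is as immutable records. The following Python sketch is only an illustration, assuming expressions are kept as plain strings. The class names Pure, Spatial and SymHeap, the ASCII rendering, and the use of "emp" for an empty spatial part are choices of this sketch and are not from the slides.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Pure:
    op: str                      # "=", "<=" or "!="
    lhs: str
    rhs: str

    def __str__(self):
        return f"{self.lhs}{self.op}{self.rhs}"


@dataclass(frozen=True)
class Spatial:
    pred: str                    # "pt" (points-to), "blk", "mls" or "nd"
    args: Tuple[str, ...]

    def __str__(self):
        if self.pred == "pt":
            return f"{self.args[0]}|->{self.args[1]}"
        return f"{self.pred}({','.join(self.args)})"


@dataclass(frozen=True)
class SymHeap:
    exists: Tuple[str, ...]      # primed (existential) variables
    pure: Tuple[Pure, ...]       # P1, ..., Pn
    spatial: Tuple[Spatial, ...] # H1, ..., Hm

    def __str__(self):
        body = " * ".join(str(h) for h in self.spatial) or "emp"
        if self.pure:
            body = " /\\ ".join(str(p) for p in self.pure) + " /\\ " + body
        if self.exists:
            body = f"exists {','.join(self.exists)}. " + body
        return body


# One disjunct in the style of the sample postcondition.
post = SymHeap(
    exists=("q'",),
    pure=(Pure("<=", "2", "n"),),
    spatial=(Spatial("nd", ("ans", "q'", "n")),
             Spatial("mls", ("freep", "0"))),
)
print(post)
```

Frozen dataclasses are hashable, so values of this kind can be collected in Python sets, which is convenient for a powerset-style domain.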

Page 21: Shape Analysis for Low-level Code

Abstract Domain

P(CanSymH)⊤, ⊆

Pfin(SymH)⊤, ⊆

P(Emb)    P(Abs)

y=x+z ∧ x↦y * x+1↦z * blk(x+2,y) * mls(y,0)

nd(x,y,z) * mls(y,0)

{Q1, Q2, … ,Qn}

{T1,T2,…,Tn}

Page 22: Shape Analysis for Low-level Code

Our Analysis

while(B) { C; }

Nodeful View: P(CanSymH)⊤
{T1,T2,…,Tn}    {T’1,T’2,…,T’m}

Nodeless View: Pfin(SymH)⊤
{Q1, Q2, … ,Qn}    {Q’1, Q’2, … ,Q’m}

Steps: Emb; Rearrangement, Sym. Execution, Abstraction

Page 24: Shape Analysis for Low-level Code

Analysis

⟦C⟧ : Pfin(SymH)⊤ → Pfin(SymH)⊤

⟦A⟧d = P(SymExec(A) ∘ Rearrange(A))d
⟦while b C⟧d = FixComp(P(Abs) ∘ F)

where F : P(CanSymHeaps) → P(CanSymHeaps)
F(d’) = P(Abs)(d ∪ ⟦C⟧d’)

Page 25: Shape Analysis for Low-level Code

Analysis

⟦C⟧ : Pfin(SymH)⊤ → Pfin(SymH)⊤

⟦A⟧d = (P(SymExec(A)) ∘ lift(Rearrange(A)))d
⟦while b C⟧d = FixComp(P(Abs) ∘ F)

where F : P(CanSymHeaps) → P(CanSymHeaps)
F(d’) = P(Abs)(d ∪ ⟦C⟧d’)

SymExec(A) :

Proof Rules in Sep. Log.

Rearrange(A) :

Unrolling of mls and nd
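Unrolling nd amounts to replacing the predicate by its definition, nd(x,y,z) = (x↦y) * (x+1↦z) * blk(x+2,x+z), so that the cells a command accesses become explicit. The Python sketch below illustrates this step only. The tuple encoding of conjuncts and the function name unroll_nd are assumptions of this sketch, and unrolling of mls is not covered.

```python
def unroll_nd(spatial, x):
    """Replace nd(x, nxt, s) by its definition, exposing cells x and x+1.

    Conjuncts are tuples such as ("nd", x, nxt, s), ("pt", addr, value),
    ("blk", lo, hi) and ("mls", a, b). Expressions are plain strings,
    which is a simplification made for this sketch.
    """
    out = []
    for h in spatial:
        if h[0] == "nd" and h[1] == x:
            _, _, nxt, s = h
            out.append(("pt", x, nxt))                 # x   |-> nxt
            out.append(("pt", f"{x}+1", s))            # x+1 |-> s
            out.append(("blk", f"{x}+2", f"{x}+{s}"))  # rest of the node
        else:
            out.append(h)
    return out


heap = [("nd", "p", "q'", "3"), ("mls", "q'", "0")]
print(unroll_nd(heap, "p"))
```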

Page 26: Shape Analysis for Low-level Code

Analysis

⟦C⟧ : Pfin(SymH)⊤ → Pfin(SymH)⊤

⟦A⟧d = (P(SymExec(A)) ∘ lift(Rearrange(A)))d
⟦while b C⟧d = FixComp(F)

where F : P(CanSymH)⊤ → P(CanSymH)⊤

F(d’) = P(Abs)(d ∪ (⟦C⟧ ∘ P(Emb))d’)

Emb : CanSymH → SymH
Abs : SymH → CanSymH

Information Loss

Widened Differential Fixpoint Algorithm

Page 27: Shape Analysis for Low-level Code

Abstraction Function Abs

Abs : SymH → CanSymH

1. Package all nodes.
2. Drop numerical relationships.
3. Combine two connected multiword lists.

(5 ≤ x+x ∧ p+3=z’) ∧ (p↦q’ * p+1↦3 * blk(p+2,z’) * mls(q’,0))

Page 28: Shape Analysis for Low-level Code

Abstraction Function Abs

Abs : SymH → CanSymH

1. Package all nodes.
2. Drop numerical relationships.
3. Combine two connected multiword lists.

(5 ≤ x+x ∧ p+3=z’) ∧ (nd(p,q’,3) * mls(q’,0))

Page 29: Shape Analysis for Low-level Code

Abstraction Function Abs

Abs : SymH → CanSymH

1. Package all nodes.
2. Drop numerical relationships.
3. Combine two connected multiword lists.

(5 ≤ x+x ∧ p+3=z’) ∧ (nd(p,q’,3) * mls(q’,0))

(5 ≤ x+x ∧ p+3=z’) ∧ (nd(p,q’,3) * mls(q’,0) * r↦4)

Page 30: Shape Analysis for Low-level Code

Abstraction Function Abs

Abs : SymH → CanSymH

1. Package all nodes.
2. Drop numerical relationships.
3. Combine two connected multiword lists.

(5 ≤ x+x ∧ p+3=z’) ∧ (nd(p,q’,3) * mls(q’,0))

(5 ≤ x+x ∧ p+3=z’) ∧ (nd(p,q’,3) * mls(q’,0) * true)

Page 31: Shape Analysis for Low-level Code

Abstraction Function Abs

Abs : SymH → CanSymH

1. Package all nodes.
2. Drop numerical relationships.
3. Combine two connected multiword lists.

(5 ≤ x+x ∧ p+3=z’) ∧ (nd(p,q’,3) * mls(q’,0))

Page 32: Shape Analysis for Low-level Code

Abstraction Function Abs

Abs : SymH → CanSymH

1. Package all nodes.
2. Drop numerical relationships.
3. Combine two connected multiword lists.

(nd(p,q’,3) * mls(q’,0))

Page 34: Shape Analysis for Low-level Code

Abstraction Function Abs

Abs : SymH → CanSymH

1. Package all nodes.
2. Drop numerical relationships.
3. Combine two connected multiword lists.

mls(p,0)

Page 35: Shape Analysis for Low-level Code

Abstraction Function Abs

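Abs : SymH → CanSymH

1. Package all nodes.
2. Drop numerical relationships.
3. Combine two connected multiword lists.

Precondition: true

… (x↦x’,s) * blk(x+2,x+s) ⇝ … nd(x,x’,s)

[Figure: cells holding x’ and s at x, followed by a block from x+2 to x+s, packaged into one node from x to x+s holding x’ and s.]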

Page 36: Shape Analysis for Low-level Code

Abstraction Function Abs

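Abs : SymH → CanSymH

1. Package all nodes.
2. Drop numerical relationships.
3. Combine two connected multiword lists.

Precondition: s = s’+i

… (x↦x’,s) * blk(x+2,x+i) * nd(x+i,y’,s’) ⇝ … nd(x,x’,s)

[Figure: cells holding x’ and s at x, a block from x+2 to x+i, and a node holding y’ and s’ from x+i to x+i+s’, packaged into one node from x to x+s holding x’ and s.]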
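The packaging and combining steps can be pictured as rewriting on the spatial part of a symbolic heap. The Python sketch below is a simplified illustration under several assumptions: conjuncts are tuples, ("pt2", x, x', s) stands for the two-cell points-to (x↦x’,s), expressions are strings compared syntactically (no prover, no numerical reasoning), and the side condition for combining is that the middle variable is primed and occurs nowhere else. The function names are hypothetical, and the conditions used by the actual analysis may differ.

```python
def package_nodes(spatial):
    """Rewrite (x|->x',s) * blk(x+2, x+s) into nd(x, x', s) until stable."""
    spatial = list(spatial)
    changed = True
    while changed:
        changed = False
        for h in spatial:
            if h[0] != "pt2":
                continue
            _, x, nxt, s = h
            block = ("blk", f"{x}+2", f"{x}+{s}")
            if block in spatial:
                spatial.remove(h)
                spatial.remove(block)
                spatial.append(("nd", x, nxt, s))
                changed = True
                break  # restart the scan on the modified list
    return spatial


def _find_pair(spatial):
    """Find two list-like conjuncts joined at a primed variable used only there."""
    for a in spatial:
        if a[0] in ("nd", "mls") and a[2].endswith("'"):
            mid = a[2]
            uses = sum(h[1:].count(mid) for h in spatial)
            for b in spatial:
                if (b != a and b[0] in ("nd", "mls")
                        and b[1] == mid and uses == 2):
                    return a, b
    return None


def combine_lists(spatial):
    """Merge two connected segments or nodes into a single mls."""
    spatial = list(spatial)
    while True:
        pair = _find_pair(spatial)
        if pair is None:
            return spatial
        a, b = pair
        spatial.remove(a)
        spatial.remove(b)
        spatial.append(("mls", a[1], b[2]))


start = [("pt2", "p", "q'", "3"), ("blk", "p+2", "p+3"), ("mls", "q'", "0")]
print(combine_lists(package_nodes(start)))
```

On this input the node at p is packaged first and then merged with the list segment that follows it, mirroring the step from nd(p,q’,3) * mls(q’,0) to mls(p,0).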

Page 37: Shape Analysis for Low-level Code

Coalescing

while (p!=0) {
  local q = *p;
  if (p + *(p+1) == q) {
    *(p+1) = *(p+1) + *(q+1);
    *p = *q;
  } else {
    p = *p;
  }
}

mls(lp,p) * mls(p,0) …

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦q,s’ * blk(p+2,p+s’) * mls(q,0)

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦q,s’+t’ * blk(p+2,p+s’) * q↦r’,t’ * blk(q+2,q+t’) * mls(r’,0)

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦r’,s’+t’ * blk(p+2,p+s’) * q↦r’,t’ * blk(q+2,q+t’) * mls(r’,0)

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦r’,s’+t’ * blk(p+2,p+s’) * q↦r’,t’ * blk(q+2,q+t’) * mls(r’,0)

p!=0 ∧ mls(lp,p) * p↦q,s’ * blk(p+2,p+s’) * mls(q,0)

Page 38: Shape Analysis for Low-level Code

Coalescing

while (p!=0) {
  local q = *p;
  if (p + *(p+1) == q) {
    *(p+1) = *(p+1) + *(q+1);
    *p = *q;
  } else {
    p = *p;
  }
}

mls(lp,p) * mls(p,0) …

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦q,s’ * blk(p+2,p+s’) * mls(q,0)

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦q,s’+t’ * blk(p+2,p+s’) * q↦r’,t’ * blk(q+2,q+t’) * mls(r’,0)

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦r’,s’+t’ * blk(p+2,p+s’) * q↦r’,t’ * blk(q+2,q+t’) * mls(r’,0)

p!=0 ∧ p+s’=q’ ∧ mls(lp,p) * p↦r’,s’+t’ * blk(p+2,p+s’) * q’↦r’,t’ * blk(q’+2,q’+t’) * mls(r’,0)

p!=0 ∧ mls(lp,p) * p↦q,s’ * blk(p+2,p+s’) * mls(q,0)

Page 39: Shape Analysis for Low-level Code

Coalescing

while (p!=0) {
  local q = *p;
  if (p + *(p+1) == q) {
    *(p+1) = *(p+1) + *(q+1);
    *p = *q;
  } else {
    p = *p;
  }
}

mls(lp,p) * mls(p,0) …

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦q,s’ * blk(p+2,p+s’) * mls(q,0)

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦q,s’+t’ * blk(p+2,p+s’) * q↦r’,t’ * blk(q+2,q+t’) * mls(r’,0)

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦r’,s’+t’ * blk(p+2,p+s’) * q↦r’,t’ * blk(q+2,q+t’) * mls(r’,0)

p!=0 ∧ p+s’=q’ ∧ mls(lp,p) * p↦r’,s’+t’ * blk(p+2,p+s’) * nd(q’,r’,t’) * mls(r’,0)

p!=0 ∧ mls(lp,p) * p↦q,s’ * blk(p+2,p+s’) * mls(q,0)

Page 40: Shape Analysis for Low-level Code

Coalescing

while (p!=0) {
  local q = *p;
  if (p + *(p+1) == q) {
    *(p+1) = *(p+1) + *(q+1);
    *p = *q;
  } else {
    p = *p;
  }
}

mls(lp,p) * mls(p,0) …

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦q,s’ * blk(p+2,p+s’) * mls(q,0)

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦q,s’+t’ * blk(p+2,p+s’) * q↦r’,t’ * blk(q+2,q+t’) * mls(r’,0)

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦r’,s’+t’ * blk(p+2,p+s’) * q↦r’,t’ * blk(q+2,q+t’) * mls(r’,0)

p!=0 ∧ p+s’=q’ ∧ mls(lp,p) * nd(p,r’,s’+t’) * mls(r’,0)

p!=0 ∧ mls(lp,p) * p↦q,s’ * blk(p+2,p+s’) * mls(q,0)

Page 41: Shape Analysis for Low-level Code

Coalescing

while (p!=0) {
  local q = *p;
  if (p + *(p+1) == q) {
    *(p+1) = *(p+1) + *(q+1);
    *p = *q;
  } else {
    p = *p;
  }
}

mls(lp,p) * mls(p,0) …

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦q,s’ * blk(p+2,p+s’) * mls(q,0)

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦q,s’+t’ * blk(p+2,p+s’) * q↦r’,t’ * blk(q+2,q+t’) * mls(r’,0)

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦r’,s’+t’ * blk(p+2,p+s’) * q↦r’,t’ * blk(q+2,q+t’) * mls(r’,0)

mls(lp,p) * nd(p,r’,s’+t’) * mls(r’,0)

p!=0 ∧ mls(lp,p) * p↦q,s’ * blk(p+2,p+s’) * mls(q,0)

Page 42: Shape Analysis for Low-level Code

Coalescing

while (p!=0) {
  local q = *p;
  if (p + *(p+1) == q) {
    *(p+1) = *(p+1) + *(q+1);
    *p = *q;
  } else {
    p = *p;
  }
}

mls(lp,p) * mls(p,0) …

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦q,s’ * blk(p+2,p+s’) * mls(q,0)

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦q,s’+t’ * blk(p+2,p+s’) * q↦r’,t’ * blk(q+2,q+t’) * mls(r’,0)

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦r’,s’+t’ * blk(p+2,p+s’) * q↦r’,t’ * blk(q+2,q+t’) * mls(r’,0)

mls(lp,p) * mls(p,0)

p!=0 ∧ mls(lp,p) * p↦q,s’ * blk(p+2,p+s’) * mls(q,0)

Page 43: Shape Analysis for Low-level Code

Theorem Prover for “Q1 ⊢ Q2”

              without prover    with prover
malloc_K&R    about 20 hours    502.23 secs
free_K&R      23.844 secs       9.69 secs

Page 44: Shape Analysis for Low-level Code

Put Prover inside Hoare Powerdomain?

Q1 ⊢ Q2, Q3 ⊢ Q4

{Q1, Q2, Q3, Q4}

x0 = {}

x1 = F(x0) = {Q1, Q2, Q4}

x2 = F(x1) = {Q1, Q2, Q3, Q4}

P(CanSymH), ⊆ vs. PH(CanSymH), ⊑

{Q2, Q3} ⊑

But, works only when ⊢ is transitive.

Page 45: Shape Analysis for Low-level Code

Put Prover inside Hoare Powerdomain?

Q1 ⊢ Q2, Q2 ⊢ Q3, Q3 ⊢ Q1

x0 = {}

x1 = F(x0) = {Q1, Q2}

x2 = F(x1) = {Q2, Q3}

x3 = F(x2) = {Q3, Q1}

x4 = F(x3) = {Q1, Q2}

P(CanSymH), ⊆ vs. PH(CanSymH), ⊑

But, works only when ⊢ is transitive.

Page 46: Shape Analysis for Low-level Code

Put Prover inside Widening!

∇ : P(CanSymH) × P(CanSymH) → P(CanSymH)

x0 ∇ x1 =def x0 ∪ { Q ∈ x1 | ∀Q’ ∈ x0. Q ⊬ Q’ }

x0 = {}

x1 = x0 ∇ F(x0)

x2 = x1 ∇ F(x1)

xn+1 = xn ∇ F(xn)

… x0 ⊆ x1 ⊆ x2 ⊆ x3 …
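The widening can be sketched with sets and a prover passed in as a function. The Python below is an illustration only, reading the side condition as "Q is added only when the prover finds no Q’ in x0 that Q entails". The three abstract states, the STEP and PROVABLE tables, and the function names are hypothetical stand-ins for symbolic heaps and a real theorem prover.

```python
def widen(x0, x1, entails):
    """x0 widened with x1: keep x0, add each Q in x1 entailing no Q' in x0."""
    return frozenset(x0) | {q for q in x1
                            if not any(entails(q, q0) for q0 in x0)}


def iterate(F, entails):
    """Compute x0 = {}, x(n+1) = x(n) widened with F(x(n)), until stable."""
    x = frozenset()
    while True:
        nxt = widen(x, F(x), entails)
        if nxt == x:
            return x
        x = nxt


# Toy instance (hypothetical): three abstract states and a stand-in prover.
STEP = {"Q1": "Q2", "Q2": "Q3", "Q3": "Q3"}
PROVABLE = {("Q3", "Q2")}   # pretend the prover can show Q3 |- Q2


def F(x):
    return {"Q1"} | {STEP[q] for q in x}


def entails(q, q0):
    return q == q0 or (q, q0) in PROVABLE


print(sorted(iterate(F, entails)))
```

Since elements are only ever added, the iterates form an increasing chain, and in this toy run Q3 is never added because the stand-in prover shows it entails Q2.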

Page 47: Shape Analysis for Low-level Code

Add Differencing

F : P(CanSymH) → P(CanSymH)

x0 = {}

x1 = x0 ∇ F({}) = {Q1}

x2 = x1 ∇ F({Q1}) = {Q1,Q2}

x3 = x2 ∇ F({Q1,Q2}) = {Q1,Q2,Q3}

x4 = x3 ∇ F({Q1,Q2,Q3}) = {Q1,Q2,Q3}

xn+1 = xn ∇ F(yn), yn+1 = xn+1 - xn

Nonstandard Fixpoint Algorithm:

• NOT y ⊆ (x ∇ y).

• NOT F(wdfix F) ⊆ wdfix F.

NOT γ(F(wdfix F)) ⊆ γ(wdfix F)
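The differential scheme applies F only to the elements that were new in the previous round. The Python sketch below illustrates the recurrence xn+1 = xn ∇ F(yn), yn+1 = xn+1 - xn, taking y0 = x0. The toy F, the equality-only prover, and the names wdfix, widen and seen_args are assumptions made for this example.

```python
def widen(x0, x1, entails):
    """Keep x0 and add each Q in x1 that entails no Q' in x0."""
    return frozenset(x0) | {q for q in x1
                            if not any(entails(q, q0) for q0 in x0)}


def wdfix(F, entails, d0=()):
    """Widened differential iteration, starting from x0 = y0 = d0."""
    x = frozenset(d0)
    y = x
    while True:
        nxt = widen(x, F(y), entails)
        if nxt == x:
            return x
        # The right-hand side is evaluated first, so nxt - x uses the old x.
        x, y = nxt, nxt - x


# Toy instance (hypothetical): F records the arguments it is called with.
STEP = {"Q1": "Q2", "Q2": "Q3", "Q3": "Q3"}
seen_args = []


def F(y):
    seen_args.append(set(y))
    return {"Q1"} | {STEP[q] for q in y}


result = wdfix(F, lambda q, q0: q == q0)
print(sorted(result), seen_args)
```

In this run F is applied to {}, {Q1}, {Q2} and {Q3} rather than to the growing sets {Q1}, {Q1,Q2}, {Q1,Q2,Q3}, which is the saving that differencing is meant to give.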

Page 48: Shape Analysis for Low-level Code

Soundness

Analysis results can be compiled into separation-logic proofs.

Page 49: Shape Analysis for Low-level Code

Widened Differential Fixpoint Algo.

⟦while (*) C⟧d0 = ??

x0 = d0

x1 = x0 ∇ F(x0)    y1 = x1 – x0

x2 = x1 ∇ F(y1)    y2 = x2 – x1

x3 = x2 ∇ F(y2) = x2

γ(x3) ⊆ γ(d0) ∪ γ(y1) ∪ γ(y2)

x3 = d0 ∇ F(d0) ∇ F(y1) ∇ F(y2)

γ(x3) ⊇ γ(d0) ∪ γ(F(d0)) ∪ γ(F(y1)) ∪ γ(F(y2))

Page 50: Shape Analysis for Low-level Code

Widened Differential Fixpoint Algo.

{d0} C {F(d0)} {y1} C {F(y1)} {y2} C {F(y2)}

{d0} C {x3} {y1} C {x3} {y2} C {x3}

{d0 ∨ y1 ∨ y2} C {x3}

{x3} C {x3}

{x3} while (*) C {x3}

{d0} while (*) C {x3}

Disjunction Rule

Consequence:

γ(x3) ⊇ γ(d0) ∪ γ(F(d0)) ∪ γ(F(y1)) ∪ γ(F(y2))

Consequence:

γ(x3) ⊆ γ(d0) ∪ γ(y1) ∪ γ(y2)