T. Lev-Ami, R. Manevich, M. Sagiv tvla TVLA: A System for Generating Abstract Interpreters A....

Post on 30-Dec-2015

213 views 0 download

Tags:

Transcript of T. Lev-Ami, R. Manevich, M. Sagiv tvla TVLA: A System for Generating Abstract Interpreters A....

T. Lev-Ami, R. Manevich, M. Sagiv

http://www.cs.tau.ac.il/~tvla

TVLA: A System for Generating

Abstract Interpreters

A. Loginov, G. Ramalingam, E. Yahav

T. Reps & R. Wilhelm

Object Oriented Programs

• Extensive use of the heap

• Dynamic allocation

• Recursive data structures

• Destructive reference updates• x.f := y

Verification of OO Programs

• Verify properties about the heap

• No runtime errors– Null dereferences– Freed memory– Dangling pointers

• No memory leaks

• Partial correctness

A Common Approach to Pointer Analysis in Software Verification

• Break verification problem into two phases

– Preprocessing phase – separate points-to analysis• typically no distinction between objects allocated at same site

– Verification phase – verification using points-to results• May lose precision

– May lose ability to perform strong updates– May produce false alarms

Program

Property

Pointer Analysis

VerificationAnalysis

Loss of Precision in Two-Phase Approach

f = new InputStream();

f.read();

f.close();

Verify f is not readafter it is closed

Straightforward ...

Loss of Precision in Two-Phase Approach

while (?) {

f = new InputStream();

f.read();

f.close();

}

f1f

False alarm!“read may be erroneous”

closed=1/2

The TVLA Approach

Express concrete semantics using first order logic

Abstract Interpretation using 3-valued Kleene’s logic1. verification integrated with heap analysis

heap analysis (merging criterion) can adapt to verification

Program

Property

Frontend

First-OrderTransition Sys

AbstractInterpretation

while (?) {

f = new InputStream();

f.read();

f.close();

}

The TVLA Approach: An Example

f

closedf

closedf closed

closedf closed closed

Concrete States

f

closedf

closedf

Abstract States

Outline

• Expressing concrete semantics using logic (TVLA input)

• Canonincal abstraction

• Abstract interpretation using Canonincal abstraction (TVLA engine)

• Applications & ongoing Work

Traditional Heap Interpretation• States = Two level stores

– env: Var Values– fields: Loc Values– Values=Loc Atoms

• Example – env = [x 30, p 79]– next = [30 40, 40 50, 50 79, 79 90]– val = [30 1, 40 2, 50 3, 79 4, 90 5]

1 40 2 50 3 79 4 90 5 0 x

p

30 40 50 79 90

Predicate Logic• Vocabulary

– A finite set of predicate symbols Peach with a fixed arity

• Logical Structures S provide meaning for predicates – A set of individuals (nodes) U– pS: (US)k {0, 1}

• FOTC over TC, express logical structure properties

Representing Stores as Logical Structures

• Locations Individuals• Program variables Unary predicates• Fields Binary predicates• Example

– U = {u1, u2, u3, u4, u5}– x = {u1}, p = {u3}– next = {<u1, u2>, <u2, u3>, <u3, u4>, <u4, u5>}

u1 u2 u3 u4 u5xn n n n

p

Concrete Interpretation Rules (Lists)Statement Safety Precondition Postcondition

x =NULL 1 v:x’(v)

x=malloc() v0: active(v0) v: x’(v) eq(v, v0)

v: active’(v) active(v) eq(v,v0)

x=y 1 v:x’(v) y(v)

x=y next v0: y(v0) active(v0) v:x’(v) next(v0, v)

x next=y v0: x(v0) active(v0) v1,v2: next’(v1, v2) ( eq(v1, v0 ) next(v1, v2))

(eq(v1, v0) y(v2))

Invariants• No memory leaks

v: active(v) {x PVar} v1: x(v1) next*(v1, v)• Acyclic list(x)

v1,v2: x(v1) next*(v1, v2) next+(v2, v1)• Reverse (x)

v1,v2, v3: x(v1) next*(v1, v2) next(v2, v3) next’(v3, v2)

• 1: True

• 0: False

• 1/2: Unknown

• A join semi-lattice: 0 1 = 1/2

Kleene 3-Valued Logic

1/2 Information

order

Logical order

Boolean Connectives [Kleene] 0 1/2 1

0 0 0 01/2 0 1/2 1/21 0 1/2 1

0 1/2 1

0 0 1/2 11/2 1/2 1/2 11 1 1 1

3-Valued Logical Structures

• A set of individuals (nodes) U

• Predicate meaning– pS: (US)k {0, 1, 1/2}

Canonical Abstraction• Partition the individuals into equivalence classes based on the values of

their unary predicates

– Every individual is mapped into its equivalence class

• Collapse predicates via

– pS (u’1, ..., u’k) = {pB (u1, ..., uk) | f(u1)=u’1, ..., f(u’k)=u’k) }

• At most 2A abstract individuals

Canonical Abstraction

x = NULL;

while (…) do {

t = malloc();

t next=x;

x = t

}

u1x

t

u2 u3

u1x

t

u2,3

n n

n

n

u1x

t

u2 u3n n

x

t

n nu2u1 u3

Canonical Abstraction

x = NULL;

while (…) do {

t = malloc();

t next=x;

x = t

} u1x

t

u2,3n

n

n

Canonical Abstraction and Equality

• Summary nodes may represent more than one element

• (In)equality need not be preserved under abstraction

• Explicitly record equality

• Summary nodes are nodes with eq(u, u)=1/2

Canonical Abstraction and Equality

x = NULL;

while (…) do {

t = malloc();

t next=x;

x = t

}

u1x

t

u2 u3

u1x

t

u2,3

eq

eq

eq

n n

n

n

eq eq

eq

eq

eq

eqeq

u2,3

Canonical Abstraction

x = NULL;

while (…) do {

t = malloc();

t next=x;

x = t

}

u1x

t

u2 u3n n

u1x

t

u2,3n

n

Heap Sharing predicate

is(v)=0

u1x

t

u2 un…

u1x

t

u2..n

n

n

v:is(v) v1,v2: next(v1,v) next(v2,v) eq(v1, v2)

is(v)=0 is(v)=0

is(v)=0 is(v)=0

n n n

Heap Sharing predicate

is(v)=0

u1x

t

u2 un…

is(v)=1 is(v)=0

n n

n

n

u1x

t

u2n

is(v)=0 is(v)=1 is(v)=0

n

u3..n

n

n

v:is(v) v1,v2: next(v1,v) next(v2,v) eq(v1, v2)

Best Conservative Interpretation (CC79)

Abstraction

ConcretizationConcrete Representati

on

Collecting Interpretation

stc

ConcreteRepresentati

on

AbstractRepresentati

on

Abstract Representati

on

Abstract Interpretation

st#

yx

yx ...

Evaluateupdateformulas

y

x

y

x

...

inverse embedding

y

x

y

xCanonincalCanonincal abstraction

xy

“Focus”- Based Transformer (x = x n)

“Focus”-Based Transformer (x = x n)

y

x

y

x

EvaluateupdateFormulas (Kleene)

y

x

y

xCanonincal

yx

yx

Focus(x n)

“Partial ”

xy

Example: Abstract Interpretation

t

x

n

x

t n

x

tn n

xtt

x

ntt

nt

x

t

x

t

xempty

x

tn

n

x

tn

n

n

x

tn

t n

xn

x

tn

nreturn x

x = t

t =malloc(..);

tnext=x;

x = NULL

TF

Summary

Cleanness Analysis of Linked Lists (Nurit Dor, SAS 2000)

Static analysis of C programs manipulating linked lists

Defects checked References to freed memory Null dereferences Memory leaks

Existing algorithms are inadequate Miss errors Report too many false alarms

Dereference of NULL pointers

typedef struct element {

int value; struct element *next; } Elements

bool search(int value, Elements *c) {Elements *elem;for (elem = c;

c != NULL; elem = elem->next;)

if (elem->val == value)

return TRUE;return FALSE

Dereference of NULL pointers

typedef struct element {

int value; struct element *next; } Elements

bool search(int value, Elements *c) {Elements *elem;for (elem = c;

c != NULL; elem = elem->next;)

if (elem->val == value)

return TRUE;return FALSE

potential null dereference

Dereference of NULL pointers

typedef struct element {

int value; struct element *next; } Elements

bool search(int value, Elements *c) {Elements *elem;for (elem = c;

elem != NULL; elem = elem->next;)

if (elem->val == value)

return TRUE;return FALSE

potential null dereference

Memory leakage

Elements* reverse(Elements *c){Elements *h,*g;h = NULL;while (c!= NULL) {

g = c->next;h = c;c->next = h;c = g;

}return h;

typedef struct element {

int value; struct element *next; } Elements

Memory leakage

Elements* reverse(Elements *c){Elements *h,*g;h = NULL;while (c!= NULL) {

g = c->next;h = c;c->next = h;c = g;

}return h;

leakage of address pointed-by h

typedef struct element {

int value; struct element *next; } Elements

Memory leakage

Elements* reverse(Elements *c){Elements *h,*g;h = NULL;while (c!= NULL) {

g = c->next;h = c;c->next = h;c = g;

}return h;

typedef struct element {

int value; struct element *next; } Elements

✔ No memory leaks

Proving Correctness of Sorting Implementations (Lev-Ami, Reps, S, Wilhelm ISSTA 2000)

Partial correctness The elements are sorted The list is a permutation of the original list

Termination At every loop iterations the set of elements

reachable from the head is decreased

Example: InsertSortList InsertSort(List x) { List r, pr, rn, l, pl; r = x; pr = NULL; while (r != NULL) { l = x; rn = r n; pl = NULL; while (l != r) { if (l data > r data) { pr n = rn; r n = l; if (pl = = NULL) x = r; else pl n = r; r = pr; break; } pl = l; l = l n; } pr = r; r = rn; } return x; }

typedef struct list_cell { int data; struct list_cell *n;} *List;

assert sorted[x];

assert permutation[x,x’]

v:inOrder[dle, n](v) v1: next(v, v1) dle(v,v1)

Example: InsertSort

Run Demo

List InsertSort(List x) { if (x == NULL) return NULL pr = x; r = x->n; while (r != NULL) {

pl = x; rn = r->n; l = x->n; while (l != r) {

pr->n = rn ; r->n = l;

pl->n = r; r = pr; break; }

pl = l; l = l->n;

} pr = r; r = rn;

}

typedef struct list_cell { int data; struct list_cell *n;} *List;

14

Example: Mark and Sweep

void Sweep() { unexplored = Universe collected = while (unexplored ) { x = SelectAndRemove(unexplored) if (x marked) collected = collected {x} } assert(collected = = Universe – Reachset(root) )}

void Mark(Node root) { if (root != NULL) { pending = pending = pending {root} marked = while (pending ) { x = SelectAndRemove(pending) marked = marked {x} t = x left if (t NULL) if (t marked) pending = pending {t} t = x right if (t NULL) if (t marked) pending = pending {t} } } assert(marked = = Reachset(root))}

Challenges: Heap & Concurrency[Yahav POPL’01]

Concurrency with the heap is evil… Java threads are just heap allocated objectsData and control are strongly related

Thread-scheduling info may require understanding of heap structure (e.g., scheduling queue)

Heap analysis requires information about thread scheduling

Thread t1 = new Thread();Thread t2 = new Thread();…t = t1;…t.start();

Configurations – Example

at[l_C]

rval[myLock]

held_by

at[l_1]rval[myLock]

at[l_0]at[l_0]at[l_1]

rval[myLock]

blocked

l_0: while (true) {l_1: synchronized(myLock) {l_C: // critical sectionl_2: } l_3: }

Concrete Configuration

at[l_C]

rval[myLock]

held_by

at[l_1]rval[myLock]

at[l_0]at[l_0]

at[l_1]

rval[myLock]

blocked

Abstract Configuration

at[l_C]

rval[myLock]

held_byblocked

at[l_1]rval[myLock]

at[l_0]

Examples VerifiedProgram Property

twoLock Q No interferenceNo memory leaksPartial correctness

Producer/consumer No interferenceNo memory leaks

ApprenticeChallenge

Counter increasing

Dining philosophers with resource ordering

Absence of deadlock

Mutex Mutual exclusion

Web Server No interference

Compile-Time GC for Java(Ran Shaham, SAS’03)

The compiler can issue free when objects are no longer needed

Analysis of Java/JavaCard programsGeneric Java Bytecode FrontendPrecise analysis but expensive (cost of

procedures)

More Scalable Solution[Gilad Arnold]

Combine backward and forward analysis Forward analysis determine the shape Backward analysis locates heap liveness Use

E. YahavSchool of Computer Science

Tel-Aviv University

Verifying Safety Propertiesusing Separation

and Heterogeneous AbstractionsPLDI’04

G. RamalingamIBM T.J. Watson Research Center

Quick Overview:Fine Grained Heap Abstraction

HeapH

Precise ...... but often too expensive

Separation & Heterogeneous AbstractionFine-grainedabstraction

Coarseabstraction

Outline of Separation

Decompose verification problem into a set of subproblems

Adapt abstraction to each subproblem

Outline of Separation

Decompose verification problem into a set of subproblems• Analysis-user specifies a separation strategy

Adapt abstraction to each subproblem Heterogeneous abstraction

Prototype Implementation

Implemented over TVLACorrectness comes from the embedding theoremApplied to several example programs

Up to 5000 lines of Java

Used to verify Absence of concurrent modification exception (CME) JDBC API conformance IOStreams API conformance

Improved performance In some cases improved precision

Analysis Times

0

1000

2000

3000

4000

5000

6000

7000

8000

9000

10000

ISPath InputStream5 InputStream5b inputStream6JDBCExample

JDBCExample 2

db KernelBenchmark 1

SQLExecutor

Benchmark

Tim

e (s

ec)

vanilla

single

Ongoing Work

Temporal properties [Yahav, ESOP’03]Assume guarantee reasoning [Yorsh, VMCAI’04,

TACAS’04]Correctness of collection implementation

[Livshits]Automatic refinement [Loginov, ESOP’03] Integrating numeric domains [Gopan, TACAS’04]Procedure summaries [Jeannet, SAS’04]Sclalablity [Manevich SAS’04, Lev-Ami]Heap modularity [Bauer & Rinetzky]

Conclusion

FOTC is expressive for defining semanticsTVLA is a very useful research tool

Try before you publish Easy to try new algorithms

Sometimes faster than existing shape analyzers

But has high interpretation overhead Currently improved

Summary

http://www.cs.tau.ac.il/~tvla