Applications of TVLA
1
Applications of TVLA
Mooly Sagiv, Tel Aviv University
http://www.cs.tau.ac.il/~rumster/TVLA/
Shape Analysis with Applications
2
Outline
Issues
– Complexity of TVLA
– Weak vs. strong updates
Cleanness
– Null dereferences
– Memory leaks
– Freed storage (homework)
– The concurrent modification problem
Partial correctness
– Sorting
– GC
Total correctness
Flow dependences
Multithreading
Other
3
Complexity of Shape Analysis
x = malloc()
if (…) y1 = x
if (…) y2 = x
if (…) y3 = x
if (…) yn = x
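A small sketch (Python, not part of the slides) of why this fragment is a hard case: each of the n independent branches either aliases y_i to x or does not, so the number of distinct alias patterns doubles with every branch. The helper name `alias_sets` is hypothetical.

```python
from itertools import product

def alias_sets(n):
    """Enumerate the sets of y-variables that alias x after n independent branches."""
    results = set()
    for taken in product([False, True], repeat=n):
        # taken[i] is True when the i-th 'if' executed 'y_i = x'
        results.add(frozenset(f"y{i + 1}" for i, t in enumerate(taken) if t))
    return results

for n in range(1, 5):
    print(n, len(alias_sets(n)))  # the count is 2**n
```

An analysis that keeps every pattern separate therefore needs exponentially many structures for this program.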
4
Complexity of TVLA analysis
Maximal number of nodes in blurred structures
– 3^|A|, where A is the set of abstraction predicates
Size of 3-valued structure representation
Action cost
– Focus
– Precondition
– Coerce
– New
– Update
– Coerce
– Blur
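The Blur step can be sketched as follows, assuming a single binary predicate n and a dictionary encoding of structures (both are illustrative choices, not TVLA's actual data structures). Nodes that agree on all unary abstraction predicates are merged, and a binary predicate becomes 1/2 wherever the merged nodes disagree. This is why the number of abstract nodes depends on the predicates and not on the length of the list.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # the indefinite value 1/2

def join(values):
    """A definite value if all inputs agree, otherwise 1/2."""
    vals = set(values)
    return vals.pop() if len(vals) == 1 else HALF

def blur(nodes, unary, n_edges):
    """Merge nodes with equal unary predicate values (hypothetical encoding).

    nodes: list of node names; unary: {pred: {node: 1}}; n_edges: set of (u, v).
    Returns (groups, summary flags, abstract n predicate).
    """
    preds = sorted(unary)
    groups = {}
    for u in nodes:
        key = tuple(unary[p].get(u, 0) for p in preds)  # canonical name
        groups.setdefault(key, []).append(u)
    summary = {key: len(members) > 1 for key, members in groups.items()}
    abs_n = {
        (a, b): join(1 if (u, v) in n_edges else 0
                     for u in groups[a] for v in groups[b])
        for a in groups for b in groups
    }
    return groups, summary, abs_n

# A four-element list u0 -> u1 -> u2 -> u3 with x pointing to u0.
nodes = ["u0", "u1", "u2", "u3"]
unary = {"x": {"u0": 1}}
n_edges = {("u0", "u1"), ("u1", "u2"), ("u2", "u3")}
groups, summary, abs_n = blur(nodes, unary, n_edges)
print(len(groups), abs_n[((1,), (0,))])
```

The three tail nodes collapse into one summary node, and the n edge from the head into that summary node becomes indefinite.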
5
Weak vs. Strong Updates
if (…)
    x = y
else
    x = z
x->n = NULL
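A minimal sketch of the distinction, reading the last statement as a field write through x. When x definitely points to one node, the write overwrites the old value (strong update). When x only may point to a node, the old value must be joined with the new one (weak update). The dictionary encoding and the function name are assumptions made for illustration.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # "maybe"

def join(a, b):
    """Keep a value only if both inputs agree on it."""
    return a if a == b else HALF

def assign_field_null(x_points_to, has_next):
    """Abstractly apply a write of NULL to the n field of the node x points to.

    x_points_to: {node: 0, 1 or HALF}, whether x points to the node.
    has_next:    {node: 0, 1 or HALF}, whether the node's n field is non-NULL.
    """
    out = {}
    for node, old in has_next.items():
        px = x_points_to.get(node, 0)
        if px == 1:
            out[node] = 0             # strong update: x definitely points here
        elif px == 0:
            out[node] = old           # untouched
        else:
            out[node] = join(old, 0)  # weak update: x may point here
    return out

strong = assign_field_null({"a": 1, "b": 0}, {"a": 1, "b": 1})
weak = assign_field_null({"a": HALF, "b": HALF}, {"a": 1, "b": 1})
print(strong, weak)
```

After the branch, x may point to either node, so the weak update leaves both n fields indefinite.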
6
Detecting Incorrect Library Usages (J. Field, D. Goyal, G. Ramalingam, A. Warshavski)
Java provides libraries for manipulating data structures
Collections
– Lists
– HashSet
– …
Iterators over collections allow sequential access
Statically detect incorrect library usages

    Set s = worklist.unprocessedItems();
    for (Iterator i = s.iterator(); i.hasNext(); ) {
        Object item = i.next();
        if (...) processItem(item);
    }
7
The Concurrent Modification Problem
Static analysis of Java programs manipulating Java 2 collections
Inconsistent usages of iterators
– An Iterator object i is defined on a collection object c
– No use of i may be preceded by an update to the contents of c, unless the update was also made via i
– Guarantees order independence
8
Artificial Example
Set v = new Set();
Iterator i1 = v.iterator();
Iterator i2 = v.iterator();
Iterator i3 = i1;
i1.next();
i1.remove();
if (...) { i2.next(); }
if (...) { i3.next(); }
v.add("...");
if (...) { i1.next(); }
9
class Make {
    private Worklist worklist;

    public static void main(String[] args) {
        Make m = new Make();
        m.initializeWorklist(args);
        m.processWorklist();
    }

    void initializeWorklist(String[] args) {
        ...;
        worklist = new Worklist();
        ... // add some items to worklist
    }

    void processWorklist() {
        Set s = worklist.unprocessedItems();
        for (Iterator i = s.iterator(); i.hasNext(); ) {
            Object item = i.next();
            if (...) processItem(item);
        }
    }

    void processItem(Object i) { ...; doSubproblem(...); }

    void doSubproblem(...) {
        ...
        worklist.addItem(newitem); // updates the set that processWorklist() is iterating over
        ...
    }
}

public class Worklist {
    Set s;
    public Worklist() { ...; s = new HashSet(); ... }
    public void addItem(Object item) { s.add(item); }
    public Set unprocessedItems() { return s; }
}
10
Static Detection of Concurrent Modifications
Statically check for ConcurrentModificationException (CME)
Warn against potential CME
Sound (conservative) solution
Not too many false alarms
Coding in TVLA
– Operational semantics
– Vanilla solution is imprecise (and inefficient)
– Derive instrumentation predicates
Java to TVP front-end
– Extract potentially relevant client code
11
CME specification in Java

class Version { /* represents distinct versions of a Set */ }

class Collection {
    Version version;
    Collection() { version = new Version(); }
    boolean add(Object o) { version = new Version(); }
    Iterator iterator() { return new Iterator(this); }
}

class Iterator {
    Collection set;
    Version definingVersion;
    Iterator(Collection s) { definingVersion = s.version; set = s; }
    void remove() {
        requires (definingVersion == set.version);
        set.version = new Version();
        definingVersion = set.version;
    }
    Object next() {
        requires (definingVersion == set.version);
    }
}
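The versioning specification can be exercised directly. The following is a Python transliteration of it, offered as a sketch. The `requires` clauses become runtime checks that raise a hypothetical `CME` exception, and `run_example` replays the artificial example with every conditional taken.

```python
class Version:
    """Represents one distinct version of a collection."""

class CME(Exception):
    """Stands in for ConcurrentModificationException."""

class Collection:
    def __init__(self):
        self.version = Version()
    def add(self, o):
        self.version = Version()   # every update creates a fresh version
    def iterator(self):
        return Iterator(self)

class Iterator:
    def __init__(self, s):
        self.defining_version = s.version
        self.set = s
    def _require_valid(self):
        if self.defining_version is not self.set.version:
            raise CME()
    def remove(self):
        self._require_valid()
        self.set.version = Version()
        self.defining_version = self.set.version  # this iterator stays valid
    def next(self):
        self._require_valid()

def raises_cme(action):
    try:
        action()
        return False
    except CME:
        return True

def run_example():
    v = Collection()
    i1, i2 = v.iterator(), v.iterator()
    i3 = i1                                # alias of the same iterator object
    i1.next()
    i1.remove()
    results = {"i2.next": raises_cme(i2.next), "i3.next": raises_cme(i3.next)}
    v.add("...")
    results["i1.next"] = raises_cme(i1.next)
    return results

print(run_example())
```

Under this model i2 is stale after i1.remove(), i3 is still valid because it is the same object as i1, and every iterator is stale after v.add().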
12
Vanilla TVLA Encoding
Local iterators are pointers
– Unary predicates
Relevant fields are pointer selectors
– Binary predicates
13
Artificial Example
Set v = new Set();
Iterator i1 = v.iterator();
Iterator i2 = v.iterator();
Iterator i3 = i1;
i1.next();
i1.remove();
if (...) { i2.next(); }
if (...) { i3.next(); }
v.add("...");
if (...) { i1.next(); }
14
Improved TVLA Encodings
• Use reachability
• Explicitly maintain relevant information
• valid[i] = (i.defVersion == i.set.version)
• iterOf[i, v] = (i.set == v)
• mutex[i, j] = (i.set == j.set && i != j)
• same[v, w] = (v == w)
• Can be automatically derived from the specification
• Polynomial complexity in programs where iterators are not stored in the client heap
  – Meet-over-all-paths solution
• Adaptive to programs with client heap
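A rough sketch of how facts such as valid[i], iterOf[i, v] and mutex[i, j] can be maintained directly over iterator variables, without modelling Version objects. The operation encoding and the function `check` are hypothetical. Conditionals are ignored, which is harmless for the artificial example because the guarded statements are next() calls that do not change these facts.

```python
def check(ops):
    """Return the indices of operations that may raise CME.

    ops is a list of tuples:
      ("iterator", i, v)  i = v.iterator()
      ("copy", j, i)      j = i
      ("next", i), ("remove", i), ("add", v)
    """
    iter_of, obj, valid = {}, {}, {}  # facts per iterator variable
    warnings = []
    fresh = 0
    for k, op in enumerate(ops):
        kind = op[0]
        if kind == "iterator":
            _, i, v = op
            iter_of[i], obj[i], valid[i] = v, fresh, True
            fresh += 1
        elif kind == "copy":
            _, j, i = op
            iter_of[j], obj[j], valid[j] = iter_of[i], obj[i], valid[i]
        elif kind in ("next", "remove"):
            i = op[1]
            if not valid[i]:
                warnings.append(k)
            elif kind == "remove":
                for j in valid:
                    # mutex[i, j]: same collection, different iterator object
                    if iter_of[j] == iter_of[i] and obj[j] != obj[i]:
                        valid[j] = False
        elif kind == "add":
            for j in valid:
                if iter_of[j] == op[1]:
                    valid[j] = False
    return warnings

example = [
    ("iterator", "i1", "v"), ("iterator", "i2", "v"), ("copy", "i3", "i1"),
    ("next", "i1"), ("remove", "i1"),
    ("next", "i2"), ("next", "i3"),
    ("add", "v"),
    ("next", "i1"),
]
print(check(example))
```

The warnings point at i2.next() after i1.remove() and at i1.next() after v.add(), while i3.next() is accepted because i3 aliases i1.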
15
Empirical Results
Benchmark | LOC | Err. | FA | Time (sec) | Space (MB) | Structs.
Kernel 68315060194363
MapTest 3351061204937
IteratorTest 126000.234208
JFE 289611236499878
16
Partial Correctness
{P} S {Q}
How to derive loop invariants
Abstract interpretation provides a sound solution
The abstract domain represents a class of program invariants
17
Example
Sorting of linked lists
dle(v1, v2) = v1.data ≤ v2.data
inOrder[n, dle](v) = ∀v1: n(v, v1) → dle(v, v1)
inROrder[n, dle](v) = ∀v1: n(v, v1) → dle(v1, v)
Captures intermediate invariants as well

typedef struct node {
    struct node *n;
    int data;
} *Elements;
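On a concrete list the two predicates are easy to evaluate. A minimal sketch, assuming the list is given as two dictionaries (an illustrative encoding): inOrder holds at a node when its data is at most the data of its successor, and vacuously when it has no successor.

```python
def dle(data, v1, v2):
    """dle(v1, v2): v1.data <= v2.data"""
    return data[v1] <= data[v2]

def in_order(data, nxt):
    """inOrder[dle,n](v) for every node v; nxt maps a node to its n-successor or None."""
    result = {}
    for v in data:
        succ = nxt.get(v)
        result[v] = succ is None or dle(data, v, succ)
    return result

# List a -> b -> c holding 1, 3, 2.
data = {"a": 1, "b": 3, "c": 2}
nxt = {"a": "b", "b": "c", "c": None}
local = in_order(data, nxt)
print(local, all(local.values()))
```

The list is sorted exactly when inOrder holds at every node, which is what lets a local predicate express a global property.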
18
typedef struct node {
    struct node *n;
    int data;
} *Elements;

L insert_sort(L x) {
    L r, pr, rn, l, pl;
    r = x;
    pr = NULL;
    while (r != NULL) {
        l = x;
        rn = r->n;
        pl = NULL;
        while (l != r) {
            if (l->data > r->data) {
                pr->n = rn;
                r->n = l;
                if (pl == NULL) x = r;
                else pl->n = r;
                r = pr;
                break;
            }
            pl = l;
            l = l->n;
        }
        pr = r;
        r = rn;
    }
    return x;
}
19
L insert_sort(L x) {
    L r, pr, rn, l, pl;
    …
    return x;
}

[Figure: two abstract list structures pointed to by x, with n and dle edges and r[n,x] on the nodes. In the first, inOrder[dle,n] = 1/2; in the second, inOrder[dle,n] holds.]
20
/* pred.tvp */
foreach (z in PVar) {
    %p z(v_1) unique box
}
%p n(v_1, v_2) function
%i is[n](v) = E(v_1, v_2) (v_1 != v_2 & n(v_1, v) & n(v_2, v))
foreach (z in PVar) {
    %i r[n,z](v) = E(v_1) (z(v_1) & n*(v_1, v))
}
%i c[n](v) = n+(v, v)
%p dle(v1, v2) reflexive transitive
%i inOrder[dle,n](v) = A(v_1) n(v, v_1) -> dle(v, v_1) nonabs
%i inROrder[dle,n](v) = A(v_1) n(v, v_1) -> dle(v_1, v) nonabs
%r !dle(v_1, v_2) ==> dle(v_2, v_1)
21
/* cond.tvp */
%action uninterpreted() {
    %t "uninterpreted"
}
%action Is_Not_Null_Var(x1) {
    %t x1 + " != NULL"
    %f { x1(v) }
    %p E(v) x1(v)
}
%action Is_Null_Var(x1) {
    %t x1 + " == NULL"
    %f { x1(v) }
    %p !(E(v) x1(v))
}
%action Is_Eq_Var(x1, x2) {
    %t x1 + " == " + x2
    %f { x1(v), x2(v) }
    %p A(v) x1(v) <-> x2(v)
}
%action Is_Not_Eq_Var(x1, x2) {
    %t x1 + " != " + x2
    %f { x1(v), x2(v) }
    %p !A(v) x1(v) <-> x2(v)
}
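The preconditions of these actions are evaluated in Kleene's 3-valued logic. A minimal sketch, assuming the values are ordered 0 < 1/2 < 1 so that an existential quantifier is a maximum over nodes and negation is 1 − a. The function names are hypothetical, and only the two null-test preconditions are modelled.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # the indefinite value

def exists(values):
    """3-valued existential quantifier: maximum over all nodes."""
    return max(values, default=0)

def neg(a):
    """3-valued negation: swaps 0 and 1, leaves 1/2 unchanged."""
    return 1 - a

def is_not_null(x1):
    """Precondition of Is_Not_Null_Var: E(v) x1(v)."""
    return exists(x1.values())

def is_null(x1):
    """Precondition of Is_Null_Var: !(E(v) x1(v))."""
    return neg(exists(x1.values()))

definite = {"u": 1, "w": 0}        # x1 definitely points to u
maybe = {"u": HALF, "w": 0}        # x1 may point to u
print(is_not_null(definite), is_null(definite))
print(is_not_null(maybe), is_null(maybe))
```

When a precondition evaluates to 1/2, both branches of the condition have to be explored. The focus formulae %f exist to reduce how often that happens.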
22
%action Greater_Data_L(x1, x2) {
    %t x1 + "->data > " + x2 + "->data"
    %f { x1(v_1) & x2(v_2) & dle(v_1, v_2) }
    %p !E(v_1, v_2) x1(v_1) & x2(v_2) & dle(v_1, v_2)
}
%action Less_Equal_Data_L(x1, x2) {
    %t x1 + "->data <= " + x2 + "->data"
    %f { x1(v_1) & x2(v_2) & dle(v_1, v_2) }
    %p E(v_1, v_2) x1(v_1) & x2(v_2) & dle(v_1, v_2)
}
23
stat.tvp
%action Set_Next_Null_L(x1) {
    %t x1 + "->" + n + " = null"
    %f { x1(v) }
    %message !(E(v) x1(v)) ->
    {
        n(v_1, v_2) = ...
        is[n](v) = ...
        r[n,x1](v) = ...
        foreach (z in PVar – {x}) {
            r[n, x](v) = ...
        }
        c[n](v) = ...
        inOrder[dle,n](v) = inOrder[dle,n](v) | x1(v)
        inROrder[dle,n](v) = inROrder[dle,n](v) | x1(v)
    }
}
24
stat.tvp (more)
%action Malloc_L(x1) {
    %t x1 + " = (L) malloc(sizeof(struct node)) "
    %new
    {
        x1(v) = isNew(v)
        inOrder[dle, n](v1, v2) = …
        inROrder[dle, n](v1, v2) = …
    }
}
25
Abstract interpretation of
if (x->data <= y->data)
26
From Local Outlook to Global Outlook
(Safety) Every time control reaches a given point:
– there are no garbage memory cells
– the list is acyclic
– each cell is locally ordered
(History) The list is a permutation of the original list
27
Bugs Found
Pointer manipulations
– null dereferences
– memory leaks
Forgetting to sort the first element
Swapping equal elements in bubble sort (non-termination)
28
L insert_sort_b2(L x) {
    L r, pr, rn, l, pl;
    if (x == NULL) return NULL;
    pr = x;
    r = x->n;
    while (r != NULL) {
        pl = x;
        rn = r->n;
        l = x->n;
        while (l != r) {
            if (l->d > r->d) {
                pr->n = rn;
                r->n = l;
                pl->n = r;
                r = pr;
                break;
            }
            pl = l;
            l = l->n;
        }
        pr = r;
        r = rn;
    }
    return x;
}

[Figure: abstract structure for the list pointed to by x, with n and dle edges; inOrder[dle,n] = 1/2 on one node and inOrder[dle,n] = 1 on the other.]
29
Running Times

Procedure     Total #structures   CPU time (seconds)
create        9                   0.45
insert_sort   2963                158.695
merge         1238                74.092
reverse       87                  2.26
insert_b1     8198                627.309
insert_b2     1823                114.474
bubble        4350                245.74
bubble_b      4794                295.3
quicksort     >100,000
30
Properties Not Proved
(Liveness) Termination
Stability
31
Example: Mark and Sweep

void Sweep() {
    unexplored = Universe
    collected = ∅
    while (unexplored ≠ ∅) {
        x = SelectAndRemove(unexplored)
        if (x ∉ marked)
            collected = collected ∪ {x}
    }
    assert(collected == Universe - Reachset(root))
}

void Mark(Node root) {
    if (root != NULL) {
        pending = ∅
        pending = pending ∪ {root}
        marked = ∅
        while (pending ≠ ∅) {
            x = SelectAndRemove(pending)
            marked = marked ∪ {x}
            t = x→left
            if (t ≠ NULL)
                if (t ∉ marked)
                    pending = pending ∪ {t}
            t = x→right
            if (t ≠ NULL)
                if (t ∉ marked)
                    pending = pending ∪ {t}
        }
    }
    assert(marked == Reachset(root))
}
Run Demo
32
Total Correctness
Usually more complicated
Need to show that something good eventually happens
Difficult for programs with unbounded concrete states
Example: linked lists
– Show decreased set of reachable locations
33
Program Dependences
A statement s1 depends on s2 if
– s2 writes into a location l
– s1 reads from location l
– There is no intervening write in between
Useful for
– Parallelization
– Scheduling
– Program slicing
How to compute
– Scalars
– Stack pointers
– Heap allocated pointers
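On a single concrete execution the definition can be computed by remembering the last writer of every location. A small sketch, where the trace encoding and the labels s1 to s4 are assumptions. The trace models q = malloc(); p = q; p->d = 5; y = q->d, with the heap cell written as the location ("o1", "d").

```python
def flow_dependences(trace):
    """trace: list of (label, reads, writes); returns a set of (reader, writer) pairs."""
    last_write = {}
    deps = set()
    for label, reads, writes in trace:
        for loc in reads:
            if loc in last_write:          # no intervening write since then
                deps.add((label, last_write[loc]))
        for loc in writes:
            last_write[loc] = label        # this statement is now the last writer
    return deps

trace = [
    ("s1", [], ["q"]),                     # q = malloc()
    ("s2", ["q"], ["p"]),                  # p = q
    ("s3", ["p"], [("o1", "d")]),          # p->d = 5
    ("s4", ["q", ("o1", "d")], ["y"]),     # y = q->d
]
print(sorted(flow_dependences(trace)))
```

The dependence of s4 on s3 exists only because p and q refer to the same cell. A static analysis has to establish this without running the program, which is what makes heap-allocated pointers the hard case.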
34
Flow Dependences vs. May-Aliases

int y;
List p, q;
q = (List) malloc();
p = q;
p->d = 5;
…
y = q->d;

int y;
List p, q;
q = (List) malloc();
p = q;
t = p;
p->d = 5;
t->d = 7;
y = q->d;

int y;
List p, q;
q = (List) malloc();
p = q;
t = p;
p->d = 5;
p = (List) malloc();
y = q->d;
35
void append() {
    List head, tail, temp;
    l1:  head = (List) malloc();
    l2:  scanf("%c", &head->d);
    l3:  head->n = NULL;
    l4:  tail = head;
    l5:  if (tail->d == 'x') goto l12;
    l6:  temp = (List) malloc();
    l7:  scanf("%c", &temp->d);
    l8:  temp->n = NULL;
    l9:  tail->n = temp;
    l10: tail = tail->n;
    l11: goto l5;
    l12: printf("%c", head->d);
    l13: printf("%c", tail->d);
    exit
}
36
/* pred.tvp */
foreach (z in PVar) {
    %p z(v_1) unique box
}
%p n(v_1, v_2) function
%i is[n](v) = E(v_1, v_2) (v_1 != v_2 & n(v_1, v) & n(v_2, v))
foreach (z in PVar) {
    %i r[n,z](v) = E(v_1) (z(v_1) & n*(v_1, v))
    foreach (l in Label) {
        %p lst_w_v[l,z]() // l is the last write into the variable z
    }
    foreach (l in Label) {
        %p lst_w_f[l,n](v_1) box // l is the last write into v_1.n
        %p lst_w_f[l,d](v_1) box // l is the last write into v_1.data
    }
}
%i c[n](v) = n+(v, v)
37
Operational Semantics for Statements

st               update
l: x = rhs       lst_w_v[l, x]’ = 1
                 lst_w_v[l’, x] = 0
l: x->d = rhs    lst_w_f[l, d](v) = (x(v) ? 1 : lst_w_f[l, d](v))
                 lst_w_f[l’, d](v) = (x(v) ? 0 : lst_w_f[l’, d](v))
l: x->n = rhs    lst_w_f[l, n](v) = (x(v) ? 1 : lst_w_f[l, n](v))
                 lst_w_f[l’, n](v) = (x(v) ? 0 : lst_w_f[l’, n](v))
38
“Read” Formulae for Statements

exp                 formula
l’: x := NULL       0
l’: x := y          lst_w_v[l, y]
l’: x->n = NULL     lst_w_v[l, x]
l’: x->n = y        lst_w_v[l, x] ∨ lst_w_v[l, y]
l’: x = y->n        lst_w_v[l, y] ∨ ∃v: y(v) ∧ lst_w_f[l, n](v)
39
void append() {
    List head, tail, temp;
    l1:  head = (List) malloc();
    l2:  scanf("%c", &head->d);
    l3:  head->n = NULL;
    l4:  tail = head;
    l5:  if (tail->d == 'x') goto l12;
    l6:  temp = (List) malloc();
    l7:  scanf("%c", &temp->d);
    l8:  temp->n = NULL;
    l9:  tail->n = temp;
    l10: tail = tail->n;
    l11: goto l5;
    l12: printf("%c", head->d);
    l13: printf("%c", tail->d);
    exit
}

[Figure: list structure with head, temp and tail, annotated with lst_w_v[l1, head], lst_w_v[l10, tail], lst_w_v[l6, temp], lst_w_f[l2, d], lst_w_f[l9, n], lst_w_f[l7, d] and lst_w_f[l8, n].]
40
Java Concurrency
Threads and locks are just dynamically allocated objects
synchronized implements mutual exclusion
wait, notify and notifyAll coordinate activities across threads
41
Example - Mutual Exclusion

l_0: while (true) {
l_1:     synchronized(sharedLock) {
l_C:         // critical actions
l_2:     }
l_3: }

Two threads: (pc1, pc2, lockAcquired1, lockAcquired2)
Allocate new lock?
Allocate new thread?
42
Program Model
Interleaving model of concurrency
Program is modeled as a transition system
43
Configurations
A program configuration encodes:
– global store
– program-location of every thread
– status of locks and threads
First-order logical structures are used to represent program configurations
44
Configurations
Predicates model properties of interest
– is_thread(t)
– { at[lab](t) : lab ∈ Labels }
– { rval[fld](o1, o2) : fld ∈ Fields }
– held_by(l, t)
– blocked(t, l)
– waiting(t, l)
Can use the framework with different predicates
45
Configurations

[Figure: configuration with five thread nodes (is_thread), one at[l_C], two at[l_1] and two at[l_0], connected to a lock by rval[this], held_by and blocked edges.]
46
Configurations
Program control-flow is not separately represented
Program location for each thread is encoded inside the configuration
– { at[lab](t) : lab ∈ Labels }
47
Structural Operational Semantics - actions
An action consists of:
– precondition (when) formula
– update formulae
Precondition formula may use a free variable ts for the “currently scheduled” thread
Semantics is non-deterministic
48
Structural Operational Semantics - actions

lock(v)
precondition:
    ¬∃t ≠ ts: rval[v](ts, l) ∧ held_by(l, t)
predicate update:
    held_by’(l1, t1) = held_by(l1, t1) ∨ (l1 = l ∧ t1 = ts)
    blocked’(t1, l1) = blocked(t1, l1) ∧ ((l1 ≠ l) ∨ (t1 ≠ ts))
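A sketch of the lock(v) action on a concrete configuration, reading its precondition as "no thread other than ts holds the lock that ts refers to through v". The dictionary-and-set encoding of configurations and both function names are assumptions made for illustration.

```python
def lock_enabled(cfg, ts, v):
    """Precondition: no thread other than ts holds the lock l with rval[v](ts, l)."""
    l = cfg["rval"][(v, ts)]
    return all(t == ts for (lk, t) in cfg["held_by"] if lk == l)

def lock(cfg, ts, v):
    """Apply the update formulae; return None when the action is not enabled."""
    if not lock_enabled(cfg, ts, v):
        return None
    l = cfg["rval"][(v, ts)]
    return {
        "rval": cfg["rval"],
        "held_by": cfg["held_by"] | {(l, ts)},   # held_by' adds (l, ts)
        "blocked": cfg["blocked"] - {(ts, l)},   # blocked' drops (ts, l)
    }

cfg0 = {
    "rval": {("this", "t1"): "L", ("this", "t2"): "L"},
    "held_by": set(),
    "blocked": {("t2", "L")},
}
cfg1 = lock(cfg0, "t1", "this")
print(cfg1["held_by"], cfg1["blocked"])
print(lock(cfg1, "t2", "this"))  # t1 holds L, so the action is not enabled for t2
```

Choosing which thread plays the role of ts is what makes the semantics non-deterministic. An analysis would apply the action for every thread that satisfies the precondition.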
49
Safety Properties
Configuration-local property as logical formula
Example: no total deadlock
    ∃t ∀lb: is_thread(t) ∧ ¬blocked(t, lb)
Example: mutual exclusion
    ∀t1, t2: (t1 ≠ t2) → (at[lcrit](t1) → ¬at[lcrit](t2))
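Both example properties can be checked on one concrete configuration at a time. A minimal sketch, with thread names t1 to t5, the lock name L and the encoding all assumed for illustration: mutual exclusion means at most one thread is at the critical label, and the absence of total deadlock means at least one thread is not blocked on any lock.

```python
def mutual_exclusion(threads, at, crit):
    """At most one thread is at the critical label."""
    return sum(1 for t in threads if at[t] == crit) <= 1

def no_total_deadlock(threads, blocked):
    """Some thread is not blocked on any lock; blocked is a set of (thread, lock)."""
    blocked_threads = {t for (t, _lock) in blocked}
    return any(t not in blocked_threads for t in threads)

threads = ["t1", "t2", "t3", "t4", "t5"]
at = {"t1": "l_C", "t2": "l_1", "t3": "l_0", "t4": "l_0", "t5": "l_1"}
blocked = {("t2", "L"), ("t5", "L")}

print(mutual_exclusion(threads, at, "l_C"), no_total_deadlock(threads, blocked))
```

A configuration-local property of this kind is verified by checking it on every configuration the analysis reaches.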
50
Concrete Configuration

[Figure: concrete configuration with five thread nodes (is_thread), one at[l_C], two at[l_1] and two at[l_0], connected to a lock by rval[this], held_by and blocked edges.]
51
Abstract Configuration

[Figure: abstract configuration with three thread nodes (is_thread), at[l_C], at[l_1] and at[l_0], connected to the lock by rval[this], held_by and blocked edges.]
Safety Properties Revisited
RW interference
WW interference
Total deadlock
Nested monitors
Illegal thread interactions
53
Interprocedural Analysis (Rinetzky)
Model the stack as a linked list (CC 2001)
– Observe alias patterns
– Handles recursion with pointers from the stack to the heap (but rather slow)
Exploit referential transparency
– The part of the store modified by a procedure is limited
– Summarize irrelevant calling contexts
– Pre-analyze Abstract Data Types
  » Analyzed parts of LEDA linked lists
54
Other TVLA Applications
Handling trees (G. Yorsh)
Allowing data structure specification (M. Rinard)
Refinement of implementations (A. Mulhren)
Mobile Ambients (F. Nielson & H.R. Nielson)
55
TVLA Generalizations
ITVLA (F. Dimaio, N. Dor)
– Support arithmetic operations
– Import static domains for integer analysis
  » Intervals
  » Polyhedra
– Derive Update Formula for Instrumentation Predicates (A. Loginov, T. Reps)
56
Summary
TVLA supports design and prototyping of sophisticated static analyzers
– “The heap is your friend”
– First-order logic specifications are natural
– Powerful static domain
But scaling is an issue
– Interprocedural analysis
– User specification
– Predictability
– User interface
Debuggability of operational semantics