Quantified Invariant Generation using an
Interpolating Saturation Prover
Ken McMillan
Cadence Research Labs
Introduction
• Interpolants derived from proofs can provide an effective relevance heuristic for constructing inductive invariants
  – Provides a way of generalizing proofs about bounded behaviors to the unbounded case
    • Exploits a prover’s ability to focus on relevant facts
  – Used in various applications, including
    • Hardware verification (propositional case)
    • Predicate abstraction (quantifier-free)
    • Program verification (quantifier-free)
• This talk
  – Moving to the first-order case, including FO(TC)
  – Modifying SPASS to create an interpolating FO prover
  – Apply to program verification with arrays, linked lists
Invariants from unwindings
• Consider this very simple approach:
– Partially unwind a program into a loop-free, in-line program
– Construct a Floyd/Hoare proof for the in-line program
– See if this proof contains an inductive invariant proving the property
• Example program:
    x = y = 0;
    while(*) { x++; y++; }
    while(x != 0) { x--; y--; }
    assert (y == 0);
  invariant: {x == y}
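To make the role of the invariant concrete, here is a minimal Python sketch (not part of the talk) that simulates the example program and checks the candidate invariant x == y at both loop heads. The function names and the bound are illustrative assumptions, and the check is a bounded test rather than a proof.

```python
def run_program(n):
    """Simulate the example, taking the nondeterministic first loop n times."""
    x = y = 0
    for _ in range(n):           # while(*) { x++; y++; }
        assert x == y            # candidate invariant at the loop head
        x += 1
        y += 1
    while x != 0:                # while(x != 0) { x--; y--; }
        assert x == y            # same invariant for the second loop
        x -= 1
        y -= 1
    return y                     # the program asserts y == 0 at this point


def invariant_is_inductive(bound=20):
    """Bounded check that x == y is preserved by both loop bodies
    and implies the assertion on exit (x == 0)."""
    for x in range(-bound, bound + 1):
        y = x                                    # any state with x == y
        if x + 1 != y + 1:                       # body of the first loop
            return False
        if x != 0 and x - 1 != y - 1:            # body of the second loop
            return False
        if x == 0 and y != 0:                    # exit condition gives y == 0
            return False
    return True
```

Calling `run_program(n)` for a few values of `n` exercises the unwindings that the approach above would generate.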
Unwind the loops

  Unwound program          Proof 1                Proof 2
                           {True}                 {True}
  x = y = 0;
                           {x = 0 ∧ y = 0}        {y = 0}
  x++; y++;
                           {x = y}                {y = 1}
  x++; y++;
                           {x = y}                {y = 2}
  [x!=0]; x--; y--;
                           {x = y}                {y = 1}
  [x!=0]; x--; y--;
                           {x = 0 ⇒ y = 0}        {y = 0}
  [x == 0] [y != 0]
                           {False}                {False}

Proof of inline program contains invariants for both loops (Proof 1)
• Assertions may diverge as we unwind (Proof 2)
• A practical method must somehow prevent this kind of divergence!
Interpolation Lemma
• If A ∧ B = false, there exists an interpolant A' for (A,B) such that:
  – A implies A'
  – A' is inconsistent with B
  – A' is expressed over the common vocabulary of A and B
[Craig,57]
A variety of techniques exist for deriving an interpolant from a refutation of A ∧ B, generated by a theorem prover.
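The three conditions of the lemma can be checked by brute force in the propositional case. The sketch below is only an illustration: the formulas A = p ∧ q and B = ¬q ∧ r and the helper name `is_interpolant` are assumptions, and the vocabulary condition is tested semantically (the candidate must not depend on non-shared variables) rather than syntactically.

```python
from itertools import product


def is_interpolant(a, b, itp, variables, shared):
    """Brute-force check of the interpolant conditions.

    a, b, itp: functions from an assignment dict to bool.
    variables: all variable names; shared: names common to A and B.
    """
    for values in product([False, True], repeat=len(variables)):
        env = dict(zip(variables, values))
        if a(env) and not itp(env):        # A implies A'
            return False
        if itp(env) and b(env):            # A' is inconsistent with B
            return False
        # A' may only depend on shared variables: flipping any other
        # variable must not change its truth value.
        for v in variables:
            if v not in shared:
                flipped = dict(env)
                flipped[v] = not env[v]
                if itp(env) != itp(flipped):
                    return False
    return True


# Illustrative formulas (assumed): A = p AND q, B = (NOT q) AND r.
A = lambda e: e["p"] and e["q"]
B = lambda e: (not e["q"]) and e["r"]
```

With these formulas, `q` satisfies all three conditions, while `p` does not, since `p` is consistent with B and is not in the common vocabulary.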
Interpolants as Floyd-Hoare proofs
Proving in-line programs: SSA sequence → Prover → proof → Interpolation → Hoare proof

  Program       SSA sequence       Interpolants       Hoare proof
                                   True               {True}
  x = y;        x1 = y0
                                   x1 = y0            {x = y}
  y++;          y1 = y0 + 1
                                   y1 > x1            {y > x}
  [x == y]      x1 = y1
                                   False              {False}

1. Each formula implies the next
2. Each is over common symbols of prefix and suffix
3. Begins with true, ends with false
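As a sanity check of this example, the following Python sketch tests that each interpolant, together with the next SSA constraint, implies the next interpolant. It enumerates a small integer range (an assumption made for illustration), so it is a bounded test, not a proof; the names `steps`, `interpolants` and `check_sequence` are hypothetical.

```python
from itertools import product

# SSA constraints of the in-line program  x = y; y++; [x == y]
steps = [
    lambda s: s["x1"] == s["y0"],         # x = y
    lambda s: s["y1"] == s["y0"] + 1,     # y++
    lambda s: s["x1"] == s["y1"],         # [x == y]
]

# Candidate interpolant sequence, one formula per program point
interpolants = [
    lambda s: True,
    lambda s: s["x1"] == s["y0"],
    lambda s: s["y1"] > s["x1"],
    lambda s: False,
]


def check_sequence(steps, interpolants, names, lo=-3, hi=3):
    """Check that interpolants[i] and steps[i] imply interpolants[i+1]
    on every valuation of `names` drawn from the range [lo, hi]."""
    for values in product(range(lo, hi + 1), repeat=len(names)):
        s = dict(zip(names, values))
        for i, step in enumerate(steps):
            if interpolants[i](s) and step(s) and not interpolants[i + 1](s):
                return False
    return True
```

Replacing the third formula by `True` makes the check fail, because `True` together with x1 = y1 does not imply `False`.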
Need for quantified interpolants
• Existing interpolating provers cannot produce quantified interpolants
• Problem: how to prevent the number of quantifiers from diverging in the same way that constants diverge when we unwind the loops?
• For linked structures we also require a theory of reachability (in effect, transitive closure)
    for(i = 0; i < N; i++) a[i] = i;
    for(j = 0; j < N; j++) assert a[j] == j;
  invariant: {∀x. 0 ≤ x ∧ x < i ⇒ a[x] = x}
Can we build an interpolating prover for full FOL that handles reachability, and avoids divergence?
Clausal provers
• A clausal refutation prover takes a set of clauses and returns a proof of unsatisfiability (i.e., a refutation) if possible.
• A prover is based on inference rules of this form:

    P1 ... Pn
    ---------
        C

• where P1 ... Pn are the premises and C the conclusion.
• A typical inference rule is resolution, of which this is an instance:

    p(a)    p(U) → q(U)
    -------------------
           q(a)

• This was accomplished by unifying p(a) and p(U), then dropping the complementary literals.
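The resolution instance above can be reproduced with a small unification routine. This is a minimal sketch under stated assumptions: terms are nested tuples such as `("p", "a")`, variables are strings starting with an upper-case letter (as in `U`), the occurs check is omitted, and `resolve` handles only a unit clause against a single implication. None of these names come from the talk or from SPASS.

```python
def is_var(t):
    """Variables are strings starting with an upper-case letter, as in p(U)."""
    return isinstance(t, str) and t[:1].isupper()


def walk(t, subst):
    """Follow variable bindings until reaching a non-bound term."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t


def unify(s, t, subst=None):
    """Return a substitution unifying s and t, or None (no occurs check)."""
    subst = dict(subst or {})
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return subst
    if is_var(s):
        subst[s] = t
        return subst
    if is_var(t):
        subst[t] = s
        return subst
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        for a, b in zip(s[1:], t[1:]):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None


def apply(t, subst):
    """Apply a substitution to a term."""
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(a, subst) for a in t[1:])
    return t


def resolve(fact, rule):
    """Resolve a unit clause `fact` against `rule` = (premise, conclusion)."""
    premise, conclusion = rule
    subst = unify(fact, premise)
    return None if subst is None else apply(conclusion, subst)
```

Resolving `("p", "a")` against `(("p", "U"), ("q", "U"))` unifies `U` with `a` and yields `("q", "a")`.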
Superposition calculus
Modern FOL provers are based on the superposition calculus
  – example superposition inference:

      Q(a)    P → (a = c)
      -------------------
           P → Q(c)

  – this is just substitution of equals for equals
  – in practice this approach generates a lot of substitutions!
  – use reduction order to reduce number of inferences
Reduction orders
• A reduction order ≻ is:
  – a total, well founded order on ground terms
  – subterm property: f(a) ≻ a
  – monotonicity: a ≻ b implies f(a) ≻ f(b)
• Example: Recursive Path Ordering (with Status) (RPOS)
  – start with a precedence on symbols: a ≻ b ≻ c ≻ f
  – induces a reduction ordering on ground terms:
    f(f(a)) ≻ f(a) ≻ a ≻ f(b) ≻ b ≻ c ≻ f
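To see how a precedence induces an order on ground terms, here is a sketch of the lexicographic path ordering (LPO), which is the special case of a recursive path ordering in which every symbol has lexicographic status. It is an illustration only and not the ordering code used in any prover; the tuple encoding of terms and the numeric precedence map are assumptions.

```python
def lpo_greater(s, t, prec):
    """Lexicographic path ordering on ground terms.

    Terms are tuples (symbol, arg1, ..., argn); constants are 1-tuples.
    prec maps each symbol to an integer; larger means higher precedence.
    """
    if s == t:
        return False
    f, s_args = s[0], s[1:]
    g, t_args = t[0], t[1:]
    # Case 1: some argument of s is equal to or greater than t.
    if any(a == t or lpo_greater(a, t, prec) for a in s_args):
        return True
    # The remaining cases need s to dominate every argument of t.
    if not all(lpo_greater(s, b, prec) for b in t_args):
        return False
    if prec[f] > prec[g]:             # Case 2: head of s has higher precedence
        return True
    if f == g:                        # Case 3: compare arguments left to right
        for a, b in zip(s_args, t_args):
            if a != b:
                return lpo_greater(a, b, prec)
    return False


# Precedence a > b > c > f from the example, encoded as integers.
PREC = {"a": 4, "b": 3, "c": 2, "f": 1}
a, b, c = ("a",), ("b",), ("c",)
chain = [("f", ("f", a)), ("f", a), a, ("f", b), b, c]
```

Under this encoding, every term in `chain` is greater than all terms that follow it, matching the start of the chain shown in the example.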
Ordering Constraint
• Constrains rewrites to be “downward” in the reduction order:

      Q(a)    P → (a = c)
      -------------------
           P → Q(c)

  These terms must be maximal in their clauses
  example: this inference only possible if a ≻ c
Thm: Superposition with OC is complete for refutation in FOL with equality.
So how do we get interpolants from these proofs?
Local Proofs
• A proof is local for a pair of clause sets (A,B) when every inference step uses only symbols from A or only symbols from B.
• From a local refutation of (A,B), we can derive an interpolant for (A,B) in linear time.
• This interpolant is a Boolean combination of formulas in the proof
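The locality condition is a purely syntactic test on the symbols of each inference step. The sketch below shows one way to express it; the term encoding, the treatment of `=` as a logical symbol shared by both sides, and the sample vocabularies and steps are all illustrative assumptions rather than material from the talk.

```python
LOGICAL = {"=", "not"}    # logical symbols belong to every vocabulary


def symbols(term):
    """Non-logical, non-variable symbols of a term given as nested tuples."""
    if isinstance(term, tuple):
        found = set() if term[0] in LOGICAL else {term[0]}
        for arg in term[1:]:
            found |= symbols(arg)
        return found
    # Bare strings starting with an upper-case letter are variables.
    return set() if term[:1].isupper() else {term}


def is_local_step(formulas, syms_a, syms_b):
    """A step (premises plus conclusion) is local when all of its
    symbols come from A, or all of them come from B."""
    used = set()
    for f in formulas:
        used |= symbols(f)
    return used <= syms_a or used <= syms_b


def is_local_proof(steps, syms_a, syms_b):
    return all(is_local_step(step, syms_a, syms_b) for step in steps)


# Hypothetical vocabularies: Q occurs only in A, c only in B.
SYMS_A = {"Q", "a", "b"}
SYMS_B = {"a", "b", "c"}
# Q(a), a = c |- Q(c) mixes Q (A only) with c (B only).
non_local_step = [("Q", "a"), ("=", "a", "c"), ("Q", "c")]
# a = c, b = c |- a = b uses only symbols of B.
local_step = [("=", "a", "c"), ("=", "b", "c"), ("=", "a", "b")]
```

A proof passes `is_local_proof` only if every one of its steps does, which is the property the interpolant extraction relies on.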
Reduction orders and locality
• A reduction order is oriented for (A,B) when:
  – s ≻ t for every s ∉ L(B) and t ∈ L(B)
• Intuition: rewriting eliminates first A variables, then B variables.

  A: x = y          B: f(x) = c,  f(y) = d,  c ≠ d
  oriented: x ≻ y ≻ c ≻ d ≻ f

  x = y,  f(x) = c  ⊢  f(y) = c
  f(y) = c,  f(y) = d  ⊢  c = d
  c = d,  c ≠ d  ⊢  ⊥
  Local!!
Orientation is not enough
  A: Q(a),  ¬Q(b)          B: a = c,  b = c
  precedence: Q ≻ a ≻ b ≻ c

• Local superposition gives only c=c.
• Solution: replace non-local superposition

      Q(a)    a = c
      -------------
          Q(c)

  with two inferences:

          Q(a)               a = U → Q(U)    a = c
      ------------           ---------------------
      a = U → Q(U)                    Q(c)

This “procrastination” step is an example of a reduction rule, and preserves completeness.
Second inference can be postponed until after resolving with ¬Q(b)
Completeness of local inference
• Thm: Local superposition with procrastination is complete for refutation of pairs (A,B) such that:
– (A,B) has a universally quantified interpolant
– The reduction order is oriented for (A,B)
• This gives us a complete method for generation of universally quantified interpolants for arbitrary first-order formulas!
• This is easily extensible to interpolants for sequences of formulas, hence we can use the method to generate Floyd/Hoare proofs for inline programs.
Avoiding Divergence
• As argued earlier, we still need to prevent interpolants from diverging as we unwind the program further.
• Idea: stratify the clause language
  Example: Let Lk be the set of clauses with at most k variables and nesting depth at most k.
  Note that each Lk is a finite language.
• Stratified saturation prover:
  – Initially let k = 1
  – Restrict prover to generate only clauses in Lk
  – When prover saturates, increase k by one and continue
The stratified prover is complete, since every proof is contained in some Lk.
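Membership in a stratum Lk, as in the example above, can be tested directly on the clause syntax. The sketch below is an assumption-laden illustration: clauses are lists of atoms encoded as nested tuples, variables are upper-case strings, and the nesting depth of an atom counts the predicate itself as one level. How a real prover such as SPASS measures depth may differ.

```python
def is_var(t):
    """Variables are strings starting with an upper-case letter."""
    return isinstance(t, str) and t[:1].isupper()


def variables(t):
    """Set of variables occurring in a term or atom."""
    if is_var(t):
        return {t}
    if isinstance(t, tuple):
        return set().union(*(variables(a) for a in t[1:]))
    return set()


def depth(t):
    """Nesting depth: 0 for variables and constants,
    1 + the maximum depth of the arguments otherwise."""
    if isinstance(t, tuple) and len(t) > 1:
        return 1 + max(depth(a) for a in t[1:])
    return 0


def in_stratum(clause, k):
    """True when the clause has at most k variables and depth at most k."""
    clause_vars = set().union(*(variables(lit) for lit in clause))
    return len(clause_vars) <= k and all(depth(lit) <= k for lit in clause)


def smallest_stratum(clause):
    """Smallest k >= 1 such that the clause lies in Lk."""
    k = 1
    while not in_stratum(clause, k):
        k += 1
    return k
```

For instance, the clause `[("p", "U"), ("q", ("f", ("f", "U")))]` has one variable but depth 3, so a prover restricted to L1 or L2 would not generate it.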
Completeness for universal invariants
• Lemma: For every safety program M with a ∀ safety invariant, and every stratified saturation prover P, there exists an integer k such that P refutes every unwinding of M in Lk, provided:
– The reduction ordering is oriented properly
• This means that as we unwind further, eventually all the interpolants are contained in Lk, for some k.
• Theorem: Under the above conditions, there is some unwinding of M for which the interpolants generated by P contain a safety invariant for M.
This means we have a complete procedure for finding universally quantified safety invariants whenever these exist!
In practice
• We have proved theoretical convergence. But does the procedure converge in practice in a reasonable time?
• Modify SPASS, an efficient superposition-based saturation prover:
  – Generate oriented precedence orders
  – Add procrastination rule to SPASS’s reduction rules
  – Drop all non-local inferences
  – Add stratification (SPASS already has something similar)
• Add axiomatizations of the necessary theories
  – An advantage of a full FOL prover is we can add axioms!
  – As argued earlier, we need a theory of arrays and reachability (TC)
• Since this theory is not finitely axiomatizable, we use an incomplete axiomatization that is intended to handle typical operations in list-manipulating programs
Simple example
    for(i = 0; i < N; i++) a[i] = i;
    for(j = 0; j < N; j++) assert a[j] == j;
  invariant: {∀x. 0 ≤ x ∧ x < i ⇒ a[x] = x}
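The quantified invariant can be exercised on concrete runs. The following Python sketch (an illustration, not the prover's output) executes both loops for a given N and asserts the invariant at every visit to the first loop head; the function name is hypothetical.

```python
def check_array_example(n):
    """Run both loops for a given N, asserting the quantified invariant."""
    a = [None] * n
    i = 0
    while i < n:
        # invariant: for all x, 0 <= x < i implies a[x] == x
        assert all(a[x] == x for x in range(i))
        a[i] = i
        i += 1
    # On exit i == N, so the invariant covers the whole array.
    assert all(a[x] == x for x in range(i))
    for j in range(n):
        assert a[j] == j          # the assertion in the second loop
    return True
```

The universal quantifier in the invariant corresponds to the `all(...)` over the indices below `i`; no quantifier-free formula over `i`, `N` and finitely many array cells captures this for every N.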
Unwinding simple example
• Unwind the loops twice

Unwound program:
  i = 0;
  [i < N]; a[i] = i; i++;
  [i < N]; a[i] = i; i++;
  [i >= N]; j = 0;
  [j < N]; j++;
  [j < N]; a[j] != j;

SSA sequence:
  i0 = 0
  i0 < N ∧ a1 = update(a0,i0,i0) ∧ i1 = i0 + 1
  i1 < N ∧ a2 = update(a1,i1,i1) ∧ i2 = i1 + 1
  i2 ≥ N ∧ j0 = 0
  j0 < N ∧ j1 = j0 + 1
  j1 < N ∧ select(a2,j1) ≠ j1

Interpolants:
  {i0 = 0}
  invariant:
  {0 ≤ U ∧ U < i1 ⇒ select(a1,U) = U}
  {0 ≤ U ∧ U < i2 ⇒ select(a2,U) = U}
  invariant:
  {j ≤ U ∧ U < N ⇒ select(a2,U) = U}
  {j ≤ U ∧ U < N ⇒ select(a2,U) = U}

note: stratification prevents constants diverging as 0, succ(0), succ(succ(0)), ...
List deletion example
    a = create_list();
    while(a){
      tmp = a->next;
      free(a);
      a = tmp;
    }

• Invariant synthesized with 3 unwindings (after some simplification):
  {rea(next,a,nil) ∧ ∀x (rea(next,a,x) → x = nil ∨ alloc(x))}
• That is, a is acyclic, and every cell is allocated
• Note that interpolation can synthesize Boolean structure.
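The synthesized invariant can be read as an executable check on a small heap model. In this sketch the heap is a Python dict from cell to successor, `None` plays the role of nil, and `alloc` is a set of allocated cells; these modelling choices and all function names are assumptions made for illustration.

```python
NIL = None


def reachable(next_, a):
    """Cells reachable from a by following next_, plus whether nil is reached."""
    seen, cur = [], a
    while cur is not NIL and cur not in seen:
        seen.append(cur)
        cur = next_[cur]
    return seen, cur is NIL


def invariant(next_, alloc, a):
    """rea(next,a,nil) and every reachable non-nil cell is allocated."""
    cells, reaches_nil = reachable(next_, a)
    return reaches_nil and all(c in alloc for c in cells)


def delete_list(n):
    """Build an n-cell list, then free it cell by cell, checking the invariant."""
    next_ = {k: (k + 1 if k + 1 < n else NIL) for k in range(n)}
    alloc = set(range(n))
    a = 0 if n > 0 else NIL
    while a is not NIL:
        assert invariant(next_, alloc, a)
        assert a in alloc            # memory safety: a is allocated when used
        tmp = next_[a]
        alloc.remove(a)              # free(a)
        a = tmp
    assert invariant(next_, alloc, a)
    return alloc
```

The invariant fails on a cyclic list (nil is never reached) and on a list whose head has already been freed, which are the two situations it is meant to exclude.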
More small examples
| name | description | assertion | unwindings | bound | time (s) |
|---|---|---|---|---|---|
| array_set | set all array elements to 0 | all elements zero | 3 | L1 | 0.01 |
| array_test | set all array elements to 0 then test all elements | all tests OK | 3 | L1 | 0.01 |
| ll_safe | create a linked list then traverse it | memory safety | 3 | L1 | 0.04 |
| ll_acyc | create a linked list | list acyclic | 3 | L1 | 0.02 |
| ll_delete | delete an acyclic list | memory safety | 2 | L1 | 0.01 |
| ll_delmid | delete any element of acyclic list | result acyclic | 2 | L1 | 0.02 |
| ll_rev | reverse an acyclic list | result acyclic | 3 | L1 | 0.02 |
This shows that divergence can be controlled. But can we scale to large programs?...
Conclusion
• Interpolants and invariant generation
– Computing interpolants from proofs allows us to generalize from special cases such as loop-free unwindings
– Interpolation can extract relevant facts from proofs of these special cases
– Must avoid divergence
• Quantified invariants
– Needed for programs that manipulate arrays or heaps
– FO equality prover modified to produce local proofs (hence interpolants)
• Complete for universal invariants
– Can be used to construct invariants of simple array- and list-manipulating programs, using partial axiomatization of FO(TC)
• Language stratification prevents divergence
– Might be used as a relevance heuristic for shape analysis, IPA
Expressiveness hierarchy
[Figure: two columns ordered by expressiveness. Parameterized Abstract Domain: Predicate Abstraction, Indexed Predicate Abstraction, Canonical Heap Abstractions. Interpolant Language: QF, ∀FO, ∀FO(TC).]
Interpolants for sequences
• Let A_1...A_n be a sequence of formulas
• A sequence A'_0...A'_n is an interpolant for A_1...A_n when
  – A'_0 = True
  – A'_{i-1} ∧ A_i ⇒ A'_i, for i = 1..n
  – A'_n = False
  – and finally, A'_i ∈ L(A_1...A_i) ∩ L(A_{i+1}...A_n)

            A_1       A_2       A_3      ...      A_n
    True  ⇒  A'_1  ⇒  A'_2  ⇒  A'_3  ...  A'_{n-1}  ⇒  False

In other words, the interpolant is a structured refutation of A_1...A_n
Need for Reachability
    ...
    node *a = create_list();
    while(a){
      assert(alloc(a));
      a = a->next;
    }
    ...
  invariant: ∀x (rea(next,a,x) ∧ x ≠ nil → alloc(x))

• This condition is needed to prove memory safety (no use after free).
• Cannot be expressed in FO
  – We need some predicate identifying a closed set of nodes that is allocated
• We require a theory of reachability (in effect, transitive closure)
Can we build an interpolating prover for full FOL that handles reachability, and avoids divergence?
Partially Axiomatizing FO(TC)
• Axioms of the theory of arrays (with select and update)
  ∀(A,I,V) (select(update(A,I,V), I) = V)
  ∀(A,I,J,V) (I ≠ J → select(update(A,I,V), J) = select(A,J))
• Axioms for reachability (rea)
  ∀(L,E,X) (rea(L,select(L,E),X) → rea(L,E,X))
  [if e->link reaches x then e reaches x]
  ∀(L,E) rea(L,E,E)
  ∀(L,E,X) (rea(L,E,X) → E = X ∨ rea(L,select(L,E),X))
  [if e reaches x then e = x or e->link reaches x]
  etc...
Since FO(TC) is incomplete, these axioms must be incomplete
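As a sanity check, the axioms listed above can be tested against the intended semantics on small finite models. This sketch models an array as a Python tuple and a link field as a total function on n cells (there is no nil cell here, an assumption made to keep the model small); `update`, `select` and `rea` mirror the symbols on the slide, while `axioms_hold` is a hypothetical helper.

```python
from itertools import product


def update(arr, i, v):
    """Functional array update; arrays are modelled as tuples."""
    return arr[:i] + (v,) + arr[i + 1:]


def select(arr, i):
    return arr[i]


def rea(link, e, x):
    """True when x is reached from e by following link zero or more times."""
    seen = set()
    while e not in seen:
        if e == x:
            return True
        seen.add(e)
        e = select(link, e)
    return False


def axioms_hold(n):
    """Check the array and reachability axioms on every link function
    over the cells 0..n-1."""
    cells = range(n)
    for link in product(cells, repeat=n):
        for i, j, v in product(cells, repeat=3):
            if select(update(link, i, v), i) != v:
                return False
            if i != j and select(update(link, i, v), j) != select(link, j):
                return False
        for e, x in product(cells, repeat=2):
            if not rea(link, e, e):
                return False
            if rea(link, select(link, e), x) and not rea(link, e, x):
                return False
            if rea(link, e, x) and not (e == x or rea(link, select(link, e), x)):
                return False
    return True
```

Passing this check only shows that the axioms are sound for these small models; it says nothing about completeness, which the axiomatization does not claim.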