Generating Tiny Interpolants and Near-interpolants from a Resolution Refutation
GENERATING TINY INTERPOLANTS AND NEAR-INTERPOLANTS FROM A RESOLUTION REFUTATION
Alexander Nadel (3), Vadim Ryvchin (2,3) and Yakir Vizel (1)
Interpolation'13 Workshop, Saint Petersburg, Russia, July 14th, 2013
1 - Computer Science Dept., The Technion, Haifa, Israel
2 - Information Systems Engineering Dept., The Technion, Haifa, Israel
3 - Intel, Haifa, Israel

Problem Statement
• Interpolation-based model checking (ITP) is an efficient and complete model checking procedure.
• One invocation of ITP uses many interpolants, each generated from a resolution refutation produced by the SAT solver.
• Interpolants generated by the current method are highly redundant and may become too large, rendering ITP slow or even intractable.

The Solution in Our CAV'13 Paper
• Resolution-driven Variable Elimination (RVE)
  • is a new way to generate interpolants from a resolution refutation
  • generates tiny interpolants very fast in the vast majority of cases, but
  • when it gets stuck for even ONE invocation for a given model checking instance, the model checker gets stuck
• Solution:
  • Adjust RVE so that it never gets stuck: when it cannot find an interpolant, it generates a near-interpolant
    • Only a few additional clauses are required to make it an interpolant
  • We complete it to an interpolant with new model checking techniques
• Main results: our model checking algorithm outperforms ITP on most test cases, and the interpolants are 117x smaller

Today's Agenda
• In focus: algorithms for generating interpolants and near-interpolants from a resolution refutation:
  • A comparative description of 3 methods for generating interpolants:
    • McMillan's approach: the fundamental, widely used algorithm
    • A-local variable elimination
    • Resolution-driven variable elimination (RVE)
  • Adjusting RVE to generate near-interpolants in the worst case
• Not in focus:
  • Completing a near-interpolant to an interpolant
  • Our model checking algorithm CNF-ITP

Interpolant Generation: Problem Definition
• Input: propositional formulas A and B, such that A ∧ B ⇒ ⊥
• Output: a formula I, such that
  • A ⇒ I
  • I ∧ B ⇒ ⊥
  • V(I) ⊆ G, where G = V(A) ∩ V(B)
• Model checking needs: the interpolant is fed back into the SAT solver ⇒ it must be in CNF
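
The three conditions above can be checked by brute force on small formulas. The following is a minimal sketch, not part of the talk: it assumes CNF formulas are lists of clauses with DIMACS-style integer literals, and the helper names and the toy instance are hypothetical.

```python
from itertools import product

def variables(cnf):
    """Variables of a CNF given as a list of clauses of nonzero ints."""
    return {abs(lit) for clause in cnf for lit in clause}

def satisfies(assignment, cnf):
    """True if the assignment (dict var -> bool) satisfies every clause."""
    return all(any(assignment[abs(l)] == (l > 0) for l in clause) for clause in cnf)

def assignments(vs):
    """Enumerate all total assignments over the variable set vs."""
    vs = sorted(vs)
    for values in product([False, True], repeat=len(vs)):
        yield dict(zip(vs, values))

def is_interpolant(a, b, i):
    """Check A => I, I and B unsatisfiable, V(I) within V(A) & V(B), by enumeration."""
    if not variables(i) <= (variables(a) & variables(b)):
        return False
    for m in assignments(variables(a) | variables(b)):
        if satisfies(m, a) and not satisfies(m, i):
            return False  # A => I is violated
        if satisfies(m, i) and satisfies(m, b):
            return False  # I and B is satisfiable
    return True

# Hypothetical toy instance: 1 is A-local, 2 is global, 3 is B-local.
A = [[1], [-1, 2]]    # a and (not a or g)
B = [[-2, 3], [-3]]   # (not g or c) and (not c)
I = [[2]]             # candidate interpolant: g
```

Enumeration is exponential in the number of variables, so this is only useful for sanity-checking small examples.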

Resolution
• Resolution: given two clauses c1 = c3 ∨ p and c2 = c4 ∨ ¬p, derive a logical consequence c5 = c1 ⊗p c2 = c3 ∨ c4
• p is the pivot variable
• Resolution refutation: a derivation by resolution of the empty clause ⊥ from a given unsatisfiable formula
• A SAT solver can generate a resolution refutation
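
The resolution rule can be sketched in a few lines. This is an illustrative sketch only: clauses are assumed to be frozensets of DIMACS-style integer literals, and the function name and the three-clause formula are hypothetical.

```python
def resolve(c1, c2, pivot):
    """Resolve c1 (containing pivot) with c2 (containing -pivot) on the pivot variable."""
    if pivot not in c1 or -pivot not in c2:
        raise ValueError("pivot must occur positively in c1 and negatively in c2")
    return (frozenset(c1) - {pivot}) | (frozenset(c2) - {-pivot})

# A tiny refutation of (x1 or x2) and (not x1 or x2) and (not x2):
step1 = resolve(frozenset({1, 2}), frozenset({-1, 2}), 1)  # derives the unit clause x2
empty = resolve(step1, frozenset({-2}), 2)                 # derives the empty clause
```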

Example
[Figure: example resolution refutation; the leaves are the clauses of A and B, internal nodes are annotated with their pivot variables, and the root is ⊥.]
A-local variables: a1
Global variables: g1, g2, g3

Method 1 for Interpolant Generation: McMillan's Method
• Associate a formula p(c) with each node as follows
  • An input node:
    • c ∈ A ⇒ p(c) = g(c)
      • g(c): c restricted to global literals
    • c ∈ B ⇒ p(c) = T
  • An internal node c3 = c1 ⊗p c2
    • p is A-local ⇒ p(c3) = p(c1) ∨ p(c2)
    • p isn't A-local ⇒ p(c3) = p(c1) ∧ p(c2)
• p(⊥) is the interpolant
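
The labelling rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: formulas are nested tuples, literals are DIMACS-style integers, and the helper names and the two-step refutation are hypothetical.

```python
TRUE = ("true",)

def leaf_a(clause, global_vars):
    """p(c) for a clause of A: the clause restricted to its global literals."""
    return ("clause", frozenset(l for l in clause if abs(l) in global_vars))

def leaf_b():
    """p(c) for a clause of B: the constant T."""
    return TRUE

def internal(p1, p2, pivot, a_local_vars):
    """p(c3): OR of the parents for an A-local pivot, AND otherwise."""
    return ("or" if pivot in a_local_vars else "and", p1, p2)

def evaluate(f, m):
    """Evaluate a partial interpolant under an assignment m (dict var -> bool)."""
    tag = f[0]
    if tag == "true":
        return True
    if tag == "clause":
        return any(m[abs(l)] == (l > 0) for l in f[1])
    left, right = evaluate(f[1], m), evaluate(f[2], m)
    return (left or right) if tag == "or" else (left and right)

# Hypothetical instance: A = (a) and (not a or g), B = (not g); a = 1 is A-local, g = 2 is global.
G, A_LOCAL = {2}, {1}
p_c1 = leaf_a([1], G)                    # empty clause over globals, i.e. false
p_c2 = leaf_a([-1, 2], G)                # g
p_c3 = internal(p_c1, p_c2, 1, A_LOCAL)  # pivot a is A-local: false OR g
I = internal(p_c3, leaf_b(), 2, A_LOCAL) # pivot g is global: (false OR g) AND T
```

On this instance I is equivalent to g, but note that it is a nested formula rather than a CNF, which is the point made on the following slides.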

McMillan's Method
[Figure: the example refutation with the partial interpolant p(c) annotated at each node; the clauses of B are labelled T.]
I = [(g1 g2) (g1 g3)] [(g2 g3 g4) (g2 g4)]

McMillan's Method: Pros and Cons
• Pros:
  • The interpolant is linear in the size of the resolution refutation
  • ITP works well when the resolution refutation is not overly complex
• Cons:
  • In many cases, the interpolant is huge and highly redundant
    • Simplifying the formula on-the-fly helps, but doesn't eliminate the problem
  • The interpolant is not natively in CNF; translation is required

McMillan's Method: Translating to CNF
I = [(g1 g2) (g1 g3)] [(g2 g3 g4) (g2 g4)]
[Figure: circuit for I over the inputs g1, g2, g3, g4 with internal nodes a, b, c, d, e, f, g, h; caption: I in CNF.]

Method 2 for Interpolant Generation: A-Local Variable Elimination
• Variable elimination:
  • Given formula F in CNF and variable p
  • VE(F, p) is created by replacing the clauses containing p or ¬p with the results of pairwise resolutions between clauses containing p and clauses containing ¬p
  • VE(F, p) is equisatisfiable to F and p ∉ V(VE(F, p))
[Figure: VE(A, a1) for the example; the clauses of A containing a1 are replaced by their pairwise resolvents on a1, one of which is a tautology (T).]

A-Local Variable Elimination• Eliminate all the A-local variables from A one by one.
• The resulting formula is an interpolant
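
Variable elimination and its use for eliminating all A-local variables can be sketched as below. This is a hedged sketch: the clause representation (frozensets of DIMACS-style integer literals), the function names, and the small formula are assumptions made for illustration, and tautological resolvents are simply dropped.

```python
def resolve(c1, c2, pivot):
    """Resolvent of c1 (contains pivot) and c2 (contains -pivot)."""
    return (c1 - {pivot}) | (c2 - {-pivot})

def is_tautology(clause):
    """A clause containing a literal and its negation is always true."""
    return any(-l in clause for l in clause)

def eliminate(cnf, p):
    """VE(cnf, p): replace clauses mentioning p by their non-tautological resolvents on p."""
    cnf = {frozenset(c) for c in cnf}
    pos = {c for c in cnf if p in c}
    neg = {c for c in cnf if -p in c}
    rest = cnf - pos - neg
    resolvents = {resolve(c1, c2, p) for c1 in pos for c2 in neg}
    return rest | {r for r in resolvents if not is_tautology(r)}

def a_local_elimination(a, a_local_vars):
    """Eliminate all A-local variables from A one by one."""
    result = {frozenset(c) for c in a}
    for p in sorted(a_local_vars):
        result = eliminate(result, p)
    return result

# Hypothetical A: (a or g1) and (a or g2) and (not a or g3), with a = 1 A-local.
A = [[1, 2], [1, 3], [-1, 4]]
I = a_local_elimination(A, {1})
```

The number of resolvents is the product of the positive and negative occurrence counts of p, which is where the blow-up discussed later comes from.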

A-Local Variable Elimination
[Figure: the example refutation; the only A-local variable, a1, is eliminated from A.]
I = (g1 g2 g3 g4) (g1 g2) (g1 g2 g3 g4) (g1 g2 g3) (g2 g4)

A-Local Variable Elimination: Correctness
• A ⇒ I: follows from the correctness of resolution
• I ∧ B ⇒ ⊥
  • Proof: start with A ∧ B ⇒ ⊥ and apply Lemma 1 for each elimination of an A-local variable
  • Lemma 1: Let: (1) X ∧ Y ⇒ c; (2) p ∉ V(Y) ∪ V(c). Then: VE(X, p) ∧ Y ⇒ c.
• V(I) ⊆ G: by construction
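
One way to see why Lemma 1 holds is sketched below. The argument is not taken from the slides; it assumes the standard fact that, for a CNF formula X, VE(X, p) is logically equivalent to ∃p. X.

```latex
\begin{align*}
&\text{Let } m \models \mathrm{VE}(X,p) \wedge Y. \\
&\mathrm{VE}(X,p) \equiv \exists p.\, X
  \;\Rightarrow\; \text{some } m' \text{ that differs from } m \text{ at most on } p
  \text{ satisfies } m' \models X. \\
&p \notin V(Y) \;\Rightarrow\; m' \models Y
  \;\Rightarrow\; m' \models X \wedge Y
  \;\Rightarrow\; m' \models c \quad \text{by (1)}. \\
&p \notin V(c) \;\Rightarrow\; m \models c.
\end{align*}
```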

A-Local Variable Elimination: Pros and Cons
• Pro: the formula is natively in CNF ⇒ the translation overhead is saved
• Con: variable elimination blows up
  • The same problem as in the Davis-Putnam (DP) algorithm for deciding SAT
• Can one limit the amount of elimination and still get an interpolant?

Method 3 for Interpolant Generation: Resolution-driven Variable Elimination (RVE)
• Associate a formula I(c), called the clause interpolant, with each node c reachable from A as follows:
  • For an input node: c ∈ A ⇒ I(c) = c
  • For an internal node c3 = c1 ⊗p c2, where c1 and c2 are reachable from A
    • p is global ⇒ I(c3) = I(c1) ∧ I(c2)
    • p is A-local ⇒ I(c3) = VE(I(c1) ∧ I(c2), p)
  • For an internal node, one of whose parents is not reachable from A: propagate the clause interpolant from the other parent
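
The RVE rules above can be sketched as a recursive traversal of the refutation. This is an illustrative sketch under assumptions: the node encoding (tuples tagged "A", "B" or "res"), the function names and the small instance are hypothetical, and no memoization of shared nodes is done.

```python
def eliminate(cnf, p):
    """VE(cnf, p): pairwise resolution on p, dropping tautological resolvents."""
    pos = {c for c in cnf if p in c}
    neg = {c for c in cnf if -p in c}
    resolvents = {(c1 - {p}) | (c2 - {-p}) for c1 in pos for c2 in neg}
    kept = {r for r in resolvents if not any(-l in r for l in r)}
    return (set(cnf) - pos - neg) | kept

def rve(node, a_local_vars):
    """Clause interpolant of a node as a set of clauses, or None if not reachable from A.

    A node is ("A", clause), ("B", clause) or ("res", pivot_var, left, right).
    """
    if node[0] == "A":
        return {frozenset(node[1])}
    if node[0] == "B":
        return None
    _, pivot, left, right = node
    i1, i2 = rve(left, a_local_vars), rve(right, a_local_vars)
    if i1 is None or i2 is None:
        return i1 if i2 is None else i2  # propagate from the parent reachable from A
    merged = i1 | i2                     # I(c1) and I(c2)
    return eliminate(merged, pivot) if pivot in a_local_vars else merged

# Hypothetical instance: A = (a or g1) and (not a or g2), B = (not g1) and (not g2);
# a = 1 is A-local, g1 = 2 and g2 = 3 are global.
n1 = ("res", 1, ("A", [1, 2]), ("A", [-1, 3]))  # derives (g1 or g2)
n2 = ("res", 2, n1, ("B", [-2]))                # derives (g2)
root = ("res", 3, n2, ("B", [-3]))              # derives the empty clause
I = rve(root, {1})
```

Elimination happens only at nodes whose pivot is A-local, and only over the clauses collected so far, which is the locality referred to on the following slides.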

RVE
[Figure: the example refutation with the clause interpolant I(c) annotated at each node reachable from A.]
I = (g1 g2 g3 g4) (g1 g2 g3 g4) (g2 g4)

RVE: Correctness
• I(c) is a clause interpolant of a clause c reachable from A iff:
  • A ⇒ I(c)
  • I(c) ∧ B ⇒ c
  • V(I(c)) ⊆ G ∪ L(c)
    • L(c): A-local variables that appear in c
• By definition, a clause interpolant of ⊥ is an interpolant
• Proof: show that I(c) is a clause interpolant for every c

RVE: Pros and Cons
• Pros
  • In many cases terminates where A-local variable elimination blows up, because of variable elimination locality

[Figure: the example refutation, comparing the two methods; the clauses (g1 g2) and (g1 g2 g3) are marked "Saved!".]
Resolution-driven variable elimination:
I = (g1 g2 g3 g4) (g1 g2 g3 g4) (g2 g4)
A-local variable elimination:
I = (g1 g2 g3 g4) (g1 g2) (g1 g2 g3 g4) (g1 g2 g3) (g2 g4)

RVE: Pros and Cons
• Pros
  • Generates significantly smaller interpolants than A-local variable elimination because of variable elimination locality
  • Unlike McMillan's method:
    • Optimizes the interpolant on-the-fly by local variable elimination
    • Generates the interpolant natively in CNF

[Figure: the example refutation, comparing RVE with McMillan's method.]
Resolution-driven variable elimination:
I = (g1 g2 g3 g4) (g1 g2 g3 g4) (g2 g4)
McMillan's method:
I = [(g1 g2) (g1 g3)] [(g2 g3 g4) (g2 g4)]

RVE: Pros and Cons
• Cons
  • Might still blow up because of variable elimination, unlike McMillan's method

Near-Interpolants
• The algorithm:
  • Adjust RVE to generate a B-weak interpolant missing only a few clauses from an interpolant. It may still find interpolants.
  • Find the remaining clauses with model checking techniques
B-weak Interpolant
• A ⇒ I
• I ∧ B ⇒ ⊥ need not hold
• V(I) ⊆ G

Find B-weak Interpolant
1. Apply RVE adjusted as follows:
   • For each node with A-local pivot variable p, eliminate p only if the clause interpolant doesn't grow as a result (bounded elimination)
   After this stage we have either an interpolant or a non-global interpolant. We return in the former case, and continue in the latter.
2. Apply bounded A-local variable elimination to I globally
   We have either an interpolant or a non-global interpolant. We return in the former case, and continue in the latter.
3. Apply incomplete A-local variable elimination to I
   • Eliminate A-local variables, but apply resolution only to some of the pairs, such that each input clause still contributes to at least one output clause
   We return a B-weak interpolant (which may happen to be an interpolant)
Non-Global Interpolant
• A ⇒ I
• I ∧ B ⇒ ⊥
• V(I) ⊆ G need not hold
B-weak Interpolant
• A ⇒ I
• I ∧ B ⇒ ⊥ need not hold
• V(I) ⊆ G
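
Bounded elimination, as used in steps 1 and 2, can be sketched as follows. This is an assumption-laden illustration: growth is measured here by the number of clauses (following the slide remark that elimination is skipped when it would increase the number of clauses), and the function names and formulas are hypothetical.

```python
def eliminate(cnf, p):
    """VE(cnf, p): pairwise resolution on p, dropping tautological resolvents."""
    pos = {c for c in cnf if p in c}
    neg = {c for c in cnf if -p in c}
    resolvents = {(c1 - {p}) | (c2 - {-p}) for c1 in pos for c2 in neg}
    kept = {r for r in resolvents if not any(-l in r for l in r)}
    return (set(cnf) - pos - neg) | kept

def bounded_eliminate(cnf, p):
    """Eliminate p only if the clause count does not grow; return (cnf, eliminated?)."""
    cnf = {frozenset(c) for c in cnf}
    candidate = eliminate(cnf, p)
    if len(candidate) <= len(cnf):
        return candidate, True
    return cnf, False  # skipped: p stays, so the result may be non-global

# Cheap case: two clauses collapse into one resolvent.
small, done_small = bounded_eliminate([[1, 2], [-1, 3]], 1)

# Expensive case: 3 x 3 occurrences would yield 9 clauses from 6, so elimination is skipped.
big_input = [[1, 2], [1, 3], [1, 4], [-1, 5], [-1, 6], [-1, 7]]
big, done_big = bounded_eliminate(big_input, 1)
```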

[Figure: a second example refutation over a1 and g1, ..., g6, with clause interpolants annotated at the nodes reachable from A.]
I = (a1 g1 g2) (a1 g2 g4) (a1 g3 g4) (a1 g6 g5) (a1 g6)
Variable elimination is skipped, since it would increase the number of clauses!
I is a non-global interpolant.

I = (a1 g1 g2) (a1 g2 g4) (a1 g3 g4) (a1 g6 g5) (a1 g6)
Variable elimination is skipped, since it would increase the number of clauses!
I’ = (g1 g2 g6 g5) (g2 g4 g6) (g3 g4 g6 g5)
Incomplete variable elimination example: each input clause contributes to the output
I’ is a B-weak interpolant!

RVE: Optimizations
1. Store only those parts of the resolution refutation that are reachable from A
   • Essential to keep the resolution refutation small
   • Can also be applied to McMillan's method
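
Marking the part of the refutation that is reachable from A can be sketched as below. This is a minimal sketch, assuming the refutation is given as a dictionary of nodes listed in derivation order (parents before children); the encoding and node names are hypothetical.

```python
def reachable_from_a(nodes):
    """Ids of nodes that are clauses of A or have a parent reachable from A.

    nodes maps id -> ("A",), ("B",) or ("res", left_id, right_id), parents first.
    """
    reach = set()
    for nid, node in nodes.items():
        if node[0] == "A":
            reach.add(nid)
        elif node[0] == "res" and (node[1] in reach or node[2] in reach):
            reach.add(nid)
    return reach

# Hypothetical refutation: n2 is derived from B clauses only and need not be stored.
nodes = {
    "a1": ("A",), "a2": ("A",), "b1": ("B",), "b2": ("B",),
    "n1": ("res", "a1", "a2"),
    "n2": ("res", "b1", "b2"),
    "n3": ("res", "n1", "n2"),
}
kept = reachable_from_a(nodes)
```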

RVE: Optimizations
2. Start from the vertex cut in the A-resolution refutation, such that:
   • its clauses are implied by A only, and
   • it's the closest possible to ⊥
   A-resolution refutation: the resolution refutation restricted to clauses implied by A
   Consider the cut as the input clauses instead of A

[Figure: the first example refutation with the vertex cut taken as the input clauses.]
I = g2 g3
I is an interpolant!

[Figure: the second example refutation with the vertex cut taken as the input clauses.]
I = (g1 g2 g5) (g2 g4 g5) (g3 g4 g5)
I is an interpolant

Experiments
• Benchmarks: HWMCC'12 benchmark set, 289 instances
• Machines: Intel E5-2687W, 3.1 GHz; 32 GB memory
• Timeout: 900 sec.

Results Summary
• CNF-ITP vs. ITP vs. IC3, run-time
  • CNF-ITP outperforms ITP in 43 cases, while ITP is better in 18 cases
  • CNF-ITP outperforms IC3 in 23 cases, while IC3 is better in 80 cases
  • CNF-ITP outperforms both ITP and IC3 in 18 cases
• CNF-ITP vs. ITP, interpolant size: 117x reduction!
• RVE in CNF-ITP:
  • CNF-ITP with RVE only solved 16 instances out of 51 solved by CNF-ITP.
  • CNF-ITP with RVE only outperforms both ITP and IC3 in 9 cases
  • >95% of the clauses in the interpolants were generated by RVE
    • Some clauses are used across bounds and iterations in CNF-ITP
  • The remaining 5% of the clauses were generated with B-strengthening (inductive generalization)

[Figure: scatter plot "ITP vs. CNF-ITP Run-Time"; axes: ITP and CNF-ITP, each from 0 to 900.]

[Figure: scatter plot "Average Clause Size Comparison (Log. Scale)"; axes: ITP and CNF-ITP, each from 1 to 10,000,000.]

[Figure: "CNF-ITP: Ratio of Clauses Learnt with B-strengthening"; x-axis: Instance Number (0 to 250), y-axis: Clause Ratio (0 to 1). Annotation: instance 74 is the last one where all the clauses are generated with RVE.]

Challenges
• How to direct the SAT solver towards a good interpolant?
  • How to assess what "good" is?
• The ultimate challenge: design an algorithm that instantly generates "good" tiny interpolants in CNF whenever the SAT solver completes

McMillan's Method: Correctness
• A ⇒ I
  • Prove ¬I ⇒ ¬A as follows
    • Let m be an assignment that falsifies I
    • m defines a path from ⊥ to a clause in A, falsified by m.
      • Invariant: p(c) is falsified by m for every clause in the path

[Figure: the example refutation annotated with partial interpolants and an assignment m to a1, g1, g2, g3, g4 that falsifies I; the path from ⊥ is chosen as follows.]
I = [(g1 g2) (g1 g3)] [(g2 g3 g4) (g2 g4)]
• Non-A-local pivot: choose a parent c whose p(c) is falsified
• A-local pivot: choose a parent c whose pivot literal is falsified (both p(c)'s are falsified)
• A ⇒ I holds: the end clause in A is falsified by construction!

McMillan's Method: Correctness
• I ∧ B ⇒ ⊥
  • Invariant that holds for every clause: p(c) ∧ B ⇒ c
  • p(c) ∧ B ⇒ c implies (I = p(⊥)) ∧ B ⇒ ⊥

McMillan's Method
[Figure: the example refutation annotated with partial interpolants and an assignment m to a1, g1, g2, g3, g4.]
I = [(g1 g2) (g1 g3)] [(g2 g3 g4) (g2 g4)]
The invariant: p(c) ∧ B ⇒ c
• The leaves: trivially holds
• Global pivot: p(c3) ∧ B = p(c1) ∧ p(c2) ∧ B ⇒ c1 ∧ c2 ⇒ c3
• Local pivot: assume m ⊨ p(c3) ∧ B = (p(c1) ∧ B) ∨ (p(c2) ∧ B)
  • Assume WLOG m ⊨ p(c1) ∧ B.
  • Since p(c1) ∧ B ⇒ c1, we have m ⊨ c1.
  • We have m ⊨ g(c1), otherwise switching the pivot's value in m would contradict p(c1) ∧ B ⇒ c1.
  • c3 = g(c1) ∨ g(c2) \ (p ∉ G). Hence m ⊨ c3

McMillan's Method: Correctness
• V(I) ⊆ G
  • By construction

RVE Correctness
[Figure: the example refutation annotated with clause interpolants.]
I = (g1 g2 g3 g4) (g1 g2 g3 g4) (g2 g4)
Clause interpolant:
• A ⇒ I(c)
• I(c) ∧ B ⇒ c
• V(I(c)) ⊆ G ∪ L(c)
The leaves: trivially holds
Global pivot:
• A ⇒ I(c1) ∧ I(c2) = I(c3)
• I(c3) ∧ B = I(c1) ∧ I(c2) ∧ B ⇒ c1 ∧ c2 ⇒ c3
• V(I(c3)) = V(I(c1)) ∪ V(I(c2)) ⊆ G ∪ L(c1) ∪ L(c2) = G ∪ L(c3)
Local pivot:
• A ⇒ I(c1) ∧ I(c2) ⇒ VE(I(c1) ∧ I(c2), p) = I(c3)
• I(c1) ∧ I(c2) ∧ B ⇒ c1 ∧ c2 ⇒ c3, hence by Lemma 1: I(c3) ∧ B = VE(I(c1) ∧ I(c2), p) ∧ B ⇒ c3
• V(I(c3)) = (V(I(c1)) ∪ V(I(c2))) \ {p} ⊆ (G ∪ L(c1) ∪ L(c2)) \ {p} = G ∪ L(c3)
Lemma 1: Let: (1) X ∧ Y ⇒ c; (2) p ∉ V(Y) ∪ V(c). Then: VE(X, p) ∧ Y ⇒ c.