Non-clausal Reasoning
Fahiem Bacchus, Christian Thiffault, Toronto
Toby Walsh, UCC & Uppsala
(soon UNSW, NICTA, Uppsala)
Every morning …
I read the plaque on the wall of this house …
Dedicated to the memory of George Boole …
Professor of Mathematics at Queens College (now University College Cork)
George Boole (1815-1864)
Boolean algebra
  The Mathematical Analysis of Logic, Cambridge, 1847
  The Calculus of Logic, Cambridge and Dublin Mathematical Journal, 1848
Reduce propositional logic to algebraic manipulations
How do we automate reasoning with propositional formulae?
Propositional SATisfiability
Rapid progress being made
  10 years ago, < 50 vars
  Today, > 1000 vars
Algorithmic advances
  Learning
  Watched literals
  …
Heuristic advances
  VSIDS branching
Propositional SATisfiability
Efficient implementations
  Chaff, Berkmin, Forklift, …
  SAT competition has a new winner almost every year
Practical applications
  Hardware verification
  Planning
  …
SAT folklore
Need to solve in CNF
  Everything is a clause
Efficient reasoning
  Optimize code with simple data structures
  …
Effective reasoning
  Conversion into CNF does not hinder unit propagation
Overturning SAT folklore
Deciding arbitrary Boolean formulae
  Without converting into CNF
Efficient reasoning
  Raw speed as good as optimized CNF solvers
Effective reasoning
  More inference than unit propagation
Exploit structure
  More exotic gates, …
Davis Putnam procedure
DPLL(S)
  if S is empty then SAT
  if S contains {} then UNSAT
  if S contains a unit clause {l} then DPLL(S ∪ {l})
  else choose a literal l
    if DPLL(S ∪ {l}) then SAT
    else DPLL(S ∪ {-l})
(S ∪ {l} stands for S simplified with l set to true.)
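The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the code of any solver named in the talk: clauses are assumed to be frozensets of non-zero integers (a negative integer is a negated variable), and the names `simplify` and `dpll` are made up for this example.

```python
def simplify(clauses, lit):
    """Set lit to true: drop satisfied clauses, delete -lit from the rest."""
    reduced = []
    for clause in clauses:
        if lit in clause:
            continue                      # clause is satisfied
        reduced.append(clause - {-lit})   # the falsified literal disappears
    return reduced


def dpll(clauses):
    """Return True iff the clause set is satisfiable."""
    if not clauses:
        return True                       # S is empty: SAT
    if any(len(c) == 0 for c in clauses):
        return False                      # S contains {}: UNSAT
    for clause in clauses:
        if len(clause) == 1:              # unit clause: its literal is forced
            (lit,) = clause
            return dpll(simplify(clauses, lit))
    lit = next(iter(clauses[0]))          # naive branching choice (assumption)
    return dpll(simplify(clauses, lit)) or dpll(simplify(clauses, -lit))


# a=1, b=2, c=3, d=4, e=5, g=7
cnf = [frozenset(c) for c in ([1], [-1, 2, 3], [-2], [1, 4, 5], [-3, 4, 7])]
print(dpll(cnf))
print(dpll([frozenset([1]), frozenset([-1])]))
```

Rebuilding the clause list at every call keeps the sketch short; real solvers avoid it and instead mark changes on the original formula.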
Unit Propagation
If the formula has a unit clause then the literal in that clause must be true
  Set the literal to true and reduce the formula.
Unit propagation is the most commonly used type of constraint propagation
One of the most important parts of current SAT solvers
Unit Propagation
(a)(-a, b, c)(-b)(a, d, e)(-c, d, g)
a=true: (a) and (a, d, e) are satisfied, (-a, b, c) reduces to (b, c)
b=false: (-b) is satisfied, (b, c) reduces to (c)
c=true: (c) is satisfied, (-c, d, g) reduces to (d, g)
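The worked example can be replayed with a small sketch. It assumes literals are written as strings such as "a" and "-a"; `unit_propagate` is an illustrative name, and the function rebuilds the formula at each step purely for readability.

```python
def unit_propagate(clauses):
    """Apply unit propagation until no unit clause is left.

    Returns the forced assignment and the reduced clause list
    (an empty clause in the result signals a conflict).
    """
    clauses = [list(c) for c in clauses]
    assignment = {}
    while True:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            return assignment, clauses
        is_negative = unit.startswith("-")
        var = unit[1:] if is_negative else unit
        assignment[var] = not is_negative
        falsified = var if is_negative else "-" + var
        # drop satisfied clauses, remove the falsified literal elsewhere
        clauses = [[lit for lit in c if lit != falsified]
                   for c in clauses if unit not in c]


formula = [["a"], ["-a", "b", "c"], ["-b"], ["a", "d", "e"], ["-c", "d", "g"]]
assignment, remaining = unit_propagate(formula)
print(assignment)   # {'a': True, 'b': False, 'c': True}
print(remaining)    # [['d', 'g']]
```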
Implementing Unit Propagation
UP is the main (often only) inference rule applied at each search node.
Performing UP occupies most of the time in these solvers.
More efficient implementations of UP have been one of the recent advances.
Implementing Unit Propagation
Most DPLL solvers do not build an explicit representation of the reduced formula
  Too expensive in time and space to do this.
Rather they keep the original formula and mark the changes made
All changes generated by UP are undone when we backtrack.
Tableau [Crawford and Auton 95]
We number the variables and clauses.
Each variable has
  a field to store its current value: true, false or unvalued
  the list of clauses it appears positively in
  the list of clauses it appears negatively in
Each clause has
  a list of its literals
  a flag to indicate whether or not it is satisfied
  the number of unvalued literals it contains
Tableau [Crawford and Auton 95]
Unit propagated literals are put on a stack
  pop the literal on top of the stack
  mark the variable with the appropriate value.
  mark each clause it appears positively in as satisfied.
  for each clause it appears negatively in
    if the clause is not already satisfied, decrement the clause’s counter
    if the counter is equal to 1, the clause is unit: find the single unvalued literal in the clause and add that literal to the UP stack.
  remember all changes so that they can be undone on backtrack.
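The counter-based scheme described above might look as follows. This is a sketch under stated assumptions, not the Tableau source: literals are non-zero integers, no clause repeats a variable, and the class name `CounterUP`, its fields, and the trail layout are invented for illustration.

```python
class CounterUP:
    """Counter-based unit propagation with an undo trail."""

    def __init__(self, num_vars, clauses):
        self.clauses = [list(c) for c in clauses]
        self.value = [None] * (num_vars + 1)        # None means unvalued
        self.satisfied = [False] * len(self.clauses)
        self.unvalued = [len(c) for c in self.clauses]
        self.occurs = {}                            # literal -> clause ids
        for cid, clause in enumerate(self.clauses):
            for lit in clause:
                self.occurs.setdefault(lit, []).append(cid)
        self.trail = []                             # log of changes to undo

    def propagate(self, lit):
        """Make lit true and run UP; return False on conflict."""
        stack = [lit]
        while stack:
            lit = stack.pop()
            var = abs(lit)
            if self.value[var] is not None:
                if self.value[var] != (lit > 0):
                    return False                    # forced both ways
                continue
            self.value[var] = lit > 0
            self.trail.append(("var", var))
            for cid in self.occurs.get(lit, []):    # clauses made true
                if not self.satisfied[cid]:
                    self.satisfied[cid] = True
                    self.trail.append(("sat", cid))
            for cid in self.occurs.get(-lit, []):   # clauses losing a literal
                if self.satisfied[cid]:
                    continue
                self.unvalued[cid] -= 1
                self.trail.append(("dec", cid))
                if self.unvalued[cid] == 0:
                    return False                    # every literal is false
                if self.unvalued[cid] == 1:         # unit: find the literal
                    stack.append(next(l for l in self.clauses[cid]
                                      if self.value[abs(l)] is None))
        return True

    def undo_to(self, mark):
        """Backtrack: revert every change logged after position mark."""
        while len(self.trail) > mark:
            kind, idx = self.trail.pop()
            if kind == "var":
                self.value[idx] = None
            elif kind == "sat":
                self.satisfied[idx] = False
            else:
                self.unvalued[idx] += 1


# a=1, b=2, c=3, d=4, e=5, g=6
up = CounterUP(6, [[1], [-1, 2, 3], [-2], [1, 4, 5], [-3, 4, 6]])
ok = up.propagate(1) and up.propagate(-2)
print(ok, up.value[1:4])
```

Note that every clause containing the assigned variable is visited, which is the cost the watch literal technique is designed to avoid.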
Watch literals [SATO, Chaff]
Tableau’s technique requires visiting each clause a variable appears in when we value a variable.
When clause learning is employed, and hundreds of thousands of long new clauses are added to the original formula, this becomes slow.
The watch literal technique is more efficient.
Watch literals [SATO, Chaff]
For each clause, pick two literals to watch.
  At least one of these literals must be false for the clause to be unit.
For each variable, instead of lists of all of the clauses it appears in positively and negatively, we only have lists of the clauses it is a watch for.
  Reduces the total size of these lists from O(kn) to O(n)
Watch literals [SATO, Chaff]
When we assign a value to a variable we
  Ignore the clauses it watches positively (its watch literal is now true)
  For each clause it watches negatively (its watch literal is now false), we search the clause:
    if we find an unvalued literal or a true literal not equal to the other watch, we make this literal the new watch
    otherwise the clause is unit and we UP the other watch literal if it is not already true.
On backtrack we do nothing!
  The new watch literals retain the property that at least one of them must become false if the clause is to become unit.
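A compact sketch of the watch literal scheme follows. It is not the SATO or Chaff implementation: literals are assumed to be non-zero integers, every clause is assumed to have at least two distinct literals, the two watches are kept at positions 0 and 1 of each clause, and all names (`WatchedUP`, `watchers`, `backtrack`) are illustrative.

```python
class WatchedUP:
    """Two-watched-literal unit propagation."""

    def __init__(self, num_vars, clauses):
        self.clauses = [list(c) for c in clauses]
        self.value = [None] * (num_vars + 1)
        self.watchers = {}                       # literal -> clause ids
        for cid, clause in enumerate(self.clauses):
            for lit in clause[:2]:               # watch the first two literals
                self.watchers.setdefault(lit, []).append(cid)

    def lit_value(self, lit):
        v = self.value[abs(lit)]
        return None if v is None else v == (lit > 0)

    def propagate(self, lit):
        """Make lit true and run UP; return False on conflict."""
        queue = [lit]
        while queue:
            lit = queue.pop()
            current = self.lit_value(lit)
            if current is False:
                return False                     # forced both ways
            if current is True:
                continue
            self.value[abs(lit)] = lit > 0
            conflict = False
            kept = []                            # clauses still watching -lit
            for cid in self.watchers.get(-lit, []):
                clause = self.clauses[cid]
                if clause[0] == -lit:            # keep the false watch at index 1
                    clause[0], clause[1] = clause[1], clause[0]
                for i in range(2, len(clause)):
                    if self.lit_value(clause[i]) is not False:
                        clause[1], clause[i] = clause[i], clause[1]
                        self.watchers.setdefault(clause[1], []).append(cid)
                        break
                else:                            # no replacement watch found
                    kept.append(cid)
                    other = self.lit_value(clause[0])
                    if other is None:
                        queue.append(clause[0])  # clause is unit
                    elif other is False:
                        conflict = True          # every literal is false
            self.watchers[-lit] = kept
            if conflict:
                return False
        return True

    def backtrack(self, variables):
        """Only values are reset; the watch lists are left untouched."""
        for var in variables:
            self.value[var] = None


# a=1, b=2, c=3, d=4, e=5, g=6 (unit clauses of the running example omitted)
up = WatchedUP(6, [[-1, 2, 3], [1, 4, 5], [-3, 4, 6]])
ok = up.propagate(1) and up.propagate(-2)
print(ok, up.value[3])
```

Only clauses watching the falsified literal are visited, and `backtrack` touches no clause data at all.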
Solving non-CNF formulae
Convert into CNF
  Use efficient DPLL solver like Chaff
Adapt DPLL solver to reason with non-CNF
  Exploit structure
  Permit complex gates (e.g. counting, XOR, ..)
Encoding into CNF
Most common (and relatively efficient?) is that of [Tseitin 1970].
  Recursively converts a formula by adding a new variable for every subformula.
  Linear space
Tseitin Encoding
A ⇒ (C & D)
V1 ⇔ (C & D): (~V1, C), (~V1, D), (~C, ~D, V1)
V2 ⇔ (A ⇒ V1): (~V2, ~A, V1), (A, V2), (~V1, V2)
1. (~V1, C)
2. (~V1, D)
3. (~C, ~D, V1)
4. (~V2, ~A, V1)
5. (A, V2)
6. (~V1, V2)
7. (V2)
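The encoding steps above can be reproduced with a short recursive sketch. The tuple representation of formulae, the `~` prefix for negation, and the function names are assumptions made for this example; only AND and implication are handled, and input variables are assumed not to be named like the fresh variables V1, V2, …

```python
def neg(lit):
    """Negate a literal written as 'X' or '~X'."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def tseitin(formula):
    """Return CNF clauses equisatisfiable with formula.

    A formula is a variable name or a tuple (op, left, right)
    with op in {"and", "implies"}.
    """
    clauses = []
    counter = 0

    def encode(node):
        nonlocal counter
        if isinstance(node, str):
            return node                       # input variable, no new name
        op, left, right = node
        a, b = encode(left), encode(right)
        counter += 1
        v = f"V{counter}"                     # fresh variable for this subformula
        if op == "and":                       # v <-> (a & b)
            clauses.extend([[neg(v), a], [neg(v), b], [neg(a), neg(b), v]])
        elif op == "implies":                 # v <-> (a -> b)
            clauses.extend([[neg(v), neg(a), b], [a, v], [neg(b), v]])
        else:
            raise ValueError(f"unsupported operator: {op}")
        return v

    root = encode(formula)
    clauses.append([root])                    # assert the whole formula
    return clauses


cnf = tseitin(("implies", "A", ("and", "C", "D")))
for number, clause in enumerate(cnf, start=1):
    print(f"{number}. ({', '.join(clause)})")
```

One fresh variable and a constant number of clauses are added per subformula, which is where the linear space bound comes from.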
Disadvantage of CNF
Structural information is lost
  Flattens formulae into clauses.
  In a Boolean circuit: Which variables are inputs? Which are internal wires? …
Additional variables are added.
  Potentially increases the size of the DPLL search.
Structural Information
Not all structural information can be recovered [Lang & Marquis, 1989].
Recovering structural information can improve performance [EqSatZ, LSAT].
Why lose this information in the first place?
  In addition, we can exploit more complex gates
Extra Variables
Potentially “increase” search space
Do not branch on any of the newly introduced “subformula” variables.
  Theoretically this can increase exponentially the size of the smallest DPLL proof [Jarvisalo et al. 2004]
  Empirically solvers restricted in this way can perform poorly
Extra Variables
The alternative is unrestricted branching. However, with unrestricted branching, a
CNF solver can waste a lot of time branching on variables that have become “irrelevant”.
Irrelevant Variables
A ⇒ (C & D)
V1 ⇔ (C & D)
V2 ⇔ (A ⇒ V1)
A=false: formula satisfied
1. (~V1, C)
2. (~V1, D)
3. (~C, ~D, V1)
4. (~V2, ~A, V1)
5. (A, V2)
6. (~V1, V2)
7. (V2)
8. (~A)
Solver must still determine that the remaining clauses are SAT
Converting to CNF is Unnecessary
Search can be performed on the original formula.
  This has been noted in previous work on circuit based solvers, e.g. [Ganai et al. 2002]
Reasoning with the original formula may permit other efficiencies
  E.g. exploiting structure, & complex gates
DPLL on formulae
View formulae as DAGs
  Every node has a label (True/ False/ Unassigned)
Branch on the truth value of any unassigned node
  Use Boolean logic to propagate truth values to neighbouring nodes
  Contradiction when a node is labeled both True and False
Find a consistent labeling with truth values that assigns True to the root (SAT)
  Or exhaust all possibilities (UNSAT)
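[Figure: formula DAG with an OR (\/) root over an XOR of A, B and an AND (&) of C, D; a second frame shows nodes labeled True and False]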
Labeling corresponds to unit propagation
Labeling a node corresponds to assigning a truth value to the corresponding var in the CNF encoding
Propagating labels in the DAG corresponds to unit propagation in the CNF encoding
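Label propagation on a formula DAG can be illustrated with a small fixpoint loop. This sketch is eager and unoptimized (no watches, no learning); the dictionary layout, the gate names, and the example formula (A xor B) \/ (C & D) are assumptions chosen for illustration.

```python
def propagate(gates, labels):
    """Propagate truth labels through a formula DAG until nothing changes.

    gates maps a node name to (op, children); labels maps names to bools
    and is updated in place. Returns False if a node would need both labels.
    """
    progress = True
    while progress:
        progress = False
        for name, (op, kids) in gates.items():
            out = labels.get(name)
            vals = [labels.get(k) for k in kids]
            forced = []                        # (node, value) pairs implied here
            if op in ("and", "or"):
                ctrl = op == "or"              # child value that decides the gate
                if ctrl in vals:
                    forced = [(name, ctrl)]
                elif None not in vals:
                    forced = [(name, not ctrl)]
                elif out == (not ctrl):        # e.g. AND is true: all children true
                    forced = [(k, not ctrl) for k in kids]
                elif out == ctrl and vals.count(None) == 1:
                    forced = [(kids[vals.index(None)], ctrl)]
            elif op == "xor":                  # binary xor: two values fix the third
                a, b = vals
                if a is not None and b is not None:
                    forced = [(name, a != b)]
                elif out is not None and a is not None:
                    forced = [(kids[1], out != a)]
                elif out is not None and b is not None:
                    forced = [(kids[0], out != b)]
            for node, value in forced:
                if labels.get(node, value) != value:
                    return False               # labeled both True and False
                if node not in labels:
                    labels[node] = value
                    progress = True
    return True


gates = {
    "root": ("or", ["x", "g"]),
    "x": ("xor", ["A", "B"]),
    "g": ("and", ["C", "D"]),
}
labels = {"root": True, "A": True, "B": True}
print(propagate(gates, labels), labels)
```

With the root labeled True and A = B = True, the XOR node becomes False, which forces the AND node and then C and D to True, mirroring what unit propagation would derive on the CNF encoding.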
Learning
Once a contradiction is detected a conflict clause can be learned
  set of impossible node assignments
  can use 1-UIP scheme (as in CNF solvers)
Learned clauses are stored and used to unit propagate node truth values
Complex gates
Gates can have arbitrary degree
  n-ary AND, n-ary OR, …
Gates can be complicated Boolean functions
  n-ary XOR (which requires exponential number of CNF clauses)
  cardinality gates (at least one, k out of n, ..)
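The blow-up for XOR is easy to see when the gate is written directly as clauses over its own inputs, with no auxiliary variables (an assumption of this sketch; introducing fresh variables avoids the blow-up at the cost of extra variables). The function name `xor_clauses` is illustrative.

```python
from itertools import product


def xor_clauses(variables):
    """Direct CNF for x1 xor ... xn = true over positive integer variables.

    One clause is needed for each assignment with an even number of true
    inputs, because each clause can rule out only one full assignment.
    """
    clauses = []
    for bits in product([False, True], repeat=len(variables)):
        if sum(bits) % 2 == 0:                # this assignment falsifies the XOR
            clauses.append([-v if b else v for v, b in zip(variables, bits)])
    return clauses


for n in (2, 3, 10):
    print(n, len(xor_clauses(list(range(1, n + 1)))))
```

The clause count is 2^(n-1), so a 10-input XOR already needs 512 clauses of length 10 in this form.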
Label propagation
Use lazy data structures as in CNF solvers
  For example, assign one child as a true watch for an AND gate
  Don’t check if the AND gate can be labeled true until its true watch becomes true
  Some benchmarks have AND gates with thousands of children
No intrinsic loss of efficiency in using the DAG over CNF.
Structure based optimizations
We can also exploit the extra structural information the DAG provides
Two such optimizations
  Don’t care propagation to deal with irrelevant subformulae
  Conflict clause reduction
Don’t Care labeling
Add a third “truth” value to the DAG: “don’t care”
A node C is don’t care wrt a particular parent P
  If its truth value can no longer affect the truth value of P nor that of any of P’s siblings.
  Or P is don’t care.
A node C is don’t care if it is don’t care wrt all of its parents
No need to branch on don’t cares!
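A simplified version of this idea can be sketched by recomputing don't cares eagerly. The criterion used here is an assumption that is narrower than the definition above: a node counts as don't care wrt a parent when the parent's value is already determined by the labeled inputs (or the parent is itself don't care). Only labels on input variables are considered, and all names are illustrative.

```python
def evaluate(gates, labels, name):
    """Three-valued value of a node: True, False, or None while still open."""
    if name not in gates:
        return labels.get(name)               # input: use its label, if any
    op, kids = gates[name]
    vals = [evaluate(gates, labels, k) for k in kids]
    if op == "and":
        return False if False in vals else (None if None in vals else True)
    if op == "or":
        return True if True in vals else (None if None in vals else False)
    if op == "xor":                           # binary xor
        return None if None in vals else vals[0] != vals[1]
    raise ValueError(f"unsupported gate: {op}")


def dont_care_nodes(gates, labels, root):
    """Open nodes whose value can no longer influence the root."""
    parents = {}
    for parent, (_, kids) in gates.items():
        for kid in kids:
            parents.setdefault(kid, []).append(parent)

    def is_dont_care(node):
        if node == root:
            return False                      # the root always matters
        return all(evaluate(gates, labels, p) is not None or is_dont_care(p)
                   for p in parents.get(node, []))

    nodes = set(gates) | set(parents)
    return {n for n in nodes
            if evaluate(gates, labels, n) is None and is_dont_care(n)}


gates = {
    "root": ("or", ["x", "g"]),
    "x": ("xor", ["A", "B"]),
    "g": ("and", ["C", "D"]),
}
print(sorted(dont_care_nodes(gates, {"A": True, "B": False}, "root")))
```

Once A xor B is true the root is decided, so the AND node and its inputs C and D are reported and need not be branched on.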
Don’t Care labeling
Assign a don’t care watch parent for each node.
When P is labeled, C can become don’t care wrt its watch parent P
If C becomes don’t care wrt its don’t care watch, we look for another watch.
  If we can’t find one, we know C has become don’t care
[Figure: formula DAG with an OR (\/) root over an XOR of A, B and an AND (&) of C, D; nodes are labeled True, False and Don’t care]
Conflict Clause Reductions
If one learns (L1,L2,...) and one has (~L1, L2) then we can reduce the conflict clause
  (~L1,L2) resolves with (L1,L2,...) to give (L2,...)
  Result subsumes the original conflict clause
In CNF, we would have to search the clause database to detect this situation
  Probably not going to be effective
Conflict Clause Reductions
Suppose P is an AND node, and C is a child
  Then ~C implies ~P
If we have the conflict clause: (~P,~C,X,…)
This reduces to (~P,X,…)
Equivalent to a resolution step against (C,~P)
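The AND-node reduction can be sketched as a simple filter over the conflict clause. Literals are assumed to be strings with a `~` prefix for negation, and `and_children` is a hypothetical map from each AND node to its children; neither comes from the original solver.

```python
def reduce_conflict(clause, and_children):
    """Drop ~C from a clause that also contains ~P, for each AND node P
    with child C. Each removal is a resolution step against (C, ~P)."""
    present = set(clause)
    removable = set()
    for parent, children in and_children.items():
        if "~" + parent in present:
            removable.update("~" + child for child in children)
    return [lit for lit in clause if lit not in removable]


print(reduce_conflict(["~P", "~C", "X"], {"P": ["C", "D"]}))   # ['~P', 'X']
```

Because the parent and child are neighbours in the DAG, the check needs no search through the clause database.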
Conflict Clause Reductions
When a conflict clause is generated
  Search neighbours in the DAG for such reductions
More useful on “shorter” clauses
  Experimentally found it only worth looking for such reductions on clauses of length 100 or less
Empirical Results
We compared with Zchaff
Tried to isolate the impact of CNF v non-CNF
  Made the two solvers as close as possible
  Same magic numbers (e.g., clause database cleanup criteria, restart intervals etc.)
  Same branching heuristics
Expect similar improvements could be obtained with other CNF solvers
Empirical Results caveats
Lack of non-clausal benchmarks
  Hope SAT-05 competition will include non-CNF
Benchmarks we did obtain had already been transformed into simpler formulas
  No complex XOR or IFF gates
FVP-UNSAT-2.0 (Velev)
Problem    #Vars    Time (Zchaff)  Time (NoClause)  Imp/Sec (Zchaff)  Imp/Sec (NoClause)
4pipe      5,237    188.89         9.87             467,001           509,433
4pipe_1    4,647    26.55          35.52            512,108           327,098
4pipe_2    4,941    49.76          36.5             482,896           327,298
4pipe_3    5,233    144.34         62.03            424,551           316,049
4pipe_4    5,525    93.83          42.26            470,936           326,186
5pipe      9,471    54.68          33.34            526,457           409,154
5pipe_1    8,441    126.11         116.18           425,921           280,758
5pipe_2    8,851    138.62         177.24           437,166           279,298
5pipe_3    9,267    137.7          134.08           441,319           295,976
5pipe_4    9,764    873.81         284.62           370,906           270,234
5pipe_5    10,113   249.11         137.09           456,400           298,903
6pipe      15,800   4,550.92       297.13           322,039           288,855
6pipe_6    17,064   1,406.18       1,056.56         402,301           267,207
7pipe      23,910   12,717.00      1,657.70         306,433           244,343
7pipe_bug  24,065   128.9          0.29             266,901           403,148
FVP-UNSAT-2.0 Decisions
Problem    #Vars    Zchaff       NoClause
4pipe      5,237    541,195      41,637
4pipe_1    4,647    131,223      114,512
4pipe_2    4,941    210,169      112,720
4pipe_3    5,233    392,564      169,117
4pipe_4    5,525    295,841      122,497
5pipe      9,471    334,761      102,077
5pipe_1    8,441    381,921      255,894
5pipe_2    8,851    397,550      362,840
5pipe_3    9,267    385,239      292,802
5pipe_4    9,764    1,393,529    503,128
5pipe_5    10,113   578,432      283,554
6pipe      15,800   5,232,321    435,781
6pipe_6    17,064   2,153,346    1,326,371
7pipe      23,910   12,437,654   1,276,763
7pipe_bug  24,065   1,075,907    481
FVP-UNSAT-2.0 Don’t Cares
Problem    Time (DC)  Time (No DC)  Decisions (DC)  Decisions (No DC)
4pipe      9.87       57.68         41,637          198,828
4pipe_1    35.52      62.65         114,512         159,049
4pipe_2    36.5       94.46         112,720         212,986
4pipe_3    62.03      213.27        169,117         365,007
4pipe_4    42.26      318.64        122,497         525,623
5pipe      33.34      246.93        102,077         650,312
5pipe_1    116.18     300.59        255,894         489,825
5pipe_2    177.24     360.67        362,840         585,133
5pipe_3    134.08     387.65        292,802         593,815
5pipe_4    284.62     2,097.31      503,128         1,842,074
5pipe_5    137.09     379.19        283,554         543,535
6pipe      297.13     10,241.64     435,781         4,726,470
6pipe_6    1,056.56   3,455.35      1,326,371       2,615,479
7pipe      1,657.70   12,685.59     1,276,763       6,687,186
7pipe_bug  0.29       1             481             2,006
FVP-UNSAT-2.0 Clause Reduction
Problem    Time (Red.)  Time (No Red.)  Ave Cls. Size (Red.)  Ave Cls. Size (No Red.)
4pipe      9.87         30.5            40                    148
4pipe_1    35.52        45.13           77                    132
4pipe_2    36.5         54.97           84                    132
4pipe_3    62.03        69.7            108                   162
4pipe_4    42.26        80.13           112                   157
5pipe      33.34        16.29           93                    113
5pipe_1    116.18       195.81          140                   204
5pipe_2    177.24       159.45          165                   199
5pipe_3    134.08       154.02          165                   218
5pipe_4    284.62       504.27          208                   264
5pipe_5    137.09       216.11          172                   237
6pipe      297.13       647.78          232                   540
6pipe_6    1,056.56     1,421.42        309                   380
7pipe      1,657.70     2,053.92        336                   761
7pipe_bug  0.29         0.29            10                    10
Other Series
Zchaff
Series (#probs)      Time    Dec.         Imp/Sec  Cls Size
sss-sat-1.0 (100)    128     2,970,794    728,144  70
vliw-sat-1.1 (100)   3,284   154,742,779  302,302  82
fvp-unsat-1.0 (4)    245     3,620,014    322,587  326
fvp-unsat-2.0 (22)   20,903  26,113,810   327,590  651
NoClause
Series (#probs)      Time    Dec.         Imp/Sec  Cls Size
sss-sat-1.0 (100)    225     1,532,843    616,705  39
vliw-sat-1.1 (100)   1,033   4,455,378    260,779  55
fvp-unsat-1.0 (4)    172     554,100      402,621  100
fvp-unsat-2.0 (22)   4,104   5,537,711    267,858  240
Conclusions
No intrinsic reason to convert to CNF
Many other structure based optimizations remain to be investigated
  Branching heuristics
  Non-clausal conflicts
  More complex gates
  …