smtlecture.3

Satisfiability Modulo Theories
Lecture 3 - Efficient SAT-Solvers

(slides revision: Saturday 14th March, 2015, 11:46)

Roberto Bruttomesso

Mathematical Logic Seminar (course of Prof. Silvio Ghilardi)

3 November 2011

R. Bruttomesso (SMT) SAT-Solvers 3 November 2011

Recall from last lecture . . .

Approaches to solving SMT formulæ are based on the observation that SMT can be reduced to SAT, i.e., the purely Boolean Satisfiability Problem

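[Figure: the Eager and Lazy approaches side by side. Eager: SMT formula ϕ → Encoder → ψ → SAT-solver → sat / unsat. Lazy: SMT formula ϕ → Extract Struct. → SAT-solver → candidate model → T-solver; the candidate model is judged good or not good, and the final answer is sat or unsat.]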


Outline

1 Introduction

2 DPLL SAT-Solvers
   The DPLL Procedure
   The Iterative DPLL Procedure

3 CDCL SAT-Solvers
   Clause Learning
   Conflict Analysis
   Non-Chronological Backtracking


Disclaimer

The SAT-Solving algorithm described in the following is not the only existing approach

BDDs

Quantifier Elimination

Random SAT-Solving

We study the DPLL/CDCL approach as it is precise (not approximate), it uses a linear amount of memory, and it is very robust

It is fair to say that it is the approach most used by industry for pure solving

The approach evolved over many years, and many groups have contributed. Here we do not look at the history of this evolution but just at the final product


Complexity Considerations

SAT is “the” NP-Complete problem

It is unlikely to be solved in polynomial time

Most likely, any algorithm that solves SAT takes time on the order of O(2^n) for an input of size n

The complexity of SAT cannot be alleviated by faster machines (only by parallelism, if we ever reach that technology). Suppose you can execute M instructions in 1 hour; then the maximum n you can handle is

Speed of machine    Max input size in 1 hour
1x                  log2(M) = n
100x                log2(100 · M) ≈ n + 6.6
1000000x            log2(1000000 · M) ≈ n + 19.9
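The arithmetic behind the table can be checked with a few lines of Python. This is only a sketch under the simplifying assumption that solving an input of size n costs exactly 2^n instructions; the function name `max_input_size` and the budget M = 2^40 are hypothetical choices made for illustration.

```python
import math

def max_input_size(instructions_per_hour: float, speedup: float = 1.0) -> float:
    """Largest n such that 2**n <= speedup * instructions_per_hour.

    Assumes (for illustration only) a cost of exactly 2**n instructions.
    """
    return math.log2(speedup * instructions_per_hour)

M = 2.0 ** 40  # hypothetical instruction budget for one hour
base = max_input_size(M)
for speedup in (1, 100, 1_000_000):
    gain = max_input_size(M, speedup) - base
    print(f"{speedup:>9}x faster machine: n grows by about {gain:.1f}")
```

The gain depends only on log2 of the speedup, not on M: a machine one million times faster adds fewer than 20 to the input size that can be handled.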


CNF Formulæ

From now on we shall focus on solving formulæ in Conjunctive Normal Form (CNF)

C1 ∧ C2 ∧ . . . ∧ Cn

where each Ci is a clause, i.e., a disjunction of the kind

(l1 ∨ l2 ∨ . . . ∨ lm)

where every li is a literal, i.e., a Boolean variable or its negation

a or ¬a

For simplicity we will write them as a set of clauses, omitting the ∧, like

(¬a1 ∨ a2)(¬a1 ∨ a3 ∨ a9)(¬a2 ∨ ¬a3 ∨ a4)(¬a4 ∨ a5 ∨ a10)
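One common way to store such a clause set in a program is sketched below. It assumes the DIMACS-style convention, which is not part of these slides: the integer i stands for the variable ai and -i for ¬ai. The helper names `lit_str` and `show` are hypothetical.

```python
from typing import List

Clause = List[int]   # a disjunction of literals
CNF = List[Clause]   # a conjunction of clauses

# The four clauses above: i means a_i, -i means ¬a_i (assumed convention).
formula: CNF = [
    [-1, 2],
    [-1, 3, 9],
    [-2, -3, 4],
    [-4, 5, 10],
]

def lit_str(lit: int) -> str:
    """Render a literal in the notation of the slides."""
    return ("¬" if lit < 0 else "") + f"a{abs(lit)}"

def show(cnf: CNF) -> str:
    """Render a clause set, omitting the ∧ between clauses."""
    return "".join("(" + " ∨ ".join(lit_str(l) for l in c) + ")" for c in cnf)

print(show(formula))
```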


Basic Notation

We indicate a (possibly partial) assignment as the set of literals that are ⊤ under it, i.e., we write the assignment

{a1 ↦ ⊤, a2 ↦ ⊤, a3 ↦ ⊥, a4 ↦ ⊥, a9 ↦ ⊤}

as

{a1, a2, ¬a3, ¬a4, a9}

An assignment is used to evaluate a formula, e.g.,

(¬a1 ∨ a2)(¬a1 ∨ a3 ∨ a9)(¬a2 ∨ ¬a3 ∨ a4)(¬a4 ∨ a5 ∨ a10)

evaluates to ⊤ under the assignment above

A clause is ⊤ as soon as at least one of its literals is ⊤; it is ⊥ only when all of its literals are ⊥
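Evaluation under a partial assignment can be sketched as follows. The encoding is an assumption made for illustration (integer i for ai, -i for ¬ai, an assignment as the set of literals that are ⊤), and `None` is used for "not yet determined"; the function names are hypothetical.

```python
from typing import List, Optional, Set

def eval_clause(clause: List[int], assignment: Set[int]) -> Optional[bool]:
    """True if some literal is ⊤, False if all literals are ⊥, else None."""
    if any(l in assignment for l in clause):
        return True
    if all(-l in assignment for l in clause):
        return False
    return None  # some literal is still unassigned

def eval_formula(cnf: List[List[int]], assignment: Set[int]) -> Optional[bool]:
    """False if some clause is ⊥, True if every clause is ⊤, else None."""
    values = [eval_clause(c, assignment) for c in cnf]
    if any(v is False for v in values):
        return False
    if all(v is True for v in values):
        return True
    return None

formula = [[-1, 2], [-1, 3, 9], [-2, -3, 4], [-4, 5, 10]]
assignment = {1, 2, -3, -4, 9}   # {a1, a2, ¬a3, ¬a4, a9}
print(eval_formula(formula, assignment))
```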


Simple Facts

SAT-Solving is (the art of) finding the assignment satisfying all clauses

The assignment evolves incrementally by taking Decisions, starting from the empty assignment { }, but it can be backtracked if found wrong

The evolution of the assignment can be represented as a tree

(¬a1 ∨ a3 ∨ a9)(¬a2 ∨ ¬a3 ∨ a4)(a1 ∨ a2)(¬a4 ∨ a5 ∨ a10)


[Figure: the search tree built step by step. The assignment goes from { } to {¬a1}, then {¬a1, ¬a2}, then {¬a1, a2}, and so on; the tree branches on a1 / ¬a1 and, below each of them, on a2 / ¬a2.]


Simple Facts

Some assignments can be trivially driven in the right direction. In the example below, given the current assignment, the third clause can be satisfied only by setting a2 ↦ ⊤

(¬a1 ∨ a3 ∨ a9)(¬a2 ∨ ¬a3 ∨ a4)(a1 ∨ a2)(¬a4 ∨ a5 ∨ a10)

{¬a1} becomes {¬a1, a2}

This step is called Boolean Constraint Propagation (BCP). It represents a forced deduction. It triggers whenever all literals of a clause but one are assigned ⊥ and the remaining literal is unassigned; e.g., once four of the literals in the clause below are ⊥, the fifth is forced to ⊤

(¬a1 ∨ a2 ∨ ¬a3 ∨ a4 ∨ ¬a5)
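A naive version of BCP can be sketched as below. It rescans all clauses until nothing changes, which is simple but far from how efficient solvers do it. The encoding (integer i for ai, -i for ¬ai) and the name `bcp` are assumptions made for illustration.

```python
from typing import List, Optional, Set, Tuple

def bcp(cnf: List[List[int]], assignment: Set[int]) -> Tuple[Set[int], Optional[List[int]]]:
    """Propagate forced literals until a fixpoint or a conflict.

    Returns the extended assignment and the conflicting clause (or None).
    """
    assignment = set(assignment)
    changed = True
    while changed:
        changed = False
        for clause in cnf:
            if any(l in assignment for l in clause):
                continue                      # clause already ⊤
            unassigned = [l for l in clause if -l not in assignment]
            if not unassigned:
                return assignment, clause     # all literals ⊥: conflict
            if len(unassigned) == 1:
                assignment.add(unassigned[0]) # forced deduction
                changed = True
    return assignment, None

formula = [[-1, 3, 9], [-2, -3, 4], [1, 2], [-4, 5, 10]]
result, conflict = bcp(formula, {-1})
print(sorted(result, key=abs), conflict)
```

Starting from {¬a1}, the only forced literal is a2, coming from the clause (a1 ∨ a2).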


Simple Facts

Decision level: for an assignment, the number of decisions taken so far

Clearly, BCPs do not increase the decision level

When necessary, we will annotate each literal in the assignment with its decision level, written here as literal@level (e.g., a@5)

Example:

{a10@0, ¬a1@1, a3@2, ¬a4@3, a5@3}

[Figure: the corresponding branch of the search tree, with nodes a10, ¬a1, a3, ¬a4, a5.]


Simple Facts

The process of finding the satisfying assignment is called search, and in its basic version it evolves with Decisions, BCPs, and backtracking

[Figure: the search drawn against the number of variables, alternating Decision, BCP and Backtracking steps, with failures and the model found marked.]


The DPLL Procedure (Revised)

DPLL( V, C )

  if ( “a clause has all ⊥ literals” ) return unsat;      // Failure
  if ( “all clauses have a ⊤ literal” ) return sat;        // Model found

  if ( “a clause has all literals ⊥ but one unassigned literal l” )
    return DPLL( V ∪ {l}, C );                             // BCP

  l = “pick an unassigned literal”;                         // Decision

  if ( DPLL( V ∪ {l}, C ) == sat )                          // Left Branch
    return sat;
  else
    return DPLL( V ∪ {¬l}, C );                            // Backtracking (Right Branch)

To be called as DPLL( { }, C )
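The recursive procedure translates almost line by line into Python. The sketch below assumes the integer encoding of literals (i for ai, -i for ¬ai) and returns the satisfying assignment, or `None` for unsat; the decision simply picks the first unassigned literal it meets, which is an arbitrary choice made for illustration.

```python
from typing import FrozenSet, List, Optional

def dpll(v: FrozenSet[int], cnf: List[List[int]]) -> Optional[FrozenSet[int]]:
    """Recursive DPLL: returns a satisfying assignment or None (unsat)."""
    # Failure: some clause has all literals ⊥
    if any(all(-l in v for l in c) for c in cnf):
        return None
    # Model found: every clause has a ⊤ literal
    if all(any(l in v for l in c) for c in cnf):
        return v
    # BCP: a clause with all literals ⊥ but one unassigned literal
    for c in cnf:
        if any(l in v for l in c):
            continue
        free = [l for l in c if -l not in v]
        if len(free) == 1:
            return dpll(v | {free[0]}, cnf)
    # Decision: first unassigned literal (one exists, since some clause is undecided)
    lit = next(l for c in cnf for l in c if l not in v and -l not in v)
    left = dpll(v | {lit}, cnf)               # Left branch
    if left is not None:
        return left
    return dpll(v | {-lit}, cnf)              # Backtracking (right branch)

formula = [[-1, 2], [-1, 3, 9], [-2, -3, 4], [-4, 5, 10]]
model = dpll(frozenset(), formula)
print(sorted(model, key=abs))
```

As in the slides, the clause set is never modified; only the assignment grows along each branch.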


The Iterative DPLL Procedure (Revised)

dl = 0; flipped = { }; trail = { };        // Decision level, flipped vars, assignment

while ( true )
  if ( BCP( ) == conflict )                // Do BCP as long as possible
    done = false;
    do
      if ( dl == 0 ) return unsat;         // Unresolvable conflict
      l = GetDecisionLiteralAt( dl );
      BacktrackTo( dl − 1 );               // Backtracking (shrinks trail)
      if ( var(l) ∈ flipped )
        flipped = flipped \ {var(l)};      // Both branches tried: forget the flip
        dl = dl − 1;
      else
        trail = trail ∪ {¬l};
        flipped = flipped ∪ {var(l)};
        done = true;
    while ( !done );
  else
    if ( “all variables assigned” ) return sat;   // trail holds satisfying assignment
    l = Decision( );                       // Do another decision
    trail = trail ∪ {l}
    dl = dl + 1;                           // Increase decision level
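A runnable sketch of the iterative procedure is given below. Several details are assumptions made for illustration and do not come from the slides: the trail stores (literal, level) pairs, a dictionary `decisions` plays the role of GetDecisionLiteralAt, BCP is a naive rescan of all clauses, a variable is dropped from `flipped` once both of its branches have failed, and the decision always picks the smallest unassigned variable with positive polarity.

```python
from typing import Dict, List, Optional, Set, Tuple

def bcp(cnf: List[List[int]], trail: List[Tuple[int, int]], dl: int) -> bool:
    """Naive unit propagation at level dl; returns True on conflict."""
    true = {l for l, _ in trail}
    changed = True
    while changed:
        changed = False
        for c in cnf:
            if any(l in true for l in c):
                continue
            free = [l for l in c if -l not in true]
            if not free:
                return True
            if len(free) == 1:
                trail.append((free[0], dl))
                true.add(free[0])
                changed = True
    return False

def iterative_dpll(cnf: List[List[int]]) -> Optional[Set[int]]:
    variables = sorted({abs(l) for c in cnf for l in c})
    dl, flipped, trail = 0, set(), []
    decisions: Dict[int, int] = {}           # level -> decision literal
    while True:
        if bcp(cnf, trail, dl):
            done = False
            while not done:
                if dl == 0:
                    return None              # unresolvable conflict
                l = decisions[dl]
                trail[:] = [(x, lv) for x, lv in trail if lv < dl]  # BacktrackTo(dl - 1)
                if abs(l) in flipped:        # both branches failed at this level
                    flipped.discard(abs(l))
                    del decisions[dl]
                    dl -= 1
                else:                        # try the other branch at the same level
                    trail.append((-l, dl))
                    decisions[dl] = -l
                    flipped.add(abs(l))
                    done = True
        else:
            assigned = {abs(x) for x, _ in trail}
            free = [v for v in variables if v not in assigned]
            if not free:
                return {x for x, _ in trail} # trail holds a satisfying assignment
            dl += 1
            decisions[dl] = free[0]
            trail.append((free[0], dl))

formula = [[-1, 2], [-1, 3, 9], [-2, -3, 4], [-4, 5, 10]]
model = iterative_dpll(formula)
print(sorted(model, key=abs))
```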


The DPLL Procedure (Revised)

Some characteristics

BCP is called with highest priority, as much as possible

Backtracking is performed chronologically: search backtracks to the previous decision level

Some differences w.r.t. the “original” DPLL Procedure

C is not touched: non-destructive data structures (memory efficient)

The Pure Literal Rule is not used: it is expensive to implement (!)


From DPLL to CDCL

Consider the following scenario before and after BCP (each literal is written as literal@decision level)

. . . (¬a10 ∨ ¬a1 ∨ a4)(a3 ∨ ¬a1 ∨ a5)(¬a4 ∨ a6)(¬a5 ∨ ¬a6) . . .

Before BCP: {a10@0, ¬a3@1, a7@2, ¬a2@3, a1@4}

After BCP: {a10@0, ¬a3@1, a7@2, ¬a2@3, a1@4, a4@4, a5@4, a6@4}

BCP leads to a conflict: all literals of (¬a5 ∨ ¬a6) are now ⊥. But what is the reason for it? Do a7 and a2 play any role?

Does it make sense to consider the assignments which backtracking would produce?

{a10@0, ¬a3@1, ¬a7@2, ¬a2@3, a1@4}
{a10@0, ¬a3@1, a7@2, ¬a2@3, a1@4}
{a10@0, ¬a3@1, a7@2, a2@3, a1@4}

No, because whenever a10 and ¬a3 are assigned, a1 must not be set to ⊤. This translates to an additional clause, which can be learnt, i.e., it can be added to the formula

(¬a10 ∨ a3 ∨ ¬a1)
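The claim that a7 and a2 play no role can be checked mechanically. The sketch below restricts itself to the four clauses shown on the slide (the elided ones are ignored), uses the assumed integer encoding (i for ai, -i for ¬ai) and a naive propagation function named `propagate` for illustration.

```python
from itertools import product

def propagate(cnf, assignment):
    """Naive BCP: returns (extended assignment, conflicting clause or None)."""
    assignment = set(assignment)
    changed = True
    while changed:
        changed = False
        for clause in cnf:
            if any(l in assignment for l in clause):
                continue
            free = [l for l in clause if -l not in assignment]
            if not free:
                return assignment, clause
            if len(free) == 1:
                assignment.add(free[0])
                changed = True
    return assignment, None

clauses = [[-10, -1, 4], [3, -1, 5], [-4, 6], [-5, -6]]
learnt = [-10, 3, -1]

# Whatever the values of a7 and a2, {a10, ¬a3, a1} runs into the same conflict.
outcomes = [propagate(clauses, {10, -3, s7, s2, 1})[1]
            for s7, s2 in product((7, -7), (2, -2))]
print(outcomes)

# With the learnt clause, ¬a1 is propagated as soon as a10 and ¬a3 hold.
after, conflict = propagate(clauses + [learnt], {10, -3})
print(sorted(after, key=abs), conflict)
```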




From DPLL to CDCL

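[Figure: two search trees over a10, ¬a3, a7 / ¬a7, a2 / ¬a2, comparing the search without the learnt clause (the chain a1, a4, a5, a6 is explored before ¬a1 is reached) and with the learnt clause (¬a1 appears directly in every branch).]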


Deriving the clause to be learnt

The clause to be learnt is derived from the conflict “caused” by BCP (a conflict is never caused by a Decision)

To better analyze a conflict we need to look at the implication graph

. . . (¬a10 ∨ ¬a1 ∨ a4)(a3 ∨ ¬a1 ∨ a5)(¬a4 ∨ a6)(¬a5 ∨ ¬a6) . . .

{a10@0, ¬a3@1, a7@2, ¬a2@3, a1@4, a4@4, a5@4, a6@4} (each literal written as literal@decision level)

[Figure: implication graph over the nodes a10, ¬a3, a7, ¬a2, a1, a4, a5, a6. Following the clauses above: a10 and a1 imply a4; ¬a3 and a1 imply a5; a4 implies a6; a5 and a6 together falsify (¬a5 ∨ ¬a6). a7 and ¬a2 are isolated.]

Any cut of the graph that separates the conflict from the decision variables represents a possible learnt clause

In this course we take the clause that contains only one variable of the current decision level





In practice, we use resolution steps to compute the learnt clause:

we start from the clause with all literals to ⊥ (conflicting clause)

iteratively, we take the clause that propagated the last literal on the trail and we apply resolution

we stop when only one literal from the current decision level is left in the clause

Trail    dl    Reason
a10      0     (a10)
¬a3      1     Decision
a7       2     Decision
¬a2      3     Decision
a1       4     Decision
a4       4     (¬a10 ∨ ¬a1 ∨ a4)
a5       4     (a3 ∨ ¬a1 ∨ a5)
a6       4     (¬a4 ∨ a6)

Starting from the conflicting clause (¬a5 ∨ ¬a6), the resolution steps are:

(¬a5 ∨ ¬a6) and (¬a4 ∨ a6) give (¬a5 ∨ ¬a4)
(¬a5 ∨ ¬a4) and (a3 ∨ ¬a1 ∨ a5) give (¬a4 ∨ a3 ∨ ¬a1)
(¬a4 ∨ a3 ∨ ¬a1) and (¬a10 ∨ ¬a1 ∨ a4) give (¬a10 ∨ a3 ∨ ¬a1)

We say that (¬a10 ∨ a3 ∨ ¬a1) is the conflict clause and it is the one we learn
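The resolution-based derivation can be sketched in a few lines of Python. The data layout is an assumption made for illustration: literals are integers (i for ai, -i for ¬ai) and the trail is a list of (literal, level, reason) triples, with `None` as the reason of a decision. The function names are hypothetical.

```python
from typing import List, Optional, Tuple

TrailEntry = Tuple[int, int, Optional[List[int]]]  # (literal, level, reason)

def resolve(c1: List[int], c2: List[int], var: int) -> List[int]:
    """Resolvent of c1 and c2 on variable var."""
    return sorted({l for l in c1 + c2 if abs(l) != var}, key=abs)

def analyze_conflict(conflicting: List[int], trail: List[TrailEntry],
                     current_level: int) -> List[int]:
    """Walk the trail backwards, resolving until a single literal of the
    current decision level is left in the clause."""
    level = {abs(l): lv for l, lv, _ in trail}
    clause = list(conflicting)
    for lit, _, reason in reversed(trail):
        at_current = sum(1 for l in clause if level[abs(l)] == current_level)
        if at_current <= 1:
            break
        if reason is not None and -lit in clause:
            clause = resolve(clause, reason, abs(lit))
    return clause

trail: List[TrailEntry] = [
    (10, 0, [10]),
    (-3, 1, None),
    (7, 2, None),
    (-2, 3, None),
    (1, 4, None),
    (4, 4, [-10, -1, 4]),
    (5, 4, [3, -1, 5]),
    (6, 4, [-4, 6]),
]
learnt = analyze_conflict([-5, -6], trail, current_level=4)
print(learnt)
```

On the trail of the example this performs the same three resolution steps (on a6, a5, a4) and stops with the literals ¬a1, a3, ¬a10.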



Non-chronological backtracking

So far we have been backtracking chronologically, but given our (wrong) assignment {a10@0, ¬a3@1, a7@2, ¬a2@3, a1@4, a4@4, a5@4, a6@4} (each literal written as literal@decision level) and the computed conflict clause (¬a10 ∨ a3 ∨ ¬a1), is it really clever to backtrack to level 3?

The correct level to backtrack to is:

“the level that would have propagated ¬a1 if we had the clause (¬a10 ∨ a3 ∨ ¬a1) as part of the original formula”

In our case, it is level 1, because the assignment {a10@0, ¬a3@1} is sufficient to propagate ¬a1 in (¬a10 ∨ a3 ∨ ¬a1)

We say that ¬a1 is the asserting literal, as it becomes true by BCP once we have backtracked to the correct decision level
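One way to compute this level is to take the second-highest decision level occurring in the learnt clause: at that level every literal except the asserting one is already ⊥. The sketch below assumes this formulation, the integer encoding of literals (i for ai, -i for ¬ai), and the convention of returning level 0 for a learnt clause with literals from a single level; none of these details are spelled out in the slides.

```python
from typing import Dict, List

def backtrack_level(learnt: List[int], level: Dict[int, int]) -> int:
    """Second-highest decision level among the literals of the learnt clause."""
    levels = sorted({level[abs(l)] for l in learnt}, reverse=True)
    return levels[1] if len(levels) > 1 else 0

def asserting_literal(learnt: List[int], level: Dict[int, int]) -> int:
    """The literal of the learnt clause assigned at the highest level."""
    return max(learnt, key=lambda l: level[abs(l)])

# Decision level of each variable in the example assignment
level = {10: 0, 3: 1, 7: 2, 2: 3, 1: 4, 4: 4, 5: 4, 6: 4}
learnt = [-10, 3, -1]

print(backtrack_level(learnt, level), asserting_literal(learnt, level))
```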


CDCL: Conflict Driven Clause Learning

Search

DPLL                          CDCL
no learning                   conflict-driven learning
chronological backtracking    non-chronological backtracking

[Figure: two search trees over a1, a2, a3, a4, one for DPLL and one for CDCL.]


The CDCL Procedure

dl = 0; trail = { };                        // Decision level, assignment

while ( true )
  if ( BCP( ) == conflict )                 // Do BCP as long as possible
    if ( dl == 0 ) return unsat;            // Unresolvable conflict
    C, dl = AnalyzeConflict( );             // Compute conflict clause and decision level
    AddClause( C );                         // Add C to clause database
    BacktrackTo( dl );                      // Backtracking (shrinks trail)
  else
    if ( “all variables assigned” ) return sat;   // trail holds satisfying assignment
    l = Decision( );                        // Do another decision
    trail = trail ∪ {l}
    dl = dl + 1;                            // Increase decision level
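Putting the pieces together, the loop can be sketched as a small, unoptimized Python program. Everything beyond the loop structure is an assumption made for illustration: the integer encoding of literals, the (literal, level, reason) trail, naive BCP by rescanning all clauses, conflict analysis by resolution, the second-highest level as backtrack level, and a decision that picks the smallest unassigned variable with positive polarity.

```python
from typing import List, Optional, Set

def cdcl(cnf: List[List[int]]) -> Optional[Set[int]]:
    clauses = [list(c) for c in cnf]          # clause database (grows with learning)
    variables = sorted({abs(l) for c in clauses for l in c})
    trail = []                                # entries (literal, level, reason or None)
    dl = 0

    def bcp():
        """Propagate forced literals; return the conflicting clause or None."""
        changed = True
        while changed:
            changed = False
            true = {l for l, _, _ in trail}
            for c in clauses:
                if any(l in true for l in c):
                    continue
                free = [l for l in c if -l not in true]
                if not free:
                    return c
                if len(free) == 1:
                    trail.append((free[0], dl, c))
                    true.add(free[0])
                    changed = True
        return None

    def analyze(conflicting):
        """Resolve backwards along the trail; return (learnt clause, level)."""
        level = {abs(l): lv for l, lv, _ in trail}
        clause = list(conflicting)
        for lit, _, reason in reversed(trail):
            if sum(level[abs(l)] == dl for l in clause) <= 1:
                break
            if reason is not None and -lit in clause:
                clause = list({l for l in clause + reason if abs(l) != abs(lit)})
        levels = sorted({level[abs(l)] for l in clause}, reverse=True)
        return clause, (levels[1] if len(levels) > 1 else 0)

    while True:
        conflict = bcp()
        if conflict is not None:
            if dl == 0:
                return None                   # unresolvable conflict
            learnt, dl = analyze(conflict)
            clauses.append(learnt)            # AddClause
            trail[:] = [e for e in trail if e[1] <= dl]   # BacktrackTo(dl)
        else:
            assigned = {abs(l) for l, _, _ in trail}
            free = [v for v in variables if v not in assigned]
            if not free:
                return {l for l, _, _ in trail}
            dl += 1                           # new decision level
            trail.append((free[0], dl, None)) # Decision

formula = [[-10, -1, 4], [3, -1, 5], [-4, 6], [-5, -6], [10], [-3]]
model = cdcl(formula)
print(sorted(model, key=abs))
```

In this small run the first decision a1 produces the conflict of the running example; the solver learns a clause over a10, a3 and a1, backtracks to level 0 and propagates ¬a1 there.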


The CDCL Procedure

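[Figure: flowchart of the CDCL loop. BCP: with no conflict go to Decision; with a conflict go to AnalyzeConflict, or to UNSAT if the conflict is at level 0. Decision: assign a variable and return to BCP, or report SAT if no variables are left. AnalyzeConflict is followed by AddClause and BacktrackTo, then BCP again.]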


Other things which we did not see

Watched Literals: a technique to efficiently discover which clauses become unit or conflicting (all literals ⊥)

Decision heuristics: how do we choose the right variable? And which polarity (⊤, ⊥)? The landmark strategy is the VSIDS heuristic

Clause removal: adding too many clauses negatively impacts performance, so mechanisms are needed to remove learnt clauses

Restarts: start the search from scratch, but retain the learnt clauses

Many other heuristics are discovered every year


Exercise

Consider the following clause set and assignment

(¬a1 ∨ a2)(¬a1 ∨ a3 ∨ a9)(¬a2 ∨ ¬a3 ∨ a4)(¬a4 ∨ a5 ∨ a10)(¬a4 ∨ a6 ∨ a11)(¬a5 ∨ ¬a6)(a1 ∨ a7 ∨ ¬a12)(a1 ∨ a8)(¬a7 ∨ ¬a8 ∨ ¬a13). . .

{¬a9@1, a12@2, a13@2, ¬a10@3, ¬a11@3, . . . , a1@6} (each literal written as literal@decision level)

Find the correct conflict clause

Find the correct decision level to backtrack to

