Introduction to Satisfiability Modulo Theories Yakir Vizel SMT Seminar Spring 2014.


Introduction to Satisfiability Modulo Theories

Yakir Vizel

SMT Seminar Spring 2014

Outline

• Background
  – Propositional logic
  – DPLL-style SAT solvers

• The SMT World
  – First order logic
  – Theories

• SMT Solving
  – Eager approach
  – Lazy approach
  – DPLL(T)

Motivation

int power3(int in) {
  int i, out_a;
  out_a = in;
  for (i = 0; i < 2; i++)
    out_a = out_a * in;
  return out_a;
}

int power3_new(int in) {
  int out_b;
  out_b = (in * in) * in;
  return out_b;
}

• Are these two programs equivalent?

• In general, this is an undecidable problem
  – No sound and complete method

Motivation (2)

• Only bounded loops in our example, so each program can be written as a formula

φa: out0_a = in ⋀ out1_a = out0_a * in ⋀ out2_a = out1_a * in

φb: out0_b = (in * in) * in

• Need to prove that φa ⋀ φb → out2_a = out0_b is a valid formula

Basic Definitions

• An assignment to a formula φ is a mapping from the variables of φ to a domain D
  – (a ⋁ b) → c: a -> 1, b -> 0, c -> 1

• Given a formula φ
  – it is satisfiable if there exists an assignment s.t. φ evaluates to TRUE
  – it is a contradiction if no such assignment exists
  – it is a tautology (or valid) if it evaluates to TRUE under every assignment
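The three definitions can be checked mechanically for small formulas by enumerating every assignment. The following is a minimal sketch, not part of the original slides: `classify` and `implies` are hypothetical helper names, formulas are passed as Python functions, and the domain D is assumed to be {TRUE, FALSE}. An implication such as (a ⋁ b) → c is used as the first example.

```python
from itertools import product

def classify(formula, num_vars):
    """Classify a Boolean function by enumerating every assignment."""
    results = [bool(formula(*values))
               for values in product([False, True], repeat=num_vars)]
    if all(results):
        return "tautology"
    if any(results):
        return "satisfiable"      # satisfiable, but not valid
    return "contradiction"

def implies(p, q):
    """Material implication: p -> q is the same as (not p) or q."""
    return (not p) or q

print(classify(lambda a, b, c: implies(a or b, c), 3))  # satisfiable
print(classify(lambda a: a or not a, 1))                # tautology
print(classify(lambda a: a and not a, 1))               # contradiction
```

Enumeration takes 2^n steps for n variables, which is why the rest of the talk turns to SAT solvers.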

Decision Procedure

• The Decision Problem for a given formula φ is to determine whether φ is valid

• We want a Decision Procedure to be Sound and Complete
  – Sound = when it returns “Valid”, φ is valid
  – Complete = it terminates, and when φ is valid, it returns “Valid”

• A Theory is decidable if there is a decision procedure for it

Formal Reasoning

• We want to reason about the validity (or satisfiability) of a formula

• Model Theoretic approach
  – Enumerate possible solutions
  – Need a finite number of candidates

• Proof Theoretic approach
  – Deductive mechanism
  – Based on axioms and inference rules
  – Forms an inference system

Formal Reasoning

• Three statements
  – If x is a prime number greater than 2, then x is odd
  – It is not the case that x is not a prime number greater than 2
  – x is not odd

• Prime number greater than 2 – A
• Odd number – B

• The statements become: A → B, ¬¬A, ¬B

Inference Rules

• (M.P) from φ1 and φ1 → φ2, infer φ2

• (CONTR) from φ and ¬φ, infer FALSE

• (DOUBLE NEG) ¬¬φ ⇔ φ

Proof Theoretic

Premises: A → B, ¬¬A, ¬B

1. A → B (premise)
2. ¬¬A (premise)
3. A (2, DOUBLE NEG)
4. ¬B (premise)
5. B (1+3, M.P)
6. FALSE (4+5, CONTR)

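Model Theoretic

Premises: A → B, ¬¬A, ¬B

A | B | A → B | (A → B) ⋀ A | (A → B) ⋀ A ⋀ ¬B
0 | 0 |   1   |      0      |        0
0 | 1 |   1   |      0      |        0
1 | 0 |   0   |      0      |        0
1 | 1 |   1   |      1      |        0

The last column is 0 in every row, so the three statements cannot all hold.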

Propositional Logic

• Logic:
  – A (formal) language for making statements about objects and reasoning about properties of these objects

• The alphabet:
  – Countable set of proposition symbols: a, b, c, …
  – Logical connectives: ⋁ (or), ⋀ (and), ¬ (not/negation), ≣ (equivalence), → (implication)
  – Connectives can be expressed using other connectives
  – (a → b) is the same as (¬a ⋁ b)

Propositional Logic (2)

• A formula is defined using the following grammar:
  – formula: formula ⋀ formula | ¬formula | (formula) | atom
  – atom: Boolean variable | TRUE | FALSE

• An example of a data structure that uses this grammar: And-Inverter Graph (AIG)

[Figure: an And-Inverter Graph over the inputs a, b, c]

Normal Forms

• Negation Normal Form
  – Use only or, and, and not
  – ¬ appears only on proposition symbols (variables)

• Conjunctive Normal Form
  – literal: a or ¬a
  – clause: disjunction of literals
  – CNF: conjunction of clauses

DPLL-style SAT solvers

• Objective:
  – Check satisfiability of a CNF formula
  – Given a CNF formula, is there a satisfying assignment?

• Approach:
  – Branch: make arbitrary decisions
  – Propagate: build the implication graph (BCP)
  – Use conflicts to guide inference steps

• Solvers: GRASP, CHAFF, MiniSAT, Glucose

• SAT solvers can also generate refutation proofs!

Pseudo Code

DPLL(φ)
  AddClauses(cnf(φ));
  if (BCP() == “confl”) return UnSAT;
  while (true) do
    if (!Decide()) return SAT;
    while (BCP() == “confl”) do
      bl = Analyze();
      if (bl < 0) return UnSAT;
      else BackTrack(bl);
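To make the decide/propagate loop concrete, here is a deliberately simplified sketch in Python. It is an assumption-laden illustration, not the algorithm on the slide: it uses recursion and chronological backtracking instead of `Analyze()` and `BackTrack(bl)`, learns no conflict clauses, and represents literals as signed integers (3 for a variable, -3 for its negation). The names `unit_propagate` and `dpll` are hypothetical.

```python
def unit_propagate(clauses, assignment):
    """BCP: repeatedly assign literals forced by unit clauses.

    Adds forced literals to `assignment` (a set of literals) and returns
    the simplified clause list, or None if some clause became empty.
    """
    clauses = [set(c) for c in clauses]
    while True:
        simplified, unit = [], None
        for clause in clauses:
            if any(lit in assignment for lit in clause):
                continue                      # clause already satisfied
            remaining = {lit for lit in clause if -lit not in assignment}
            if not remaining:
                return None                   # all literals false: conflict
            if len(remaining) == 1 and unit is None:
                unit = next(iter(remaining))  # remember one forced literal
            simplified.append(remaining)
        if unit is None:
            return simplified                 # nothing left to propagate
        assignment.add(unit)
        clauses = simplified

def dpll(clauses, assignment=None):
    """Return a set of true literals satisfying `clauses`, or None."""
    assignment = set() if assignment is None else set(assignment)
    clauses = unit_propagate(clauses, assignment)
    if clauses is None:
        return None                           # conflict under this branch
    if not clauses:
        return assignment                     # every clause is satisfied
    lit = next(iter(clauses[0]))              # arbitrary decision
    for choice in (lit, -lit):
        result = dpll(clauses, assignment | {choice})
        if result is not None:
            return result
    return None

print(dpll([[1], [-1]]))  # None: the formula is unsatisfiable
```

A returned model may be partial: variables that no longer occur in any unsatisfied clause are left unassigned and can take either value.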

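The Implication Graph (BCP)

[Figure: implication graph for a formula with two clauses, one over a, b and one over b, c, d. The decisions on a and c force values for b and d through BCP, giving a full assignment to a, b, c, d.]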

Propositional Resolution

[Figure: two clauses, one over a, b, c and one over a, c, d, are resolved on a to give a clause over b, c, d.]

When a conflict occurs, the implication graph is used to guide the resolution of clauses, so that the same conflict will not occur again.

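Conflict Clauses

[Figure: implication graph for four clauses over (a, c), (a, b), (b, c, d) and (b, d). A decision leads to a conflict; resolving the clauses involved along the graph yields a conflict clause over (b, c) and then one over (a, c). On the following slide, resolution continues to the unit clause (c) and finally to the empty clause ( ).]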

Generating refutations

• Refutation = a proof of the null clause
  – Record a DAG containing all resolution steps performed during conflict clause generation.
  – When the null clause is generated, we can extract a proof of the null clause as a resolution DAG.

[Figure: resolution DAG with the original clauses at the top, derived clauses below them, and the null clause at the bottom]

Formal Reasoning – Which?

• Model Theoretic
  – The SAT solver tries to find an assignment
  – In a way, it enumerates assignments
  – Makes arbitrary decisions for variables

• Proof Theoretic
  – The learning scheme when hitting a conflict
  – Uses Resolution as an inference rule
  – The only inference rule

• So… a combination of the two

• (RES) from (v ⋁ A) and (¬v ⋁ B), infer (A ⋁ B)
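The resolution rule is small enough to write down directly. The sketch below is illustrative only: clauses are assumed to be frozensets of signed integers, `resolve` is a hypothetical helper name, and the example encodes the earlier statements about primes as clauses with A = 1 and B = 2, so A → B becomes (¬A ⋁ B).

```python
def resolve(c1, c2, v):
    """Resolve c1 (which contains v) with c2 (which contains -v) on v."""
    if v not in c1 or -v not in c2:
        raise ValueError("pivot must be positive in c1 and negative in c2")
    return frozenset((set(c1) - {v}) | (set(c2) - {-v}))

A, B = 1, 2
a_implies_b = frozenset({-A, B})   # A -> B written as a clause
a = frozenset({A})                 # the premise A
not_b = frozenset({-B})            # the premise not B

b = resolve(a, a_implies_b, A)     # derives the unit clause (B)
empty = resolve(b, not_b, B)       # derives the empty (null) clause
print(b, empty)
```

Deriving the empty clause in two steps is a resolution refutation of the three premises, the same conclusion that the truth table and the M.P-based proof reach.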

Circuit to SAT

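[Figure: a two-gate circuit. The input variables a and b feed gate g; g and the input variable c feed gate p, whose output variable is the circuit output.]

• Can the circuit output be 1?

• Each gate contributes three clauses relating its output variable to its inputs: one group over a, b, g and one over g, c, p

• The conjunction of these clauses is CNF(p)

• p is satisfiable when the formula CNF(p) ⋀ p is satisfiable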
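The gate-by-gate translation can be sketched as follows. The gate types are an assumption made for illustration (both gates are taken to be AND gates, since the transcript does not preserve them), and `and_gate` and `satisfiable` are hypothetical helpers; the brute-force check stands in for a real SAT solver.

```python
from itertools import product

def and_gate(x, y, out):
    """Clauses asserting out <-> (x AND y); literals are signed integers."""
    return [[x, -out], [y, -out], [-x, -y, out]]

def satisfiable(clauses, num_vars):
    """Brute-force SAT check: return a satisfying assignment or None."""
    for bits in product([False, True], repeat=num_vars):
        def value(lit):
            return bits[abs(lit) - 1] == (lit > 0)
        if all(any(value(lit) for lit in clause) for clause in clauses):
            return bits
    return None

a, b, c, g, p = 1, 2, 3, 4, 5
cnf_p = and_gate(a, b, g) + and_gate(g, c, p)   # CNF(p), assuming AND gates

# The output can be 1 iff CNF(p) together with the unit clause (p) is satisfiable.
model = satisfiable(cnf_p + [[p]], 5)
print(model)  # (True, True, True, True, True)
```

With two AND gates the only way to drive p to 1 is to set every input to 1, which is the single model the check finds.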

Questions?

• Is a SAT Solver a Decision Procedure?

• Yes!
  – But it looks for an assignment…?
  – Yes, but it can also prove the lack of one

• So, in order to prove the validity of φ
  – Prove that ¬φ is a contradiction

First Order Logic

• The alphabet:
  – Countable set of symbols: a, b, c, …
  – Logical connectives: ⋁ (or), ⋀ (and), ¬ (not/negation), ≣ (equivalence), → (implication), quantifiers (∀, ∃)
  – Non-logical symbols
    • Functions (F(x1,…,xn))
    • Constants (1, 2, 3, …)
    • Predicate symbols (x > 0)

First Order Logic

• The Decision Problem for FOL is undecidable
  – There is no sound and complete decision procedure for it

• Several approaches
  – Limit the syntax
    • Only FOL formulas without quantifiers or function symbols
  – Limit the possible models (assignments)
    • Not always easier
  – Show validity for a given Theory
    • Propositional logic is a “subset”, or formally, a Theory

Equality and Uninterpreted Functions

• Or in short EUF

• Based on
  – (x1 = y1 ⋀ … ⋀ xn = yn) → f(x1,…,xn) = f(y1,…,yn)

• NP-Complete problem
  – Meaning it is decidable
  – Polynomial reduction to a propositional problem

Equality and Uninterpreted Functions

• Recall the motivating example

Original formulas:

out0_a = in ⋀ out1_a = out0_a * in ⋀ out2_a = out1_a * in

out0_b = (in * in) * in

With multiplication replaced by an uninterpreted function F:

out0_a = in ⋀ out1_a = F(out0_a, in) ⋀ out2_a = F(out1_a, in)

out0_b = F(F(in, in), in)

• Suitable for any function F
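For a conjunction of equalities like this one, the congruence axiom can be applied until nothing new is learned. The sketch below is a naive congruence closure under stated assumptions: terms are strings for variables and tuples `("F", arg1, arg2)` for applications, and `congruence_closure` is a hypothetical name. Real solvers use far more efficient data structures.

```python
def congruence_closure(equalities):
    """Return find(), mapping each known term to its class representative."""
    parent = {}

    def add(term):
        if term not in parent:
            parent[term] = term
            if isinstance(term, tuple):
                for arg in term[1:]:
                    add(arg)               # register subterms as well

    def find(term):
        while parent[term] != term:
            term = parent[term]
        return term

    for lhs, rhs in equalities:
        add(lhs)
        add(rhs)
        parent[find(lhs)] = find(rhs)      # merge the two classes

    changed = True
    while changed:                         # naive fixpoint of the congruence rule
        changed = False
        apps = [t for t in parent if isinstance(t, tuple)]
        for s in apps:
            for t in apps:
                same_symbol = s[0] == t[0] and len(s) == len(t)
                if (find(s) != find(t) and same_symbol
                        and all(find(x) == find(y)
                                for x, y in zip(s[1:], t[1:]))):
                    parent[find(s)] = find(t)
                    changed = True
    return find

def F(x, y):
    return ("F", x, y)

find = congruence_closure([
    ("out0_a", "in"),
    ("out1_a", F("out0_a", "in")),
    ("out2_a", F("out1_a", "in")),
    ("out0_b", F(F("in", "in"), "in")),
])
print(find("out2_a") == find("out0_b"))  # True
```

The two outputs end up in the same equivalence class without any property of multiplication being used, which is why the argument holds for any function F.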

Linear Arithmetic

• Or LIA

• Formulas of the form
  – 3x + 2y ≤ 5z ⋀ 2x - 2y = 0
  – Equality is a fragment of LIA

• Usage
  – Code optimizations performed by compilers

Arrays

• Operations on the array data structure
  – A map from an index to an element

• Basic operations
  – Reading an element from index i
  – Writing an element to index i

Satisfiability Modulo Theories

• Objective:
  – Given a theory T, check satisfiability of a T-formula

• Approach:
  – Eager encoding
    • Reduce to a SAT problem
  – Lazy encoding
    • Intro to DPLL(T)
  – DPLL(T)

Reduction to SAT

• SMT Solving is based on the observation that a T-formula can be reduced to SAT

• (x-y≤0) ⋀ (y-z≤0) ⋀ ((z-x≤-1) ⋁ (z-x≤-2))
  – Use a Boolean variable for each atom
    • a – (x-y≤0)
    • b – (y-z≤0)
    • c – (z-x≤-1)
    • d – (z-x≤-2)
  – a ⋀ b ⋀ (c ⋁ d)

Eager Encoding

• Reduction to SAT clearly does not capture the semantics

• Eagerly encode the semantics into the formula

• In our example, we have four T-atoms: a, b, c, d
  – Need to capture the relation between the atoms:
    • Are a and b consistent?
    • Are a and ¬b consistent?
    • …
    • Are a and b and c consistent?
    • …
  – Can lead to 2^n relations for n atoms: exponential!

Eager Encoding - Example

• Back to our example
  – (x-y≤0) ⋀ (y-z≤0) ⋀ ((z-x≤-1) ⋁ (z-x≤-2))
  – a ⋀ b ⋀ (c ⋁ d)

• Clearly (x-y≤0), (y-z≤0) and (z-x≤-1) cannot all be TRUE
  – Thus we can add ¬(a ⋀ b ⋀ c)

• In a similar manner we add
  – ¬(¬a ⋀ ¬b ⋀ ¬c)
  – ¬(a ⋀ b ⋀ d)
  – ¬(¬a ⋀ ¬b ⋀ ¬d)

Eager Encoding – Example (2)

• The resulting formula
  – (a ⋀ b ⋀ (c ⋁ d)) ⋀ ¬(a ⋀ b ⋀ c) ⋀ ¬(¬a ⋀ ¬b ⋀ ¬c) ⋀ ¬(a ⋀ b ⋀ d) ⋀ ¬(¬a ⋀ ¬b ⋀ ¬d)

• The formula is then passed to a SAT solver
  – If it is satisfiable, then so is the original formula
  – Same goes for unsatisfiability

• Redundancy?
  – ¬(¬a ⋀ ¬b ⋀ ¬d)
  – ¬(¬a ⋀ ¬b ⋀ ¬c)

Schematics

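[Figure: eager schematic. The SMT formula goes to the Encoder, which consults the T-Solver and emits CNF + Relations; the SAT-Solver takes this CNF formula and answers SAT/UnSAT.]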

Pseudo Code

Eager(φ)
  R = T-Solver(lit(φ));
  B = e(φ) ⋀ e(R);
  <a, res> = SAT(B);
  if (res == “UnSAT”) return UnSAT
  else return SAT;

Eager Encoding

• Pros
  – Easy to implement
    • SAT solver is used as a black-box
  – Works well for the bit-vector theory

• Cons
  – Encoding may get too big to handle
  – Search starts only after the entire problem is translated

Lazy Encoding

• Recall: reduction to SAT does not capture the semantics

• Lazily encode the semantics into the formula
  – How?

• Fewer redundancies

Lazy Encoding – How it works

• Our example
  – (x-y≤0) ⋀ (y-z≤0) ⋀ ((z-x≤-1) ⋁ (z-x≤-2))
  – a ⋀ b ⋀ (c ⋁ d)

• Start by trying to solve the propositional formula
  – If UnSAT – done
  – Otherwise, get the assignment
    • a -> TRUE, b -> TRUE, c -> TRUE, d -> FALSE
  – Check if the assignment holds in the theory
    • If yes, it is a true assignment
    • Otherwise, block the assignment: ¬(a ⋀ b ⋀ c ⋀ ¬d)
  – Solve again

Schematics

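[Figure: lazy schematic. The Encoder turns the SMT formula into a CNF formula for the SAT-Solver. The SAT-Solver either reports UnSAT or passes an assignment to the T-Solver, which either reports SAT or returns a blocking clause to the SAT-Solver.]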

Pseudo Code

Lazy(φ)
  B = e(φ);
  while (true) do
    <a, res> = SAT(B);
    if (res == “UnSAT”) return UnSAT
    else
      <t, res> = T-Solver(T(a))
      if (res == “SAT”) return SAT;
      B = B ⋀ e(t)
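The loop above can be tried on the running example with a toy setup. Everything in this sketch is an assumption made for illustration: the atoms are integer difference constraints of the form x - y ≤ c, the T-Solver is a Bellman-Ford negative-cycle check, the SAT call is brute-force enumeration, and the names `ATOMS`, `t_consistent`, `sat` and `lazy_smt` are hypothetical.

```python
from itertools import product

# Atom name -> (x, y, c), meaning x - y <= c over the integers.
ATOMS = {"a": ("x", "y", 0), "b": ("y", "z", 0),
         "c": ("z", "x", -1), "d": ("z", "x", -2)}

def t_consistent(assignment):
    """T-Solver: are the (possibly negated) difference constraints consistent?"""
    edges = []
    for name, value in assignment.items():
        x, y, c = ATOMS[name]
        # not (x - y <= c) is y - x <= -c - 1 over the integers
        edges.append((y, x, c) if value else (x, y, -c - 1))
    nodes = {n for u, v, _ in edges for n in (u, v)}
    dist = dict.fromkeys(nodes, 0)
    for _ in range(len(nodes) + 1):
        updated = False
        for u, v, w in edges:              # relax: dist[v] <= dist[u] + w
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                updated = True
        if not updated:
            return True                    # stable distances: consistent
    return False                           # still relaxing: negative cycle

def sat(formula, names, blocked):
    """Brute-force stand-in for SAT(B): a model that is not blocked, or None."""
    for bits in product([True, False], repeat=len(names)):
        model = dict(zip(names, bits))
        if formula(**model) and model not in blocked:
            return model
    return None

def lazy_smt(formula, names):
    """Return (model or None, list of blocked assignments)."""
    blocked = []
    while True:
        model = sat(formula, names, blocked)
        if model is None:
            return None, blocked           # UnSAT
        if t_consistent(model):
            return model, blocked          # SAT in the theory
        blocked.append(model)              # B = B AND not(model)

result, blocked = lazy_smt(lambda a, b, c, d: a and b and (c or d), "abcd")
print(result, len(blocked))  # None 3
```

On this example every propositional model sets a and b to TRUE together with c or d, and each of the three is T-inconsistent, so the loop blocks all three before reporting UnSAT.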

Lazy Encoding

• Pros
  – Easy to implement
    • SAT solver is used as a black-box
    • Need to implement T-Solver

• Cons
  – SAT Solver must finish entirely before T-inconsistency is checked
  – SAT Solver restarts (at the beginning) if the last assignment failed

How Can We Improve?

• The SAT Solver and T-Solver are separate
  – Each works as a black-box

• Try to bring them closer together
  – One should help the other

Incrementality

• Problem
  – SAT Solver restarts (at the beginning) if the last assignment failed

• Keep the state of the SAT solver, and simply resume
  – After a full assignment is found, check with the T-Solver
  – If it is not a real assignment, add a clause and run BCP

Pseudo Code

Lazy-DPLL(φ)
  AddClauses(cnf(e(φ)));
  if (BCP() == “confl”) return UnSAT;
  while (true) do
    if (!Decide())
      <t, res> = T-Solver(T(a))
      if (res == “SAT”) return SAT;
      AddClause(e(t));
      while (BCP() == “confl”) do
        bl = Analyze();
        if (bl < 0) return UnSAT;
        else BackTrack(bl);
    else
      while (BCP() == “confl”) do
        bl = Analyze();
        if (bl < 0) return UnSAT;
        else BackTrack(bl);

Use Partial Assignment

• Problem
  – SAT Solver must finish entirely before T-inconsistency is checked

• Try to find a T-inconsistency using only a partial assignment
  – Send the partial assignment to the T-Solver after k decisions
  – May be hard to find the optimal k

Use Partial Assignment

• Consider a formula that has (10 ≤ x) and (x < 0) (represented by v and u respectively)
  – Suppose v and u are both assigned TRUE
  – Then every call to the T-Solver results in UnSAT, whatever the rest of the assignment is
  – UnSAT may also be due to other reasons

• Thus, when these two are assigned TRUE we want to block all assignments that contain them
  – By adding ¬(v ⋀ u)

Theory Propagation

• How do we improve more?
  – Deeper integration

• T-Solver uses the current partial assignment to deduce values for other literals
  – Similar to BCP, but using the Theory

• Improves performance dramatically
• But makes conflict analysis hard

Theory Propagation

• Consider a formula that has (10 ≤ x), (x < 0), (x ≤ y) and (5 ≤ y)

• By knowing that (10 ≤ x) and (x ≤ y) are assigned TRUE, the T-Solver can deduce facts
  – Decisions: (10 ≤ x), (x ≤ y)
  – Deduced: ¬(x < 0), (5 ≤ y)
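One simple way to realize theory propagation, sketched here under stated assumptions, is to ask for each unassigned atom whether one of its two values is inconsistent with the current partial assignment; if so, the other value is implied. The atoms are encoded as integer difference constraints against an auxiliary variable `zero` that stands for 0 (an encoding chosen for this illustration), and `ATOMS`, `consistent` and `theory_propagate` are hypothetical names. The partial assignment is assumed to be T-consistent.

```python
# Atom name -> (x, y, c), meaning x - y <= c over the integers.
ATOMS = {
    "10<=x": ("zero", "x", -10),   # 0 - x <= -10
    "x<0":   ("x", "zero", -1),    # x - 0 <= -1
    "x<=y":  ("x", "y", 0),
    "5<=y":  ("zero", "y", -5),    # 0 - y <= -5
}

def consistent(assignment):
    """Bellman-Ford negative-cycle check on the asserted constraints."""
    edges = []
    for name, value in assignment.items():
        x, y, c = ATOMS[name]
        # not (x - y <= c) is y - x <= -c - 1 over the integers
        edges.append((y, x, c) if value else (x, y, -c - 1))
    nodes = {n for u, v, _ in edges for n in (u, v)}
    dist = dict.fromkeys(nodes, 0)
    for _ in range(len(nodes) + 1):
        updated = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                updated = True
        if not updated:
            return True
    return False

def theory_propagate(partial):
    """Return the values of unassigned atoms implied by `partial`."""
    implied = {}
    for name in ATOMS:
        if name in partial:
            continue
        for value in (True, False):
            # If the opposite value is impossible, `value` is implied.
            if not consistent({**partial, name: not value}):
                implied[name] = value
                break
    return implied

print(theory_propagate({"10<=x": True, "x<=y": True}))
# {'x<0': False, '5<=y': True}
```

Each implied value costs one extra consistency check here; dedicated T-Solvers usually derive such implications as a by-product of their own data structures.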

Pseudo Code

DPLL_T(φ)
  AddClauses(cnf(e(φ)));
  if (BCP() == “confl”) return UnSAT;
  while (true) do
    if (!Decide()) return SAT;
    repeat
      while (BCP() == “confl”) do
        bl = Analyze();
        if (bl < 0) return UnSAT;
        else BackTrack(bl);
      <t, res> = T-Solver(T(a))
      AddClause(e(t));
    until (t == true)

Schematics

[Figure: DPLL(T) schematic. The Encoder turns the SMT formula into a CNF formula; the SAT-Solver and the T-Solver work together and report UnSAT/SAT.]

Combining Theories

• We have seen how to solve a specific theory

• But what if we have
  – 3x + 2y ≤ f(z) ⋀ f(x) - 2y = 0
  – LIA_UF

• Assuming we have a decision procedure for each theory
  – There is a combination procedure, called Nelson-Oppen
  – It requires other conditions as well

Advanced Topics

• Proof generation
  – We want something similar to a Resolution Proof in the propositional case

• Interpolant generation
  – Specific methods for different theories
  – Based on deduction rules
  – Specific usage in model checking

Questions?