Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation...

127
Satisfiability Modulo Theories: A Calculus of Computation PUC, Rio de Janeiro, 2009 Leonardo de Moura Microsoft Research

Transcript of Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation...

Page 1: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Satisfiability Modulo Theories: A Calculus of Computation PUC, Rio de Janeiro, 2009

Leonardo de MouraMicrosoft Research

Page 2: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Symbolic Reasoning

Satisfiability Modulo Theories: A Calculus of Computation

Verification/Analysis tools need some form of

Symbolic Reasoning

Page 3: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Symbolic Reasoning

PSpace-complete(QBF)

Semi-decidable(First-order logic)

NP-complete(Propositional logic)

NEXPTime-complete(EPR)

P-time(Equality)

Logic is “The Calculus of Computer Science” (Z. Manna).High computational complexity

Satisfiability Modulo Theories: A Calculus of Computation

Undecidable(FOL + LA)

Page 4: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Applications

Satisfiability Modulo Theories: A Calculus of Computation

Test case generation

Verifying Compilers

Predicate Abstraction

Invariant Generation

Type Checking

Model Based Testing

Page 5: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Some Applications @ Microsoft

VCC

Hyper-VTerminator T-2

NModel

HAVOC

F7SAGE

VigilanteSpecExplorer

Satisfiability Modulo Theories: A Calculus of Computation

Page 6: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Test case generation

unsigned GCD(x, y) {

requires(y > 0);

while (true) {unsigned m = x % y; if (m == 0) return y; x = y; y = m;

}}

We want a trace where the loop is executed twice.

(y0 > 0) and

(m0 = x0 % y0) and

not (m0 = 0) and

(x1 = y0) and

(y1 = m0) and

(m1 = x1 % y1) and

(m1 = 0)

Solver

x0 = 2

y0 = 4

m0 = 2

x1 = 4

y1 = 2

m1 = 0

SSA

Satisfiability Modulo Theories: A Calculus of Computation

Page 7: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Type checking

Signature:div : int, { x : int | x 0 } int

Satisfiability Modulo Theories: A Calculus of Computation

SubtypeCall site:if a 1 and a b then

return div(a, b)

Verification conditiona 1 and a b implies b 0

Page 8: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Satisfiability Modulo Theories (SMT)

Satisfiability Modulo Theories: A Calculus of Computation

Is formula F satisfiable modulo theory T ?

SMT solvers have specialized algorithms for

T

Page 9: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Satisfiability Modulo Theories (SMT)

b + 2 = c and f(read(write(a,b,3), c-2) ≠ f(c-b+1)

Satisfiability Modulo Theories: A Calculus of Computation

Page 10: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Satisfiability Modulo Theories (SMT)

Arithmetic

b + 2 = c and f(read(write(a,b,3), c-2) ≠ f(c-b+1)

Satisfiability Modulo Theories: A Calculus of Computation

Page 11: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Satisfiability Modulo Theories (SMT)

ArithmeticArray Theory

b + 2 = c and f(read(write(a,b,3), c-2) ≠ f(c-b+1)

Satisfiability Modulo Theories: A Calculus of Computation

Page 12: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Satisfiability Modulo Theories (SMT)

ArithmeticArray TheoryUninterpreted

Functions

b + 2 = c and f(read(write(a,b,3), c-2) ≠ f(c-b+1)

Satisfiability Modulo Theories: A Calculus of Computation

Page 13: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Theories

A Theory is a set of sentences

Alternative definition:A Theory is a class of structures

Satisfiability Modulo Theories: A Calculus of Computation

Page 14: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

SMT@Microsoft: Solver

Z3 is a new solver developed at Microsoft Research.Development/Research driven by internal customers.Free for academic research.Interfaces:

http://research.microsoft.com/projects/z3

Z3

TextC/C++.NET

OCaml

Satisfiability Modulo Theories: A Calculus of Computation

Page 15: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Ground formulas

For most SMT solvers: F is a set of ground formulas

Many ApplicationsBounded Model Checking

Test-Case Generation

Satisfiability Modulo Theories: A Calculus of Computation

Page 16: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Little Engines of Proof

An SMT Solver is a collection ofLittle Engines of Proof

Satisfiability Modulo Theories: A Calculus of Computation

Page 17: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Equality

a = b, b = c, d = e, b = s, d = t, a e, a s

Satisfiability Modulo Theories: A Calculus of Computation

a b c d e s t

Page 18: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Equality

a = b, b = c, d = e, b = s, d = t, a e, a s

Satisfiability Modulo Theories: A Calculus of Computation

a b c d e s ta,b

Page 19: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Equality

a = b, b = c, d = e, b = s, d = t, a e, a s

Satisfiability Modulo Theories: A Calculus of Computation

c d e s ta,ba,b,

c

Page 20: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

d,e

Deciding Equality

a = b, b = c, d = e, b = s, d = t, a e, a s

Satisfiability Modulo Theories: A Calculus of Computation

d e s ta,b,c

Page 21: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

a,b,c,s

Deciding Equality

a = b, b = c, d = e, b = s, d = t, a e, a s

Satisfiability Modulo Theories: A Calculus of Computation

s ta,b,c

d,e

Page 22: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Equality

a = b, b = c, d = e, b = s, d = t, a e, a s

Satisfiability Modulo Theories: A Calculus of Computation

td,ea,b,c,s

d,e,t

Page 23: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Equality

a = b, b = c, d = e, b = s, d = t, a e, a s

Satisfiability Modulo Theories: A Calculus of Computation

a,b,c,s

d,e,t

Page 24: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Equality

a = b, b = c, d = e, b = s, d = t, a e, a s

Satisfiability Modulo Theories: A Calculus of Computation

a,b,c,s

d,e,t

Unsatisfiable

Page 25: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Equality

a = b, b = c, d = e, b = s, d = t, a e

Satisfiability Modulo Theories: A Calculus of Computation

a,b,c,s

d,e,t

Model|M| = { 0, 1 }M(a) = M(b) = M(c) = M(s) = 0M(d) = M(e) = M(t) = 1

Page 26: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Equality + (uninterpreted) Functions

a = b, b = c, d = e, b = s, d = t, f(a, g(d)) f(b, g(e))

Satisfiability Modulo Theories: A Calculus of Computation

a,b,c,s

d,e,t

g(d)

f(a,g(d))

g(e)

f(b,g(e))

Congruence Rule:x1 = y1, …, xn = yn implies f(x1, …, xn) = f(y1, …, yn)

Page 27: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Equality + (uninterpreted) Functions

a = b, b = c, d = e, b = s, d = t, f(a, g(d)) f(b, g(e))

Satisfiability Modulo Theories: A Calculus of Computation

a,b,c,s

d,e,t

g(d)

f(a,g(d))

g(e)

f(b,g(e))

Congruence Rule:x1 = y1, …, xn = yn implies f(x1, …, xn) = f(y1, …, yn)

g(d),g(e)

Page 28: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Equality + (uninterpreted) Functions

a = b, b = c, d = e, b = s, d = t, f(a, g(d)) f(b, g(e))

Satisfiability Modulo Theories: A Calculus of Computation

a,b,c,s

d,e,t

f(a,g(d))

f(b,g(e))

Congruence Rule:x1 = y1, …, xn = yn implies f(x1, …, xn) = f(y1, …, yn)

g(d),g(e)

f(a,g(d)),f(b,g(e))

Page 29: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Equality + (uninterpreted) Functions

a = b, b = c, d = e, b = s, d = t, f(a, g(d)) f(b, g(e))

Satisfiability Modulo Theories: A Calculus of Computation

a,b,c,s

d,e,t

g(d),g(e)

f(a,g(d)),f(b,g(e))

Unsatisfiable

Page 30: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Equality + (uninterpreted) Functions

(fully shared) DAGs for representing termsUnion-find data-structure + Congruence Closure

O(n log n)

Satisfiability Modulo Theories: A Calculus of Computation

Page 31: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Equality + (uninterpreted) Functions

Many theories can be reduced toEquality + Uninterpreted functions

Arrays, SetsLists, Tuples

Inductive Datatypes

Satisfiability Modulo Theories: A Calculus of Computation

Page 32: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Difference Arithmetic

a – b ≤ 2

Satisfiability Modulo Theories: A Calculus of Computation

x := x + 1

x1 = x0 + 1

x1 ≤ x0 + 1, x1 ≥ x0 + 1

x1 – x0 ≤ 1, x0 – x1 ≤ -1

Page 33: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Difference Arithmetic

a – b ≤ 2, b – c ≤ -3, b – d ≤ -1, d – a ≤ 4, c – a ≤ 0

Satisfiability Modulo Theories: A Calculus of Computation

a b

c

d

Page 34: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Difference Arithmetic

a – b ≤ 2, b – c ≤ -3, b – d ≤ -1, d – a ≤ 4, c – a ≤ 0

Satisfiability Modulo Theories: A Calculus of Computation

a b

c

d

2

Page 35: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Difference Arithmetic

a – b ≤ 2, b – c ≤ -3, b – d ≤ -1, d – a ≤ 4, c – a ≤ 0

Satisfiability Modulo Theories: A Calculus of Computation

a b

c

d

2

-3

Page 36: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Difference Arithmetic

a – b ≤ 2, b – c ≤ -3, b – d ≤ -1, d – a ≤ 4, c – a ≤ 0

Satisfiability Modulo Theories: A Calculus of Computation

a b

c

d

2

-3

-1

Page 37: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Difference Arithmetic

a – b ≤ 2, b – c ≤ -3, b – d ≤ -1, d – a ≤ 4, c – a ≤ 0

Satisfiability Modulo Theories: A Calculus of Computation

a b

c

d

2

-3

-14

Page 38: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Difference Arithmetic

a – b ≤ 2, b – c ≤ -3, b – d ≤ -1, d – a ≤ 4, c – a ≤ 0

Satisfiability Modulo Theories: A Calculus of Computation

a b

c

d

2

-3

-14

0

Page 39: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Difference Arithmetic

a – b ≤ 2, b – c ≤ -3, b – d ≤ -1, d – a ≤ 4, c – a ≤ 0

Satisfiability Modulo Theories: A Calculus of Computation

a b

c

d

2

-3

-14

0

Unsatisfiableiff

Negative Cycle

a – b ≤ 2, b – c ≤ -3, c – a ≤ 0

0 ≤ -1

Page 40: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Deciding Difference Arithmetic

Shortest Path Algorithms

Bellman-Ford: O(nm)Floyd-Warshall: O(n3)

Satisfiability Modulo Theories: A Calculus of Computation

Page 41: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Other Little Engines

Satisfiability Modulo Theories: A Calculus of Computation

Linear Arithmetic Simplex, Fourier Motzkin

Non-linear (complex) arithmetic

Grobner Basis

Non-linear (real) arithmetic Cylindrical Algebraic Decomposition

Equational theories Knuth-Bendix Completion

Bitvectors Bit-blasting, rewriting

Page 42: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Combining Solvers

In practice, we need a combination of theory solvers.

Nelson-Oppen combination method.Reduction techniques.

Model-based theory combination.

Satisfiability Modulo Theories: A Calculus of Computation

Page 43: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

SAT (propositional checkers): Case Analysis

Satisfiability Modulo Theories: A Calculus of Computation

p q, p q,p q,p q

Page 44: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

SAT (propositional checkers):Case Analysis

Satisfiability Modulo Theories: A Calculus of Computation

p q, p q,p q,p q

Assignment:p = false,q = false

Page 45: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

SAT (propositional checkers): Case Analysis

Satisfiability Modulo Theories: A Calculus of Computation

p q, p q,p q,p q

Assignment:p = false,q = true

Page 46: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

SAT (propositional checkers):Case Analysis

Satisfiability Modulo Theories: A Calculus of Computation

p q, p q,p q,p q

Assignment:p = true,q = false

Page 47: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

SAT (propositional checkers): Case Analysis

Satisfiability Modulo Theories: A Calculus of Computation

p q, p q,p q,p q

Assignment:p = true,q = true

Page 48: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

SAT (propositional checkers): DPLL

M | F

Partial model Set of clauses

Satisfiability Modulo Theories: A Calculus of Computation

Page 49: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

DPLL

Guessing (case-splitting)

p, q | p q, q r

p | p q, q r

Satisfiability Modulo Theories: A Calculus of Computation

Page 50: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

DPLL

Deducing

p, s| p q, p s

p | p q, p s

Satisfiability Modulo Theories: A Calculus of Computation

Page 51: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

DPLL

Backtracking

p, s| p q, s q, p q

p, s, q | p q, s q, p q

Satisfiability Modulo Theories: A Calculus of Computation

Page 52: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Modern DPLL

Efficient indexing (two-watch literal)Non-chronological backtracking (backjumping)Lemma learning…

Satisfiability Modulo Theories: A Calculus of Computation

Page 53: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Solvers = DPLL + Decision Procedures

Efficient decision procedures for conjunctions of ground literals.

a=b, a<5 | a=b f(a)=f(b), a < 5 a > 10

Satisfiability Modulo Theories: A Calculus of Computation

Page 54: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Theory Conflicts

Satisfiability Modulo Theories: A Calculus of Computation

a=b, a > 0, c > 0, a + c < 0 | F

backtrack

Page 55: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Naïve recipe?

Satisfiability Modulo Theories: A Calculus of Computation

SMT Solver = DPLL + Decision Procedure

Standard question:

Why don’t you use CPLEX for handling linear arithmetic?

Page 56: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Efficient SMT solvers

Satisfiability Modulo Theories: A Calculus of Computation

Decision Procedures must be:Incremental & BacktrackingTheory Propagation

a=b, a<5 | … a<6 f(a) = a

a=b, a<5, a<6 | … a<6 f(a) = a

Page 57: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Efficient SMT solvers

Satisfiability Modulo Theories: A Calculus of Computation

Decision Procedures must be:Incremental & BacktrackingTheory PropagationPrecise (theory) lemma learning

a=b, a > 0, c > 0, a + c < 0 | F Learn clause:(a=b) (a > 0) (c > 0) (a + c < 0)Imprecise!Precise clause:a > 0 c > 0 a + c < 0

Page 58: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

SMT x SAT

For some theories, SMT can be reduced to SAT

bvmul32(a,b) = bvmul32 (b,a)

Higher level of abstraction

Satisfiability Modulo Theories: A Calculus of Computation

Page 59: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

SMT x First-order provers

F T

First-order Theor

em Prove

rT may not have a finite axiomatization

Satisfiability Modulo Theories: A Calculus of Computation

Page 60: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Test case generation

Page 61: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Test case generation

Test (correctness + usability) is 95% of the deal:Dev/Test is 1-1 in products.Developers are responsible for unit tests.

Tools:Annotations and static analysis (SAL + ESP)File FuzzingUnit test case generation

Satisfiability Modulo Theories: A Calculus of Computation

Page 62: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Security is critical

Security bugs can be very expensive:Cost of each MS Security Bulletin: $600k to $Millions.Cost due to worms: $Billions.The real victim is the customer.

Most security exploits are initiated via files or packets.Ex: Internet Explorer parses dozens of file formats.

Security testing: hunting for million dollar bugsWrite A/VRead A/VNull pointer dereferenceDivision by zero

Satisfiability Modulo Theories: A Calculus of Computation

Page 63: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Hunting for Security Bugs.

Two main techniques used by “black hats”:Code inspection (of binaries).Black box fuzz testing.

Black box fuzz testing:A form of black box random testing.Randomly fuzz (=modify) a well formed input.Grammar-based fuzzing: rules to encode how to fuzz.

Heavily used in security testingAt MS: several internal tools.Conceptually simple yet effective in practice

Satisfiability Modulo Theories: A Calculus of Computation

Page 64: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Directed Automated Random Testing ( DART)

Execution Path

Run Test and Monitor Path Condition

Solve

seed

New input

TestInputs

Constraint System

KnownPaths

Satisfiability Modulo Theories: A Calculus of Computation

Page 65: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

DARTish projects at Microsoft

PEX Implements DART for .NET.

SAGE Implements DART for x86 binaries.

YOGI Implements DART to check the feasibility of program paths generated statically.

Vigilante Partially implements DART to dynamically generate worm filters.

Satisfiability Modulo Theories: A Calculus of Computation

Page 66: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

What is Pex?

Test input generatorPex starts from parameterized unit testsGenerated tests are emitted as traditional unit tests

Satisfiability Modulo Theories: A Calculus of Computation

Page 67: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

ArrayList: The Spec

Satisfiability Modulo Theories: A Calculus of Computation

Page 68: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

ArrayList: AddItem Test

class ArrayList { object[] items; int count;

ArrayList(int capacity) { if (capacity < 0) throw ...; items = new object[capacity]; }

void Add(object item) { if (count == items.Length) ResizeArray();

items[this.count++] = item; }...

class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); }}

Page 69: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

ArrayList: Starting Pex…

class ArrayList { object[] items; int count;

ArrayList(int capacity) { if (capacity < 0) throw ...; items = new object[capacity]; }

void Add(object item) { if (count == items.Length) ResizeArray();

items[this.count++] = item; }...

class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); }}

Inputs

Page 70: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

ArrayList: Run 1, (0,null)Inputs

(0,null)

class ArrayList { object[] items; int count;

ArrayList(int capacity) { if (capacity < 0) throw ...; items = new object[capacity]; }

void Add(object item) { if (count == items.Length) ResizeArray();

items[this.count++] = item; }...

class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); }}

Page 71: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

ArrayList: Run 1, (0,null)Inputs Observed

Constraints

(0,null)

!(c<0)

class ArrayList { object[] items; int count;

ArrayList(int capacity) { if (capacity < 0) throw ...; items = new object[capacity]; }

void Add(object item) { if (count == items.Length) ResizeArray();

items[this.count++] = item; }...

class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); }}

c < 0 false

Page 72: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

ArrayList: Run 1, (0,null)Inputs Observed

Constraints

(0,null) !(c<0) && 0==c

class ArrayList { object[] items; int count;

ArrayList(int capacity) { if (capacity < 0) throw ...; items = new object[capacity]; }

void Add(object item) { if (count == items.Length) ResizeArray();

items[this.count++] = item; }...

class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); }}

0 == c true

Page 73: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

ArrayList: Run 1, (0,null)Inputs Observed

Constraints

(0,null) !(c<0) && 0==c

class ArrayList { object[] items; int count;

ArrayList(int capacity) { if (capacity < 0) throw ...; items = new object[capacity]; }

void Add(object item) { if (count == items.Length) ResizeArray();

items[this.count++] = item; }...

class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); }}

item == item true

This is a tautology, i.e. a constraint that is always true,regardless of the chosen values.

We can ignore such constraints.

Page 74: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

ArrayList: Picking the next branch to coverConstraints to solve

Inputs Observed Constraints

(0,null) !(c<0) && 0==c

!(c<0) && 0!=c

class ArrayList { object[] items; int count;

ArrayList(int capacity) { if (capacity < 0) throw ...; items = new object[capacity]; }

void Add(object item) { if (count == items.Length) ResizeArray();

items[this.count++] = item; }...

class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); }}

Page 75: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

ArrayList: Solve constraints using SMT solver

Constraints to solve

Inputs Observed Constraints

(0,null) !(c<0) && 0==c

!(c<0) && 0!=c

(1,null)

class ArrayList { object[] items; int count;

ArrayList(int capacity) { if (capacity < 0) throw ...; items = new object[capacity]; }

void Add(object item) { if (count == items.Length) ResizeArray();

items[this.count++] = item; }...

class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); }}

Page 76: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

ArrayList: Run 2, (1, null)Constraints to solve

Inputs Observed Constraints

(0,null) !(c<0) && 0==c

!(c<0) && 0!=c

(1,null) !(c<0) && 0!=c

class ArrayList { object[] items; int count;

ArrayList(int capacity) { if (capacity < 0) throw ...; items = new object[capacity]; }

void Add(object item) { if (count == items.Length) ResizeArray();

items[this.count++] = item; }...

class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); }}

0 == c false

Page 77: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

ArrayList: Pick new branchConstraints to solve

Inputs Observed Constraints

(0,null) !(c<0) && 0==c

!(c<0) && 0!=c

(1,null) !(c<0) && 0!=c

c<0

class ArrayList { object[] items; int count;

ArrayList(int capacity) { if (capacity < 0) throw ...; items = new object[capacity]; }

void Add(object item) { if (count == items.Length) ResizeArray();

items[this.count++] = item; }...

class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); }}

Page 78: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

ArrayList: Run 3, (-1, null)Constraints to solve

Inputs Observed Constraints

(0,null) !(c<0) && 0==c

!(c<0) && 0!=c

(1,null) !(c<0) && 0!=c

c<0 (-1,null)

class ArrayList { object[] items; int count;

ArrayList(int capacity) { if (capacity < 0) throw ...; items = new object[capacity]; }

void Add(object item) { if (count == items.Length) ResizeArray();

items[this.count++] = item; }...

class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); }}

Page 79: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

ArrayList: Run 3, (-1, null)Constraints to solve

Inputs Observed Constraints

(0,null) !(c<0) && 0==c

!(c<0) && 0!=c

(1,null) !(c<0) && 0!=c

c<0 (-1,null)

c<0

class ArrayList { object[] items; int count;

ArrayList(int capacity) { if (capacity < 0) throw ...; items = new object[capacity]; }

void Add(object item) { if (count == items.Length) ResizeArray();

items[this.count++] = item; }...

class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); }}

c < 0 true

Page 80: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

ArrayList: Run 3, (-1, null)Constraints to solve

Inputs Observed Constraints

(0,null) !(c<0) && 0==c

!(c<0) && 0!=c

(1,null) !(c<0) && 0!=c

c<0 (-1,null)

c<0

class ArrayList { object[] items; int count;

ArrayList(int capacity) { if (capacity < 0) throw ...; items = new object[capacity]; }

void Add(object item) { if (count == items.Length) ResizeArray();

items[this.count++] = item; }...

class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); }}

Page 81: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

PEX ↔ Z3Rich

Combination

Linear arithmeti

cBitvector Arrays

FreeFunction

s

Models Model used as test inputs

-QuantifierUsed to model custom

theories (e.g., .NET type system)

APIHuge number of small

problems. Textual interface is too inefficient.

Satisfiability Modulo Theories: A Calculus of Computation

Page 82: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Apply DART to large applications (not units).Start with well-formed input (not random).Combine with generational search (not DFS).

Negate 1-by-1 each constraint in a path constraint.Generate many children for each parent run.

SAGE

parent

generation 1

Satisfiability Modulo Theories: A Calculus of Computation

Page 83: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Zero to Crash in 10 Generations

Starting with 100 zero bytes …SAGE generates a crashing test for Media1 parser

00000000h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000030h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000040h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000060h: 00 00 00 00 ; ....

Generation 0 – seed file

Satisfiability Modulo Theories: A Calculus of Computation

Page 84: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Zero to Crash in 10 Generations

Starting with 100 zero bytes …SAGE generates a crashing test for Media1 parser

00000000h: 52 49 46 46 00 00 00 00 00 00 00 00 00 00 00 00 ; RIFF............00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000030h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000040h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000060h: 00 00 00 00 ; ....

Generation 1

Satisfiability Modulo Theories: A Calculus of Computation

Page 85: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

`

Zero to Crash in 10 Generations

Starting with 100 zero bytes …SAGE generates a crashing test for Media1 parser

Satisfiability Modulo Theories: A Calculus of Computation

00000000h: 52 49 46 46 00 00 00 00 ** ** ** 20 00 00 00 00 ; RIFF....*** ....00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000030h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000040h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000060h: 00 00 00 00 ; ....

Generation 2

Page 86: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Zero to Crash in 10 Generations

Starting with 100 zero bytes …SAGE generates a crashing test for Media1 parser

00000000h: 52 49 46 46 3D 00 00 00 ** ** ** 20 00 00 00 00 ; RIFF=...*** ....00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000030h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000040h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000060h: 00 00 00 00 ; ....

Generation 3

Satisfiability Modulo Theories: A Calculus of Computation

Page 87: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Zero to Crash in 10 Generations

Starting with 100 zero bytes …SAGE generates a crashing test for Media1 parser

00000000h: 52 49 46 46 3D 00 00 00 ** ** ** 20 00 00 00 00 ; RIFF=...*** ....00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000030h: 00 00 00 00 73 74 72 68 00 00 00 00 00 00 00 00 ; ....strh........00000040h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000060h: 00 00 00 00 ; ....

Generation 4

Satisfiability Modulo Theories: A Calculus of Computation

Page 88: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Zero to Crash in 10 Generations

Starting with 100 zero bytes …SAGE generates a crashing test for Media1 parser

00000000h: 52 49 46 46 3D 00 00 00 ** ** ** 20 00 00 00 00 ; RIFF=...*** ....00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000030h: 00 00 00 00 73 74 72 68 00 00 00 00 76 69 64 73 ; ....strh....vids00000040h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000060h: 00 00 00 00 ; ....

Generation 5

Satisfiability Modulo Theories: A Calculus of Computation

Page 89: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Zero to Crash in 10 Generations

Starting with 100 zero bytes …SAGE generates a crashing test for Media1 parser

00000000h: 52 49 46 46 3D 00 00 00 ** ** ** 20 00 00 00 00 ; RIFF=...*** ....00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000030h: 00 00 00 00 73 74 72 68 00 00 00 00 76 69 64 73 ; ....strh....vids00000040h: 00 00 00 00 73 74 72 66 00 00 00 00 00 00 00 00 ; ....strf........00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000060h: 00 00 00 00 ; ....

Generation 6

Satisfiability Modulo Theories: A Calculus of Computation

Page 90: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Zero to Crash in 10 Generations

Starting with 100 zero bytes …SAGE generates a crashing test for Media1 parser

00000000h: 52 49 46 46 3D 00 00 00 ** ** ** 20 00 00 00 00 ; RIFF=...*** ....00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000030h: 00 00 00 00 73 74 72 68 00 00 00 00 76 69 64 73 ; ....strh....vids00000040h: 00 00 00 00 73 74 72 66 00 00 00 00 28 00 00 00 ; ....strf....(...00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000060h: 00 00 00 00 ; ....

Generation 7

Satisfiability Modulo Theories: A Calculus of Computation

Page 91: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Zero to Crash in 10 Generations

Starting with 100 zero bytes …SAGE generates a crashing test for Media1 parser

00000000h: 52 49 46 46 3D 00 00 00 ** ** ** 20 00 00 00 00 ; RIFF=...*** ....00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000030h: 00 00 00 00 73 74 72 68 00 00 00 00 76 69 64 73 ; ....strh....vids00000040h: 00 00 00 00 73 74 72 66 00 00 00 00 28 00 00 00 ; ....strf....(...00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 C9 9D E4 4E ; ............ÉäN�00000060h: 00 00 00 00 ; ....

Generation 8

Satisfiability Modulo Theories: A Calculus of Computation

Page 92: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Zero to Crash in 10 Generations

Starting with 100 zero bytes …SAGE generates a crashing test for Media1 parser

00000000h: 52 49 46 46 3D 00 00 00 ** ** ** 20 00 00 00 00 ; RIFF=...*** ....00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000030h: 00 00 00 00 73 74 72 68 00 00 00 00 76 69 64 73 ; ....strh....vids00000040h: 00 00 00 00 73 74 72 66 00 00 00 00 28 00 00 00 ; ....strf....(...00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 ; ................00000060h: 00 00 00 00 ; ....

Generation 9

Satisfiability Modulo Theories: A Calculus of Computation

Page 93: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Zero to Crash in 10 Generations

Starting with 100 zero bytes …SAGE generates a crashing test for Media1 parser

00000000h: 52 49 46 46 3D 00 00 00 ** ** ** 20 00 00 00 00 ; RIFF=...*** ....00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................00000030h: 00 00 00 00 73 74 72 68 00 00 00 00 76 69 64 73 ; ....strh....vids00000040h: 00 00 00 00 73 74 72 66 B2 75 76 3A 28 00 00 00 ; ....strf²uv:(...00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 ; ................00000060h: 00 00 00 00 ; ....

Generation 10 – CRASH

Satisfiability Modulo Theories: A Calculus of Computation

Page 94: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

SAGE (cont.)

SAGE is very effective at finding bugs.Works on large applications.Fully automatedEasy to deploy (x86 analysis – any language)Used in various groups inside MicrosoftPowered by Z3.

Satisfiability Modulo Theories: A Calculus of Computation

Page 95: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

SAGE↔ Z3

Formulas are usually big conjunctions.SAGE uses only the bitvector and array theories.Pre-processing step has a huge performance impact.

Eliminate variables.Simplify formulas.

Early unsat detection.

Satisfiability Modulo Theories: A Calculus of Computation

Page 96: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Verifying CompilersAnnotated

Program

Verificatio

n Condition Fpre/post conditions

invariantsand other annotations

Page 97: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

class C { private int a, z; invariant z > 0

public void M() requires a != 0

{ z = 100/a; }

}

Annotations: Example

Page 98: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Spec# Approach for a Verifying Compiler

Source LanguageC# + goodies = Spec#

Specificationsmethod contracts,invariants,field and type annotations.

Program Logic: Dijkstra’s weakest preconditions.

Automatic Verificationtype checking,verification condition generation (VCG),SMT

Spec# (annotated C#)

Boogie PL

Spec# Compiler

VC Generator

Formulas

SMT Solver

Page 99: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Command languagex := E

x := x + 1

x := 10

havoc x

S ; T

assert P

assume P

S T

Page 100: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Reasoning about execution traces

Hoare triple { P } S { Q } says thatevery terminating execution trace of S that starts in a state satisfying P

does not go wrong, andterminates in a state satisfying Q

Page 101: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Reasoning about execution traces

Hoare triple { P } S { Q } says thatevery terminating execution trace of S that starts in a state satisfying P

does not go wrong, andterminates in a state satisfying Q

Given S and Q, what is the weakest P’ satisfying {P’} S {Q} ?

P' is called the weakest precondition of S with respect to Q, written wp(S, Q)to check {P} S {Q}, check P P’

Page 102: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Weakest preconditions

wp( x := E, Q ) =wp( havoc x, Q ) =wp( assert P, Q ) =wp( assume P, Q ) =wp( S ; T, Q ) =wp( S T, Q ) =

Q[ E / x ](x Q )P QP Qwp( S, wp( T, Q ))wp( S, Q ) wp( T, Q )

Page 103: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Structured if statement

if E then S else T end =

assume E; Sassume ¬E; T

Page 104: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

While loop with loop invariantwhile E

invariant Jdo

Send

= assert J;havoc x; assume J;( assume E; S; assert J; assume false assume ¬E)

where x denotes the assignment targets of S

“fast forward” to an arbitrary iteration of the loop

check that the loop invariant holds initially

check that the loop invariant is maintained by the loop body

Page 105: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Verification conditions: Structure

BIGand-or tree

(ground)

Axioms(non-ground)

Control & Data Flow

Page 106: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Hypervisor: A Manhattan Project

Meta OS: small layer of software between hardware and OSMini: 60K lines of non-trivial concurrent systems C codeCritical: must provide functional resource abstractionTrusted: a verification grand challenge

Hardware

Hypervisor

Page 107: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Hypervisor: Some Statistics

VCs have several MbThousands of non ground clausesDevelopers are willing to wait at most 5 min per VC

Satisfiability Modulo Theories: A Calculus of Computation

Page 108: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Challenge: annotation burden

Partial solutionsAutomatic generation of: Loop InvariantsHoudini-style automatic annotation generation

Satisfiability Modulo Theories: A Calculus of Computation

Page 109: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Challenge

Quantifiers, quantifiers, quantifiers, …Modeling the runtime h,o,f:

IsHeap(h) o ≠ null read(h, o, alloc) = tread(h,o, f) = null read(h, read(h,o,f),alloc)

= t

Satisfiability Modulo Theories: A Calculus of Computation

Page 110: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Challenge

Quantifiers, quantifiers, quantifiers, …Modeling the runtimeFrame axioms o, f:

o ≠ null read(h0, o, alloc) = t read(h1,o,f) = read(h0,o,f) (o,f) M

Satisfiability Modulo Theories: A Calculus of Computation

Page 111: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Challenge

Quantifiers, quantifiers, quantifiers, …Modeling the runtimeFrame axiomsUser provided assertions i,j: i j read(a,i) read(b,j)

Satisfiability Modulo Theories: A Calculus of Computation

Page 112: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Challenge

Quantifiers, quantifiers, quantifiers, …Modeling the runtimeFrame axiomsUser provided assertionsTheories" x: p(x,x)" x,y,z: p(x,y), p(y,z) p(x,z)" x,y: p(x,y), p(y,x) x = y

Satisfiability Modulo Theories: A Calculus of Computation

Page 113: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Challenge

Quantifiers, quantifiers, quantifiers, …Modeling the runtimeFrame axiomsUser provided assertionsTheoriesSolver must be fast in satisfiable instances.

We want to find bugs!

Satisfiability Modulo Theories: A Calculus of Computation

Page 114: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Bad news

There is no sound and refutationally completeprocedure for

linear integer arithmetic + free function symbols

Satisfiability Modulo Theories: A Calculus of Computation

Page 115: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Many Approaches

Heuristic quantifier instantiation

Combining SMT with Saturation provers

Complete quantifier instantiation

Decidable fragments

Model based quantifier instantiation

Satisfiability Modulo Theories: A Calculus of Computation

Page 116: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Challenge: modeling runtime

Is the axiomatization of the runtime consistent?False implies everythingPartial solution: SMT + Saturation ProversFound many bugs using this approach

Satisfiability Modulo Theories: A Calculus of Computation

Page 117: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Challenge: Robustness

Standard complain“I made a small modification in my Spec, and Z3 is timingout”

This also happens with SAT solvers (NP-complete)In our case, the problems are undecidablePartial solution: parallelization

Satisfiability Modulo Theories: A Calculus of Computation

Page 118: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Parallel Z3

Joint work with Y. Hamadi (MSRC) and C. WintersteigerMulti-core & Multi-node (HPC)Different strategies in parallelCollaborate exchanging lemmas

Strategy 1

Strategy 2

Strategy 3

Strategy 4

Strategy 5

Satisfiability Modulo Theories: A Calculus of Computation

Page 119: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Conclusion

Logic as a platformMost verification/analysis tools need symbolic reasoningSMT is a hot areaMany applications & challengeshttp://research.microsoft.com/projects/z3

Thank You!

Satisfiability Modulo Theories: A Calculus of Computation

Page 120: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

E-matching & Quantifier instantiation

SMT solvers use heuristic quantifier instantiation.E-matching (matching modulo equalities).Example:" x: f(g(x)) = x { f(g(x)) }a = g(b), b = c,f(a) c Trigger

Satisfiability Modulo Theories: A Calculus of Computation

Page 121: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

E-matching & Quantifier instantiation

SMT solvers use heuristic quantifier instantiation.E-matching (matching modulo equalities).Example:" x: f(g(x)) = x { f(g(x)) }a = g(b), b = c,f(a) c

x=b f(g(b)) = b

Equalities and ground terms come from the partial model M

Satisfiability Modulo Theories: A Calculus of Computation

Page 122: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

E-matching: why do we use it?

Integrates smoothly with DPLL.Efficient for most VCsDecides useful theories:

ArraysPartial orders…

Satisfiability Modulo Theories: A Calculus of Computation

Page 123: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Efficient E-matching

E-matching is NP-Hard.In practice

Problem Indexing Technique

Fast retrieval E-matching code trees

Incremental E-Matching

Inverted path index

Satisfiability Modulo Theories: A Calculus of Computation

Page 124: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

E-matching code trees

Trigger:

f(x1, g(x1, a), h(x2), b)

Instructions:

1. init(f, 2)2. check(r4, b, 3)3. bind(r2, g, r5, 4)4. compare(r1, r5,

5)5. check(r6, a, 6)6. bind(r3, h, r7, 7)7. yield(r1, r7)

Compiler

Similar triggers share several instructions.

Combine code sequences in a code tree

Satisfiability Modulo Theories: A Calculus of Computation

Page 125: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

DPLL()

Tight integration: DPLL + Saturation solver.

BIGand-or tree

(ground)

Axioms(non-ground)

Saturation Solver

SMT

Satisfiability Modulo Theories: A Calculus of Computation

Page 126: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

DPLL()

Inference rule:

DPLL() is parametric.Examples:

ResolutionSuperposition calculus…

Satisfiability Modulo Theories: A Calculus of Computation

Page 127: Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

DPLL +

Theories

Saturation

Solver

DPLL()

Ground literals

Ground clauses

Satisfiability Modulo Theories: A Calculus of Computation