Symbolic Execution & Constraint Solving
Finding bugs: Analysis Techniques & Tools
Symbolic Execution & Constraint Solving
CS161 Computer Security
Cho, Chia Yuan
Lab
• Q1: Manual reasoning on code
– Mergesort implementation published in Wikibooks
• Q2: Constraint Solving
– ‘Solve’ for collisions in ELFHash function
• Q3: Whitebox & blackbox fuzzing
– Use a dynamic symbolic execution tool to find bugs automatically
• Start early!
Big Picture
Attacks & Defenses
Mobile Security (Android)
Web Security
Network Security
Crypto
Program Analysis & Verification
Symbolic Execution & Constraint Solving
Why?
A little history …
Can we build a machine that can automatically reason and prove
mathematical facts about programs?
1967
1976
“From one simple view, it is an enhanced testing technique. Instead of executing a program on a set of sample inputs, a program is "symbolically" executed for a set of classes of inputs.”
Why now?
Advances in SAT Solvers
Source: Sanjit Seshia
Significance
How do we know our program is “correct”?
• In general, we don’t know.
• Test it
• Let users test it for us
• Fuzz it
• Try to prove it’s correct
• Static analysis
Symbolic Execution & Constraint Solving
[Figure axes: Precision, Coverage]
Dynamic Sym Exec is Directed Testing
• Path-by-path exploration
[Control-flow diagram]
len = input + 3;
if len < 10 (T/F)
if len % 2 == 0 (T/F)
s = len
s = len
s = len + 2
buf = malloc(s);
read(fd, buf, len);
Path condition of the explored path:
(len == input + 3) && !(len < 10) && !(len % 2 == 0)
Dynamic Sym Exec is Directed Testing
• Path-by-path exploration
[Same control-flow diagram, next path explored]
Path condition of the explored path:
(len == input + 3) && !(len < 10) && (len % 2 == 0)
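Each path condition can be handed to a solver to obtain a concrete input that drives execution down that path. The following is a minimal sketch in C, assuming a brute-force search over a small input range as a stand-in for a real constraint solver. The names path_ff, path_ft, and find_input are hypothetical and do not belong to any tool.

```c
#include <assert.h>
#include <stdbool.h>

/* First path: (len == input + 3) && !(len < 10) && !(len % 2 == 0). */
static bool path_ff(int input) {
    int len = input + 3;
    return !(len < 10) && !(len % 2 == 0);
}

/* Next path: same prefix, with only the last branch condition negated. */
static bool path_ft(int input) {
    int len = input + 3;
    return !(len < 10) && (len % 2 == 0);
}

/* Toy "solver": stores the first input in [lo, hi] that satisfies the
 * path condition pc and returns true, or returns false if none exists. */
static bool find_input(bool (*pc)(int), int lo, int hi, int *out) {
    for (int v = lo; v <= hi; v++) {
        if (pc(v)) {
            *out = v;
            return true;
        }
    }
    return false;
}
```

Calling find_input(path_ff, 0, 100, &v) yields an input whose len is at least 10 and odd. A dynamic symbolic execution tool would run the program on that input, then negate another branch condition and query the solver again.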
Can we combine all paths into a single formula?
==> Bounded Model Checking
How do we construct the formula & use a solver?
Q2 Goal: ‘Solve’ for Hash Collisions
Constructing Logic Formulas from Code
• Convert statements into Static Single Assignment (SSA) form
• Encode SSA into target solver input format
Static Single Assignment Equations
• Unroll loops to form a loop-free program
– for(i=0; i<2; i++){a=a+1;}  ==>  a=a+1; a=a+1;
• Rename the LHS of each assignment to a new local variable
– a1=a+1; a2=a+1;
• Whenever a variable is read (e.g., on the RHS), replace it with the most recently assigned variable name
– a1=a0+1; a2=a1+1;
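The three steps above can be sketched in C as a before-and-after pair. This is only an illustration; the function names with_loop and in_ssa_form are made up for this example.

```c
#include <assert.h>

/* Original fragment: the loop body runs exactly twice. */
static int with_loop(int a) {
    for (int i = 0; i < 2; i++) {
        a = a + 1;
    }
    return a;
}

/* The same fragment after unrolling and SSA renaming: each name is
 * assigned exactly once, so every line reads as an equation. */
static int in_ssa_form(int a0) {
    const int a1 = a0 + 1;
    const int a2 = a1 + 1;
    return a2;
}
```

Because no name is ever reassigned, the SSA version can be read directly as the conjunction (a1 = a0 + 1) and (a2 = a1 + 1), which is the form a solver expects.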
Conditional (if) statements
• Dynamic Symbolic Execution:
– 2 separate path formulas
• Bounded Model Checking:
– Merge both branches into 1 formula
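The difference between the two encodings can be sketched in C for a branch of the form if (x > 0) ret = x; else ret = -x;. This sketch assumes that 32-bit values are modelled as uint32_t with wrap-around arithmetic, as a bit-vector solver would treat them. The helper names bv_sgt, path_then, path_else, and merged are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Signed "greater than" on 32-bit bit-vectors: flip the sign bit of both
 * operands, then compare as unsigned numbers. */
static bool bv_sgt(uint32_t a, uint32_t b) {
    return (a ^ 0x80000000u) > (b ^ 0x80000000u);
}

/* Dynamic symbolic execution: one formula per path. */
static bool path_then(uint32_t x, uint32_t ret) {
    return bv_sgt(x, 0u) && ret == x;
}

static bool path_else(uint32_t x, uint32_t ret) {
    return !bv_sgt(x, 0u) && ret == (uint32_t)(0u - x);
}

/* Bounded model checking: both branches merged with if-then-else. */
static bool merged(uint32_t x, uint32_t ret) {
    uint32_t ret1 = x;
    uint32_t ret2 = (uint32_t)(0u - x);
    uint32_t ret3 = bv_sgt(x, 0u) ? ret1 : ret2;
    return ret == ret3;
}
```

For any pair (x, ret), merged holds exactly when one of the two path formulas holds, so a single solver query over merged covers both paths.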
Example
int example1(int x) {
  int ret;
  if (x > 0) ret = x;
  else ret = -x;
  assert(ret >= 0);
  return ret;
}
SSA
ret1 = x0
ret2 = -x0
ret3 = (x0>0 ? ret1 : ret2)
Q: Is !(ret3 >= 0) satisfiable?
Is this program correct?
Constructing Logic Formulas from Code
• Convert statements into Static Single Assignment (SSA) form
= Bit-vector equations in quantifier-free first-order logic
• Encode SSA into target solver input format
– Bit-vector arithmetic logic
– “SMT” solver
– SMT-LIB 1.0 standard
Example SMT-LIB
:extrafuns(x0 BitVec[32])
:extrafuns(ret1 BitVec[32])
:extrafuns(ret2 BitVec[32])
:extrafuns(ret3 BitVec[32])
:extrapreds(branchcond1)
:assumption (= ret1 x0)
:assumption (= ret2 (bvneg x0))
:assumption (iff branchcond1 (bvsgt x0 bv0[32]))
:assumption (= ret3 (ite branchcond1 ret1 ret2))
:assumption (not (bvsge ret3 bv0[32]))
:formula true
SSA
ret1 = x0
ret2 = -x0
ret3 = (x0>0 ? ret1 : ret2)
Is !(ret3 >= 0) satisfiable?
Querying the Solver
$ ./z3 example1.smt -m
ret3 -> bv2147483648[32]
ret1 -> bv2147483648[32]
branchcond1 -> false
ret2 -> bv2147483648[32]
x0 -> bv2147483648[32]
sat
2147483648 = 0x80000000
int example1(int x) {…
• 32-bit two’s complement system
– Non-negative range: [0 .. 2^(N-1) - 1]
– Or: [0x00000000 .. 0x7FFFFFFF]
– 0x80000000 is a negative signed 32-bit value: -2147483648
Example
int example1(int x) {
  int ret;
  if (x > 0) ret = x;
  else ret = -x;
  assert(ret >= 0);
  return ret;
}
SSA
ret1 = x0
ret2 = -x0
ret3 = (x0>0 ? ret1 : ret2)
Q: Is !(ret3 >= 0) satisfiable?
Assertion violated if x = -2147483648
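The counterexample can be replayed by evaluating the SSA equations on the bit pattern 0x80000000. In C itself, negating INT_MIN is undefined behaviour, so this sketch assumes a model of the 32-bit values as uint32_t with wrap-around arithmetic, which mirrors what bvneg does. The helper names bv_sgt, bv_sge, and assertion_holds are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Signed comparisons on 32-bit bit-vectors: flip the sign bit, then
 * compare as unsigned numbers. */
static bool bv_sgt(uint32_t a, uint32_t b) {
    return (a ^ 0x80000000u) > (b ^ 0x80000000u);
}

static bool bv_sge(uint32_t a, uint32_t b) {
    return (a ^ 0x80000000u) >= (b ^ 0x80000000u);
}

/* Evaluates the SSA equations of example1 for the input bit pattern x0
 * and reports whether assert(ret >= 0) would hold. */
static bool assertion_holds(uint32_t x0) {
    uint32_t ret1 = x0;
    uint32_t ret2 = (uint32_t)(0u - x0);          /* bvneg x0 */
    uint32_t ret3 = bv_sgt(x0, 0u) ? ret1 : ret2; /* ite      */
    return bv_sge(ret3, 0u);
}
```

With x0 = 0x80000000 the negation wraps around to 0x80000000 again, so ret3 is still negative and assertion_holds returns false. The neighbouring input 0x80000001 (that is, -2147483647) negates to 0x7FFFFFFF and passes.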
Slightly Modified Example
int example1(char x) {
  int ret;
  if (x > 0) ret = x;
  else ret = -x;
  assert(ret >= 0);
  return ret;
}
SSA
ret1 = x0
ret2 = -x0
ret3 = (x0>0 ? ret1 : ret2)
Q: Is !(ret3 >= 0) satisfiable?
Example SMT-LIB
:extrafuns(x0 BitVec[8])
:extrafuns(ret1 BitVec[32])
:extrafuns(ret2 BitVec[32])
:extrafuns(ret3 BitVec[32])
:extrapreds(branchcond1)
:assumption (= ret1 (sign_extend[24] x0))
:assumption (= ret2 (bvneg (sign_extend[24] x0)))
:assumption (iff branchcond1 (bvsgt (sign_extend[24] x0) bv0[32]))
:assumption (= ret3 (ite branchcond1 ret1 ret2))
:assumption (not (bvsge ret3 bv0[32]))
:formula true
SSA
ret1 = x0
ret2 = -x0
ret3 = (x0>0 ? ret1 : ret2)
Is !(ret3 >= 0) satisfiable?
Querying the Solver
$ ./z3 example1.smt -m
unsat
int example1(char x) {
  int ret;
  if (x > 0) ret = x;
  else ret = -x;
  assert(ret >= 0);
  return ret;
}
No satisfying assignment exists
==> Assertion holds for all possible inputs!
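Since an 8-bit input has only 256 possible values, the unsat answer can be cross-checked by brute force. This is a minimal sketch, assuming that char is signed (as the sign_extend[24] encoding implies) and that 32-bit values are modelled as uint32_t with wrap-around arithmetic. All helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Counterpart of sign_extend[24]: widen an 8-bit value to 32 bits by
 * copying its sign bit into the upper 24 bits. */
static uint32_t sign_extend_8_to_32(uint8_t v) {
    return (v & 0x80u) ? (0xFFFFFF00u | v) : (uint32_t)v;
}

static bool bv_sgt(uint32_t a, uint32_t b) {
    return (a ^ 0x80000000u) > (b ^ 0x80000000u);
}

static bool bv_sge(uint32_t a, uint32_t b) {
    return (a ^ 0x80000000u) >= (b ^ 0x80000000u);
}

/* Evaluates the SSA equations for every 8-bit input and counts how many
 * inputs would violate assert(ret >= 0). */
static int count_violations(void) {
    int violations = 0;
    for (unsigned v = 0; v <= 0xFFu; v++) {
        uint32_t x = sign_extend_8_to_32((uint8_t)v);
        uint32_t ret1 = x;
        uint32_t ret2 = (uint32_t)(0u - x);
        uint32_t ret3 = bv_sgt(x, 0u) ? ret1 : ret2;
        if (!bv_sge(ret3, 0u)) {
            violations++;
        }
    }
    return violations;
}
```

The most negative input, -128, is widened to 32 bits before it is negated, so the result 128 fits comfortably and the overflow seen in the int version cannot occur.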
SMT-LIB “Cheat” Sheet: Bit-vectors
• Declare 32-bit “variable” ‘a’:
– :extrafuns( a BitVec[32] )
• n-bit sign extension of ‘a’:
– sign_extend[n] a
• 32-bit constant ‘1234’:
– bv1234[32]
• Unary functions:
– ~a    (bvnot a)
– -a    (bvneg a)
• Binary functions, written (bvfoo a b):
– & bvand, | bvor, ^ bvxor, + bvadd, << bvshl, >> bvlshr
• Binary predicates, written (bvfoo a b):
– > bvsgt, >= bvsge
SMT-LIB “Cheat” Sheet: Booleans
• Declare a predicate ‘C’:
– :extrapreds( C )
• Unary connectives:
– ! C    (not C)
• Binary connectives, written (foo C D):
– => implies, && and, || or, xor, iff
• Ternary connectives:
– C ? a : b    (ite C a b), where a, b can be bit-vectors
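One entry in the bit-vector cheat sheet needs care: bvlshr is a logical right shift, which matches C's >> only when the left operand is unsigned. Right-shifting a negative signed value is implementation-defined in C, and most compilers perform an arithmetic shift, whose bit-vector counterpart is bvashr. The sketch below contrasts the two on uint32_t values. The helper names bv_lshr and bv_ashr are hypothetical, and shift amounts are assumed to be below 32.

```c
#include <assert.h>
#include <stdint.h>

/* Logical shift right (bvlshr): vacated high bits are filled with 0.
 * Assumes n < 32. */
static uint32_t bv_lshr(uint32_t a, unsigned n) {
    return a >> n;
}

/* Arithmetic shift right (bvashr): vacated high bits copy the sign bit.
 * Assumes n < 32. */
static uint32_t bv_ashr(uint32_t a, unsigned n) {
    uint32_t shifted = a >> n;
    if ((a & 0x80000000u) && n > 0) {
        shifted |= (uint32_t)~(0xFFFFFFFFu >> n);
    }
    return shifted;
}
```

For the bit pattern of -16 (0xFFFFFFF0) shifted by 3, bv_ashr gives 0xFFFFFFFE (that is, -2), while bv_lshr gives 0x1FFFFFFE. The two agree whenever the sign bit is clear.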
Exercise: C Operator Precedence
1. SSA equations?
2. SMT-LIB formula?
a = (b >> c) + d;
b = -(a ^ ~c);
Exercise: C Operator Precedence
int a,b;
char d;
a = (b >> 3) + d;
b = -(a ^ ~d);
SSA
a1 = (b0 >> 3) + d0;
b1 = -(a1 ^ ~d0);
SMT-LIB
:extrafuns(a1 BitVec[32])
:extrafuns(b0 BitVec[32])
:extrafuns(b1 BitVec[32])
:extrafuns(d0 BitVec[8])
:assumption (= a1 (bvadd (bvlshr b0 bv3[32]) (sign_extend[24] d0)))
:assumption (= b1 (bvneg (bvxor (bvnot (sign_extend[24] d0)) a1)))
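The nesting of the formula can be checked by evaluating it on concrete values. Note the order of operations: d0 is widened to 32 bits first (C promotes char to int before applying ~), then comes bvnot, then bvxor, and bvneg last. The following sketch assumes that values are modelled as uint32_t with wrap-around arithmetic, that char is signed, and that >> is a logical shift, as in the formula. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Counterpart of sign_extend[24] for an 8-bit value. */
static uint32_t sign_extend_8_to_32(uint8_t v) {
    return (v & 0x80u) ? (0xFFFFFF00u | v) : (uint32_t)v;
}

/* a1 = (b0 >> 3) + d0, with >> modelled as bvlshr. */
static uint32_t eval_a1(uint32_t b0, uint8_t d0) {
    return (uint32_t)((b0 >> 3) + sign_extend_8_to_32(d0));
}

/* b1 = -(a1 ^ ~d0): widen d0, then bvnot, then bvxor, then bvneg. */
static uint32_t eval_b1(uint32_t a1, uint8_t d0) {
    uint32_t not_d = (uint32_t)~sign_extend_8_to_32(d0);
    return (uint32_t)(0u - (a1 ^ not_d));
}
```

With b0 = 16 and d0 = 1, eval_a1 returns 3 and eval_b1(3, 1) returns 3, which matches evaluating the C statements by hand: 3 ^ ~1 is -3, and its negation is 3.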
Additional References
• An enjoyable read on verification history:
– Vijay D’Silva, Tales from Verification History
• More about “constraint solvers”:
– Daniel Kroening & Ofer Strichman, Decision Procedures: An Algorithmic Point of View