Program Analysis Using Randomization Sumit Gulwani, George Necula (U.C. Berkeley)
What kind of Analysis?
Any analysis that can be modeled as checking equivalence of two expressions at a point in a program
Equivalent to checking Reachability Properties
Complexity of our algorithm
(Almost) Linear Time
Queries answered in (almost) constant time
The Randomized Strategy
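Define a mapping F: Expression → Polynomial, with P1 = F(E1) and P2 = F(E2), such that
P1 ≡ P2 ⇒ E1 ≡ E2 (Soundness)
E1 ≡ E2 [according to some theory T] ⇒ P1 ≡ P2 (Completeness w.r.t. T)
For a loop-free program, F(E) = Σ_i Pred_i · v_i, where Pred_i is the predicate of the i-th path and v_i is the value of E along that path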
Example
[Flow graph: if C1 then x = 3 else x = 7; if C2 then y = 5 else y = 9; then t = x + y]
F(t) = C1 C2 (3+5)
+ C1 ¬C2 (3+9)
+ ¬C1 C2 (7+5)
+ ¬C1 ¬C2 (7+9)
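The path sum for F(t) can be checked numerically. The sketch below is only an illustration: it assumes ¬C is modelled as 1 - C (in line with the join rule used later in the slides), and the function names are hypothetical. It evaluates the four-path sum and compares it with the same value built one join at a time.

```python
from fractions import Fraction
import random

def f_t_by_paths(c1, c2):
    """Sum over the four paths of (path predicate) * (value of t on that path)."""
    n1, n2 = 1 - c1, 1 - c2  # assumption: the negation of C is modelled as 1 - C
    return (c1 * c2 * (3 + 5) + c1 * n2 * (3 + 9)
            + n1 * c2 * (7 + 5) + n1 * n2 * (7 + 9))

def f_t_by_joins(c1, c2):
    """The same polynomial, built one join at a time."""
    x = c1 * 3 + (1 - c1) * 7
    y = c2 * 5 + (1 - c2) * 9
    return x + y

rng = random.Random(0)
c1 = Fraction(rng.randrange(1, 1000), 7)   # arbitrary rational stand-ins for C1, C2
c2 = Fraction(rng.randrange(1, 1000), 11)
agree = f_t_by_paths(c1, c2) == f_t_by_joins(c1, c2)
print(agree)
```

With C1 and C2 set to 0 or 1 the sum collapses to the value of t on the corresponding concrete path.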
Checking Polynomial Equivalence
P1 ≡ P2 can be determined by random testing with small error probability (Probabilistic Soundness)
F: Expression → Polynomial
can be thought of as
F: Expression → [List of numbers]
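A minimal sketch of the random-testing idea, under assumptions that are not in the slides: polynomials are given as Python callables, arithmetic is done modulo the prime 2^61 - 1, and the helper names are hypothetical. Each polynomial is replaced by its list of values at a few shared random points, and two polynomials are declared equivalent when the lists agree.

```python
import random

PRIME = 2**61 - 1  # assumption of this sketch: evaluate modulo a large prime

def evaluations(poly, points):
    """Map a polynomial to its list of numbers: one value per random point."""
    return [poly(*pt) % PRIME for pt in points]

def probably_equal(p, q, nvars, trials=3, seed=0):
    """True if p and q agree at `trials` random points (small error probability)."""
    rng = random.Random(seed)
    points = [[rng.randrange(PRIME) for _ in range(nvars)] for _ in range(trials)]
    return evaluations(p, points) == evaluations(q, points)

same = probably_equal(lambda x, y: (x + y) ** 2,
                      lambda x, y: x * x + 2 * x * y + y * y, nvars=2)
different = probably_equal(lambda x, y: x * y, lambda x, y: x + y, nvars=2)
print(same, different)
```

Equal polynomials always pass. Unequal ones can only pass if every random point happens to be a root of their difference, which is what makes the error probability small.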
Algorithm
Statements with side-effect
x = e
Mem[e1] = e2
x = e
Record in the register table: Store(x) ← F(e)
Register table is simply an array
Mem[e1] = e2
Record in the memory table: F(e1) ← F(e2)
Expressions
E ::= n (Constant)
| x (Variable Reference)
| Mem[e] (Memory Read)
| e1 + e2 (Arithmetic)
| e1 - e2
| e1 * e2
| (e == 0) (Conditionals)
| (e ≥ 0)
| Ф(c: e1, ¬c: e2) (Joins)
| (e1,e2) (Joins at Loop Entry)
| (x) (Loop Exit)
| U(e1, …, en) (Uninterpreted Functions)
Arithmetic and Uninterpreted Functions
F(n) → [n, n, n]
F(x) → Store(x) if x was defined before use
| Rand() otherwise
F(e1 + e2) → F(e1) + F(e2)
F(e1 - e2) → F(e1) - F(e2)
F(e1 * e2) → F(e1) * F(e2)
F(e1 / e2) → F(e1) / F(e2)
F(U(e1, …, en)) → Rand(U, F(e1), …, F(en))
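The rules above can be sketched as a small evaluator. This is an illustration under assumptions, not the authors' implementation: expressions are encoded as nested tuples, values are exact fractions, the register table is a dictionary, and `Rand` is modelled by a memo table so that the same key always yields the same random list.

```python
import random
from fractions import Fraction

K = 3                     # length of each list of numbers, as on the slides
rng = random.Random(1)
store = {}                # register table: variable name -> list of numbers
rand_memo = {}            # Rand(key): same key, same random list

def rand(key):
    if key not in rand_memo:
        rand_memo[key] = tuple(Fraction(rng.randrange(1, 10**6)) for _ in range(K))
    return rand_memo[key]

def F(e):
    """e is an int, a variable name, or a tuple (op, arg1, ..., argn)."""
    if isinstance(e, int):
        return (Fraction(e),) * K
    if isinstance(e, str):
        if e not in store:            # used before defined: fresh random list
            store[e] = rand(("var", e))
        return store[e]
    op, *args = e
    vals = [F(a) for a in args]
    if op == "+":
        return tuple(a + b for a, b in zip(*vals))
    if op == "-":
        return tuple(a - b for a, b in zip(*vals))
    if op == "*":
        return tuple(a * b for a, b in zip(*vals))
    if op == "/":                     # raises ZeroDivisionError on a zero entry
        return tuple(a / b for a, b in zip(*vals))
    return rand((op, *vals))          # uninterpreted function U

print(F(("U", ("+", "x", "y"))) == F(("U", ("+", "y", "x"))))
```

Because an uninterpreted function is keyed on the values of its arguments, `U(x + y)` and `U(y + x)` receive the same list without any symbolic reasoning.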
Joins
F(Ф(c: e1, ¬c: e2)) = F(e1 ⊕_r e2)
r = F(c)
e1 ⊕_r e2 ≡ r × e1 + (1-r) × e2
Note that r + (1-r) = 1
Linear equalities are preserved
Furthermore, if 0 ≤ r ≤ 1
Linear inequalities will be preserved
Preservation of Linear Invariants
[Flow graph: on branch c, y1 = …; x1 = a y1 + b; on branch ¬c, y2 = …; x2 = a y2 + b; at the join, y = Ф(c: y1, ¬c: y2); x = Ф(c: x1, ¬c: x2); assert (x = ay + b)]
F(y) = r F(y1) + (1-r) F(y2)
F(x) = r F(x1) + (1-r) F(x2)
= r F(a y1 + b) + (1-r) F(a y2 + b)
= a r F(y1) + a (1-r) F(y2) + r b + (1-r) b
= a F(y) + b
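The derivation can be replayed numerically. The sketch below uses arbitrary rational values for a, b, y1, y2 and r (all chosen here purely for illustration) and checks that the joined values still satisfy x = ay + b. It also checks one inequality with a weight between 0 and 1.

```python
from fractions import Fraction
import random

def join(r, e1, e2):
    """Affine join: r * e1 + (1 - r) * e2."""
    return r * e1 + (1 - r) * e2

rng = random.Random(2)

def rand():
    return Fraction(rng.randrange(-1000, 1000), rng.randrange(1, 50))

a, b = rand(), rand()
y1, y2, r = rand(), rand(), rand()
x1, x2 = a * y1 + b, a * y2 + b        # the invariant holds on both branches

y = join(r, y1, y2)
x = join(r, x1, x2)
equality_kept = (x == a * y + b)       # and it still holds after the join

# With 0 <= r <= 1, an inequality that holds on both branches survives too.
r01 = Fraction(1, 3)
inequality_kept = join(r01, 2, 7) <= join(r01, 5, 9)   # 2 <= 5 and 7 <= 9
print(equality_kept, inequality_kept)
```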
Locking Example (Joins)
[Flow graph: Lock = L0; if C then Lock++; if C then Lock--; assert (Lock = L0)]
Locking Example (Joins)
[Flow graph in SSA form: L1 = L0; if C then L2 = L1 + 1; L3 = Ф(c: L2, ¬c: L1); if C then L4 = L3 - 1; L5 = Ф(c: L4, ¬c: L3); assert (L5 = L0)]
F(L1) = F(L0)
F(L2) = F(L1) + 1
F(L3) = r F(L2) + (1-r) F(L1)
= F(L0) + r
F(L4) = F(L3) – 1
F(L5) = r F(L4) + (1-r) F(L3)
= F(L3) – r
= F(L0) + r – r
= F(L0)
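The same calculation in executable form, as a sketch: both joins are guarded by the same condition C, so they share one random weight r. The function name and the concrete numbers are illustrative only.

```python
from fractions import Fraction

def join(r, e1, e2):
    return r * e1 + (1 - r) * e2

def lock_after(l0, r_first, r_second):
    """Evaluate the SSA form of the locking example with the given join weights."""
    l1 = l0
    l2 = l1 + 1
    l3 = join(r_first, l2, l1)
    l4 = l3 - 1
    l5 = join(r_second, l4, l3)
    return l5

l0 = Fraction(17, 3)          # arbitrary stand-in for F(L0)
r = Fraction(5, 7)            # arbitrary stand-in for F(C)
same_condition = lock_after(l0, r, r)                   # both joins test C
other_condition = lock_after(l0, r, Fraction(2, 9))     # second join tests something else
print(same_condition == l0, other_condition == l0)
```

When the two joins use different weights the result is F(L0) + r1 - r2, so the assertion is only established when both branches test the same condition.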
Content of Conditionals
[Flow graph: P0; t = 2x - 3y; P1; if (x - y == 5) then P2; if (x + y == 15) then P3; assert (t = 5)]
P0:
F(x) = [1, 2, 3]
F(y) = [1, 4, 9]
Content of Conditionals
P1:
F(x) = [1, 2, 3]
F(y) = [1, 4, 9]
F(t) = F(2x – 3y)
= [-1, -8, -21]
Content of Conditionals
[Flow graph: a conditional C whose T and F branches both use y (… = y + …)]
Split F(y) into F(yT) and F(yF) such that
F(yT) = A(F(y))
F(yT) ⊕_r F(yF) = F(y), where r = F(c)
A([v1, v2, v3]) → [v1 ⊕_r1 v2, v2 ⊕_r2 v3, v3 ⊕_r3 v1]
[Flow graph after the split: the T branch uses yT (… = yT + …), the F branch uses yF (… = yF + …)]
Example (Content of Conditionals)
[Flow graph with renamed variables: P0; t = 2x - 3y; P1; if (x - y == 5) then P2; if (xT + yT == 15) then P3; assert (tTT = 5)]
P1:
F(x) = [1, 2, 3]
F(y) = [1, 4, 9]
F(t) = [-1, -8, -21]
F(x - y - 5) = [-5, -7, -11]
Example (Content of Conditionals)
P2:
F(xT) = [-3/2, 1/4, -2/3]
F(yT) = [-13/2, -19/4, -17/3]
F(tT) = [33/2, 59/4, 47/3]
Note that
xT – yT = 5
tT + yT = 10
Because t = 2x - 3y
= 2(x-y) - y
= 10 - y
Example (Content of Conditionals)
P2:
F(xT) = [-3/2, 1/4, -2/3]
F(yT) = [-13/2, -19/4, -17/3]
F(tT) = [33/2, 59/4, 47/3]
F(xT + yT – 15) = [-23, -39/2, -64/3]
Example (Content of Conditionals)
P3:
F(xTT) = [10, 10, 10]
F(yTT) = [5, 5, 5]
F(tTT) = [5, 5, 5]
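The numbers in this example can be reproduced with a short sketch of the adjust operation A. One detail is an assumption inferred from the slide values rather than stated in the text: each weight r_i is chosen so that the conditional's expression e joins to zero, that is e_i ⊕_r_i e_(i+1) = 0, and the same weights are then applied to every variable. The helper names are hypothetical.

```python
from fractions import Fraction

def join(r, e1, e2):
    return r * e1 + (1 - r) * e2

def adjust_weights(e):
    """Weights r_i such that e[i] joined with e[i+1] (cyclically) gives 0 (assumed rule)."""
    n = len(e)
    return [Fraction(e[(i + 1) % n]) / (e[(i + 1) % n] - e[i]) for i in range(n)]

def adjust(v, rs):
    """A([v1, v2, v3]) = [v1 join v2, v2 join v3, v3 join v1] with weights rs."""
    n = len(v)
    return [join(rs[i], v[i], v[(i + 1) % n]) for i in range(n)]

x, y, t = [1, 2, 3], [1, 4, 9], [-1, -8, -21]

# True branch of x - y == 5
rs = adjust_weights([a - b - 5 for a, b in zip(x, y)])
xT, yT, tT = adjust(x, rs), adjust(y, rs), adjust(t, rs)

# True branch of xT + yT == 15
rs2 = adjust_weights([a + b - 15 for a, b in zip(xT, yT)])
xTT, yTT, tTT = adjust(xT, rs2), adjust(yT, rs2), adjust(tT, rs2)
print(xTT, yTT, tTT)
```

After both adjustments every entry of tTT is 5, so the assertion is verified from the lists alone.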
Conditionals
F(c) → 1 (if our algorithm can prove that c is always true)
→ 0 (if our algorithm can prove that c is always false)
→ Rand(c) (equivalent conditionals get the same random value)
Let c be of the form: e == 0, and let F(e) = [v1, v2, v3]
e ≡ 0 ⇒ c is always true
Check: F(e) = F(0)
e ≡ n, n ≠ 0 ⇒ c is always false
Check: v1 = v2 = v3 ≠ 0
e ≡ n1 E + n2, 0 < n2 < n1 ⇒ c is always false
E.g., 2x + 1 ≠ 0
n1 = GCD { v1 – v2, v2 – v3 }
Check: n2 = v1 % n1 > 0
Detecting Equivalent Conditionals
To Check: (e1 == 0) ≡ (e2 == 0)
e1 ≡ e × e2, e ≠ 0 ⇒ (e1 == 0) ≡ (e2 == 0)
E.g., (x + 1 == 0) ≡ (2x + 2 == 0)
e ≠ 0 can be checked if we know F(e)
F(e) = F(e1) / F(e2)
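The three checks and the equivalence test can be sketched on integer lists as follows. This is an illustration with hypothetical function names. As on the slides, the verdicts are drawn from random sample values, so they are probabilistic rather than proofs.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def classify(v):
    """Verdict for the conditional (e == 0), given the integer list v = F(e)."""
    if all(x == 0 for x in v):
        return "always true"                    # F(e) = F(0)
    if all(x == v[0] for x in v):
        return "always false"                   # nonzero constant
    n1 = reduce(gcd, (a - b for a, b in zip(v, v[1:])))
    if n1 > 1 and v[0] % n1 > 0:
        return "always false"                   # e = n1*E + n2 with 0 < n2 < n1
    return "unknown"                            # falls back to Rand(c)

def equivalent_conditionals(v1, v2):
    """(e1 == 0) and (e2 == 0) agree if F(e1) / F(e2) is a nonzero constant list."""
    if any(b == 0 for b in v2):
        return False
    ratio = [Fraction(a, b) for a, b in zip(v1, v2)]
    return ratio[0] != 0 and all(q == ratio[0] for q in ratio)

xs = [1, 2, 3]                                  # sample list for F(x)
print(classify([2 * x + 1 for x in xs]))        # the slide's example 2x + 1
print(equivalent_conditionals([x + 1 for x in xs], [2 * x + 2 for x in xs]))
```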
Loops
F((x)) = F(x0 ⊕_r1 x_(i+1))
x_(i+1) = g(x_i)
r1 = Rand(c(x_i))
x_i = x0 ⊕_r2 g(x0)
r2 = Rand()
Linear Loop Invariants are preserved
Automatic Discovery of Invariants
Automatic Use of Invariants
x = x0;
while c(x) { x = g(x); }
t = (x);
x = x0;
while c(x) { x = g(x); }
t = x;
Example (Loops)
[Flow graph: x = 0; y = 1; loop while C(x): x = x + 1; y = y + 2; on exit: assert (y = 2x + 1)]
Example (Loops)
[Same flow graph with loop-exit values: x’ = (x); y’ = (y); assert (y’ = 2x’ + 1)]
F(x’) = F((x))
= F(0 ⊕_r1 ((0 ⊕_r2 1) + 1))
= r1 · 0 + (1-r1) ((r2 · 0 + (1-r2) · 1) + 1)
= (1-r1) (1 - r2 + 1)
= 2 - 2r1 - r2 + r1r2
F(y’) = F((y))
= F(1 ⊕_r1 ((1 ⊕_r2 3) + 2))
= r1 · 1 + (1-r1) ((r2 · 1 + (1-r2) · 3) + 2)
= r1 + (1-r1) (5 - 2r2)
= 5 - 4r1 - 2r2 + 2r1r2
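A quick numeric check of the two closed forms, as a sketch: the loop-exit rule is applied with the same weights r1 and r2 for both variables, and the results are compared with the polynomials above and with the invariant y’ = 2x’ + 1. The weights are arbitrary rationals picked for illustration.

```python
from fractions import Fraction

def join(r, e1, e2):
    return r * e1 + (1 - r) * e2

def loop_exit(x0, g, r1, r2):
    """Exit value of: x = x0; while c(x) { x = g(x); } following the loop rule above."""
    xi = join(r2, x0, g(x0))      # value at loop entry
    return join(r1, x0, g(xi))    # value at loop exit

r1, r2 = Fraction(3, 11), Fraction(-7, 5)   # arbitrary stand-ins for the random weights
x_exit = loop_exit(0, lambda x: x + 1, r1, r2)
y_exit = loop_exit(1, lambda y: y + 2, r1, r2)
print(y_exit == 2 * x_exit + 1)
```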
Memory
M[x] = v6
M[y]=v
M[y+1]=v5
M[2z] = v4
M[2z+1] = v3
M[4z+3] = v2
M[2z+1] = v1
T1 = M[y]
M[y] = v
M[4z+3] = v2
M[2z+1] = v1
M[2z] = v4
T2 = M[y]
assert (T1 = T2)
Memory
F(Mem[a]) = F(v1 ⊕_r1 v2 ⊕_r2 … vn ⊕_rn v)
= F(r1 v1 + r2 v2 + … + rn vn + (1-r1-r2-…-rn) v)
ri = F(Conditions under which vi is read)
Example (Memory)
M[x] = v6
M[y] = v
M[y+1] = v5
M[2z] = v4
M[2z+1] = v3
M[4z+3] = v2
M[2z+1] = v1
T1 = M[y]
T1 = M[y]
= v1 if (y == 2z+1)
v2 if (y != 2z+1 ∧ y == 4z+3)
v4 if (y == 2z)
v otherwise
F(T1) = F(M[y])
= F(r1 v1 + r2 v2 + r4 v4 + (1 – r1 – r2 – r4) v)
where r1 = F(y == 2z+1)
r2 = F(y != 2z+1 ∧ y == 4z+3)
r4 = F(y == 2z)
Example (Memory)
M[y] = v
M[4z+3] = v2
M[2z+1] = v1
M[2z] = v4
T2 = M[y]
T2 = M[y]
= v4 if (y == 2z)
v1 if (y == 2z+1)
v2 if (y != 2z+1 ∧ y == 4z+3)
v otherwise
F(T2) = F(M[y])
= F(r4 v4 + r1 v1 + r2 v2 + (1 – r4 – r1 – r2) v)
where r4 = F(y == 2z)
r1 = F(y == 2z+1)
r2 = F(y != 2z+1 ∧ y == 4z+3)
Example (Memory)
[The two code fragments again: the first ends in T1 = M[y], the second in T2 = M[y]]
F(T1) = F(r1 v1 + r2 v2 + r4 v4 + (1 – r1 – r2 – r4) v)
F(T2) = F(r4 v4 + r1 v1 + r2 v2 + (1 – r4 – r1 – r2) v)
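The equality F(T1) = F(T2) can be replayed with a small sketch. It takes the case analysis for each read as given (it does not derive the conditions from the memory table) and relies on equivalent conditionals receiving the same random value. The condition strings, helper names, and random stand-ins for v, v1, v2, v4 are illustrative assumptions.

```python
from fractions import Fraction
import random

rng = random.Random(4)

def rand():
    return Fraction(rng.randrange(1, 10**6), rng.randrange(1, 10**3))

cond_value = {}

def F_cond(name):
    """Equivalent conditionals (here: identical text) get the same random value."""
    if name not in cond_value:
        cond_value[name] = rand()
    return cond_value[name]

def read(cases, default):
    """F(Mem[a]) = r1 v1 + ... + rn vn + (1 - r1 - ... - rn) v."""
    weights = [F_cond(c) for c, _ in cases]
    total = sum(r * val for r, (_, val) in zip(weights, cases))
    return total + (1 - sum(weights)) * default

v, v1, v2, v4 = rand(), rand(), rand(), rand()
T1 = read([("y == 2z+1", v1), ("y != 2z+1 and y == 4z+3", v2), ("y == 2z", v4)], v)
T2 = read([("y == 2z", v4), ("y == 2z+1", v1), ("y != 2z+1 and y == 4z+3", v2)], v)
print(T1 == T2)
```

The two reads list their cases in a different order, but the weighted sum is order-independent, so the lists coincide and assert (T1 = T2) is established.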
Applications
Program Verification
Automatic discovery of useful loop invariants
Translation Validation
Compiler Optimizations
Eliminating redundant computations, branches, memory reads.
Partial Evaluation
Interactive Debugging and Testing of Programs
Related Light-weight Techniques
Value Numbering
Targets Structural Equivalence of expressions
Detects only equalities
Random Testing
Cannot ‘prove’ equivalence of expressions; can only provide a counter-example
Exponential number of paths
Even generating input data to execute a particular path is difficult
Conclusion
Comparison with Symbolic Analysis
very simple data structure: list of numbers with simple operations and fast judgements
There is a limit to what a linear time analysis can achieve!
Excellent base to build up more complicated analysis
Join lazily
“The intriguing possibility that axioms of randomness may constitute a useful fundamental source of truth independent of, but supplementary to, the standard axiomatic structure of mathematics suggests that probabilistic algorithms ought to be sought vigorously.”
- J.T. Schwartz