Program Analysis Using Randomization Sumit Gulwani, George Necula (U.C. Berkeley)

Posted: 20-Dec-2015

Page 1

Program Analysis Using Randomization

Sumit Gulwani, George Necula

(U.C. Berkeley)

Page 2

What kind of Analysis?

  • Any analysis that can be modeled as checking equivalence of two expressions at a point in a program
  • Equivalent to checking Reachability Properties

Complexity of our algorithm

  • (Almost) Linear Time
  • Queries answered in (almost) constant time

Page 3

The Randomized Strategy

Define a mapping F: Expression → Polynomial such that, with P1 = F(E1) and P2 = F(E2):

  • P1 ≡ P2 ⇒ E1 ≡ E2 (Soundness)
  • E1 ≡ E2 [according to some theory T] ⇒ P1 ≡ P2 (Completeness w.r.t. T)

For a loop-free program, F(E) = Σ_i Pred_i × v_i

Page 4

Example

[Flowchart: test C1, with x = 3 on the T branch and x = 7 on the F branch; test C2, with y = 5 on the T branch and y = 9 on the F branch; then t = x + y]

F(t) = C1 C2 (3+5)
     + C1 ¬C2 (3+9)
     + ¬C1 C2 (7+5)
     + ¬C1 ¬C2 (7+9)
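The path-sum form of F(t) can be checked numerically. The following is a minimal sketch, assuming each condition is replaced by a rational number and ¬C is read as 1 - C; the function names `path_sum` and `joined` are hypothetical and do not come from the slides.

```python
from fractions import Fraction
import random

def path_sum(c1, c2):
    """F(t) as the sum over the four paths, reading ¬C as (1 - C)."""
    return (c1 * c2 * (3 + 5)
            + c1 * (1 - c2) * (3 + 9)
            + (1 - c1) * c2 * (7 + 5)
            + (1 - c1) * (1 - c2) * (7 + 9))

def joined(c1, c2):
    """The same value built join by join: x, then y, then t = x + y."""
    x = c1 * 3 + (1 - c1) * 7
    y = c2 * 5 + (1 - c2) * 9
    return x + y

rng = random.Random(0)
c1 = Fraction(rng.randint(-100, 100))
c2 = Fraction(rng.randint(-100, 100))
assert path_sum(c1, c2) == joined(c1, c2)
```

Building the value join by join avoids enumerating the four paths explicitly, while producing the same polynomial value.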

Page 5

Checking Polynomial Equivalence

P1 ≡ P2 can be determined by random testing with small error probability (Probabilistic Soundness)

F: Expression → Polynomial

can be thought of as

F: Expression → [List of numbers]
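Random testing of polynomial equivalence can be sketched as follows. This is an illustration only: working modulo a fixed prime, the constant `PRIME`, and the helper `probably_equal` are assumptions of the sketch, not details given on the slides. Each trial plays the role of one entry in the list of numbers.

```python
import random

PRIME = 2_147_483_647  # 2**31 - 1; field size assumed for this sketch

def probably_equal(p1, p2, num_vars, trials=3, rng=None):
    """Return False if a random point separates p1 and p2, True otherwise."""
    rng = rng or random.Random()
    for _ in range(trials):
        point = [rng.randrange(PRIME) for _ in range(num_vars)]
        if p1(*point) % PRIME != p2(*point) % PRIME:
            return False   # a definite counter-example
    return True            # equal with high probability, not with certainty

# (x + y)^2 compared with its expansion and with a wrong expansion
square = lambda x, y: (x + y) ** 2
expanded = lambda x, y: x * x + 2 * x * y + y * y
wrong = lambda x, y: x * x + y * y
```

A `False` answer is always correct; a `True` answer can be wrong only if every random point happens to be a root of the difference polynomial.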

Page 6

Algorithm

Statements with side-effect

  • x = e
  • Mem[e1] = e2

x = e

  • Record in the register table: Store(x) ← F(e)
  • Register table is simply an array

Mem[e1] = e2

  • Record in the memory table: F(e1) ← F(e2)

Page 7

Expressions

E ::= n                  (Constant)
    | x                  (Variable Reference)
    | Mem[e]             (Memory Read)
    | e1 + e2            (Arithmetic)
    | e1 - e2
    | e1 * e2
    | (e == 0)           (Conditionals)
    | (e ≥ 0)
    | Ф(c: e1, ¬c: e2)   (Joins)
    | (e1,e2)            (Joins at Loop Entry)
    | (x)                (Loop Exit)
    | U(e1, …, en)       (Uninterpreted Functions)

Page 8

Arithmetic and Uninterpreted Functions

F(n) → [n, n, n]

F(x) → Store(x) if x was defined before use
     → Rand() otherwise

F(e1 + e2) → F(e1) + F(e2)

F(e1 - e2) → F(e1) - F(e2)

F(e1 * e2) → F(e1) * F(e2)

F(e1 / e2) → F(e1) / F(e2)

F(U(e1, …, en)) → Rand(U, F(e1), …, F(en))
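These rules can be sketched as a small evaluator over lists of numbers. The tuple encoding of expressions, the tables `store` and `uf_cache`, and the choice to remember the random values picked for an undefined variable are all assumptions made for illustration. Division is omitted to keep the sketch short.

```python
from fractions import Fraction
import random

K = 3                      # samples per expression, as in [v1, v2, v3]
rng = random.Random(42)
store = {}                 # register table: variable name -> list of K numbers
uf_cache = {}              # memo for Rand(U, args): same inputs give same output

def rand_list():
    return [Fraction(rng.randint(1, 10**6)) for _ in range(K)]

def F(e):
    """Evaluate an expression tree given as nested tuples."""
    tag = e[0]
    if tag == "const":
        return [Fraction(e[1])] * K
    if tag == "var":
        if e[1] not in store:          # used before definition: pick random values
            store[e[1]] = rand_list()
        return store[e[1]]
    if tag in ("+", "-", "*"):
        a, b = F(e[1]), F(e[2])
        op = {"+": lambda u, v: u + v,
              "-": lambda u, v: u - v,
              "*": lambda u, v: u * v}[tag]
        return [op(u, v) for u, v in zip(a, b)]
    if tag == "uf":                    # ("uf", name, arg1, ..., argn)
        key = (e[1], tuple(tuple(F(arg)) for arg in e[2:]))
        if key not in uf_cache:
            uf_cache[key] = rand_list()
        return uf_cache[key]
    raise ValueError(f"unknown expression tag: {tag}")

x, y = ("var", "x"), ("var", "y")
# f(x + y) and f(y + x) receive the same values because their arguments agree.
assert F(("uf", "f", ("+", x, y))) == F(("uf", "f", ("+", y, x)))
```

Keying the cache on the evaluated arguments rather than on their syntax is what lets equivalent argument expressions map to the same random result.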

Page 9

Joins

F(Ф(c: e1, ¬c: e2)) = F(e1 ©r e2), where r = F(c)

e1 ©r e2 ≡ r × e1 + (1-r) × e2

Note that r + (1-r) = 1

  • Linear equalities are preserved
  • Furthermore, if 0 ≤ r ≤ 1, linear inequalities will be preserved
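The ©r combination and its effect on a linear equality can be sketched as follows. The coefficients `a` and `b`, the random sample lists, and the helper names are hypothetical; the weight r is a list with one entry per sample, matching r = F(c).

```python
from fractions import Fraction
import random

def join(r, a, b):
    """The ©r combination: r*a + (1 - r)*b, element by element."""
    return [r_i * u + (1 - r_i) * v for r_i, u, v in zip(r, a, b)]

rng = random.Random(7)
rand = lambda: [Fraction(rng.randint(-50, 50)) for _ in range(3)]

a, b = 4, 9                      # hypothetical coefficients of x = a*y + b
y1, y2, r = rand(), rand(), rand()
x1 = [a * v + b for v in y1]     # then-branch satisfies x1 = a*y1 + b
x2 = [a * v + b for v in y2]     # else-branch satisfies x2 = a*y2 + b

y = join(r, y1, y2)
x = join(r, x1, x2)
assert x == [a * v + b for v in y]   # the linear equality survives the join
```

The check succeeds for any weights because the two coefficients r and 1 - r always sum to 1.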

Page 10

Preservation of Linear Invariants

[Flowchart: on one branch, y1 = … and x1 = a y1 + b; on the other branch, y2 = … and x2 = a y2 + b; after the join, y = Ф(c: y1, ¬c: y2), x = Ф(c: x1, ¬c: x2), then assert (x = ay + b)]

F(y) = r F(y1) + (1-r) F(y2)

F(x) = r F(x1) + (1-r) F(x2)
     = r F(ay1 + b) + (1-r) F(ay2 + b)
     = ar F(y1) + a(1-r) F(y2) + r b + (1-r) b
     = a F(y) + b

Page 11

Locking Example (Joins)

[Flowchart: Lock = L0; test C, with Lock++ on one branch; test C again, with Lock-- on one branch; then assert (Lock = L0)]

Page 12

Locking Example (Joins)

[Flowchart: L1 = L0; test C, with L2 = L1 + 1 on one branch; L3 = Ф(c: L2, ¬c: L1); test C again, with L4 = L3 - 1 on one branch; L5 = Ф(c: L4, ¬c: L3); then assert (L5 = L0)]

F(L1) = F(L0)

F(L2) = F(L1) + 1

F(L3) = r F(L2) + (1-r) F(L1)
      = F(L0) + r

F(L4) = F(L3) - 1

F(L5) = r F(L4) + (1-r) F(L3)
      = F(L3) - r
      = F(L0) + r - r
      = F(L0)
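The derivation can be replayed on concrete numbers. This sketch uses a single random sample instead of a list, and the names `phi` and `r_other` are hypothetical. The key point is that both tests of C share the same weight r.

```python
from fractions import Fraction
import random

rng = random.Random(3)
L0 = Fraction(rng.randint(-1000, 1000))   # unknown initial lock state
r = Fraction(rng.randint(-1000, 1000))    # F(C), shared by both tests of C

def phi(r, then_val, else_val):
    return r * then_val + (1 - r) * else_val

L1 = L0
L2 = L1 + 1
L3 = phi(r, L2, L1)      # equals L0 + r
L4 = L3 - 1
L5 = phi(r, L4, L3)      # equals L3 - r

assert L3 == L0 + r
assert L5 == L0

# With a different weight for the second test the assertion fails.
r_other = r + 1
assert phi(r_other, L4, L3) != L0
```

The last check shows why equivalent conditionals must receive the same random value: the cancellation r - r depends on it.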

Page 13

Content of Conditionals

[Flowchart: t = 2x - 3y; test x - y == 5 ?; on its T branch, test x + y == 15 ?; on its T branch, assert (t = 5). Program points P0 to P3 are marked along this path.]

P0:
  F(x) = [1, 2, 3]
  F(y) = [1, 4, 9]

Page 14

Content of Conditionals

[Flowchart: t = 2x - 3y; test x - y == 5 ?; on its T branch, test x + y == 15 ?; on its T branch, assert (t = 5). Program points P0 to P3 are marked along this path.]

P1:
  F(x) = [1, 2, 3]
  F(y) = [1, 4, 9]
  F(t) = F(2x - 3y)
       = [-1, -8, -21]

Page 15

Content of Conditionals

[Flowchart: conditional C with a T branch and an F branch; each branch contains a use of y of the form … = y + …]

Page 16

Content of Conditionals

Split F(y) into F(yT) and F(yF) such that

  F(yT) = A(F(y))

  F(yT) ©r F(yF) = F(y), where r = F(c)

A([v1, v2, v3]) → [v1 ©r1 v2, v2 ©r2 v3, v3 ©r3 v1]

[Flowchart: conditional C; the T branch uses yT (… = yT + …) and the F branch uses yF (… = yF + …)]

Page 17

Example

Page 18

Content of Conditionals

[Flowchart: t = 2x - 3y; test x - y == 5 ?; on its T branch, test x + y == 15 ?; on its T branch, assert (t = 5). Program points P0 to P3 are marked along this path.]

P1:
  F(x) = [1, 2, 3]
  F(y) = [1, 4, 9]
  F(t) = [-1, -8, -21]

Page 19

Example (Content of Conditionals)

[Flowchart: t = 2x - 3y; test x - y == 5 ?; on its T branch, test xT + yT == 15 ?; on its T branch, assert (tTT = 5). Program points P0 to P3 are marked along this path.]

P1:
  F(x) = [1, 2, 3]
  F(y) = [1, 4, 9]
  F(t) = [-1, -8, -21]
  F(x - y - 5) = [-5, -7, -11]

Page 20

Example (Content of Conditionals)

[Flowchart: t = 2x - 3y; test x - y == 5 ?; on its T branch, test xT + yT == 15 ?; on its T branch, assert (tTT = 5). Program points P0 to P3 are marked along this path.]

P2:
  F(xT) = [-3/2, 1/4, -2/3]
  F(yT) = [-13/2, -19/4, -17/3]
  F(tT) = [33/2, 59/4, 47/3]

Note that
  xT - yT = 5
  tT + yT = 10
because
  t = 2x - 3y
    = 2(x - y) - y
    = 10 - y

Page 21

Example (Content of Conditionals)

[Flowchart: t = 2x - 3y; test x - y == 5 ?; on its T branch, test xT + yT == 15 ?; on its T branch, assert (tTT = 5). Program points P0 to P3 are marked along this path.]

P2:
  F(xT) = [-3/2, 1/4, -2/3]
  F(yT) = [-13/2, -19/4, -17/3]
  F(tT) = [33/2, 59/4, 47/3]
  F(xT + yT - 15) = [-23, -39/2, -64/3]

Page 22

Example (Content of Conditionals)

[Flowchart: t = 2x - 3y; test x - y == 5 ?; on its T branch, test xT + yT == 15 ?; on its T branch, assert (tTT = 5). Program points P0 to P3 are marked along this path.]

P3:
  F(xTT) = [10, 10, 10]
  F(yTT) = [5, 5, 5]
  F(tTT) = [5, 5, 5]
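The numbers in this example can be reproduced with a short sketch of the A operation. It assumes that each weight r_i is chosen so that the tested expression becomes 0 in the adjusted sample. The slides do not state this rule explicitly, but it is consistent with every value shown on pages 19 to 22. The helper names `weights` and `adjust` are hypothetical.

```python
from fractions import Fraction

def weights(cond):
    """For each sample i, pick r_i so that cond[i] ©r_i cond[i+1] == 0."""
    n = len(cond)
    out = []
    for i in range(n):
        v, w = cond[i], cond[(i + 1) % n]
        out.append(Fraction(w) / (w - v))    # solves r*v + (1 - r)*w == 0
    return out

def adjust(values, r):
    """A([v1, v2, v3]) = [v1 ©r1 v2, v2 ©r2 v3, v3 ©r3 v1]."""
    n = len(values)
    return [r[i] * values[i] + (1 - r[i]) * values[(i + 1) % n]
            for i in range(n)]

x = [Fraction(v) for v in (1, 2, 3)]
y = [Fraction(v) for v in (1, 4, 9)]
t = [2 * a - 3 * b for a, b in zip(x, y)]

# True branch of x - y == 5
r = weights([a - b - 5 for a, b in zip(x, y)])
xT, yT, tT = adjust(x, r), adjust(y, r), adjust(t, r)

# True branch of xT + yT == 15
r = weights([a + b - 15 for a, b in zip(xT, yT)])
xTT, yTT, tTT = adjust(xT, r), adjust(yT, r), adjust(tT, r)
```

Running it gives the lists shown at P2 and P3, including F(tTT) = [5, 5, 5], so the assertion holds in every sample. The sketch divides by w - v and therefore needs neighbouring samples of the tested expression to differ.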

Page 23

Conditionals

F(c) → 1 (if our algorithm can prove that c is always true)
     → 0 (if our algorithm can prove that c is always false)
     → Rand(c) (equivalent conditionals get the same random value)

Let c be of the form e == 0, and let F(e) = [v1, v2, v3]

  • e ≡ 0 ⇒ c is always true
    Check: F(e) = F(0)
  • e ≡ n, n ≠ 0 ⇒ c is always false
    Check: v1 = v2 = v3 ≠ 0
  • e ≡ n1 E + n2, 0 < n2 < n1 ⇒ c is always false (e.g. 2x + 1 ≠ 0)
    n1 = GCD { v1 - v2, v2 - v3 }
    Check: n2 = v1 % n1 > 0
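The three checks can be sketched for integer samples as below. The function name `classify_equals_zero` and the use of `None` to mean "fall back to Rand(c)" are assumptions of this illustration.

```python
from math import gcd

def classify_equals_zero(values):
    """Classify c = (e == 0) from integer samples F(e) = [v1, v2, v3].

    Returns 1 (always true), 0 (always false) or None (undecided).
    """
    v1, v2, v3 = values
    if v1 == v2 == v3 == 0:
        return 1                      # F(e) = F(0)
    if v1 == v2 == v3:
        return 0                      # e is a non-zero constant
    n1 = gcd(v1 - v2, v2 - v3)        # candidate multiplier in e = n1*E + n2
    if v1 % n1 > 0:
        return 0                      # remainder n2 is non-zero
    return None

xs = [3, 10, 4]                       # hypothetical random samples for x
print(classify_equals_zero([2 * x + 1 for x in xs]))
print(classify_equals_zero([x + 1 for x in xs]))
```

As with the rest of the analysis, the answers are only probabilistically sound, since they are inferred from three samples rather than from the expression itself.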

Page 24

Detecting Equivalent Conditionals

To Check: (e1 == 0) ≡ (e2 == 0)

  • e1 ≡ e × (e2), e ≠ 0 ⇒ (e1 == 0) ≡ (e2 == 0)
    e.g. (x + 1 == 0) ≡ (2x + 2 == 0)
  • e ≠ 0 can be checked if we know F(e)
    F(e) = F(e1) / F(e2)
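A minimal sketch of this check follows. It is restricted, as an assumption, to the case where the ratio e is a non-zero constant. The function name `same_zero_set` and the sample values are hypothetical.

```python
from fractions import Fraction

def same_zero_set(f_e1, f_e2):
    """True if F(e1) / F(e2) is the same non-zero constant in every sample."""
    if any(v == 0 for v in f_e2):
        return False                  # ratio undefined; undecided in this sketch
    ratios = {Fraction(a) / Fraction(b) for a, b in zip(f_e1, f_e2)}
    return len(ratios) == 1 and ratios.pop() != 0

xs = [3, 10, 4]                       # hypothetical random samples for x
assert same_zero_set([x + 1 for x in xs], [2 * x + 2 for x in xs])
assert not same_zero_set([x + 1 for x in xs], [x + 2 for x in xs])
```

When the check succeeds, both conditionals would be given the same random value, as required on page 23.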

Page 25

Loops

  x = x0;
  while c(x) { x = g(x); }
  t = x;

is rewritten with a loop-exit expression as

  x = x0;
  while c(x) { x = g(x); }
  t = (x);

F((x)) = F(x_0 ©r1 x_{i+1})
  x_{i+1} = g(x_i)
  r1 = Rand(c(x_i))
  x_i = x_0 ©r2 g(x_0)
  r2 = Rand()

Linear Loop Invariants are preserved

  • Automatic Discovery of Invariants
  • Automatic Use of Invariants

Page 26

Example (Loops)

[Flowchart: x = 0; y = 1; a loop with body x = x + 1; y = y + 2 and loop test C(x) ?; after the loop, assert (y = 2x + 1)]

Page 27

Example (Loops)

[Flowchart: x = 0; y = 1; a loop with body x = x + 1; y = y + 2 and loop test C(x) ?; after the loop, x’ = (x); y’ = (y); assert (y’ = 2x’ + 1)]

F(x’) = F((x))
      = F(0 ©r1 ((0 ©r2 (1)) + 1))
      = r1 × 0 + (1-r1) ((r2 × 0 + (1-r2) × 1) + 1)
      = (1-r1) (1 - r2 + 1)
      = 2 - 2r1 - r2 + r1r2

F(y’) = F((y))
      = F(1 ©r1 ((1 ©r2 (3)) + 2))
      = r1 × 1 + (1-r1) ((r2 × 1 + (1-r2) × 3) + 2)
      = r1 + (1-r1) (5 - 2r2)
      = 5 - 4r1 - 2r2 + 2r1r2

Hence F(y’) = 2 F(x’) + 1, so the assertion holds.
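The two closed forms can be checked on concrete weights. The sketch below uses one random sample per weight; the helper names `join`, `g_x` and `g_y` are hypothetical.

```python
from fractions import Fraction
import random

def join(r, a, b):
    return r * a + (1 - r) * b

rng = random.Random(11)
r1 = Fraction(rng.randint(-100, 100))   # weight for the loop-exit join
r2 = Fraction(rng.randint(-100, 100))   # weight for the loop-entry join

def g_x(x):
    return x + 1                        # loop body for x

def g_y(y):
    return y + 2                        # loop body for y

x_i = join(r2, 0, g_x(0))               # x_i = x_0 ©r2 g(x_0)
y_i = join(r2, 1, g_y(1))
x_exit = join(r1, 0, g_x(x_i))          # x_0 ©r1 x_{i+1}
y_exit = join(r1, 1, g_y(y_i))

assert x_exit == 2 - 2 * r1 - r2 + r1 * r2
assert y_exit == 5 - 4 * r1 - 2 * r2 + 2 * r1 * r2
assert y_exit == 2 * x_exit + 1
```

The invariant holds for every choice of r1 and r2 because both joins are affine combinations of states that already satisfy y = 2x + 1.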

Page 28

Memory

First sequence:
  M[x] = v6
  M[y] = v
  M[y+1] = v5
  M[2z] = v4
  M[2z+1] = v3
  M[4z+3] = v2
  M[2z+1] = v1
  T1 = M[y]

Second sequence:
  M[y] = v
  M[4z+3] = v2
  M[2z+1] = v1
  M[2z] = v4
  T2 = M[y]

assert (T1 = T2)

Page 29

Memory

F(Mem[a]) = F(v1 ©r1 v2 ©r2 … vn ©rn v)
          = F(r1 v1 + r2 v2 + … + rn vn + (1 - r1 - r2 - … - rn) v)

ri = F(Conditions under which vi is read)

Page 30

Example (Memory)

  M[x] = v6
  M[y] = v
  M[y+1] = v5
  M[2z] = v4
  M[2z+1] = v3
  M[4z+3] = v2
  M[2z+1] = v1
  T1 = M[y]

T1 = M[y]
   = v1 if (y == 2z+1)
     v2 if (y != 2z+1 ∧ y == 4z+3)
     v4 if (y == 2z)
     v  otherwise

F(T1) = F(M[y])
      = F(r1 v1 + r2 v2 + r4 v4 + (1 - r1 - r2 - r4) v)

where r1 = F(y == 2z+1)
      r2 = F(y != 2z+1 ∧ y == 4z+3)
      r4 = F(y == 2z)

Page 31

Example (Memory)

  M[y] = v
  M[4z+3] = v2
  M[2z+1] = v1
  M[2z] = v4
  T2 = M[y]

T2 = M[y]
   = v4 if (y == 2z)
     v1 if (y == 2z+1)
     v2 if (y != 2z+1 ∧ y == 4z+3)
     v  otherwise

F(T2) = F(M[y])
      = F(r4 v4 + r1 v1 + r2 v2 + (1 - r4 - r1 - r2) v)

where r4 = F(y == 2z)
      r1 = F(y == 2z+1)
      r2 = F(y != 2z+1 ∧ y == 4z+3)

Page 32

Example (Memory)

First sequence:
  M[x] = v6
  M[y] = v
  M[y+1] = v5
  M[2z] = v4
  M[2z+1] = v3
  M[4z+3] = v2
  M[2z+1] = v1
  T1 = M[y]

Second sequence:
  M[y] = v
  M[4z+3] = v2
  M[2z+1] = v1
  M[2z] = v4
  T2 = M[y]

F(T1) = F(r1 v1 + r2 v2 + r4 v4 + (1 - r1 - r2 - r4) v)

F(T2) = F(r4 v4 + r1 v1 + r2 v2 + (1 - r4 - r1 - r2) v)
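The two sums contain the same terms in a different order, so they evaluate to the same number once each condition has a fixed random weight. The sketch below illustrates only that last step. It takes the case analysis from pages 30 and 31 as given, and it treats conditions as equivalent when their text is identical, which is a simplification. The names `F_cond` and `read` are hypothetical.

```python
from fractions import Fraction
import random

rng = random.Random(5)
rand = lambda: Fraction(rng.randint(-1000, 1000))

cond_value = {}                         # condition text -> random weight

def F_cond(text):
    """Equivalent conditions (here: identical text) share one random value."""
    if text not in cond_value:
        cond_value[text] = rand()
    return cond_value[text]

def read(cases, default):
    """Flat weighted sum: sum(r_i * v_i) + (1 - sum(r_i)) * default."""
    weights = [F_cond(c) for c, _ in cases]
    total = sum(r * v for r, (_, v) in zip(weights, cases))
    return total + (1 - sum(weights)) * default

v, v1, v2, v4 = rand(), rand(), rand(), rand()
T1 = read([("y == 2z+1", v1),
           ("y != 2z+1 and y == 4z+3", v2),
           ("y == 2z", v4)], v)
T2 = read([("y == 2z", v4),
           ("y == 2z+1", v1),
           ("y != 2z+1 and y == 4z+3", v2)], v)
assert T1 == T2
```

Because the read is a flat sum rather than a nested chain of joins, the order in which the writes were recorded does not affect the result.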

Page 33

Applications

  • Program Verification
    Automatic discovery of useful loop invariants
  • Translation Validation
  • Compiler Optimizations
    Eliminating redundant computations, branches, memory reads.
  • Partial Evaluation
  • Interactive Debugging and Testing of Programs

Page 34

Related Light-weight Techniques

  • Value Numbering
    Targets Structural Equivalence of expressions
    Detects only equalities
  • Random Testing
    Cannot ‘prove’ equivalence of expressions; can only provide a counter-example
    Exponential number of paths
    Even generating input data to execute a particular path is difficult

Page 35

Conclusion

  • Comparison with Symbolic Analysis
    very simple data structure: list of numbers with simple operations and fast judgements
  • There is a limit to what a linear time analysis can achieve!
  • Excellent base to build up more complicated analysis
    Join lazily

“The intriguing possibility that axioms of randomness may constitute a useful fundamental source of truth independent of, but supplementary to, the standard axiomatic structure of mathematics suggests that probabilistic algorithms ought to be sought vigorously.”

- J.T. Schwartz