Symbolic Bounds Analysis of Pointers, Array Indices, and Accessed Memory Regions

Post on 10-Feb-2016


Radu Rugina and Martin Rinard
Laboratory for Computer Science
Massachusetts Institute of Technology

Outline
• Examples
• Key Problem: Extracting Symbolic Bounds for Accessed Memory Regions
• Key Technology: Formulating and Solving Systems of Symbolic Inequality Constraints
• Results
• Conclusion

Example - Divide and Conquer Sort

[Figure: an eight-element array is sorted in three phases.
Input: 4 7 6 1 5 3 8 2
Divide: 4 7 | 6 1 | 5 3 | 8 2
Conquer: 4 7 | 1 6 | 3 5 | 2 8
Combine: 1 4 6 7 | 2 3 5 8, then 1 2 3 4 5 6 7 8]

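“Sort n Items in d, Using t as Temporary Storage”

void sort(int *d, int *t, int n) {
    if (n > CUTOFF) {
        /* recursively sort the four quarters of d */
        sort(d, t, n/4);
        sort(d+n/4, t+n/4, n/4);
        sort(d+2*(n/4), t+2*(n/4), n/4);
        sort(d+3*(n/4), t+3*(n/4), n-3*(n/4));
        /* merge sorted quarters of d into halves of t */
        merge(d, d+n/4, d+n/2, t);
        merge(d+n/2, d+3*(n/4), d+n, t+n/2);
        /* merge sorted halves of t back into d */
        merge(t, t+n/2, t+n, d);
    } else {
        insertionSort(d, d+n);
    }
}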

Motivating Problem: Exploit parallelism in this code

“Recursively Sort Four Quarters of d”

Divide the array into subarrays and recursively sort the subarrays.

[Figure: input array d = 4 7 6 1 5 3 8 2]

Subproblems are identified using pointers into the middle of the array: d, d+n/4, d+n/2, and d+3*(n/4).

Sorted results are written back into the input array.

[Figure: after the recursive calls, d = 4 7 | 1 6 | 3 5 | 2 8, with quarters starting at d, d+n/4, d+n/2, and d+3*(n/4)]

“Merge Sorted Quarters of d Into Halves of t”

[Figure: d = 4 7 | 1 6 | 3 5 | 2 8 is merged into t = 1 4 6 7 | 2 3 5 8, with halves starting at t and t+n/2]

“Merge Sorted Halves of t Back Into d”

[Figure: t = 1 4 6 7 | 2 3 5 8 is merged into d = 1 2 3 4 5 6 7 8]

“Use a Simple Sort for Small Problem Sizes”

[Figure: insertionSort(d,d+n) sorts the elements between d and d+n in place, starting from 4 7 6 1 5 3 8 2]

Parallel Sort

void sort(int *d, int *t, int n) {
    if (n > CUTOFF) {
        spawn sort(d, t, n/4);
        spawn sort(d+n/4, t+n/4, n/4);
        spawn sort(d+2*(n/4), t+2*(n/4), n/4);
        spawn sort(d+3*(n/4), t+3*(n/4), n-3*(n/4));
        sync;
        spawn merge(d, d+n/4, d+n/2, t);
        spawn merge(d+n/2, d+3*(n/4), d+n, t+n/2);
        sync;
        merge(t, t+n/2, t+n, d);
    } else {
        insertionSort(d, d+n);
    }
}

What Do You Need To Know To Exploit This Form of Parallelism?


Symbolic Information About Accessed Memory Regions

Information Needed To Exploit Parallelism

Calls to sort access disjoint parts of d and t. Together, the calls access [d,d+n-1] and [t,t+n-1]:

    sort(d,t,n/4);
    sort(d+n/4,t+n/4,n/4);
    sort(d+n/2,t+n/2,n/4);
    sort(d+3*(n/4),t+3*(n/4),n-3*(n/4));

The first two calls to merge access disjoint parts of d and t. Together, the calls access [d,d+n-1] and [t,t+n-1]:

    merge(d,d+n/4,d+n/2,t);
    merge(d+n/2,d+3*(n/4),d+n,t+n/2);
    merge(t,t+n/2,t+n,d);

Calls to insertionSort access [d,d+n-1]:

    insertionSort(d,d+n);

What Do You Need To Know To Exploit This Form of Parallelism?

Symbolic Information About Accessed Memory Regions:
• sort(d,t,n) accesses [d,d+n-1] and [t,t+n-1]
• insertionSort(l,h) accesses [l,h-1]
• merge(l,m,h,d) accesses [l,h-1] and [d,d+(h-l)-1]

How Hard Is It To Figure These Things Out?

Challenging


void insertionSort(int *l, int *h) {
    int *p, *q, k;
    for (p = l+1; p < h; p++) {
        /* shift larger elements right, then drop k into the gap */
        for (k = *p, q = p-1; l <= q && k < *q; q--)
            *(q+1) = *q;
        *(q+1) = k;
    }
}

Not immediately obvious that insertionSort(l,h) accesses [l,h-1]

void merge(int *l1, int *m, int *h2, int *d) {
    int *h1 = m;
    int *l2 = m;
    /* merge while both sorted runs still have elements */
    while ((l1 < h1) && (l2 < h2))
        if (*l1 < *l2) *d++ = *l1++;
        else *d++ = *l2++;
    /* copy whatever remains of either run */
    while (l1 < h1) *d++ = *l1++;
    while (l2 < h2) *d++ = *l2++;
}

Not immediately obvious that merge(l,m,h,d) accesses [l,h-1] and [d,d+(h-l)-1]
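One way to build confidence in the merge claim is to run an instrumented copy and record the lowest and highest addresses it reads and writes. The sketch below is only a dynamic check on one input, not the static analysis the slides describe; the names PtrRange, touch, merge_traced, and merge_regions_match are hypothetical helpers, and the source and destination share one buffer so that all pointer comparisons stay within a single array.

```c
#include <assert.h>
#include <stddef.h>

/* Lowest and highest address observed so far (NULL means nothing yet). */
typedef struct { const int *lo, *hi; } PtrRange;

/* Record one load or store at address a. */
static void touch(PtrRange *r, const int *a) {
    if (r->lo == NULL || a < r->lo) r->lo = a;
    if (r->hi == NULL || a > r->hi) r->hi = a;
}

/* Same control flow as merge from the slides, plus access recording. */
static void merge_traced(int *l1, int *m, int *h2, int *d,
                         PtrRange *rd, PtrRange *wr) {
    int *h1 = m, *l2 = m;
    while (l1 < h1 && l2 < h2) {
        touch(rd, l1); touch(rd, l2); touch(wr, d);
        if (*l1 < *l2) *d++ = *l1++;
        else           *d++ = *l2++;
    }
    while (l1 < h1) { touch(rd, l1); touch(wr, d); *d++ = *l1++; }
    while (l2 < h2) { touch(rd, l2); touch(wr, d); *d++ = *l2++; }
}

/* Returns 1 if the observed regions equal [l,h-1] and [d,d+(h-l)-1]. */
static int merge_regions_match(void) {
    int buf[16] = {1, 4, 6, 7, 2, 3, 5, 8};   /* two sorted runs of four */
    int *l = buf, *m = buf + 4, *h = buf + 8, *d = buf + 8;
    PtrRange rd = {NULL, NULL}, wr = {NULL, NULL};
    merge_traced(l, m, h, d, &rd, &wr);
    return rd.lo == l && rd.hi == h - 1 &&
           wr.lo == d && wr.hi == d + (h - l) - 1;
}
```

A single passing input proves nothing in general, which is why the compiler needs a symbolic argument that holds for every l, m, h, and d.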


Issues
• Heavy Use of Pointers
  • Pointers into Middle of Arrays
  • Pointer Arithmetic
  • Pointer Comparison
• Multiple Procedures
  • sort(int *d, int *t, int n)
  • insertionSort(int *l, int *h)
  • merge(int *l, int *m, int *h, int *t)
• Recursion

How the Compiler Does It

Compiler Structure
• Pointer Analysis: disambiguate references at granularity of allocation blocks
• Bounds Analysis: symbolic upper and lower bounds for each memory access in each procedure
• Region Analysis: symbolic regions accessed by execution of each procedure
• Parallelization: independent procedure calls that can execute in parallel

Example - Array Increment

void f(char *p, int n) {
    if (n > CUTOFF) {
        f(p, n/2);       /* increment first half */
        f(p+n/2, n/2);   /* increment second half */
    } else {
        /* base case: increment small array */
        int i = 0;
        while (i < n) { *(p+i) += 1; i++; }
    }
}

Intra-procedural Bounds Analysis
• For each integer variable at each program point, derive lower and upper bounds
• Bounds are symbolic expressions
  • variables represent initial values of parameters of enclosing procedure
  • bounds are linear combinations of variables
• Example expression for f(p,n): p+n-1

Bounds Analysis

What are the upper and lower bounds for the region accessed by the while loop in the base case?

    int i = 0;
    while (i < n) { *(p+i) += 1; i++; }

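Bounds Analysis, Step 1
Build control flow graph

Block 1: i = 0
Block 2: i < n
Block 3: *(p+i) += 1; i = i+1
(Block 1 flows to Block 2; Block 2 enters Block 3 when i < n; Block 3 loops back to Block 2.)

Bounds Analysis, Step 2
Set up bounds at beginning of basic blocks

Block 1: $l_1 \le i \le u_1$
Block 2: $l_2 \le i \le u_2$
Block 3: $l_3 \le i \le u_3$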

Bounds Analysis, Step 3
Compute transfer functions

Block 1: $l_1 \le i \le u_1$ on entry; $0 \le i \le 0$ after i = 0
Block 2: $l_2 \le i \le u_2$ on entry; $l_2 \le i \le n-1$ on the true branch of i < n; $l_2 \le i \le u_2$ on the false branch
Block 3: $l_3 \le i \le u_3$ on entry and at *(p+i) += 1; $l_3+1 \le i \le u_3+1$ after i = i+1

Bounds Analysis, Step 4
Set up constraints for bounds

At procedure entry i is unconstrained: $-\infty \le i \le +\infty$. The bounds at the start of each block must cover the bounds that flow in along every incoming edge:

Lower bounds: $l_2 \le 0$, $l_2 \le l_3+1$, $l_3 \le l_2$
Upper bounds: $0 \le u_2$, $u_3+1 \le u_2$, $n-1 \le u_3$

Bounds Analysis, Step 5
Generate symbolic expressions for bounds

Goal: express bounds in terms of parameters

$l_2 = c_1 p + c_2 n + c_3$
$l_3 = c_4 p + c_5 n + c_6$
$u_2 = c_7 p + c_8 n + c_9$
$u_3 = c_{10} p + c_{11} n + c_{12}$

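Bounds Analysis, Step 6
Substitute expressions into constraints

Lower bounds:
$c_1 p + c_2 n + c_3 \le 0$
$c_1 p + c_2 n + c_3 \le c_4 p + c_5 n + c_6 + 1$
$c_4 p + c_5 n + c_6 \le c_1 p + c_2 n + c_3$

Upper bounds:
$0 \le c_7 p + c_8 n + c_9$
$c_{10} p + c_{11} n + c_{12} + 1 \le c_7 p + c_8 n + c_9$
$n - 1 \le c_{10} p + c_{11} n + c_{12}$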

Goal

Solve Symbolic Constraint System: find values for constraint variables c1, ..., c12 that satisfy the inequality constraints
• Maximize Lower Bounds
• Minimize Upper Bounds

Bounds Analysis, Step 7
Reduce symbolic inequalities to linear inequalities

$c_1 p + c_2 n + c_3 \le c_4 p + c_5 n + c_6$ if $c_1 \le c_4$, $c_2 \le c_5$, and $c_3 \le c_6$
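The reduction can be justified in one line, assuming the parameters satisfy $p \ge 0$ and $n \ge 0$ (the positivity condition the slides list under Details). The derivation below is a sketch of that argument, not text from the slides:

```latex
\begin{align*}
(c_4 p + c_5 n + c_6) - (c_1 p + c_2 n + c_3)
  &= (c_4 - c_1)\,p + (c_5 - c_2)\,n + (c_6 - c_3) \\
  &\ge 0 \quad \text{when } c_1 \le c_4,\ c_2 \le c_5,\ c_3 \le c_6,\ p \ge 0,\ n \ge 0 .
\end{align*}
```

Each term of the difference is a product of two non-negative quantities, so the symbolic inequality holds for every admissible p and n once the three coefficient inequalities hold.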

Bounds Analysis, Step 7
Apply reduction and generate a linear program

Lower bounds:
$c_1 \le 0$, $c_2 \le 0$, $c_3 \le 0$
$c_1 \le c_4$, $c_2 \le c_5$, $c_3 \le c_6 + 1$
$c_4 \le c_1$, $c_5 \le c_2$, $c_6 \le c_3$

Upper bounds:
$0 \le c_7$, $0 \le c_8$, $0 \le c_9$
$c_{10} \le c_7$, $c_{11} \le c_8$, $c_{12} + 1 \le c_9$
$0 \le c_{10}$, $1 \le c_{11}$, $-1 \le c_{12}$

Objective Function: max: $(c_1 + \cdots + c_6) - (c_7 + \cdots + c_{12})$

• This is a linear program (LP), not an integer linear program (ILP)
• The coefficients in the symbolic expressions are rational numbers
• Rational coefficients are needed for expressions like the middle of an array: low+(high - low)/2

Bounds Analysis, Step 8
Solve linear program to extract bounds

c1=0 c2=0 c3=0 c4=0 c5=0 c6=0 c7=0 c8=1 c9=0 c10=0 c11=1 c12=-1

$l_2 = 0$, $l_3 = 0$, $u_2 = n$, $u_3 = n-1$

Resulting bounds for i:
Block 1: $-\infty \le i \le +\infty$ on entry; $0 \le i \le 0$ after i = 0
Block 2: $0 \le i \le n$ on entry; $0 \le i \le n-1$ on the true branch of i < n; $0 \le i \le n$ on the false branch
Block 3: $0 \le i \le n-1$ on entry and at *(p+i) += 1; $1 \le i \le n$ after i = i+1

The store *(p+i) therefore accesses addresses in [p, p+n-1].
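The solution can be checked mechanically against the Step 4 constraints using the coefficient-wise comparison from Step 7. The sketch below is a hypothetical checker, not part of the compiler described in the slides: Lin, lin_le, lin_add, and solution_satisfies_constraints are illustrative names, and the coefficients are stored as int only because this particular solution is integral (the slides use rationals in general).

```c
#include <assert.h>

/* A symbolic bound cp*p + cn*n + c0 over the parameters p and n of f. */
typedef struct { int cp, cn, c0; } Lin;

/* Sufficient test for a <= b when p >= 0 and n >= 0:
   compare the two expressions coefficient by coefficient. */
static int lin_le(Lin a, Lin b) {
    return a.cp <= b.cp && a.cn <= b.cn && a.c0 <= b.c0;
}

/* Effect of "i = i+1" on a bound: shift it by a constant. */
static Lin lin_add(Lin a, int k) {
    a.c0 += k;
    return a;
}

/* Returns 1 if l2 = 0, l3 = 0, u2 = n, u3 = n-1 satisfy all six constraints. */
static int solution_satisfies_constraints(void) {
    const Lin zero      = {0, 0, 0};
    const Lin n_minus_1 = {0, 1, -1};
    const Lin l2 = {0, 0, 0},  l3 = {0, 0, 0};    /* c1..c3, c4..c6   */
    const Lin u2 = {0, 1, 0},  u3 = {0, 1, -1};   /* c7..c9, c10..c12 */
    return lin_le(l2, zero)                 /* l2 <= 0      */
        && lin_le(l2, lin_add(l3, 1))       /* l2 <= l3 + 1 */
        && lin_le(l3, l2)                   /* l3 <= l2     */
        && lin_le(zero, u2)                 /* 0 <= u2      */
        && lin_le(lin_add(u3, 1), u2)       /* u3 + 1 <= u2 */
        && lin_le(n_minus_1, u3);           /* n - 1 <= u3  */
}
```

The checker only confirms feasibility; that the bounds are also the tightest ones comes from the objective function of the linear program.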

Region Analysis
Goal: Compute Accessed Regions of Memory

• Intra-Procedural
  • Use bounds at each load or store
  • Compute accessed region
• Inter-Procedural
  • Use intra-procedural results
  • Set up another symbolic constraint system
  • Solve to find regions accessed by entire execution of the procedure

Basic Principle of Inter-Procedural Region Analysis

• For each procedure
  • Generate symbolic expressions for upper and lower bounds of accessed regions
• Constraint System
  • Accessed regions include regions accessed by statements in procedure
  • Accessed regions include regions accessed by invoked procedures

Inter-Procedural Constraints in Example

f(p,n) accesses [ l(f,p,n), u(f,p,n) ]

void f(char *p, int n) {
    if (n > CUTOFF) {
        f(p, n/2);
        f(p+n/2, n/2);
    } else {
        int i = 0;
        while (i < n) { *(p+i) += 1; i++; }
    }
}

From the recursive calls:
$l(f,p,n) \le l(f,p,n/2)$, $u(f,p,n) \ge u(f,p,n/2)$
$l(f,p,n) \le l(f,p+n/2,n/2)$, $u(f,p,n) \ge u(f,p+n/2,n/2)$

From the base case:
$l(f,p,n) \le p$, $u(f,p,n) \ge p+n-1$

Derive Constraint System
• Generate symbolic expressions
  $l(f,p,n) = C_1 p + C_2 n + C_3$
  $u(f,p,n) = C_4 p + C_5 n + C_6$
• Build constraint system
  $C_1 p + C_2 n + C_3 \le p$
  $C_4 p + C_5 n + C_6 \ge p + n - 1$
  $C_1 p + C_2 n + C_3 \le C_1 p + C_2 (n/2) + C_3$
  $C_4 p + C_5 n + C_6 \ge C_4 p + C_5 (n/2) + C_6$
  $C_1 p + C_2 n + C_3 \le C_1 (p+n/2) + C_2 (n/2) + C_3$
  $C_4 p + C_5 n + C_6 \ge C_4 (p+n/2) + C_5 (n/2) + C_6$

Solve Constraint System
• Simplify Constraint System
  $C_1 p + C_2 n + C_3 \le p$
  $C_4 p + C_5 n + C_6 \ge p + n - 1$
  $C_2 n \le C_2 (n/2)$
  $C_5 n \ge C_5 (n/2)$
  $C_2 (n/2) \le C_1 (n/2)$
  $C_5 (n/2) \ge C_4 (n/2)$
• Generate and Solve Linear Program
  $l(f,p,n) = p$
  $u(f,p,n) = p+n-1$
• Access region: [p, p+n-1]
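The computed region [p, p+n-1] can be compared against what f actually touches at run time. The sketch below instruments f with a hypothetical Trace record and a hypothetical CUTOFF of 4 (the slides give no value); f_traced and trace_f are illustrative names. It works with offsets from the base pointer rather than raw addresses.

```c
#include <assert.h>

#define CUTOFF 4   /* hypothetical threshold; not specified in the slides */

/* Smallest and largest offset touched; any == 0 means nothing touched yet. */
typedef struct { long lo, hi; int any; } Trace;

/* f from the slides, operating on base[off .. off+n-1] and recording accesses. */
static void f_traced(char *base, long off, int n, Trace *t) {
    if (n > CUTOFF) {
        f_traced(base, off, n / 2, t);            /* first half  */
        f_traced(base, off + n / 2, n / 2, t);    /* second half */
    } else {
        int i = 0;
        while (i < n) {
            long a = off + i;
            if (!t->any || a < t->lo) t->lo = a;
            if (!t->any || a > t->hi) t->hi = a;
            t->any = 1;
            base[a] += 1;
            i++;
        }
    }
}

/* Runs f on an n-element array (1 <= n <= 64) and returns the observed range. */
static Trace trace_f(int n) {
    char buf[64] = {0};
    Trace t = {0, 0, 0};
    assert(n >= 1 && n <= 64);
    f_traced(buf, 0, n, &t);
    return t;
}
```

For n = 64 the observed range is exactly offsets 0 through 63. For odd sizes the two calls with n/2 skip trailing elements, so the observed range is a subset of [p, p+n-1]; the symbolic region is a safe over-approximation rather than an exact footprint.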

Parallelization

• Dependence Testing of Two Calls
  • Do accessed regions intersect?
  • Based on comparing upper and lower bounds of accessed regions
• Parallelization
  • Find sequences of independent calls
  • Execute independent calls in parallel
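The dependence test amounts to an interval intersection check. The sketch below evaluates it on concrete offsets for the two recursive calls of f, whereas the compiler in the slides compares symbolic bounds; Region, call_region, regions_intersect, and calls_independent are hypothetical names introduced for illustration.

```c
#include <assert.h>

/* Closed interval [lo, hi] of element offsets from a common base pointer. */
typedef struct { long lo, hi; } Region;

/* Region [off, off+len-1] accessed by a call such as f(p+off, len). */
static Region call_region(long off, long len) {
    Region r = { off, off + len - 1 };
    return r;
}

/* Two closed intervals intersect iff each starts no later than the other ends. */
static int regions_intersect(Region a, Region b) {
    return a.lo <= b.hi && b.lo <= a.hi;
}

/* f(p, n/2) and f(p+n/2, n/2) are independent when their regions are disjoint. */
static int calls_independent(long n) {
    Region first  = call_region(0, n / 2);
    Region second = call_region(n / 2, n / 2);
    return !regions_intersect(first, second);
}
```

For n = 10 the regions are [0,4] and [5,9], which do not intersect, so the two calls may run in parallel.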

Details
• Inter-procedural positivity analysis
  • Verify that variables are positive
  • Required for correctness of reduction
• Correlation analysis
• Integer division
  • Basic idea: $(n-1)/2 \le \lfloor n/2 \rfloor \le n/2$
  • Generalized: $(n-m+1)/m \le \lfloor n/m \rfloor \le n/m$
• Linear system decomposition
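The integer division bounds can be sanity-checked by brute force. Multiplying the generalized inequality by m > 0 gives n-m+1 <= m*floor(n/m) <= n, which the sketch below tests for non-negative n; the function names are hypothetical, and the check relies on C integer division agreeing with floor for non-negative operands.

```c
#include <assert.h>

/* Checks n-m+1 <= m*(n/m) <= n, the slide's bounds multiplied through by m. */
static int floor_div_bounds_hold(int n, int m) {
    assert(n >= 0 && m >= 1);
    int q = n / m;   /* truncating division; equals floor(n/m) for n >= 0 */
    return n - m + 1 <= m * q && m * q <= n;
}

/* Returns 1 if the bounds hold for all 0 <= n <= max_n and 1 <= m <= max_m. */
static int all_hold(int max_n, int max_m) {
    for (int n = 0; n <= max_n; n++)
        for (int m = 1; m <= max_m; m++)
            if (!floor_div_bounds_hold(n, m)) return 0;
    return 1;
}
```

These linear bounds are what let the analysis replace a non-linear term such as n/4 by rational linear expressions in n.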

Comparison to Dataflow Analysis

• Dataflow analysis:
  • Uses iterative algorithms
  • Cannot handle lattices with infinite ascending chains, because termination is not guaranteed
• Our framework
  • Reduces the analysis to a linear program
  • Works for lattices with infinite ascending chains like integers, rational numbers or polynomials
  • No possibility of non-termination
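The ascending-chain problem can be seen on the loop from the array increment example. The toy iteration below is an assumption-laden sketch, not an algorithm from the slides and not a model of dataflow frameworks that use widening: it fixes a concrete n, iterates the upper bound of i at the loop test, and counts how many updates are needed before the bound stabilizes.

```c
#include <assert.h>

static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

/* Counts updates of the upper bound of i at the test "i < n" until a fixpoint. */
static int updates_until_fixpoint(int n) {
    int u2 = 0;        /* upper bound at the loop test after i = 0 */
    int updates = 0;
    for (;;) {
        int u3 = min_int(u2, n - 1);       /* true branch of i < n         */
        int next = max_int(u2, u3 + 1);    /* join with the loop back edge */
        if (next == u2) break;             /* bound is stable              */
        u2 = next;
        updates++;
    }
    return updates;
}
```

The bound climbs 0, 1, 2, ... and needs n updates to reach n. When n is a symbolic parameter there is no concrete value at which to stop, which is the situation the linear programming formulation avoids.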

Uses of Symbolic Bounds Information

• Transformations
  • Automatic Parallelization Of Sequential Programs
  • Bounds Checks Elimination For Safe Programs
• Verifications
  • Data Race Detection For Parallel Programs
  • Array Bounds Checking For Unsafe Programs

Application of Analysis Framework

• Bitwidth Analysis:
  • Computes minimum number of bits to represent computed values
  • Important for hardware synthesis from high level languages
• For our framework:
  • Bitwidth analysis is a special case: compute precise numeric bounds
  • Constraint system = linear program
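Once numeric bounds are known, the bitwidth follows directly. The sketch below assumes the simplest case, an unsigned value proven to lie in [0, hi]; bits_needed is a hypothetical helper, and signed ranges or non-zero lower bounds would need a different encoding.

```c
#include <assert.h>

/* Minimum number of bits that can represent every unsigned value in [0, hi]. */
static int bits_needed(unsigned long hi) {
    int bits = 1;              /* even the range [0, 0] is given one bit here */
    while (hi >>= 1) bits++;   /* one more bit per remaining binary digit     */
    return bits;
}
```

For example, a loop counter proven to stay within [0, 255] needs 8 bits instead of a full 32-bit register, which is the kind of saving that matters for hardware synthesis.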

Experimental Results

• Implementation - SUIF, lp_solve, Cilk
• Parallelization speedups:

                 Number of Processors
Application      1      2      4      6      8
Fibonacci        0.76   1.52   3.03   4.55   6.04
Quicksort        1.00   1.99   3.89   5.68   7.36
Mergesort        1.00   2.00   3.90   5.70   7.41
Heat             1.03   2.02   3.89   5.53   6.83
BlockMul         0.97   1.86   3.84   5.70   7.54
NoTempMul        1.02   2.01   4.03   6.02   8.02
LU               0.98   1.95   3.89   5.66   7.39

  • Close to linear speedups
  • Most of parallelism detected
• Compiler also verified that:
  • Parallel versions were free of data races
  • Benchmarks do not violate the array bounds

Experimental Results

• Implementation - SUIF, lp_solve
• Bitwidth reduction:

[Figure: bar chart on a 0 to 100 scale showing the percentage of eliminated register bits and the percentage of eliminated memory bits]

Context

• Mainstream parallelizing compilers
  • Loop nests, dense matrices
  • Affine access functions
• Our framework focuses on:
  • Recursion, dynamically allocated arrays
  • Pointers, pointer arithmetic
  • Key problems: pointer analysis, symbolic region analysis, solving linear programs

Conclusion

• Novel framework for symbolic bounds analysis
  • Uses symbolic constraint systems
  • Reduces problem to linear programs
  • More powerful than iterative approaches
• Analysis uses:
  • Parallelization, data race detection
  • Detecting array bounds violations
  • Array bounds check elimination
  • Bitwidth analysis