Jaewook Shin, Priyadarshini Malusare and Paul D. Hovland, Mathematics and Computer Science Division, Argonne National Laboratory. Design and Implementation of a Context-Sensitive, Flow-Sensitive Activity Analysis Algorithm for Automatic Differentiation. The 5th International Conference on Automatic Differentiation, Bonn, Germany, August 15, 2008.


Jaewook Shin, Priyadarshini Malusare and Paul D. Hovland

Mathematics and Computer Science Division, Argonne National Laboratory

Design and Implementation of a Context-Sensitive, Flow-Sensitive Activity Analysis Algorithm for Automatic Differentiation

The 5th International Conference on Automatic Differentiation, Bonn, Germany, August 15, 2008

2

Outline

1. Activity Analysis

2. Previous work: CSFI

3. New CSFS Algorithm

4. Experimental Results

5. Conclusion

3

Activity Analysis

AD is applied to a function with a set of input variables and a set of output variables.

Sometimes, we are interested in the derivatives

– of a subset of the output variables (the dependent variables)

– with respect to a subset of the input variables (the independent variables)

An intermediate variable is

– varied if it is transitively dependent on any independent variable

– useful if any dependent variable is transitively dependent on it

– active if it is both varied and useful.

Partial derivatives need to be computed only for active variables.

Activity analysis is nonseparable.
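The three properties above can be written as reachability conditions. This is only a notational sketch: the symbols I (the set of independent variables), D (the set of dependent variables), and the relation u →* v (v is transitively dependent on u) are introduced here for illustration and do not appear on the slides.

```latex
\begin{aligned}
\mathrm{varied}(v) &\iff \exists\, x \in I : x \to^{*} v \\
\mathrm{useful}(v) &\iff \exists\, y \in D : v \to^{*} y \\
\mathrm{active}(v) &\iff \mathrm{varied}(v) \wedge \mathrm{useful}(v)
\end{aligned}
```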

4

Activity Analysis: Given f, compute dy1/dx1

Original function:

f(x1, x2, x3, y1, y2, y3){
  t1 = sin(x1)
  y1 = t1*2
  t2 = cos(x2)
  y2 = t2*3
  y3 = x3*t1
}

With derivative statements:

f(x1, x2, x3, y1, y2, y3){
  t1 = sin(x1)
  dt1/dx1 = cos(x1)
  y1 = t1*2
  dy1/dt1 = 2
  t2 = cos(x2)
  y2 = t2*3
  y3 = x3*t1
  dy1/dx1 = (dy1/dt1)*(dt1/dx1)
}

5

Activity Analysis: Given f, compute dy1/dx1

Original function:

f(x1, x2, x3, y1, y2, y3){
  t1 = sin(x1)
  y1 = t1*2
  t2 = cos(x2)
  y2 = t2*3
  y3 = x3*t1
}

With derivative statements:

f(x1, x2, x3, y1, y2, y3){
  t1 = sin(x1)
  dt1/dx1 = cos(x1)
  y1 = t1*2
  dy1/dt1 = 2
  t2 = cos(x2)
  dt2/dx2 = -sin(x2)
  y2 = t2*3
  dy2/dt2 = 3
  y3 = x3*t1
  dy3/dx3 = t1
  dy3/dt1 = x3
  dy1/dx1 = (dy1/dt1)*(dt1/dx1)
  dy2/dx2 = (dt2/dx2)*(dy2/dt2)
  dy3/dx1 = (dt1/dx1)*(dy3/dt1)
}
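To make the point of these two slides concrete, the sketch below evaluates only the derivative statements needed for dy1/dx1 and compares the result with a central finite difference. The Python names `f` and `dy1_dx1`, the step size, and the test point are assumptions introduced for illustration; they are not part of the talk.

```python
import math

def f(x1, x2, x3):
    """The example function from the slides, returning (y1, y2, y3)."""
    t1 = math.sin(x1)
    y1 = t1 * 2
    t2 = math.cos(x2)
    y2 = t2 * 3
    y3 = x3 * t1
    return y1, y2, y3

def dy1_dx1(x1):
    """Only the derivative statements that are needed for dy1/dx1."""
    dt1_dx1 = math.cos(x1)    # from t1 = sin(x1)
    dy1_dt1 = 2.0             # from y1 = t1*2
    return dy1_dt1 * dt1_dx1  # chain rule

# Hypothetical test point and step size; any reasonable values would do.
x1, x2, x3 = 0.3, 1.1, -2.0
h = 1e-6
finite_diff = (f(x1 + h, x2, x3)[0] - f(x1 - h, x2, x3)[0]) / (2 * h)
analytic = dy1_dx1(x1)
print(analytic, finite_diff)
```

The derivative statements for t2, y2, and y3 never appear in `dy1_dx1`, which is the saving that activity analysis is meant to expose.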

6

Activity Analysis: varied + useful = active

f(x1, x2, x3, y1, y2, y3){
  t1 = sin(x1);
  y1 = t1*2;
  t2 = cos(x2);
  y2 = t2*3;
  y3 = x3*t1;
}

[Figure: the listing above shown four times with highlighted variables; panel labels: varied, useful, varied and useful, active]

7

Previous work: Context-Sensitive, Flow-Insensitive Activity Analysis (VDGAA)

Graph reachability problem

Variable Dependence Graph (VDG)

Two separate (color) propagations:

– Forward (coloring red) for “varied” variables

– Backward (coloring yellow) for “useful” variables

Context sensitivity is supported by a stack of contexts.

Run time

– Very fast in practice

Small overestimations of active variables due to

– the way programs are usually written

– the property of activity analysis

8

VDGAA on the run: 1. Build a VDG

f(x1, x2, x3, y1, y2, y3){
  t1 = sin(x1);
  y1 = t1*2;
  t2 = cos(x2);
  y2 = t2*3;
  y3 = x3*t1;
}

[Figure: Variable Dependence Graph (VDG) with nodes x1, x2, x3, t1, t2, y1, y2, y3]

9

VDGAA on the run: 2. Find ‘varied’ variables

[Figure: forward propagation over the VDG of the function f from the previous slide (nodes x1, x2, x3, t1, t2, y1, y2, y3)]

10

VDGAA on the run: 3. Find ‘useful’ variables

[Figure: backward propagation over the VDG of the same function f (nodes x1, x2, x3, t1, t2, y1, y2, y3)]
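The three VDGAA steps can be sketched as two graph reachability passes. This is a minimal illustration rather than the OpenAnalysis implementation: the edge list is read off the assignments in f, the helper `reachable` is a hypothetical name, and the choice of x1 as the independent variable and y1 as the dependent variable is an assumption taken from the earlier "compute dy1/dx1" slides.

```python
from collections import defaultdict

# An edge (u, v) means "v depends directly on u", read off the body of f.
edges = [
    ("x1", "t1"),  # t1 = sin(x1)
    ("t1", "y1"),  # y1 = t1*2
    ("x2", "t2"),  # t2 = cos(x2)
    ("t2", "y2"),  # y2 = t2*3
    ("x3", "y3"),  # y3 = x3*t1
    ("t1", "y3"),
]

def reachable(starts, edge_list):
    """Return every node reachable from `starts` (inclusive) along edge_list."""
    succ = defaultdict(list)
    for u, v in edge_list:
        succ[u].append(v)
    seen, stack = set(starts), list(starts)
    while stack:
        node = stack.pop()
        for nxt in succ[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

independent = {"x1"}  # assumption: the derivative of interest is dy1/dx1
dependent = {"y1"}

varied = reachable(independent, edges)                     # forward pass
useful = reachable(dependent, [(v, u) for u, v in edges])  # backward pass
active = varied & useful
print(sorted(varied), sorted(useful), sorted(active))
```

Under these assumptions y3 is varied but not useful, and t2 and y2 are neither, so only x1, t1, and y1 end up active.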

11

VDGAA on the run: Context Sensitivity

foo(a, b){
  b = a;
}

call foo(x, p1);
call foo(p2, y);

[Figure: dependence graph with nodes x, p1, p2, y, a, b; forward propagation]

12

VDGAA on the run: Context Sensitivity

[Figure: the same graph for foo and its two call sites, call foo(x, p1) and call foo(p2, y), with nodes x, p1, p2, y, a, b; backward propagation]
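One way to picture the stack of contexts is to label call and return edges with their call site and to follow a return edge only when it matches the most recent call. The sketch below is an assumption about how such matching could look, not the algorithm from the talk; the edge encoding, the site numbers 1 and 2, and the function `propagate` are all hypothetical. It does not handle recursion, where the stack could grow without bound.

```python
# Edges are (src, dst, kind, site). Site 1 is "call foo(x, p1)",
# site 2 is "call foo(p2, y)". Both the encoding and the numbering are assumed.
edges = [
    ("x", "a", "call", 1),
    ("p2", "a", "call", 2),
    ("a", "b", "local", None),   # b = a inside foo
    ("b", "p1", "return", 1),
    ("b", "y", "return", 2),
]

def propagate(starts, edges, context_sensitive=True):
    """Forward reachability that optionally matches returns to their calls."""
    seen = set()
    work = [(s, ()) for s in starts]          # (node, stack of call sites)
    while work:
        node, ctx = work.pop()
        if (node, ctx) in seen:
            continue
        seen.add((node, ctx))
        for src, dst, kind, site in edges:
            if src != node:
                continue
            if not context_sensitive or kind == "local":
                work.append((dst, ctx))
            elif kind == "call":
                work.append((dst, ctx + (site,)))   # push the call site
            elif kind == "return":
                if not ctx:
                    work.append((dst, ()))          # unknown caller: allow
                elif ctx[-1] == site:
                    work.append((dst, ctx[:-1]))    # pop the matching site
    return {node for node, _ in seen}

sensitive = propagate(["x"], edges)
insensitive = propagate(["x"], edges, context_sensitive=False)
print(sorted(sensitive), sorted(insensitive))
```

Starting from x, the context-sensitive run reaches p1 but not y, while the context-insensitive run also reaches y through the return edge of the other call site.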

13

VDGAA: Flow Insensitivity

f(x1, x2, x3, y1, y2, y3){
  t1 = sin(x1);
  y1 = t1*2;
  y3 = x3*t1;
  t1 = cos(x2);
  y2 = t1*3;
}

[Figure: VDG with nodes x1, x2, x3, t1, y1, y2, y3; forward propagation]

14

VDGAA: Flow Insensitivity

[Figure: the same VDG for the version of f that reuses t1 (nodes x1, x2, x3, t1, y1, y2, y3); backward propagation]

15

Context-Sensitive, Flow-Sensitive Activity Analysis (DUGAA)

Graph reachability problem

Two separate (color) propagations:

– Forward for “varied” variables

– Backward for “useful” variables

Context sensitivity is supported by a stack of contexts.

Definition-Use Graph (DUG)

Flow sensitivity is supported by the use of reaching definitions for nodes.

Algorithm:

1. UD-DU chains

2. Build a def-use graph

3. Forward propagation

4. Backward propagation
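For straight-line code, the first two steps can be sketched by tracking the single definition that reaches each use. The statement encoding, the function `build_dug`, and the `@I`/`@O` naming for input and output nodes are assumptions made for this illustration; real code with branches and loops needs a full reaching-definitions analysis, which this sketch does not attempt.

```python
# (line, defined variable, used variables) for a body that reuses t1.
# The line numbers are chosen for illustration.
statements = [
    (2, "t1", ["x1"]),        # t1 = sin(x1)
    (3, "y1", ["t1"]),        # y1 = t1*2
    (4, "y3", ["x3", "t1"]),  # y3 = x3*t1
    (5, "t1", ["x2"]),        # t1 = cos(x2)
    (6, "y2", ["t1"]),        # y2 = t1*3
]
inputs = ["x1", "x2", "x3"]
outputs = ["y1", "y2", "y3"]

def build_dug(statements, inputs, outputs):
    """Def-use edges for straight-line code: each use links to the one
    definition that reaches it."""
    reaching = {v: f"{v}@I" for v in inputs}  # definition reaching each variable
    edges = []
    for line, target, uses in statements:
        node = f"{target}@{line}"
        for var in uses:
            edges.append((reaching[var], node))
        reaching[target] = node               # kills the earlier definition
    for var in outputs:
        edges.append((reaching[var], f"{var}@O"))
    return edges

dug_edges = build_dug(statements, inputs, outputs)
for edge in dug_edges:
    print(edge)
```

Because the two definitions of t1 become separate nodes, t1@2 feeds y1@3 and y3@4 while t1@5 feeds only y2@6.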

16

DUGAA on the run: 1. build a DUG

1: f(x1, x2, x3, y1, y2, y3){
2:   t1 = sin(x1);
3:   y1 = t1*2;
4:   y3 = x3*t1;
5:   t1 = cos(x2);
6:   y2 = t1*3;
7: }

[Figure: Def-Use Graph (DUG) with nodes x1@I, x2@I, x3@I, t1@2, y1@3, y3@4, t1@5, y2@6, y1@O, y2@O, y3@O]

17

DUGAA on the run: 2. Forward propagation

[Figure: forward propagation over the DUG of the numbered listing of f from the previous slide (nodes x1@I, x2@I, x3@I, t1@2, y1@3, y3@4, t1@5, y2@6, y1@O, y2@O, y3@O)]

18

DUGAA on the run: 3. Backward propagation

[Figure: backward propagation over the DUG of the same numbered listing of f (nodes x1@I, x2@I, x3@I, t1@2, y1@3, y3@4, t1@5, y2@6, y1@O, y2@O, y3@O)]
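The effect of flow sensitivity can be seen by running the same two propagations over a VDG and a DUG of the listing that reuses t1. In this sketch the edge lists are read off the code by hand, and the choice of x1 as independent and y2 as dependent is an assumption made to expose the difference; the slides do not state which variables were selected.

```python
def reach(starts, edges):
    """Nodes reachable from `starts` (inclusive) along directed edges."""
    seen, stack = set(starts), list(starts)
    while stack:
        node = stack.pop()
        for u, v in edges:
            if u == node and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def active_set(independent, dependent, edges):
    varied = reach(independent, edges)                     # forward
    useful = reach(dependent, [(v, u) for u, v in edges])  # backward
    return varied & useful

# VDG: one node per variable, so both definitions of t1 share a node.
vdg = [("x1", "t1"), ("t1", "y1"), ("x3", "y3"), ("t1", "y3"),
       ("x2", "t1"), ("t1", "y2")]

# DUG: one node per definition, named variable@line (assumed edge list).
dug = [("x1@I", "t1@2"), ("t1@2", "y1@3"), ("x3@I", "y3@4"),
       ("t1@2", "y3@4"), ("x2@I", "t1@5"), ("t1@5", "y2@6"),
       ("y1@3", "y1@O"), ("y3@4", "y3@O"), ("y2@6", "y2@O")]

vdg_active = active_set({"x1"}, {"y2"}, vdg)
dug_active = active_set({"x1@I"}, {"y2@O"}, dug)
print(sorted(vdg_active), sorted(dug_active))
```

Under these assumptions the VDG reports x1, t1, and y2 as active, although y2 never depends on x1; the DUG reports no active variables because x1 only reaches t1@2 and y2 only uses t1@5.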

19

Implementation

[Figure: tool chain diagram with the components Input (Fortran), Open64 front end, OpenAnalysis (VDGAA, DUGAA), OpenAD: AD transformation, Open64 unparser, Output (Fortran)]

20

Benchmarks

Benchmark   Description                                    Source      # lines
MITgcm      MIT General Circulation Model                  MIT         27376
LU          Lower-upper symmetric Gauss-Seidel             NASPB       5951
CG          Conjugate gradient                             NASPB       2480
newton      Newton’s method + Rosenbrock function          ANL         2189
adiabatic   Adiabatic flow model in chemical engineering   CMU         1009
msa         Minimal surface area problem                   MINPACK-2   461
swirl       Swirling flow problem                          MINPACK-2   355
c2          Ordinary differential equation solver          ANL         64

21

Slowdowns in analysis run time: DUGAA vs. VDGAA

[Bar chart; vertical axis from 0 to 150]

Benchmark   Slowdown
MITgcm      30.89
LU          105.76
CG          66.5
newton      inf
adiabatic   31.67
msa         27
swirl       74
c2          inf

22

Analysis run time

[Bar chart; vertical axis from 0 to 2 seconds]

Benchmark   VDGAA (s)   DUGAA (s)
MITgcm      1.71        52.82
LU          0.17        17.98
CG          0.02        1.33
newton      0.0         1.27
adiabatic   0.03        0.95
msa         0.01        0.27
swirl       0.01        0.74
c2          0.0         0.01

23

Analysis run-time breakdown on MITgcm

[Bar chart; vertical axis from 0 to 100, % of run time]

Phase                VDGAA (%)   DUGAA (%)
UD-DU chains         0           81.26
Graph generation     35.09       1.72
Transitive closure   53.8        16.86
Coloring             11.11       0.66

24

Reduction in active variables

[Bar chart; vertical axis from 0 to 8; bar labels as shown on the slide]

Benchmark   Reduction
MITgcm      8/925
LU          0/311
CG          0/12
newton      0/2
adiabatic   0/141
msa         0/13
swirl       0/4
c2          1/6

25

Conclusion

A new context-sensitive, flow-sensitive (CSFS) activity analysis algorithm: Def-Use Graph Activity Analysis (DUGAA)

Comparison of two activity analyses: DUGAA (CSFS) vs. VDGAA (CSFI)

Slower than VDGAA for all 8 benchmarks

– by a factor > 27

– but takes less than one minute for a code larger than 27k lines

Makes fewer overestimations than VDGAA for two of the eight benchmarks.

May save human effort in managing AD code.

Future work

– Comparison among CIFS, CSFI, and CSFS activity analyses

– Dealing with pointers and recursion

26

PARAM Edges

Edges between formal parameters

Summarize the connectivity among formal parameters

Transitive closure is applied to the dependence matrix of local variables

Checking for connectivity through global variables: A PARAM edge is generated from formal variable node F1 to formal variable node F2 by traversing the entire graph only when

– F1 has a value flow path to a global variable node AND

– F2 has a value flow path from a global variable node.
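The transitive-closure step can be sketched with Warshall's algorithm on a boolean dependence matrix. The procedure below (formals a, b, c and a local t, with body t = a; b = t) is a hypothetical example, and the check for connectivity through global variables described above is not modeled.

```python
# Hypothetical procedure: bar(a, b, c){ t = a; b = t; }
names = ["a", "b", "c", "t"]        # formals a, b, c and local t
formals = ["a", "b", "c"]
direct = {("a", "t"), ("t", "b")}   # (u, v): v depends directly on u

n = len(names)
idx = {name: i for i, name in enumerate(names)}
dep = [[False] * n for _ in range(n)]
for u, v in direct:
    dep[idx[u]][idx[v]] = True

# Warshall's algorithm: dep[i][j] becomes True if j depends on i transitively.
for k in range(n):
    for i in range(n):
        if dep[i][k]:
            for j in range(n):
                if dep[k][j]:
                    dep[i][j] = True

# A PARAM edge summarizes a value-flow path between two distinct formals.
param_edges = [(f1, f2) for f1 in formals for f2 in formals
               if f1 != f2 and dep[idx[f1]][idx[f2]]]
print(param_edges)
```

Here the only summary edge is from a to b, so a caller can propagate through the call without revisiting the body of the procedure.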

27

Context-Insensitive, Flow-Sensitive Activity Analysis (ICFGAA)

Interprocedural Control Flow Graph (ICFG)

Iterative data-flow analysis (DFA)

Two separate DFAs:

– Forward for “varied” variables

– Backward for “useful” variables

Long run time

– Large number of iterations

– Nonseparability of activity analysis: data-flow analysis values depend on other data-flow values

Large overestimation

– Due to context-insensitivity

– Value propagation through unrealizable control paths
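The forward “varied” analysis can be sketched as an iterative fixed-point computation over a control flow graph. The sketch below covers a single procedure only, so it shows flow sensitivity but none of the interprocedural edges of an ICFG; the CFG encoding, the function `varied_analysis`, and the choice of x1 as the independent variable are assumptions made for illustration.

```python
# CFG nodes: line -> (defined variable, used variables, successor lines),
# for a straight-line body that reuses t1.
cfg = {
    2: ("t1", {"x1"}, [3]),        # t1 = sin(x1)
    3: ("y1", {"t1"}, [4]),        # y1 = t1*2
    4: ("y3", {"x3", "t1"}, [5]),  # y3 = x3*t1
    5: ("t1", {"x2"}, [6]),        # t1 = cos(x2)
    6: ("y2", {"t1"}, []),         # y2 = t1*3
}
entry, independent = 2, {"x1"}

def varied_analysis(cfg, entry, independent):
    """Iterate to a fixed point; returns the varied set after each node."""
    preds = {n: [] for n in cfg}
    for n, (_, _, succs) in cfg.items():
        for s in succs:
            preds[s].append(n)
    out = {n: set() for n in cfg}
    work = list(cfg)
    while work:
        n = work.pop(0)
        target, uses, succs = cfg[n]
        incoming = set(independent) if n == entry else set()
        for p in preds[n]:
            incoming |= out[p]
        new_out = incoming - {target}   # the assignment kills the old value
        if uses & incoming:
            new_out.add(target)         # the new value is varied
        if new_out != out[n]:
            out[n] = new_out
            work.extend(succs)          # successors must be revisited
    return out

varied_out = varied_analysis(cfg, entry, independent)
for line in sorted(varied_out):
    print(line, sorted(varied_out[line]))
```

After line 5 the variable t1 is no longer varied, so y2 is not marked varied either; a flow-insensitive analysis cannot make that distinction.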