Low-Memory Tour Reversal in Directed Graphs
Uwe Naumann, Viktor Mosenkis, Elmar Peise
LuFG Informatik 12, RWTH Aachen University, Germany
EuroAD 9, INRIA, Nov. 2009
Contents
◮ Motivation: Derivative Code Compilers (dcc)
◮ Motivating Example
◮ Control Flow Reversal Problem
◮ Tour Reversal Problem
◮ Loop Compression
◮ Offline Algorithm
◮ Online Algorithm
◮ Implementation
◮ Test Results
◮ Conclusion and Outlook
Motivation: Derivative Code Compilers (dcc)
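$F : \mathbb{R}^n \to \mathbb{R}^m$ ⇒ Tokens ⇒ AST ⇒ AAST ⇒ $F'$
tangent-linear:
$\dot{F}(x, \dot{x}) \equiv \nabla F(x) \cdot \dot{x}$, $\dot{x} \in \mathbb{R}^n$
adjoint:
$\bar{F}(x, \bar{y}) \equiv \nabla F(x)^T \cdot \bar{y}$, $\bar{y} \in \mathbb{R}^m$
second-order tangent-linear:
$\ddot{F}(x, \dot{x}, \tilde{x}) \equiv \langle \nabla^2 F(x), \dot{x}, \tilde{x} \rangle$, $\dot{x}, \tilde{x} \in \mathbb{R}^n$
second-order adjoint:
$\dot{\bar{F}}(x, \dot{x}, \bar{y}) \equiv \langle \nabla^2 F(x), \dot{x}, \bar{y} \rangle$, $\dot{x} \in \mathbb{R}^n$, $\bar{y} \in \mathbb{R}^m$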
... and higher-order tangent-linear and adjoint codes.
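As a small worked instance of the first-order definitions, consider the product of two inputs ($n = 2$, $m = 1$). The accents $\dot{\ }$ (tangent-linear) and $\bar{\ }$ (adjoint) are an assumed notation for this sketch.

```latex
\begin{align*}
F(x) &= x_0 \, x_1, &
\nabla F(x) &= \begin{pmatrix} x_1 & x_0 \end{pmatrix} \\
\dot{F}(x, \dot{x}) &= \nabla F(x) \cdot \dot{x}
  = x_1 \, \dot{x}_0 + x_0 \, \dot{x}_1 \\
\bar{F}(x, \bar{y}) &= \nabla F(x)^T \cdot \bar{y}
  = \begin{pmatrix} x_1 \, \bar{y} \\ x_0 \, \bar{y} \end{pmatrix}
\end{align*}
```

One tangent-linear evaluation yields a single directional derivative, so the full gradient takes $n$ of them. One adjoint evaluation with $\bar{y} = 1$ yields the whole gradient.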
dcc
◮ AD by source code transformation for subset of C
◮ started off as teaching tool
◮ higher-order derivative codes by reapplication
◮ joint call tree reversal
◮ additional overloading mode
◮ main component of AC-SAMMM (see outlook)
◮ there are others (Tapenade, TAC, ADIC) ... :-(
Case Study (Perhaps Later ...)
Use dcc to generate
◮ TLC
◮ ADC
◮ SOTLC
◮ SOADC
  ◮ FoR
  ◮ RoF
  ◮ RoR
◮ TOTLC (FoFoR)
◮ TOADC (FoFoR)
for $y = f(x) = \prod_{i=0}^{n-1} x_i$ (Speelpenning).
Motivating Example
$F : \mathbb{R}^n \to \mathbb{R}$, $y = F(x) = \prod_{i=0}^{n-1} x_i$

void F(int n, float *x, float &y) {
  int i = 0;
  while (i < n) {
    if (i == 0)
      y = x[i];
    else
      y = y * x[i];
    i = i + 1;
  }
}

e.g. n = 3

forward:
  y = x[0]; y = y * x[1]; y = y * x[2];
adjoint (y holds the value it had before the matching forward statement):
  xa[2] = y * ya; ya = x[2] * ya;
  xa[1] = y * ya; ya = x[1] * ya;
  xa[0] = ya;
Motivating Example (cont’d)
void Fa(int n, float *x, float *xa, float &y, float ya) {
  // forward sweep: record control flow and overwritten values
  int i = 0;
  while (i < n) {
    if (i == 0) {
      push(cs, 1);
      push(fds, y);
      y = x[i];
    } else {
      push(cs, 2);
      push(fds, y);
      y = y * x[i];
    }
    push(cs, 3);
    push(ids, i);
    i = i + 1;
  }
  // reverse sweep: replay the basic blocks in reverse order
  int bb;
  while (pop(cs, bb))
    switch (bb) {
      case 1: { y = pop(fds); xa[i] = ya; break; }
      case 2: { y = pop(fds); xa[i] = y * ya; ya = x[i] * ya; break; }
      case 3: { i = pop(ids); break; }
    }
}
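The slide code leaves the stacks `cs`, `fds` and `ids` and their `push`/`pop` routines undefined. A minimal self-contained sketch, assuming `std::vector` as the stack type and a hypothetical function name `speelpenning_adjoint` (neither is taken from dcc), could look as follows. Unlike the slide code it increments `xa` and resets `ya` after the first assignment, which does not change the result here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: adjoint of y = x[0] * x[1] * ... * x[n-1].
// Returns xa = ya * dy/dx. The three vectors play the roles of the
// control stack (cs), float data stack (fds) and integer data stack (ids).
std::vector<double> speelpenning_adjoint(const std::vector<double>& x, double ya) {
  assert(!x.empty());
  std::vector<int> cs;
  std::vector<double> fds;
  std::vector<std::size_t> ids;
  double y = 0.0;
  std::size_t i = 0;
  // Forward sweep: record basic block ids and overwritten values.
  while (i < x.size()) {
    if (i == 0) { cs.push_back(1); fds.push_back(y); y = x[i]; }
    else        { cs.push_back(2); fds.push_back(y); y = y * x[i]; }
    cs.push_back(3); ids.push_back(i); i = i + 1;
  }
  // Reverse sweep: pop the control stack and run the adjoint basic blocks.
  std::vector<double> xa(x.size(), 0.0);
  while (!cs.empty()) {
    const int bb = cs.back();
    cs.pop_back();
    switch (bb) {
      case 1: y = fds.back(); fds.pop_back(); xa[i] += ya; ya = 0.0; break;
      case 2: y = fds.back(); fds.pop_back(); xa[i] += y * ya; ya = x[i] * ya; break;
      case 3: i = ids.back(); ids.pop_back(); break;
    }
  }
  return xa;
}
```

For `x = {2, 3, 4}` and `ya = 1` the control stack holds 2n = 6 entries after the forward sweep, and the returned gradient is `{12, 8, 6}`.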
Motivating Example (cont’d)
int i = 0;
while (i < n) {
  if (i == 0) {
    push(cs, 1);
    push(fds, y);
    y = x[i];
  } else {
    push(cs, 2);
    push(fds, y);
    y = y * x[i];
  }
  push(cs, 3);
  push(ids, i);
  i = i + 1;
}

control stack (cs)
  1, 3, 2, 3, 2, 3, . . . , 2, 3
float data stack (fds)
  y0, y1, y2, y3, y4, . . .
integer data stack (ids)
  0, 1, 2, 3, 4, . . .
  or 0, 1, 1, 1, 1, . . . if increments are stored
Control Flow Reversal (CFR)
Given: $y = F(x)$ as conditional SAC:
  for ($j = n$; $j < n + p + m$; $j$++)
    if ($v_{j-1} > 0$) $v_j = \varphi_{j-n}(v_i)_{i \prec j}$

Example:
  $v_{-1} = x_0$; $v_0 = x_1$
  if ($v_0 > 0$) $v_1 = v_{-1} \cdot v_0$ else ...
  if ($v_1 > 0$) $v_2 = \sin(v_1)$ else ...
  if ($v_2 > 0$) $v_3 = v_{-1} \cdot v_2$ else ...
  if ($v_3 > 0$) $v_4 = v_3 / v_0$ else ...
  if ($v_4 > 0$) $v_5 = \cos(v_3)$ else ...
  $x_0 = v_4$; $x_1 = v_5$

[Figure: DAG $G = (V, E)$ with vertices −1, 0, 1, 2, 3, 4, 5]

Wanted: predecessors of all vertices in reverse order, requiring evaluation of conditions in reverse order, i.e. the values of $v_4, v_3, v_2, v_1, v_0$
CFR is NP-complete
... by reduction from DAG Reversal
... from Fixed Cost DAG Reversal
◮ unit cost for storage and recomputation
◮ fixed total cost |V|
◮ minimize storage
... from Vertex Cover, enabling recomputation at unit cost from stored arguments.
[Figure: DAG with vertices −1, 0, 1, 2, 3, 4, 5]
◮ U. Naumann: DAG Reversal is NP-complete, Journal of Discrete Algorithms, Volume 7, Issue 4, December 2009, Pages 402-410.
Tour Reversal
1:  ...
    for (i = 0; i < 2*n; i++) {
2:    if (i % 2) { ... }
3:    else { ... }
    }
4:  ...

[Figure: control flow graph with basic blocks 1, 2, 3, 4; branch edges flagged 0 and 1]

... by enumeration of basic blocks
  1, 2, 3, . . . , 2, 3, 4
... by counting loop traversals and flagging branches
  0, 1, . . . , 0, 1, 2n (reducible control flow only!)

Tour Reversal Problem
Use minimal memory to reverse a tour $(i_1, i_2, \ldots, i_l)$ in a directed graph $G = (V, E)$.
Loop Compression
Enumerate Basic Blocks
◮ uncompressed tour: 1, 2, 3, . . . , 2, 3, 4 (2n + 2 integers)
◮ compressed tour: 1, 2, 3, [n, 2], 4 (6 integers)
Count loops and flag branches
◮ uncompressed tour: 0, 1, 0, 1, 0, 1, . . . , 0, 1, 2n (2n + 1 integers)
◮ compressed tour: 0, 1, [n, 2], 2n (5 integers)

1:  ...
    for (i = 0; i < 2*n; i++) {
2:    if (i % 2) { ... }
3:    else { ... }
    }
4:  ...
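To make the bracket notation concrete, the sketch below expands a compressed tour back into the uncompressed one. It reads a marker [count, length] as "the preceding `length` entries occur `count` times in total", which is an interpretation of the slides; the names `Entry`, `expand` and `compressed_example` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical encoding of one entry of a compressed tour.
struct Entry {
  bool is_loop;  // false: basic block id in a; true: marker [a = count, b = length]
  int a;
  int b;
};

// Expand a compressed tour. A marker [count, length] repeats the
// preceding `length` expanded entries so that they occur `count` times.
std::vector<int> expand(const std::vector<Entry>& compressed) {
  std::vector<int> tour;
  for (const Entry& e : compressed) {
    if (!e.is_loop) {
      tour.push_back(e.a);
      continue;
    }
    assert(e.a >= 1 && e.b >= 1);
    const std::size_t len = static_cast<std::size_t>(e.b);
    assert(len <= tour.size());
    const std::size_t start = tour.size() - len;
    for (int rep = 1; rep < e.a; ++rep) {
      for (std::size_t k = 0; k < len; ++k) {
        const int v = tour[start + k];  // copy first: push_back may reallocate
        tour.push_back(v);
      }
    }
  }
  return tour;
}

// The compressed tour 1, 2, 3, [n, 2], 4 from the slide (6 integers).
std::vector<Entry> compressed_example(int n) {
  return {{false, 1, 0}, {false, 2, 0}, {false, 3, 0}, {true, n, 2}, {false, 4, 0}};
}
```

Expanding `compressed_example(n)` gives a tour of 2n + 2 block ids, while the compressed form stays at 6 integers regardless of n.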
Offline Algorithm
Dynamic programming compresses according to
$$f(d_{i,j}) = \begin{cases} 1 & i = j \\ \min_{i \le s < j} f(d_{i,s} \circ d_{s+1,j}) & \text{otherwise,} \end{cases}$$
where $d_{i,j}$ is the optimally compressed subtour from the $i$-th to the $j$-th index and
$f(d_{i,j}) := |N \cap V|$ (loop elements not counted).
Problems
◮ need to see whole tour in order to find an optimal solution
◮ slow ($O(n^3)$ for tour of length $n$)
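The flavour of such an interval dynamic program can be sketched as follows. This is not the authors' algorithm: it assumes a simplified cost model in which every stored basic block id costs one integer and every loop marker [count, length] costs two, matching the integer counts on the Loop Compression slide rather than the measure f above. The function name `min_compressed_size` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: minimal number of integers needed to store `tour`
// when a subtour that is k >= 2 repetitions of a body may be stored as
// body + marker (2 extra integers). Roughly cubic in the tour length.
std::size_t min_compressed_size(const std::vector<int>& tour) {
  const std::size_t n = tour.size();
  if (n == 0) return 0;
  std::vector<std::vector<std::size_t>> f(n, std::vector<std::size_t>(n, 0));
  for (std::size_t len = 1; len <= n; ++len) {
    for (std::size_t i = 0; i + len <= n; ++i) {
      const std::size_t j = i + len - 1;
      if (len == 1) { f[i][j] = 1; continue; }
      std::size_t best = len;  // fallback: store the subtour uncompressed
      // Split the subtour at every position s and combine both halves.
      for (std::size_t s = i; s < j; ++s)
        best = std::min(best, f[i][s] + f[s + 1][j]);
      // Try every period p for which the subtour is a repeated body.
      for (std::size_t p = 1; p <= len / 2; ++p) {
        if (len % p != 0) continue;
        bool periodic = true;
        for (std::size_t k = i + p; k <= j && periodic; ++k)
          periodic = (tour[k] == tour[k - p]);
        if (periodic) best = std::min(best, f[i][i + p - 1] + 2);
      }
      f[i][j] = best;
    }
  }
  return f[0][n - 1];
}
```

Under this cost model the tour 1, 2, 3, ..., 2, 3, 4 with ten loop iterations needs 6 integers. Both drawbacks listed above remain: the table needs the entire tour up front, and filling it is slow for long tours.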
Online Algorithm
1, 2, 3, . . . , 2, 3, 4 becomes
push(1): 1
push(2): 1, 2
push(3): 1, 2, 3
push(2): 1, 2, 3, 2
push(3): 1, 2, 3, [2, 2]
push(2): 1, 2, 3, [2, 2], 2
push(3): 1, 2, 3, [3, 2]
...
last push: 1, 2, 3, [n, 2], 4
pop: 1, 2, 3, [n, 2]
pop: 1, 2, 3, [n − 1, 2], 2
◮ push
  ◮ pushes element to top of stack
  ◮ compresses the loops on window of predefined size
◮ pop
  ◮ unrolls one instance of last loop if needed
  ◮ returns element from top of stack and removes it from stack
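The push and pop behaviour traced above can be sketched with a small class. This is a simplified illustration, not the ccistack implementation: the names `CompressedStack` and `stored_items` are hypothetical, loop bodies may only contain plain values (no nested loops), and `window` is taken to be the largest loop body length that push tries to detect.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch of an online compressing integer stack.
class CompressedStack {
 public:
  explicit CompressedStack(std::size_t window) : window_(window) {}

  void push(int v) {
    items_.push_back({false, v, 0});
    compress();
  }

  int pop() {
    assert(!empty());
    if (items_.back().is_loop) unroll();
    const int v = items_.back().value;
    items_.pop_back();
    return v;
  }

  bool empty() const { return items_.empty(); }
  std::size_t stored_items() const { return items_.size(); }

 private:
  struct Item {
    bool is_loop;     // false: plain value; true: marker [value = count, len]
    int value;
    std::size_t len;
  };

  // True if the len items starting at a and at b are equal plain values.
  bool same_values(std::size_t a, std::size_t b, std::size_t len) const {
    for (std::size_t k = 0; k < len; ++k) {
      const Item& x = items_[a + k];
      const Item& y = items_[b + k];
      if (x.is_loop || y.is_loop || x.value != y.value) return false;
    }
    return true;
  }

  // Try to fold the top of the stack into a loop of body length <= window_.
  void compress() {
    const std::size_t n = items_.size();
    for (std::size_t len = 1; len <= window_; ++len) {
      // body, [count, len], body  ->  body, [count + 1, len]
      if (n >= 2 * len + 1) {
        Item& m = items_[n - len - 1];
        if (m.is_loop && m.len == len &&
            same_values(n - 2 * len - 1, n - len, len)) {
          ++m.value;
          items_.erase(items_.end() - static_cast<std::ptrdiff_t>(len), items_.end());
          return;
        }
      }
      // body, body  ->  body, [2, len]
      if (n >= 2 * len && same_values(n - 2 * len, n - len, len)) {
        items_.erase(items_.end() - static_cast<std::ptrdiff_t>(len), items_.end());
        items_.push_back({true, 2, len});
        return;
      }
    }
  }

  // body, [count, len]  ->  body, [count - 1, len], body
  void unroll() {
    const Item marker = items_.back();
    items_.pop_back();
    const std::size_t start = items_.size() - marker.len;
    if (marker.value - 1 > 1)
      items_.push_back({true, marker.value - 1, marker.len});
    for (std::size_t k = 0; k < marker.len; ++k) {
      const Item body_item = items_[start + k];  // copy before push_back
      items_.push_back(body_item);
    }
  }

  std::size_t window_;
  std::vector<Item> items_;
};
```

Pushing 1, then the pair 2, 3 a thousand times, then 4 leaves five stored items (1, 2, 3, one marker, 4), and popping returns the tour in reverse order. Compression happens only in `push`, so the work per push is bounded by the window size.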
Test Results
Problem   uncompr. stack size   compr. stack size   compr. rate
burger    3432 MB               144 Byte            25013889
bratu     4576 MB               168 Byte            28572380
ljc       3692 MB               704 KByte           5500

Problem   Window size   Runtime rate
burger    51            ≈ 2.5
bratu     51            ≈ 2.8
ljc       51            ≈ 5.5
ljc       11            ≈ 3.3
Implementation
◮ C++ front-end
  ◮ distribution: ccistack.{hpp,cpp}
  ◮ class ccistack
  ◮ void ccistack::push(int) and int ccistack::pop()
◮ C front-end
  ◮ distribution: cistack.{h,c}
  ◮ void init (struct cistack *k, int MaxStackSize, int WindowSize) to allocate
  ◮ void finalize (struct cistack *k) to free
  ◮ void push (struct cistack *k, int i)
  ◮ int pop (struct cistack *k) and int empty (struct cistack *k)
◮ Fortran front-end
  ◮ uses C front-end
C++ Front-End
#include <fstream>
#include "cistack.h"
using namespace std;

int main(int argc, char* argv[]) {
  ifstream infile(argv[1]);
  ofstream outfile(argv[2]);
  cistack cs(100, 21);
  int v;
  // push every integer read from the input file
  while (infile >> v) cs.push(v);
  // pop until empty: writes the input in reverse order
  while (!cs.empty())
    outfile << cs.pop() << endl;
  return 0;
}
C Front-End
#include <stdio.h>
#include "cistack.h"

int main(int argc, char* argv[]) {
  struct cistack cs;
  FILE* infile = fopen(argv[1], "r+");
  FILE* outfile = fopen(argv[2], "w+");
  init(&cs, 100, 4);  /* MaxStackSize = 100, WindowSize = 4 */
  int v;
  /* push every integer read from the input file */
  while (fscanf(infile, "%i", &v) > 0) push(&cs, v);
  /* pop until empty: writes the input in reverse order */
  while (!empty(&cs)) fprintf(outfile, "%i\n", pop(&cs));
  finalize(&cs);
  fclose(outfile);
  fclose(infile);
  return 0;
}
Conclusion (I)
◮ powerful control flow reversal tool ready to use for developers of adjoint compiler technology and/or debugging/profiling tools
◮ paper (almost) ready for submission
Outlook
◮ compression of integer data stack
◮ compression of float data stack
Conclusion (II)
You need an (adjoint) derivative code compiler if
◮ finite differences cannot be trusted
◮ finite differences or exact forward sensitivities are too expensive
◮ you are unable to build and solve the adjoint system manually
You need to invest 3, 6, 18, 36 (wo)man months for a sustained ratio
(runtime of adjoint) / (runtime of original simulation)
of 50, 20, < 10, < 4
Further Ongoing Activities at STCE
◮ CompAD-III (J. Riehme and D. Gendler, with UHerts and NAG)
◮ Adjoint error correction in ICON (J. Riehme and K. Leppkes, with MPI-M)
◮ Model reliability analysis in Sisyphe/Telemac (J. Riehme, with BAW)
◮ Data assimilation in JURASSIC (E. Varnik and M. Forster, with ICG-I at FZJ)
◮ dcc and the AaChen platform for Structured Automatic Manipulation of Mathematical Models (AC-SAMMM) (M. Forster and B. Gendler, with AVT)
◮ Toward adjoint MPI (M. Schanen, with ANL and INRIA)
◮ Pushing TBR (with ANL and INRIA)
◮ Call tree reversal (H. Lakhdar and X. Jin)
Further Ongoing Activities at STCE (cont.)
◮ Elimination techniques in linearized DAGs (V. Mosenkis)
◮ Higher-order adjoints for uncertainty quantification (M. Beckers, with UHerts)
◮ Adjoint subgradients for McCormick relaxations (M. Beckers, V. Mosenkis, and M. Maier, with MechEng at MIT)
◮ What color is the non-constant part of your Jacobian? (E. Varnik and L. Razik)
◮ Shared-memory parallelism in tape-based adjoints (K. Leppkes and J. Riehme)
◮ Submitted: Hybrid AD for C/C++ (ADOL-C + dcc, D. Gendler, with UPaderborn)
The AaChen platform for Structured Automatic Manipulation of Mathematical Models
[Figure: AC-SAMMM overview. Labels: Diversity (The World): mathematical model in Modelica, gPROMS, ...; To C−+; Standard (AC-SAMMM): mathematical model in C−+ (descriptive), mathematical submodels in C− (imperative); dcc (I1, I2, I3): tailored AD, symbolic AD, manipulation, model refinement; custom derivative models; dcc configuration; problem; solution (DyOS, Mexa, ...)]
Global Optimization using McCormick Relaxations
[Figure: plot over the range −6 to 4 showing the original function, the natural interval extension underestimator, the convex underestimator, the affine underestimator, and the global upper bound]
(adjoint) subgradients by NAG Fortran compiler based on
◮ A. Mitsos, B. Chachuat, and P. I. Barton: McCormick-Based Relaxations of Algorithms, SIAM Journal on Optimization, 2009.
(C. Corbett, M. Beckers, V. Mosenkis, M. Maier)
Uncertainty Quantification / Robust Optimization
e.g. $\min_{\mu_x} \left( \mu_y + \sqrt{\sigma_y^2} \right)$ where $y = F(x)$ and w.l.o.g. $F : \mathbb{R} \to \mathbb{R}$
using
◮ approximate mean
$\mu_y = F(\mu_x) + \frac{F''(\mu_x)}{2} \cdot \sigma_x^2$
◮ approximate variance
$\sigma_y^2 = F'(\mu_x)^2 \, \sigma_x^2 + F'(\mu_x) F''(\mu_x) S_x \, \sigma_x^3 + \frac{1}{4} \left( F''(\mu_x) \right)^2 (K_x - 1) \, \sigma_x^4$
for given initial mean $\mu_x$, variance $\sigma_x^2$, skewness $S_x$, and kurtosis $K_x$ of $x \in \mathbb{R}$. Second-order method (e.g. Newton) requires derivatives up to fourth order. (M. Beckers)
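The two approximations can be evaluated directly once F, F′ and F″ at the mean are available, for instance from first- and second-order derivative code. The sketch below assumes these three values are passed in by the caller; the names `Moments`, `propagate_moments` and `robust_objective` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Moments {
  double mean;      // approximate mu_y
  double variance;  // approximate sigma_y^2
};

// f0 = F(mu_x), f1 = F'(mu_x), f2 = F''(mu_x); sigma_x is the standard
// deviation, skew_x the skewness S_x and kurt_x the kurtosis K_x of x.
Moments propagate_moments(double f0, double f1, double f2,
                          double sigma_x, double skew_x, double kurt_x) {
  assert(sigma_x >= 0.0);
  const double s2 = sigma_x * sigma_x;
  Moments m;
  m.mean = f0 + 0.5 * f2 * s2;
  m.variance = f1 * f1 * s2
             + f1 * f2 * skew_x * s2 * sigma_x
             + 0.25 * f2 * f2 * (kurt_x - 1.0) * s2 * s2;
  return m;
}

// Robust objective mu_y + sqrt(sigma_y^2) from the slide.
double robust_objective(const Moments& m) {
  return m.mean + std::sqrt(m.variance);
}
```

For F(x) = x² at μx = 1 with σx = 0.5, Sx = 0 and Kx = 3 (a normal distribution), this gives μy = 1.25 and σy² = 1.125, which agrees with the exact moments of x² in that case.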
Adjoint (Total Model) Error Correction
... in ICON GCM: ICOsahedral Non-hydrostatic General Circulation Model developed by Max Planck Institute for Meteorology and Deutscher Wetterdienst.
(Discrete) Adjoint by NAG Fortran Compiler. (J. Riehme)
(References to) theoretical foundations e.g. in
◮ R. Rannacher, F.-T. Suttmeier: A posteriori error control in finite element methods via duality techniques: Application to perfect plasticity. Computational Mechanics, 1998.
◮ M.B. Giles, N.A. Pierce and E. Süli: Progress in adjoint error correction for integral functionals, Computing and Visualization in Science, 2004.
Adjoint Error Correction (Linear Case)
Given: $x_f = F(A, x_s)$ after solution of $A \cdot x_f = x_s$ and value of linear objective $J = J(g, x_f) = \langle g, x_f \rangle$.
Wanted: corrected objective $J = \langle g, x_f^c \rangle - \langle \bar{x}_s^c, A \cdot x_f^c - x_s \rangle$.
Solution: $A^T \cdot \bar{x}_s = \bar{x}_f = g$
NOT using discrete adjoint code
  $x_f = A^{-1} \cdot x_s$
  $J = \langle g, x_f \rangle$
  $\bar{x}_f = g \cdot \bar{J} = g$
  $\bar{x}_s = A^{-T} \cdot \bar{x}_f$
produced by derivative code compiler.
Adjoint Error Correction (Nonlinear Case)
Given: $x_f = F(A, x_s)$ after solution of $N(x) = 0$ and nonlinear objective $J = J(x_f)$ with gradient $g \equiv \frac{\partial J}{\partial x_f}$.
Wanted: corrected objective $J = J(x_f^c) - \langle \bar{x}_s^c, N(x_f^c) \rangle$.
Solution to the adjoint system preferably using discrete adjoint code
  $x_f = F(x_s)$
  $J = J(x_f)$
  $\bar{x}_f = g \cdot \bar{J} = g$
  $\bar{x}_s = F'(x_s)^T \cdot \bar{x}_f$
produced by derivative code compiler.
Objective: Potential Energy wrt. Height Field
rotating earth; all water except at poles; objective: potential energy somewhere in middle of the Sahara ... (F. Rauser, P. Korn)
Test Case 3: Unsteady Solid Body Rotation from
◮ Laeuter: Unsteady analytical solutions of the spherical shallow water equations. Journal of Computational Physics, 2005