Sparsity Detection in Matlab - EuroAd Workshop...and Engineering (ECCOMAS), volume 2. University of...

Sparsity Detection inMatlab

11th European Workshop onAutomatic DifferentiationCranfield University, Shrivenham9th December 2010Shaun Forth1 and Anand Chaturvedi21Applied Mathematics & Scientific Computing,Cranfield University2Indian Institute of Technology Rookee

2/ 26

Outline

1 Introduction

2 Approaches to Sparsity Detection

3 Preliminary MAD Implementations

4 Results

5 Conclusions & Further Work

3/ 26

Introduction

I For a large sparse Jacobian compression techniques cangreatly reduce the cost of determining a Jacobian’s entriesby AD [GW08, Chap.8].

I Such techniques require the Jacobian’s sparsity pattern -or a tight over-estimate.

I Sophisticated approaches in ADOL-C [WG04] andTAF [GK06] for C/C++ and Fortran.

I By-product of use of sparse storage in forward modeAD [BKBC96, FK04].

4/ 26

Example (Minpack FIC Sparsity Pattern [ACMX92])

Sparsity pattern = lo-cation of Jacobian en-tries

5/ 26

Problem Definition

Definition (Sparsity Estimation)Given a function f : x ∈ Rn → y ∈ Rm with Jacobian Jf ∈ Rm×n

and f defined by a computer program then the sparsityestimation problem is to compute the locations (i , j) of allnon-structurally zero elements (a.k.a. entries) of Jf.

I Jfi,j is structurally zero if it is zero for all valid x.I If the sparsity pattern is not valid for all valid independent

variables x then this should be signalled.

6/ 26

Example (Structurally Zero)Consider the trivial problem y = f(x) with,

fi = x2i ,

and

Jfi,j =

{2xi , i = j ,0, i 6= j .

I Sparsity pattern is clearly the identity matrix.I But if xi ≡ 0 then Jfi,i = 0.I Element Jfi,j , i 6= j is structurally zero.I Element Jfi,i is not structurally zero.I Element Jfi,i may accidentally be zero when xi ≡ 0.

n.b. Branching can give similar effects.

7/ 26

Approaches to Sparsity Detection

By-Product of Sparse Forward Mode ADUsing Bit OperationsVerma’s Matlab Implementation

8/ 26

By-Product of Sparse Forward Mode AD

ADiFor, MAD, ADO2 [BKBC96, FK04, PR98] may propagatederivative vectors using sparse storage.

I For each active variable’s derivatives ∇vk store only thenon-zeros as a data structure1,

∇vk =

{(j ,∂vk

∂xj

):∂vk

∂xj6= 0

}.

I Such methods typically do not store a derivative if it isaccidentally zero.

I Sparsity pattern may be an underestimate - unsafe.I Work-around: randomize x to reduce likelihood of

accidents [FK04] - resulting Jf of no practical use.

1or similar!

9/ 26

Using Bit Operations

The sparsity pattern vk for any variable vk may be defined by abit pattern,

vk ={

vkj , j = 1, . . . ,n ∈ {0,1}n

},

with

vkj =

{0 if ∂vk

∂xjstructurally zero

1 if ∂vk∂xj

not structurally zero.

e.g. for n = 3 then

x1 = {1,0,0}x2 = {0,1,0}x3 = {0,0,1}

v1 : v1 = x1 + x2 = {1,1,0}

10/ 26

Using Bit Operations (ctd.)

For active scalars x , y and constant a [GK06],I If v = x ◦ y with ◦ = +,−, ∗, / then,

v = x |y ,

with | the bit-wise or operation

v j =

{1 if x j = 1 or y j = 1,0 otherwise.

I For v = a ◦ x , or x ◦ a, a 6= 0,

v = x .

11/ 26

Using Bit Operations (ctd.)

I Such bit-wise operations exist in C, C++, Fortran andMatlab for (unsigned) integer types.

I With 32 bit (4 byte) integers the sparsity pattern for up to32 directional derivatives may be stored in a single integer.

I For n independent variables a vector of length n/32 canhold the entire sparsity pattern for a scalar.

I Just n/32 bit-wise or operations are required to combinesuch patterns in x |y .

I For 64 bit integers (8 bytes) use n/64 in above!I TAF can produce sparsity patterns 2 orders of magnitude

faster than Jacobians [GK06].

12/ 26

Verma’s Matlab Implementation

Verma [Ver98] realised that sparsity propagation rules must beadapted to vector/matrix operations.

Example (z = Inx)I In Matlab z = eye(n)*x.I If we don’t account for zeros in In then Jacobian sparsity

pattern is the overestimate,

Jz =

1 · · · 1...

...1 · · · 1

I In operation z = Ax account for sparsity of A.

13/ 26

Verma’s Matlab Implementation (ctd.)

I What if matrix is active z = Yx? Verma used sparsity of Y’svalue - but this might have entries that are accidentallyzero.

I Verma stored sparsities using full arrays - performanceoverhead.

14/ 26

Preliminary MAD Implementations

Account for Sparsity of Intermediate VariablesLogical Arrays for Derivative SparsityBitwise Storage for Derivative SparsityBranch Detection

15/ 26

Account for Sparsity of Intermediate Variables

Both our implementations have three components/properties:I value - stores an object’s numerical value.I entries - a logical array indicating the location of structural

non-zeros.I sparsity - storage of derivative sparsity pattern - see later.

The entries component is updated as one might expect:operation entries operationz = x + y z.entries = x.entries | y.entries

z = x .* y z.entries = x.entries & y.entries

z = x * y z.entries = logical(x.entries*y.entries)

16/ 26

Logical Matrices for Derivative Sparsity

I MAD’s derivvec class stores pattern for an object v of sized1 × d2 × · · · × dD as a (d1 · d2 · · · dD)× n logical matrix.

I The logical matrix may be full or sparse.I Storage requirement is

I full: 4 · d1 · d2 · · · dD · n bytesI sparse: < 12n nnz(v.entries) bytes (underlying CSC).

I Sparsity propagation a la Verma but using an object’sentries and not its value.operation entries operationz = x + y z.sparsity = x.sparsity | y.sparsity

z = x .* y z.sparsity = x.entries & y.sparsity

| x.sparsity & y.entries

Conformability - derivvec replicates entries n times.

17/ 26

Bitwise Storage for Derivative Sparsity

I bitvec class stores pattern for a d1 × d2 × · · · × dD object vas a (d1 · d2 · · · dD)× ceil(n/32) uint32 matrix.

I The integer matrix must be full - no sparse support.I Storage requirement is 4 · d1 · d2 · · · dD · ceil(n/32) bytes.I Matlab R2010a has insufficient support for uint64.I Sparsity propagation as for logical arrays.I Conformability - bitvec replicates entries to ceil(n/32)

uint32 integers as required.

18/ 26

Branch Detection

Basic facility [Men10].I Logs results of logical operations on active variables

expected to control branching.I Logs can be examined or successive logs compared for

changes in control flow.I Not fully tested.

19/ 26

Results

All test cases from the Minpack Collection [ACMX92].

Flow in Channel - FICSwirling Flow between Disks - SFDCombustion of Propane - CPF/CPR

20/ 26

Flow in Channel - FIC

problem size ntechnique 8 64 2400fmad 0.5 1.4 120.3logical derivvec 0.5 1.4 65.6bitvec 0.5 1.4 42.4

21/ 26

Swirling Flow between Disks - SFD

problem size ntechnique 14 700 1400 2800fmad 0.9 26.3 67.5 257.5logical derivvec 0.9 24.6 51.8 110.7bitvec 0.9 24.5 48.6 100.5

22/ 26

Combustion of Propane - CPF/CPR

problem and size nCPR CPF

technique 5 11fmad 0.31 0.33logical derivvec 0.26 0.28bitvec 0.25 0.28

23/ 26

Conclusions & Further Work

I It is possible to forward propagate sparsity patterns about2.5 to 4 times faster than derivatives for moderate sizedproblems in Matlab.

I Presently, bitwise operations are slightly faster.I Robustness requires:

I Storage of an objects entries as well as its derivativesparsity pattern.

I Branch detection.I Scope for further improvements:

I For efficiency of indexing and | and & operations logicalderivvec internal storage should be transposed.

I Improved Matlab support for uint64.I Hybrid approach using sparse bitwise storage -

C/C++/Fortran coding?

24/ 26

References I

Brett M. Averick, Richard G. Carter, Jorge J. Moré, and Guo-Liang Xue.The MINPACK-2 test problem collection.Preprint MCS–P153–0692, ANL/MCS–TM–150, Rev. 1, Mathematics andComputer Science Division, Argonne National Laboratory, Argonne, Ill., 1992.See ftp://info.mcs.anl.gov/pub/MINPACK-2/tprobs/P153.ps.Z.

Christian H. Bischof, Peyvand M. Khademi, Ali Bouaricha, and Alan Carle.Efficient computations of gradients and Jacobians by dynamic exploitation ofsparsity in automatic differentiation.Optimization Methods and Software, 7:1–39, 1996.

Shaun A. Forth and Robert Ketzscher.High-level interfaces for the MAD (Matlab Automatic Differentiation) package.In P. Neittaanmäki, T. Rossi, S. Korotov, E. Oñate, J. Périaux, and D. Knörzer,editors, 4th European Congress on Computational Methods in Applied Sciencesand Engineering (ECCOMAS), volume 2. University of Jyväskylä, Department ofMathematical Information Technology, Finland, Jul 24–28 2004.ISBN 951-39-1869-6.

25/ 26

References II

Ralf Giering and Thomas Kaminski.Automatic Sparsity Detection implemented as a source-to-source transformation.In Alexandrov, VN and VanAlbada, GD and Sloot, PMA and Dongarra, J, editor,COMPUTATIONAL SCIENCE - ICCS 2006, PT 4, PROCEEDINGS, volume 3994of LECTURE NOTES IN COMPUTER SCIENCE, pages 591–598,HEIDELBERGER PLATZ 3, D-14197 BERLIN, GERMANY, 2006.SPRINGER-VERLAG BERLIN.6th International Conference on Computational Science (ICCS 2006), Reading,ENGLAND, MAY 28-31, 2006.

Andreas Griewank and Andrea Walther.Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation.Number 105 in Other Titles in Applied Mathematics. SIAM, Philadelphia, PA, 2ndedition, 2008.

Marina Menshikova.Uncertainty estimation using the moments method facilitated by automaticdifferentiation in Matlab.PhD thesis, Cranfield University, Cranfield Defence and Security, DefenceAcademy of the UK, Shrivenham, Swindon, SN6 8LA, UK, 2010..

http://hdl.handle.net/1826/4328

26/ 26

References III

J.D. Pryce and J.K. Reid.ADO1, a Fortran 90 code for automatic differentiation.Technical Report RAL-TR-1998-057, Rutherford Appleton Laboratory, Chilton,Didcot, Oxfordshire, OX11 OQX, England, 1998.Available via ftp://matisa.cc.rl.ac.uk/pub/reports/ prRAL98057.ps.gz.

Arun Verma.ADMAT: Automatic differentiation in MATLAB using object oriented methods.In SIAM Interdisciplinary Workshop on Object Oriented Methods forInteroperability, pages 174–183, Yorktown Heights, New York, Oct 21-23 1998.SIAM, National Science Foundation.

Andrea Walther and Andreas Griewank.ADOL-C: computing higher-order derivatives and sparsity patterns for functionswritten in c++.In Proceedings of the European Congress on Computational Methods in AppliedSciences and Engineering - ECCOMAS 2004, July 2004..

http://www.mit.jyu.fi/eccomas2004/proceedings/pdf/577.pdf

Sparsity Detection in Matlab - EuroAd Workshop...and Engineering (ECCOMAS), volume 2. University of...

Documents

Transcript of Sparsity Detection in Matlab - EuroAd Workshop...and Engineering (ECCOMAS), volume 2. University of...