Hierarchical matrix techniques for maximum likelihood covariance estimation

Alexander Litvinenko, Extreme Computing Research Center and Uncertainty Quantification Center, KAUST
(joint work with M. Genton, Y. Sun and D. Keyes)

http://sri-uq.kaust.edu.sa/


The structure of the talk

1. Motivation
2. Hierarchical matrices [Hackbusch 1999]
3. Matérn covariance functions
4. Uncertain parameters of the covariance function:
   4.1 Uncertain covariance length
   4.2 Uncertain smoothness parameter
5. Identification of these parameters via maximizing the log-likelihood



Motivation, problem 1

Task: predict temperature, velocity and salinity, and estimate the parameters of the covariance.

Grid: 50M locations on 50 levels: 4·(X·Y·Z) + X·Y = 4·500·500·50 + 500·500 ≈ 50M.

High-resolution time-dependent data about the Red Sea: zonal velocity and temperature.



Motivation, problem 2

Task: to predict moisture, compute covariance, estimate its parameters

2D grid: ≈ 2.5M locations with 2.1M observations and 278K missing values.

[Figure: soil moisture field over longitude −120 to −70 and latitude 25 to 50; values range from 0.15 to 0.50.]

High-resolution daily soil moisture data at the top layer of the Mississippi basin, U.S.A., 01.01.2014 (Chaney et al., in review).

Important for agriculture and defense. Moisture is very heterogeneous.


Motivation, estimation of uncertain parameters


Box-plots for ℓ = 0.0334 (domain [0, 1]²) vs. different H-matrix ranks k = {3, 7, 9}. Which H-matrix rank is sufficient for identification of the parameters of a particular type of covariance matrix?



Motivation for H-matrices

A general dense matrix requires O(n²) storage and O(n³) time for factorization. This can be prohibitively expensive!

If the covariance matrix is structured (diagonal, Toeplitz, circulant), then we can apply e.g. the FFT with O(n log n) cost — but what if it is not?
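The reason unstructured covariance matrices are still tractable is that off-diagonal ("admissible") blocks between well-separated point clusters are numerically low-rank. A minimal sketch of this observation, using a truncated SVD on an exponential kernel (the kernel, cluster geometry and ranks here are illustrative assumptions, not the talk's data):

```python
import numpy as np

def exp_cov(X, Y, ell=0.3):
    """Exponential covariance cov(x, y) = exp(-|x - y| / ell) on 1D points."""
    r = np.abs(X[:, None] - Y[None, :])
    return np.exp(-r / ell)

rng = np.random.default_rng(0)
# Two well-separated point clusters -> an "admissible" off-diagonal block.
x = rng.uniform(0.00, 0.25, 200)   # cluster 1
y = rng.uniform(0.75, 1.00, 200)   # cluster 2
B = exp_cov(x, y)

# Singular values of such a block decay fast, so a small rank k
# already gives a tight approximation B ~= U_k diag(s_k) V_k^T.
U, s, Vt = np.linalg.svd(B)
for k in (2, 4, 8):
    Bk = (U[:, :k] * s[:k]) @ Vt[:k, :]
    rel = np.linalg.norm(B - Bk, 2) / np.linalg.norm(B, 2)
    print(f"rank {k}: relative error {rel:.2e}")
```

An H-matrix stores exactly such low-rank factors on admissible blocks (and small dense blocks near the diagonal), which is what yields near-linear storage and arithmetic.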



Hierarchical (H)-matrices

An introduction to the hierarchical (H)-matrix technique



Examples of H-matrix approximations


Figure: Three examples of H-matrix approximations. (Left, middle) Matrices in R^{n×n}, n = 2^10, of the discretised covariance function cov(x, y) = e^{−r}, ℓ1 = 0.15, ℓ2 = 0.2, x, y ∈ [0, 1]²; (right) soil moisture data from the example above with 999 dofs. The largest dense (dark) blocks are in R^{32×32}; max. rank k = 4 on the left, k = 13 in the middle, and k = 9 on the right.



Matérn covariance functions

C_θ(r) = (2σ² / Γ(ν)) · (r / (2ℓ))^ν · K_ν(r / ℓ),   θ = (σ², ν, ℓ).



Examples of Matérn covariance matrices

C_{ν=3/2}(r) = (1 + √3 r / ℓ) · exp(−√3 r / ℓ),   (1)

C_{ν=5/2}(r) = (1 + √5 r / ℓ + 5r² / (3ℓ²)) · exp(−√5 r / ℓ).   (2)

ν = 1/2: exponential covariance function, C_{ν=1/2}(r) = exp(−r);
ν → ∞: Gaussian covariance function, C_{ν=∞}(r) = exp(−r²).
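The Matérn family can be evaluated directly with the modified Bessel function K_ν. A minimal sketch in the parameterisation used on the slides (the function name and default parameters are illustrative assumptions); the ν = 1/2 case is checked against its exponential limit:

```python
import numpy as np
from scipy.special import gamma, kv

def matern(r, sigma2=1.0, nu=0.5, ell=1.0):
    """Matern covariance in the slides' form:
    C(r) = 2*sigma2/Gamma(nu) * (r/(2*ell))**nu * K_nu(r/ell).
    The r -> 0 limit of this expression is sigma2."""
    r = np.asarray(r, dtype=float)
    with np.errstate(invalid="ignore"):  # K_nu diverges at r = 0
        c = 2.0 * sigma2 / gamma(nu) * (r / (2.0 * ell)) ** nu * kv(nu, r / ell)
    return np.where(r == 0.0, sigma2, c)

r = np.linspace(1e-6, 3.0, 50)
# Sanity check: for nu = 1/2 the formula reduces to the
# exponential covariance sigma2 * exp(-r/ell).
assert np.allclose(matern(r, nu=0.5, ell=0.7), np.exp(-r / 0.7), atol=1e-10)
```

This reduction follows from K_{1/2}(x) = √(π/(2x)) e^{−x}, which is why ν = 1/2 recovers the exponential model above.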



Identifying uncertain parameters




Identifying uncertain parameters

Given: a vector of measurements z = (z₁, ..., z_n)^T with a covariance matrix C(θ*) = C(σ², ν, ℓ),

C_θ(r) = (2σ² / Γ(ν)) · (r / (2ℓ))^ν · K_ν(r / ℓ),   θ = (σ², ν, ℓ).

To identify: the uncertain parameters (σ², ν, ℓ).
Plan: maximize the log-likelihood function

L(θ) = −(1/2) · (N log 2π + log det{C(θ)} + z^T C(θ)^{−1} z).

On each iteration i we have a new matrix C(θ_i).



Other works

1. S. AMBIKASARAN, et al., Fast direct methods for Gaussian processes and the analysis of NASA Kepler mission data, arXiv:1403.6015, (2014).

2. S. AMBIKASARAN, J. Y. LI, P. K. KITANIDIS, AND E. DARVE, Large-scale stochastic linear inversion using hierarchical matrices, Computational Geosciences, (2013).

3. J. BALLANI AND D. KRESSNER, Sparse inverse covariance estimation with hierarchical matrices, (2015).

4. M. BEBENDORF, Why approximate LU decompositions of finite element discretizations of elliptic operators can be computed with almost linear complexity, (2007).

5. S. BOERM AND J. GARCKE, Approximating Gaussian processes with H2-matrices, (2007).

6. J. E. CASTRILLON, M. G. GENTON, AND R. YOKOTA, Multi-Level Restricted Maximum Likelihood Covariance Estimation and Kriging for Large Non-Gridded Spatial Datasets, (2015).

7. J. DOELZ, H. HARBRECHT, AND C. SCHWAB, Covariance regularity and H-matrix approximation for rough random fields, ETH Zurich, (2014).

8. H. HARBRECHT et al., Efficient approximation of random fields for numerical applications, Numerical Linear Algebra with Applications, (2015).

9. C.-J. HSIEH, et al., Big QUIC: Sparse inverse covariance estimation for a million variables, (2013).

10. J. QUINONERO-CANDELA, et al., A unifying view of sparse approximate Gaussian process regression, (2005).

11. A. SAIBABA, S. AMBIKASARAN, J. YUE LI, P. KITANIDIS, AND E. DARVE, Application of hierarchical matrices to linear inverse problems in geostatistics, Oil & Gas Science, (2012).



Convergence of the optimization method



Details of the identification

To maximize the log-likelihood function we use Brent's method [Brent '73], which combines the bisection method, the secant method and inverse quadratic interpolation.

1. C(θ) ≈ C_H(θ, k).

2. H-Cholesky: C_H(θ, k) = L L^T.

3. z^T C^{−1} z = z^T (L L^T)^{−1} z = v^T · v, where v is the solution of L(θ, k) v(θ) := z(θ*).

4. Let λ_i be the diagonal elements of the H-Cholesky factor L; then

   log det{C} = log det{L L^T} = log ∏_{i=1}^{n} λ_i² = 2 ∑_{i=1}^{n} log λ_i,

   L(θ, k) = −(N/2) log(2π) − ∑_{i=1}^{N} log{L_ii(θ, k)} − (1/2) · v(θ)^T · v(θ).   (3)
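The four steps above can be sketched with a dense Cholesky factorisation standing in for the H-Cholesky (a simplification: the H-matrix version does the same algebra block-wise at near-linear cost); the function name and toy data are assumptions for illustration:

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def neg_log_likelihood(C, z):
    """Steps 2-4 of the slide with a dense Cholesky in place of H-Cholesky:
    C = L L^T, solve L v = z, and
    -L(theta) = (N/2) log(2 pi) + sum_i log L_ii + (1/2) v^T v."""
    N = len(z)
    L = cholesky(C, lower=True)               # step 2: C = L L^T
    v = solve_triangular(L, z, lower=True)    # step 3: L v = z
    sum_log_Lii = np.sum(np.log(np.diag(L)))  # step 4: (1/2) log det C
    return 0.5 * N * np.log(2.0 * np.pi) + sum_log_Lii + 0.5 * (v @ v)

# Toy check on a small exponential covariance matrix.
x = np.linspace(0.0, 1.0, 50)
C = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.5)
rng = np.random.default_rng(1)
z = rng.standard_normal(50)
nll = neg_log_likelihood(C, z)
```

Note that the log-determinant comes for free from the diagonal of L, which is exactly why the H-Cholesky makes the likelihood evaluation cheap.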


Figure: Shape of the log-likelihood over the parameter θ (truth θ* = 12), showing log(det(C)), z^T C^{−1} z and the log-likelihood itself. The minimum of the negative log-likelihood (black) is at θ = (·, ·, ℓ) ≈ 12 (σ² and ν are fixed).



Convergence of H-matrix approximations

[Figure: log relative error vs. H-matrix rank k, in the spectral and Frobenius norms, for ν = 1 (left) and ν = 0.5 (right) and covariance lengths ℓ = {0.1, 0.2, 0.5}.]



Convergence of H-matrix approximations

[Figure: log relative error (spectral norm) vs. H-matrix rank k for ν = {1.5, 1, 0.5}, with ℓ = 0.1 (left) and ℓ = 0.5 (right).]



What will change?

We approximate C by C_H.

1. How do the eigenvalues of C and C_H differ?
2. How does det(C) differ from det(C_H)? [Below]
3. How does L differ from L_H? [Mario Bebendorf et al.]
4. How does C^{−1} differ from (C_H)^{−1}? [Mario Bebendorf et al.]
5. How does L(θ, k) differ from L(θ)? [Below]
6. What is the optimal H-matrix rank? [Below]
7. How does θ_H differ from θ? [Below]

For theory and estimates of the rank and accuracy, see the works of Bebendorf, Grasedyck, Le Borne, Hackbusch, ...



Remark

For a small H-matrix rank k, the H-Cholesky factorisation of C_H breaks down when eigenvalues of C come very close to zero. One remedy is to increase the rank k; in our example with n = 65² we increased k from 7 to 9.

To avoid this instability, we can instead modify C_H^m = C_H + δ² I. If λ_i are the eigenvalues of C_H, then the eigenvalues of C_H^m are λ_i + δ², and

log det(C_H^m) = log ∏_{i=1}^{n} (λ_i + δ²) = ∑_{i=1}^{n} log(λ_i + δ²).   (4)



Error analysis

Theorem (Existence of the H-matrix inverse [Bebendorf '11; Ballani, Kressner '14])
Under certain conditions the H-matrix inverse exists, with

  ‖C_H^{−1} − C^{−1}‖ ≤ ε ‖C^{−1}‖,   (5)

and theoretical estimates for the rank k_inv of C_H^{−1} are given.

Theorem (Error in log det)
Let E := C − C_H, so that (C_H)^{−1} E = (C_H)^{−1} C − I, and assume the spectral radius satisfies

  ρ((C_H)^{−1} E) = ρ((C_H)^{−1} C − I) ≤ ε.   (6)

Then |log det(C) − log det(C_H)| ≤ −p log(1 − ε).

Proof: see [Ballani, Kressner '14], [Ipsen '05].
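The log-det bound can be checked numerically. A sketch with p taken to be the matrix dimension n (an assumption — the slide leaves p unspecified) and a small symmetric perturbation standing in for the H-matrix compression error:

```python
import numpy as np

n = 60
x = np.linspace(0.0, 1.0, n)
C = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.5)  # exponential covariance

# A small symmetric perturbation plays the role of E = C - C_H.
rng = np.random.default_rng(0)
P = rng.standard_normal((n, n))
CH = C + 1e-6 * (P + P.T)

# eps bounds the spectral radius of (C_H)^{-1} C - I, as in (6).
eps = np.max(np.abs(np.linalg.eigvals(np.linalg.solve(CH, C) - np.eye(n))))

ld_C = np.linalg.slogdet(C)[1]
ld_CH = np.linalg.slogdet(CH)[1]
bound = -n * np.log(1.0 - eps)      # theorem's bound with p = n
print(abs(ld_C - ld_CH) <= bound)
```

Since each eigenvalue μ_i of (C_H)^{−1}C − I satisfies |μ_i| ≤ ε, the difference of log-determinants is ∑ log(1 + μ_i), which is bounded in modulus by −n log(1 − ε).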



How sensitive is the log-likelihood to the H-matrix rank?

It is hardly sensitive at all: the H-matrix approximation changes the function L(θ, k), and hence the estimate of θ, only very slightly.

θ        | 0.05  1.05   2.04   3.04  4.03  5.03  6.02  7.02  8.01  9      10
L(exact) | 1628  -2354  -1450  27    1744  3594  5529  7522  9559  11628  13727
L(7)     | 1625  -2354  -1450  27    1745  3595  5530  7524  9560  11630  13726
L(20)    | 1625  -2354  -1450  27    1745  3595  5530  7524  9561  11630  13725

Comparison of three likelihood functions, computed with different H-matrix ranks: exact, H-rank 7, H-rank 20. Exponential covariance function with covariance length ℓ = 0.9, domain G = [0, 1]².



How sensitive is the log-likelihood to the H-matrix rank?

Figure: Three negative log-likelihood functions: exact, and computed with H-matrix ranks 7 and 17. One can see that even with rank 7 one achieves very accurate results.



Do we need all measurements? Boxplots vs n

Moisture data: box-plots of the estimated covariance length for an increasing number of measurements, n = {1000, 2000, 4000, 8000, 16000, 32000}. The mean and median are obtained by averaging 100 simulations.



Error bars decrease with an increasing number of measurements

Error bars (mean ± st. dev.) computed for different n.

The error bars shrink as the number of measurements/dimension grows, n = {17², 33², 65²}. The mean and median are obtained by averaging 200 simulations.



H-matrix approximation is robust w.r.t. parameter ν

Figure: Dependence of the H-matrix approximation error on the parameter ν. Relative error ‖C − C_H‖₂ / ‖C_H‖₂ vs. the smoothness parameter ν; H-matrix rank k = 8, n = 16641, Matérn covariance matrix.



H-matrix approximation is robust w.r.t. the covariance length ℓ

Figure: Dependence of the H-matrix approximation error on the covariance length ℓ. Relative error ‖C − C_H‖₂ / ‖C_H‖₂ vs. the covariance length ℓ; H-matrix rank k = 8, n = 16641, Matérn covariance matrix.


[Figure: log det(C) (left) and the negative log-likelihood (right) as functions of the covariance length ℓ ∈ [0, 1].]

H-matrix approximation of log det(C) (left) and of the log-likelihood L (right); σ² = 1, ν = 0.5. The red line uses rank k = 5 throughout; the blue line uses k = 3 on [0.01, 0.3], k = 4 on [0.3, 0.6] and k = 5 on [0.6, 1]. Rank k = 3 is sufficient to approximate C, but insufficient to approximate C^{−1} on the whole interval [0.01, 1]. The first numerical instability appears at ℓ ≈ 0.3; to avoid it the rank k is increased by 1, until the second instability appears at ℓ ≈ 0.6.


[Figure: z^T C^{−1} z and the log-likelihood as functions of the parameter ν ∈ [0, 3.5], on a logarithmic vertical scale.]

H-matrix approximation of z^T C^{−1} z and of the log-likelihood L; σ² = 1, with the rank k increasing from 5 to 12 after each jump. The red line uses rank k = 5 throughout; the blue line uses k = 5 on [0.1, 1.42], k = 6 on [1.42, 1.57], k = 7 on [1.57, 1.93], k = 8 on [1.93, 2.24] and k = {9, 10, 12} on [2.24, 3.14]. The rank k has to be increased to approximate C^{−1}: the first numerical instability appears at ν ≈ 1.42; to avoid it the rank is increased by 1, until the second instability appears at ν ≈ 1.57, etc.


Time profiling, C-language


Parallel implementation with HLIBpro (R. Kriemann)

We used www.hlibpro.org to set up the exponential covariance matrix (covariance length = 1) and to compute its Cholesky factorisation and inverse. We used adaptive rank arithmetic with block-wise accuracies ε = 1e−4 and ε = 1e−8 for C_H. The number of processing cores is 40. We took the moisture data (see above) with N points.

N     | Compr.  | compute C_H     | L L^T                    | inverse
      | rate    | time    size    | time   size    ε1        | time    size     ε2
10000 | 14%     | 0.9 s   106 MB  | 4.1 s  109 MB  7.7e-6    | 44 s    230 MB   7.8e-5
30000 | 7.5%    | 4.3 s   515 MB  | 25 s   557 MB  1.1e-3    | 316 s   1168 MB  1.1e-1

Table: Here ε1 := ‖I − (LL^T)^{−1} C‖₂, where L and C are H-matrices and I is the identity matrix; ε2 := ‖I − BC‖₂, where B is an H-matrix approximation of C^{−1}.


Take into account the gradient

∂L(θ_i)/∂θ_i = (1/2) tr(C^{−1} ∂C/∂θ_i) − (1/2) z^T C^{−1} (∂C/∂θ_i) C^{−1} z.   (7)

For an exponential random field we have

∂C(θ_i)/∂θ_i = ∂/∂ℓ exp(−‖x − y‖ / ℓ) = (‖x − y‖ / ℓ²) exp(−‖x − y‖ / ℓ).   (8)

Writing ∂C(θ_i)/∂θ_i =: C₂, this becomes

∂L(θ_i)/∂θ_i = (1/2) tr(C^{−1} C₂) − (1/2) z^T C^{−1} C₂ C^{−1} z.
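The gradient formula (7) can be validated against a finite difference of the (negative) log-likelihood. A dense sketch for the exponential kernel with parameter ℓ (the point set, data vector and function names are illustrative assumptions):

```python
import numpy as np

# Pairwise distances for 40 random points in [0, 1]^2 and synthetic data.
rng = np.random.default_rng(2)
pts = rng.uniform(size=(40, 2))
r = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
z = rng.standard_normal(40)

def nll(ell):
    """Negative log-likelihood with the constant term dropped."""
    C = np.exp(-r / ell)
    return 0.5 * (np.linalg.slogdet(C)[1] + z @ np.linalg.solve(C, z))

def grad_nll(ell):
    """Formula (7): 0.5 tr(C^{-1} C2) - 0.5 z^T C^{-1} C2 C^{-1} z,
    with C2 = dC/d ell = (r / ell^2) exp(-r / ell) as in (8)."""
    C = np.exp(-r / ell)
    C2 = (r / ell**2) * C
    Cinv_z = np.linalg.solve(C, z)
    return 0.5 * np.trace(np.linalg.solve(C, C2)) - 0.5 * (Cinv_z @ C2 @ Cinv_z)

# Central finite-difference cross-check.
ell, h = 0.4, 1e-6
fd = (nll(ell + h) - nll(ell - h)) / (2.0 * h)
```

The trace term is the expensive part at scale; with H-matrices it would be evaluated from the H-factorisation rather than a dense solve.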


Conclusion

- Covariance matrices can be approximated in the H-matrix format.
- Hypothesis: the H-matrix approximation is robust w.r.t. ν and ℓ.
- The influence of the H-matrix approximation error on the estimated parameters is small.
- The application of H-matrices extends the class of covariance functions we can work with, and allows non-regular discretisations of the covariance function on large spatial grids.
- With the maximisation algorithm we are able to identify both parameters: the covariance length ℓ and the smoothness ν.


Future plans

- Parallel H-Cholesky for very large covariance matrices on non-regular grids
- Preconditioning of the log-likelihood to decrease cond(C)
- Domain decomposition for large domains + H-matrices in each sub-domain
- Apply H-matrices to:
  1. the kriging estimate s := C_sy C_yy^{−1} y,
  2. estimation of the variance σ, i.e. the diagonal of the conditional covariance matrix C_ss|y = diag(C_ss − C_sy C_yy^{−1} C_ys),
  3. geostatistical optimal design: ϕ_A := n^{−1} trace(C_ss|y), ϕ_C := c^T (C_ss − C_sy C_yy^{−1} C_ys) c,
- Implement a gradient-based version
- Compare with the Bayesian update (H. Matthies, H. Najm, K. Law, A. Stuart et al.)
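The kriging quantities listed above reduce to a few covariance solves. A small dense sketch (the point sets, kernel, nugget and data are illustrative assumptions; the H-matrix version would replace the dense solves):

```python
import numpy as np

rng = np.random.default_rng(3)
y_pts = rng.uniform(size=(60, 2))   # observation sites
s_pts = rng.uniform(size=(5, 2))    # prediction sites

def exp_cov(A, B, ell=0.3):
    """Exponential covariance between two point sets."""
    r = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return np.exp(-r / ell)

Cyy = exp_cov(y_pts, y_pts) + 1e-10 * np.eye(60)  # tiny nugget for stability
Csy = exp_cov(s_pts, y_pts)
Css = exp_cov(s_pts, s_pts)
y = rng.standard_normal(60)                        # synthetic observations

# Kriging estimate s := Csy Cyy^{-1} y.
s_hat = Csy @ np.linalg.solve(Cyy, y)

# Conditional covariance Css|y = Css - Csy Cyy^{-1} Cys; its diagonal is
# the kriging variance, and phi_A := n^{-1} trace(Css|y) is the A-criterion.
Css_y = Css - Csy @ np.linalg.solve(Cyy, Csy.T)
phi_A = np.trace(Css_y) / len(s_pts)
```

All three listed tasks hinge on applying C_yy^{−1}, which is exactly the operation the H-matrix factorisation accelerates.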


Literature

1. Application of hierarchical matrices for computing the Karhunen-Loeve expansion, B.N. Khoromskij, A. Litvinenko, H.G. Matthies, Computing 84 (1-2), 49-67, 2009

2. Parameter identification in a probabilistic setting, B.V. Rosic, A. Kucerova, J. Sykora, O. Pajonk, A. Litvinenko, H.G. Matthies, Engineering Structures 50, 179-196, 2013

3. Methods for statistical data analysis with decision trees, V. Berikov, A. Litvinenko, Sobolev Institute of Mathematics, Novosibirsk, 2003, http://www.math.nsc.ru/AP/datamine/eng/context.pdf

4. Parametric and uncertainty computations with tensor product representations, H.G. Matthies, A. Litvinenko, O. Pajonk, B.V. Rosic, E. Zander, Uncertainty Quantification in Scientific Computing, 139-150, 2012

5. Data sparse computation of the Karhunen-Loeve expansion, B.N. Khoromskij, A. Litvinenko, AIP Conference Proceedings 1048 (1), 311, 2008

6. Kriging and spatial design accelerated by orders of magnitude: Combining low-rank covariance approximations with FFT-techniques, W. Nowak, A. Litvinenko, Mathematical Geosciences 45 (4), 411-435, 2013


Acknowledgement

1. Lars Grasedyck (RWTH Aachen) and Steffen Börm (Uni Kiel) for HLIB (www.hlib.org)

2. Ronald Kriemann (MPI Leipzig) for www.hlibpro.org

3. KAUST Research Computing group, KAUST SupercomputingLab (KSL)


Matérn Fields (Whittle, 1963)

Taken from D. Simpson (see also Finn Lindgren, Havard Rue, David Bolin, ...).

Theorem
The covariance function of a Matérn field,

c(x, y) = 1 / (Γ(ν + d/2) (4π)^{d/2} κ^{2ν} 2^{ν−1}) · (κ‖x − y‖)^ν K_ν(κ‖x − y‖),   (9)

is the Green's function of the differential operator

L_{2ν} = (κ² − Δ)^{ν + d/2}.   (10)
