Bayesian computation with INLA


Description

Short course about Bayesian computation with INLA, given at the AS2013 conference in Ribno, Slovenia.

Transcript of Bayesian computation with INLA

Page 1: Bayesian computation with INLA

Bayesian computation using INLA

Thiago G. Martins

Norwegian University of Science and Technology, Trondheim, Norway

AS 2013, Ribno, Slovenia

September, 2013


Page 2: Bayesian computation with INLA

Part I

Latent Gaussian models and INLA methodology

Page 3: Bayesian computation with INLA

Outline

Latent Gaussian models

Are latent Gaussian models important?

Bayesian computing

INLA method


Page 4: Bayesian computation with INLA

Hierarchical Bayesian models

Hierarchical models are an extremely useful tool in Bayesian model building.

Three parts:

- Observations (y): encode information about the observed data, including design and collection issues.

- The latent process (x): the unobserved process. May be the focus of the study, or may be included to reduce autocorrelation; e.g., it can encode spatial and/or temporal dependence.

- The parameter model (θ): models for all of the parameters in the observation and latent processes.


Page 7: Bayesian computation with INLA

Latent Gaussian models

A latent Gaussian model is a Bayesian hierarchical model of the following form:

- Observed data y: yi | xi ∼ π(yi | xi, θ)

- Latent Gaussian field: x ∼ N(·, Σ(θ))

- Hyperparameters: θ ∼ π(θ), controlling variability, length/strength of dependence, and parameters in the likelihood

π(x, θ | y) ∝ π(θ) π(x | θ) ∏_{i∈I} π(yi | xi, θ)


Page 11: Bayesian computation with INLA

Precision matrix

The precision matrix of the latent field

Q(θ) = Σ(θ)^{-1}

plays a key role!

Two issues:

- Building models through conditioning (“hierarchical models”)

- Computational benefits


Page 13: Bayesian computation with INLA

Building models through conditioning

If

- x ∼ N(0, Qx^{-1})

- y | x ∼ N(x, Qy^{-1})

then

Q(x, y) = [ Qx + Qy   −Qy ]
          [ −Qy        Qy ]

Not so nice expressions using the covariance matrix.
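As a quick sanity check of this block structure, here is a minimal base-R sketch with toy diagonal precisions (the matrices Qx, Qy and the size n are illustrative):

n <- 3
Qx <- diag(2, n)                        # toy precision of x
Qy <- diag(5, n)                        # toy precision of y | x
Q.joint <- rbind(cbind(Qx + Qy, -Qy),   # the block matrix from the slide
                 cbind(-Qy,      Qy))
S <- solve(Q.joint)                     # covariance of (x, y)
all.equal(S[1:n, 1:n], solve(Qx))                       # Var(x)   = Qx^{-1}
all.equal(S[1:n, n + 1:n], solve(Qx))                   # Cov(x,y) = Qx^{-1}
all.equal(S[n + 1:n, n + 1:n], solve(Qx) + solve(Qy))   # Var(y)   = Qx^{-1} + Qy^{-1}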

Page 14: Bayesian computation with INLA

Computational benefits

- Precision matrices encode conditional independence:

  xi ⊥ xj | x−ij ⇐⇒ Qij = 0

  We are interested in models with sparse precision matrices.

- x ∼ N(·, Σ(θ)) with sparse Q(θ) = Σ(θ)^{-1}. Gaussians with a sparse precision matrix are called Gaussian Markov random fields (GMRFs).

- Good computational properties through numerical algorithms for sparse matrices.


Page 17: Bayesian computation with INLA

Numerical algorithms for sparse matrices: scaling properties

- Time: O(n)

- Space: O(n^{3/2})

- Space-time: O(n^2)

This is to be compared with general O(n^3) algorithms for dense matrices.
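A small sketch of this gap using the Matrix package (shipped with R): a tridiagonal, RW1-like precision of illustrative size factorizes essentially in linear time, while the dense routine scales as O(n^3).

library(Matrix)
n <- 2000
Q <- bandSparse(n, k = 0:1,
                diagonals = list(rep(2, n), rep(-1, n - 1)),
                symmetric = TRUE)       # sparse tridiagonal precision
system.time(Cholesky(Q))                # sparse Cholesky: fast
system.time(chol(as.matrix(Q)))         # dense Cholesky: O(n^3), much slower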


Page 19: Bayesian computation with INLA

Outline

Latent Gaussian models

Are latent Gaussian models important?

Bayesian computing

INLA method


Page 20: Bayesian computation with INLA

Example (I): Mixed-effect model

yij | ηij, θ1 ∼ π(yij | ηij, θ1),   i = 1, …, N,  j = 1, …, M

ηij = µ + cij β + ui + vj + wij

where u, v and w are “random effects”.

If we assign Gaussian priors to µ, β, u and v, then

x | θ2 = (µ, β, u, v, η) | θ2

is jointly Gaussian.

θ = (θ1, θ2)


Page 22: Bayesian computation with INLA

Example (I) - cont.

We can reinterpret the model as

θ ∼ π(θ)

x | θ ∼ π(x | θ) = N(0, Q^{-1}(θ))

y | x, θ ∼ ∏_i π(yi | ηi, θ)

- dim(x) can be large: 10^2–10^5

- dim(θ) is small: 1–5


Page 24: Bayesian computation with INLA

Example (I) - cont.

Precision matrix of (η, u, v, µ, β), N = 100, M = 5.

[Figure: sparsity pattern of the precision matrix]

Page 25: Bayesian computation with INLA

Example (II): Time-series model

Smoothing of binary time-series

- Data is a sequence of 0s and 1s

- The probability of a 1 at time t, pt, depends on time:

  pt = exp(ηt) / (1 + exp(ηt))

- Linear predictor:

  ηt = µ + β ct + ut + vt,   t = 1, …, n


Page 28: Bayesian computation with INLA

Example (II) - cont.

Prior models

- µ and β are Normal

- u is an AR model, e.g. ut = φ ut−1 + εt, with parameters (φ, σ²ε).

- v is an unstructured term or a “random effect”

This gives that

x | θ = (µ, β, u, v, η)

is jointly Gaussian.

Hyperparameters: θ = (φ, σ²ε, σ²v)


Page 34: Bayesian computation with INLA

Example (II) - cont.

We can reinterpret the model as

θ ∼ π(θ)

x | θ ∼ π(x | θ) = N(0, Q^{-1}(θ))

y | x, θ ∼ ∏_i π(yi | ηi, θ)

- dim(x) can be large: 10^2–10^5

- dim(θ) is small: 1–5


Page 36: Bayesian computation with INLA

Example (II) - cont.

Precision matrix of (η, u, v, µ, β), n = 100.

[Figure: sparsity pattern of the precision matrix]
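To make the model concrete, a base-R sketch that simulates one realization of this binary time series (the values of φ, σε, σv and the covariate ct are purely illustrative):

set.seed(1)
n <- 100
phi <- 0.9; sigma.e <- 0.3; sigma.v <- 0.1
u <- as.numeric(arima.sim(list(ar = phi), n = n, sd = sigma.e))  # AR(1) term u_t
v <- rnorm(n, sd = sigma.v)              # unstructured term v_t
ct <- runif(n)                           # covariate c_t
eta <- 0.5 + 1.0 * ct + u + v            # linear predictor eta_t
p <- plogis(eta)                         # p_t = exp(eta_t) / (1 + exp(eta_t))
y <- rbinom(n, size = 1, prob = p)       # the observed 0/1 sequence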

Page 37: Bayesian computation with INLA

Example (III): Disease mapping

- Data: yi ∼ Poisson(Ei exp(ηi))

- Log-relative risk: ηi = µ + ui + vi + f(ci)

- Structured component u

- Unstructured component v

- Smooth effect of a covariate c

[Figure: disease map of the estimated log-relative risk; colour scale from −0.63 to 0.98]


Page 42: Bayesian computation with INLA

Yet Another Example (III)

We can reinterpret the model as

θ ∼ π(θ)

x | θ ∼ π(x | θ) = N(0, Q^{-1}(θ))

y | x, θ ∼ ∏_i π(yi | ηi, θ)

- dim(x) can be large: 10^2–10^5

- dim(θ) is small: 1–5

Page 43: Bayesian computation with INLA

Example (III) - cont.

Precision matrix of (η, u, v, µ, f).

[Figure: sparsity pattern of the precision matrix]
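Jumping ahead to the R-INLA notation of Part II, this disease-mapping model could be sketched roughly as below; the data frame d, its column names, and the neighbourhood graph file "map.graph" are all illustrative assumptions, not the deck's code.

library(INLA)
formula <- y ~ f(region,  model = "besag", graph = "map.graph") +  # structured u
               f(region2, model = "iid") +                         # unstructured v
               f(c.value, model = "rw2")                           # smooth f(c)
result <- inla(formula, family = "poisson", E = E, data = d)       # E_i: expected counts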

Page 44: Bayesian computation with INLA

What we have learned so far

The latent Gaussian model construct

θ ∼ π(θ)

x | θ ∼ π(x | θ) = N(0, Q^{-1}(θ))

y | x, θ ∼ ∏_i π(yi | ηi, θ)

occurs in many, seemingly unrelated, statistical models:

GLM/GAM/GLMM/GAMM/++

Page 45: Bayesian computation with INLA

Further Examples

- Dynamic linear models
- Stochastic volatility
- Generalized linear (mixed) models
- Generalized additive (mixed) models
- Spline smoothing
- Semi-parametric regression
- Space-varying (semi-parametric) regression models
- Disease mapping
- Log-Gaussian Cox processes
- Model-based geostatistics (*)
- Spatio-temporal models
- Survival analysis
- +++


Page 58: Bayesian computation with INLA

Outline

Latent Gaussian models

Are latent Gaussian models important?

Bayesian computing

INLA method


Page 59: Bayesian computation with INLA

Bayesian computing

We are interested in posterior marginal quantities like π(xi | y) and π(θi | y).

This requires the evaluation of integrals of the form

π(xi | y) ∝ ∫_{x_{−i}} ∫_θ π(y | x, θ) π(x | θ) π(θ) dθ dx_{−i}

The computation of massively high-dimensional integrals is at the core of Bayesian computing.


Page 62: Bayesian computation with INLA

But surely we can already do this

- Markov chain Monte Carlo (MCMC) is widely used by the applied community.

- There are generic tools available for MCMC (OpenBUGS, JAGS, Stan) and others for specific models, like BayesX.

- The issue of Bayesian computing is not “solved” even though MCMC is available:
  - hierarchical models are more difficult for MCMC;
  - strong dependencies, bad mixing.

- A main obstacle for Bayesian modeling is still the issue of “Bayesian computing”.


Page 65: Bayesian computation with INLA

So what’s wrong with MCMC?

This is actually a problem with any Monte Carlo scheme.

Error in expectations

The Monte Carlo error satisfies

Var( E(f(X)) − (1/N) ∑_{i=1}^{N} f(xi) )^{1/2} = O(1/√N)

In practical terms, to reduce the error to O(10^{−p}) you need O(10^{2p}) samples!

This can be optimistic!
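A quick base-R check of this rate: across 200 replicate Monte Carlo estimates of E(X) = 0 for a standard normal, √N times the standard deviation of the estimate stays roughly constant.

set.seed(2)
for (N in 10^(2:5)) {
  est <- replicate(200, mean(rnorm(N)))       # 200 Monte Carlo estimates of E(X)
  cat(sprintf("N = %6d  sd = %.5f  sqrt(N)*sd = %.3f\n",
              N, sd(est), sqrt(N) * sd(est)))
}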

Page 66: Bayesian computation with INLA

Be more narrow

MCMC

- MCMC ‘works’ for everything, but it is not usually optimal when we focus on a specific class of models.

- It works for latent Gaussian models, but it is too slow.

- (Unfortunately) sometimes it is the only thing we can do.

INLA

- Integrated Nested Laplace Approximations

- A deterministic algorithm, unlike stochastic MCMC.

- Specially designed for latent Gaussian models.

- Accurate results in a small fraction of the computational time of MCMC.


Page 72: Bayesian computation with INLA

Comparing results with MCMC

- When comparing the results of R-INLA with MCMC, it is important to use the same model.

- Here we compare the EPIL example results with those obtained using JAGS via the rjags package.


Page 74: Bayesian computation with INLA

[Figures (pages 74-84): posterior densities of a0, alpha.Age, log(tau.b1) and log(tau.b) for the EPIL example, with the MCMC (JAGS) results after 0.125, 0.25, 0.5, 1, 2, 4, 8, 16, 32, 64 and 120 minutes of sampling overlaid on the INLA answers.]

Page 85: Bayesian computation with INLA

Outline

Latent Gaussian models

Are latent Gaussian models important?

Bayesian computing

INLA method


Page 86: Bayesian computation with INLA

Main aim

Posterior:

π(x, θ | y) ∝ π(θ) π(x | θ) ∏_{i∈I} π(yi | xi, θ)

Compute the posterior marginals:

π(xi | y) = ∫ π(θ | y) π(xi | θ, y) dθ

π(θj | y) = ∫ π(θ | y) dθ_{−j}


Page 88: Bayesian computation with INLA

Tasks

1. Build an approximation to π(θ | y): π̃(θ | y)

2. Build an approximation to π(xi | θ, y): π̃(xi | θ, y)

π(xi | y) = ∫ π(θ | y) π(xi | θ, y) dθ

π(θj | y) = ∫ π(θ | y) dθ_{−j}

3. Do the integration wrt θ numerically.


Page 91: Bayesian computation with INLA

Tasks

1. Build an approximation to π(θ | y): π̃(θ | y)

2. Build an approximation to π(xi | θ, y): π̃(xi | θ, y)

π̃(xi | y) = ∫ π̃(θ | y) π̃(xi | θ, y) dθ

π̃(θj | y) = ∫ π̃(θ | y) dθ_{−j}

3. Do the integration wrt θ numerically.


Page 93: Bayesian computation with INLA

Task 1: π̃(θ|y)

The Laplace approximation for π(θ|y) is

π(θ | y) = π(x, θ | y) / π(x | θ, y)
         ∝ π(θ) π(x | θ) π(y | x, θ) / π(x | θ, y)
         ≈ [ π(θ) π(x | θ) π(y | x, θ) / πG(x | θ, y) ] |_{x = x*(θ)}

where πG(x | θ, y) is the Gaussian approximation of π(x | θ, y) and x*(θ) is its mode.

Page 94: Bayesian computation with INLA

The GMRF-approximation

π(x | y) ∝ exp( −(1/2) xᵀQx + ∑_i log π(yi | xi) )

        ≈ exp( −(1/2) (x − µ)ᵀ (Q + diag(ci)) (x − µ) ) = π̃(x | y)

Constructed as follows:

- Locate the mode x*

- Expand to second order

Markov and computational properties are preserved.
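A minimal base-R sketch of this construction for a toy Poisson count model with an RW1-type prior (the precision matrix, data, and the small ridge that makes the intrinsic RW1 invertible are all illustrative): Newton iterations locate the mode x*, and the second-order expansion there gives ci = exp(xi*).

set.seed(3)
n <- 50
Q <- toeplitz(c(2, -1, rep(0, n - 2)))       # tridiagonal RW1-type precision
Q[1, 1] <- Q[n, n] <- 1                      # boundary corrections
Q <- Q + diag(1e-5, n)                       # tiny ridge: the RW1 precision is singular
y <- rpois(n, lambda = exp(sin(seq(0, 3, length.out = n))))
mu <- rep(0, n)
for (it in 1:20) {
  grad <- as.vector(-Q %*% mu) + (y - exp(mu))  # gradient of log pi(x | y)
  H <- Q + diag(exp(mu))                        # negative Hessian: Q + diag(c_i)
  mu <- mu + as.vector(solve(H, grad))          # Newton step towards the mode x*
}
# Gaussian approximation: pi~(x | y) = N(mu, (Q + diag(exp(mu)))^{-1})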

Page 95: Bayesian computation with INLA

Remarks

The Laplace approximation

π̃(θ | y)

turns out to be accurate: x | y, θ appears almost Gaussian in most cases, as

- x is a priori Gaussian.

- y is typically not very informative.

- The observational model is usually ‘well-behaved’.

Note: π̃(θ | y) itself does not look Gaussian!


Page 100: Bayesian computation with INLA

Task 2: π̃(xi | y, θ)

This task is more challenging, since

- the dimension n of x is large, and

- there are potentially n marginals to compute, or at least O(n).

Here we present three options:

1. Gaussian approximation

2. Laplace approximation

3. Simplified Laplace approximation

There is a trade-off between accuracy and complexity.


Page 102: Bayesian computation with INLA

π̃(xi | y, θ) - 1. Gaussian approximation

An obvious, simple and fast alternative is to use the GMRF approximation πG(x | y, θ):

π̃(xi | θ, y) = N(xi; µi(θ), σ²i(θ))

- It is the fastest option; we only need to compute the diagonal of Q(θ)^{-1}.

- It can exhibit errors in location and asymmetry.


Page 104: Bayesian computation with INLA

π̃(xi | y, θ) - 2. Laplace approximation

- The Laplace approximation:

  π̃(xi | y, θ) ≈ [ π(x, θ | y) / πGG(x_{−i} | xi, y, θ) ] |_{x_{−i} = x*_{−i}(xi, θ)}

- Again, the approximation is very good, as x_{−i} | xi, θ is ‘almost Gaussian’,

- but it is expensive. In order to get the n marginals we must:
  - perform n optimizations, and
  - n factorizations of (n−1) × (n−1) matrices.


Page 107: Bayesian computation with INLA

π̃(xi | y, θ) - 3. Simplified Laplace approximation

Taylor expansions of the Laplace approximation for π(xi | θ, y):

- computationally much faster

- corrects the Gaussian approximation for errors in location and skewness

log π̃(xi | θ, y) = −(1/2) xi² + b xi + (1/6) d xi³ + · · ·

- Fit a skew-Normal density

  2 φ(x) Φ(ax)

- sufficiently accurate for most applications
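The skew-Normal density above is easy to inspect in base R (the skewness value a = 3 is arbitrary):

sn.dens <- function(x, a) 2 * dnorm(x) * pnorm(a * x)     # 2 phi(x) Phi(a x)
x <- seq(-4, 4, length.out = 200)
plot(x, dnorm(x), type = "l", lty = 2, ylab = "density")  # symmetric Gaussian (a = 0)
lines(x, sn.dens(x, a = 3))                 # a > 0 skews the density to the right
integrate(sn.dens, -Inf, Inf, a = 3)$value  # still integrates to 1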


Page 111: Bayesian computation with INLA

Task 3: Numerical integration wrt θ

Now that we know how to compute:

- π̃(θ | y): Laplace approximation

- π̃(xi | θ, y): 1. Gaussian; 2. Laplace; 3. Simplified Laplace

Let's see how INLA works.


Page 113: Bayesian computation with INLA

The integrated nested Laplace approximation (INLA) I

Explore π̃(θ|y)

- Locate the mode
- Use the Hessian to construct new variables
- Grid-search


Page 117: Bayesian computation with INLA

The integrated nested Laplace approximation (INLA) II

Step II: For each θj

- For each i, evaluate the Laplace approximation for selected values of xi

- Build a skew-Normal or log-spline corrected Gaussian

  N(xi; µi, σ²i) × exp(spline)

  to represent the conditional marginal density.


Page 120: Bayesian computation with INLA

The integrated nested Laplace approximation (INLA) III

Step III: Sum out θj

- For each i, sum out θ:

  π̃(xi | y) ∝ ∑_j π̃(xi | y, θj) × π̃(θj | y)

- Build a log-spline corrected Gaussian

  N(xi; µi, σ²i) × exp(spline)

  to represent π̃(xi | y).
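A toy base-R illustration of this weighted sum over the integration points; the conditional densities and weights below are stand-ins, not INLA output.

theta.grid <- seq(-2, 2, by = 0.5)           # integration points theta_j
w <- dnorm(theta.grid); w <- w / sum(w)      # stand-in for pi~(theta_j | y)
xi <- seq(-4, 4, length.out = 101)
cond <- sapply(theta.grid,                   # stand-in for pi~(xi | y, theta_j)
               function(th) dnorm(xi, mean = 0.3 * th))
marg <- as.vector(cond %*% w)                # pi~(xi | y): weighted sum over j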


Page 123: Bayesian computation with INLA

Computing posterior marginals for θj (I)

Main idea

- Use the integration points and build an interpolant

- Use numerical integration on that interpolant


Page 125: Bayesian computation with INLA

How can we assess the error in the approximations?

Tool 1: Compare a sequence of improved approximations

1. Gaussian approximation

2. Simplified Laplace

3. Laplace


Page 126: Bayesian computation with INLA

How can we assess the error in the approximations?

Tool 2: Estimate the “effective” number of parameters as definedin the Deviance Information Criteria:

pD(θ) = E[D(x; θ) | y, θ] − D(E[x | y, θ]; θ)

(the posterior mean deviance minus the deviance at the posterior mean)

and compare this with the number of observations.

A low ratio is good.

This criterion has theoretical justification.

Page 127: Bayesian computation with INLA

Part II

R-INLA package

Page 128: Bayesian computation with INLA

Outline

INLA implementation

R-INLA - Model specification

Some examples

Model evaluation

Controlling hyperparameters and priors

Some more advanced features

More examples

Extras


Page 129: Bayesian computation with INLA

Implementing INLA

All procedures required to perform INLA need to be carefully implemented to achieve good speed; it is easy to implement a slow version of INLA.

- The GMRFLib-library
  - Basic library written in C for fast computations with GMRFs.

- The inla-program
  - Defines latent Gaussian models and interfaces with the GMRFLib-library.
  - Models are defined using .ini-files.
  - The inla-program writes all the results (E/Var/marginals) to files.

- The INLA package for R
  - R-interface to the inla-program. (That's why it's not on CRAN.)
  - Converts “formula”-statements into “.ini”-file definitions.
  - Runs the inla-program.
  - Gets the results back into R.

Happily, the R package is all we need to learn!!!


Page 134: Bayesian computation with INLA

The INLA package for R

[Diagram: workflow of the INLA package. Input: an R data frame and a formula. The package produces an ini file and input files, runs the inla program, and collects the results into an R object of type list, from which one can get summaries, plots, etc.]

Page 135: Bayesian computation with INLA

R-INLA

- Visit the www-site

  www.r-inla.org

  and follow the instructions.

- The www-site contains source code, examples, reports +++

- The first time, do

  > source("http://www.math.ntnu.no/inla/givmeINLA.R")

  Later, you can upgrade the package by doing

  > inla.upgrade()

  or, if you want the test version (which you want),

  > inla.upgrade(testing=TRUE)

- Available for Linux, Windows and Mac


Page 138: Bayesian computation with INLA

Outline

INLA implementation

R-INLA - Model specification

Some examples

Model evaluation

Controlling hyperparameters and priors

Some more advanced features

More examples

Extras


Page 139: Bayesian computation with INLA

The structure of an R program using INLA

There are essentially three parts to an INLA program:

1. The data organization.

2. The formula - notation inherited from R’s native glm function.

3. The call to the INLA program.


Page 140: Bayesian computation with INLA

The inla function

I This is all that’s needed for a basic call

> result <- inla(
      formula = y ~ 1 + x,         # describes your latent field
      family = "gaussian",         # the likelihood distribution
      data = data.frame(y, x)      # a list or data frame
  )

Page 141: Bayesian computation with INLA

The simplest case: Linear regression

library(INLA)                    # the inla() function lives in the INLA package

n = 100
x = sort(runif(n))               # covariate
y = 1 + x + rnorm(n, sd = 0.1)   # Gaussian observations around the line 1 + x
plot(x, y)

formula = y ~ 1 + x
result = inla(formula,
              data = data.frame(x, y),
              family = "gaussian")

summary(result)
plot(result)
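Beyond summary() and plot(), the fitted object can be queried directly; a short sketch using the R-INLA result slots (summary.fixed, marginals.fixed) and the helper inla.qmarginal():

result$summary.fixed                  # posterior summaries of (Intercept) and x
m <- result$marginals.fixed$x         # marginal of the slope: two-column matrix
plot(m, type = "l")                   # posterior density of the slope
inla.qmarginal(c(0.025, 0.975), m)    # 95% credible interval from the marginal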

Page 142: Bayesian computation with INLA

Call:

c("inla(formula = formula, family = \"gaussian\", data = data.frame(x, ", " y))")

Time used:

Pre-processing Running inla Post-processing Total

0.08050394 0.03020334 0.01916695 0.12987423

Fixed effects:

mean sd 0.025quant 0.5quant 0.975quant kld

(Intercept) 0.9690533 0.01849785 0.9327319 0.9690531 1.005387 0

x 1.0426582 0.03126996 0.9812582 1.0426580 1.104079 0

The model has no random effects

Model hyperparameters:

mean sd 0.025quant 0.5quant

Precision for the Gaussian observations 127.45 18.10 95.14 126.37

0.975quant

Precision for the Gaussian observations 166.11

Expected number of effective parameters(std dev): 2.209(0.02362)

Number of equivalent replicates : 45.27

Marginal Likelihood: 88.01


Page 143: Bayesian computation with INLA

Likelihood functions - family argument

result = inla(formula,
              data = data.frame(x, y),
              family = "gaussian")

- “binomial”
- “coxph”
- “Exponential”
- “gaussian”
- “gev”
- “laplace”
- “sn” (Skew Normal)
- “stochvol”, “stochvol.nig”, “stochvol.t”
- “T”
- “weibull”
- Many others: go to http://r-inla.org/
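The list of likelihood families shipped with the installed version can be queried directly; a sketch assuming the inla.models() helper behaves as in current R-INLA versions:

library(INLA)
names(inla.models()$likelihood)   # all likelihood families known to this installation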


Page 145: Bayesian computation with INLA

A more general model

Assume the following model:

y ∼ π(y | η)

η = g(λ) = β0 + β1 x1 + β2 x2 + f(x3)

where

- x1, x2 are covariates with linear effects, βi ∼ N(0, τ1^{-1})

- x3 can be an index for a spatial effect, random effect, etc.: {f1, f2, …} ∼ N(0, Q_f^{-1}(τ2))


Page 147: Bayesian computation with INLA

A more general model (cont.)

Assume the following model:

y ∼ π(y | η)

η = g(λ) = β0 + β1 x1 + β2 x2 + f(x3)

> formula = y ~ x1 + x2 + f(x3, ...)

y = (y1, y2, …, yn)  --g-->  η = (η1, η2, …, ηn)

η = (η1, η2, …, ηn) = β0 (1, 1, …, 1) + β1 (x11, x12, …, x1n) + β2 (x21, x22, …, x2n) + (f_x31, f_x32, …, f_x3n)


Page 150: Bayesian computation with INLA

Model specification - INLA package

The model is specified in R through a formula, similar to glm:

> formula = y ~ x1 + x2 + f(x3, ...)

- y is the name of your response variable in your data frame.

- An intercept is fitted automatically! Use -1 in your formula to avoid it.

- The fixed effects (β0, β1 and β2) are taken as i.i.d. normal with zero mean and small precision. (This can be changed.)

- The f() function contains the random-effect specifications.

Some models:

- iid, iid1d, iid2d, iid3d: random effects

- rw1, rw2, ar1: smooth effect of covariates or time effect

- seasonal: seasonal effect

- besag: spatial effect (CAR model)

- generic: user-defined precision matrix


Page 156: Bayesian computation with INLA

Specifying random effects

Random effects are added to the formula through the function

f(name, model="...", hyper = ...,
  replicate = ..., constr = FALSE, cyclic = FALSE)

- name: the name of the random effect. Also refers to the values in data which are used for various things, usually indexes, e.g. for space or time.

- model: the latent model, e.g. “iid”, “rw2”, “ar1”, etc.

- hyper: specify the prior on the hyperparameters.

- constr: impose a sum-to-zero constraint?

- cyclic: is the effect cyclic? (rw1, rw2 and ar1)

- There are more advanced options; we will see them later.
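For example, a sketch of an AR(1) time effect with explicit hyperpriors; the hyperparameter names (prec, rho) follow the R-INLA convention for the ar1 model, and the prior values are illustrative:

formula <- y ~ x1 +
    f(time, model = "ar1",
      hyper = list(prec = list(prior = "loggamma", param = c(1, 0.01)),
                   rho  = list(prior = "normal",   param = c(0, 0.15))))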


Page 162: Bayesian computation with INLA

Outline

INLA implementation

R-INLA - Model specification

Some examples

Model evaluation

Controlling hyperparameters and priors

Some more advanced features

More examples

Extras

62 / 140

Page 163: Bayesian computation with INLA

EPIL example

Seizure counts in a randomized trial of anti-convulsant therapy in epilepsy. From the WinBUGS manual.

Patient  y1  y2  y3  y4  Trt  Base  Age
1        5   3   3   3   0    11    31
2        3   5   3   3   0    11    30
...
59       1   4   3   2   1    12    37

63 / 140

Page 164: Bayesian computation with INLA

EPIL example (cont.)

I Mixed model with repeated Poisson counts

yjk ∼ Poisson(µjk);  j = 1, . . . , 59;  k = 1, . . . , 4

log(µjk) = α0 + α1 log(Basej/4) + α2 Trtj + α3 Trtj log(Basej/4)
           + α4 Agej + α5 V4jk + Indj + βjk

αi ∼ N(0, τα),    τα known
Indj ∼ N(0, τInd),  τInd ∼ Gamma(a1, b1)
βjk ∼ N(0, τβ),    τβ ∼ Gamma(a2, b2)

64 / 140

Page 165: Bayesian computation with INLA

EPIL example (cont.)

The Epil data frame:

y  Trt  Base  Age  V4  rand  Ind
5  0    11    31   0   1     1
3  0    11    31   0   2     1
...

Specifying the model:

formula = y ~ log(Base/4) + Trt + I(Trt * log(Base/4)) +
    log(Age) + V4 +
    f(Ind, model = "iid") + f(rand, model = "iid")

η = (η1, η2, . . . , η4·59)ᵀ
  = β0 (1, 1, . . . , 1)ᵀ + . . .
    + (fInd,1, fInd,1, . . . , fInd,59)ᵀ      (each Indj shared by the 4 observations of patient j)
    + (frand,1, frand,2, . . . , frand,4·59)ᵀ

65 / 140


Page 168: Bayesian computation with INLA

library(INLA)

data(Epil)

## center the covariates before fitting
my.center = function(x) (x - mean(x))

Epil$CTrt    = my.center(Epil$Trt)
Epil$ClBase4 = my.center(log(Epil$Base/4))
Epil$CV4     = my.center(Epil$V4)
Epil$ClAge   = my.center(log(Epil$Age))

formula = y ~ ClBase4*CTrt + ClAge + CV4 +
    f(Ind, model="iid") + f(rand, model="iid")

result = inla(formula, family="poisson", data = Epil)

summary(result)
plot(result)

66 / 140

Page 169: Bayesian computation with INLA

Epil-example from Win/Open-BUGS

[Figure: posterior marginal for α0]

67 / 140

Page 170: Bayesian computation with INLA

Epil-example from Win/Open-BUGS

[Figure: posterior marginal for τβ]

67 / 140

Page 171: Bayesian computation with INLA

EPIL example (cont.)

Access results

- Summaries (mean, sd, [0.025, 0.5, 0.975]-quantiles, kld)

I result$summary.fixed

I result$summary.random$Ind

I result$summary.random$rand

I result$summary.hyperpar

- Posterior marginals (matrix with x- and y-values)

I result$marginals.fixed

I result$marginals.random$Ind

I result$marginals.random$rand

I result$marginals.hyperpar

68 / 140


Page 173: Bayesian computation with INLA

Smoothing binary time series

[Figure: number of days in Tokyo with rainfall above 1 mm in 1983-84]

We want to estimate the probability of rain pt for calendar day t = 1, . . . , 366.

69 / 140

Page 174: Bayesian computation with INLA

Smoothing binary time series

I Model with time series component

yt ∼ Binomial(nt, pt);  t = 1, . . . , 366
pt = exp(ηt) / (1 + exp(ηt))
ηt = f(t)
f = {f1, . . . , f366} ∼ cyclic RW2(τ)
τ ∼ Gamma(1, 0.0001)

70 / 140


Page 177: Bayesian computation with INLA

Smoothing binary time series

The Tokyo data frame:

y  n  time
0  2  1
0  2  2
1  2  3
...

Specifying the model:

formula = y ~ f(time, model="rw2", cyclic=TRUE) - 1

η = (η1, η2, . . . , η366)ᵀ = (ftime,1, ftime,2, . . . , ftime,366)ᵀ

71 / 140

Page 178: Bayesian computation with INLA

data(Tokyo)

formula = y ~ f(time, model="rw2", cyclic=TRUE) - 1
result = inla(formula, family="binomial", Ntrials=n, data=Tokyo)

72 / 140

Page 179: Bayesian computation with INLA

Posterior for temporal effect

[Figure: posterior mean and 0.025/0.5/0.975 quantiles of the temporal effect over time]

73 / 140

Page 180: Bayesian computation with INLA

Posterior for precision

[Figure: posterior density of the precision for time]

74 / 140

Page 181: Bayesian computation with INLA

Disease mapping in Germany

Larynx cancer mortality counts are observed in the 544 districts of Germany from 1986 to 1990, together with the level of smoking consumption (100 possible values).

[Two maps of the 544 districts: color scales 0.63 to 2.55 and 26.22 to 97]

75 / 140

Page 182: Bayesian computation with INLA

yi, i = 1, . . . , 544: counts of cancer mortality in region i
Ei, i = 1, . . . , 544: known variable accounting for demographic variation in region i
ci, i = 1, . . . , 544: level of smoking consumption registered in region i

[Two maps of the 544 districts: color scales 0.63 to 2.55 and 26.22 to 97]

76 / 140

Page 183: Bayesian computation with INLA

The model

yi ∼ Poisson{Ei exp(ηi)};  i = 1, . . . , 544
ηi = µ + f(ci) + fs(si) + fu(si)

where:

I f(ci) is a smooth effect of the covariate

  f = {f1, . . . , f100} ∼ RW2(τf)

I fs(si) is a spatial effect modeled as an intrinsic GMRF

  fs(s) | fs(s′), s ≠ s′, τfs ∼ N( (1/ns) Σ_{s′∼s} fs(s′),  ns τfs )

I fu(si) is a random effect

  fu = {fu(s1), . . . , fu(s544)} ∼ N(0, τfu I)

I µ is an intercept term, µ ∼ N(0, 0.0001)

77 / 140


Page 188: Bayesian computation with INLA

For identifiability we define a sum-to-zero constraint for all intrinsic models, so

Σs fs(s) = 0  and  Σi fi = 0

78 / 140

Page 189: Bayesian computation with INLA

The Germany data frame:

region  E          Y   x
0       7.965008   8   56
1       22.836219  22  65

The model is:

ηi = µ + f(ci) + fs(si) + fu(si)

I The data set has to contain one separate column for each term specified through f(), so in this case we have to add one column:
> Germany = cbind(Germany, region.struct=Germany$region)

I We also need the graph file where the neighborhood structure is specified: germany.graph

79 / 140


Page 192: Bayesian computation with INLA

The new data set is:

region  E          Y   x   region.struct
0       7.965008   8   56  0
1       22.836219  22  65  1

Then the formula is

formula <- Y ~ f(region.struct, model="besag", graph="germany.graph") +
    f(x, model="rw2") + f(region)

The sum-to-zero constraint is the default in the inla function for all intrinsic models.

The location of the graph file has to be provided here (the graph file cannot be loaded in R).

80 / 140


Page 197: Bayesian computation with INLA

The graph file

The germany.graph file:

544
1 1 12
2 2 10 11
3 4 6 8 15 387
...

I Total number of nodes in the graph (first line)

I Identifier for the node

I Number of neighbors

I Identifiers for the neighbors

81 / 140
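A graph can also be built directly in R; a small sketch, assuming inla.read.graph() accepts a symmetric adjacency matrix as well as a file name (toy 3-node graph):

library(INLA)

A = matrix(0, 3, 3)
A[1, 2] = A[2, 1] = 1    # nodes 1 and 2 are neighbors
A[2, 3] = A[3, 2] = 1    # nodes 2 and 3 are neighbors
g = inla.read.graph(A)
summary(g)               # number of nodes and the neighbor lists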


Page 202: Bayesian computation with INLA

data(Germany)
g = system.file("demodata/germany.graph", package="INLA")
source(system.file("demodata/Bym-map.R", package="INLA"))
Germany = cbind(Germany, region.struct=Germany$region)

# standard BYM model
formula1 = Y ~ f(region.struct, model="besag", graph=g) +
    f(region, model="iid")

# with linear covariate
formula2 = Y ~ f(region.struct, model="besag", graph=g) +
    f(region, model="iid") + x

# with smooth covariate
formula3 = Y ~ f(region.struct, model="besag", graph=g) +
    f(region, model="iid") + f(x, model="rw2")

82 / 140

Page 203: Bayesian computation with INLA

result1 = inla(formula1, family="poisson", data=Germany, E=E,
               control.compute=list(dic=TRUE))
result2 = inla(formula2, family="poisson", data=Germany, E=E,
               control.compute=list(dic=TRUE))
result3 = inla(formula3, family="poisson", data=Germany, E=E,
               control.compute=list(dic=TRUE))

83 / 140

Page 204: Bayesian computation with INLA

Other graph specification

- It is also possible to define the graph structure of your model using:

I A symmetric (dense or sparse) matrix, where the non-zero pattern of the matrix defines the graph.

I An inla.graph object.

See FAQ on the webpage for more information.

84 / 140

Page 205: Bayesian computation with INLA

Outline

INLA implementation

R-INLA - Model specification

Some examples

Model evaluation

Controlling hyperparameters and priors

Some more advanced features

More examples

Extras

85 / 140

Page 206: Bayesian computation with INLA

Model evaluation

I Deviance Information Criterion (DIC):

result = inla(..., control.compute = list(dic = TRUE))
result$dic$dic

I Conditional predictive ordinate (CPO) and probability integral transform (PIT):

CPOi = π(yi | y−i)
PITi = Prob(Yi ≤ yi^obs | y−i)

result = inla(..., control.compute = list(cpo = TRUE))
result$cpo$cpo
result$cpo$pit

86 / 140
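A small usage sketch of these quantities, assuming result comes from an inla() call with control.compute = list(cpo = TRUE): the CPO values give a cross-validated log-score, and a roughly uniform PIT histogram indicates a well-calibrated model.

logscore = -mean(log(result$cpo$cpo))   # smaller is better
hist(result$cpo$pit, breaks = 20)       # check calibration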

Page 207: Bayesian computation with INLA

Outline

INLA implementation

R-INLA - Model specification

Some examples

Model evaluation

Controlling hyperparameters and priors

Some more advanced features

More examples

Extras

87 / 140

Page 208: Bayesian computation with INLA

Controlling θ

I We often need to set our own priors and use our own parameters in these.

I These can be set in two ways

Old style using prior=..., param=..., initial=..., fixed=...

New style using hyper = list(prec = list(initial=2, fixed=TRUE, ....))

The old style is there for backward-compatibility only. The two styles can also be mixed.

88 / 140


Page 211: Bayesian computation with INLA

Example

- New style

hyper = list(
    prec = list(
        prior = "loggamma",
        param = c(2, 0.1),
        initial = 3,
        fixed = FALSE
    )
)
formula = y ~ f(i, model="iid", hyper = hyper) + ...

- Old style

formula = y ~ f(i, model="iid", prior = "loggamma",
                param = c(2, 0.1), initial = 3,
                fixed = FALSE) + ...

89 / 140

Page 212: Bayesian computation with INLA

Internal and external scale

Hyperparameters, like the precision τ, are represented internally using a "good" transformation, like

θ1 = log(τ)

I Initial values are given on the internal scale

I The to.theta and from.theta functions can be used to map between the external and internal scales.

90 / 140


Page 215: Bayesian computation with INLA

Example: AR1 model

hyper
    theta1
        name        log precision
        short.name  prec
        prior       loggamma
        param       1 5e-05
        initial     4
        fixed       FALSE
        to.theta
        from.theta
    theta2
        name        logit lag one correlation
        short.name  rho
        prior       normal
        param       0 0.15
        initial     2
        fixed       FALSE
        to.theta
        from.theta
constr              FALSE
nrow.ncol           FALSE
augmented           FALSE
aug.factor          1
aug.constr
n.div.by
n.required          FALSE
set.default.values  FALSE
pdf                 ar1

91 / 140
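This specification can be inspected from R; a sketch, assuming inla.models() exposes the latent-model definitions as a nested list:

library(INLA)

ar1.hyper = inla.models()$latent$ar1$hyper
names(ar1.hyper)                 # "theta1" "theta2"
ar1.hyper$theta1$prior           # default prior for the log precision
ar1.hyper$theta2$to.theta(0.5)   # map rho = 0.5 to the internal scale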

Page 216: Bayesian computation with INLA

Outline

INLA implementation

R-INLA - Model specification

Some examples

Model evaluation

Controlling hyperparameters and priors

Some more advanced features

More examples

Extras

92 / 140

Page 217: Bayesian computation with INLA

Feature: replicate

"replicate" generates iid replicates from the same model with the same hyperparameters.

If x | θ ∼ AR(1), then nrep=3 makes

x = (x1, x2, x3)

with mutually independent xi's from AR(1) with the same θ.

Most f()-models can be replicated

93 / 140

Page 218: Bayesian computation with INLA

Example: replicate

n = 100
x1 = arima.sim(n, model=list(ar=0.9)) + 1
x2 = arima.sim(n, model=list(ar=0.9)) - 1
y1 = rpois(n, exp(x1))
y2 = rpois(n, exp(x2))

y = c(y1, y2)
i = rep(1:n, 2)
r = rep(1:2, each=n)
intercept = as.factor(r)

formula = y ~ f(i, model="ar1", replicate=r) + intercept - 1
result = inla(formula, family = "poisson",
              data = data.frame(y=y, i=i, r=r))

94 / 140

Page 219: Bayesian computation with INLA

Example: replicate

i = rep(1:n, 2)
r = rep(1:2, each=n)
intercept = as.factor(r)
formula = y ~ f(i, model="ar1", replicate=r) + intercept - 1

(y1,1, . . . , yn,1, y1,2, . . . , yn,2)  --g-->  η = (η1,1, . . . , ηn,1, η1,2, . . . , ηn,2)ᵀ
  = (fi(1,1), . . . , fi(n,1), fi(1,2), . . . , fi(n,2))ᵀ
    + β0,1 (1, . . . , 1, 0, . . . , 0)ᵀ
    + β0,2 (0, . . . , 0, 1, . . . , 1)ᵀ

95 / 140

Page 220: Bayesian computation with INLA

Feature: More than one family

Every observation could have its own likelihood!

I Response is a matrix or list

I Each “column” defines a separate “family”

I Each “family” has its own hyperparameters

96 / 140

Page 221: Bayesian computation with INLA

n = 100
phi = 0.9
x1 = 1 + arima.sim(n, model=list(ar=phi))
x2 = 0.5 + arima.sim(n, model=list(ar=phi))
y1 = rbinom(n, size=1, prob=exp(x1)/(1+exp(x1)))
y2 = rpois(n, exp(x2))

y = matrix(NA, 2*n, 2)
y[  1:n, 1] = y1
y[n+1:n, 2] = y2

i = rep(1:n, 2)
r = rep(1:2, each=n)
intercept = as.factor(r)
Ntrials = c(rep(1,n), rep(NA,n))

formula = y ~ f(i, model="ar1", replicate=r) + intercept - 1
result = inla(formula, family = c("binomial", "poisson"),
              Ntrials = Ntrials, data = data.frame(y,i,r))

97 / 140

Page 222: Bayesian computation with INLA

With the two-column response matrix from the code above, the model reads:

      [ y1,1  NA  ]
      [  ...      ]
y  =  [ yn,1  NA  ]   --g-->  η = (η1,1, . . . , ηn,1, η1,2, . . . , ηn,2)ᵀ
      [ NA   y1,2 ]             = (fi(1,1), . . . , fi(n,1), fi(1,2), . . . , fi(n,2))ᵀ
      [  ...      ]               + β0,1 (1, . . . , 1, 0, . . . , 0)ᵀ
      [ NA   yn,2 ]               + β0,2 (0, . . . , 0, 1, . . . , 1)ᵀ

98 / 140

Page 223: Bayesian computation with INLA

More than one family - More examples

Some rather advanced examples on www.r-inla.org use this feature:

I Preferential sampling, geostatistics (marked point process)

I Weibull-survival data and "longitudinal" data

99 / 140

Page 224: Bayesian computation with INLA

Feature: copy

The model

formula = y ~ f(i, ...) + ...

only allows ONE element from each sub-model to contribute to the linear predictor for each observation.

Sometimes this is not sufficient.

100 / 140

Page 225: Bayesian computation with INLA

Feature: copy

Suppose

ηi = ui + ui+1 + ...

Then we can code this as

formula = y ~ f(i, model="iid") + f(i.plus, copy="i")

I The copy-feature creates an additional sub-model which is ε-close to the target.

I Many copies allowed

I Copy with unknown scaling (default scaling is fixed to 1).

η = (η1, . . . , ηn)ᵀ = (u1, . . . , un)ᵀ + (u2, . . . , un+1)ᵀ

101 / 140

Page 226: Bayesian computation with INLA

Feature: copy

Suppose that

ηi = ai + bi zi + ....

where (ai, bi) ∼iid N2(0, Σ)

- Simulate data

library(mvtnorm)   # for rmvnorm

n = 100
Sigma = matrix(c(1, 0.8, 0.8, 1), 2, 2)
z = runif(n)
ab = rmvnorm(n, sigma = Sigma)
a = ab[, 1]
b = ab[, 2]
eta = a + b * z
s = 0.1
y = eta + rnorm(n, sd=s)

102 / 140

Page 227: Bayesian computation with INLA

i = 1:n
j = 1:n + n
formula = y ~ f(i, model="iid2d", n = 2*n) + f(j, z, copy="i") - 1
r = inla(formula, data = data.frame(y, i, j))

With the latent iid2d field x = (a1, . . . , an, b1, . . . , bn):

η = (η1, . . . , ηn)ᵀ = (a1, . . . , an)ᵀ + (b1 z1, . . . , bn zn)ᵀ

103 / 140

Page 228: Bayesian computation with INLA

Feature: Linear-combinations

Possible to extract extra information from the model through linear combinations of the latent field, say

v = Bx

for a k × n matrix B.

104 / 140

Page 229: Bayesian computation with INLA

Feature: Linear-combinations (cont.)

Two different approaches.

1. Most "correct" is to do the computations on the enlarged field

x̃ = (x, v)

But this often leads to a denser precision matrix.

2. The second option is to compute these "offline", as (conditionally on θ)

Var(v1) = Var(b1ᵀ x) ≈ b1ᵀ Q⁻¹(GMRF approx) b1

and

E(v1) = b1ᵀ E(x)

Approximate the density of v1 with a Normal.

105 / 140


Page 231: Bayesian computation with INLA

formula = y ~ ClBase4*CTrt + ClAge + CV4 +
    f(Ind, model="iid") + f(rand, model="iid")

## Now I want the posterior for
##
## 1) 2*CTrt - CV4
## 2) Ind[2] - rand[2]
##
lc1 = inla.make.lincomb(CTrt = 2, CV4 = -1)
names(lc1) = "lc1"
lc2 = inla.make.lincomb(Ind = c(NA,1), rand = c(NA,-1))
names(lc2) = "lc2"

## default is to derive the marginals from the lc's without changing
## the latent field
result1 = inla(formula, family="poisson", data = Epil,
               lincomb = c(lc1, lc2))

## but the lincombs can also be additionally included into the latent
## field for increased accuracy...
result2 = inla(formula, family="poisson", data = Epil,
               lincomb = c(lc1, lc2),
               control.inla = list(lincomb.derived.only = FALSE))

106 / 140

Page 232: Bayesian computation with INLA

- Get the results

result$summary.lincomb.derived
result$marginals.lincomb.derived  # results of the default method

result$summary.lincomb
result$marginals.lincomb          # alternative method

- Posterior correlation matrix between all the linear combinations

control.inla = list(lincomb.derived.correlation.matrix = TRUE)
result$misc$lincomb.derived.correlation.matrix

- Many linear combinations at once: use inla.make.lincombs(), as sketched below.

107 / 140
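A hedged sketch of inla.make.lincombs(), reusing the EPIL objects above and assuming that, for a fixed effect, each named argument takes a vector with one weight per linear combination (here two: 1*CTrt - 1*CV4 and 2*CTrt - 2*CV4):

lcs = inla.make.lincombs(CTrt = c(1, 2), CV4 = c(-1, -2))
result = inla(formula, family = "poisson", data = Epil, lincomb = lcs)
result$summary.lincomb.derived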

Page 233: Bayesian computation with INLA

A-matrix in the linear predictor (I)

Usual formula

η = ...

and

yi ∼ π(yi | ηi, ...)

108 / 140

Page 234: Bayesian computation with INLA

A-matrix in the linear predictor (II)

Extended formula

η = ...
η* = Aη

and

yi ∼ π(yi | η*i, ...)

Implemented as

A = matrix(...)          # or, for a sparse A,
A = sparseMatrix(...)    # from the Matrix package
result = inla(formula, ...,
              control.predictor = list(A = A))

109 / 140


Page 236: Bayesian computation with INLA

A-matrix in the linear predictor (III)

I Can really simplify model formulations

I Duplicates to some extent the "copy" feature

I Really useful for some models; the A-matrix need not be a square matrix... (see the sketch below)

110 / 140
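A hedged sketch of a non-square A-matrix (simulated data, hypothetical names): each observation is the average of two consecutive values of a smooth latent field, so A is (n-1) x n.

library(INLA)
library(Matrix)

n = 50
f.true = sin((1:n) * 0.3)

## row t of A averages eta_t and eta_{t+1}
A = sparseMatrix(i = rep(1:(n-1), 2),
                 j = c(1:(n-1), 2:n),
                 x = 0.5)
y = as.vector(A %*% f.true) + rnorm(n-1, sd = 0.1)

idx = 1:n
formula = y ~ -1 + f(idx, model = "rw2")
result = inla(formula, data = list(y = y, idx = idx),
              control.predictor = list(A = A))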

Page 237: Bayesian computation with INLA

Feature: remote computing

For large/huge models, it is more convenient to run the computations on a remote (Linux/Mac) computational server

inla(...., inla.call="remote")

using ssh (and Cygwin on Windows).

111 / 140

Page 238: Bayesian computation with INLA

Control statements

The control.xxx statements control various parts of the INLA program

I control.predictor
  I A: the "A matrix" or "observation matrix" linking the latent field to the data.

I control.mode
  I x, theta, result: gives modes to INLA.
  I restart = TRUE: tells INLA to try to improve on the supplied mode.

I control.compute
  I dic, mlik, cpo: compute measures of fit.

I control.inla
  I strategy and int.strategy contain useful advanced features.

Various others: see the help pages!

112 / 140

Page 239: Bayesian computation with INLA

Outline

INLA implementation

R-INLA - Model specification

Some examples

Model evaluation

Controlling hyperparameters and priors

Some more advanced features

More examples

Extras

113 / 140

Page 240: Bayesian computation with INLA

Space-varying regression

Number of (insurance-type) losses Nkt in 431 municipalities/regions of Norway in relation to one weather covariate Wkt. The likelihood is

Nkt ∼ Poisson(Akt pkt);  k = 1, . . . , 431;  t = 1, . . . , 10

The model for log pkt is:

log pkt = β0 + βk Wkt

where βk is the regression coefficient for municipality k.

114 / 140

Page 241: Bayesian computation with INLA

Borrow strength..

Few losses are observed in each region, so there is high variability in the estimates.

We borrow strength by letting {β1, . . . , β431} be smooth in space:

{β1, . . . , β431} ∼ CAR(τβ)

115 / 140


Page 243: Bayesian computation with INLA

The data set:

y region W

1 0 1 0.4

2 0 1 0.4

10 0 1 0.4

11 1 2 0.2

12 0 2 0.2

20 0 2 0.2

116 / 140

Page 244: Bayesian computation with INLA

The second argument in f() is the weight, which defaults to 1:

ηi = ... + wi fi + ...

is represented as

f(i, w, ...)

No need for a sum-to-zero constraint!

norway = read.table("norway.dat", header=TRUE)
formula = y ~ 1 + f(region, W, model="besag",
                    graph.file="norway.graph",
                    constr=FALSE)
result = inla(formula, family="poisson", data=norway)

117 / 140

Page 245: Bayesian computation with INLA

Survival models

patient  time   event  age    sex
1        8,16   1,1    28,28  0
2        23,13  1,0    48,48  1
3        22,18  1,1    32,32  0

I Times of infection from the time of insertion of catheter on 38 kidney patients using portable dialysis equipment.

I 2 observations per patient (38 patients).

I Each time can be an event (infection) or a censoring (no infection).

118 / 140

Page 246: Bayesian computation with INLA

The Kidney data

The Kidney data frame

time event age sex ID

8 1 28 0 1

16 1 28 0 1

23 1 48 1 2

13 0 48 1 2

22 1 32 0 3

28 1 32 0 3

119 / 140

Page 247: Bayesian computation with INLA

data(Kidney)

formula = inla.surv(time,event) ~ age + sex + f(ID,model="iid")

result1 = inla(formula, family="coxph", data=Kidney)

result2 = inla(formula, family="weibull", data=Kidney)

result3 = inla(formula, family="exponential", data=Kidney)

120 / 140

Page 248: Bayesian computation with INLA

Outline

INLA implementation

R-INLA - Model specification

Some examples

Model evaluation

Controlling hyperparameters and priors

Some more advanced features

More examples

Extras

121 / 140

Page 249: Bayesian computation with INLA

A toy-example using copy

State-space model

yt = xt + vt
xt = 2xt−1 − xt−2 + wt

Rewrite this as

yt = xt + vt
0 = xt − 2xt−1 + xt−2 + wt

and implement this as two families

1. Observations yt with precision Prec(vt)

2. Observations 0 with precision Prec(wt), or Prec=HIGH.

122 / 140


Page 251: Bayesian computation with INLA

n = 100
m = n-2
y = sin((1:n)*0.2) + rnorm(n, sd=0.1)

Y = matrix(NA, n+m, 2)
Y[1:n, 1] = y
Y[1:m + n, 2] = 0

i = c(1:n, 3:n)               # x_t
j = c(rep(NA,n), 3:n - 1)     # x_{t-1}
w = c(rep(NA,n), rep(-2,m))   # weights for j
k = c(rep(NA,n), 3:n - 2)     # x_{t-2}
l = c(rep(NA,n), 1:m)         # w_t (system noise)

formula = Y ~ f(i, model="iid", initial=-10, fixed=TRUE) +
    f(j, w, copy="i") + f(k, copy="i") +
    f(l, model="iid") - 1

r = inla(formula, data = data.frame(i,j,w,k,l,Y),
         family = c("gaussian", "gaussian"),
         control.data = list(list(), list(initial=10, fixed=TRUE)))

123 / 140

Page 252: Bayesian computation with INLA

Stochastic Volatility model

[Figure: log of the daily difference of the pound-dollar exchange rate from October 1st, 1981, to June 28th, 1985]

124 / 140

Page 253: Bayesian computation with INLA

Stochastic Volatility model

Simple model

xt | x1, . . . , xt−1, τ, φ ∼ N(φ xt−1, 1/τ)

where |φ| < 1 to ensure a stationary process.

Observations are taken to be

yt | x1, . . . , xt, µ ∼ N(0, exp(µ + xt))

125 / 140
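A hedged sketch of fitting this model in R-INLA on simulated data; it assumes a "stochvol" likelihood (Gaussian with variance exp(η), so η = µ + xt) is available, together with the ar1 latent model:

library(INLA)

n = 500
phi = 0.95
x = arima.sim(n, model = list(ar = phi), sd = 0.25)
y = rnorm(n, mean = 0, sd = exp((-1 + x) / 2))   # mu = -1

time = 1:n
formula = y ~ 1 + f(time, model = "ar1")
result = inla(formula, family = "stochvol",
              data = data.frame(y, time))
summary(result)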

Page 254: Bayesian computation with INLA

Results

Using only the first 50 data points, which makes the problem much harder.

126 / 140

Page 255: Bayesian computation with INLA

Results

[Figure: posterior marginal for ν = logit(2φ − 1)]

126 / 140

Page 256: Bayesian computation with INLA

Results

[Figure: posterior marginal for log(κx)]

126 / 140

Page 257: Bayesian computation with INLA

Using the full dataset

[Figure: the Pound-Dollar data]

127 / 140

Page 258: Bayesian computation with INLA

Using the full dataset

[Figure: posterior mean of xt + µ]

128 / 140

Page 259: Bayesian computation with INLA

Using the full dataset

[Figure: the posterior marginal for the precision]

129 / 140

Page 260: Bayesian computation with INLA

Using the full dataset

[Figure: the posterior marginal for the lag-1 correlation]

130 / 140

Page 261: Bayesian computation with INLA

Using the full dataset

[Figure: predictions for µ + xt+k]

131 / 140

Page 262: Bayesian computation with INLA

New data-model: Student-tν

Now extend the model to use the Student-tν distribution

yt | x1, . . . , xt ∼ exp(µ/2 + xt/2) × Tν / √(ν/(ν − 2))

where Tν is a standard Student-t with ν degrees of freedom, scaled here to unit variance.

132 / 140
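A hedged sketch, reusing the objects from the Gaussian sketch above and assuming the Student-t volatility likelihood is exposed under the name "stochvol.t":

result.t = inla(formula, family = "stochvol.t",
                data = data.frame(y, time))
summary(result.t)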

Page 263: Bayesian computation with INLA

Student-tν

[Figure: posterior marginal for ν]

133 / 140

Page 264: Bayesian computation with INLA

Student-tν

[Figure: predictions]

134 / 140

Page 265: Bayesian computation with INLA

Student-tν

[Figure: comparing predictions with Student-tν and Gaussian]

135 / 140

Page 266: Bayesian computation with INLA

Student-tν

However, there is no support for the Student-tν in the data, judged by

I the Bayes factor

I the Deviance Information Criterion

136 / 140

Page 267: Bayesian computation with INLA

Disease mapping: The BYM-model

I Data yi ∼ Poisson(Ei exp(ηi))

I Log-relative risk ηi = ui + vi

I Structured component u

I Unstructured component v

I Log-precisions log κu and log κv

[Map: estimated log-relative risk across the districts]

I A hard case: Insulin Dependent Diabetes Mellitus in 366 districts of Sardinia. Few counts.

I dim(θ) = 2.

A hedged R sketch of this decomposition follows below.

137 / 140
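A hedged sketch of the BYM decomposition in R-INLA, assuming the combined u + v structure is available as the "bym" latent model, and reusing g and the Germany data from the disease-mapping example (the Sardinia data are not used here):

formula.bym = Y ~ f(region, model = "bym", graph = g)
result.bym = inla(formula.bym, family = "poisson", E = E, data = Germany)
summary(result.bym)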

Page 268: Bayesian computation with INLA

Marginals for θ|y

[Figure: posterior marginals for θ|y]

138 / 140


Page 270: Bayesian computation with INLA

Marginals for xi|y

[Figure: posterior marginals for xi|y]

139 / 140

Page 271: Bayesian computation with INLA

THANK YOU

140 / 140