Bayesian Emulation and Calibration of a Dynamic Epidemic Model for H1N1 Influenza
Marian Farah (1)
Paul Birrell (1), Stefano Conti (2), Daniela De Angelis (1,2)
(1) MRC Biostatistics Unit, Cambridge, UK
(2) Health Protection Agency, London, UK
ICERM Bayesian Nonparametrics Workshop, September 19, 2012
Dynamic Bayesian modeling for epidemics
Marian Farah, Biostatistics, Cambridge
Outline
Introduction
Methods
Results
Discussion
Motivation
• Tracking and predicting the behavior of an emerging epidemic is essential for a prompt public health response.
• Inferential goals:
  • What is happening? i.e., real-time estimation of the epidemic parameters.
  • What is going to happen next? i.e., forecasting the (short-term) evolution of the epidemic.
  • What happened? i.e., “reconstructing” the epidemic by estimating its parameters and evolution dynamics.
• Noisy time-series data coming from different sources.
1 Introduction: Epidemic modeling
2 Emulation and calibration of epidemic models
3 Preliminary results
4 Discussion
Introduction
Epidemic modeling
• Transmission model:
$S(t) \xrightarrow{\text{getting infected}} E(t) \xrightarrow{\text{latent period}} I(t) \xrightarrow{\text{infectious period}} R(t)$
• Transmission depends on the virulence, the mixing patterns in the population, and the transition rates among the S, E, I, and R states.
• Transmission dynamics are typically described by a system of differential equations.
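As an illustration of such a system, here is a minimal sketch of a textbook SEIR model integrated with forward Euler steps. It is not the model used in the talk: the rate names `beta`, `sigma`, `gamma`, the homogeneous-mixing assumption, and all numerical values are assumptions made for this example only.

```python
import numpy as np

def simulate_seir(beta, sigma, gamma, s0, e0, i0, r0, days, dt=0.1):
    """Integrate a basic SEIR system with forward Euler steps.

    dS/dt = -beta * S * I / N
    dE/dt =  beta * S * I / N - sigma * E
    dI/dt =  sigma * E - gamma * I
    dR/dt =  gamma * I
    """
    n_steps = int(round(days / dt))
    state = np.array([s0, e0, i0, r0], dtype=float)
    total = state.sum()                          # population size N (constant)
    path = np.empty((n_steps + 1, 4))
    path[0] = state
    for k in range(n_steps):
        s, e, i, r = state
        new_exposed = beta * s * i / total       # S -> E flow
        new_infectious = sigma * e               # E -> I flow
        new_removed = gamma * i                  # I -> R flow
        deriv = np.array([-new_exposed,
                          new_exposed - new_infectious,
                          new_infectious - new_removed,
                          new_removed])
        state = state + dt * deriv
        path[k + 1] = state
    return path

# Hypothetical values, chosen only for illustration.
path = simulate_seir(beta=0.5, sigma=0.5, gamma=0.3,
                     s0=9990, e0=0, i0=10, r0=0, days=200)
```

The columns of `path` hold S, E, I, and R over time; their sum stays equal to the population size because the four flows cancel.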
Birrell et al. (2011) H1N1 model
$S(t) \xrightarrow{\eta_1, \eta_2} E(t) \xrightarrow{\eta_3} I(t) \xrightarrow{\eta_4} R(t)$
η5 ↓ incubation
Expected # of symptomatic individuals
η6 ↓ propensity to consult doctor
Expected # of doctor consultations
↓ delay in reporting
Expected # of reported cases, µ(η, t)
• η = (η1, ..., η6): underlying parameters of the epidemic.
• Proportion of symptomatic cases, propensity to consult, exponential growth rate, expected infectious period, a measure of the initial number of infected individuals, population interaction parameters.
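The chain from infections to expected reported cases can be sketched as a sequence of thinning and delay steps. This is only a schematic reading of the diagram, not the Birrell et al. (2011) implementation: the function name, the discrete delay distributions, and every number below are hypothetical.

```python
import numpy as np

def expected_reported_cases(new_infections, p_symptomatic, p_consult,
                            incubation_pmf, reporting_pmf):
    """Push daily new infections through delay and thinning steps."""
    T = len(new_infections)
    # Delay from infection to symptom onset, thinned by the symptomatic proportion.
    symptomatic = p_symptomatic * np.convolve(new_infections, incubation_pmf)[:T]
    # Only a fraction of symptomatic individuals consult a doctor.
    consultations = p_consult * symptomatic
    # Delay between the consultation and the case appearing in reports.
    return np.convolve(consultations, reporting_pmf)[:T]

# Hypothetical inputs, for illustration only.
new_infections = np.array([0.0, 10.0, 20.0, 40.0, 80.0, 60.0, 30.0, 10.0])
incubation_pmf = np.array([0.0, 0.5, 0.5])   # onset 1 or 2 days after infection
reporting_pmf = np.array([0.5, 0.5])         # reported the same day or the next
mu = expected_reported_cases(new_infections, 0.4, 0.3,
                             incubation_pmf, reporting_pmf)
```

In this sketch `new_infections` would come from the transmission model, so `mu` depends on all epidemic parameters at once.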
Computational challenge
• The likelihood of reported data, z(t), t = 1, ..., T, depends on µ.
• $p(\eta \mid z_{1:T}, \mu) \propto \prod_{t=1}^{T} p\big(z(t);\, \mu(\eta, t)\big) \times p(\eta)$
• µ(η, t) must be computed at every MCMC iteration.
• µ is computationally expensive.
• What about an efficient estimate?
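To make the cost concrete, the sketch below runs a random-walk Metropolis sampler and counts simulator calls. The Poisson form of the likelihood, the flat prior, and the cheap `toy_simulator` standing in for µ are all assumptions made for this example.

```python
import math
import numpy as np

def log_posterior(eta, z, simulator, log_prior):
    """Unnormalised log posterior: sum_t log p(z(t); mu(eta, t)) + log p(eta)."""
    mu = simulator(eta)                          # one (expensive) simulator run
    log_z_factorial = np.array([math.lgamma(k + 1.0) for k in z])
    log_lik = np.sum(z * np.log(mu) - mu - log_z_factorial)   # Poisson (assumed)
    return log_lik + log_prior(eta)

def metropolis(z, simulator, log_prior, eta0, n_iter, step, rng):
    """Random-walk Metropolis; each iteration evaluates the simulator once."""
    eta = np.asarray(eta0, dtype=float)
    current = log_posterior(eta, z, simulator, log_prior)
    draws = np.empty((n_iter, eta.size))
    for it in range(n_iter):
        proposal = eta + step * rng.standard_normal(eta.size)
        candidate = log_posterior(proposal, z, simulator, log_prior)
        if np.log(rng.uniform()) < candidate - current:
            eta, current = proposal, candidate
        draws[it] = eta
    return draws

# Hypothetical cheap stand-in for the simulator, with a call counter.
calls = {"n": 0}
t = np.arange(1, 21)

def toy_simulator(eta):
    calls["n"] += 1
    return np.exp(eta[0] + eta[1] * t / 20.0)

def flat_log_prior(eta):
    return 0.0 if np.all(np.abs(eta) < 5.0) else -np.inf

rng = np.random.default_rng(0)
z = rng.poisson(toy_simulator(np.array([1.0, 1.0])))
draws = metropolis(z, toy_simulator, flat_log_prior, [0.5, 0.5], 500, 0.1, rng)
```

After the run, `calls["n"]` is 502: one call to generate the data, one for the starting point, and one per MCMC iteration. With a simulator that takes minutes per run, that per-iteration call is the bottleneck.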
Computer simulator
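Specify inputs η = X → run code → outputs
$$
X = \begin{pmatrix}
x_{1,1} & \cdots & x_{1,6} \\
x_{2,1} & \cdots & x_{2,6} \\
\vdots & & \vdots \\
x_{n,1} & \cdots & x_{n,6}
\end{pmatrix}
\;\longrightarrow\; \text{Birrell et al. (2011)} \;\longrightarrow\;
\begin{pmatrix}
\mu(x_1, 1), \ldots, \mu(x_1, T) \\
\mu(x_2, 1), \ldots, \mu(x_2, T) \\
\vdots \\
\mu(x_n, 1), \ldots, \mu(x_n, T)
\end{pmatrix}
$$
[Figure: two panels against time (0 to 250): µ(η, t), with vertical axis in units of 10^4, and log µ(η, t).]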
Calibration and Emulation
• Calibration: (e.g., Higdon et al., 2004)
Posterior inference for η through the simulator, µ, and “field” observed data z(t):
Observed = Reality + Error
Observed (z) = Simulator (µ) + bias (b) + Error
• $p(\eta, b \mid z_{1:T}, \mu) \propto \prod_{t=1}^{T} p\big(z(t);\, \mu(\eta, t) + b\big) \times p(\eta)\, p(b)$
• Emulation: (e.g., Kennedy and O’Hagan, 2000)
Estimating the output of a slow computer simulator, µ, using a fast statistical model (an emulator), say µ̂.
Calibration and Emulation
• Idea: (e.g., Bayarri et al., 2007a)
Replace the slow simulator output, µ, with the fast emulator estimate, µ̂, and obtain posterior inference for η through
• $p(\eta, b \mid z_{1:T}, \hat{\mu}) \propto \prod_{t=1}^{T} p\big(z(t);\, \hat{\mu}(\eta, t) + b\big) \times p(\eta)\, p(b)$
Emulation and calibration of dynamic models
Emulation review
• A deterministic computer simulator is a function f(·) that maps input x to a unique output y = f(x).
• The function f(·) is treated as unknown and given a prior.
• Likelihood: data are runs of the simulator, given a design over the input space, e.g., a Latin hypercube.
• Emulator: the posterior (predictive) distribution of f(·).
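A Latin hypercube design of the kind mentioned above can be generated in a few lines. This is a minimal sketch on the unit cube; the function name and the choice of 10 runs are assumptions, while the 6 input dimensions match the six epidemic parameters.

```python
import numpy as np

def latin_hypercube(n_runs, n_inputs, rng):
    """Latin hypercube sample on [0, 1]^n_inputs.

    Each input's range is cut into n_runs equal strata, and every stratum
    is used exactly once per input.
    """
    design = np.empty((n_runs, n_inputs))
    for j in range(n_inputs):
        strata = rng.permutation(n_runs)         # stratum index assigned to each run
        design[:, j] = (strata + rng.uniform(size=n_runs)) / n_runs
    return design

rng = np.random.default_rng(1)
X = latin_hypercube(n_runs=10, n_inputs=6, rng=rng)   # one row per simulator run
```

Each row of `X` would then be rescaled to the plausible range of the corresponding parameter and passed to the simulator.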
The Gaussian process
$y(x) \sim \mathrm{GP}\big(m(x),\, v\, c(x, x')\big)$
m(·), v, and c(·, ·) are the mean, variance, and correlation function (e.g., Neal 1998; Rasmussen & Williams 2006).
[Figure: y plotted against x, for x from −3 to 3.]
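Realizations from such a prior can be drawn on a grid through a Cholesky factor of the covariance matrix. The sketch below assumes a zero mean, a Gaussian (squared-exponential) correlation function, and arbitrary values for v and the length scale; none of these settings are taken from the talk.

```python
import numpy as np

def gaussian_correlation(x1, x2, length_scale):
    """Gaussian (squared-exponential) correlation c(x, x') for scalar inputs."""
    diff = x1[:, None] - x2[None, :]
    return np.exp(-(diff / length_scale) ** 2)

def sample_gp_prior(x, mean_fn, v, length_scale, n_draws, rng):
    """Draw realizations of y(x) ~ GP(m(x), v c(x, x')) on a grid x."""
    cov = v * gaussian_correlation(x, x, length_scale)
    cov += 1e-6 * v * np.eye(len(x))             # jitter for numerical stability
    chol = np.linalg.cholesky(cov)
    white = rng.standard_normal((len(x), n_draws))
    return mean_fn(x)[:, None] + chol @ white    # one column per realization

x = np.linspace(-3.0, 3.0, 50)
draws = sample_gp_prior(x, mean_fn=np.zeros_like, v=2.0,
                        length_scale=1.0, n_draws=5,
                        rng=np.random.default_rng(2))
```

A larger length scale gives smoother realizations, and a larger v gives realizations that range further from the mean.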
Toy example
[Figure: output against input (0 to 8), showing the output function.]
f(x) = x + 3 sin(x/2)
Toy example
[Figure: output against input (0 to 8), showing the output function, simulator data, and prior realizations.]
Toy example
[Figure: output against input (0 to 8), showing the output function, simulator data, prior realizations, and the 95% posterior region.]
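The toy emulator can be reproduced approximately by conditioning a Gaussian process on a handful of runs of f(x) = x + 3 sin(x/2). This is a sketch only: the zero prior mean, the fixed hyperparameters `v` and `length_scale`, and the seven equally spaced design points are assumptions, not the settings behind the figure.

```python
import numpy as np

def toy_simulator(x):
    return x + 3.0 * np.sin(x / 2.0)

def gaussian_correlation(x1, x2, length_scale):
    diff = x1[:, None] - x2[None, :]
    return np.exp(-(diff / length_scale) ** 2)

def gp_emulator(x_design, y_design, x_new, v, length_scale, nugget=1e-8):
    """Posterior mean and variance of a zero-mean GP given simulator runs."""
    K = v * gaussian_correlation(x_design, x_design, length_scale)
    K += nugget * np.eye(len(x_design))
    K_star = v * gaussian_correlation(x_new, x_design, length_scale)
    mean = K_star @ np.linalg.solve(K, y_design)
    var = v - np.sum(K_star * np.linalg.solve(K, K_star.T).T, axis=1)
    return mean, np.maximum(var, 0.0)            # clip tiny negative round-off

x_design = np.linspace(0.0, 8.0, 7)              # hypothetical design points
y_design = toy_simulator(x_design)               # "simulator data"
x_new = np.linspace(0.0, 8.0, 81)
mean, var = gp_emulator(x_design, y_design, x_new, v=25.0, length_scale=2.0)
lower = mean - 1.96 * np.sqrt(var)               # pointwise 95% region
upper = mean + 1.96 * np.sqrt(var)
```

The posterior region collapses at the design points, where the simulator output is known, and widens between them.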
Calibration review
• Simulator: Specify x → f(x).
• For x = η, f(η) simulates a physical system.
• η is uncertain.
• Calibration: solving the inverse problem, i.e., inferring η | z, f(·).
• If f(·) is computationally expensive, it is emulated.
• Priors for η and f(·).
• Likelihood: data come from field observations and simulator runs.
Toy example
[Figure: output against input (0 to 8), showing the output function, simulator data, and field data.]
$z \sim N\big(f(\eta),\ \sigma^2 = 0.3^2\big)$
$\eta \sim N(2,\ 0.05^2)$
Toy example
[Figure: density against η (1 to 5), showing the truth, the prior, and the posterior.]
• Assuming σ2 is known.
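With σ² known and a single unknown η, the toy posterior can be evaluated on a grid. The sketch below uses the toy function f(x) = x + 3 sin(x/2) directly rather than an emulator, and assumes a hypothetical true value of η and five field observations; only the prior N(2, 0.05²) and σ = 0.3 come from the slides.

```python
import numpy as np

def toy_simulator(x):
    return x + 3.0 * np.sin(x / 2.0)

def grid_posterior(z, eta_grid, sigma, prior_mean, prior_sd):
    """Normalised posterior density of eta on a grid, with sigma known."""
    log_prior = -0.5 * ((eta_grid - prior_mean) / prior_sd) ** 2
    resid = z[:, None] - toy_simulator(eta_grid)[None, :]
    log_lik = -0.5 * np.sum((resid / sigma) ** 2, axis=0)
    log_post = log_prior + log_lik
    post = np.exp(log_post - log_post.max())     # subtract max to avoid underflow
    step = eta_grid[1] - eta_grid[0]
    return post / (post.sum() * step)

rng = np.random.default_rng(3)
eta_true = 2.1                                   # hypothetical "truth"
z = toy_simulator(eta_true) + 0.3 * rng.standard_normal(5)   # field data
eta_grid = np.linspace(1.0, 5.0, 4001)
post = grid_posterior(z, eta_grid, sigma=0.3, prior_mean=2.0, prior_sd=0.05)
post_mean = np.sum(eta_grid * post) * (eta_grid[1] - eta_grid[0])
```

In the full problem, `toy_simulator` would be replaced by the emulator's prediction, and the grid by MCMC, because η has six components.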
Dynamic emulation
• $y_t(x_i) = f(x_i, t)$ is the simulator output at input point $x_i$ and time t.
$$
\begin{array}{lcl}
x_1 & \longrightarrow & y_1(x_1),\ y_2(x_1),\ \ldots,\ y_T(x_1) \\
x_2 & \longrightarrow & y_1(x_2),\ y_2(x_2),\ \ldots,\ y_T(x_2) \\
\vdots & & \vdots \\
x_n & \longrightarrow & y_1(x_n),\ y_2(x_n),\ \ldots,\ y_T(x_n)
\end{array}
$$
• Need to model three types of interdependencies:
  1. over the input space.
  2. over time within each time series.
  3. across series of different input points.
Dynamic emulation
• Modeling dependence over the input space alone
Typically using a Gaussian process prior for outputs.
$y(x) \sim \mathrm{GP}\big(m(x),\, v\, c(x, x')\big)$
• Modeling dependence for a single time series
Typically, a time-varying autoregressive model, TVAR(p), is used; e.g., for p = 1,
$y_t(x) = \phi_t\, y_{t-1}(x) + \epsilon_t(x), \quad \epsilon_t(x) \sim N(0, v_t),$
$\phi_t = \phi_{t-1} + \omega_t, \quad \omega_t \sim N(0, w_t).$
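A TVAR(1) series is easy to simulate forward, which helps in seeing what the random-walk coefficient does. The sketch below holds the variances constant over time (v_t = v, w_t = w), a simplification made for this example; all numerical values are hypothetical.

```python
import numpy as np

def simulate_tvar1(T, phi0, v, w, y0, rng):
    """Simulate y_t = phi_t y_{t-1} + eps_t with a random-walk coefficient."""
    phi = np.empty(T)
    y = np.empty(T)
    phi_prev, y_prev = phi0, y0
    for t in range(T):
        # State equation: phi_t = phi_{t-1} + omega_t, omega_t ~ N(0, w).
        phi[t] = phi_prev + rng.normal(0.0, np.sqrt(w))
        # Observation equation: y_t = phi_t y_{t-1} + eps_t, eps_t ~ N(0, v).
        y[t] = phi[t] * y_prev + rng.normal(0.0, np.sqrt(v))
        phi_prev, y_prev = phi[t], y[t]
    return y, phi

y, phi = simulate_tvar1(T=100, phi0=0.9, v=0.1, w=1e-4, y0=1.0,
                        rng=np.random.default_rng(4))
```

Setting w = 0 recovers an ordinary AR(1) model with a fixed coefficient.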
Dynamic emulation
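• Linking across time series for different inputs using a multivariate TVAR(p) model (Liu and West, 2009),
$$
\begin{pmatrix} y_t(x_1) \\ \vdots \\ y_t(x_n) \end{pmatrix}
=
\begin{pmatrix}
y_{t-1}(x_1) & \cdots & y_{t-p}(x_1) \\
\vdots & \ddots & \vdots \\
y_{t-1}(x_n) & \cdots & y_{t-p}(x_n)
\end{pmatrix}
\begin{pmatrix} \phi_{1,t} \\ \vdots \\ \phi_{p,t} \end{pmatrix}
+
\begin{pmatrix} \epsilon_t(x_1) \\ \vdots \\ \epsilon_t(x_n) \end{pmatrix}
$$
$\mathrm{Cov}\big(\epsilon_t(x_i),\, \epsilon_t(x_j)\big) = v_t\, c(x_i, x_j)$
• $c(x_i, x_j)$ is the (i, j) element of the n × n correlation matrix induced by the Gaussian process.
• $\phi_t = \phi_{t-1} + \omega_t$, where $\phi_t = (\phi_{1,t}, \ldots, \phi_{p,t})'$.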
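The role of the correlated innovations can be seen by simulating the model for p = 1. In this sketch the autoregressive coefficient is shared across inputs, the innovations have covariance v c(x_i, x_j) with a Gaussian correlation function, and the variances are constant over time; the function name, the length scale, and the numbers are assumptions.

```python
import numpy as np

def simulate_mv_tvar1(X, T, phi0, v, w, length_scale, y0, rng):
    """Simulate n linked TVAR(1) series, one per row of the design X (n x d)."""
    n = X.shape[0]
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    C = np.exp(-sq_dist / length_scale ** 2)     # n x n GP correlation matrix
    chol = np.linalg.cholesky(C + 1e-8 * np.eye(n))
    Y = np.empty((T, n))
    phi, y_prev = phi0, np.full(n, y0, dtype=float)
    for t in range(T):
        phi = phi + rng.normal(0.0, np.sqrt(w))              # shared phi_t
        eps = np.sqrt(v) * (chol @ rng.standard_normal(n))   # Cov(eps) = v * C
        Y[t] = phi * y_prev + eps
        y_prev = Y[t]
    return Y, C

X = np.array([[0.1], [0.15], [0.9]])             # hypothetical 1-d design
Y, C = simulate_mv_tvar1(X, T=50, phi0=0.9, v=0.1, w=1e-4,
                         length_scale=0.5, y0=1.0,
                         rng=np.random.default_rng(7))
```

Series at nearby inputs (the first two rows of `X`) move together, while the series at the distant input evolves more independently; this is what lets the emulator predict a whole time series at an untried input.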
Birrell et al. (2011) simulator
[Figure: four panels of log µ against x2 (0 to 1), at t = 20, t = 40, t = 85, and t = 140.]
• x2: Exponential growth rate.
Birrell et al. (2011) simulator
[Figure: log µ(η, t) against time (0 to 250).]
Dynamic emulation
• Extending Liu and West (2009)
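• Modeling input-dependent trends:
$y_t(x) = \phi_t\, y_{t-1}(x) + h(x)\beta_t + \epsilon_t$
• Modeling systematic temporal trend:
$y_t(x) = \theta_t + \phi_t\, y_{t-1}(x) + h(x)\beta_t + \epsilon_t$
$$
\begin{pmatrix} \theta_t \\ \phi_t \\ \beta_t \end{pmatrix}
=
\begin{pmatrix} \theta_{t-1} \\ \phi_{t-1} \\ \beta_{t-1} \end{pmatrix}
+
\begin{pmatrix} \omega_{1t} \\ \omega_{2t} \\ \omega_{3t} \end{pmatrix}
$$
• Posterior inference through Forward-Filtering Backward-Sampling.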
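Forward-Filtering Backward-Sampling (FFBS) can be illustrated on the simplest case, a single series with a scalar random-walk coefficient. The sketch below is not the sampler for the extended model above: it drops θ_t and β_t, treats the variances v and w as known, and uses a hypothetical N(m0, C0) prior for the initial coefficient.

```python
import numpy as np

def ffbs_tvar1(y, y0, v, w, m0, C0, rng):
    """One FFBS draw of phi_{1:T} for y_t = phi_t y_{t-1} + eps_t.

    eps_t ~ N(0, v), phi_t = phi_{t-1} + omega_t, omega_t ~ N(0, w),
    with v and w treated as known and phi_0 ~ N(m0, C0).
    """
    T = len(y)
    F = np.concatenate(([y0], y[:-1]))           # regressor F_t = y_{t-1}
    a = np.empty(T)                              # prior means for phi_t
    R = np.empty(T)                              # prior variances for phi_t
    m = np.empty(T)                              # filtered means
    C = np.empty(T)                              # filtered variances
    m_prev, C_prev = m0, C0
    for t in range(T):                           # forward filtering
        a[t], R[t] = m_prev, C_prev + w
        q = F[t] ** 2 * R[t] + v                 # one-step forecast variance
        gain = R[t] * F[t] / q
        m[t] = a[t] + gain * (y[t] - F[t] * a[t])
        C[t] = R[t] - gain ** 2 * q
        m_prev, C_prev = m[t], C[t]
    phi = np.empty(T)
    phi[-1] = rng.normal(m[-1], np.sqrt(C[-1]))
    for t in range(T - 2, -1, -1):               # backward sampling
        B = C[t] / R[t + 1]
        h = m[t] + B * (phi[t + 1] - a[t + 1])
        H = C[t] - B ** 2 * R[t + 1]
        phi[t] = rng.normal(h, np.sqrt(max(H, 0.0)))
    return phi, m

# Synthetic AR(1) data with a constant coefficient of 0.8 (hypothetical).
rng = np.random.default_rng(5)
y = np.empty(400)
prev = 1.0
for t in range(400):
    y[t] = 0.8 * prev + rng.normal(0.0, np.sqrt(0.1))
    prev = y[t]
phi_draw, filtered_mean = ffbs_tvar1(y, y0=1.0, v=0.1, w=1e-6,
                                     m0=0.0, C0=1.0, rng=rng)
```

Repeating the call gives independent joint draws of the coefficient path; in a full sampler these draws would alternate with updates of the variances and the correlation parameters.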
Calibration
• Two sources of data:
  • Simulator data: $D_s = \{(y_t, x);\ t = 1, \ldots, T\}$. Model parameters are specified as inputs x.
  • “Field” observed epidemic data: $D_F = \{z_t;\ t = 1, \ldots, T\}$. Model parameters, η, are unknown.
• Two-stage calibration (e.g., Bayarri et al., 2007b)
  • Stage 1: Estimate the emulator model parameters using only $D_s$.
  • Stage 2: Model $z_t$ using a parametric distribution centered on the emulator model. Then, conditional on stage 1, estimate $p(\eta \mid D_F, D_s)$.
Results
Validating the emulator
[Figure: nine panels of log µ against time (0 to 250).]
• Simulation runs (black), emulator’s median & 95% region (red).
• Plots based on a MVTVAR(1) and Gaussian correlation function.
Calibration
• Generated synthetic epidemic data.
• Set η = η0. Then, $z(t) \sim \mathrm{Poisson}\big(\mu(\eta_0, t)\big)$.
[Figure: log observation against time (0 to 250).]
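Generating such synthetic data takes one Poisson draw per time point. In the sketch below the curve `mu` is a hypothetical stand-in for the simulator output µ(η0, t), chosen only to have an epidemic-like shape.

```python
import numpy as np

def generate_synthetic_counts(mu, rng):
    """Draw z(t) ~ Poisson(mu(eta_0, t)) independently for each t."""
    return rng.poisson(mu)

t = np.arange(1, 251)
# Hypothetical stand-in for mu(eta_0, t): a smooth single-peaked curve.
mu = 500.0 * np.exp(-0.5 * ((t - 120) / 30.0) ** 2) + 1.0
z = generate_synthetic_counts(mu, np.random.default_rng(6))
```

Because η0 is known here, the calibrated posterior can be checked against the truth.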
Calibration
[Figure: six panels, one for each of η1, ..., η6, each over the range 0 to 1.]
• Legend: truth, prior.
• η2 is the exponential growth rate, and η5 is the effect of the summer holiday on population interaction.
Discussion
• What we have done:
  • Estimation of epidemic dynamics by combining a statistical emulator with reported epidemic data.
  • Dynamic emulation through modeling dependencies across time and epidemic parameter space.
• Still to do:
  • Consider different age groups in the population.
  • Incorporate additional sources of information.
  • Real-time calibration and forecasting using epidemic data.
  • ...
Thank you!
References:
• Bayarri, M., Berger, J., Paulo, R., Sacks, J., Cafeo, J., Cavendish, J., Lin, C., and Tu, J. (2007a), “A framework for validation of computer models,” Technometrics, 49, 138–154.
• Bayarri, M. J., Berger, J. O., Cafeo, J., Garcia-Donato, G., Liu, F., Palomo, J., Parthasarathy, R., Paulo, R., Sacks, J., and Walsh, D. (2007b), “Computer model validation with functional output,” Annals of Statistics, 35, 1874–1906.
• Birrell, P. J., Ketsetzis, G., Gay, N. J., Cooper, B. S., Presanis, A. M., Harris, R. J., Charlett, A., Zhang, X.-S., White, P. J., Pebody, R. G., and De Angelis, D. (2011), “Bayesian modeling to unmask and predict influenza A/H1N1pdm dynamics in London,” Proceedings of the National Academy of Sciences.
• Kennedy, M. C. and O’Hagan, A. (2000), “Predicting the output from a complex computer code when fast approximations are available,” Biometrika, 87, 1–13.
• Liu, F. and West, M. (2009), “A Dynamic Modelling Strategy for Bayesian Computer Model Emulation,” Bayesian Analysis, 4, 393–412.