Approximate Bayesian Inference
Approximate Bayesian Inference for Survival Models
Rupali Akerkar (with H. Rue and S. Martino)
Department of Mathematical Sciences, NTNU, Norway
15 May 2009
Introduction
Basic idea
- Survival model
- Present the survival model as a latent Gaussian model
- Apply INLA
- Verify the results against MCMC results
Outline
- Survival Analysis
  - Some definitions
  - Censoring
  - Likelihood
- Survival Models
  - Parametric models
  - Semiparametric models for hazard
    - piecewise constant models
    - piecewise linear models
  - Spatial model
Survival Data
Let T be a random survival time. The following functions are defined:
- Density function: $T \sim f(t)$
- Survival function: $S(t) = 1 - F(t) = \int_t^{\infty} f(u)\,du$
- Hazard function: $h(t) = \lim_{\delta t \to 0} \frac{1}{\delta t} P(t < T < t + \delta t \mid T > t)$
thus
$$h(t) = \lim_{\delta t \to 0} \frac{S(t) - S(t + \delta t)}{\delta t \, S(t)} = \frac{f(t)}{S(t)}$$
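The relation between the hazard, the density and the survival function can be checked numerically. The following is a minimal sketch, assuming a Weibull survival time with illustrative shape and scale values (these numbers are not from the talk); it compares the ratio f(t)/S(t), a finite-difference version of the limit definition, and the known closed form of the Weibull hazard.

```r
# Hazard of a Weibull(shape, scale) survival time, computed three ways.
# The parameter values and the time point are illustrative assumptions.
shape <- 1.5
scale <- 2
t <- 1.2

# Survival function S(t) = 1 - F(t)
S <- function(x) pweibull(x, shape, scale, lower.tail = FALSE)

# (a) h(t) = f(t) / S(t)
h_ratio <- dweibull(t, shape, scale) / S(t)

# (b) finite-difference version of the limit definition
dt <- 1e-6
h_limit <- (S(t) - S(t + dt)) / (dt * S(t))

# (c) closed form of the Weibull hazard
h_closed <- (shape / scale) * (t / scale)^(shape - 1)

print(c(ratio = h_ratio, limit = h_limit, closed = h_closed))
```

All three values should agree up to the finite-difference error in (b).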
Censoring
- Uncensored observation: the failure time T is recorded
- Right censored observation: the censoring time C < T is recorded
- Interval censored observation: the failure time is not observed exactly, but it is known that $T_{lo} < T < T_{up}$
Each observation is described by a triple $(T_{lo}, T_{up}, \delta)$ with
- $T_{lo} = T_{up} = T$, $\delta = 1$ if the observation is uncensored
- $T_{lo} = T_{up} = C$, $\delta = 0$ if the observation is right censored
- $T_{lo} < T_{up}$, $\delta = 0$ if the observation is interval censored
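The triple encoding above can be sketched in a few lines of R. The function name `encode_obs` and the type labels are hypothetical, chosen only for this illustration, and the numbers are made up.

```r
# Encode an observation as the triple (T_lo, T_up, delta).
# `type` is one of "uncensored", "right", "interval" (labels chosen for this sketch).
encode_obs <- function(type, time = NA, lower = NA, upper = NA) {
  switch(type,
    uncensored = c(T_lo = time, T_up = time, delta = 1),
    right      = c(T_lo = time, T_up = time, delta = 0),
    interval   = {
      stopifnot(lower < upper)
      c(T_lo = lower, T_up = upper, delta = 0)
    },
    stop("unknown observation type: ", type)
  )
}

# Three illustrative observations, one of each kind
obs <- rbind(
  encode_obs("uncensored", time = 3.2),
  encode_obs("right", time = 5.0),
  encode_obs("interval", lower = 1.0, upper = 2.5)
)
print(obs)
```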
Likelihood
The likelihood function is
$$L = \prod_i L_i$$
where:
- if observation i is uncensored
$$L_i = h(T)\,S(T) = h(T)\exp\left\{-\int_0^{T} h(u)\,du\right\}$$
- if observation i is right censored
$$L_i = S(C) = \exp\left\{-\int_0^{C} h(u)\,du\right\}$$
- if observation i is interval censored
$$L_i = S(T_{lo}) - S(T_{up}) = \exp\left\{-\int_0^{T_{lo}} h(u)\,du\right\}\left\{1 - \exp\left(-\int_{T_{lo}}^{T_{up}} h(u)\,du\right)\right\}$$
The contribution of observation i, $(T_{lo,i}, T_{up,i}, \delta_i)$, to the log-likelihood is in general
$$l_i = \delta_i \log h(T_{up,i}) - \int_0^{T_{lo,i}} h(u)\,du + \log\left\{1 - \exp\left(-\int_{T_{lo,i}}^{T_{up,i}} h(u)\,du\right)\right\}$$
- First term: only included for uncensored data
- Second term: included for all data (for uncensored and right censored data $T_{lo,i} = T_{up,i}$)
- Third term: only included for interval censored data
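The interval censored term can be obtained in a few steps from the likelihood $L_i = S(T_{lo}) - S(T_{up})$. The following derivation sketch uses only the definitions of S and h given earlier; the symbol H for the cumulative hazard is introduced here for brevity.

```latex
\begin{aligned}
H(t) &= \int_0^{t} h(u)\,du, \qquad S(t) = \exp\{-H(t)\} \\
\log L_i &= \log\{S(T_{lo}) - S(T_{up})\} \\
         &= \log S(T_{lo}) + \log\left\{1 - \frac{S(T_{up})}{S(T_{lo})}\right\} \\
         &= -H(T_{lo}) + \log\left\{1 - \exp\left(-\int_{T_{lo}}^{T_{up}} h(u)\,du\right)\right\}
\end{aligned}
```

For uncensored and right censored observations $T_{lo} = T_{up}$, so the cumulative hazard term is the same whether it is written with $T_{lo}$ or $T_{up}$, and the last term is dropped.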
Survival Models
Cox Model
The model proposed by Cox in 1972 is
$$h(t \mid z_1, \ldots, z_p) = h_0(t)\exp(z_1\beta_1 + \cdots + z_p\beta_p)$$
- $h_0$ = baseline hazard
- $z_i$ = covariates
- $\beta_i$ = regression parameters
In this model the covariates are assumed to have fixed effects on the failure pattern. However, covariate effects can change with time.
More comprehensive model
A more comprehensive model is achieved by assuming
$$h(t) = \exp(z_0\beta_0(t) + \cdots + z_p\beta_p(t))$$
where $\beta_0 = \log(h_0)$ (with $z_0 = 1$).
- Parametric models
  - Exponential
  - Weibull
- Parametric models with frailty
- Semiparametric models
  - Piecewise-constant baseline hazard
  - Piecewise-linear baseline hazard
Parametric models
The Weibull model
- The data: $(T_{lo,i}, T_{up,i}, \delta_i)$, $i = 1, \ldots, n_d$
- The hazard rate: $h(u; z, \alpha) = \alpha u^{\alpha-1}\exp(\eta)$ with $\eta = z^T\beta$
- The log-likelihood:
$$l = \delta \log h(T_{up}) - \int_0^{T_{lo}} h(u)\,du + \log\left\{1 - \exp\left(-\int_{T_{lo}}^{T_{up}} h(u)\,du\right)\right\}$$
$$= \delta[\log\alpha + (\alpha - 1)\log T_{up} + \eta] - e^{\eta}T_{lo}^{\alpha} + \log\left\{1 - e^{-e^{\eta}(T_{up}^{\alpha} - T_{lo}^{\alpha})}\right\}$$
  where the last term is only included for interval censored data, and $T_{lo} = T_{up}$ otherwise
- Priors for parameters: $\beta \sim N(0, \tau_\beta I)$, $\alpha \sim \pi(\alpha)$
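The Weibull log-likelihood contribution can be written as a small R function. This is a sketch, not the INLA implementation: the function name `weibull_loglik` is hypothetical, and it assumes the interval censored contribution is $\log\{S(T_{lo}) - S(T_{up})\}$ and that $T_{lo} = T_{up}$ for uncensored and right censored observations.

```r
# Log-likelihood contribution of one observation (t_lo, t_up, delta) under the
# Weibull hazard h(u) = alpha * u^(alpha - 1) * exp(eta).
# Hypothetical helper for illustration; not part of INLA.
weibull_loglik <- function(t_lo, t_up, delta, eta, alpha) {
  # Interval censored observations have delta = 0 and t_lo < t_up
  interval <- (delta == 0) & (t_lo < t_up)
  # Event term, only active when delta = 1
  event_term <- delta * (log(alpha) + (alpha - 1) * log(t_up) + eta)
  # Cumulative hazard up to t_lo (equal to t_up unless interval censored)
  cumhaz_term <- -exp(eta) * t_lo^alpha
  # Extra term for interval censored observations only
  interval_term <- ifelse(
    interval,
    log1p(-exp(-exp(eta) * (t_up^alpha - t_lo^alpha))),
    0
  )
  event_term + cumhaz_term + interval_term
}

# Illustrative values (not from the talk)
alpha <- 1.4
eta <- -0.3
print(weibull_loglik(2, 2, 1, eta, alpha))  # uncensored at t = 2
print(weibull_loglik(2, 2, 0, eta, alpha))  # right censored at t = 2
print(weibull_loglik(1, 2, 0, eta, alpha))  # interval censored in (1, 2)
```

In R's own parameterisation this hazard corresponds to `shape = alpha` and `scale = exp(-eta / alpha)`, which gives an independent check against `dweibull` and `pweibull`.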
The Weibull model as a latent Gaussian model
- The latent Gaussian field: $x = \{\eta_1, \ldots, \eta_{n_d}, \beta\} \sim N(0, Q^{-1})$
- The hyperparameters: $\theta = \alpha$
- The likelihood: $\pi(\text{data} \mid x, \theta) = \prod_{i=1}^{n_d} \pi(\text{data}_i \mid \eta_i, \theta)$
We can apply INLA to this model without problems!
Note
In more general terms we can have
$$\eta = \sum_j f_j(z_j) + \sum_k \beta_k \tilde{z}_k + \epsilon$$
where $f_j(z_j)$ can represent
- a smooth effect of a covariate
- a time-varying effect of a covariate
- a spatial effect
As long as the prior for $f(\cdot)$ is Gaussian we are still in the latent Gaussian model framework!
The Weibull model
Example 1 - Kidney data
(Nahman et al., 1992) The time to the first infection for kidney dialysis patients is analysed. The data are right censored. One binary covariate z is catheter placement. The model is
$$h(t; z) = \alpha t^{\alpha-1}\exp\{\beta_0 + \beta_1 z\}$$
We assume
$\beta_0, \beta_1 \sim N(0, 0.001)$
$\alpha \sim \Gamma(1, 0.001)$
[Figure: posterior marginals of the intercept, of the fixed effect, and of the alpha parameter for the Weibull]
Solid line from INLA (0.165 sec, and 0.0235 for inla.hyperpar); histogram from 3×10^6 samples from WinBUGS (1900 sec).
Parametric models with frailty - Hazard function
Let $t_{ij}$ be the survival time of the j-th subject in the i-th cluster, $i = 1, \ldots, n$, $j = 1, \ldots, m_i$. The hazard function is given as
$$h(t_{ij}; z_{ij}, w_i, \alpha) = \alpha t_{ij}^{\alpha-1} w_i \exp(z_{ij}^T\beta) = \alpha t_{ij}^{\alpha-1}\exp(\eta_{ij})$$
with
$$\eta_{ij} = z_{ij}^T\beta + \log(w_i)$$
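Parametric models with frailty - Log-likelihood function
The log-likelihood contribution for the generic data $(T_{lo}, T_{up}, \delta)$ is then
$$l = \delta \log h(T_{up}) - \int_0^{T_{lo}} h(u)\,du + \log\left\{1 - \exp\left(-\int_{T_{lo}}^{T_{up}} h(u)\,du\right)\right\}$$
$$= \delta[\log\alpha + (\alpha - 1)\log T_{up} + \eta] - e^{\eta}T_{lo}^{\alpha} + \log\left\{1 - e^{-e^{\eta}(T_{up}^{\alpha} - T_{lo}^{\alpha})}\right\}$$
where the last term is only included for interval censored data, and $T_{lo} = T_{up}$ otherwise.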
Log-normal model for frailty
If we assume that $\log(w_i) = \epsilon_i$ has a Gaussian prior $N(0, \tau_w)$, the parametric frailty model falls into the latent Gaussian family.
- The latent Gaussian field: $x = \{\eta_{11}, \ldots, \eta_{n m_n}, \epsilon_1, \ldots, \epsilon_n, \beta\} \sim N(0, Q^{-1})$
- The hyperparameters: $\theta = (\alpha, \tau_w)$
- The likelihood: $\pi(\text{data} \mid x, \theta) = \prod_{i=1}^{n_d} \pi(\text{data}_i \mid \eta_i, \theta)$
...and so there is no problem in applying INLA!
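To make the structure of the log-normal frailty model concrete, clustered survival times can be simulated from it. The sketch below assumes that $\tau_w$ is a precision (so the log-frailty has standard deviation $1/\sqrt{\tau_w}$) and uses made-up parameter values; it draws times by inverting the Weibull survival function $S(t) = \exp(-e^{\eta} t^{\alpha})$.

```r
# Simulate survival times from a Weibull model with log-normal frailty.
# All numeric values are illustrative assumptions, not taken from the talk.
set.seed(1)
n_clusters <- 5                   # number of clusters
m <- 4                            # subjects per cluster
alpha <- 1.3                      # Weibull shape
beta <- c(-1, 0.5)                # intercept and treatment effect
tau_w <- 4                        # precision of the log-frailty (assumed meaning)

# eps_i = log(w_i), one value per cluster
eps <- rnorm(n_clusters, mean = 0, sd = 1 / sqrt(tau_w))
cluster <- rep(seq_len(n_clusters), each = m)
z <- rbinom(n_clusters * m, size = 1, prob = 0.5)

# Linear predictor eta_ij = beta_0 + beta_1 * z_ij + log(w_i)
eta <- beta[1] + beta[2] * z + eps[cluster]

# Inverse-transform sampling: solve exp(-exp(eta) * t^alpha) = u for t
u <- runif(n_clusters * m)
time <- (-log(u) / exp(eta))^(1 / alpha)

print(head(data.frame(cluster, z, time)))
```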
Example 2 - Rat data
(Mantel et al., 1977) study the time until tumor development in rats from 50 distinct litters. The covariate z is a treatment (drug or placebo), and w is the frailty variable (litter/cluster). The model is
$$h(t_{ij}; z_{ij}, w_i, \alpha) = \alpha t_{ij}^{\alpha-1}\exp\{\beta_0 + \beta_1 z_{ij} + \log(w_i)\}$$
We assume
$\beta_0, \beta_1 \sim N(0, 0.001)$
$\alpha, \tau_w \sim \Gamma(1, 0.001)$
[Figure: posterior marginal densities of the intercept, of the fixed effect, of the Weibull parameter, and of the group effect]
Solid line from INLA (1.422 sec) and histogram from 10^6 samples from WinBUGS (1687 sec).
Semiparametric model for hazard
Piecewise constant model for h0(t)
Divide the time axis into J predefined intervals $I_k = (s_k, s_{k+1}]$ for $k = 1, \ldots, J$, with $0 = s_1 < \cdots < s_J < \infty$. We define the baseline hazard as
$$h_0(t) = \lambda_k \quad \text{if } t \in I_k = (s_k, s_{k+1}]$$
and the baseline survival for $t \in I_k$ is then
$$S_0(t) = \exp\left\{-\int_0^{t} h_0(u)\,du\right\} = \exp\left\{-\sum_{j=1}^{k-1}(s_{j+1} - s_j)\lambda_j - (t - s_k)\lambda_k\right\}$$
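The baseline survival under a piecewise constant hazard is a sum of full-interval contributions plus a partial contribution from the interval containing t. A minimal sketch follows; the function name `S0`, the cut points and the hazard levels are illustrative assumptions, and the last interval is taken to extend to infinity.

```r
# Baseline survival for a piecewise constant hazard, for a single t > 0.
# s: interval start points with s[1] = 0; lambda[k] is the hazard on (s[k], s[k+1]].
# The last interval is assumed to be open-ended.
S0 <- function(t, s, lambda) {
  # Index k of the interval (s[k], s[k+1]] that contains t
  k <- findInterval(t, s, left.open = TRUE)
  # Cumulative hazard over the k - 1 fully traversed intervals
  full <- sum(diff(s)[seq_len(k - 1)] * lambda[seq_len(k - 1)])
  # Add the partial contribution of interval k
  exp(-(full + (t - s[k]) * lambda[k]))
}

# Illustrative cut points and hazard levels
s <- c(0, 1, 3)
lambda <- c(0.2, 0.5, 0.1)
print(S0(0.5, s, lambda))
print(S0(2, s, lambda))
print(S0(4, s, lambda))
```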
In general, let the hazard rate be
$$h(t; \cdot) = h_0(t)\exp(z^T\beta) = \exp\{z^T\beta + \log h_0(t)\} = \exp\{z^T\beta + \log\lambda_k\}, \quad t \in I_k$$
and assume a random walk (RW) prior for the piecewise baseline hazard
$$\log\lambda_1, \ldots, \log\lambda_J \mid \tau_\lambda \sim RW(\tau_\lambda)$$
then
$$\eta_k = z^T\beta + \log\lambda_k \mid \ldots \sim \text{Gaussian}$$
Log-likelihood for right censored data
The log-likelihood contribution of a (possibly) right censored observation $t \in I_k$ is
$$\log[h(t; \cdot)^{\delta} S(t; \cdot)] = \delta\eta_k - \left\{\sum_{j=1}^{k-1}(s_{j+1} - s_j)e^{\eta_j} + (t - s_k)e^{\eta_k}\right\}$$
$$= \delta\eta_k - (t - s_k)e^{\eta_k} - \sum_{j=1}^{k-1}(s_{j+1} - s_j)e^{\eta_j}$$
- The first two terms can be seen as the log-likelihood of a Poisson variable with mean $(t - s_k)e^{\eta_k}$, observed to be 0 or 1 according to $\delta$ (up to an additive constant that does not depend on $\eta$)
- The remaining sum can be seen as the log-likelihood of $k - 1$ Poisson variables with means $(s_{j+1} - s_j)e^{\eta_j}$, each observed to be 0
Each data point $t_i$ is written as k "augmented data points" $y_{i1}, \ldots, y_{ik}$ coming from Poisson distributions.
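The data augmentation can be sketched as follows. The function name `augment_obs`, the cut points and the values of the linear predictor are hypothetical; the example checks that the sum of the Poisson log-likelihoods of the augmented rows matches the direct survival log-likelihood, up to the constant $\delta \log(t - s_k)$ that does not depend on $\eta$.

```r
# Expand one (possibly right censored) observation into augmented Poisson rows.
# s: interval start points with s[1] = 0; the last interval is assumed open-ended.
augment_obs <- function(t, delta, s) {
  # Index k of the interval (s[k], s[k+1]] that contains t
  k <- findInterval(t, s, left.open = TRUE)
  # Time at risk: full length for intervals 1..k-1, partial for interval k
  exposure <- c(diff(s)[seq_len(k - 1)], t - s[k])
  # Poisson responses: 0 for intervals survived, delta for the last one
  y <- c(rep(0, k - 1), delta)
  data.frame(interval = seq_len(k), exposure = exposure, y = y)
}

# Illustrative cut points and linear predictor per interval
s <- c(0, 1, 2.5)
eta <- c(-1, -0.5, 0.2)

# An event at t = 2, which falls in the second interval
aug <- augment_obs(t = 2, delta = 1, s = s)
print(aug)

# Poisson log-likelihood of the augmented rows
ll_pois <- sum(dpois(aug$y, aug$exposure * exp(eta[aug$interval]), log = TRUE))

# Direct log-likelihood plus the constant delta * log(t - s_k)
ll_direct <- 1 * eta[2] - (2 - 1) * exp(eta[2]) - (1 - 0) * exp(eta[1]) + 1 * log(2 - 1)
print(c(poisson = ll_pois, direct = ll_direct))
```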
Piecewise constant model for h0(t) as latent Gaussian field
- The latent Gaussian field: $x = \{\log\lambda_1, \ldots, \log\lambda_J, \beta, \eta_{11}, \ldots\}$
- The hyperparameters: $\theta$
- The "augmented" Poisson data
Example 3 - Times to death for a breast-cancer trial
(Sedmak et al., 1989) The time to death of 45 breast cancer patients is analysed. The data are right censored. One binary covariate z (immunohistochemical response) is also recorded. The model is
$$h(t; z) = h_0(t)\exp\{\beta_0 + \beta_1 z\}$$
Moreover, we divide the time axis into 2 equal intervals
$$h_0(t) = \lambda_k, \quad t \in I_k$$
and assume
$\log\lambda_1, \log\lambda_2 \sim RW1(\tau_\lambda)$
$\beta_0, \beta_1 \sim N(0, 0.001)$
[Figure: posterior marginals of the fixed effect, of the intercept, of the hazard for the first interval, and of the log precision]
The solid line is the INLA approximation (0.090 sec) and the histogram is from 5×10^6 WinBUGS samples (685 sec).
Piecewise linear model for h0(t)
Divide the time axis into J predefined intervals $I_k = (s_k, s_{k+1}]$ for $k = 1, \ldots, J$, with $0 = s_1 < \cdots < s_J < \infty$. We define the baseline hazard as
$$h_0(t) = \lambda_j + \frac{\lambda_{j+1} - \lambda_j}{s_{j+1} - s_j}(t - s_j) \quad \text{if } t \in I_j = (s_j, s_{j+1}]$$
Log-likelihood for right censored data
The log-likelihood contribution of a (possibly) right censored observation $t \in I_k$ is
$$\log[h(t; \cdot)^{\delta} S(t; \cdot)] = \delta\eta_k - w_1 e^{\eta_1} - \sum_{j=2}^{k-1} w_j e^{\eta_j} - w_k e^{\eta_k} - w_{k+1} e^{\eta_{k+1}}$$
where the weights w are
$$w_1 = \frac{s_2 - s_1}{2}, \qquad w_j = \frac{s_{j+1} - s_{j-1}}{2},$$
$$w_k = t - \frac{s_k + s_{k-1}}{2} - \frac{(t - s_k)^2}{2(s_{k+1} - s_k)} \qquad \text{and} \qquad w_{k+1} = \frac{(t - s_k)^2}{2(s_{k+1} - s_k)}$$
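The weights come from integrating a hazard that is linear between the knots, so they can be checked against a direct interval-by-interval integration. The sketch below assumes the hazard takes the value $\lambda_j$ at knot $s_j$ and is linearly interpolated in between; the function name `pl_weights`, the knots and the hazard values are illustrative, and the case $k = 1$ is excluded because $w_k$ refers to $s_{k-1}$.

```r
# Weights w_1, ..., w_{k+1} for an observation t in (s[k], s[k+1]], with k >= 2.
# Hypothetical helper for illustration.
pl_weights <- function(t, s) {
  # Index k of the interval (s[k], s[k+1]] that contains t
  k <- findInterval(t, s, left.open = TRUE)
  stopifnot(k >= 2, k < length(s))
  w <- numeric(k + 1)
  w[1] <- (s[2] - s[1]) / 2
  # Interior knots 2..k-1 (empty when k = 2)
  if (k > 2) {
    j <- 2:(k - 1)
    w[j] <- (s[j + 1] - s[j - 1]) / 2
  }
  # Contribution of the partially traversed interval k
  tail_term <- (t - s[k])^2 / (2 * (s[k + 1] - s[k]))
  w[k] <- t - (s[k] + s[k - 1]) / 2 - tail_term
  w[k + 1] <- tail_term
  w
}

# Illustrative knots and hazard values at the knots
s <- c(0, 1, 2, 3, 4)
lam <- c(0.5, 0.8, 0.3, 0.6, 0.9)
t <- 2.4

# Cumulative hazard via the weights
w <- pl_weights(t, s)
cumhaz_weights <- sum(w * lam[seq_along(w)])

# Cumulative hazard by integrating the linear pieces directly:
# two full intervals (trapezoids) plus the partial third interval
cumhaz_direct <- (lam[1] + lam[2]) / 2 * (s[2] - s[1]) +
  (lam[2] + lam[3]) / 2 * (s[3] - s[2]) +
  lam[3] * (t - s[3]) + (lam[4] - lam[3]) * (t - s[3])^2 / (2 * (s[4] - s[3]))

print(c(weights = cumhaz_weights, direct = cumhaz_direct))
```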
Example 4 - Times to death for a breast-cancer trial
[Figure: piecewise linear and piecewise constant baseline hazard; posterior mean together with the 0.025 and 0.975 quantiles]
Semiparametric model with spatial effect
Example 5 - Leukemia survival data
(Henderson et al., 2002) We study time to death for 1043 leukaemia patients. The covariates included are age, wbc, tpi, sex and spatial information on district level. Here η is
$$\eta_{ij} = \alpha + \beta.haz_j + \beta.sex * sex_i + \beta.age_i + \beta.tpi_i + \beta.wbc_i + \beta.spatial_i$$
Assume
$\alpha, \beta.sex \sim N(0, 0.001)$
$\beta.haz, \beta.age, \beta.tpi, \beta.wbc \sim RW1(\tau's)$
$\beta.spatial \sim$ besag
[Figure: posterior mean and quantiles for age and tpi, and district means, from INLA and WinBUGS]
Posterior means of age, tpi and the spatial effect. The solid (red) line is the INLA approximation and the dotted line is from WinBUGS samples.
[Figure: posterior mean together with the 0.025 and 0.975 quantiles for age, wbc, tpi, district, and the baseline hazard]
These results were compared with those of Kneib et al., 2007.
Demo
Example 5 - Leukemia survival data - Demo
demo(Leuk)
New inla input
library(INLA)
# Read the data (this is some data)
data = read.table("data1.txt", header = TRUE)
n = length(data$time)
# Survival response: truncation time, event indicator, lower and upper limits, observed time
surv.time = list(truncation = rep(0, n), event = data$event, lower = data$time, upper = rep(0, n), time = data$time)
d = c(as.list(data), surv.time = surv.time)
formula = surv.time ~ placement
# Fit the Weibull model
model = inla(formula, family = "weibull", data = d, verbose = TRUE, keep = TRUE)
# Improve the estimates of the hyperparameter marginals
h = inla.hyperpar(model)
Summary
- Many survival models fall into the latent Gaussian model family
- For such models INLA is a fast and reliable tool for estimation
- ... there is still a lot to do to make INLA a more general tool for survival models