Empirical Bayes Inference in Structured Hazard Regression
Thomas Kneib
Department of Statistics, Ludwig-Maximilians-University Munich
20.11.2006
Outline
• Leukemia survival data.
• Structured hazard regression for continuous survival times.
• Empirical Bayes inference in structured hazard regression.
• Multi-state models.
Leukemia Survival Data
• Survival times of adults after diagnosis of acute myeloid leukemia.
• 1,043 cases diagnosed between 1982 and 1998 in Northwest England.
• 16 % (right) censored.
• Continuous and categorical covariates:
– age: age at diagnosis,
– wbc: white blood cell count at diagnosis,
– sex: sex of the patient,
– tpi: Townsend deprivation index.
• Spatial information in different resolution.
• Classical Cox proportional hazards model:
λ(t; x) = λ0(t) exp(x′γ).
• Baseline hazard λ0(t) is a nuisance parameter and remains unspecified.
• Estimate γ based on the partial likelihood.
• Questions / Limitations:
– Simultaneous estimation of baseline hazard rate and covariate effects.
– Flexible modelling of covariate effects (e.g. nonlinear effects, interactions).
– Spatially correlated survival times.
– Non-proportional hazards models / time-varying effects.
⇒ Structured hazard regression models.
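As a point of reference for the classical approach, the partial likelihood used to estimate γ in the Cox model can be sketched in a few lines of Python. This is a minimal illustration, assuming no tied event times; the function name and its arguments are hypothetical and not part of the talk.

```python
import math

def cox_partial_loglik(times, events, covariates, gamma):
    """Log partial likelihood of a Cox model (no tied event times assumed).

    times      : observed times T_i
    events     : 1 if the event was observed, 0 if right-censored
    covariates : list of covariate vectors x_i
    gamma      : coefficient vector
    """
    # Linear predictors x_i' gamma
    eta = [sum(x_k * g_k for x_k, g_k in zip(x, gamma)) for x in covariates]
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i == 0:
            continue  # censored observations only enter through the risk sets
        # Risk set: all subjects still under observation at time t_i
        risk = sum(math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        loglik += eta[i] - math.log(risk)
    return loglik
```

The baseline hazard λ0(t) cancels from every term, which is why it can remain unspecified here and why a joint estimate of baseline and covariate effects needs a different approach.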
• Replace usual parametric predictor with a flexible semiparametric predictor
λ(t; ·) = λ0(t) exp[g1(t)sex + f1(age) + f2(wbc) + f3(tpi) + fspat(si)]
and absorb the baseline
λ(t; ·) = exp[g0(t) + g1(t)sex + f1(age) + f2(wbc) + f3(tpi) + fspat(si)]
where
– g0(t) = log(λ0(t)) is the log-baseline hazard,
– g1(t) is a time-varying gender effect,
– f1, f2, f3 are nonparametric functions of age, white blood cell count and deprivation, and
– fspat is a spatial function.
[Figure: log-baseline hazard and time-varying gender effect, each plotted against time in years (0 to 14).]
[Figure: effect of age at diagnosis (age in years, 14 to 92) and effect of white blood cell count (0 to 500).]
[Figure: effect of deprivation, plotted against the Townsend deprivation index (−6 to 10).]
[Figure: spatial effect in the district-level analysis and in the individual-level analysis; scale from −0.44 to 0.3.]
Structured Hazard Regression
• A general structured hazard regression model consists of an arbitrary combination of the following model terms:
– Log baseline hazard g0(t) = log(λ0(t)).
– Time-varying effects gl(t)ul of covariates ul.
– Nonparametric effects fj(xj) of continuous covariates xj.
– Spatial effects fspat(s) of a spatial location variable s.
– Interaction surfaces fj,k(xj, xk) of two continuous covariates.
– Varying coefficient interactions ujfk(xk) or ujfspat(s).
– Frailty terms bg (random intercept) or xjbg (random slopes).
• All covariates are themselves allowed to be (piecewise constant) time-varying.
• Penalised splines for the baseline effect, time-varying effects, and nonparametric effects:
– Approximate f(x) (or g(t)) by a weighted sum of B-spline basis functions
$$f(x) = \sum_j \xi_j B_j(x).$$
– Employ a large number of basis functions to enable flexibility.
– Penalise differences between parameters of adjacent basis functions to ensure smoothness:
$$\mathrm{Pen}(\xi \mid \tau^2) = \frac{1}{2\tau^2} \sum_j (\Delta_k \xi_j)^2.$$
– Bayesian interpretation: Assume a k-th order random walk prior for ξj, e.g.
$$\xi_j = \xi_{j-1} + u_j, \quad u_j \sim N(0, \tau^2) \quad \text{(RW1)},$$
$$\xi_j = 2\xi_{j-1} - \xi_{j-2} + u_j, \quad u_j \sim N(0, \tau^2) \quad \text{(RW2)}.$$
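The difference penalty above can be written as a quadratic form ξ′Kξ with K = D′D, where D is the k-th order difference matrix. The following pure-Python sketch builds D and K and evaluates the penalty; the function names are illustrative choices, not taken from the talk or from BayesX.

```python
def difference_matrix(d, k):
    """k-th order difference matrix D_k with shape (d - k) x d."""
    # Start with the identity and apply first differences k times
    D = [[1.0 if r == c else 0.0 for c in range(d)] for r in range(d)]
    for _ in range(k):
        D = [[D[r + 1][c] - D[r][c] for c in range(d)] for r in range(len(D) - 1)]
    return D

def penalty_matrix(d, k):
    """K = D_k' D_k, so that xi' K xi is the sum of squared k-th differences."""
    D = difference_matrix(d, k)
    return [[sum(row[a] * row[b] for row in D) for b in range(d)] for a in range(d)]

def penalty(xi, k, tau2):
    """Pen(xi | tau2) = 1 / (2 tau2) * sum_j (Delta_k xi_j)^2."""
    d = len(xi)
    K = penalty_matrix(d, k)
    quad = sum(xi[a] * K[a][b] * xi[b] for a in range(d) for b in range(d))
    return quad / (2.0 * tau2)
```

A constant coefficient vector is not penalised under RW1 and a linear one is not penalised under RW2, which reflects the rank deficiency of K (rank d − k).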
• Bivariate Tensor product P-splines for interaction surfaces:
– Define bivariate basis functions (Tensor products of univariate basis functions).
– Extend random walks on the line to random walks on a regular grid.
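• Spatial effects for regional data s ∈ {1, . . . , S}: (Intrinsic Gaussian) Markov random fields.
– Bivariate extension of a first order random walk on the real line.
– Define appropriate neighbourhoods for the regions.
– Assume that the expected value of fspat(s) = ξs is the average of the function evaluations of adjacent sites:
$$\xi_s \mid \xi_{s'}, s' \neq s, \tau^2 \sim N\left( \frac{1}{N_s} \sum_{s' \in \partial_s} \xi_{s'}, \; \frac{\tau^2}{N_s} \right),$$
where ∂s denotes the set of neighbours of region s and Ns their number.
[Figure: one-dimensional analogue for a first order random walk, showing f(t−1), f(t+1) and the conditional mean E[f(t)|f(t−1),f(t+1)] at the points t−1, t, t+1, labelled with the variance τ2/2.]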
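The Markov random field prior can be made concrete with a small sketch that turns a neighbourhood structure into the corresponding penalty matrix and conditional moments. The neighbourhood dictionary format and the function names are assumptions made for illustration only.

```python
def mrf_penalty(neighbours):
    """Penalty matrix K of an intrinsic Gaussian Markov random field.

    neighbours maps each region s to the set of its adjacent regions.
    K[s][s] = N_s (number of neighbours), K[s][s'] = -1 if s and s' are
    neighbours, and 0 otherwise.
    """
    regions = sorted(neighbours)
    index = {s: i for i, s in enumerate(regions)}
    K = [[0.0] * len(regions) for _ in regions]
    for s in regions:
        K[index[s]][index[s]] = float(len(neighbours[s]))
        for s2 in neighbours[s]:
            K[index[s]][index[s2]] = -1.0
    return regions, K

def conditional_moments(s, xi, neighbours, tau2):
    """Mean and variance of xi_s given the values of all other regions."""
    n_s = len(neighbours[s])
    mean = sum(xi[s2] for s2 in neighbours[s]) / n_s
    return mean, tau2 / n_s
```

For a hypothetical map with four regions, `mrf_penalty({"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}})` returns a matrix whose rows sum to zero, so a constant spatial effect is not penalised.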
• Spatial effects for point-referenced data: Stationary Gaussian random fields.
– Well-known as Kriging in the geostatistics literature.
– Spatial effect follows a zero mean stationary Gaussian stochastic process.
– Correlation of two arbitrary sites is defined by an intrinsic correlation function.
– Can be interpreted as a basis function approach with radial basis functions.
• Cluster-specific frailty terms:
– Account for unobserved heterogeneity.
– Easiest case: i.i.d. Gaussian frailty.
• All covariates in the discussed model terms are allowed to be piecewise constant time-varying.
Empirical Bayes Inference
• Generic representation of structured hazard regression models:
λ(t) = exp [x(t)′γ + f1(z1(t)) + . . . + fp(zp(t))]
• For example:
f(z(t)) = g(t)                  z(t) = t                  log-baseline effect,
f(z(t)) = u(t)g(t)              z(t) = (u, t)             time-varying effect of u(t),
f(z(t)) = f(x(t))               z(t) = x(t)               smooth function of a continuous covariate x(t),
f(z(t)) = fspat(s)              z(t) = s                  spatial effect,
f(z(t)) = f(x1(t), x2(t))       z(t) = (x1(t), x2(t))     interaction surface,
f(z(t)) = bg                    z(t) = g                  i.i.d. frailty bg, g is a grouping index.
• The generic representation facilitates description of inferential details.
• All vectors of function evaluations fj can be expressed as
fj = Zjξj
with design matrix Zj, constructed from zj(t), and regression coefficients ξj.
• Generic form of the prior for ξj:
$$p(\xi_j \mid \tau_j^2) \propto (\tau_j^2)^{-k_j/2} \exp\left( -\frac{1}{2\tau_j^2}\, \xi_j' K_j \xi_j \right).$$
• Kj ≥ 0 acts as a penalty matrix, rank(Kj) = kj ≤ dj = dim(ξj).
• τ2j ≥ 0 can be interpreted as a variance or (inverse) smoothness parameter.
• Relation to penalized likelihood: Penalty terms
$$P_{\lambda_j}(\xi_j) = \log[p(\xi_j \mid \tau_j^2)] = -\frac{1}{2} \lambda_j\, \xi_j' K_j \xi_j, \qquad \lambda_j = \frac{1}{\tau_j^2}.$$
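For a P-spline term, the representation fj = Zjξj means that each row of Zj holds the B-spline basis functions evaluated at one covariate value. A minimal sketch using the Cox-de Boor recursion on equidistant knots follows; the helper names, the knot construction and the default choices are assumptions for illustration and do not reproduce the BayesX implementation.

```python
def bspline_basis(x, knots, degree):
    """Values of all B-spline basis functions of the given degree at x
    (Cox-de Boor recursion; the knots are assumed to be distinct)."""
    # Degree 0: indicator functions of the half-open knot intervals
    B = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
         for i in range(len(knots) - 1)]
    for p in range(1, degree + 1):
        B = [
            (x - knots[i]) / (knots[i + p] - knots[i]) * B[i]
            + (knots[i + p + 1] - x) / (knots[i + p + 1] - knots[i + 1]) * B[i + 1]
            for i in range(len(knots) - p - 1)
        ]
    return B

def equidistant_knots(a, b, m, degree):
    """m intervals on [a, b], extended by `degree` knots on each side."""
    h = (b - a) / m
    return [a + (i - degree) * h for i in range(m + 2 * degree + 1)]

def design_matrix(xs, knots, degree):
    """Design matrix Z with one row of basis function values per observation."""
    return [bspline_basis(x, knots, degree) for x in xs]

def evaluate(Z, xi):
    """Function evaluations f = Z xi."""
    return [sum(z * c for z, c in zip(row, xi)) for row in Z]
```

With m intervals and splines of degree l this gives dj = m + l coefficients, and each row of Z sums to one inside [a, b), so equal coefficients produce a constant function.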
• Likelihood contributions for right-censored and uncensored survival times:
$$\lambda(T)^{\delta} \exp\left( -\int_0^T \lambda(t)\,dt \right),$$
where δ is the censoring indicator (δ = 1 for an observed event, δ = 0 for a right-censored time).
• Likelihood contributions for interval-censored observations:
$$P(T \in [T_{lower}, T_{upper}]) = S(T_{lower}) - S(T_{upper}) = \exp\left[ -\int_0^{T_{lower}} \lambda(t)\,dt \right] - \exp\left[ -\int_0^{T_{upper}} \lambda(t)\,dt \right].$$
⇒ Derivatives of the log-likelihood become much more complicated for interval-censored survival times.
• In general, numerical integration has to be used to evaluate the cumulative hazard rate (e.g. the trapezoidal rule).
• Left truncation can easily be included.
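The likelihood contributions above can be sketched with a trapezoidal approximation of the cumulative hazard. This is an illustration only: the hazard is passed in as an arbitrary Python function, and the function names and the number of integration steps are hypothetical choices.

```python
import math

def cumulative_hazard(hazard, upper, n_steps=200):
    """Trapezoidal approximation of the integral of hazard(t) over [0, upper]."""
    h = upper / n_steps
    total = 0.5 * (hazard(0.0) + hazard(upper))
    total += sum(hazard(i * h) for i in range(1, n_steps))
    return h * total

def loglik_right_censored(hazard, T, delta):
    """log[ lambda(T)^delta * exp(-int_0^T lambda(t) dt) ]."""
    return delta * math.log(hazard(T)) - cumulative_hazard(hazard, T)

def loglik_interval_censored(hazard, t_lower, t_upper):
    """log[ S(T_lower) - S(T_upper) ] with S(t) = exp(-cumulative hazard)."""
    s_lower = math.exp(-cumulative_hazard(hazard, t_lower))
    s_upper = math.exp(-cumulative_hazard(hazard, t_upper))
    return math.log(s_lower - s_upper)
```

For a constant hazard the trapezoidal rule is exact, which makes a convenient check: with λ(t) = 0.5 and an event at T = 2 the log-likelihood contribution is log(0.5) − 1.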
• Principal idea of empirical Bayes estimation:
– Differentiate between parameters of primary interest and hyperparameters.
– Estimate the hyperparameters up-front from their marginal posterior.
– Plug the resulting estimates back into the joint posterior and maximize with respect to the parameters of primary interest (yields posterior mode estimates).
• In structured hazard regression models:
– regression coefficients are parameters of primary interest,
– variance components are hyperparameters.
• Employ mixed model methodology to perform empirical Bayes inference: Consider ξj a correlated random effect with multivariate Gaussian distribution.
• Problem: In most cases the random effects distribution is partially improper (kj = rk(Kj) < dim(ξj) = dj).
• Mixed model representation: Decompose
$$\xi_j = X_j \beta_j + Z_j b_j,$$
where
$$p(\beta_j) \propto \text{const} \quad \text{and} \quad b_j \sim N(0, \tau_j^2 I_{k_j}).$$
⇒ βj is a fixed effect and bj is an i.i.d. random effect.
• This yields the variance components model
$$\lambda(t; \cdot) = \exp[x'\beta + z'b],$$
where in turn
$$p(\beta) \propto \text{const}, \qquad b \sim N(0, Q), \qquad Q = \mathrm{blockdiag}(\tau_1^2 I, \ldots, \tau_p^2 I).$$
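One way to see how such a decomposition can work is the first order random walk case. The following worked example is a sketch under the assumption that Kj = D′D with D the (dj − 1) × dj first-difference matrix; the specific choice of Xj and Zj shown here is one possible construction and is not claimed to be the one used in the talk.

```latex
\begin{aligned}
K_j &= D'D, \qquad X_j = \mathbf{1}_{d_j}, \qquad Z_j = D'(DD')^{-1}
  \quad\Longrightarrow\quad D X_j = 0, \quad D Z_j = I_{d_j - 1}, \\
D\xi_j &= D X_j \beta_j + D Z_j b_j = b_j
  \quad\Longrightarrow\quad
  \xi_j' K_j \xi_j = (D\xi_j)'(D\xi_j) = b_j' b_j, \\
p(\xi_j \mid \tau_j^2) &\propto (\tau_j^2)^{-k_j/2}
  \exp\left( -\frac{1}{2\tau_j^2}\, b_j' b_j \right),
  \qquad k_j = d_j - 1 .
\end{aligned}
```

The prior no longer depends on βj (flat prior, fixed effect), while bj has a proper i.i.d. Gaussian distribution with variance τj², which is the structure stated above.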
• Obtain empirical Bayes estimates / penalized likelihood estimates via iterating
– Penalized maximum likelihood for the regression coefficients β and b.
– Restricted maximum / marginal likelihood for the variance parameters in Q:
$$L_{\mathrm{marg}}(Q) = \int L(\beta, b, Q)\, p(b)\, d\beta\, db \to \max_Q.$$
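• Penalized score function and penalized Fisher information:
$$s_p(\beta, b) = \begin{pmatrix} \dfrac{\partial l(\beta, b)}{\partial \beta} \\[2ex] \dfrac{\partial l(\beta, b)}{\partial b} - Q^{-1} b \end{pmatrix}, \qquad
F_p(\beta, b) = -\begin{pmatrix} \dfrac{\partial^2 l(\beta, b)}{\partial \beta\, \partial \beta'} & \dfrac{\partial^2 l(\beta, b)}{\partial \beta\, \partial b'} \\[2ex] \dfrac{\partial^2 l(\beta, b)}{\partial b\, \partial \beta'} & \dfrac{\partial^2 l(\beta, b)}{\partial b\, \partial b'} - Q^{-1} \end{pmatrix}.$$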
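Given the penalized score and information, the regression coefficients can be updated by a standard scoring step. The display below is a generic sketch of such an iteration, with Fp understood as the negative second-derivative matrix of the penalised log-likelihood; the iteration index m is notation introduced here, and the details of the actual algorithm may differ.

```latex
\begin{pmatrix} \beta^{(m+1)} \\ b^{(m+1)} \end{pmatrix}
=
\begin{pmatrix} \beta^{(m)} \\ b^{(m)} \end{pmatrix}
+ F_p\left( \beta^{(m)}, b^{(m)} \right)^{-1}
  s_p\left( \beta^{(m)}, b^{(m)} \right),
\qquad m = 0, 1, 2, \ldots
```

The variance parameters in Q are held fixed during this step and are updated in the alternating marginal likelihood step.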
• Marginal likelihood estimation corresponds to REML estimation of variances in Gaussian mixed models.
• The marginal likelihood cannot be derived analytically ⇒ Apply a Laplace approximation.
• This yields the approximate marginal log-likelihood
$$l_{\mathrm{marg}}(Q) \approx l(\hat\beta, \hat b) - \frac{1}{2} \log|Q| - \frac{1}{2} \hat b' Q^{-1} \hat b - \frac{1}{2} \log|F_p|,$$
where Fp is the penalised Fisher information matrix.
• If both l(β̂, b̂) and b̂ vary only slowly when changing the variance components we can further reduce the marginal log-likelihood to
$$l_{\mathrm{marg}}(Q) \approx -\frac{1}{2} \log|Q| - \frac{1}{2} \log|F_p| - \frac{1}{2} b' Q^{-1} b,$$
where b denotes a fixed value, e.g. a current estimate.
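The structure of the Laplace approximation can be illustrated in one dimension. The toy model below (a single Gaussian observation y with a random intercept b ∼ N(0, τ²)) is an assumption chosen for illustration because the approximation is exact for it; all names are hypothetical.

```python
import math

def laplace_log_integral(h, h2, b_hat):
    """Laplace approximation of log of the integral of exp(h(b)) around the mode.

    h  : log-integrand
    h2 : its second derivative
    """
    return h(b_hat) + 0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(-h2(b_hat))

def numeric_log_integral(h, lower, upper, n_steps=20000):
    """Trapezoidal reference value of the same log-integral on [lower, upper]."""
    step = (upper - lower) / n_steps
    total = 0.5 * (math.exp(h(lower)) + math.exp(h(upper)))
    total += sum(math.exp(h(lower + i * step)) for i in range(1, n_steps))
    return math.log(step * total)

# Toy model: h(b) = l(b) - 1/2 log(tau2) - b^2 / (2 tau2), with l(b) = -(y - b)^2 / 2
y, tau2 = 1.3, 0.8
h = lambda b: -0.5 * (y - b) ** 2 - 0.5 * math.log(tau2) - 0.5 * b ** 2 / tau2
h2 = lambda b: -1.0 - 1.0 / tau2      # minus the penalised information
b_hat = y / (1.0 + 1.0 / tau2)        # mode of h
approx = laplace_log_integral(h, h2, b_hat)
exact = numeric_log_integral(h, -20.0, 20.0)
```

Here `-h2(b_hat)` plays the role of Fp, so the term −½ log|Fp| appears directly, and the constant ½ log(2π) can be dropped because it does not depend on the variance parameter.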
• This allows us to devise a Fisher scoring algorithm based on matrix differentiation rules.
Software
• Implemented in BayesX, a software package for Bayesian inference in geoadditive and related models.
• Available from
http://www.stat.uni-muenchen.de/~bayesx
• More features:
– Fully Bayesian inference based on MCMC in comparable model classes.
– Univariate responses from exponential families (Gaussian, Binomial, Poisson, Negative Binomial, Gamma, . . . ).
– Categorical responses (multinomial regression models, cumulative models, sequential models).
• Latest development: Multi-state models.
Multi-State Models
• Multi-state models form a general class for the description of the evolution of discrete phenomena in continuous time.
• We observe paths of a process
X = {X(t), t ≥ 0} with X(t) ∈ {1, . . . , q}.
• Yields a similar data structure as for Markov processes.
• Examples:
– Recurrent events:
[Diagram: three states 1, 2 and 3 connected by transition arrows.]
– Disease progression:
[Diagram: states 1, 2, 3, . . . , q−1 and a further state q, connected by transition arrows.]
– Competing risks:
[Diagram: state 1 with transition arrows to the states 2, 3, 4, . . . , q.]
• (Homogeneous) Markov processes can be compactly described in terms of the transition intensities
$$\lambda_{ij} = \lim_{\Delta t \to 0} \frac{P(X(t + \Delta t) = j \mid X(t) = i)}{\Delta t}, \qquad i \neq j.$$
• Often not flexible enough in practice since
– The transition intensities might vary over time.
– The transition intensities might be related to covariates.
– The Markov model implies independent and exponentially distributed waiting times.
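The last point can be made tangible by simulating a homogeneous Markov process: with constant intensities, the time spent in a state is exponentially distributed with rate equal to the sum of the intensities leaving that state. The sketch below uses only the Python standard library; the dictionary layout for the intensities and the state labels are assumptions for illustration.

```python
import random

def simulate_path(intensity, start, t_end, rng):
    """Simulate a homogeneous Markov jump process up to time t_end.

    intensity[i][j] is the constant transition intensity lambda_ij (i != j).
    Returns a list of (entry_time, state) pairs.
    """
    t, state, path = 0.0, start, [(0.0, start)]
    while True:
        rates = intensity[state]
        total = sum(rates.values())
        if total == 0.0:
            break  # absorbing state: no further transitions
        t += rng.expovariate(total)  # exponential waiting time
        if t >= t_end:
            break
        # Next state j is chosen with probability lambda_ij / total
        u, acc = rng.random() * total, 0.0
        for nxt, rate in rates.items():
            acc += rate
            if u < acc:
                break
        state = nxt
        path.append((t, state))
    return path
```

For example, `simulate_path({"A": {"B": 2.0}, "B": {"A": 0.5}}, "A", 100.0, random.Random(1))` alternates between the two hypothetical states, with mean sojourn times of 1/2 in A and 2 in B in the long run.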
Human Sleep Data
• Human sleep can be considered an example of a recurrent event type multi-state model.
• State Space:
– Awake: phases of wakefulness,
– REM: rapid eye movement phase (dream phase),
– Non-REM: Non-REM phases (may be further differentiated).
• Aims of sleep research:
– Describe the dynamics underlying the human sleep process.
– Analyse associations between the sleep process and nocturnal hormonal secretion.
– (Compare the sleep process of healthy and diseased persons.)
[Figure: sleep state (awake, Non-REM, REM) and cortisol level (0 to 150) plotted against time since sleep onset (in hours, 0 to 8).]
• Data generation:
– Sleep recording based on electroencephalographic (EEG) measures every 30 seconds (afterwards classified into the three sleep stages).
– Measurement of hormonal secretion based on blood samples taken every 10 minutes.
– A training night familiarises the participants of the study with the experimental environment.
⇒ Sleep processes of 70 participants.
• Simple parametric approaches are not appropriate in this application due to
– Changing dynamics of human sleep over night.
– The time-varying influence of the hormonal concentration on the transition intensities.
– Unobserved heterogeneity.
⇒ Model transition intensities nonparametrically.
Specification of Transition Intensities
• To reduce complexity, we consider a simplified transition space:
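[Diagram: two main states, Awake and Sleep, with transition intensities λAS(t) (awake to sleep) and λSA(t) (sleep to awake); within Sleep, the sub-states Non-REM and REM with transition intensities λNR(t) (Non-REM to REM) and λRN(t) (REM to Non-REM).]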
• Model specification:
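$$\lambda_{AS,i}(t) = \exp\left[ \gamma_0^{(AS)}(t) + b_i^{(AS)} \right]$$
$$\lambda_{SA,i}(t) = \exp\left[ \gamma_0^{(SA)}(t) + b_i^{(SA)} \right]$$
$$\lambda_{NR,i}(t) = \exp\left[ \gamma_0^{(NR)}(t) + c_i(t)\, \gamma_1^{(NR)}(t) + b_i^{(NR)} \right]$$
$$\lambda_{RN,i}(t) = \exp\left[ \gamma_0^{(RN)}(t) + c_i(t)\, \gamma_1^{(RN)}(t) + b_i^{(RN)} \right]$$
where
$$c_i(t) = \begin{cases} 1 & \text{cortisol} > 60 \text{ nmol/l at time } t, \\ 0 & \text{cortisol} \le 60 \text{ nmol/l at time } t, \end{cases}$$
and $b_i^{(j)} \sim N(0, \tau_j^2)$ are transition- and individual-specific frailty terms.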
Counting Process Representation
• A multi-state model with k different types of transitions can be equivalently expressed in terms of k counting processes Nh(t), h = 1, . . . , k counting these transitions.
[Figure: excerpt of one sleep path (states awake, Non-REM, REM) on a time axis from 700 to 800, together with the counting processes for the transitions awake => sleep, Non-REM => REM, sleep => awake and REM => Non-REM.]
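• From the counting process representation we can derive the likelihood contributions for individual i:
$$l_i = \sum_{h=1}^{k} \left[ \int_0^{T_i} \log(\lambda_{hi}(t))\, dN_{hi}(t) - \int_0^{T_i} \lambda_{hi}(t)\, Y_{hi}(t)\, dt \right]
= \sum_{j=1}^{n_i} \sum_{h=1}^{k} \left[ \delta_{hi}(t_{ij}) \log(\lambda_{hi}(t_{ij})) - Y_{hi}(t_{ij}) \int_{t_{i,j-1}}^{t_{ij}} \lambda_{hi}(t)\, dt \right].$$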
k number of possible transitions.
Nhi(t) counting process for type h event and individual i.
Yhi(t) at risk indicator for type h event and individual i.
tij event times of individual i.
ni number of events for individual i.
δhi(tij) transition indicator for type h transition.
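The second form of the likelihood contribution translates directly into a loop over the sojourn intervals of one observed path. The following sketch assumes that a path is stored as a list of (entry time, state) pairs and that each transition intensity is given as a Python function of time; this data layout, the state labels and the function names are hypothetical.

```python
import math

def integrate(f, a, b, n_steps=100):
    """Trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n_steps
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n_steps)))

def multistate_loglik(path, t_end, intensities):
    """Log-likelihood contribution l_i of one observed path.

    path        : list of (entry_time, state), starting at time 0
    t_end       : end of the observation period T_i
    intensities : dict mapping a transition (from_state, to_state) to a
                  function t -> lambda_h(t)
    """
    loglik = 0.0
    for j, (t_in, state) in enumerate(path):
        last = j + 1 == len(path)
        t_out = t_end if last else path[j + 1][0]
        for (origin, target), lam in intensities.items():
            if origin != state:
                continue  # Y_h(t) = 0: transition h is not at risk
            loglik -= integrate(lam, t_in, t_out)  # integrated intensity
            if not last and path[j + 1][1] == target:
                loglik += math.log(lam(t_out))  # observed transition, delta = 1
    return loglik
```

With constant intensities 2.0 (awake to sleep) and 0.5 (sleep to awake), the path awake on [0, 1), asleep on [1, 3), awake on [3, 4] gives −2 + log 2 − 1 + log 0.5 − 2 = −5, which the function reproduces up to rounding.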
• Baseline effects:
[Figure: baseline effects (mixed model) plotted against time (0 to 8) for the transitions awake -> sleep, sleep -> awake, Non-REM -> REM and REM -> Non-REM.]
• Time-varying effects for a high level of cortisol:
[Figure: time-varying effects of a high level of cortisol (mixed model) plotted against time (0 to 8) for the transitions Non-REM -> REM and REM -> Non-REM.]
• Individual-specific variation is only detected for the transition between REM and Non-REM.
Conclusions
• Unified framework for general regression models describing the hazard rate of survival models.
• Empirical Bayes inference based on mixed model methodology.
• Extendable to models for transition intensities in multi-state models.
• Future work:
– More general censoring mechanisms for multi-state models.
– Conditions for propriety of posteriors.
– Joint modelling of covariates and duration times.
References
• Brezger, Kneib & Lang (2005). BayesX: Analyzing Bayesian structured additive regression models. Journal of Statistical Software, 14 (11).
• Fahrmeir, Kneib & Lang (2004). Penalized structured additive regression for space-time data: a Bayesian perspective. Statistica Sinica, 14, 731–761.
• Kneib & Fahrmeir (2006). A mixed model approach for geoadditive hazard regression. Scandinavian Journal of Statistics, to appear.
• Kneib (2006). Geoadditive hazard regression for interval censored survival times. Computational Statistics and Data Analysis, 51, 777–792.
• Kneib & Hennerfeind (2006). Bayesian semiparametric multi-state models. In preparation.