Lecture 3: Parametric Survival Modeling
Parametric models: example and nuances in R
Parametric Distributions
• We've discussed a variety of parametric distributions
– Exponential, Weibull, log-normal, log-logistic, gamma, …
• But how do we "fit" a model?
– Model parameterizations
– Inclusion of coefficients
Modeling Homogeneous Population
• Relatively "simple"
• Once we've determined the distribution, we need to estimate the parameters
• For example, exponential
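As a concrete instance of estimating the parameter of an exponential model, the sketch below uses the standard closed-form maximum likelihood estimate under right censoring (observed events divided by total follow-up time) and compares it with what survreg reports. It assumes the aml data from the survival package, which is the dataset used later in this lecture; the variable names are illustrative only.

```r
library(survival)  # provides the aml data used later in the lecture

# Closed-form MLE of the exponential rate under right censoring:
# lambda_hat = (number of observed events) / (total follow-up time)
d          <- sum(aml$status)   # status = 1 marks an observed death
total_time <- sum(aml$time)
lambda_hat <- d / total_time

# The same estimate via survreg, which reports mu = -log(lambda)
fit <- survreg(Surv(time, status) ~ 1, data = aml, dist = "exponential")
c(closed_form = lambda_hat, from_survreg = unname(exp(-coef(fit))))
```

The two numbers should agree up to the convergence tolerance of the Newton-Raphson fit.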
Covariates
• Frequently want to adjust survival for covariates
• Two main approaches
– Accelerated Failure Time model
– Multiplicative model
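Accelerated Failure Time
• Under the AFT model for two populations, the
– expected survival time
– median survival time
for Population 1 are c times those of Population 2, where c is a constant.
• Equivalently, survival at time t in Population 1 equals survival at time t/c in Population 2: $S_1(t) = S_2(t/c)$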
Accelerated Failure Time
• Data include
– Failure time T > 0
– Vector of covariates Z' = (Z1, Z2, …, Zp)
• Quantitative
• Qualitative
• Log transform T for linear model approach
$Y = \ln T = \mu + \gamma' Z + \sigma W$
Accelerated Failure Time
• When Z = 0, $S_0(t)$ is the survival function of $e^{\mu + \sigma W}$
Accelerated Failure Time
• First consider 2 populations that only differ by 1 unit in $z_k$
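The equations for this slide did not survive in the transcript. A short derivation under the log-linear model $\ln T = \mu + \gamma' Z + \sigma W$ suggests what the comparison looks like; the subscripts 1 and 2 for the two populations are notation introduced here for illustration.

```latex
% Population 1 has z_k one unit larger than Population 2; all else equal.
\begin{align*}
\ln T_2 &= \mu + \gamma' z + \sigma W \\
\ln T_1 &= \mu + \gamma' z + \gamma_k + \sigma W
  \quad\Longrightarrow\quad T_1 \overset{d}{=} e^{\gamma_k}\, T_2 \\
S_1(t) &= P\!\left(e^{\gamma_k} T_2 > t\right) = S_2\!\left(t\, e^{-\gamma_k}\right) \\
\operatorname{median}(T_1) &= e^{\gamma_k}\, \operatorname{median}(T_2),
  \qquad E[T_1] = e^{\gamma_k}\, E[T_2]
\end{align*}
```

So $e^{\gamma_k}$ plays the role of the constant c: a one-unit increase in $z_k$ stretches (or shrinks) the time scale by that factor.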
Exponential Models in R
• Recall: $f(t) = \lambda e^{-\lambda t}$ and $S(t) = e^{-\lambda t}$
• Parameterization is the same for exponential in R
– rexp(n, rate)
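Exponential Models in R
• We can run an exponential survival model in R using
– survreg(formula, data, dist)
• survreg fits the log-linear form $\ln T = \mu + \gamma' Z + \sigma W$, where $\sigma = 1$
• R gives us: $\hat{\mu}, \hat{\gamma}$
• But, we can find: $\hat{\lambda}$ and $\hat{\beta} = -\hat{\gamma}$
• In a model with no covariates, $\hat{\lambda} = e^{-\hat{\mu}}$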
Exponential Models in R
• The distribution of any T is exponential with constant hazard rate: $h(t \mid Z) = e^{-\mu - \gamma' Z} = \lambda e^{\beta' Z}$
• We can interpret $e^{\beta_k}$ as the hazard ratio corresponding to a 1 unit increase in the covariate $z_k$
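Weibull Models in R
• Recall: $f(t) = \alpha \lambda t^{\alpha - 1} e^{-\lambda t^{\alpha}}$ and $S(t) = e^{-\lambda t^{\alpha}}$
• Now our scale parameter is no longer 1
• Unlike exponential, the parameterization for Weibull is different in R…
• Random Weibull generation…
– rweibull(n, shape, scale)
– With shape a and scale b, R uses $f(t) = \frac{a}{b} \left(\frac{t}{b}\right)^{a-1} e^{-(t/b)^{a}}$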
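Because the two Weibull parameterizations differ, it helps to check the mapping numerically. Matching $S(t) = e^{-\lambda t^{\alpha}}$ with R's $e^{-(t/b)^{a}}$ gives shape = alpha and scale = lambda^(-1/alpha). The sketch below verifies this with pweibull; the values of alpha and lambda are arbitrary illustrative choices, not estimates from the lecture.

```r
# Lecture form: S(t) = exp(-lambda * t^alpha)
# R form (pweibull/rweibull): S(t) = exp(-(t / scale)^shape)
# Matching the two gives shape = alpha and scale = lambda^(-1 / alpha).
alpha  <- 1.5    # illustrative values only
lambda <- 0.02
shape  <- alpha
scale  <- lambda^(-1 / alpha)

t <- c(1, 5, 10, 25)
S_lecture <- exp(-lambda * t^alpha)
S_r       <- pweibull(t, shape = shape, scale = scale, lower.tail = FALSE)
all.equal(S_lecture, S_r)

# Simulating survival times from the lecture parameterization
set.seed(1)
x <- rweibull(1000, shape = shape, scale = scale)
```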
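Weibull Models in R
• Again we can run a Weibull model in R, but the parameterization is different here too…
– survreg(formula, data, dist)
• survreg fits the log-linear form $\ln T = \mu + \gamma' Z + \sigma W$
• R gives us: $\hat{\mu}$, $\hat{\gamma}$, and $\hat{\sigma}$
• But, we can find: $\hat{\lambda} = e^{-\hat{\mu}/\hat{\sigma}}$, $\hat{\beta} = -\hat{\gamma}/\hat{\sigma}$, and $\hat{\alpha} = 1/\hat{\sigma}$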
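The conversion between the log-linear (survreg) parameters and the Weibull rate and shape follows from the distribution of the error term. The derivation below assumes W has the standard extreme value (minimum) distribution, $P(W > w) = \exp(-e^{w})$, which is the usual error distribution behind a Weibull AFT model.

```latex
\begin{align*}
S(t \mid Z) &= P\!\left(\mu + \gamma' Z + \sigma W > \ln t\right)
             = P\!\left(W > \frac{\ln t - \mu - \gamma' Z}{\sigma}\right) \\
            &= \exp\!\left\{-\exp\!\left(\frac{\ln t - \mu - \gamma' Z}{\sigma}\right)\right\}
             = \exp\!\left\{-e^{-\mu/\sigma}\, e^{-\gamma' Z/\sigma}\, t^{1/\sigma}\right\} \\
            &= \exp\!\left\{-\lambda\, e^{\beta' Z}\, t^{\alpha}\right\},
\qquad \lambda = e^{-\mu/\sigma},\quad \beta = -\gamma/\sigma,\quad \alpha = 1/\sigma .
\end{align*}
```

Setting $\sigma = 1$ recovers the exponential case, where $\lambda = e^{-\mu}$ and $\beta = -\gamma$.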
AML Example in R
• Survival in patients with Acute Myelogenous Leukemia.
• Data
– 23 subjects
– Time to death
– Censoring indicator
– Treatment
• Standard course of chemotherapy
• Chemo extended ("maintenance") for additional cycles.
AML Dataset
> library(survival)
> aml
   time status             x
1     9      1    Maintained
2    13      1    Maintained
3    13      0    Maintained
4    18      1    Maintained
5    23      1    Maintained
6    28      0    Maintained
7    31      1    Maintained
8    34      1    Maintained
9    45      0    Maintained
10   48      1    Maintained
11  161      0    Maintained
12    5      1 Nonmaintained
13    5      1 Nonmaintained
14    8      1 Nonmaintained
15    8      1 Nonmaintained
16   12      1 Nonmaintained
…
23   45      1 Nonmaintained
AML model in R: exponential (no covariates)
> library(MASS)
> library(survival)
> sdat <- Surv(aml$time, aml$status)
> exp_fit <- survreg(sdat ~ 1, dist = "exponential")
> # exp_fit <- survreg(sdat ~ 1, dist = "weibull", scale = 1)  # alternative
> summary(exp_fit)
Call:
survreg(formula = sdat ~ 1, dist = "exponential")
            Value Std. Error    z        p
(Intercept)  3.63      0.236 15.4 1.75e-53

Scale fixed at 1

Exponential distribution
Loglik(model)= -83.3   Loglik(intercept only)= -83.3
Number of Newton-Raphson Iterations: 4
n= 23
Checking Exponential Model Fit
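Model Checks: Exponential
### Model checks for exponential
# Kaplan-Meier (empirical) fit; emp_fit is used below but was not defined on the slide
emp_fit <- survfit(sdat ~ 1)
par(mfrow = c(1, 3))
lam_hat <- exp(-exp_fit$coefficients)   # lambda = exp(-mu)
logHt <- log(-log(emp_fit$surv))
logt <- log(emp_fit$time)
# First model check: plot log cumulative hazard vs. log time
# (exponential implies a line with intercept log(lambda) and slope 1)
plot(logt, logHt, lwd = 2, type = "l", xlab = "log(t)", ylab = "log(H(t))")
points(logt, logHt, pch = 16)
abline(log(lam_hat), 1, lwd = 2, col = "red")
# Second model check: plot of H(t) vs. t (line through the origin with slope lambda)
Ht <- -log(emp_fit$surv)
t <- emp_fit$time
plot(t, Ht, lwd = 2, type = "l", xlab = "time", ylab = "H(t)")
points(t, Ht, pch = 16)
abline(0, lam_hat, lwd = 2, col = "red")
# Third model check: fitted survival curve over the empirical curve
fit.dat <- exp(-lam_hat * c(0:150))
plot(emp_fit, xlab = "Time", ylab = "Survival Fraction")
lines(c(0:150), fit.dat, lwd = 2, col = 2)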
Let's Look at Some Specifics for Exponential
• Exponential model: first estimate lambda
– 12 month survival ?
– Median survival ?
– Mean survival ?
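These three quantities follow directly from the fitted rate: $S(12) = e^{-12\lambda}$, the median solves $S(t) = 0.5$, and the mean of an exponential is $1/\lambda$. The sketch below refits the intercept-only model so that it runs on its own; the object names are illustrative.

```r
library(survival)

fit <- survreg(Surv(time, status) ~ 1, data = aml, dist = "exponential")
lambda_hat <- unname(exp(-coef(fit)))   # lambda = exp(-mu)

surv_12  <- exp(-lambda_hat * 12)   # S(12) = exp(-12 * lambda)
median_t <- log(2) / lambda_hat     # solves S(t) = 0.5
mean_t   <- 1 / lambda_hat          # mean of an exponential distribution

round(c(lambda = lambda_hat, S12 = surv_12, median = median_t, mean = mean_t), 3)
```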
AML model in R: Weibull (no covariates)
> weib_fit <- survreg(sdat ~ 1, dist = "weibull", scale = 0)
> summary(weib_fit)

Call:
survreg(formula = sdat ~ 1, dist = "weibull", scale = 0)
              Value Std. Error      z        p
(Intercept)  3.6425      0.217 16.780 3.43e-63
Log(scale)  -0.0922      0.169 -0.544 5.86e-01

Scale= 0.912

Weibull distribution
Loglik(model)= -83.2   Loglik(intercept only)= -83.2
Number of Newton-Raphson Iterations: 5
n= 23
Model Checks: Weibull
### Model checks for Weibull
# Kaplan-Meier (empirical) fit, as in the exponential checks
emp_fit <- survfit(sdat ~ 1)
# weib_fit$scale is already sigma (0.912), not log(sigma), so it is not exponentiated
alp_hat <- 1 / weib_fit$scale
lam_hat <- exp(-weib_fit$coefficients[1] / weib_fit$scale)
logHt <- log(-log(emp_fit$surv))
logt <- log(emp_fit$time)
# Plot log cumulative hazard vs. log time
# (Weibull implies a line with intercept log(lambda) and slope alpha)
plot(logt, logHt, lwd = 2, type = "l", xlab = "log(t)", ylab = "log(H(t))")
points(logt, logHt, pch = 16)
abline(log(lam_hat), alp_hat, lwd = 2, col = "red")
# Plot of fitted survival function vs. empirical
fit.dat <- exp(-lam_hat * c(0:150)^alp_hat)
plot(emp_fit, xlab = "Time", ylab = "Survival Fraction")
lines(c(0:150), fit.dat, lwd = 2, col = 2)
Let's Look at Some Specifics for Weibull
• First estimate lambda and alpha
– 12 month survival ?
– Median survival ?
– Mean survival ?
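For the Weibull model the same three quantities come from $S(t) = e^{-\lambda t^{\alpha}}$, with $\lambda = e^{-\mu/\sigma}$ and $\alpha = 1/\sigma$ taken from the survreg output. The sketch below is one way to compute them; it refits the model so it is self-contained, and the mean uses the standard Weibull result $E[T] = \lambda^{-1/\alpha}\,\Gamma(1 + 1/\alpha)$.

```r
library(survival)

fit   <- survreg(Surv(time, status) ~ 1, data = aml, dist = "weibull")
mu    <- unname(coef(fit))
sigma <- fit$scale              # sigma itself, not log(sigma)

alpha_hat  <- 1 / sigma
lambda_hat <- exp(-mu / sigma)

surv_12  <- exp(-lambda_hat * 12^alpha_hat)                         # S(12)
median_t <- (log(2) / lambda_hat)^(1 / alpha_hat)                   # solves S(t) = 0.5
mean_t   <- lambda_hat^(-1 / alpha_hat) * gamma(1 + 1 / alpha_hat)  # E[T]

round(c(alpha = alpha_hat, lambda = lambda_hat,
        S12 = surv_12, median = median_t, mean = mean_t), 3)
```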
Compare Weibull/Exponential Fits to the Empirical Distribution (no covariates)
Empirical Distribution: What about specific times (no covariates)?
• Empirical:
– 12 month survival = 0.74
– Median survival = 27 months
What about relative to the empirical distribution (no covariates)?
• Empirical:
– 12 month survival = 74%
– Median survival = 27 months
• Exponential model:
– 12 month survival = 73%
– Median survival = 26.1 months
– Mean survival = 37.7 months
• Weibull model:
– 12 month survival = 75.5%
– Median survival = 27.3 months
– Mean survival = 36.9 months
What about covariates….
AML model in R: exponential (with covariate)
> exp_fit2 <- survreg(sdat ~ x, dist = "exponential", data = aml)
> summary(exp_fit2)
Call:
survreg(formula = sdat ~ x, data = aml, dist = "exponential")
                 Value Std. Error     z        p
(Intercept)      4.101      0.378 10.85 1.96e-27
xNonmaintained  -0.958      0.483 -1.98 4.75e-02

Scale fixed at 1

Exponential distribution
Loglik(model)= -81.3   Loglik(intercept only)= -83.3
Chisq= 4.06 on 1 degrees of freedom, p= 0.044
Number of Newton-Raphson Iterations: 4
n= 23
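The "Chisq" line in this output is a likelihood ratio test of the model with the treatment covariate against the intercept-only model. A minimal sketch of how it can be reproduced from the two log-likelihoods stored in the fitted object follows; the object names are illustrative.

```r
library(survival)

fit <- survreg(Surv(time, status) ~ x, data = aml, dist = "exponential")

# fit$loglik holds two values: intercept-only model, then the full model
lrt <- 2 * diff(fit$loglik)                       # twice the gain in log-likelihood
p   <- pchisq(lrt, df = 1, lower.tail = FALSE)    # 1 df: one added coefficient

round(c(chisq = lrt, p = p), 3)
```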
Exponential Fit by Group
What about estimates by group?
• Maintained?
• Non-maintained?
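Group-specific estimates come from the linear predictor: $\mu$ for the Maintained (reference) group and $\mu + \gamma$ for the Non-maintained group, with $\lambda = e^{-(\text{linear predictor})}$. The sketch below works from the fitted coefficients; the names lp and est are illustrative.

```r
library(survival)

fit <- survreg(Surv(time, status) ~ x, data = aml, dist = "exponential")
b   <- coef(fit)

# Linear predictor for each group (Maintained is the reference level)
lp <- c(Maintained    = b[["(Intercept)"]],
        Nonmaintained = b[["(Intercept)"]] + b[["xNonmaintained"]])

lambda <- exp(-lp)                       # group-specific hazard rates
est <- data.frame(lambda = lambda,
                  S12    = exp(-12 * lambda),   # 12-month survival
                  median = log(2) / lambda)     # median survival
round(est, 3)
```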
AML model in R: Weibull (with covariates)
> weib_fit2 <- survreg(sdat ~ x, dist = "weibull", data = aml, scale = 0)
> summary(weib_fit2)

Call:
survreg(formula = sdat ~ x, data = aml, dist = "weibull", scale = 0)
                 Value Std. Error     z        p
(Intercept)      4.109      0.300 13.70 9.89e-43
xNonmaintained  -0.929      0.383 -2.43 1.51e-02
Log(scale)      -0.235      0.178 -1.32 1.88e-01

Scale= 0.791

Weibull distribution
Loglik(model)= -80.5   Loglik(intercept only)= -83.2
Chisq= 5.31 on 1 degrees of freedom, p= 0.021
Number of Newton-Raphson Iterations: 5
n= 23
Weibull fit
What about estimates by group?
• Maintained?
• Non-maintained?
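For the Weibull fit, the group-specific survival function can be written directly in terms of the survreg parameters as $S(t) = \exp\{-\exp((\ln t - \text{lp})/\sigma)\}$, where lp is the group's linear predictor, and the median is $\exp(\text{lp} + \sigma \ln(\ln 2))$. The following is a sketch under that parameterization; object names are illustrative.

```r
library(survival)

fit   <- survreg(Surv(time, status) ~ x, data = aml, dist = "weibull")
b     <- coef(fit)
sigma <- fit$scale

# Linear predictor for each group (Maintained is the reference level)
lp <- c(Maintained    = b[["(Intercept)"]],
        Nonmaintained = b[["(Intercept)"]] + b[["xNonmaintained"]])

est <- data.frame(
  S12    = exp(-exp((log(12) - lp) / sigma)),   # 12-month survival
  median = exp(lp + sigma * log(log(2)))        # solves S(t) = 0.5
)
round(est, 3)
```

The same medians can be cross-checked with predict(fit, newdata, type = "quantile", p = 0.5).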
Exponential and Weibull Fits for Maintained vs. Non-maintained
Empirical Distribution: What about specific survival times (with covariate)?
• Maintained:
– 12 month survival = 91%
– Median survival = 31 months
• Non-maintained:
– 12 month survival = 58%
– Median survival = 23 months
Comparisons
Maintained
• Empirical:
– 12 month survival = 91%
– Median survival = 31 months
• Exponential model:
– 12 month survival = 82%
– Median survival = 41.9 months
• Weibull model:
– 12 month survival = 88%
– Median survival = 45.9 months
Non-maintained
• Empirical:
– 12 month survival = 58%
– Median survival = 23 months
• Exponential model:
– 12 month survival = 60%
– Median survival = 16.1 months
• Weibull model:
– 12 month survival = 66%
– Median survival = 18 months
Compare Exponential & Empirical Distribution(with covariates)
Compare Weibull & Empirical Distribution(with covariates)
Multiplicative Hazard Rate Models
• Hazard rate of individual with covariate vector z is: $h(t \mid z) = h_0(t)\, c(\beta' z)$
• In these models $h_0(t)$ may be parametric or an arbitrary non-negative function
• Most common link function, proposed by Cox: $c(\beta' z) = e^{\beta' z}$
Multiplicative Hazard Rate Models
• Key feature is proportional hazards
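To see why proportional hazards is the key feature, take the ratio of the hazards for two individuals under the multiplicative model with the Cox link. The covariate vectors $z_1$ and $z_2$ are notation introduced here for illustration.

```latex
\begin{align*}
\frac{h(t \mid z_1)}{h(t \mid z_2)}
  = \frac{h_0(t)\, e^{\beta' z_1}}{h_0(t)\, e^{\beta' z_2}}
  = e^{\beta'(z_1 - z_2)}
\end{align*}
```

The baseline hazard cancels, so the ratio does not depend on t. If the two individuals differ by one unit in $z_k$ only, the hazard ratio is $e^{\beta_k}$.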
Multiplicative Hazard Rate Model
• These parametric models are very similar to semi-parametric Cox proportional hazard models we will discuss later…
• The AFT models using the exponential/Weibull are also classified as multiplicative models due to their proportional hazards property
– This is not true for any other parametric distribution
• Since Cox models are so commonly used, it is rare to see a parametric implementation of these models
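Because the Weibull AFT model is also a proportional hazards model, its survreg coefficients can be converted to a hazard ratio using $\beta = -\gamma/\sigma$. The sketch below does this for the AML treatment covariate; it is an illustration of the conversion, not output shown in the lecture.

```r
library(survival)

fit   <- survreg(Surv(time, status) ~ x, data = aml, dist = "weibull")
gam   <- coef(fit)[["xNonmaintained"]]   # AFT coefficient (log time scale)
sigma <- fit$scale

accel_factor <- exp(gam)            # survival times are multiplied by this factor
hazard_ratio <- exp(-gam / sigma)   # Non-maintained vs. Maintained hazard

round(c(acceleration = accel_factor, hazard_ratio = hazard_ratio), 3)
```

An acceleration factor below 1 and a hazard ratio above 1 tell the same story: shorter survival in the Non-maintained group.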
Advantages of Parametric Models
• If we correctly characterize the underlying distribution, our estimates will be more precise than semi- and non-parametric estimates.
• This means we may have greater power to identify relationships between our outcome and predictors
• However…
Disadvantages of Parametric Models
• If we use the wrong distribution, problems can arise
– Distribution often chosen based on the shape of the model without covariates
• This can/will change as covariates are added
– Alternatively use intuition/theory about what the dependency is expected to be
• BUT the time-dependency is what is left over after conditioning on covariates, so we are also likely to fail here.
Brief SAS Code
/************************************/
/* Accelerated Failure Time Models  */
/************************************/

/* Exponential models: 1st is intercept only, second is with the covariate */
proc lifereg data=aml;
  model time*status(0) = / dist=exponential;
run;

proc lifereg data=aml;
  class x;
  model time*status(0) = x / dist=exponential;
run;

Brief SAS Code
/************************************/
/* Accelerated Failure Time Models  */
/************************************/

/* Weibull models: 1st is intercept only, second is with the covariate */
proc lifereg data=aml;
  model time*status(0) = / dist=weibull;
run;

proc lifereg data=aml;
  class x;
  model time*status(0) = x / dist=weibull;
run;
Example of SAS Output
Analysis of Maximum Likelihood Parameter Estimates
Parameter      DF  Estimate  Standard Error  95% Confidence Limits  Chi-Square  Pr > ChiSq
Intercept       1    3.6288          0.2357    3.1668     4.0907        237.02      <.0001
Scale           0    1.0000          0.0000    1.0000     1.0000
Weibull Scale   1   37.6667          8.8781   23.7316    59.7843
Weibull Shape   0    1.0000          0.0000    1.0000     1.0000

Lagrange Multiplier Statistics
Parameter  Chi-Square  Pr > ChiSq
Scale          0.3305      0.5654
Next Time
• Likelihoods!!!