Lecture 3: Parametric Survival Modeling


Transcript of Lecture 3: Parametric Survival Modeling

Page 1: Lecture 3: Parametric Survival Modeling

Lecture 3: Parametric Survival Modeling

Parametric models
Example and nuances in R

Page 2: Lecture 3: Parametric Survival Modeling

Parametric Distributions

• We’ve discussed a variety of parametric distributions
– Exponential, Weibull, log-normal, log-logistic, gamma, …
• But… how do we “fit” a model?
– Model parameterizations
– Inclusion of coefficients

Page 3: Lecture 3: Parametric Survival Modeling

Modeling Homogeneous Population

• Relatively “simple”
• Once we’ve determined the distribution, we need to estimate the parameters
• For example, exponential
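For the exponential example, the parameter estimate has a closed form. The block below is a sketch of the standard maximum-likelihood derivation under right censoring. The notation is an assumption introduced here and is not taken from the slide: $t_i$ is the observed time, $\delta_i$ is the event indicator (1 = death, 0 = censored), and $d$ is the number of observed events.

```latex
\begin{aligned}
L(\lambda) &= \prod_{i=1}^{n} \left[\lambda e^{-\lambda t_i}\right]^{\delta_i}
              \left[e^{-\lambda t_i}\right]^{1-\delta_i}
            = \lambda^{d} \exp\!\Big(-\lambda \sum_{i=1}^{n} t_i\Big),
              \qquad d = \sum_{i=1}^{n} \delta_i \\
\ell(\lambda) &= d \ln \lambda - \lambda \sum_{i=1}^{n} t_i \\
\frac{\partial \ell}{\partial \lambda} &= \frac{d}{\lambda} - \sum_{i=1}^{n} t_i = 0
  \quad\Longrightarrow\quad
  \hat{\lambda} = \frac{d}{\sum_{i=1}^{n} t_i}
\end{aligned}
```

In words: the estimated hazard is the number of events divided by the total follow-up time.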

Page 4: Lecture 3: Parametric Survival Modeling

Covariates

• Frequently want to adjust survival for covariates

• Two main approaches
– Accelerated Failure Time model
– Multiplicative model

Page 5: Lecture 3: Parametric Survival Modeling

Accelerated Failure Time

• Under the AFT model for two populations, the
– expected survival time
– median survival time
– time at which survival reaches any given level, i.e. $S_1(t) = S_2(t/c)$
for population 1 are c times those of population 2, where c is a constant.

Page 6: Lecture 3: Parametric Survival Modeling

Accelerated Failure Time

• Data include
– Failure time T > 0
– Vector of covariates Z’ = (Z1, Z2, …, Zp)
  • Quantitative
  • Qualitative
• Log transform T for linear model approach

$Y = \ln T = \mu + \gamma' Z + \sigma W$

Page 7: Lecture 3: Parametric Survival Modeling

Accelerated Failure Time

• When Z = 0, $S_0(t)$ is the survival function of $e^{\mu + \sigma W}$
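A short derivation, included here as a sketch, shows how covariates rescale time under this model. It assumes the log-linear form $\ln T = \mu + \gamma' Z + \sigma W$, with $S_0$ the survival function of $e^{\mu + \sigma W}$; the symbols $\mu$, $\gamma$ and $\sigma$ are notation adopted for this sketch.

```latex
\begin{aligned}
S(t \mid Z) &= P(T > t \mid Z) = P\big(e^{\mu + \gamma' Z + \sigma W} > t\big) \\
            &= P\big(e^{\mu + \sigma W} > t\, e^{-\gamma' Z}\big)
             = S_0\big(t\, e^{-\gamma' Z}\big)
\end{aligned}
```

So a positive $\gamma' Z$ stretches the time axis (longer survival) and a negative one compresses it.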

Page 8: Lecture 3: Parametric Survival Modeling

Accelerated Failure Time

• First consider 2 populations that differ only by 1 unit in $z_k$
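As a sketch of where this comparison leads, assume the log-linear form $\ln T = \mu + \gamma' Z + \sigma W$ (notation assumed here) and let $T_1$ belong to the population whose $z_k$ is one unit larger than that of $T_2$, with all other covariates equal.

```latex
\begin{aligned}
\ln T_1 &\overset{d}{=} \ln T_2 + \gamma_k
  \quad\Longrightarrow\quad
  T_1 \overset{d}{=} e^{\gamma_k}\, T_2 \\
\frac{\operatorname{median}(T_1)}{\operatorname{median}(T_2)}
  &= \frac{E[T_1]}{E[T_2]} = e^{\gamma_k}
\end{aligned}
```

Under these assumptions, $e^{\gamma_k}$ plays the role of the constant c by which survival times are multiplied.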


Page 10: Lecture 3: Parametric Survival Modeling

Exponential Models in R

• Recall:

$f(t) = \lambda e^{-\lambda t}$ and $S(t) = e^{-\lambda t}$

• Parameterization is the same for exponential in R
– rexp(n, rate)
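A quick simulation can confirm that `rate` in `rexp()` is the hazard $\lambda$ itself. This is only a sketch: the value of `lambda`, the time point `t0`, the sample size and the seed are hypothetical choices made for illustration.

```r
# Sketch: rexp() uses the rate parameterization, so rate = lambda.
set.seed(1)                        # arbitrary seed, for reproducibility only
lambda <- 0.05                     # hypothetical hazard rate
t_sim  <- rexp(1e5, rate = lambda) # simulated failure times

t0 <- 12
emp_surv   <- mean(t_sim > t0)                           # simulated P(T > t0)
model_surv <- exp(-lambda * t0)                          # S(t0) = exp(-lambda * t0)
r_surv     <- pexp(t0, rate = lambda, lower.tail = FALSE) # same quantity from base R
emp_mean   <- mean(t_sim)                                # should be near 1 / lambda

print(c(empirical = emp_surv, model = model_surv, pexp = r_surv))
```

The simulated survival fraction and the closed-form $S(t)$ should agree up to Monte Carlo error.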

Page 11: Lecture 3: Parametric Survival Modeling

Exponential Models in R

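• We can run an exponential survival model in R using
– survreg(formula, data, dist)
• R gives us: $\hat{\mu}$, $\hat{\gamma}$
• But, we can find: $\hat{\lambda}$, $\hat{\beta}$
• In a model with no covariates, $\hat{\lambda} = e^{-\hat{\mu}}$

$\ln t = \mu + \gamma' Z + \sigma W$, where $\sigma = 1$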

Page 12: Lecture 3: Parametric Survival Modeling

Exponential Models in R

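• The distribution of any T is exponential with constant hazard rate:

$h(t \mid Z) = e^{-(\mu + \gamma' Z)} = \lambda e^{\beta' Z}$, with $\lambda = e^{-\mu}$ and $\beta = -\gamma$

• We can interpret $e^{\beta_k}$ as the hazard ratio corresponding to a 1 unit increase in the covariate $z_k$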

Page 13: Lecture 3: Parametric Survival Modeling

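Weibull Models in R

• Recall:

$f(t) = \alpha \lambda t^{\alpha - 1} e^{-\lambda t^{\alpha}}$ and $S(t) = e^{-\lambda t^{\alpha}}$

• Now our scale parameter is no longer 1
• Unlike exponential, the parameterization for Weibull is different in R…
• Random Weibull generation…
– rweibull(n, shape, scale)

$f(t) = \frac{a}{b} \left(\frac{t}{b}\right)^{a-1} e^{-(t/b)^{a}}$, with shape a and scale b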
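Matching the two survival functions, $e^{-\lambda t^{\alpha}} = e^{-(t/b)^{a}}$, gives $a = \alpha$ and $b = \lambda^{-1/\alpha}$. The sketch below checks that mapping by simulation; the values of `alpha`, `lambda`, `t0` and the seed are hypothetical.

```r
# Sketch: map the lecture's (alpha, lambda) to rweibull()'s (shape, scale).
# S(t) = exp(-lambda * t^alpha) = exp(-(t / b)^a)  =>  a = alpha, b = lambda^(-1 / alpha)
set.seed(1)                      # arbitrary seed, for reproducibility only
alpha  <- 1.5                    # hypothetical shape
lambda <- 0.01                   # hypothetical lambda
b      <- lambda^(-1 / alpha)    # R's scale parameter

t_sim <- rweibull(1e5, shape = alpha, scale = b)

t0 <- 12
emp_surv   <- mean(t_sim > t0)                 # simulated P(T > t0)
model_surv <- exp(-lambda * t0^alpha)          # lecture parameterization
r_surv     <- pweibull(t0, shape = alpha, scale = b, lower.tail = FALSE)

print(c(empirical = emp_surv, model = model_surv, pweibull = r_surv))
```

With $\alpha = 1$ the mapping reduces to `scale = 1 / lambda`, i.e. the exponential case.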

Page 14: Lecture 3: Parametric Survival Modeling

Weibull Models in R

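• Again we can run a Weibull model in R, but the parameterization is different here too…
– survreg(formula, data, dist)
• R gives us: $\hat{\mu}$, $\hat{\gamma}$, and $\hat{\sigma}$
• But, we can find: $\hat{\lambda}$, $\hat{\beta}$, and $\hat{\alpha}$

$\ln t = \mu + \gamma' Z + \sigma W$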
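The conversion between the two sets of parameters can be sketched as follows. It assumes the log-linear form $\ln t = \mu + \gamma' Z + \sigma W$ and that $W$ follows the standard (minimum) extreme value distribution, $P(W > w) = \exp(-e^{w})$, which is the error distribution that yields a Weibull $T$.

```latex
\begin{aligned}
S(t \mid Z) &= P\!\left(W > \frac{\ln t - \mu - \gamma' Z}{\sigma}\right)
             = \exp\!\left\{-\, t^{1/\sigma}\, e^{-\mu/\sigma}\, e^{-\gamma' Z/\sigma}\right\}
             = \exp\!\left\{-\lambda\, t^{\alpha}\, e^{\beta' Z}\right\} \\
\alpha &= \frac{1}{\sigma}, \qquad
\lambda = e^{-\mu/\sigma}, \qquad
\beta = -\frac{\gamma}{\sigma}
\end{aligned}
```

Setting $\sigma = 1$ recovers the exponential case, $\lambda = e^{-\mu}$ and $\beta = -\gamma$.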

Page 15: Lecture 3: Parametric Survival Modeling

AML Example In R

• Survival in patients with Acute Myelogenous Leukemia.

• Data
– 23 subjects
– Time to death
– Censoring indicator
– Treatment
  • Standard course of chemotherapy
  • Chemo extended ('maintenance') for additional cycles

Page 16: Lecture 3: Parametric Survival Modeling

AML Dataset

> library(survival)
> aml
   time status             x
1     9      1    Maintained
2    13      1    Maintained
3    13      0    Maintained
4    18      1    Maintained
5    23      1    Maintained
6    28      0    Maintained
7    31      1    Maintained
8    34      1    Maintained
9    45      0    Maintained
10   48      1    Maintained
11  161      0    Maintained
12    5      1 Nonmaintained
13    5      1 Nonmaintained
14    8      1 Nonmaintained
15    8      1 Nonmaintained
16   12      1 Nonmaintained
…
23   45      1 Nonmaintained

Page 17: Lecture 3: Parametric Survival Modeling

AML model in R: exponential (no covariates)

> library(MASS)
> library(survival)
> sdat <- Surv(aml$time, aml$status)
> exp_fit <- survreg(sdat ~ 1, dist = "exponential")
> # exp_fit <- survreg(sdat ~ 1, dist = "weibull", scale = 1)  # alternative
> summary(exp_fit)
Call:
survreg(formula = sdat ~ 1, dist = "exponential")
            Value Std. Error    z        p
(Intercept)  3.63      0.236 15.4 1.75e-53

Scale fixed at 1

Exponential distribution
Loglik(model)= -83.3   Loglik(intercept only)= -83.3
Number of Newton-Raphson Iterations: 4
n= 23

Page 18: Lecture 3: Parametric Survival Modeling

Checking Exponential Model Fit

Page 19: Lecture 3: Parametric Survival Modeling

Model Checks: Exponential

### Model checks for exponential
# emp_fit is not defined on the slides; it is assumed here to be the Kaplan-Meier fit
emp_fit <- survfit(sdat ~ 1)
par(mfrow = c(1, 3))
lam_hat <- exp(-exp_fit$coefficients)
logHt <- log(-log(emp_fit$surv))
logt <- log(emp_fit$time)
# First model check: plot log cumulative hazard vs. log time
plot(logt, logHt, lwd = 2, type = "l", xlab = "log(t)", ylab = "log(H(t))")
points(logt, logHt, pch = 16)
abline(log(lam_hat), 1, lwd = 2, col = "red")
# Second model check: plot of H(t) vs. t
Ht <- -log(emp_fit$surv)
t <- emp_fit$time
plot(t, Ht, lwd = 2, type = "l", xlab = "time", ylab = "H(t)")
points(t, Ht, pch = 16)
abline(0, lam_hat, lwd = 2, col = "red")
# Third model check: fitted survival curve over the empirical estimate
fit.dat <- exp(-lam_hat * c(0:150))
plot(emp_fit, xlab = "Time", ylab = "Survival Fraction")
lines(c(0:150), fit.dat, lwd = 2, col = 2)

Page 20: Lecture 3: Parametric Survival Modeling

Let's Look at Some Specifics for Exponential

• Exponential Model… 1st estimate lambda

– 12 month survival ?

– Median survival ?

– Mean survival ?
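These three questions can be answered directly from the fitted intercept. The sketch below hard-codes the rounded `(Intercept)` value printed in the exponential `survreg` summary (3.63) and assumes $\hat{\lambda} = e^{-\hat{\mu}}$; the variable names are chosen for illustration.

```r
# Sketch: exponential summaries from the survreg intercept (no covariates).
mu_hat  <- 3.63                  # (Intercept) from the exponential summary output
lam_hat <- exp(-mu_hat)          # lambda-hat = exp(-mu-hat)

surv_12  <- exp(-lam_hat * 12)   # S(12) = exp(-lambda * 12)
median_t <- log(2) / lam_hat     # solves S(t) = 0.5
mean_t   <- 1 / lam_hat          # E[T] for an exponential

print(round(c(lambda = lam_hat, S12 = surv_12,
              median = median_t, mean = mean_t), 3))
```

Because the intercept is rounded, the results agree with the values quoted later in the lecture only to about the precision shown there.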

Page 21: Lecture 3: Parametric Survival Modeling

AML model in R: Weibull (no covariates)

> weib_fit <- survreg(sdat ~ 1, dist = "weibull", scale = 0)
> summary(weib_fit)

Call:
survreg(formula = sdat ~ 1, dist = "weibull", scale = 0)
              Value Std. Error      z        p
(Intercept)  3.6425      0.217 16.780 3.43e-63
Log(scale)  -0.0922      0.169 -0.544 5.86e-01

Scale= 0.912

Weibull distribution
Loglik(model)= -83.2   Loglik(intercept only)= -83.2
Number of Newton-Raphson Iterations: 5
n= 23

Page 22: Lecture 3: Parametric Survival Modeling

Model Checks: Weibull

Page 23: Lecture 3: Parametric Survival Modeling

Model Checks: Weibull

### Model checks for Weibull
# emp_fit is not defined on the slides; it is assumed here to be the Kaplan-Meier fit
emp_fit <- survfit(sdat ~ 1)
# weib_fit$scale already holds sigma-hat (not log(scale)), so it is used directly
alp_hat <- 1 / weib_fit$scale
lam_hat <- exp(-weib_fit$coefficients[1] / weib_fit$scale)
logHt <- log(-log(emp_fit$surv))
logt <- log(emp_fit$time)
# Plot log cumulative hazard vs. log time
plot(logt, logHt, lwd = 2, type = "l", xlab = "log(t)", ylab = "log(H(t))")
points(logt, logHt, pch = 16)
abline(log(lam_hat), alp_hat, lwd = 2, col = "red")
# Plot of fitted survival function vs. empirical
fit.dat <- exp(-lam_hat * c(0:150)^alp_hat)
plot(emp_fit, xlab = "Time", ylab = "Survival Fraction")
lines(c(0:150), fit.dat, lwd = 2, col = 2)

Page 24: Lecture 3: Parametric Survival Modeling

Let's Look at Some Specifics for Weibull

• 1st estimate lambda and alpha

– 12 month survival ?

– Median survival ?

– Mean survival ?
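The same three quantities for the Weibull fit can be sketched from the printed summary. The block hard-codes the rounded `(Intercept)` and `Scale` values from the Weibull output (3.6425 and 0.912) and assumes the conversions $\hat{\alpha} = 1/\hat{\sigma}$ and $\hat{\lambda} = e^{-\hat{\mu}/\hat{\sigma}}$; the variable names are illustrative.

```r
# Sketch: Weibull summaries from the survreg output (no covariates).
mu_hat    <- 3.6425                    # (Intercept) from the Weibull summary
sigma_hat <- 0.912                     # Scale from the Weibull summary

alp_hat <- 1 / sigma_hat               # alpha-hat = 1 / sigma-hat
lam_hat <- exp(-mu_hat / sigma_hat)    # lambda-hat = exp(-mu-hat / sigma-hat)

surv_12  <- exp(-lam_hat * 12^alp_hat)                      # S(12) = exp(-lambda * 12^alpha)
median_t <- (log(2) / lam_hat)^(1 / alp_hat)                # solves S(t) = 0.5
mean_t   <- lam_hat^(-1 / alp_hat) * gamma(1 + 1 / alp_hat) # E[T] for a Weibull

print(round(c(alpha = alp_hat, lambda = lam_hat, S12 = surv_12,
              median = median_t, mean = mean_t), 3))
```

The mean uses the standard Weibull result $E[T] = \lambda^{-1/\alpha}\, \Gamma(1 + 1/\alpha)$.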

Page 25: Lecture 3: Parametric Survival Modeling

Compare Weibull/Exponential Fits to the Empirical Distribution (no covariates)

Page 26: Lecture 3: Parametric Survival Modeling

Empirical Distribution: What about specific times (no covariates)?

• Empirical:
– 12 month survival = 0.74
– Median survival = 27 months

Page 27: Lecture 3: Parametric Survival Modeling

What about relative to the empirical distribution (no covariates)?

• Empirical:
– 12 month survival = 74%
– Median survival = 27 months
• Exponential model:
– 12 month survival = 73%
– Median survival = 26.1 months
– Mean survival = 37.7 months
• Weibull model:
– 12 month survival = 75.5%
– Median survival = 27.3 months
– Mean survival = 36.9 months

Page 28: Lecture 3: Parametric Survival Modeling

What about covariates….

Page 29: Lecture 3: Parametric Survival Modeling

AML model in R: exponential (with covariate)

> exp_fit2 <- survreg(sdat ~ x, dist = "exponential", data = aml)
> summary(exp_fit2)
Call:
survreg(formula = sdat ~ x, data = aml, dist = "exponential")
                Value Std. Error     z        p
(Intercept)     4.101      0.378 10.85 1.96e-27
xNonmaintained -0.958      0.483 -1.98 4.75e-02

Scale fixed at 1

Exponential distribution
Loglik(model)= -81.3   Loglik(intercept only)= -83.3
Chisq= 4.06 on 1 degrees of freedom, p= 0.044
Number of Newton-Raphson Iterations: 4
n= 23
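The `xNonmaintained` coefficient is reported on the log-time (AFT) scale. As a sketch, and assuming the exponential relationship $\beta = -\gamma$, it can be read either as a ratio of survival times or as a hazard ratio. The value is copied from the printed output; the variable names are illustrative.

```r
# Sketch: two readings of the xNonmaintained coefficient (exponential model).
gamma_hat <- -0.958                 # xNonmaintained from the summary output

time_ratio   <- exp(gamma_hat)      # AFT scale: survival times are multiplied by this
hazard_ratio <- exp(-gamma_hat)     # hazard scale: beta = -gamma under the exponential

print(round(c(time_ratio = time_ratio, hazard_ratio = hazard_ratio), 3))
```

Read this way, non-maintained patients have survival times roughly 0.38 times as long, or equivalently a hazard roughly 2.6 times as high, as maintained patients under this model.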

Page 30: Lecture 3: Parametric Survival Modeling

Exponential Fit by Group

Page 31: Lecture 3: Parametric Survival Modeling

What about estimates by group?

• Maintained?
• Non-maintained?
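Group-specific estimates under the exponential model follow from the linear predictor. The sketch below hard-codes the rounded coefficients from the exponential summary with the treatment covariate (4.101 and -0.958) and codes the groups as 0/1, mirroring R's default treatment contrasts; the names are illustrative.

```r
# Sketch: exponential estimates by treatment group.
mu_hat    <- 4.101                                # (Intercept)
gamma_hat <- -0.958                               # xNonmaintained
z <- c(Maintained = 0, Nonmaintained = 1)         # indicator for Nonmaintained

lam_hat  <- exp(-(mu_hat + gamma_hat * z))        # group-specific hazard
surv_12  <- exp(-lam_hat * 12)                    # 12 month survival
median_t <- log(2) / lam_hat                      # median survival

print(round(rbind(S12 = surv_12, median = median_t), 2))
```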

Page 32: Lecture 3: Parametric Survival Modeling

AML model in R: Weibull (with covariates)

> weib_fit2 <- survreg(sdat ~ x, dist = "weibull", data = aml, scale = 0)
> summary(weib_fit2)

Call:
survreg(formula = sdat ~ x, data = aml, dist = "weibull", scale = 0)
                Value Std. Error     z        p
(Intercept)     4.109      0.300 13.70 9.89e-43
xNonmaintained -0.929      0.383 -2.43 1.51e-02
Log(scale)     -0.235      0.178 -1.32 1.88e-01

Scale= 0.791

Weibull distribution
Loglik(model)= -80.5   Loglik(intercept only)= -83.2
Chisq= 5.31 on 1 degrees of freedom, p= 0.021
Number of Newton-Raphson Iterations: 5
n= 23

Page 33: Lecture 3: Parametric Survival Modeling

Weibull fit

Page 34: Lecture 3: Parametric Survival Modeling

What about estimates by group?

• Maintained?
• Non-maintained?
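For the Weibull model the same questions can be sketched on the AFT scale, where $S(t \mid z) = \exp\{-(t / e^{\mu + \gamma z})^{1/\sigma}\}$. The coefficients are the rounded values from the Weibull summary with the treatment covariate; the 0/1 group coding and the variable names are illustrative.

```r
# Sketch: Weibull estimates by treatment group, computed on the AFT scale.
mu_hat    <- 4.109                               # (Intercept)
gamma_hat <- -0.929                              # xNonmaintained
sigma_hat <- 0.791                               # Scale
z <- c(Maintained = 0, Nonmaintained = 1)        # indicator for Nonmaintained

scale_t  <- exp(mu_hat + gamma_hat * z)          # exp(linear predictor), per group
surv_12  <- exp(-(12 / scale_t)^(1 / sigma_hat)) # 12 month survival
median_t <- scale_t * log(2)^sigma_hat           # median survival

print(round(rbind(S12 = surv_12, median = median_t), 2))
```

Because the inputs are rounded, the medians may differ in the first decimal place from figures computed with the full-precision fit.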

Page 35: Lecture 3: Parametric Survival Modeling

Exponential and Weibull Fits for Maintained vs. Non-maintained

Page 36: Lecture 3: Parametric Survival Modeling

Empirical Distribution: What about specific survival times (with covariate)?

• Maintained:
– 12 month survival = 91%
– Median survival = 31 months
• Non-maintained:
– 12 month survival = 58%
– Median survival = 23 months

Page 37: Lecture 3: Parametric Survival Modeling

Comparisons

Maintained
• Empirical:
– 12 month survival = 91%
– Median survival = 31 months
• Exponential model:
– 12 month survival = 82%
– Median survival = 41.9 months
• Weibull model:
– 12 month survival = 88%
– Median survival = 45.9 months

Non-maintained
• Empirical:
– 12 month survival = 58%
– Median survival = 23 months
• Exponential model:
– 12 month survival = 60%
– Median survival = 16.1 months
• Weibull model:
– 12 month survival = 66%
– Median survival = 18 months

Page 38: Lecture 3: Parametric Survival Modeling

Compare Exponential & Empirical Distribution (with covariates)

Page 39: Lecture 3: Parametric Survival Modeling

Compare Weibull & Empirical Distribution (with covariates)

Page 40: Lecture 3: Parametric Survival Modeling
Page 41: Lecture 3: Parametric Survival Modeling

Multiplicative Hazard Rate Models

• Hazard rate of an individual with covariate vector z is:

$h(t \mid z) = h_0(t)\, c(\beta' z)$

• In these models $h_0(t)$ may be parametric or an arbitrary non-negative function
• Most common link function, proposed by Cox:

$c(\beta' z) = e^{\beta' z}$

Page 42: Lecture 3: Parametric Survival Modeling

Multiplicative Hazard Rate Models

• Key feature is proportional hazards
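To see why, here is a one-line sketch that assumes the multiplicative form $h(t \mid z) = h_0(t)\, c(\beta' z)$ with the exponential link $c(\beta' z) = e^{\beta' z}$ and compares two individuals with covariate vectors $z_1$ and $z_2$:

```latex
\frac{h(t \mid z_1)}{h(t \mid z_2)}
  = \frac{h_0(t)\, c(\beta' z_1)}{h_0(t)\, c(\beta' z_2)}
  = \frac{e^{\beta' z_1}}{e^{\beta' z_2}}
  = e^{\beta'(z_1 - z_2)}
```

The baseline hazard cancels, so the ratio does not depend on $t$.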

Page 43: Lecture 3: Parametric Survival Modeling

Multiplicative Hazard Rate Model

• These parametric models are very similar to semi-parametric Cox proportional hazard models we will discuss later…

• The AFT models using the exponential/Weibull are also classified as multiplicative models due to their proportional hazards property
– This is not true for any other parametric distribution

• Since Cox models are so commonly used, it is rare to see a parametric implementation of these models

Page 44: Lecture 3: Parametric Survival Modeling

Advantages of Parametric Models

• If we correctly characterize the underlying distribution, our estimates will be more precise than semi- and non-parametric estimates.

• This means we may have greater power to identify relationships between our outcome and predictors

• However…

Page 45: Lecture 3: Parametric Survival Modeling

Disadvantages of Parametric Models

• If we use the wrong distribution, problems can arise
– Distribution often chosen based on the shape of the model without covariates
  • This can/will change as covariates are added
– Alternatively, use intuition/theory about what the dependency is expected to be
  • BUT the time-dependency is what is left over after conditioning on covariates, so we are also likely to fail here.

Page 46: Lecture 3: Parametric Survival Modeling

Brief SAS Code

/************************************/
/* Accelerated Failure Time Models  */
/************************************/

/* Exponential models: 1st is intercept only, second is with the covariate */
proc lifereg data=aml;
  model time*status(0) = / dist=exponential;
run;

proc lifereg data=aml;
  class x;
  model time*status(0) = x / dist=exponential;
run;

Page 47: Lecture 3: Parametric Survival Modeling

Brief SAS Code

/************************************/
/* Accelerated Failure Time Models  */
/************************************/

/* Weibull models: 1st is intercept only, second is with the covariate */
proc lifereg data=aml;
  model time*status(0) = / dist=weibull;
run;

proc lifereg data=aml;
  class x;
  model time*status(0) = x / dist=weibull;
run;

Page 48: Lecture 3: Parametric Survival Modeling

Example of SAS Output

Analysis of Maximum Likelihood Parameter Estimates

Parameter      DF  Estimate  Standard Error  95% Confidence Limits  Chi-Square  Pr > ChiSq
Intercept       1    3.6288          0.2357     3.1668      4.0907      237.02      <.0001
Scale           0    1.0000          0.0000     1.0000      1.0000
Weibull Scale   1   37.6667          8.8781    23.7316     59.7843
Weibull Shape   0    1.0000          0.0000     1.0000      1.0000

Lagrange Multiplier Statistics

Parameter  Chi-Square  Pr > ChiSq
Scale          0.3305      0.5654
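The numbers in this table are consistent with the "Weibull Scale" row being the exponentiated intercept, with its confidence limits obtained by exponentiating the intercept's limits. That reading is an inference from the printed values rather than something stated on the slide, and the R sketch below simply checks the arithmetic.

```r
# Sketch: relate the Intercept row to the "Weibull Scale" row of the SAS output.
intercept    <- 3.6288                 # Intercept estimate
intercept_ci <- c(3.1668, 4.0907)      # 95% confidence limits for the Intercept

weibull_scale    <- exp(intercept)     # compare with 37.6667 in the table
weibull_scale_ci <- exp(intercept_ci)  # compare with 23.7316 and 59.7843

print(round(c(scale = weibull_scale,
              lower = weibull_scale_ci[1],
              upper = weibull_scale_ci[2]), 3))
```

With the scale fixed at 1 (the exponential case), this exponentiated intercept is also the estimated mean survival time, $1/\hat{\lambda}$.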

Page 49: Lecture 3: Parametric Survival Modeling

Next Time

• Likelihoods!!!