HSRP 734: Advanced Statistical Methods July 24, 2008

Page 1: HSRP 734:  Advanced Statistical Methods July 24, 2008


Page 2: Objectives

Describe methods to evaluate the proportional hazards assumption
Describe other model diagnostics
Describe a stratified analysis
Use SAS to implement these methods

Page 3: Background

The Cox PH model is a semi-parametric, regression-based approach to survival analysis:
  Nonparametric estimation of the baseline hazard function
  Parametric estimation of the proportionality constant, whose logarithm is a linear function of the covariates

Page 4: Checking the Proportional Hazards Assumption

Graphical approach (subjective)
  Compare estimated -ln(-ln) survivor curves over different categories of the variables being investigated. Parallel curves indicate that the PH assumption is satisfied.
  Compare observed with predicted survival curves.
Goodness-of-fit tests (global)
Time-dependent variables (computationally cumbersome)

Page 5: Graphical Approach Background

Consider a Cox PH model in which we wish to model a gender effect (i.e., a single binary covariate):

$$h(t; X) = h_0(t)\, e^{\beta_1 \times \text{female}}$$

Now take the natural log of both sides of the equation:

$$\log h(t; X) = \log h_0(t) + \beta_1 \times \text{female}$$

Page 6: Graphical Approach Background

Assuming that a female is coded as 1 and a male is coded as 0, we have

Female:

$$\log h(t; X) = \log h_0(t) + \beta_1 \times 1 = \log h_0(t) + \beta_1$$

Male:

$$\log h(t; X) = \log h_0(t) + \beta_1 \times 0 = \log h_0(t)$$

Page 7: Graphical Approach Background

Thus, a plot of the log-hazard over time would yield two curves, one for females and one for males, and the distance between the curves would be fixed at $\beta_1$.

A simple method for assessing the proportional hazards assumption is to examine the extent to which the two (or more) curves are equidistant over time.

Page 8: Graphical Approach

Advantages:
  Simple
  Often the eye is better at evaluating patterns than a formal analytical method
Disadvantages:
  Not formal
  What do you do with continuous variables?
  Univariate by nature, and one has to think hard about how best to consider combinations of variables

Page 9: Survival Curve and Hazard (under PH)

Equivalently, one can use log-log survival curves.

Some math is required to work this out.

Let’s start with

$$-\frac{d \log S(t)}{dt} = h(t)$$

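Page 10: Survival Curve and Hazard (under PH)

Integrating $-d \log S(t)/dt = h(t)$ from 0 to $t$, with $S(0) = 1$, gives

$$S(t) = e^{-\int_0^t h(u)\,du},$$

so that under the Cox PH model

$$
\begin{aligned}
S(t; X) &= e^{-\int_0^t h_0(u)\, e^{\beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p}\,du} \\
        &= e^{-e^{\beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p} \int_0^t h_0(u)\,du} \\
        &= S_0(t)^{\,e^{\beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p}}
\end{aligned}
$$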

Page 11: Survival Curve and Hazard (under PH)

Taking logs of $S(t; X) = S_0(t)^{\exp(\sum_i \beta_i X_i)}$ and writing $S(t)$ for $S(t; X)$:

$$
\begin{aligned}
\log S(t) &= \log S_0(t)\; e^{\sum_i \beta_i X_i} \\
-\log S(t) &= -\log S_0(t)\; e^{\sum_i \beta_i X_i} \\
\log(-\log S(t)) &= \log(-\log S_0(t)) + \log\bigl(e^{\sum_i \beta_i X_i}\bigr) \\
\log(-\log S(t)) &= \log(-\log S_0(t)) + \sum_i \beta_i X_i
\end{aligned}
$$
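
As a worked special case (an illustration added here, not taken from the slides), consider a single binary covariate $X$ coded 0/1 with coefficient $\beta_1$. The log-log identity then says the two transformed survival curves differ by a constant, which is why parallel curves support the PH assumption:

```latex
\begin{align*}
\log\{-\log S(t; X=1)\} - \log\{-\log S(t; X=0)\}
  &= \bigl[\log\{-\log S_0(t)\} + \beta_1\bigr] - \log\{-\log S_0(t)\} \\
  &= \beta_1 \quad \text{for every } t .
\end{align*}
```

The vertical gap between the curves does not depend on $t$, so any systematic narrowing, widening, or crossing of the empirical curves points toward non-proportional hazards.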

Page 12: Log-log Plots

eval ph example.sas

Page 13: Empirical Log-log Plots

We can get the survival functions from Kaplan-Meier estimates, which do not assume an underlying Cox model.

Empirical log-log plots:
  Calculate the K-M estimates.
  Create a new dataset, keeping only two variables: time and survival.
  In the new dataset, create the group variable (e.g., maintained).
  Apply the log(-log(Survival)) transformation.
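
The steps for an empirical log-log plot can be sketched outside SAS as well. The following is a minimal, language-neutral illustration in Python (standard library only); it is not the course's `eval ph example.sas` program, and the toy data and the names `weeks`, `event`, and `kaplan_meier` are assumptions made up for this example. Only the group variable name `maintained` comes from the slides.

```python
import math

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate)."""
    surv, curve = 1.0, []
    for t in sorted(set(t for t, d in zip(times, events) if d)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, d in zip(times, events) if u == t and d)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve

def log_minus_log(curve):
    """Apply log(-log S) where 0 < S < 1; the transform is undefined at 0 and 1."""
    return [(t, math.log(-math.log(s))) for t, s in curve if 0.0 < s < 1.0]

# Hypothetical toy data: weeks to event, event indicator (1 = event, 0 = censored),
# and a two-level group variable.
weeks      = [5, 8, 12, 16, 23, 27, 30, 33, 9, 13, 18, 23, 28, 31, 34, 45]
event      = [1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
maintained = [0] * 8 + [1] * 8

curves = {}
for g in (0, 1):
    t_g = [t for t, m in zip(weeks, maintained) if m == g]
    d_g = [d for d, m in zip(event, maintained) if m == g]
    curves[g] = log_minus_log(kaplan_meier(t_g, d_g))

# One row per plotted point; plot log(-log S) against weeks, one line per group.
for g, points in curves.items():
    for t, y in points:
        print(f"maintained={g} weeks={t:3d} log(-log S)={y:7.3f}")
```

Plotting the printed points, one line per group, gives the empirical log-log plot; roughly parallel lines are consistent with proportional hazards.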

Page 14: Empirical Log-log Plots using SAS

You need to spend some time creating a dataset from which you can make the plot.

eval ph example.sas

Page 15: Another Alternative Approach

Using Lehmann’s alternative expression:

$$
\begin{aligned}
\hat S_1(t) &= \hat S_0(t)^{\,e^{\beta}} \\
\log \hat S_1(t) &= e^{\beta} \log \hat S_0(t) \\
\frac{\log \hat S_1(t)}{\log \hat S_0(t)} &= e^{\beta}, \quad \text{constant over time}
\end{aligned}
$$
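
A quick numerical check can make the Lehmann relationship concrete. The sketch below is an added illustration, not part of the course materials: it assumes a Weibull baseline survivor function and a hypothetical log hazard ratio `beta = 0.7`, builds the second group's survivor function from the Lehmann form, and confirms that the ratio of log survival values stays at `exp(beta)` while the log-log gap stays at `beta`.

```python
import math

def weibull_survival(t, scale=50.0, shape=1.5):
    """Assumed baseline survivor function S0(t) = exp(-(t/scale)^shape)."""
    return math.exp(-((t / scale) ** shape))

beta = 0.7  # hypothetical log hazard ratio between the two groups

for t in (10, 20, 40, 80):
    s0 = weibull_survival(t)
    s1 = s0 ** math.exp(beta)  # Lehmann form: S1(t) = S0(t)^{exp(beta)}
    ratio = math.log(s1) / math.log(s0)
    gap = math.log(-math.log(s1)) - math.log(-math.log(s0))
    print(f"t={t:3d}  log S1 / log S0 = {ratio:.4f}  log-log gap = {gap:.4f}")
```

With estimated curves in place of these exact ones, the ratio will fluctuate; the graphical check asks whether it fluctuates around a constant or drifts with time.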

Page 16: Observed vs. Predicted Plots

Idea:
  Use K-M curves to obtain the “observed” plots.
  Use the Cox PH model to obtain the “expected” plots.
  Put both sets of plots on the same graph.
  If they are close, the data are consistent with the PH assumption; if not, the assumption is violated.

Page 17: Expected Plot from SAS

[Figure: survivor function estimate (vertical axis, 0.0 to 0.9) against weeks (horizontal axis, 0 to 170), with one curve for each value of maintained (0 and 1).]

Page 18: Time-Dependent Covariates

Add a time-dependent variable to assess the PH assumption for a time-independent variable. The Cox model is extended to contain an interaction term between the covariate and some function of time. If the test of that interaction term is significant, there is evidence that the PH assumption is violated.

eval ph example.sas
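
Written out, the extended model has the following form. This is an added sketch: the symbols $\gamma$ and $g(t)$ are notation introduced here, and the choice of $g$ (for example $g(t) = \log t$ or $g(t) = t$) is left to the analyst.

```latex
\begin{align*}
h(t; X) &= h_0(t)\, \exp\{\beta X + \gamma\, X\, g(t)\} \\
\frac{h(t; X = 1)}{h(t; X = 0)} &= \exp\{\beta + \gamma\, g(t)\} \\
H_0 &: \gamma = 0 \quad \text{(the hazard ratio does not change with time)}
\end{align*}
```

When $\gamma = 0$ the model reduces to the ordinary Cox PH model, so a test of $\gamma$ is a test of proportionality against the particular departure described by $g(t)$.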

Page 19: Regression Diagnostics

Model checking in the Cox PH model uses measures analogous to those used for linear and logistic regression: residuals, leverage, and influence.

Diagnostics can be plotted and examined in order to identify observations that are influential or that have high leverage in determining the fit.

Page 20: Identification of Influential and Poorly Fit Observations

High leverage values in isolation are not necessarily a concern. The issue is how high leverage contributes to a measure of the influence a covariate value has on the estimate of the coefficient of concern.

Page 21: Identification of Influential and Poorly Fit Observations

Specifically, we use the change in the value of the estimated coefficient upon deletion of an observation:

$$\Delta\hat\beta_{ki} = \hat\beta_k - \hat\beta_{k(i)}$$

where $\hat\beta_k$ denotes the MLE from the partial likelihood using the entire sample and $\hat\beta_{k(i)}$ is the MLE when the $i$th observation is deleted.

Plots of $\Delta\hat\beta_{ki}$ vs. the covariate values are helpful in identifying observations that greatly influence parameter estimation and hypothesis testing.

The $\Delta\hat\beta_{ki}$ are called dfbetas.
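
The deletion idea can be illustrated by brute force: fit the model, drop one observation, refit, and take the difference. The sketch below is an added illustration in Python (standard library only), not the course's SAS program. It assumes a single covariate, uses Newton-Raphson on the log partial likelihood with Breslow-style handling of ties, and runs on a hypothetical toy dataset. Statistical packages typically report an approximation to these differences rather than refitting the model for every observation.

```python
import math

def score_and_information(beta, times, events, x):
    """First derivative and negative second derivative of the log partial likelihood."""
    score, info = 0.0, 0.0
    for t_i, d_i, x_i in zip(times, events, x):
        if not d_i:
            continue  # censored observations contribute only through risk sets
        risk = [x_j for t_j, x_j in zip(times, x) if t_j >= t_i]
        w = [math.exp(beta * x_j) for x_j in risk]
        s0 = sum(w)
        s1 = sum(w_j * x_j for w_j, x_j in zip(w, risk))
        s2 = sum(w_j * x_j ** 2 for w_j, x_j in zip(w, risk))
        score += x_i - s1 / s0
        info += s2 / s0 - (s1 / s0) ** 2
    return score, info

def cox_fit(times, events, x, max_iter=50, tol=1e-10):
    """Maximise the partial likelihood for one covariate by Newton-Raphson."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_information(beta, times, events, x)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta

def exact_dfbetas(times, events, x):
    """Return the full-sample estimate and the delete-one changes in it."""
    full = cox_fit(times, events, x)
    changes = []
    for i in range(len(times)):
        keep = [j for j in range(len(times)) if j != i]
        reduced = cox_fit([times[j] for j in keep],
                          [events[j] for j in keep],
                          [x[j] for j in keep])
        changes.append(full - reduced)  # estimate with all data minus estimate without i
    return full, changes

# Hypothetical toy data: distinct event times, no censoring, binary covariate.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1] * 8
x = [1, 0, 1, 1, 0, 0, 1, 0]

beta_hat, delta = exact_dfbetas(times, events, x)
print(f"beta_hat = {beta_hat:.4f}")
for i, d in enumerate(delta, start=1):
    print(f"observation {i}: x = {x[i - 1]}, dfbeta = {d:+.4f}")
```

Observations with a large absolute change are the ones to inspect. This refit-everything approach is practical only for small datasets, which is one reason software uses an approximation.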

Page 22: Residuals

Martingale residuals
Deviance residuals
Plot of martingale residuals vs. covariates or fitted values
Plot of deviance residuals vs. covariates or fitted values
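
The slides name these residuals without defining them. For reference, the usual textbook definitions are given below; the notation is introduced here, with $\delta_i$ the event indicator, $\hat H_0$ the estimated cumulative baseline hazard, and $x_i$ the covariate vector for subject $i$.

```latex
\begin{align*}
\hat r_{M_i} &= \delta_i - \hat H_0(t_i)\, \exp\bigl(\hat\beta^{\top} x_i\bigr)
  && \text{(martingale residual)} \\
\hat r_{D_i} &= \operatorname{sign}\bigl(\hat r_{M_i}\bigr)
  \sqrt{-2\bigl[\hat r_{M_i} + \delta_i \log\bigl(\delta_i - \hat r_{M_i}\bigr)\bigr]}
  && \text{(deviance residual)}
\end{align*}
```

Martingale residuals lie in $(-\infty, 1]$ and are heavily skewed; the deviance residual is a transformation that makes them more symmetric around zero, which is why it is often preferred for spotting poorly fit observations.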

Page 23: Identification of Influential and Poorly Fit Observations

Obtain the dfbetas from a Cox PH model by requesting that they be included in the OUTPUT dataset.

Obtain the linear predictor score and the martingale and deviance residuals from a Cox PH model by requesting that they be included in the OUTPUT dataset.

eval ph example.sas

Page 24: Non-Proportional Hazards - Stratification

What if the proportional hazards assumption does not hold?

If you find that the proportional hazards assumption does not hold for a specific set of groups, you can carry out a stratified analysis in which you stratify by group.

Page 25: Stratification

Advantages:
  Flexibility, in that it allows for a different hazard function for each stratum
  Relatively simple idea and easy to implement
  Retains a single estimate for each regression parameter, assuming no strata-by-covariate interaction
  Crude form of adjustment

Page 26: Stratification

Disadvantages:
  Loss of parsimony
  Requires a larger sample size to obtain estimators of similar quality; the number of individuals within a stratum is important
  Not valid if there is a strata-by-covariate interaction

Page 27: Stratified Cox Model

Using the UISSURV data, assume RACE doesn’t satisfy the PH assumption.

Further assume that the hazard functions for non-whites and whites differ only because they have different baseline hazard functions. The effect of TREAT is the same for both non-whites and whites.

Because of the different baseline hazard functions, the fitted stratified Cox model will have different estimated survival curves for non-whites and whites.
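
In symbols, the stratified model described here can be sketched as follows. The stratum index $g$ and the notation $h_{0g}$ and $L_g$ are introduced for this illustration and do not appear in the slides.

```latex
\begin{align*}
h_g(t; \mathrm{TREAT}) &= h_{0g}(t)\, \exp(\beta\, \mathrm{TREAT}),
  \qquad g \in \{\text{non-white}, \text{white}\} \\
L(\beta) &= L_{\text{non-white}}(\beta) \times L_{\text{white}}(\beta)
\end{align*}
```

Each stratum has its own unspecified baseline hazard $h_{0g}(t)$ and its own risk sets, while the single coefficient $\beta$ is shared; the overall partial likelihood is the product of the stratum-specific partial likelihoods.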

Page 28: Stratified Cox Model

eval ph example.sas

Page 29: Stratified Cox Model

Some comments:
  Note that RACE is not included in the model because it doesn’t satisfy the PH assumption. Instead, RACE is controlled for by stratification.
  Now we can estimate the treatment effect adjusted for RACE.
  It is not possible to obtain a hazard ratio for the RACE effect adjusted for TREAT. This is the price to be paid for the stratification. Also, a single value for the hazard ratio for RACE is not appropriate because it must vary with time.

Page 30: General Stratified Cox Model

If more than one variable fails to satisfy the PH assumption, a general stratified Cox model can be used.

Define a new single variable (e.g., Z) whose levels are the combinations of those variables (i.e., the covariate patterns), and then apply the same stratified Cox model.

The trouble is that you might not have a large enough sample in each stratum.
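
Building the combined variable Z and checking stratum sizes takes only a few lines. The sketch below is an added Python illustration: the second stratification variable `site`, the toy records, and the sparse-stratum threshold of 3 are all assumptions chosen for the example.

```python
from collections import Counter

def stratum_label(record, strata_vars):
    """Combine the stratification variables into one label Z (a covariate pattern)."""
    return tuple(record[v] for v in strata_vars)

# Hypothetical records: race and site are assumed to violate the PH assumption.
pairs = [(0, "A")] * 4 + [(0, "B")] + [(1, "A")] * 3 + [(1, "B")] * 2
records = [{"race": r, "site": s} for r, s in pairs]

counts = Counter(stratum_label(rec, ("race", "site")) for rec in records)

MIN_PER_STRATUM = 3  # arbitrary illustrative threshold, not a recommendation
for z, n in sorted(counts.items()):
    flag = "  <- sparse" if n < MIN_PER_STRATUM else ""
    print(f"Z = {z}: n = {n}{flag}")
```

The number of strata is the product of the numbers of levels of the stratification variables, so it grows quickly and small strata become likely as variables are added.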

Page 31: General Stratified Cox Model

In the statistical software, you don’t need to create the new variable Z yourself. The software will do it for you automatically as long as you specify the variables not satisfying the PH assumption as stratification variables in the model.

Page 32: General Stratified Cox Model: SAS

eval ph example.sas

Page 33: No-Interaction Assumption

The previous stratified model contains regression coefficients that do not vary over the strata. This is the “no-interaction assumption”.

Page 34: Interaction

If we allow for interaction between TREAT and RACE, we can fit two separate Cox models, one to non-whites and one to whites, with each model containing TREAT.

An alternative is to fit a stratified model with an interaction term between TREAT and RACE. Note that although RACE can’t be included as a covariate (a main effect), it can be included in an interaction term.
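
The stratified model with interaction can be written out as below. This is an added sketch that assumes RACE is coded 0/1; the symbols $\beta_1$ and $\beta_2$ are notation introduced here.

```latex
\begin{align*}
h_g(t) &= h_{0g}(t)\, \exp\{\beta_1\, \mathrm{TREAT}
          + \beta_2\, (\mathrm{RACE} \times \mathrm{TREAT})\} \\
\text{HR for TREAT when } \mathrm{RACE} = 0 &: \exp(\beta_1) \\
\text{HR for TREAT when } \mathrm{RACE} = 1 &: \exp(\beta_1 + \beta_2)
\end{align*}
```

A test of $\beta_2 = 0$ is then a test of the no-interaction assumption. Because each stratum has its own baseline hazard and, with the interaction, its own treatment coefficient, the coefficient estimates agree with those from the two separate models.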

Page 35: Interaction

eval ph example.sas

Page 36: Summary

Non-proportional hazards and strata-by-covariate interactions greatly complicate our analyses and interpretations.

Options:
  Run completely separate analyses for each stratum: simple but often confusing
  Attempt to explicitly model these interactions: more complicated and often confusing to describe