Introduction to Survival Analysis Seminar in Statistics 1 Presented by: Stefan Bauer, Stephan Hemri...

Post on 15-Jan-2016

232 views 0 download

Tags:

Transcript of Introduction to Survival Analysis Seminar in Statistics 1 Presented by: Stefan Bauer, Stephan Hemri...

1

Introduction to Survival Analysis

Seminar in Statistics

Presented by: Stefan Bauer, Stephan Hemri

28.02.2011

2

Definition

• Survival analysis: - method for analysing timing of events; - data analytic approach to estimate the

time until an event occurs.• Historically survival time refers to the time

that an individual „survives“ over some period until the event of death occurs.

• Event is also named failure.

3

Areas of application

Survival analysis is used as a tool in many different settings:

• proving or disproving the value of medical treatments for diseases;

• evaluating reliability of technical equipment;

• monitoring social phenomena like divorce and unemployment.

4

Examples

Time from...

• marriage to divorce;

• birth to cancer diagnosis;

• entry to a study to relapse.

5

Censoring

The survival time is not known exactly! This may occur due to the following reasons:

• a person does not experience the event before the study ends;

• a person is lost to follow-up during the study period;

• a person withdraws from the study because of some other reason.

6

Right censored

7

Left censored

8

Outcome variable

9

Outcome variable

• Survival time ≠ calendar time (e.g. follow-up starts for each individual on the day of surgery)

• Correct starting and ending times may be unknown due to censoring

10

Survivor function

• Probability that random variable T exceeds specified time t

• Fundamental to survival analysis

t S(t)

1 S(1) = P(T > 1)

2 S(2) = P(T > 2)

3 S(3) = P(T > 3)

. .

. .

. .

11

Survivor function

12

Survivor function

13

Hazard function

• h(t) has no upper bounds

• Often called: Failure rate

14

Example: Hazard function

Assume having a huge follow-up study on heart attacks:• 600 heart attacks (events) per year;

• 50 events per month;

• 11.5 events per week;

• 0.0011 events per minute.

h(t) = rate of events occurring per time unit

15

Relation between S(t) and h(t)

If T continous:

16

Sketch of Proof

i) Find relationship between density f(t) and S(t)

ii) Express relationship between h(t) and S(t) as a function of density f(t)

i in ii → h(t) as a function of S(t) and vice versa

17

Example: Relationship

18

Types of hazard functions

19

Hazard ratio

Cox proportional hazards model:

• h0(t): baseline hazard rate

• X: vector of explanatory variables• : hazard ratio for the coefficient • Ratio between the predicted hazard rate

of two individuals that differ by 1 unit in the variable

20

Example: Hazard ratioh (𝑡 , 𝑿 )=h0(𝑡)𝑒

∑𝑖=1

𝑝

𝛽𝑖 𝑋 𝑖

21

Basic descriptive measures

• Group mean (ignore censorship)

• Median (t for which (t) = 0.5)

• Average hazard rate:

22

Goals (of survival analysis)

• to estimate and interpret survivor and or hazard function;

• to compare survivor and or hazard function;

• to assess the relationship of explanatory variables to survival times -> we need mathematical modelling (Cox model).

23

Computer layout

individual t (in weeks)δ (failed or censored)

1 5 1

2 12 0

3 3,5 0

4 8 0

5 6 0

6 3,5 1

24

Computer layout

Layout for multivariate data with p explanatory variables:

individual t (in weeks)δ (failed or censored)

X1 X2 .... Xp

1 t1 ᵟ1 X11 X12 ... X1p

2 t2 ᵟ2 X21 X22 ... X2p

... ... … ... ... ... ...

n tn ᵟn Xn1 Xn2 .... Xnp

25

Notation & terminology

• Ordered failures:

• Frequency counts:

• mi = # individuals who failed at t(i)

• qi = # ind. censored in [t(i),t(i+1))

• Risk set R(t(i)): Collection of individuals who have survived at least until time t(i)

unorderedcensored t’s

failed t’sordered (t(i))

26

Manual analysis layout

Ordered failure times

# of failures mi

# censored in [t(i),t(i+1))

Risk set R(t(i))

t(0)=0 mi q0 R(t(0))

t(1) m1 q1 R(t(1))

.... ... ... ...

t(n) mn qn R(t(n))

27

Manual analysis layout

Ordered failure times

# of failures mi

# censored in [t(i),t(i+1))

Risk set R(t(i))

t(0) = 0 0 0 6 persons survive ≥ 0 weeks

t(1) = 3.5 1 1 6 persons survive ≥ 3,5 weeks

t(2) = 5 1 3 4 persons survive ≥ 5 weeks

28

Example: Leukaemia remission

Extended Remission Data containing:• two groups of leukaemia patients:

treatment & placebo;• log WBC values of each individual;

(WBC: white blood cell count)

Expected behaviour:

The higher the WBC value is the lower the expected survival time.

29

Example: Analysis layout

• Analysis layout for treatment group:

t(j) mi qi R(t(j))

t(0) = 0 0 0 21 persons

t(1) = 6 3 1 21 persons

t(2) = 7 1 1 17 persons

t(3) = 10 1 2 15 persons

t(4) = 13 1 0 12 persons

t(5) = 16 1 3 11 persons

t(6) = 22 1 0 7 persons

t(7) = 23 1 5 6 persons

30

Example: Confounding

31

Example: Confounding

• Confounding of treatment effect by log WBC

• Log WBC suggests: Treatment group survives longer simply because of lower WBC values

• Controlling for WBC necessary

32

Example: Interaction

33

Example: Conclusion

• Need to consider confounding and interaction;

• basic problem: comparing survival of the two groups after adjusting for confounding and interaction;

• problem can be extended to the multivariate case by adding additional explanatory variables.

34

Summary

• Survival analysis encompasses a variety of methods for analyzing the timing of events;

• problem of censoring: exact survival time unknown;- mixture of complete and incomplete

observations- difference to other statistical data

35

Summary

• Relationship between S(t) and h(t):

• Goals: • Estimation & Interpretation of S(t) and h(t)• Comparison of different S(t) and h(t)• Assessment of relationship of explanatory

variables to survival time

36

References

• A Conceptual Approach to Survival Analysis, Johnson, L.L., 2005. Downloaded from www.nihtraining.com on 19.02.2011.

• Applied Survival Analysis: Regression Modelling of Time to event Data, Hosmer, D.W., Lemeshow, S., Wiley Series in Probability and Statistics 1999.

• Lesson 6: Sample Size and Power - Part A, The Pennsylvania State University, 2007. Downloaded from http://www.stat.psu.edu/online/ courses/stat509/06_sample/09_sample_hazard.htm on 24.02.2011

• Survival analysis: A self-learning text, Kleinbaum, D.G. & Klein M., Springer 2005.

• Survival and Event History Analysis: A Process Point of View (Statistics for Biology and Health), Aalen, O., Borgan, O. & Gjessing H., Springer 2010.