3rd Summer School in Computational Biology, September 10, 2014

Page 1: 3 rd  Summer School in Computational Biology  September 10, 2014

3rd Summer School in Computational Biology

September 10, 2014

Frank Emmert-Streib & Salissou Moutari

Computational Biology and Machine Learning Laboratory
Center for Cancer Research and Cell Biology
Queen’s University Belfast, UK

Page 2: 3 rd  Summer School in Computational Biology  September 10, 2014

Exercise – Survival Analysis

Homework ~ 1.5 hours

Page 3: 3 rd  Summer School in Computational Biology  September 10, 2014

1. Kaplan-Meier Survival Curves

Page 4: 3 rd  Summer School in Computational Biology  September 10, 2014

Result: Survival Curve

[Figure: survival curve showing S(t)]

Page 5: 3 rd  Summer School in Computational Biology  September 10, 2014

Goal: estimate S(t) from data

• A survival curve shows S(t) as a function of t.
  – S(t): survival function (survivor function)
  – t: time

S(t) gives the probability that the random variable T is larger than a specified time t, i.e., S(t) = Pr(T > t), where T is the time at which the event occurs.

Problem: censoring
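
Why censoring is a problem can be seen from a naive estimate. The sketch below is in Python (the transcript preserves no code, so the language, the function name `empirical_survival`, and the numbers are assumptions made purely for illustration). It estimates S(t) as the fraction of subjects with T > t, which only works when every event time has actually been observed.

```python
def empirical_survival(event_times, t):
    """Fraction of subjects whose event time exceeds t (valid without censoring)."""
    if not event_times:
        raise ValueError("need at least one event time")
    return sum(1 for T in event_times if T > t) / len(event_times)


# Hypothetical, fully observed event times (not the leukemia data).
times = [5, 8, 12, 20, 30]
print(empirical_survival(times, 10))  # 3 of the 5 subjects have T > 10
```

If one of these subjects had instead been last seen alive at time 12, it would be unknown whether T > 20 holds for that subject, so this simple fraction could no longer be computed for t = 20. The Kaplan-Meier estimator is designed for that situation.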

Page 6: 3 rd  Summer School in Computational Biology  September 10, 2014

Small example: Leukemia

Acute Myelogenous Leukemia (AML)

Only 5 patients

[Table: survival time, censoring status, and chemotherapy (we use this info later) for each patient]

Page 7: 3 rd  Summer School in Computational Biology  September 10, 2014

Small example: Leukemia

[Table: annotated with censoring, event, number at risk, number of events, and ???]

Page 8: 3 rd  Summer School in Computational Biology  September 10, 2014

Kaplan-Meier estimator for S(t)

• Estimator:

$$\hat{S}(t) = \prod_{i:\, t_i \le t} \left(1 - \frac{d_i}{n_i}\right)$$

$n_i$: number of subjects at risk at time $t_i$

$d_i$: number of events at time $t_i$

Kaplan & Meier 1958
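
The product over event times can be sketched in a few lines of Python. This is a minimal illustration, not the R code from the slides (which the transcript does not preserve). The function name `kaplan_meier`, the 0/1 coding of censoring, and the data are assumptions. A subject censored at an event time is counted as still at risk at that time, which is the usual convention.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t_i, S(t_i))] for each distinct event time t_i.

    times  : observed time per subject (event time or censoring time)
    events : 1 if the event was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    s = 1.0
    curve = []
    for t_i in sorted(deaths):
        n_i = sum(1 for t in times if t >= t_i)  # subjects at risk at t_i
        d_i = deaths[t_i]                        # events at t_i
        s *= 1.0 - d_i / n_i                     # multiply in the factor (1 - d_i/n_i)
        curve.append((t_i, s))
    return curve


# Hypothetical data for 5 subjects; event = 0 marks a censored observation.
times = [6, 10, 10, 15, 21]
events = [1, 1, 0, 1, 1]
for t_i, s in kaplan_meier(times, events):
    print(t_i, round(s, 3))
```

For these made-up data the estimate drops to 0.8, 0.6, 0.3, and 0 at the four event times. The subject censored at time 10 counts toward the number at risk there but not toward the events.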

Page 10: 3 rd  Summer School in Computational Biology  September 10, 2014

Check S(t) till t

Page 13: 3 rd  Summer School in Computational Biology  September 10, 2014

Kaplan-Meier estimator for S(t)

Censored observation: last time seen, still alive at that time.

Page 19: 3 rd  Summer School in Computational Biology  September 10, 2014

Full data set: Leukemia

23 patients

Page 20: 3 rd  Summer School in Computational Biology  September 10, 2014

R code

Page 21: 3 rd  Summer School in Computational Biology  September 10, 2014

2. Comparing Survival Curves

Page 22: 3 rd  Summer School in Computational Biology  September 10, 2014

Reasons for comparing survival curves (SC)

• Treatment vs no treatment:
  – Compare the SC for patients who have been treated with a certain medication with the SC for patients who have not been treated.
  – Result: Does the treatment have an effect on the survival of the patients?

Page 23: 3 rd  Summer School in Computational Biology  September 10, 2014

Reasons for comparing survival curves

• Chemotherapy vs no chemotherapy:
  – Compare the SC for patients who had chemotherapy with the SC for patients who did not have chemotherapy.
  – Result: Does the chemotherapy have an effect on the survival of the patients?

Survival analysis has great practical relevance.

Page 24: 3 rd  Summer School in Computational Biology  September 10, 2014

Data: Leukemia

• 11 patients with chemo
• 12 patients without

Goal: compare the two SCs statistically

Group 1

Group 2

Page 25: 3 rd  Summer School in Computational Biology  September 10, 2014

R code

Page 26: 3 rd  Summer School in Computational Biology  September 10, 2014

Log-rank test (Mantel-Haenszel)

• Hypothesis:
  – Null hypothesis H0: No difference in survival between (group 1) and (group 2).
  – Alternative hypothesis H1: Difference in survival between (group 1) and (group 2).

Mantel and Haenszel 1959

Page 27: 3 rd  Summer School in Computational Biology  September 10, 2014

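Idea of the test

• For each time t, estimate the expected number of events for (group 1) and (group 2):

$$e_{it} = \frac{n_{it}}{n_{1t} + n_{2t}} \left(m_{1t} + m_{2t}\right)$$

$n_{it}$: number at risk at time t in group i

$m_{it}$: number of events at time t in group i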
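
As a worked illustration with made-up numbers (not taken from the leukemia data): under H0, the events observed at time t are expected to split between the two groups in proportion to the numbers at risk. Suppose 10 subjects are at risk in group 1 and 12 in group 2, and 2 events occur at t in total. Writing $e_{it}$ for the expected number of events in group i:

```latex
e_{1t} = \frac{10}{10 + 12} \times 2 \approx 0.91, \qquad
e_{2t} = \frac{12}{10 + 12} \times 2 \approx 1.09
```

The two expected counts add up to the 2 events actually observed at t, as they must.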

Page 28: 3 rd  Summer School in Computational Biology  September 10, 2014

The $e_{it}$ are obtained assuming H0 is true. Hence, $m_{it} - e_{it}$ is a measure for the deviation of the data from H0.

[Table: sums over all times t, giving E1, E2, O1 - E1, and O2 - E2]

Page 29: 3 rd  Summer School in Computational Biology  September 10, 2014

Wrapping up

• Test statistic:

• Sampling distribution: s follows a chi-square distribution with one degree of freedom
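
The transcript does not preserve the formula for the test statistic, so which form the slide showed is unknown. One common form is s = (O1 - E1)² / Var(O1 - E1), with the variance taken from the hypergeometric distribution; a simpler approximation sums (Oi - Ei)² / Ei over both groups. The Python sketch below implements the first form as an assumption. The function name `logrank_test`, the 0/1 event coding, and the data are hypothetical.

```python
import math


def logrank_test(times1, events1, times2, events2):
    """Two-group log-rank test; returns (O1, E1, statistic, p_value)."""
    event_times = sorted(
        {t for t, e in zip(times1, events1) if e}
        | {t for t, e in zip(times2, events2) if e}
    )
    o1 = e1 = var = 0.0
    for t in event_times:
        n1 = sum(1 for x in times1 if x >= t)  # at risk in group 1
        n2 = sum(1 for x in times2 if x >= t)  # at risk in group 2
        m1 = sum(1 for x, e in zip(times1, events1) if e and x == t)
        m2 = sum(1 for x, e in zip(times2, events2) if e and x == t)
        n, m = n1 + n2, m1 + m2
        o1 += m1                               # observed events in group 1
        e1 += n1 * m / n                       # expected events under H0
        if n > 1:                              # hypergeometric variance term
            var += n1 * n2 * m * (n - m) / (n * n * (n - 1))
    if var == 0:
        raise ValueError("variance is zero; the statistic is undefined")
    stat = (o1 - e1) ** 2 / var
    # Upper tail of a chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return o1, e1, stat, p_value


# Hypothetical data: two groups of two subjects, all events observed.
o1, e1, stat, p = logrank_test([1, 3], [1, 1], [2, 4], [1, 1])
print(o1, round(e1, 3), round(stat, 3))
```

The p-value uses the identity that a chi-square variable with one degree of freedom is the square of a standard normal variable, which avoids any dependency beyond the standard library.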

Page 30: 3 rd  Summer School in Computational Biology  September 10, 2014

R code

• Back to our leukemia data set:

Page 32: 3 rd  Summer School in Computational Biology  September 10, 2014

Survival Analysis & Biomarkers

Page 33: 3 rd  Summer School in Computational Biology  September 10, 2014

NIH Definition of Biomarker

A characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to therapeutic intervention.

Page 34: 3 rd  Summer School in Computational Biology  September 10, 2014

FDA Definition of Biomarker

Any measurable diagnostic indicator that is used to assess the risk or presence of disease.

Page 35: 3 rd  Summer School in Computational Biology  September 10, 2014

What is a biomarker?

These definitions are very broad and do not help in finding practical implementations for a particular disease.

Page 36: 3 rd  Summer School in Computational Biology  September 10, 2014

Our “definition”

Remark: We do not want to address all possible problems that can involve biomarkers but focus on a particular application.

Application: Identify a set of genes that can be used for a prognostic analysis.

…that are good!

Page 37: 3 rd  Summer School in Computational Biology  September 10, 2014

Definition of ‘prognosis’

Prognosis is a medical term denoting the prediction of how a patient will progress over time.

For instance, a patient with a diagnosed disease can have:
– Long-time survival
– Short-time survival

Page 38: 3 rd  Summer School in Computational Biology  September 10, 2014

Our “definition”

Remark: We do not want to address all possible problems that can involve biomarkers but focus on a particular application.

Application: Identify a set of genes that can be used for a prognostic analysis.

• Set of genes: we call these biomarkers
• Use biomarkers to predict the prognostic outcome of a patient (to classify survival)

Page 39: 3 rd  Summer School in Computational Biology  September 10, 2014
Page 40: 3 rd  Summer School in Computational Biology  September 10, 2014

Underlying idea to identify biomarkers

The identification of biomarkers is a composite approach (or a procedure) that is based on a couple of other methods.

In the previous example:
1. Survival analysis
2. Differential expression of genes
3. Classification

Page 41: 3 rd  Summer School in Computational Biology  September 10, 2014
Page 42: 3 rd  Summer School in Computational Biology  September 10, 2014

Underlying idea to identify biomarkers

The identification of biomarkers is a composite approach (or a procedure) that is based on a couple of other methods.

In the previous example:
1. Clustering
2. Survival analysis
3. Differential expression of genes
4. Classification

Page 43: 3 rd  Summer School in Computational Biology  September 10, 2014

Our “definition”

Remark: We do not want to address all possible problems that can involve biomarkers but focus on a particular application.

Application: Identify a set of genes that can be used for a prognostic analysis.

Structured patient groups vs unstructured patient groups

Statistics: Feature selection problem

Page 44: 3 rd  Summer School in Computational Biology  September 10, 2014

Underlying idea to identify biomarkers

The identification of biomarkers is a composite approach (or a procedure) that is based on a couple of other methods.

The definition of the procedure is part of the experimental design of the whole experiment.

Yes, the experimental design includes the analysis of the data!

Page 45: 3 rd  Summer School in Computational Biology  September 10, 2014

Summary & Outlook to Genome and Network Medicine

Almost there!

Page 46: 3 rd  Summer School in Computational Biology  September 10, 2014

Schedule

17 lectures

Page 47: 3 rd  Summer School in Computational Biology  September 10, 2014

Interdisciplinary summer school

Page 48: 3 rd  Summer School in Computational Biology  September 10, 2014

Vision of the VC

Universities require interdisciplinary engagement in the educational and research effort

Professor Patrick Johnston, President and Vice-Chancellor (VC) of Queen’s University

Page 49: 3 rd  Summer School in Computational Biology  September 10, 2014

A look 5 years ahead

Page 50: 3 rd  Summer School in Computational Biology  September 10, 2014

1. Single cell experiments

Experimental measurements of
– DNA
– Gene expression (mRNA)
– Protein binding
within single cells.

What do the other high-throughput data provide information for? Populations of cells.

NGS

Page 51: 3 rd  Summer School in Computational Biology  September 10, 2014

1. Single cell experiments

Study the heterogeneity of cancer tumors.

Page 52: 3 rd  Summer School in Computational Biology  September 10, 2014

1. Single cell experiments

PacBio (Pacific Biosciences) SMRT: single-molecule real-time sequencing

Page 53: 3 rd  Summer School in Computational Biology  September 10, 2014

2. Personalized Medicine

The idea behind personalized medicine is to customize healthcare using molecular analysis, with medical decisions, practices, etc. tailored to the needs of the individual patient.

One drug for all → customized treatment.

Page 54: 3 rd  Summer School in Computational Biology  September 10, 2014

2. Personalized Medicine

2012

Page 57: 3 rd  Summer School in Computational Biology  September 10, 2014

What does this all mean?

It means first of all more data!

Page 58: 3 rd  Summer School in Computational Biology  September 10, 2014

Survey

Please participate in the survey about the summer school in order to help us to improve.

We will send it early next week.

Page 59: 3 rd  Summer School in Computational Biology  September 10, 2014

Thank you to everyone for participating!

We hope you enjoyed the summer school.