Introduction to Survival Analysis - Kan Ren · 2021. 4. 3. · Deep Survival Analysis NN-based Cox...

Introduction to Survival Analysis Kan Ren Apex Data and Knowledge Management Lab Shanghai Jiao Tong University Seminar Tutorial at Apex Lab Kan Ren (Shanghai Jiao Tong University) Introduction to Survival Analysis Seminar Tutorial at Apex Lab 1 / 32


Introduction to Survival Analysis

Kan Ren

Apex Data and Knowledge Management Lab

Shanghai Jiao Tong University

Seminar Tutorial at Apex Lab

Kan Ren (Shanghai Jiao Tong University) Introduction to Survival Analysis Seminar Tutorial at Apex Lab 1 / 32


Outline

1 Background
    Probability
    Censored Data
    Challenges

2 Methodology
    Non-parametric Models
        Kaplan Meier Estimator
        Survival Tree
    Parametric Model
        Cox Hazard Proportional Model
        Deep Survival Analysis

3 Evaluation


Background Probability

Probability

Probability Density Function (P.D.F.):

$$p_t(t) = \Pr(T = t). \tag{1}$$

Cumulative distribution function (C.D.F.):

$$w_t(t) = \Pr(T < t) = \int_0^t p_t(v)\,dv. \tag{2}$$


Background Censored Data

Censored Data

Right Censored Data

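The event happens after the observation time, so only a lower bound on the event time is known.

E: the event; $t_{obsv}$: the observation time.

Each sample is recorded as $(x, t_{obsv}, e)$ with $e \in \{\text{True}, \text{False}\}$ indicating whether the event was observed, or as $(x, T_E)$, where $T_E$ is the logged time at which the event happened.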

Example

Patient’s survival time.

The true winning price of a bidding auction.

The next visit time of the user.
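As a minimal sketch of the record format above, each sample can store the covariates, the observed time, and a flag telling whether the event was seen. The `Observation` class, the `make_observation` helper, and the example values are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    x: List[float]   # covariates
    t_obsv: float    # observed time (event time, or censoring time)
    e: bool          # True: event observed; False: right-censored


def make_observation(x, true_event_time, window_end):
    """Observe a subject until `window_end`.

    The event is seen only if it happens inside the window;
    otherwise the record is right-censored at `window_end`.
    """
    if true_event_time <= window_end:
        return Observation(x, true_event_time, True)
    return Observation(x, window_end, False)


data = [
    make_observation([63.0, 1.0], true_event_time=14.0, window_end=24.0),
    make_observation([48.0, 0.0], true_event_time=40.0, window_end=24.0),
]
```

The second record is censored: all that is known is that its event time exceeds 24.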


Background Challenges

Challenges

Right Censorship

Partial data usage: methods that learn only from uncensored samples discard a large part of the data.

Right Censorship: we only know that the event time is greater than the observation time window.

Evaluation: proper evaluation metric is needed.


Modeling Right Censored Data in Display Ads

Losing and Winning in 2nd-price Auction


Modeling Right Censored Data

Right Censored

Right Censorship

As in a 2nd-price auction: if you lose, you only know that the market price is higher than your bidding price, which results in right censorship.
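The censoring mechanism of a second-price auction can be sketched in a few lines. The function name and the tie rule (a bid equal to the market price loses) are assumptions made for illustration.

```python
def second_price_outcome(bid, market_price):
    """Return (won, observed_value) for one second-price auction.

    market_price is the highest competing bid. A winner observes the
    market price exactly; a loser only learns that market_price >= bid,
    so the observation is right-censored at the bid.
    Assumption: ties (bid == market_price) count as a loss.
    """
    if bid > market_price:
        return True, market_price
    return False, bid


won = second_price_outcome(5.0, 3.0)    # market price fully observed
lost = second_price_outcome(2.0, 3.0)   # only a lower bound is observed
```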


Methodology Non-parametric Models


Kaplan Meier Estimator

Preliminaries

$S(t) = \Pr(t < T_E)$: survival rate.

$F(t) = 1 - S(t)$: failing rate.

Algorithm

The estimator for an individual is given by

$$S(t) = \prod_{i:\, t_i \le t} \left(1 - \frac{d_i}{n_i}\right), \tag{3}$$

where $d_i$ is the number of events at time $t_i$ and $n_i$ is the total number of individuals at risk at time $t_i$.
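The product in Eq. (3) can be computed directly from observed times and event flags. The sketch below is a plain-Python illustration, not a library implementation; the function name and the toy data are hypothetical, and an individual is counted as at risk at $t_i$ if its observed time is at least $t_i$.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t_i, S(t_i))] at each distinct event time.

    times:  observed time of each individual
    events: True if the event was observed, False if right-censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)   # d_i per event time
    survival, curve = 1.0, []
    for t in sorted(deaths):
        n_at_risk = sum(1 for u in times if u >= t)            # n_i
        survival *= 1.0 - deaths[t] / n_at_risk                # factor (1 - d_i / n_i)
        curve.append((t, survival))
    return curve


curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
```

Censored individuals never contribute to $d_i$, but they stay in $n_i$ until their censoring time, which is how the estimator uses partially observed data.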


Survival Tree with Kaplan Meier Methods

Cons of KM

Coarse-grained: the same for all individuals.

Statistical method: cannot provide personalized forecasting.

Question

How to apply an appropriate clustering method for one individual?


Tree-based Mapping

Goal

Given the auction feature $x$, forecast the market price distribution $p_x(z)$ [a].

[a] Yuchen Wang, Kan Ren, Weinan Zhang, Yong Yu. Functional Bid Landscape Forecasting for Display Advertising. ECML-PKDD, 2016.


Tree-based Mapping

Methodology


Node Splitting


Node Splitting

KLD and Clustering

Kullback-Leibler Divergence (KLD)

A measure of the difference between two probability distributions P and Q.

Node Splitting (one step)

Divide all the category values contained in this node into two sets, maximizing the KLD between the two resulting sets.

Algorithm

Using K-Means Clustering according to KLD values.
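One way to picture a single splitting step is a two-cluster, K-means-style loop in which KLD replaces the Euclidean distance. This is a heuristic sketch under stated assumptions, not necessarily the exact procedure of the cited work: the seeding rule, the averaged cluster distribution, and all names are hypothetical.

```python
import math


def kld(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(P || Q) for discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)


def mean_dist(dists):
    """Average of several distributions (still sums to 1)."""
    n = len(dists)
    return [sum(d[k] for d in dists) / n for k in range(len(dists[0]))]


def split_categories(cat_dists, n_iter=10):
    """Split categories into two sets with a K-means-style loop under KLD.

    cat_dists: dict mapping category value -> price distribution.
    """
    cats = sorted(cat_dists)
    # Hypothetical initialisation: first and last category seed the two sets.
    centers = [cat_dists[cats[0]], cat_dists[cats[-1]]]
    assign = {}
    for _ in range(n_iter):
        # Assignment step: each category joins the closer cluster distribution.
        new_assign = {
            c: min((0, 1), key=lambda s: kld(cat_dists[c], centers[s])) for c in cats
        }
        if new_assign == assign:
            break
        assign = new_assign
        # Update step: recompute each cluster distribution from its members.
        for s in (0, 1):
            members = [cat_dists[c] for c in cats if assign[c] == s]
            if members:
                centers[s] = mean_dist(members)
    return assign


cat_dists = {
    "a": [0.7, 0.2, 0.1],
    "b": [0.6, 0.3, 0.1],
    "c": [0.1, 0.2, 0.7],
    "d": [0.1, 0.3, 0.6],
}
assignment = split_categories(cat_dists)
```

Categories whose price distributions look alike end up in the same child node, so the two children differ strongly in distribution.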


Handling Censorship

Survival Model

For winning auctions: We have the true market price value.

For lost auctions: We only know our proposed bid price and know that the true market price is higher than that.

Intuition

Most related works focus only on the winning auctions without considering the lost auctions, which contain information for inferring the true distribution.

$$(b_i, w_i, m_i)_{i=1,2,\cdots,M} \longrightarrow (b_j, d_j, n_j)_{j=1,2,\cdots,N}$$

Here $b_j < b_{j+1}$, $d_j$ is the number of winning auctions by $b_j - 1$, and $n_j$ is the number of lost auctions by $b_j - 1$. So

$$w(b_x) = 1 - \prod_{b_j < b_x} \frac{n_j - d_j}{n_j}, \qquad p(z) = w(z + 1) - w(z). \tag{4}$$
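Given the aggregated triples $(b_j, d_j, n_j)$, Eq. (4) is a running product. The sketch below takes the triples as input rather than deriving them from raw logs; the function names and the counts are hypothetical, and prices are assumed to be integers.

```python
def winning_probability(counts, b_x):
    """w(b_x) = 1 - prod_{b_j < b_x} (n_j - d_j) / n_j.

    counts: list of (b_j, d_j, n_j) sorted by increasing b_j.
    """
    surv = 1.0
    for b_j, d_j, n_j in counts:
        if b_j < b_x:
            surv *= (n_j - d_j) / n_j
    return 1.0 - surv


def market_price_pmf(counts, z):
    """p(z) = w(z + 1) - w(z), for integer prices z."""
    return winning_probability(counts, z + 1) - winning_probability(counts, z)


counts = [(1, 2, 10), (2, 2, 8), (3, 3, 6)]   # hypothetical (b_j, d_j, n_j)
w_values = [winning_probability(counts, b) for b in (1, 2, 3, 4)]
```

With these counts the winning probability rises from 0 at a bid of 1 to 0.7 at a bid of 4.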


Survival Model


Methodology Parametric Model


Cox Hazard Proportional Model

Hazard Rate: the rate at which the event happens at a given time, given that it has not happened before.

Hazard Function: the function $\lambda(t|x)$ that predicts the hazard rate w.r.t. the covariate input $x$.

Hazard Proportional Model (commonly called the Cox proportional hazards model): a hazard function that depends on the input covariate through a multiplicative factor, $\lambda(t|x) = \lambda_0(t) \exp(h(x))$, where $\lambda_0(t)$ is the base hazard function.

Example

Linear Cox Hazard Model: $h(x) = \beta x$.

Question: What if $h(x)$ is non-linear?
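To make the proportionality concrete, the sketch below evaluates $\lambda(t|x) = \lambda_0(t)\exp(\beta x)$ for two individuals at several times. The base hazard, the coefficients, and the covariates are hypothetical values chosen only for illustration.

```python
import math


def cox_hazard(t, x, beta, base_hazard):
    """lambda(t | x) = lambda_0(t) * exp(h(x)) with linear h(x) = beta . x"""
    h = sum(b * xi for b, xi in zip(beta, x))
    return base_hazard(t) * math.exp(h)


def base(t):
    return 0.1 * t               # hypothetical base hazard lambda_0(t)


beta = [0.5, -1.0]               # hypothetical coefficients
x_a, x_b = [1.0, 0.0], [0.0, 1.0]

# The hazard ratio between two individuals is the same at every time t,
# because lambda_0(t) cancels out.
ratios = [
    cox_hazard(t, x_a, beta, base) / cox_hazard(t, x_b, beta, base)
    for t in (1.0, 2.0, 5.0)
]
```

Each ratio equals $\exp(h(x_a) - h(x_b))$, independent of $t$.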


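Discussion

Relationship among hazard rate $\lambda$, P.D.F. $p(z)$, and survival function $S(b)$

$$\begin{aligned}
\lambda(b) &= \lim_{db \to 0} \frac{\Pr(b \le z \le b + db \mid z > b)}{db} \\
&= \lim_{db \to 0} \frac{\Pr(b \le z \le b + db) / \Pr(z > b)}{db} \\
&= \lim_{db \to 0} \frac{\left(w_z(b + db) - w_z(b)\right) / S(b)}{db} = \frac{p_z(b)}{S(b)} = -\frac{S'(b)}{S(b)}.
\end{aligned} \tag{5}$$

Integrating $\lambda = -S'/S$ with $S(0) = 1$ gives $S(t|x) = \exp\left(-\int_0^t \lambda(v|x)\,dv\right)$, hence

$$\begin{aligned}
p_t(t|x) &= \frac{\partial w_t(t|x)}{\partial t} = -\frac{\partial S(t|x)}{\partial t} \\
&= -\frac{\partial \exp\left(-\int_0^t \lambda(v|x)\,dv\right)}{\partial t} \\
&= \exp\left(-\int_0^t \lambda(v|x)\,dv\right) \lambda(t|x).
\end{aligned} \tag{6}$$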
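The relationship between hazard, survival function, and density can be checked numerically. The sketch below assumes a hypothetical hazard $\lambda(t) = 2t$ (no covariates), builds $S(t) = \exp(-\int_0^t \lambda(v)\,dv)$ by trapezoidal integration, and compares $\lambda(t)S(t)$ with a finite-difference estimate of $-S'(t)$.

```python
import math


def hazard(t):
    return 2.0 * t                              # hypothetical hazard lambda(t)


def cumulative_hazard(t, steps=10000):
    """Trapezoidal approximation of the integral of lambda(v) over [0, t]."""
    dv = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    total += sum(hazard(k * dv) for k in range(1, steps))
    return total * dv


def survival(t):
    return math.exp(-cumulative_hazard(t))      # S(t) = exp(-int_0^t lambda)


def density(t):
    return hazard(t) * survival(t)              # p(t) = lambda(t) * S(t)


t, dt = 0.8, 1e-5
finite_diff = -(survival(t + dt) - survival(t - dt)) / (2 * dt)   # approximates -S'(t)
```

For this hazard the closed form is $S(t) = e^{-t^2}$, so both quantities should be close to $2t\,e^{-t^2}$.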


Cost Function: Partial Likelihood

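$$\text{Likelihood}_i = \frac{\lambda(t_i|x^i)}{\sum_{j:\, t_j \ge t_i} \lambda(t_i|x^j)} = \frac{\lambda_0(t_i)\, e^{h(x^i)}}{\sum_{j:\, t_j \ge t_i} \lambda_0(t_i)\, e^{h(x^j)}} = \frac{e^{h(x^i)}}{\sum_{j:\, t_j \ge t_i} e^{h(x^j)}}. \tag{7}$$

The risk set is written with $t_j \ge t_i$ so that it includes subject $i$ itself, which keeps the denominator non-empty for the last event.

$$L_{PL} = -\log \prod_{i:\,(x^i, t_i)} \text{Likelihood}_i = -\sum_{i:\,(x^i, t_i)} \left[ h(x^i) - \log \sum_{j:\, t_j \ge t_i} e^{h(x^j)} \right]. \tag{8}$$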
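The partial-likelihood loss can be sketched in plain Python. Two assumptions are made here and are not spelled out on the slide: the outer sum runs over samples with an observed event only, and the risk set of sample $i$ uses the conventional definition $\{j : t_j \ge t_i\}$, which includes $i$ itself. The function name and toy inputs are hypothetical.

```python
import math


def neg_log_partial_likelihood(risk, times, events):
    """Negative log partial likelihood of a Cox-type model.

    risk[i]   = h(x_i), the model's risk score
    times[i]  = observed time
    events[i] = True if the event was observed
    """
    loss = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue    # censored samples only appear inside risk sets
        log_denom = math.log(
            sum(math.exp(risk[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        )
        loss -= risk[i] - log_denom
    return loss


loss = neg_log_partial_likelihood([0.0, 0.0, 0.0], [1.0, 2.0, 3.0], [True, True, False])
```

With all risk scores equal, the first event has a risk set of three and the second a risk set of two, so the loss is $\log 3 + \log 2$. The base hazard $\lambda_0$ never appears, which is why only $h(x)$ needs to be learned.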


Base Hazard Function

Example

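Weibull Distribution: $\lambda_0(t) = \frac{k}{\eta} \left(\frac{t}{\eta}\right)^{k-1} \cdot e^{-(t/\eta)^k}$.

Note that this expression is the Weibull probability density; the Weibull hazard itself is $\frac{k}{\eta} \left(\frac{t}{\eta}\right)^{k-1}$.

Question: this assumes a fixed functional form, and the base hazard does not consider $x$.

[Figure: Probability Density Function] [Figure: Cumulative Distribution Function]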
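As a sketch of how the Weibull quantities in the two figures relate, the code below computes the density and the C.D.F. and derives the hazard as density over survival; for the Weibull distribution this ratio simplifies to $\frac{k}{\eta}(t/\eta)^{k-1}$. The parameter values are hypothetical.

```python
import math


def weibull_pdf(t, k, eta):
    """Weibull density with shape k and scale eta."""
    return (k / eta) * (t / eta) ** (k - 1) * math.exp(-((t / eta) ** k))


def weibull_cdf(t, k, eta):
    """Weibull cumulative distribution function."""
    return 1.0 - math.exp(-((t / eta) ** k))


def weibull_hazard(t, k, eta):
    """Hazard = pdf / survival, which simplifies to (k/eta) * (t/eta)^(k-1)."""
    return weibull_pdf(t, k, eta) / (1.0 - weibull_cdf(t, k, eta))


k, eta, t = 1.5, 2.0, 1.2       # hypothetical shape, scale, and time
hazard_value = weibull_hazard(t, k, eta)
```

With $k > 1$ the hazard grows over time, with $k < 1$ it decays, and $k = 1$ gives a constant hazard.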


Deep Survival Analysis

NN-based Cox Model

Using a deep neural network to model $h(x)$ [a][b][c].

[a] Faraggi D, Simon R. A neural network model for survival data[J]. Statistics in Medicine, 1995.

[b] Ranganath R, Perotte A, Elhadad N, et al. Deep Survival Analysis[C]//Machine Learning for Healthcare Conference. 2016.

[c] Luck M, Sylvain T, Cardinal H, et al. Deep Learning for Patient-Specific Kidney Graft Survival Analysis[J]. arXiv preprint arXiv:1705.10245, 2017.
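A non-linear $h(x)$ can be illustrated with a tiny one-hidden-layer network. This is only a forward-pass sketch with untrained, randomly initialised weights; the architecture, sizes, and names are hypothetical and do not reproduce any of the cited models. In practice the weights would be trained by minimising a loss such as the partial likelihood.

```python
import math
import random


def init_mlp(n_in, n_hidden, seed=0):
    """Randomly initialise a one-hidden-layer network (no training here)."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.gauss(0, 0.5) for _ in range(n_hidden)]
    return w1, b1, w2


def h(x, params):
    """Non-linear risk score h(x): tanh hidden layer, scalar output."""
    w1, b1, w2 = params
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    return sum(w * a for w, a in zip(w2, hidden))


params = init_mlp(n_in=3, n_hidden=4)
x_a, x_b = [0.2, -1.0, 0.5], [1.0, 0.3, -0.7]
# Under lambda(t|x) = lambda_0(t) * exp(h(x)), the base hazard cancels in the ratio.
hazard_ratio = math.exp(h(x_a, params) - h(x_b, params))
```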


Deep Survival Analysis

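Generative NN-based Survival Time Estimation [a]

[a] Deep Multi-task Gaussian Processes for Survival Analysis with Competing Risks, NIPS 2017.

$$\begin{aligned}
f_Z &\sim \mathcal{GP}(0, K_{\Theta_Z}), & f_T &\sim \mathcal{GP}(0, K_{\Theta_T}) \\
Z_i &\sim \mathcal{N}(f_Z(X_i), \sigma_Z^2 I), & T_i &\sim \mathcal{N}(f_T(X_i), \sigma_T^2 I) \\
T_i &= \min(T_i^1, \ldots, T_i^K).
\end{aligned} \tag{9}$$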


Deep Survival Analysis

DeepHit (Lee et al. AAAI 2018)

DeepHit: A Deep Learning Approach to Survival Analysis with Competing Risks. Lee et al. AAAI 2018.


$$w(b|x) = P(z \le b \mid x) = \sum_{j=0}^{b} P(z = z_j \mid x), \qquad S(b|x) = 1 - w(b|x). \tag{10}$$
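The cumulative sum in Eq. (10) is easy to sketch once the model outputs a probability for every discrete value $z_j$. The example assumes a softmax output layer over four bins and uses made-up logits; the function names are hypothetical.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability mass function over discrete bins."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def cdf_from_pmf(pmf, b):
    """P(z <= b | x) = sum_{j=0}^{b} P(z = z_j | x), bins indexed from 0."""
    return sum(pmf[: b + 1])


logits = [0.0, 0.0, 0.0, 0.0]     # hypothetical network outputs for one x
pmf = softmax(logits)             # P(z = z_j | x) for j = 0..3
cdf_at_1 = cdf_from_pmf(pmf, 1)
```

The probability that the event has not happened by bin $b$ is then one minus this cumulative sum.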


Evaluation

Log-Likelihood

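$$\bar{P} = -\frac{1}{N} \sum_{(x^i, z_i) \in D_{test}} \log p'_z(z_i | x^i), \tag{11}$$

where $N = |D_{test}|$ is the number of samples in the test dataset and $p'_z(z|x)$ is the learned P.D.F. Because of the leading minus sign, $\bar{P}$ is the average negative log-likelihood, so lower is better.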
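A minimal sketch of this metric for a discrete learned P.D.F. follows; the predictions and observed values are made up, and the function name is hypothetical.

```python
import math


def average_negative_log_likelihood(pmf_list, z_list):
    """-(1/N) * sum_i log p'(z_i | x_i).

    pmf_list[i][z] is the learned probability of value z for test sample i.
    """
    n = len(z_list)
    return -sum(math.log(pmf[z]) for pmf, z in zip(pmf_list, z_list)) / n


pmfs = [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25]]   # hypothetical predictions
zs = [0, 1]                                       # observed values
score = average_negative_log_likelihood(pmfs, zs)
```

Both samples receive probability 0.5 for their true value, so the score is $\log 2$.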


Concordance Index (C-index)

Considering all possible pairs $(T_i, E_i)$, $(T_j, E_j)$ for $i < j$, the C-index is the number of pairs correctly ordered by the model divided by the total number of admissible pairs.

Admissible: the pair can be ordered in a meaningful way. Admissible pairs are (uncensored, uncensored) and (uncensored, right-censored); in the latter case the censoring time must not be earlier than the event time.
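The pair counting can be sketched as below. The conventions are assumptions for illustration: a higher score means a higher risk (earlier expected event), pairs with tied times are skipped, and tied scores count as half-concordant. The function name and toy data are hypothetical.

```python
def concordance_index(times, events, scores):
    """Fraction of admissible pairs that the model orders correctly.

    times:  observed times
    events: True if uncensored
    scores: predicted risk (higher score = earlier expected event)
    """
    concordant, admissible = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Admissible: i has an observed event strictly before the
            # observed (event or censoring) time of j.
            if events[i] and times[i] < times[j]:
                admissible += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5      # common convention for tied scores
    return concordant / admissible


c = concordance_index([2, 4, 6, 8], [True, False, True, True], [0.9, 0.1, 0.5, 0.7])
```

Here four pairs are admissible and three are ordered correctly, giving 0.75. A value of 0.5 corresponds to random ordering and 1.0 to a perfect ranking.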
