Prize4Life: Predicting Disease Progression in ALS
Special thanks to Neta Zach and Robert Küffner
Lester Mackey
November 22, 2013
Joint work with Lilly Fang
Goals of the Talk
§ Bring awareness to a fatal disease
  • Amyotrophic lateral sclerosis (ALS)
§ Present an example of crowdsourced science
  • $50,000 ALS Prediction Prize4Life Challenge
§ Introduce you to a rich data source
  • 8500-patient PRO-ACT database
§ Highlight interesting (open) statistical questions
What is ALS?
§ Amyotrophic lateral sclerosis or Lou Gehrig’s Disease
  • A neurodegenerative disease that targets motor neurons
  • Leads to muscle atrophy, paralysis, and ultimately death
  • 100% fatal, typically within 3-5 years, but not always
Fast progressor: Lou Gehrig (died within 2 years of diagnosis)
Slow progressor: Stephen Hawking (has lived with the disease for 50 years)
Prize4Life
§ 2004: 29-year-old Avi Kremer diagnosed with ALS
§ 2006: Founded ALS non-profit
  • Goal: Accelerate development of treatment for ALS
Avi, 9 months after diagnosis
Avi, 2011, receiving Israeli PM award for Entrepreneurship and Innovation
Prize4Life: Incentives for Innovation
§ $1M ALS Biomarker Prize, 2006-2011
  • Goal: Inexpensive, sensitive tool for monitoring disease progression and treatment efficacy
§ $1M ALS Treatment Prize, 2008-Present
  • Goal: Therapy increasing lifespan of ALS mice by 25%
§ $50K ALS Prediction Prize, 7/2012-10/2012
  • Goal: Predict rate of disease progression in ALS patients
    § Distinguish the slow progressors from the fast
Questions
  • What do we mean by disease progression?
  • Why is progression prediction valuable?
  • How can we hope to predict progression accurately?
Predicting ALS Progression: What?
§ ALS Functional Rating Scale (ALSFRS)
  • Measure of patient functionality, ranging from 0 to 40
  • Based on 10 questions regarding everyday activity:
    § Speaking, respiration, climbing stairs, dressing, writing, …
    § Activity score of 4 is normal, 0 is complete inability
  • Slow progressor loses 0-3 points per year
  • Fast progressor can lose 20

| Visit | Speech | Respira. | Saliv. | Swall. | Handwr. | Cutting | Dress. | Turn. | Climb. | Walk. | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Visit 0 | 3 | 4 | 3 | 3 | 4 | 4 | 3 | 4 | 4 | 4 | 36 |
| Month 1 | 3 | 4 | 3 | 3 | 4 | 4 | 3 | 4 | 4 | 4 | 36 |
| Month 2 | 3 | 4 | 2 | 3 | 4 | 4 | 3 | 4 | 4 | 4 | 35 |
| Month 3 | 3 | 4 | 2 | 3 | 4 | 4 | 3 | 4 | 4 | 3 | 34 |
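
As a quick check on the scoring rule, the ALSFRS total is simply the sum of the ten question scores. The sketch below recomputes the totals for the four visits in the table; the function name `alsfrs_total` and the range check are illustrative choices, not part of the talk.

```python
# Question scores per visit, in the table's column order:
# speech, respiration, salivation, swallowing, handwriting,
# cutting, dressing, turning, climbing, walking.
VISITS = {
    "Visit 0": [3, 4, 3, 3, 4, 4, 3, 4, 4, 4],
    "Month 1": [3, 4, 3, 3, 4, 4, 3, 4, 4, 4],
    "Month 2": [3, 4, 2, 3, 4, 4, 3, 4, 4, 4],
    "Month 3": [3, 4, 2, 3, 4, 4, 3, 4, 4, 3],
}


def alsfrs_total(scores):
    """Sum ten question scores, each between 0 (inability) and 4 (normal)."""
    if len(scores) != 10 or any(not 0 <= s <= 4 for s in scores):
        raise ValueError("expected ten scores in the range 0-4")
    return sum(scores)


totals = {visit: alsfrs_total(scores) for visit, scores in VISITS.items()}
print(totals)
```

The recomputed totals can be compared against the Total column; a mismatch would point to a transcription error in one of the question scores.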
State of Progression Prediction
Clinical Presentation:
§ A 69-year-old Caucasian female, 19 months after diagnosis
§ Bulbar onset (degeneration in muscles controlling speaking/swallowing)
§ Weight stable and normal
ALSFRS Scores

| Visit | Speech | Respira. | Saliv. | Swall. | Handwr. | Cutting | Dress. | Turn. | Climb. | Walk. | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Visit 0 | 3 | 4 | 3 | 3 | 4 | 4 | 3 | 4 | 4 | 4 | 36 |
| Month 1 | 3 | 4 | 3 | 3 | 4 | 4 | 3 | 4 | 4 | 4 | 36 |
| Month 2 | 3 | 4 | 2 | 3 | 4 | 4 | 3 | 4 | 4 | 4 | 35 |
| Month 3 | 3 | 4 | 2 | 3 | 4 | 4 | 3 | 4 | 4 | 3 | 34 |
State of Progression Prediction
Clinical Presentation: Vitals and Lab Tests

| Visit | Respiratory rate | Pulse | Blood pressure |
| --- | --- | --- | --- |
| Visit 0 | 12 | 82 | 150/80 |
| Month 1 | 18 | 81 | 144/80 |
| Month 2 | Missing | Missing | Missing |
| Month 3 | 18 | 92 | 142/84 |

| Visit | Urine pH | Glucose | Hemogl. | Bilirubin | Trigly. | Cholest. | K | Cl | Ca | Na | Phos | CO2 | Albumin | Creatinine | (BUN) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Visit 0 | 7 | 6.4 | 133 | 9 | 1.25 | 6.53 | 4.1 | 104 | 2.35 | 139 | 1.36 | 26 | 46 | 62 | 7.85 |
| Month 1 | 6 | 5.4 | 132 | 7 | 2.35 | 6.11 | 4.3 | 105 | 2.45 | 139 | 1.45 | 28 | 46 | 71 | 8.96 |
| Month 2 | 7 | 6.1 | 127 | 7 | 1.66 | 7.07 | 4.6 | 106 | 2.38 | 140 | 1.23 | 26 | 47 | 71 | 8.43 |
| Month 3 | 6 | 5.6 | 131 | 7 | 1.29 | 6.53 | 4.5 | 105 | 2.38 | 140 | 1.39 | 29 | 47 | 62 | 7.78 |

| Visit | Basophils | Eosinophils | Monocytes | Lymphocytes | Neutrophils |
| --- | --- | --- | --- | --- | --- |
| Visit 0 | 0.02 | 0.13 | 0.51 | 1.61 | 4.32 |
| Month 1 | 0.03 | 0.19 | 0.52 | 1.61 | 4.05 |
| Month 2 | 0.02 | 0.22 | 0.67 | 2.49 | 4.70 |
| Month 3 | 0.07 | 0.21 | 0.71 | 2.35 | 4.37 |
State of Progression Prediction
Six expert ALS clinicians estimated change in ALSFRS over 9 months
Reality: The patient lost 12 points

| Clinician | A | B | C | D | E | F | Average |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Score | -3 | -3 | -4 | -5 | -6 | -11 | -5.33 |
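
The gap between the clinicians' estimates and the actual outcome can be made explicit with a few lines of arithmetic. This is only a sketch over the six numbers in the table; the variable names are illustrative.

```python
# Clinician estimates of the 9-month change in ALSFRS, from the table.
estimates = {"A": -3, "B": -3, "C": -4, "D": -5, "E": -6, "F": -11}
actual_change = -12  # the patient lost 12 points

mean_estimate = sum(estimates.values()) / len(estimates)

# Positive error = the clinician predicted a smaller loss than occurred.
errors = {name: est - actual_change for name, est in estimates.items()}
underestimates = sum(e > 0 for e in errors.values())

print(f"mean estimate: {mean_estimate:.2f}")
print(f"error of the mean estimate: {mean_estimate - actual_change:.2f}")
print(f"clinicians underestimating the loss: {underestimates} of {len(errors)}")
```

Every estimate falls short of the actual decline, which is the point of the slide: even experts find this prediction hard.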
Predicting ALS Progression: Why?
Why predict rate of disease progression?
§ Helping clinicians
  • More accurate prognosis
  • Identifying predictive patient characteristics
    § Which lab tests are worthwhile?
§ Stratifying clinical trial patients
  • Less variability ⇒ fewer patients needed ⇒ less expensive, more interpretable clinical trials
  • Recent 1000-patient trial cost over $100 million
  • Using our algorithm, Prize4Life estimates a 20% reduction in patients needed to observe drug effect
Predicting ALS Progression: How?
The PRO-ACT Database
§ Pooled Resource Open-Access ALS Clinical Trials
§ 8500 de-identified patient records from completed clinical trials
§ Largest ALS patient data set ever assembled
§ Demographics, medical and family history data
§ Functional measures (ALSFRS, lung capacity)
§ Vital signs (weight, height, respiratory rate)
§ Lab data (blood chemistry, hematology, and urinalysis)
§ Released to the public in Dec. 2012
The ALS Prediction Prize
ALS Prediction Prize: Setup
§ The Contest Data
  • 918 training patients
    § 12 months of data (demographic, ALSFRS, vital statistics, lab tests)
    § Time series: roughly monthly measurements, unequally spaced
  • 279 test patients
    § First 3 months of data available at test time
§ Challenge: Given first 3 months of patient data, predict progression of ALS over subsequent 9 months
§ Measure: ALS Functional Rating Scale (ALSFRS) score
  • Rate of progression = slope of ALSFRS score
Target for Prediction
[Figure: ALSFRS score versus months, with m1 = first visit after 3 months and m2 = first visit after 12 months]

$$\text{slope} = \frac{\text{ALSFRS}(m_2) - \text{ALSFRS}(m_1)}{m_2 - m_1}$$

§ Issues: Timing of future visits unknown; Slope unstable
§ Open Question: Better targets for prediction?
  • Estimate ALSFRS score as a function of time?
  • Classify patient as slow or fast progressor?
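
The target definition can be sketched in code: take the first visit after month 3 and the first visit after month 12, then divide the score change by the elapsed time. The helper names, the "strictly after" reading of the cut-offs, and the patient data below are assumptions made for illustration; the contest's exact rules may differ.

```python
def first_visit_after(visits, month):
    """Return the earliest (month, score) visit strictly after `month`."""
    later = [v for v in visits if v[0] > month]
    if not later:
        raise ValueError(f"no visit after month {month}")
    return min(later)


def target_slope(visits):
    """Change in ALSFRS per month between visits m1 and m2."""
    m1, score1 = first_visit_after(visits, 3)
    m2, score2 = first_visit_after(visits, 12)
    return (score2 - score1) / (m2 - m1)


# Hypothetical patient: (month, ALSFRS score) pairs, unequally spaced.
visits = [(0.0, 38), (1.1, 38), (2.9, 37), (3.2, 36),
          (6.0, 34), (9.1, 32), (12.2, 30)]
print(target_slope(visits))
```

Because only two visits enter the formula, a single noisy score at m1 or m2 shifts the target noticeably, which is one way to read the "slope unstable" issue.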
ALS Progression Types
[Figure: example ALSFRS trajectories labeled fast, slow, and non-linear]
The Difficulty of Prediction
ALS Prediction Prize: Evaluation
§ Contest run on Innocentive prize platform
  • Hosts science competitions
  • See also Kaggle, Challenge.gov
§ Contestants uploaded code to Innocentive server
  • Code had to be written in R!
  • Max running time: 6 hours
§ Leaderboard displayed error on test set
  • Max # submissions: 100
§ Error metric: Root mean squared deviation (RMSD)
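
For reference, RMSD is the square root of the average squared difference between predicted and observed values. A minimal sketch follows; the slopes are made-up numbers, and the contest itself required R rather than Python.

```python
import math


def rmsd(predicted, actual):
    """Root mean squared deviation between two equal-length sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical ALSFRS slopes for four patients.
true_slopes = [-0.2, -1.5, -0.6, 0.0]
predicted_slopes = [-0.5, -1.0, -0.6, -0.4]
print(round(rmsd(predicted_slopes, true_slopes), 4))
```

Squaring penalizes large misses more heavily than small ones, so a few badly predicted patients can dominate the score.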
ALS Prediction Prize: Evaluation
§ Oct. 1, 2012: Test set released to contestants
§ The Final Contest Data
  • 918 training patients + 279 test patients
    § 12 months of data (demographic, ALSFRS, vital statistics, lab tests)
  • 625 validation patients determined prize winners
    § Data never seen by contestants, no prior feedback given
    § Tests ability to generalize to new patients
Our Approach
Featurization
  • Static Data
  • Time Series Data
Modeling and Inference
  • Bayesian Additive Regression Trees
Post-hoc Evaluation
  • BART Performance
  • Feature Selection
  • Model Comparison
Featurization
§ Goal: Compact numeric representation of each patient
  • Features will serve as covariates in a regression model
  • Most extracted features will be irrelevant
  • Rely on model selection / methods robust to irrelevant features
Issue: Features manually specified by non-expert (me)
Open Question: Automatic featurization of longitudinal data?
Featurization
§ Static Data
  • Demographics: Age, Race, Sex
  • ALS History: Time from onset, Site of onset
  • Family History: Mother, Father, Grandmother, Uncle…
  • ……
  • 49 static features; categorical variables encoded as binary indicators
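
Encoding a categorical variable as binary indicators means creating one 0/1 column per observed level. The sketch below shows the idea on two made-up records; the field names, level names, and the `one_hot` helper are hypothetical and are not taken from the PRO-ACT schema.

```python
def one_hot(records, column):
    """Replace a categorical column with one 0/1 indicator per level."""
    levels = sorted({r[column] for r in records})
    encoded = []
    for r in records:
        row = {k: v for k, v in r.items() if k != column}
        for level in levels:
            row[f"{column}={level}"] = int(r[column] == level)
        encoded.append(row)
    return encoded


# Hypothetical static records; names and values are illustrative only.
patients = [
    {"age": 69, "sex": "F", "onset_site": "bulbar"},
    {"age": 54, "sex": "M", "onset_site": "limb"},
]
features = one_hot(one_hot(patients, "sex"), "onset_site")
print(features[0])
```

Numeric fields such as age pass through unchanged, so each patient ends up as a purely numeric feature vector.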
Featurization
§ Time Series Data
  • Repeated measurements of variables over time
    § ALSFRS question scores
    § Alternative ALS measures (forced and slow vital capacity)
    § Vital signs (weight, height, blood pressure, respiratory rate)
    § Lab tests (blood chemistry, hematology, urinalysis)
  • Number and frequency of measurements vary across patients
Featurization
§ Time Series Data
  • Compute summary statistics from each time series
    § Mean value, standard deviation, slope, last recorded value, maximum value…
  • Compute pairwise slopes (difference quotients between adjacent measurements)
    § Induces a derivative time series
    § Extract same summary statistics
Featurizing Time Series Data
[Figure: ALSFRS score (vertical axis, 36-40) versus months (horizontal axis, 0-3.5)]
Features extracted
  • Mean = 38.75
  • SD = 0.816
  • Max = 40
  • Min = 37
  • Last = 37
  • Slope = -1
  • etc.
[Figure: the same plot with adjacent measurements joined by segments labeled slope -1, slope 0, and slope -2; these pairwise slopes form the derivative time series (ALSFRS slope, vertical axis -2.5 to 0, versus months)]
Features extracted from the derivative time series
  • Mean = -1
  • SD = 1
  • Max = 0
  • Min = -2
  • Last = -2
  • Slope = -0.5
  • etc.
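
The two-stage featurization can be sketched as follows: summarize the raw series, build the derivative series from pairwise slopes, then summarize that series with the same statistics. The measurements below are hypothetical, not read off the slide's figure, and the sample-SD and endpoint-slope definitions are assumptions; in particular, the SD printed for the raw series depends on those assumptions and is not meant to reproduce the value listed above.

```python
import statistics


def pairwise_slopes(times, values):
    """Difference quotients between adjacent measurements.

    Returns (midpoint times, slopes). Placing each slope at the interval
    midpoint is an assumption made for this sketch.
    """
    mids, slopes = [], []
    for t0, t1, v0, v1 in zip(times, times[1:], values, values[1:]):
        mids.append((t0 + t1) / 2)
        slopes.append((v1 - v0) / (t1 - t0))
    return mids, slopes


def summarize(times, values):
    """Summary statistics of one time series."""
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (an assumption)
        "max": max(values),
        "min": min(values),
        "last": values[-1],
        # Endpoint slope: total change over elapsed time (an assumption).
        "slope": (values[-1] - values[0]) / (times[-1] - times[0]),
    }


# Hypothetical measurements, not taken from the slide's figure.
months = [0, 1, 2, 3]
scores = [40, 39, 39, 37]

raw_features = summarize(months, scores)
slope_times, slopes = pairwise_slopes(months, scores)
derivative_features = summarize(slope_times, slopes)
print(raw_features)
print(derivative_features)
```

Applying the same `summarize` function to both series is what lets the number of features grow quickly: every measured variable contributes one set of statistics for its values and another for its slopes.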
Featurizing Time Series Data
§ 435 temporal features extracted
§ Problem: Missing data
  • Average patient missing 10% of features
  • One patient missing 55% of features!
  • Missing values imputed using median heuristic
§ Problem: Outliers
  • Nonsense values: Number of liters recorded as MDMD
  • Units incorrectly recorded ⇒ Wrong conversions
  • Extreme values
    § Treated as missing if > 4 standard deviations from mean
Open Question: Regression robust to (sparse) covariate outliers?
Room for improvement
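
The two cleaning heuristics can be combined in a short sketch: first mark values more than 4 standard deviations from the feature mean as missing, then fill missing entries with the feature median. Details such as computing the mean and SD over all observed values of a feature, and the example column itself, are assumptions; the talk does not spell out the exact procedure.

```python
import statistics


def mask_outliers(column, max_sd=4.0):
    """Set values more than `max_sd` SDs from the mean to None (missing)."""
    observed = [v for v in column if v is not None]
    if len(observed) < 2:
        return list(column)
    mean = statistics.mean(observed)
    sd = statistics.stdev(observed)
    if sd == 0:
        return list(column)
    return [
        None if v is not None and abs(v - mean) > max_sd * sd else v
        for v in column
    ]


def impute_median(column):
    """Replace missing entries with the median of the observed entries."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = statistics.median(observed)
    return [fill if v is None else v for v in column]


# Hypothetical feature column: 29 plausible values, one mis-recorded
# value (300.0), and one missing entry.
column = [3.0 + 0.1 * (i % 5) for i in range(29)] + [300.0, None]
cleaned = impute_median(mask_outliers(column))
print(cleaned[-2:])
```

One weakness is visible in the sketch itself: the outlier inflates the very mean and SD used to detect it, so a single extreme value is only caught when the feature has enough other observations. That is consistent with the slide's "room for improvement" remark.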
Modeling and Inference
§ Regression model
  Future ALSFRS Slope = f(features) + noise, where f is the unknown regression function
§ Goal: infer f from data
  • Bayesian: Place a prior on f, infer its posterior
  • Bonus: Uncertainty estimates for each prediction
§ What prior?
  • Flexible and nonparametric
    § Avoid restrictive assumptions about functional form
  • Favor simple, sparse models
    § Avoid overfitting to irrelevant features
Bayesian Additive Regression Trees*
§ f(features) = sum of “simple” decision trees
  [Figure: a tree splitting on "Days since onset > 705" (leaf values -0.5 and -0.83) + a tree splitting on "Past ALSFRS slope > -0.6" (leaf values 0.06 and -0.08) + …]
  • Simplicity = tree depends on few features
    § Irrelevant features seldom selected
  • Similar to frequentist ensemble methods
    § Boosted decision trees, random forests
*Chipman, George, and McCulloch (2010)
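
A sum-of-trees prediction adds up one leaf value per tree. The sketch below hard-codes the two trees shown on the slide; which leaf value belongs to which branch cannot be recovered from the transcript, so the assignment here is an assumption, as are the dictionary keys.

```python
def tree_days_since_onset(x):
    # Assumption: the first leaf value applies when the test is true.
    return -0.5 if x["days_since_onset"] > 705 else -0.83


def tree_past_slope(x):
    # Same assumption about branch order as above.
    return 0.06 if x["past_alsfrs_slope"] > -0.6 else -0.08


# The slide's sum continues with further trees; only two are shown.
TREES = [tree_days_since_onset, tree_past_slope]


def predict(x, trees=TREES):
    """Sum-of-trees prediction: each tree contributes one leaf value."""
    return sum(tree(x) for tree in trees)


# Hypothetical patient features.
patient = {"days_since_onset": 800, "past_alsfrs_slope": -1.0}
print(predict(patient))
```

Each tree looks at only one feature here, which illustrates the "simple" in the slide: a feature that no tree splits on has no effect on the prediction.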
BART Inference
§ Estimating f: Markov Chain Monte Carlo
  • R package ‘bart’ available on CRAN
  • 10,000 posterior samples: $\hat f_1, \hat f_2, \hat f_3, \hat f_4, \ldots$
  • Each sample $\hat f_i$ is a sum of 100 trees
  • 10 minutes on MacBook Pro (2.5 GHz CPU, 4GB RAM)
§ Prediction: Posterior mean
  • Average of $\hat f_1(\text{features}), \hat f_2(\text{features}), \hat f_3(\text{features}), \ldots$
§ Variance reduction
  • Average predictions of 10 BART models
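
The two levels of averaging can be written out explicitly. The notation is not from the slide: $x$ is a patient's feature vector, $T$ the number of trees per sample, $S$ the number of posterior samples per model, $K$ the number of separately fitted BART models, and $g^{(k)}_{i,t}$ the $t$-th tree of sample $i$ in model $k$. That every one of the $K$ models uses the same $S$ is an assumption.

```latex
\hat{f}^{(k)}_{i}(x) = \sum_{t=1}^{T} g^{(k)}_{i,t}(x),
\qquad
\hat{y}(x) = \frac{1}{K} \sum_{k=1}^{K} \frac{1}{S} \sum_{i=1}^{S} \hat{f}^{(k)}_{i}(x),
\qquad
T = 100,\quad S = 10{,}000,\quad K = 10 .
```

The inner average is the posterior-mean prediction of a single model; the outer average over models reduces the Monte Carlo variance that remains between independent MCMC runs.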
Accuracy of BART Inference
[Figure: validation RMSD (vertical axis, 0.510-0.540) versus number of BART samples (horizontal axis, 0-10000)]

| Number of BART samples | Validation RMSD |
| --- | --- |
| 1 | 0.5459 |
| 100 | 0.5234 |
| 2000 | 0.5144 |
| 10000 | 0.5109 |
BART Feature Selection
§ Many pairwise slope features
§ Lab data excluded
[Figure: "Top Ten Features Ordered by BART Usage"; horizontal axis: average usage (0.0-3.5). Features shown: Mean ALSFRS, Min Turning Score, Last ALSFRS, Last Weight Slope, Last FVC Slope, Mean Weight Slope, Last Systolic Blood Pressure Slope, ALSFRS Slope, Max Dressing Score, Onset Delta]
[Figure: "All 484 Features Ordered by Usage"; horizontal axis: average usage (0.0-3.5)]
BART on Feature Subsets
[Figure: "Effect of Adding Each Feature in Order of BART Usage"; validation RMSD (vertical axis, 0.515-0.530) versus features added in order of usage (horizontal axis, 5-25). Labeled features: Onset.Delta, max.dressing, alsfrs.score.slope, last.slope.bp.systolic, mean.slope.weight, last.slope.fvc.liters, last.alsfrs.score, last.speech, last.handwriting, meansquares.speech]

| Features used | Validation RMSD |
| --- | --- |
| 1 | 0.5291 |
| 3 | 0.5246 |
| 6 | 0.5190 |
| 14 | 0.5157 |
| 21 | 0.5113 |
Model Comparison
How do other models perform using our feature set?

| Model | Our RMSD (Test) | Our RMSD (Validation) | Competitor RMSD |
| --- | --- | --- | --- |
| Lasso Regression | 0.5006 | 0.5287 | - |
| Random Forests | 0.5052 | 0.5120 | 0.52-0.53 |
| Boosted Trees | 0.4940 | 0.5118 | - |
| BART | 0.4860 | 0.5109 | - |

§ Additive decision tree models especially effective
§ Featurization was a main differentiator of competitors
Contest Evaluation
[Figure: labels "Patient" (repeated) and "Baseline performance"]
RMSD: Slow vs. Fast Progressors

| Progression | Solver 1 | Solver 2 | Solver 3 | Solver 4 | Solver 5 | Solver 6 | Solver 7 | Solver 8 | Solver 9 | Solver 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| slow | 0.29 | 0.30 | 0.30 | 0.31 | 0.30 | 0.34 | 0.26 | 0.36 | 0.61 | 1.43 |
| med | 0.43 | 0.43 | 0.40 | 0.42 | 0.44 | 0.38 | 0.46 | 0.47 | 0.92 | 1.04 |
| fast | 0.78 | 0.79 | 0.84 | 0.83 | 0.82 | 0.88 | 0.91 | 0.88 | 1.04 | 1.67 |
| all | 0.51 | 0.52 | 0.52 | 0.53 | 0.53 | 0.53 | 0.57 | 0.57 | 0.89 | 1.30 |

Different solvers predict slow or fast progressors more reliably. Larger (absolute) errors in case of steep slopes.
Similarity among Predictions
Predictions more correlated to each other than to real slopes: room for improvement?
[Figure: "Slopes vs. Predictions"; predicted slope versus true slope]
[Figure: "Predictions first vs second"; predicted slope versus predicted slope]
Similarity among Predictions
[Figure: dendrogram of the predictions of solvers 1-10 together with Aggregate, Gold standard, and Baseline: SVR. Method labels shown: Multivar. regression, Linear regression, Prediction of mean, Linear regression, BART, Nonparam. regression, Random forest. Short branch = similar predictions]
Algorithms vs. Clinicians
Based on 14 patients.
[Figure: two panels, each with bars labeled 1 and 2 and an arrow marked "Better": Pearson correlation (axis 0.2-0.8) and RMSD (axis 0.2-0.6)]
Robustness of Ranking
[Figure: two panels, "Pearson Correlation" and "RMSD"; vertical axis 0-100, horizontal axis labeled 1, 2, 3]
The Future
The Future: New ALS Predictors?
Four solvers identify uric acid as predictive of progression
§ Reported once in the literature but not routinely used
New predictors supported by three or more solvers
§ Pulse
§ Blood pressure
§ Creatinine
§ Basophils
§ Monocytes
§ Creatine kinase
⇒ New lines of inquiry for ALS
Open Question: Better biomarkers based on predictive features?
The Future: Clinical Adoption?
§ Grand Challenge: Introduce algorithms to clinicians, trial managers, and pharmaceutical companies
  • More accurate prognoses for ALS patients
  • Less expensive, more interpretable clinical trials
  • New incentives for ALS drug development
The End
Questions?
Distribution of ALSFRS Slopes
[Figure: histogram of frequency versus slope (horizontal axis, -3 to 1), with regions labeled Fast, Gray area, and Slow]
Onset Delta vs. Target
[Figure: "Onset.Delta versus ALSFRS Slope on Train and Test Data"; horizontal axis: Onset.Delta (-2000 to 0); vertical axis: future ALSFRS slope (-3 to 1)]
Max Dressing Score vs. Target
[Figure: "max.dressing versus ALSFRS Slope on Train and Test Data"; horizontal axis: max.dressing (0 to 4); vertical axis: future ALSFRS slope (-3 to 1)]
Past ALSFRS Slope vs. Target
[Figure: "alsfrs.score.slope versus ALSFRS Slope on Train and Test Data"; horizontal axis: alsfrs.score.slope (-10 to 4); vertical axis: future ALSFRS slope (-3 to 1)]