Alind Gupta Real-World and Advanced Analytics Webinars/Oncology...• Informing future trial design,...

Shaping the Future of Drug Development Transparent machine learning Alind Gupta Real-World and Advanced Analytics

Page 1:

Shaping the Future of Drug Development

Transparent machine learning

Alind Gupta

Real-World and Advanced Analytics

Page 2:

Machine learning

4/28/2020 Cytel Inc. 2

• Algorithms that learn from data + evaluation criteria

• Key challenge – identify the (unobservable) data-generating distribution from a sample

[Figure: linear regression and neural network fits to the same (X, Y) data. Which model better approximates the data distribution P(X,Y)?]
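
The question on this slide can be illustrated by scoring two models on held-out data. The following is a minimal sketch, not the speaker's method: the data-generating process (a sine curve with noise), the sample sizes, and the use of a degree-5 polynomial as a stand-in for a higher-capacity model such as a neural network are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data-generating process: Y = sin(X) + Gaussian noise.
def sample(n):
    x = rng.uniform(-3.0, 3.0, size=n)
    y = np.sin(x) + rng.normal(0.0, 0.1, size=n)
    return x, y

x_train, y_train = sample(200)
x_test, y_test = sample(200)

def heldout_mse(degree):
    """Fit a polynomial of the given degree on the training sample and
    return its mean squared error on the held-out sample."""
    coefs = np.polyfit(x_train, y_train, deg=degree)
    pred = np.polyval(coefs, x_test)
    return float(np.mean((pred - y_test) ** 2))

mse_linear = heldout_mse(1)    # linear regression
mse_flexible = heldout_mse(5)  # higher-capacity stand-in for a neural network
print(f"linear: {mse_linear:.3f}  degree-5 polynomial: {mse_flexible:.3f}")
```

Because the true distribution is never observed, the held-out error is only an estimate of how well each model approximates it; the evaluation criterion has to be chosen along with the algorithm.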

Research areas

• Prediction

• Knowledge discovery

• Anomaly detection

• Summarization

• Optimal decision-making

Page 3:

Potential applications in clinical research

Shah, Pratik, et al. "Artificial intelligence and machine learning in clinical development: a translational perspective." NPJ digital medicine 2.1 (2019): 1-5.

Page 4:

Black-box machine learning

• High capacity, low interpretability models (e.g. deep neural networks)

• Problems:

• Biases and limitations?

• Inability to audit decision-making

• Difficult to troubleshoot

• May not engender trust in users, regulators

[Figure: Input → Black box model (?) → Prediction]

Page 5:

Transparency is important

EU GDPR

• Individual’s “Right to explanation” about automated decisions

FDA guidance for Good Machine Learning Practices (GMLP)

• “[A]ppropriate validation, transparency” to assure “safety and effectiveness”

• Focus on validation with “clinicians in the loop” where necessary

https://www.fda.gov/files/medical%20devices/published/US-FDA-Artificial-Intelligence-and-Machine-Learning-Discussion-Paper.pdf

Page 6:

Limitations of black boxes

Adversarial attacks

• What patterns are black box models really representing?

IBM Watson for Oncology

• “Overpromising and under-delivery”

COMPAS

• No more accurate at predicting recidivism than untrained people recruited online

[Figure: adversarial example. Panda (57%) + added perturbation = Gibbon (99%).]
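
The adversarial-attack point can be sketched on a toy model. This is an illustration under stated assumptions, not the image classifier from the slide: the linear classifier, its weights `w`, the input `x`, and the step size `eps` are all hypothetical. The perturbation follows the fast-gradient-sign idea of moving every feature by a small fixed amount in the direction that increases the loss.

```python
import numpy as np

# Hypothetical linear classifier: P(class 1 | x) = sigmoid(w . x).
w = np.tile([1.0, -1.0], 50)   # 100 weights, alternating +1 / -1

def predict_proba(x):
    return 1.0 / (1.0 + np.exp(-(w @ x)))

# Hypothetical input: every feature is close to 1.0.
x = 1.0 + 0.01 * w
p_clean = predict_proba(x)      # logit = +1.0, so class 1 is predicted

# For true class 1, the gradient of the log-loss with respect to x points
# along -w, so the sign-gradient step is -eps * sign(w).
eps = 0.02                      # 2% of the typical feature magnitude
x_adv = x - eps * np.sign(w)
p_adv = predict_proba(x_adv)    # logit = -1.0, so the prediction flips

print(f"clean: {p_clean:.2f}  adversarial: {p_adv:.2f}")
print(f"largest change to any feature: {np.max(np.abs(x_adv - x)):.2f}")
```

Many tiny, coordinated changes add up across features, which is why a perturbation too small to notice can still change the output of a high-dimensional model.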

Strickland, E. (2019). IBM Watson, heal thyself: How IBM overpromised and underdelivered on AI health care. IEEE Spectrum, 56(4), 24-31.

Dressel, J., & Farid, H. (2018). The accuracy, fairness, and limits of predicting recidivism. Science advances, 4(1), eaao5580.

Buolamwini, J., & Gebru, T. (2018, January). Gender shades: Intersectional accuracy disparities in commercial gender classification. In Conference on fairness, accountability and transparency (pp. 77-91).

Papernot, N., McDaniel, P., Goodfellow, I., Jha, S., Celik, Z. B., & Swami, A. (2017, April). Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM on Asia conference on computer and communications security (pp. 506-519).

Page 7:

Bayesian networks

A transparent and flexible machine learning method

Page 8:

Key idea

Performing computations on a Directed Acyclic Graph (DAG)

[Figure: data and a subject-matter expert inform the DAG structure, followed by DAG parametrization (maximum likelihood, Bayesian estimation).]
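
Once a DAG structure is fixed, parametrization amounts to estimating one conditional probability table per node. The sketch below assumes a hypothetical three-node DAG (treatment → response → survival) and made-up binary data; the function `cpt` and the choice of a symmetric Dirichlet prior for the Bayesian estimate are illustrative assumptions rather than details from the webinar.

```python
import numpy as np

# Hypothetical binary data for the DAG: treatment -> response -> survival.
treatment = np.array([0, 0, 0, 0, 1, 1, 1, 1])
response = np.array([0, 0, 0, 1, 0, 1, 1, 1])
survival = np.array([0, 0, 1, 1, 0, 1, 1, 0])

def cpt(child, parent, alpha=0.0):
    """Estimate P(child | parent) for binary variables.
    alpha = 0 gives the maximum-likelihood estimate; alpha > 0 gives the
    posterior mean under a symmetric Dirichlet(alpha) prior."""
    counts = np.zeros((2, 2))
    for p, c in zip(parent, child):
        counts[p, c] += 1
    counts += alpha
    return counts / counts.sum(axis=1, keepdims=True)

p_treatment = np.bincount(treatment, minlength=2) / len(treatment)
p_response_mle = cpt(response, treatment)
p_response_bayes = cpt(response, treatment, alpha=1.0)
p_survival_mle = cpt(survival, response)

def joint(t, r, s):
    """Joint probability from the DAG factorization
    P(t, r, s) = P(t) * P(r | t) * P(s | r)."""
    return p_treatment[t] * p_response_mle[t, r] * p_survival_mle[r, s]

print("MLE   P(response=1 | treatment=1):", p_response_mle[1, 1])
print("Bayes P(response=1 | treatment=1):", round(p_response_bayes[1, 1], 3))
print("P(treatment=1, response=1, survival=1):", joint(1, 1, 1))
```

The Bayesian estimate is pulled toward 0.5 relative to the maximum-likelihood estimate, which matters most when a parent configuration has few observations.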

Page 9:

Applications of graphical models

• Risk prediction

• Causal inference

• Bayesian inference

• Computer vision

• NLP

• Gene networks

Page 10:

Use case: Immunotherapy for advanced cancer

Challenges

• High individual-level heterogeneity in response to treatment

• Subsets show durable response, severe adverse events

• Short follow-up

• Multiple outcomes of interest

Uses for machine learning

• Identifying predictors of response

• Long-term predictions for health economic evaluations for HTA (better than curve-fitting?)

• Informing future trial design, surrogate endpoints

• Patient simulation with time-varying interventions

Page 11:

Multivariate prediction model

• Based on IPD from RCT

• 3 outcomes over 3 years

DAG learning

• MLE + bootstrapping + model averaging

• Constrained edge orientation based on causal tiers

• Comparison to known/expected relationships
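
The DAG-learning recipe above (bootstrapping, model averaging, tier-constrained edge orientation) can be sketched roughly as follows. This is a simplified stand-in, not the structure-learning algorithm used in the webinar: it assumes continuous linear-Gaussian variables with one variable per tier, selects parents by least squares (the maximum-likelihood fit in that setting) with an arbitrary coefficient threshold, and ignores edges within a tier. The function names, the threshold, and the simulated data are all hypothetical.

```python
import numpy as np

def learn_edges(data, tiers, threshold=0.3):
    """Stand-in structure learner: regress each variable on all variables in
    strictly earlier tiers and keep parents with a large coefficient.
    Edges can therefore only point from earlier to later tiers."""
    centered = data - data.mean(axis=0)
    n_vars = data.shape[1]
    edges = set()
    for j in range(n_vars):
        candidates = [i for i in range(n_vars) if tiers[i] < tiers[j]]
        if not candidates:
            continue
        coefs, *_ = np.linalg.lstsq(centered[:, candidates], centered[:, j], rcond=None)
        edges.update((i, j) for i, b in zip(candidates, coefs) if abs(b) > threshold)
    return edges

def bootstrap_edge_strength(data, tiers, n_boot=200, seed=0):
    """Relearn the structure on bootstrap resamples and return the fraction
    of resamples in which each edge appears."""
    rng = np.random.default_rng(seed)
    n_rows = data.shape[0]
    counts = {}
    for _ in range(n_boot):
        resample = data[rng.integers(0, n_rows, size=n_rows)]
        for edge in learn_edges(resample, tiers):
            counts[edge] = counts.get(edge, 0) + 1
    return {edge: c / n_boot for edge, c in counts.items()}

# Hypothetical data with true structure baseline -> response -> outcome.
rng = np.random.default_rng(1)
baseline = rng.normal(size=500)
response = 0.8 * baseline + rng.normal(size=500)
outcome = 0.8 * response + rng.normal(size=500)
data = np.column_stack([baseline, response, outcome])
tiers = [0, 1, 2]  # causal tiers: baseline, then response, then outcome

strength = bootstrap_edge_strength(data, tiers)
averaged_dag = {edge for edge, s in strength.items() if s >= 0.5}
print(sorted(averaged_dag))
```

Keeping only edges that recur across most resamples is one simple form of model averaging; the resulting edge set is what would then be compared against known or expected relationships.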

[Figure: DAG for the multivariate prediction model. Recoverable labels: Risk score, X.]

Page 12:

Classification performance

[Figure: classification performance for Outcome 1, with results for other outcomes.]

Page 13:

External validation using RWD

• To assess generalizability and limitations

• Problem – what to do about missing covariates?

• HRQoL is highly prognostic in RCT but not present in RWD
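
One way a Bayesian network can still predict when a covariate is absent is to marginalize over it using the network's own conditional distributions. The sketch below shows that idea only; whether the webinar's analysis handled the missing HRQoL variable this way is not stated. The variables `stage` and `hrqol`, the function `predict_event`, and every probability in the tables are hypothetical.

```python
import numpy as np

# Hypothetical conditional probability tables for binary variables.
# Rows index stage (0 or 1); columns index HRQoL (0 = low, 1 = high).
p_hrqol_given_stage = np.array([[0.3, 0.7],
                                [0.6, 0.4]])
p_event_given_stage_hrqol = np.array([[0.40, 0.10],
                                      [0.70, 0.30]])

def predict_event(stage, hrqol=None):
    """Return P(event | stage, hrqol). If hrqol is not observed, sum it out:
    P(event | stage) = sum_h P(h | stage) * P(event | stage, h)."""
    if hrqol is not None:
        return float(p_event_given_stage_hrqol[stage, hrqol])
    return float(p_hrqol_given_stage[stage] @ p_event_given_stage_hrqol[stage])

print("HRQoL observed (trial-like record):   ", predict_event(stage=1, hrqol=0))
print("HRQoL missing (real-world-like record):", predict_event(stage=1))
```

The prediction without HRQoL is less sharp than the one with it, which is one reason performance on the common subset of variables can differ from performance with all variables.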

[Figure: external validation results for all variables, the common subset of variables, and real-world data. Annotation: good real-world generalizability.]

Page 14:

External validation by key subgroups

[Figure: external validation results for Subgroup A and Subgroup B. Annotations: good real-world generalizability; moderate real-world generalizability.]

Page 15:

Prognostic variables by treatment group

[Figure: prognostic value (y-axis) of variables ordered by increasing prognostic value (x-axis), with panels for Treatment A and Treatment B. Annotation: differentially prognostic variable.]

Page 16:

Communication

Page 17:

Dynamic Bayesian networks

Extending Bayesian networks for time modelling

Page 18:

Predicting trends in time

Challenges

• Extrapolation in time

• Time-varying covariates

• How prognostic are changes in variables?

• Time-varying interventions

[Figure: dynamic Bayesian network built from an initial distribution and time replication.]
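
The two ingredients named on this slide, an initial distribution and a transition model replicated over time, can be sketched in the simplest case of a single state variable, where a dynamic Bayesian network reduces to a Markov chain. The three states, the monthly transition probabilities, and the 36-month horizon below are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical states: 0 = stable, 1 = progressed, 2 = dead (absorbing).
initial = np.array([0.8, 0.2, 0.0])          # initial distribution at month 0
transition = np.array([[0.90, 0.08, 0.02],   # P(state at T+1 | state at T)
                       [0.00, 0.85, 0.15],
                       [0.00, 0.00, 1.00]])

def survival_curve(initial, transition, n_months):
    """Replicate the same transition model month by month and return
    P(alive) at months 0..n_months."""
    state = initial.copy()
    curve = [1.0 - state[2]]
    for _ in range(n_months):
        state = state @ transition   # one time slice forward
        curve.append(1.0 - state[2])
    return np.array(curve)

curve = survival_curve(initial, transition, n_months=36)
print("P(alive) at months 0, 12, 24, 36:", np.round(curve[[0, 12, 24, 36]], 3))
```

Because the same transition model is reused at every step, the curve can be extended beyond the observed follow-up, which is what makes extrapolation in time possible and also why its assumptions need scrutiny.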

Page 19:

Survival curves

[Figure: survival curves over time.]

Page 20:

Prediction performance from baseline

[Figure: prediction performance from baseline for Treatment group A and Treatment group B. Annotation: plateau?]

Page 21:

Prognostic value of changes

[Figure: biomarker levels (high, intermediate, low) at month T and month T+1, linked to survival or death. Transitions shown: Biomarker A high → high, Biomarker A med → med, Biomarker B low → med, Biomarker A high → med, Biomarker B high → low, Biomarker A med → low.]
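
A rough way to ask how prognostic a change is would be to tabulate the outcome by the transition between two consecutive months. The sketch below does this with simple counts; the records are invented purely to make the code run and say nothing about the webinar's findings, and the function name `death_rate_by_transition` is hypothetical.

```python
from collections import defaultdict

# Invented records: (level at month T, level at month T+1, died: 1 or 0).
records = [
    ("high", "high", 0), ("high", "high", 0), ("high", "high", 1),
    ("high", "med", 1), ("high", "med", 0),
    ("med", "med", 0),
    ("med", "low", 1), ("med", "low", 1),
]

def death_rate_by_transition(records):
    """Return the observed proportion of deaths for each (from, to) transition."""
    totals = defaultdict(int)
    deaths = defaultdict(int)
    for level_t, level_next, died in records:
        totals[(level_t, level_next)] += 1
        deaths[(level_t, level_next)] += died
    return {key: deaths[key] / totals[key] for key in totals}

rates = death_rate_by_transition(records)
for (level_t, level_next), rate in sorted(rates.items()):
    print(f"{level_t} -> {level_next}: {rate:.2f}")
```

In a dynamic Bayesian network the same quantity would come from the conditional distribution of the outcome given the variable at both time slices, rather than from raw counts.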

Page 22:

Future directions

• Relaxing the Markov assumption

• Latent variable models

• Adding outcomes

• PFS, ORR, TFS

Page 23:

Conclusion

• Bayesian networks are transparent, interpretable models

• Useful for multivariate prediction

• Useful for missing data problems + small data

• Useful as time models for dynamic processes

Page 24:

Global Products and Services

Statistical Software

• Industry standard for trial design, including adaptive (e.g. East, StatXact)

• Operations software (e.g. ACES, EnForeSys, FlexRandomizer)

• All 25 top biopharma companies, the FDA, EMA & PMDA use our software

Strategic Consulting

• PhD statisticians expert in innovative design & complex statistical questions

• Experts in Data Science, PK/PD, Enrolment & Event Forecasting, Portfolio/Program Optimization (NPV)

Project-Based Services

• Reliable Biometrics service provider delivering high quality, on time

• Lead staff with over 15 years industry experience on average

• Including biostatistics & programming, ISC, data management, PK/PD analysis, medical writing

Functional Services Provision (FSP)

• Creation of dedicated teams operating within/as an extension of the client’s own biostatistics & programming, data management and PK/PD teams

• Leader in offshoring of Biometrics competencies

NEW BOOK: Introduction to Adaptive Design & Master Protocols (coming 2021)

Page 25:

End-to-End Biometric Solutions for All Phases Development

Stage of Development: Protocol Design, Study Conduct, Reporting & Submission

Cytel’s Statistical and Adaptive Trial Software

Cytel’s Clinical Research Services

• Study / Adaptive Design

• Exploratory & Predictive Analyses

• Simulation & Modeling

• Feasibility & Patient Recruitment Modeling

• Regulatory Support & Representation

• eCRF Development

• Data Management

• Biostatistics

• Statistical Programming

• Data Monitoring

• Final Study Reporting

• Strategic Program Planning

• CDISC migration

• Pharmacometrics & Pharmacology (QPP)

• Real World Analytics

• Interim Analyses

• Randomization

• Data Monitoring Committee Support

• Integrated Summaries of Safety & Efficacy

• eCTD Reporting for Submission

• Health Economics and Outcomes Research (HEOR)

All of Cytel’s services are offered across all four phases of drug development and across a multitude of therapeutic areas
