Machine Learning for Drug Development

Machine Learning for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021. Marinka Zitnik, Department of Biomedical Informatics; Broad Institute of Harvard and MIT; Harvard Data Science Initiative. https://zitniklab.hms.harvard.edu

Transcript of Machine Learning for Drug Development

Page 1: Machine Learning for Drug Development

Machine Learning for Drug Development

Tutorial at IJCAI, Jan 6, 2021 - https://zitniklab.hms.harvard.edu/drugml

Marinka Zitnik
Department of Biomedical Informatics
Broad Institute of Harvard and MIT
Harvard Data Science Initiative

https://zitniklab.hms.harvard.edu

Page 2: Machine Learning for Drug Development

Outline

Overview and introduction

Part 1: Virtual drug screening and drug repurposing

Part 2: Adverse drug effects, drug-drug interactions

Part 3: Clinical trial site identification, patient recruitment

Part 4: Molecule optimization, molecular graph generation, multimodal graph-to-graph translation

Part 5: Molecular property prediction and transformers

Demos, resources, wrap-up & future directions

Page 3: Machine Learning for Drug Development

Datasets to facilitate algorithmic innovation

Page 4: Machine Learning for Drug Development

Therapeutics is one of the most exciting areas for computational scientists. However:

Retrieving, curating, and processing datasets is time-consuming and requires extensive domain expertise

Datasets are scattered across biological repositories, and there is no centralized data repository covering a variety of therapeutics

Many tasks are under-explored in the AI/ML community because of the lack of data access

Page 5: Machine Learning for Drug Development

● Open-Source ML Datasets for Therapeutics:
  ○ Wide range of tasks: target discovery, activity screening, efficacy, safety, manufacturing
  ○ Wide range of products: small molecules, antibodies, vaccines, miRNA
● Numerous Data Functions:
  ○ Extensive data functions
  ○ Model evaluation, data processing and splits, molecule generation oracles, and much more
● 3 Lines of Code:
  ○ Minimum package dependency, lightweight loaders
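
The "3 lines of code" point can be illustrated with a loader call. The sketch below is not taken from the slides: it assumes the publicly released PyTDC package (`pip install PyTDC`) for TDC (Therapeutics Data Commons), its `tdc.single_pred.ADME` loader, and the `Caco2_Wang` dataset name. The three lines are wrapped in a hypothetical helper, `load_caco2_split`, so that nothing is downloaded until the function is called.

```python
def load_caco2_split():
    """Return the default train/valid/test split of one ADME dataset.

    Assumption: PyTDC is installed and the machine has network access,
    because the first call downloads the dataset.
    """
    from tdc.single_pred import ADME   # line 1: choose a task loader
    data = ADME(name="Caco2_Wang")     # line 2: choose a dataset by name
    return data.get_split()            # line 3: dict with 'train', 'valid', 'test'
```

Calling `load_caco2_split()` should return a dictionary of three data frames; other tasks are expected to follow the same loader pattern, with a different loader class and dataset name.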

Page 6: Machine Learning for Drug Development

Domain scientists

ML scientists

Identify meaningful learning tasks

Design powerful ML models

Advancing algorithms for key therapeutics problems

Our Vision for TDC

Page 7: Machine Learning for Drug Development

Modular Structure of TDC

Single-instance

Multi-instance

Generation

Page 8: Machine Learning for Drug Development

Learning tasks: DrugRes, ADME, Tox, HTS, QM, Yields, Paratope, Epitope, Develop, DTI, DDI, PPI, GDA, DrugSyn, PeptideMHC, AntibodyAff, MTI, Catalyst, Reaction, MolGen, PairMolGen, RetroSyn

67 datasets spread over 22 learning tasks

Page 9: Machine Learning for Drug Development

Model performance evaluators

Data processing helpers

A variety of data splits

Data Functions to Support Your Research
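
As a rough illustration of what a data split and a performance evaluator do, here is a minimal sketch in plain Python. It is not TDC's implementation: `cold_split`, `mean_absolute_error`, and the toy records are hypothetical names and made-up values. The split keeps every drug in exactly one partition, so test-set drugs are never seen during training, and the evaluator scores a trivial mean-value baseline.

```python
import random
from collections import defaultdict


def cold_split(records, key, frac=(0.7, 0.1, 0.2), seed=0):
    """Split records so each distinct value of `key` lands in exactly one part."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec)
    entities = sorted(groups)
    random.Random(seed).shuffle(entities)      # reproducible shuffle of entities
    n_train = round(frac[0] * len(entities))
    n_valid = round(frac[1] * len(entities))
    parts = {
        "train": entities[:n_train],
        "valid": entities[n_train:n_train + n_valid],
        "test": entities[n_train + n_valid:],
    }
    return {name: [r for e in ents for r in groups[e]]
            for name, ents in parts.items()}


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between labels and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical drug-target records; the labels are made up for illustration.
records = [
    {"drug": f"D{i % 10}", "target": f"T{i % 4}", "y": float(i % 7)}
    for i in range(40)
]
split = cold_split(records, key="drug")

# Baseline: predict the training-set mean for every test record.
train_mean = sum(r["y"] for r in split["train"]) / len(split["train"])
test_labels = [r["y"] for r in split["test"]]
mae = mean_absolute_error(test_labels, [train_mean] * len(test_labels))
```

Grouping by entity before shuffling is the design choice that matters here: a purely random split over records would let the same drug appear in both training and test data.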

Page 10: Machine Learning for Drug Development

[Figure: molecule generation oracles. Labels: GuacaMol, MOSES, Literature; Molecule Generation, Generated Molecules, Oracle Score, Optimize]

GuacaMol: Benchmarking Models for de Novo Molecular Design, J. Chem. Inf. Model., 2019
MOSES: A Benchmarking Platform for Molecular Generation Models, Frontiers in Pharmacology, 2020

Molecule Generation Oracles
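
The generate, score, optimize loop suggested by this slide can be sketched with a toy example. Everything below is hypothetical: `toy_oracle` scores a string by its fraction of `N` characters and has no chemical meaning, the three-letter alphabet is not a real molecular vocabulary, and the hill-climbing search is only one simple way to use oracle scores. A real oracle would return a property estimate for a generated molecule.

```python
import random

ALPHABET = "CNO"  # toy token set, not a real chemical vocabulary


def toy_oracle(candidate):
    """Hypothetical oracle: fraction of 'N' tokens in the candidate string."""
    return candidate.count("N") / len(candidate)


def mutate(candidate, rng):
    """Generate a new candidate by replacing one randomly chosen position."""
    pos = rng.randrange(len(candidate))
    return candidate[:pos] + rng.choice(ALPHABET) + candidate[pos + 1:]


def optimize(oracle, start, steps=200, seed=0):
    """Hill climbing: keep a proposal only if the oracle scores it at least as high."""
    rng = random.Random(seed)
    best, best_score = start, oracle(start)
    history = [best_score]
    for _ in range(steps):
        proposal = mutate(best, rng)
        score = oracle(proposal)
        if score >= best_score:
            best, best_score = proposal, score
        history.append(best_score)
    return best, best_score, history


best, best_score, history = optimize(toy_oracle, start="CCCCCCCC")
```

The optimizer only ever sees the oracle's score, never its internals, which is why the same loop could be reused with a different scoring function.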

Page 11: Machine Learning for Drug Development

Leaderboards: Submit your Models

zitniklab.hms.harvard.edu/TDC

Page 12: Machine Learning for Drug Development

You Are Invited to Join TDC! TDC is an Open-Source, Community Effort

github.com/mims-harvard/TDC

zitniklab.hms.harvard.edu/TDC

Page 13: Machine Learning for Drug Development

Demos, tools, and implementations

Page 14: Machine Learning for Drug Development

DeepPurpose: Deep Learning Library for Compound and Protein Modeling

DTI, Drug Property, PPI, DDI, Protein Function Prediction

DeepPurpose: a Deep Learning Library for Drug-Target Interaction Prediction, Bioinformatics 2020

https://github.com/kexinhuang12345/DeepPurpose

Page 15: Machine Learning for Drug Development

How can domain scientists interact with AI systems?

DeepPurpose: a Deep Learning Library for Drug-Target Interaction Prediction, Bioinformatics 2020
MolDesigner: Interactive Design of Efficacious Drugs with Deep Learning, NeurIPS 2020

Page 16: Machine Learning for Drug Development

MolDesigner: Interactive Design of Drugs with Deep Learning

http://deeppurpose.sunlab.org

Page 17: Machine Learning for Drug Development

DEMO: DRUG-TARGET INTERACTION PREDICTION

Drug: Remdesivir. Remdesivir is indicated for the treatment of adult and pediatric patients aged 12 years and over weighing at least 40 kg for coronavirus disease 2019 (COVID-19) infection requiring hospitalization.

Target protein: Replicase polyprotein 1ab, a multifunctional protein involved in the transcription and replication of viral RNAs.

Inputs shown: molecular structure of Remdesivir; amino acid sequence of Replicase polyprotein 1ab
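
A drug-target interaction model takes a representation of the drug and a representation of the protein sequence as its two inputs. The sketch below shows one simple, assumed way to prepare such inputs: character-level integer encoding with padding. It is not DeepPurpose's encoder. The SMILES string is aspirin and the protein fragment is invented, both used as stand-ins because the demo's actual Remdesivir structure and Replicase polyprotein 1ab sequence are not reproduced in this transcript. The names `build_vocab` and `encode` are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids


def build_vocab(symbols):
    """Map each symbol to a positive integer; 0 is reserved for padding/unknown."""
    return {s: i + 1 for i, s in enumerate(symbols)}


def encode(sequence, vocab, max_len):
    """Truncate to max_len, map characters to ids, and right-pad with zeros."""
    ids = [vocab.get(ch, 0) for ch in sequence[:max_len]]
    return ids + [0] * (max_len - len(ids))


# Stand-in inputs (assumptions, not the demo's real data).
drug_smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"   # aspirin, used only as a placeholder drug
target_sequence = "MKTAYIAKQR"              # invented 10-residue placeholder fragment

protein_vocab = build_vocab(AMINO_ACIDS)
# Character-level SMILES vocabulary; a simplification, since real tokenizers
# treat multi-character atoms such as "Cl" or "Br" as single tokens.
smiles_vocab = build_vocab(sorted(set(drug_smiles)))

drug_ids = encode(drug_smiles, smiles_vocab, max_len=32)
target_ids = encode(target_sequence, protein_vocab, max_len=16)
```

The two fixed-length integer lists are the kind of input that separate drug and protein encoders could consume before their outputs are combined into an interaction score.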

Page 18: Machine Learning for Drug Development

How can domain scientists interact with AI systems?

DeepPurpose: a Deep Learning Library for Drug-Target Interaction Prediction, Bioinformatics 2020
MolDesigner: Interactive Design of Efficacious Drugs with Deep Learning, NeurIPS 2020

Page 19: Machine Learning for Drug Development

Automating Science

Page 20: Machine Learning for Drug Development

Automating Science

Page 21: Machine Learning for Drug Development

How to explain predictions?

Key idea:
§ Summarize where in the data the model “looks” for evidence for its prediction
§ Find a small subgraph most influential for the prediction

Approach to generate explanations for graph neural networks based on counterfactual reasoning

GNNExplainer: Generating Explanations for Graph Neural Networks, NeurIPS 2019

Page 22: Machine Learning for Drug Development

GNNExplainer: Key Idea

§ Input: Given prediction 𝑓(𝑥) for node/link 𝑥
§ Output: Explanation, a small subgraph 𝑀_𝑥 together with a small subset of node features:
  § 𝑀_𝑥 is most influential for prediction 𝑓(𝑥)
§ Approach: Learn 𝑀_𝑥 via counterfactual reasoning
  § Intuition: If removing 𝑣 from the graph strongly decreases the probability of prediction ⇒ 𝑣 is a good counterfactual explanation for the prediction

GNNExplainer: Generating Explanations for Graph Neural Networks, NeurIPS 2019
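
The counterfactual intuition can be tried out on a toy graph. The sketch below is an illustration of that intuition only, not the GNNExplainer algorithm, which learns the explanation subgraph rather than deleting nodes one at a time. The "model" is a hypothetical stand-in for a trained GNN (one round of mean aggregation followed by a sigmoid), and the graph and feature values are made up.

```python
import math


def predict(graph, features, node):
    """Hypothetical stand-in for a trained GNN: mean-aggregate, then sigmoid."""
    neighborhood = [node] + sorted(graph[node])
    mean = sum(features[n] for n in neighborhood) / len(neighborhood)
    return 1.0 / (1.0 + math.exp(-mean))


def remove_node(graph, v):
    """Return a copy of the graph without node v and without its edges."""
    return {n: nbrs - {v} for n, nbrs in graph.items() if n != v}


def counterfactual_importance(graph, features, node):
    """Rank neighbors by how much removing each one lowers the prediction."""
    base = predict(graph, features, node)
    drops = {
        v: base - predict(remove_node(graph, v), features, node)
        for v in graph[node]
    }
    return sorted(drops.items(), key=lambda kv: kv[1], reverse=True)


# Toy star graph around node "x"; feature values are made up.
graph = {"x": {"a", "b", "c"}, "a": {"x"}, "b": {"x"}, "c": {"x"}}
features = {"x": 0.5, "a": 3.0, "b": 0.1, "c": -2.0}

ranking = counterfactual_importance(graph, features, "x")
```

In this toy setup, removing `a` lowers the prediction for `x` the most, so `a` ranks first, while removing `c` raises the prediction and ranks last.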

Page 23: Machine Learning for Drug Development

Examples of Explanations

“Will rosuvastatin treat hyperlipidemia? What is the disease treatment mechanism?”

New Algorithms: GNNExplainer: Generating Explanations for Graph Neural Networks, NeurIPS 2019
New Insights: Discovery of Disease Treatment Mechanisms through the Multiscale Interactome, Nature Communications 2021 (in press)

Page 24: Machine Learning for Drug Development

Open challenges and future directions

Page 25: Machine Learning for Drug Development

Learn about Therapeutics ML!

https://www.drugsymposium.org

Videos from the presentations are now publicly available to everyone through the Symposium Video Channel

Page 26: Machine Learning for Drug Development

Open Challenges

§ Disconnected, uncoupled biomedical knowledge:
  § Challenge: Need to combine data in their broadest sense to close the gap between research and patient data
§ Diverse mechanisms of drug action:
  § Challenge: Need to consider diverse mechanisms through which a drug can treat a disease
§ Novel drugs in development, emerging diseases:
  § Challenge: Need to learn and reason about never-before-seen phenomena
§ Datasets for a variety of therapeutics tasks:
  § Challenge: Need datasets and benchmarks to accelerate ML model development, validation and transition into production and clinical implementation

Page 27: Machine Learning for Drug Development

Outline

Overview and introduction

Part 1: Virtual drug screening and drug repurposing

Part 2: Adverse drug effects, drug-drug interactions

Part 3: Clinical trial site identification, patient recruitment

Part 4: Molecule optimization, molecular graph generation, multimodal graph-to-graph translation

Part 5: Molecular property prediction and transformers

Demos, resources, wrap-up & future directions

Page 28: Machine Learning for Drug Development

https://zitniklab.hms.harvard.edu/drugml