Automatic Compound Design by Matched Molecular Pairs

Willem van Hoorn, Senior Solutions Consultant, Professional Services

Contents

• Matched Molecular Pairs (MMPs)
• Implementation in PP
• Reaction Fingerprints
• Using MMPs as automatic learning machine

Ceci n’est pas une MMP (This is not an MMP)

Sildenafil Vardenafil

Similarity = 0.55 / 0.98 (ECFP_4 / MDL public keys)

MMP:
- Single change
- Typically: 1 or 2 bond cleavage; replace R-group or template

Recent AZ review

http://pubs.acs.org/doi/abs/10.1021/jm200452d

MMP as predictor of activity

Classic QSAR with full molecule descriptors

QSAR using MMP

DpIC50(m-Br to m-Cl-p-F) = -0.19
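As a worked reading of this number: assuming DpIC50 means pIC50(product) minus pIC50(reactant), and pIC50 = -log10(IC50), a DpIC50 of -0.19 corresponds to a potency ratio of about 0.65, i.e. a loss of roughly 1.5-fold. The helper name `fold_change` is illustrative only.

```python
def fold_change(dpic50: float) -> float:
    """Potency ratio implied by a change in pIC50 (assumes pIC50 = -log10(IC50))."""
    return 10.0 ** dpic50


# The m-Br to m-Cl-p-F example from the slide.
ratio = fold_change(-0.19)
print(f"potency ratio: {ratio:.2f}")

# One, two and three log units are the 10-, 100- and 1000-fold marks.
for d in (1.0, 2.0, 3.0):
    print(d, fold_change(d))
```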

Classic QSAR / regression
• More generic, can predict >1 change
• Interpretability varies

MMPs
• Can only predict “one step away from known”
• Very interpretable
• Can answer “what to make next” challenge

What have the MMPs done for us?

“Learning Machine” using MMPs

Example of MMP learning machine

Transformation 1 → 2 applied to compound 3 should yield the more attractive compound 4

MMP in Pipeline Pilot

Components

Protocols

PP 8.5 CU1

PP MMP algorithm based on GSK publication

Test set: EGFR from ChEMBL

Ed Griffen et al., J. Med. Chem. 2011, 54, 7739-50

- ChEMBL version 11

- 4609 IC50 values

- 3581 compounds

Generate MMPs and transformations

>90k MMPs in <1 minute

MMP output

MMP transformation

Full transformation

DpIC50 distribution of transformations

90,343 MMPs yield 180,684 transformations (A→B / B→A)
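The pairing and direction-doubling step can be sketched as follows. This is a minimal illustration, not the Pipeline Pilot component: it assumes each compound has already been split into a core and an R-group (in practice a cheminformatics toolkit would do the bond cleavage), and the records, pIC50 values and function names are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical, pre-fragmented input: (compound id, core, R-group, pIC50).
records = [
    ("cpd1", "core_A", "m-Br", 7.10),
    ("cpd2", "core_A", "m-Cl-p-F", 6.91),
    ("cpd3", "core_A", "H", 6.00),
    ("cpd4", "core_B", "m-Br", 5.50),
]


def find_mmps(records):
    """Index compounds by core; two compounds sharing a core form an MMP."""
    by_core = defaultdict(list)
    for cid, core, rgroup, pic50 in records:
        by_core[core].append((cid, rgroup, pic50))
    pairs = []
    for core, members in by_core.items():
        for a, b in combinations(members, 2):
            if a[1] != b[1]:  # identical R-groups are not a change
                pairs.append((core, a, b))
    return pairs


def to_transformations(pairs):
    """Each MMP gives two directed transformations with opposite-sign DpIC50."""
    out = []
    for _core, (_ida, ra, pa), (_idb, rb, pb) in pairs:
        out.append((ra, rb, round(pb - pa, 2)))  # A -> B
        out.append((rb, ra, round(pa - pb, 2)))  # B -> A
    return out


pairs = find_mmps(records)
transformations = to_transformations(pairs)
print(len(pairs), "MMPs ->", len(transformations), "transformations")
```

In this toy set every pair contributes exactly two transformations, one per direction.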

10-fold, 100-fold, 1000-fold, etc.

bioisosteres

activity cliffs

MMP transformations vs. full reactions

Not specific enough: seen many times in the data set, but with a large stddev(DpIC50)

Too specific: seen once in the data set, so DpIC50 statistics have n=1

Would like to have something that describes “reaction centre + nearby environment”

Would like to increase confidence by looking at similar MMP transformations (with similar DpIC50)
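The two failure modes above can be made concrete with per-transformation statistics. The sketch below groups DpIC50 values by transformation and flags rules that were seen only once or that have a large spread. The cut-off `max_sd`, the transformation labels and the toy values are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarise(transformations, max_sd=0.5):
    """Group DpIC50 by transformation and flag unreliable rules.

    max_sd is a hypothetical cut-off, not a value from the slides.
    """
    grouped = defaultdict(list)
    for name, dpic50 in transformations:
        grouped[name].append(dpic50)
    summary = {}
    for name, values in grouped.items():
        n = len(values)
        sd = stdev(values) if n > 1 else None  # stdev needs at least 2 values
        if n == 1:
            flag = "too specific (n=1)"
        elif sd > max_sd:
            flag = "not specific enough (large stddev)"
        else:
            flag = "usable"
        summary[name] = {"n": n, "mean": mean(values), "stddev": sd, "flag": flag}
    return summary


# Toy data: a consistent rule, a noisy rule, and a one-off full reaction.
toy = [
    ("H>>F", 0.20), ("H>>F", 0.30), ("H>>F", 0.25),
    ("H>>OMe", 1.5), ("H>>OMe", -1.2), ("H>>OMe", 0.1),
    ("full_rxn_1", 1.9),
]
stats = summarise(toy)
for name, row in stats.items():
    print(name, row)
```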

PP reaction fingerprints: RCFP

• RCFP are similar to ECFP, atoms described by:
– Charge
– Hybridization
– Whether the atom is Reactant or Product
– Whether or not the atom is in the “Reaction Site”

• Need mapped reactions

PP 8.5

Reaction mapping is necessary

Mapped: only features describing reaction site

Unmapped: all features, no information whether atom is in product or reactant

Reaction direction matters

Reaction fingerprints are not identical: A → B ≠ B → A
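A toy illustration of why direction matters once each feature records whether its atom sits in the reactant or the product. This is not the RCFP algorithm; the function name and the environment strings are made up, and a real fingerprint would hash atom environments rather than store labels.

```python
def toy_reaction_fp(reactant_envs, product_envs):
    """Toy fingerprint: each feature is (atom environment, role)."""
    return frozenset(
        [(env, "reactant") for env in reactant_envs]
        + [(env, "product") for env in product_envs]
    )


forward = toy_reaction_fp({"aryl-Br"}, {"aryl-Cl"})   # A -> B
reverse = toy_reaction_fp({"aryl-Cl"}, {"aryl-Br"})   # B -> A

# Same environments, different roles: the two fingerprints differ.
print(forward == reverse)

# Dropping the role collapses both directions onto the same feature set.
role_free_forward = {env for env, _ in forward}
role_free_reverse = {env for env, _ in reverse}
print(role_free_forward == role_free_reverse)
```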

MMP transformation as rules

“Rule” = MMP transformation

Effect = DpIC50

Context of MMP transformation

Tanimoto search of MMP transformations

DpIC50 = 1.9

A single observation…

DpIC50 = 1.8

DpIC50 = 1.5

DpIC50 = 1.3

… becomes more believable when looking at similars
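A neighbour lookup of this kind can be sketched with set-based fingerprints. The DpIC50 values 1.8, 1.5 and 1.3 are the ones shown for the similar transformations; the feature sets, the threshold of 0.4 and the function names are hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient of two feature sets: |A & B| / |A | B|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similar_transformations(query_fp, library, threshold=0.4):
    """Return (similarity, DpIC50) for library entries at or above threshold."""
    scored = [(tanimoto(query_fp, fp), dpic50) for fp, dpic50 in library]
    return sorted((s, d) for s, d in scored if s >= threshold)[::-1]


# Hypothetical fingerprints; the query is the single observation (DpIC50 = 1.9).
query = {1, 2, 3, 4, 5}
library = [
    ({1, 2, 3, 4, 6}, 1.8),
    ({1, 2, 3, 7, 8}, 1.5),
    ({1, 2, 3, 4, 5, 9}, 1.3),
    ({10, 11}, -0.4),  # unrelated transformation, filtered out
]

hits = similar_transformations(query, library)
neighbour_mean = sum(d for _, d in hits) / len(hits)
print(hits, round(neighbour_mean, 2))
```

When the neighbours agree in sign and rough magnitude with the single observation, the rule is easier to trust than the n=1 statistic alone.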

Express significance as Bayesian probability

Bayesian model

“Good” molecules: DpIC50 ≥ 1

Rank test set by the likelihood that a transformation will yield a ≥10-fold increase in potency
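A minimal sketch of a two-class Bayesian scorer over fingerprint features, using the “Good” definition above (DpIC50 ≥ 1). The Laplacian-corrected weight used here, log((good + 1) / (total × base rate + 1)), is an assumption about the kind of estimator involved, not a description of the PP component. The training data are toy values.

```python
import math
from collections import Counter


def is_good(dpic50: float) -> bool:
    """'Good' as defined on the slide: at least a 10-fold potency gain."""
    return dpic50 >= 1.0


def train(samples):
    """samples: list of (feature set, DpIC50). Returns per-feature weights.

    Assumed weight: log((good_count + 1) / (total_count * base_rate + 1)).
    """
    base_rate = sum(is_good(d) for _, d in samples) / len(samples)
    total, good = Counter(), Counter()
    for features, dpic50 in samples:
        for f in features:
            total[f] += 1
            good[f] += is_good(dpic50)
    return {f: math.log((good[f] + 1) / (total[f] * base_rate + 1)) for f in total}


def score(weights, features):
    """Sum of weights of known features; unseen features contribute 0."""
    return sum(weights.get(f, 0.0) for f in features)


training = [({"a", "b"}, 1.9), ({"a", "c"}, 1.3), ({"c", "d"}, -0.2), ({"d"}, 0.1)]
weights = train(training)

# Rank a toy test set: highest score = most likely to give >= 10-fold increase.
test_set = [{"d"}, {"a", "b"}, {"c"}]
ranked = sorted(test_set, key=lambda fp: score(weights, fp), reverse=True)
print(ranked)
```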

Bayes can predict MMP 10 fold increase

• RCFP_6 > RCFP_4

• RCFP_4 >> RCFP_2

Enrichment plots of test set

x-axis: % of Samples (0% to 100%). Curves: Random Model, Perfect Model, dActivity_class_increase_RCFP_2 Model
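The curve in an enrichment plot can be computed along these lines: sort the samples by model score and track the cumulative fraction of “good” samples recovered. This is a sketch with toy scores and an illustrative function name. It assumes at least one good sample.

```python
def enrichment_curve(scores, good_flags):
    """Return (fraction of samples screened, fraction of good samples found)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n, n_good = len(scores), sum(good_flags)
    found, points = 0, [(0.0, 0.0)]
    for rank, i in enumerate(order, start=1):
        found += good_flags[i]
        points.append((rank / n, found / n_good))
    return points


# Toy example: 4 samples, 2 of them good.
scores = [0.9, 0.8, 0.1, 0.4]
good_flags = [True, False, False, True]
curve = enrichment_curve(scores, good_flags)
print(curve)
```

A random model follows the diagonal on average, while a perfect model reaches 100% of the good samples after screening only the good fraction of the set.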

Confidence vs. DpIC50

Bayesian score = confidence

DpIC50

Semi-quantitative Bayesian predictions

• Multi-category Bayesian
• Class = DpIC50 bin
• RCFP_6

Compare:
• Normalised Probability (default)
• #Enrichment
• #EstPGood
• Prediction

#EstPGood score gives the smallest prediction error
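For the multi-category model, each transformation needs a class label derived from its DpIC50 bin. The slides do not give the bin edges, so the edges and labels below are hypothetical; the sketch only shows the binning step, not the Bayesian learner or the scoring options compared above.

```python
from bisect import bisect_right

# Hypothetical bin edges in DpIC50 units; each bin is [lower, upper).
BIN_EDGES = [-1.0, 0.0, 1.0]
BIN_LABELS = ["< -1", "-1 to 0", "0 to 1", ">= 1"]


def dpic50_class(dpic50: float) -> str:
    """Map a DpIC50 value to its bin label."""
    return BIN_LABELS[bisect_right(BIN_EDGES, dpic50)]


for value in (-2.0, -0.19, 0.5, 1.0, 1.9):
    print(value, "->", dpic50_class(value))
```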

22.5% 22.5%

30.0% 19%

MMP vs. Full molecule transformations

Modelling with mapped reactions works better (it should)

22.5% 30.0%

MMP Idea Generator: Training

• 80% training set
– Generate MMP transformations
– Learn classic regression model (PLS)
– Learn Bayesian model from reaction fingerprints

MMP Idea Generator: Test

• ~5.6 predictions per test set molecule
• MMP pIC50 := mean(pIC50(reactant) + DpIC50(transformation))
• RCFP pIC50 := mean(pIC50(reactant) + DpIC50(predicted by Bayes))
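The prediction formula for the test step can be sketched as below: when several reactant and transformation combinations lead to the same product, the individual estimates pIC50(reactant) + DpIC50 are averaged. The same function serves the RCFP variant if the DpIC50 comes from the Bayesian model instead of the observed transformation statistics. Compound ids and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def predict_pic50(ideas):
    """ideas: (product id, reactant pIC50, DpIC50 of the applied transformation).

    Several routes may reach the same product; their estimates are averaged.
    """
    per_product = defaultdict(list)
    for product, reactant_pic50, dpic50 in ideas:
        per_product[product].append(reactant_pic50 + dpic50)
    return {product: mean(values) for product, values in per_product.items()}


# Toy design ideas: two routes to cpd4, one route to cpd9.
ideas = [("cpd4", 6.0, 1.0), ("cpd4", 6.5, 0.3), ("cpd9", 5.0, -0.2)]
predictions = predict_pic50(ideas)
print(predictions)
```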

Runtime ~ 30 min

~34k transformations, >6.5M design ideas

Test set

QSAR by MMP

QSAR by Bayes / RCFP_6

SAR by MMP vs. SAR by PLS (ECFP_6 / phys property descriptors)

MMP PLS

• MMP predictions nearly as good as PLS predictions

• Not a 100% like-for-like comparison: fewer predictions for MMP

Consensus MMP & PLS predictions

Consensus: 26 / 62

Found by PLS: 10 / 56

Found by MMP: 11 / 56

Red: top 5% by pIC50 (59)

Solid: top 10% (118) by MMP or PLS. Total = 174

12 / 1006
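The counts above can be turned into hit rates. The reading assumed here is that each "x / y" means x true top-5% compounds among y compounds in that group, with the groups being: selected by both methods, by PLS only, by MMP only, and by neither. Under that reading the totals are self-consistent (59 hits, 1180 compounds, 5% baseline). The group labels "PLS only", "MMP only" and "neither" are my interpretation.

```python
# (true top-5% compounds, compounds in group), as read from the slide.
counts = {
    "consensus": (26, 62),
    "PLS only": (10, 56),
    "MMP only": (11, 56),
    "neither": (12, 1006),
}

total_hits = sum(hits for hits, _ in counts.values())  # 59
total_mols = sum(n for _, n in counts.values())        # 1180
baseline = total_hits / total_mols                     # top 5% by construction

hit_rates = {group: hits / n for group, (hits, n) in counts.items()}
enrichment = {group: rate / baseline for group, rate in hit_rates.items()}

for group in counts:
    print(f"{group:10s} hit rate {hit_rates[group]:.1%}  "
          f"enrichment {enrichment[group]:.1f}x")
```

On these numbers the consensus group has a hit rate of about 42%, against about 18% to 20% for compounds picked by only one method.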

Conclusions

• For one dataset it has been shown that
– MMP transformations can form the basis of an automatic “Learning Machine”
– Can select “significant rules”
– Consensus MMP/regression activity prediction works better than individual predictions

Spares

MMP vs. Bayes/RCFP predictions
