
in partnership with

Making the discoveries that defeat cancer

Enriching SAR Information by Molecular Scaffold Enumeration Dr Yi Mok

in silico Medicinal Chemistry Cancer Research UK Cancer Therapeutics Unit Division of Cancer Therapeutics ICR London

7th Joint Sheffield Conference on Chemoinformatics Monday 4 July 2016

[email protected]

Overview

Aims
-  Enhancing the application of objective scaffold definitions in clustering
-  Mimicking scaffold exploration when establishing SAR
-  Enriching SAR information during HTS hit identification

Method
-  Systematic molecular scaffold enumeration

Results
-  Relevance to medicinal chemistry
-  Scaffold structural diversity
-  Relevance to SAR analysis

Conclusions and Outlook
-  Applications in scaffold morphing and hopping

Molecular Scaffolds and SAR

[Figure: a substituted structure (R1, R2, R3) with panels labelled Ring systems and Graph framework]

Molecular Scaffolds and SAR

[Figure: NVP-AUY922 with panels labelled Markush scaffold and Exemplar medicinal chemistry scaffold]

Langdon et al. (2011) J Chem Inf Model, 51, 2174.

Molecular Scaffolds and SAR

The Scaffold Tree1,2
•  Preferably dataset-independent
•  Objective and invariant

[Figure: Scaffold Tree hierarchy for an example molecule, pruned stepwise through Level 3, Level 2, Level 1 and Level 0]

1Schuffenhauer et al. (2007) J Chem Inf Model, 47, 47. 2Langdon et al. (2011) J Chem Inf Model, 51, 2174.

Clustering using Core Scaffolds

Using objective scaffold definitions in compound clustering

✓  Represents the concept of ‘compound series’ in medicinal chemistry
✓  Structurally relevant and easily interpretable
✓  Can readily derive SAR
×  May overlook key functional groups and functional group similarities
×  Definition may be too stringent

Introduce ‘controlled fuzziness’ in scaffold representation to enhance objective scaffold definitions and their applications in compound clustering

Core Scaffold Enumeration

Enumeration of Core Scaffold (EnCore)1

Aims
-  To mimic scaffold exploration efforts when establishing SAR
-  To enrich SAR information during HTS hit identification

Literature precedents for molecular enumeration
-  Regioisomer enumeration (HREMS)2
-  Reagent enumeration3
-  Scaffold morphing4

Features
-  C, N, O elemental changes
-  Change only one atom (‘mutation’) on the scaffold per generation
-  Applicable to all atoms in a scaffold
-  Keep the (non-)aromaticity of the scaffold
-  Collection of mutated scaffolds → Enumerated scaffold cluster

1Mok & Brown. J Chem Inf Model, submitted. 2Krska et al. (2015) J Chem Inf Model, 55, 1130. 3Ward & Kettle (2011) J Med Chem, 54, 4670. 4Beno & Langley (2010) J Chem Inf Model, 50, 1159.

EnCore

Workflow
-  Input molecular scaffold
-  Generate canonical SMILES
-  Introduce single-atom mutations
-  Check valence of mutated scaffolds
-  Compare aromaticity to parent scaffold
-  Keep unique mutated scaffolds
-  Next generation: repeat the cycle on the mutated scaffolds
-  Output enumerated scaffold cluster

[Figure: example parent scaffold and its single-atom mutated scaffolds (numeric labels in the original figure: 8, 6)]

Mok & Brown. J Chem Inf Model, submitted.
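The generate, mutate, check, deduplicate cycle can be illustrated with a deliberately small sketch. It is not the authors' implementation, which works on canonical SMILES with real valence and aromaticity checks. The assumptions here are loud ones: the scaffold is a single six-membered aromatic ring written as a string of ring atoms, every ring atom is assumed to use three valences inside the ring (so O is rejected, standing in for the valence and aromaticity checks), and uniqueness is decided by the smallest rotation or reflection of the string rather than by canonical SMILES. The names `canonical_ring`, `mutate_once` and `enumerate_scaffolds` are hypothetical.

```python
# Toy model only: a six-membered aromatic ring as a string of ring atoms.
MAX_VALENCE = {"C": 4, "N": 3, "O": 2}  # neutral atoms, assumed values
RING_VALENCE = 3  # assumption: valences each atom spends inside the aromatic ring


def canonical_ring(atoms):
    """Smallest rotation/reflection of the ring string (stand-in for canonical SMILES)."""
    forms = []
    for seq in (atoms, atoms[::-1]):
        for i in range(len(seq)):
            forms.append(seq[i:] + seq[:i])
    return min(forms)


def mutate_once(ring):
    """All unique rings that differ from `ring` by one C/N/O substitution."""
    mutated = set()
    for i, old in enumerate(ring):
        for new in "CNO":
            if new == old:
                continue
            if MAX_VALENCE[new] < RING_VALENCE:
                continue  # crude valence/aromaticity filter
            mutated.add(canonical_ring(ring[:i] + new + ring[i + 1:]))
    return mutated


def enumerate_scaffolds(parent, generations=2):
    """Return the new unique scaffolds found in each generation."""
    parent = canonical_ring(parent)
    seen = {parent}
    frontier = {parent}
    per_generation = []
    for _ in range(generations):
        candidates = set()
        for scaffold in frontier:
            candidates |= mutate_once(scaffold)
        candidates -= seen  # keep only scaffolds not generated before
        seen |= candidates
        per_generation.append(sorted(candidates))
        frontier = candidates
    return per_generation


result = enumerate_scaffolds("CCCCCC", generations=2)
print(result)  # [['CCCCCN'], ['CCCCNN', 'CCCNCN', 'CCNCCN']]
```

Starting from benzene, the first generation of this toy model yields pyridine and the second yields the three diazines (pyridazine, pyrimidine, pyrazine). The back-mutation to benzene is discarded because it has already been seen.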

EnCore

EnCore enumerated scaffold cluster represents an exhaustive collection of scaffolds within the defined constraints

[Figure: enumerated scaffold cluster for the example scaffold, showing the mutated scaffolds generated (numeric labels in the original figure: 7, 22)]

Mok & Brown. J Chem Inf Model, submitted.

Relevance to Medicinal Chemistry

Can EnCore enumerated scaffold clusters retrieve explored scaffolds in a medicinal chemistry compound series?

[Diagram: Medicinal chemistry compound series compared with EnCore enumeration]

Literature medicinal chemistry compound series
-  Published in J Med Chem in a single manuscript
-  62 publications contained at least 100 compounds with IC50 against defined target
-  Removed from list papers on virtual screening, QSAR models and reviews
-  43 literature medicinal chemistry series
-  A wide spectrum of therapeutic targets
-  Diverse structural profile

Mok & Brown. J Chem Inf Model, submitted.

Relevance to Medicinal Chemistry

Used the top Level 1 scaffold in each series for EnCore enumeration

[Figure: No. of Scaffolds (0 to 40) against No. of Compounds (80 to 200)]

[Figure: Number of Series (0 to 25) against Generation (1 to 4)]

Mok & Brown. J Chem Inf Model, submitted.

Relevance to Medicinal Chemistry

-  125 HIV-1 Reverse Transcriptase inhibitors1
-  121 Sigma Receptor ligands2

[Figure: scaffolds explored in the two series (numeric labels in the original figure: 113, 2, 1, 34, 10)]

EnCore could mimic the scaffold exploration when establishing SAR

1Hargrave et al. (1991) J Med Chem, 34, 2231. 2Gilligan et al. (1992) J Med Chem, 35, 4344.

Scaffold Structural Diversity

How many generations of EnCore enumeration?

DrugBank1 approved drugs
-  1826 compounds
-  962 Lipinski-compliant with minimum two rings
-  475 unique Level 1 Scaffolds

[Figure: Number of Unique Mutated Scaffolds (log scale, 0.1 to 100000) against Generation (0 to 4); median labels: 12, 65, 210, 448]

1 www.drugbank.ca
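The DrugBank preparation reads as a filter followed by a grouping step, which might be sketched as below. This is only an illustration: the slide does not state which Lipinski thresholds or how many violations were allowed, so the usual rule-of-five limits with zero violations are assumed, and the descriptors and Level 1 scaffold of each compound are assumed to be precomputed elsewhere. `Compound`, `is_lipinski_compliant` and `filter_and_group` are hypothetical names, and the four example records are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    name: str
    mol_weight: float
    logp: float
    h_donors: int
    h_acceptors: int
    ring_count: int
    level1_scaffold: str  # assumed to be a precomputed canonical string


def is_lipinski_compliant(c):
    """Rule-of-five check with zero violations allowed (an assumption)."""
    return (
        c.mol_weight <= 500
        and c.logp <= 5
        and c.h_donors <= 5
        and c.h_acceptors <= 10
    )


def filter_and_group(compounds, min_rings=2):
    """Keep compliant compounds with enough rings, then list their unique scaffolds."""
    kept = [
        c for c in compounds
        if is_lipinski_compliant(c) and c.ring_count >= min_rings
    ]
    scaffolds = sorted({c.level1_scaffold for c in kept})
    return kept, scaffolds


# Invented records, for illustration only.
library = [
    Compound("drug-a", 320.4, 2.1, 1, 4, 3, "scaffold-1"),
    Compound("drug-b", 298.3, 3.0, 2, 5, 2, "scaffold-1"),
    Compound("drug-c", 612.7, 4.2, 3, 9, 4, "scaffold-2"),  # fails the weight limit
    Compound("drug-d", 180.2, 1.2, 1, 3, 1, "scaffold-3"),  # only one ring
]

kept, scaffolds = filter_and_group(library)
print(len(library), len(kept), len(scaffolds))  # 4 2 1
```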

Scaffold Structural Diversity

Two generations of enumeration offer a balance between chemical space sampling and the time required to generate enumerated scaffold clusters

[Figure: Average EPFP7 FPsim. Avg FP similarity to parent scaffold for each generation; Tanimoto Fingerprint Similarity (0.0 to 1.0) against Generation (0 to 4)]

[Figure: Maximum EPFP7 FPsim. Max FP similarity to parent scaffold for each generation; Tanimoto Fingerprint Similarity (0.0 to 1.0) against Generation (0 to 4)]

Mok & Brown. J Chem Inf Model, submitted.
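For readers unfamiliar with the similarity measure on these plots, the Tanimoto coefficient between two fingerprints is the number of shared on-bits divided by the number of on-bits in either. The sketch below computes the average and maximum similarity of a set of mutated scaffolds to their parent. The fingerprints are invented sets of bit positions, not real fingerprints from the talk, and `tanimoto` is a hypothetical helper name.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as collections of on-bit positions."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 1.0  # convention assumed here for two empty fingerprints
    return len(a & b) / len(union)


# Invented fingerprints, for illustration only.
parent = {1, 4, 7, 9}
mutated = [
    {1, 4, 7, 9},   # identical bits
    {1, 4, 7, 12},  # three of five bits shared
    {1, 5, 8, 12},  # one of seven bits shared
]

similarities = [tanimoto(parent, fp) for fp in mutated]
average_similarity = sum(similarities) / len(similarities)
maximum_similarity = max(similarities)
print(similarities, average_similarity, maximum_similarity)
```

Averaging shows how far a generation has drifted from the parent overall, while the maximum shows how close its nearest member remains.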

Relevance to SAR Analysis

ICR/CRT screening library
-  Designed for high-throughput screening
-  214,540 compounds
-  23,319 Level 1 scaffolds
   -  11,657 representing at least two compounds
   -  11,662 represented only once (singletons)

Can EnCore enumerated scaffold clusters identify extant scaffolds and associate structurally related screening compounds to the parent scaffold?

[Figure: Exemplar scaffold A and Exemplar scaffold B, each expanded by EnCore]

[Figure: Number of Scaffolds in Enumerated Scaffold Cluster (0 to 40) against Log10(Number of Compounds in Parent Scaffold) (0 to 4); scale labels 1, 3, 10, 30, 100, 300, 1000, 3000]
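Identifying extant scaffolds amounts to intersecting an enumerated scaffold cluster with the scaffolds actually present in the library, then pooling the compounds of the matches with those of the parent. A minimal sketch follows, assuming scaffolds are stored as canonical strings so that equal scaffolds compare equal. `associate_compounds` and the example data are hypothetical and not taken from the talk.

```python
def associate_compounds(parent, enumerated_cluster, library_by_scaffold):
    """Find enumerated scaffolds that exist in the library and pool their compounds.

    library_by_scaffold maps a scaffold string to the list of compound IDs
    that carry that scaffold.
    """
    extant = sorted(
        s for s in enumerated_cluster
        if s != parent and s in library_by_scaffold
    )
    compounds = list(library_by_scaffold.get(parent, []))
    for scaffold in extant:
        compounds.extend(library_by_scaffold[scaffold])
    return extant, compounds


# Invented library: the parent scaffold is a singleton before enumeration.
library_by_scaffold = {
    "c1ccccc1": ["cpd-1"],
    "c1ccncc1": ["cpd-2", "cpd-3"],
    "c1ccoc1": ["cpd-4"],
}
enumerated_cluster = {"c1ccncc1", "c1cnccn1", "c1ccnnc1"}

extant, compounds = associate_compounds("c1ccccc1", enumerated_cluster, library_by_scaffold)
print(extant)     # ['c1ccncc1']
print(compounds)  # ['cpd-1', 'cpd-2', 'cpd-3']
```

In this toy case one of the three enumerated scaffolds is extant, so the singleton parent gains two structurally related compounds.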

Relevance to SAR Analysis

After two generations of mutations
-  Extant scaffolds identified in 17,199 enumerated scaffold clusters out of 23,319 scaffolds (74%)
-  Maximum increase to 54 extant scaffolds
-  No extant scaffold match for only 6,120 scaffolds (26%)

EnCore can enrich SAR information in screening library by associating structurally related screening compounds to multiple scaffolds

[Figure: Number of Scaffolds in Enumerated Scaffold Cluster (0 to 40) against Log10(Number of Compounds in Parent Scaffold) (0 to 4); scale labels 1, 3, 10, 30, 100, 300, 1000, 3000. Additional axes: No. of Scaffolds against No. of Compounds (Log)]

Mok & Brown. J Chem Inf Model, submitted.

Relevance to SAR Analysis

After two generations of mutations
-  7,369 out of 11,662 singleton scaffolds match with extant scaffolds in their enumerated scaffold clusters (63%)
-  Two enumerated scaffold clusters have >10,000 screening compounds after enumeration

EnCore can enrich SAR information in screening library by introducing structurally related screening compounds in singleton scaffolds

[Figure: Singleton Scaffolds after EnCore Enumeration. Number of Scaffolds in Enumerated Scaffold Cluster (0 to 40) against Log10(Number of Compounds in Enumerated Scaffold Cluster) (0 to 4); scale labels 1, 3, 10, 30, 100, 300, 1000, 3000. Additional axes: No. of Scaffolds against No. of Compounds (Log)]

Mok & Brown. J Chem Inf Model, submitted.
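The screening-library figures quoted on these slides can be cross-checked with a few lines of arithmetic. The counts below are copied from the slides; only the variable names and the `percent` helper are introduced here.

```python
def percent(part, whole):
    """Share of `whole` taken by `part`, rounded to the nearest whole percent."""
    return round(100 * part / whole)


total_scaffolds = 23_319         # Level 1 scaffolds in the library
multi_compound = 11_657          # scaffolds representing at least two compounds
singletons = 11_662              # scaffolds represented only once
clusters_with_extant = 17_199    # clusters containing at least one extant scaffold
clusters_without_extant = 6_120  # clusters with no extant scaffold match
singletons_matched = 7_369       # singletons with an extant scaffold in their cluster

# The two partitions of the scaffold set should both add up to the total.
assert multi_compound + singletons == total_scaffolds
assert clusters_with_extant + clusters_without_extant == total_scaffolds

print(percent(clusters_with_extant, total_scaffolds))     # 74
print(percent(clusters_without_extant, total_scaffolds))  # 26
print(percent(singletons_matched, singletons))            # 63
```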

Conclusions and Outlook

•  A list of literature medicinal chemistry compound series defined
•  EnCore can mimic scaffold exploration when establishing SAR
•  EnCore can enrich SAR information in screening library
   -  By associating structurally related compounds to multiple clusters
   -  By introducing structurally related compounds in singleton scaffolds
•  EnCore offers complementary capabilities to literature enumeration tools and clustering methods
•  Multiple generations of enumerated scaffolds represent an exhaustive collection for scaffold morphing and hopping

in silico Medicinal Chemistry
Nathan Brown, Fabio Broccatelli, Michael Carter, Nicholas Firth, Teresa Kaserer, Sarah Langdon, Joshua Meyers, Lewis Vidler

Medicinal Chemistry
Julian Blagg