Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

Combining Cheminformatics Methods and Pathway Analysis to Identify Molecules with Whole Cell Activity Against Mycobacterium tuberculosis

Malabika Sarker 1, Carolyn Talcott 1, Peter Madrid 1, Sidharth Chopra 1, Barry A. Bunin 2, Gyanu Lamichhane 3, Joel S. Freundlich 4 and Sean Ekins 2, 5

1 SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025, USA.
2 Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA 94010, USA.
3 Johns Hopkins School of Medicine, Department of Medicine, 1550 Orleans St, Room 103, Baltimore, MD 21287, USA.
4 Departments of Pharmacology & Physiology and Medicine, Center for Emerging and Reemerging Pathogens, UMDNJ – New Jersey Medical School, 185 South Orange Avenue, Newark, NJ 07103, USA.
5 Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, NC 27526, USA.

description

ACS talk wed 28 2012


Page 1: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final


Page 2: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

Tuberculosis kills 1.6-1.7 million people per year (~1 every 20 seconds)

One third of the world's population is infected

Multidrug resistance in 4.3% of cases

Extensively drug-resistant TB is increasing in incidence

No new drugs in over 40 yrs

Drug-drug interactions and Co-morbidity with HIV

Collaboration between groups is rare

These groups may work on existing or new targets

Use of computational methods with TB is rare

Literature TB data is not well collated (SAR)

Funded by Bill and Melinda Gates Foundation

Applying CDD to Build a disease community for TB

Page 3: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final
Page 4: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

~20 public datasets for TB

Including Novartis data on TB hits

>300,000 cpds

Patents, Papers

Annotated by CDD

Open to browse by anyone

http://www.collaborativedrug.com/register

Molecules with activity against

Page 5: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

Simple descriptor analysis on > 300,000 compounds tested vs TB

Dataset                                                             MWT               logP            HBD             HBA           RO5             Atom count       PSA                RBN
MLSMR active, ≥ 90% inhibition at 10uM (N = 4096)                   357.10 (84.70)    3.58 (1.39)     1.16 (0.93)     4.89 (1.94)   0.20 (0.48)     42.99 (12.70)    83.46 (34.31)      4.85 (2.43)
MLSMR inactive, < 90% inhibition at 10uM (N = 216367)               350.15 (77.98)**  2.82 (1.44)**   1.14 (0.88)     4.86 (1.77)   0.09 (0.31)**   43.38 (10.73)    85.06 (32.08)*     4.91 (2.35)
TAACF-NIAID CB2 active, ≥ 90% inhibition at 10uM (N = 1702)         349.58 (63.82)    4.04 (1.02)     0.98 (0.84)     4.18 (1.66)   0.19 (0.40)     41.88 (9.44)     70.28 (29.55)      4.76 (1.99)
TAACF-NIAID CB2 inactive, < 90% inhibition at 10uM (N = 100,931)    352.59 (70.87)    3.38 (1.36)**   1.11 (0.82)**   4.24 (1.58)   0.12 (0.34)**   42.43 (8.94)*    77.75 (30.17)**    4.72 (1.99)
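The asterisks in the descriptor table flag differences between actives and inactives. As a rough illustration of how such a difference can be checked from summary values alone, the sketch below computes a Welch t statistic for the MLSMR logP column. It assumes, which the slide does not state, that each cell is a mean followed by a standard deviation in parentheses; the function name `welch_t` is hypothetical and this is not necessarily the test the authors used.

```python
import math

def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch t statistic and approximate degrees of freedom from summary statistics."""
    var_a = sd_a ** 2 / n_a
    var_b = sd_b ** 2 / n_b
    t = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (var_a + var_b) ** 2 / (var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    return t, df

# MLSMR logP row (ASSUMED to be mean (SD)):
# actives 3.58 (1.39), N = 4096; inactives 2.82 (1.44), N = 216367
t_logp, df_logp = welch_t(3.58, 1.39, 4096, 2.82, 1.44, 216367)
print(f"logP: t = {t_logp:.1f}, df = {df_logp:.0f}")
```

With groups this large, even small mean differences give large t values, so the size of the difference (here about 0.76 log units) matters more than the significance flag.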

Page 6: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

Ekins et al., Trends in Microbiology 19: 65-74, 2011

Fitting into the drug discovery process

Page 7: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

BMGF 3 Academia/ Govt lab – Industry screening partnerships

CDD used for data sharing / collaboration – along with cheminformatics expertise

Previously supported larger groups of labs – many continued as customers

Page 8: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

CDD is a partner on a 5 year project supporting >20 labs and providing cheminformatics support

Already found hits for a TB target using docking www.mm4tb.org

More Medicines for Tuberculosis

Page 9: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

Bayesian Classification TB Models

Dataset (number of molecules)                 External ROC Score   Internal ROC Score   Concordance     Specificity     Sensitivity
MLSMR all single point screen (N = 220463)    0.86 ± 0             0.86 ± 0             78.56 ± 1.86    78.59 ± 1.94    77.13 ± 2.26
MLSMR dose response set (N = 2273)            0.73 ± 0.01          0.75 ± 0.01          66.85 ± 4.06    67.21 ± 7.05    65.47 ± 7.96

We can use the public data for machine learning model building

Using Discovery Studio Bayesian model

Leave out 50% x 100

Ekins et al., Mol BioSyst, 6: 840-851, 2010
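The "leave out 50% x 100" validation and the concordance, specificity and sensitivity columns can be illustrated with a small sketch. Everything here is an assumption for illustration: the data are synthetic, and the one-dimensional threshold "model" is only a stand-in for the Discovery Studio Bayesian model, which is not reproduced.

```python
import random

def confusion_metrics(y_true, y_pred):
    """Return concordance, specificity and sensitivity as percentages."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return {
        "concordance": 100.0 * (tp + tn) / len(y_true),
        "specificity": 100.0 * tn / (tn + fp),
        "sensitivity": 100.0 * tp / (tp + fn),
    }

def fit_threshold(xs, ys):
    """Toy stand-in for a real model: midpoint between the two class means."""
    act = [x for x, y in zip(xs, ys) if y]
    inact = [x for x, y in zip(xs, ys) if not y]
    return (sum(act) / len(act) + sum(inact) / len(inact)) / 2.0

def leave_out_half(xs, ys, repeats=100, seed=0):
    """Repeat: train on a random 50%, test on the held-out 50%; average the metrics."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    runs = []
    for _ in range(repeats):
        rng.shuffle(idx)
        half = len(idx) // 2
        train, test = idx[:half], idx[half:]
        cut = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        pred = [xs[i] > cut for i in test]
        runs.append(confusion_metrics([ys[i] for i in test], pred))
    return {k: sum(r[k] for r in runs) / len(runs) for k in runs[0]}

# Synthetic data (hypothetical): one score per compound, 200 actives and 200 inactives
data_rng = random.Random(1)
xs = [data_rng.gauss(5.0, 1.0) for _ in range(200)] + [data_rng.gauss(2.0, 1.0) for _ in range(200)]
ys = [1] * 200 + [0] * 200
summary = leave_out_half(xs, ys)
print({k: round(v, 1) for k, v in summary.items()})
```

Averaging over many random splits is what produces the "± " spread reported in the table; a single split would give one noisier estimate.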

Page 10: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

Additional test sets: 100K library, Novartis data, FDA drugs

Suggests models can predict data from the same and independent labs

Initial enrichment – enables screening few compounds to find actives

21 hits in 2108 cpds; 34 hits in 248 cpds; 1702 hits in >100K cpds

Ekins and Freundlich, Pharm Res, 28, 1859-1869, 2011. Ekins et al., Mol BioSyst, 6: 840-851, 2010
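"Initial enrichment" can be made concrete as an enrichment factor: the hit rate among the top-scored compounds divided by the hit rate of the whole set. The sketch below is a minimal illustration; the scores and hit labels are hypothetical toy values, and only the two baseline hit counts are taken from the slide.

```python
def enrichment_factor(scores, is_hit, top_fraction):
    """Hit rate in the top-scoring fraction divided by the overall hit rate."""
    ranked = sorted(zip(scores, is_hit), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(round(len(ranked) * top_fraction)))
    top_hits = sum(hit for _, hit in ranked[:n_top])
    overall_rate = sum(is_hit) / len(is_hit)
    return (top_hits / n_top) / overall_rate

# Hypothetical toy ranking: 10 compounds, 2 hits, both scored near the top
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
toy_hits = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
ef_30 = enrichment_factor(toy_scores, toy_hits, 0.3)
print(f"Enrichment factor in the top 30%: {ef_30:.2f}")

# Baseline (unranked) hit rates for two of the test sets quoted on the slide
baseline_2108 = 21 / 2108
baseline_248 = 34 / 248
print(f"Baseline hit rates: {baseline_2108:.2%} and {baseline_248:.2%}")
```

An enrichment factor above 1 in the first few percent of the ranked list is what lets a lab test few compounds and still find actives.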

Page 11: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

Searching for TB molecular mimics; collaboration

Lamichhane G, et al., mBio, 2: e00301-10, 2011

Modeling – CDD

Biology – Johns Hopkins

Chemistry – Texas A&M

Azaserine exhibited a good fit for this pharmacophore, as judged by its quantitative FitValue (= 2.1) and visual inspection.

Page 12: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

CDD

Literature data on molecules and their targets

Similarity search with a mimic enables target fishing

SRI

Pathway data (targets)

Species differences in pathways

Where to intervene

Combine the knowledge

Select new targets

Take mimic strategy

Aim 1: Develop API to link CDD and SRI databases

Aim 2: Target and compound data added to pathway model

Aim 3: Identify new targets for drugs

CDD and SRI STTR collaboration

Page 13: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

Mimic strategy

1. The enzymes around these metabolites are "in vivo essential".

2. These enzymes have no human homolog.

3. These enzyme targets are not yet explored though some enzymes from the same pathways are drug targets (experimental or predicted).
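The three criteria above amount to set operations on gene lists. The sketch below shows one way to express them; the gene names are placeholders invented for illustration and the function name is hypothetical, so it says nothing about which real Mtb genes pass the filter.

```python
def select_targets(essential_in_vivo, non_human_homologs, known_drug_targets):
    """Genes that are in vivo essential, lack a human homolog, and are not yet drug targets."""
    return sorted((essential_in_vivo & non_human_homologs) - known_drug_targets)

# Placeholder gene sets (hypothetical, not real annotations)
essential_in_vivo = {"geneA", "geneB", "geneC", "geneD"}
non_human_homologs = {"geneA", "geneB", "geneE"}
known_drug_targets = {"geneB"}

candidates = select_targets(essential_in_vivo, non_human_homologs, known_drug_targets)
print(candidates)
```

Keeping each criterion as its own set makes it easy to swap in a different essentiality screen or homology cutoff without touching the selection logic.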

Page 14: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

Identification of essential in vivo enzymes of Mtb

Analysis of metabolic pathway and reaction information for the essential enzymes

Comparison of non-human-homologous enzymes with Mtb in vivo essential gene set

Selection of targets – in vivo essential, not homologous to human and not known as TB drug-targets

In silico design of small molecule inhibitors or pharmacophores for selected enzyme targets

In vitro testing of selected pharmacophores

Flowchart labels: CDD, SRI, SRI, SRI, SRI, SRI

Leverages work of Lamichhane et al., Sassetti et al.

Approach taken similar to that of Lamichhane et al. mBio paper 2011, but instead mimic the substrate

Uses data from SRI and CDD databases to select targets that have not been exploited with small molecules

Page 15: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

The cellular overview diagram for M. tuberculosis H37Rv, from the TBCyc database (http://tbcyc.tbdb.org/index.shtml)

TBCyc gave a total of 53 non-redundant pathways for the set of 314 essential in vivo genes.

Sarker et al., Pharm Res 2012, in press

Page 16: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

Venn diagram shows the degree of association between the in vivo mutants of Mtb in different animal models

Sarker et al., Pharm Res 2012, in press

Page 17: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

Anishetty et al

Sassetti et al

185 proteins from Mtb absent in human

49 proteins unique to Mtb

Among 314 essential in vivo proteins of Mtb, 66 proteins were non-human homologs

C.M. Sassetti et al., Molecular Microbiology 48: 77-84 (2003).

S. Anishetty et al., Comput Biol Chem 29: 368-378 (2005).

Page 18: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

https://www.collaborativedrug.com/buzz/2011/05/02/new-tb-targets-and-molecules-data-available-for-public-access-use/

TB target database for in vivo essential genes.

Page 19: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

14 known gene targets and 31 predicted gene targets for the 35 already approved TB drugs

TB molecules with activity in vitro and target information (from CDD) - now with added external links to pathways, literature etc.

Page 20: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

TB molecules and target information database connects molecule, gene, pathway and literature
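A database that connects molecule, gene and pathway can be pictured as two record types joined on a shared gene identifier. The sketch below is only an illustration of that idea: the class and field names are hypothetical and do not describe the CDD or SRI schemas or any real API. The example rows reuse gene, pathway and compound names that appear elsewhere in this talk, and the molecule-to-target links are the proposed (not yet verified) ones.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PathwayRecord:
    """One gene as a pathway database might describe it (hypothetical fields)."""
    locus: str
    gene: str
    pathway: str
    substrate: str

@dataclass(frozen=True)
class MoleculeRecord:
    """One compound as a screening database might describe it (hypothetical fields)."""
    molecule_id: str
    proposed_locus: str
    mic_ug_per_ml: float

def link_records(molecules, pathways):
    """Pair each molecule with the pathway record of its proposed target gene."""
    by_locus = {p.locus: p for p in pathways}
    return [(m, by_locus[m.proposed_locus]) for m in molecules if m.proposed_locus in by_locus]

pathways = [
    PathwayRecord("Rv1589", "bioB", "Biotin biosynthesis", "dethiobiotin"),
    PathwayRecord("Rv1099c", "glpX", "glycolysis and gluconeogenesis", "D-fructose 1,6-bisphosphate"),
]
molecules = [
    MoleculeRecord("DFP000133SC", "Rv1099c", 40.0),
    MoleculeRecord("DFP000134SC", "Rv1099c", 20.0),
]

linked = link_records(molecules, pathways)
for mol, path in linked:
    print(mol.molecule_id, "->", path.gene, path.locus, "|", path.pathway)
```

Joining on the locus tag rather than the gene name avoids ambiguity when gene symbols are reused or renamed across databases.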

Page 21: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

Essential Gene    Pathway                                                        Essential Substrate/s
bioB (Rv1589)     Biotin biosynthesis                                            dethiobiotin
thiE (Rv0414c)    Thiamine biosynthesis                                          2-(4-methylthiazol-5-yl)ethyl phosphate and [(4-amino-2-methyl-pyrimidin-5-yl)methoxy-oxido-phosphoryl] phosphate
cysE (Rv2335)     Cysteine biosynthesis                                          L-serine and acetyl-CoA
cobC (Rv2231c)    No pathway assigned                                            L-threonine O-3-phosphate
glpX (Rv1099c)    glycolysis and gluconeogenesis                                 D-fructose 1,6-bisphosphate
ppgK (Rv2702)     Amino sugar and nucleotide sugar metabolism; Gluconeogenesis   β-D-glucose
arcA (Rv1001)     arginine degradation V (arginine deiminase pathway)            L-arginine
panD (Rv3601c)    β-alanine biosynthesis IV                                      L-aspartate
otsA (Rv3490)     trehalose biosynthesis I                                       UDP-D-glucose and α-D-glucose 6-phosphate

Targets, metabolites and pathways pursued in this study

Sarker et al., Pharm Res 2012, in press

Page 22: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

Pharmacophore developed (using Accelrys Discovery Studio) from 3D conformations of the substrate

Van der Waals surface for the metabolite mapped onto it

Pharmacophore plus shape searched in 3D compound databases from vendors

In silico hits collated

Filtered for TB whole cell activity and reactivity

Compounds filtered based on Bayesian score using models derived from NIAID / Southern Research Inst data to retrieve ideal molecular properties for in vitro TB activity

Page 23: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

Biotin biosynthesis

dethiobiotin

Pharmacophore

Searching Maybridge (57K) gives 72 molecules – many of them hydrophobic so they stand a chance of in vitro activity

1. Take substrate and generate 3D conformers and build a pharmacophore

2. Use the pharmacophore to search vendor libraries in 3D

3. Buy and test compounds

Example of mimic strategy for bioB Rv1589
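To give a feel for what "searching in 3D with a pharmacophore" involves, the sketch below checks whether a molecule has features of the right types at roughly the right mutual distances. It is a deliberately simplified stand-in, not the Discovery Studio algorithm: it handles a single rigid conformer, ignores the shape constraint, and cannot tell mirror images apart. All coordinates and the tolerance are hypothetical.

```python
import itertools
import math

def matches(pharmacophore, features, tol=1.0):
    """True if some features reproduce the pharmacophore's types and pairwise distances.

    Both arguments are lists of (feature_type, (x, y, z)); tol is in coordinate units.
    """
    k = len(pharmacophore)
    pairs = list(itertools.combinations(range(k), 2))
    for combo in itertools.permutations(features, k):
        # Feature types must agree position by position
        if any(p[0] != f[0] for p, f in zip(pharmacophore, combo)):
            continue
        # Every pairwise distance must agree within the tolerance
        if all(
            abs(math.dist(pharmacophore[i][1], pharmacophore[j][1])
                - math.dist(combo[i][1], combo[j][1])) <= tol
            for i, j in pairs
        ):
            return True
    return False

# Hypothetical three-point pharmacophore
query = [("acceptor", (0, 0, 0)), ("donor", (3, 0, 0)), ("hydrophobe", (0, 4, 0))]

# Same geometry, translated, listed in a different order, plus one extra feature
mol_a = [("hydrophobe", (1, 5, 1)), ("acceptor", (1, 1, 1)),
         ("donor", (4, 1, 1)), ("donor", (9, 9, 9))]
# Donor is twice as far from the acceptor as the query requires
mol_b = [("acceptor", (0, 0, 0)), ("donor", (6, 0, 0)), ("hydrophobe", (0, 4, 0))]

print(matches(query, mol_a), matches(query, mol_b))
```

A production search would run this kind of test over many conformers per molecule, which is why step 1 generates 3D conformers before any searching starts.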

Page 24: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

Substrate Pharmacophores Developed for Mtb Enzymes (figure, panels a. to l.)

Sarker et al., Pharm Res 2012, in press

Green = hydrogen bond acceptor, purple = hydrogen bond donor, cyan = hydrophobe, grey = van der Waals surface

Page 25: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

Two Proposed Mimics of D-fructose 1,6-bisphosphate

DFP000133SC MIC 40μg/ml

DFP000134SC MIC 20μg/ml

Computationally searched >80,000 molecules, narrowed to 842 hits, tested 23 compounds in vitro (3 picked as inactives), leading to 2 proposed as mimics of D-fructose 1,6-bisphosphate

Sarker et al., Pharm Res 2012, in press
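The screening funnel on this slide can be worked through as simple arithmetic. The sketch below uses only the counts quoted above; treating ">80,000" as exactly 80,000 is an assumption, so the first percentage is an upper bound, and reading "3 picked as inactives" as three deliberately chosen negative controls is also an interpretation.

```python
funnel = [
    ("molecules searched (lower bound)", 80000),
    ("in silico hits", 842),
    ("tested in vitro", 23),
    ("proposed mimics", 2),
]

def stage_retention(stages):
    """Percentage of each stage's count that survives to the next stage."""
    return [
        (prev[0], nxt[0], 100.0 * nxt[1] / prev[1])
        for prev, nxt in zip(stages, stages[1:])
    ]

retention = stage_retention(funnel)
for src, dst, pct in retention:
    print(f"{src} -> {dst}: {pct:.2f}%")

# ASSUMPTION: the 3 compounds "picked as inactives" were negative controls,
# leaving 20 tested compounds that were predicted to be active.
predicted_actives_tested = 23 - 3
mimic_rate = 100.0 * 2 / predicted_actives_tested
print(f"Proposed mimics among predicted actives tested: {mimic_rate:.1f}%")
```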


Page 26: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

1. Find candidate genes coding potential targets.
   1. Choose pathogen
   2. Search for genes
      • choose source: experimental in vitro/ex vivo data, in silico (single/double) knockout (choose nutrient set, survival conditions)
      • choose filter (no human ortholog, ..., user edit)
   Output: target candidate list (gene names associated with reference identifier).

2. Prioritize target candidate list.
   1. Annotate (choose properties: pathways, reactions, EC#, GO characterization)
   2. Filtering (choose thresholds)
   3. Sort (choose criteria: number of pathways, number of reactions, ...)
   4. Annotate reaction substrates with structure information.
   Output: Prioritized target list annotated with prioritizing properties and associated reactions with their substrates annotated with structure (these are the candidate molecules to mimic).

Metabolites (and metadata, required as sdf file for software)

3. For each candidate molecule develop pharmacophore model that suggests mimics.
   1. Develop pharmacophore models from metabolites
   2. Search known drug databases for compounds mapping to pharmacophore
   3. Filter based on ADME/Tox properties
   4. Filter based on other models for target bioactivity
   5. Sorting or Pareto optimization of results
   Output: Pharmacophores and candidate mimics for substrates of target enzymes

Molecule id, source

4. Submit top mimics for preliminary experimental validation and lead optimization
   1. Select molecules from 3
   2. Order from vendor
   3. Test in vitro / ex vivo
   4. Add results to CDD database
   5. Prioritize compounds for lead optimization / in vivo studies
   6. Partnering with 3rd party for preclinical/clinical studies
   Output: Experimental results to be fed into the CDD database

Proposed generalized workflow for molecule discovery
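Step 3 of the workflow mentions "Sorting or Pareto optimization of results". As a minimal sketch of the Pareto option, the code below keeps every candidate that no other candidate beats on all objectives at once. The objective names (`fit`, `bayes_score`) and the values are hypothetical; the talk does not say which properties would be traded off.

```python
def pareto_front(candidates, objectives):
    """Return the candidates not dominated on the given objectives (all maximised)."""
    def dominates(a, b):
        return (all(a[k] >= b[k] for k in objectives)
                and any(a[k] > b[k] for k in objectives))
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]

# Hypothetical candidate mimics with two scores to maximise
candidates = [
    {"id": "A", "fit": 2.1, "bayes_score": 5.0},
    {"id": "B", "fit": 1.5, "bayes_score": 8.0},
    {"id": "C", "fit": 1.0, "bayes_score": 4.0},  # worse than A and B on both scores
]

front = pareto_front(candidates, ["fit", "bayes_score"])
print([c["id"] for c in front])
```

Unlike a single sort key, the Pareto front keeps both A (best fit) and B (best model score), leaving the final trade-off to the chemist.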

Page 27: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

Summary

POC took < 6 months; Phase II STTR submitted.

Still need to test vs target: verify that compounds hit the suggested target, then optimize cpds.

Need to link SRI and CDD databases via API – new product

• Computational models based on whole cell TB data could improve efficiency of screening

• Collaborations get us to interesting compounds quickly

• Additional prospective validation ongoing with IDRI, Southern Research Institute and UMDNJ using machine learning models, testing small numbers of compounds

• UMDNJ – mined GSK malaria public data, scored with Bayesian models, ordered from vendors

Page 28: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

Ranked Asinex 25K library with dose response model: 99 screened, 16 cpds were identified with IC50 < 100uM. Compare with HTS screening below.

Library size   Number of hits   Hit rate (%)   Notes                           Reference
100997         1782             1.76           Diverse library                 Ananthan
215110         3817             1.77           Diverse library                 Maddry
25671          1329             5.18           Human kinase focussed library   Reynolds

Example 1. Kinase library

Example 2. Asinex library

Example 3. IDRI: 3 models, 48 compounds tested, 11 with activity at MIC ≤ 10uM (22.9% hit rate)

Example 4. UMDNJ: 1 model, 4 tested, 3 active (1 with MIC < 0.125ug/ml)

Bayesian Machine Learning Models – Improve Hit Rates
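The hit rates on this slide can be recomputed directly from the counts. The sketch below does that and expresses the model-guided examples as a fold change over the first diverse-library screen. The fold values are indicative only, since the activity cutoffs differ between screens (for example IC50 < 100uM versus MIC ≤ 10uM); the dictionary labels are shorthand chosen here.

```python
def hit_rate(hits, tested):
    """Hit rate as a percentage."""
    return 100.0 * hits / tested

# HTS screens as tabulated on the slide: (hits, library size)
hts = {
    "Ananthan, diverse library": (1782, 100997),
    "Maddry, diverse library": (3817, 215110),
    "Reynolds, kinase focussed library": (1329, 25671),
}
# Model-guided selections quoted on the slide: (hits, compounds tested)
model_guided = {
    "Asinex, dose response model": (16, 99),
    "IDRI, 3 models": (11, 48),
    "UMDNJ, 1 model": (3, 4),
}

baseline = hit_rate(*hts["Ananthan, diverse library"])
for name, (hits, tested) in hts.items():
    print(f"{name}: {hit_rate(hits, tested):.2f}%")
for name, (hits, tested) in model_guided.items():
    rate = hit_rate(hits, tested)
    print(f"{name}: {rate:.1f}% (about {rate / baseline:.0f}x the diverse-library baseline)")
```

The UMDNJ figure rests on only four compounds, so its rate carries far more uncertainty than the others.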

Page 29: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

What next - Apps for collaboration ODDT – Open drug discovery teams

Flipboard-like app for aggregating social media for diseases etc

Alex Clark, Molecular Materials Informatics, Inc

Williams et al DDT 16:928-939, 2011

Clark et al submitted 2012

Ekins et al submitted 2012

Page 30: Acs combining cheminformatics methods and pathway analysis to identify molecules with whole final

Acknowledgments: collaborators (Allen Casey, Robert Reynolds, etc.)

Alex Clark (Molecular Materials Informatics, Inc)

Accelrys

CDD

Funding BMGF

Award Number R41AI088893 from the National Institute Of Allergy And Infectious Diseases.

Email: [email protected]

Slideshare: http://www.slideshare.net/ekinssean

Twitter: collabchem

Blog: http://www.collabchem.com/

Website: http://www.collaborations.com/CHEMISTRY.HTM