Post on 28-Jan-2015

ACS talk wed 28 2012

Combining Cheminformatics Methods and Pathway Analysis to Identify Molecules with Whole Cell Activity Against Mycobacterium Tuberculosis

Malabika Sarker1, Carolyn Talcott1, Peter Madrid1, Sidharth Chopra1, Barry A. Bunin2, Gyanu Lamichhane3, Joel S. Freundlich4 and Sean Ekins2,5

1SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025, USA.

2Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA 94010, USA.

3Johns Hopkins School of Medicine, Department of Medicine, 1550 Orleans St, Room 103, Baltimore, MD 21287, USA.

4Departments of Pharmacology & Physiology and Medicine, Center for Emerging and Reemerging Pathogens, UMDNJ – New Jersey Medical School, 185 South Orange Avenue, Newark, NJ 07103, USA.

5Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, NC 27526, USA.

Tuberculosis Kills 1.6-1.7m/yr (~1 every 8 seconds)

1/3rd of world's population infected!

Multidrug resistance in 4.3% of cases

Extensively drug-resistant TB increasing in incidence

No new drugs in over 40 yrs

Drug-drug interactions and Co-morbidity with HIV

Collaboration between groups is rare

These groups may work on existing or new targets

Use of computational methods with TB is rare

Literature TB data is not well collated (SAR)

Funded by Bill and Melinda Gates Foundation

Applying CDD to Build a disease community for TB

~20 public datasets for TB, including Novartis data on TB hits

>300,000 cpds

Patents, papers annotated by CDD

Open to browse by anyone

http://www.collaborativedrug.com/register

Molecules with activity against

Simple descriptor analysis on > 300,000 compounds tested vs TB

Dataset | MWT | logP | HBD | HBA | RO5 | Atom count | PSA | RBN

MLSMR, active (≥ 90% inhibition at 10 uM; N = 4096) | 357.10 (84.70) | 3.58 (1.39) | 1.16 (0.93) | 4.89 (1.94) | 0.20 (0.48) | 42.99 (12.70) | 83.46 (34.31) | 4.85 (2.43)

MLSMR, inactive (< 90% inhibition at 10 uM; N = 216367) | 350.15 (77.98)** | 2.82 (1.44)** | 1.14 (0.88) | 4.86 (1.77) | 0.09 (0.31)** | 43.38 (10.73) | 85.06 (32.08)* | 4.91 (2.35)

TAACF-NIAID CB2, active (≥ 90% inhibition at 10 uM; N = 1702) | 349.58 (63.82) | 4.04 (1.02) | 0.98 (0.84) | 4.18 (1.66) | 0.19 (0.40) | 41.88 (9.44) | 70.28 (29.55) | 4.76 (1.99)

TAACF-NIAID CB2, inactive (< 90% inhibition at 10 uM; N = 100,931) | 352.59 (70.87) | 3.38 (1.36)** | 1.11 (0.82)** | 4.24 (1.58) | 0.12 (0.34)** | 42.43 (8.94)* | 77.75 (30.17)** | 4.72 (1.99)

Ekins et al., Trends in Microbiology 19: 65-74, 2011
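To show how active/inactive differences in a descriptor table like this one can be checked from summary values alone, here is a minimal sketch using Welch's t statistic. It assumes the entries are mean (SD) and that the asterisks mark statistically significant differences; the transcript states neither, and the test actually used in the cited paper may differ.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and approximate degrees of freedom from summary statistics."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# MLSMR logP row, read as mean (SD) (an assumption): actives vs inactives
t_logp, df_logp = welch_t(3.58, 1.39, 4096, 2.82, 1.44, 216367)
# MLSMR HBA row, which carries no asterisk in the table
t_hba, df_hba = welch_t(4.89, 1.94, 4096, 4.86, 1.77, 216367)

print(f"logP: t = {t_logp:.1f}, df ~ {df_logp:.0f}")
print(f"HBA:  t = {t_hba:.1f}, df ~ {df_hba:.0f}")
```

Under these assumptions the logP difference gives a very large t value, while the HBA difference stays near 1, which matches where the asterisks appear in the MLSMR rows.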

Fitting into the drug discovery process

BMGF: 3 Academia / Govt lab – Industry screening partnerships

CDD used for data sharing / collaboration – along with cheminformatics expertise

Previously supported larger groups of labs – many continued as customers

CDD is a partner on a 5 year project supporting >20 labs and providing cheminformatics support

Already found hits for a TB target using docking (More Medicines for Tuberculosis, www.mm4tb.org)

Bayesian Classification TB Models

Dataset (number of molecules) | External ROC Score | Internal ROC Score | Concordance | Specificity | Sensitivity

MLSMR, all single point screen (N = 220463) | 0.86 ± 0 | 0.86 ± 0 | 78.56 ± 1.86 | 78.59 ± 1.94 | 77.13 ± 2.26

MLSMR, dose response set (N = 2273) | 0.73 ± 0.01 | 0.75 ± 0.01 | 66.85 ± 4.06 | 67.21 ± 7.05 | 65.47 ± 7.96

We can use the public data for machine learning model building

Using Discovery Studio Bayesian model

Leave out 50% x 100

Ekins et al., Mol BioSyst, 6: 840-851, 2010
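A minimal sketch of what a "leave out 50% x 100" validation could look like, reading it as 100 random splits in which half of the data is held out each time (an assumption). The one-descriptor threshold model and the synthetic data are stand-ins for illustration only; they are not the Discovery Studio Bayesian model or real TB data.

```python
import random
import statistics

def confusion_metrics(y_true, y_pred):
    """Return concordance, specificity and sensitivity as percentages."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    concordance = 100.0 * (tp + tn) / len(y_true)
    specificity = 100.0 * tn / (tn + fp)
    sensitivity = 100.0 * tp / (tp + fn)
    return concordance, specificity, sensitivity

def fit_threshold(xs, ys):
    """Toy stand-in for a real model: midpoint between the two class means."""
    act = [x for x, y in zip(xs, ys) if y]
    ina = [x for x, y in zip(xs, ys) if not y]
    return (statistics.mean(act) + statistics.mean(ina)) / 2.0

def leave_half_out(xs, ys, repeats=100, seed=0):
    """Repeat a random 50/50 train/test split and summarise each metric."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    runs = []
    for _ in range(repeats):
        rng.shuffle(idx)
        half = len(idx) // 2
        train, test = idx[:half], idx[half:]
        cut = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        pred = [xs[i] > cut for i in test]
        runs.append(confusion_metrics([ys[i] for i in test], pred))
    # mean and standard deviation of each metric across the repeats
    return [(statistics.mean(col), statistics.stdev(col)) for col in zip(*runs)]

# Synthetic data: one descriptor, actives shifted upwards (not real TB data)
data_rng = random.Random(1)
ys = [i % 5 == 0 for i in range(1000)]
xs = [data_rng.gauss(4.0 if y else 3.0, 1.0) for y in ys]

results = leave_half_out(xs, ys)
for name, (m, s) in zip(("concordance", "specificity", "sensitivity"), results):
    print(f"{name}: {m:.2f} ± {s:.2f}")
```

The mean ± SD output mirrors the layout of the concordance, specificity and sensitivity columns in the table.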

Additional test sets: 100K library, Novartis data, FDA drugs

Suggests models can predict data from the same and independent labs

Initial enrichment – enables screening few compounds to find actives

21 hits in 2108 cpds; 34 hits in 248 cpds; 1702 hits in >100K cpds

Ekins and Freundlich, Pharm Res, 28, 1859-1869, 2011. Ekins et al., Mol BioSyst, 6: 840-851, 2010

Searching for TB molecular mimics; collaboration

Lamichhane G, et al Mbio, 2: e00301-10, 2011

Modeling – CDD

Biology – Johns Hopkins

Chemistry – Texas A&M

Azaserine exhibited a good fit for this pharmacophore, as judged by its quantitative FitValue (= 2.1) and visual inspection.

CDD: literature data on molecules and their targets; similarity search with a mimic enables target fishing

SRI: pathway data (targets); species differences in pathways; where to intervene

Combine the knowledge: select new targets; take mimic strategy
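The idea of target fishing by similarity search can be sketched as follows: rank annotated molecules by Tanimoto similarity to a mimic and read off their targets. The fingerprints, molecule names, target names and the 0.5 cut-off are all hypothetical; the fingerprint type and threshold used in CDD are not given in the slides.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def fish_targets(query, annotated, min_sim=0.5):
    """Return (target, molecule, similarity) for annotated molecules similar to the query."""
    hits = []
    for name, (features, target) in annotated.items():
        sim = tanimoto(query, features)
        if sim >= min_sim:
            hits.append((target, name, round(sim, 2)))
    # most similar molecule first
    return sorted(hits, key=lambda h: h[2], reverse=True)

# Hypothetical fingerprints: integers stand for substructure features
annotated = {
    "mol_A": ({1, 2, 3, 4, 5}, "target_X"),
    "mol_B": ({1, 2, 3, 9}, "target_Y"),
    "mol_C": ({7, 8, 9}, "target_Z"),
}
mimic = {1, 2, 3, 4}

print(fish_targets(mimic, annotated))
# [('target_X', 'mol_A', 0.8), ('target_Y', 'mol_B', 0.6)]
```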

Aim 1: Develop API to link CDD and SRI databases

Aim 2: Target and compound data added to pathway model

Aim 3: Identify new targets for drugs

CDD and SRI STTR collaboration

Mimic strategy

1. The enzymes around these metabolites are "in vivo essential".

2. These enzymes have no human homolog.

3. These enzyme targets are not yet explored, though some enzymes from the same pathways are drug targets (experimental or predicted).

Identification of essential in vivo enzymes of Mtb

Analysis of metabolic pathway and reaction information for the essential enzymes

Comparison of non-human-homologous enzymes with the Mtb in vivo essential gene set

Selection of targets – in vivo essential, not homologous to human and not known as TB drug-targets

In silico design of small molecule inhibitors or pharmacophores for selected enzyme targets

In vitro testing of selected pharmacophores

[Workflow figure labels: CDD, SRI, SRI, SRI, SRI, SRI]

Leverages work of Lamichhane et al., Sassetti et al.

Approach taken similar to that of Lamichhane et al. mBio paper 2011 - instead mimic the substrate

Uses data from SRI and CDD databases to select targets that have not been exploited with small molecules

The cellular overview diagram for M. tuberculosis H37Rv, from the TBCyc database (http://tbcyc.tbdb.org/index.shtml)

TBCyc gave a total of 53 non-redundant pathways for the set of 314 essential in vivo genes.

Sarker et al., Pharm Res 2012, in press

Venn diagram shows the degree of association between the in vivo mutants of Mtb in different animal models

Sarker et al., Pharm Res 2012, in press

Anishetty et al

Sassetti et al

185 proteins from Mtb absent in human

49 proteins unique to Mtb

Among 314 essential in vivo proteins of Mtb, 66 proteins were non-human homologs

C.M. Sassetti et al., Molecular Microbiology 48:77-84 (2003).

S. Anishetty et al., Comput Biol Chem 29:368-378 (2005).
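The target selection described here (in vivo essential, not homologous to human, not already a known TB drug target) amounts to set operations on gene lists. A minimal sketch, with placeholder gene sets whose memberships are invented for illustration and do not reproduce the 314 or 66 gene lists:

```python
def select_targets(essential_in_vivo, non_human_homologs, known_drug_targets):
    """Keep genes that are in vivo essential, lack a human homolog
    and are not already known TB drug targets."""
    return sorted((essential_in_vivo & non_human_homologs) - known_drug_targets)

# Placeholder sets: "geneX/Y/Z" are made-up identifiers, memberships are illustrative
essential_in_vivo = {"Rv1589", "Rv0414c", "Rv2335", "geneX", "geneY"}
non_human_homologs = {"Rv1589", "Rv0414c", "geneY", "geneZ"}
known_drug_targets = {"geneY"}

print(select_targets(essential_in_vivo, non_human_homologs, known_drug_targets))
# ['Rv0414c', 'Rv1589']
```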

https://www.collaborativedrug.com/buzz/2011/05/02/new-tb-targets-and-molecules-data-available-for-public-access-use/

TB target database for in vivo essential genes.

14 known gene targets and 31 predicted gene targets for already known 35 approved TB drugs

TB molecules with activity in vitro and target information (from CDD) - now added external links to pathways, literature etc.

TB molecules and target information database connects molecule, gene, pathway and literature

Essential Gene | Pathway | Essential Substrate/s

bioB (Rv1589) | Biotin biosynthesis | dethiobiotin

thiE (Rv0414c) | Thiamine biosynthesis | 2-(4-methylthiazol-5-yl)ethyl phosphate and [(4-amino-2-methyl-pyrimidin-5-yl)methoxy-oxido-phosphoryl] phosphate

cysE (Rv2335) | Cysteine biosynthesis | L-serine and acetyl-CoA

cobC (Rv2231c) | No pathway assigned | L-threonine O-3-phosphate

glpX (Rv1099c) | glycolysis and gluconeogenesis | D-fructose 1,6-bisphosphate

ppgK (Rv2702) | Amino sugar and nucleotide sugar metabolism; Gluconeogenesis | β-D-glucose

arcA (Rv1001) | arginine degradation V (arginine deiminase pathway) | L-arginine

panD (Rv3601c) | β-alanine biosynthesis IV | L-aspartate

otsA (Rv3490) | trehalose biosynthesis I | UDP-D-glucose and α-D-glucose 6-phosphate

Targets, metabolites and pathways pursued in this study

Sarker et al., Pharm Res 2012, in press
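Connecting molecule, gene and pathway records comes down to a join on a shared gene identifier. The sketch below uses three rows from the table above on the pathway side. The molecule-side records and their field names are hypothetical, and no actual CDD or SRI API is shown in the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Target:
    gene: str
    locus: str
    pathway: str
    substrates: tuple

# Three rows copied from the targets table, keyed by locus tag
TARGETS = {
    t.locus: t for t in (
        Target("bioB", "Rv1589", "Biotin biosynthesis", ("dethiobiotin",)),
        Target("glpX", "Rv1099c", "glycolysis and gluconeogenesis",
               ("D-fructose 1,6-bisphosphate",)),
        Target("panD", "Rv3601c", "β-alanine biosynthesis IV", ("L-aspartate",)),
    )
}

def link(molecule_records, targets=TARGETS):
    """Join molecule records to pathway records on the locus tag."""
    linked = []
    for rec in molecule_records:
        target = targets.get(rec["locus"])
        if target is not None:
            linked.append({**rec, "gene": target.gene, "pathway": target.pathway})
    return linked

# Hypothetical molecule-side records; "molecule_id" and "locus" are assumed field names
molecules = [
    {"molecule_id": "DFP000134SC", "locus": "Rv1099c"},
    {"molecule_id": "example-1", "locus": "Rv9999"},  # made-up locus with no match
]

print(link(molecules))
```

Records without a matching locus tag are dropped here; a real linkage layer would more likely report them for curation.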

Pharmacophore developed (using Accelrys Discovery Studio) from 3D conformations of the substrate

van der Waals surface for the metabolite mapped onto it

Pharmacophore plus shape searched in 3D compound databases from vendors

In silico hits collated

Filtered for TB whole cell activity and reactivity

Compounds filtered based on Bayesian score using models derived from NIAID / Southern Research Inst data to retrieve ideal molecular properties for in vitro TB activity
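Filtering in silico hits by a Bayesian score can be sketched with one commonly described form of the Laplacian-corrected naive Bayes estimator. Whether this matches the Discovery Studio implementation in detail is an assumption, and the features, training set and score cut-off of 0 are invented for illustration.

```python
import math
from collections import Counter

def train_laplacian_bayes(samples):
    """samples: list of (feature_set, is_active). Returns per-feature log weights."""
    n_total = len(samples)
    n_active = sum(1 for _, active in samples if active)
    base_rate = n_active / n_total
    seen, seen_active = Counter(), Counter()
    for features, active in samples:
        for f in features:
            seen[f] += 1
            if active:
                seen_active[f] += 1
    # Laplacian correction: rarely seen features are pulled towards a weight of 0
    return {f: math.log((seen_active[f] + 1) / (seen[f] * base_rate + 1)) for f in seen}

def bayes_score(features, weights):
    """Sum the weights of the features present; unseen features contribute 0."""
    return sum(weights.get(f, 0.0) for f in features)

# Toy training data with made-up feature labels
training = [
    ({"hydrophobe", "amide"}, True),
    ({"hydrophobe", "nitro"}, True),
    ({"acid", "amide"}, False),
    ({"acid", "amine"}, False),
    ({"acid", "hydrophobe"}, False),
    ({"amine"}, False),
]
weights = train_laplacian_bayes(training)

# Hypothetical in silico hits, kept only if the score is positive
hits = {"hit_1": {"hydrophobe", "nitro"}, "hit_2": {"acid", "amine"}}
kept = [name for name, feats in hits.items() if bayes_score(feats, weights) > 0]
print(kept)
# ['hit_1']
```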

Biotin biosynthesis: dethiobiotin pharmacophore

Searching Maybridge (57K) gives 72 molecules – many of them hydrophobic so they stand a chance of in vitro activity

Take substrate and generate 3D conformers and build a pharmacophore

Use the pharmacophore to search vendor libraries in 3D

Buy and test compounds

Example of mimic strategy for bioB Rv1589

[Figure: panels a to l]

Substrate Pharmacophores Developed for Mtb Enzymes

Sarker et al., Pharm Res 2012, in press

Green = hydrogen bond acceptor, purple = hydrogen bond donor, cyan = hydrophobe, grey = van der Waals surface

Two Proposed Mimics of D-fructose 1,6 bisphosphate

DFP000133SC MIC 40μg/ml

DFP000134SC MIC 20μg/ml

Computationally searched >80,000 molecules, narrowed to 842 hits, tested 23 compounds in vitro (3 picked as inactives), leading to 2 proposed as mimics of D-fructose 1,6 bisphosphate

Sarker et al., Pharm Res 2012, in press

1. Find candidate genes coding potential targets.

   1. choose pathogen

   2. search for genes

      choose source: experimental in vitro/ex vivo data, in silico (single/double) knockout (choose nutrient set, survival conditions)

      choose filter (no human ortholog, ..., user edit)

   Output: target candidate list, gene names associated with reference identifier.

2. Prioritize target candidate list.

   1. Annotate (choose properties: pathways, reactions, EC#, GO characterization)

   2. Filtering (choose thresholds)

   3. Sort (choose criteria: number of pathways, number of reactions, ...)

   4. Annotate reaction substrates with structure information.

   Output: Prioritized target list annotated with prioritizing properties and associated reactions with their substrates annotated with structure (these are the candidate molecules to mimic).

   Metabolites (and metadata, required as sdf file for software)

3. For each candidate molecule develop pharmacophore model that suggests mimics.

   1. Develop pharmacophore models from metabolites

   2. Search known drug databases for compounds mapping to pharmacophore

   3. Filter based on ADME/Tox properties

   4. Filter based on other models for target bioactivity

   5. Sorting or Pareto optimization of results

   Output: Pharmacophores and candidate mimics for substrates of target enzymes

   Molecule id, source

4. Submit top mimics for preliminary experimental validation and lead optimization

   1. select molecules from 3

   2. order from vendor

   3. test in vitro / ex vivo

   4. add results to CDD database

   5. prioritize compounds for lead optimization / in vivo studies

   6. partnering with 3rd party for preclinical / clinical studies

   Output: Experimental results to be fed into the CDD database

Proposed generalized workflow for molecule discovery
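The "Sorting or Pareto optimization of results" step in the workflow can be illustrated with a small Pareto-front filter: a candidate is kept if no other candidate is at least as good on every objective and strictly better on one. The compound names, the two objectives and their values are hypothetical.

```python
def pareto_front(candidates):
    """Return the names of candidates not dominated by any other,
    with every objective treated as 'higher is better'."""
    def dominates(a, b):
        return (all(x >= y for x, y in zip(a, b))
                and any(x > y for x, y in zip(a, b)))

    front = []
    for name, scores in candidates.items():
        dominated = any(
            dominates(other, scores)
            for other_name, other in candidates.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front

# Hypothetical mimics scored as (pharmacophore fit, bioactivity model score)
candidates = {
    "cpd_1": (2.1, 0.5),
    "cpd_2": (1.8, 3.0),
    "cpd_3": (1.5, 2.0),  # dominated by cpd_2
    "cpd_4": (2.1, 0.2),  # dominated by cpd_1
}

print(pareto_front(candidates))
# ['cpd_1', 'cpd_2']
```

Unlike sorting on a single criterion, this keeps both the best-fitting and the best-scoring compound without having to choose weights for the objectives.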

Summary

POC took < 6 months - submitted phase II STTR.

Still need to test vs target - verify it hits suggested target - optimize cpds.

Need to link SRI and CDD databases via API - new product

• Computational models based on whole cell TB data could improve efficiency of screening

• Collaborations get us to interesting compounds quickly

• Additional prospective validation ongoing with IDRI, Southern Research Institute and UMDNJ using machine learning models - testing small numbers of compounds

• UMDNJ – mined GSK malaria public data, scored with Bayesian models – ordered from vendors

Example 1. Kinase library Example 2. Asinex library

Ranked Asinex 25K library with dose response model - 99 screened. 16 cpds were identified with IC50 < 100 uM. Compare with HTS screening below.

Library size | Number of hits | Hit rate (%) | Notes | Reference

100997 | 1782 | 1.76 | Diverse library | Ananthan

215110 | 3817 | 1.77 | Diverse library | Maddry

25671 | 1329 | 5.18 | Human kinase focussed library | Reynolds

Example 3. IDRI: 3 models - 48 compounds tested, 11 active at MIC ≤ 10 uM (22.9% hit rate)

Example 4. UMDNJ: 1 model – 4 tested, 3 active (1 MIC < 0.125 ug/ml)

Bayesian Machine Learning Models – Improve Hit Rates
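The comparison with HTS can be made explicit by computing hit rates and fold improvement from the numbers quoted on this slide. Using the Ananthan diverse-library screen as the comparator is a choice made here for illustration, and the activity cut-offs differ between the sets, so the fold values are only a rough guide.

```python
def hit_rate(hits, tested):
    """Hit rate as a percentage."""
    return 100.0 * hits / tested

# HTS reference rows (hits, library size) as listed on the slide
hts = {
    "Ananthan": (1782, 100997),
    "Maddry": (3817, 215110),
    "Reynolds": (1329, 25671),
}
# Model-guided examples (hits, compounds tested) as quoted on the slide
guided = {
    "Asinex": (16, 99),
    "IDRI": (11, 48),
    "UMDNJ": (3, 4),
}

baseline = hit_rate(*hts["Ananthan"])  # comparator chosen for this sketch
for name, (hits, tested) in guided.items():
    rate = hit_rate(hits, tested)
    print(f"{name}: {rate:.1f}% hit rate, "
          f"{rate / baseline:.1f}-fold over the {baseline:.2f}% HTS rate")
```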

What next - Apps for collaboration ODDT – Open drug discovery teams

Flipboard-like app for aggregating social media for diseases etc

Alex Clark, Molecular Materials Informatics, Inc

Williams et al DDT 16:928-939, 2011

Clark et al submitted 2012

Ekins et al submitted 2012

Acknowledgments: collaborators (Allen Casey, Robert Reynolds etc.)

Alex Clark (Molecular Materials Informatics, Inc)

Accelrys

CDD

Funding BMGF

Award Number R41AI088893 from the National Institute Of Allergy And Infectious Diseases.

Email: ekinssean@yahoo.com

Slideshare: http://www.slideshare.net/ekinssean

Twitter: collabchem

Blog: http://www.collabchem.com/

Website: http://www.collaborations.com/CHEMISTRY.HTM