Combining Cheminformatics Methods and Pathway Analysis to Identify Molecules with Whole Cell Activity
Against Mycobacterium Tuberculosis
Malabika Sarker1, Carolyn Talcott1, Peter Madrid1, Sidharth Chopra1, Barry A. Bunin2, Gyanu Lamichhane3, Joel S. Freundlich4 and Sean Ekins2,5
1 SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025, USA.
2 Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA 94010, USA.
3 Johns Hopkins School of Medicine, Department of Medicine, 1550 Orleans St, Room 103, Baltimore, MD 21287, USA.
4 Departments of Pharmacology & Physiology and Medicine, Center for Emerging and Reemerging Pathogens, UMDNJ – New Jersey Medical School, 185 South Orange Avenue, Newark, NJ 07103, USA.
5 Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, NC 27526, USA.
Tuberculosis kills 1.6-1.7 million people per year (~1 every 19 seconds)
One third of the world's population infected
Multi-drug resistance in 4.3% of cases
Extensively drug-resistant TB: increasing incidence
No new drugs in over 40 years
Drug-drug interactions and co-morbidity with HIV
Collaboration between groups is rare
These groups may work on existing or new targets
Use of computational methods with TB is rare
Literature TB data is not well collated (SAR)
Funded by Bill and Melinda Gates Foundation
Applying CDD to Build a disease community for TB
~20 public datasets for TB, including Novartis data on TB hits
>300,000 cpds
Patents, papers
Annotated by CDD
Open to browse by anyone
http://www.collaborativedrug.com/register
Molecules with activity
against
Simple descriptor analysis on > 300,000 compounds tested vs TB
Dataset
    MWT               logP            HBD             HBA            RO5             Atom count       PSA                RBN
MLSMR, active ≥ 90% inhibition at 10uM (N = 4096)
    357.10 (84.70)    3.58 (1.39)     1.16 (0.93)     4.89 (1.94)    0.20 (0.48)     42.99 (12.70)    83.46 (34.31)      4.85 (2.43)
MLSMR, inactive < 90% inhibition at 10uM (N = 216367)
    350.15 (77.98)**  2.82 (1.44)**   1.14 (0.88)     4.86 (1.77)    0.09 (0.31)**   43.38 (10.73)    85.06 (32.08)*     4.91 (2.35)
TAACF-NIAID CB2, active ≥ 90% inhibition at 10uM (N = 1702)
    349.58 (63.82)    4.04 (1.02)     0.98 (0.84)     4.18 (1.66)    0.19 (0.40)     41.88 (9.44)     70.28 (29.55)      4.76 (1.99)
TAACF-NIAID CB2, inactive < 90% inhibition at 10uM (N = 100,931)
    352.59 (70.87)    3.38 (1.36)**   1.11 (0.82)**   4.24 (1.58)    0.12 (0.34)**   42.43 (8.94)*    77.75 (30.17)**    4.72 (1.99)
Ekins et al., Trends in Microbiology 19: 65-74, 2011
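As a rough illustration of how an RO5 column can be derived from the other descriptors, the sketch below counts Lipinski Rule of 5 violations per compound and summarises a group as a mean with standard deviation. The `Descriptors` class, the function names and the three example compounds are hypothetical and are not taken from the datasets above; reading the table entries as mean (standard deviation) is also an assumption.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Descriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    mwt: float   # molecular weight
    logp: float  # calculated logP
    hbd: int     # hydrogen bond donors
    hba: int     # hydrogen bond acceptors


def ro5_violations(d: Descriptors) -> int:
    """Count Lipinski Rule of 5 violations: MWT > 500, logP > 5, HBD > 5, HBA > 10."""
    return sum([d.mwt > 500, d.logp > 5, d.hbd > 5, d.hba > 10])


def mean_sd(values):
    """Return (mean, sample standard deviation) of a sequence of numbers."""
    return mean(values), stdev(values)


# Made-up compounds, for illustration only.
compounds = [
    Descriptors(mwt=357.1, logp=3.6, hbd=1, hba=5),
    Descriptors(mwt=520.4, logp=5.3, hbd=2, hba=6),
    Descriptors(mwt=298.3, logp=2.1, hbd=0, hba=4),
]
violations = [ro5_violations(c) for c in compounds]
avg, sd = mean_sd(violations)
print(f"RO5 violations per compound: {violations}")
print(f"mean (SD): {avg:.2f} ({sd:.2f})")
```

In practice the descriptors would come from a cheminformatics toolkit rather than being typed in by hand; only the counting and summarising step is shown here.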
Fitting into the drug discovery process
BMGF 3 Academia/ Govt lab – Industry screening partnerships
CDD used for data sharing / collaboration, along with cheminformatics expertise
Previously supported larger groups of labs; many continued as customers
CDD is a partner on a 5 year project supporting >20 labs and providing cheminformatics support
Already found hits for a TB target using docking
www.mm4tb.org (More Medicines for Tuberculosis)
Bayesian Classification TB Models
Dataset (number of molecules)
    External ROC Score   Internal ROC Score   Concordance     Specificity     Sensitivity
MLSMR, all single point screen (N = 220463)
    0.86 ± 0             0.86 ± 0             78.56 ± 1.86    78.59 ± 1.94    77.13 ± 2.26
MLSMR, dose response set (N = 2273)
    0.73 ± 0.01          0.75 ± 0.01          66.85 ± 4.06    67.21 ± 7.05    65.47 ± 7.96
We can use the public data for machine learning model building
Using Discovery Studio Bayesian model
Leave out 50% x 100
Ekins et al., Mol BioSyst, 6: 840-851, 2010
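The statistics in the table above can be sketched in a few lines. The code below is not the Discovery Studio Bayesian model; it only shows how a ROC score, concordance, specificity and sensitivity might be computed from labels and model scores, and how repeated 50% hold-out splits could be generated. Reading "Leave out 50% x 100" as 100 random half/half splits is an assumption, and all function names and example values are hypothetical.

```python
import random


def roc_auc(labels, scores):
    """ROC area via pairwise comparison of actives and inactives; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(labels, predicted):
    """Return (concordance, specificity, sensitivity) as percentages."""
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 0)
    concordance = 100.0 * (tp + tn) / len(labels)
    specificity = 100.0 * tn / (tn + fp)
    sensitivity = 100.0 * tp / (tp + fn)
    return concordance, specificity, sensitivity


def leave_out_half_splits(n_items, n_repeats=100, seed=0):
    """Yield (train_indices, test_indices) for repeated random 50% hold-outs."""
    rng = random.Random(seed)
    indices = list(range(n_items))
    half = n_items // 2
    for _ in range(n_repeats):
        rng.shuffle(indices)
        yield sorted(indices[:half]), sorted(indices[half:])


# Toy data: 1 = active, 0 = inactive; scores are made-up model outputs.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
predicted = [1 if s >= 0.6 else 0 for s in scores]  # arbitrary cutoff
print("ROC score:", roc_auc(labels, scores))
print("concordance, specificity, sensitivity:", confusion_metrics(labels, predicted))
print("number of splits:", sum(1 for _ in leave_out_half_splits(len(labels))))
```

Averaging these metrics over the splits would give a mean and spread per model, which is one plausible way to arrive at the "±" values reported in the table.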
Additional test sets
100K library: 1702 hits in >100K cpds
Novartis data: 34 hits in 248 cpds
FDA drugs: 21 hits in 2108 cpds
Suggests models can predict data from the same and independent labs
Initial enrichment enables screening few compounds to find actives
Ekins and Freundlich, Pharm Res, 28: 1859-1869, 2011. Ekins et al., Mol BioSyst, 6: 840-851, 2010
Searching for TB molecular mimics; collaboration
Lamichhane G, et al Mbio, 2: e00301-10, 2011
Modeling – CDD
Biology – Johns Hopkins
Chemistry – Texas A&M
Azaserine exhibited a good fit for this pharmacophore, as judged by its quantitative FitValue (= 2.1) and visual inspection.
CDD and SRI STTR collaboration
CDD: Literature data on molecules and their targets. Similarity search with a mimic enables target fishing
SRI: Pathway data (targets). Species differences in pathways. Where to intervene
Combine the knowledge: select new targets, take mimic strategy
Aim 1: Develop API to link CDD and SRI databases
Aim 2: Target and compound data added to pathway model
Aim 3: Identify new targets for drugs
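The idea of target fishing by similarity search can be sketched as follows: compare a mimic against molecules with known targets and report the targets of the most similar ones. This is a minimal sketch, assuming fingerprints are available as sets of feature identifiers; it does not use the CDD or SRI APIs, and the compound names, feature sets, targets and threshold are all hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of feature ids."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def target_fishing(query_fp, library, threshold=0.5):
    """Return (similarity, compound, target) for library entries at or above threshold.

    library is a list of (compound_name, fingerprint, annotated_target) tuples.
    Results are sorted from most to least similar.
    """
    hits = [(tanimoto(query_fp, fp), name, target) for name, fp, target in library]
    return sorted((h for h in hits if h[0] >= threshold), reverse=True)


# Hypothetical annotated library and query mimic, for illustration only.
library = [
    ("cpd_A", {1, 2, 3, 4}, "target_X"),
    ("cpd_B", {2, 3, 4, 5, 6}, "target_Y"),
    ("cpd_C", {7, 8, 9}, "target_Z"),
]
mimic = {1, 2, 3, 5}
for similarity, name, target in target_fishing(mimic, library):
    print(f"{name}: similarity {similarity:.2f}, suggested target {target}")
```

A real search would use fingerprints computed by a cheminformatics toolkit; the set arithmetic stays the same.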
Mimic strategy
1. The enzymes around these metabolites are "in vivo essential".
2. These enzymes have no human homolog.
3. These enzyme targets are not yet explored though some enzymes from the same pathways are drug targets (experimental or predicted).
Identification of essential in vivo enzymes of Mtb
Analysis of metabolic pathway and reaction information for the essential enzymes
Comparison of non-human-homologue enzymes with Mtb in vivo essential gene set
Selection of targets: in vivo essential, not homologous to human and not known as TB drug targets
In silico design of small molecule inhibitors or pharmacophores for selected enzyme targets
In vitro testing of selected pharmacophores
(Flowchart box labels: CDD, SRI, SRI, SRI, SRI, SRI)
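The target selection criteria (in vivo essential, not homologous to human, not already a known TB drug target) amount to simple set operations. The sketch below illustrates this under the assumption that each criterion is available as a list of gene identifiers; the function name and the gene lists are placeholders, not data from the study.

```python
def select_targets(essential_in_vivo, no_human_homolog, known_drug_targets):
    """Keep genes that are in vivo essential and lack a human homolog,
    then drop those already known as TB drug targets."""
    candidates = set(essential_in_vivo) & set(no_human_homolog)
    return sorted(candidates - set(known_drug_targets))


# Placeholder gene identifiers, for illustration only.
essential_in_vivo = ["geneA", "geneB", "geneC", "geneD"]
no_human_homolog = ["geneB", "geneC", "geneD", "geneE"]
known_drug_targets = ["geneD"]

print(select_targets(essential_in_vivo, no_human_homolog, known_drug_targets))
```

Keeping each criterion as a separate input makes it easy to swap in a different essentiality dataset or homology cutoff without changing the selection logic.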
Leverages work of Lamichhane et al., Sassetti et al.
Approach taken similar to that of Lamichhane et al. (Mbio paper 2011), but instead mimic the substrate
Uses data from SRI and CDD databases to select targets that have not been exploited with small molecules
The cellular overview diagram for M. tuberculosis H37Rv, from the TBCyc database (http://tbcyc.tbdb.org/index.shtml)
TBCyc gave a total of 53 non-redundant pathways for the set of 314 essential in vivo genes.
Sarker et al., Pharm Res 2012, in press
Venn diagram shows the degree of association between the in vivo mutants of Mtb in different animal models
Sarker et al., Pharm Res 2012, in press
Anishetty et al
Sassetti et al
185 proteins from Mtb absent in human
49 proteins unique to Mtb
Among 314 essential in vivo proteins of Mtb, 66 proteins were non-human homologs
C.M. Sassetti et al., Molecular Microbiology 48: 77-84 (2003).
S. Anishetty et al., Comput Biol Chem 29: 368-378 (2005).
https://www.collaborativedrug.com/buzz/2011/05/02/new-tb-targets-and-molecules-data-available-for-public-access-use/
TB target database for in vivo essential genes.
14 known gene targets and 31 predicted gene targets for the 35 already approved TB drugs
TB molecules with activity in vitro and target information (from CDD), now with added external links to pathways, literature etc.
TB molecules and target information database connects molecule, gene, pathway and literature
Targets, metabolites and pathways pursued in this study
Essential Gene    Pathway                                                Essential Substrate/s
bioB (Rv1589)     Biotin biosynthesis                                    dethiobiotin
thiE (Rv0414c)    Thiamine biosynthesis                                  2-(4-methylthiazol-5-yl)ethyl phosphate and
                                                                         [(4-amino-2-methyl-pyrimidin-5-yl)methoxy-oxido-phosphoryl] phosphate
cysE (Rv2335)     Cysteine biosynthesis                                  L-serine and acetyl-CoA
cobC (Rv2231c)    No pathway assigned                                    L-threonine O-3-phosphate
glpX (Rv1099c)    Glycolysis and gluconeogenesis                         D-fructose 1,6-bisphosphate
ppgK (Rv2702)     Amino sugar and nucleotide sugar metabolism;           β-D-glucose
                  gluconeogenesis
arcA (Rv1001)     Arginine degradation V (arginine deiminase pathway)    L-arginine
panD (Rv3601c)    β-alanine biosynthesis IV                              L-aspartate
otsA (Rv3490)     Trehalose biosynthesis I                               UDP-D-glucose and α-D-glucose 6-phosphate
Sarker et al., Pharm Res 2012, in press
Pharmacophore developed (using Accelrys Discovery Studio) from 3D conformations of the substrate
van der Waals surface for the metabolite mapped onto it
Pharmacophore plus shape searched in 3D compound databases from vendors
In silico hits collated
Filtered for TB whole cell activity and reactivity
Compounds filtered based on Bayesian score, using models derived from NIAID / Southern Research Inst data, to retrieve ideal molecular properties for in vitro TB activity
Example of mimic strategy for bioB (Rv1589)
Biotin biosynthesis: dethiobiotin
Take substrate and generate 3D conformers and build a pharmacophore
Use the pharmacophore to search vendor libraries in 3D
Buy and test compounds
Searching Maybridge (57K) gives 72 molecules, many of them hydrophobic, so they stand a chance of in vitro activity
Substrate Pharmacophores Developed for Mtb Enzymes (figure panels a. to l.)
Sarker et al., Pharm Res 2012, in press
Green = hydrogen bond acceptor, purple = hydrogen bond donor, cyan = hydrophobe, grey = van der Waals surface
Two Proposed Mimics of D-fructose 1,6-bisphosphate
DFP000133SC MIC 40μg/ml
DFP000134SC MIC 20μg/ml
Computationally searched >80,000 molecules, narrowed to 842 hits; tested 23 compounds in vitro (3 picked as inactives), leading to 2 proposed as mimics of D-fructose 1,6-bisphosphate
Sarker et al., Pharm Res 2012, in press
Proposed generalized workflow for molecule discovery
1. Find candidate genes coding potential targets.
   1. Choose pathogen
   2. Search for genes
      choose source: experimental in vitro/ex vivo data, in silico (single/double) knockout (choose nutrient set, survival conditions)
      choose filter (no human ortholog, ..., user edit)
   Output: target candidate list (gene names associated with reference identifier).
2. Prioritize target candidate list.
   1. Annotate (choose properties: pathways, reactions, EC#, GO characterization)
   2. Filtering (choose thresholds)
   3. Sort (choose criteria: number of pathways, number of reactions, ...)
   4. Annotate reaction substrates with structure information.
   Output: Prioritized target list annotated with prioritizing properties and associated reactions with their substrates annotated with structure (these are the candidate molecules to mimic).
   Metabolites (and metadata, required as sdf file for software)
3. For each candidate molecule develop pharmacophore model that suggests mimics.
   1. Develop pharmacophore models from metabolites
   2. Search known drug databases for compounds mapping to pharmacophore
   3. Filter based on ADME/Tox properties
   4. Filter based on other models for target bioactivity
   5. Sorting or Pareto optimization of results
   Output: Pharmacophores and candidate mimics for substrates of target enzymes
   Molecule id, source
4. Submit top mimics for preliminary experimental validation and lead optimization
   1. Select molecules from step 3
   2. Order from vendor
   3. Test in vitro / ex vivo
   4. Add results to CDD database
   5. Prioritize compounds for lead optimization / in vivo studies
   6. Partnering with 3rd party for preclinical / clinical studies
   Output: Experimental results to be fed into the CDD database
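The "Sorting or Pareto optimization of results" step in the workflow can be illustrated with a small sketch. It keeps only candidates that are not dominated on two criteria, here a pharmacophore fit value and a Bayesian score, both treated as higher-is-better. The choice of these two criteria, the function name and the example molecules are assumptions made for illustration.

```python
def pareto_front(candidates):
    """Return candidates not dominated by any other candidate.

    Each candidate is (mol_id, fit_value, bayesian_score); higher is better for both.
    A candidate is dominated if another is at least as good on both criteria
    and strictly better on at least one.
    """
    def dominates(a, b):
        return a[1] >= b[1] and a[2] >= b[2] and (a[1] > b[1] or a[2] > b[2])

    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Hypothetical candidate mimics with made-up scores.
candidates = [
    ("mol_1", 2.1, 0.3),
    ("mol_2", 1.4, 0.9),
    ("mol_3", 1.0, 0.2),   # worse than mol_1 on both criteria
    ("mol_4", 1.4, 0.5),   # same fit as mol_2 but lower Bayesian score
]
for mol_id, fit, score in pareto_front(candidates):
    print(mol_id, fit, score)
```

Compared with sorting on a single score, a Pareto front avoids having to pick weights for combining the criteria, at the cost of returning a set rather than a ranked list.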
Summary
POC took < 6 months; submitted Phase II STTR
Still need to test vs target: verify it hits the suggested target, then optimize cpds
Need to link SRI and CDD databases via API (new product)
• Computational models based on whole cell TB data could improve efficiency of screening
• Collaborations get us to interesting compounds quickly
• Additional prospective validation ongoing with IDRI, Southern Research Institute and UMDNJ using machine learning models, testing small numbers of compounds
• UMDNJ: mined GSK malaria public data, scored with Bayesian models, ordered from vendors
Bayesian Machine Learning Models Improve Hit Rates
Example 1. Kinase library
Example 2. Asinex library: ranked Asinex 25K library with dose response model; 99 screened, 16 cpds were identified with IC50 < 100uM
Example 3. IDRI: 3 models, 48 compounds tested, 11 with MIC ≤ 10uM (22.9% hit rate)
Example 4. UMDNJ: 1 model, 4 tested, 3 active (1 with MIC < 0.125ug/ml)
Compare with HTS screening:
Library size    Number of hits    Hit rate (%)    Notes                            Reference
100997          1782              1.76            Diverse library                  Ananthan
215110          3817              1.77            Diverse library                  Maddry
25671           1329              5.18            Human kinase focussed library    Reynolds
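The hit rates quoted on this slide follow from simple arithmetic, sketched below using the counts given above. The fold comparison against the diverse-library HTS rate is an illustrative calculation added here, not a figure from the slides, and it ignores differences in assay and activity cutoff between the screens.

```python
def hit_rate(hits, tested):
    """Hit rate as a percentage of compounds tested."""
    return 100.0 * hits / tested


# Counts taken from the slide: (hits, compounds tested).
screens = {
    "HTS diverse library (Ananthan)": (1782, 100997),
    "HTS diverse library (Maddry)": (3817, 215110),
    "HTS kinase focussed library (Reynolds)": (1329, 25671),
    "Asinex, ranked by dose response model": (16, 99),
    "IDRI, 3 models": (11, 48),
    "UMDNJ, 1 model": (3, 4),
}

baseline = hit_rate(*screens["HTS diverse library (Ananthan)"])
for name, (hits, tested) in screens.items():
    rate = hit_rate(hits, tested)
    print(f"{name}: {rate:.2f}% ({rate / baseline:.1f}x the diverse-library HTS rate)")
```

With only 4 to 99 compounds tested per model-guided example, these percentages carry wide uncertainty and are best read as indicative.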
What next - Apps for collaboration ODDT – Open drug discovery teams
Flipboard-like app for aggregating social media for diseases etc
Alex Clark, Molecular Materials Informatics, Inc
Williams et al DDT 16:928-939, 2011
Clark et al submitted 2012
Ekins et al submitted 2012
Acknowledgments: collaborators (Allen Casey, Robert Reynolds, etc.)
Alex Clark (Molecular Materials Informatics, Inc)
Accelrys
CDD
Funding BMGF
Award Number R41AI088893 from the National Institute Of Allergy And Infectious Diseases.
Email: [email protected]
Slideshare: http://www.slideshare.net/ekinssean
Twitter: collabchem
Blog: http://www.collabchem.com/
Website: http://www.collaborations.com/CHEMISTRY.HTM