Vanderwall cheminformatics Drexel Part 1

Cheminformatics & the evolving relationship between data in the public domain & pharma

Dana Vanderwall, Cheminformatics, Bristol-Myers Squibb

How do we start to find a new chemical that might be the next drug?

Typically: need a specific protein to target that we think we can use to fix the problem that causes the disease. Caveat: emerging trends (& what’s old is new again)

Need to design experiments that test for chemicals that can fit that protein (lock & key)

Thousands to >2 million chemicals are tested with that protein to look for a starting point

This is where drug discovery gets really modern: highly automated robots and informatics can do in 1 week work that used to take years

Compound optimization: compounds are optimized for many parameters, including potency, selectivity, oral bioavailability, and safety

1-3 years

>2 million compounds tested in primary assay

Make another 2-10,000

Getting ready for the clinic
• All compounds are tested for safety in animals
• Need to prove we can give enough to get the positive benefit without side effects
• We have to be able to make it on a scale and in a form suitable for dosing in the clinic
• The early stages need milligrams or grams (tablespoons)
• To start testing in humans requires many kilograms of very, very pure material

1-3 years

2-3 years

Profiling Assays and Lead Op Progression

[Figure: number of assays (# Assays: 1, 2-5, 10-50, 300-600) across the stages Target to Hit, Hit to Lead, and Lead to Candidate, with Hit Triage, Early Lead Op, and Late Lead Op marked]

Chemical Structures are the Intellectual Property
• The targets exist in nature; chemical structures are the unique component that pharma & biotech can bring to the table (biologicals are increasing in importance)
• As such, the structures, and their biological activity, are extremely sensitive
• Captured in the patents filed
• Never disclosed until protected
• Even similarity/sub-structure searches on public sites are treated cautiously

GlaxoSmithKline moves to stimulate public-private partnerships for R&D in neglected tropical diseases

http://www.gsk.com/responsibility/access/rnd-neglected-tropical-diseases.htm

GSK launched the open lab at Tres Cantos as one way in which to share our expertise and seek to stimulate open innovation in drug discovery into diseases of the developing world (DDW)
• 60 slots for scientists
• Access to screening facility & TC staff scientists to support collaborations
• $5M GBP facilities expansion
• Committed to sharing data & IP on GSK research in DDW
• Starting with recently generated novel anti-malarial hits

Malaria

Mosquito-borne infectious disease, caused by the Plasmodium parasite

250 M cases/annum, 1-3 M deaths

Variety of drugs available, but resistance is a constant problem

http://www.mcwhealthcare.com/malaria_drugs_medicines/life_cycle_of_plasmodium.htm

The assumption is: one target, one consequence

The Complexity of Cell Biology

Target

In reality: this target is one component of a complicated biochemical network.

• A selective probe may influence many pathways.

• Probes can interact with multiple targets.

• Network interactions can be redundant.

• Biological effects are often a consequence of interaction with multiple targets.

Target

Emerging paradigm: look for the cellular activity first
• Advances in cell biology & the HTS platforms are enabling HTS for a cellular phenotype
• Start with something that works in a cellular model for disease phenotype (a.k.a. black box), then figure out how it works
• Target deconvolution

Supporting black-box HTS for anti-malarials
• 2M compound GSK HTS collection screened @ 2 µM vs. P. falciparum (3D7) infected human erythrocytes
• 12 mos. screening in biohazard lab
• Avg. Z’ = 0.7
• 19,451 primary hits; inh. parasite growth >80%; 13,533 confirmed via retests
• 1,982 showed cytotox in HepG2s @ 10 µM
• None active in cell background control
• 8,000 also active against DD2 (multi-drug resistant strain) >50%

F-J Gamo et al. Nature 465, 305-310 (2010) doi:10.1038/nature09107
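The Z’ value quoted for the screen is an assay-quality statistic computed from control wells, commonly defined as Z’ = 1 - 3(σ_pos + σ_neg) / |μ_pos - μ_neg|. A minimal sketch of that calculation follows; the control readings are made-up illustrative numbers, not data from the screen, and the function name is hypothetical.

```python
from statistics import mean, stdev

def z_prime(pos_controls, neg_controls):
    """Z' factor from positive- and negative-control well readings."""
    # Three standard deviations on each side of the control means
    spread = 3 * (stdev(pos_controls) + stdev(neg_controls))
    # Separation between the two control means (the assay window)
    window = abs(mean(pos_controls) - mean(neg_controls))
    return 1 - spread / window

# Hypothetical control wells (% signal), for illustration only
pos = [98.0, 100.0, 102.0]
neg = [0.0, 2.0, 4.0]
print(round(z_prime(pos, neg), 3))
```

Values approaching 1 indicate well-separated, tight controls, so an average of 0.7 over a 12-month campaign points to a robust assay.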

Characterizing the hits
• Clustering was used to help characterize chemical space
• 416 “molecular frameworks” (Bemis & Murcko, J. Med. Chem. 39, 2887 (1996))
• 857 clusters / 1,978 singletons by Daylight FP/Tanimoto (0.85)

[Figure: three-dimensional plot of some of the novel chemical diversity present in TCAMS]
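Fingerprint clustering at a Tanimoto cutoff can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say which clustering algorithm was used, so a simple leader-style assignment is substituted here, and the fingerprints are toy sets of on-bit indices rather than Daylight fingerprints. All compound names are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)

def leader_cluster(fingerprints, threshold=0.85):
    """Assign each compound to the first cluster whose leader is similar enough.

    fingerprints: dict of compound id -> set of on-bit indices.
    Returns a list of clusters, each a list of compound ids (leader first).
    """
    clusters = []  # list of (leader fingerprint, member ids)
    for cid, fp in fingerprints.items():
        for leader_fp, members in clusters:
            if tanimoto(fp, leader_fp) >= threshold:
                members.append(cid)
                break
        else:
            # No existing leader is close enough: start a new cluster
            clusters.append((fp, [cid]))
    return [members for _, members in clusters]

# Hypothetical toy fingerprints
fps = {
    "cmpd_A": set(range(0, 20)),
    "cmpd_B": set(range(0, 19)),     # shares 19 of 20 bits with cmpd_A
    "cmpd_C": set(range(100, 120)),  # no bits in common with the others
}
groups = leader_cluster(fps)
clusters = [g for g in groups if len(g) > 1]
singletons = [g[0] for g in groups if len(g) == 1]
print(clusters, singletons)
```

Groups with more than one member correspond to the "clusters" count and single-member groups to the "singletons" count reported on the slide.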

Characterizing the hits

• Compounds with an abnormally high frequency of activity across HTS campaigns were filtered out, using the Inhibition Frequency Index (IFI):

$$\text{IFI} = \frac{\text{number of HTS screens where \% Inh.} > 50\%}{\text{total number of HTS screens}} \times 100$$

• Excluded at IFI cutoffs from 5% (where tested in >100 HTS) to 20% (where tested in >25 HTS) (~1,800 cmpds.)
• ~70 compounds clustered with known anti-malarials
• How are the rest of these compounds working?
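The Inhibition Frequency Index is the percentage of a compound's HTS screens in which it passed the 50% inhibition mark. The sketch below computes it and applies the exclusion rule. Two points are assumptions and not taken from the slides: the comparison is written as strictly greater than 50%, and the 5%/20% cutoffs are treated as two step thresholds (IFI at or above the cutoff), whereas the original rule may have been a sliding scale. The function names are hypothetical.

```python
def inhibition_frequency_index(pct_inhibitions, cutoff=50.0):
    """Percentage of HTS screens in which a compound exceeded `cutoff` % inhibition."""
    if not pct_inhibitions:
        raise ValueError("compound has no HTS results")
    n_active = sum(1 for value in pct_inhibitions if value > cutoff)
    return 100.0 * n_active / len(pct_inhibitions)

def is_frequent_hitter(pct_inhibitions):
    """Assumed two-tier reading of the exclusion rule (see lead-in)."""
    n_screens = len(pct_inhibitions)
    ifi = inhibition_frequency_index(pct_inhibitions)
    if n_screens > 100:
        return ifi >= 5.0   # heavily tested compounds: stricter cutoff
    if n_screens > 25:
        return ifi >= 20.0  # less heavily tested compounds: looser cutoff
    return False            # too few screens to judge

# Hypothetical screening histories (% inhibition per HTS campaign)
widely_tested = [10.0] * 114 + [80.0] * 6   # 6 of 120 screens active
lightly_tested = [10.0] * 27 + [80.0] * 3   # 3 of 30 screens active
print(inhibition_frequency_index(widely_tested), is_frequent_hitter(widely_tested))
print(inhibition_frequency_index(lightly_tested), is_frequent_hitter(lightly_tested))
```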

Can we leverage the historical target data on compounds?

Target assays: clear relationship between interactions and measurements, but what does it mean biologically?

Phenotypic assays: clear biological result associated with readout, but from which interaction(s)?

Can we use the data to figure out which targets lead to which biological response?

[Figure: a stimulant acting through targets kinase_1 to kinase_4, 7TM_1, 7TM_2, and NR_1 to NR_3 to produce a readout]

Can we leverage the historical target data on compounds?

• Find all target assay data for compounds tested in anti-malarial screen
• Aggregate at the target-result type level (max pIC50/pEC50)
• Of the 2M tested, 130K had some associated target assay data
• Incl. 3,435 of the 13,500 ‘actives’
• “Hits”* at 413 targets
  *pIC50 >7.0 for antag/inh/blocker
  *pEC50 >6.5 for ag/activation/opener
• Given that some targets are screened in 2-3 modes, >650 target-result type combinations
• Surely not all 400 targets are significant
• Data very sparse, avg. ~2 pXC50s per compound that had data
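The aggregation and hit-calling steps above can be sketched as follows. This is a simplified illustration: the records are hypothetical, and the result type is reduced to the labels "pIC50" (antag/inh/blocker modes) and "pEC50" (ag/activation/opener modes) so that the two thresholds from the slide can be looked up directly.

```python
from collections import defaultdict

# Activity thresholds from the slide, keyed by result type
HIT_THRESHOLDS = {"pIC50": 7.0, "pEC50": 6.5}

def aggregate_max(records):
    """Keep the maximum pXC50 per (compound, target, result type)."""
    best = defaultdict(lambda: float("-inf"))
    for compound, target, result_type, value in records:
        key = (compound, target, result_type)
        best[key] = max(best[key], value)
    return dict(best)

def call_hits(aggregated):
    """Return the keys whose aggregated value exceeds the threshold for their result type."""
    return {key for key, value in aggregated.items()
            if value > HIT_THRESHOLDS[key[2]]}

# Hypothetical assay records: (compound, target, result type, pXC50)
records = [
    ("cmpd_1", "kinase_1", "pIC50", 6.8),
    ("cmpd_1", "kinase_1", "pIC50", 7.4),  # repeat measurement; the max is kept
    ("cmpd_1", "7TM_1", "pEC50", 6.5),     # equal to the threshold, so not a hit
    ("cmpd_2", "7TM_1", "pEC50", 6.9),
]
aggregated = aggregate_max(records)
hits = call_hits(aggregated)
print(sorted(hits))
```

Taking the maximum per compound-target-result type collapses repeat measurements into one value, which suits data as sparse as ~2 pXC50s per compound.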

Finding targets ‘enriched’ among the anti-malarials
• An ‘enrichment’ was calculated for each possible target-result type combination
• Are compounds active at target X more prevalent amongst the compounds that inhibited P. falciparum, or equally distributed across all screened compounds?

For each target-result type, calculate:

$$\text{Enrichment factor} = \frac{\text{target actives among hits}}{\text{target actives in all screened compounds}} = \frac{N_{\text{hits, active}} / N_{\text{hits, measured}}}{N_{\text{all, active}} / N_{\text{all, measured}}}$$

where
• $N_{\text{hits, active}}$ = the number of antimalarial hits with activity above threshold @target
• $N_{\text{hits, measured}}$ = the number of antimalarial hits with a measured pIC50/pEC50 @target
• $N_{\text{all, active}}$ = the number of compounds from the entire screening set with activity above threshold @target
• $N_{\text{all, measured}}$ = the number of compounds from the entire screening set with a measured pIC50/pEC50 @target
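The enrichment calculation compares the fraction of target actives among the antimalarial hits with the fraction of target actives among all screened compounds that have a measurement at that target. A minimal sketch follows, with hypothetical counts and names. Returning `None` when a ratio is undefined is a design choice made here, not something stated in the talk.

```python
def enrichment_factor(hits_active, hits_measured, all_active, all_measured):
    """Ratio of the target-active fraction among hits to that in the full screening set.

    Returns None when either fraction is undefined or the background rate is zero.
    """
    if hits_measured == 0 or all_measured == 0 or all_active == 0:
        return None
    return (hits_active / hits_measured) / (all_active / all_measured)

# Hypothetical counts per (target, result type) combination
counts = {
    ("target_X", "pIC50"): dict(hits_active=30, hits_measured=100,
                                all_active=200, all_measured=2000),
    ("target_Y", "pEC50"): dict(hits_active=5, hits_measured=100,
                                all_active=100, all_measured=2000),
}

# Keep combinations enriched at least 2-fold among the antimalarial hits
enriched = {}
for key, c in counts.items():
    ef = enrichment_factor(**c)
    if ef is not None and ef >= 2.0:
        enriched[key] = ef
print(enriched)
```

In this toy data, target_X actives are about three times as common among hits as in the background, while target_Y actives are equally distributed and are dropped.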

Narrowing down the possible candidates

• ~140 targets @ ≥2-fold enrichment
• ~50 with homologues in P. falciparum

[Figure: funnel from 400 targets to those with >2-fold enrichment]

Targets with homologues in P. falciparum genome
• Aspartic protease
• β-Ketoacid reductase
• Calcium/calmodulin-dependent kinase
• Cysteine protease
• Dihydrofolate reductase
• Dihydroorotate dehydrogenase
• DNA gyrase
• Isoleucyl-tRNA synthetase
• Methionyl-tRNA synthetase
• Phenylalanyl-tRNA synthetase
• Phosphatidylinositol 3-kinase
• Plasmodium electron transport chain
• Ribosome
• Ser/Thr protein kinase
• Tyrosyl-tRNA synthetase

Targets with NO homologues in P. falciparum genome
• GPCR: Adrenergic antag
• GPCR: Cannabinoid antag
• GPCR: Chemokine antag
• GPCR: Cholinergic ag
• GPCR: Free Fatty Acid ag
• GPCR: Serotonin ag/antag
• GPCR: Opioid ag/antag
• GPCR: Peptide hormone receptor ag/antag
• Nuclear Receptor ag/antag
• Ion Channel inh
• Phospholipase inh
• Lipid amide hydrolase inh
• Serine protease inh
• Toll-like receptor ag

Data publicly available: all chemical structures and exp. data for compounds available at http://www.ebi.ac.uk/chemblntd

EXT_CMPD_NUMBER

SMILES

Percentage_inhibition_3D7

Percentage_inhibition_DD2

Percentage_inhibition_3D7_PFLDH

XC50_MOD_3D7

XC50_3D7 (µM)

Percentage_inhibition_HEPG2

Chemical cluster Nr

Graph_Frame_Cluster

Target_Hypothesis

P. falciparum locus

Commercial Supplier_Reference

For additional information or interest in additional collaborations, contact:

jose.f.garcia-bustos@gsk.com

And the raw target data used to develop hypotheses? That was trickier.
• Releasing the list of 400 targets & all the inactive compounds would reveal our whole compound collection and all the targets in the current (and past) portfolio
• Needed some level of validation for the analysis to publish

Surrogates for internal data

• Chemical structures associated with a particular target hypothesis were used as ‘bait’ to find published structures & data that validate the proposed MOA for each chemotype
• Similarity & SSS in Aureus DBs & SciFinder
• Exemplars and their similarity to original hits published in Suppl. Material with reference
• We often found our own compounds and data in J Med Chem and the patent literature.

Acknowledgements

Anti-malarial HTS

Tres Cantos Medicines Development Campus, Tres Cantos, Spain

Medicines Research Centre, Stevenage, UK

Darren VS Green

Collegeville & King of Prussia, PA, USA

Vinod Kumar, Samiul Hasan, James Brown, Catherine Peishoff, Lon Cardon

Francisco-Javier Gamo, Laura Sanz, Jaume Vidal, Cristina de Cozar, Emilio Alvarez, Jose-Luis Lavandera, Jose Garcia-Bustos
