Lipinski in silico drug discovery durham nc 2014

Post on 12-Jul-2015


From a computational prediction paper to amusing, wide-ranging, slightly dangerous and provocative thoughts

Christopher A. Lipinski

Scientific Advisor

Melior Discovery

clipinski@meliordiscovery.com

1

Do drug structure networks map on biology networks?

2

Chemistry drug class network

3

Network comparison conclusions

• “A startling result from our initial work on pharmacological networks was the observation that networks based on ligand similarities differed greatly from those based on the sequence identities among their targets.”

• “Biological targets may be related by their ligands, leading to connections unanticipated by bioinformatics similarities.”

4

What is going on?

• Old maxim: Similar biology implies similar chemistry

• If strictly true, biology and chemistry networks should coincide

5

Network comparisons – meaning?

• “Structure of the ligand reflects the target”

• Evolution selects target structure to perform a useful biological function

• Useful target structure is retained against a breadth of biology

• Conservation in chemistry binding motifs

• Conservation in motifs where chemistry binding is not evolutionarily desired

– e.g., protein-protein interactions

6

Evolution in the ligand

• Chemo-attractant gradient

• Evolves into ligand signaling

– e.g., histidine, tryptophan

– histidine → histamine

– tryptophan → serotonin

• ligand synthesis, transport, degradation

• Easier to repurpose existing ligand than to evolve a new one

7

Evolution in the target

• Evolution works on what pre-exists

• Proteins cannot dense pack completely

– tension between density and H-bonding

• Cavities, clefts exist de novo

– ability to catalyze exists de novo

• Proteins associate de novo

– tension in self-association vs. other association

– fibrils, plaques represent a limit

8

Re-use of ligand motifs

• Protein-small molecule association

– e.g., enzymes, kinases

– extensive experience

– frequently: new biology from old motifs

• Protein-protein association

– only recent, limited experience

– evolution disfavors small ligand

– ligand motif patterns unknown

9

Hit / lead implications

• You have a screening hit. SAR on the historical chemistry of your hit can be useful even if it comes from a different biology area

• Medicinal chemistry principles outside of your current biology target can be extrapolated to the ligand chemistry (but not biology) of the new target

10

NIH Molecular Library probes

• Small molecules are critical to probing biology

• Major NIH funding effort

• Probe program in 2014 is in its 5th year

• Over 320 ML probes disclosed via:

– NIH web-based “probe book”

– peer-reviewed publications

• Probes are biology rich but chemistry sparse

• A public-private database problem

– connectivity to chemistry literature is poor

11

Why are NIH probes “sparse” in chemistry?

• You have a screening hit. SAR on the historical chemistry of your hit can be useful even if it comes from a different biology area

• Medicinal chemistry principles outside of your current biology target can be extrapolated to the ligand chemistry (but not biology) of the new target

• Must be able to connect ML probe chemistry structure to the medicinal chemistry literature

12

NIH Probes for Data Analysis

13

https://www.collaborativedrug.com/pages/public_access

14

Hidden NIH ML probe spreadsheet

15

WebTable_121012.xlsx was discovered at http://mli.nih.gov/mli/mlp-probes-2/?dl_id=1352

NIH Probe Report Book

16

http://www.ncbi.nlm.nih.gov/books/NBK47352/

High Quality Data Rich Format

17

ML370 is CID 70680248

18

Structure in CAS SciFinder©

19

Search for exact structure

20

ML370 not in CAS Registry

21

Is this an outlier?

How many ML probes have no CAS Registry Number©?

MLPCN probes and CAS

• 19 of 308 ML probes have no CAS Registry Number©

• CAS does not index PubChem CID number

• CAS does not index NIH ML probe number

• Not even when these identifiers appear in the title of a published journal paper

--------------------------------------------------------------

• The proprietary world of CAS and the public world of PubChem, ChEMBL, UniChem, ChemSpider, etc. DO NOT communicate

• Public be aware! Potential IP consequences

22

What is required for “Prior Art”

• Chemical structure must be disclosed

• Method of synthesis must be disclosed

• Is a chemical structure without data prior art?

• 70 M chemistry structures in public databases

• 35 M chemistry structures have no data

• Computer generation of chemistry structures up to 13 heavy atoms – billions of structures

• Computer generation of synthetic schemes

23

Focus on Reliability

Chemistry Reliability

• Chemistry “known”, e.g., PubChem CID vs. SID

– SID is substance (a depositor's record)

– CID is compound (a unique chemical structure)

• Frequent HTS hitter, e.g., BADAPPLE

– privileged (sub)structure

• Flawed chemistry structure, e.g., PAINS

– worthless in drug discovery

• Confidence in Chemistry (CiC) is MISSING

– index of chemical uncertainty in biology testing

Confidence in Chemistry CiC

• Chemistry can be drawn, named, indexed

– compounds exist in databases and get tested

• Uncertainty exists about the exact structure

– hence reduced CiC

• Two main causes of reduced CiC

– geometric isomerism, e.g., E, Z olefins; E, Z oximes; cis, trans; exo, endo

– chirality, relative or absolute; mostly tetrahedral carbon, rarely other

Confidence in Chemistry Calculation

• 1 / number of unique structures in depiction

• Examples

• Single structure is 1 / 1 = 1

• 1 chiral center racemate is 1 / 2 = 0.5

• EZ mixture is 1 / 2 (geometric isomers) = 0.5

• 2 chiral center racemate is 1 / (2 x 2) = 0.25

• 1 chiral center racemic EZ mixture is 1 / 4 = 0.25
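The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not a published method: the function name and its two parameters are hypothetical, and it assumes that every unresolved stereocenter or double bond doubles the number of structures the depiction could stand for. That assumption ignores meso forms, molecular symmetry, and cases where relative stereochemistry is known.

```python
from fractions import Fraction


def confidence_in_chemistry(undefined_stereocenters: int = 0,
                            undefined_double_bonds: int = 0) -> Fraction:
    """Return CiC = 1 / (number of structures consistent with the depiction).

    Assumption (not from the slides): each unresolved stereo element
    contributes an independent factor of 2 to the structure count.
    """
    if undefined_stereocenters < 0 or undefined_double_bonds < 0:
        raise ValueError("counts of stereo elements cannot be negative")
    n_structures = 2 ** (undefined_stereocenters + undefined_double_bonds)
    return Fraction(1, n_structures)


# The five examples from the slide: (undefined centers, undefined double bonds)
slide_examples = {
    "single structure": (0, 0),
    "1 chiral center racemate": (1, 0),
    "EZ mixture": (0, 1),
    "2 chiral center racemate": (2, 0),
    "1 chiral center racemic EZ mixture": (1, 1),
}

for label, (centers, bonds) in slide_examples.items():
    cic = confidence_in_chemistry(centers, bonds)
    print(f"{label}: CiC = {cic} = {float(cic)}")
```

Under the stated assumption the loop reproduces the five slide values (1, 0.5, 0.5, 0.25, 0.25). Using `Fraction` keeps the result exact, which makes it easier to see the underlying structure count.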

Confidence in Chemistry: why care?

• Chemist does not care: already very familiar

• Biologist does not care: ignorance

– does not understand chemistry concepts

– CiC concept is orthogonal to biology

– acronym designations dominate biology literature

• Chemoinformatics person does not care

– chemistry fuzziness resolved via InChI

– layered concept captures knowledge state

• Chemical biologist should care but may not

Low CiC exists in Chemical Biology

• 328 NIH Molecular Library (ML) probes

• publicly available from NIH or CDD

• 42 probes had low CiC from drawn structure

• 32 probes had OK CiC from drawn structure

• BUT

• I had reason to check the chemistry

• Most were problems of geometric isomerism

• ML096 is an example

• CiC is not a chemistry quality index. It measures uncertainty as to the chemical causing the biology

ML096 ambiguous chemistry

CAS 1186646-97-8 “Z” geometry

No literature on ML096 except in NIH MLP probe report

No compounds in CAS SciFinder© very similar to ML096

No literature teaching about how the chemistry proceeds

ML096 might be this compound. Cannot tell from available data. So Confidence in Chemistry (CiC) is lower than implied by depicted structure

ML096 Spectra uninformative

Phosphomannose Isomerase Inhibitor

Can Confidence in Chemistry be calculated?

• From chemical structure as drawn?

• From InChI? From SMILES?

• This would give an upper limit to CiC

• Real CiC might be lower

• If CiC could be calculated, would it be useful?

--------------------------------------------------------------

• What did I learn from the MLP probes?

• Chemical in Chemical Biology is the weaker sister
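On the question of whether CiC could be calculated from an InChI, one possible starting point is sketched below. It rests on an assumption that should be checked against the InChI documentation: that stereo elements the depositor left unknown are flagged with `?` in the `/t` (tetrahedral) and `/b` (double bond) layers. The function names are hypothetical, and the example strings are schematic placeholders, not the InChI of any real compound. Because a structure drawn with no stereo at all usually has no stereo layers to inspect, and because a drawn geometry may itself be unverified (as with ML096), the result can only be an upper limit.

```python
from fractions import Fraction


def undefined_stereo_elements(inchi: str) -> int:
    """Count stereo elements flagged '?' in the /t and /b layers.

    Assumption: '?' marks an unknown or undefined stereo element.
    """
    count = 0
    # Layers are separated by '/'; skip the leading "InChI=1S" segment.
    for layer in inchi.split("/")[1:]:
        if layer.startswith(("t", "b")):
            count += layer.count("?")
    return count


def cic_upper_bound(inchi: str) -> Fraction:
    """Upper limit on CiC: 1 / 2**(number of flagged stereo elements)."""
    return Fraction(1, 2 ** undefined_stereo_elements(inchi))


# Schematic placeholder strings (hypothetical, NOT real compounds);
# only the /b and /t layers matter for this sketch.
PREFIX = "InChI=1S/FORMULA/cCONNECTIVITY"
examples = {
    "all stereo defined": PREFIX + "/b5-4+/t2-,3+/m0/s1",
    "one unknown center": PREFIX + "/t2-,3?/m0/s1",
    "unknown bond and center": PREFIX + "/b5-4?/t2?,3+/m0/s1",
}

for label, inchi in examples.items():
    print(f"{label}: CiC <= {cic_upper_bound(inchi)}")
```

A string-based count like this says nothing about whether the drawn geometry was ever confirmed experimentally, which is why the real CiC might be lower than the calculated one.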

Scientific Summary

• Literature chemical searches on leads have value even if the prior art biology has nothing in common with the new biology

• NIH Molecular probes: Need access to chemistry literature to fully capitalize on value

• Databases are being flooded with virtual compounds – poisoning of prior art?

• Confidence in Chemistry (CiC). Uncertainty in the “chemical” in chemical biology needs attention. Computational opportunity?

33