Lipinski in silico drug discovery durham nc 2014

From a computational prediction paper to amusing, wide ranging, slightly dangerous and provocative thoughts Christopher A. Lipinski Scientific Advisor Melior Discovery [email protected] 1

Transcript of Lipinski in silico drug discovery durham nc 2014

Page 1: Lipinski in silico drug discovery durham nc 2014

From a computational prediction paper to amusing, wide ranging, slightly dangerous and provocative thoughts

Christopher A. Lipinski

Scientific Advisor

Melior Discovery

[email protected]


Page 2: Lipinski in silico drug discovery durham nc 2014

Do drug structure networks map on biology networks?


Page 3: Lipinski in silico drug discovery durham nc 2014

Chemistry drug class network


Page 4: Lipinski in silico drug discovery durham nc 2014

Network comparison conclusions

• “A startling result from our initial work on pharmacological networks was the observation that networks based on ligand similarities differed greatly from those based on the sequence identities among their targets.”

• “Biological targets may be related by their ligands, leading to connections unanticipated by bioinformatics similarities.”


Page 5: Lipinski in silico drug discovery durham nc 2014

What is going on?

• Old maxim: Similar biology implies similar chemistry

• If strictly true biology and chemistry networks should coincide
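One way to make the "should coincide" test concrete is to compare the edge sets of the two networks. The sketch below is a toy illustration only, not the method of the quoted paper: the target names and edges are hypothetical placeholders, and Jaccard overlap of undirected edges is an assumed choice of metric.

```python
def edge_set(pairs):
    """Normalize undirected edges so (a, b) and (b, a) compare equal."""
    return {frozenset(pair) for pair in pairs}


def jaccard_overlap(edges_a, edges_b):
    """Fraction of edges shared by two networks (1.0 means they coincide)."""
    a, b = edge_set(edges_a), edge_set(edges_b)
    union = a | b
    if not union:
        return 1.0  # two empty networks coincide trivially
    return len(a & b) / len(union)


# Hypothetical toy networks over placeholder targets T1..T4.
ligand_network = [("T1", "T2"), ("T2", "T3"), ("T1", "T4")]    # linked by ligand similarity
sequence_network = [("T2", "T1"), ("T3", "T4")]                # linked by sequence identity

print(jaccard_overlap(ligand_network, sequence_network))
```

If the old maxim held strictly, the overlap would be 1.0. A low value means that many target pairs are related by their ligands but not by their sequences, or the reverse.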


Page 6: Lipinski in silico drug discovery durham nc 2014

Network comparisons – meaning?

• “Structure of the ligand reflects the target”

• Evolution selects target structure to perform a useful biological function

• Useful target structure is retained against a breadth of biology

• Conservation in chemistry binding motifs

• Conservation in motifs where chemistry binding is not evolutionarily desired

–e.g. protein–protein interactions


Page 7: Lipinski in silico drug discovery durham nc 2014

Evolution in the ligand

• Chemo-attractant gradient

• Evolves into ligand signaling

–e.g. histidine, tryptophan

–histidine → histamine

–tryptophan → serotonin

• ligand synthesis, transport, degradation

• Easier to repurpose existing ligand than to evolve a new one


Page 8: Lipinski in silico drug discovery durham nc 2014

Evolution in the target

• Evolution works on what pre-exists

• Proteins cannot dense pack completely

–tension between density and H-bonding

• Cavities, clefts exist de-novo

–ability to catalyze exists de-novo

• Proteins associate de-novo

–tension in self association vs other association

–fibrils, plaques represent a limit


Page 9: Lipinski in silico drug discovery durham nc 2014

Re-use of ligand motifs

• Protein-small molecule association

–e.g. enzymes, kinases

–extensive experience

–frequently - new biology from old motifs

• Protein-protein association

–only recent limited experience

–evolution disfavors small ligand

–ligand motif patterns unknown


Page 10: Lipinski in silico drug discovery durham nc 2014

Hit / lead implications

• You have a screening hit. SAR on the historical chemistry of your hit can be useful even if it comes from a different biology area

• Medicinal chemistry principles outside of your current biology target can be extrapolated to the ligand chemistry (but not biology) of the new target


Page 11: Lipinski in silico drug discovery durham nc 2014

NIH Molecular Library probes

• Small molecules are critical to probing biology

• Major NIH funding effort

• Probe program in 2014 is in its 5th year

• Over 320 ML probes disclosed via:

–NIH web based “probe book”

–disclosed in peer reviewed publications

• Probes are biology rich but chemistry sparse

• A public private database problem

–connectivity to chemistry literature is poor


Page 12: Lipinski in silico drug discovery durham nc 2014

Why are NIH probes “sparse” in chemistry?

• You have a screening hit. SAR on the historical chemistry of your hit can be useful even if it comes from a different biology area

• Medicinal chemistry principles outside of your current biology target can be extrapolated to the ligand chemistry (but not biology) of the new target

• Must be able to connect ML probe chemistry structure to the medicinal chemistry literature


Page 13: Lipinski in silico drug discovery durham nc 2014

NIH Probes for Data Analysis


https://www.collaborativedrug.com/pages/public_access

Page 14: Lipinski in silico drug discovery durham nc 2014

14

Page 15: Lipinski in silico drug discovery durham nc 2014

Hidden NIH ML probe spreadsheet


WebTable_121012.xlsx was discovered at http://mli.nih.gov/mli/mlp-probes-2/?dl_id=1352

Page 16: Lipinski in silico drug discovery durham nc 2014

NIH Probe Report Book


http://www.ncbi.nlm.nih.gov/books/NBK47352/

Page 17: Lipinski in silico drug discovery durham nc 2014

High Quality Data Rich Format


Page 18: Lipinski in silico drug discovery durham nc 2014

ML370 is CID 70680248
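Once a probe is tied to a PubChem CID, its public identifiers can be pulled programmatically. The following is a minimal sketch using the PubChem PUG REST property endpoint; the function names and the choice of properties are illustrative assumptions, not part of the original slides. The lookup returns PubChem's own record only and does not bridge to CAS Registry.

```python
import json
import urllib.request

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def property_url(cid, properties=("InChIKey", "MolecularFormula")):
    """Build the PUG REST URL that returns selected properties for one CID."""
    return f"{PUG_REST}/compound/cid/{int(cid)}/property/{','.join(properties)}/JSON"


def fetch_properties(cid, properties=("InChIKey", "MolecularFormula"), timeout=30):
    """Fetch the properties over the network (needs internet access)."""
    with urllib.request.urlopen(property_url(cid, properties), timeout=timeout) as resp:
        data = json.load(resp)
    # PUG REST wraps property results in PropertyTable -> Properties (a list).
    return data["PropertyTable"]["Properties"][0]


if __name__ == "__main__":
    # Only prints the URL; call fetch_properties(70680248) to query PubChem.
    print(property_url(70680248))
```

An InChIKey obtained this way is a public, structure-derived identifier that can then be used to search other public databases for the same structure.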


Page 19: Lipinski in silico drug discovery durham nc 2014

Structure in CAS SciFinder©


Page 20: Lipinski in silico drug discovery durham nc 2014

Search for exact structure


Page 21: Lipinski in silico drug discovery durham nc 2014

ML370 not in CAS Registry


Is this an outlier?

How many ML probes have no CAS Registry Number©?

Page 22: Lipinski in silico drug discovery durham nc 2014

MLPCN probes and CAS

• 19 of 308 ML probes have no CAS Regno©

• CAS does not index PubChem CID number

• CAS does not index NIH ML probe number

• Even if in the title of a published journal paper

--------------------------------------------------------------

• The proprietary world of CAS and the public world of PubChem, ChEMBL, UniChem, ChemSpider etc. DO NOT communicate

• Public Be Aware! Potential IP consequences

Page 23: Lipinski in silico drug discovery durham nc 2014

What is required for “Prior Art”

• Chemical structure must be disclosed

• Method of synthesis must be disclosed

• Is a chemical structure without data prior art?

• 70 M chemistry structures in public databases

• 35 M chemistry structures have no data

• Computer generation of chemistry structures up to 13 heavy atoms – billions of structures

• Computer generation of synthetic schemes


Page 24: Lipinski in silico drug discovery durham nc 2014

Focus on Reliability

Page 25: Lipinski in silico drug discovery durham nc 2014

Chemistry Reliability

• Chemistry “known”, e.g. PubChem CID vs. SID

–SID is substance

–CID is chemistry

• Frequent HTS hitter, e.g. BADAPPLE

–privileged (sub)structure

• Flawed chemistry structure, e.g. PAINS

–worthless in drug discovery

• Confidence in Chemistry (CiC) is MISSING

–index of chemical uncertainty in biology testing

Page 26: Lipinski in silico drug discovery durham nc 2014

Confidence in Chemistry CiC

• Chemistry can be drawn, named, indexed

–compounds exist in databases and get tested

• Uncertainty exists about the exact structure

–hence reduced CiC

• Two main causes of reduced CiC

–geometric isomerism, e.g. E, Z olefins; E, Z oximes; cis, trans; exo, endo

–chirality; relative or absolute

–mostly tetrahedral carbon, rarely other

Page 27: Lipinski in silico drug discovery durham nc 2014

Confidence in Chemistry Calculation

• CiC = 1 / number of unique structures consistent with the depiction

• Examples

• Single structure is 1 / 1 = 1

• 1 chiral center racemate is 1 / 2 = 0.5

• EZ mixture is 1 / 2 (geometric isomers) = 0.5

• 2 chiral center racemate is 1 / (2 x 2) = 0.25

• 1 chiral center racemic EZ mixture is 1 / 4 = 0.25
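The arithmetic in these examples can be written as a small function. This is a minimal sketch under a stated assumption that is not in the slides: every undefined stereocenter or undefined double bond is treated as independently doubling the number of candidate structures, and symmetry (for example meso forms) is ignored. The function and argument names are illustrative.

```python
from fractions import Fraction


def confidence_in_chemistry(undefined_stereocenters=0, undefined_double_bonds=0):
    """Return CiC = 1 / (number of structures the depiction could represent).

    Assumption: each undefined stereo element independently doubles the
    number of candidate structures; symmetry (e.g. meso forms) is ignored.
    """
    if undefined_stereocenters < 0 or undefined_double_bonds < 0:
        raise ValueError("counts must be non-negative")
    n_structures = 2 ** (undefined_stereocenters + undefined_double_bonds)
    return Fraction(1, n_structures)


# (undefined stereocenters, undefined double bonds) for the slide's examples.
slide_examples = {
    "single structure": (0, 0),
    "1 chiral center racemate": (1, 0),
    "EZ mixture": (0, 1),
    "2 chiral center racemate": (2, 0),
    "1 chiral center racemic EZ mixture": (1, 1),
}

for label, (centers, bonds) in slide_examples.items():
    cic = confidence_in_chemistry(centers, bonds)
    print(f"{label}: {cic} = {float(cic)}")
```

In practice the two counts would have to come from inspecting the drawn structure or from a cheminformatics toolkit; the function only encodes the 1 / N rule.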

Page 28: Lipinski in silico drug discovery durham nc 2014

Confidence in Chemistry - Why care?

• Chemist does not care - already very familiar

• Biologist does not care - ignorance

–does not understand chemistry concepts

–CiC concept is orthogonal to biology

–acronym designations dominate biology literature

• Chemoinformatics person does not care

–chemistry fuzziness resolved via InChI

–layered concept captures knowledge state

• Chemical biologist should care but may not

Page 29: Lipinski in silico drug discovery durham nc 2014

Low CiC exists in Chemical Biology

• 328 NIH Molecular Library ML probes

• publicly available from NIH or CDD

• 42 probes had low CiC from drawn structure

• 32 probes had OK CiC from drawn structure

• BUT

• I had reason to check the chemistry

• Most were problems of geometric isomerism

• ML096 is an example

• CiC is not a chemistry quality index. It measures uncertainty as to the chemical causing the biology

Page 30: Lipinski in silico drug discovery durham nc 2014

ML096 ambiguous chemistry

CAS 1186646-97-8 “Z” geometry

No literature on ML096 except in NIH MLP probe report

No compounds in CAS SciFinder© very similar to ML096

No literature teaching about how the chemistry proceeds

ML096 might be this compound. Cannot tell from available data. So Confidence in Chemistry (CiC) is lower than implied by depicted structure

Page 31: Lipinski in silico drug discovery durham nc 2014

ML096 Spectra uninformative

Phosphomannose Isomerase Inhibitor

Page 32: Lipinski in silico drug discovery durham nc 2014

Can Confidence in Chemistry be calculated?

• From chemical structure as drawn?

• From InChI? From SMILES?

• This would give an upper limit to CiC

• Real CiC might be lower

• If CiC could be calculated would it be useful?

--------------------------------------------------------------

• What did I learn from the MLP probes?

• Chemical in Chemical Biology is the weaker sister

Page 33: Lipinski in silico drug discovery durham nc 2014

Scientific Summary

• Literature chemical searches on leads have value even if the prior art biology has nothing in common with the new biology

• NIH Molecular probes: Need access to chemistry literature to fully capitalize on value

• Databases are being flooded with virtual compounds – poisoning of prior art?

• Confidence in Chemistry (CiC). Uncertainty in the “chemical” in chemical biology needs attention. Computational opportunity?
