Lipinski in silico drug discovery durham nc 2014
From a computational prediction paper to amusing, wide ranging,
slightly dangerous and provocative thoughts
Christopher A. Lipinski
Scientific Advisor
Melior Discovery
1
Do drug structure networks map on biology networks?
2
Chemistry drug class network
3
Network comparison conclusions
• “A startling result from our initial work on pharmacological networks was the observation that networks based on ligand similarities differed greatly from those based on the sequence identities among their targets.”
• “Biological targets may be related by their ligands, leading to connections unanticipated by bioinformatics similarities.”
4
What is going on?
• Old maxim: Similar biology implies similar chemistry
• If strictly true, biology and chemistry networks should coincide
5
Network comparisons – meaning?
• “Structure of the ligand reflects the target”
• Evolution selects target structure to perform a useful biological function
• Useful target structure is retained against a breadth of biology
• Conservation in chemistry binding motifs
• Conservation in motifs where chemistry binding is not evolutionarily desired
–eg. protein – protein interactions
6
Evolution in the ligand
• Chemo-attractant gradient
• Evolves into ligand signaling
–eg. histidine, tryptophan
–histidine → histamine
–tryptophan → serotonin
• ligand synthesis, transport, degradation
• Easier to repurpose existing ligand than to evolve a new one
7
Evolution in the target
• Evolution works on what pre-exists
• Proteins cannot dense pack completely
–tension between density and H-bonding
• Cavities, clefts exist de-novo
–ability to catalyze exists de-novo
• Proteins associate de-novo
–tension in self association vs other association
–fibrils, plaques represent a limit
8
Re-use of ligand motifs
• Protein-small molecule association
–eg. enzymes, kinases
–extensive experience
–frequently, new biology from old motifs
• Protein-protein association
–only recent limited experience
–evolution disfavors small ligand
–ligand motif patterns unknown
9
Hit / lead implications
• You have a screening hit. SAR on the historical chemistry of your hit can be useful even if it comes from a different biology area
• Medicinal chemistry principles outside of your current biology target can be extrapolated to the ligand chemistry (but not biology) of the new target
10
NIH Molecular Library probes
• Small molecules are critical to probing biology
• Major NIH funding effort
• Probe program in 2014 is in its 5th year
• Over 320 ML probes disclosed via:
–NIH web based “probe book”
–disclosed in peer reviewed publications
• Probes are biology rich but chemistry sparse
• A public private database problem
–connectivity to chemistry literature is poor
11
Why are NIH probes “sparse” in chemistry?
• You have a screening hit. SAR on the historical chemistry of your hit can be useful even if it comes from a different biology area
• Medicinal chemistry principles outside of your current biology target can be extrapolated to the ligand chemistry (but not biology) of the new target
• Must be able to connect ML probe chemistry structure to the medicinal chemistry literature
12
NIH Probes for Data Analysis
13
https://www.collaborativedrug.com/pages/public_access
14
Hidden NIH ML probe spreadsheet
15
WebTable_121012.xlsx was discovered at http://mli.nih.gov/mli/mlp-probes-2/?dl_id=1352
NIH Probe Report Book
16
http://www.ncbi.nlm.nih.gov/books/NBK47352/
High Quality Data Rich Format
17
ML370 is CID 70680248
18
Structure in CAS SciFinder©
19
Search for exact structure
20
ML370 not in CAS Registry
21
Is this an outlier?
How many ML probes have no CAS Registry Number©?
MLPCN probes and CAS
• 19 of 308 ML probes have no CAS Regno©
• CAS does not index PubChem CID number
• CAS does not index NIH ML probe number, even if it appears in the title of a published journal paper
--------------------------------------------------------------
• The proprietary world of CAS and the public world of PubChem, ChEMBL, UniChem, ChemSpider etc. DO NOT communicate
• Public Be Aware! Potential IP consequences
22
What is required for “Prior Art”
• Chemical structure must be disclosed
• Method of synthesis must be disclosed
• Is a chemical structure without data prior art?
• 70 M chemistry structures in public databases
• 35 M chemistry structures have no data
• Computer generation of chemistry structures up to 13 heavy atoms – billions of structures
• Computer generation of synthetic schemes
23
Focus on Reliability
Chemistry Reliability
• Chemistry “known”, eg. PubChem CID vs. SID
–SID is substance
–CID is compound (the chemistry)
• Frequent HTS hitter, eg. BADAPPLE
–privileged (sub)structure
• Flawed chemistry structure, eg. PAINS
–worthless in drug discovery
• Confidence in Chemistry (CiC) is MISSING
–index of chemical uncertainty in biology testing
Confidence in Chemistry (CiC)
• Chemistry can be drawn, named, indexed
–compounds exist in databases and get tested
• Uncertainty exists about the exact structure
–hence reduced CiC
• Two main causes of reduced CiC
–geometric isomerism
• eg. E, Z olefins; E, Z oximes; cis, trans; exo, endo
–chirality; relative or absolute
–mostly tetrahedral carbon, rarely other
Confidence in Chemistry Calculation
• CiC = 1 / number of unique structures the depiction could represent
• Examples
• Single structure is 1 / 1 = 1
• 1 chiral center racemate is 1 / 2 = 0.5
• EZ mixture is 1 / 2 (geometric isomers) = 0.5
• 2 chiral center racemate is 1 / (2 x 2) = 0.25
• 1 chiral center racemic EZ mixture is 1 / 4 = 0.25
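The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not a published CiC implementation: the function name and its two count parameters are hypothetical, and it assumes that each undefined stereocenter or undefined E/Z double bond independently doubles the number of candidate structures, which is the convention the examples follow.

```python
from fractions import Fraction


def confidence_in_chemistry(undefined_stereocenters=0, undefined_double_bonds=0):
    """Return CiC = 1 / (number of structures the depiction could represent).

    Assumption (not from the slides): every undefined stereocenter or
    undefined E/Z double bond independently doubles the number of candidate
    structures. Symmetry (eg. meso forms) is ignored, so the candidate count
    can be an overestimate.
    """
    if undefined_stereocenters < 0 or undefined_double_bonds < 0:
        raise ValueError("counts must be non-negative")
    candidate_structures = 2 ** (undefined_stereocenters + undefined_double_bonds)
    return Fraction(1, candidate_structures)


# The five examples from the slide, as (label, stereocenters, double bonds).
examples = [
    ("single structure", 0, 0),
    ("1 chiral center racemate", 1, 0),
    ("E/Z mixture", 0, 1),
    ("2 chiral center racemate", 2, 0),
    ("1 chiral center racemic E/Z mixture", 1, 1),
]

for label, centers, bonds in examples:
    cic = confidence_in_chemistry(centers, bonds)
    print(f"{label}: CiC = {cic} = {float(cic)}")
```

The counts have to be supplied by hand here. Deriving them automatically from a drawn structure, an InChI or a SMILES string would need a cheminformatics toolkit and is outside this sketch.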
Confidence in Chemistry - Why care?
• Chemist does not care - already very familiar
• Biologist does not care - ignorance
–does not understand chemistry concepts
–CiC concept is orthogonal to biology
–acronym designations dominate biology literature
• Chemoinformatics person does not care
–chemistry fuzziness resolved via InChI
–layered concept captures knowledge state
• Chemical biologist should care but may not
Low CiC exists in Chemical Biology
• 328 NIH Molecular Library ML probes
• publicly available from NIH or CDD
• 42 probes had low CiC from drawn structure
• 32 probes had OK CiC from drawn structure
• BUT
• I had reason to check the chemistry
• Most were problems of geometric isomerism
• ML096 is an example
• CiC is not a chemistry quality index. It measures uncertainty as to the chemical causing the biology
ML096 ambiguous chemistry
CAS 1186646-97-8 “Z” geometry
• No literature on ML096 except in NIH MLP probe report
• No compounds in CAS SciFinder© very similar to ML096
• No literature teaching about how the chemistry proceeds
ML096 might be this compound. Cannot tell from available data. So Confidence in Chemistry (CiC) is lower than implied by depicted structure
ML096 Spectra uninformative
Phosphomannose Isomerase Inhibitor
Can Confidence in Chemistry be calculated?
• From chemical structure as drawn?
• From InChI? From Smiles?
• This would give an upper limit to CiC
• Real CiC might be lower
• If CiC could be calculated, would it be useful?
--------------------------------------------------------------
• What did I learn from the MLP probes?
• The “chemical” in chemical biology is the weaker sister
Scientific Summary
• Literature chemical searches on leads have value even if the prior art biology has nothing in common with the new biology
• NIH Molecular probes: Need access to chemistry literature to fully capitalize on value
• Databases are being flooded with virtual compounds – poisoning of prior art?
• Confidence in Chemistry (CiC). Uncertainty in the “chemical” in chemical biology needs attention. Computational opportunity?
33