Pharmacophores in Chemoinformatics: 1. Pharmacophore Patterns
Pharmacophores in Chemoinformatics:
1. Pharmacophore Patterns & Topological Fingerprints
Dragos Horvath
Laboratoire d'InfoChimie, UMR 7177 CNRS, Université de Strasbourg
[email protected] (domain: chimie.u-strasbg.fr)
The Pharmacophore Way of Life: A Medicinal Chemist's Dream
• (Bio)Molecular recognition is based on ligand-site interactions of extremely complicated nature
– Understanding them requires a solid knowledge of statistical physics and, therefore, of higher maths…
– But medicinal chemists hate maths… so they developed a simplified rule set to rationalize ligand binding.
• Functional groups of similar physicochemical behavior represent pharmacophore types:
– Hydrophobic, Aromatic, Hydrogen Bond (HB) donors, Cations, HB Acceptors, Anions.
– Now, we just need to know how each of the six types interacts with the site… welcome to the "pharmacophore" paradigm, farewell higher maths (for the moment, at least)
The Interaction Saga: (1) van der Waals Interactions
• Atoms are more or less hard spheres: squeezing them against each other causes a sharp rise in energy:
– $E_{rep} = A_{ij} d^{-12}$
• At distances larger than the sum of their «van der Waals» radii, an attractive term due to induced dipole-induced dipole interactions (London dispersion term) is predominant…
– $E_{att} = -B_{ij} d^{-6}$
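The two terms above can be combined into a single pair energy. The sketch below is only an illustration: the function names and the parameter values used in the usage note are assumptions, not values from the slides. It evaluates both terms and the distance at which their sum is minimal, obtained by setting the derivative of $A_{ij} d^{-12} - B_{ij} d^{-6}$ to zero.

```python
def lennard_jones_energy(d, a_ij, b_ij):
    """Return (E_rep, E_att, E_total) for one atom pair at distance d."""
    if d <= 0:
        raise ValueError("distance must be positive")
    e_rep = a_ij * d ** -12   # steep repulsive wall
    e_att = -b_ij * d ** -6   # London dispersion attraction
    return e_rep, e_att, e_rep + e_att


def minimum_distance(a_ij, b_ij):
    """Distance minimizing E_rep + E_att: d_min = (2 A / B)^(1/6)."""
    return (2.0 * a_ij / b_ij) ** (1.0 / 6.0)
```

With the arbitrary choice A = 1 and B = 2, the minimum lies at d = 1 and the well depth is B²/(4A) = 1 in the same arbitrary units.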
The Interaction Saga: (2) Electrostatics & Solvation
• Coulomb charge-charge interactions are easy to compute, once the partial charges $Q_k$ are assigned on the atoms…
– $E_{Coul} = Q_i Q_j / (4\pi\varepsilon d)$
• … and the solvent molecules are explicitly modeled
– accounting for all the possible solvation shell structures, in order to estimate a solvation free energy.
• Alternatively, a continuum solvent model may be employed.
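As a minimal sketch of the Coulomb term, the function below sums $Q_i Q_j / (4\pi\varepsilon d)$ over all atom pairs. The unit convention is an assumption made for illustration (charges in elementary charges, distances in Å, energies in kcal/mol via the commonly used conversion factor 332.0637), and the uniform relative permittivity `eps_r` is a crude stand-in for solvent screening, not the continuum model referred to above.

```python
import math
from itertools import combinations

# Assumed conversion factor: e^2 / (4 * pi * eps0) in kcal * Angstrom / mol
COULOMB_KCAL = 332.0637


def coulomb_energy(atoms, eps_r=1.0):
    """Pairwise Coulomb energy in kcal/mol.

    atoms: list of (partial_charge_in_e, (x, y, z) in Angstrom).
    eps_r: uniform relative permittivity (illustrative simplification).
    """
    total = 0.0
    for (q_i, r_i), (q_j, r_j) in combinations(atoms, 2):
        d = math.dist(r_i, r_j)  # interatomic distance
        total += COULOMB_KCAL * q_i * q_j / (eps_r * d)
    return total
```

For two opposite unit charges 1 Å apart this gives -332.0637 kcal/mol in vacuum, and 80 times less with `eps_r=80.0`.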
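[Figure: continuum solvent model. Boundary elements BE_i and BE_k, carrying surface charge densities σ_i and σ_k, respond to the atomic charges Q_i and Q_k; the densities are derived from the field components E_p·n_p normal to the surface patches p ∈ i (respectively p ∈ k), with a dielectric factor involving ε_ext and ε_int. One coupling term is marked "neglected!". The equations themselves are not recoverable from the transcript.]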
D. Horvath et al., J. Chem. Phys. 104, 6679 (1996)
The Interaction Saga: (2bis) The Hydrophobic Effect
• The mysterious force that separates grease and water is not due to grease-grease van der Waals interactions being stronger than grease-water attraction!
• It is not of electrostatic nature either, because greasy alkyl chains have no charges!
• Actually, it's not a force at all, but the consequence of the drift towards a more probable state of matter (?!)
• For practical purposes, however, it makes sense to believe that hydrophobes «attract» each other, for making hydrophobic contacts significantly improves binding affinity!
Physical Chemistry For Dummies: The Rules
• Hydrophobes make favorable contacts with other hydrophobes (we do not want to know why!). Assume strength proportional to the buried hydrophobic area.
• Hydrophobes in close contact with polar groups cause frustration, for they chase away the water molecules favorably solvating the latter and offer no substitute interactions.
• Hydrogen bond donors seek to pair with acceptors, so that they may reestablish the hydrogen bonds to water that they lost.
• Cations seek to pair with anions and avoid hydrophobes.
• Shape is of paramount importance: groups of the same kind may replace each other if they are similarly shaped.
BioIsoSteres: Equivalent Functional Groups
• Wikipedia: bioisosteres are substituents or groups with similar physical or chemical properties that impart similar biological properties to a chemical compound
[Figure: example bioisosteric groups. The structure drawings are not recoverable from the transcript; the surviving atom labels suggest carboxylic acid/carboxylate groups, protonated amidine- or guanidine-like cations, and tetrazole/tetrazolate rings.]
Pharmacophore Patterns
• The pharmacophore pattern of a molecule characterizes the relative arrangement of all its pharmacophore types
– What pharmacophore types are represented?
– How are they arranged (spatially, topologically) with respect to each other?
– How can these aspects be captured numerically to yield molecular descriptors of the pharmacophore pattern?
• Note: Pharmacophore patterns are essentially 3D. Since geometry is determined by connectivity, 2D "pharmacophore patterns" also make sense!
Exploiting pharmacophore patterns…
• N-dimensional vector $D(M) = [D_1(M), D_2(M), \ldots, D_N(M)]$; each $D_i$ encodes an element of the pharmacophore pattern
– Allows meaningful quantitative definitions of molecular similarity:
  • Neighborhood Behavior: similar molecules, characterized by covariant vectors, are likely to display similar biological properties
  • As chemists do not easily perceive the pharmacophore pattern, such covariance may reveal hidden but real molecular relatedness…
– May serve as starting point for searching a binding pharmacophore: the subset of features that really participate in binding to a receptor
  • Machine learning to select those elements $D_i$ that are systematically present in actives, but not in inactives of a molecular learning set!
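One way to make "covariant vectors" concrete is the Pearson correlation between two descriptor vectors; this choice of metric is an assumption for illustration, since the slides do not name one. The sketch below scores a pair of vectors and ranks a small library (a hypothetical dict of name to vector) against a reference.

```python
import math


def correlation(d_a, d_b):
    """Pearson correlation coefficient of two equally long descriptor vectors."""
    n = len(d_a)
    if n == 0 or n != len(d_b):
        raise ValueError("vectors must be non-empty and of equal length")
    mean_a = sum(d_a) / n
    mean_b = sum(d_b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(d_a, d_b))
    var_a = sum((x - mean_a) ** 2 for x in d_a)
    var_b = sum((y - mean_b) ** 2 for y in d_b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a constant vector carries no pattern to correlate
    return cov / math.sqrt(var_a * var_b)


def rank_by_similarity(reference, library):
    """Return library names sorted from most to least correlated with reference."""
    return sorted(library,
                  key=lambda name: correlation(reference, library[name]),
                  reverse=True)
```

Molecules at the top of the ranking are the candidates expected, under the neighborhood behavior assumption, to share the reference's biological properties.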
Some examples of "hidden similarity"
[Figure: bar charts (scale 0 to 100) showing what appear to be activity profiles of structurally different compounds across a panel of receptor and enzyme assays (A1h, Alpha1, Alpha2, Beta1h, AT1h, BZD, …, D1h, D2h, …, 5HT1Ah, 5HT1D, 5HT2ch, 5HT3h, 5HT6h, …, HIVP, …). The structure drawings and the remaining assay labels are not recoverable from the transcript.]
Tricentric Pharmacophore Fingerprints: monitoring feature arrangement
• Topological: the distance between two features equals the (minimal) number of chemical bonds between them
[Figure: example structure (atoms N, N, O, N, Cl) with a feature triplet annotated by topological edge lengths; the edge labels are garbled in the transcript.]
• Spatial: if stable conformers are known, use the distance in Å between two features
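The topological distance defined above is a shortest path on the molecular graph, which a breadth-first search computes directly. The sketch below assumes a molecule given as a plain list of bonded atom-index pairs; the example graph in the usage note is hypothetical.

```python
from collections import deque


def topological_distances(bonds, source):
    """Minimal number of bonds from `source` to every reachable atom (BFS)."""
    adjacency = {}
    for a, b in bonds:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency.get(atom, ()):
            if neighbor not in dist:  # first visit is the shortest path
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


def triplet_edges(bonds, i, j, k):
    """Edge lengths (d_jk, d_ik, d_ij) of the triplet formed by atoms i, j, k."""
    from_i = topological_distances(bonds, i)
    from_j = topological_distances(bonds, j)
    return from_j[k], from_i[k], from_i[j]
```

For a six-membered ring 0-1-2-3-4-5 with a substituent atom 6 on atom 3, atom 6 lies four bonds from atom 0, and `triplet_edges(bonds, 0, 2, 6)` returns the three edges in the order opposite to corners 0, 2 and 6.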
Example: Binary Pharmacophore Triplets
Basis Triplets:
• all possible feature combinations
• at a given series of distances…
[Figure: triplets found in the molecule are looked up in the list of basis triplets: Hp3-Hp3-Hp3, Hp3-Hp3-Hp4, Hp3-Hp3-Hp5, … Ar4-Hp3-Hp4, Ar4-Hp3-Hp5, … Hp7-Ar4-PC6, … Hp3-HA5-Ar5, … Triangles with edges (3, 3, 3), (3, 4, 4), (3, 4, 5), (4, 6, 7) and (5, 5, 3) are drawn; a triangle with edges (5, 5, 4), labeled Hp4-HA5-Ar5, is flagged "??".]
Binary fingerprint: 0 0 0 … 0 0 … … 1 … … … 0 … … 0 …
Pickett, Mason & McLay, J. Chem. Inf. Comp. Sci. 36:1214-1223 (1996)
First key improvement: Fuzzy mapping of atom triplets onto basis triplets in 2D-FPT
[Figure: the same molecular triplets are now mapped fuzzily onto the basis triplets (Hp3-Hp3-Hp3, Hp3-Hp3-Hp4, Hp3-Hp3-Hp5, … Ar4-Hp3-Hp4, Ar4-Hp3-Hp5, … Hp7-Ar4-PC6, … Hp3-HA5-Ar5, Hp4-HA5-Ar5, …), so that fingerprint elements receive graded increments instead of binary flags.]
Fuzzy fingerprint: 0 0 0 … 0 0 … +6 … … +3 … … … … 0 …
$D_i(m)$ = total occupancy of basis triplet i in molecule m.
Combinatorial enumeration of basis triplets
• Example: there are 36796 basis triplets, verifying triangle inequalities, when considering 6 pharmacophore types and 11 edge lengths between Emin=3 and Emax=13 with an increment of Estep=1: (3, 4, 5, …, 13)
– Canonical representation: $T_1 d_{23}$-$T_2 d_{13}$-$T_3 d_{12}$ with $T_3 \ge T_2 \ge T_1$ (alphabetically); each corner type is followed by the length of the opposite edge.
[Figure: triangle with edges 4, 6, 7; labels shown: Hp7-Ar4-PC6, Ar4-Hp7-PC6]
– Out of two corners of the same type, priority is given to the one opposed to the shorter edge.
[Figure: triangle with edges 4, 6, 7; labels shown: Ar4-Hp7-Hp6, Ar5-Hp6-Hp7]
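The basis-set sizes quoted in the slides can be checked by brute force. In the sketch below, a triplet is treated as an unordered set of three (type, opposite edge) corners, and the triangle inequality is taken as strict (longest edge shorter than the sum of the other two). Both choices are assumptions; they are adopted here because they reproduce the 36796 figure. The type abbreviations follow the parameter table in this deck.

```python
from itertools import combinations_with_replacement, product

TYPES = ("Ar", "HA", "HD", "Hp", "NC", "PC")


def canonical_label(corners):
    """Sort corners by type, then by opposite edge length, e.g. 'Ar4-Hp7-PC6'."""
    return "-".join(f"{t}{e}" for t, e in sorted(corners))


def count_basis_triplets(e_min, e_max, e_step, types=TYPES):
    """Count unordered triplets of (type, opposite edge) corners
    whose three edges satisfy the strict triangle inequality."""
    edges = range(e_min, e_max + 1, e_step)
    corners = list(product(types, edges))
    count = 0
    for (_, a), (_, b), (_, c) in combinations_with_replacement(corners, 3):
        shortest, middle, longest = sorted((a, b, c))
        if longest < shortest + middle:
            count += 1
    return count


print(count_basis_triplets(3, 13, 1))  # 36796
```

Under the same assumptions, `count_basis_triplets(2, 12, 2)` gives 4494, the basis-set size listed for FPT-1. `canonical_label([("Hp", 7), ("Ar", 4), ("PC", 6)])` returns `"Ar4-Hp7-PC6"`.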
Triplet matching procedure
• The triplet matching score represents the optimal degree of pharmacophore field overlap:
– if corner k of the triplet is of pharmacophore type T, i.e. F(k,T)=1, then it contributes to the total pharmacophore field of type T, observed at a point P of the plane:
$$\Psi_T(P) = \sum_{k=1}^{3} F(k,T) \times \exp\left(-\rho_T\, d_{k,P}^{2}\right)$$
Horvath, D. ComPharm, pp. 395-439, in "QSPR/QSAR Studies by Molecular Descriptors", Diudea, M., Editor, Nova Science Publishers, Inc., New York, 2001
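A minimal sketch of the field formula above, assuming the triplet corners are given hypothetical 2D coordinates and that F(k,T) is a plain 0/1 flag. Fractional flags, such as an aromatic-hydrophobic interchangeability level, are not modeled here.

```python
import math


def pharmacophore_field(corners, feature_type, point, rho):
    """Psi_T(P): sum over corners k of F(k, T) * exp(-rho_T * d_kP^2).

    corners: list of (type, (x, y)) for the three triplet corners.
    feature_type: the pharmacophore type T being probed.
    point: coordinates (x, y) of the observation point P.
    rho: Gaussian fuzziness parameter rho_T for type T.
    """
    total = 0.0
    for corner_type, position in corners:
        flag = 1.0 if corner_type == feature_type else 0.0  # F(k, T)
        d_squared = sum((p - q) ** 2 for p, q in zip(position, point))
        total += flag * math.exp(-rho * d_squared)
    return total
```

For corners Hp at (0, 0), Ar at (1, 0) and Hp at (0, 1), the Hp field at the origin with rho = 0.6 is 1 + exp(-0.6): the corner sitting at P contributes fully, the other Hp corner contributes a Gaussian-damped share, and the Ar corner contributes nothing.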
Control parameters for triplet enumeration & matching in two 2D-FPT versions.
Parameter | Description | FPT-1 | FPT-2
Emin | Minimal Edge Length of basis triangles (number of bonds between two pharmacophore types) | 2 | 4
Emax | Maximal Triangle Edge Length of basis triangles | 12 | 15
Estep | Edge length increment for enumeration of basis triangles | 2 | 2
e | Edge length excess parameter: in a molecule, triplets with edge length > Emax+e are ignored | 0 | 2
Δ | Maximal edge length discrepancy tolerated when attempting to overlay a molecular triplet atop of a basis triangle | 2 | 2
ρHp = ρAr | Gaussian fuzziness parameter for apolar (Hydrophobic and Aromatic) types | 0.6 | 0.9
ρPC = ρNC | Gaussian fuzziness parameter for charged (Positive and Negative Charge) types | 0.6 | 0.8
ρHA = ρHD | Gaussian fuzziness parameter for polar (Hydrogen bond Donor and Acceptor) types | 0.6 | 0.7
l | Aromatic-Hydrophobic interchangeability level | 0.6 | 0.5
– | Number of basis triplets at given setup | 4494 | 7155
Second key improvement: Protolytic (protonation) equilibrium dependence of 2D-FPT
[Figure: two microspecies of the same molecule, with populations of 12% and 88%, contribute different triplets to the fingerprint; triplet labels shown: Ar5-NC5-PC8 and Ar8-NC8-PC8.]
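One plausible reading of this population dependence is that each microspecies contributes its triplet occupancies weighted by its population. The sketch below implements that reading only; the weighting scheme, the function name and the triplet labels in the usage note are assumptions, not the documented 2D-FPT procedure.

```python
import math


def weighted_fingerprint(microspecies):
    """Population-weighted sum of per-microspecies triplet occupancies.

    microspecies: list of (population_fraction, {basis_triplet: occupancy}).
    Fractions are expected to sum to 1.
    """
    total_population = sum(pop for pop, _ in microspecies)
    if not math.isclose(total_population, 1.0, abs_tol=1e-6):
        raise ValueError("population fractions must sum to 1")
    fingerprint = {}
    for pop, triplets in microspecies:
        for label, occupancy in triplets.items():
            fingerprint[label] = fingerprint.get(label, 0.0) + pop * occupancy
    return fingerprint
```

With two hypothetical species populated at 12% and 88%, a triplet present only in the minor species ends up with occupancy 0.12 rather than being switched fully on or off.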
Some 'activity cliffs' in rule-based descriptor space are smoothed out in 2D-FPT-space
[Figure: compound pairs with their ionizable groups annotated by charge state. Labels recoverable from the transcript: Neutral, Cation, Anion, 90% Cation, 50% Cation, 40% Cation, 70% Cation.]
Pharmacophore Pattern-Based Similarity Queries: Lead Hopping!
[Workflow diagram; box labels as extracted: Pharmacophore Hypothesis; Reference Fingerprint; Automated Fingerprint Matching...; Potential Pharmacophore Fingerprint Library; Nearest Neighbors; Superposition-based Similarity Scoring; Docking (?); Best Matching Candidates]
Successful Virtual Screening Simulations
[Figure: virtual screening curves for two targets, D2 and TK: % retrieved seed compounds versus selection size (0 to 200). Series: Confirmed Actives and Confirmed Inactives, for PF, OPT3 and FPT-2.]
Successful QSAR model construction with 2D-FPT: predicting c-Met TK activity
[Figure: experimental pIC50 versus calculated pIC50 (both axes 4 to 9) for learning set and validation set compounds]
• 25 variables entering nonlinear model
• 153 molecules for training: RMSE=0.4 (log units), R2=0.82
• 40 molecules for validation: RMSE=0.8 (log units), R2=0.53
• 8 validation molecules out of 40 mispredicted by more than 1 log
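The statistics quoted above can be computed as sketched below. The slides do not say which definition of R2 was used; this sketch assumes the coefficient of determination, 1 - SS_res/SS_tot, and the toy numbers in the usage note are made up for illustration.

```python
import math


def rmse(experimental, calculated):
    """Root-mean-square error between experimental and calculated pIC50."""
    n = len(experimental)
    return math.sqrt(sum((e - c) ** 2
                         for e, c in zip(experimental, calculated)) / n)


def r_squared(experimental, calculated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_exp = sum(experimental) / len(experimental)
    ss_res = sum((e - c) ** 2 for e, c in zip(experimental, calculated))
    ss_tot = sum((e - mean_exp) ** 2 for e in experimental)
    return 1.0 - ss_res / ss_tot


def count_mispredicted(experimental, calculated, threshold=1.0):
    """Number of compounds whose absolute error exceeds `threshold` log units."""
    return sum(1 for e, c in zip(experimental, calculated)
               if abs(e - c) > threshold)
```

For experimental values [5, 6, 7] and calculated values [5, 6, 8], the RMSE is sqrt(1/3) ≈ 0.577, R2 is 0.5, and no compound is off by more than 1 log unit.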
What more could be done?
• 3D FPT version under study
– does it pay off to generate conformers? How many would you need to get better results than with 2D-FPT? What's the best conformational sampler to use?
• Accessibility-weighted fingerprints?
– class to return (topological and/or 3D) estimate of the solvent-accessible fraction of an atom?
• Tautomer-dependent fingerprints?
– if tautomers and their percentage were enumerated like any other microspecies…
Pharmacophore Hypotheses
(A): From individual Active Leads: 2D/3D
• ALL features in the Lead assumed relevant for binding
(B): Consensus hypotheses from set of Leads: 2D/3D
• Ignore features that can be deleted without losing activity
(C): Site-Ligand interaction models: 3D*
• Select Ligand features shown to interact with the site in the 3D X-ray structure of the site-ligand complex.
(D): Active Site filling models: 3D*
• Design a pharmacophoric feature distribution complementary to the groups available in the active site
* In these cases, docking may be performed starting from pharmacophore-based overlays
ComPharm Overlay…
GA-controlled overlay optimization over:
- chosen conformer of the reference
- chosen conformer of the candidate
- pair of matching atoms
- 3 Euler angles
- mirroring toggle
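The degrees of freedom listed above map naturally onto a genetic-algorithm chromosome. The sketch below shows one possible encoding and a random initializer; the class name, the field names and the angle range are illustrative assumptions, not ComPharm's actual data structures.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class OverlayChromosome:
    ref_conformer: int    # index of the chosen reference conformer
    cand_conformer: int   # index of the chosen candidate conformer
    ref_atom: int         # reference atom of the matching pair
    cand_atom: int        # candidate atom of the matching pair
    euler_angles: tuple   # three rotation angles, in radians
    mirrored: bool        # mirroring toggle


def random_chromosome(n_ref_conf, n_cand_conf, n_ref_atoms, n_cand_atoms, rng):
    """Draw one random overlay description (simplified: all angles in [0, 2*pi])."""
    return OverlayChromosome(
        ref_conformer=rng.randrange(n_ref_conf),
        cand_conformer=rng.randrange(n_cand_conf),
        ref_atom=rng.randrange(n_ref_atoms),
        cand_atom=rng.randrange(n_cand_atoms),
        euler_angles=tuple(rng.uniform(0.0, 2.0 * math.pi) for _ in range(3)),
        mirrored=rng.random() < 0.5,
    )
```

A population would be built by calling `random_chromosome` repeatedly with a seeded `random.Random` instance; the scoring, crossover and mutation operators are left out of this sketch.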
ComPharm Pharmacophoric Fields
• A descriptor of the nature of the molecule's pharmacophoric neighborhood "seen" by every reference atom, assuming an optimal overlay of the molecule on the reference...
Rows: Reference Atoms; columns: Pharmacophoric Features
Reference atom | Alk. | Aro. | HBA | HBD | (+) | (-)
1 | X11 | X12 | X13 | X14 | X15 | X16
2 | X21 | X22 | X23 | X24 | X25 | X26
3 | X31 | X32 | X33 | X34 | X35 | X36
4 | X41 | X42 | X43 | X44 | X45 | X46
5 | X51 | X52 | X53 | X54 | X55 | X56