![Page 1: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/1.jpg)
Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies
Anders Karlén, Uppsala University
![Page 2: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/2.jpg)
Aim of study
• Derive a "benchmark data set"
– Drug-like
– Physicochemically diverse
– Commercially available and inexpensive
– Amenable to analytical measurements
• Start the generation of benchmark data
– Derive good-quality data from the same lab
![Page 3: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/3.jpg)
Possible use of the data set
• General description of drugs
• Developing ADME/TOX filters (permeability, solubility, plasma protein binding etc.)
• To validate novel experimental techniques
![Page 4: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/4.jpg)
Generation of a "benchmark" data set based on the list of drugs in Sweden (FASS 2001)
• Starting list: 799 cpds
• Remove compounds (molecular weight >900; polymers, polypeptides; inorganic and metal containing): 691 cpds
• Select commercially available (450 cpds) at < $800/g: 370 cpds
• Select only oral, nasal, pulmonary, ocular, parenteral and rectal administered drugs: 332 cpds
• Remove "odd" ATC classes, e.g. A01 (Mouth and teeth), A05 (Bile acids), A06 (Laxative)…: 284 cpds
• Exp. design: 24-compound data set
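The selection steps on this slide can be read as a chain of filters. The following is a minimal sketch under stated assumptions, not the authors' procedure: the `Drug` record, its field names and the example entries are hypothetical, the polymer/inorganic filter is left out, and only the three ATC classes named on the slide are excluded.

```python
from dataclasses import dataclass
from typing import List, Optional

# Routes and thresholds are taken from the slide; everything else is hypothetical.
ALLOWED_ROUTES = {"oral", "nasal", "pulmonary", "ocular", "parenteral", "rectal"}
EXCLUDED_ATC = ("A01", "A05", "A06")  # the slide gives these only as examples


@dataclass
class Drug:
    name: str
    mol_weight: float
    price_per_gram: Optional[float]  # None means not commercially available
    route: str
    atc: str


def select_candidates(drugs: List[Drug], max_mw: float = 900.0,
                      max_price: float = 800.0) -> List[Drug]:
    """Keep the drugs that pass every filter listed on the slide."""
    kept = []
    for d in drugs:
        if d.mol_weight > max_mw:
            continue  # molecular weight above the cut-off
        if d.price_per_gram is None or d.price_per_gram >= max_price:
            continue  # cannot be bought, or too expensive
        if d.route not in ALLOWED_ROUTES:
            continue  # administration route not in the accepted list
        if d.atc.startswith(EXCLUDED_ATC):
            continue  # "odd" ATC class
        kept.append(d)
    return kept


# Hypothetical records, for illustration only.
EXAMPLE = [
    Drug("drug_a", 350.0, 12.5, "oral", "N05"),
    Drug("drug_b", 1200.0, 40.0, "oral", "J01"),
    Drug("drug_c", 300.0, None, "oral", "C03"),
    Drug("drug_d", 250.0, 5.0, "topical", "D01"),
    Drug("drug_e", 400.0, 3.0, "oral", "A06"),
]
```

In this toy input only `drug_a` passes all four filters; each of the others fails exactly one of them.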
![Page 5: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/5.jpg)
Cost and availability of the 691-compound data set
[Figure: histogram of the number of compounds (0 to 200) per binned price/gram ($); bins 0.03 to 24.9, 24.9 to 50.2, 50.2 to 79.6, 79.6 to 100, 100 to 995, 995 to 3,228,000; structures shown: Methenamine, Calcitriol]
450 of the 691 compounds can be bought
Price range $0.03/gram to $3,228,000/gram (2001)
![Page 6: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/6.jpg)
Principal component analysis
28 molecular descriptors
• General descriptors
• General hydrogen bonding descriptors
• Hydrogen bond donor descriptors
• Hydrogen bond acceptor descriptors
[Figure: PCA score plot (SIMCA-P 11) with directions labeled Size, Lipophilicity and Polarity]
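As a rough illustration of how score values such as t[1] arise from a descriptor table, the sketch below autoscales the columns and extracts the first principal component by NIPALS iteration. It is a sketch under assumptions: the three-column descriptor rows are made up (the slide uses 28 descriptors), and nothing here is claimed to reproduce the settings used in the original analysis.

```python
import math


def autoscale(rows):
    """Mean-center every column and scale it to unit variance.

    Assumes no column is constant (its standard deviation would be zero).
    """
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def nipals_component(x, tol=1e-12, max_iter=1000):
    """Return (scores t, loadings p) of the first principal component of x."""
    t = [row[0] for row in x]  # start from the first column
    p = []
    for _ in range(max_iter):
        tt = sum(v * v for v in t)
        # Loadings: project the columns of x onto the current scores.
        p = [sum(t[i] * x[i][j] for i in range(len(x))) / tt
             for j in range(len(x[0]))]
        norm = math.sqrt(sum(v * v for v in p))
        p = [v / norm for v in p]
        # Scores: project the rows of x onto the normalized loadings.
        t_new = [sum(row[j] * p[j] for j in range(len(p))) for row in x]
        change = sum((a - b) ** 2 for a, b in zip(t_new, t))
        t = t_new
        if change < tol * sum(v * v for v in t):
            break
    return t, p


# Hypothetical descriptor rows: [molecular weight, logP, polar surface area].
DESCRIPTORS = [
    [180.2, 1.2, 63.6],
    [347.0, 1.9, 93.0],
    [733.9, 2.5, 193.9],
    [151.2, 2.4, 26.0],
    [441.4, -1.0, 209.0],
]

scores, loadings = nipals_component(autoscale(DESCRIPTORS))
```

The sign of a principal component is arbitrary, so only the relative positions of the compounds along `scores` carry meaning.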
![Page 7: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/7.jpg)
Principal component analysis
[Figure: four PCA score plots of t[1] and t[2] (SIMCA-P+ 11). Three are colored by descriptor value: MOL_WEIGHT (0 to 200, 200 to 400, 400 to 600, 600 to 800, 800 to 1000), MLOGP (-7 to -4, -4 to -1, -1 to 2, 2 to 5, 5 to 8) and PSASAVOL (0 to 100, 100 to 200, 200 to 300, 300 to 400). The fourth shows the directions labeled Size, Lipophilicity and Polarity.]
![Page 8: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/8.jpg)
The factorial design
"A face-centered central composite design"
[Figure: cube with corner points labeled by factor levels: + + -, + - -, + - +, - + -, + + +, - + +, - - +, - - -]
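For reference, the design points of a face-centered central composite design can be enumerated as below. This is a generic sketch of the design type named on the slide, with three coded factors assumed from the three-sign corner labels; how the compounds were assigned to the points is not shown here.

```python
from itertools import product


def face_centered_ccd(n_factors=3):
    """Return the coded design points: corners, face centers, then the center."""
    # Corner (factorial) points: every combination of low (-1) and high (+1).
    corners = list(product((-1, 1), repeat=n_factors))
    # Axial points lie on the face centers, i.e. at distance 1 from the center.
    axial = []
    for i in range(n_factors):
        for level in (-1, 1):
            point = [0] * n_factors
            point[i] = level
            axial.append(tuple(point))
    center = [(0,) * n_factors]
    return corners + axial + center


DESIGN = face_centered_ccd(3)
```

With three factors this gives 8 corner points, 6 face-center (axis) points and 1 center point.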
![Page 9: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/9.jpg)
24-compound data set
20 protolytes, 4 nonprotolytes
[Figure: chemical structures of the 24 compounds]
Captopril
Bendroflumethiazide
Glipizide
Levodopa
Levothyroxine
Thiamazole
Amantadine
Sulindac
Amiloride
Carbamazepine
Chlorprothixene
Hydrochlorothiazide
Chlorzoxazone
Prednisone
Tinidazole
Flupenthixol
Metoclopramide
Fenofibrate
Tetracycline
Folic acid
Carisoprodol
Meclizine
Terfenadine
Erythromycin
The cost of buying the entire data set (at least 1 gram of each compound) is less than $1,500
![Page 10: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/10.jpg)
Comparison of the data sets with respect to some common molecular descriptors

| Descriptor | 691-set Min | 691-set Max | 691-set Mean | 24-set Min | 24-set Max | 24-set Mean |
|---|---|---|---|---|---|---|
| MW | 60 | 854 | 347 | 114 | 777 | 349 |
| PSA | 0 | 373 | 93 | 8 | 246 | 99 |
| logP (Moriguchi) | -6.4 | 7.6 | 1.9 | -2.0 | 5.3 | 1.9 |
| logD (ACD, pH 6.5) | -10.6 | 12.3 | 0.74 | -5.0 | 4.8 | 0.94 |
| HBD | 0 | 19 | 2.4 | 0 | 8 | 2.7 |
| HBA | 0 | 19 | 4.9 | 1 | 14 | 4.7 |

[Figure: structures of the two extreme compounds]
Candesartan cilexetil: logPMor = 7.6
Neomycin: HBD = 19
![Page 11: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/11.jpg)
Comparison of the data sets with respect to functional groups
[Figure: bar chart of the percent of compounds containing each functional group (0% to 75%), comparing the 24-set with the 691-set (FASS, druglike only). Functional groups: aliphatic q-amine, aliphatic t-amine, aliphatic s-amine, aliphatic p-amine, COOH, benzene, aliphatic OH, aromatic t-amine, aromatic s-amine, aromatic p-amine, aromatic OH, ester, heterocyclic]
![Page 12: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/12.jpg)
Comparison of the data sets with respect to ATC classes
The Anatomical Therapeutic Chemical (ATC) classification system is the most commonly used classification system for drug substances

Distribution in ATC

| ATC | Description | Number of substances, 24-set | Number of substances, 691-set | Percent of data set, 24-set | Percent of data set, 691-set |
|---|---|---|---|---|---|
| A | GI | 1 | 69 | 4.2% | 9.99% |
| B | Blood | 0 | 21 | 0.0% | 3.04% |
| C | Cardio | 2 | 89 | 8.3% | 12.88% |
| D | Topical | 0 | 36 | 0.0% | 5.21% |
| G | Gen.hormones | 1 | 38 | 4.2% | 5.50% |
| H | Hormones | 3 | 14 | 12.5% | 2.03% |
| J | Infection | 5 | 89 | 20.8% | 12.88% |
| L | Tum.,immuno | 1 | 53 | 4.2% | 7.67% |
| M | Muscle,mov. | 3 | 37 | 12.5% | 5.35% |
| N | Nervous | 6 | 134 | 25.0% | 19.39% |
| P | Antiparasite | 0 | 13 | 0.0% | 1.88% |
| R | Respiration | 1 | 52 | 4.2% | 7.53% |
| S | Eye,ear | 1 | 24 | 4.2% | 3.47% |
| V | Various | 0 | 22 | 0.0% | 3.18% |
![Page 13: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/13.jpg)
Start the generation of benchmark data. Derive good-quality data from the same lab
1. Measurement of pKa by pH-metric or pH-UV technique (n=20)
2. Measurement of lipophilicity
(a) pH-metric logP (n=18)
(b) capacity factors by RP-HPLC (n=21)
3. Measurement of intrinsic and kinetic solubility: pH-metric solubility (CheqSol technique) or shake-plate solubility (n=17)
4. Measurement of permeability across Caco-2 cells, A to B direction (n=22)
![Page 14: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/14.jpg)
2. Lipophilicity: pH-metric measurement of logP and logD
[Figure: bar chart of logP (neutral) and logD (pH 7.4), y-axis from -3 to 7, for Amantadine, Amiloride, Bendroflumethiazide, Captopril, Chlorprothixene, Chlorzoxazone, Erythromycin, Fenofibrate, Flupenthixol, Glipizide, Hydrochlorothiazide, Levodopa, Levothyroxine, Meclizine, Metoclopramide, Sulindac, Terfenadine, Tetracycline, Thiamazole, Tinidazole]
logP missing for:
• Folic acid
• Carbamazepine
• Prednisone
• Carisoprodol
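The difference between logP (neutral species) and logD at a given pH can be illustrated with the textbook relation for a monoprotic compound. This is a sketch under the usual simplifying assumption that only the neutral form partitions into the organic phase; it is not the pH-metric fitting procedure used for the measurements, and the function name and example numbers are made up.

```python
import math


def log_d(log_p, pka, ph, kind):
    """Estimate logD at a given pH for a monoprotic acid or base.

    Assumes that only the neutral species partitions into the organic phase.
    """
    if kind == "acid":
        exponent = ph - pka  # acids ionize above their pKa
    elif kind == "base":
        exponent = pka - ph  # bases ionize below their pKa
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return log_p - math.log10(1.0 + 10.0 ** exponent)


# Hypothetical acid with logP 3.0 and pKa 4.4, evaluated at pH 7.4.
example_log_d = log_d(3.0, 4.4, 7.4, "acid")
```

For a compound that is largely ionized at pH 7.4, logD falls well below logP, which is why the two bar series in the chart can differ by several log units.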
![Page 15: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/15.jpg)
2. Lipophilicity: Experimental logP vs calculated logP
[Figure: four scatter plots of calculated logP against experimental logP (logPexp)]

| Calculated logP | R2 vs logPexp |
|---|---|
| Crippen logP | 0.70 |
| ACD/LogP | 0.88 |
| ClogP (BioByte) | 0.89 |
| Moriguchi logP | 0.80 |
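The R2 values quoted for the four logP calculators are coefficients of determination of a straight-line fit. A minimal sketch of that calculation follows; the function name is made up and the numbers in the usage line are toy values, not the measured data.

```python
def r_squared(x, y):
    """Coefficient of determination for a least-squares line y = a*x + b.

    For a straight-line fit this equals the squared Pearson correlation.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


# Toy values only: experimental logP on x, calculated logP on y.
toy_r2 = r_squared([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
```

A high R2 only says that the calculated and experimental values are linearly related; it does not show that the slope is 1 or the offset 0, so a method can score well here and still be systematically biased.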
![Page 16: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/16.jpg)
2. Lipophilicity: Correlation between the measured HPLC capacity factor (k) and pH-metric logD (pH 6.8)
• Compounds from the 8 corner points have different colors
• The 2 compounds at each corner point have the same color
• The axis points are colored black
• Center point pink
R2 = 0.92 (pH=6.8)
![Page 17: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/17.jpg)
3. Solubility: Measurement of intrinsic solubility using CheqSol (24-compound data set)
[Figure: bar chart of log solubility (µg/mL), y-axis from -3 to 4, for Terfenadine, Meclizine, Chlorprothixene, Fenofibrate, Glipizide, Folic acid, Sulindac, Bendroflumethiazide, Levothyroxine, Flupenthixol, Metoclopramide, Carbamazepine, Prednisone, Tetracycline, Hydrochlorothiazide, Chlorzoxazone, Amantadine]
Solubility ranges from 0.009 µg/mL to 2119 µg/mL
![Page 18: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/18.jpg)
3. Solubility
http://www.cheqsol.com/download%20files/download01.pdf
19 of the compounds studied are also present in the 691-compound data set
CheqSol solubility ranges from 0.9 µg/mL to 3500 µg/mL in these 19 compounds
Legend: compound not present in the 691 data set
Columns: Name; Equilibrium solubility (CheqSol, Shake-Flask, Literature); Kinetic Solubility (Chaser, non-chaser). All results in µg/mL
1 Phthalic Acid 5330 5950 8462
2 Quinine 363 201 491 391
3 Trazodone 134.6 138.0 435
4 Nitrofurantoin 112.5 109.5 78.9 319
5 Nortriptyline 27.0 49.3 20.0 27.3
6 Verapamil 48.5 48.5 9.7 47.8
7 Niflumic Acid 9.53 29.5 59
8 Imipramine 17.2 21.7 18.1 17.3
9 Flumequine 34.2 20.7 121
10 Furosemide 19.7 20.4 5.9 96
11 Maprotiline 5.80 8.05 3.49 77
12 Piroxicam 5.92 5.95 3.16 233
13 Warfarin 5.30 5.25 5.60 120
14 Chlorpromazine 2.70 2.41 1.71 2.70
15 Lidocaine 3500 3810 4600
16 Famotidine 740 1100 5900
17 Hydrochlorothiazide 630 700 2400
18 Chlorpheniramine 608.3 615.2 668
19 Sulfamerazine 200.3 203.0 701
20 Ketoprofen 130.6 178.0 336
21 Propranolol 81.0 70.0 340
22 Ibuprofen 50.0 49.0 180
23 Pindolol 41.7 32.7 1424
24 Miconazole 1.00 0.67
25 Diclofenac 0.90 0.80 45
26 Amodiaquin 0.41 8.8
27 Pamoic acid 0.0003 0.019
In the 24-compound data set the solubility ranges from 0.009 µg/mL to 2119 µg/mL
![Page 19: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/19.jpg)
24-compound data set is structurally diverse
[Figure: PCA score plots of t[1] and t[2] (SIMCA-P+ 11) with directions labeled Size, Lipophilicity and Polarity; compounds are marked as no class, 19-data set or 24-data set]
![Page 20: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/20.jpg)
4. Permeability/absorption
[Figure: log-log plot of human jejunum permeability (x 10-4 cm/s) at pH 6.5 against Caco-2 permeability (x 10-6 cm/s) at pH 6.5]
Compounds: Furosemide, Hydrochlorothiazide, Atenolol, Cimetidine, Mannitol, Terbutaline, Amoxicillin (C), Lisinopril (C), Metoprolol, Cephalexin (C), Enalapril (C), Propranolol, Phenylalanine (C), Desipramine, Antipyrine, Piroxicam, Verapamil (C), Ketoprofen, Naproxen, D-Glucose (C)
• logY = 0.6532 logX - 0.3036, R2 = 0.7276 (all drugs)
• logY = 0.7524 logX - 0.5441, R2 = 0.8492 (passively diffusive)
• logY = 0.542 logX + 0.06, R2 = 0.7854 (carrier-mediated)
Sun, D. et al. Comparison of Human and Caco-2 Gene Expression Profiles for 12,000 Genes and the Permeabilities of 26 Drugs in the Human Intestine and Caco-2 Cells. Pharm Res 2002, 19, 1398-1413
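The regression lines quoted on this slide can be applied directly to convert a Caco-2 permeability into an estimated human jejunum permeability. The sketch below assumes, from the axis labels, that X is the Caco-2 value in units of 10-6 cm/s, Y is the human value in units of 10-4 cm/s, and that the logarithms are base 10; the function and dictionary names are made up.

```python
import math

# (slope, intercept) pairs as quoted on the slide.
FITS = {
    "all": (0.6532, -0.3036),
    "passive": (0.7524, -0.5441),
    "carrier": (0.542, 0.06),
}


def predict_human_peff(caco2_papp, fit="all"):
    """Estimate human jejunum permeability (x 10^-4 cm/s).

    caco2_papp is the Caco-2 permeability in units of 10^-6 cm/s
    (assumed from the axis labels) and must be positive.
    """
    if caco2_papp <= 0:
        raise ValueError("permeability must be positive")
    slope, intercept = FITS[fit]
    return 10.0 ** (slope * math.log10(caco2_papp) + intercept)


# Example: a Caco-2 value of 10 x 10^-6 cm/s with the passive-diffusion line.
example_peff = predict_human_peff(10.0, fit="passive")
```

Because the fits are made on log-transformed values, an R2 of about 0.73 to 0.85 still leaves room for sizeable errors on the linear scale, so such estimates are best treated as rank-ordering aids.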
![Page 21: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/21.jpg)
4. Permeability/absorption: In vitro Papp values in human Caco-2 cells
[Figure: Papp values of the compounds, grouped as Low, Medium and High]
![Page 22: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/22.jpg)
Suggestions on the ”Uppsala diverse data set” usage
• The 24 compounds can be used
– as a test set for testing already derived models of permeability, lipophilicity, solubility etc.
– as a validation set for new experimental techniques
– on its own for building and validating models by dividing it into a training set and a test set
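The last suggestion, dividing the 24 compounds into a training set and a test set, could be done as in the sketch below. The 18/6 split, the fixed seed and the purely random assignment are assumptions made for illustration; a split that respects the experimental design may be preferable.

```python
import random

COMPOUNDS = [
    "Captopril", "Bendroflumethiazide", "Glipizide", "Levodopa",
    "Levothyroxine", "Thiamazole", "Amantadine", "Sulindac",
    "Amiloride", "Carbamazepine", "Chlorprothixene", "Hydrochlorothiazide",
    "Chlorzoxazone", "Prednisone", "Tinidazole", "Flupenthixol",
    "Metoclopramide", "Fenofibrate", "Tetracycline", "Folic acid",
    "Carisoprodol", "Meclizine", "Terfenadine", "Erythromycin",
]


def split_train_test(compounds, n_test=6, seed=0):
    """Return (training set, test set) from a reproducible random shuffle."""
    if not 0 < n_test < len(compounds):
        raise ValueError("n_test must be between 1 and len(compounds) - 1")
    rng = random.Random(seed)  # local generator, so global state is untouched
    shuffled = list(compounds)
    rng.shuffle(shuffled)
    return sorted(shuffled[n_test:]), sorted(shuffled[:n_test])


train, test = split_train_test(COMPOUNDS)
```

With only 24 compounds the result depends strongly on which compounds land in the test set, so repeating the split with several seeds gives a fairer picture of model performance.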
We hope that other groups are willing to help us to supplement the herein-started characterization
”Bench mark data set”
J. Med. Chem.; (ASAP); 2006; 49(23); 6660-6671
![Page 23: Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies Anders Karlén Uppsala University.](https://reader036.fdocuments.in/reader036/viewer/2022062312/551a2618550346cb358b4c82/html5/thumbnails/23.jpg)
Acknowledgements
AstraZeneca R&D Mölndal: Susanne Winiwarter, Anna-Lena Ungell, Johan Wernevik, Fredrik Bergström, Leif Engström
Sirius Analytical Instruments Ltd: John Comer, Karl Box, Ruth Allen, Jon Mole
Faculty of Pharmacy, Uppsala University: Christian Sköld, Torbjörn Lundstedt, Anders Hallberg, Hans Lennernäs