CZ3253: Computer Aided Drug Design
Lecture 6: QSAR part II
Prof. Chen Yu Zong
Tel: 6874-6877
Email: [email protected]
http://xin.cz3.nus.edu.sg
Room 07-24, level 7, SOC1, National University of Singapore
2
Examples of QSAR Applications:
Application of in silico technology to screen out potentially toxic compounds using expert and QSAR models
3
Commercial Software
Commercial toxicity estimation packages are available to predict a variety of toxic endpoints, including mutagenicity, carcinogenicity, teratogenicity, skin and eye irritation, and acute toxicity:
• DEREK (Deductive Estimation of Risk from Existing Knowledge) – www.chem.leeds.ac.uk/luk
• HazardExpert – www.compudrug.com/hazard
• CASE (Computer Automated Structure Evaluation) – www.multicase.com
• TOPKAT (Toxicity Prediction by Computer Assisted Technology) – www.accelrys.com/products/topkat
• OncoLogic – www.logichem.com
4
Pharma Algorithms
Providers of Databases, Predictors and Development Tools

Property                        N
Log P                           10,000
DMSO Solubility                 22,000
pKa                             8,000
Stability at pH < 2             20,000
Aqueous Solubility              5,500
Permeability (HIA)              1,000
Active Transport                500
Pgp Transport                   1,000
Oral Bioavailability (Human)    900
LD50 Intraperitoneal            36,000
...
5
Pharma Algorithms Development Tools
Algorithm Builder development platform:
• Data storage and manipulation
• Generation of fragmental descriptors
• Statistical procedures: MLR, PLS, PCA, Recursive Partitioning, HCA
• Tools for predictive algorithm development
6
Generation of Descriptors
[Figure: descriptor matrix. Each row is a structure (Structure1, Structure2, ..., StructureN); the columns are the fragment descriptors F1, F2, F3, ..., FM and the activity value Y.]
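The descriptor matrix can be illustrated with a few lines of Python. This is only a minimal sketch, not the Algorithm Builder implementation: it assumes structures are given as SMILES-like strings and counts plain substring occurrences, which is far cruder than real substructure matching. The function name `fragment_matrix`, the example fragments and the Y values are all hypothetical.

```python
def fragment_matrix(structures, fragments):
    """Return one row per structure holding the count of each fragment.

    Counting is naive, non-overlapping substring matching on the input
    strings (an assumption made for illustration only).
    """
    return [[s.count(f) for f in fragments] for s in structures]


# Hypothetical inputs: three SMILES-like strings and three fragments F1..F3.
structures = ["CCO", "CC(=O)O", "CCN"]
fragments = ["C", "O", "N"]

X = fragment_matrix(structures, fragments)

# Hypothetical activity column Y, one value per structure.
Y = [0.5, 1.2, 0.9]

for name, row, y in zip(structures, X, Y):
    print(name, row, y)
```

Each row of `X` corresponds to one structure and each column to one fragment, which is the layout a statistical method such as MLR or PLS would then be fitted on.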
7
“Causal” Descriptors

Atom chains                         Activity effects
One-atom ("topological")            Non-specific (size, PSA)
Three-atom (e.g. COOH, CONH)        Ionization, H-bonding
Five-atom                           Reactivity, internal interactions
Larger chains, ring scaffolds       Similarity to natural compounds

[Figure: example structures with highlighted fragments; axes: Fragment Size, Specificity.]
8
Algorithm Development
• Graphical interface provides easy-to-use tools for programming complex algorithms
• Combine fragmental, descriptor and similarity-based methods
• Use logical expressions, conditions and equations based on descriptors, sub-fragments, internal interactions or any other chemical criteria
• Combine multiple sub-algorithms into general algorithms
• Rapidly develop ‘custom’ filters incorporating ‘expert’ in-house or project-specific rules
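As a rough illustration of combining logical expressions into a custom filter, the sketch below composes small rule functions with `all_of` and `any_of` helpers. Everything here is an assumption made for the example: the descriptor names (`logp`, `mw`, `n_reactive_groups`), the thresholds and the helper names are hypothetical and are not rules or APIs taken from Algorithm Builder.

```python
def all_of(*rules):
    """Combine sub-rules: pass only if every rule passes."""
    return lambda compound: all(rule(compound) for rule in rules)


def any_of(*rules):
    """Combine sub-rules: pass if at least one rule passes."""
    return lambda compound: any(rule(compound) for rule in rules)


# Hypothetical project-specific rules on precomputed descriptors.
def low_logp(c):
    return c["logp"] < 3.0


def small(c):
    return c["mw"] < 300.0


def no_reactive_groups(c):
    return c["n_reactive_groups"] == 0


# A general filter built from sub-algorithms.
custom_filter = all_of(no_reactive_groups, any_of(low_logp, small))

compounds = [
    {"id": "A", "logp": 1.5, "mw": 420.0, "n_reactive_groups": 0},
    {"id": "B", "logp": 4.2, "mw": 250.0, "n_reactive_groups": 1},
    {"id": "C", "logp": 4.8, "mw": 510.0, "n_reactive_groups": 0},
]

passed = [c["id"] for c in compounds if custom_filter(c)]
print(passed)
```

Keeping each rule as a separate function means an in-house rule can be added or swapped without touching the others.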
9
Tox Effects in Drug Design

Tox Effect                 Programs
Acute (LD50)               Topkat, AB/LD50 (our focus)
Organ-specific effects     AB/Tox*
Mutagenicity               Many programs, AB/Tox
Reproductive effects       Many programs, AB/Tox*
Carcinogenicity            Many programs, AB/Tox*

* next version
10
Existing Programs
[Diagram: classification of existing programs]
Expert (“manually” derived skeletons): DEREK, HAZARD
QSAR (combined descriptors): TopKat, QikProp
C-SAR (“statistical” skeletons): M-CASE
Other: COMPACT, META
Mixed (combinations of above): AB/Tox, AB/LD50, AB/Oral %F (ADME, LD50)
Will consider these
11
What Is LD50
The dose that kills 50% of test animals within 24 hrs
In drug design, used at the pre-clinical stage
In early stages, replaced with “reductionist” considerations
Some scientists question its utility
12
Complexity of LD50
Informatics: “Reactivity + log P”
Toxicologists: Empirical knowledge
PK Specialists: Empirical knowledge + simulations
[Diagram: factors contributing to Oral LD50]
Tox Effects: Alkylation, "Narcosis", Organ-specific, "Basal", CNS, PNS, ATP Synthesis, Krebs Cycle, Other targets
Distribution
Excretion
Oral %F
13
Acute Tox in Drug Design
Lead Selection
• No tests performed
• Reactive groups discarded
Lead Optimization
• Basal cytotoxicity tested
• Intra-cellular effects considered
Pre-clinical Stage
• Animal tests are required
• ADME effects considered
Is this good enough?
14
Acute Tox in Drug Design
An LD50 model for mouse (intraperitoneal administration) was developed using data from the RTECS database (35,000 compounds)
15
Distribution of Acute Effects
Extra-cellular effects may be “invisible” in cytotoxic assays
RTECS DB: mouse, intraperitoneal administration
[Figure: two pie charts of acute effects, one for all compounds (N ~ 35,000) and one for compounds with LD50 < 50 mg/kg (N = 4,099). Categories: Narcosis, Nervous system (hydrophobic bases), Natural toxins, Reactivity, Other. Segment values: 32%, 7%, 55%, 6% and 23%, 25%, 14%, 38%.]
16
In Vivo vs. In Vitro
IC50 cannot model LD50 when extra-cellular effects occur
[Figure: plot with axes labelled Log IC50 and -Log LD50, with example structures: LD50 = 750 mg/kg (intra-cellular), LD50 = 0.008 mg/kg (natural toxins), LD50 = 51 mg/kg (extra-cellular). ADME factors: intestinal permeability, 1st pass metabolism.]
17
How to Predict These Effects?
Quality of Predictions = Knowledge of Specific Effects
How much knowledge do we get?
“Reductionist” QSARs do not work
LD50 involves much more than “log P + reactivity”
18
How Much Knowledge?
QSAR Model: $\log(1/\mathrm{LD}_{50}) = \sum_i a_i x_i$
[Figure: Knowledge against Struct. Space. Labels: Expert Deduction, Little Knowledge, More Knowledge, C-SAR + Deduction; each illustrated with Active/Inactive example structures.]
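The additive QSAR model on this slide, log(1/LD50) = Σ a_i x_i, can be evaluated as a simple dot product. The sketch below is only illustrative: the coefficients and descriptor values are made-up numbers, the function name `predict_log_inv_ld50` is hypothetical, and the units of the back-transformed LD50 are whatever units the coefficients were fitted in.

```python
def predict_log_inv_ld50(coefficients, descriptors):
    """Evaluate log(1/LD50) = sum_i a_i * x_i for one compound."""
    if len(coefficients) != len(descriptors):
        raise ValueError("need one coefficient per descriptor")
    return sum(a * x for a, x in zip(coefficients, descriptors))


# Hypothetical coefficients a_i and descriptor values x_i.
a = [0.5, -0.25, 1.0]
x = [2, 4, 1]

log_inv_ld50 = predict_log_inv_ld50(a, x)

# Back-transform to LD50, in the units used when fitting the model.
ld50 = 10 ** (-log_inv_ld50)
print(log_inv_ld50, ld50)
```

A larger value of log(1/LD50) means a smaller LD50, that is, a more toxic compound.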
19
C-SAR + Deduction
LD50 values are split into groups using fragmental descriptors from AB

N (n = 7588, avg. = 1.048, sd = 0.641), split on F81 >= 1
  No:  N0 (n = 7165, avg. = 0.999, sd = 0.583), split on F44 >= 1
    No:  N00 (n = 6844, avg. = 0.978, sd = 0.562), split on F36 >= 1
      No:  N000 (n = 5918, avg. = 0.936, sd = 0.526)
      Yes: N001 (n = 926, avg. = 1.245, sd = 0.694)
    Yes: N01 (n = 321, avg. = 1.46, sd = 0.805), split on F7 >= 1
      No:  N010 (n = 169, avg. = 1.221, sd = 0.737)
      Yes: N011 (n = 152, avg. = 1.725, sd = 0.797)
  Yes: N1 (n = 423, avg. = 1.878, sd = 0.943), split on F56 >= 1
    No:  N10 (n = 184, avg. = 1.61, sd = 0.86)
    Yes: N11 (n = 239, avg. = 2.085, sd = 0.953)

[Figure: example skeletons associated with the splitting fragments.]
The most significant skeletons are “potential toxicophores”
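The splitting of LD50 values by fragment descriptors resembles recursive partitioning. The sketch below shows one generic way to do this, assuming the split criterion is the reduction in the sum of squared deviations and that every test has the form "F_j >= 1". The actual criterion used in AB is not stated in the slides, so this is a stand-in; all names and data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, y):
    """Pick the fragment index j whose test 'F_j >= 1' most reduces SSE."""
    best_j, best_gain = None, 0.0
    for j in range(len(rows[0])):
        yes = [yi for r, yi in zip(rows, y) if r[j] >= 1]
        no = [yi for r, yi in zip(rows, y) if r[j] < 1]
        if not yes or not no:
            continue
        gain = sse(y) - sse(yes) - sse(no)
        if gain > best_gain:
            best_j, best_gain = j, gain
    return best_j


def grow(rows, y, name="N", min_size=2):
    """Recursively split; children are named by appending 0 (No) or 1 (Yes)."""
    node = {"name": name, "n": len(y), "avg": sum(y) / len(y)}
    j = best_split(rows, y)
    if j is None:
        return node
    no = [i for i, r in enumerate(rows) if r[j] < 1]
    yes = [i for i, r in enumerate(rows) if r[j] >= 1]
    if len(no) < min_size or len(yes) < min_size:
        return node  # stop rather than create very small groups
    node["split"] = j
    node["no"] = grow([rows[i] for i in no], [y[i] for i in no],
                      name + "0", min_size)
    node["yes"] = grow([rows[i] for i in yes], [y[i] for i in yes],
                       name + "1", min_size)
    return node


# Hypothetical data: fragment counts (columns F0, F1) and activity values.
rows = [[0, 0], [0, 1], [0, 0], [1, 0], [1, 1], [1, 0]]
y = [0.9, 1.0, 1.1, 2.0, 2.1, 1.9]
tree = grow(rows, y)
print(tree["split"], tree["no"]["avg"], tree["yes"]["avg"])
```

The node names follow the same convention as the tree above (N, N0, N1, N00, ...), so a group with a high average can be traced back to the fragments that define it.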
20
Specific Effects in AB/LD50
> 33,000 compounds with LD50 from RTECS DB
Toxicity classes: Natural toxins, Cholinesterase, DNA Alkylation, ATP Synthesis
[Figure: active and inactive example structures for each toxicity class.]
21
Low-Specific Effects
Small non-bases are least toxic. Hydrophobic amines are most toxic.
[Figure: toxicity regions labelled Narcosis and Nervous syst., plotted against Base pKa (3.5, 7.0, 8.5), Log P (1.2, 3.2) and MW (230, 300). Arrows denote increasing toxicity. Examples: LD50 = 51 mg/kg (CNS effect); LD50 = 750 mg/kg ("narcosis").]
22
Efficacy Comparison
To get new knowledge, statistics must help deduction.
To use QSAR models, they must work in narrow structural spaces.
[Figure: comparison of Expert Deduction, QSAR Model and C-SAR + Deduction; axes: Knowledge, Struct. Diversity, Effort.]
23
QSAR Models in AB/LD50
1. Narrow struct. spaces
2. Dynamic fragmentation
3. “Causal” parameters
$-\log \mathrm{LD}_{50} = \sum_i a_i F_i$
[Figure: example structure with five-atom chains and a reactive skeleton highlighted.]

Class                         N         pLD50
S-1 Specific toxins           260       +1.0 ... +6.5
S-2 Organometallics           120       -0.5 ... +2.5
S-3 Covalent cations          1,300     -0.5 ... +4.5
S-4 Cholinesterase            1,100     -1.5 ... +4.0
S-5 Alkylating agents         800       -0.5 ... +2.5
S-6 ATP Synthesis             600       +0.0 ... +2.4
L-1 Lipophilic bases          4,000     -0.5 ... +1.5
L-2 Non-lipophilic bases      3,800     -1.0 ... +1.0
L-3 Weak bases                4,600     -1.0 ... +0.9
L-4 Hydrophilic bases         3,000     -1.2 ... +0.8
N-1 Large non-bases           3,300     -1.5 ... +1.0
N-2 Very weak bases           2,800     -1.5 ... +0.8
N-3 Mid-size non-bases        4,300     -1.5 ... +0.5
N-4 Small non-bases           3,700     -2.0 ... +0.5
All compounds                 33,680    -2.0 ... +6.5

R (as listed on the slide): ---*---, 0.86, 0.89, 0.82, 0.79, 0.75, 0.75, 0.75, 0.82, 0.84, 0.83, 0.80, 0.76, 0.83
* Similarity algorithm based on MACCS II key
S - Specific effects, L - Low-specific, N - Non-specific.
24
What is novel?
The novel features of the Pharma Algorithms approach are:
• Combination of approaches used separately in earlier software, i.e. Expert Rules (e.g. DEREK), C-SAR (e.g. CASE) and QSAR (e.g. TOPKAT)
• Reliable confidence intervals are generated from QSAR models (class-specific and global) that are derived using an automated multi-step process:
1. Chain fragmentation and PLS with multiple bootstrapping
2. Selection of best fragments with ‘stable’ increments
3. Derivation of multiple models from subsets of the training set to produce ranges of predictions
4. Selection of the best model to use for a particular compound by comparison of the different ranges
5. Calculation of the confidence interval from the range of predictions produced by the most appropriate model
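Steps 3 and 5 above (many models from resampled training data, then an interval from the spread of their predictions) can be sketched with a plain bootstrap. This is a simplified illustration under stated assumptions: a one-descriptor least-squares line stands in for the fragment-based PLS models, the model-selection step 4 is omitted, and the data and function names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for the PLS step)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0  # degenerate resample: fall back to the mean
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def bootstrap_interval(xs, ys, x_new, n_models=200, level=0.95, seed=0):
    """Range of predictions at x_new from models fitted to resampled data."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_models):
        # Resample the training set with replacement and refit.
        idx = [rng.randrange(n) for _ in range(n)]
        a, b = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        preds.append(a + b * x_new)
    preds.sort()
    lo = preds[int((1 - level) / 2 * (n_models - 1))]
    hi = preds[int((1 + level) / 2 * (n_models - 1))]
    return lo, hi


# Hypothetical training data: one descriptor and a pLD50-like response.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.1, 0.9, 2.1, 2.9, 4.2, 4.8]
lo, hi = bootstrap_interval(xs, ys, x_new=2.5)
print(lo, hi)
```

A wide interval signals that the resampled models disagree for this compound, which is the kind of information a confidence interval is meant to convey.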
25
Screening the Specs DB
SPECS is a supplier of diverse compound screening collections
A set (N = 14,902) was randomly selected (from > 200,000) and screened using the AB/LD50 toxicity predictor.
Calculation of LD50 for the set takes about 30 min on a standard Windows laptop
Compounds were deemed “Toxic” if LD50 < 50 mg/kg
Results:
Overall only 2.7% were “toxic” (i.e. 310 of 14,902)
As expected, a higher proportion (3.9%) of the bases (i.e. alkylamines) were toxic (i.e. 92 of 2,351)
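The screening rule itself is a single threshold test. The sketch below applies the 50 mg/kg cut-off to a few hypothetical predicted LD50 values and evaluates the two quoted ratios; 310 of 14,902 comes out at about 2.1% and 92 of 2,351 at 3.9%. The compound names and the helper names are assumptions for the example, not part of AB/LD50.

```python
TOXIC_THRESHOLD_MG_PER_KG = 50.0  # cut-off quoted on the slide


def is_toxic(ld50_mg_per_kg):
    """Flag a compound as "Toxic" if its predicted LD50 is below 50 mg/kg."""
    return ld50_mg_per_kg < TOXIC_THRESHOLD_MG_PER_KG


def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Hypothetical predicted LD50 values (mg/kg) for a handful of compounds.
predicted = {"cpd-1": 750.0, "cpd-2": 51.0, "cpd-3": 0.008, "cpd-4": 12.0}
flagged = sorted(name for name, ld50 in predicted.items() if is_toxic(ld50))
print(flagged)

# Ratios using the counts quoted above.
print(percent(310, 14902))
print(percent(92, 2351))
```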
26
Most significant Toxic Skeletons
[Figure: skeleton structures, each with its fraction of toxic compounds: 68% (86/127, MW > 340), 38% (60/158), 30% (3/9, MW < 260), 21% (7/33), 30% (6/20), 25% (6/24), 67% (4/6), 31% (16/52), 15% (4/26), 37% (6/16), 100% (4/4). Suggested mechanisms: Natural toxin?, Alkylation, oxidation, Cholinesterase, Cyanide release, Artefacts?]
Exp. verification required
27
What We Have Learned So Far
Screening for basal cytotoxicity is not enough
The “C-SAR + Deductive” method opens new possibilities
The extra-cellular effects can be estimated in silico
Can we model in vivo toxicity?
28
Administration vs. ADME
Administration routes: OR – Oral, Sc – Subcutaneous, IP – Intraperitoneal, IV – Intravenous
[Diagram: ADME effects between administration (OR, IV, Sc, IP) and toxic action. Labels: Stomach, Intestine, Vein, Liver, Tissue/organs, Toxic action. Processes: dissolution, permeation, hydrolysis, metabolism.]
29
Complexity of ADME
Informatics: “Simple descriptors”
ADME Specialists: “Simulations”
[Diagram: factors contributing to Oral %F. Labels: Absorption (Solubility, Permeability, Transporters), Gut 1st Pass, Liver 1st Pass.]
“Simple descriptors” disregard many factors.
Can we simulate them in HT mode?
30
Oral %F Prediction in HT Mode
Reliability validated by the consistency of independent predictions
Non-Batch Interface:
31
Cost/Benefit Considerations
In silico bioavailability and toxicity predictions for compound collections are inexpensive to perform
The value of predictions is variable: decisions still need to be made by expert scientists in a project context
In silico tools can assist the expert in a detailed evaluation of ‘hits’, ‘leads’ and ‘candidates’, but there is a need for:
1. Predictions for a range of toxicity types:
   • LD50 (oral, i.v., s.c.)
   • Genotoxicity and carcinogenicity
   • Organ-specific effects (e.g. hepatotoxicity)
2. Integration of the prediction software with databases containing the training data so that the availability and behaviour of similar compounds can be checked
32
Drug Design
General Principles
Aim for low logP
Aim for low M.Wt.
C. Hansch et al. ‘The Principle of Minimal Hydrophobicity in Drug Design’ J. Pharm. Sci., 1987, 76, 663
M.C. Wenlock et al. ‘Comparison of Physicochemical Property Profiles of Development and Marketed Oral Drugs’ J. Med. Chem., 2003, 46, 1250
33
Simulations in HT Screening
“Reductionist” Methods:
High Activity = Low %F + High Tox
HT Simulations aim at:
High Activity = High %F + Low Tox
[Figures: two sketches relating Activity, Tox and %F, one for each case.]
Very rough estimations, assuming that activity increases with increasing log P and MWt