QSAR (Summary)

Author
smartpharmacist 

8/2/2019 QSAR (Summary)
[Figure: schematic with the labels "Activity (e.g. anticancer effect)" and "Method"; the other labels are not recoverable.]
To relate the biological activity of a series of compounds to their physicochemical parameters in a quantitative fashion using a mathematical formula.

Simple linear regression:
A one-term equation is produced separately for each independent variable.
It is useful for discovering some of the most important variables.
It assumes the data are linear.
It ignores the interaction of multiple variables.
Y = mx + b
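
As a minimal sketch of this one-descriptor-at-a-time approach, the Python below fits a separate Y = mx + b equation for each descriptor and ranks the descriptors by the squared correlation coefficient. The descriptor names (logP, MR) and all numbers are hypothetical and only serve the illustration; they do not come from the slides.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares fit of y = m*x + b; returns (m, b, r2)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    m = sxy / sxx
    b = y_bar - m * x_bar
    r2 = (sxy ** 2) / (sxx * syy)  # squared correlation coefficient
    return m, b, r2

# Hypothetical descriptor values for five compounds (illustration only).
descriptors = {
    "logP": [1.0, 2.0, 3.0, 4.0, 5.0],
    "MR": [2.0, 1.0, 4.0, 3.0, 5.0],
}
activity = [2.1, 3.9, 6.2, 7.8, 10.0]

# One separate one-term equation per descriptor, ranked by r2.
fits = {name: fit_line(x, activity) for name, x in descriptors.items()}
ranked = sorted(fits, key=lambda name: fits[name][2], reverse=True)
print(ranked)
```

Because each descriptor is fitted on its own, the ranking says nothing about how the descriptors act together, which is the limitation noted above.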

Multiple linear regression:
A single multiple-term linear equation is produced. The equation maximises the correlation explained between the dependent and independent variables. To produce reliable results, it typically needs 5 times as many molecules as independent variables.
y = a1x1 + a2x2 + ... + constant
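
The sketch below shows one way such a multiple-term equation could be fitted, by solving the least-squares normal equations in plain Python, together with a check of the 5-molecules-per-variable rule of thumb. The function names (`solve`, `fit_mlr`, `enough_molecules`) and the data are hypothetical; the activity values are generated from known coefficients so that the fit can be checked.

```python
def solve(a, b):
    """Solve the square system a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_mlr(X, y):
    """Least-squares coefficients [constant, a1, ..., ap] from the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # the leading 1 carries the constant term
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def enough_molecules(n_molecules, n_variables, ratio=5):
    """Rule of thumb: at least `ratio` molecules per independent variable."""
    return n_molecules >= ratio * n_variables

# Hypothetical data: 10 compounds, 2 descriptors, activity built from known coefficients.
X = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8), (6, 4), (7, 9), (8, 6), (9, 7), (10, 10)]
y = [1.0 + 2.0 * x1 - 0.5 * x2 for x1, x2 in X]

coefs = fit_mlr(X, y)  # expected to be close to [1.0, 2.0, -0.5]
print(coefs, enough_molecules(len(X), 2))
```

Solving the normal equations directly is fine for a small illustration; for real descriptor sets a numerically more robust solver from a statistics library would be the usual choice.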

Partial least squares regression is an extension of the multiple linear regression model. In its simplest form, a linear model specifies the (linear) relationship between a dependent (response) variable Y and a set of predictor variables, the X's.
Partial least squares regression is probably the least restrictive of the various multivariate extensions of the multiple linear regression model. This flexibility allows it to be used:

1. In situations where the use of traditional multivariate methods is severely limited, such as when there are fewer observations than predictor variables.
2. As an exploratory analysis tool to select suitable predictor variables and to identify outliers before classical linear regression.
3. In various disciplines where predictive linear modeling, especially with a large number of predictors, is necessary.
Y = b0 + b1X1 + b2X2 + ... + bpXp
where Y is the dependent (response) variable and X1 ... Xp are the set of descriptor variables.
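
To illustrate the case of fewer observations than predictor variables, here is a minimal sketch of a one-component PLS1 fit (a single latent variable, single response) in plain Python. It is a simplified illustration, not a full PLS implementation: real models usually extract several components and choose their number by cross-validation. The function name `pls1_one_component` and the data (3 compounds, 5 descriptors driven by one hidden factor) are hypothetical.

```python
def pls1_one_component(X, y):
    """Fit a one-component PLS1 model and return a predict(row) function."""
    n, p = len(X), len(X[0])
    x_mean = [sum(r[j] for r in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[r[j] - x_mean[j] for j in range(p)] for r in X]  # centred descriptors
    yc = [v - y_mean for v in y]                            # centred response
    # Weight vector: covariance of each descriptor with the response, normalised.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # Scores: projection of each compound onto the weight vector.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Regress the centred response on the scores.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)

    def predict(row):
        score = sum((row[j] - x_mean[j]) * w[j] for j in range(p))
        return y_mean + q * score

    return predict

# Hypothetical data: 3 compounds, 5 descriptors, all driven by one hidden factor.
loading = [1.0, 2.0, 0.5, -1.0, 3.0]
latent = [1.0, 2.0, 3.0]
X = [[f * c for c in loading] for f in latent]
y = [2.0 * f + 1.0 for f in latent]

predict = pls1_one_component(X, y)
print([predict(row) for row in X])
```

Ordinary multiple linear regression cannot be solved uniquely here (5 coefficients from 3 observations), whereas the single latent variable is still well defined.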

QSAR model validation implies a quantitative assessment of the robustness of a model and of its predictive power.
The predictive power of a QSAR model can be defined as its ability to predict accurately the modeled property (e.g. biological activity) of new compounds.
The validation of QSAR models consists of the following steps:
(a) Statistical diagnostics
(b) Internal validation
(c) External validation

1. n/p ratio:
n/p ≥ 5, i.e. n ≥ 5p, where n is the number of data points and p is the number of descriptors used in the QSAR model.
2. Fraction of the variance, or square of the correlation coefficient (r²):
The value of r² may vary between 0 and 1, where:
(1) means a perfect model explaining 100% of the variance;
(0) means a model without any explanatory power.
A QSAR model having r² > 0.6 will be considered for validation.

3. Cross-validation test (q²):
A QSAR model must have q² > 0.5 to be considered predictive.
4. Standard deviation (s):
A smaller s value is always required for a predictive QSAR model.
5. Fisher statistic (F):
The F value of each QSAR model is compared with its respective literature value at the 95% level.
6. Others.
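
The slides name s and F without giving formulas. The sketch below assumes the usual regression definitions, namely s as the residual standard deviation with n - p - 1 degrees of freedom and F as the ratio of explained to residual mean squares; these definitions, the function name `s_and_f` and the numbers are assumptions made for illustration.

```python
from math import sqrt

def s_and_f(y_obs, y_pred, p):
    """Return (s, F) for a regression model with p descriptors.

    Assumed definitions:
      s = sqrt(SS_res / (n - p - 1))
      F = ((SS_tot - SS_res) / p) / (SS_res / (n - p - 1))
    """
    n = len(y_obs)
    y_mean = sum(y_obs) / n
    ss_res = sum((o - f) ** 2 for o, f in zip(y_obs, y_pred))
    ss_tot = sum((o - y_mean) ** 2 for o in y_obs)
    dof = n - p - 1  # residual degrees of freedom
    s = sqrt(ss_res / dof)
    f_value = ((ss_tot - ss_res) / p) / (ss_res / dof)
    return s, f_value

# Hypothetical observed and predicted activities for a one-descriptor model.
y_obs = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
s, f_value = s_and_f(y_obs, y_pred, p=1)
print(s, f_value)
```

The computed F would then be compared with the tabulated critical value for (p, n - p - 1) degrees of freedom at the 95% level; the table lookup is left out of this sketch.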

r² increases (it never decreases) as more descriptors are added.
q² initially increases as more parameters are added but then starts to decrease, indicating overfitting of the data (over-optimization).
Thus q² is a better indicator of model quality.
Ypred and Y indicate the predicted and observed activity values, respectively, and Ymean indicates the mean activity value.
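
A common way to obtain q² is leave-one-out cross-validation: each compound is left out in turn, the model is refitted on the rest, and the left-out activity is predicted. The sketch below assumes the usual definition q² = 1 - Σ(Y - Ypred)² / Σ(Y - Ymean)² and uses a one-descriptor linear model for brevity; the function names and data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares fit of y = m*x + b; returns (m, b)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    m = sxy / sxx
    return m, y_bar - m * x_bar

def q2_loo(x, y):
    """Leave-one-out cross-validated q2 for a one-descriptor linear model."""
    y_mean = sum(y) / len(y)
    press = 0.0  # predictive residual sum of squares
    for i in range(len(x)):
        # Refit without compound i, then predict its activity.
        m, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (m * x[i] + b)) ** 2
    ss_tot = sum((v - y_mean) ** 2 for v in y)
    return 1 - press / ss_tot

# Hypothetical descriptor and activity values for five compounds.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]
q2 = q2_loo(x, y)
print(q2)
```

Because every prediction is made for a compound the model has not seen, q² can fall, and even become negative, when extra descriptors only fit noise.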

The statistic most commonly used to describe the goodness of fit of data for a regression model is the square of the correlation coefficient, r²:
$$r^2 = 1 - \frac{\sum (Y_{obs} - Y_{pred})^2}{\sum (Y_{obs} - Y_{mean})^2}$$
It measures how closely the observed data track the fitted regression line.
If r² = 1: perfect fit
r² = 0.95: good model
r² = 0.7: poor model
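
The goodness-of-fit formula, one minus the ratio of the squared residuals to the squared deviations from the mean, translates directly into code. This is a minimal sketch with hypothetical activity values; the function name `r_squared` is not from the slides.

```python
def r_squared(y_obs, y_pred):
    """r2 = 1 - sum((Yobs - Ypred)^2) / sum((Yobs - Ymean)^2)."""
    y_mean = sum(y_obs) / len(y_obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_tot = sum((o - y_mean) ** 2 for o in y_obs)
    return 1 - ss_res / ss_tot

# Hypothetical observed and fitted activities.
y_obs = [1.0, 2.0, 3.0, 4.0]
y_fit = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(y_obs, y_fit)
print(r2)
```

A model that always predicts the mean activity gives r² = 0, and a model that reproduces every observed value gives r² = 1.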

The steric and hydrophobic parameters of the substituents are the two most important determinants for the activities.

According to this QSAR model, the paclitaxel derivative (3) must have a more hydrophobic but smaller or less polarizable X substituent for improved cytotoxicity.
The positive coefficient of ICYALK suggests that the presence of cycloalkyl-containing X substituents will be more favorable to the activity.
ICYALK is an indicator variable for the unusual activity of the cycloalkyl-containing X substituents.
