QSAR (Summary)

  • 8/2/2019 QSAR (Summary)

    Activity (e.g. anticancer effect)

    Method

    To relate the biological activity of a series of compounds to their physicochemical parameters in a quantitative fashion using a mathematical formula.

    Simple linear regression: a one-term equation is produced separately for each independent variable.

    It is useful for discovering some of the most important variables.

    It assumes the data are linear.

    It ignores the interaction of multiple variables.

    Y = mx + b
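The one-term fits described above can be sketched as follows. This is only an illustration: the descriptor names (`logP`, `molar_refractivity`) and all numbers are made up, and ranking descriptors by r2 is one possible way to spot the more important variables, not necessarily the procedure used in the slides.

```python
# Illustrative sketch: fit Y = m*x + b separately for each descriptor.
# Descriptor names and values are hypothetical, chosen for demonstration only.

def fit_line(x, y):
    """Return slope m and intercept b of the least-squares line y = m*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

def r_squared(x, y, m, b):
    """Fraction of the variance in y explained by the line y = m*x + b."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical activities for five compounds and two hypothetical descriptors.
activity = [1.0, 2.1, 2.9, 4.2, 5.0]
descriptors = {
    "logP": [0.5, 1.0, 1.5, 2.0, 2.5],
    "molar_refractivity": [3.0, 1.0, 4.0, 1.5, 3.5],
}

# One separate one-term equation per descriptor.
fits = {}
for name, values in descriptors.items():
    m, b = fit_line(values, activity)
    fits[name] = (m, b, r_squared(values, activity, m, b))

# Descriptors ordered from best to worst single-variable fit.
ranked = sorted(fits, key=lambda name: fits[name][2], reverse=True)
for name in ranked:
    m, b, r2 = fits[name]
    print(f"{name}: Y = {m:.3f}*x + {b:.3f}, r2 = {r2:.3f}")
```

Because each descriptor is fitted on its own, this screening cannot detect effects that only appear when variables act together.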

    Multiple linear regression: a single multiple-term linear equation is produced. The equation is fitted to maximise the correlation it explains between the dependent and independent variables. To produce reliable results, one typically needs 5 times as many molecules as independent variables.

    y = a1x1 + a2x2 + ... + constant
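A minimal sketch of fitting the multiple-term equation above by ordinary least squares, using only the standard library. The data are synthetic (generated from known coefficients), and treating the "5 times as many molecules" guideline as a hard error is an assumption made for illustration.

```python
# Illustrative sketch: least-squares fit of y = a1*x1 + ... + ap*xp + constant.
# The data set is synthetic and the strict 5x check is an assumption.

def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def fit_mlr(rows, y):
    """Return [a1, ..., ap, constant] from the normal equations."""
    n, p = len(rows), len(rows[0])
    if n < 5 * p:
        raise ValueError(f"need at least {5 * p} molecules for {p} descriptors")
    design = [list(r) + [1.0] for r in rows]  # last column carries the constant
    k = p + 1
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Ten synthetic molecules, two descriptors, generated from y = 2*x1 - x2 + 0.5.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
x2 = [1.0, 0.0, 2.0, 1.0, 3.0, 0.0, 2.0, 4.0, 1.0, 3.0]
rows = [[u, v] for u, v in zip(x1, x2)]
y = [2.0 * u - 1.0 * v + 0.5 for u, v in zip(x1, x2)]

coeffs = fit_mlr(rows, y)
print("a1, a2, constant =", [round(c, 6) for c in coeffs])
```

With noise-free synthetic data the fit recovers the generating coefficients; real activity data would leave residuals that the diagnostics later in the deck are meant to assess.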

    Partial least squares regression is an extension of the multiple linear regression model. In its simplest form, a linear model specifies the (linear) relationship between a dependent (response) variable Y and a set of predictor variables, the X's.

    Partial least squares regression is probably the least restrictive of the various multivariate extensions of the multiple linear regression model. This flexibility allows it to be used:

    1- In situations where the use of traditional multivariate methods is severely limited, such as when there are fewer observations than predictor variables.

    2- Furthermore, partial least squares regression can be used as an exploratory analysis tool to select suitable predictor variables and to identify outliers before classical linear regression.

    3- Partial least squares regression has been used in various disciplines where predictive linear modeling, especially with a large number of predictors, is necessary.

    Y = b0 + b1X1 + b2X2 + ... + bpXp

    Here Y is the dependent (response) variable and X1 ... Xp are the set of descriptors (variables).
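To make the "fewer observations than predictor variables" case concrete, here is a minimal sketch of PLS regression for a single response (PLS1) restricted to one latent component, written with the standard library only. The single-component restriction, the function name `pls1_one_component`, and the data (4 compounds, 6 descriptors) are all assumptions for illustration; a real analysis would normally choose the number of components by cross-validation and use an established statistics package.

```python
import math

# Illustrative sketch: PLS1 with ONE latent component (a simplification).
# Data are hypothetical: 4 compounds described by 6 descriptors (n < p).

def pls1_one_component(rows, y):
    """Return (b0, b) such that Y = b0 + b1*X1 + ... + bp*Xp."""
    n, p = len(rows), len(rows[0])
    x_mean = [sum(r[j] for r in rows) / n for j in range(p)]
    y_mean = sum(y) / n
    xc = [[r[j] - x_mean[j] for j in range(p)] for r in rows]  # centered X
    yc = [yi - y_mean for yi in y]                             # centered Y
    # Weight vector: direction in descriptor space with maximal covariance with Y.
    w = [sum(xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(wj * wj for wj in w))
    w = [wj / norm for wj in w]
    # Scores: projection of each compound onto that direction.
    t = [sum(xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Regress centered Y on the scores.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    b = [wj * q for wj in w]
    b0 = y_mean - sum(bj * mj for bj, mj in zip(b, x_mean))
    return b0, b

descriptor_rows = [
    [0.5, 1.2, 3.1, 0.0, 2.2, 1.0],
    [1.0, 0.8, 2.9, 1.0, 2.0, 1.5],
    [1.5, 1.1, 3.5, 0.0, 1.7, 2.1],
    [2.0, 0.9, 3.0, 1.0, 1.1, 2.4],
]
activity = [4.1, 4.9, 6.2, 6.8]

intercept, coefficients = pls1_one_component(descriptor_rows, activity)
predicted = [
    intercept + sum(bj * xj for bj, xj in zip(coefficients, row))
    for row in descriptor_rows
]
print("coefficients:", [round(bj, 3) for bj in coefficients])
print("predicted:", [round(v, 3) for v in predicted])
```

Ordinary multiple linear regression has no unique solution for this data set because there are more descriptors than compounds, whereas the latent-component projection still yields one coefficient per descriptor.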

    QSAR model validation implies quantitative assessment of a model's robustness and predictive power.

    The predictive power of a QSAR model can be defined as its ability to predict accurately the modeled property (e.g. biological activity) of new compounds.

    The validation of QSAR models consists of the following steps:
    (a) Statistical Diagnostics
    (b) Internal Validation
    (c) External Validation

    1- n/p Ratio:

    n/p ≥ 5 or n ≥ 5p, where n is the number of data points and p is the number of descriptors used in the QSAR model.

    2- Fraction of the Variance or square of correlation coefficient (r2):

    The value of r2 may vary between 0 and 1, where:

    (1) means a perfect model explaining 100% of the variance
    (0) means a model without any explanatory power.

    A QSAR model having r2 > 0.6 will be considered for validation.

    3- Cross-Validation Test (q2):

    A QSAR model must have q2 > 0.5 to be considered predictive.

    4- Standard Deviation (s):

    A smaller s value is always preferred for a predictive QSAR model.

    5- Fisher Statistic (F):

    The F value of each QSAR model is compared with its respective literature value at the 95% level.

    6- Others.
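The diagnostics listed above can be computed from observed and fitted activities as sketched below. The slides do not give formulas for s and F, so the usual textbook definitions for a regression with a constant term are assumed here: s = sqrt(SS_res / (n - p - 1)) and F = ((SS_tot - SS_res) / p) / (SS_res / (n - p - 1)). The function name and the data are hypothetical.

```python
import math

# Illustrative sketch: n/p ratio, r2, s and F for a fitted regression model.
# The formulas for s and F are standard textbook definitions (an assumption,
# since the source does not state them). All numbers are made up.

def regression_diagnostics(observed, predicted, n_descriptors):
    n = len(observed)
    p = n_descriptors
    mean_obs = sum(observed) / n
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    dof = n - p - 1  # residual degrees of freedom (constant term included)
    r2 = 1.0 - ss_res / ss_tot
    return {
        "n_over_p": n / p,
        "r2": r2,
        "s": math.sqrt(ss_res / dof),
        "F": ((ss_tot - ss_res) / p) / (ss_res / dof),
        "passes_n_over_p": n / p >= 5,  # rule 1: n/p >= 5
        "passes_r2": r2 > 0.6,          # rule 2: r2 > 0.6
    }

# Ten hypothetical compounds, model with two descriptors.
observed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
predicted = [o + 0.5 if i % 2 == 0 else o - 0.5 for i, o in enumerate(observed)]

report = regression_diagnostics(observed, predicted, n_descriptors=2)
for key, value in report.items():
    print(key, value)
```

The computed F still has to be compared with the critical value for (p, n - p - 1) degrees of freedom at the 95% level, which this sketch leaves to a statistical table or library.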

    r2 always increases as more descriptors are added.

    q2 initially increases as more parameters are added but then starts to decrease, indicating data overfitting (optimization).

    Thus q2 is a better indicator of the model quality.

    Ypred and Y indicate the predicted and observed activity values respectively, and Ymean indicates the mean activity value.
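The symbols above belong to a formula that did not survive in this transcript. A commonly used definition, assumed here, is q2 = 1 - PRESS / SS_tot, where PRESS sums the squared differences between each observed value and its prediction from a model fitted without that compound (leave-one-out). The sketch below applies it to a one-descriptor line fit; the data are made up, and using the overall mean in SS_tot is one of several conventions.

```python
# Illustrative sketch: leave-one-out cross-validated q2 for a line fit.
# Assumes q2 = 1 - PRESS / SS_tot (the source formula is missing).

def fit_line(x, y):
    """Return slope m and intercept b of the least-squares line y = m*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    m = sxy / sxx
    return m, mean_y - m * mean_x

def fitted_r2(x, y):
    """r2 of the model fitted on all compounds."""
    m, b = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def loo_q2(x, y):
    """q2 from leave-one-out predictions."""
    n = len(y)
    mean_y = sum(y) / n
    press = 0.0
    for i in range(n):
        # Refit without compound i, then predict compound i.
        m, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (m * x[i] + b)) ** 2
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_tot

# Hypothetical descriptor values and activities for six compounds.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1]

r2 = fitted_r2(x, y)
q2 = loo_q2(x, y)
print(f"r2 = {r2:.4f}, q2 = {q2:.4f}")
```

For a least-squares fit, each leave-one-out residual is at least as large in magnitude as the corresponding fitted residual, so q2 computed this way cannot exceed r2; a large gap between the two is a sign of overfitting.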

    The statistic most commonly used to describe the goodness of fit of data for a regression model is the square of the correlation coefficient, r2:

    $$ r^2 = 1 - \frac{\sum (Y_{obs} - Y_{pred})^2}{\sum (Y_{obs} - Y_{mean})^2} $$

    It measures how closely the observed data track the fitted regression line.

    r2 = 1: perfect fit
    r2 = 0.95: good model
    r2 = 0.7: poor model
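The name "square of the correlation coefficient" and the residual-based formula describe the same number when the model is a least-squares line with a constant term. The sketch below checks this numerically on made-up data; the function names are hypothetical.

```python
import math

# Illustrative sketch: for a least-squares line, 1 - SS_res/SS_tot equals
# the squared Pearson correlation between x and y. Data are made up.

def fit_line(x, y):
    """Return slope m and intercept b of the least-squares line y = m*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    m = sxy / sxx
    return m, mean_y - m * mean_x

def r2_from_residuals(observed, predicted):
    """r2 = 1 - sum((Yobs - Ypred)^2) / sum((Yobs - Ymean)^2)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

descriptor = [0.2, 0.9, 1.4, 2.1, 2.8]
observed = [3.1, 3.9, 4.2, 5.3, 5.8]

m, b = fit_line(descriptor, observed)
predicted = [m * xi + b for xi in descriptor]

r2_formula = r2_from_residuals(observed, predicted)
r_pearson = pearson_r(descriptor, observed)
print(f"1 - SS_res/SS_tot = {r2_formula:.6f}")
print(f"Pearson r squared = {r_pearson ** 2:.6f}")
```

The equality holds only for predictions that come from the least-squares fit itself; for predictions on new compounds the residual-based value can fall below the squared correlation.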

    The steric and hydrophobic parameters of the substituents are the two most important determinants for the activities.

    According to this QSAR model, the paclitaxel derivative (3) must have a more hydrophobic but smaller or less polarizable X substituent for improved cytotoxicity.

    The positive coefficient of ICYALK suggests that the presence of cycloalkyl-containing X substituents will be more favorable to the activity.

    ICYALK is an indicator variable for the unusual activity of the cycloalkyl-containing X substituents.
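How an indicator variable such as ICYALK enters a regression equation can be sketched as follows. The actual model equation is not present in this transcript, so every coefficient and descriptor value below is hypothetical; only the signs follow the text (hydrophobicity favorable, size unfavorable, ICYALK positive).

```python
# Illustrative sketch: an indicator variable in a linear QSAR equation.
# ALL coefficients and descriptor values are hypothetical placeholders,
# not the values of the model discussed in the slides.

HYPOTHETICAL_COEFFS = {
    "hydrophobic": 0.5,   # positive: more hydrophobic X is favorable
    "steric": -0.3,       # negative: bulkier X is unfavorable
    "ICYALK": 0.8,        # positive: cycloalkyl-containing X is favorable
    "constant": 4.0,
}

def icyalk(has_cycloalkyl):
    """Indicator variable: 1 if X contains a cycloalkyl group, else 0."""
    return 1 if has_cycloalkyl else 0

def predict_activity(hydrophobic, steric, has_cycloalkyl, coeffs=HYPOTHETICAL_COEFFS):
    """Evaluate the hypothetical linear equation for one substituent."""
    return (
        coeffs["hydrophobic"] * hydrophobic
        + coeffs["steric"] * steric
        + coeffs["ICYALK"] * icyalk(has_cycloalkyl)
        + coeffs["constant"]
    )

# Two substituents with identical descriptors, differing only in ICYALK.
with_ring = predict_activity(1.2, 2.0, has_cycloalkyl=True)
without_ring = predict_activity(1.2, 2.0, has_cycloalkyl=False)
print("shift due to ICYALK:", round(with_ring - without_ring, 6))
```

Because the indicator takes only the values 0 and 1, its coefficient is simply the constant offset in predicted activity between the two groups of compounds.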
