QSAR analysis of a series of 2-aryl(heteroaryl)-2,5-dihydropyrazolo[4,3-c]quinolin-3-(3H)-ones using...

Analytica Chimica Acta 552 (2005) 42–49

QSAR analysis of a series of 2-aryl(heteroaryl)-2,5-dihydropyrazolo[4,3-c]quinolin-3-(3H)-ones using piecewise hyper-sphere modeling by particle swarm optimization

Li Lin, Wei-Qi Lin, Jian-Hui Jiang, Yan-Ping Zhou, Guo-Li Shen, Ru-Qin Yu∗

State Key Laboratory of Chemo/Biosensing and Chemometrics, College of Chemistry and Chemical Engineering, Hunan University, Changsha 410082, PR China

Received 17 January 2005; received in revised form 14 July 2005; accepted 19 July 2005
Available online 3 October 2005

∗ Corresponding author. Tel.: +86 731 882 2782; fax: +86 731 882 2577. E-mail address: [email protected] (R.-Q. Yu).

0003-2670/$ – see front matter © 2005 Published by Elsevier B.V. doi:10.1016/j.aca.2005.07.033

Abstract

In the present work, we employed piecewise hyper-sphere modeling by particle swarm optimization (PHMPSO), which splits the dataset into subsets with desired linearity in each model, for QSAR studies of a series of 2-aryl(heteroaryl)-2,5-dihydropyrazolo[4,3-c]quinolin-3-(3H)-ones (PQs) for their affinity to the benzodiazepine receptor (BzR). The results were compared to those obtained by MLR modeling in a single model with the whole data set as well as in submodels based on K-means clustering analysis. It has been clearly shown that electronic descriptors and spatial descriptors play important roles in the compounds' affinity to BzR. In addition, the molecular density, the Y component of the principal moment of inertia, and the magnitude and the Y component of the dipole moment of the molecules can detrimentally affect PQ analogue BzR affinity, while the X component of the dipole moment of the molecules can favorably affect the compounds' affinity.
© 2005 Published by Elsevier B.V.

Keywords: Quantitative structure–activity relationship; Piecewise hyper-sphere modeling; Particle swarm optimization; 2-Aryl(heteroaryl)-2,5-dihydropyrazolo[4,3-c]quinolin-3-(3H)-ones

1. Introduction

Benzodiazepine receptor (BzR) ligands exert a continuous profile of pharmacological activities [1], ranging from full agonists (anxiolytic, hypnotic and anticonvulsant agents), through antagonists (nil efficacy), to inverse agonists (proconvulsant and anxiogenic agents). Diazepam, β-CCM and Ro 15-1788 can be considered as typical representatives of an agonist, an inverse agonist and an antagonist, respectively [2].

After 2-aryl(heteroaryl)-2,5-dihydropyrazolo[4,3-c]quinolin-3-(3H)-ones (PQs) were discovered as high affinity BzR ligands [3], there has been a growing interest in their peculiar pharmacological activity. The lack of some unwanted side effects found in classical benzodiazepines, as well as the net shift of intrinsic activity caused by small structural changes, renders the PQs a fascinating and intriguing class of BzR ligands. Only a few papers dealing with quantitative structure–activity studies of PQs have been reported [4–8].

Quantitative structure–activity/property relationship (QSAR/QSPR) studies are unquestionably of great importance in drug design and biochemistry. Once a correlation between structure and activity/property is found, newly designed compounds, including those not yet synthesized, can be readily screened on the computer in order to select structures with the desired properties. It is then possible to select the most promising compounds to synthesize and test in the laboratory. Thus, the QSAR/QSPR approach conserves resources and accelerates the development of new molecules for use as drugs, materials, additives or for any other purpose. Conventional QSAR modeling often treats the dataset with a single linear regression model; as a result, the model frequently leads to unacceptable errors, especially when the compounds in the dataset exhibit obvious structural diversity [9–11].


To circumvent this problem, an innovative approach based on piecewise hyper-sphere modeling by particle swarm optimization (PHMPSO), developed by the present authors [12], was used to split the dataset into subsets with desired linearity in each model. The primary feature of this approach is the use of hyper-spheres to cluster the compounds in the training set in order to obtain submodels, whereupon particle swarm optimization (PSO) is applied to find an optimal hyper-sphere model. PSO, initially proposed by Kennedy and Eberhart [13–17], is a stochastic global optimization technique that simulates the simplified social behavior of bird flocking. In the past few years, PSO has been successfully applied as an optimization technique in many research areas, especially in solving chemical problems. Shen et al. [18] applied a hybridized PSO approach to neural network structure training and used this method in QSAR studies of the bioactivity of some organic compounds. Subsequently, they proposed a procedure based on a minimum spanning tree and PSO algorithms and used it in QSAR analysis of antagonism of angiotensin II [19]. Very recently, a modified version of PSO was used to seek the optimized combinations of variables from which one can extract the most related latent variables that capture maximally the information of the original variable blocks to establish the regression model [20].

In the present study, PHMPSO was adopted to build the structure–activity correlation model for the affinity of a series of PQ analogues to BzR. The results were compared to those obtained by MLR in a single model as well as in submodels based on K-means clustering analysis. The regression model achieved in this way shows significant improvement both in fitting and prediction ability. It has been demonstrated that PHMPSO is effective in QSAR analysis of PQ analogue affinity to BzR and can be used as a complementary tool, for the experimental assessment might be expensive, hazardous and time consuming.

2. Algorithms and data sets

2.1. Piecewise hyper-sphere modeling by particle swarm optimization

As the structural diversity in a QSAR training set increases, it may be difficult to use a single linear model for the whole population of compounds of interest with a desired error level. The quality of the model may depend less on the types of variables present and more on the types of compounds present in the data set. To circumvent this problem, besides the non-linear approaches using sophisticated non-linear functions, an alternative approach is to find multiple models by splitting the whole data set into subsets and modeling each subset as a linear substructure. For constructing piecewise linear models for compounds that have structural diversity, piecewise hyper-sphere modeling by particle swarm optimization uses the concept of cluster analysis [12]. This solution is to find multiple models by splitting the whole dataset into subsets with desired linearity in each model. Compounds in the training set are grouped into different hard clusters based on a probability distance measure to the hyper-sphere cores, that is, the Euclidean distance between the data point and the hyper-sphere core multiplied by the corresponding hyper-sphere weight. The term 'hard' means that each compound belongs to only one cluster. Each cluster used to formulate a submodel is geometrically considered as a hyper-sphere in the data space. Each data point in the hyper-spheres stands for an object or a compound in the training set. Considering the influence of the volume variation of hyper-spheres on the allocation of a particular sample point to different clusters, the hyper-sphere weight was defined as the reciprocal of the square of the hyper-sphere radius. According to the different characteristics of the hyper-spheres, compounds in the training set are allocated to different hyper-spheres. Then, a given compound in the prediction set can be predicted directly by the submodel to which this compound is assigned. In this algorithm, PSO is applied to search for the optimal hyper-spheres. PSO is an evolutionary computation technique developed by Eberhart and Kennedy (1995) by simulating the social behavior of bird flocking or fish schooling. Similar to the genetic algorithm (GA), PSO is a population-based optimization tool. The system is initialized with a population of random solutions and searches for optima by updating generations. Unlike GA, PSO has no evolution operators, such as cross-over and mutation. In PSO, the potential solutions, called particles, are "flown" through the problem space by following the current optimum particles. Compared to GA, the advantages of PSO are that it is easy to implement and there are few parameters to adjust. In PSO, each single solution is a particle in the search space, and each particle is encoded to a real string, the bits of which stand for the cores and radii of a series of hyper-spheres, respectively. The detailed discrete steps of PHMPSO have been described elsewhere [12].

In the aforementioned algorithm, the performance of the optimization is measured by a novel fitness function involving non-Euclidean probability distances between the data points and the hyper-sphere cores. Splitting the whole dataset into subsets should necessarily reduce the residuals from linear fitting within each subset and at the same time pay attention to the closeness of the samples in each subset in the data space. Considering these two requirements, the fitness function is defined as follows.

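\[
\text{Fitness} = \sum_{j=1}^{N} (Y_j - YP_j)^2 \times \left( 1 + \rho \left[ \frac{\sum_{k=1}^{K} \sum_{x_j \in \Omega_k} \left( \lVert x_j - C_k \rVert^2 \, w_k \right)}{\sum_{j=1}^{N} \lVert x_j - m \rVert^2} \right] \right) \tag{1}
\]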

where $N$ is the number of samples in the model, $K$ the number of hyper-spheres, i.e. the number of submodels, $\Omega_k$ represents the whole set of compounds in hyper-sphere $k$, $C_k$ the core of hyper-sphere $k$, $w_k$ the weight of the corresponding hyper-sphere $k$, and $\rho$ is a weighting coefficient that balances accuracy against the compactness of clusters in the data space. By experience, $\rho$ takes the value of 0.1. $m$ is the mean of all the data points. $Y_j$ and $YP_j$ are, respectively, the experimental and calculated bioactivity values of the $j$th sample. The first term of the right side of Eq. (1) is the sum of residual squares (RSS), which defines the accuracy of each model. The compactness of clusters is controlled by the second term of the right side of Eq. (1).
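The fitness of Eq. (1) and the hard allocation of compounds to hyper-spheres can be illustrated with a short Python sketch. This is not the authors' Matlab code. It assumes that allocation uses the squared distance to the core multiplied by the weight w_k = 1/r_k^2 (the same term that appears in Eq. (1)), and it takes the calculated activities as given rather than fitting the per-sphere MLR submodels. The function names and the toy data are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def assign_to_spheres(points, cores, radii):
    """Hard assignment (assumed form): each point goes to the sphere with the
    smallest weighted distance ||x - C_k||^2 * w_k, with w_k = 1 / r_k^2."""
    weights = [1.0 / (r * r) for r in radii]
    labels = []
    for x in points:
        scores = [squared_distance(x, c) * w for c, w in zip(cores, weights)]
        labels.append(scores.index(min(scores)))
    return labels


def fitness(y_obs, y_calc, points, labels, cores, radii, rho=0.1):
    """Eq. (1): RSS * (1 + rho * weighted within-sphere scatter / total scatter)."""
    rss = sum((yo - yc) ** 2 for yo, yc in zip(y_obs, y_calc))
    weights = [1.0 / (r * r) for r in radii]
    compactness = sum(
        squared_distance(x, cores[k]) * weights[k] for x, k in zip(points, labels)
    )
    dim = len(points[0])
    mean = [sum(x[d] for x in points) / len(points) for d in range(dim)]
    total_scatter = sum(squared_distance(x, mean) for x in points)
    return rss * (1.0 + rho * compactness / total_scatter)


# Toy data (hypothetical, not from the paper): two well-separated groups.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
cores = [(0.5, 0.0), (10.5, 0.0)]
radii = [1.0, 1.0]
labels = assign_to_spheres(points, cores, radii)

y_obs = [1.0, 2.0, 3.0, 4.0]
y_calc = [1.0, 2.0, 3.0, 5.0]  # stand-in for the per-sphere MLR predictions
value = fitness(y_obs, y_calc, points, labels, cores, radii)
```

In the full method, a candidate set of cores and radii would be scored by first allocating the compounds, then fitting one MLR model per hyper-sphere, and only then evaluating this fitness.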

2.2. Data sets

To evaluate the performance of the algorithm described above, a series of 58 PQ analogues with their corresponding affinity to BzR was used as the dataset. Their chemical structures and affinities are shown in Table 1. These data were taken from a comprehensive review by Hadjipavlou-Litina et al. [21]. Savini et al. [22] synthesized and evaluated some PQ analogues for their affinity to BzR. We stochastically divided the data into two sets, the training set and the prediction set. The training set consisted of 46 compounds randomly chosen from the whole set and was used for developing regression models, while the prediction set was composed of the remaining 12 compounds and was used in the prediction of the affinity of the compounds.

Table 1
Summary of observed and calculated BzR affinity of 2-aryl(heteroaryl)-2,5-dihydropyrazolo[4,3-c]quinolin-3-(3H)-ones (PQs) along with their structures used in QSAR study

      Substituents                      log 1/IC50
No.   R             R′                  Observed  Calculated a
1     H             H                   9.35      9.3663
2     H             4-Cl                9.00      9.0947
3b    H             4-OCH3              9.17      9.1046
4     6-F           H                   8.16      7.7557
5     6-CF3         H                   5.73      6.0179
6     6-OCH3        H                   5.66      5.3528
7     8-F           H                   9.54      9.3107
8     8-F           3-NO2               8.70      8.8817
9     8-F           3-NH2               9.26      9.2554
10b   8-F           4-OCH3              9.48      8.8701
11    8-F           4-OH                9.34      9.6820
12    8-Cl          H                   9.37      9.0458
13b   8-OCH3        H                   9.17      9.0560
14b   8-OC2H5       H                   8.85      8.9783
15    8-C4H9        H                   9.00      9.0611
16    8-C4H9        4-COOH              5.93      5.7150
17b   8-C4H9        2-Pyridyl-2′-yl     9.50      8.4082
18    8-C4H9        2-Pyrimidyl-2′-yl   8.77      8.3232
19    8-cyC6H11     H                   8.35      8.1406
20    8-cyC6H11     4-COOH              5.55      5.8385
21    8-cyC6H11     2-Pyrimidyl-2′-yl   8.36      7.8377
22    8-OCH2C6H5    H                   7.75      8.0120
23    8-OCF3        H                   9.15      9.0843
24    8-OCF3        2-F                 9.40      9.0285
25b   8-OCF3        2-Cl                8.60      8.2643
26    8-OCF3        2-CH3               8.47      8.7716
27    8-OCF3        3-Br                7.46      7.2477
28    8-OCF3        3-CH3               8.20      8.5991
29    8-OCF3        3-Cl                7.62      7.5430
30    8-OCF3        3-F                 9.40      9.3444
31    8-OCF3        3-NO2               7.20      7.2260
32b   8-OCF3        3-NH2               9.62      8.8817
33    8-OCF3        4-Br                7.82      7.8049
34    8-OCF3        4-CH3               8.79      8.6865
35    8-OCF3        4-Cl                7.90      8.3726
36    8-OCF3        4-F                 9.00      8.7861
37b   8-OCF3        4-NO2               7.40      7.9903
38b   8-OCF3        4-NH2               9.10      8.7827
39    8-OCF3        4-OCH3              9.22      8.8787
40    8-OCF3        4-OH                9.63      9.3021
41    9-OH          H                   9.62      9.6778
42    9-OCH3        H                   8.84      9.3356
43b   6,8-F         H                   7.87      7.6983
44    6,8-F         3-F                 8.02      8.4138
45    6,8-F         4-Br                6.79      6.8692
46    6,8-F         4-OCH3              8.12      8.0396
47    6,8-F         2-Pyridyl-2′-yl     7.82      8.1894
48    6,8-F         2-Pyrimidyl-2′-yl   6.47      7.0779
49    7,9-Cl        H                   8.43      7.9229
50    6,7,8-F       H                   7.70      7.7674
51    6,7,8-F       4-CH3               7.15      7.6001
52    6,7,8-F       4-Cl                7.13      7.1397
53    6,7,8-F       4-F                 7.68      7.4070
54    6,7,8-F       4-OCH3              8.14      7.7658
55b   7,8,9-OCH3    H                   8.90      8.7632
56b   7,8,9-OCH3    4-COOH              5.52      6.1762
57    7,8,9-OCH3    2-Pyridyl-2′-yl     8.50      8.3362
58    7,8,9-OCH3    2-Pyrimidyl-2′-yl   7.24      7.8213

a Calculated by PHMPSO with the best six variables.
b Compounds used for prediction.

Using the Cerius2 3.5 software system on a Silicon Graphics R3000 workstation, over 70 descriptors representing chemical structure were calculated as the original variables. These descriptors cover different aspects of the molecular structure and consist of topological, structural, spatial, thermodynamic and electronic descriptors. The topological descriptors calculated include electrotopological-state indices (E-State indices) [23–27] representing the electron accessibility, such as S-ssCH2, S-aaCH [27], etc. Structural descriptors included are molecular weight (MW), the number of rotatable bonds (Rotlbonds) and the number of hydrogen bond acceptors (Hbond acceptor). The spatial descriptors used involve radius of gyration (RadOfGyration), surface area projections (area), density, principal moment of inertia (PMI) [28], molecular volume (Vm), surface area projections (Shadow indices) and Jurs charged partial surface area descriptors (Jurs descriptors) [28]. The thermodynamic descriptors calculated include the logarithm of the octanol/water partition coefficient (logP) [29], desolvation free energy for water (FH2O) [30], desolvation free energy for octanol (Foct) [30], heat of formation (Hf) [31] and molar refractivity (MolRef) [29]. The electronic descriptors taken were superdelocalizability (Sr), atomic polarizabilities (Apol) [32], the dipole moment (dipole) [32], highest occupied molecular orbital energy (HOMO) and lowest unoccupied molecular orbital energy (LUMO). On the other hand, five variables used by Hadjipavlou-Litina et al. [22] were also included in the list of variables. I is an indicator parameter (0/1) for the seven substituted compounds. σR′ implies a negative role for electron-attracting groups in the 3- and 4-position. LR′4′ is a steric length parameter which is measured along the substitution point bond axis. B5R is one of Verloop's sterimol parameters, which defines the maximum width of a substituent.

The PHMPSO algorithm was written in Matlab 6.5 and run on a personal computer (Intel Pentium 4 processor, 1.5 GHz, 256 MB RAM).
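The Matlab implementation itself is not reproduced in the text. As a rough illustration of the optimizer that PHMPSO relies on, the following is a minimal global-best PSO in Python. The inertia weight, acceleration constants, swarm size, iteration count and the toy objective are all assumptions chosen for illustration, not values from the paper. In PHMPSO, each particle would encode the cores and radii of the hyper-spheres, and the objective would be the fitness of Eq. (1).

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal global-best PSO (illustrative parameter values, not the paper's)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to personal best + pull to global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and keep it inside the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Toy objective with its minimum (0) at the origin."""
    return sum(v * v for v in x)


best, best_val = pso_minimize(sphere, dim=3, bounds=(-5.0, 5.0))
```

There are no cross-over or mutation operators here: particles only adjust their velocity toward their own best position and the swarm's best position, which matches the description of PSO given in Section 2.1.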

3. Results and discussion

For studying the QSAR model of PQ analogues, PHMPSO was applied to split the whole dataset into submodels with desired linearity in each model. The best six variables selected by classical stepwise regression were used in piecewise hyper-sphere modeling. In PHMPSO, hyper-spheres are used for clustering all compounds in the training set to construct submodels. PSO is applied to search for the optimal hyper-spheres. MLR is used to build a regression model for the compounds in each hyper-sphere subset. Two factors must be considered in deciding the number of submodels K: dataset splitting by PHMPSO should improve linearity in each submodel; on the other hand, the number of compounds in each submodel cannot be too small compared to the number of variables, otherwise the stability of the submodels would not be sufficient and overfitting might occur. PHMPSO split the dataset into two hyper-sphere subsets. The two subsets happened to contain 25 and 21 compounds, respectively. In the fitness function, the larger the value of the parameter ρ, the better the compactness of clusters. But this will cause the deviation from linear fitting within each subset to increase. Accordingly, ρ is set to 0.1 by experience to keep a balance between the deviation (residuals) from linear fitting within each subset and the compactness of clusters in the data space. For comparison with the regression model based on PHMPSO, we also used MLR to build a single model for the whole dataset with the same six variables. At the same time, we applied K-means clustering to classify the compounds in the training set and then used MLR to build a linear model for each subset. The purpose of the K-means method is to classify the compounds in the training set based on the distance between the compounds and the cluster centroids, without considering the influence of the clustering results on the regression model. PHMPSO explicitly incorporates regression. It aims at clustering the compounds in the training set with improved linearity in each subset and decreasing the error level of the regression model. The assignment of a given prediction set compound to the appropriate model, and hence its prediction, can be obtained directly according to the characteristic attributes of each hyper-sphere. Compared with the K-means method, PHMPSO can improve the performance of the regression model remarkably.

For comparing the results of the three processes, two parameters were used to test the validity of the regression models. They are the correlation coefficient (R) and the sum of residual squares (RSS), used to evaluate the correlation between the experimental values and the predicted values of the compounds' BzR affinity. We calculated the values of these two parameters for the training set and the prediction set in the three processes. The statistical results are summarized in Table 2. It can be seen from this table that although the variables used in the three processes are identical, better results were obtained from PHMPSO than from the other two algorithms. On the other hand, we calculated R and RSS of the compounds in each hyper-sphere subset modeled by PHMPSO and the results are also shown in Table 2. From this table, one can find that R in each submodel by PHMPSO is higher than those in the model obtained by MLR in a single model and by K-means clustering analysis. RSS in each submodel by PHMPSO is smaller compared to those in the models obtained by the other two methods. In addition, in order to evaluate the stability of the models, we carried out leave-one-out cross-validation on the training set for each model in the three processes and the statistical results are shown in Table 3. It can be seen from Tables 2 and 3 that, compared with a single model by MLR analysis and the submodels obtained by K-means clustering analysis, better results were obtained from multiple models by the PHMPSO algorithm. The calibration and predictive abilities of the model were considerably improved. The stability of the model was also improved. The calculated affinity values using PHMPSO are listed in Table 1. Fig. 1a shows the relationship between the calculated and experimental values of BzR affinity of PQ analogues using PHMPSO. The correlation between the experimental and the calculated BzR affinity values of PQ analogues by MLR modeling using a whole data set is demonstrated in Fig. 1b. Fig. 1c shows the relationship between the experimental and the calculated BzR affinity values of PQ analogues in submodels based on K-means clustering analysis. Fig. 2a reveals the residual squares for all the observations by PHMPSO. The residual squares for all the observations by MLR in a single model are revealed in Fig. 2b. Fig. 2c shows the residual squares for all the observations in submodels based on K-means clustering analysis. A comparison of Fig. 1a–c and Fig. 2a–c shows that better results are obtained from multiple models by PHMPSO than from the other two methods. The convergence process for PHMPSO can be examined in Fig. 3. As can be seen from this figure, PHMPSO can converge to a satisfactory solution quickly.

Table 2
Statistical results in 2-aryl(heteroaryl)-2,5-dihydropyrazolo[4,3-c]quinolin-3-(3H)-ones (PQs) affinity to benzodiazepine receptor using PHMPSO and other two methods

                 R (correlation coefficient)            RSS (the sum of residual squares)
Dataset          Method 1a   Method 2b   Method 3c      Method 1a   Method 2b   Method 3c
Subset 1         0.8996      0.6693      0.7879         2.3677      7.1015      5.6375
Subset 2         0.9522      0.8468      0.9297         1.9259      5.9019      5.4397
Training set     0.9602      0.8740      0.8938         4.2936      13.0031     11.0772
Prediction set   0.9302      0.8761      0.8627         3.1836      4.2819      7.1562

a Method 1: QSAR study by PHMPSO.
b Method 2: QSAR study using MLR in a whole data set.
c Method 3: QSAR study by K-means clustering analysis.

Table 3
The statistical results of the leave-one-out cross-validation on the training set for the two submodels built by PHMPSO and the models constructed on the other two methods

            Variance
Dataset     Method 1a   Method 2b   Method 3c
Subset 1    0.1886      0.4011      0.5250
Subset 2    0.2501                  0.5398

a Method 1: QSAR study by PHMPSO.
b Method 2: QSAR study using MLR in a whole data set.
c Method 3: QSAR study by K-means clustering analysis.

Fig. 1. (a) Calculated vs. observed log 1/IC50 of 2-aryl(heteroaryl)-2,5-dihydropyrazolo[4,3-c]quinolin-3-(3H)-ones (PQs) by PHMPSO. (b) Calculated vs. observed log 1/IC50 of 2-aryl(heteroaryl)-2,5-dihydropyrazolo[4,3-c]quinolin-3-(3H)-ones (PQs) by MLR modeling using a whole data set. (c) Calculated vs. observed log 1/IC50 of 2-aryl(heteroaryl)-2,5-dihydropyrazolo[4,3-c]quinolin-3-(3H)-ones (PQs) in submodels based on K-means clustering analysis.
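The two validation statistics used in this comparison, R and RSS, are simple to compute. The sketch below is a plain Python illustration, not the authors' code; the function names are hypothetical. As input it uses the observed and PHMPSO-calculated log 1/IC50 values of the twelve prediction-set compounds (those marked b in Table 1), so the results should come out close to the prediction-set entries reported for PHMPSO in Table 2, up to rounding of the tabulated values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def residual_sum_of_squares(observed, calculated):
    """Sum of squared differences between observed and calculated values."""
    return sum((o - c) ** 2 for o, c in zip(observed, calculated))


# Prediction-set compounds 3, 10, 13, 14, 17, 25, 32, 37, 38, 43, 55, 56 (Table 1).
observed = [9.17, 9.48, 9.17, 8.85, 9.50, 8.60,
            9.62, 7.40, 9.10, 7.87, 8.90, 5.52]
calculated = [9.1046, 8.8701, 9.0560, 8.9783, 8.4082, 8.2643,
              8.8817, 7.9903, 8.7827, 7.6983, 8.7632, 6.1762]

r = pearson_r(observed, calculated)
rss = residual_sum_of_squares(observed, calculated)
print(f"R = {r:.4f}, RSS = {rss:.4f}")
```

Note that R measures how well the calculated values track the observed ones, while RSS measures the absolute size of the errors, so the two can rank models differently.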

Fig. 2. (a) The residual squares of 2-aryl(heteroaryl)-2,5-dihydropyrazolo[4,3-c]quinolin-3-(3H)-ones (PQs) by PHMPSO. (b) The residual squares of 2-aryl(heteroaryl)-2,5-dihydropyrazolo[4,3-c]quinolin-3-(3H)-ones (PQs) by MLR modeling using a whole data set. (c) The residual squares of 2-aryl(heteroaryl)-2,5-dihydropyrazolo[4,3-c]quinolin-3-(3H)-ones (PQs) in submodels based on K-means clustering analysis.

Fig. 3. Convergence curve of 2-aryl(heteroaryl)-2,5-dihydropyrazolo[4,3-c]quinolin-3-(3H)-ones (PQs) for PHMPSO.

The six best variables applied in PHMPSO are dipole-mag, dipole-X, dipole-Y, density, PMI-Y and Jurs-TPSA. These descriptors encode different aspects of the molecular structure. Dipole moment descriptors belong to the 3D electronic descriptors and they indicate the strength and orientation behavior of molecules in an electrostatic field. Density, PMI-Y and Jurs-TPSA all belong to the spatial descriptors. Density is defined as the ratio of molecular weight to molecular volume. It reflects the types of atoms and how tightly they are packed in a molecule. This descriptor can be related to transport and melt behavior. PMI descriptors calculate the principal moments of inertia about the principal axes of a molecule. PMI-Y is the Y component of the principal moment of inertia. Jurs-TPSA is the total polar surface area: the sum of the solvent-accessible surface areas of atoms whose partial charges have an absolute value greater than or equal to 0.2. From the calculation results, one can suggest that electronic descriptors and spatial descriptors play the key roles in the compounds' affinity to BzR.
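Two of these descriptor definitions are simple enough to state as code. The following Python sketch follows the definitions given above (density as molecular weight over molecular volume, Jurs-TPSA as the summed solvent-accessible area of atoms with absolute partial charge of at least 0.2). It is not how Cerius2 computes them internally; the `Atom` record, the function names and the numerical values are hypothetical.

```python
from typing import NamedTuple, Sequence


class Atom(NamedTuple):
    partial_charge: float  # atomic partial charge
    sasa: float            # solvent-accessible surface area of the atom


def density(molecular_weight: float, molecular_volume: float) -> float:
    """Density descriptor: ratio of molecular weight to molecular volume."""
    return molecular_weight / molecular_volume


def total_polar_surface_area(atoms: Sequence[Atom], threshold: float = 0.2) -> float:
    """Jurs-TPSA as defined in the text: sum of the solvent-accessible areas
    of atoms whose absolute partial charge is >= threshold."""
    return sum(a.sasa for a in atoms if abs(a.partial_charge) >= threshold)


# Hypothetical atoms, for illustration only.
atoms = [
    Atom(partial_charge=-0.45, sasa=12.0),  # polar, counted
    Atom(partial_charge=0.05, sasa=20.0),   # below threshold, skipped
    Atom(partial_charge=0.20, sasa=8.5),    # exactly at threshold, counted
    Atom(partial_charge=-0.10, sasa=15.0),  # below threshold, skipped
]
tpsa = total_polar_surface_area(atoms)
rho_mol = density(300.0, 250.0)
```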

Forty-six compounds in the training set were grouped into two hyper-sphere subsets by PHMPSO. These two subsets happened to contain 25 and 21 compounds. The allocations of these compounds are presented in Table 4. The values of the six selected variables of all these compounds are also shown in Table 4. In order to interpret the influence of the six variables on the compounds' affinity, we calculated the regression coefficients of the six variables in the two submodels and the results can be expressed by the following equations:

\[
\log 1/\mathrm{IC}_{50} = -3.1764\,\text{dipole-mag} + 0.5025\,\text{dipole-X} - 6.4981\,\text{dipole-Y} - 0.2324\,\text{density} - 3.8668\,\text{PMI-Y} + 0.6673\,\text{Jurs-TPSA} \quad (\text{subset 1}) \tag{2}
\]

Page 7: QSAR analysis of a series of 2-aryl(heteroaryl)-2,5-dihydropyrazolo[4,3-c]quinolin-3-(3H)-ones using piecewise hyper-sphere modeling by particle swarm optimization

48 L. Lin et al. / Analytica Chimica Acta 552 (2005) 42–49

Table 4Six best variables used in PHMPSO and the subsets which the compounds were group into

Compound Dipole-mag Dipole-X Dipole-Y Density PMI-Y Jurs-TPSA Activity Subseta

1 2.743 −2.235 1.452 1.16 441.021 98.232 9.35 1
2 10.912 −8.207 −7.082 1.23 693.837 172.994 9.00 1
3b 2.215 −1.830 −0.503 1.16 678.263 116.155 9.17 1
4 7.829 1.136 7.719 1.21 530.558 156.294 8.16 1
5 15.681 5.211 14.762 1.28 789.110 188.873 5.73 2
6 19.910 0.814 19.707 1.16 599.652 123.407 5.66 2
7 4.784 4.756 0.401 1.21 492.996 152.386 9.54 1
8 6.000 −4.858 −3.275 1.27 802.736 233.924 8.70 1
9 4.141 4.108 0.004 1.21 590.489 209.153 9.26 1
10b 5.519 5.345 −1.372 1.20 738.798 171.416 9.48 1
11 4.794 2.991 −3.746 1.23 606.143 209.861 9.34 1
12 6.516 6.497 0.209 1.23 549.526 173.395 9.37 1
13b 2.645 −0.171 2.607 1.16 518.541 121.984 9.17 1
14b 2.648 −0.271 2.576 1.14 552.245 124.213 8.85 1
15 5.844 5.633 0.534 1.08 624.563 272.247 9.00 1
16 12.607 −11.096 −5.955 1.12 976.549 275.677 5.93 2
17b 2.132 −0.957 −0.690 1.09 978.171 144.704 9.50 1
18 4.193 −3.016 −0.513 1.11 920.528 169.370 8.77 1
19 5.481 −4.887 2.262 1.09 741.753 104.948 8.35 1
20 13.973 −12.202 −6.346 1.13 1142.644 227.394 5.55 2
21 5.426 −3.826 0.463 1.11 1067.287 101.049 8.36 2
22 7.633 5.465 −2.588 1.14 1019.564 123.691 7.75 1
23 11.187 10.259 −4.358 1.30 642.607 233.886 9.15 1
24 9.436 6.501 −4.246 1.34 717.925 255.822 9.40 1
25b 9.659 6.127 −4.079 1.35 783.093 271.231 8.60 2
26 12.041 11.101 −4.000 1.27 710.037 230.502 8.47 1
27 7.054 2.740 −6.350 1.49 1268.420 298.098 7.46 2
28 12.367 11.630 −4.160 1.27 774.017 234.178 8.20 1
29 6.896 1.421 −6.589 1.36 941.925 297.746 7.62 2
30 6.743 2.922 −6.003 1.34 803.411 280.176 9.40 1
31 7.378 0.658 −7.042 1.35 1026.120 318.430 7.20 2
32b 11.210 10.226 −4.504 1.30 786.826 290.619 9.62 2
33 11.353 4.335 −10.493 1.49 1282.549 315.360 7.82 2
34 11.571 10.980 −3.579 1.27 743.992 243.611 8.79 1
35 11.726 3.744 −11.069 1.35 948.666 304.845 7.90 2
36 11.368 4.858 −10.275 1.34 790.705 284.189 9.00 2
37b 12.557 2.524 −12.212 1.34 1021.655 334.629 7.40 2
38b 11.051 9.955 −4.741 1.30 757.323 303.065 9.10 2
39 9.969 9.268 −3.282 1.28 892.183 253.975 9.22 2
40 9.636 6.925 −6.679 1.32 768.204 292.795 9.63 1
41 1.696 −1.547 0.580 1.18 448.123 154.534 9.62 1
42 3.451 −2.564 1.287 1.16 455.213 139.140 8.84 1
43b 10.300 8.040 6.426 1.26 571.032 196.433 7.87 1
44 4.019 0.849 3.806 1.31 696.828 255.376 8.02 1
45 3.027 2.980 −0.529 1.48 1205.081 273.785 6.79 2
46 10.477 9.110 5.173 1.25 849.016 215.267 8.12 2
47 11.473 9.222 6.782 1.24 787.325 178.700 7.82 2
48 10.610 8.530 6.188 1.26 800.405 191.593 6.47 1
49 10.605 10.079 3.296 1.30 669.311 227.071 8.43 1
50 16.638 13.673 9.436 1.31 659.585 241.659 7.70 2
51 17.906 14.478 10.514 1.27 794.383 249.695 7.15 2
52 8.358 8.187 1.590 1.37 982.029 326.423 7.13 2
53 9.250 8.816 2.663 1.35 820.054 305.657 7.68 2
54 17.391 15.132 8.541 1.29 965.601 259.433 8.14 2
55b 3.961 −2.536 −0.480 1.16 759.847 167.256 8.90 1
56b 13.074 −9.323 −8.955 1.19 1207.326 272.102 5.52 2
57 4.933 1.085 −3.307 1.16 1037.991 146.049 8.50 1
58 2.675 −2.235 −1.426 1.17 1050.424 162.499 7.24 2

a Subset into which the compounds were grouped.
b Compounds used for prediction.
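The two footnote marks determine how each table row is used: the trailing "b" on a compound number flags a prediction-set compound, and the last column gives the subset. The following is a minimal sketch of reading such rows, assuming each compound is written as one whitespace-separated line of nine fields (compound number, six descriptor values, log 1/IC50, subset). The names `Row` and `parse_row` are hypothetical, and the column interpretation is inferred from the field count and Eq. (3), not stated in this part of the text.

```python
from typing import NamedTuple


class Row(NamedTuple):
    compound: int
    for_prediction: bool  # True when the compound number carries the "b" mark
    descriptors: tuple    # six descriptor values, in table order (assumed)
    activity: float       # log 1/IC50 (assumed to be the seventh numeric column)
    subset: int           # 1 or 2, footnote a


def parse_row(line: str) -> Row:
    # The table uses the typographic minus sign (U+2212); float() needs ASCII "-".
    fields = line.replace("\u2212", "-").split()
    label = fields[0]
    values = [float(x) for x in fields[1:8]]
    return Row(
        compound=int(label.rstrip("b")),
        for_prediction=label.endswith("b"),
        descriptors=tuple(values[:6]),
        activity=values[6],
        subset=int(fields[8]),
    )


# Three rows copied from the table as an illustration.
rows = [parse_row(r) for r in (
    "1 2.743 −2.235 1.452 1.16 441.021 98.232 9.35 1",
    "3b 2.215 −1.830 −0.503 1.16 678.263 116.155 9.17 1",
    "5 15.681 5.211 14.762 1.28 789.110 188.873 5.73 2",
)]
training = [r for r in rows if not r.for_prediction]
prediction = [r for r in rows if r.for_prediction]
```

Splitting on the "b" mark first and on the subset column second mirrors the way the footnotes describe the data; any further preprocessing of the descriptors is outside the scope of this sketch.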


log 1/IC50 = −1.0468 dipole-mag + 7.3559 dipole-X − 6.3907 dipole-Y − 1.2172 density − 1.0632 PMI-Y − 3.9834 Jurs-TPSA (subset 2)    (3)

It can be seen from these two equations that, in both submodels, the compounds' affinity increases as the dipole-mag, dipole-Y, density and PMI-Y values decrease, as evidenced by their negative regression coefficients in the two equations. In other words, affinity increases with decreasing molecular density, Y component of the principal moment of inertia, and magnitude and Y component of the dipole moment of the molecules. On the other hand, the positive coefficients of dipole-X in the two equations signify that an increase in the X component of the dipole moment increases the affinity. For the parameter Jurs-TPSA, the positive regression coefficient in Eq. (2) and the negative coefficient in Eq. (3) indicate that an increased total polar surface area of the atoms increases the compounds' affinity in the first submodel but decreases it in the second.
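The sign argument above can be made concrete with a small sketch that evaluates Eq. (3) as a weighted sum. The coefficients are copied from Eq. (3); the function names are hypothetical. The sketch assumes the descriptor values are supplied on the same scale used when the model was fitted, which this section does not specify, so the Table values should not be plugged in without checking that preprocessing.

```python
# Coefficients copied from Eq. (3) (subset 2); the equation as printed has no intercept.
EQ3_COEFFS = {
    "dipole-mag": -1.0468,
    "dipole-X": 7.3559,
    "dipole-Y": -6.3907,
    "density": -1.2172,
    "PMI-Y": -1.0632,
    "Jurs-TPSA": -3.9834,
}


def predict_log_inv_ic50(descriptors, coeffs=EQ3_COEFFS):
    """Return the weighted sum of descriptor values, as in Eq. (3).

    `descriptors` maps each descriptor name to its value; the values are
    assumed to be preprocessed exactly as in the original fit.
    """
    return sum(coeffs[name] * descriptors[name] for name in coeffs)


def effect_direction(coeffs=EQ3_COEFFS):
    """Say whether raising each descriptor raises or lowers predicted affinity."""
    return {
        name: "increases" if c > 0 else "decreases"
        for name, c in coeffs.items()
    }


# Raising dipole-X by one unit, all else equal, shifts the prediction
# by exactly the dipole-X coefficient.
base = {name: 0.0 for name in EQ3_COEFFS}
bumped = dict(base, **{"dipole-X": 1.0})
delta = predict_log_inv_ic50(bumped) - predict_log_inv_ic50(base)
```

Because the model is linear, each coefficient is the change in predicted log 1/IC50 per unit change of its descriptor, which is why the sign alone settles whether a descriptor favors or disfavors affinity within a submodel.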

The results indicate that PHMPSO is useful for studying the QSAR model of PQ analogues, and that it is a useful tool for developing piecewise models.

4. Conclusion

In the present work, the PHMPSO algorithm was employed in the QSAR study of a series of PQ analogues, with satisfactory results. Hyper-spheres were used to cluster all compounds in the training set in order to construct submodels, and PSO was applied to search for the optimal hyper-spheres that give satisfactory piecewise linear models. By analyzing the results of this process, we can infer that electronic descriptors and spatial descriptors are the most important descriptors in predicting the BzR affinity of PQ analogues. In addition, the molecular density, the Y component of the principal moment of inertia, and the magnitude and the Y component of the dipole moment of the molecules can detrimentally affect PQ analogue BzR affinity, while the X component of the dipole moment of the molecules can favorably affect the compounds' affinity.

QSAR modeling was also carried out by MLR in a single model as well as in submodels based on K-means clustering analysis. Comparing the results obtained from the three methods, PHMPSO shows satisfactory calibration and prediction performance in the QSAR analysis of PQ analogue BzR affinity. PHMPSO has been shown to be useful in improving the performance of QSAR models.

Acknowledgments

The work was financially supported by the National Natural Science Foundation of China (Grants Nos. 20375012, 20105007, 20205005 and 20435010).

References

[1] C. Braestrup, T. Honore, M. Nielsen, E.N. Petersen, H. Jensen, Biochem. Pharmacol. 33 (1984) 859.

[2] W. Haefely, E. Kyburz, M. Gerecke, H. Mohler, Adv. Drug Res. 14 (1985) 165.

[3] N. Yokoyama, B. Ritter, A.D. Neubert, J. Med. Chem. 25 (1982) 337.

[4] S. Takada, H. Shindo, T. Sasatani, N. Chomei, A. Matsushita, E. Masami, K. Kawasaki, S. Murata, Y. Takahara, H. Shintaku, J. Med. Chem. 31 (1988) 1738.

[5] H. Shindo, S. Takada, S. Murata, E. Masami, A. Matsushita, J. Med. Chem. 32 (1989) 1213.

[6] G. Wong, G. Zi-Qiang, R.I. Fryer, P. Skolnick, Med. Chem. Res. 2 (1992) 217.

[7] R.I. Fryer, P. Zhang, R. Rios, Z.Q. Gu, A.S. Basile, P. Skolnick, J. Med. Chem. 36 (1993) 1669.

[8] L.T. Schove, J.I. Perez, P.A. Maguire, G.I. Loew, Med. Chem. Res. 4 (1994) 307.

[9] S.J. Cho, M.A. Hermsmeier, J. Chem. Inf. Comput. Sci. 42 (2002) 927.

[10] A.M.B. Nasser, J.H. Jiang, Y.Z. Liang, R.Q. Yu, Chemom. Intell. Lab. Syst. 72 (2004) 73.

[11] Y.P. Du, Y.Z. Liang, D. Yun, J. Chem. Inf. Comput. Sci. 42 (2002) 1283.

[12] W.Q. Lin, J.H. Jiang, Q. Shen, H.L. Wu, G.L. Shen, R.Q. Yu, J. Chem. Inf. Comput. Sci. 45 (2005) 535.

[13] J. Kennedy, R. Eberhart, IEEE Int. Conf. Neural Netw. 4 (1995) 1942.

[14] Y. Shi, R. Eberhart, IEEE World Congr. Comput. Intell. (1998) 69.

[15] M. Clerc, J. Kennedy, IEEE Trans. Evol. Comput. 6 (2002) 58.

[16] Y. Shi, R. Eberhart, Proc. Congr. Evol. Comput. (2001) 101.

[17] J. Kennedy, R.A. Eberhart, IEEE Int. Conf. Comput. Cybern. Simul. (1997) 4104.

[18] Q. Shen, J.H. Jiang, C.X. Jiao, W.Q. Lin, G.L. Shen, R.Q. Yu, J. Comput. Chem. 25 (2004) 1726.

[19] Q. Shen, J.H. Jiang, C.X. Jiao, S.Y. Huan, G.L. Shen, R.Q. Yu, J. Chem. Inf. Comput. Sci. 44 (2004) 2027.

[20] W.Q. Lin, J.H. Jiang, Q. Shen, G.L. Shen, R.Q. Yu, J. Chem. Inf. Comput. Sci. 45 (2005) 486.

[21] D. Hadjipavlou-Litina, R. Garg, C. Hansch, Chem. Rev. 104 (2004) 3751.

[22] L. Savini, P. Massarelli, C. Nencini, C. Pellerano, G. Biggio, A. Maciocco, G. Tuligi, A. Carrieri, N. Cinone, A. Carotti, Bioorg. Med. Chem. 6 (1998) 389.

[23] L.B. Kier, L.H. Hall, Pharm. Res. 7 (1990) 801.

[24] L.H. Hall, L.B. Kier, J. Chem. Inf. Comput. Sci. 31 (1991) 76.

[25] L.H. Hall, B. Mohney, L.B. Kier, Quant. Struct.-Act. Relat. 10 (1991) 43.

[26] L.B. Kier, L.H. Hall, J.W. Frazer, J. Math. Chem. 7 (1991) 229.

[27] L.H. Hall, L.B. Kier, J. Chem. Inf. Comput. Sci. 35 (1995) 1039.

[28] D.T. Stanton, P.C. Jurs, Anal. Chem. 62 (1990) 2323.

[29] V.N. Viswanadhan, A.K. Ghose, G.R. Revankar, R.K. Robins, J. Chem. Inf. Comput. Sci. 29 (1989) 163.

[30] A.J. Hopfinger, et al., Ann Arbor Press, Ann Arbor, MI, 1980, p. 385.

[31] M.J.S. Dewar, W. Thiel, J. Am. Chem. Soc. 99 (1977) 4899.

[32] J. Gasteiger, M. Marsili, Tetrahedron 36 (1980) 3219.