Ligand-Based Virtual Screening and Molecular Docking Studies to...

17
Ligand-Based Virtual Screening and Molecular Docking Studies to Identify the Critical Chemical Features of Potent Cathepsin D Inhibitors Sugunadevi Sakkiah, Sundarapandian Thangapandian and Keun Woo Lee* Division of Applied Life Science (BK21 Program), Systems and Synthetic Agrobiotech Center, Plant Molecular Biology and Biotechnology Research Center, Research Institute of Natural Science, Gyeongsang National University, 501 Jinju-daero, Gazha-dong, Jinju 660-701, South Korea *Corresponding author: Keun Woo Lee, [email protected] Cathepsin D is a major component of lysosomes and plays a major role in catabolism and degenera- tive diseases. The quantitative structure–activity relationship study was used to explore the critical chemical features of cathepsin D inhibitors. Top 10 hypotheses were built based on 36 known cathep- sin D inhibitors using HYPOGEN DISCOVERY STUDIO v2.5. The best hypothesis Hypo1 consists of three hydrophobic, one hydrogen bond acceptor lipid, and one hydrogen bond acceptor features. The selected Hypo1 model was cross-validated using Fi- scher’s randomization method to identify the strong correlation between experimental and pre- dicted activity value as well as the test set and decoy sets used to validate its predictability. More- over, the best hypothesis was used as a 3D query in virtual screening of Scaffold database. Subse- quently, the screened hit molecules were filtered by applying Lipinski’s rule of five, absorption, dis- tribution, metabolism, and toxicity, and molecular docking studies. Finally, 49 compounds were obtained as potent cathepsin D inhibitors based on the consensus scoring values, critical interactions with protein active site residues, and predicted activity values. Thus, we suggest that the applica- tion of Hypo1 could assist in the selection of potent cathepsin D leads from various databases. Hence, this model was used as a valuable tool to design new candidate for cathepsin D inhibitors. Key words: cathepsin D, consensus scoring, molecular docking, pharmacophore, virtual screening Received 8 November 2010, revised 22 December 2011 and accepted for publication 7 January 2012 A number of key cellular functions are controlled by different prote- ases such as caspases. Caspases plays a central role in apoptotic cell death particularly (1), degradation and elimination of abnormal pro- teins (2), and cancer metastasis (3). The basement membrane degra- dation and the surrounding extracellular matrix represent a key step in cancer invasion and metastasis (4). Several classes of proteases including serine proteases, plasminogen activator, matrix metallopro- teases, and cathepsins are thought to be involved in the aforemen- tioned degradation process (5–7). Among these, cathepsin families play a major regulatory role as the growth factors and growth fac- tor–binding proteins and its receptor activities have an important bio- logical function in human colorectal cancer development (8,9). Cathepsins are lysosomal proteolytic enzyme that located in intra- lysosome. They are synthesized as an inactive preproenzyme and activated by post-transcriptional glycosylation. They are classified based on the chemical nature of groups responsible for its catalytic activity: serine proteases (Cathepsin A and G), cysteine proteases (Cathepsin B, C, H, K, L, and S), and aspartic proteases (Cathepsin D and E) (10,11,12). Among these, Cathepsin D (CatD, EC 3.4.23.5) is an intracellular aspartic protease of pepsin superfamily involved in various pathological conditions like cancer (13), Alzheimer's dis- ease (14), and neuronal ceroid lipofuscinosis (15). Cathepsin D exists as two chains: amino terminal light chain (14 kDa) and car- boxyl terminal heavy chain (34 kDa). It is a bilobed molecule, each lobe contributes one aspartate (D) residue for its active mechanism, and the substrate-binding cleft formed between the lobes. It com- prises three topologically distinct regions: (i) N-terminal domain (residues 1–188) (15), (ii) C-terminal domain (residues 189–346), and (iii) interdomain that consists of antiparallel b-sheet from N-termi- nus (residues 1–7), C-terminus (residues 330–346), and the linking interdomain (residues 160–200). Cathepsin D associates with vari- ous therapeutically significant processes like lysosomal biogenesis and protein targeting (16,17), antigen processing, and presentation of peptide fragments to class II major histocompatibility complexes (18–20), connective tissue disease pathology (21), degenerative brain changes (22,23), and cleavage of amyloid precursor protein within senile plaques of Alzheimer brain (24). Therefore, CatD will be a promising drug target for various diseases. Recently, there are numerous evidences that the pharmacophore modeling–based virtual screening and molecular docking studies are extensively used to discover the potent and novel leads for various druggable targets (25–30). Although numerous CatD inhibitors were designed and validated, still there is a lack of information about the critical chemical features for CatD inhibitor. Therefore, ligand- based pharmacophore technique was utilized to key out its selective chemical features, which could serve as a guide in designing the 64 Chem Biol Drug Des 2012; 80: 64–80 Research Article ª 2012 John Wiley & Sons A/S doi: 10.1111/j.1747-0285.2012.01339.x

Transcript of Ligand-Based Virtual Screening and Molecular Docking Studies to...

Page 1: Ligand-Based Virtual Screening and Molecular Docking Studies to …bio.gnu.ac.kr/publication/pdf/2012_10(83).pdf · 2012. 7. 2. · docking studies. Finally, 49 compounds were obtained

Ligand-Based Virtual Screening and MolecularDocking Studies to Identify the Critical ChemicalFeatures of Potent Cathepsin D Inhibitors

Sugunadevi Sakkiah, SundarapandianThangapandian and Keun Woo Lee*

Division of Applied Life Science (BK21 Program), Systems andSynthetic Agrobiotech Center, Plant Molecular Biology andBiotechnology Research Center, Research Institute of NaturalScience, Gyeongsang National University, 501 Jinju-daero,Gazha-dong, Jinju 660-701, South Korea*Corresponding author: Keun Woo Lee, [email protected]

Cathepsin D is a major component of lysosomesand plays a major role in catabolism and degenera-tive diseases. The quantitative structure–activityrelationship study was used to explore the criticalchemical features of cathepsin D inhibitors. Top 10hypotheses were built based on 36 known cathep-sin D inhibitors using HYPOGEN ⁄ DISCOVERY STUDIO

v2.5. The best hypothesis Hypo1 consists of threehydrophobic, one hydrogen bond acceptor lipid,and one hydrogen bond acceptor features. Theselected Hypo1 model was cross-validated using Fi-scher’s randomization method to identify thestrong correlation between experimental and pre-dicted activity value as well as the test set anddecoy sets used to validate its predictability. More-over, the best hypothesis was used as a 3D query invirtual screening of Scaffold database. Subse-quently, the screened hit molecules were filteredby applying Lipinski’s rule of five, absorption, dis-tribution, metabolism, and toxicity, and moleculardocking studies. Finally, 49 compounds wereobtained as potent cathepsin D inhibitors based onthe consensus scoring values, critical interactionswith protein active site residues, and predictedactivity values. Thus, we suggest that the applica-tion of Hypo1 could assist in the selection of potentcathepsin D leads from various databases. Hence,this model was used as a valuable tool to designnew candidate for cathepsin D inhibitors.

Key words: cathepsin D, consensus scoring, molecular docking,pharmacophore, virtual screening

Received 8 November 2010, revised 22 December 2011 and acceptedfor publication 7 January 2012

A number of key cellular functions are controlled by different prote-ases such as caspases. Caspases plays a central role in apoptotic celldeath particularly (1), degradation and elimination of abnormal pro-

teins (2), and cancer metastasis (3). The basement membrane degra-dation and the surrounding extracellular matrix represent a key stepin cancer invasion and metastasis (4). Several classes of proteasesincluding serine proteases, plasminogen activator, matrix metallopro-teases, and cathepsins are thought to be involved in the aforemen-tioned degradation process (5–7). Among these, cathepsin familiesplay a major regulatory role as the growth factors and growth fac-tor–binding proteins and its receptor activities have an important bio-logical function in human colorectal cancer development (8,9).

Cathepsins are lysosomal proteolytic enzyme that located in intra-lysosome. They are synthesized as an inactive preproenzyme andactivated by post-transcriptional glycosylation. They are classifiedbased on the chemical nature of groups responsible for its catalyticactivity: serine proteases (Cathepsin A and G), cysteine proteases(Cathepsin B, C, H, K, L, and S), and aspartic proteases (CathepsinD and E) (10,11,12). Among these, Cathepsin D (CatD, EC 3.4.23.5)is an intracellular aspartic protease of pepsin superfamily involvedin various pathological conditions like cancer (13), Alzheimer's dis-ease (14), and neuronal ceroid lipofuscinosis (15). Cathepsin Dexists as two chains: amino terminal light chain (14 kDa) and car-boxyl terminal heavy chain (34 kDa). It is a bilobed molecule, eachlobe contributes one aspartate (D) residue for its active mechanism,and the substrate-binding cleft formed between the lobes. It com-prises three topologically distinct regions: (i) N-terminal domain(residues 1–188) (15), (ii) C-terminal domain (residues 189–346), and(iii) interdomain that consists of antiparallel b-sheet from N-termi-nus (residues 1–7), C-terminus (residues 330–346), and the linkinginterdomain (residues 160–200). Cathepsin D associates with vari-ous therapeutically significant processes like lysosomal biogenesisand protein targeting (16,17), antigen processing, and presentationof peptide fragments to class II major histocompatibility complexes(18–20), connective tissue disease pathology (21), degenerativebrain changes (22,23), and cleavage of amyloid precursor proteinwithin senile plaques of Alzheimer brain (24). Therefore, CatD willbe a promising drug target for various diseases.

Recently, there are numerous evidences that the pharmacophoremodeling–based virtual screening and molecular docking studies areextensively used to discover the potent and novel leads for variousdruggable targets (25–30). Although numerous CatD inhibitors weredesigned and validated, still there is a lack of information aboutthe critical chemical features for CatD inhibitor. Therefore, ligand-based pharmacophore technique was utilized to key out its selectivechemical features, which could serve as a guide in designing the

64

Chem Biol Drug Des 2012; 80: 64–80

Research Article

ª 2012 John Wiley & Sons A/S

doi: 10.1111/j.1747-0285.2012.01339.x

Page 2: Ligand-Based Virtual Screening and Molecular Docking Studies to …bio.gnu.ac.kr/publication/pdf/2012_10(83).pdf · 2012. 7. 2. · docking studies. Finally, 49 compounds were obtained

potent CatD inhibitors. Hence, we generated a 3D chemical fea-ture–based pharmacophore model using well-known inhibitors andvalidated by Fischer's randomization method, test set, and otherstatistical parameters. The test set consists of structurally differentmolecules that are not included in the hypothesis generation. Thevalidated pharmacophore model was used as a 3D query in virtualscreening to choose new compounds from the Scaffold database.Then, the hits were filtered out by applying Lipinski's rule of five,and absorption, distribution, metabolism, and toxicity (ADMET) tomake them more drug-likeness. Molecular docking study was uti-lized to identify the molecules that can form a suitable orientationand the critical interaction with active residues of CatD.

Computational Methods

Pharmacophore modelingPharmacophore modeling is one of the most potent and rapid meth-ods to discover a novel scaffolds from various chemical databases.In general, two different approaches are used to build the ligand-based pharmacophore model: (i) based on common chemical fea-tures of the molecules in training set (15) and (ii) using the activityvalues and the structure of the compounds in training set.

SoftwarePharmacophore hypotheses were generated using HYPOGEN algo-rithm implemented in DISCOVERY STUDIO v2.5 (DS, www.accelrys.com;Accelrys, Inc. San Diego, CA, USA). The two- and three-dimensionalstructures of compounds were constructed using CHEMSKETCH V12,www.acdlabs.com (Advanced Chemistry Development, Inc. Toronto,Ontario, Canada.) and DS, respectively.

Preparation of data setsA total of 76 known CatD inhibitors were collected from various lit-eratures based on its structural diversity and activity coverage. Theactivity data for all compounds were taken from different scientificgroups that tested with an equivalent binding assay for CatD inhibi-tory activity (IC50). To achieve a best-quality hypothesis, the trainingset must obey the rules in a 3D quantitative structure–activity rela-tionship generation. Among 76, 36 molecules have selected as atraining set based on the following criteria to produce the goodquantitative pharmacophore model: (i) by covering a wide activityrange of four orders of magnitude, (ii) by including the most active,moderate, and inactive inhibitors, (iii) by binding of all the com-pounds present in the training set the same target in the similarway, and (iv) by testing the activity values (IC50) for the each com-pound using similar assay method. The remaining 40 compounds(31–35) are used as a test set to validate the generated pharmaco-phores. The biological activity of the training set molecules againstCatD is reported as IC50 values, which spans from 6 to 10 000 nM

with four orders of magnitude.

Conformational analysesThe training and test set compounds were energy-minimized byapplying the CHARMm force field (36) and subjected to a conforma-

tional analysis using Poling algorithm (37). It promotes the confor-mational variation that forces similar conformer away from eachother and ameliorates the conformational space coverage. For this,the maximum numbers of 255 conformations are generated for eachcompound with a constraint 20 kcal ⁄ mol. Thus, the best three-dimensional arrangement of chemical functionalities explaining theactivity variations among the training and test sets was identifiedfrom the above procedure.

Generation of pharmacophoreHYPOGEN, the most leading algorithm for the automated pharma-cophore generation, was used to identify the critical chemicalfeatures of CatD inhibitors. The algorithm has shown to be suc-cessful in a large number of applications in medicinal chemistry(38–40). Typically, HYPOGEN requires an informative training set asminimum number of 16 molecules with bioactivities spans overfour orders of magnitude. All the default parameters wereapplied in the generation of pharmacophore, except the uncer-tainly value that changed to 2.0 instead of 3.0. Furthermore, inthe hypothesis generation phase, five pharmacophore features,hydrogen bond acceptor (HBA), hydrogen bond donor (HBD), ringaromatic (RA), hydrophobic (Z), and hydrogen bond acceptor lipid(HBAL), were considered. The predictive pharmacophore is pro-duced through three phases such as constructive, subtractive, andoptimization phases. Constructive phase is the initial stage ofhypothesis generation that considers all possible pharmacophoreconfigurations of the most active compounds to entail pharmaco-phore demands. The next proximate phase, subtractive, all possi-ble pharmacophore configurations of the constructive phase aremaintained or discarded depending on the number of least activecompounds of training set that share a pharmacophore pattern.Finally optimization phase, candidate model within the pharmaco-phore space of a particular target through fine perturbation topharmacophore hypotheses is evaluated. The hypothesis genera-tion process stops when no better score of the hypothesis canbe accomplished and the output parameters determine the qualityof the hypothesis.

Pharmacophore validationsThe selected pharmacophore model was validated by several meth-ods to determine its ability to discriminate between the active andinactive compounds. Pharmacophore was evaluated for statisticalsignificance using the cross-validation procedure derived from Fi-scher's randomization method, test and decoy sets to confirm itspredictive power. As an initial validation, the confidence level of Fi-scher's randomization method was set to 95%. Hence, 19 randomspreadsheets were generated, and the each spreadsheet producedhypothesis. Second, test set, based on the activity values, 40 mole-cules were classified into highly active (IC50 £ 300 nM), moderatelyactive (300 nM < IC50 £ 3000 nM), and inactive (IC50 > 3000 nM).Test set was used to evaluate whether the best hypothesis can pre-dict the compounds in their own activity range. Finally, decoy setwas used to confirm the predictive power of our proposed modelbased on the analyses of statistical parameters like goodness of hit(GH) and enrichment factor (EF). Following equations were used tocalculate the EF and GH:

Pharmacophore-Based Identification of Potent Inhibitor of Cathepsin D

Chem Biol Drug Des 2012; 80: 64–80 65

Page 3: Ligand-Based Virtual Screening and Molecular Docking Studies to …bio.gnu.ac.kr/publication/pdf/2012_10(83).pdf · 2012. 7. 2. · docking studies. Finally, 49 compounds were obtained

EF = [(Ha · D) ⁄ (Ht · A)]

GH = [(Ha ⁄ 4HtA) (3A + Ht) · (1 ) ((Ht ) Ha) ⁄ (D ) A))]

where 'Ha' is the total number of active molecules in the hit list,'D' is total number of molecules in the decoy set, 'Ht' is the totalnumber of hit molecules from the database, and 'A' is the totalnumber of actives in the decoy set.

Virtual screeningThe best pharmacophore model was used as a 3D query to searchthe Scaffold database, which consists of about �300 000 structurallydiversified molecules. Best Flexible Search was applied to generatethe conformation for each molecule in the database. Virtual screeningwas implemented to retrieve the compounds that fit the chemical fea-tures present in the best hypothesis. The screened hits were furtherfiltered by applying Lipinski's rule of five and ADMET properties.

Molecular docking protocolMolecular docking calculation was made using LigandFit (41), whichcan perform either rigid or flexible docking referring to the confor-mational state of the ligand. LigandFit docks the small moleculesinto the protein active site by considering its shape complementary.

Three-dimensional structure of CatD–statin complex (PDB ID: 1LYB)was obtained from Protein Data Bank (42) (PDB, http://www.rcsb.org/pdb/home/home.do). The solvent molecules locatedinside the active site were retained and the remaining water mole-cules were removed. Maximum 10 poses are retained based on theroot mean square deviation (RMSD) threshold for diversity (1.5 �)and score threshold for diversity 20 kcal ⁄ mol. Consensus scoringfunction implemented in the LiganFit module was used to select thebest leads from molecular docking study. Consensus scoring (43,44),combination of multiple scoring functions, might dramatically reducethe number of false positives by its distinct scoring functions. In ourstudy, consensus scoring method was based on different scoringfunctions such as LigScore1, LigScore2 (45), piecewise linear poten-tial 1 (PLP1) (46), piecewise linear potential 2 (PLP2), potential meanforce (PMF) (47,48), Jain (49), Ludi (50,51), and Dock score. The finalhits from virtual screening and training set compounds are dockedflexibly into CatD active site. All seven scoring functions mentionedabove are used to retrieve the novel candidate molecules from Scaf-fold database. Best candidate molecules were selected based on theconsensus scoring function, appropriate interactions with critical resi-dues and the predicted activity (cutoff of <1000 nM).

Results and Discussion

Pharmacophore modeling is a 3D computational approach widelyused in drug design process. In this framework, its usefulness cov-ers two major areas: (i) performance of the 'scaffold hopping' (vir-tual screening for new leads) and (ii) the rationalization of activitydistribution within the group of molecules exhibiting a similar phar-macological profile that can recognized by the same target. To iden-tify the novel CatD inhibitors, a 3D pharmacophore models have

constructed and the best quantitative hypothesis has chosen, whichdepends on its predictive ability in terms of activity.

Hypothesis generationWe have generated the pharmacophore models based on theknown CatD inhibitors. DS enabling automated generation of phar-macophore model and applied to panels of compounds that have awide spectrum of activity against the given biological target andpossess structural diversity to produce the reliable results (52).Based on activity values of the training set molecules (Figure 1), 10optimal hypotheses were generated according to their statisticalvalues. All generated hypotheses include four common featuressuch as HBA, HBAL, Z, and HBD (Table 1). Therefore, these chemi-cal features are assumed to be necessary for CatD inhibitors. HYPO-

GEN calculates two important theoretical cost values fordetermining the success of pharmacophore models (30). Hypothesesare believed to be statistically relevant when the total cost close tothe fixed cost and far away from null cost. Extend of the pharmaco-phore space is expressed as the configuration cost. The total costof a hypothesis is defined as the sum of three cost factors: errorcost, weight cost, and configuration cost. Statistically, the null costvalue of the top 10 hypotheses was 250.45, the fixed cost value122.46, and the configuration cost value 16.83 (Table 1). Amongthese 10 hypotheses, two hypotheses (Hypo1 and Hypo2) showedthe cost difference value more than 85 and all other hypothesishaving <70. Comparing Hypo1 and Hypo2, Hypo1 has shown thecorrelation coefficient of 0.90, demonstrating the better predictionability than Hypo2 (correlation coefficient of 0.87). Thus, Hypo1 wasselected as a best hypothesis for further analysis that is character-ized by lowest total cost (154.77), highest cost difference (Dcost:95.68), and lowest root mean square deviation (RMSD) of 1.25 �(Table 1). All hypotheses have three Z group, which imply that thisgroup plays an important role in CatD inhibition. The above analysisindicates that geometric constrains and the chemical features ofHypo1 are statistically significant (Figure 2). Hence, Hypo1 was con-sidered as a most reliable pharmacophore model for inhibiting theCatD activity.

To confirm the predictability of Hypo1, the training set was classifiedinto three groups based on its activity values: highly active(IC50 £ 300 nM, +++), moderately active (300 nM < IC50 £ 3000 nM,++), and low active (IC50 > 3000 nM, +). In training set, Hypo1 under-estimated the two active compounds as moderately active one andoverestimated two low active compounds as moderately active, andof the four moderately active compounds, two are overestimated asactive and two are underestimated as low active. Altogether, for 28of the 36 compounds, the predicted IC50 values were found to bewithin the same order of magnitude when compared with the experi-mentally reported values in the training set compounds. Active mole-cule from the training set mapped all the chemical features in Hypo1but the inactive molecule fails to map one of the Z group (Figure 3).In addition, most of the activity value of 36 training set moleculeswas predicted correctly by Hypo1 model (Table S1).

Evaluation of hypothesisHypothesis validation is one of the important steps in pharmaco-phore generation. Several methods were used to confirm the quality

Sakkiah et al.

66 Chem Biol Drug Des 2012; 80: 64–80

Page 4: Ligand-Based Virtual Screening and Molecular Docking Studies to …bio.gnu.ac.kr/publication/pdf/2012_10(83).pdf · 2012. 7. 2. · docking studies. Finally, 49 compounds were obtained

Table 1: Information of statistical significance and predictive power presented in cost values measured in bits for the top 10 hypothesesas a result of automated 3D QSAR pharmacophore generation

Hypothesis no. Total cost Cost differencea RMSb Correlation Featuresb Max. fit

Hypo1 154.77 95.68 1.25 0.90 HBA, HBAL, 3Z 14.36Hypo2 161.27 89.18 1.39 0.87 2HBAL, 3Z 14.06Hypo3 184.21 66.24 1.84 0.76 HBAL, HBD, 3Z 10.97Hypo4 186.30 64.15 1.88 0.75 HBA, HBD, 3Z 10.82Hypo5 194.66 55.79 1.99 0.71 2HBA, 3Z 11.90Hypo6 196.31 54.14 2.02 0.70 HBA, HBAL, 3Z 10.52Hypo7 197.83 52.62 2.04 0.69 HBA, HBD, 3Z 9.09Hypo8 198.53 51.92 2.04 0.69 HBA, HBAL, 3Z 11.39Hypo9 198.67 51.78 2.05 0.69 HBA, HBD, 3Z 9.95Hypo10 198.69 51.76 2.04 0.69 2HBAL, 3Z 11.98

aCost difference between the null and the total cost. The null cost, the fixed cost, and the configuration cost are 250.45, 122.46, and 16.83, respectively.bAbbreviation used for features: RMS, root mean square deviation; HBA, hydrogen bond acceptor; HBD, hydrogen bond donor; HBAL, hydrogen bond acceptorlipid; Z, hydrophobic.

Figure 1: Chemically diverse 36 compounds used as training set to generate the pharmacophore models. The IC50 values are indicated inparentheses for each compound.

Pharmacophore-Based Identification of Potent Inhibitor of Cathepsin D

Chem Biol Drug Des 2012; 80: 64–80 67

Page 5: Ligand-Based Virtual Screening and Molecular Docking Studies to …bio.gnu.ac.kr/publication/pdf/2012_10(83).pdf · 2012. 7. 2. · docking studies. Finally, 49 compounds were obtained

of Hypo1 like Fischer's randomization method, external test set, GH,and EF. The best hypothesis has been mainly validated to checkwhether the hypothesis can able to select the active compounds.During the screening process such as the percentage of active andinactive compounds picked from data set, there is a correlationbetween the predicted and estimated values of test set along withits efficiency in reducing true negatives and false positives.

Fischer’s randomization methodThe reliability of Hypo1 was cross-validated by using Fischer'srandomization method. The purpose of this cross-validation is torandomize the activity data among the training set compoundsand generate pharmacophore models based on the randomizeddata set by applying the same features, and parameters wereadopted from the initial HYPOGEN run. To achieve 95% confidence

Figure 2: Hypo1 geometric constraint consists of three Z, one HBA, and one HBAL, where hydrophobic (Z), hydrogen bond acceptor lipid(HBAL), and hydrogen bond acceptor (HBA).

A B

Figure 3: Best pharmacophore model Hypo1 aligned with training set compounds. (A) With active compound 1 (IC50 6 nM), (B) With inac-tive compound 31 (IC50 6000 nM). Pharmacophore features are color-coded as blue for hydrophobic (Z), green for hydrogen bond acceptor(HBA) and hydrogen bond acceptor lipid (HBAL).

Figure 4: Correlation between the experimental and predicted activity (pIC50) by Hypo1 for training and test set compounds.

Sakkiah et al.

68 Chem Biol Drug Des 2012; 80: 64–80

Page 6: Ligand-Based Virtual Screening and Molecular Docking Studies to …bio.gnu.ac.kr/publication/pdf/2012_10(83).pdf · 2012. 7. 2. · docking studies. Finally, 49 compounds were obtained

Table 2: Experimental and predicted IC50 data values of 40 test set molecules against Hypo1

Compound no. Structure Fit valuea Exp. IC50 nM Pred. IC50 nM Errorb Exp. scalec Pred. scalec

1

NH2

O

OHO

NHO

NH

O

NH

O

NH2

OHO

NH

O NHOOH

ONH

OOH 14.05 6 9 1.5 +++ +++

2

¼ 15

ONH

O

NH

OH

ONH

S

S

NH

12.04 10 9.7 )1.0 +++ +++

3O

NH

O

NH OHO

NH

NHO

O

O NH

S

S

13.63 11 32 2.9 +++ +++

4

O

NH

ONH

SS

NH

O

OOHO NH

10.94 25 180 7.3 +++ +++

5O

NH

O

NH OHO

NH

NH

O

O

O NH

S

S

13.63 50 49 )1.0 +++ +++

6

O

NH

O NHO

NHOH

O NHO

S

NH

O13.56 56 55 )1.0 +++ +++

7

O

NH

O

NH

S

O

NH

OH

O

NH11.31 60 230 3.8 +++ +++

8

ONH

O

NH

S

ONHOH

O

N

ONH

ONH

13.8 70 63 )1.1 +++ +++

Pharmacophore-Based Identification of Potent Inhibitor of Cathepsin D

Chem Biol Drug Des 2012; 80: 64–80 69

Page 7: Ligand-Based Virtual Screening and Molecular Docking Studies to …bio.gnu.ac.kr/publication/pdf/2012_10(83).pdf · 2012. 7. 2. · docking studies. Finally, 49 compounds were obtained

Table 2: (Continued)

Compound no. Structure Fit valuea Exp. IC50 nM Pred. IC50 nM Errorb Exp. scalec Pred. scalec

9

ONH

ONH

S

O

NH

OH

ONH

O

OH10.81 90 180 2.0 +++ +++

10

NH2

O

OH

O

NH

O

NH

O

NH

O

NH2

OH

O

NH

O

OH

11.99 98 340 3.4 +++ ++

11O

NH

O

NHOH O

NHNH

O

O

OS

8.98 110 420 3.8 +++ ++

12

ONH

O

NH

S

ONH

OH

NO

ONH

ONH

13.54 130 250 1.9 +++ +++

13

O NH

O

NH

S

OH

O

NH

S

NH2

8.36 150 69 )2.2 +++ +++

14

O

NH

O

NH

SSOHO

NHNHO

O

12.59 180 130 )1.4 +++ +++

15

O

NH

O

NHS

OH

O

NH

NH

O

O

S

12.13 190 160 )1.2 +++ +++

16

O

NH

O

NH

SS

NH

O

O

OHO

NH

z

9.96 195 170 )1.2 +++ +++

Sakkiah et al.

70 Chem Biol Drug Des 2012; 80: 64–80

Page 8: Ligand-Based Virtual Screening and Molecular Docking Studies to …bio.gnu.ac.kr/publication/pdf/2012_10(83).pdf · 2012. 7. 2. · docking studies. Finally, 49 compounds were obtained

Table 2: (Continued)

Compound no. Structure Fit valuea Exp. IC50 nM Pred. IC50 nM Errorb Exp. scalec Pred. scalec

17

ONH

O

NH

S

S

NH

OH

O

NH

11.9 200 570 2.8 +++ ++

18

O

NHO

NHOH

O

NHNH

O

12.02 260 200 )1.3 +++ +++

19

O

NH

O

NH

S

O

NH

OH

O

ONH O

NH

11.92 280 250 )1.1 +++ +++

20

NH

ONH

S

O

NH

OH

O

9.51 370 120 )3.2 ++ +++

21

O

NHO

NHO

NHOH

ONHO

OH

S 11.60 430 590 1.4 ++ ++

22

O

NH

O

NH

SS

NH

O

O

OH

O

NH

12.13 460 1600 3.5 ++ ++

23

O

NH OH

NHN

S

OO N

7.93 660 700 1.1 ++ ++

Pharmacophore-Based Identification of Potent Inhibitor of Cathepsin D

Chem Biol Drug Des 2012; 80: 64–80 71

Page 9: Ligand-Based Virtual Screening and Molecular Docking Studies to …bio.gnu.ac.kr/publication/pdf/2012_10(83).pdf · 2012. 7. 2. · docking studies. Finally, 49 compounds were obtained

Table 2: (Continued)

Compound no. Structure Fit valuea Exp. IC50 nM Pred. IC50 nM Errorb Exp. scalec Pred. scalec

24

O

NH

O

NHOH O

NHNH

O

O

OO

8.78 910 330 )2.7 ++ ++

25

O

NH

OH NH

N

S

OO

N

8.77 1000 3500 3.5 ++ +

26

O

NH

O

NH

O

O

NHO

O

OH

O

NH

8.21 1110 1300 1.2 ++ ++

27

O

NH

O

NH

S

O

NHOH

N

ONH

O

NH

12.68 1230 260 )4.7 ++ +++

28 O NHO

NH

S

OH

O

NHNH

OS

9.25 1250 460 )2.7 ++ ++

29

O

NH

OH

NHF

FF

N

SOO

N

11.75 1260 400 )3.1 ++ ++

30

ONH

ONH

S

O

NH

OH

O

ONH

O

OH 12.57 1500 460 )3.2 ++ ++

Sakkiah et al.

72 Chem Biol Drug Des 2012; 80: 64–80

Page 10: Ligand-Based Virtual Screening and Molecular Docking Studies to …bio.gnu.ac.kr/publication/pdf/2012_10(83).pdf · 2012. 7. 2. · docking studies. Finally, 49 compounds were obtained

Table 2: (Continued)

Compound no. Structure Fit valuea Exp. IC50 nM Pred. IC50 nM Errorb Exp. scalec Pred. scalec

31

NH

O

NH

OHNH

ONS

OO

11.03 1700 3000 1.8 ++ ++

32

NH

O

NH

O

NH

OH

O

6.26 1700 2400 1.4 ++ ++

33

O

ONH

OH

NHF

FF

N

8.76 2020 1700 )1.2 ++ ++

34

NH

ONH

OH

NHO

F

FF

NSOO

10.59 2570 1000 )2.6 ++ ++

35

ONH

ONH

S

O

NH

OH

ONH

O

OH 9.39 3900 3900 )1.0 + +

36

ONH

OH

NH

F

F

FN

SO

O7.40 4790 1700 )2.8 + ++

Pharmacophore-Based Identification of Potent Inhibitor of Cathepsin D

Chem Biol Drug Des 2012; 80: 64–80 73

Page 11: Ligand-Based Virtual Screening and Molecular Docking Studies to …bio.gnu.ac.kr/publication/pdf/2012_10(83).pdf · 2012. 7. 2. · docking studies. Finally, 49 compounds were obtained

level (significance = [1 ) (1 + 0) ⁄ (19 + 1)] · 100% = 95%), 19 ran-dom hypotheses have been generated and the results are shownin Table S2. The 19 resulted random hypotheses had a highertotal cost, RMSD, and lowest cost difference than the Hypo1hypothesis, which proved the robustness of Hypo1.

Test set methodTo ensure the statistical relevance of Hypo1 model, Ligand pharma-cophore mapping was used to find the optimum fit value of testset molecules. In test set validation, a correlation coefficient valuewas 0.84 (Figure 4), which shows a good correlation between the

experimental and predicted activity values (Table 2). The test setcompounds were classified into three different categories based onIC50 values like training set (mentioned in Computational Method).Of 19 highly active compounds, 16 were predicted correctly but theremaining three were underestimated as moderately active com-pounds. Of 15 moderately active compounds, two compounds areoverestimated as active, one compound was predicted as inactivecompound, and one inactive compound was overestimated as mod-erately active. All other members of the test set were predictedcorrectly or as better than their actual activities. The test set vali-dation further confirms that Hypo1 hypothesis was able to discoverthe novel scaffolds for CatD inhibitor.

Table 2: (Continued)

Compound no. Structure Fit valuea Exp. IC50 nM Pred. IC50 nM Errorb Exp. scalec Pred. scalec

37

ONH

OH

NH

NH

NS

O O

8.45 4000 8100 )2.0 + +

38

O

N

Cl

O

N

NH2

NHOH

8.49 3700 9000 )2.5 + +

39

O

NH

O

NH

S

NH

OH

O

NH

8.54 3300 9700 )3.0 + +

40

ONH

O

NH

OH

ONH

S

NH

8.25 6400 6000 1.1 + +

aFit value indicates how well the features in the pharmacophore overlap the chemical features in the molecule. Fit = weight · [max(0.1 ) SSE)] whereSSE = (D ⁄ T)2, D = displacement of the feature from the center of the location constraints and T = the radius of the location constraint sphere for the feature(tolerance).bDifference between the predicted and experimental values. '+' indicates that the predicted IC50 is higher than the experimental IC50; ')' indicates that the pre-dicted IC50 is lower than the experimental IC50; a value of 1 indicates that the predicted IC50 is equal to the experimental IC50.cActivity scale: IC50 £ 300 nM = +++ (highly active); 300 nM < IC50 £ 3000 nM = ++ (moderately active); IC50 > 3000 nM = + (low active).

Sakkiah et al.

74 Chem Biol Drug Des 2012; 80: 64–80

Page 12: Ligand-Based Virtual Screening and Molecular Docking Studies to …bio.gnu.ac.kr/publication/pdf/2012_10(83).pdf · 2012. 7. 2. · docking studies. Finally, 49 compounds were obtained

Decoy setFinally, decoy set has prepared to affirm the efficiency of Hypo1 bycomputing Goodness-of-fit (GF) and EF values. Decoy set containsboth the active (25) and inactive compounds (2175) of CatD. Thescreening was performed using Ligand Pharmacophore Mappingmodule and by computing the various parameters such as totalnumber of compounds in the hit list (Ht), number of active per centof yields (%Y), per cent ratio of actives in the hit list (%A), EF, falsenegatives, false positives, and GF. The false positives and falsenegatives are 7 and 1, respectively. Enrichment factor (6.81) and GH(0.81) values are good indications of the high efficiency of Hypo1(Table 3).

By overall validations, from Fischer's randomization method, we canassure that Hypo1 hypothesis is not come by chance. Test set andstatistical parameters of decoy set revealed that Hypo1 can predictmost of molecules in the same order of magnitude and can discrim-inate the active from inactive inhibitors, respectively. Finally, weconcluded that the Hypo1 was far better than all other hypothesesand used for virtual screening.

Virtual screeningVirtual screening of small molecule is one aspect of a sophisti-cated approach in drug discovery (44). Virtual screening of data-bases can serve for two main purposes: first, to validate thequality of the generated pharmacophore model by selectivedetection of compounds with known inhibitory activity and sec-ond, to find novel, potential leads against particular target.Figure 5 shows the work flow of virtual screening strategy usedin this study. Among �300 000 compounds, 169 381 compoundswere selected based on the fit values >10 and satisfied all thechemical features of Hypo1. ADMET properties such as blood–brain barrier (BBB), solubility, absorption, hepatotoxicity, plasmaprotein binding, and CYP2D6 were computed but here we mainlyfocused on BBB, solubility, and absorption. If the molecules showthe level value of 3, 3, and 0, then it represents that the mole-cules have good solubility and absorption for BBB, solubility, andabsorption, respectively.

The rule of five suggested that a chemical compounds could bean orally active drug in humans. The rule states that the most

'drug-like' compounds should have the value of clogP £5, molec-ular weight £500, and number of hydrogen bond acceptor £10and donors £5. The hit compounds (169, 381) are narroweddown by applying ADMET (12 945) and drug-likeness properties(4818) and subjected to molecular docking to find its bindingaffinity. The hit compounds 669 and 360 that have very goodfit values of 11.20 and 11.83, respectively, are presented inFigure 6.

All the hits that passed the above filtrations were selected as agood leads for inhibiting CatD activity. The hits obtained from theScaffold database are shown a structural diversity when comparedwith the known CatD inhibitors.

Molecular dockingSeveral recent studies have been shown that the coupling of phar-macophore-based virtual screening with molecular docking filtrationwill be a one of the efficient methods to identify a novel scaffolds.

Table 3: Statistical parameter from screening test set molecules

No. Parameter Values

1 Total number of molecules in database (D) 22002 Total number of actives in database (A) 25.003 Total number of hit molecules from the database (Ht) 31.004 Total number of active molecules in hit list (Ha) 24.005 % Yield of actives [(Ha ⁄ Ht) · 100] 77.426 % Ratio of actives [(Ha ⁄ A) · 100] 96.007 Enrichment factor (EF) 6.818 False negatives [A ) Ha] 19 False positives [Ht ) Ha] 7

10 Goodness-of-fit score (GF) 0.81

Figure 5: Flow chart for virtual screening process.

Pharmacophore-Based Identification of Potent Inhibitor of Cathepsin D

Chem Biol Drug Des 2012; 80: 64–80 75

Page 13: Ligand-Based Virtual Screening and Molecular Docking Studies to …bio.gnu.ac.kr/publication/pdf/2012_10(83).pdf · 2012. 7. 2. · docking studies. Finally, 49 compounds were obtained

Thus to refine the retrieved hits from database, all the hit com-pounds were docked into the active site of CatD. Hypo1 was com-pared with active site of CatD (Figure 7), and it clearly shows agood agreement with critical residues that are important for thefunction of CatD. Molecular docking method is used to identify theorientation of ligand in the binding pocket of protein as well as topredict the binding affinity between ligand and protein. For theaspartic protease CatD described in this study, flexible docking fol-lowed by consensus scoring methods was performed. Totally, 4818hit compounds mapping the pharmacophore as well as satisfying

the postsearch filtering from database and 36 training set com-pounds were docked into the protein active site.

Moreover, most suitable scoring functions were selected to discrimi-nate active from inactive compounds. Six of the eight scoringfunctions like LigScore1, LigScore2, PLP1, PLP2, Jain, PMF, and Ludiand Dock score were used for the prediction of potent CatD inhibi-tors. LigScore 1 and LigScore 2 are calculated by descriptors ofpolar surface in receptor–ligand interactions. Piecewise linearpotential scores have been calculated by the descriptions about the

A B

Figure 6: Hit molecules from scaffold database on the best hypothesis, Hypo1. (A) Compound 669 (B) Compound 360.

Figure 7: Comparison and superimposition of Hypo1 in active site of the crystal structure of Cathepsin D (PDB ID: 1LYB).

Figure 8: The hit compound-360 bound to cathepsin D, hydrogen bonds are shown in black.

Sakkiah et al.

76 Chem Biol Drug Des 2012; 80: 64–80

Page 14: Ligand-Based Virtual Screening and Molecular Docking Studies to …bio.gnu.ac.kr/publication/pdf/2012_10(83).pdf · 2012. 7. 2. · docking studies. Finally, 49 compounds were obtained

formation of hydrogen bonds. Higher PLP score indicated strongerreceptor–ligand binding. Potential mean force score was calculatedby summing the pairwise interaction terms over all interatomic pairsof the receptor–ligand complex. Jain and Ludi scores were con-sulted in hydrophobic interaction and degree of freedom, respec-tively. Docking score was considered as the degree of difficultyabout ligand moving into the binding site. These results are applica-

ble for compounds with the same or similar mode of interactionsas the active compounds of CatD inhibitors. By analyzing the train-ing test compounds using consensus scoring method, Jain scoreshowed good agreement with the activity values. As we mentionedearlier that Z group plays an important role in the inhibition of CatDactivity, Jain score (represent hydrophobic interaction) was used tosort out the database hit molecules. In the case of the consensus

Figure 9: 2D structure for the final 49 hit compounds.

Pharmacophore-Based Identification of Potent Inhibitor of Cathepsin D

Chem Biol Drug Des 2012; 80: 64–80 77

Page 15: Ligand-Based Virtual Screening and Molecular Docking Studies to …bio.gnu.ac.kr/publication/pdf/2012_10(83).pdf · 2012. 7. 2. · docking studies. Finally, 49 compounds were obtained

scoring process, 1082 of 4818 molecules were selected from data-base. Visual inspection of the molecular docking results has beencarried out to avoid the false-positive hit from the screened hitcompounds based on the critical interactions of the small moleculeswith the receptor. The purpose of this visualization was to eliminatethe hits poses that do not fit into the active site of CatD and alsofails to form hydrogen bond (H-bond) interactions with the keyamino acid residues such as Asp33 and Asp231. Based on the over-all docking analyses, 92 hit molecules were picked out.

The 92 molecules were selected based on the consensus scoringfunction, which all have a good Jain, LigScore, PLP, PMP, and Dockscores, as well as it shows all the necessary interactions with cata-lytic residues like Asp33 and Asp231. One can see that the com-pound-360 is one of the best molecules that satisfied all thecritical features of Hypo1, having a good consensus score, and itshows two H-bond interactions with the active aspartic acid (Asp33and Asp231) present in the active site as well as we can seeclearly that the three Z group blocks the entry of the active site(Figure 8). Finally, 49 molecules (Figure 9) were selected as goodinhibitors of CatD from Scaffold database based on the predictedactivity value (cutoff < 1000 nM) of the compounds (Table S3).

Conclusions

The rational drug design approach is applied to discover newCatD inhibitor molecules with appropriate shape and chemicalfeatures to be subsequently optimized as highly active and selec-tive drugs. The best Hypo1 pharmacophore model, which consistsof one HBA, one HBAL, and three Z features, was developedbased on 36 known inhibitors of CatD by using the HYPOGEN

algorithm. This pharmacophore model was further validated by Fi-scher's randomization method, an external test set of 40 structur-ally diverse compounds and decoy set. All results suggest thatHypo1 model has very good reliability in the accurate predictionof active CatD inhibitors from the inactive molecules. The pur-pose of pharmacophore modeling used in 3D database search isto find the diverse structures with critical chemical features.Moreover, the pharmacophoric features in Hypo1 and moleculardocking conformation were correctly matched with the bindingsurface of CatD. The molecular docking studies have helped usto identify 49 hit compounds based on the predicted activity val-ues (<1000 nM) which were suggested as potent inhibitors ofCatD from Scaffold database. The knowledge of important chemi-cal features for CatD antagonism is expected to be useful in thedevelopment of new CatD inhibitors. Thus, we believed that ourpharmacophore model can be used for virtual screening of novellead compounds and chemical modification and optimization ofhit compounds, therefore accelerating the discovery of newagents for the treatment of various diseases related to CatD.

Acknowledgments

This research was supported by Basic Science Research Program(2009-0073267), Pioneer Research Center Program (2009-0081539),and Management of Climate change Program (2010-0029084)through that National Research Foundation of Korea (NRF) funded

by the Ministry of Education, Science and Technology (MEST) ofRepublic of Korea. And this work was also supported by the Next-Generation BioGreen21 Program (PJ008038) from Rural DevelopmentAdministration (RDA) of Republic of Korea.

References

1. Green D.R., Reed J.C. (1998) Mitochondria and apoptosis. Sci-ence;281:1309–1312.

2. Ciechanover A. (2005) Intracellular protein degradation: from avague idea, through the lysosome and the ubiquitin–proteasomesystem, and onto human diseases and drug targeting (nobel lec-ture). Angew Chem Int Ed;44:5944–5967.

3. Duffy M.J. (1996) Proteases as prognostic markers in cancer.Clin Cancer Res;2:613–618.

4. Saleh Y., Siewinski M., Kielan W., Zi�lkowski P., Grybos M., Ry-bka J. (2003) Regulation of cathepsin B and L expression in vitroin gastric cancer tissues by egg cystatin. J Exp Ther On-col;3:319–324.

5. Turk B. (2006) Targeting proteases: successes, failures andfuture prospects. Nat Rev Drug Discov;5:785–799.

6. Owen C.A. (2005) Proteinases and oxidants as targets in thetreatment of chronic obstructive pulmonary disease. Proc AmThorac Soc;2:373–385.

7. Wolf K., Mazo I., Leung H., Engelke K., von Andrian U.H., Deryu-gina E.I., Strongin A.Y., Brocker E.B., Friedl P. (2003) Compensa-tion mechanism in tumor cell migration. J Cell Biol;160:267–277.

8. Yu H., Rohan T. (2000) Role of the insulin-like growth factorfamily in cancer development and progression. J Natl CancerInst;92:1472–1489.

9. Benes P., Vetvicka V., Fusek M. (2008) Cathepsin D – many func-tions of one aspartic protease. Crit Rev Oncol Hematol;68:12–28.

10. B�hling F., Waldburg N., Reisenauer A., Heimburg A., Golpon H.,Welte T. (2004) Lysosomal cysteine proteases in the lung: rolein protein processing and immunoregulation. Eur RespirJ;23:620–628.

11. Flannery T., Gibson D., Mirakhur M., McQuaid S., Greenan C.,Trimble A., Walker B., McCormick D., Johnston P.G. (2003) Theclinical significance of cathepsin S expression in human astrocy-tomas. Am J Pathol;163:175–182.

12. Felbor U., Dreier L., Bryant R.A.R., Ploegh H.L., Olsen B.R., Mo-thes W. (2000) Secreted cathepsin L generates endostatin fromcollagen XVIII. EMBO J;19:1187–1194.

13. Vetvicka V., Vektvickov� J., Fusek M. (1994) Effect of human pro-cathepsin D on proliferation of human cell lines. CancerLett;79:131–135.

14. Zhou W., Scott S.A., Shelton S.B., Crutcher K.A. (2006) Cathep-sin D-mediated proteolysis of apolipoprotein E: possible role inAlzheimer's disease. Neuroscience;143:689–701.

15. Siintola E., Partanen S., Strçmme P., Haapanen A., Haltia M.,Maehlen J., Lehesjoki A.E., Tyynela J. (2006) Cathepsin D defi-ciency underlies congenital human neuronal ceroid-lipofuscinosis.Brain;129:1438–1445.

16. Huo S., Wang J., Cieplak P., Kollman P.A., Kuntz I.D. (2002)Molecular dynamics and free energy analyses of cathepsin D –inhibitor interactions: insight into structure-based ligand design.J Med Chem;45:1412–1419.

Sakkiah et al.

78 Chem Biol Drug Des 2012; 80: 64–80

Page 16: Ligand-Based Virtual Screening and Molecular Docking Studies to …bio.gnu.ac.kr/publication/pdf/2012_10(83).pdf · 2012. 7. 2. · docking studies. Finally, 49 compounds were obtained

17. Figura K.V., Hasilik A. (1986) Lysosomal enzymes and their recep-tors. Annu Rev Biochem;55:167–193.

18. Jaskiewicz E., Zhu G., Bassi R., Darling D.S., Young W.W. (1996)b1,4-N-acetylgalactosaminyltransferase (GM2 Synthase) isreleased from golgi membranes as a neuraminidase-sensitive,disulfide-bonded dimer by a cathepsin D-like protease. J BiolChem;271:26395–26403.

19. Guagliardi L.E., Koppelman B., Blum J.S., Marks M.S., CresswellP., Brodsky F.M. (1990) Co-localization of molecules involved inantigen processing and presentation in an early endocytic com-partment. Nature;343:133–139.

20. Peters P.J., Neefjes J.J., Oorschot V., Ploegh H.L., Geuze H.J.(1991) Segregation of MHC class II molecules from MHC class Imolecules in the Golgi complex for transport to lysosomal com-partments. Nature;349:669–676.

21. Taralp A., Kaplan H., Sytwu I.-I., Vlattas I., Bohacek R., KnapA.K. et al. (1995) Characterization of the S subsite specificity ofcathepsin B. J Biol Chem;270:18036–18043.

22. Banay-Schwartz M., DeGuzman T., Palkovits M., Lajtha A. (1994)Calpain activity in adult and aged human brain regions. Neuro-chem Res;19:563–567.

23. Matus A., Green G.D.J. (1987) Age-related increase in a cathep-sin D-like protease that degrades brain microtubule-associatedproteins. Biochemistry;26:8083–8086.

24. Cataldo A.M., Nixon R.A. (1990) Enzymatically active lysosomalproteases are associated with amyloid deposits in Alzheimerbrain. Proc Natl Acad Sci USA;87:3861–3865.

25. Venkatesan S.K., Shukla A.K., Dubey V.K. (2010) Molecular dock-ing studies of selected tricyclic and quinone derivatives on try-panothione reductase of Leishmania infantum. J ComputChem;31:2463–2475.

26. Saravanan P., Venkatesan S.K., Mohan C.G., Patra S., Dubey V.K.(2010) Mitogen-activated protein kinase 4 of Leishmania para-site as a therapeutic target. Eur J Med Chem;45:5662–5670.

27. Sakkiah S., Krishnamoorthy N., Gajendrarao P., ThangapandianS., Lee Y., Kim S., Suh J.K., Kim H.H., Lee K.W. (2009) Pharma-cophore mapping and virtual screening for SIRT1 activators. BullKorean Chem Soc;30:1152–1156.

28. Sakkiah S., Thangapandian S., John S., Kwon Y.J., Lee K.W.(2010) 3D QSAR pharmacophore based virtual screening andmolecular docking for identification of potential HSP90 inhibi-tors. Eur J Med Chem;45:2132–2140.

29. Sakkiah S., Thangapandian S., John S., Lee K.W. (2011) Identifi-cation of critical chemical features for Aurora kinase-B inhibitorsusing Hip-Hop, virtual screening and molecular docking. J MolStruct;985:14–26.

30. Sakkiah S., Thangapandian S., John S., Lee K.W. (2011) Pharma-cophore based virtual screening, molecular docking studies todesign potent heat shock protein 90 inhibitors. Eur J MedChem;46:2937–2947.

31. Charrier N., Clarke B., Demont E., Dingwall C., Dunsdon R.,Hawkins J., Hubbard J. et al. (2009) Second generation ofBACE-1 inhibitors part 2: optimisation of the non-prime side sub-stituent. Bioorg Med Chem Lett;19:3669–3673.

32. Beswick P., Charrier N., Clarke B., Demont E., Dingwall C., Duns-don R., Faller A. et al. (2008) BACE-1 inhibitors part 3: identifi-cation of hydroxy ethylamines (HEAs) with nanomolar potency incells. Bioorg Med Chem Lett;18:1022–1026.

33. Charrier N., Clarke B., Cutler L., Demont E., Dingwall C., Duns-don R., Hawkins J. et al. (2009) Second generation of BACE-1inhibitors. Part 1: the need for improved pharmacokinetics. Bio-org Med Chem Lett;19:3664–3668.

34. Charrier N., Clarke B., Cutler L., Demont E., Dingwall C., Duns-don R., Hawkins J. et al. (2009) Second generation of BACE-1inhibitors part 3: towards non hydroxyethylamine transition statemimetics. Bioorg Med Chem Lett;19:3674–3678.

35. Charrier N., Clarke B., Cutler L., Demont E., Dingwall C., Duns-don R., East P. et al. (2008) Second generation of hydroxyethyl-amine BACE-1 inhibitors: optimizing potency and oralbioavailability. J Med Chem;51:3313–3317.

36. Brooks B., Bruccoleri R., Olafson B., States D., Swaminathan S.,Karplus M. (1983) CHARMM: a program for macromolecularenergy, minimization, and dynamics calculations. J ComputChem;4:187–217.

37. Smellie A., Teig S.L., Towbin P. (1995) Poling: promoting confor-mational variation. J Comput Chem;16:171–187.

38. Du L.-P., Tsai K.-C., Li M.-Y., You Q.-D., Xia L. (2004) The phar-macophore hypotheses of IKr potassium channel blockers: novelclass III antiarrhythmic agents. Bioorg Med Chem Lett;14:4771–4777.

39. Yang Q., Du L., Tsai K.-C., Wang X., Li M., You Q. (2009) Phar-macophore mapping for Kv1.5 potassium channel blockers. QSARComb Sci;28:59–71.

40. Shao Y.-M., Yang W.-B., Peng H.-P., Hsu M.-F., Tsai K.-C., Kuo T.-H., Wang A.H., Liang P.H., Lin C.H., Yang A.S., Wong C.H. (2007)Structure-based design and synthesis of highly potent SARS-CoV3CL protease inhibitors. ChemBioChem;8:1654–1657.

41. Venkatachalam C.M., Jiang X., Oldfield T., Waldman M. (2003)LigandFit: a novel method for the shape-directed rapid dockingof ligands to protein active sites. J Mol Graph Model;21:289–307.

42. Berman H.M., Westbrook J., Feng Z., Gilliland G., Bhat T.N.,Weissig H., Shindyalov I.N., Bourne P.E. (2000) The protein databank. Nucleic Acids Res;28:235–242.

43. Friesner R.A., Banks J.L., Murphy R.B., Halgren T.A., Klicic J.J.,Mainz D.T., Repasky M.P., Knoll E.H., Shelley M., Perry J.K.,Shaw D.E., Francis P., Shenkin P.S. (2004) Glide: a new approachfor rapid, accurate docking and scoring. 1. Method and assess-ment of docking accuracy. J Med Chem;47:1739–1749.

44. Halgren T.A., Murphy R.B., Friesner R.A., Beard H.S., Frye L.L.,Pollard W.T., Banks J.L. (2004) Glide: a new approach for rapid,accurate docking and scoring. 2. Enrichment factors in databasescreening. J Med Chem;47:1750–1759.

45. Krammer A., Kirchhoff P.D., Jiang X., Venkatachalam C.M., Wald-man M. (2005) LigScore: a novel scoring function for predictingbinding affinities. J Mol Graph Model;23:395–407.

46. Gehlhaar D.K., Verkhivker G.M., Rejto P.A., Sherman C.J., FogelD.R., Fogel L.J., Freer S.T. (1995) Molecular recognition of theinhibitor AG-1343 by HIV-1 protease: conformationally flexibledocking by evolutionary programming. Chem Biol;2:317–324.

47. Muegge I. (2005) PMF scoring revisited. J Med Chem;49:5895–5902.

48. Muegge I., Martin Y.C. (1999) A general and fast scoring func-tion for protein – ligand interactions: a simplified potentialapproach. J Med Chem;42:791–804.

Pharmacophore-Based Identification of Potent Inhibitor of Cathepsin D

Chem Biol Drug Des 2012; 80: 64–80 79

Page 17: Ligand-Based Virtual Screening and Molecular Docking Studies to …bio.gnu.ac.kr/publication/pdf/2012_10(83).pdf · 2012. 7. 2. · docking studies. Finally, 49 compounds were obtained

49. Jain A.N. (1996) Scoring noncovalent protein-ligand interactions:a continuous differentiable function tuned to compute bindingaffinities. J Comput Aided Mol Des;10:427–440.

50. Bohm H.J. (1994) The development of a simple empirical scoringfunction to estimate the binding constant for a protein-ligandcomplex of known three-dimensional structure. J Comput AidedMol Des;8:243–256.

51. Bçhm H.-J. (1998) Prediction of binding constants of proteinligands: a fast method for the prioritization of hits obtained fromde novo design or 3D database search programs. J ComputAided Mol Des;12:309–323.

52. Tsai K.-C., Teng L.-W., Shao Y.-M., Chen Y.-C., Lee Y.-C., Li M.,Hsiao N.W. (2009) The first pharmacophore model for potentNF-[kappa]B inhibitors. Bioorg Med Chem Lett;19:5665–5669.

Supporting Information

Additional Supporting Information may be found in the online ver-sion of this article:

Table S1. Actual and estimated activity of the training set mole-cules based on the pharmacophore model Hypo1.

Table S2. Validation of Hypo1 using Fischer's randomizationmethod.

Table S3. Hit molecules from scaffold database and its corre-sponding various score values.

Please note: Wiley-Blackwell is not responsible for the content orfunctionality of any supporting materials supplied by the authors.Any queries (other than missing material) should be directed to thecorresponding author for the article.

Sakkiah et al.

80 Chem Biol Drug Des 2012; 80: 64–80