drug design

Page 1: drug design

PROJECT REPORT

ON

“Drug Design: to find a drug that changes protein activity of Influenza Virus”

Submitted in the partial fulfillment for the award of Degree

Of

BACHELOR OF TECHNOLOGY

IN

BIOTECHNOLOGY

(Session 2009-2013)

Submitted by: Bosky Mangal (1109152)

Supervisor: Mr. Oisik Das

DEPARTMENT OF BIOTECHNOLOGY

(Maharishi Markandeshwar Engineering College, Mullana)

MAHARISHI MARKANDESHWAR UNIVERSITY

MULLANA (AMBALA)-133207

CERTIFICATE

This is to certify that the project entitled “Drug Design: to find a drug that changes protein activity of Influenza Virus” submitted by Ms. Bosky Mangal to the Department of Biotechnology, Maharishi Markandeshwar University, Mullana, Ambala, for partial fulfillment of the requirements for the degree of Bachelor of Technology in Biotechnology, has been carried out under my supervision.

The assistance and help received during the course of the investigation and the sources of literature have been fully acknowledged.

Dr. Anil Kumar Sharma

(Head of the Department)

Department of Biotechnology

MMU, Mullana, Ambala

CERTIFICATE

This is to certify that the project entitled “Drug Design: to find a drug that changes protein activity of Influenza Virus” submitted by Bosky Mangal, student of B.Tech (Biotechnology), Eighth Semester, is a bona fide work carried out by her under my guidance for the partial fulfillment of the B.Tech (Biotechnology) degree course awarded by Maharishi Markandeshwar University, Mullana (Ambala).

I wish her luck and success in the future.

Mr. Oisik Das

(Biotechnology Department,

MMU Mullana, Ambala)

DECLARATION

I hereby declare that the work contained in this project entitled “Drug Design: to find a drug that changes protein activity of Influenza Virus” has not been previously submitted for a degree at any other higher education institute.

To the best of my knowledge and belief, this project contains no material previously published or written by another person except where due references are made.

Bosky Mangal

110904655

B.Tech (Biotechnology)

8th Semester

ACKNOWLEDGEMENT

First of all, I would like to thank almighty God, who has given us this wonderful gift of life and who guides us in the right direction to follow the noble path of humanity. During my six-month minor project, it was a wonderful experience to be part of the work “Drug Design: to find a drug that changes protein activity of Influenza Virus”. I owe my deep regards to the kind and supportive staff who helped me through my lean patches during this project. I am grateful to all the staff and fellow students of IBI Biosolutions Pvt. Ltd for sharing their experience with me. I would like to express my heartfelt gratitude to Mr. Sachin Sharma for his able guidance, inspiring attitude and honest support. Last but not least, I acknowledge the painstaking efforts of our college and express my utmost regards to the Biotechnology Department of our institute.

ABSTRACT

Influenza is a serious problem in the medical community. Each year in the United States, roughly 200,000 individuals are hospitalized due to influenza. Additionally, on average 36,000 deaths are attributed to influenza yearly in the US. Children and the elderly are more susceptible to serious complications from influenza. There are two main types of influenza, A and B, with hundreds of strains of each. Influenza A is generally considered to be the more prevalent and dangerous type, as it is usually associated with epidemics. Influenza is an evolving virus, constantly producing new mutant strains resistant to treatment. Clearly, there is an enormous need for a practical approach to the treatment of influenza. The goal of this research is to design a new antiviral drug which is effective against both influenza A and B. The ideal drug should have minimal side effects and fewer restrictions than the current drugs on the market. The purpose of this thesis is to present my research procedure, the difficulties which were overcome, and the resulting information.

Review of Literature

Influenza: Information, Biological Activity, and Current Options

Influenza is a serious problem in the medical community. Each year in the United States, roughly 200,000 individuals are hospitalized due to influenza. Additionally, on average 36,000 deaths are attributed to influenza yearly in the US. [Centers for Disease Control and Prevention. Influenza. http://www.cdc.gov/flu/ (accessed June 19, 2009).]

Children and the elderly are more susceptible to serious complications from influenza.

There are two main types of influenza, A and B, with hundreds of strains of each. Influenza A is generally considered to be the more prevalent and dangerous type, as it is usually associated with epidemics. Influenza is an evolving virus, constantly producing new mutant strains resistant to treatment.

[Couch, Robert B. The New England Journal of Medicine 1997, 337: 927-929.]

The influenza virus is a segmented, membrane-enclosed, negative-strand RNA virus. The viral membrane carries three main protein components: hemagglutinin (HA), the M2 proton channel, and neuraminidase (NA). There are sixteen subtypes of hemagglutinin, H1-H16. Hemagglutinin attaches to sialic acid, a receptor on the target cell surface, which allows the virus to bind to and then penetrate the target cell.

[Luo, M., Air, G. M., Brouillette, W. J. The Journal of Infectious Diseases 1997, 176: 62-65.

Malaisree, M., Rungrotmongkol, T., Decha, P., Intharathep, P., Aruksakunwon, O., Hannongbuw, S. Proteins 2008, 71: 1908-1918.]

There are nine subtypes of neuraminidase, N1-N9. After the virus has replicated within the target cell, neuraminidase cleaves the terminal sialic acid from the receptor, allowing the newly formed virus to be released and infect other cells. Each of the three components is important in the replication and spread of influenza throughout the body, but if just one segment of the cycle can be stopped, influenza could be more easily controlled. Currently, there are few options for the prevention or treatment of influenza. Vaccines are typically readily available for prevention.

[Couch, Robert B. The New England Journal of Medicine 2000, 343: 1778-1788.]

Many people who are at risk do not take advantage of this form of prevention. There are four pharmaceutical products currently approved by the FDA for the treatment or prevention of influenza, which fall into two types: ion channel blockers and neuraminidase inhibitors. These drugs are approved for treatment, or for prevention when it is almost certain that the patient will contract the virus.

Amantadine and Rimantadine are two ion channel blocking drugs. They function by blocking the ion channel formed by the M2 protein in the viral membrane. The drugs block the passage of hydrogen ions through the membrane, which in turn prevents replication.

[Balfour Jr, Henry H. The New England Journal of Medicine. 1999, 340: 1255-1269.]

Amantadine and Rimantadine reduce and shorten the symptoms of influenza A if given to patients within 48 hours of the emergence of symptoms. Oseltamivir and Zanamivir are two neuraminidase inhibitors. They are effective because they inhibit the activity of neuraminidase, preventing newly formed virus particles from being released from the infected cell and thus limiting the spread of infection. Oseltamivir and Zanamivir are effective in reducing symptoms for both influenza A and B when given to patients who have been symptomatic for less than two days.

[Couch, Robert B. The New England Journal of Medicine 2000, 343: 1778-1788.]

Clearly, there is an enormous need for a practical approach to the treatment of influenza. The goal of this research is to design a new antiviral drug which is effective against both influenza A and B. The ideal drug should have minimal side effects and fewer restrictions than the current drugs on the market. The purpose of this thesis is to present my research procedure, the difficulties which were overcome, and the resulting information.

Structural Approaches to Drug Discovery: Ligand-Protein Interactions; Stroud, Robert M.; Finer-Moore, Janet; Royal Society of Chemistry: Cambridge, UK, 2012

Chapter 1

BIOINFORMATICS

Bioinformatics is an interdisciplinary field of biology that develops and improves upon methods for storing, retrieving, organizing and analyzing biological data. A major activity in bioinformatics is to develop software tools to generate useful biological knowledge.

Bioinformatics has become an important part of many areas of biology. In experimental molecular biology, bioinformatics techniques such as image and signal processing allow extraction of useful results from large amounts of raw data. In the field of genetics and genomics, it aids in sequencing and annotating genomes and their observed mutations. It plays a role in the textual mining of biological literature and the development of biological and gene ontologies to organize and query biological data. It plays a role in the analysis of gene and protein expression and regulation. Bioinformatics tools aid in the comparison of genetic and genomic data and more generally in the understanding of evolutionary aspects of molecular biology. At a more integrative level, it helps analyze and catalogue the biological pathways and networks that are an important part of systems biology. In structural biology, it aids in the simulation and modeling of DNA, RNA, and protein structures as well as molecular interactions.

Bioinformatics uses many areas of computer science, mathematics and engineering to process biological data. Complex machines are used to read in biological data at a much faster rate than before. Databases and information systems are used to store and organize biological data. Analyzing biological data may involve algorithms in artificial intelligence, soft computing, data mining, image processing, and simulation. The algorithms in turn depend on theoretical foundations such as discrete mathematics, control theory, system theory, information theory, and statistics. Commonly used software tools and technologies in the field include Java, C#, XML, Perl, C, C++, Python, R, SQL, CUDA, MATLAB, and spreadsheet applications.

History

Building on the recognition of the importance of information transmission, accumulation and processing in biological systems, in 1978 Paulien Hogeweg coined the term "Bioinformatics" to refer to the study of information processes in biotic systems. This definition placed bioinformatics as a field parallel to biophysics or biochemistry (biochemistry is the study of chemical processes in biological systems). Examples of relevant biological information processes studied in the early days of bioinformatics are the formation of complex social interaction structures by simple behavioral rules, and the information accumulation and maintenance in models of prebiotic evolution.

One early contributor to bioinformatics was Elvin A. Kabat, who pioneered biological sequence analysis with his comprehensive volumes of antibody sequences released with Tai Te Wu between 1980 and 1991. Another significant pioneer in the field was Margaret Oakley Dayhoff, who has been hailed by David Lipman, director of the National Center for Biotechnology Information, as the "mother and father of bioinformatics."

At the beginning of the "genomic revolution", the term bioinformatics was re-discovered to refer to the creation and maintenance of a database to store biological information such as nucleotide sequences and amino acid sequences. Development of this type of database involved not only design issues but the development of complex interfaces whereby researchers could access existing data as well as submit new or revised data.

Goals

In order to study how normal cellular activities are altered in different disease states, the biological data must be combined to form a comprehensive picture of these activities. Therefore, the field of bioinformatics has evolved such that the most pressing task now involves the analysis and interpretation of various types of data. This includes nucleotide and amino acid sequences, protein domains, and protein structures.[9] The actual process of analyzing and interpreting data is referred to as computational biology. Important sub-disciplines within bioinformatics and computational biology include:

the development and implementation of tools that enable efficient access to, use and management of, various types of information.

the development of new algorithms (mathematical formulas) and statistics with which to assess relationships among members of large data sets. For example, methods to locate a gene within a sequence, predict protein structure and/or function, and cluster protein sequences into families of related sequences.

The primary goal of bioinformatics is to increase the understanding of biological processes. What sets it apart from other approaches, however, is its focus on developing and applying computationally intensive techniques to achieve this goal. Examples include: pattern recognition, data mining, machine learning algorithms, and visualization. Major research efforts in the field include sequence alignment, gene finding, genome assembly, drug design, drug discovery, protein structure alignment, protein structure prediction, prediction of gene expression and protein–protein interactions, genome-wide association studies, and the modeling of evolution.

Bioinformatics now entails the creation and advancement of databases, algorithms, computational and statistical techniques, and theory to solve formal and practical problems arising from the management and analysis of biological data.

Over the past few decades rapid developments in genomic and other molecular research technologies and developments in information technologies have combined to produce a tremendous amount of information related to molecular biology. Bioinformatics is the name given to these mathematical and computing approaches used to glean understanding of biological processes.

Approaches

Common activities in bioinformatics include mapping and analyzing DNA and protein sequences, aligning different DNA and protein sequences to compare them, and creating and viewing 3-D models of protein structures.
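As a rough illustration of what aligning two sequences to compare them involves, the sketch below scores the best global alignment of two strings with the classic Needleman-Wunsch dynamic programming recurrence. The function name and the scoring values (match, mismatch, gap) are assumptions chosen for the example and are not taken from this report.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch score of the best global alignment of a and b.

    The scoring values are illustrative assumptions, not recommended settings.
    """
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix costs i gaps
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    print(global_alignment_score("ACGT", "ACT"))
```

Only two rows of the table are kept in memory, which is enough to obtain the score; recovering the alignment itself would need the full table or a traceback.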

There are two fundamental ways of modelling a biological system (e.g., a living cell), both of which come under bioinformatics approaches.

Static

Sequences – Proteins, Nucleic acids and Peptides

Interaction data among the above entities including microarray data and Networks of proteins, metabolites

Dynamic

Structures – Proteins, Nucleic acids, Ligands (including metabolites and drugs) and Peptides (structures studied with bioinformatics tools are not considered static anymore and their dynamics is often the core of the structural studies)

Systems Biology comes under this category including reaction fluxes and variable concentrations of metabolites

Multi-Agent Based modelling approaches capturing cellular events such as signalling, transcription and reaction dynamics

A broad sub-category under bioinformatics is structural bioinformatics.

Major research areas

Sequence analysis

Since the Phage Φ-X174 was sequenced in 1977,[10] the DNA sequences of thousands of organisms have been decoded and stored in databases. This sequence information is analyzed to determine genes that encode polypeptides (proteins), RNA genes, regulatory sequences, structural motifs, and repetitive sequences. A comparison of genes within a species or between different species can show similarities between protein functions, or relations between species (the use of molecular systematics to construct phylogenetic trees). With the growing amount of data, it long ago became impractical to analyze DNA sequences manually. Today, computer programs such as BLAST are used daily to search sequences from more than 260 000 organisms, containing over 190 billion nucleotides.[11] These programs can compensate for mutations (exchanged, deleted or inserted bases) in the DNA sequence, to identify sequences that are related, but not identical. A variant of this sequence alignment is used in the sequencing process itself. The so-called shotgun sequencing technique (which was used, for example, by The Institute for Genomic Research to sequence the first bacterial genome, Haemophilus influenzae)[12] does not produce entire chromosomes. Instead it generates the sequences of many thousands of small DNA fragments (ranging from 35 to 900 nucleotides long, depending on the sequencing technology). The ends of these fragments overlap and, when aligned properly by a genome assembly program, can be used to reconstruct the complete genome. Shotgun sequencing yields sequence data quickly, but the task of assembling the fragments can be quite complicated for larger genomes. For a genome as large as the human genome, it may take many days of CPU time on large-memory, multiprocessor computers to assemble the fragments, and the resulting assembly will usually contain numerous gaps that have to be filled in later. 
Shotgun sequencing is the method of choice for virtually all genomes sequenced today, and genome assembly algorithms are a critical area of bioinformatics research.
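The idea of reconstructing a genome from overlapping fragments can be sketched with a greedy merge: repeatedly join the two fragments whose suffix and prefix overlap the most. This is a toy under strong assumptions (error-free reads, a single strand, no repeats); the function names and the minimum overlap of 3 are hypothetical, and real genome assemblers use far more elaborate methods.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left: the remaining contigs stay separate
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


if __name__ == "__main__":
    print(greedy_assemble(["GTTAGC", "ATGCGTAC", "GTACGTTA"]))
```

Fragments that share no sufficiently long overlap are returned as separate contigs, which mirrors the gaps mentioned above for large genomes.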

Another aspect of bioinformatics in sequence analysis is annotation. This involves computational gene finding to search for protein-coding genes, RNA genes, and other functional sequences within a genome. Not all of the nucleotides within a genome are part of genes. Within the genomes of higher organisms, large parts of the DNA do not serve any obvious purpose. This so-called junk DNA may, however, contain unrecognized functional elements. Bioinformatics helps to bridge the gap between genome and proteome projects, for example in the use of DNA sequences for protein identification.

Genome annotation

In the context of genomics, annotation is the process of marking the genes and other biological features in a DNA sequence. The first genome annotation software system was designed in 1995 by Dr. Owen White, who was part of the team at The Institute for Genomic Research that sequenced and analyzed the first genome of a free-living organism to be decoded, the bacterium Haemophilus influenzae. Dr. White built a software system to find the genes (fragments of genomic sequence that encode proteins), the transfer RNAs, and to make initial assignments of function to those genes. Most current genome annotation systems work similarly, but the programs available for analysis of genomic DNA, such as the GeneMark program trained and used to find protein-coding genes in Haemophilus influenzae, are constantly changing and improving.
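A very simple form of computational gene finding is to scan a DNA sequence for open reading frames (ORFs): stretches that begin with the start codon ATG and run in the same frame to a stop codon. The sketch below does this on the forward strand only. It is not how GeneMark or the software described above works; the function name and the minimum length are assumptions for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, sequence) for ATG...stop ORFs on the forward strand.

    min_codons counts every codon in the ORF, including the start and stop codons.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first ATG opens a candidate ORF in this frame
            elif start is not None and codon in STOP_CODONS:
                if (pos + 3 - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, dna[start:pos + 3]))
                start = None  # look for the next ORF after this stop codon
    return orfs


if __name__ == "__main__":
    print(find_orfs("CCATGAAATTTTAGGG"))
```

A fuller gene finder would also scan the reverse complement and use statistical models of coding sequence rather than start and stop codons alone.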

Computational evolutionary biology

Evolutionary biology is the study of the origin and descent of species, as well as their change over time. Informatics has assisted evolutionary biologists in several key ways; it has enabled researchers to:

trace the evolution of a large number of organisms by measuring changes in their DNA, rather than through physical taxonomy or physiological observations alone.

more recently, compare entire genomes, which permits the study of more complex evolutionary events, such as gene duplication, horizontal gene transfer, and the prediction of factors important in bacterial speciation.

build complex computational models of populations to predict the outcome of the system over time.

track and share information on an increasingly large number of species and organisms.

Future work endeavours to reconstruct the now more complex tree of life.
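Measuring changes in DNA between organisms can be sketched with the p-distance, the fraction of aligned positions at which two sequences differ. The example below assumes the sequences are already aligned to equal, non-zero length; the function names and taxon labels are hypothetical, and real phylogenetic studies use substitution models and tree-building methods that go well beyond this.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def closest_pair(sequences):
    """Return the two names whose sequences have the smallest p-distance."""
    return min(
        combinations(sorted(sequences), 2),
        key=lambda pair: p_distance(sequences[pair[0]], sequences[pair[1]]),
    )


if __name__ == "__main__":
    toy = {  # made-up sequences, for illustration only
        "taxon_a": "ACGTACGT",
        "taxon_b": "ACGTACGA",
        "taxon_c": "TCGAACTA",
    }
    print(closest_pair(toy))
```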

The area of research within computer science that uses genetic algorithms is sometimes confused with computational evolutionary biology, but the two areas are not necessarily related.

Literature analysis

The growth in the volume of published literature makes it virtually impossible to read every paper, resulting in disjointed sub-fields of research. Literature analysis aims to employ computational and statistical linguistics to mine this growing library of text resources. For example:

abbreviation recognition - identify the long form and abbreviation of biological terms

named entity recognition - recognize biological terms such as gene names

protein-protein interaction - identify which proteins interact with which proteins from text

The area of research draws from statistics and computational linguistics.
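Abbreviation recognition, the first task listed above, can be sketched with a simple heuristic: find an upper-case token in parentheses and check whether the initials of the words just before it spell the abbreviation. The function name is hypothetical and the rule is deliberately naive; it misses long forms that contain unabbreviated words (for example "serial analysis of gene expression (SAGE)").

```python
import re


def find_abbreviations(text):
    """Return {abbreviation: long form} for 'long form (ABBR)' patterns."""
    found = {}
    for match in re.finditer(r"\(([A-Z]{2,})\)", text):
        abbr = match.group(1)
        # Words that appear before the opening parenthesis.
        words = re.findall(r"[A-Za-z]+", text[:match.start()])
        candidate = words[-len(abbr):]
        # Accept only if each word's initial matches the corresponding letter.
        if len(candidate) == len(abbr) and all(
            w[0].upper() == c for w, c in zip(candidate, abbr)
        ):
            found[abbr] = " ".join(candidate)
    return found


if __name__ == "__main__":
    sentence = (
        "Reads come from massively parallel signature sequencing (MPSS) "
        "or an expressed sequence tag (EST)."
    )
    print(find_abbreviations(sentence))
```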

Analysis of gene expression

The expression of many genes can be determined by measuring mRNA levels with multiple techniques including microarrays, expressed cDNA sequence tag (EST) sequencing, serial analysis of gene expression (SAGE) tag sequencing, massively parallel signature sequencing (MPSS), RNA-Seq, also known as "Whole Transcriptome Shotgun Sequencing" (WTSS), or various applications of multiplexed in-situ hybridization. All of these techniques are extremely noise-prone and/or subject to bias in the biological measurement, and a major research area in computational biology involves developing statistical tools to separate signal from noise in high-throughput gene expression studies. Such studies are often used to determine the genes implicated in a disorder: one might compare microarray data from cancerous epithelial cells to data from non-cancerous cells to determine the transcripts that are up-regulated and down-regulated in a particular population of cancer cells.
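The comparison of cancerous and non-cancerous cells described above can be sketched as a log2 fold change of mean expression per gene. This is a minimal illustration only: the function name, the threshold of 1.0 and the pseudocount are assumptions, and because it ignores measurement noise it is no substitute for the statistical tools the paragraph refers to.

```python
import math


def differential_expression(tumor, normal, threshold=1.0, pseudocount=1.0):
    """Classify genes by log2 fold change of mean expression (tumor vs normal).

    tumor and normal map gene names to lists of replicate measurements.
    """
    calls = {}
    for gene in tumor.keys() & normal.keys():
        t = sum(tumor[gene]) / len(tumor[gene])
        n = sum(normal[gene]) / len(normal[gene])
        # The pseudocount avoids division by zero for unexpressed genes.
        lfc = math.log2((t + pseudocount) / (n + pseudocount))
        if lfc >= threshold:
            calls[gene] = ("up", lfc)
        elif lfc <= -threshold:
            calls[gene] = ("down", lfc)
        else:
            calls[gene] = ("unchanged", lfc)
    return calls


if __name__ == "__main__":
    tumor = {"G1": [7, 7], "G2": [1, 1], "G3": [3, 3]}   # made-up values
    normal = {"G1": [1, 1], "G2": [7, 7], "G3": [3, 3]}
    for gene, call in sorted(differential_expression(tumor, normal).items()):
        print(gene, call)
```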

Analysis of regulation

Regulation is the complex orchestration of events starting with an extracellular signal such as a hormone and leading to an increase or decrease in the activity of one or more proteins. Bioinformatics techniques have been applied to explore various steps in this process. For example, promoter analysis involves the identification and study of sequence motifs in the DNA surrounding the coding region of a gene. These motifs influence the extent to which that region is transcribed into mRNA. Expression data can be used to infer gene regulation: one might compare microarray data from a wide variety of states of an organism to form hypotheses about the genes involved in each state. In a single-cell organism, one might compare stages of the cell cycle, along with various stress conditions (heat shock, starvation, etc.). One can then apply clustering algorithms to that expression data to determine which genes are co-expressed. For example, the upstream regions (promoters) of co-expressed genes can be searched for over-represented regulatory elements. Examples of clustering algorithms applied in gene clustering are k-means clustering, self-organizing maps (SOMs), hierarchical clustering, and consensus clustering methods such as the Bi-CoPaM. The latter was proposed to address issues specific to gene discovery problems, such as consistent co-expression of genes over multiple microarray datasets.
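Of the clustering algorithms named above, k-means is the simplest to sketch. The example below groups expression profiles (one list of measurements per gene) by alternating an assignment step and a centroid update step. The function name, the fixed iteration count and the naive choice of the first k profiles as starting centroids are assumptions made to keep the example deterministic.

```python
def kmeans(profiles, k, iterations=20):
    """Cluster expression profiles (lists of floats) with plain k-means."""
    centroids = [list(p) for p in profiles[:k]]  # naive, deterministic start
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in profiles
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


if __name__ == "__main__":
    profiles = [  # made-up expression values for four genes
        [1.0, 2.0, 3.0],
        [9.0, 8.0, 7.0],
        [1.1, 2.1, 2.9],
        [9.2, 8.1, 7.1],
    ]
    print(kmeans(profiles, k=2))
```

Genes that receive the same label are candidates for co-expression; in practice the result depends on the starting centroids, so several random starts are usually compared.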

Analysis of protein expression

Protein microarrays and high throughput (HT) mass spectrometry (MS) can provide a snapshot of the proteins present in a biological sample. Bioinformatics is very much involved in making sense of protein microarray and HT MS data; the former approach faces similar problems as with microarrays targeted at mRNA, the latter involves the problem of matching large amounts of mass data against predicted masses from protein sequence databases, and the complicated statistical analysis of samples where multiple, but incomplete peptides from each protein are detected.
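The matching of measured masses against masses predicted from protein sequences can be sketched as follows: digest each database protein in silico, compute the monoisotopic mass of every peptide, and report peptides within a tolerance of each observed mass. The trypsin-like cleavage rule (after K or R, not before P), the tolerance and all function names are assumptions for illustration; the residue masses are standard monoisotopic values.

```python
RESIDUE_MASS = {  # monoisotopic residue masses in daltons
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_peptides(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_masses(observed, proteins, tol=0.01):
    """Map each observed mass to (protein_id, peptide) pairs within tol daltons."""
    return {
        mass: [
            (pid, pep)
            for pid, seq in proteins.items()
            for pep in tryptic_peptides(seq)
            if abs(peptide_mass(pep) - mass) <= tol
        ]
        for mass in observed
    }


if __name__ == "__main__":
    database = {"P1": "AKRPGK"}  # hypothetical protein
    print(match_masses([217.1426], database))
```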

Analysis of mutations in cancer

In cancer, the genomes of affected cells are rearranged in complex or even unpredictable ways. Massive sequencing efforts are used to identify previously unknown point mutations in a variety of genes in cancer. Bioinformaticians continue to produce specialized automated systems to manage the sheer volume of sequence data produced, and they create new algorithms and software to compare the sequencing results to the growing collection of human genome sequences and germline polymorphisms. New physical detection technologies are employed, such as oligonucleotide microarrays to identify chromosomal gains and losses (called comparative genomic hybridization), and single-nucleotide polymorphism arrays to detect known point mutations. These detection methods simultaneously measure several hundred thousand sites throughout the genome, and when used in high-throughput to measure thousands of samples, generate terabytes of data per experiment. Again the massive amounts and new types of data generate new opportunities for bioinformaticians. The data is often found to contain considerable variability, or noise, and thus Hidden Markov model and change-point analysis methods are being developed to infer real copy number changes.
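To give a feel for change-point analysis on noisy copy number data, the sketch below finds the single position that best splits a series of measurements into two segments with different means, by minimizing the total squared error. This is a simplified least-squares stand-in, not the Hidden Markov model or change-point methods used in practice, and the function name is hypothetical.

```python
def best_change_point(values):
    """Index that best splits values into two segments with different means.

    Returns None when no split lowers the squared error of a single segment.
    """
    def sse(segment):
        mean = sum(segment) / len(segment)
        return sum((v - mean) ** 2 for v in segment)

    best_idx, best_cost = None, sse(values)
    for i in range(1, len(values)):
        cost = sse(values[:i]) + sse(values[i:])
        if cost < best_cost:
            best_idx, best_cost = i, cost
    return best_idx


if __name__ == "__main__":
    # Made-up log-ratio signal: roughly 0 (normal) then roughly 1 (gain).
    signal = [0.1, -0.1, 0.0, 0.05, 1.0, 1.1, 0.9, 1.05]
    print(best_change_point(signal))
```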

Another type of data that requires novel informatics development is the analysis of lesions found to be recurrent among many tumors.

Comparative genomics

The core of comparative genome analysis is the establishment of the correspondence between genes (orthology analysis) or other genomic features in different organisms. It is these intergenomic maps that make it possible to trace the evolutionary processes responsible for the divergence of two genomes. A multitude of evolutionary events acting at various organizational levels shape genome evolution. At the lowest level, point mutations affect individual nucleotides. At a higher level, large chromosomal segments undergo duplication, lateral transfer, inversion, transposition, deletion and insertion.

Ultimately, whole genomes are involved in processes of hybridization, polyploidization and endosymbiosis, often leading to rapid speciation. The complexity of genome evolution poses many exciting challenges to developers of mathematical models and algorithms, who have recourse to a spectrum of algorithmic, statistical and mathematical techniques, ranging from exact, heuristic, fixed-parameter and approximation algorithms for problems based on parsimony models to Markov Chain Monte Carlo algorithms for Bayesian analysis of problems based on probabilistic models.

Many of these studies are based on homology detection and the computation of protein families.
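One widely used heuristic for establishing gene correspondence between two genomes is the reciprocal best hit: gene a and gene b are paired when each is the other's highest-scoring match. The sketch below assumes similarity scores have already been computed (for example by a sequence search tool); the function name, gene labels and score tables are hypothetical.

```python
def reciprocal_best_hits(scores_ab, scores_ba):
    """Pair genes that are each other's best-scoring match.

    scores_ab[a][b] is the similarity of gene a (genome A) to gene b (genome B);
    scores_ba holds the scores for the search in the opposite direction.
    """
    best_ab = {a: max(hits, key=hits.get) for a, hits in scores_ab.items() if hits}
    best_ba = {b: max(hits, key=hits.get) for b, hits in scores_ba.items() if hits}
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


if __name__ == "__main__":
    a_vs_b = {"a1": {"b1": 90, "b2": 40}, "a2": {"b1": 85, "b2": 30}}
    b_vs_a = {"b1": {"a1": 90, "a2": 85}, "b2": {"a1": 40, "a2": 30}}
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```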

Network and systems biology

Network analysis seeks to understand the relationships within biological networks such as metabolic or protein-protein interaction networks. Although biological networks can be constructed from a single type of molecule or entity (such as genes), network biology often attempts to integrate many different data types, such as proteins, small molecules, gene expression data, and others, which are all connected physically and/or functionally.
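A basic network analysis step is to build the interaction graph and rank nodes by degree, that is, by how many partners each protein has. The sketch below does this for an undirected protein-protein interaction list; the function names and protein labels are assumptions for illustration.

```python
from collections import defaultdict


def build_network(interactions):
    """Build an undirected graph (node -> set of neighbours) from pairs."""
    graph = defaultdict(set)
    for a, b in interactions:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def hubs(graph, top=1):
    """Return the `top` nodes with the most interaction partners."""
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(graph, key=lambda n: (-len(graph[n]), n))[:top]


if __name__ == "__main__":
    pairs = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
    network = build_network(pairs)
    print(hubs(network, top=2))
```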

Systems biology involves the use of computer simulations of cellular subsystems (such as the networks of metabolites and enzymes which comprise metabolism, signal transduction pathways and gene regulatory networks) to both analyze and visualize the complex connections of these cellular processes. Artificial life or virtual evolution attempts to understand evolutionary processes via the computer simulation of simple (artificial) life forms.

High-throughput image analysis

Computational technologies are used to accelerate or fully automate the processing, quantification and analysis of large amounts of high-information-content biomedical imagery. Modern image analysis systems augment an observer's ability to make measurements from a large or complex set of images, by improving accuracy, objectivity, or speed. A fully developed analysis system may completely replace the observer. Although these systems are not unique to biomedical imagery, biomedical imaging is becoming more important for both diagnostics and research. Some examples are:

high-throughput and high-fidelity quantification and sub-cellular localization (high-content screening, cytohistopathology, bioimage informatics)

morphometrics

clinical image analysis and visualization

determining the real-time air-flow patterns in breathing lungs of living animals

quantifying occlusion size in real-time imagery from the development of and recovery during arterial injury

making behavioral observations from extended video recordings of laboratory animals

infrared measurements for metabolic activity determination

inferring clone overlaps in DNA mapping, e.g. the Sulston score

Structural bioinformatics approaches:

Prediction of protein structure

Protein structure prediction is another important application of bioinformatics. The amino acid sequence of a protein, the so-called primary structure, can be easily determined from the sequence on the gene that codes for it. In the vast majority of cases, this primary structure uniquely determines a structure in its native environment. (Of course, there are exceptions, such as the bovine spongiform encephalopathy – a.k.a. Mad Cow Disease – prion.) Knowledge of this structure is vital in understanding the function of the protein. For lack of better terms, structural information is usually classified as one of secondary, tertiary and quaternary structure. A viable general solution to such predictions remains an open problem. Most efforts have so far been directed towards heuristics that work most of the time.

One of the key ideas in bioinformatics is the notion of homology. In the genomic branch of bioinformatics, homology is used to predict the function of a gene: if the sequence of gene A, whose function is known, is homologous to the sequence of gene B, whose function is unknown, one could infer that B may share A's function. In the structural branch of bioinformatics, homology is used to determine which parts of a protein are important in structure formation and interaction with other proteins. In a technique called homology modeling, this information is used to predict the structure of a protein once the structure of a homologous protein is known. This currently remains the only way to predict protein structures reliably.

One example is the structural homology between hemoglobin in humans and the hemoglobin in legumes (leghemoglobin). Both serve the same purpose of transporting oxygen in the organism. Although the two proteins have very different amino acid sequences, their protein structures are virtually identical, which reflects their near-identical purposes.

Other techniques for predicting protein structure include protein threading and de novo (from scratch) physics-based modeling.

Molecular Interaction

Efficient software is available today for studying interactions among proteins, ligands and peptides. The types of interaction most often encountered in the field are protein–ligand (including drug), protein–protein and protein–peptide.

Molecular dynamics simulation of the movement of atoms about rotatable bonds is the fundamental principle behind the computational algorithms, termed docking algorithms, that are used for studying molecular interactions.
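Docking programs typically generate many candidate placements (poses) of a ligand and rank them with a scoring function. As a heavily simplified sketch, the example below scores a pose by summing a 12-6 Lennard-Jones term over all ligand-protein atom pairs and keeps the pose with the lowest energy. The single epsilon and sigma values, the function names and the coordinates are assumptions; real scoring functions use atom-type-specific parameters and additional terms such as electrostatics and hydrogen bonding. `math.dist` requires Python 3.8 or later, and atoms are assumed never to coincide exactly.

```python
import math


def lj_energy(r, epsilon=0.2, sigma=3.5):
    """12-6 Lennard-Jones energy for two atoms separated by r (angstroms)."""
    ratio = (sigma / r) ** 6
    return 4 * epsilon * (ratio ** 2 - ratio)


def pose_score(ligand_atoms, protein_atoms):
    """Sum pairwise energies; lower (more negative) means a better fit."""
    return sum(
        lj_energy(math.dist(lig, prot))
        for lig in ligand_atoms
        for prot in protein_atoms
    )


def best_pose(poses, protein_atoms):
    """Return the candidate pose with the lowest total energy."""
    return min(poses, key=lambda pose: pose_score(pose, protein_atoms))


if __name__ == "__main__":
    protein = [(0.0, 0.0, 0.0)]          # one hypothetical protein atom
    candidates = [
        [(1.0, 0.0, 0.0)],               # too close: strongly repulsive
        [(3.93, 0.0, 0.0)],              # near the energy minimum
        [(10.0, 0.0, 0.0)],              # too far: almost no interaction
    ]
    print(best_pose(candidates, protein))
```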

In the last two decades, tens of thousands of protein three-dimensional structures have been determined by X-ray crystallography and protein nuclear magnetic resonance spectroscopy (protein NMR). One central question for the biological scientist is whether it is practical to predict possible protein–protein interactions based only on these 3D shapes, without doing protein–protein interaction experiments. A variety of methods have been developed to tackle the protein–protein docking problem, though it seems that there is still much work to be done in this field.

Software and tools

Software tools for bioinformatics range from simple command-line tools, to more complex graphical programs and standalone web-services available from various bioinformatics companies or public institutions.

Open-source bioinformatics software

Many free and open-source software tools have existed and continued to grow since the 1980s.[16] The combination of a continued need for new algorithms for the analysis of emerging types of biological readouts, the potential for innovative in silico experiments, and freely available open code bases have helped to create opportunities for all research groups to contribute to both bioinformatics and the range of open-source software available, regardless of their funding arrangements. The open source tools often act as incubators of ideas, or community-supported plug-ins in commercial applications. They may also provide de facto standards and shared object models for assisting with the challenge of bioinformation integration.

The range of open-source software packages includes titles such as Bioconductor, BioPerl, Biopython, BioJava, BioRuby, Bioclipse, EMBOSS, Taverna workbench, and UGENE. In order to maintain this tradition and create further opportunities, the non-profit Open Bioinformatics Foundation[16] has supported the annual Bioinformatics Open Source Conference (BOSC) since 2000.

Chapter 2

Drug designing

Drug design, sometimes referred to as rational drug design or more simply rational design, is the inventive process of finding new medications based on the knowledge of a biological target. The drug is most commonly an organic small molecule that activates or inhibits the function of a biomolecule such as a protein, which in turn results in a therapeutic benefit to the patient. In the most basic sense, drug design involves the design of small molecules that are complementary in shape and charge to the biomolecular target with which they interact and therefore will bind to it. Drug design frequently but not necessarily relies on computer modeling techniques. This type of modeling is often referred to as computer-aided drug design. Finally, drug design that relies on the knowledge of the three-dimensional structure of the biomolecular target is known as structure-based drug design.

The phrase "drug design" is to some extent a misnomer. What is really meant by drug design is ligand design (i.e., design of a small molecule that will bind tightly to its target). Although modeling techniques for prediction of binding affinity are reasonably successful, there are many other properties, such as bioavailability, metabolic half-life, lack of side effects, etc., that first must be optimized before a ligand can become a safe and efficacious drug. These other characteristics are often difficult to optimize using rational drug design techniques.

Background

Typically a drug target is a key molecule involved in a particular metabolic or signaling pathway that is specific to a disease condition or pathology, or to the infectivity or survival of a microbial pathogen. Some approaches attempt to inhibit the functioning of the pathway in the diseased state by causing a key molecule to stop functioning. Drugs may be designed that bind to the active region and inhibit this key molecule. Another approach may be to enhance the normal pathway by promoting specific molecules in the normal pathways that may have been affected in the diseased state. In addition, these drugs should be designed so as not to affect any other important "off-target" molecules or antitargets that may be similar in appearance to the target molecule, since drug interactions with off-target molecules may lead to undesirable side effects. Sequence homology is often used to identify such risks.

Most commonly, drugs are organic small molecules produced through chemical synthesis, but biopolymer-based drugs (also known as biologics) produced through biological processes are becoming increasingly common. In addition, mRNA-based gene silencing technologies may have therapeutic applications.

Types

Flow charts of two strategies of structure-based drug design

There are two major types of drug design. The first is referred to as ligand-based drug design and the second, structure-based drug design.

Ligand-based

Ligand-based drug design (or indirect drug design) relies on knowledge of other molecules that bind to the biological target of interest. These other molecules may be used to derive a pharmacophore model that defines the minimum necessary structural characteristics a molecule must possess in order to bind to the target.[4] In other words, a model of the biological target may be built based on the knowledge of what binds to it, and this model in turn may be used to design new molecular entities that interact with the target. Alternatively, a quantitative structure-activity relationship (QSAR), that is, a correlation between calculated properties of molecules and their experimentally determined biological activity, may be derived. These QSAR relationships in turn may be used to predict the activity of new analogs.

Structure-based

Structure-based drug design (or direct drug design) relies on knowledge of the three-dimensional structure of the biological target obtained through methods such as X-ray crystallography or NMR spectroscopy.[5] If an experimental structure of a target is not available, it may be possible to create a homology model of the target based on the experimental structure of a related protein. Using the structure of the biological target, candidate drugs that are predicted to bind with high affinity and selectivity to the target may be designed using interactive graphics and the intuition of a medicinal chemist. Alternatively, various automated computational procedures may be used to suggest new drug candidates.

As experimental methods such as X-ray crystallography and NMR develop, the amount of information concerning 3D structures of biomolecular targets has increased dramatically. In parallel, information about the structural dynamics and electronic properties of ligands has also increased. This has encouraged the rapid development of structure-based drug design. Current methods for structure-based drug design can be divided roughly into two categories. The first category is about "finding" ligands for a given receptor, which is usually referred to as database searching. In this case, a large number of potential ligand molecules are screened to find those fitting the binding pocket of the receptor. This method is usually referred to as ligand-based drug design. The key advantage of database searching is that it saves the synthetic effort needed to obtain new lead compounds. The second category is about "building" ligands, which is usually referred to as receptor-based drug design. In this case, ligand molecules are built up within the constraints of the binding pocket by assembling small pieces in a stepwise manner. These pieces can be either individual atoms or molecular fragments. The key advantage of such a method is that novel structures, not contained in any database, can be suggested. These techniques are generating much excitement in the drug design community.

Active site identification

Active site identification is the first step in this program. It analyzes the protein to find the binding pocket, derives key interaction sites within the binding pocket, and then prepares the necessary data for Ligand fragment link. The basic inputs for this step are the 3D structure of the protein and a pre-docked ligand in PDB format, as well as their atomic properties. Both ligand and protein atoms need to be classified and their atomic properties should be defined, basically, into four atomic types:

Hydrophobic atom: All carbons in hydrocarbon chains or in aromatic groups.

H-bond donor: Oxygen and nitrogen atoms bonded to hydrogen atom(s).

H-bond acceptor: Oxygen and sp2 or sp hybridized nitrogen atoms with lone electron pair(s).

Polar atom: Oxygen and nitrogen atoms that are neither H-bond donor nor H-bond acceptor, sulfur, phosphorus, halogen, metal, and carbon atoms bonded to hetero-atom(s).

The space inside the ligand binding region is then probed with virtual atoms of the four types above, so that the chemical environment of every spot in the ligand binding region is known. This shows which kinds of chemical fragments can be placed at the corresponding spots in the ligand binding region of the receptor.
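The four atom types above can be expressed as a small rule-based classifier. The sketch below is illustrative only: the function `classify_atom`, its parameters, the short lists of halogens and metals, and the choice to test the donor rule before the acceptor rule are all assumptions of this example, not the behaviour of any particular program.

```python
# Illustrative element sets (assumed, not exhaustive).
HALOGENS = {"F", "Cl", "Br", "I"}
METALS = {"Na", "K", "Mg", "Ca", "Mn", "Fe", "Cu", "Zn"}


def classify_atom(element, has_hydrogen=False, hybridization="sp3",
                  bonded_to_heteroatom=False):
    """Assign one of the four atom types described in the text.

    element              -- element symbol, e.g. "C", "N", "O", "Cl"
    has_hydrogen         -- True if the atom carries at least one hydrogen
    hybridization        -- "sp", "sp2" or "sp3" (only used for nitrogen)
    bonded_to_heteroatom -- True if a carbon is bonded to a non-C, non-H atom
    """
    el = element.capitalize()
    if el == "C":
        # Carbons bonded to hetero-atoms count as polar, all others as hydrophobic.
        return "polar" if bonded_to_heteroatom else "hydrophobic"
    if el in ("O", "N"):
        if has_hydrogen:
            return "donor"  # assumed precedence: donor rule is checked first
        if el == "O" or hybridization in ("sp", "sp2"):
            return "acceptor"
        return "polar"
    if el in ("S", "P") or el in HALOGENS or el in METALS:
        return "polar"
    return "unclassified"


if __name__ == "__main__":
    print(classify_atom("C"))                      # aliphatic or aromatic carbon
    print(classify_atom("O", has_hydrogen=True))   # hydroxyl oxygen
    print(classify_atom("N", hybridization="sp2")) # e.g. a ring nitrogen without H
```

A hydroxyl oxygen satisfies both the donor and the acceptor rule as written in the text, so a real implementation has to pick a precedence or allow both labels; the sketch simply picks the donor label.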

Amino acid symbols  

Amino acids are classified in different ways based on polarity, structure, nutritional requirement, metabolic fate, etc. The most commonly used classification is based on polarity, under which amino acids fall into four groups.

Non-polar amino acids

They have an equal number of amino and carboxyl groups and are neutral. These amino acids are hydrophobic and have no charge on the 'R' group. The amino acids in this group are alanine, valine, leucine, isoleucine, phenylalanine, glycine, tryptophan, methionine and proline.

Polar amino acids with no charge

These amino acids do not have any charge on the 'R' group. They participate in the hydrogen bonding of protein structure. The amino acids in this group are serine, threonine, tyrosine, cysteine, glutamine and asparagine.

Polar amino acids with positive charge

Polar amino acids with positive charge have more amino groups than carboxyl groups, making them basic.

The amino acids that have a positive charge on the 'R' group are placed in this category. They are lysine, arginine and histidine.

Polar amino acids with negative charge

Polar amino acids with negative charge have more carboxyl groups than amino groups making them acidic.

The amino acids that have a negative charge on the 'R' group are placed in this category. They are called dicarboxylic mono-amino acids. They are aspartic acid and glutamic acid.

Proline is an imino acid.
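The polarity grouping described above can be written down as a lookup table. The sketch below uses standard one-letter residue codes and follows the group membership given in this section; the names `POLARITY_GROUPS`, `polarity_of` and `polarity_composition` are hypothetical helpers introduced for illustration.

```python
from collections import Counter

# Group membership as listed in the text, written with one-letter codes.
POLARITY_GROUPS = {
    "non-polar": "AVLIFGWMP",
    "polar uncharged": "STYCQN",
    "positively charged": "KRH",
    "negatively charged": "DE",
}

# Reverse lookup: residue letter -> group name.
RESIDUE_TO_GROUP = {
    residue: group
    for group, residues in POLARITY_GROUPS.items()
    for residue in residues
}


def polarity_of(residue: str) -> str:
    """Return the polarity group of a single residue (one-letter code)."""
    return RESIDUE_TO_GROUP[residue.upper()]


def polarity_composition(sequence: str) -> dict:
    """Count how many residues of a sequence fall into each polarity group."""
    return dict(Counter(polarity_of(r) for r in sequence))


if __name__ == "__main__":
    print(polarity_composition("PEFLNNTEPL"))
```

A residue letter outside the twenty standard amino acids raises a `KeyError`, which keeps unexpected characters from being silently assigned to a group.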

Single-letter and three-letter codes for amino acids

Amino acid       One-letter symbol   Three-letter symbol
alanine          A                   Ala
arginine         R                   Arg
asparagine       N                   Asn
aspartic acid    D                   Asp
cysteine         C                   Cys
glutamic acid    E                   Glu
glutamine        Q                   Gln
glycine          G                   Gly
histidine        H                   His
isoleucine       I                   Ile
leucine          L                   Leu
lysine           K                   Lys
methionine       M                   Met
phenylalanine    F                   Phe
proline          P                   Pro
serine           S                   Ser
threonine        T                   Thr
tryptophan       W                   Trp
tyrosine         Y                   Tyr
valine           V                   Val
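The code table above translates directly into a dictionary, which is handy when converting between the one-letter sequences found in FASTA files and the three-letter residue names used in PDB files. The helper names below (`ONE_TO_THREE`, `to_three_letter`, `to_one_letter`) are assumptions made for this sketch.

```python
# One-letter to three-letter codes, copied from the table above.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "E": "Glu", "Q": "Gln", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# Reverse mapping, keyed by upper-case three-letter code (as in PDB files).
THREE_TO_ONE = {three.upper(): one for one, three in ONE_TO_THREE.items()}


def to_three_letter(sequence: str) -> str:
    """Convert a one-letter sequence into hyphen-separated three-letter codes."""
    return "-".join(ONE_TO_THREE[residue] for residue in sequence.upper())


def to_one_letter(residue_names) -> str:
    """Convert an iterable of three-letter codes into a one-letter sequence."""
    return "".join(THREE_TO_ONE[name.upper()] for name in residue_names)


if __name__ == "__main__":
    print(to_three_letter("PEFLNN"))
    print(to_one_letter(["Pro", "Glu", "Phe"]))
```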

CHAPTER 3

INTRODUCTION

Swine influenza

Figure: Electron microscope image of the reassorted H1N1 influenza virus photographed at the CDC Influenza Laboratory. The viruses are 80–120 nanometres in diameter.

Swine influenza, also called pig influenza, swine flu, hog flu and pig flu, is an infection caused by any one of several types of swine influenza viruses. Swine influenza virus (SIV) or swine-origin influenza virus (S-OIV) is any strain of the influenza family of viruses that is endemic in pigs.[2] As of 2009, the known SIV strains include influenza C and the subtypes of influenza A known as H1N1, H1N2, H2N1, H3N1, H3N2, and H2N3.

Swine influenza virus is common throughout pig populations worldwide. Transmission of the virus from pigs to humans is not common and does not always lead to human flu, often resulting only in the production of antibodies in the blood. If transmission does cause human flu, it is called zoonotic swine flu. People with regular exposure to pigs are at increased risk of swine flu infection.

During the mid-20th century, identification of influenza subtypes became possible, allowing accurate diagnosis of transmission to humans. Since then, only 50 such transmissions have been confirmed. These strains of swine flu rarely pass from human to human. Symptoms of zoonotic swine flu in humans are similar to those of influenza and of influenza-like illness in general, namely chills, fever, sore throat, muscle pains, severe headache, coughing, weakness and general discomfort.

In August 2010, the World Health Organization declared the swine flu pandemic officially over.

Classification

Of the three genera of influenza viruses that cause human flu, two also cause influenza in pigs, with influenza A being common in pigs and influenza C being rare.[3]

Influenza B has not been reported in pigs. Within influenza A and influenza C, the strains found in pigs and humans are largely distinct, although because of reassortment there have been transfers of genes among strains crossing swine, avian, and human species boundaries.

Influenza C

Influenza C viruses infect both humans and pigs, but do not infect birds.[4] Transmission between pigs and humans has occurred in the past.[5] For example, influenza C caused small outbreaks of a mild form of influenza amongst children in Japan[6] and California.[6] Because of its limited host range and the lack of genetic diversity in influenza C, this form of influenza does not cause pandemics in humans.

Influenza A

Swine influenza is known to be caused by influenza A subtypes H1N1,[8] H1N2, H2N3, H3N1,[10] and H3N2. In pigs, three influenza A virus subtypes (H1N1, H1N2 and H3N2) are the most common strains worldwide.[11] In the United States, the H1N1 subtype was exclusively prevalent among swine populations before 1998; however, since late August 1998, H3N2 subtypes have been isolated from pigs. As of 2004, H3N2 virus isolates in US swine and turkey stocks were triple reassortants, containing genes from human (HA, NA, and PB1), swine (NS, NP, and M), and avian (PB2 and PA) lineages.[12] In August 2012, the Centers for Disease Control and Prevention confirmed 145 human cases (113 in Indiana, 30 in Ohio, one in Hawaii and one in Illinois) of H3N2v since July 2012. The death of a 61-year-old woman from Madison County, Ohio was the first in the nation associated with the new swine flu strain. She contracted the illness after having contact with hogs at the Ross County Fair.

Surveillance

Although there is no formal national surveillance system in the United States to determine what viruses are circulating in pigs, an informal surveillance network in the United States is part of a world surveillance network.

History

Swine influenza was first proposed to be a disease related to human flu during the 1918 flu pandemic, when pigs became ill at the same time as humans.[17] The first identification of an influenza virus as a cause of disease in pigs occurred about ten years later, in 1930. For the following 60 years, swine influenza strains were almost exclusively H1N1. Then, between 1997 and 2002, new strains of three different subtypes and five different genotypes emerged as causes of influenza among pigs in North America. In 1997–1998, H3N2 strains emerged. These strains, which include genes derived by reassortment from human, swine and avian viruses, have become a major cause of swine influenza in North America. Reassortment between H1N1 and H3N2 produced H1N2. In 1999 in Canada, a strain of H4N6 crossed the species barrier from birds to pigs, but was contained on a single farm.

The H1N1 form of swine flu is one of the descendants of the strain that caused the 1918 flu pandemic. As well as persisting in pigs, the descendants of the 1918 virus have also circulated in humans through the 20th century, contributing to the normal seasonal epidemics of influenza. However, direct transmission from pigs to humans is rare, with only 12 recorded cases in the U.S. since 2005. Nevertheless, the retention of influenza strains in pigs after these strains have disappeared from the human population might make pigs a reservoir where influenza viruses could persist, later emerging to reinfect humans once human immunity to these strains has waned.

Swine flu has been reported numerous times as a zoonosis in humans, usually with limited distribution, rarely with a widespread distribution. Outbreaks in swine are common and cause significant economic losses in industry, primarily by causing stunting and extended time to market. For example, this disease costs the British meat industry about £65 million every year.

Transmission

Transmission between pigs

Influenza is quite common in pigs, with about half of breeding pigs having been exposed to the virus in the US. Antibodies to the virus are also common in pigs in other countries.[57]

The main route of transmission is through direct contact between infected and uninfected animals. These close contacts are particularly common during animal transport. Intensive farming may also increase the risk of transmission, as the pigs are raised in very close proximity to each other. The direct transfer of the virus probably occurs either by pigs touching noses, or through dried mucus. Airborne transmission through the aerosols produced by pigs coughing or sneezing is also an important means of infection. The virus usually spreads quickly through a herd, infecting all the pigs within just a few days. Transmission may also occur through wild animals, such as wild boar, which can spread the disease between farms.[60]

Transmission to humans

People who work with poultry and swine, especially those with intense exposures, are at increased risk of zoonotic infection with influenza virus endemic in these animals, and constitute a population of human hosts in which zoonosis and reassortment can co-occur. Vaccination of these workers against influenza and surveillance for new influenza strains among this population may therefore be an important public health measure. Transmission of influenza from swine to humans who work with swine was documented in a small surveillance study performed in 2004 at the University of Iowa. This study, among others, forms the basis of a recommendation that people whose jobs involve handling poultry and swine be the focus of increased public health surveillance. Other professions at particular risk of infection are veterinarians and meat processing workers, although the risk of infection for both of these groups is lower than that of farm workers.

Interaction with avian H5N1 in pigs

Pigs are unusual as they can be infected with influenza strains that usually infect three different species: pigs, birds and humans. This makes pigs a host where influenza viruses might exchange genes, producing new and dangerous strains. Avian influenza virus H3N2 is endemic in pigs in China, and has been detected in pigs in Vietnam, increasing fears of the emergence of new variant strains.[66] H3N2 evolved from H2N2 by antigenic shift. In August 2004, researchers in China found H5N1 in pigs.

These H5N1 infections may be quite common; in a survey of 10 apparently healthy pigs housed near poultry farms in West Java, where avian flu had broken out, five of the pig samples contained the H5N1 virus. The Indonesian government has since found similar results in the same region. Additional tests of 150 pigs outside the area were negative.

Signs and symptoms

In swine

In pigs, influenza infection produces fever, lethargy, sneezing, coughing, difficulty breathing and decreased appetite. In some cases the infection can cause abortion. Although mortality is usually low (around 1–4%), the virus can produce weight loss and poor growth, causing economic loss to farmers. Infected pigs can lose up to 12 pounds of body weight over a three- to four-week period.

In humans

Direct transmission of a swine flu virus from pigs to humans is occasionally possible (called zoonotic swine flu). In all, 50 cases are known to have occurred since the first report in medical literature in 1958, which have resulted in a total of six deaths.[72] Of these six people, one was pregnant, one had leukemia, one had Hodgkin's lymphoma and two were known to be previously healthy. Despite these apparently low numbers of infections, the true rate of infection may be higher, since most cases cause only a very mild disease and will probably never be reported or diagnosed.

According to the Centers for Disease Control and Prevention (CDC), in humans the symptoms of the 2009 "swine flu" H1N1 virus are similar to those of influenza and of influenza-like illness in general. Symptoms include fever, cough, sore throat, body aches, headache, chills and fatigue. The 2009 outbreak has shown an increased percentage of patients reporting diarrhea and vomiting. The 2009 H1N1 virus is not zoonotic swine flu, as it is not transmitted from pigs to humans, but from person to person.

Because these symptoms are not specific to swine flu, a differential diagnosis of probable swine flu requires not only symptoms, but also a high likelihood of swine flu due to the person's recent history. For example, during the 2009 swine flu outbreak in the United States, the CDC advised physicians to "consider swine influenza infection in the differential diagnosis of patients with acute febrile respiratory illness who have either been in contact with persons with confirmed swine flu, or who were in one of the five U.S. states that have reported swine flu cases or in Mexico during the seven days preceding their illness onset."[75] A diagnosis of confirmed swine flu requires laboratory testing of a respiratory sample (a simple nose and throat swab).[75]

The most common cause of death is respiratory failure. Other causes of death are pneumonia (leading to sepsis),[76] high fever (leading to neurological problems), dehydration (from excessive vomiting and diarrhea), electrolyte imbalance and kidney failure.[77] Fatalities are more likely in young children and the elderly.

Diagnosis

The CDC recommends real-time RT-PCR as the method of choice for diagnosing H1N1. This method allows a specific diagnosis of novel influenza (H1N1) as opposed to seasonal influenza. Near-patient point-of-care tests are in development.[79]

Prevention

Prevention of swine influenza has three components: prevention in swine, prevention of transmission to humans, and prevention of its spread among humans.

The surface proteins present on influenza A virus are hemagglutinin and neuraminidase.

Hemagglutinin

Influenza hemagglutinin (HA) or haemagglutinin (British English) is a type of hemagglutinin found on the surface of the influenza viruses. It is an antigenic glycoprotein. It is responsible for binding the virus to the cell that is being infected. HA proteins bind to cells with sialic acid on the membranes, such as cells in the upper respiratory tract or erythrocytes.

The name "hemagglutinin" comes from the protein's ability to cause red blood cells (erythrocytes) to clump together ("agglutinate") in vitro.

Subtypes

Figure: Structure of influenza, showing neuraminidase marked as NA and hemagglutinin as HA.

There are at least 17 different HA antigens. These subtypes are named H1 through H17. H16 was discovered only in 2004 on influenza A viruses isolated from black-headed gulls from Sweden and Norway. The most recent, H17, was discovered in 2012 in fruit bats. The first three hemagglutinins, H1, H2, and H3, are found in human influenza viruses.

Viral neuraminidase (NA) is another protein found on the surface of influenza. Influenza viruses are characterised by the type of HA and NA that they carry; hence H1N1, H5N2 etc.

A highly pathogenic avian flu virus of H5N1 type has been found to infect humans at a low rate. It has been reported that single amino acid changes in this avian virus strain's type H5 hemagglutinin have been found in human patients that "can significantly alter receptor specificity of avian H5N1 viruses, providing them with an ability to bind to receptors optimal for human influenza viruses".[5][6] This finding seems to explain how an H5N1 virus that normally does not infect humans can mutate and become able to efficiently infect human cells. The hemagglutinin of the H5N1 virus has been associated with the high pathogenicity of this flu virus strain, apparently due to its ease of conversion to an active form by proteolysis.

Function and Mechanism

HA has two functions. Firstly, it allows the recognition of target vertebrate cells, accomplished through the binding to these cells' sialic acid-containing receptors. Secondly, once bound it facilitates the entry of the viral genome into the target cells by causing the fusion of host endosomal membrane with the viral membrane.[9]

HA binds to the monosaccharide sialic acid which is present on the surface of its target cells, which causes the viral particles to stick to the cell's surface. The cell membrane then engulfs the virus and the portion of the membrane that encloses it pinches off to form a new membrane-bound compartment within the cell called an endosome, which contains the engulfed virus. The cell then attempts to begin digesting the contents of the endosome by acidifying its interior and transforming it into a lysosome. However, as soon as the pH within the endosome drops to about 6.0, the original folded structure of the HA molecule becomes unstable, causing it to partially unfold and release a very hydrophobic portion of its peptide chain that was previously hidden within the protein.

This so-called "fusion peptide" acts like a molecular grappling hook by inserting itself into the endosomal membrane and locking on. Then, when the rest of the HA molecule refolds into a new structure (which is more stable at the lower pH), it "retracts the grappling hook" and pulls the endosomal membrane right up next to the virus particle's own membrane, causing the two to fuse together. Once this has happened, the contents of the virus, including its RNA genome, are free to pour out into the cell's cytoplasm.

Structure

HA is a homotrimeric integral membrane glycoprotein. It is shaped like a cylinder, and is approximately 13.5 nanometres long. The three identical monomers that constitute HA are constructed into a central α-helix coil; three spherical heads contain the sialic acid binding sites. HA monomers are synthesized as precursors that are then glycosylated and cleaved into two smaller polypeptides: the HA1 and HA2 subunits. Each HA monomer consists of a long, helical chain anchored in the membrane by HA2 and topped by a large HA1 globule.

Neuraminidase

Neuraminidase enzymes are glycoside hydrolase enzymes (EC 3.2.1.18) that cleave the glycosidic linkages of neuraminic acids. Neuraminidase enzymes are a large family, found in a range of organisms. The best-known neuraminidase is the viral neuraminidase, a drug target for the prevention of the spread of influenza infection. The viral neuraminidases are frequently used as antigenic determinants found on the surface of the influenza virus. Some variants of the influenza neuraminidase confer more virulence to the virus than others. Other homologs, which have a range of functions, are found in mammalian cells. At least four mammalian sialidase homologs have been described in the human genome (see NEU1, NEU2, NEU3, NEU4).

Neuraminidases, also called sialidases, catalyze the hydrolysis of terminal sialic acid residues from the newly formed virions and from the host cell receptors.[1] Sialidase activities include assistance in the mobility of virus particles through the respiratory tract mucus and in the elution of virion progeny from the infected cell.

Structure

Influenza neuraminidase exists as a mushroom-shape projection on the surface of the influenza virus. It has a head consisting of four co-planar and roughly spherical subunits, and a hydrophobic region that is embedded within the interior of the virus' membrane. It comprises a single polypeptide chain that is oriented in the opposite direction to the hemagglutinin antigen. The composition of the polypeptide is a single chain of six conserved polar amino acids, followed by hydrophilic, variable amino acids. β-Sheets predominate as the secondary level of protein conformation.

Recent emergence of oseltamivir and zanamivir resistant human influenza A(H1N1) H274Y has emphasized the need for suitable expression systems to obtain large quantities of highly pure and stable, recombinant neuraminidase through two separate artificial tetramerization domains that facilitate the formation of catalytically active neuraminidase homotetramers from yeast and Staphylothermus marinus, which allow for secretion of FLAG-tagged proteins and further purification.

Mechanism

Figure 1: Proposed mechanism of catalysis of influenza virus sialidase.

The enzymatic mechanism of influenza virus sialidase has been studied by Taylor et al. and is shown in Figure 1. The catalytic process has four steps. The first step involves the distortion of the α-sialoside from a 2C5 chair conformation (the lowest-energy form in solution) to a pseudoboat conformation when the sialoside binds to the sialidase. The second step leads to an oxocarbocation intermediate, the sialosyl cation. The third step is the formation of Neu5Ac, initially as the α-anomer, and the fourth is mutarotation and release as the more thermodynamically stable β-Neu5Ac.

How does the swine flu virus work? (A pictorial representation)

Chapter 4

Methodology:

H1N1 was modeled and energy-refined with Modeller 9v10[4] using PDB entry 1LV1 as a template. The predicted models were evaluated for geometry, stereochemistry and energy distribution using PROCHECK[5]. The models were systematically analyzed using ProSA for various structural properties, and the best modelled structure, with 94.6% of residues in the core region of the Ramachandran plot, was selected as the docking target enzyme. Three potential binding sites of the modelled H1N1 were revealed by the Ligsite[6] program, of which pkt-48 was found to be the most favourable and conserved region, containing critical aspartic acid and glycine residues (D198, D227 and G27) and showing better binding affinity. In this study, methylformamide was used as the seed molecule for de novo generation, with a final output of twenty structurally complementary potential lead molecules from LigBuilder v1.2[7]. All twenty de novo designed and selected ligand molecules were docked into the target enzyme using AutoDock 4.2.3[8]. The binding energies of the 10 designed ligand molecules examined with AutoDock 4.2.3 range from -3.53 to -0.59 kJ/mol.
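To give the reported binding energies a more intuitive scale, a binding free energy can be converted into an estimated dissociation constant through the standard relation ΔG = RT ln Kd. The sketch below is only an illustration under stated assumptions: the function name `dissociation_constant` and the temperature of 298.15 K are choices made here, and the energies are taken in kJ/mol as written in this report (AutoDock itself normally reports energies in kcal/mol, so the unit should be checked against the docking log before relying on the numbers).

```python
import math

GAS_CONSTANT = 8.314462618e-3  # kJ/(mol*K)


def dissociation_constant(delta_g_kj_per_mol: float,
                          temperature_k: float = 298.15) -> float:
    """Estimate Kd (in mol/L) from a binding free energy via dG = RT ln(Kd).

    More negative binding energies give smaller Kd values, i.e. tighter binding.
    """
    return math.exp(delta_g_kj_per_mol / (GAS_CONSTANT * temperature_k))


if __name__ == "__main__":
    # The two ends of the energy range quoted in the text.
    for energy in (-3.53, -0.59):
        print(f"dG = {energy} kJ/mol -> Kd ~ {dissociation_constant(energy):.3f} M")
```

Because the relation is exponential, small differences in binding energy translate into large differences in the estimated dissociation constant, which is why docking scores are usually compared on a relative basis.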

1) TARGET IDENTIFICATION

2) TARGET VALIDATION

3) STRUCTURE RETRIEVAL OR DETERMINATION

4) STRUCTURE VALIDATION

5) ACTIVE SITE IDENTIFICATION

6) LEAD IDENTIFICATION

7) DEVELOPMENT OF LEAD INTO ACTIVE SITE

8) DOCKING ANALYSIS

9) ADME TOXICO ANALYSIS

10) PROPOSAL OF NEW DRUG CANDIDATE OR MOLECULE

Structure-based drug designing in 1996.

In silico: performed primarily on a computer (data on silicon chips).

PDB: the Protein Data Bank, a repository of structures determined experimentally (NMR, X-ray crystallography, etc.).

If a structure of the target is available, proceed to the further steps; if not, use structure modeling techniques: de novo modeling, threading or homology modeling.

Alignment: bringing together two similar or identical entities. Global (FASTA); local (BLAST).
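The difference between a global and a local alignment can be made concrete with a small dynamic-programming sketch. FASTA and BLAST are heuristic search tools, so the code below does not reproduce either program; it uses the classic Needleman-Wunsch (global) and Smith-Waterman (local) recurrences instead, with an assumed scoring scheme (match +1, mismatch -1, gap -2) chosen purely for illustration.

```python
def alignment_scores(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2):
    """Return (global_score, local_score) for two sequences.

    global_score -- Needleman-Wunsch: both sequences aligned end to end
    local_score  -- Smith-Waterman: best-scoring pair of subsequences
    """
    rows, cols = len(a) + 1, len(b) + 1
    glob = [[0] * cols for _ in range(rows)]
    loc = [[0] * cols for _ in range(rows)]

    # Global alignment pays for leading gaps; local alignment starts from zero.
    for i in range(1, rows):
        glob[i][0] = i * gap
    for j in range(1, cols):
        glob[0][j] = j * gap

    best_local = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            glob[i][j] = max(glob[i - 1][j - 1] + s,
                             glob[i - 1][j] + gap,
                             glob[i][j - 1] + gap)
            # A local alignment may restart anywhere, hence the floor of 0.
            loc[i][j] = max(0,
                            loc[i - 1][j - 1] + s,
                            loc[i - 1][j] + gap,
                            loc[i][j - 1] + gap)
            best_local = max(best_local, loc[i][j])
    return glob[-1][-1], best_local


if __name__ == "__main__":
    print(alignment_scores("AAAA", "TTAAAATT"))
```

When a short sequence is embedded in a longer one, the global score is dragged down by the gaps needed to span the whole length, while the local score reflects only the matching region. This is why local alignment is the usual choice for database searches.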

TARGET IDENTIFICATION AND VALIDATION

Protein Selection

Prior to ligand development, the protein target was first selected. For this research, the neuraminidase subtype N4 (PDB ID 2HTV) was chosen for study. (See Figure 2.3) It is structurally similar to N1 neuraminidase, but has had fewer investigations involving antiviral activity. Its structure was initially released on September 9, 2005, but last modified on February 24, 2009. It is a strain of influenza A virus. It consists of two polypeptide chains and is classified as a hydrolase.8

Figure : Visualization of 2HTV, N4 neuraminidase

Structural comparison of N1, N4 and N8 Group-1 neuraminidase shows their active sites to be virtually identical. Group-1 NAs consist of N1, N4, N5, and N8. Group-2 contains N2, N3, N6, N7 and N9. There are conformational differences between Group-1 and Group-2 NAs. These differences come in the form of various amino acid configurations. The differences result in a large cavity being present in Group-1 NAs which is not available in Group-2 NAs.9

TARGET VALIDATION

1) Search for the query protein sequence (in FASTA format)

2) Search for query homologs (BLAST from NCBI)

3) Search for homologous structures

4) Preparation for modeling (Modeller version 9v10)

.ali file: alignment file

.atm file: atomic coordinate file

.py file: Modeller Python program file

FASTA FILE

>3SAL:A|PDBID|CHAIN|SEQUENCE

PEFLNNTEPLCNVSGFAIVSKDNGIRIGSRGHVFVIREPFVACGPTECRTFFLTQGALLNDKHSNNTVKDRSPYRALMSV

PLGSSPNAYQAKFESVAWSATACHDGKKWLAVGISGADDDAYAVIHYGGMPTDVVRSWRKQILRTQESSCVCMNGNCYWV

MTDGPANSQASYKIFKSHEGMVTNEREVSFQGGHIEECSCYPNLGKVECVCRDNWNGMNRPILIFDEDLDYEVGYLCAGI

PTDTPRVQDSSFTGSCTNAVGGSGTNNYGVKGFGFRQGNSVWAGRTVSISSRSGFEILLIEDGWIRTSKTIVKKVEVLNN

KNWSGYSGAFTIPITMTSKQCLVPCFWLEMIRGKPEERTSIWTSSSSTVFCGVSSEVPGWSWDDGAILPFDIDKM

>3SAL:B|PDBID|CHAIN|SEQUENCE

PEFLNNTEPLCNVSGFAIVSKDNGIRIGSRGHVFVIREPFVACGPTECRTFFLTQGALLNDKHSNNTVKDRSPYRALMSV

PLGSSPNAYQAKFESVAWSATACHDGKKWLAVGISGADDDAYAVIHYGGMPTDVVRSWRKQILRTQESSCVCMNGNCYWV

MTDGPANSQASYKIFKSHEGMVTNEREVSFQGGHIEECSCYPNLGKVECVCRDNWNGMNRPILIFDEDLDYEVGYLCAGI

PTDTPRVQDSSFTGSCTNAVGGSGTNNYGVKGFGFRQGNSVWAGRTVSISSRSGFEILLIEDGWIRTSKTIVKKVEVLNN

KNWSGYSGAFTIPITMTSKQCLVPCFWLEMIRGKPEERTSIWTSSSSTVFCGVSSEVPGWSWDDGAILPFDIDKM
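A FASTA file like the one above can be read with a few lines of code before building the alignment. The following is a minimal sketch, not the procedure used in the project: the function name `parse_fasta` is hypothetical, and the example text is a truncated stand-in (first 20 residues only) for the two-chain 3SAL file.

```python
def parse_fasta(text):
    """Return a dict mapping each header line (without '>') to its sequence."""
    records = {}
    header = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines between sequence rows
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line  # join sequence rows into one string
    return records


# Truncated stand-in for the file above (first 20 residues of each chain).
example = """>3SAL:A|PDBID|CHAIN|SEQUENCE
PEFLNNTEPL

CNVSGFAIVS
>3SAL:B|PDBID|CHAIN|SEQUENCE
PEFLNNTEPLCNVSGFAIVS
"""

chains = parse_fasta(example)
chain_a = chains["3SAL:A|PDBID|CHAIN|SEQUENCE"]
chain_b = chains["3SAL:B|PDBID|CHAIN|SEQUENCE"]
print(len(chain_a), chain_a == chain_b)
```

Comparing the two chains this way is a quick check that chains A and B carry the same sequence, so either one can serve as the query.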

ALI FILE:

>p1;AAAA

structure:X::::::::

PEFLNNTEPLCNVSGFAIVSKDNGIRIGSRGHVFVIREPFVACGPTECRTFFLTQGALLN

DKHSNNTVKDRSPYRALMSVPLGSSPNAYQAKFESVAWSATACHDGKKWLAVGISGADDD

AYAVIHYGGMPTDVVRSWRKQILRTQESSCVCMNGNCYWVMTDGPANSQASYKIFKSHEG

MVTNEREVSFQGGHIEECSCYPNLGKVECVCRDNWNGMNRPILIFDEDLDYEVGYLCAGI

PTDTPRVQDSSFTGSCTNAVGGSGTNNYGVKGFGFRQGNSVWAGRTVSISSRSGFEILLI

EDGWIRTSKTIVKKVEVLNNKNWSGYSGAFTIPITMTSKQCLVPCFWLEMIRGKPEERTS

IWTSSSSTVFCGVSSEVPGWSWDDGAILPFDIDKM*

>p2;BBBB

sequence:y::::::::

PEFLNNTEPLCNVSGFAIVSKDNGIRIGSRGHVFVIREPFVACGPTECRTFFLTQGALLN

DKHSNNTVKDRSPYRALMSVPLGSSPNAYQAKFESVAWSATACHDGKKWLAVGVSGADDD

AYAVIHYGGMPTDVVRSWRKQILRTQESSCVCMNGNCYWVMTDGPANSQASYKIFKSHEG

MVTNEREVSFQGGHIEECSCYPNLGKVECVCRDNWNGMNRPILIFDEDLDYEVGYLCAGI

PTDTPRVQDSSFTGSCTNAVGGSGTNNYGVKGFGFRQGNSVWAGRTVSISSRSGFEILLI

EDGWIRTSKTIVKKVEVLNNKNWSGYSGAFTIPITMTGKQCLVPCFWLEMIRGKPEERTS

IWTSSSSTVFCGVSSEVPGWSWDDGAILPFDIDKM*
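The template (AAAA) and target (BBBB) entries above are almost identical, so it can help to list where they differ. The sketch below is an illustration only: `substitutions` is a hypothetical helper, it assumes two gap-free sequences of equal length, and the fragments are short excerpts taken from the second row of each entry.

```python
def substitutions(template, target):
    """List (position, template_residue, target_residue) for each mismatch.

    Positions are 1-based. Both sequences must be aligned without gaps.
    """
    if len(template) != len(target):
        raise ValueError("sequences must have equal length (gaps are not handled)")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(template, target), start=1)
        if a != b
    ]


# Excerpts from the alignment file above (template vs. target).
template_fragment = "KKWLAVGISGADDD"
target_fragment = "KKWLAVGVSGADDD"
print(substitutions(template_fragment, target_fragment))
```

Running the same function on the full sequences would report every residue that Modeller has to change when building the target model from the template.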

PYTHON FILE:

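# Import the core Modeller names (log, environ) and the automodel class.
from modeller import *
from modeller.automodel import *

log.verbose()    # request verbose log output
env = environ()  # create a new Modeller environment

# Where Modeller looks for the template coordinate (.atm) file
env.io.atom_files_directory = './:../AAAA.atm'

a = automodel(env,
              alnfile='AAAA.ali',  # alignment file
              knowns='AAAA',       # code of the template structure
              sequence='BBBB')     # code of the target sequence
a.starting_model = 1  # index of the first model to build
a.ending_model = 5    # index of the last model to build
a.make()              # run the comparative modeling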

ATM FILE:

ATOM 155 CG2 THR 30 5.649 80.387 76.658 1.00 8.90 C

ATOM 156 N ASP 31 5.327 81.123 81.387 1.00 21.51 N

ATOM 157 CA ASP 31 5.105 81.984 82.553 1.00 22.00 C

ATOM 158 C ASP 31 5.353 81.336 83.886 1.00 24.06 C

ATOM 159 O ASP 31 4.845 80.257 84.163 1.00 24.63 O

ATOM 160 CB ASP 31 3.643 82.432 82.603 1.00 27.02 C

ATOM 161 CG ASP 31 3.351 83.592 81.695 1.00 28.60 C

ATOM 162 OD1 ASP 31 3.585 84.735 82.134 1.00 29.87 O

ATOM 163 OD2 ASP 31 2.874 83.357 80.553 1.00 32.70 O

ATOM 164 N ASP 32 6.041 82.056 84.756 1.00 27.54 N

ATOM 165 CA ASP 32 6.305 81.561 86.098 1.00 32.45 C

ATOM 166 C ASP 32 5.035 81.781 86.953 1.00 31.11 C

ATOM 167 O ASP 32 4.687 80.959 87.797 1.00 32.59 O

ATOM 168 CB ASP 32 7.514 82.285 86.703 1.00 38.29 C

ATOM 169 CG ASP 32 8.150 81.493 87.832 1.00 45.66 C

ATOM 170 OD1 ASP 32 8.697 80.391 87.550 1.00 51.06 O

ATOM 171 OD2 ASP 32 8.074 81.942 89.003 1.00 49.70 O

ATOM 172 N GLN 33 4.325 82.876 86.692 1.00 29.33 N

ATOM 173 CA GLN 33 3.104 83.206 87.406 1.00 29.10 C

ATOM 174 C GLN 33 2.092 83.840 86.483 1.00 25.47 C

ATOM 175 O GLN 33 2.336 84.888 85.895 1.00 29.43 O

ATOM 176 CB GLN 33 3.378 84.203 88.514 1.00 36.87 C

ATOM 177 CG GLN 33 4.176 83.666 89.666 1.00 49.28 C

ATOM 178 CD GLN 33 5.098 84.735 90.231 1.00 56.34 C

ATOM 179 OE1 GLN 33 5.927 85.307 89.503 1.00 61.39 O

ATOM 180 NE2 GLN 33 4.933 85.052 91.516 1.00 60.19 N

ATOM 181 N ILE 34 0.944 83.215 86.357 1.00 20.02 N

ATOM 182 CA ILE 34 -0.081 83.771 85.525 1.00 17.22 C

ATOM 183 C ILE 34 -1.383 83.502 86.267 1.00 14.70 C

ATOM 184 O ILE 34 -1.539 82.459 86.896 1.00 15.39 O

ATOM 185 CB ILE 34 -0.028 83.175 84.105 1.00 17.97 C

ATOM 186 CG1 ILE 34 -0.870 84.028 83.171 1.00 20.61 C

ATOM 187 CG2 ILE 34 -0.531 81.772 84.085 1.00 18.49 C

ATOM 188 CD1 ILE 34 -0.960 83.489 81.775 1.00 22.77 C

ATOM 189 N GLU 35 -2.276 84.478 86.271 1.00 12.14 N

ATOM 190 CA GLU 35 -3.523 84.356 86.998 1.00 9.31 C

ATOM 191 C GLU 35 -4.622 83.670 86.233 1.00 7.29 C

ATOM 192 O GLU 35 -4.927 84.036 85.103 1.00 9.41 O

ATOM 193 CB GLU 35 -3.984 85.735 87.437 1.00 11.57 C

ATOM 194 CG GLU 35 -4.962 85.747 88.589 1.00 18.28 C

ATOM 195 CD GLU 35 -5.559 87.124 88.838 1.00 22.13 C

ATOM 196 OE1 GLU 35 -5.075 88.110 88.247 1.00 29.08 O

ATOM 197 OE2 GLU 35 -6.539 87.229 89.597 1.00 25.32 O

ATOM 198 N VAL 36 -5.212 82.674 86.880 1.00 9.63 N

ATOM 199 CA VAL 36 -6.317 81.882 86.353 1.00 9.15 C

ATOM 200 C VAL 36 -7.501 82.005 87.308 1.00 10.04 C

ATOM 201 O VAL 36 -7.364 82.467 88.445 1.00 9.25 O

ATOM 202 CB VAL 36 -5.934 80.379 86.175 1.00 7.62 C

ATOM 203 CG1 VAL 36 -4.928 80.217 85.029 1.00 6.60 C

ATOM 204 CG2 VAL 36 -5.389 79.806 87.457 1.00 6.96 C

ATOM 205 N THR 37 -8.675 81.614 86.842 1.00 11.38 N

ATOM 206 CA THR 37 -9.866 81.704 87.677 1.00 10.31 C

ATOM 207 C THR 37 -9.822 80.707 88.818 1.00 11.93 C

ATOM 208 O THR 37 -10.275 81.012 89.914 1.00 12.93 O

ATOM 209 CB THR 37 -11.171 81.473 86.869 1.00 7.72 C

ATOM 210 OG1 THR 37 -11.133 80.170 86.263 1.00 8.07 O

ATOM 211 CG2 THR 37 -11.345 82.546 85.792 1.00 5.67 C

ATOM 212 N ASN 38 -9.325 79.499 88.552 1.00 13.63 N

ATOM 213 CA ASN 38 -9.262 78.488 89.588 1.00 11.33 C

ATOM 214 C ASN 38 -8.104 77.534 89.361 1.00 10.27 C

ATOM 215 O ASN 38 -7.661 77.377 88.244 1.00 12.09 O

ATOM 216 CB ASN 38 -10.579 77.723 89.690 1.00 14.55 C

ATOM 217 CG ASN 38 -10.705 77.036 91.011 1.00 18.78 C

ATOM 218 OD1 ASN 38 -11.333 77.514 91.920 1.00 25.55 O

ATOM 219 ND2 ASN 38 -10.121 75.876 91.116 1.00 27.49 N

ATOM 220 N ALA 39 -7.627 76.903 90.428 1.00 8.87 N

ATOM 221 CA ALA 39 -6.505 75.977 90.377 1.00 10.10 C

ATOM 222 C ALA 39 -6.689 74.953 91.483 1.00 12.30 C

ATOM 223 O ALA 39 -7.548 75.109 92.340 1.00 14.57 O

ATOM 224 CB ALA 39 -5.189 76.726 90.590 1.00 6.40 C

ATOM 225 N THR 40 -5.904 73.887 91.466 1.00 13.05 N

ATOM 226 CA THR 40 -6.013 72.879 92.505 1.00 10.71 C

ATOM 227 C THR 40 -4.617 72.394 92.846 1.00 11.57 C

ATOM 228 O THR 40 -3.736 72.351 91.987 1.00 13.18 O

ATOM 229 CB THR 40 -6.972 71.731 92.108 1.00 12.00 C

ATOM 230 OG1 THR 40 -7.134 70.844 93.213 1.00 18.49 O

ATOM 231 CG2 THR 40 -6.454 70.940 90.943 1.00 17.16 C

ATOM 232 N GLU 41 -4.394 72.183 94.134 1.00 9.95 N

ATOM 233 CA GLU 41 -3.112 71.747 94.658 1.00 11.51 C

ATOM 234 C GLU 41 -2.846 70.257 94.405 1.00 12.88 C

ATOM 235 O GLU 41 -3.712 69.404 94.674 1.00 17.08 O

ATOM 236 CB GLU 41 -3.060 72.060 96.154 1.00 9.18 C

ATOM 237 CG GLU 41 -1.765 71.698 96.834 1.00 10.08 C

ATOM 238 CD GLU 41 -0.588 72.445 96.270 1.00 13.05 C

ATOM 239 OE1 GLU 41 -0.595 73.690 96.392 1.00 18.54 O

ATOM 240 OE2 GLU 41 0.318 71.812 95.677 1.00 13.57 O

ATOM 241 N LEU 42 -1.652 69.950 93.898 1.00 9.69 N

ATOM 242 CA LEU 42 -1.294 68.584 93.604 1.00 7.62 C

ATOM 243 C LEU 42 -0.325 67.968 94.594 1.00 8.70 C

ATOM 244 O LEU 42 -0.151 66.749 94.609 1.00 11.30 O

ATOM 245 CB LEU 42 -0.714 68.471 92.194 1.00 8.30 C

ATOM 246 CG LEU 42 -1.594 68.793 90.984 1.00 8.46 C

ATOM 247 CD1 LEU 42 -0.855 68.410 89.713 1.00 7.53 C

ATOM 248 CD2 LEU 42 -2.921 68.057 91.059 1.00 7.50 C

ATOM 249 N VAL 43 0.310 68.800 95.412 1.00 10.16 N

ATOM 250 CA VAL 43 1.277 68.328 96.396 1.00 8.47 C

ATOM 251 C VAL 43 0.707 68.287 97.801 1.00 12.12 C

ATOM 252 O VAL 43 0.314 69.316 98.331 1.00 13.85 O

ATOM 253 CB VAL 43 2.530 69.227 96.454 1.00 6.04 C

ATOM 254 CG1 VAL 43 3.444 68.773 97.561 1.00 4.59 C

ATOM 255 CG2 VAL 43 3.266 69.219 95.149 1.00 3.81 C

ATOM 256 N GLN 44 0.654 67.095 98.395 1.00 12.11 N

ATOM 257 CA GLN 44 0.184 66.921 99.767 1.00 11.39 C

ATOM 258 C GLN 44 1.345 67.338 100.674 1.00 14.23 C

ATOM 259 O GLN 44 2.414 66.720 100.640 1.00 18.47 O

ATOM 260 CB GLN 44 -0.159 65.452 100.021 1.00 12.84 C

ATOM 261 CG GLN 44 -0.658 65.171 101.415 1.00 12.53 C

ATOM 262 CD GLN 44 -1.973 65.878 101.710 1.00 14.94 C

ATOM 263 OE1 GLN 44 -2.809 66.068 100.830 1.00 18.22 O

ATOM 264 NE2 GLN 44 -2.159 66.273 102.950 1.00 17.85 N

ATOM 265 N SER 45 1.166 68.369 101.488 1.00 15.68 N

ATOM 266 CA SER 45 2.263 68.808 102.342 1.00 17.49 C

ATOM 267 C SER 45 2.022 68.834 103.853 1.00 18.77 C

ATOM 268 O SER 45 2.841 69.355 104.605 1.00 20.60 O

ATOM 269 CB SER 45 2.801 70.157 101.858 1.00 17.87 C

ATOM 270 OG SER 45 1.771 71.129 101.846 1.00 19.97 O

ATOM 271 N SER 46 0.909 68.272 104.300 1.00 20.88 N

ATOM 272 CA SER 46 0.624 68.216 105.727 1.00 24.90 C

ATOM 273 C SER 46 0.278 66.792 106.158 1.00 25.89 C

ATOM 274 O SER 46 -0.204 65.979 105.362 1.00 22.45 O

ATOM 275 CB SER 46 -0.531 69.152 106.093 1.00 27.20 C

ATOM 276 OG SER 46 -1.743 68.769 105.442 1.00 31.95 O

ATOM 277 N SER 47 0.501 66.511 107.433 1.00 28.61 N

ATOM 278 CA SER 47 0.198 65.204 107.976 1.00 30.19 C

ATOM 279 C SER 47 -0.703 65.396 109.172 1.00 33.21 C

ATOM 280 O SER 47 -0.797 66.498 109.726 1.00 32.75 O

ATOM 281 CB SER 47 1.475 64.503 108.415 1.00 28.46 C

ATOM 282 OG SER 47 1.163 63.308 109.103 1.00 29.14 O

ATOM 283 N THR 48 -1.393 64.329 109.550 1.00 36.64 N

ATOM 284 CA THR 48 -2.269 64.371 110.714 1.00 37.39 C

ATOM 285 C THR 48 -1.428 64.328 111.987 1.00 33.31 C

ATOM 286 O THR 48 -1.828 64.865 113.017 1.00 37.43 O

ATOM 287 CB THR 48 -3.277 63.190 110.690 1.00 40.61 C

ATOM 288 OG1 THR 48 -2.579 61.934 110.629 1.00 42.64 O

ATOM 289 CG2 THR 48 -4.162 63.301 109.452 1.00 44.70 C

ATOM 290 N GLY 49 -0.250 63.714 111.897 1.00 28.62 N

ATOM 291 CA GLY 49 0.640 63.606 113.041 1.00 22.75 C

ATOM 292 C GLY 49 0.413 62.322 113.803 1.00 19.67 C

ATOM 293 O GLY 49 1.095 62.046 114.784 1.00 20.57 O

ATOM 294 N LYS 50 -0.542 61.536 113.318 1.00 18.46 N

ATOM 295 CA LYS 50 -0.919 60.272 113.913 1.00 19.62 C

ATOM 296 C LYS 50 -1.001 59.278 112.811 1.00 16.92 C

ATOM 297 O LYS 50 -1.297 59.639 111.685 1.00 15.19 O

ATOM 298 CB LYS 50 -2.286 60.365 114.569 1.00 24.36 C

ATOM 299 CG LYS 50 -2.280 61.221 115.795 1.00 32.92 C

ATOM 300 CD LYS 50 -3.659 61.783 116.069 1.00 39.66 C

ATOM 301 CE LYS 50 -3.576 62.881 117.131 1.00 44.67 C

ATOM 302 NZ LYS 50 -2.714 64.056 116.686 1.00 48.88 N

ATOM 303 N ILE 51 -0.779 58.021 113.173 1.00 15.91 N

ATOM 304 CA ILE 51 -0.791 56.887 112.278 1.00 13.89 C

ATOM 305 C ILE 51 -2.116 56.157 112.483 1.00 17.72 C

ATOM 306 O ILE 51 -2.432 55.727 113.589 1.00 18.02 O

ATOM 307 CB ILE 51 0.386 55.939 112.624 1.00 12.07 C

ATOM 308 CG1 ILE 51 1.722 56.611 112.321 1.00 12.41 C

ATOM 309 CG2 ILE 51 0.263 54.629 111.878 1.00 10.33 C

ATOM 310 CD1 ILE 51 2.926 55.756 112.678 1.00 13.03 C

ATOM 311 N CYS 52 -2.893 56.017 111.419 1.00 20.39 N

ATOM 312 CA CYS 52 -4.170 55.332 111.511 1.00 22.64 C

ATOM 313 C CYS 52 -4.013 53.824 111.542 1.00 24.98 C

ATOM 314 O CYS 52 -3.357 53.237 110.668 1.00 27.03 O

ATOM 315 CB CYS 52 -5.070 55.728 110.352 1.00 22.38 C

ATOM 316 SG CYS 52 -5.493 57.483 110.406 1.00 22.84 S

ATOM 317 N ASN 53 -4.660 53.198 112.521 1.00 25.67 N

ATOM 318 CA ASN 53 -4.595 51.758 112.679 1.00 28.06 C

ATOM 319 C ASN 53 -5.341 50.984 111.612 1.00 28.57 C

ATOM 320 O ASN 53 -5.343 49.747 111.624 1.00 29.65 O

ATOM 321 CB ASN 53 -5.064 51.340 114.069 1.00 34.04 C

ATOM 322 CG ASN 53 -6.446 51.883 114.428 1.00 39.14 C

ATOM 323 OD1 ASN 53 -7.166 52.462 113.587 1.00 42.04 O

ATOM 324 ND2 ASN 53 -6.816 51.715 115.695 1.00 40.57 N

ATOM 325 N ASN 54 -5.991 51.721 110.712 1.00 29.20 N

ATOM 326 CA ASN 54 -6.707 51.129 109.583 1.00 29.29 C

ATOM 327 C ASN 54 -6.241 51.779 108.302 1.00 26.96 C

ATOM 328 O ASN 54 -6.024 52.988 108.265 1.00 28.44 O

ATOM 329 CB ASN 54 -8.212 51.320 109.723 1.00 35.76 C

ATOM 330 CG ASN 54 -8.885 50.120 110.323 1.00 40.37 C

ATOM 331 OD1 ASN 54 -9.130 50.064 111.529 1.00 43.04 O

ATOM 332 ND2 ASN 54 -9.168 49.128 109.483 1.00 43.25 N

ATOM 333 N PRO 55 -6.150 51.005 107.207 1.00 22.58 N

ATOM 334 CA PRO 55 -6.467 49.582 107.092 1.00 21.79 C

ATOM 335 C PRO 55 -5.324 48.582 107.330 1.00 22.14 C

ATOM 336 O PRO 55 -5.552 47.369 107.300 1.00 24.75 O

ATOM 337 CB PRO 55 -6.984 49.472 105.659 1.00 21.19 C

ATOM 338 CG PRO 55 -6.383 50.688 104.923 1.00 22.15 C

ATOM 339 CD PRO 55 -5.664 51.537 105.923 1.00 21.87 C

ATOM 340 N HIS 56 -4.103 49.055 107.543 1.00 21.55 N

ATOM 341 CA HIS 56 -2.979 48.145 107.713 1.00 18.52 C

ATOM 342 C HIS 56 -2.859 47.708 109.160 1.00 17.50 C

ATOM 343 O HIS 56 -3.150 48.486 110.068 1.00 17.92 O

ATOM 344 CB HIS 56 -1.675 48.835 107.242 1.00 17.60 C

ATOM 345 CG HIS 56 -1.783 49.486 105.889 1.00 16.13 C

ATOM 346 ND1 HIS 56 -1.768 48.745 104.727 1.00 16.95 N

ATOM 347 CD2 HIS 56 -1.955 50.795 105.581 1.00 15.33 C

ATOM 348 CE1 HIS 56 -1.928 49.612 103.746 1.00 12.46 C

ATOM 349 NE2 HIS 56 -2.044 50.858 104.209 1.00 15.78 N

ATOM 350 N ARG 57 -2.482 46.453 109.378 1.00 16.91 N

ATOM 351 CA ARG 57 -2.272 45.941 110.727 1.00 17.66 C

ATOM 352 C ARG 57 -0.976 46.563 111.253 1.00 16.88 C

ATOM 353 O ARG 57 0.128 46.165 110.870 1.00 17.98 O

ATOM 354 CB ARG 57 -2.146 44.410 110.748 1.00 21.61 C

ATOM 355 CG ARG 57 -2.057 43.861 112.183 1.00 31.80 C

ATOM 356 CD ARG 57 -1.841 42.340 112.291 1.00 39.13 C

ATOM 357 NE ARG 57 -2.045 41.805 113.652 1.00 47.25 N

ATOM 358 CZ ARG 57 -1.342 42.126 114.756 1.00 50.56 C

ATOM 359 NH1 ARG 57 -0.350 43.014 114.721 1.00 51.19 N

ATOM 360 NH2 ARG 57 -1.597 41.504 115.910 1.00 52.25 N

ATOM 361 N ILE 58 -1.125 47.581 112.084 1.00 17.36 N

ATOM 362 CA ILE 58 0.002 48.283 112.677 1.00 16.14 C

ATOM 363 C ILE 58 0.421 47.602 113.976 1.00 18.69 C

ATOM 364 O ILE 58 -0.426 47.210 114.775 1.00 25.79 O

ATOM 365 CB ILE 58 -0.375 49.749 113.016 1.00 16.08 C

ATOM 366 CG1 ILE 58 -0.966 50.453 111.792 1.00 13.26 C

ATOM 367 CG2 ILE 58 0.832 50.503 113.522 1.00 17.44 C

ATOM 368 CD1 ILE 58 -0.098 50.403 110.560 1.00 13.84 C

ATOM 369 N LEU 59 1.715 47.387 114.160 1.00 18.86 N

ATOM 370 CA LEU 59 2.197 46.792 115.392 1.00 15.59 C

ATOM 371 C LEU 59 3.202 47.750 116.005 1.00 15.46 C

ATOM 372 O LEU 59 4.241 48.036 115.418 1.00 18.17 O

ATOM 373 CB LEU 59 2.843 45.419 115.159 1.00 14.37 C

ATOM 374 CG LEU 59 3.369 44.731 116.433 1.00 12.72 C

ATOM 375 CD1 LEU 59 2.249 44.553 117.433 1.00 12.09 C

ATOM 376 CD2 LEU 59 3.974 43.401 116.108 1.00 12.74 C

ATOM 377 N ASP 60 2.857 48.295 117.158 1.00 15.32 N

ATOM 378 CA ASP 60 3.735 49.208 117.868 1.00 15.10 C

ATOM 379 C ASP 60 4.775 48.374 118.640 1.00 15.30 C

ATOM 380 O ASP 60 4.419 47.482 119.417 1.00 16.20 O

ATOM 381 CB ASP 60 2.895 50.058 118.824 1.00 14.36 C

ATOM 382 CG ASP 60 3.671 51.207 119.446 1.00 16.13 C

ATOM 383 OD1 ASP 60 4.923 51.223 119.408 1.00 15.32 O

ATOM 384 OD2 ASP 60 3.001 52.124 119.965 1.00 17.51 O

ATOM 385 N GLY 61 6.052 48.628 118.370 1.00 15.53 N

ATOM 386 CA GLY 61 7.126 47.908 119.030 1.00 16.96 C

ATOM 387 C GLY 61 7.364 48.399 120.443 1.00 18.81 C

ATOM 388 O GLY 61 8.013 47.721 121.248 1.00 20.76 O

ATOM 389 N ILE 62 6.803 49.566 120.746 1.00 19.44 N

ATOM 390 CA ILE 62 6.909 50.215 122.056 1.00 20.00 C

ATOM 391 C ILE 62 8.365 50.465 122.425 1.00 18.96 C

ATOM 392 O ILE 62 8.964 51.395 121.900 1.00 21.69 O

ATOM 393 CB ILE 62 6.158 49.432 123.185 1.00 23.28 C

ATOM 394 CG1 ILE 62 4.718 49.127 122.759 1.00 23.75 C

ATOM 395 CG2 ILE 62 6.064 50.292 124.443 1.00 22.96 C

ATOM 396 CD1 ILE 62 4.078 48.034 123.573 1.00 26.74 C

ATOM 397 N ASP 63 8.933 49.682 123.338 1.00 18.05 N

ATOM 398 CA ASP 63 10.331 49.873 123.713 1.00 19.19 C

ATOM 399 C ASP 63 11.173 48.691 123.299 1.00 16.89 C

ATOM 400 O ASP 63 12.242 48.435 123.861 1.00 15.21 O

ATOM 401 CB ASP 63 10.519 50.202 125.209 1.00 24.94 C

ATOM 402 CG ASP 63 9.668 49.337 126.137 1.00 28.19 C

ATOM 403 OD1 ASP 63 9.309 48.188 125.777 1.00 31.85 O

ATOM 404 OD2 ASP 63 9.353 49.830 127.241 1.00 30.45 O

ATOM 405 N CYS 64 10.686 48.007 122.270 1.00 12.66 N

ATOM 406 CA CYS 64 11.353 46.870 121.688 1.00 10.90 C

ATOM 407 C CYS 64 11.605 47.105 120.222 1.00 12.08 C

ATOM 408 O CYS 64 10.739 47.639 119.517 1.00 16.14 O

ATOM 409 CB CYS 64 10.466 45.656 121.776 1.00 9.91 C

ATOM 410 SG CYS 64 10.335 45.076 123.466 1.00 11.20 S

ATOM 411 N THR 65 12.807 46.755 119.776 1.00 11.48 N

ATOM 412 CA THR 65 13.155 46.826 118.359 1.00 9.71 C

ATOM 413 C THR 65 12.622 45.494 117.789 1.00 10.00 C

ATOM 414 O THR 65 12.308 44.583 118.556 1.00 15.28 O

ATOM 415 CB THR 65 14.697 46.868 118.143 1.00 6.85 C

ATOM 416 OG1 THR 65 15.322 45.800 118.863 1.00 9.93 O

ATOM 417 CG2 THR 65 15.277 48.166 118.591 1.00 3.77 C

ATOM 418 N LEU 66 12.450 45.375 116.480 1.00 8.25 N

ATOM 419 CA LEU 66 12.008 44.104 115.928 1.00 7.52 C

ATOM 420 C LEU 66 13.003 42.973 116.289 1.00 7.54 C

ATOM 421 O LEU 66 12.586 41.858 116.575 1.00 11.33 O

ATOM 422 CB LEU 66 11.839 44.211 114.411 1.00 9.78 C

ATOM 423 CG LEU 66 11.594 42.944 113.577 1.00 9.03 C

ATOM 424 CD1 LEU 66 10.375 42.191 114.068 1.00 8.35 C

ATOM 425 CD2 LEU 66 11.401 43.335 112.135 1.00 8.66 C

ATOM 426 N ILE 67 14.305 43.256 116.305 1.00 7.74 N

ATOM 427 CA ILE 67 15.307 42.241 116.642 1.00 7.06 C

ATOM 428 C ILE 67 15.211 41.738 118.098 1.00 9.71 C

ATOM 429 O ILE 67 15.331 40.539 118.349 1.00 12.03 O

ATOM 430 CB ILE 67 16.761 42.716 116.304 1.00 7.61 C

ATOM 431 CG1 ILE 67 16.907 42.962 114.794 1.00 6.08 C

ATOM 432 CG2 ILE 67 17.798 41.683 116.751 1.00 5.65 C

ATOM 433 CD1 ILE 67 16.421 41.848 113.908 1.00 5.09 C

ATOM 434 N ASP 68 14.994 42.635 119.057 1.00 10.41 N

ATOM 435 CA ASP 68 14.866 42.224 120.452 1.00 9.51 C

ATOM 436 C ASP 68 13.607 41.392 120.678 1.00 10.31 C

ATOM 437 O ASP 68 13.610 40.475 121.502 1.00 14.74 O

ATOM 438 CB ASP 68 14.875 43.420 121.389 1.00 11.56 C

ATOM 439 CG ASP 68 16.264 43.976 121.606 1.00 13.91 C

ATOM 440 OD1 ASP 68 17.235 43.198 121.459 1.00 14.87 O

ATOM 441 OD2 ASP 68 16.387 45.195 121.899 1.00 17.28 O

ATOM 442 N ALA 69 12.541 41.704 119.942 1.00 8.26 N

ATOM 443 CA ALA 69 11.282 40.965 120.016 1.00 7.07 C

ATOM 444 C ALA 69 11.471 39.588 119.372 1.00 8.01 C

ATOM 445 O ALA 69 10.823 38.621 119.753 1.00 14.19 O

ATOM 446 CB ALA 69 10.182 41.733 119.300 1.00 6.36 C

ATOM 447 N LEU 70 12.339 39.522 118.367 1.00 8.60 N

ATOM 448 CA LEU 70 12.655 38.276 117.681 1.00 6.94 C

ATOM 449 C LEU 70 13.523 37.377 118.571 1.00 8.95 C

ATOM 450 O LEU 70 13.278 36.184 118.666 1.00 13.99 O

ATOM 451 CB LEU 70 13.400 38.566 116.385 1.00 6.58 C

ATOM 452 CG LEU 70 14.032 37.401 115.612 1.00 6.99 C

ATOM 453 CD1 LEU 70 13.031 36.863 114.644 1.00 2.00 C

ATOM 454 CD2 LEU 70 15.272 37.862 114.890 1.00 3.16 C

ATOM 455 N LEU 71 14.560 37.932 119.193 1.00 8.62 N

ATOM 456 CA LEU 71 15.434 37.137 120.055 1.00 7.99 C

ATOM 457 C LEU 71 14.742 36.703 121.350 1.00 9.34 C

ATOM 458 O LEU 71 15.022 35.625 121.889 1.00 11.06 O

ATOM 459 CB LEU 71 16.713 37.902 120.385 1.00 7.84 C

ATOM 460 CG LEU 71 17.576 38.259 119.179 1.00 6.85 C

ATOM 461 CD1 LEU 71 18.757 39.072 119.623 1.00 6.69 C

ATOM 462 CD2 LEU 71 18.028 37.006 118.475 1.00 4.91 C

ATOM 463 N GLY 72 13.856 37.554 121.852 1.00 8.24 N

ATOM 464 CA GLY 72 13.134 37.236 123.056 1.00 4.20 C

ATOM 465 C GLY 72 13.703 37.913 124.267 1.00 9.35 C

ATOM 466 O GLY 72 13.886 37.271 125.300 1.00 13.34 O

ATOM 467 N ASP 73 14.094 39.172 124.113 1.00 11.32 N

ATOM 468 CA ASP 73 14.612 39.971 125.220 1.00 10.18 C

ATOM 469 C ASP 73 13.427 39.944 126.196 1.00 13.04 C

ATOM 470 O ASP 73 12.316 40.287 125.800 1.00 15.77 O

ATOM 471 CB ASP 73 14.847 41.388 124.703 1.00 8.82 C

ATOM 472 CG ASP 73 15.320 42.340 125.768 1.00 9.52 C

ATOM 473 OD1 ASP 73 14.804 42.350 126.902 1.00 15.85 O

ATOM 474 OD2 ASP 73 16.202 43.142 125.460 1.00 15.99 O

ATOM 475 N PRO 74 13.635 39.526 127.466 1.00 14.37 N

ATOM 476 CA PRO 74 12.522 39.473 128.426 1.00 12.63 C

ATOM 477 C PRO 74 11.437 40.571 128.357 1.00 13.86 C

ATOM 478 O PRO 74 10.247 40.255 128.422 1.00 15.34 O

ATOM 479 CB PRO 74 13.241 39.422 129.774 1.00 10.96 C

ATOM 480 CG PRO 74 14.440 38.613 129.468 1.00 10.36 C

ATOM 481 CD PRO 74 14.906 39.181 128.137 1.00 12.18 C

ATOM 482 N HIS 75 11.815 41.839 128.184 1.00 13.77 N

ATOM 483 CA HIS 75 10.798 42.901 128.108 1.00 14.05 C

ATOM 484 C HIS 75 9.998 42.964 126.797 1.00 15.14 C

ATOM 485 O HIS 75 9.022 43.716 126.683 1.00 16.06 O

ATOM 486 CB HIS 75 11.363 44.287 128.508 1.00 14.49 C

ATOM 487 CG HIS 75 12.088 45.028 127.420 1.00 15.72 C

ATOM 488 ND1 HIS 75 13.366 44.699 127.008 1.00 16.17 N

ATOM 489 CD2 HIS 75 11.755 46.142 126.731 1.00 14.34 C

ATOM 490 CE1 HIS 75 13.785 45.577 126.119 1.00 16.35 C

ATOM 491 NE2 HIS 75 12.824 46.468 125.931 1.00 15.88 N

ATOM 492 N CYS 76 10.370 42.116 125.843 1.00 14.34 N

ATOM 493 CA CYS 76 9.708 42.052 124.550 1.00 12.58 C

ATOM 494 C CYS 76 8.935 40.764 124.410 1.00 13.66 C

ATOM 495 O CYS 76 8.458 40.442 123.327 1.00 15.62 O

ATOM 496 CB CYS 76 10.736 42.143 123.437 1.00 10.76 C

ATOM 497 SG CYS 76 11.736 43.647 123.608 1.00 8.33 S

ATOM 498 N ASP 77 8.786 40.044 125.515 1.00 15.60 N

ATOM 499 CA ASP 77 8.071 38.772 125.527 1.00 18.52 C

ATOM 500 C ASP 77 6.641 38.895 125.043 1.00 15.92 C

ATOM 501 O ASP 77 6.056 37.934 124.548 1.00 16.25 O

ATOM 502 CB ASP 77 8.095 38.147 126.932 1.00 22.74 C

ATOM 503 CG ASP 77 9.331 37.275 127.178 1.00 25.55 C

ATOM 504 OD1 ASP 77 9.981 36.806 126.199 1.00 25.96 O

ATOM 505 OD2 ASP 77 9.639 37.041 128.367 1.00 29.00 O

ATOM 506 N VAL 78 6.091 40.092 125.167 1.00 16.28 N

ATOM 507 CA VAL 78 4.721 40.364 124.743 1.00 17.71 C

ATOM 508 C VAL 78 4.501 40.245 123.232 1.00 17.13 C

ATOM 509 O VAL 78 3.381 40.014 122.799 1.00 22.17 O

ATOM 510 CB VAL 78 4.234 41.770 125.252 1.00 18.39 C

ATOM 511 CG1 VAL 78 5.089 42.900 124.668 1.00 18.08 C

ATOM 512 CG2 VAL 78 2.762 41.984 124.918 1.00 19.95 C

ATOM 513 N PHE 79 5.573 40.332 122.446 1.00 17.04 N

ATOM 514 CA PHE 79 5.472 40.242 120.998 1.00 13.60 C

ATOM 515 C PHE 79 5.606 38.845 120.459 1.00 14.61 C

ATOM 516 O PHE 79 5.604 38.636 119.248 1.00 18.88 O

ATOM 517 CB PHE 79 6.510 41.140 120.344 1.00 12.68 C

ATOM 518 CG PHE 79 6.337 42.577 120.681 1.00 14.04 C

ATOM 519 CD1 PHE 79 5.311 43.313 120.107 1.00 15.21 C

ATOM 520 CD2 PHE 79 7.162 43.188 121.617 1.00 13.90 C

ATOM 521 CE1 PHE 79 5.108 44.637 120.466 1.00 16.22 C

ATOM 522 CE2 PHE 79 6.971 44.510 121.983 1.00 13.35 C

ATOM 523 CZ PHE 79 5.944 45.237 121.410 1.00 17.06 C

ATOM 524 N GLN 80 5.713 37.872 121.342 1.00 16.26 N

ATOM 525 CA GLN 80 5.863 36.509 120.892 1.00 18.65 C

ATOM 526 C GLN 80 4.748 36.133 119.968 1.00 20.29 C

ATOM 527 O GLN 80 3.582 36.333 120.293 1.00 23.21 O

ATOM 528 CB GLN 80 5.860 35.546 122.059 1.00 21.01 C

ATOM 529 CG GLN 80 7.150 35.544 122.807 1.00 24.76 C

ATOM 530 CD GLN 80 7.210 34.443 123.829 1.00 27.90 C

ATOM 531 OE1 GLN 80 6.371 33.524 123.829 1.00 29.59 O

ATOM 532 NE2 GLN 80 8.221 34.496 124.690 1.00 31.72 N

ATOM 533 N ASN 81 5.129 35.650 118.789 1.00 21.13 N
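The ATOM records above list, in order, the atom serial number, atom name, residue name, residue number, x/y/z coordinates, occupancy, temperature factor (B-factor) and element. The sketch below reads records in exactly this whitespace-separated layout and averages the B-factor per residue. It is an illustration under stated assumptions: the helper names are hypothetical, and the parser relies on the 11 space-separated fields shown in this excerpt (standard PDB files are fixed-column and may carry a chain identifier, which this sketch does not handle).

```python
from collections import defaultdict


def parse_atom_line(line):
    """Parse one whitespace-separated ATOM record with exactly 11 fields."""
    fields = line.split()
    if len(fields) != 11 or fields[0] != "ATOM":
        raise ValueError("unexpected record: " + line)
    return {
        "serial": int(fields[1]),
        "atom": fields[2],
        "residue": fields[3],
        "resseq": int(fields[4]),
        "xyz": (float(fields[5]), float(fields[6]), float(fields[7])),
        "occupancy": float(fields[8]),
        "bfactor": float(fields[9]),
        "element": fields[10],
    }


def mean_bfactor_by_residue(lines):
    """Average the B-factor over all atoms of each (residue name, number)."""
    values = defaultdict(list)
    for line in lines:
        atom = parse_atom_line(line)
        values[(atom["residue"], atom["resseq"])].append(atom["bfactor"])
    return {key: sum(v) / len(v) for key, v in values.items()}


# Three records copied from the excerpt above.
records = [
    "ATOM 156 N ASP 31 5.327 81.123 81.387 1.00 21.51 N",
    "ATOM 157 CA ASP 31 5.105 81.984 82.553 1.00 22.00 C",
    "ATOM 164 N ASP 32 6.041 82.056 84.756 1.00 27.54 N",
]
print(mean_bfactor_by_residue(records))
```

High average B-factors flag flexible or poorly resolved residues in the template, which is worth knowing before the model is used for docking.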

STRUCTURE VALIDATION

RAMACHANDRAN PLOT

A Ramachandran plot (also known as a Ramachandran map, a Ramachandran diagram or a [φ,ψ] plot), developed by Gopalasamudram Narayana Ramachandran and Viswanathan Sasisekharan, is a way to visualize the backbone dihedral angles ψ against φ of amino acid residues in a protein structure. It shows the possible conformations of the ψ and φ angles for a polypeptide. Mathematically, the Ramachandran plot is the visualization of a function f : [-π, π) × [-π, π) → ℝ+. The domain of this function is the torus. Hence, the conventional Ramachandran plot is a projection of the torus onto the plane, resulting in a distorted view and the presence of discontinuities.
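Each point on a Ramachandran plot is a pair of dihedral angles computed from four consecutive backbone atoms: φ from C(i-1), N, Cα, C and ψ from N, Cα, C, N(i+1). The following is a minimal sketch of the dihedral calculation using the standard atan2 formula. The function names are hypothetical, the sign convention should be checked against the validation tool in use, and the example points are an artificial geometry rather than atoms from the model.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points, in (-180, 180]."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Artificial test geometry: cis (0 degrees) and trans (180 degrees) arrangements.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(cis, trans)
```

Because the angle wraps around at ±180°, the (φ, ψ) pairs live on a torus, which is why the flat plot shows discontinuities at its edges.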


Chapter 5

HOMOLOGY MODELLING STEPS

1) homology target

2) target template alignment

3) backbone matching

4) side chain replacement

5) grouping

6) energy satisfaction

7) final model generation

RAMACHANDRAN PLOT FOR OTHER MODELS

Table: Ramachandran plot statistics and DOPE scores of the models

Model   Core (%)   Allowed (%)   DOPE score
1       86.1       1.2           -39286.523438
2       87.9       0.9           -39286.523438
3       86.7       0.9           -39011.328125
4       86.4       0.9           -39214.257812
5       87.9       0.6           39214.257812

Our best model is therefore plot no. 2, because the DOPE score should be as low as possible and the fraction of residues in the favoured (core) region should be as high as possible.
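The selection rule stated with the table (lowest energy score first, then the largest core region) can be written down explicitly. This is a sketch under assumptions: the rows are numbered 1 to 5 in the order they appear in the table, the values are copied as printed, and `pick_best` is a hypothetical helper rather than part of Modeller or PROCHECK.

```python
# (model number, core %, allowed %, energy score) as printed in the table.
models = [
    {"model": 1, "core": 86.1, "allowed": 1.2, "score": -39286.523438},
    {"model": 2, "core": 87.9, "allowed": 0.9, "score": -39286.523438},
    {"model": 3, "core": 86.7, "allowed": 0.9, "score": -39011.328125},
    {"model": 4, "core": 86.4, "allowed": 0.9, "score": -39214.257812},
    {"model": 5, "core": 87.9, "allowed": 0.6, "score": 39214.257812},
]


def pick_best(candidates):
    """Prefer the lowest energy score; break ties by the highest core percentage."""
    return min(candidates, key=lambda m: (m["score"], -m["core"]))


best = pick_best(models)
print(best["model"], best["core"], best["score"])
```

Models 1 and 2 share the lowest score, so the tie-break on the core region is what singles out model 2.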

The best Ramachandran plot taken for Modelled H1N1 complex

CHAPTER 6

LEAD IDENTIFICATION

An important step in this work was to identify a molecule of interest for docking: the ligand. Each protein active site has a unique size and shape, which determines the complexity, size and shape of a binding ligand. Because the size and shape of an active site are biologically determined, the substrate dictates the size, shape and chemical make-up of a target site. This relationship determines the complementary size, shape and complexity of possible ligands.5 For this research, twenty-five ligands were created based upon analysis of two FDA-approved influenza medications, Relenza (zanamivir) and Tamiflu (oseltamivir).

Figure: Zanamivir

Figure: Oseltamivir

Similarities in these two molecules were identified. They include: a six-membered ring, possibly containing one double bond or a heteroatom; a carbonyl or carboxyl group adjacent to the double bond or heteroatom; an amino group in the 3-position on the ring relative to the carbonyl or carboxyl group position; and an amide group in the 4-position on the ring and in an anti-configuration relative to the amino group. The similarities were combined into two new molecules, which will be referred to as the basic structures. (See Figure 2.2) After finding the basic structure with the six-membered ring, a second basic structure was created using a five-membered ring, even though neither zanamivir nor oseltamivir contains a five-membered ring. The five-membered ring structure was created to investigate the dependence on ligand size.

DEVELOPMENT OF LEAD INTO ACTIVE SITE

HEX (protein-ligand docking)

- File → Open → Receptor → (best model).pdb

- File → Open → Ligand → lead.pdb

- Place the ligand in the pocket, i.e. the active site of the protein.

- Controls → Matching → Correlation type → Shape and skin → Activate → Dismiss.

- File → Save as → both.pdb

- View the both.pdb file in SPDBV to check that the ligand lies in the pocket.

- Repeat the Hex steps until the ligand lies in the pocket.

- Save this both.pdb as complex.pdb.

- Make ligand.pdb from complex.pdb.

- Convert the PDB format to mol2 format.

This gives us ligand.mol2.
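The step of making ligand.pdb from complex.pdb amounts to separating the ligand records from the protein records. The sketch below shows one way to do this, assuming the ligand atoms are written as HETATM records and the protein atoms as ATOM records; that depends on how the complex was saved and should be checked in the actual file. The function name, the residue name "LIG" and the example coordinates are all hypothetical.

```python
def split_complex(pdb_text):
    """Split PDB text into (protein_lines, ligand_lines) by record type."""
    protein, ligand = [], []
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # the record name occupies columns 1-6
        if record == "ATOM":
            protein.append(line)
        elif record == "HETATM":
            ligand.append(line)
    return protein, ligand


# Tiny made-up complex: two protein atoms and one ligand atom.
example = "\n".join([
    "ATOM      1  N   ASP A  31       5.327  81.123  81.387  1.00 21.51           N",
    "ATOM      2  CA  ASP A  31       5.105  81.984  82.553  1.00 22.00           C",
    "HETATM    3  C1  LIG B   1       4.000  80.000  80.000  1.00 20.00           C",
    "END",
])
protein_lines, ligand_lines = split_complex(example)
print(len(protein_lines), len(ligand_lines))
```

Writing `ligand_lines` to a file gives a ligand-only PDB that a format converter can then turn into mol2.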

LIGBUILDER

The lead molecule is then grown in the pocket with the help of LigBuilder.

To pocket.index, 2 input files are given:

complex.pdb

ligand.mol2 (SYBYL mol2 format)

5 output files are thus obtained which are:

1) Pocket_Atom.txt

2) Pocket_gridfile.txt

3) Pocket_keysite.txt

4) Pharmacophore.pdb

5) Pharmacophore.txt

To grow.index, 3 input files are given:

ligand.mol2

Pocket_gridfile.txt

Pocket_Atom.txt

2 output files are obtained:

Population record file (.lig)

Ligand collection file (.lig)

The ligand generation cycle consists of the following steps:

1) Parent selection

2) Elite selection

3) Growing by mutation

4) Filter application

5) Final ligand

To process.index, the ligand collection file (.lig) is given as the input file.

10 files, i.e. results.mdb, are obtained as output.

These mdb files are converted to PDB format.
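The ligand generation cycle listed above (parent selection, elite selection, growing by mutation, filter application, final ligand) follows the general pattern of a genetic algorithm. The toy sketch below illustrates only that pattern and is not LigBuilder's implementation: the fragment list, the scores, the size filter and all function names are made up for illustration.

```python
import random

# Hypothetical building blocks with made-up scores (higher is better).
FRAGMENT_SCORE = {"CH3": 1.0, "OH": 2.0, "NH2": 2.5, "COOH": 3.0, "C6H5": 1.5}
FRAGMENTS = sorted(FRAGMENT_SCORE)
MAX_FRAGMENTS = 6  # stand-in for a size filter


def score(ligand):
    return sum(FRAGMENT_SCORE[f] for f in ligand)


def passes_filter(ligand):
    return len(ligand) <= MAX_FRAGMENTS


def mutate(ligand, rng):
    """Grow by one fragment or replace an existing fragment."""
    child = list(ligand)
    if rng.random() < 0.5:
        child.append(rng.choice(FRAGMENTS))
    else:
        child[rng.randrange(len(child))] = rng.choice(FRAGMENTS)
    return child


def evolve(seed, generations=20, population_size=10, n_elite=2, rng_seed=0):
    rng = random.Random(rng_seed)
    population = [list(seed) for _ in range(population_size)]
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        elite = ranked[:n_elite]                  # elite selection
        parents = ranked[:population_size // 2]   # parent selection
        children = []
        while len(children) < population_size - n_elite:
            child = mutate(rng.choice(parents), rng)  # growing by mutation
            if passes_filter(child):                  # filter application
                children.append(child)
        population = elite + children
    return max(population, key=score)             # final ligand


best = evolve(["CH3"])
print(best, score(best))
```

Because the elite members are copied unchanged into every new generation, the best score can never get worse from one cycle to the next.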

CHAPTER: 7

DOCKING ANALYSIS

AutoDock Procedure and Theory

The second section of the research utilized AutoDock 4.1 (ref. 10), which was run on a Linux platform, Ubuntu. AutoDock 4.1 performs automated docking by simulation. It employs a more physically detailed docking technique that can incorporate flexible docking. AutoDock 4.1 gives good results when predicting rankings for a series of similar molecules.10,11 It contains a suite of automated docking tools whose purpose is to predict how small molecules, such as drug candidates, bind to a receptor of known three-dimensional structure. It consists of two main programs: AutoDock and AutoGrid. AutoGrid pre-calculates a set of grids.

AutoDock is responsible for the docking of the ligand to the protein. AutoDock 4.1 employs a graphical user interface, AutoDockTools. This allows the user to modify the ligand before a docking and to visually analyze dockings after completion. Ligand files in the .mol2 format were first opened in AutoDock for preparation. Once opened, charges were added and all non-polar hydrogen atoms were merged. Next, bonds within the ligand were set as rotatable or fixed. After the root atom of the ligand was detected and all torsions were selected and set, the file was saved as a .pdbqt file. The protein file was also prepared for docking. Protein files can be found online at the Protein Data Bank website.12 The PDB website contains an archive housing information on experimentally determined structures of proteins, nucleic acids and other complex assemblies.

Structures can be searched based upon sequence, structure or function. Each molecule can be viewed and downloaded for further analysis. Each structure has a unique four-character ID, which can be used to import the structure directly into AutoDock, or the structure file may be downloaded from the website and opened from the saved PDB file. In this project, the 2HTV protein was imported directly into AutoDock. Once the protein file was opened in AutoDock, the excess water molecules were isolated and deleted from the structure. All hydrogen atoms were added to the protein structure. These changes were then saved. The rigid and flexible residues of the protein were selected, and two additional files were created: file_rigid.pdbqt and file_flex.pdbqt.

A set of grid maps was constructed using the AutoGrid function. Both the protein and the appropriate ligand files were chosen for the mapping. A grid box was then used to select the area of the protein structure to be mapped. Ideally this grid box is located at the active site. In situations where the active site of a protein is unknown, it is possible for the grid box to encompass the entire protein, enabling blind docking. Because the active site of the 2HTV protein was unknown, a grid box covering the entire protein structure was implemented. The final step in submitting the docking is to run the AutoDock function. To prepare for this, the rigid protein and ligand files were selected. The Lamarckian genetic algorithm was set up, which controlled the number of scans, the number of mutations, and the number of conformations returned. The docking parameters were set, and a docking parameter file (.dpf) was created. Finally, AutoDock was launched, and the resulting docking conformations were returned in the docking log (.dlg) file.
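For blind docking, the grid box has to enclose every atom of the protein. The sketch below estimates the box centre and the number of grid intervals per axis from a list of atom coordinates. It is an illustration only: `grid_box` and its `padding` argument are hypothetical, the 0.375 Å spacing is the usual AutoGrid default, and the rounding up to an even number of intervals reflects AutoGrid's requirement for even values. Any size limits of the installed AutoGrid build still need to be checked separately.

```python
import math


def grid_box(coords, spacing=0.375, padding=0.0):
    """Return (center, npts) for a box enclosing all coordinates.

    coords  : iterable of (x, y, z) tuples in angstroms
    spacing : grid spacing in angstroms
    padding : extra margin added on every side, in angstroms
    """
    xs, ys, zs = zip(*coords)
    lows = [min(axis) - padding for axis in (xs, ys, zs)]
    highs = [max(axis) + padding for axis in (xs, ys, zs)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(lows, highs))
    npts = []
    for lo, hi in zip(lows, highs):
        n = math.ceil((hi - lo) / spacing)  # intervals needed to span the axis
        npts.append(n + (n % 2))            # round up to an even number
    return center, tuple(npts)


# Two made-up corner atoms of a small box.
center, npts = grid_box([(0.0, 0.0, 0.0), (3.0, 1.5, 0.75)])
print(center, npts)
```

For a whole protein the required number of points grows quickly, so a coarser spacing may be needed to keep the box within what AutoGrid accepts.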

After the docking completed, the docking log file was opened for viewing and the returned conformations were analyzed. The conformations are sorted by the software from best to worst, based upon their docked energy. Selecting Analyze Clusterings allows the user to view the ligand at its docked location, which ideally would be within the active site of the protein. Each resulting set of dockings can be viewed as spheres, to visualize where each of the dockings occurred. Isocontour maps can be created to display the interactions between oxygen atoms in the protein and the ligand. Hydrogen bonding interactions can also be modeled.

Preparing a Ligand for AutoDock

Ligand files are opened in AutoDock for preparation. The previously saved .mol2 files are compatible with AutoDock. Charges are added, and all non-polar hydrogens are merged. Next, bonds within the ligand are set as rotatable or fixed. After the root is detected and all torsions are selected and set, the file must be saved as a .pdbqt file type.

-Open the ligand: Ligand → Input → Open… Change the file type to .mol2 in the dropdown box. Select the ligand. Charges will be added and non-polar hydrogens merged if needed. Click OK.

-Detect the root: Ligand → Torsion Tree → Detect Root… A green sphere will appear (See Figure G.1).

-Choose torsions: Ligand → Torsion Tree → Choose Torsions… Allows bonds to be selected as rotatable or fixed. Click on purple bonds to activate. Rotatable bonds are green, fixed bonds are red. Click Done.

-Set torsions: Ligand → Torsion Tree → Set Number of Torsions… Choose rotations based upon moving the fewest atoms or most atoms, as well as the number of torsions allowed. Click Dismiss.

-Save .pdbqt: Ligand → Output → Save as PDBQT… Enter filename.pdbqt to save the ligand. Click Save.

Preparing a PDB file for AutoDock

Once a protein file is opened in AutoDock, the excess water molecules should be isolated and deleted from the structure. All hydrogen atoms should be added to the protein structure. These changes must be saved. The rigid and flexible residues of the protein are selected, and two additional files are created: file_rigid.pdbqt and file_flex.pdbqt.

-Open protein: If the file is saved on the computer, right-click PMV Molecules. Choose the protein file.pdb. Click Open. If the file will be downloaded from the PDB website, Read → Read Molecule From Web. Enter the PDB code. Click OK.

-Select water: Select → Select From String… In the Residue box, type HOH*. In the Atom box, type *. Click Add. Click Dismiss. (See Figure H.1)

-Delete water: Edit → Delete → Delete AtomSet… Click Continue.

-Add hydrogens: Edit → Hydrogens → Add… Choose All Hydrogens, method noBondOrder, and yes to renumber. Click OK.

-Resave: File → Save → Write PDB… Type filename.pdb. Write ATOM and HETATM records. Choose Sort Nodes. Click OK. (See Figure H.2)

-Choose macromolecule: Flexible Residues → Input → Choose Macromolecule… Select the protein. Click Select Molecule. Click Yes to merge non-polar hydrogens. Click OK.

-Select flexible residues: Select → Select From String… Click Clear Form. Enter the desired flexible residue in the Residue field and click Add. Click Dismiss.

-Choose torsions: Flexible Residues → Choose Torsions in Currently Selected Residues… Click on a desired bond to inactivate it. Click Close. Clear the selection using the Pencil Eraser icon.

-Save flexible file: Flexible Residues → Output → Save Flexible PDBQT… Enter filename_flex.pdbqt.

-Save rigid file: Flexible Residues → Output → Save Rigid PDBQT… Enter filename_rigid.pdbqt.

-Remove molecule: Edit → Delete → Delete Molecule… Select the protein and click Delete Molecule. Click Continue. Click Dismiss.
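The water-removal step above can also be sketched outside the graphical interface. The following is a minimal illustration only, assuming a standard fixed-column PDB file in which water molecules carry the residue name HOH; the function name strip_waters and the three example records are hypothetical and are not part of AutoDockTools.

```python
def strip_waters(pdb_lines):
    """Return the PDB lines with water (HOH) atom records removed."""
    kept = []
    for line in pdb_lines:
        record = line[:6].strip()
        # Columns 18-20 (1-based) hold the residue name in ATOM/HETATM records.
        resname = line[17:20].strip()
        if record in ("ATOM", "HETATM") and resname == "HOH":
            continue  # skip water atoms
        kept.append(line)
    return kept


# Hypothetical excerpt: one protein atom, one water oxygen, and the END record.
example = [
    "ATOM      1  CA  ALA A   1      11.104   6.134  -6.504  1.00  0.00           C",
    "HETATM    2  O   HOH A 201       2.000   3.000   4.000  1.00  0.00           O",
    "END",
]
cleaned = strip_waters(example)
```

Hydrogens, charges and the rigid/flexible split still have to be prepared as described in the steps above; this sketch only mirrors the Select water and Delete water steps.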

Figure H.1: Removing water from macromolecule

Figure H.2: Saving PDB file

Running AutoGrid

A grid map must be set up using AutoGrid. Both the protein and the appropriate ligand files are chosen. A grid box is used to select which area of the protein structure is to be mapped. Ideally, this grid box is positioned so that the active site lies within its boundaries. When the active site of a protein is unknown, the grid box can encompass the entire protein. A grid parameter file (.gpf) needs to be created. Then, AutoGrid can be run.

-Preparing macromolecule for grid: Grid → Macromolecule → Open… Choose filename_rigid.pdbqt. Click Open. Click Yes to preserve the input charges. Click OK to the warning boxes.

-Set map types: Grid → Set Map Types → Choose/Open Ligand… Choose/open the ligand file.

-Set grid box: Grid → Grid Box… Using the thumbwheels, position the grid box over the active site of the protein. If the active site is unknown, the grid box can encompass the entire protein. Record the grid settings to make this step easier in the future. Save the grid by clicking File → Close Saving Current. (See Figure I.1)

-Save GPF: Grid → Output → Save GPF… Save the grid parameter file as filename.gpf.

-Running AutoGrid: Run → Run AutoGrid… Verify the filenames and click Launch.

Figure I.1: Grid box
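When the grid box has to encompass the entire protein, its centre and size can be estimated from the atomic coordinates before the thumbwheels are adjusted. The sketch below is an illustration under stated assumptions: coordinates are read from the fixed columns of a PDB file, the grid spacing of 0.375 Å and the 5 Å padding are assumed values, and the helper names atom_line and blind_docking_box are hypothetical.

```python
import math


def atom_line(serial, name, resname, chain, resseq, x, y, z):
    """Build a fixed-column PDB ATOM record (helper for the example only)."""
    return (f"ATOM  {serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def blind_docking_box(pdb_lines, spacing=0.375, padding=5.0):
    """Return (centre, points per axis) of a box enclosing every atom."""
    coords = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            # x, y, z occupy columns 31-38, 39-46 and 47-54 (1-based).
            coords.append((float(line[30:38]),
                           float(line[38:46]),
                           float(line[46:54])))
    if not coords:
        raise ValueError("no atom records found")
    mins = [min(c[i] for c in coords) for i in range(3)]
    maxs = [max(c[i] for c in coords) for i in range(3)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    npts = []
    for lo, hi in zip(mins, maxs):
        n = math.ceil((hi - lo + 2 * padding) / spacing)
        npts.append(n + (n % 2))  # round up to an even number of points
    return center, tuple(npts)


# Two hypothetical atoms marking opposite corners of a small structure.
example = [
    atom_line(1, "CA", "ALA", "A", 1, 0.0, 0.0, 0.0),
    atom_line(2, "CA", "GLY", "A", 2, 10.0, 20.0, 30.0),
]
center, npts = blind_docking_box(example)
print(center, npts)
```

The resulting numbers are only a starting point; a large protein may need a coarser spacing to keep the number of grid points manageable.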

Running AutoDock

The final step in submitting the docking is to run AutoDock. To set up this procedure, the rigid protein and ligand files are selected. The genetic algorithm is set up, which controls the number of scans, the number of mutations, and the number of conformations returned. The docking parameters are set, and a docking parameter file (.dpf) is created. AutoDock is launched, and the resulting docking conformations are returned in the docking log (.dlg) file.

-Selecting rigid file: Docking → Macromolecule → Set Rigid Filename… Select filename_rigid.pdbqt. Click Open.

-Select ligand: Docking → Ligand → Choose… Click on the ligand. Click Select Ligand.

-Select flexible file: Docking → Macromolecule → Set Flexible Residue Filename… Select filename_flex.pdbqt. Click Open.

-Set algorithm parameters: Docking → Search Parameters → Genetic Algorithm… Do initial runs using the short setting. Click Accept.

-Set docking parameters: Docking → Docking Parameters… Use the defaults. Click Close.

-Create DPF: Docking → Output → Lamarckian GA… Enter filename.dpf. Click Save.

-Running AutoDock: Run → Run AutoDock… Verify the filenames and click Launch.
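To illustrate what a Lamarckian genetic algorithm does, the following toy sketch minimises a simple quadratic function instead of a docking energy. It is not AutoDock's implementation: the scoring function, the hill-climbing local search, and all parameter values (population size, mutation rate, number of generations) are assumptions chosen only to show the idea that locally optimised solutions are written back into the population before selection.

```python
import random


def energy(x):
    """Toy scoring function standing in for a docking energy (minimum at 0)."""
    return sum(v * v for v in x)


def local_search(x, rng, steps=20, step_size=0.1):
    """Hill climbing: keep small random moves that lower the energy."""
    best = list(x)
    for _ in range(steps):
        trial = [v + rng.uniform(-step_size, step_size) for v in best]
        if energy(trial) < energy(best):
            best = trial
    return best


def lamarckian_ga(n_genes=3, pop_size=20, generations=30,
                  mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(n_genes)]
           for _ in range(pop_size)]
    initial_best = min(energy(ind) for ind in pop)
    for _ in range(generations):
        # Lamarckian step: the locally optimised values replace the genotype.
        pop = [local_search(ind, rng) for ind in pop]
        pop.sort(key=energy)
        survivors = pop[: pop_size // 2]  # the better half always survives
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # crossover
            child = [v + rng.gauss(0, 0.5) if rng.random() < mutation_rate
                     else v for v in child]                   # mutation
            children.append(child)
        pop = survivors + children
    return initial_best, energy(min(pop, key=energy))


initial_energy, final_energy = lamarckian_ga()
print(initial_energy, final_energy)
```

Because the best individuals are always retained and the local search never accepts a worse move, the best energy in this sketch cannot increase from one generation to the next.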

Analyzing Docking Results

After the docking is completed, the docking log file is opened for viewing and the returned conformations are analyzed. By opening clusterings, the ligand is viewed at its docked location. Each resulting set of dockings can be viewed as spheres, to visualize where each of the dockings occurred. Isocontour maps can be created to display the interactions between the oxygen atoms in the protein and the ligand. Hydrogen bonding interactions can also be modeled.

-Open results: Analyze → Dockings → Open… Open the file.dlg

-Opening conformations: Analyze → Conformations → Load… Opens the Conformation Chooser, displaying energies and clusters of the results, ranked from best to worst based upon lowest energy in cluster and best individual per cluster. (See Figure K.1)

-Analyzing results: Analyze → Clusterings → Open… Associations between ligand and receptor can be viewed (See Figure K.2)

-Review all docking sites: Analyze → Dockings → Show as Spheres… Each docked conformation is shown as a sphere (See Figure K.3)

-Viewing hydrogen bonding: Analyze → Dockings → Show Interactions… Alters the display to show interactions between ligand and receptor

FINAL STEP OF DRUG DESIGNING

ADME TOXICO ANALYSIS:

The best docked file, that is, the one with the most favourable binding energy, is taken.

The best docked file was file no. 10, with a binding energy of -3.66 kJ/mol.

The docked file is then viewed in Swiss-PdbViewer (SPDBV) 4.0.4.

It is then drawn in Molsoft to check whether it obeys the rule of five for drugs.

The docked file in Swiss-PdbViewer 4.0.4

The structure is drawn in Molsoft and its molecular properties are calculated.

Fig : Molecular properties of the chosen docked file

Fig : Drug-likeness model score

Chapter 8

Result and Discussion:

Five modelled structures of the H1N1 virus generated by Modeller 9.10 contain 96% to 98% of residues in the core region of the Ramachandran plot, and the overall G-factor ranges from -0.09 to -0.04. Z-scores were within the expected range and the energy functions of the residues were at a minimum, as analyzed by ProSA.

Binding pocket determination of the modelled H1N1 virus structure by the Ligsite program revealed three potential binding sites, pkt-139, pkt-48 and pkt-26, of which pkt-48 was found to be the major cleft, with critical residues D13, T26 and G27. Ligand docking predicted the binding of the generated derivatives at the substrate binding cleft with negative interaction energy and efficient binding. Pharmacokinetic property analysis of the optimized lead molecules, performed with the Molsoft LLC program, predicted a minimal number of hydrogen bond acceptors and hydrogen bond donors and a low molecular weight. The partition coefficient CLogP and solubility CLogS were found to be minimal for the designed ligand. These observed properties suggest good absorption and easy transport of the molecule across the membrane, so that, according to the rule of five, the compound could possibly behave as a drug. The pharmacodynamic properties were calculated using the PASS program at Pa > Pi.

Docking    Model no.    ΔG bind (kJ/mol)
Lig-1      1            -2.01
Lig-2      2            -2.19
Lig-3      2            -2.19
Lig-4      8            -3.28
Lig-5      3            -3.17
Lig-6      2            -3.07
Lig-7      7            -3.21
Lig-8      1            -0.96
Lig-9      10           -3.69
Lig-10     10           -3.66
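The ranking of dockings by energy can be reproduced with a few lines of code. The sketch below simply sorts the values transcribed from the table above, with the most negative binding energy first; the variable names are illustrative.

```python
# (ligand, model no., binding energy) as listed in the results table.
results = [
    ("Lig-1", 1, -2.01), ("Lig-2", 2, -2.19), ("Lig-3", 2, -2.19),
    ("Lig-4", 8, -3.28), ("Lig-5", 3, -3.17), ("Lig-6", 2, -3.07),
    ("Lig-7", 7, -3.21), ("Lig-8", 1, -0.96), ("Lig-9", 10, -3.69),
    ("Lig-10", 10, -3.66),
]

# A more negative binding energy indicates more favourable binding.
ranked = sorted(results, key=lambda row: row[2])

for ligand, model, energy in ranked:
    print(f"{ligand:<7} model {model:<3} {energy:6.2f}")
```

Lig-9 and Lig-10 head the sorted list and differ by only 0.03 units, and Lig-8 binds least favourably.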

Molecular formula: C29 H58

Molecular weight: 406.45

Number of HBA: 0

Number of HBD: 0

MolLogP : 5.66

MolLogS : -8.14 (in Log(moles/L)) 21.96 (in mg/L)

MolPSA : 0.00 Å²

MolVol : 530.13 Å³

Number of stereo centers: 14

Project Title:

REVISITING LIPINSKI'S RULE: A DEVIATION STUDY FOCUSING ON ESTABLISHED HEPATIC LIGANDS

INTRODUCTION

Lipinski's rule of five

Lipinski's rule of five, also known as Pfizer's rule of five or simply the rule of five (RO5), is a rule of thumb to evaluate druglikeness, that is, to determine whether a chemical compound with a certain pharmacological or biological activity has properties that would make it a likely orally active drug in humans. The rule was formulated by Christopher A. Lipinski in 1997, based on the observation that most orally administered drugs are relatively small and moderately lipophilic molecules.

The rule describes molecular properties important for a drug's pharmacokinetics in the human body, including their absorption, distribution, metabolism, and excretion ("ADME"). However, the rule does not predict if a compound is pharmacologically active.

The rule is important to keep in mind during drug discovery, when a pharmacologically active lead structure is optimized step-wise to increase the activity and selectivity of the compound and to ensure that drug-like physicochemical properties are maintained, as described by Lipinski's rule. Candidate drugs that conform to the RO5 tend to have lower attrition rates during clinical trials and hence have an increased chance of reaching the market.

Components of the rule

Lipinski's rule states that, in general, an orally active drug has no more than one violation of the following criteria:

Not more than 5 hydrogen bond donors (nitrogen or oxygen atoms with one or more hydrogen atoms)

Not more than 10 hydrogen bond acceptors (nitrogen or oxygen atoms)

A molecular mass less than 500 daltons

An octanol-water partition coefficient log P not greater than 5

Note that all numbers are multiples of five, which is the origin of the rule's name. As with many other rules of thumb (such as Baldwin's rules for ring closure or Murphy's law), there are many exceptions to Lipinski's rule.
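The four criteria can be checked mechanically. The sketch below is a minimal illustration, not the procedure used by Molsoft or OSIRIS; the function name lipinski_violations is hypothetical, and the example values are the molecular properties reported for the designed ligand in the Result and Discussion chapter (molecular weight 406.45, MolLogP 5.66, no hydrogen bond donors or acceptors).

```python
def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """List which of the four rule-of-five criteria a compound breaks."""
    violations = []
    if h_donors > 5:
        violations.append("more than 5 H-bond donors")
    if h_acceptors > 10:
        violations.append("more than 10 H-bond acceptors")
    if mol_weight >= 500:  # the rule asks for a mass below 500 daltons
        violations.append("molecular mass of 500 Da or more")
    if log_p > 5:
        violations.append("log P greater than 5")
    return violations


# Properties reported for the designed ligand (C29H58).
designed = lipinski_violations(mol_weight=406.45, log_p=5.66,
                               h_donors=0, h_acceptors=0)
# An orally active drug is expected to show no more than one violation.
passes = len(designed) <= 1
print(designed, passes)
```

With these reported values the only criterion exceeded is log P, so the ligand stays within the single violation that the rule tolerates.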

Variants

In an attempt to improve the predictions of drug likeness, the rules have spawned many extensions, for example the following:

Partition coefficient log P in the −0.4 to +5.6 range

Molar refractivity from 40 to 130

Molecular weight from 180 to 500

Number of atoms from 20 to 70 (includes H-bond donors [e.g., OH's and NH's] and H-bond acceptors [e.g., N's and O's])

Polar surface area no greater than 140 Å²

The 500 molecular weight cutoff has also been questioned. Polar surface area and the number of rotatable bonds have been found to discriminate better between compounds that are orally active and those that are not, for a large data set of compounds in the rat. In particular, compounds which meet only the two criteria of:

10 or fewer rotatable bonds and

polar surface area equal to or less than 140 Å²

are predicted to have good oral bioavailability.
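These two criteria are simple enough to express directly in code. In the sketch below, the function name and the example inputs are hypothetical; rotatable bond counts were not reported for the compounds in this study.

```python
def meets_two_criteria(rotatable_bonds, polar_surface_area):
    """True when both the rotatable-bond and polar-surface-area limits hold."""
    return rotatable_bonds <= 10 and polar_surface_area <= 140.0


# Hypothetical compounds: (rotatable bonds, polar surface area in Å²).
flexible_polar = meets_two_criteria(12, 150.0)
compact = meets_two_criteria(4, 75.0)
print(flexible_polar, compact)
```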

Lead-like

During drug discovery, lipophilicity and molecular weight are often increased in order to improve the affinity and selectivity of the drug candidate. Hence it is often difficult to maintain drug-likeness (i.e., RO5 compliance) during hit and lead optimization. It has therefore been proposed that members of screening libraries from which hits are discovered should be biased toward lower molecular weight and lipophilicity, so that medicinal chemists will have an easier time delivering optimized drug development candidates that are also drug-like. Accordingly, the rule of five has been extended to the rule of three (RO3) for defining lead-like compounds. A rule-of-three compliant compound is defined as one that has:

octanol-water partition coefficient log P not greater than 3

molecular mass less than 300 daltons

not more than 3 hydrogen bond donors

not more than 3 hydrogen bond acceptors

not more than 3 rotatable bonds
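The rule of three can be sketched in the same way as the rule of five. The function name and the example fragment below are hypothetical and serve only to show how the five limits combine.

```python
def is_lead_like(log_p, mol_weight, h_donors, h_acceptors, rotatable_bonds):
    """True when all five rule-of-three limits are satisfied."""
    return (log_p <= 3
            and mol_weight < 300
            and h_donors <= 3
            and h_acceptors <= 3
            and rotatable_bonds <= 3)


# Hypothetical small fragment and a hypothetical larger, drug-sized molecule.
fragment_ok = is_lead_like(1.2, 180.0, 1, 2, 2)
large_molecule_ok = is_lead_like(4.1, 420.0, 2, 6, 7)
print(fragment_ok, large_molecule_ok)
```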

STEPS FOR DEVIATION STUDY

Diseases related to a particular organ

Their ligand structure and their study

Finding out the most common part in them

Checking them on rule of five parameter.

Finding out which one and how many of them deviate from rule of five parameters.

By identifying the most common part of those ligands, we can conclude that this common part is generally required in any ligand that works as a drug for diseases of that organ.

By checking the deviation of those ligands from the rule of five, we can conclude whether the rule-of-five parameters can vary while still giving a good and highly effective drug.

Liver : The organ taken for study

The liver is a vital organ present in vertebrates and some other animals. It has a wide range of functions, including detoxification, protein synthesis, and production of biochemicals necessary for digestion. The liver is necessary for survival; there is currently no way to compensate for the absence of liver function in the long term, although new liver dialysis techniques can be used in the short term.

This organ plays a major role in metabolism and has a number of functions in the body, including glycogen storage, decomposition of red blood cells, plasma protein synthesis, hormone production, and detoxification. It lies below the diaphragm in the abdominal-pelvic region of the abdomen. It produces bile, an alkaline compound which aids in digestion via the emulsification of lipids. The liver's highly specialized tissues regulate a wide variety of high-volume biochemical reactions, including the synthesis and breakdown of small and complex molecules, many of which are necessary for normal vital functions.[2]

Terminology related to the liver often starts in hepar- or hepat- from the Greek word for liver, hēpar (ἧπαρ, root hepat-, ἡπατ-).

Functions

The liver stores a multitude of substances, including glucose (in the form of glycogen), vitamin A (1–2 years' supply), vitamin D (1–4 months' supply), vitamin B12 (1–3 years' supply), vitamin K, iron, and copper.

The liver is responsible for immunological effects—the reticuloendothelial system of the liver contains many immunologically active cells, acting as a 'sieve' for antigens carried to it via the portal system.

The liver produces albumin, the major osmolar component of blood serum.

The liver synthesizes angiotensinogen, a hormone that is responsible for raising the blood pressure when activated by renin, an enzyme that is released when the kidney senses low blood pressure.

Diseases of the liver

The liver supports almost every organ in the body and is vital for survival. Because of its strategic location and multidimensional functions, the liver is also prone to many diseases.

The most common include: Infections such as hepatitis A, B, C, D, E, alcohol damage, fatty liver, cirrhosis, cancer, drug damage (particularly by acetaminophen (paracetamol) and cancer drugs).

Many diseases of the liver are accompanied by jaundice caused by increased levels of bilirubin in the system. The bilirubin results from the breakup of the hemoglobin of dead red blood cells; normally, the liver removes bilirubin from the blood and excretes it through bile.

There are also many pediatric liver diseases, including biliary atresia, alpha-1 antitrypsin deficiency, Alagille syndrome, progressive familial intrahepatic cholestasis, and Langerhans cell histiocytosis, to name but a few.

Diseases that interfere with liver function will lead to derangement of these processes. However, the liver has a great capacity to regenerate and has a large reserve capacity. In most cases, the liver only produces symptoms after extensive damage. Liver diseases may be diagnosed by liver function tests, for example, by production of acute phase proteins.

Disease symptoms

The classic symptoms of liver damage include the following:

Pale stools occur when stercobilin, a brown pigment, is absent from the stool. Stercobilin is derived from bilirubin metabolites produced in the liver.

Dark urine occurs when bilirubin mixes with urine.

Jaundice (yellow skin and/or whites of the eyes): this is where bilirubin deposits in skin, causing an intense itch. Itching is the most common complaint by people who have liver failure. Often this itch cannot be relieved by drugs.

Swelling of the abdomen, ankles and feet occurs because the liver fails to make albumin.

Excessive fatigue occurs from a generalized loss of nutrients, minerals and vitamins.

Bruising and easy bleeding are other features of liver disease. The liver makes substances which help prevent bleeding. When liver damage occurs, these substances are no longer present and severe bleeding can occur.

Diagnosis

The diagnosis of liver function is made by blood tests. Liver function tests can readily pinpoint the extent of liver damage. If infection is suspected, then other serological tests are done. Sometimes, one may require an ultrasound or a CT scan to produce an image of the liver.

Physical examination of the liver is not accurate in determining the extent of liver damage. It can only reveal presence of tenderness or the size of liver, but in all cases, some type of radiological study is required to examine it.[12]

Biopsy / scan

Damage to the liver is sometimes determined with a biopsy, particularly when the cause of liver damage is unknown. In the 21st century, biopsies were largely replaced by high-resolution radiographic scans. The latter do not require ultrasound guidance, lab involvement, microscopic analysis, organ damage, pain, or patient sedation, and the results are available immediately on a computer screen.

In a biopsy, a needle is inserted into the skin just below the rib cage and a tissue sample obtained. The tissue is sent to the laboratory, where it is analyzed under a microscope. Sometimes, a radiologist may assist the physician performing a liver biopsy by providing ultrasound guidance.[13]

Regeneration

The liver is the only human internal organ capable of natural regeneration of lost tissue; as little as 25% of a liver can regenerate into a whole liver. This is, however, not true regeneration but rather compensatory growth. The lobes that are removed do not regrow, and the growth of the liver is a restoration of function, not of original form. This contrasts with true regeneration, where both original function and form are restored. In the liver, large areas of tissue are formed, but the formation of new cells requires a sufficient supply of material, so the circulation of the blood becomes more active.

This is predominantly due to the hepatocytes re-entering the cell cycle. That is, the hepatocytes go from the quiescent G0 phase to the G1 phase and undergo mitosis. This process is activated by the p75 receptors. There is also some evidence of bipotential stem cells, called hepatic oval cells or ovalocytes (not to be confused with oval red blood cells of ovalocytosis), which are thought to reside in the canals of Hering. These cells can differentiate into either hepatocytes or cholangiocytes, the latter being the cells that line the bile ducts.

Scientific and medical works about liver regeneration often refer to the Greek Titan Prometheus, who was chained to a rock in the Caucasus where, each day, his liver was devoured by an eagle, only to grow back each night. The myth suggests the ancient Greeks knew about the liver's remarkable capacity for self-repair; however, this claim is without evidence.

Liver transplantation

Human liver transplants were first performed by Thomas Starzl in the United States and Roy Calne in Cambridge, England in 1963 and 1965, respectively.

Liver transplantation is the only option for those with irreversible liver failure. Most transplants are done for chronic liver diseases leading to cirrhosis, such as chronic hepatitis C, alcoholism, autoimmune hepatitis, and many others. Less commonly, liver transplantation is done for fulminant hepatic failure, in which liver failure occurs over days to weeks.

Liver allografts for transplant usually come from donors who have died from fatal brain injury. Living donor liver transplantation is a technique in which a portion of a living person's liver is removed and used to replace the entire liver of the recipient.

This was first performed in 1989 for pediatric liver transplantation. Only 20 percent of an adult's liver (Couinaud segments 2 and 3) is needed to serve as a liver allograft for an infant or small child.

More recently, adult-to-adult liver transplantation has been done using the donor's right hepatic lobe, which amounts to 60 percent of the liver. Due to the ability of the liver to regenerate, both the donor and recipient end up with normal liver function if all goes well. This procedure is more controversial, as it entails performing a much larger operation on the donor, and indeed there have been at least two donor deaths out of the first several hundred cases. A recent publication has addressed the problem of donor mortality, and at least 14 cases have been found.[19] The risk of postoperative complications (and death) is far greater in right-sided operations than that in left-sided operations.

With the recent advances of noninvasive imaging, living liver donors usually have to undergo imaging examinations for liver anatomy to decide if the anatomy is feasible for donation. The evaluation is usually performed by multidetector row computed tomography (MDCT) and magnetic resonance imaging (MRI). MDCT is good in vascular anatomy and volumetry. MRI is used for biliary tree anatomy. Donors with very unusual vascular anatomy, which makes them unsuitable for donation, could be screened out to avoid unnecessary operations.

THE DISEASES AND THEIR RELATED DRUGS WITH STRUCTURES:

NAME OF THE DISEASE: LIVER CANCER

DRUGS USED IN THE TREATMENT OF CANCER DURING CHEMOTHERAPY

Doxorubicin

The structure of doxorubicin

Fluorouracil (5-FU)

Gemcitabine

NAME OF THE DISEASES: CIRRHOSIS, ALCOHOLIC LIVER DISEASE

NAME OF THE DRUG USED IN THE TREATMENT

Ursodiol

NAME OF THE DISEASE: LIVER CYST / ABSCESS

NAME OF THE DRUGS USED

Metronidazole

Cephalosporin

NAME OF THE DISEASE: ALCOHOLIC LIVER DISEASE

NAME OF THE DRUGS USED IN THE TREATMENT

Disulfiram

Acamprosate

NAME OF THE DISEASE: HEPATITIS A, B, C

DRUGS USED IN THE TREATMENT

FOR HEPATITIS A: NO DRUGS ARE USED.

FOR HEPATITIS B: LAMIVUDINE (Epivir)

TENOFOVIR

ENTECAVIR

FOR HEPATITIS C: AS SUCH NO DRUGS ARE PRESCRIBED, BUT IT LEADS TO LIVER CIRRHOSIS AND LATER LIVER TRANSPLANT.

Checking the structures against the rule-of-five parameters with the help of software, namely OSIRIS or Molsoft

OSIRIS

The OSIRIS Property Explorer is an integral part of Actelion's in-house substance registration system. It lets the user draw chemical structures and calculates various drug-relevant properties on the fly whenever a structure is valid. Prediction results are valued and color coded. Properties with high risks of undesired effects, such as mutagenicity or poor intestinal absorption, are shown in red, whereas a green color indicates drug-conform behaviour.

MOLSOFT

Molsoft LLC is a provider of tools, databases and consulting services in the areas of structure prediction, structural proteomics, bioinformatics, cheminformatics, molecular visualization and animation, and rational drug design. Molsoft offers solutions customized for biotechnology or pharmaceutical companies in the areas of computational biology and chemistry. It offers software tools and services in lead discovery, modeling, cheminformatics, bioinformatics, and corporate data management, and forms partnerships with biotechnology and pharmaceutical companies.

The details of terminology used to satisfy the rule of five:

Molecular weight – known relationship between poor permeability and high molecular weight.

Lipophilicity (ratio of octanol solubility to water solubility) – measured through LogP.

Number of hydrogen bond donors and acceptors – High numbers may impair permeability across membrane bilayer

The rule of five – formulation

Poor absorption or permeation are more likely when:

There are more than 5 H-bond donors.

The molecular weight is over 500.

The LogP is over 5.

There are more than 10 H-bond acceptors.

THE STRUCTURES VIEWED IN THE SOFTWARE

Drug                                   cLogP   Solubility   Molecular weight   Druglikeness   Drug score
Disulfiram (alcoholic liver disease)   4.13    2.39         234                3.8            0.4
Acamprosate                            1.95    2.25         131                4.75           0.48
Cephalosporin                          2.16    1.94         195                1.24           0.83
Metronidazole                          1.04    0.62         157                0.34           0.78
Ursodiol                               3.54    4.36         366                5.34           0.38
Entecavir                              1.88    2.57         185                1.0            0.82
Tenofovir                              1.88    1.12         136                0.71           0.82
Lamivudine                             0.52    2.36         245                2.41           0.9
Albendazole                            2.05    3.35         295                7.66           0.43
Doxorubicin                            2.5     3.41         376                2.65           0.43
Gemcitabine                            0.44    1.5          125                1.74           0.56
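The deviation check on these ligands can be sketched as follows. Only two of the four rule-of-five parameters (cLogP and molecular weight) are reported above, so the sketch tests those two alone; the hydrogen bond donor and acceptor counts are not covered. The values are transcribed from the listing above and the variable names are illustrative.

```python
# drug name: (cLogP, molecular weight) as reported above.
drugs = {
    "Disulfiram": (4.13, 234),
    "Acamprosate": (1.95, 131),
    "Cephalosporin": (2.16, 195),
    "Metronidazole": (1.04, 157),
    "Ursodiol": (3.54, 366),
    "Entecavir": (1.88, 185),
    "Tenofovir": (1.88, 136),
    "Lamivudine": (0.52, 245),
    "Albendazole": (2.05, 295),
    "Doxorubicin": (2.5, 376),
    "Gemcitabine": (0.44, 125),
}

# A drug deviates here if log P exceeds 5 or the mass reaches 500 daltons.
deviating = [name for name, (clogp, weight) in drugs.items()
             if clogp > 5 or weight >= 500]

print(f"{len(deviating)} of {len(drugs)} drugs deviate: {deviating}")
```

On these two parameters none of the eleven drugs deviates from the rule of five.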

CONCLUSION

The common structure in the above ligands is the benzene ring and the carboxylic acid group. We can conclude that this common part is mostly required in the above drugs for curing liver disease. No drug deviated from the rule-of-five parameters, so it is concluded that all the drugs satisfy the rule and are likely to behave as orally active drugs. The goal, which was to identify calculable parameters of the selected compound library related to absorption and permeability, was achieved. Calculations, however imprecise (they give only probabilities), may help when choices must be made as to design or purchase. Accurate prediction of the solubility of complex compounds is still an "elusive target".

References

[1] Centers for Disease Control and Prevention. Influenza. http://www.cdc.gov/flu/ (accessed June 19, 2009).

[2] Couch, Robert B. The New England Journal of Medicine 1997, 337: 927-929.

[3] Luo, M.; Air, G. M.; Brouillette, W. J. The Journal of Infectious Diseases 1997, 176: 62-65.

[4] Malaisree, M.; Rungrotmongkol, T.; Decha, P.; Intharathep, P.; Aruksakunwong, O.; Hannongbua, S. Proteins 2008, 71: 1908-1918.

[5] Kass, Itamar; Arkin, Isaiah T. Structure 2005, 13: 1789-1798.

[6] Couch, Robert B. The New England Journal of Medicine 2000, 343: 1778-1788.

[7] Balfour Jr, Henry H. The New England Journal of Medicine 1999, 340: 1255-1269.

[8] Lewars, Errol. Computational Chemistry: Introduction to the Theory and Applications of Molecular and Quantum Mechanics; Kluwer Academic Publishers: Boston, 2003.

[9] Finer-Moore, Janet S.; Blaney, Jeff; Stroud, Robert M. Facing the Wall in Computationally Based Approaches to Drug Discovery. In Computational and Structural Approaches to Drug Discovery: Ligand-Protein Interactions; Stroud, Robert M.; Finer

[10] Rosenfeld, Robin J.; Goodsell, David S.; Musah, Rabi A.; Morris, Garrett M.; Goodin, David B.; Olson, Arthur J. Journal of Computer-Aided Molecular Design 2003, 17: 525-536.

[11] Morris, Garrett M.; Goodsell, David S.; Halliday, Robert S.; Huey, Ruth; Hart, William E.; Belew, Richard K.; Olson, Arthur J. Journal of Computational Chemistry 1998, 19: 1639-1662.

[12] Protein Data Bank. http://www.rcsb.org/pdb/home/home.do (accessed April, 2013).

[13] Morris, Garrett; Huey, Ruth. 2013. http://autodock.scripps.edu/faqs-help/tutorial/using-autodock-4-with-autodocktools (accessed September 8, 2008).