Protein 3D structure and classification database
Uploaded by nadeem-akhter
Protein structure and modeling, Preston University, Islamabad
1
Introduction
• Protein
• Function of proteins
• Enzymes
• Structures
• Catalysts
• Transportation
• Regulation
• Signaling
2
Amino acids
• Amino acids are the basic units of proteins
• Chiral (alpha) carbon
• Side chain
• Hydrogen atom
• Amino group
• Carboxyl group
3
Protein sequences: amino acids
4
5
6
7
Codes for amino acids
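The code table itself did not survive in this transcript. As a minimal sketch, the standard three-letter and one-letter codes for the 20 common amino acids can be written as a lookup table; the helper name `to_one_letter` is an assumption made for illustration.

```python
# Standard three-letter to one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues):
    """Convert a list of three-letter codes into a one-letter sequence string.

    Hypothetical helper: input case is normalized, so "MET" and "Met" both work.
    """
    return "".join(THREE_TO_ONE[r.capitalize()] for r in residues)
```

For example, `to_one_letter(["Met", "Ala", "Lys"])` joins the one-letter codes for methionine, alanine and lysine.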
8
Protein
• Primary structure
• Secondary structure
• Tertiary structure
• Quaternary structure
9
Secondary structures
• Structures formed by hydrogen bonding within the linear polypeptide chain
• Alpha helices
• Beta sheets
10
Alpha helices
• Right-handed coiled or spiral conformation (helix) in which every backbone N-H group donates a hydrogen bond to the backbone C=O group of the amino acid four residues earlier (i+4 → i hydrogen bonding).
• Among types of local structure in proteins, the α-helix is the most regular and the most predictable from sequence, as well as the most prevalent.
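The hydrogen-bonding pattern described above can be sketched as a list of donor and acceptor residue pairs. This is a minimal illustration for an idealized helix; the function name and the 1-based residue numbering are assumptions.

```python
def helix_hbond_pairs(start, end):
    """Return (donor, acceptor) residue numbers for an ideal alpha helix.

    The helix spans residues start..end inclusive. The backbone N-H of
    residue i+4 donates a hydrogen bond to the backbone C=O of residue i.
    """
    return [(i + 4, i) for i in range(start, end - 3)]


if __name__ == "__main__":
    for donor, acceptor in helix_hbond_pairs(1, 8):
        print(f"N-H of residue {donor} -> C=O of residue {acceptor}")
```

A helix shorter than five residues yields no pairs, because no residue has a partner four positions earlier.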
11
12
Beta sheets
• The β sheet (also β-pleated sheet) is the second form of regular secondary structure in proteins. It is less common than the alpha helix.
• Beta sheets consist of beta strands connected laterally by at least two or three backbone hydrogen bonds, forming a generally twisted, pleated sheet.
• A beta strand (also β strand) is a stretch of polypeptide chain typically 3 to 10 amino acids long with backbone in an almost fully extended conformation.
13
Parallel beta sheet
Anti parallel beta sheet
14
Tertiary structure
• The term protein tertiary structure refers to a protein's three-dimensional shape. The tertiary structure has a single polypeptide chain "backbone" with one or more protein secondary structures, which make up the protein domains.
15
16
Conformational parameters for secondary structure of a protein
• Dihedral angles: in a protein, each bond joining the amino acids has a specific orientation, and these orientations determine the conformation of the protein.
• The conformation can therefore be described by the angles in the main chain (backbone) rather than the side chains.
• The angle phi (φ) is the rotation about the bond between the nitrogen of the amino group and the alpha carbon (N-Cα).
• The angle psi (ψ) is the rotation about the bond between the alpha carbon and the carbon of the carboxyl group (Cα-C).
• By convention, phi and psi are both taken as 180 degrees when the polypeptide is in its fully extended conformation.
17
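As a minimal sketch of how these angles are obtained in practice, a dihedral angle can be computed from the 3D coordinates of four consecutive backbone atoms. Phi uses C(i-1), N(i), Cα(i), C(i), and psi uses N(i), Cα(i), C(i), N(i+1). The function names are assumptions, and coordinates are plain (x, y, z) tuples.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees about the p1-p2 bond, in the range -180..180."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def phi(c_prev, n, ca, c):
    """Phi: rotation about the N-CA bond."""
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """Psi: rotation about the CA-C bond."""
    return dihedral(n, ca, c, n_next)
```

With four atoms in a planar zigzag (trans) arrangement the function returns an angle of magnitude 180 degrees, matching the fully extended convention above.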
Ramachandran plot
18
• Only certain values are permitted for these angles.
• Other values cause steric hindrance, which can distort the conformation.
• The protein might also become nonfunctional.
• A Ramachandran plot can be used in two different ways.
• One is to show in theory which values, or conformations, of the ψ and φ angles are possible for an amino acid residue in a protein.
• A second is to show the empirical distribution of data points observed in a single structure, as used for structure validation, or in a database of many structures.
• In either case, the points are usually shown against outlines of the theoretically favored regions.
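The validation use can be sketched as a check of each residue's (φ, ψ) pair against favored regions. The rectangular boxes below are rough illustrative assumptions, not the contours used by real validation tools, and the function names are hypothetical.

```python
# Rough, illustrative (phi, psi) boxes in degrees. These boundaries are
# assumptions for this sketch only; real plots use contoured regions.
REGIONS = {
    "right-handed alpha helix": ((-100, -30), (-80, -10)),
    "beta sheet": ((-180, -45), (90, 180)),
    "left-handed helix": ((30, 90), (0, 90)),
}


def classify(phi, psi):
    """Return the name of the sketched region containing (phi, psi)."""
    for name, ((phi_lo, phi_hi), (psi_lo, psi_hi)) in REGIONS.items():
        if phi_lo <= phi <= phi_hi and psi_lo <= psi <= psi_hi:
            return name
    return "outside the sketched regions"


def fraction_in_regions(angles):
    """Fraction of (phi, psi) pairs that fall inside any sketched region."""
    if not angles:
        return 0.0
    inside = sum(1 for phi, psi in angles
                 if classify(phi, psi) != "outside the sketched regions")
    return inside / len(angles)
```

For a single structure, a low fraction of residues inside the favored regions would flag the model for closer inspection.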
19
Ramachandran Plot
20
Hydropathy plot
• A hydropathy plot is a quantitative analysis of the degree of hydrophobicity or hydrophilicity of the amino acids of a protein.
• It is used to characterize or identify possible structures or domains of a protein.
• A long stretch of hydrophobic residues in the plot suggests a transmembrane segment, and several such stretches suggest a protein that spans the membrane multiple times.
21
• The plot has the amino acid sequence of a protein on its x-axis.
• The degree of hydrophobicity or hydrophilicity is on its y-axis.
• There are a number of methods to measure the degree of interaction of polar solvents such as water with specific amino acids.
• For instance, the Kyte-Doolittle scale indicates hydrophobic amino acids, whereas the Hopp-Woods scale measures hydrophilic residues.
22
• Analyzing the shape of the plot gives information about the partial structure of the protein.
• For instance, if a stretch of about 20 amino acids scores positive for hydrophobicity, these amino acids may be part of an alpha helix spanning a lipid bilayer, whose interior is composed of hydrophobic fatty acid chains.
• Conversely, amino acids with high hydrophilicity are likely to be in contact with the solvent, or water, and therefore to reside on the outer surface of the protein.
• ExPASy ProtScale can be used to construct a hydropathy plot quickly.
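A hydropathy profile like the one described above can be sketched as a sliding-window average over the Kyte-Doolittle scale. The window of 19 residues and the cutoff of 1.6 are values commonly quoted for spotting membrane-spanning stretches with this scale; they are treated here as adjustable assumptions, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Return (center_position, mean_hydropathy) for each window.

    Positions are 1-based and refer to the residue at the window center
    (x-axis of the plot); the mean score is the y-axis value.
    """
    seq = seq.upper()
    half = window // 2
    profile = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        score = sum(KD[aa] for aa in segment) / window
        profile.append((start + half + 1, score))
    return profile


def hydrophobic_windows(seq, window=19, threshold=1.6):
    """Windows whose mean hydropathy reaches the (assumed) cutoff."""
    return [(pos, score) for pos, score in hydropathy_profile(seq, window)
            if score >= threshold]
```

Plotting the first element of each pair against the second gives the hydropathy plot; the windows returned by `hydrophobic_windows` are only candidates for membrane-spanning helices, not a prediction.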
23
Expasy – protscale
24
25
26
27
28
aquaporin
Methods of protein structure modeling
• Threading or fold recognition
• Ab initio / de novo method
29
1) Threading
• Two proteins may be structurally similar even when their sequence similarity is below about ten percent.
• When sequence-based comparison methods cannot recognize the folds and domains in the target sequence, we proceed with threading.
• Threading searches a library of unique structures for structural analogues of the target sequence, and is based on the theory that there may be only a limited number of distinct folds.
30
Basic components of fold recognition (threading)
• Representation of the query sequence
• Representation of the protein structural models
• Objective function
• Aligning a sequence to a model
• Selecting a model from a library
31
Representation of the query sequence
• Similar protein sequences lead to similar protein structures.
• Sequences similar to the query sequence carry information about the 3D structure of the query sequence.
• Algorithms exist to build different representations of the query sequence.
32
Representation of the protein structural models
• A protein structure is described by the 3D positions of all its non-hydrogen atoms.
• Threading software works with abstracted 3D coordinates that are better suited to a simplified protein representation while still giving a view close to the original 3D structure.
33
Objective function
• The 3D data deposited in databases such as the PDB are analyzed with various statistical protocols.
• The resulting statistics are referred to as knowledge-based potentials or empirical potentials.
• In the case of non-linear models, they are also called contact potentials.
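One common way to turn such statistics into a potential is a log ratio of observed to expected contact counts (an inverse-Boltzmann style estimate, in units of kT). The slides do not say which statistical protocol is meant, so this is only a sketch under that assumption; the function name and the input layout are hypothetical.

```python
import math


def contact_potential(observed_contacts, residue_counts):
    """Estimate a pairwise contact potential from database counts.

    observed_contacts: dict mapping a sorted residue-type pair, e.g. ("K", "L"),
        to the number of contacts seen in known structures (must be > 0).
    residue_counts: dict mapping residue type to its number of occurrences.
    Returns a dict of energies; negative means the contact is seen more
    often than expected by chance.
    """
    total_contacts = sum(observed_contacts.values())
    total_residues = sum(residue_counts.values())
    potential = {}
    for (a, b), n_obs in observed_contacts.items():
        fa = residue_counts[a] / total_residues
        fb = residue_counts[b] / total_residues
        # Unlike pairs can form in two orders, so they are twice as likely.
        expected_fraction = fa * fb if a == b else 2 * fa * fb
        n_exp = expected_fraction * total_contacts
        potential[(a, b)] = -math.log(n_obs / n_exp)
    return potential
```

With made-up counts where leucine-leucine contacts are over-represented, the leucine-leucine energy comes out negative (favorable) and under-represented pairs come out positive.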
34
Aligning a sequence to a model
• The goal of a threading alignment algorithm is to find the optimal match between the query sequence and the best-suited structural model.
• Sequence-structure alignment algorithms are used to find this match.
35
Selecting a model from a library
• Aligning the query sequence against the different structures in the library yields multiple candidate models.
• The candidate with the highest score is selected to model the protein structure.
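The alignment and selection steps can be sketched with a toy example. Each template is reduced to a string of structural environments ("B" for buried, "E" for exposed), the objective function rewards hydrophobic residues in buried positions, and alignment is gapless. All of these are simplifying assumptions for illustration; real threading programs use richer potentials and gapped alignment.

```python
# Residues treated as hydrophobic in this toy objective function (assumption).
HYDROPHOBIC = set("AVLIMFWC")


def position_score(residue, environment):
    """+1 when residue polarity fits the environment, otherwise -1."""
    buried = environment == "B"
    return 1 if (residue in HYDROPHOBIC) == buried else -1


def best_gapless_fit(query, environments):
    """Slide the query along a template; return (score, offset) or None."""
    best = None
    for offset in range(len(environments) - len(query) + 1):
        score = sum(position_score(r, e)
                    for r, e in zip(query, environments[offset:]))
        if best is None or score > best[0]:
            best = (score, offset)
    return best


def select_model(query, library):
    """Return (template_name, score, offset) for the highest-scoring template."""
    results = []
    for name, environments in library.items():
        fit = best_gapless_fit(query, environments)
        if fit is not None:
            results.append((name, fit[0], fit[1]))
    return max(results, key=lambda r: r[1]) if results else None
```

Given a hypothetical library such as `{"foldA": "BBEEBBEE", "foldB": "EEEEEEEE"}`, `select_model("LLKKLL", library)` scores the query against every template and keeps the best one.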
36
2) Ab initio method
• Ab initio structure prediction determines the protein structure from the protein sequence alone.
• The free energy is estimated from the amino acid sequence itself, independently of any known structure.
• The two key components of de novo methods are the procedure for efficiently carrying out the conformational search and the free energy function used for evaluating the possible conformations.
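The two components can be illustrated with a toy Metropolis Monte Carlo search. The conformation is a list of backbone angles, and the energy function is a made-up stand-in that is lowest when every angle is -60 degrees. Both are assumptions for this sketch, not a real force field or a published protocol.

```python
import math
import random


def toy_energy(angles):
    """Hypothetical energy: zero when every angle equals -60 degrees."""
    return sum(1 - math.cos(math.radians(a + 60)) for a in angles)


def metropolis_search(n_angles, steps=5000, temperature=0.5, seed=0):
    """Conformational search: perturb one angle at a time, accept by Metropolis."""
    rng = random.Random(seed)
    current = [rng.uniform(-180, 180) for _ in range(n_angles)]
    current_e = toy_energy(current)
    best, best_e = list(current), current_e
    for _ in range(steps):
        trial = list(current)
        i = rng.randrange(n_angles)
        # Random move, wrapped back into the range -180..180 degrees.
        trial[i] = (trial[i] + rng.gauss(0, 30) + 180) % 360 - 180
        trial_e = toy_energy(trial)
        delta = trial_e - current_e
        # Always accept downhill moves; accept uphill moves with Boltzmann odds.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, best_e
```

The search space grows quickly with the number of angles, which is one way to see why such methods are limited to short sequences.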
37
Ab initio method
Advantages
• The ab initio approach can be applied to model any sequence
Disadvantages
• Low-resolution models
• Only sequences of fewer than 100 amino acids can be modeled
38
Thanks
39
Signal peptide prediction
• A signal peptide, sometimes also referred to as a signal sequence, leader sequence or leader peptide, is a short peptide, 5-30 amino acids long, present at the N-terminus of the majority of newly synthesized proteins.
• These proteins are destined for the secretory pathway.
• They include proteins that reside inside certain organelles (the endoplasmic reticulum, Golgi or endosomes), are secreted from the cell, or are inserted into most cellular membranes.
• SignalP version 4 has been used to detect the presence of signal peptides.
40
41
42