Protein 3D structure and classification database

Protein structure and modeling, Preston University, Islamabad

Transcript of Protein 3D structure and classification database

Page 1: Protein 3D structure and classification database

Protein structure and modeling

Preston University, Islamabad

1

Page 2: Protein 3D structure and classification database

Introduction

• Protein

• Function of proteins

• Enzymes

• Structures

• Catalysts

• Transportation

• Regulation

• Signaling

2

Page 3: Protein 3D structure and classification database

Amino acids

• Amino acids are the basic units of proteins

• Chiral alpha carbon

• Side chain

• Hydrogen atom

• Amino group

• Carboxyl group

3

Page 4: Protein 3D structure and classification database

Protein sequences: amino acids

4

Page 5: Protein 3D structure and classification database

5

Page 6: Protein 3D structure and classification database

6

Page 7: Protein 3D structure and classification database

7

Page 8: Protein 3D structure and classification database

Codes for amino acids
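The code table on this slide did not survive in the transcript. As a stand-in, the standard three-letter and one-letter codes for the 20 common amino acids can be sketched as a lookup table. The names `THREE_TO_ONE` and `to_one_letter` are illustrative and not part of the original slides.

```python
# Standard three-letter -> one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

def to_one_letter(residues):
    """Convert an iterable of three-letter codes (any case) to a one-letter string."""
    return "".join(THREE_TO_ONE[name.upper()] for name in residues)

print(to_one_letter(["Met", "Lys", "Trp"]))  # prints MKW
```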

8

Page 9: Protein 3D structure and classification database

Protein

• Primary structure

• Secondary structure

• Tertiary structure

• Quaternary structure

9

Page 10: Protein 3D structure and classification database

Secondary structures

• Structures formed by hydrogen bonding within the linear polypeptide chain

• Alpha helices

• Beta sheets

10

Page 11: Protein 3D structure and classification database

Alpha helices

• Right-handed coiled or spiral conformation (helix) in which every backbone N-H group donates a hydrogen bond to the backbone C=O group of the amino acid four residues earlier (i+4 → i hydrogen bonding).

• Among types of local structure in proteins, the α-helix is the most regular and the most predictable from sequence, as well as the most prevalent.
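The "four residues earlier" hydrogen-bonding pattern can be made concrete with a small sketch that lists the donor and acceptor residue numbers for an idealized helix. The function name `helix_hbond_pairs` is hypothetical, and real helices deviate from this ideal pattern at their ends.

```python
def helix_hbond_pairs(n_residues):
    """Return (donor, acceptor) residue numbers, 1-based, for an ideal alpha helix.

    The N-H of residue i+4 donates to the C=O of residue i.
    """
    return [(i + 4, i) for i in range(1, n_residues - 3)]

print(helix_hbond_pairs(6))  # prints [(5, 1), (6, 2)]
```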

11

Page 12: Protein 3D structure and classification database

12

Page 13: Protein 3D structure and classification database

Beta sheets

• The β sheet (also β-pleated sheet) is the second form of regular secondary structure in proteins. It is less common than the alpha helix.

• Beta sheets consist of beta strands connected laterally by at least two or three backbone hydrogen bonds, forming a generally twisted, pleated sheet.

• A beta strand (also β strand) is a stretch of polypeptide chain typically 3 to 10 amino acids long with backbone in an almost fully extended conformation.

13

Page 14: Protein 3D structure and classification database

Parallel beta sheet

Antiparallel beta sheet

14

Page 15: Protein 3D structure and classification database

Tertiary structure

• The term protein tertiary structure refers to a protein's three-dimensional geometric shape. The tertiary structure has a single polypeptide chain "backbone" with one or more protein secondary structures, the protein domains.

15

Page 16: Protein 3D structure and classification database

16

Page 17: Protein 3D structure and classification database

Conformational parameters for secondary structure of a protein

• Dihedral angles: in proteins, the peptide linkage between amino acids has a specific orientation, which determines the conformation of the protein.

• The conformation of the protein can therefore be described by the angles in the main chain (backbone) rather than in the side chains.

• The angle phi (φ) is the rotation about the bond between the amino nitrogen and the alpha carbon (N-Cα) of each residue.

• The angle psi (ψ) is the rotation about the bond between the alpha carbon and the carboxyl carbon (Cα-C) of each residue.

• Both phi and psi are taken as 180 degrees when the polypeptide is in the fully extended conformation.

17
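A backbone dihedral angle can be computed from the coordinates of four consecutive atoms. For phi these are C of the previous residue, N, Cα and C; for psi they are N, Cα, C and N of the next residue. The following is a minimal sketch in plain Python, not taken from the slides. The helper names are illustrative, and the coordinates in the usage line are made-up points rather than real atom positions.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_degrees(p0, p1, p2, p3):
    """Dihedral angle in degrees about the p1-p2 bond, in the range -180..180."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Remove the components along the central bond, leaving the parts of
    # b0 and b2 that lie in the plane perpendicular to it.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Four made-up points in a planar zigzag (fully extended) arrangement.
angle = dihedral_degrees((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
print(abs(round(angle)))  # prints 180
```

A planar zigzag gives a magnitude of 180 degrees, matching the fully extended conformation described above. Placing the first and last points on the same side of the central bond gives 0 degrees.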

Page 18: Protein 3D structure and classification database

Ramachandran plot

18

• Only certain values of these angles are permitted.

• If the values fall outside the permitted ranges, steric hindrance may arise and the conformation may become distorted.

• The protein may also become non-functional.

Page 19: Protein 3D structure and classification database

• A Ramachandran plot can be used in two different ways.

• One is to show in theory which values, or conformations, of the ψ and φ angles are possible for an amino-acid residue in a protein.

• A second is to show the empirical distribution of data points observed in a single structure (as used for structure validation) or in a database of many structures.

• Either case is usually shown against outlines for the theoretically favored regions.
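The second, empirical use can be sketched as a tally of observed (φ, ψ) pairs against favored regions. The rectangular boxes below are coarse, assumed approximations chosen only for illustration. Real validation tools use contoured regions derived from large structure sets, and the names `REGIONS`, `classify_phi_psi` and `region_counts` are hypothetical.

```python
# Coarse, illustrative (phi, psi) boxes in degrees; assumed values, not
# validation-grade region outlines.
REGIONS = {
    "right-handed alpha helix": ((-100, -30), (-80, -10)),
    "beta sheet": ((-180, -45), (90, 180)),
    "left-handed helix": ((30, 100), (10, 80)),
}

def classify_phi_psi(phi, psi):
    """Return the name of the first box containing (phi, psi), else 'other'."""
    for name, ((phi_lo, phi_hi), (psi_lo, psi_hi)) in REGIONS.items():
        if phi_lo <= phi <= phi_hi and psi_lo <= psi <= psi_hi:
            return name
    return "other"

def region_counts(angles):
    """Tally how many (phi, psi) pairs fall in each region."""
    counts = {}
    for phi, psi in angles:
        label = classify_phi_psi(phi, psi)
        counts[label] = counts.get(label, 0) + 1
    return counts

print(classify_phi_psi(-60, -45))  # prints right-handed alpha helix
```

A structure with many residues landing in "other" would merit a closer look, although with boxes this crude the result is only a rough indication.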

19

Page 20: Protein 3D structure and classification database

Ramachandran Plot

20

Page 21: Protein 3D structure and classification database

Hydropathy plot

• A hydropathy plot is a quantitative analysis of the degree of hydrophobicity or hydrophilicity of the amino acids of a protein.

• It is used to characterize or identify possible structures or domains of a protein.

• Several strongly hydrophobic stretches in the plot suggest that the protein is a transmembrane protein that spans the membrane multiple times, with each hydrophobic stretch corresponding to a segment buried inside the membrane.

21

Page 22: Protein 3D structure and classification database

• The plot has the amino acid sequence of the protein (residue position) on its x-axis.

• The degree of hydrophobicity and hydrophilicity is on its y-axis.

• There are a number of methods to measure the degree of interaction of polar solvents such as water with specific amino acids.

• For instance, the Kyte-Doolittle scale indicates hydrophobic amino acids, whereas the Hopp-Woods scale measures hydrophilic residues.

22

Page 23: Protein 3D structure and classification database

• Analyzing the shape of the plot gives information about partial structure of the protein.

• For instance, if a stretch of about 20 amino acids shows positive for hydrophobicity, these amino acids may be part of alpha-helix spanning across a lipid bilayer, which is composed of hydrophobic fatty acids.

• Conversely, amino acids with high hydrophilicity indicate that these residues are in contact with solvent, or water, and that they are therefore likely to reside on the outer surface of the protein.

• ExPASy ProtScale can be used to construct a hydropathy plot quickly.
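A hydropathy profile of this kind can be sketched as a sliding-window average of Kyte-Doolittle values along the sequence. The window of 19 residues and the cutoff of 1.6 are assumed defaults often quoted for spotting membrane-spanning stretches. They are not taken from the slides and should be treated as adjustable parameters. The function names are illustrative.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(sequence, window=19):
    """Mean Kyte-Doolittle score for each full window along the sequence."""
    scores = [KYTE_DOOLITTLE[res] for res in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]

def hydrophobic_windows(sequence, window=19, threshold=1.6):
    """Start positions (0-based) of windows whose mean score exceeds the threshold."""
    profile = hydropathy_profile(sequence, window)
    return [i for i, value in enumerate(profile) if value > threshold]

# A made-up sequence: five leucines followed by five lysines.
print(hydrophobic_windows("LLLLLKKKKK", window=5))  # prints [0, 1]
```

Plotting the values returned by `hydropathy_profile` against window position gives the curve described above, with residue position on the x-axis and hydropathy on the y-axis.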

23

Page 24: Protein 3D structure and classification database

ExPASy ProtScale

24

Page 25: Protein 3D structure and classification database

25

Page 26: Protein 3D structure and classification database

26

Page 27: Protein 3D structure and classification database

27

Page 28: Protein 3D structure and classification database

28

aquaporin

Page 29: Protein 3D structure and classification database

Methods of protein structure and modeling

Threading or fold recognition

Ab initio/ De novo method

29

Page 30: Protein 3D structure and classification database

1) Threading

• Two proteins may be structurally similar even when they share less than about ten percent sequence similarity.

• When sequence-based comparison methods are not effective enough to recognize the folds and domains in the target sequence, we proceed with threading.

• Threading is the method by which a library of unique structures is searched for structural analogues to the target sequence. It is based on the theory that there may be only a limited number of distinct folds.

30

Page 31: Protein 3D structure and classification database

Basic components of threading

• Representation of the query sequence

• Representation of the protein structural models

• Objective function

• Aligning a sequence to a model

• Selecting a model from a library

31

Page 32: Protein 3D structure and classification database

Representation of the query sequence

• Similar protein sequences lead to similar protein structures.

• Sequences similar to the query sequence therefore carry information about the 3D structure of the query sequence.

• Algorithms are also available to build the different representations of the query sequence.

32

Page 33: Protein 3D structure and classification database

Representation of the protein structural models

• A protein structure is described by the 3D coordinates of all its non-hydrogen atoms.

• Threading software typically works with a simplified, abstract representation derived from these coordinates, which still approximates the original 3D protein structure closely.

33

Page 34: Protein 3D structure and classification database

Objective function

• The 3D data deposited in databases such as the PDB are analyzed with various statistical protocols.

• The scoring functions derived from these analyses are referred to as knowledge-based potentials or empirical potentials.

• In the case of non-linear models they are also known as contact potentials, among other names.

34

Page 35: Protein 3D structure and classification database

Aligning a sequence to a model

• The goal of the threading alignment algorithm is to find the optimal match between the query sequence and the best-suited structural model.

• Sequence-structure alignment algorithms are used to find this match.

35

Page 36: Protein 3D structure and classification database

Selecting a model from a library

• Aligning the query sequence against the structures in the library yields multiple candidate models.

• The candidate with the highest score is selected to model the protein structure.
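The components above (structural models, objective function, alignment, model selection) can be tied together in a deliberately tiny sketch. Everything here is an assumption made for illustration: models are strings of buried ('B') and exposed ('E') positions, the objective function simply rewards hydrophobic residues in buried positions and polar residues in exposed ones, and the alignment is gapless. Real threading programs use knowledge-based potentials and gapped sequence-structure alignment.

```python
# Toy residue classes; the hydrophobic set is an assumption for illustration.
HYDROPHOBIC = set("AVILMFWC")

def position_score(residue, environment):
    """+1 if the residue type suits the environment ('B' buried, 'E' exposed), else -1."""
    fits = (residue in HYDROPHOBIC) == (environment == "B")
    return 1 if fits else -1

def best_gapless_fit(query, environments):
    """Slide the query along one model without gaps; return (score, offset) or None."""
    best = None
    for offset in range(len(environments) - len(query) + 1):
        score = sum(
            position_score(res, environments[offset + i])
            for i, res in enumerate(query)
        )
        if best is None or score > best[0]:
            best = (score, offset)
    return best

def select_model(query, library):
    """Return (name, score, offset) of the best-scoring model in the library."""
    results = []
    for name, environments in library.items():
        fit = best_gapless_fit(query, environments)
        if fit is not None:
            results.append((name, fit[0], fit[1]))
    return max(results, key=lambda r: r[1]) if results else None

library = {"fold_a": "EEBBBEEE", "fold_b": "BEBEBEBE"}  # hypothetical models
print(select_model("LIVKDE", library))  # prints ('fold_a', 6, 2)
```

The query has three hydrophobic residues followed by three polar ones, so it fits best where `fold_a` has three buried positions followed by three exposed ones.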

36

Page 37: Protein 3D structure and classification database

2) Ab initio method

• Ab initio structure prediction determines the protein structure from the protein sequence alone.

• The free energy is also estimated independently, from the residues of the amino acid sequence alone.

• The two key components of de novo methods are the procedure for efficiently carrying out the conformational search and the free energy function used for evaluating the possible conformations.
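The two key components, a conformational search and an energy function, can be illustrated with the textbook HP lattice model. This is a swapped-in teaching simplification and not the method used by practical de novo tools. Residues are either hydrophobic (H) or polar (P), the chain lives on a 2D square lattice, each non-bonded H-H contact contributes -1, and the search is exhaustive, which is only feasible for very short chains.

```python
def contact_energy(sequence, coords):
    """-1 for every pair of H residues that touch on the lattice but are not chain neighbours."""
    energy = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):
            if sequence[i] == "H" and sequence[j] == "H":
                dx = abs(coords[i][0] - coords[j][0])
                dy = abs(coords[i][1] - coords[j][1])
                if dx + dy == 1:
                    energy -= 1
    return energy

def self_avoiding_walks(length):
    """Yield every self-avoiding walk of the given length on a 2D square lattice.

    The first step is fixed to (0,0) -> (1,0) to skip rotated duplicates.
    """
    def extend(path):
        if len(path) == length:
            yield list(path)
            return
        x, y = path[-1]
        for step in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if step not in path:
                path.append(step)
                yield from extend(path)
                path.pop()
    start = [(0, 0), (1, 0)][:length]
    yield from extend(start)

def best_fold(sequence):
    """Exhaustive conformational search: return (lowest energy, coordinates)."""
    return min(
        ((contact_energy(sequence, walk), walk)
         for walk in self_avoiding_walks(len(sequence))),
        key=lambda pair: pair[0],
    )

print(best_fold("HPPH")[0])  # prints -1
```

For "HPPH" the lowest-energy fold bends the chain into a square so that the two H residues touch. The number of walks grows exponentially with chain length, which is why practical methods replace exhaustive enumeration with a guided search.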

37

Page 38: Protein 3D structure and classification database

Ab-initio method

Advantages

• The ab initio approach can be applied to model any sequence

Disadvantages

• Low-resolution models

• Only a limited number of residues, fewer than 100 amino acids, can be modeled

38

Page 39: Protein 3D structure and classification database

Thanks

39

Page 40: Protein 3D structure and classification database

Signal peptide prediction

• A signal peptide, sometimes also referred to as a signal sequence, leader sequence or leader peptide, is a short peptide (5-30 amino acids long) present at the N-terminus of the majority of newly synthesized proteins that are destined towards the secretory pathway.

• These proteins include those that reside inside certain organelles (the endoplasmic reticulum, Golgi or endosomes), are secreted from the cell, or are inserted into most cellular membranes.

• SignalP version 4 has been used to detect the presence of signal peptides.

40

Page 41: Protein 3D structure and classification database

41

Page 42: Protein 3D structure and classification database

42