Protein Structure: protein folding

52
Protein Structure: protein folding Bioinformatics in Biosophy Park, Jong Hwa MRC-DUNN Hills Road Cambridge CB2 2XY England 1 Next: 02/06/2001

description

1. Protein Structure: protein folding. Park, Jong Hwa MRC-DUNN Hills Road Cambridge CB2 2XY England. Bioinformatics in Biosophy. Next :. 02/06/2001. An Unusual Fold. Alpha helix is in the middle of beta sheets. From DNA To Protein In Cell. Different types of protein folds. - PowerPoint PPT Presentation

Transcript of Protein Structure: protein folding

Page 1: Protein Structure: protein folding

Protein Structure: protein folding

Bioinformatics in Biosophy

Park, Jong Hwa

MRC-DUNN Hills Road Cambridge

CB2 2XYEngland

1

Next:02/06/2001

Page 2: Protein Structure: protein folding

An Unusual Fold• Alpha helix is in the middle of beta sheets

Page 3: Protein Structure: protein folding

From DNA

To

Protein

In

Cell

Page 4: Protein Structure: protein folding

Different types of protein folds

Page 5: Protein Structure: protein folding

Types of protein folds

Page 6: Protein Structure: protein folding

Most proteins => Common folds

From Alex Finkelstein’s talk

Page 7: Protein Structure: protein folding

unusual fold: No Alpha no Beta

Page 8: Protein Structure: protein folding

Beta Inside

Page 9: Protein Structure: protein folding

Few seq: Rare fold, Many seq: Common Fold

Page 10: Protein Structure: protein folding

Physical Selection of Folds (NOT evolutionary selection)

Page 11: Protein Structure: protein folding

What is protein folding

• Protein folding refers to the process of protein primary structures folding into tertiary structures to produce stable and functioning biological molecules.

http://www.people.virginia.edu/~rjh9u/protfold.html

Page 12: Protein Structure: protein folding

Experimental background for Bioinformatics

• Understanding experimental techniques and knowledge on protein folding is essential for making bioinformatic algorithms for folding.

• Theories of folding and implementation of them are possible and quite acceptable.

• It is possible to compute 3D folds using computer for small peptides.

Page 13: Protein Structure: protein folding

Why protein folding problem?

1. Protein folding is one of the best known challenges humans have good research target

2. The physical rules of folding are not known.

3. The computational complexity is extremely high.

4. There are many diseases which are linked to protein folding.1. Alzheimer, BSE (mad cow disease), Parkinson, etc

5. Without knowing how proteins fold, it is not possible to design proteins fast and accurately.

Page 14: Protein Structure: protein folding

Proteins fold back!• In 1961, Christian Anfinsen showed that the

proteins actually tie themselves: If proteins become unfolded, they fold back into proper shape of their own accord; no shaper or folder is needed.

• This means most of the information for protein folding is encoded in the primary structure of proteins (sequences) !

• It is quite unbelievable!

Page 15: Protein Structure: protein folding

Levinthal’s paradox

• The fact that many naturally-occurring proteins fold reliably and quickly to their native state despite the astronomical number of possible configurations has come to be known as Levinthal's Paradox.

http://brian.ch.cam.ac.uk/~mark/levinthal/levinthal.html

C. Levinthal, in Mössbauer Spectroscopy in Biological Systems, Proceedings of a Meeting Held at Allerton House, Monticello, Illinois , edited by J. T. P. DeBrunner and E. Munck, pp. 22-24, University of Illinois Press, Illinois (1969).

Page 16: Protein Structure: protein folding

Astronomical possible conformations?

• there would be 3N possible configurations in our theoretical protein (N = length of protein).

• In nature, proteins apparently do not sample all of these possible configurations since they fold in a few seconds, and even postulating a minimum time for going from one conformation to another, the proteins would have time to try on the order of 108 different conformations at most before reaching their final state. – From Levinthal

Page 17: Protein Structure: protein folding

Longer than the age of universe?• Levinthal estimated the number of configurations a

typical protein could adopt (3N where N is the number of amino acids) and noted that even if protein configurations could be sampled at a rate of, say, 1013 per second, it would take longer than the age of the universe to find the native structure if this sampling was simply random.

• The result of this simple calculation markedly contradicts the actual properties of proteins, which can generally find the native structure from an unfolded state on time scales of the order of seconds or less.

This well-known contradiction has come to be known as Levinthal's paradox.

Page 18: Protein Structure: protein folding

NP-Hard?

• The most relevant class is that of NP-hard problems, for which there is no known algorithm that is guaranteed to find the solution within polynomial time, effectively rendering NP-hard problems intractable for large sizes.

• It has been shown that finding the global minimum of a protein is NP-hard.

Page 19: Protein Structure: protein folding

Native fold is not the most stable

http://online.itp.ucsb.edu/online/infobio01/finkelstein/oh/34.html

Page 20: Protein Structure: protein folding

Guided Folding and nucleation

• We feel that protein folding is speeded and guided by the rapid formation of local interactions which then determine the further folding of the peptide.

• This suggests local amino acid sequences which form stable interactions and serve as nucleation points in the folding process.

Page 21: Protein Structure: protein folding
Page 22: Protein Structure: protein folding

Nucleus of folding

Page 23: Protein Structure: protein folding

Folding Time and protein size

Page 24: Protein Structure: protein folding

Proteins have Folding Pathways

• As proteins do not search for all the possible conformations for folding,

• there must be one or MORE folding pathways to have the final stable folds.

Page 25: Protein Structure: protein folding

Many different paths

Folding paths can have different shapes.

It could be narrow and specific path or some kind of funnel.

The paths can be complicaated and reversible.

Page 26: Protein Structure: protein folding

Multiple pathways on a protein-folding energy landscape• Goldbeck, R.A., Thomas, Y.G., Chen, E., Esquerra, R.M., and Kliger, D.S. (1999) Multiple pathways on a protein-folding energy

landscape: kinetic evidence. Proc Natl Acad Sci U S A 96: 2782-7.The funnel landscape model predicts that protein folding proceeds

through multiple kinetic pathways.Abstract: Experimental evidence is presented for more than one such pathway

in the folding dynamics of a globular protein, cytochrome c. After photodissociation of CO from the partially denatured ferrous protein, fast time-resolved CD spectroscopy shows a submillisecond folding process that is complete in approximately 10(-6) s, concomitant with heme binding of a methionine residue. Kinetic modeling of time-resolved magnetic circular dichroism data further provides strong evidence that a 50-microseconds heme-histidine binding process proceeds in parallel with the faster pathway, implying that Met and His binding occur in different conformational ensembles of the protein, i.e., along respective ultrafast (microseconds) and fast (milliseconds) folding pathways. This kinetic heterogeneity appears to be intrinsic to the diffusional nature of early folding dynamics on the energy landscape, as opposed to the late-time heterogeneity associated with nonnative heme ligation and proline isomers in cytochrome c. Evidence from BPTI

Page 27: Protein Structure: protein folding

Pathway means some intermediates

The folding pathway naturally indicate some intermediate states for protein folding.

1. Pre-folded or misfolded state

2. Some Nucleus

3. Intermediates such a molten globule or partially folded state

4. Natural fold state (folded)

Page 28: Protein Structure: protein folding

Intermediates and Molten Globules

Molten globule is not just misfolded structure.

Page 29: Protein Structure: protein folding

Native versus Molten golubles

Page 30: Protein Structure: protein folding

Free energy barrier

Page 31: Protein Structure: protein folding

Native folds and energy difference

Page 32: Protein Structure: protein folding

Correct Knotting for the final Str.

• From Alexy Finkelstein’s lecture

Page 33: Protein Structure: protein folding

Is folding hierarchical?• Baldwin, R.L., and Rose, G.D. (1999) Is protein folding

hierarchic? II. Folding intermediates and transition states. Trends Biochem Sci 24: 77-83.

The folding reactions of some small proteins show clear evidence of a hierarchic process, whereas others, lacking detectable intermediates, do not.

Evidence from folding intermediates and transition states suggests that folding begins locally, and that the formation of native secondary structure precedes the formation of tertiary interactions, not the reverse.

Some computational experiment results support a hierarchic model of protein folding.

Page 34: Protein Structure: protein folding
Page 35: Protein Structure: protein folding

Dynamic nature of folding

Protein folding is dynamic, so the intermediates, misfolded and folded structures exist at the same time until all the species become natually folded.

There is a continuous exchange of species during the processes.

Dynamic nature Studying Unfolding of proteins is necessary

Page 36: Protein Structure: protein folding

Network of unfolding pathways

Page 37: Protein Structure: protein folding

Typically accepted stages of folding

1. Forming short segments of secondary structure forming folding nuclei

2. Interaction of nuclei to form domains

3. Interaction of domains to form the molten globule

4. Rearrangement of molten globule to native conformation of a monomer

5. Interaction of monomers to form multimers in multisubunit proteins.

6. Further conformational adjustments.

Page 38: Protein Structure: protein folding

Wait! Here comes the chaperone

• We found some proteins which help other proteins fold correctly.

• Folding became complicated by chaperones

Page 39: Protein Structure: protein folding

Computational Folding

Page 40: Protein Structure: protein folding

The essence of protein folding problem in Bioinformatics

• We have learnt physical and experimental backgrounds of folding.

• For Bioinformatists, the folding problem is on HOW to map and predict all the long range residue-residue interactions.

• Secondary structure prediction is easy. Putting secondary structures to tertiary structure is NOT.

• Solving protein folding problem can be said as : “Ab Initio Structure Prediction”

Page 41: Protein Structure: protein folding
Page 42: Protein Structure: protein folding

Theory of protein structure.

There are two theoretical approaches to the protein folding problem. 

The first approach is to consider proteins merely as long polymer chains whose energies must be minimized by searching in the space of torsion angles or by using simplified lattice models. 

Page 43: Protein Structure: protein folding

All atoms model• Parameters are obtained through high-level quantum

mechanical calculations on short peptide fragments

• Such an approach has several advantages. It assures the generality and allows further refinement upon the availability of more accurate quantum mechanical methods and upon the need for such an improvement.

• Detailed models require both a large number of particles, typically more than 10,000

• A small time step of one to two femtoseconds (10–15 seconds), direct simulation of the folding processes, which take place on a microsecond or larger time scale, has been difficult.

Page 44: Protein Structure: protein folding

An alternative way of looking at the problem is to consider proteins as systems of regular secondary structures.  

In terms of secondary structure, the protein folding process can be represented as a sequence of the following events:

(1) formation of -helices and -sheets by the hydrophobically collapsed peptide chain,

(2) assembly of the regular secondary structure elements into the protein core, and

(3) joining of nonregular loops and the less stable "peripheral" helices and -strands to the core and the association of independently formed domains.     

Page 45: Protein Structure: protein folding

Theoretical protein foldingA theory of protein self-organization must reproduce all

these events to finally calculate three-dimensional structures of proteins. 

The development of this theory requires, first of all, • a design  of new, specialized force field to calculate

free energies of -helices and -sheets relative to the coil (the secondary structures and the coil are represented by large ensembles of conformers) , instead of enthalpies of individual conformers given by molecular mechanics force fields. 

• This theory requires also a design of alternative global energy optimization strategies, such as calculation of the lowest energy partition of the peptide chain into different secondary and supersecondary structures, instead of searching in the space of torsion angles or lattice simulations. 

Page 46: Protein Structure: protein folding

Protein Structure Modelling: Comparative modelling

In 4 years, humans will have almost all protein domain structures in nature.

Instead of solving the difficult protein folding problem, we can model most of protein structures by comparative modelling Methods.

Page 47: Protein Structure: protein folding

The steps of modelling

Sequence homologous Structure Alignment Superimposition adjustment of atomic coordinates (such as by satisfaction of spatial restraints ) Optimization validation Iteration of the process.

Page 48: Protein Structure: protein folding

Modelling process

Page 49: Protein Structure: protein folding

Modeller by Andrej Sali• First, many distance and dihedral angle restraints on

the target sequence are calculated from its alignment with template 3D structures. The form of these

restraints was obtained from a statistical analysis of the relationships between many pairs of homologous

structures. This analysis relied on a database of 105 family alignments that included 416 proteins with known 3D structure [ali & Overington, 1994]. By scanning the database, tables quantifying various correlations were obtained, such as the correlations between two equivalent - distances, or between equivalent mainchain dihedral angles from two related proteins.

Page 50: Protein Structure: protein folding

• These relationships were expressed as conditional probability density functions (pdf's) and can be used directly as spatial restraints. For example, probabilities for different values of the mainchain dihedral angles are calculated from the type of a residue considered, from mainchain conformation of an equivalent residue, and from sequence similarity between the two proteins. Another example is the pdf for a certain - distance given equivalent distances in two related protein structures. An important feature of the method is that the spatial restraints are obtained empirically, from a database of protein structure alignments.

Page 51: Protein Structure: protein folding

• Next, the spatial restraints and CHARMM energy terms enforcing proper stereochemistry [Brooks et al., 1983] are combined into an objective function. Finally, the model is obtained by optimizing the objective function in Cartesian space. The optimization is carried out by the use of the variable target function method [Braun & Gõ, 1985] employing methods of conjugate gradients and molecular dynamics with simulated annealing. Several slightly different models can be calculated by varying the initial structure.

• The variability among these models can be used to estimate the errors in the corresponding regions of the fold.

Page 52: Protein Structure: protein folding

Running Modelling Program

• http://www.expasy.ch/swissmod/SWISS-MODEL.html