Protein Threading Zhanggroup 2003 10 22. Overview Background protein structure protein folding and...


Protein Threading

Zhanggroup, 2003-10-22


Overview

Background: protein structure, protein folding, and designability

Protein threading

Current limitations to protein threading


Computational complexity of certain formulations of the protein threading problem

Performance of protein threading systems

References


Protein Structure

Primary, secondary, tertiary structure


We can only refer to the structure of a protein if a particular environment is assumed:

solvent environment (aqueous, trans-membrane, ...)

temperature, pH, etc.

Different environments yield different structures, or no stable structure at all


Protein molecules are not completely rigid structures:

kinetic energy, energetic collisions with solvent molecules

vibrations, sidechain conformational changes

flexible sections of the peptide chain

The native tertiary structure of a protein is thus an average


Protein Folding

Protein folding = searching for a conformation having minimum energy
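
As a toy illustration of folding as energy minimisation, the sketch below enumerates every self-avoiding walk of a short chain on a 2D square lattice and scores each one with the HP model, where every contact between two non-bonded H (hydrophobic) residues contributes -1. The lattice, the HP energy and the names `conformations`, `energy` and `min_energy` are assumptions made for illustration only; they do not come from these slides, and real proteins are not searched this way.

```python
from itertools import product

# The four unit moves on a 2D square lattice.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}

def conformations(n):
    """Yield every self-avoiding walk of n residues, starting at the origin."""
    for moves in product(MOVES, repeat=n - 1):
        pos = [(0, 0)]
        for m in moves:
            dx, dy = MOVES[m]
            pos.append((pos[-1][0] + dx, pos[-1][1] + dy))
        if len(set(pos)) == n:  # no lattice site is visited twice
            yield pos

def energy(seq, pos):
    """HP energy: -1 per pair of H residues adjacent on the lattice but not in the chain."""
    e = 0
    for i in range(len(seq)):
        for j in range(i + 2, len(seq)):
            if seq[i] == "H" and seq[j] == "H":
                dist = abs(pos[i][0] - pos[j][0]) + abs(pos[i][1] - pos[j][1])
                if dist == 1:
                    e -= 1
    return e

def min_energy(seq):
    """Lowest HP energy over all conformations of seq (exhaustive search)."""
    return min(energy(seq, pos) for pos in conformations(len(seq)))

print(min_energy("HPPH"))
```

The enumeration visits 4^(n-1) move strings for a chain of n residues, so even in this drastically simplified model exhaustive search is only practical for very short chains.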


Factors in protein folding

hydrophobic effects

electrostatic charges in residues

hydrogen bonding

Chaperonins, ribosomes


3 stages of folding

denatured, unfolded state

molten globule state

native, compact state

Most proteins will return to their native state after forced denaturation


The Protein Folding Problem

Given a protein's amino acid sequence, what is its tertiary structure?

The protein folding problem is hard


Direct approach: molecular dynamics simulation

Simulate on an atomic level the folding of a single protein molecule

protein = thousands of atoms

solvent environment = hundreds to thousands of molecules => thousands of atoms


Sub-picosecond time scales

run the simulation for 1-5 seconds

We need more years of Moore's law to make this computation feasible
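
The cost implied by these numbers can be estimated with simple arithmetic. The sketch below assumes an integration step of 1 femtosecond (one plausible sub-picosecond choice; the slides do not give a value) and counts the steps needed to cover 1 to 5 seconds of simulated time. The name `md_steps` and its parameters are hypothetical.

```python
FS_PER_SECOND = 10**15  # femtoseconds in one second

def md_steps(simulated_seconds, timestep_fs=1):
    """Number of integration steps needed to cover the simulated time.

    timestep_fs is the assumed step length in femtoseconds.
    """
    return simulated_seconds * FS_PER_SECOND // timestep_fs

for t in (1, 5):
    print(f"{t} s of folding at a 1 fs step: {md_steps(t):.0e} steps")
```

Each of those steps also has to evaluate the forces among the thousands of protein and solvent atoms, which is where the bulk of the work lies.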


Designability

A protein with a stable native state cannot have another low-energy state nearby in conformational space

A structure is highly designable if its minimum energy state has no low-energy neighbours


Protein Threading

Inverse protein folding problem: given a tertiary structure, find an amino acid sequence that folds to that structure

Protein threading: given a library of possible protein folds and an amino acid sequence, find the fold with the best sequence -> structure alignment (threading)


Evolution depends on designability to preserve function under mutation

Estimate: only a limited number of different protein structures exist in nature (Chothia, 1992)


Four components:

a library of protein folds (templates)

a scoring function to measure the fitness of a sequence -> structure alignment

a search technique for finding the best alignment between a fixed sequence and structure

a means of choosing the best fold from among the best scoring alignments of a sequence to all possible folds
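
The four components can be sketched end to end in a few lines. Everything concrete here is an assumption made for illustration: the two-template `LIBRARY`, the `PREF` table, the +1/-1 scoring, the ungapped search and the function names are all hypothetical, and a real system would also normalise scores across templates before comparing them.

```python
# Component 1: a (hypothetical) fold library. Each template is a string of
# per-position environments: H = helix, E = strand, C = coil.
LIBRARY = {"helical": "CHHHHHHC", "strand": "CEEEECC"}

# Component 2: a made-up preference table. PREF[env] is the set of residues
# assumed to favour that environment.
PREF = {"H": set("AELMQK"), "E": set("VIYFWT"), "C": set("GPNDS")}

def score(seq, template, offset):
    """+1 for each residue placed in a preferred environment, -1 otherwise."""
    return sum(1 if seq[offset + k] in PREF[env] else -1
               for k, env in enumerate(template))

def best_alignment(seq, template):
    """Component 3: ungapped search over every offset of the template in seq."""
    offsets = range(len(seq) - len(template) + 1)
    return max(((score(seq, template, o), o) for o in offsets), default=None)

def recognise_fold(seq):
    """Component 4: pick the template whose best alignment scores highest."""
    results = {}
    for name, template in LIBRARY.items():
        hit = best_alignment(seq, template)
        if hit is not None:  # templates longer than seq are skipped
            results[name] = hit
    return max(results.items(), key=lambda kv: kv[1][0])

print(recognise_fold("GAELMQKG"))
```

The result is a pair of the chosen template name and its `(score, offset)` alignment.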


Scoring Schemes for Sequence -> Structure Alignments

The scoring scheme for a particular threading of a sequence onto a structure measures the degree to which environmental preferences are satisfied

Different amino acid types prefer different environments, e.g.:

structural preferences: in helix, in sheet

not exposed to solvent

pairwise interactions with neighbouring amino acids
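
A scoring function of this kind is often written as a sum of a per-residue environment term and a pairwise contact term. The sketch below shows that shape only. The tables `ENV` and `PAIR`, their values, the "lower is better" sign convention and the name `threading_score` are all assumptions for illustration, not values from these slides.

```python
# Hypothetical per-residue environment scores (lower = more favourable),
# keyed by (residue, environment). Missing entries score 0.
ENV = {("L", "buried"): -1.0, ("L", "exposed"): 0.5,
       ("K", "buried"): 1.0, ("K", "exposed"): -0.5}

# Hypothetical pairwise contact scores, keyed by an unordered residue pair.
PAIR = {frozenset("L"): -1.0, frozenset("LK"): 0.2, frozenset("K"): 0.6}

def threading_score(residues, environments, contacts):
    """Score a placement of residues onto structure positions.

    residues[k] is the amino acid placed at position k, environments[k] is
    that position's environment, and contacts lists the position pairs (i, j)
    that are neighbours in the structure.
    """
    singleton = sum(ENV.get((r, e), 0.0)
                    for r, e in zip(residues, environments))
    pairwise = sum(PAIR.get(frozenset((residues[i], residues[j])), 0.0)
                   for i, j in contacts)
    return singleton + pairwise

envs = ["buried", "exposed", "buried"]
print(threading_score("LKL", envs, [(0, 2)]))
print(threading_score("KLK", envs, [(0, 2)]))
```

The pairwise term is the part that couples distant positions: the score of a residue at position 0 depends on which residue ends up at position 2.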


Formal Statement of the Protein Threading Problem

C is a protein core having m segments C_i, each representing a set of contiguous amino acids. Let c_i be the length of C_i.

Sequence a = a_1 a_2 … a_n of amino acids


Current limitations to protein threading

Statistical problems

Definition of neighbor and/or pairwise contact environments:

energetic neighbor ? contact neighbor


Computational Complexity of Finding an Optimal Alignment

The complexity of the protein threading problem depends on whether:

(I) variable-length gaps are allowed in alignments

(II) the scoring function for an alignment incorporates pairwise interactions between amino acids


Property (I) makes the size of the search space exponential in the length of the sequence

Property (II) forces a solution to take non-local effects into account
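
To get a feel for how quickly the search space grows under Property (I), the sketch below counts the ways to place m core segments of lengths c_1 … c_m on a sequence of length n. It assumes the segments keep their order, do not overlap and have no limits on the loop lengths between them (assumptions made here, not constraints taken from the slides). Under those assumptions the n - Σc_i free residues are distributed over m + 1 gaps, giving C(n - Σc_i + m, m) placements. The function names are hypothetical, and the second function is only a brute-force check of the closed form.

```python
from itertools import combinations
from math import comb

def count_threadings(n, core_lengths):
    """Closed form: spread the free residues over the m + 1 gaps."""
    free = n - sum(core_lengths)
    m = len(core_lengths)
    return comb(free + m, m) if free >= 0 else 0

def count_by_enumeration(n, core_lengths):
    """Brute-force check: try every increasing tuple of start positions."""
    m = len(core_lengths)
    total = 0
    for starts in combinations(range(n), m):
        # Each segment must end before the next one starts (or before n).
        fits = all(
            starts[i] + core_lengths[i] <= (starts[i + 1] if i + 1 < m else n)
            for i in range(m)
        )
        total += fits
    return total

print(count_threadings(10, [2, 3]), count_by_enumeration(10, [2, 3]))
print(count_threadings(300, [10] * 10))
```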


Any protein threading scheme with both properties is NP-complete

(3-SAT: Lathrop, 1994)

(MAX-CUT: Akutsu and Miyano, 1999)


Thus all protein threading approaches can be divided into four groups:

1. no variable-length gaps allowed

2. no pairwise interactions considered in scoring function

3. no optimal solution guarantee

4. exponential runtime
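
As an illustration of group 2: when the score is a sum of independent per-segment terms, an optimal threading with variable-length gaps can be found by dynamic programming in time proportional to m·n. The sketch below is one such formulation, not a method taken from these slides. `best_threading` and the callback `seg_score(i, start)` are hypothetical names, higher scores are assumed to be better, and loops are assumed to have no length limits.

```python
def best_threading(n, core_lengths, seg_score):
    """Maximise the sum of seg_score(i, start) over ordered, non-overlapping
    placements of the core segments on a sequence of length n.

    Returns (best_total, starts), or None if the segments do not fit.
    """
    m = len(core_lengths)
    NEG = float("-inf")
    # best[i][p]: best score for segments i..m-1 using only positions p..n-1.
    best = [[NEG] * (n + 1) for _ in range(m)] + [[0.0] * (n + 1)]
    take = [[False] * (n + 1) for _ in range(m)]
    for i in range(m - 1, -1, -1):
        for p in range(n - 1, -1, -1):
            skip = best[i][p + 1]          # leave position p in a gap
            end = p + core_lengths[i]
            place = NEG                    # start segment i at position p
            if end <= n and best[i + 1][end] > NEG:
                place = seg_score(i, p) + best[i + 1][end]
            best[i][p] = max(skip, place)
            take[i][p] = place > NEG and place >= skip
    if best[0][0] == NEG:
        return None
    # Trace back the chosen start positions.
    starts, i, p = [], 0, 0
    while i < m:
        if take[i][p]:
            starts.append(p)
            p += core_lengths[i]
            i += 1
        else:
            p += 1
    return best[0][0], starts

# Toy score: segment i prefers to start at position target[i].
target = [1, 4]
print(best_threading(6, [2, 2], lambda i, t: -abs(t - target[i])))
```

Once pairwise terms between segments are added, the score of one placement depends on the others and this simple recurrence no longer applies.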


Performance of Protein Threading Systems

CASP1 (1994), CASP2 (1996), CASP3 (1998): Critical Assessment of Structure Prediction meetings

protein threading methods have consistently been the winners

success depends on structural similarity of target to known structures

successful even when target sequence and library sequence have low homology


Much room for improvement in all areas of protein threading, e.g.:

algorithms for searching the threading space

reliable, biologically accurate scoring functions