Prediction of protein structure. aim Structure prediction tries to build models of 3D structures of...

Prediction of protein structure


aim

Structure prediction tries to build models of 3D structures of proteins that could be useful for understanding structure-function relationships.

GenBank/EMBL 105,000,000

UniProt 5,200,000

PDB 47,000

The protein folding problem

The information for 3D structures is coded in the protein sequence

Proteins fold into their native structure in seconds

Native structures are both thermodynamically stable and kinetically accessible

AVVTW...GTTWVR

ab-initio prediction

Prediction from sequence using first principles

Ab-initio prediction

“In theory”, we should be able to build native structures from first principles using sequence information and molecular dynamics simulations: “Ab-initio prediction of structure”

1 μs simulation of the folding of a model protein (Duan-Kollman: Science, 277, 1793, 1998).

Simulations of reversible peptide folding (20-200 ns) (Daura et al., Angew. Chem., 38, 236, 1999).

Distributed folding simulations of villin (36 residues) (Zagrovic et al., JMB, 323, 927, 2002).

... the bad news ...

It is not possible to extend simulations to the “seconds” range

Simulations are limited to small systems and to fast folding/unfolding events in known structures: steered dynamics, biased molecular dynamics

Simplified systems

typical shortcuts

Reduce conformational space: 1 or 2 atoms per residue, fixed lattices

Statistical force fields obtained from known structures: average distances between residues, interactions

Use building blocks: 3-9 residues from PDB structures

Some protein from E. coli predicted at 7.6 Å (CASP3, H. Scheraga)

Results from ab-initio

Average error 5 Å - 10 Å

Function cannot be predicted

Long simulations

comparative modelling

The most efficient way to predict protein structure is to compare with known 3D structures

Protein folds

Basic concept

In a given protein, 3D structure is a more conserved characteristic than sequence

Some amino acids are “equivalent” to each other

Evolutionary pressure allows only amino acid substitutions that keep the 3D structure largely unaltered

Two proteins of “similar” sequences must have the “same” 3D structure

Possible scenarios

1. Homology can be recognized using sequence comparison tools or protein family databases (BLAST, Clustal, Pfam, ...).

Structural and functional predictions are feasible

2. Homology exists but cannot be recognized easily (PSI-BLAST, threading)

Low resolution fold predictions are possible. No functional information.

3. No homology

1D predictions. Sequence motifs. Limited functional prediction. Ab-initio prediction

fold prediction

3D struc. prediction

1D prediction

Prediction is based on averaging amino acid properties

AGGCFHIKLAAGIHLLVILVVKLGFSTRDEEASS

Average over a window
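
The window averaging can be sketched as follows. This is a minimal illustration, not a specific published predictor: the Kyte-Doolittle hydropathy scale, the window length of 7 and the function name `window_average` are all assumptions chosen for the example; any per-residue property could be substituted.

```python
# Kyte-Doolittle hydropathy values (an assumed choice of property scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_average(sequence, scale, window=7):
    """Return the mean property value of every full window along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    values = [scale[aa] for aa in sequence]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


sequence = "AGGCFHIKLAAGIHLLVILVVKLGFSTRDEEASS"
profile = window_average(sequence, KD, window=7)

# Report the window with the highest average hydropathy.
best = max(range(len(profile)), key=profile.__getitem__)
print(f"most hydrophobic window starts at residue {best + 1}: "
      f"{sequence[best:best + 7]}")
```

A longer window smooths the profile more strongly; the value to use depends on the property being predicted.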

1D prediction. Properties

Secondary structure propensities

Hydrophobicity (transmembrane)

Accessibility

...

Chou-Fasman propensities (Biochemistry 17, 4277, 1978)

Amino acid   P(α)   P(β)   P(turn)
Ala          1.29   0.90   0.78
Cys          1.11   0.74   0.80
Leu          1.30   1.02   0.59
Met          1.47   0.97   0.39
Glu          1.44   0.75   1.00
Gln          1.27   0.80   0.97
His          1.22   1.08   0.69
Lys          1.23   0.77   0.96
Val          0.91   1.49   0.47
Ile          0.97   1.45   0.51
Phe          1.07   1.32   0.58
Tyr          0.72   1.25   1.05
Trp          0.99   1.14   0.75
Thr          0.82   1.21   1.03
Gly          0.56   0.92   1.64
Ser          0.82   0.95   1.33
Asp          1.04   0.72   1.41
Asn          0.90   0.76   1.23
Pro          0.52   0.64   1.91
Arg          0.96   0.99   0.88
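
As a rough illustration of how such propensities can be used, the sketch below averages the three propensities from the Chou-Fasman table over a sliding window and assigns each residue the state with the highest mean. It is a simplification and not the full Chou-Fasman procedure, which relies on nucleation and extension rules; the function name `predict_states`, the window of 5 and the H/E/T labels are assumptions made for this example.

```python
# (P_alpha, P_beta, P_turn) for each residue, taken from the Chou-Fasman table.
PROPENSITY = {
    "A": (1.29, 0.90, 0.78), "C": (1.11, 0.74, 0.80), "L": (1.30, 1.02, 0.59),
    "M": (1.47, 0.97, 0.39), "E": (1.44, 0.75, 1.00), "Q": (1.27, 0.80, 0.97),
    "H": (1.22, 1.08, 0.69), "K": (1.23, 0.77, 0.96), "V": (0.91, 1.49, 0.47),
    "I": (0.97, 1.45, 0.51), "F": (1.07, 1.32, 0.58), "Y": (0.72, 1.25, 1.05),
    "W": (0.99, 1.14, 0.75), "T": (0.82, 1.21, 1.03), "G": (0.56, 0.92, 1.64),
    "S": (0.82, 0.95, 1.33), "D": (1.04, 0.72, 1.41), "N": (0.90, 0.76, 1.23),
    "P": (0.52, 0.64, 1.91), "R": (0.96, 0.99, 0.88),
}

# Assumed one-letter labels: H = helix, E = strand, T = turn.
STATES = "HET"


def predict_states(sequence, window=5):
    """Assign each residue the state with the highest window-averaged propensity."""
    half = window // 2
    prediction = []
    for i in range(len(sequence)):
        # The window is centred on residue i and truncated at the chain ends.
        segment = sequence[max(0, i - half):min(len(sequence), i + half + 1)]
        means = [
            sum(PROPENSITY[aa][k] for aa in segment) / len(segment)
            for k in range(3)
        ]
        prediction.append(STATES[means.index(max(means))])
    return "".join(prediction)


print(predict_states("AGGCFHIKLAAGIHLLVILVVKLGFSTRDEEASS"))
```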

Some programs (www.expasy.org)

BCM PSSP - Baylor College of Medicine
Prof - Cascaded Multiple Classifiers for Secondary Structure Prediction
GOR I (Garnier et al, 1978) [At PBIL or at SBDS]
GOR II (Gibrat et al, 1987)
GOR IV (Garnier et al, 1996)
HNN - Hierarchical Neural Network method (Guermeur, 1997)
Jpred - A consensus method for protein secondary structure prediction at University of Dundee
nnPredict - University of California at San Francisco (UCSF)
PredictProtein - PHDsec, PHDacc, PHDhtm, PHDtopology, PHDthreader, MaxHom, EvalSec from Columbia University
PSA - BioMolecular Engineering Research Center (BMERC) / Boston
PSIpred - Various protein structure prediction methods at Brunel University
SOPM (Geourjon and Deléage, 1994)
SOPMA (Geourjon and Deléage, 1995)
AGADIR - An algorithm to predict the helical content of peptides

1D Prediction

Original methods: 1 sequence and uniform parameters (25-30%)

Original improvements: Parameters specific to protein classes

Present methods use sequence profiles obtained from multiple alignments and neural networks to extract parameters (70-75%, 98% for transmembrane helix)

Methods for remote homology

Homology can be recognized using PSI-BLAST

Fold prediction is possible using threading methods

Accurate 3D prediction is not possible: No structure-function relationship can be inferred from models

Threading

The unknown sequence is “folded” onto a number of known structures

Scoring functions evaluate the fit between sequence and structure according to statistical functions and sequence comparison
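
The idea can be sketched with a toy scoring function. This is only an illustration under strong assumptions: real threading programs use statistical potentials and gapped alignments, whereas the sketch simply counts agreements between predicted and observed secondary structure and accessibility, sliding the query along each template without gaps. The template names, the strings and the functions `compatibility` and `thread` are hypothetical.

```python
def compatibility(pred_ss, pred_acc, obs_ss, obs_acc):
    """Count positions where predicted and observed states agree."""
    ss = sum(p == o for p, o in zip(pred_ss, obs_ss))
    acc = sum(p == o for p, o in zip(pred_acc.lower(), obs_acc.lower()))
    return ss + acc


def thread(query, templates):
    """Return (score, template name, offset) tuples, best hit first."""
    n = len(query["ss"])
    hits = []
    for name, tpl in templates.items():
        if len(tpl["ss"]) < n:
            continue  # template too short to host the query without gaps
        best_score, best_offset = -1, 0
        for offset in range(len(tpl["ss"]) - n + 1):
            score = compatibility(
                query["ss"], query["acc"],
                tpl["ss"][offset:offset + n], tpl["acc"][offset:offset + n],
            )
            if score > best_score:
                best_score, best_offset = score, offset
        hits.append((best_score, name, best_offset))
    return sorted(hits, reverse=True)


# Hypothetical data: H = helix, B = strand, C = coil; e/E = exposed, b/B = buried.
query = {"ss": "HHHHCCBBBB", "acc": "eebbeeebeb"}
templates = {
    "tplA": {"ss": "CCHHHHCCBBBBCC", "acc": "EEEEBBEEEBEBEE"},
    "tplB": {"ss": "BBBBCCHHHHCC", "acc": "EBEBEEBBEEEE"},
}

hits = thread(query, templates)
print("selected hit:", hits[0])
```

The top-ranked template is the one taken as the selected hit; how far its score stands above the others gives a rough sense of how much the assignment can be trusted.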

[Figure: the query sequence ATTWV....PRKSCT is scored against a set of known structures (scores 10.5 > 5.2 > ...); the top-scoring structure is the SELECTED HIT]

Sequence              ATTWV....PRKSCT
Pred. Sec. Struc.     HHHHH....CCBBBB
Pred. accessibility   eeebb....eeebeb

..........

Sequence   GGTV....ATTW  ...........  ATTVL....FFRK
Obs SS     BBBB....CCHH  ...........  HHHB.....CBCB
Obs Acc.   EEBE.....BBEB ...........  BBEBB....EBBE

Threading accuracy

[Figure: % correct predictions (y axis, 0 to 0.35) versus % sequence identity (x axis, 5 to 25)]

Comparative modelling

Good for sequence identity > 30%

Accuracy is very high for sequence identity > 60%

Reminder: the model must be USEFUL. Only the “interesting” regions of the protein need to be modelled
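
The identity thresholds above can be applied to a pairwise alignment as in the following sketch. It assumes the two sequences are already aligned with `-` as the gap character, and it computes identity over aligned (non-gap) columns, which is only one of several conventions in use. The function names and the wording of the returned labels are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if a != gap and b != gap
    ]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def modelling_regime(identity):
    """Map percent identity to the regimes quoted in the slides."""
    if identity > 60:
        return "comparative modelling, very high accuracy expected"
    if identity > 30:
        return "comparative modelling"
    return "remote homology methods (PSI-BLAST, threading) or ab-initio"


identity = percent_identity("AT-WV", "ATTWI")
print(f"{identity:.1f}% identity -> {modelling_regime(identity)}")
```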

Expected accuracy

Strongly dependent on the quality of the sequence alignment

Strongly dependent on the identity with “template” structures. Very good structures if identity > 60-70%.

Quality of the model is better in the backbone than side chains

Quality of the model is better in conserved regions

Quality test

There are no energy differences between a correct and a wrong model

The structure must be “chemically correct” to use it in quantitative predictions
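
One very simple check of this kind is sketched below. It relies on the fact that consecutive Cα atoms joined by a trans peptide bond lie about 3.8 Å apart, and flags pairs that deviate from that value. This is an illustration only, not what any particular validation program does; the tolerance of 0.3 Å, the function name and the coordinates are assumptions, and genuine cis peptides would also be flagged.

```python
import math

CA_CA_TRANS = 3.8  # typical distance (in Å) between consecutive C-alpha atoms
TOLERANCE = 0.3    # arbitrary cut-off chosen for this illustration


def suspicious_ca_pairs(ca_coords, expected=CA_CA_TRANS, tol=TOLERANCE):
    """Return (i, i + 1, distance) for consecutive C-alpha pairs far from expected."""
    flagged = []
    for i in range(len(ca_coords) - 1):
        distance = math.dist(ca_coords[i], ca_coords[i + 1])
        if abs(distance - expected) > tol:
            flagged.append((i, i + 1, round(distance, 2)))
    return flagged


# Hypothetical C-alpha trace: the last residue is placed too far away.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (13.0, 0.0, 0.0)]
print(suspicious_ca_pairs(coords))
```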

Analysis software

PROCHECK
WHATCHECK
Suite Biotech
PROSA

Prediction software

SwissModel (automatic) http://www.expasy.org/swissmod/

SwissModel Repository http://swissmodel.expasy.org/repository/

3D-JIGSAW (M. Sternberg) http://www.bmm.icnet.uk/servers/3djigsaw/

Modeller (A.Sali) http://salilab.org/modeller/modeller.html

MODBASE (A. Sali) http://alto.compbio.ucsf.edu/modbase-cgi/index.cgi

Final test

The model must account for experimental data (i.e. differences between the unknown sequence and the templates) and be useful for understanding function.