BMC Bioinformatics 2005, 6(Suppl 4):S3. Protein Structure Prediction: not a trivial matter

Transcript
Page 1:

BMC Bioinformatics 2005, 6(Suppl 4):S3

Protein Structure Prediction: not a trivial matter

Strict relation between protein function and structure
Gap between known sequences and known tertiary structures is constantly increasing
There is a need for automatic methods
General methodology able to solve the problem has not yet been devised

Page 2:

Protein structure prediction is a very difficult task
Why?

Page 3:

Complex interactions exist between intra-molecular atoms and between the protein and the surrounding environment.
Number of interactions to track grows rapidly with molecule size (pairwise interactions alone grow quadratically with the number of atoms)
The number of possible structures that proteins may possess is extremely large
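
To give a feel for the sizes involved, here is a minimal back-of-the-envelope sketch. It counts only distinct atom pairs (many-body effects are ignored) and estimates the number of conformations under the assumption that each residue independently adopts one of a few discrete states. The default of 3 states per residue and both function names are illustrative assumptions, not figures from the slides.

```python
def pairwise_interactions(n_atoms: int) -> int:
    """Number of distinct atom pairs among n_atoms atoms."""
    return n_atoms * (n_atoms - 1) // 2


def conformation_count(n_residues: int, states_per_residue: int = 3) -> int:
    """Conformations if every residue independently picks one discrete state.

    states_per_residue=3 is an assumed, illustrative value.
    """
    return states_per_residue ** n_residues
```

Under these assumptions a chain of 100 residues already has 3**100 candidate conformations, which is why exhaustive enumeration is not an option.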

Page 4:

The physical basis of protein structural stability is not fully understood
The primary sequence may not fully specify the tertiary structure (chaperones have the ability to induce proteins to fold in specific ways)

Page 5:

Direct simulation of protein folding via methods such as molecular dynamics is not generally reliable, for both practical and theoretical reasons
Distributed computing projects are tackling such simulation difficulties

Page 6:

Distributed computing projects:
– Folding@home (Stanford University's Chemistry Department)
– Predictor@home (Scripps Research Institute)
– Human Proteome Folding Project (part of World Community Grid run by IBM)

Page 7:

Goal of protein structure prediction is to determine the 3D structure of proteins from their amino acid sequence
Some approaches:
– Comparative Protein Modeling: uses previously solved structures as starting points

Page 8:

Comparative Protein Modeling: 2 methods
– homology modeling
– protein threading

Protein threading:
– scans the amino acid sequence of an unknown structure against a database of solved structures
– a scoring function is used to assess the compatibility of the unknown sequence (target sequence) to the known structure (template)
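
The threading idea can be sketched in a few lines. This is a toy illustration, not the scoring function of any real threading program: each template is reduced to a string of per-position environments ('B' for buried, 'E' for exposed), and the score rewards hydrophobic residues in buried positions and polar residues in exposed ones. The template names, the environment strings, the hydrophobic residue set and the +1/-1 scores are all assumptions made for the example, and no alignment is performed.

```python
# Assumed, simplified set of hydrophobic residues (one-letter codes).
HYDROPHOBIC = set("AVLIMFWC")

# Hypothetical template "database": one environment letter per position.
TEMPLATES = {
    "template_1": "BBEEBBEE",
    "template_2": "EEEEBBBB",
}


def compatibility(target: str, environments: str) -> int:
    """Toy score: +1 when residue type suits the environment, -1 otherwise."""
    if len(target) != len(environments):
        raise ValueError("this sketch does no alignment; lengths must match")
    score = 0
    for residue, env in zip(target, environments):
        suits = (residue in HYDROPHOBIC) == (env == "B")
        score += 1 if suits else -1
    return score


def rank_templates(target: str, templates: dict) -> list:
    """Return (score, template name) pairs, best-scoring template first."""
    scored = [(compatibility(target, env), name) for name, env in templates.items()]
    return sorted(scored, reverse=True)
```

Calling `rank_templates("LVSTAIKD", TEMPLATES)` ranks `template_1` first, because its buried and exposed positions line up with the hydrophobic and polar residues of the made-up target.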

Page 9:

Homology Modeling
– Facilitated by the fact that 3D structure of proteins from the same family is more conserved than their primary sequences
– Example: human hemoglobin and leghemoglobin (hemoglobin in legumes)

If proteins are similar at the sequence level then structural similarity can usually be assumed
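
One common way to quantify "similar at the sequence level" is percent identity over an existing alignment. The sketch below assumes the two sequences are already aligned, of equal length, and use '-' for gaps; the function name and the choice to skip gapped columns are assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared
```

Which identity level is high enough to trust a template is a judgement call that this sketch does not make.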

Page 10:

Predicting structure from scratch
– De novo structure prediction (or ab initio structure prediction)
– Requires vast computational resources
– Uses stochastic methods to search possible solutions
– Finding the structure with the lowest free energy is the key element of this approach
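
A stochastic search for a low-energy structure might look like the following sketch, which uses Metropolis acceptance with a simulated-annealing temperature schedule. The slides do not name a specific method, so this is one possible choice. The "structure" here is just a list of angles, and `toy_energy` is a made-up function standing in for a real free-energy model; every parameter value is an assumption.

```python
import math
import random


def toy_energy(angles):
    """Made-up rugged energy (not a force field); global minimum 0 at 60 degrees."""
    total = 0.0
    for a in angles:
        x = math.radians(a - 60.0)
        total += (1.0 - math.cos(x)) + 0.3 * (1.0 - math.cos(3.0 * x))
    return total


def anneal(n_angles=5, steps=20000, t_start=2.0, t_end=0.01, seed=0):
    """Return (best_angles, best_energy) found by a Metropolis search."""
    rng = random.Random(seed)
    angles = [rng.uniform(-180.0, 180.0) for _ in range(n_angles)]
    energy = toy_energy(angles)
    best_angles, best_energy = list(angles), energy
    for step in range(steps):
        # Geometric cooling from t_start towards t_end.
        temperature = t_start * (t_end / t_start) ** (step / steps)
        i = rng.randrange(n_angles)
        old = angles[i]
        angles[i] = old + rng.gauss(0.0, 20.0)  # random perturbation of one angle
        new_energy = toy_energy(angles)
        delta = new_energy - energy
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            energy = new_energy
            if energy < best_energy:
                best_angles, best_energy = list(angles), energy
        else:
            angles[i] = old  # reject the move
    return best_angles, best_energy
```

Accepting some uphill moves early on is what lets the search escape local minima; with a realistic energy function each evaluation is far more expensive, which is where the computational cost comes from.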

Page 11:

Distributed computing
– Folding@home
– Predictor@home
– Human Proteome Folding Project

Employs the unused CPU cycles of personal computers worldwide to analyze scientific data

Page 12:

Computational simulations of model proteins
– most proteins are too large for current technology to simulate folding on an atom by atom basis
– lattice proteins: highly simplified computer models of proteins, in which each amino acid is treated as a single functional unit (a bead)
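
As one concrete example of a lattice protein, the sketch below uses the two-letter HP (hydrophobic-polar) model on a 2D square lattice. The slides do not name a specific lattice model, so this choice is an assumption. Each bead is 'H' or 'P', a fold is a string of moves (U, D, L, R), and the energy is -1 for every pair of H beads that sit on neighbouring lattice sites without being neighbours in the chain.

```python
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold_to_coords(moves: str):
    """Turn a move string into lattice coordinates; None if the chain overlaps itself."""
    coords = [(0, 0)]
    for m in moves:
        dx, dy = MOVES[m]
        x, y = coords[-1]
        nxt = (x + dx, y + dy)
        if nxt in coords:
            return None  # two beads cannot occupy the same site
        coords.append(nxt)
    return coords


def hp_energy(sequence: str, moves: str):
    """Energy of a fold: -1 per H-H contact between non-consecutive beads."""
    if len(moves) != len(sequence) - 1:
        raise ValueError("need exactly one move per bond")
    coords = fold_to_coords(moves)
    if coords is None:
        return None
    position = {c: i for i, c in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        for dx, dy in MOVES.values():
            j = position.get((x + dx, y + dy))
            # j > i + 1 counts each contact once and skips chain neighbours.
            if j is not None and j > i + 1 and sequence[j] == "H":
                energy -= 1
    return energy
```

For the four-bead chain "HPPH", the fold "RUL" brings the two H beads into contact and scores -1, while the straight chain "RRR" scores 0. Models this small can be enumerated exhaustively, which is what makes them useful for studying folding.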