BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction (10/15/07)


BCB 444/544

Lecture 23

Protein Tertiary Structure Prediction

#23_Oct15


Required Reading (before lecture)

Mon Oct 15 - Lecture 23

Protein Tertiary Structure Prediction

• Chp 15 - pp 214 - 230

Wed Oct 17 & Thurs Oct 18 - Lecture 24 & Lab 8 (Terribilini)

RNA Structure/Function & RNA Structure Prediction

• Chp 16 - pp 231 - 242

Fri Oct 19 - Lecture 25

Gene Prediction

• Chp 8 - pp 97 - 112


New Reading & Homework Assignment

ALL: HomeWork #4 (emailed & posted online Sat AM)

Due: Mon Oct 22 by 5 PM (not Fri Oct 19)

Read: Ginalski et al. (2005) Practical Lessons from Protein Structure Prediction, Nucleic Acids Res. 33:1874-91. http://nar.oxfordjournals.org/cgi/content/full/33/6/1874 (PDF posted on website)

• Although somewhat dated, this paper provides a nice overview of protein structure prediction methods and evaluation of predicted structures.

• Your assignment is to write a summary of this paper - for details see HW#4 posted online & sent by email on Sat Oct 13


Seminars Last Week

Dr. Klaus Schulten (Univ of Illinois) - Baker Center Seminar

The Computational Microscope

2:10 PM in E164 Lagomarcino

http://www.bioinformatics.iastate.edu/seminars/abstracts/2007_2008/Klaus_Schulten_Seminar.pdf

• Check out links on Schulten's website (videos, etc)

• http://www.ks.uiuc.edu/~kschulte/

• Great seminar - amazing simulations of dynamics in proteins and large macromolecular assemblies

• Very computationally intensive - very impressive demonstration of power of computation to produce insights not attainable using only experimental approaches


Seminars this Week

BCB List of URLs for Seminars related to Bioinformatics: http://www.bcb.iastate.edu/seminars/index.html

• Oct 18 Thur - BBMB Seminar 4:10 in 1414 MBB

• Sachdev Sidhu (Genentech) Phage peptide and antibody libraries in protein engineering and ligand selection

• Oct 19 Fri - BCB Faculty Seminar 2:10 in 102 ScI

• Lyric Bartholomay (Ent, ISU) TBA


Protein Sequence & Structure: Analysis

• Diamond STING Millennium - Many useful structure analysis tools, including Protein Dossier

http://trantor.bioc.columbia.edu/SMS/

• SwissProt (UniProt) - Protein knowledgebase

http://us.expasy.org/sprot

• InterPro - Sequence analysis tools

http://www.ebi.ac.uk/interpro


Chp 14 - Secondary Structure Prediction

SECTION V STRUCTURAL BIOINFORMATICS

Xiong: Chp 14

Protein Secondary Structure Prediction

• √Secondary Structure Prediction for Globular Proteins

• √Secondary Structure Prediction for Transmembrane Proteins

• √Coiled-Coil Prediction


Where to Find "Actual" Secondary Structure? In the PDB


How Does Predicted Secondary Structure Compare with Actual? (An example)

Predicted - Using 3 methods (from CMD server, Jernigan Group, ISU):

Query  MAATAAEAVASGSGEPREEAGALGPAWDESQLRSYSFPTRPIPRLSQSDPRAEELIENEE
GOR V  CCCCHHHHHHHHCCHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHCCCC
FDM    CCCCCCCCCCCCCCCCCEECCCCCCCCCHHHCCCCCCEECCCCCCCCCCHHHHHHHHCCC
CDM    CCCCHHHHHHCCCCCCCEECCCCCCCCCHHHCCCCCCEECCCCCCCCCCHHHHHHHHCCC

Actual - Calculated from PDB coordinates by DSSP or author:

DSSP
Author


Chp 15 - Tertiary Structure Prediction

SECTION V STRUCTURAL BIOINFORMATICS

Xiong: Chp 15

Protein Tertiary Structure Prediction

• Methods

• Homology Modeling

• Threading and Fold Recognition

• Ab Initio Protein Structural Prediction

• CASP


Structural Genomics - Status & Goal

~ 20,000 "traditional" genes in human genome (recall, this is fewer than earlier estimate of 30,000)

~ 2,000 proteins in a typical cell

> 4.9 million sequences in UniProt (Oct 2007)

> 46,000 protein structures in the PDB (Oct 2007)

Experimental determination of protein structure lags far behind sequence determination!

Goal: Determine structures of "all" protein folds in nature, using combination of experimental structure determination methods (X-ray crystallography, NMR, mass spectrometry) & structure prediction


Structural Genomics Project

TargetDB: Database of Structural Genomics Targets

http://targetdb.pdb.org


Database of Theoretical Structures?

PMDB: Protein Model Database

http://mi.caspur.it/PMDB/help.php

also, via NAR's Molecular Biology Database Collection

http://www.oxfordjournals.org/nar/database/summary/855

Theoretical structural models (predicted) are no longer accepted by the PDB (since 10/15/06), but it is possible to search for models deposited earlier:

http://www.rcsb.org/pdb/search/searchModels.do


Protein Structure Prediction or Protein Folding Problem

"Major unsolved problem in molecular biology"

In cells:

• spontaneous

• assisted by enzymes

• assisted by chaperones

In vitro: many proteins can fold to their "native" states spontaneously & without assistance, but many do not!


Deciphering the Protein Folding Code

• Protein Structure Prediction or Protein Folding Problem

Given the amino acid sequence of a protein, predict its 3-dimensional structure (fold)

• Inverse Folding Problem

Given a protein fold, identify every amino acid sequence that can adopt its 3-dimensional structure


Protein Structure Prediction

Structure is largely determined by sequence

BUT:

• Similar sequences can assume different structures

• Dissimilar sequences can assume similar structures

• Many proteins are multi-functional

2 Major Protein Folding Problems:

1- Determine folding pathway

2- Predict tertiary structure from sequence

Both still largely unsolved problems


Steps in Protein Folding

1- "Collapse" - driving force is burial of hydrophobic aa's (fast - msecs)

2- Molten globule - helices & sheets form, but "loose" (slow - secs)

3- "Final" native folded state - compaction & rearrangement of 2° structures

Native state?

- assumed to be lowest free energy

- may be an ensemble of structures


Protein Dynamics

• Protein in native state is NOT static

• Function of many proteins requires conformational changes, sometimes large, sometimes small

• Globular proteins are inherently "unstable" (NOT evolved for maximum stability)

• Energy difference between native and denatured state is very small (5-15 kcal/mol) (this is equivalent to ~ 2 H-bonds!)

• Folding involves changes in both entropy & enthalpy


Difficulty of Tertiary Structure Prediction

Folding or tertiary structure prediction problem can be formulated as a search for minimum energy conformation

• Search space is defined by psi/phi angles of backbone and side-chain rotamers

• Search space is enormous even for small proteins!

• Number of local minima increases exponentially with number of residues

Computationally it is an exceedingly difficult problem!
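To get a feel for how quickly the search space grows, here is a rough back-of-the-envelope sketch. The figure of 3 backbone conformations per residue and the sampling rate of 10^13 conformations per second are illustrative assumptions (in the spirit of Levinthal's argument), not values from the lecture, and the function names are hypothetical.

```python
# Rough estimate of conformational search-space size.
# ASSUMPTIONS (illustrative only): each residue independently adopts one of
# `states_per_residue` backbone conformations, sampled at `rate_per_sec`.

def search_space_size(n_residues: int, states_per_residue: int = 3) -> int:
    """Number of distinct conformations under the independence assumption."""
    return states_per_residue ** n_residues


def exhaustive_search_years(n_residues: int, states_per_residue: int = 3,
                            rate_per_sec: float = 1e13) -> float:
    """Years needed to enumerate every conformation at the assumed rate."""
    seconds = search_space_size(n_residues, states_per_residue) / rate_per_sec
    return seconds / (365.25 * 24 * 3600)


if __name__ == "__main__":
    for n in (10, 50, 100):
        print(n, f"{search_space_size(n):.2e}",
              f"{exhaustive_search_years(n):.2e} years")
```

Even with these generous assumptions, exhaustive enumeration is out of reach for a 100-residue protein, which is why practical methods rely on templates or heuristic search.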


Tertiary Structure Prediction Methods

2 (or 3) Major Methods:

1. Comparative Modeling:

• Homology Modeling (easiest!)

• Threading and Fold Recognition (harder)

2. Ab Initio Protein Structural Prediction (really hard)


Comparative Modeling?

Comparative modeling - term is sometimes used interchangeably with homology modeling, but also sometimes used to mean both:

• homology modeling

• threading/fold recognition


Ab Initio Prediction

1. Develop energy function

• bond energy

• bond angle energy

• dihedral angle energy

• van der Waals energy

• electrostatic energy

2. Calculate structure by minimizing energy function

• usually Molecular Dynamics (MD) or Monte Carlo (MC)

Ab initio prediction - impractical for most real (long) proteins

• Computationally? very expensive

• Accuracy? Usually poor for all except short peptides (but much improvement recently!)

Provides both folding pathway & folded structure
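The "minimize an energy function by Monte Carlo" idea can be sketched on a toy problem. The energy function below is a made-up sum of dihedral-like terms, not a real force field, and all names and parameter values (`toy_energy`, `kT`, `step`) are assumptions chosen only to illustrate Metropolis sampling.

```python
import math
import random


def toy_energy(angles):
    """Toy 'energy': one dihedral-like term per angle.
    ASSUMPTION: purely illustrative functional form, not a real force field."""
    return sum(1.0 + math.cos(3.0 * a) + 0.5 * math.cos(a) for a in angles)


def metropolis_minimize(n_angles=5, n_steps=5000, kT=0.5, step=0.3, seed=0):
    """Metropolis Monte Carlo search over a vector of angles.
    Returns (initial_energy, best_energy, best_angles)."""
    rng = random.Random(seed)
    angles = [rng.uniform(-math.pi, math.pi) for _ in range(n_angles)]
    energy = toy_energy(angles)
    initial = energy
    best_angles, best_energy = list(angles), energy
    for _ in range(n_steps):
        k = rng.randrange(n_angles)          # pick one angle to perturb
        trial = list(angles)
        trial[k] += rng.uniform(-step, step)
        e_trial = toy_energy(trial)
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability exp(-dE / kT).
        if e_trial <= energy or rng.random() < math.exp(-(e_trial - energy) / kT):
            angles, energy = trial, e_trial
            if energy < best_energy:
                best_angles, best_energy = list(angles), energy
    return initial, best_energy, best_angles
```

Occasionally accepting uphill moves is what lets the search escape local minima, which matters because the number of local minima grows rapidly with chain length.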


Comparative Modeling

Provides folded structure only

Two types:

1) Homology modeling

2) Threading (fold recognition)

Both rely on availability of experimentally determined structures that are "homologous" or at least structurally very similar to target


Homology Modeling

1. Identify homologous protein sequences (PSI-BLAST)

2. Among available structures (in PDB), choose one with closest sequence to target as template (can combine steps 1 & 2 by using PDB-BLAST)

3. Build model by placing target sequence residues in corresponding positions on homologous structure & refine by "tweaking" modeled structure (energy minimization)

Homology modeling - works "well"

• Computationally? "relatively" inexpensive

• Accuracy? higher sequence identity → better model

• Requires ~30% sequence identity with sequence for which structure is known
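As a small illustration of the ~30% identity rule of thumb, the sketch below computes percent identity from a pre-computed pairwise alignment. Note that several definitions of "percent identity" are in use; this one (identical residues divided by gap-free aligned columns) is an assumption, and the helper names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def usable_as_template(aligned_target: str, aligned_template: str,
                       threshold: float = 30.0) -> bool:
    """True if the aligned pair meets the identity threshold (default 30%)."""
    return percent_identity(aligned_target, aligned_template) >= threshold
```

For example, `percent_identity("AL-KGF", "ALKKGW")` ignores the gapped column and scores 4 identical residues out of 5 aligned columns.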


Threading - Fold Recognition

Identify "best" fit between target sequence & template structure

1. Develop energy function

2. Develop template library

3. Align target sequence with each template in library & score

4. Identify top scoring template (1D to 3D alignment)

5. Refine structure as in homology modeling

Threading - works "sometimes"

• Computationally? Can be expensive or cheap, depends on energy function & whether "all atom" or "backbone only" threading is used

• Accuracy? In theory, should not depend on sequence identity (should depend on quality of template library & "luck")

Usually, higher sequence identity to protein of known structure → better model


Threading: the Motivation

• Basic premise: The number of unique structural folds in nature is fairly small (probably 2000-3000)

• Statistics from Protein Data Bank (>46,000 structures): Prior to Structural Genomics Project, 90% of "new" structures submitted to PDB were similar to existing folds in PDB - suggesting that almost all folds in nature have been identified

• Thus, chances for a protein to have a native-like structural fold in PDB are quite good

• Note: Proteins with similar structural folds could be either homologs or analogs


Steps in Threading

Target Sequence: ALKKGF…HFDTSE

Structure Templates

1. Align target sequence with template structures in fold library (usually from the PDB)

2. Calculate energy score to evaluate "goodness of fit" between target sequence & template structure

3. Rank models based on energy scores


Threading Goal - & Issues

Goal: Find "correct" sequence-structure alignment of a target sequence with its native-like fold in template library (usually derived from PDB)

• Structure database - must be "complete"

• Can't build a good model if there is no good template in library!

• Sequence-structure alignment algorithm:

• Bad alignment → Bad score!

• Energy function or Scoring Scheme:

• Must distinguish correct sequence-fold alignment from incorrect sequence-fold alignments

• Must distinguish "correct" fold from close decoys

• Prediction reliability assessment - How to determine whether predicted structure is correct? (or even close?)


Threading: Template database

• Build a database of structural templates, e.g., ASTRAL domain library derived from the PDB

• Sometimes, supplement with additional decoys, e.g., generated using ab initio approach such as Rosetta (Baker)


Threading: Energy function

• Two main methods (& combinations of these)

• Structural profile (environmental): physicochemical properties of amino acids

• Contact potential (statistical): based on contact statistics from PDB; famous one: Miyazawa & Jernigan (ISU)


Protein Threading: Typical energy function

How well does a specific residue fit structural environment?

What is "probability" that two specific residues are in contact?

Alignment gap penalty?

Total energy: Ep + Es + Eg

Goal: Find a sequence-structure alignment that minimizes energy function
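A minimal sketch of how such a total energy could be used to rank templates follows. Reading Ep as the pairwise contact term, Es as the single-residue environment term, and Eg as the gap penalty term is an assumption (the slide does not define the subscripts), and the template names and numbers are hypothetical, for illustration only.

```python
from typing import Dict, List, Tuple


def total_energy(e_pair: float, e_single: float, e_gap: float) -> float:
    """Total threading energy as the sum of the three terms: Ep + Es + Eg."""
    return e_pair + e_single + e_gap


def rank_templates(scores: Dict[str, Tuple[float, float, float]]
                   ) -> List[Tuple[str, float]]:
    """Return (template, total energy) pairs sorted from lowest (best) energy."""
    totals = [(name, total_energy(*terms)) for name, terms in scores.items()]
    return sorted(totals, key=lambda item: item[1])


# Hypothetical (Ep, Es, Eg) values for three templates.
example = {
    "template_A": (-12.0, -3.5, 2.0),
    "template_B": (-8.0, -6.0, 1.5),
    "template_C": (-15.0, -1.0, 6.0),
}

if __name__ == "__main__":
    for name, energy in rank_templates(example):
        print(name, energy)
```

Note how template_C has the best pairwise term but loses overall because of its large gap penalty.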


A Local Example: Rapid Threading Approach for Protein Structure Prediction

Kai-Ming Ho, Physics

Haibo Cao

Yungok Ihm

Zhong Gao

James Morris

Cai-zhuang Wang

Drena Dobbs, GDCB

Jae-Hyung Lee

Michael Terribilini

Jeff Sander

Cao H, Ihm Y, Wang CZ, Morris JR, Su M, Dobbs D, Ho KM (2004) Three-dimensional threading approach to protein structure recognition. Polymer 45:687-697


Motivations for & Assumptions of Ho Threading Algorithm

Goal: Develop a threading algorithm that:

• Is simple & rapid enough to be used in high throughput applications

• Is relatively "insensitive" to sequence similarity between target protein sequence & sequence of template structure (to enhance detection of remote homologs & structures that are similar due to convergent evolution)

• Can be used to answer questions such as:

What are predicted structures of all "unassigned" ORFs in Arabidopsis?

Does Arabidopsis have a protein with structure similar to mammalian Tumor Necrosis Factor (TNF)?

Assumptions:

• Native state of a protein is lowest free energy state

• Hydrophobic interactions drive protein folding


Simplify: Template structure representation

Template structure → contact matrix $C$ ($N \times N$), for residues $i, j = 1, \ldots, N$:

$C_{ij} = 1$ if $r_{ij} \le 6.5$ Å (contact)

$C_{ij} = 0$ otherwise, or if $j$ is a neighbor of $i$ in sequence (non-contact)

Yungok Ihm
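A contact matrix of this kind can be built from residue coordinates with a few lines of code. This is a minimal sketch under stated assumptions: one representative point per residue (the slide does not say which atom is used), a cutoff parameter defaulting to 6.5 Å, and "neighbor in sequence" read as adjacent residues (`min_separation=2`). The function name is hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def contact_matrix(coords: Sequence[Coord], cutoff: float = 6.5,
                   min_separation: int = 2) -> List[List[int]]:
    """C[i][j] = 1 if residues i and j lie within `cutoff` angstroms and are at
    least `min_separation` apart in sequence; 0 otherwise (symmetric)."""
    n = len(coords)
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        # Start at i + min_separation so sequence neighbors are never contacts.
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                c[i][j] = c[j][i] = 1
    return c
```

For four points on a 3.8 Å square, residues 0 and 1 are sequence neighbors (no contact), while pairs (0,2), (0,3) and (1,3) are within the cutoff and count as contacts.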


Simplify: Target Sequence Representation

• Miyazawa-Jernigan (MJ) model: inter-residue contact energy M(i,j) is a quasi-chemical approximation based on pair-wise contact statistics extracted from known protein structures in the PDB: symmetric 20 × 20 matrix = 210 unique values ("letters")

• Li-Tang-Wingreen (LTW): factorize the MJ interaction matrix to reduce the number of parameters associated with amino acids from 210 to 20 q values

• Hydrophobic-Polar (HP): represent amino acids as either hydrophobic (H) or polar (P); Dill et al. demonstrated the utility of this simple binary alphabet representation: 2 values

Compare results with 210 vs 20 vs 2 letter representations

How low can we go?


Simplify: Energy Function

• Interaction "counts" only if two hydrophobic amino acid residues are in contact

• At residue level, pair-wise hydrophobic interaction is dominant:

$E = \sum_{i,j} C_{ij} U_{ij}$

$C_{ij}$: contact matrix

$U_{ij} = U(\text{residue } i, \text{residue } j)$

MJ: $U = U_{ij}$

LTW: $U = Q_i Q_j$

HP: $U \in \{1, 0\}$

Yungok Ihm
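The simplified energy can be sketched directly from its definition. The hydrophobic residue set, the per-residue Q values used in the usage note, and the choice to count each contacting pair once (i < j) are all assumptions made for illustration; they are not taken from the slides.

```python
from typing import Callable, Dict, List

# ASSUMPTION: one common choice of hydrophobic residues for the HP alphabet.
HYDROPHOBIC = set("AVLIMFWC")


def u_hp(a: str, b: str) -> float:
    """HP potential: an interaction counts only if both residues are hydrophobic."""
    return 1.0 if a in HYDROPHOBIC and b in HYDROPHOBIC else 0.0


def make_u_ltw(q: Dict[str, float]) -> Callable[[str, str], float]:
    """LTW-style factorized potential U = Q_i * Q_j from per-residue Q values."""
    return lambda a, b: q[a] * q[b]


def contact_energy(seq: str, contacts: List[List[int]],
                   u: Callable[[str, str], float]) -> float:
    """E = sum over pairs i < j of C_ij * U(seq[i], seq[j]).
    Each contacting pair is counted once."""
    n = len(seq)
    return sum(contacts[i][j] * u(seq[i], seq[j])
               for i in range(n) for j in range(i + 1, n))
```

With a 4-residue contact matrix containing contacts (0,2), (0,3), (1,3) and the sequence `"AVKL"`, the HP potential counts the A-L and V-L contacts but not A-K. Swapping in `make_u_ltw({...})` changes only the pair potential, which is the point of the comparison across 210-, 20- and 2-letter representations.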


Energy calculation: Contact energy

Miyazawa-Jernigan (MJ) matrix $M$: 210 parameters, statistical potential

[Figure: excerpt of the MJ matrix $M$ for residues C, M, F, I, L, V, W]

Li-Tang-Wingreen (LTW): 20 parameters

$\tilde{M}_{ij} = C_2 \{ (q_i + \alpha)(q_j + \alpha) + \beta \}$

Contact Energy: $E_c = \sum_{i,j=1}^{N} (Q_i C_{ij} Q_j + \beta C_{ij})$

with $Q_i = -q_i - \alpha$, $\alpha = -0.6797$, $\beta = -0.2604$

$q_i$, $Q_i$ ~ solubility ~ hydrophobicity

$C$: contact matrix

Yungok Ihm


Summary of Ho Threading Procedure

Template Structure → Contact Matrix ($N \times N$, residues $i, j = 1, \ldots, N$):

$C_{ij} = 1$ if $r_{ij} \le 6.5$ Å; $C_{ij} = 0$ otherwise (or for a neighbor in sequence)

Sequence → Sequence Vector:

AVFMRIHNDIVYNDIANTTQ

$S = (Q_A, Q_V, Q_F, \ldots, Q_E) = (0.7997, 0.9897, 1.1197, \ldots, 0.6497)$

Scoring Function → Contact Energy:

$E_c = \sum_{i,j=1}^{N} (Q_i C_{ij} Q_j + \beta C_{ij})$

Yungok Ihm


Can complexity be further reduced?

Consider simplifying structure representation, too

Sequence (ALKKGF…HFDTSE) - Structure (1D - 3D problem)

Sequence - Contact Matrix (1D - 2D problem)

Sequence - 1D Profile (1D - 1D problem)

Haibo Cao


Examine eigenvectors of contact matrix

Hydrophobic Contacts $\equiv \tilde{T} C T = \sum_{i=1}^{N} \lambda_i \tilde{T} V_i \tilde{V}_i T = \sum_{i=1}^{N} \lambda_i (\tilde{V}_i T)^2 \cong \lambda_1 (\tilde{V}_1 T)^2$

$r_i = \dfrac{\lambda_i (\tilde{V}_i T)^2}{\sum_{j=1}^{N} \lambda_j (\tilde{V}_j T)^2}$

$C$: contact matrix

$\lambda_i$: i-th eigenvalue of $C$

$V_i$: i-th eigenvector of $C$

$V_1$: eigenvector with largest eigenvalue

$r_i$: fraction of hydrophobic contacts from i-th eigenvector

$T$: protein sequence of the template structure

Haibo Cao
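The dominant eigenvector of a contact matrix can be approximated with plain power iteration, as in the sketch below. This is an illustrative stand-in, not the authors' code: power iteration (with a unit diagonal shift, added here so the iteration does not oscillate when eigenvalues come in +/- pairs) is a swapped-in technique, the vector `t` is assumed to be a numeric encoding of the template sequence, and all function names are hypothetical.

```python
import math
from typing import List, Tuple


def mat_vec(m: List[List[float]], v: List[float]) -> List[float]:
    """Matrix-vector product for a dense list-of-lists matrix."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def dominant_eigenpair(c: List[List[float]],
                       n_iter: int = 500) -> Tuple[float, List[float]]:
    """Power iteration on C + I; returns (lambda_1, V_1) of C itself."""
    n = len(c)
    shifted = [[c[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(n_iter):
        w = mat_vec(shifted, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(vi * wi for vi, wi in zip(v, mat_vec(c, v)))  # Rayleigh quotient
    return lam, v


def dominant_fraction(c: List[List[float]], t: List[float]) -> float:
    """lambda_1 (V_1 . T)^2 divided by T~ C T, i.e. r_1 in the notation above."""
    lam, v1 = dominant_eigenpair(c)
    total = sum(ti * x for ti, x in zip(t, mat_vec(c, t)))
    overlap = sum(vi * ti for vi, ti in zip(v1, t))
    return lam * overlap * overlap / total
```

Because a contact matrix also has negative eigenvalues, the first-eigenvector share computed this way can slightly exceed 1; the remaining eigenvectors then contribute a net negative amount.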


Represent contact matrix by its dominant eigenvector (1D profile)

• First eigenvector (with highest eigenvalue) dominates the overlap between sequence and structure

• Higher ranking (rank > 4) eigenvectors are “sequence blind”

Haibo Cao


Threading Alignment Step - now fast!

Align target sequence vector (1D) with eigenvector profile of template structure (1D)

1D Profile: $P = V_1$

Maximize the overlap $S \cdot P$ between the Sequence ($S$) and the profile ($P$), allowing gaps

Calculate contact energy using the alignment: $E_c$

New profile: $P = CP$

Cao et al Polymer 45 (2004)
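One way to maximize the overlap between two 1D vectors while allowing gaps is a standard global dynamic-programming alignment. The sketch below uses a Needleman-Wunsch-style recurrence with a single flat gap penalty; both choices are assumptions made for illustration (the slides do not specify the alignment algorithm, and the method described uses structure-dependent penalties). The function name is hypothetical.

```python
from typing import List


def align_overlap(s: List[float], p: List[float], gap: float = 0.5) -> float:
    """Best global alignment score between sequence vector s and profile p.
    An aligned pair (i, j) scores s[i] * p[j]; every gap costs `gap`."""
    n, m = len(s), len(p)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = -gap * i
    for j in range(1, m + 1):
        score[0][j] = -gap * j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + s[i - 1] * p[j - 1],  # align s_i with p_j
                score[i - 1][j] - gap,                      # s_i opposite a gap
                score[i][j - 1] - gap,                      # p_j opposite a gap
            )
    return score[n][m]
```

Since both inputs are 1D, the alignment costs O(N·M) time, which is what makes this step fast compared with aligning a sequence directly against a 3D structure.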


Parameters for alignment?

• Gap penalty: Insertion/deletion in helices or strands is strongly penalized; smaller penalties for in/dels in loops

Gap penalties apply to alignment score only, not to energy calculation

• Size penalty: If a target residue and aligned template residue differ in radius by > 0.5 Å and if residue is involved in > 2 contacts, alignment is penalized

Size penalties apply to alignment score only, not to energy calculation

[Figure: target sequence ALKKGFG…HFDTSE aligned to a template with Helix and Loop regions labeled]

Yungok Ihm


How to incorporate secondary structure?

• Predict secondary structure of target sequence (PSIPRED, PROF, JPRED, SAM, GOR V)

N+ = total number of matches between predicted & actual secondary structure of template

N- = total number of mismatches

Ns = total number of residues selected in alignment

“Global fitness” : f = 1 + (N+ - N-) / Ns

Emod = f * Ethreading

Yungok Ihm
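The global fitness factor is simple enough to compute directly. In this sketch the two secondary-structure strings are assumed to be already aligned position by position (so Ns is their common length) and every position is treated as either a match or a mismatch, i.e. N- = Ns - N+; both are assumptions, and the function names are hypothetical.

```python
def global_fitness(predicted_ss: str, template_ss: str) -> float:
    """f = 1 + (N+ - N-) / Ns over aligned positions (e.g. strings of H/E/C)."""
    if len(predicted_ss) != len(template_ss):
        raise ValueError("secondary-structure strings must have equal length")
    n_s = len(predicted_ss)
    if n_s == 0:
        raise ValueError("no aligned residues")
    n_plus = sum(1 for a, b in zip(predicted_ss, template_ss) if a == b)
    n_minus = n_s - n_plus
    return 1.0 + (n_plus - n_minus) / n_s


def modified_energy(e_threading: float, predicted_ss: str,
                    template_ss: str) -> float:
    """Emod = f * Ethreading."""
    return global_fitness(predicted_ss, template_ss) * e_threading
```

With this definition f ranges from 0 (no agreement) to 2 (perfect agreement), so a favorable (negative) threading energy is scaled up when the predicted secondary structure agrees with the template and scaled toward zero when it does not.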


How much better is this “fit” than random?

Erelative = Emod - Eshuffled

Emod: E score modified to reflect fit with predicted 2° structure

Eshuffled: Avg E score for same sequence shuffled (randomized) many times (Shuffled Sequence vs Structure)

Yungok Ihm
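The shuffled-sequence baseline can be sketched as follows. Here `score` is a hypothetical callable standing in for the full pipeline that threads a sequence onto one fixed template and returns its modified energy; the number of shuffles and the fixed random seed are assumptions chosen for reproducibility.

```python
import random
from typing import Callable, List


def relative_energy(seq: str, score: Callable[[str], float],
                    n_shuffles: int = 100, seed: int = 0) -> float:
    """E_relative = E(seq) - mean E over shuffled copies of seq.
    Shuffling preserves amino acid composition but destroys residue order."""
    rng = random.Random(seed)
    residues: List[str] = list(seq)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        shuffled_scores.append(score("".join(residues)))
    return score(seq) - sum(shuffled_scores) / n_shuffles
```

A clearly negative result suggests the fit depends on the order of residues rather than on composition alone; a score that depends only on composition gives exactly zero.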


Performance Evaluation? "Blind Test"

CASP5 Competition (CASP7 is most recent)

(Critical Assessment of Protein Structure Prediction)

Given: Amino acid sequence

Goal: Predict 3-D structure (before experimental results published)


Typical Results: (well, actually, our BEST Results):

HO = #1-Ranked CASP5 Prediction for this Target

• Target 174

• PDB ID = 1MG7

Actual Structure

Predicted Structure

T174_1

T174_2

Cao, Ihm, Wang, Dobbs, Ho


Overall Performance in CASP5 Contest

~8th out of 180 (M. Levitt, Stanford)

FR Fold Recognition (targets manually assessed by Nick Grishin)

Rank  Z-Score  Ngood  Npred  NgNW  NpNW  Group-name
1     24.26    9.00   12.00  9     12    Ginalski
2     21.64    7.00   12.00  7     12    Skolnick Kolinski
3     19.55    8.00   12.50  9     14    Baker
4     16.88    6.00   10.00  6     10    BIOINFO.PL
5     15.25    7.00   7.00   7     7     Shortle
6     14.56    6.50   11.50  7     13    BAKER-ROBETTA
7     13.49    4.00   11.00  4     11    Brooks
8     11.34    3.00   6.00   3     6     Ho-Kai-Ming
9     10.45    3.00   5.50   3     6     Jones-NewFold

• FR NgNW - number of good predictions without weighting for multiple models

• FR NpNW - number of total predictions without weighting for multiple models


CASP - Check it out!

Critical Assessment of Protein Structure Prediction

http://predictioncenter.gc.ucdavis.edu/

• CASP7 contest - 2006:

• http://www.predictioncenter.org/casp7/Casp7.html

• Provides assessment of automated servers for protein structure prediction (LiveBench, CAFASP, EVA) & URLs for them

• Related contests & resources:

• Protein Function Prediction (part of CASP)

• CAPRI = Critical Assessment of Predicted Interactions

• New: CASPM = CASP for M = Mutant proteins

• Predict effects of small (point) mutations, e.g., SNPs


Another Convenient List of Links for Protein Prediction Servers

http://en.wikipedia.org/wiki/List_of_protein_structure_prediction_software


Chp 13 - Protein Structure Visualization, Comparison & Classification

SECTION V STRUCTURAL BIOINFORMATICS

Xiong: Chp 13

Protein Structure Visualization, Comparison & Classification

• Protein Structural Visualization

• Protein Structure Comparison

• Protein Structure Classification


Protein Structure Comparison Methods

3 Basic Approaches for Aligning Structures (see Xiong textbook for details)

1. Intermolecular

2. Intramolecular

3. Combined

But, very active research area - many recent new methods

3 Popular Methods:

1. DALI = Distance Matrix Alignment of Structures (Holm)

• FSSP Database

2. SSAP = Sequential Structure Alignment Program (Orengo)

• CATH Database

3. CE = Combinatorial Extension (Bourne)

• VAST at NCBI

URLs:

http://en.wikipedia.org/wiki/Structural_alignment_software

Page 53: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


Another local example: Combining Structure Prediction, Machine Learning & "Real" (wet-lab) Experiments to Investigate the Lentiviral Rev Protein: A Step Toward New HIV Therapies

Susan Carpenter (Washington State Univ), Wendy Sparks, Yvonne Wannemuehler

Drena Dobbs, GDCB, Jae-Hyung Lee, Michael Terribilini

Kai-Ming Ho, Physics, Yungok Ihm, Haibo Cao, Cai-zhuang Wang

Gloria Culver, BBMB, Laura Dutca

Page 54: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


Macromolecular interactions mediated by Rev protein in lentiviruses (HIV & EIAV)

[Figure: schematic spanning Nucleus and Cytoplasm. Labels: Provirus; pre-mRNA (AAAA); Spliceosome; Early: Regulatory Proteins (Tat, Rev); Late: Structural Proteins, Progeny RNA. Rev steps shown: NUCLEAR IMPORT, RNA BINDING (protein-RNA), MULTIMERIZATION, NUCLEAR EXPORT; three interactions are labeled (protein-protein).]

Susan Carpenter

Page 55: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


Rev is essential for lentiviral replication

• Rev is a small nucleocytoplasmic shuttling protein (HIV Rev 116 aa; EIAV Rev 165 aa)

• Recognizes a specific binding site on viral RNA: Rev Responsive Element (RRE)

• Interacts with CRM1 to export incompletely spliced viral RNAs from the nucleus to the cytoplasm

• Specific domains of Rev mediate nuclear localization, RNA binding, and nuclear export

• Critical role of Rev in lentiviral replication makes it an attractive target for antiviral (AIDS) therapy

Page 56: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


Problem: no high resolution Rev structure! Not even for HIV Rev, despite intense effort ($$)

• Why??

• Rev aggregates at concentrations needed for NMR or X-ray crystallography

• What about insights from sequence comparisons?

• "undetectable" sequence similarity among Revs from different lentiviruses (e.g., EIAV vs HIV <10%)

• But:

• We know that lentiviral Rev proteins are functionally "homologous" - even in highly diverse lentiviruses
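A figure such as "<10%" is a percent identity computed over an alignment. A minimal sketch of that arithmetic follows. It assumes the denominator is the number of columns where neither sequence has a gap (other conventions exist), and the two aligned strings are hypothetical toy sequences, not real Rev sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns in which the residues match.

    Assumption: identity is reported relative to columns where neither
    sequence has a gap; other denominators are also in common use.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)

# Hypothetical toy alignment (NOT real Rev sequences).
toy_a = "MKRL-EDQWAR"
toy_b = "MTSLPEN-YGR"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity")
```

At identities as low as those quoted for EIAV vs HIV Rev, the alignment itself is unreliable, which is why the slide calls the similarity "undetectable".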

Page 57: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


Hypothesis: Rev proteins from diverse lentiviruses share structural features critical for function

Approach:

• Computationally model structures of lentiviral Rev proteins

- using structural threading algorithm (with Ho et al)

• Predict critical residues for RNA-binding, protein interaction

- using machine learning algorithms (with Honavar et al)

• Test model and predictions

- using genetic/biochemical approaches (with Carpenter & Culver)

- using biophysical approaches (with Andreotti & Yu groups)

Initially: focus on EIAV Rev & RRE

Page 58: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


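Functional domains: EIAV vs HIV Rev

[Figure: domain maps. EIAV Rev: positions marked 1, 31, 165; exon 1, exon 2; regions labeled NES, RBM, Folding?, NLS; motifs RRDRW, ERLE, KRRRK. HIV-1 Rev: positions marked 1, 116; regions labeled NES, NLS/RBM; motif RQARRNRRRRWR.]

NES - Nuclear Export Signal

NLS - Nuclear Localization Signal

RBM - putative RNA Binding Motif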

Page 59: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


Predicted EIAV Rev Structure

Yungok Ihm

Page 60: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


Comparison of Predicted Rev Structures

[Figure panels: EIAV, HIV, FIV, SIV Dimer, HIV Dimer]

Yungok Ihm

Page 61: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


Predicted vs Experimental Structure of N-terminal region of HIV Rev

A. Predicted Structure HIV Rev N-terminus

B. NMR Structure HIV Rev N-terminal Peptide (Battiste & Williamson)

C. Overlay Alignment of Predicted & NMR Structures

Yungok Ihm

Page 62: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


Location of functional residues EIAV Rev?

Putative RBM

NES Leu36, 45, 49: On surface, consistent with role in nuclear export

Leu95 & Leu109: Buried in core, critical hydrophobic contacts for fold?

Yungok Ihm

Page 63: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


Mutate hydrophobic residues predicted to be critical for helical packing in core

L65 vs L95 & L109

Single mutants: Leu to Ala, Leu to Asp

Double mutants: Leu to Ala

• Single Ala Mutation (L → A): Negligible effect on Rev activity

• Single Asp Mutation (L → D): Insert charged aa in hydrophobic core. Dramatic change in Rev activity?

• Double Ala Mutation (LL → AA): Reduction in Rev activity?

Yungok Ihm
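The mutant panel described above can be enumerated mechanically. The sketch below generates the single (Ala, Asp) and double (Ala) mutant names for L65, L95 and L109, and applies one substitution to the residue 61-100 fragment of EIAV Rev printed on a later slide of this lecture. The function names and the "L65A/L95A" naming of double mutants are assumptions made for this example.

```python
from itertools import combinations

# Residues 61-100 of EIAV Rev, as printed on the sequence slide of this lecture.
FRAGMENT_START = 61
FRAGMENT = "ARRHLGPGPTQHTPSRRDRWIREQILQAEVLQERLEWRIR"

CORE_LEUCINES = [65, 95, 109]  # positions named on the slide

def single_mutants(positions, wild_type="L", substitutions=("A", "D")):
    """Names such as L65A and L65D for every position."""
    return [f"{wild_type}{pos}{sub}" for pos in positions for sub in substitutions]

def double_ala_mutants(positions, wild_type="L"):
    """Names such as L65A/L95A for every pair of positions."""
    return [f"{wild_type}{p}A/{wild_type}{q}A" for p, q in combinations(positions, 2)]

def apply_mutation(fragment, start, mutation):
    """Apply a point mutation like 'L95D' to a fragment numbered from `start`."""
    wild_type, position, new = mutation[0], int(mutation[1:-1]), mutation[-1]
    index = position - start
    if not 0 <= index < len(fragment):
        raise ValueError(f"position {position} is outside this fragment")
    if fragment[index] != wild_type:
        raise ValueError(f"expected {wild_type} at {position}, found {fragment[index]}")
    return fragment[:index] + new + fragment[index + 1:]

singles = single_mutants(CORE_LEUCINES)
doubles = double_ala_mutants(CORE_LEUCINES)
l95d_fragment = apply_mutation(FRAGMENT, FRAGMENT_START, "L95D")
print(singles)
print(doubles)
print(l95d_fragment)
```

The wild-type check guards against numbering mistakes: the mutation is rejected unless the fragment really has a leucine at the stated position. L109 lies outside the printed fragment, so only its name is generated here.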

Page 64: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


Functional Analysis of Rev Structural Mutants in vivo (CAT assay)

[Figure: bar chart, y-axis "Activity of Rev Structural Mutants" (ticks 50, 100, 150). Single Mutations: L65 A, D; L95 A, D; L109 A, D. Double Mutations: L65A L95A; L65A L109A; L95A L109A. Controls: Sham, RI, pcDNA3.]

Wendy Sparks

Page 65: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


Functional domains: EIAV vs HIV Rev

[Figure: domain maps with a green/red color legend for RNA interaction and Protein interaction. HIV-1 Rev: positions marked 1, 116; regions labeled NES, NLS/RBM; motif RQARRNRRRRWR. EIAV Rev: regions labeled NES, RBM, Folding?, NLS; motifs RRDRW, ERLE, KRRRK.]

NES - Nuclear Export Signal

NLS - Nuclear Localization Signal

RBM - putative RNA Binding Motif

Page 66: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


Putative RNA-binding Motifs & Predicted RNA-binding Residues Mapped onto Predicted EIAV Rev Structure

61 71 81 91
ARRHLGPGPT QHTPSRRDRW IREQILQAEV LQERLEWRIR …
++ +++++++ ++++++++++ + +

121 131 141 151 161
HFREDQRGDF SAWGDYQQAQ ERRWGEQSSP RVLRPGDSKRRRKHL
+ ++++ ++ +++ +++++++++++++++

[Figure: predicted EIAV Rev structure with motifs RRDRW, ERLE and KRRRK marked]

Michael Terribilini

Yungok Ihm
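The residue numbers of the three putative motifs can be recovered from the numbered sequence on this slide. A minimal sketch, using only the two segments printed above (residues 101-120 are elided on the slide and are left out here); the function and variable names are hypothetical.

```python
# Sequence segments as printed on the slide, keyed by the number of their
# first residue. Residues 101-120 are elided on the slide and omitted here.
SEGMENTS = {
    61: "ARRHLGPGPTQHTPSRRDRWIREQILQAEVLQERLEWRIR",
    121: "HFREDQRGDFSAWGDYQQAQERRWGEQSSPRVLRPGDSKRRRKHL",
}
MOTIFS = ["RRDRW", "ERLE", "KRRRK"]

def locate_motifs(segments, motifs):
    """Return {motif: [(first_residue, last_residue), ...]} using 1-based numbering."""
    hits = {}
    for start, seq in segments.items():
        for motif in motifs:
            index = seq.find(motif)
            while index != -1:
                first = start + index
                hits.setdefault(motif, []).append((first, first + len(motif) - 1))
                index = seq.find(motif, index + 1)
    return hits

motif_positions = locate_motifs(SEGMENTS, MOTIFS)
for motif, ranges in motif_positions.items():
    print(motif, ranges)
```

With these segments the ERLE motif spans Leu95, one of the core leucines targeted for mutation earlier in the lecture.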

Page 67: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


Express & purify MBP-ERev deletion mutants

[Figure: gel of purified MBP-ERev constructs. Lanes: Marker (60, 42, 30, 22), MBP, 1-165, 31-165, 31-145, 57-165, 57-145, 57-124, 125-165, 146-165. Construct map: MBP fused to ERev 1-165, 31-165, 31-145, 57-165, 57-145, 57-124, 125-165, 146-165, drawn against a domain map with NES, RBM, Folding?, NLS and boundaries at 1, 31, 57, 125, 146, 165.]

Jae-Hyung Lee
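A deletion series like this is informative because each construct keeps a different subset of the putative RNA-binding motifs. The sketch below tabulates which motifs fall entirely inside each construct. The motif ranges are an assumption derived from the residue numbering of the sequence on the preceding slide (RRDRW 76-80, ERLE 93-96, KRRRK 159-163), and the function name is hypothetical.

```python
# Motif residue ranges, derived from the numbered sequence on the preceding
# slide (an assumption of this sketch, not stated on the gel slide itself).
MOTIF_RANGES = {"RRDRW": (76, 80), "ERLE": (93, 96), "KRRRK": (159, 163)}

# Deletion constructs listed on the slide, as (first, last) residue of ERev.
CONSTRUCTS = [(1, 165), (31, 165), (31, 145), (57, 165),
              (57, 145), (57, 124), (125, 165), (146, 165)]

def motifs_retained(construct, motif_ranges):
    """Motifs that lie entirely within the construct's residue range."""
    first, last = construct
    return [name for name, (m_first, m_last) in motif_ranges.items()
            if first <= m_first and m_last <= last]

retained = {f"{a}-{b}": motifs_retained((a, b), MOTIF_RANGES)
            for a, b in CONSTRUCTS}
for name, motifs in retained.items():
    print(f"{name:>8}: {', '.join(motifs) or 'none'}")
```

Comparing binding across constructs that differ by a single motif is what lets the later slides assign a role to each motif.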

Page 68: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


MBP-ERev binds specifically to RRE in vitro

[Figure: UV crosslinking and Competition assays. UV crosslinking, sense and antisense RRE: lanes BSA, MBP, 1-165, 31-165. Competition: lanes No protein, No cold RRE, Cold RRE. Band label: Undigested 32P-RRE.]

Jae-Hyung Lee

Page 69: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.

EIAV Rev: Binding Predictions vs Experiments

PREDICTED: Structure; Protein binding residues; RNA binding residues

VALIDATED: Protein binding residues; RNA binding residues

41 51
GPLESDQWCRVLRQSLPEEKISSQTCI
++++++++ ++

61 71 81 91
ARRHLGPGPTQHTPSRRDRWIREQILQAEVLQERLEWRI
+++++++++++++++ ++++++++++++++++

131 141 151 161
QRGDFSAWGDYQQAQERRWGEQSSPRVLRPGDSKRRRKHL
++++++++++ ++ +++ ++++++ + ++++++++++++++++++++

[Figure: gel lanes 57-165, MBP, WT, 31-165, 31-145, 145-165. Domain map with boundaries at 31, 57, 125, 145, 165; regions labeled NES, RBM, FOLD?, NLS/RBM; motifs RRDRW, ERLE, KRRRK.]

Lee et al (2006) J Virol 80:3844

Terribilini et al (2006) PSB 11:415

Jae-Hyung Lee

Page 70: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


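Roles of Putative RNA Binding Motifs?

[Figure: EIAV Rev domain map with boundaries at 1, 31, 57, 124, 146, 165; regions labeled NES, RBD, RBD, NLS. Motifs and their mutant versions: RRDRW → AADAA; ERLE → AALA, ERDE; KRRRK → KAAAK.]

Jae-Hyung Lee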

Page 71: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.

Rev RNA Binding Motifs: Predicted vs Experiment

Motif mutants: AADAA, AALA, KAAAK, ERDE

PREDICTED: Structure; Protein binding residues; RNA binding residues

VALIDATED: Protein binding residues; RNA binding residues

41 51
GPLESDQWCRVLRQSLPEEKISSQTCI
++++++++ ++

61 71 81 91
ARRHLGPGPTQHTPSRRDRWIREQILQAEVLQERLEWRI
+++++++++++++++ ++++++++++++++++

131 141 151 161
QRGDFSAWGDYQQAQERRWGEQSSPRVLRPGDSKRRRKHL
++++++++++ ++ +++ ++++++ + ++++++++++++++++++++

[Figure: gel lanes KAAAK, AADAA, AALA, ERDE, WT. Domain map with boundaries at 31, 57, 125, 145, 165; regions labeled NES, RBM, FOLD?, NLS/RBM, NLS; motifs RRDRW, ERLE, KRRRK.]

Jae-Hyung Lee

Page 72: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.

Summary: Predictions vs Experiments

41 51
GPLESDQWCRVLRQSLPEEKISSQTCI
++++++++ ++

61 71 81 91
ARRHLGPGPTQHTPSRRDRWIREQILQAEVLQERLEWRI
+++++++++++++++ ++++++++++++++++

131 141 151 161
QRGDFSAWGDYQQAQERRWGEQSSPRVLRPGDSKRRRKHL
++++++++++ ++ +++ ++++++ + ++++++++++++++++++++

[Figure: domain map with boundaries at 31, 57, 125, 145, 165; regions labeled NES, RBM, FOLD, NLS/RBM; motifs RRDRW, ERLE, KRRRK.]

Lee et al (2006) J Virol 80:3844

Terribilini et al (2006) PSB 11:415

Page 73: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


Conclusions & Future Directions

Combination of computational & wet lab approaches revealed that:

• EIAV Rev has a bipartite RNA binding domain

• Two Arg-rich RBMs are critical

• RRDRW in central region (but not ERLE)

• KRRRK at C-terminus, overlapping the NLS

• Based on computational modeling, the RBMs are in close proximity within the 3-D structure of the protein

• Lentiviral Rev proteins & their cognate RRE binding sites may be more similar in structure than has been appreciated

Lee et al (2006) J Virol 80:3844

Terribilini et al (2006) PSB 11:415

Future:

Computational: Use Rev-RRE model system to discover "predictive rules" for protein-RNA recognition

Experimental?

Page 74: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


Experimentally determine the structure of Rev-RRE complex !!!

Page 75: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


Building “Designer” Zinc Finger DNA-binding Proteins

J Sander, P Zaback, F Fu, J Townsend, R Winfrey, D Wright, K Joung, L Miller, D Dobbs, D Voytas

Wright et al (2006) Nature Protocols

Sander et al (2007) Nucleic Acids Res

Page 76: 110/15/07BCB 444/544 F07 ISU Dobbs #23 - Protein Tertiary Structure Prediction BCB 444/544 Lecture 23  Protein Tertiary Structure Prediction #23_Oct15.


Chp 16 - RNA Structure Prediction

SECTION V STRUCTURAL BIOINFORMATICS

Xiong: Chp 16 RNA Structure Prediction (Terribilini)

• Introduction

• Types of RNA Structures

• RNA Secondary Structure Prediction Methods

• Ab Initio Approach

• Comparative Approach

• Performance Evaluation