Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II


Page 1: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

Bioinformatics: Practical Application of Simulation and Data Mining

Protein Folding II

Prof. Corey O’Hern

Department of Mechanical Engineering

Department of Physics

Yale University

Page 2: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

What did we learn about proteins?

• Many degrees of freedom; exponentially growing number of energy minima/structures

• Folding is the process of exploring the energy landscape to find the global energy minimum

• Need to identify pathways in the energy landscape; the number of pathways grows exponentially with the number of structures

• Coarse-graining/clumping required

• Transitions are temperature dependent

[Figure: energy landscape; labels: energy minimum, transition]

Page 3: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

Coarse-grained (continuum, implicit solvent, Cα) models for proteins

J. D. Honeycutt and D. Thirumalai, “The nature of folded states of globular proteins,” Biopolymers 32 (1992) 695.

T. Veitshans, D. Klimov, and D. Thirumalai, “Protein folding kinetics: timescales, pathways and energy landscapes in terms of sequence-dependent properties,” Folding & Design 2 (1996) 1.

Page 4: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

3-letter Cα model: $B_9N_3(LB)_4N_3B_9N_3(LB)_5L$

B = hydrophobic, N = neutral, L = hydrophilic

Number of sequences for $N_m = 46$: $N_{\mathrm{sequences}} = 3^{N_m} \sim 10^{22}$

Number of structures per sequence: $N_p \sim \exp(aN_m) \sim 10^{19}$
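As a quick check of the counting on this slide, the sketch below expands the sequence shorthand and evaluates both estimates. The helper name `expand` is hypothetical, and the constant `a` is back-solved from the slide's order-of-magnitude figure of 10^19 rather than taken from the lecture.

```python
import math
import re

def expand(shorthand: str) -> str:
    """Expand shorthand such as 'B9N3(LB)4' into a flat string of bead types."""
    parts = []
    # Each token is a parenthesised group or a single letter, plus an optional repeat count.
    for group, letter, count in re.findall(r"(?:\(([A-Z]+)\)|([A-Z]))(\d*)", shorthand):
        unit = group or letter
        parts.append(unit * (int(count) if count else 1))
    return "".join(parts)

sequence = expand("B9N3(LB)4N3B9N3(LB)5L")
n_m = len(sequence)        # number of beads in the chain
n_sequences = 3 ** n_m     # three bead types are possible at every site

# Assumption: solve exp(a * n_m) = 1e19 for a, using the slide's estimate of N_p.
a = 19 * math.log(10) / n_m

print(n_m, f"{n_sequences:.2e}", round(a, 2))
```

The parser handles only one level of parentheses, which is all this particular sequence needs.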

Page 5: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

different mapping?

and dynamics

Page 6: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

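Molecular Dynamics: Equations of Motion

$$\vec{F}_i = m_i \vec{a}_i = m_i \frac{d^2 \vec{r}_i}{dt^2}$$

$\vec{r}_i(t)$ for $i = 1, \ldots, N_{\mathrm{atoms}}$

$$\vec{F}_i = -\frac{\partial V}{\partial \vec{r}_i}$$

Coupled 2nd-order differential equations

How are they coupled?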
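One way to see the coupling: the potential V depends on all coordinates at once, so the force on atom i changes whenever another atom moves. The sketch below illustrates this with a central finite difference on a hypothetical two-particle spring potential in 1D. The potential and the function names are assumptions for illustration, not taken from the slides.

```python
def potential(x):
    """Hypothetical V: two particles on a line joined by a spring of rest length 1."""
    return 0.5 * (x[1] - x[0] - 1.0) ** 2

def force(x, i, h=1e-6):
    """Estimate F_i = -dV/dx_i with a central finite difference of step h."""
    up = list(x)
    up[i] += h
    down = list(x)
    down[i] -= h
    return -(potential(up) - potential(down)) / (2 * h)

f0_near = force([0.0, 1.5], 0)  # spring stretched by 0.5
f0_far = force([0.0, 2.0], 0)   # particle 1 moved; the force on particle 0 changes
```

Particle 0 did not move between the two calls, yet the force on it differs, which is exactly what makes the equations of motion a coupled system.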

Page 7: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

(iv) Bond length potential

Page 8: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

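Pair Forces: Lennard-Jones Interactions

[Figure: beads $i$ and $j$ at positions $\vec{r}_i$ and $\vec{r}_j$, joined by the separation vector $\vec{r}_{ij}$]

Parallelogram rule: $\vec{r}_j + \vec{r}_{ij} = \vec{r}_i$, so $\vec{r}_{ij} = \vec{r}_i - \vec{r}_j$

Force on $i$ due to $j$:

$$\vec{F}_{ij} = -\frac{dV}{dr_{ij}}\,\hat{r}_{ij}$$

$-dV/dr_{ij} > 0$: repulsive; $-dV/dr_{ij} < 0$: attractive

$$\vec{F}_i = \sum_j \vec{F}_{ij}$$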

Page 9: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

‘Long-range interactions’

[Figure: $V(r)$ versus $r/\sigma$. Curve labels: BB; NB, NL, NN; LL, LB. Annotations: hard core; $r^* = 2^{1/6}\sigma$; attractions where $-dV/dr < 0$]
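A minimal sketch of the attractive pair interaction, assuming the standard 12-6 Lennard-Jones form, which reproduces the $r^* = 2^{1/6}\sigma$ crossover marked on the plot. The slide does not give the functional form for the other pair types, so they are not modeled here. The function names and the default `eps` and `sigma` values are illustrative.

```python
import math

def lj_potential(r, eps=1.0, sigma=1.0):
    """Standard 12-6 Lennard-Jones potential (assumed form)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)

def lj_force_magnitude(r, eps=1.0, sigma=1.0):
    """-dV/dr: positive means repulsive, negative means attractive."""
    sr6 = (sigma / r) ** 6
    return 24.0 * eps * (2.0 * sr6 * sr6 - sr6) / r

def total_force(i, positions, eps=1.0, sigma=1.0):
    """F_i = sum over j of F_ij, with F_ij directed along r_ij = r_i - r_j."""
    total = [0.0, 0.0, 0.0]
    for j, rj in enumerate(positions):
        if j == i:
            continue
        rij = [a - b for a, b in zip(positions[i], rj)]
        r = math.sqrt(sum(c * c for c in rij))
        f = lj_force_magnitude(r, eps, sigma)
        for k in range(3):
            total[k] += f * rij[k] / r
    return total

r_star = 2.0 ** (1.0 / 6.0)  # separation at which the force changes sign
```

Below `r_star` the force pushes the beads apart (the hard core); above it the beads attract.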

Page 10: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

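Bond Angle Potential

$$V_b(\theta_{ijk}) = \frac{k_0}{2}\left(\theta_{ijk} - \theta_0\right)^2, \qquad \theta_0 = 105^\circ$$

[Figure: beads $i$, $j$, $k$ with bond angle $\theta_{ijk}$ at bead $j$]

$$\cos\theta_{ijk} = \hat{r}_{ji} \cdot \hat{r}_{jk}, \qquad \theta_{ijk} \in [0, \pi]$$

$$\vec{F}_j^{\,b} = -\frac{dV_b}{d\theta_{ijk}}\frac{d\theta_{ijk}}{d\vec{r}_j} = -k_0\left(\theta_{ijk} - \theta_0\right)\frac{d\theta_{ijk}}{d\vec{r}_j}$$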
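The bond angle and its harmonic energy can be sketched as follows. The value $\theta_0 = 105^\circ$ is from the slide; the stiffness default `k0=1.0` is a placeholder, since the slide does not give a value, and the function names are illustrative.

```python
import math

THETA0 = math.radians(105.0)  # equilibrium bond angle from the slide

def bond_angle(ri, rj, rk):
    """Angle at bead j between the bonds to beads i and k, in [0, pi]."""
    u = [a - b for a, b in zip(ri, rj)]
    w = [a - b for a, b in zip(rk, rj)]
    dot = sum(a * b for a, b in zip(u, w))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(a * a for a in w))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def bond_angle_energy(theta, k0=1.0, theta0=THETA0):
    """V_b = (k0 / 2) * (theta - theta0)^2; k0 is a placeholder value."""
    return 0.5 * k0 * (theta - theta0) ** 2

theta = bond_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
energy = bond_angle_energy(theta, k0=2.0)
```

Flipping the sign of both bond vectors leaves the dot product unchanged, so the result does not depend on whether $\vec{r}_{ji}$ is taken to point toward or away from bead $j$.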

Page 11: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

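Dihedral Angle Potential

$$V_d(\varphi_{ijkl}) = A\left(1 + \cos\varphi_{ijkl}\right) + B\left(1 + \cos 3\varphi_{ijkl}\right)$$

$$\cos\varphi_{ijkl} = \frac{\left(\vec{r}_{ij} \times \vec{r}_{kj}\right) \cdot \left(\vec{r}_{jk} \times \vec{r}_{lk}\right)}{\left|\vec{r}_{ij} \times \vec{r}_{kj}\right|\,\left|\vec{r}_{jk} \times \vec{r}_{lk}\right|}$$

$$\vec{F}_j^{\,d} = -\frac{dV_d}{d\varphi_{ijkl}}\frac{d\varphi_{ijkl}}{d\vec{r}_j} = \left(A\sin\varphi_{ijkl} + 3B\sin 3\varphi_{ijkl}\right)\frac{d\varphi_{ijkl}}{d\vec{r}_j}$$

[Figure: $V_d(\varphi_{ijkl})$ versus $\varphi_{ijkl}$; one curve labeled "Successive N’s"]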
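A minimal sketch of the dihedral term, using the cross-product expression for $\cos\varphi_{ijkl}$ with the convention $\vec{r}_{ij} = \vec{r}_i - \vec{r}_j$. The coefficients `A` and `B` default to placeholder values, since the slide does not list them, and the helper names are illustrative.

```python
import math

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cos_dihedral(ri, rj, rk, rl):
    """cos(phi) from the normals r_ij x r_kj and r_jk x r_lk."""
    n1 = cross(sub(ri, rj), sub(rk, rj))
    n2 = cross(sub(rj, rk), sub(rl, rk))
    return dot(n1, n2) / math.sqrt(dot(n1, n1) * dot(n2, n2))

def dihedral_energy(cos_phi, A=1.0, B=1.0):
    """V_d = A(1 + cos phi) + B(1 + cos 3phi); A and B are placeholders."""
    cos_3phi = 4.0 * cos_phi ** 3 - 3.0 * cos_phi  # triple-angle identity
    return A * (1.0 + cos_phi) + B * (1.0 + cos_3phi)

# Planar zigzag (trans) and planar U-shape (cis) test geometries.
trans = ((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
cis = ((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))
```

With this sign convention the trans geometry gives $\cos\varphi = -1$ and zero energy, while the cis geometry gives $\cos\varphi = 1$ and the maximum energy $2(A + B)$.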

Page 12: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

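Bond Stretch Potential

$$V_{bs} = \frac{k_b}{2}\left(r_{ij} - \sigma\right)^2$$

$$\vec{F}_{ij}^{\,bs} = -k_b\left(r_{ij} - \sigma\right)\hat{r}_{ij}$$

for $j = i+1,\ i-1$

[Figure: bonded beads $i$ and $j$]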

Page 13: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

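Equations of Motion

$$\vec{F}_i^{\,tot} = \vec{F}_i^{\,lr} + \vec{F}_i^{\,b} + \vec{F}_i^{\,d} + \vec{F}_i^{\,bs} = m_i \vec{a}_i$$

Velocity Verlet algorithm:

$$x_i(t + \Delta t) = x_i(t) + v_i(t)\,\Delta t + \frac{1}{2}a_i(t)\,(\Delta t)^2$$

$$v_i(t + \Delta t) = v_i(t) + \frac{a_i(t) + a_i(t + \Delta t)}{2}\,\Delta t$$

Constant Energy vs. Constant Temperature (velocity rescaling, Langevin/Nosé-Hoover thermostats)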
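The velocity Verlet update rules on this slide can be sketched for a single degree of freedom. The test system here is a hypothetical unit-mass, unit-stiffness harmonic spring, chosen only because its energy is easy to check; it stands in for the full set of forces, and the function names are illustrative. No thermostat is applied, so this is the constant-energy case.

```python
def velocity_verlet(x, v, accel, dt, n_steps):
    """Integrate one coordinate; accel(x) returns the acceleration at x."""
    a = accel(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt   # position update uses a(t)
        a_new = accel(x)                     # a(t + dt) from the new position
        v = v + 0.5 * (a + a_new) * dt       # velocity update averages both
        a = a_new
        trajectory.append((x, v))
    return trajectory

def spring_accel(x):
    """Hypothetical test force: unit mass on a unit-stiffness spring."""
    return -x

def energy(x, v):
    return 0.5 * v * v + 0.5 * x * x

# About one period of oscillation (2*pi is roughly 6.28 time units).
traj = velocity_verlet(x=1.0, v=0.0, accel=spring_accel, dt=0.01, n_steps=628)
e_start = energy(*traj[0])
e_end = energy(*traj[-1])
```

Only one force evaluation is needed per step, because the acceleration computed at the end of a step is reused at the start of the next one.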

Page 14: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

Collapsed Structure

$T_0 = 5\epsilon_h$; fast quench; $(R_g/\sigma)^2 = 5.48$

Page 15: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

Native State

$T_0 = \epsilon_h$; slow quench; $(R_g/\sigma)^2 = 7.78$

Page 16: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II


Page 17: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

start end

Page 18: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

native states

Total Potential Energy

Page 19: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

Radius of Gyration

$$R_g^2 = \frac{1}{2N^2}\sum_{i,j} r_{ij}^2$$

[Figure labels: slow quench, unfolded, native state, $T_f$]
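The pairwise formula for the radius of gyration can be checked against the more familiar center-of-mass definition, which gives the same value when all beads have equal mass. This is a sketch with illustrative function names and made-up coordinates.

```python
def rg2_pairwise(positions):
    """R_g^2 = (1 / (2 N^2)) * sum over all ordered pairs (i, j) of r_ij^2."""
    n = len(positions)
    total = 0.0
    for a in positions:
        for b in positions:
            total += sum((x - y) ** 2 for x, y in zip(a, b))
    return total / (2 * n * n)

def rg2_center_of_mass(positions):
    """Mean squared distance of the beads from their center of mass."""
    n = len(positions)
    com = [sum(c) / n for c in zip(*positions)]
    return sum(sum((x - c) ** 2 for x, c in zip(p, com)) for p in positions) / n

# Made-up bead coordinates, for illustration only.
beads = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 1.0, 3.0)]
pairwise = rg2_pairwise(beads)
from_com = rg2_center_of_mass(beads)
```

The sum runs over both orderings of each pair, which is why the prefactor is $1/(2N^2)$ rather than $1/N^2$.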

Page 20: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

2-letter Cα model: $(BN_3)_3B$

(1) Construct the backbone in 2D.

(2) Assign a sequence of hydrophobic (B) and neutral (N) residues. B residues experience an effective attraction. There is no bond-bending potential.

(3) Evolve the system under Langevin dynamics at temperature T.

(4) Induce collapse/folding by decreasing the temperature at rate r.

[Figure legend: B, N]
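Steps (3) and (4) can be sketched as a Langevin velocity update combined with a temperature schedule. The slides do not specify the integrator or the shape of the quench, so both the Euler-Maruyama update and the linear ramp below are assumptions, as are the function names and the default `gamma`, `mass`, and `kb` values.

```python
import math
import random

def temperature(t, t0, rate):
    """Assumed linear quench: T(t) = T0 - rate * t, floored at zero."""
    return max(t0 - rate * t, 0.0)

def langevin_velocity_step(v, force, temp, dt, rng, gamma=1.0, mass=1.0, kb=1.0):
    """One Euler-Maruyama step of dv = (F/m - gamma*v) dt + sqrt(2*gamma*kb*T/m) dW."""
    noise = math.sqrt(2.0 * gamma * kb * temp * dt / mass) * rng.gauss(0.0, 1.0)
    return v + (force / mass - gamma * v) * dt + noise

# Quench one velocity component of a force-free bead from T0 = 5 at rate 0.1.
rng = random.Random(0)
v, dt = 1.0, 0.01
for step in range(1000):
    v = langevin_velocity_step(v, 0.0, temperature(step * dt, 5.0, 0.1), dt, rng)
```

At zero temperature the noise term vanishes and the update reduces to pure frictional damping, which is a convenient sanity check on the implementation.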

Page 21: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II


Page 22: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

Energy Landscape

[Figure: two panels; axis labels as extracted: Rg, end-to-end distance, E/C; regions labeled 5 contacts, 4 contacts, 3 contacts]

Page 23: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

Rate Dependence

[Figure: axis labels as extracted: EC, CT; curves labeled 5 contacts, 4 contacts, 3 contacts, 2 contacts]

Page 24: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

Misfolding

Page 25: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

Reliable Folding at Low Rate

[Figure; axis label: $\log_{10}(r\eta/T)$]

Page 26: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II


Slow rate

Page 27: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II


Fast rate

Page 28: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II

Next…

• Thermostats…Yuck!

• More results on coarse-grained models

• Results for atomistic models

• Homework

• Next Lecture: Protein Folding III (2/15/10)

So far…

• Uh-oh, proteins do not fold reliably…

• Quench rates and potentials