Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II
Prof. Corey O’Hern
Department of Mechanical Engineering
Department of Physics
Yale University
What did we learn about proteins?
• Many degrees of freedom; exponentially growing number of energy minima/structures
• Folding is the process of exploring the energy landscape to find the global energy minimum
• Need to identify pathways in the energy landscape; the number of pathways grows exponentially with the number of structures
• Coarse-graining/clumping required
• Transitions are temperature dependent
[Figure labels: energy minimum, transition]
J. D. Honeycutt and D. Thirumalai, “The nature of folded states of globular proteins,” Biopolymers 32 (1992) 695.
T. Veitshans, D. Klimov, and D. Thirumalai, “Protein folding kinetics: timescales, pathways and energy landscapes in terms of sequence-dependent properties,” Folding & Design 2 (1996) 1.
Coarse-grained (continuum, implicit solvent, Cα) models for proteins
3-letter Cα model: $B_9N_3(LB)_4N_3B_9N_3(LB)_5L$
B = hydrophobic, N = neutral, L = hydrophilic
Number of sequences for $N_m = 46$: $N_{\text{sequences}} = 3^{N_m} \sim 10^{22}$
Number of structures per sequence: $N_p \sim \exp(aN_m) \sim 10^{19}$
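The counting on this slide can be checked with a few lines of arithmetic. This is a minimal sketch: it expands the 3-letter sequence, confirms its length, and evaluates 3^Nm. The variable names and the back-calculated constant `a_implied` are illustrative and not from the slides.

```python
import math

# Expand the shorthand B9 N3 (LB)4 N3 B9 N3 (LB)5 L into a full sequence.
sequence = (
    "B" * 9 + "N" * 3 + "LB" * 4 + "N" * 3
    + "B" * 9 + "N" * 3 + "LB" * 5 + "L"
)
n_monomers = len(sequence)

# Three letters per site gives 3**N_m possible sequences.
n_sequences = 3 ** n_monomers
log10_sequences = math.log10(n_sequences)

# If N_p ~ exp(a * N_m) ~ 1e19, the implied constant is a = ln(1e19) / N_m.
# This back-calculation is an illustration, not a value quoted on the slide.
a_implied = 19 * math.log(10) / n_monomers

print(n_monomers, f"{n_sequences:.2e}", round(a_implied, 2))
```

The sequence has 46 monomers, so the number of sequences is 3^46, which is just under 10^22.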
different mapping?
and dynamics
Molecular Dynamics: Equations of Motion
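$$\vec{F}_i = m_i \vec{a}_i = m_i \frac{d^2 \vec{r}_i}{dt^2}$$
$\vec{r}_i(t)$ for $i = 1, \ldots, N_{\text{atoms}}$
$$\vec{F}_i = -\frac{\partial V}{\partial \vec{r}_i}$$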
Coupled second-order differential equations
How are they coupled?
(iv) Bond length potential
Pair Forces: Lennard-Jones Interactions
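[Figure: position vectors $\vec{r}_i$ and $\vec{r}_j$ of monomers i and j and the separation vector $\vec{r}_{ij}$, related by the parallelogram rule $\vec{r}_j + \vec{r}_{ij} = \vec{r}_i$]
$$\vec{r}_{ij} = \vec{r}_i - \vec{r}_j, \qquad r_{ij} = |\vec{r}_i - \vec{r}_j|$$
Force on i due to j:
$$\vec{F}_{ij} = -\frac{dV}{dr_{ij}} \hat{r}_{ij}$$
$-dV/dr_{ij} > 0$: repulsive; $-dV/dr_{ij} < 0$: attractive
$$\vec{F}_i = \sum_j \vec{F}_{ij}$$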
‘Long-range interactions’
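[Figure: pair potential V(r) versus r/σ, with curves for the pair types BB; NB, NL, NN; and LL, LB. Labeled features: hard core; $r^* = 2^{1/6}\sigma$; attractions where $-dV/dr < 0$]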
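As a rough illustration of the pair-force sum, the sketch below assumes the standard 12-6 Lennard-Jones form V(r) = 4ε[(σ/r)^12 − (σ/r)^6] with a single ε and σ for every pair. The slides distinguish several pair types (BB; NB, NL, NN; LL, LB), which this sketch does not, and the function names are hypothetical.

```python
import numpy as np

def lj_potential(r, epsilon=1.0, sigma=1.0):
    """Standard 12-6 Lennard-Jones potential (assumed form)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)

def lj_force_magnitude(r, epsilon=1.0, sigma=1.0):
    """Return -dV/dr: positive means repulsive, negative means attractive."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 ** 2 - sr6) / r

def pair_forces(positions, epsilon=1.0, sigma=1.0):
    """F_i = sum_j F_ij with F_ij = -(dV/dr_ij) * r_hat_ij and r_ij = r_i - r_j."""
    positions = np.asarray(positions, dtype=float)
    forces = np.zeros_like(positions)
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            rij = positions[i] - positions[j]
            dist = np.linalg.norm(rij)
            fij = lj_force_magnitude(dist, epsilon, sigma) * rij / dist
            forces[i] += fij   # force on i due to j
            forces[j] -= fij   # Newton's third law
    return forces

# Location of the potential minimum, where the force changes sign.
r_star = 2.0 ** (1.0 / 6.0)
```

For separations below `r_star` the force is repulsive (the hard core), and beyond it the force is attractive, matching the sign convention on the slide.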
Bond Angle Potential
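$$V^b(\theta_{ijk}) = \frac{k_0}{2}\left(\theta_{ijk} - \theta_0\right)^2, \qquad \theta_0 = 105^\circ$$
[Figure: monomers i, j, k with bond angle $\theta_{ijk}$]
$$\cos\theta_{ijk} = \hat{r}_{ji} \cdot \hat{r}_{jk}, \qquad \theta_{ijk} \in [0, \pi]$$
$$\vec{F}_j^{\,b} = -\frac{dV^b}{d\theta_{ijk}} \frac{d\theta_{ijk}}{d\vec{r}_j} = -k_0\left(\theta_{ijk} - \theta_0\right) \frac{d\theta_{ijk}}{d\vec{r}_j}$$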
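The bond-angle term can be sketched as follows. The stiffness value `k0=20.0` is a placeholder chosen for illustration, not a value given on the slides, and the force on monomer j is obtained here by finite differences of the energy rather than from the analytic expression for dθ/dr_j.

```python
import numpy as np

THETA_0 = np.deg2rad(105.0)  # equilibrium bond angle from the slide

def bond_angle(ri, rj, rk):
    """Angle at monomer j between the bonds to i and to k, in [0, pi]."""
    a = ri - rj
    b = rk - rj
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))

def angle_energy(ri, rj, rk, k0=20.0, theta0=THETA_0):
    """V_b = (k0 / 2) * (theta - theta0)**2; k0 is a placeholder value."""
    return 0.5 * k0 * (bond_angle(ri, rj, rk) - theta0) ** 2

def angle_force_on_j(ri, rj, rk, k0=20.0, theta0=THETA_0, h=1e-6):
    """F_j = -dV_b/dr_j, estimated by central finite differences."""
    force = np.zeros(3)
    for axis in range(3):
        step = np.zeros(3)
        step[axis] = h
        e_plus = angle_energy(ri, rj + step, rk, k0, theta0)
        e_minus = angle_energy(ri, rj - step, rk, k0, theta0)
        force[axis] = -(e_plus - e_minus) / (2.0 * h)
    return force
```

A production code would use the analytic gradient; the finite-difference version is mainly useful as a check on one.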
Dihedral Angle Potential
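$$V^d(\varphi_{ijkl}) = A\left(1 + \cos\varphi_{ijkl}\right) + B\left(1 + \cos 3\varphi_{ijkl}\right)$$
$$\cos\varphi_{ijkl} = \frac{\left(\vec{r}_{ij} \times \vec{r}_{kj}\right) \cdot \left(\vec{r}_{jk} \times \vec{r}_{lk}\right)}{\left|\vec{r}_{ij} \times \vec{r}_{kj}\right| \left|\vec{r}_{jk} \times \vec{r}_{lk}\right|}$$
$$\vec{F}_j^{\,d} = -\frac{dV^d}{d\varphi_{ijkl}} \frac{d\varphi_{ijkl}}{d\vec{r}_j} = \left(A \sin\varphi_{ijkl} + 3B \sin 3\varphi_{ijkl}\right) \frac{d\varphi_{ijkl}}{d\vec{r}_j}$$
[Figure labels: $V^d(\varphi_{ijkl})$; monomers i, j, k, l; “Successive N’s”]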
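The dihedral term can be sketched in the same way. The coefficients `A=1.0` and `B=1.0` are placeholders for illustration only, and the function names are hypothetical. The sketch uses the cross-product definition of cos φ with r_ij = r_i − r_j, and the identity cos 3φ = 4cos³φ − 3cos φ so that the energy depends on cos φ alone.

```python
import numpy as np

def cos_dihedral(ri, rj, rk, rl):
    """cos(phi) from the normals of the planes (i, j, k) and (j, k, l)."""
    n1 = np.cross(ri - rj, rk - rj)  # r_ij x r_kj
    n2 = np.cross(rj - rk, rl - rk)  # r_jk x r_lk
    return np.dot(n1, n2) / (np.linalg.norm(n1) * np.linalg.norm(n2))

def dihedral_energy(cos_phi, A=1.0, B=1.0):
    """V_d = A (1 + cos phi) + B (1 + cos 3 phi); A and B are placeholders."""
    cos_3phi = 4.0 * cos_phi ** 3 - 3.0 * cos_phi
    return A * (1.0 + cos_phi) + B * (1.0 + cos_3phi)

def dihedral_torque(phi, A=1.0, B=1.0):
    """-dV_d/dphi = A sin(phi) + 3 B sin(3 phi)."""
    return A * np.sin(phi) + 3.0 * B * np.sin(3.0 * phi)
```

With this definition a planar zigzag (trans) arrangement gives cos φ = −1, where the energy vanishes, and the eclipsed (cis) arrangement gives cos φ = +1, where the energy is 2A + 2B.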
Bond Stretch Potential
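$$V^{bs} = \frac{k_b}{2}\left(r_{ij} - \sigma\right)^2$$
$$\vec{F}_{ij}^{\,bs} = -k_b\left(r_{ij} - \sigma\right)\hat{r}_{ij}$$
for i, j = i+1, i−1
[Figure: bonded monomers i and j]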
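A minimal sketch of the bond-stretch force along a chain follows. The rest length is written `sigma` here, which is an assumption because the symbol did not survive in the transcript, and the stiffness `kb=100.0` is a placeholder value.

```python
import numpy as np

def bond_stretch_energy(ri, rj, kb=100.0, sigma=1.0):
    """V_bs = (kb / 2) * (r_ij - sigma)**2 for one bonded pair."""
    rij = np.linalg.norm(ri - rj)
    return 0.5 * kb * (rij - sigma) ** 2

def bond_stretch_force_on_i(ri, rj, kb=100.0, sigma=1.0):
    """F_ij = -kb * (r_ij - sigma) * r_hat_ij, the force on i due to j."""
    d = ri - rj
    rij = np.linalg.norm(d)
    return -kb * (rij - sigma) * d / rij

def chain_bond_forces(positions, kb=100.0, sigma=1.0):
    """Sum bond-stretch forces over nearest neighbors j = i + 1 along the chain."""
    positions = np.asarray(positions, dtype=float)
    forces = np.zeros_like(positions)
    for i in range(len(positions) - 1):
        f = bond_stretch_force_on_i(positions[i], positions[i + 1], kb, sigma)
        forces[i] += f
        forces[i + 1] -= f
    return forces
```

A stretched bond pulls the two monomers together and a compressed bond pushes them apart, so the chain relaxes toward the rest length.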
$$\vec{F}_i^{\,tot} = \vec{F}_i^{\,lr} + \vec{F}_i^{\,b} + \vec{F}_i^{\,d} + \vec{F}_i^{\,bs} = m_i \vec{a}_i$$
Equations of Motion
$$x_i(t + \Delta t) = x_i(t) + v_i(t)\,\Delta t + \frac{1}{2} a_i(t)\,(\Delta t)^2$$
$$v_i(t + \Delta t) = v_i(t) + \frac{a_i(t) + a_i(t + \Delta t)}{2}\,\Delta t$$
(velocity Verlet algorithm)
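The velocity Verlet update can be sketched directly from the two equations. The example below applies it to a unit-mass harmonic oscillator, which is a test problem chosen here for illustration and not the protein model; `accel` stands for any function that returns the acceleration at a given position.

```python
import numpy as np

def velocity_verlet(x, v, accel, dt, n_steps):
    """Integrate positions and velocities with the velocity Verlet scheme."""
    a = accel(x)
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt ** 2   # position update
        a_new = accel(x)                     # acceleration at the new position
        v = v + 0.5 * (a + a_new) * dt       # velocity update with averaged acceleration
        a = a_new
    return x, v

def harmonic_accel(x):
    """Unit-mass, unit-stiffness oscillator: a = -x (illustrative test force)."""
    return -x

x_end, v_end = velocity_verlet(
    np.array([1.0]), np.array([0.0]), harmonic_accel, dt=0.01, n_steps=1000
)
energy = 0.5 * v_end ** 2 + 0.5 * x_end ** 2
```

For this test problem the total energy stays close to its initial value of 0.5, which is the constant-energy behavior expected without a thermostat.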
Constant Energy vs. Constant Temperature (velocity rescaling, Langevin/Nosé-Hoover thermostats)
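Of the thermostats named above, velocity rescaling is the simplest to sketch. The version below assumes units with k_B = 1, equal masses, and counts every velocity component as a degree of freedom; it ignores corrections for a fixed center of mass. The function names are hypothetical.

```python
import numpy as np

def kinetic_temperature(vel, mass=1.0):
    """Kinetic temperature from equipartition: m <v^2> per degree of freedom (k_B = 1)."""
    return mass * np.sum(vel ** 2) / vel.size

def rescale_velocities(vel, target_temp, mass=1.0):
    """Scale all velocities so the kinetic temperature equals target_temp."""
    current = kinetic_temperature(vel, mass)
    return vel * np.sqrt(target_temp / current)

rng = np.random.default_rng(0)
velocities = rng.standard_normal((46, 3))
rescaled = rescale_velocities(velocities, target_temp=0.5)
```

Rescaling fixes the kinetic energy exactly at each application, unlike Langevin or Nosé-Hoover thermostats, which let it fluctuate.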
Collapsed Structure
$T_0 = 5\epsilon_h$; fast quench; $(R_g/\sigma)^2 = 5.48$
Native State
$T_0 = \epsilon_h$; slow quench; $(R_g/\sigma)^2 = 7.78$
[Movie labels: start, end, native states]
Total Potential Energy
[Figure labels: slow quench, unfolded, native state]
Radius of Gyration
$$R_g^2 = \frac{1}{2N^2} \sum_{i,j} r_{ij}^2$$
[Figure label: $T_f$]
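The pairwise definition of the radius of gyration can be evaluated as below. As a check, the sketch also computes the mean squared distance from the center of mass, which is equal to the pairwise form for equal masses. The function names are illustrative.

```python
import numpy as np

def rg2_pairwise(positions):
    """R_g^2 = (1 / (2 N^2)) * sum over all i, j of r_ij^2."""
    positions = np.asarray(positions, dtype=float)
    n = len(positions)
    diff = positions[:, None, :] - positions[None, :, :]
    return np.sum(diff ** 2) / (2.0 * n ** 2)

def rg2_center_of_mass(positions):
    """Equivalent form: mean squared distance from the center of mass."""
    positions = np.asarray(positions, dtype=float)
    centered = positions - positions.mean(axis=0)
    return np.mean(np.sum(centered ** 2, axis=1))
```

The pairwise form costs O(N²) operations and the center-of-mass form O(N), so the latter is usually preferred when tracking collapse during a long run.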
(1) Construct the backbone in 2D.
(2) Assign a sequence of hydrophobic (B) and neutral (N) residues. B residues experience an effective attraction. No bond-bending potential.
(3) Evolve the system under Langevin dynamics at temperature T.
(4) Collapse/folding is induced by decreasing the temperature at rate r.
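The four steps above can be sketched as a small simulation. Everything specific here is an assumption made for illustration: the linear cooling schedule, the harmonic bonds, the Lennard-Jones form with attraction only between B-B pairs, the simple Euler-type Langevin update, and all parameter values (units with m = k_B = 1). It is not the integrator or parameter set used for the slides.

```python
import numpy as np

SEQUENCE = "BNNNBNNNBNNNB"  # (BN3)3B, 13 monomers

def temperature(t, t0, rate, t_min=0.0):
    """Assumed linear cooling schedule: T(t) = max(t0 - rate * t, t_min)."""
    return max(t0 - rate * t, t_min)

def forces(pos, kb=100.0, sigma=1.0, eps=1.0):
    """Harmonic bonds between neighbors; LJ between other pairs (attractive only for B-B)."""
    f = np.zeros_like(pos)
    n = len(pos)
    for i in range(n):
        for j in range(i + 1, n):
            d = pos[i] - pos[j]
            r = np.linalg.norm(d)
            if j == i + 1:
                mag = -kb * (r - sigma)          # bond stretch, no bond-bending term
            else:
                sr6 = (sigma / r) ** 6
                attract = 1.0 if SEQUENCE[i] == "B" and SEQUENCE[j] == "B" else 0.0
                mag = 24.0 * eps * (2.0 * sr6 ** 2 - attract * sr6) / r
            f[i] += mag * d / r
            f[j] -= mag * d / r
    return f

def langevin_quench(pos, t0, rate, dt=0.002, gamma=1.0, n_steps=1000, seed=0):
    """Evolve under Langevin dynamics while the bath temperature decreases at `rate`."""
    rng = np.random.default_rng(seed)
    vel = np.zeros_like(pos)
    for step in range(n_steps):
        temp = temperature(step * dt, t0, rate)
        noise = rng.standard_normal(pos.shape)
        # Deterministic force, friction, and thermal noise of strength sqrt(2 gamma T dt).
        vel = vel + (forces(pos) - gamma * vel) * dt + np.sqrt(2.0 * gamma * temp * dt) * noise
        pos = pos + vel * dt
    return pos, vel

# Step (1): a straight 2D backbone with unit bond length as the starting configuration.
start = np.column_stack([np.arange(len(SEQUENCE), dtype=float), np.zeros(len(SEQUENCE))])
final_pos, final_vel = langevin_quench(start, t0=1.0, rate=0.5)
```

Lowering `rate` while increasing `n_steps` corresponds to a slower quench. A run this short is only meant to show the structure of the loop, not to reach a folded state.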
2-letter Cα model: $(BN_3)_3B$
[Movie labels: B, N]
Energy Landscape
[Figure labels: E/C, Rg, end-to-end distance; 5 contacts, 4 contacts, 3 contacts]
Rate Dependence
[Figure labels: EC, CT; 5 contacts, 4 contacts, 3 contacts, 2 contacts]
Misfolding
Reliable Folding at Low Rate
[Figure axis label: log10(rη/T)]
[Movie: slow rate]
[Movie: fast rate]
Next…
• Thermostats… Yuck!
• More results on coarse-grained models
• Results for atomistic models
• Homework
• Next Lecture: Protein Folding III (2/15/10)
So far…
• Uh-oh, proteins do not fold reliably…
• Quench rates and potentials