Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II
![Page 1: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/1.jpg)
Bioinformatics: Practical Application of Simulation and Data Mining
Protein Folding II
Prof. Corey O’Hern
Department of Mechanical Engineering
Department of Physics
Yale University
![Page 2: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/2.jpg)
What did we learn about proteins?
•Many degrees of freedom; exponentially growing # of energy minima/structures
•Folding is process of exploring energy landscape to find global energy minimum
•Need to identify pathways in energy landscape; # of pathways grows exponentially with # of structures
•Coarse-graining/clumping required
•Transitions are temperature dependent
[Figure labels: energy minimum; transition]
![Page 3: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/3.jpg)
J. D. Honeycutt and D. Thirumalai, “The nature of folded states of globular proteins,” Biopolymers 32 (1992) 695.
T. Veitshans, D. Klimov, and D. Thirumalai, “Protein folding kinetics: timescales, pathways and energy landscapes in terms of sequence-dependent properties,” Folding & Design 2 (1996) 1.
Coarse-grained (continuum, implicit solvent, Cα) models for proteins
![Page 4: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/4.jpg)
3-letter Cα model: $\mathrm{B_9N_3(LB)_4N_3B_9N_3(LB)_5L}$
B = hydrophobic, N = neutral, L = hydrophilic
Number of sequences for $N_m = 46$: $N_{\text{sequences}} = 3^{N_m} \sim 10^{22}$
Number of structures per sequence: $N_p \sim \exp(a N_m) \sim 10^{19}$
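The sequence count can be checked with a few lines of arithmetic. The sketch below expands the compact 3-letter sequence into one letter per residue and evaluates $3^{N_m}$; the helper name `expand_sequence` is hypothetical and not part of the lecture.

```python
import math
import re


def expand_sequence(compact: str) -> str:
    """Expand shorthand such as 'B9N3(LB)4' into one letter per residue."""
    # Each token is either a parenthesised group or a single letter,
    # optionally followed by a repeat count.
    tokens = re.findall(r"\(([A-Z]+)\)(\d*)|([A-Z])(\d*)", compact)
    residues = []
    for group, group_count, letter, letter_count in tokens:
        unit = group or letter
        count = int(group_count or letter_count or 1)
        residues.append(unit * count)
    return "".join(residues)


sequence = expand_sequence("B9N3(LB)4N3B9N3(LB)5L")
n_m = len(sequence)       # number of monomers
n_sequences = 3 ** n_m    # three residue types per site
print(n_m, f"{n_sequences:.2e}", round(math.log10(n_sequences), 2))
```

The estimate $N_p \sim \exp(a N_m)$ is not evaluated here because the constant $a$ is not given on the slide.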
![Page 5: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/5.jpg)
different mapping?
and dynamics
![Page 6: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/6.jpg)
Molecular Dynamics: Equations of Motion
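$$\vec{F}_i = m_i \vec{a}_i = m_i \frac{d^2 \vec{r}_i}{dt^2}, \qquad \vec{r}_i(t) \text{ for } i = 1, \ldots, N_{\text{atoms}}$$
$$\vec{F}_i = -\frac{\partial V}{\partial \vec{r}_i}$$
Coupled 2nd-order differential equations.
How are they coupled?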
![Page 7: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/7.jpg)
(iv) Bond length potential
![Page 8: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/8.jpg)
Pair Forces: Lennard-Jones Interactions
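[Figure: particles i and j with position vectors $\vec{r}_i$, $\vec{r}_j$ and separation vector $\vec{r}_{ij}$]
Parallelogram rule: $\vec{r}_j + \vec{r}_{ij} = \vec{r}_i$, so $\vec{r}_{ij} = \vec{r}_i - \vec{r}_j$
Force on i due to j:
$$\vec{F}_{ij} = -\frac{dV}{dr_{ij}}\, \hat{r}_{ij}$$
$-dV/dr_{ij} > 0$: repulsive; $-dV/dr_{ij} < 0$: attractive
$$\vec{F}_i = \sum_j \vec{F}_{ij}$$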
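A minimal sketch of the pair-force sum follows. The slides do not write out the potential, so the standard 12-6 Lennard-Jones form $V(r) = 4\epsilon[(\sigma/r)^{12} - (\sigma/r)^{6}]$ is assumed here, and the function names and default parameters are illustrative only.

```python
import numpy as np


def lj_pair_force(r_i, r_j, epsilon=1.0, sigma=1.0):
    """Force on i due to j, assuming V(r) = 4*eps*[(sigma/r)**12 - (sigma/r)**6]."""
    r_ij = np.asarray(r_i, dtype=float) - np.asarray(r_j, dtype=float)
    r = np.linalg.norm(r_ij)
    sr6 = (sigma / r) ** 6
    # -dV/dr: positive means repulsive, negative means attractive
    minus_dv_dr = 24.0 * epsilon * (2.0 * sr6**2 - sr6) / r
    return minus_dv_dr * r_ij / r  # directed along the unit vector r_ij / r


def total_forces(positions, epsilon=1.0, sigma=1.0):
    """F_i = sum over j of F_ij, using F_ji = -F_ij to halve the work."""
    positions = np.asarray(positions, dtype=float)
    forces = np.zeros_like(positions)
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            f_ij = lj_pair_force(positions[i], positions[j], epsilon, sigma)
            forces[i] += f_ij
            forces[j] -= f_ij
    return forces


positions = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.5, 0.0]]
print(total_forces(positions))
```

With this assumed form the force changes sign at $r = 2^{1/6}\sigma$, repulsive at shorter separations and attractive at longer ones.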
![Page 9: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/9.jpg)
‘Long-range interactions’
[Figure: $V(r)$ versus $r/\sigma$ for the pair types BB; NB, NL, NN; LL, LB. Annotations: hard core; $r^* = 2^{1/6}\sigma$; attractions where $-dV/dr < 0$]
![Page 10: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/10.jpg)
Bond Angle Potential
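$$V^b(\theta_{ijk}) = \frac{k_0}{2}\left(\theta_{ijk} - \theta_0\right)^2, \qquad \theta_0 = 105^\circ$$
$$\cos\theta_{ijk} = \hat{r}_{ji} \cdot \hat{r}_{jk}, \qquad \theta_{ijk} \in [0, \pi]$$
$$\vec{F}_j^{\,b} = -\frac{dV^b}{d\theta_{ijk}} \frac{d\theta_{ijk}}{d\vec{r}_j} = -k_0\left(\theta_{ijk} - \theta_0\right)\frac{d\theta_{ijk}}{d\vec{r}_j}$$
[Figure: residues i, j, k with the bond angle $\theta_{ijk}$]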
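The bond angle and its harmonic energy can be sketched as below. The default `k0=1.0` is a placeholder, since the slide gives no value for the stiffness; only $\theta_0 = 105^\circ$ comes from the lecture.

```python
import numpy as np


def bond_angle(r_i, r_j, r_k):
    """Angle theta_ijk at residue j, from cos(theta) = r_ji_hat . r_jk_hat."""
    r_i, r_j, r_k = (np.asarray(p, dtype=float) for p in (r_i, r_j, r_k))
    r_ji = r_j - r_i
    r_jk = r_j - r_k
    cos_theta = np.dot(r_ji, r_jk) / (np.linalg.norm(r_ji) * np.linalg.norm(r_jk))
    # Clip guards against round-off pushing the cosine outside [-1, 1];
    # arccos then returns an angle in [0, pi].
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))


def bond_angle_energy(theta, k0=1.0, theta0=np.radians(105.0)):
    """V_b = (k0 / 2) * (theta - theta0)**2, with k0 an assumed placeholder."""
    return 0.5 * k0 * (theta - theta0) ** 2


theta = bond_angle([1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 1.0, 0.0])
print(np.degrees(theta), bond_angle_energy(theta))
```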
![Page 11: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/11.jpg)
Dihedral Angle Potential
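$$V^d(\varphi_{ijkl}) = A\left(1 + \cos\varphi_{ijkl}\right) + B\left(1 + \cos 3\varphi_{ijkl}\right)$$
$$\cos\varphi_{ijkl} = \frac{\left(\vec{r}_{ij} \times \vec{r}_{kj}\right) \cdot \left(\vec{r}_{jk} \times \vec{r}_{lk}\right)}{\left|\vec{r}_{ij} \times \vec{r}_{kj}\right| \left|\vec{r}_{jk} \times \vec{r}_{lk}\right|}$$
$$\vec{F}_j^{\,d} = -\frac{dV^d}{d\varphi_{ijkl}} \frac{d\varphi_{ijkl}}{d\vec{r}_j} = \left(A \sin\varphi_{ijkl} + 3B \sin 3\varphi_{ijkl}\right)\frac{d\varphi_{ijkl}}{d\vec{r}_j}$$
[Figure: plot of $V^d(\varphi_{ijkl})$; labels: $\varphi_{ijkl}$; "Successive N’s"]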
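As a rough illustration of the cross-product formula, the sketch below evaluates $\cos\varphi_{ijkl}$ and the dihedral energy for two planar test geometries. The coefficients `A=1.0` and `B=1.0` are placeholders, not values from the lecture, and the identity $\cos 3\varphi = 4\cos^3\varphi - 3\cos\varphi$ is used to avoid computing the angle itself.

```python
import numpy as np


def cos_dihedral(r_i, r_j, r_k, r_l):
    """cos(phi_ijkl) from (r_ij x r_kj) . (r_jk x r_lk), with r_ab = r_a - r_b."""
    r_i, r_j, r_k, r_l = (np.asarray(p, dtype=float) for p in (r_i, r_j, r_k, r_l))
    n1 = np.cross(r_i - r_j, r_k - r_j)  # r_ij x r_kj
    n2 = np.cross(r_j - r_k, r_l - r_k)  # r_jk x r_lk
    return np.dot(n1, n2) / (np.linalg.norm(n1) * np.linalg.norm(n2))


def dihedral_energy(cos_phi, A=1.0, B=1.0):
    """V_d = A (1 + cos phi) + B (1 + cos 3 phi); A and B are assumed values."""
    cos_3phi = 4.0 * cos_phi**3 - 3.0 * cos_phi
    return A * (1.0 + cos_phi) + B * (1.0 + cos_3phi)


# Planar zigzag (trans): i and l on opposite sides of the j-k bond
trans = cos_dihedral([0, 1, 0], [0, 0, 0], [1, 0, 0], [1, -1, 0])
# Planar cis: i and l on the same side of the j-k bond
cis = cos_dihedral([0, 1, 0], [0, 0, 0], [1, 0, 0], [1, 1, 0])
print(trans, dihedral_energy(trans))
print(cis, dihedral_energy(cis))
```

With the sign convention of the formula as written, the trans geometry gives $\cos\varphi = -1$ and zero energy, while the cis geometry gives $\cos\varphi = 1$ and the maximum energy $2A + 2B$.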
![Page 12: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/12.jpg)
Bond Stretch Potential
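$$V^{bs} = \frac{k_b}{2}\left(r_{ij} - \sigma\right)^2$$
$$\vec{F}_{ij}^{\,bs} = -k_b\left(r_{ij} - \sigma\right)\hat{r}_{ij}$$
for i, j = i+1, i-1
[Figure: bonded residues i and j]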
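The bond-stretch force is a harmonic spring along the bond. In the short sketch below, `k_b=100.0` is an assumed placeholder, and the rest length is taken to be $\sigma$ (set to 1), which is an assumption because the symbol is missing from the extracted slide.

```python
import numpy as np


def bond_stretch_force(r_i, r_j, k_b=100.0, sigma=1.0):
    """Force on i from its bonded neighbour j: -k_b (r_ij - sigma) r_ij_hat."""
    r_ij = np.asarray(r_i, dtype=float) - np.asarray(r_j, dtype=float)
    r = np.linalg.norm(r_ij)
    return -k_b * (r - sigma) * r_ij / r


# A stretched bond pulls i back toward j; a compressed bond pushes it away.
print(bond_stretch_force([1.2, 0.0, 0.0], [0.0, 0.0, 0.0]))
print(bond_stretch_force([0.8, 0.0, 0.0], [0.0, 0.0, 0.0]))
```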
![Page 13: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/13.jpg)
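Equations of Motion
$$\vec{F}_i^{\,tot} = \vec{F}_i^{\,lr} + \vec{F}_i^{\,b} + \vec{F}_i^{\,d} + \vec{F}_i^{\,bs} = m_i \vec{a}_i$$
Velocity Verlet algorithm:
$$x_i(t + \Delta t) = x_i(t) + v_i(t)\,\Delta t + \frac{1}{2} a_i(t)\,(\Delta t)^2$$
$$v_i(t + \Delta t) = v_i(t) + \frac{a_i(t) + a_i(t + \Delta t)}{2}\,\Delta t$$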
Constant Energy vs. Constant Temperature (velocity rescaling, Langevin/Nosé-Hoover thermostats)
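A minimal sketch of the velocity Verlet update at constant energy (no thermostat) is given below. The harmonic test force with unit mass and unit spring constant is an assumption chosen only because its exact solution is known; it is not the protein force field.

```python
import numpy as np


def velocity_verlet(x, v, accel, dt, n_steps):
    """Integrate positions and velocities with the velocity Verlet scheme."""
    x = np.array(x, dtype=float)
    v = np.array(v, dtype=float)
    a = accel(x)
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt**2   # position update uses a(t)
        a_new = accel(x)                   # acceleration at t + dt
        v = v + 0.5 * (a + a_new) * dt     # velocity uses the average acceleration
        a = a_new
    return x, v


def harmonic_accel(x):
    """Assumed test case: unit mass on a unit spring, a = -x."""
    return -x


x0, v0 = np.array([1.0]), np.array([0.0])
x1, v1 = velocity_verlet(x0, v0, harmonic_accel, dt=0.01, n_steps=1000)
energy0 = 0.5 * v0[0] ** 2 + 0.5 * x0[0] ** 2
energy1 = 0.5 * v1[0] ** 2 + 0.5 * x1[0] ** 2
print(x1[0], np.cos(10.0), energy1 - energy0)
```

The small energy drift in this test is the usual reason for choosing this integrator in constant-energy runs; constant-temperature runs need one of the thermostats listed above in addition.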
![Page 14: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/14.jpg)
Collapsed Structure
$T_0 = 5\epsilon_h$; fast quench; $(R_g/\sigma)^2 = 5.48$
![Page 15: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/15.jpg)
Native State
$T_0 = \epsilon_h$; slow quench; $(R_g/\sigma)^2 = 7.78$
![Page 16: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/16.jpg)
![Page 17: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/17.jpg)
start end
![Page 18: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/18.jpg)
native states
Total Potential Energy
![Page 19: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/19.jpg)
Radius of Gyration
$$R_g^2 = \frac{1}{2N^2} \sum_{i,j} r_{ij}^2$$
[Figure labels: slow quench; unfolded; native state; $T_f$]
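The pair-distance form of the radius of gyration can be evaluated directly. The sketch below also compares it with the mean squared distance from the centroid, which is an equivalent expression; the random coordinates are illustrative test data only.

```python
import numpy as np


def radius_of_gyration_sq(positions):
    """R_g^2 = (1 / (2 N^2)) * sum over all pairs (i, j) of r_ij^2."""
    positions = np.asarray(positions, dtype=float)
    n = len(positions)
    diff = positions[:, None, :] - positions[None, :, :]  # all r_i - r_j
    return np.sum(diff**2) / (2.0 * n**2)


def radius_of_gyration_sq_centroid(positions):
    """Equivalent form: mean squared distance from the centroid."""
    positions = np.asarray(positions, dtype=float)
    centered = positions - positions.mean(axis=0)
    return np.mean(np.sum(centered**2, axis=1))


rng = np.random.default_rng(0)
coords = rng.normal(size=(46, 3))  # 46 residues at arbitrary test positions
print(radius_of_gyration_sq(coords), radius_of_gyration_sq_centroid(coords))
```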
![Page 20: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/20.jpg)
(1) Construct the backbone in 2D.
(2) Assign a sequence of hydrophobic (B) and neutral (N) residues. B residues experience an effective attraction. No bond-bending potential.
(3) Evolve the system under Langevin dynamics at temperature T.
(4) Collapse/folding is induced by decreasing the temperature at rate r.
[Figure legend: B, N]
2-letter Cα model: $\mathrm{(BN_3)_3B}$
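Steps (3) and (4) can be sketched as a Langevin integrator driven by a cooling schedule. Everything specific here is an assumption: the linear schedule $T(t) = T_0 - rt$, the simple Euler-Maruyama discretization, the friction coefficient `gamma`, units with $k_B = 1$, and the use of non-interacting particles in place of the chain forces.

```python
import numpy as np


def quench_temperature(t, T0, rate):
    """Assumed linear cooling schedule T(t) = T0 - rate * t, floored at zero."""
    return max(T0 - rate * t, 0.0)


def langevin_step(x, v, force, T, dt, rng, mass=1.0, gamma=1.0):
    """One Euler-Maruyama step of dv = (F/m - gamma v) dt + sqrt(2 gamma T / m) dW."""
    noise = rng.standard_normal(v.shape)
    v = v + (force(x) / mass - gamma * v) * dt + np.sqrt(2.0 * gamma * T * dt / mass) * noise
    x = x + v * dt
    return x, v


def zero_force(x):
    """Placeholder for the chain forces; free particles are used in this sketch."""
    return np.zeros_like(x)


rng = np.random.default_rng(0)
x = np.zeros((200, 2))  # 2D, as in step (1)
v = np.zeros((200, 2))
T0, rate, dt = 1.0, 0.05, 0.01
for step in range(4000):
    T = quench_temperature(step * dt, T0, rate)
    x, v = langevin_step(x, v, zero_force, T, dt, rng)
print("final kinetic temperature:", np.mean(v**2))
```

In a real run `zero_force` would be replaced by the sum of the pair and bond forces for the chain, and a smaller `rate` corresponds to a slower quench.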
![Page 21: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/21.jpg)
![Page 22: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/22.jpg)
Energy Landscape
[Figure: two panels, Rg and E/C, each plotted against end-to-end distance; states labeled 5 contacts, 4 contacts, 3 contacts]
![Page 23: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/23.jpg)
Rate Dependence
[Figure: axis labels as extracted: EC, CT; states labeled 5 contacts, 4 contacts, 3 contacts, 2 contacts]
![Page 24: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/24.jpg)
Misfolding
![Page 25: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/25.jpg)
Reliable Folding at Low Rate
[Figure axis: $\log_{10}(r\eta/T)$]
![Page 26: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/26.jpg)
Slow rate
![Page 27: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/27.jpg)
Fast rate
![Page 28: Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding II](https://reader036.fdocuments.net/reader036/viewer/2022070503/56815656550346895dc3fb5f/html5/thumbnails/28.jpg)
So far…
•Uh-oh, proteins do not fold reliably…
•Quench rates and potentials
Next…
•Thermostats…Yuck!
•More results on coarse-grained models
•Results for atomistic models
•Homework
•Next Lecture: Protein Folding III (2/15/10)