Understanding the Molecular Mechanism of Elasticity in Elastin from a Solvation Perspective
by
Zhuyi Xue
A thesis submitted in conformity with the requirements
for the degree of Masters of Science
Graduate Department of Biochemistry
University of Toronto
Copyright © 2013 by Zhuyi Xue
Abstract
Understanding the Molecular Mechanism of Elasticity in Elastin from a Solvation
Perspective
Zhuyi Xue
Masters of Science
Graduate Department of Biochemistry
University of Toronto
2013
Elastin is an extracellular matrix protein that provides tissues with elasticity. In this
thesis, we studied three aspects of elastin-based peptides by performing molecular dynamics
(MD) simulations in explicit solvents: aggregation, solvent quality, and mechanical
properties. First, by simulating the peptides in water and methanol, we found that methanol
stabilizes the secondary structure of amyloid-like peptides, from which we hypothesized
that the reduced solvophobic effect in methanol, compared to that in water, prevents their
formation of amyloid-like fibrils. Second, we studied the effects of solvents with
different polarities on the peptides and found that they exhibit different solvent
qualities. Third, we developed a model to predict the Young’s modulus of elastin-like
material using data from MD simulations. This model produces results consistent with
experimental measurements and hence provides a way to evaluate solvent effects on
elasticity. We conclude that the hydrophobic effect plays an important role in
generating elasticity.
Acknowledgements
To study abroad for the first time is a wonderful yet very challenging experience. I feel
grateful to all the people that have helped me along the way.
To my supervisor and committee. Thanks to Dr. Regis Pomes, who brought me here
initially, and my committee: Dr. Fred Keeley, Dr. Simon Sharpe and Dr. Zhaolei Zhang
for their guidance, suggestions and comments.
To my colleagues. Thanks to Dr. John Holyoake, Dr. Chris Neale, Dr. Nilu Chakrabarti,
Dr. Loan Huynh, Dr. Chris Madill, Dr. Sarah Rauscher, David Caplan, Grace Li,
Kethika Kulleperuma, Aditi Ramesh, Christopher Ing, and Ana Nikolic for their guidance,
suggestions and comments.
To my friends and family members. Thanks to Lois Yin, Guang Shi and Feiyang Liu.
Thanks to my best friends, Gangzhi Zheng, Jian Yu, Quan Jin and Yong Zhu. Thanks to
my dear Beibei Zhang. Thanks to my mom and all the other family members for being
understanding and supportive all the time.
Finally, I would also like to thank the following high-performance computing consortia
of Compute Canada for providing computational resources for the work in this thesis:
SciNet, RQCHP, CLUMEQ, WestGrid and SHARCNET.
Contents
List of Tables ix
List of Figures x
List of Acronyms xi
List of Symbols xv
1 Introduction 1
1.1 Elastomeric Proteins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Elastin and Tropoelastin . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 In Vivo and In Vitro Elastogenesis . . . . . . . . . . . . . . . . . . . . . 4
1.4 Aggregation Propensities of Elastin-based Peptides . . . . . . . . . . . . 6
1.5 Solvent Quality & Conformational Equilibria . . . . . . . . . . . . . . . . 8
1.6 Molecular Mechanism of Elasticity in Elastin . . . . . . . . . . . . . . . . 10
1.7 Review of Previous MD Simulations on Elastin-based Peptides . . . . . . 11
1.8 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.9 Organization of this Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2 Methods 17
2.1 Molecular Dynamics Simulations . . . . . . . . . . . . . . . . . . . . . . 17
2.2 Force Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.2.1 All-atom Force Fields . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2.2 Coarse Grained Force Fields . . . . . . . . . . . . . . . . . . . . . 28
2.3 Sampling Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3 Elastin-based Peptides in Water and Methanol 32
3.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.2.1 Intrinsically Disordered Peptides . . . . . . . . . . . . . . . . . . 34
3.2.2 Radius of Gyration . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.2.3 Intramolecular Peptide-peptide Interactions . . . . . . . . . . . . 36
3.2.4 Interactions between Peptide and Solvent . . . . . . . . . . . . . . 38
3.2.5 β-sheet Content . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.5 Material & Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4 Solvent Quality Studies 50
4.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.2.1 Radius of Gyration . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.2.2 Secondary Structure Content . . . . . . . . . . . . . . . . . . . . 53
4.2.3 Size of peptides In Vacuo . . . . . . . . . . . . . . . . . . . . . . 56
4.2.4 The Discrepancy in β-sheet content . . . . . . . . . . . . . . . . . 59
4.2.5 Ratio of cis/trans Peptide Bonds . . . . . . . . . . . . . . . . . . 63
4.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.4 Material & Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5 Modeling Mechanical Properties 71
5.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.2 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.2.1 Modulus of a Monomer as a Spring . . . . . . . . . . . . . . . . . 72
5.2.2 Young’s Modulus in the tetrahedron model . . . . . . . . . . . . . 73
5.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.3.1 Modulus of Peptide Monomers . . . . . . . . . . . . . . . . . . . . 81
5.3.2 Young’s Modulus . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.3.3 Stress-strain Curve . . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.4.1 Comparison between Experiments and Simulations . . . . . . . . 87
5.4.2 Comparison between Results in Water and in Methanol . . . . . . 88
5.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
5.6 Material & Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6 Summary & Future Directions 93
6.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.2 Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
Appendix A Force Fields Comparison 96
A.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
A.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
A.2.1 Force Fields Comparison for (GVPGV)7 . . . . . . . . . . . . . . 99
A.2.2 Force Fields Comparison for (GV)18 . . . . . . . . . . . . . . . . . 102
A.2.3 Force Fields Comparison for Dipeptides In Vacuo . . . . . . . . . 111
A.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
A.4 Material & Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Appendix B sumcoresg 117
B.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
B.2 Material & Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
B.3 Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
B.4 Screen Shots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
B.5 Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
Appendix C xit 128
C.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
C.2 Material & Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
C.3 Usage Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
C.4 Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
Appendix D tprparser 135
D.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
D.2 Material & Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
D.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
D.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
Bibliography 139
List of Tables
1.1 Definition of the mechanical properties for quantification of elasticity . . 2
1.2 Comparison of in vivo and in vitro elastogenesis . . . . . . . . . . . . . . 5
2.1 Functional forms of bond and angle potentials . . . . . . . . . . . . . . . 22
2.2 Functional forms of the potentials of proper and improper dihedral angles 22
2.3 Functional forms of the Lennard-Jones and electrostatic potentials . . . . 23
2.4 Evolutions of different force fields in chronological order . . . . . . . . . . 29
3.1 Model peptides . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.1 Peptide hydrophobicity and the solvent in which the peptide first reaches
its maximum Rg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.2 Percentages of different cis-X-Pro in PDB database . . . . . . . . . . . . 64
4.3 Summary of the fraction of cis-X-nonPro and cis-X-Pro . . . . . . . . . 66
4.4 Box size of and number of solvent molecules in each system in OPLS-AA/L 69
4.5 Box size of and number of solvent molecules in each system in CHARMM22* 70
5.1 Comparison of Young’s moduli . . . . . . . . . . . . . . . . . . . . . . . . 84
A.1 Selected force field sets for comparison . . . . . . . . . . . . . . . . . . . 97
B.1 Summary of scripts and folders in sumcoresg . . . . . . . . . . . . . . . 122
C.1 Summary of scripts and folders in xit . . . . . . . . . . . . . . . . . . . 132
List of Figures
2.1 Workflow of a MD simulation . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2 Family tree of AMBER force fields . . . . . . . . . . . . . . . . . . . . . 27
3.1 Snapshots of (GVPGV)7 and (GGVGV)7 in water and methanol . . . . . 35
3.2 Distribution of Rg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.3 Propensity for intramolecular peptide-peptide interactions . . . . . . . . 39
3.4 Propensity for intermolecular peptide-solvent interactions . . . . . . . . . 41
3.5 Propensity to form β-sheet structure . . . . . . . . . . . . . . . . . . . . 42
3.6 RDFs between peptide and solvent nonpolar atoms . . . . . . . . . . . . 47
3.7 RDFs between the peptide nonpolar and solvent polar atoms . . . . . . . 48
3.8 Time evolution of the peptide Rg . . . . . . . . . . . . . . . . . . . . . . 49
4.1 Average Rg of model peptides in water, alcohol solvents, and octane . . . 54
4.2 Various types of backbone structures as defined in DSSP . . . . . . . . . . 55
4.3 Rg and intramolecular peptide-peptide H-bonds propensity of ELPs in
vacuo as a function of temperature . . . . . . . . . . . . . . . . . . . . . 57
4.4 Distribution of end-to-end distances in vacuo at 2707 K . . . . . . . . . . 57
4.5 Intramolecular H-bonds propensity of the model peptides in various solvents 58
4.6 Comparison of β-sheet content between in Dataset 1 and Dataset 2. . . . 60
4.7 Comparison of Rg in Dataset 1 and Dataset 2 . . . . . . . . . . . . . . . 61
4.8 Average Rg of (PGV)12 in different solvents in CHARMM22* and OPLS-
AA/L . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.9 Fraction of cis-X-nonPro . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.10 Fraction of cis-X-Pro . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.1 Illustration of a spring complex in the tetrahedron model . . . . . . . . . 74
5.2 Illustration of a unit cell in the tetrahedron model . . . . . . . . . . . . . 75
5.3 PMF along the end-to-end distance of ELPs . . . . . . . . . . . . . . . . 82
5.4 Young’s modulus as a function of strain for (GVPGV)7 and (PGV)12 . . 83
5.5 Stress-strain curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
A.1 Distributions of Rg of (GVPGV)7 in different force field sets . . . . . . . 100
A.2 PMFs of (GVPGV)7 in different force field sets . . . . . . . . . . . . . . 101
A.3 Average Rg of (GVPGV)7, (GV)18 and G36 in water, alcohol solvents, and
octane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
A.4 A snapshot of the zigzag extension of (GV)18 in methanol in CHARMM22* 104
A.5 H-bonding maps of (GV)18 in different force fields . . . . . . . . . . . . . 105
A.6 H-bonding maps of (GV)18 in other solvents in CHARMM22* . . . . . . 106
A.7 PMFs of Ramachandran plots for Gly in (GV)18 in different force fields . 108
A.8 PMFs of Ramachandran plots for Val in (GV)18 in different force fields . 109
A.9 PMFs of Ramachandran plots for Gly in G36 in different force fields . . . 110
A.10 Potential energy maps of the Gly dipeptide in different force fields . . . . 112
A.11 Potential energy maps of the Val dipeptide in different force fields . . . . 113
A.12 Potential energy maps of the Pro dipeptide in different force fields . . . . 114
B.1 Workflow of sumcoresg . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
B.2 Historical usage data along the time . . . . . . . . . . . . . . . . . . . . . 125
B.3 Historical usage data in a bar chart . . . . . . . . . . . . . . . . . . . . . 126
List of Acronyms
Rg radius of gyration 22, 36–38, 40, 46, 49, 50, 52–54, 56, 57, 59, 61, 67, 68, 90, 91, 95–99, 102, 103
ALP amyloid-like peptide 7, 8, 13, 30, 32–34, 36, 38–41, 43, 44, 47, 51, 53, 93, 95
AMBER Assisted Model Building with Energy Refinement 21–26, 29
CD circular dichroism 96
CG coarse-grained 21, 28, 95
CHARMM Chemistry at HARvard Macromolecular Mechanics 21–23, 26, 29
EBP elastin-based peptide 6–9, 11, 12, 14, 15, 32, 96
ELP elastin-like peptide 7–9, 13–15, 28, 30, 32–34, 36, 38–41, 43, 44, 47, 51, 53, 56, 57, 67, 71, 81, 82, 84, 86–88, 91, 93–96
ENM elastic network model 95
FTIR Fourier transform infrared spectroscopy 7
GP genipin 5
H-bond hydrogen bond 36, 38–41, 46, 56–58, 102, 105, 106, 115
HP hydrophobic 3, 6, 7, 13, 32, 73, 74, 78, 87, 91, 95
HPC high-performance computing 14
HTTP Hypertext Transfer Protocol 119, 123
IDP intrinsically disordered peptide 8, 34, 94, 95
LINCS linear constraint solver 45, 69, 70, 115
MD molecular dynamics 11–15, 17, 30, 32, 71, 85, 88, 94–96, 128, 130, 134, 135, 137
MR multiple replica 30
NMR nuclear magnetic resonance 11, 21, 68, 96
OPLS Optimized Potentials for Liquid Simulations 21–24, 29, 31, 70
PME Particle-Mesh Ewald 45, 69, 115
PMF potential of mean force 72, 73, 81, 82, 85, 92, 99, 101, 107–111
PQQ pyrroloquinoline quinone 5
RDF radial distribution function 46–48
REX replica exchange 30
RHS right-hand-side 76
SEM standard error of mean 37, 82
SSH Secure Shell 2 119–122
ssNMR solid-state NMR 11
STDR simulated tempering distributed replica 14, 30
US umbrella sampling 30
VREX virtual replica exchange 30
XDR External Data Representation 136
XL cross-linking 3, 4, 73, 74, 87, 91, 95
List of Symbols
A cross-sectional area perpendicular to the direction of extension of a piece of elastin-like material 72
F pulling force 72, 74, 76
G0 system free energy when the peptide is in its relaxed state 73
Gd system free energy when the peptide’s end-to-end distance is d 72
H system enthalpy 89
KY Young’s modulus 72, 74, 79–81, 83–85, 87, 88, 91, 94
S system entropy 89
T temperature 72, 89
U potential energy 18–20
X length of a spring complex in the tetrahedron model in Figure 5.1 74, 76, 77, 80
Z partition function 72
∆S change of system entropy between the extended and relaxed states 88–90
∆d extension of a peptide monomer 88, 89
∆l extension of a piece of elastin-like material 72, 79
∆G change of system free energy when the peptide end-to-end distance changes from d to d0 88
F force applied on a particular atom by the rest of the system during MD simulations 18–20
R coordinates of all the atoms in an MD system 18
r coordinate of a particular atom in an MD system 18–20
v velocity of a particular atom in an MD system 18–20
θtetra tetrahedron angle, 109.4712° 74
m superscript marking the in-methanol value of a property 90
w superscript marking the in-water value of a property 90
u subscript marking the solute component of a property 90
v subscript marking the solvent component of a property 90
d end-to-end distance of a peptide monomer 72, 76, 89
d0 end-to-end distance of a peptide monomer in its relaxed state 71–73, 76, 78, 81, 85, 88, 94, 95
dt time step used in MD systems 18, 20
f recoiling force of a piece of elastin-like material in its extended state xvi, 88, 89
fe enthalpic part of the recoiling force (f) 89
fs entropic part of the recoiling force (f) 89
h0 height of a piece of elastin-like material in its relaxed state 77
k modulus of a spring or a peptide monomer 71–73, 76, 79, 81, 85, 87–89, 91, 94, 95
kB Boltzmann constant 72
kc modulus of a spring complex in the tetrahedron model 74, 76, 78
ku modulus of a unit cell in the tetrahedron model 74, 78, 79
l length of a piece of elastin-like material 72
l0 length of a piece of elastin-like material in its relaxed state 77
m mass of a particular atom in an MD system 20
nu,x number of unit cells along the x axis in the tetrahedron model 79
nu,y number of unit cells along the y axis in the tetrahedron model 79
nu,z number of unit cells along the z axis in the tetrahedron model 79
p pressure 89
p0 probability when the peptide is in its relaxed state 73
pd probability when the peptide’s end-to-end distance is d 72
r ratio of the extension (x) over the original length (x0) of a spring complex in the tetrahedron model. Equal to the strain of elastin-like material 76, 79, 80
r′ ratio of the shrinkage (x) over the original width and height of a piece of elastin-like material 76
s length of OO1 in the tetrahedron model in Figure 5.1 74, 76
s0 length of OO1 in the tetrahedron model in the relaxed state in Figure 5.1 74
t time in an MD simulation 18
w0 width of a piece of elastin-like material in its relaxed state 77
x extension of a spring complex in the tetrahedron model in Figure 5.1 xvii, 74, 76
x0 length of a spring complex in its relaxed state in the tetrahedron model in Figure 5.1 xvii, 74
Chapter 1
Introduction
1.1 Elastomeric Proteins
A protein is considered elastomeric, or elastic, if it possesses elasticity, which is the
physical property of a material to return to its original shape after being deformed by an
external force. Measures of elasticity upon stretching include resilience, stiffness,
strength, extensibility, and toughness. Their definitions are shown in Table 1.1.
Elastomeric proteins play crucial biological roles throughout the animal kingdom [101].
Among them, those that combine high resilience, large extensibility, and low stiffness are
usually described as rubber-like proteins [41], since such properties are also characteristic
of rubber. Typical rubber-like proteins include elastin and resilin. Elastin exists in most
vertebrates and is responsible for the extensibility and recoil of biological tissues such as
blood vessels, lung, and elastic ligaments. Resilin, while very similar to elastin, exists
only in insects and conveys essential mechanical properties to tissues such as the wing
joints of dragonflies, the cuticle of fleas, and the tymbal of cicadas [101]. Examples of
proteins that are elastomeric but not rubber-like include collagen fibers, which are
highly resilient but also very stiff, and CoIP from mussel byssus threads and spider dragline
silks, which have considerable stiffness, strength, and extensibility that prevent them from
fracturing [41, 101]. A more comprehensive review of elastomeric proteins and measurements
of their mechanical properties can be found in Rauscher & Pomes, 2010 [101].
The mechanical properties of elastomeric proteins are determined by their underlying
structures at the molecular level. Because of their promising applications in biomedical
engineering and material science [4], many studies have investigated their
structure-function relationships [32, 22, 18]. The work presented in this thesis focuses
on one of these proteins: elastin.
Property        Definition
stress          force applied to the material, normalized by its cross-sectional area
                during deformation
strain          extension of the material along the direction of the applied force,
                normalized by its original length
resilience      efficiency of the material in storing energy, defined as the difference
                between the work done upon deformation and the heat released upon
                relaxation, normalized by the work done
stiffness       measured by the Young’s modulus of the material, which is defined as the
                slope of the stress-strain curve upon stretching
strength        stress at which the material ruptures
extensibility   strain at which the material ruptures
toughness       total amount of work needed to rupture the material
Table 1.1: Definition of the mechanical properties for quantification of elasticity.
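To make the definitions in Table 1.1 concrete, the following minimal Python sketch computes each quantity from sampled stress-strain data. The function names, the trapezoidal integration, the two-point estimate of the initial slope, and all numerical values are illustrative assumptions, not measurements or methods from this thesis. Toughness is expressed here as work per unit volume, i.e. the area under the stress-strain curve up to rupture.

```python
# Illustrative sketch only: all data below are hypothetical, not experimental values.

def trapezoid(y, x):
    """Area under y(x) by the trapezoidal rule."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))

def stress(force, area):
    """Applied force normalized by the cross-sectional area."""
    return force / area

def strain(extension, original_length):
    """Extension along the pulling direction normalized by the original length."""
    return extension / original_length

def stiffness(strains, stresses):
    """Young's modulus, estimated as the slope between the first two samples."""
    return (stresses[1] - stresses[0]) / (strains[1] - strains[0])

def resilience(strains, loading, unloading):
    """(work done - heat released) / work done over one stretch-relax cycle."""
    work_done = trapezoid(loading, strains)
    heat_released = work_done - trapezoid(unloading, strains)
    return (work_done - heat_released) / work_done

def strength(stresses):
    """Stress at rupture (last sample of a curve recorded up to rupture)."""
    return stresses[-1]

def extensibility(strains):
    """Strain at rupture (last sample of a curve recorded up to rupture)."""
    return strains[-1]

def toughness(strains, stresses):
    """Work per unit volume needed to rupture the material."""
    return trapezoid(stresses, strains)

# Hypothetical stretch-relax cycle below the rupture point (stress in MPa).
cycle_strain = [0.0, 0.5, 1.0]
cycle_loading = [0.0, 0.5, 1.0]
cycle_unloading = [0.0, 0.4, 1.0]

# Hypothetical curve recorded until rupture (stress in MPa).
rupture_strain = [0.0, 0.5, 1.0, 1.5]
rupture_stress = [0.0, 0.5, 1.0, 1.5]

print("stiffness:", stiffness(rupture_strain, rupture_stress))
print("resilience:", resilience(cycle_strain, cycle_loading, cycle_unloading))
print("strength:", strength(rupture_stress))
print("extensibility:", extensibility(rupture_strain))
print("toughness:", toughness(rupture_strain, rupture_stress))
```

Resilience is evaluated on a cycle that stays below the rupture point, whereas strength, extensibility, and toughness require a curve recorded until the material ruptures, so the two hypothetical data sets are kept separate.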
1.2 Elastin and Tropoelastin
Elastin is an extracellular matrix protein [123, 85] that has been found in all vertebrates
except jawless agnathans such as the lamprey [23]. The elastin content varies among
tissues. For example, it accounts for about 28-32% of the dry mass in major vascular
vessels, 3-7% in lung, 50% in elastic ligaments, 4% in tendon, and 2-3% in skin [112, 123].
In addition, elastin has also been found in vertebral ligamenta flava, vocal cords, elastic
cartilage, and bladder [83, 24].
As a matrix protein, elastin is assembled from a monomeric precursor called tropoelastin.
While mature elastin is extremely insoluble, tropoelastin is soluble at room temperature.
Tropoelastin is unusual in both its amino acid composition and its domain composition.
At the amino acid level, tropoelastin consists mostly of hydrophobic residues, which makes
it one of the most hydrophobic proteins. For example, human tropoelastin has 34 exons [17]
and over 700 amino acids, but 75% of the entire sequence consists of only four hydrophobic
residues: Gly, Val, Ala, and Pro [52]. Such a high level of hydrophobicity is common to the
elastin of all higher vertebrates, despite some species variation [123]. At the domain
level, tropoelastin consists of alternating hydrophobic (HP) and cross-linking (XL)
domains. HP domains are usually rich in nonpolar residues and highly repetitive. For
example, exon 24 of human tropoelastin contains seven repeats of PGVGV[L/A] [52]. XL
domains are usually rich in Ala, with a couple of Lys residues interspersed in the form of
KAAK or KAAAK [123]. When tropoelastin molecules are crosslinked to form a matrix, four
Lys residues from two XL domains act as the crosslinkers and are oxidatively deaminated to
form the crosslink, a desmosine or isodesmosine [123]. It is thought that the XL domains
impart strength and stability to elastin, while the HP domains confer extensibility [101].
1.3 In Vivo and In Vitro Elastogenesis
The process of elastin generation, which includes tropoelastin production and matrix
formation, is called elastogenesis [115]. Usually, this term is used for the in vivo process,
but since in vitro synthesis of elastin-like material has been made possible [10, 120], we
think it also applies to the in vitro process.
In vivo, isoforms of tropoelastin mRNA are produced from a single tropoelastin gene
through alternative splicing. They are transported to the rough endoplasmic reticulum
(RER) in the cytoplasm and translated into tropoelastin polypeptides. With very
few post-translational modifications, tropoelastin binds to the elastin-binding protein
(EBP), which protects it from degradation, and together they are then transported close
to the cell surface via the Golgi apparatus. At the cell surface, EBP also binds to a
β-galactosugar, which reduces its affinity for tropoelastin. As a result, tropoelastin is
released into the extracellular environment. Released tropoelastin molecules align with
each other on the microfibrils, a scaffold made of multiple distinct proteins, so that their
crosslinker residues (i.e. Lys in the XL domains of tropoelastin) come close enough to each
other for crosslinking reactions to occur. This process is also called coacervation. After
alignment, tropoelastin molecules are crosslinked by lysyl oxidase via oxidative
deamination, which results in the formation of mature elastin. The resulting crosslinks
prevent elastin from falling apart under extension, which is essential for its mechanical
properties. Since elastin is closely associated with microfibrils, the final structure
of elastin and microfibrils together is also called an elastic fiber. A more comprehensive
description of elastogenesis can be found in Vrhovski and Weiss, 1998 [123] and Eldijk et
al. 2012 [115].
An in vitro process analogous to in vivo elastogenesis has been developed to produce
elastin-like materials [10, 120], though the details are quite different. Instead
of using tropoelastin directly, a much shorter elastin-like peptide can be used as the
precursor peptide for the later-formed polymeric matrix. First, the monomeric peptides are
produced in genetically modified E. coli and purified. Second, coacervation is induced
by increasing the temperature. Then, a crosslinking agent such as genipin (GP) or
pyrroloquinoline quinone (PQQ) is added to the coacervate to start the crosslinking
reactions. The coacervate with the added crosslinking agent is left overnight, during
which self-alignment and the formation of crosslinks take place. Some of the major
differences from the in vivo process are summarized in Table 1.2.
Step                  In Vivo [123, 115]                     In Vitro [11, 10, 120]
Monomeric peptide     Tropoelastin                           Elastin-like peptide or tropoelastin
Production of         Produced through transcription,        Produced with genetically modified
monomeric peptides    splicing, translation, transport to    E. coli and purified
                      the cytoplasm, post-translational
                      modification, and transport out of
                      the cell
Coacervation          Induced by an increase in              Induced by an increase in
                      concentration                          temperature
Crosslinking          Achieved with lysyl oxidase            Achieved with a chemical crosslinker
                                                             such as GP or PQQ
Table 1.2: Comparison of in vivo and in vitro elastogenesis.
1.4 Aggregation Propensities of Elastin-based Peptides
As described above, there is a well-documented protocol for producing elastin-like
material in vitro, but very little is known about this process at the molecular level. Of
the unknowns, coacervation is of particular interest to this thesis. For example, very
little is known about the structure of the coacervate, the protein-rich phase formed after
phase separation.
In the simplest sense, coacervation can be understood as a type of protein aggregation.
More precisely, in vitro coacervation is a reversible, temperature-induced phase
separation process in which tropoelastin molecules aggregate, self-assemble,
and form a turbid, protein-rich second phase [122, 123, 24]. Coacervation is generally
considered to result from increased hydrophobic interactions between the HP domains
of tropoelastin as the temperature increases [123]. The onset temperature of coacervation
depends on multiple factors, including sequence composition, peptide concentration, ionic
strength, pH, and solvent hydrophobicity [123, 11, 79, 80].
It has been found that model peptides derived from the sequences of HP domains, i.e.
elastin-based peptides (EBPs), can also coacervate. Furthermore, materials made of
such sequences have been shown to possess mechanical properties similar to those of native
elastin, hence they are described as elastin-like [10]. However, not all EBPs coacervate.
The aggregation propensities of EBPs can be modulated by introducing sequence variation or
changing the solvent conditions. For example, an EBP with PGVGVA repeats named EP20-24-24
is capable of coacervation, but when P is mutated to G, resulting in GGVGVA
repeats, it forms amyloid-like fibrils instead, which contain a large amount of β-sheet
[79]. Another EBP, (VGGVG)n, forms amyloid-like structures when deposited in water
[36, 34, 35], which is consistent with the previous observation [79] since its repetitive
unit, GGVGV, is very similar to that of EP20-24-24 after the P-to-G mutation, GGVGVA.
However, if (VGGVG)n is deposited in methanol instead, it initially forms an amorphous
film, which eventually evolves into beaded-string structures [36]. The beaded-string
morphology might be an artifact caused by the oxidized silicon on the substrate surface
used for deposition [34]. To limit contact between the peptides and the substrate, more
recent work on another very similar sequence, (VGGLG)n, used a pegboard-like substrate
surface, which led to the formation of cigar-like bundles instead of beaded
strings. Like (VGGVG)n, (VGGLG)n also forms amyloid-like fibrils when deposited in water
[20]. The presence of β-sheet structure in those fibrils has been confirmed
by Fourier transform infrared spectroscopy (FTIR) [105], and models for those structures
have been proposed and evaluated [35, 105].
On the one hand, coacervation is believed to be an important step in in vivo elastogenesis,
and it has been suggested that coacervation concentrates and aligns tropoelastin before
cross-linking [123]. On the other hand, the formation and deposition of amyloid-like fibrils
are associated with many neurodegenerative diseases such as Alzheimer’s and Parkinson’s
diseases [29]. Amyloid-like structure has even been proposed by Dobson to be a generic
and inherent structural form accessible to all proteins under appropriate conditions [28,
29]. Therefore, it is important to understand the molecular mechanism of amyloid-like
fibril formation in order to develop effective treatments for those diseases. Given these
two types of aggregation, it is notable that some EBPs can display both of them
under varying conditions.
In the literature, elastin-based peptides have also been called elastin-like peptides or
elastin-derived peptides. In order to have a consistent nomenclature, all peptides that are
derived from HP domains of tropoelastin will be called elastin-based peptides (EBPs) in
this thesis, but only EBPs that tend to coacervate are called elastin-like peptides (ELPs),
while those that tend to form amyloid-like fibrils are called amyloid-like peptides (ALPs).
In 2006, my colleague Sarah Rauscher and coworkers investigated the structural properties
of a set of EBPs with different sequence compositions. They found that ELPs and
ALPs are distinguishable by their backbone hydration and peptide-peptide hydrogen
bonding, and that ELPs remain disordered in both the monomeric and aggregated states [98].
Furthermore, they discovered that Pro-Gly (PG) content is a very important criterion for
determining a peptide’s aggregation propensity [98]. Their PG diagram shows
that peptides with higher PG content are unlikely to form amyloid-like fibrils, since both
Pro and Gly are secondary structure breakers owing to their extreme rigidity and
flexibility, respectively. As a result, peptides with a high percentage of Pro and Gly are
expected to be disordered. This discovery challenges Dobson’s proposal that amyloid-like
structure is accessible to all kinds of proteins.
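The Pro and Gly contents that locate a peptide on such a diagram are simple sequence statistics. The following minimal Python sketch, written for illustration and not taken from [98], computes them for the two repeat units discussed above, PGVGVA and GGVGVA. The helper names and the choice of seven repeats are assumptions, and no threshold separating ELPs from ALPs is implied.

```python
from collections import Counter

def repeat_sequence(unit, n_repeats):
    """Build a repetitive peptide sequence such as (PGVGVA)n in one-letter code."""
    return unit * n_repeats

def residue_fractions(sequence, residues="PG"):
    """Fraction of the sequence made up by each residue type of interest."""
    counts = Counter(sequence)
    return {residue: counts[residue] / len(sequence) for residue in residues}

# Repeat units from Section 1.4; the repeat number (7) is an arbitrary choice.
coacervating = repeat_sequence("PGVGVA", 7)  # repeats of the coacervating EP20-24-24
amyloid_like = repeat_sequence("GGVGVA", 7)  # repeats after the P-to-G mutation

for name, seq in [("PGVGVA repeats", coacervating), ("GGVGVA repeats", amyloid_like)]:
    fractions = residue_fractions(seq)
    print(name, "Pro: %.3f" % fractions["P"], "Gly: %.3f" % fractions["G"])
```

For a purely repetitive sequence the fractions do not depend on the number of repeats, so the comparison reflects only the composition of the repeat unit: the P-to-G mutation removes all Pro while raising the Gly fraction.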
However, very limited knowledge is known about how the solvent conditions affect EBPs’
aggregation propensities at the molecular level, which is one of the most important ques-
tions to be concerned in this thesis.
1.5 Solvent Quality & Conformational Equilibria
In order to develop a comprehensive understanding of the aggregation process, it is
important to have a quantitative measure of the structure of peptides in solution. The
tendency of EBPs to aggregate and remain disordered [98] points to a similarity between
intrinsically disordered peptides (IDPs) and synthetic polymers [104, 90], whose
conformations have been well studied in polymer physics. The conformational equilibria
of synthetic polymers are governed by the balance of chain-chain and chain-solvent
interactions, which are in turn determined by solvent quality [90]. Therefore, we can adopt
the analysis methods of polymer physics and apply them to polypeptides.
A single polymer molecule in a dilute solution can adopt a swollen coil in a good
solvent, a collapsed globule in a poor solvent, or a state in between. In a good solvent
(e.g. polystyrene in benzene), chain-solvent interactions are favored over chain-chain
interactions, so the molecule swells; in a poor solvent (e.g. polystyrene in ethanol),
chain-chain interactions dominate over chain-solvent interactions, so the molecule
collapses and becomes compact. Between the poor-solvent and good-solvent regimes there
is an ideal point at which chain-chain and chain-solvent interactions balance out.
This point is called the θ-point, and the corresponding solvent and temperature are
called the θ-solvent and θ-temperature. At the θ-point (e.g. polystyrene in cyclohexane at
34.5 °C), the chain adopts a random coil and reaches its maximum chain entropy [104].
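One common way to quantify how swollen or collapsed a chain is, is its radius of gyration. The following is a minimal sketch in plain Python, not the analysis code used in this thesis; the function name `radius_of_gyration` and the two toy 27-bead chains are assumptions made purely for illustration.

```python
import math

def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a list of (x, y, z) points."""
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    # Center of mass, one Cartesian component at a time
    center = [sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)]
    # Mass-weighted mean squared distance from the center of mass
    msd = sum(
        m * sum((c[k] - center[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    ) / total
    return math.sqrt(msd)

# Hypothetical toy chains with 27 beads each and unit spacing
extended = [(float(i), 0.0, 0.0) for i in range(27)]      # fully stretched rod
compact = [(float(x), float(y), float(z))                  # 3 x 3 x 3 cube
           for x in range(3) for y in range(3) for z in range(3)]

rg_extended = radius_of_gyration(extended)
rg_compact = radius_of_gyration(compact)
```

For the same number of beads, the compact arrangement gives a much smaller radius of gyration than the extended one, which is the qualitative signature that separates a collapsed globule from a swollen coil.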
A special case of solvation is the polymer melt, in which the polymers are
solvated by themselves. Paul Flory first predicted that polymer molecules may
behave as ideal chains when solvated by themselves [104], and this prediction has since
been validated for synthetic homopolymers like poly(methyl methacrylate) [62].
Encouragingly, my colleague Sarah Rauscher recently discovered with MD simulations that
the conformation of the ELP (GVPGV)7 in aggregates resembles that in a polymer melt [97],
which not only contributes to our understanding of the structure of the coacervate, but also
suggests that Flory’s theorem is applicable to polypeptides as well.
The solvent quality of a particular solvent is mainly affected by the inherent properties of
the solvent molecules and the temperature [90]. The theory from polymer physics works
well for uniform polymers like polyethylene, but when it comes to a polypeptide, the case
is often more complicated due to the uneven distribution of polar and nonpolar groups,
i.e. polar backbone and nonpolar sidechains in the EBPs, which causes the formation of
secondary structures in proteins. Therefore, even with the aid of polymer physics, the
problem of how solvent quality affects the conformational equilibria of a peptide needs
to be further explored.
1.6 Molecular Mechanism of Elasticity in Elastin
Elastin has been under study for over 70 years [130], and it has been shown that its
elasticity is primarily due to the loss of entropy upon going from the relaxed to the
stretched state [82, 49]. However, because of its conformational heterogeneity [98] and
extreme insolubility [96], the characterization of its atomistic structure remains elusive,
and hence the molecular mechanism of its elasticity remains controversial. This section
presents a brief review of the various models proposed to explain the molecular mechanism
of elasticity in elastin.
Over the course of elastin research, two major groups of structure-function models have
been proposed, which consider elastin to be either isotropic or anisotropic [123, 83].
The isotropic model considers elastin to be a random-chain network like rubber, where
each individual peptide is kinetically free. As a result, the elasticity in elastin is mainly
due to the decrease in chain entropy when it is being stretched [49, 30]. There are many
research results compatible with the random-chain network model. For example, elastin
contains a high percentage of Pro and Gly, which is conducive to disordered peptide
structure, and polarized light microscopy on elastin exhibits no birefringence, which
suggests isotropic conformation [1]. However, this model cannot explain the fact that
elastin is not self-lubricating and requires a plasticizer such as water in order to exhibit
elasticity [92].
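The entropic origin of the restoring force in the random-chain picture can be made concrete with the textbook Gaussian (ideal) chain. This is given only as an illustration under stated assumptions, not as the model developed in this thesis: the symbols N (number of segments), b (segment length), r (end-to-end distance), S_0 (entropy of the unstretched chain) and k_B (Boltzmann constant) are introduced here for the sketch, and energetic contributions are neglected.

```latex
\begin{align}
S(r) &= S_0 - \frac{3 k_B r^2}{2 N b^2}, \\
f(r) &= -T \frac{\partial S}{\partial r} = \frac{3 k_B T}{N b^2}\, r .
\end{align}
```

In this idealization, stretching the chain lowers its entropy, and the force needed to hold it at extension r grows linearly with r and with temperature.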
The anisotropic models can be further categorized into the two-phase model (mainly
the liquid drop model [129] and the oiled-coil model [44]) and the β-spiral model [119].
The two-phase model emphasizes that elastin contains both a hydrophobic phase and
a hydrophilic phase. When elastin is in its relaxed state, the hydrophobic phase is
buried inside while the hydrophilic phase is on the surface, but when it is stretched, with
the increase in its surface area, the hydrophobic phase becomes more exposed, which
results in relative ordering of the surrounding water molecules, and induces a decrease in
the total entropy of the system [92, 129, 44]. The two-phase model is supported by
a fluorescence study, in which the dye-labeled elastin exhibits lower fluorescence when
being stretched, indicating a nonhomogeneous environment inside the elastin network
[43], but it is criticized for being unlikely to convey high backbone mobility, which is
observed by both nuclear magnetic resonance (NMR) and solid-state NMR (ssNMR)
studies [110, 96]. The β-spiral model suggests that elastin consists of helical structures
which comprise repetitive β-turns, and that elasticity is caused by reduced librational
entropy upon stretching [119]. However, the β-spiral structure has been reported to be very
unstable [67], though transient β-turns are abundant [122, 98]. A previous study from
our group has shown that the hydrophobic domains of elastin remain disordered
even in the aggregated state due to their richness in Pro [98]. Therefore, the β-spiral model
is unlikely to be a reliable description of elastin.
More detailed reviews of the various models for elastin can also be found in published
papers and reviews [68, 123, 83]. Overall, the proposed models are supported by some ex-
perimental results, but unfortunately, none of them can explain all the evidence observed
in experiments properly [68].
1.7 Review of Previous MD Simulations on Elastin-based Peptides
Because there is still no experimental approach for obtaining high-resolution structural
information about intrinsically disordered peptides like EBPs, molecular dynamics (MD)
simulation is a good technique to study their structures at the atomistic level. MD
simulation is also the major technique used in this thesis. In this section, we briefly
review all of the MD simulations that have been conducted on peptides relevant to
elastin.
The first MD simulation on an EBP was conducted by Chang and Urry in 1989 [21].
Starting from a previously developed β-spiral structure [119], they simulated the
repetitive polypeptide VPGVG in vacuo for 100 ps in both relaxed and stretched states [21].
In 1990, Wasserman and Salemme simulated (VPGVG)18 in the β-spiral structure for
130 ps, but with water molecules included [125].
Analysis of both of the above simulations supported the so-called
“librational elasticity mechanism” for explaining elastin’s elasticity [114]. However,
since elastin is known to be functional only in its hydrated state and brittle otherwise [6],
the in vacuo state of VPGVG in Chang and Urry’s simulations is probably not
representative of elastin’s functional state. Besides, in both simulations the time
scales are only on the order of 100 ps, which from today’s perspective is much too short
to allow adequate conformational relaxation of the peptide from its initial state.
Interestingly, the β-spiral model was refuted about a decade later, in 2001, after Li et al.
simulated a 90-residue β-spiral-structured EBP, (VPGVG)18, in explicit water for a
total of 80 ns at 7 different temperatures between 7 and 42 °C [67]. They found that the
peptide collapses at all temperatures, which shows the instability of the β-spiral
structure. They concluded that the well-ordered β-spiral model is not a good description of
elastin in water [68]. Besides, Li et al. also found that the peptide appeared to be more
compact at temperatures above the transition temperature, i.e. the temperature at which
coacervation happens, than at temperatures below it.
Based on these results, they proposed an atomic-level description of coacervation.
However, a later study from our group, which involved much more extensive sampling
of the peptide (GVPGV)7, with a total sampling time of 84 µs at 105 temperatures
between 266 and 749 K (800 ns per temperature), suggests that the compactness observed
at higher temperatures is probably due to a shorter relaxation time [97]. Therefore, the
result of Li et al. on the overall compactness of the peptide is probably an artifact of
insufficient sampling time (80 ns). Had the simulations of Li et al. [67] been extended,
the peptides at lower temperatures would be expected to become more compact than those
at higher temperatures.
Although it has been known for many years that elasticity in elastin is mainly entropic
[121, 82, 30, 5], what is still not clear is which part of the system is the major contributor
to the entropy change. Is it the change of backbone chain entropy, or that of the entropy
of the water (a.k.a. the hydrophobic effect)? The earliest simulations failed to answer
this question because either they were in vacuo [21] or the hydration properties were
not analyzed [125, 68]. In 2002, by pulling and releasing (VPGVG)18 at 10 and
42 °C, Li et al. found that the orientational entropy of water molecules hydrating the
hydrophobic groups decreases upon pulling and increases upon releasing, while the chain
entropy undergoes the opposite change, at least for short extensions, which is consistent
with results from a previous microcalorimetry experiment in 1978 [42]. Therefore, they
concluded that hydrophobic hydration is an important source of elasticity in elastin [66].
In 2004, Floquet et al. characterized the structural properties of the hexapeptide VGVAPG,
derived from the repetitive HP domains, using both MD simulations and experimental
techniques, and found that the GVAP sequence in the peptide exhibits a so-called
type VIII β-turn [37]. In 2006, another small elastin-based oligopeptide, GVG(VPGVG), was
simulated for around 100 ns [6], and the peptide’s kinetics were analyzed.
The simulations published in 2004 and 2006 only involved oligopeptides. Also in 2006,
a much more extensive study of much longer peptides was conducted in our group
[98]. The key objective of that work was to examine the structural properties of ELPs
and ALPs. As mentioned in Section 1.4, ELPs and ALPs are both derived from the
HP domains of tropoelastin, but display different aggregation propensities. That study
showed that the two types of peptides are separable based on backbone hydration and
peptide-peptide hydrogen bonding, and ELPs remain disordered in both monomeric and
aggregated state [98]. Another important contribution from that work is the discovery
of a PG threshold in the peptides’ sequence composition, above which the peptides are
elastin-like and below which the peptides are amyloid-like [98]. The total sampling time
of this work reached 800 ns, which is about 8000 times longer than that of the first MD
simulation done in 1989.
However, despite the enormous advancement in computational power, the time scale in
MD simulations is still very limited compared to that in experiments on macroscopic
systems. This bottleneck is even more challenging for disordered proteins because of their
structural heterogeneity. To relieve the limitation, our group developed an enhanced-
sampling technique called simulated tempering distributed replica (STDR). In 2009, a
sampling time of 42 µs for the system of (GVPGV)7 as a monomer in explicit water [99]
was reached after deploying STDR on high-performance computing (HPC) facilities.
Continuing from the monomer study, Sarah Rauscher also explored the structural
properties of the ELP (GVPGV)7 in the aggregated state. Surprisingly, she found that an
elastin-like aggregate resembles a polymer melt, in which the monomers become
very flexible and behave similarly to a polymer chain in a θ-solvent. Based on her
results, she proposed a unified model intended to resolve the contradictions between
different structure-function models: only in the aggregate, through extensive
intermolecular peptide-peptide nonpolar interactions (consistent with the two-phase
model), can the peptides’ chain entropy become maximized, so that the peptides
become random chains (consistent with the random-chain network model) [97].
As reviewed above, all of the previous MD simulations were done either in vacuo or in
water. Since EBPs can display different aggregation propensities in different solvents,
and the hydrophobic effect can be an important source of elasticity, it would be interesting
and informative to simulate EBPs in solvents whose polarities differ from that of water.
1.8 Objectives
The objectives of this thesis include:
1. Characterize the structural properties of EBPs in water and methanol, and explain
the variation of their aggregation propensities.
2. Characterize the structural properties of EBPs in other alcoholic solvents of varying
polarities, and compare their solvent quality for these peptides.
3. Model the Young’s modulus of macroscopic material based on MD data, and com-
pare the modeled results to experimental measurements.
1.9 Organization of this Thesis
Chapter 1 provides a general introduction to elastin-relevant topics. Chapter 2 briefly in-
troduces MD simulations, the developments of MD force fields and models, and sampling
errors. Chapters 3–5 present the major results in this thesis. Chapter 3 examines the
structural properties of EBPs successively in water and in methanol, and discusses the
solvents’ effects on their aggregation properties. Chapter 4 extends the solvent sets to
include more alcoholic solvents and octane, and discusses their solvent qualities for EBPs.
Chapter 5 describes how the modulus of a monomeric ELP is calculated, and proposes
a mathematical model to calculate the Young’s modulus of a piece of macroscopic mate-
rial based on these ELPs. Chapter 6 summarizes the contributions from this thesis and
proposes future directions. Finally, Appendix A presents the results of ongoing work
on force field comparison, which aims to find an optimal force field for this project, and
Appendices B, C and D describe three computational tools developed when preparing
this thesis.
Chapter 2
Methods
2.1 Molecular Dynamics Simulations
MD simulation is the major technique employed in this thesis. MD simulations intend
to generate a conformational ensemble of the target molecular system by simulating its
dynamics using classical Newtonian mechanics, and based on the ensemble, interesting
structural, thermodynamic and mechanical properties can be calculated. The rest of this
section presents the basic theory and practice of MD simulations.
A typical MD simulation needs two ingredients. The first one is a set of atom coordinates
of the system of interest, which can be either obtained from experiments like X-ray
crystallography or NMR study, or constructed de novo, and the second one is a force
field, which includes a set of functions that define the calculation of the potential energy
of the target system and the corresponding parameters used by these functions [38]. The
exact mathematical forms of the functions depend on the force field as discussed in the
next section.
To start the simulation, the initial velocities of all atoms are assigned artificially according
to the Maxwell distribution at a particular temperature. The force at time 0 is calculated
as
\[ \mathbf{F}_i(0) = -\frac{\partial U(0)}{\partial \mathbf{r}_i(0)}, \tag{2.1} \]
where $\mathbf{F}_i$ is the force applied on the ith atom, $\mathbf{r}_i$ is its coordinates, and U is the potential
energy of the system as a function of the coordinates of all the atoms (R).
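The velocity assignment described above can be sketched as follows. This is a minimal illustration, not the procedure of any particular MD package: each Cartesian velocity component is drawn from a Gaussian with variance k_B T/m, which is what the Maxwell distribution implies for individual components. The function names, the unit choice (kJ/mol, g/mol, nm/ps) and the 20000-atom toy system are assumptions, and removal of center-of-mass motion is deliberately left out.

```python
import math
import random

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def assign_velocities(masses, temperature, rng):
    """Draw each velocity component from a Gaussian with variance k_B*T/m.

    Masses are in g/mol and the temperature in K, so velocities are in nm/ps.
    """
    velocities = []
    for m in masses:
        sigma = math.sqrt(K_B * temperature / m)
        velocities.append([rng.gauss(0.0, sigma) for _ in range(3)])
    return velocities

def kinetic_temperature(masses, velocities):
    """Instantaneous temperature from the kinetic energy: T = 2*KE / (3*N*k_B)."""
    ke = 0.5 * sum(m * sum(c * c for c in v) for m, v in zip(masses, velocities))
    return 2.0 * ke / (3.0 * len(masses) * K_B)

rng = random.Random(0)             # fixed seed so the sketch is reproducible
masses = [12.011] * 20000          # hypothetical system of carbon-like atoms
vels = assign_velocities(masses, 300.0, rng)
t_inst = kinetic_temperature(masses, vels)   # fluctuates around the 300 K target
```

For a finite system the instantaneous temperature only approximates the target, with fluctuations that shrink as the number of atoms grows.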
To integrate Newton’s equations of motion over the next time step dt, suppose the
current time is t. The Taylor expansion of r(t + dt) is
\[ \mathbf{r}(t+dt) = \mathbf{r}(t) + dt\,\mathbf{v}(t) + \frac{dt^2}{2}\frac{\mathbf{F}(t)}{m} + \frac{dt^3}{3!}\frac{\partial^3 \mathbf{r}(t)}{\partial t^3} + O(dt^4), \tag{2.2} \]
where m is the mass of the atom.
Several algorithms have been developed to perform the integration, and the most common
one is called the Verlet algorithm. It starts by expanding r(t − dt),
\[ \mathbf{r}(t-dt) = \mathbf{r}(t) - dt\,\mathbf{v}(t) + \frac{dt^2}{2}\frac{\mathbf{F}(t)}{m} - \frac{dt^3}{3!}\frac{\partial^3 \mathbf{r}(t)}{\partial t^3} + O(dt^4), \tag{2.3} \]
and then adding it to (2.2), which yields
\[ \mathbf{r}(t+dt) = 2\mathbf{r}(t) - \mathbf{r}(t-dt) + \frac{\mathbf{F}(t)}{m}\,dt^2, \tag{2.4} \]
the error of which is of order $dt^4$. Similarly, subtracting Equation (2.3) from (2.2) yields
\[ \mathbf{v}(t) = \frac{\mathbf{r}(t+dt) - \mathbf{r}(t-dt)}{2\,dt}, \tag{2.5} \]
the error of which is of order $dt^2$. More accurate estimates of v(t) can be obtained with
refined algorithms, which are described in [38, 3].
Back at time 0, following Equation (2.1), integrating to the next time step with the Verlet
algorithm requires the coordinates at time −dt, which can be estimated using
\[ \mathbf{r}_i(-dt) = \mathbf{r}_i(0) - \mathbf{v}_i(0)\,dt. \tag{2.6} \]
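The accuracy of r(−dt) is not so important since it is only used to bootstrap the simulation
[38]. Then,
\[ \mathbf{r}_i(dt) = 2\mathbf{r}_i(0) - \mathbf{r}_i(-dt) + \frac{\mathbf{F}_i(0)}{m}\,dt^2. \tag{2.7} \]
With $\mathbf{F}_i(dt)$ calculated similarly to Equation (2.1),
\[ \mathbf{r}_i(2dt) = 2\mathbf{r}_i(dt) - \mathbf{r}_i(0) + \frac{\mathbf{F}_i(dt)}{m}\,dt^2. \tag{2.8} \]
Then,
\[ \mathbf{v}_i(dt) = \frac{\mathbf{r}_i(2dt) - \mathbf{r}_i(0)}{2\,dt}. \tag{2.9} \]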
The same calculation is applied for every atom and continuously repeated until the in-
tended sampling time is reached. For this algorithm, r is always one step ahead of v.
The only difference between the first and following calculations is that the coordinates at
the previous time step can be obtained directly rather than from estimation. The same
process is also illustrated in Figure 2.1.
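The integration loop described in this section can be sketched for a single particle in one dimension. This is a minimal illustration under stated assumptions rather than production MD code: the harmonic potential U = 0.5 k x^2, the function names and all numerical values are made up for the example, and the force is taken as the negative gradient of U.

```python
import math

def force(x, k=1.0):
    """Force from the harmonic potential U = 0.5*k*x**2, i.e. F = -dU/dx."""
    return -k * x

def verlet(x0, v0, mass, dt, n_steps, k=1.0):
    """Integrate one particle with the Verlet scheme of Eqs. (2.4)-(2.6)."""
    x_prev = x0 - v0 * dt      # bootstrap the position at time -dt, Eq. (2.6)
    x = x0
    traj = []
    for step in range(n_steps):
        # New position from the current and previous positions, Eq. (2.4)
        x_next = 2.0 * x - x_prev + force(x, k) / mass * dt ** 2
        # Velocity at the current time by central difference, Eq. (2.5)
        v = (x_next - x_prev) / (2.0 * dt)
        traj.append((step * dt, x, v))
        x_prev, x = x, x_next
    return traj

traj = verlet(x0=1.0, v0=0.0, mass=1.0, dt=0.01, n_steps=1000)
t_end, x_end, v_end = traj[-1]
energy_end = 0.5 * v_end ** 2 + 0.5 * x_end ** 2   # kinetic + potential energy
```

With unit mass and unit force constant the exact trajectory is x(t) = cos(t), so the final position can be compared against `math.cos(t_end)`, and the total energy should stay close to its initial value of 0.5.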
2.2 Force Fields
A force field in MD simulations is defined as a set of functions used to calculate the
potential energy of the system (U) together with the corresponding parameters used
in these functions [9]. U is usually decomposed into two groups: the terms arising from
bonded interactions (Ubonded) and those arising from nonbonded interactions (Unonbonded).
The bonded potential energy terms include that of bonds (Ubond), angles (Uangle), dihedral
angles (Udihedral) and improper dihedral angles (Uimproper), which are used to retain the
chirality and planarity of particular chemical groups such as sp3 C atoms and planar rings,
while the nonbonded energy terms include pairwise Lennard-Jones potential (ULJ) and
electrostatic potential (Uelectrostatic). Their relationship can be summarized in Equations
(2.10)–(2.12):
\[ U = U_{\mathrm{bonded}} + U_{\mathrm{nonbonded}}, \tag{2.10} \]
[Figure 2.1 is a flowchart; its boxes, in order, are:]
1. Start with $\mathbf{r}_i(t-dt)$ and $\mathbf{r}_i(t)$; if t = 0, assign velocities and bootstrap $\mathbf{r}_i(-dt)$.
2. Calculate the force:
\[ \mathbf{F}_i(t) = -\frac{\partial U(t)}{\partial \mathbf{r}_i(t)}, \tag{2.1} \]
3. Calculate the position:
\[ \mathbf{r}_i(t+dt) = 2\mathbf{r}_i(t) - \mathbf{r}_i(t-dt) + \frac{\mathbf{F}_i(t)}{m}\,dt^2, \tag{2.8} \]
4. Calculate the velocity:
\[ \mathbf{v}_i(t) = \frac{\mathbf{r}_i(t+dt) - \mathbf{r}_i(t-dt)}{2\,dt}. \tag{2.9} \]
5. t = t + dt
6. Output $\mathbf{r}_i(t)$ (and $\mathbf{v}_i(t)$) for future analysis.
Figure 2.1: Workflow of an MD simulation. U is the potential energy of the system. F
is the force applied on the ith atom. r, v and m are its coordinates, velocity and mass,
respectively. dt is the length of a time step in the simulation. The calculation at each
time step is looped through all atoms in the system. For details of each step, please refer
to Section 2.1.
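where
\[ U_{\mathrm{bonded}} = U_{\mathrm{bond}} + U_{\mathrm{angle}} + U_{\mathrm{dihedral}} + U_{\mathrm{improper}}, \tag{2.11} \]
and
\[ U_{\mathrm{nonbonded}} = U_{\mathrm{LJ}} + U_{\mathrm{electrostatic}}. \tag{2.12} \]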
Despite this relationship, the exact mathematical forms of Ubond, Uangle, Udihedral, Uimproper,
ULJ and Uelectrostatic may be different in different force fields, which will be discussed in a
moment.
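The decomposition into bonded and nonbonded terms can be sketched for a toy system. This is a simplified illustration and not the energy routine of any force field: only a harmonic bond term, a 12-6 Lennard-Jones term and a Coulomb term are included (angle, dihedral and improper terms are omitted for brevity), and all coordinates, charges and coefficients below are made-up values in arbitrary units.

```python
import math

def distance(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def u_bond(r, k_r, r_eq):
    """Harmonic bond term K_r * (r - r_eq)**2."""
    return k_r * (r - r_eq) ** 2

def u_lj(r, a_ij, b_ij):
    """12-6 Lennard-Jones term A/r**12 - B/r**6."""
    return a_ij / r ** 12 - b_ij / r ** 6

def u_coulomb(r, q_i, q_j, eps=1.0):
    """Electrostatic term q_i*q_j / (eps * r)."""
    return q_i * q_j / (eps * r)

# Hypothetical system: a bonded pair (atoms 0 and 1) plus an unbonded atom 2
coords = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (4.0, 0.0, 0.0)]
charges = [0.2, -0.2, 0.1]
bonds = [(0, 1, 100.0, 1.0)]           # (i, j, K_r, r_eq)
nonbonded_pairs = [(0, 2), (1, 2)]     # the bonded pair 0-1 is excluded
A, B = 1.0, 2.0                        # made-up Lennard-Jones coefficients

u_bonded = sum(u_bond(distance(coords[i], coords[j]), k, r0)
               for i, j, k, r0 in bonds)
u_nonbonded = sum(
    u_lj(distance(coords[i], coords[j]), A, B)
    + u_coulomb(distance(coords[i], coords[j]), charges[i], charges[j])
    for i, j in nonbonded_pairs
)
u_total = u_bonded + u_nonbonded       # Eq. (2.10)
```

Keeping each term as its own function mirrors the way a force field is specified: the functional form is fixed in code, while the parameters are supplied per bond or per atom pair.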
In this thesis, all the force fields discussed are limited to empirical ones. In contrast to
the force fields that involve quantum calculations, empirical force fields consider atoms
as the smallest particles, and only include relatively simple empirical functional forms.
In the rest of this section, three all-atom force field families, Optimized Potentials for
Liquid Simulations (OPLS) [59], Assisted Model Building with Energy Refinement (AMBER)
[126], and Chemistry at HARvard Macromolecular Mechanics (CHARMM) [19], as well as a
coarse-grained (CG) force field, MARTINI [76], are introduced.
2.2.1 All-atom Force Fields
With tremendous progress in computational power in the last decade, in both hardware and
software, all-atom force fields have become dominant over united-atom (a.k.a.
extended-atom) force fields, in which aliphatic hydrogen atoms are incorporated into
the heavy atoms to which they are bonded. To be consistent with previous studies
from this laboratory [98, 99, 100], the work presented in this thesis started with
OPLS-AA/L [61]. However, recent studies comparing a variety of modern force fields
suggest that OPLS-AA/L is relatively inferior in reproducing NMR measurements for
biomolecular systems [8, 71]. Besides, results from Sarah Rauscher in our group suggest
that OPLS-AA/L overcollapses the N-terminal SH3 domain of the protein drk and hence
underestimates its radius of gyration (Rg) [97]. Therefore, a force field comparison study
for the selection of an optimal force field for this project is underway, and the preliminary
results are shown in Appendix A. As part of the background for this comparison, the
differences in functional form among the three most commonly used all-atom force
field families, OPLS [59], AMBER [126] and CHARMM [19], as well as their development
since their invention, are briefly reviewed here.
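| Force Field | $U_{\mathrm{bond}}$ | $U_{\mathrm{angle}}$ | Ref. |
|---|---|---|---|
| OPLS | $\sum_{\mathrm{bonds}} K_r (r - r_{eq})^2$ | $\sum_{\mathrm{angles}} K_\theta (\theta - \theta_{eq})^2$ | [58, 61] |
| AMBER | $\sum_{\mathrm{bonds}} K_r (r - r_{eq})^2$ | $\sum_{\mathrm{angles}} K_\theta (\theta - \theta_{eq})^2$ | [25] |
| CHARMM | $\sum_{\mathrm{bonds}} K_r (r - r_{eq})^2$ | $\sum_{\mathrm{angles}} K_\theta (\theta - \theta_{eq})^2 + \sum_{\mathrm{UB}} K_{1,3} (r_{1,3} - r_{1,3}^{0})^2$ | [74] |
Table 2.1: Functional forms of bond and angle potentials in the OPLS, AMBER and
CHARMM force fields.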
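| Force Field | $U_{\mathrm{dihedral}}$ | $U_{\mathrm{improper}}$ | Ref. |
|---|---|---|---|
| OPLS | $\sum_{\mathrm{dih.}} \sum_{n=1}^{3} \frac{V_n}{2} \left[ 1 + (-1)^{n-1} \cos(n\phi + \gamma_n) \right]$ | - | [58, 61] |
| AMBER | $\sum_{\mathrm{dih.}} \frac{V_n}{2} \left[ 1 + \cos(n\phi - \gamma) \right]$ | - | [25] |
| CHARMM | $\sum_{\mathrm{dih.}} K_\chi \left[ 1 + \cos(n\chi - \delta) \right]$ | $\sum_{\mathrm{imp.}} K_{\mathrm{imp.}} (\phi - \phi_0)^2$ | [74] |
Table 2.2: Functional forms of the potentials of proper and improper dihedral angles
in the OPLS, AMBER and CHARMM force fields. Uimproper for OPLS and AMBER is
filled with “-” because it is modeled using the same functional form as the proper periodic
dihedral angles.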
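The dihedral forms in Table 2.2 can be compared with a short sketch. This is an illustration with made-up coefficients, not parameters from any force field, and the function names are assumptions. It also shows that the alternating sign in the OPLS-style series is equivalent to a phase shift of π in the periodic form.

```python
import math

def u_dihedral_opls(phi, v, gamma=(0.0, 0.0, 0.0)):
    """OPLS-style series: sum over n = 1..3 of
    V_n/2 * [1 + (-1)**(n-1) * cos(n*phi + gamma_n)]."""
    return sum(
        0.5 * v_n * (1.0 + (-1) ** (n - 1) * math.cos(n * phi + g_n))
        for n, (v_n, g_n) in enumerate(zip(v, gamma), start=1)
    )

def u_dihedral_periodic(phi, v_n, n, gamma):
    """AMBER-style periodic term: V_n/2 * [1 + cos(n*phi - gamma)]."""
    return 0.5 * v_n * (1.0 + math.cos(n * phi - gamma))

v = (1.0, 0.5, 2.0)                         # made-up coefficients V_1, V_2, V_3
u_cis = u_dihedral_opls(0.0, v)             # phi = 0
u_trans = u_dihedral_opls(math.pi, v)       # phi = 180 degrees

# The n = 2 term with its minus sign equals a periodic term with phase pi
phi = 0.7
opls_n2 = u_dihedral_opls(phi, (0.0, 0.5, 0.0))
periodic_n2 = u_dihedral_periodic(phi, 0.5, 2, math.pi)
```

With zero phases, this choice of coefficients gives the highest energy at φ = 0 and zero energy at φ = π, and `opls_n2` and `periodic_n2` agree to within rounding error.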
Tables 2.1–2.3 show that the potential energy functions used for the three force fields
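| Force Field | $U_{\mathrm{Lennard-Jones}}$ | $U_{\mathrm{electrostatic}}$ | Ref. |
|---|---|---|---|
| OPLS | $\sum_{i<j}^{\mathrm{atoms}} \left( \frac{A_{ij}}{R_{ij}^{12}} - \frac{B_{ij}}{R_{ij}^{6}} \right) \cdot f_{ij}$ | $\sum_{i<j}^{\mathrm{atoms}} \frac{q_i q_j}{\varepsilon R_{ij}} \cdot f_{ij}$ | [58, 61] |
| AMBER | $\sum_{i<j}^{\mathrm{atoms}} \left( \frac{A_{ij}}{R_{ij}^{12}} - \frac{B_{ij}}{R_{ij}^{6}} \right) \cdot f_{ij}$ | $\sum_{i<j}^{\mathrm{atoms}} \frac{q_i q_j}{\varepsilon R_{ij}} \cdot f_{ij}$ | [25] |
| CHARMM | $\sum_{i<j}^{\mathrm{atoms}} \left( \frac{A_{ij}}{R_{ij}^{12}} - \frac{B_{ij}}{R_{ij}^{6}} \right)$ | $\sum_{i<j}^{\mathrm{atoms}} \frac{q_i q_j}{\varepsilon R_{ij}}$ | [74] |
Table 2.3: Functional forms of the Lennard-Jones (LJ) and electrostatic potentials in the
OPLS, AMBER and CHARMM force fields. In OPLS, for intramolecular 1,4-interactions,
$f_{ij} = 0.5$; otherwise, $f_{ij} = 1$. In AMBER, for intramolecular 1,4-LJ interactions,
$f_{ij} = 0.5$; for intramolecular 1,4-electrostatic interactions, $f_{ij} = 0.833$; otherwise,
$f_{ij} = 1$. In CHARMM, there is no $f_{ij}$ term.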
are very similar. The major differences consist in the calculation of Uangle, Udihedral and
Uimproper. When calculating Uangle, an additional Urey-Bradley (UB) component is in-
cluded in CHARMM to model a virtual harmonic bond between the 1st and 3rd atoms
involved in an angle, θ∠123. This term was originally described by the Urey-Bradley force
field [113]. For historical reasons, it was included when modeling an aqueous dipeptide
solution system [103]. At first, the UB term also included a linear component, but this
term was later dropped as it was found to be unnecessary in absolute energy calculations
[94]. Therefore, it is now a single quadratic term, included to model the vibrational
spectra more accurately [9]. OPLS uses Ryckaert-Bellemans (RB) potentials
for calculating Udihedral, while AMBER and CHARMM model it as periodic trigonometric
functions. OPLS and AMBER model Uimproper as proper dihedral angles, yet with dif-
ferent parameters, while CHARMM uses a quadratic potential. In addition, the scaling
factors for 1,4 interactions are also different among the three force fields as described in
the caption of Table 2.3. Although they have very similar functional forms for calculating
the potential energy of the system, significant differences exist in their
parameterization philosophies [9] (i.e. how the parameters in the potential energy
functions are obtained or derived, and optimized). A discussion of such differences is
beyond the scope of this thesis.
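The effect of the 1,4 scaling factors can be sketched for a single atom pair. This is an illustration only: the scaling factors are those quoted in the caption of Table 2.3, while the function names, the dictionary `SCALING_14` and the use of identical pair parameters for all three force fields are assumptions made for the comparison (real force fields use different parameter sets).

```python
def pair_energy(r, a_ij, b_ij, q_i, q_j, f_lj=1.0, f_elec=1.0, eps=1.0):
    """Lennard-Jones plus electrostatic energy of one pair, each scaled by its f_ij."""
    lj = (a_ij / r ** 12 - b_ij / r ** 6) * f_lj
    elec = q_i * q_j / (eps * r) * f_elec
    return lj + elec

# (f_ij for LJ, f_ij for electrostatics) applied to intramolecular 1,4 pairs
SCALING_14 = {
    "OPLS": (0.5, 0.5),
    "AMBER": (0.5, 0.833),
    "CHARMM": (1.0, 1.0),    # no f_ij term, i.e. no scaling
}

def pair_energy_14(force_field, r, a_ij, b_ij, q_i, q_j):
    """Energy of a 1,4 pair under the scaling convention of the given force field."""
    f_lj, f_elec = SCALING_14[force_field]
    return pair_energy(r, a_ij, b_ij, q_i, q_j, f_lj, f_elec)

# Same made-up pair parameters for all three conventions
energies = {
    name: pair_energy_14(name, r=1.0, a_ij=1.0, b_ij=2.0, q_i=1.0, q_j=-1.0)
    for name in SCALING_14
}
```

For identical pair parameters, the unscaled convention gives the strongest 1,4 interaction and the OPLS convention the weakest, which is one reason why parameters cannot simply be transferred between force fields.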
OPLS was first developed as a united-atom force field (a.k.a. OPLS-UA) in 1988 [59].
In 1996, OPLS-AA, the all-atom version of OPLS [58] was developed. A major im-
provement of OPLS-AA took place in 2001, which reparameterized the Fourier torsional
coefficients in the calculation of Udihedral with more accurate quantum chemistry software,
resulting in OPLS-AA/L [61], which became widely used and stable.
AMBER first appeared in 1981 as a program for building models of molecules and
calculating their interactions [126]. The first so-named AMBER force field was developed
by Weiner et al. as a united-atom force field in 1984 (ff84) [127]. In 1986, it was extended
to become an all-atom force field (ff86) [128]. The first major improvement of AMBER
was published in 1995 by Cornell et al., who used a new charge model (RESP), new
van der Waals (VDW) parameters that took into account vicinal electronegative atoms, and
high-level quantum mechanical data that were not available at Weiner’s time [25]. This
version of AMBER (ff94) is referred to as the second-generation force field, after which
AMBER became one of the most widely used force fields for biomolecular simulations. Over
time, many variants of AMBER have been developed, whose names can be confusing
for newcomers. In general, different versions of the AMBER force field are named as
“ff” + “last two digits of the year when it began to be used” + “any particular feature
(optional)”.
With the progress of computational power, deficiencies in ff94 such as the
over-stabilization of α-helices became apparent [51]. This issue was first addressed in
ff96 [63], and later in ff99 [124]. Both ff96 and ff99 tried to improve the force field by
refitting the backbone dihedral parameters for φ and ψ, but it has been revealed that the
way dihedral parameters were optimized in ff96 and ff99 results in incorrect
conformational preferences for Gly. Besides, over-stabilization of β-sheets and α-helices
has also been observed in ff96 and ff99, respectively [51]. The same problem of ff94 was
also addressed by García and Sanbonmatsu, who simply set the backbone dihedral potential
for φ and ψ to zero, resulting in ff94GS [39]. The over-stabilization of α-helices in ff99
was addressed separately by Sorin and Pande, who developed ff99φ by replacing the backbone
dihedral potential for φ in ff99 with that from ff94 [108], Duan et al., who developed ff03
with a fundamentally new approach for deriving atomic partial charges [31], and Hornak et
al., who developed ff99SB with extensive optimizations of backbone dihedral parameters for
both φ and ψ [51].
In more recent years, the AMBER force fields have continued to be improved. Best et al.,
in an attempt to obtain the correct balance of secondary structure propensities,
developed ff99SB* [15] and ff03* [15] using simple backbone energy corrections. Li and
Brüschweiler integrated existing NMR data, and developed ff99SBnmr1 [69]. Nerenberg
and Gordon revised the φ′ backbone dihedral potential, and developed ff99SB-phi [87].
All of the above modifications of AMBER are focused on the backbone parameters. In-
stead, Lindorff-Larsen et al. improved the side-chain torsional potentials based on ff99SB,
yielding ff99SB-ILDN [72]. It is named as such because the parameterization is based on
the four types of residues, Ile(I), Leu(L), Asp(D), Asn(N). In the literature, names like
ff99sb*-ildn, ff99sb-ildn-nmr or ff99sb-ildn-phi [95, 71, 8] can also be found. Such names
indicate combinations of two force fields which modified different aspects of the same base
force field without conflicts. For example, ff99sb*-ildn is a combination of ff99SB* and
ff99SB-ILDN, both of which were developed based on ff99SB. The former only modifies
the backbone dihedral potential terms while the latter modifies those of sidechains. In
2012, a new charge model was proposed to be used together with ff99sb*-ildn to improve
residue-specific α-helix propensities, resulting in ff99SB*-ILDN-Q [14].
Most of the AMBER force fields have been developed using TIP3P [57] as the water
model. However, recognizing the deficiencies of the primitive three-site water model in
reproducing the phase diagram of water, Best et al. combined ff03* [15] and a highly opti-
mized water model called TIP4P/2005 [2], which behaves well in non-standard conditions
such as low temperature and high pressure, and developed ff03w [13].
As we can see, the development of AMBER is convoluted, hence Figure 2.2 is shown
to illustrate the relationships among different AMBER variants, in other words, how
AMBER has evolved.
CHARMM first appeared as a program for the calculation of macromolecular en-
ergy minimization and dynamics in 1982 [19]. In 1985, the first so-named united-atom
CHARMM force field, CHARMM19, was developed [102]. The naming convention for
CHARMM force field is “CHARMM” + “the version number of CHARMM program
which for the first time includes the then newest version of the CHARMM force field” [74].
For example, CHARMM19 indicates that this version of the CHARMM force field was
first included in version 19 of the CHARMM program. The first all-atom CHARMM
force field for proteins, CHARMM22, was developed in 1998 by MacKerell et al. [74]. In
2004, a new potential energy component, energy correction map (CMAP), was added to
CHARMM22 to improve the accuracy of the backbone dihedral potential, resulting in
CHARMM22/CMAP [75] (a.k.a. CHARMM27 [71]). In 2011, based on CHARMM27,
Piana et al. developed CHARMM22* by removing the CMAP for all residues but Gly and
Pro, and adding modification of the backbone torsional potentials [95]. In 2012, in order
to overcome the over-stabilization of α-helix conformations in CHARMM22/CMAP, its
parameters were optimized again, leading to the development of the most recent version
of the CHARMM force field as of this writing, CHARMM36 [16].
[Figure 2.2 is a tree diagram with the following nodes: ff94 [25]; ff94GS [39]; ff96 [63];
ff99 [124]; ff99φ [108]; ff99SB [51]; ff99SB* [15]; ff99SB*-ILDN [8]; ff99SB*-ILDN-Q [14];
ff99SB-ILDN [72]; ff99SBnmr1 [69]; ff99SB-phi [87]; ff03 [31]; ff03* [15];
ff03w + TIP4P/2005 [13].]
Figure 2.2: Family tree of AMBER force fields. Note that not every AMBER force field
is developed based on the last released one. Instead, the history of the AMBER family is
more like a tree as shown above.
Overall, there is no doubt that all-atom force fields will continue to evolve, and that even
new representations of the energy surface such as the effects of charge polarization are
going to be developed [16]. Table 2.4 is a list of all the aforementioned force fields in
chronological order.
2.2.2 Coarse Grained Force Fields
An alternative to all-atom force fields is the coarse-grained (CG) force field, which can
be used to probe length and time scales that are currently infeasible for atomistic systems.
The aforementioned united-atom force fields are one type of CG force field, in which
aliphatic hydrogens are merged into the heavy atoms they are attached to, so that the
total number of atoms in the system is reduced and the simulations run faster.
A CG force field that is particularly promising for future work on this project is
MARTINI. MARTINI was first developed in 2004 for coarse-grained lipid simulations [76]
and was extended in 2007 to other biomolecules, though still not to proteins [77]; this
version is tagged MARTINI 2.0. In 2008, MARTINI 2.1 added parameters for simulations
of coarse-grained peptides [84].
The major coarse-graining techniques used in MARTINI are a reduction of the number
of degrees of freedom via four-to-one mapping (i.e. four heavy atoms are typically
represented as one) and the use of short-range potentials, meaning that the nonbonded
potential vanishes when the interparticle distance exceeds a specified cutoff (e.g.
rcut = 1.2 nm [84]). Compared to atomistic force fields, MARTINI extends the accessible
time scale by 2–3 orders of magnitude [84]. It therefore also extends the accessible
length scale and can be useful for large-scale simulations; in the context of this project,
these are simulations of large aggregates of ELPs.
Family   Year  Name                               First Author             Ref.
OPLS     1988  OPLS-UA                            Jorgensen et al.         [59]
OPLS     1996  OPLS-AA                            Jorgensen et al.         [58]
OPLS     2001  OPLS-AA/L                          Kaminski et al.          [61]
AMBER    1984  ff84                               Weiner et al.            [127]
AMBER    1986  ff86                               Weiner et al.            [128]
AMBER    1995  ff94                               Cornell et al.           [25]
AMBER    1997  ff96                               Kollman et al.           [63]
AMBER    2000  ff99                               Wang et al.              [124]
AMBER    2002  ff94GS                             Garcia et al.            [39]
AMBER    2003  ff03                               Duan et al.              [31]
AMBER    2005  ff99φ                              Sorin et al.             [108]
AMBER    2006  ff99SB                             Hornak et al.            [51]
AMBER    2009  ff99SB*, ff03*                     Best et al.              [15]
AMBER    2010  ff99SB-ILDN                        Lindorff-Larsen et al.   [72]
AMBER    2010  ff99SBnmr1                         Li et al.                [69]
AMBER    2010  ff03w+TIP4P/2005                   Best et al.              [13]
AMBER    2011  ff99SB-phi                         Nerenberg et al.         [87]
AMBER    2012  ff99SB*-ILDN-Q                     Best et al.              [14]
CHARMM   1985  CHARMM19                           Reiher                   [102]
CHARMM   1998  CHARMM22                           MacKerell et al.         [74]
CHARMM   2004  CHARMM22/CMAP (a.k.a. CHARMM27)    MacKerell et al.         [75, 71]
CHARMM   2011  CHARMM22*                          Piana et al.             [95]
CHARMM   2012  CHARMM36                           Best et al.              [16]
Table 2.4: Evolutions of OPLS, AMBER, CHARMM in chronological order.
A special version of MARTINI with improved internal peptide dynamics has been developed
by our group in collaboration with Mikyung Seo and Peter Tieleman in order to
simulate coarse-grained ELPs and ALPs [107]. Unfortunately, due to an unresolved issue
in the parameters that causes frequent crashes during simulations of multiple peptides,
my attempts to simulate mesoscopic aggregates of ELPs using this force field have been
unsuccessful as of the writing of this thesis.
2.3 Sampling Errors
When MD simulations are applied to scientific problems, errors are unavoidable. In
general, there are two types of error: statistical error (a.k.a. statistical sampling error)
and systematic error. The latter can be further divided into systematic sampling error
and systematic force field error.
Statistical error and systematic sampling error mainly come from insufficient sampling
time, while systematic force field error is a direct result of imperfections in the force
field used. In terms of their effect on the properties calculated from MD simulations,
statistical error affects a value’s precision while systematic error affects its accuracy [86].
One way to alleviate both systematic sampling error and statistical error is to explore
the conformational (sampling) space of the target system as comprehensively as possible,
which can be achieved either with enhanced sampling techniques or with multiple replica
(MR) simulations started from different initial conformations. Common enhanced sampling
techniques include umbrella sampling (US) [111] and replica exchange (REX) [109], as
well as more recent algorithms developed in our laboratory, such as STDR [99] and virtual
replica exchange (VREX) [99]. MR simulation, also called brute force sampling, is the
method used exclusively in this thesis.
To reduce the systematic force field error, the key is to select an appropriate force field.
If the force field is biased or error-prone, then even if the whole conformational space
has been well sampled, which is not always possible in the first place, the results, however
precise, will still be biased or even wrong. In fact, with the enormous advances in
computational power and improvements in sampling techniques, limitations of common
force fields that were once hidden have become apparent. In the last two years, multiple
studies have compared and evaluated a variety of modern force fields [95, 8, 71]. The
results are surprising in that a force field once considered superior may turn out to be
inferior (e.g. OPLS-AA/L), which has forced us to rethink our choice of force field for
the near future. As mentioned above, a comparison study for selecting a better force
field is underway, and the results obtained as of this writing are presented in Appendix A.
Chapter 3
Elastin-based Peptides in Water and
Methanol
3.1 Background
As introduced in Section 1.4, model peptides derived from native tropoelastin, i.e. EBPs,
can coacervate or form amyloid-like fibrils, and which of the two occurs can be modulated
by sequence composition or solvent conditions. Previous work from our group has
shown that a high combined Pro and Gly content is required to prevent an EBP from
forming amyloid-like fibrils in water [98], but little is known about how solvent conditions
affect the aggregation propensity of EBPs at the molecular level. This question is
investigated in this chapter.
We have performed atomistic MD simulations in explicit water and methanol to study
the effects of these solvents on a set of model EBPs: (GVPGV)7, (PGV)12, (GGVGV)7,
(GVGVA)7 and (GV)18. The first two are referred to as ELPs since they are representative
of a single HP domain in native tropoelastin and tend to coacervate. The next three are
referred to as ALPs
since they tend to form amyloid-like fibrils [98]. However, the sequence (GGVGV)7 has
been found to form amyloid-like fibrils only in water; in methanol, it forms an amorphous
film that eventually develops beaded-string structures [36]. The major difference in
sequence composition between ELPs and ALPs is the presence of Pro in the former. In
addition, we have included G35 in our study. G35 is considered a good control for
studying solvent effects on the peptide backbone because it has no sidechains. The set
of model peptides is summarized in Table 3.1. All the peptides are capped with an acetyl
group at the N-terminus and an amide group at the C-terminus, and are simulated as
monomers. Our major findings are: (1) all peptides become more extended in methanol
than in water; (2) the peptides remain disordered in both solvents, which is consistent
with previous results in water from our group [98]; (3) the solvophobic effect (known as
the hydrophobic effect in water) is weaker in methanol than in water; (4) in methanol,
ALPs form extensive β-sheets, but ELPs do not.
That methanol promotes extensive β-sheet formation in ALPs, especially in (GGVGV)7,
might at first sight appear to contradict the experimental observation that methanol
inhibits the formation of β-sheet-rich amyloid-like fibrils [36]. To resolve this paradox,
we hypothesize that the promotion of β-sheet formation in a monomer and the inhibition
of amyloid-like fibril formation by methanol have the same cause: the reduction of the
solvophobic effect. For a monomer, a reduction of the solvophobic effect leads to better
solvation of the peptide by the surrounding solvent molecules. In particular, in the case
of (GGVGV)7 in methanol, the relatively nonpolar methanol molecules preferentially
solvate the peptide’s nonpolar sidechains over its polar backbone, resulting in the
formation of β-sheets, in which the sidechains are highly exposed to the solvent. For
fibrils, however, the same reduction leads to weaker interactions among neighbouring
β-sheet layers in the stacking β-sheet model [93], and hence inhibits the formation of
amyloid-like fibrils.
Elastin-like peptides (ELPs)    (GVPGV)7, (PGV)12
Amyloid-like peptides (ALPs)    (GGVGV)7, (GVGVA)7 and (GV)18
Backbone control                G35
Table 3.1: Model peptides.
3.2 Results
3.2.1 Intrinsically Disordered Peptides
Figure 3.1 shows four representative snapshots from the simulations of one ELP, (GVPGV)7,
and one ALP, (GGVGV)7, in water and in methanol. The ELP is highly collapsed in water
due to the hydrophobic effect, whereas it is much more extended in methanol. Like the
ELP, the ALP is also highly collapsed in water, but with more β-sheet formation. In
methanol, in contrast, it forms extensive β-sheets.
In our simulations, although methanol promotes the formation of β-sheets in ALPs,
none of the model peptides forms any stable tertiary structure as a monomer in water
or methanol, where tertiary structure is defined as the packing of secondary structural
elements. This confirms the high propensity for intrinsic disorder of these peptides
shown previously [98]. In addition to the inclusion of Pro, a secondary structure breaker,
in ELPs, the disorder of the peptides as monomers is probably due to their extremely
simple and hydrophobic sequences. To sample a comprehensive ensemble for the
conformational equilibrium of IDPs, as many conformational states as possible should
be sampled per system. In this study, this is approached with multiple replica simulations,
in which each replica starts from a unique initial conformation. To obtain a good
description of each system, the properties of interest are calculated as statistical averages
over all replicas.
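The averaging over replicas described above can be sketched as follows. This is a minimal illustration, assuming that each replica contributes one time-averaged value of the property and that the uncertainty is estimated as the standard error of the mean across replicas; the function name and the numbers are hypothetical and are not taken from the simulations.

```python
import math
import statistics


def replica_mean_and_sem(replica_values):
    """Return (mean, SEM) of a property, treating each replica as one sample."""
    n = len(replica_values)
    if n < 2:
        raise ValueError("need at least two replicas to estimate the SEM")
    mean = statistics.fmean(replica_values)
    # Sample standard deviation (n - 1 in the denominator) divided by sqrt(n).
    sem = statistics.stdev(replica_values) / math.sqrt(n)
    return mean, sem


# Hypothetical per-replica averages of Rg in nm (illustrative numbers only).
rg_per_replica = [0.82, 0.86, 0.84, 0.88, 0.80]
mean_rg, sem_rg = replica_mean_and_sem(rg_per_replica)
```

Treating replicas as independent samples is reasonable only if their initial conformations are uncorrelated, which is the purpose of starting each replica from a different state.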
Figure 3.1: Representative snapshots from the simulations of an ELP, (GVPGV)7, in the
left column and an ALP, (GGVGV)7, in the right column, in water (red) and methanol
(blue). The blue ends indicate the N-termini of the peptides.
3.2.2 Radius of Gyration
The Rg was calculated to quantify the overall size of the peptides in all the systems.
As shown in Figure 3.2, all sequences have a much broader distribution of Rg in
methanol than in water, and the average Rg, indicated by the vertical bars, is
larger in methanol than in water. An increased Rg indicates that the conformation
becomes more extended, which suggests that methanol is a better solvent than water
for these hydrophobic model peptides, since interactions with the solvent are less
unfavorable in methanol than in water. More extended conformations suggest that there
are fewer intramolecular peptide-peptide interactions and more peptide-solvent
interactions. Therefore, we quantified different types of intra- and intermolecular
interactions.
3.2.3 Intramolecular Peptide-peptide Interactions
Two types of interactions within a peptide are calculated and shown in Figure 3.3. The
x axis shows the number of peptide-peptide hydrogen bonds (H-bonds) normalized by
the number of H-bonding groups, and the y axis shows the number of peptide-peptide
nonpolar interactions normalized by the number of nonpolar groups. Going from water to
methanol, on the one hand, the number of nonpolar interactions decreases significantly for
all the model peptides, which is consistent with the increased Rg and hence more extended
conformations. On the other hand, the number of H-bonds increases significantly for
ALPs while it barely changes for ELPs, neither of which is consistent with an increased
Rg. The trend for G35 resembles that for ALPs. Compared with ALPs, the relatively
higher propensity for nonpolar interactions in G35, in spite of its lack of nonpolar
sidechains, reflects the packing of methylene Cα groups in the highly collapsed polypeptide
chain (see Figure 3.2), as CαH2 is the only nonpolar group in polyglycine.
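The caption of Figure 3.3 gives the number of H-bonding groups as 2 × N − P, where N is the number of residues and P the number of Pro residues. A minimal sketch of this normalization is shown below; the function names are hypothetical, the terminal capping groups are ignored as in the caption formula, and the H-bond count passed in the last line is an illustrative number rather than a simulation result.

```python
def h_bonding_groups(repeat_unit, n_repeats):
    """Number of H-bonding groups, 2 * N - P, for a peptide (repeat_unit)n."""
    sequence = repeat_unit * n_repeats
    n_residues = len(sequence)      # N: total number of residues
    n_pro = sequence.count("P")     # P: number of Pro residues
    return 2 * n_residues - n_pro


def normalized_h_bonds(mean_h_bonds, repeat_unit, n_repeats):
    """Average number of H-bonds divided by the number of H-bonding groups."""
    return mean_h_bonds / h_bonding_groups(repeat_unit, n_repeats)


groups_gvpgv7 = h_bonding_groups("GVPGV", 7)   # N = 35, P = 7
groups_pgv12 = h_bonding_groups("PGV", 12)     # N = 36, P = 12
groups_g35 = h_bonding_groups("G", 35)         # N = 35, P = 0
example = normalized_h_bonds(6.3, "GVPGV", 7)  # hypothetical H-bond count
```

The formula presumably counts one backbone N-H donor and one C=O acceptor per residue, with Pro contributing no N-H donor; this reading is an interpretation of the caption and is not stated explicitly in the text.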
[Figure 3.2: six panels, one per peptide: (GVPGV)7, (PGV)12, (GGVGV)7, (GVGVA)7,
(GV)18, (G)35. x axis: Rg (nm), 0.6 to 1.4; y axis: P, 0.0 to 0.4; legend: water, methanol.]
Figure 3.2: Distribution of Rg of the model peptides in water and methanol. The vertical
bar indicates the average value. Error bars in this and all the following figures are
calculated as SEM. Rg is calculated from the equation
$R_g^2 = \frac{1}{N} \sum_{i=1}^{N} \| \vec{R}_i - \vec{R}_{cm} \|^2$,
where $\vec{R}_i$ is the position of the Cα atom of the ith residue, $\vec{R}_{cm}$ is the
center of mass of all the Cα atoms and N is the number of residues in the peptide.
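The definition of Rg given in the caption can be written as a short function. This is a minimal sketch, assuming that the Cα coordinates of one frame are available as a list of (x, y, z) tuples in nm; because all Cα atoms have the same mass, their center of mass coincides with their geometric center. The function name and the toy coordinates are hypothetical.

```python
import math


def radius_of_gyration(ca_positions):
    """Rg from Cα coordinates, following Rg^2 = (1/N) * sum |R_i - R_cm|^2."""
    n = len(ca_positions)
    # Center of mass of equal-mass atoms is the arithmetic mean of positions.
    center = [sum(p[k] for p in ca_positions) / n for k in range(3)]
    mean_sq = sum(
        sum((p[k] - center[k]) ** 2 for k in range(3)) for p in ca_positions
    ) / n
    return math.sqrt(mean_sq)


# Toy check: four points on the corners of a square of side 2 centered at the
# origin; every point is at squared distance 2 from the center.
square = [(1.0, 1.0, 0.0), (1.0, -1.0, 0.0), (-1.0, 1.0, 0.0), (-1.0, -1.0, 0.0)]
rg_square = radius_of_gyration(square)
```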
The fact that nonpolar interactions are reduced in all the sequences suggests that the
solvophobic effect becomes weaker in methanol than in water. This is due to the presence
of methyl groups in methanol molecules, which makes them much more nonpolar than
water molecules. The fact that the H-bonds of ALPs are more abundant in methanol
suggests that more extended conformations lead to the formation of more H-bonds, which
is indicative of the formation of secondary structure.
3.2.4 Interactions between Peptide and Solvent
Interactions between peptide and solvent are categorized as the solvation of polar groups
and the solvation of nonpolar groups. Each type of solvation includes the interactions of
the polar or nonpolar atoms of the peptide with both the polar groups of the solvent (i.e.
OH groups of both water and methanol) and, in the case of methanol, the nonpolar
(methyl) group of the solvent. The results are then normalized by the corresponding values
calculated from a set of control simulations in which the peptides are restrained to their
most extended state so as to maximize their interactions with the solvent. The normalized
values always lie between 0 and 1 and are used to quantify the extent of solvation.
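The normalization described above amounts to a simple ratio. The sketch below assumes that the average number of peptide-solvent interactions has already been computed for both the free peptide and the fully extended control; the function name and the numbers are hypothetical.

```python
def solvation_extent(mean_interactions, mean_interactions_extended):
    """Extent of solvation: free-peptide value divided by the extended-control value."""
    if mean_interactions_extended <= 0:
        raise ValueError("the extended control must have a positive interaction count")
    return mean_interactions / mean_interactions_extended


# Hypothetical averages: 30 interactions for the free peptide and 40 for the
# restrained, fully extended control.
extent = solvation_extent(30.0, 40.0)
```

A value close to 1 means the peptide is almost as solvated as it would be when fully extended, while a lower value means that more groups are buried.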
Figure 3.4 shows that from water to methanol, ELPs move roughly along the diagonal,
which means that both their polar and nonpolar groups become better solvated in
methanol than in water, consistent with an increased Rg. ALPs, however, move towards
the upper left corner of the plot, which means that although their nonpolar groups also
become better solvated, as in ELPs, their polar groups become more desolvated, suggesting
that methanol preferentially solvates the nonpolar groups of the solute over its polar
groups. The trend for G35 is again similar to that seen for ALPs, but at a lower level of
nonpolar-group solvation, which is consistent with a relatively high propensity for
intramolecular nonpolar interactions as shown in Figure 3.3.
[Figure 3.3: scatter plot. x axis: Peptide-peptide H-bonds, 0.06 to 0.16; y axis:
Nonpolar interactions, 0.6 to 1.6.]
Figure 3.3: Propensity for intramolecular peptide-peptide interactions in water (red) and
in methanol (blue). The number of nonpolar interactions is normalized by the number
of primary and secondary nonpolar C atoms in the peptides. The number of H-bonds
is normalized by the number of H-bonding groups in the peptides, which is calculated
as 2 × N − P, where N and P are respectively the total number of residues and the
number of Pro residues in each peptide. ELP: [▲: (GVPGV)7, ▼: (PGV)12], ALP:
[•: (GGVGV)7, ◀: (GVGVA)7, ▶: (GV)18], ★: G35.
The results for ALPs are inconsistent with an increased Rg, but consistent with the
increased number of H-bonds shown above. Overall, they again indicate the formation of
secondary structure. Therefore, we analyzed the secondary structure content of all the
systems.
3.2.5 β-sheet Content
There is no α-helix formation in any of the systems, whereas the β-sheet content changes
significantly from water to methanol. As shown in Figure 3.5, the β-sheet content
roughly doubles from water to methanol for all model peptides, but in absolute terms,
ELPs and G35 form much less β-sheet than ALPs, especially in methanol. That ELPs
cannot form extensive β-sheets is consistent with a previous study, which ascribed this
effect to the presence of Pro [98], a secondary structure breaker, while G35 consists
entirely of Gly, which makes it too flexible to be stabilized in an extended secondary
structure. The effect of Pro is made more evident by the fact that the β-sheet content of
(PGV)12 is even lower than that of (GVPGV)7 because of its higher fraction of Pro. The
results in Figure 3.5 show that methanol promotes the formation of β-sheets in ALPs,
as seen in Figure 3.1(d).
3.3 Discussion
We have shown the results for all of the model peptides in water and in methanol. The
results in water are consistent with those obtained previously by our group. In
particular, ALPs form more β-sheet in water than ELPs [98], the Rg of (GVPGV)7 is
about 0.84 nm [99], and the peptides are all intrinsically disordered [98, 99].
[Figure 3.4: scatter plot. x axis: Solvation of polar groups, 0.60 to 0.85; y axis:
Solvation of nonpolar groups, 0.60 to 0.85.]
Figure 3.4: Propensity for intermolecular peptide-solvent interactions in water (red) and
in methanol (blue). The solvation of polar groups in water is quantified as the number of
H-bonds between the peptide and water, and in methanol, as the sum of the number of
H-bonds and that of pairwise interactions between the polar heavy atoms of the peptide
(i.e. backbone O and N atoms) and the nonpolar heavy atoms of methanol (i.e. methyl
C atoms). The solvation of nonpolar groups in water is quantified as the sum of the
number of pairwise interactions between the nonpolar heavy atoms (i.e. primary and
secondary nonpolar C atoms) of the peptide and the O atom of water, and in methanol,
as the sum of the number of pairwise interactions between the nonpolar heavy atoms of
the peptide and the O atoms of methanol and that of the nonpolar pairwise interactions
between the heavy atoms of the peptide and those of methanol. ELP: [▲: (GVPGV)7,
▼: (PGV)12], ALP: [•: (GGVGV)7, ◀: (GVGVA)7, ▶: (GV)18], ★: G35.
[Figure 3.5: bar chart. x axis: (GVPGV)7, (PGV)12, (GGVGV)7, (GVGVA)7, (GV)18, (G)35;
y axis: β-sheet content, 0.00 to 0.25; legend: water, methanol. Bar labels in extraction
order: 0.04, 0.02, 0.10, 0.10, 0.09, 0.06, 0.08, 0.04, 0.22, 0.21, 0.21, 0.09.]
Figure 3.5: Propensity to form β-sheet structure in all the model peptides in water and
in methanol.
Comparing the results in the two solvents, we found that all the model peptides become
more extended in methanol than in water. More extended structures would be expected
to correspond to better solvation, which is true for ELPs, but not for the polar groups of
ALPs, due to their significant propensity to form β-sheets. In a β-sheet, the nonpolar
groups of the peptide monomer are highly exposed to the solvent, and hence are well
solvated. Concurrently, the polar groups of the backbone become relatively buried when
they form H-bonds, and hence are desolvated. Therefore, in ALPs, forming β-sheets
makes it possible to solvate nonpolar groups, desolvate polar groups, and increase the
overall size of the peptide at the same time. Similar results showing that alcohol can
promote the formation of secondary structure (i.e. α-helix and β-sheet) have been
reported previously, but for very different sequences such as the globular proteins
ferredoxin [106] and BBA5 [55]. A common feature of these two sequences and our ALPs
is that they contain very few or no Pro residues. In ELPs, the presence of Pro inhibits
the formation of β-sheets, so while their nonpolar groups also become much better
solvated, their polar groups are forced to become more solvated as well as the peptides
increase in size. In other words, methanol swells ELPs.
Our results show that methanol promotes β-sheet formation in ALPs, including (GGVGV)7,
which has the same repeat unit as (VGGVG)n. However, the latter has been found to
form amyloid-like fibrils, which contain a high amount of β-sheet, only in water, and to
form an amorphous film instead in methanol [36, 34, 35]. This leads to an apparent
paradox: why does methanol promote the formation of β-sheets in an ALP while
preventing it from forming β-sheet-rich amyloid-like fibrils?
Based on this study, we propose a hypothesis to resolve this apparent paradox. Since the
increased β-sheet content of (GGVGV)7 as a monomer in methanol is mainly a result of
the reduced solvophobic effect, as reflected by the decrease in intramolecular
peptide-peptide nonpolar interactions and the exposure of its nonpolar groups to the
solvent, it is reasonable to assume that the solvophobic effect between different monomers
is also reduced when multiple peptides are present. According to the stacking β-sheet
model for Aβ-amyloid fibrils [93], such a reduction in the solvophobic effect would
necessarily weaken the nonpolar interactions between neighbouring layers of cross-β-sheets,
resulting in ineffective stacking that prevents the formation of amyloid-like fibrils.
Instead, our results suggest that in methanol, many small β-sheets are present but cannot
assemble into highly ordered amyloid fibrils because of the weak solvophobic effect.
Consistent with previous studies [106, 64, 55], while methanol promotes secondary
structure, it does not favor the tertiary structure required for protein folding or fibril
formation.
3.4 Conclusion
We have characterized several structural properties of our model peptides in water and
in methanol. Interestingly, methanol promotes β-sheet formation in ALPs but prevents
them from forming amyloid-like fibrils, whose core cross-β-sheet structure consists of
stacked β-sheets. We hypothesize that this effect is due to the weakening of nonpolar
interactions among peptides caused by a reduced solvophobic effect on nonpolar groups
in methanol. Concurrently, the preferential solvation of nonpolar sidechains over the
polar polypeptide backbone promotes β-sheet formation. As a result, even though there
is a higher amount of β-sheet, the sheets cannot stack effectively to form fibrils. In
contrast, ELPs cannot form β-sheets due to their high Pro content. Instead, they swell
because both their polar and nonpolar groups are better solvated.
3.5 Material & Methods
Simulation Setup We performed MD simulations of 40 replicas of each of the 6 model
peptides as a monomer in water and in methanol. Each replica was simulated at 1 bar
and 300 K for 200 ns, and the simulations of (GVPGV)7 in water and in methanol
were extended to 500 ns. Each system was solvated in a triclinic box with angles of
60°, 60° and 90°. For peptides in water, the box size was 5.4 × 5.4 × 3.8 nm³ with 3700
water molecules per system. For (GVPGV)7 and (GVGVA)7 in methanol, the box size was
7.3 × 7.3 × 5.2 nm³ with 4000 methanol molecules per system. For (PGV)12, (GGVGV)7
and (GV)18 in methanol, the box size was 6.9 × 6.9 × 4.9 nm³ with 3300 methanol
molecules per system. For G35 in methanol, the box size was 6.4 × 6.4 × 4.5 nm³ with
2700 methanol molecules per system. The initial structures of the peptides for the
different replicas were selected from simulations at 700 K in vacuo to ensure that the
replicas started from very different initial states. For simplicity, only peptide structures
without cis-Val-Pro were selected.
All simulations were performed at constant pressure and temperature with periodic
boundary conditions using Gromacs 4.0.5 [12, 48] with the OPLS-AA/L force field
[58, 61]. Explicit TIP4P water [57] and methanol [58] models were used. The linear
constraint solver (LINCS) algorithm was used to constrain all bond lengths [47, 46].
A cutoff of 1.4 nm was used for Lennard-Jones interactions. The Particle-Mesh Ewald
(PME) algorithm [26, 33] was used to calculate long-range electrostatic interactions with
a Fourier spacing of 0.12 nm and an interpolation order of 4. The Nosé-Hoover thermostat
[88, 50] was used for temperature coupling, with the peptide and solvent coupled to two
separate temperature baths and a time constant of 0.1 ps. The Parrinello-Rahman
algorithm [91] was used for pressure coupling with a time constant of 2 ps. The integration
time step was 2 fs and the system coordinates were stored every 10 ps.
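For orientation, the run lengths, time step and output interval given above translate into the following numbers of integration steps and stored frames per replica. This is a small bookkeeping sketch using integer arithmetic in femtoseconds; the function name is hypothetical, and the initial frame at time zero is not counted.

```python
FS_PER_PS = 1_000
FS_PER_NS = 1_000_000


def md_bookkeeping(run_length_ns, time_step_fs, save_interval_ps):
    """Return (integration steps, stored frames after the initial one) for one replica."""
    total_fs = run_length_ns * FS_PER_NS
    n_steps = total_fs // time_step_fs
    n_frames = total_fs // (save_interval_ps * FS_PER_PS)
    return n_steps, n_frames


# 200 ns runs (most systems) and 500 ns runs ((GVPGV)7), 2 fs step, 10 ps output.
steps_200ns, frames_200ns = md_bookkeeping(200, 2, 10)
steps_500ns, frames_500ns = md_bookkeeping(500, 2, 10)
```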
Structural Analysis The first 150 ns of each trajectory was omitted as equilibration,
based on the convergence of Rg over time as shown in Figure 3.8, resulting in 14 µs of
production time for each of the (GVPGV)7 systems (in water and in methanol) and 2 µs
of production time for each of the other systems. When characterizing interactions
between the peptide and solvent, a nonpolar interaction was counted whenever two
nonpolar heavy atoms were within a distance of 0.55 nm, which is the radius of the first
solvation shell of the peptide’s nonpolar C atoms based on radial distribution function
(RDF) analysis as shown in Figure 3.6. Only the primary and secondary C atoms were
considered because tertiary C atoms were found to have a very different solvation shell,
as shown in Figure 3.6. A cutoff of 0.45 nm was selected in the same way for calculating
the interactions between polar and nonpolar heavy atoms, as shown in Figure 3.7. The
RDF describes how the density of the particles of interest (e.g. nonpolar atoms of the
peptide) changes with the distance from a reference particle (e.g. nonpolar atoms of the
solvent), and was calculated using g_rdf from the Gromacs tools with a bin size of
0.02 nm. H-bonds were calculated using g_hbond, also from the Gromacs tools, with a
donor-acceptor distance cutoff of 0.35 nm and an acceptor-donor-hydrogen angle cutoff
of 30° [73]. The same criteria were also applied to the characterization of intramolecular
peptide-peptide interactions, except that when counting nonpolar interactions, the two
nonpolar heavy atoms had to be at least two residues apart in the primary sequence. The
β-sheet content was calculated using the DSSP program [60]. The peptide snapshots were
generated with VMD [53], and the plots were created with Matplotlib [54].
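The counting rule for intramolecular nonpolar interactions can be sketched for a single frame as follows. This is an illustration only and not the analysis code used for the thesis: it assumes that the nonpolar heavy atoms are given as (residue index, coordinates in nm) pairs, that the peptide has been made whole so that periodic images can be ignored, and that "within a distance of 0.55 nm" includes the cutoff itself. All names are hypothetical.

```python
import math
from itertools import combinations


def count_nonpolar_contacts(atoms, cutoff_nm=0.55, min_residue_separation=2):
    """Count atom pairs within the cutoff that are far enough apart in sequence.

    atoms: list of (residue_index, (x, y, z)) for nonpolar heavy atoms.
    """
    count = 0
    for (res_a, pos_a), (res_b, pos_b) in combinations(atoms, 2):
        # Skip pairs in the same or in adjacent residues.
        if abs(res_a - res_b) < min_residue_separation:
            continue
        if math.dist(pos_a, pos_b) <= cutoff_nm:
            count += 1
    return count


# Toy frame: only the pair in residues 1 and 3 (0.5 nm apart) qualifies.
toy_atoms = [
    (1, (0.0, 0.0, 0.0)),
    (2, (0.3, 0.0, 0.0)),
    (3, (0.5, 0.0, 0.0)),
    (6, (2.0, 0.0, 0.0)),
]
n_contacts = count_nonpolar_contacts(toy_atoms)
```

The pairwise loop scales quadratically with the number of atoms, which is acceptable for a single 35-residue peptide but would call for a neighbor search in larger systems.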
[Figure 3.6: two panels, (GVPGV)7 in methanol and (GGVGV)7 in methanol. x axis:
Radius (nm), 0.2 to 1.0; y axis: Density, 0.0 to 1.0; legend: Primary C, Secondary C,
Tertiary C.]
Figure 3.6: RDFs between the primary, secondary, and tertiary nonpolar C atoms of the
peptide and the nonpolar heavy atoms of the solvent for one ELP, (GVPGV)7, and one
ALP, (GGVGV)7. The first solvation shell (i.e. the radius corresponding to the first trough
in density) of tertiary C atoms extends much further than that of primary and secondary
C atoms.
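The caption identifies the radius of the first solvation shell with the first trough of the RDF. One simple way to locate it on a tabulated curve is sketched below, assuming the RDF is given as two equally long lists (radii in nm and densities) and is smooth enough that the first local minimum after the first local maximum is meaningful; a noisy RDF would need smoothing first. The function name and the toy curve are hypothetical.

```python
def first_trough_radius(radii, g):
    """Radius of the first local minimum of g(r) that follows the first local maximum."""
    peak_seen = False
    for i in range(1, len(g) - 1):
        if not peak_seen:
            # First local maximum: the first solvation peak.
            if g[i] > g[i - 1] and g[i] >= g[i + 1]:
                peak_seen = True
        elif g[i] < g[i - 1] and g[i] <= g[i + 1]:
            # First local minimum after the peak: edge of the first shell.
            return radii[i]
    return None


# Toy RDF with a peak at 0.45 nm and its first trough at 0.55 nm.
toy_radii = [0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.65]
toy_g = [0.0, 0.2, 0.9, 1.0, 0.8, 0.6, 0.7, 0.9]
shell_radius = first_trough_radius(toy_radii, toy_g)
```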
[Figure 3.7: four panels, (GVPGV)7 in water, (GVPGV)7 in methanol, (GGVGV)7 in water
and (GGVGV)7 in methanol. x axis: Radius (nm), 0.2 to 1.0; y axis: Density, 0.0 to 1.0;
legend: Primary C, Secondary C, Tertiary C.]
Figure 3.7: RDFs between the primary, secondary, and tertiary C atoms of the peptide
and the polar O atom of the solvent.
[Figure 3.8: six panels, one per peptide: (GVPGV)7, (PGV)12, (GGVGV)7, (GVGVA)7,
(GV)18, (G)35. x axis: Time (ns), with tick ranges 0 to 500 and 0 to 200; y axis: Rg (nm),
0.7 to 1.3; legend: water, methanol.]
Figure 3.8: Time evolution of the peptide Rg averaged over all 40 replicas for each
system. The vertical line marks the simulation time of 150 ns, after which the systems
are considered to be equilibrated and the data is used for analysis.
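The amount of data retained after discarding the first 150 ns follows directly from the number of replicas and the run lengths given in Section 3.5. The function name below is hypothetical.

```python
def production_time_ns(n_replicas, run_length_ns, equilibration_ns):
    """Aggregate production time of one system, summed over all replicas."""
    return n_replicas * (run_length_ns - equilibration_ns)


# (GVPGV)7 in water or methanol: 40 replicas of 500 ns each.
gvpgv7_ns = production_time_ns(40, 500, 150)
# Every other system: 40 replicas of 200 ns each.
other_ns = production_time_ns(40, 200, 150)
```

These correspond to 14 µs and 2 µs per system, respectively.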
Chapter 4
Solvent Quality Studies
4.1 Background
We have shown that methanol has a profound effect on the conformations of the model
peptides in Chapter 3 from the perspective of aggregation. This chapter will focus on
the examination of the quality of various solvents on these model peptides.
As introduced in Section 1.5, it is known from polymer physics that in a poor solvent, the
polymer molecule tends to collapse, while in a good solvent, it tends to swell, and between
the two, in a θ-solvent, the molecule behaves as a random coil [104, 90]. The relatively
larger Rg shown in Section 3.2.2, which corresponds to a greater extent of swelling of
the model peptides in methanol than in water, suggests that methanol is a better solvent
than water for such hydrophobic peptides. However, this does not necessarily mean that
methanol is a good solvent, because whether a solvent is good or poor is an absolute
rather than a relative notion, and we do not know what constitutes a θ-solvent for these
peptides. Whether methanol is a good solvent, and what kind of solvent can be a θ-solvent
for the model peptides, are the questions discussed in this chapter.
The model peptides used in this chapter are the same as in Chapter 3: two ELPs,
(GVPGV)7 and (PGV)12; three ALPs, (GGVGV)7, (GVGVA)7 and (GV)18; and G35 as a
backbone control. The set of solvents, however, is extended: in addition to water (H2O)
and methanol (CH3OH), it includes the alcohols ethanol (C2H5OH), 1-propanol (C3H7OH),
1-butanol (C4H9OH), 1-pentanol (C5H11OH), 1-hexanol (C6H13OH), 1-heptanol (C7H15OH)
and 1-octanol (C8H17OH), as well as octane (C8H18). This solvent series represents a
trend of decreasing polarity, with water and octane as the polar and nonpolar extremes.
For convenience, the prefix “1-” will be omitted when referring to the alcohols in the
rest of this thesis. Note that polar and nonpolar are used to describe the solvents, while
hydrophilic and hydrophobic are used to describe the peptides. The difference is that a
hydrophobic molecule is not necessarily nonpolar, while a nonpolar molecule is usually
also hydrophobic, and the same logic applies to hydrophilic and polar. For example, a
hydrophobic peptide also has a polar backbone, and hence is not nonpolar, but a nonpolar
octane molecule can also be described as hydrophobic. Besides, it would be awkward in
certain circumstances to describe a solvent as hydrophobic or hydrophilic. For example,
it does not make much sense to describe water as being more hydrophilic or less
hydrophobic than methanol, while it is natural to say that water is more polar or less
nonpolar than methanol.
In comparison with the results from simulations at high temperature in vacuo, the con-
dition of which is used to mimic that of a θ-solvent, we found that none of the above
solvents is a θ- or good solvent for the model peptides. In order to find a θ- or good sol-
vent, other factors besides polarity such as the uneven distribution of polar and nonpolar
groups in the peptides need to be taken into consideration.
4.2 Results
4.2.1 Radius of Gyration
The Rg of the 6 model peptides in different solvents are shown in Figure 4.1. For each
peptide, Rg is shown as a function of the length of the alkyl chain of the solvent. The
shape of the curve varies from peptide to peptide. For (PGV)12, Rg keeps increasing
until ethanol, then flattens out, and starts decreasing at heptanol. For (GVPGV)7, Rg
reaches its peak at propanol. The dip at butanol is unexpected and, according to
the distribution of Rg, it is a result of the peptide being trapped in local minima where
Rg is low. Its Rg also starts decreasing at heptanol. The Rg of (GVGVA)7 and (GV)18
are very close from water to pentanol except for that at butanol, and they reach their
peaks at hexanol and heptanol, respectively. For (GGVGV)7, the Rg reaches its peak
at hexanol. For G35, the Rg also reaches its peak at heptanol, but at a much lower
scale than the other peptides. Although the model peptides are more hydrophobic than
average [78], they still have polar backbones, so both water and octane, which represent
the polar and nonpolar ends of the spectrum of solvent polarity, are the poorest solvents
in the series. The initial increase of Rg for all the peptides at the beginning indicates
an improvement in solvent quality. This is expected because as the alkyl chain of the
alcohol becomes longer, the solvent becomes more nonpolar, which results in a weaker
solvophobic effect and hence stronger peptide-solvent interactions. The decrease after
heptanol indicates that if the solvent becomes too nonpolar, peptide-peptide interactions
between polar peptide backbones become more favored and over-compensate for
the improved solvation of the hydrophobic sidechains.
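For reference, the quantity plotted in Figure 4.1 can be computed from atomic coordinates roughly as sketched below. The thesis states only that Rg is calculated as in Figure 3.2, so the mass-weighted definition used here is an assumption, and the function name and argument layout are illustrative.

```python
import math

def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a set of (x, y, z) coordinates.

    Assumption: the standard definition Rg^2 = sum(m_i |r_i - r_com|^2) / sum(m_i).
    With masses=None every atom gets unit mass.
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    # Center of mass, one component at a time
    com = [sum(m * r[i] for m, r in zip(masses, coords)) / total for i in range(3)]
    # Mass-weighted mean squared distance from the center of mass
    msd = sum(
        m * sum((r[i] - com[i]) ** 2 for i in range(3))
        for m, r in zip(masses, coords)
    ) / total
    return math.sqrt(msd)
```

The result has the same length unit as the input coordinates (nm for Gromacs trajectories).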
If the peptides are ordered by the length of the alkyl chain of the solvent in which each
first reaches its peak value in Rg, as shown in Table 4.1, the following relationship is
obtained: the less hydrophobic the peptide is, the more sensitive it is to a decrease in
solvent polarity, with G35 as the only outlier.
Peptide      Hydrophobicity score    Solvent
(PGV)12       0.73                   ethanol (C2H5OH)
(GVPGV)7      1.2                    propanol (C3H7OH)
(GGVGV)7      1.44                   hexanol (C6H13OH)
(GVGVA)7      1.88                   heptanol (C7H15OH)
(GV)18        1.9                    heptanol (C7H15OH)
G35          -0.4                    heptanol (C7H15OH)
Table 4.1: Peptide hydrophobicity and the solvent in which the peptide first reaches its
maximum Rg. The hydrophobicity score is calculated using the Kyte-Doolittle hydrophobicity
scale [65].
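The scores in Table 4.1 can be reproduced with the short sketch below. It assumes that the score is the mean Kyte-Doolittle hydropathy per residue; the thesis does not spell out the averaging, but this choice matches every entry in the table. The dictionary and function names are illustrative.

```python
# Kyte-Doolittle hydropathy values for the four residue types in the model peptides
KYTE_DOOLITTLE = {"G": -0.4, "V": 4.2, "P": -1.6, "A": 1.8}

# One-letter sequences of the six model peptides
PEPTIDES = {
    "(PGV)12": "PGV" * 12,
    "(GVPGV)7": "GVPGV" * 7,
    "(GGVGV)7": "GGVGV" * 7,
    "(GVGVA)7": "GVGVA" * 7,
    "(GV)18": "GV" * 18,
    "G35": "G" * 35,
}

def mean_hydropathy(sequence):
    """Average Kyte-Doolittle hydropathy per residue (assumed definition of the score)."""
    return sum(KYTE_DOOLITTLE[res] for res in sequence) / len(sequence)

SCORES = {name: mean_hydropathy(seq) for name, seq in PEPTIDES.items()}
```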
These results are consistent with how tropoelastin is isolated in experiments, where it
remains soluble in propanol:butanol (3:5 in volume), while other proteins precipitate [78].
4.2.2 Secondary Structure Content
Figure 4.2 shows the analysis of various structures defined in DSSP [60]. As it shows, there
is no helix formation in any of the peptides in any of the solvents. The content of bends
remains roughly constant across all solvents. The coil content decreases dramatically
in octane but remains constant in the other solvents; this decrease corresponds to an increase
in the propensity for β-sheets, β-bridges, and H-bonded turns.
As for the β-sheet subplot, it is not surprising that the content changes little for ELPs,
given the presence of Pro, and G35 is an outlier. However, it is unexpected that
the content does not keep increasing for ALPs as the solvent becomes more nonpolar.
Instead, it tends to decrease, except in octane. A comparison of the β-sheet
content in water and methanol with that in Figure 3.5, which is calculated from another
independent set of simulations, reveals a discrepancy. As we found out later, the
discrepancy is due to the introduction of cis- peptide bonds in the initial conformations of
the peptides, which will be discussed in detail in Subsections 4.2.4 and 4.2.5.

[Figure 4.1 plot: Rg (nm) versus solvent (water, methanol, ethanol, propanol, butanol,
pentanol, hexanol, heptanol, octanol, octane) for (GVPGV)7, (PGV)12, (GGVGV)7,
(GVGVA)7, (GV)18 and (G)35.]
Figure 4.1: Average Rg of the model peptides in water, alcoholic solvents, and octane.
The alcoholic solvents are ordered by the length of the alkyl chain. Rg is calculated the
same way as in Figure 3.2.

[Figure 4.2 plot: panels for β-sheet (E), α-helix (H), H-bonded turn (T), β-bridge (B),
310-helix (G), coil (C), bend (S) and π-helix (I); content versus solvent (water to
octane) for (GVPGV)7, (PGV)12, (GGVGV)7, (GVGVA)7, (GV)18 and (G)35.]
Figure 4.2: Various types of backbone structures as defined in DSSP [60] for all model
peptides in all solvents.
4.2.3 Size of Peptides In Vacuo
The results of Rg in alcohols in Figure 4.1 show that there is a plateau for ELPs between
propanol and heptanol, at around Rg = 1.22 nm, and that a solvent with a longer alkyl
chain than heptanol does not extend the peptide any further, as octanol is too nonpolar.
An interesting question raised by this observation is whether 1.22 nm is greater
than the value of Rg in a θ-solvent, which we call θ-Rg. In other words, are any
of the alcoholic solvents considered either θ-solvents or good solvents for the ELPs? To
address this question, simulations at a series of temperatures from 300 to 4039 K in vacuo
were conducted to estimate θ-Rg, since at a sufficiently high temperature the
conformational entropy of the system dominates the conformational
ensemble, so that the peptide is maximally disordered, as in a θ-solvent.
Figure 4.3 shows that the Rg of ELPs increases rapidly as the temperature rises to 1000
K, and becomes constant above 2000 K at around 1.51 nm and 1.48 nm. We postulate
that these maximum values correspond to maximum disorder and therefore approximate
the θ-Rg of the two ELPs. The normal distribution of end-to-end distances as shown
in Figure 4.4 also indicates that the ELPs behave approximately as random chains at
such high temperatures. These results suggest that the peptides are not as extended in
alcohols as they are at high temperatures in vacuo. Therefore, the alcohols are neither θ-
nor good solvents. One possible reason is the presence of intramolecular H-bonds, which
presumably contribute to the peptides’ collapse. Increasing solvent nonpolarity does not
always diminish intramolecular peptide-peptide H-bonds as shown in Figure 4.5, whereas
the peptides at high temperatures in vacuo form virtually no H-bonds as shown in Figure
4.3.
[Figure 4.3 plot: Rg (nm) and peptide-peptide H-bonds versus temperature (K) for
(GVPGV)7 and (PGV)12.]
Figure 4.3: Rg and intramolecular peptide-peptide H-bonds propensity of ELPs in vacuo
as a function of temperature.

[Figure 4.4 plot: probability (P) versus end-to-end distance; panels (GVPGV)7
(r2 = 0.996) and (PGV)12 (r2 = 0.996).]
Figure 4.4: Distribution of end-to-end distance of ELPs in vacuo at 2707 K. The solid
lines are fits to normal distributions. r2 indicates the fitting quality.

[Figure 4.5 plot: peptide-peptide H-bonds versus solvent (water to octane) for
(GVPGV)7, (PGV)12, (GGVGV)7, (GVGVA)7, (GV)18 and (G)35.]
Figure 4.5: Intramolecular H-bonds propensity of the model peptides in various solvents.
The total number of H-bonds is normalized by the number of H-bonding groups in each
peptide.
4.2.4 The Discrepancy in β-sheet Content
The simulations reported in Chapter 3 and Chapter 4 are referred to as Dataset 1 &
Dataset 2 (Ds. 1 & Ds. 2). Ds. 1 is comprised of 12 different systems, which consist of 6
model peptides successively in water and in methanol. Ds. 2 is comprised of 60 systems,
which correspond to all possible combinations of the same 6 model peptides and 10
different solvents. Therefore, the systems included in Ds. 2 are a superset of those in Ds.
1, and the comparison between them will focus on the common simulation systems.
The discrepancy in β-sheet content observed between Ds. 1 and Ds. 2 is plotted in Figure
4.6. It shows that the β-sheet content is significantly higher in Ds. 1 than that in Ds. 2.
The only difference between the two independent sets of simulations is the temperature
selected when preparing the initial conformations in vacuo. By design, we used high
temperature in order to ensure that the initial conformations for different replicas of the
same system are very different from each other so as to reduce the systematic sampling
error (as discussed in Chapter 2). The temperatures selected for Ds. 1 and Ds. 2 are 700 K
and 3000 K, respectively. The conformations generated at 3000 K are expected to be more
heterogeneous than those produced at 700 K. However, the former turn out to contain
an unrealistic number of cis- peptide bonds, which we think is probably the major cause
of the discrepancy in β-sheet content because they have been known to induce different
structural properties than their trans- counterparts [56, 89, 40]. The details about the
ratio of the cis/trans peptide bonds will be shown in the next subsection. In contrast,
as shown in Figure 4.7, the distributions of Rg of the peptides in water and in methanol
are not significantly affected by those cis- peptide bonds. Moreover, similar results have
already been obtained for (PGV)12 using another force field, CHARMM22* [95], as shown
in Figure 4.8. Taken together, the analysis suggests that the results depicted in Figure
4.1 are reproducible, which is currently being verified.
[Figure 4.6 plot: β-sheet content of the six model peptides; bars for Ds. 1 water,
Ds. 2 water, Ds. 1 methanol and Ds. 2 methanol.]
Figure 4.6: Comparison of β-sheet content between Ds. 1 and Ds. 2.

[Figure 4.7 plot: one panel per model peptide; probability (P) versus Rg (nm) for
Ds. 1 water, Ds. 1 methanol, Ds. 2 water and Ds. 2 methanol.]
Figure 4.7: Comparison of Rg in Ds. 1 and Ds. 2.

[Figure 4.8 plot: radius of gyration (Rg) (nm) versus solvent (water to octane) for
(PGV)12 with CHARMM22* and with OPLS-AA/L.]
Figure 4.8: Average Rg of (PGV)12 in different solvents, successively using the low-
temperature (Ds. 1) protocol with CHARMM22* and the high-temperature (Ds. 2) pro-
tocol with OPLS-AA/L. The simulations in CHARMM22* include only 7 solvents.
4.2.5 Ratio of cis/trans Peptide Bonds
In total, 4 types of residues (G, V, P, A), and two types of bonds are present in the model
peptides, which are the single bond and the partial double bond (i.e. the C=O bond in
the peptide backbone and the peptide bond). The equilibrium population of torsional
isomers around single bonds is readily sampled at room T because of their low torsional
energy barriers, and that around C=O bonds is also readily sampled because there is only
one isomer. However, the equilibrium population around the peptide bonds is not easily
sampled because of the high energy barrier between their cis and trans configurations,
which can only be crossed at high temperatures in the simulations.
Pal et al. studied the occurrence of cis/trans peptide bonds in the PDB database
(http://www.rcsb.org) in 1999, and found that peptide bonds are mostly in their trans
configurations with only 0.3% in their cis configurations. For X-Pro, the proportion of
cis configurations is much higher at 5.7%, which accounts for 87% of all the cis- peptide
bonds [89]. The high fraction of cis-X-Pro is due to the rigidity of the
pyrrolidine ring in Pro, which decreases the double-bond character of the pre-proline peptide
bond and therefore the energy barrier between its cis and trans configurations.
Considering that Pal’s study in 1999 may be outdated since only 294 X-ray structures
were included, we have reanalyzed the percentage of cis-X-Pro using the current PDB
database, which included 86468 structures at the time of analysis. A peptide bond is
defined as in its cis configuration if the corresponding dihedral angle is closer to 0◦ than
to 180◦. Otherwise, it is defined as in its trans configuration. The structures are without
redundancy check, and include both proteins and nucleic acids generated by both X-ray
crystallography and NMR studies. The inclusion of nucleic acids does not affect the result
since they contain no peptide bonds at all. The percentage of cis-X-Pro is calculated as
\[ \text{cis-X-Pro}\,\% = \frac{N_{\text{cis-X-Pro}}}{N_{\text{cis-X-Pro}} + N_{\text{trans-X-Pro}}}, \tag{4.1} \]
and the result is shown in Table 4.2. The overall percentage of cis-X-Pro is 4.5%, which
is lower than 5.7% but of the same order. Interestingly, the percentage of the
cis-V-Pro, which is the only type of cis-X-Pro available in the model peptides, is even
lower than the overall value, at 2.6%, while those involving aromatic residues (F, Y, W) are
relatively higher. With a cis fraction of 4.5%, the energy difference between the cis and
trans configurations of X-Pro is calculated from the Boltzmann distribution to be
7.6 kJ·mol−1, or 1.8 kcal·mol−1, assuming a temperature of 300 K.
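The classification rule, Equation (4.1) and the energy estimate above can be sketched as follows. This is an illustrative sketch rather than the script used for the PDB analysis: the function names are hypothetical, the input is assumed to be a plain list of ω dihedral angles in degrees, and the energy is obtained from a two-state Boltzmann ratio.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1

def is_cis(omega_deg):
    """True if the omega dihedral is closer to 0 degrees than to 180 degrees."""
    folded = abs((omega_deg + 180.0) % 360.0 - 180.0)  # fold the angle into [0, 180]
    return folded < 90.0

def percent_cis(omegas_deg):
    """Equation (4.1) expressed as a percentage."""
    n_cis = sum(1 for w in omegas_deg if is_cis(w))
    return 100.0 * n_cis / len(omegas_deg)

def cis_trans_energy_gap(percent, temperature=300.0):
    """Energy of cis relative to trans (kJ/mol), assuming a two-state Boltzmann ratio."""
    p_cis = percent / 100.0
    return -R_KJ * temperature * math.log(p_cis / (1.0 - p_cis))

gap_kj = cis_trans_energy_gap(4.5)   # overall cis-X-Pro percentage from Table 4.2
gap_kcal = gap_kj / 4.184            # kJ -> kcal
```

With the overall value of 4.5% at 300 K, `gap_kj` and `gap_kcal` round to the 7.6 kJ·mol−1 and 1.8 kcal·mol−1 quoted in the text.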
X-Pro  cis %    X-Pro  cis %    X-Pro  cis %    X-Pro  cis %
IP      2.1     LP      2.8     KP      3.9     EP      6.8
CP      2.1     TP      3.4     HP      4.0     GP      6.9
DP      2.3     AP      3.5     NP      5.0     FP      7.4
VP      2.6     RP      3.5     SP      6.1     YP     12.1
MP      2.7     QP      3.7     PP      6.3     WP     13.7
Overall: 4.5
Table 4.2: Percentages of different cis-X-Pro calculated based on data from the PDB
database. The overall percentage is that of cis configurations out of all X-Pro peptide
bonds. The row in blue highlights that V-Pro is the only type of X-Pro available in the
model peptides.
Therefore, it is justifiable that for a short peptide with 35 peptide bonds or 11 V-Pro
peptide bonds as in (PGV)12, first, the percentage of cis-X-nonPro peptide bonds should
be set to zero (35 × 0.3% × (1 − 87%) = 0.014 ≈ 0); second, the percentage of cis-X-Pro
peptide bonds can also be set to zero for simplicity (11 × 2.6% = 0.29, i.e. fewer than
one cis bond per peptide on average). However, at temperatures as high as 3000 K, the
systems were found to be contaminated with too many cis- peptide bonds, the details of
which are shown in Figures 4.9 and 4.10 and summarized in Table 4.3.
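The two expected counts can be checked by evaluating the products directly. The snippet below is a minimal sketch using only the numbers quoted in this subsection; the function name is illustrative.

```python
def expected_cis_bonds(n_bonds, cis_fraction):
    """Expected number of cis bonds among n_bonds with a given cis probability."""
    return n_bonds * cis_fraction

# X-nonPro: 0.3% of all peptide bonds are cis, of which 87% are X-Pro
x_nonpro = expected_cis_bonds(35, 0.003 * (1 - 0.87))

# V-Pro: 2.6% cis (Table 4.2), 11 V-Pro bonds in (PGV)12
v_pro = expected_cis_bonds(11, 0.026)
```

Evaluated this way, the X-nonPro count is about 0.014 and the V-Pro count is about 0.29 per peptide.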
In Ds. 1, in which the initial conformations were generated at 700 K, few X-nonPro
peptide bonds were in their cis configurations. By contrast, Figure 4.9 shows that in
Ds. 2, in which the initial conformations were generated at 3000 K, the fractions of cis
configurations in different replicas vary from 0 to as high as 0.6, which is unrealistic
compared to a fraction of 0.00039 (0.3% × (1 − 87%)) in nature [89]. Since the presence
of cis- peptide bonds affects the formation of secondary structure [56, 89, 40], it is not
surprising to see a much lower content of β-sheet in Ds. 2, as shown in Figure 4.6, with
such a high amount of cis-X-nonPro.

[Figure 4.9 plot: one panel per model peptide; number of replicas versus fraction of
cis-X-nonPro for Ds. 2 water and Ds. 2 methanol.]
Figure 4.9: Number of replicas (y-axis) vs. fraction of cis-X-nonPro (x-axis) in Ds. 2.

[Figure 4.10 plot: panels for the two ELPs; number of replicas versus fraction of
cis-X-Pro for Ds. 2 water and Ds. 2 methanol.]
Figure 4.10: Number of replicas (y-axis) vs. fraction of cis-X-Pro (x-axis) in Ds. 2.

              cis-X-nonPro        cis-X-Pro
Peptide       Ds. 1    Ds. 2      Ds. 1    Ds. 2
(GVPGV)7      0.01     0.17       0        0.33
(PGV)12       0.01     0.17       0        0.30
(GGVGV)7      0.02     0.18       -        -
(GVGVA)7      0.01     0.18       -        -
(GV)18        0.01     0.18       -        -
G35           0.03     0.25       -        -
Table 4.3: Summary of the fraction of cis-X-nonPro and cis-X-Pro in Ds. 1 and Ds. 2.
For X-Pro, at 700 K, little cis-X-Pro was introduced in (GVPGV)7 whereas its fraction
was about 0.2 in (PGV)12, which suggests that, with its larger number of X-Pro bonds,
(PGV)12 acquires cis-X-Pro bonds more readily than (GVPGV)7. These conformations were
omitted from the analysis of results for simplicity as mentioned above. At 3000 K,
cis-X-Pro becomes much more abundant, with fractions varying from 0.0 to 0.6 as shown
in Figure 4.10.
4.3 Discussion
We have examined the solvent quality of various solvents of different polarities for a set of
model peptides. Relative solvent quality was measured by the peptides’ average Rg in a
particular solvent: the larger the Rg, the better the solvent quality. Although the shape
of the Rg dependence on the solvent differs from peptide to peptide, in general the
solvent quality increases as the solvent becomes less polar, up to a turning point
(e.g. heptanol) beyond which the solvent becomes too nonpolar and
its quality begins to decrease. The increase of solvent quality along with its nonpolarity
is due to the hydrophobicity of the model peptides, while the existence of a turning
point despite such overall hydrophobicity is probably due to the polarity of the peptides’
backbone. For some of the model peptides, most notably ELPs, there is a plateau in
Rg, suggesting that Rg has little dependence on the solvent within the range of polarity
covered by this plateau. In addition, we also found a positive relationship between the
peptide’s hydrophobicity score and the nonpolarity of the solvent in which the peptide
first reaches its maximum Rg value, with G35 as an outlier.
We have also investigated θ-Rg of the ELPs by simulating them at very high temperatures
in vacuo. Interestingly, θ-Rg is much higher than the maximum Rg obtained in alcohols,
which are therefore all poor solvents of the model peptides in spite of their similar non-
polarity/hydrophobicity with the peptides. This result suggests that a solvent that is as
nonpolar/hydrophobic as the solute peptide is not necessarily a θ-solvent. Other factors
such as the uneven distribution of the polar and nonpolar groups, which distinguishes the
peptides from their homogeneous counterparts (i.e. synthetic polymers like polystyrene
and polyethylene), must be taken into consideration. The finding from our group that
the backbone of an ELP remains partially hydrated even as the peptide approaches the
condition of a polymer melt corroborates the difficulty of achieving ideal (θ) solvation
even for such highly disordered peptides [97].
Unfortunately, inconsistency in properties such as β-sheet content is observed between
the results presented in this chapter and Chapter 3. A thorough investigation on this
issue revealed that the inconsistency is due to an unusually high ratio of cis/trans pep-
tide bonds in the peptides in this chapter, which is a result of improper preparation of
the initial peptide conformations at too high a temperature (3000 K) in vacuo. The
abnormal ratio makes many structural properties unreliable, but as we have shown,
Rg is not significantly affected, and qualitatively the same result for Rg of (PGV)12 has
been obtained with proper initial conformations in another force field, CHARMM22*.
Therefore, we think that the Rg part of the results in Section 4.2.1 is reproducible after
correcting the inappropriate step, which is currently being verified. We recommend that
the next student who continues to work on this project pay particular attention to
the preparation of the initial conformations, since they are generated de novo in
computers rather than using experimental techniques like X-ray crystallography or NMR,
and are thereby more likely to be subject to artifacts.
4.4 Material & Methods
Simulations in solvents in OPLS-AA/L We performed 40 replicas of each of the
6 model peptides as a monomer in each of the 10 solvents. Each replica was simulated
at 1 bar and 300 K for 200 ns, and the simulations of (GVPGV)7 in water and in methanol
were extended to 500 ns. Each system was solvated in a triclinic box with angles of 60°,
60°, 90°. The box size of each system and the number of solvent molecules in it are listed
in Table 4.4. The initial structures of the peptides were generated at 3000 K in vacuo,
which resulted in a considerable amount of cis- peptide bonds as discussed in Subsection
4.2.5.
Solvent     Box size (nm³)       No. of solvent molecules
water       5.4 × 5.4 × 3.8      3700
methanol    6.9 × 6.9 × 4.9      3300
ethanol     6.7 × 6.7 × 4.8      2200
propanol    7.1 × 7.1 × 5.0      2000
butanol     7.6 × 7.6 × 5.4      2000
pentanol    7.8 × 7.8 × 5.5      1800
hexanol     8.0 × 8.0 × 5.7      1700
heptanol    7.9 × 7.9 × 5.6      1400
octanol     7.0 × 7.0 × 4.9      900
octane      5.2 × 5.2 × 3.7      350
Table 4.4: Box size of and number of solvent molecules in each system.

All simulations were performed at constant pressure and constant temperature with
periodic boundary conditions. The simulation package used was Gromacs-4.0.5 for all systems
except those in heptanol, for which Gromacs-4.5.5 was used. Models of explicit TIP4P
water [57], methanol, ethanol and propanol [58] from the Gromacs-4.5.5 software package
were used directly. Models for all the other solvents were constructed with g_x2top from
the Gromacs-4.5.5 tools and completed with an in-house script. The LINCS algorithm was used
to constrain all bond lengths [47, 46]. A cutoff of 1.4 nm was used for Lennard-Jones
interactions. The PME algorithm [26, 33] was used to calculate long-range electrostatic
interactions with a Fourier spacing of 0.12 nm and an interpolation order of 4. The Nose-Hoover
thermostat [88, 50] was used for temperature coupling, with the peptide and solvent
coupled to two separate temperature baths and a time constant of 0.1 ps. The Parrinello-Rahman
barostat [91] was used for pressure coupling with a time constant of 2 ps. The integration
step size was 2 fs and the system coordinates were stored every 10 ps.
The first 150 ns of each trajectory was omitted for equilibration, resulting in a total of
14 µs of production time for the systems of (GVPGV)7 in water and in methanol, and a
total of 2 µs of production time for all other systems.
Simulations in vacuo Simulations in vacuo were performed using Gromacs-4.5.5 with the
OPLS-AA/L force field at constant temperature without pressure coupling. The
temperatures were 300, 366, 447, 546, 667, 815, 996, 1216, 1485, 1814, 2216, 2707, 3306
and 4039 K. Eight replicas were used for each system at each temperature and simulated for
200 ns, resulting in 1.6 µs of production time per system at each temperature. The LINCS
algorithm was used to constrain all bond lengths [47, 46]. The shift algorithm was used to
calculate Lennard-Jones and electrostatic interactions with cutoffs of 0.9 and 0.8 nm, respectively.
The Nose-Hoover thermostat [88, 50] was used for temperature coupling with a time constant
of 0.1 ps. The integration step size was 0.1 fs to avoid system crashes at high temperatures
and the system coordinates were stored every 10 ps. The translation of and rotation
around the center of mass were removed every 10 steps to avoid the flying ice cube effect
[45].
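The thesis does not state how the 14 temperatures were chosen. As an observation only, the listed values coincide with a geometric ladder starting at 300 K with a spacing of exp(0.2) between neighbours, truncated to integers. The sketch below regenerates the list under that assumption; the function name and its parameters are illustrative.

```python
import math

# Temperatures (K) as listed in the text
REPORTED_TEMPERATURES = [300, 366, 447, 546, 667, 815, 996,
                         1216, 1485, 1814, 2216, 2707, 3306, 4039]

def geometric_ladder(t0=300.0, log_step=0.2, n=14):
    """Assumed scheme: T_i = t0 * exp(log_step * i), truncated to an integer."""
    return [int(t0 * math.exp(log_step * i)) for i in range(n)]

LADDER = geometric_ladder()
```

A geometric spacing of this kind samples low temperatures more densely than high ones, which is a common choice when the property of interest changes fastest at the low end of the range.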
Simulations in CHARMM22* The CHARMM22* [95] force field was downloaded
from http://www.gromacs.org/Downloads/User_contributions/Force_fields. The
solvent models were from CGenFF version 2b7 [116]. The atomic charges were generated
by the CGenFF program version 0.9.6 beta from https://www.paramchem.org/AtomTyping/
[118, 117]. The simulations in CHARMM22* were conducted using Gromacs-4.5.5
and included 10 300-ns replicas with the first 150 ns discarded as equilibration,
resulting in 1.5 µs of production time per system. All other technical details were
the same as for the simulations in OPLS-AA/L, except that the time constant for the
thermostat was 2 ps. The box size and number of solvent molecules per system are shown
in Table 4.5.
Solvent     Box size (nm³)       No. of solvent molecules
water       5.4 × 5.4 × 3.8      3700
methanol    6.8 × 6.8 × 4.8      3300
ethanol     6.7 × 6.7 × 4.7      2200
pentanol    7.7 × 7.7 × 5.4      1800
heptanol    7.6 × 7.6 × 5.3      1300
octanol     7.2 × 7.2 × 5.1      1000
octane      6.1 × 6.1 × 4.3      600
Table 4.5: Box size of and number of solvent molecules in each system in CHARMM22*
force field.
Chapter 5
Modeling Mechanical Properties
5.1 Background
The most important function of elastin is to provide elasticity to biological tissues, and
understanding the underlying mechanism is crucial for future applications like biomimetic
materials engineering. Therefore, in this chapter, we focus on gaining insights into the
molecular mechanism of elasticity in elastin by modeling its mechanical properties based
on MD simulations of ELPs.
For all MD simulation studies, it would be very helpful to have a direct comparison
between results in silico and those in experiments. However, experimentalists usually
work at much larger scales of both time and size than MD simulators. Therefore,
we attempted to model the macroscopic properties from the microscopic ones obtainable
from our MD studies. At the microscopic level, the major mechanical properties of
concern in this chapter are the modulus (k) and equilibrium length (d0) of a monomer
peptide. Both k and d0 can be calculated from the peptide’s end-to-end distance
distribution, which is directly obtainable from MD simulations. At the macroscopic level, we
modeled the Young’s modulus of a piece of elastin-like material with k and d0. Young’s
modulus is defined as
\[ K_Y = \frac{\text{stress}}{\text{strain}} = \frac{F/A}{\Delta l / l}, \tag{5.1} \]
where stress is defined as the quotient of the recoiling force (F) divided by the
cross-sectional area (A) of the material, and strain is defined as the ratio of the extension (Δl)
to the length (l) of the material. Hence KY has units of pressure, e.g. MPa.
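As a quick illustration of Equation (5.1), the sketch below evaluates Young's modulus from a force, cross-sectional area, extension and rest length. The function name and the example numbers are illustrative and are not taken from the thesis.

```python
def youngs_modulus(force, area, extension, length):
    """Equation (5.1): K_Y = (F / A) / (delta_l / l).

    With SI inputs (N, m^2, m, m) the result is in Pa; divide by 1e6 for MPa.
    """
    stress = force / area          # force per unit cross-sectional area
    strain = extension / length    # fractional extension (dimensionless)
    return stress / strain

# Hypothetical example: 1 N on a 1 mm^2 cross section stretches a 1 cm sample by 1 mm
example_pa = youngs_modulus(1.0, 1e-6, 1e-3, 1e-2)
example_mpa = example_pa / 1e6
```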
Our approach for relating the microscopic and macroscopic worlds is to start by calculat-
ing k and d0 of a monomer, then to use these values to model KY for a piece of elastin-like
material, and at last to compare the modeled KY to the experimental values.
5.2 Theory
5.2.1 Modulus of a Monomer as a Spring
The modulus of a peptide monomer is calculated by fitting a parabola, which has the
same shape as the potential of a spring, to the system’s potential of mean force (PMF)
profile along the peptide’s end-to-end distance (d). The PMF is defined as the change of
free energy along a reaction coordinate, which is the end-to-end distance of the peptide
in this case. According to the Boltzmann distribution,
\[ p_d = \frac{e^{-G_d / k_B T}}{Z} = \frac{e^{-\beta G_d}}{Z}, \tag{5.2} \]
where pd and Gd are the probability and free energy when the end-to-end distance is
d, kB is the Boltzmann constant, T is the absolute temperature, β = 1/(kBT), and Z is the
partition function. At zero extension, i.e. when d = d0,
\[ p_0 = \frac{e^{-\beta G_0}}{Z}, \tag{5.3} \]
where p0 and G0 are the corresponding probability and free energy. Dividing Equation
(5.2) by (5.3) yields
\[ \frac{p_d}{p_0} = e^{-\beta (G_d - G_0)}. \tag{5.4} \]
Since the PMF only determines free energy differences, G0 can be set to 0. Taking the
logarithm of both sides of Equation (5.4) then gives
\[ \ln \frac{p_d}{p_0} = -\beta G_d, \tag{5.5} \]
so that
\[ G_d = -\frac{1}{\beta} \ln \frac{p_d}{p_0}, \tag{5.6} \]
which is then fitted to a parabola,
\[ G_d = -\frac{1}{\beta} \ln \frac{p_d}{p_0} \cong \frac{1}{2} k (d - d_0)^2, \tag{5.7} \]
where k is the modulus and d0 is the equilibrium length, i.e. the end-to-end distance
when the peptide is in its relaxed state.
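The procedure in Equations (5.2) to (5.7) can be sketched as below. This is a minimal illustration, not the analysis code used in the thesis. The assumptions are: the most populated histogram bin stands in for p0, the fit includes a free constant offset, bins are weighted by their counts, and the bin number is arbitrary. All function names are hypothetical. With distances in nm, k comes out in kJ·mol−1·nm−2.

```python
import math
import random

KB = 0.0083144626  # Boltzmann constant in kJ mol^-1 K^-1

def pmf_from_samples(distances, n_bins=40):
    """Histogram d; return bin centers, G_d in units of kBT (Eq. 5.6) and counts."""
    lo, hi = min(distances), max(distances)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for d in distances:
        counts[min(int((d - lo) / width), n_bins - 1)] += 1
    n_ref = max(counts)  # assumption: the most populated bin plays the role of p_0
    centers, g, weights = [], [], []
    for i, n in enumerate(counts):
        if n > 0:  # empty bins carry no PMF estimate
            centers.append(lo + (i + 0.5) * width)
            g.append(-math.log(n / n_ref))
            weights.append(n)
    return centers, g, weights

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(3):
        p = max(range(i, 3), key=lambda r: abs(m[r][i]))
        m[i], m[p] = m[p], m[i]
        for r in range(i + 1, 3):
            f = m[r][i] / m[i][i]
            for c in range(i, 4):
                m[r][c] -= f * m[i][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (m[i][3] - sum(m[i][c] * x[c] for c in range(i + 1, 3))) / m[i][i]
    return x

def fit_spring(distances, temperature=300.0, n_bins=40):
    """Weighted least-squares fit of the PMF to a parabola (Eq. 5.7); returns (k, d0)."""
    centers, g, w = pmf_from_samples(distances, n_bins)
    shift = sum(centers) / len(centers)  # center d for numerical stability
    xs = [c - shift for c in centers]
    # Normal equations for G/kBT = c0 + c1*x + c2*x^2 with weights w
    s = [sum(wi * xi ** p for wi, xi in zip(w, xs)) for p in range(5)]
    t = [sum(wi * gi * xi ** p for wi, gi, xi in zip(w, g, xs)) for p in range(3)]
    c0, c1, c2 = solve3([s[0:3], s[1:4], s[2:5]], t)
    k = 2.0 * c2 * KB * temperature      # (1/2) k = c2 * kBT
    d0 = shift - c1 / (2.0 * c2)         # vertex of the parabola
    return k, d0

def synthetic_distances(k, d0, temperature=300.0, n=100000, seed=0):
    """Test data: Boltzmann-distributed lengths of an ideal harmonic spring."""
    rng = random.Random(seed)
    sigma = math.sqrt(KB * temperature / k)
    return [rng.gauss(d0, sigma) for _ in range(n)]
```

Calling `fit_spring(synthetic_distances(10.0, 3.0))` should recover values close to k = 10 and d0 = 3, which provides a simple sanity check before applying the fit to simulation data.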
5.2.2 Young’s Modulus in the Tetrahedron Model
We developed a mathematical model named the tetrahedron model to calculate the
Young’s modulus for a piece of macroscopic elastin-like material based on the obtained
modulus (k) and equilibrium length (d0) of a monomer. The calculated Young’s modulus
can then be directly compared to experimental measurements.
The tetrahedron model considers the elastin-like material to be a collection of tetrahedra
in its relaxed state. In this model, each XL domain is represented by a node or a crosslink,
and each HP domain is represented by the edge connecting two crosslinks. Each crosslink
has a valence of four, which means that it is connected to four HP domains. This is similar
to native elastin, where four Lys residues from two XL domains interact to form a
crosslink (e.g. desmosine or isodesmosine) [123]. Since each XL domain is flanked by
two HP domains, each crosslink is connected to four HP domains. The material in
this model is assumed to retain a constant volume during extension, so as the material
increases in length, it decreases in width and height. The model is constructed in
the following sequence: we start by calculating the modulus of a spring complex (kc) as
shown in Figure 5.1; then, based on kc, we analyze the modulus of a unit cell (ku), which
contains 4 tetrahedra as shown in Figure 5.2; finally, based on ku, we derive the Young’s
modulus (KY) of the material.
[Figure 5.1 diagram: points O, A, A′, B, B′, O1, O2, O′1 and O′2; lengths d0, d, x0, x, X,
s0 and s; angles θ = 54.7° and θ′; applied force F.]
Figure 5.1: A spring complex as defined in the tetrahedron model. Please refer to the
text for a detailed description.
A spring complex is defined as two springs forming a tetrahedral angle (θtetra), i.e.
∠O1AO2 = θtetra = 109.4712°, as shown in Figure 5.1. Consider the following process:
starting from Point A, where the complex is in its relaxed state, a force F is applied to
A along the OA direction, perpendicular to O1O2. Then A shifts to A′, and O1 and O2 shift to
O′1 and O′2, respectively; s0, the projection of O1A on O1O, becomes s; ∠O1AO, which
equals θ = (1/2)∠O1AO2 = 54.7356°, becomes θ′; and the length of the complex changes from
its equilibrium length, x0, to X, by an extension of x.
[Figure 5.2 diagram: cubic unit cell with axes x, y and z; points labeled A to J, O1 to O4
and N1 to N4.]
Figure 5.2: A unit cell as defined in the tetrahedron model. Within each unit cell, there
are 4 tetrahedra, the centers of which are labeled O1 to O4. Please refer to the text for
detailed description.
The logic is first to derive F as a function of x, F(x), and then to calculate the derivative
of F with respect to x, dF/dx, which is defined as the modulus of the complex (kc). Since
\[ X = x_0 + x, \tag{5.8} \]
it follows that
\[ \frac{dF}{dx} = \frac{dF}{dX} \cdot \frac{dX}{dx} = \frac{dF}{dX} \cdot \frac{d(x_0 + x)}{dx} = \frac{dF}{dX}. \tag{5.9} \]
We prefer to calculate dF/dX instead because it is easier to obtain. According to Figure
5.1,
\[ F = 2k (d - d_0) \cos\theta', \tag{5.10} \]
where k is the modulus of a single spring, d0 and d are the lengths of the spring in its
relaxed and extended states, and the factor of 2 accounts for the two springs. On
the right-hand side (RHS), k and d0 are constants, so we express the variables d
and cos θ′ in terms of X. Given that
\[ d = \sqrt{X^2 + s^2}, \tag{5.11} \]
\[ \cos\theta' = \frac{X}{\sqrt{X^2 + s^2}}, \tag{5.12} \]
Equation (5.10) can be rewritten as
\[ F = 2k \left( \sqrt{X^2 + s^2} - d_0 \right) \cdot \frac{X}{\sqrt{X^2 + s^2}} = 2kX \left[ 1 - d_0 (X^2 + s^2)^{-1/2} \right]. \tag{5.13} \]
In order to replace s with X as well, we use the relationship between the two which
results from the constraint that the material’s volume is constant during its extension.
At the macroscopic level, in order to conserve the volume of the material, if its length
increases by a ratio of r while its width and height shrink by a ratio of r′, the following
relationship must hold,
\[ (1 + r) l_0 \cdot (1 - r') w_0 \cdot (1 - r') h_0 = l_0 \cdot w_0 \cdot h_0, \tag{5.14} \]
where l0, w0, h0 are the material’s length, width and height in its relaxed state. Solving
Equation (5.14) gives
\[ r' = 1 - \frac{1}{\sqrt{1 + r}}. \tag{5.15} \]
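Equation (5.15) can be checked numerically by substituting it back into Equation (5.14). The sketch below does this for a few extension ratios; the function names are illustrative.

```python
import math

def lateral_shrink_ratio(r):
    """Equation (5.15): shrink ratio r' of width and height for a length increase r."""
    return 1.0 - 1.0 / math.sqrt(1.0 + r)

def relative_volume(r):
    """Left-hand side of Equation (5.14) divided by l0 * w0 * h0; equals 1 if volume is conserved."""
    r_prime = lateral_shrink_ratio(r)
    return (1.0 + r) * (1.0 - r_prime) ** 2

# Volume ratio for a range of extensions, from relaxed (r = 0) to fourfold length (r = 3)
VOLUME_RATIOS = [relative_volume(r) for r in (0.0, 0.1, 0.5, 1.0, 3.0)]
```

For example, stretching the material to four times its relaxed length (r = 3) halves its width and height (r′ = 0.5).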
At the microscopic level, the unit cell in Figure 5.2 is considered to consist of 8 spring
complexes of the kind shown in Figure 5.1, arranged as a 2 × 4 matrix, i.e. a parallel
arrangement of 2 groups of 4 in-series complexes. Therefore, the following relationships
hold,
\[ s = (1 - r') s_0, \tag{5.16} \]
\[ X = (1 + r) x_0. \tag{5.17} \]
With θ ≈ 54.7356°,
\[ s_0 = d_0 \cdot \sin\theta = \frac{\sqrt{6}}{3} d_0, \tag{5.18} \]
\[ x_0 = d_0 \cdot \cos\theta = \frac{\sqrt{3}}{3} d_0, \tag{5.19} \]
so that
\[ s = (1 - r') \frac{\sqrt{6}}{3} d_0 = \frac{\sqrt{6}}{3} \frac{1}{\sqrt{1 + r}} d_0, \tag{5.20} \]
\[ X = \frac{\sqrt{3}}{3} (1 + r) d_0. \tag{5.21} \]
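Then,
\[ s^2 = \frac{2}{3} \frac{1}{1 + r} d_0^2 = \frac{2}{3} \left( \frac{\sqrt{3}}{3} \frac{d_0}{X} \right) d_0^2 = \frac{2\sqrt{3}}{9} \frac{d_0^3}{X}. \tag{5.22} \]
Substituting s² in Equation (5.13) with (5.22),
\[ F = 2kX \cdot \left[ 1 - d_0 \left( X^2 + \frac{2\sqrt{3}}{9} \frac{d_0^3}{X} \right)^{-1/2} \right]. \tag{5.23} \]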
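Therefore, the modulus of the spring complex, i.e. the derivative of F with respect to X,
is
\[
\begin{aligned}
k_c = \frac{dF}{dX}
 &= 2k\left[1 - d_0\left(X^2 + \frac{2\sqrt{3}}{9}\,\frac{d_0^3}{X}\right)^{-\frac{1}{2}}\right]
  + kX d_0\left(X^2 + \frac{2\sqrt{3}}{9}\,\frac{d_0^3}{X}\right)^{-\frac{3}{2}}
    \left(2X - \frac{2\sqrt{3}}{9}\,\frac{d_0^3}{X^2}\right) \\
 &= 2k\left[1 - d_0\left(X^2 + \frac{C}{X}\right)^{-\frac{1}{2}}\right]
  + kX d_0\left(X^2 + \frac{C}{X}\right)^{-\frac{3}{2}}\left(2X - \frac{C}{X^2}\right) \\
 &= k\left[2 - d_0\,\frac{2X^2 + \frac{2C}{X} - 2X^2 + \frac{C}{X}}
      {\left(X^2 + \frac{C}{X}\right)\sqrt{X^2 + \frac{C}{X}}}\right] \\
 &= k\left[2 - d_0\cdot\frac{3C}{X}\cdot\left(X^2 + \frac{C}{X}\right)^{-\frac{3}{2}}\right],
\end{aligned}
\tag{5.24}
\]
where
\[
C = \frac{2\sqrt{3}\,d_0^3}{9}. \tag{5.25}
\]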
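The closed form in Equation (5.24) can be checked against a numerical derivative of Equation (5.23). The sketch below does this with a central difference; the function names are hypothetical, and the values of k and d0 are the (GVPGV)7-in-water fit reported in Figure 5.3, used here purely for illustration.

```python
import math

def c_constant(d0):
    """C from Equation (5.25)."""
    return 2.0 * math.sqrt(3.0) * d0 ** 3 / 9.0

def complex_force(X, k, d0):
    """F(X) of the spring complex, Equation (5.23)."""
    u = X ** 2 + c_constant(d0) / X
    return 2.0 * k * X * (1.0 - d0 * u ** -0.5)

def complex_modulus(X, k, d0):
    """k_c = dF/dX in closed form, last line of Equation (5.24)."""
    C = c_constant(d0)
    u = X ** 2 + C / X
    return k * (2.0 - d0 * (3.0 * C / X) * u ** -1.5)

def central_difference(f, x, h=1e-6):
    """Numerical derivative used as an independent check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

k, d0 = 8.7, 1.3              # pN/nm and nm, illustrative values from Figure 5.3
x0 = d0 / math.sqrt(3.0)      # relaxed projected length, Equation (5.19)
for X in (x0, 1.2 * x0, 2.0 * x0):
    numeric = central_difference(lambda y: complex_force(y, k, d0), X)
    print(f"X = {X:.3f} nm  analytic = {complex_modulus(X, k, d0):.4f}  numeric = {numeric:.4f}")
```

Both the force and the modulus vanish at the relaxed length X = x0, which is consistent with the Young's modulus of the model starting from zero at zero strain.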
With kc obtained, next we calculate the modulus of a unit cell, ku. The topology of a
unit cell, as shown in Figure 5.2, is similar to that of diamond except that the carbon
atoms and C-C bonds in the latter are replaced with crosslinks and HP domains. Each
cubic unit cell consists of 4 tetrahedra, which are made up of 16 HP domains and 18
crosslinks. Of the 18 crosslinks, the 8 at the corners are each shared by 8 neighbouring unit
cells, the 6 at the centres of the faces are each shared by 2 neighbouring unit cells, and the
remaining 4 lie exclusively inside a single unit cell, so there are only 8 effective crosslinks
per unit cell (8 × 1/8 + 6 × 1/2 + 4). A unit cell is a highly symmetrical structure in which each HP
domain forms an angle of α = (180° − θtetra)/2 = 35.2644° with each surface. Given that
the equilibrium length of an HP domain is d0, the length of a unit cell is
\[
l_u = 2\,(2\cos\alpha \cdot d_0 \cdot \cos 45^\circ)
    = 4\cdot\frac{\sqrt{6}}{3}\cdot d_0\cdot\frac{\sqrt{2}}{2}
    = \frac{4}{\sqrt{3}}\,d_0. \tag{5.26}
\]
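The counting argument and Equation (5.26) can be reproduced in a few lines. This is a minimal sketch; the site shares (1/8, 1/2, 1) follow the text, while the variable and function names are illustrative assumptions.

```python
import math

# Crosslink sites of the diamond-like cubic unit cell: (number of sites,
# fraction of each site that belongs to one cell).
sites = {"corner": (8, 1 / 8), "face": (6, 1 / 2), "interior": (4, 1.0)}
total_sites = sum(n for n, _ in sites.values())
effective_crosslinks = sum(n * share for n, share in sites.values())

theta_tetra = math.degrees(math.acos(-1.0 / 3.0))   # tetrahedral angle, about 109.47 degrees
alpha = (180.0 - theta_tetra) / 2.0                  # angle between an HP domain and a cell face

def unit_cell_length(d0):
    """Edge length of the cubic unit cell, Equation (5.26)."""
    return 2.0 * (2.0 * math.cos(math.radians(alpha)) * d0 * math.cos(math.radians(45.0)))

print(total_sites, effective_crosslinks)
print(f"alpha = {alpha:.4f} degrees, l_u / d0 = {unit_cell_length(1.0):.4f}")
```

The result l_u = 4 d0 / √3 is the familiar relation between the bond length and the lattice constant of the diamond cubic structure.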
As mentioned above, a spring complex in Figure 5.1 is considered as the basic structure
for a unit cell as in Figure 5.2. The 16 HP domains in a unit cell can be grouped into 2 × 4
spring complexes, consisting of a parallel arrangement of 2 groups of 4 in-series complexes.
Given that two springs of modulus k have an overall modulus of 2k if connected in
parallel and k/2 if connected in series, the modulus of a single unit cell is
\[
k_u = (2 \cdot k_c)/4 = \frac{1}{2}\,k_c. \tag{5.27}
\]
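The series and parallel rules used for Equation (5.27) can be written as two small helper functions. The sketch below is illustrative only; the helper names are not taken from the original work.

```python
def in_parallel(*moduli):
    """Springs in parallel: the moduli add."""
    return sum(moduli)

def in_series(*moduli):
    """Springs in series: the reciprocals of the moduli add."""
    return 1.0 / sum(1.0 / m for m in moduli)

def unit_cell_modulus(kc):
    """Two parallel chains, each made of four spring complexes in series."""
    chain = in_series(kc, kc, kc, kc)
    return in_parallel(chain, chain)

print(unit_cell_modulus(3.0))   # half of k_c, as in Equation (5.27)
```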
Now that we know ku, we are ready to calculate the Young’s modulus of a piece of
macroscopic material (KY). Let the number of unit cells along the x, y, z axes of the
material be nu,x, nu,y, nu,z; then
\[
n_{u,x} = w_0 / l_u, \tag{5.28}
\]
\[
n_{u,y} = l_0 / l_u, \tag{5.29}
\]
\[
n_{u,z} = h_0 / l_u. \tag{5.30}
\]
Let the pulling force be along the y direction; then nu,y unit cells are in series along each
column, while nu,x × nu,z such columns act in parallel. Hence the modulus of the material is
\[
K = \frac{k_u}{n_{u,y}}\cdot n_{u,x} \times n_{u,z}
  = \frac{k_u\,w_0 h_0}{l_u \cdot l_0}
  = \frac{\sqrt{3}}{8}\,\frac{k_c}{d_0}\,\frac{w_0 h_0}{l_0}, \tag{5.31}
\]
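and its Young’s modulus is
\[
K_Y = \frac{\text{stress}}{\text{strain}}
    = \frac{K\Delta l / (w_0 h_0)}{\Delta l / l_0}
    = \frac{K l_0}{w_0 h_0}
    = \frac{k_u}{l_u}
    = \frac{\sqrt{3}}{8}\,\frac{k_c}{d_0}, \tag{5.32}
\]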
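where ∆l is the extension of the material. Substituting kc with (5.24),
\[
K_Y = \frac{\sqrt{3}}{8}\,\frac{k}{d_0}\left[2 - d_0\cdot\frac{3C}{X}\cdot\left(X^2 + \frac{C}{X}\right)^{-\frac{3}{2}}\right]
    = \frac{\sqrt{3}}{4}\,\frac{k}{d_0}\left[1 - \frac{\sqrt{3}}{3}\,\frac{d_0^4}{X}
      \left(X^2 + \frac{2\sqrt{3}}{9}\,\frac{d_0^3}{X}\right)^{-\frac{3}{2}}\right]. \tag{5.33}
\]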
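Equations (5.28) to (5.32) imply that the Young's modulus does not depend on the dimensions of the sample, only on kc and d0. A minimal sketch of that bookkeeping is given below; the function names and the sample dimensions are arbitrary assumptions chosen for illustration.

```python
import math

def material_modulus(kc, d0, w0, h0, l0):
    """Spring constant K of the whole sample, Equation (5.31)."""
    lu = 4.0 * d0 / math.sqrt(3.0)           # unit cell length, Equation (5.26)
    ku = kc / 2.0                            # unit cell modulus, Equation (5.27)
    n_x, n_y, n_z = w0 / lu, l0 / lu, h0 / lu
    return ku / n_y * n_x * n_z

def youngs_modulus_from_K(kc, d0, w0, h0, l0):
    """Young's modulus obtained from K, Equation (5.32)."""
    return material_modulus(kc, d0, w0, h0, l0) * l0 / (w0 * h0)

# Two samples with very different shapes give the same Young's modulus.
print(youngs_modulus_from_K(5.0, 1.3, 10.0, 20.0, 30.0))
print(youngs_modulus_from_K(5.0, 1.3, 100.0, 50.0, 7.0))
```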
However, the above equation is not very convenient for comparing our results with
experimental ones; KY as a function of strain is preferred. The strain equals the
ratio of increase in length, r, introduced above, namely
\[
\text{strain} = r = \frac{\Delta l}{l_0}. \tag{5.34}
\]
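Substituting X in (5.33) with (5.21) yields
\[
\begin{aligned}
K_Y &= \frac{\sqrt{3}}{4}\,\frac{k}{d_0}\left(1 - \frac{d_0^3}{1 + r}
       \left[\frac{d_0^2}{3}\,(1 + r)^2 + \frac{2}{3}\,\frac{d_0^2}{1 + r}\right]^{-\frac{3}{2}}\right) \\
    &= \frac{\sqrt{3}}{4}\,\frac{k}{d_0}\left(1 - \frac{1}{1 + r}
       \left[\frac{(1 + r)^2}{3} + \frac{2}{3(1 + r)}\right]^{-\frac{3}{2}}\right).
\end{aligned}
\tag{5.35}
\]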
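Equation (5.35) is straightforward to evaluate. The sketch below is a minimal illustration, with hypothetical function names, that uses the k and d0 values reported in Figure 5.3. Since 1 pN/nm² equals 1 MPa, no further unit conversion is needed.

```python
import math

def youngs_modulus(r, k, d0):
    """K_Y at strain r from Equation (5.35).

    With k in pN/nm and d0 in nm the result is in pN/nm^2, which equals MPa.
    """
    t = 1.0 + r
    bracket = t ** 2 / 3.0 + 2.0 / (3.0 * t)
    return math.sqrt(3.0) / 4.0 * k / d0 * (1.0 - bracket ** -1.5 / t)

def asymptotic_modulus(k, d0):
    """Large-strain limit of Equation (5.35)."""
    return math.sqrt(3.0) / 4.0 * k / d0

# Fit parameters (k in pN/nm, d0 in nm) as reported in Figure 5.3.
systems = {
    "(GVPGV)7 in water": (8.7, 1.3),
    "(GVPGV)7 in methanol": (2.1, 1.8),
    "(PGV)12 in water": (6.4, 1.5),
    "(PGV)12 in methanol": (2.7, 1.9),
}
for name, (k, d0) in systems.items():
    print(f"{name}: K_Y(0.8) = {youngs_modulus(0.8, k, d0):.2f} MPa, "
          f"limit = {asymptotic_modulus(k, d0):.2f} MPa")
```

For (GVPGV)7 in water this gives about 2.0 MPa at a strain of 0.8 and a large-strain limit of about 2.9 MPa. The modulus is zero at zero strain, so the model predicts a stress-strain curve that starts flat.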
Equation (5.35) shows that KY changes with strain, with k/d0 as part of the prefactor,
which is determined by the inherent properties of the sequence the material is made of.
Integrating KY with respect to r gives a stress-strain curve. In experiments, by contrast,
it is usually the stress-strain curve that is measured first, and the Young’s
modulus is then obtained by fitting a straight line tangent to the seemingly linear region of
the curve. Therefore, the stress-strain curves from simulations and experiments can be
compared as well to test the quality of this model. Setting stress = 0 at r = 0, the integral of
Equation (5.35) turns out to be
\[
\text{stress} = \frac{\sqrt{3}}{4}\,\frac{k}{d_0}
  \left(r - \sqrt{3}\,(1 + r)\sqrt{\frac{r + 1}{(r + 1)^3 + 2}} + 1\right). \tag{5.36}
\]
In experiments, the material’s cross-sectional area can sometimes be difficult to measure,
so m/l0 may be used instead, where m is the mass of the material and l0 is its original length.
Then the definition of stress in experiments becomes
\[
\text{stress}_{\text{exp}} = \frac{F}{m / l_0} = \frac{K\Delta l}{m / l_0}. \tag{5.37}
\]
Let the density of the material be ρ, where
\[
\rho = \frac{m}{V} = \frac{m}{l_0 w_0 h_0}. \tag{5.38}
\]
Substituting Equation (5.38) into (5.37), the relationship between stressexp and stress
is
\[
\text{stress}_{\text{exp}} = \frac{K\Delta l}{\rho \cdot w_0 h_0} = \frac{\text{stress}}{\rho}. \tag{5.39}
\]
Therefore, depending on the case, either Equation (5.36) or the following equation
can be used to compare the results of simulations and experiments:
\[
\text{stress}_{\text{exp}} = \frac{\sqrt{3}}{4\rho}\,\frac{k}{d_0}
  \left(r - \sqrt{3}\,(1 + r)\sqrt{\frac{r + 1}{(r + 1)^3 + 2}} + 1\right). \tag{5.40}
\]
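The stress expressions can be checked numerically: the derivative of Equation (5.36) with respect to r should reproduce Equation (5.35), and dividing by the density gives Equation (5.40). The sketch below assumes k in pN/nm, d0 in nm and ρ in g/mm³, so that the stress is in MPa (N/mm²) and the mass-normalized stress is in N/(g/mm). The function names are illustrative.

```python
import math

def prefactor(k, d0):
    """Common prefactor sqrt(3)/4 * k/d0, in pN/nm^2 (numerically equal to MPa)."""
    return math.sqrt(3.0) / 4.0 * k / d0

def youngs_modulus(r, k, d0):
    """Equation (5.35)."""
    t = 1.0 + r
    return prefactor(k, d0) * (1.0 - (t * t / 3.0 + 2.0 / (3.0 * t)) ** -1.5 / t)

def stress(r, k, d0):
    """Equation (5.36), the integral of K_Y with stress(0) = 0."""
    t = 1.0 + r
    return prefactor(k, d0) * (r - math.sqrt(3.0) * t * math.sqrt(t / (t ** 3 + 2.0)) + 1.0)

def stress_exp(r, k, d0, rho):
    """Equation (5.40): stress divided by the density rho."""
    return stress(r, k, d0) / rho

k, d0, rho = 8.7, 1.3, 1.3e-3    # (GVPGV)7 in water; density as quoted for Figure 5.5
for r in (0.2, 0.5, 1.0):
    print(f"r = {r:.1f}  stress = {stress(r, k, d0):.3f} MPa  "
          f"stress_exp = {stress_exp(r, k, d0, rho):.0f} N/(g/mm)")
```

At a strain of 1 the mass-normalized stress for (GVPGV)7 in water comes out at roughly 1000 N/(g/mm), the same order of magnitude as quoted in the text for the modeled curves.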
5.3 Results
5.3.1 Modulus of Peptide Monomers
Figure 5.3 shows the result of fitting a parabola to the PMF of peptide end-to-end
distance for (GVPGV)7 and (PGV)12 successively in water and in methanol. It shows
that a monomer’s modulus k is lower and its equilibrium length d0 is larger in methanol
than in water. A lower k is reflected by a broader parabola while a larger d0 is seen
from the right shift of the parabola. Therefore, the Young’s modulus must be lower in
methanol than in water for both ELPs according to (5.35) in the tetrahedron model.
Comparing between sequences, the modulus of (PGV)12 in water is about 26 % lower
than that of (GVPGV)7, which is significant based on the error bars. However, in methanol, the
moduli of the two ELPs are not significantly different since their error bars overlap. Of the
four systems, the fitting quality is the best for (GVPGV)7 in water (r2 = 0.97) and the
poorest for (PGV)12 in water (r2 = 0.88).
5.3.2 Young’s Modulus
Figure 5.4 shows the Young’s modulus (KY) as a function of strain according to Equation
(5.35). The comparison of the modeled moduli with experimental measurements is
summarized in Table 5.1. Figure 5.4 shows that as strain increases, the modulus converges
to 2.9 MPa in water and 0.51 MPa in methanol for (GVPGV)7, and 1.8 MPa in water
[Figure 5.3, four panels: PMF (kJ/mol) versus end-to-end distance d (nm), with fit parameters.
(GVPGV)7 in water: k = 8.7 ± 0.7 pN/nm, d0 = 1.3 ± 0.1 nm, r2 = 0.97.
(GVPGV)7 in methanol: k = 2.1 ± 0.3 pN/nm, d0 = 1.8 ± 0.1 nm, r2 = 0.89.
(PGV)12 in water: k = 6.4 ± 0.9 pN/nm, d0 = 1.5 ± 0.1 nm, r2 = 0.88.
(PGV)12 in methanol: k = 2.7 ± 0.4 pN/nm, d0 = 1.9 ± 0.1 nm, r2 = 0.90.]
Figure 5.3: PMF along the end-to-end distance obtained from MD simulations of the ELPs
(GVPGV)7 and (PGV)12. A parabolic fit of the form PMF = k · (d − d0)² + C, where
d0 is the equilibrium length of the peptide and C is a constant, is shown as a dashed line.
Mean values and SEM were obtained by partitioning the MD trajectories used to compute
the PMF, W(d), into four groups.
and 0.6 MPa in methanol for (PGV)12. The shaded areas indicate KY values obtained
when the strain is between 0 and 0.8, which covers the range where experimental values
are measured, as shown in Table 5.1. At a strain of 0.8, KY in methanol is only 15 % and 30 %
of that in water for (GVPGV)7 and (PGV)12, respectively. Although KY
continues to be lower in methanol than in water when the strain is larger than one, only KY
up to a strain of one is shown because this is the approximate extensibility of elastin-like
materials in experiments. Table 5.1 shows that KY at a strain of 0.8 from the
tetrahedron model tends to overestimate the experimental values, except
for (PGV)12 versus EP-20-244 (GP). Although the overestimation ranges from a factor of 1.1
(2/1.8) to 8 (2/0.25), considering the variance among experimental values and the simplicity of
this model, we think the results from the tetrahedron model are remarkably close to the
experimental measurements.
[Figure 5.4, two panels: KY (MPa) versus strain (0 to 1.0).
(GVPGV)7 in water (k = 8.7 ± 0.7 pN/nm; d0 = 1.3 ± 0.1 nm): KY = 2.0, asymptotic limit = 2.9.
(GVPGV)7 in methanol (k = 2.1 ± 0.3 pN/nm; d0 = 1.8 ± 0.1 nm): KY = 0.3, asymptotic limit = 0.5.
(PGV)12 in water (k = 6.4 ± 0.9 pN/nm; d0 = 1.5 ± 0.1 nm): KY = 1.3, asymptotic limit = 1.8.
(PGV)12 in methanol (k = 2.7 ± 0.4 pN/nm; d0 = 1.9 ± 0.1 nm): KY = 0.4, asymptotic limit = 0.6.]
Figure 5.4: Young’s modulus as a function of strain for (GVPGV)7 and (PGV)12. The
thick dashed lines indicate the converged values for KY in water and methanol, and the
shaded areas indicate KY values obtained in the strain range of 0–0.8.
Source | Peptide/Protein         | Strain    | Water (MPa) | Methanol (MPa)
Tetra. | (GVPGV)7                | 0 – 0.8   | 0 – 2.0     | 0 – 0.3
Tetra. | (PGV)12                 | 0 – 0.8   | 0 – 1.3     | 0 – 0.4
Tetra. | (GVPGV)7                | ∞         | 2.9 ± 0.1   | 0.5 ± 0.1
Tetra. | (PGV)12                 | ∞         | 1.9 ± 0.1   | 0.6 ± 0.1
Exp.   | EP-20-24-24 (PQQ) [10]  | 0.1 – 0.5 | 0.25 ± 0.10 | -
Exp.   | EP-20-244 (PQQ) [10]    | 0.4 – 0.8 | 0.25 ± 0.09 | -
Exp.   | EP-20-244 (PQQ) [120]   | 0 – 0.2   | 0.4         | -
Exp.   | EP-20-244 (GP) [120]    | 0 – 0.2   | 1.8         | -
Exp.   | aortic elastin [70, 10] | 0.2 – 0.6 | 0.8         | -
Table 5.1: Comparison of the Young’s moduli of the ELPs calculated using the tetrahedron
model (Tetra.) and those for other peptides/protein obtained in experiments (Exp.). ∞ means
that the strain is large enough to have converged KY. The values for peptides/protein
other than (GVPGV)7 and (PGV)12 are from experimental studies. Unit: MPa.
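The overestimation factors quoted in the text can be recomputed directly from Table 5.1. The sketch below takes the modeled KY in water at a strain of 0.8 and the experimental values from the table; the dictionary layout is an illustrative choice and not part of the original analysis.

```python
# Modeled K_Y in water at a strain of 0.8 (MPa), from Table 5.1.
model_at_0p8 = {"(GVPGV)7": 2.0, "(PGV)12": 1.3}

# Experimental K_Y in water (MPa), from Table 5.1.
experiments = {
    "EP-20-24-24 (PQQ) [10]": 0.25,
    "EP-20-244 (PQQ) [10]": 0.25,
    "EP-20-244 (PQQ) [120]": 0.4,
    "EP-20-244 (GP) [120]": 1.8,
    "aortic elastin [70, 10]": 0.8,
}

# Ratio of modeled to experimental modulus; values above 1 are overestimates.
ratios = {
    (peptide, sample): modeled / measured
    for peptide, modeled in model_at_0p8.items()
    for sample, measured in experiments.items()
}

for (peptide, sample), ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{peptide} vs {sample}: {ratio:.2f}")
```

For (GVPGV)7 the ratios span 1.1 to 8, and the only ratio below 1 is (PGV)12 against EP-20-244 (GP).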
5.3.3 Stress-strain Curve
Figure 5.5 shows the stress-strain curves constructed with the tetrahedron model as well
as one measured in experiments. Although the modeled curves tend to underestimate the
stress at a given strain compared to the experimental one, the two types
of curves are roughly on the same order of magnitude (hundreds of N/g/mm). The presence of
an inflection point in the experimental curve suggests partial breakage of the matrix in
the material. In experiments, KY is calculated by fitting a straight line to the stress-strain
curve where it looks most linear. However, according to our model, KY is still
increasing when the strain is less than 1, even though the curve may look linear. Therefore,
the model suggests that fitting a tangent line to the curve to obtain a single value for KY
can be an oversimplification. Since KY in methanol is much lower than in water, as shown in
Table 5.1, the stress-strain curve is unsurprisingly also considerably lower in methanol
than in water.
5.4 Discussion
We have shown that the modulus (k) and equilibrium length (d0) of a monomer can
be calculated by fitting a parabola to the system’s PMF along the peptide’s end-to-end
distance. Based on k and d0, we developed a mathematical model, the tetrahedron model,
to calculate the Young’s modulus (KY ) of macroscopic elastin-like material made of such
monomers. In this section, we present two comparative discussions of our results. First,
the results from MD simulations are compared to those from experiments. Second, the
results from MD simulations in water are compared to those in methanol.
[Figure 5.5, upper plot: stress (N/g/mm) versus strain (0 to 1.0) for (GVPGV)7 in water,
(GVPGV)7 in methanol, (PGV)12 in water and (PGV)12 in methanol.]
Figure 5.5: The upper plot shows the stress-strain curves constructed with Equation
(5.40) from the tetrahedron model for (GVPGV)7 and (PGV)12 successively in water
and in methanol, assuming a density of 1.3 × 10^-3 g·mm^-3 [10]. The lower plot is a
stress-strain curve of an ELP measured in experiments in 20% methanol solution (personal
communication with Fred Keeley).
5.4.1 Comparison between Experiments and Simulations
At the monomer level, k of the model ELPs is compared to that obtained for full-length
tropoelastin by Baldock et al. [7]. Using the worm-like-chain model as described in that
paper, the modulus of a single full-length tropoelastin molecule is calculated to be about 0–9
pN/nm at an end-to-end distance of 0–140 nm. Therefore, the modulus of a monomer as
shown in Figure 5.3, 8.7 ± 0.7 pN/nm for (GVPGV)7 or 6.4 ± 0.9 pN/nm for (PGV)12,
is on the same order of magnitude, which is notable considering the difference in length
(35 vs. 786 residues) and sequence composition between the full tropoelastin [52] and
the model ELPs.
At the macroscopic material level, the values of KY for both (GVPGV)7 and (PGV)12
calculated with the tetrahedron model are remarkably close to the experimental results
shown in Table 5.1, overestimating them by a factor of between 1.1 and 8. The overestimation
by the tetrahedron model can be rationalized as follows. First, both (GVPGV)7
and (PGV)12 are quite different from the HP domains used in the experiments [120, 10],
and native elastin [52] consists of many different types of HP domains. Second, the
finite length of the XL domains has been completely ignored, and in the model their function
is limited to providing linkage between HP domains. Third, the crosslinking
efficiency of XL domains is assumed to be 100% in the model, but it is probably lower
in experiments, which would reduce KY. Fourth, in a phase-separated aggregate of
self-assembled elastin, the modulus of an HP domain should be lower because of the
reduced solvophobic effect and maximized chain entropy [97] (see the next subsection for
a more detailed discussion of the influence of the solvophobic effect and chain entropy on
the modulus), which would also reduce the final KY according to Equation (5.35). All
of these differences are likely to affect the resulting Young’s modulus, but their effects are
difficult to assess quantitatively.
Overall, the consistency between k and KY in simulations and in experiments suggests
that the tetrahedron model is a reasonable way to model the Young’s modulus of elastin-like
materials using data from MD simulations. The reason for using a tetrahedron as the
most basic unit is that it is the most symmetrical structure in which each node is connected
to 4 edges. Another model, the cubic model, was tried before the tetrahedron model
was developed. The cubic model is not as good as the tetrahedron one because, first, it is
not as symmetrical; second, it cannot be used to construct a stress-strain curve because
its resultant Young’s modulus does not change with strain; and third, it overestimates KY to
an even larger extent.
5.4.2 Comparison between Results in Water and in Methanol
The modeled Young’s modulus (KY ) is higher in water than in methanol. According
to Equation (5.35), this is a result of a higher modulus (k) and a smaller equilibrium
length (d0) of a monomer in water than in methanol. Therefore, the comparison only
needs to focus on the monomer level. Why is k higher and d0 lower in water? To
answer this question in detail, we attempt the following
derivation.
We make two assumptions. First, since the recoiling force is mainly entropic for
native elastin [82, 49] and tropoelastin [7] in water, it is assumed to be entropic as well
for ELPs both in water and in methanol. Second, in the most stretched state, the chain
entropy is assumed to be zero in both water and methanol regardless of sidechain entropy.
We start by deriving a relationship between the change of system free energy (∆G), the
modulus (k) and the change of end-to-end distance of the peptide (∆d). Since the
recoiling process is entropic, ∆G is approximated by its entropic term, −T∆S, where ∆S
is the change of system entropy. As the material recoils, the recoiling force (f) does
positive work, ∆d is below 0, and
∆S is above 0, which means a gain of system entropy. After the relationship is derived,
∆S is decomposed into solvent and solute parts, which are compared between solvents
so as to understand how they modulate the modulus of the peptide.
Based on its thermodynamic origin, the recoiling force can be decomposed into
\[
f = f_e + f_s = \left(\frac{\partial H}{\partial d}\right)_{p,T} - T\left(\frac{\partial S}{\partial d}\right)_{p,T}, \tag{5.41}
\]
where f is the recoiling force, fs and fe are the entropic and enthalpic contributions, and
H, S, d, T and p are the system enthalpy, system entropy, peptide end-to-end distance,
temperature and pressure, respectively. Since the recoiling force is assumed to be entropic,
f is approximated by
\[
f \approx f_s = -T\left(\frac{\partial S}{\partial d}\right)_{p,T}. \tag{5.42}
\]
The above equation shows how f is determined by the change of system entropy ∆S
with respect to a change in the peptide’s end-to-end distance (∆d) between the relaxed
and stretched states. To relate f to k,
\[
\Delta G = -T\Delta S = \int_{d_0}^{d} f_s\,dx = \int_{d_0}^{d} -k\,(x - d_0)\,dx
         = -\frac{1}{2}\,k\,(d - d_0)^2 = -\frac{1}{2}\,k\,(\Delta d)^2. \tag{5.43}
\]
Therefore,
\[
\Delta S = \frac{k\,(\Delta d)^2}{2T}, \tag{5.44}
\]
which means that if ∆d and T are fixed, k is proportional to ∆S. Therefore, the more
significant the gain of system entropy, the higher the modulus. Since k is lower in
methanol than in water, ∆S should also be lower. However, Equation (5.44) cannot
explain why ∆S is lower. For a system of a hydrophobic polymer such as (GVPGV)7 or
(PGV)12 in a polar solvent such as water, the gain of system entropy upon the
peptide’s recoil is twofold. First, the chain entropy increases but only to a certain point,
after which it decreases instead. Second, the solvent entropy also increases because the
solvophobic effect is mainly entropic at room temperature [27]. Therefore, ∆S can be
decomposed into
\[
\Delta S = S^{d_0} - S^{d} = \left(S_u^{d_0} - S_u^{d}\right) + \left(S_v^{d_0} - S_v^{d}\right)
         = \Delta S_u + \Delta S_v, \tag{5.45}
\]
where subscripts u and v mean solute (i.e. the peptide) and solvent (i.e. water or
methanol). To distinguish different solvents, Equation (5.45) is rewritten as
\[
\Delta S^{w} = \Delta S_u^{w} + \Delta S_v^{w} \tag{5.46}
\]
in water, and as
\[
\Delta S^{m} = \Delta S_u^{m} + \Delta S_v^{m} \tag{5.47}
\]
in methanol, where superscripts w and m mean in water and in methanol, respectively.
Since ∆S should be higher in water than in methanol as indicated by Equation (5.44),
\[
\Delta\Delta S = \Delta S^{w} - \Delta S^{m} > 0. \tag{5.48}
\]
The above equation can be expanded by substituting Equations (5.46) and (5.47) and
rearranged to
\[
\Delta S_u^{m} - \Delta S_u^{w} < \Delta S_v^{w} - \Delta S_v^{m}, \tag{5.49}
\]
which is the first key inequality. Because the chain entropy in the most stretched state
is assumed to be zero, and the chain entropy is believed to be higher in methanol than
in water because of its broader distribution of Rg,
\[
\Delta S_u^{w} - \Delta S_u^{m} = S_u^{w} - S_u^{m} < 0, \tag{5.50}
\]
which is the second key inequality. Substituting Equation (5.50) into (5.49),
\[
\Delta S_v^{w} - \Delta S_v^{m} > 0, \tag{5.51}
\]
which is the third key inequality. For convenience, the three key inequalities obtained so
far are put together,
\[
\Delta S_u^{m} - \Delta S_u^{w} < \Delta S_v^{w} - \Delta S_v^{m}, \tag{5.49}
\]
\[
\Delta S_u^{w} - \Delta S_u^{m} < 0, \tag{5.50}
\]
\[
\Delta S_v^{w} - \Delta S_v^{m} > 0, \tag{5.51}
\]
which are interpreted as follows. Going from water to methanol, the recoiling process
shows both an increase in the gain of chain entropy (Equation (5.50))
and a decrease in the gain of solvent entropy (Equation (5.51)), which partially offset
each other in modulating the modulus of an ELP according to Equation (5.44). As a
result, the modulus turns out to be lower in methanol than in water, which means that the
increase cannot compensate for the decrease (Equation (5.49)). Therefore, the lower
modulus in methanol must be caused by the decrease in the gain of solvent entropy, which
is equivalent to saying that the solvophobic effect is reduced in methanol. Despite an
increase in the gain of chain entropy, the reduced solvophobic effect results in a broader
distribution of Rg and a larger equilibrium length.
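To get a feel for the magnitudes implied by Equation (5.44), the sketch below converts the fitted moduli of (GVPGV)7 from Figure 5.3 into a molar entropy gain. The extension ∆d = 1 nm is an arbitrary illustrative value, not a quantity reported in this chapter, and the function name is hypothetical.

```python
N_A = 6.02214076e23   # Avogadro constant, 1/mol

def entropy_gain(k, delta_d, temperature=300.0):
    """Molar entropy gain in J/(mol K) from Equation (5.44).

    k is in pN/nm and delta_d in nm; both are converted to SI units first.
    """
    k_si = k * 1e-3                 # 1 pN/nm = 1e-3 N/m
    delta_d_si = delta_d * 1e-9     # nm to m
    return k_si * delta_d_si ** 2 / (2.0 * temperature) * N_A

delta_d = 1.0   # nm, an assumed extension used only for illustration
ds_water = entropy_gain(8.7, delta_d)       # (GVPGV)7 in water
ds_methanol = entropy_gain(2.1, delta_d)    # (GVPGV)7 in methanol
print(f"water: {ds_water:.2f} J/(mol K), methanol: {ds_methanol:.2f} J/(mol K)")
```

Because ∆S is proportional to k at fixed ∆d and T, the ratio of the two entropy gains equals the ratio of the moduli, which is the starting point of the inequality (5.48).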
5.5 Conclusion
Our results show that our estimates of k are commensurate with experimental values
obtained for full-length tropoelastin [7]. In addition, a mathematical model named the
tetrahedron model was constructed to model KY and, remarkably, it is in very good
agreement with experimental measurements on self-assembled ELPs. Based on the
tetrahedron model, stress-strain curves can also be modeled, and they turn out to have a
shape similar to that of the experimental curve. The fact that the values of KY predicted
by our model are close to experimental measurements despite the neglect of XL domains
is a strong argument for HP domains being mainly responsible for generating elasticity
in elastin-like materials.
The same approach was applied successively to ELPs in water and in methanol. The
results suggest that methanol decreases k of a monomer as well as KY of the
elastin-like material, which suggests that the hydrophobic effect plays an important role
in generating elasticity in elastin, in support of the two-phase model [92, 129, 44, 66].
5.6 Material & Methods
The simulation data used to compute the elastic moduli of (GVPGV)7 and (PGV)12 were
obtained from the simulations described in Chapter 3, with the production time for (PGV)12
in water and in methanol extended to 500 ns per replica, the same as that of (GVPGV)7.
The end-to-end distance was calculated between the C atom of the C-terminal acetyl
group and the N atom of the N-terminal NH2 group. To fit parabolas to the end-to-
end distance distributions, all replicas were used for (GVPGV)7, but 3 and 4 replicas
were removed for (PGV)12 in water and methanol respectively in order to improve the
fitting quality. Total sampling times of 14 µs, 14 µs, 12.95 µs, and 12.6 µs were used
for (GVPGV)7 in water, (GVPGV)7 in methanol, (PGV)12 in water and (PGV)12 in
methanol, respectively. In the removed replicas, the peptides were trapped in a region of
small end-to-end distances which would have produced an abrupt peak in the resultant
PMFs. As for the cutoff of the PMF during the fitting process, a few values were tried
ranging from 1–4 RT , and the one that produced the highest fit quality was selected,
which is 2 RT (i.e. 5.98 kJ/mol, R: gas constant, T : temperature (300 K)), though the
modulus and equilibrium length produced with different cutoffs were not significantly
different from each other.
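The fitting procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the script used for the thesis: the PMF is taken as W(d) = −RT ln P(d) from a histogram of end-to-end distances, the parabola follows the form given in the caption of Figure 5.3, PMF = k · (d − d0)² + C, and the least-squares problem is solved with Cramer's rule to avoid external dependencies. All function names are hypothetical.

```python
import math

R_KJ = 8.314462618e-3    # gas constant in kJ/(mol K)
TO_PN_PER_NM = 1.66054   # 1 kJ/(mol nm^2) is about 1.66054 pN/nm

def pmf_from_counts(counts, temperature=300.0):
    """PMF in kJ/mol from histogram counts, W = -RT ln P, shifted so min(W) = 0."""
    total = float(sum(counts))
    pmf = [-R_KJ * temperature * math.log(c / total) if c > 0 else math.inf
           for c in counts]
    lowest = min(pmf)
    return [w - lowest for w in pmf]

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_parabola(d, w, cutoff):
    """Fit w = k (d - d0)^2 + c to the points with w <= cutoff.

    Returns (k, d0, c), with k in the units of w per squared unit of d.
    """
    pts = [(x, y) for x, y in zip(d, w) if y <= cutoff]
    s = [sum(x ** n for x, _ in pts) for n in range(5)]       # sums of d^n
    t = [sum(y * x ** n for x, y in pts) for n in range(3)]   # sums of w d^n
    a_mat = [[s[4], s[3], s[2]], [s[3], s[2], s[1]], [s[2], s[1], s[0]]]
    rhs = [t[2], t[1], t[0]]
    base = det3(a_mat)
    coeffs = []
    for col in range(3):                                      # Cramer's rule
        m = [row[:] for row in a_mat]
        for i in range(3):
            m[i][col] = rhs[i]
        coeffs.append(det3(m) / base)
    a, b, c = coeffs                                          # w = a d^2 + b d + c
    return a, -b / (2.0 * a), c - b * b / (4.0 * a)

# Synthetic example: an exact parabola is recovered by the fit.
d = [0.5 + 0.1 * i for i in range(21)]
w = [3.0 * (x - 1.5) ** 2 + 0.2 for x in d]
k_fit, d0_fit, c_fit = fit_parabola(d, w, cutoff=10.0)
print(f"k = {k_fit * TO_PN_PER_NM:.2f} pN/nm, d0 = {d0_fit:.2f} nm")
```

The `cutoff` argument plays the role of the PMF cutoff discussed above: only the bottom of the well enters the fit, so a poorly sampled high-energy region does not distort k and d0.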
Chapter 6
Summary & Future Directions
6.1 Summary
Below is a compiled list of the work presented in this thesis.
• In Chapter 3, we investigated the self-aggregation propensities of a set of 6 model
peptides, (GVPGV)7, (PGV)12, (GGVGV)7, (GVGVA)7, (GV)18 and G35, in both
water and methanol by analyzing their conformational properties as monomers. We
found that ELPs swell in methanol and we also concluded that it is the reduction
of the solvophobic effect that prevents the ALPs from forming amyloid-like fibrils
in methanol.
• In Chapter 4, we studied the solvent qualities of water, a set of primary alcohols
from methanol to 1-octanol, and octane on the 6 model peptides. We found that
water and octane, which represent the polar and nonpolar extremes of the solvent
set, are the poorest solvents. In between, as the alkyl chain of the alcohol becomes
longer (i.e. the polarity of the solvent decreases), the solvent quality increases up to
heptanol as indicated by the peptide’s Rg, but none of the alcohols studied is a θ- or
good solvent. We postulate that the uneven distribution of the polar and nonpolar
groups in a peptide is a major factor that prevents a solvent of similar
hydrophobicity/nonpolarity to the peptide from being its θ-solvent.
Due to improper preparation of the initial conformations, other structural properties
such as the secondary structure content were unreliable. This issue was
thoroughly discussed in Section 4.3.
• In Chapter 5, we derived the modulus (k) and equilibrium length (d0) of ELPs
from the PMF along the end-to-end distance. Based on k and d0, we developed the
tetrahedron model to calculate the Young’s modulus (KY) of elastin-like materials.
The results are commensurate with experimental measurements. Applying this
approach to simulations in different solvents shows that a monomer has a lower
k and a larger d0 in methanol than in water. As a result, the corresponding Young’s
modulus of a material made of such a monomer is also lower in methanol. This
observation highlights the important role of the hydrophobic effect in generating
elasticity in elastin-like material, which is consistent with the two-phase model.
6.2 Future Directions
First, in recent years, along with the rapid increase of computational power and the
development of more efficient algorithms, the time scale accessible to MD simulations has been
continuously extended. As a result, the accuracy of current force fields turns out to
be limited, which has spurred a new trend of force field optimization and comparison
(see Chapter 2 for references). However, most of the force fields were optimized and
validated with folded proteins in mind, and they were assumed to be suitable for IDPs as well
when the MD simulations in this thesis were conducted. Such an assumption turns
out to be questionable, which has motivated a comparison of the most recent force fields
for the model peptides. The preliminary results of this comparison study are shown
in Appendix A. Some of the interesting findings include (1) in CHARMM22* [95], the
Rg curve is roughly reproduced for (PGV)12, but not for (GV)18 or G35; (2) the ALPs
form β-sheet in OPLS-AA/L, but do not in CHARMM22*; (3) an XL-domain-derived
Ala-rich peptide forms α-helix in CHARMM22*, but does not in OPLS-AA/L (personal
communication with Aditi Ramesh). Therefore, a force field that is good for ELPs may
not be as suitable for ALPs, and one that is good for XL domains may not be as valid
for HP domains. In order to achieve a more accurate description of the conformational
ensembles of IDPs, a thorough comparison and evaluation of force fields is necessary.
Second, since elastin is an extracellular matrix protein, MD simulations of peptide
aggregation as well as of mature elastin-like materials are necessary to deepen our
understanding of the underlying structure-function relationships in elastin. However, due to the
extraordinary demand of atomistic models on computational resources, CG models need
to be used if a certain loss of atomistic detail is acceptable. The MARTINI
model mentioned in Subsection 2.2.2 is currently under development and will be tested in
the near future. Once a CG model is validated, it can also be used to test the hypothesis
proposed in Chapter 3. An alternative way to circumvent the bottleneck of computational
power is to use an elastic network model (ENM). The tetrahedron model built
in Chapter 5 provides the possibility of developing an extremely coarse-grained ENM
for modeling the Young’s modulus of a piece of elastin-like material. Equation (5.7) can
be used as the potential energy function with the calculated k and d0 as its constant
parameters. In this model, each XL domain corresponds to an atom and each HP domain
to a bond. By removing a number of HP domains randomly, the Young’s modulus
under the conditions of limited cross-linking efficiency can also be simulated.
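A very small sketch of such an elastic network is given below. It assumes that the potential has the harmonic form 0.5 · k · (d − d0)² per HP domain, which is an assumption about Equation (5.7), and it represents a single tetrahedron only. The function names, the random thinning scheme and the 10 % uniform stretch are illustrative choices, not part of the original work.

```python
import math
import random

def bond_energy(pos_a, pos_b, k, d0):
    """Harmonic energy of one HP-domain bond (assumed form 0.5 * k * (d - d0)^2)."""
    d = math.dist(pos_a, pos_b)
    return 0.5 * k * (d - d0) ** 2

def network_energy(positions, bonds, k, d0):
    """Total energy of the network: XL domains are nodes, HP domains are bonds."""
    return sum(bond_energy(positions[i], positions[j], k, d0) for i, j in bonds)

def thin_bonds(bonds, efficiency, seed=0):
    """Keep each bond with probability `efficiency` to mimic incomplete crosslinking."""
    rng = random.Random(seed)
    return [bond for bond in bonds if rng.random() < efficiency]

def tetrahedron(d0):
    """One central crosslink bonded to four neighbours at distance d0."""
    a = d0 / math.sqrt(3.0)
    positions = [(0.0, 0.0, 0.0), (a, a, a), (a, -a, -a), (-a, a, -a), (-a, -a, a)]
    bonds = [(0, 1), (0, 2), (0, 3), (0, 4)]
    return positions, bonds

k, d0 = 8.7, 1.3   # pN/nm and nm, the (GVPGV)7-in-water fit from Chapter 5
positions, bonds = tetrahedron(d0)
stretched = [tuple(1.1 * c for c in p) for p in positions]   # uniform 10 % expansion
print(network_energy(positions, bonds, k, d0))               # relaxed network
print(network_energy(stretched, bonds, k, d0))               # energy in pN nm
print(len(thin_bonds(bonds, 0.75)))                          # bonds left after thinning
```

A full model would tile many unit cells, impose the volume-conserving deformation of Chapter 5 and minimize the energy at each strain; this sketch only shows the data structures such a model could start from.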
Appendix A
Force Fields Comparison
A.1 Background
There are multiple pieces of evidence showing that OPLS-AA/L [61], the force field
used for most of the work presented in this thesis, is not the best one among a variety
of modern force fields. First, the results from a couple of very recent force field
comparison studies [8, 71] suggest that OPLS-AA/L is not the best at reproducing NMR
measurements for biomolecular systems. Second, results of Sarah Rauscher from our
group show that OPLS-AA/L produces over-collapsed conformations of the N-terminal
SH3 domain of the protein drk, and hence underestimates its Rg [97]. Third, our
colleague Aditi found that an Ala-rich peptide, A7K¹, which forms α-helix according to
circular dichroism (CD), can hardly form any α-helix in OPLS-AA/L (unpublished
results). Therefore, we started a comparison study that focuses on selecting an optimal
force field for this project, in particular for the MD simulations of EBPs.
At the first stage of this study, an ELP, (GVPGV)7, was simulated in 7 force fields.
One of them, CHARMM22*, was tested with two water models. Therefore, in total,
8 force field sets have been tested and compared for (GVPGV)7. The 8 force field sets
are shown in Table A.1. A force field set simply means a force field plus a particular
type of water model. Usually, a force field has a preferred water model, which is the one
used when the force field was being developed. Initially, CHARMM27 and CHARMM22*
were paired with TIP3P. Afterwards, on realizing that there is a CHARMM-modified variant of
TIP3P, namely TIPS3P, we also added CHARMM22* + TIPS3P to the force field
sets for comparison. The results on (GVPGV)7 suggest that CHARMM22* + TIPS3P is
the best force field set because the peptide reaches its largest average Rg. Concurrently,
a parallel comparison conducted by Aditi shows that A7K forms an extensive amount of
α-helix in CHARMM22*, which does not happen in any of the other force fields² she has
compared.
¹ The sequence of A7K is AAAAAAAKAAKAAAAAAA.
Force field           | Water model
ff99SB-ILDN [51, 72]  | TIP3P [57]
ff99SB*-ILDN [15, 72] | TIP3P
ff03* [31]            | TIP3P
ff03w [13]            | TIP4P/2005 [2]
OPLS-AA/L [61]        | TIP4P [57]
CHARMM27 [75, 71]     | TIP3P
CHARMM22* [95]        | TIPS3P [74]
Table A.1: Selected force field sets for comparison.
At the second stage of the study, we simulated 3 of the model peptides, (PGV)12, (GV)18
and G36, in 7 solvents (water, methanol, ethanol, pentanol, heptanol, octanol and octane) in
CHARMM22*, trying to reproduce the Rg curve shown in Chapter 4 with proper
initial conformations. As mentioned in Subsection 4.2.4, the results have been reproduced
qualitatively for (PGV)12. However, for the other two peptides, the Rg curve of (GV)18
includes an abnormally high spike in methanol and that of G36 is flattened out. On
inspecting the structure of (GV)18, we found a serious artifact that involves the formation
of continuous G-V β-turns, which we name the zigzag extension. After communicating
with one of the major developers of the CHARMM force field, Alex MacKerell from
the University of Maryland, we took his suggestion and tested the newest
CHARMM force field at the time, CHARMM36 [16]. Unfortunately, the zigzag extension still
exists in CHARMM36, although it is not as abundant as in CHARMM22*.
² The force fields Aditi has compared include ff99SB*-ILDN, ff03w, OPLS-AA/L and CHARMM22*.
We suspect that this artifact is caused by inadequate parameterization of the Gly backbone
dihedral angles, φ and ψ. Therefore, at the third stage of the force
field comparison study, we plotted the potential energy maps of the dipeptides of 3 residues,
Gly, Val and Pro, in vacuo as a function of the φ and ψ angles in 4 force fields, CHARMM22*,
CHARMM36, OPLS-AA/L and ff99SB*-ILDN. The results indicate that the potential energy
maps differ fundamentally between force field families for the Gly
dipeptide but are very similar for the Val and Pro dipeptides. Since the zigzag extension
has not been observed in OPLS-AA/L, the results support our suspicion that the artifact
is most likely due to improper parameters for the backbone dihedral angles of Gly.
This study is still ongoing and will probably need assistance from the CHARMM22*
force field developers to improve the parameters for the backbone dihedral angles of Gly.
The following sections present the results that have been obtained as of writing.
A.2 Results
A.2.1 Force Fields Comparison for (GVPGV)7
Radius of Gyration The distributions of Rg of (GVPGV)7 in different force fields is
shown in Figure A.1. According to the broadness of the distributions, the force fields
can be roughly divided into three groups. Unsurprisingly, OPLS-AA/L belongs to the
first group, in which the peptide is the most collapsed than that in the other groups.
In addition, ff03* and ff03w also belong to the first group. The second group includes
ff99SB-ILDN, ff99SB*-ILDN, CHARMM27 and CHARMM22* with TIP3P water model,
in which the peptide has a larger Rg than in group one. Surprisingly, when CHARMM22*
is paired with the CHARMM-modified variant of TIP3P [74], TIPS3P, Rg becomes even
larger. Therefore, CHARMM22* itself forms the third group when paired with TIPS3P.
PMF The PMF along the end-to-end distance of (GVPGV)7 was computed, and from it the
modulus of the peptide was calculated using the method described in Subsection 5.2.1.
The results are shown in Figure A.2. In terms of fitting quality, r² is above 0.90 for
ff03*, ff03w, OPLS-AA/L and CHARMM22*+TIPS3P, which is better than for the other force
field sets and suggests that the PMF converges faster in these four. However, as shown
in Figure A.1, ff03*, ff03w and OPLS-AA/L tend to underestimate Rg. As a result,
CHARMM22*+TIPS3P is the preferred force field set. As for the resulting modulus, the
values are all of the same order of magnitude and the experimental value is unknown, so
the differences among force fields do not help with the selection. Therefore, we decided
to use CHARMM22*+TIPS3P as the force field set for future work in this project.
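As an illustration of how k, d0 and r² can be extracted from a PMF, the following is a
minimal sketch, not the method of Subsection 5.2.1 itself. It assumes the fitted form
PMF(d) = k(d − d0)²/2 + c, uses NumPy's polynomial fit, and converts the curvature from
kJ/(mol nm²) to pN/nm. The function name fit_parabola and the synthetic data are
hypothetical.

```python
import numpy as np

# 1 kJ/(mol nm^2) expressed in pN/nm (1 kJ/mol = 1.66054e-21 J per molecule).
KJ_PER_MOL_NM2_TO_PN_PER_NM = 1.66054


def fit_parabola(d_nm, pmf_kj_mol):
    """Fit PMF(d) = 0.5*k*(d - d0)**2 + c (assumed form).

    Returns (k in pN/nm, d0 in nm, r^2 of the fit).
    """
    d = np.asarray(d_nm, dtype=float)
    w = np.asarray(pmf_kj_mol, dtype=float)
    a, b, c = np.polyfit(d, w, 2)       # w ~ a*d^2 + b*d + c
    k = 2.0 * a                         # curvature in kJ/(mol nm^2)
    d0 = -b / (2.0 * a)                 # position of the minimum
    fitted = np.polyval([a, b, c], d)
    ss_res = np.sum((w - fitted) ** 2)  # residual sum of squares
    ss_tot = np.sum((w - w.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return k * KJ_PER_MOL_NM2_TO_PN_PER_NM, d0, r2


# Synthetic, noise-free PMF used only to exercise the function.
d = np.linspace(0.5, 3.0, 26)
pmf = 0.5 * 2.0 * (d - 1.9) ** 2
k_fit, d0_fit, r2_fit = fit_parabola(d, pmf)
```

With real data, a low r² would indicate that the PMF is not well described by a parabola
over the fitted range, which is how the fitting quality is used in the comparison above.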
Figure A.1: Distributions of Rg of (GVPGV)7 in different force field sets. Only backbone
Cα atoms were used for the calculation. [Plot: probability P versus Rg (nm); curves:
ff99SB-ILDN + TIP3P, ff99SB*-ILDN + TIP3P, ff03* + TIP3P, ff03w + TIP4P/2005,
OPLS + TIP4P, CHARMM27 + TIP3P, CHARMM22* + TIP3P, CHARMM22* + TIPS3P.]
Figure A.2: PMFs of (GVPGV)7 in different force field sets. The dashed lines show
parabolic fits to the PMFs. k and d0 are the modulus and the equilibrium length of the
peptide, and r² indicates the fitting quality. [Panels: PMF (kJ/mol) versus d (nm), one
panel per force field set. The fitted values annotated in the panels are listed below.]

Force field set          k (pN/nm)    d0 (nm)      r²
ff99SB-ILDN + TIP3P      3.5 ± 0.1    1.7 ± 0.0    0.87
ff99SB*-ILDN + TIP3P     3.3 ± 0.3    1.8 ± 0.0    0.84
ff03* + TIP3P            6.7 ± 0.2    1.4 ± 0.0    0.93
ff03w + TIP4P/2005       4.1 ± 0.2    1.5 ± 0.0    0.92
OPLS-AA/L + TIP4P        6.5 ± 0.5    1.3 ± 0.0    0.92
CHARMM27 + TIP3P         3.9 ± 0.2    1.4 ± 0.0    0.80
CHARMM22* + TIP3P        3.8 ± 0.1    1.6 ± 0.0    0.82
CHARMM22* + TIPS3P       3.4 ± 0.1    1.9 ± 0.0    0.93
A.2.2 Force Fields Comparison for (GV)18
Zigzag extension As mentioned above, the Rg curve of (GV)18 contains a spike in
methanol. After analyzing the structures, we found that the spike is caused by a structure
called the zigzag extension, which also exists in simulations in CHARMM36. A snapshot of
the zigzag extension is shown in Figure A.4. In the zigzag extension, there is only one
type of β-turn, which is formed by the H-bond between the backbone C=O group of
Gly(2i+1) and the N-H group of Val(2i+4). In a perfect zigzag extension, nearly all
residues of the peptide (except Val2 and Gly35) are engaged in the formation of β-turns,
which would transform the peptide into a pseudo-2D helix. The zigzag extension is most
prominent in methanol, but also appears in other solvents. Although the β-spiral model
[119] mentioned in Section 1.6 also consists of repetitive β-turns, it is very different
from the zigzag extension. In a hypothetical β-spiral, for example in (PGV)n where all
VPGV units form β-turns, both the N-H and C=O groups of a single Val participate in the
formation of two consecutive β-turns.
H-bonding Map To obtain a quantitative view of the structure of the zigzag extension,
H-bonding maps were calculated for (GV)18 in different solvents as well as in different
force fields, as shown in Figure A.5. Since the zigzag extension involves H-bonds between
the backbone C=O group of Gly(2i+1) and the N-H group of Val(2i+4), only every other
C=O group is hydrogen bonded, which is exactly what the maps for CHARMM22* and
CHARMM36 in Figure A.5 show. Similarly, the H-bonding maps of (GV)18 in other
solvents in Figure A.6 also indicate the existence of zigzag extensions. In contrast, as
shown in Figure A.5, the C=O group of every residue is involved in the formation of an
H-bond in OPLS-AA/L, and the corresponding type of turn is the γ-turn.
Figure A.3: Average Rg of (GVPGV)7, (GV)18 and G36 in water, alcoholic solvents, and
octane. The alcoholic solvents are ordered by the length of their alkyl chains. [Plot:
radius of gyration Rg (nm) versus solvent (water, methanol, ethanol, pentanol, heptanol,
octanol, octane); legend entries: (PGV)12, (GV)18, (G)36.]
Figure A.4: A snapshot of the zigzag extension of (GV)18 in methanol in the CHARMM22*
force field. The upper half is a zoomed-out view of the whole peptide; the lower half
zooms into the zigzag region.
Figure A.5: H-bonding maps of (GV)18 in the CHARMM22*, OPLS-AA/L and CHARMM36
force fields.
Figure A.6: H-bonding maps of (GV)18 in other solvents in the CHARMM22* force field.
PMF of the Ramachandran Plot We also plotted the PMF of the Ramachandran plot for
Gly and Val in (GV)18 in CHARMM22*, CHARMM36 and OPLS-AA/L, as shown in Figures A.7
and A.8.
On the one hand, the PMFs of Gly in CHARMM22* and in CHARMM36 are very similar,
and both favor the helical region over the extended region. This is not exactly consistent
with the PMF of a short tripeptide, G3, published in the paper that first announced
CHARMM36 [16], in which the extended region was preferred instead, but the overall
contour shapes are similar among those PMFs. In contrast, the PMF of Gly in OPLS-AA/L
is very different from those in the CHARMM force fields, not only because there is no
preference for either the helical or the extended region, but also because its contour
shape is drastically different. Since the zigzag extension occurs in both CHARMM22* and
CHARMM36, but not in OPLS-AA/L, we think this result suggests that it is Gly that causes
the zigzag extension. In addition, although the PMF of Gly should be symmetrical with
respect to the point (φ = 0, ψ = 0), as shown in Figure A.9, which is calculated for G35,
the introduction of Val breaks this symmetry and results in a PMF slightly biased towards
the φ > 0 region.
On the other hand, the PMFs of Val all differ from each other in terms of both the
contour shape and the preferred region of φ-ψ combinations. In CHARMM22*, the extended
region is favored. In CHARMM36, the helical region is favored. In OPLS-AA/L, though the
extended region is favored, the helical region is also populated. Given that the zigzag
extension occurs in both CHARMM22* and CHARMM36, we think it is not very sensitive to
the differences in the PMFs of Val. In other words, Val is less likely to be the major
cause of the zigzag extension.
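For readers unfamiliar with this kind of plot, the sketch below shows one common way to
turn sampled (φ, ψ) pairs into a PMF, namely Boltzmann inversion of a 2D histogram,
PMF = −kBT ln P. This is an assumption about the procedure, which the text does not
spell out. The function name, bin width and input format are hypothetical.

```python
import math
from collections import Counter

KB = 0.0083144626  # Boltzmann (gas) constant in kJ/(mol K)


def _bin_edge(angle_deg, bin_width):
    """Wrap an angle into [-180, 180) and return the lower edge of its bin."""
    wrapped = (angle_deg + 180.0) % 360.0
    return int(math.floor(wrapped / bin_width)) * bin_width - 180


def ramachandran_pmf(phi_psi_deg, temperature=300.0, bin_width=10):
    """Return {(phi_bin, psi_bin): PMF in kJ/mol}, with the minimum set to 0.

    Only bins that were visited at least once appear in the result.
    """
    counts = Counter()
    for phi, psi in phi_psi_deg:
        counts[(_bin_edge(phi, bin_width), _bin_edge(psi, bin_width))] += 1
    max_count = max(counts.values())
    kt = KB * temperature
    # Normalizing by the most populated bin shifts the global minimum to zero.
    return {b: -kt * math.log(n / max_count) for b, n in counts.items()}


# Tiny made-up sample: four frames in one bin, one frame in another.
samples = [(-60, -45)] * 4 + [(-120, 130)]
pmf = ramachandran_pmf(samples)
```

Bins that are never visited have no finite PMF value under this scheme, which is why
poorly sampled regions of such maps are usually left blank.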
Figure A.7: PMFs of Ramachandran plots for Gly in (GV)18 in CHARMM22*, OPLS-AA/L
and CHARMM36 force fields.
Figure A.8: PMFs of Ramachandran plots for Val in (GV)18 in CHARMM22*, OPLS-AA/L
and CHARMM36 force fields.
Figure A.9: PMFs of Ramachandran plots for Gly in G36 in CHARMM22*, OPLS-AA/L
and CHARMM36 force fields.
A.2.3 Force Fields Comparison for Dipeptides In Vacuo
Potential Energy Map The potential energy maps of the dipeptides of three residues,
Gly, Val and Pro, in four force fields, CHARMM22*, CHARMM36, OPLS-AA/L and
ff99SB*-ILDN, are shown in Figures A.10, A.11 and A.12. The maps for CHARMM22* and
CHARMM36 are identical for all three dipeptides, because a single dipeptide is too simple
a system to distinguish the two. The maps of the Gly dipeptide differ greatly among
force field families. In particular, the peaks and troughs of the potential energy are
drastically different from each other. In OPLS-AA/L and ff99SB*-ILDN, the favored region
of φ-ψ combinations is close to that of extended structures, while in the CHARMM force
fields it is closer to the helical region. For the Val dipeptide, the peaks and troughs
are much closer to each other among force fields than those of the Gly dipeptide, though
not exactly the same. The trough of the potential energy in the CHARMM force fields
favors the right part of the region of extended structures, whereas ff99SB*-ILDN favors
its left part and OPLS-AA/L displays no preference. In contrast, the potential energy
maps of the Pro dipeptide are nearly the same among the different force fields.
A.3 Discussion
The force field comparison on (GVPGV)7 suggests that CHARMM22* is the favored
force field, but CHARMM22* produces a significant artifact on (GV)18. We think that the
sequence repetitiveness of (GV)18 probably amplifies the zigzag effect. The comparison
of the PMFs of the Ramachandran plots of (GV)18 and G35, as well as the potential energy
maps of the dipeptides in different force fields, suggests that the problem is likely to
be caused by Gly, since it is where the force fields differ most from each other, at
least at the residue level.
As mentioned in the background, this work is still ongoing. Currently, the major
bottleneck is resolving the artifact by obtaining better parameters for the Gly backbone
dihedral angles in the CHARMM force fields, which probably requires help from the
CHARMM force field developers.
Figure A.10: Potential energy maps of the Gly dipeptide in different force fields.
[Panels: Gly, CHARMM22*; Gly, CHARMM36; Gly, OPLS-AA/L; Gly, ff99SB*-ILDN. Axes: φ
(horizontal) and ψ (vertical). Color scale: 0 to 120 kJ/mol.]
Figure A.11: Potential energy maps of the Val dipeptide in different force fields.
[Panels: Val, CHARMM22*; Val, CHARMM36; Val, OPLS-AA/L; Val, ff99SB*-ILDN. Axes: φ
(horizontal) and ψ (vertical). Color scale: 0 to 120 kJ/mol.]
Figure A.12: Potential energy maps of the Pro dipeptide. [Panels: Pro, CHARMM22*;
Pro, CHARMM36; Pro, OPLS-AA/L; Pro, ff99SB*-ILDN. Axes: φ (horizontal) and ψ
(vertical). Color scale: 0 to 360 kJ/mol.]
A.4 Material & Methods
All the peptides, including the dipeptides, are capped with an N-terminal acetyl group
and a C-terminal amide group.
(GVPGV)7 The same setup parameters apply to simulations in all 8 force field sets
shown in Table A.1. We simulated 40 replicas of 300 ns each and discarded the first
150 ns of each replica as equilibration, which results in 6 µs of sampling time in each
force field set. The initial structures of the peptides were generated at 300 K in vacuo.
All simulations were performed at constant pressure (1 bar) and constant temperature
(300 K) with periodic boundary conditions. The simulation package used was
Gromacs-4.5.5. The LINCS algorithm was used to constrain all bond lengths [47, 46], and
an integration time step of 2 fs was applied. A cutoff of 1.4 nm was used for
Lennard-Jones interactions. The PME algorithm [26, 33] was used to calculate long-range
electrostatic interactions with a Fourier spacing of 0.12 nm and an interpolation order
of 4. The Nosé-Hoover thermostat [88, 50] was used for temperature coupling, with the
peptide and solvent coupled to two separate temperature baths and a time constant of
2 ps. The Parrinello-Rahman barostat [91] was used for pressure coupling with a time
constant of 2 ps.
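For orientation, the parameters listed above map onto GROMACS .mdp options roughly as in
the following sketch. The option names are standard GROMACS ones, but the dictionary is
an illustrative reconstruction rather than the input file actually used. The coupling
group names and the render_mdp helper are assumptions, and settings that the text does
not state (for example the compressibility, the Coulomb cutoff and the run length) are
deliberately left out, so the output is not a complete input file.

```python
# Run parameters stated in the text, written as GROMACS .mdp options.
MDP_OPTIONS = {
    "dt": "0.002",                     # 2 fs time step, in ps
    "constraints": "all-bonds",        # constrain all bond lengths
    "constraint_algorithm": "lincs",
    "rvdw": "1.4",                     # Lennard-Jones cutoff, in nm
    "coulombtype": "PME",
    "fourierspacing": "0.12",          # in nm
    "pme_order": "4",
    "tcoupl": "nose-hoover",
    "tc_grps": "Protein Non-Protein",  # assumed group names
    "tau_t": "2.0 2.0",                # in ps, one value per group
    "ref_t": "300 300",                # in K
    "pcoupl": "Parrinello-Rahman",
    "tau_p": "2.0",                    # in ps
    "ref_p": "1.0",                    # in bar
    "pbc": "xyz",
}


def render_mdp(options):
    """Format a dict of options as aligned 'key = value' lines."""
    width = max(len(key) for key in options)
    lines = [f"{key:<{width}} = {value}" for key, value in options.items()]
    return "\n".join(lines) + "\n"


mdp_text = render_mdp(MDP_OPTIONS)
```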
(PGV)12, (GV)18, G36 For simulations in CHARMM22*, please refer to Section
4.4. For simulations in OPLS-AA/L, please refer to Section 3.5. For simulations in
CHARMM36, all the technical setup except the force field was the same as in
CHARMM22*. In the H-bonding maps, the value for each possible intramolecular
peptide-peptide H-bond was calculated as the number of frames in which it appears along
the trajectory, normalized by the total number of frames and averaged over all replicas.
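The normalization described above can be sketched as follows. This is a minimal
illustration, not the analysis code used for the figures. It assumes that H-bond
detection has already been done and that each frame is given as a set of
(donor residue, acceptor residue) pairs. The function name and input layout are
hypothetical.

```python
from collections import defaultdict


def hbond_occupancy(replicas):
    """Compute the H-bonding map values.

    replicas: list of trajectories; each trajectory is a list of frames;
    each frame is a set of (donor_residue, acceptor_residue) pairs.
    Returns {pair: fraction of frames with that H-bond, averaged over replicas}.
    """
    totals = defaultdict(float)
    for frames in replicas:
        counts = defaultdict(int)
        for frame in frames:
            for pair in frame:
                counts[pair] += 1
        # Normalize by the number of frames in this replica.
        for pair, n in counts.items():
            totals[pair] += n / len(frames)
    # Average over replicas; a pair absent from a replica contributes zero.
    return {pair: s / len(replicas) for pair, s in totals.items()}


# Made-up example with two short replicas.
replica_a = [{(5, 2)}, {(5, 2), (7, 4)}, set(), set()]
replica_b = [{(5, 2)}, {(5, 2)}]
occupancy = hbond_occupancy([replica_a, replica_b])
```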
Dipeptides The potential energy was calculated from the structure after energy
minimization, during which a quadratic term was added to the potential energy function
to restrain each of the φ and ψ dihedral angles to a particular value. Please note that
the final potential energy was calculated without the quadratic term. The values of the
dihedral angles were varied from −180° to 170° at 10° intervals. To compare among the
force fields, the minimum of each potential energy map was shifted to the same level as
that in CHARMM22*.
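The scan grid and the alignment of the minima can be sketched as below. The restrained
minimization itself is not shown. The function names are hypothetical, and the maps are
assumed to be dictionaries keyed by (φ, ψ) in degrees.

```python
from itertools import product


def dihedral_grid(start=-180, stop=170, step=10):
    """Return all (phi, psi) pairs of the scan, in degrees."""
    angles = list(range(start, stop + 1, step))
    return list(product(angles, angles))


def align_minimum(energy_map, reference_map):
    """Shift energy_map so that its minimum equals that of reference_map."""
    shift = min(reference_map.values()) - min(energy_map.values())
    return {key: value + shift for key, value in energy_map.items()}


grid = dihedral_grid()

# Made-up two-point maps, used only to show the shift.
reference = {(-180, -180): -50.0, (-170, -180): -20.0}
other = {(-180, -180): 10.0, (-170, -180): 5.0}
aligned = align_minimum(other, reference)
```

With a 10° interval, the grid contains 36 × 36 = 1296 restrained minimizations per
dipeptide and force field.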
Appendix B
sumcoresg
sumcoresg, which stands for sum up the cores usage for a research group, is a web
application (web app) that collects, analyzes and presents usage data of computational
resources on multiple computer clusters.
B.1 Motivation
Our group has been allocated a significant amount of computational resources in recent
years by Compute Canada (https://computecanada.ca/), so we consider it necessary
to track our usage in order to fully utilize these resources. Since the allocation is
distributed over multiple computer clusters across the country, usage tracking also
helps our group avoid underutilizing some clusters while overutilizing others, which
would result in unnecessary queueing time before jobs start running.
We started by assigning each cluster to a group member who would be in charge of
collecting the usage data for that particular cluster. The data were collected by
executing a customized script, keeping it running without hanging up, and restarting it
immediately after a cluster shutdown and reboot. At the end of each week, each member
would report the data collected during the past week to a group leader, who would
compile all the results and discuss them in the upcoming group meeting. This process was
tedious, inefficient and error-prone. For example, people all used their own scripts for
their own clusters, hence it was uncertain whether they all calculated the usage in the
same way. Also, it was not always easy to notice a cluster shutdown and reboot in time,
which could result in the loss of one or two days of data.
Later on, one of our group members, Chris Ing, wrote a script that could collect the
current usage data from all allocated clusters at once when executed, which made
the data collection process much less laborious. However, the downside of Chris' code
was that it still needed to be executed manually, which limited the frequency of data
collection. Besides, group members could not visualize the usage data freely until it was
presented.
Therefore, I decided to build a web application to automate the whole process, which
includes data collection, analysis and presentation. One major advantage of a web
application over a desktop program is that no installation is required on the client
side except for a web browser and an Internet connection, which are available by default
on most modern computers and cell phones.
The result of this work is sumcoresg, which is accessible at http://usage.pomeslab.com
at any time, although only authorized people are able to view the usage data. Currently,
sumcoresg collects usage data from 8 computer clusters across Canada about every 10
minutes, which results in about 1000 data points per week. Therefore, the usage data
collected are reasonably accurate. After the data analysis, the latest usage information
for all clusters of interest is shown in a table, which can help a person select a
relatively less busy cluster and start running new jobs there. The historical usage data
are visualized as a plot over time or as a bar chart. However, the usage data can
essentially be presented in any way that feels straightforward and convenient.
B.2 Material & Methods
At the backend of sumcoresg, the language used is Python (http://www.python.org/),
the web framework used is Flask (http://flask.pocoo.org/), the templating engine
used is Jinja2 (http://jinja.pocoo.org/docs/templates/), and the database used
is PostgreSQL (http://www.postgresql.org/).
At the frontend, the markup, styling and programming languages used are HTML, CSS and
JavaScript, respectively.
The hosting service used is Heroku (https://www.heroku.com/), a cloud platform ini-
tially developed for Ruby on Rails (http://rubyonrails.org/), and later extended
for developments in Python as well. The major advantage of Heroku is that it’s free for
small web applications.
The network protocol used for communication between the web server and all the clusters
of interest is Secure Shell 2 (SSH2), and its Python implementation (i.e. Python
module) used is paramiko (http://www.lag.net/paramiko/). The network protocol used for
communication between the web server and users is Hypertext Transfer Protocol (HTTP).
All the scripts and folders in sumcoresg are summarized in Table B.1. The source
code will be available upon request.
File or folder           Description
sumcoresg.py             contains all URL handlers. The important ones include
                         main, login, signup, logout, report, plot, plot_dur,
                         histo, pomeslab_png. It also contains two functions for
                         starting to collect data (start_collecting_data) and
                         starting the application (start_app_run).
app_config.py            includes global configurations.
thedata.py               includes constant variables.
util.py                  includes utility functions.
obj.py                   includes Python classes, e.g. Cluster, Report.
statparsers.py           includes the parsers for processing usage data, which are
                         in XML format, fetched from different clusters.
data_collector.py        includes functions for data collection, processing, and
                         presentation.
db_tables.py             includes table schemas used in the database, e.g. Usage,
                         Account, Figure.
manage.py                includes management functions for the initial launch of
                         the app.
distribute_pub_key.py    for distributing the SSH public key to different clusters
                         once updated. To save trouble, it is better to use this
                         script after confirming that all the computer clusters of
                         interest are on.
write_xml.py             based on CLUSTER_TAGS, CLUSTER_DATA and USER_DATA in
                         thedata.py, it generates static/xml/clusters.xml and
                         static/xml/users.xml, which contain the configurations
                         for clusters and users. Each time the above three
                         variables are updated, write_xml.py needs to be executed
                         so that clusters.xml and users.xml are up to date. It is
                         now realized that writing the configurations directly
                         into the database would have been much more convenient
                         and easier to update.
queue_data.py            for importing the data that were manually collected
                         previously. This code is now deprecated.
generate_key.sh          contains a code sample for generating a new SSH key pair,
                         i.e. both public and private keys.
templates                This folder contains all HTML templates.
static                   This folder contains all non-dynamically generated files,
                         e.g. CSS files and JS files.
afternoon.backup         This folder contains backups of usage data.
Makefile                 contains make rules for daily maintenance of the app.
matplotlibrc             used by matplotlib [54], it contains customizations for
                         decorating the plots.
Procfile                 Heroku-specific file, contains the code to be executed
                         when the app is launched.
requirements.txt         Heroku-specific file, contains the modules that need to
                         be installed for launching the app.
runtime.txt              Heroku-specific file, specifies a specific runtime.
.sumcoresgk.pub          contains the public SSH key, which needs to be stored
                         in the ${HOME}/.ssh/authorized_keys file on each computer
                         cluster of interest. It should not be version controlled
                         in order to reduce the risk of being attacked.
.sumcoresgk              contains the private SSH key, which is needed by the web
                         server to interact with the computer clusters. It should
                         never be made public. To reduce the risk of being
                         attacked, it should be regenerated regularly, and the new
                         public key distributed to all clusters using
                         distribute_pub_key.py.
Table B.1: Summary of scripts and folders in sumcoresg.
B.3 Workflow
The workflow of sumcoresg is illustrated in Figure B.1. The web server first connects
to the target computer cluster over SSH and executes the command that generates the
latest usage data on that cluster. The specific command depends on which queueing
system is installed and how it is configured on the particular cluster. Generally, on a
system with Moab/Torque installed, the command looks like
/path/to/showq/or/qstat -some -options --format=xml.
--format=xml makes sure the data returned are in XML format. The returned usage
data are received by the web server, and then processed and formatted in HTML. The
formatted result is cached using the program memcache in order to speed up the
response when a user visits the web application. The interaction between the web
server and the computer cluster takes place at a specified time interval (e.g. 10 minutes).
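The processing step on the web server can be sketched as follows. This is not the code
in statparsers.py. The XML layout, including the element and attribute names queue,
job, User and ReqProcs, is a simplified assumption about what the queueing system
returns, and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, simplified usage data as it might be returned by a cluster.
SAMPLE_XML = """<Data>
  <queue option="active">
    <job User="alice" ReqProcs="64"/>
    <job User="bob" ReqProcs="32"/>
    <job User="alice" ReqProcs="8"/>
  </queue>
  <queue option="eligible">
    <job User="bob" ReqProcs="128"/>
  </queue>
</Data>"""


def cores_in_use(xml_text, group_members):
    """Sum the cores of running jobs for each member of the group."""
    root = ET.fromstring(xml_text)
    usage = Counter()
    for queue in root.findall("queue"):
        if queue.get("option") != "active":
            continue  # skip jobs that are still waiting in the queue
        for job in queue.findall("job"):
            user = job.get("User")
            if user in group_members:
                usage[user] += int(job.get("ReqProcs", "0"))
    return dict(usage)


def usage_percent(usage, allocated_cores):
    """Express the group's total usage as a percentage of its allocation."""
    return 100.0 * sum(usage.values()) / allocated_cores


usage = cores_in_use(SAMPLE_XML, {"alice", "bob"})
percent = usage_percent(usage, allocated_cores=208)
```

Each cluster may need its own parser, since the output depends on the queueing system
and its configuration.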
When a user visits the web application over HTTP, the server responds with the cached
result immediately. After the user receives it, the browser on the user's computer
generates the graph for visualization. In contrast to the server-cluster interaction,
the interaction between the web server and a user only happens when the user visits the
website.
In addition, in order to restrict access to authorized users, only those who know a
secret code, which is set up by the webmaster, are able to sign up and then log in to
visualize the usage data. However, please note that this is not a very strong
authentication system, and it could become insecure when the user base increases.
B.4 Screen Shots
Figures B.1–B.3 show what the data look like as of writing. Figure B.1 shows the latest
report on usage, which is updated about every 10 minutes. Figures B.2 and B.3 show the
historical usage data over time and in a bar chart, respectively. As mentioned above,
the data presentation can be versatile.
B.5 Future Directions
The current code works very well, but there are still many ways to optimize and improve
it, both for readability and for the development of new features in the future.
1. Reimplement the configurations of clusters and users in the database, replacing
write_xml.py, static/xml/clusters.xml, static/xml/users.xml, and the
CLUSTER_TAGS, CLUSTER_DATA and USER_DATA variables in thedata.py.
2. Further modularize data_collector.py, and rewrite big functions into smaller
ones.
3. Isolate URL handlers for the user system (e.g. login, signup, logout) from
sumcoresg.py into a separate script.
4. Build a content management system to make addition and removal of clusters and
users easy.
5. Build more interactive ways of data visualization.
6. Generalize the app so that it can be easily adopted by any research group that uses
multiple clusters simultaneously.
Figure B.1: Workflow of sumcoresg. Please see the detailed description in the text. At
the end of the screenshot, results from more computer clusters are omitted, as indicated
by the ellipsis. The cartoons for the web server and visitor are downloaded directly
from clker.com. The cartoon of the computer cluster is drawn by duplicating a unit of 4
computers, which is also from clker.com. [Diagram: the web server runs showq/qstat on
the computer cluster over SSH2; the cluster returns usage data in XML; the web server
receives, processes and formats the usage data, and then caches the results; a visitor
sends an HTTP GET request to usage.pomeslab.com and receives the "Latest report" HTML
page.]
Figure B.2: Historical usage data over time. The y axis shows the usage as a
percentage of the allocated core hours. The numbers of allocated cores in the legend
are arbitrary and for illustration purposes only. The dashed line indicates 100% usage;
values below or above it mean the cluster is temporarily being underutilized or
overutilized. The density of data points is actually much higher than what is shown in
the figure. The data resolution is decreased because the figure would otherwise be very
large in size and take a long time to load.
Figure B.3: Historical usage data in a bar chart. A bar chart is good for summarizing
the usage data over a long period of time. The x axis shows a list of the names of the
clusters being tracked, and the y axis shows the usage as a percentage of the allocated
core hours. Bars in green and red indicate overutilization and underutilization,
respectively. The dashed line indicates 100% usage. The title means the data are a
summary of the usage data since the beginning of the year.
Appendix C
xit
xit is a program that eases the process of system setup and analysis for
multiple-replica MD simulations.
C.1 Motivation
For MD simulations, it is routine to set up multiple replicas, submit jobs, analyze the
trajectories, and visualize the results. Newcomers usually do each step separately, which
ends up in a number of scripts distributed all over the system. Even worse, when it
comes to a new set of systems, old scripts are likely to be copied and modified to adapt
them to the new systems. As a result, it does not take long to end up with many
different but highly similar files all over the place, which is a big challenge for
maintenance. Therefore, it is necessary to generalize the process and write a single
program that is able to handle all kinds of routine jobs, and that is also highly
extensible and easy to adapt to new projects. This program is xit. In addition, xit
implements a queueing mechanism for executing most of the jobs it is capable of in
parallel via multi-threading.
The following list provides a more specific description of what xit does.
1. Set up simulation replicas, and then submit jobs to the queue on the compute
cluster for calculations. Given the templates of input files, xit will generate the
code to be run for each replica, and then submit it to the queueing system. This
step is done via xit prep --some --options.
2. After the simulations are done, xit handles all kinds of analysis. This step
is done via xit anal --some --options.
3. After each type of analysis, if the result is not in the most convenient format,
especially when the analysis code is not self-written, it may need to be
transformed first and then stored properly in a data file. This step is done via
xit transform --some --options.
4. After the transformation, the result is almost ready for visualization. The command
to generate a figure is xit plot --some --options.
5. It has been realized that xit plot is more appropriate for visualizing the
analysis of a single property. When it comes to multiple ones, use the command
xit plotmp --some --options where mp means multiple properties.
In all of the above steps, xit takes care of looping through all replicas.
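The queueing mechanism mentioned in the motivation, which loops over replicas in
parallel threads, could look roughly like the following sketch. It is not the
implementation in xit. The function run_in_parallel and the worker used in the example
are hypothetical, and only the Python standard library is used.

```python
import queue
import threading


def run_in_parallel(jobs, worker, num_threads=4):
    """Apply worker(job) to every job using a pool of threads.

    jobs must be a list; results are returned in the same order as jobs.
    """
    tasks = queue.Queue()
    for index, job in enumerate(jobs):
        tasks.put((index, job))
    results = [None] * len(jobs)

    def consume():
        # Each thread keeps taking jobs until the queue is empty.
        while True:
            try:
                index, job = tasks.get_nowait()
            except queue.Empty:
                return
            results[index] = worker(job)

    threads = [threading.Thread(target=consume) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Example: build one (made-up) analysis label per replica.
labels = run_in_parallel(["00", "01", "02"], lambda rep: "rg_" + rep, num_threads=2)
```

Threads are a reasonable fit when each job mostly waits on an external analysis program,
because the waiting does not hold Python's global interpreter lock.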
C.2 Material & Methods
xit is written purely in Python. The file format chosen for storing the configuration
file, which is project specific, is YAML (http://www.yaml.org/spec/1.2/spec.html).
YAML turns out to be very powerful and useful. The Python module used to parse the
configuration files is PyYAML (http://pyyaml.org/). The file format chosen for storing
analyzed and transformed results is HDF5 (http://www.hdfgroup.org/HDF5/), and
the corresponding Python module used is PyTables (http://www.pytables.org/). The
templating engine used is Jinja2 (http://jinja.pocoo.org/docs/), both for templating
topology files and for generating replica-specific analysis commands.
Currently, all of the subcommands of xit, i.e. prep, anal, transform, plot and plotmp,
have been written, and new types of analysis and plots are constantly being added. Out
of the five subcommands, prep and anal are always executed in parallel.
All the scripts and folders in xit are summarized in Table C.1. The source
code is available upon request.
File or folder     Description
xit.py             processes the command-line arguments and then invokes one of
                   the following functions: prep, anal, transform, plot, plotmp,
                   based on the subcommand typed.
prep.py            handles routine work to set up multiple simulation replicas,
                   for example, creating directories, copying over initial
                   structure files (e.g. pdb or gro files), and templating
                   topology files (e.g. top files) and scripts for system
                   equilibration. Please note that xit does not have code for
                   equilibrating the MD system, but it can generate the code to
                   do that based on an input template file.
anal.py            handles different kinds of analysis and invokes the
                   corresponding functions in the analysis_methods directory.
transform.py       handles different kinds of transformation of the raw results
                   generated by other analysis codes (e.g. parsing xvg or xpm
                   files generated by many Gromacs tools), and stores the
                   results in an HDF5 file.
plot.py            prepares the data to be plotted and invokes the corresponding
                   plotting function in the plot_types directory.
plotmp.py          similar to plot.py but handles plotting of multiple
                   properties and uses plotting functions in the plotmp_types
                   directory.
xutils.py          includes the function for parsing the command-line arguments.
                   The function is called by the function main in xit.py.
prop.py            includes all table schemas for storing the analysis results
                   in an HDF5 file.
objs.py            includes Python class objects.
utils.py           includes various utility functions.
analysis_methods   This folder contains files for all types of analysis. When
                   adding a new type of analysis, please modify one of the files
                   or add a new file to this folder. If a new file is added,
                   please edit analysis_methods/__init__.py to make sure it is
                   properly imported.
plot_types         This folder contains scripts for generating plots of
                   different types. Similar to analysis_methods, modify one of
                   the files in it or add a new file to this folder when a new
                   type of plot is needed. If a new file is added, please edit
                   plot_types/__init__.py to make sure it is properly imported.
plotmp_types       Similar to plot_types, but scripts in this folder plot
                   multiple properties in a single figure.
.xitconfig.yaml    This is the configuration file, which contains all
                   project-specific information. It should be located in the
                   root of the project directory.
Table C.1: Summary of scripts and folders in xit.
C.3 Usage Examples
In this section, an example is shown for using each of the subcommands, prep, anal,
transform, plot, plotmp.
The following command will make directories for 10 replicas (from 00 to 09) of systems
of all combinations of sequence (sq) 3, 4, 5, 6 in water (w), methanol (m) and ethanol (e).
In total, there will be 120 jobs (10 replicas × 12 systems). The naming of the sequences
and solvents is totally arbitrary.
xit prep --vars sq[3-6] 'w m e' [00-09] --prepare mkdir
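To make the counting explicit, the sketch below expands the three --vars arguments of
this command into all combinations. It is an illustration of the apparent semantics, not
xit's own argument parser. The pattern rules (a [first-last] range with preserved zero
padding, and space-separated alternatives) are inferred from the example, and the
function names are hypothetical.

```python
import re
from itertools import product


def expand_token(token):
    """Expand 'sq[3-6]' to ['sq3', ..., 'sq6'] and '[00-09]' to ['00', ..., '09']."""
    match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", token)
    if match is None:
        return [token]  # a plain value such as 'w'
    prefix, first, last, suffix = match.groups()
    width = len(first)  # keep the zero padding of the range, e.g. 00..09
    return [
        f"{prefix}{number:0{width}d}{suffix}"
        for number in range(int(first), int(last) + 1)
    ]


def expand_vars(args):
    """Return every combination of the values given in the arguments.

    Each argument is a space-separated list of tokens.
    """
    choices = [
        [value for token in arg.split() for value in expand_token(token)]
        for arg in args
    ]
    return list(product(*choices))


combos = expand_vars(["sq[3-6]", "w m e", "[00-09]"])
```

Here the 4 sequences, 3 solvents and 10 replica numbers give the 120 combinations
mentioned above.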
The following command will analyze the radius of gyration of all Cα atoms of the
backbone for 10 replicas (from 00 to 09) of systems of sequence 3 and 6. --nolog means
that no log files will be generated; instead, the standard output (stdout) and standard
error (stderr) will be printed to the screen directly. Without --nolog, a log file will
be generated for each replica. A very handy option for debugging purposes is --test,
which will print the commands to be executed for each replica instead of actually
executing them. It is like a dry run.
xit anal --vars 'sq3 sq6' 'w m' [00-09] --analysis rg_c_alpha --nolog
The following command will transform the results in xvg format to a proper one as
specified in prop.py, and then store them in an HDF5 file, which is specified in the
configuration file, .xitconfig.yaml. Previously transformed results can be overwritten
by appending the option --overwrite.
xit transform --vars 'sq3 sq6' 'w m' [00-09] --property rg_c_alpha \
--filetype xvg
The following command will plot the transformed rg_c_alpha results as a bar chart.
The option --grptoken path2 decides how the replicas are grouped. In this example, it
is assumed that directories at a deeper level than path2 represent the replica numbers
of a particular system, so they are grouped together and only their average is used for
plotting. The calculated values are also stored in the HDF5 file in order to reduce the
time of re-plotting, and they can be overwritten by appending the option --overwrite.
xit plot --vars 'sq3 sq6' 'w m' [00-09] --property rg_c_alpha \
--plot_type bars --grptoken path2
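The grouping idea can be illustrated with a small sketch. It assumes that path2 means "group by the first two directory levels", which is our reading of the description above; the function name group_average and the numbers are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def group_average(values_by_path, depth=2):
    """Average per-replica values over all paths sharing the first `depth` levels."""
    groups = defaultdict(list)
    for path, value in values_by_path.items():
        key = "/".join(path.split("/")[:depth])  # e.g. "w300/sq3"
        groups[key].append(value)
    return {key: mean(values) for key, values in groups.items()}


# Hypothetical per-replica values keyed by replica directory.
per_replica = {
    "w300/sq3/00": 1.0,
    "w300/sq3/01": 3.0,
    "m300/sq3/00": 2.0,
}
averages = group_average(per_replica)
```

Each key of averages would then correspond to one bar in the chart.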
The following command plots two properties, Property 1 (p1) and Property 2 (p2), on
the x and y axes, respectively, as indicated by the name of the plotmp_type, xy. The
option --overwrite works for xit plotmp as well.
xit plotmp --vars 'sq3 sq6' 'w m' [00-09] --properties p1 p2 \
--plotmp_type xy --grptoken path2 --overwrite
When it comes to a new project, new code needs to be written only if a new type of
analysis or plot is required. Otherwise, what is needed is usually just a new version of
the configuration file.
Below is part of an example configuration file in YAML format, where the text after # is
a comment.
systems:
  # variables used to identify a single replica of
  # a particular system
  var1: [sq1, sq2, sq3, sq4, sq5, sq6]
  var2: [w, m]
  var3: ['00','01','02','03','04','05','06','07','08','09']

  dir1: '{var2}300'            # dir of level 1
  dir2: '{var1}'               # dir of level 2
  dir3: '{var3}'               # dir of level 3
  id  : '{var1}{var2}{var3}'   # a unique id for each replica

data:
  repository: 'repository'     # dir containing mdp, templates, etc.
  analysis  : 'analysis'       # dir containing plain text results
  plots     : 'plots'          # dir for storing plotted figures
  log       : 'log'            # dir for storing logs

hdf5:
  title   : 'in water and methanol'
  filename: 'mono_meo.h5'

# includes another YAML file which contains configurations about
# different types of analysis
anal: !include .xitconfig_anal.yaml

# configurations for plotting
plot:
  rg_c_alpha:                  # property to be plotted
    bars:                      # name of a particular plot type
      ylabel: {ylabel: $R_g$}  # y label using LaTeX syntax
    grped_bars:                # name of a particular plot type
      grp_REs: ['w300/sq[1-6]', 'm300/sq[1-6]']
      ylabel: {ylabel: $R_g$, labelpad: 10}
      xticklabels:
        labels: ['(GVPGV)7', '(PGV)12', '(GGVGV)7',
                 '(GVGVA)7', '(GV)18', '(G)35']
        rotation: 15
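To make the role of the dir1, dir2, dir3 and id templates concrete, the sketch below fills them in for one replica using Python string formatting. Whether xit resolves the templates in exactly this way is an assumption; the function name resolve is hypothetical, and the template values are copied from the example configuration above.

```python
import posixpath

# Templates copied from the "systems" section of the example configuration.
TEMPLATES = {
    "dir1": "{var2}300",
    "dir2": "{var1}",
    "dir3": "{var3}",
    "id": "{var1}{var2}{var3}",
}


def resolve(var1, var2, var3, templates=TEMPLATES):
    """Return (id, relative path) of one replica (hypothetical helper)."""
    values = {"var1": var1, "var2": var2, "var3": var3}
    levels = [templates[name].format(**values) for name in ("dir1", "dir2", "dir3")]
    return templates["id"].format(**values), posixpath.join(*levels)


replica_id, replica_path = resolve("sq3", "w", "00")
```

Under this reading, the grp_REs patterns such as w300/sq[1-6] match the first two levels of the resolved paths.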
C.4 Future Directions
xit acts as a pipeline for setting up MD simulations and analyzing the resultant data.
As more analysis methods and plotting types are added, xit will become more complete
and more useful.
Appendix D
tprparser
tprparser is a component of MDAnalysis, a popular analysis package that supports
several widely used MD simulation packages. It parses tpr files generated by Gromacs
and extracts useful topology information.
D.1 Motivation
A tpr file contains all the information about the structural topology and run parameters
of an MD system in Gromacs. To start an MD run, Gromacs needs to extract all of this
information from the tpr file. Although a tpr file contains all the useful information, its
file structure is poorly documented, which limits its use by other MD analysis packages
such as MDAnalysis [81]. Prior to the work described in this appendix, when MDAnalysis
was used to analyze trajectories generated by Gromacs, pdb or gro files had to be used
to obtain the structural topology. However, the information contained in these files is
limited compared with that in a tpr file. For example, the charges of individual atoms
are not available in either pdb or gro files. Therefore, the community needs a tpr parser
that can interact with MDAnalysis, a need that was first raised in 2008 (see
https://code.google.com/p/mdanalysis/issues/detail?id=2 for a chronological discussion
on this topic). A workable tprparser has now been written and will be included in the
upcoming 0.8 release of MDAnalysis.
D.2 Material & Methods
MDAnalysis is written in Python, and so is tprparser. A tpr file is written in External
Data Representation (XDR) format, a standard for the description and encoding of data
that is used for transferring data between different computer architectures
(http://tools.ietf.org/html/rfc4506). In XDR, data is serialized. The full description of
the XDR format can be obtained from RFC 4506 at the above URL. In short, XDR uses
a base unit of 4 bytes to represent all items in the data. The Python module used for
encoding and decoding XDR data is called xdrlib. xdrlib was written following RFC 1014
(http://tools.ietf.org/html/rfc1014) of 1987, which was obsoleted by RFC 1832
(http://tools.ietf.org/html/rfc1832) in 1995, which was in turn obsoleted by RFC 4506
in 2006, although without technical changes. Based on trial and error and
communications with the Gromacs developers on the mailing list, it turns out that there
is no XDR version incompatibility between the tpr file and xdrlib.
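The 4-byte base unit can be illustrated with a short sketch. It uses the standard struct module instead of xdrlib so that it also runs on recent Python versions in which xdrlib is no longer shipped. The buffer is a toy example and does not reflect the actual layout of a tpr file; the helper names are our own.

```python
import struct


def pack_string(data):
    """Encode bytes as XDR: 4-byte length, then the data padded to a multiple of 4."""
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


def unpack_int(buf, pos):
    """Read one big-endian 4-byte signed integer; return (value, new position)."""
    (value,) = struct.unpack_from(">i", buf, pos)
    return value, pos + 4


def unpack_float(buf, pos):
    """Read one big-endian 4-byte IEEE float; return (value, new position)."""
    (value,) = struct.unpack_from(">f", buf, pos)
    return value, pos + 4


def unpack_string(buf, pos):
    """Read a length-prefixed string and skip its padding."""
    (length,) = struct.unpack_from(">I", buf, pos)
    start = pos + 4
    padded = (length + 3) // 4 * 4  # round up to the next multiple of 4
    return buf[start:start + length], start + padded


# Toy buffer: an integer, a string and a float, in that order.
buf = struct.pack(">i", 7) + pack_string(b"hello") + struct.pack(">f", 0.5)
pos = 0
number, pos = unpack_int(buf, pos)
text, pos = unpack_string(buf, pos)
value, pos = unpack_float(buf, pos)
```

Because every item occupies a multiple of 4 bytes, a decoder must read the items in exactly the order in which they were written, which is why the parser has to mirror the Gromacs reading routines.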
Since only the decoding of a tpr file is currently of interest, i.e. extracting useful
information from it, we will not discuss how to encode a tpr file. What needs to be
done, therefore, is to follow the structure of the Gromacs source code, written in C,
figure out how it decodes a tpr file, and then follow the same routines in Python. Once
all the necessary information has been decoded, it needs to be organized in a form that
can be used by MDAnalysis.
D.3 Results
The structure of the tpr file turns out to be quite convoluted, which may be why no
parser had been written for it five years after the need was first raised. What makes
the routines even more difficult to follow is that the structure of the tpr file changes
with every major Gromacs release. This can be seen in the numerous "if ... elif ...
else ..." structures in the source code, which result from the effort to keep Gromacs
backward compatible.
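The kind of version-dependent branching described above can be illustrated with a toy example. The sketch below does not reproduce the actual tpr layout: the version numbers and the field names natoms and ngroups are hypothetical, chosen only to show why each supported file version adds another branch to the reader.

```python
import struct


def read_header(buf):
    """Decode a made-up header whose layout depends on a leading version number."""
    pos = 0
    (version,) = struct.unpack_from(">i", buf, pos)
    pos += 4
    header = {"version": version}
    if version >= 3:
        # Hypothetical newer layout: two integers follow the version.
        header["natoms"], header["ngroups"] = struct.unpack_from(">ii", buf, pos)
        pos += 8
    elif version == 2:
        # Hypothetical older layout: one integer; the other field gets a default.
        (header["natoms"],) = struct.unpack_from(">i", buf, pos)
        pos += 4
        header["ngroups"] = 1
    else:
        raise ValueError(f"unsupported file version: {version}")
    return header, pos


new_header, new_end = read_header(struct.pack(">iii", 3, 100, 4))
old_header, old_end = read_header(struct.pack(">ii", 2, 50))
```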
Because the run parameters are rarely used when analyzing an MD trajectory, the
current version of tprparser only extracts the structural information, which includes
atoms (number, name, type, resname, resid, segid, mass, charge, residue, segment, radius,
bfactor, resnum), bonds, angles, dihedral angles and improper dihedral angles. According
to Oliver Beckstein, one of the major developers and maintainers of MDAnalysis,
tprparser is still the only available parser for reading tpr files written in pure Python
as of this writing.
D.4 Discussion
The major advantage of writing analysis tools using MDAnalysis in Python over using
the Gromacs template in C is that it is generally easier and faster to program in a
high-level language like Python than in a low-level language like C. However, there are
also disadvantages. For example, regular Python code is much slower than C code that
does the same job. To speed up the process, the bottleneck part of the slow Python code
may need to be compiled into binary and then called from Python.
Currently, tprparser is capable of parsing tpr files generated by Gromacs versions
4.0.x–4.6.x. It is currently only available in the develop branch of MDAnalysis
(https://code.google.com/p/mdanalysis/source/list?name=develop), but will be included
in the upcoming 0.8 release.
Keeping up with the release of new tpr structures is probably the major challenge for
the future maintenance of tprparser. Ideally, the Gromacs developers could develop a
stable structure of the tpr file and have it well documented.
Bibliography
[1] B. B. Aaron and J. M. Gosline. Optical properties of single elastin fibres indicate
random protein conformation. Nature, 287:865–867, 1980.
[2] J. L. Abascal and C. Vega. A general purpose model for the condensed phases
of water: TIP4P/2005. The Journal of Chemical Physics, 123(23):234505–234512,
2005.
[3] M. P. Allen and D. J. Tildesley. Molecular Dynamics. In Computer Simulation of
Liquids, chapter 3, pages 71–109. Oxford University Press, 1989.
[4] J. Alper. Stretching the limits. Science, 297(5580):329–331, 2002.
[5] A. L. Andrady and J. E. Mark. Thermoelasticity of swollen elastin networks at
constant composition. Biopolymers, 19(4):849–855, 1980.
[6] M. Baer, E. Schreiner, A. Kohlmeyer, R. Rousseau, and D. Marx. Inverse tem-
perature transition of a biomimetic elastin model: Reactive flux analysis of fold-
ing/unfolding and its coupling to solvent dielectric relaxation. The Journal of
Physical Chemistry B, 110(8):3576–3587, 2006.
[7] C. Baldock, A. F. Oberhauser, L. Ma, D. Lammie, V. Siegler, S. M. Mithieux,
Y. Tu, J. Y. H. Chow, F. Suleman, M. Malfois, S. Rogers, L. Guo, T. C. Irving, T. J.
Wess, and A. S. Weiss. Shape of tropoelastin, the highly extensible protein that
controls human tissue elasticity. Proceedings of the National Academy of Sciences
of the United States of America, 108(11):4322–4327, 2011.
[8] K. A. Beauchamp, Y.-S. Lin, R. Das, and V. S. Pande. Are protein force fields
getting better? A systematic benchmark on 524 diverse NMR measurements. Journal of
Chemical Theory and Computation, 8(4):1409–1414, 2012.
[9] O. M. Becker, A. D. MacKerell, Jr., B. Roux, and M. Watanabe. Computational
Biochemistry and Biophysics. CRC Press, 2001.
[10] C. M. Bellingham, M. A. Lillie, J. M. Gosline, G. M. Wright, B. C. Starcher,
A. J. Bailey, K. A. Woodhouse, and F. W. Keeley. Recombinant human elastin
polypeptides self-assemble into biomaterials with elastin-like properties. Biopoly-
mers, 70(4):445–455, 2003.
[11] C. M. Bellingham, K. A. Woodhouse, P. Robson, S. J. Rothstein, and F. W. Keeley.
Self-aggregation characteristics of recombinantly expressed human elastin polypep-
tides. Biochimica et Biophysica Acta, 1550(1):6–19, 2001.
[12] H. J. C. Berendsen, D. van der Spoel, and R. van Drunen. GROMACS: A message-
passing parallel molecular dynamics implementation. Computer Physics Commu-
nications, 91(1-3):43–56, 1995.
[13] R. B. Best and J. Mittal. Protein simulations with an optimized water model:
cooperative helix formation and temperature-induced unfolded state collapse. The
Journal of Physical Chemistry B, 114(46):14916–14923, 2010.
[14] R. B. Best, D. de Sancho, and J. Mittal. Residue-specific α-helix propensities from
molecular simulation. Biophysical Journal, 102(6):1462–1467, 2012.
[15] R. B. Best and G. Hummer. Optimized molecular dynamics force fields applied
to the helix-coil transition of polypeptides. The Journal of Physical Chemistry B,
113(26):9004–9015, 2009.
[16] R. B. Best, X. Zhu, J. Shim, P. E. M. Lopes, J. Mittal, M. Feig, and A. D. Mackerell,
Jr. Optimization of the additive CHARMM all-atom protein force field targeting
improved sampling of the backbone φ, ψ and side-chain χ1 and χ2 dihedral angles.
Journal of Chemical Theory and Computation, 8(9):3257–3273, 2012.
[17] B. Bochicchio, A. Pepe, and A. M. Tamburro. Investigating by CD the molecular
mechanism of elasticity of elastomeric proteins. Chirality, 20(9):985–994, 2008.
[18] S. L. Brazee and E. Carrington. Interspecific comparison of the mechanical prop-
erties of mussel byssus. The Biological Bulletin, 211(3):263–274, 2006.
[19] B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan, and
M. Karplus. CHARMM: A program for macromolecular energy, minimization, and
dynamics calculations. Journal of Computational Chemistry, 4(2):187–217, 1983.
[20] J. E. Castle, A. M. Salvi, R. Flamia, and G. Satriano. Surface science aspects
of supramolecular conformation in elastin-like polypeptides. Surface and Interface
Analysis, 44(2):246–257, 2012.
[21] D. K. Chang and D. W. Urry. Polypentapeptide of elastin: Damping of internal
chain dynamics on extension. Journal of Computational Chemistry, 10(6):850–855,
1989.
[22] H. Chung, T. Y. Kim, and S. Y. Lee. Recent advances in production of recombinant
spider silk proteins. Current Opinion in Biotechnology, 23(6):957–964, 2012.
[23] M. I. S. Chung, M. Miao, R. J. Stahl, E. Chan, J. Parkinson, and F. W. Keeley.
Sequences and domain structures of mammalian, avian, amphibian and teleost
tropoelastins: Clues to the evolutionary history of elastins. Matrix Biology,
25(8):492–504, 2006.
[24] J. T. Cirulis. Self-Assembly and Fibre Formation of Elastin-Like Polypeptides.
PhD thesis, 2009.
[25] W. D. Cornell, P. Cieplak, C. I. Bayly, I. R. Gould, K. M. Merz, D. M. Ferguson,
D. C. Spellmeyer, T. Fox, J. W. Caldwell, and P. A. Kollman. A Second Generation
Force Field for the Simulation of Proteins, Nucleic Acids, and Organic Molecules.
Journal of the American Chemical Society, 117(19):5179–5197, 1995.
[26] T. Darden, D. York, and L. Pedersen. Particle mesh Ewald: An N·log(N) method
for Ewald sums in large systems. The Journal of Chemical Physics,
98(12):10089–10092, 1993.
[27] K. A. Dill. Dominant Forces in Protein Folding. Biochemistry, 29(31):7133–7155,
1990.
[28] C. M. Dobson. Protein misfolding, evolution and disease. Trends in Biochemical
Sciences, 24(9):329–332, 1999.
[29] C. M. Dobson. Protein folding and misfolding. Nature, 426(6968):884–890, 2003.
[30] K. L. Dorrington and N. G. McCrum. Elastin as a rubber. Biopolymers, 16(6):1201–
1222, 1977.
[31] Y. Duan, C. Wu, S. Chowdhury, M. C. Lee, G. Xiong, W. Zhang, R. Yang,
P. Cieplak, R. Luo, T. Lee, J. Caldwell, J. Wang, and P. Kollman. A point-charge
force field for molecular mechanics simulations of proteins based on condensed-phase
quantum mechanical calculations. Journal of Computational Chemistry,
24(16):1999–2012, 2003.
[32] C. M. Elvin, A. G. Carr, M. G. Huson, J. M. Maxwell, R. D. Pearson, T. Vuo-
colo, N. E. Liyou, D. C. C. Wong, D. J. Merritt, and N. E. Dixon. Synthesis
and properties of crosslinked recombinant pro-resilin. Nature, 437(7061):999–1002,
2005.
[33] U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee, and L. G. Peder-
sen. A smooth particle mesh Ewald method. The Journal of Chemical Physics,
103(19):8577–8593, 1995.
[34] R. Flamia, G. Lanza, A. M. Salvi, J. E. Castle, and A. M. Tamburro. Conforma-
tional study and hydrogen bonds detection on elastin-related polypeptides using
X-ray photoelectron spectroscopy. Biomacromolecules, 6(3):1299–1309, 2005.
[35] R. Flamia, A. M. Salvi, L. D’Alessio, J. E. Castle, and A. M. Tamburro.
Transformation of amyloid-like fibers, formed from an elastin-based biopolymer, into a
hydrogel: an X-ray photoelectron spectroscopy and atomic force microscopy study.
Biomacromolecules, 8(1):128–138, 2007.
[36] R. Flamia, P. A. Zhdan, M. Martino, J. E. Castle, and A. M. Tamburro. AFM
study of the elastin-like biopolymer poly(ValGlyGlyValGly). Biomacromolecules,
5(4):1511–1518, 2004.
[37] N. Floquet, S. Hery-Huynh, M. Dauchez, P. Derreumaux, A. M. Tamburro, and
A. J. P. Alix. Structural characterization of VGVAPG, an elastin-derived peptide.
Biopolymers, 76(3):266–280, 2004.
[38] D. Frenkel and B. Smit. Molecular Dynamics Simulations. In Understanding
Molecular Simulation (Second Edition): From Algorithms to Applications, chapter 4,
pages 63–107. Academic Press, 2nd edition, 2002.
[39] A. E. García and K. Y. Sanbonmatsu. α-helical stabilization by side chain shielding
of backbone hydrogen bonds. Proceedings of the National Academy of Sciences of
the United States of America, 99(5):2782–2787, 2002.
[40] R. Glaves, M. Baer, E. Schreiner, R. Stoll, and D. Marx. Conformational dynam-
ics of minimal elastin-like polypeptides: the role of proline revealed by molecular
dynamics and nuclear magnetic resonance. ChemPhysChem, 9(18):2759–2765, 2008.
[41] J. Gosline, M. Lillie, E. Carrington, P. Guerette, C. Ortlepp, and K. Savage. Elastic
proteins: biological roles and mechanical properties. Philosophical Transactions
of the Royal Society of London. Series B, Biological Sciences, 357(1418):121–132,
2002.
[42] J. M. Gosline. Hydrophobic interaction and a model for the elasticity of elastin.
Biopolymers, 17(3):677–695, 1978.
[43] J. M. Gosline, F. F. Yew, and T. Weis-Fogh. Reversible structural changes in a
hydrophobic protein, elastin, as indicated by fluorescence probe analysis. Biopoly-
mers, 14(9):1811–1826, 1975.
[44] W. R. Gray, L. B. Sandberg, and J. A. Foster. Molecular model for elastin structure
and function. Nature, 246(5434):461–466, 1973.
[45] S. C. Harvey, R. K.-Z. Tan, and T. E. Cheatham III. The flying ice cube: velocity
rescaling in molecular dynamics leads to violation of energy equipartition. Journal
of Computational Chemistry, 19(7):726–740, 1998.
[46] B. Hess. P-LINCS: A parallel linear constraint solver for molecular simulation.
Journal of Chemical Theory and Computation, 4(1):116–122, 2008.
[47] B. Hess, H. Bekker, H. J. C. Berendsen, and J. G. E. M. Fraaije. LINCS: A linear
constraint solver for molecular simulations. Journal of Computational Chemistry,
18(12):1463–1472, 1997.
[48] B. Hess, C. Kutzner, D. van der Spoel, and E. Lindahl. GROMACS 4: Algorithms
for highly efficient, load-balanced, and scalable molecular simulation. Journal of
Chemical Theory and Computation, 4(3):435–447, 2008.
[49] C. A. J. Hoeve and P. J. Flory. The elastic properties of elastin. Biopolymers,
13(4):677–686, 1974.
[50] W. G. Hoover. Canonical dynamics: Equilibrium phase-space distributions. Phys-
ical Review A, 31(3):1695–1697, 1985.
[51] V. Hornak, R. Abel, A. Okur, B. Strockbine, A. Roitberg, and C. Simmerling. Com-
parison of multiple Amber force fields and development of improved protein back-
bone parameters. Proteins: Structure, Function, and Bioinformatics, 65(3):712–
725, 2006.
[52] Human tropoelastin sequence. http://www.uniprot.org/uniprot/P15502.
[53] W. Humphrey, A. Dalke, and K. Schulten. VMD: visual molecular dynamics.
Journal of Molecular Graphics, 14(1):33–38, 1996.
[54] J. D. Hunter. Matplotlib: A 2D graphics environment. Computing in Science &
Engineering, 9(3):90–95, 2007.
[55] S. Hwang, Q. Shao, H. Williams, C. Hilty, and Y. Q. Gao. Methanol Strengthens
Hydrogen Bonds and Weakens Hydrophobic Interactions in Proteins: A Combined
Molecular Dynamics and NMR study. The Journal of Physical Chemistry B,
115(20):6653–6660, 2011.
[56] A. Jabs, M. S. Weiss, and R. Hilgenfeld. Non-proline cis peptide bonds in proteins.
Journal of Molecular Biology, 286(1):291–304, 1999.
[57] W. L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey, and M. L. Klein.
Comparison of simple potential functions for simulating liquid water. The Journal
of Chemical Physics, 79(2):926–935, 1983.
[58] W. L. Jorgensen, D. S. Maxwell, and J. Tirado-Rives. Development and testing
of the OPLS all-atom force field on conformational energetics and properties of
organic liquids. Journal of the American Chemical Society, 118(45):11225–11236,
1996.
[59] W. L. Jorgensen and J. Tirado-Rives. The OPLS potential functions for proteins,
energy minimizations for crystals of cyclic peptides and crambin. Journal of the
American Chemical Society, 110(6):1657–1666, 1988.
[60] W. Kabsch and C. Sander. Dictionary of protein secondary structure: pat-
tern recognition of hydrogen-bonded and geometrical features. Biopolymers,
22(12):2577–2637, 1983.
[61] G. A. Kaminski, R. A. Friesner, J. Tirado-Rives, and W. L. Jorgensen. Evaluation
and reparametrization of the OPLS-AA force field for proteins via comparison
with accurate quantum chemical calculations on peptides. The Journal of Physical
Chemistry B, 105(28):6474–6487, 2001.
[62] R. G. Kirste, W. A. Kruse, and K. Ibel. Determination of the conformation of
polymers in the amorphous solid state and in concentrated solution by neutron
diffraction. Polymer, 16(2):120–124, 1975.
[63] P. Kollman, R. Dixon, W. Cornell, T. Fox, C. Chipot, and A. Pohorille. The de-
velopment/application of the minimalist organic/biochemical molecular mechanic
force field using a combination of ab initio calculations and experimental data. In
W. van Gunsteren, P. Weiner, and A. Wilkinson, editors, Computer Simulation of
Biomolecular Systems: Theoretical and Experimental Application Vol. 3. Springer,
1997.
[64] D. B. Kony, P. H. Hünenberger, and W. F. van Gunsteren. Molecular dynamics
simulations of the native and partially folded states of ubiquitin: influence of
methanol cosolvent, pH, and temperature on the protein structure and dynamics.
Protein Science, 16(6):1101–1118, 2007.
[65] J. Kyte and R. F. Doolittle. A simple method for displaying the hydropathic
character of a protein. Journal of Molecular Biology, 157(1):105–132, 1982.
[66] B. Li, D. O. V. Alonso, B. J. Bennion, and V. Daggett. Hydrophobic hydration
is an important source of elasticity in elastin-based biopolymers. Journal of the
American Chemical Society, 123(48):11991–11998, 2001.
[67] B. Li, D. O. V. Alonso, and V. Daggett. The molecular basis for the inverse
temperature transition of elastin. Journal of Molecular Biology, 305(3):581–592,
2001.
[68] B. Li and V. Daggett. Molecular basis for the extensibility of elastin. Journal of
Muscle Research and Cell Motility, 23(5-6):561–573, 2002.
[69] D.-W. Li and R. Brüschweiler. NMR-Based Protein Potentials. Angewandte
Chemie, 122(38):6930–6932, 2010.
[70] M. A. Lillie, G. J. David, and J. M. Gosline. Mechanical role of elastin-associated
microfibrils in pig aortic elastic tissue. Connective Tissue Research, 37(1-2):121–
141, 1998.
[71] K. Lindorff-Larsen, P. Maragakis, S. Piana, M. P. Eastwood, R. O. Dror, and D. E.
Shaw. Systematic validation of protein force fields against experimental data. PLoS
ONE, 7(2):e32131, 2012.
[72] K. Lindorff-Larsen, S. Piana, K. Palmo, P. Maragakis, J. L. Klepeis, R. O. Dror,
and D. E. Shaw. Improved side-chain torsion potentials for the Amber ff99SB
protein force field. Proteins, 78(8):1950–1958, 2010.
[73] A. Luzar and D. Chandler. Effect of environment on hydrogen bond dynamics in
liquid water. Physical Review Letters, 76(6):928–931, 1996.
[74] A. D. MacKerell Jr., D. Bashford, M. Bellott, R. L. Dunbrack, J. D. Evanseck,
M. J. Field, S. Fischer, J. Gao, H. Guo, S. Ha, D. Joseph-McCarthy, L. Kuchnir,
K. Kuczera, F. T. K. Lau, C. Mattos, S. Michnick, T. Ngo, D. T. Nguyen,
B. Prodhom, W. E. Reiher, III, B. Roux, M. Schlenkrich, J. C. Smith, R. Stote,
J. Straub, M. Watanabe, J. Wiorkiewicz-Kuczera, D. Yin, and M. Karplus. All-
atom empirical potential for molecular modeling and dynamics studies of proteins.
The Journal of Physical Chemistry B, 102(18):3586–3616, 1998.
[75] A. D. Mackerell, Jr., M. Feig, and C. L. Brooks III. Extending the treatment of
backbone energetics in protein force fields: limitations of gas-phase quantum me-
chanics in reproducing protein conformational distributions in molecular dynamics
simulations. Journal of Computational Chemistry, 25(11):1400–1415, 2004.
[76] S. J. Marrink, A. H. de Vries, and A. E. Mark. Coarse grained model for semiquan-
titative lipid simulations. The Journal of Physical Chemistry B, 108(2):750–760,
2004.
[77] S. J. Marrink, H. J. Risselada, S. Yefimov, D. P. Tieleman, and A. H. de Vries.
The MARTINI force field: coarse grained model for biomolecular simulations. The
Journal of Physical Chemistry B, 111(27):7812–7824, 2007.
[78] R. P. Mecham. Methods in elastic tissue biology: elastin isolation and purification.
Methods, 45(1):32–41, 2008.
[79] M. Miao, C. M. Bellingham, R. J. Stahl, E. E. Sitarz, C. J. Lane, and F. W.
Keeley. Sequence and structure determinants for the self-aggregation of recombi-
nant polypeptides modeled after human elastin. Journal of Biological Chemistry,
278(49):48553–48562, 2003.
[80] M. Miao, J. T. Cirulis, S. Lee, and F. W. Keeley. Structural determinants of
cross-linking and hydrophobic domains for self-assembly of elastin-like polypep-
tides. Biochemistry, 44(43):14367–14375, 2005.
[81] N. Michaud-Agrawal, E. J. Denning, T. B. Woolf, and O. Beckstein. MDAnalysis:
A toolkit for the analysis of molecular dynamics simulations. Journal of Compu-
tational Chemistry, 32(10):2319–2327, 2011.
[82] F. Mistrali, D. Volpin, G. B. Garibaldo, and A. Ciferri. Thermodynamics of elas-
ticity in open systems. Elastin. The Journal of Physical Chemistry, 75(1):142–149,
1971.
[83] S. M. Mithieux and A. S. Weiss. Elastin. Advances in Protein Chemistry, 70:437–
461, 2005.
[84] L. Monticelli, S. K. Kandasamy, X. Periole, R. G. Larson, D. P. Tieleman, and S.-J.
Marrink. The MARTINI coarse-grained force field: extension to proteins. Journal
of Chemical Theory and Computation, 4(5):819–834, 2008.
[85] L. D. Muiznieks, A. S. Weiss, and F. W. Keeley. Structural disorder and dynamics
of elastin. Biochemistry and Cell Biology, 88(2):239–250, 2010.
[86] C. Neale, W. D. Bennett, D. P. Tieleman, and R. Pomes. Statistical Convergence
of Equilibrium Properties in Simulations of Molecular Solutes Embedded in Lipid
Bilayers. Journal of Chemical Theory and Computation, 7(12):4175–4188, 2011.
[87] P. S. Nerenberg and T. Head-Gordon. Optimizing Protein-Solvent Force Fields
to Reproduce Intrinsic Conformational Preferences of Model Peptides. Journal of
Chemical Theory and Computation, 7(4):1220–1230, 2011.
[88] S. Nosé. A unified formulation of the constant temperature molecular dynamics
methods. The Journal of Chemical Physics, 81(1):511–519, 1984.
[89] D. Pal and P. Chakrabarti. Cis peptide bonds in proteins: residues involved,
their conformations, interactions and locations. Journal of Molecular Biology,
294(1):271–288, 1999.
[90] R. V. Pappu, X. Wang, A. Vitalis, and S. L. Crick. A polymer physics perspective
on driving forces and mechanisms for protein aggregation. Archives of Biochemistry
and Biophysics, 469(1):132–141, 2008.
[91] M. Parrinello and A. Rahman. Polymorphic transitions in single crystals: A new
molecular dynamics method. Journal of Applied Physics, 52(12):7182–7190, 1981.
[92] S. M. Partridge. Isolation and Characterization of Elastin. In E. A. Balazs, editor,
Chemistry and Molecular Biology of the Intercellular Matrix, volume 1, pages 593–
616. Academic Press, London, 1970.
[93] A. T. Petkova, W.-M. Yau, and R. Tycko. Experimental Constraints on Quaternary
Structure in Alzheimer's β-Amyloid Fibrils. Biochemistry, 45(2):498–512, 2006.
[94] B. M. Pettitt and M. Karplus. Role of electrostatics in the structure, energy and
dynamics of biomolecules: a model study of N-methylalanylacetamide. Journal of
the American Chemical Society, 107(5):1166–1173, 1985.
[95] S. Piana, K. Lindorff-Larsen, and D. E. Shaw. How robust are protein folding
simulations with respect to force field parameterization? Biophysical Journal,
100(9):L47–L49, 2011.
[96] M. S. Pometun, E. Y. Chekmenev, and R. J. Wittebort. Quantitative observation of
backbone disorder in native elastin. Journal of Biological Chemistry, 279(9):7982–
7987, 2004.
[97] S. Rauscher. Protein Non-Folding : A Molecular Simulation Study of the Structure
and Self-Aggregation of Elastin. PhD thesis, University of Toronto, 2011.
[98] S. Rauscher, S. Baud, M. Miao, F. W. Keeley, and R. Pomes. Proline and glycine
control protein self-organization into elastomeric or amyloid fibrils. Structure,
14(11):1667–1676, 2006.
[99] S. Rauscher, C. Neale, and R. Pomes. Simulated tempering distributed replica sam-
pling, virtual replica exchange, and other generalized-ensemble methods for con-
formational sampling. Journal of Chemical Theory and Computation, 5(10):2640–
2662, 2009.
[100] S. Rauscher and R. Pomes. Molecular simulations of protein disorder. Biochemistry
and Cell Biology, 88(2):269–290, 2010.
[101] S. Rauscher and R. Pomes. Structural Disorder and Protein Elasticity. In Fuzziness,
pages 159–183. 2012.
[102] W. Reiher. Theoretical studies of hydrogen bonding. PhD thesis, 1985.
[103] P. J. Rossky, M. Karplus, and A. Rahman. A model for the simulation of an
aqueous dipeptide solution. Biopolymers, 18(4):825–854, 1979.
[104] M. Rubinstein and R. H. Colby. Polymer Physics. Oxford University Press, 2003.
[105] A. M. Salvi, P. Moscarelli, B. Bochicchio, G. Lanza, and J. E. Castle. Combined
effects of solvation and aggregation propensity on the final supramolecular struc-
tures adopted by hydrophobic, glycine-rich, elastin-like polypeptides. Biopolymers,
99(5):292–313, 2013.
[106] M. S. Searle, R. Zerella, D. H. Williams, and L. C. Packman. Native-like hairpin
structure in an isolated fragment from ferredoxin: NMR and CD studies of solvent
effects on the N-terminal 20 residues. Protein Engineering, 9(7):559–565, 1996.
[107] M. Seo, S. Rauscher, R. Pomes, and D. P. Tieleman. Improving Internal Peptide
Dynamics in the Coarse-Grained MARTINI Model: Toward Large-Scale Simu-
lations of Amyloid- and Elastin-like Peptides. Journal of Chemical Theory and
Computation, 8(5):1774–1785, 2012.
[108] E. J. Sorin and V. S. Pande. Exploring the helix-coil transition via all-atom equi-
librium ensemble simulations. Biophysical Journal, 88(4):2472–2493, 2005.
[109] Y. Sugita and Y. Okamoto. Replica-exchange molecular dynamics method for
protein folding. Chemical Physics Letters, 314(1-2):141–151, 1999.
[110] D. A. Torchia and K. A. Piez. Mobility of elastin chains as determined by 13C
nuclear magnetic resonance. Journal of Molecular Biology, 76(3):419–424, 1973.
[111] G. M. Torrie and J. P. Valleau. Nonphysical sampling distributions in Monte Carlo
free-energy estimation: Umbrella sampling. Journal of Computational Physics,
23(2):187–199, 1977.
[112] J. Uitto. Biochemistry of the elastic fibers in normal connective tissues and its
alterations in diseases. Journal of Investigative Dermatology, 72(1):1–10, 1979.
[113] H. C. Urey and C. A. Bradley, Jr. The Vibrations of Pentatonic Tetrahedral
Molecules. Physical Review, 38(11):1969–1978, 1931.
[114] D. W. Urry and C. M. Venkatachalam. A librational entropy mechanism for elas-
tomers with repeating peptide sequences in helical array. International Journal of
Quantum Chemistry, 24(S10):81–93, 1983.
[115] M. B. van Eldijk, C. L. Mcgann, K. L. Kiick, and J. C. M. van Hest. Elastomeric
polypeptides. Topics in Current Chemistry, 310:71–116, 2012.
[116] K. Vanommeslaeghe, E. Hatcher, C. Acharya, S. Kundu, S. Zhong, J. Shim, E. Dar-
ian, O. Guvench, P. Lopes, I. Vorobyov, and A. D. MacKerell, Jr. CHARMM gen-
eral force field: A force field for druglike molecules compatible with the CHARMM
all-atom additive biological force fields. Journal of Computational Chemistry,
31(4):671–690, 2010.
[117] K. Vanommeslaeghe and A. D. Mackerell, Jr. Automation of the CHARMM Gen-
eral Force Field (CGenFF) I: bond perception and atom typing. Journal of Chem-
ical Information and Modeling, 52(12):3144–3154, 2012.
[118] K. Vanommeslaeghe, E. P. Raman, and A. D. MacKerell, Jr. Automation of
the CHARMM General Force Field (CGenFF) II: assignment of bonded param-
eters and partial atomic charges. Journal of Chemical Information and Modeling,
52(12):3155–3168, 2012.
[119] C. M. Venkatachalam and D. W. Urry. Development of a linear helical confor-
mation from its cyclic correlate. β-Spiral model of the elastin poly(pentapeptide)
(VPGVG)n. Macromolecules, 14(5):1225–1229, 1981.
[120] S. Vieth, C. M. Bellingham, F. W. Keeley, S. M. Hodge, and D. Rousseau. Mi-
crostructural and tensile properties of elastin-based polypeptides crosslinked with
Genipin and pyrroloquinoline quinone. Biopolymers, 85(3):199–206, 2007.
[121] D. Volpin and A. Ciferri. Thermoelasticity of elastin. Nature, 225(5230):382–382,
1970.
[122] B. Vrhovski, S. Jensen, and A. S. Weiss. Coacervation characteristics of recombi-
nant human tropoelastin. European Journal of Biochemistry, 250(1):92–98, 1997.
[123] B. Vrhovski and A. S. Weiss. Biochemistry of tropoelastin. European Journal of
Biochemistry, 258(1):1–18, 1998.
[124] J. Wang, P. Cieplak, and P. A. Kollman. How well does a restrained electrostatic
potential (RESP) model perform in calculating conformational energies of organic
and biological molecules? Journal of Computational Chemistry, 21(12):1049–1074,
2000.
[125] Z. R. Wasserman and F. R. Salemme. A molecular dynamics investigation of the
elastomeric restoring force in elastin. Biopolymers, 29(12-13):1613–1631, 1990.
[126] P. K. Weiner and P. A. Kollman. AMBER: Assisted model building with energy re-
finement. A general program for modeling molecules and their interactions. Journal
of Computational Chemistry, 2(3):287–303, 1981.
[127] S. J. Weiner, P. A. Kollman, D. A. Case, U. C. Singh, C. Ghio, G. Alagona, S. Pro-
feta, Jr., and P. Weiner. A new force field for molecular mechanical simulation of
nucleic acids and proteins. Journal of the American Chemical Society, 106(3):765–
784, 1984.
[128] S. J. Weiner, P. A. Kollman, D. T. Nguyen, and D. A. Case. An all atom force field
for simulations of proteins and nucleic acids. Journal of Computational Chemistry,
7(2):230–252, 1986.
[129] T. Weis-Fogh and S. O. Andersen. New Molecular Model for the Long-range Elas-
ticity of Elastin. Nature, 227(5259):718–721, 1970.
[130] E. Wöhlisch. Static-kinetic theory, thermodynamics and biological significance of
caoutchouc type elasticity. Kolloid-Z, 89:239–271, 1939.