Evidence from mixed hydrate nucleation for a funnel model of crystallization

Kyle Wm. Hallᵃ, Sheelagh Carpendaleᵇ, and Peter G. Kusalikᵃ,¹

ᵃDepartment of Chemistry, University of Calgary, Calgary, AB, Canada T2N 1N4; and ᵇDepartment of Computer Science, University of Calgary, Calgary, AB, Canada T2N 1N4

Edited by Michael L. Klein, Temple University, Philadelphia, PA, and approved September 2, 2016 (received for review July 5, 2016)

The molecular-level details of crystallization remain unclear for many systems. Previous work has speculated on the phenomenological similarities between molecular crystallization and protein folding. Here we demonstrate that molecular crystallization can involve funnel-shaped potential energy landscapes through a detailed analysis of mixed gas hydrate nucleation, a prototypical multicomponent crystallization process. Through this, we contribute both: (i) a powerful conceptual framework for exploring and rationalizing molecular crystallization, and (ii) an explanation of phenomenological similarities between protein folding and crystallization. Such funnel-shaped potential energy landscapes may be typical of broad classes of molecular ordering processes, and can provide a new perspective for both studying and understanding these processes.

nucleation | gas clathrate hydrates | potential energy landscapes | crystallization funnel | molecular dynamics simulation

Molecular crystallization and its inhibition are important to a broad range of fields. For example, some organisms [such as Antarctic fish (1) and winter rye (2)] have developed a rich chemistry of antifreeze proteins to control internal freezing, and there is significant interest in exploiting antifreeze proteins for food applications (e.g., see ref. 3). Gas hydrate formation in oil and gas pipelines is a major industrial concern (4). For pharmaceuticals, there is much interest in understanding and controlling crystal polymorphism (e.g., see ref. 5). A better understanding of molecular crystallization, and factors influencing these processes, has potential to aid further advancements in such fields. As highlighted by a recent review on crystallization (6), traditional theoretical models of crystallization (e.g., classical nucleation theory) have proven to be problematic for a variety of systems and there remain technical challenges to studying crystallization both experimentally and computationally, so a clear understanding of crystal nucleation has yet to emerge.

Molecular crystallization is one of the major classes of molecular ordering processes. Other molecular ordering processes include micelle formation, the formation of coordination polymers, and protein folding. Previous work has asserted, although not substantiated, that crystallization and protein folding are somehow similar processes. For example, protein folding has been speculatively described as a first-order phase transition similar to liquid–solid transitions (7). It has also been proposed that both the water–ice transition and protein folding are difficult to study in silico because both are complex searches for relatively few ordered structures among numerous disordered alternative structures (8). The aim of this study is twofold: (i) to provide a conceptual description of molecular crystallization (simply referred to as crystallization henceforth), and (ii) to provide an explanation of the apparent similarities between crystallization and protein folding. On the basis of extensive simulations of the nucleation of a gas hydrate (i.e., a molecular solid) and detailed analysis of these simulations, it is herein demonstrated that hydrate nucleation involves funnel-shaped potential energy landscapes akin to those associated with protein folding, and it is proposed that funnel-shaped potential energy landscapes may be characteristic of other crystallization processes (e.g., ice nucleation). Funnel-shaped potential energy landscapes afford a conceptual framework for understanding crystallization.

Protein folding has been very successfully characterized in terms of funnel-shaped energy landscapes (i.e., the funnel model of protein folding) (9–11). For a protein, there are many high-energy conformations corresponding to unfolded structures, and relatively few low-energy conformations corresponding to native-like structures. The transition between these two regimes (i.e., protein folding) involves ordering in stages (e.g., creation of internal hydrogen bonds and the development of hydrophobic cores) so that there is a narrowing in the number of accessible conformations with decreasing system energy (10). Therefore, protein folding tends to involve funnel-shaped potential energy landscapes and ordering in stages. Free-energy barriers to protein folding can arise when decreases in energy associated with a protein moving deeper in the funnel are not sufficient to compensate for the entropic losses experienced by the system (9). Such a free-energy barrier can be visualized as the rapid narrowing in the width of the funnel such that the regions contiguous to the hole leading further down the funnel are essentially flat. An individual copy of a protein must spend time traversing this relatively energetically flat region of configuration space until it finds one of the comparatively few configurations through which it can proceed further down the funnel (10), i.e., the protein’s search is akin to a random walker. An individual copy of a protein will stochastically overcome such barriers. There are multiple pathways (i.e., routes) to the bottom of the funnel (9–11), such that different denatured copies of a protein can pass through different sets of microstates (i.e., specific arrangements) as they reconfigure to the protein’s native structure (10). The sides of the

Significance

Molecular crystallization, and its promotion and inhibition, is important across a broad range of fields. Previous work has asserted, although not substantiated, that crystallization and protein folding are somehow similar processes. Here, through detailed analysis of gas hydrate nucleation, we demonstrate that hydrate nucleation involves funnel-shaped potential energy landscapes, and put forth a funnel model of crystallization. This funnel model of crystallization provides a lens for exploring and understanding crystallization and potentially other ordering processes. We show that the phenomenological similarities between crystallization and protein folding result from underlying similarities between essential features of these microscopic ordering processes. This work thus provides a key connection between these two important but disparate research domains, thereby enabling knowledge advancement in both.

Author contributions: K.W.H. and P.G.K. designed research; K.W.H. performed research; K.W.H., S.C., and P.G.K. analyzed data; and K.W.H., S.C., and P.G.K. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

¹To whom correspondence should be addressed. Email: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1610437113/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1610437113 PNAS | October 25, 2016 | vol. 113 | no. 43 | 12041–12046

funnel can be rugged with local energy minima that can kinetically trap individual copies of the protein in nonnative structures (9, 10). Work with heteropolymers (i.e., model systems for protein folding) has revealed that kinetic traps can delay folding to native structures such that pathways involving kinetic traps have different kinetics compared with throughway pathways that proceed directly to native structures (12), as is illustrated by the moat funnel landscape in ref. 10. Thus, the role of multiple pathways and kinetic traps is important for understanding the kinetics of protein folding. In summary, three key characteristics of the funnel model of protein folding are (i) funnel-shaped potential energy landscapes and ordering in stages; (ii) barriers and stochasticity; and (iii) multiple pathways and kinetic traps.

To explore the possible connections between protein folding and crystallization, gas hydrate nucleation was studied. Gas hydrates are crystalline structures where water molecules form host lattices of polyhedral hydrogen bond cages that are occupied by small gas molecules (e.g., methane). Understanding hydrate formation is a pressing industrial issue. For example, hydrates often form plugs in oil and gas pipelines, and are a major concern at both onshore and offshore facilities (4). Chemical inhibitors are one of the key strategies for preventing hydrate formation; however, chemical inhibition can be costly (e.g., see ref. 13). A better molecular understanding of hydrate formation could, for example, support the development of improved inhibitors (4).

Despite significant interest, the molecular mechanism of hydrate nucleation is not fully understood (14). The molecular-level details of hydrate nucleation are beyond the spatiotemporal resolution limits of current experimental techniques. However, hydrate nucleation can now be reproducibly studied in silico with direct, unbiased simulations using atomistic water models (e.g., as in refs. 15–18). Comparable simulations of crystallization in other molecular systems are difficult (see ref. 6). For example, the direct calculation of ice nucleation rates for atomistic water models remained an open challenge until 2015 (19). Therefore, in contrast to other crystallization processes, gas hydrate nucleation is an industrially relevant crystallization process that is amenable to detailed investigation through simulation.

To probe hydrate nucleation, 10 independent 200-ns isobaric, isothermal simulations were performed at 50 MPa and 250 K for a system of 3,375 molecules that was 90% H₂O, 5% CH₄, and 5% H₂S. In all of the simulations, the initial configuration of the system was a CH₄/H₂S nanobubble in liquid water. Based on the models and conditions used in this study, the simulated system was metastable with respect to both the CH₄ and H₂S hydrate phases, but not with respect to hexagonal ice (see Supporting Information for a discussion of the system’s metastability). Both CH₄ and H₂S favor the formation of the same hydrate phase (i.e., the sI hydrate structure) (20, 21). Additional methodological details concerning system preparation and the simulations are provided in Supporting Information.

During hydrate nucleation, a system’s potential energy decreases with approximately concurrent increases in both: (i) the number of water cages in the system, and (ii) system order according to molecular-level, water-based order parameters [e.g., the F₄ order parameter (22)]. This can be seen by comparing the time evolution of these properties for one of this study’s trajectories (e.g., see run 2 in Fig. 1 A–C) with structural snapshots from that trajectory (Fig. 1D). More detailed descriptions of the water cages of interest and average F₄ values are provided in Fig. 1 legend. Six of this study’s 10 simulations had nucleated and were in a growth regime by the time they had reached 200 ns (Fig. 1), and analysis of these 6 simulations will be the focus of the rest of this paper. Upon extension to 260 ns, three of the remaining trajectories had nucleated. These additional trajectories were used as a separate analysis test set and provide confirmation that the results for hydrate nucleation in this paper reasonably capture ensemble behavior (see Supporting Information, Fig. S1).

During nucleation, the changes in potential energy, number of 5¹²6ⁿ – 4¹5¹⁰6ᵐ cages, and average F₄ value are not completely simultaneous. This can be seen based on the normalized curves in Fig. 2A where changes in potential energy precede changes in the average F₄ value, which in turn precede changes in the number of cages. This behavior is expected because lower-order structure (e.g., as captured by the F₄ order parameter) must be present if higher-order structures (e.g., cages) are to form. This staged description of hydrate nucleation is consistent with previous work (27, 28). For example, see figure 14C in ref. 27, although the authors did not interpret their results as such. More

Fig. 1. Time evolution of (A) potential energy, (B) average F₄ value, and (C) number of 5¹²6ⁿ – 4¹5¹⁰6ᵐ cages in the 10 independent trajectories for the CH₄/H₂S nanobubble–liquid water system. The color scheme in A is also used for B and C. System-averaged F₄ values have been previously used to monitor hydrate nucleation in simulations (e.g., as in refs. 15, 16, 18, 23). F₄ is an order parameter that depends on the torsion angles between adjacent H₂O molecules and relates to the planarity of hydrogen bond rings (22). The mathematical details of the F₄ order parameter can be found in ref. 22. The average F₄ values for liquid water, the sI hydrate phase, and ice are −0.04, 0.7, and −0.4, respectively (24). The simulations contained insufficient CH₄ and H₂S to completely convert to a hydrate phase, and consequently could not attain F₄ values of 0.7. A 4ᵇ5ᶜ6ᵈ cage has its water molecules arranged so that they form b quadrilateral, c pentagonal, and d hexagonal hydrogen bond rings. The 5¹²6ⁿ – 4¹5¹⁰6ᵐ cages (where n = 0, 2, 3, 4 and m = 2, 3, 4) are associated with hydrate formation according to previous visual (25) and algorithmic (26) analysis. A modified version of the FSICA program (26) was used to detect cages and determine cage occupancies within the simulations. The potential energy, F₄, and cage population curves in this figure and the remainder of the paper are based on 1-ns averages as detailed in Supporting Information. (D) Snapshots of the nucleation process in run 2. The oxygen atoms of the water molecules comprising the largest cluster of 5¹²6ⁿ – 4¹5¹⁰6ᵐ cages are connected by red tubes. CH₄ and H₂S within this cluster are blue and green spheres, respectively. The oxygen atoms of the water molecules that are not part of the cluster are small purplish-blue points.
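As a rough illustration of the F₄ order parameter mentioned in the Fig. 1 legend, the sketch below assumes the commonly quoted form, F₄ = ⟨cos 3φ⟩, where φ is an H–O···O–H torsion angle between adjacent water molecules. The authoritative definition is the one in ref. 22; the function names, the choice of which four atoms define each torsion, and the toy coordinates are assumptions made here for illustration and are not taken from the authors' analysis code.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion(p0, p1, p2, p3):
    """Torsion angle (radians) for the four points p0-p1-p2-p3.

    Assumed usage: p0 and p3 are hydrogens, p1 and p2 the oxygens of two
    neighboring water molecules. The sign convention is irrelevant here
    because cos(3*phi) is an even function of phi.
    """
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    norm_b2 = math.sqrt(_dot(b2, b2))
    m1 = _cross(n1, tuple(x / norm_b2 for x in b2))
    return math.atan2(_dot(m1, n2), _dot(n1, n2))

def f4(torsions):
    """Average of cos(3*phi) over a collection of torsion angles (assumed form)."""
    return sum(math.cos(3.0 * phi) for phi in torsions) / len(torsions)

# Toy geometry only: a planar cis arrangement (phi = 0) and a trans one (phi = pi).
cis = torsion((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))
trans = torsion((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
```

Under this assumed form, planar (eclipsed) torsions contribute values near +1 and staggered or trans torsions contribute values near −1, which is qualitatively in line with the positive hydrate value and negative ice value quoted in the legend.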

importantly, several of the plotted order parameters in ref. 27 are different from those used in this study, which highlights that the staged nature of hydrate nucleation is not dependent on the particular order parameters used to track hydrate nucleation. This evidence confirms that hydrate nucleation exhibits ordering in stages as system potential energy decreases. This is a key part of characteristic (i) of protein folding funnels as described above.

To help further elucidate the general mechanistic features of hydrate nucleation and its similarities to protein folding, it is advantageous to compare nucleation behavior within the set of nucleating trajectories in Fig. 1. Such comparisons require a mechanism for temporally aligning nucleating trajectories given that the process of interest does not occur at the same time within each trajectory, a common challenge with stochastic processes. The F₄ curves are an appropriate basis for temporally aligning the nucleating trajectories. First, the F₄ curves are apparently more sensitive to the early structural changes during nucleation than cage populations (Fig. 2A). Second, the F₄ curves are less sensitive to nonhydrate system fluctuations than the potential energy curves (e.g., the potential energy curves depend on the evolution of the CH₄/H₂S nanobubble in addition to the structuring of the aqueous phase). Therefore, the F₄ curves were leveraged to temporally align the trajectories using the following procedure.

A four-parameter sigmoidal fitting was applied to each trajectory’s F₄ curve in Fig. 1B. These sigmoidal fits had R² values greater than 0.99. For each trajectory, the time corresponding to the inflection point of the F₄ curve’s sigmoidal fit was subtracted from the trajectory’s simulation times yielding relative times. Zero ns in this alternative temporal frame of reference has a universal meaning, namely the point in time of fastest ordering in the nucleating trajectories with respect to the F₄ order parameter. The temporally aligned F₄ curves for the nucleating trajectories have excellent overlap, particularly for t_relative < 0 ns, as can be seen in Fig. 2B. Given the definition of t_relative = 0 ns, nucleation occurs when t_relative < 0 ns, so the focus of the remaining analysis is on the nucleating trajectories in the t_relative < 0 ns regime.

According to Fig. 2C, hydrate nucleation in the CH₄/H₂S nanobubble system is typically associated with first the appearance of 5¹² cages, then 5¹²6² cages, and then both 5¹²6³ and 4¹5¹⁰6² cages approximately simultaneously. During the early stages of hydrate formation, the average populations of these cages are in the same order as their appearance, namely 5¹² > 5¹²6² > 5¹²6³ ≅ 4¹5¹⁰6². For the simulated system, the thermodynamically favored hydrate crystal structure, sI, consists of only 5¹² and 5¹²6² cages. The appearance of a variety of cages stems from the lower entropic penalty associated with transitioning from the liquid to a slightly disordered lattice rather than directly to the fully ordered thermodynamically favored structure.

Entropy can also play a critical role in the appearance of specific cage types. For example, 5¹² cages are highly symmetric, and each consists of 20 H₂O molecules, the same as the average hydration number of aqueous methane (29). Other 5¹²6ⁿ – 4¹5¹⁰6ᵐ cages have more water molecules, lower symmetry, or both. Therefore, 5¹² cages are structurally more consistent with the aqueous phase and have higher degeneracy, so there would be a smaller entropic penalty associated with their formation. However, space-filling structures cannot be created from 5¹² cages alone, so the formation of a hydrate structure requires the appearance of additional cages. Meanwhile, 5¹²6² cages have higher symmetry than 4¹5¹⁰6² cages and contain fewer water molecules than 5¹²6³ cages, so the earlier formation of 5¹²6² cages is expected to be entropically favorable. The remaining 5¹²6ⁿ – 4¹5¹⁰6ᵐ cages are either larger or have lower symmetry than 4¹5¹⁰6² and 5¹²6³ cages, and hence appear later. Large cages (e.g., 5¹²6⁴ cages) are not expected to have large populations because these cages have volumes that exceed the sizes of CH₄ and H₂S. In turn, their formation entails the creation of excess and larger void volumes. Void volumes, i.e., cavities, are unfavorable in aqueous environments (e.g., see ref. 30). This latter point is emphasized by the rarity of empty cages as can be seen in Fig. 2D. Therefore, the staged appearance of cage types and cage occupancy stems from the tendency of water molecules to rearrange to form the hydrate phase so that entropic penalties are minimized, i.e., hydrate nucleation proceeds so that a system’s phase-space manifold connecting the solution and hydrate macrostates encompasses the maximum possible number of microstates. Previous work has described this manifold as the “reactive tube” along the hydrate nucleation pathway (31). The potential energy landscape for nucleation would then be a projection of the microstates encompassed by this manifold as a function of the system’s potential energy.

The existence of multiple microstates at each stage along the solution–hydrate transition implies that there can be stochastic differences in the exact number of each cage type for any specific trajectory as nucleation proceeds. This variation is highlighted by

Fig. 2. Comparing nucleating trajectories. (A) Superimposed normalized potential energy, average F₄, and 5¹²6ⁿ – 4¹5¹⁰6ᵐ cage curves for one nucleating trajectory (run 1 in Fig. 1). The potential energy curve has been inverted to facilitate comparisons. (B) Temporally aligned average F₄ curves for the nucleating trajectories exhibiting excellent overlap during nucleation, i.e., the t_relative < 0 ns regime. Relative times provide a temporal frame of reference for comparing the nucleating trajectories. (C) The number of each of the 5¹²6ⁿ – 4¹5¹⁰6ᵐ cages during nucleation as averaged across the six nucleating trajectories. (D) Occupancy of 5¹²6ⁿ – 4¹5¹⁰6ᵐ cages during nucleation as averaged across the nucleating trajectories. (E) The max-mean-min distributions for the number of 5¹² and 5¹²6² cages during hydrate nucleation. The max-mean-min distribution for a given cage type is a set of three curves indicating the evolution of the maximum, mean, and minimum number of that particular cage type across the set of aligned nucleating trajectories. (F) The decreasing relative standard deviations in the numbers of 5¹² and 5¹²6² cages among the aligned trajectories as nucleation and growth proceed. The average curves and max-mean-min distributions in C–E are based on the analysis of over 49,000 configurations from 6 nucleating trajectories.
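The quantities in Fig. 2 B, E, and F can be sketched in a few lines, assuming that each trajectory's inflection time has already been obtained from its sigmoidal F₄ fit and that the cage counts have been resampled onto a common grid of relative times. The function names and the toy counts are hypothetical, and the use of the sample (rather than population) standard deviation is an assumption, because the text does not state which was used.

```python
from statistics import mean, stdev

def to_relative_times(times_ns, inflection_ns):
    """Shift simulation times so that t_relative = 0 at the F4 inflection point."""
    return [t - inflection_ns for t in times_ns]

def max_mean_min(counts_by_run):
    """Max, mean, and min cage count at each aligned time point.

    counts_by_run holds one equal-length list of counts per aligned trajectory.
    """
    return [(max(col), mean(col), min(col)) for col in zip(*counts_by_run)]

def relative_std(counts_by_run):
    """Standard deviation divided by the mean at each aligned time point."""
    out = []
    for col in zip(*counts_by_run):
        m = mean(col)
        # Undefined when no cages are present at this time point.
        out.append(stdev(col) / m if m else float("nan"))
    return out

# Toy data: three aligned trajectories sampled at three relative times.
runs = [[2, 4, 10],
        [4, 4, 12],
        [6, 4, 8]]
envelope = max_mean_min(runs)
spread = relative_std(runs)
shifted = to_relative_times([100, 110], 105)
```

A relative standard deviation that falls as the mean cage count rises is the kind of convergence among trajectories that Fig. 2F reports.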

the max-mean-min distributions for the 5¹² and 5¹²6² cages in Fig. 2E. However, among the nucleating trajectories, the relative standard deviations in the populations of 5¹² and 5¹²6² cages decrease as the trajectories progress along the solution–hydrate transition (Fig. 2F). This apparent structural convergence implies that the manifold of accessible microstates, i.e., the potential energy landscape, narrows as the hydrate macrostate is approached. However, cages do not fully capture the solution–hydrate transition because there exist early stages along this transition where the system is ordering at the molecular level and cages have yet to appear (see the F₄ and cage curves in Fig. 2A).

Convergence among the trajectories is also observed in the properties of the individual water molecules, which can probe all stages along the solution–hydrate transition. S_g is a water-based order parameter sensitive to local tetrahedral structure (32), as is described in Fig. 3 legend. The S_g probability distributions before and after hydrate nucleation are provided in Fig. 3A. The narrowing and shifting of the S_g probability distribution reveal that there is decreased variation in the local arrangements of water molecules following nucleation and that the water molecules form more tetrahedral arrangements. This transition begins at approximately t_relative = −25 ns, and there is a marked change in apparent behavior at approximately t_relative = −10 ns (Fig. 3B). The former coincides with the approximate appearance of 5¹²6ⁿ – 4¹5¹⁰6ᵐ cages whereas the latter occurs as the maximum formation rate of these cages is approached (compare Figs. 3B and 2D). Therefore, S_g probability distributions are sensitive to the formation of the hydrate phase while also probing the behavior of the solution phase. The probability distribution for the translational mean-squared displacement (MSD) values of the water molecules also sharpens and shifts to lower values during the nucleation process as can be seen in Fig. 3C, and as is expected for a solution–hydrate transition. However, this sharpening occurs more gradually than for the S_g distribution as can be seen by comparing Fig. 3D with Fig. 3B. Previous work has indicated that gas hydrate nucleation proceeds with the formation of solvent-separated guest molecules and water molecule rearrangement to yield cages (33). Therefore, the early reductions in the MSD values of the water molecules, without commensurate changes to the tetrahedral ordering of the aqueous phase, may be indicative of early stage ordering of guest molecules in the aqueous phase.

During hydrate nucleation, the number of microstates sampled by the system as a whole can be expected to depend on the number of molecular microstates readily accessible to the water molecules. Molecular property distributions are indicative of the accessible molecular microstates. As the molecular property distributions are narrowing, the potential energy of the system is decreasing as can be seen by comparing Fig. 3E with Fig. 3 A–D. Fig. 4 then captures this funneling of molecular properties with lower system potential energies. This funneling is observed for other molecular properties beyond those in Fig. 4 provided that these properties are reasonably chosen (see Fig. S2 for another funnel based on two other properties of the water molecules). As a system descends deeper into the funnel, hydrate structures begin

Fig. 3. Evolution of molecular property distributions during nucleation. (A) The probability distribution for the Sg values of water molecules at trelative = −50 ns and trelative = 0 ns. The distributions have been calculated using the 10-ps average configurations from all of the nucleating trajectories for the stated relative times. The Sg order parameter reflects the degree to which the four nearest neighbors of a water molecule exhibit tetrahedrality with Sg values closer to 0 indicating increased tetrahedrality (32). The mathematical details of the Sg order parameter can be found in ref. 32. The average Sg values for ice and liquid water are 0.0117 and 0.100, respectively (32). (B) The temporal evolution of the relative probability distribution of Sg values. For a given point in time, Sg values with probabilities approaching that of the most probable Sg value for that point in time (e.g., ∼0.03 for trelative = −50 ns based on A) are black whereas Sg values of decreasing relative probability are increasingly green. The Sg probability distribution was sampled every 0.5 ns in the interval of trelative = −50 to 0 ns. Like A, B is based on the analysis of 10-ps average configurations from all of the nucleating trajectories for the stated relative times. (C) The probability distribution of the translational MSD values of water molecules at trelative = −50 ns and trelative = 0 ns. The line types are the same as in A. (D) The temporal evolution of the relative probability distribution of MSD values. The relative probabilities are color encoded the same way as for B. The MSD values used to construct C and D correspond to the variances in the water molecule positions for the 10-ps average configurations used to construct A and B, respectively. (E) The systemic decrease in potential energy during hydrate formation. The color scheme is the same as in Fig. 2B.
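As an illustration of the two per-molecule quantities described in the Fig. 3 legend, the following Python sketch computes an Sg value and a positional-variance MSD value for a single water molecule. It assumes the angular form of the tetrahedral order parameter of ref. 32, Sg = (3/32) Σ(cos ψjk + 1/3)², summed over the six pairs of the four nearest-neighbor oxygen atoms, and it assumes that the MSD is the mean squared deviation of a molecule's position from its mean over the frames of one averaging window. The function names, the array layouts, and the neglect of periodic boundary conditions are assumptions made for illustration; this is not the analysis code used in this study.

```python
import numpy as np

def sg_order_parameter(center, neighbors):
    """Angular tetrahedral order parameter for one molecule (assumed form).

    center:    (3,) position of the central oxygen atom.
    neighbors: (4, 3) positions of its four nearest-neighbor oxygen atoms.
    Returns 0 for a perfect tetrahedron and 1 if all neighbors coincide.
    """
    vecs = np.asarray(neighbors, dtype=float) - np.asarray(center, dtype=float)
    unit = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    total = 0.0
    for j in range(3):
        for k in range(j + 1, 4):
            cos_psi = float(np.dot(unit[j], unit[k]))  # angle subtended at center
            total += (cos_psi + 1.0 / 3.0) ** 2
    return 3.0 / 32.0 * total

def positional_msd(positions):
    """Mean squared deviation of positions from their window average.

    positions: (n_frames, 3) positions of one molecule within one window.
    """
    positions = np.asarray(positions, dtype=float)
    mean_pos = positions.mean(axis=0)
    return float(np.mean(np.sum((positions - mean_pos) ** 2, axis=1)))

# Usage: neighbors at the corners of a regular tetrahedron give Sg = 0
# (up to floating-point rounding).
tetra = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]], dtype=float)
sg_tetra = sg_order_parameter(np.zeros(3), tetra)
```

In practice the four nearest neighbors would first have to be identified with a minimum-image distance search, which is omitted here for brevity.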

Fig. 4. Funneling of molecular properties during hydrate nucleation. For each nucleating trajectory, average configurations were calculated using a 10-ps averaging window at every 0.5-ns point within the time interval of trelative = −72.5 ns to trelative = 0 ns. These average configurations were histogramed according to their corresponding system potential energies. Then, for the set of configurations within a given potential energy range, the water molecules of those configurations were histogramed according to their Sg and MSD values (see Fig. 3 for a description of Sg and MSD values). This yielded a 3D histogram based on the analysis of more than 2.5 million water molecules from 876 configurations. For each potential energy interval, the illustrated surface bounds the set of (Sg, MSD) values that are readily accessible to the water molecules, i.e., have a relative probability ≥5% compared with the most probable (Sg, MSD) values for that potential energy interval. The surface is color encoded so that low and high system potential energies are black and cyan, respectively. The central dots indicate how the trajectories move through the (Sg, MSD, system potential energy)-space based on the average MSD and Sg values for the configurations comprising the trajectories. The red and yellow points correspond to run 2 and all of the other nucleating trajectories, respectively. The molecular configurations on the right represent the system in run 2 when it has reached the enlarged and attached red points. These configurations have the same visual encoding as those in Fig. 1D, and the time stamps on the configurations are relative times.
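The construction described in the Fig. 4 legend can be sketched in Python as follows. The sketch bins configurations by system potential energy, builds a 2D (Sg, MSD) histogram from the water molecules in each energy bin, and flags the histogram cells whose relative probability is at least 5% of the most probable cell in that bin. The bin counts, function names, and array layouts are assumptions, and the demonstration input is synthetic, made-up data whose spread is simply chosen to shrink with decreasing energy; none of it reproduces the data or code of this study.

```python
import numpy as np

def funnel_masks(energies, sg, msd, n_energy_bins=5, n_prop_bins=20,
                 threshold=0.05):
    """Flag readily accessible (Sg, MSD) cells for each potential energy bin.

    energies: (n_configs,) system potential energy of each configuration.
    sg, msd:  (n_configs, n_waters) per-molecule property values.
    Returns the energy bin edges and a boolean array of shape
    (n_energy_bins, n_prop_bins, n_prop_bins).
    """
    energies = np.asarray(energies, dtype=float)
    sg = np.asarray(sg, dtype=float)
    msd = np.asarray(msd, dtype=float)
    e_edges = np.linspace(energies.min(), energies.max(), n_energy_bins + 1)
    # Map each configuration to an energy bin; the maximum goes in the last bin.
    e_index = np.clip(np.digitize(energies, e_edges) - 1, 0, n_energy_bins - 1)
    sg_edges = np.linspace(sg.min(), sg.max(), n_prop_bins + 1)
    msd_edges = np.linspace(msd.min(), msd.max(), n_prop_bins + 1)
    masks = np.zeros((n_energy_bins, n_prop_bins, n_prop_bins), dtype=bool)
    for b in range(n_energy_bins):
        sel = e_index == b
        if not sel.any():
            continue  # no configurations fall in this energy bin
        hist, _, _ = np.histogram2d(sg[sel].ravel(), msd[sel].ravel(),
                                    bins=(sg_edges, msd_edges))
        masks[b] = hist >= threshold * hist.max()
    return e_edges, masks

def synthetic_demo(seed=0):
    """Synthetic illustration only: property spread shrinks at low energy."""
    rng = np.random.default_rng(seed)
    n_configs, n_waters = 200, 500
    energies = np.linspace(-1.0, 0.0, n_configs)   # arbitrary units
    width = 0.01 + 0.05 * (energies + 1.0)         # narrow at low energy
    sg = 0.3 + width[:, None] * rng.standard_normal((n_configs, n_waters))
    msd = 0.3 + width[:, None] * rng.standard_normal((n_configs, n_waters))
    _, masks = funnel_masks(energies, sg, msd)
    # Number of accessible cells per energy bin, ordered from low to high energy.
    return masks.sum(axis=(1, 2))
```

Calling `synthetic_demo()` returns the number of accessible cells in each energy bin; for this synthetic input the count is smaller in the lowest-energy bin than in the highest, which is the narrowing that a funnel surface would bound.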

12044 | www.pnas.org/cgi/doi/10.1073/pnas.1610437113 Hall et al.


to appear as is highlighted in Fig. 4 by the molecular configuration insets on the right-hand side. By extension of the funnel in Fig. 4, the system will have more accessible microstates during the solution–hydrate transition the closer the system is to the high-energy fluid macrostate and relatively fewer as it approaches the hydrate macrostate. However, as system descriptors, molecular property distributions lack the specificity of individual system microstates, so the system’s potential energy landscape will necessarily be more complex than the molecular properties funnel shown in Fig. 4. Therefore, hydrate nucleation is ordering in stages involving funnel-shaped potential energy landscapes, and fully satisfies characteristic (i) of the funnel model of protein folding as described above.

The stochastic nature of hydrate nucleation can be explained within a funnel model by assuming that its funnel-shaped potential energy landscape, similar to those of proteins, has a relatively flat upper portion corresponding to the solution macrostate and a comparatively narrow lower region leading to the hydrate macrostate. As with proteins, individual trajectories will stochastically diffuse across the flat upper region of the funnel until they encounter structures through which they can descend deeper into the funnel. Related to this, estimates of hydrate critical nuclei based on simulations indicate that a hydrate system’s nucleus can have a variety of structures at the apparent hydrate nucleation transition state (28), which is similar to how protein-folding transition states are characterized as ensembles of protein conformations (9). Therefore, the funnel-shaped potential energy landscapes associated with hydrate nucleation apparently exhibit barriers and stochasticity, characteristic (ii) of the funnel model of protein folding as described above.

Hydrate nucleation also exhibits multiple pathways and kinetic traps, characteristic (iii) of the funnel model of protein folding as described above. The hydrate nuclei in this study sample somewhat different structures as nucleation progresses (e.g., see Fig. 2E). Given that amorphous hydrate nuclei can be annealed to more crystalline structures (16, 34, 35), it can be inferred that the potential energy landscape for a hydrate-forming system is rugged with kinetic traps, i.e., local minima, corresponding to amorphous structures, and that these kinetic traps impeded the nucleating trajectories in this study from achieving the sI crystal structure. In the gas hydrate literature, it has been proposed that there are multiple pathways to the crystalline hydrate state which may or may not involve amorphous intermediates (18). Whether or not individual trajectories form amorphous structures would depend on how these trajectories enter and descend the funnel with some pathways leading more directly to the crystalline hydrate phase than others. Therefore, hydrate nucleation involves funnel-shaped potential energy landscapes with many topological similarities to those of protein folding, which explains the phenomenological similarities between protein folding and crystallization.

For the protein community, the funnel model of protein

folding provided a new view on the phenomenon and a conceptual framework for understanding protein folding kinetics (10). Crystallization funnels can similarly now form a conceptual framework for interpreting crystallization and nucleation phenomenology, which should aid future work. To highlight this framework, funnel-shaped potential energy landscapes will now be used to: (i) compare hydrate and ice nucleation (i.e., two ordering processes involving hydrogen-bond networks of water), and (ii) interpret the results of previous studies on ice nucleation. Small clusters of water molecules have been demonstrated to have rugged, funnel-shaped potential energy landscapes (e.g., see figure 5 in ref. 36); therefore, it is reasonable to conjecture that: (i) ice nucleation involves a funnel-shaped potential energy landscape, and (ii) this landscape is rugged, even though this funnel has yet to be extracted.

Ice nucleation is apparently far more difficult to study in silico than hydrate nucleation. This may seem unexpected given that hydrate nucleation represents a multicomponent crystallization process. Ice nuclei are almost solely composed of double-diamond and hexagonal water cages as illustrated in figures 5 and 6 of ref. 19, whereas hydrate nuclei contain many more different types of water cages (25, 26, 35). From the perspective of combinatorics, there are many more possible hydrogen-bond structures for hydrate nuclei than there are for ice nuclei. Therefore, the regions of an ice-forming system’s potential energy landscape that correspond to nucleation and crystal growth are narrower than those of a hydrate-forming system (assuming that there is a sufficient concentration of guest species in the latter to induce hydrate nucleation). In contrast, the metastable liquid regions of the ice-forming system’s potential energy landscape are wider than those of the hydrate-forming system. The presence of hydrophobic solutes (e.g., CH4) in the metastable liquid aqueous phase of the hydrate-forming system results in a more structured liquid aqueous phase compared to pure liquid water. This can be seen in Fig. 3B where, before hydrate nucleation, the most probable Sg values for the water molecules in this study’s hydrate-forming system are already significantly lower than the Sg value of 0.100 for liquid water (32).

Therefore, an ice nucleation simulation would appear to involve a greater search problem than a comparable hydrate nucleation simulation. The ice nucleation trajectory is less confined as it meanders through the fluid portion of its potential energy landscape, and is simultaneously searching for a smaller target. Ice nucleation in pure water systems is thus a significantly rarer event than hydrate nucleation in multicomponent guest–water systems of appropriate local composition. The ice search is further hindered by the comparatively slower dynamics in ice nucleation simulations arising from the lower temperatures used to probe ice nucleation in silico. Taken together, these factors render ice nucleation more difficult to study in silico.

The behavior of supercooled water is still not fully understood,

particularly in “no man’s land” (i.e., water at temperatures between its homogeneous nucleation temperature and its glass transition temperature). For example, there exists a temperature (T) in no man’s land below which liquid water is not metastable (37). Funnel-shaped potential energy landscapes can provide a conceptual framework for explaining observed phenomenology in the vicinity of T.

Given that the liquid water macrostate of a pure water system is associated with more, higher-energy microstates than the ice macrostate, the liquid water macrostate will experience a greater reduction in its number of accessible microstates as the system is quenched toward T. Therefore, the excess entropy of liquid water compared to ice decreases as T is approached (37), and the system cannot access as much of the upper regions of the funnel. The system is thus more likely to encounter microstates leading to the lower regions of the funnel as temperature is decreased, so the formation rate of ice increases as temperature is decreased toward T (as was observed in ref. 37). As the system is quenched below T, the accessible portion of the upper region of the funnel is sufficiently small that the system samples, with comparable likelihood, both liquid-like microstates and microstates that lead to the lower regions of the funnel. Therefore, liquid water is no longer metastable with respect to ice, and crystallization can occur more rapidly than the relaxation of liquid water (37). The system’s exploration of the funnel now corresponds to spinodal-like behavior. However, the funnel is rugged with local minima. This is why vitrified low-density amorphous ice forms upon rapid quenching of aqueous systems (37). Below T, the overall ice formation rate is dependent on the rate at which the system can escape the funnel’s local minima, and so lowering the system’s temperature beyond T reduces the formation rate of ice (as was observed in ref. 37). Therefore, the formation rate of ice reaches a maximum at T, and this can be explained in

Hall et al. PNAS | October 25, 2016 | vol. 113 | no. 43 | 12045



terms of how water systems explore their funnel-shaped potential energy landscapes.

Through extensive simulations and detailed analysis of hydrate nucleation, this study has revealed that the potential energy landscapes of hydrate-forming systems are funnel shaped. Hydrate nucleation is still not fully understood (14), so the conceptual framework of funnel-shaped potential energy landscapes offers an important tool in advancing the community’s understanding of hydrate nucleation, a process relevant to several areas in science and technology. Through a comparison of hydrate nucleation and protein folding, it was demonstrated that the phenomenological similarities of these processes are indicative of similarities in the features of their underlying potential energy landscapes. To demonstrate the conceptual power of funnel-shaped potential energy landscapes, the features of these landscapes were used to rationalize: (i) differences between hydrate and ice nucleation, and (ii) behavior of low-temperature water. Whereas ice nucleation was conjectured to involve a funnel-shaped potential energy landscape, future work is needed to demonstrate and probe this funnel. Funnel-shaped potential energy landscapes may be applicable to many other crystallization processes, and perhaps other molecular ordering phenomena (e.g., the formation of metal-organic frameworks or micelles). Future work should explore the topologies of the potential energy landscapes associated with these processes. These topologies are expected to be diverse [as is thought to be the case for proteins (10)], and to reflect the rich phenomenology of crystallization and ordering processes. With the common conceptual framework of funnel-shaped potential energy landscapes, there are greater opportunities for both the protein and physical chemistry communities to reappropriate and share concepts and techniques, which should promote their mutual advancement. For example, crystallization processes may have equivalents to the secondary, tertiary, and quaternary structures of proteins. There are still many open challenges in crystallization and protein folding research, and the connections made in this article will empower researchers with a fresh perspective as they embark on this research.
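The search-problem argument used in this article to compare ice and hydrate nucleation (a wider flat region combined with a smaller target makes the ordered state harder to find) can be illustrated with a deliberately simple toy model. The Python sketch below places an unbiased random walker on a one-dimensional ring of sites, treats a block of sites as the entrance to the lower funnel, and estimates the mean number of steps needed to reach it. The ring geometry, the site counts, and the function name are arbitrary assumptions chosen for illustration; the sketch is not a model of either system's actual potential energy landscape.

```python
import random

def mean_first_passage(n_sites, target_size, n_walkers=300, seed=0):
    """Mean steps for an unbiased walker on a ring to reach the target block.

    n_sites:     total number of sites on the ring (width of the flat region
                 plus the target).
    target_size: number of adjacent sites (0 .. target_size - 1) that count
                 as the entrance to the lower funnel.
    """
    rng = random.Random(seed)
    total_steps = 0
    for _ in range(n_walkers):
        # Start uniformly at random on a non-target site.
        pos = rng.randrange(target_size, n_sites)
        steps = 0
        while pos >= target_size:
            pos = (pos + rng.choice((-1, 1))) % n_sites
            steps += 1
        total_steps += steps
    return total_steps / n_walkers

# Hypothetical comparison: wide plateau with a small target versus
# narrow plateau with a large target.
wide_small_target = mean_first_passage(n_sites=60, target_size=2)
narrow_large_target = mean_first_passage(n_sites=30, target_size=6)
```

With these illustrative parameters the walker on the wider ring with the smaller target takes longer on average to find the entrance, which mirrors the qualitative reasoning given for why ice nucleation is the rarer event in silico.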

ACKNOWLEDGMENTS. The authors thank Dr. Zhengcai Zhang and Dr. Guang-Jun Guo for their provision and modification of the Face-Saturated Incomplete Cage Analysis (FSICA) code for detecting the hydrate cages. The authors also thank Dmitri Rozmanov for his discussions and guidance on calculating rotational root-mean-squared displacements. This research was supported by Alberta Innovation and Science; Alberta Innovates - Technology Futures; the Canada Foundation for Innovation; the Natural Sciences and Engineering Research Council of Canada (NSERC); the NSERC Vanier CGS Program; SMART Technologies; Compute Canada; and the University of Calgary.

1. DeVries AL, Wohlschlag DE (1969) Freezing resistance in some Antarctic fishes. Science 163(3871):1073–1075.

2. Griffith M, Ala P, Yang DSC, Hon W-C, Moffatt BA (1992) Antifreeze protein produced endogenously in winter rye leaves. Plant Physiol 100(2):593–596.

3. Ustun NS, Turhan S (2015) Antifreeze proteins: Characteristics, function, mechanism of action, sources, and applications to food. J Food Process Preserv 39(6):3189–3197.

4. Koh CA, Sloan ED, Sum AK, Wu DT (2011) Fundamentals and applications of gas hydrates. Annu Rev Chem Biomol Eng 2:237–257.

5. Vippagunta SR, Brittain HG, Grant DJW (2001) Crystalline solids. Adv Drug Deliv Rev 48(1):3–26.

6. Sosso GC, et al. (2016) Crystal nucleation in liquids: Open questions and future challenges in molecular dynamics simulations. Chem Rev 116(12):7078–7116.

7. Wales DJ, Bogdan TV (2006) Potential energy and free energy landscapes. J Phys Chem B 110(42):20765–20776.

8. Matsumoto M, Saito S, Ohmine I (2002) Molecular dynamics simulation of the ice nucleation and growth process leading to water freezing. Nature 416(6879):409–413.

9. Wolynes PG, Onuchic JN, Thirumalai D (1995) Navigating the folding routes. Science 267(5204):1619–1620.

10. Dill KA, Chan HS (1997) From Levinthal to pathways to funnels. Nat Struct Biol 4(1):10–19.

11. Dill KA, MacCallum JL (2012) The protein-folding problem, 50 years on. Science 338(6110):1042–1046.

12. Chan HS, Dill KA (1994) Transition states and folding dynamics of proteins and heteropolymers. J Chem Phys 100(12):9238–9257.

13. Creek JL (2012) Efficient hydrate plug prevention. Energy Fuels 26(7):4112–4116.

14. English NJ, MacElroy JMD (2015) Perspectives on molecular simulation of clathrate hydrates: Progress, prospects and challenges. Chem Eng Sci 121:133–156.

15. Walsh MR, Koh CA, Sloan ED, Sum AK, Wu DT (2009) Microsecond simulations of spontaneous methane hydrate nucleation and growth. Science 326(5956):1095–1098.

16. Liang S, Kusalik PG (2011) Exploring nucleation of H2S hydrates. Chem Sci 2:1286–1292.

17. Sarupria S, Debenedetti PG (2012) Homogenous nucleation of methane hydrate in microsecond molecular dynamics simulations. J Phys Chem Lett 3(20):2942–2947.

18. Zhang Z, Walsh MR, Guo G-J (2015) Microcanonical molecular simulations of methane hydrate nucleation and growth: Evidence that direct nucleation to sI hydrate is among the multiple nucleation pathways. Phys Chem Chem Phys 17(14):8870–8876.

19. Haji-Akbari A, Debenedetti PG (2015) Direct calculation of ice homogeneous nucleation rate for a molecular model of water. Proc Natl Acad Sci USA 112(34):10582–10588.

20. Davidson DW, Handa YP, Ratcliffe CI, Tse JS, Powell BM (1984) The ability of small molecules to form clathrate hydrates of structure II. Nature 311:142–143.

21. Buch V, et al. (2009) Clathrate hydrates with hydrogen-bonding guests. Phys Chem Chem Phys 11(44):10245–10265.

22. Rodger PM, Forester TR, Smith W (1996) Simulations of the methane hydrate/methane gas interface near hydrate forming conditions. Fluid Phase Equilib 116:326–332.

23. Walsh MR, et al. (2011) Methane hydrate nucleation rates from molecular dynamics simulations: Effects of aqueous methane concentration, interfacial curvature, and system size. J Phys Chem C 115(43):21241–21248.

24. Jiménez-Ángeles F, Firoozabadi A (2014) Nucleation of methane hydrates at moderate subcooling by molecular dynamics simulations. J Phys Chem C 118(21):11310–11318.

25. Walsh MR, et al. (2011) The cages, dynamics, and structuring of incipient methane clathrate hydrates. Phys Chem Chem Phys 13(44):19951–19959.

26. Guo G-J, Zhang Y-G, Liu C-J, Li K-H (2011) Using the face-saturated incomplete cage analysis to quantify the cage compositions and cage linking structures of amorphous phase hydrates. Phys Chem Chem Phys 13(25):12048–12057.

27. Wu JY, Chen LJ, Chen YP, Lin ST (2016) Molecular dynamics study on the nucleation of methane + tetrahydrofuran mixed guest hydrate. Phys Chem Chem Phys 18(15):9935–9947.

28. Barnes BC, Knott BC, Beckham GT, Wu DT, Sum AK (2014) Reaction coordinate of incipient methane clathrate hydrate nucleation. J Phys Chem B 118(46):13236–13243.

29. Dec SF, Bowler KE, Stadterman LL, Koh CA, Sloan ED, Jr (2006) Direct measure of the hydration number of aqueous methane. J Am Chem Soc 128(2):414–415.

30. Pratt LR, Pohorille A (1992) Theory of hydrophobicity: Transient cavities in molecular liquids. Proc Natl Acad Sci USA 89(7):2995–2999.

31. Lauricella M, Meloni S, English NJ, Peters B, Ciccotti G (2014) Methane clathrate hydrate nucleation mechanism by advanced molecular simulations. J Phys Chem C 118(40):22847–22857.

32. Chau P-L, Hardwick AJ (1998) A new order parameter for tetrahedral configurations. Mol Phys 93(3):511–518.

33. Jacobson LC, Hujo W, Molinero V (2010) Nucleation pathways of clathrate hydrates: Effect of guest size and solubility. J Phys Chem B 114(43):13796–13807.

34. Jacobson LC, Hujo W, Molinero V (2010) Amorphous precursors in the nucleation of clathrate hydrates. J Am Chem Soc 132(33):11806–11811.

35. Vatamanu J, Kusalik PG (2010) Observation of two-step nucleation in methane hydrates. Phys Chem Chem Phys 12(45):15065–15072.

36. Wales DJ, Miller MA, Walsh TR (1998) Archetypal energy landscapes. Nature 394:758–760.

37. Moore EB, Molinero V (2011) Structural transformation in supercooled water controls the crystallization rate of ice. Nature 479(7374):506–508.

38. Abascal JLF, Vega C (2005) A general purpose model for the condensed phases of water: TIP4P/2005. J Chem Phys 123(23):234505.

39. Liang S, Rozmanov D, Kusalik PG (2011) Crystal growth simulations of methane hydrates in the presence of silica surfaces. Phys Chem Chem Phys 13(44):19856–19864.

40. Forester TR, McDonald IR, Klein ML (1989) Intermolecular potentials and the properties of liquid and solid hydrogen sulphide. Chem Phys 129(2):225–234.

41. Rozmanov D, Kusalik PG (2010) Robust rotational-velocity-Verlet integration methods. Phys Rev E 81(5 Pt 2):056706–1.

42. Essmann U, et al. (1995) A smooth particle mesh Ewald method. J Chem Phys 103(19):8577–8593.

43. Martyna GJ, Klein ML, Tuckerman M (1992) Nosé-Hoover chains: The canonical ensemble via continuous dynamics. J Chem Phys 97(4):2635–2643.

44. Jang S, Voth GA (1997) Simple reversible molecular dynamics algorithms for Nosé-Hoover chain dynamics. J Chem Phys 107(22):9514–9526.

45. Berendsen HJC, Postma JPM, van Gunsteren WF, DiNola A, Haak JR (1984) Molecular dynamics with coupling to an external bath. J Chem Phys 81(8):3684–3690.

46. Liang S, Kusalik PG (2010) Crystal growth simulations of H2S hydrate. J Phys Chem B 114(29):9563–9571.

47. Conde MM, Vega C (2010) Determining the three-phase coexistence line in methane hydrates using computer simulations. J Chem Phys 133(6):064507.

48. Conde MM, Gonzalez MA, Abascal JLF, Vega C (2013) Determining the phase diagram of water from direct coexistence simulations: The phase diagram of the TIP4P/2005 model revisited. J Chem Phys 139(15):154505.

49. Humphrey W, Dalke A, Schulten K (1996) VMD: Visual molecular dynamics. J Mol Graph 14(1):33–38.

