[Advances in Protein Chemistry] Advances in Protein Chemistry Volume 47 Volume 47 || Enthalpic...

ENTHALPIC CONTRIBUTION TO PROTEIN STABILITY: INSIGHTS FROM ATOM-BASED CALCULATIONS AND STATISTICAL MECHANICS

By THEMIS LAZARIDIS, GEORGIOS ARCHONTIS, and MARTIN KARPLUS

Department of Chemistry, Harvard University, Cambridge, Massachusetts 02138

ADVANCES IN PROTEIN CHEMISTRY, Vol. 47. Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.

I. Introduction
II. Models for Protein Stability
III. Enthalpy of Protein Folding
    A. Makhatadze & Privalov Analysis
    B. Theoretical Calculation of Vacuum Denaturation Enthalpy
IV. van der Waals Interactions in Proteins and Model Systems
    A. Model for van der Waals Interactions
    B. Decomposition of van der Waals Interactions in ...
    C. Model System Results
    D. Summary
V. Electrostatic Interactions in Proteins: Hydrogen Bond Contribution
    A. Polyalanine Helix Calculatio...
    B. Hydrogen Bond Analysis of ...
    C. Protein Hydrogen Bonding
    D. Summary
VI. Enthalpy of Solvation of Polar Groups: Breakdown of Surface Area Proportionality Assumption
    A. Ionizable Side Chains
    B. Test of Additivity
    C. Simulation Analy...
    D. Continuum Electrostatics Calculations of Solvation
    E. Summary and Implications for Protein Stability
VII. Concluding Discussion
Appendix: Computati...
References

I. INTRODUCTION

Proteins are distinguished from other heteropolymers by having a unique three-dimensional structure that is stable under physiological conditions. This structure corresponds to the native state in which proteins perform their biological functions. A variety of factors contribute to the stability and uniqueness of the native state. In an atom-based description, van der Waals and electrostatic interactions are the primary nonbonded energy terms involved; the bonded energy terms are essential for the covalent structure of the polypeptide chain, but are less important in determining the stability of the native state. Since proteins exist in solution, nonbonded interactions within the protein, between the protein and the solvent, and among the solvent molecules contribute to the thermodynamic properties. There have been no detailed calculations that make use of the interactions involved to evaluate the enthalpy, entropy, and free energy associated with protein folding. Energy and enthalpy calculations are relatively straightforward; the evaluation of the entropy and free energy is more difficult. In addition, the overall free energy difference between the native and unfolded states is such (on the order of 0.1 kcal/mol-residue) that very small differences between large numbers are involved in an analysis of protein stability.

Because of the difficulty of detailed theoretical calculations, many phenomenological analyses of protein stability have been made. Constructs like hydrogen bonding and the hydrophobic effect, in particular, have been suggested as playing a significant role in counterbalancing the increase in configurational entropy upon unfolding. Historically, hydrogen bonding was first recognized as of potential importance in determining the structure of biomolecules (Mirsky and Pauling, 1936). This led to the successful prediction of the role of secondary structural elements in proteins (Pauling and Corey, 1953). The importance of the solvent was stressed by Kauzmann (1959), who focused on the hydrophobic interaction as an essential element in protein stability. Drawing on the work of Frank and Evans (1945) and using experimental data for the transfer of small molecules between water and organic solvents, Kauzmann suggested that the major driving force for protein folding is the burial of hydrophobic groups. The physical principles suggested by Kauzmann have been elaborated into a statistical mechanical model of protein stability by Dill (1990). In such analyses hydrogen bonding contributes to the uniqueness of the native structure, but not to its stability, i.e., the structure has to be such that the hydrogen bonding potential of polar groups is satisfied (Chothia, 1974; Baker and Hubbard, 1984; Stickle et al., 1992).

During the past 20 years, a large body of experimental data has become available concerning the structure and thermodynamics of proteins. Of particular importance for understanding the stability of proteins are measurements of the heat capacities, free energies, entropies, and enthalpies of denaturation over a range of temperatures (Privalov, 1979, 1989). Corresponding thermodynamic measurements for small molecules that contain the functional groups of the amino acids and peptides have also been made (Murphy and Gill, 1989a,b, 1990). In addition, the effect of mutations on the stability of a variety of proteins has been studied (Fersht, 1987; Matthews, 1993). In many cases, these studies have been accompanied by structure determinations for the wild-type and mutant species.

Such a wealth of information provides both a challenge and an opportunity to develop a fuller understanding of protein stability. It has led to a number of attempts to interpret the thermodynamic measurements in terms of contributions at the atomic level. The conclusions differ significantly at both the quantitative and the qualitative levels concerning the importance of various factors in stabilizing the native state. This is a consequence of the use of different assumptions, introduced explicitly or implicitly into the model, to go from the thermodynamic data to a molecular interpretation. It has been concluded, for example, that hydrogen bonding makes the dominant contribution (Creighton, 1991), that the hydrophobic effect makes the dominant contribution (Dill, 1990), or that hydrogen bonding and the hydrophobic effect are of approximately equal importance (Privalov and Makhatadze, 1993) in stabilizing the native structure.

Statistical mechanics provides the theory required to go from microscopic contributions to macroscopic thermodynamics. The energy and enthalpy can be calculated from canonical averages over the van der Waals and electrostatic terms in the potential function. For the entropy and free energy, the partition function, which depends on the same energy terms, is required. By calculating the difference in these thermodynamic quantities between the native and denatured state as a function of temperature and solvent conditions, a full statistical mechanical treatment of protein stability could be obtained, in principle. In practice, this is beyond the currently available theoretical and computational methodologies. Nevertheless, theory can provide limits on the assumptions that are currently employed in the phenomenological analyses. A major objective of this chapter is to take a first step in this direction. A number of calculations are undertaken to examine assumptions that have been made in analyzing the experimental data and to provide information needed to decide among different interpretations. Empirical energy calculations for minimized structures and averages over molecular dynamics simulations are used to estimate the contributions of van der Waals and electrostatic interactions in the native and unfolded states. They are complemented by Poisson-Boltzmann continuum calculations and integral equation results that provide information concerning the effect of the solvent and its relation to the exposed surface area. Despite the approximate nature of these calculations, they answer essential questions that are posed by experiments and their analyses. Examples addressed in this chapter include: (1) What is the loss of nonbonded interactions in protein unfolding? (2) What is the magnitude of hydrogen bonding interactions in proteins? (3) Are van der Waals interactions in native proteins similar to those in nonpolar liquids or solids? (4) Is the surface area proportionality assumption valid for estimating the solvation of polar groups?
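For reference, the distinction drawn above between quantities obtained from canonical averages (energy, enthalpy) and quantities that require the partition function (entropy, free energy) can be written out with the standard canonical-ensemble relations. The notation is an assumption made here for illustration and is not taken from the chapter: U(r) denotes the potential energy (including the van der Waals and electrostatic terms), Z the configurational partition function, and A and S the configurational parts of the Helmholtz free energy and entropy.

```latex
\begin{align}
Z &= \int e^{-\beta U(\mathbf{r})}\, d\mathbf{r}, \qquad \beta = \frac{1}{k_B T} \\
\langle U \rangle &= \frac{1}{Z} \int U(\mathbf{r})\, e^{-\beta U(\mathbf{r})}\, d\mathbf{r} \\
A &= -k_B T \ln Z, \qquad S = \frac{\langle U \rangle - A}{T}
\end{align}
```

The average of U can be estimated directly from configurations sampled in a simulation, whereas A and S depend on Z itself, which is one way to see why the entropy and free energy are harder to evaluate.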

Section II summarizes published analyses of protein stability based on model compound data and outlines the assumptions that have been made. In Section III, the enthalpy of protein folding is considered and a thermodynamic cycle is used to relate the measurements to quantities that can be calculated. We focus on the enthalpy of denaturation because it is most directly accessible to calculations. The experiments and analysis of Privalov and co-workers, particularly that of Makhatadze and Privalov (1993; this paper is referred to as M & P in this chapter), are considered in detail because these measurements provide the most complete results on the thermodynamics of proteins. In addition, they have used an approach to analyze their results which corresponds to the thermodynamic cycle employed in the present calculations. This makes possible a comparison between theory and experiment. In particular, their estimates of internal van der Waals and hydrogen bonding contributions to the enthalpy difference between the native and denatured states of the protein are compared with the calculations of the van der Waals and electrostatic terms (the latter includes hydrogen bonding) from an atom-based model. To obtain a better understanding of the importance of various interactions, we then undertake a detailed examination of the contributing terms. In Section IV, the van der Waals term is decomposed into polar and nonpolar contributions and comparisons are made with results for simple systems that have been used as models for the protein interior (e.g., liquid alkanes and alkane crystals). This permits us to determine which, if any, of the models is most appropriate for empirical estimates of the van der Waals interactions in native proteins. A corresponding study is made in Section V of the electrostatic contribution with special emphasis on the hydrogen bonding term in proteins and helical peptides. Because of the important role of the surface area model in estimating the difference in solvation enthalpy, as well as the free energy, of the native and unfolded states, we use statistical mechanical calculations and molecular dynamics simulations in Section VI to examine solvation effects. We focus on polar groups for which the simple additivity assumption is most likely to be in error. The analysis indicates that the polar contribution to the solvation enthalpy difference between native and unfolded protein is overestimated by the surface area model. This permits us to draw conclusions concerning the phenomenological contributions to the enthalpy of protein folding and make comparisons with the work of Spolar et al. (1992). A concluding discussion in Section VII summarizes what has been learned from the atom-based calculations and provides an overview of our understanding of the contributions to protein stability. The sources of the overall stability of proteins are described and the differences among proteins are considered. We then comment on the predictive utility of our current understanding and on what needs to be done to confirm and extend the analysis.

Several reviews on protein stability have appeared (Dill, 1990; Creighton, 1991; Matthews, 1993; Oobatake and Ooi, 1993; Ponnuswamy, 1993; Rashin, 1993; Rose and Wolfenden, 1993). Consequently, this chapter emphasizes what can be learned from theory, rather than attempting a comprehensive coverage of the literature on protein thermodynamics.

II. MODELS FOR PROTEIN STABILITY

Kauzmann (1959) introduced the use of thermodynamic data for model compounds in elucidating the interactions contributing to protein stability. He proposed the transfer of nonpolar groups from water to nonpolar liquids as a paradigm for protein folding and provided a qualitative description of the important interactions. The major conclusion, already mentioned, was that the hydrophobic effect, as manifested in the burial of nonpolar groups in the native protein, was the dominant contribution to protein stability. A more quantitative analysis was performed by Tanford (1962), who accepted the paradigm of transfer between two phases and employed data on the solubility of amino acids in water and in ethanol to estimate the hydrophobic contribution to the stability of proteins.

Since then, a number of experimental studies have been performed to provide more accurate measures of the contribution of side chain burial to protein stability. Measurements of the distribution of amino acids between aqueous dioxane or ethanol mixtures and water led to the establishment of the first "hydrophobicity scale" (Nozaki and Tanford, 1971). Other data were provided by the results of Wolfenden et al. (1981) on the distribution of analogs of amino acid side chains between water and the gas phase and between water and liquid cyclohexane (Radzicka and Wolfenden, 1988). The thermodynamics of transfer of N-acetyl-amino acid amides from octanol to water has been studied by Fauchère and Pliska (1983).

Although model compound results of this type provide important information, their use for the study of protein stability is not straightforward. All applications have been based on the assumption that each chemical group in a molecule makes a constant contribution to the thermodynamic properties of interest. Such a "group additivity" assumption has been employed in making predictions for a variety of molecular properties in systems other than proteins. Applications include the heat capacities of organic molecules in the gas phase (Benson, 1968), activity coefficients in liquid mixtures (Larsen et al., 1987), thermodynamics of hydration of organic molecules (Cabani et al., 1981), and solubilities of compounds of pharmacological interest (Yalkowsky and Valvani, 1979).

Given the group additivity assumption, protein stability estimates require a procedure for determining the difference in the solvation of groups in the native and denatured states. In almost all applications, the effective contribution is taken to be proportional to the accessible surface area (Lee and Richards, 1971), although arguments have been given for using the molecular surface (Israelachvili, 1985; Tunon et al., 1992; Jackson and Sternberg, 1994). Thus, the solvation contribution of each group in a given state is equal to the accessible surface area (ASA) in that state multiplied by a coefficient determined by fitting the model compound data.
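The surface area proportionality just described can be illustrated with a minimal sketch. The group classes, coefficients, and areas below are hypothetical placeholders chosen only to show the bookkeeping; they are not fitted values from any of the cited studies.

```python
# Illustrative sketch of a surface-area solvation model.
# All group names, coefficients, and areas are hypothetical placeholders,
# not values taken from the chapter or from any cited parameterization.

def solvation_term(asa_by_group, coeff_by_group):
    """Sum over groups of ASA (A^2) times a per-group coefficient (cal/mol/A^2)."""
    return sum(coeff_by_group[group] * area for group, area in asa_by_group.items())

def unfolding_solvation_change(asa_native, asa_denatured, coeff_by_group):
    """Solvation contribution to unfolding: denatured-state term minus native-state term."""
    return (solvation_term(asa_denatured, coeff_by_group)
            - solvation_term(asa_native, coeff_by_group))

# Hypothetical coefficients (cal/mol/A^2) and accessible surface areas (A^2).
coeffs = {"nonpolar": 20.0, "polar": -10.0}
asa_native = {"nonpolar": 100.0, "polar": 50.0}
asa_denatured = {"nonpolar": 300.0, "polar": 150.0}

delta = unfolding_solvation_change(asa_native, asa_denatured, coeffs)
print(delta)  # solvation contribution to unfolding, in cal/mol
```

In this form the model depends only on how much area of each class becomes exposed on unfolding, which is why the choice of coefficients (and of the reference state used to fit them) determines the outcome.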

Surface proportionality calculations seem to work well for the free energy of solvation of nonpolar compounds (Hermann, 1972; Reynolds et al., 1974; Still et al., 1990), although an unequivocal justification for such a model is not available. Such models have also been used for the enthalpy and entropy, although these quantities contain large cavity contributions (Ben-Naim, 1975; Yu and Karplus, 1988), which cancel in the free energy. A question raised concerning the use of surface area models of hydrophobic solvation focuses on possible differences in the shape of small molecule and protein surfaces. The model systems are mainly small, convex molecules, while proteins exhibit surfaces with markedly different and irregular geometries. If the free energy of solvation depends on the curvature of the nonpolar surface (Sharp et al., 1991a; Nicholls et al., 1991), the proportionality coefficient determined from small molecule data could lead to errors when applied to proteins. However, it is not clear that at the atomic level the protein surface is locally very different from that of small molecules. Moreover, in many of the applications, the specific origin of the model is less important than the fact that it works, at least in an average sense; e.g., even if there were a volume dependence rather than a surface area dependence for individual groups (Baase et al., 1992), a surface area proportionality could still be used since one is mainly counting the number of groups involved. There is little direct information on the applicability of the surface area model for the global folding of proteins. The only data justifying its utilization for proteins are provided by mutation studies (Shortle et al., 1990; Kellis et al., 1989; Eriksson et al., 1992). Since the resulting free energies correspond to conditional thermodynamic quantities (i.e., they represent the free energy changes given that the rest of the protein is already in place), their relation to the overall stability of proteins is indirect.

For polar groups the surface area proportionality assumption is less well established than for nonpolar groups, although it has been used in a number of analyses of protein stability (Makhatadze and Privalov, 1993; Privalov and Makhatadze, 1993) and for estimating the free energies of interaction of protein complexes (Horton and Lewis, 1992). There is evidence that the additivity assumption fails for molecules with several polar groups (Wu and Sandler, 1989; Cabani et al., 1981; Roseman, 1988). A striking example of nonadditivity is that the free energies of solvation of p-hydroxybenzyl alcohol and m-hydroxybenzyl alcohol differ by 3.5 kcal/mol (Ben-Naim, 1990). Other examples of deviations from additivity are given by Wolfenden (1983). Cabani et al. (1981) suggested the use of a correction term whenever two polar groups are close to each other, but they were not able to obtain statistically reliable values for these corrections. In proteins, which have many polar groups in close proximity, significant deviations from the additivity assumption might be expected. The essential peptide group, which contains both a C=O and an NH functionality, is expected to cause problems (Yu et al., 1991), although treating it as a unit may lead to more valid approximate results.

A variety of models of the solvation contribution to protein stability have been developed based on the group additivity and accessible surface area assumptions. The earlier applications mentioned previously used model compound data for the main chain and the side chains of the amino acids. More recently, atomic representations have been introduced. One parameterization scheme is that developed by Eisenberg and McLachlan (1986). They estimated solvent effects from accessible surface areas and atom-based solvation parameters. Five classes of atoms were distinguished (C, O/N, O-, N+, and S) and the contribution of each atom to the solvation free energy was obtained as the product of its solvation parameter and the ASA. The former was determined by fitting the data of Fauchère and Pliska for the transfer of amides between water and octanol. The model has also been parameterized from data for transfer from the gas phase to water (Wesson and Eisenberg, 1992). It has been applied with molecular mechanics potentials to several structural studies, including the evaluation of misfolded structures (see, for example, Chiche et al., 1990), although no global dissection of protein stability has been attempted. Clearly, the reference state makes an important difference in the meaning of the solvation results. If gas-phase to water data are used to determine the proportionality constants, a "pure" solvation contribution is obtained and additional interactions in the protein interior have to be included in a stability calculation. By contrast, the use of transfer data from a liquid or solid to water implies that the liquid or solid reference corresponds to the protein interior so that no additional terms arising from interactions in the latter have to be included. Most simple models of protein stability have been based on the latter choice, though the former can give more insights into the interactions involved. The important work of M & P uses the gas-phase reference state (see below). Mixed models have also been used (Shirley et al., 1992). They are somewhat more complex to interpret (see below).
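The role of the reference state can be made explicit by writing the two choices side by side. The symbols below are assumptions introduced here for illustration (N and D label the native and denatured states, k runs over group classes, and h_k and h'_k are per-area coefficients fitted to gas-to-water and condensed-phase-to-water transfer data, respectively); they are not the chapter's own notation.

```latex
% Gas-phase reference: "pure" solvation terms plus explicit interior interactions
\begin{align}
\Delta H_{N \to D}(\mathrm{sol})
  &= \Delta H_{N \to D}(\mathrm{gas})
   + \left[ \Delta H_{\mathrm{solv}}(D) - \Delta H_{\mathrm{solv}}(N) \right],
\qquad
\Delta H_{\mathrm{solv}}(D) - \Delta H_{\mathrm{solv}}(N)
  \approx \sum_k h_k \, \Delta \mathrm{ASA}_k \\
% Liquid or solid reference: the interior is represented by the reference phase
\Delta H_{N \to D}(\mathrm{sol})
  &\approx \sum_k h'_k \, \Delta \mathrm{ASA}_k
\end{align}
```

In the first form the gas-phase unfolding term carries the van der Waals and hydrogen bonding interactions lost from the protein interior; in the second form those interactions are assumed to be folded into the transfer coefficients.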

Ooi et al. (1987) developed a similar solvation model based on accessible surface areas and a somewhat more detailed group decomposition. They analyzed solvation data for the free energy, enthalpy, and heat capacity of small molecules and amino acid side chains and determined contributions for seven groups (aliphatic C, aromatic C, carbox(n)yl C, carbox(n)yl O, amine or amide N, S or SH, and OH). This parametrization was applied to protein stability (Ooi and Oobatake, 1988, 1991; Khechinashvili, 1990) and to the helix-coil transition of polyalanine (Ooi and Oobatake, 1991). An innovation made by these authors is that they do not treat the protein interior as a separate phase. They use the gas-phase thermodynamic cycle, as described in Scheme I, and estimate the enthalpy and entropy of unfolding in the gas phase in terms of buried surface area, just like the solvation terms. No explicit interactions (e.g., hydrogen bonding, van der Waals terms) appear in their model. The group decomposition of Ooi and Oobatake has been criticized by Makhatadze and Privalov (1993), who suggested that they significantly underestimated the solvation enthalpy of the peptide group. From the large discrepancies between experimental and predicted enthalpies for several amides (Table 3 of Ooi et al., 1987 versus Table 1 of M & P), this criticism appears to be justified. The reason for this discrepancy is that Ooi et al. do not use amide data to derive the contributions for the carbonyl C and O atoms, but rather use data for carboxylic acids. As M & P show, the hydration enthalpy of the COOH group is significantly lower than the sum of the contributions of the CO and OH groups. Therefore, the value that Ooi et al. derive for the carbonyl O (-32 cal/mol Å²) is too small. Since the largest part of the peptide group surface is contributed by the O atom, it follows that the hydration enthalpy of the peptide group is underestimated.

Murphy and Gill (1989a,b, 1990) performed calorimetric measurements to determine the heat capacity and enthalpy of dissolution of cyclic dipeptides from the crystal into water. The motivation for their study was the assumption that the interior of the protein is closer to a peptide crystal than to a nonpolar or polar organic liquid. This is based on the often cited high packing density of the protein interior (Richards, 1977). Murphy and Gill (1990) also analyzed their data by a group contribution approach. They included four groups: the peptide group, nonpolar H, aromatic ring, and OH. The carbon surface is proportional to the number of hydrogens attached to it. For example, the Phe side chain gives one aromatic ring, and 2 + 5 nonpolar hydrogens (i.e., five hydrogens for the aromatic ring and two for the β-methylene group). They have studied a limited number of cyclic dipeptides (with G, A, V, L, P), which include no charged groups. Murphy and Gill (1991) applied their parametrization to the thermodynamics of protein folding and concluded that the exposure of polar groups makes a positive contribution to the enthalpy of unfolding, i.e., the burial of polar groups makes a contribution to the enthalpy that stabilizes the folded state. Also, they concluded that the burial of aliphatic groups contributes a destabilizing term to the enthalpy of the folded state. However, the contribution of the burial of aliphatic groups to the free energy is stabilizing.

Record and co-workers (Spolar et al., 1989; Livingstone et al., 1991) showed that the heat capacity of protein denaturation is proportional to the nonpolar ASA exposed on denaturation. More recently, they incorporated the contribution of buried polar surface area to the heat capacity (Spolar et al., 1992), i.e., they distinguished two types of groups: nonpolar (C atoms) and polar (O and N atoms). The coefficient for burying the nonpolar surface area was determined from liquid hydrocarbon to water transfer data and the coefficient for the polar surface from liquid amide to water transfer data. No contributions other than the two (nonpolar and polar) surface area-dependent terms were included in the model, except for an empirical constant, proportional to the number of residues, which is fitted to the experimental data. Based on this model, Spolar et al. (1992) were able to obtain satisfactory results for the heat capacity and the enthalpy of denaturation of a number of proteins. They stated, however, that the difference between fits with and without a polar term was not significant for the heat capacity.

Pace and Gajiwala (1994) also considered polar and nonpolar contributions to the free energy of protein denaturation. The polar group contributions were divided into two parts: (1) the effect of the burial of polar groups, estimated from water to octanol transfer free energies; and (2) hydrogen bonding in the folded protein. For both polar and nonpolar groups they used the data of Fauchère and Pliska (1983). Pace and Gajiwala (1994) suggested that the hydrogen bond term could be estimated from mutation experiments (Shirley et al., 1992), but actually used a contribution obtained by fitting the calculations to protein stability data; their value was 1.3 kcal/mol for a hydrogen bond, similar to that from mutation studies. For the conformational entropy they assumed a value of 4 e.u. per residue, based on the theoretical estimates of Kauzmann (1959) and the experimental data of Privalov (1979). When they applied this approach to a set of proteins, they found that the nonpolar term made a significant contribution to the free energy of protein stability. For the polar groups, the contribution referred to as "hydrogen bonding" (step 2) was favorable, but it was almost canceled by the water to octanol transfer free energy (step 1). Thus, the folding free energy was dominated by the nonpolar (hydrophobic) term in their analysis.
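To put the conformational entropy of 4 e.u. per residue on the same scale as the 1.3 kcal/mol hydrogen bond term, it can be converted to a free energy contribution. The temperature of 298 K used below is an assumption made here for illustration; it is not stated in the text above. One entropy unit (e.u.) is 1 cal mol^-1 K^-1.

```latex
\begin{equation}
T \, \Delta S_{\mathrm{conf}}
  \approx 298\ \mathrm{K} \times 4\ \mathrm{cal\,mol^{-1}\,K^{-1}}
  = 1192\ \mathrm{cal\,mol^{-1}}
  \approx 1.2\ \mathrm{kcal/mol\ per\ residue}
\end{equation}
```

Under this assumption the entropic cost of ordering one residue is comparable in magnitude to the fitted contribution of a single hydrogen bond.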

Privalov and co-workers have supplemented their early measurements of protein folding thermodynamics by a series of studies concerned with an analysis based on group contributions. They measured the heat capacity of transfer of small molecules from the gas phase to water and performed a group contribution analysis of the results (Makhatadze and Privalov, 1990). They used the following groups: the protein backbone (later the peptide group, CONH, alone), aliphatic groups, aromatic groups, and the polar parts of the polar side chains; side chains were divided into polar parts and nonpolar aromatic and aliphatic portions. Subsequently, they reanalyzed the heat capacity data using a correction for the molar volume difference between the gas phase and solution in the model compound data (Privalov and Makhatadze, 1992); this correction is small. The sum of the nonpolar and polar contributions was found to be close to the experimental heat capacity. The remainder was small and negative, in contrast to a positive vibrational contribution expected from the analysis of Sturtevant (1977). The group decomposition method was extended to an estimate of the enthalpy, entropy, and free energy of protein folding by Privalov and Makhatadze (1993; Makhatadze and Privalov, 1993). The hydration enthalpy of nonpolar and polar groups was obtained from a surface area model. To account for the use of a gas-phase reference state, they included additional contributions arising from van der Waals and hydrogen bonding terms in the folded protein. For the entropy, they estimated the hydration contribution by a corresponding surface area model. When they estimated the configurational entropy of the polypeptide chain they found a large difference between the calculated and experimental entropy of unfolding. They assigned this difference to the side chain configurational entropy.

The models summarized in this section represent a variety of approaches that come to significantly different conclusions concerning the important contributions to protein stability. In the following sections we try to use atom-based calculations to understand where the differences arise and which of the models corresponds most closely to reality.

III. ENTHALPY OF PROTEIN FOLDING

The enthalpy of protein denaturation in solution, $\Delta H_N^U(\mathrm{sol})$, is the experimental quantity that any detailed description must be able to interpret. Measurements of $\Delta H_N^U(\mathrm{sol})$ for several proteins are available from the work of Privalov and co-workers (Privalov, 1979, 1989; Privalov and Gill, 1988). At present there exist no methods for directly calculating $\Delta H_N^U(\mathrm{sol})$ that are sufficiently accurate to provide detailed information concerning the various contributions. For the native state, it is possible to perform molecular dynamics simulations to evaluate the enthalpy in vacuum. Experiments (Frauenfelder et al., 1991) and molecular dynamics simulations (Brooks et al., 1988) show that the native state is a Boltzmann average over a very large number of configurations on a multiminimum surface (Elber and Karplus, 1987); the average structure corresponds to that determined by X-ray diffraction or nuclear magnetic resonance measurements. Since all the configurations of high probability (i.e., those that make a significant contribution to the native state average) are very similar, the ensemble of structures that represents the native state and contributes to the energy or enthalpy of the folded protein can be sampled adequately. Both the internal protein contributions and the protein-solvent contributions can be calculated with reasonable accuracy by use of empirical energy functions (Brooks et al., 1988). However, the full enthalpy in solution includes a cavity (pure solvent) term (Ben-Naim, 1975; Yu and Karplus, 1988) which is more difficult to evaluate by simulations because it converges slowly. For the denatured state, the situation is even more complex because a wide range of very different conformations is present at room temperature. Although no full averaging is feasible, estimates could be made by considering a subset of the possible configurations. Even this is problematic because of the need to solvate a system that corresponds to an extended polypeptide chain.

Because of the difficulties of a direct approach, the thermodynamic cycle in Scheme I is introduced; an analogous cycle can be used for the entropy and free energy. Although the cycle, per se, does not eliminate any calculational difficulties, it makes possible the conceptual decomposition of the unfolding transition into a series of steps, each of which can be considered separately. In contrast to $\Delta H_N^U(\mathrm{sol})$, the quantity $\Delta H_N^U(\mathrm{vac})$ can be evaluated in a relatively straightforward way by simulations of the isolated protein in the native and in the unfolded states. To obtain information for comparison with the calculation of $\Delta H_N^U(\mathrm{vac})$, we consider the analysis of the denaturation enthalpy of proteins presented by M & P. Their results can be expressed in terms of the cycle shown in Scheme I. They further decompose $\Delta H_N^U(\mathrm{vac})$ into van der Waals and hydrogen bonding terms, which can be compared with the van der Waals and electrostatic contributions to $\Delta H_N^U(\mathrm{vac})$ obtained directly from calculations. In subsequent sections, we examine the calculated van der Waals and electrostatic terms in more detail. This provides a better understanding of their contributions to protein stability.

A. Makhatadze & Privalov Analysis

$$
\begin{array}{ccc}
\text{Folded Protein (vac)} & \xrightarrow{\;\Delta H_N^U(\mathrm{vac})\;} & \text{Unfolded Protein (vac)} \\
\big\downarrow \; \Delta H^{S}(\mathrm{N}) & & \big\downarrow \; \Delta H^{S}(\mathrm{U}) \\
\text{Folded Protein (sol)} & \xrightarrow{\;\Delta H_N^U(\mathrm{sol})\;} & \text{Unfolded Protein (sol)}
\end{array}
$$

SCHEME I.

The M & P analysis proceeds by using the experimental measurements of $\Delta H_N^U(\mathrm{sol})$ and estimates of the solvation enthalpies $\Delta H^{S}(\mathrm{N})$ and $\Delta H^{S}(\mathrm{U})$ to determine $\Delta H_N^U(\mathrm{vac})$. The solvation terms $\Delta H^{S}(\mathrm{N})$ and $\Delta H^{S}(\mathrm{U})$ are divided into polar and nonpolar contributions. Both are assumed to be proportional to the exposed surface area and parameters for the surface area dependence are obtained by fitting experimental solvation data for model compounds. It is assumed that the solvation contributions are additive. The peptide backbone (CONH) is treated as one unit and each side chain is divided into a nonpolar and polar portion. General parameters are used for nonpolar groups (aliphatic and aromatic), while specific parameters are used for the polar portion of each amino acid side chain. The exposed surface area of each amino acid in the native state is determined from the crystal structure. For the unfolded state, the exposed surface area of each amino acid is assumed to be equal to that of the same amino acid in an extended Gly-X-Gly tripeptide; no effects of neighboring amino acid side chains are included. With these results and the additivity assumption, the total solvation enthalpy of the unfolded state is a sum over the individual amino acids. Formally, we have (see Scheme I)

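$$
\begin{aligned}
\Delta H_N^U(\mathrm{vac}) &= \Delta H_N^U(\mathrm{sol}) + \Delta H^{S}(\mathrm{N}) - \Delta H^{S}(\mathrm{U}) \\
&= \Delta H_N^U(\mathrm{sol}) + \Delta H^{S}(\mathrm{N,np}) - \Delta H^{S}(\mathrm{U,np}) + \Delta H^{S}(\mathrm{N,p}) - \Delta H^{S}(\mathrm{U,p}) \\
&= \Delta H_N^U(\mathrm{sol}) - \Delta H_N^U(\mathrm{sol,np}) - \Delta H_N^U(\mathrm{sol,p}),
\end{aligned}
\tag{1}
$$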

where p and np refer to the polar and nonpolar portions of the individual amino acids; the peptide group is treated as part of the polar contribution. M & P estimate $\Delta H_N^U(\mathrm{sol,np})$ and $\Delta H_N^U(\mathrm{sol,p})$, the solvent contributions to the unfolding enthalpy of the nonpolar and polar amino acids, respectively, from the differences in the exposed surface area between the unfolded and native states of each contributing element. The results for four proteins in kcal/mol at 25°C are given in Table I. Combining these values as in Eq. (1) with the experimental values for $\Delta H_N^U(\mathrm{sol})$, M & P obtain $\Delta H_N^U(\mathrm{vac})$; their values in kcal/mol are also listed in Table I.
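The bookkeeping in the last line of Eq. (1) can be checked against the Table I entries with a few lines of Python. This is only an illustrative sketch: the function and variable names are ours, and the numbers are transcribed from Table I.

```python
# Table I entries (kcal/mol at 25 °C):
# (dH_sol, dH_sol_np, dH_sol_p, dH_vac as tabulated by M & P)
TABLE_I = {
    "Cytochrome c": (21.0, -246.0, -1525.0, 1792.0),
    "Ribonuclease": (71.0, -219.0, -1854.0, 2143.0),
    "Lysozyme": (58.0, -261.0, -1769.0, 2088.0),
    "Myoglobin": (1.4, -376.0, -2246.0, 2623.0),
}


def vacuum_enthalpy(dh_sol, dh_sol_np, dh_sol_p):
    """Last line of Eq. (1): dH(vac) = dH(sol) - dH(sol,np) - dH(sol,p)."""
    return dh_sol - dh_sol_np - dh_sol_p


for protein, (dh_sol, dh_np, dh_p, tabulated) in TABLE_I.items():
    computed = vacuum_enthalpy(dh_sol, dh_np, dh_p)
    # Differences of about 1 kcal/mol reflect rounding of the tabulated inputs.
    print(f"{protein:13s} computed {computed:7.1f}   tabulated {tabulated:7.1f}")
```

Because the two solvation terms are large and negative, the vacuum enthalpy obtained this way is dominated by them rather than by the small measured $\Delta H_N^U(\mathrm{sol})$.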

To proceed further, M & P assume that $\Delta H_N^U(\mathrm{vac})$ has only two contributions. They are the van der Waals interactions and the hydrogen bonding interactions, both of which stabilize the native state. Only the nonpolar groups are considered in calculating the van der Waals term. Values for the van der Waals interaction per unit surface area (Å²) are determined from the heat of sublimation of crystalline cyclohexane for the aliphatic portions of the amino acid side chains and from the heat of sublimation of crystalline benzene for the aromatic portions of the amino acid side chains. This choice for a reference state for the protein interior is justified by M & P with the argument that proteins have the same packing density as hydrocarbon crystals; this point is considered in Section IV. The van der Waals contribution to $\Delta H_N^U(\mathrm{vac})$ is obtained from the difference in the exposed surface of the aliphatic and aromatic groups in the folded and unfolded state; the resulting values for the four proteins are given in Table I. Since M & P assume that the only remaining vacuum contribution comes from hydrogen bonding in the native state, the hydrogen bond term is obtained by difference (see Table I). Finally, M & P divide the hydrogen bonding contribution by the number of “network” hydrogen bonds to obtain the vacuum energy of a typical protein hydrogen bond. Network hydrogen bonds are determined on a structural basis without restricting each donor or acceptor to participate in only one hydrogen bond (Stickle et al., 1992). The resulting energies (in kcal/mol) are also listed in Table I. Such an estimate of the number of hydrogen bonds corresponds to an upper limit and so gives a lower limit for the enthalpy per hydrogen bond. If it is assumed instead that each donor-acceptor pair forms a single hydrogen bond, the enthalpy per hydrogen bond would be increased by a factor of about 1.4.

TABLE I
M & P Analysis of Enthalpy of Unfolding^a

| Protein | $\Delta H_N^U(\mathrm{sol})$ | $\Delta H_N^U(\mathrm{sol,np})$ | $\Delta H_N^U(\mathrm{sol,p})$ | $\Delta H_N^U(\mathrm{vac})$ | $\Delta H_N^U(\mathrm{vdW})$ | $\Delta H_N^U(\mathrm{elec})$ | Enthalpy/H bond |
|---|---|---|---|---|---|---|---|
| Cytochrome c | 21 | -246 | -1525 | 1792 | 266 | 1526 | 14.3 |
| Ribonuclease | 71 | -219 | -1854 | 2143 | 234 | 1910 | 14.6 |
| Lysozyme | 58 | -261 | -1769 | 2088 | 281 | 1808 | 11.5 |
| Myoglobin | 1.4 | -376 | -2246 | 2623 | 404 | 2219 | 9.1 |

^a All values in kcal/mol at 25°C, obtained from Tables 9 and 10 of Makhatadze and Privalov (1993).

The resulting hydrogen bond energies, which range from 9.1 kcal/mol for myoglobin to 14.6 kcal/mol for ribonuclease A, are significantly larger than the usually quoted energy of a peptide hydrogen bond in vacuum. The value of the latter is generally estimated to be 5 or 6 kcal/mol (Rose and Wolfenden, 1993). Thus, the analysis of M & P leads to an unrealistically large value for the strength of a protein hydrogen bond. M & P are aware of this and state that “the estimated value of the energy of hydrogen bonding is not as large as it appears at first glance” (Makhatadze and Privalov, 1993, p. 656) and that “the value that we obtained for proteins seems quite reasonable, especially if one takes into account that the energy we are considering corresponds not to a single hydrogen bond, but to a bond in the cooperative network of hydrogen bonds. . . .” (Makhatadze and Privalov, 1993, p. 656).

It should be clear that the large value for the hydrogen bond energy is a direct consequence of the analysis used to obtain it. The value of the hydrogen bond contribution arises from the experimental values of $\Delta H_N^U(\mathrm{sol})$ after all other contributions are determined [see Eq. (1)]. If any of the other contributions were significantly in error, the hydrogen bond energy would be wrong as well. An analysis of hydrogen bonding is given in Section V.
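To illustrate how an error in another term passes directly into a hydrogen bond energy obtained by difference, the following rough sketch uses the Table I values. Two things here are assumptions of ours, not part of the M & P analysis: the 10% relative error in the polar solvation term is a purely hypothetical figure, and the number of network hydrogen bonds is back-calculated from the tabulated total and per-bond enthalpies.

```python
# Table I values (kcal/mol): polar solvation term, H-bond total, enthalpy per H bond.
TABLE_I = {
    "Cytochrome c": (-1525.0, 1526.0, 14.3),
    "Ribonuclease": (-1854.0, 1910.0, 14.6),
    "Lysozyme": (-1769.0, 1808.0, 11.5),
    "Myoglobin": (-2246.0, 2219.0, 9.1),
}

HYPOTHETICAL_RELATIVE_ERROR = 0.10  # assumed for illustration only


def per_bond_shift(dh_sol_p, dh_hbond, per_bond, rel_error):
    """Change in the enthalpy per H bond if the polar solvation term is off by rel_error."""
    n_bonds = dh_hbond / per_bond  # implied number of network H bonds
    return abs(dh_sol_p) * rel_error / n_bonds


for protein, (dh_sol_p, dh_hbond, per_bond) in TABLE_I.items():
    shift = per_bond_shift(dh_sol_p, dh_hbond, per_bond, HYPOTHETICAL_RELATIVE_ERROR)
    print(f"{protein:13s} shift per H bond: {shift:4.2f} kcal/mol")
```

Under this assumed error, the per-bond value moves by roughly 1 to 1.5 kcal/mol, which is not small compared with the 5 to 6 kcal/mol usually quoted for a peptide hydrogen bond in vacuum.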

After completion of our chapter, Makhatadze and Privalov provided us with the revised interpretation of their data included in this volume. Sections I-VII of our chapter refer to the published work of Makhatadze and Privalov (1993); the new analysis is discussed in the Epilogue to our chapter.

B. Theoretical Calculation of Vacuum Denaturation Enthalpy

Molecular mechanics and dynamics methods based on an atomic model and empirical energy functions are best suited for calculating the energy of the native and unfolded states in vacuum. Thus, for comparison with the M & P analysis in Table I, we first consider $\Delta H_N^U(\mathrm{vac})$. The energy of either the native or denatured states can be written as

$$
E = E_{\min} + E_{\mathrm{trans}} + E_{\mathrm{rot}} + E_{\mathrm{vib}}, \tag{2}
$$

where $E_{\min}$ is the energy corresponding to the minimum energy structure and $E_{\mathrm{trans}}$, $E_{\mathrm{rot}}$, and $E_{\mathrm{vib}}$ represent the translational, rotational, and vibrational contributions that depend on temperature (Brooks et al., 1988). For an energy surface with multiple minima, as is found in the native and denatured states of proteins, an average over the significant minima would be required; one procedure for doing this has been described by Stillinger and Weber (1982). Alternatively, the average energy can be calculated from a molecular dynamics simulation, which directly leads to a result including the motional contributions (specifically, $E_{\mathrm{vib}}$). The motional contributions are not negligible; an estimate of their value for insulin in the rigid rotator, harmonic oscillator approximation, which makes possible a straightforward evaluation of quantum effects, is given in Tidor and Karplus (1994). However, the difference in $E_{\mathrm{trans}} + E_{\mathrm{rot}} + E_{\mathrm{vib}}$ between the native and unfolded states is expected to be small, even though there are many conformers that contribute to the unfolded state. This is true because the energy is a Boltzmann weighted average over conformations, each of which makes a similar contribution. In contrast, the entropy also depends on the total number of accessible conformations (Karplus et al., 1987).
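One way to see why the motional terms nearly cancel is to take the classical limit of the rigid rotator, harmonic oscillator approximation. This is a simplifying assumption made here only for illustration; the estimate cited above includes quantum effects. For a nonlinear molecule of $n$ atoms, equipartition gives, per mole,

$$
E_{\mathrm{trans}} = \tfrac{3}{2}RT, \qquad
E_{\mathrm{rot}} = \tfrac{3}{2}RT, \qquad
E_{\mathrm{vib}} = (3n-6)\,RT,
$$

so that

$$
E_{\mathrm{trans}} + E_{\mathrm{rot}} + E_{\mathrm{vib}} = 3(n-1)\,RT .
$$

The sum depends only on the number of atoms and the temperature, not on the conformation. Since the native and unfolded chains contain the same atoms, the difference vanishes in this limit, and any residual contribution comes from quantum corrections and anharmonicity.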

Given the above and the fact that the difference between the enthalpy and energy is expected to be small, we can write

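$$
\Delta H_N^U(\mathrm{vac}) \approx \Delta E_N^U(\mathrm{vac}) = \langle E_U(\mathrm{vac}) \rangle - \langle E_N(\mathrm{vac}) \rangle \approx E_U(\mathrm{min}) - E_N(\mathrm{min}), \tag{3}
$$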

where the brackets correspond to dynamic averages. Consequently, we can begin our analysis of the thermodynamic cycle shown in Scheme I by calculating $\Delta E_N^U(\mathrm{vac})$. Equation (3) makes it clear that we need to calculate the energy of both the native and the unfolded states. As we see in what follows, neglect of the unfolded state contribution would yield meaningless results.

To evaluate $E_U(\mathrm{min})$ and $E_N(\mathrm{min})$, we use the polar hydrogen model and the CHARMM program (Brooks et al., 1988); details are given in the Appendix. The full interactions without any cutoff were calculated. The X-ray structure for the native state and an extended chain model for the unfolded state were employed. To avoid artifacts due to bad contacts, both structures were gently minimized. In addition, the average energy, $\langle E_N(\mathrm{vac}) \rangle$, was calculated from molecular dynamics simulations to compare with $E_N(\mathrm{min})$ so as to estimate the corrections to the static model. Since M & P based their analysis on a polypeptide chain with neutral side chains (Privalov, 1979; Makhatadze and Privalov, 1993), neutralized charge distributions were introduced for the ionizable side chains (see Appendix). The effect of ionizable groups on $\Delta H_N^U(\mathrm{sol})$ is expected to be “small” because they are almost entirely exposed to solvent in both the native and the unfolded states (Yang and Honig, 1992). We discuss this point when we consider the enthalpies of solvation of the native and denatured states in Section VI.

Calculations were performed on the four proteins studied by M & P: cytochrome c (Cyt c), ribonuclease A (RNase A), lysozyme (Lys), and myoglobin (Mb). Table II lists $\Delta E_N^U(\mathrm{vac}) \approx \Delta H_N^U(\mathrm{vac})$ and gives its decomposition into bonding and nonbonding terms; the former includes bond, bond angle, dihedral, and improper dihedral contributions, and the latter consists of van der Waals and electrostatic contributions, which are listed separately. Separate results for the native and the unfolded proteins are presented in Table III. It is clear that the unfolded structure has a large negative energy, which must be included in obtaining a meaningful value for $\Delta E_N^U(\mathrm{vac})$. The total denaturation energy, $\Delta E_N^U(\mathrm{vac})$, is about one-fourth of the calculated native state energy, $E_N(\mathrm{vac})$. The main cancellation occurs for the electrostatic term, which in the unfolded state is equal to 80% of the native state value. This is due primarily to the presence of many attractive short-range interactions between main chain polar groups. The dominant attractive electrostatic interaction comes from dipole-dipole interactions in adjacent residues, which contribute about -30 kcal/mol per residue in both the folded and unfolded states (see Section V). There is almost complete cancellation of the bonding interactions, as discussed below. The van der Waals term is reduced much more in the unfolded state because there is a significant contribution from the longer-range attractions in the compact native state.

TABLE II
Calculated Energy of Unfolding^a

| Protein | $N_{\mathrm{res}}$ | Molecular weight | $\Delta E_N^U(\mathrm{vac})$ | $\Delta E_N^U(\mathrm{vdW})$ | $\Delta E_N^U(\mathrm{elec})$ | $\Delta E_N^U(\mathrm{bond})$ |
|---|---|---|---|---|---|---|
| Cytochrome c | 103 | 12,300 | 943 | 688 | 225 | 30 |
| RNase A | 124 | 13,700 | 1,068 | 654 | 398 | 16 |
| Lysozyme | 129 | 14,300 | 1,116 | 738 | 351 | 27 |
| Myoglobin | 153 | 17,800 | 1,492 | 1,020 | 421 | 51 |

^a All values in kcal/mol calculated as described in text (see also Appendix). For myoglobin and cytochrome c it is assumed that the free heme will have the same self (intraheme) energy as in the protein.

TABLE III
Calculated Energies of Native Proteins^a and Unfolded Chains^b

| Protein | Native: total | Native: vdW | Native: elec | Native: bond | Unfolded: total | Unfolded: vdW | Unfolded: elec | Unfolded: bond |
|---|---|---|---|---|---|---|---|---|
| Cytochrome c | -4421 | -916 | -3886 | 380 | -3652 | -219 | -3710 | 278 |
| RNase A | -5528 | -904 | -4941 | 317 | -4460 | -250 | -4544 | 333 |
| Lysozyme | -5820 | -999 | -5176 | 355 | -4704 | -261 | -4825 | 382 |
| Myoglobin | -6678 | -1373 | -5731 | 426 | -5298 | -348 | -5360 | 410 |

^a All values in kcal/mol calculated as described in text and in the Appendix.
^b The free heme is not included in the reported values for Cyt c and Mb.

The importance of the unfolded chain contributions raises the question of the validity of the single extended structure used here. It is expected to yield limiting values for the energy but an ensemble of structures should be introduced to obtain more precise values for the various contributions.

Table II shows that both the van der Waals and electrostatic contributions to $\Delta H_N^U(\mathrm{vac})$ are important, while the bonding terms are small. In what follows, we do not consider the latter. The exact value of the bonding term is sensitive to the minimization protocol but it is always small in comparison with other contributions to the vacuum unfolding enthalpy. It has been proposed (Creighton, 1984) that an unfavorable contribution to folding comes from bonding interactions, i.e., from strain of the protein in the folded form. This is unlikely to be significant. Dihedral angles in folded proteins are close to the minimum value for the isolated system (Gelin and Karplus, 1975); for the stiffer bond lengths and angles the strain energy is expected to be smaller. However, the present results are not precise enough for a quantitative test of this question. The bonded contributions in Table II are positive for unfolding, but this is due to the fact that there is more strain in the starting extended structures than in the starting crystal structures. In one case (barnase; results not shown) in which we performed a short dynamics simulation (1000 steps) of the extended structure and then minimized, the sign of the bonding contribution was reversed: we obtained -37 kcal/mol, compared to +9 kcal/mol obtained with the above protocol. For a more accurate estimate, it would be necessary to calculate the average bonded energy obtained from simulations of both the folded and unfolded forms. Although the two separate terms are relatively large (see Table III), the bonded contribution to $\Delta H_N^U(\mathrm{vac})$ is expected to be small, whatever its sign.

The results for the calculated unfolding energy on a per gram basis, $\Delta e_N^U(\mathrm{vac})$, and their contributions, are given in Table IV. It can be seen that all energy components are similar for the different proteins. This supports the idea that the mechanism of protein stability is essentially independent of the protein. The same holds true for the values of $\Delta E_N^U(\mathrm{vac})$ estimated by M & P; from the results in Table I, we obtain 149 ± 5 cal/g. However, it should be noted that the variability is sufficiently large that use of the per gram values to predict the $\Delta E_N^U(\mathrm{vac})$ would be reasonable in terms of order of magnitude but could lead to quantitative errors. Such an approach is expected to work equally well for $\Delta S_N^U(\mathrm{vac})$ and $\Delta G_N^U(\mathrm{vac})$. Unfortunately, for $\Delta G_N^U(\mathrm{sol})$, which involves essentially complete cancellations between the $\Delta G_N^U(\mathrm{vac})$ and the solvation terms, $\Delta G_N^U(\mathrm{sol,np})$ plus $\Delta G_N^U(\mathrm{sol,p})$, the errors in the approximation of simple mass scaling are expected to be too large for the results to be of interest.
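The per gram values follow from dividing the calculated unfolding energies by the molecular weights. A minimal sketch of that conversion, with the data transcribed from Table II and helper names of our own choosing, is:

```python
from statistics import mean, stdev

# Table II: molecular weight (g/mol), then calculated unfolding energies (kcal/mol):
# total, van der Waals, electrostatic.
TABLE_II = {
    "Cytochrome c": (12_300, 943, 688, 225),
    "RNase A": (13_700, 1_068, 654, 398),
    "Lysozyme": (14_300, 1_116, 738, 351),
    "Myoglobin": (17_800, 1_492, 1_020, 421),
}


def per_gram(energy_kcal_per_mol, mol_weight):
    """Convert kcal/mol to cal/g."""
    return 1000.0 * energy_kcal_per_mol / mol_weight


totals = [per_gram(total, mw) for mw, total, _, _ in TABLE_II.values()]

for protein, value in zip(TABLE_II, totals):
    print(f"{protein:13s} {value:5.1f} cal/g")
# Sample standard deviation over the four proteins, independent of size.
print(f"average {mean(totals):.0f} +/- {stdev(totals):.0f} cal/g")
```

The spread of a few cal/g looks small, but multiplied by a molecular weight above 10,000 it corresponds to tens of kcal/mol, which is why per gram scaling is useful only for orders of magnitude.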

Table V compares the calculated values of $\Delta E_N^U(\mathrm{vac}) \approx \Delta H_N^U(\mathrm{vac})$ from Table II with those estimated by M & P in Table I. The values of M & P are significantly larger than the calculated values; the ratio varies between 1.76 for Mb and 2.01 for RNase A. To determine the source of the difference, we also list in Table V the calculated van der Waals and electrostatic contributions and compare them with the M & P estimates. The total calculated van der Waals interaction is significantly larger than the M & P value; the ratio varies from 2.79 for RNase A to 2.52 for Mb. In contrast to the van der Waals interactions, the calculated electrostatic contribution is much smaller than the hydrogen bond energy of M & P; the ratio varies from 0.21 for RNase A to 0.15 for Cyt c.

TABLE IV
Calculated Energy of Unfolding on Calorie/gram Basis^a

| Protein | $\Delta e_N^U(\mathrm{vac})$ | $\Delta e_N^U(\mathrm{vdW})$ | $\Delta e_N^U(\mathrm{elec})$ |
|---|---|---|---|
| Cytochrome c | 77 | 56 | 18 |
| RNase A | 78 | 48 | 29 |
| Lysozyme | 78 | 52 | 25 |
| Myoglobin | 84 | 57 | 24 |
| Average^b | 79 ± 3 | 53 ± 4 | 24 ± 5 |

^a Calculated from Table II.
^b Value obtained by averaging the four proteins, independent of size.


TABLE V
Vacuum Enthalpy of Unfolding, $\Delta H_N^U(\mathrm{vac})$, and Its Decomposition^a

| Protein | Calculated: total | Calculated: vdW | Calculated: elect | M & P: total | M & P: vdW | M & P: H bond | M & P: H bond^b |
|---|---|---|---|---|---|---|---|
| Cytochrome c | 943 | 688 | 225 | 1792 | 266 | 1526 | 1104 |
| RNase A | 1068 | 654 | 398 | 2144 | 234 | 1910 | 1489 |
| Lysozyme | 1116 | 738 | 351 | 2089 | 281 | 1808 | 1350 |
| Myoglobin | 1492 | 1020 | 421 | 2623 | 404 | 2219 | 1603 |

^a All values in kcal/mol at 25°C.
^b Value obtained by subtracting the calculated van der Waals energy from the total M & P (1993) energy.
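The ratios quoted in the discussion of Table V can be reproduced directly from the tabulated values. The short sketch below does this; the dictionary layout and key names are ours, and the numbers are transcribed from Table V.

```python
# Table V (kcal/mol): calculated (total, vdW, elec) and M & P (total, vdW, H bond).
TABLE_V = {
    "Cytochrome c": ((943, 688, 225), (1792, 266, 1526)),
    "RNase A": ((1068, 654, 398), (2144, 234, 1910)),
    "Lysozyme": ((1116, 738, 351), (2089, 281, 1808)),
    "Myoglobin": ((1492, 1020, 421), (2623, 404, 2219)),
}


def ratios(calc, mp):
    """Ratios in the direction quoted in the text for each contribution."""
    return {
        "total_mp_over_calc": mp[0] / calc[0],
        "vdw_calc_over_mp": calc[1] / mp[1],
        "elec_calc_over_hbond": calc[2] / mp[2],
    }


RATIOS = {protein: ratios(calc, mp) for protein, (calc, mp) in TABLE_V.items()}

for protein, r in RATIOS.items():
    print(
        f"{protein:13s} total {r['total_mp_over_calc']:.2f}  "
        f"vdW {r['vdw_calc_over_mp']:.2f}  elec {r['elec_calc_over_hbond']:.2f}"
    )
```

Note that the total ratio is M & P over calculated, whereas the van der Waals and electrostatic ratios are calculated over M & P, following the wording of the text.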

From the above results, it is evident that there is a large discrepancy between the value of $\Delta H_N^U(\mathrm{vac})$ obtained by use of a thermodynamic cycle with a surface area model for solvation effects and that calculated directly with an empirical energy function. The general features of the disagreement are the same for all the proteins, although there are quantitative differences. The calculated van der Waals contributions are significantly larger and the calculated electrostatic contributions are much smaller than the estimates based on the surface area model. In the following sections we explore the origin of the difference. We examine the van der Waals and electrostatic (hydrogen bonding) terms in Sections IV and V, respectively. This is followed by an analysis in Section VI of the difference in the solvation of the folded and unfolded states [$\Delta H^{S}(\mathrm{N})$ and $\Delta H^{S}(\mathrm{U})$, respectively], which plays an essential role in the M & P estimates of the contributions to $\Delta H_N^U(\mathrm{vac})$.

IV. VAN DER WAALS INTERACTIONS IN PROTEINS AND MODEL SYSTEMS

The van der Waals energy difference between the native and the unfolded proteins was calculated in the previous section from an atomic model for the energy minimized structures. It was shown that the van der Waals terms make an important contribution to $\Delta H_N^U(\mathrm{vac})$. Further, the calculated values were significantly larger than the estimates of M & P for a set of four proteins. To understand the origin of the difference between the two results and to validate the atomic model calculations of the van der Waals interactions, we present a more detailed analysis in this section.

There has been considerable debate as to which reference system is most appropriate for the interior of a protein (see Section II). Model systems that have been suggested include hydrocarbon crystals (M & P, 1993; Nicholls et al., 1991), hydrocarbon liquids (Baldwin, 1986; Spolar et al., 1989, 1992), polar liquids, such as octanol (Fauchere and Pliska, 1983; Pace and Gajiwala, 1994), and peptide crystals (Murphy and Gill, 1991). We calculate the van der Waals contribution to the energy of some of these systems and compare the results with experiment where possible. Comparison of the model system results and those calculated for the protein interior allows an evaluation of the various systems as models for proteins.

A. Model for van der Waals Interactions

A polar hydrogen representation was used with the protein parameters in the CHARMM potential energy function (Brooks et al., 1983). For the van der Waals interaction energy, a pairwise expression of the Lennard-Jones form

$$
E_{\mathrm{vdW}} = \varepsilon \left[ \left( \frac{R_{\min}}{r} \right)^{12} - 2 \left( \frac{R_{\min}}{r} \right)^{6} \right] \tag{4}
$$

is employed. Here, $\varepsilon$ is the well depth and $R_{\min}$ is the distance where the minimum in $E_{\mathrm{vdW}}$ occurs. The van der Waals parameters used for extended carbon atoms in several potential energy functions that are widely used for macromolecules are shown in Table VI. It can be seen that these parameters do not vary significantly, with the exception of AMBER, which employs somewhat smaller values. We used these parameters in Section III and employ them here for alkanes, aromatics, and the nonpolar interactions of the cyclic peptides. In most applications to bulk systems, a cutoff is introduced for the van der Waals interaction to speed up the calculations. We use a cutoff of 8.5 Å with a switching function that starts at 8 Å (Brooks et al., 1983). For consistency, the same cutoff scheme was used in all calculations reported in this section, so that the relative magnitudes of the interactions in different media are not affected. The effect of the cutoff on the absolute magnitude of the calculated interactions was estimated to be about 10 to 15%. For the protein analysis in Section IV,B we give results without a cutoff to be consistent with Section III. Details of the calculations are given in the Appendix.

TABLE VI
Lennard-Jones Parameters for Extended Atom Model^a

| Parameter set | CH3: $R_{\min}$ | CH3: $\varepsilon$ | CH2: $R_{\min}$ | CH2: $\varepsilon$ |
|---|---|---|---|---|
| CHARMM19 | 4.33 | 0.18 | 4.47 | 0.11 |
| OPLS^b | 4.24 | 0.21 | 4.37 | 0.12 |
| AMBER^c | 4.00 | 0.15 | 3.85 | 0.12 |
| UNICEPP^d | 4.24 | 0.18 | 4.44 | 0.14 |

^a See Eq. (4) for definitions. $\varepsilon$ is in kcal/mol and $R_{\min}$ in Å. Adapted from MacKerell et al. (1995).
^b Jorgensen et al. (1984).
^c Weiner et al. (1984).
^d Dunfield et al. (1978).
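As a small illustration of Eq. (4), the following sketch evaluates the Lennard-Jones energy for like pairs of extended carbon atoms with the CHARMM19 values from Table VI. It is not the CHARMM implementation: it omits the switching function and cutoff described above, and it treats only like pairs, since the combination rules for unlike pairs are not given in this section.

```python
def lennard_jones(r, r_min, eps):
    """Pairwise energy of Eq. (4), in the units of eps (kcal/mol here).

    r and r_min are in angstroms; the minimum, -eps, occurs at r = r_min.
    """
    x6 = (r_min / r) ** 6
    return eps * (x6 * x6 - 2.0 * x6)


# Like-pair CHARMM19 extended-atom parameters from Table VI:
# (R_min in angstroms, eps in kcal/mol).
CHARMM19 = {"CH3": (4.33, 0.18), "CH2": (4.47, 0.11)}

for group, (r_min, eps) in CHARMM19.items():
    for r in (0.9 * r_min, r_min, 1.5 * r_min, 8.0):
        energy = lennard_jones(r, r_min, eps)
        print(f"{group}-{group}  r = {r:5.2f} A  E = {energy:8.4f} kcal/mol")
```

With this form the energy crosses zero at $r = 2^{-1/6} R_{\min}$ and is still slightly attractive at 8 Å, which is the long-range tail that a cutoff removes.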

The results given in various portions of this section validate the van der Waals parameters for nonpolar molecules. For polar systems (e.g., amides) the importance of the van der Waals contribution to the heat of vaporization of pure liquids has been described by MacKerell and Karplus (1991).

B. Decomposition of van der Waals Interactions in Proteins

The van der Waals interaction energies in the protein interior that contribute to $\Delta H_N^U(\mathrm{vac})$ were calculated in Section III. They can be divided into three components: nonpolar-nonpolar (np-np), nonpolar-polar (np-p), and polar-polar (p-p) interactions. The nonpolar atoms correspond to all carbons except the backbone carbonyl, the side-chain carboxyl carbon of Asp and Glu, and the side-chain amide carbon of Asn and Gln. The latter carbons plus all other atoms, including polar hydrogens, are treated as p atoms. Inclusion of the carboxyl and amide carbons of Asp, Glu, Asn, and Gln in the nonpolar class slightly increases the magnitude of the np-np term but does not affect the conclusions of this analysis. The decomposition of the calculated van der Waals interactions into np-np, np-p, and p-p terms is given in Table VII, which also lists the van der Waals estimate of M & P. As already pointed out in Section III, the total calculated van der Waals energy is 2.52 to 2.79 times larger than the M & P estimate. The major reason for the difference is that M & P included only the nonpolar contribution. The calculated np-np term is similar to, but somewhat smaller than (about three-fourths), the corresponding M & P estimates from hydrocarbon crystals. The difference is a consequence of the procedure used by M & P to estimate the van der Waals energy. In a hydrocarbon crystal, the heat of sublimation, which is approximately equal to half of the binding energy of each molecule with its surrounding molecules, is given by its ASA times an empirical coefficient. When applied to the nonpolar groups in a protein, this procedure gives half of the binding energy of the nonpolar groups with their surroundings, i.e., 1/2 (np-np + np-p). Since np-p interactions are larger in magnitude than np-np interactions, the M & P estimate will be larger than the true np-np value. In other words, M & P include some of the np-p interactions in their np-np estimate because they do not consider whether a nonpolar group is surrounded by nonpolar or polar groups. The total van der Waals energy involves all interactions, both polar and nonpolar, and the sum of the np-p and p-p contributions is greater in magnitude than the np-np term.

TABLE VII
van der Waals Contributions to Unfolding Enthalpy, $\Delta H_N^U(\mathrm{vac})$^a

| Protein | np-np | np-p | p-p | Total | M & P |
|---|---|---|---|---|---|
| Cytochrome c | 211 | 372 | 113 | 688 | 265 |
| RNase A | 148 | 366 | 140 | 654 | 234 |
| Lysozyme | 205 | 396 | 137 | 738 | 281 |
| Myoglobin | 306 | 573 | 144 | 1020 | 404 |

^a All values in kcal/mol; the M & P (1993) values are at 25°C.
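The relations among the components discussed above can be examined numerically with the Table VII values. The sketch below is only a rough consistency check under our own naming; the M & P numbers come from surface area coefficients rather than from these atom-based energies, so only qualitative agreement with 1/2 (np-np + np-p) should be expected.

```python
from statistics import mean

# Table VII (kcal/mol): np-np, np-p, p-p, and the M & P van der Waals estimate.
TABLE_VII = {
    "Cytochrome c": (211, 372, 113, 265),
    "RNase A": (148, 366, 140, 234),
    "Lysozyme": (205, 396, 137, 281),
    "Myoglobin": (306, 573, 144, 404),
}


def summarize(np_np, np_p, p_p, mp):
    return {
        "npnp_over_mp": np_np / mp,              # "about three-fourths" in the text
        "half_binding": 0.5 * (np_np + np_p),    # what the M & P procedure measures
        "polar_involving": np_p + p_p,           # terms outside the np-np class
    }


SUMMARY = {protein: summarize(*row) for protein, row in TABLE_VII.items()}

for protein, s in SUMMARY.items():
    print(
        f"{protein:13s} np-np/M&P = {s['npnp_over_mp']:.2f}  "
        f"1/2(np-np + np-p) = {s['half_binding']:.0f}  "
        f"np-p + p-p = {s['polar_involving']}"
    )
print(f"mean np-np/M&P = {mean(s['npnp_over_mp'] for s in SUMMARY.values()):.2f}")
```

For each protein the half-binding-energy figure lies closer to the M & P estimate than the np-np term alone does, in line with the argument in the text.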

If the full van der Waals interaction is used in the M & P model as a limiting case (last column of Table V), there is a significant reduction in the estimated strength of a hydrogen bond, which is assumed to arise from electrostatic interactions. Using the number of network hydrogen bonds, the resulting range is 6.6 to 11.5 kcal/mol, which is still larger than the expected vacuum value. Moreover, the calculated electrostatic energy difference is still much smaller than the adjusted hydrogen bond term obtained in this way (see Table V).
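The adjusted per-bond range can be approximated from the tabulated numbers alone. In the sketch below, the per-bond enthalpy of Table I is scaled by the ratio of the adjusted to the original hydrogen bond total, which keeps the number of network hydrogen bonds fixed; this scaling is our shortcut, and because it starts from rounded table entries it reproduces the quoted range only to within about 0.1 kcal/mol.

```python
# (M & P H-bond total, adjusted H-bond total from the last column of Table V,
#  M & P enthalpy per network H bond from Table I); all in kcal/mol.
DATA = {
    "Cytochrome c": (1526, 1104, 14.3),
    "RNase A": (1910, 1489, 14.6),
    "Lysozyme": (1808, 1350, 11.5),
    "Myoglobin": (2219, 1603, 9.1),
}


def adjusted_per_bond(hbond_total, hbond_adjusted, per_bond):
    """Per-bond enthalpy after the adjustment, for a fixed number of network H bonds."""
    return per_bond * hbond_adjusted / hbond_total


ADJUSTED = {protein: adjusted_per_bond(*row) for protein, row in DATA.items()}

for protein, value in ADJUSTED.items():
    print(f"{protein:13s} {value:5.1f} kcal/mol per H bond")
```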

C. Model System Results

In this section we compare binding energies of nonpolar groups (CH3, CH2, CH) in a number of model systems and in proteins. The binding energy is the total interaction energy of a group with its surroundings. In homogeneous (pure) systems, it is twice the molar energy. In heterogeneous systems (mixtures), it is one component of the partial molar energy, the other being the cavity energy. Use of the binding energy makes it straightforward to compare homogeneous and heterogeneous systems. In all calculations in this subsection we use the same cutoff scheme (see Appendix).

1. Alkane Crystals

A set of calculations was performed on crystals of the normal alkanes, ethane, pentane, hexane, and octane. Both the crystal coordinates and the minimized crystal structures were used; the method employed for the calculations is described in the Appendix.

The calculated estimates for the sublimation energies of the four alkane crystals are given in Table VIII; for pentane, hexane, and octane, experimental values are available and are included in the table. The energy calculations neglect the effects of thermal motion [see Eq. (3)]. In studies of alkane crystals with all-atom parameters (MacKerell et al., 1994) it was found that the average van der Waals energy obtained from molecular dynamics simulations is 5 to 12% less negative than the van der Waals energy in the minimized structures. Thus, the motional averaging has an effect on the potential energy that is approximately equal to and opposite in sign to the effect of truncation. The experimental heats of sublimation also include a PV term; the contribution is of the order of 0.2 to 0.4 kcal/mol, since the sublimation temperature is in the range 85 to 193 K. Considering these effects, the agreement between the calculated and experimental sublimation energies indicates that the van der Waals parameters are satisfactory.

For comparison, Table VIII also lists the predictions of the M & P approach. The values are obtained by calculating the exposed surface area of the molecule in the crystal conformation and multiplying it by the coefficient used by M & P. The M & P value for the coefficient (30.1 cal/mol Å²) is based on cyclohexane. The resulting values are all smaller than the experimental values and our calculations, except for ethane. Table IXA lists the interaction energies of methyl and methylene groups in the alkane crystals.

TABLE VIII
Heats of Sublimation of Hydrocarbon Crystals (a)

| Alkane | Crystal coordinates | Minimized coordinates | Experimental (b) | M & P (c) |
|---|---|---|---|---|
| Ethane | 3.3 | 3.75 | - | 5.3 |
| Pentane | 7.92 | 9.15 | 9.34 | 8.0 |
| Hexane | 9.44 | 10.52 | 11.22 | 8.9 |
| Octane | 12.92 | 14.04 | 14.70 | 10.6 |

(a) All values in kcal/mol.
(b) References in MacKerell et al. (1995).
(c) Calculated as ASA × 30.1 cal/mol Å². ASAs were calculated for the crystal conformation with C radius = 2 Å and water probe radius = 1.4 Å (ethane, 176 Å²; pentane, 265 Å²; hexane, 294 Å²; octane, 353 Å²).
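The M & P column of Table VIII is a simple product of surface area and coefficient, which the sketch below reproduces for the three alkanes that also have experimental values. It assumes the coefficient is 30.1 cal/(mol Å²), the unit that is consistent with the tabulated numbers, and uses the ASA values from the table footnote; the function and variable names are illustrative.

```python
MP_COEFF_CAL = 30.1  # cal/(mol Å^2); unit assumed so that the table values are reproduced

# Accessible surface areas (Å^2) from the Table VIII footnote
asa = {"pentane": 265.0, "hexane": 294.0, "octane": 353.0}
# M & P and experimental columns of Table VIII (kcal/mol)
table_mp = {"pentane": 8.0, "hexane": 8.9, "octane": 10.6}
experimental = {"pentane": 9.34, "hexane": 11.22, "octane": 14.70}

def mp_sublimation(asa_A2, coeff_cal=MP_COEFF_CAL):
    """Surface-area estimate of the heat of sublimation in kcal/mol."""
    return asa_A2 * coeff_cal / 1000.0

estimates = {name: mp_sublimation(area) for name, area in asa.items()}

for name, value in estimates.items():
    shortfall = experimental[name] - value
    print(f"{name:8s} M&P = {value:5.2f}  exp = {experimental[name]:5.2f}  "
          f"shortfall = {shortfall:4.2f} kcal/mol")
```

The estimates agree with the tabulated M & P values to within rounding of the surface areas, and the shortfall relative to experiment grows with chain length.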

2. Liquid Alkanes

Simulations were performed for liquid alkanes to compare with the crystal results. Propane and pentane were studied with the same parameters and cutoff as those used for the crystals. A range of calculations was performed to show that the results are not very sensitive to the specific conditions; see Appendix.

The results for the estimated heats of vaporization are reported in Table X and compared with the experimental values. The calculated values are lower than the experimental values. Part of this difference is due to the truncation of the long-range interactions, which in this case is not compensated by the lack of motional averaging as in the crystal calculations. The binding energy per molecule was calculated and decomposed into contributions on an extended carbon atom basis; no distinction is made between CH3, CH2, and CH(arom) groups. The resulting values are listed in Table IXB. It is clear that the binding energy for nonpolar groups in liquid alkanes is significantly smaller in magnitude than that found in alkane crystals. The difference can be compared with the estimate of Nicholls et al. (1991) based on the enthalpies of fusion of alkane crystals. For example, the enthalpy of fusion of pentane is ΔH(fus) = ΔH(sub) - ΔH(vap) = 9.34 - 6.32 ≈ 3 kcal/mol (see Tables VIII and X), which gives 0.6 kcal/mol per carbon atom. This corresponds to 1.2 kcal/mol binding energy per carbon atom. Taking into account motional corrections for the crystals, this is comparable to the difference in binding energies between crystals and liquids obtained in Table IX.

TABLE IX
Binding Energies of Nonpolar Groups (a)

A. In Model Systems and Proteins

| System | Compound | CH3, X-ray | CH3 (Min.) | CH2, X-ray | CH2 (Min.) | CH, X-ray | CH (Min.) |
|---|---|---|---|---|---|---|---|
| Alkane crystals | Ethane | -3.30 | -3.75 | - | - | - | - |
| | Pentane | -3.74 | -4.30 | -2.97 | -3.40 | - | - |
| | Hexane | -3.48 | -4.18 | -3.10 | -3.44 | - | - |
| | Octane | -3.55 | -4.19 | -3.28 | -3.62 | - | - |
| | Average | -3.52 | -4.10 | -3.12 | -3.49 | - | - |
| Proteins | Myoglobin | - | -3.55 ± 0.8 | - | -3.1 ± 0.7 | - | - |
| Cyclic dipeptides | Ala-Ala | -4.23 | -4.35 | - | - | - | - |
| | Pro-Leu | -4.68 | -5.02 | -2.68 | -3.00 | -2.8 | -2.8 |

B. Per Carbon Atom in Liquids

| System | Molecule or group | Binding energy |
|---|---|---|
| Liquid alkanes | Propane | -2.07 (NVT, 0.54 g/cm³) |
| | Propane | -1.90 (NPT, 0.49 g/cm³) |
| | Pentane | -1.70 |
| In water | Leu side chain | -1.75 |
| | Phe ring | -1.57 |

(a) All values in kcal/mol. Min. implies that the X-ray crystal structure was minimized.

TABLE X
Heats of Vaporization of Liquid Alkanes (a)

| Alkane | Temperature (°C) | Ensemble | Calculated (b) | Experimental (c) |
|---|---|---|---|---|
| Propane | -42 | NVT (0.54 g/cm³) | 3.11 (3.57) | 4.53 |
| | | NPT (0.49 g/cm³) | 2.83 (3.29) | - |
| Pentane | 25 | NPT | 4.3 (4.9) | 6.32 |

(a) Data in kcal/mol.
(b) Values in parentheses include the PV = RT term.
(c) References in MacKerell et al. (1995).
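The arithmetic behind the parenthesized values in Table X and the fusion estimate can be written out as a short sketch. It assumes the ideal-gas approximation PV = RT for the vapor, as stated in the table footnote; the gas constant value and the helper name `with_pv_term` are supplied here for illustration.

```python
R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def with_pv_term(delta_e, temp_celsius):
    """Add the ideal-gas PV = RT term to a calculated vaporization energy (kcal/mol)."""
    return delta_e + R_KCAL * (temp_celsius + 273.15)

# Calculated vaporization energies from Table X (kcal/mol)
propane_nvt = with_pv_term(3.11, -42.0)
propane_npt = with_pv_term(2.83, -42.0)
pentane_npt = with_pv_term(4.3, 25.0)

# Enthalpy of fusion of pentane from the experimental sublimation and vaporization heats
h_sub, h_vap, n_carbon = 9.34, 6.32, 5
h_fus = h_sub - h_vap
fusion_per_carbon = h_fus / n_carbon
# Each pair interaction is shared by two groups, so the binding energy changes by twice as much
binding_diff_per_carbon = 2.0 * fusion_per_carbon

print(f"propane NVT: {propane_nvt:.2f}, propane NPT: {propane_npt:.2f}, pentane: {pentane_npt:.1f}")
print(f"fusion: {h_fus:.2f} kcal/mol, {fusion_per_carbon:.1f} per carbon, "
      f"binding energy difference {binding_diff_per_carbon:.1f} per carbon")
```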

3. Nonpolar Groups in Water

As mentioned in Section III, the solvation enthalpy has two components: the direct solute-solvent interaction, and the cavity term, which is the change in solvent-solvent interactions upon solute insertion (Yu and Karplus, 1988). Whereas the cavity term is difficult to obtain theoretically and depends critically on the special properties of water as solvent, calculation of the direct solute-solvent interaction is straightforward with available simulation techniques. In this section we perform this calculation for two nonpolar molecules that are models of protein side chains. Toluene was chosen as a model for the phenylalanine side chain and 2-methylbutane as a model for the leucine side chain. The methyl group of toluene mimics the blocking effect of the protein β-carbon and its interaction with the solvent is not included in the following analysis. The extra methyl group of 2-methylbutane has the corresponding role for leucine.

Monte Carlo simulations of these molecules were performed in water at 25°C (for toluene, also at 80°C). Details of the simulations are given in the Appendix. Two temperatures were studied for toluene to determine whether the van der Waals interaction energy has a significant temperature dependence. The interaction energies averaged over the simulation for the six aromatic sites in Phe and the four sites corresponding to the Leu side chain are listed in Table IX. The values for the interactions in water are similar to those in the liquid alkanes.

4. Cyclic Dipeptides

The thermodynamics of dissolution of cyclic dipeptide crystals in water (Murphy and Gill, 1989a) exhibit some unexpected trends. In particular, analysis of the slope of the AH vs ACp plot for a number of cyclic dipeptides with hydrophobic side chains implies that the enthalpy of transfer of nonpolar groups from the crystal phase to water is negative (Yang et al., 1992). However, the enthalpy of transfer of nonpolar groups from organic liquids to water is about zero. Therefore, the enthalpy of transfer of nonpolar groups from the crystal phase, which is more densely packed than the liquid, to water should be positive rather than negative. This discrepancy led to the suggestion that van der Waals interactions between nonpolar groups in cyclic dipeptide crystals are not as favorable as those in alkane crystals due to constraints imposed by the need to satisfy polar interactions (Yang et al., 1992; Murphy and Freire, 1992; Creighton, 1991).

To test this suggestion, two cyclic dipeptide crystals, c(Ala-Ala) and c(Pro-Leu), were studied. The method used is the same as that employed for the alkane crystals (see Appendix). Calculations were made with the crystal structure and with the crystal structure after minimization. The binding energies of the CH3 and CH2 groups are given in Table IX. As seen in Table IX, the cyclic dipeptide results are very similar to those in alkane crystals; the CH2 values are slightly smaller in magnitude, while the CH3 group values are somewhat larger.

The suggestion that van der Waals interactions of nonpolar groups in cyclic dipeptide crystals are not as favorable as those in alkane crystals is not supported by our results. Packing and van der Waals interactions of nonpolar groups in cyclic dipeptide crystals are just as favorable as in alkane crystals.

5. Comparison with Proteins

To compare the results obtained for the model systems with those for proteins, we used myoglobin and calculated the van der Waals energies of all buried CH3 and CH2 groups of a minimized crystal structure (200 steps SD and 500 steps ABNR, see Brooks et al., 1983); a buried group is taken to be one that has less than 1 Å² ASA (the full ASAs of CH3 and CH2 groups vary from about 19 to 80 Å²). The protein calculations described here use the same 8.5 Å cutoff as in the rest of this section to make them consistent with the crystal and solution results. The average values and their standard deviation are shown in Table IX. The protein values are comparable to but somewhat smaller in magnitude than those found in the alkane crystals. Somewhat less negative values in proteins are reasonable because they are more constrained than the alkane crystals and have to satisfy a number of different types of interactions simultaneously. However, it is not clear whether the differences are significant because the alkane crystals have been more extensively minimized (see Appendix). Comparison with the values for liquid alkanes (Table IX) shows that the protein values are significantly larger in magnitude and closer to the crystal results.

The values for the interaction in water are considerably smaller in magnitude than the corresponding results in the interior of proteins and are more similar to those in liquid alkanes (Table IX). Of course, it should be noted again that the protein results are based on X-ray crystal structures and energy-minimized structures. At room temperature the magnitude of the binding energies in proteins is expected to be smaller by 10-12% (see Sect. IV,C,1). Taking this into account, the difference between the binding energies of nonpolar groups in water and in proteins is about 1 kcal/mol per carbon atom.

The van der Waals interaction energies were also calculated for the aromatic ring of phenylalanine (Phe) and the side chain of leucine (Leu) as prototypes for aromatic and aliphatic side chains, respectively. The crystal structure and the energy-minimized structures of the proteins were used and residues with solvent ASA less than 6 Å² were included; the exact value of the surface area cutoff does not significantly affect the results. Again, a nonbonded cutoff of 8.5 Å was used. The results obtained for the four proteins are shown in Table XI and compared to those obtained in aqueous solution. In agreement with the results in Table IX, the van der Waals interactions of these side chains with all other parts of the protein (np and p) are significantly larger in magnitude than those in aqueous solution. Thus, there is a stabilizing van der Waals contribution in the native state relative to the denatured state, where the side chains are exposed to solvent.

TABLE XI
Interaction Energies for Phe and Leu Side Chains in Proteins and in Water (a)

| Side chain | Protein, crystal | Protein, minimized | Water, 25°C | Water, 80°C |
|---|---|---|---|---|
| Phe | -14.9 ± 3 | -16.6 ± 3 | -9.4 | -8.5 |
| Leu | -11.3 ± 1 | -12.5 ± 1 | -7 | - |

(a) All values in kcal/mol.
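The per-carbon difference between protein and water can be estimated from the side-chain totals in Table XI. The sketch below is an illustration, not the authors' procedure: it assumes six sites for the Phe ring and four for the Leu side chain (as stated for the water simulations), and scales the minimized protein values by 0.90 to represent the roughly 10% reduction expected from thermal motion; the scale factor and all names are assumptions.

```python
# Number of extended-atom carbon sites per side chain (from the text)
sites = {"Phe": 6, "Leu": 4}
# Table XI values in kcal/mol
protein_minimized = {"Phe": -16.6, "Leu": -12.5}
water_25C = {"Phe": -9.4, "Leu": -7.0}

THERMAL_SCALE = 0.90  # assumption: ~10% smaller magnitude at room temperature

per_carbon_diff = {}
for res, n_sites in sites.items():
    protein_pc = THERMAL_SCALE * protein_minimized[res] / n_sites
    water_pc = water_25C[res] / n_sites
    per_carbon_diff[res] = protein_pc - water_pc
    print(f"{res}: protein {protein_pc:.2f}, water {water_pc:.2f}, "
          f"difference {per_carbon_diff[res]:.2f} kcal/mol per carbon")
```

Under these assumptions both side chains give a difference close to 1 kcal/mol per carbon atom in favor of the protein interior.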

As pointed out in Section III, the van der Waals interactions in the interior of proteins are actually an average over the structures sampled at a given temperature. To test the effect of motional averaging, the results from molecular dynamics simulations of barnase in water at 25°C are compared with those from the energy-minimized structures. For the thermodynamic cycle (Scheme I), the calculation of protein properties (e.g., the van der Waals and electrostatic energies) in the solvated system is appropriate. This avoids the need to introduce the effect of structural changes between the protein in vacuum and in solution. From the trajectory files a number of coordinate sets were extracted and used to calculate averages for the total van der Waals interactions and their decomposition into nonpolar-nonpolar, polar-polar, and nonpolar-polar terms. These averages are reported in Table XII and compared with the corresponding results for the crystal and the minimized structures, which are also given in the table. Overall, the van der Waals interactions at room temperature are 5-10% less negative than the ones calculated with the minimized protein structures, which is similar to the behavior of alkane crystals. The np-np and np-p components of the van der Waals energy are less negative and the p-p component is more negative in the simulation than in the minimized structures.
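The size of the motional-averaging effect can be checked directly from the Table XII entries. This is a minimal sketch; the assignment of the three rows to the np-np, p-p, and np-p components is read from the table and is consistent with the trends described in the text, and the variable names are illustrative.

```python
# Table XII: van der Waals energies for barnase (kcal/mol)
vdw = {
    "np-np": {"crystal": -253.0, "minimized": -238.0, "dynamics": -218.0},
    "p-p":   {"crystal": -133.0, "minimized": -138.0, "dynamics": -183.0},
    "np-p":  {"crystal": -420.0, "minimized": -426.0, "dynamics": -349.0},
}

def total(structure):
    """Total van der Waals energy for one structure type."""
    return sum(component[structure] for component in vdw.values())

total_min = total("minimized")
total_dyn = total("dynamics")
# Fractional loss of favorable energy on going from minimized structure to dynamics average
reduction = 1.0 - total_dyn / total_min

print(f"minimized: {total_min:.0f}, dynamics: {total_dyn:.0f} kcal/mol")
print(f"dynamics average is {100.0 * reduction:.1f}% less negative")
```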

D. Summary

In this section we have made an analysis of van der Waals interactions in proteins and compared them to van der Waals interactions in model systems to determine the extent to which they contribute to protein stability and to find the system that is most appropriate as a model for the protein interior. The van der Waals interaction parameters used in these calculations, which are similar to those in other force fields, were shown to give reasonable results for the heats of sublimation of alkane crystals and heats of vaporization of liquid alkanes. This provides support for the conclusions derived from the protein and model system calculations.

TABLE XII
Interaction Energies from Dynamic Simulation of Barnase (a)

| vdW component | Crystal | Minimized | Dynamics average (b) |
|---|---|---|---|
| np-np | -253 | -238 | -218 |
| p-p | -133 | -138 | -183 |
| np-p | -420 | -426 | -349 |

(a) All values in kcal/mol.
(b) Average and standard deviation over 31 frames.

The van der Waals interactions of nonpolar groups in proteins are closest to those in alkane crystals, although of a slightly smaller magnitude. This is consistent with the similar packing density exhibited by these two systems. Cyclic dipeptide crystals exhibit equally favorable van der Waals interactions. In liquid alkanes these interactions have smaller magnitude. The van der Waals interactions of alkanes in water are similar to those in liquid alkanes and have a small temperature dependence (see also Lee, 1991). Therefore, the large increase in solvation enthalpy of nonpolar groups in water with temperature is not due to the direct van der Waals interactions, as suggested in the past (Privalov and Gill, 1988), but to the solvent cavity term.

The overall van der Waals term in proteins can be decomposed into np-np, np-p, and p-p components. The largest of these is the np-p component, followed by the np-np, and finally by the p-p component. The p-p term includes the van der Waals interaction between hydrogen bonded groups, which is a part of the hydrogen bonding energy. However, as will be shown in the following section, this is a rather small part of the total p-p term. The analysis of M & P accounts for only the np-np component, although they implicitly include some of the np-p interactions. As a result, they significantly underestimate the stabilizing enthalpic contribution of van der Waals interactions.

V. ELECTROSTATIC INTERACTIONS IN PROTEINS: HYDROGEN BOND CONTRIBUTION

In the analysis of the vacuum unfolding enthalpy ΔH(vac), the largest discrepancy between the M & P analysis and the calculations is in the electrostatic stabilization of the native state. By their differencing procedure, which leaves the hydrogen bond term last (i.e., it is the "omnium gatherum" term), M & P obtained a very large contribution (see Table I). The calculations lead to a much smaller electrostatic term (see Table II), only part of which is expected to arise from hydrogen bonding. In this section we examine the hydrogen bond interactions in the native protein and determine how they are related to the overall electrostatic interaction. As in Section II, we consider results for a protein in which the ionizable side chains are neutral. In addition, we consider a 20-residue polyalanine peptide as a simple model system to compare with the protein results. This system is of particular interest because measurements of ΔH(sol) are available (Scholtz et al., 1991) so that an analysis can be made that corresponds to the system used by M & P for proteins.

A fundamental aspect of the vacuum energy contributions to ΔE(vac) is that they involve the difference between the native and the unfolded structures, for which a fully extended chain is used. This is an approximation that serves to maximize the energy change on folding. As noted in Section III, there are still large stabilizing interactions in the extended chain. The total electrostatic interaction energies of the four proteins are given in Table III for the native and unfolded states. Since the unfolded state has a significant stabilizing electrostatic contribution, the estimates of the electrostatic (hydrogen bonding) energy by M & P implicitly refer to this difference rather than to the native state per se. Although the hydrogen bond energy is expected to be small in the unfolded state, the electrostatic energy is not.

In Section V,A we consider a simple 20-residue polyalanine helix and perform calculations similar to those for the four proteins in Section III. The calculations are compared to the results of the M & P analysis for this system. The analysis of Ooi and Oobatake (1991) is also presented for comparison (O&O). In Section V,B the energetics of helix formation in vacuum and the hydrogen bonding contribution are examined in more detail. Finally, an analysis of hydrogen bonding interactions in proteins and a discussion of the M & P approach are given in Section V,C.

A. Polyalanine Helix Calculations

The polyalanine model consisted of 20 alanines with neutral blocking groups at the termini, i.e., an acetyl group at the N terminus and a methylamine group at the C terminus. The α helix was constructed in the ideal form (φ = -57°, ψ = -47°) and then minimized for 300 ABNR steps; there were only small shifts in the structure and the final angles were φ = -62°, ψ = -42°. The extended conformations had ideal initial angles of φ = -140°, ψ = 135°, and final angles (φ = -154°, ψ = 149°) after minimization with 300 ABNR steps. No cutoff was used in the minimizations or in the final calculations of the energy differences. The difference in energy between the minimized helical and extended polypeptide conformations is given in Table XIII.

To perform an M & P-type analysis of the energetics of helix formation we calculated the difference in the aliphatic unit (Cα plus Cβ methyl group) and the peptide group ASA between the two conformations. These numbers were then multiplied by the coefficients determined by M & P (29.2 cal/mol Å² for the aliphatic unit and 406.8 cal/mol Å² for the peptide group) to obtain ΔH(sol,p) and ΔH(sol,np). ΔH(vac) is obtained by difference using the experimental enthalpy of helix unfolding. We used the experimental value of approximately 1.3 kcal/mol per residue (Scholtz et al., 1991) for a 50-residue, nonpolyalanine peptide and neglected end effects. The resulting ΔH(vac) was decomposed into van der Waals and "hydrogen bonding" contributions according to the approach of M & P. The results are given in Table XIII.

TABLE XIII
Analysis of 20-Residue Polyalanine Helix (a)

| Parameter | M & P | O&O | Calculations |
|---|---|---|---|
| ΔH(sol) | 26 (b) | 19 (c) | - |
| ΔH(sol,p) | -205 | -24 | - |
| ΔH(sol,np) | -9 | -6 | - |
| ΔH(vac) | 240 | 49 | 127 |
| ΔH(vdW) | 9 | - | 56 |
| np-np | - | - | 7 |
| np-p | - | - | 37 |
| p-p | - | - | 12 |
| ΔH(elec) | 231 | - | 60 |

(a) All values in kcal/mol at 25°C.
(b) In the M & P analysis the experimental ΔH(sol) is used.
(c) In the O&O analysis ΔH(sol) is predicted.
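The differencing procedure described above can be followed step by step with the M & P column of Table XIII. The sketch below assumes the simple additive form ΔH(sol) = ΔH(vac) + ΔH(sol,p) + ΔH(sol,np), which is consistent with the tabulated numbers; the variable names are illustrative.

```python
N_RESIDUES = 20
DH_PER_RESIDUE = 1.3  # experimental helix unfolding enthalpy, kcal/mol per residue

# Experimental unfolding enthalpy in solution, neglecting end effects
dh_sol = DH_PER_RESIDUE * N_RESIDUES

# Surface-area-based solvation terms from the M & P column of Table XIII (kcal/mol)
dh_sol_polar = -205.0
dh_sol_nonpolar = -9.0

# Vacuum unfolding enthalpy obtained by difference
dh_vac = dh_sol - dh_sol_polar - dh_sol_nonpolar

# Whatever is not van der Waals is assigned to "hydrogen bonding"
dh_vdw = 9.0
dh_hbond = dh_vac - dh_vdw

print(f"dH(sol) = {dh_sol:.0f}, dH(vac) = {dh_vac:.0f}, "
      f"hydrogen-bond term = {dh_hbond:.0f} kcal/mol")
```

Because the hydrogen bond term is obtained last, any error in the solvation coefficients or in the van der Waals estimate is absorbed into it.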

Ooi and Oobatake use separate proportionality coefficients for the carbon, the nitrogen, and the oxygen of the peptide group. With their coefficients and our calculated differences in ASA, we obtained the polar and nonpolar solvation contributions to the unfolding enthalpy, which are also given in Table XIII. For ΔH(vac) we used the value reported by the authors (Ooi and Oobatake, 1991).

Comparison between the values calculated by the M & P and O&O procedures reveals significant differences. The most striking difference is in the value for the polar solvation enthalpy, for which the O&O method predicts a value much smaller than that of M & P. As discussed earlier in Section II, this difference arises from the model compounds and the particular group additivity scheme adopted by O&O, which leads to an unrealistically low value for peptide group hydration (see also Section VI). Also, the vacuum enthalpy of unfolding estimated by O&O is substantially smaller than the calculated value.

Comparison between the M & P values and the calculations also reveals significant differences. In contrast to the O&O method, the M & P value for the vacuum enthalpy of unfolding is much larger than the calculated value. This is due not to the van der Waals term, which is actually underestimated as in proteins, but to the electrostatic (hydrogen bond) term. If one performs the M & P analysis on a per residue basis, one obtains 16 kcal/mol for the strength of one hydrogen bond. As discussed in Sections III and VI, this value is too large. Even with the increased values of van der Waals interactions that we obtained in the previous section, the hydrogen bond strength deduced from M & P's analysis is still larger than the accepted values for hydrogen bonds between uncharged groups in the gas phase (Rose and Wolfenden, 1993).

B. Hydrogen Bond Analysis of Polyalanine Helix

With the polar hydrogen CHARMM19 parameter set, a polyalanine helix has special properties which aid in the analysis. The extended atom methyl groups are neutral so that all the charges are on the main chain associated with hydrogen bonding groups (i.e., CO and NH). Hydrogen bonding is represented by the electrostatic interactions, with a smaller contribution from van der Waals terms. No specific hydrogen bonding function is present because the parameterization (i.e., the partial charges and Lennard-Jones parameters for each atom) gives excellent agreement with high-level ab initio gas-phase calculations for hydrogen bond energies (Reiher, 1985). The peptide hydrogen bonds arise from the interaction between the CO group (partial charges C, +0.55; O, -0.55) and the CαNH group (N, -0.35; H, +0.25; Cα, +0.1). However, the electrostatic interactions of these dipolar units are not limited to the CO···NHCα hydrogen bond in which they are involved. Thus, the evaluation of a hydrogen bond energy in the helical system, and similarly in a protein, requires some analysis.
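To illustrate how a group-group hydrogen bond energy follows from the partial charges alone, the sketch below sums Coulomb terms between a CO group and an NHCα group. The charges are those quoted in the text; the collinear geometry, the bond lengths, and the conversion factor of 332.06 kcal Å/(mol e²) are assumptions chosen for illustration and are not the helix geometry used in the chapter.

```python
import math

COULOMB_K = 332.06  # kcal*Å/(mol*e^2); approximate conversion factor, assumed here

# Partial charges quoted in the text for the polar-hydrogen parameter set
CO_CHARGES = {"C": 0.55, "O": -0.55}
NHCA_CHARGES = {"N": -0.35, "H": 0.25, "CA": 0.10}

def group_coulomb(q_a, xyz_a, q_b, xyz_b):
    """Sum of q_i*q_j/r_ij over all atom pairs between two groups, in kcal/mol."""
    return sum(
        COULOMB_K * q_a[i] * q_b[j] / math.dist(xyz_a[i], xyz_b[j])
        for i in q_a
        for j in q_b
    )

# Hypothetical collinear C=O...H-N-CA arrangement (Å), for illustration only
co_xyz = {"C": (0.0, 0.0, 0.0), "O": (1.23, 0.0, 0.0)}
nhca_xyz = {"H": (3.13, 0.0, 0.0), "N": (4.13, 0.0, 0.0), "CA": (5.58, 0.0, 0.0)}

e_hb = group_coulomb(CO_CHARGES, co_xyz, NHCA_CHARGES, nhca_xyz)
print(f"CO...NHCA electrostatic energy: {e_hb:.2f} kcal/mol")
```

Both groups are electrically neutral, so their interaction is dipolar in character and falls off quickly with distance, yet it is not restricted to directly hydrogen bonded pairs.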

To do a standard hydrogen bond analysis of the polyalanine helix, we only consider the folded system. The unfolded chain makes a negligible contribution to the standard hydrogen bonding interactions (i.e., the i,i+4 CO···NHCα energy in the extended chain is only -0.03 kcal/mol). In a helix there are two types of CO···NHCα pairs close enough to be considered hydrogen bonded: i,i+4 and i,i+3. The interaction energy between these pairs was calculated. The total CO···NHCα interaction for an i,i+4 hydrogen bond is about -3.4 kcal/mol, all of it contributed by the electrostatic term. The i,i+3 interaction is much smaller (-0.15 kcal/mol) and has a significant contribution from favorable van der Waals interactions (-0.6 kcal/mol) and unfavorable electrostatic interactions (+0.45 kcal/mol).

The value 3.4 kcal/mol for a good hydrogen bond is significantly lower than the experimental or quantum mechanical values obtained for the interaction of polar molecules in the gas phase (Rose and Wolfenden, 1993). For example, the interaction between two formamide molecules was found by ab initio calculations to be about -6.5 kcal/mol (Mitchell and Price, 1990). Looking into the reasons for this discrepancy, we calculated the interaction energy of two N-methylacetamide molecules in the gas phase using the same empirical force field as in the protein calculations. In their optimal configuration (the antiparallel dimer) this interaction energy is calculated to be -8.8 kcal/mol (-2 van der Waals). The CO(1)···NHCα(2) interaction is only -3.6 kcal/mol (all electrostatic), similar to that found in the helix; the indices 1 and 2 indicate the two molecules involved. The additional energy comes from favorable dipolar interactions between the other parts of the two molecules, i.e., the interactions CO(1)···CO(2) (-3.1 kcal/mol, -0.5 van der Waals), CO(2)···NHCα(1) (-1.0 kcal/mol, -0.7 van der Waals), and NHCα(1)···NHCα(2) (-1.1 kcal/mol, -0.7 van der Waals).
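The decomposition of the N-methylacetamide dimer energy can be tallied to see how little of it comes from the direct hydrogen bond. The sketch below uses the four group-group terms quoted in the text; the assignment of molecule indices to each pair is an interpretation of the garbled source labels, and the dictionary keys are illustrative.

```python
# (total, van der Waals part) for each group-group interaction, in kcal/mol
nma_dimer_pairs = {
    "CO(1)...NHCA(2)":   (-3.6, 0.0),   # the direct hydrogen bond, all electrostatic
    "CO(1)...CO(2)":     (-3.1, -0.5),
    "CO(2)...NHCA(1)":   (-1.0, -0.7),
    "NHCA(1)...NHCA(2)": (-1.1, -0.7),
}

total_energy = sum(total for total, _ in nma_dimer_pairs.values())
total_vdw = sum(vdw for _, vdw in nma_dimer_pairs.values())
direct_fraction = nma_dimer_pairs["CO(1)...NHCA(2)"][0] / total_energy

print(f"dimer energy: {total_energy:.1f} kcal/mol ({total_vdw:.1f} van der Waals)")
print(f"direct CO...NHCA share: {100.0 * direct_fraction:.0f}%")
```

The four terms recover the quoted -8.8 kcal/mol, of which the direct CO···NHCα pair accounts for less than half.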

To account for this type of collective effect in evaluating the hydrogen bonding contribution, we reconsider the polyalanine α helix. The change in energy per residue for an infinite helix, relative to the extended state, is estimated and then end effects are introduced to correct for the 20-mer polyalanine helix and for finite helices that occur in proteins. The change in energy on formation of an infinite helix is

$$\Delta E_{\mathrm{tot}} = \sum_{i} \sum_{j \ge i} \Delta E_{ij},$$

where $\Delta E_{ij}$ is the change in interaction energy between residues i and j from the unfolded to the folded form; only terms j ≥ i are included in the summation to avoid double counting. The energy per residue is then

$$\Delta E = \sum_{j \ge i} \Delta E_{ij},$$

where i is any residue. The results are shown in Table XIV. Separate values for the α-helical and extended forms are given in Table XV. Table XVI gives the electrostatic dipole-dipole interactions between the CO and NHCα groups for the i,i+1, i,i+3, and i,i+4 interactions. The unfavorable intraresidue energy, $\Delta E_{ii}$, is due to the fact that the CO and NH dipoles are parallel in the helix, while they are antiparallel in the extended form. The large negative i,i+1 term in both the helical and the extended forms arises from dipole-dipole interactions, as already mentioned in Section III,B and can be seen in Table XVI. It is a dominant contribution to the electrostatic energy of the extended protein chains as well. The most favorable interaction is i to i+4, as expected. The interaction with residue i+3 is significant as well but has a large van der Waals component. Both the i,i+4 and i,i+3 terms are larger than the direct CO···HNCα interaction. This is due to additional interactions between the remaining groups in the residues. In particular, the sizeable van der Waals i,i+3 interaction arises from a large number of interatomic interactions, the largest of which are O(i)···N(i+3), C(i)···N(i+3), and O(i)···C(i+3). The van der Waals contribution (-3.1 kcal/mol) is more than one-third of the total energy, whereas the electrostatic contribution (-4.3 kcal/mol) is slightly more than half. The bonding terms make only a small contribution, as expected. One notable feature of the results is that long-range electrostatic interactions (i,i+≥6) contribute almost one-fourth of the total stabilization energy.

TABLE XIV
Change in Residue-Residue Interactions on Formation of Polyalanine α Helix

| Interaction | ΔE (total) | ΔE (elec) | ΔE (vdW) | ΔE (bonded) |
|---|---|---|---|---|
| i,i | +4.3 | +5.9 | -0.84 | -0.7 |
| i,i+1 | -0.1 | -1.5 | +1.35 | +0.1 |
| i,i+2 | -0.7 | +0.3 | -1.03 | - |
| i,i+3 | -3.9 | -2.2 | -1.73 | - |
| i,i+4 | -5.1 | -4.5 | -0.62 | - |
| i,i+5 | -0.6 | -0.5 | -0.14 | - |
| i,i+≥6 | -1.9 | -1.8 | -0.09 | - |
| Total | -8.0 | -4.3 | -3.1 | -0.6 |
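The per-residue sum over j ≥ i and the fractions quoted above can be verified from the Table XIV entries. In this sketch the column assignment (electrostatic, van der Waals, bonded) is inferred from the row and column totals, and missing bonded entries are taken as zero; both are assumptions.

```python
# Table XIV: change on helix formation (elec, vdW, bonded) per interaction, kcal/mol
delta_e = {
    "i,i":     (5.9, -0.84, -0.7),
    "i,i+1":   (-1.5, 1.35, 0.1),
    "i,i+2":   (0.3, -1.03, 0.0),
    "i,i+3":   (-2.2, -1.73, 0.0),
    "i,i+4":   (-4.5, -0.62, 0.0),
    "i,i+5":   (-0.5, -0.14, 0.0),
    "i,i+>=6": (-1.8, -0.09, 0.0),
}

# Energy per residue: sum over all partners j >= i, by component
elec = sum(terms[0] for terms in delta_e.values())
vdw = sum(terms[1] for terms in delta_e.values())
bonded = sum(terms[2] for terms in delta_e.values())
per_residue = elec + vdw + bonded

long_range_elec = delta_e["i,i+>=6"][0]

print(f"per-residue energy: {per_residue:.1f} kcal/mol")
print(f"vdW share: {vdw / per_residue:.2f}, electrostatic share: {elec / per_residue:.2f}")
print(f"long-range electrostatic share: {long_range_elec / per_residue:.2f}")
```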

TABLE XV
Residue-Residue Interactions in Polyalanine α Helix and Extended Polyalanine

| Interaction | Helix: Total | Helix: E(elec) | Helix: E(vdW) | Helix: E(bond) | Extended: Total | Extended: E(elec) | Extended: E(vdW) | Extended: E(bond) |
|---|---|---|---|---|---|---|---|---|
| i,i | 2.47 | 2.45 | -0.37 | 0.387 | -1.8 | -3.40 | 0.48 | 1.127 |
| i,i+1 | -31.31 | -32.17 | 0.78 | 0.085 | -31.21 | -30.64 | -0.57 | 0.005 |
| i,i+2 | -1.18 | 0.16 | -1.34 | - | -0.51 | -0.17 | -0.34 | - |
| i,i+3 | -3.98 | -2.25 | -1.73 | - | -0.11 | -0.08 | -0.02 | - |
| i,i+4 | -5.12 | -4.52 | -0.59 | - | 0.01 | 0.01 | 0 | - |
| i,i+5 | -0.65 | -0.51 | -0.14 | - | -0.02 | -0.02 | 0 | - |
| i,i+≥6 | -1.89 | -1.79 | -0.10 | - | 0.01 | 0.01 | 0 | - |

TABLE XVI
Electrostatic Dipole-Dipole Interactions in Polyalanine Helix (a)

| Partner residue | Group in residue i | Group in partner | EN(vac) | EU(vac) (b) |
|---|---|---|---|---|
| i+1 | NHCα | NHCα | -6 | -5 |
| | CO | CO | -17 | -17.5 |
| | CO | NHCα | -8.5 | -8.5 |
| i+3 | NHCα | NHCα | -0.3 | - |
| | NHCα | CO | -0.15 | - |
| | CO | CO | -2.24 | - |
| | CO | NHCα | +0.43 | - |
| i+4 | NHCα | NHCα | 0 | - |
| | NHCα | CO | -0.06 | - |
| | CO | CO | -1.06 | - |
| | CO | NHCα | -3.4 | - |

(a) All values in kcal/mol.
(b) EU(vac) is negligible for i,i+3 and i,i+4 interactions.

The above results are estimates for an infinite helix, with the assumption that contributions beyond 15 residues are negligible. Beyond i,i+4 the contribution to ΔE decreases monotonically and rapidly (e.g., the i,i+15 interaction in the helix is -0.04 kcal/mol and -7 X kcal/mol in the extended form). For a finite helix, the stabilization in energy will be different because of end effects, i.e., the energy will be -8N kcal/mol, where N is the number of residues, plus the corrections for end effects. To calculate the end effect, we first subtract all the interactions that are missing due to the truncation of the helix. They are

$$\begin{aligned}
&\Delta E_{i,i+1} + \Delta E_{i,i+2} + \Delta E_{i,i+3} + \cdots \\
+\;&\Delta E_{i,i+2} + \Delta E_{i,i+3} + \cdots \\
+\;&\cdots = \sum_{k=1}^{\infty} \sum_{j=k}^{\infty} \Delta E_{i,i+j} = \Delta E_{\mathrm{end}}.
\end{aligned} \tag{7}$$

In Eq. (7), the first line represents the loss of interactions of the last (Nth) residue, the second line the loss of interactions of the (N - 1)th residue, and so on. The end effect of a semi-infinite helix would be half of $\Delta E_{\mathrm{end}}$. Based on the numbers in Table XIV, $\Delta E_{\mathrm{end}}$ is equal to -48 kcal/mol. After subtracting this number, it is necessary to add the interactions of the two terminal blocking groups. These numbers were calculated to be -12.0 kcal/mol for the acetyl group and -6.0 kcal/mol for the methylamine group.
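The double sum in Eq. (7) counts each separation j exactly j times, so the end effect reduces to a weighted sum of the per-separation energies. The sketch below evaluates it with the total column of the residue-residue table for helix formation. Because interactions at separations of six or more are lumped into one entry, they are all weighted as if they occurred at separation 6; this is an assumption made here, and it gives a lower bound on the magnitude of that contribution.

```python
# Change in total interaction energy on helix formation, by separation j (kcal/mol)
delta_e_by_separation = {1: -0.1, 2: -0.7, 3: -3.9, 4: -5.1, 5: -0.6}
lumped_ge6 = -1.9   # all separations >= 6 combined
LUMPED_WEIGHT = 6   # assumption: treat the lumped term as if it were all at j = 6

# sum_{k>=1} sum_{j>=k} dE_j  ==  sum_j j * dE_j
delta_e_end = sum(j * e for j, e in delta_e_by_separation.items())
delta_e_end += LUMPED_WEIGHT * lumped_ge6

PER_RESIDUE = -8.0   # energy per residue of the infinite helix
ACETYL = -12.0       # N-terminal blocking group
METHYLAMINE = -6.0   # C-terminal blocking group

def helix_folding_energy(n_residues):
    """Energy change on helix formation for a blocked, finite helix (kcal/mol)."""
    return PER_RESIDUE * n_residues - delta_e_end + ACETYL + METHYLAMINE

print(f"end effect: {delta_e_end:.1f} kcal/mol")
print(f"20-residue helix: {helix_folding_energy(20):.0f} kcal/mol")
```

With this weighting the sum comes out at the -48 kcal/mol quoted in the text, and the 20-residue result corresponds to a stabilization of about 130 kcal/mol.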


Thus, for a polyalanine chain that is long enough so the two ends interact to a negligible extent the stabilizing internal energy on helix formation is given by

$$\Delta E(\mathrm{vac}) \approx 8N - 30\ \text{kcal/mol}, \tag{8}$$

where N is the number of residues. For the 20-residue polyalanine helix, this yields a value of 130 kcal/mol for ΔE(vac), in good agreement with the complete calculation in Table XIII (127 kcal/mol). If the same analysis is done for the electrostatic energy alone, one finds

$$\Delta E(\mathrm{elec}) \approx 4.3N - 24.1\ \text{kcal/mol}, \tag{9}$$

which gives 62 kcal/mol, compared to 60 kcal/mol from the full calculation. It is clear from these results that the van der Waals energy makes a significant contribution; it arises from both polar and nonpolar atoms.
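The two linear estimates can be evaluated for any helix length and compared with the full calculation for the 20-residue helix. This is a minimal sketch; the function names are illustrative, and the remainder is simply the part of Eq. (8) not covered by Eq. (9).

```python
def stabilization_total(n_residues):
    """Eq. (8): total stabilizing internal energy on helix formation, kcal/mol."""
    return 8.0 * n_residues - 30.0

def stabilization_elec(n_residues):
    """Eq. (9): electrostatic part of the stabilizing energy, kcal/mol."""
    return 4.3 * n_residues - 24.1

N = 20
total_est = stabilization_total(N)
elec_est = stabilization_elec(N)
non_elec_est = total_est - elec_est  # van der Waals plus bonded terms

# Values from the complete calculation on the 20-residue helix
full_total, full_elec = 127.0, 60.0

print(f"total: estimate {total_est:.0f}, full calculation {full_total:.0f} kcal/mol")
print(f"electrostatic: estimate {elec_est:.0f}, full calculation {full_elec:.0f} kcal/mol")
print(f"non-electrostatic remainder: {non_elec_est:.0f} kcal/mol")
```

Roughly half of the estimated stabilization is non-electrostatic, which is the basis for the statement that van der Waals interactions contribute significantly.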

In the vacuum analogue of the analysis by Baldwin and co-workers of the energetics of helix formation (Scholtz et al., 1991), all of the energy (electrostatic and van der Waals) is hydrogen bond energy since they assume the only difference between the helix and the extended forms to be due to hydrogen bonding. Our calculations show that less than half of the total stabilization energy in vacuum comes from conventional hydrogen bonding interactions. Large contributions come from van der Waals interactions and dipole-dipole interactions of groups that are not hydrogen bonded to each other.

C. Protein Hydrogen Bonding

Following the analysis of the α helix, we first calculate the standard energy of the hydrogen bonds in the four proteins studied in Section III. We include all hydrogen bonds with the NH···O and CO···H angles greater than 90° and divide them into two classes: good hydrogen bonds (H···O distance less than 2.5 Å) and marginal hydrogen bonds (H···O distance between 2.5 and 3 Å). The i,i+4 hydrogen bond in the helix belongs to the first class and the i,i+3 interaction in the ideal (nonminimized) helix yields a marginal hydrogen bond. After minimization, the CO···H angle drops to a value slightly smaller than the cutoff (89°), but we still consider it a marginal hydrogen bond. The results are given in Table XVII.
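The geometric criteria just described can be written as a small classifier. This is a minimal sketch with hypothetical function and argument names; it applies the stated cutoffs strictly, so it would reject the minimized i,i+3 contact (CO···H angle of 89°) that the authors nevertheless chose to count as marginal.

```python
def classify_hydrogen_bond(h_o_distance: float,
                           nh_o_angle: float,
                           co_h_angle: float) -> str:
    """Classify a backbone CO...HN contact.

    h_o_distance : H...O distance in angstroms
    nh_o_angle   : N-H...O angle in degrees
    co_h_angle   : C=O...H angle in degrees
    Returns "good", "marginal", or "none".
    """
    # Both angles must exceed 90 degrees for the contact to count at all.
    if nh_o_angle <= 90.0 or co_h_angle <= 90.0:
        return "none"
    # Good hydrogen bonds: H...O closer than 2.5 angstroms.
    if h_o_distance < 2.5:
        return "good"
    # Marginal hydrogen bonds: H...O between 2.5 and 3 angstroms.
    if h_o_distance <= 3.0:
        return "marginal"
    return "none"
```

A caller that wants to reproduce the authors' treatment of the minimized helix would need to relax the angle test slightly; that tolerance is not specified in the text.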

The average CO-NHCα interaction energy for “good” hydrogen bonds is -2.8 to -3.0 kcal/mol. The major part of this energy is electrostatic, while about -0.2 kcal/mol comes from van der Waals interactions. In the ideal polyalanine helix, this number is slightly larger, -3.4 kcal/mol. The marginal hydrogen bonds are characterized by interaction energies between -0.5 and -1.0 kcal/mol, most of which arises from van der Waals terms. As seen in the previous section, the marginal i,i+3 hydrogen bond has an unfavorable electrostatic energy and a favorable, dominant, van der Waals energy. From Table XVII we see that there are no substantial differences among the four proteins in the quality of their hydrogen bonds. Also, the number of good hydrogen bonds per residue is about the same for all four proteins (about 1 per residue and 1.1 for myoglobin).

TABLE XVII
CO-HNCα Interactions in Proteins^a

| Protein | N_good^b | N_marg | E_good (vdW) | E_marg (vdW) |
|---|---|---|---|---|
| Cytochrome c | 58 (75) | 18 | -2.8 ± 0.9 (-0.2) | -1.0 ± 0.9 (-0.6) |
| RNase | 67 (74) | 15 | -2.9 ± 0.7 (-0.1) | -0.9 ± 0.7 (-0.6) |
| Lysozyme | 78 (105) | 29 | -2.9 ± 0.7 (-0.2) | -0.7 ± 0.6 (-0.6) |
| Myoglobin | 100 (194) | 48 | -3.0 ± 0.8 (-0.2) | -0.5 ± 0.4 (-0.6) |
| Polyalanine | 1/res (2/res) | 1/res | -3.4 (0.0) | -0.15 (-0.6) |

^a All values in kcal/mol.
^b Only backbone hydrogen bonds are included; the corresponding numbers of “network” hydrogen bonds from Stickle et al. (1992) are given in parentheses.

In Section III we calculated the total electrostatic change between native proteins and their extended chain conformations. In such a calculation, the contributions from hydrogen bonding interactions and more delocalized electrostatic interactions are not separated. To determine the individual contributions, we consider myoglobin in some detail. The total electrostatic stabilization energy is 421 kcal/mol (see Table II). For the eight helices of myoglobin, we make use of Eq. (9) and obtain a total energy of about 315 kcal/mol. This value is somewhat larger than the true result since Eq. (9) holds for an ideal and sufficiently long polyalanine helix. By calculating the unfolding energy change for the eight helices in the conformation they have in the native protein, we find that the actual number should be between 285 and 300 kcal/mol. This number is significantly smaller than the total electrostatic stabilization energy of 421 kcal/mol.
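The bookkeeping in this paragraph can be followed with a few lines of Python. The sketch below only rearranges numbers quoted in the text; the helical residue count it returns is an inference from applying Eq. (9) once per helix, not a value stated by the authors, and the function names are illustrative.

```python
# Numbers quoted in the text (kcal/mol).
TOTAL_ELECTROSTATIC = 421.0          # myoglobin, native vs extended chain
EQ9_HELIX_SUM = 315.0                # Eq. (9) summed over the eight helices
ACTUAL_HELIX_RANGE = (285.0, 300.0)  # helices in their native conformations
N_HELICES = 8


def implied_helical_residues(helix_sum: float, n_helices: int,
                             slope: float = 4.3,
                             intercept: float = -24.1) -> float:
    """Invert sum_i (slope * N_i + intercept) = helix_sum for sum_i N_i."""
    return (helix_sum - n_helices * intercept) / slope


def non_helical_remainder(total: float, helix_range: tuple) -> tuple:
    """Electrostatic stabilization not accounted for by the isolated helices."""
    low, high = helix_range
    return (total - high, total - low)


if __name__ == "__main__":
    print(implied_helical_residues(EQ9_HELIX_SUM, N_HELICES))
    print(non_helical_remainder(TOTAL_ELECTROSTATIC, ACTUAL_HELIX_RANGE))
```

On these numbers, roughly 120 to 135 kcal/mol of the electrostatic stabilization is left over after the isolated helices are accounted for.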

M & P divided their total electrostatic energy by the number of network hydrogen bonds to obtain the average strength of hydrogen bonds in a protein. They observed that the strength of hydrogen bonding decreases with the amount of helicity in the protein and rationalized this by arguing that hydrogen bonds in helices are slightly longer (Makhatadze et al., 1993; Wintrode et al., 1994). This conclusion depends on the way that hydrogen bonds are counted. Their use of network hydrogen bonds, which include the marginal i,i+3 CO-NH interactions, effectively reduces the hydrogen bond energy in helical proteins. As a result, myoglobin, which is all α-helical and thus has a relatively high number of i,i+3 hydrogen bonds, was predicted to have the weakest hydrogen bonds. If the analysis is restricted to strong hydrogen bonds, according to our definition, the M & P hydrogen bond strength would be 13 kcal/mol for myoglobin. From the directly calculated result, we see that the real i,i+4 hydrogen bonds in myoglobin are as strong as, if not stronger than, those in the other proteins (see Table XVII).

Calculations on ideal polyalanine β sheets (not reported) show that β sheets have no “marginal” hydrogen bonds and that the CO-NHCα interaction is -3.3 kcal/mol (+0.1 kcal/mol van der Waals) for antiparallel β sheets and -2.6 kcal/mol (-0.2 kcal/mol van der Waals) for parallel β sheets. As in helices, there are a multitude of other nonhydrogen bonding dipole-dipole interactions which contribute significantly to the stabilization.

One result of the use of the number of network hydrogen bonds by M & P is that a lower, and thus more realistic, estimate of the hydrogen bonding strength is obtained. If one uses the number of “good” hydrogen bonds in the M & P approach, higher values are obtained for the hydrogen bonding strength but these values are similar in the four proteins; they are 13 or 14 kcal/mol, except for RNase A, which gives 15.9 kcal/mol. The unrealistically high hydrogen bonding strength is due to other factors, one of which is the underestimation of the van der Waals interactions. Another factor is the presence of favorable nonlocal electrostatic interactions in addition to hydrogen bonds. Finally, the effect of the overestimation of the polar solvation enthalpy is discussed in the following section.
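How strongly the "average hydrogen bond strength" depends on the counting convention can be illustrated by rescaling the per-bond values quoted above with the counts in Table XVII. The sketch assumes that the same total electrostatic energy is divided by either count; the rescaled numbers are therefore implied values, not figures reported by M & P, and the names are hypothetical.

```python
# (good backbone hydrogen bonds, network hydrogen bonds) from Table XVII.
HBOND_COUNTS = {
    "Cytochrome c": (58, 75),
    "RNase A": (67, 74),
    "Lysozyme": (78, 105),
    "Myoglobin": (100, 194),
}


def strength_per_network_bond(strength_per_good_bond: float,
                              n_good: int, n_network: int) -> float:
    """Rescale a per-good-bond strength (kcal/mol) to a per-network-bond value.

    Assumes one fixed total energy, total = strength_per_good_bond * n_good,
    which is then divided by the larger network count.
    """
    total = strength_per_good_bond * n_good
    return total / n_network


if __name__ == "__main__":
    # Per-good-bond values stated explicitly in the text.
    quoted = {"Myoglobin": 13.0, "RNase A": 15.9}
    for protein, strength in quoted.items():
        n_good, n_network = HBOND_COUNTS[protein]
        print(protein, strength_per_network_bond(strength, n_good, n_network))
```

Because myoglobin has nearly twice as many network as good hydrogen bonds, its implied per-bond value drops by almost a factor of two, whereas that of RNase A changes by only about 10%.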

D. Summary

The study of the polyalanine helix in vacuum was particularly instructive in clarifying the relationship between hydrogen bonding and the electrostatic contribution to the enthalpy of unfolding. The conventional hydrogen bonding interaction with any reasonable definition accounts for only part of the total electrostatic contribution. Substantial contributions arise from other dipole-dipole interactions and about one-fourth of the total stabilization energy of the helix in vacuum arises from longer-range electrostatic interactions. Van der Waals interactions are also significant in vacuum, but their contribution is expected to diminish in solution, where compensating peptide-solvent interactions take place.

It is common practice to use experimental results for small molecule dimers in the gas phase to estimate the strength of hydrogen bonding in proteins. The usual criteria for the existence of a hydrogen bond involve the positions and orientations of the donor and acceptor groups, NH and CO in the case of the protein backbone. However, as discussed in Section V,B, the interaction energy also depends strongly on the positions and orientations of the surrounding groups. Therefore, one cannot use the gas-phase values indiscriminately wherever a hydrogen bond in a protein is formed, or count the total number of such hydrogen bonds and multiply by a gas-phase experimental value to obtain the total hydrogen bonding contribution. For the same reason, the concept of an average “hydrogen bonding strength” in a protein, obtained by dividing the electrostatic enthalpy of unfolding in vacuum by the number of hydrogen bonds, is of questionable utility.

VI. ENTHALPY OF SOLVATION OF POLAR GROUPS: BREAKDOWN OF SURFACE AREA PROPORTIONALITY ASSUMPTION

For the completion of the thermodynamic cycle in Scheme I, it is necessary to evaluate the enthalpy of solvation of the native and unfolded proteins. It is useful to divide the solvation contribution into polar and nonpolar terms, as in Eq. (1). Such a separation is expected to be a satisfactory approximation for the present analysis. We assume that $\Delta H_N^U(\mathrm{sol,np})$ is estimated correctly by a model which considers additive contributions of the nonpolar groups and relates them to the difference in the exposed surface area of the nonpolar groups in the native and the unfolded structures. Although the accuracy and physical justification of such an approach are not fully established (Karplus, 1980), its empirical utility has been verified in many cases (Still et al., 1990; Hermann, 1972). As listed in Table I, the contributions of $\Delta H_N^U(\mathrm{sol,np})$ estimated by M & P for the four proteins destabilize the native state. Their contribution is important (-219 kcal/mol for RNase to -376 kcal/mol for Mb). However, it is much smaller than $\Delta H_N^U(\mathrm{sol,p})$, the polar group contribution, which is also destabilizing and ranges from -1525 kcal/mol for Cyt c to -2246 kcal/mol for Mb, i.e., $\Delta H_N^U(\mathrm{sol,p})$ is larger than $\Delta H_N^U(\mathrm{sol,np})$ by a factor of six to eight. In this section we focus on the polar contribution.

The values of $\Delta H_N^U(\mathrm{sol,np})$ and $\Delta H_N^U(\mathrm{sol,p})$ listed in Table I were obtained assuming that the ASA of the unfolded chain can be approximated by the sum of the ASAs from Gly-X-Gly tripeptides. However, it is known that this assumption is not absolutely correct and the ASA calculated in this way is about 15% higher than the ASA of the chain in the extended conformation (Doig and Williams, 1991). This overestimation of $\Delta\mathrm{ASA}_N^U$ would lead to overestimation of $\Delta H_N^U(\mathrm{sol,p})$ and $\Delta H_N^U(\mathrm{sol,np})$. Therefore, if the actual ASA of the unfolded chain were used, the values derived by M & P for the internal energy change $\Delta H_N^U(\mathrm{vac})$ would be smaller, but still in significant disagreement with the calculations.

A related question is whether the assumption of a fully extended chain, which is unrealistic for entropic reasons, significantly affects the calculated ASA. To answer this question we generated a large number of unfolded conformations of a protein, at various degrees of compactness, and examined the ASA as a function of the radius of gyration. We found that as soon as the radius of gyration reaches about one-third of its value in the fully extended chain, the ASA attains its asymptotic value. Therefore, the use of the fully extended chain is unlikely to cause a significant overestimation of the ASA of the denatured protein, unless the latter is very compact.

For $\Delta H_N^U(\mathrm{sol,p})$, it is less likely that a model which assumes additivity and relates the enthalpy simply to the exposed surface area is adequate. There are at least three questions to be considered. The first concerns the role of charged side chains, for which the surface area and additivity assumptions would clearly break down. The second concerns the additivity assumption for polar interactions, which are longer range and more directional than the nonpolar interactions. The third question is a little different. Given neutral side chains (or, equivalently, the assumption that the solvation of the ionized side chains is essentially the same in the native and unfolded states), the peptide group makes by far the largest polar solvation contribution in both the native and the unfolded proteins in the M & P analysis. Thus, even if the additivity assumption were valid, the solvation difference between the unfolded and native states may not be obtained accurately by the simple surface area model with the assumed coefficient, i.e., the very large contribution of the peptide group makes the results very sensitive to an accurate representation for it and its buried surface area. Next, we describe some calculations that address these three questions.

A. Ionizable Side Chains

At neutral pH, ionizable side chains, such as those of Glu, Asp, Lys, Arg, and possibly His, are expected to be positively or negatively charged. The solvation enthalpies for these charged groups are in the range of 50 to 100 kcal/mol. A small number of such charged groups could make a large contribution to the solvation enthalpy. Apparently, this contribution essentially cancels due to the fact that most, often all, charged groups are equally solvated in the native and the denatured states. Structural studies show that charged groups tend to be on the surface of proteins, presumably because their aqueous solvation cannot be counterbalanced in the low dielectric constant medium of the protein interior. Even for partly or fully buried ion pairs, it appears that the stabilizing contribution in the folded protein is small (Horovitz et al., 1990) or absent (Hendsch and Tidor, 1994). There are long-range interactions between charged groups that depend on the protein structure and would be different in the native and unfolded proteins. These give rise to the changes in the pKa of the ionizable groups (Bashford and Karplus, 1990) and it is in fact this difference in pKa between the native and unfolded states that is the basis of the pH dependence of protein stability (Tanford and Roxby, 1972; Yang and Honig, 1992).

These considerations are in accord with the pH dependence of protein stability, as evaluated experimentally (Privalov, 1979) or calculated theoretically (Yang and Honig, 1992). In both approaches, the effect of pH on the free energy of protein stability is estimated to be of the order of 10 kcal/mol. This requires that the difference in solvation of charged groups between the native and denatured states be small. Moreover, since the enthalpic contribution is about 90% of the total for ionic solvation, a corresponding argument holds for the enthalpy (Yu and Karplus, 1988).

It is fortunate for the present analysis that the contribution of ionizable groups to $\Delta H_N^U(\mathrm{sol})$ is small since the surface area proportionality assumption fails for such interactions. We address this point here because it is useful for understanding what is involved in the surface area assumption. We consider a thermodynamic cycle for the approach of two charges (Scheme II). The top line corresponds to the approach of two charged groups in vacuum (ε = 1) from a large distance to an intermediate distance at which the two groups still have no buried surface area relative to that exposed when they are infinitely far apart. The bottom arrow corresponds to the approach of the two groups to the same distance in aqueous solution where the dielectric constant is ε = 80. We write ΔH for the various contributions to the cycle, even though the use of ε leads to an approximation to the free energy (Roux et al., 1990). However, as already stated, the free energy is dominated by the enthalpy for the solvation of charged groups. The vertical processes in Scheme II involve solvation of the charged pair of ions at the two distances under consideration. Since there is no difference in buried surface area between the left-hand and right-hand parts of the cycle, the solvation enthalpies are equal by use of the surface area model, i.e., $\Delta H_{\mathrm{solv}}(\mathrm{far}) = \Delta H_{\mathrm{solv}}(\mathrm{near})$. Along the horizontal arrows, $\Delta H_{\mathrm{far}}^{\mathrm{near}}(\mathrm{vac}) = 80\,\Delta H_{\mathrm{far}}^{\mathrm{near}}(\mathrm{sol})$. From the thermodynamic cycle

$$\Delta H_{\mathrm{solv}}(\mathrm{near}) - \Delta H_{\mathrm{solv}}(\mathrm{far}) = \Delta H_{\mathrm{far}}^{\mathrm{near}}(\mathrm{vac}) - \Delta H_{\mathrm{far}}^{\mathrm{near}}(\mathrm{sol}).$$

The left-hand side $[\Delta H_{\mathrm{solv}}(\mathrm{near}) - \Delta H_{\mathrm{solv}}(\mathrm{far})]$ is zero by the surface area model, while the right-hand side is essentially $\Delta H_{\mathrm{far}}^{\mathrm{near}}(\mathrm{vac})$, a large number. Consequently, the surface area model cannot work and an alternative approach, such as a Poisson-Boltzmann calculation, is required for charged groups.

SCHEME II.
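To put numbers on the cycle argument, the sketch below evaluates the two horizontal legs with a simple Coulomb law in a uniform dielectric. The choice of two opposite unit charges at 6 Å is purely illustrative, the conversion factor 332.06 kcal Å/(mol e²) is the usual Coulomb constant in these units, and the function name is hypothetical.

```python
COULOMB_KCAL = 332.06  # kcal * angstrom / (mol * e^2)


def approach_energy(q1: float, q2: float, distance: float,
                    dielectric: float) -> float:
    """Coulomb energy (kcal/mol) for bringing two point charges from
    infinite separation to `distance` (angstroms) in a uniform dielectric."""
    return COULOMB_KCAL * q1 * q2 / (dielectric * distance)


if __name__ == "__main__":
    q1, q2, r = 1.0, -1.0, 6.0                 # illustrative ion pair
    dh_vac = approach_energy(q1, q2, r, 1.0)   # top arrow, epsilon = 1
    dh_sol = approach_energy(q1, q2, r, 80.0)  # bottom arrow, epsilon = 80

    # Right-hand side of the cycle relation as written in the text.
    cycle_difference = dh_vac - dh_sol
    # Left-hand side predicted by the surface area model: no surface is
    # buried at this distance, so the two solvation enthalpies are equal.
    surface_area_prediction = 0.0

    print(dh_vac, dh_sol, cycle_difference, surface_area_prediction)
```

With these illustrative values the cycle difference is tens of kcal/mol, whereas the surface area model predicts exactly zero.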

B. Test of Additivity Assumption via Integral Equation Theory

One test of the surface area assumption for the solvation of polar groups was made by the use of the RISM integral equation method (Chandler and Andersen, 1972; Hirata et al., 1983; Yu and Karplus, 1988). The reference interaction site model (RISM) integral equation for solvation in water is an approximate method that is simple to apply. It is much faster than free energy simulations and has no statistical errors so that, in contrast to simulation methods, it is possible to calculate with precision the enthalpic and entropic contributions to the free energy. These can be obtained from analytic temperature derivatives (Yu et al., 1990) or density derivatives (Yu and Karplus, 1988). In comparison with experiment for N-methylacetamide (NMA) (Yu et al., 1991), it has been shown that relative solvation enthalpies are accurate (e.g., cis- vs trans-NMA) even when the absolute values have significant errors. For a brief methodological description of RISM theory, see the Appendix.

We apply the RISM theory to a series of peptides that can adopt a variety of conformations. We consider NMA, the simplest model for the peptide group, the alanine dipeptide, and the alanine tetrapeptide. In all of these molecules, the peptide group is the only polar group. Consequently, by determining their solvation energy for a large number of conformations, we test the surface area proportionality assumption for the peptide group.

The NMA molecule is used in the trans configuration. For the alanine dipeptide, the C7ax, C7eq, and C5 structures were studied. For the tetrapeptide 40 structures covering a wide range of conformational space were examined; they are based on the vacuum energy analysis of Czerminsky and Elber (1990). The dipeptide has the same neutral blocking groups as the polyalanine helix (see Section V). The tetrapeptide has an isobutyryl C-terminal blocking group (Czerminsky and Elber, 1990).

The solvation energies are plotted in Fig. 1 as a function of the polar accessible surface. The open circles correspond to the values for NMA, the three dipeptide conformations, and the tetrapeptide in an extended conformation. These results can be fitted by the equation $\Delta E_{\mathrm{solv}} = 9.064 - 0.5404\,\mathrm{ASA}$, with $\Delta E_{\mathrm{solv}}$ in kcal/mol and ASA in Å². The coefficient -0.5404 kcal/mol/Å² corresponds to a solvation energy of -18.9 kcal/mol for the peptide group; the value is somewhat larger than the -14.7 kcal/mol/peptide group used by M & P; the difference is within the uncertainty of the RISM method, which tends to overestimate polar solvation enthalpies (Yu et al., 1991). Thus, the RISM results and the M & P estimate are consistent.

FIG. 1. Contribution to the total excess solvation energy of NMA, the alanine dipeptide and the alanine tetrapeptide from the CONH groups, plotted against the corresponding CONH accessible surface (Å²). The total excess solvation energy was calculated by the RISM-HNC methodology (see Appendix). The open circles correspond (in order of increasing ASA) to the values for trans-NMA, the alanine dipeptide in C7ax, C7eq, and C5 conformation, and the alanine tetrapeptide in extended conformation. The filled circles correspond to nonextended conformations of the alanine tetrapeptide. The fitted line is y = 9.064054 - 0.54037x.
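The linear fit to the extended conformations is easy to evaluate directly. The sketch below is a minimal illustration with hypothetical function names; the peptide group surface area it returns is back-calculated from the quoted -18.9 kcal/mol and is therefore an inferred value, not one given in the text.

```python
RISM_INTERCEPT = 9.064  # kcal/mol
RISM_SLOPE = -0.5404    # kcal/mol per square angstrom


def rism_fit_energy(conh_asa: float) -> float:
    """Fitted CONH solvation energy (kcal/mol) for a given CONH ASA (A^2)."""
    return RISM_INTERCEPT + RISM_SLOPE * conh_asa


def implied_peptide_group_asa(energy_per_group: float = -18.9) -> float:
    """ASA per peptide group (A^2) implied by slope * ASA = energy_per_group."""
    return energy_per_group / RISM_SLOPE


if __name__ == "__main__":
    # The accessible surface axis of Fig. 1 runs from about 50 to 150 A^2.
    for asa in (50.0, 100.0, 150.0):
        print(asa, rism_fit_energy(asa))
    print(implied_peptide_group_asa())
    # Ratio of the RISM per-group value to the M & P value of -14.7 kcal/mol.
    print(-18.9 / -14.7)
```

Nonextended tetrapeptide conformations that fall below this line are more strongly solvated than their accessible surface alone would suggest.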

The results in Fig. 1 for the nonextended conformations of the tetrapeptide show large deviations from a linear dependence on the ASA. The solvation energies are all more negative than the linear dependence would predict. This suggests that the difference in the hydration energy of the peptide groups between the folded (native) structure and the unfolded structure is significantly smaller than that obtained from a surface area estimate.

C. Simulation Analysis of Peptide Group Solvation

To complement the integral equation theory calculations, we consider results obtained for the solvation of the peptide group from simulations. It is known from simulations (Rossky and Karplus, 1979), ab initio calculations (Guo and Karplus, 1992), and crystal data (Baker and Hubbard, 1984) that a carbonyl group can form two strong hydrogen bonds, ideally in the direction of the lone pairs, and that an NH group can form one strong hydrogen bond. Thus, carbonyl groups near the surface of a protein can interact to a significant extent with water, even if they are hydrogen bonded to a backbone NH group in an α helix. This contrasts with expectations from the ASA of the peptide group, which is near zero in an α helix. This possibility has been considered from the viewpoint of crystal data for proteins by Savage et al. (1993). We analyze a molecular dynamics simulation of barnase in water at room temperature (Caflisch and Karplus, 1994). The average CO-water interaction energy is calculated from 317 coordinate frames at 1 ps intervals. The simulation was performed with a boundary potential and several layers of water around the protein. In the CO-water energy calculations we consider only the first solvation shell, i.e., the water molecules that are 4 Å or closer to any protein atom. Including more solvation layers does not substantially affect the calculated interaction energies.
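The first-shell selection described above can be sketched as a simple distance filter. The data layout (a list of protein atom coordinates and a dictionary of water molecules) and the choice to test every atom of each water are assumptions made for illustration; the text does not say which water atom was used for the 4 Å criterion.

```python
import math


def first_shell_waters(protein_atoms, waters, cutoff=4.0):
    """Return the ids of water molecules within `cutoff` angstroms of any protein atom.

    protein_atoms : list of (x, y, z) coordinates
    waters        : dict mapping a water id to a list of (x, y, z) atom coordinates
    A water is selected if any of its atoms lies within the cutoff of any
    protein atom (an assumed convention).
    """
    selected = []
    for water_id, water_atoms in waters.items():
        if any(math.dist(w, p) <= cutoff
               for w in water_atoms
               for p in protein_atoms):
            selected.append(water_id)
    return selected
```

For a real trajectory one would apply this filter frame by frame (317 frames in the analysis described here) and use a neighbor list or grid rather than the all-pairs loop shown.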

Although the interaction energy is only one component of the solvation, it has been shown for polar systems that the cavity energy, which is positive, is essentially equal to minus one-half of the interaction energy (Archontis and Karplus, 1995). To a good approximation the solvation energy and, therefore, the solvation enthalpy, are then equal to one-half of the interaction energy. This result is in accord with the Born model for charged systems, which yields a free energy of solvation which is one-half of the interaction energy (Roux et al., 1990). Thus, although the calculated numbers cannot be compared directly with experimental solvation enthalpies, they provide a test of the surface area proportionality assumption. We calculated an average accessibility for the carbonyl oxygen (the accessibility of C is negligible) and compared the result with the interaction energy. Figure 2 plots the total interaction energy vs the average oxygen ASA. The data show that there is a tendency for increased solvent interactions as the ASA increases, but the large scatter appears to preclude a simple proportionality. There are several CO groups with almost zero accessibility that exhibit interaction energies with water that are as large as many accessible ones.
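The one-half relation invoked above for the Born model can be checked numerically with a charging integral. The sketch below uses the standard Born reaction potential for a spherical ion in a continuum dielectric; the ion radius and charge in the example are arbitrary illustrative values and the function names are hypothetical.

```python
COULOMB_KCAL = 332.06  # kcal * angstrom / (mol * e^2)


def reaction_potential(charge: float, radius: float, eps: float = 80.0) -> float:
    """Born reaction-field potential (kcal/mol/e) at the center of a spherical ion."""
    return -COULOMB_KCAL * charge * (1.0 - 1.0 / eps) / radius


def interaction_energy(charge: float, radius: float, eps: float = 80.0) -> float:
    """Ion-solvent interaction energy: the full charge times its reaction potential."""
    return charge * reaction_potential(charge, radius, eps)


def charging_free_energy(charge: float, radius: float, eps: float = 80.0,
                         steps: int = 1000) -> float:
    """Solvation free energy from charging the ion from 0 to `charge`.

    Integrates dG = phi(lambda * q) * q * dlambda with the midpoint rule.
    """
    d_lambda = 1.0 / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * d_lambda
        total += reaction_potential(lam * charge, radius, eps) * charge * d_lambda
    return total


if __name__ == "__main__":
    q, a = 1.0, 2.0  # illustrative unit charge with a 2 angstrom radius
    print(charging_free_energy(q, a) / interaction_energy(q, a))
```

Because the reaction potential is linear in the charge, the ratio is one-half for any radius or dielectric constant, which is the linear response result carried over to the CO-water interaction energies in this section.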

The results for a few selected carbonyl groups and their hydrogen bonding partners are shown in Table XVIII. Two of them are fully exposed, another three are parts of α helices and have 1,4 hydrogen bonds to NH groups, and one is part of a β-strand hydrogen bonded to an NH group. The average O···H distances show that the hydrogen bonded C=O groups remain hydrogen bonded during the simulation. Further, their solvent accessibility during the simulation is similar to that in the crystal structure.

Both of the fully exposed CO groups form two direct hydrogen bonds with water molecules (equal to about -3 or -4 kcal/mol each) and there are additional interactions between the CO group and other nearby water molecules. These interactions are favorable for Gln-2 and unfavorable for Ala-32. This is an effect of neighboring groups, which orient the waters differently in the two cases. Clearly, the additivity of group contributions does not apply and the surface area proportionality assumption fails.

The CO groups of Ser-28 and Glu-29, which are involved in helical hydrogen bonds, exhibit average interaction energies with water of -6.4 and -4.5 kcal/mol, respectively. About -1 kcal/mol originates from van der Waals interactions and the rest is electrostatic. The interaction of Asp-12 with water is lower because its CO group also interacts with the side chain of Thr-16. The CO of Lys-98, which participates in a β sheet, also interacts with water, albeit to a smaller extent than the helical CO groups.

FIG. 2. Average interaction energy of CO groups with water during a molecular dynamics simulation of barnase, as a function of the average accessible surface area of the oxygen atom.

TABLE XVIII
Carbonyl-Water Interaction Energies in Barnase Simulation^a

| Carbonyl group | Structure | Average O···H distance | ASA of O in crystal | Average ASA of O^b | CO-water E_tot (E_vdW) | CαNH-water E_elec | NH-water E_vdW |
|---|---|---|---|---|---|---|---|
| Gln-2 | Exposed | - | 33.6 | 22.8 ± 5 | -15.0 ± 3.4 (-1.9 ± 1.0) | - | - |
| Ala-32 | Exposed (end of helix) | - | 29.26 | 29 ± 1.8 | -8.2 ± 2.9 (-1.5 ± 1.0) | - | - |
| Asp-12 | Helix, H bonded to Thr-16 NH and side chain | 2.06 ± 0.2 | 0.89 | 0.36 ± 0.8 | -1.0 ± 2.6 (-1.4 ± 0.4) | -0.4 ± 0.9 | -0.7 ± 0.1 |
| Ser-28 | Helix, H bonded to Ala-32 NH | 2.17 ± 0.2 | 4.98 | 4.73 ± 2.5 | -6.4 ± 3.2 (-1.3 ± 0.7) | +0.11 ± 0.7 | -0.8 ± 0.1 |
| Glu-29 | Helix, H bonded to Leu-33 NH | 2.2 ± 0.2 | 0.25 | 0.83 ± 0.7 | -4.5 ± 2.3 (-0.8 ± 0.6) | -0.26 ± 0.7 | -0.7 ± 0.1 |
| Lys-98 | β Strand, H bonded to Thr-107 NH | 2.07 ± 0.2 | 0.71 | 0.15 ± 0.3 | -2.0 ± 2.5 (-0.5 ± 0.4) | -1.0 ± 1.2 | -0.7 ± 0.2 |

^a Accessible surface areas (ASA) in Å², energies in kcal/mol.
^b Average over all 317 frames in all simulation values. The ± numbers are standard deviations.

The interaction with water of the NH groups to which four of the above CO groups are hydrogen bonded was also calculated. This interaction is much smaller than that for CO groups, i.e., the interaction energy ranges from -0.7 to -1.7 kcal/mol. This is consistent with the fact, already mentioned, that NH groups are capable of forming only one hydrogen bond, whereas CO groups are capable of forming two. The interaction with water of some fully exposed NH groups that are not involved in protein hydrogen bonds was found to be about -4 kcal/mol.
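The quoted range for the NH groups can be reproduced by adding the last two columns of Table XVIII for the four hydrogen bonded partners. In the sketch below those two columns are read as the electrostatic and van der Waals parts of the NH-water interaction, which is an interpretation of a partly illegible table header rather than something the text states outright; the dictionary keys are labels chosen for illustration.

```python
# (electrostatic, van der Waals) NH-water energies in kcal/mol, read from
# the last two columns of Table XVIII (column assignment is an assumption).
NH_WATER = {
    "Thr-16 NH (partner of Asp-12 CO)": (-0.4, -0.7),
    "Ala-32 NH (partner of Ser-28 CO)": (0.11, -0.8),
    "Leu-33 NH (partner of Glu-29 CO)": (-0.26, -0.7),
    "Thr-107 NH (partner of Lys-98 CO)": (-1.0, -0.7),
}


def total_nh_water_energy(components):
    """Total NH-water interaction energy from its two components."""
    electrostatic, van_der_waals = components
    return electrostatic + van_der_waals


def energy_range(table):
    """Return (most negative, least negative) total energy in the table."""
    totals = [total_nh_water_energy(v) for v in table.values()]
    return min(totals), max(totals)


if __name__ == "__main__":
    for label, components in NH_WATER.items():
        print(label, round(total_nh_water_energy(components), 2))
    print(energy_range(NH_WATER))
```

Under this reading the totals span the -0.7 to -1.7 kcal/mol range given in the text, well below the roughly -4 kcal/mol found for fully exposed NH groups.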

The above results question the validity of any approach that uses the ASAs to estimate the solvation energy of polar groups. Specifically, they suggest that the solvation enthalpy of a folded protein is significantly underestimated and that, as a consequence, the change in solvation energy upon unfolding is overestimated. In the M & P subtraction analysis, this correction would result in a decrease in the native state stabilization required from internal van der Waals and electrostatic interactions, in accord with the atom-based calculations described in the previous sections. We consider this point further in Section VII.

D. Continuum Electrostatics Calculations of Solvation

Continuum electrostatics calculations provide another approach to polar and charged group solvation of peptides and proteins. The methodology and applications of the continuum approach based on the Poisson-Boltzmann equation are covered in several recent reviews (Sharp and Honig, 1990; Davis and McCammon, 1990; Rashin, 1993). Results of such calculations have a bearing on the present analysis; in particular, they relate to the validity of the accessible surface area model for the hydration of polar groups. Again, we note that due to the use of a temperature-dependent dielectric constant, the calculated solvation effect corresponds more to a free energy than to an energy or enthalpy (Roux et al., 1990). Nevertheless, the results are of interest particularly because of the dominant enthalpic contribution to polar and ionic solvation.

Gilson et al. (1985) presented analytical solutions of the continuum electrostatics equations for simple geometries. They calculated the self-energy (here equal to the solvation energy) of a charge in various positions relative to a plane dielectric boundary, for which they found a curve of sigmoidal shape. When the charge is buried in the low dielectric region, its solvation energy varies with the distance from the boundary. If it is next to the boundary it retains much of its solvation energy, even though it has no contact (no exposed surface area) with the high dielectric region.

Rashin (1992) solved the Poisson equation for the C peptide of RNase A in the helical and unfolded conformations. The solvation energy of the peptide groups in the helical form was found to be only slightly less negative than that in the unfolded form. This indicates that the accessible surface area approach would overestimate the difference in solvation energy between the folded and unfolded forms of the helix.

Yang et al. (1992) addressed the problem of protein tertiary structure formation. They conceptually separated the process of folding into two steps: formation of secondary structural elements, and assembly of these elements to form the native structure. For the enthalpy of secondary structure formation they used the experimental enthalpy found for helices [about -1.3 kcal/mol (Scholtz et al., 1991)]. For the assembly step, they performed continuum electrostatics calculations. The change in electrostatic free energy for this second step was found to be positive in all cases (i.e., the electrostatic contribution destabilizes the folded form). In fact, the destabilization was large enough to counterbalance the favorable enthalpy of secondary structure formation.

To compare the results of M & P with the predictions of continuum electrostatics in a consistent way, it would be useful to have the electrostatic energy difference between the folded and the unfolded proteins. A set of such calculations has been performed with the program UHBD, which was used to solve the Poisson-Boltzmann equation. (These calculations were performed by Dr. Michael Schaefer.) For cytochrome c and myoglobin the solvation energy of heme was separately determined and added to that of the unfolded chain. The calculations were performed at zero ionic strength with a solvent dielectric constant equal to 80 and a protein dielectric constant equal to unity; the final grid size was 0.5 Å. The solvation energies for the folded and unfolded chains and their difference are given in Table XIX and compared to the change in polar hydration enthalpy calculated by M & P at 25°C. The results for a 20-residue polyalanine α helix are also given.

From the results in Table XIX it is seen that the M & P polar hydration enthalpy difference between the folded and unfolded states is about four times as large as the continuum electrostatics results for $\Delta E_N^U(\mathrm{sol,p})$. This is consistent with the conclusion from the previous analyses that the accessible surface area approach substantially overestimates the change in this quantity. Table XIX also gives the internal electrostatic energy change, $\Delta E_N^U(\mathrm{vac,elec})$, calculated by the atomic model (Section III) and the total (internal plus solvation) electrostatic energy changes upon unfolding, $\Delta E_N^U(\mathrm{sol,elec})$. The latter are negative for all proteins. This means that they are unfavorable for stability, though the net electrostatic contribution is small. These results are consistent with those of Yang et al. (1992), which were also based on the Poisson-Boltzmann equation.

TABLE XIX
Solvation (Free) Energy Differences of Folded and Unfolded Proteins in Poisson-Boltzmann Approximation^a

| Protein | $\Delta E_{\mathrm{solv}}(N)$ | $\Delta E_{\mathrm{solv}}(U)$ | $\Delta E_N^U(\mathrm{sol,p})$ | M & P^b | $\Delta E_N^U(\mathrm{vac,elec})$ | $\Delta E_N^U(\mathrm{sol,elec})$^c |
|---|---|---|---|---|---|---|
| Cytochrome c | -441 | -789 | -348 | -1525 | +225 | -123 |
| RNase A | -539 | -958 | -419 | -1854 | +398 | -21 |
| Lysozyme | -546 | -1020 | -474 | -1770 | +351 | -123 |
| Myoglobin | -558 | -1087 | -529 | -2246 | +421 | -108 |
| Ala-20 | -45 | -89 | -44 | -205^d | +60 | +16 |

^a All values in kcal/mol.
^b Polar hydration enthalpy, $\Delta H_N^U(\mathrm{sol,p})$, according to M & P (1993).
^c $\Delta E_N^U(\mathrm{sol,elec}) = \Delta E_N^U(\mathrm{sol,p}) + \Delta E_N^U(\mathrm{vac,elec})$.
^d Calculated as $\Delta\mathrm{ASA}_{\mathrm{CONH}}$ · (406.8 cal/mol Å²) (M & P, 1993).

For the polyalanine helices the loss of internal electrostatic interactions upon unfolding is larger than the gain of solvation energy. The result is that the total electrostatic contribution is favorable to helix formation according to the continuum electrostatics model. The loss of solvation calculated by the approach of M & P predicts an unfavorable polar contribution to helix formation.
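The bookkeeping behind Table XIX can be made explicit with a short calculation. The following is a minimal sketch in Python and is not part of the original analysis: the numbers are transcribed from Table XIX, whereas the names `TABLE_XIX` and `electrostatic_summary` are illustrative choices. It forms the solvation difference as unfolded minus native, adds the vacuum electrostatic term, and reports the ratio of the M & P polar hydration enthalpy to the continuum result.

```python
# Values transcribed from Table XIX (kcal/mol). Each row holds:
# solvation energy of the native chain, solvation energy of the unfolded
# chain, M & P polar hydration enthalpy, vacuum electrostatic energy change.
TABLE_XIX = {
    "Cytochrome c": (-441, -789, -1525, 225),
    "RNase A": (-539, -958, -1854, 398),
    "Lysozyme": (-546, -1020, -1770, 351),
    "Myoglobin": (-558, -1087, -2246, 421),
    "Ala-20": (-45, -89, -205, 60),
}


def electrostatic_summary(e_native, e_unfolded, mp_polar, de_vac_elec):
    """Return (solvation change, total electrostatic change, M&P / PB ratio)."""
    de_sol_p = e_unfolded - e_native      # solvation change on unfolding
    de_total = de_sol_p + de_vac_elec     # internal plus solvation
    ratio = mp_polar / de_sol_p           # how much larger the ASA estimate is
    return de_sol_p, de_total, ratio


SUMMARY = {name: electrostatic_summary(*row) for name, row in TABLE_XIX.items()}

for name, (de_sol_p, de_total, ratio) in SUMMARY.items():
    print(f"{name:13s} dE(sol,p)={de_sol_p:6d} dE(sol,elec)={de_total:5d} "
          f"M&P/PB={ratio:4.1f}")
```

With these inputs the ratio lies between roughly 3.7 and 4.7 for every entry, which is the sense in which the M & P values are "about four times" the continuum results.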

To examine the helical case in more detail, we calculate the Poisson-Boltzmann solvation free energy of the folded and extended states for a series of α helices of different lengths; the structures were generated in the same way as that of the 20-mer. The exact results depend somewhat on the model for the extended state. The results are shown in Table XX.

TABLE XX
Solvation Energies of Helical and Extended Alanine Peptides in Continuum Approximation^a

| Protein | $\Delta E_{\mathrm{sol}}(\mathrm{N})$ | $\Delta E_{\mathrm{sol}}(\mathrm{U})$ |
|---|---|---|
| Ala-16 | -43.05 | -71.41 |
| Ala-20 | -45.42 | -89.02 |
| Ala-24 | -48.25 | -103.84 |
| Ala-28 | -51.00 | -122.63 |
| Ala-32 | -52.84 | -138.60 |

^a All values in kcal/mol.

If the results are fitted to a straight line in the number of residues $N$ we obtain

$$\Delta E_{\mathrm{sol}}(\mathrm{N}) = -0.6N - 33\ \text{kcal/mol} \tag{10}$$

for the helical and

$$\Delta E_{\mathrm{sol}}(\mathrm{U}) = -4.2N - 4.3 \tag{11}$$

for the extended conformation. The difference in electrostatic solvation energy from extended to helical conformation is

$$\Delta E_{\mathrm{U}}^{\mathrm{N}}(\mathrm{sol,p}) = 3.6N - 28.7. \tag{12}$$

The internal energy of helix formation was found in Section V to be

$$\Delta E_{\mathrm{U}}^{\mathrm{N}}(\mathrm{vac}) = -8N + 30\ \text{kcal/mol} \tag{13}$$

and the electrostatic component of this energy is

$$\Delta E_{\mathrm{U}}^{\mathrm{N}}(\mathrm{vac,elec}) = -4.3N + 24.1; \tag{14}$$

therefore, the total change in electrostatic energy upon helix formation is given by the sum of Eqs. (12) and (14):

$$\Delta E_{\mathrm{U}}^{\mathrm{N}}(\mathrm{sol,elec}) = -0.7N - 4.6. \tag{15}$$

Equation (15) holds for a helix that is long enough that the two ends interact negligibly. This equation predicts that for an infinite helix the contribution of each bond to the enthalpy (if we assume that the electrostatic free energy is all enthalpic) is -0.7 kcal/mol. This value is close to the experimental value of about -1.3 kcal/mol (Scholtz et al., 1991). The agreement would improve if there is a net effect of van der Waals interactions favoring the α helix.
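The linear relations of Eqs. (10)-(12) and (15) can be recovered from the Table XX data by an ordinary least-squares fit. The sketch below is illustrative only: the data are transcribed from Table XX and the coefficients of Eq. (14) are taken from the text, whereas the helper `linear_fit` and all variable names are assumptions introduced here, and the original fitting procedure is not described in detail.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Table XX: number of residues and solvation energies (kcal/mol).
RESIDUES = [16, 20, 24, 28, 32]
E_HELIX = [-43.05, -45.42, -48.25, -51.00, -52.84]
E_EXTENDED = [-71.41, -89.02, -103.84, -122.63, -138.60]

helix_slope, helix_icpt = linear_fit(RESIDUES, E_HELIX)      # cf. Eq. (10)
ext_slope, ext_icpt = linear_fit(RESIDUES, E_EXTENDED)       # cf. Eq. (11)

# Solvation change on helix formation: helical minus extended, cf. Eq. (12).
sol_slope = helix_slope - ext_slope
sol_icpt = helix_icpt - ext_icpt

# Vacuum electrostatic term of Eq. (14), coefficients quoted from the text.
VAC_ELEC_SLOPE, VAC_ELEC_ICPT = -4.3, 24.1

# Total electrostatic change on helix formation, cf. Eq. (15).
total_slope = sol_slope + VAC_ELEC_SLOPE
total_icpt = sol_icpt + VAC_ELEC_ICPT

print(f"helix:    {helix_slope:.2f} N + ({helix_icpt:.1f})")
print(f"extended: {ext_slope:.2f} N + ({ext_icpt:.1f})")
print(f"total:    {total_slope:.2f} N + ({total_icpt:.1f})")
```

The per-residue slope of the total, close to -0.7 kcal/mol, is the quantity compared with the experimental value in the text; the intercept collects the end effects.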

E. Summary and Implications for Protein Stability

The purpose of the calculations that we described in this section was to test the assumption that the solvation enthalpies of polar groups and, in particular, the peptide group, are proportional to their ASA. This test was done by integral equation theory calculations of the solvation enthalpy of NMA and small model peptides in various conformations, by analysis of the interaction energy of carbonyl groups with water in a molecular dynamics simulation of barnase, and by continuum Poisson-Boltzmann calculations of folded and extended proteins and helices.

The conclusion from the calculations is that there is no simple proportionality between the solvation enthalpy and the ASA for polar groups. The integral equation treatment revealed large deviations from a linear relationship between the ASA and hydration enthalpy (Fig. 1). Further, it demonstrated that the solvation enthalpy is always less than or equal to that given by the ASA. The analysis of molecular dynamics trajectories showed a wide scatter of the CO-solvent interaction energies as a function of the surface area (Fig. 2). Finally, the continuum calculations gave much smaller electrostatic solvation contributions upon unfolding than those obtained from the surface proportionality assumption.

Thus, although the various approaches each have inherent approximations, their agreement provides strong evidence that the surface area proportionality assumption is not valid for polar groups. Moreover, the quantitative results obtained from the various methods indicate that the ASA significantly overestimates the magnitude of the polar group solvation enthalpy change upon unfolding, $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{sol,p})$. One source of this failure for the peptide group is that it loses almost all of its ASA upon secondary structure formation; according to the ASA it should lose all of its solvation enthalpy. However, even a very small amount of ASA is sufficient for the formation of a hydrogen bond between a CO group and water; a CO group, which normally forms two hydrogen bonds, can participate in secondary structures while interacting significantly with the solvent.

Although the qualitative result that $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{sol,p})$ is overestimated significantly by the ASA is clear from the present analysis, a quantitative value for the overestimate is not possible because the ASA is not applicable, i.e., one cannot simply change the proportionality constant used in the solvation enthalpy calculation and expect to get accurate results. In Section VII, we nevertheless make some estimates of $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{sol,p})$ by use of the thermodynamic cycle (Scheme I).

In addition to its effect on the absolute values of the polar solvation enthalpy change on unfolding, the breakdown of the surface proportionality assumption may also change the relative stabilities of the four proteins studied by M & P and in this work. That would be the case if the overestimate of the solvation enthalpy varied significantly among the four proteins. This can be illustrated by comparing RNase A and Mb, which is instructive because the former buries the greatest amount of polar surface area per residue, whereas the latter buries the greatest amount of nonpolar area per residue. This raises the question why Mb is not significantly more stable than RNase A, given that it buries a much larger amount of nonpolar surface area per residue on folding, which is expected to yield a higher degree of stabilization from the hydrophobic effect.

One explanation, proposed by Yang et al. (1992) and also deduced from the analysis of M & P, is that the transfer of polar groups from water to the protein interior is enthalpically more unfavorable in myoglobin. According to the analysis of M & P, the source of the unfavorable polar contribution in myoglobin is that the internal (hydrogen bonding) interactions in myoglobin are weaker than those in RNase A. As a result, RNase should exhibit a much larger, positive value of $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{vac})$ per residue (see Section II). However, this is not borne out by the CHARMM calculations. In Table IV we see that the two proteins exhibit a similar value of $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{vac})$ per gram. Moreover, an analysis of the hydrogen bond strengths showed no differences between the two proteins.

Another possible explanation is that the polar solvation enthalpy is overestimated more for RNase than for Mb. This would happen, for example, if RNase had a greater number of hydrogen bonded groups close to the surface. To test this explanation we counted the number of carbonyl O atoms in the four proteins which have accessibility greater than 1 or 2 Å². The results are shown in Table XXI. We notice that Mb has significantly fewer carbonyl oxygens "slightly" accessible to solvent, i.e., it buries its peptide groups to a greater extent than the other four proteins. This follows from the large amount of helical secondary structure in myoglobin. The peptide groups on the sides of the helices which are packed against each other in the protein interior are screened effectively from solvent and have a very low solvation enthalpy. By contrast, the dominant β-sheet structure of RNase leads to more surface hydrogen bonds. This difference in accessibility will not be fully captured by the ASA approach, because in the ASA calculation an accessibility of 2 or 3 Å² makes a negligible solvation contribution. However, from our calculations, such an accessibility is very important and could provide the explanation of the similar stability of the two proteins.

TABLE XXI
Number of Backbone Carbonyl O Atoms with ASA Greater Than 1 or 2 Å²

| Protein | >2 Å² (per residue) | >1 Å² (per residue) |
|---|---|---|
| Cytochrome c | 47 (0.456) | 52 (0.505) |
| RNase | 57 (0.460) | 62 (0.500) |
| Lysozyme | 61 (0.473) | 67 (0.519) |
| Myoglobin | 51 (0.333) | 61 (0.399) |
| Barnase | 56 (0.509) | 59 (0.536) |


In addition to the four proteins, Table XXI includes a calculation on barnase. Recent calorimetric studies have found that the enthalpy of unfolding of barnase is unusually high compared with that of other proteins (Griko et al., 1994; Martinez et al., 1994). Energy calculations performed on barnase (not reported) show that $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{vac})$ per residue for this protein is very similar to that for the other four proteins. Therefore, the larger experimental $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{sol})$ of barnase must be a result of a higher solvation contribution. One plausible explanation is based on the results of Table XXI, which show that the native structure of barnase is characterized by the highest average peptide group accessibility among the proteins studied. Even though there is no simple proportionality between solvation energy and ASA, the hydrogen bonding calculations indicate that the contribution from hydration of these peptide groups upon unfolding will be smaller than that in the other proteins.

VII. CONCLUDING DISCUSSION

Although the overall thermodynamic quantities (enthalpies, entropies, and free energies) required for an understanding of protein stability can now be measured with high accuracy (Privalov, 1979, 1989), the analysis of the results and the evaluation of the individual contributions is still very difficult. As M & P (1993) stated in the introduction to the article which provided much of the motivation for this study,

It is clear that the stability of a protein's compact native structure in aqueous solutions is determined not only by the interactions between the groups of protein molecules but also by the interactions between these groups and the surrounding water molecules. It actually represents a balance of these two types of interaction. By studying the thermodynamics of protein unfolding we can determine this resultant effect in terms of the enthalpy, entropy and Gibbs free energy. We cannot say anything definite, however, about the components of this process caused by the disruption of internal bonds in proteins and the establishment of new bonds with water molecules. On the other hand, it is clear that knowledge of these components is of crucial importance in understanding the mechanism of protein folding.

In this paper we have shown how empirical energy calculations with an atomic model and statistical mechanical analyses can provide information that answers two types of questions concerning the stability of proteins. The first is the basic question of the interactions that contribute to the enthalpy difference between the native and unfolded states. The second concerns the phenomenological models that have been used to obtain insights into the origins of protein stability. Although the results presented here are only the first steps in a more detailed approach to protein thermodynamics, they nevertheless increase our understanding.


The present calculations suggest that the use of a thermodynamic cycle that isolates the vacuum unfolding enthalpy, $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{vac})$, can aid in our understanding of protein stability. If such a cycle is used, it is no longer necessary to use model systems (e.g., hydrocarbon crystals) and simplified constructs (e.g., hydrogen bonding) to evaluate the contributions to $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{vac})$. Instead, if the structure of the protein is known, direct calculations with all-atom models and empirical energy functions provide the best estimates of the van der Waals and electrostatic contributions.

The magnitude of van der Waals interactions and their contribution to protein stability is still not fully understood (Privalov and Gill, 1988; Nicholls et al., 1991). The results presented here clearly show that hydrocarbon crystals, in contrast to liquid hydrocarbons, are the better model systems for the protein interior. This is in accord with expectations since proteins have packing densities similar to organic crystals (Richards, 1977). However, a large portion of the nonpolar van der Waals energy in proteins arises from interactions with polar groups. Thus, these additional van der Waals interactions must be considered to obtain a correct estimate of the nonpolar group contribution to the enthalpy of unfolding. Also, polar group-polar group van der Waals interactions are found to be very important. Moreover, the calculations demonstrate, as mentioned above, that at the present stage of development of empirical energy functions, the best approach to determining the van der Waals contribution to protein stability in $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{vac})$ is to do a calculation based on a slightly minimized X-ray structure or a good NMR structure, if it is available. This avoids the need to refer to any model system.

The high packing density in proteins is shown to provide an additional enthalpic stabilizing factor over and above the conventional hydrophobic (nonpolar liquid to water) transfer energies. The van der Waals interactions of nonpolar groups in water are similar to those in pure nonpolar liquids and do not vary significantly with temperature. Cyclic dipeptides appear to be no different from alkane crystals in terms of nonpolar van der Waals interactions.

The present results have implications for other methods that have been proposed for interpreting protein stability. We mention one example that is based on the "liquid hydrocarbon model" of the protein interior (Baldwin, 1986; Spolar et al., 1989, 1992). In this model, the enthalpy of transfer of nonpolar groups from the protein interior to water is taken to be equal to zero near room temperature (295 K), as it is for liquid hydrocarbons. Section IV indicates that this enthalpy of transfer should be more positive than the liquid hydrocarbon model suggests. The enthalpic contribution of polar groups in this model is described by a surface area-dependent term and a constant term, which is determined by fitting protein experimental data. Therefore, the underestimation of the enthalpic contribution of the nonpolar groups leads to an overestimation of the enthalpy of transfer of polar groups from protein to water, i.e., this contribution turns out to be too positive.

The analysis of hydrogen-bonding contributions to protein stability in $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{vac})$ is complicated by a fundamental ambiguity in the definition of the hydrogen-bonding energy. In small molecules, such as water, the concept of a hydrogen bond energy has a clear meaning. However, in larger molecules, including proteins, longer-range dipole-dipole interactions are present and contribute significantly to the total interaction energy. The most straightforward definition, the interaction between the CO and NHC$^{\alpha}$ groups that form "good" hydrogen bonds, is only one component of the total electrostatic interaction. The case study of helix formation in vacuum clearly showed that a multitude of interactions, some of them long ranged, contribute to the total stabilization. Ascribing all of the electrostatic energy to hydrogen bonding is too simplistic.

The accessible surface area approach has been widely used for estimating the solvation contributions to protein folding. For nonpolar groups it is relatively well established, although there are still a few issues that need to be resolved (Sharp et al., 1991b). For polar groups, however, the accessible surface area model introduces significant errors. Continuum electrostatics calculations, analyses of molecular simulation data, and integral equation theory all lead to the result that surface area arguments based on model compound data significantly overestimate the difference in polar group solvation between the folded and unfolded states of proteins. The calculations agree with the analysis of M & P in that the peptide group is very important in folding thermodynamics. However, the difference in polar solvation enthalpy of the peptide group between folded and unfolded proteins is overestimated.

It is questionable whether a simple solvation model that provides accurate results can be found. Alternatives to strict surface area proportionality already exist. Savage et al. (1993) proposed a discrete version of the ASA approach: they consider that one hydrogen bond to water forms when the ASA is between 0.01 and 9.35 Å², and that two are formed when the ASA is between 9.35 and 18.7 Å², etc.; this is not in exact accord with our simulation results. Stouten et al. (1993) propose the occupancy of the solvation shell of each protein atom as a measure of its solvation. None of these methods, however, deal with the effect of neighboring polar groups [nonadditivity, or correlation effects (Ben-Naim, 1990; Yu et al., 1991)]. In the barnase simulation data we found a wide variation in polar group-water interaction energies depending on the structural context. The errors arising from nonadditivity effects appear to be significant.
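To make the discrete ASA rule attributed to Savage et al. (1993) concrete, the following sketch maps an accessible surface area to a hydrogen bond count. It is a hedged illustration only: the bin edges 0.01 and 9.35 Å² come from the text, but the extension of the "etc." to further equal-width bins, the handling of values exactly on a bin edge, the optional cap `max_bonds`, and the function name are assumptions made here.

```python
import math

BIN_WIDTH = 9.35   # Å^2 of ASA per water hydrogen bond (value quoted in the text)
MIN_ASA = 0.01     # Å^2; below this no hydrogen bond to water is counted


def discrete_hbond_count(asa, max_bonds=None):
    """Number of water hydrogen bonds assigned to a group with the given ASA.

    Assumes equal-width bins continuing beyond 18.7 Å^2; max_bonds is a
    hypothetical cap (e.g. 2 for a carbonyl oxygen), not part of the source.
    """
    if asa < MIN_ASA:
        return 0
    count = math.floor(asa / BIN_WIDTH) + 1
    if max_bonds is not None:
        count = min(count, max_bonds)
    return count


for asa in (0.0, 1.0, 9.0, 10.0, 18.0, 20.0):
    print(asa, discrete_hbond_count(asa), discrete_hbond_count(asa, max_bonds=2))
```

In contrast to strict proportionality, a group with only 1 Å² of exposed area is already assigned one full hydrogen bond under this rule, which is qualitatively in line with the observation that a very small ASA suffices for a CO-water hydrogen bond.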

A different approach to modeling polar interactions was proposed by Honig and co-workers (1993). A process, such as folding or association, is treated in three steps: (1) uncharging of the polar groups, (2) performing the conformational change or association reaction with the uncharged molecule(s), and (3) recharging the polar groups. The first and third steps are calculated by continuum electrostatics and the second step by surface accessibility methods. It will be interesting to see the results of more applications of this approach. There are limitations to the quantitative accuracy of continuum electrostatics (it appears to overestimate the differences in solvation between the folded and unfolded states) and the uncertainties could easily be of the order of the total free energy changes associated with protein folding, which are very small.

The analysis of M & P finds that the net effect of transferring polar groups from water to the protein interior for the four proteins studied is enthalpically destabilizing but becomes stabilizing when the entropic contribution is added. At room temperature they predict that both polar and nonpolar group transfers make favorable and similar contributions to the free energy of folding. Our results indicate two sources of error in their analysis: (1) the neglect of polar-nonpolar van der Waals interactions, which leads to an underestimation of the enthalpy of nonpolar group transfer; and (2) an overestimation of the change in polar solvation enthalpy (and probably entropy) due to the use of the ASA proportionality assumption. We provide a revision of the M & P numbers based on the present analysis, i.e., we make some simple estimates to obtain a likely decomposition of the enthalpy and briefly comment on the entropy and free energy of unfolding for the four proteins that were studied in M & P (1993).

For the nonpolar solvation enthalpies (Table XXII), we use the results given by M & P (see Table I). For the intraprotein energy change on unfolding, $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{vac})$, we use the CHARMM results. One-half of the polar-nonpolar van der Waals interactions are assigned to the nonpolar groups and the other half to the polar groups. The polar internal energy change is the difference between the total van der Waals plus electrostatic energy changes and the nonpolar van der Waals energy (Table VII). The polar solvation enthalpy contribution, $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{sol,p})$, is calculated by difference from the experimental solvation enthalpy $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{sol})$. The resulting values for the polar solvation enthalpy are 2 to 2.5 times smaller in magnitude than M & P's numbers. This is in qualitative agreement with the RISM results. The continuum electrostatics calculations gave energy changes about 4 times smaller than M & P's polar solvation enthalpies. This is consistent with the fact that the continuum electrostatic energies include an entropic component which opposes solvation. The resulting total enthalpy changes of the polar and nonpolar groups from the folded to the unfolded proteins are given in Table XXIII, i.e., they include the van der Waals and electrostatic term from vacuum calculation plus the solvation enthalpy.

TABLE XXII
Proposed Decomposition of Enthalpy of Unfolding^{a,b}

| Protein | $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{sol,np})$ | $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{vac,p})$ | $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{vac,np})$ | $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{sol})$ | $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{sol,p})$ |
|---|---|---|---|---|---|
| Cytochrome c | -246 | 516 (1526) | 397 (265) | 21.3 | -646 (-1525) |
| RNase A | -219 | 721 (1910) | 331 (234) | 70.3 | -763 (-1854) |
| Lysozyme | -261 | 686 (1808) | 403 (281) | 57.8 | -770 (-1770) |
| Myoglobin | -376 | 849 (2219) | 592 (404) | 1.4 | -1064 (-2246) |

^a At 25°C. All values in kcal/mol.
^b Numbers in parentheses are those proposed by M & P (1993).
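The "by difference" step and the group totals can be written out explicitly. The sketch below is a minimal illustration under one stated assumption: that the thermodynamic cycle amounts to the additive relation in which the unfolding enthalpy in solution equals the sum of the vacuum polar, vacuum nonpolar, nonpolar solvation, and polar solvation terms. The input numbers are transcribed from Table XXII; the names `TABLE_XXII` and `decompose` are illustrative.

```python
# Table XXII inputs (kcal/mol, 25°C), one tuple per protein:
# nonpolar solvation enthalpy, vacuum polar term, vacuum nonpolar term,
# and the experimental unfolding enthalpy in solution.
TABLE_XXII = {
    "Cytochrome c": (-246, 516, 397, 21.3),
    "RNase A": (-219, 721, 331, 70.3),
    "Lysozyme": (-261, 686, 403, 57.8),
    "Myoglobin": (-376, 849, 592, 1.4),
}


def decompose(dh_sol_np, dh_vac_p, dh_vac_np, dh_sol):
    """Return (polar solvation enthalpy, polar total, nonpolar total)."""
    # Assumed additive cycle: dh_sol = dh_vac_p + dh_vac_np + dh_sol_np + dh_sol_p
    dh_sol_p = dh_sol - (dh_vac_p + dh_vac_np + dh_sol_np)
    polar_total = dh_vac_p + dh_sol_p          # polar vacuum plus polar solvation
    nonpolar_total = dh_vac_np + dh_sol_np     # nonpolar vacuum plus solvation
    return dh_sol_p, polar_total, nonpolar_total


RESULTS = {name: decompose(*row) for name, row in TABLE_XXII.items()}

for name, (dh_sol_p, polar, nonpolar) in RESULTS.items():
    print(f"{name:13s} dH(sol,p)={dh_sol_p:8.1f} polar={polar:7.1f} "
          f"nonpolar={nonpolar:6.1f}")
```

By construction the polar and nonpolar totals add up to the experimental unfolding enthalpy for each protein, which provides a simple consistency check on the tabulated group contributions.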

Because we have not performed any calculations for the entropy that could permit us to evaluate the different contributions, we do not give a reanalysis of the results of Privalov and Makhatadze (1993) for the entropy and free energy. However, it is highly probable that the use of the ASA proportionality assumption for the entropy of the polar groups leads to an overestimation of the polar hydration entropy change. This would result in a change in the side-chain entropy values, which also seem to be too large. Such changes would carry over to the free energy analysis. It is left to a future study to investigate these questions in more detail.

TABLE XXIII
Proposed Contributions of Polar and Nonpolar Groups to $\Delta H_{\mathrm{N}}^{\mathrm{U}}(\mathrm{sol})$^{a,b}

| Protein | Polar | Nonpolar |
|---|---|---|
| Cytochrome c | -130 (+1) | 151 (19.5) |
| RNase A | -42 (+56) | 112 (15.2) |
| Lysozyme | -84 (+38) | 142 (19.6) |
| Myoglobin | -215 (-27) | 216 (28.4) |

^a At 25°C. All values in kcal/mol. The nonpolar-polar contribution to $\Delta E_{\mathrm{N}}^{\mathrm{U}}(\mathrm{vdW})$ is divided equally between the polar and nonpolar groups.
^b Numbers in parentheses are those proposed by M & P (1993).


An important point to note in any analysis of protein stability is that many quantities scale approximately linearly with the size of the protein. This is what is expected in the thermodynamic limit for a homogeneous system since the individual energetic contributions and the enthalpy, entropy, and free energy are extensive properties. It appears that the composition and structure (packing) of most proteins are sufficiently similar that approximately extensive behavior is observed. This means that many simple models that depend on a scaling with mass of the protein can give the correct behavior if a fitting procedure is used, even if the basic interpretation is incorrect. Thus, care has to be used in validating phenomenological models that depend on experimental fits. It appears to us that this is one of the contributing factors that has led to the existence of a wide range of models for protein stability. We hope that the approach taken here, which makes use of chemically meaningful, even if approximate, theoretical calculations, will aid in arriving at a better understanding of the factors contributing to protein stability.

APPENDIX: COMPUTATIONAL DETAILS

Unless noted otherwise, all calculations reported herein have been performed with CHARMM, version 22 (Brooks et al., 1983). Parameter set 19 (PARAM19) was used. In this parameter set polar hydrogen atoms are treated explicitly, while nonpolar hydrogens are incorporated into the heavy atom to which they are bonded.

Sources of Protein Crystal Coordinates

The following coordinate sets were taken from the Brookhaven Protein Data Bank: 5CYT for horse heart cytochrome c, 1MBC for sperm whale (carbonmonoxy) myoglobin, and 8RAT for bovine pancreatic ribonuclease A. The coordinate sets of lysozyme and barnase were kindly provided by D. C. Phillips, A. Cameron, and G. Dobson.

Protein Unfolding Energies in Vacuum

To be consistent with the work of Privalov and Makhatadze, we employed neutral side chains for all amino acid residues. An approximate charge distribution for the neutral side chains of Asp, Glu, Arg, and Lys was obtained by an arbitrary, yet "reasonable" reduction of the charges of the ionized forms in PARAM19 without adding or subtracting hydrogen atoms. For Asp and Glu the partial charges of the two carboxyl oxygens were reduced from -0.6 to -0.1. For Lys the partial charges of the three amino hydrogens were reduced from 0.35 to 0.1 and of the Cε from 0.15 to 0. For Arg, the partial charges of the four amino hydrogens were reduced from 0.35 to 0.20, of Cζ from 0.5 to 0.25, and of Hε from 0.3 to 0.15.

Polar hydrogen positions were added to the native protein by using the HBUILD facility of CHARMM and then the structures were minimized for 300 steepest descent steps to relieve possible bad contacts. The unfolded proteins were constructed as extended chains with φ = -140° and ψ = 135°, corresponding to β-pleated sheet, and then minimized by 300 steepest descent steps. To obtain straight unfolded chains and, thus, facilitate the continuum electrostatics calculations, prolines were treated in the same way as all other residues. The resulting distorted proline geometry is restored by local rearrangements during the subsequent minimization. The vacuum energy results are not substantially different when the correct proline geometry is taken into account in the construction of the unfolded chain.

The intraprotein energy of unfolding and the various contributions to it are obtained as the difference between the corresponding values in the folded and unfolded proteins. No cutoff is used for the nonbonded interactions in the final energy calculations, whereas standard cutoffs are employed during the minimizations (Brooks et al., 1983). (The results are very similar if the minimization is done without a cutoff.) For myoglobin and cytochrome c, the heme is assumed to dissociate and remain in the same conformation as in the protein and its intramolecular energy is added to that of the unfolded chain.

The polyalanine helices with 16, 20, 24, 28, and 32 residues were constructed in the ideal conformation (φ = -57° and ψ = -47°) and then minimized with 300 steps of adopted basis Newton-Raphson (ABNR) minimization (Brooks et al., 1983). In these calculations the use of no nonbonded cutoff during the minimizations was found to better maintain the ideal helical geometry. All helices are blocked with an acetyl group at the N terminus and a methyl amine group at the C terminus. The unfolded polyalanine chains were generated as described above for the unfolded proteins and then minimized for 300 ABNR steps. Subsequently, residue-residue or CO-NHC$^{\alpha}$ interactions were calculated and the difference between the helical and extended chains was used in the analysis. The results in Tables XIV and XV are for i = 5, but due to symmetry any residue would give similar results.

van der Waals Interactions of Groups in Proteins

For the calculation of the interaction energy of methyl and methylene groups and leucine and phenylalanine side chains in proteins we used a set of more extensively minimized protein structures (200 steepest descent and 500 ABNR steps). In these calculations a nonbonded cutoff of 8.5 Å was used, switched off from 8 Å, to compare with the results for bulk systems.
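The phrase "switched off from 8 Å" can be illustrated with a smooth switching function applied to a Lennard-Jones pair energy. The sketch below assumes the polynomial atom-based switching form commonly associated with CHARMM; the text itself does not state the functional form used, and the well depth `eps` and minimum-energy distance `r_min` in the usage lines are hypothetical values, not PARAM19 parameters.

```python
R_ON = 8.0    # Å, distance at which the interaction starts to be switched off
R_OFF = 8.5   # Å, cutoff distance beyond which the interaction is zero


def switch(r, r_on=R_ON, r_off=R_OFF):
    """Smooth factor going from 1 at r_on to 0 at r_off (assumed form)."""
    if r <= r_on:
        return 1.0
    if r >= r_off:
        return 0.0
    on2, off2, r2 = r_on * r_on, r_off * r_off, r * r
    return (off2 - r2) ** 2 * (off2 + 2.0 * r2 - 3.0 * on2) / (off2 - on2) ** 3


def lennard_jones(r, eps, r_min):
    """12-6 energy with well depth eps (> 0) at separation r_min."""
    x6 = (r_min / r) ** 6
    return eps * (x6 * x6 - 2.0 * x6)


def switched_lj(r, eps, r_min):
    """Lennard-Jones energy multiplied by the switching factor."""
    return switch(r) * lennard_jones(r, eps, r_min)


# Hypothetical parameters for illustration only.
for r in (4.0, 7.9, 8.25, 8.5):
    print(r, round(switched_lj(r, eps=0.2, r_min=4.0), 5))
```

Using the same truncation scheme for the protein, crystal, and liquid calculations is what allows the group interaction energies to be compared on an equal footing.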

Average Energies from Dynamics Simulations

To calculate the effect of temperature on intraprotein energies, we used a recent simulation of barnase (Caflisch and Karplus, 1994). This simulation has been performed at room temperature, in explicit water, and with stochastic boundary conditions. The reported energies were obtained with a cutoff of 8.5 Å, switched off from 8 Å.

Crystal Calculations

The calculations were carried out using the CRYSTAL facility of CHARMM. The alkane calculations are essentially a repetition of the calculations of MacKerell et al. (1995) with the extended atom parameter set 19. The cutoff for the nonbonded interactions was set to 8.5 Å, switched off from 8 Å. The crystal structure was minimized for 200 ABNR steps keeping the lattice fixed and for another 1000 ABNR steps allowing the lattice parameters to vary. For the cyclic dipeptides, c(Ala-Ala) and c(Leu-Pro), the calculation started from the published crystal structures (Sletten, 1970; Karle, 1972) and followed the same protocol as in the alkane calculations. We have also visually examined the crystal structure of c(Gly-Gly) (Degeilh and Marsh, 1959).

Molecular Dynamics Simulation of Pentane

The simulation was started from an equilibrated box of 128 all-hydrogen pentane molecules (A. D. MacKerell, Jr., et al., 1995). All hydrogens were deleted and the new system was equilibrated for 10,000 steps at constant temperature of 298.3 K and pressure of 1 atm. (The normal boiling point of pentane is 309 K.) The time step was 1 fs. The coupling constants for temperature and pressure were set to 0.01 and 0.00005, respectively. An 8.5 Å nonbonded cutoff was employed, switched off from 8 Å. Following equilibration, another 10,000 steps of dynamics were performed, during which the average energy of the system was calculated.

Monte Carlo Simulation of Propane

This simulation was carried out using the program BOSS, version 2.8 (Jorgensen, 1989). The propane parameters used were identical to those for extended carbons in proteins (PARAM19). Again, the cutoff was set to 8.5 Å, switched off from 8 Å. First a constant volume simulation was performed, starting from an equilibrated box of 267 propane molecules (Jorgensen, 1989). The temperature was set to -42°C, the normal boiling point of propane. After $2.5 \times 10^5$ steps of equilibration, the average energy was calculated over $10^6$ steps. Subsequently, the simulation was continued for $1.5 \times 10^6$ steps at a constant pressure of 1 atm. This resulted in a decrease in density (to 0.49 from 0.54 g/cm$^3$). The average energy over the last $10^6$ steps was calculated.
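To put the two propane densities in geometric terms, the sketch below converts a mass density and molecule count into the edge of an equivalent cubic box. The cubic shape, the molar mass constant, and the function name are assumptions made for illustration; the text does not describe the box geometry.

```python
AVOGADRO = 6.02214076e23      # 1/mol
PROPANE_MOLAR_MASS = 44.097   # g/mol for C3H8 (assumed value, not from the text)


def cubic_box_edge(n_molecules, density_g_cm3, molar_mass=PROPANE_MOLAR_MASS):
    """Edge length (Å) of a cubic box holding n_molecules at the given density."""
    volume_cm3 = n_molecules * molar_mass / (AVOGADRO * density_g_cm3)
    volume_A3 = volume_cm3 * 1.0e24   # 1 cm^3 = 1e24 Å^3
    return volume_A3 ** (1.0 / 3.0)


edge_constant_volume = cubic_box_edge(267, 0.54)    # density before the NPT stage
edge_constant_pressure = cubic_box_edge(267, 0.49)  # density after the NPT stage
```

Under these assumptions both edges are well above twice the 8.5 Å cutoff, so each molecule interacts with at most one periodic image of any other.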

Monte Carlo Simulations in Water

NPT simulations of toluene and 2-methylbutane in TIP4P water were performed with the BOSS program at room temperature and 1 atm. Toluene was also simulated at 80°C. The Lennard-Jones parameters used for these calculations were the same as the corresponding CHARMM extended atom (PARAM19) parameters and the scheme with an 8.5 Å cutoff, switched off from 8 Å, was used. Each simulation consisted of $2.5 \times 10^5$ steps of equilibration and 2 to $10^6$ steps for averaging.

Accessible Surface Area Calculations

These calculations were performed with CHARMM, which implements the Lee and Richards algorithm (Lee and Richards, 1971). For these calculations all hydrogen atoms are deleted. For consistency with the calculations of others, we employed Richards' values for the atomic radii (Richards, 1977). The water probe radius was set to the commonly used value of 1.4 Å.
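As a rough illustration of what an accessible surface area calculation does, the sketch below uses a point-counting scheme (in the spirit of Shrake and Rupley) rather than the Lee and Richards slicing algorithm used in the chapter. The function names, the number of test points, and any radii passed in are illustrative assumptions; only the 1.4 Å probe radius comes from the text.

```python
import math


def sphere_points(n):
    """Roughly uniform points on the unit sphere (golden-spiral construction)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        rho = math.sqrt(1.0 - z * z)
        points.append((rho * math.cos(golden_angle * i),
                       rho * math.sin(golden_angle * i), z))
    return points


def accessible_areas(coords, radii, probe=1.4, n_points=960):
    """Per-atom accessible surface area (Å^2) by counting exposed test points."""
    unit = sphere_points(n_points)
    areas = []
    for i, (ci, ri) in enumerate(zip(coords, radii)):
        big_r = ri + probe                      # radius of the probe-centre sphere
        exposed = 0
        for ux, uy, uz in unit:
            p = (ci[0] + big_r * ux, ci[1] + big_r * uy, ci[2] + big_r * uz)
            # A point is buried if it lies inside any other expanded sphere.
            buried = any(
                j != i and math.dist(p, cj) < rj + probe
                for j, (cj, rj) in enumerate(zip(coords, radii))
            )
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * big_r * big_r * exposed / n_points)
    return areas
```

An isolated atom returns the full area of its expanded sphere, and the difference between the summed areas of an unfolded model and of the folded structure gives the kind of $\Delta$ASA used in the surface-area-based estimates.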

Continuum Electrostatics Calculations

These calculations were performed by Dr. M. Schaefer using the program UHBD, version 4.1 (Davis et al., 1991). This program was modified to accommodate a noncubic grid for the unfolded chains. An interior dielectric of 1 and a solvent dielectric of 80 were used in all calculations. The ionic strength was set to zero. For the proteins and polyalanine helices the calculations were done in three phases: an initial run with a border space of 20 Å and a grid constant of 2 Å, a focusing run with a border space of 7 Å and a grid constant of 1 Å, and a final focusing run with a border space of 5 Å and a grid constant of 0.5 Å. The border space is the minimum distance between any protein atom and the edges of the grid.
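The three-phase focusing schedule implies a grid size along each axis that depends only on the solute extent, the border space, and the grid constant. The sketch below is a purely geometric estimate; it does not reproduce how UHBD itself sets up its grids, and the function names and the 40 Å example extent are hypothetical.

```python
import math

# (border space in Å, grid constant in Å) for the three phases quoted in the text
PHASES = [(20.0, 2.0), (7.0, 1.0), (5.0, 0.5)]


def grid_points_per_axis(solute_extent, border, spacing):
    """Smallest number of grid points spanning solute_extent + 2 * border."""
    return math.ceil((solute_extent + 2.0 * border) / spacing) + 1


def focusing_schedule(solute_extent):
    """Grid points per axis for each of the three focusing phases."""
    return [grid_points_per_axis(solute_extent, b, h) for b, h in PHASES]


# Hypothetical solute spanning 40 Å along one axis.
example = focusing_schedule(40.0)
```

The finer grids cover a smaller region, which is why the number of points grows far more slowly than the fourfold decrease in grid constant alone would suggest.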

Integral Equation Theory Calculations

For the water solvent a three-site model similar to TIP3P was used, with bond length $R_{\mathrm{OH}} = 0.9572$ Å and angle $\theta_{\mathrm{HOH}} = 104.52^\circ$ (Jorgensen et al., 1983). For the water hydrogen the van der Waals parameters $\sigma_{\mathrm{H}} = 0.4$ Å and $\varepsilon_{\mathrm{H}} = 0.046$ kcal/mol were employed. The nonzero value of $\sigma_{\mathrm{H}}$ was introduced to avoid a Coulombic singularity (Pettitt and Rossky, 1982; Yu and Karplus, 1988). The solvent-solvent and solute-solvent van der Waals parameters were calculated with the combination rules $\varepsilon_{\alpha\gamma} = (\varepsilon_\alpha \varepsilon_\gamma)^{1/2}$ and $\sigma_{\alpha\gamma} = (\sigma_\alpha + \sigma_\gamma)/2$. The temperature was set equal to T = 298.15 K and the water number density equal to 0.03334 Å$^{-3}$; this corresponds to the experimental value 0.997 g/cm$^3$ at T = 298.15 K and P = 1 atm.
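The combination rules and the density conversion quoted above are simple enough to check numerically. In the sketch below the hydrogen parameters and the mass density come from the text; the molar mass of water, the oxygen-like partner site, and all function names are assumptions introduced for illustration.

```python
import math

AVOGADRO = 6.02214076e23    # 1/mol
WATER_MOLAR_MASS = 18.0153  # g/mol (assumed value, not from the text)


def number_density_per_A3(mass_density_g_cm3, molar_mass=WATER_MOLAR_MASS):
    """Convert a mass density in g/cm^3 to molecules per cubic ångström."""
    return mass_density_g_cm3 * AVOGADRO / molar_mass * 1.0e-24


def combine(eps_a, sigma_a, eps_b, sigma_b):
    """Geometric-mean well depth and arithmetic-mean diameter for a site pair."""
    return math.sqrt(eps_a * eps_b), 0.5 * (sigma_a + sigma_b)


# Water hydrogen parameters quoted in the text (Å, kcal/mol).
SIGMA_H, EPS_H = 0.4, 0.046
# Hypothetical oxygen-like site, for illustration only (not taken from the text).
SIGMA_X, EPS_X = 3.15, 0.152

eps_hx, sigma_hx = combine(EPS_H, SIGMA_H, EPS_X, SIGMA_X)
rho_water = number_density_per_A3(0.997)
```

With these inputs the converted number density agrees with the quoted 0.03334 Å$^{-3}$ to within rounding of the molar mass and density.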

The solvent-solvent and solute-solvent distribution functions for the NMA, alanine dipeptide, and alanine tetrapeptide were calculated by solving the RISM integral equation (Chandler and Andersen, 1972; Hirata et al., 1983; Yu et al., 1990) in conjunction with a site-site HNC closure (Hirata and Rossky, 1981; Yu et al., 1990). In the calculation of the solvent-solvent distribution functions the solvent charges were scaled by a numerical factor A = 0.9795, as described in Yu et al. (1990). This scaling ensures that the solvent correlation functions are consistent with a macroscopic dielectric constant of water $\varepsilon = 78.6$. The isochoric (i.e., constant density) temperature derivatives of the distribution functions, needed in the calculation of the excess solvation energy, were determined by solving the corresponding temperature derivatives of the RISM integral equation and the HNC closure (Yu et al., 1990).

The excess solvation energy was calculated by the expression (Yu et al., 1990)

Equation (A-1) results from the isochoric temperature derivative of the RISM-HNC excess chemical potential of solvation (Singer and Chandler, 1985). In Eq. (A-1) the indices $u$ and $v$ denote solute and solvent atoms, respectively, $\rho_v$ is the solvent density, $g_{uv}(r)$ are the solute-solvent radial distribution functions, and $c_{uv}(r)$ are the solute-solvent direct correlation functions; the symbol $\delta_T$ stands for the isochoric temperature derivative. The function $U_{uv}(r)$ is the total interaction energy between solute atom $u$ and solvent atom $v$ in vacuum. In the calculations performed here $U_{uv}(r)$ is a sum of a short-range van der Waals term and a long-range Coulomb term (for charged atoms).
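For orientation, a two-term expression of this kind can be derived from the standard HNC excess chemical potential, $\Delta\mu = \rho_v k_B T \sum_{u,v} \int d\mathbf{r}\, [\tfrac{1}{2} h_{uv}^2 - c_{uv} - \tfrac{1}{2} h_{uv} c_{uv}]$ with $h_{uv} = g_{uv} - 1$, by taking $\Delta E = -T^2\, \delta_T(\Delta\mu / T)$ and eliminating terms with the temperature derivative of the site-site HNC closure. The result below is a sketch under those assumptions, not a transcription of the printed Eq. (A-1), whose grouping and prefactors may differ; the symbols $k_B$, $h_{uv}$, and $\Delta E^{\mathrm{solv}}$ are notation introduced here.

```latex
\Delta E^{\mathrm{solv}}
  = \rho_v \sum_{u} \sum_{v} \int d\mathbf{r}\; g_{uv}(r)\, U_{uv}(r)
  + \frac{\rho_v k_B T^{2}}{2} \sum_{u} \sum_{v} \int d\mathbf{r}\;
    \Bigl[ c_{uv}(r)\, \delta_T g_{uv}(r)
         - \bigl( g_{uv}(r) - 1 \bigr)\, \delta_T c_{uv}(r) \Bigr]
```

In this form the first sum is a solute-solvent interaction energy averaged over the distribution functions, and the second sum collects everything that depends on the temperature derivatives; each term is a sum over solute sites $u$, so per-atom contributions can be read off directly.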

The first term on the right-hand side of Eq. (A-1) corresponds to the average solute-solvent interaction energy in solution. The second term on the right-hand side is the solvent-reorganization energy (referred to as cavity energy) (Yu et al., 1990). This term corresponds to the change in the average energy of the solvent due to the solvation of the solute. Both terms depend on the solute conformation via the distribution functions $g_{uv}(r)$ and $c_{uv}(r)$, and their temperature derivatives $\delta_T g_{uv}(r)$ and $\delta_T c_{uv}(r)$. Equation (A-1) expresses the excess solvation energy as a sum of solute-solvent terms. Consequently, it is possible to use Eq. (A-1) and calculate the contribution to the total solvation energy from each solute atom $u$.

ACKNOWLEDGMENTS

This work was supported in part by grants from the National Science Foundation and the National Institutes of Health. We are grateful to W. L. Jorgensen and J. A. McCammon for providing us with the programs BOSS and UHBD, respectively, to M. Schaefer for performing the continuum electrostatics calculations and for a discussion of the results, to Irwin Lee for performing some of the van der Waals calculations in proteins, to A. MacKerell and S. Fischer for help and discussions on the alkane crystal calculations, to A. Caflisch for providing the barnase simulation data, to J. D. Evanseck for numerous discussions, and to P. Privalov for comments on the manuscript. The figures were created using the program ASGL by A. Sali. We also thank P. Privalov for sending a copy of the manuscript by Makhatadze and Privalov in this volume prior to publication; we have not used any of their new results reported in this volume, but consider them in the Epilogue, which was prepared after our manuscript was accepted.

REFERENCES

Archontis, G., and Karplus, M. (1995). To be published.
Baase, W. A., Eriksson, A. E., Zhang, X.-J., Heinz, D. W., Sauer, U., Blaber, M., Baldwin, E. P., Wozniak, J. A., and Matthews, B. W. (1992). Faraday Discuss. Chem. Soc. 93, 173-181.
Baker, E. N., and Hubbard, R. E. (1984). Prog. Biophys. Mol. Biol. 44, 97-179.
Baldwin, R. L. (1986). Proc. Natl. Acad. Sci. U.S.A. 83, 8069-8072.
Bashford, D., and Karplus, M. (1990). Biochemistry 29, 10219-10225.
Ben-Naim, A. (1975). Biopolymers 14, 1337-1355.
Ben-Naim, A. (1990). Biopolymers 29, 567-596.
Benson, S. W. (1968). “Thermochemical Kinetics.” Wiley, New York.
Brooks, B. R., Bruccoleri, R. E., Olafson, B. D., States, D. J., Swaminathan, S., and Karplus, M. (1983). J. Comput. Chem. 4, 187-217.
Brooks, C. L., III, Karplus, M., and Pettitt, B. M. (1988). Adv. Chem. Phys. 71.
Cabani, S., Gianni, P., Mollica, V., and Lepori, L. (1981). J. Solution Chem. 10, 563-595.
Caflisch, A., and Karplus, M. (1994). Proc. Natl. Acad. Sci. U.S.A. 91, 1746-1750.
Chandler, D., and Andersen, H. C. (1972). J. Chem. Phys. 57, 1930-1937.
Chiche, L., Gregoret, L. M., Cohen, F. E., and Kollman, P. A. (1990). Proc. Natl. Acad. Sci. U.S.A. 87, 3240-3243.
Chothia, C. (1974). Nature (London) 248, 338-339.
Creighton, T. E. (1984). “Proteins: Structures and Molecular Properties.” Freeman, New York.
Creighton, T. E. (1991). Curr. Opin. Struct. Biol. 1, 5-16.
Czerminsky, R., and Elber, R. (1990). J. Chem. Phys. 92, 5580-5601.

Davis, M. E., and McCammon, J. A. (1990). Chem. Rev. 90, 509-521.
Davis, M. E., Madura, J. D., Luty, B. A., and McCammon, J. A. (1991). Comput. Phys. Commun. 62, 187-197.
Degeihl, R., and Marsh, R. E. (1959). Acta Crystallogr. 12, 1007-1014.
Dill, K. A. (1990). Biochemistry 29, 7133-7155.
Doig, A. J., and Williams, D. H. (1991). J. Mol. Biol. 217, 389-398.
Dunfield, L. G., Burgess, A. W., and Scheraga, H. A. (1978). J. Phys. Chem. 82, 2609-2616.
Eisenberg, D., and McLachlan, A. D. (1986). Nature (London) 319, 199-203.
Elber, R., and Karplus, M. (1987). Science 235, 318-321.
Eriksson, A. E., Baase, W. A., Zhang, X.-J., Heinz, D. W., Blaber, M., Baldwin, E. P., and Matthews, B. W. (1992). Science 255, 178-183.
Fauchère, J.-L., and Pliska, V. (1983). Eur. J. Med. Chem.-Chim. Ther. 18, 369-375.
Fersht, A. R. (1987). Trends Biochem. Sci. 12, 301-304.
Frank, H. S., and Evans, M. W. (1945). J. Chem. Phys. 13, 507-532.
Frauenfelder, H., Sligar, S. G., and Wolynes, P. G. (1991). Science 254, 1598-1603.
Gelin, B., and Karplus, M. (1975). Proc. Natl. Acad. Sci. U.S.A. 72, 2002-2006.
Gilson, M. K., Rashin, A., Fine, R., and Honig, B. (1985). J. Mol. Biol. 183, 503-516.
Griko, Y. V., Makhatadze, G. I., Privalov, P. L., and Hartley, R. W. (1994). Protein Sci. 3, 669-676.
Guo, H., and Karplus, M. (1992). J. Phys. Chem. 96, 7273-7277.
Harpaz, Y., Gerstein, M., and Chothia, C. (1994). Structure 2, 641-645.
Hendsch, Z. S., and Tidor, B. (1994). Protein Sci. 3, 211-226.
Hermann, R. B. (1972). J. Phys. Chem. 76, 2754-2759.
Hirata, F., and Rossky, P. J. (1981). Chem. Phys. Lett. 83, 329-334.
Hirata, F., Rossky, P. J., and Pettitt, B. M. (1983). J. Chem. Phys. 78, 4133-4144.
Honig, B., Sharp, K., and Yang, A.-S. (1993). J. Phys. Chem. 97, 1101-1109.
Horovitz, A., Serrano, L., Avron, B., Bycroft, M., and Fersht, A. R. (1990). J. Mol. Biol. 216, 1031-1044.
Horton, N., and Lewis, M. (1992). Protein Sci. 1, 169-181.
Israelashvili, J. N. (1985). “Intermolecular and Surface Forces.” Academic Press, London.
Jackson, R. M., and Sternberg, M. J. E. (1994). Protein Eng. 7, 371-383.
Jorgensen, W. L. (1989). “BOSS, Version 2.8.” Yale University, New Haven, CT.
Jorgensen, W. L., Chandrasekhar, J., Madura, J. D., Impey, R. W., and Klein, M. L. (1983). J. Chem. Phys. 79, 926-935.
Jorgensen, W. L., Madura, J. D., and Swenson, C. J. (1984). J. Am. Chem. Soc. 106, 6638-6646.
Karle, I. L. (1972). J. Am. Chem. Soc. 94, 81-84.
Karplus, M. (1980). Biophys. J. 32, 45-46.
Karplus, M., Ichiye, T., and Pettitt, B. M. (1987). Biophys. J. 52, 1083-1085.
Kauzmann, W. (1959). Adv. Protein Chem. 14, 1-63.
Kellis, J. T., Jr., Nyberg, K., and Fersht, A. R. (1989). Biochemistry 28, 4914-4922.
Khechinachvili, N. N. (1990). Biochim. Biophys. Acta 1040, 346-354.
Larsen, B. L., Rasmussen, P., and Fredenslund, A. (1987). Ind. Eng. Chem. Res. 26, 2274-2286.
Lee, B. (1991). Biopolymers 31, 993-1008.
Lee, B., and Richards, F. M. (1971). J. Mol. Biol. 55, 379-400.
Livingstone, J. R., Spolar, R. S., and Record, M. T., Jr. (1991). Biochemistry 30, 4237-4244.
MacKerell, A. D., Jr., and Karplus, M. (1991). J. Phys. Chem. 95, 10559-10560.
MacKerell, A. D., Jr., Field, M., Fischer, S., Watanabe, M., and Karplus, M. (1995). Submitted for publication.

Makhatadze, G. I., and Privalov, P. L. (1990). J. Mol. Biol. 213, 375-384.
Makhatadze, G. I., and Privalov, P. L. (1993). J. Mol. Biol. 232, 639-659.
Makhatadze, G. I., Kim, K.-S., Woodward, C., and Privalov, P. L. (1993). Protein Sci. 2, 2028-2036.
Martinez, J. C., Harrous, M. E., Filimonov, V. V., Mateo, P. L., and Fersht, A. R. (1994). Biochemistry 33, 3919-3926.
Matthews, B. W. (1993). Annu. Rev. Biochem. 62, 139-160.
Mirsky, A. E., and Pauling, L. (1936). Proc. Natl. Acad. Sci. U.S.A. 22, 439-447.
Mitchell, J. B. O., and Price, S. L. (1990). J. Comput. Chem. 11, 1217-1233.
Murphy, K. P., and Freire, E. (1992). Adv. Protein Chem. 43, 313-361.
Murphy, K. P., and Gill, S. J. (1989a). J. Chem. Thermodyn. 21, 903-913.
Murphy, K. P., and Gill, S. J. (1989b). Thermochim. Acta 139, 279-290.
Murphy, K. P., and Gill, S. J. (1990). Thermochim. Acta 172, 11-20.
Murphy, K. P., and Gill, S. J. (1991). J. Mol. Biol. 222, 699-709.
Nicholls, A., Sharp, K. A., and Honig, B. (1991). Proteins 11, 281-296.
Nozaki, Y., and Tanford, C. (1971). J. Biol. Chem. 246, 2211-2217.
Oobatake, M., and Ooi, T. (1993). Prog. Biophys. Mol. Biol. 59, 237-284.
Ooi, T., and Oobatake, M. (1988). J. Biochem. (Tokyo) 103, 114-120.
Ooi, T., and Oobatake, M. (1991). Proc. Natl. Acad. Sci. U.S.A. 88, 2859-2863.
Ooi, T., Oobatake, M., Némethy, G., and Scheraga, H. A. (1987). Proc. Natl. Acad. Sci. U.S.A. 84, 3086-3090.
Pace, C. N., and Gajiwala, K. S. (1994). Preprint.
Pauling, L., and Corey, R. B. (1953). Proc. R. Soc. London, Ser. B 141, 21-33.
Pettitt, M., and Rossky, P. J. (1982). J. Chem. Phys. 77, 1451-1457.
Ponnuswamy, P. K. (1993). Prog. Biophys. Mol. Biol. 59, 57-103.
Privalov, P. L. (1979). Adv. Protein Chem. 33, 167-241.
Privalov, P. L. (1989). Annu. Rev. Biophys. Biophys. Chem. 18, 47-69.
Privalov, P. L., and Gill, S. J. (1988). Adv. Protein Chem. 39, 191-234.
Privalov, P. L., and Makhatadze, G. I. (1992). J. Mol. Biol. 224, 715-723.
Privalov, P. L., and Makhatadze, G. I. (1993). J. Mol. Biol. 232, 660-679.
Radzicka, A., and Wolfenden, R. (1988). Biochemistry 27, 1664-1670.
Rashin, A. A. (1992). Proteins 13, 120-131.
Rashin, A. A. (1993). Prog. Biophys. Mol. Biol. 60, 73-200.
Reiher, W. (1985). Ph.D. Thesis, Harvard University, Cambridge, MA.
Reynolds, J. A., Gilbert, D. B., and Tanford, C. (1974). Proc. Natl. Acad. Sci. U.S.A. 71, 2925-2927.
Richards, F. M. (1977). Annu. Rev. Biophys. Bioeng. 6, 151-176.
Rose, G. D., and Wolfenden, R. (1993). Annu. Rev. Biophys. Biomol. Struct. 22, 381-415.
Roseman, M. A. (1988). J. Mol. Biol. 200, 513-522.
Rossky, P. J., and Karplus, M. (1979). J. Am. Chem. Soc. 101, 1913-1937.
Roux, B., Yu, H.-A., and Karplus, M. (1990). J. Phys. Chem. 94, 4683-4688.
Savage, H. J., Elliott, C. J., Freeman, C. M., and Finney, J. L. (1993). J. Chem. Soc., Faraday Trans. 89, 2609-2617.
Scholtz, J. M., Marqusee, S., Baldwin, R. L., York, E. J., Stewart, J. M., Santoro, M., and Bolen, D. W. (1991). Proc. Natl. Acad. Sci. U.S.A. 88, 2854-2858.
Sharp, K. A., and Honig, B. (1990). Annu. Rev. Biophys. Biophys. Chem. 19, 301-332.
Sharp, K. A., Nicholls, A., Fine, R. M., and Honig, B. (1991a). Science 252, 106-109.
Sharp, K. A., Nicholls, A., Friedman, R., and Honig, B. (1991b). Biochemistry 30, 9686-9697.
Shirley, B. A., Stanssens, P., Hahn, U., and Pace, C. N. (1992). Biochemistry 31, 725-732.
Shortle, D., Stites, W. E., and Meeker, A. K. (1990). Biochemistry 29, 8033-8041.

Singer, S. J., and Chandler, D. (1985). Mol. Phys. 55, 621-625.
Sletten, E. (1970). J. Am. Chem. Soc. 92, 172-177.
Spolar, R. S., Ha, J.-H., and Record, M. T., Jr. (1989). Proc. Natl. Acad. Sci. U.S.A. 86, 8382-8385.
Spolar, R. S., Livingstone, J. R., and Record, M. T., Jr. (1992). Biochemistry 31, 3947-3955.
Stickle, D. F., Presta, L. G., Dill, K. A., and Rose, G. D. (1992). J. Mol. Biol. 226, 1143-1159.
Still, W. C., Tempczyk, A., Hawley, R. C., and Hendrickson, T. (1990). J. Am. Chem. Soc. 112, 6127-6129.
Stillinger, F. H., and Weber, T. A. (1982). Phys. Rev. A 25, 978-989.
Stouten, P. F. W., Frömmel, C., Nakamura, H., and Sander, C. (1993). Mol. Simul. 10, 97-120.
Sturtevant, J. M. (1977). Proc. Natl. Acad. Sci. U.S.A. 74, 2236-2240.
Tanford, C. (1962). J. Am. Chem. Soc. 84, 4240-4247.
Tanford, C., and Roxby, R. (1972). Biochemistry 11, 2192-2198.
Tidor, B., and Karplus, M. (1994). J. Mol. Biol. 238, 405-414.
Tunon, I., Silla, E., and Pascual-Ahuir, J. L. (1992). Protein Eng. 5, 715-716.
Weiner, S. J., Kollman, P. A., Case, D. A., Singh, U. C., Ghio, C., Alagona, G., Profeta, S., Jr., and Weiner, P. (1984). J. Am. Chem. Soc. 106, 765-784.
Wesson, L., and Eisenberg, D. (1992). Protein Sci. 1, 227-235.
Wintrode, P. L., Makhatadze, G. I., and Privalov, P. L. (1994). Proteins: Struct., Funct., Genet. 18, 246-253.
Wolfenden, R. (1983). Science 222, 1087-1093.
Wolfenden, R., Anderson, L., and Cullis, P. M. (1981). Biochemistry 20, 849-855.
Wu, H. S., and Sandler, S. I. (1989). AIChE J. 35, 168-172.
Yalkowsky, S. H., and Valvani, S. C. (1979). J. Chem. Eng. Data 24, 127-129.
Yang, A.-S., and Honig, B. (1992). Curr. Opin. Struct. Biol. 2, 40-45.
Yang, A.-S., Sharp, K. A., and Honig, B. (1992). J. Mol. Biol. 227, 889-900.
Yu, H.-A., and Karplus, M. (1988). J. Chem. Phys. 89, 2366-2379.
Yu, H.-A., Roux, B., and Karplus, M. (1990). J. Chem. Phys. 92, 5020-5033.
Yu, H.-A., Pettitt, B. M., and Karplus, M. (1991). J. Am. Chem. Soc. 113, 2425-2434.

EPILOGUE

This volume contains two chapters concerned with the thermodynamics of protein stability: one by Makhatadze and Privalov, referred to as M & P', and our chapter, referred to as LAK. The aim of the two chapters is the same. Both try to obtain a detailed molecular interpretation of the interactions that contribute to the stability of proteins. The chapters are based on the same thermodynamic data but come to significantly different conclusions concerning their interpretation. Since such an interpretation does not follow directly from the measurements, additional information had to be introduced. Assumptions are made by Makhatadze and Privalov to permit them to use empirical data to go from the macroscopic thermodynamic parameters to the microscopic interpretation. We take the opposite approach and base our analysis on calculations of the thermodynamic quantities from a microscopic model. It seems useful, therefore, to include an Epilogue which describes our view of where the differences arise and to assess their importance for the present understanding of protein thermodynamics. The Epilogue is based on LAK and on the several versions of M & P' provided to us by Professor Privalov. From the preliminary version submitted by Makhatadze and Privalov to the final version that is being published in this volume, there have been many changes. The published version of M & P' incorporates certain of the results and concepts presented in LAK although specific citations to this paper are not always given. This suggests that progress is being made toward a realistic analysis of protein thermodynamics.

The chapter by Makhatadze and Privalov in this volume (M & P') is a comprehensive study of the thermodynamics of 20 proteins and provides measurements and an analysis of the enthalpy, entropy, and free energy of unfolding over a wide range of temperatures. Our chapter concerns itself with the four proteins (cytochrome c, ribonuclease A, myoglobin, and lysozyme) for which experimental data and their analyses were available in the published work of Makhatadze and Privalov (1993); in this Epilogue, we refer to the published paper as M & P and to the chapter in this volume as M & P'. The main focus of our chapter is on the enthalpy of protein folding at physiological temperature, which is most directly amenable to calculational approaches; entropy and free energy are discussed very briefly.

The analysis in M & P' is somewhat different from that used in M & P, on which the work of LAK is based. This leads to significant quantitative changes in the results of the empirical analysis. Consequently, we include and discuss the new results of M & P' for the four-protein set. We have made corresponding calculations for some of the additional proteins studied in M & P' (i.e., barnase, ubiquitin, BPTI, RNase T1, T4 lysozyme), and found that the results are in accord with those from the four proteins considered in our chapter. This is in agreement with the conclusions of M & P' that the relative contributions to protein stability are similar in all proteins studied and shows that any discrepancies between M & P or M & P' and LAK are not a result of the small number of proteins studied in the latter. Thus, we restrict the Epilogue to a comparison of the results for the four proteins.

There are a number of possible sources for the differences in the two analyses. One is that the assumptions of M & P' may not be entirely correct and/or our microscopic model may not be fully valid. It is very likely that both approaches have inherent uncertainties and we will discuss some of them. Also, there appear to be some differences in the meaning of certain constructs (e.g., hydrogen bonding) used in describing the two sets of results. The possible errors and differences in interpretation are discussed in this Epilogue. We hope that this will aid the reader in obtaining a picture of the current status and the remaining problems in achieving a microscopic description of the thermodynamics of protein folding.

Revised Analysis Based on M & P'

In their chapter in this volume, M & P' have introduced a revised model of the unfolded state. Instead of representing the ASA of the unfolded state as a sum over individual residues (see M & P), they use an extended chain model: the extended chain conformation was built with the sequence builder in QUANTA (Molecular Simulations, Inc.) and was used without minimization. A minimized extended chain was used in our calculations. As already mentioned (Section VI), use of the extended chain instead of individual residues reduces the ASA of the unfolded state by about 15%. The effect on the difference between the folded and unfolded states ($\Delta$ASA) is larger; the $\Delta$ASA values are reduced by 50 to 70%. This, in turn, leads to a significant reduction in the magnitudes of $\Delta H_N^U(\mathrm{sol,np})$ and $\Delta H_N^U(\mathrm{sol,p})$ and, finally, in the empirical values of $\Delta H_N^U(\mathrm{vac})$, which are the quantities used in the analysis.

Revised versions of Table I and Table V based on M & P' are presented in this Epilogue; they are referred to as Tables E-I and E-II. As is evident from a comparison of Tables E-I and E-II with Tables I and V, the values of $\Delta H_N^U(\mathrm{vac})$ estimated by M & P' are much closer to the calculated values than those of M & P. The revised values are only about 30% larger, with a range of a factor of 1.2 to 1.3 relative to the calculated values; the $\Delta H_N^U(\mathrm{vac})$ in M & P were 1.8 to 2.0 times the calculated results. Thus, for the total vacuum unfolding enthalpy, $\Delta H_N^U(\mathrm{vac})$, there is encouraging agreement between the empirical analysis and the calculations. However, as mentioned in the beginning of the Epilogue, the primary purpose of the two studies is not simply to estimate $\Delta H_N^U(\mathrm{vac})$, but rather to determine the important contributions to it.

TABLE E-I
M & P' Analysis of Enthalpy of Unfolding (a)

| Protein | $\Delta H_N^U(\mathrm{sol})$ | $\Delta H_N^U(\mathrm{sol,np})$ | $\Delta H_N^U(\mathrm{sol,p})$ | $\Delta H_N^U(\mathrm{vac})$ | $\Delta H_N^U(\mathrm{vdW})$ | $\Delta H_N^U(\mathrm{elec})$ | Enthalpy/H bond |
|---|---|---|---|---|---|---|---|
| Cytochrome c | 21 | -172 | -1067 | 1260 | 184 | 1076 | 10.1 |
| Ribonuclease | 71 | -159 | -1127 | 1357 | 170 | 1187 | 8.92 |
| Lysozyme | 58 | -192 | -1224 | 1474 | 205 | 1269 | 8.08 |
| Myoglobin | 1.4 | -276 | -1541 | 1818 | 295 | 1523 | 6.25 |

(a) All values in kcal/mol at 25°C, obtained from M & P' in this volume. See also Table I.

TABLE E-II
Vacuum Enthalpy of Unfolding, $\Delta H_N^U(\mathrm{vac})$, and Its Decomposition (a)

| Protein | Calculated: Total | Calculated: vdW | Calculated: Elec | M & P': Total | M & P': vdW | M & P': H bond | New H bond (b) |
|---|---|---|---|---|---|---|---|
| Cytochrome c | 943 | 688 | 225 | 1260 | 184 | 1076 | 572 |
| RNase A | 1068 | 654 | 398 | 1357 | 170 | 1187 | 703 |
| Lysozyme | 1116 | 738 | 351 | 1475 | 206 | 1270 | 737 |
| Myoglobin | 1492 | 1020 | 421 | 1819 | 295 | 1524 | 799 |

(a) All values in kcal/mol at 25°C. See also Table V.
(b) Value obtained by subtracting the calculated van der Waals energy from the total M & P' energy.

In the decomposition, unfortunately, there remains a significant disagreement between M & P' and LAK. From Tables V and E-II, we see that the discrepancy between the empirical and calculated van der Waals contributions to $\Delta H_N^U(\mathrm{vac})$ has, in fact, increased. The calculated values are 3.5 to 5.4 times the M & P' estimates, instead of a ratio of about 2.6 from M & P. For the electrostatic terms, the agreement between the calculations and the empirical estimate shows a slight improvement. The calculated values are still much smaller; they vary from 0.21 to 0.34 times those in M & P' versus 0.13 to 0.21 in M & P.
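The ratios quoted in this comparison can be recomputed directly from the entries of Table E-II. The sketch below does so; the dictionary layout and function name are illustrative choices, and the numbers are simply the tabulated values in kcal/mol.

```python
# Table E-II (kcal/mol, 25 °C): calculated (total, vdW, elec) and M & P' (total, vdW, H bond)
TABLE_E2 = {
    "Cytochrome c": {"calc": (943, 688, 225), "mp": (1260, 184, 1076)},
    "RNase A": {"calc": (1068, 654, 398), "mp": (1357, 170, 1187)},
    "Lysozyme": {"calc": (1116, 738, 351), "mp": (1475, 206, 1270)},
    "Myoglobin": {"calc": (1492, 1020, 421), "mp": (1819, 295, 1524)},
}


def ratios(entry):
    """Ratios between empirical and calculated terms for one protein."""
    c_total, c_vdw, c_elec = entry["calc"]
    m_total, m_vdw, m_hbond = entry["mp"]
    return {
        "total_mp_over_calc": m_total / c_total,
        "vdw_calc_over_mp": c_vdw / m_vdw,
        "elec_calc_over_mp": c_elec / m_hbond,
        # "New H bond": M & P' total minus the calculated van der Waals energy
        "new_h_bond": m_total - c_vdw,
    }


summary = {name: ratios(entry) for name, entry in TABLE_E2.items()}
for name, r in summary.items():
    print(name, {k: round(v, 2) for k, v in r.items()})
```

The last entry reproduces the "New H bond" column, which is used in the discussion of the electrostatic contribution.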

Thus, the general features of the original analysis in our chapter are not altered; i.e., we conclude that M & P’ underestimate the van der Waals terms and overestimate the electrostatic terms.

Origin of Differences between M & P' and LAK

a. van der Waals Interactions. In Section IV, we provided evidence that the calculated van der Waals interaction energies that contribute to $\Delta H_N^U(\mathrm{vac})$ are reliable. The specific test calculations were made for nonpolar groups, but it is expected that the van der Waals calculations for polar groups have a corresponding accuracy; an examination of the latter is provided in MacKerell and Karplus (1991). Thus, the difference between the calculated values for $\Delta E_N^U(\mathrm{vdW})$ and the empirical estimates in M & P and M & P' indicates that the latter are significantly too small. As pointed out in Section IV,B, the major reason for the discrepancy is that M & P and M & P' include only the nonpolar contribution in their estimate of $\Delta H_N^U(\mathrm{vdW})$. As is evident from Table VII, the nonpolar-polar van der Waals term is more than half of the total and the polar-polar term is about one-sixth. If one considers the nonpolar interactions alone, the reduced values of M & P', relative to M & P, are now in quite good agreement with LAK. This can be seen by comparing the column headed vdW under M & P' in Table E-II with the column headed np-np in Table VII.

We conclude that the M & P' analysis of the nonpolar van der Waals contribution to the enthalpy of unfolding is in satisfactory agreement with the calculated values and that the difference between $\Delta E_N^U(\mathrm{vdW})$ of LAK and $\Delta H_N^U(\mathrm{vdW})$ of M & P' arises from their neglect of the large nonpolar-polar and the smaller polar-polar contributions.

b. Electrostatic Interactions. To obtain the most meaningful empirical estimate of the electrostatic contribution, $\Delta H_N^U(\mathrm{elec})$, we use the calculated value of $\Delta E_N^U(\mathrm{vdW})$, which, as pointed out previously, is significantly larger than the M & P' estimate of $\Delta H_N^U(\mathrm{vdW})$. When this value of $\Delta E_N^U(\mathrm{vdW})$ is subtracted from $\Delta H_N^U(\mathrm{vac})$ estimated by M & P', we obtain the result in the column headed New H bond in Table E-II. Comparison with the column Elec under Calculated in Table E-II (these values are identical to those in Table V) shows that the empirical estimate of M & P' is about twice the calculated value. This is in much better agreement than the results in the column headed H bond. The latter is referred to as the “enthalpic contribution of the polar groups” to $\Delta H_N^U(\mathrm{vac})$ in Section VIII,C of M & P', and was described simply as “hydrogen bonding” in M & P (see also below).

As discussed in some detail in Section VI, this overestimate, which is only a factor of two in the most favorable interpretation, is due to the breakdown of the additivity assumption and the accessible surface area approximation used in M & P and M & P' for the solvation enthalpy of polar groups, $\Delta H_N^U(\mathrm{sol,p})$. Simulation analyses, RISM integral equation theory, and Poisson-Boltzmann calculations are in agreement that the actual values of $\Delta H_N^U(\mathrm{sol,p})$ are significantly less negative than the estimates of M & P'. Poisson-Boltzmann calculations were performed for the folded and unfolded states to examine $\Delta H_N^U(\mathrm{sol,p})$ (see Section VI,D). Table E-III, a revision of Table XIX, shows that with the revised $\Delta$ASA values in M & P', $\Delta H_N^U(\mathrm{sol,p})$ for the four proteins is between 2.5 and 3 times the Poisson-Boltzmann value. We can also compare the results for $\Delta H_N^U(\mathrm{sol,p})$ from M & P' with those that we have estimated by the reverse procedure, i.e., the calculated values of $\Delta H_N^U(\mathrm{vac})$ are combined with the empirical $\Delta H_N^U(\mathrm{sol,np})$ and the measured $\Delta H_N^U(\mathrm{sol})$ to estimate $\Delta H_N^U(\mathrm{sol,p})$. From Table E-IV, a revision of Table XXII, the values of $\Delta H_N^U(\mathrm{sol,p})$ are about two-thirds of those given by M & P'. This is in accord with the expectation that the Poisson-Boltzmann calculations underestimate $\Delta H_N^U(\mathrm{sol,p})$, as discussed in our chapter.

TABLE E-III
Solvation (Free) Energy Difference Between Folded and Unfolded Proteins from the Poisson-Boltzmann Approximation and M & P' (a)

| Protein | $\Delta E_N^U(\mathrm{sol,p})$ | M & P' |
|---|---|---|
| Cytochrome c | -348 | -1067 |
| RNase A | -419 | -1127 |
| Lysozyme | -474 | -1225 |
| Myoglobin | -529 | -1542 |
| ALA-20 | -44 | -205 |

(a) All values in kcal/mol. See also Table XIX.
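The statement that the M & P' polar solvation term is between 2.5 and 3 times the Poisson-Boltzmann value can be checked against the Table E-III entries. The sketch below is a simple ratio calculation; the variable names are illustrative.

```python
# Table E-III (kcal/mol): (Poisson-Boltzmann value, M & P' estimate)
TABLE_E3 = {
    "Cytochrome c": (-348, -1067),
    "RNase A": (-419, -1127),
    "Lysozyme": (-474, -1225),
    "Myoglobin": (-529, -1542),
    "ALA-20": (-44, -205),
}

PROTEINS = ["Cytochrome c", "RNase A", "Lysozyme", "Myoglobin"]

# Ratio of the empirical estimate to the continuum-electrostatics value
mp_over_pb = {name: mp / pb for name, (pb, mp) in TABLE_E3.items()}

for name in PROTEINS:
    print(f"{name}: {mp_over_pb[name]:.2f}")
```

For the four proteins the ratios fall between roughly 2.6 and 3.1; the polyalanine helix entry gives a larger ratio and is not part of the quoted range.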

Since the peptide group makes the most important contribution to the polar solvation enthalpy, there is some concern that the M & P model compound value is an overestimate; some support for the M & P' value is given by the RISM calculations in our chapter. M & P and M & P' use the peptide group results to bolster their assumption that additivity applies to polar group solvation, in contrast to our conclusion (see Section IV). The contribution of the peptide group is approximately equal to the sum of the contributions of the CO and NH groups because these groups point away from each other in the trans peptide bond and are solvated by different water molecules. In other cases (e.g., the COOH group), such additivity is not observed, as can be seen in Table 2 of M & P.

From the above discussion, we conclude that M & P and M & P’ overestimate the electrostatic contribution of the polar groups to protein stability due to the inadequacy of the empirical approach used to determine the solvation enthalpies. However, it is possible that for “global” analyses of the type of interest here, a “mean field” shielding approximation for polar groups is a satisfactory approximation. The composition of proteins, in terms of polar and nonpolar residues, is relatively invariant and the fraction of buried polar groups is essentially constant (Harpaz et al., 1994). Further, the most important contribution to the polar solvation enthalpy is made by the peptide group, which is uniformly distributed along the polypeptide chain. This suggests that the model compound parameters can be applied to proteins with a mean field reduction coefficient, particularly if the unfolded state is an extended chain. A value in the neighborhood of two-thirds would yield approximate agreement between the calculated results and the empirical estimates. This could be a useful first approximation for global analyses of protein thermodynamics, although it is not accurate enough to determine the relative stabilities of different proteins.

TABLE E-IV
Proposed Decomposition of Enthalpy of Unfolding ^a,b

| Protein | ΔH_N^U(sol,np) | ΔH_N^U(vac,p) | ΔH_N^U(vac,np) | ΔH_N^U(sol) | ΔH_N^U(sol,p) |
|---|---|---|---|---|---|
| Cytochrome c | -172 | 516 (1076) | 397 (184) | 21 | -720 (-1067) |
| RNase A | -159 | 721 (1187) | 331 (170) | 71 | -822 (-1127) |
| Lysozyme | -192 | 686 (1270) | 403 (206) | 58 | -838 (-1225) |
| Myoglobin | -276 | 849 (1524) | 592 (295) | 1.4 | -1164 (-1542) |

^a At 25°C. All values in kcal/mol. The nonpolar-polar contribution to ΔE_N^U(vac) is divided equally between the polar and nonpolar groups. See also Table XXII.
^b Numbers in parentheses are those proposed by M & P’ in this volume.
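The reverse procedure behind the last column of Table E-IV, and the size of the mean field reduction coefficient it implies, can be illustrated as follows. This is a sketch under a stated assumption: the unfolding enthalpy is taken to be a simple sum of the four terms in the table, which reproduces the tabulated ΔH_N^U(sol,p) values to within 1 kcal/mol. All identifiers are hypothetical names chosen for this example.

```python
# Calculated (non-parenthesized) entries of Table E-IV, kcal/mol at 25 °C:
# (dH_sol_np, dH_vac_p, dH_vac_np, dH_sol_measured)
CALCULATED = {
    "Cytochrome c": (-172.0, 516.0, 397.0, 21.0),
    "RNase A": (-159.0, 721.0, 331.0, 71.0),
    "Lysozyme": (-192.0, 686.0, 403.0, 58.0),
    "Myoglobin": (-276.0, 849.0, 592.0, 1.4),
}

# M & P' estimates of the polar solvation enthalpy (parenthesized in the table).
MP_PRIME_SOL_P = {
    "Cytochrome c": -1067.0,
    "RNase A": -1127.0,
    "Lysozyme": -1225.0,
    "Myoglobin": -1542.0,
}


def polar_solvation_enthalpy(sol_np, vac_p, vac_np, sol_total):
    """Solve the assumed additive balance for the polar solvation term.

    Assumption: sol_total = vac_p + vac_np + sol_np + sol_p.
    """
    return sol_total - vac_p - vac_np - sol_np


def reduction_coefficients():
    """Ratio of the back-calculated polar solvation enthalpy to the M & P' value."""
    return {
        name: polar_solvation_enthalpy(*row) / MP_PRIME_SOL_P[name]
        for name, row in CALCULATED.items()
    }


if __name__ == "__main__":
    for name, coeff in reduction_coefficients().items():
        print(f"{name:13s} {coeff:.2f}")
```

The resulting coefficients lie between about 0.67 and 0.75, consistent with the value "in the neighborhood of two-thirds" suggested in the text.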

Hydrogen Bond Contribution

M & P’ and P. L. Privalov (private communication) point out that “what we consider the hydrogen bond contribution to the enthalpy of protein unfolding is certainly not the enthalpy of hydrogen bonding but the contribution of polar groups to the enthalpy of protein stabilization. This enthalpy includes the enthalpy of net hydrogen bonding, the enthalpy of van der Waals interactions between polar groups and also partly the enthalpy of interactions between polar and nonpolar groups.” We agree that the empirical approach of M & P and M & P’ does include all these contributions in what they call hydrogen bonding, although this was not clear in M & P. However, it is not evident that this combination of terms should be referred to as “hydrogen bonding” nor that it can be meaningfully compared with estimates of hydrogen bond energies in vacuum.

We have demonstrated in our chapter that the contribution of hydrogen bonds to ΔH_N^U(vac) depends strongly on the definition of the hydrogen bond. If the hydrogen bond energy is restricted to the CO···NH interaction of good hydrogen bonds (see Section V,C), the average hydrogen bond energy is in the neighborhood of -3 kcal/mol with the potential function used in the calculations; almost all of that energy is electrostatic. However, this contribution is only a fraction of the total contribution of the CO and NH groups to ΔH_N^U(elec). A significant part of their interactions is nonlocal (about 25%), as was demonstrated for the case of a polyalanine helix in Section V,A and B. For the four proteins, use of -3 kcal/mol per hydrogen bond and the inclusion of only good hydrogen bonds (Table XVII) yields values for the longer-range contribution to the total electrostatic interaction in the range 23 to SO%, which is on the order of that found in the polyalanine helix.

Uncertainties in Calculations of ΔE_N^U(vac)

It is important to consider the accuracy of the theoretical results. Although random errors in the calculations are small, it is difficult to obtain precise estimates of the systematic errors arising from approximations in the empirical potential functions. A variety of studies suggest that calculated values are sufficiently accurate to permit meaningful conclusions from the global analysis of protein enthalpies described in our chapter. An important element in the calculations is the use of neutralized side chains for the protein in the folded and unfolded states. As described in Sections III,B and VI, this choice is in accord with measurements and calculations that indicate that the charged side chains make a small contribution to protein stability. Their presence would have led to results for the energy that are very sensitive to the details of the structural models for the folded and unfolded states. Although it is only the differences between the two that are significant, large energy fluctuations can result from small changes in the side-chain positions. To eliminate artifacts due to truncation of the interactions (e.g., from the large differences in the maximum distances between polar groups in the folded and unfolded chains), the calculations of the protein energies were done by including all interactions without a cutoff.

A major uncertainty in both the empirical and theoretical analyses is the model for the unfolded state. The large difference between the results of M & P and M & P’ is an indication of the sensitivity of the empirical analysis to the model for the unfolded state. The theoretical calculations are also sensitive to the model of the unfolded state, which in reality is expected not to be an extended chain but a superposition of many “random coil” structures. As the unfolded state model gets more complex and compact, a reduction in both ΔASA and ΔE_N^U(vac) would be expected. Thus, the calculations and empirical analyses would change in a corresponding manner, although the absolute magnitudes would be altered. The qualitative trends of primary interest described in our chapter and this Epilogue are expected to remain valid.

Conclusion

The purpose of LAK is to provide a new perspective for the interpretation of the fundamental thermodynamic data for protein stability given in M & P. It is hoped that having the two approaches in the same volume will lead to a clearer perception of what we do and what we do not understand at the present time. The elegant studies by Makhatadze and Privalov (1993, and in this volume), which are based on a large body of experimental data determined by them and others, are an essential element of any treatment of protein thermodynamics. Our analysis demonstrates how to use atom-based calculations to determine the important contributions to protein stability in a way that is consistent with the experimental data. This Epilogue shows that the differences between our results and those in M & P’ can be understood. The resulting synthesis is a first step toward a consistent approach to protein thermodynamics. However, it should be stressed that, at best, a global understanding of the important contributions to protein stability has been achieved. Many details remain to be worked out.

Our analysis suggests that evaluation of the contribution of polar and nonpolar groups to protein stability is likely to be more useful and less prone to confusion than are the more standard constructs, such as hydrogen bonding and hydrophobic interactions. Table E-V, a revision of Table XXIII, combines the values in Table E-IV, a revision of Table XXII, to obtain the contribution of polar and nonpolar groups to the enthalpy of unfolding at 25°C; the values in parentheses in Table E-V are the corresponding estimates from M & P’. The contribution of nonpolar groups is the sum of their calculated contribution to the enthalpy of unfolding in vacuum, ΔH_N^U(vac,np), and the change in their solvation enthalpy on unfolding, ΔH_N^U(sol,np); the same combination of terms is used for the polar groups; for this analysis the nonpolar-polar interactions are divided equally between the nonpolar and polar contributions (see our chapter). The calculated values show that the polar groups enthalpically destabilize the folded state, while the nonpolar groups enthalpically stabilize the folded state. This contrasts sharply with M & P’, who conclude that the polar and nonpolar groups enthalpically stabilize the folded protein to about the same degree.

TABLE E-V
Proposed Contributions of Polar and Nonpolar Groups to ΔH_N^U(sol) ^a,b

| Protein | Polar | Nonpolar |
|---|---|---|
| Cytochrome c | -204 (+9) | 225 (12) |
| RNase A | -101 (+60) | 171 (10) |
| Lysozyme | -152 (+45) | 210 (13) |
| Myoglobin | -315 (-18) | 316 (19) |

^a At 25°C. All values in kcal/mol. The nonpolar-polar contribution to ΔE_N^U(vdW) is divided equally between the polar and nonpolar groups. See also Table XXIII.
^b Numbers in parentheses are those proposed by M & P’ in this volume.
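The grouping of terms into polar and nonpolar contributions described above can be sketched numerically. This is an illustrative example only: the identifiers are assumptions, the inputs are the calculated entries of Table E-IV, and agreement with Table E-V is to within 1 kcal/mol because the tabulated inputs are rounded.

```python
# Calculated entries of Table E-IV, kcal/mol at 25 °C:
# (dH_vac_p, dH_sol_p, dH_vac_np, dH_sol_np)
TERMS = {
    "Cytochrome c": (516.0, -720.0, 397.0, -172.0),
    "RNase A": (721.0, -822.0, 331.0, -159.0),
    "Lysozyme": (686.0, -838.0, 403.0, -192.0),
    "Myoglobin": (849.0, -1164.0, 592.0, -276.0),
}


def group_contributions(vac_p, sol_p, vac_np, sol_np):
    """Return (polar, nonpolar) contributions to the unfolding enthalpy.

    Each contribution is the vacuum term plus the solvation term
    for that class of groups.
    """
    return vac_p + sol_p, vac_np + sol_np


if __name__ == "__main__":
    for name, row in TERMS.items():
        polar, nonpolar = group_contributions(*row)
        print(f"{name:13s} polar {polar:7.0f}  nonpolar {nonpolar:6.0f}")
```

With these inputs the polar contribution is negative and the nonpolar contribution positive for all four proteins, which is the sign pattern the text describes.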

The above discussion and most of our chapter refer only to the enthalpic contribution to protein stability which can be analyzed in a straightforward way by energy calculations. Since the contribution of polar and nonpolar groups to the free energy of protein folding is also of great interest, we shall make an estimate of these quantities by combining the entropy results of M & P’ listed in Table E-VI with our values for the enthalpic contributions to obtain the contributions of polar and nonpolar groups to the free energy. We realize that these results are more approximate than the enthalpic treatment, but believe it is nevertheless interesting to have them for comparison. They are also listed in Table E-VI and compared to the estimates of M & P’. To obtain the overall entropy of unfolding, the configurational entropy, which cannot be separated into polar and nonpolar terms, has to be added. As for the enthalpy, the major contribution to stability comes from the nonpolar groups. The polar groups contribute little or not at all to protein stability. For lysozyme and, in particular, for RNase A, some stabilization from the polar groups is obtained, but this may well result from an overestimate of the solvation entropy of the polar groups. M & P’ obtain favorable contributions to stability of comparable magnitudes from the polar and nonpolar groups, as indicated in Table E-VI.
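The free energy estimate described in this paragraph amounts to adding the entropic term -TΔS_N^U(sol) from M & P’ to the enthalpic contribution for each class of groups. The following sketch, with hypothetical identifiers, applies that sum to the calculated values of Table E-V and the entropy values of Table E-VI; the configurational entropy is not included, as noted in the text.

```python
# Per protein, kcal/mol at 25 °C:
# ((dH_polar, dH_nonpolar) from Table E-V,
#  (minus_TdS_polar, minus_TdS_nonpolar) from Table E-VI)
DATA = {
    "Cytochrome c": ((-204.0, 225.0), (202.0, 208.0)),
    "RNase A": ((-101.0, 171.0), (225.0, 200.0)),
    "Lysozyme": ((-152.0, 210.0), (239.0, 240.0)),
    "Myoglobin": ((-315.0, 316.0), (289.0, 340.0)),
}


def free_energy_contributions(enthalpy, minus_t_entropy):
    """Return (polar, nonpolar) dG, using dG = dH + (-T dS) for each class."""
    return tuple(h + s for h, s in zip(enthalpy, minus_t_entropy))


if __name__ == "__main__":
    for name, (enthalpy, minus_t_entropy) in DATA.items():
        polar, nonpolar = free_energy_contributions(enthalpy, minus_t_entropy)
        print(f"{name:13s} polar {polar:+6.0f}  nonpolar {nonpolar:6.0f}")
```

The sums reproduce the non-parenthesized ΔG_N^U(sol) entries of Table E-VI: the polar term is close to zero for cytochrome c and myoglobin, and the nonpolar term dominates throughout.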

Tables E-V and E-VI refer to the average behavior of polar and nonpolar groups. It is likely that specific polar groups are stabilizing, while others are destabilizing. Their individual contributions depend on the environment of these groups in the folded protein, if we assume that the environments in the unfolded state are similar. Correspondingly, nonpolar groups in loosely packed regions of the protein interior are less stabilizing than those in well-packed regions.

TABLE E-VI
Contribution of Polar and Nonpolar Groups to Entropy, -TΔS_N^U(sol), and Free Energy, ΔG_N^U(sol), of Protein Unfolding ^a

| Protein | -TΔS_N^U(sol) ^b, polar | -TΔS_N^U(sol) ^b, nonpolar | ΔG_N^U(sol) ^c, polar | ΔG_N^U(sol) ^c, nonpolar |
|---|---|---|---|---|
| Cytochrome c | 202 | 208 | -2 (+211) | 433 (220) |
| RNase A | 225 | 200 | +124 (+285) | 371 (210) |
| Lysozyme | 239 | 240 | +87 (+284) | 450 (253) |
| Myoglobin | 289 | 340 | -26 (+271) | 656 (359) |

^a All values in kcal/mol at 25°C.
^b Values from M & P’ in this volume.
^c Values in parentheses from M & P’ in this volume.

Such more subtle aspects (e.g., the effect of single residue substitutions on protein stability) cannot be predicted from the global approaches described here. Additional calculational and empirical analyses are needed for the detailed interpretation of these effects. Theoretical methods, such as free energy simulations, and mutational data can usefully be exploited for this purpose.

Overall, our free energy estimates are consistent with the traditional view of Kauzmann concerning the dominant contributions to protein thermodynamics (Kauzmann, 1959; Dill, 1990), which regards hydrophobic interactions as the primary source of stability and hydrogen bonding as the source of specificity. However, from the data of M & P’ and the present calculations, we now have a better understanding of what is involved. In particular, the fact that polar groups make only a small contribution to stability appears to be the result of the cancellation of a favorable entropic term and an unfavorable enthalpic term.