Accurate Calculation of Hydration Free Energies Using Macroscopic Solvent Models



J. Phys. Chem. 1994, 98, 1978-1988


Doree Sitkoff, Kim A. Sharp, and Barry Honig*

Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, and Department of Biochemistry and Molecular Biophysics and Center for Biomolecular Simulation, Columbia University, 630 West 168th St., New York, New York 10032

Received: September 24, 1993; In Final Form: December 7, 1993⊗

A method and parametrization scheme which allow fast and accurate calculations of hydration free energies are described. The solute is treated as a polarizable cavity of a shape defined by the molecular surface, containing point charges at the location of atomic nuclei. Electrostatic contributions to solvation are derived from finite difference solutions of the Poisson equation (FDPB method). Nonpolar (cavity/van der Waals) energies are added as a surface area dependent term, with a single surface tension coefficient (γ) derived from hydrocarbon solubility in water. Atomic charges and radii are obtained by modifying existing force-field or quantum-mechanically-derived values, by fitting to experimental solvation energies of small organic molecules. A new, simple parameter set (parameters for solvation energy, PARSE) is developed specifically for the FDPB/γ method, by choosing atomic charges and radii which reproduce the estimated contributions to solvation of simple functional groups. The PARSE parameters reproduce hydration free energies for a test set of 67 molecules with an average error of 0.4 kcal/mol. For amino acid side chain and peptide backbone analogs the average error is only 0.1 kcal/mol.

Introduction

A reliable theoretical treatment of chemical processes which occur in aqueous solution requires accurate methods for calculating solvation free energies. A number of theoretical methods have been developed recently with this goal in mind. The approaches vary widely in the complexity of the models and ease of calculation. Explicit water simulations have yielded good agreement with experimental solvation data for small molecules. These methods are computer intensive, however, and can therefore become prohibitive in applications to very large molecules such as proteins. Moreover, their applicability to charged systems is currently limited in that only a fixed contribution to molecular polarizability is included, implicitly, through forcefield parametrization. At the other extreme are methods which compute solvation free energy as a sum of atomic or chemical group contributions, with constants optimized to small molecule solvation data (frequently a dependence on exposed surface area is included).67 While it is generally accepted that this simpler approach works quite well for nonpolar surface area, it is clear that it neglects interactions between polar groups, as well as the interactions of polar groups or charges that are buried but near the molecular surface, with the surrounding solvent. More recent approaches include a polarizable, quantum mechanical solute model with either explicit or continuum solvent; however, such methods are computationally demanding and are not easily applied to larger solute molecules, such as proteins.

In this paper, we describe an approach to the calculation of solvation free energies which is rapid, can be easily applied to proteins, and includes solvent effects on buried charges and charge-charge interactions. The method computes electrostatic solvation free energy contributions by solving the Poisson equation for a polarizable solute molecule with continuum solvent representation. Nonpolar solvation free energies, derived from transfer data on small hydrocarbon molecules, are added on as a surface area dependent term to provide the total solvation energy. The electrostatics treatment used in the approach relies on a finite difference solution to the Poisson equation for solutes of arbitrary shape embedded in a dielectric medium (the FDPB method).19 Comparisons with results of explicit solvent simulations have shown that this continuum treatment can offer comparable accuracy, at orders of magnitude less computer time.20,21 Recent work has demonstrated that solute polarizability may be explicitly incorporated into the model easily through the solute dielectric constant.22 A recent review details many applications of the FDPB method, in calculating pKa's of groups in proteins, redox midpoints, binding energies, and partition coefficients.23

† University of Pennsylvania. ‡ Columbia University. ⊗ Abstract published in Advance ACS Abstracts, January 15, 1994.

The FDPB method requires as input radius and charge parameters for each atom. In previous work, these parameters were taken without change from molecular mechanics forcefields. While they yield very reasonable results, such parameters were not designed for use with the FDPB method, nor for explicit solute polarizability. Recently, there have been some attempts to optimize parameters for the FDPB method, with specific attention given to reproducing experimental solvation free energies. The approaches taken include scaling charges to dipole moments,22 general adjustments of atomic radii,21 and chemical-group-specific parameter tuning.24 The results have provided some improvement in the accuracy of the FDPB method. However, a parameter set that can be applied to a large number of molecules and yield consistently accurate solvation energies has not been described.

In this paper, we investigate the utility of several available parameter sets for solvation energy calculations using the FDPB method. We examine the feasibility of optimizing force-field- or ab initio-derived parameters through either charge or radius scaling, by matching to the well-defined vacuum-to-water solvation free energies for the 20 amino acid side chains and the peptide backbone. As suggested by the previous optimization attempts, the accuracy achieved using the scaled molecular mechanics parameters is improved but still limited. As an alternative, we develop a new, simple parameter set, specifically for use with the FDPB method with solute polarizability included. The set is based on a group contribution approach and uses Pauling radii25 and simple bond dipole moments optimized to small molecule transfer data. Good agreement with experimental solvation free energies is obtained for a range of small polar molecules, including those representing the amino acid side chains and the polypeptide backbone.

© 1994 American Chemical Society

Figure 1. Thermodynamic cycle showing breakdown into electrostatic and nonpolar (cavity/van der Waals) contributions to solvation energy.

Theory and Methods

Solvation Energy Calculation. The approach is outlined in the thermodynamic cycle depicted in Figure 1. The total solvation free energy is expressed as a sum of separate electrostatic and nonpolar solvation contributions. The electrostatic solute-solvent interaction free energy in the first solvent enters through the free energy of discharging the solute in that solvent, $\Delta G_{el}^{(1)}$. The solute, now considered to be completely nonpolar, is transferred between solvents. This step gives the nonpolar, or cavity/van der Waals, contribution to the solvation energy, $\Delta G_{c/vdw}$. The solute is then recharged in the second solvent, and the associated electrostatic interaction energy, $\Delta G_{el}^{(2)}$, is calculated. The total solvation free energy, $\Delta G_{sol}$, is obtained from the sum of the energies of the separate steps:

$$\Delta G_{sol} = \Delta G_{el}^{(1)} + \Delta G_{c/vdw} + \Delta G_{el}^{(2)} \qquad (1)$$
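As a small worked example of this decomposition, the sketch below adds an electrostatic term to a cavity/van der Waals term. The two input values are the acetamide (asn analog) entries listed under OPLS in Table 1; the function name is hypothetical and the code is only an illustration of the bookkeeping, not the authors' program.

```python
def total_solvation_energy(dg_electrostatic, dg_nonpolar):
    """Sum of the electrostatic and cavity/van der Waals terms (kcal/mol)."""
    return dg_electrostatic + dg_nonpolar


# Acetamide, OPLS parameters, solute dielectric constant 1 (values from Table 1).
dg_el = -12.73   # electrostatic transfer free energy, kcal/mol
dg_cv = 1.99     # cavity/van der Waals transfer free energy, kcal/mol

dg_total = total_solvation_energy(dg_el, dg_cv)
print(f"{dg_total:.2f} kcal/mol")
```

The sum differs from the tabulated total (-10.73 kcal/mol) only in the last digit, as expected when the tabulated components have been rounded.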

Electrostatic Solvation Energy Contribution. Calculations are based on a continuum electrostatics model which has been described previously. The solute is represented explicitly, by atomic coordinates with associated atomic radii and partial charges placed at each nucleus. The boundary between solute and solvent is the "molecular surface" which is obtained from the van der Waals and reentrant surfaces, the latter being generated by rolling a solvent-sized probe over the van der Waals surface (probe radii of 0.0 and 1.4 Å for vacuum and water were used). Any crevices out of reach of the surface of the probe are considered as part of the solute. The solute dielectric constant is assigned a value of 1 (ε = 1, no solute polarizability) or higher values which include capacity for electronic polarization due to the reaction field from the solvent. The solvent is represented by a dielectric continuum of the appropriate dielectric constant (ε = 1 for vacuum, ε = 80 for room-temperature water). The solute and solvent, with associated charges and dielectric boundaries, are mapped onto a grid. Electrostatic potentials are calculated by solving the Poisson equation

$$\nabla \cdot [\epsilon(\mathbf{r}) \nabla \phi(\mathbf{r})] + 4\pi\rho(\mathbf{r}) = 0 \qquad (2)$$

in finite difference form using iterative numerical techniques.26 This method has been incorporated into the Delphi program,27 which has recently been improved in accuracy and speed.28 The solution yields the electrostatic potential at every grid point. The total electrostatic solvation energy is the sum of steps 1 and 3 of Figure 1, given by

$$\Delta G_{el} = \frac{1}{2} \sum_i q_i \left( \phi_i^{80} - \phi_i^{1} \right) \qquad (3)$$

where $q_i$ and $\phi_i$ are the charge and calculated potential at the ith grid point, and the superscripts give the solvent dielectric constant, for the case of transfer from vacuum (ε = 1) to water (ε = 80).
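The two ingredients of this section can be illustrated with a minimal sketch. The first function relaxes a one-dimensional analog of eq 2 by Gauss-Seidel iteration; it is a toy stand-in for the three-dimensional solver, not the Delphi algorithm. The second assumes eq 3 has the usual reaction-field form, half the sum of each charge times the difference between its potential in water and in vacuum. All names, the grid, and the numerical inputs are hypothetical, and the potentials are taken to be in units that give kcal/mol when multiplied by the charges.

```python
import numpy as np


def relax_poisson_1d(eps_mid, rho, h, phi_left, phi_right, n_sweeps=5000):
    """Gauss-Seidel relaxation of d/dx(eps dphi/dx) + 4*pi*rho = 0 on a 1D grid.

    eps_mid[i] is the dielectric constant midway between grid points i and i+1,
    rho[i] is the charge density at grid point i, and h is the grid spacing.
    The end points are held at phi_left and phi_right.
    """
    n = len(rho)
    phi = np.linspace(phi_left, phi_right, n)
    for _ in range(n_sweeps):
        for i in range(1, n - 1):
            e_lo, e_hi = eps_mid[i - 1], eps_mid[i]
            phi[i] = (e_lo * phi[i - 1] + e_hi * phi[i + 1]
                      + 4.0 * np.pi * rho[i] * h * h) / (e_lo + e_hi)
    return phi


def electrostatic_solvation_energy(charges, phi_water, phi_vacuum):
    """Half the sum of q_i * (phi_i in water - phi_i in vacuum)."""
    q = np.asarray(charges, dtype=float)
    dphi = np.asarray(phi_water, dtype=float) - np.asarray(phi_vacuum, dtype=float)
    return 0.5 * float(np.sum(q * dphi))


# Charge-free slab: ten cells with eps = 2 next to ten cells with eps = 80.
eps_mid = np.array([2.0] * 10 + [80.0] * 10)
phi = relax_poisson_1d(eps_mid, np.zeros(21), h=0.25, phi_left=1.0, phi_right=0.0)

# Two hypothetical charges with made-up potentials at their grid points.
dg_el = electrostatic_solvation_energy([0.5, -0.5], [3.0, -3.0], [10.0, -10.0])
```

In the slab test almost all of the potential drop occurs in the low-dielectric region, which is the behavior that makes the position of the solute-solvent boundary so important in the full calculation.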

The Journal of Physical Chemistry, Vol. 98, No. 7, 1994

Nonpolar (Cavity/van der Waals) Energy Contribution. Energies are derived from experimental transfer data for hydrocarbons, in accordance with previous studies which found that the energy of transferring small straight-chain alkanes into water is linearly dependent on the alkane surface area.29 The relationship may be described by

$$\Delta G = \gamma A + b \qquad (4)$$

where A is the solvent-accessible surface area and γ and b are constants. Values for γ and b were extracted from a least-squares fit to a plot of experimental alkane transfer free energies versus accessible surface area, and eq 4 was used to compute the ΔG for the accessible surface area of a solute of interest. For these calculations, solvent-accessible surface areas were defined by the center of a solvent-sized probe (1.4 Å for water) rolled over the solute surface. Solvent-accessible surface areas were calculated using a program which implements a modified Shrake and Rupley vertex algorithm,30 using a hierarchical data structure. The algorithm can calculate the surface area of a 1000 atom protein on an SG IRIS in 1.4 s to within 0.5% precision.31 For internal consistency, the same set of radii was used to calculate the surface areas of the alkanes as well as the solutes of interest. The best-fit values of γ and b thus varied slightly between radius sets, with γ = 5.0 ± 0.5 cal/(mol Å²) and b = 0.860 ± 0.100 kcal/mol.
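The procedure in this paragraph can be sketched as follows. The area routine is a plain Shrake and Rupley point-counting estimate, not the modified, hierarchical version cited above, and the "alkane" data used for the fit are synthetic points generated from the γ and b values quoted in the text, purely to show that the least-squares step recovers them. Function names, coordinates, and radii are hypothetical.

```python
import numpy as np


def sphere_points(n):
    """Roughly uniform points on the unit sphere (golden-spiral construction)."""
    k = np.arange(n) + 0.5
    polar = np.arccos(1.0 - 2.0 * k / n)
    azimuth = np.pi * (1.0 + 5.0 ** 0.5) * k
    return np.column_stack((np.cos(azimuth) * np.sin(polar),
                            np.sin(azimuth) * np.sin(polar),
                            np.cos(polar)))


def accessible_area(coords, radii, probe=1.4, n_points=960):
    """Solvent-accessible area (A^2) by counting exposed test points per atom."""
    coords = np.asarray(coords, dtype=float)
    expanded = np.asarray(radii, dtype=float) + probe
    unit = sphere_points(n_points)
    total = 0.0
    for i in range(len(coords)):
        pts = coords[i] + expanded[i] * unit
        exposed = np.ones(n_points, dtype=bool)
        for j in range(len(coords)):
            if j != i:
                d2 = np.sum((pts - coords[j]) ** 2, axis=1)
                exposed &= d2 > expanded[j] ** 2
        total += 4.0 * np.pi * expanded[i] ** 2 * exposed.mean()
    return total


def fit_surface_tension(areas, dg_transfer):
    """Least-squares fit of dG = gamma * A + b; returns (gamma, b)."""
    gamma, b = np.polyfit(np.asarray(areas, float), np.asarray(dg_transfer, float), 1)
    return float(gamma), float(b)


# Synthetic points generated from gamma = 0.005 kcal/(mol A^2), b = 0.860 kcal/mol.
areas = [150.0, 180.0, 210.0, 240.0]
dg = [0.005 * a + 0.860 for a in areas]
gamma, b = fit_surface_tension(areas, dg)

# Nonpolar term for a hypothetical two-atom solute.
area = accessible_area([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]], [1.7, 1.7])
dg_nonpolar = gamma * area + b
```

Because the same radii enter both the alkane areas and the solute area, a change of radius set shifts the fitted γ and b together, which is the internal-consistency point made in the text.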

The experimental vacuum to water transfer free energies for alkanes used to obtain γ and b were taken from the data of Ben-Naim and Marcus.32 These authors computed free energy values from the equilibrium concentrations of solute in the two phases, using

$$\Delta G = -RT \ln(C_2/C_1) \qquad (5)$$

where $C_1$ and $C_2$ are expressed in units of molarity. Recent analyses have suggested that a more appropriate measure of the surface area dependence of the vacuum to water cavity/van der Waals free energy contribution to protein folding and macroscopic processes is obtained by subtracting a volume-dependent Flory-Huggins-like term from the free energy defined in eq 5.33,34 However, for the solute solvation calculations described here, any such volume adjustment would apply to both steps 2 and 4 of Figure 1 and so cancel out. In addition, the curvature of the solute-solvent surface is believed to affect the solute-water surface tension interaction, according to the equation

$$\gamma(R)/\gamma(\infty) = 1/(1 \pm a/R) \qquad (6)$$

where a is the radius of the solvent and R is the radius of curvature of the solute-solvent interface. This suggests that differences in curvature between the alkanes (the nonpolar energy parametrization molecules) and the solute whose solvation energy is being computed should be taken into account. However, the curvature of the alkanes and the small polar molecules studied here is essentially identical; thus curvature effects were disregarded in the calculation.
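Equations 5 and 6 are simple enough to evaluate directly. The sketch below assumes that phase 2 is the phase the solute is transferred into, uses 298.15 K as a default temperature, and leaves the sign in eq 6 as an explicit argument because the text does not say which sign applies to which curvature. The function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def transfer_free_energy(c1, c2, temperature=298.15):
    """Eq 5: dG = -RT ln(C2/C1), with both concentrations in molarity."""
    return -R_KCAL * temperature * math.log(c2 / c1)


def curvature_ratio(a, radius_of_curvature, sign=+1):
    """Eq 6: gamma(R)/gamma(infinity) = 1 / (1 +/- a/R)."""
    return 1.0 / (1.0 + sign * a / radius_of_curvature)


# A solute ten times more concentrated in phase 2 than in phase 1.
dg_tenfold = transfer_free_energy(1.0, 10.0)

# Water-sized probe (a = 1.4 A) on a surface with a 14 A radius of curvature.
ratio = curvature_ratio(1.4, 14.0)
```

A tenfold concentration ratio corresponds to about 1.36 kcal/mol at room temperature, and the curvature ratio tends to 1 as R grows, consistent with neglecting the correction when solute and reference molecules have similar curvature.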

In the following, we refer to the solvation energy calculation method described above as the FDPB/γ method.

Small Molecule Solvation Energy Calculations. Solvation energies for vacuum to water transfer were computed using the FDPB/γ method and compared with experimental values for a set of small molecules representing the neutral amino acid side chains and backbone. Experimental data were derived from Wolfenden et al.,36 according to eq 5 (data for the neutral forms of the molecules were used). All calculations were done at a resolution of 4.0 grids/Å. At this scale, the distance from the solute van der Waals surface to the box edge was at least large enough to fit a solvent probe for all solutes studied (in older Delphi versions this is essential to the correct functioning of the algorithm which defines the solute-solvent boundary, although in the version used here this requirement has been relaxed). Sensitivities to grid scale and centering were tested: total transfer free energies differed by less than 0.2 kcal/mol between 2 and 5 grids/Å, and by less than 0.1 kcal/mol for varying grid offsets at 4 grids/Å. Solute geometries were built and energy minimized using the Insight/DISCOVER molecular modeling package (Biosym Technologies, Inc.) and were considered to be invariant among solvent phases unless otherwise specified.

Figure 2. Thermodynamic cycle used in determining experimental transfer energies for ionic species.

Parameters tested for use with the FDPB/γ method included four sets derived from molecular simulation force fields: CVFF (Insight/DISCOVER),37 the CHARMM set PARAM19,38 AMBER,39 and OPLS.40 Partial charges were used in published form. Where necessary, radii were derived from the minimum in the Lennard-Jones potential, given as $2^{-5/6}\sigma$. In addition, ab initio-derived partial charges developed by Chipot et al. were tested.41 (These charges were obtained by fitting to potentials from SCF wave functions calculated with the 6-31G** split basis set. Fitting sites were located every 0.5 Å outside the envelope produced by twice the van der Waals radii, out to a distance of 5 Å from atomic nuclei.) No specific radius set has been developed for use in solvation energy calculations with the ab initio charges; thus radii from several forcefields were tested. Molecules for which parameter values could not be obtained (e.g., neutral acids, neutral arginine) were omitted from the calculation for that parameter set.
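The radius rule mentioned in this paragraph can be written out explicitly. The sketch assumes the atomic radius is taken as half the separation at which the Lennard-Jones 12-6 potential has its minimum, which gives 2^(-5/6) times the σ parameter; the function name and the σ value are hypothetical.

```python
def lennard_jones_radius(sigma):
    """Half the separation at the Lennard-Jones minimum, r_min = 2**(1/6) * sigma."""
    return 2.0 ** (-5.0 / 6.0) * sigma


# Hypothetical sigma of 3.5 A, not taken from any of the force fields named above.
radius = lennard_jones_radius(3.5)
print(f"{radius:.3f} A")
```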

Calculations were performed using solute dielectric constants of 1 and 2. A dielectric of 1 excludes explicit changes in solute polarization, creating suitable conditions for testing the parameter sets, since they were optimized using a dielectric constant of 1, i.e., a fixed solute polarization. A dielectric of 2 includes explicit solute polarizability and is in accordance with the experimentally derived high-frequency electronic response of small organic solutes to an external electric field and with recent work which calculated dielectric values of 2-3 for small molecules using experimental polarizability data.22

Optimization of Existing Parameter Sets. The OPLS and ab initio parameter sets were altered in order to optimize agreement with experimental transfer energy values when solute polarizability was included (ε = 2). One of two simple rules was followed: (1) increasing the atomic charges by a single scaling value; (2) decreasing the atomic radii toward standard van der Waals values.
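Rule 1 can be illustrated with a short sketch. Multiplying every charge by one factor leaves the net charge of a neutral molecule at zero and scales its dipole moment by the same factor. The two-site molecule and its charges are hypothetical; the factor 1.17 is the OPLS scaling quoted in the footnotes to Table 2, and the unit conversion (1 e Å = 4.8032 D) is standard.

```python
import numpy as np

DEBYE_PER_E_ANGSTROM = 4.80320  # 1 e*Angstrom expressed in debye


def dipole_moment(charges, coords):
    """Magnitude of the point-charge dipole moment in debye."""
    q = np.asarray(charges, dtype=float)
    r = np.asarray(coords, dtype=float)
    return DEBYE_PER_E_ANGSTROM * float(np.linalg.norm(q @ r))


def scale_charges(charges, factor):
    """Rule 1: multiply every atomic charge by a single scaling value."""
    return [factor * q for q in charges]


# Hypothetical neutral two-site molecule with sites 1.0 A apart.
charges = [-0.40, 0.40]
coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]

scaled = scale_charges(charges, 1.17)
mu_before = dipole_moment(charges, coords)
mu_after = dipole_moment(scaled, coords)
```

For an ion the same scaling would change the net charge, so this rule is natural only for the neutral molecules considered in this part of the study.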

Development of a New Parameter Set (PARSE). A new, simple parameter set, PARSE (parameters for solvation energy), was designed in order to maximize the accuracy of energies calculated using the FDPB/γ method with polarizability included. Parameter values were derived based on a modified group contribution approach. Previous authors have shown that solvation energies for many small molecules can be reproduced by assigning constant parametrized "group" contributions to the constituent chemical groups in the molecule, and computing the total molecular solvation free energy as a sum of the component chemical group contributions. In the PARSE parametrization approach, each chemical group is assigned a constant set of charges (and radii) in the solvation energy calculation, such that the solvation energies for a set of simple small polar molecules containing that group are reproduced.

Figure 3. Comparison of calculated and experimental transfer free energies from vacuum to water for amino acid side chain molecules and N-methylacetamide (NMA). Calculations were performed using the FDPB/γ method with solute dielectric = 1 and parameters as follows: (a) OPLS parameters. Solid line is the least-squares line fit ΔG_calc = 0.94ΔG_exp - 0.65. The correlation constant is 0.85. (b) Ab initio-derived charges and OPLS radii. Solid line is the least-squares line fit ΔG_calc = 0.99ΔG_exp - 0.37. The correlation constant is 0.93. [Horizontal axis in both panels: ΔG(v→w) experimental (kcal/mol).]

Experimental solvation free energies for small molecules representing chemical groups of interest were obtained and divided into subsets containing common functional groups. Standard Pauling van der Waals radii25 were used to assign atomic radii, with the exception that the hydrogen radius was reduced from 1.2 to 1.0 Å. The nonpolar (cavity/van der Waals) contribution to solvation energy was then calculated for the molecules within each chemical group. This contribution was subtracted from the total experimental solvation free energy to obtain an "experimental" electrostatic energy. Atomic charges were assigned by distributing equal and opposite charges across bonded atoms in the chemical group (applying "bond dipole moments"), where charge magnitudes were chosen such that the calculated electrostatic contribution best matched the "experimental" electrostatic energies for the set of molecules containing that functional group. These rules were followed in assigning charge parameters: With the object of simplicity, the fewest moments needed to obtain agreement were applied (e.g., optimized charges for alcohols, such as methanol and ethanol, consist of identical and opposite charges placed only upon the molecules' oxygen and hydrogen atoms). Charges for more complicated functional groups were built up from the parameters of their constituent simple groups (e.g., amide = ketone + amine). Some groups required further adjustment of charges to permit agreement between calculated and experimental transfer free energies. For these cases, additional dipole moments were added. (An example is the acidic group, -COOH, whose solvation energy could not be matched using a superposition of the ketone and alcohol charges. An additional moment consisting of equal and opposite charges on the hydrogen and the ketone oxygen atoms was added.)

Ambiguities in assigning multiple bond moments were resolved by choosing those which produced the best match to solvation free energies, with the least amount of charge adjustment.

TABLE 1: Solvation Free Energies Using OPLS and Ab Initio-Derived Parameters^a

| aacid | molecule | ΔG_el OPLS | ΔG_el ab initio | SA | ΔG_c/v | ΔG_tot calc OPLS | ΔG_tot calc ab initio | ΔG_tot exp^b |
|---|---|---|---|---|---|---|---|---|
| arg | N-propylguanidine | | | 307 | 2.46 | | | -10.92 |
| asn | acetamide | -12.73 | -13.09 | 217 | 1.99 | -10.73 | -11.10 | -9.72 |
| asp | acetic acid | -10.35 | | 213 | 1.97 | -8.38 | | -6.70 |
| cys | methylthiol | -5.26 | -3.99 | 194 | 1.87 | -3.39 | -2.12 | -1.24 |
| gln | propionamide | -12.58 | -11.94 | 249 | 2.16 | -10.42 | -9.79 | -9.42 |
| glu | propionic acid | -9.82 | | 246 | 2.14 | -7.68 | | -6.47 |
| his | methylimidazole | -11.80 | -10.40 | 260 | 2.22 | -9.58 | -8.19 | -10.25 |
| lys | N-butylamine | -5.85 | | 274 | 2.29 | -3.56 | | -4.38 |
| met | methyl ethyl sulfide | -3.23 | -3.42 | 261 | 2.22 | -1.01 | -1.20 | -1.49 |
| phe | toluene | -2.20 | -3.00 | 294 | 2.39 | 0.19 | -0.60 | -0.76 |
| ser | methanol | -8.16 | -7.51 | 171 | 1.76 | -6.41 | -5.81 | -5.08 |
| thr | ethanol | -7.77 | -6.72 | 208 | 1.95 | -5.82 | -4.77 | -4.90 |
| trp | methylindole | -8.44 | -8.80 | 334 | 2.60 | -5.84 | -6.21 | -5.91 |
| tyr | p-cresole | -8.79 | -9.39 | 305 | 2.45 | -6.34 | -6.95 | -6.13 |
| bb | N-methylacetamide | -9.52 | | 255 | 2.19 | -7.33 | | -10.08 |

a Calculations for polar amino acid side chain and backbone molecules with solute dielectric constant = 1. Missing entries are for molecules lacking electrostatic parameters. ΔG_el = electrostatic transfer free energy; SA = calculated surface area in Å²; ΔG_c/v = calculated cavity/van der Waals transfer free energy; ΔG_tot = total calculated or experimental transfer free energy. Energies are in kcal/mol. b Experimental values from ref 36.

Figure 4. Comparison of experimental and calculated transfer free energies for amino acid side chain molecules and NMA. Calculations used a solute dielectric constant of 2 and adjusted parameters: (a) OPLS radii and OPLS charges increased by 17%. Solid line is the least-squares line fit ΔG_calc = 0.98ΔG_exp - 0.36. The correlation constant is 0.85. (b) OPLS charges and the reduced radius set in Table 4. Solid line is the least-squares line fit ΔG_calc = 0.96ΔG_exp + 0.02. The correlation constant is 0.92. [Horizontal axis in both panels: ΔG(v→w) experimental (kcal/mol).]

TABLE 2: Solvation Free Energies Using Modified OPLS and Ab Initio-Derived Parameters^a

| aacid | molecule | ΔG_tot calc, OPLS inc crg^b | ΔG_tot calc, OPLS red rad^c | ΔG_tot calc, ab initio inc crg^d | ΔG_tot calc, ab initio red rad^e | ΔG_tot exp^f |
|---|---|---|---|---|---|---|
| arg | N-propylguanidine | | | | | -10.92 |
| asn | acetamide | -11.19 | -9.89 | -9.86 | -11.11 | -9.72 |
| asp | acetic acid | -8.41 | -7.17 | | | -6.70 |
| cys | methylthiol | -3.33 | -1.57 | -2.90 | -1.47 | -1.24 |
| gln | propionamide | -10.85 | -9.72 | -9.01 | -9.91 | -9.42 |
| glu | propionic acid | -7.84 | -6.75 | | | -6.47 |
| his | methylimidazole | -9.81 | -10.56 | -7.69 | -8.96 | -10.25 |
| lys | N-butylamine | -3.37 | -3.56 | | | -4.38 |
| met | methyl ethyl sulfide | -1.11 | -0.70 | -2.48 | -1.72 | -1.49 |
| phe | toluene | 0.32 | -0.37 | -2.06 | -1.62 | -0.76 |
| ser | methanol | -6.19 | -5.62 | -5.35 | -5.04 | -5.08 |
| thr | ethanol | 4 - 5 2 | -5.12 | -4.73 | -4.49 | -4.90 |
| trp | methylindole | -5.39 | -5.45 | -6.15 | -6.45 | -5.91 |
| tyr | p-cresole | 4 - 0 2 | -5.91 | -6.60 | -7.09 | -6.13 |
| bb | N-methylacetamide | -7.53 | -7.01 | | | -10.08 |

a Calculations performed with solute dielectric constant = 2. Symbols as in Table 1. Energies are in kcal/mol. b OPLS charges scaled by 1.17 and unmodified OPLS radii. c Unmodified OPLS charges and the reduced radius set, specified in Table 4. d Ab initio charges scaled by 1.16 and unmodified OPLS radii. e Unmodified ab initio charges and the reduced radius set specified in Table 4. f Experimental values from ref 36.

Parameter sets were also derived for the ionized counterparts of the amino acid side-chain molecules. Experimental solvation free energies for charged molecules were obtained using the thermodynamic cycle depicted in Figure 2 and the associated equations below:

where the upper signs apply to basic side chains and the lower to acids. $\Delta G_{gas}$, experimental gas-phase acidities and basicities, were obtained from the literature.42 $\Delta G_{aq}$ were derived from literature pKa values2,36,43 using

$$-\Delta G_{aq} = 2.303RT(\mathrm{p}K_a) \qquad (8)$$

The vacuum to water transfer energy of a hydrogen ion has been measured variously at values ranging from -254 to -261 kcal/mol.24 A value of -261 kcal/mol was used.
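The right-hand side of eq 8 is easy to evaluate. The sketch below assumes a temperature of 298.15 K and uses a hypothetical pKa chosen only to show the magnitude of the term; it does not attempt to reproduce the full thermodynamic cycle of Figure 2.

```python
R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def pka_free_energy(pka, temperature=298.15):
    """Return 2.303 * R * T * pKa in kcal/mol (the right-hand side of eq 8)."""
    return 2.303 * R_KCAL * temperature * pka


# Hypothetical pKa of 4.8, roughly that of a simple carboxylic acid.
term = pka_free_energy(4.8)
per_unit = pka_free_energy(1.0)
print(f"{term:.2f} kcal/mol")
```

Each pKa unit contributes about 1.36 kcal/mol, so the aqueous term is small next to the hydrogen ion transfer energy of -261 kcal/mol used in the cycle.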


TABLE 3: PARSE Group Charges, Solvation Free Energies, and Gas-Phase Dipole Moments^a

A. Neutral Molecules

| chemical group | crg^b |
|---|---|
| -OH | -0.49, 0.49 |
| NH3 | -0.90, 0.30 |
| -NH2 | -0.78, 0.39 |
| -NH- | -0.50, 0.50 |
| -SH | -0.29, 0.29 |
| -CHn-S-CHn | -0.53, 0.265 |
| >C=O | 0.55, -0.55 |
| -COH | 0.49, -0.49, 0.00 |
| -COOH | 0.55, -0.495, -0.49, 0.435 |
| -CONH2 | 0.55, -0.55, -0.78, 0.39 |
| -CONH- | 0.55, -0.55, -0.40, 0.40 |
| -CNC-(NH2)2 | 0.28, -0.56, -0.75, 0.375 |
| aromatic >CH or >C-CHn | -0.125, 0.125, 0.125 |
| aromatic -CH=N-CH= | 0.155, 0.125, -0.56 |
| aromatic >C-OH | 0.055, -0.49, 0.435 |
| aromatic >NH | -0.40, 0.40 |
| aromatic -NH-CH=N-CH= | -0.40, 0.40, 0.155, 0.125 |

| molecule | ΔG_el | SA | ΔG_c/v | ΔG_tot calc | ΔG_tot exp | ΔG err | μ calc | μ exp | μ err |
|---|---|---|---|---|---|---|---|---|---|
| methanol | -7.21 | 158 | 1.77 | -5.44 | -5.08 | 0.36 | 1.74 | 1.71 | 0.03 |
| ethanol | -6.97 | 190 | 1.95 | -5.02 | -4.90 | 0.12 | 1.81 | 1.69 | 0.12 |
| propanol | -6.95 | 220 | 2.11 | -4.84 | -4.83 | 0.01 | 1.81 | 1.68 | 0.13 |
| butanol | -6.87 | 249 | 2.27 | -4.60 | -4.72 | 0.12 | 1.82 | 1.66 | 0.16 |
| isopropyl alcohol | -6.69 | 216 | 2.09 | -4.60 | -4.76 | 0.16 | 1.74 | 1.66 | 0.08 |
| 2-butanol | -6.51 | 246 | 2.25 | -4.26 | -4.58 | 0.32 | 1.73 | | |
| 3-methyl-1-butanol | -6.95 | 272 | 2.39 | -4.56 | -4.42 | 0.14 | 1.83 | | |
| ammonia | -5.68 | 119 | 1.56 | -4.12 | -4.31 | 0.19 | 1.33 | 1.47 | 0.14 |
| methylamine | -6.77 | 165 | 1.81 | -4.96 | -4.57 | 0.39 | 1.83 | 1.31 | 0.52 |
| ethylamine | -6.58 | 194 | 1.97 | -4.61 | -4.50 | 0.11 | 1.77 | 1.22 | 0.55 |
| propylamine | -6.61 | 223 | 2.13 | -4.48 | -4.39 | 0.09 | 1.76 | 1.17 | 0.59 |
| N-butylamine | -6.60 | 255 | 2.30 | -4.30 | -4.38 | 0.08 | 1.75 | 1.44^f | 0.31 |
| dimethylamine | -6.53 | 201 | 2.01 | -4.52 | -4.29 | 0.23 | 1.84 | 1.03 | 0.81 |
| diethylamine | -5.97 | 194 | 2.35 | -3.62 | -4.07 | 0.45 | 1.83 | 0.92 | 0.91 |
| methylthiol | -3.25 | 180 | 1.90 | -1.35 | -1.24 | 0.11 | 1.39 | 1.53 | 0.14 |
| ethylthiol | -3.21 | 210 | 2.06 | -1.15 | -1.30 | 0.15 | 1.42 | 1.58 | 0.16 |
| methyl ethyl sulfide | -3.70 | 243 | 2.24 | -1.46 | -1.49 | 0.03 | 2.21 | 1.56 | 0.65 |
| dimethyl sulfide | -3.92 | 212 | 2.07 | -1.85 | -1.54 | 0.31 | 2.29 | 1.50 | 0.79 |
| diethyl sulfide | -3.47 | 272 | 2.40 | -1.07 | -1.43 | 0.36 | 2.14 | 1.54 | 0.60 |
| acetone | -6.01 | 213 | 2.08 | -3.93 | -3.85 | 0.08 | 2.58 | 2.88 | 0.30 |
| 2-butanone | -5.80 | 243 | 2.24 | -3.56 | -3.64 | 0.08 | 2.58 | | |
| 2-pentanone | -5.66 | 269 | 2.38 | -3.28 | -3.53 | 0.25 | 2.54 | | |
| 3-pentanone | -5.61 | 271 | 2.39 | -3.22 | -3.41 | 0.19 | 2.57 | | |
| 3-methyl-2-butanone | -5.61 | 266 | 2.36 | -3.25 | -3.24 | 0.01 | 2.62 | | |
| 2,4-dimethyl-3-pentanone | -5.30 | 308 | 2.59 | -2.71 | -2.74 | 0.03 | 2.63 | | |
| acetaldehyde | -5.61 | 179 | 1.89 | -3.72 | -3.50 | 0.22 | 2.07 | 2.69 | 0.62 |
| propionaldehyde | -5.40 | 210 | 2.06 | -3.34 | -3.44 | 0.10 | 2.33 | 2.52 | 0.19 |
| acetic acid | -8.60 | 193 | 1.97 | -6.63 | -6.70 | 0.07 | 1.63 | 1.74 | 0.11 |
| propionic acid | -8.55 | 224 | 2.14 | -6.41 | -6.47 | 0.06 | 1.62 | 1.75 | 0.13 |
| butyric acid | -8.54 | 251 | 2.28 | -6.26 | -6.36 | 0.10 | 1.61 | | |
| acetamide | -11.75 | 197 | 1.99 | -9.76 | -9.72 | 0.04 | 3.44 | 3.76 | 0.32 |
| propionamide | -11.57 | 228 | 2.15 | -9.42 | -9.42 | 0.00 | 3.39 | 3.79 | 0.34 |
| N-methylacetamide | -12.20 | 236 | 2.20 | -10.00 | -10.08 | 0.08 | 3.99 | 3.73 | 0.26 |
| N-propylguanidine | -13.36 | 284 | 2.46 | -10.90 | -10.92 | 0.02 | 1.97 | | |
| benzene | -3.21 | 227 | 2.15 | -1.06 | -0.87 | 0.19 | 0.00 | 0.00 | 0.00 |
| toluene | -3.10 | 261 | 2.33 | -0.77 | -0.76 | 0.01 | 0.21 | 0.36 | 0.15 |
| ethylbenzene | -3.08 | 289 | 2.49 | -0.59 | -0.80 | 0.21 | 0.19 | 0.59 | 0.40 |
| pyridine | -7.17 | 220 | 2.11 | -5.06 | -4.70 | 0.36 | 1.83 | 2.19 | 0.36 |
| 4-methylpyridine | -7.36 | 256 | 2.31 | -5.05 | -4.93 | 0.12 | 2.13 | 2.70 | 0.57 |
| 2-methylpyridine | -6.56 | 256 | 2.31 | -4.25 | -4.63 | 0.38 | 1.66 | 1.85 | 0.19 |
| phenol | -8.61 | 240 | 2.22 | -6.39 | -6.62 | 0.23 | 1.45 | 1.22 | 0.00 |
| p-cresol | -8.52 | 274 | 2.41 | -6.11 | -6.13 | 0.02 | 1.49 | 1.51^f | 0.08 |
| 2-methylphenol | -8.35 | 271 | 2.39 | -5.96 | -5.87 | 0.09 | 1.32 | | |
| methylindole | -8.39 | 301 | 2.55 | -5.84 | -5.91 | 0.07 | 0.95 | 2.29^f | 1.34 |
| methylimidazole | -12.41 | 234 | 2.19 | -10.22 | -10.25 | 0.03 | 3.53 | 3.30^f | 0.23 |

N -0.56

TABLE 3 (Continued)

B. Ionized Molecules

chemical group | atom | crg | aacid | molecule | ΔG_el | SA | ΔG_c/v | ΔG_tot calc | ΔG_tot exp | ΔG err | pKa^g | ΔG_g^h

-NH3+

-cs- -coo- aromatic LCO-

aromatic -NH-CH=

N+H-CH=

N

-0.32 0.33 lys+

-0.08 CYS- -0.92

0.10 asp-

-0.195 tyr-

-0.07 -0.50 -0.35 his+

0.45 0.1413 0.125 0.35 arg+

-0.35 0.45

-0.70 0.40

-0.55 glu-

0.125

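| molecule | ΔG_el | SA | ΔG_c/v | ΔG_tot calc | ΔG_tot exp | ΔG err | pKa | ΔG_g |
|---|---|---|---|---|---|---|---|---|
| ammonium | -70.71 | 145 | 1.71 | -81.51 | -81.53 | 0.02 | 9.25 | 196.4 |
| N-butylammonium | -71.80 | 258 | 2.36 | -69.44 | -69.24 | 0.20 | 10.60 | 210.6 |
| methylthiol ion | -78.54 | 177 | 1.88 | -76.66 | -76.79 | 0.13 | 10.30 | 350.6 |
| acetate ion | -82.38 | 188 | 1.94 | -80.44 | -80.65 | 0.21 | 4.80 | 341.5 |
| propionate ion | -81.32 | 219 | 2.11 | -79.21 | -79.12 | 0.09 | 4.88 | 340.3 |
| p-cresol ion | -77.28 | 273 | 2.40 | -74.88 | -75.01 | 0.13 | 9.98 | 343.5 |
| methylimidazolium | -66.57 | 239 | 2.21 | -64.36 | -64.13 | 0.23 | 6.95 | 216.6 |
| N-propylguanidinium | -68.53 | 284 | 2.46 | -66.07 | | | 13.65 | |

total ΔG err 0.15; total μ err 0.36; aacid ΔG err 0.10; aacid μ err 0.30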

a Symbols: μ = dipole moment in debye; pKa = experimental aqueous pK; ΔG_g = gas-phase proton dissociation energy. Other symbols as in Table 1. Energies are in kcal/mol. Total errors given at bottom of the table include both neutral and ionized molecules. b Optimized PARSE atomic charges. All unlisted aromatic C, H, and ring-substituted CHn groups were assigned as in methylbenzene. Unlisted aliphatic C and H were assigned a charge of 0.0. c Molecules used in optimization of PARSE parameter set, divided into chemical groups. d Experimental values from ref 5, 36. e Experimental values from ref 44. f Experimental dipole moments obtained in benzene. g,h Experimental values from ref 42, 43, 2, and 36. pKa and ΔG_g were used to determine experimental transfer free energy of ionized molecules, according to eqs 7 and 8. The pKa value for imidazole was used for methylimidazole. The gas-phase acidity value for N-propylguanidinium was unavailable in the literature. The PARSE radius for the NH3+ group is 2.0 Å. All other ionic radii are identical to neutral radii.
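The relation between the SA column and the nonpolar column of Table 3 can be illustrated with a short sketch. The coefficient values below (a surface tension of 5.4 cal/(mol Å^2) and an offset of 0.92 kcal/mol) are assumptions chosen here because they reproduce the tabulated entries; they are not stated in this section, and the function names are hypothetical.

```python
GAMMA = 0.0054  # kcal/(mol A^2); assumed surface tension coefficient
OFFSET = 0.92   # kcal/mol; assumed constant term b


def nonpolar_energy(surface_area):
    """Surface-area-dependent (cavity/van der Waals) term, gamma * SA + b."""
    return GAMMA * surface_area + OFFSET


def total_transfer_energy(dg_el, surface_area):
    """Total vacuum-to-water transfer energy: electrostatic plus nonpolar."""
    return dg_el + nonpolar_energy(surface_area)


# (molecule, electrostatic term, surface area) taken from Table 3
rows = [("methanol", -7.21, 158), ("ethanol", -6.97, 190), ("benzene", -3.21, 227)]
for name, dg_el, area in rows:
    print(name,
          round(nonpolar_energy(area), 2),
          round(total_transfer_energy(dg_el, area), 2))
```

With these assumed coefficients the three rows give nonpolar terms and totals that agree with the tabulated values to two decimal places; other rows can differ in the last digit because the tabulated surface areas are rounded.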

Tests of the PARSE Parameter Set. Dipole moments calculated using the PARSE parameter set were compared with experimental gas-phase dipole moment data44 for the small molecules. The total calculated vacuum dipole moment values were obtained from Delphi as the vector sum of the fixed and induced (polarization) dipoles in the solute.

Solvation free energies were also calculated for a set of molecules not used in the parametrization process, using the PARSE parameters. Results were compared with experimental solvation free energies.5,45

Results

Solvation Energy Calculations: Preoptimization. Results with solute ε = 1: Initial solvation energy calculations tested the utility of some available parameter sets for use with the FDPB/γ method. These calculations were performed with a solute dielectric constant of 1, as the force-field parameters were optimized for use with a force field which excludes explicit solute polarization changes.

Calculated and experimental solvation free energies for the amino acid side chain and backbone molecules were compared. The results obtained using the CVFF, CHARMM, and AMBER parameter sets showed poor overall agreement with experimental values (results not shown). The average and maximum absolute errors in kcal/mol were 3.08 and 6.41; 2.26 and 5.86; and 4.21 and 12.54 using CVFF, CHARMM, and AMBER, respectively, and correlation coefficients were 0.73 or lower. Results using the OPLS and ab initio parameter sets are presented in Figure 3 and Table 1. For these, agreement with experiment is quite good. The average and maximum absolute errors in kcal/mol were 1.09 and 2.75; and 0.71 and 2.06 for OPLS and ab initio-derived parameters, respectively, and the correlation coefficients were high (above 0.85). (The ab initio results are for ab initio charges and OPLS radii. Using other radii altered the equation of the least-squares line fit in Figure 3 but produced little change in the correlation coefficient.) Of course, the poor results obtained using other parameters say nothing about the utility of the sets when used in conjunction with their own simulation methods.

Results with solute ε = 2: When solute polarizability was included, calculated transfer free energies were consistently less favorable (more positive) than the experimental values. Average and maximum absolute errors in kcal/mol increased to 2.12 and 5.23; and 1.97 and 4.77 for the OPLS and ab initio sets, respectively (results not shown). High correlation was maintained, however, suggesting a simple alteration of parameters could improve the results.

Optimization Using OPLS and Ab initio Parameters. Agreement between experimental solvation energies and values calculated with solute polarizability (ε = 2) was greatly improved through either of two separate, simple adjustments to the OPLS and ab initio parameter sets: charge scaling or radius reduction. Optimal adjustments were obtained either by scaling up all OPLS or ab initio atomic partial charges by 17% or 16%, respectively, without altering the radii, or by decreasing each of the atomic radii to the values indicated in Table 4, without modifying charges. The average absolute errors in solvation free energies were similar to errors obtained using solute ε = 1: 1.13 and 0.60 kcal/mol for the charge and radius adjustments to the OPLS parameters, and 0.82 and 0.64 kcal/mol for the adjustments to the ab initio set. Results are shown in Figure 4 and Table 2. The directions of the adjustments are in accord with the work of other authors.
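The charge-scaling adjustment can be sketched in a few lines. The example applies the 17% increase quoted for OPLS to the OPLS acetamide charges listed in Table 5; the function name and the dictionary layout are illustrative choices made here, not part of the published procedure.

```python
def scale_charges(charges, factor):
    """Multiply every atomic partial charge by the same factor."""
    return {atom: q * factor for atom, q in charges.items()}


# OPLS partial charges for the polar atoms of acetamide, as listed in Table 5
opls_acetamide = {"C": 0.50, "O": -0.50, "N": -0.85, "HN1": 0.425, "HN2": 0.425}

scaled = scale_charges(opls_acetamide, 1.17)  # 17% increase quoted for OPLS
for atom, q in scaled.items():
    print(f"{atom:>3s} {opls_acetamide[atom]:+.3f} -> {q:+.3f}")

# Uniform scaling leaves a neutral group neutral (up to floating-point error).
net_charge = sum(scaled.values())
print(f"net charge after scaling: {net_charge:.6f}")
```

The scaled values correspond to the 1.17*OPLS column of Table 5 after rounding to three decimals. Scaling every charge by one factor keeps the group neutral, which is one reason such an adjustment is simple to apply.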

While good overall agreement with experimental data is obtained using the adjusted parameters, the results in Figure 4 and Table 2 show that for certain molecules the magnitude of error is still considerable, in many cases exceeding 1 kcal/mol, with a maximum error of about 3 kcal/mol. Further parameter deficiencies could be at fault (OPLS parameters yield errors in relative solvation energies of up to 1-1.5 kcal/mol even when used in the BOSS program for which they were derived2,3). Alternatively, the deviations could be due to a breakdown of the continuum assumptions of the FDPB or nonpolar per-surface area methods. There is evidence that nonpolar contributions can depend on other factors in addition to surface area; for example, cyclic hydrocarbons require 1-2 kcal/mol less energy to transfer

TABLE 4: Tabulation of van der Waals Radii (Å)

Columns: atom; Pauling^a; Chothia^b; Ramachandran^c; CVFF^d; CHARMM^e; OPLS^f; AMBER^g; reduced radii^h; PARSE (rows list only the tabulated entries, in column order)

CHn 2.00 1.87 ~2.25 2.19 1.90 2.00
C 2.18 1.70 1.70
C(arom) 1.70 1.76 1.60 2.03 2.10 1.99 1.80 1.70 1.70
H 1.20 1.00 1.37 1.468 1.54 1.00 1.00
H(aromatic) 1.468 1.36
OH 1.40 1.60 1.68
O 1.40 1.40 1.40 1.60 1.60 1.40 1.40
NH2(tetrahed) 1.50 1.97
NH2,NH(trig) 1.65 1.97 1.60 1.82 1.75
N 1.50 1.35 1.45/1.5 1.50
SH 1.85 1.99
S 1.85 1.85 1.89 1.90 2.00 1.90 1.85
lone pair 1.20

H ( w W 0.80 1 .oo

a Reference 25. b Reference 29c. c Reference 47. d Reference 37. e Reference 38. f Reference 40. g Reference 39. h Optimized to reproduce solvation energies for the FDPB/γ method with OPLS or ab initio charge parameters and solute ε = 2. Varies between 2.13 and 2.36.

from vacuum to water than straight chain alkanes of identical surface area (data from Ben-Naim and Marcus32). Interestingly, applying this reduced nonpolar contribution to the aromatic molecules improves agreement between calculated and experimental solvation energies in nearly all cases for the molecules in this set. The number of molecules considered here is too few to determine which sources of error are contributing significantly. Future work will address these issues.

Our immediate purpose is to obtain a parameter set which, when used in the FDPB/γ method, reproduces small-molecule solvation energies well. One option is to introduce additional parameter modifications to the OPLS or ab initio sets. For example, a correction specific to the aromatic OPLS parameters, such as a further increase in aromatic C-H bond dipole moment or a decrease in the C and H radii, would improve solvation energy agreement for aromatics. This introduces a somewhat ad hoc complication, however. An alternative choice, which we have pursued, is to derive a new parameter set, specific for FDPB/γ solvation energy calculations.

Optimization of the PARSE Parameter Set. Tables 3 and 4 contain the optimized PARSE charges and radii, and Table 3 contains the calculated and experimental solvation free energies for the optimization set of molecules. Most of the functional groups are simple enough that the charge parameters consist of a single bond dipole moment. The amide, acetamide, and imidazole group charges were derived by combining the amine + ketone, indole + ketone, and pyridine + indole + aromatic CH groups. Acid and aromatic alcohol group charges were obtained from ketone + alcohol and aromatic CH + alcohol groups, with an additional bond moment to achieve agreement with solvation energies.

The molecules in Table 3 are classified according to the functional groups they contain. For many of the groups (e.g., primary amines, ketones, acids, and amides), excellent agreement with solvation energies was obtained for nearly all the subset members. Other groups (e.g., secondary amines, sulfides, and pyridines) demonstrate less consistent agreement among the molecules within that class. For these, the charge set may be too simple, or a single charge distribution may be insufficient. In all the groups, calculated energies were slightly too negative for the singly methylated molecule (methanol, methylamine). The same result is obtained for methane, and it is due to the fact that the experimental vacuum to water transfer energy for methane does not fall in line with the other alkanes on a plot of ΔG_transfer versus surface area; it is larger by about 0.3 kcal/mol. It appears that for molecules of this size, the linear relationship between nonpolar solvation energy and surface area predicts an energetic contribution which is too small (this effect disappears when volume corrected transfer free energies for the alkanes are used). Overall, however, excellent agreement for solvation free energies was obtained for this optimization set of molecules. The average and maximum absolute errors are 0.15 and 0.45 kcal/mol. For molecules representing the amino acid side chains and backbone, the average and maximum absolute errors are 0.10 and 0.23 kcal/mol. Figure 5 shows the agreement between calculated and experimental transfer free energies, using the PARSE parameters, for the amino acid side chain and backbone molecules (these molecules were included in the optimization set).

For comparison, solvation energies were also calculated using the group contribution parameters developed by Privalov and Makhatadze (results not shown; most of the molecules were also part of their optimization set and appear in their publication). In this method, an optimized parameter describes the solvation free energy contribution of each group. The total solvation energy is given by the sum of the contributions from the constituent groups. As with the PARSE results, most solvation energies are matched extremely well; however, there are problems for particular chemical groups (such as phenols). Average and maximum absolute errors using Privalov and Makhatadze's parameters were 0.23 and 2.12 kcal/mol for the full set, and 0.31 and 1.86 kcal/mol for the amino acid side chain and backbone molecules (excluding molecules for which parameters were unavailable).

In Tables 4 and 5 the PARSE charge and radius parameters are compared to those derived from other sets available in the literature. The PARSE radii are similar to radii observed by Ramachandran in peptide crystals47 and those used by Chothia for surface area calculations29c (Table 4). The radii are somewhat smaller than some of the force-field radii. The PARSE charges lie within the range of charges used in the force-field sets CVFF, CHARMM, and OPLS but differ significantly from the AMBER and ab initio sets (Table 5). One reason for the difference is that, in contrast to the electrostatic potential fitting method used to obtain AMBER and ab initio charges, PARSE charges were limited to few atoms. The PARSE charges, in conjunction with the continuum approximation, rely on a simple charge distribution to reproduce the general response of the solvent.

Tests of the PARSE Parameters. Reproduction of dipole moments: In Table 3, experimental gas-phase dipole moments are compared with the dipole moments calculated using the PARSE parameters, for a number of small polar molecules. Figure 6 compares the calculated and experimental dipole moments for the amino acid side-chain and backbone analogs. In general the agreement is good: the average absolute error for all molecules for which experimental data were available was 0.36 D; for amino acid side chains and backbone only, the error was 0.30 D. Many molecules for which agreement is poorer (sulfides, some nitrogen-containing aromatics) are cases where OPLS charges also fail to reproduce dipole moments well, while the more detailed charge representation offered by the ab initio set produces better agreement (results of dipole moment calculations using OPLS and ab initio sets not shown). Results for the nitrogen-containing aromatics could be improved by redistributing some of the negative charge from the nitrogen among the carbon ring atoms; however, in the interest of maintaining the simplicity of the PARSE set this was not done.

Figure 5. Comparison of experimental and calculated vacuum-to-water transfer free energies (kcal/mol) for amino acid side chain molecules and NMA. Calculations used PARSE parameters, with solute dielectric constant of 2. Solid line is the least-squares line fit ΔG_calc = 0.99 ΔG_exp - 0.06. The correlation coefficient is 1.00.

Figure 6. Comparison of experimental and calculated gas-phase dipole moments (D) for the amino acid side chain and backbone molecules. PARSE parameters were used. Solid line is the least-squares line fit μ_calc = 0.91 μ_exp + 0.08. The correlation coefficient is 0.85.

Solvation free energies of other molecules: Solvation free energy calculations on molecules not included in the parameterization set appear in Table 6. For the majority of these molecules, agreement with experimental free energies resembled that obtained for the set of molecules used to derive the parameters (Table 3). Exceptions include the dimethyl-substituted pyridine molecules, for which the error may be linked to the problems in matching dipole moments in aromatic nitrogen-containing groups, noted above. The same explanation may apply to the methylpyrazine molecule (a ring containing two nitrogens). Calculated solvation energies for cyclic molecules were consistently 1-2 kcal/mol less favorable than experimental values. This is at least in part due to possible differences in γ and b values for cyclic versus aliphatic hydrocarbons, which were mentioned earlier, that were not taken into account in the calculation.

The molecules ethanediol and ethanediamine are a special and interesting case since, in contrast with the other molecules studied here, changes in molecular geometry between solvent phases are known to occur. Experimental and theoretical studies on ethanediol and similarly substituted alkanes have shown that the preferred conformation in vacuum has an internal hydrogen bond, which disappears upon transfer to water (where the polar -OH groups can hydrogen bond with the solvent).48 In order to obtain the correct solvation energy for these molecules, the vacuum and water structures must be used in the ε = 1 and ε = 80 Delphi runs, respectively. Furthermore, an internal energy contribution due to the conformational change must be added. Ab initio calculations have been performed for ethanediol and ethanediamine in the vacuum state. The lowest energy conformations were found, in agreement with experimental data, to have an internal hydrogen bond.49,50 Monte Carlo simulations on ethanediol in water, performed by Nagy et al.51 using the BOSS program,52 result in a different, externally hydrogen bonded lowest energy state. The ab initio energy (no solvent contribution) for the preferred aqueous structure for ethanediol is approximately 4.0 kcal/mol higher than for the vacuum structure, as computed using MP2/6-31G*//6-31G* energy calculations.51 This amount was added to the solvent contribution of -11.9 kcal/mol calculated for ethanediol using FDPB/γ/PARSE (where the geometries published by Nagy et al., denoted tGg' and tGg, were used for the vacuum and aqueous Delphi runs, respectively). The total solvation energy, -7.9 kcal/mol, is in excellent agreement with the experimentally measured transfer energy value of -7.66 kcal/mol.
Ab initio energy calculations on ethanediamine suggest that solvent-induced changes in conformation can contribute about 2.2 kcal/mol at most (assuming the aqueous solvent induces a switch to the trans conformation).50 Adding this contribution to the solvation energy of ethanediamine calculated using PARSE yields a net error of 1.5 kcal/mol. The remaining error could be due to the quality of the ab initio calculations available at the time of computation (calculations were performed using GAUSSIAN-70 with the 4-31G and 6-31G basis sets and an ab initio SCF Hartree-Fock gradient program developed by Pulay with the 4-21G basis set). We plan to perform ab initio calculations of the gas-phase contributions to changes in structure for ethanediamine using newer ab initio methods with larger basis sets, which include electron correlation, to see if this will reconcile the difference between the PARSE calculated and experimental solvation energies.
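The bookkeeping for a solute that changes conformation on transfer can be written out explicitly. The sketch below uses only the ethanediol numbers quoted in the text; the function name is a hypothetical label for the sum of the solvent contribution and the internal (conformational) energy change.

```python
def corrected_solvation_energy(solvent_term, conformational_term):
    """Solvent contribution plus the internal energy cost of changing conformation."""
    return solvent_term + conformational_term


# Ethanediol: FDPB/gamma/PARSE solvent contribution and ab initio
# conformational energy difference, both in kcal/mol, as quoted in the text
ethanediol_total = corrected_solvation_energy(-11.9, 4.0)
ethanediol_exp = -7.66
deviation = abs(ethanediol_total - ethanediol_exp)
print(f"total = {ethanediol_total:.1f} kcal/mol, deviation = {deviation:.2f} kcal/mol")
```

The sum reproduces the total of -7.9 kcal/mol quoted for ethanediol and leaves a deviation of about 0.24 kcal/mol from the experimental value.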

Comparison with Other Solvation Energy Calculation Methods.

TABLE 5: Atomic Charge Parameters for Acetamide (in -e Units)

| atom | CVFF^a | CHARMM^b | AMBER^c | OPLS^d | ab initio^e | 1.17*OPLS^f | 1.16*ab initio^f | PARSE |
|---|---|---|---|---|---|---|---|---|
| C | -0.30 | | -0.140 | | -0.649 | | -0.753 | |
| H1 | 0.10 | | 0.380 | | 0.172 | | 0.200 | |
| H2 | 0.10 | | 0.380 | | 0.176 | | 0.204 | |
| H3 | 0.10 | | 0.380 | | 0.172 | | 0.200 | |
| C | 0.38 | 0.55 | 0.675 | 0.50 | 1.001 | 0.585 | 1.172 | 0.55 |
| O | -0.38 | -0.55 | -0.470 | -0.50 | -0.648 | -0.585 | -0.752 | -0.55 |
| N | -0.56 | -0.60 | -0.867 | -0.85 | -1.138 | -0.995 | -1.320 | -0.78 |
| HN1 | 0.28 | 0.30 | 0.344 | 0.425 | 0.459 | 0.497 | 0.532 | 0.39 |
| HN2 | 0.28 | 0.30 | 0.344 | 0.425 | 0.453 | 0.497 | 0.525 | 0.39 |

a Reference 37. b Reference 38. c Reference 39. d Reference 40. e Reference 41. f Optimized to reproduce solvation energies for the FDPB/γ method with OPLS radii and solute ε = 2.

TABLE 6: Solvation Free Energies Using PARSE Parameters'

| molecule | ΔG_el | SA | ΔG_c/v | ΔG_tot calc | ΔG_tot exp^b | ΔG err |
|---|---|---|---|---|---|---|
| pentanol | -6.92 | 278 | 2.43 | -4.49 | -4.47 | 0.02 |
| 4-methyl-2-pentanol | -6.31 | 295 | 2.52 | -3.79 | -3.73 | 0.06 |
| 2-pentanol | -6.34 | 275 | 2.41 | -3.93 | -4.39 | 0.46 |
| 2-methyl-2-butanol | -6.03 | 267 | 2.37 | -3.66 | -4.43 | 0.77 |
| 2,3-dimethylisobutyl alcohol | -6.00 | 288 | 2.48 | -3.51 | -3.92 | 0.40 |
| 3-pentanol | -6.26 | 274 | 2.41 | -3.85 | -4.35 | 0.50 |
| hexanol | -6.93 | 308 | 2.59 | -4.34 | -4.36 | 0.02 |
| 3-hexanol | -6.34 | 304 | 2.57 | -3.77 | -4.08 | 0.31 |
| 2-methyl-3-pentanol | -6.20 | 297 | 2.53 | -3.67 | -3.89 | 0.21 |
| heptanol | -6.95 | 337 | 2.75 | -4.20 | -4.24 | 0.04 |
| 4-heptanol | -6.35 | 334 | 2.73 | -3.62 | -4.01 | 0.39 |
| 2-methylpropanol | -6.86 | 243 | 2.24 | -4.62 | -4.52 | 0.10 |
| 2-methyl-2-propanol | -6.25 | 240 | 2.22 | -4.02 | -4.51 | 0.49 |
| 2-methyl-2-pentanol | -6.07 | 295 | 2.52 | -3.55 | -3.93 | 0.38 |
| octanol | -6.95 | 367 | 2.91 | -4.04 | -4.09 | 0.05 |
| pentaneamine | -6.63 | 284 | 2.46 | -4.16 | -4.10 | 0.06 |
| hexaneamine | -6.59 | 314 | 2.62 | -3.96 | -4.03 | 0.07 |
| N,N-dipropylamine | -5.93 | 322 | 2.67 | -3.27 | -3.66 | 0.39 |
| N,N-dibutylamine | -5.91 | 381 | 2.99 | -2.92 | -3.33 | 0.41 |
| 4-methyl-2-pentanone | -5.65 | 294 | 2.51 | -3.14 | -3.06 | 0.07 |
| 4-heptanone | -5.58 | 327 | 2.69 | -2.89 | -2.93 | 0.04 |
| 5-nonanone | -5.58 | 385 | 3.01 | -2.57 | -2.67 | 0.10 |
| 2-hexanone | -5.76 | 302 | 2.56 | -3.21 | -3.29 | 0.08 |
| 2-heptanone | -5.78 | 331 | 2.72 | -3.06 | -3.04 | 0.02 |
| 2-octanone | -5.76 | 361 | 2.88 | -2.88 | -2.88 | 0.00 |
| 2-nonanone | -5.75 | 391 | 3.04 | -2.72 | -2.49 | 0.23 |
| butanal | -5.42 | 239 | 2.22 | -3.20 | -3.18 | 0.02 |
| pentanal | -5.40 | 269 | 2.38 | -3.02 | -3.03 | 0.01 |
| hexanal | -5.41 | 299 | 2.54 | -2.87 | -2.81 | 0.06 |
| heptanal | -5.42 | 329 | 2.70 | -2.72 | -2.67 | 0.05 |
| octanal | -5.43 | 358 | 2.86 | -2.57 | -2.29 | 0.28 |
| nonanal | -5.39 | 388 | 3.02 | -2.36 | -2.08 | 0.29 |
| N-methylformamide | -12.20 | 200 | 2.01 | -10.19 | -10.00 | 0.19 |
| cyclopentanol | -6.53 | 245 | 2.25 | -4.28 | -5.49 | 1.21 |
| cyclohexanol | -6.54 | 268 | 2.37 | -4.17 | -5.47 | 1.30 |
| cycloheptanol | -6.48 | 288 | 2.48 | -4.00 | -5.49 | 1.49 |
| pyrrolidine | -6.38 | 229 | 2.16 | -4.22 | -5.48 | 1.26 |
| piperazine | -11.54 | 246 | 2.25 | -9.29 | -7.38 | 1.91 |
| piperidine | -6.29 | 251 | 2.28 | -4.00 | -5.11 | 1.11 |
| propylbenzene | -3.07 | 319 | 2.65 | -0.42 | -0.53 | 0.12 |
| o-xylene | -3.15 | 288 | 2.48 | -0.67 | -0.90 | 0.23 |
| m-xylene | -3.11 | 294 | 2.52 | -0.59 | -0.83 | 0.24 |
| p-xylene | -3.11 | 294 | 2.52 | -0.59 | -0.81 | 0.22 |
| naphthalene | -4.53 | 292 | 2.50 | -2.03 | -2.39 | 0.36 |
| anthracene | -5.79 | 355 | 2.84 | -2.95 | -4.23 | 1.28 |
| phenanthrene | -5.78 | 351 | 2.82 | -2.95 | -3.95 | 1.00 |
| 1,2,4-trimethylbenzene | -3.10 | 321 | 2.66 | -0.45 | -0.86 | 0.41 |
| methylethylbenzene | -3.03 | 313 | 2.62 | -0.41 | -0.30 | 0.11 |
| butylbenzene | -3.07 | 349 | 2.81 | -0.26 | -0.40 | 0.14 |
| methylpropylbenzene | -3.03 | 340 | 2.76 | -0.26 | -0.45 | 0.18 |
| dimethylethylbenzene | -2.99 | 330 | 2.71 | -0.28 | -0.44 | 0.15 |
| dimethylpropylbenzene | -3.01 | 354 | 2.84 | -0.17 | -0.18 | 0.01 |
| dimethylethylphenol | -8.38 | 344 | 2.79 | -5.60 | -5.92 | 0.33 |
| 3-methylpyridine | -6.95 | 254 | 2.30 | -4.65 | -4.77 | 0.12 |
| 2-ethylpyridine | -6.42 | 284 | 2.46 | -3.95 | -4.33 | 0.38 |
| 3-ethylpyridine | -6.86 | 283 | 2.46 | -4.40 | -4.60 | 0.20 |
| 4-ethylpyridine | -6.97 | 283 | 2.45 | -4.52 | -4.37 | 0.14 |
| 2,4-dimethylpyridine | -6.44 | 289 | 2.49 | -3.95 | -4.86 | 0.91 |
| 3,4-dimethylpyridine | -6.84 | 281 | 2.44 | -4.39 | -5.22 | 0.83 |
| 3,5-dimethylpyridine | -6.72 | 288 | 2.48 | -4.24 | -4.84 | 0.60 |
| 2,3-dimethylpyridine | -6.34 | 272 | 2.40 | -3.95 | -4.83 | 0.88 |
| 2,5-dimethylpyridine | -6.31 | 289 | 2.49 | -3.82 | -4.72 | 0.90 |
| 2,6-dimethylpyridine | -5.94 | 290 | 2.49 | -3.44 | -4.60 | 1.16 |
| dimethylethylpyridine | -6.84 | 324 | 2.68 | -4.16 | -4.46 | 0.30 |
| 2-methylpyrazine | -10.06 | 249 | 2.27 | -7.79 | -5.52 | 2.28 |
| 1,2-ethanediamine | -13.10 | 214 | 2.08 | -11.02 | -7.60 | 3.42 |
| 1,2-ethanediol | -13.91 | 198 | 1.99 | -11.92 | -7.66 | 4.26 |

average error^c 0.44

a Solvation energies for molecules not used in the optimization set. ΔG err = absolute value of difference between calculated and experimental transfer free energies. Other symbols as in Table 1. Energies are in kcal/mol. b References 5 and 45. c Average error not including ethanediol and ethanediamine.
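The average error reported for Table 6 can be recomputed from the tabulated error column. The sketch below lists the 67 absolute errors in table order, with digits split by the text extraction (for example "0.3 1") rejoined, which is an assumption about the intended values; the last two entries belong to ethanediamine and ethanediol, which the table footnote excludes from the average.

```python
# Absolute errors (kcal/mol) from Table 6, in table order
errors = [
    0.02, 0.06, 0.46, 0.77, 0.40, 0.50, 0.02, 0.31, 0.21, 0.04,
    0.39, 0.10, 0.49, 0.38, 0.05, 0.06, 0.07, 0.39, 0.41, 0.07,
    0.04, 0.10, 0.08, 0.02, 0.00, 0.23, 0.02, 0.01, 0.06, 0.05,
    0.28, 0.29, 0.19, 1.21, 1.30, 1.49, 1.26, 1.91, 1.11, 0.12,
    0.23, 0.24, 0.22, 0.36, 1.28, 1.00, 0.41, 0.11, 0.14, 0.18,
    0.15, 0.01, 0.33, 0.12, 0.38, 0.20, 0.14, 0.91, 0.83, 0.60,
    0.88, 0.90, 1.16, 0.30, 2.28, 3.42, 4.26,
]

# Drop ethanediamine and ethanediol (the last two rows) before averaging
rigid_errors = errors[:-2]
mean_abs_error = sum(rigid_errors) / len(rigid_errors)
max_abs_error = max(rigid_errors)
print(len(errors), round(mean_abs_error, 2), max_abs_error)
```

With the two conformationally flexible molecules excluded, the mean rounds to the 0.44 kcal/mol given at the foot of the table, and the largest remaining error is the 2.28 kcal/mol entry for 2-methylpyrazine.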

A summary of the solvation energies of selected small polar solutes calculated using the FDPB/γ method with the PARSE parameters, and other published methods, is presented in Table 7.

Discussion

In this paper, we have presented a rapid method for calculating solvation free energies which incorporates solute polarizability. We have developed a set of atomic parameters that reproduce experimentally observed transfer free energy data for small molecules, including those representing the amino acid side chains and backbone. The optimized parameters should increase the accuracy of small molecule solvation energy computations. The

TABLE 7: Comparison of Solvation Free Energy Results (in kcal/mol)

molecule | M&P^a | BOSS^b | Still^c | Sharp^d | Mohan^e | Lim^f | PARSE^g | C&T^h | Tannor^i | expt^j

methanol ethanol 2-propanol acetone acetic acid acetamide t-nma c-nma benzene toluene phenol

-5.2 -5.3 -5.1 -5.5 -4.1 -5.0 -3.8 -2.4 -6.1 -4.8 -10.3 -8.8 -9.9 -1.9 -9.9 -5.2 -0.9 0.3 -0.9 1.6 -1.9 -5.8

-6.2 -5.2 -4.3 -3.2 -6.5 -10.6

-1 .o -0.1 -6.3

-4.6 -4.4 -4.3 -3.2 -10.4 -10.3 -9.1 -1.1

1.7 -2.1

-5.4 -5.2 -5.4 -5.0 -4.6

-4.0 -3.9 -6.6 -6.7 -6.6 -9.0 -9.8

-10.0

-2.0

-6.1

-1.1 -0.8 -6.4

-5.3 -4.6 -3.6 -4.7 -8.5

-0.3 -0.1 -5.4

-5.1 -4.5 -4.5 -4.4 -1.1 -10.6 -8.4 -8.4

-1.1 -6.3

-5.1 -4.9 -4.8 -3.9 -6.7 -9.7

-10.1 -10.1 -0.9 4.8 -6.6

a Group contribution method of Privalov and Makhatadze. b Electrostatic contribution taken from FEP calculations using the BOSS program;20 cavity term obtained from present work has been added. c Approximate generalized Born method of Still et al.57 d FDPB with OPLS parameters, charges scaled to reproduce dipole moments;22 cavity term from present work has been added. e FDPB with OPLS parameters and optimized OPLS radii of Mohan et al.;21 cavity term = 0. f FDPB with optimized parameters of Lim et al.;24 cavity term = 0. g FDPB/γ method using PARSE parameters, from this work. h Quantum mechanical solute with approximate generalized Born treatment of solute-solvent interactions.45 i Quantum mechanical solute with FDPB treatment of solute-solvent interactions;18 cavity term obtained from present work has been added. j References 5 and 36.

parameters can be applied and will hopefully improve the reliability of FDPB/y calculations on proteins, as well. The success of the parametrization process indicates that charges and radii for new chemical groups could be easily developed, by optimizing to small molecule solvation energies using the group contribution approach described here.

Simple group additivity methods have been shown to be remarkably successful at reproducing experimental small-molecule solvation energies. We have demonstrated that comparable accuracy can be obtained using the FDPB/γ method with the PARSE parameters, which were derived using a group contribution approach. In the FDPB/γ method, however, group contributions are not purely additive; neighboring charges influence a group's solvation energy according to the PB equation. This makes the FDPB/γ method more adaptable to calculations on larger molecules, where parametrization for such interactions is more difficult. Additionally, the FDPB calculation can include contributions due to partially and shallowly buried groups in a more realistic manner than through simple exposed surface area dependence.

Previous efforts have been made to optimize parameters for the polarizable solute FDPB method using small molecule solvation energy data. Recently, Sharp et al.22 showed that agreement with experimental solvation energies could be improved by scaling up force-field charges to match the experimental dipole moments of the solutes. In the work of Lim et al.,24 parameters for several small molecules were obtained through fine adjustment of charges and radii obtained from either OPLS, CHARMM, or Rashin and Namboodiri.53 While this approach produced excellent accord with the solvation data, few molecules were fit, and the parameters were drawn from several different sets. Mohan et al.21 varied atomic radii to obtain an optimal set for use with OPLS charges in the FDPB method. In agreement with results obtained here, the preferred set matched small-molecule solvation energies well overall, with larger errors for individual molecules. However, in the work of both Lim et al. and Mohan et al. the nonpolar contribution to transfer was neglected; thus the parameters were optimized to produce an electrostatic energy which matched the total transfer free energy, rather than the "true" electrostatic portion. This results in charge parameters that are too small (or radii that are too large), since the total transfer free energies are smaller in magnitude than the electrostatic portion. This could conceivably produce errors in calculations on larger molecules, where charge interactions between groups are present. The utility of these parameters in calculations involving proteins is thus uncertain.
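The size of the effect described above can be estimated with a short calculation. The sketch assumes that, at fixed radii and dielectric constants, the electrostatic solvation energy obtained from the (linear) Poisson equation scales with the square of a uniform charge scaling factor; the function name is hypothetical, and the methanol numbers are the experimental transfer energy and nonpolar term listed in Table 3.

```python
import math


def refit_charge_scale(dg_total, dg_nonpolar):
    """Uniform charge scale factor implied by fitting the electrostatic energy
    to the total transfer energy instead of to its electrostatic part.

    Assumes the electrostatic energy is quadratic in a uniform charge scale.
    """
    dg_el_true = dg_total - dg_nonpolar  # electrostatic part of the transfer energy
    return math.sqrt(dg_total / dg_el_true)


# Methanol: experimental transfer energy and nonpolar term (kcal/mol), Table 3
scale = refit_charge_scale(-5.08, 1.77)
print(f"charges fitted without a nonpolar term are scaled by about {scale:.3f}")
```

Under this assumption, charges fitted to the total transfer energy of methanol would come out roughly 14% smaller than charges fitted to the electrostatic part alone, which illustrates why neglecting the nonpolar term leads to charges that are too small.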

The parametrization approach described here works well for small molecules for which there are good experimental solvation data. However, there are some limitations that suggest directions for future development. The parametrization method is straightforward for simple chemical groups, but for larger groups where more atoms are charged, the assignment process can become somewhat arbitrary. Keeping the charges simple, as we have attempted to do here, can result in poor agreement for solvation energies of molecules containing that group (e.g., the pyridine derivatives). Additionally, it is very difficult to obtain parameters for new chemical groups, such as those present in DNA or many drug molecules, if experimental solvation energy data are not available. One approach to resolving these problems would be to obtain parameters from ab initio calculations (or the OPLS force-field) and apply the simple charge scaling or radius reduction adjustments developed here.

Another area for future development involves testing and refining the method and parameters on proteins. One issue is whether parameters optimized for calculating solvation energies will produce accurate interaction energies for intra- and intermolecular contacts, such as those within proteins and between proteins and other proteins or ligands. Another issue is whether the parameters for NMA and the amino acid side chain analogs can be transferred without change to amino acids in proteins. It should be stressed that the assumption of transferability underlies not only the FDPB/γ/PARSE method but all methods which derive parameters from small-molecule data. Recently, this issue was addressed in work by Price et al.54 Variations in atomic charges, obtained using ab initio methods and distributed multipole analysis, were studied for a number of small molecules, to determine whether surrounding portions of molecules affected the charges on chemical groups. The results indicated that charges derived from side-chain analogs should be transferable to polypeptides, for most amino acid residues. Notably, however, Price et al. found that the smallest realistic model for a peptide backbone is not NMA, but the dipeptide (a conclusion also reached by Bellido and Rullman55 due to the contrast in the peptide group dipole moment in dipeptides and formamide). The parameter derivations presented here are limited by the fact that experimental solvation free energies are not available for dipeptide molecules; thus NMA has been used as a model for the peptide backbone. Despite this substitution, preliminary calculations using PARSE parameters have produced excellent agreement with experiment for the electrostatic potential in water due to an α-helix peptide backbone.56 Future calculations on protein systems will provide further tests as to the transferability of the PARSE parameters.
Other issues that will have to be addressed in protein calculations which use the FDPB/γ/PARSE method are the roles of volume corrections and surface curvature in computing the nonpolar solvation energy contributions.34,35

A possible improvement over the approach presented here would involve incorporating the effects of continuum solvent into a quantum mechanical solute calculation. This would eliminate the need for atomic charge parameters and include solute polarizability automatically, through the quantum mechanics. Efforts in this direction have already been made. Luzhkov and Warshel have added a Langevin dipole description of the solvent to the MNDO Hamiltonian.14 Cramer and Truhlar calculated solvation free energies utilizing the AM1 Hamiltonian, with solvent effects treated using the generalized Born model of Still et al.17,45,57 Each atom type was assigned a unique surface tension coefficient which was parametrized to produce agreement with experimental solvation free energies. Recently, the FDPB method was incorporated into a quantum mechanical solvation energy calculation scheme which uses a pseudospectral method for solving the Hartree-Fock equations, combined with a general valence bond treatment of correlation (PSGVB).18 The results from this work have been encouraging: for most molecules, computed solvation free energies agreed with experiment to within 1 kcal/mol, following some minor adjustment of atomic radius parameters. Full quantum mechanical methods are currently limited to molecules much smaller than proteins. Quantum mechanics has been incorporated into subregions of proteins, however.8,9,11,12,15,58,59

Given these issues and the current state of theoretical and computational development, the approach used in this work, involving a combination of discrete representation of the solute, implicit solvent representation, and extensive parametrization against experimental data, is a reasonable way to calculate solvation energies. The tradeoff between simplicity, accuracy, and computational tractability permits the rapid calculation of solvation energies, with excellent accuracy for small molecules, while the longer distance screening effects between charges important for calculations in proteins are retained.
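The decomposition underlying this approach, an electrostatic term plus a surface-area-dependent nonpolar term, can be illustrated with a minimal Python sketch. To stay self-contained it replaces the FDPB calculation with the analytic Born expression, which is the exact continuum result only for a single charge centered in a spherical cavity. The function names, the linear form γA + b, and all numerical inputs (γ, b, the surface area, the cavity radius) are assumptions chosen for illustration, not parameters taken from this work.

```python
# Coulomb constant e^2 / (4 * pi * eps0), in kcal * angstrom / (mol * e^2).
COULOMB_KCAL = 332.06


def born_energy(charge, radius, eps_vacuum=1.0, eps_water=80.0):
    """Electrostatic free energy (kcal/mol) of transferring a spherical ion
    of the given charge (e) and radius (angstrom) from vacuum to water.

    Analytic stand-in for a finite difference Poisson solution; valid only
    for one charge at the center of a sphere.
    """
    if radius <= 0:
        raise ValueError("radius must be positive")
    return (-0.5 * COULOMB_KCAL * charge ** 2 / radius
            * (1.0 / eps_vacuum - 1.0 / eps_water))


def nonpolar_energy(area, gamma, offset=0.0):
    """Nonpolar (cavity/van der Waals) term, linear in surface area.

    area in square angstroms, gamma in kcal/(mol * angstrom^2).
    The optional constant offset is an assumption of this sketch.
    """
    return gamma * area + offset


def solvation_energy(charge, radius, area, gamma, offset=0.0):
    """Total hydration free energy as electrostatic plus nonpolar parts."""
    return born_energy(charge, radius) + nonpolar_energy(area, gamma, offset)


# Hypothetical monovalent ion: 2 angstrom cavity, 50 square angstrom surface.
electrostatic = born_energy(charge=1.0, radius=2.0)
nonpolar = nonpolar_energy(area=50.0, gamma=0.005)
total = solvation_energy(charge=1.0, radius=2.0, area=50.0, gamma=0.005)
```

For a charged solute the electrostatic term dominates, while for a hydrocarbon (all charges zero) only the surface-area term remains, which is why a single γ can be fitted to hydrocarbon data.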


Acknowledgment. We wish to thank Anthony Nicholls for making available recently updated versions of Delphi and S. Sridharan for providing the surface area program prior to publication. We thank Prof. W. L. Jorgensen for providing a list of OPLS parameters for organic and biochemical systems. Financial support from the Office of Naval Research (N00014-93-1-0405) (B.H.); the NIH (GM 30518) (B.H.), (GM082750) (D.S.) (RR06892); and the E. R. Johnson Foundation and the NSF (MCB92-20477) (K.A.S.) is gratefully acknowledged.

References and Notes

(1) Bash, P.; Singh, U. C.; Langridge, R.; Kollman, P. Science 1987, 236, 564.

(2) Jorgensen, W. L.; Briggs, J. M. J. Am. Chem. Soc. 1989, 111, 4190.

(3) Jorgensen, W. L.; Briggs, J. M.; Contreras, M. L. J. Phys. Chem. 1990, 94, 1683.

(4) Hansch, C.; Leo, A. Substituent Constants for Correlation Analysis in Chemistry and Biology; Wiley: New York, 1979.

(5) Cabani, S.; Gianni, P.; Mollica, V.; Lepori, L. J. Solut. Chem. 1981, 10, 563.

(6) Eisenberg, D.; McLachlan, A. D. Nature 1986, 319, 199.

(7) Privalov, P. L.; Makhatadze, G. I. J. Mol. Biol. 1993, 232, 660.

(8) Warshel, A.; Levitt, M. J. Mol. Biol. 1976, 103, 227.

(9) Tapia, O.; Johannin, G. J. Chem. Phys. 1981, 75, 3624.

(10) Singh, U. C.; Kollman, P. A. J. Comput. Chem. 1986, 7, 718.

(11) Rullmann, J. A.; Bellido, M. N.; van Duijnen, P. T. J. Mol. Biol. 1989, 206, 101.

(12) Field, M. J.; Bash, P. A.; Karplus, M. J. Comput. Chem. 1990, 11, 700.

(13) Alagona, G.; Ghio, C.; Igual, J.; Tomasi, J. J. Mol. Struct. (THEOCHEM) 1990, 204, 253.

(14) Luzhkov, V.; Warshel, A. J. Comput. Chem. 1992, 13, 199.

(15) Bajorath, J.; Kraut, J.; Li, Z.; Kitson, D. H.; Hagler, A. T. Proc. Natl. Acad. Sci. U.S.A. 1991, 88, 6423.

(16) Gao, J.; Xia, X. Science 1992, 258, 631.

(17) Cramer, C. J.; Truhlar, D. G. Science 1992, 256, 213.

(18) Tannor, D.; Marten, B.; Murphy, R.; Nicholls, A.; Honig, B.; Friesner, R. J. Am. Chem. Soc., submitted.

(19) Sharp, K.; Honig, B. Annu. Rev. Biophys. Biophys. Chem. 1990, 19, 301.

(20) Jean-Charles, J.; Nicholls, A.; Sharp, K.; Honig, B.; Tempczyk, A.; Hendrickson, T.; Still, C. J. Am. Chem. Soc. 1990, 113, 1454.

(21) Mohan, V.; Davis, M. E.; McCammon, J. A.; Pettitt, B. M. J. Phys. Chem. 1992, 96, 6428.

(22) Sharp, K.; Jean-Charles, J.; Honig, B. J. Phys. Chem. 1992, 96, 3822.

(23) Honig, B.; Sharp, K. A.; Yang, A.-S. J. Phys. Chem. 1993, 97, 1101.

(24) Lim, C.; Bashford, D.; Karplus, M. J. Phys. Chem. 1991, 95, 5610.

(25) Pauling, L. The Nature of the Chemical Bond; Cornell University Press: Ithaca, NY, 1960.

(26) Nicholls, A.; Honig, B. J. Comput. Chem. 1991, 12, 435.

(27) Nicholls, A.; Sharp, K. A.; Honig, B. Delphi; Department of Biochemistry and Molecular Biophysics; Columbia University: New York, 1990.

(28) Nicholls, A., manuscript in preparation.

(29) (a) Hermann, R. B. J. Phys. Chem. 1971, 76, 2754. (b) Nozaki, Y.; Tanford, C. H. J. Biol. Chem. 1971, 246, 2211. (c) Chothia, C. J. Mol. Biol. 1976, 105, 1.

(30) Shrake, A.; Rupley, J. A. J. Mol. Biol. 1973, 79, 351.

(31) Sridharan, S.; Nicholls, A.; Honig, B. Biophys. J. 1992, 61, A174.

(32) Ben-Naim, A.; Marcus, Y. J. Chem. Phys. 1984, 81, 2016.

(33) DeYoung, L. R.; Dill, K. A. J. Phys. Chem. 1990, 94, 801.

(34) Sharp, K. A.; Nicholls, A.; Fine, R. M.; Honig, B. Science 1991, 252, 106.

(35) Sharp, K. A.; Nicholls, A.; Friedman, R.; Honig, B. Biochemistry 1991, 30, 9686.

(36) (a) Wolfenden, R.; Andersson, L.; Cullis, P. M.; Southgate, C. C. Biochemistry 1981, 20, 849. (b) Wolfenden, R. Biochemistry 1978, 17, 199.

(37) Hagler, A. T.; Huler, E.; Lifson, S. J. Am. Chem. Soc. 1973, 96, 5319.

(38) Brooks, B. R.; Bruccoleri, R. E.; Olafson, B. D.; States, D. J.; Swaminathan, S.; Karplus, M. J. Comput. Chem. 1983, 4, 187.

(39) Weiner, S. J.; Kollman, P. A.; Nguyen, D. T.; Case, D. A. J. Comput. Chem. 1986, 7, 230.

(40) Jorgensen, W. L.; Tirado-Rives, J. J. Am. Chem. Soc. 1988, 110, 1657.

(41) Chipot, C.; Maigret, B.; Rivail, J.-L.; Scheraga, H. A. J. Phys. Chem. 1992, 96, 10276.

(42) (a) Lias, S. G.; Liebman, J. F.; Levin, D. J. Phys. Chem. Ref. Data 1984. (b) Lias, S. G.; Bartmess, J. E.; Liebman, J. F.; Holmes, J. L.; Levin, R. D.; Mallard, W. G. J. Phys. Chem. Ref. Data, Suppl. 1 1988. (c) Gas Phase Ion Chemistry; Bowers, M. T., Ed.; Academic Press: New York, 1979.

(43) Bell, R. P. The Proton in Chemistry; Cornell University Press: Ithaca, NY, 1973.

(44) (a) CRC Handbook of Chemistry and Physics; Lide, D. R., Ed.; CRC Press: Boca Raton, FL, 1992. (b) McClellan, A. L. Tables of Experimental Dipole Moments; W. H. Freeman and Co.: San Francisco, 1989; Vol. 1-3.

(45) Cramer, C. J.; Truhlar, D. G. J. Comput. Mol. Des. 1992, 6, 629.

(46) (a) Jayaram, B.; Fine, R.; Sharp, K. A.; Honig, B. J. Phys. Chem. 1989, 93, 4320. (b) Gilson, M. K.; Davis, M. E.; Luty, B. A.; McCammon, J. A. J. Phys. Chem. 1993, 97, 3591. (c) Masterton, W. L. J. Phys. Chem. 1954, 22, 1830. (d) Terasawa, S.; Itsuki, H.; Arakawa, S. J. Phys. Chem. 1975, 79, 2345.

(47) Ramachandran, G. N.; Ramakrishnan, C.; Sasisekharan, V. J. Mol. Biol. 1963, 7, 95.

(48) (a) Adams, B.; Lerner, L. J. Am. Chem. Soc. 1992, 114, 4827. (b) Brady, J. W. J. J. Am. Chem. Soc. 1989, 111, 5155. (c) Nagy, P. I.; Dunn, W. J., III; Alagona, G.; Ghio, C. J. Phys. Chem. 1993, 97, 4628.

(49) Nagy, P. I.; Dunn, W. J. I.; Alagona, G.; Ghio, C. J. Am. Chem. Soc. 1991, 113, 6719.

(50) Carballeira, L.; Mosquera, R. A.; Rios, M. A.; Tovar, C. A. J. Mol. Struct. 1989, 193, 263.

(51) Nagy, P. I.; Dunn, W. J. I.; Alagona, G.; Ghio, C. J. Am. Chem. Soc. 1992, 114, 4752.

(52) Jorgensen, W. L.; Gao, J. J. Phys. Chem. 1986, 90, 2174.

(53) Rashin, A. A.; Namboodiri, K. J. Phys. Chem. 1987, 91, 6003.

(54) Price, S. L.; Faerman, C. H.; Murray, C. W. J. Comput. Chem. 1991, 12, 1187.

(55) Bellido, M. N.; Rullman, J. A. C. J. Comput. Chem. 1989, 10, 479.

(56) Sitkoff, D., manuscript in preparation.

(57) Still, W. C.; Tempczyk, A.; Hawley, R. C.; Hendrickson, T. J. Am. Chem. Soc. 1990, 112, 6127.

(58) Weiner, S. J.; Seibel, G. L.; Kollman, P. A. Proc. Natl. Acad. Sci. U.S.A. 1986, 83, 649.

(59) Lee, F. S.; Chu, Z. T.; Warshel, A. J. Comput. Chem. 1993, 14, 161.