
Complexity in Modeling and Understanding Protonation States: Computational Titration of HIV-1-Protease–Inhibitor Complexes

by Ashutosh Tripathi a), Micaela Fornabaio a), Francesca Spyrakis b), Andrea Mozzarelli b), Pietro Cozzini c), and Glen E. Kellogg* a) 1)

a) Department of Medicinal Chemistry & Institute for Structural Biology and Drug Discovery, Virginia Commonwealth University, Richmond, VA 23298-0540, USA (phone: +1-804-828-6452; fax: +1-804-828-3664; e-mail: [email protected])

b) Department of Biochemistry and Molecular Biology, University of Parma, I-43100 Parma

c) Molecular Modeling Laboratory, Department of General and Inorganic Chemistry, University of Parma, I-43100 Parma

The computational-titration (CT) algorithm based on the 'natural' Hydropathic INTeractions (HINT) force field is described. The HINT software model is an empirical, non-Newtonian force field derived from experimentally measured partition coefficients for solvent transfer between octanol and H2O (log Po/w). The CT algorithm allows the identification, modeling, and optimization of multiple protonation states of residues and ligand functional groups at the protein–ligand active site. The importance of taking into account pH and ionization states of residues, which strongly affect the process of ligand binding, for correctly predicting binding free energies is discussed. The application of the CT protocol to a set of six cyclic inhibitors in their complexes with HIV-1 protease is presented, and the advance of HINT as a virtual-screening tool is outlined.

Introduction. – A typical problem in modeling biomolecular systems or molecular-recognition systems is accurately modeling the energetics of binding. Thermodynamic analyses of protein–ligand interactions not only give vital insight into the free-energy changes of a system, but also reveal whether an interaction is enthalpy- or entropy-driven [1–3]. Calculation of binding free energy involves evaluation of both enthalpic and entropic contributions, and forms an integral part of structure-based drug-design protocols [4] [5]. To this end, computer simulations have striven to predict binding free energy and, concomitantly, interpret experimental data. The fundamental idea behind development of robust computational models is to incorporate as many of the physicochemical parameters defining protein–ligand interactions as possible [6]. These models can contribute significantly to the understanding of the structural and energetic basis of biomolecular interactions, with the goal of de novo predicting the binding free energy of protein–ligand complexes.

CHEMISTRY & BIODIVERSITY – Vol. 4 (2007) 2564

© 2007 Verlag Helvetica Chimica Acta AG, Zürich

1) Senior Fellow of the Center for the Study of Biological Complexity, Virginia Commonwealth University, Virginia.

Generally, enthalpic contributions are estimated by theoretical methods, knowledge-based potential functions, or parameters derived from experimental data [7]. The force fields used for calculations of intermolecular interactions and, by inference, free energy assume that steric and electrostatic forces are sufficient to account for the observed biological interactions [7]. However, these terms alone are never sufficient for accurate prediction of biomolecular interactions, as they do not always include solvation/desolvation effects. The entropic contributions to binding are much less well-defined and are often poorly quantitated or even ignored. Most approaches sum up these interactions separately as distinct enthalpic and entropic contributions, whereas, in reality, ligand–protein recognition is a concerted event, and thermodynamic quantities cannot simply be added up [8]. The phenomenon of solvation/desolvation within the protein active site, resulting in hydrophobic interactions and other phenomena, has been particularly difficult to model computationally. In most simulations, the system is treated as if it were in vacuum. However, developments in free-energy calculation over the past several years have made significant progress towards this goal [9] [10]. Among the methods currently available to calculate free energy, those based on complex and time-consuming molecular-dynamics (MD) simulations, with explicit consideration of H2O molecules, have been shown to correlate fairly well with free energy. Free-energy-perturbation (FEP) calculations, based on statistical mechanics, can predict the free energy of a system by analyzing ensemble averages (calculated by MD or Monte Carlo (MC) simulations) and treating solvent molecules and ions explicitly [11]. Unfortunately, besides being computationally expensive, calculation of free energy using MD or MC is somewhat compromised by errors and problems from a variety of sources [12]. Thus, computational time and uncertainty are compounded for complex biomolecular systems with solvents and counterions [12] [13].

Values of the free energy (ΔG) of binding, computed by sophisticated simulations, often do not correlate well with the experimental binding measurements. This could be due to several reasons. First, it is becoming common practice in the drug-discovery community to use assays generating IC50 values rather than equilibrium constants, and computational scientists often interchangeably use IC50 instead of Ki or Kd data to calculate free energies. Furthermore, the experimental binding data themselves present uncertainties, i.e., there are often differences of one order of magnitude (corresponding to ca. 1 kcal mol⁻¹) between inhibition data collected in different laboratories; these experimental data are, thus, not always of a quality to enable accurate quantitative correlations with computational data. On the other hand, structural data available from X-ray crystallography and NMR experiments have undeniably fostered an understanding of the biological complexity of molecular recognition. However, often only partial agreement between experimental and theoretical binding-energy data based on crystal structures has been observed.
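The link between an order-of-magnitude spread in inhibition data and an energy uncertainty follows from the standard relation ΔG = RT ln Kd. The following is a minimal sketch of that arithmetic, assuming T = 298.15 K and a 1 M standard state (neither is stated in the text); the function name is illustrative only. It does not address the separate problem that IC50 values are not equilibrium constants.

```python
import math

R_KCAL = 1.987204e-3  # molar gas constant in kcal mol^-1 K^-1


def binding_free_energy(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Return dG = RT ln(Kd) in kcal/mol for a dissociation constant in mol/L.

    Assumes a 1 M standard state; the default temperature is an assumption.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


# A tenfold difference in Kd shifts dG by RT ln(10), independent of Kd itself.
shift = binding_free_energy(1e-8) - binding_free_energy(1e-9)
print(f"dG(Kd = 10 nM) = {binding_free_energy(1e-8):.2f} kcal/mol")
print(f"shift per order of magnitude = {shift:.2f} kcal/mol")
```

Under these assumptions the shift is RT ln 10, roughly 1.4 kcal/mol, which is the same order as the ca. 1 kcal mol⁻¹ quoted above.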

It is difficult to correlate solution-binding data and crystallographic structural data because of a number of inherent experimental limitations. Biomolecular interactions are sensitive to subtle changes in experimental conditions such as pH, buffer, ionic strength, and temperature under which the data are collected [14]. While the pH used for making the binding measurements is generally perceived as the pH at which binding takes place, that pH may not actually be the optimum pH for binding. This discrepancy is magnified when crystals for X-ray analysis are grown under different pH conditions. Unless at very high resolution, an X-ray structure by itself reveals little direct information regarding the protonation states of the active-site residues in a protein. As protein–ligand specificity and stability are known to be sensitive to structural details, presumably because protonation states of ionizable residues and the details of the H-bonding network are very important for optimum interaction, it is imperative to identify, characterize, and understand the protonation states of residues. Even a change in the ionization state of a single residue or ligand functional group may have a profound effect on the results for structure-based energy calculations.

In general, molecular association depends on the ionic strength and pH of the solution [14]. Protein crystals may contain between 30 and 70% solvent [15], which includes the buffer solution as well. These solvent molecules and ions are distributed among the protein molecules of the crystal lattice. However, due to their inherent electrostatic properties, they may distribute themselves in different pH zones or solvent channels, depending on the nature of the residues lining these solvent pockets. Thus, pH influences ligand binding both directly, by changing the H-bonding character of ionizable site residues, and indirectly, by altering the shape and properties of the site with specifically bound solvent molecules [16] [17]. Another significant factor in quantitative estimation of interaction strength is the microscopic dielectric constant, which is almost never known [18]. While many approximations can be made for the dielectric constant of the protein interior [19] [20], the most accurate modeling of dielectric effects within proteins requires consideration of the atomic polarizabilities of the heterogeneous protein and the solvent (both H2O and counterions) [21] [22]. The size of a typical protein–ligand system renders approaches of this nature computationally very expensive; thus, simplified models that use macroscopic dielectric models, either uniform or distance-dependent, are generally applied [23].

In order to accurately predict binding free energies, it is fundamental to take into account pH as well as ionization and entropic contributions for virtual-screening experiments. Yet, in many biomolecular systems, the lack of extensive binding and inhibition data as a function of pH limits the likelihood of good correlation between calculated and measured binding data. In this manuscript, we have examined X-ray structures of HIV-1 protease in complex with several cyclic-urea and cyclic-sulfonamide inhibitor analogs. The binding of ligands to a biological macromolecule is made even more complicated by the presence of multiple ionizable groups, and the HIV-1-protease–ligand system has several [24]. Existing modeling techniques are insufficient to characterize atomic-level details of binding, and often do not consider the multiple protonation states and ensemble of protonation states that can exist in these systems, many of which are quite similar in energy. We have applied our computational-titration protocol, based on Hydropathic INTeractions (HINT), to analyze and identify the best protonation models for these complexes. Our main interest in developing a computational-titration algorithm is its ability to identify and optimize all possible protonation states so that rational models with atomic details can be constructed and applied to modeling ligand-binding energetics.

Results and Discussion. – Computational-Titration Algorithm. The computational-titration (CT) algorithm is based on the empirical HINT free-energy non-covalent force field [25], and involves modeling of optimum ionization conditions at the binding site [26] [27]. The HINT program calculates the free energies of association for modeled systems through a summation of hydrophobic atom constants that are derived from experimental log Po/w values (partition coefficient of a small molecule in octanol/H2O). Since log Po/w is an experimentally derived thermodynamic quantity, it directly correlates with the free energy of interaction and encodes all non-covalent interactions in the biological environment as well as solvent effects and entropy [28] [29]. The HINT model describes specific atom–atom interactions between two molecules, using the equation:

$$H_{\mathrm{TOT}} = \sum_i \sum_j b_{ij} = \sum_i \sum_j \left( a_i S_i \, a_j S_j \, R_{ij} T_{ij} + r_{ij} \right) \qquad (1)$$

where $a$ is the hydrophobic atom constant (derived from log Po/w), $S$ is the solvent-accessible surface area, $T$ is a function that differentiates polar–polar interactions (acid–acid, acid–base, or base–base), and $R$ and $r$ are functions of the distance between atoms $i$ and $j$ [28]. The interaction score $b_{ij}$ describes the specific atom–atom interaction between atoms $i$ and $j$, and $H_{\mathrm{TOT}}$ represents the total HINT interaction score. The HINT model is based on the assumption that each $b_{ij}$ approximates a partial $\delta g$ value for a specific atom–atom interaction; thus the total HINT score, which is the cumulative sum of all the atom–atom interactions, correlates with $\Delta G_{\mathrm{interaction}}$, i.e., $\Delta G = \sum (\delta g)$. The scoring function of HINT has been extensively used, and its applicability and validity in predicting the free energy of binding for protein–ligand and other systems have been illustrated previously [30–34].
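The structure of Eq. 1, a double sum of pairwise terms over the atoms of the two molecules, can be sketched as follows. This is only an illustration of the summation: the text does not give the functional forms of R, T, and r, so they are passed in as callables, and the `Atom` class, the function name, and the toy values in the usage example are hypothetical rather than part of the HINT parameterization.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Atom:
    a: float                          # hydrophobic atom constant a_i
    s: float                          # solvent-accessible surface area S_i
    xyz: Tuple[float, float, float]   # Cartesian coordinates


def hint_total(
    mol_1: Sequence[Atom],
    mol_2: Sequence[Atom],
    R: Callable[[float], float],       # distance function R_ij (form assumed by caller)
    T: Callable[[Atom, Atom], float],  # polar-polar discriminant T_ij (assumed)
    r: Callable[[float], float],       # distance function r_ij (assumed)
) -> float:
    """Sum b_ij = a_i S_i a_j S_j R_ij T_ij + r_ij over all intermolecular atom pairs."""
    total = 0.0
    for atom_i in mol_1:
        for atom_j in mol_2:
            d = math.dist(atom_i.xyz, atom_j.xyz)
            b_ij = atom_i.a * atom_i.s * atom_j.a * atom_j.s * R(d) * T(atom_i, atom_j) + r(d)
            total += b_ij
    return total


# Toy usage with placeholder functions (NOT the HINT definitions).
protein = [Atom(a=1.0, s=2.0, xyz=(0.0, 0.0, 0.0))]
ligand = [Atom(a=0.5, s=4.0, xyz=(1.0, 0.0, 0.0))]
score = hint_total(protein, ligand,
                   R=lambda d: math.exp(-d),
                   T=lambda i, j: 1.0,
                   r=lambda d: 0.0)
print(f"toy score = {score:.4f}")
```

Passing R, T, and r as arguments keeps the sketch honest about what is unknown here: only the bookkeeping of the pairwise sum is taken from Eq. 1.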

The CT methodology allows exploration of the ionization states of active-site residues and ligand functional groups as a function of protonation. This novel method, implemented in the HINT software, involves building and scoring of distinct protonation models, i.e., a defined ionization state for each relevant residue or ligand functional group. First, all the possible ionization states of residues and ligand functional groups are enumerated, with user choices of residue types [26] [32] [35]. Fig. 1 indicates how this is accomplished. Each box represents a potential position for a proton (H+). With no added H+, there is only one model, but with one added H+, there are 14 unique models, etc. In the example of Fig. 1, one of the ligand's amino groups and one of the two carboxylate O-atoms of Asp125 are protonated. The acidic residues Asp and Glu are usually subjected to titration; the basic residues Lys and Arg are usually kept in their protonated states, while His is normally selected for titration. The CT algorithm further allows selection of Tyr and Cys for titration in cases where exploring the ionization of these residues is desired. The analogous functional groups on the ligands, i.e., carboxylic acids, amines, aromatic alcohols, and thiols, are user-selectable for titration. Cofactors or H2O molecules are also taken into account.

The HINT titration algorithm identifies and exhaustively optimizes the H2O molecules that are in potential bridging positions, i.e., within a 4 Å radius from both protein and ligand, thus being able to interact with both [36]. During the computation, each model corresponding to a particular protonation state is built and then optimized. Here, all rotatable bonds involving polar H-atoms (R−XHn) – including those newly created via protonations, such as the OH groups of carboxylic acids – at both the protein active site and on the ligand will be examined and exhaustively optimized [26] to maximize H-bond formation, i.e., by rotation of these bonds through the entire 360° space. In addition, Asn and Gln side-chain amides are rotated by ±180° and oriented for optimal interaction. Simply, this algorithm creates (proton) rotameric models that are isocrystallographic in that all of the models should fit within the experimental electron-density envelopes and be indistinguishable. Next, the algorithm calculates the HINT score of each rotameric model and creates a table of HINT score as a function of site charges. From these data, a 'titration curve' can be calculated. To translate the HINT scores to free energy, we used the following general equation [31]:

$$\Delta G = -0.00195 \, H_{\mathrm{TOT}} - 5.543 \qquad (2)$$
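As a quick illustration of the scale implied by Eq. 2, the conversion and its inverse can be written as two one-line functions. The function names are illustrative; the coefficients are those of Eq. 2, with ΔG in kcal mol⁻¹.

```python
def hint_to_free_energy(h_tot: float) -> float:
    """Eq. 2: dG (kcal/mol) = -0.00195 * H_TOT - 5.543."""
    return -0.00195 * h_tot - 5.543


def free_energy_to_hint(dg: float) -> float:
    """Inverse of Eq. 2: the HINT score that maps onto a given dG."""
    return (dg + 5.543) / -0.00195


# The slope fixes how many HINT score units correspond to 1 kcal/mol.
units_per_kcal = 1.0 / 0.00195
print(f"HINT score units per kcal/mol: {units_per_kcal:.0f}")
print(f"H_TOT equivalent to dG = -11.82 kcal/mol: {free_energy_to_hint(-11.82):.0f}")
```

With this slope, roughly 513 HINT score units correspond to 1 kcal mol⁻¹, so higher (more positive) scores map onto more favorable binding free energies.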

Finally, the statistical-thermodynamics Boltzmann energy partition function is applied to each site charge to calculate population-weighted averages of the binding free energy for each site charge. The Maxwell–Boltzmann distribution from statistical mechanics forms the basis of understanding classical molecular phenomena in terms of how energy is distributed in an average sample of states. Mathematically, the Boltzmann distribution can be expressed in the form:

$$f(E) = A \, e^{-E/kT} \qquad (3)$$

Here, A is a normalization constant, k is the Boltzmann constant, and T is the absolute temperature. The distribution gives the probability of any molecule existing in an energy state E as a function of its free energy. In our case, it gives the probability of an ionization state existing in a particular energy state as a function of that state's free energy. The probability decreases exponentially as the free energy of the state increases. The Boltzmann average energy weights the energies of each state by these probabilities and is more representative of the overall ensemble energy than a normal average.
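The difference between the normal and the Boltzmann average can be made concrete with a short sketch. It assumes T = 298.15 K (the text does not state the temperature used) and uses the molar gas constant in place of k because the energies are per mole; the model energies in the usage example are hypothetical, not values from this study.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1 (k per mole of states)


def boltzmann_average(energies, temperature_k=298.15):
    """Population-weighted mean of energies (kcal/mol) with weights exp(-E/RT)."""
    if not energies:
        raise ValueError("need at least one energy")
    rt = R_KCAL * temperature_k
    e_min = min(energies)  # shift avoids overflow; the factor cancels in the ratio
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    partition = sum(weights)
    return sum(w * e for w, e in zip(weights, energies)) / partition


# Hypothetical free energies of four protonation models sharing one site charge.
models = [-8.0, -9.5, -11.0, -12.0]
normal = sum(models) / len(models)
weighted = boltzmann_average(models)
print(f"normal average    = {normal:.2f} kcal/mol")
print(f"Boltzmann average = {weighted:.2f} kcal/mol")
```

Because low-energy models carry exponentially larger weights, the Boltzmann average lies at or below the arithmetic mean and approaches the energy of the best model when the spread is large compared with RT.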

Ionization-State Ensemble of HIV-1 Protease. HIV-1 protease has been widely studied, not only because of its crucial role in propagation of the AIDS virus, but also as a prototypical target enzyme vital for proteolytic cleavage of viral proteins. It is a homodimeric aspartyl protease with 99 amino acid residues in each subunit [24]. The active site of the enzyme is situated at the interface between the two monomers. A conserved H2O (w) molecule, w301, located at the HIV-1 protease symmetry axis and bridging the two subunits, has been observed in all HIV-protease–ligand complexes, H-bonded to Ile50 and Ile150 as well as to specific inhibitors. Two other H2O molecules, w313 [37] and the pseudo-symmetric w313', are largely conserved, and are crystallographically detected in most HIV-1-protease–ligand complexes. Initially, protease inhibitors, mostly linear peptide analogues, were designed such that they coordinated with w301, linking them to the amide H-atoms of Ile50 and Ile150 on the flaps of the protease dimer. It was observed from several studies that this specific H2O molecule is crucial for the binding of these inhibitors [38]. Thus, it was hypothesized that incorporation of the binding features of this structural H2O molecule into an inhibitor would energetically favor the binding interactions, increasing both binding affinity and specificity, since this structural H2O molecule is unique to the aspartyl proteases and since its substitution would be thermodynamically favorable, at least partly, due to increased entropy.

Fig. 1. Schematic representation of active-site ionizable residues and ligand functional groups subjected to computational titration (exemplified for the PDB complex 1DMP). Boxes indicate potential protonation sites, i.e., potential H+ positions. The sole protonation site on the NH2 groups is indicated by a solid box, whereas the two potential protonation sites on the Asp residues are indicated by dotted boxes (H+ can be placed on either carboxylate O-atom, but not on both). The illustrated model has two added H+: one ligand NH2 group and one of the two carboxylate O-atoms of Asp125 of the protease are protonated ('occupied' solid boxes); the other carboxylate is unavailable for protonation.

Meticulous design of cyclic-urea inhibitors of HIV-1 protease, specifically designed to displace w301, was reported by Lam et al. [39] in 1994. An essential feature of this class of analogues was a C=O group mimicking the H-bonding features of the key H2O molecule. Furthermore, the conversion of the flexible linear peptidic inhibitors into rigid, cyclic structures with restricted conformations provided additional favorable entropic benefits. The preferred conformations and stereochemistry of these inhibitors (Fig. 2 below) were confirmed by X-ray crystallography [39] [40]. The crystal structures also revealed the H-bonds between the ligand diol groups and the carboxylates of the catalytic aspartates (Asp25 and Asp125) that serve to anchor the scaffold in the active site. However, since protons are normally not detected by X-ray crystallography, there is experimental uncertainty in the correct assignment of the protonation state of the catalytic dyad. Careful analysis of all the possible protonation states can reveal more information about H-bonding, including the possible existence of alternative H-bonding networks. This understanding could lead to the design of better inhibitors with greater binding affinity.

In the current study, we examined six HIV-1-protease–inhibitor complexes (Fig. 2), including three cyclic-urea derivatives, SD146 (PDB code 1QBT) [41], DMP450 (1DMP) [42], and Aha001 (1AJX) [43], as well as three cyclic sulfonamides, Aha024 (1G35) [44], Aha047 (1G2K) [44], and Aha006 (1AJV) [43]. For the cyclic-urea analogs, the C=O O-atom substitutes the position occupied by the O-atom of w301 and, thus, forms H-bonds with Ile50 and Ile150. For the sulfonamide analogs, both O-atoms of the sulfonamide group are engaged in H-bonding, with one O-atom being H-bonded to the amide N-atom of Ile50, and the other to the amide N-atom of Ile150. Further, the vicinal diols are engaged in H-bonding networks with the catalytic aspartates.

The active site of HIV-1 protease presents a number of ionizable residues ideally suited for a CT protocol. In the six protease–ligand complexes, ionizable residues located within 8 Å of the ligand were chosen for titration. The contributions of w313 and w313' were also included in the calculations. Note that, in a sense, H2O, being both a potential H-bond donor and acceptor, may act as a pH buffer by re-orientating after a protonation change on a neighboring functional group. Thus, allowing it to freely rotate is an important component of the CT algorithm. For all complexes, the active-site residues Asp25, Asp29, and Asp30 on chain A, and Asp125, Asp129, and Asp130 on chain B were selected for titration; among the ligands, only the NH2 groups of DMP450 were subjected to titration. The DMP450 complex (PDB code 1DMP), having a total of eight ionizable functional groups (six on the protein and two on the ligand), yields a total of 2916 protonation models for building and scoring. All the remaining complexes have six ionizable residues at the protein active site, yielding 729 protonation models.
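The model counts quoted here can be reproduced by brute-force enumeration. The sketch below assumes, following the description of Fig. 1, three distinguishable states per Asp (deprotonated, or H+ on either carboxylate O-atom) and two per ligand NH2 group (neutral or protonated); the names are illustrative and this is not the HINT implementation.

```python
from collections import Counter
from itertools import product

# Each tuple lists, per distinguishable state of a site, the number of added protons.
ASP_STATES = (0, 1, 1)    # deprotonated; H+ on first O-atom; H+ on second O-atom
AMINE_STATES = (0, 1)     # NH2; NH3+


def models_per_site_charge(n_asp: int, n_amine: int) -> dict:
    """Count protonation models by net site charge.

    The fully deprotonated site (all Asp anionic, amines neutral) has
    charge -n_asp; every added proton raises the charge by one.
    """
    sites = [ASP_STATES] * n_asp + [AMINE_STATES] * n_amine
    base_charge = -n_asp
    counts = Counter(base_charge + sum(model) for model in product(*sites))
    return dict(sorted(counts.items()))


dmp450 = models_per_site_charge(n_asp=6, n_amine=2)   # 1DMP
others = models_per_site_charge(n_asp=6, n_amine=0)   # remaining five complexes
print("1DMP  :", dmp450, "total", sum(dmp450.values()))
print("others:", others, "total", sum(others.values()))
```

The totals are 3^6 × 2^2 = 2916 and 3^6 = 729, and the per-charge counts (for example, 14 models with one added H+ for 1DMP) match the model counts listed in Table 1.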

The results of the 'titration' are shown in Fig. 3 for each of the six complexes investigated, the HINT scores for each protonation model being plotted as a function of site charge. The normal (arithmetic) average and the Boltzmann average (statistically weighted average based on site populations) are calculated for each site charge, and the corresponding 'titration curves' are obtained. Details of the CT calculations, i.e., number of models, normal-average and Boltzmann-average free energies, are listed for each site charge in Table 1.

Fig. 2. Structures of cyclic inhibitors of HIV-1 protease. In parentheses, the PDB codes of the corresponding ligand–protein complexes are given.

Fig. 3. Computational-titration results for the HIV-1-protease–inhibitor complexes analyzed. PDB Numbers: a) 1QBT, b) 1DMP, c) 1AJX, d) 1G35, e) 1G2K, f) 1AJV. All HINT scores are plotted as a function of site charge; in addition, normal averages (open triangles) and Boltzmann averages (closed squares) are depicted.

With the exception of the 1QBT complex with SD146 as ligand, the titration curves for all the complexes show a bell-like shape, particularly evident in the complex with DMP450 (Fig. 3,b). The titration curve for the 1QBT complex (Fig. 3,a) has a different trend, and while the free energy diminishes with protonation, the curve does not reach a minimum. The ligands considered in this study have mostly similar chemical structure and size, but SD146 (1QBT) is bulkier, with two substituted benzimidazole groups. This complex has the most favorable experimental binding free energy, but its calculated energy is underestimated by the HINT scoring function. This could be due to the structure of the ligand itself, possibly the protonation state of the benzimidazole groups, which are not subjected to titration with our current algorithm. The lowest free-energy point determined by Boltzmann statistics – the minimum of the Boltzmann-average titration curve – is what we define to be the 'optimal' calculated free energy of binding, representing the most-favorable site charge and, by inference, pH for the specific ligand binding.
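Locating the 'optimal' site charge amounts to finding the lowest point of each Boltzmann-average curve. The sketch below does this for the Boltzmann-average values of Table 1; the edge check is an added, illustrative heuristic for flagging a curve whose lowest point lies at an end of the titrated range, as for 1QBT.

```python
# Boltzmann-average free energies (kcal/mol) per site charge, copied from Table 1.
BOLTZMANN_CURVES = {
    "1QBT": {-6: -4.67, -5: -5.11, -4: -5.55, -3: -5.95, -2: -6.32, -1: -6.65, 0: -6.91},
    "1DMP": {-6: -8.28, -5: -10.09, -4: -11.50, -3: -11.79, -2: -11.82,
             -1: -11.61, 0: -11.23, 1: -10.69, 2: -10.02},
    "1AJX": {-6: -7.41, -5: -7.79, -4: -7.98, -3: -8.13, -2: -8.21, -1: -8.22, 0: -8.10},
    "1G35": {-6: -6.65, -5: -6.82, -4: -7.04, -3: -7.19, -2: -7.30, -1: -7.34, 0: -7.31},
    "1G2K": {-6: -6.36, -5: -6.71, -4: -7.04, -3: -7.34, -2: -7.59, -1: -7.74, 0: -7.67},
    "1AJV": {-6: -6.94, -5: -7.35, -4: -7.66, -3: -7.89, -2: -8.05, -1: -8.15, 0: -8.11},
}


def optimal_site_charge(curve):
    """Return (site charge, dG, at_edge) for the lowest point of a titration curve.

    at_edge is True when the lowest point is the first or last site charge,
    i.e. when no interior minimum was reached within the titrated range.
    """
    charge = min(curve, key=curve.get)
    at_edge = charge in (min(curve), max(curve))
    return charge, curve[charge], at_edge


for pdb, curve in BOLTZMANN_CURVES.items():
    charge, dg, at_edge = optimal_site_charge(curve)
    note = " (curve end point; no true minimum)" if at_edge else ""
    print(f"{pdb}: optimal site charge {charge:+d}, dG = {dg:.2f} kcal/mol{note}")
```

For these data the minimum falls at −2 for 1DMP and at −1 for 1AJX, 1G35, 1G2K, and 1AJV, while 1QBT is flagged because its lowest value sits at the most positive site charge sampled.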

Fig. 4 illustrates the best identified model for the HIV-1-protease complex with the cyclic-urea inhibitor DMP450, corresponding to the titration curve shown in Fig. 3,b. While the 'best' model is the one with the highest HINT score and presumably represents the protonation model corresponding to the best binding, it must be highlighted that there are many models with similar energy, and that it is likely that many of them actually coexist, especially at room temperature or above, where binding assays are typically performed. For the complex with PDB code 1DMP, the best model corresponds to a site charge of −3, where the two peripheral NH2 groups of the ligand and Asp125 of the protein are protonated. As expected, Asp25 and Asp125 are engaged in a complex network of H-bonds with the vicinal-diol function of the ligand. The ligand's two NH3+ groups form H-bonds with Asp29/Asp30/w313 and Asp129/Asp130/w313', respectively. The deprotonated Asp29 and Asp129 residues are also involved in H-bonds with w313 and w313', respectively.

The experimental and calculated binding free energies for the examined HIV-1-protease–ligand complexes are reported in Table 2. The 'optimal' site charges, i.e., the optimal protonation states that correspond to the optimal binding energies calculated by Boltzmann statistical analysis, are also reported. The results are encouraging, producing an average error of ±2.5 kcal mol⁻¹ in the prediction of binding energy, excluding the outlier complex with SD146 (PDB code 1QBT), where the calculated energy is underestimated by the CT protocol and which displays an atypical titration profile (Fig. 3,a). It is important to recognize that using an equation correlating HINT scores with binding energies that is calibrated for the specific analyzed system, instead of a general equation, would likely improve free-energy prediction.
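The quoted average error can be checked directly from the values in Table 2. The following sketch computes the mean absolute difference between calculated and experimental ΔG, with and without the 1QBT outlier; the function name is illustrative.

```python
# (dG_exper, dG_calc) in kcal/mol, copied from Table 2.
TABLE_2 = {
    "1QBT": (-14.44, -6.91),
    "1DMP": (-12.99, -11.82),
    "1AJX": (-10.79, -8.22),
    "1G35": (-11.06, -7.34),
    "1G2K": (-10.82, -7.74),
    "1AJV": (-10.52, -8.15),
}


def mean_abs_error(data, exclude=()):
    """Mean |dG_calc - dG_exper| over all entries whose PDB code is not excluded."""
    errors = [abs(calc - exper)
              for pdb, (exper, calc) in data.items()
              if pdb not in exclude]
    return sum(errors) / len(errors)


mae_without_outlier = mean_abs_error(TABLE_2, exclude=("1QBT",))
mae_all = mean_abs_error(TABLE_2)
print(f"mean absolute error without 1QBT: {mae_without_outlier:.2f} kcal/mol")
print(f"mean absolute error, all six:     {mae_all:.2f} kcal/mol")
```

Excluding 1QBT, the mean absolute error is about 2.6 kcal mol⁻¹, in line with the figure quoted above. In every case the calculated value is less negative than the experimental one, so the deviation is systematic rather than scattered around zero.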

Table 1. Computational-Titration Results for Complexes between HIV-1 Protease and Cyclic-Urea or Cyclic-Sulfonamide Ligands (see Fig. 2)

| PDB Entry | Site charge | Model count | Normal average a) [kcal mol⁻¹] | Boltzmann average a) [kcal mol⁻¹] |
|-----------|-------------|-------------|--------------------------------|-----------------------------------|
| 1QBT | −6 | 1 | −4.67 | −4.67 |
| | −5 | 12 | −5.02 | −5.11 |
| | −4 | 60 | −5.39 | −5.55 |
| | −3 | 160 | −5.72 | −5.95 |
| | −2 | 240 | −6.01 | −6.32 |
| | −1 | 192 | −6.29 | −6.65 |
| | 0 | 64 | −6.53 | −6.91 |
| 1DMP | −6 | 1 | −8.28 | −8.28 |
| | −5 | 14 | −8.56 | −10.09 |
| | −4 | 85 | −8.83 | −11.50 |
| | −3 | 292 | −9.05 | −11.79 |
| | −2 | 620 | −9.23 | −11.82 |
| | −1 | 832 | −9.37 | −11.61 |
| | 0 | 688 | −9.49 | −11.23 |
| | 1 | 320 | −9.59 | −10.69 |
| | 2 | 64 | −9.71 | −10.02 |
| 1AJX | −6 | 1 | −7.40 | −7.41 |
| | −5 | 12 | −7.68 | −7.79 |
| | −4 | 60 | −7.81 | −7.98 |
| | −3 | 160 | −7.93 | −8.13 |
| | −2 | 240 | −7.99 | −8.21 |
| | −1 | 192 | −8.00 | −8.22 |
| | 0 | 64 | −7.95 | −8.10 |
| 1G35 | −6 | 1 | −6.65 | −6.65 |
| | −5 | 12 | −6.69 | −6.82 |
| | −4 | 60 | −6.89 | −7.04 |
| | −3 | 160 | −6.95 | −7.19 |
| | −2 | 240 | −6.98 | −7.30 |
| | −1 | 192 | −6.97 | −7.34 |
| | 0 | 64 | −6.93 | −7.31 |
| 1G2K | −6 | 1 | −6.36 | −6.36 |
| | −5 | 12 | −6.57 | −6.71 |
| | −4 | 60 | −6.78 | −7.04 |
| | −3 | 160 | −6.92 | −7.34 |
| | −2 | 240 | −7.03 | −7.59 |
| | −1 | 192 | −7.10 | −7.74 |
| | 0 | 64 | −7.11 | −7.67 |
| 1AJV | −6 | 1 | −6.94 | −6.94 |
| | −5 | 12 | −7.17 | −7.35 |
| | −4 | 60 | −7.32 | −7.66 |
| | −3 | 160 | −7.36 | −7.89 |
| | −2 | 240 | −7.31 | −8.05 |
| | −1 | 192 | −7.17 | −8.15 |
| | 0 | 64 | −6.93 | −8.11 |

a) HINT scores converted into free energies using Eqn. 2 [31].

Fig. 4. Best model for HIV-1 protease in complex with a cyclic-urea inhibitor (DMP450; PDB 1DMP). Ligand: ball-and-stick representation (colored by atom type); key active-site residues: titratable Asp in stick representation (colored by atom type), Ile50 and Ile150 in stick representation (orange), and H2O molecules w313 and w313' in stick representation (magenta). Asp125 and the two ligand NH2 groups are protonated in this model; H-bonding network at active site: yellow dashed lines.

Table 2. Experimental and Calculated Optimal (Boltzmann-weighted average) Binding Free Energies (ΔG, in kcal mol⁻¹) for HIV-1-Protease–Ligand Complexes

| PDB | Ligand | ΔG exper a) | ΔG calc b) | Optimal site charge |
|------|--------|-------------|------------|---------------------|
| 1QBT | SD146 | −14.44 | −6.91 | 0 c) |
| 1DMP | DMP450 | −12.99 | −11.82 | −2 |
| 1AJX | Aha001 | −10.79 | −8.22 | −1 |
| 1G35 | Aha024 | −11.06 | −7.34 | −1 |
| 1G2K | Aha047 | −10.82 | −7.74 | −1 |
| 1AJV | Aha006 | −10.52 | −8.15 | −1 |

a) Data taken from [46]. b) HINT scores converted into free energies using Eqn. 2 [31]. c) 1QBT has not reached a true titration-curve minimum. This is the value at the curve's lowest point, thus the 'optimal' site charge may be at a more positive value (see Fig. 3,a).

Conclusions. – The computational protocol described here allows modeling of the multiplicity of protonation states in ligand–protein complexes, an often overlooked aspect of structure that has implications for drug discovery. This approach allows generating hypotheses on the best model for binding, i.e., the model with protonation corresponding to the optimal binding energy. The binding energy is evaluated with the HINT scoring function, which has previously been shown to fairly accurately predict binding free energies [45]. In considering all the ionizable residues at the active site, and by modeling all the possible protonation states of residues and functional groups at this site, the computational-titration algorithm represents fairly realistically the fluxional behavior of labile H-atoms in solvated biological systems. Functional H-atoms, in fact, are not static elements of a real biomolecular system. However, the computational titration remains computationally expensive. The number of models increases rapidly as the number of ionizable residues/functional groups in the binding pocket increases. The inclusion of H2O molecules in the calculations makes these procedures even more time-consuming. Nevertheless, we are committed to making this tool available and are currently developing a web-enabled version that will be publicly accessible.

We gratefully acknowledge support by an NIH grant (GM71894, to G. E. K.) and funds from the Italian Ministry of Instruction, University, and Research within an internationalization project (to A. M.). We appreciate the guidance and support of Prof. Donald J. Abraham in all of these studies.

Experimental Part

Optimized molecular models of the HIV-1-protease–ligand complexes in this work were taken from our previous study on the contribution of H2O molecules to the energetics of protease–ligand binding [45]. In that study, the GRID program [46] was used to identify and place H2O molecules in favorable locations that were unoccupied due to crystallographic uncertainty. In the cyclic-urea complexes, only the complex with Aha001 (PDB 1AJX) had crystallographically detected H2O molecules (w313 and w313’), while these were not experimentally reported for the complexes with DMP450 (PDB 1DMP) and SD146 (PDB 1QBT), and were positioned with GRID. In the complexes with cyclic sulfonamides, by contrast, crystallographic w313 and w313’ were experimentally confirmed for the three complexes 1AJV, 1G35, and 1G2K, except for w313’ in 1G2K, which was positioned with GRID.

The modeling programs Sybyl 7.2 (www.tripos.com) and HINT 3.11S were used in this study. In the HINT computational-titration (CT) protocol, the protein, ligand, and H2O were partitioned as distinct molecules: only H-atoms deemed ‘semi-essential’, i.e., only those attached to polar atoms (N, O, S, P) and those attached to unsaturated C-atoms, were explicitly considered in the model and assigned HINT constants. The inferred solvent model, where each residue is partitioned based on its H-atom count, was applied. The solvent-accessible surface area for the amide N-atoms of the protein backbone was corrected with the ‘+20’ option. All Asp, Glu, and His residues within an 8 Å radius of the ligand and the ionizable functional groups on the ligand (amino, phosphorous, and carboxylate groups) were selected for titration. In this study, only aspartates were present at the HIV-1 active site, and only one ligand had a titratable amino group.
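The residue-selection step can be sketched as follows. This is only an illustration under stated assumptions: the 8 Å criterion is interpreted here as "any residue atom within 8 Å of any ligand atom", and the function name and the simple tuple-based atom records are hypothetical rather than part of Sybyl or HINT.

```python
import math

# Residue types considered for titration, as named in the text.
TITRATABLE = {"ASP", "GLU", "HIS"}


def select_titratable(protein_atoms, ligand_coords, cutoff=8.0):
    """Return sorted (resname, resid) pairs of titratable residues near the ligand.

    protein_atoms: iterable of (resname, resid, (x, y, z)) records (hypothetical format).
    ligand_coords: iterable of (x, y, z) ligand atom positions.
    A residue is kept if any of its atoms lies within `cutoff` angstroms of any
    ligand atom (an assumed reading of the 8 A radius criterion).
    """
    ligand_coords = list(ligand_coords)
    selected = set()
    for resname, resid, xyz in protein_atoms:
        if resname not in TITRATABLE:
            continue
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_coords):
            selected.add((resname, resid))
    return sorted(selected, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Toy coordinates, not taken from any PDB entry.
    protein = [
        ("ASP", 25, (0.0, 0.0, 0.0)),
        ("ASP", 125, (3.0, 4.0, 0.0)),
        ("GLU", 35, (20.0, 0.0, 0.0)),
        ("ILE", 50, (1.0, 0.0, 0.0)),
    ]
    ligand = [(0.0, 0.0, 1.0)]
    print(select_titratable(protein, ligand))
```

Ligand functional groups would be flagged separately, since they are selected by chemical type rather than by distance.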

Optimization of each protonation-state model focuses on the dihedral angles of the R–XHn bonds, which are exhaustively optimized by forcing a full 360° rotation. This rotation positions the polar H-atoms for optimum H-bonding and intermolecular HINT energy score. The procedure targets primary amino (NH2), hydroxy (OH), and sulfanyl (SH) groups on both protein and ligand. Note that the OH groups of non-ionized COOH groups are also optimized in this way. In His, the imidazole ring is flipped to optimize H-bonding, as are the terminal amide functions of Asn and Gln. All the possible models were composed and scored using Eqn. 1. Finally, HINT scores were plotted as a function of pH for all the models. Normal and Boltzmann-weighted averages were then obtained for each protonation level.
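The averaging step can be sketched as below. The sketch assumes that each model has already been assigned a binding free energy in kcal mol⁻¹, that a protonation level is identified by the net site charge, and that the weights are evaluated at 298.15 K; the function names and the temperature are illustrative assumptions, not settings documented for HINT.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal mol^-1 K^-1


def boltzmann_average(energies, temperature=298.15):
    """Boltzmann-weighted mean of free energies (kcal/mol); lower energy weighs more."""
    rt = R_KCAL * temperature
    e_min = min(energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    return sum(w * e for w, e in zip(weights, energies)) / sum(weights)


def averages_by_level(models, temperature=298.15):
    """Group (site_charge, energy) pairs and return {charge: (normal, boltzmann)}."""
    grouped = {}
    for site_charge, energy in models:
        grouped.setdefault(site_charge, []).append(energy)
    return {
        charge: (sum(es) / len(es), boltzmann_average(es, temperature))
        for charge, es in sorted(grouped.items())
    }


if __name__ == "__main__":
    # Toy energies for illustration only; not values from this study.
    toy_models = [(-1, -8.0), (-1, -6.0), (-2, -10.0), (-2, -5.0)]
    for charge, (normal, weighted) in averages_by_level(toy_models).items():
        print(charge, round(normal, 2), round(weighted, 2))
```

Because the weights decay exponentially with energy, the Boltzmann-weighted average at each level is dominated by the most favorable models, whereas the normal average treats every model equally.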

REFERENCES

[1] Ajay, M. A. Murcko, J. Med. Chem. 1995, 38, 4953.
[2] H. J. Böhm, G. Klebe, Angew. Chem., Int. Ed. 1996, 35, 2588.
[3] S. K. Burt, C. W. Hutchins, J. Greer, Curr. Opin. Struct. Biol. 1991, 1, 213.
[4] C. Lee, Curr. Opin. Struct. Biol. 1992, 2, 217.
[5] K. P. Murphy, Med. Res. Rev. 1999, 19, 333.

[6] P. Kollman, Acc. Chem. Res. 1996, 2, 461.
[7] H. Gohlke, G. Klebe, Angew. Chem., Int. Ed. 2002, 41, 2644.
[8] K. A. Dill, J. Biol. Chem. 1997, 272, 701.
[9] F. J. Luque, C. Curutchet, J. M. Muriedas, A. Bidon-Chanal, I. Soteras, A. Morreale, J. L. Gelpi, M. Orozco, Phys. Chem. Chem. Phys. 2003, 5, 3827.
[10] M. R. Reddy, A. Krzysztof, in ‘Free Energy Calculations in Rational Drug Design’, Ed. M. R. Reddy, M. D. Erion, Springer, Kluwer Academic/Plenum Publishers, New York, 2001, p. 317.
[11] S. Thomas, Comp. Biochem. Biophys. 2001, 169.
[12] M. R. Reddy, M. D. Erion, A. Agarwal, Rev. Comput. Chem. 2000, 16, 217.
[13] W. Wang, O. Donini, C. M. Reyes, P. A. Kollman, Annu. Rev. Biophys. Biomol. Struct. 2001, 30, 211.
[14] J. Antosiewicz, J. A. McCammon, M. K. Gilson, J. Mol. Biol. 1994, 238, 415.
[15] B. W. Matthews, J. Mol. Biol. 1968, 33, 491.
[16] E. Tajkhorshid, S. Suhai, J. Mol. Struct. (Theochem) 2000, 501–502, 297.
[17] R. J. P. Williams, Pure Appl. Chem. 1983, 55, 35.
[18] B. Honig, A. Nicholls, Science 1995, 268, 1144.
[19] J. Åqvist, J. Comput. Chem. 1996, 17, 1587.
[20] J. Mazur, R. L. Jernigan, Biopolymers 2005, 31, 1615.
[21] M. J. E. Sternberg, F. R. F. Hayes, A. J. Russell, P. G. Thomas, A. R. Fersht, Nature 1987, 330, 86.
[22] M. Feig, C. L. Brooks, Curr. Opin. Struct. Biol. 2004, 14, 217.
[23] E. L. Mehler, T. Solmajer, Protein Eng. Des. Sel. 1991, 4, 903.
[24] M. A. Navia, P. M. Fitzgerald, B. M. McKeever, C. T. Leu, J. C. Heimbach, W. K. Herber, I. S. Sigal, P. L. Darke, J. P. Springer, Nature 1989, 337, 615.
[25] G. E. Kellogg, J. C. Burnett, D. J. Abraham, J. Comput.-Aided Mol. Des. 2001, 15, 381.
[26] G. E. Kellogg, M. Fornabaio, D. L. Chen, D. J. Abraham, F. Spyrakis, P. Cozzini, A. Mozzarelli, J. Mol. Graph. Mod. 2006, 24, 434.
[27] G. E. Kellogg, M. Fornabaio, F. Spyrakis, A. Lodola, P. Cozzini, A. Mozzarelli, D. J. Abraham, J. Mol. Graph. Mod. 2004, 22, 479.
[28] G. E. Kellogg, D. J. Abraham, Eur. J. Med. Chem. 2000, 35, 651.
[29] D. J. Abraham, G. E. Kellogg, J. M. Holt, G. K. Ackers, J. Mol. Biol. 1997, 272, 613.
[30] J. C. Burnett, P. Botti, D. J. Abraham, G. E. Kellogg, Proteins 2001, 42, 355.
[31] P. Cozzini, M. Fornabaio, A. Marabotti, D. J. Abraham, G. E. Kellogg, A. Mozzarelli, J. Med. Chem. 2002, 45, 2469.
[32] M. Fornabaio, P. Cozzini, A. Mozzarelli, D. J. Abraham, G. E. Kellogg, J. Med. Chem. 2003, 46, 4487.
[33] R. Gussio, D. W. Zaharevitz, C. F. McGrath, N. Pattabiraman, G. E. Kellogg, C. Schultz, A. Link, C. Kunick, M. Leost, L. Meijer, E. A. Sausville, Anticancer Drug Des. 2000, 15, 53.
[34] D. J. Cashman, J. N. Scarsdale, G. E. Kellogg, Nucleic Acids Res. 2003, 31, 4410.
[35] F. Spyrakis, M. Fornabaio, P. Cozzini, A. Mozzarelli, D. J. Abraham, G. E. Kellogg, J. Am. Chem. Soc. 2004, 126, 11764.
[36] G. E. Kellogg, D. L. Chen, Chem. Biodiv. 2004, 1, 98.
[37] H. Jhoti, O. M. P. Singh, M. P. Weir, R. Cooke, P. Murray-Rust, A. Wonacott, Biochemistry 1994, 33, 8417.
[38] A. Wlodawer, J. Vondrasek, Annu. Rev. Biophys. Biomol. Struct. 1998, 27, 249.
[39] P. Y. S. Lam, P. K. Jadhav, C. J. Eyermann, C. N. Hodge, Y. Ru, L. T. Bacheler, J. L. Meek, M. J. Otto, M. M. Rayner, Y. N. Wong, C. H. Chang, P. C. Weber, D. A. Jackson, T. R. Sharpe, S. Erickson-Viitanen, Science 1994, 263, 380.
[40] P. J. Ala, R. J. DeLoskey, E. E. Huston, P. K. Jadhav, P. Y. S. Lam, C. J. Eyermann, C. N. Hodge, M. C. Schadt, F. A. Lewandowski, P. C. Weber, D. D. McCabe, J. L. Duke, C. H. Chang, J. Biol. Chem. 1998, 273, 12325.
[41] P. K. Jadhav, P. Ala, F. J. Woerner, C. H. Chang, S. S. Garber, E. D. Anton, L. T. Bacheler, J. Med. Chem. 1997, 40, 181.
[42] C. N. Hodge, P. E. Aldrich, L. T. Bacheler, C. H. Chang, C. J. Eyermann, S. Garber, M. Grubb, D. A. Jackson, P. K. Jadhav, B. Korant, P. Y. S. Lam, M. B. Maurin, J. L. Meek, M. J. Otto, M. M. Rayner, C. Reid, T. S. Sharpe, L. Shum, D. L. Winslow, S. Erickson-Viitanen, Chem. Biol. 1996, 3, 301.

[43] K. Backbro, S. Lowgren, K. Osterlund, J. Atepo, T. Unge, J. Med. Chem. 1997, 40, 898.
[44] W. Schaal, A. Karlsson, G. Ahlsen, J. Lindberg, H. O. Andersson, U. H. Danielson, B. Classon, T. Unge, B. Samuelsson, J. Hulten, A. Hallberg, A. Karlen, J. Med. Chem. 2001, 44, 155.
[45] M. Fornabaio, F. Spyrakis, A. Mozzarelli, P. Cozzini, D. J. Abraham, G. E. Kellogg, J. Med. Chem. 2004, 47, 4507.
[46] P. J. Goodford, J. Med. Chem. 1985, 28, 849.

Received February 12, 2007
