Aqueous solubility: Simple predictive methods (in silico, in vitro and bio-relevant approaches)

International Journal of Pharmaceutics 453 (2013) 3–11

Review

David Elder a, René Holm b,*

a GSK Pharmaceuticals, Park Road, Ware, Hertfordshire, SG12 0DP, United Kingdom
b Biologics and Pharmaceutical Science, H. Lundbeck A/S, Ottiliavej 9, DK-2500 Valby, Denmark
* Corresponding author. Tel.: +45 3630 1311; fax: +45 3643 8242. E-mail address: [email protected] (R. Holm).

Article history: Received 27 June 2012; received in revised form 18 October 2012; accepted 24 October 2012; available online 2 November 2012

Keywords: Aqueous solubility; Predicting solubility; In silico solubility; Solubility challenge; Kinetic solubility; Equilibrium solubility; Drug formulation

Abstract

Aqueous solubility is a key physicochemical attribute required for the characterisation of an active pharmaceutical ingredient (API) during drug discovery and beyond. Furthermore, aqueous solubility is highly important for formulation selection and subsequent development processes. This review provides a summary of simple predictive methods used to assess aqueous solubility as well as an assessment of the more complex in silico methodologies and a review of the recent solubility challenge. In addition, a summary of experimental methods to determine solubility is included, with a discussion of some potential pitfalls.

0378-5173/$ see front matter © 2012 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.ijpharm.2012.10.041

Contents

1. Introduction
2. Simple in silico screening for solubility
3. Solubility challenge
4. Experimental solubility definitions
4.1. Kinetic solubility
4.2. Equilibrium solubility
4.3. Regulatory and bio-relevant solubility
5. Conclusion
Acknowledgements
References

1. Introduction

Lipinski developed his famous ‘rule of five’ to describe those physicochemical attributes that are most impactful on drug attrition (Lipinski, 2000). These are molecular mass < 500 Da, c log P < 5, number of hydrogen bond donors < 5, number of hydrogen bond acceptors < 10 (Lipinski, 2004). The application of the concept of ‘drug-likeness’ and ‘druggability’, which are linked to the ‘rule of five’, has gained wide-spread acceptance as a key approach in reducing attrition in drug discovery/development. Unfortunately, despite this acceptance, data trending has revealed that physicochemical attributes of current drug discovery programmes (across the industry) differ significantly from those in clinical development (Leeson and Springthorpe, 2007). In particular, this trending shows a marked increase in lipophilicity and a commensurate reduction in aqueous solubility (Lipinski, 2000).

Morphy (2006) showed that 1680 ‘optimised’ compounds from discovery literature had mean c log P values of 4.0 and molecular masses of 435 Da. Similarly, Oprea (2002) reported that more than 50% of compounds showing high potency for their specified target had c log P values of >4.25 and molecular masses of >425 Da. Keserü and Makara (2009) attributed this to current strategies
in high-throughput screening hits and hits-to-lead optimisation programme, including the use of DMSO (dimethyl sulfoxide) to solubilise ‘lead’ molecules.

Physical property ‘inflation’ is likely to be detrimental to the likelihood of clinical success. Several studies have shown that the mean molecular mass declines as compounds progress through clinical development and the more lipophilic compounds tend to be discontinued at each clinical phase (Wenlock et al., 2003).

The commonly held belief that ‘compounds with higher in vitro potency for their specified pharmacological target have greater potential to translate into viable low-dose medicinal products’ is increasingly being seen as erroneous. This drug discovery strategy can introduce sub-optimal physicochemical properties that translate into undesirable absorption, distribution, metabolism, excretion and toxicity (ADMET) characteristics (Gleeson et al., 2011). Indeed, the essence of successful drug discovery appears to be a balance between hydrophobicity-driven potency and hydrophilicity-driven biopharmaceutics properties (Hill and Young, 2010).

About 35–40% of lead substances are known to be sparingly soluble (or lower) and it is not expected that this figure will change in the near future (Stegemann et al., 2007). Up to 77% of screening compounds in discovery have been reported to have inadequate solubility for subsequent testing (Xia et al., 2003), which leads to wasted time and effort. The importance of aqueous solubility is best addressed at the discovery stage, but experimental testing is resource intensive. This has prompted many researchers to try to predict solubility using in silico tools. At present, in silico modelling is based on experimental training sets, as ab initio approaches are not currently feasible due to the complexity of the problem (Göller et al., 2006). Many different strategies (both simple and complex in nature) have been used to try and predict aqueous solubility. This review will examine some of the in silico strategies used to assess aqueous solubility and bio-pharmaceutical properties in general and look into some more general approaches of measuring aqueous solubility (kinetic, equilibrium, bio-relevant). This will provide the pharmaceutical development scientist with an understanding of potential avenues for interacting with the discovery teams, thereby initiating an important focus on solubility as early as possible to increase the likelihood of clinical success and radically improve early stage attrition levels.
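The ‘rule of five’ thresholds quoted at the start of this introduction amount to four simple comparisons. The following is a minimal sketch of such a screen, not a published implementation: the `Descriptors` class, its field names and the example hydrogen bond counts are hypothetical, and the cut-offs are written exactly as given in the text.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Hypothetical container for pre-computed descriptors of one compound."""
    molecular_mass_da: float
    clogp: float
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(d: Descriptors) -> list[str]:
    """Return the criteria (as worded in the text) that the compound fails."""
    checks = [
        ("molecular mass < 500 Da", d.molecular_mass_da < 500),
        ("c log P < 5", d.clogp < 5),
        ("hydrogen bond donors < 5", d.h_bond_donors < 5),
        ("hydrogen bond acceptors < 10", d.h_bond_acceptors < 10),
    ]
    return [label for label, passed in checks if not passed]


# Mean mass and c log P reported by Morphy (2006); donor/acceptor counts are
# made-up illustration values, not data from the review.
mean_optimised = Descriptors(435.0, 4.0, 2, 6)
print(rule_of_five_violations(mean_optimised))
```

A compound with the mean properties reported by Morphy (2006) passes the mass and c log P criteria, which illustrates why the review looks beyond the rule of five to explain poor solubility.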

2. Simple in silico screening for solubility

Poor aqueous solubility is difficult to define absolutely, but as a general working definition compounds with an average potency of 1 mg/kg need to have aqueous solubilities of greater than 100 µg/ml to be adequately soluble, whereas solubilities of less than 10 µg/ml would be described as poorly soluble (Faller and Ertl, 2007). However, it is clear that clinical dose is an important consideration. Therefore, poorly soluble compounds (<10 µg/ml) could have adequate solubility if the dose was low, i.e. less than 1 mg (Butler and Dressman, 2010). It is sometimes forgotten within the discovery community that formulation approaches also have a significant role to play in addressing poorly soluble compounds.
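The working definition above, and the reason dose matters, can be sketched in a few lines. This is an illustration only: the function names and the "intermediate" label for the 10–100 µg/ml range are assumptions, not terms from the review; the two cut-offs are those quoted from Faller and Ertl (2007).

```python
def classify_solubility(solubility_ug_per_ml: float) -> str:
    """Apply the working definition quoted in the text (1 mg/kg potency)."""
    if solubility_ug_per_ml > 100:
        return "adequately soluble"
    if solubility_ug_per_ml < 10:
        return "poorly soluble"
    return "intermediate"  # label assumed here; the text leaves this range unnamed


def volume_to_dissolve_ml(dose_mg: float, solubility_ug_per_ml: float) -> float:
    """Volume of medium (ml) needed to dissolve the whole dose."""
    return dose_mg * 1000.0 / solubility_ug_per_ml


print(classify_solubility(5.0))
print(volume_to_dissolve_ml(1.0, 10.0))    # low dose
print(volume_to_dissolve_ml(100.0, 10.0))  # same solubility, 100-fold higher dose
```

At 10 µg/ml a 1 mg dose dissolves in 100 ml, whereas a 100 mg dose would need 10 l, which is the point made in the text about clinical dose.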

A large number of in silico methods for predicting solubility have appeared in the literature. Models based on a variety of molecular predictors can help the medicinal chemist understand the structural features limiting solubility. Approaches include molecular surface area, hydrophilicity/hydrophobicity measurements, electronic and topological assessments (Lamanna et al., 2008). One of the biggest challenges is accurate prediction using in silico modelling, due in large part to the absence of high quality data set(s) that would underpin rapid, reliable and reproducible solubility assessments. Although there are a significant number of in silico computational models for the prediction of aqueous solubilities of drug molecules, they usually have a high order of uncertainty associated with them. There are a number of experimental variables that can affect the accuracy of measured databases. These can include adsorption of substrate to vessel surfaces or filters (for shake flask methods), inaccurate pKa determinations, compound instability, slow dissolution, poor wettability and incorrect data analysis or interpretation, which all affect data quality (Faller and Ertl, 2007). Fourches et al. (2010) cautioned that in silico models must be ‘trained and validated on chemical data sets that have been thoroughly curated with respect to both chemical structure and associated target property values’. Dreaden (2006) also expressed reservations about in silico prediction methodologies, suggesting that models reporting standard errors of estimate of between 0.4 and 0.5 log units for a reasonably diverse data set were probably over-fitting the data (this was attributed to poor quality data, where data generated from different experimental approaches were combined). Similarly, Kramer et al. (2010) developed a metaClassifier approach as a reaction to the overfitting that is typically seen in QSPR (quantitative structure property relationship) models. The authors indicated an expectation of classification accuracy of 65% for soluble/insoluble using foreign validation sets. They ascribed the reasons to the obvious scatter seen with the experimental data, but also to ambiguities with respect to the exact structure (tautomers, ionisation state, etc.). In silico methods for the prediction of aqueous solubility should therefore be used with this precaution in mind, i.e. for ranking rather than absolute assessment.

In addition, there are many diverse solubility definitions, which leads to confusion as to exactly what data have been reported and what data can be compared. Common examples are differences in media (water vs. buffer), differences in pH, differences in crystal form (polymorphs, solvates, hydrates, co-crystals, amorphous, etc.), kinetic vs. equilibrium solubility, and differences in purity, all of which can result in very different values being reported. As a consequence of these measurement difficulties, different studies in the literature cannot be combined to give the large and diverse data sets required to accurately predict solubilities of chemicals arising from new chemical space. However, it could be argued that no matter how large and diverse the supporting data set, new chemical space is still uncharted territory and it will prove to be challenging to any proposed solubility model. This is especially true for low solubility compounds. In addition, models based on a restricted training set can be inappropriately applied to the prediction of molecules in a different chemical space (Yan et al., 2004). Dreaden (2006) tested 16 commercially available modelling programs for their predictive ability to assess the aqueous solubility of a reference set of 122 compounds with accurately determined solubilities in water. Of the 17 evaluated software systems, the best performing software packages were Admensa (72.1%), ADMET predictor (64.8%), ChemSilico (59.8%), ADME Boxes (59.0%) and ACD/Labs (59.0%) (Dreaden, 2006). Some of the most commonly used commercial and freeware packages for prediction of solubility, together with a description of their operating principles, are provided in Table 1 (Wang and Hou, 2011).

Table 1. Compilation of some of the most commonly used commercial/freeware software packages for prediction of solubility (based on Wang and Hou (2011)).

| Software package | Web address (a) | Comments |
| --- | --- | --- |
| AB/DMSO, ACD/LogS | http://www.acdlabs.com/products/pc admet/ | Commercial predictors for physicochemical, ADME and toxicology properties that provide a comprehensive array of tools for the prediction of molecular physical properties from structure (fragment based). The ability to train allows for the inclusion of novel chemical space in many predictor modules. The models have been extended to include a tool for property-based structural design. |
| ALOGPS 2.1 | http://www.vcclab.org/lab/alogps/aqueous.html | Free applet that provides interactive on-line predictors of water solubility, log P and pKa(s) of compounds for drug design (ADME/T and HTS) studies. The molecular weight and electrotopological E-state indices are used to estimate aqueous solubility by artificial neural networks. |
| CERIUS2 | http://accelrys.com/products/datasheets/admet-descriptors.pdf | Commercial software that predicts the solubility of compounds in water at 25 °C (a ranking relative to the solubilities of a training set of drug molecules is provided). A genetic partial least squares method was used to derive the model based on a training set with experimentally measured solubilities. |
| COSMO-RS | www.cosmologic.de | Commercial software that predicts aqueous solubility based on quantum chemical, multivariate linear regressions of data. |
| CSLogWS | www.chemsilico.com | Commercial software based on neural networks (utilising hundreds of topological descriptors) that calculates intrinsic aqueous solubility (LogWSo) and pH solubility profiles at pH 2.0, pH 5.0 and pH 7.4. |
| QikProp | http://www.chem.ac.ru/Chemistry/Soft/QIKPROP.en.html | Commercial software that uses molecular mechanical and multivariate regression analysis to provide fast predictions (10,000 molecules/h) for 17 physically significant descriptors and 7 pharmaceutically relevant properties useful in predicting ADME characteristics of drug candidates. QikProp uses whole-molecule descriptors (as opposed to fragment-based descriptors). |
| S+LogS (ADMET-Predictor) | http://www.simulations-plus.com/Products.aspx?pID=13&mID=17 | Commercial software that uses three models for each property: one artificial neural network ensemble model and two Meylan–Howard models distinguished by melting point input. Properties: native solubility (solubility in pure water); native pH at saturation in pure water; intrinsic solubility in pure water; salt solubility factor; water solubility at user-specified pH. |
| SKLogS | http://www.bmdrc.org/04 product/01 preadme.asp | Commercial software that uses topological descriptors and artificial neural networks to define various physicochemical properties concerned with drug discovery, such as log P and water solubility. |
| SLIPPER | http://www.timtec.net/SLIPPER-LogP-LogD-LogSw-FA/ | SLIPPER-2001 (Solubility, LIPophilicity, PERmeability) is a commercial software program using multivariable linear regression to predict the aqueous solubility, lipophilicity and absorption for a wide range of compounds based on structural similarity and physicochemical properties of compounds. |
| Smart Mining/DMSO | www.chemdiv.com/ | ChemDiv is a commercial software program for prediction of DMSO solubility using physicochemical and topological parameters. |
| SPARC | http://ibmlc2.chem.uga.edu/sparc/search/searchPka.cfm | Freeware software for predicting solute–solvent interactions. SPARC uses computational algorithms based on fundamental physicochemical parameters to estimate a large number of chemical and physical properties (including aqueous solubility) from molecular structure. Mechanistic perturbation models have been developed that can be implemented where needed for specific reactivity queries. |
| StarDrop | http://www.optibrium.com/stardrop/stardrop-adme-qsar-models.php | Commercial software that enables the user to predict a broad range of ADME and physicochemical properties using a suite of high-quality QSAR models, including aqueous solubility and solubility in phosphate buffered saline at pH 7.4. |
| PHYSPROP | http://www.srcinc.com/what-we-do/product.aspx?id=133 | The Physical Properties Database (PHYSPROP) contains chemical structures and physical properties for over 41,000 chemicals. Physical properties are collected from a wide variety of sources, and include experimental, extrapolated and estimated values for water solubility of test compounds. |

(a) The websites were accessed on 17th of September 2012; changes in address may occur over time.

Fragment based models attempt to predict solubility as a summation of individual contributions from the various structural motifs; for example, contributions of atoms, molecules or larger structural moieties. Hence, there is a generalised assumption that molecular properties are pre-determined by molecular structure, and therefore, the sum can be approximated by contributions from the fragments (Faller and Ertl, 2007). Numerous simulation approaches based on fragment-modelling are available. The three most popular approaches include in silico models utilising E-state indices (Tetko et al., 2001), group contribution approach (Klopman and Zhu, 2001) and molecular topology (Huuskonen, 2000). Not surprisingly, these approaches work well for molecular properties that are purely additive, for example, lipophilicity as defined by log P. Unfortunately, the reverse is true for solubility due to the complex interaction of inductive and mesomeric effects, as well as intra-molecular hydrogen bonding (seen with salts, polymorphs, co-crystals, etc.) and ionic interactions (typically seen with salts). However, this approach offers the possibility of modelling (at least in part) the effects of crystal packing (Chickos et al., 1991).

Indeed, the biggest challenge facing solubility modelling is the unpredictable nature of the solid-state (polymorphs, solvates, hydrates, co-crystals, amorphous, etc.) and trying to predict the crystal class, space group (in total 230 possibilities based on 14 Bravais lattices and 32 unique point groups that are arranged in seven 3-dimensional coordinate systems), the number, position and conformation of independent molecules in the unit cell, and the lattice parameters. The other challenge is to effectively model the entropy effect moving from an ordered, structured low entropy solid state to a disordered, unstructured high entropy solution state. This initially led to the derivation of the General Solubility Equation (GSE) by Yalkowski and co-workers (Sanghvi et al., 2003) and more recently a modified GSE (Johnson et al., 2007), both including a melting point term in the equation:

$$\log S = -\log P - 0.01\,(\mathrm{MPt} - 25) + 0.5 \qquad (1)$$

where S is the intrinsic solubility, P is the octanol/water partition coefficient and MPt is the melting point. If the melting point is not known, a ‘standard’ value of 125 °C has been suggested (Delaney, 2005). This melting point term only partially helps to address the issue of the influence of crystal lattice energy (Ouvrard and Mitchell, 2003; Datta and Grant, 2004), entropy (Johnson and Yalkowski, 2005) and general thermodynamic considerations (Palmer et al., 2008) on the predictivity of aqueous solubility. Further, it is clear from the GSE that hydrophobicity (as defined by log P) is the dominant variable in the equation (Hill and Young, 2010). As it is much more straightforward for a medicinal chemist to influence changes in log P in a molecular scaffold rather than attempting to influence the melting point of the resultant molecule, discussing log P can be a key to success in some discovery organisations. Leeson and Springthorpe (2007) indicated that it probably was not coincidental that a typical marketed drug has a c log P of 2.5, because this corresponds to the upper limit of good solubility as predicted by the GSE. As highlighted earlier, increasingly poor aqueous solubility is the logical extension of introducing overly hydrophobic character into potential drug candidates.
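Eq. (1) is simple enough to evaluate directly. The sketch below is a plain transcription of the equation with the suggested 125 °C default; the function name is hypothetical, the melting point is taken to be in °C (as implied by the 25 and 125 °C values in the text), and the result is assumed to be log S in mol/l, which is the usual convention for the GSE but is not stated in the text.

```python
def gse_log_s(log_p: float, melting_point_c: float = 125.0) -> float:
    """General Solubility Equation, Eq. (1): log S from log P and melting point.

    melting_point_c defaults to the 'standard' 125 degrees C suggested when
    the melting point is unknown.
    """
    return -log_p - 0.01 * (melting_point_c - 25.0) + 0.5


# A 'typical marketed drug' with c log P = 2.5 and the default melting point.
log_s = gse_log_s(2.5)
print(log_s, 10.0 ** log_s)

# Raising log P by one unit lowers predicted log S by one unit, whereas it
# takes a 100 degree C rise in melting point to do the same.
print(gse_log_s(3.5), gse_log_s(2.5, 225.0))
```

With c log P = 2.5 and the default melting point the sketch gives log S of about −3, i.e. roughly millimolar solubility under the unit assumption above, and the second pair of values shows why log P dominates the equation.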

Hill and Young (2010) indicated that the GSE constraint of log P of >2.5 is probably the worst case scenario as it does not reflect ionisation within the molecule and the impact that this can have in enhancing aqueous solubility. Thus by replacing log P with log D_{pH 7.4}, a modified GSE can be derived:

$$\log S_{\mathrm{pH}\,7.4} = -\log D_{\mathrm{pH}\,7.4} - 0.01\,(\mathrm{MPt} - 25) + 0.5 \qquad (2)$$

Hill and Young (2010) assessed a large data set of about 20,000 compounds, where both measured log D_{pH 7.4} and accurate kinetic solubility measurements (at pH 7.4) were available. These empirical data were complemented with calculated values for hydrophobicity (c log P and c log D_{pH 7.4}), together with MW and the number of aromatic rings. The authors reported a striking disconnect between measured and calculated hydrophobicity that grew with decreasing solubility. Poorly soluble compounds (<30 µM) show a particularly pronounced disconnect, with a correlation coefficient of R² = 0.11, but as the solubility increases (30–200 µM) there is an improved correlation (R² = 0.32) and the best correlation (albeit still quite poor from a statistical perspective) occurs with highly soluble (>200 µM) compounds (R² = 0.462). These observations suggest that factors contributing to poor aqueous solubility are somehow perturbing the octanol/water system and preventing a true equilibrium from being obtained, which could be attributable to aggregate formation (Jadhav et al., 2010). Ironically, these data support the contention that calculated log D_{pH 7.4} (or c log P) might be a better indicator of the true value of hydrophobicity than the measured value, demonstrating some of the difficulties in obtaining high quality data when working with poorly soluble compounds. These observations were developed by Young et al. (2011), where the reliability and predictability of chromatographic measurements of hydrophobicity were substantiated. These showed little solubility-related variation and a normal distribution of values, with more meaningful (and relevant) measurements generated for more hydrophobic compounds (few measurements of log D_{pH 7.4} above 3 are obtained with the octanol–water method). Recently, the adverse effects of aromaticity on solubility have been reported. Three related variables have been utilised: percentage of sp³ hybridised atoms (Lovering et al., 2009), aromatic proportion (AP) (Lamanna et al., 2008) and the number of aromatic rings (Ritchie and MacDonald, 2009; Ritchie et al., 2011) in the target structure. Bergström et al. (2007) indicated that those poorly soluble molecules that reached the market were characterised by solvation solubility limitations, i.e. ‘grease balls’, rather than by solid state considerations, i.e. ‘brick dust’.

Wassvik et al. (2008) further indicated that it was clear that molecules with low lipophilicity are more likely to display poor solubility from a well ordered crystal structure, i.e. ‘brick dust’, whereas highly lipophilic compounds are solubility limited by inadequate solvation. They modelled various scenarios and showed that for a compound with a melting point of 250 °C and c log P of 2 the GSE demonstrates that solid-state considerations will predominate (52%), whereas when the c log P is increased to 6, solid-state considerations drop significantly (27%). Large, flat (planar) and rigid molecules with conjugated π-systems and extended ring systems correlate highly with increased solid-state stability and have a high likelihood (86%) of flagging for low aqueous solubility.
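The quoted shares can be approximately reproduced from Eq. (1) under one assumption that is not stated in the text: that the solid-state share is the melting point term, 0.01(MPt − 25), divided by the sum of that term and the c log P term. The sketch below uses this reading; the function name is hypothetical and the calculation is an illustration, not necessarily the procedure Wassvik et al. (2008) used.

```python
def solid_state_share(clogp: float, melting_point_c: float) -> float:
    """Fraction of the GSE penalty attributed to the melting point term.

    Assumed reading: melting term / (melting term + lipophilicity term),
    with both terms taken from Eq. (1).
    """
    melting_term = 0.01 * (melting_point_c - 25.0)
    lipophilicity_term = abs(clogp)
    return melting_term / (melting_term + lipophilicity_term)


for clogp in (2.0, 6.0):
    share = solid_state_share(clogp, 250.0)
    print(f"c log P = {clogp}: solid-state share = {share:.1%}")
```

For a melting point of 250 °C this gives about 53% at c log P = 2 and about 27% at c log P = 6, close to the 52% and 27% quoted above, which shows how the balance shifts from ‘brick dust’ towards ‘grease ball’ behaviour as lipophilicity rises.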

The explanations involve either reduced entropy contributions to the corresponding free energy of solvation/melting owing to enhanced rigidity and decreased molecular mobility, or increased lattice energy (and MPt) owing to increased π–π stacking of the planar aromatic systems. Hill and Young (2010) also found a more pronounced correlation between the number of aromatic ring systems and c log DpH 7.4 (rather than log P) and consequently solubility. They proposed a solubility forecast index (SFI):

SFI = c log DpH 7.4 + number of aromatic rings (3)

The authors indicated that if the SFI < 5, then there is a reasonable chance of demonstrating good aqueous solubility and further reasoned that each aromatic ring is equivalent to an extra log unit of hydrophobicity. As the average number of aromatic ring systems in marketed oral products is 1.6 (Sanghvi et al., 2003), then the average SFI would be 2.4. Recently, Young et al. (2011) have built on their earlier work, replacing c log DpH 7.4 with a chromatographically determined hydrophobicity parameter (ChromLog DpH 7.4) and deriving a new parameter (PFI, property forecast index), where:

PFI = ChromLog DpH 7.4 + number of aromatic rings (4)

Due to a 2-unit shift when moving from c log DpH 7.4 to ChromLog DpH 7.4 the authors indicated that if the PFI < 7, then there is a reasonable chance of demonstrating good aqueous solubility. Statistically, an improved differentiation between classes was achieved with the chromatographic parameter. Interestingly, with PFI held constant, MW variation had essentially no influence on solubility. Additionally, they demonstrated relevance of the PFI concept to other key developability parameters (in addition to solubility); for example, permeability, human serum albumin binding, cytochrome P450 affinity and intrinsic clearance. The intrinsic hydrophobicity of a molecule (i.e. log P) and the addition of log P + #Ar (the intrinsic PFI) were demonstrated to be the key drivers behind hERG binding and molecular promiscuity. In each of these, the impact of flatness over and above any correlation with hydrophobicity was emphasised. They further argued that “hydrophobicity should be the pre-eminent parameter in medicinal chemistry, rather than size, weight or polar surface area”. Additionally, a second key molecular parameter is related to molecular shape, which appears to be well modelled by the number of aromatic rings in the molecule. The authors also observed that hetero-aromatic ring systems were more desirable than their benzenoid analogues due to a generally lowered hydrophobicity, although minimisation of all aromatic rings is a sensible strategy. Taken together, the PFI forms a firm foundation for property-based design (above and beyond solubility) that should facilitate better predictivity, decision making and ultimately reduce attrition.
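Both indices are simple sums, so they are easy to compute once a hydrophobicity value and an aromatic ring count are available. The following is a minimal sketch of Eqs. (3) and (4) with the thresholds quoted in the text (SFI < 5, PFI < 7); the function names and the example values are hypothetical, and the hydrophobicity inputs are assumed to come from an external calculation or measurement.

```python
def solubility_forecast_index(clogd_ph74, n_aromatic_rings):
    """SFI = c log D(pH 7.4) + number of aromatic rings, Eq. (3)."""
    return clogd_ph74 + n_aromatic_rings


def property_forecast_index(chrom_logd_ph74, n_aromatic_rings):
    """PFI = ChromLog D(pH 7.4) + number of aromatic rings, Eq. (4)."""
    return chrom_logd_ph74 + n_aromatic_rings


def reasonable_chance_of_solubility(index_value, threshold):
    """True when the index falls below the quoted threshold."""
    return index_value < threshold


if __name__ == "__main__":
    # Hypothetical compound: c log D 2.5, ChromLog D 4.5, three aromatic rings.
    sfi = solubility_forecast_index(2.5, 3)
    pfi = property_forecast_index(4.5, 3)
    print("SFI", sfi, reasonable_chance_of_solubility(sfi, 5.0))
    print("PFI", pfi, reasonable_chance_of_solubility(pfi, 7.0))
```

The hypothetical compound fails both thresholds; removing one aromatic ring would bring both indices below their cut-offs, which mirrors the argument that each aromatic ring counts as roughly one extra log unit of hydrophobicity.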

In addition to the various molecular descriptors utilised in this field, a diverse range of statistically based/data mining approaches have been utilised for prediction of aqueous solubility. As well as linear regression (Jain et al., 2006) and partial least squares (PLS) (Clark, 2005), support vector machines (SVM) (Lind and Maltseva, 2003), Monte Carlo simulations (Jorgensen and Duffy, 2000), cellular automata (Kier and Chen, 1994), genetic algorithms (Wegner and Zell, 2003), random forests (RF) (Palmer et al., 2007) and neural networks (Yan et al., 2004) have all been employed.

In contrast, the methods presented here are easier to conceptualise and use in striving to generate better development candidates for the future. Also, they provide the formulation scientist with the first simple ideas on enabling technologies to explore the grease ball/brick dust paradigm.

3. Solubility challenge

A research group from the University of Cambridge, UK (Llinàs et al., 2008) recently measured the intrinsic solubilities of 100 drug-like molecules at fixed temperature and ionic strength using the ‘chasing equilibrium’ approach (Stuart and Box, 2005). This technique ensures that the equilibrium solubility is rapidly achieved. They rejected any results with a relative standard deviation (RSD) of >4%. Using this data set they challenged (Solubility Challenge) other research groups to predict, using any available in silico technique, the intrinsic solubility (of the neutral form, i.e. independent of pH) of a further 32 drug-like molecules. The molecules in the training set were selected to represent a diverse data set and satisfied the following criteria: safe and affordable, with pKas between 1 and 12. The molecular weights (MW) ranged from 115 (proline) to 645 (amiodarone). The intrinsic solubilities ranged over 7 orders of magnitude from the poorly soluble (amiodarone, 6.7 nM) to the highly soluble (acetaminophen, 86 mM), with a reasonably even spread of intermediate values. Some molecules were too soluble and could not be accurately measured (procainamide, orbifloxacin, l-proline, levofloxacin, 5-fluorouracil), some were chemically unstable (aspirin, benzylpenicillin – both prone to aqueous hydrolysis) and some required the use of co-solvents (sertraline, loperamide, meclizine, amiodarone).

The authors subsequently reported (Hopfinger et al., 2009) that they received more than 100 entries to the Solubility Challenge, but that >5% of respondents gave incomplete responses and that results were not reported for all 32 compounds, which reduced the useable responses to 99. Respondents utilised the whole spectrum of available in silico predictive tools and hence this challenge provided an over-arching perspective of the current ability to predict aqueous solubility. Unfortunately, the authors could not recommend any definitive best method(s), rather a number of approaches which fared equally well (or badly, dependent on the reader’s perspective) at predicting solubility. Within the full data set (32 compound data set), there were two compounds that exhibited polymorphism (trazodone (two forms) and diflunisal (four forms)) and the solubility of the most stable polymorph, i.e. the least soluble form, was used. Four compounds were too soluble to measure (2-chloromandelic acid, 1R-2S-ephedrine, marbofloxacin and 1R-2R-pseudoephedrine), and these compounds were excluded from the full set and not used in any subsequent statistical evaluation. Additionally, the four largest outliers were also removed to give a restricted data set of 24 compounds. The percentages of correct predictions were then reported, using a ±10% error constraint for S (intrinsic solubility) or a ±0.5 log error limit for log S. The findings can be summarised as follows:

(a) The percentage of correct predictions for the measured intrinsic solubility ranged from (i) 0.0–21.9% (32 compound, full set), (ii) 0.0–17.9% (28 compound, measured data set) and (iii) 0.0–20.8% (24 compound, restricted data set).

(b) The predicted vs. measured correlation coefficient (R²) for intrinsic solubility ranged from (i) 0.000 to 0.642 (28 compound, measured data set) and (ii) 0.002 to 0.987 (24 compound, restricted data set).

(c) The percentage of correct predictions (with acceptance criteria of a 0.5 error limit on log solubility) ranged from (i) 15.6% to 62.5% (32 compound, full set), (ii) 10.7% to 60.7% (28 compound, measured data set) and (iii) 12.5% to 70.8% (24 compound, restricted data set).

(d) The predicted vs. measured correlation coefficient for log solubility ranged from (i) 0.018 to 0.650 (28 compound, measured data set) and (ii) 0.002 to 0.835 (24 compound, restricted data set).

(e) No respondents identified that some of the compounds could be polymorphic and consequently no respondent made any predictions based on those attributes.

(f) The majority of the 28 compound test set have log solubilities in the range 0.5–3.0 and therefore accurate predictions are more likely within this range rather than at the extremes of high or low measurable solubility.

(g) The predictive powers of the models used are based on certain facets of the chemical structure and therefore the training set will be biased towards certain compounds, i.e. imipramine solubility, 22 ± 1 µg/ml, was correctly predicted by 81% of respondents, whereas naphthoic acid solubility, 28.96 ± 0.01 µg/ml, was correctly predicted by only one-third of the participants.

(h) The two compounds that were least well predicted were probenecid, 3.9 ± 0.1 µg/ml (2% correct predictions) and indomethacin, 410 ± 5 µg/ml (4% correct predictions).
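The scoring criteria used in these findings (a ±10% window on S, a ±0.5 window on log S, and a squared correlation coefficient) can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions, not the organisers' evaluation code: the function names and the example values are hypothetical, and R² is taken here as the squared Pearson correlation.

```python
import math


def fraction_within_relative_error(measured, predicted, tolerance=0.10):
    """Fraction of predictions within +/- tolerance (10%) of measured S."""
    hits = sum(1 for m, p in zip(measured, predicted)
               if abs(p - m) <= tolerance * m)
    return hits / len(measured)


def fraction_within_log_limit(measured, predicted, limit=0.5):
    """Fraction of predictions within +/- limit log10 units of measured S."""
    hits = sum(1 for m, p in zip(measured, predicted)
               if abs(math.log10(p) - math.log10(m)) <= limit)
    return hits / len(measured)


def r_squared(x, y):
    """Squared Pearson correlation coefficient between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Hypothetical solubilities in ug/ml, for illustration only.
    measured = [1.0, 10.0, 100.0, 1000.0]
    predicted = [2.0, 10.5, 500.0, 950.0]
    print(fraction_within_relative_error(measured, predicted))
    print(fraction_within_log_limit(measured, predicted))
```

The example shows why the log criterion is so much more forgiving: a two-fold error fails the ±10% test but sits comfortably inside ±0.5 log units, which corresponds to roughly a three-fold window.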

Subsequently, several commentaries were published on the Solubility Challenge. Hewitt et al. (2009) commented on the approaches that they submitted (simple linear regression, artificial neural networks, category formation and bespoke in silico methods), which were selected to span the multitude of available modelling approaches. Hewitt and co-workers were surprised that the simple linear regression approach appeared to be superior to the more complex methods. Their top-level conclusion was that predicting aqueous solubility is still hugely challenging. The failures were not attributable to limitations of the modelling approaches, but rather they were a result of the complexity of the challenge, and the recommendation was to use simple modelling as a first-intent approach. Hewitt et al. (2009) indicated that data quality was key and that even with the ‘high quality’ data set provided in the Solubility Challenge, questions were still raised about the veracity of the data. It is also critical to understand the applicability domain (i.e. the domain where the model is truly predictive). Predictions made outside of this domain are less reliable. Solubility models still have a screening role in excluding those molecules with poor solubility characteristics and the veracity of these predictions can be verified by the use of multiple models. Finally, Hewitt and co-workers recommended other alternative non-statistical approaches, such as mechanistic reasoning and solute packing characteristics, as the additional data can be used to enhance confidence in the predictive tools.

Cheng et al. (2011) developed a support vector machine (SVM) algorithm based on 46,000 compounds of known solubility. Despite the size of this training set the SVM approach performed poorly for both soluble (sensitivity 19–22%) and insoluble compounds (specificity 28.6–42.9%) in the Solubility Challenge. Kramer et al. (2010) showed better predictivity with their metaClassifier approach, despite the fact that their training set was based on kinetic solubilities rather than equilibrium solubilities. They showed a prediction accuracy of 77.8% for solubility, but tended to predict higher solubilities than the measured values (perhaps unsurprisingly as their training set was measured at pH 7.4 in the presence of small amounts of DMSO). However, their model only correctly predicted 3/8 of the insoluble compounds and mefenamic acid and trimipramine were strong outliers, and thus their approach performed less optimally with insoluble compounds.
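For readers less familiar with classifier metrics, the sketch below shows how sensitivity and specificity are obtained for a binary soluble/insoluble model. It follows the convention implied above, in which ‘soluble’ is the positive class, so sensitivity is the fraction of soluble compounds recognised and specificity is the fraction of insoluble compounds recognised. The function name and the example labels are hypothetical.

```python
def sensitivity_specificity(actual_soluble, predicted_soluble):
    """Return (sensitivity, specificity) with 'soluble' as the positive class."""
    pairs = list(zip(actual_soluble, predicted_soluble))
    true_pos = sum(1 for a, p in pairs if a and p)
    false_neg = sum(1 for a, p in pairs if a and not p)
    true_neg = sum(1 for a, p in pairs if not a and not p)
    false_pos = sum(1 for a, p in pairs if not a and p)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical labels: True = soluble, False = insoluble.
    actual = [True, True, False, False, False]
    predicted = [True, False, False, False, True]
    print(sensitivity_specificity(actual, predicted))
```

Reporting both numbers matters here because an overall accuracy figure can hide poor performance on the insoluble compounds, which are usually the ones of most interest.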

The predictive powers of these more complex in silico models seem relatively modest if judged solely by their correlation coefficients. Indeed, Faller and Ertl (2007) suggested that the advantages of multi-linear models appear to be modest when compared with the simple c log P correlation proposed by Hansch et al. (1968). Faller and Ertl (2007) also suggested that knowing when to trust the models was equally important and provided some relationships between solubility and molecular descriptors. They also cautioned that before using a prediction model the user should check whether the test compound fits into the ‘property’ space covered by the model’s training set. Finally, the user cannot expect the accuracy of the in silico model to be greater than that of the experimental determinations (which are typically within a factor of 5) and the suggested way forward was to restrict the use of currently available models to the ‘appropriate’ situation, and thereby focus on a selected, limited number of compounds that are outside the existing training sets, freeing up scarce resources for more critical, in-depth evaluations (Faller and Ertl, 2007). The Solubility Challenge clearly demonstrates that although solubility can be predicted by first intent, it still requires the generation of actual compound for experimental measurements in order for a discovery optimisation programme to successfully optimise the solubility of the series.
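One very simple way to check whether a test compound sits inside the property space of a training set is a descriptor range (bounding box) test. The sketch below is only one of several possible applicability-domain checks and is not taken from Faller and Ertl (2007); the descriptor names (`mw`, `clogp`) and the values are hypothetical.

```python
def descriptor_ranges(training_set):
    """Return {descriptor: (min, max)} over a list of descriptor dicts."""
    names = training_set[0].keys()
    return {name: (min(c[name] for c in training_set),
                   max(c[name] for c in training_set))
            for name in names}


def in_property_space(compound, ranges):
    """True if every descriptor lies inside the training-set range."""
    return all(low <= compound[name] <= high
               for name, (low, high) in ranges.items())


if __name__ == "__main__":
    # Hypothetical training compounds described by two descriptors.
    training = [
        {"mw": 180.0, "clogp": 1.2},
        {"mw": 320.0, "clogp": 3.4},
        {"mw": 450.0, "clogp": 4.8},
    ]
    ranges = descriptor_ranges(training)
    print(in_property_space({"mw": 300.0, "clogp": 2.0}, ranges))
    print(in_property_space({"mw": 645.0, "clogp": 7.0}, ranges))
```

A range test of this kind is deliberately crude: it flags obvious extrapolation but says nothing about sparsely populated regions inside the box, so it is best treated as a first filter before a prediction is trusted.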

4. Experimental solubility definitions

Before embarking on an overview of different approaches to assess solubility, it is probably worthwhile defining solubility and the method(s) used to determine solubility, as the term is often used without precision and this can lead to confusion. Solubility is a thermodynamic term and can be defined as the amount of solute (drug) that dissolves in a given volume of a solvent (water, buffer, etc.) at a specified pH, temperature and pressure (Alsenz and Kansy, 2007).

The importance of solubility in both drug discovery and development is evident; so, however, is its dependence on the method used to measure it (described in fuller detail in the next section). A number of different terms describing solubility have been used in the literature, including apparent solubility, apparent equilibrium solubility, kinetic solubility, precipitation solubility, thermodynamic solubility and equilibrium solubility (Kawakami et al., 2005; Elamin et al., 1994; Bevan and Lloyd, 2000; Kerns et al., 2008; Lipinski et al., 2001; Betigeri et al., 2008; Brewster et al., 2008; Murdande et al., 2011). The different terms cover differences in the protocol used to measure aqueous solubility, but the most important methods used in the pharmaceutical industry for solubility determinations are kinetic and equilibrium (or thermodynamic) solubility. The methods can also be used for measuring the solubility in formulation systems or excipients, though differences between the systems should be kept in mind when defining the protocol.

4.1. Kinetic solubility

The starting point of kinetic solubility is the dissolved solute, and the kinetic solubility (or maximum solubility) of the fastest precipitating species is usually determined. It is thus fast (nephelometric, turbidimetric methods), easy to automate and ideally suited for a discovery environment using high-throughput screening tools. However, the physical character of the precipitated species is typically not determined; it can be crystalline, amorphous, co-crystal, salt or neutral compound, or indeed a mixed-phase. Indeed, precipitation from organic solutions (typically mixtures of DMSO/water) typically favours metastable species (Ostwald’s Rule of Stages). Kinetic solubility assessments vary as a function of time, which impacts the degree of supersaturation that can occur, and they therefore typically over-predict the true solubility values (Alsenz and Kansy, 2007). Kinetic solubility values are normally used as a ranking tool, and can also to some degree be used to verify that the compounds tested in in vitro screens from DMSO stock solutions are in fact in solution, i.e. that the obtained results are reasonably trustworthy. Stuart and Box (2005) measured kinetic solubility for 7 different compounds and compared it to the equilibrium solubility reported in the literature. The kinetic solubility values were up to 50 times greater than the thermodynamic solubility, which is why kinetic solubility methods cannot be assumed to provide either an approximation or a rank order of equilibrium solubility. It is therefore recommended to conduct a limited validation exercise in a new chemical series by measuring the equilibrium solubility of a sub-set of this series, if ranking is the intended purpose.
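The limited validation exercise recommended above amounts to two checks on a sub-set of compounds: how large the kinetic over-estimate is, and whether the kinetic rank order matches the equilibrium rank order. A minimal sketch follows; the function names and the solubility values are hypothetical, and ties between values are not handled.

```python
def fold_overestimate(kinetic, equilibrium):
    """Ratio of kinetic to equilibrium solubility for each compound."""
    return [k / e for k, e in zip(kinetic, equilibrium)]


def rank_order(values):
    """Rank of each value (1 = lowest); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def same_rank_order(kinetic, equilibrium):
    """True if both methods rank the compounds identically."""
    return rank_order(kinetic) == rank_order(equilibrium)


if __name__ == "__main__":
    # Hypothetical solubilities (ug/ml) for three compounds in one series.
    kinetic = [50.0, 20.0, 5.0]
    equilibrium = [1.0, 10.0, 2.0]
    print(fold_overestimate(kinetic, equilibrium))
    print(same_rank_order(kinetic, equilibrium))
```

In this invented example the compound that looks most soluble kinetically is the least soluble at equilibrium, which is the situation the validation exercise is intended to catch.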

4.2. Equilibrium solubility

Equilibrium (or thermodynamic) solubility in contrast is typically determined either in late phase discovery or in a development setting. Here, the solubility assay is performed in the presence of excess solute and the residual solute at the end of the experiment is typically re-assessed from a form perspective to ensure that there has been no change caused by polymorphism, hydration, co-crystal formation, etc. Equilibrium solubility therefore represents the saturation solubility of the solute in equilibrium with an excess of undissolved solute, the gold standard being the shake flask method. Although equilibrium solubility is likely to more accurately represent the true solubility, it is still dependent on various factors, for example, solute purity, particle size, solvent, chemical/physical stability in solution, aggregation, particle growth (Ostwald ripening), time, temperature, mixing conditions, amount of residual solute, solvent purity, adsorption to surfaces, pH, methodology, methodology accuracy, etc. As the crystal lattice energy (if crystalline) of the solute needs to be overcome, time plays an important role in the determination. Equilibrium solubility measurements are often made at several time points, e.g. 4, 12, 24, 48 h, and the end-point is frequently reported when there is concordance of results at two consecutive time points. Equilibrium solubilities measured on solute produced by medicinal chemistry processes are likely to be higher than achievable in later development as both the purity and crystallinity increase as the project moves forward (Alsenz and Kansy, 2007). Pudipeddi and Serajuddin (2005) estimated that the solubility differences between amorphous and crystalline compounds could be 10-fold. For projects where the compound is typically presented in a solubilised form in the first regulated toxicity studies, this may constitute a major challenge as the solubility will also be affected by the vehicle used in the study.
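The end-point rule described above (report the value once two consecutive time points agree) can be written as a short function. This is a sketch under stated assumptions: the text does not define ‘concordance’, so a 5% relative tolerance is assumed here purely for illustration, and the function name and data are hypothetical.

```python
def equilibrium_endpoint(time_points_h, solubilities, rel_tol=0.05):
    """Return (time, solubility) at the first pair of consecutive
    measurements that agree within rel_tol, or None if none agree.

    rel_tol is an assumed concordance criterion, not a published one.
    """
    for i in range(1, len(solubilities)):
        previous, current = solubilities[i - 1], solubilities[i]
        if abs(current - previous) <= rel_tol * max(previous, current):
            return time_points_h[i], current
    return None


if __name__ == "__main__":
    # Hypothetical measurements (ug/ml) at 4, 12, 24 and 48 h.
    times = [4, 12, 24, 48]
    values = [30.0, 22.0, 20.0, 19.8]
    print(equilibrium_endpoint(times, values))
```

A falling profile like the one in the example is what would be expected if a metastable form slowly converts to a less soluble, more stable form, which is one reason the residual solid is re-examined at the end of the experiment.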

The intrinsic solubility (S0) is the solubility of the neutral form of ionised compounds (or the solubility of non-ionised compound) measured at a defined pH and in a given solvent (typically water). Intrinsic solubility appears to be reasonably independent of the medium used (Bergström et al., 2004). However, in contrast, solubilities measured at a fixed pH (especially of poorly soluble, ionisable drugs) tend to be variable and highly dependent on the type and relative concentration of the ions present in the media. In addition, solubility measurement cannot differentiate between soluble monomers and self-associated species that are also soluble, e.g. dimers, micelles (Faller and Ertl, 2007).
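The relationship between intrinsic solubility and the solubility measured at a fixed pH is often approximated with the Henderson–Hasselbalch equation. The sketch below uses that textbook idealisation for a monoprotic acid or base; it is not taken from the review, it ignores salt solubility limits, counter-ion effects and self-association (the very effects noted above as sources of variability), and the function name is hypothetical.

```python
def total_solubility(s0, ph, pka, kind="acid"):
    """Idealised pH-dependent solubility of a monoprotic compound.

    acid: S = S0 * (1 + 10 ** (pH - pKa))
    base: S = S0 * (1 + 10 ** (pKa - pH))
    """
    if kind == "acid":
        return s0 * (1.0 + 10.0 ** (ph - pka))
    if kind == "base":
        return s0 * (1.0 + 10.0 ** (pka - ph))
    raise ValueError("kind must be 'acid' or 'base'")


if __name__ == "__main__":
    # Hypothetical monoprotic acid: S0 = 1 ug/ml, pKa = 4.
    for ph in (2.0, 4.0, 6.0):
        print(ph, total_solubility(1.0, ph, 4.0, kind="acid"))
```

In this idealised picture the total solubility doubles at pH = pKa and rises ten-fold per pH unit beyond it, whereas S0 itself stays constant, which is why intrinsic solubility is the more transferable quantity.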

As discussed in Section 3, Kramer et al. (2010) tended to overestimate the solubilities by the use of kinetic solubility vs. thermodynamic solubility. Saal and Petereit (2012) recently compared results obtained by both kinetic and equilibrium solubility on a database of 465 research compounds. For a considerable number of low solubility compounds, there appeared to be little difference between the two solubility determinations. However, there were also a considerable number of compounds where the kinetic solubility was significantly higher (sometimes two orders of magnitude) than the corresponding equilibrium solubility. The authors demonstrated that at least 60% of drug residues examined after equilibrium solubility were crystalline in nature (compared to 22% amorphous and the remainder of indeterminate crystallinity). In contrast, in the vast majority of cases, the precipitate obtained after kinetic solubility determinations is thought to be amorphous in nature (Hoelke et al., 2009). These differences in physical form are anticipated to contribute significantly to the differences observed between kinetic and equilibrium solubility, as differences in solubility of one to three orders of magnitude have been historically observed between crystalline and amorphous compounds (Huang and Tong, 2004). Saal and Petereit (2012) therefore argued that the use of kinetic solubility data could misguide the medicinal chemist in the optimisation process and instead of making more soluble compounds they would make compounds that are harder to crystallise. The thermodynamic protocol also carried a risk, as only 60% of the precipitate was crystalline after the equilibration time. Investigation of the excess compound with polarised microscopy was, hence, suggested to circumvent this risk (Saal and Petereit, 2012) and thereby improve the protocol for thermodynamic solubility determination at the drug discovery stage.

4.3. Regulatory and bio-relevant solubility

Solubility and intestinal permeability are routinely used to assess bio-pharmaceutical classification of drug molecules (Amidon et al., 1995). The compounds are classified into a 4-box model: class 1 (high solubility/high permeability), class 2 (low solubility/high permeability), class 3 (high solubility/low permeability) and class 4 (low solubility/low permeability).
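The 4-box model maps directly onto a two-input lookup. The sketch below encodes only the class assignments listed above; how ‘high’ solubility and ‘high’ permeability are decided is left to the caller, since the cut-offs are not given in this section, and the function name is hypothetical.

```python
def bcs_class(high_solubility, high_permeability):
    """Return the biopharmaceutical class (1-4) for the 4-box model."""
    if high_solubility and high_permeability:
        return 1
    if high_permeability:   # low solubility, high permeability
        return 2
    if high_solubility:     # high solubility, low permeability
        return 3
    return 4                # low solubility, low permeability


if __name__ == "__main__":
    for sol in (True, False):
        for perm in (True, False):
            print(sol, perm, bcs_class(sol, perm))
```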

Regulatory guidance (FDA, 1997) stipulates the volume and pH of media to be used (between pH 1.0 and 8.0) for solubility determinations, but is silent on which buffer should be used. Typically, HCl at constant ionic strength is used for pH 1 and buffered aqueous solutions are used at pH 4.0, 7.2 and 8. An additional buffer, where the pH is equal to pKa (±1), should also be included. If the pKa value is outside the physiologically relevant pH range (1–8), using buffers at the pH extremes is recommended (i.e. pH 1 and 8), plus pH 6.8 to cover the EMA (European Medicines Agency) guideline (EMA, 2010) and at least two additional buffers, e.g. pH 4 and 5, to cover the middle of the physiological pH range. The studies should be conducted at 37 °C in triplicate. When conducting these studies it is recommended to investigate the excess solid material, as this may be the key to explaining possible variations in the pH-solubility profile.
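The buffer selection rule above can be expressed as a small helper. This is only one reading of the text: ‘pKa (±1)’ is interpreted here as pH = pKa − 1, pKa and pKa + 1 (keeping only values inside the 1–8 range), which is an assumption, and the function name is hypothetical. It is a planning aid, not a statement of regulatory requirements.

```python
def solubility_profile_phs(pka, ph_min=1.0, ph_max=8.0):
    """Suggest pH values for a pH-solubility profile from a single pKa."""
    if ph_min <= pka <= ph_max:
        # Typical set quoted in the text, plus pH values around the pKa.
        phs = {1.0, 4.0, 7.2, 8.0}
        for candidate in (pka - 1.0, pka, pka + 1.0):
            if ph_min <= candidate <= ph_max:
                phs.add(candidate)
    else:
        # pKa outside the physiological range: extremes, pH 6.8 (EMA)
        # and two mid-range buffers, e.g. pH 4 and 5.
        phs = {1.0, 4.0, 5.0, 6.8, 8.0}
    return sorted(phs)


if __name__ == "__main__":
    print(solubility_profile_phs(4.5))
    print(solubility_profile_phs(9.5))
```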


Bio-relevant media have been proposed and discussed extensively in the scientific literature (Dressman et al., 2007). This is because the in vivo solubility in the gastrointestinal tract defines the upper achievable concentration of the drug (Dressman et al., 2007). This is attributed to two principal factors. Firstly, the wide range of pHs encountered in the GI tract can have a significant impact on the solubilities of ionisable drugs. Secondly, solubilisation is enhanced either via lipid components in the fed state or by micellar mechanisms, whether by protein/peptide binding and/or via native or ingested surfactants, in both the fed and fasted states. However, there are obvious ethical and safety considerations to performing direct measurements of intra-gastric or intra-intestinal drug concentrations and as a consequence bio-relevant media were developed. For a compound in the discovery phase, where the intrinsic aqueous solubility looks poor, it is highly relevant to measure the solubility in bio-relevant media to assess the need for enabling formulation technologies.

Historically, dissolution media, i.e. simple aqueous buffers, have been used for quality control purposes, but are rarely predictive of in vivo behaviour. Thus when using dissolution as a surrogate for in vivo performance, media that adequately reflect in vivo conditions are required, i.e. bio-relevant media. Traditionally, simulated gastric fluid (SGF) and simulated intestinal fluid (SIF), either with or without enzymes, have been utilised as dissolution media, particularly in older pharmacopoeial monographs. More recently, these media have been complemented by simulated fasted state gastric media (FaSSGF) and simulated fed state gastric media (FeSSGF) to simulate fasted and fed state gastric solubility. Similarly, FaSSIF and FeSSIF media can be used to simulate fasted and fed state intestinal solubility.

Vertzoni et al. (2005) developed FaSSGF containing HCl, sodium chloride, pepsin and low levels of bile salts and lecithin. In contrast, it is more difficult to simulate fed state gastric media (FeSSGF) as the composition changes with both food composition and time. Thus, ‘snap-shot’ media have been developed that reflect the changing pH, buffer capacity and osmolality of gastric media during early (<75 min), middle (75–165 min) and late (>165 min) time periods after food has been consumed. They contain HCl, sodium chloride, milk/buffers and either acetate (middle) or phosphate (late) ions.

Similarly, Jantratid et al. (2008) and Jantratid and Dressman (2009) developed simulated fasted intestinal media (FaSSIF) containing maleic acid, sodium chloride, pepsin and low levels of bile salts and lecithin. As with gastric media, FeSSIF media are divided into early, middle and late time periods after food consumption. They contain maleic acid, sodium chloride, pepsin, low levels of bile salts, lecithin, glyceryl monooleate and sodium oleate.

The typical volumes in fasted state gastric media are ca. 300 ml (including 250 ml of co-administered vehicle, as per standardised PK study design). USP dissolution apparatus II (paddles) has a lower operating volume of 500 ml and thus USP III (reciprocating cylinder) is recommended (Marques et al., 2011). In a similar vein, bio-relevant media (FaSSIF and FeSSIF) have been used as the apical media in Caco-2 permeability studies (Kataoka et al., 2006).

Drug solubilities in bio-relevant media (FaSSGF and FeSSIF) are reasonably predictive of aspirated human gastric fluid (HGF) and human intestinal fluid (HIFfed) and most media show enhanced solubility vs. the corresponding intrinsic aqueous solubility. Dressman et al. (2007) showed that the solubilities of ketoconazole in gastric media (FaSSGF and HGF) were enhanced ca. 1300-fold (9054 and 9025 µg/ml, respectively). Similarly, solubilities in intestinal media (FeSSIF and HIFfed) were enhanced between 60 and 140-fold (406–540 and 476–989 µg/ml, respectively) vs. the intrinsic solubility (6.9 µg/ml). Dipyridamole showed a similar profile, whereas felodipine showed no enhancement in gastric media, but upwards of 400-fold enhancement in HIFfed.
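The fold-enhancement figures quoted for ketoconazole follow directly from dividing each bio-relevant solubility by the intrinsic solubility. The short sketch below reproduces that arithmetic using the values given in the text; only the function and variable names are invented.

```python
def fold_enhancement(solubility_in_medium, intrinsic_solubility):
    """Ratio of solubility in a medium to intrinsic aqueous solubility."""
    return solubility_in_medium / intrinsic_solubility


if __name__ == "__main__":
    intrinsic = 6.9  # ketoconazole intrinsic solubility, ug/ml (from text)
    media = {
        "FaSSGF": 9054.0,
        "HGF": 9025.0,
        "FeSSIF (low)": 406.0,
        "FeSSIF (high)": 540.0,
        "HIFfed (low)": 476.0,
        "HIFfed (high)": 989.0,
    }
    for name, value in media.items():
        print(f"{name}: {fold_enhancement(value, intrinsic):.0f}-fold")
```

The gastric values work out at a little over 1300-fold and the intestinal values span roughly 59 to 143-fold, consistent with the rounded figures in the text.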

Fig. 1. Solubility values of basic compounds in buffer and various biorelevant media, from Söderlind et al. (2010). Details on the media can be found in the main text. From left to right, the bars are solubilities for the compounds in phosphate buffer pH 6.5, FaSSIF-I, FaSSIF-II and HIF.

Fig. 2. Solubility values of neutral compounds in buffer and various biorelevant media, from Söderlind et al. (2010). Details on the media can be found in the main text. From left to right, the bars are solubilities for the compounds in phosphate buffer pH 6.5, FaSSIF-I, FaSSIF-II and HIF.

A recent publication showed that a modified FaSSIF-II medium (with lecithin reduced from 0.75 mM to 0.2 mM) better reflected the solubility in HIF for 24 model compounds (Söderlind et al., 2010). For the acidic and basic compounds tested, FaSSIF-II performed comparably with the original FaSSIF media (for basic compounds see Fig. 1). However, for the neutral compounds (atovaquone, cyclosporine, danazol, diethylstilbestrol, felodipine, griseofulvin and probucol) FaSSIF-II solubilities were better aligned with aspirated HIF solubilities (see Fig. 2). In contrast, these findings were not supported by Clarysse et al. (2009), who found the solubility of two neutral BCS class II drugs to be higher in duodenal HIF than in standard FaSSIF, which may be due to methodology differences. Clarysse et al. (2011) also investigated the use of a simpler medium based on d-α-tocopheryl polyethylene glycol 1000 succinate (TPGS), chosen for its functional biorelevance with FaSSIF and FeSSIF media, and advocated its use based on enhanced stability and lower cost. These contradictory results demonstrate the importance of further scientific efforts to characterise the human intestinal contents in order to better define a medium that can facilitate a biorelevant understanding of formulation dissolution behaviour.

Similarly, bio-relevant solubility and dissolution testing are predictive of both drug plasma concentrations and fraction absorbed (Nicolaides et al., 2001; Takano et al., 2006). These predictions are most successful for drugs with low dose/solubility ratios, e.g. glibenclamide in the fasted state and danazol in the fed state. However, for compounds with high dose/solubility ratios, e.g. atovaquone, although fed state simulations are accurate, fasted state predictions are not and appear to be significantly affected by the prevailing in vitro hydrodynamics (Dressman et al., 2007).
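The dose/solubility ratio referred to above is simply the volume of medium needed to dissolve the full dose. The sketch below computes it and compares it with a reference volume; the 250 ml default is an assumption chosen to match the co-administered volume mentioned earlier, not a cut-off taken from the cited studies, and the function names and example values are hypothetical.

```python
def dose_solubility_ratio_ml(dose_mg, solubility_mg_per_ml):
    """Volume (ml) of medium required to dissolve the whole dose."""
    return dose_mg / solubility_mg_per_ml


def is_high_ratio(dose_mg, solubility_mg_per_ml, reference_volume_ml=250.0):
    """True if the dose cannot dissolve in the assumed reference volume."""
    ratio = dose_solubility_ratio_ml(dose_mg, solubility_mg_per_ml)
    return ratio > reference_volume_ml


if __name__ == "__main__":
    # Hypothetical drug: 100 mg dose.
    print(dose_solubility_ratio_ml(100.0, 0.5))  # solubility 0.5 mg/ml
    print(is_high_ratio(100.0, 0.5))
    print(is_high_ratio(100.0, 0.125))           # solubility 0.125 mg/ml
```

Because solubility is medium-dependent, the same dose can fall on either side of the reference volume depending on whether buffer, fasted state or fed state solubility is used, which is one way to read the fed/fasted differences described above.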

5. Conclusion

Solubility is the most important physicochemical parameter used in drug discovery and development and consequently a good understanding of the concept and of the methods to predict or determine this parameter is important for the pharmaceutical scientist. A relatively large number of predictive methods have been used historically. Thus, although solubility is relatively easy to understand from a conceptual point of view, the ability of in silico methods to accurately predict aqueous solubility is constrained by (i) the inability to accurately predict the solid-state conformation(s) and (ii) the inability to model entropy considerations. The results from the recent Solubility Challenge exemplify the problems facing investigators in this area. Consequently, there is still ample scope for new experimental approaches to predict solubility, even if this is not a trivial undertaking due to the many factors influencing the determined value. The science of solubility determination will therefore continue to be a topic of interest in the future.

Acknowledgements

To Ian Churcher and Rob Young (GSK, Medicinal Chemistry) for helpful discussions and commentary.

References

Alsenz, J., Kansy, M., 2007. High throughput solubility measurement in drug discovery and development. Adv. Drug Deliv. Rev. 59, 546–567.

Amidon, G.L., Lennernas, H., Shah, V.P., Crison, J.R., 1995. A theoretical basis for a biopharmaceutical drug classification: the correlation of in vitro drug product dissolution and in vivo bioavailability. Pharm. Res. 12, 413–420.

Bergström, C.A.S., Luthman, K., Artursson, P., 2004. Accuracy of calculated pH-dependent aqueous solubility. Eur. J. Pharm. Sci. 22, 387–398.

Bergström, C.A.S., Wassvik, C.M., Johansson, K., Hubatsch, I., 2007. Poorly soluble marketed drugs display solvation limited solubility. J. Med. Chem. 50, 5858–5862.

Betigeri, S., Thakur, A., Shukla, R., Raghavan, K., 2008. Effect of polymer additives on the transformation of BMS-566394 anhydrate to the dihydrate form. Pharm. Res. 25, 1043–1051.

Bevan, C.D., Lloyd, R.S., 2000. A high-throughput screening method for the determination of aqueous drug solubility using laser nephelometry in microtiter plates. Anal. Chem. 72, 1781–1787.

Brewster, M.E., Vandecruys, R., Peeters, J., Neeskens, P., Verreck, G., Loftsson, T., 2008. Comparative interaction of 2-hydroxypropyl-β-cyclodextrin and sulfobutylether-β-cyclodextrin with itraconazole: phase-solubility behaviour and stabilization of supersaturated drug solutions. Eur. J. Pharm. Sci. 34, 94–103.

Butler, J.M., Dressman, J.B., 2010. The developability classification system: application of biopharmaceutics concepts to formulation development. J. Pharm. Sci. 99, 4940–4954.

Cheng, T., Li, Q., Wang, Y., Bryant, S.H., 2011. Binary classification of aqueous solubility using support vector machines with reduction and recombinant feature selection. J. Chem. Inf. Model. 51, 229–236.

Chickos, J.S., Braton, C.M., Hesse, D.G., Liebman, J.F., 1991. Estimating entropies and enthalpies of fusion of organic compounds. J. Org. Chem. 56, 927–938.

Clarysse, S., Psachoulias, D., Brouwers, J., Tack, J., Annaert, P., Duchateau, G., Reppas, C., Augustijns, P., 2009. Postprandial changes in solubilising capacity of human intestinal fluids for BCS class II drugs. Pharm. Res. 26, 1456–1466.

Clarysse, S., Brouwers, J., Tack, J., Annaert, P., Augustijns, P., 2011. Intestinal drug solubility estimation based on simulated intestinal fluids: comparison with solubility in human intestinal fluids. Eur. J. Pharm. Sci. 43, 260–269.

Page 9: Aqueous solubility: Simple predictive methods (in silico, in vitro and bio-relevant approaches)

rnal o

C

D

D

D

D

E

E

F

F

F

G

G

H

H

H

H

H

H

H

J

J

J

J

J

J

J

K

K

K

K

K

K

L

L

L

D. Elder, R. Holm / International Jou

lark, M., 2005. Generalized fragment-substructure based property predictionmethod. J. Chem. Inf. Model. 45, 30–38.

atta, S., Grant, D.J.W., 2004. Advances in determination, prediction and engineering.Nat. Rev. Drug Discov. 3, 42–57.

elaney, J.S., 2005. Predicting aqueous solubility from structure. Drug Discov. Today10, 289–295.

readen, J.C., 2006. In silico prediction of aqueous solubility. Expert Opin. DrugDiscov. 1, 31–52.

ressman, J.B., Vertzoni, M., Goumas, K., Repas, C., 2007. Estimating drug solubilityin the gastrointestinal tract. Adv. Drug Deliv. Rev. 59, 591–602.

lamin, A.A., Ahlneck, C., Alderborn, G., Nystrom, C., 1994. Increased metastablesolubility of milled griseofulvin, depending on the formation of a disorderedsurface structure. Int. J. Pharm. 111, 159–170.

MA, 2010. Guideline on the investigation of bioequivalence.CPMP/EWP/QWP/1401/98 Rev. 1/Corr**.

aller, B., Ertl, P., 2007. Computational approaches to determining drug solubility.Adv. Drug Deliv. Rev. 59, 533–545.

DA, 1997. Guidance for Industry, Dissolution testing of immediate release solid oraldosage forms. US Department of Health and Human Services, Food and DrugAdministration, Center for Drug Evaluation and Research (CDER).

ourches, D., Muratov, E., Tropsha, A., 2010. Trust but verify: on the importance ofchemical structure curation in chemoinformatics and QSAR modelling research.J. Chem. Inf. Model. 50, 1189–1204.

leeson, M.P., Hersey, A., Montanari, D., Overington, J., 2011. Probing the linksbetween in vitro potency, ADMET and physicochemical parameters. Nat. Rev.Drug Discov. 10, 197–208.

öller, A.H., Hennermann, M., Keldenich, J., Clark, T., 2006. In silico prediction ofbuffer solubility based on quantum-mechanical and HQSAR- and topology-based descriptors. J. Chem. Inf. Model. 46, 648–658.

ansch, C., Quinlan, J.E., Lawrence, G.L., 1968. Linear free energy relationshipbetween partition coefficients and the aqueous solubility of organic liquids. J.Org. Chem. 33, 347–350.

ill, A.P., Young, R.J., 2010. Getting physical in drug discovery: a contemporaryperspective on solubility and hydrophobicity. Drug Discov. Today 15, 648–655.

ewitt, M., Cronin, M.T.D., Enoch, S.J., Madden, J.C., Roberts, D.W., Dearden, J.C., 2009.In silico prediction of aqueous solubility: the solubility challenge. J. Chem. Inf.Model. 49, 2572–2587.

oelke, B., Gieringer, S., Arit, M., Saal, C., 2009. Comparison of nephelometric, UV-spectroscopic and HPLC methods for high-throughput determination of aqueoussolubility in microtiter plates. Anal. Chem. 81, 3165–3172.

opfinger, A.J., Esposito, E.X., Llinàs, A., Glen, R.C., Goodman, J.M., 2009. Findings ofthe challenge to predict aqueous solubility. J. Chem. Inf. Model. 49, 1–5.

uang, L.F., Tong, W.Q., 2004. Impact of solid state properties on developabilityassessment of drug candidates. Adv. Drug Deliv. Rev. 56, 321–334.

uuskonen, J., 2000. Estimation of aqueous solubility for a diverse set of organiccompounds based on molecular topology. J. Chem. Inf. Comput. Sci. 40, 773–777.

adhav, A., Ferreira, R.S., Klumpp, C., Mott, B.T., Austin, C.P., Inglese, J., Thomas, C.J.,Maloney, D.J., Shoichet, B.K., Simeonov, A., 2010. Quantitative analyses of aggre-gation, autofluorescence, and reactivity artifacts in a screen for inhibitors of athiol protease. J. Med. Chem. 53, 37–51.

ain, N., Yang, G., Machatha, S.G., Yalkowski, S.H., 2006. Estimation of the aqueoussolubility of weak electrolytes. Int. J. Pharm. 319, 169–171.

antratid, E., Dressman, J.B., 2009. Biorelevant dissolution media simulating the prox-imal human gastrointestinal tract: an update. Dissol. Technol. 16, 21–25.

antratid, E., Janssen, N., Reppas, C., Dressman, J.B., 2008. Dissolution media simulat-ing conditions in the proximal human gastrointestinal tract: an update. Pharm.Res. 25, 1663–1676.

ohnson, S.R., Chen, X.Q., Murphy, D., Gudmundsson, O., 2007. A computationalmodel for the prediction of aqueous solubility that includes crystal packing,intrinsic solubility and ionization effects. Mol. Pharmacol. 4, 513–523.

ohnson, J.L.W., Yalkowski, S.H., 2005. Two new parameters for predicting theentropy of melting: eccentricity (epsilon) and spirality (mu). Ind. Eng. Chem.Res. 44, 7559–7566.

orgensen, W.L., Duffy, E.M., 2000. Prediction of drug solubility from Monte Carlosimulations. Bioorg. Med. Chem. Lett. 10, 1155–1158.

awakami, K., Miyoshi, K., Ida, Y., 2005. Impact of the amount of excess solids onapparent solubility. Pharm. Res. 22, 1537–1543.

erns, E.H., Di, L., Carter, G.T., 2008. In vitro solubility assays in drug discovery. Curr.Drug Metab. 9, 879–885.

eserü, G.M., Makara, G.M., 2009. The influence of lead discovery strategies on theproperties of drug candidates. Nat. Rev. Drug Discov. 8, 203–212.

ier, L.B., Chen, C.-K., 1994. A cellular automata model of an aqueous solution. J.Chem. Inf. Comput. Sci. 34, 1334–1337.

lopman, G., Zhu, H., 2001. Estimation of aqueous solubility of organic compoundsby the group contribution approach. J. Chem. Inf. Comput. Sci. 41, 439–445.

ramer, C., Beck, B., Clark, T., 2010. Insolubility classification with accurate predic-tion probabilities using a metaClassifier. J. Chem. Inf. Model. 50, 401–414.

amanna, C., Bellini, M., Padova, A., Westerberg, G., Maccari, L., 2008. Straightforwardrecursive partitioning model for discarding insoluble compounds in the drug

discovery process. J. Med. Chem. 51, 2891–2897.

eeson, P.D., Springthorpe, B., 2007. The influence of drug-like concepts on decisionmaking in medicinal chemistry. Nat. Rev. Drug Discov. 6, 881–890.

ind, P., Maltseva, T., 2003. Support vector machines for the estimation of aqueoussolubility. J. Chem. Inf. Comput. Sci. 43, 1855–1859.

f Pharmaceutics 453 (2013) 3– 11 11

Lipinski, C.A., 2000. Drug-like properties and the causes of poor solubility and per-meability. J. Pharmacol. Toxicol. Methods 44, 235–249.

Lipinski, C.A., 2004. Lead- and drug-like compounds: the rule-of-five revolution.Drug Discov. Today Technol. 1, 337–341.

Lipinski, C.A., Lombardo, F., Dominy, B.W., Feeney, P.J., 2001. Experimental and com-putational approaches to estimate solubility and permeability in drug discoveryand development settings’. Adv. Drug Deliv. Rev. 46, 3–26.

Llinàs, A., Glen, R.C., Goodman, J.M., 2008. Solubility challenge: can you predict solu-bilities of 32 molecules using a database of 100 reliable measurements? J. Chem.Inf. Model. 48, 1289–1303.

Lovering, F., Bikker, J., Humblet, C., 2009. Escape from the flatland: increasingsaturation as an approach to improving clinical success. J. Med. Chem. 52,6752–6756.

Marques, M.R.C., Loebenberg, R., Almukainzi, M., 2011. Simulated biological fluidswith possible applications in dissolution testing. Dissol. Technol. 18, 15–28.

Morphy, R., 2006. The influence of target family and functional activity on thephysicochemical properties of pre-clinical compounds. J. Med. Chem. 49,2969–2978.

Murdande, S.B., Pikal, M.J., Shanker, R.M., Bogner, R.H., 2011. Aqueous solubilityof crystalline and amorphous drugs: challenges in measurement. Pharm. Dev.Technol. 16, 187–200.

Nicolaides, E., Symillides, M., Dressman, J.B., Reppas, C., 2001. Biorelevant dissolutiontesting to predict the plasma profiles of lipophilic drugs after oral administration.Pharm. Res. 18, 380–388.

Oprea, T., 2002. Current trends in lead discovery: are we looking for the appropriateproperties? J. Comput. Aided Mol. Des. 16, 325–334.

Ouvrard, C., Mitchell, J.B.O., 2003. Can we predict lattice energy from molecularstructure? Acta Crystallogr. B: Struct. Sci. 59, 676–685.

Palmer, D.S., Llinàs, A., Morao, I., Day, G.M., Goodman, J.M., Glen, R.C., Mitchell, J.B.O.,2008. Predicting intrinsic aqueous solubility by a thermodynamic cycle. Mol.Pharmacol. 5, 266–279.

Palmer, D.S., O’Boyle, N.M., Glen, R.C., Mitchell, J.B., 2007. Random forest models topredict aqueous solubility. J. Chem. Inf. Model. 47, 150–158.

Pudipeddi, M., Serajudin, A.T.M., 2005. Trends in solubility of polymorphs. J. Pharm.Sci. 94, 929–939.

Ritchie, T.J., MacDonald, S.J.F., 2009. The impact of aromatic ring count on compounddevelopability – are too many aromatic rings a liability in drug design? DrugDiscov. Today 14, 1011–1020.

Ritchie, T.J., MacDonald, S.J.F., Young, R.J., Pickett, S.D., 2011. The impact of aro-matic ring count on compound developability: further insights by examiningcarbo- and hetero-aromatic and -aliphatic ring types. Drug Discov. Today 16,164–171.

Saal, C., Petereit, A.C., 2012. Optimizing solubility: kinetic versus thermodynamicsolubility temptations and risks. Eur. J. Pharm. Sci. 47, 589–595.

Sanghvi, T., Jain, N., Yang, G., Yalkowski, S.H., 2003. Estimation of aqueous solubilityby the General Solubility Equation (GSE) the easy way. QSAR Comb. Sci. 22,258–262.

Söderlind, E., Karlsson, E., Carlsson, A., Kong, R., Lenz, A., Lindborg, S., Sheng, J.J., 2010.Simulating fasted human intestinal fluids: understanding the roles of lecithinand bile acids. Mol. Pharmacol. 7, 1498–1507.

Stegemann, S., Leveiller, F., Franchi, D., de Jong, H., Linden, H., 2007. When poorsolubility becomes an issue: from early stage to proof of concept. Eur. J. Pharm.Sci. 31, 249–261.

Stuart, M., Box, K., 2005. Chasing equilibrium: measuring the intrinsic solubility ofweak acids and bases. Anal. Chem. 77, 983–990.

Takano, R., Sugano, K., Higashida, A., Hayashi, Y., Machida, M., Aso, Y., Yamashita,S., 2006. Oral absorption of poorly water-soluble drugs: computer simulationof fraction absorbed in humans from miniscale dissolution test. Pharm. Res. 23,1144–1156.

Tetko, I.V., Tanchuk, V.Y., Kasheva, T.N., Villa, A.E.P., 2001. Estimation of aqueoussolubility of chemical compounds using E-state indices. J. Chem. Inf. Comput.Sci. 41, 1488–1493.

Vertzoni, M., Dressman, J.B., Butler, J.M., Hempenstall, J.M., Reppas, C., 2005. Simu-lation of fasting gastric conditions and its importance for the in vivo dissolutionof lipophilic compounds. Eur. J. Pharm. Biopharm. 60, 413–417.

Wang, J., Hou, T., 2011. Recent advances on aqueous solubility prediction. Comb.Chem. High Throughput Screen. 14, 328–338.

Wassvik, C.M., Holmén, A.G., Draheim, R., Artursson, P., Bergström, C.A.S., 2008.Molecular characteristics for solid-state limited solubility. J. Med. Chem. 51,3035–3039.

Wenger, K., Zell, A., 2003. Prediction of aqueous solubility and partition coefficientoptimized by genetic algorithm based descriptor selection method. J. Chem. Inf.Comput. Sci. 43, 1077–1084.

Wenlock, M.C., Austin, R.P., Barton, P., Davis, A.M., Leeson, P.D., 2003. A comparison ofthe physicochemical property profiles of development and marketed oral drugs.J. Med. Chem. 46, 1250–1256.

Xia, X., Maliski, E., Poppe, L., Cheetham, J., 2003. Solubility prediction by recursivepartitioning. Pharm. Res. 20, 1634–1640.

Yan, A.X., Gasteiger, J., Anzali, S., 2004. Linear and nonlinear functions on modelling

of aqueous solubility of aqueous compounds. J. Comput. Aided Mol. Des. 18,75–87.

Young, R.J., Green, D.V.S., Luscombe, C.N., Hill, A.P., 2011. Getting physical in drugdiscovery II: the impact of chromatographic hydrophobicity measurements andaromaticity. Drug Discov. Today 16, 822–830.