
Mining the WHO Drug Safety Database Using Lasso Logistic Regression

Ola Caster

U.U.D.M. Project Report 2007:16

Degree project in mathematical statistics, 20 credits

Supervisor: Andrew Bate, the Uppsala Monitoring Centre

Examiner: Silvelyn Zwanzig

June 2007

Department of Mathematics

Uppsala University


Master of Science Project in Mathematical Statistics

Supervisor: Ph.D. Andrew Bate, the Uppsala Monitoring Centre

Examiner: Ph.D. Silvelyn Zwanzig, Uppsala University


Abstract

For reasons such as low incidence, occurrence in groups frequently excluded from clinical trials and long onset times, some adverse drug reactions (ADRs) of a new medicinal product stay unnoticed until after market launch. The World Health Organization (WHO) in collaboration with the Uppsala Monitoring Centre (UMC) continuously collect spontaneous ADR reports from the entire world and use data mining approaches to detect which drugs are most likely to cause which previously unanticipated ADRs. This WHO drug safety database, being the largest of its kind, comprises about 3.8 million accumulated reports.

The currently used data mining methods are based on two-dimensional projections of the data with respect to a given drug-ADR combination. This combination is then given an association score based on the discrepancy between the observed and expected number of reports on it. In this thesis these disproportionality-based methods are represented by the information component (IC) measure of the UMC, a shrunk Bayesian measure.

A limitation with the IC is its incapability to deal with confounding by co-medication and masking. Confounding by co-medication means that the association between a drug and a certain ADR might seem stronger than it really is because that drug is used together with another drug, which in turn is truly associated with the ADR. Masking, on the other hand, is a phenomenon whereby a very strong association between an ADR and some drug might weaken the associations between that ADR and other drugs.

Here a novel method to mine the WHO drug safety database is proposed to address these issues, the lasso logistic regression (LLR). Instead of studying each combination separately, in the LLR model the ADR under study is fixed and its presence on a report is predicted by the presence of all occurring drugs in the database, thus yielding a logistic regression framework. Further, independent prior Laplace distributions are put on the parameters, resulting in a lasso-type shrinkage where a subset of the parameters are shrunk to exactly zero.

The LLR was confirmed to correct for confounding by co-medication and masking in simulated scenarios and specific clinical examples. Further, with a specific degree of shrinkage the LLR had 10 % higher recall and maintained precision in comparison to the IC with respect to a test database. Although its transparency is limited, the LLR has an important role to play in the future of ADR monitoring.


Contents

Abbreviations
1 Introduction
2 Models and Methods
  2.1 The Database Structure
  2.2 Two-Dimensional Measures
  2.3 The Lasso Logistic Regression Model
    2.3.1 Interpretation of the β vector
  2.4 The CLG-lasso Algorithm
  2.5 Implementation
  2.6 User Choices
    2.6.1 Hyperparameter
    2.6.2 Signalling Threshold
3 Experiments
  3.1 Simulation
  3.2 Confounding by Co-Medication
    3.2.1 Example 1: Lactic Acidosis
    3.2.2 Example 2: Hypertriglyceridaemia
    3.2.3 Example 3: Haemorrhagic Cystitis
  3.3 Masking
  3.4 Systematic Comparisons
    3.4.1 Precision-Recall
    3.4.2 Properties of Disconcordant Combinations
4 Discussion
Acknowledgements
Appendix A Example of a Result File


Abbreviations

ADR    Adverse Drug Reaction
WHO    World Health Organization
UMC    Uppsala Monitoring Centre
IC     Information Component
PRR    Proportional Reporting Ratio
LLR    Lasso Logistic Regression
OR     Odds Ratio
MAP    Maximum a Posteriori
BBR    Bayesian Binary Regression
NRTI   Nucleoside Reverse Transcriptase Inhibitor
NaRTI  Nucleoside Analogue Reverse Transcriptase Inhibitor
NNRTI  Non-Nucleoside Reverse Transcriptase Inhibitor
PI     Protease Inhibitor
B      Biguanide
S      Sulfonylurea
T      Thiazolidinedione
HIV    Human Immunodeficiency Virus
SSRI   Selective Serotonin Re-Uptake Inhibitors
MS     Multiple Sclerosis
FDA    Food and Drug Administration
HBLR   Hierarchical Bayesian Logistic Regression


1 Introduction

The issue of drug safety is one of the greatest, if not the greatest, concerns within the pharmaceutical community. It has been well known at least since the thalidomide tragedy in 1961 that some adverse drug reactions (ADRs) will stay unnoticed until after drug launch [1]. Typically, these ADRs have a low incidence, occur in groups frequently excluded from clinical trials (such as pregnant women) or have long onset times [2]. With a pharmaceutical industry that already has severe problems affording its clinical trial programs under the current legislation, the way out seems to be post-marketing drug safety surveillance rather than intensified pre-marketing testing.

One of several approaches within post-marketing drug safety surveillance is to collect a large number of spontaneous ADR reports and mine the database thus constructed. When this mining is performed repeatedly over a period of time it is called ADR monitoring, an idea dating back to the 1960s [3]. Already then the need for international cooperation within ADR monitoring was noticed [1]. In this spirit, the World Health Organization (WHO) in collaboration with the Uppsala Monitoring Centre (UMC) collects reports from about 80 countries worldwide, and has now accumulated over 3.8 million reports. This WHO drug safety database, which is the largest of its kind, is mined quarterly.

The database includes about 14,000 drug terms and 2,000 ADR terms. There are reports on about 750,000 out of the approximately 20 million theoretical combinations of drug and ADR terms. From here on, the word combination will always have this meaning, unless explicitly stated otherwise. Apart from the size of the database, mining it is complicated by substantial underreporting, heterogeneity of data on reports and reporting biases [4].

The primary task when mining this and similar databases is to find out which drugs are most likely to cause which previously unanticipated ADRs, i.e. to highlight combinations which seem to be too frequently reported. These combinations are then normally further investigated, often by clinical review of the evidence. Mainstay statistical inference such as hypothesis testing is not feasible for databases of this kind due to the non-systematic data collection and lack of comparison groups. To avoid confusion for readers familiar with ADR monitoring, a signal is here defined as the highlighting of a combination for clinical review by judging the reporting on that combination as unexpectedly high [5]. This use of the word does not take any clinical evidence of causality into consideration, as has been suggested when trying to define various concepts within this field of research [6].


The currently most popular data mining methods for quantitative analysis of drug safety databases are the so-called disproportionality-based methods, which are either Bayesian or non-Bayesian two-dimensional projections of the data [7]. An example of the former is the information component (IC) measure of the UMC [8], and an example of the latter is the proportional reporting ratio (PRR) [9]. The major advantage of Bayesian measures in this context is that they exhibit an a priori belief of no disproportionality, which means that they are shrunk towards a value of no association. By varying the shrinkage one can control how much data is needed to shift away from the a priori belief towards a suspicion that the combination is actually disproportionally over-reported. Due to the nature of the two-dimensional projections each combination is studied separately, and in some way the observed number of reports on that combination is compared to what would be expected based on the rest of the reports.

However, the disproportionality-based methods suffer from two clear limitations. Firstly, they are unable to correct for confounding by co-medication [10]. This means that if one drug A, say, truly causes some ADR and another drug B, say, is frequently co-medicated with drug A, then these methods will signal that both drugs are likely to cause the ADR. Some researchers use the wording 'signal leakage' from drug A to drug B, and drug B is sometimes referred to as an 'innocent bystander'. This is a well-documented phenomenon within statistics, where it is called Simpson's paradox. It is, however, not a paradox but rather a simple consequence of the fact that each combination is studied separately.

The second problem is related to the background reporting rate. When studying a particular combination, the disproportionality-based methods naively assume that all reports on the ADR excluding the drug of interest constitute some sort of general background reporting of the ADR. However, if there are one or more drugs which occur frequently on those reports the background reporting becomes substantial, which increases the expected number of reports on the combination currently under study. Thus, when the observed number of reports is compared to the expected number of reports, the false conclusion could be drawn that the combination between the drug and the ADR should not be highlighted as a signal. This phenomenon is called 'masking', or sometimes 'cloaking' [10]. Note that this description of masking does not imply that there is any causal link between the ADR and the drug of the masked combination.

In this thesis, a fundamentally different approach to mining the WHO drug safety database is studied, namely regression. Generally speaking, in a regression model the value of some dependent variable is explained by a set of predictor variables, and their respective degrees of contribution are estimated in terms of parameters. The nature of drug safety databases, where on each report a certain ADR is either present or absent, makes the use of logistic regression plausible [11]. By simultaneously using all drugs as predictors for the presence of the ADR on a report, a multivariate framework is set up which, at least theoretically, corrects for confounding by co-medication. Further, because the regression models include an intercept which is a function of the background reporting rate, the regression approach could in theory be anticipated to avoid problems with masking. Even though there seems to be a huge gap between the disproportionality-based methods and the regression framework, there is a clear link. In fact it can be shown that the parameters of a logistic regression model are themselves measures of disproportionality [12], although adjusted with respect to the other predictors, i.e. the presence of other drugs in this application.

To be precise, the regression method used is lasso logistic regression (LLR) [13]. This is one of many shrinkage regression methods, which all have the basic idea of shrinking the parameters towards zero. Generally speaking, shrinkage regression is appealing for two reasons: Firstly, whereas the estimates are no longer unbiased they exhibit a lower variance, which gives an overall better predictability of the models, and secondly, the models become easier to interpret because a smaller subset of effects is highlighted [14]. Probably the best-known shrinkage regression method is ridge regression, which was introduced as early as 1962 [15], six years after the idea of biased estimation had first been proposed by Stein [16].

In ADR monitoring one is primarily interested in neither predictability nor interpretability of models, but rather in finding out which drugs are likely to cause which ADRs. However, the shrinkage approach still offers three potential advantages over ordinary logistic regression (as normally implemented) in this application, where the number of predictors is of the magnitude $10^4$: Numerical instability in the estimation is avoided, the computations become considerably faster and, finally, drugs with very few reports on the ADR of interest will not be highlighted, given that the shrinkage is strong enough [17, 13]. This final property is completely analogous to the shrinkage of the Bayesian disproportionality-based methods.

The lasso (least absolute shrinkage and selection operator) was proposed by Tibshirani in 1996 [18], although similar ideas seem to have emerged earlier in related fields [19]. It is a regression method that imposes an $L_1$ penalty on the parameters, which means that the sum of the absolute values of the parameters is restricted. Already in the original description of the lasso the connection to Bayesian statistics was stated: The $L_1$ penalty is equivalent to putting independent prior Laplace distributions on the parameters. Because of the special nature of this penalty, a subset of the parameters are shrunk to exactly zero. This is appealing in the context of ADR monitoring since it gives a natural signalling threshold at zero, although, admittedly, it is not altogether obvious that lasso logistic regression should be superior to e.g. ridge logistic regression in this application.

It might be clarifying to think of the LLR as a multivariate extension of the Bayesian measures of disproportionality, e.g. the IC, which in turn are based on the non-Bayesian measures of disproportionality. However, this extension brings not only advantages but also possible limitations, such as the need for model assumptions and a presumably heavier computational load. The sheer complexity of the method is also an issue since it might be difficult to interpret the models produced.

The aim of the project is to:

1. implement the method of LLR on the WHO drug safety database,

2. investigate the properties of the LLR in ADR monitoring to see if this method is practically useful, and

3. attempt to characterize when the LLR is most and least likely to be beneficial in ADR monitoring.


2 Models and Methods

2.1 The Database Structure

The structure of the $n$ reports is:

$$ \left\{ \left([a_{11}, \ldots, a_{1r_1}]^T, [d_{11}, \ldots, d_{1s_1}]^T\right), \ldots, \left([a_{n1}, \ldots, a_{nr_n}]^T, [d_{n1}, \ldots, d_{ns_n}]^T\right) \right\} \tag{1} $$

where $a_{ij}$ represents the $j$-th ADR in the $i$-th report and, similarly, $d_{ij}$ represents the $j$-th drug in the $i$-th report. We have that $a_{ij} \in A = \{\alpha_1, \ldots, \alpha_{|A|}\}$, the set of all ADRs, and that $d_{ij} \in D = \{\delta_1, \ldots, \delta_{|D|}\}$, the set of all drugs. Every report includes at least one ADR and one drug, and we call a report with only a single drug listed a sole drug report.

2.2 Two-Dimensional Measures

Whereas the focus of this thesis is LLR, a couple of two-dimensional measures are needed to set up the framework. A two-dimensional projection of the database is given in Table 1.

There are several disproportionality measures defined based on this projection. In this project comparisons against the IC measure occur frequently. This measure is well approximated by [20]:

$$ \mathrm{IC} = \log_2 \frac{a + 0.5}{\dfrac{(a+c)(a+b)}{a+b+c+d} + 0.5} \tag{2} $$

The numerator is the observed number of reports plus a constant (0.5) and the denominator is the expected number of reports (under the assumption of mutual independence) plus the same constant. The addition of this constant to both the numerator and denominator shrinks the ratio towards 1, and thus the measure towards 0 since the logarithm to the base 2 is used. The shrinkage will be stronger for combinations with few reports. In practice, in addition to a point estimate, the lower endpoint of a 95 % credibility interval (which will here be called a confidence interval) is also calculated [4]. In screening the WHO database, the IC highlights all combinations for which this 95 % lower confidence bound, called $\mathrm{IC}_{025}$, is strictly positive [8].

Table 1. Two-dimensional projection of the database for ADR $j$ and drug $k$.

| | ADR $j$ yes | ADR $j$ no | Total |
|---|---|---|---|
| Drug $k$ yes | $a$ | $b$ | $a+b$ |
| Drug $k$ no | $c$ | $d$ | $c+d$ |
| Total | $a+c$ | $b+d$ | $a+b+c+d$ |
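As an illustration of Equation 2, the following minimal sketch computes the approximate IC point estimate from the four cell counts of Table 1. The function name and the example counts are hypothetical, and the lower bound $\mathrm{IC}_{025}$ is not computed here since its formula is not given in this section.

```python
import math

def information_component(a, b, c, d, shrink=0.5):
    """Approximate IC point estimate (Equation 2) from the Table 1 counts."""
    total = a + b + c + d
    # Expected number of reports on the combination under independence
    expected = (a + c) * (a + b) / total
    return math.log2((a + shrink) / (expected + shrink))

# Hypothetical counts: both cases have observed/expected = 2,
# but the second is based on ten times as many reports.
ic_few = information_component(a=2, b=8, c=8, d=82)
ic_many = information_component(a=20, b=80, c=80, d=820)
print(ic_few, ic_many)
```

With the same observed-to-expected ratio, the combination with fewer reports receives the smaller IC value, which is the shrinkage effect described above.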

Another two-dimensional measure, the odds ratio (OR), is of interest because it is connected to logistic regression (see Section 2.3.1). The odds ratio is defined as:

$$ \mathrm{OR} = \frac{a/b}{c/d} \tag{3} $$

To understand this definition we need to know that the odds for an event $A$ is given by $P(A)/(1 - P(A))$. Thus, the odds ratio is interpreted in this context as the quotient between the odds that the ADR is present on a report with the drug and the odds that the ADR is present on a report without the drug. Strictly speaking, since we are dealing with the total database, this should be called the reporting odds ratio [7]. Non-shrunk crude reporting odds ratios have themselves been proposed as sensible measures of disproportionality.
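Equation 3 can be evaluated directly from the counts of Table 1. The sketch below uses hypothetical function names and counts, and assumes that all four cells are non-zero, since no shrinkage is applied to the crude ratio.

```python
import math

def reporting_odds_ratio(a, b, c, d):
    """Crude reporting odds ratio (Equation 3); assumes b, c and d are non-zero."""
    odds_with_drug = a / b      # odds of the ADR on reports listing the drug
    odds_without_drug = c / d   # odds of the ADR on reports not listing the drug
    return odds_with_drug / odds_without_drug

def log_reporting_odds_ratio(a, b, c, d):
    """Natural log of the crude reporting odds ratio; assumes a is non-zero too."""
    return math.log(reporting_odds_ratio(a, b, c, d))

# Hypothetical counts for one drug-ADR combination
print(reporting_odds_ratio(20, 80, 10, 90))
```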

2.3 The Lasso Logistic Regression Model

The notation of this section follows that of Section 2.1.

In contrast to the two-dimensional measures, in LLR one ADR rather than one combination is studied at a time. If we study the $j$-th ADR the data are transformed into the following form: $\{(\mathbf{x}_1, y_1), \ldots, (\mathbf{x}_n, y_n)\}$, where

$$ y_i = \begin{cases} 1 & \text{if } a_{ik} = \alpha_j \text{ for some } k \in \{1, \ldots, r_i\} \\ 0 & \text{otherwise} \end{cases} $$

and

$$ x_{ip} = \begin{cases} 1 & \text{if } d_{il} = \delta_p \text{ for some } l \in \{1, \ldots, s_i\} \\ 0 & \text{otherwise} \end{cases} $$

if $x_{ip}$ is the $p$-th element of $\mathbf{x}_i$. Note that $\mathbf{x}_i$ stays the same irrespective of which ADR is studied.
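The transformation above can be sketched as follows. The report contents, drug names and function name are made up for illustration, and a dense list-of-lists representation is assumed for readability; with a predictor set of the size discussed in Section 1, a sparse representation would presumably be needed in practice.

```python
def to_regression_data(reports, adr, drugs):
    """Turn reports of the form (ADR list, drug list) into (X, y) for one ADR.

    X[i][p] is 1 if drug p is listed on report i, and y[i] is 1 if the
    studied ADR is listed on report i.
    """
    index = {drug: p for p, drug in enumerate(drugs)}
    X, y = [], []
    for adr_list, drug_list in reports:
        row = [0] * len(drugs)
        for drug in drug_list:
            row[index[drug]] = 1
        X.append(row)
        y.append(1 if adr in adr_list else 0)
    return X, y

# Hypothetical mini-database with three reports
reports = [
    (["rash"], ["drugA"]),
    (["rash", "nausea"], ["drugA", "drugB"]),
    (["nausea"], ["drugB"]),
]
drugs = ["drugA", "drugB"]
X_rash, y_rash = to_regression_data(reports, "rash", drugs)
X_nausea, y_nausea = to_regression_data(reports, "nausea", drugs)
```

Only the outcome vector changes when another ADR is studied; the predictor matrix is identical in both calls.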

Now, let the ADR under study be fixed. Let $\mathbf{x}_i$ be extended with the constant term 1 as its first element to accommodate the intercept $\beta_0$ and consider the ordinary logistic regression model:

$$ P(y = 1 \mid \boldsymbol{\beta}, \mathbf{x}_i) = \frac{\exp(\boldsymbol{\beta}^T \mathbf{x}_i)}{1 + \exp(\boldsymbol{\beta}^T \mathbf{x}_i)} \tag{4} $$

We extend this model to the lasso logistic regression model by imposing an $L_1$ constraint on the parameters:

$$ \sum_j |\beta_j| \le t \tag{5} $$


It is well known that finding the solution $\boldsymbol{\beta}$ to the model in Equation 4 under this constraint is the same as finding the (Bayes) maximum a posteriori (MAP) estimate under independent Laplace (double exponential) priors for the $\beta_j$s [18]:

$$ f(\beta_j \mid \lambda) = \frac{\lambda}{2} \exp(-\lambda |\beta_j|) \tag{6} $$

This prior distribution has mean 0, mode 0 and variance $2/\lambda^2$. It can be shown that there is a one-to-one transformation between the bound $t$ and the hyperparameter $\lambda$ [14].

From now on we adopt the Bayesian perspective. Putting independent Laplace priors with mean 0 and variance $2/\lambda^2$ on the parameters yields the following posterior log-likelihood (except a normalizing constant) [13]:

$$ l(\boldsymbol{\beta}) = -\sum_{i=1}^{n} \log\left(1 + \exp(-\boldsymbol{\beta}^T \mathbf{x}_i y_i)\right) - \sum_{j=0}^{|D|} \left(\log\frac{2}{\lambda} + \lambda |\beta_j|\right) \tag{7} $$

The MAP estimate is the $\boldsymbol{\beta}$ that maximizes this posterior log-likelihood.
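A minimal sketch of how Equation 7 could be evaluated is given below. It assumes that the outcome is recoded as $y_i \in \{-1, +1\}$, which is the coding under which the likelihood term of Equation 7 agrees with Equation 4, and that every row of `X` starts with the constant 1 for the intercept. The function name and the toy data are hypothetical.

```python
import math

def log_posterior(beta, X, y, lam):
    """Posterior log-likelihood of Equation 7 (up to a normalizing constant).

    Assumes y[i] is coded as -1 or +1 and X[i][0] == 1 for the intercept.
    """
    log_likelihood = 0.0
    for x_i, y_i in zip(X, y):
        margin = y_i * sum(b * x for b, x in zip(beta, x_i))
        log_likelihood -= math.log1p(math.exp(-margin))
    # Laplace log-prior, applied to every parameter including the intercept
    log_prior = -sum(math.log(2.0 / lam) + lam * abs(b) for b in beta)
    return log_likelihood + log_prior

# Hypothetical toy data: intercept column plus one drug indicator
X_toy = [[1, 0], [1, 1]]
y_toy = [1, -1]
print(log_posterior([0.0, 0.0], X_toy, y_toy, lam=2.0))
```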

2.3.1 Interpretation of the β vector

In a univariate ordinary logistic regression model with a dichotomous predictor variable, $\boldsymbol{\beta}$ consists of only one parameter $\beta_1$ (if the intercept is neglected). It is well known and easily shown [12] that, if the predictor variable is coded as either 0 or 1, $\beta_1$ is the same as the log of the odds ratio (recall Section 2.2) for the predictor in the two possible outcomes of the dependent variable. This connection between logistic regression and odds ratios is very important and serves as a foundation for our interpretation of the $\boldsymbol{\beta}$ vector.
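The univariate result can be sketched in a few lines, assuming the model of Equation 4 with a single 0/1 predictor $x$ and ordinary (unshrunk) maximum likelihood estimates, written here with hats:

```latex
\begin{aligned}
\log\frac{P(y=1 \mid x)}{1 - P(y=1 \mid x)} &= \beta_0 + \beta_1 x, \\
\frac{\text{odds}(y=1 \mid x=1)}{\text{odds}(y=1 \mid x=0)}
  &= \frac{\exp(\beta_0 + \beta_1)}{\exp(\beta_0)} = \exp(\beta_1), \\
\hat\beta_1 &= \log\frac{a/b}{c/d} = \log \mathrm{OR}.
\end{aligned}
```

The last line uses the fact that, with one dichotomous predictor, the fitted odds in each group equal the observed odds $a/b$ and $c/d$ of Table 1.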

The univariate case is easily extended to a multivariate setting with more than one predictor variable, as is the case in ADR monitoring. In this case each $\beta_j$ (except the intercept) could be interpreted as an adjusted log odds ratio. In other words it is the log odds ratio for a certain predictor variable under the condition that all other predictor variables are identically distributed within the two outcomes of the dependent variable. Since this condition seldom holds, adjusted log odds ratios often differ from crude log odds ratios calculated from two-dimensional projections. This regression property is the reason why we would expect LLR to correct for confounding by co-medication.

By looking at the definition of an odds ratio, it seems reasonable that a value greater than one, equivalent to a positive $\beta_j$, would suggest an association between the ADR under study and the drug corresponding to the $\beta_j$. In this case, when the effect of all other drugs has been accounted for, the odds that a report including the drug will also include the ADR exceed the odds that a report not including the drug will include the ADR. Note that this chain of logic assumes that the database truly reflects the real world. Since it is well known that the database suffers from numerous flaws (see Section 1), we can only use the estimated $\beta_j$ values as a means of highlighting combinations for clinical review, and not for common statistical inference.

In the LLR model the parameters are to varying degrees shrunk towards zero (equivalent to shrinking the odds ratios towards unity) and the magnitude of the shrinkage depends on the value of the hyperparameter $\lambda$. Therefore the interpretation of the $\boldsymbol{\beta}$ vector is a much more delicate issue. However, our main interest will still be the positive parameters.

Equation 4 can aid in the interpretation of the intercept:

$$ P(y = 1 \mid \boldsymbol{\beta}, x_{i0} = 1;\ x_{ip} = 0\ \forall p \neq 0) = \frac{\exp(\beta_0)}{1 + \exp(\beta_0)} \approx \exp(\beta_0) $$

where the approximation will always be accurate for this database. Since $x_{ip} = 0$ for all $p \neq 0$, so that the effect of all drugs has been taken out, the intercept could be interpreted as the multivariate estimate of the log background reporting frequency of the ADR. This is the reason why we would expect the LLR to correct for masking when the background reporting is very unevenly spread out among the drugs.

2.4 The CLG-lasso Algorithm

Finding the MAP estimate of $\boldsymbol{\beta}$ is a convex optimization problem. In ordinary logistic regression $\boldsymbol{\beta}$ is estimated using the maximum likelihood method. The standard algorithm is the iteratively reweighted least squares (IRLS) algorithm, which is based on the Newton-Raphson method [14]. The disadvantage with this method in problems with many predictors, such as this, is the high memory load [13]. Also, when the lasso-type shrinkage is imposed on the parameters the IRLS algorithm cannot ensure convergence [18]. Instead, a recently proposed algorithm, the CLG-lasso algorithm, was used for fitting the logistic model with Laplace priors [13]. CLG-lasso, which is a so-called cyclic coordinate descent algorithm, is based on the CLG algorithm of Zhang and Oles [21]. An exhaustive description of the algorithm is beyond the scope of this thesis, and interested readers are referred to the detailed description by Genkin et al. [13].


The basis of all cyclic coordinate descent algorithms is to optimize with respect to only one variable at a time while all others are held constant. When this one-dimensional optimization problem has been solved, optimization is performed with respect to the next variable, and so on. When the procedure has gone through all variables it starts all over with the first one again, and the iterations proceed in this manner until some pre-defined convergence criterion is met.

The one-dimensional optimization problem in LLR is to find $\beta_j^{new}$, the value for the $j$-th parameter that maximizes the posterior log-likelihood assuming that all other $\beta_j$s are held constant. Looking at Equation 7, this is the same as minimizing the negated posterior log-likelihood as a function of the $j$-th parameter alone, so that the objective function $g()$ to be minimized becomes (up to an additive constant):

$$ g(z) = \sum_{i=1}^{n} \log\left(1 + \exp\left((\beta_j - z) x_{ij} y_i - \boldsymbol{\beta}^T \mathbf{x}_i y_i\right)\right) + \lambda |z| \tag{8} $$

The CLG-lasso algorithm uses an approximation to $g()$ such that, in the end, the update equation for $\beta_j$ becomes

$$ \beta_j^{new} = \begin{cases} \beta_j - \Delta_j & \text{if } \Delta v_j < -\Delta_j \\ \beta_j + \Delta v_j & \text{if } -\Delta_j \le \Delta v_j \le \Delta_j \\ \beta_j + \Delta_j & \text{if } \Delta_j < \Delta v_j \end{cases} \tag{9} $$

where the interval $[\beta_j - \Delta_j, \beta_j + \Delta_j]$ is an iteratively adapted trust region for the suggested update $\Delta v_j$. The width of this interval is determined based on its previous value and the previous update made to $\beta_j$. The suggested update is given by

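$$ \Delta v_j = \frac{\sum_{i=1}^{n} x_{ij} y_i \dfrac{1}{1 + \exp(\boldsymbol{\beta}^T \mathbf{x}_i y_i)} - \lambda\, \mathrm{sgn}(\beta_j)}{\sum_{i=1}^{n} x_{ij}^2 F(\boldsymbol{\beta}^T \mathbf{x}_i y_i, \Delta_j x_{ij})} \tag{10} $$

where, for some $\delta > 0$, $F()$ is defined as

$$ F(r, \delta) = \begin{cases} 1/4 & \text{if } |r| \le \delta \\ 1/\left(2 + \exp(|r| - \delta) + \exp(\delta - |r|)\right) & \text{otherwise} \end{cases} \tag{11} $$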

The suggested update $\Delta v_j$ has two problems. First, because $\mathrm{sgn}()$ is undefined at 0, the update itself is undefined at $\beta_j = 0$ and second, there is no guarantee that $g()$ decreases if $\beta_j$ changes sign. The first problem is solved by testing both $\mathrm{sgn}(\beta_j) = 1$ and $\mathrm{sgn}(\beta_j) = -1$ for a decrease in $g()$ when $\beta_j = 0$. The convexity of the objective function guarantees that at most one option can be successful. The second issue is solved by simply setting $\beta_j^{new}$ to 0 if the update suggests a change of sign for $\beta_j$.


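This update is performed once for each $\beta_j$ in each iteration. The algorithm stops when convergence has been reached as defined by

$$ \frac{\sum_{i=1}^{n} \left| (\boldsymbol{\beta}^T \mathbf{x}_i y_i)^{new} - (\boldsymbol{\beta}^T \mathbf{x}_i y_i)^{old} \right|}{1 + \sum_{i=1}^{n} \left| (\boldsymbol{\beta}^T \mathbf{x}_i y_i)^{new} \right|} \le \varepsilon \tag{12} $$

In this thesis, $\varepsilon = 0.0005$ throughout.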
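To make Equations 9 to 12 concrete, the following is a small, dense, pure-Python sketch of one way the updates could be put together. It is not the BBR implementation. Several details are assumptions not stated in this section: the outcome is coded as $-1/+1$, every trust-region half-width starts at 1, the half-width is adapted as $\Delta_j \leftarrow \max(2|\Delta\beta_j|, \Delta_j/2)$ following the published description of CLG [13], and every column of `X` contains at least one non-zero entry. All names and the toy data are hypothetical.

```python
import math

def F(r, delta):
    """Curvature bound of Equation 11."""
    if abs(r) <= delta:
        return 0.25
    return 1.0 / (2.0 + math.exp(abs(r) - delta) + math.exp(delta - abs(r)))

def suggested_update(X, y, r, j, trust, lam, sign):
    """Equation 10, with sgn(beta_j) supplied by the caller as `sign`."""
    num, den = -lam * sign, 0.0
    for x_i, y_i, r_i in zip(X, y, r):
        if x_i[j] != 0:
            num += x_i[j] * y_i / (1.0 + math.exp(r_i))
            den += x_i[j] ** 2 * F(r_i, trust * abs(x_i[j]))
    return num / den

def clg_lasso(X, y, lam, eps=0.0005, max_iter=1000):
    """Cyclic coordinate descent for lasso logistic regression (sketch)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    trust = [1.0] * p   # assumed initial trust-region half-widths
    r = [0.0] * n       # r[i] holds beta^T x_i y_i
    for _ in range(max_iter):
        r_old = list(r)
        for j in range(p):
            if beta[j] == 0.0:
                # sgn() is undefined at zero: try both directions
                dv = suggested_update(X, y, r, j, trust[j], lam, 1.0)
                if dv <= 0.0:
                    dv = suggested_update(X, y, r, j, trust[j], lam, -1.0)
                    if dv >= 0.0:
                        dv = 0.0   # neither direction helps, stay at zero
            else:
                sign = math.copysign(1.0, beta[j])
                dv = suggested_update(X, y, r, j, trust[j], lam, sign)
                if sign * (beta[j] + dv) < 0.0:
                    dv = -beta[j]  # a sign change is replaced by a move to zero
            step = min(max(dv, -trust[j]), trust[j])   # Equation 9
            beta[j] += step
            for i in range(n):
                r[i] += step * X[i][j] * y[i]
            trust[j] = max(2.0 * abs(step), trust[j] / 2.0)
        change = sum(abs(new - old) for new, old in zip(r, r_old))
        if change / (1.0 + sum(abs(v) for v in r)) <= eps:   # Equation 12
            break
    return beta

# Hypothetical toy data: columns are intercept, drug A, drug B
X_demo = [[1, 1, 0]] * 4 + [[1, 0, 0]] * 3 + [[1, 0, 1]]
y_demo = [1, 1, 1, -1, -1, -1, -1, -1]
beta_weak_shrinkage = clg_lasso(X_demo, y_demo, lam=0.1)
beta_strong_shrinkage = clg_lasso(X_demo, y_demo, lam=100.0)
```

With a very large $\lambda$ no coordinate can leave zero in this sketch, which illustrates how the Laplace prior shrinks parameters to exactly zero; with weak shrinkage the parameter for the hypothetical drug A, which is listed on all reports with the ADR, becomes positive.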

2.5 Implementation

To implement the model presented in Section 2.3 using the algorithm in Section 2.4 on the WHO drug safety database, a freely available software package called BBR (Bayesian Binary Regression) was used [13, 22]. It is an efficient implementation of the CLG-lasso algorithm written in the C language, available for both Windows and Linux environments and operated via the command line. For a detailed description of the software and its options the reader is referred to the web page [22].

The starting point was a transcript from the database in the format presented in Section 2.1. Since the BBR software requires a data file with a slightly different format, a translating script was written in the Perl language [23]. It prints a data file based on which ADR the user wishes to study. The syntax looks like

> perl translator.plx adr=<adr number> <infile> <outfile>

where the command line options have obvious meanings.

When called, the BBR software by default prints a so-called model file which basically includes the identification numbers for the predictors used (in this case all the drugs) and the respective parameter estimates. The model files give little information when inspected visually if there are more than just a few predictors. The software also prints information such as estimation diagnostics and numerically sorted parameter estimates to the screen. For easier use of the software and facilitated presentation of the results, a Perl script was written. It takes four arguments: Commands to pass on to the BBR software, the name of the data file, the name of the result file it should produce and the number of the ADR studied. These arguments are simply separated by semicolons:

> perl BBR.plx ;<BBR options>;<data file>;<result file>;<adr number>


This script first starts the BBR software with the given options and data file. It then reads the information sent to the screen and saves it as the result file specified. Finally, it reads the result file and inserts the following information for drugs that have at least one report with the ADR studied:

• The name of the drug

• The IC value for the combination between this drug and the ADR

• The 95% lower confidence bound for the IC value ($\mathrm{IC}_{025}$)

All other lines are left untouched. In this way a result file is produced which includes estimation diagnostics as well as easily overviewed parameter estimates and corresponding IC values. An example of such a result file is given in Appendix A.

Often it is desirable to get not only point estimates of model parameters but also at least approximate confidence intervals. Whereas there exist different approximate formulae for the calculation of the covariance matrix of the lasso parameter estimates in the context of linear models [18, 24], we know of nothing equivalent for logistic models. However, because the bootstrap is such a general approach, the bootstrap percentile method [25] was used to obtain approximate confidence intervals for the parameter estimates in this application of the LLR. The method is very straightforward: Sampling with replacement from the original $n$ reports is performed to construct $B$ bootstrap datasets of size $n$, and then $\boldsymbol{\beta}$ is estimated from each of the bootstrap datasets. Finally, the approximate $1 - \alpha$ confidence interval for $\beta_j$ is given by $[\beta_j^{\alpha/2}, \beta_j^{1-\alpha/2}]$, where $\beta_j^{k}$ is the $k$-quantile (that is, the $100k$-th percentile) of the $B$ estimates for $\beta_j$.
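The bootstrap percentile method can be sketched generically as below. This is not the Perl script used in the project: the function name, the default values and the simple index rule for the empirical quantiles are assumptions, and the `estimator` argument stands in for a routine that refits the LLR on a resampled set of reports and returns the estimate of one $\beta_j$.

```python
import random

def bootstrap_percentile_ci(data, estimator, B=200, alpha=0.05, seed=0):
    """Approximate 1 - alpha percentile interval for a scalar estimate."""
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        # Resample n items with replacement and re-estimate
        estimator([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(B)
    )
    lower = estimates[int(alpha / 2 * B)]
    upper = estimates[min(B - 1, int((1 - alpha / 2) * B))]
    return lower, upper

def sample_mean(values):
    """Stand-in estimator used only to demonstrate the mechanics."""
    return sum(values) / len(values)

# Hypothetical data
low, high = bootstrap_percentile_ci([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], sample_mean)
```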

The property of the $L_1$ penalty to set some parameters to exactly zero introduced a small problem to the bootstrap procedure. It can very well happen that parameters get strictly positive or strictly negative estimates based on the original data and some of the bootstrap datasets, whereas they are estimated to be zero based on other bootstrap datasets. This means that the distribution of the bootstrap estimates will be bell-shaped around some value and have an isolated peak at zero. This was accepted, because if a lower confidence bound for a parameter is set to zero this simply reflects the fact that there is little data to support the association between the ADR and the drug corresponding to that parameter.

The bootstrap procedure was also implemented as a Perl script. It is operated the same way as BBR.plx above, but with an extra argument (<B>) determining the number of bootstrap samples:


> perl BBRboot.plx ;<B>;<BBR options>;<data file>;<result file>;<adr number>

It produces a result file similar to that of BBR.plx, but with an extra column for the lower confidence bound of the parameter estimates.

2.6 User Choices

2.6.1 Hyperparameter

As suggested in Section 2.3, the LLR depends on the user choosing a hyperparameter $\lambda$ or, equivalently, a bound $t$. The value will decide how much the parameters are shrunk. Because the degree of shrinkage is specified in terms of prior variance rather than $\lambda$ in the BBR software (the prior variance is equal to $2/\lambda^2$, see Section 2.3), the choice of hyperparameter is presented as prior variance in this thesis. In many applications of the lasso and similar methods the hyperparameter value is chosen with respect to its ability to produce models with low prediction error. In such cases the method of cross-validation [14] provides an easy and efficient way of estimating the hyperparameter.

The application of ADR monitoring is somewhat different since the models produced are not primarily intended for prediction. In fact, once the estimation of a given model is complete, the model is used to raise hypotheses about possible associations between drugs and the ADR under study. Thus, one could suspect that cross-validation would not be a good way of choosing the hyperparameter value in this application. This suspicion was supported by simple preliminary testing (results not shown) where cross-validation suggested far too little shrinkage; for instance, drugs with only a single report on an ADR received unrealistically high parameter estimates. The reason might be that some of the data can add to the predictability although it presents far too weak evidence for potential highlights.

In this study it was assumed that a prior variance of 1 would work well, or at least not be harmful to the method. This hyperparameter value was used in the entire study except the systematic comparisons (see Section 3.4), where different hyperparameter values were used. Interestingly, a similar pragmatic approach regarding a suitable shrinkage is used also for the IC measure [4].
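Since the shrinkage is stated as a prior variance while the model in Section 2.3 is written in terms of $\lambda$, a small conversion sketch may be helpful. It only uses the relation that the prior variance equals $2/\lambda^2$; the function names are hypothetical.

```python
import math

def lambda_from_prior_variance(variance):
    """Invert variance = 2 / lambda^2 for the Laplace prior."""
    return math.sqrt(2.0 / variance)

def prior_variance_from_lambda(lam):
    """Variance of a Laplace prior with hyperparameter lambda."""
    return 2.0 / lam ** 2

# The prior variance of 1 used in this study corresponds to lambda = sqrt(2)
lam_default = lambda_from_prior_variance(1.0)
```

A smaller prior variance corresponds to a larger $\lambda$ and thus to stronger shrinkage.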


2.6.2 Signalling Threshold

In the setting of ADR monitoring one needs a signalling threshold, i.e. a de-cision rule specifying when the output of the method should lead to furtherinvestigation of a particular combination. Since the interpretation of the modelparameters of the LLR is quite complex (see Section 2.2), it is far from obvioushow this threshold should be chosen.

In this study two intuitive signalling thresholds were used, β025 > 0 and β > 0. The former has a clear connection to the corresponding threshold for the IC measure, which is IC025 > 0. The main disadvantage of this choice is that the problems of the bootstrap procedure (see Section 2.5) propagate into the signalling decision. The latter, which was only used in the systematic comparisons, lacks the fine-tuning that a varying confidence level offers, but its low computational load is a clear advantage. See also Section 3.4 for further elaboration on the two signalling thresholds.
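The two decision rules can be sketched as follows. The sketch assumes that β025 is taken as the empirical 2.5 % percentile of the bootstrap replicates of β; the exact bootstrap interval construction of Section 2.5 is not reproduced here, and the function names and example values are hypothetical.

```python
def lower_bound(replicates, level=0.025):
    """Empirical lower percentile of bootstrap estimates (percentile method, an assumption)."""
    ordered = sorted(replicates)
    index = int(level * len(ordered))  # e.g. index 5 of 200 replicates
    return ordered[index]


def signals(beta_hat, replicates=None):
    """Return True if the combination is highlighted.

    Without replicates the rule is beta > 0; with replicates it is beta025 > 0.
    """
    if replicates is None:
        return beta_hat > 0
    return lower_bound(replicates) > 0


# Made-up replicate sets, for illustration only (200 values each).
mostly_positive = [i / 100 for i in range(-3, 197)]
wider_spread = [i / 100 for i in range(-10, 190)]
```

With the same point estimate, `signals(0.5, mostly_positive)` is highlighted while `signals(0.5, wider_spread)` is not, which shows how the bootstrap variability enters the signalling decision.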


3 Experiments

This section covers three different parts: simulation studies, studies of specific real examples and systematic comparisons between the LLR and the IC. For each part the experimental setup is described, followed by the results and a brief discussion. The reason for this separation is to emphasize that the three parts follow a natural chronological course. Note that in quite a few graphs the y-axis label is 'signal score', which means the IC value or, in the case of the LLR, the β estimate. These are not expected to be on the same scale and therefore cannot be compared in an absolute sense.

3.1 Simulation

In order to investigate whether the LLR was likely to be able to correct for confounding by co-medication and masking, simple simulated examples were studied first. The original database was used as background, and reports on two made-up drugs, A and B, and a made-up ADR, X, were then added. Because of this simple approach the examples were transparent, which made the interpretation easier. The LLR was run with a prior variance of 1 without bootstrapping.

Using the notation A/X for the combination of drug A and ADR X; ¬A/X for the combination of drugs other than A and ADR X; A,B/X for the combination of A and B together with X, and so forth, the two scenarios are presented in Table 2.

Scenario 1 was intended to simulate a situation with confounding by co-medication. The most important features are the 100 sole drug reports on A/X, the 10 reports on A,B/X and the small number of sole drug reports on B/X. This reflects a strong association between A and X and a weak association between B and X that is probably confounded by co-medication with A, at least at the point where there is not a single sole drug report on B/X. The other numbers of reports were chosen more or less arbitrarily.

Table 2. Summary of the number of added reports in the simulated scenarios.

Scenario  A/X   A,B/X  B/X   A/¬X  B/¬X  A,B/¬X  ¬A,¬B/X (spread)
1         100   10     0-10  1000  1000  100     1000 (100)
2         5000  0      0-10  5000  5000  0       1000 (100)

A and B are made-up drugs and X is a made-up ADR. The numbers indicate how many reports were added on certain combinations in the different scenarios. The 'spread' is the number of drugs over which the reports are spread.

The results are shown in Figure 1. As expected, the IC could not differentiate between reports on the combination of drug B and ADR X that included only this drug and reports that also included drug A. This means that the IC highlighted the combination B/X already at x = 0, i.e. where the only reporting of this combination consisted of ten reports that also included A, a drug strongly associated with X. As the number of sole drug reports on B and X increased, the LLR signal score also increased, but it never reached as high as that of the IC. The reason is that the LLR continuously corrected the upward bias of the IC in the strength of association for this combination. The interpretation of the results was that the two methods performed as expected, and as a consequence a search for real examples of confounding by co-medication was started (see Section 3.2).

[Figure 1: two panels, IC (left) and LLR (right). x-axis: # Drug B reports (0 to 10); y-axis: signal score (0 to 8). Legend: Drug A, Drug B.]

Figure 1. Results from simulation scenario 1, which attempted to mimic a situation with confounding by co-medication. Apart from some background reporting, drug A is extensively reported with ADR X and there are also joint reports on drugs A and B with the ADR. The x-axis shows how many reports on the combination of B and X have been added, and the y-axis shows the signal score, i.e. the IC values for A and B in the left panel and the LLR β-values in the right panel.


Scenario 2 was set up to simulate a situation with masking. Here the most important features are the 5000 sole drug reports on A/X and the 1000 reports on X with 100 other drugs, which reflects a very uneven reporting on X where 5/6 of all reports are with A. It was therefore thought that if the LLR were capable of correcting for masking, it would highlight an association between B and X much earlier than the IC, i.e. with fewer B/X reports added. The other numbers of reports were chosen more or less arbitrarily, although it would be expected that the number of reports on B/¬X could not be set too low. The reason is that it seems unlikely that a drug which is rarely reported would have its association with an ADR masked, irrespective of how uneven the reporting on that ADR is.
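The masking mechanism in scenario 2 can be made concrete with a rough calculation of the expected count for B/X with and without the reports on A. This is an illustration only: it assumes a background of 3.8 million reports without A, B or X and the simple shrinkage form log2((O + 0.5)/(E + 0.5)), neither of which is claimed to match the exact computations of the thesis, and removing the reports on A is not how the LLR works.

```python
import math

# Assumption: about 3.8 million background reports, none involving A, B or X.
BACKGROUND_REPORTS = 3_800_000


def shrunk_log_ratio(observed, expected):
    """log2 of the shrunk observed-to-expected ratio (assumed form)."""
    return math.log2((observed + 0.5) / (expected + 0.5))


def scenario2_scores(sole_bx):
    """Return (score with A's reports included, score with A's reports removed)."""
    n_b = sole_bx + 5000                 # B/X and B/notX
    x_with_a = 5000                      # A/X
    x_other = 1000 + sole_bx             # other drugs/X and B/X
    a_total = 5000 + 5000                # A/X and A/notX
    n_all = BACKGROUND_REPORTS + a_total + n_b + 1000
    expected_masked = n_b * (x_with_a + x_other) / n_all
    expected_without_a = n_b * x_other / (n_all - a_total)
    return (shrunk_log_ratio(sole_bx, expected_masked),
            shrunk_log_ratio(sole_bx, expected_without_a))


share_of_x_with_a = 5000 / (5000 + 1000)   # the 5/6 mentioned in the text
masked, unmasked = scenario2_scores(5)
```

With five added B/X reports the score is negative when the reports on A inflate the expected count and positive when they are left out, which is the kind of masking the scenario was designed to produce.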

The results are shown in Figure 2. The effect was not as dramatic as in the first simulation scenario, but it is clear that the LLR signal score was shifted upwards in comparison to the IC. It is difficult to draw any conclusions regarding how many reports the respective methods would need to highlight the combination B/X, because that depends on the particular choices of signalling threshold and hyperparameter value for the LLR. Also, the IC lower confidence bounds were not calculated. Nevertheless, the interpretation of the results was that the LLR had the potential to correct for masking occurring in the database, and therefore a real example was studied more extensively with respect to this property (see Section 3.3).


[Figure 2: two panels, IC (left) and LLR (right). x-axis: # Drug B reports (0 to 10); y-axis: signal score (0 to 10). Legend: Drug A, Drug B.]

Figure 2. Results from simulation scenario 2, which attempted to mimic a situation with masking. Apart from some background reporting, drug A was heavily over-reported with ADR X in comparison to all other drugs. The x-axis shows how many reports on the combination of B and X have been added, and the y-axis shows the signal score, i.e. the IC values for A and B in the left panel and the LLR β-values in the right panel.

3.2 Confounding by Co-Medication

As the results from simulation scenario 1 suggested that the LLR was capable of correcting for confounding by co-medication, a search for real examples of this phenomenon was initiated. The starting point was an ADR term for which it has already been shown that confounding by co-medication is an issue, lactic acidosis.

In this section three different ADRs are presented. For each of them the analysis was performed in the same way. First, the LLR with a prior variance of 1 was run in 200 bootstrap replicates. This is generally considered to be too few replicates, but the lengthy computations allowed no more. Then a subset of drugs from the class(es) of drugs under study was selected and displayed separately. For all three ADRs, only drugs with at least five reports on the ADR were selected. In addition, a graph displaying all drugs reported with the ADR was drawn, in which the class(es) of drugs under study were compared to all other drugs. Finally, the reasons for observed differences between the methods were investigated, mainly by looking at raw counts of reports.

3.2.1 Example 1: Lactic Acidosis

This is a well-studied example within this field of research, and it has often been presented as a motivation for using regression methods [26, 27]. Lactic acidosis is a serious and highly undesirable condition in which the pH level in the blood decreases to potentially lethal levels.

Two groups of drugs are particularly related to this ADR: anti-HIV drugs and anti-diabetes drugs. Anti-HIV therapy is very complex when it comes to ADR monitoring, partly because the drugs cause many different ADRs, partly because they are frequently co-medicated and partly because the reporting systems in the countries where the drugs are mostly used are severely under-developed. The current question is which class or classes of anti-HIV drugs actually cause lactic acidosis. Here we consider, like Szarfman et al. [27], four distinct classes of drugs: Nucleoside Reverse Transcriptase Inhibitors (NRTIs), Nucleoside Analogue Reverse Transcriptase Inhibitors (NaRTIs), Non-Nucleoside Reverse Transcriptase Inhibitors (NNRTIs) and Protease Inhibitors (PIs).

Regarding the anti-diabetics, one class of drugs in particular, the biguanides (B), undoubtedly causes lactic acidosis to some extent. Two of these drugs, phenformin and buformin, were actually withdrawn from the market because of their association with lactic acidosis. In addition to this class, the study included two other classes of peroral antidiabetics, the sulfonylureas (S) and the thiazolidinediones (T).

Figure 3 displays the signal scores (point estimates and 95 % lower confidence bounds) for all drugs from the selected classes with at least five reports on lactic acidosis. To simplify matters somewhat, all drug terms of combinations of substances were excluded.

The results look very much like those produced by Szarfman et al. Among the anti-HIV drugs (left panel), the NRTIs stavudine and zidovudine together with the NaRTI tenofovir were the drugs with the highest LLR signal scores. The other NRTIs and NaRTIs, with the exception of zalcitabine, got lower yet positive signal scores, whereas all PIs and NNRTIs got negative signal scores. The IC measure, on the contrary, gave positive signals for all the studied drugs.

Moving on to the anti-diabetes drugs (right panel), both methods gave the three biguanides very high signal scores. However, five out of six drugs in the sulfonylurea class got positive signal scores by the IC and negative signal scores by the regression method. The β for the sixth drug, tolbutamide, was just slightly above 0 and β025 was clearly negative. The thiazolidinediones also got lower signal scores, but only slightly.

Although not shown in the graph, calculated crude odds ratios were consistent with the IC values for almost all drugs. The only exception was lodenosine, and the explanation is probably that the IC is shrunk whereas crude odds ratios are not.

In Figure 4 all drugs reported with lactic acidosis are shown. Here the biguanides, the NaRTIs and some of the NRTIs are placed further up in the upper-right corner than any other drugs, indicating that these drugs are the most strongly associated with this ADR. All other anti-HIV and anti-diabetes drugs are placed above or slightly to the left of the mass of all other drugs. The reason is that they have received too high IC values because of confounding. In the interest of clarity the 95 % lower confidence bounds are not presented in the graph, and as expected they added no further information to that given by the point estimates.

The results are readily explained by looking at raw counts of reports, see Table 3. Starting with the anti-diabetics, the biguanides obviously should be highly ranked. The sulfonylureas were so frequently co-medicated with metformin that it is natural that they got low signal scores by the LLR. The most obvious example is gliclazide, which had 18 reports on lactic acidosis, all including metformin. Tolbutamide had 75 % sole drug reports, and probably would have got a higher positive signal if the absolute number of reports had been higher. Glibenclamide had 10 sole drug reports, but because 32 of its 52 reports also included metformin, which received a very strong signal, the LLR lowered its signal score markedly. The thiazolidinediones were also co-medicated with metformin, but to a much lower extent. Thus, their very moderate signal dampening seems to be in line with what could be expected.

Among the anti-HIV drugs, it seems natural that the PIs and NNRTIs got negative signal scores because of their very frequent co-medication with stavudine. This does not apply to amprenavir, but that drug only had 2 sole drug reports, so its low signal score still seems fair. Stavudine and zidovudine had very many sole drug reports and seem to truly cause the ADR. The low LLR scores for the other NRTIs seem reasonable in most cases, although the drug abacavir is a bit of an exception. It had 18 sole drug reports on lactic acidosis but still got a signal score just above 0 because 38 of its total 76 reports also included the strongly associated drug stavudine.


[Figure 3: signal scores (IC and LLR) per drug; y-axis: signal score (−2 to 8). Drugs shown, left panel: NRTI-Stavudine, NRTI-Lodenosine, NRTI-Didanosine, NRTI-Zidovudine, NRTI-Lamivudine, NRTI-Abacavir, NRTI-Zalcitabine, NRTI-Emtricitabine, NaRTI-Tenofovir, NaRTI-Adefovir, NNRTI-Efavirenz, NNRTI-Nevirapine, NNRTI-Delavirdine, PI-Saquinavir, PI-Ritonavir, PI-Nelfinavir, PI-Indinavir, PI-Amprenavir, PI-Atazanavir; right panel: B-Phenformin, B-Metformin, B-Buformin, S-Glibenclamide, S-Tolbutamide, S-Gliclazide, S-Glipizide, S-Chlorpropamide, S-Glimepiride, T-Troglitazone, T-Rosiglitazone, T-Pioglitazone. Legend: IC, LLR.]

Figure 3. Signal scores for all drugs coming from the pre-defined classes of interest with at least five reports on lactic acidosis. Anti-HIV drugs are presented in the left panel and anti-diabetes drugs are presented in the right panel. Drug class abbreviations: NRTI Nucleoside Reverse Transcriptase Inhibitor; NaRTI Nucleoside Analogue Reverse Transcriptase Inhibitor; NNRTI Non-Nucleoside Reverse Transcriptase Inhibitor; PI Protease Inhibitor; B Biguanide; S Sulfonylurea; T Thiazolidinedione.


[Figure 4: scatter plot. x-axis: LLR signal score (β), −2 to 8; y-axis: IC signal score, −5 to 5. Legend: NRTIs, NaRTIs, NNRTIs, PIs, Biguanides, Sulfonylureas, Thiazolidinediones, Other.]

Figure 4. Graph for the lactic acidosis example comparing the IC to the LLR β for all drugs reported with this ADR. All values are point estimates. Drug class abbreviations: NRTI Nucleoside Reverse Transcriptase Inhibitor; NaRTI Nucleoside Analogue Reverse Transcriptase Inhibitor; NNRTI Non-Nucleoside Reverse Transcriptase Inhibitor; PI Protease Inhibitor; B Biguanide; S Sulfonylurea; T Thiazolidinedione.


It is interesting and illuminating to compare abacavir to tenofovir, which had only 7 sole drug reports but still got a much higher signal score. The reason seems to be that the drug with which tenofovir was predominantly co-medicated, didanosine, was itself not highly ranked. Still, the degree of co-reporting, 40 out of 73 reports, was even higher for tenofovir than for abacavir. The interpretation is that not only the degree of co-reporting but also the signal score of the co-medicated drug seems to be very important for the LLR.
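The comparison between abacavir and tenofovir can be checked directly from the raw counts. The short sketch below only restates the counts quoted in the text and in Table 3 (total reports, sole drug reports and joint reports with the main co-medicated drug); the dictionary layout and function name are illustrative choices.

```python
# Counts on lactic acidosis taken from the text and Table 3.
COUNTS = {
    "abacavir": {"total": 76, "sole": 18, "partner": "stavudine", "joint": 38},
    "tenofovir": {"total": 73, "sole": 7, "partner": "didanosine", "joint": 40},
}


def report_shares(row):
    """Return (share of sole drug reports, share of reports with the partner drug)."""
    return row["sole"] / row["total"], row["joint"] / row["total"]


abacavir_sole, abacavir_joint = report_shares(COUNTS["abacavir"])
tenofovir_sole, tenofovir_joint = report_shares(COUNTS["tenofovir"])
```

Tenofovir has both the smaller share of sole drug reports and the larger share of joint reports, yet it received the higher LLR score, which is why the signal score of the co-medicated drug appears to matter.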


Table 3. Raw counts of reports for the drugs of the lactic acidosis example. Column headers are the first four letters of the drug names, in the same order as the rows.

Anti-HIV drugs:

                    stav lode dida zido lami abac zalc emtr teno adef efav nevi dela saqu rito nelf indi ampr ataz
NRTI-stavudine       177    6  256   43  299   38    1    6   19    0   81   60    4   37   44   77   79    4    3
NRTI-lodenosine        6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    6    0    0
NRTI-didanosine      256    0   18   43   32   19    3    0   40    1   31   38    1   23   29   56   31    5    6
NRTI-zidovudine       43    0   43  119   60    2    2    0    0    1    5   40    1    9    9   28   16    2    2
NRTI-lamivudine      299    0   32   60   11   20    2    0    9    1   57   27    4   19   26   35   50    3    3
NRTI-abacavir         38    0   19    2   20   18    1    0    5    0    7   11    0    3    5    4    2    2    0
NRTI-zalcitabine       1    0    3    2    2    1    1    0    1    1    0    0    0    0    0    0    0    1    0
NRTI-emtricitabine     6    0    0    0    0    0    0    0    3    0    2    0    0    0    0    0    0    0    0
NaRTI-tenofovir       19    0   40    0    9    5    1    3    7    0    5    0    0    5    6    1    2    5    4
NaRTI-adefovir         0    0    1    1    1    0    1    0    0    5    0    0    0    0    0    0    0    0    0
NNRTI-efavirenz       81    0   31    5   57    7    0    2    5    0    3    3    0    3    3   11    0    1    0
NNRTI-nevirapine      60    0   38   40   27   11    0    0    0    0    3    3    1    7    9   10   12    0    0
NNRTI-delavirdine      4    0    1    1    4    0    0    0    0    0    0    1    0    2    1    1    0    0    0
PI-saquinavir         37    0   23    9   19    3    0    0    5    0    3    7    2    0   35    2    4    2    2
PI-ritonavir          44    0   29    9   26    5    0    0    6    0    3    9    1   35    7    4   22    2    2
PI-nelfinavir         77    0   56   28   35    4    0    0    1    0   11   10    1    2    4    4    6    0    0
PI-indinavir          79    6   31   16   50    2    0    0    2    0    0   12    0    4   22    6   12    0    0
PI-amprenavir          4    0    5    2    3    2    1    0    5    0    1    0    0    2    2    0    0    2    0
PI-atazanavir          3    0    6    2    3    0    0    0    4    0    0    0    0    2    2    0    0    0    1
Total                767    6  355  264  369   76    8   10   73    7  104  109    5   56   76  109  103   11   10

Anti-diabetes drugs:

                     phen  metf  bufo  glib  tolb  glic  glip  chlo  glim  trog  rosi  piog
B-phenformin          242     3     1     3     0     0     0     1     0     0     0     0
B-metformin             3  1278     0    32     1    18    15     2     9     3     4     1
B-buformin              1     0    37     5     0     0     0     0     0     0     0     0
S-glibenclamide         3    32     5    10     0     0     0     0     0     0     0     0
S-tolbutamide           0     1     0     0     6     0     0     0     0     0     0     0
S-gliclazide            0    18     0     0     0     0     0     0     0     0     0     0
S-glipizide             0    15     0     0     0     0     3     0     0     0     0     0
S-chlorpropamide        1     2     0     0     0     0     0     2     0     0     0     0
S-glimepiride           0     9     0     0     0     0     0     0     0     0     4     0
T-troglitazone          0     3     0     0     0     0     0     0     0     9     0     0
T-rosiglitazone         0     4     0     0     0     0     0     0     4     0     5     0
T-pioglitazone          0     1     0     0     0     0     0     0     0     0     0     3
Total                 252  1497    43    52     8    18    18     5    10    14    10     6

Anti-HIV drugs are displayed in the top panel and anti-diabetes drugs in the bottom panel. Note: The diagonal entries give the number of sole drug reports (shown in gray in the original). Note: Since there can be any number of drugs on a report, the column sums can exceed the total number of reports. Drug class abbreviations: NRTI Nucleoside Reverse Transcriptase Inhibitor; NaRTI Nucleoside Analogue Reverse Transcriptase Inhibitor; NNRTI Non-Nucleoside Reverse Transcriptase Inhibitor; PI Protease Inhibitor; B Biguanides; S Sulfonylurea; T Thiazolidinediones.

3.2.2 Example 2: Hypertriglyceridaemia

The results from the example of lactic acidosis made it reasonable to investigate whether the co-medication of some of the drugs studied there would give rise to the same effects when studying a different ADR term, and so the condition hypertriglyceridaemia was chosen. The clinical meaning of hypertriglyceridaemia, which is not as serious as lactic acidosis, is that the levels of fatty acids in the blood are elevated. It is well known that anti-HIV therapy is associated with this condition [28].

The detailed graph showing only anti-HIV drugs with at least five reports on hypertriglyceridaemia is given in Figure 5. Remarkably, all the anti-HIV drugs got markedly lower signal scores by the LLR than by the IC measure. Three of them were not highlighted by the LLR when looking at the β025, whereas the IC measure highlighted all 16 drugs.

The graph with all drugs reported with this ADR can be found in Figure 6, and we see that the anti-HIV drugs formed a cluster above the mass of all other drugs. The reason, again, is that their IC values were biased upwards due to confounding, which can be seen by examination of the raw counts (results not shown). The difference between this example and the lactic acidosis example is that there is no obvious confounder in this one: the drugs are all mutually co-medicated, but there is no drug that has a substantial number of sole drug reports. The result is, in some sense, that the LLR lets the drugs share the same signal, whereas the IC measure does not differentiate between shared and sole drug reports and therefore gives all drugs high signal scores. To our knowledge this phenomenon has not been studied previously, and we give it the name signal sharing.

Note that, just as in the lactic acidosis example, only the point estimates are displayed. The 95 % lower confidence bounds offered no additional information to the point estimates.

The three drugs receiving negative β025 values, tenofovir, emtricitabine and atazanavir, had 1, 1 and 2 sole drug reports among 24, 7 and 11 reports in total, respectively. Note that in this particular example the difference between the methods was not substantial when looking at the number of highlighted drugs, which was 13 for the LLR and 16 for the IC. However, there was a clear general downward shift of the LLR signal scores in comparison to the IC values, and if the overall observed counts had been lower the difference with respect to the number of highlighted drugs probably would have been greater.


[Figure 5: signal scores (IC and LLR) per drug; y-axis: signal score (0 to 6). Drugs shown: NRTI-Stavudine, NRTI-Didanosine, NRTI-Zidovudine, NRTI-Lamivudine, NRTI-Abacavir, NRTI-Zalcitabine, NRTI-Emtricitabine, NaRTI-Tenofovir, NNRTI-Efavirenz, NNRTI-Nevirapine, PI-Saquinavir, PI-Ritonavir, PI-Nelfinavir, PI-Indinavir, PI-Amprenavir, PI-Atazanavir. Legend: IC, LLR.]

Figure 5. Signal scores for all anti-HIV drugs with at least five reports on hypertriglyceridaemia. Drug class abbreviations: NRTI Nucleoside Reverse Transcriptase Inhibitor; NaRTI Nucleoside Analogue Reverse Transcriptase Inhibitor; NNRTI Non-Nucleoside Reverse Transcriptase Inhibitor; PI Protease Inhibitor.



[Figure 6: scatter plot. x-axis: 0 to 4; y-axis: IC signal score, −4 to 6. Legend: Anti-HIV, Other.]

Figure 6. Graph for the hypertriglyceridaemia example comparing the IC to the β for all drugs reported with the ADR. All values are point estimates. Drug class abbreviations: NRTI Nucleoside Reverse Transcriptase Inhibitor; NaRTI Nucleoside Analogue Reverse Transcriptase Inhibitor; NNRTI Non-Nucleoside Reverse Transcriptase Inhibitor; PI Protease Inhibitor.


3.2.3 Example 3: Haemorrhagic Cystitis

Although very interesting, the results from the hypertriglyceridaemia example did not answer the question whether lactic acidosis is a unique example of the LLR actually distinguishing between drugs probably causing the ADR and drugs probably being innocent bystanders. Therefore another group of drugs suspected to have similar co-medication patterns was studied, the anti-cancer drugs. Immunosuppressive drugs were also included because of their chemical similarities with the anti-cancer drugs.

One condition which is known to be caused by the anti-cancer and immunosuppressive agent cyclophosphamide and its relative ifosphamide is haemorrhagic cystitis [29]. This quite severe condition is an inflammation of the bladder resulting in blood being transported into the urine. Because of the frequent co-medication among the anti-cancer drugs it might be suspected that there are cases of innocent bystanders for this particular ADR term.

A detailed graph for the haemorrhagic cystitis example is shown in Figure 7. It includes anti-cancer and immunosuppressive drugs with at least five reports on the ADR. It also includes the drug mesna, which is very interesting in this context because it is solely given as an adjuvant (helping drug) together with cyclophosphamide and ifosphamide to protect patients from haemorrhagic cystitis.

As expected, cyclophosphamide got a very high signal score by both methods. It had 196 sole drug reports out of 241 reports in total. There were a couple of other drugs (ifosphamide, mitomycin and tiaprofenic acid) where the differences were negligible. In all those cases there was no co-medication with cyclophosphamide. For all other drugs studied there were differences of varying magnitude. In almost all cases these could be explained by co-medication with either cyclophosphamide or ifosphamide. Probably the best example is mesna, which had 12 reports in total, of which 10 were with cyclophosphamide and none was a sole drug report.

However, one important and very interesting exception was found, epirubicin. It had 8 reports in total on haemorrhagic cystitis, of which 7 included no other drugs and 1 included paclitaxel, i.e. not cyclophosphamide. Given these facts it would seem strange that it received such a low signal score by the LLR. A closer investigation revealed that epirubicin was in fact very frequently co-medicated with cyclophosphamide (973 out of 2339 reports in total), but that none of the joint reports was on haemorrhagic cystitis.


This example of cyclophosphamide in combination with epirubicin and the ADR haemorrhagic cystitis is clearly an example of confounding by co-medication, although not in the way it is normally thought of. Described simplistically, the LLR found that cyclophosphamide was highly associated with haemorrhagic cystitis, but not when co-medicated with epirubicin. Therefore epirubicin itself was not found to be associated with haemorrhagic cystitis despite its 7 sole drug reports. Indeed, an interaction term added to the model got an estimate of less than −6, and in its presence epirubicin received a signal score of almost 3. It is still unknown why none of the joint reports was on haemorrhagic cystitis; if epirubicin did not affect the presence of this ADR in cyclophosphamide reports, the expected number would be about 22.
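How surprising zero joint reports are, given an expected number of about 22, can be gauged with a simple calculation. The sketch below assumes that the number of joint reports on haemorrhagic cystitis would follow a Poisson distribution with mean 22; this model and the back-calculated rate are illustrative assumptions, not results from the study.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events under a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


joint_reports = 973        # epirubicin reports that also list cyclophosphamide
expected_on_adr = 22       # approximate expected number quoted in the text

# Reporting rate of the ADR implied by the expected number (back-calculated).
implied_rate = expected_on_adr / joint_reports

# Probability of observing no joint report on the ADR under the Poisson assumption.
p_zero = poisson_pmf(0, expected_on_adr)
```

Under this assumption the probability of seeing no joint reports at all is far below one in a billion, which is consistent with the strongly negative interaction estimate.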

Another example of this phenomenon was paclitaxel. However, in this case the degree of co-medication was much smaller, and so the effect was much less dramatic. It is a very interesting question how common this phenomenon might actually be.

The graph including all drugs reported with haemorrhagic cystitis is given in Figure 8. It is difficult to draw any conclusions from it because there are not many other drugs to compare with. There seems, however, to be a trend that the anti-cancer and immunosuppressive drugs are placed above and to the left of the other drugs, suggesting that their IC values have been biased upwards due to confounding by co-medication. Just as in the two previous examples, the 95 % lower confidence bounds have been left out because they did not add any valuable information.


[Figure 7: signal scores (IC and LLR) per drug; y-axis: signal score (−2 to 6). Drugs shown: C-Cyclophosphamide, C-Ifosphamide, C-Mitomycin, C-Vinorelbine, C-Paclitaxel, C-Busulfan, C-Thiotepa, C-Methotrexate, C-Cytarabine, C-Fludarabine, C-Vincristine, C-Etoposide, C-Epirubicin, C-Doxorubicin, Mesna, I-Tiaprofenic acid, I-Mycophenolic acid, I-Ciclosporin, I-Azathioprine, I-Prednisolone, I-Prednisone. Legend: IC, LLR.]

Figure 7. Signal scores for all anti-cancer and immunosuppressive drugs with at least five reports on haemorrhagic cystitis. The graph also shows an adjuvant (helping drug) to cyclophosphamide and ifosphamide called mesna, which is only used to protect patients from haemorrhagic cystitis. Drug class abbreviations: I Immunosuppressive drugs; C Anti-cancer drugs.



[Figure 8: scatter plot. x-axis: −2 to 6; y-axis: IC signal score, −2 to 6. Legend: Anti-cancer, Immunosuppressive, Mesna, Confounders, Other.]

Figure 8. Graph for the haemorrhagic cystitis example comparing the IC to the β for all drugs reported with this ADR. All values are point estimates. The two confounders are cyclophosphamide and ifosphamide. Drug class abbreviations: I Immunosuppressive drugs; C Anti-cancer drugs.


3.3 Masking

As the results from the second simulated scenario (see Section 3.1) suggested that the LLR was capable of correcting for masking, one specific example was studied, namely rhabdomyolysis. Rhabdomyolysis is a severe condition in which the muscle tissue breaks down, which might cause severe damage to the kidneys. In 2001 it was realized that the drug cerivastatin caused this ADR [30], and since then this particular combination has been seriously over-reported. This is a typical example of the type of reporting bias that the WHO drug safety database suffers from. Because of this over-reporting of cerivastatin, it might be suspected that signals from other drugs have been masked.

Just as in the confounding by co-medication examples, the LLR was run in 200 bootstrap replicates with prior variance equal to 1. Then a graph including all drugs reported with rhabdomyolysis was drawn based on the lower endpoints of the 95 % confidence intervals for the two methods. The reason for focusing on the lower confidence bounds and not the point estimates was that the main interest was to find new associations, i.e. associations that were highlighted by the LLR and not by the IC measure. These were specially marked in the graph. Finally, the existing literature on the new associations was examined in terms of publication in the journal Reactions Weekly. This journal scans a large number of clinical journals that are considered to be of good enough quality and then picks up previously unreported combinations without further review.

The graphical presentation for the rhabdomyolysis example is given in Figure 9. There are 21 combinations in the quadrant where β025 > 0 but IC025 ≤ 0 (the lower-right quadrant), representing those combinations that the LLR found and the IC did not, of which six were published in the journal Reactions Weekly. These 21 combinations are all listed in Table 4. One observation is that quite a few of the 21 drugs are very frequently reported. This is not surprising, since drugs with few reports on other ADRs would have very low expected counts and would therefore not be expected to have their probable associations masked.
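The selection of candidate unmasked combinations amounts to a simple filter on the two lower confidence bounds. The sketch below illustrates that filter; the function name and the example scores are hypothetical and do not correspond to any values from the study.

```python
def newly_highlighted(scores):
    """Return drugs highlighted by the LLR (beta025 > 0) but not by the IC (IC025 <= 0).

    `scores` maps a drug name to a pair (beta025, ic025).
    """
    return sorted(
        drug
        for drug, (beta025, ic025) in scores.items()
        if beta025 > 0 and ic025 <= 0
    )


# Made-up lower bounds for four hypothetical drugs.
example_scores = {
    "drug_p": (0.4, -0.3),   # LLR only: candidate for unmasking
    "drug_q": (1.2, 0.8),    # highlighted by both methods
    "drug_r": (-0.5, -1.0),  # highlighted by neither
    "drug_s": (-0.1, 0.2),   # IC only
}

candidates = newly_highlighted(example_scores)
```

Only combinations of the first kind would be carried forward to the literature check.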

The fact that the ADR under study is rhabdomyolysis does not confirm that the new associations were highlighted because they were actually unmasked, even though it seems likely. One must also remember that the results depend on the selected hyperparameter value and signalling threshold for the LLR.

One drug standing out was gemfibrozil. It is known that gemfibrozil and cerivastatin interact, leading to an increased risk of rhabdomyolysis [31]. The slight lowering of fenofibrate’s signal score by the LLR could be explained by a moderate co-medication with cerivastatin. Its 75 sole drug reports, however, resulted in a clear positive signal.

Table 4. Possible examples of masked combinations for the ADR rhabdomyolysis.

Drug                 ADR/Drug   Sole drug   Drug      Literature
                     reports    reports     reports   evidence
Clozapine            89         64          44255     No
Citalopram           36         14          12153     Yes
Ofloxacin            21         8           11959     Yes
Furosemide           35         3           13148     No
Venlafaxine          47         26          19075     No
Clopidogrel          31         4           7802      Yes
Metformin            41         14          10513     No
Stavudine            18         2           5366      No
Sirolimus            10         4           2270      No
Itraconazole         23         3           6269      Yes
Amiodarone           56         8           14163     No
Azithromycin         38         7           10090     Yes
Abacavir             16         9           3046      No
Candesartan          15         9           2964      No
Nefazodone           35         4           8270      Yes
Danazol              10         2           2693      No
Interferon alfa-2B   23         18          7017      No
Sulpiride            10         3           1828      No
Aripiprazole         21         18          4566      No
Baclofen             12         9           2536      No
Pregabalin           8          7           1264      No

’Literature evidence’ refers to a publication of the combination in the journal Reactions Weekly.

[Figure 9: scatter plot. Horizontal axis: LLR signal score, 95% lower confidence bound (β025). Vertical axis: IC signal score, 95% lower confidence bound (IC025). Legend: New associations, Gemfibrozil, Other.]

Figure 9. Graph comparing the IC to the LLR with respect to the 95 % lower confidence bounds for the rhabdomyolysis example. Note that this is different from the other graphs of similar type, where only the point estimates were shown. The reason is that the focus in this example was on the combinations highlighted by the LLR and not by the IC. Those combinations all lie in the south-east quadrant. Gemfibrozil has also been specially marked because it was such an obvious outlier.

3.4 Systematic Comparisons

A total scan of the database was performed by systematically fitting one LLR model for each ADR term occurring in the database. Seven different values of prior variance ranging from 0.005 to 1 were used. Because of the computational load no bootstraps were run, and β > 0 was used as the decision rule for signalling instead of β025 > 0. The logic behind this rule was that using a low prior variance, i.e. a high degree of shrinkage, pushes the β estimates towards zero, just as β025 will always be closer to zero than β itself (assuming that β is strictly positive).

3.4.1 Precision-Recall

The LLR was compared to the IC measure in terms of precision and recall. In this context, the precision of a method is the fraction of the method’s signals that correspond to true associations, and the recall is the fraction of all true associations that the method manages to highlight. Thus, if we use the definitions in Table 5, the precision is given by a/(a + b) and the recall is given by a/(a + c). There are other measures, such as specificity and sensitivity, that could be used instead. However, the advantage of precision and recall in this particular application is that they do not depend on the number of true negative decisions, which is typically very large and tends to dominate the other three outcomes.
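With the outcome counts a, b, c and d of Table 5, the two measures can be computed as in the short sketch below. The counts are hypothetical and chosen only to illustrate the arithmetic; specificity is included to show how a measure that involves d is dominated by the number of true negatives.

```python
def precision(a, b):
    """Fraction of signals that are true associations: a / (a + b)."""
    return a / (a + b)


def recall(a, c):
    """Fraction of true associations that are signalled: a / (a + c)."""
    return a / (a + c)


def specificity(b, d):
    """Fraction of non-associations that are not signalled: d / (b + d)."""
    return d / (b + d)


if __name__ == "__main__":
    # Hypothetical counts, not results from the WHO database.
    a, b, c = 200, 800, 600
    for d in (10_000, 10_000_000):
        # Precision and recall stay the same whatever the value of d,
        # whereas the specificity moves towards 1 as d grows.
        print(d, precision(a, b), recall(a, c), round(specificity(b, d), 6))
```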

Table 5. Possible outcomes of a binary decision method.

            True association     No true association   Total
Signal      a (True positive)    b (False positive)    a + b
No signal   c (False negative)   d (True negative)     c + d
Total       a + c                b + d                 a + b + c + d

The concepts of precision and recall require that it is known which associations are true and which are not. In this application this is not the case [32], and hence an approximation to the ’truth’ was needed. In this paper the publication of combinations in the journal Reactions Weekly was used as this approximation [33]. Since it is undesirable for a method to signal for combinations with fewer than three reports [34, 4], combinations with two or fewer reports were excluded from the comparison. Note that Reactions Weekly is intended to be an early warning system and that a single report from somewhere in the world is enough to get a combination published. Therefore it is likely to include false positive publications. Of course it could also miss true associations, but this probably happens less often. Note also that the WHO drug safety database is by no means independent of this journal, since most of the case reports leading to a publication will end up in the database.

The precision-recall graph is given in Figure 10. An ideal method would always be in the top right corner, where both precision and recall are high. Therefore it would be desirable for the LLR to have a curve lying completely above the IC curve, which it does not. The reason is that too high a prior variance makes the LLR signal for so many combinations (with the threshold used) that the precision drops rapidly. For moderate values of the prior variance, however, the LLR seems to be superior to the IC. The particular choice of a prior variance of 0.2 gives the same precision as the IC with the conventional signalling threshold, whereas the recall is increased by more than 10 %. This is a very appealing behaviour, which might suggest that this particular prior variance should be used for this particular database.

The appearance of the precision-recall graph might seem confusing. The IC curve has a much slower decrease in precision with increasing recall than the LLR curve has. This is probably due to the fact that for the IC the threshold was changed, whereas for the LLR the degree of shrinkage was changed while the threshold was kept constant.

Because of the LLR property of estimating very many parameters to be exactly zero, it would not make sense to construct a precision-recall curve based on a varying threshold for this method. Such a curve would probably be quite similar to the IC curve for strictly positive thresholds; however, any strictly negative threshold would make the precision drop and the recall increase dramatically. Thus, the curve would contain a huge gap. One could argue that a fairer comparison would have been to treat the IC the same way as the LLR and vary the shrinkage also for the IC. Our standpoint is that this would be illuminating and that it would be interesting to see whether the current use of the IC is optimal. The purpose of this study is, however, to characterize the properties of the LLR, and therefore it was easier to compare it to the gold standard, i.e. the IC as it is actually used.

[Figure 10: precision-recall plot. Horizontal axis: Recall. Vertical axis: Precision. Legend: IC, LLR.]

Figure 10. Precision-recall curve based on all combinations in the WHO drug safety database with three or more reports. Each circle represents a threshold for IC025, and the filled circle represents the threshold actually used, 0. Each triangle represents a prior variance for the LLR, for which the signalling threshold was always β > 0.

3.4.2 Properties of Disconcordant Combinations

The methods were also compared with respect to the properties of the combinations where the methods differed. With the definitions given in Table 6, the interesting groups of combinations are the two disconcordance groups. If confounding by co-medication were a problem in the database and if the LLR corrected for it to some extent, we would expect there to be more cases of co-medication in the IC disconcordance group than in any of the other groups. This was tested by comparing the fraction of sole drug reports among the combinations in the IC disconcordance group to the fraction of sole drug reports among the combinations in the positive concordance group, which seemed to be a suitable reference.

Table 6. Classification of a combination based on the IC and LLR results.

                IC signal              No IC signal
LLR signal      Positive concordance   LLR disconcordance
No LLR signal   IC disconcordance      Negative concordance

The results are given in Figure 11. In the IC disconcordance group the fraction of sole drug reports is much lower than in the positive concordance group. This suggests that co-medication is a primary reason for the LLR to discard a combination found by the IC. Notice that as the prior variance increases the LLR includes more and more combinations, so that the IC disconcordance group becomes smaller and smaller. With a very high prior variance this group therefore probably consists of the most evident cases of co-medication, and that is why the fraction of sole drug reports is the lowest for the highest prior variance. Then, as the prior variance decreases, the group grows larger and larger because the LLR becomes more and more restrictive while the IC is kept constant. It is however important to note that combinations not included with a high prior variance are not included with a lower prior variance either.
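The grouping of Table 6 and the comparison of sole drug report fractions can be sketched as below. The decision rules IC025 > 0 and β > 0 are the ones used in the systematic comparison; the function names and the four example combinations are hypothetical and serve only as illustration.

```python
from statistics import median


def classify(ic_signal, llr_signal):
    """Assign a drug-ADR combination to one of the four groups of Table 6."""
    if ic_signal and llr_signal:
        return "positive concordance"
    if llr_signal:
        return "LLR disconcordance"
    if ic_signal:
        return "IC disconcordance"
    return "negative concordance"


def sole_drug_fraction(sole_drug_reports, total_reports):
    """Share of a combination's reports that list no other drug."""
    return sole_drug_reports / total_reports


if __name__ == "__main__":
    # Hypothetical combinations: (IC025, beta, sole drug reports, reports).
    combinations = [(0.8, 0.5, 18, 20), (0.3, 0.0, 2, 25),
                    (1.2, 0.9, 30, 40), (0.1, 0.0, 1, 12)]
    groups = {}
    for ic025, beta, sole, total in combinations:
        group = classify(ic025 > 0, beta > 0)
        groups.setdefault(group, []).append(sole_drug_fraction(sole, total))
    for group, fractions in groups.items():
        print(group, median(fractions))
```

In this toy data the combinations dropped by the LLR have few sole drug reports, which is the pattern one would expect if co-medication were the reason for discarding them.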

If masking were a true problem and if the LLR were capable of correcting for it, it would be expected that there were more cases of masking in the LLR disconcordance group than in any of the other groups. There is no obvious way of defining a measure for a given combination that says whether that combination could be suspected to be masked, and to the author’s knowledge there is none available in the literature. Here an ADR-specific masking score is proposed, based on the set consisting of the number of reports on all the combinations including the ADR, excluding those combinations with no reports at all. The masking score is then defined as the ratio between the mean and the median of that set. The logic behind the definition is that an ADR with even reporting across different drugs would get a mean close to the median and thus a low value, whereas an ADR with one or several outliers would get a high value. In analogy with the comparison of the fraction of sole drug reports above, the masking score of the ADRs of the combinations in the LLR disconcordance group was compared to the masking score in the positive concordance group.
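The proposed masking score is simple enough to state in a few lines. The sketch below follows the definition in the text (mean divided by median of the non-zero report counts for one ADR); the function name and the example counts are hypothetical.

```python
from statistics import mean, median


def masking_score(report_counts):
    """Mean-to-median ratio of the report counts over all drugs for one ADR.

    Combinations with no reports at all are excluded, as in the definition.
    """
    nonzero = [n for n in report_counts if n > 0]
    if not nonzero:
        raise ValueError("the ADR has no reported combinations")
    return mean(nonzero) / median(nonzero)


if __name__ == "__main__":
    even = [4, 5, 5, 6, 0, 0]       # evenly reported ADR: mean 5, median 5
    skewed = [4, 5, 5, 6, 480]      # one over-reported drug: mean 100, median 5
    print(masking_score(even))
    print(masking_score(skewed))
```

A single heavily over-reported drug, as cerivastatin is for rhabdomyolysis, pulls the mean up while leaving the median almost unchanged, which is what drives the score.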

The results are given in Figure 12. Except when the prior variance is very high, the suggested masking score is higher in the LLR disconcordance group than in the positive concordance group. The differences are not as large as was the case with the fraction of sole drug reports above, particularly not for the prior variance 0.2, and the interpretation is that unmasking is one, though probably not the strongest, reason why the LLR finds associations that the IC does not find. It must also be kept in mind that the masking score defined here is not validated and might be far from optimal.

Note that the LLR becomes more and more restrictive with decreasing prior variance whereas the IC is kept constant, so that the LLR disconcordance group becomes smaller and smaller with decreasing prior variance. Because of this, when the prior variance is very small only the strongest associations of those that the IC did not find are left in the LLR disconcordance group. This is where we would expect to find most cases of masking, which is also confirmed by the results. However, this group of very strong associations is also found by the LLR as the prior variance increases.

[Figure 11: boxplots. Horizontal axis: Prior variance (0.005, 0.01, 0.1, 0.2, 0.3, 0.5, 1). Vertical axis: Fraction sole drug reports. Legend: IC disconcordance, Pos. concordance.]

Figure 11. Boxplots showing, for different values of the LLR prior variance, the fraction of sole drug reports for the combinations in the two groups. The IC disconcordance group consists of those combinations that were found by the IC and not by the LLR, whereas the positive concordance group, acting as reference, consists of those combinations found by both methods. Note that the LLR becomes less and less restrictive as the variance increases, so that the IC disconcordance group becomes smaller and smaller with increasing variance.

Another important aspect of ADR monitoring is to be able to highlight a new association as fast as possible. The IC was designed to be able to highlight combinations with three reports, although some combinations might need considerably more reports. The LLR as used in the systematic comparisons was studied in retrospect with respect to the minimum number of reports a combination needed to be highlighted. The results are presented in Figure 13. It is particularly interesting that at the suggested prior variance 0.2 the LLR needed four reports to give a signal. The practical importance of this difference to the IC could be debated, but one way of avoiding the whole problem could be to fine-tune the prior variance so that the LLR can highlight combinations with only three reports. The question then is how much such an adjustment would decrease the precision, a question which is impossible to answer beforehand.

A closer look at the interesting prior variance 0.2 is given in Figure 14, which displays a histogram of the IC values for the combinations in the LLR disconcordance group. If unmasking were the explanation for why the LLR highlighted these combinations we would expect the IC point estimates to be low, but, as the histogram shows, the vast majority of combinations lie above zero. Thus, these combinations are probably about to be found by the IC but they still need some more reports for that to happen. Because the LLR is capable of adjusting the parameters and discarding so many combinations due to confounding by co-medication, some of the parameter space in the models is set free so that these borderline combinations can get a higher β estimate.

It might seem confusing that the LLR (with prior variance 0.2) on the one hand is more conservative than the IC in that it cannot highlight combinations with three reports, and on the other hand is more liberal towards combinations that are almost found by the IC. This need not be a contradiction, however. Using a prior variance of 0.2, the LLR clearly is only capable of highlighting new combinations that have at least four reports. But for the particular group of combinations that have more than four reports but still have a negative IC025, the property of the LLR to correct for confounding by co-medication helps it make room for other combinations that would need additional reports to get a positive IC025. There can of course also be elements of masking in some of the cases, which makes the whole picture even more difficult to disentangle.

[Figure 12: boxplots. Horizontal axis: Prior variance (0.005, 0.01, 0.1, 0.2, 0.3, 0.5, 1). Vertical axis: Masking score. Legend: LLR disconcordance, Pos. concordance.]

Figure 12. Boxplots showing, for different values of the LLR prior variance, the masking score (see Section 3.4) for the ADRs of the combinations in the two groups. The LLR disconcordance group consists of those combinations that were found by the LLR and not by the IC, whereas the positive concordance group, acting as reference, consists of those combinations found by both methods. Note that the LLR becomes less and less restrictive as the variance increases, so that the LLR disconcordance group becomes larger and larger with increasing variance.

[Figure 13: point plot. Horizontal axis: Prior variance (0.005, 0.01, 0.1, 0.2, 0.3, 0.5, 1). Vertical axis: Minimum reports needed for signal.]

Figure 13. Graph showing, for different values of the LLR prior variance, the minimum number of reports a combination needed for a signal. The LLR becomes less and less restrictive with increasing variance, so that this number decreases to 2 for prior variance equal to 1. Note that the IC is designed to be able to pick up combinations with three reports.

Despite the suspicion that confounding by co-medication is more important than masking, there is still evidence of masking in some cases. Still looking at the LLR disconcordance group for the prior variance 0.2, there are 2941 combinations with a negative IC, spread over 438 unique ADR terms and 498 unique drug terms. Some possibly clinically relevant examples are given in Table 7, where examples of unmasked combinations between selective serotonin re-uptake inhibitors (SSRIs, a class of anti-depressants) and ADR terms related to multiple sclerosis (MS) are presented. In those cases the drug interferon has clearly masked the IC, which is realized by removing all interferon reports in the entire database and then calculating adjusted IC values.

[Figure 14: histogram. Horizontal axis: IC point estimate. Vertical axis: Number of combinations.]

Figure 14. Histogram showing the IC point estimates for the combinations in the LLR disconcordance group for the prior variance 0.2. Combinations that have been unmasked should have a low IC, whereas combinations that the LLR found because it has discarded a lot of false positives should have a high IC.

Table 7. Examples of masked combinations for two ADRs related to MS.

ADR 1. MS-like syndrome
             ADR/Drug   Sole drug   Drug                              Literature
Drug         reports    reports     reports   β      IC      ICadj    evidence
Fluoxetine   21         18          51487     0.65   -0.79   0.25     No
Sertraline   21         16          30670     1.06   -0.06   0.98     No
Paroxetine   12         11          37184     0.22   -1.11   -0.08    No

ADR 2. MS aggravated
             ADR/Drug   Sole drug   Drug                              Literature
Drug         reports    reports     reports   β      IC      ICadj    evidence
Fluoxetine   10         8           51487     0.22   -1.79   -0.47    Yes
Paroxetine   8          6           37184     0.13   -1.63   -0.50    No

’Literature evidence’ refers to a publication of the combination in the journal Reactions Weekly. The adjusted IC (ICadj) is the IC one would get if all reports on the drug interferon were removed from the database. This drug has clearly masked the combinations listed above.
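The adjustment behind the ICadj column of Table 7 can be illustrated with a small calculation. The sketch below is an assumption-laden simplification: it uses the unshrunk observed-to-expected ratio, log2(O/E), rather than the shrunk Bayesian IC of the UMC, and all counts and function names are hypothetical.

```python
from math import log2


def information_component(n_both, n_drug, n_adr, n_total):
    """Unshrunk IC: log2 of observed over expected number of reports."""
    expected = n_drug * n_adr / n_total
    return log2(n_both / expected)


def adjusted_ic(n_both, n_drug, n_adr, n_total,
                masker_total, masker_adr, masker_drug=0, masker_both=0):
    """IC after deleting every report that lists the masking drug.

    masker_total: reports listing the masking drug
    masker_adr:   of those, reports that also list the ADR
    masker_drug:  of those, reports that also list the drug under study
    masker_both:  of those, reports listing both the drug and the ADR
    """
    return information_component(n_both - masker_both,
                                 n_drug - masker_drug,
                                 n_adr - masker_adr,
                                 n_total - masker_total)


if __name__ == "__main__":
    # Hypothetical counts, not taken from the WHO database.
    ic = information_component(20, 50_000, 2_000, 4_000_000)
    ic_adj = adjusted_ic(20, 50_000, 2_000, 4_000_000,
                         masker_total=30_000, masker_adr=1_500)
    print(round(ic, 2), round(ic_adj, 2))
```

Removing a drug that accounts for most reports on the ADR lowers the expected count for every other drug, so a negative IC can turn positive without any change in the observed count.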

4 Discussion

The aim of this project as presented in Section 1 was to implement the method of LLR on the WHO drug safety database; to investigate the properties of the LLR in ADR monitoring to see if this method is practically useful; and to attempt to characterize when the LLR is most and least likely to be beneficial in ADR monitoring. The implementation succeeded by controlling the existing BBR software with various Perl scripts. Further, it was shown that the LLR is indeed capable of correcting for confounding by co-medication and masking where they occur, and that these properties resulted in an overall better performance than the IC when scanning the entire database. Despite limitations with regard to its transparency, the LLR has proven to be a very useful data mining method in ADR monitoring which will be taken into practical use in some form.

Here the current choices of model, implementation and algorithm will be discussed with respect to strengths, limitations and potential future changes. Whereas the use of a logistic regression framework is natural and appealing, it must be remembered that certain assumptions are made. A simple rearrangement of Equation 4 shows that, if P(y = 1) ≪ 1, the log of P(y = 1) is roughly equal to the intercept plus the sum of the individual drugs’ contributions. Thus, the drug effects on the probability of the ADR appearing on a report are in practice assumed to be multiplicative. If this model is not really appropriate, that further complicates the interpretation of the parameter estimates. However, even though the relative effects of the drugs might be altered, the method will probably still correctly detect associations even if the model appropriateness is questionable [26].
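The multiplicative interpretation can be checked numerically. The sketch below assumes that Equation 4 is the usual logistic model, in which the log-odds of the ADR equal the intercept plus the sum of the coefficients of the drugs listed on the report; the intercept and the two drug effects are hypothetical values.

```python
from math import exp, log


def probability(intercept, betas, exposures):
    """P(y = 1) under a logistic model with 0/1 drug indicators."""
    eta = intercept + sum(b * x for b, x in zip(betas, exposures))
    return 1.0 / (1.0 + exp(-eta))


if __name__ == "__main__":
    intercept = -8.0        # hypothetical intercept for a rare ADR
    betas = [1.0, 0.5]      # hypothetical drug effects
    p_none = probability(intercept, betas, [0, 0])
    p_first = probability(intercept, betas, [1, 0])
    p_both = probability(intercept, betas, [1, 1])
    # For a rare ADR, log P is close to the linear predictor ...
    print(log(p_both), intercept + sum(betas))
    # ... so each drug multiplies the probability by roughly exp(beta).
    print(p_first / p_none, exp(betas[0]))
    print(p_both / p_none, exp(betas[0] + betas[1]))
```

The approximation degrades as P(y = 1) grows, which is why the multiplicative reading only holds for ADRs that appear on a small fraction of reports.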

The model used here does not require the input vectors xi to be standardized, as is normally the case in lasso models. The option is available in the BBR software and it was tried, but because each xi consists of roughly 14000 entries the computations broke down. This problem might be solved by using another algorithm, such as the recently proposed interior point algorithm by Koh et al. [35]. The implication of not standardizing is that the parameters are shrunk unevenly, though our belief is that this is of little importance for the work in this project since all explanatory variables were categorical with relatively similar means and variances. If, however, the models were extended to include also continuous covariates such as patient age (see below), this might become a real problem.
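Why unstandardized predictors are shrunk unevenly can be seen from the scale of a 0/1 indicator. A minimal sketch, assuming standardization means scaling each predictor to unit variance: a drug listed on a fraction p of the reports has standard deviation sqrt(p(1 - p)), so the same raw coefficient corresponds to different standardized coefficients for drugs with different reporting frequencies. The frequencies below are hypothetical.

```python
from math import sqrt


def indicator_sd(frequency):
    """Standard deviation of a 0/1 variable that equals 1 with this frequency."""
    return sqrt(frequency * (1.0 - frequency))


def standardized_coefficient(beta, frequency):
    """Coefficient the same effect would have after scaling x to unit variance."""
    return beta * indicator_sd(frequency)


if __name__ == "__main__":
    beta = 1.0                          # same raw effect for both drugs
    for frequency in (0.001, 0.01):     # hypothetical reporting frequencies
        print(frequency, round(standardized_coefficient(beta, frequency), 4))
```

A penalty applied equally to the raw coefficients therefore weighs differently on the standardized scale, and the difference would be larger still for a continuous covariate measured in its own units.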

Further, the CLG algorithm does not exclude the intercept from the shrinkage, which is automatically the case when standardization is used. On the other hand the intercept should be the most insensitive parameter of them all, and this detail probably does not change the method’s behaviour too much. As an example, for lactic acidosis the intercept changes from -8.07 for a prior variance of 1 to -8.11 for a prior variance of 0.005.

Normally in statistical model building, especially when predictability is an issue, one checks whether the model assumptions seem valid, e.g. by looking at the goodness of fit. In this data mining application we have chosen to overlook this step and instead focus only on the parameter estimates and what drug-ADR associations they suggest. This could of course be challenged even though the method in practice generates very useful output.

The choice to extend the ordinary logistic regression model by putting prior Laplace distributions on the parameters introduces additional complexity. As mentioned in Section 1, there are mainly three reasons for doing so: numerical instability in the estimation is avoided, the computations become faster (compared to the IRLS algorithm) and a natural signalling threshold is set at zero. If one were to use another type of shrinkage, e.g. ridge logistic regression, the last property would no longer apply. But then again, it might be presumptuous to claim that this property is always good. For instance, in the LLR it is impossible to tell whether a zero-valued parameter was shrunk from negative or positive values, although this probably is of little relevance in practice.

In addition to the three reasons for introducing shrinkage we have implicitly suggested a further use of the shrinkage in the LLR, namely as a substitute for the calculation of confidence intervals. It is still unknown whether the performance of the LLR could be further improved by systematically bootstrapping the models using a suitable shrinkage. It was however confirmed that it does perform very well with a prior variance of 0.2 without bootstrapping, and clearly the computational load is negligible in comparison.

Despite these suggested advantages of using shrinkage logistic regression, it is not self-evident that the shrinkage is absolutely necessary. Therefore it might be worthwhile to test some modern algorithms designed for large-scale logistic regression without shrinkage [17]. These have been claimed to remedy the instability and calculation speed issues associated with the IRLS algorithm, but they most likely require calculation of confidence intervals, probably using the bootstrap approach, in order to be practically useful in this application.

The models used in this project were all of the conceptual form

P(ADR_j) = f(Drug_1, ..., Drug_{|D|})

but there are a number of possible ways to improve on this. First, on some reports there are also listed so-called concomitant drugs. These were used by the patient at the time the ADR was experienced, but were judged to probably not be the causal agent. Since the IC clearly suffers from also including the concomitant drugs, they were not included in this project. The LLR, however, would not be expected to suffer from including these drugs; in fact it might even be advantageous. One possibility would be to give them a lower weight than all other drugs.

Second, we might introduce other recorded covariates into the model as explanatory variables:

P(ADR_j) = f(Drug_1, ..., Drug_{|D|}, Cov_1, ..., Cov_{|C|})

(where |C| denotes the total number of covariates). Examples of such covariates are patient age, patient gender, country of origin and time of reporting. The reason for including covariates in the models is that they, just as co-medicated drugs, might be confounders. Measures of disproportionality, such as the IC, can be adjusted for covariates by the use of stratification; however, this has been shown to be rather problematic and far from a real boost to overall performance [36]. Whether the results would be similar if the covariates were included in the LLR depends on whether they really have negligible explanatory value or whether the method of stratification as implemented for the IC simply was not optimal. One potential advantage of the regression approach is that the continuous covariates can be used as they are instead of being assigned to a specific stratum.

Another possible improvement would be to include two or more, ideally all, ADRs as dependent variables in the same model instead of studying each ADR separately. This ideal model would look conceptually like this (assuming that the covariates are also included):

P(ADR_1, ..., ADR_{|A|}) = f(Drug_1, ..., Drug_{|D|}, Cov_1, ..., Cov_{|C|})

The advantage of this extension is that the interdependence among the ADRs can also be accounted for. Exactly what the structure of this interdependence might look like is not known, but it is possible that there exist innocent bystanders among the ADRs just as among the drugs. It should be noted that it is far from realistic to estimate the ideal model as presented above since it would have more than 30 million parameters. Also, to the author’s knowledge, there do not exist any suitable algorithms.

Another possible way to study innocent bystander effects among the ADRs would be to use the current model with the roles of drugs and ADRs interchanged. In other words one would fix the drug under study and use the ADRs as predictors for the presence of that specific drug on a report. It is however a bit unclear how such a model could be motivated from a logical point of view.

Some of the results are worthy of further attention. First, it is striking how well the results for lactic acidosis correspond to the results obtained by Szarfman et al. based on the FDA (Food and Drug Administration) database [27]. It is still a bit unclear exactly how their method was specified since no full peer-reviewed papers have been published, though they mention that their method, called HBLR (Hierarchical Bayesian Logistic Regression), is capable of handling up to hundreds of drugs as predictors, which is far fewer than the LLR used in this project. This size limitation makes a complete database scan impossible, and the absence of other publications on this matter makes us believe that the systematic study presented here is unique in that sense.

The specific examples of confounding by co-medication presented in Section 3.2 gave valuable insight into the properties of the LLR. We noted that the LLR judged the associations between the chosen ADRs and certain drugs to be much weaker than the IC did, and in almost all cases this behaviour seemed reasonable when looking at the individual drugs’ counts. It was noted that the extent of co-medication was important as well as the strength of association between the co-medicated drug(s) and the ADR, even though it was hard to specify the details of this balance exactly. One implication of this property was that frequent co-medication within a group of drugs where no single drug was strongly associated with the ADR, such as in the hypertriglyceridaemia example, caused all drugs to get low signal scores. We call this behaviour, which in our opinion is useful, signal sharing.

The illuminating case of the drugs epirubicin and cyclophosphamide together with the ADR haemorrhagic cystitis, on the contrary, was an example of the LLR not operating in an optimal manner. In this case the co-medication between the two drugs was substantial overall but non-existent when looking at reports on this particular ADR. The LLR therefore failed to signal for epirubicin, which had 8 reports on haemorrhagic cystitis of which 7 were sole drug reports.

Although the detailed study of particular ADR terms was valuable in terms of characterizing some properties of the LLR, the most interesting results are those from the systematic comparisons. Obviously, the main finding was the increased performance of the LLR in comparison to the IC in terms of precision and recall, although it must be emphasized that it is not definitively known which associations are true and which are not. We used as a test set the publication of signals in the journal Reactions Weekly. This test set is unfortunately the only one readily available in this application, and seems to be the only way of objectively comparing two methods.

It was judged that out of the prior variances tested, 0.2 offered the best balance between precision and recall. At that point the LLR had the same precision as the IC but a recall which was more than 10 % higher. This means roughly that by using the LLR instead of the IC, one would find one bonus ’true’ association per ten ’true’ associations found by the IC without any increase in the proportion of false positive associations among the signals that have to be considered. One might wonder whether this is of practical importance. Our opinion is that it is actually quite an improvement, and when one considers that the LLR could be expected to perform even better by e.g. including the recorded covariates in the models, this novel approach seems very promising.

An attempt was made at explaining why the LLR performed better than the IC. It must be kept in mind that this is quite a complex issue since these methods differ in three conceptual ways: the type of shrinkage is different; the basic metric of the LLR is an odds ratio, which is not identical to an unshrunk IC value; and finally the LLR is, because it is a regression method, multivariate. The special type of the lasso shrinkage was briefly mentioned above, but it seems that it is more important when comparing to other shrinkage logistic regression methods than when comparing to the IC. Also, we have made no observations that might suggest a difference in practice. The difference in basic metric has for practical purposes been assumed to be unimportant; however, this might not be altogether true for combinations of rare drugs and/or ADRs, especially when the observed number of reports is very low [12, 7]. What remains is the fundamental difference that comes from the LLR being a multivariate method, in contrast to the IC that analyzes each combination independently. The discussion below therefore focuses on the relative importance of the two potential advantages that come from a multivariate method: unmasking and correction for confounding by co-medication.

The results suggest that confounding by co-medication is actually quite common in this database. The LLR's ability to correct for it, as was seen in some particular cases, makes the LLR discard some false positive signals compared to the IC, thus increasing the precision. This is probably the reason why the LLR was capable of achieving such high precision for a high degree of shrinkage. However, it was probably also the most common explanation for why the LLR found new combinations in comparison to the IC. The reason is that when certain combinations get lower parameter estimates than what would be expected by looking at the IC values, more of the parameter space is left for other combinations that, in terms of IC, are close to but not above the signalling threshold.
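As a rough illustration of confounding by co-medication, the sketch below uses invented report counts in which drug A raises the reporting rate of an ADR while drug B has no effect of its own but is often co-reported with A. The score is a plain unshrunk log2 of observed over expected, which is a simplification and not the shrunk IC used in this thesis, and the stratified calculation only mimics the spirit of a multivariate adjustment; it is not the LLR itself. All counts and function names are hypothetical.

```python
from math import log2


def log2_obs_exp(n_drug_adr, n_drug, n_adr, n_total):
    """Unshrunk log2(observed / expected) for one drug-ADR pair."""
    expected = n_drug * n_adr / n_total
    return log2(n_drug_adr / expected)


# Hypothetical counts per stratum: (number of reports, reports listing the ADR).
# The ADR rate is 20 % whenever A is listed and 1 % otherwise, regardless of B.
counts = {
    ("A", "B"): (400, 80),
    ("A", "-"): (600, 120),
    ("-", "B"): (600, 6),
    ("-", "-"): (8400, 84),
}


def score_b(strata):
    """Score drug B against the ADR using only the given strata."""
    n_total = sum(counts[s][0] for s in strata)
    n_adr = sum(counts[s][1] for s in strata)
    n_b = sum(counts[s][0] for s in strata if s[1] == "B")
    n_b_adr = sum(counts[s][1] for s in strata if s[1] == "B")
    return log2_obs_exp(n_b_adr, n_b, n_adr, n_total)


crude = score_b(list(counts))                   # all reports pooled
with_a = score_b([("A", "B"), ("A", "-")])      # only reports listing A
without_a = score_b([("-", "B"), ("-", "-")])   # only reports not listing A

print(f"crude: {crude:.2f}, with A: {with_a:.2f}, without A: {without_a:.2f}")
```

In the pooled data drug B appears clearly associated with the ADR, while within each stratum of A its observed count equals the expected count, so the apparent association is entirely due to co-reporting with A.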

Masking might also be an issue for some ADRs, especially when combined with drugs that are frequently reported overall. One could argue that the unmasked combinations are more valuable findings than the other newly found associations. The reason is that many of the other newly found associations could probably also be found by the IC by using a more liberal threshold, although this approach of course requires more manual work after the data mining selection.
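The masking effect can be sketched with invented numbers as well. Here masking is taken in its usual sense: one heavily reported drug M accounts for most reports of an ADR, which inflates the expected count for every other drug. The score is again an unshrunk log2 of observed over expected rather than the IC, the counts are hypothetical, and it is assumed that drug X is never co-reported with M.

```python
from math import log2


def log2_obs_exp(n_drug_adr, n_drug, n_adr, n_total):
    """Unshrunk log2(observed / expected) for one drug-ADR pair."""
    expected = n_drug * n_adr / n_total
    return log2(n_drug_adr / expected)


# Hypothetical database totals.
N_TOTAL = 100_000                 # all reports
N_ADR = 1_000                     # reports listing the ADR
M_REPORTS, M_ADR = 2_000, 900     # masking drug M dominates the ADR reports
X_REPORTS, X_ADR = 500, 4         # drug of interest, never co-reported with M

# Score for X against the full database background.
masked = log2_obs_exp(X_ADR, X_REPORTS, N_ADR, N_TOTAL)

# Score for X after removing every report that lists M from the background.
unmasked = log2_obs_exp(X_ADR, X_REPORTS, N_ADR - M_ADR, N_TOTAL - M_REPORTS)

print(f"masked: {masked:.2f}, unmasked: {unmasked:.2f}")
```

With M in the background, drug X scores below zero; once the reports on M are excluded, the same four reports stand out clearly. A more liberal threshold on the original score would not recover such a combination without also admitting many others.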

One aspect of ADR monitoring which has received little attention in this project is the speed of signal detection. This property is very important, since an early signal increases the chance of saving lives in the case of a serious ADR. We have observed earlier that the prior variance most appropriate from a precision-recall point of view, 0.2, would actually need one extra report in comparison to the IC in order to find a very strong new signal (i.e. four instead of three reports). As mentioned before, the practical importance of this difference could be debated, but one way of avoiding the whole problem could be to fine-tune the prior variance so that the LLR can signal for combinations with only three reports. The question then is how much such an adjustment would decrease the precision, a question which is impossible to answer in advance. It would also be interesting to study retrospectively how the signal score for certain combinations would have evolved over time in comparison to the IC.

A major limitation of the LLR seems to be its lack of transparency. The basic concept of measures based on observed and expected numbers of reports is well accepted among clinicians, but even the increased complexity of shrunk disproportionality-based measures might cause confusion. To move one step further by introducing a shrunk regression-based measure might be very problematic in terms of communicability. In some instances it might be possible to explain why a certain parameter value is lower or higher than what would be expected based on the IC by presenting specific raw counts. In general, however, this seems impractical, and it remains a challenge to launch the LLR within the clinical community.

It would also be difficult to change from the IC to the LLR in one step. For instance, using the LLR as in the systematic comparisons, with a prior variance of 0.2, would imply that about 40,000 combinations would go from being signals to not being signals, or the other way around. Even though it seems that the LLR is in fact better, such a change would not be practically possible. A transition period would be needed for the clinical community to gradually become accustomed to this new methodology. The LLR could also be used as a supplement to the IC in specific areas of interest, and there is no doubt that it will be an important part of future ADR monitoring.

To conclude, this project has demonstrated that the LLR can be implemented and used in ADR monitoring and that it has useful properties. In this study it performed better than the IC when scanning the entire database, and despite its limited transparency it has an important role to play in the future of ADR monitoring.

Acknowledgements

First and foremost I would like to thank my supervisor Andrew Bate, who has believed in me all the way and who has helped me set out my course. Further, Niklas Noren has been an excellent resource at the UMC. Many of the key steps that I have taken along the way would not have been possible without him. Johan Hopstadius has also contributed by helping me get started with the UMC server.

I would like to express my gratitude towards professor David Madigan and his colleagues for developing the BBR software and making it publicly available. If it were not for their work, I would never have got as far as I actually did.

Two people at the Department of Mathematics at Uppsala University, my examiner Silvelyn Zwanzig and professor Ingemar Kaj, are worth acknowledging too. They have ensured that this project has received academic credit as a master's thesis.

Last but not least I would like to thank the whole UMC staff for providing such a nice working environment, professionally as well as socially.

References

[1] Finney D.J. An international drug safety program. The Journal of New Drugs, 3:262–265, 1963.

[2] Evans S.J. Pharmacovigilance: a science or fielding emergencies? Statistics in Medicine, 19(23):3199–3209, 2000.

[3] Finney D.J. Monitoring adverse reactions to drugs - its logic and its weaknesses. Proceedings of the European Society for the Study of Drug Toxicity, 7:198–201, 1966.

[4] Noren G.N., Bate A., Orre R., and Edwards I.R. Extending the methods used to screen the WHO drug safety database towards analysis of complex associations and improved accuracy for rare events. Statistics in Medicine, 25:3740–3757, 2006.

[5] Hauben M. Application of an empiric Bayesian data mining algorithm to reports of pancreatitis associated with atypical antipsychotics. Pharmacotherapy, 24(9):1122–1129, 2004.

[6] Edwards I.R. and Biriell C. Harmonisation in pharmacovigilance. Drug Safety, 10(2):93–102, 1994.

[7] van Puijenbroek E.P., Bate A., Leufkens H.G., Lindquist M., Orre R., and Egberts A.C. A comparison of measures of disproportionality for signal detection in spontaneous reporting systems for adverse drug reactions. Pharmacoepidemiology and Drug Safety, 11(1):3–10, 2002.

[8] Bate A., Lindquist M., Edwards I.R., Olsson S., Orre R., Lansner A., and De Freitas R.M. A Bayesian neural network method for adverse drug reaction signal generation. European Journal of Clinical Pharmacology, 54:315–321, 1998.

[9] Evans S.J., Waller P.C., and Davis S. Use of proportional reporting ratios (PRRs) for signal generation from spontaneous adverse drug reaction reports. Pharmacoepidemiology and Drug Safety, 10(6):483–486, 2001.

[10] Hauben M., Madigan D., Gerrits C.M., Walsh L., and van Puijenbroek E.P. The role of data mining in pharmacovigilance. Expert Opinion on Drug Safety, 4(5):929–948, 2005.

[11] van Puijenbroek E.P., Egberts A.C.G., Heerdink E.R., and Leufkens H.G.M. Detecting drug-drug interactions using a database for spontaneous adverse drug reactions: An example with diuretics and non-steroidal anti-inflammatory drugs. European Journal of Clinical Pharmacology, 56:733–738, 2000.

[12] Hosmer D.W. and Lemeshow S. Applied Logistic Regression. Wiley, USA, 1989.

[13] Genkin A., Lewis D.D., and Madigan D. Large-scale Bayesian logistic regression for text categorization. Technometrics, 2006 (to appear), available on stat.rutgers.edu/~madigan/PAPERS/techno-06-09-18.pdf.

[14] Hastie T., Tibshirani R., and Friedman J. The Elements of Statistical Learning. Springer, Canada, 2001.

[15] Hoerl A.E. Application of ridge analysis to regression problems. Chemical Engineering Progress, 58:54–59, 1962.

[16] Stein C.M. Inadmissibility of the usual estimator of the mean of a multivariate normal distribution. In Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability 1, pages 197–206. University of California Press, 1956.

[17] Komarek P. and Moore A. Fast robust logistic regression for large sparse datasets with binary outputs. In Bishop C.M. and Frey B.J., editors, Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, Jan 3-6, 2003, Key West, FL.

[18] Tibshirani R. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1):267–288, 1996.

[19] Tropp J.A. Just relax: Convex programming methods for identifying sparse signals in noise. IEEE Transactions on Information Theory, 52(3):1030–1051, 2006.

[20] Noren G.N. Statistical methods for knowledge discovery in adverse drug reaction surveillance. PhD thesis, Department of Mathematics, Stockholm University, 2007.

[21] Zhang T. and Oles F. Text categorization based on regularized linear classifiers. Information Retrieval, 4:5–31, 2001.

[22] Genkin A., Lewis D.D., and Madigan D. BBR: Bayesian logistic regression software. www.stat.rutgers.edu/~madigan/BBR, cited 2007-04-12.

[23] Hietaniemi J. Comprehensive Perl Archive Network. www.cpan.org, cited 2007-04-12.

[24] Osborne M.R., Presnell B., and Turlach B.A. On the lasso and its dual. Journal of Computational and Graphical Statistics, 9(2):319–337, 2000.

[25] Efron B. and Tibshirani R.J. An Introduction to the Bootstrap. Chapman & Hall, USA, 1993.

[26] DuMouchel W. Logistic regression analysis of spontaneous ADR databases. In International Conference on Pharmaco-Epidemiology, Nashville, TN, 2005.

[27] Szarfman A., DuMouchel W., Fram D., Tonning J.M., Almenoff J., Fleischer R.D., and Levine J.G. Lactic acidosis: Unraveling the individual toxicities of drugs used in HIV and diabetes polytherapy by hierarchical Bayesian logistic regression data mining. In FDA Science Forum, 2005 (Conference Abstract).

[28] Manfredi R. and Chiodi F. Disorders of lipid metabolism in patients with HIV disease treated with antiretroviral agents: frequency, relationship with administered drugs and role of hypolipidaemic therapy with bezafibrate. Journal of Infection, 42(3):181–188, 2001.

[29] Fraiser L.H., Kanekal S., and Kehrer J.P. Cyclophosphamide toxicity. Characterising and avoiding the problem. Drugs, 42(5):781–795, 1991.

[30] Bayer Corporation. Letter publicly available from FDA on www.fda.gov/medwatch/safety/2001/Baycol2.htm, published August 8, 2001.

[31] Noren G.N., Sundberg R., Bate A., and Edwards I.R. A statistical methodology for drug-drug interaction surveillance. Technical Report 2007:6, Mathematical Statistics, Stockholm University, 2007.

[32] Lindquist M., Stahl M., Bate A., Edwards I.R., and Meyboom R.H. A retrospective evaluation of a data mining approach to aid finding new adverse drug reaction signals in the WHO international database. Drug Safety, 23(6):533–542, 2000.

[33] Lindquist M., Edwards I.R., Bate A., Fucik H., Nunes A.M., and Stahl M. From association to alert - A revised approach to international signal analysis. Pharmacoepidemiology and Drug Safety, 8(S1):S15–S25, 1999.

[34] Edwards I.R., Lindquist M., Wiholm B.E., and Napke E. Quality criteria for early signals of possible adverse drug reactions. Lancet, 336(8708):156–158, 1990.

[35] Koh K., Kim S-J., and Boyd S. An interior-point method for large-scale l1-regularized logistic regression. Journal of Machine Learning Research, 2006 (submitted).

[36] Hopstadius J., Noren G.N., Bate A., and Edwards I.R. Adjustment for potential confounders in adverse drug reaction surveillance. Drug Safety, 2007 (submitted).

Appendix A Example of a Result File

ADR is PROTHROMBIN DECREASED (number is 583)

Bayesian Binary Regression - Training. Ver. 3.01

Command line: c:/BBR/BBRtrain -p 1 -V 1 prothrombin.dat prothrombin.mod

Log Level: 0

PriorType: Laplace Lambda=1.41421

HyperParameter: Fixed

Feature Utility Function: Pearson’s correlation

#Features To Select: use all

Cosine normalization: No

Model - link function: Logistic Optimizer: ZO Threshold probability: 0.5

Standardize: No

Convergence threshold: 0.0005 Iterations limit: 1000

Data file for Training: prothrombin.dat

Write Model file: prothrombin.mod

Results file: no

Training data 3836537 rows

Features in training data: 13208

Class sizes in training, 1/0: 14946 / 3821591

Design: 13209 variables

Final prior variance value 1

Starting ZO model, Time 168

PriorType: Laplace Lambda=1.41421 Prior var=1

Stopped by original ZO rule

Built ZO model 37 iterations, Time 344

Beta components dropped finally: 12446 Left: 763

Beta IC IC025 Number Name

-2.374500000 -5.99 -9.73 175 "Diphtheria and tetanus toxoids"

-2.373260000 -5.85 -9.58 12537 "Influenza vaccine"

-2.367450000 -5.70 -9.43 13792 "Evra"

-2.193780000 -5.36 -7.05 6521 "Hepatitis b vaccine"

-2.160130000 -5.16 -7.68 3009 "Levonorgestrel"

-2.025700000 -2.22 -5.95 12816 "Nadroparin"

. . . lots of lines left out . . .

0.000000000 -1.57 -5.30 13068 "Verteporfin"

0.000000000 -1.62 -5.35 1302 "Pizotifen"

0.000000000 -1.66 -5.39 13821 "Omalizumab"

0.000000000 -1.66 -5.39 349 "Nitrazepam"

0.000000000 -1.66 -5.40 4171 "Zimeldine"

0.000000000 -1.78 -5.51 11781 "Amifostine"

0.000000000 -1.78 -5.51 1778 "Pentamidine"

0.000000000 -1.96 -5.69 14610 "Natalizumab"

0.000000000 -1.97 -5.70 276 "Chlortalidone"

0.000000000 -1.98 -5.71 4766 "Flunarizine"

0.000000000 -1.98 -5.72 4883 "Amoxapine"

0.000000000 -2.05 -5.78 12553 "Imiquimod"

0.000000000 -2.23 -5.96 1052 "Flupentixol"

0.000000000 -2.26 -5.99 546 "Xylocaine-epinephrine"

-6.66926 <Intercept>

---Resubstitution results---

Confusion matrix: Relevant Not Relevant

Retrieved 633 718

Not Retrieved 14313 3820873

Precision = 46.8542

Recall = 4.23525

F1 = 7.7683

T11U = 548

T11NU = 0.0183327

T11SU = 0.345555

T11F = 0.155528

T13U = 11942

% errors = 0.391786

Training set loglikelihood -58178.8 Average -0.0151644

Log prior (penalty) -5435.4 Log posterior -63614.2

ROC area under curve 92.7119

Time 418
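The summary figures in this result file can be recomputed from the confusion matrix as a sanity check. The sketch below is not part of BBR; it uses the standard definitions of precision, recall and F1, and it makes two assumptions that are merely consistent with the numbers printed above: that the Laplace prior has density (lambda/2)·exp(-lambda·|beta|), so that its variance is 2/lambda², and that T11U and T13U are the linear utilities 2·TP − FP and 20·TP − FP.

```python
from math import sqrt

# Counts copied from the confusion matrix in the result file.
tp, fp = 633, 718            # retrieved: relevant, not relevant
fn, tn = 14313, 3820873      # not retrieved: relevant, not relevant
n_rows = tp + fp + fn + tn   # should equal the number of training rows

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
pct_errors = 100 * (fp + fn) / n_rows

# Assumed utility definitions; they reproduce the T11U and T13U lines.
t11u = 2 * tp - fp
t13u = 20 * tp - fp

# Assumed Laplace parameterization: variance = 2 / lambda**2.
prior_variance = 1.0
laplace_lambda = sqrt(2.0 / prior_variance)

print(f"Rows      = {n_rows}")
print(f"Precision = {100 * precision:.4f}")
print(f"Recall    = {100 * recall:.5f}")
print(f"F1        = {100 * f1:.4f}")
print(f"% errors  = {pct_errors:.6f}")
print(f"T11U      = {t11u}, T13U = {t13u}")
print(f"Lambda    = {laplace_lambda:.5f}")
```

Under the same assumed parameterization, the prior variance of 0.2 favoured in the discussion would correspond to a lambda of sqrt(10), roughly 3.16.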
