
Predictors of Abstinence from Heavy Drinking During Treatment in COMBINE and External Validation in PREDICT

Ralitza Gueorguieva, Ran Wu, Patrick G. O’Connor, Constance Weisner, Lisa M. Fucito, Sabine Hoffmann, Karl Mann*, and Stephanie S. O’Malley*

Background: The goal of the current study was to use tree-based methods (Zhang and Singer, 2010, Recursive Partitioning and Applications, 2nd ed. Springer, New York) to identify predictors of abstinence from heavy drinking in COMBINE (Anton et al. JAMA 2006; 295:2003), the largest study of pharmacotherapy for alcoholism in the United States to date, and to validate these results in PREDICT (Mann et al. Addict Biol 2012; 18:937), a parallel study conducted in Germany.

Methods: We compared a classification tree constructed according to purely statistical criteria to a tree constructed according to a combination of statistical criteria and clinical considerations for prediction of no heavy drinking during treatment in COMBINE. We considered over 100 baseline predictors. The tree approach was compared to logistic regression. The trees and a deterministic forest identified the most important predictors of no heavy drinking for direct testing in PREDICT.

Results: The tree built using both clinical and statistical considerations consisted of 4 splits based on consecutive days of abstinence (CDA) prior to randomization, age, family history of alcoholism, and confidence to resist drinking in response to withdrawal and urges. The tree based on statistical considerations with 4 splits also split on CDA and age but also on gamma-glutamyl transferase level and drinking goal. Deterministic forest identified CDA, age, and drinking goal as the most important predictors. Backward elimination logistic regression among the top 18 predictors identified in the deterministic forest analyses identified only age and CDA as significant main effects. Longer CDA and goal of complete abstinence were associated with better outcomes in both data sets.

Conclusions: The most reliable predictors of abstinence from heavy drinking were CDA and drinking goal. Trees provide binary decision rules and straightforward graphical representations for identification of subgroups based on response and may be easier to implement in clinical settings.

Key Words: Alcohol Dependence, Clinical Trial, Classification Tree, Deterministic Forest, Logistic Regression.

From the Department of Biostatistics (RG), Yale University School of Public Health and School of Medicine, New Haven, Connecticut; Department of Psychiatry (RW, LMF, SSO), Yale University School of Medicine, New Haven, Connecticut; Department of Internal Medicine (PGO), Yale University School of Medicine, New Haven, Connecticut; Kaiser Permanente Division of Research (CW), Oakland, California; Department of Psychiatry (CW), University of California at San Francisco, San Francisco, California; and Department of Addictive Behavior and Addiction Medicine (SH, KM), Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany.

Received for publication January 13, 2014; accepted August 1, 2014.

Reprint requests: Ralitza Gueorguieva, PhD, Yale University School of Public Health and School of Medicine, 60 College St., Room 215B, New Haven, CT 06520-8034; Tel.: 203-285-8155; Fax: 203-974-7662; E-mail: [email protected]

*Senior authors with equal contribution.

Copyright © 2014 by the Research Society on Alcoholism.

DOI: 10.1111/acer.12541

Alcohol Clin Exp Res, Vol 38, No 10, 2014: pp 2647–2656

ALCOHOLISM: CLINICAL AND EXPERIMENTAL RESEARCH Vol. 38, No. 10, October 2014

MOST STATISTICAL ANALYSES of randomized clinical trials focus on between-group comparisons; however, it is also important to identify predictors of good outcome regardless of treatment. A systematic literature review (Adamson et al., 2009) identified dependence severity, psychopathology ratings, alcohol-related self-efficacy, motivation, and treatment goal as the most consistent predictors of alcohol treatment outcome. Furthermore, dependence severity together with baseline alcohol consumption accounted for the greatest variance in predictive models. However, potentially important variables (e.g., social support, alcoholism typology) were considered in too few studies to allow for meaningful analysis, and many studies considered predictors one at a time and ignored the relationships among predictor variables. As classical regression approaches are often limited to testing main effects and lower order interactions of a limited number of predictors, a more powerful approach may be based on classification and regression trees (CARTs).

Tree-based methods originated with the development of automatic interaction detection (AID) algorithms by Morgan and Sonquist (1963), Morgan and Messenger (1973, THAID), and Kass (1980, CHAID). CART methods were formalized by Breiman and colleagues (1984). Modern developments include deterministic and random forests (Zhang and Singer, 2010) and take full advantage of the computational power available today. The basic goals of tree-based methods are to produce good classification algorithms and to estimate the predictive structure of the data (i.e., identify which variables interact with one another to produce a certain classification). This is performed by the method of recursive partitioning on a learning sample with the goal of validating the algorithm on an independent sample and prediction of future cases.

The learning sample is recursively divided into groups that are most homogeneous with respect to the outcome and most different from one another. Different versions of the algorithm incorporate different statistical criteria for splitting the sample and determining the optimal size of the tree (Breiman et al., 1984; Zhang and Singer, 2010).

Tree-based methods are appealing alternatives to standard linear model techniques when assumptions of additivity of the effects of explanatory variables, normality, and linearity are untenable. Tree-based and forest-based methods are nonparametric, computationally intensive algorithms that can be applied to large data sets and are resistant to outliers. They allow consideration of a large pool of predictor variables and can discover predictors that even experienced investigators may have overlooked (Zhang and Singer, 2010). These methods are most useful for identification of variable interactions and may be easier to use in clinical settings because they require evaluation of simple decision rules rather than mathematical equations (Zhang and Singer, 2010).

However, due to their focus on interactions, tree-based methods can lead to idiosyncratic results that fail to replicate. This problem has plagued the field since the inception of tree-based algorithms, especially earlier versions of the methods. In order for tree-based methods to be used in clinical practice, clinical and practical considerations may need to be taken into account in order to avoid deriving trees with splitting variables that may be difficult to use by clinicians and that are unlikely to replicate in other studies.

In alcohol research, tree-based techniques have been used only in epidemiological studies for prediction (Muller et al., 2008; Vik et al., 2006). The potential of these methods to inform clinical decisions in treatment studies and ultimately in clinical practice has not been explored. The goal of the current study was to use tree-based methods to identify groups of subjects with good drinking outcomes during treatment in COMBINE (Anton et al., 2006) and to validate these results in PREDICT (Mann et al., 2012).

The COMBINE Study evaluated the benefits of combining pharmacotherapy treatment (naltrexone, acamprosate) and behavioral interventions (Medication Management [MM, Pettinati et al., 2004]; combined behavioral intervention [CBI, Miller, 2004]) in alcohol-dependent patients. The PREDICT Study adopted the methodologies, assessment, and behavioral interventions used in the COMBINE Study to allow for direct comparisons of the findings between the 2 studies (Mann et al., 2009).

A number of summary drinking outcomes have been considered in alcohol research (e.g., percent drinking or heavy drinking days, total abstinence). In COMBINE, we focus on abstinence from heavy drinking following a grace period of 8 weeks, which has been recommended as an outcome measure for clinical trials because it is associated with reduced risk of alcohol-related consequences while allowing for improvements in drinking short of abstinence (Falk et al., 2010). This binary measure is also convenient to use with the classification approach as it provides an easily ascertained outcome and an easily interpreted decision.

MATERIALS AND METHODS

The Study Samples

In COMBINE, 8 groups received MM with 16 weeks of naltrexone (100 mg/d) or acamprosate (3 g/d), both, and/or both placebos, with or without CBI. Our analysis focused on subjects who had any drinking data during treatment (n = 1,220). A small percentage of subjects had received inpatient treatment in the 30 days prior to enrollment (7.7%), and the majority was recruited from the community.

In PREDICT, 426 subjects were recruited from inpatient facilities and were randomly assigned to naltrexone (50 mg/d), acamprosate (2 g/d), or placebo. Pharmacotherapy was given for 12 weeks combined with MM. Following the first day of heavy drinking, subjects were detoxified and randomized to augmentation with CBI in another protocol.

Drinking Outcome

The outcome measure in COMBINE was no heavy drinking during the last 8 weeks of double-blind treatment. Seventy subjects (5.74%) were coded as heavy drinking due to missing values. In PREDICT, once subjects relapsed to heavy drinking, they were switched to another protocol (Berner et al., 2013); therefore, the outcome variable in PREDICT was whether subjects relapsed to heavy drinking during double-blind treatment.

Predictors

We considered over 100 baseline predictors in COMBINE that had <15% missing values. Categorical predictors with missing values had an additional missing category created. Continuous predictors had missing values imputed using PROC MI in SAS (SAS Institute Inc., Cary, NC). Predictors are shown by domain in Table 1 and described in detail in Appendices S1 and S2.

Training and Validation Samples

A random two-thirds of the COMBINE data (n = 809) was used for tree development, and the remaining one-third (n = 411) for tree validation. The whole PREDICT sample (n = 426) was used for external validation.
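The random split described above could be sketched as follows. The paper does not describe the randomization mechanism, so the function name `split_train_validation`, the seed, and the use of sequential subject IDs are assumptions made purely for illustration; only the sample sizes (1,220, 809, and 411) come from the text.

```python
import random


def split_train_validation(subject_ids, n_train, seed=0):
    """Randomly partition subject IDs into training and validation sets.

    Hypothetical helper: the seed and the shuffling approach are
    illustrative assumptions, not the authors' procedure.
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# 1,220 COMBINE subjects with drinking data; 809 used for tree development.
train_ids, validation_ids = split_train_validation(range(1, 1221), n_train=809)
print(len(train_ids), len(validation_ids))
```

Fixing the seed keeps the partition reproducible, which matters when a tree is grown on one part and evaluated on the other.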

Tree Construction

Two approaches to tree construction were undertaken: one based on purely statistical criteria, and another based on statistical and practical considerations.

Purely statistical approach. The Willows program (Zhang et al., 2009) was used to construct a tree to predict the outcome in COMBINE using recursive partitioning.

Tree growing. A node was split into 2 daughter nodes so that each daughter node was as homogeneous as possible with respect to the outcome, while the 2 nodes were as different from each other as possible on outcome. At each step, the predictor variable and threshold of that variable that led to the best split in terms of entropy were selected. That is, for each node in the tree, the program assessed all potential predictors and all possible splits of values on these predictors (multiple possible thresholds for continuous and ordinal variables and combinations of categories for nominal variables) and identified the variable and split that were associated with the largest difference in proportions of heavy drinking in the 2 daughter nodes. The algorithm proceeded recursively until no further splits were possible. We imposed the restriction of at least 20 subjects in each node to avoid unstable splits.

Tree pruning. Once the full tree was grown, the chi-square statistic for 2 × 2 contingency tables was calculated for each split. Using a preselected alpha level (p = 0.0001), nodes whose chi-square values as well as the chi-square values of subsequent splits did not exceed the predetermined threshold were pruned to avoid overfitting. That is, splits for which the proportions of subjects without heavy drinking in the 2 daughter nodes were not significantly different were removed from the tree.

Performance of the algorithm on each sample was assessed using area under the curve (AUC) of the receiver operating characteristic curves showing sensitivity versus 1 − specificity for the classification of subjects in outcome categories. Values close to 1 reflect near perfect classification, while values around 0.5 reflect random classification.
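The tree-growing and pruning steps described above can be illustrated with a minimal sketch. It scores candidate splits by size-weighted entropy and tests one split with a Pearson chi-square statistic for a 2 × 2 table. The candidate proportions are taken from Table 2 and the root-split counts from Fig. 1A; the function names, the use of natural logarithms, and the absence of a continuity correction are assumptions, and the Willows program may compute these quantities differently. The sketch tests a single split only and does not implement the rule about subsequent splits.

```python
import math


def binary_entropy(p):
    """Entropy (natural log) of a binary outcome with proportion p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def split_impurity(daughters):
    """Size-weighted entropy of daughter nodes given as (n, proportion)."""
    total = sum(n for n, _ in daughters)
    return sum(n / total * binary_entropy(p) for n, p in daughters)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) and its p-value
    for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 df, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Candidate root splits (Table 2): (n, proportion without heavy drinking).
candidates = {
    "CDA": [(120, 0.70), (689, 0.38)],
    "Age": [(373, 0.54), (436, 0.34)],
    "Drinking goal": [(318, 0.54), (491, 0.35)],
}
best = min(candidates, key=lambda name: split_impurity(candidates[name]))

# Root split in the training sample (Fig. 1A): rows are daughter nodes,
# columns are (no heavy drinking, heavy drinking).
chi2, p_value = chi_square_2x2(84, 120 - 84, 263, 689 - 263)
retained = p_value < 0.0001  # alpha level stated in the text
print(best, round(chi2, 2), retained)
```

Lower weighted entropy means more homogeneous daughter nodes, so the candidate with the smallest value is the preferred split under this criterion.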

Approach Based on Statistical and Clinical Considerations

Tree growing. A tree was constructed in steps using recursive partitioning in JMP (SAS Institute, 2010) as in JMP the user can control the final choice of splitting variables. For each split, the top 3 candidates based on the log-worth statistic were considered. Two of the authors (RG and RW) prepared a table with information about the top 3 candidates (an example is presented in Table 2) for splitting a node and asked the remaining co-authors to rank-order the predictors based on the practical utility of each candidate. The number 1 ranks were counted, and the predictor that collected the most number 1 ranks was selected. In case of a tie, a discussion among the voting members of the team resolved the tie. For continuous predictors, the optimal split was allowed to be adjusted slightly based on clinical considerations. For example, a gamma-glutamyl transferase (GGT) cutoff of 193 IU/L in a tree based on practical and statistical considerations was lowered to 189 IU/L, which is 3 times the upper limit of the normal range. Tree growing proceeded until the group voted to stop splitting because the splitting paths became too complicated.

Tree pruning. The constructed tree with 10 splits was evaluated on the validation sample (VS) in COMBINE. Splits for which the percentage of subjects with no heavy drinking in the 2 daughter nodes were different by <5%, and/or the differences were not in the same direction in the validation data set as in the training data set were pruned.

Performance of the algorithm on each sample was assessed using AUC (Table 3).
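The validation-based pruning rule above can be written as a short check. This is a sketch under stated assumptions: the 5% threshold is applied to the validation sample (the text does not say which sample it refers to), and the function name `keep_split` is hypothetical. Node counts come from Figs 1 and 2; the drinking goal split belongs to the purely statistical tree and is used here only to show a split that would fail the rule.

```python
def keep_split(train, valid, min_diff=0.05):
    """Return True if a split survives the validation-based pruning rule.

    Each sample is ((n_a, k_a), (n_b, k_b)): total subjects and subjects
    with no heavy drinking in the 2 daughter nodes.
    Assumption: the <5% criterion is checked in the validation sample.
    """
    def difference(sample):
        (n_a, k_a), (n_b, k_b) = sample
        return k_a / n_a - k_b / n_b

    d_train = difference(train)
    d_valid = difference(valid)
    same_direction = d_train * d_valid > 0
    return same_direction and abs(d_valid) >= min_diff


# Confidence (withdrawal/urge) split, nodes 6 vs. 7 in Fig. 2.
cwithdr = keep_split(train=((154, 55), (191, 40)), valid=((95, 32), (91, 24)))

# Drinking goal split, nodes 8 vs. 9 in Fig. 1 (illustration only).
goal = keep_split(train=((116, 77), (206, 89)), valid=((50, 20), (101, 42)))

print(cwithdr, goal)
```

The confidence split keeps the same direction with a difference of about 7 percentage points in the validation sample, whereas the drinking goal split reverses direction and differs by under 2 points, in line with the statement in the Results that it did not replicate.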

Table 1. Predictors in Tree and Forest Analyses in COMBINE

| Domain | Variables |
| --- | --- |
| Baseline | |
| Demographics | Age, gender, race, marital status, education, employment, family income |
| Alcohol consumption | % heavy days, % abstinent days, consecutive days abstinent, peak BAC, trajectories of any drinking and heavy drinking |
| Alcohol severity | SCID symptom count, CIWA, DRINC total, age of onset |
| Prior alcohol treatment | Detox, AA attendance, any treatment |
| Drinking goal | Complete abstinence, conditional abstinence, controlled drinking, other |
| Family history | Alcohol, smoking |
| Craving | OCDS total score |
| Smoking | Current smoker |
| Drug use | Cannabis use |
| Physical exam | Weight, height, pulse, blood pressure |
| Laboratory analysis | Urine and blood test results |
| Alcohol Abstinence Self-Efficacy | Total and subscale scores |
| URICA | Overall readiness and 4 subscale scores |
| WHO quality of life | Environment, physical, psychological, social relationships |
| SF12 | Physical health |
| Profile of mood states | Tension, depression, anger, vigor, fatigue, confusion |
| Perceived stress | PSS total |
| Important people | Number of in-network daily drinkers |
| Legal problems | Arrested |
| Treatment | |
| Treatment condition | Acamprosate, naltrexone, COMBINE behavioral intervention |

AA, Alcoholics Anonymous; BAC, blood alcohol concentration; CIWA, Clinical Institute Withdrawal Assessment for Alcohol; DRINC, Drinker Inventory of Consequences; OCDS, Obsessive Compulsive Drinking Scale; PSS, Perceived Stress Scale; SCID, Structured Clinical Interview for DSM-IV; SF12, Short Form 12-item Survey; URICA, University of Rhode Island Change Assessment; WHO, World Health Organization.

Table 2. Top 3 Candidates for Splitting the Root Node of the Tree Predicting No Heavy Drinking During the Last 8 Weeks of Treatment in the Training Sample in COMBINE

| Predictor | Categories | Number in category | Percent without heavy drinking (%) | Statistical criteria (log worth) |
| --- | --- | --- | --- | --- |
| CDA before treatment | >14 days | 120 | 70 | 10.35 |
| | ≤14 days | 689 | 38 | |
| Age | >45 years | 373 | 54 | 8.38 |
| | ≤45 years | 436 | 34 | |
| Drinking goal | Complete abstinence or missing (a) | 318 | 54 | 5.62 |
| | Others (b) | 491 | 35 | |

CDA, consecutive days of abstinence.
(a) Complete Abstinence: “I want to quit using alcohol once and for all, to be totally abstinent, and never use alcohol ever again for the rest of my life.”
(b) Others: This grouping contains all of the other responses including: (i) no goal; (ii) controlled use; (iii) abstinence but may drink later; (iv) occasional use; and (v) want to quit but realize might slip.


Deterministic Forest

As different combinations of predictors may lead to similar prediction, we also constructed a deterministic forest in the training sample (TS) to identify the strongest predictors of the outcome. A deterministic forest consists of trees of similar structure based on the best set of splits for predicting the outcome. We applied the approach of Zhang and colleagues (2003) and considered the top 20 splits of the root node and the top 3 splits of the 2 daughter nodes of the root node, giving rise to 180 (20 × 3 × 3) trees in the forest.
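The size of the forest follows from enumerating every combination of root split and daughter splits. The sketch below only counts the combinations; the index triples stand in for actual splits, and the function name `enumerate_forest` is a hypothetical label rather than part of the Zhang and colleagues (2003) software.

```python
from itertools import product

N_ROOT_SPLITS = 20      # top splits considered at the root node
N_DAUGHTER_SPLITS = 3   # top splits considered at each daughter node


def enumerate_forest(n_root=N_ROOT_SPLITS, n_daughter=N_DAUGHTER_SPLITS):
    """List (root, left daughter, right daughter) split-rank triples.

    Each triple identifies one tree in the deterministic forest by the
    rank of the split chosen at each of its 3 internal nodes.
    """
    return list(product(range(n_root), range(n_daughter), range(n_daughter)))


forest = enumerate_forest()
print(len(forest))
```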

Logistic Regression in COMBINE

We performed backward elimination logistic regression analysis with main effects and 2-way interactions among the predictors identified as splitters at least 20% of the time in the deterministic forest. We limited the number of predictors because it was not possible to consider simultaneously all possible 2-way interactions. At each step of the backward elimination procedure, the effect with the least significant p-value larger than 0.05 was dropped so that at each step, the model was hierarchically well-formulated. Only effects that were significant in both the training and validation data sets were selected for further testing in PREDICT. We also performed logistic regression with the top categorized predictors identified from the trees and their interactions to estimate odds ratios and associated 95% confidence intervals. Backward elimination logistic regression was used to assess all possible interactions and main effects among the top 3 predictors identified in deterministic forests.

External Validation in PREDICT

We reconstructed the trees identified in COMBINE, in the PREDICT sample. We also fit the final logistic regression models identified after backward elimination algorithms in COMBINE and tested the main effects and interactions corresponding to the trees that validated internally in COMBINE to assess effect sizes in PREDICT. The predictive abilities of the trees and logistic regression models were evaluated using AUCs.

RESULTS

Classification Trees

Tree Based on Purely Statistical Approach. The constructed tree on the TS is shown in Fig. 1A, and the same tree on the VS is shown in Fig. 1B. The first splits of the trees show that a high proportion of subjects with more than 2 consecutive weeks of continuous abstinence (consecutive days of abstinence [CDA]) prior to randomization (n = 120 in TS, n = 64 in VS, nodes 2) abstained from heavy drinking in the last 8 weeks of treatment (p = 70% in TS, 56% in VS). The second splits show that among those who had 2 weeks or fewer CDA prior to randomization, younger subjects (age < 45 years, nodes 5) were less likely to abstain from heavy drinking (p = 28% in TS, p = 30% in VS). The remaining 2 splits were based on high GGT (≥193) and on drinking goal and did not replicate in VS. The AUC for this tree was 69% in TS versus 61% in VS.

Tree Based on Statistical and Clinical Considerations. The first 2 splits of the trees based on statistical and practical considerations (Fig. 2A,B) were exactly the same as the first 2 splits of the trees based on statistical considerations only. However, the third split was for subjects who abstained from drinking for 2 weeks or less prior to randomization and were younger (<45 years old) and was based on the withdrawal/urge subscore on the Alcohol Abstinence Self-Efficacy (AASE) confidence scale. Subjects with higher confidence that they could resist drinking in response to withdrawal and urges (CWITHDR ≥ 2.5, nodes 6) had higher chance of abstaining from heavy drinking (p = 36% in TS, p = 34% in VS) than subjects with lower confidence (p = 21% in TS, p = 26% in VS, nodes 7). Positive compared to negative family history of alcoholism in the group with lower confidence was associated with better outcome (p = 29% vs. p = 14% in TS, p = 30% vs. p = 23% in VS, nodes 8 vs. 9). The predictive ability of the tree was fair in TS (AUC = 68%) but was substantially lower in VS (AUC = 61%).

Table 3. Areas Under the Curve Showing the Predictive Ability of Each of the Considered Trees and Models in COMBINE and in PREDICT

| Tree or model | Predictors | COMBINE training sample (%) | COMBINE validation sample (%) | PREDICT (%) |
| --- | --- | --- | --- | --- |
| Tree based on purely statistical approach | CDA; Age (within CDA); GGT (within Age); Goal (within GGT) | 69 | 61 | 57 |
| Tree based on statistical and clinical considerations | CDA; Age (within CDA); CWITHDR (within Age); FH (within CWITHDR) | 68 | 61 | 57 |
| Tree with only the 2 top splits | CDA; Age (within CDA) | 66 | 60 | 57 |
| Full tree (22 splits) | | 78 | 64 | – |
| Final model from logistic regression after backward elimination with 2-way interactions in both training and validation sample | CDA continuous; Age continuous | 68 | 62 | 56 |
| Logistic regression with robust predictors identified in trees | CDA; Age; CDA*Age | 66 | 60 | 57 |
| Logistic regression with predictors identified in deterministic forest | CDA; Age; Goal | 69 | 62 | 60 |

CDA, consecutive days of abstinence; CWITHDR, confidence that they could resist drinking in response to withdrawal and urges; FH, family history; GGT, gamma-glutamyl transferase.

Fig. 1. Tree based on statistical considerations in COMBINE (A) training sample and (B) validation sample. The 3 numbers from top to bottom in each node of the tree are total number of subjects, number of subjects with no heavy drinking during the last 8 weeks of treatment, and percent of subjects with no heavy drinking during the last 8 weeks of treatment.

Splitting variables and branch labels shown in Fig. 1: consecutive days of abstinence prior to treatment (< 15 days), age (< 45 yrs), GGT (< 193 IU/L), and drinking goal (conditional abstinence/controlled drinking vs. total abstinence/other; in panel B the conditional abstinence/controlled drinking branch also includes missing values).

(A) Training sample, nodes listed as total, number with no heavy drinking, proportion: Node 1: 809, 347, .43; Node 2: 120, 84, .70; Node 3: 689, 263, .38; Node 4: 344, 168, .49; Node 5: 345, 95, .28; Node 6: 22, 2, .10; Node 7: 322, 166, .52; Node 8: 116, 77, .66; Node 9: 206, 89, .43.

(B) Validation sample: Node 1: 411, 160, .39; Node 2: 64, 36, .56; Node 3: 347, 124, .36; Node 4: 161, 68, .42; Node 5: 186, 56, .30; Node 6: 10, 6, .60; Node 7: 151, 62, .41; Node 8: 50, 20, .40; Node 9: 101, 42, .42.

Fig. 2. Tree based on statistical and clinical considerations in COMBINE (A) training sample and (B) validation sample. The 3 numbers from top to bottom in each node of the tree are total number of subjects, number of subjects with no heavy drinking during the last 8 weeks of treatment, and percent of subjects with no heavy drinking during the last 8 weeks of treatment.

Splitting variables and branch labels shown in Fig. 2: consecutive days of abstinence prior to treatment (< 15 days), age (< 45 yrs), confidence: withdrawal/urge (< 2.5), and family history of alcoholism (No).

(A) Training sample, nodes listed as total, number with no heavy drinking, proportion: Node 1: 809, 347, .43; Node 2: 120, 84, .70; Node 3: 689, 263, .38; Node 4: 344, 168, .49; Node 5: 345, 95, .28; Node 6: 154, 55, .36; Node 7: 191, 40, .21; Node 8: 85, 25, .29; Node 9: 106, 15, .14.

(B) Validation sample: Node 1: 411, 160, .39; Node 2: 64, 36, .56; Node 3: 347, 124, .36; Node 4: 161, 68, .42; Node 5: 186, 56, .30; Node 6: 95, 32, .34; Node 7: 91, 24, .26; Node 8: 47, 14, .30; Node 9: 44, 10, .23.


As the first 2 splits coincide for the 2 approaches, a tree with only these 2 splits was evaluated and had AUC = 66% in TS and AUC = 60% in VS. For comparison purposes, the largest tree that was constructed on TS (prior to pruning of branches) had 22 splits, AUC = 78% on TS and AUC = 64% on VS. These represent the upper limits of achievable prediction accuracy with the tree approach for these data.
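As a worked check of how an AUC arises from a tree, the sketch below computes the AUC of the 2-split tree from the terminal-node counts reported in the figures (nodes 2, 4, and 5). Each subject is scored with the training-sample proportion of the terminal node they fall into, and the AUC is the probability that a randomly chosen subject without heavy drinking scores higher than a randomly chosen subject with heavy drinking, counting ties as one half. The function name and the tie handling are assumptions about how the reported values were obtained.

```python
def auc_from_leaves(leaves):
    """AUC for a tree whose leaves are (score, n_positive, n_negative).

    Positives are subjects with no heavy drinking. Pairs of subjects in
    the same leaf share a score and count as half-concordant.
    """
    n_pos = sum(pos for _, pos, _ in leaves)
    n_neg = sum(neg for _, _, neg in leaves)
    concordant = 0.0
    for score_i, pos_i, _ in leaves:
        for score_j, _, neg_j in leaves:
            if score_i > score_j:
                concordant += pos_i * neg_j
            elif score_i == score_j:
                concordant += 0.5 * pos_i * neg_j
    return concordant / (n_pos * n_neg)


# Leaf scores: training-sample proportions for nodes 2, 4, and 5.
scores = (84 / 120, 168 / 344, 95 / 345)

# (score, no heavy drinking, heavy drinking) in each sample.
training = [(scores[0], 84, 36), (scores[1], 168, 176), (scores[2], 95, 250)]
validation = [(scores[0], 36, 28), (scores[1], 68, 93), (scores[2], 56, 130)]

auc_ts = auc_from_leaves(training)
auc_vs = auc_from_leaves(validation)
print(round(auc_ts, 2), round(auc_vs, 2))
```

Under these assumptions the values round to 0.66 and 0.60, matching the AUCs reported for the tree with only the 2 top splits.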

Deterministic Forest

The deterministic forest identified 94 variables that were used to split nodes in at least 1 of the 180 trees in the forest. Of these, 18 variables were used for node splitting at least 36 times, which corresponds to about 20% of the trees in the forest (Table 4). The top 3 predictors (CDA, age, and drinking goal) occurred in all trees in the forest. The treatments occurred less often than 20% of the time as splitting variables in the trees and thus did not appear to be strong predictors of outcome (naltrexone, 29 times; acamprosate, 24 times; CBI, 21 times).
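The 20% cutoff above amounts to simple arithmetic on the 180 trees. The following sketch reproduces it for the treatment variables; the counts are those reported in the text, while the names `MIN_COUNT` and `is_frequent_splitter` are hypothetical.

```python
N_TREES = 180
# 20% of the trees, computed with integer arithmetic: 36.
MIN_COUNT = N_TREES * 20 // 100


def is_frequent_splitter(count, min_count=MIN_COUNT):
    """True if a variable splits nodes at least 20% of the time."""
    return count >= min_count


# Number of times each treatment was used as a splitting variable.
treatment_counts = {"naltrexone": 29, "acamprosate": 24, "CBI": 21}

for name, count in treatment_counts.items():
    share = count / N_TREES
    print(f"{name}: {count} times ({share:.1%}), "
          f"frequent splitter: {is_frequent_splitter(count)}")
```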

Logistic Regression

Backward elimination with the top predictors identified in the deterministic forest analysis resulted in only 2 continuous predictors: CDA and age (Table 3). The predictive ability of this analysis was fair in TS (AUC = 68%) and substantially lower in VS (AUC = 62%).

Logistic regression analyses with CDA (≤14 days, >14 days), age (≤45 years, >45 years), and the interaction had slightly worse predictive ability (AUC = 66% in TS and AUC = 60% in VS, Table 3). The estimates in Table 5 suggest that longer abstinence was associated with significantly higher odds of abstinence from heavy drinking in TS (OR = 3.97, 95% CI: 2.59, 6.09) and in VS (OR = 2.39, 95% CI: 1.37, 4.15). Older age was also associated with better outcome in TS (OR = 2.08, 95% CI: 1.36, 3.20) and in VS (OR = 1.70, 95% CI: 1.09, 2.64). In TS, the age effect was more significant among those with shorter abstinence (OR = 2.51, 95% CI: 1.83, 3.45) than among those with longer abstinence (OR = 1.61, 95% CI: 0.58, 4.43).

Backward elimination logistic regression with the top 3 categorized predictors identified in deterministic forest analyses (CDA, age, and drinking goal) resulted in only a main effects model that had fair classification accuracy in TS (69%) and lower classification accuracy in VS (62%). As in the previous logistic regression analysis, longer abstinence was associated with better outcome in both TS and VS. Goal of complete abstinence or other was associated with significantly better outcome in both TS (OR = 1.97, 95% CI: 1.45, 2.66) and VS (OR = 1.71, 95% CI: 1.14, 2.57). However, the effect of age was significant only in the TS (OR = 2.43, 95% CI: 1.80, 3.28) and not in the VS (OR = 1.40, 95% CI: 0.92, 2.13).
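The odds ratios and confidence intervals above can also be read on the log-odds scale. Assuming the intervals are Wald intervals (symmetric around the log odds ratio, which is a common default for logistic regression but is not stated in the text), the standard error, z statistic, and two-sided p-value can be approximately recovered from the reported numbers. The function name below is hypothetical, and rounding in the published values limits the precision of the result.

```python
import math


def wald_from_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Approximate log-OR, SE, z, and two-sided p from an OR and its 95% CI.

    Assumes a Wald interval: log(OR) +/- z_crit * SE.
    """
    log_or = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return log_or, se, z, p


# CDA in the training sample: OR = 3.97, 95% CI 2.59 to 6.09.
cda_ts = wald_from_ci(3.97, 2.59, 6.09)
# Age in the validation sample (3-predictor model): OR = 1.40, 95% CI 0.92 to 2.13.
age_vs = wald_from_ci(1.40, 0.92, 2.13)

for label, (log_or, se, z, p) in (("CDA, TS", cda_ts), ("Age, VS", age_vs)):
    print(f"{label}: log OR = {log_or:.3f}, SE = {se:.3f}, z = {z:.2f}, p = {p:.3g}")
```

An interval that excludes 1 corresponds to p < 0.05 under this approximation, which matches the asterisks in Table 5 for these two entries.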

External Validation of the Results in PREDICT

Table 4. Top Predictors Used in Node Splitting in the Forest and How Often They Were Used to Split a Tree in the Forest

Variable in COMBINE                     Number of times as splitters^a   Available in PREDICT
CDA prior to treatment                  300                              Yes
Age                                     300                              Yes
Drinking goal                           229                              Yes
Baseline trajectory of heavy drinking   125                              No
AASE total confidence score             105                              Yes
Self-report ordinal health measure      105                              No
POMS fatigue scale                      87                               No
Baseline trajectory of any drinking     79                               No
AASE total temptation score             51                               Yes
OCDS total score                        51                               Yes
WHOQoL psychological domain             45                               Yes
AASE confidence: withdrawal/urge        39                               Yes
Family history: 2 or more members       38                               Yes
POMS depression scale                   36                               No
Perceived stress score                  36                               Yes^b
Potassium                               36                               No^b
Baseline smoking (yes/no)               36                               Yes^b
AST (SGOT)                              36                               Yes^b

AASE, Alcohol Abstinence Self-Efficacy; AST, aspartate aminotransferase; CDA, consecutive days of abstinence; OCDS, Obsessive Compulsive Drinking Scale; POMS, Profile of Mood States; QoL, quality of life; SGOT, serum glutamic-oxaloacetic transaminase; WHO, World Health Organization.

^a A predictor can be used more than once in splitting the nodes of a tree in the forest.

^b These predictors were not used in the external validation logistic regression analyses due to different ranges or categories between the 2 data sets.

Subjects in PREDICT had longer prerandomization abstinence (median = 22 days, interquartile range [IQR]: 20 to 25 days) than subjects in COMBINE (median = 5 days, IQR: 4 to 10 days) as COMBINE required 4 days of abstinence, whereas PREDICT required 2 weeks of abstinence. Therefore, the cutoff for CDA in the external validation analyses was raised from 2 weeks (14 days) to 3 weeks (21 days).

We then reconstructed the trees built in COMBINE, using PREDICT. All 3 trees (not shown) had the same classification accuracy (57%) in PREDICT, corresponding to slightly better classification accuracy than random classification. Logistic regression analysis with CDA (≤21 days, >21 days), age (≤45 years, >45 years), and the interaction showed that the odds of abstinence from heavy drinking during treatment were higher by about 60% among subjects with more than 3 weeks of abstinence prior to treatment (OR = 1.60, 95% CI: 1.09, 2.35, Table 5). The AUC based on this model was also 57%.

The logistic regression model with age (≤45 years, >45 years), CDA (≤21 days, >21 days), and drinking goal (total abstinence vs. all other goals) showed that longer abstinence and goal of total abstinence were associated with significantly higher odds of abstinence from heavy drinking (OR = 1.63, 95% CI: 1.11, 2.40 and OR = 1.71, 95% CI: 1.14, 2.55). Age, however, did not have a significant effect on the outcome. This model achieved 60% classification accuracy, the highest achieved classification accuracy in PREDICT.
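The categorized predictors used in these logistic regressions can be written as simple rules. The sketch below encodes the cutoffs given in the text (CDA > 14 days in COMBINE, CDA > 21 days in PREDICT, age > 45 years, and total abstinence versus all other goals). The function, the dictionary keys, and the string used for the drinking goal are hypothetical names chosen for illustration.

```python
CDA_CUTOFF_DAYS = {"COMBINE": 14, "PREDICT": 21}  # study-specific cutoffs
AGE_CUTOFF_YEARS = 45                             # same in both data sets


def dichotomize(study, cda_days, age_years, drinking_goal):
    """Turn raw baseline values into the binary predictors used in the models."""
    return {
        "longer_abstinence": cda_days > CDA_CUTOFF_DAYS[study],
        "older": age_years > AGE_CUTOFF_YEARS,
        "abstinence_goal": drinking_goal == "total abstinence",
    }


# The same 15 days of abstinence is "longer" in COMBINE but "shorter" in PREDICT.
print(dichotomize("COMBINE", 15, 46, "total abstinence"))
print(dichotomize("PREDICT", 15, 46, "controlled drinking"))
```

Because the CDA cutoff differs by study, the "longer abstinence" odds ratios in the two data sets compare different groups, which is worth keeping in mind when reading them side by side.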

DISCUSSION

Both tree-based and logistic regression analyses showed at best fair classification accuracy in the data sets. This suggests that the signal-to-noise ratios in these data sets are small and that even with multiple variables accurate prediction of abstinence from heavy drinking during treatment is difficult. Nevertheless, a couple of predictors of good outcome emerged from all analyses, namely abstinence prior to treatment and drinking goal.

The capacity to sustain abstinence prior to treatment significantly decreased the likelihood of heavy drinking. Two weeks of abstinence was the optimal split in COMBINE, which is consistent with prior work by Stout (2000). In PREDICT, CDA was defined as more than 3 weeks of abstinence due to the differences in the data ranges between COMBINE and PREDICT, but this different categorization also showed that longer pretreatment abstinence predicted a good response. Thus, individuals with less abstinence prior to entering treatment may need additional support to improve their odds of a good outcome.

Drinking goal also emerged as an important variable in the deterministic forest analyses. In both data sets, a goal of total abstinence was associated with significantly lower likelihood of heavy drinking. Subjects with goals of conditional abstinence or controlled drinking may require additional monitoring and interventions specifically designed to teach moderate drinking skills in order to achieve better outcome. In COMBINE, individuals with controlled drinking goals have more daily drinkers in their social networks, suggesting that efforts to incorporate nondrinkers in their network might be worth exploring (DeMartini et al., 2014).

The treatments did not appear in our individual trees and were very infrequently selected as splitting variables in the deterministic forest. However, we performed sensitivity analyses when forcing the first split to be on treatment in order to assess how robust the identified predictors are in the context of treatment. Results when naltrexone was forced to be the first splitting variable are presented in Fig. S1A,B. It appears that the identified predictors (CDA, age) are robust across treatments, but it is possible that within a particular treatment, stronger predictors may be identified (e.g., lower blood alcohol concentration peak levels may be a slightly better predictor of outcome than older age for subjects on naltrexone with shorter abstinence). While the question of which predictors are associated with good outcome within a treatment group is of interest, identifying subgroups of subjects for whom a particular treatment is better than another is the more challenging and potentially more important question. The latter goal may be achieved by modifying the splitting criterion (Lipkovich and Dmitrienko, 2014; Zhang et al., 2010) so that subjects are divided into subgroups with the greatest difference in treatment response rather than forcing the first split. Approaches based on counterfactual outcomes (Foster et al., 2011) are also possible. Systematic exploration of moderator effects in COMBINE and PREDICT is a topic of future research.

The purely statistical approach of tree building performed comparably to the approach based on statistical and practical considerations on TS but did not replicate as well on VS. It identified splitting variables (e.g., GGT) with cutoffs that are difficult to apply due to differences in laboratory reference values. In analyses with multiple predictors, the best splitting variable may only be marginally better than another more practically useful variable that is either more robust across samples (i.e., absolute values are more comparable across samples) or is easier to implement in a clinical setting. When due consideration to practical utility is given in the statistical analysis and different alternatives are evaluated in parallel, the results are more likely to be both nearly optimal in terms of statistical performance and translate more easily into clinical practice.

Table 5. Odds Ratios and 95% Confidence Intervals for the Effects of Predictors on Abstinence from Heavy Drinking in Logistic Regression Analyses in COMBINE and in PREDICT

Predictor                        Categories                               COMBINE Training Sample   COMBINE Validation Sample   PREDICT

Predictors from tree analyses
CDA                              Longer versus shorter^a                  3.97 (2.59, 6.09)*        2.39 (1.37, 4.15)*          1.60 (1.09, 2.35)*
Age                              Older versus younger^b                   2.08 (1.36, 3.20)*        1.70 (1.09, 2.64)*          1.21 (0.82, 1.78)
Age (among shorter abstinence)   Older versus younger^b                   2.51 (1.83, 3.45)*        1.61 (0.58, 4.43)           1.13 (0.65, 1.97)
Age (among longer abstinence)    Older versus younger^b                   1.73 (0.78, 3.83)         1.65 (0.95, 2.87)           1.29 (0.76, 2.20)

Predictors from deterministic forest analyses
CDA                              Longer versus shorter^a                  3.73 (2.40, 5.78)*        2.18 (1.24, 3.81)*          1.63 (1.11, 2.40)*
Age                              Older versus younger^b                   2.43 (1.80, 3.28)*        1.40 (0.92, 2.13)           1.16 (0.75, 1.65)
Drinking goal                    Complete abstinence versus other goals   1.97 (1.45, 2.66)*        1.71 (1.14, 2.57)*          1.71 (1.14, 2.55)*

CDA, consecutive days of abstinence.

^a In COMBINE “longer” means “>14 days,” in PREDICT “longer” means “>21 days.”

^b In both data sets “older” means “>45 years.”

*Effects are significant at 0.05 level.


Tree-based methods have potential advantages over classical statistical methods because they rely on fewer assumptions and are useful for identification of interactions. Although logistic regression can also be used to test interactions, usually the number of predictors and the order of the considered interactions are limited (only up to 2- or 3-way). Furthermore, there are no automatic procedures for cutoff selection on continuous variables, and although usually power is greater with continuous predictors, when the relationship between the predictor and the outcome is nonlinear or when a simple decision rule is necessary, logistic regression may not be very practical. In contrast, trees automatically present in the form of simple decision rules and can be easier to adapt for use in clinical settings.

However, extra caution needs to be used to perform both internal and external validation of trees as interactions replicate less frequently than main effects. In particular, trees can identify splitting variables that are idiosyncratic to the particular data set more often than classical methods that focus on main effects (Hastie et al., 2009; Zhang et al., 2010). Due to the performed internal and external validation of our analyses, our results can be considered stable in similar samples of patients.

Tree-based methods can discover important predictors and combinations of predictors among a large pool of covariates. However, in this particular application, tree-based methods did not identify strong predictors of outcome and had slightly worse performance than logistic regression in terms of classification accuracy on VS. This is probably due to lack of important multiway interactions in the data set and to a set of predictors that is not prohibitively large to be handled by classical methods.

Our conclusions are limited because the COMBINE and PREDICT samples are not necessarily representative of the patient populations treated for alcohol dependence. Programs may vary in the requirements for abstinence, and patients may have comorbidities, such as drug dependence, which were exclusionary criteria in COMBINE and PREDICT.

Furthermore, while the 3 best predictors (CDA, age, and drinking goal) can be easily assessed, the cutoffs on abstinence and age are sample-dependent. PREDICT was designed to allow direct comparisons to COMBINE, but differences in some of the inclusion/exclusion criteria, the study populations, the treatment length, and the outcome definitions hampered the external validation process. In particular, the 2 study samples differed on CDA, which turned out to be the most important predictor. In PREDICT, subjects were required to have at least 2 weeks of inpatient hospitalization prior to enrollment and thus had significantly longer CDA than subjects in COMBINE who achieved abstinence primarily as outpatients. Our decision to increase the cutoff from 2 weeks in COMBINE to 3 weeks in PREDICT was meant to assure that sufficient sample sizes were available in each group in PREDICT and may appear somewhat arbitrary. However, as fluctuations in drinking follow a weekly pattern, by increasing the cutoff by 1 week, we were able to minimize the effect of such fluctuations on the results.

We also used 2 different periods to define “no heavy drinking”: 8 weeks in COMBINE following a grace period of 8 weeks and 12 weeks in PREDICT. We could have used the first 12 weeks of the 24-week study period in COMBINE in order to achieve complete comparability of outcome; however, we chose to focus in COMBINE on no heavy drinking following a grace period in order to identify strongest predictors of this clinical outcome recommended by the Food and Drug Administration. Ideally, the same outcome should have been used in PREDICT but it was not possible as after relapse to drinking, individuals were randomized to CBI.

We coded missing heavy drinking data as heavy drinking in the analyses. This approach could produce biased estimates when the proportion of missing data is large and/or the assumptions that dropouts have relapsed to heavy drinking are untenable. To investigate the sensitivity of our conclusions, we performed alternative analyses on completers only and when subjects with missing drinking data were coded as not heavy drinking. The results were very similar, that is, the first 2 splits of all trees were exactly the same, deterministic forests identified the same top 3 predictors, and there was significant overlap in the remaining top predictors (Appendix S3). This is probably due to the small proportion of subjects with missing data but speaks to the robustness of our conclusions. In studies with larger proportion of missing data, multiple imputation may be considered (Hapfelmeier et al., 2012).

Although we did not identify unexpected or surprising predictors, our results have clinical and research utility. In particular, our finding that longer abstinence prior to treatment is a reliable predictor of treatment response can help with the design of future studies. Subjects who are able to maintain abstinence for 2 weeks or more are likely to have good outcome regardless of treatment, and hence, assessment of novel treatments for alcoholism could focus on subjects who do not achieve this level of abstinence. In clinical settings with tight resources, more intensive counseling could be allocated to those with less abstinence at evaluation time. Our forest analyses identified the most important predictor variables associated with good outcome, and thus, our findings could guide instrument selection for future studies. Further external validation of the results in other data sets will be beneficial. The PROJECT MATCH data could be particularly informative because it contains 2 arms, an aftercare arm following inpatient treatment and an outpatient arm (Project MATCH Research Group, 1997).
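The three codings of missing drinking data compared in the sensitivity analyses discussed above (missing counted as heavy drinking, completers only, and missing counted as not heavy drinking) can be sketched as follows. The function name, scheme labels, and toy records are assumptions for illustration and do not reproduce the data handling used in the study.

```python
# Value assigned to a subject with missing drinking data under each scheme.
# 1 = no heavy drinking, 0 = heavy drinking, None = excluded from analysis.
MISSING_CODE = {
    "missing_as_heavy": 0,       # main analysis
    "completers_only": None,     # sensitivity analysis
    "missing_as_not_heavy": 1,   # sensitivity analysis
}


def no_heavy_drinking(heavy, scheme):
    """Code the outcome; `heavy` is True, False, or None when data are missing."""
    missing_value = MISSING_CODE[scheme]  # KeyError for an unknown scheme
    if heavy is None:
        return missing_value
    return 0 if heavy else 1


def outcome_rate(records, scheme):
    """Proportion with no heavy drinking among subjects kept under `scheme`."""
    coded = [no_heavy_drinking(h, scheme) for h in records]
    kept = [c for c in coded if c is not None]
    return sum(kept) / len(kept)


# Made-up records: two of five subjects have missing drinking data.
toy_records = [False, True, None, False, None]
for scheme in MISSING_CODE:
    print(f"{scheme}: {outcome_rate(toy_records, scheme):.2f}")
```

In the toy records a large share of outcomes is missing, so the three rates differ noticeably; with a small proportion of missing data, as reported for the study, the codings would be expected to give similar results.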

FINANCIAL INTERESTS

Dr. Stephanie S. O’Malley: member American Society of Clinical Psychopharmacology workgroup, the Alcohol Clinical Trial Initiative, sponsored by Abbott Laboratories, Eli Lilly & Company, Lundbeck, Pfizer, and Ethypharm; contract, Eli Lilly; medication supplies, Pfizer, Inc.; consultant, Alkermes, Arkeo; Scientific Panel of Advisors, Hazelden Foundation. Dr. Karl Mann, member of the Alcohol Clinical Trial Initiative; consultant to Lundbeck and Pfizer. The PREDICT study was sponsored entirely by the German Government: Federal Ministry of Research (BMBF) sponsorship contract 01EB0110.

ACKNOWLEDGMENTS

The project described was supported by grants R01AA017173, K05 AA014715, and K23 AA020000 from the National Institute on Alcohol Abuse and Alcoholism. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute on Alcohol Abuse and Alcoholism or the National Institutes of Health. We would like to acknowledge Dr. Robert Stout from Decision Sciences Institute for his consultation and feedback and Dr. Wan-Min Tsai from Yale University for technical assistance.

REFERENCES

Adamson SJ, Sellman JD, Frampton CMA (2009) Patient predictors of alcohol treatment outcome: a systematic review. J Subst Abuse Treat 36:75–86.

Anton RF, O’Malley SS, Ciraulo D, Cisler RA, Couper D, Donovan DM, Gastfriend DR, Hosking JD, Johnson BA, LoCastro JS, Longabaugh R, Mason BJ, Mattson ME, Miller WR, Pettinati HM, Randall CL, Swift R, Weiss RD, Williams LD, Zweben A (2006) Combined pharmacotherapies and behavioral interventions for alcohol dependence. The COMBINE study: a randomized controlled trial. JAMA 295:2003–2017.

Berner MM, Wahl S, Brueck R, Frick K, Smolka R, Haug M, Hoffman S, Reinhard I, Lemenager T, Gann H, Batra A, Mann K; PREDICT study group (2014) The place of additional individual psychotherapy in the treatment of alcoholism: a randomized controlled study in nonresponders to anticraving medication - results of the PREDICT study. Alcohol Clin Exp Res 38:1118–1125.

Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and Regression Trees. Chapman & Hall/CRC, Wadsworth, CA.

DeMartini KS, Devine EG, DiClemente CC, Martin DJ, Ray LA, O’Malley SS (2014) Predictors of pretreatment commitment to abstinence: results from the COMBINE study. J Stud Alcohol Drugs 75:438–446.

Falk D, Wang XQ, Liu L, Fertig J, Mattson M, Ryan M, Johnson B, Stout R, Litten RZ (2010) Percentage of subjects with no heavy drinking days: evaluation as an efficacy endpoint for alcohol clinical trials. Alcohol Clin Exp Res 34:2022–2034.

Foster JC, Taylor JM, Ruberg SJ (2011) Subgroup identification from randomized clinical trial data. Stat Med 30:2867–2880.

Hapfelmeier A, Hothorn T, Ulm K (2012) Recursive partitioning on incomplete data using surrogate decisions and multiple imputation. Comput Stat Data Anal 56:1552–1565.

Hastie T, Tibshirani R, Friedman J (2009) The Elements of Statistical Learning, Chapter 9, 2nd edn. Springer Science + Business Media, LLC, New York, NY.

Kass GV (1980) An exploratory technique for investigating large quantities of categorical data. Appl Stat 29:119–127.

Lipkovich I, Dmitrienko A (2014) Strategies for identifying predictive biomarkers and subgroups with enhanced treatment effect in clinical trials using SIDES. J Biopharm Stat 24:130–153.

Mann K, Kiefer F, Smolka M, Gann H, Wellek S, Heinz A; PREDICT Study Research Team (2009) Searching for responders to acamprosate and naltrexone in alcoholism treatment: rationale and design of the PREDICT study. Alcohol Clin Exp Res 33:674–683.

Mann K, Lemenager T, Hoffman S, Reinhard I, Hermann D, Batra A, Berner M, Wodarz N, Heinz A, Smolkar MN, Zimmermann US, Wellek S, Kiefer F, Anton RF; PREDICT Study Team (2012) Results of a double-blind, placebo-controlled pharmacotherapy trial in alcoholism conducted in Germany and comparison with the US COMBINE study. Addict Biol 18:937–946.

Miller WR (Ed.) (2004) Combined Behavioral Intervention Manual: A Clinical Research Guide for Therapists Treating People With Alcohol Abuse and Dependence, Vol. 1. National Institute on Alcohol Abuse and Alcoholism, Bethesda, MD.

Morgan JN, Messenger RC (1973) THAID a Sequential Analysis Program for Analysis of Nominal Scale Dependent Variables. Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor.

Morgan JN, Sonquist JA (1963) Problems in the analysis of survey data, and a proposal. J Am Stat Assoc 58:415–435.

Muller SE, Weijers HG, Boning J, Wiesbeck GA (2008) Personality traits predict treatment outcome in alcohol-dependent patients. Neuropsychobiology 57:159–164.

Pettinati HM, Weiss RD, Miller WR, Donovan DM, Ernst DB, Rounsaville BJ (2004) Medical Management (MM) Treatment Manual: A Clinical Research Guide for Medically Trained Clinicians Providing Pharmacotherapy as Part of the Treatment for Alcohol Dependence, Vol. 2. National Institute on Alcohol Abuse and Alcoholism, Bethesda, MD.

Project MATCH Research Group (1997) Matching alcoholism treatments to client heterogeneity: project MATCH posttreatment drinking outcomes. J Stud Alcohol 58:7–29.

SAS Institute (2010) SAS Institute Inc 2009 Using JMP 9. SAS Institute Inc, Cary, NC.

Stout RL (2000) What is a drinking episode? J Stud Alcohol 61:455–461.

Vik PW, Cellucci T, Hedt J, Jorgensen M (2006) Transition to college: a classification and regression tree (CART) analysis of natural reduction of binge drinking. Int J Adolesc Med Health 18:171–180.

Zhang H, Legro RS, Zhang J, Zhang L, Chen X, Huang H, Casson PR, Schlaff WD, Diamond MP, Krawetz SA, Coutifaris C, Brzyski RG, Christman GM, Santoro N, Eisenberg E (2010) Decision trees for identifying predictors of treatment effectiveness in clinical trials and its application to ovulation in a study of women with polycystic ovary syndrome. Hum Reprod 25:2612–2621.

Zhang H, Singer B (2010) Recursive Partitioning and Applications. 2nd ed. Springer, New York.

Zhang H, Wang M, Chen X (2009) Willows: a memory efficient tree and forest construction software. BMC Bioinformatics 10:130.

Zhang H, Yu CY, Singer B (2003) Cell and tumor classification using gene expression data: construction of forests. Proc Natl Acad Sci 100:4168–4172.

SUPPORTING INFORMATION

Additional Supporting Information may be found in the online version of this article:

Appendix S1. Baseline predictors in COMBINE.

Appendix S2. Descriptives of predictors in COMBINE.

Appendix S3. Number of times the top 18 splitter variables identified in the main analysis with missing drinking data coded as heavy drinking appear in the deterministic forests of the main and 2 sensitivity analyses.

Fig. S1. (A) Tree with first split forced to be on naltrexone in COMBINE training sample. (B) Tree with first split forced to be on naltrexone and third split within the naltrexone group forced to be on age in the COMBINE training sample.
