Variable Selection for Individualized Treatment Decision-Making

Variable Selection for Individualized Treatment Decision-Making Presenter: Daniel Almirall University of Michigan, Institute for Social Research Joint Statistical Meetings San Diego, California July 30, 2012


Transcript of Variable Selection for Individualized Treatment Decision-Making

Page 1: Variable Selection for Individualized Treatment Decision-Making

Variable Selection for Individualized Treatment Decision-Making

Presenter: Daniel Almirall
University of Michigan, Institute for Social Research

Joint Statistical Meetings
San Diego, California

July 30, 2012

Page 2: Variable Selection for Individualized Treatment Decision-Making

WARM-UP

Page 3: Variable Selection for Individualized Treatment Decision-Making

Warm-up: Data=(S,A,Y). Suppose we want the effect of A on Y. Why condition on a pre-treatment variable S?

• Confounding (specific to observational studies): S is correlated with both A and Y.

• Precision: S may be a pre-treatment measure of Y, or any other variable highly correlated with Y.

• Missing Data: Y is missing for some units, S and A predict missing-ness, and S is associated with Y.

• Effect Heterogeneity/Moderation/Modification: S may moderate, or specify the effect of A on Y.

Page 4: Variable Selection for Individualized Treatment Decision-Making

Warm-up: Data=(S,A,Y). Suppose we want the effect of A on Y. Why condition on a pre-treatment variable S?

• Effect Heterogeneity/Moderation/Modification: S may moderate, or specify the effect of A on Y.

• Actually, our focus is on a specific type of Effect Moderation…

Page 5: Variable Selection for Individualized Treatment Decision-Making

Tailoring Variables are specific types of moderator variables.

• A tailoring variable is a pre-treatment measure such that individuals who measure at some values of the variable benefit more (equally) from one (multiple) type(s) of treatment, whereas individuals who measure at other values of the variable benefit from a different (specific) type of treatment.

• Tailoring variables are prescriptive.
• They help individualize treatment decision making.

Page 6: Variable Selection for Individualized Treatment Decision-Making

Example of a Tailoring Variable

• Provide outpatient treatment to individuals with higher levels of social support. Provide either one to individuals with low levels of social support.

Page 7: Variable Selection for Individualized Treatment Decision-Making

What is the Relevance?

• Theoretical Implication: Understanding the heterogeneity of treatment effects enhances our understanding of scientific theories, and may suggest new scientific hypotheses.

• Practical Implication 1: Identifying types of individuals for whom treatment is not effective may suggest altering the treatment to suit the needs of those individuals.

• Practical Implication 2: Individualized decision making: Provide different treatments for different types of individuals.

Page 8: Variable Selection for Individualized Treatment Decision-Making

Prototypical Linear Regression with Covariate-by-Treatment Interactions

• In the example linear model E( Y(a) | S=s ) = η0 + η1 s + β1 a + β2 a s:
  – S is a tailoring variable if β1 + β2 s is negative (positive) or zero for some values of S=s, yet positive (negative) for other values of S=s.
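A minimal Python sketch of the sign-change check on this slide, assuming the effect of treatment at S=s is written as beta1 + beta2*s. The function names and the numeric coefficients are hypothetical and chosen only for illustration.

```python
def treatment_effect(beta1, beta2, s):
    """Effect of a=1 versus a=0 at S=s in the linear model: beta1 + beta2*s."""
    return beta1 + beta2 * s


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def is_tailoring_variable(beta1, beta2, s_values):
    """True if the effect takes at least two different signs
    (negative, zero, positive) across the supplied values of S."""
    signs = {sign(treatment_effect(beta1, beta2, s)) for s in s_values}
    return len(signs) >= 2


# Hypothetical coefficients: the effect changes sign at s = 2.
print(is_tailoring_variable(-0.5, 0.25, [0, 1, 2, 3, 4]))  # True
# Hypothetical coefficients: the effect is positive at every s,
# so S moderates the effect but does not tailor.
print(is_tailoring_variable(0.5, 0.25, [0, 1, 2, 3, 4]))   # False
```

The check only looks at the values of S that are supplied, so in practice it would be evaluated over the observed range of S rather than over all real numbers.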

Page 9: Variable Selection for Individualized Treatment Decision-Making

GOALS AND CONTEXT

Page 10: Variable Selection for Individualized Treatment Decision-Making

Goal

• Primary Goal: To devise a method for Tailoring Variable selection
• Do this in such a way that
  – Results are more likely to replicate
  – Does not require a priori knowledge of functional forms, especially main effects of candidate tailoring variables
  – Permits subsequent exploratory data analysis of the ways in which the selected tailoring variables may be combined for individualizing treatment
• This is called Tailoring Variable Feature Construction

Page 11: Variable Selection for Individualized Treatment Decision-Making

Need for a Principled Method of Variable Selection for Tailoring

• Current practice may be of some concern:
  – Recall: E( Y(a) | S=s ) = η0 + η1 s + β1 a + β2 a s
  – Fit many different interactions; look for p-value < 0.05
  – Theoretical explanations sometimes come after
• Unfortunate Results:
  – Proposals for tailoring variables do not replicate
  – For example: Project MATCH in alcohol research
• Issues: Wealth of data, cost, statistical power
• The “process of discovery” is often considered fun!

Page 12: Variable Selection for Individualized Treatment Decision-Making

Variable Selection for Prediction is not the same thing as Variable Selection for Tailoring

• Prediction:
  – What variables predict Y?
• Tailoring:
  – What variables predict the individual (differential) effects of A on Y, e.g., D = Y(1) – Y(0)?
  – An obvious challenge is that we do not observe D for each individual. D can be thought of as latent.
  – Stated differently, what variables are useful in making decisions about A=1 vs A=0 in terms of optimizing Y?
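The distinction between prediction and tailoring can be illustrated with a small simulation. This is a sketch under invented assumptions: the data-generating model, the variable names x and s, and all coefficients are hypothetical. Both potential outcomes are visible here only because the data are simulated.

```python
import random


def pearson(u, v):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]    # strong predictor of Y, unrelated to D
s = [rng.gauss(0, 1) for _ in range(n)]    # drives the individual effect D
a = [rng.randint(0, 1) for _ in range(n)]  # randomized treatment (assumption)

y0 = [3 * xi + rng.gauss(0, 1) for xi in x]  # potential outcome under A=0
y1 = [y0i + si for y0i, si in zip(y0, s)]    # potential outcome under A=1
d = [y1i - y0i for y1i, y0i in zip(y1, y0)]  # D = Y(1) - Y(0), latent in real data
y = [y1i if ai == 1 else y0i for y1i, y0i, ai in zip(y1, y0, a)]  # observed Y

corr_x_y = pearson(x, y)  # how well x predicts the observed outcome
corr_s_y = pearson(s, y)  # how well s predicts the observed outcome
corr_x_d = pearson(x, d)  # how well x predicts the individual effect
corr_s_d = pearson(s, d)  # how well s predicts the individual effect
print(round(corr_x_y, 2), round(corr_s_y, 2), round(corr_x_d, 2), round(corr_s_d, 2))
```

In this setup x is the better predictor of Y while s is the variable that matters for choosing between A=1 and A=0, which is the point of the slide.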

Page 13: Variable Selection for Individualized Treatment Decision-Making

Variable Selection: Effect Heterogeneity

• Imai, K. and Ratkovic, M. (2012) “Estimating treatment effect heterogeneity in Randomized Program Evaluation” ACC, Tom Ten Have Memorial Award! Session this Wed 8/1 200P-350P.
• Kang, J., et al. (2012) “Tree-structured analysis of treatment effects with large observational data.” Applied Statistics
• Loh, W.Y., et al. (2012) “Should all smokers use combination smoking cessation pharmacotherapy?” Nicotine and Tobacco Res.
• He, X. (2012) “Identification of subgroups with large differential treatment effects in GWAS” Thesis.
• Siddique, J., et al. (2011) “Comparative effectiveness of medication vs CBT in Depressed Low-income Women” ICHPS 2011, Cleveland
• Gunter, L., et al. (2011) “Variable selection for qualitative interactions” Statistical Methodology
• Imai, K. and Strauss, A. (2011) “Estimation of heterogeneous treatment effects from randomized experiments”
• Cai, T., et al. (2010) “Analysis of randomized comparative clinical trial data for personalized treatment selections” Biostatistics. Also in the session this Wed 8/1 200P-350P.
• King, A.C., and Kraemer, H.C., et al. (2008) “Exploring refinements in targeted behavioral medicine intervention to advance public health.” Annals of Behavioral Medicine
• Kraemer, H.C. (2007) “Toward non-parametric and clinically meaningful moderators and mediators.” Stat. in Medicine. Has many other useful and recent articles in this area!!

Page 14: Variable Selection for Individualized Treatment Decision-Making

THE MOTIVATING DATA SET

Page 15: Variable Selection for Individualized Treatment Decision-Making

Adolescent Substance Use Data

• Observational study of N=2870 adolescents with substance use problems
• From substance use programs across the US (CSAT)
• GAIN: Global Appraisal of Individual Needs
  – structured clinical interview; baseline, 3, 6, 9, 12 months
  – demographics & measures along 6 dimensions of need
• Data { (S0,X0), A1, (S1,X1), A2, (S2,X2), A3, Y }
• St = pre-specified candidate tailoring variables
• Xt = many many auxiliary variables; e.g., X2 has 126
• At = did adolescent receive treatment in 3-month interval?
• Y = substance use frequency at 12 months

Page 16: Variable Selection for Individualized Treatment Decision-Making

A PROPOSAL FOR TAILORING VARIABLE SELECTION

Page 17: Variable Selection for Individualized Treatment Decision-Making

Adolescent Substance Use Data

• Data: { (S0,X0), A1, (S1,X1), A2, (S2,X2), A3, Y }
• St: Candidate Tailoring Variables
  – E.g., eps7p3, sfs8p6, etc… (E.g., 18 variables in S2)
• Xt: Auxiliary variables
• At: Treatment
• Y: Outcome = sfs8p12
• We describe the method for final time point only. (Extends readily but beyond scope of talk.)
  – Choose among the 18 variables S2 to make a decision about A3=1=treatment vs A3=0=no treatment during months 6-9.

Page 18: Variable Selection for Individualized Treatment Decision-Making

1. Use theory, clinical experience, cost to choose a pre-specified list of candidate tailoring variables S.
2. Randomly split data set: discovery & evaluation.
3. Using discovery data set:
   a) Using A=1: Build a machine to predict Y. Call this f1(S,X).
   b) Using A=0: Build a machine to predict Y. Call this f0(S,X).
   c) Using all data: Calculate D = f1(S,X) – f0(S,X)
   d) Variable selection on D: e.g., Use LASSO with STABILITY SELECTION for variable selection in a regression of D (or D>0) on S. Selected variables denoted by S* ⊆ S.
4. Using the evaluation data set:
   a) Test selected variables for differential effects in (IPTW) regression; e.g. of Y on A, S*, S*-by-A interaction terms.
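Steps 2 through 3c can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the talk does not say which prediction "machine" was used, so a simple k-nearest-neighbour average stands in for it. The data are simulated with randomized treatment, and every name (knn_predictor, the variables s and x, the sample sizes) is hypothetical.

```python
import random


def knn_predictor(features, outcomes, k=10):
    """Return f(z): the mean outcome of the k training rows nearest to z."""
    def predict(z):
        by_distance = sorted(
            range(len(features)),
            key=lambda i: sum((u - v) ** 2 for u, v in zip(features[i], z)),
        )
        nearest = by_distance[:k]
        return sum(outcomes[i] for i in nearest) / len(nearest)
    return predict


# Simulated stand-in data: one candidate tailoring variable s, one auxiliary x.
rng = random.Random(0)
data = []
for _ in range(600):
    s, x = rng.uniform(-1, 1), rng.uniform(-1, 1)
    a = rng.randint(0, 1)                  # randomized treatment (assumption)
    y = x + a * 2 * s + rng.gauss(0, 0.1)  # true individual effect is 2*s
    data.append(((s, x), a, y))

# Step 2: random split into discovery and evaluation halves.
rng.shuffle(data)
discovery, evaluation = data[:300], data[300:]

# Steps 3a and 3b: one predictor per treatment arm, discovery data only.
f1 = knn_predictor([z for z, a, y in discovery if a == 1],
                   [y for z, a, y in discovery if a == 1])
f0 = knn_predictor([z for z, a, y in discovery if a == 0],
                   [y for z, a, y in discovery if a == 0])

# Step 3c: predicted individual effect D for every discovery unit.
D = [f1(z) - f0(z) for z, a, y in discovery]

# Sanity check: D should be larger where s is large, since the true effect is 2*s.
high = [d for d, (z, a, y) in zip(D, discovery) if z[0] > 0.5]
low = [d for d, (z, a, y) in zip(D, discovery) if z[0] < -0.5]
mean_high = sum(high) / len(high)
mean_low = sum(low) / len(low)
print(round(mean_high, 2), round(mean_low, 2))
```

The evaluation half is left untouched here. In the procedure above it is reserved for step 4, so that the test of the selected variables is not contaminated by the selection itself.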

Page 19: Variable Selection for Individualized Treatment Decision-Making

Stability Selection (with LASSO)
Meinshausen and Bühlmann (2010) JRSSB

• LASSO: Often difficult to select the right amount of regularization (tuning parameter λ) to select S* exactly.
• Bootstrap the LASSO. For every value of the tuning parameter λ, calculate the probability (over bootstraps) of selecting the variable K with LASSO, written Π(K, λ). These are “stability paths”.
• For each variable K, calculate the maximum over λ of Π(K, λ). These are called “selection probabilities”.
• Keep variables whose maximum over λ of Π(K, λ) is ≥ π_thr.
• π_thr is chosen by the user (another tuning parameter!?)
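The bootstrap-and-threshold recipe on this slide can be sketched in pure Python. This is a minimal illustration, not the implementation used in the talk. The LASSO is a plain coordinate-descent solver written for this example. The resampling is a bootstrap as the slide describes. The λ grid, the threshold of 0.8, the number of resamples, and the simulated data are all hypothetical choices.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso(X, y, lam, n_iter=30):
    """Coordinate-descent LASSO for (1/2n)*||y - Xb||^2 + lam*||b||_1.
    X and y are centered internally, so no intercept is returned."""
    n, p = len(X), len(X[0])
    col_mean = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - col_mean[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    resid = [yi - y_mean for yi in y]  # residual when all coefficients are 0
    col_ms = [sum(row[j] ** 2 for row in Xc) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the partial residual (column j added back).
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_ms[j] * beta[j]
            new = soft_threshold(rho, lam) / col_ms[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Xc[i][j] * delta
                beta[j] = new
    return beta


def stability_selection(X, y, lambdas, n_boot=30, threshold=0.8, seed=0):
    """Return (selection probabilities, indices of kept variables)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = {lam: [0] * p for lam in lambdas}
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        for lam in lambdas:
            beta = lasso(Xb, yb, lam)
            for j in range(p):
                if beta[j] != 0.0:
                    counts[lam][j] += 1
    # counts/n_boot is the stability path at each lam; take the max over lam.
    sel_prob = [max(counts[lam][j] / n_boot for lam in lambdas) for j in range(p)]
    kept = [j for j in range(p) if sel_prob[j] >= threshold]
    return sel_prob, kept


# Simulated stand-in for the predicted effects D and five candidate variables S;
# only column 0 truly drives D.
rng = random.Random(1)
S = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(80)]
D = [2.0 * row[0] + rng.gauss(0, 0.5) for row in S]
sel_prob, kept = stability_selection(S, D, lambdas=[0.3, 0.5, 1.0])
print(sel_prob, kept)
```

Taking the maximum over the λ grid means a variable is kept if it is selected consistently at any single level of regularization, so the result depends less on tuning λ exactly and more on the threshold.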

Page 20: Variable Selection for Individualized Treatment Decision-Making
Page 21: Variable Selection for Individualized Treatment Decision-Making

Variable Selection Results

LASSO                             STABILITY SELECTION (w/LASSO)
age                               age
ers216 (environmental risk 6)     ers216 (environmental risk 6)
sfs8p0 (substance frequency 0)    sfs8p0 (substance frequency 0)
sfs8p3 (substance frequency 3)    -
sfs8p6 (substance frequency 6)    sfs8p6 (substance frequency 6)
eps7p0 (emotional problems 0)     -
eps7p3 (emotional problems 3)     eps7p3 (emotional problems 3)
ce6 (controlled environment 6)    -

Page 22: Variable Selection for Individualized Treatment Decision-Making
Page 23: Variable Selection for Individualized Treatment Decision-Making

Evaluation Results (IPTW Regression)

REGRESSION TERM                          EST      |WALD|   PVAL
Intercept                                0.043    2.47     0.01
Emotional Problems 3                    -0.013    0.58     0.56
Environmental Risk 6                    <0.001    0.35     0.73
Substance Frequency 6                    0.492    11.9     <0.01
Age                                      0.009    3.31     <0.01
Substance Frequency 0                    0.087    2.91     <0.01
Treatment 9                             -0.081    2.97     <0.01
Treatment 9 x Emotional Problems 3       0.098    2.06     0.04
Treatment 9 x Environmental Risk 6       0.002    2.22     0.03
Treatment 9 x Substance Frequency 6     -0.240    2.74     <0.01
Treatment 9 x (Age – 16)                -0.003    0.53     0.59
Treatment 9 x Substance Frequency 0      0.034    0.60     0.55
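The treatment and treatment-by-covariate estimates in this table imply a treatment effect that varies with the covariates. The sketch below adds up those terms for a given covariate profile. The coefficients are copied from the table, but the slides do not give the covariate scales, so the input values used here are hypothetical. The result is on the regression's outcome scale, not the standardized effect-size scale.

```python
# Treatment main effect and treatment-by-covariate interaction estimates,
# copied from the evaluation-results table.
TREATMENT_TERMS = {
    "treatment": -0.081,
    "emotional_problems_3": 0.098,
    "environmental_risk_6": 0.002,
    "substance_frequency_6": -0.240,
    "age_minus_16": -0.003,
    "substance_frequency_0": 0.034,
}


def conditional_effect(emotional_problems_3, environmental_risk_6,
                       substance_frequency_6, age, substance_frequency_0):
    """Model-implied effect of treatment vs no treatment at the given covariates."""
    t = TREATMENT_TERMS
    return (t["treatment"]
            + t["emotional_problems_3"] * emotional_problems_3
            + t["environmental_risk_6"] * environmental_risk_6
            + t["substance_frequency_6"] * substance_frequency_6
            + t["age_minus_16"] * (age - 16)
            + t["substance_frequency_0"] * substance_frequency_0)


# Hypothetical profiles: a 16 year old with every covariate at 0, and the same
# adolescent with emotional problems at month 3 set to 1 unit.
baseline = conditional_effect(0, 0, 0, 16, 0)
with_emotional_problems = conditional_effect(1, 0, 0, 16, 0)
print(baseline, with_emotional_problems)
```

Because the outcome is substance use frequency, a negative value favours treatment under this model and a positive value favours no treatment.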

Page 24: Variable Selection for Individualized Treatment Decision-Making

Evaluation Results (IPTW Regression): Specific Contrasts from the Regression Model

EFFECT OF TREATMENT VS NO TXT                                                                   EFFECT SIZE   95% CI for ES (LOW, UPP)
16 year old, No Emotional Problems 3, No Environmental Risk 6, High Substance Frequency 6       -1.38         (-2.22, -0.54)
16 year old, High Emotional Problems 3, No Environmental Risk 6, No Substance Frequency 6        0.56         (0.02, 1.10)
16 year old, High Emotional Problems 3, High Environmental Risk 6, No Substance Frequency 6      1.02         (0.30, 1.73)

Page 25: Variable Selection for Individualized Treatment Decision-Making

CAN WE DO MORE?

Page 26: Variable Selection for Individualized Treatment Decision-Making
Page 27: Variable Selection for Individualized Treatment Decision-Making

Thank you.

Contact information: [email protected]

Funding:
R03-MH-097954 (PI: Almirall)
R01-MH-080015 (PI: Murphy)
P50-DA-010075 (PI: Collins)
R01-DA-015697 (PI: McCaffrey & Griffin)

Ownership of Errors: Any errors, confusion, or misconceptions are my own—my colleagues did not completely vet all of my statements in this talk.

Page 28: Variable Selection for Individualized Treatment Decision-Making

EXTRA SLIDES

Page 29: Variable Selection for Individualized Treatment Decision-Making

[Figure: three panels plotting the outcome Y against S (S=0: no heavy drinking; S=1: returned to heavy drinking), with one line for Tx=NTX and one for Tx=NTX+CBI in each panel.]

Panel 1: S is a moderator variable because the magnitude of the effect of Tx=NTX+CBI versus Tx=NTX differs by levels of S. However, S is not a tailoring variable: Tx=NTX+CBI is better for all subjects.

Panel 2: S is a weak tailoring variable because the direction of the effect of Tx=NTX+CBI versus Tx=NTX differs by levels of S but magnitude is small. S is somewhat prescriptive: Offer Tx=NTX+CBI to S=1 subjects; the difference in effects is not substantial for S=0 subjects.

Panel 3: S is a strong tailoring variable because the direction of the effect of Tx=NTX+CBI versus Tx=NTX differs by levels of S. S is very prescriptive: Offer Tx=NTX to S=0 subjects; offer Tx=NTX+CBI to S=1 subjects. Large magnitudes of clinical significance.