The regression discontinuity design in epidemiology · The regression discontinuity design in...
Transcript of The regression discontinuity design in epidemiology · The regression discontinuity design in...
The regression discontinuity design inepidemiology
S.Geneletti1, G.Baio2 and A.P.Dawid3
1 London School of Economics and Political Science,2 University College London,3 University of Cambridge
30/11/2010
Outline
I What is the RD design?
I Causal inference
I RD design applied to statins
I THIN data
I Results
I Further work
What is the RD design?
I The regression discontinuity (RD) design was firstintroduced in the educational econometrics literature inthe 60’s [5]
I Recently other econometricians have become interested informal causal aspects [3, 6]
I The original idea was to exploit policy thresholds toestimate the causal effect of an educational intervention
What is the RD design?
ExampleI We want to know what the effect of going to college is on
income
I Comparing the income of individuals who attend collegeand those who do not will not tell us the effect of collegeattendance alone
I Confounders such as social class, ability, motivation etc.will make this difficult
I Classic problem of observational studies
What is the RD design?
Example cont’dI Often college scholarships are given on the basis of grades
obtained in final school examinations
I For example: if all exam grades are above 75% studentgets scholarship
I If one student gets 74% and another 76%
I Can we really consider them as coming from differentpopulations especially if in other respects (e.g. familyincome etc) they are the same?
I Given that there is natural variability in exam performanceeven for the same individual?
What is the RD design?
Public health ExampleI Many medicines are prescribed according to a particular
guidelineI Antiretroviral HIV drugs prescribed when patient’s CD4 counts
is less than 200 cells/mm3
I Blood pressure medication is prescribed when patient’s BP is140/90mmHg or above
I Statins are prescribed when e.g. 10 year Framingham riskscore is over 20%
What is the RD design?
Public Health Example cont’dI Consider the HIV patients.
I If one patient has a CD4 count of 195 and another of 205cells/mm3
I Theoretically, one patient gets the drugs while the otherdoesn’t
I If the two are the same in every other relevant respect
I Can we really consider them as coming from differentpopulations?
I Given that there is a natural variability in CD4 counts andin the instruments used to measure them?
RD design and confounding
Sharp DesignI The idea of the RD design is that the threshold behaves
like a randomising deviceI If we imagine that the thresholds are adhered to very
strictlyI termed sharp design
I Then we can think of the RD design as removing theconfounding due unobserved factors
I For education could be e.g. academic history, talent,motivation
I For HIV could also be unobserved health/personalcharacteristics
RD design and confounding
Fuzzy DesignI In public health contexts the sharp threshold is unlikely to
be adhered to
I Often GP’s override guidelines – generally because theyfeel patients will benefit from medication even when theydo not fit guidelines
I Often patients do not take the prescribed drugs asrecommended
I There are statistical methods that cater for these casesI termed fuzzy design
RD design and compliance
I For RD applied to GP prescription context there are twolayers of compliance
1. Compliance of GP to prescription guidelines [i.e. only givepatients with CD4 count below 200 cells/mm3 theantiretroviral drug]
2. Compliance of patient to prescription [i.e. take theantiretroviral drug twice a day every day]
I The RD design is related to compliance of the first type
I The RD’s relation to compliance means it is also relatedto intention-to-treat experiments
RD design and compliance
I RD with sharp threshold = randomised trial with perfectcompliance
I RD with fuzzy threshold = randomised trial with partialcompliance
I Mathematically the LHS and RHS of both equations areidentical
I So in the fuzzy design we don’t estimate an averagecausal effect but rather a complier causal effect
I The compliers are those who “respect” the threshold,
I For the GP prescription it is those who the GP prescribesthe drug to in accordance to the guidelines
I Whether the patients take the drugs as recommendedneeds to be dealt with separately
Causality in Statistics
MotivationI Causation = intervention
I However we cannot always intervene and randomise
I The trick is to understand what mechanisms behave inthe same way under intervention and under observation
I These mechanisms are then causal
Decision theoretic (DT) set-up
I F intervention variable, X other variables
I p(T = t|F = t,X) = 1 means set T = t e.g. byrandomisation in trial
I p(T |F = ∅, X) = p(T |X), T arises “naturally” in theobservational regime
I We estimate effects as predictive expectations (or otherfunctions) -i.e. we answer which treatment would benefita new unit exchangeable to those we have observed?
Decision theoretic (DT) set-up
I F intervention variable, X other variables
I p(T = t|F = t,X) = 1 means set T = t e.g. byrandomisation in trial
I p(T |F = ∅, X) = p(T |X), T arises “naturally” in theobservational regime
I We estimate effects as predictive expectations (or otherfunctions) -i.e. we answer which treatment would benefita new unit exchangeable to those we have observed?
Decision theoretic (DT) set-up
I F intervention variable, X other variables
I p(T = t|F = t,X) = 1 means set T = t e.g. byrandomisation in trial
I p(T |F = ∅, X) = p(T |X), T arises “naturally” in theobservational regime
I We estimate effects as predictive expectations (or otherfunctions) -i.e. we answer which treatment would benefita new unit exchangeable to those we have observed?
Simple problem first
I Consider theATE = E(Y |F = 1, T = 1)− E(Y |F = 0, T = 0)
I Where we leave out X for simplicity
I This is not necessarily the same as the “naive” treatmenteffectNTE = E(Y |F = ∅, T = 1)− E(Y |F = ∅, T = 0)
I Unless Y does not depend on how the treatment wasadministered
I I.e. F⊥⊥Y |T
Simple problem first
I Consider theATE = E(Y |F = 1, T = 1)− E(Y |F = 0, T = 0)
I Where we leave out X for simplicity
I This is not necessarily the same as the “naive” treatmenteffectNTE = E(Y |F = ∅, T = 1)− E(Y |F = ∅, T = 0)
I Unless Y does not depend on how the treatment wasadministered
I I.e. F⊥⊥Y |T
Simple problem first
I Consider theATE = E(Y |F = 1, T = 1)− E(Y |F = 0, T = 0)
I Where we leave out X for simplicity
I This is not necessarily the same as the “naive” treatmenteffectNTE = E(Y |F = ∅, T = 1)− E(Y |F = ∅, T = 0)
I Unless Y does not depend on how the treatment wasadministered
I I.e. F⊥⊥Y |T
Simple problem cont
F T Y
1. Y⊥⊥F |T means only the value of treatment matters for Y
2. However that does not tend to hold...
3. Usually there is a confounder U s.t.
U ⊥⊥ F
Y ⊥⊥ F |(U, T )
4. If U is unobserved and there is no randomisation thenATE 6= NTE
Simple problem cont
F T Y
U
1. Y⊥⊥F |T means only the value of treatment matters for Y
2. However that does not tend to hold...
3. Usually there is a confounder U s.t.
U ⊥⊥ F
Y ⊥⊥ F |(U, T )
4. If U is unobserved and there is no randomisation thenATE 6= NTE
Simple problem first
I If we look at adherence to the threshold as compliance
I We can introduce another variable binary Z – thethreshold indicator:
I If Z = 1 the individual is above the threshold
I If Z = 0 the individual is below the threshold
I When the threshold is strict then Z = F
RD design
F T Y
U
Z
I Z and F both have the same relationship with U ,T and Y
I This means Z can be used for causal inference
The RD design
AssumptionsA1 The threshold is set prior to the observed data and is not
changed after observationI Generally plausible as threshold set by the powers that be e.g.
gov’t agencies, NICE etc.
A2.1 Individuals close to the threshold are exchangeableI We have no reason to believe that the individuals just above
and below the threshold are differentI This is violated if individuals can change their outcome to fall
above or below the thresholdI Benefit fraud: individuals might say their income is below a
threshold in order to fall into a category that receives benefits
The RD design
Assumptions cont’dI Another way of expressing A2.1:
A2.1 The threshold is a randomising deviceI This means that a comparison of above and below gives us a
causal effect estimate of the treatment – at the thresholdI This is because randomisation is the gold standard for causal
inference as controls for confoundingI The question is how far above and how far below?
The RD design
Assumptions cont’dA3 The assignment variable is continuous
I There cannot be a threshold w/out a continuous variableI Means we don’t have to worry about choosing bandsI We fit two separate regressions – one above and one below the
thresholdI Or assume a common slope and fit one regression – this
assumes effect is the same everywhere
The causal effect
The continuous case: Sharp thresholdI Let Y be the outcome, W the assignment variable and T
the treatment indicator
I If the regressions are given by
E(Y )s = αs + βsW
where:I x is the value of X at the threshold;I s = b⇒W < w (below)I s = a⇒W ≥ w (above)
An estimate of the causal effect of the treatment is
ACE = E(Y |T = 1)− E(Y |T = 0)
= αb − αa + (βb − βa)w
I There are more sophisticated estimates[3, 6]
The causal effect
The continuous case: Fuzzy thresholdI Often there is not strict adherence to threshold
I Use the relationship between RD design and complianceto estimate the effect in this situation
I If Z = 1 if individual is above the threshold and Z = 0below then RD fuzzy estimate same as partial complianceestimate
I The local average treatment effect (LATE) – compliereffect [? ]
I Can be equated to fuzzy average causal effect (FACE)
LATE
The causal effect
The continuous case: Fuzzy thresholdI The formula for the fuzzy estimator is
FACE =E(Y |Z = 1)− E(Y |Z = 0)
E(T |Z = 1)− E(T |Z = 0)
I One estimate is:
αb − αa + (βb − βa)wˆp1|1 − ˆp1|0
I Where ˆpt|z is an estimate of p(T = t|Z = z)
I This is partly based on the compliance literature [1]
The RD design for binary outcomes
I Many outcomes in public health are binary (death, cvdevent)
I The RD design can be used for binary outcomes by usinglogistic regressions
I And then looking at treatment risk-ratios (RR)
I We don’t want to use odds ratios because we don’tnecessarily have rare outcomes
I Also, we want to be able to evaluate the RR at thethreshold
The causal risk ratio
The binary case: sharp thresholdI If we fit two separate logistic regressions
logit(p)s = αs + βsX,where s = {a, b} for above and below,
I then causal risk ratio at the threshold x is given by
RR =1 + exp(−{αb + βbx})1 + exp(−{αa + βax})
The causal risk ratio
The binary case: fuzzy thresholdI The fuzzy design for a binary outcome was originally
developed in the compliance literature by [2]
FRR =
1− p(Y |Z = 1)− p(Y |Z = 0)
p(Y |T = 1, Z = 1)p(T |Z = 1)− p(Y |T = 1, Z = 0)p(T |Z = 0)
I The different parts are estimated using logistic regressionsevaluated at the threshold
I The FRRI =RR when the design is sharpI Is further from the RR the more fuzzy the design
I This can also be derived along the same lines as the LATEbut much harder work!
The trouble with statins
I Statins are a class of drugs used to lower cholesterol andprescribed to prevent heart disease
I They are amongst the most prescribed drugs in the UK
I Some even suggest handing them out with fast food!
The trouble with statins
I Trials [7] show an average reduction of LDL cholesterol ofapproximately 2 mmol/l
I Also, NHS guidelines are to prescribe statins to individualsw/out previous CVD if their 10 year CVD score exceeds20% [4]
I CVD scores are predicted probabilities of event in next 10 yearsand are based on age, sex, smoking status, pressure, cholesteroland depending on type of score also diabetes, LVH etc.
I We could use the RD design with the threshold to seewhether the effect of statins is the same as in the trials
The trouble with statins
I In a second instance we can also try and determinewhether the prescription threshold is ideal
I By looking at CVD events and incorporating acost-effectiveness analysis
RD design design for statins
How do we measure the effects?I We have two outcomes of interest:
I Change in LDL cholesterol after treatmentI Occurrence of CVD events after treatment
I The threshold variable is the 10 year Framingham CVDscore
I Or another continuous variable that might be used by GPsto determine statin prescription
Example — RD design in the THIN data
I The THIN data set contains data from routine generalpractice prescriptions as well as information on thevariables that determine these prescriptions
I Individual characteristics (sex, date of birth, date ofregistration with practice, proxies of socioeconomic status)
I Medical history (GP visits, prescriptions, exams)
I This information can be used to characterise the patientswith respect to
I Measurements of health indicators that allow to estimate a riskof experiencing cardiovascular events
I Treatment with statinsI Measurements of suitable outcomes (e.g. LDL level, CHD
events, deaths)
Example (cont’d)
Preliminary analysis
I Data from THIN10 (a sub sample of 10 practices as ofFebruary 2009)
I Already existing “code lists” to identify and managecardiovascular events & related variables
I Identify relevant read codes & select records of patients withmeasurements for suitable variables
I Will need to update and perhaps modify this code list
I Created new (provisional) lists to identify records ofprescription for statin treatment
Example (cont’d)
I Estimated a cardiovascular risk predictorI Based on University of Edinburgh risk calculator
(http://cvrisk.mvm.ed.ac.uk/calculator/calc.asp)
Example (cont’d)
I Estimated a cardiovascular risk predictorI Combines two dimensions from Framingham risk calculatorI NB: Framingham risk calculator would be ideal, but it is not
consistently recorded in THINI Requires measurements of
I HLD and total cholesterol;I systolic blood pressure;I smoking and diabetes status and the presence of left
ventricular hypetrophy;I age and sex
I Problems with recording of smoking status, so will need tomake this estimation more robust
Example (cont’d)
Preliminary analysisI For the sake of simplicity we considered a simple
continuous outcomeI Measure of LDL cholesterol following the estimation of CVD
risk
I To simplify the analysis, we grouped the patientsaccording to their age at the risk prediction
I Bins of 5 years (50-54 — 85+)
I Each patient was associated with the treatment group ifthey had a prescription for statins in the year following therisk prediction
Example — Sharp design
I Assume that the design is sharp (i.e. “perfect” treatmentallocation)
I Run two regression analysesI Control for sex, risk and age at LDL measurement
I Treatment effect measured as ACE
ACE = E(Y |T = 1)− E(Y |T = 0)
Example — Sharp design
0.0 0.1 0.2 0.3
02
46
Age at prediction = 50−54 (n = 1484) ACE = −0.271
Predicted risk score
LDL
(mm
ol/l)
Not TreatedTreated
0.0 0.1 0.2 0.3 0.4 0.5 0.6
01
23
45
67
Age at prediction = 55−59 (n = 2016) ACE = −0.0334
Predicted risk score
LDL
(mm
ol/l)
Not TreatedTreated
0.0 0.1 0.2 0.3 0.4 0.5 0.6
12
34
56
Age at prediction = 60−64 (n = 2188) ACE = −0.098
Predicted risk score
LDL
(mm
ol/l)
Not TreatedTreated
0.0 0.2 0.4 0.6
02
46
8
Age at prediction = 65−69 (n = 2485) ACE = −0.554
Predicted risk score
LDL
(mm
ol/l)
Not TreatedTreated
0.0 0.2 0.4 0.6 0.8
24
68
Age at prediction = 70−74 (n = 2142) ACE = 0.0552
Predicted risk score
LDL
(mm
ol/l)
Not TreatedTreated
0.0 0.2 0.4 0.6 0.8 1.0
12
34
56
7
Age at prediction = 75−79 (n = 1167) ACE = 0.120
Predicted risk score
LDL
(mm
ol/l)
Not TreatedTreated
0.0 0.2 0.4 0.6 0.8 1.0
12
34
5
Age at prediction = 80−84 (n = 613) ACE = 0.064
Predicted risk score
LDL
(mm
ol/l)
Not TreatedTreated
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7
12
34
5
Age at prediction = 85+ (n = 251) ACE = 3.32
Predicted risk score
LDL
(mm
ol/l)
Not TreatedTreated
Example — Sharp design
I ACE reasonably stable and negative (i.e. treatmentdecreases level of LDL) for age groups 50-54 up to 70-74
I Older age groups show very unstable estimates (few datapoints in the treatment group!)
I Overall, treatment effect is small
Example — Sharp design
0.0 0.2 0.4 0.6
02
46
8
Age at prediction = 65−69 (n = 2485) ACE = −0.554
Predicted risk score
LDL
(mm
ol/l)
Not TreatedTreated
Example — Sharp design
I ACE reasonably stable and negative (i.e. treatmentdecreases level of LDL) for age groups 50-54 up to 70-74
I Older age groups show very unstable estimates (few datapoints in the treatment group!)
I Overall, treatment effect is small
I More importantly, the design is not sharp!
Example — Fuzzy design
0.1 0.2 0.3
02
46
Age at prediction = 50−54 (n = 1484)
Predicted risk score
LDL
(mm
ol/l)
Not TreatedTreated
0.0 0.1 0.2 0.3 0.4 0.5
01
23
45
67
Age at prediction = 55−59 (n = 2016)
Predicted risk score
LDL
(mm
ol/l)
Not TreatedTreated
0.0 0.1 0.2 0.3 0.4 0.5 0.6
24
68
10
Age at prediction = 60−64 (n = 2188)
Predicted risk score
LDL
(mm
ol/l)
Not TreatedTreated
0.1 0.2 0.3 0.4 0.5 0.6
02
46
8
Age at prediction = 65−69 (n = 2485)
Predicted risk scoreLD
L (m
mol
/l)
Not TreatedTreated
0.0 0.2 0.4 0.6 0.8
24
68
Age at prediction = 70−74 (n = 2142)
Predicted risk score
LDL
(mm
ol/l)
Not TreatedTreated
Example — Fuzzy design
I Under these circumstances, we cannot use ACE toestimate the causal effect, but need to build FACE
I For this preliminary analysis, we estimate the denominatorusing the observed raw proportions
I There are a few possible ways of computing the estimandI By threshold onlyI By treatment onlyI By treatment & threshold
Example (cont’d)
0.1 0.2 0.3
02
46
Age at prediction = 50−54 (n = 1484) FACE = −0.326
Predicted risk score
LDL
(mm
ol/l)
Not TreatedTreated
0.0 0.1 0.2 0.3 0.4 0.5
01
23
45
67
Age at prediction = 55−59 (n = 2016) FACE = −0.509
Predicted risk score
LDL
(mm
ol/l)
Not TreatedTreated
0.0 0.1 0.2 0.3 0.4 0.5 0.6
24
68
10
Age at prediction = 60−64 (n = 2188) FACE = −0.916
Predicted risk score
LDL
(mm
ol/l)
Not TreatedTreated
0.1 0.2 0.3 0.4 0.5 0.6
02
46
8
Age at prediction = 65−69 (n = 2485) FACE = −5.53
Predicted risk score
LDL
(mm
ol/l)
Not TreatedTreated
Some results
50-54 55-59 60-64 65-69 70-74ACE -0.2709 -0.0334 -0.0980 -0.5535 0.0550FACE -2.0254 -0.2734 -0.7816 -6.9494 0.5801FACE∗ -0.3255 -0.5085 -0.9161 -5.5263 4.2267
I ACE two regressions on data defined by threshold andcompliance
I FACE = ACEp1.1−p1.0
I ACE∗ two regressions on data defined by thresholds butwith treatment as predictor
I FACE∗ = ACE∗
p1.1−p1.0
Some results
50-54 55-59 60-64 65-69 70-74ACE -0.2709 -0.0334 -0.0980 -0.5535 0.0550FACE -2.0254 -0.2734 -0.7816 -6.9494 0.5801FACE∗ -0.3255 -0.5085 -0.9161 -5.5263 4.2267
I ACE two regressions on data defined by threshold andcompliance
I FACE = ACEp1.1−p1.0
I ACE∗ two regressions on data defined by thresholds butwith treatment as predictor
I FACE∗ = ACE∗
p1.1−p1.0
Some results
50-54 55-59 60-64 65-69 70-74ACE -0.2709 -0.0334 -0.0980 -0.5535 0.0550FACE -2.0254 -0.2734 -0.7816 -6.9494 0.5801FACE∗ -0.3255 -0.5085 -0.9161 -5.5263 4.2267
I ACE two regressions on data defined by threshold andcompliance
I FACE = ACEp1.1−p1.0
I ACE∗ two regressions on data defined by thresholds butwith treatment as predictor
I FACE∗ = ACE∗
p1.1−p1.0
Some results
I Estimates of FACE are very unstable
I Need to come up with more robust estimates ofdenominator
Example — Comments
I The results are only indicative of the underlying causalmechanism, due to a series of factors
I Data need to be made more robust (include more practices &more precise information on crucial predictor, such as smokingstatus)
I Account properly for the two layers on “non compliance”I GPs prescribing below threshold (or not prescribing above)I Individual compliance (patients prescribed statins who do not
take them continuously)
I There seems to be an effect of treatment, especially insome age groups, but more analyses are required
I Careful stratification by sexI Control for more health conditions
Where to next?
I Clean up data more and apply to whole THIN dataset
I Find more stable/robust estimates of the denominator ofthe FACE
I Incorporate cost-effectiveness analysis
I Apply RD design to other drugs/screening
References
[1] A. P. Dawid. Causal inference using influence diagrams: The problem of partial compliance (with Discussion).In P.J. Green, N.L. Hjort, and S. Richardson, editors, Highly Structured Stochastic Systems, pages 45–81.Oxford University Press, 2003.
[2] MA Hernan and JM Robins. Instruments for causal inference - An epidemiologist’s dream? Epidemiology,17(4):360–372, JUL 2006.
[3] Guido W. Imbens and Thomas Lemieux. Regression discontinuity designs: A guide to practice. Journal ofEconometrics, 142(2):615 – 635, 2008. The regression discontinuity design: Theory and applications.
[4] NICE. Quick reference guide: Statins for the prevention of cardiovascular events, 2008.
[5] DL. Thistlethwaite and DT. Campbell. Regression-Discontinuity Analysis - An alternative to the ex-post-factoexperiment. Journal of Educational Psychology, 51(6):309–317, 1960.
[6] G. van der Klaauw. Regression-discontinuity analysis: A survey of recent developments in economics. Labour,22(2):219–245, 2008.
[7] S. Ward, L. Jones, A. Pandor, M. Holmes, R. Ara, A. Ryan, W. Yeo, and N. Payne. A systematic review andeconomic evaluation of statins for the prevention of coronary events. Health Technology Assessment, 11(14),2007.
Deriving the LATE
I Pretend we’re looking at a randomised trial with partialcompliance
I Introduce three variablesI Z the randomised treatment – not necessarily complied toI U the unobserved confoundersI CZ the preferred treatment under Z
Deriving the LATE
Z T Y
U
I If the DAG above describes the situation
I Then we can replace U with CZ
Deriving the LATE
Z T Y
CZ
I If the DAG above describes the situation
I Then we can replace U with CZ
Deriving the LATE
I The CZ ’s look a bit like counterfactuals
I But they aren’t as they represent preferences that you canelicit prior to any treatment being assigned
I So they are random variablesI We assume that T = CZ ,
I i.e. the treatment actually taken is the preferred treatment
I We also assume monotonicityI Individuals do not want to do the opposite of what they are
recommendedI p(C0 = 1, C1 = 0) = 0