Rerandomization,to,Improve, CovariateBalancein, Randomized...
Transcript of Rerandomization,to,Improve, CovariateBalancein, Randomized...
Rerandomization to Improve Covariate Balance in
Randomized Experiments Kari Lock
Harvard Statistics Advisor: Don Rubin
4/28/11
• Randomized experiments are the “gold standard” for estimating causal effects
• WHY?
1. They yield unbiased estimates 2. They eliminate confounding factors
The Gold Standard
R R R R
R R R R
R R R R
R R R R
R R R R
R R R R
R R R R
R R R R
R R R
R R R R R
R R R R R
R R R R R
R R R
R R R R R
R R R R R
R R R R R
Randomize
R R R R R R R R
12 Females, 8 Males 8 Females, 12 Males 5 Females, 15 Males 15 Females, 5 Males
• Suppose you get a “bad” randomization • What would you do??? • Can you rerandomize??? When? How?
Covariate Balance -‐ Gender
Rubin: What if, in a randomized experiment, the chosen randomized allocation exhibited substantial imbalance on a prognostically important baseline covariate?
Cochran: Why didn't you block on that variable?
Rubin: Well, there were many baseline covariates, and the correct blocking wasn't obvious; and I was lazy at that time. Cochran: This is a question that I once asked Fisher, and his reply was unequivocal:
Fisher (recreated via Cochran): Of course, if the experiment had not been started, I would rerandomize.
The Origin of the Idea
• The more covariates, the more likely at least one covariate will be imbalanced across treatment groups
• With just 10 independent covariates, the probability of a signiWicant difference (at level α = .05) for at least one covariate is 1 – (1-‐.05)10 = 40%!
• Covariate imbalance is not limited to rare “unlucky” randomizations
Covariate Imbalance
• We all know that randomized experiments are the “gold standard” for estimating causal effects
• WHY?
1. They yield unbiased estimates 2. They eliminate confounding factors
… on average! For any particular experiment, covariate imbalance is possible (and likely!), and conditional bias exists
The Gold Standard
Randomize subjects to treatment and control
Collect covariate data Specify criteria determining when a randomization is unacceptable; based
on covariate balance
(Re)randomize subjects to treatment and control
Check covariate balance
1)
2)
Conduct experiment
unacceptable acceptable
Analyze results (with a randomization test)
3)
4)
RERANDOMIZATION
Theorem: If the treatment groups are exchangeable for each randomization and for the rerandomization criteria, then rerandomization yields an unbiased estimate of the average treatment effect. • For exchangeability: • Equal sized treatment groups • Rerandomization criteria that is objective and does not favor a speciWic group
Unbiased
Unbiased
Options for unequal sized treatment groups:
• Rerandomize into multiple equally sized groups, then combine groups after the rerandomization • Discard the extra units to form equal sized groups • Rerandomize to lower MSE, with perhaps slight bias • Do not rerandomize
• Randomization Test: • Simulate randomizations to see what the statistic would look like just by random chance, if the null hypothesis were true
• Rerandomization Test: • A randomization test, but for each simulated randomization, follow the same rerandomization criteria used in the experiment
• As long as the simulated randomizations are done using the same randomization scheme used in the experiment, this will give accurate p-‐values
Rerandomization Test
• t-‐test: • Too conservative • SigniWicant results can be trusted
• Regression: • Regression including the covariates that were balanced on using rerandomization more accurately estimates the true precision of the estimated treatment effect • Assumptions are less dangerous after rerandomization because groups should be well balanced
Alternatives for Analysis
We use Mahalanobis Distance, M, to represent multivariate distance between group means:
2Under adequate sample sizes and pure randomization: ~: Number of covariates to be balanced
kMk
χ
Choose a and rerandomize when M > a
( ) ( ) ( )1
'cov T CT C T CM−
− −= −X X X X X X
Criteria for Acceptable Balance
( ) ( ) ( )11
1 1 'covT C T CT Cn n
−−⎛ ⎞
= + − −⎜ ⎟⎝ ⎠
X X X X X
Ma
MMa
Ma
Distribution of M
RERANDOMIZE
Acceptable Randomizations
pa = Probability of accepting a randomization
• Choosing the acceptance probability is a tradeoff between a desire for better balance and computational time
• The number of randomizations needed to get one successful one is Geometric(pa), so the expected number needed is 1/pa
• Computational time must be considered in advance, since many simulated acceptable randomizations are needed to conduct the randomization test
Choosing pa
• The obvious choice may be to set limits on the acceptable balance for each covariate individually • This destroys the joint distribution of the covariates
-3 -2 -1 0 1 2 3
-3-2
-10
12
3 D2
D1
Another Measure of Balance
• Since M follows a known distribution, easy to specify the proportion of accepted randomizations • M is afWinely invariant (unaffected by afWine transformations of the covariates) • Correlations between covariates are maintained • The balance improvement for each covariate is the same (and known)… • … and is the same for any linear combination of the covariates
Rerandomization Based on M
Theorem: If nT = nC, the covariate means are normally distributed, and rerandomization occurs when M > a, then ( )|T CE M a− ≤ =X X 0
( ) ( )cov cov| .T C T CaM a v− ≤ = −X X X X
( )( )
22
2
1,2 2 2
,2 2
ka
k
k aP a
vk ak P a
γ χ
χγ
+
⎛ ⎞+⎜ ⎟ ≤⎝ ⎠≡ × =⎛ ⎞ ≤⎜ ⎟⎝ ⎠
Covariates After Rerandomization
1
0where is the incomplete gamma function: ( , ) .
c b yb c y e dyγ γ − −≡ ∫
Percent Reduction in Variance
( )( )
, ,
, ,
rerandomizativar
var
onj T j C
j T j C
X X
X X
−
−
∣
( ) ( )( )
, , , ,
, ,
rerandomizatiovar var n100
varj T j C j T j C
j T j C
X X X X
X X
⎛ ⎞−⎜ ⎟⎜⎝
−
− ⎟⎠
− ∣
• For each covariate xj, the ratio of variances is:
and the percent reduction in variance is
av=
100(1 )av= −
Shadish Data
n = 445 undergraduate students
Randomized Experiment 235
Observational Study 210
Vocab Training 116
Math Training 119
Vocab Training 131
Math Training 79
randomization
randomization students choose
Shadish, M. R., Clark, M. H., Steiner, P. M. (2008). Can nonrandomized experiments yield accurate answers? A randomized experiment comparing random and nonrandom assignments. JASA. 103(484): 1334-‐1344.
Shadish Data
k = 10 covariates
Standardized Difference in Covariate Means-3 -2 -1 0 1 2 3
maleage
collgpaaactcomp
preflitlikelit
likemathnumbmath
mathprevocabpre
Shadish Data
To generate 1000 rerandomizations:
pa = 0.1: 11.6 seconds pa = 0.01: 1.8 minutes pa = 0.001: 18.3 minutes pa = 0.0001: 3.4 hours
Shadish Data
To generate 1000 rerandomizations:
pa = 0.1: 11.6 seconds pa = 0.01: 1.8 minutes pa = 0.001: 18.3 minutes pa = 0.0001: 3.4 hours
Shadish Data
( )210 0.001 1.48P a aχ ≤ ⇒ ==
Rerandomize when M > 1.48
Ratio of Variances:1.48
1.4
101, 1,2 22 2 2 2 0.12
1010, ,2 2
82 2
a
k a
k akv
γ γ
γ γ
⎛ ⎞ ⎛ ⎞+ +⎜ ⎟ ⎜ ⎟⎝ ⎠ ⎝ ⎠× = × =⎛ ⎞ ⎛ ⎞⎜ ⎟ ⎜ ⎟⎝ ⎠ ⎝ ⎠
=
k = 10 covariates pa = 0.001
Percent Reduction in Variance: 100(1 – va) = 88%
Shadish Data Vocab Pre-Test
XT-XC
-2 -1 0 1 2
Pure RandomizationRerandomization
Shadish Data
Standardized Difference in Covariate Means-4 -2 0 2 4
act+10gpa
male
age
gpa
act
preflit
likelit
likemath
numbmath
mathpre
vocabpre
Actual RandomizationPure RandomizationRerandomization PRIV
88%
88%
88%
88%
88%
88%
88%
88%
88%
88%
88%
Theorem: If nT = nC , the covariate and outcome means are normally distributed, the treatment effect is additive, and rerandomization occurs when M > a, then
( )|T CE Y aY M τ− ≤ =and the percent reduction in variance for the estimated treatment effect is
( ) 200 1 ,1 av R−where R2 is the squared canonical correlation.
Estimated Treatment Effect After Rerandomization
va
Outcome Variance Reduction
Shadish Data
n = 445 undergraduate students
Randomized Experiment 235
Observational Study 210
Vocab Training 116
Math Training 119
Vocab Training 131
Math Training 79
randomization
randomization students choose
Outcomes: score on a vocabulary test, score on a mathematics test
Shadish Data
10, .001 Covariate Percent Reduction in Variance = 88%
ak p=⇒
=
Vocabulary: R2 = 0.11 ⇒ Percent Reduction in Variance = 88(0.11) = 9.68%
Equivalent to increasing the sample size by
1/(1-‐0.28) = 1.39
2Percent Reduction in Varia ˆnce 88 for is T CY RYτ = −
Mathematics: R2 = 0.32 ⇒ Percent Reduction in Variance = 88(0.32) = 28.16%
Shadish Data
Vocabulary
Estimated Treatment Effect-3 -2 -1 0 1 2 3
Mathematics
Estimated Treatment Effect-2 -1 0 1 2
• Since M follows a known distribution, easy to specify the proportion of accepted randomizations • M is afWinely invariant (unaffected by afWine transformations of the covariates) • Correlations between covariates are maintained • The balance improvement for each covariate is the same… • … and is the same for any linear combination of the covariates
Rerandomization Based on M
Is that good???
• Since M follows a known distribution, easy to specify the proportion of accepted randomizations • M is afWinely invariant (unaffected by afWine transformations of the covariates) • Correlations between covariates are maintained • The balance improvement for each covariate is the same… • … and is the same for any linear combination of the covariates
• In practice, covariates are often of varying importance • We can place covariates into tiers, with the top tier including the most important covariates, the second tier less important, etc.
• Criteria for acceptable balance can be less stringent for each successive tier
Tiers of Covariates
1) Check balance for covariates in tier 1 ⇒ calculate M1 using the covariates in tier 1,
rerandomize if M1 > a1 2) For each successive tier, regress each covariate
on all covariates in previous tiers, and check balance for the residuals of that tier ⇒ calculate Mt using the residuals of tier t,
rerandomize if Mt > at at denotes the acceptance threshold for tier t
Tiers of Covariates
Theorem: If the nT = nC and the covariate means are normally distributed, and rerandomization occurs if any Mt > at, then the ratio of variances for covariate xj is
1 if in tier 1,av
1 2
2 21 1(1 ) if in tier 2,a av R Rv+ −
Tiers of Covariates
1
12 2 2 2
1 1 12
( ) (1 ), if in tier 2,i t
t
a a i i a ti
v R v R R v R t−
− −=
+ − + − >∑where Rt2 denotes the squared canonical correlation between xj and all covariates in tiers 1 through t.
Empirical Percent Reduction in Variance
0 20 40 60 80 100
actcomp*collgpaalikelit*collgpaalikelit*actcomp
likemath*collgpaalikemath*actcomp
likemath*likelitnumbmath*collgpaanumbmath*actcomp
numbmath*likelitnumbmath*likemath
mathpre*collgpaamathpre*actcomp
mathpre*likelitmathpre*likemath
mathpre*numbmathvocabpre*collgpaavocabpre*actcomp
vocabpre*likelitvocabpre*likemath
vocabpre*numbmathvocabpre*mathpre
mathpre2vocabpre2
TIER 4:aframcauc
marriedmaleage
daddegrmomdegr
parentsIncomepintellpemot
pconscpagreepextra
beckmars
majormicredit
hsgpaarpreflit
TIER 3:collgpaaactcomp
likelitlikemath
numbmathTIER 2:
mathprevocabpre
TIER 1: 94.8278.2440.7523.24
CovariatesResiduals
Standardized Difference in Covariate Means
-4 -2 0 2 4
actcomp*collgpaalikelit*collgpaalikelit*actcomp
likemath*collgpaalikemath*actcomp
likemath*likelitnumbmath*collgpaanumbmath*actcomp
numbmath*likelitnumbmath*likemath
mathpre*collgpaamathpre*actcomp
mathpre*likelitmathpre*likemath
mathpre*numbmathvocabpre*collgpaavocabpre*actcomp
vocabpre*likelitvocabpre*likemath
vocabpre*numbmathvocabpre*mathpre
mathpre2vocabpre2
TIER 4:aframcauc
marriedmaleage
daddegrmomdegr
parentsIncomepintellpemot
pconscpagreepextra
beckmars
majormicredit
hsgpaarpreflit
TIER 3:collgpaaactcomp
likelitlikemath
numbmathTIER 2:
mathprevocabpre
TIER 1:PRIV95%94%
79%80%79%85%79%67%52%44%45%46%38%47%43%43%41%53%31%41%44%43%42%39%51%55%
94%89%92%79%78%84%90%85%81%82%84%91%88%74%76%79%78%72%78%77%80%78%81%
Empirical vs Theoretical Values
0 20 40 60 80 100
020
4060
8010
0
Empirical PRIV
Theo
retic
al P
RIV
Tier 1Tier 2Tier 3Tier 4
• Rerandomization is easily extended to multiple treatment groups
• To measure balance, one of the MANOVA test statistics (such as Wilks’ Λ) can be used, measuring the ratio of within group variability to between group variability for multiple covariates
• This is equivalent to rerandomizing based on Mahalanobis distance if used on only two groups
Multiple Treatment Groups
• This is not meant to replace blocking, and blocking and rerandomization can be used together
• Blocking is great for balancing a few very important covariates, but is not easily used for many covariates
• “Block what you can, rerandomize what you can’t”
Blocking
• The theoretical results given depend on M ~ χk2
• If the covariate means are not normally distributed, that is Wine! The theoretical results given are used nowhere in analysis. Rerandomization does NOT depend on asymptotics.
• The threshold a corresponding to pa, and the percent reduction in variance for the covariates can be estimated via simulation
Small Samples or Non-‐Normality
• Rerandomization improves covariate balance between the treatment groups, giving the researcher more faith that an observed effect is really due to the treatment
• If the covariates are correlated with the outcome, rerandomization also increases precision in estimating the treatment effect, giving the researcher more power to detect a signiWicant result
Conclusion
Thank you!!!