Rerandomization,to,Improve, CovariateBalancein, Randomized...

Rerandomization to Improve Covariate Balance in

Randomized Experiments Kari Lock

Harvard Statistics Advisor: Don Rubin

4/28/11

•  Randomized experiments are the “gold standard” for estimating causal effects

•  WHY?

1.  They yield unbiased estimates 2.  They eliminate confounding factors

The Gold Standard

R R R R

R R R R

R R R R

R R R R

R R R R

R R R R

R R R R

R R R R

R R R

R R R R R

R R R R R

R R R R R

R R R

R R R R R

R R R R R

R R R R R

Randomize

R R R R R R R R

12 Females, 8 Males 8 Females, 12 Males 5 Females, 15 Males 15 Females, 5 Males

•  Suppose you get a “bad” randomization •  What would you do??? •  Can you rerandomize??? When? How?

Covariate Balance -‐ Gender

Rubin: What if, in a randomized experiment, the chosen randomized allocation exhibited substantial imbalance on a prognostically important baseline covariate?

Cochran: Why didn't you block on that variable?

Rubin: Well, there were many baseline covariates, and the correct blocking wasn't obvious; and I was lazy at that time. Cochran: This is a question that I once asked Fisher, and his reply was unequivocal:

Fisher (recreated via Cochran): Of course, if the experiment had not been started, I would rerandomize.

The Origin of the Idea

•  The more covariates, the more likely at least one covariate will be imbalanced across treatment groups

•  With just 10 independent covariates, the probability of a signiWicant difference (at level α = .05) for at least one covariate is 1 – (1-‐.05)10 = 40%!

•  Covariate imbalance is not limited to rare “unlucky” randomizations

Covariate Imbalance

•  We all know that randomized experiments are the “gold standard” for estimating causal effects

•  WHY?

1.  They yield unbiased estimates 2.  They eliminate confounding factors

… on average! For any particular experiment, covariate imbalance is possible (and likely!), and conditional bias exists

The Gold Standard

Randomize subjects to treatment and control

Collect covariate data Specify criteria determining when a randomization is unacceptable; based

on covariate balance

(Re)randomize subjects to treatment and control

Check covariate balance

1)

2)

Conduct experiment

unacceptable acceptable

Analyze results (with a randomization test)

3)

4)

RERANDOMIZATION

Theorem: If the treatment groups are exchangeable for each randomization and for the rerandomization criteria, then rerandomization yields an unbiased estimate of the average treatment effect. •  For exchangeability: •  Equal sized treatment groups •  Rerandomization criteria that is objective and does not favor a speciWic group

Unbiased

Unbiased

Options for unequal sized treatment groups:

•  Rerandomize into multiple equally sized groups, then combine groups after the rerandomization •  Discard the extra units to form equal sized groups •  Rerandomize to lower MSE, with perhaps slight bias •  Do not rerandomize

•  Randomization Test: •  Simulate randomizations to see what the statistic would look like just by random chance, if the null hypothesis were true

•  Rerandomization Test: •  A randomization test, but for each simulated randomization, follow the same rerandomization criteria used in the experiment

•  As long as the simulated randomizations are done using the same randomization scheme used in the experiment, this will give accurate p-‐values

Rerandomization Test

•  t-‐test: •  Too conservative •  SigniWicant results can be trusted

•  Regression: •  Regression including the covariates that were balanced on using rerandomization more accurately estimates the true precision of the estimated treatment effect •  Assumptions are less dangerous after rerandomization because groups should be well balanced

Alternatives for Analysis

We use Mahalanobis Distance, M, to represent multivariate distance between group means:

2Under adequate sample sizes and pure randomization: ~: Number of covariates to be balanced

kMk

χ

Choose a and rerandomize when M > a

( ) ( ) ( )1

'cov T CT C T CM−

− −= −X X X X X X

Criteria for Acceptable Balance

( ) ( ) ( )11

1 1 'covT C T CT Cn n

−−⎛ ⎞

= + − −⎜ ⎟⎝ ⎠

X X X X X

Ma

MMa

Ma

Distribution of M

RERANDOMIZE

Acceptable Randomizations

pa = Probability of accepting a randomization

•  Choosing the acceptance probability is a tradeoff between a desire for better balance and computational time

•  The number of randomizations needed to get one successful one is Geometric(pa), so the expected number needed is 1/pa

•  Computational time must be considered in advance, since many simulated acceptable randomizations are needed to conduct the randomization test

Choosing pa

•  The obvious choice may be to set limits on the acceptable balance for each covariate individually •  This destroys the joint distribution of the covariates

-3 -2 -1 0 1 2 3

-3-2

-10

12

3 D2

D1

Another Measure of Balance

•  Since M follows a known distribution, easy to specify the proportion of accepted randomizations •  M is afWinely invariant (unaffected by afWine transformations of the covariates) •  Correlations between covariates are maintained •  The balance improvement for each covariate is the same (and known)… •  … and is the same for any linear combination of the covariates

Rerandomization Based on M

Theorem: If nT = nC, the covariate means are normally distributed, and rerandomization occurs when M > a, then ( )|T CE M a− ≤ =X X 0

( ) ( )cov cov| .T C T CaM a v− ≤ = −X X X X

( )( )

22

2

1,2 2 2

,2 2

ka

k

k aP a

vk ak P a

γ χ

χγ

+

⎛ ⎞+⎜ ⎟ ≤⎝ ⎠≡ × =⎛ ⎞ ≤⎜ ⎟⎝ ⎠

Covariates After Rerandomization

1

0where is the incomplete gamma function: ( , ) .

c b yb c y e dyγ γ − −≡ ∫

Percent Reduction in Variance

( )( )

, ,

, ,

rerandomizativar

var

onj T j C

j T j C

X X

X X

−

−

∣

( ) ( )( )

, , , ,

, ,

rerandomizatiovar var n100

varj T j C j T j C

j T j C

X X X X

X X

⎛ ⎞−⎜ ⎟⎜⎝

−

− ⎟⎠

− ∣

•  For each covariate xj, the ratio of variances is:

and the percent reduction in variance is

av=

100(1 )av= −

Shadish Data

n = 445 undergraduate students

Randomized Experiment 235

Observational Study 210

Vocab Training 116

Math Training 119

Vocab Training 131

Math Training 79

randomization

randomization students choose

Shadish, M. R., Clark, M. H., Steiner, P. M. (2008). Can nonrandomized experiments yield accurate answers? A randomized experiment comparing random and nonrandom assignments. JASA. 103(484): 1334-‐1344.

Shadish Data

k = 10 covariates

Standardized Difference in Covariate Means-3 -2 -1 0 1 2 3

maleage

collgpaaactcomp

preflitlikelit

likemathnumbmath

mathprevocabpre

Shadish Data

To generate 1000 rerandomizations:

pa = 0.1: 11.6 seconds pa = 0.01: 1.8 minutes pa = 0.001: 18.3 minutes pa = 0.0001: 3.4 hours

Shadish Data

( )210 0.001 1.48P a aχ ≤ ⇒ ==

Rerandomize when M > 1.48

Ratio of Variances:1.48

1.4

101, 1,2 22 2 2 2 0.12

1010, ,2 2

82 2

a

k a

k akv

γ γ

γ γ

⎛ ⎞ ⎛ ⎞+ +⎜ ⎟ ⎜ ⎟⎝ ⎠ ⎝ ⎠× = × =⎛ ⎞ ⎛ ⎞⎜ ⎟ ⎜ ⎟⎝ ⎠ ⎝ ⎠

=

k = 10 covariates pa = 0.001

Percent Reduction in Variance: 100(1 – va) = 88%

Shadish Data Vocab Pre-Test

XT-XC

-2 -1 0 1 2

Pure RandomizationRerandomization

Shadish Data

Standardized Difference in Covariate Means-4 -2 0 2 4

act+10gpa

male

age

gpa

act

preflit

likelit

likemath

numbmath

mathpre

vocabpre

Actual RandomizationPure RandomizationRerandomization PRIV

88%

88%

88%

88%

88%

88%

88%

88%

88%

88%

88%

Theorem: If nT = nC , the covariate and outcome means are normally distributed, the treatment effect is additive, and rerandomization occurs when M > a, then

( )|T CE Y aY M τ− ≤ =and the percent reduction in variance for the estimated treatment effect is

( ) 200 1 ,1 av R−where R2 is the squared canonical correlation.

Estimated Treatment Effect After Rerandomization

va

Outcome Variance Reduction

Shadish Data

n = 445 undergraduate students

Randomized Experiment 235

Observational Study 210

Vocab Training 116

Math Training 119

Vocab Training 131

Math Training 79

randomization

randomization students choose

Outcomes: score on a vocabulary test, score on a mathematics test

Shadish Data

10, .001 Covariate Percent Reduction in Variance = 88%

ak p=⇒

=

Vocabulary: R2 = 0.11 ⇒ Percent Reduction in Variance = 88(0.11) = 9.68%

Equivalent to increasing the sample size by

1/(1-‐0.28) = 1.39

2Percent Reduction in Varia ˆnce 88 for is T CY RYτ = −

Mathematics: R2 = 0.32 ⇒ Percent Reduction in Variance = 88(0.32) = 28.16%

Shadish Data

Vocabulary

Estimated Treatment Effect-3 -2 -1 0 1 2 3

Mathematics

Estimated Treatment Effect-2 -1 0 1 2

•  Since M follows a known distribution, easy to specify the proportion of accepted randomizations •  M is afWinely invariant (unaffected by afWine transformations of the covariates) •  Correlations between covariates are maintained •  The balance improvement for each covariate is the same… •  … and is the same for any linear combination of the covariates

Rerandomization Based on M

Is that good???

•  Since M follows a known distribution, easy to specify the proportion of accepted randomizations •  M is afWinely invariant (unaffected by afWine transformations of the covariates) •  Correlations between covariates are maintained •  The balance improvement for each covariate is the same… •  … and is the same for any linear combination of the covariates

•  In practice, covariates are often of varying importance •  We can place covariates into tiers, with the top tier including the most important covariates, the second tier less important, etc.

•  Criteria for acceptable balance can be less stringent for each successive tier

Tiers of Covariates

1)   Check balance for covariates in tier 1 ⇒ calculate M1 using the covariates in tier 1,

rerandomize if M1 > a1 2) For each successive tier, regress each covariate

on all covariates in previous tiers, and check balance for the residuals of that tier ⇒ calculate Mt using the residuals of tier t,

rerandomize if Mt > at at denotes the acceptance threshold for tier t

Tiers of Covariates

Theorem: If the nT = nC and the covariate means are normally distributed, and rerandomization occurs if any Mt > at, then the ratio of variances for covariate xj is

1 if in tier 1,av

1 2

2 21 1(1 ) if in tier 2,a av R Rv+ −

Tiers of Covariates

1

12 2 2 2

1 1 12

( ) (1 ), if in tier 2,i t

t

a a i i a ti

v R v R R v R t−

− −=

+ − + − >∑where Rt2 denotes the squared canonical correlation between xj and all covariates in tiers 1 through t.

Empirical Percent Reduction in Variance

0 20 40 60 80 100

actcomp*collgpaalikelit*collgpaalikelit*actcomp

likemath*collgpaalikemath*actcomp

likemath*likelitnumbmath*collgpaanumbmath*actcomp

numbmath*likelitnumbmath*likemath

mathpre*collgpaamathpre*actcomp

mathpre*likelitmathpre*likemath

mathpre*numbmathvocabpre*collgpaavocabpre*actcomp

vocabpre*likelitvocabpre*likemath

vocabpre*numbmathvocabpre*mathpre

mathpre2vocabpre2

TIER 4:aframcauc

marriedmaleage

daddegrmomdegr

parentsIncomepintellpemot

pconscpagreepextra

beckmars

majormicredit

hsgpaarpreflit

TIER 3:collgpaaactcomp

likelitlikemath

numbmathTIER 2:

mathprevocabpre

TIER 1: 94.8278.2440.7523.24

CovariatesResiduals

Standardized Difference in Covariate Means

-4 -2 0 2 4

actcomp*collgpaalikelit*collgpaalikelit*actcomp

likemath*collgpaalikemath*actcomp

likemath*likelitnumbmath*collgpaanumbmath*actcomp

numbmath*likelitnumbmath*likemath

mathpre*collgpaamathpre*actcomp

mathpre*likelitmathpre*likemath

mathpre*numbmathvocabpre*collgpaavocabpre*actcomp

vocabpre*likelitvocabpre*likemath

vocabpre*numbmathvocabpre*mathpre

mathpre2vocabpre2

TIER 4:aframcauc

marriedmaleage

daddegrmomdegr

parentsIncomepintellpemot

pconscpagreepextra

beckmars

majormicredit

hsgpaarpreflit

TIER 3:collgpaaactcomp

likelitlikemath

numbmathTIER 2:

mathprevocabpre

TIER 1:PRIV95%94%

79%80%79%85%79%67%52%44%45%46%38%47%43%43%41%53%31%41%44%43%42%39%51%55%

94%89%92%79%78%84%90%85%81%82%84%91%88%74%76%79%78%72%78%77%80%78%81%

Empirical vs Theoretical Values

0 20 40 60 80 100

020

4060

8010

0

Empirical PRIV

Theo

retic

al P

RIV

Tier 1Tier 2Tier 3Tier 4

•  Rerandomization is easily extended to multiple treatment groups

•  To measure balance, one of the MANOVA test statistics (such as Wilks’ Λ) can be used, measuring the ratio of within group variability to between group variability for multiple covariates

•  This is equivalent to rerandomizing based on Mahalanobis distance if used on only two groups

Multiple Treatment Groups

•  This is not meant to replace blocking, and blocking and rerandomization can be used together

•  Blocking is great for balancing a few very important covariates, but is not easily used for many covariates

•  “Block what you can, rerandomize what you can’t”

Blocking

•  The theoretical results given depend on M ~ χk2

•  If the covariate means are not normally distributed, that is Wine! The theoretical results given are used nowhere in analysis. Rerandomization does NOT depend on asymptotics.

•  The threshold a corresponding to pa, and the percent reduction in variance for the covariates can be estimated via simulation

Small Samples or Non-‐Normality

•  Rerandomization improves covariate balance between the treatment groups, giving the researcher more faith that an observed effect is really due to the treatment

•  If the covariates are correlated with the outcome, rerandomization also increases precision in estimating the treatment effect, giving the researcher more power to detect a signiWicant result

Conclusion

Thank you!!!

Rerandomization,to,Improve, CovariateBalancein, Randomized...

Documents

Transcript of Rerandomization,to,Improve, CovariateBalancein, Randomized...