Compromised Randomization and Uncertainty of Treatment Assignments in Social Experiments: The Case of the Perry Preschool Program

James Heckman
Department of Economics
University of Chicago
[email protected]

Rodrigo R. A. Pinto
Department of Economics
University of Chicago
[email protected]

Azeem M. Shaikh
Department of Economics
University of Chicago
[email protected]

Adam Yavitz
Department of Economics
University of Chicago
[email protected] ¹

July 24, 2009

¹ James Heckman is Henry Schultz Distinguished Service Professor of Economics at the University of Chicago, Professor of Science and Society, University College Dublin, Alfred Cowles Distinguished Visiting Professor, Cowles Foundation, Yale University and Senior Fellow, American Bar Foundation. Azeem M. Shaikh is an assistant professor, Department of Economics, University of Chicago. Rodrigo Pinto and Adam Yavitz are graduate students at the University of Chicago. A version of this paper was presented at a seminar at the High/Scope Foundation, Ypsilanti, Michigan, December 2006; at a conference at the Minneapolis Federal Reserve in December 2007; at a National Poverty Center conference, Ann Arbor, Michigan, December 2007; at a conference sponsored by the Jacobs Foundation held at Castle Marbach, Germany, April 2008; at a Leibniz Network Conference on noncognitive skills in Mannheim, May 2008; and at an Institute for Research on Poverty conference, Madison, Wisconsin, June 2008. We thank participants at these conferences and seminars and Ricardo Barros and Peter Savelyev for helpful comments. We are grateful to Larry Schweinhart of the High Scope Foundation for his continual support. This research was supported by the Committee for Economic Development by a grant from the Pew Charitable Trusts and the Partnership for America’s Economic Success (PAES); the JB & MK Pritzker Family Foundation; Susan Thompson Buffett Foundation; NICHD (R01HD043411); and a grant from the American Bar Foundation. The views expressed in this presentation are those of the authors and not necessarily those of the funders listed here. Supplementary materials may be retrieved from http://jenni.uchicago.edu/perry_comp_rand/.


Page 1: Compromised Randomization and Uncertainty of Treatment Assignments in Social Experiments…jenni.uchicago.edu/perry_comp_rand/Social-Experiments_Azeem_2009-07-24.pdf · 2009/7/24



Abstract

Randomized controlled trials (RCT) are considered the gold standard for assessing the effectiveness of a treatment. However, social experiments that rely on RCT are often compromised, and the statistical consequences of these compromises are typically neglected. We develop a general inference framework that accounts for compromised RCT in social experiments.

Under the null hypothesis of no treatment effects, the outcomes of an RCT should be independent of treatment assignments. However, compromises of the random assignment can induce a spurious dependence between treatment status and outcomes even when the null hypothesis holds. Failing to account for such dependence can produce biased inferences, casting doubt on the validity of traditional inference. We decompose the compromising features of a randomization into two aspects: (a) the uncertainty about the randomization mechanism that generated the compromises and (b) the role played by unobserved variables of participants in the implemented randomization. We provide a general framework that corrects for both issues.

We apply our analysis to the Perry Preschool Program, an early education intervention targeted toward disadvantaged African Americans. Evidence from the Perry Program is considered a cornerstone of the case for early childhood interventions. Randomization in Perry was compromised by reassignment of treatment status according to a partially unknown randomization mechanism. The randomization compromises also depended on unobserved variables of participants. We develop a statistical method that corrects for the Perry compromises using the available information about the reassignment rule. We perform multiple hypothesis tests on outcomes covering a range of human activities. Outcomes remain statistically significant for both males and females after accounting for the compromised randomization.

KEYWORDS: Multiple Testing, Multiple Outcomes, Randomized Trial, Perry Preschool Program, Program Evaluation, Familywise Error Rate, Exact Inference.

JEL Codes: C12, C82, C93

James J. Heckman
Department of Economics
University of Chicago
1126 East 59th Street
Chicago, Illinois 60637
Telephone: (773) 702-0634
Fax: (773) [email protected]

Rodrigo Pinto
Department of Economics
University of Chicago
1126 East 59th Street
Chicago, Illinois 60637
Phone: (773) 702-3478
Fax: (773) [email protected]

Azeem Shaikh
Department of Economics
University of Chicago
1126 East 59th Street
Chicago, Illinois 60637
Phone: (773) 702-3621
Fax: (773) [email protected]

Adam Yavitz
Department of Economics
University of Chicago
1155 E 60th Street, Suite 227
Chicago, Illinois 60637
Phone: (773) 702-4686
Fax: (773) [email protected]


1 Introduction

Randomized controlled trials (RCT) are considered the gold standard for assessing the effectiveness of a treatment. Carefully executed social experiments allow researchers to identify mean treatment effects without bias.¹ However, compromises are often made in practical applications of the RCT method. These compromises change the statistical properties of RCT and cast doubt on the validity of traditional inference. This paper examines general sources of compromise in RCT experiments and proposes a framework to generate valid inferences.

Compromised randomization occurs when the actual assignment of treatment and control status among participants deviates from an ideal RCT randomization protocol. A common source of compromise is reassignment of treatment status, usually made to balance the distribution of background variables between treatment and control groups. Compromises can induce dependence between treatment assignments and some pre-program variables. If these pre-program variables affect outcomes, then treatment assignments may be correlated with outcomes through a common dependence on pre-program variables. Spurious correlation between treatment assignments and outcomes through pre-program variables (instead of through a true treatment effect) generates biased inferences when not controlled for.

It is useful to model a compromised randomization as a stepwise procedure in which researchers first perform an uncompromised randomization and then perform a series of actions that determine the compromises. If compromises occur as a result of some known rule and do not depend on unobserved variables, then the distribution of treatment assignments that arises from the compromised randomization can be reproduced, and valid statistical inference can be achieved using this distribution.² However, a full description of the compromises is seldom available in social experiments. As a result, there is often uncertainty regarding the true distribution of treatment assignments.³ As long as the compromises do not depend on unobserved variables, it is possible to surmount the uncertainty about the distribution of treatment assignments by using an exchangeability property discussed in Subsection 3.3. However, when a randomization protocol combines both uncertainty and a dependence on unobserved variables, novel statistical approaches are necessary. Our method addresses this last case.

We apply our analysis to the Perry Preschool Program, a preschool intervention targeted toward disadvantaged young African Americans in Ypsilanti, Michigan in the early 1960s. The Perry Program is the flagship for advocating the benefits of early childhood intervention. Although extensive, the literature on the Perry Program typically fails to account for its randomization compromises. An exception is Heckman, Moon, Pinto, Savelyev, and Yavitz (2008), which corrects for the compromises of the Perry randomization by exploiting a property of conditional exchangeability that remains valid under compromises. As a consequence of this property, they assume independence of outcomes and treatment assignments conditional on pre-program variables (a matching assumption). However, Heckman, Moon, Pinto, Savelyev, and Yavitz (2008) do not account for compromises based on unobserved characteristics of the participants of the program. In the present paper, we present a method for generating conservative test statistics for the null hypothesis of no treatment effect that are consistent with compromises based on unobserved characteristics of Perry participants. To this end, we correct for the uncertain nature of the assignment distribution in the spirit of the literature on partial identification (Manski, 2003). Unlike this literature, we are not interested in estimating or bounding treatment effects. Instead, we are interested in developing conservative but data-consistent tests of the hypothesis of no treatment effect in the case of partial information about the randomization procedure as implemented.

The paper proceeds as follows. Section 2 describes the experimental setting of the Perry Program and its randomization protocol. Section 3 discusses sources of compromises and presents our model framework. Section 4 presents a formal description of our testing procedure. Section 5 presents our empirical analysis. Section 6 concludes.

¹ See Heckman and Vytlacil (2007b) for one discussion of what randomized trials identify.

² The required conditions depend on the exact assignment rule. If the reassignment rule is a deterministic function of observed variables, and support conditions for matching are satisfied, a simple matching procedure produces valid inference. For more general reassignment rules, identification of the assignment rules is required. See Heckman and Vytlacil (2007a) for one discussion of identification of models when there is selection into treatment on the basis of unobserved variables.

³ This problem is not unique to the program analyzed in this paper. For example, in the Abecedarian program, randomization was also compromised. One hundred twenty-two children, born to 120 economically disadvantaged families, underwent preliminary processing for the Abecedarian study. After target children were born, qualifying families were matched on High-Risk Index, sex of the child, maternal IQ, and number of siblings and then assigned to receive high quality child care (treatment) or no treatment (control) on the basis of a table of random numbers (Breitmayer and Ramey, 1986). After this phase, some families refused participation (7 families from the treatment group and one from the control group), and two children from the control group were reassigned to the day-care condition at the request of local authorities and were dropped as subjects (Campbell and Ramey, 1994). In the SIME-DIME study, the randomization protocol was never clearly described (see Kurz and Spiegelman, 1972).


2 The Perry Experiment

2.1 The Perry Program

The High/Scope Perry Preschool Program was an early childhood education program conducted at the Perry Elementary School in Ypsilanti, Michigan, during the early 1960s. Beginning at age three and lasting two years, treatment consisted of a 2.5-hour preschool program on weekdays during the school year, supplemented by weekly home visits by teachers.

The study admitted a total of five cohorts during the period 1962–1965; two were admitted in the first year and one in each subsequent year. The first cohort is an exception, as its treated children received only one year of treatment, beginning at age four. High/Scope’s innovative curriculum, developed over the course of the Perry Program, was based on the Piagetian principle of active learning, guiding students through key learning experiences with open-ended questions (Schweinhart et al. 1993, pp. 34–36; Weikart et al. 1978, pp. 5–6, 21–23). Web Appendix A provides further information on the program activities.

Sample of Participants. The study is made up of 123 children from 104 families. Participants were randomized into treatment (58 participants) and control (65 participants) groups. The distribution of siblings among families consists of 82 singletons, 17 pairs, 1 triple, and 1 quadruple of siblings. No family has more than one child per entry cohort.

Family Background. Perry participants had an average of four older siblings; the average maternal age was 29 years at the time of enrollment. About half of the children were living with both biological parents. Mothers had completed 9.4 years of schooling on average, but none of the parents had completed more than 12 years of education.

Study Follow-Up. Follow-up interviews were conducted yearly from ages 3 to 15 and then at approximately ages 19, 27, and 40. Perry’s extensive database includes two thousand questions about parental care, family socio-economic background, family structure, marital status, health, cognitive and noncognitive tests, criminal behavior, drug use, employment, income, consumption of durable goods, welfare status, and life-course expectations. Appendix D presents tables that describe the control and treatment differences for a selection of Perry outcomes. For further discussion of the Perry experiment, see Heckman, Moon, Pinto, Savelyev, and Yavitz (2008).

2.2 Experimental Design

This subsection describes the Perry randomization protocol in detail. The description is useful for understanding the specifics of the compromises and for identifying where information is lacking at some steps of the protocol. The next section defines the statistical problems that follow from the Perry randomization and proposes a Perry-tailored inference procedure that complies with the implemented randomization protocol.

Eligibility Criteria. Children were drawn from the population surrounding the Perry Elementary School. Subjects were located through a survey of families associated with that school, as well as through neighborhood group referrals and door-to-door canvassing.

Disadvantaged children were identified via Stanford-Binet IQ score and socio-economic status (SES) index cutoffs. The SES index is a weighted linear combination of three components: paternal skill level, parental educational attainment, and number of rooms per person in the family home.⁴ Subjects with SES above a certain level (initially fixed at 11) were excluded. Children with biological mental defects and those with IQ scores outside the range 70–85 were excluded. The IQ and SES criteria were not always adhered to (see Figures 3–4).

Randomization Protocol. It is essential to understand the randomization protocol in order to perform an appropriate inference procedure. Details of the implemented randomization provide the theoretical basis to assess an exchangeability property of treatment assignments. Exchangeability is used to design an inference method that complies with the specifics of the randomization. Following Weikart et al. (1978, p. 16), there are 123 participants; the 51 females (25 treated and 26 control) and 72 males (33 treated and 39 control) are distributed among five cohorts. For each entry cohort, children were assigned to treatment and control groups under the following randomization protocol:

The Randomization Protocol:

Step 1: Younger siblings of previous Perry participants were set aside and assigned to the same treatment group as their elder siblings.

Step 2: Those remaining were ranked by IQ score measured at study entry, with ties broken randomly. Two groups were formed, with odd ranks in one group and even ranks in the other.

Step 3: Some individuals were swapped between the groups to “balance” gender and mean SES score (while keeping IQ roughly constant).

Step 4: The groups were assigned to treatment and control with equal probability.

Step 5: Some treated individuals whose mothers were working (and unable to make it to follow-up meetings) were reassigned to the control group.

Note that Step 5 depends on unobserved variables of a subset of participants. The rationale for excluding younger siblings from the randomization process was to avoid “spillovers” within a family, which could potentially weaken the observed treatment effect.⁵ The steps of the Randomization Protocol can be used to characterize the exogenous random variables and reassignment functions described in Section 3. The randomization procedure is illustrated graphically in Figure 1, and Appendix D gives a detailed description of all available information on the randomization protocol.

⁴ This index is discussed in detail in the legend to Figure 4 below.
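Steps 2 and 4 are the only parts of the protocol that are fully specified by a random mechanism, so they can be sketched in a few lines. The Python fragment below is a minimal illustration under stated assumptions, not the procedure used by the Perry staff: the child identifiers, the IQ scores and the function name `assign_cohort` are hypothetical, and Steps 1, 3 and 5 are omitted because the swaps and reassignments are only partially documented.

```python
import random


def assign_cohort(iq_scores, rng):
    """Sketch of Steps 2 and 4 for one entry cohort.

    iq_scores: dict mapping a (hypothetical) child id to an entry IQ score.
    Returns a dict mapping child id to 1 (treatment) or 0 (control).
    Steps 1, 3 and 5 are deliberately left out of this sketch.
    """
    # Step 2: rank by IQ; a random secondary key breaks ties at random.
    ranked = sorted(iq_scores, key=lambda child: (iq_scores[child], rng.random()))
    group_a = ranked[0::2]  # odd ranks (1st, 3rd, ...)
    group_b = ranked[1::2]  # even ranks (2nd, 4th, ...)
    # Step 4: a fair coin decides which unlabeled group is treated.
    treated = set(group_a if rng.random() < 0.5 else group_b)
    return {child: int(child in treated) for child in ranked}


rng = random.Random(0)
# Illustrative scores only; these are not Perry data.
cohort = {"c1": 79, "c2": 83, "c3": 79, "c4": 72, "c5": 85, "c6": 76}
assignment = assign_cohort(cohort, rng)
print(assignment)
```

Because children adjacent in the IQ ranking always land in different groups, the two groups are closely matched on IQ by construction, which is why treatment labels are not exchangeable across participants with very different IQ ranks.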

3 Model Framework

The fundamental evaluation model describes an observed outcome $k$ for a person $i$, denoted $Y_{i,k}$. Let $Y_{i,k,1}$ and $Y_{i,k,0}$ be the potential treatment and control outcomes for person $i$. Let the random variable $D_i$ be the treatment assignment for person $i$, which takes the value 1 if treatment occurs and 0 otherwise. Thus the observed outcome $Y_{i,k}$ is defined by the following equation:

$$Y_{i,k} = D_i Y_{i,k,1} + (1 - D_i) Y_{i,k,0}.$$

The evaluation problem arises because either $Y_{i,k,1}$ or $Y_{i,k,0}$ is observed, but not both. Non-experimental analyses often encounter the problem of selection bias, which arises when person $i$ self-selects into treatment. Under self-selection, the observed difference between treated and control outcomes might not be a causal consequence of treatment itself, but rather the result of a sorting effect. Ideally, RCT experiments are designed to solve the self-selection problem by imposing independence between $(Y_{i,k,1}, Y_{i,k,0})$ and $D_i$. In standard notation, these experiments guarantee that $(Y_{i,k,1}, Y_{i,k,0}) \perp\!\!\!\perp D_i$, where “$\perp\!\!\!\perp$” denotes independence.
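The contrast between randomized assignment and self-selection can be illustrated with a small simulation of the observation rule above. This is only a sketch: the normal distribution of the control outcome, the constant effect of 2.0 and the sorting rule (people with a high control outcome choose treatment) are assumptions made for illustration and are not taken from the Perry data.

```python
import random
from statistics import mean

rng = random.Random(1)
n = 20000
effect = 2.0  # assumed constant treatment effect, for illustration only

# Hypothetical potential outcomes: Y0 varies across people, Y1 = Y0 + effect.
y0 = [rng.gauss(0.0, 1.0) for _ in range(n)]
y1 = [v + effect for v in y0]


def observed(d, y1, y0):
    # Y = D*Y1 + (1 - D)*Y0: only one potential outcome is ever seen.
    return [di * a + (1 - di) * b for di, a, b in zip(d, y1, y0)]


def diff_in_means(d, y):
    treated = [yi for di, yi in zip(d, y) if di == 1]
    control = [yi for di, yi in zip(d, y) if di == 0]
    return mean(treated) - mean(control)


# Randomized assignment: D is drawn independently of (Y1, Y0).
d_rct = [int(rng.random() < 0.5) for _ in range(n)]
# Self-selection (assumed rule): people with high Y0 sort into treatment.
d_self = [int(v > 0.0) for v in y0]

est_rct = diff_in_means(d_rct, observed(d_rct, y1, y0))
est_self = diff_in_means(d_self, observed(d_self, y1, y0))
print(round(est_rct, 2), round(est_self, 2))
```

Under the randomized rule the difference in means should land close to the assumed effect, while under the sorting rule it also picks up the gap in control outcomes between those who select in and those who do not.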

⁵ For example, through home visits or emulation of one child by another.

[Figure 1: Perry Randomization Protocol. Panels, applied to each entry cohort: Step 0: Set Aside Younger Siblings (subjects with elder siblings are assigned the same treatment status as those elder siblings); Step 1: Form Unlabeled Sets (children ranked by IQ, with ties broken randomly; even- and odd-ranked form two sets, G₁ and G₂); Step 2: Balance Unlabeled Sets (some swaps between unlabeled sets to balance means, e.g. gender, SES); Step 3: Assign Treatment (randomly assign treatment status to the unlabeled sets, with equal probability); Step 4: Post-Assignment Swaps (some post-randomization swaps based on maternal employment).]

The null hypothesis of no treatment effect for outcome $k$ is stated as equality in distribution of the counterfactuals $Y_{i,k,1}$ and $Y_{i,k,0}$ for all participants $i$. Notationally, we write $H_k : Y_{k,1} \overset{d}{=} Y_{k,0}$, where $Y_{k,1}$, $Y_{k,0}$, and $Y_k$ are the vectors of outcomes pooled across agents $i$. It is well known that if the null hypothesis is true and $(Y_{i,k,1}, Y_{i,k,0}) \perp\!\!\!\perp D_i$ holds for all $i$, then the outcome $Y_k$ is independent of the random vector of treatment status $D$. In other words, under these conditions hypothesis $H_k$ is equivalent to

$$H_k : Y_k \perp\!\!\!\perp D.$$

Compromises of the randomization protocol may preclude the use of $H_k : Y_k \perp\!\!\!\perp D$ as a test for equality in distribution between $Y_{k,1}$ and $Y_{k,0}$. Indeed, if compromises create a dependence between some background variables $Z$ and treatment status $D$, then the randomization might also induce dependence between outcomes $Y_k$ and treatment status $D$ through their common dependence on $Z$. This induced dependence might invalidate the hypothesis $H_k : Y_k \perp\!\!\!\perp D$ even when there are no treatment effects. Inference techniques can correct for this common dependence by accounting for the implemented randomization. Specifically, valid inference is obtained by using a distribution of treatment assignments $D$ that reproduces the relationship between $D$ and background variables $Z$ according to the implemented randomization. In order to generate valid inference, it is necessary to uncover the details of the compromised randomization. The next subsection discusses the sources of compromises and maps the case of Perry into a general framework.
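One way to see how the induced dependence arises is through the law of total covariance, written here for a single participant with scalar outcome $Y$ and binary assignment $D$. The derivation is a sketch under two assumptions made for illustration: second moments exist, and $Y$ and $D$ are independent conditional on $Z$ (as they would be under the null when the compromises depend only on $Z$).

```latex
\begin{align*}
\operatorname{Cov}(Y, D)
  &= \mathbb{E}\big[\operatorname{Cov}(Y, D \mid Z)\big]
   + \operatorname{Cov}\big(\mathbb{E}[Y \mid Z],\, \mathbb{E}[D \mid Z]\big) \\
  &= 0 + \operatorname{Cov}\big(\mathbb{E}[Y \mid Z],\, \Pr(D = 1 \mid Z)\big).
\end{align*}
```

The remaining term is generally nonzero whenever both the mean outcome and the probability of treatment vary with $Z$, so $Y$ and $D$ can be correlated even though there is no treatment effect.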

3.1 Sources of Compromises

As mentioned in the introduction, we decompose the compromising features of a randomization into two aspects: (a) the uncertainty about the randomization protocol generated by the compromises and (b) the role played by unobserved variables of participants in the implemented randomization. We discuss each of these aspects in turn. We build our analysis on a general model of randomization that provides a theoretical basis for examining the Perry compromises.

We describe the randomization protocol as a stepwise procedure that generates the distribution of treatment status $D$. Let $\varepsilon$ be a vector of exogenous random variables, pooled across agents $i$, that is used at some step of the randomization protocol. For example, $\varepsilon$ could be a collection of i.i.d. Bernoulli random variables that assign the treatment status of each participant. Typically, researchers also want to balance some background variables $Z$ across treatment and control groups through the randomization protocol. We therefore define the initial random variable of treatment assignments generated by an uncompromised randomization as $D \sim \delta([Z, \varepsilon])$, where $Z$ denotes the background variables used in the randomization protocol. The function $\delta$ encompasses all steps of the randomization protocol and generates the random variable $D$ based on the statistical properties of the exogenous variables $\varepsilon$.

Reassignments: According to our framework, reassignments of participants after an initial treatment assignment constitute a compromise. We represent general types of reassignments by a deterministic function that changes the treatment status of participants. Notationally, we still use $\delta$ to denote these reassignment compromises. That is, $D \sim \delta([Z, \varepsilon])$, where the function $\delta$ now accounts for both the initial randomization and the follow-up reassignment compromises. If the function $\delta$ is known, there is no uncertainty regarding the compromise, so the distribution of $D$ is fully characterized and can be reproduced. In this case, valid inference can be conducted using the known distribution of $D$.

Uncertainty: Full descriptions of compromises are seldom available in social experiments. Lack of information about the compromises leads to uncertainty regarding the true distribution of treatment assignments $D$. In other words, if $\delta$ is unknown, the distribution of $D$ cannot be fully characterized. Nevertheless, valid inference can be conducted by exploiting an exchangeability property that remains valid under uncertainty about $\delta$. Specifically, we prove (in Subsection 3.3) that, under general assumptions, a treatment assignment $D$ is exchangeable among participants who share identical values of the background variables $Z$. This exchangeability property implies that counterfactual outcomes $Y_{k,1}, Y_{k,0}$ are independent of treatment assignments $D$ conditional on $Z$, that is, $(Y_{k,1}, Y_{k,0}) \perp\!\!\!\perp D \mid Z$. In non-experimental data, this conditional independence is called the Matching Assumption (see Heckman (2006)). Researchers usually invoke the Matching Assumption to avoid the statistical modeling of selection bias in non-experimental data. In experimental data, if the Matching Assumption holds, it does so as a statistical property of the implemented RCT.
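Exchangeability within groups that share the same $Z$ suggests a permutation test that shuffles treatment labels only inside those groups. The Python sketch below illustrates that idea on toy data; it is not the testing procedure developed in Section 4. The function names, the difference-in-means statistic, the one-sided Monte Carlo p-value and the data are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict
from statistics import mean


def diff_in_means(d, y):
    treated = [yi for di, yi in zip(d, y) if di == 1]
    control = [yi for di, yi in zip(d, y) if di == 0]
    return mean(treated) - mean(control)


def permute_within_strata(d, z, rng):
    """Shuffle treatment labels only among participants with identical z."""
    strata = defaultdict(list)
    for idx, zi in enumerate(z):
        strata[zi].append(idx)
    new_d = list(d)
    for members in strata.values():
        labels = [d[i] for i in members]
        rng.shuffle(labels)
        for i, label in zip(members, labels):
            new_d[i] = label
    return new_d


def conditional_permutation_pvalue(d, y, z, draws=2000, seed=0):
    """One-sided Monte Carlo p-value for the difference in means."""
    rng = random.Random(seed)
    observed = diff_in_means(d, y)
    hits = sum(
        diff_in_means(permute_within_strata(d, z, rng), y) >= observed
        for _ in range(draws)
    )
    # Count the observed assignment itself so the p-value is never zero.
    return (1 + hits) / (1 + draws)


# Toy data with no treatment effect: y depends on z only, and treatment
# is concentrated in the "high" stratum (a stylized compromise).
z = ["low"] * 6 + ["high"] * 6
d = [1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
y = [1.0, 2.0, 0.0, 1.5, 0.5, 1.0, 10.0, 11.0, 9.5, 10.5, 10.0, 10.2]

p_conditional = conditional_permutation_pvalue(d, y, z)
# Passing a single stratum reproduces a naive, unrestricted permutation test.
p_naive = conditional_permutation_pvalue(d, y, ["all"] * len(d))
print(p_conditional, p_naive)
```

In this toy example the naive test tends to produce a small p-value because treated units are mostly in the high-outcome stratum, whereas the stratified test compares like with like and gives no indication of an effect. With continuous background variables exact matches on $Z$ are rare, so a practical version would need some coarsening of $Z$, which this sketch does not address.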

Unobserved Variables: The compromises that follow the initial randomization can also depend on unobserved variables of participants, denoted by $U$. In this case, we represent the vector of treatment status by $D \sim \delta([Z, \varepsilon, U])$. In general, the exchangeability property mentioned above does not hold when the randomization depends on unobserved variables of participants. Compromises through unobserved variables generate a problem that is statistically identical to selection bias in non-experimental data. Participants’ compliance is a typical case of compromise through unobserved variables. In the case of Perry, compromises through unobserved variables stem from reassigning participants whose mothers could not comply with the scheduled interviews.

3.2 The Case of Perry

Recall that the Perry protocol was designed to initially rank the IQs of the eldest siblings in each family (Steps 1 and 2 of the Randomization Protocol). Swaps of individuals took place to “balance” gender and mean SES score (Step 3); then treatment and control status were randomly assigned to the odd- and even-ranked groups of families within each wave (Step 4). Step 5 reassigned the treatment status of some participants whose mothers were employed at the onset of the survey.

Let the families of participants be indexed by $\mathcal{J} = \{1, \ldots, J\}$ and let $\mathcal{I}_j$ be the index set of the siblings of family $j \in \mathcal{J}$. In most cases, $|\mathcal{I}_j| = 1$, but, as described in Section 2, there are families with more than one sibling in the program. We denote by $Z_{i,j}$ the values of observed characteristics used in the randomization protocol for the $i$th child in the $j$th family:

$$Z_{i,j} = (W_{i,j}, \mathit{IQ}_{i,j}, \mathit{SES}_{i,j}, G_{i,j}, M_{i,j}),$$

where $W_{i,j}$ is the wave of the program in which the individual participated, $\mathit{IQ}_{i,j}$ is the Stanford-Binet IQ score of the individual, $\mathit{SES}_{i,j}$ is the measure of socio-economic status for the individual described in the preceding section, $G_{i,j}$ is the gender of the individual, and $M_{i,j}$ is an indicator variable for whether the individual’s mother was working or not. Also define $Z = (Z_{i,j} : i \in \mathcal{I}_j, j \in \mathcal{J})$. Treatment status was assigned to families, not individuals. Thus, let $D_j$ be an indicator variable for whether the $j$th family was selected into treatment, and define $D = (D_j : j \in \mathcal{J})$.

A diagram of Perry’s five-step randomization protocol is presented in Figure 2. Steps 1, 2, 3 and 5 use background variables $Z$. Steps 2 and 4 use exogenous random variables $\varepsilon$. The exogenous random variables in Step 2 are associated with breaking ties in the IQ ranking within each wave. The exogenous random variables in Step 4 are associated with treatment assignments for the groups that arise in Step 3. The last step of the randomization protocol states that a subset of treated individuals was reassigned to the control group. These reassignments (Step 5) were based on unobserved characteristics of participants. Notationally, the reassignments depend on participants’ unobserved variables $U = (U_j : j \in \mathcal{J})$. The compromises stem from uncertainty about the swaps (Step 3) and the reassignments (Step 5).

The final distribution of treatment assignments can be described by $D \sim \delta([Z, \varepsilon, U])$. However, it is useful to define an intermediate vector of treatment assignments $\tilde{D}$, associated with Steps 1 through 4 (Appendix B provides a formal description of $\tilde{D}$).

Remark 3.1. Characteristics of $\tilde{D}$ are: (a) the distribution of $\tilde{D}$ is partially unknown; (b) uncertainty concerning $\tilde{D}$ comes from lack of information on the actual swaps that took place in Step 3 of the Randomization Protocol; (c) the distribution of $\tilde{D}$ does not depend on unobserved variables $U$, which enter only in the last step; (d) we can write this vector of treatment assignments as $\tilde{D} \sim \phi([Z, \varepsilon])$.

The final distribution of treatment assignments $D$ can be decomposed in terms of the intermediate vector of treatment assignments $\tilde{D}$ (which encompasses Steps 1 through 4) and the subsequent reassignments of Step 5 (which are based on unobserved variables $U$ and background variables $Z$). We represent the final distribution of assignments $D$ by:

$$D \sim \delta([\tilde{D}, Z, U]); \qquad \delta : \{0, 1\}^{|\mathcal{J}|} \times \operatorname{supp}(Z, U) \to \{0, 1\}^{|\mathcal{J}|}. \tag{1}$$

There are two advantages of this decomposition. First, an exchangeability property holds for $\tilde{D}$, but not for D (see Subsection 3.3). Second, the last step focuses on a subset of participants (families whose mothers were working at the onset of the survey), which decreases the complexity of the function δ considerably.

Remark 3.2. Traditional resampling procedures that assume equal probability for each treatment

assignment regardless of the values of conditioning variables Z cannot generate the actual distri-

bution of treatment assignments given the rules followed in the Perry randomization.

Remark 3.3. As a consequence of Remark 3.2, a simple permutation of treatment labels that ignores the background variables Z (as in Lehmann and Romano (2005)) is not a valid basis for testing the null hypothesis


Figure 2: Framework for Compromises in Perry Randomization Protocol

Step 1. Action: younger siblings set aside. Associated random variable: none. Associated background variables: Family ID.
Step 2. Action: IQ ranking. Associated random variable: ε (exogenous random variable). Associated background variables: IQ, Wave.
Step 3. Action: swaps of participants. Associated random variable: none. Associated background variables: SES, Gender.
Step 4. Action: group treatment assignment. Associated random variable: ε (exogenous random variable). Associated background variables: none.
Step 5. Action: reassignment upon unobserved variables. Associated random variable: U (unobserved random variable). Associated background variables: Maternal Employment Status.

Arrows: steps 1 through 4 (top left), $\tilde{D} \sim \varphi([Z, \varepsilon])$; step 5 (top right), $D \sim \delta([\tilde{D}, Z, U])$; steps 1 through 5 (bottom), $D \sim \delta([Z, \varepsilon, U])$.

Notes: This diagram represents the steps of the Perry randomization protocol and its compromises. Perry’s five-step randomization protocol is represented by the columns of this diagram. The first line lists the five steps. The second line of the diagram lists the main act performed in each step. The third line presents the random variable that was assessed in each step. The fourth line lists the background variables used to perform the act associated with each step. Steps 1, 2, 3 and 5 use background variables Z. Steps 2 and 4 rely on exogenous random variables ε. The exogenous random variables of Step 2 are associated with the breaking of IQ ties during the IQ ranking within each wave. The exogenous random variables of Step 4 are associated with the treatment assignment for the groups that arise from Step 3. Only Step 5 is based on participants’ unobserved variables U. The bottom arrow recalls that the final distribution of treatment assignments D can be described by $D \sim \delta([Z, \varepsilon, U])$, where Z accounts for background variables, ε accounts for exogenous randomization variables and U accounts for participants’ unobserved variables. We decompose the distribution of D into two parts. Let $\tilde{D}$ be the treatment assignment distribution associated with steps 1 through 4 (top left arrow), denoted by $\tilde{D} \sim \varphi([Z, \varepsilon])$. Observe that $\tilde{D}$ does not depend on the unobserved variable U because unobserved variables are only assessed in the last step. The final distribution of treatment assignments D can be written in terms of the random variable $\tilde{D}$ and step 5 (top right arrow), notationally, $D \sim \delta([\tilde{D}, Z, U])$.


of no treatment effects.

Remark 3.4. Any permutation used for inference must be based upon an exchangeability property

that results from the true randomization protocol.

Conditional on the unobserved variable U, the available information on the reassignments allows us to map δ into a reassignment function σ of the observable variables Z, such that σ ∈ Σ, where Σ is known. The set Σ is restricted by the fact that at most two reassignments occurred per wave. Notationally, we define Σ as the class of reassignment functions of the type

$$\sigma : \{0,1\}^{|J|} \times \operatorname{supp}(Z) \to \{0,1\}^{|J|}$$

that adhere to the available information on reassignments (e.g., all reassignment functions that reassign at most two treated participants per wave among those whose mothers were working).

Remark 3.5. The number of possible associations between two vectors of treatment assignments is

finite for a fixed Z, thus set Σ is finite for a fixed Z.
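
The finiteness of Σ can be made concrete with a small enumeration. The sketch below is only an illustration under stated assumptions: it reads the example in the text literally (at most two treated participants per wave, among those with working mothers, are moved to control), and the function name `enumerate_reassignments` and its arguments are hypothetical, not part of the paper. It lists the images of one assignment vector under every such reassignment.

```python
from itertools import combinations, product


def enumerate_reassignments(d, wave, mother_working, max_per_wave=2):
    """Return every vector reachable from d by moving at most `max_per_wave`
    treated participants with working mothers to control within each wave.

    d, wave and mother_working are equal-length sequences (hypothetical inputs).
    """
    d = tuple(d)
    per_wave_choices = []
    for w in sorted(set(wave)):
        # Candidates: treated participants in wave w whose mother was working.
        eligible = [j for j in range(len(d))
                    if wave[j] == w and d[j] == 1 and mother_working[j]]
        choices = []
        for k in range(max_per_wave + 1):
            choices.extend(combinations(eligible, k))
        per_wave_choices.append(choices)

    results = set()
    # Combine one choice per wave; the product of finite sets is finite.
    for combo in product(*per_wave_choices):
        new_d = list(d)
        for moved in combo:
            for j in moved:
                new_d[j] = 0  # reassigned from treatment to control
        results.add(tuple(new_d))
    return results
```

For a fixed Z the loop visits finitely many combinations, which is the content of Remark 3.5 in this toy setting.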

Assumption 3.1. There is a finite partition of supp(U), that is, $\bigcup_{u \in I_U} A_u = \operatorname{supp}(U)$, where $I_U$ is the indexing set of the partition, such that if $U \in A_u$ then the function δ can be mapped into $\sigma_u \in \Sigma$.

Assumption 3.1 does not restrict our analysis; it is a consequence of the finiteness of Σ for a fixed Z. Under Assumption 3.1, we can write δ as a linear combination of elements of Σ, that is:

$$\delta([\tilde{D}, Z, U]) = \sum_{u \in I_U} 1[U \in A_u] \cdot \sigma_u(\tilde{D}, Z); \quad \sigma_u \in \Sigma. \tag{3}$$

The incompleteness of our model stems from three sources: (a) the distribution of $\tilde{D}$ is partially unknown, (b) there is not enough information to fully define the function δ, and (c) the variable U is unobserved. Uncertainty regarding the distribution of $\tilde{D}$ is addressed by using an exchangeability property which holds under compromises (described in Subsection 3.3). We account for the uncertainty about δ and U by computing critical values that produce valid inferences under uncertainty.

The measure of uncertainty in our model is related to the extent of the set Σ. Our empirical analysis explores several different specifications of Σ, and the strength of our results depends on


our assumptions about Σ. We show that uncertainty decreases as Σ is narrowed and even modest

assumptions about Σ lead to interesting inferences.

3.3 Exchangeability Property of Intermediate Treatment Assignments $\tilde{D}$

This section proves a useful exchangeability property of the random variable $\tilde{D}$: elements of $\tilde{D} \sim \varphi([Z, \varepsilon])$ are exchangeable among participants that have the same values of the background variables Z. The property holds even though the function ϕ is partially unknown due to compromises. The next section exploits this property for inference.

Without loss of generality, we represent the exogenous variables ε as a vector of identical variables, with one element associated with each Perry participant. In other words, ε not only has the same distribution across participants, but also takes the same realized value for each participant.

Let $g_Z$ be a bijection of J onto itself (a permutation) that exchanges indices j and j′ only if $Z_j = Z_{j'}$, j, j′ ∈ J. Let $G_Z$ be the collection of all such permutations $g_Z$; in other words, elements of $G_Z$ permute participants that share the same information in Z. The set $G_Z$ is a group of permutations acting on J; it is finite and closed under compositions and inverses.
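
A draw from $G_Z$ can be sketched as an independent shuffle inside each stratum of participants with identical Z. This is a minimal sketch under the assumption that Z takes hashable values; the helper names `sample_gz` and `apply_perm` are hypothetical and introduced only for illustration.

```python
import random
from collections import defaultdict


def sample_gz(z, rng):
    """Draw a permutation of indices that sends j to j' only when z[j] == z[j'].

    Shuffling each stratum independently and uniformly gives a uniform draw
    over all such permutations.
    """
    strata = defaultdict(list)
    for j, zj in enumerate(z):
        strata[zj].append(j)
    perm = list(range(len(z)))
    for members in strata.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        for src, dst in zip(members, shuffled):
            perm[src] = dst
    return perm


def apply_perm(perm, d):
    """Return the permuted vector g d, where (g d)[perm[j]] = d[j]."""
    out = [None] * len(d)
    for j, target in enumerate(perm):
        out[target] = d[j]
    return out
```

Because indices never leave their stratum, the number of treated participants within each value of Z is unchanged by any such permutation.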

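Assumption 3.2. Assume $g_Z[Z, \varepsilon] \overset{d}{=} [Z, \varepsilon]$ for all $g_Z \in G_Z$.

Assumption 3.3. For any permutation g of the indices 1, . . . , |J|, assume $g\varphi([Z, \varepsilon]) \mid \varepsilon = \varphi(g[Z, \varepsilon]) \mid \varepsilon$.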

The first assumption is innocuous given that $g_Z Z = Z$ and $g_Z \varepsilon \overset{d}{=} \varepsilon$ by construction. The second assumption states that the permutation mechanism is equivariant, that is, conditional on a draw of ε, any swap of participants is associated with the symmetric swap of the final output of ϕ. In other words, Assumption 3.3 states only that if the arguments of the function ϕ are swapped, its output is swapped symmetrically.

Remark 3.6. Observe that ϕ does not rely on the unobservable variables U, so agents with the same Z are indistinguishable in terms of the swapping mechanism.

Theorem 3.1. The Exchangeability Property: under Assumptions 3.2 and 3.3,

$$g_Z \tilde{D} \overset{d}{=} \tilde{D} \quad \forall\, g_Z \in G_Z.$$


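Proof. For any $g_Z \in G_Z$,

$$
\begin{aligned}
g_Z \tilde{D} \mid \varepsilon &= g_Z \varphi([Z, \varepsilon]) \mid \varepsilon \\
&= \varphi(g_Z[Z, \varepsilon]) \mid \varepsilon && \text{by Assumption 3.3} \\
\therefore\; g_Z \tilde{D} &\overset{d}{=} \varphi(g_Z[Z, \varepsilon]) \\
&\overset{d}{=} \varphi([Z, \varepsilon]) && \text{by Assumption 3.2} \\
&\overset{d}{=} \tilde{D}.
\end{aligned}
$$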

Appendix B.1 provides a simple example that illustrates the intuition of the exchangeability

property described in this subsection.

Remark 3.7. As a consequence of Theorem 3.1, the distribution of treatment assignments D is invariant to permutations $g_Z$ of the arguments of the function δ, that is,

$$D = \delta([\tilde{D}, Z, U]) \overset{d}{=} \delta(g_Z[\tilde{D}, Z, U]); \quad g_Z \in G_Z.$$

Remark 3.7 has a direct application to testing procedures. If the function δ were known, we could obtain $\tilde{D}$ by reversing the reassignments of the actual treatment status D. The conditional distribution of $\tilde{D}$ is uniform across elements of $G_Z$. Thus, we could generate variability in $\tilde{D}$ by applying permutations $g_Z \tilde{D}$. The conditional distribution of D could then be obtained by applying the assignment rule δ to the treatment vectors generated by $g_Z \tilde{D}$, and this distribution of D could be used in hypothesis testing. Unfortunately, the reassignment rule is only partially known, and we account for the uncertainty about δ by using a conservative approach. Section 4 provides a formal description of our testing procedure, which is based on the exchangeability property of Theorem 3.1.

4 Setup and Notation

Let the vector of outcome k be denoted by Yk = (Yi,j,k : i ∈ Ij , j ∈ J), where outcomes are indexed

by K = {1, . . . ,K}. As mentioned, families are indexed by J and for each j ∈ J, Ij denotes the set


of siblings in the jth family. Denote the observed data by

$$X_K = (Y_{i,j,k}, Z_{i,j}, D_j : i \in I_j,\ j \in J,\ k \in K)$$

and let P be the distribution of the observed and unobserved data, $(X_K, U) \sim P$. Let $P_k$ be defined by the rule:

$$P \in P_k \iff Y_k \perp\!\!\!\perp D \mid Z, U.$$

The null hypotheses of interest are the |K| null hypotheses

Hk : P ∈ Pk for each k ∈ K (4)

indexed by k ∈ K. The alternative hypothesis is the unrestricted version of hypothesis (4). Let

K0(P ) denote the set of true null hypotheses, i.e.,

k ∈ K0(P )⇐⇒ P ∈ Pk,

that is, k ∈ K0(P ) if and only if Hk is true. Our goal is to test the family of null hypotheses (4) in

a way that controls the Familywise Error Rate (FWER) at level α, that is,

FWERP = P{reject any Hk with k ∈ K0(P )} ≤ α . (5)

Remark 4.1. Since randomization is imperfect in our setting, testing even a single null hypothesis

in a way that controls the probability of a Type-I error at level α will be nontrivial.

Remark 4.2. Controlling the familywise error rate guards against making “too many” false rejections. Another way to control Type I errors in a multiple testing setup is to consider error rates based on the proportion of false rejections. See, e.g., Romano and Shaikh (2004).


4.1 Testing a Single Null Hypothesis

For L ⊆ K, consider the problem of testing

$$H_L : P \in \bigcap_{k \in L} P_k, \text{ that is, } H_k \text{ is true for all } k \in L \tag{6}$$

in a way that controls the probability of a false rejection at level α, that is,

$$P\{\text{reject } H_L\} \le \alpha \text{ whenever } H_L \text{ is true}. \tag{7}$$

Denote the data associated with hypothesis $H_L$ as $X_L$ and consider any test statistic $T_L = T_L(X_L)$ for which large values of $T_L$ provide evidence against the null hypothesis. Note that we have assumed that the test statistic depends only on $X_L$ rather than on $X_K$, where, again, L ⊆ K.⁶

To test hypothesis (6) we use permutation methods tailored to the available information on the implemented randomization. The distribution of treatment assignments D, described in Equation (1), consists of a reassignment transformation of the random variable $\tilde{D}$. Moreover, $\tilde{D}$ is exchangeable according to Theorem 3.1. As a consequence of exchangeability, Remark 3.7 states that $\delta([\tilde{D}, Z, U]) \overset{d}{=} \delta(g_Z[\tilde{D}, Z, U])$ for all $g_Z \in G_Z$. If the true reassignment rule δ were known, we could obtain $\tilde{D}$ by reversing the reassignments of the actual treatment status D. Let d be the actual realization of $\tilde{D}$. The conditional distribution of $\tilde{D}$ is uniform across elements of $G_Z$. In particular, random draws of $g_Z d$ serve as the distribution of $\tilde{D}$ conditional on $\tilde{D} \in \{g_Z d : g_Z \in G_Z\}$ (see Theorem 15.2.2 in Lehmann and Romano (2005)). Consequently, we could obtain the conditional distribution of D by applying the assignment rule δ. Unfortunately, the reassignment rule is partially unknown. We account for the uncertainty about δ by using a conservative approach: we do inference using the σ ∈ Σ that generates the highest critical value. In this way, we bound the critical value that would arise if δ were known and control the probability of a false rejection.

For each σ ∈ Σ, let $\sigma^{-1}(D, Z)$ be the set of possible vectors d of treatment assignments that generate D under the reassignment function σ, that is, the vectors d with σ(d, Z) = D.

Let the true d associated with δ and D be denoted by $d^*$, that is, $D = \delta(d^*, Z, U)$.

Let $g^m_Z$, m = 2, . . . , M, be an i.i.d. sequence of permutations in $G_Z$. Define the test statistic $T^m_L(\sigma, d)$ by:

$$T^m_L(\sigma, d) = T_L(X^m_L(\sigma, d)),$$

$$X^m_L(\sigma, d) = (Y_{i,j,k},\ \sigma(g^m_Z d, Z_{i,j}),\ Z_{i,j},\ D_j : i \in I_j,\ j \in J,\ k \in L)$$

for σ ∈ Σ, $d \in \sigma^{-1}(D, Z)$.

⁶ This assumes no cross-variable restrictions between the outcomes Y in K \ L and the outcomes in L.

For notational convenience, let $T^1_L$ denote the test statistic based on the actual data $X_L$, that is, $T_L$.

Theorem 4.1. The statistics $T^m_L(\delta, d^*)$, m = 1, . . . , M, are (conditionally) exchangeable.

Proof. The proof is straightforward and is omitted.

Define $T^{(m)}_L(\sigma, d)$, m = 1, . . . , M, as the ordered values of the test statistics $T^m_L(\sigma, d)$, so that

$$T^{(1)}_L(\sigma, d) \le \cdots \le T^{(M)}_L(\sigma, d).$$

Let $c_L(\sigma, d, \alpha)$ be the (1 − α) quantile of $T^m_L(\sigma, d)$, m = 1, . . . , M. Let $c_L(\Sigma, \alpha)$ be the maximum value of $c_L(\sigma, d, \alpha)$ over $d \in \sigma^{-1}(D, Z)$ and σ ∈ Σ. Notationally,

$$c_L(\Sigma, \alpha) = \max\{c_L(\sigma, d, \alpha) : d \in \sigma^{-1}(D, Z),\ \sigma \in \Sigma\}, \tag{8}$$

where

$$c_L(\sigma, d, \alpha) = T^{(\lceil (1-\alpha)M \rceil)}_L(\sigma, d),$$

and ⌈x⌉ denotes the smallest integer greater than or equal to x.

Remark 4.3. The sets $\sigma^{-1}(D, Z)$ are finite for every σ ∈ Σ, and Σ is finite for a fixed Z (Remark 3.5). The maximum in (8) is therefore taken over a finite set, so $c_L(\Sigma, \alpha)$ always exists.
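
The definition in (8) can be sketched in a few lines. This is only an illustration, assuming that the M statistics for each candidate pair (σ, d) have already been computed and stored as a list; the function names and the small tolerance guarding against floating-point error in (1 − α)M are choices made here, not part of the paper.

```python
import math


def critical_value(stats, alpha):
    """Return the ceil((1 - alpha) * M)-th smallest of the M statistics."""
    m = len(stats)
    # Tolerance (an assumption of this sketch) so that, e.g., 0.95 * 20
    # is treated as exactly 19 despite floating-point rounding.
    k = math.ceil((1 - alpha) * m - 1e-12)
    return sorted(stats)[k - 1]


def worst_case_critical_value(stats_by_candidate, alpha):
    """Maximum of the critical values over all candidate (sigma, d) pairs."""
    return max(critical_value(stats, alpha) for stats in stats_by_candidate)
```

Taking the maximum over candidates is what makes the resulting test conservative: any single candidate, including the unknown true one, yields a critical value no larger than the one returned.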

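Theorem 4.2. Under Assumption 3.1, $c_L(\Sigma, \alpha) \ge c_L(\delta, d^*, \alpha)$.

Proof. Under Assumption 3.1, Equation (3) states that for each value of U there exists a function σ ∈ Σ such that δ = σ. Moreover, $\{d : d \in \delta^{-1}(D, Z, U)\} \subset \{d : d \in \sigma^{-1}(D, Z),\ \sigma \in \Sigma\}$. Hence the pair $(\delta, d^*)$ is among the pairs (σ, d) over which the maximum in (8) is taken, which gives the inequality.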


Remark 4.4. The statistic $c_L(\Sigma, \alpha)$ relies only on observed variables; thus it can be computed. The statistic $c_L(\delta, d^*, \alpha)$ cannot be computed because neither δ nor $d^*$ is observed. Appendix C describes the method for computing $c_L(\Sigma, \alpha)$ in detail.

In this notation, we may now state the following result:

Theorem 4.3. Suppose data XK with distribution as described in Section 4 and σ satisfying σ ∈ Σ

are available. Let M ≥ 2 and 0 < α < 1 be given. Then, the test that rejects HL whenever

TL > cL(Σ, α) ,

satisfies (7).

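Proof. Suppose $H_L$ is true. Then, $T^1_L(\delta, d^*), \ldots, T^M_L(\delta, d^*)$ are (unconditionally) exchangeable. Hence,

$$
\begin{aligned}
P\{T_L > c_L(\delta, d^*, \alpha)\}
&= E\left[1\left\{T_L > T^{(\lceil (1-\alpha)M \rceil)}_L(\delta, d^*)\right\}\right] \\
&= E\left[E\left[\frac{1}{M}\sum_{m=1}^{M} 1\left\{T^{(m)}_L(\delta, d^*) > T^{(\lceil (1-\alpha)M \rceil)}_L(\delta, d^*)\right\} \,\Big|\, \{T^m_L(\delta, d^*)\}_{m=1}^{M}\right]\right] \\
&\le E\left[E\left[\frac{M - \lceil (1-\alpha)M \rceil}{M} \,\Big|\, \{T^m_L(\delta, d^*)\}_{m=1}^{M}\right]\right] \\
&= \frac{1}{M}\left(M - \lceil (1-\alpha)M \rceil\right) \\
&\le \alpha.
\end{aligned}
$$

The second equality uses exchangeability, and the first inequality holds because at most $M - \lceil (1-\alpha)M \rceil$ of the ordered statistics can be strictly larger than the $\lceil (1-\alpha)M \rceil$-th one. By Theorem 4.2, $c_L(\Sigma, \alpha) \ge c_L(\delta, d^*, \alpha)$. Thus, $P\{T_L > c_L(\Sigma, \alpha)\} \le P\{T_L > c_L(\delta, d^*, \alpha)\}$, which completes the proof.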
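
As a quick numerical check of the last bound in the proof, consider two illustrative choices of M with α = 0.05 (these values are assumptions made here for illustration and are not taken from the paper):

```latex
\begin{aligned}
M = 1000:&\quad \lceil 0.95 \cdot 1000 \rceil = 950,
  &\frac{1000 - 950}{1000} &= 0.05 \le \alpha, \\
M = 99:&\quad \lceil 0.95 \cdot 99 \rceil = \lceil 94.05 \rceil = 95,
  &\frac{99 - 95}{99} &= \frac{4}{99} \approx 0.0404 \le \alpha.
\end{aligned}
```

The bound is attained when (1 − α)M is an integer and is strict otherwise.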

4.2 Testing Multiple Null Hypotheses

We now return to the problem of testing the family of null hypotheses (4) in a way that satisfies

(5). We use a stepdown multiple testing procedure. The terminology reflects the fact that our

procedure begins with the most significant null hypothesis and then “steps down” to less significant

null hypotheses. The argument for the validity of our procedure follows the arguments given in


Romano and Wolf (2005), who provide general results on the use of stepdown multiple testing

procedures for control of the FWER.

In order to describe the procedure based on critical values for each k ∈ K, let Tk = Tk(Xk)

be any test statistic for testing Hk. We use lowercase subscripts to denote single outcomes and

capital subscripts to denote sets of outcomes. Data Xk are the set of values associated with the

single outcome Yk. As mentioned, large values of Tk provide evidence against the null hypothesis.

For any L ⊆ K, define

$$T_L(X_L) = \max_{k \in L} T_k(X_k) \tag{9}$$

and let cL(Σ, α), defined by (8), be the corresponding critical value for testing HL using TL as

described in Section 4.1. Using this notation, we now describe the respective testing procedure by

the following algorithm:

Algorithm 1.

Step 1: Let $L_1 = K$. If

$$T_{L_1} \le c_{L_1}(\Sigma, \alpha),$$

then stop. Otherwise, reject every $H_k$ such that $T_k > c_{L_1}(\Sigma, \alpha)$, and let $L_2$ be the set of indices in $L_1$ whose test statistics do not exceed the critical value associated with $L_1$. Formally, set

$$L_2 = \{k \in L_1 : T_k \le c_{L_1}(\Sigma, \alpha)\}$$

and go on to Step 2.

...

Step n: If $T_{L_n} \le c_{L_n}(\Sigma, \alpha)$, then stop. Otherwise, reject every $H_k$ such that $T_k > c_{L_n}(\Sigma, \alpha)$, set $L_{n+1} = \{k \in L_n : T_k \le c_{L_n}(\Sigma, \alpha)\}$ and go on to Step n + 1.

...
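
The steps of Algorithm 1 can be sketched as a loop. This is a minimal sketch, assuming the individual statistics $T_k$ are stored in a dictionary and that a user-supplied function returns the critical value $c_L(\Sigma, \alpha)$ for any subset L; both names (`test_stats`, `critical_value_for`) are hypothetical.

```python
def stepdown(test_stats, critical_value_for):
    """Stepdown procedure in the spirit of Algorithm 1.

    test_stats: dict mapping hypothesis index k to its statistic T_k.
    critical_value_for: callable taking a frozenset L of indices and
        returning the critical value for the joint hypothesis H_L.
    Returns the set of rejected hypothesis indices.
    """
    remaining = set(test_stats)
    rejected = set()
    while remaining:
        c = critical_value_for(frozenset(remaining))
        # T_L is the maximum of T_k over k in L, as in equation (9).
        if max(test_stats[k] for k in remaining) <= c:
            break  # no further rejections: stop
        newly_rejected = {k for k in remaining if test_stats[k] > c}
        rejected |= newly_rejected
        remaining -= newly_rejected
    return rejected
```

For instance, with a toy critical-value rule that shrinks as hypotheses are removed, the procedure can reject in a later step a hypothesis that survived the first step.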

The following result shows that Algorithm 1 controls the FWER at level α.


Theorem 4.4. Suppose data $X_K$ with distribution P described in Section 4 and σ satisfying σ ∈ Σ are available. Let M > 0 and 0 < α < 1 be given. Then Algorithm 1 satisfies (5).

Proof. Assume without loss of generality that $K_0(P)$, the set of indices corresponding to true null hypotheses, is nonempty. Suppose Algorithm 1 leads to at least one false rejection. Let n be the smallest step at which a false rejection occurs. Then, it must be the case that

$$T_k > c_{L_n}(\Sigma, \alpha)$$

for some $k \in K_0(P)$. But, by the minimality of n, it must also be the case that

$$K_0(P) \subseteq L_n.$$

Since $T_L$ is a maximum over k ∈ L, for any σ ∈ Σ and $d \in \sigma^{-1}(D, Z)$,

$$c_{K_0(P)}(\sigma, d, \alpha) \le c_{L_n}(\sigma, d, \alpha),$$

and therefore

$$c_{K_0(P)}(\Sigma, \alpha) \le c_{L_n}(\Sigma, \alpha).$$

Hence,

$$T_{K_0(P)} \ge T_k > c_{L_n}(\Sigma, \alpha) \ge c_{K_0(P)}(\Sigma, \alpha).$$

It follows that

$$FWER_P \le P\{T_{K_0(P)} > c_{K_0(P)}(\Sigma, \alpha)\} \le \alpha,$$

where the final inequality follows from Theorem 4.3.


5 Inference from the Perry Experiment

6 Conclusion

References

Breitmayer, B. J. and C. T. Ramey (1986, October). Biological nonoptimality and quality of

postnatal environment as codeterminants of intellectual development. Child Development 57 (5),

1151–1165.

Campbell, F. A. and C. T. Ramey (1994, April). Effects of early intervention on intellectual and academic achievement: A follow-up study of children from low-income families. Child Development 65 (2), 684–698. Children and Poverty.

Heckman, J. J. (2006). The principles underlying evaluation estimators with an application to matching. Forthcoming, Annales d’Economie et de Statistiques.

Heckman, J. J., S. H. Moon, R. R. Pinto, P. A. Savelyev, and A. Q. Yavitz (2008). A reanalysis

of the High/Scope Perry Preschool Program. Unpublished manuscript, University of Chicago,

Department of Economics. First draft, September, 2006.

Heckman, J. J. and E. J. Vytlacil (2007a). Econometric evaluation of social programs, part I: Causal

models, structural models and econometric policy evaluation. In J. Heckman and E. Leamer

(Eds.), Handbook of Econometrics, Volume 6B, pp. 4779–4874. Amsterdam: Elsevier.

Heckman, J. J. and E. J. Vytlacil (2007b). Econometric evaluation of social programs, part II:

Using the marginal treatment effect to organize alternative economic estimators to evaluate

social programs and to forecast their effects in new environments. In J. Heckman and E. Leamer

(Eds.), Handbook of Econometrics, Volume 6B, pp. 4875–5144. Amsterdam: Elsevier.

Kurz, M. and R. G. Spiegelman (1972). The Design of the Seattle and Denver Income Maintenance

Experiments. Menlo Park, CA: Stanford Research Institute.

Lehmann, E. L. and J. P. Romano (2005). Testing Statistical Hypotheses (Third ed.). New York:

Springer Science and Business Media.


Manski, C. F. (2003). Partial Identification of Probability Distributions. New York: Springer-Verlag.

Romano, J. P. and A. M. Shaikh (2004). On control of the false discovery proportion. Technical

Report 2004-31, Department of Statistics, Stanford University.

Romano, J. P. and M. Wolf (2005, March). Exact and approximate stepdown methods for multiple

hypothesis testing. Journal of the American Statistical Association 100 (469), 94–108.

Schweinhart, L. J., H. V. Barnes, and D. Weikart (1993). Significant Benefits: The High-Scope

Perry Preschool Study Through Age 27. Ypsilanti, MI: High/Scope Press.

Schweinhart, L. J., J. Montie, Z. Xiang, W. S. Barnett, C. R. Belfield, and M. Nores (2005). Lifetime

Effects: The High/Scope Perry Preschool Study Through Age 40. Ypsilanti, MI: High/Scope

Press.

Schweinhart, L. J. and D. P. Weikart (1980). Young Children Grow Up: The Effects of the Perry

Preschool Program on Youths through Age 15. Ypsilanti, MI: High/Scope Press.

Weikart, D. P. (Ed.) (1967). Preschool Intervention: A Preliminary Report of the Perry Preschool

Project. Ann Arbor, MI: Campus Publishers.

Weikart, D. P., J. T. Bond, and J. T. McNeil (1978). The Ypsilanti Perry Preschool Project:

Preschool Years and Longitudinal Results Through Fourth Grade. Ypsilanti, MI: Monographs of

the High/Scope Educational Research Foundation.

Weikart, D. P., A. S. Epstein, L. Schweinhart, and J. T. Bond (1978). The Ypsilanti Preschool

Curriculum Demonstration Project: Preschool Years and Longitudinal Results. Ypsilanti, MI:

High/Scope Press.


Appendix


A Background on the Perry Preschool Curriculum

Preschool Overview During each wave of the experiment, the preschool class consisted of 20–25

children, whose ages ranged from 3 to 4. This is true even of the first and last waves, as the first

wave admitted 4-year-olds, who only received one year of treatment, and the last wave was taught

alongside a group of 3-year-olds, who are not included in our data. Classes were 2-1/2 hours every

weekday during the regular school year (mid-October through May).

The preschool teaching staff of four produced a child-teacher ratio ranging from 5 to 6.25 over

the course of the program. Teaching positions were filled by public-school teachers who were

“certified in elementary, early childhood, and special education,” (Schweinhart et al., 1993, p.32).

Home Visits Home visits lasting 1-1/2 hours were conducted weekly by the preschool teachers.

The purpose of these visits was to “involve the mother in the educational process,” and “implement

the curriculum in the home,” (Schweinhart et al., 1993, p.32). By way of encouraging the mothers’

participation, teachers also helped with any other problems arising in the home during the visit.

Occasionally, these visits would consist of field trips to stimulating environments such as a zoo.

Curriculum The Perry Preschool curriculum was based on the Piagetian concept of active learn-

ing, which is centered around play that is based on problem-solving and guided by open-ended

questions. Children are encouraged to plan, carry out, and then reflect on their own activities.

The topics in the curriculum are not based on specific facts or topics, but rather on key experi-

ences related to the development of planning, expression, and understanding. The key experiences

are then organized into ten topical categories, such as “creative representation”, “classification”

(recognizing similarities and differences), “number”, and “time.”⁷ These educational principles are reflected in the types of open-ended questions asked by teachers: for example, “What happened? How did you make that? Can you show me? Can you help another child?” (Schweinhart et al., 1993, p.33).

As the curriculum was developed over the course of the program, its details and application varied from year to year. While the first year involved “thoughtful experimentation” on the part of the teachers, experience with the program and a series of seminars during subsequent years led to the development and systematic application of teaching principles with “an essentially Piagetian theory-base.” During the later years of the program, all activities took place within a structured daily routine intended to help children “to develop a sense of responsibility and to enjoy opportunities for independence,” (Schweinhart et al., 1993, pp. 32–33).

⁷ For a full list, see Schweinhart et al. (1993).

B Computing Initial Distribution of Treatment Assignments D

As described in Section 4, $Y_{i,j,k}$ denotes the kth outcome of the ith sibling in the jth family. The set K = {1, . . . , K} indexes outcomes, J = {1, . . . , J} is the set of families in the experiment, and the outcome vector is $Y_k \equiv (Y_{i,j,k} : i \in I_j, j \in J)$. For each j ∈ J, $I_j$ denotes the set of siblings in the jth family. In most cases $|I_j| = 1$ but, as described in the preceding section, there are families with more than one sibling in the program.

Denote by Zi,j a vector of characteristics for the ith child in the jth family:

Zi,j = (Wi,j , IQi,j , SESi,j ,Gi,j ,Mi,j) ,

where Wi,j is the wave of the program in which the individual participated, IQi,j is the Stanford-

Binet IQ score of the individual, SESi,j is the measure of socio-economic status for the individual

described in the preceding section, Gi,j is the gender of the individual, and Mi,j is an indicator

variable for whether the individual’s mother was working or not. Also define Z = (Zi,j : i ∈

Ij , j ∈ J). Let Dj be an indicator variable for whether the jth family was selected into treatment,

moreover define D = (Dj , j ∈ J).

Controlling the FWER requires a formal description of the distribution of $D_j$, j ∈ J. Let $W_j \equiv W_{i^*,j}$ and $IQ_j \equiv IQ_{i^*,j}$, where

$$i^* \equiv \arg\min_{i \in I_j} W_{i,j}.$$

In other words, $W_j$ and $IQ_j$ are the values of $W_{i,j}$ and $IQ_{i,j}$ for the sibling in the earliest wave of the program for the jth family (the eldest sibling). Define $SES_j$, $G_j$, $M_j$ analogously.

The Perry protocol was designed to initially rank the IQs of the eldest siblings in each family

and then to randomly assign treatment and control status to odd and even numbered families


within each wave, respectively, as determined by a single toss of a coin. A practical problem that plagues the Perry randomization protocol is that in the first stage there is not necessarily a unique ordering of IQ because of ties. The IQ ties for eldest siblings within each wave are broken by assigning equal probability to each available ordering of the block of tied individuals. Specifically, define the partition of family indices into sets $J_a$, where

$$[j, j' \in J_a] \iff [(W_j, IQ_j) = (W_{j'}, IQ_{j'})],$$

so that each set is a cluster of eldest siblings who share the same value of IQ and belong to the same wave, and

$$J \equiv \bigcup_{a=1}^{A} J_a.$$

Without loss of generality, order the sets $J_a$ by the lexicographic rank of the participants’ values of wave and IQ, $(W_j, IQ_j)$. Construct a vector of the indices in J, that is, {1, 2, . . . , |J|}, that follows the order of the sets $J_1, \ldots, J_A$. Whenever the set $J_a$ has more than one participant (its cardinality is bigger than one), pick an ordering of its elements at random, each with probability $1/(|J_a|!)$. Let ϕ denote the random vector of indices $(j_1, \ldots, j_{|J|})$ constructed in this fashion. Therefore ϕ has dimension |J|, takes values in $J^{|J|}$, and no realization of ϕ repeats an index j ∈ J. Moreover, all realizations of ϕ are equally probable conditional on $(W_j, IQ_j)$. Formally, let

$$\varphi = \varphi(((W_j, IQ_j) : j \in J))$$

be the random variable, conditional on $((W_j, IQ_j) : j \in J)$, that takes as values, with equal probability, all orderings $(j_1, \ldots, j_{|J|})$ satisfying the lexicographic ordering in $\mathbb{R}^2$ of $((W_j, IQ_j))$:

$$(W_{j_1}, IQ_{j_1}) \le_L \cdots \le_L (W_{j_{|J|}}, IQ_{j_{|J|}}),$$

where $\le_L$ denotes the lexicographic order. Notationally, write $(j_1, \ldots, j_{|J|}) \sim \varphi$.

Define the vector of assignments resulting from this protocol by D, where D = (Dj : j ∈ J). It is distributed as follows: for each wave $1 \le w \le \bar{w}$, with probability 1/2, $D_{j_\ell} = 1$ for all odd values of ℓ and $D_{j_\ell} = 0$ for all even values of ℓ for which $W_{j_\ell} = w$; and, with probability 1/2, $D_{j_\ell} = 0$ for all odd values of ℓ and $D_{j_\ell} = 1$ for all even values of ℓ for which $W_{j_\ell} = w$.

The variable D comprises the items 1, 2 and 4 of Procedure ?? in Subsection 2.2.
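
The ranking, tie-breaking and coin-toss steps described above can be sketched as follows. This is a minimal sketch under stated assumptions: one record per family (the eldest sibling), ranks counted within each wave rather than along the whole ordered vector (the two conventions give the same distribution because each wave has its own coin toss), and a hypothetical function name `initial_assignment`.

```python
import random
from collections import defaultdict


def initial_assignment(wave, iq, rng):
    """Draw one initial assignment vector from wave and IQ of eldest siblings.

    wave, iq: equal-length sequences, one entry per family (hypothetical inputs).
    rng: a random.Random instance supplying tie-breaks and coin tosses.
    """
    n = len(wave)
    d = [None] * n
    by_wave = defaultdict(list)
    for j in range(n):
        by_wave[wave[j]].append(j)
    for w in sorted(by_wave):
        members = by_wave[w]
        # Shuffle, then stable-sort by IQ: every ordering of a block of
        # tied IQs is then equally likely.
        rng.shuffle(members)
        members.sort(key=lambda j: iq[j])
        coin = rng.randint(0, 1)  # a single coin toss per wave
        for rank, j in enumerate(members, start=1):
            odd = rank % 2 == 1
            # coin == 1: odd ranks treated; coin == 0: even ranks treated.
            d[j] = 1 if odd == (coin == 1) else 0
    return d
```

Repeated calls with fresh random draws trace out the support of the initial assignment distribution for a given configuration of waves and IQ scores.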

Remark B.1. The distribution of D|Z has ≈ $10^{10}$ points of support.

Recall that by Remark ??, we can interchange items 3 and 4 w.l.o.g. We represent the swaps of item 3 in Procedure ?? by a function δ0 that depends only on the observable variables Z:

$$\delta_0 : \{0,1\}^{|J|} \times \operatorname{supp}(Z) \to \{0,1\}^{|J|}.$$

Thus we define the random vector of dichotomous random variables $\tilde{D}$ that arises from items 1 through 4 of Procedure ?? by

$$\tilde{D} \sim \delta_0(D, Z),$$

where the argument D is the vector of assignments from items 1, 2 and 4 defined above.

Remark B.2. The function of swaps δ0 is only partially known. We represent the uncertainty about the function δ0 by stating that δ0 ∈ Σ0, where Σ0 is known. Note that for a fixed Z, Σ0 is finite.

Remark B.3. Even though the distribution of $\tilde{D}$ is unknown, inferences can be made by using the exchangeability property that arises from the setup of the randomization. Section 3 describes the exchangeability property of treatment assignments.

B.1 Example

An example clarifies these ideas. Suppose there are three participants (I1, I2, I3), and let the initial treatment statuses D = (Da, Db, Dc) be associated with these three participants, respectively. The elements of the vector D need not be exchangeable. Now suppose the reassignment function δ swaps the first and third elements of the vector [Da, Db, Dc]′, that is, δ([Da, Db, Dc]′) = [Dc, Db, Da]′. Exchangeability relies on equality of the information in the variables Z. Say that the first and second participants share the same value of the background variables Z, that is, Z1 = Z2. A valid permutation g permutes the first and the second participants, that is, gδ([Da, Db, Dc]′) = [Db, Dc, Da]′.

Our exchangeability property says that gδ(D) and δ(D) have the same distribution, $g\delta(D) \stackrel{d}{=} \delta(D)$. Indeed it is true. The rationale relies on the method of randomization. For the researcher who did the randomization, participants I1 and I2 are indistinguishable, as they share the same background information. Thus the initial order of participants could equally have been (I2, I1, I3). Under the first ordering of participants, the final treatment status would be:

[D1, D2, D3]′ = δ([Da, Db, Dc]′) = [Dc, Db, Da]′.

If the second ordering was used, the final treatment status would be:

[D2, D1, D3]′ = δ([Da, Db, Dc]′) = [Dc, Db, Da]′, that is, [D1, D2, D3]′ = [Db, Dc, Da]′.

As these two configurations are equally likely to occur, we conclude that the first two elements of the treatment status vector associated with the ordered participants (I1, I2, I3) are exchangeable. In other words, the participants who share the same content of Z remain exchangeable after being swapped.

The swap function might act only for some draws of treatment status; say it swaps the first and third elements only if the first element is equal to one. In this scenario, conditional on Da = 1, the two configurations of treatment status above are again equally likely, and the distribution of [D1, D2, D3] is unchanged when Da = 0. This does not change the essence of the exchangeability.

The idea described here is subtle. Suppose an analyst draws on a slightly different idea. Suppose he fixes the order of participants at (I1, I2, I3) up front, that is, D = [D1, D2, D3]′ = [Da, Db, Dc]′. Suppose he assumes that Da and Db are exchangeable, so that the vector [Da, Db, Dc]′ has the same distribution as the vector [Db, Da, Dc]′. Again, let g be a permutation that swaps the first and second elements, associated with the participants sharing the same content of Z, that is, I1 and I2. In this case, δ([Da, Db, Dc]′) = [Dc, Db, Da]′ and gδ(D) = [Db, Dc, Da]′. Observe that the first two elements of the vector δ(D) are not exchangeable.
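The argument of this example can be checked by exact enumeration. The sketch below is illustrative only: the joint distribution of (Da, Db, Dc) is a hypothetical, deliberately non-exchangeable one chosen by us, and the variable names are our own. It compares the case in which the order of I1 and I2 is random with the case in which the order is fixed up front.

```python
from fractions import Fraction
from itertools import product

# Hypothetical, deliberately non-exchangeable joint pmf of (Da, Db, Dc).
weights = {d: 1 + 4 * d[0] + 2 * d[1] + d[2] for d in product((0, 1), repeat=3)}
total = sum(weights.values())
pmf_initial = {d: Fraction(w, total) for d, w in weights.items()}


def delta(d):
    """Reassignment that swaps the first and third elements."""
    return (d[2], d[1], d[0])


def symmetric_in_first_two(pmf):
    """True if swapping the first two coordinates leaves the pmf unchanged."""
    return all(
        pmf.get((x1, x2, x3), 0) == pmf.get((x2, x1, x3), 0)
        for x1, x2, x3 in product((0, 1), repeat=3)
    )


pmf_random_order = {}  # pmf of (D1, D2, D3) when the order of I1, I2 is random
pmf_fixed_order = {}   # pmf of (D1, D2, D3) when the order is fixed at (I1, I2, I3)
for d, p in pmf_initial.items():
    s = delta(d)  # final status by slot
    first = (s[0], s[1], s[2])   # ordering (I1, I2, I3)
    second = (s[1], s[0], s[2])  # ordering (I2, I1, I3): I1 sits in slot 2
    for outcome in (first, second):
        pmf_random_order[outcome] = pmf_random_order.get(outcome, 0) + p / 2
    pmf_fixed_order[first] = pmf_fixed_order.get(first, 0) + p

exchangeable_random_order = symmetric_in_first_two(pmf_random_order)
exchangeable_fixed_order = symmetric_in_first_two(pmf_fixed_order)
```

Under these assumptions the first check holds and the second fails, which mirrors the distinction drawn in the text: exchangeability comes from the randomization over indistinguishable participants, not from any symmetry of the initial draw.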


C The Algorithm

This appendix describes the algorithm of the paper in detail. The appendix is divided into parts that mirror the implemented algorithm. Subsection C.1 gives a short description of the testing procedure based on a permutation approach; Subsection C.2 gives a short description of the data used in the randomization protocol; Subsection C.3 describes how to construct the set $\Sigma_2^{-1}(D, Z)$, which includes all treatment status vectors $d \in \sigma_2^{-1}(D, Z)$ for all $\sigma_2 \in \Sigma_2$; Subsection C.4 describes the set of permutations $g_Z \in G_Z$; finally, Subsection C.5 describes the maximization algorithm and the stepdown procedure.

C.1 A Permutation Approach

The second reassignment δ2 concerns only participants assigned to treatment whose mothers were working. The reassignment switches treatment status to control status for a few participants. The function δ2 is not fully known; however, δ2 ∈ Σ2 and Σ2 is known. Thus we maximize the critical value across all possible σ2 ∈ Σ2. We apply the permutations gZ to the set of possible treatment statuses that could have arisen prior to the swaps δ2, that is, $d \in \sigma_2^{-1}(D, Z)$. The variability of D is obtained from $\delta_2(g_Z \cdot d, Z)$. Observe that we bypass the problem of maximizing over δ1 ∈ Σ1 by using the exchangeability property of the randomization protocol.

Notationally, let $g_Z^m$, $m = 2, \ldots, M$, be an i.i.d. sequence of permutations in $G_Z$. Let $\sigma_2^{-1}(D, Z)$ be the inverse set associated with each function $\sigma_2 \in \Sigma_2$, that is, $\sigma_2^{-1}(D, Z) \equiv \{d : D = \sigma_2(d, Z)\}$, where $D$ is the actual treatment status. Let $\Sigma_2^{-1}(D, Z)$ be the union of the sets $\sigma_2^{-1}(D, Z)$ over $\sigma_2 \in \Sigma_2$.

For $m = 2, \ldots, M$, let $X^m(\sigma_2, d) = (Y, \sigma_2(g_Z^m d, Z), Z)$, and define $X^1(\sigma_2, d) = (Y, D, Z)$.

Let $T^m(\sigma_2, d)$ be the test statistic evaluated at these datasets.

Let $c(\sigma_2, d)$ be the $1 - \alpha$ quantile of the $T^m(\sigma_2, d)$, $m = 1, \ldots, M$.

Let $c$ be the maximum value of $c(\sigma_2, d)$ over $\sigma_2 \in \Sigma_2$ and $d \in \Sigma_2^{-1}(D, Z)$.

Since $\sigma_U \in \Sigma_2$ and $\delta_1(D, Z) \in \sigma_U^{-1}(D, Z)$, we have that $c \ge c(\sigma_U, \delta_1(D, Z))$.
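The steps above can be sketched as follows. This is a minimal illustration, not the authors' code: the difference in means is our stand-in for the test statistic, `candidates` is a hypothetical list of pairs (σ2, d) with d in the inverse set of σ2, each σ2 is modeled as a Python function of (d, z), and `groups` is a list of index lists within which treatment labels are assumed exchangeable.

```python
import math
import random


def diff_in_means(y, d):
    """Illustrative test statistic: treated mean minus control mean."""
    treated = [yi for yi, di in zip(y, d) if di == 1]
    control = [yi for yi, di in zip(y, d) if di == 0]
    if not treated or not control:
        return 0.0
    return sum(treated) / len(treated) - sum(control) / len(control)


def permute_within_groups(d, groups, rng):
    """Shuffle treatment labels independently inside each exchangeable group."""
    out = list(d)
    for members in groups:
        labels = [d[i] for i in members]
        rng.shuffle(labels)
        for i, label in zip(members, labels):
            out[i] = label
    return out


def critical_value(y, d_obs, z, sigma2, d_pre, groups, alpha, n_draws, rng):
    """(1 - alpha) quantile of the statistic over the observed and permuted data."""
    stats = [diff_in_means(y, d_obs)]  # m = 1: the observed data
    for _ in range(n_draws - 1):       # m = 2, ..., M: permuted data
        permuted = permute_within_groups(d_pre, groups, rng)
        stats.append(diff_in_means(y, sigma2(permuted, z)))
    stats.sort()
    k = math.ceil((1 - alpha) * len(stats))
    return stats[k - 1]


def max_critical_value(y, d_obs, z, candidates, groups, alpha, n_draws, seed=0):
    """Largest critical value over all candidate pairs (sigma2, d)."""
    rng = random.Random(seed)
    return max(
        critical_value(y, d_obs, z, sigma2, d_pre, groups, alpha, n_draws, rng)
        for sigma2, d_pre in candidates
    )
```

Taking the maximum is conservative by construction: whatever the true reassignment function is, its critical value cannot exceed the value returned, provided it is among the candidates.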

C.2 Basic Data

Data for implementing the testing procedure are:


• We use the actual treatment status D (dichotomous vector 123× 1);

• Identification Number Id (ordered vector numbers from 1 to 123);

• Family Identification Number F (vector 123×1). There are 101 distinct families and therefore

101 older siblings;

• Wave W (vector 123 × 1) takes values from 1 to 5. Values correspond to the cohort of each participant, from the first up to the fifth cohort;

• Maternal Employment Indicator MW (dichotomous vector 123 × 1) and Socio-Economic Status SES (vector 123 × 1);

• Binet IQ at Entry IQ (vector 123× 1);

• Gender, represented by a “male” indicator M (vector 123× 1);

We use the notation of a superscript o to denote when a vector is associated to the eldest sibling

of each family. For example, while IQ has dimension 123, IQo has dimension 101 and comprises

only the IQs of the eldest siblings.

C.3 Constructing the Set of Vectors of Treatment Status $\Sigma_2^{-1}(D, Z)$

The set $\Sigma_2^{-1}(D, Z)$ consists of the possible vectors of treatment status that could yield the actual treatment status after applying some $\sigma_2 \in \Sigma_2$. To construct it, we switch the control status of selected participants to treatment status. Participants eligible for the switch from control status to treatment status satisfy the following criteria:

1. Belong to waves 2 through 5;

2. Maternal employment status is “working”;

3. Be the eldest sibling of the family;

4. Be assigned to control status;

We assume that there was at least one switch per wave and at most two switches per wave. There are at most 2700 vectors of treatment status in $\Sigma_2^{-1}(D, Z)$. Table 3 lists the eligible participants by identification number, together with their treatment status, maternal employment indicator, wave, and gender (1 for males). The remaining columns of Table 3 give, for each wave, the number of possible combinations of switches when zero, one, or two switches are considered, and the total over zero, one, or two switches. The number of possible treatment vectors that could have generated the current vector of treatment status, considering 1 or 2 switches per wave, is 2700.
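The count of 2700 can be reproduced by direct enumeration. The sketch below takes the eligible identification numbers by wave from Table 3; the function and variable names are ours and purely illustrative.

```python
from itertools import combinations
from math import prod

# Eligible participants (identification numbers) by wave, as listed in Table 3.
eligible_by_wave = {
    2: [42, 43],
    3: [58, 68, 69],
    4: [88, 89, 90, 94, 95],
    5: [114, 115, 121, 122],
}


def switch_sets(ids, min_switches, max_switches):
    """All sets of participants that could be switched back to treatment."""
    return [
        chosen
        for k in range(min_switches, max_switches + 1)
        for chosen in combinations(ids, k)
    ]


# One or two switches per wave, as assumed in the text.
per_wave = {w: len(switch_sets(ids, 1, 2)) for w, ids in eligible_by_wave.items()}
n_vectors = prod(per_wave.values())

# Allowing zero, one, or two switches per wave instead.
per_wave_with_zero = {
    w: len(switch_sets(ids, 0, 2)) for w, ids in eligible_by_wave.items()
}
n_vectors_with_zero = prod(per_wave_with_zero.values())
```

The per-wave counts are 3, 6, 15 and 10, whose product is 2700. Allowing zero switches as well gives 4, 7, 16 and 11, whose product is 4928, the total reported in the last column of Table 3.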

C.4 Constructing the Permutation Set GZ

We create a partition of the set of eldest siblings based on four criteria. Let i be the identification number of the eldest sibling of a family. The criteria are:

1. Whether IQ at entry is strictly above the median, that is, $\mathbf{1}[IQ^o(i) > \mathrm{median}(IQ)]$;

2. Whether SES at entry is strictly above the median, that is, $\mathbf{1}[SES^o(i) > \mathrm{median}(SES)]$;

3. Wave, that is, $W^o(i)$;

4. Gender, that is, $M^o(i)$.

Eldest siblings who have the same value of each of the criteria above were clustered in the same group. The final partition of eldest siblings is given in Table 4. The treatment assignments for eldest siblings that belong to the same group number are assumed to be exchangeable. Only seven participants belong to singleton groups, and do not permute at all. Permutations within groups are independent.
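A minimal sketch of this construction is given below. It assumes the data are held in dictionaries keyed by identification number and that the two medians are supplied by the caller; the names `build_partition` and `draw_permutation` and the toy values are hypothetical.

```python
import random
from collections import defaultdict


def build_partition(ids, iq, ses, wave, male, iq_median, ses_median):
    """Group eldest siblings that agree on all four criteria."""
    groups = defaultdict(list)
    for i in ids:
        key = (iq[i] > iq_median, ses[i] > ses_median, wave[i], male[i])
        groups[key].append(i)
    return list(groups.values())


def draw_permutation(d, groups, rng):
    """Permute treatment status independently within each group."""
    out = dict(d)
    for members in groups:
        labels = [d[i] for i in members]
        rng.shuffle(labels)
        out.update(zip(members, labels))
    return out


# Toy data (hypothetical, not the Perry sample).
ids = [1, 2, 3, 4, 5, 6]
iq = {1: 80, 2: 82, 3: 70, 4: 71, 5: 85, 6: 60}
ses = {1: 9.0, 2: 9.5, 3: 7.0, 4: 7.5, 5: 9.0, 6: 7.0}
wave = {1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 2}
male = {1: 1, 2: 1, 3: 0, 4: 0, 5: 0, 6: 0}
d = {1: 1, 2: 0, 3: 0, 4: 1, 5: 1, 6: 0}

groups = build_partition(ids, iq, ses, wave, male, iq_median=75, ses_median=8)
permuted = draw_permutation(d, groups, random.Random(1))
```

In this toy example participants 5 and 6 form singleton groups, so their treatment status never changes, while the number of treated participants inside each larger group is preserved by every draw.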


C.5 Maximizing the Critical Values

The code for the maximization procedure is short and it is printed in full.

D The Perry Study: Sampling and Randomization Protocol

D.1 Overview

Program Timeline Children were admitted to the study in five waves between 1962 and 1965 (Table 6). Waves (entry cohorts) were distinguished by birth year. During 1962, the first year of the study, two waves were admitted: wave zero, at age 4, and wave one, at age 3. Additional waves were admitted during each subsequent year, through wave four in 1965, all at age 3. Treatment lasted one year for wave zero and two years for later waves, because the Perry Preschool Program terminated at kindergarten entry (age 5).

D.2 Sampling

Sample Population The Perry study sampled children from families living in the catchment

area of the Perry Elementary School, in Ypsilanti, Michigan, as children in this district “seemed

most in need of early education support, as judged by the poor academic performance of children

at Perry School when compared with their peers in the community at large,” (Weikart et al., 1978,

pp. 6, 12). Table 7 shows these differences. All children were African-American, which reflected both the de jure institutional segregation of the local school districts of the time and the de facto segregation of residential areas through the 1960s (Schweinhart et al., 2005, p. 22).

By contrast, the Erickson school shown in Table 7 is characterized as “an all-white school located

in an upwardly-mobile section of the Ypsilanti Public School District,” (Weikart, 1967, p. 65).

Sampling Procedure Children were primarily selected using the family census of the Perry Elementary School, as well as by neighborhood-group referral or door-to-door canvassing (Schweinhart and Weikart, 1980, p. 17). The sampling procedure was comprehensive within its geographic scope:

“the total population in the Perry school district” with children born during the target period

was surveyed for each wave to determine eligibility, using census information for the school district

(Weikart, 1967, pp. 4, 65–68). The specific criteria for determining eligibility are described below.


Self-selection in the eligibility survey does not appear to be a factor; during the first two waves,

for example, “no parent who was approached refused to give information as to his socio-economic

status, and no parent refused to allow his child to be tested. Some families could not be found at

home in spite of several visits... [but] these families were later contacted during weekends. The

socio-economic data as well as test results on the children reached late indicated that these children

did not differ from those evaluated earlier,” (Weikart, 1967, p. 69). Further, the sample captured a broad swath of the population, as “...virtually all eligible children were enrolled in the project during its five years of operation, approximately 25% of the total preschool-age population at that time within the attendance area,” (Weikart et al., 1978, p. 16).

The only voluntary self-selection is that “3 families with children identified for the study refused

to participate,” (Schweinhart et al., 2005, p. 22). There was some involuntary selection after

treatment assignments were known — four children’s families moved away and one child died

before treatment program completion (Schweinhart et al., 2005, p. 23) — but no families voluntarily

declined participation following treatment assignment (Weikart et al., 1978, p. 7).

Eligibility Criteria Disadvantaged children were identified using an index of socio-economic

status (SES) and an IQ scale (Stanford-Binet, 1960 Norm, Form L-M).

The SES index is a weighted linear combination of three components: paternal occupational

skill level, parental educational attainment, and number of rooms per person in the family home.

The distribution of SES over the sample is given in Figure 4 (with definitions of SES components

provided in the figure notes). While the program documentation variously gives 10 (Weikart, 1967,

p. 68) or 11 (Weikart et al., 1978, p. 14) as the upper limit for eligibility, some children with an

SES index greater than 11 were admitted; out of 7 total such children, 6 ended up in the treatment

group, and 6 were admitted in the last two waves.
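As a worked illustration of the index, the sketch below computes an SES score from the three components using the weights given in the notes to Figure 4, read here as 1/2 for parental education, 2 for the occupational skill level, and 2 for rooms per person. The function name and the example family are hypothetical.

```python
def ses_index(parent_grades, skill_level, rooms, people):
    """Weighted SES index from the three components described above.

    parent_grades: highest grade completed by each present parent
    skill_level:   3 = skilled, 2 = semi-skilled, 1 = unskilled or none
    rooms, people: number of rooms in the home and number of people living there
    """
    average_grade = sum(parent_grades) / len(parent_grades)
    return 0.5 * average_grade + 2 * skill_level + 2 * rooms / people


# Hypothetical family: parents with 10 and 8 years of schooling, an unskilled
# job, and 5 rooms shared by 6 people.
example_score = ses_index([10, 8], skill_level=1, rooms=5, people=6)
eligible = example_score <= 11  # upper limit of 11 from Weikart et al. (1978, p. 14)
```

For this family the index is 4.5 + 2 + 5/3, roughly 8.2, which falls under either of the documented upper limits of 10 and 11.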

Weikart et al. (1978, p. 16) provide a Stanford-Binet IQ eligibility range between 50 and 85: “The score range of 50 to 85 was selected for a very practical reason: special-education funds were available from the state to aid children certified ‘educable mentally retarded’ based on demonstrated performance within that range in the absence of discernible organic impairments.” This is at odds with the ranges given in some other documentation, such as the most recent publication from the experimenters: “They selected for the study those children whose intellectual performance scores (IQs) at this initial testing qualified them as ‘borderline educable mentally impaired’ by the State of Michigan... that is, in the range of 70 to 85,” while noting the exceptions to this rule (Schweinhart et al., 2005, p. 23).

Despite the discrepancy in documentation, there is no suggestion that the lower IQ bounds were

varied systematically, or that the inclusion of low-IQ children was a result of anything other than

their chance occurrence in the sample. The same cannot be said for the inclusion of children with IQ

greater than 85: the documentation sources agree on that upper bound, and Weikart et al. (1978,

p. 16ff) indicates that “in some instances, [high-IQ inclusion] was done in order to fill preschool

vacancies in the experimental group and then to balance the control group; in other instances,

to include siblings of children already in the project.” Figure 3 shows all IQs by wave and final

treatment assignment.

Family Background The Perry sample is made up of 123 children from 104 families, with

each family contributing at most one child to each wave. Table 8 breaks down children by their

relationship to siblings in the study (if any). Expanding one’s view beyond the study, Perry children

were young, and from large families: each had at least one older sibling, with a study-wide average

of four.

About half of the children were living with both biological parents. Average maternal age was

29 years at the time of enrollment; average maternal educational attainment was 9.4 years, but no

parent had completed more than 12 years of education. Among mothers with children at the Perry

School in 1962, 77% were born and 53% were educated in the south; correspondingly, 79% of Perry

study participants’ mothers were from the south, and only 11% from Michigan (Schweinhart et al.,

2005, pp. 22-23, 27).

D.3 Randomization Protocol

Detailed information on the implemented randomization protocol is needed to construct a Perry-tailored inference procedure that corrects for compromises in the randomization as implemented and for uncertainty about the true assignment rule.

It is essential to understand the randomization protocol in order to choose an appropriate inference procedure. Following Weikart et al. (1978, p. 16), there are 123 participants; the 51 females (25 treated and 26 control) and 72 males (33 treated and 39 control) are distributed among five cohorts. For each entry cohort, children were assigned to treatment and control groups following Procedure ??, which is graphically illustrated in Figure 1 of Subsection 2.2.

Balancing IQ All documentation sources agree that the randomization protocol began with unlabeled groups formed by stratifying on IQ rank. Weikart et al. (1978, p. 16) and Schweinhart et al. (2005, p. 27) are not specific, indicating merely that children were “sorted” or “assigned” to groups based on IQ rank. Other sources differ on specifics. Weikart et al. (1978, p. 7) and Schweinhart et al. (1993, p. 31) indicate that children were paired based on “similar” or “matched” IQs, after which the two pair members were “randomly” assigned, one to each of the two unlabeled groups. In contrast, Schweinhart and Weikart (1980, p. 20) indicate that “...[c]hildren were ranked by their initial IQs; even rankings were assigned to one group and odd rankings to another.” We interpret the invocation of randomness as describing the manner in which ties were broken, as there are a few children with tied IQs in each wave, and we assume that, after ties were broken randomly, children with even- and odd-ranked IQs were put in separate unlabeled groups.

Balancing SES and Gender Table 1 reviews the descriptions of the exchanges intended to balance mean SES index and gender. In all cases, the text appears to indicate that exchanges were made between one treatment child and one control child, keeping the marginal treatment group counts fixed. The descriptions also indicate that swaps were made in such a way as to hold IQ roughly constant across groups. Although a couple of descriptions indicate that exchanges were made so as to balance IQ, we read this as merely a comment that mean IQ remains balanced because balancing exchanges hold the distribution of IQ across groups roughly constant. For the number of such exchanges, we use the most precise estimate: 1–2 per wave.

Table 1: Randomization Protocol Documentation: Balancing Gender and SES Index

| # of Swaps | IQ Invariance | Balance IQ? | Text | Source |
|---|---|---|---|---|
| - | Held Roughly Constant | No | “If these groups had unequal sex ratios or unequal SES ratings, they were equated by exchanging them, with [IQ] scores held more or less constant.” | Weikart et al. (1978, p. 16) |
| - | Held Roughly Equal | No | “Pairs of similarly-ranked children were exchanged between groups until [gender and SES] for the two groups were equivalent.” | Schweinhart and Weikart (1980, p. 20) |
| - | Btw. Similar Ranks | No | “Then, pairs of similarly-ranked children were exchanged between groups to equate within-group ratios of boys to girls and the average socioeconomic levels of the two groups.” | Weikart et al. (1978, p. 7) |
| Several | Btw. Similar Ranks | Yes | “They then exchanged several similarly-ranked pair members, so the two groups would be matched on mean [SES, IQ, and gender].” | Schweinhart et al. (1993, p. 31) |
| 1–2 | Btw. Same Rank-Pairs | Yes | “As part of the initial assignment procedure, they exchanged 1 or 2 pair members per class to ensure that the groups were matched on mean [SES, IQ, and gender].” | Schweinhart et al. (2005, p. 27) |

Reassigning Children with Working Mothers After the first two waves, the randomization procedure “had to be qualified somewhat by practical considerations;” children with single mothers had transportation difficulties (funding was not available for preschool-provided transportation) and scheduling home visits proved problematic (Weikart et al., 1978, pp. 16–17). Table 2 reviews the documentation on reassignments of treatment children with working mothers to the control group. We simulate these swaps by supposing that up to two such reassignments occurred per wave.

The lowest estimate of the total number of exchanges (2) comes from Schweinhart et al. (1993, p.

31), which adds, “[a] thorough review of the study’s comprehensive original data could identify only

two such transfers; the transferred children’s names were independently identified by the program’s

head teacher. (Because this transfer has become a point of question [cites], a special effort was

made to verify its extent.)” However, even if this were the overall limit, the fact that their timing

is unknown does not decrease the per-wave upper bound of 2.

Documentation is generally consistent on reassignments being unilateral from the treatment to the control group. The only indication to the contrary is in Schweinhart et al. (2005, p. 27), which uses the phrase “exchanged 1 or 2 pair members”, also used in that book to describe the balancing exchanges between groups (see Table 1). Since the other books seem consistent on employment-based reassignment being unilateral, and balancing swaps not, we choose to ignore this aberration as an imprecision in terminology.


Table 2: Randomization Protocol Documentation: Reassigning Children with Working Mothers

| # of Swaps (a) | One- or Two-Way (b) | Text | Source |
|---|---|---|---|
| - | - | “Following the assignment of Waves 0 and 1...[o]ccasional exchanges of children between groups also had to be made because of the inconvenience of half-day preschool for working mothers...” | Weikart et al. (1978, pp. 16–17) |
| 5 Total | One-Way | “...five children were transferred from the experimental group to the control group, rather than dropped from the study, because they were unable to attend preschool or to participate with their mothers in the home-visit component of the program. These children came from single-parent families in which the mother was employed.” | Schweinhart and Weikart (1980, p. 21) |
| 5 Total | One-Way | “Five children with single parents employed outside the home had to be transferred from the preschool group to the no-preschool group because of their inability to participate in the classroom and/or home-visit components of the preschool program.” | Weikart et al. (1978, p. 7) |
| 2 Total | One-Way | “...fearing overall sample attrition, staff transferred from the program group to the non-program group 2 children (with single mothers employed away from home) who were unable to participate in any of the program’s classes or home visits.” | Schweinhart et al. (1993, p. 31) |
| 1–2 / Wave | Two-Way? | “...as part of the initial assignment procedure in later classes, they exchanged 1 or 2 pair members per class to reduce the number of children of employed mothers in the program group, because it was difficult to arrange home visits for them.” | Schweinhart et al. (2005, p. 27) |

Notes: (a) Total number of exchanges, throughout the study; (b) Whether reassignments were unilateral from the treatment group to the control group, or whether they were numerically balanced by transfers from the control group to the treatment group.


Figure 3: IQ at Entry, by Wave and by Treatment Group

[Figure not reproduced: “Table 12: Entry IQ vs. Treatment Group, by Wave” (Perry: Stanford-Binet Entry IQ by Cohort and Group Assignment). For each of Classes 1 through 5, the figure tabulates the counts of control and treatment children at each entry IQ value.]

Notes: Stanford-Binet IQ at entry (age 3) was used.


Figure 4: SES Index, by Gender and Treatment Status

[Figure not reproduced: two panels, “SES Index: Male” and “SES Index: Female”, each showing the fraction of control and treatment children (vertical axis, 0 to 0.45) by SES index (horizontal axis, 6 to 14).]

Notes: The socio-economic status (SES) index was defined as a weighted linear combination of 3 variables: (1) average highest grade completed by parents (or the grade of the only present parent), with coefficient 1/2; (2) father’s employment status (or mother’s if father is absent): 3 for skilled, 2 for semi-skilled, and 1 for unskilled or none, all with coefficient 2; (3) number of rooms in home divided by number of people living in the household, with coefficient 2. The skill level of the parent’s job is rated by the study coordinators and is not clearly defined. An SES score of 11 or lower was required to enter the study (Weikart, Bond, and McNeil, 1978, p. 14). The SES score cutoff of 11 was not always adhered to: out of the full sample, 7 individuals have SES above the cutoff; 6 out of 7 are in the treatment group, and 6 out of 7 are in the last two waves.


Table 3: All Participants that can Switch Back to Treatment according to $\Sigma_2^{-1}(D, Z)$

| Id | D | MW | Wave | Males |
|---|---|---|---|---|
| 42 | 0 | 1 | 2 | 1 |
| 43 | 0 | 1 | 2 | 1 |
| 58 | 0 | 1 | 3 | 0 |
| 68 | 0 | 1 | 3 | 1 |
| 69 | 0 | 1 | 3 | 0 |
| 88 | 0 | 1 | 4 | 1 |
| 89 | 0 | 1 | 4 | 1 |
| 90 | 0 | 1 | 4 | 0 |
| 94 | 0 | 1 | 4 | 1 |
| 95 | 0 | 1 | 4 | 1 |
| 114 | 0 | 1 | 5 | 1 |
| 115 | 0 | 1 | 5 | 0 |
| 121 | 0 | 1 | 5 | 0 |
| 122 | 0 | 1 | 5 | 0 |

Total Combinations

| Wave | 0 switches | 1 switch | 2 switches | 0, 1 or 2 |
|---|---|---|---|---|
| 2 | 1 | 2 | 1 | 4 |
| 3 | 1 | 3 | 3 | 7 |
| 4 | 1 | 5 | 10 | 16 |
| 5 | 1 | 4 | 6 | 11 |
| Total | 1 | 120 | 180 | 4928 |

Notes: Id is the identification number; D is the treatment status; MW is the maternal employment indicator (1 for working); Males is the gender indicator (1 for males). The combination columns give, for each wave, the number of possible combinations of switches considering zero, one, or two switches; the last column considers 0, 1 or 2 switches. The Total row is the product across waves. The number of possible treatment vectors that could have generated the current vector of treatment status considering 1 or 2 switches per wave is 2700.


Table 4: Partition of Eldest Siblings for Permutation Purposes

| Wave | No. | id (Group) |
|---|---|---|
| 1 | 1–28 | 1 (5), 2 (6), 3 (7), 4 (1), 5 (4), 6 (2), 7 (2), 8 (8), 9 (7), 10 (5), 11 (4), 12 (5), 13 (8), 14 (3), 15 (5), 16 (5), 17 (6), 18 (8), 19 (2), 20 (8), 21 (3), 22 (5), 23 (4), 24 (5), 25 (5), 26 (6), 27 (5), 28 (4) |
| 2 | 29–40 | 29 (12), 30 (13), 31 (12), 32 (13), 35 (10), 36 (10), 38 (12), 39 (12), 40 (12), 41 (15), 42 (14), 43 (13) |
| 3 | 41–61 | 47 (18), 49 (16), 50 (17), 51 (20), 52 (20), 53 (21), 55 (17), 56 (17), 57 (18), 58 (18), 60 (19), 61 (23), 63 (23), 64 (23), 65 (20), 66 (21), 67 (18), 68 (21), 69 (17), 70 (18), 71 (22) |
| 4 | 62–82 | 73 (30), 74 (30), 75 (24), 76 (26), 77 (27), 78 (31), 79 (29), 80 (27), 81 (28), 82 (29), 85 (30), 86 (29), 88 (28), 89 (31), 90 (24), 92 (24), 93 (31), 94 (30), 95 (30), 96 (24), 98 (29) |
| 5 | 83–101 | 99 (38), 100 (37), 101 (32), 102 (39), 104 (38), 105 (39), 107 (39), 108 (38), 109 (33), 110 (36), 112 (35), 113 (38), 114 (39), 115 (35), 116 (37), 118 (35), 121 (35), 122 (32), 123 (39) |

Notes: No. is the running count of eldest siblings; each entry gives the identification number followed by the group number in parentheses. The treatment assignments for eldest siblings that belong to the same group number are assumed to be exchangeable.


Table 5: Partitions of Eldest Siblings According to Different Discretization Criteria

| Panel | Item | Discretization 1 | Discretization 2 | Discretization 3 | Discretization 4 |
|---|---|---|---|---|---|
| (a) Criteria | IQ | Median | Median | Tercile | Tercile |
| (a) Criteria | SES | Median | Tercile | Median | Tercile |
| (a) Criteria | Gender | Yes | Yes | Yes | Yes |
| (a) Criteria | Wave | Yes | Yes | Yes | Yes |
| (b) | N. of Partition Sets | 38 | 49 | 53 | 64 |
| (b) | N. of Singletons | 7 | 16 | 20 | 28 |
| (c) | N. of Permutations | 4.42E+29 | 6.39E+21 | 5.92E+19 | 5.07E+14 |
| (c) | N. of different gD | 4.35E+13 | 2.94E+11 | 2.72E+09 | 4.03E+07 |

Notes: The table presents four partitions of the set of eldest siblings, which comprises 101 participants. The partitions are based on the values of selected variables: gender (dichotomous variable), wave (categorical variable from 1 to 5) and quantile indicators of IQ and SES, both measured at entry. Participants with the same values of the selected background variables are clustered into the same partition set. We use two types of quantile indicators: (a) whether the participant is above or below the median of the target variable; (b) the tercile of the participant’s value of SES or IQ in the overall sample. Panel (a) in the table provides the specification of each one of the four partitions. The first line in Panel (b) gives the number of sets in each partition. The second line gives the number of singletons, that is, the number of sets that have a single participant. We assume exchangeability of participants within partition sets. The first line of Panel (c) provides the number of possible permutations for each partition. The second line of Panel (c) provides the number of distinct vectors of treatment status that would have been obtained if the permutations were applied to the actual treatment status.
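The two counts in Panel (c) follow from elementary combinatorics: with independent permutations inside each partition set, the number of permutations is the product of the factorials of the set sizes, and the number of distinct treatment vectors is the product, over sets, of the number of ways to place the treated members of that set. The sketch below illustrates both counts on a toy partition; the function names and the data are hypothetical, and it does not reproduce the figures in the table.

```python
from collections import Counter
from math import comb, factorial, prod


def n_permutations(group_of):
    """Number of within-group permutations: product of factorials of group sizes."""
    sizes = Counter(group_of.values())
    return prod(factorial(size) for size in sizes.values())


def n_distinct_assignments(group_of, d):
    """Number of distinct treatment vectors reachable by those permutations."""
    sizes = Counter(group_of.values())
    treated = Counter(group_of[i] for i in group_of if d[i] == 1)
    return prod(comb(size, treated[g]) for g, size in sizes.items())


# Toy partition (hypothetical): participants 1-3 in group "a", 4-5 in group "b".
group_of = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b"}
d = {1: 1, 2: 0, 3: 0, 4: 1, 5: 1}

toy_permutations = n_permutations(group_of)            # 3! * 2!
toy_distinct = n_distinct_assignments(group_of, d)     # C(3, 1) * C(2, 2)
```

Finer partitions have smaller sets and more singletons, so both counts fall as one moves from the first to the fourth discretization, which is the pattern visible in the table.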


Table 6: Timeline of Perry Preschool Program Waves

| Wave | Sample Size: Treat. | Sample Size: Ctl. |
|---|---|---|
| Zero | 13 | 15 |
| One | 8 | 9 |
| Two | 12 | 14 |
| Three | 13 | 14 |
| Four | 12 | 13 |

[Timeline columns for the years 1958 through 1967, marking each wave’s birth year and its years in the Perry Preschool Program (a), are not reproduced.]

Source: Weikart et al. (1978, p. 6)

Notes: (a) The Perry Preschool Program ran during the school year (October through mid-May).


Table 7: Comparing Families of Participants with Other Families with Children in the Perry Elementary School Catchment, Ypsilanti, MI.

                                   Perry School    Perry           Erickson
                                   (Overall)(a)    Preschool(b)    School(c)
Mother
  Average Age                      35              31              32
  Mean Years of Education          10.1            9.2             12.4
  % Working                        60%             20%             15%
  Mean Occupational Level(d)       1.4             1.0             2.8
  % Born in South                  77%             80%             22%
  % Educated in South              53%             48%             17%
Father
  % Fathers Living in the Home     63%             48%             100%
  Mean Age                         40              35              35
  Mean Years of Education          9.4             8.3             13.4
  Mean Occupational Level(d)       1.6             1.1             3.3
Family & Home
  Mean SES(e)                      11.5            4.2             16.4
  Mean # of Children               3.9             4.5             3.1
  Mean # of Rooms                  5.9             4.8             6.9
  Mean # of Others in Home         0.4             0.3             0.1
  % on Welfare                     30%             58%             0%
  % Home Ownership                 33%             5%              85%
  % Car Ownership                  64%             39%             98%
  % Members of Library(f)          25%             10%             35%
  % with Dictionary in Home        65%             24%             91%
  % with Magazines in Home         51%             43%             86%
  % with Major Health Problems     16%             13%             9%
  % Who Had Visited a Museum       20%             2%              42%
  % Who Had Visited a Zoo          49%             26%             72%
N                                  277             45              148

Source: Weikart, Bond, and McNeil (1978). Notes: (a) These data are based on parents who attended parent-teacher meetings at the Perry school or who were tracked down at their homes by Perry personnel (Weikart, Bond, and McNeil, 1978, pp. 12–15); (b) The Perry Preschool subsample consists of the full sample (treatment and control) from the first two waves; (c) The Erickson School was an “all-white school located in a middle-class residential section of the Ypsilanti public school district” (ibid., p. 14); (d) Occupation level: 1 = unskilled; 2 = semiskilled; 3 = skilled; 4 = professional; (e) See the base of Figure 4 for the definition of SES; (f) Any member of the family.


Table 8: Siblings in Perry Sample

             Overall            Male               Female
             All  Ctl.  Trt.    All  Ctl.  Trt.    All  Ctl.  Trt.
Singleton     82   41    41      51   26    25      31   15    16
Eldest        19   12     7      13    9     4       6    3     3
Younger       22   12    10       8    4     4      14    8     6
N            123   65    58      72   39    33      51   26    25

Note: The sample includes 17 pairs, 1 triple, and 1 quadruple of siblings.
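The sibling counts in the note can be reconciled with the Eldest and Younger rows of the table by a short calculation. The sketch below assumes that each sibling group contributes exactly one eldest member and that all remaining members count as younger; the variable names are illustrative only.

```python
# Sibling groups reported in the note: group size -> number of groups.
sibling_groups = {2: 17, 3: 1, 4: 1}

# Assumption: one eldest sibling per group, all others are "younger".
eldest = sum(sibling_groups.values())
younger = sum((size - 1) * count for size, count in sibling_groups.items())

# Singleton count taken from the Overall / All column of the table.
singletons = 82
total = singletons + eldest + younger
```

Under that assumption the calculation gives 19 eldest and 22 younger siblings, and adding the 82 singletons gives the overall sample of 123, which agrees with the Overall column.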
