Regression Discontinuity
World Bank SIEF / APHRC Impact Evaluation Training 2015
Owen Ozier
Development Research Group
The World Bank
6 May 2015
Owen Ozier (The World Bank) Regression Discontinuity 6 May 2015 1 / 9
Regression Discontinuity (“RD”) Stata section
We will use a new dataset (.dta file): rd-exampledata-2015.dta
Four separate .do files are provided for this module:
1. Getting the RD package installed.
2. Exploring a new dataset: a (fictional) training program.
3. Basics of regression discontinuity: the rd command and its manual equivalent.
4. Fuzzy RD and manipulation of the running variable: density and covariate checks.
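The first two steps might look roughly like the sketch below. It assumes that the rd package is the user-written rd command distributed through SSC, that Stata has internet access, and that the .dta file sits in the current working directory; the .do files provided with the module may do this differently.

```stata
* Assumption: the "RD package" is the user-written rd command on SSC
ssc install rd, replace

* Assumption: the dataset is in the current working directory
use "rd-exampledata-2015.dta", clear

* First look at variable names, storage types, and labels
describe
```

If the install succeeds, help rd should open the documentation for the command.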
A fictional computer and typing skills training program
Photo credits: Wikipedia / Robin Taylor, flickr; Wikipedia / Abelkazzah
Quickly reviewing Stata syntax
[command] [variables or other "arguments"], [options]
as in
summarize motheryob
or
summarize motheryob, detail
and some commands, options, and even variables can be abbreviated:
su mothery, d
in general, to learn more about a command and its syntax:
help [command]
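The same pattern applies to other commands. The sketch below is illustrative only: it assumes the example dataset is in the current working directory, and the choice of tabulate and its missing option is ours, not taken from the slides.

```stata
* Assumption: the dataset is in the current working directory
use "rd-exampledata-2015.dta", clear

* [command] [variables]: a variable list can hold more than one variable
codebook pri skills

* [command] [variables], [options]: show missing values as their own row
tabulate pri, missing

* The same command with the command name and the option abbreviated
tab pri, m
```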
Basic characteristics of the data
. describe
Contains data from rd-exampledata-2015.dta
  obs:        80,000                            Example Regression Discontinuity Data - Owen Ozier 2015
 vars:            17                            5 May 2015 07:57
 size:     2,960,000
----------------------------------------------------------------------------------------
                storage   display    value
variable name   type      format     label      variable label
----------------------------------------------------------------------------------------
id              float     %9.0g                 Unique identifier
datacollected   byte      %9.0g                 Full data available for this individual
selfskills      float     %9.0g                 Pre-baseline: Self-assessed labor market skills score 2011
skills          float     %9.0g                 Baseline: Labor market skills score in 2012
el              byte      %9.0g                 Baseline: Eligibility for 2013 training program
elsk            float     %9.0g                 Baseline: Eligibility for program X skills score
pri             byte      %9.0g                 Baseline: Completed primary schooling by 2012
intcomp         byte      %19.0g     intlev     Baseline: Interest in gaining computer skills (2012)
motheryob       int       %9.0g                 Baseline: Mother’s year of birth
mothermath      float     %9.0g                 Baseline: Mother’s mathematics test score in 2012
pasttrainings   byte      %9.0g                 Baseline: Participated in job trainings by 2012
friendtrain     byte      %9.0g                 Baseline: Any close friends participated in job training by 2012
finishtraining  byte      %9.0g                 Administrative data: Finished 2013 training program
tsk             byte      %9.0g                 Endline: Typing skill (WP30s), end of 2013 training program
empdayafter     byte      %9.0g                 Endline: Has formal job, one day after 2013 training program
empyearafter    byte      %9.0g                 Endline: Has formal job, one year after training program (2014)
earningsafter   int       %9.0g                 Endline: Earnings, one year after training program (2014)
----------------------------------------------------------------------------------------
Sorted by: id
Basic characteristics of the data
. codebook pri
----------------------------------------------------------------------------------------
pri                                        Baseline: Completed primary schooling by 2012
----------------------------------------------------------------------------------------
                  type:  numeric (byte)
                 range:  [0,1]                        units:  1
         unique values:  2                        missing .:  20000/80000
            tabulation:  Freq.  Value
                         30326  0
                         29674  1
                         20000  .
Basic characteristics of the data
. codebook skills
----------------------------------------------------------------------------------------
skills                                       Baseline: Labor market skills score in 2012
----------------------------------------------------------------------------------------
                  type:  numeric (float)
                 range:  [-1,.99992049]               units:  1.000e-13
         unique values:  58982                    missing .:  20000/80000
                  mean:  -.016421
              std. dev:   .409222
           percentiles:        10%       25%       50%       75%       90%
                          -.550547  -.291344  -.009326   .270603   .513644
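Both pri and skills are missing for 20000 of the 80000 observations. One thing worth checking is whether these are the individuals without full data. The sketch below assumes that datacollected equals 0 for such individuals, which the variable label suggests but the slides do not confirm, and that the dataset is in the current working directory.

```stata
* Assumption: the dataset is in the current working directory
use "rd-exampledata-2015.dta", clear

* How many individuals have full data? (coding of datacollected is assumed)
tabulate datacollected, missing

* Number of observations with a missing skills score
* (the codebook output reports 20000 of 80000)
count if missing(skills)

* Do the missing scores line up with datacollected == 0?
count if missing(skills) & datacollected == 0

* Detailed summary statistics for the non-missing scores
summarize skills, detail
```

If the last two counts agree, missing baseline scores coincide with the datacollected flag.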
Basic characteristics of the data
Which of these would make a better “running variable” for a regression discontinuity?
----------------------------------------------------------------------------------------
pri                                        Baseline: Completed primary schooling by 2012
----------------------------------------------------------------------------------------
                  type:  numeric (byte)
                 range:  [0,1]                        units:  1
         unique values:  2
            tabulation:  Freq.  Value
                         30326  0
                         29674  1
----------------------------------------------------------------------------------------
skills                                       Baseline: Labor market skills score in 2012
----------------------------------------------------------------------------------------
                  type:  numeric (float)
         unique values:  58982
           percentiles:        10%       25%       50%       75%       90%
                          -.550547  -.291344  -.009326   .270603   .513644
[Figure: density histograms of the two candidate running variables. First panel: "Baseline: Completed primary schooling by 2012", x-axis from 0 to 1, density axis from 0 to 25. Second panel: "Baseline: Labor market skills score in 2012", x-axis from −1 to 1, density axis from 0 to 1.]
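Density plots of the two candidate variables can be drawn with the histogram command. This is a minimal sketch rather than the code used for the slides: the graph names pri_hist and skills_hist are hypothetical, default binning is assumed, and the dataset is assumed to be in the current working directory.

```stata
* Assumption: the dataset is in the current working directory
use "rd-exampledata-2015.dta", clear

* Binary variable: all of the mass sits at the two values 0 and 1
histogram pri, name(pri_hist, replace)

* Continuous score: mass is spread over the whole range from -1 to 1
histogram skills, name(skills_hist, replace)

* Show the two panels side by side for comparison
graph combine pri_hist skills_hist
```

A variable with only two values has no observations arbitrarily close to a cutoff on either side, whereas a continuous score does.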