Page 1: Regression Discontinuity World Bank SIEF / APHRC Impact ...pubdocs.worldbank.org/en/452941440083709642/10...Regression Discontinuity World Bank SIEF / APHRC Impact Evaluation Training

Regression Discontinuity

World Bank SIEF / APHRC Impact Evaluation Training

2015

Owen Ozier

Development Research Group

The World Bank

6 May 2015

Owen Ozier (The World Bank) Regression Discontinuity 6 May 2015 1 / 9

Regression Discontinuity (“RD”) Stata section

We will use a new dataset (.dta file): rd-exampledata-2015.dta

Four separate .do files are provided for this module:

1. Getting the RD package installed.

2. Exploring a new dataset: a (fictional) training program.

3. Basics of regression discontinuity: the rd command and its manual equivalent.

4. Fuzzy RD and manipulation of the running variable: density and covariate checks.
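The first two steps above might look like the following in a do-file. This is a minimal sketch, not the contents of the provided .do files. It assumes that the rd package meant here is the user-written one distributed through SSC, that Stata has internet access, and that the dataset sits in the current working directory.

```stata
* Step 1: install the user-written rd package from SSC (assumed source; needs internet)
ssc install rd, replace

* Step 2: load the example dataset (assumes the file is in the working directory)
use "rd-exampledata-2015.dta", clear

* Take a first look at the variables
describe

* Read the documentation for rd before moving on to steps 3 and 4
help rd
```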

A fictional computer and typing skills training program

Photo credits: Wikipedia / Robin Taylor, flickr; Wikipedia / Abelkazzah

Quickly reviewing Stata syntax

[command] [variables or other "arguments"], [options]

as in

summarize motheryob

or

summarize motheryob, detail

and some commands, options, and even variables can be abbreviated:

su mothery, d

in general, to learn more about a command and its syntax:

help [command]
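The same pattern carries over to other commands and variables in the example dataset. The lines below are an illustrative sketch, assuming rd-exampledata-2015.dta is already loaded. The if qualifier is standard Stata syntax that the slide does not show, and the particular variable choices are only examples.

```stata
* Same command as above, with an if qualifier placed before the comma
summarize motheryob if pri == 1, detail

* A different command following the same pattern: one variable, one option
tabulate intcomp, missing

* Look up the full syntax, including the allowed abbreviations
help summarize
```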

Basic characteristics of the data

. describe

Contains data from rd-exampledata-2015.dta
  obs:        80,000        Example Regression Discontinuity Data - Owen Ozier 2015
 vars:            17        5 May 2015 07:57
 size:     2,960,000
----------------------------------------------------------------------------------------
                storage  display    value
variable name   type     format     label    variable label
----------------------------------------------------------------------------------------
id              float    %9.0g               Unique identifier
datacollected   byte     %9.0g               Full data available for this individual
selfskills      float    %9.0g               Pre-baseline: Self-assessed labor market skills score 2011
skills          float    %9.0g               Baseline: Labor market skills score in 2012
el              byte     %9.0g               Baseline: Eligibility for 2013 training program
elsk            float    %9.0g               Baseline: Eligibility for program X skills score
pri             byte     %9.0g               Baseline: Completed primary schooling by 2012
intcomp         byte     %19.0g     intlev   Baseline: Interest in gaining computer skills (2012)
motheryob       int      %9.0g               Baseline: Mother’s year of birth
mothermath      float    %9.0g               Baseline: Mother’s mathematics test score in 2012
pasttrainings   byte     %9.0g               Baseline: Participated in job trainings by 2012
friendtrain     byte     %9.0g               Baseline: Any close friends participated in job training by 2012
finishtraining  byte     %9.0g               Administrative data: Finished 2013 training program
tsk             byte     %9.0g               Endline: Typing skill (WP30s), end of 2013 training program
empdayafter     byte     %9.0g               Endline: Has formal job, one day after 2013 training program
empyearafter    byte     %9.0g               Endline: Has formal job, one year after training program (2014)
earningsafter   int      %9.0g               Endline: Earnings, one year after training program (2014)
----------------------------------------------------------------------------------------
Sorted by: id
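Two of these variables, el and elsk, are described only by their labels. One way to see how they relate to skills is sketched below. It assumes the dataset is already loaded and that elsk is the product of el and skills, which its label suggests but the slides do not confirm. The name elsk_check is a hypothetical helper variable.

```stata
* How does eligibility line up with the skills score?
* (the eligibility cutoff is not stated in the slides)
bysort el: summarize skills

* Check the assumption that elsk equals el times skills
generate elsk_check = el * skills
compare elsk elsk_check

* Remove the helper variable again
drop elsk_check
```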

Basic characteristics of the data

. codebook pri

----------------------------------------------------------------------------------------
pri                                        Baseline: Completed primary schooling by 2012
----------------------------------------------------------------------------------------

                  type:  numeric (byte)

                 range:  [0,1]                        units:  1
         unique values:  2                        missing .:  20000/80000

            tabulation:  Freq.  Value
                         30326  0
                         29674  1
                         20000  .

Basic characteristics of the data

. codebook skills

----------------------------------------------------------------------------------------
skills                                       Baseline: Labor market skills score in 2012
----------------------------------------------------------------------------------------

                  type:  numeric (float)

                 range:  [-1,.99992049]               units:  1.000e-13
         unique values:  58982                    missing .:  20000/80000

                  mean:  -.016421
              std. dev:   .409222

           percentiles:        10%       25%       50%       75%       90%
                          -.550547  -.291344  -.009326   .270603   .513644

Basic characteristics of the data

Which of these would make a better “running variable” for a regression discontinuity?

----------------------------------------------------------------------------------------
pri                                        Baseline: Completed primary schooling by 2012
----------------------------------------------------------------------------------------

                  type:  numeric (byte)

                 range:  [0,1]                        units:  1
         unique values:  2

            tabulation:  Freq.  Value
                         30326  0
                         29674  1

----------------------------------------------------------------------------------------
skills                                       Baseline: Labor market skills score in 2012
----------------------------------------------------------------------------------------

                  type:  numeric (float)

         unique values:  58982

           percentiles:        10%       25%       50%       75%       90%
                          -.550547  -.291344  -.009326   .270603   .513644

Basic characteristics of the data

Which of these would make a better “running variable” for a regression discontinuity?

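[Figure: two histograms. First panel: density (axis from 0 to 25) of “Baseline: Completed primary schooling by 2012”, x-axis from 0 to 1. Second panel: density (axis from 0 to 1) of “Baseline: Labor market skills score in 2012”, x-axis from −1 to 1.]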
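Density plots of this kind can be drawn with Stata's histogram command. The sketch below is one possible way to compare the two candidates, not necessarily how the slide's figures were made. It assumes the dataset is already loaded, and the bin width of 0.05 is an arbitrary choice. The comparison matters because a regression discontinuity needs many observations at distinct values close to the cutoff on both sides, which a two-valued variable cannot provide.

```stata
* pri takes only the values 0 and 1, so draw one bar per value
histogram pri, discrete

* skills is continuous on [-1, 1]; width(0.05) is an arbitrary bin width
histogram skills, width(0.05)

* Compact summary, including the number of unique values of each variable
codebook pri skills, compact
```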
