Econometrics using STATA : Part 2
Benjamin Monnery, EconomiX, Univ Paris Nanterre
M1 Economie du Droit, 2017-2018
FINDING DATA COVARIATE-ADJUSTMENT MATCHING
CONTENT OF PART 2
When an RCT is not an option, the only alternative is to use observational (real-life) data
1. How to retrieve data?
   - public sources (data.gouv), data repositories, journal archives
   - how to clean/manipulate data sets in Stata?
2. How to fix selection bias?
   • when there is only selection on observables (part 2),
     i.e. easy problems where you know all the determinants of assignment that are correlated with Y
     Methods: stratification, covariate-adjustment and matching
   • when there is also selection on unobservables (part 3)
     Methods: IV, panel, DID, RDD...
B. Monnery (EconomiX) Econometrics using Stata II 2 / 41
EXAMPLE OF SELECTION ON OBSERVABLES
What’s the effect of lawyers on judicial outcomes, e.g. Pr(conviction)?
Among defendants, having a lawyer is “as random” conditional on ...
• strength of the case (evidence)
• wealth, ...
⇒ Among these determinants of treatment, strength of the case certainly correlates with Pr(conviction)
What about wealth? (depends on the judicial system)
Assumption: there is selection on observables (only) if
$$E[Y_i^1 \mid T = 1, X] = E[Y_i^1 \mid T = 0, X]$$
$$E[Y_i^0 \mid T = 1, X] = E[Y_i^0 \mid T = 0, X]$$
Potential outcomes are the same on average for treated and untreated with the same X
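A short derivation may help to see why this assumption is enough. It uses only the two equalities above plus the usual link between observed and potential outcomes, $Y_i = T_i Y_i^1 + (1 - T_i) Y_i^0$ (this notation for the observed outcome is an assumption, it is not written out on the slide):

```latex
\begin{align*}
E[Y_i \mid T = 1, X] - E[Y_i \mid T = 0, X]
  &= E[Y_i^1 \mid T = 1, X] - E[Y_i^0 \mid T = 0, X] \\
  &= E[Y_i^1 \mid X] - E[Y_i^0 \mid X] \\
  &= E[Y_i^1 - Y_i^0 \mid X]
\end{align*}
```

The second line holds because, when a conditional mean is the same for T = 1 and T = 0, it equals the mean that conditions on X alone. The observed difference between treated and untreated with the same X is then the average treatment effect for that value of X.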
Finding Data
Access to data is necessary to answer questions
> know key sources, be able to manipulate their data
Access to novel data is (almost) necessary to publish in top scientific journals
• good data + good method + interesting topic = top science
• “competition” for data among researchers
• difficult to teach
> be curious, follow the news, learn to code
DATA GOUV
also look at INSEE, ministries’ websites...
HARVARD DATAVERSE AND JOURNAL ARCHIVES
Many top scientific journals now require online publication of datasets (like AER)
https://www.aeaweb.org/articles?id=10.1257/aer.20161503
CIVIL SOCIETY INITIATIVES
We will use some of their data later in the course (Diff-in-Diff)
Covariate-adjustment
INTUITION
We want to estimate a causal treatment effect by comparing the observed outcomes of treated and untreated people
If we think we know all the determinants X of treatment assignment T that also relate to Y (selection on observables), we can simply compare treated and untreated outcomes conditional on X
How to “condition on X”?
1. statistically control for X in a regression model (covariate adjustment)
   - estimate $Y_i = \beta_0 + \beta_1 T_i + \beta_2 X_i + \varepsilon_i$
2. use matching (e.g. propensity score matching)
3. use stratification (subclassification):
   - compute differences within small groups (strata/cells) of X
⇒ Covariate-adjustment is the regression analog to stratification
In a problem of selection on observables, we want to compare treated and untreated within subgroups with similar potential outcomes
Ex: what’s the effect of lawyers on defendants’ probability of conviction?
⇒ True answer? Probably a reduction of Pr(conviction)
⇒ Problem (selection bias): the propensity to hire a lawyer and the probability of conviction are both related to the strength of evidence against the defendant
• if the court has strong evidence against a defendant, he is more likely to hire a lawyer to help him
• however, he is also more likely to be convicted eventually
⇒ hence the risk of selection bias due to differences in strength of evidence
If you can measure strength of evidence, selection bias can be “easily” eliminated by stratification, covariate-adjustment or matching
STRATIFICATION
X = strength of evidence (Strong/Weak), T = has a lawyer (Yes/No)

Tab 1. Sample of Defendants          Tab 2. Numbers Convicted
X / T     Yes    No   All            X / T     Yes    No   All
Strong     40    10    70            Strong     30    10    40
Weak       10    20    30            Weak        5    15    20
All        50    50   100            All        35    25    60

Stata: tab X T (for Tab 1), tab X T if Convicted==1 (for Tab 2)
• Naive estimator: compare rates of conviction between Yes & No
  Treated: 35/50 = 70%   Untreated: 25/50 = 50%
• Naive answer: detrimental “effect” of lawyers of +20 percentage points!
⇒ But strength of evidence is related to both Lawyers and Convictions: selection bias
Better estimator: stratify by (condition on) strength of evidence
• Among Strong cases
  Treated: 30/40 = 75%   Untreated: 10/10 = 100%
  - Treatment effect: -25 pp
• Among Weak cases
  Treated: 5/10 = 50%   Untreated: 15/20 = 75%
  - Treatment effect: -25 pp
⇒ Hence the stratified estimator gives a treatment effect of -25 pp
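The within-stratum comparison above can be reproduced in Stata. The sketch below types in the four cell-level counts used in the Strong and Weak calculations (30/40, 10/10, 5/10, 15/20); the variable names evidence, lawyer, n and convicted are illustrative and not from any course data set.

```stata
* Cell counts copied from the within-stratum calculations above.
* Variable names are illustrative assumptions.
clear
input str6 evidence lawyer n convicted
"Strong" 1 40 30
"Strong" 0 10 10
"Weak"   1 10  5
"Weak"   0 20 15
end

* Conviction rate in each (evidence, lawyer) cell
gen double rate = convicted / n

* One row per stratum, with treated (1) and untreated (0) side by side
reshape wide n convicted rate, i(evidence) j(lawyer)
gen double effect = rate1 - rate0
list evidence rate1 rate0 effect, noobs

* Both strata give -0.25, so any weighted average of them is also -0.25
assert abs(effect + 0.25) < 1e-12
gen w = n0 + n1
summarize effect [aweight = w]
```

With unequal stratum effects, the choice of weights (stratum size, number of treated, and so on) would decide which average effect is being estimated.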
STRATIFICATION VERSUS REGRESSIONS
Stratification solves problems of selection on observables
However in practice, it is only appropriate in the simplest situations:
• with few variables affecting T and Y
• which are all categorical
• e.g. 1 dummy (strong/weak), 2 dummies (+ rich/poor), ...
In real life, assignment often depends on a large number of non-binary variables, so the sample must be stratified into a large number of groups (cells/strata)
⇒ problem known as the curse of dimensionality
Problem 1 with stratification: the curse of dimensionality
Assume we want to condition on (stratify by) k dummy variables: the number of different groups will be $2^k$
with k = 10, we have $2^{10} = 1024$ group-specific treatment effects to compare and average ($2^{11} = 2048$, $3^{10} = 59049$)
• computation can become long
• many cells will be empty or only contain treated or untreated observations: can’t compute a group-specific effect
> makes the estimated effect less general (i.e. local) as some observations are left out
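To get a feel for how quickly cells become unusable, here is a minimal simulation sketch. Everything in it is made up for illustration: 1,000 observations, 10 independent dummies x1-x10, and a treatment T assigned by coin flip.

```stata
* Simulated data: all names and sizes are illustrative assumptions
clear
set seed 1
set obs 1000
forvalues k = 1/10 {
    gen byte x`k' = runiform() < 0.5
}
gen byte T = runiform() < 0.5

display "possible cells with 10 dummies: " 2^10

* One cell per observed combination of the 10 dummies
egen cell = group(x1-x10)
bysort cell: egen n_treated = total(T)
bysort cell: gen n_cell = _N

* A cell is usable only if it holds both treated and untreated observations
gen byte usable = n_treated > 0 & n_treated < n_cell
egen byte tag = tag(cell)

count if tag            // non-empty cells (at most 1024)
count if tag & usable   // cells where a within-cell effect can be computed
count if usable         // observations that contribute to the estimate
```

Changing the number of dummies or the sample size in this sketch shows how the share of usable observations moves.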
Problem 2 with stratification: continuous variables
In real life, many variables are not categorical but continuous
• strong/weak and rich/poor are statistical constructions to ease calculation
• the true underlying variables are continuous in nature
⇒ stratification assumes homogeneity within groups
Regressions can easily solve both problems: many X, and a mix of categorical and continuous variables
COVARIATE ADJUSTMENT
Goal: conditional on X, treatment should be “as random”
Key: control appropriately for the effect of wealth and case strength
• Flexible specification:
  - only a linear effect: $Y_i = \beta_0 + \beta_1 Lawyer_i + \beta_2 Wealth_i + \varepsilon_i$
  - or a more flexible form: logarithmic, polynomial ($Wealth^2$, $Wealth^3$, ...), by categories/bins, linear + bins...
• Relevant data/variables:
  - Use data on the “best” variables explaining treatment assignment, instead of long-shot proxy variables
  - annual pre-tax income, disposable income, net wealth, gross wealth? Family wealth (to account for possible family support)?
  > a (linear?) combination of several variables, or some index?
Recall: do not condition on potential mediators (e.g. length of trial), as they will capture part of the true causal effect of T on Y
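The specification choices above can be sketched in Stata as follows. The data are simulated purely for illustration: the variables convicted, lawyer, strength and wealth, and the data-generating process (a true effect of -0.25 on the probability of conviction), are assumptions and not course data.

```stata
* Simulated defendants: names and parameters are illustrative assumptions
clear
set seed 12345
set obs 5000
gen double wealth   = exp(rnormal())      // skewed, continuous wealth
gen double strength = runiform()          // strength of evidence in [0,1]
gen byte lawyer = runiform() < invlogit(-1 + 2*strength + 0.3*ln(wealth))
gen byte convicted = runiform() < ///
    (0.35 + 0.4*strength - 0.02*ln(wealth) - 0.25*lawyer)

* Naive comparison: no controls, contaminated by selection
regress convicted lawyer

* Covariate adjustment, wealth entering linearly
regress convicted lawyer strength wealth

* More flexible forms for wealth: quadratic, logarithmic, bins
regress convicted lawyer strength c.wealth##c.wealth
gen double lwealth = ln(wealth)
regress convicted lawyer strength lwealth
xtile wealth_bin = wealth, nquantiles(5)
regress convicted lawyer strength i.wealth_bin
```

Comparing the coefficient on lawyer across these regressions gives a first impression of how much the functional form of the controls matters in a given application.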
ASSUMPTIONS
The key underlying assumptions:
• Conditional independence assumption (CIA, or unconfoundedness)
  - $(Y_i^1, Y_i^0) \perp T \mid X$
  CIA is not directly testable (you need to argue why it’s credible)
• Common support (or overlap)
  - $\Pr(T = 1 \mid X) \in (0,1)$
  common support is easily testable
+ SUTVA
Then stratification, covariate-adjustment and matching will work
REGRESSION ANATOMY
Under those assumptions, why exactly does covariate-adjustment work, i.e. give a causal effect of T on Y?
⇒ what do multiple regressions do?
We know that a simple regression with OLS: $Y_i = \beta_0 + \beta_1 X_{1i} + \varepsilon_i$
... gives $\hat{\beta}_1 = \frac{Cov(Y, X_1)}{Var(X_1)}$
And a multiple regression with OLS: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i$
... gives $\hat{\beta} = (X'X)^{-1}X'Y$ ... ?
To understand what it means, let’s turn to the regression anatomy theorem
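The regression anatomy result (closely related to the Frisch-Waugh-Lovell theorem) says that the multiple-regression coefficient on $X_1$ equals $Cov(Y, \tilde{X}_1) / Var(\tilde{X}_1)$, where $\tilde{X}_1$ is the residual from regressing $X_1$ on the other covariates. A minimal numerical check, using Stata's built-in auto data set as a stand-in (price, weight and mpg play the roles of Y, $X_1$ and $X_2$; that choice is only for illustration):

```stata
* Regression anatomy on Stata's example data (illustrative choice of variables)
sysuse auto, clear

* Multiple regression: coefficient on weight, controlling for mpg
regress price weight mpg
scalar b_full = _b[weight]

* Step 1: purge weight of its linear relation with mpg
regress weight mpg
predict double w_res, residuals

* Step 2: simple regression of price on the residualized weight
regress price w_res

* The two coefficients coincide (up to numerical precision)
display "multiple regression: " scalar(b_full) "   anatomy: " _b[w_res]
assert reldif(_b[w_res], scalar(b_full)) < 1e-8
```

In words: the coefficient on a covariate in a multiple regression only uses the variation in that covariate that is not explained by the other controls.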
SENSITIVITY TO CIA
We can estimate how sensitive the results are to potential confounders
Simulation approach:
• Simulate a “fake” variable F that is correlated with both T and Y
• Look at the effect of including this new covariate F on $\hat{\beta}_T$
• By comparing the $\hat{\beta}_T$s under different constructions of F (variance-covariance), document the sensitivity of your findings with respect to a violation of CIA
⇒ If $\hat{\beta}_T$ only disappears under “unrealistic” assumptions (very large correlations (F, T) and (F, Y)), then the effect is robust to potential selection on unobservables
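One rough way to code this simulation approach is sketched below. It is only one possible construction of F, not a standard routine: F is built as a weighted sum of standardized T, standardized Y and noise, and the weight (called the loading here) is increased step by step. The data and all names are simulated assumptions.

```stata
* Simulated example: names, parameters and the construction of F are assumptions
clear
set seed 2024
set obs 2000
gen double x = rnormal()
gen byte   T = (0.5*x + rnormal()) > 0
gen double y = 1*T + x + rnormal()

* Baseline estimate, controlling for the observed covariate only
regress y T x
scalar b_base = _b[T]

* Standardized versions of T and y, used to build the fake confounder
egen double zT = std(T)
egen double zy = std(y)

foreach c in 0.1 0.3 0.5 0.9 {
    capture drop F
    gen double F = `c'*zT + `c'*zy + rnormal()
    quietly correlate F T
    local rFT = r(rho)
    quietly correlate F y
    local rFY = r(rho)
    quietly regress y T x F
    display "loading `c': corr(F,T) = " %5.2f `rFT' ///
        "  corr(F,Y) = " %5.2f `rFY' ///
        "  beta_T = " %6.3f _b[T] "  (baseline " %6.3f scalar(b_base) ")"
}
```

The table of displayed lines shows how strong the correlations of F with T and Y have to be before the estimated effect moves substantially away from the baseline.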
Matching
MATCHING
Another popular method to deal with selection on observables is matching
Matching = appariement (in French)
Idea: make many pairs of similar individuals (i, j), one treated & one non-treated, and look at their average differences in outcomes
$$\widehat{ATT} = \frac{1}{N_1} \sum_{i: T_i = 1} (Y_i - Y_{j(i)})$$
where $Y_{j(i)}$ is the outcome of j, the non-treated individual closest to the treated i (i.e. the match for i)
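Note that we can also recover ATU and ATE with matching:
$$\widehat{ATU} = \frac{1}{N_0} \sum_{i: T_i = 0} (Y_{j(i)} - Y_i)$$
where j(i) is now the treated individual closest to the untreated i
$$\widehat{ATE} = \frac{N_1}{N} \widehat{ATT} + \frac{N_0}{N} \widehat{ATU}$$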
SIMPLE EXTENSIONS
Note that we can
• match on many dimensions, many X
  - that’s preferable to make CIA hold
• use several matches for a given i
  - that’s preferred to reduce variance
$$\widehat{ATT} = \frac{1}{N_1} \sum_{i: T_i = 1} \left( Y_i - \frac{1}{M} \sum_{m=1}^{M} Y_{j_m(i)} \right)$$
For now, the simplest case: 1x1 matching on one X
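A minimal sketch of nearest-neighbor matching on one X, using Stata's built-in teffects nnmatch command (available from Stata 13). The data are simulated for illustration, with a true treatment effect of 2; the names y, t and x are assumptions.

```stata
* Simulated data: names and parameters are illustrative assumptions
clear
set seed 42
set obs 1000
gen double x = rnormal()
gen byte   t = (x + rnormal()) > 0       // treatment depends on x
gen double y = 2*t + x + rnormal()       // true effect of t is 2

* Naive difference in means, biased because treated have higher x
regress y t

* 1x1 matching on x: each treated unit gets its closest untreated unit
teffects nnmatch (y x) (t), atet nneighbor(1)

* Same idea with M = 5 matches per treated unit
teffects nnmatch (y x) (t), atet nneighbor(5)

* ATE instead of ATT
teffects nnmatch (y x) (t), ate nneighbor(1)
```

Note that teffects nnmatch matches with replacement, so the same untreated unit can serve as the match for several treated units.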
1X1 MATCHING ON ONE X
ANOTHER EXAMPLE : 1X1 MATCHING ON ONE X
The estimated ATT after matching is 16426− 13982 = 2444
whereas before matching : 16426− 20724 = −4298
SEVERAL X
In practice, we usually need to match on many observable variables
⇒ difficult to find perfectly similar i and j on all X (exact matching)
Other methods:
• coarsened exact matching (“exact” matching within bins/ranges)
• distance-based matching
  - Euclidean distance: $\|x_i - x_j\| = \sqrt{(x_i - x_j)'(x_i - x_j)} = \sqrt{\sum_{k=1}^{K} (x_{ki} - x_{kj})^2}$
  - Normalized Euclidean distance, Mahalanobis distance
• propensity score matching
Distance-based and propensity score matching are most often used
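For reference, the two other distances named above are usually written as follows. The symbols $\hat{\Sigma}$ (sample covariance matrix of the covariates) and $\hat{\sigma}_k^2$ (sample variance of the k-th covariate) are notation introduced here, not taken from the slides.

```latex
\begin{align*}
d_{NE}(x_i, x_j) &= \sqrt{(x_i - x_j)' \,\mathrm{diag}(\hat{\Sigma})^{-1} (x_i - x_j)}
                  = \sqrt{\sum_{k=1}^{K} \frac{(x_{ki} - x_{kj})^2}{\hat{\sigma}_k^2}} \\
d_{M}(x_i, x_j)  &= \sqrt{(x_i - x_j)' \,\hat{\Sigma}^{-1} (x_i - x_j)}
\end{align*}
```

Both rescale the covariates so that a variable measured in large units (e.g. income in euros) does not dominate the distance; the Mahalanobis version also accounts for correlations between covariates.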
SEVERAL MATCHES
In practice, we often want to increase precision by using several matches for each i
• Single nearest neighbor matching
• k-nearest neighbors matching (e.g. k=5 or 10)
• Caliper (or radius) matching (maximal distance between i and j)
• Kernel matching (different weights by distance)
• etc.
Asymptotically, they are all similar; but in practice, this choice can matter
PROPENSITY SCORE MATCHING
As with distance-based matching, we want to aggregate all differences in X into a single index, the propensity score p(x)
p(x) measures the probability that individuals are treated (T = 1) based on their observables
• Among treated, some were very likely to be treated, some less so
• Among non-treated, some were very likely not to be treated, some less so
  - common support in p(x) between the two groups
Propensity score matching matches individuals with similar p(x) (but different actual treatment status)
⇒ need to estimate p(x)
To estimate p(x) for each individual (and then match neighbors), we usually use a probit (or logit) model:
$$\Pr(T = 1 \mid X) = \Pr(T^* > 0) = \Pr(X'\beta + \varepsilon > 0) = \Pr(\varepsilon > -X'\beta) = 1 - CDF(-X'\beta) = \Phi(X'\beta)$$
X are pre-determined variables (and interactions, polynomials, etc.) likely to explain T
and then predict the scores: $\hat{p}_i(x_i) = \Phi(X_i'\hat{\beta})$
⇒ $\hat{p}_i(x_i)$ ranges from 0 to 1 (if probit or logit is used)
⇒ Hopefully with common support and balance of x between the two groups
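In Stata, these two steps (estimate the probit, then predict the scores) can be sketched as below. The data are simulated and the names t, x1, x2 and pscore are illustrative assumptions.

```stata
* Simulated data: names and parameters are illustrative assumptions
clear
set seed 7
set obs 2000
gen double x1 = rnormal()
gen double x2 = runiform()
gen byte   t  = (0.8*x1 - 0.5*x2 + rnormal()) > 0

* Probit of treatment on pre-determined covariates (plus a squared term)
probit t x1 x2 c.x1#c.x1

* Predicted probability of treatment = estimated propensity score
predict double pscore, pr

* Scores lie between 0 and 1 by construction
summarize pscore
assert inrange(pscore, 0, 1)

* Replacing probit with logit gives the logit-based score
logit t x1 x2 c.x1#c.x1
predict double pscore_logit, pr
```

The two sets of scores are typically very close; a scatter plot of pscore against pscore_logit is a quick way to check this in a given application.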
MAIN PRACTICAL ISSUES
Check common support: compare the two distributions of p(x)
Check balance of covariates: use simple t-tests, tests of proportions, or the standardized bias:
if std bias > 20%, the difference is still “large”
Be careful about inference: propensity score matching is a two-step process, so you need to adjust your standard errors (using bootstrap)
Many other choices to make: type of matching (1-1, 1-5, caliper, kernel, etc.), with or without replacement...
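The first two checks can be done by hand, as in the sketch below on simulated data (all names are assumptions). For the standardized bias, the code uses a commonly cited definition, 100 × (difference in means) / sqrt((Var treated + Var untreated) / 2); this formula is an assumption here and may differ from the one intended on the slide.

```stata
* Simulated data: names and parameters are illustrative assumptions
clear
set seed 7
set obs 2000
gen double x1 = rnormal()
gen double x2 = runiform()
gen byte   t  = (0.8*x1 - 0.5*x2 + rnormal()) > 0

probit t x1 x2
predict double pscore, pr

* Common support: overlay the two distributions of the propensity score
twoway (kdensity pscore if t == 1) (kdensity pscore if t == 0), ///
    legend(order(1 "Treated" 2 "Untreated")) xtitle("Propensity score")

* Balance before matching: t-test and standardized bias for each covariate
foreach v of varlist x1 x2 {
    ttest `v', by(t)
    quietly summarize `v' if t == 1
    scalar m1 = r(mean)
    scalar v1 = r(Var)
    quietly summarize `v' if t == 0
    scalar m0 = r(mean)
    scalar v0 = r(Var)
    display "standardized bias of `v' (%): " ///
        %6.1f 100*(m1 - m0)/sqrt((v1 + v0)/2)
}
```

The same statistics would then be recomputed on the matched sample to see whether matching has brought the standardized bias down.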
BONUS 3 : PRISON-BASED EDUCATION AND RECIDIVISM
Goal: make a 1-page critical review of the paper/chapter
• brief summary of the paper (topic, method, main points, results)
• discuss method, experimental design, interpretations, conclusions
• relate it to the class
• criticisms, shortcomings?
Send PDF by email before next Monday (noon) at [email protected]
EXAMPLE : PRISON-BASED EDUCATION AND RECIDIVISM
Data on 31,000 prisoners released in New York State between 2005 and 2008
They follow recidivism within 3 years (rearrest)
Only 347 of them received a college degree in prison
Challenge: make those 347 graduates as comparable as possible to other prisoners not getting a college degree
Method: match prisoners based on their propensity to get a degree, predicted from 47 covariates
⇒ 1-1 nearest neighbor matching with a caliper of 0.01
APPLICATION ON STATA
Let’s imagine we want to estimate the effect of halfway houses (semi-liberté) instead of prison on recidivism in a sample of offenders sentenced to prison in France
• allows convicts to work, train, follow classes (probably good for reentry)
• requires them to return to “custody” every night (probably ok to monitor offenders)
• often perceived as less punitive (possibly bad for future deterrence)
⇒ what’s the net causal effect on recidivism, after accounting for selection?
Main assumption: the Conditional Independence Assumption holds after matching on the propensity score
In Stata, we can simply use psmatch2
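A minimal sketch of what such a psmatch2 session could look like. psmatch2 is a user-written package (installed once with ssc install psmatch2) that also provides pstest and psgraph. The data below are simulated stand-ins, and the variable names halfway, recid, age and prior are hypothetical, not those of the course data set.

```stata
* ssc install psmatch2      // user-written package, run once

* Simulated offenders: names and parameters are illustrative assumptions
clear
set seed 99
set obs 3000
gen age = 18 + int(40*runiform())
gen byte prior   = runiform() < 0.4                     // prior conviction
gen byte halfway = runiform() < invlogit(-1 + 0.03*(age - 18) - 0.8*prior)
gen byte recid   = runiform() < ///
    invlogit(-0.5 - 0.02*(age - 18) + 0.9*prior - 0.3*halfway)

* 1-1 nearest neighbor matching on the (probit) propensity score,
* with a caliper of 0.01 and restricted to the common support
psmatch2 halfway age prior, outcome(recid) neighbor(1) caliper(0.01) common

* Covariate balance before and after matching, and a graph of the scores
pstest age prior, both
psgraph

* Other matching choices (uncomment to try)
* psmatch2 halfway age prior, outcome(recid) neighbor(5) common
* psmatch2 halfway age prior, outcome(recid) kernel common
* psmatch2 halfway age prior, outcome(recid) logit ate common
```

The first psmatch2 call reports the unmatched difference in recidivism and the ATT after matching; the last commented line switches to a logit score and also reports ATU and ATE.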