Dummy variables
Introduction to Econometrics
Francois Cocquemas
March 16, 2010
These slides correspond to Chapter 7 in Wooldridge.
Describing qualitative data
Much of the data of interest to econometricians is not quantitative. For instance, the gender of individuals, whether they are married, the industry of a firm, and countries or regions are all qualitative.
How do we include this in a regression?
In many cases, the information can be described as true or false, or as a characteristic being present or absent. In those cases, it is easy to set up a binary variable, or dummy variable, taking the values 0 and 1.
Be careful when you define which value corresponds to which characteristic. For instance, male is usually set to 1 when the individual is male and 0 when female, while a variable named female would be coded the opposite way. Both names are clearer than a variable called gender. The choice does not change the results, but it does change their interpretation!
Describing categories or ranges
Dummy variables are also useful to describe categories. Indeed, even if the variable is not binary, if it takes a finite number of values then it can be described by a complete set of dummy variables.
For instance, if eye colour can be brown, blue, green or red, we can have four dummy variables, one for each colour, taking the value 1 whenever an individual has eyes of that colour. More complex for Bowie.
Notice that summing all variables in a complete set should give you 1 for every observation!
This technique can also be useful for quantitative data that you do not believe should be treated as one continuous variable. A set of dummy variables for several ranges allows you to distinguish the effects of what you might see as “thresholds”.
Example: in the Mincer equation, we often use dummy variables for high school dropouts, high school graduates, etc.
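As a minimal sketch of the two ideas above, the R snippet below builds a complete set of eye-colour dummies and checks that they sum to 1, then cuts a quantitative variable into ranges. The data, the `educ` thresholds and the range labels are made up for illustration; they are not taken from the slides.

```r
# Hypothetical data: eye colour for six individuals
eyes <- factor(c("brown", "blue", "green", "brown", "red", "blue"),
               levels = c("brown", "blue", "green", "red"))

# One dummy column per colour; "- 1" removes the intercept so that
# model.matrix() returns the complete set rather than dropping a level
dummies <- model.matrix(~ eyes - 1)
colnames(dummies)   # eyesbrown, eyesblue, eyesgreen, eyesred
rowSums(dummies)    # every observation sums to 1

# Turning a quantitative variable into ranges (thresholds are assumptions)
educ  <- c(8, 12, 16, 11, 14, 18)   # years of schooling, made up
level <- cut(educ, breaks = c(0, 11, 12, Inf),
             labels = c("dropout", "highschool", "college"))
table(level)
```

The factor `level` can then be expanded into range dummies the same way as `eyes`.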
Using a dummy variable in a regression
Including a dummy variable in a standard OLS regression is as simple as including any other variable:
$$wage = \beta_0 + \delta_0\, female + \beta_1\, educ + u$$
The coefficient $\delta_0$ is the difference in hourly wage between women and men with the same amount of education (and the same error term). If it is negative, women earn less than men on average, holding education fixed.
This coefficient can be seen as an intercept shift.
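To make the intercept shift concrete, here is a small sketch on simulated data. The sample size, the "true" coefficients (2, -1.5 and 0.6) and the variable distributions are assumptions chosen only for illustration.

```r
set.seed(1)
n      <- 200
female <- rbinom(n, 1, 0.5)                 # 1 = female, 0 = male
educ   <- sample(8:18, n, replace = TRUE)   # years of education
# Simulated wages: the coefficient on female is set to -1.5 by construction
wage   <- 2 - 1.5 * female + 0.6 * educ + rnorm(n)

fit <- lm(wage ~ female + educ)
b   <- coef(fit)

b["female"]                      # estimated intercept shift
b["(Intercept)"]                 # estimated intercept for men
b["(Intercept)"] + b["female"]   # estimated intercept for women
b["educ"]                        # common slope for both groups
```

Both groups share the slope on `educ`; only the intercept differs, by the coefficient on `female`.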
Intercept shift
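[Figure: wage (vertical axis) against educ (horizontal axis). Two parallel lines with slope $\beta_1$: for men, $wage = \beta_0 + \beta_1\, educ$, with intercept $\beta_0$; for women, $wage = (\beta_0 + \delta_0) + \beta_1\, educ$, with intercept $\beta_0 + \delta_0$.]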
Using a set of dummy variables
What happens if we use a complete set of dummy variables?
$$wage = \beta_0 + \delta_0\, eyesbrown + \delta_1\, eyesblue + \delta_2\, eyesgreen + \delta_3\, eyesred + \beta_1\, educ + u$$
The four dummies sum to one, hence we have perfect collinearity. The regression cannot identify the coefficients separately: it is as if we had added a single variable always equal to one, which duplicates the intercept.
One possible way out is to drop the intercept. Each dummy coefficient is then interpreted as the intercept for that specific group.
Another (more common) possibility is to drop one variable from the set. The dropped group becomes the baseline, and the other dummy coefficients read directly as the difference from this baseline.
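The baseline interpretation can be spelled out as follows. This is a sketch that assumes eyesbrown is the dropped dummy (any of the four would do), writes $u$ for the error term, and assumes $E(u \mid \text{dummies}, educ) = 0$.

```latex
\begin{align*}
eyesbrown + eyesblue + eyesgreen + eyesred &= 1
  \quad \text{(identical to the intercept column)} \\
wage &= \beta_0 + \delta_1\, eyesblue + \delta_2\, eyesgreen
        + \delta_3\, eyesred + \beta_1\, educ + u \\
E(wage \mid brown, educ) &= \beta_0 + \beta_1\, educ \\
E(wage \mid blue, educ)  &= \beta_0 + \delta_1 + \beta_1\, educ \\
\delta_1 &= E(wage \mid blue, educ) - E(wage \mid brown, educ)
\end{align*}
```

So $\beta_0$ is the intercept of the baseline group, and each remaining $\delta_j$ is that group's intercept shift relative to the baseline.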
Dummy variables in R
By default, R will automatically drop the last dummy variable if you provide a complete set (lm reports its coefficient as NA).
However, you are well advised to do it yourself, as this will help with the interpretation, and also because other software may not be as kind.
There are many ways to create dummy variables from qualitative data. One of the easiest is to use ifelse (see ?ifelse).
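A short sketch of both points, on a made-up data frame (the column names `gender`, `educ`, `wage` and all values are hypothetical): `ifelse` creates the dummies, and `lm` shows what happens when the complete set is supplied together with an intercept.

```r
# Hypothetical raw data with a qualitative column
d <- data.frame(
  gender = c("F", "M", "F", "M", "F", "M", "F", "M"),
  educ   = c(12, 12, 16, 10, 14, 16, 10, 14),
  wage   = c(9.5, 11, 13, 8.5, 11.5, 14, 7.5, 12)
)

# ifelse(test, yes, no) is vectorised over the whole column
d$female <- ifelse(d$gender == "F", 1, 0)
d$male   <- ifelse(d$gender == "M", 1, 0)

# Complete set plus intercept: perfectly collinear, so lm() cannot
# estimate the last dummy and reports NA for it
full <- coef(lm(wage ~ female + male + educ, data = d))
full

# Dropping one dummy yourself: men are the baseline group
base <- coef(lm(wage ~ female + educ, data = d))
base
```

Dropping `male` explicitly makes it obvious from the formula which group is the baseline.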
Example from Alesina, Algan, Cahuc and Giuliano (2009)
Table 1: Family ties and labor regulation

Dependent variable     Firing cost    State regulation    Ln(GDP per capita)
                                      of minimum wage
                       (1)            (2)                 (3)
Strong family ties     .30?**         .01?**              -1.99***
                       (.134)         (.009)              (.4?2)
Legal origins
Common law origin      Reference
Civil law origin       .3?3***        -.000               -.03?
                       (.113)         (.009)              (.422)
Scandinavian origin    .0?2           -.021*              1.042
                       (.222)         (.010)              (.?24)
German origin          .242           -.00?               .11?
                       (.14?)         (.00?)              (.?41)
Observations           ?0             4?                  ?3
R²                     .2?            .40                 .32

Source: World Values Survey, Alesina and Giuliano (2009), ILO (2009) and Botero et al. (2004)
A ? marks a digit that could not be recovered from the extracted text.
Fixed effects
Dummy variables are also frequently used as fixed effects.
Typically, we might add time fixed effects to our regression to capture structural changes over the sample period. For instance, this could be a dummy variable for each year or each period (minus one).
In many cases, it is also useful to define a set of individual fixed effects to capture all unobserved individual characteristics that are constant over time.
This might lead to a potentially large number of dummy variables, which is usually not a problem with modern computers.
However, you must have several observations for each individual, or you will have no degrees of freedom left!
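The sketch below illustrates fixed effects as dummy variables on a simulated panel. The panel dimensions, the individual effects `alpha` and the slope of 2 are assumptions made for the example. Wrapping a variable in `factor()` lets `lm` build the dummies and drop one per set as the baseline.

```r
set.seed(42)
# Hypothetical balanced panel: 5 individuals observed over 4 years
panel <- expand.grid(id = 1:5, year = 2001:2004)
alpha <- c(1, 3, 5, 7, 9)   # unobserved individual effects (made up)
panel$x <- rnorm(nrow(panel)) + alpha[panel$id] / 3
panel$y <- alpha[panel$id] + 2 * panel$x + rnorm(nrow(panel), sd = 0.5)

# Individual and time fixed effects as two sets of dummies
fe <- lm(y ~ x + factor(id) + factor(year), data = panel)
coef(fe)["x"]
df.residual(fe)   # 20 observations - 9 coefficients = 11

# With a single observation per individual, the individual dummies
# absorb everything and no degrees of freedom remain
one_year <- subset(panel, year == 2001)
df.residual(lm(y ~ x + factor(id), data = one_year))   # 0
```

The nine coefficients in the first regression are the intercept, the slope on `x`, four individual dummies and three year dummies.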