Lecture 4: Multiple OLS Regression


Transcript of Lecture 4: Multiple OLS Regression

Page 1: Lecture 4: Multiple OLS Regression

Lecture 4: Multiple OLS Regression
Introduction to Econometrics, Fall 2020

Zhaopeng Qu

Nanjing University

10/15/2020


Page 2: Lecture 4: Multiple OLS Regression

1 Review for the previous lectures

2 Multiple OLS Regression: Introduction

3 Multiple OLS Regression: Estimation

4 Partitioned Regression: OLS Estimators in Multiple Regression

5 Measures of Fit in Multiple Regression

6 Categorical Variables as Independent Variables in Regression

7 Multiple Regression: Assumption

8 Properties of OLS Estimators in Multiple Regression

9 Multiple OLS Regression and Causality


Page 3: Lecture 4: Multiple OLS Regression

Section 1

Review for the previous lectures


Page 4: Lecture 4: Multiple OLS Regression

Simple OLS formula

The linear regression model with one regressor is denoted by

𝑌𝑖 = 𝛽0 + 𝛽1𝑋𝑖 + 𝑢𝑖

where

𝑌𝑖 is the dependent variable (Test Score)

𝑋𝑖 is the independent variable or regressor (Class Size or Student-Teacher Ratio)

𝑢𝑖 is the error term, which contains all the other factors besides 𝑋 that determine the value of the dependent variable 𝑌 for a specific observation 𝑖.

Page 5: Lecture 4: Multiple OLS Regression

The OLS Estimator

The estimators of the slope and intercept that minimize the sum of the squared prediction mistakes ûᵢ, thus

arg min over (𝑏0, 𝑏1) of ∑ⁿᵢ₌₁ ûᵢ² = min over (𝑏0, 𝑏1) of ∑ⁿᵢ₌₁ (𝑌𝑖 − 𝑏0 − 𝑏1𝑋𝑖)²

are called the ordinary least squares (OLS) estimators of 𝛽0 and 𝛽1.

OLS estimator of 𝛽1:

β̂1 = ∑ⁿᵢ₌₁ (𝑋𝑖 − X̄)(𝑌𝑖 − Ȳ) / ∑ⁿᵢ₌₁ (𝑋𝑖 − X̄)(𝑋𝑖 − X̄)
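The formula is easy to check by hand. A minimal sketch, assuming the California schools data frame ca (with columns testscr and str) used in the examples later in this lecture is loaded; the names x, y, b0, b1 are illustrative:

x <- ca$str; y <- ca$testscr
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope formula
b0 <- mean(y) - b1 * mean(x)                                     # intercept
c(b0, b1)                            # manual OLS estimates
coef(lm(testscr ~ str, data = ca))   # should match: 698.933, -2.280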

Page 6: Lecture 4: Multiple OLS Regression

Least Squares Assumptions

Under the 3 least squares assumptions,

1 Assumption 1: 𝐸(𝑢𝑖|𝑋𝑖) = 0
2 Assumption 2: (𝑋𝑖, 𝑌𝑖), 𝑖 = 1, ..., 𝑛 are i.i.d.
3 Assumption 3: large outliers are unlikely

the OLS estimators will be

1 unbiased
2 consistent
3 normally distributed in large samples

Page 7: Lecture 4: Multiple OLS Regression

Section 2

Multiple OLS Regression: Introduction


Page 8: Lecture 4: Multiple OLS Regression

Violation of the first Least Squares Assumption

Recall simple OLS regression equation

𝑌𝑖 = 𝛽0 + 𝛽1𝑋𝑖 + 𝑢𝑖

Question: What does 𝑢𝑖 represent?
Answer: It contains all other factors (variables) which potentially affect 𝑌𝑖.

Assumption 1: 𝐸(𝑢𝑖|𝑋𝑖) = 0

It states that 𝑢𝑖 is unrelated to 𝑋𝑖 in the sense that, given a value of 𝑋𝑖, the mean of these other factors equals zero. But what if they (or at least one of them) are correlated with 𝑋𝑖?

Page 9: Lecture 4: Multiple OLS Regression

Example: Class Size and Test Score

Many other factors can affect students' performance in school.

One of these factors is the share of immigrants in the class (school, district), because immigrant children may have different backgrounds from native children, such as

parents' education level
family income and wealth
parenting style
traditional culture

Page 10: Lecture 4: Multiple OLS Regression

Scatter Plot: The share of immigrants and STR

[Scatter plot omitted: x-axis is the share of immigrants (0-75); y-axis is the student-teacher ratio (14-26).]

Higher share of immigrants, bigger class size.

Page 11: Lecture 4: Multiple OLS Regression

Scatter Plot: The share of immigrants and Test Scores

[Scatter plot omitted: x-axis is the share of immigrants (0-75); y-axis is test scores (630-690).]

Higher share of immigrants, lower test score.

Page 12: Lecture 4: Multiple OLS Regression

The share of immigrants as an Omitted Variable

Class size may be related to the percentage of English learners, and students who are still learning English likely have lower test scores.

In other words, the effect of class size on scores that we obtained in the simple OLS regression may contain an effect of immigrants on scores.

This implies that the percentage of English learners is contained in 𝑢𝑖, which in turn means that Assumption 1 is violated; more precisely, the estimates of 𝛽1 and 𝛽0 are biased and inconsistent.

Page 13: Lecture 4: Multiple OLS Regression

Omitted Variable Bias: Introduction

As before, 𝑋𝑖 and 𝑌𝑖 represent STR and Test Score.

Besides, 𝑊𝑖 is the variable which represents the share of English learners.

Suppose that we have no information about it for some reason; then we have to omit it from the regression.

Then we have two regressions. True model (long regression):

𝑌𝑖 = 𝛽0 + 𝛽1𝑋𝑖 + 𝛾𝑊𝑖 + 𝑢𝑖

where 𝐸(𝑢𝑖|𝑋𝑖, 𝑊𝑖) = 0

OVB model (short regression):

𝑌𝑖 = 𝛽0 + 𝛽1𝑋𝑖 + 𝑣𝑖

where 𝑣𝑖 = 𝛾𝑊𝑖 + 𝑢𝑖

Page 14: Lecture 4: Multiple OLS Regression

Omitted Variable Bias: Biasedness

Let us see the consequence of OVB:

E[β̂1] = E[ ∑ᵢ(𝑋𝑖 − X̄)(𝛽0 + 𝛽1𝑋𝑖 + 𝑣𝑖 − (𝛽0 + 𝛽1X̄ + v̄)) / ∑ᵢ(𝑋𝑖 − X̄)(𝑋𝑖 − X̄) ]

= E[ ∑ᵢ(𝑋𝑖 − X̄)(𝛽0 + 𝛽1𝑋𝑖 + 𝛾𝑊𝑖 + 𝑢𝑖 − (𝛽0 + 𝛽1X̄ + 𝛾W̄ + ū)) / ∑ᵢ(𝑋𝑖 − X̄)(𝑋𝑖 − X̄) ]

Skipping several steps of algebra, which are very similar to the procedure for proving the unbiasedness of β̂1, and using the Law of Iterated Expectations (LIE) again, we get (please prove it by yourself):

E[β̂1] = 𝛽1 + 𝛾 E[ ∑ᵢ(𝑋𝑖 − X̄)(𝑊𝑖 − W̄) / ∑ᵢ(𝑋𝑖 − X̄)(𝑋𝑖 − X̄) ]

Page 15: Lecture 4: Multiple OLS Regression

Omitted Variable Bias: Biasedness

As in the proof of unbiasedness of β̂1, we can see:

If 𝑊𝑖 is unrelated to 𝑋𝑖, then E[β̂1] = 𝛽1, because

E[ ∑ᵢ(𝑋𝑖 − X̄)(𝑊𝑖 − W̄) / ∑ᵢ(𝑋𝑖 − X̄)(𝑋𝑖 − X̄) ] = 0

If 𝑊𝑖 is not a determinant of 𝑌𝑖, which means that 𝛾 = 0, then E[β̂1] = 𝛽1, too.

Only if both conditions above are violated simultaneously is β̂1 biased; this is what is normally called Omitted Variable Bias (OVB).

Page 16: Lecture 4: Multiple OLS Regression

Omitted Variable Bias (OVB): Inconsistency

Recall: the simple OLS estimator is consistent when n is large, with plim β̂1 = Cov(𝑋𝑖, 𝑌𝑖)/Var(𝑋𝑖). Thus

plim β̂1 = Cov(𝑋𝑖, 𝑌𝑖) / Var(𝑋𝑖)
= Cov(𝑋𝑖, (𝛽0 + 𝛽1𝑋𝑖 + 𝑣𝑖)) / Var(𝑋𝑖)
= Cov(𝑋𝑖, (𝛽0 + 𝛽1𝑋𝑖 + 𝛾𝑊𝑖 + 𝑢𝑖)) / Var(𝑋𝑖)
= (Cov(𝑋𝑖, 𝛽0) + 𝛽1Cov(𝑋𝑖, 𝑋𝑖) + 𝛾Cov(𝑋𝑖, 𝑊𝑖) + Cov(𝑋𝑖, 𝑢𝑖)) / Var(𝑋𝑖)
= 𝛽1 + 𝛾 Cov(𝑋𝑖, 𝑊𝑖) / Var(𝑋𝑖)


Page 18: Lecture 4: Multiple OLS Regression

Omitted Variable Bias (OVB): Inconsistency

Thus we obtain

plim β̂1 = 𝛽1 + 𝛾 Cov(𝑋𝑖, 𝑊𝑖) / Var(𝑋𝑖)

β̂1 is still consistent
if 𝑊𝑖 is unrelated to X, so that Cov(𝑋𝑖, 𝑊𝑖) = 0, or
if 𝑊𝑖 has no effect on 𝑌𝑖, so that 𝛾 = 0.

If both conditions above are violated simultaneously, then β̂1 is inconsistent.

Page 19: Lecture 4: Multiple OLS Regression

Omitted Variable Bias (OVB): Directions

If OVB is possible in our regression, then we should assess the direction of the bias, in case we cannot eliminate it. Since the bias equals 𝛾 Cov(𝑋𝑖, 𝑊𝑖)/Var(𝑋𝑖), its sign is the product of the signs of 𝛾 and Cov(𝑋𝑖, 𝑊𝑖).

Summary of the bias when 𝑊𝑖 is omitted from the estimating equation:

              Cov(𝑋𝑖, 𝑊𝑖) > 0    Cov(𝑋𝑖, 𝑊𝑖) < 0
𝛾 > 0        Positive bias        Negative bias
𝛾 < 0        Negative bias        Positive bias
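To make the table concrete, here is a minimal simulation sketch with hypothetical data (not the California data set): 𝛾 > 0 and Cov(X, W) > 0, so omitting W biases the OLS slope upward.

set.seed(1)
n <- 10000
W <- rnorm(n)
X <- 0.8 * W + rnorm(n)             # Cov(X, W) > 0
Y <- 1 + 2 * X + 3 * W + rnorm(n)   # true beta1 = 2, gamma = 3 > 0
coef(lm(Y ~ X))["X"]                # about 2 + 3*Cov(X,W)/Var(X), roughly 3.46
coef(lm(Y ~ X + W))["X"]            # about 2: including W removes the bias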

Page 20: Lecture 4: Multiple OLS Regression

Omitted Variable Bias: Examples

Question: If we omit the following variables, what are the directions of these biases, and why?

1 Time of day of the test
2 Parking lot space per pupil
3 Teachers' salary
4 Family income
5 Percentage of English learners (the share of immigrants)

Page 21: Lecture 4: Multiple OLS Regression

Omitted Variable Bias: Examples

Regress Testscore on Class size:

##
## Call:
## lm(formula = testscr ~ str, data = ca)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -47.727 -14.251   0.483  12.822  48.540
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 698.9330     9.4675  73.825  < 2e-16 ***
## str          -2.2798     0.4798  -4.751 2.78e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18.58 on 418 degrees of freedom
## Multiple R-squared: 0.05124, Adjusted R-squared: 0.04897
## F-statistic: 22.58 on 1 and 418 DF, p-value: 2.783e-06

Page 22: Lecture 4: Multiple OLS Regression

Omitted Variable Bias: Examples

Regress Testscore on Class size and the percentage of English learners:

##
## Call:
## lm(formula = testscr ~ str + el_pct, data = ca)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -48.845 -10.240  -0.308   9.815  43.461
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 686.03225    7.41131  92.566  < 2e-16 ***
## str          -1.10130    0.38028  -2.896  0.00398 **
## el_pct       -0.64978    0.03934 -16.516  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 14.46 on 417 degrees of freedom
## Multiple R-squared: 0.4264, Adjusted R-squared: 0.4237
## F-statistic: 155 on 2 and 417 DF, p-value: < 2.2e-16

Page 23: Lecture 4: Multiple OLS Regression

Omitted Variable Bias: Examples

Table 2: Class Size and Test Score

Dependent variable: testscr

                    (1)            (2)
str              −2.280***      −1.101***
                 (0.480)        (0.380)
el_pct                          −0.650***
                                (0.039)
Constant        698.933***     686.032***
                 (9.467)        (7.411)
Observations       420            420
R2                0.051          0.426

Note: *p<0.1; **p<0.05; ***p<0.01
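The two columns are linked by the exact sample OVB algebra from the previous slides: the short-regression slope equals the long-regression slope plus γ̂ times the slope from regressing the omitted variable on str. A quick check, assuming the ca data frame is loaded (the name delta is illustrative):

delta <- coef(lm(el_pct ~ str, data = ca))["str"]  # slope of W on X
-1.10130 + (-0.64978) * delta                      # reproduces the short slope -2.2798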

Page 24: Lecture 4: Multiple OLS Regression

Wrap Up

OVB is the most likely bias when we run an OLS regression using nonexperimental data.

OVB means that some variables which should have been included in the regression actually were not.

The simplest way to overcome OVB is to put the omitted variables into the regression, which means our regression model is

𝑌𝑖 = 𝛽0 + 𝛽1𝑋𝑖 + 𝛾𝑊𝑖 + 𝑢𝑖

This strategy is informally called controlling, and it introduces the more general regression model: Multiple OLS Regression.

Page 25: Lecture 4: Multiple OLS Regression

Section 3

Multiple OLS Regression: Estimation


Page 26: Lecture 4: Multiple OLS Regression

Multiple regression model with k regressors

The multiple regression model is

𝑌𝑖 = 𝛽0 + 𝛽1𝑋1,𝑖 + 𝛽2𝑋2,𝑖 + ... + 𝛽𝑘𝑋𝑘,𝑖 + 𝑢𝑖, 𝑖 = 1, ..., 𝑛 (4.1)

where

𝑌𝑖 is the dependent variable
𝑋1, 𝑋2, ..., 𝑋𝑘 are the independent variables (including the one of interest to us and some control variables)
𝛽𝑗, 𝑗 = 1, ..., 𝑘 is the slope coefficient on the corresponding 𝑋𝑗
𝛽0 is the intercept, the value of Y when all 𝑋𝑗 = 0, 𝑗 = 1, ..., 𝑘
𝑢𝑖 is the regression error term, still representing all the other factors that affect the outcome

Page 27: Lecture 4: Multiple OLS Regression

Interpretation of the coefficients 𝛽𝑗, 𝑗 = 1, ..., 𝑘

𝛽𝑗 is the partial (marginal) effect of 𝑋𝑗 on Y:

𝛽𝑗 = ∂𝑌𝑖 / ∂𝑋𝑗,𝑖

𝛽𝑗 is also the partial (marginal) effect of 𝐸[𝑌𝑖|𝑋1, ..., 𝑋𝑘]:

𝛽𝑗 = ∂𝐸[𝑌𝑖|𝑋1, ..., 𝑋𝑘] / ∂𝑋𝑗,𝑖

This means that we estimate the effect of X on Y with "other things equal", thus the concept of ceteris paribus.

Page 28: Lecture 4: Multiple OLS Regression

OLS Estimation with Multiple Regressors

As in simple OLS regression, the estimators of the multiple OLS regression minimize the sum of squared prediction mistakes:

arg min over (𝑏0, 𝑏1, ..., 𝑏𝑘) of ∑ⁿᵢ₌₁ (𝑌𝑖 − 𝑏0 − 𝑏1𝑋1,𝑖 − ... − 𝑏𝑘𝑋𝑘,𝑖)²

where 𝑏0 = β̂0, 𝑏1 = β̂1, ..., 𝑏𝑘 = β̂𝑘 are the estimators.
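A numerical sketch of this minimization, assuming the ca data frame from the examples is loaded (the names ssr and opt are illustrative): minimize the SSR directly with optim() and compare with the closed-form lm() solution.

ssr <- function(b) sum((ca$testscr - b[1] - b[2] * ca$str - b[3] * ca$el_pct)^2)
opt <- optim(c(600, 0, 0), ssr, control = list(maxit = 5000, reltol = 1e-12))
opt$par                                      # approx (686.03, -1.101, -0.650)
coef(lm(testscr ~ str + el_pct, data = ca))  # the exact OLS solution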


Page 29: Lecture 4: Multiple OLS Regression

OLS Estimation with Multiple Regressors

As in simple OLS, based on the F.O.C., the multiple OLS estimators β̂0, β̂1, ..., β̂𝑘 are obtained by solving the following system of normal equations:

∂/∂𝑏0 ∑ⁿᵢ₌₁ ûᵢ² = ∑ᵢ (𝑌𝑖 − β̂0 − β̂1𝑋1,𝑖 − ... − β̂𝑘𝑋𝑘,𝑖) = 0

∂/∂𝑏1 ∑ⁿᵢ₌₁ ûᵢ² = ∑ᵢ (𝑌𝑖 − β̂0 − β̂1𝑋1,𝑖 − ... − β̂𝑘𝑋𝑘,𝑖)𝑋1,𝑖 = 0

⋮

∂/∂𝑏𝑘 ∑ⁿᵢ₌₁ ûᵢ² = ∑ᵢ (𝑌𝑖 − β̂0 − β̂1𝑋1,𝑖 − ... − β̂𝑘𝑋𝑘,𝑖)𝑋𝑘,𝑖 = 0

Page 30: Lecture 4: Multiple OLS Regression

OLS Estimation with Multiple Regressors

Similar to simple OLS, the fitted residuals are

ûᵢ = 𝑌𝑖 − β̂0 − β̂1𝑋1,𝑖 − ... − β̂𝑘𝑋𝑘,𝑖

Therefore, the normal equations also can be written as

∑ᵢ ûᵢ = 0
∑ᵢ ûᵢ𝑋1,𝑖 = 0
⋮
∑ᵢ ûᵢ𝑋𝑘,𝑖 = 0

While it is convenient to transform the equations above using matrix algebra to compute these estimators, we can use partitioned regression to obtain the formula of the estimators without using matrices.
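These orthogonality conditions are easy to verify numerically, a quick sketch assuming ca is loaded (the name u.hat is illustrative): the OLS residuals sum to zero and are orthogonal in sample to each regressor.

u.hat <- resid(lm(testscr ~ str + el_pct, data = ca))
sum(u.hat)              # ~ 0, up to floating-point error
sum(u.hat * ca$str)     # ~ 0
sum(u.hat * ca$el_pct)  # ~ 0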

Page 31: Lecture 4: Multiple OLS Regression

Section 4

Partitioned Regression: OLS Estimators in Multiple Regression


Page 32: Lecture 4: Multiple OLS Regression

Partitioned regression: OLS estimators

A useful representation of β̂𝑗 can be obtained by partitioned regression.

The OLS estimator of 𝛽𝑗, 𝑗 = 1, 2, ..., 𝑘 can be computed in three steps:

1 Regress 𝑋𝑗 on 𝑋1, 𝑋2, ..., 𝑋𝑗−1, 𝑋𝑗+1, ..., 𝑋𝑘, thus
𝑋𝑗,𝑖 = 𝛾0 + 𝛾1𝑋1,𝑖 + ... + 𝛾𝑗−1𝑋𝑗−1,𝑖 + 𝛾𝑗+1𝑋𝑗+1,𝑖 + ... + 𝛾𝑘𝑋𝑘,𝑖 + 𝑣𝑗,𝑖

2 Obtain the residuals from the regression above, denoted X̃𝑗,𝑖 = v̂𝑗,𝑖

3 Regress 𝑌 on X̃𝑗,𝑖

The last step implies that the OLS estimator of 𝛽𝑗 can be expressed as follows:

β̂𝑗 = ∑ⁿᵢ₌₁ (X̃𝑗,𝑖 − mean of X̃𝑗)(𝑌𝑖 − Ȳ) / ∑ⁿᵢ₌₁ (X̃𝑗,𝑖 − mean of X̃𝑗)² = ∑ⁿᵢ₌₁ X̃𝑗,𝑖𝑌𝑖 / ∑ⁿᵢ₌₁ X̃𝑗,𝑖²

where the second equality uses the fact that the residuals X̃𝑗,𝑖 have sample mean zero.

Page 33: Lecture 4: Multiple OLS Regression

Partitioned regression: OLS estimators

Suppose we want to obtain an expression for β̂1.

The first step: regress 𝑋1,𝑖 on the other regressors, thus

𝑋1,𝑖 = 𝛾0 + 𝛾2𝑋2,𝑖 + ... + 𝛾𝑘𝑋𝑘,𝑖 + 𝑣𝑖

Then we can obtain

𝑋1,𝑖 = γ̂0 + γ̂2𝑋2,𝑖 + ... + γ̂𝑘𝑋𝑘,𝑖 + X̃1,𝑖

where X̃1,𝑖 is the fitted OLS residual, thus X̃1,𝑖 = v̂1,𝑖.

Then we can prove that

β̂1 = ∑ⁿᵢ₌₁ X̃1,𝑖𝑌𝑖 / ∑ⁿᵢ₌₁ X̃1,𝑖²

Page 34: Lecture 4: Multiple OLS Regression

Proof of the partitioned regression result (1)

Recall that ûᵢ are the residuals of the multiple OLS regression equation, thus

𝑌𝑖 = β̂0 + β̂1𝑋1,𝑖 + β̂2𝑋2,𝑖 + ... + β̂𝑘𝑋𝑘,𝑖 + ûᵢ

Then we have

∑ᵢ ûᵢ = ∑ᵢ ûᵢ𝑋𝑗,𝑖 = 0,  𝑗 = 1, 2, ..., 𝑘

Likewise, X̃1,𝑖 are the residuals of the partitioned regression of 𝑋1,𝑖 on 𝑋2,𝑖, ..., 𝑋𝑘,𝑖, so we have

∑ᵢ X̃1,𝑖 = ∑ᵢ X̃1,𝑖𝑋2,𝑖 = ... = ∑ᵢ X̃1,𝑖𝑋𝑘,𝑖 = 0

Additionally, because X̃1,𝑖 = 𝑋1,𝑖 − γ̂0 − γ̂2𝑋2,𝑖 − ... − γ̂𝑘𝑋𝑘,𝑖, it follows that

∑ᵢ ûᵢX̃1,𝑖 = 0

Page 35: Lecture 4: Multiple OLS Regression

Proof of the partitioned regression result (2)

∑ⁿᵢ₌₁ X̃1,𝑖𝑌𝑖 / ∑ⁿᵢ₌₁ X̃1,𝑖² = ∑ᵢ X̃1,𝑖(β̂0 + β̂1𝑋1,𝑖 + β̂2𝑋2,𝑖 + ... + β̂𝑘𝑋𝑘,𝑖 + ûᵢ) / ∑ᵢ X̃1,𝑖²

= β̂0 ∑ᵢ X̃1,𝑖 / ∑ᵢ X̃1,𝑖² + β̂1 ∑ᵢ X̃1,𝑖𝑋1,𝑖 / ∑ᵢ X̃1,𝑖² + ... + β̂𝑘 ∑ᵢ X̃1,𝑖𝑋𝑘,𝑖 / ∑ᵢ X̃1,𝑖² + ∑ᵢ X̃1,𝑖ûᵢ / ∑ᵢ X̃1,𝑖²

= β̂1 ∑ᵢ X̃1,𝑖𝑋1,𝑖 / ∑ᵢ X̃1,𝑖²

Page 36: Lecture 4: Multiple OLS Regression

Proof of the partitioned regression result (3)

∑ⁿᵢ₌₁ X̃1,𝑖𝑋1,𝑖 = ∑ⁿᵢ₌₁ X̃1,𝑖(γ̂0 + γ̂2𝑋2,𝑖 + ... + γ̂𝑘𝑋𝑘,𝑖 + X̃1,𝑖)
= γ̂0 · 0 + γ̂2 · 0 + ... + γ̂𝑘 · 0 + ∑ᵢ X̃1,𝑖²
= ∑ᵢ X̃1,𝑖²

Then

∑ⁿᵢ₌₁ X̃1,𝑖𝑌𝑖 / ∑ⁿᵢ₌₁ X̃1,𝑖² = β̂1 ∑ᵢ X̃1,𝑖𝑋1,𝑖 / ∑ᵢ X̃1,𝑖² = β̂1

An identical argument works for 𝑗 = 2, 3, ..., 𝑘, thus

β̂𝑗 = ∑ⁿᵢ₌₁ X̃𝑗,𝑖𝑌𝑖 / ∑ⁿᵢ₌₁ X̃𝑗,𝑖²  for 𝑗 = 1, 2, ..., 𝑘

Page 37: Lecture 4: Multiple OLS Regression

The intuition of Partitioned regression

Partialling Out

First, we regress 𝑋𝑗 against the rest of the regressors (and a constant) and keep X̃𝑗, which is the "part" of 𝑋𝑗 that is uncorrelated with the other regressors.

Then, to obtain β̂𝑗, we regress Y against X̃𝑗, which is "clean" of correlation with the other regressors.

β̂𝑗 measures the effect of 𝑋𝑗 after the effects of the other regressors 𝑋1, ..., 𝑋𝑗−1, 𝑋𝑗+1, ..., 𝑋𝑘 have been partialled out or netted out.

Page 38: Lecture 4: Multiple OLS Regression

Test Scores and Student-Teacher Ratios

Now we put two additional control variables into our OLS regression model:

𝑇𝑒𝑠𝑡𝑠𝑐𝑜𝑟𝑒 = 𝛽0 + 𝛽1𝑆𝑇𝑅 + 𝛽2𝑒𝑙𝑝𝑐𝑡 + 𝛽3𝑎𝑣𝑔𝑖𝑛𝑐 + 𝑢𝑖

elpct: the share of English learners, as an indicator for immigrants

avginc: the average income of the district, as an indicator for family backgrounds

Page 39: Lecture 4: Multiple OLS Regression

Test Scores and Student-Teacher Ratios (2)

tilde.str <- residuals(lm(str ~ el_pct, data = ca))
mean(tilde.str)  # should be zero

## [1] -1.0111e-16

sum(tilde.str)   # also zero

## [1] -4.240358e-14


Page 40: Lecture 4: Multiple OLS Regression

Test Scores and Student-Teacher Ratios (3)

Multiple OLS estimator:

β̂𝑗 = ∑ⁿᵢ₌₁ X̃𝑗,𝑖𝑌𝑖 / ∑ⁿᵢ₌₁ X̃𝑗,𝑖²  for 𝑗 = 1, 2, ..., 𝑘

sum(tilde.str*ca$testscr)/sum(tilde.str^2)

## [1] -1.101296


Page 41: Lecture 4: Multiple OLS Regression

Test Scores and Student-Teacher Ratios (4)

reg3 <- lm(testscr ~ tilde.str, data = ca)
summary(reg3)

##
## Call:
## lm(formula = testscr ~ tilde.str, data = ca)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -48.693 -14.124   0.988  13.209  50.872
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 654.1565     0.9254 706.864   <2e-16 ***
## tilde.str    -1.1013     0.4986  -2.209   0.0277 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18.97 on 418 degrees of freedom
## Multiple R-squared: 0.01154, Adjusted R-squared: 0.009171
## F-statistic: 4.878 on 1 and 418 DF, p-value: 0.02774

Page 42: Lecture 4: Multiple OLS Regression

Test Scores and Student-Teacher Ratios (5)

reg4 <- lm(testscr ~ str + el_pct, data = ca)
summary(reg4)

##
## Call:
## lm(formula = testscr ~ str + el_pct, data = ca)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -48.845 -10.240  -0.308   9.815  43.461
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 686.03225    7.41131  92.566  < 2e-16 ***
## str          -1.10130    0.38028  -2.896  0.00398 **
## el_pct       -0.64978    0.03934 -16.516  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 14.46 on 417 degrees of freedom
## Multiple R-squared: 0.4264, Adjusted R-squared: 0.4237
## F-statistic: 155 on 2 and 417 DF, p-value: < 2.2e-16

Page 43: Lecture 4: Multiple OLS Regression

Test Scores and Student-Teacher Ratios (6)

Table 3: Class Size and Test Score

Dependent variable: testscr

                    (1)            (2)
tilde.str        −1.101**
                 (0.499)
str                             −1.101***
                                (0.380)
el_pct                          −0.650***
                                (0.039)
Constant        654.157***     686.032***
                 (0.925)        (7.411)
Observations       420            420
R2                0.012          0.426
Adjusted R2       0.009          0.424

Note: *p<0.1; **p<0.05; ***p<0.01

Page 44: Lecture 4: Multiple OLS Regression

Section 5

Measures of Fit in Multiple Regression


Page 45: Lecture 4: Multiple OLS Regression

Measures of Fit: The 𝑅2

Decompose 𝑌𝑖 into the fitted value plus the residual: 𝑌𝑖 = Ŷ𝑖 + ûᵢ

The total sum of squares (TSS): TSS = ∑ⁿᵢ₌₁ (𝑌𝑖 − Ȳ)²

The explained sum of squares (ESS): ESS = ∑ⁿᵢ₌₁ (Ŷ𝑖 − Ȳ)²

The sum of squared residuals (SSR): SSR = ∑ⁿᵢ₌₁ (𝑌𝑖 − Ŷ𝑖)² = ∑ⁿᵢ₌₁ ûᵢ²

And TSS = ESS + SSR.

The regression 𝑅2 is the fraction of the sample variance of 𝑌𝑖 explained by (or predicted by) the regressors:

𝑅2 = ESS / TSS = 1 − SSR / TSS
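These identities are easy to verify numerically for the two-regressor model, a quick sketch assuming ca is loaded (the names fit, TSS, ESS, SSR are illustrative):

fit <- lm(testscr ~ str + el_pct, data = ca)
y <- ca$testscr
TSS <- sum((y - mean(y))^2)
ESS <- sum((fitted(fit) - mean(y))^2)
SSR <- sum(resid(fit)^2)
c(TSS - (ESS + SSR), ESS / TSS, 1 - SSR / TSS)  # 0 (up to rounding), 0.426, 0.426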

Page 46: Lecture 4: Multiple OLS Regression

Measures of Fit in Multiple Regression

When you put more variables into the regression, the 𝑅2 never decreases when you add another regressor, because in general the SSR will decrease.

Consider two models:

𝑌𝑖 = 𝛽0 + 𝛽1𝑋1𝑖 + 𝑢𝑖

𝑌𝑖 = 𝛽0 + 𝛽1𝑋1𝑖 + 𝛽2𝑋2𝑖 + 𝑣𝑖

Recall that for the two sets of residuals ûᵢ and v̂ᵢ we have

∑ⁿᵢ₌₁ ûᵢ = ∑ⁿᵢ₌₁ ûᵢ𝑋1𝑖 = 0

∑ⁿᵢ₌₁ v̂ᵢ = ∑ⁿᵢ₌₁ v̂ᵢ𝑋1𝑖 = ∑ⁿᵢ₌₁ v̂ᵢ𝑋2𝑖 = 0

Page 47: Lecture 4: Multiple OLS Regression

Measures of Fit in Multiple Regression

We will show that

∑ⁿᵢ₌₁ ûᵢ² ≥ ∑ⁿᵢ₌₁ v̂ᵢ²

and therefore 𝑅2𝑣 ≥ 𝑅2𝑢: the 𝑅2 of the regression with one regressor is less than or equal to the 𝑅2 of the regression with two regressors.

This conclusion can be generalized to the case of 𝑘 + 1 regressors.
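In code (a sketch, assuming ca is loaded), even a pure noise regressor cannot lower 𝑅2:

summary(lm(testscr ~ str, data = ca))$r.squared            # 0.051
summary(lm(testscr ~ str + el_pct, data = ca))$r.squared   # 0.426
set.seed(42)
summary(lm(testscr ~ str + rnorm(nrow(ca)), data = ca))$r.squared  # >= 0.051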

Page 48: Lecture 4: Multiple OLS Regression

Measures of Fit in Multiple Regression

First we would like to know ∑ⁿᵢ₌₁ ûᵢv̂ᵢ:

∑ⁿᵢ₌₁ ûᵢv̂ᵢ = ∑ᵢ (𝑌𝑖 − β̂0 − β̂1𝑋1𝑖)v̂ᵢ
= ∑ᵢ 𝑌𝑖v̂ᵢ − β̂0 ∑ᵢ v̂ᵢ − β̂1 ∑ᵢ 𝑋1𝑖v̂ᵢ
= ∑ᵢ 𝑌𝑖v̂ᵢ − β̂0 · 0 − β̂1 · 0
= ∑ᵢ (β̂0 + β̂1𝑋1𝑖 + β̂2𝑋2𝑖 + v̂ᵢ)v̂ᵢ
= ∑ᵢ v̂ᵢv̂ᵢ = ∑ᵢ v̂ᵢ²

Page 49: Lecture 4: Multiple OLS Regression

Measures of Fit in Multiple Regression

Then we can obtain

∑ⁿᵢ₌₁ ûᵢ² − ∑ⁿᵢ₌₁ v̂ᵢ² = ∑ᵢ ûᵢ² + ∑ᵢ v̂ᵢ² − 2∑ᵢ v̂ᵢ²
= ∑ᵢ ûᵢ² + ∑ᵢ v̂ᵢ² − 2∑ᵢ ûᵢv̂ᵢ
= ∑ᵢ (ûᵢ − v̂ᵢ)² ≥ 0

Therefore 𝑅2𝑣 ≥ 𝑅2𝑢: the 𝑅2 of the regression with one regressor is less than or equal to the 𝑅2 of the regression with two regressors.

Page 50: Lecture 4: Multiple OLS Regression

Measures of Fit: The Adjusted 𝑅2

The adjusted 𝑅2, written R̄2, is a modified version of the 𝑅2 that does not necessarily increase when a new regressor is added:

R̄2 = 1 − [(𝑛 − 1)/(𝑛 − 𝑘 − 1)] · SSR/TSS = 1 − 𝑠û² / 𝑠𝑌²

Because (𝑛 − 1)/(𝑛 − 𝑘 − 1) is always greater than 1, R̄2 < 𝑅2.

Adding a regressor has two opposite effects on R̄2: the SSR falls, but the factor (𝑛 − 1)/(𝑛 − 𝑘 − 1) rises.

R̄2 can be negative.

Reminder: neither 𝑅2 nor R̄2 is the golden criterion for a good or bad OLS estimation.

Page 51: Lecture 4: Multiple OLS Regression

Standard Error of the Regression

Recall the SER (Standard Error of the Regression): the SER is an estimator of the standard deviation of the 𝑢𝑖, which is a measure of the spread of the Y's around the regression line.

Because the regression errors are unobserved, the SER is computed using their sample counterparts, the OLS residuals ûᵢ:

SER = 𝑠û = √(𝑠û²),  where 𝑠û² = [1/(𝑛 − 𝑘 − 1)] ∑ⁿᵢ₌₁ ûᵢ² = SSR/(𝑛 − 𝑘 − 1)

The divisor is 𝑛 − 𝑘 − 1 because the F.O.C. impose 𝑘 + 1 restrictions. In other words, in order to construct ûᵢ, we have to estimate 𝑘 + 1 parameters: β̂0, β̂1, ..., β̂𝑘.
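Both quantities are easy to reproduce by hand, a sketch assuming ca is loaded and using the model with k = 2 regressors (the names fit, SSR, TSS are illustrative):

fit <- lm(testscr ~ str + el_pct, data = ca)
n <- nrow(ca); k <- 2
SSR <- sum(resid(fit)^2)
sqrt(SSR / (n - k - 1))                        # SER, matches the 14.46 above
TSS <- sum((ca$testscr - mean(ca$testscr))^2)
1 - (n - 1)/(n - k - 1) * SSR/TSS              # adjusted R^2, about 0.4237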

Page 52: Lecture 4: Multiple OLS Regression

Example: Test Scores and Student-Teacher Ratios


Page 53: Lecture 4: Multiple OLS Regression

Section 6

Categorical Variables as Independent Variables in Regression


Page 54: Lecture 4: Multiple OLS Regression

A Special Case: Categorical Variable as 𝑋

Recall that if 𝑋 is a dummy variable, then we can put it into the regression equation directly.

What if 𝑋 is a categorical variable? Question: What is a categorical variable?

For example, we may define 𝐷𝑖 as follows:

𝐷𝑖 = 1 (small-size class) if STR in the 𝑖th school district < 18
𝐷𝑖 = 2 (middle-size class) if 18 ≤ STR in the 𝑖th school district < 22
𝐷𝑖 = 3 (large-size class) if STR in the 𝑖th school district ≥ 22  (4.5)

Page 55: Lecture 4: Multiple OLS Regression

A Special Case: Categorical Variable as 𝑋

Naive solution: a simple OLS regression model

𝑇𝑒𝑠𝑡𝑆𝑐𝑜𝑟𝑒𝑖 = 𝛽0 + 𝛽1𝐷𝑖 + 𝑢𝑖  (4.3)

Question: Can you explain the meaning of the estimated coefficient β̂1?

Answer: It does not make sense; the coefficient 𝛽1 cannot be interpreted as if 𝐷𝑖 were a continuous variable.

Page 56: Lecture 4: Multiple OLS Regression

A Special Case: Categorical Variables as 𝑋

The first step: turn the categorical variable (𝐷𝑖) into multiple dummy variables (𝐷1𝑖, 𝐷2𝑖, 𝐷3𝑖):

𝐷1𝑖 = 1 if STR in the 𝑖th school district < 18 (small-sized class); 0 otherwise

𝐷2𝑖 = 1 if 18 ≤ STR in the 𝑖th school district < 22 (middle-sized class); 0 otherwise

𝐷3𝑖 = 1 if STR in the 𝑖th school district ≥ 22 (large-sized class); 0 otherwise

Page 57: Lecture 4: Multiple OLS Regression

A Special Case: Categorical Variables as 𝑋

We put these dummies into a multiple regression:

𝑇𝑒𝑠𝑡𝑆𝑐𝑜𝑟𝑒𝑖 = 𝛽0 + 𝛽1𝐷1𝑖 + 𝛽2𝐷2𝑖 + 𝛽3𝐷3𝑖 + 𝑢𝑖  (4.6)

Then, just as with a dummy variable as the independent variable in a simple regression, the coefficients (𝛽1, 𝛽2, 𝛽3) represent the effect of each class-size category on 𝑡𝑒𝑠𝑡𝑠𝑐𝑜𝑟𝑒, respectively.

Page 58: Lecture 4: Multiple OLS Regression

A Special Case: Categorical Variables as 𝑋

In practice, we can’t put all dummies into the regression, but onlyhave 𝑛 − 1 dummies unless we will suffer perfect multi-collinearity.

The regression may be like as

𝑇 𝑒𝑠𝑡𝑆𝑐𝑜𝑟𝑒𝑖 = 𝛽0 + 𝛽1𝐷1𝑖 + 𝛽2𝐷2𝑖 + 𝑢𝑖 (4.6)

The default intercept term, 𝛽0,represents the large-sized class.Then,the coefficients (𝛽1, 𝛽2) represent 𝑡𝑒𝑠𝑡𝑠𝑐𝑜𝑟𝑒 gaps betweensmall_sized, middle-sized class and large-sized class,respectively.

Zhaopeng Qu (Nanjing University) Lecture 4: Multiple OLS Regression 10/15/2020 57 / 79
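One way to implement this, a sketch assuming the ca data frame is loaded (the column name size and the cut() breaks mirror definition (4.5) and are illustrative): build the categories with cut() and let lm() create the dummies, with the large class as the omitted reference group.

ca$size <- cut(ca$str, breaks = c(-Inf, 18, 22, Inf),
               labels = c("small", "middle", "large"), right = FALSE)
ca$size <- relevel(ca$size, ref = "large")   # omit the large class
coef(lm(testscr ~ size, data = ca))  # intercept: large-class mean; slopes: gaps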

Page 59: Lecture 4: Multiple OLS Regression

Section 7

Multiple Regression: Assumption


Page 60: Lecture 4: Multiple OLS Regression

Multiple Regression: Assumption

Assumption 1: The conditional distribution of 𝑢𝑖 given 𝑋1𝑖, ..., 𝑋𝑘𝑖 has mean zero, thus

𝐸[𝑢𝑖|𝑋1𝑖, ..., 𝑋𝑘𝑖] = 0

Assumption 2: (𝑌𝑖, 𝑋1𝑖, ..., 𝑋𝑘𝑖) are i.i.d.

Assumption 3: Large outliers are unlikely.

Assumption 4: No perfect multicollinearity.

Page 61: Lecture 4: Multiple OLS Regression

Perfect multicollinearity

Perfect multicollinearity arises when one of the regressors is a perfect linear combination of the other regressors.

Binary variables are sometimes referred to as dummy variables.

If you include a full set of binary variables (a complete and mutually exclusive categorization) and an intercept in the regression, you will have perfect multicollinearity.

e.g. female and male = 1 − female
e.g. West, Central and East China

This is called the dummy variable trap.

Solutions to the dummy variable trap: omit one of the groups or the intercept.

Page 62: Lecture 4: Multiple OLS Regression

Perfect multicollinearity

Regress Testscore on Class size and the percentage of English learners:

##
## Call:
## lm(formula = testscr ~ str + el_pct, data = ca)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -48.845 -10.240  -0.308   9.815  43.461
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 686.03225    7.41131  92.566  < 2e-16 ***
## str          -1.10130    0.38028  -2.896  0.00398 **
## el_pct       -0.64978    0.03934 -16.516  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 14.46 on 417 degrees of freedom
## Multiple R-squared: 0.4264, Adjusted R-squared: 0.4237
## F-statistic: 155 on 2 and 417 DF, p-value: < 2.2e-16

Page 63: Lecture 4: Multiple OLS Regression

Perfect multicollinearity

Add a new variable nel_pct = 1 − el_pct into the regression:

##
## Call:
## lm(formula = testscr ~ str + nel_pct + el_pct, data = ca)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -48.845 -10.240  -0.308   9.815  43.461
##
## Coefficients: (1 not defined because of singularities)
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 685.38247    7.41556  92.425  < 2e-16 ***
## str          -1.10130    0.38028  -2.896  0.00398 **
## nel_pct       0.64978    0.03934  16.516  < 2e-16 ***
## el_pct             NA         NA      NA       NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 14.46 on 417 degrees of freedom
## Multiple R-squared: 0.4264, Adjusted R-squared: 0.4237
## F-statistic: 155 on 2 and 417 DF, p-value: < 2.2e-16

Page 64: Lecture 4: Multiple OLS Regression

Perfect multicollinearity

Table 4: Class Size and Test Score

Dependent variable: testscr

                    (1)            (2)
str              −1.101***      −1.101***
                 (0.380)        (0.380)
nel_pct                          0.650***
                                (0.039)
el_pct           −0.650***
                 (0.039)
Constant        686.032***     685.382***
                 (7.411)        (7.416)
Observations       420            420
R2                0.426          0.426

Note: *p<0.1; **p<0.05; ***p<0.01

Page 65: Lecture 4: Multiple OLS Regression

Multicollinearity

Multicollinearity means that two or more regressors are highly correlated, but no regressor is a perfect linear function of the others.

Multicollinearity is NOT a violation of the OLS assumptions. It does not pose a theoretical problem for the calculation of the OLS estimators.

But if two regressors are highly correlated, then the coefficient on at least one of them is imprecisely estimated (high variance).

To what extent can two correlated variables be seen as "highly correlated"?

Rule of thumb: the correlation coefficient is over 0.8.
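A small illustration with simulated (hypothetical) data: OLS remains computable, but the standard errors on the nearly collinear regressors blow up.

set.seed(1)
n <- 100
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.05)   # cor(x1, x2) is about 0.999
y  <- 1 + x1 + x2 + rnorm(n)
summary(lm(y ~ x1 + x2))$coefficients[, "Std. Error"]  # large SEs on x1 and x2
summary(lm(y ~ x1))$coefficients[, "Std. Error"]       # much smaller SE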

Page 66: Lecture 4: Multiple OLS Regression

Venn Diagrams for Multiple Regression Model

[Venn diagram omitted.]

In a simple model (y on X), OLS uses 'Blue' + 'Red' to estimate 𝛽.

When y is regressed on X and W, OLS throws away the red area and just uses blue to estimate 𝛽.

Idea: the red area is contaminated (we do not know whether the movements in y are due to X or to W).

Page 67: Lecture 4: Multiple OLS Regression

Venn Diagrams for Multicollinearity

[Venn diagram omitted.]

When less information is used (compare the Blue and Green areas in the two figures), the estimation is less precise.

Page 68: Lecture 4: Multiple OLS Regression

Multiple Regression: Test Scores and Class Size

Table 5: Class Size and Test Score

Dependent variable: testscr

                    (1)           (2)           (3)
str              −2.280***     −1.101***     −0.069
                 (0.480)       (0.380)       (0.277)
el_pct                         −0.650***     −0.488***
                               (0.039)       (0.029)
avginc                                        1.495***
                                             (0.075)
Constant        698.933***    686.032***    640.315***
                 (9.467)       (7.411)       (5.775)
N                  420           420           420
R2                0.051         0.426         0.707
Adjusted R2       0.049         0.424         0.705

Notes: ***Significant at the 1 percent level. **Significant at the 5 percent level. *Significant at the 10 percent level.

Page 69: Lecture 4: Multiple OLS Regression

Section 8

Properties of OLS Estimators in Multiple Regression


Page 70: Lecture 4: Multiple OLS Regression

Properties of OLS estimators: Unbiasedness (1)

Use the partitioned regression formula

β̂1 = ∑ⁿᵢ₌₁ X̃1,𝑖𝑌𝑖 / ∑ⁿᵢ₌₁ X̃1,𝑖²

Substitute 𝑌𝑖 = 𝛽0 + 𝛽1𝑋1,𝑖 + 𝛽2𝑋2,𝑖 + ... + 𝛽𝑘𝑋𝑘,𝑖 + 𝑢𝑖, 𝑖 = 1, ..., 𝑛; then

β̂1 = ∑ᵢ X̃1,𝑖(𝛽0 + 𝛽1𝑋1,𝑖 + 𝛽2𝑋2,𝑖 + ... + 𝛽𝑘𝑋𝑘,𝑖 + 𝑢𝑖) / ∑ᵢ X̃1,𝑖²

= 𝛽0 ∑ᵢ X̃1,𝑖 / ∑ᵢ X̃1,𝑖² + 𝛽1 ∑ᵢ X̃1,𝑖𝑋1,𝑖 / ∑ᵢ X̃1,𝑖² + ... + 𝛽𝑘 ∑ᵢ X̃1,𝑖𝑋𝑘,𝑖 / ∑ᵢ X̃1,𝑖² + ∑ᵢ X̃1,𝑖𝑢𝑖 / ∑ᵢ X̃1,𝑖²

Page 71: Lecture 4: Multiple OLS Regression

Properties of OLS estimators: Unbiasedness (2)

Because

∑ⁿᵢ₌₁ X̃1,𝑖 = ∑ⁿᵢ₌₁ X̃1,𝑖𝑋𝑗,𝑖 = 0 for 𝑗 = 2, 3, ..., 𝑘

∑ⁿᵢ₌₁ X̃1,𝑖𝑋1,𝑖 = ∑ⁿᵢ₌₁ X̃1,𝑖²

we therefore have

β̂1 = 𝛽1 + ∑ⁿᵢ₌₁ X̃1,𝑖𝑢𝑖 / ∑ⁿᵢ₌₁ X̃1,𝑖²

Page 72: Lecture 4: Multiple OLS Regression

Properties of OLS estimators: Unbiasedness (3)

Recall Assumption 1: 𝐸[𝑢𝑖|𝑋1𝑖, 𝑋2𝑖, ..., 𝑋𝑘𝑖] = 0, and X̃1,𝑖 is a function of 𝑋1𝑖, 𝑋2𝑖, ..., 𝑋𝑘𝑖.

Then take expectations of β̂1 and use the Law of Iterated Expectations again:

E[β̂1] = E[𝛽1 + ∑ᵢ X̃1,𝑖𝑢𝑖 / ∑ᵢ X̃1,𝑖²] = 𝛽1 + E[∑ᵢ X̃1,𝑖𝑢𝑖 / ∑ᵢ X̃1,𝑖²]

= 𝛽1 + E[∑ᵢ X̃1,𝑖 𝐸[𝑢𝑖|𝑋1𝑖, ..., 𝑋𝑘𝑖] / ∑ᵢ X̃1,𝑖²]

= 𝛽1

An identical argument works for β̂2, ..., β̂𝑘, thus

E[β̂𝑗] = 𝛽𝑗 where 𝑗 = 1, 2, ..., 𝑘
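A Monte Carlo sketch of this result with a hypothetical design (not the California data): across repeated samples, the average of β̂1 is close to the true 𝛽1 = 2 even though X1 and X2 are correlated.

set.seed(1)
beta1.hat <- replicate(2000, {
  x2 <- rnorm(200)
  x1 <- 0.5 * x2 + rnorm(200)        # correlated regressors
  y  <- 1 + 2 * x1 - x2 + rnorm(200) # true beta1 = 2
  coef(lm(y ~ x1 + x2))["x1"]
})
mean(beta1.hat)  # ~ 2: the estimator is unbiased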

Page 73: Lecture 4: Multiple OLS Regression

Properties of OLS estimators: Consistency (1)

Recall

β̂1 = ∑ⁿᵢ₌₁ X̃1,𝑖𝑌𝑖 / ∑ⁿᵢ₌₁ X̃1,𝑖²

Similar to the proof in the simple OLS regression,

β̂1 = ∑ⁿᵢ₌₁ X̃1,𝑖𝑌𝑖 / ∑ⁿᵢ₌₁ X̃1,𝑖² = [1/(𝑛−2)] ∑ⁿᵢ₌₁ X̃1,𝑖𝑌𝑖 / [1/(𝑛−2)] ∑ⁿᵢ₌₁ X̃1,𝑖² = s(X̃1, 𝑌) / s²(X̃1)

where s(X̃1, 𝑌) and s²(X̃1) are the sample covariance of X̃1 and 𝑌 and the sample variance of X̃1 (the scaling constant cancels in the ratio).

Page 74: Lecture 4: Multiple OLS Regression

Properties of OLS estimators: Consistency (2)

Based on the L.L.N. (law of large numbers) and random sampling (i.i.d.),

s²(X̃1) →ᵖ Var(X̃1)

s(X̃1, 𝑌) →ᵖ Cov(X̃1, 𝑌)

Combining this with the Continuous Mapping Theorem, we obtain, as 𝑛 → ∞,

plim β̂1 = plim( s(X̃1, 𝑌) / s²(X̃1) ) = Cov(X̃1, 𝑌) / Var(X̃1)

Page 75: Lecture 4: Multiple OLS Regression

Properties of OLS estimators: Consistency (3)

plim β̂1 = Cov(X̃1, 𝑌) / Var(X̃1)
= Cov(X̃1, (𝛽0 + 𝛽1𝑋1𝑖 + ... + 𝛽𝑘𝑋𝑘𝑖 + 𝑢𝑖)) / Var(X̃1)
= (Cov(X̃1, 𝛽0) + 𝛽1Cov(X̃1, 𝑋1𝑖) + ... + 𝛽𝑘Cov(X̃1, 𝑋𝑘𝑖) + Cov(X̃1, 𝑢𝑖)) / Var(X̃1)
= 𝛽1 + Cov(X̃1, 𝑢𝑖) / Var(X̃1)

Page 76: Lecture 4: Multiple OLS Regression

Properties of OLS estimators: Consistency (4)

Based on Assumption 1, 𝐸[𝑢𝑖|𝑋1𝑖, 𝑋2𝑖, ..., 𝑋𝑘𝑖] = 0, and X̃1,𝑖 is a function of 𝑋1𝑖, 𝑋2𝑖, ..., 𝑋𝑘𝑖.

Then Cov(X̃1, 𝑢𝑖) = 0, and we obtain

plim β̂1 = 𝛽1

An identical argument works for β̂2, ..., β̂𝑘, thus

plim β̂𝑗 = 𝛽𝑗 where 𝑗 = 1, 2, ..., 𝑘

Page 77: Lecture 4: Multiple OLS Regression

The Distribution of Multiple OLS Estimators

Recall from the last lecture: under the least squares assumptions, the OLS estimators β̂1 and β̂0 are unbiased and consistent estimators of 𝛽1 and 𝛽0, and in large samples the sampling distribution of β̂1 and β̂0 is well approximated by a bivariate normal distribution.

Similarly to simple OLS, the multiple OLS estimators are averages of the randomly sampled data; if the sample size is sufficiently large, the sampling distribution of those averages becomes normal.

Because the multivariate normal distribution is best handled mathematically using matrix algebra, the expressions for the joint distribution of the OLS estimators are deferred to Chapter 18 (SW textbook).

Page 78: Lecture 4: Multiple OLS Regression

Section 9

Multiple OLS Regression and Causality


Page 79: Lecture 4: Multiple OLS Regression

Independent Variable vs. Control Variables

Generally, we pay the most attention to only one independent variable (we call it the treatment variable), even though there may be many independent variables.

𝛽𝑗 is the partial (marginal) effect of 𝑋𝑗 on Y:

𝛽𝑗 = ∂𝑌𝑖 / ∂𝑋𝑗,𝑖

which means that we estimate the effect of X on Y with "other things equal", thus the concept of ceteris paribus.

Therefore, we call the other variables on the right-hand side of the equation control variables: variables we would like to explicitly hold fixed when studying the effect of 𝑋1 on Y.

More specifically, the regression model turns into

𝑌𝑖 = 𝛽0 + 𝛽1𝐷𝑖 + 𝛾2𝐶2,𝑖 + ... + 𝛾𝑘𝐶𝑘,𝑖 + 𝑢𝑖, 𝑖 = 1, ..., 𝑛

or, stacking the controls into a vector,

𝑌𝑖 = 𝛽0 + 𝛽1𝐷𝑖 + 𝐶𝑖′𝛾 + 𝑢𝑖, 𝑖 = 1, ..., 𝑛

where 𝐶𝑖 = (𝐶2,𝑖, ..., 𝐶𝑘,𝑖)′ and 𝛾 = (𝛾2, ..., 𝛾𝑘)′.

Page 80: Lecture 4: Multiple OLS Regression

Good Controls and Bad Controls

Question: Are "more controls" always better (or at least never worse)?

Answer: It depends, because there are such things as bad controls: adding a control (putting a variable into the regression) may remove the OVB but also introduce other biases.

Good controls are variables determining the treatment and correlated with the outcome; we can think of them as having been fixed at the time the treatment variable was determined.

Irrelevant controls are variables uncorrelated with the treatment but correlated with the outcome; they may help reduce standard errors.

Bad controls are not factors we would like to hold fixed; they are themselves outcomes of the treatment.

We will come back to this topic in Lecture 7.