
MULTIPLE REGRESSION ANALYSIS: ESTIMATION

Huseyin Tastan

Yıldız Technical University, Department of Economics

These presentation notes are based on Introductory Econometrics: A Modern Approach (2nd ed.) by J. Wooldridge.

17 October 2012


Multiple Regression Analysis

In the simple regression analysis with only one explanatory variable, the assumption SLR.3 is generally very restrictive and unrealistic.

Recall that the key assumption was SLR.3: all other factors affecting y are uncorrelated with x (ceteris paribus).

Multiple regression analysis allows us to explicitly control for many other factors that simultaneously affect y.

By adding new explanatory variables we can explain more of the variation in y. In other words, we can develop more successful models.

Additional advantage: multiple regression can incorporate general functional form relationships.


Multiple Regression Analysis Examples

The Model with Two Explanatory Variables

y = β0 + β1x1 + β2x2 + u

Wage Equation

wage = β0 + β1 educ + β2 exper + u

wage: hourly wage (in US dollars); educ: level of education (in years); exper: level of experience (in years)

β1: measures the impact of education on wage, holding all other factors fixed.

β2: measures the ceteris paribus effect of experience on wage.

This wage equation allows us to measure the impact of education on wage holding experience fixed. This was not possible in simple regression analysis.


Multiple Regression Analysis Examples

Student Success and Family Income

avgscore = β0 + β1 expend + β2 avginc + u

avgscore: average standardized test score; expend: education spending per student; avginc: average family income

If avginc is not included in the model directly, its effect will be included in the error term, u.

Because average family income is generally correlated with education expenditures, the key assumption SLR.3 will be invalid: x (expend) will be correlated with u, leading to biased OLS estimators.


Multiple Regression Analysis

Multiple regression analysis allows us to use more general functional forms.

Consider the following quadratic model of consumption:

cons = β0 + β1 inc + β2 inc² + u

where x1 = inc and x2 = inc².

β1 has no ceteris paribus interpretation on its own: we cannot hold inc² fixed while changing inc.

The marginal propensity to consume (MPC) is approximated by:

∆cons/∆inc ≈ β1 + 2β2 inc

MPC depends on the level of income.
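A minimal numerical sketch (Python, with purely hypothetical coefficient values chosen for illustration, not estimates from any dataset) of how this approximated MPC varies with income:

```python
import numpy as np

# Quadratic consumption model: cons = b0 + b1*inc + b2*inc^2 + u
# Hypothetical coefficients (illustrative only, not estimated values)
b1, b2 = 0.8, -0.002

# Approximate MPC from the slide: d(cons)/d(inc) ~ b1 + 2*b2*inc
inc = np.array([10.0, 30.0, 50.0])  # income levels
mpc = b1 + 2 * b2 * inc
for level, m in zip(inc, mpc):
    print(f"inc = {level:5.1f}  ->  MPC = {m:.3f}")
# With b2 < 0, the MPC falls as income rises: 0.760, 0.680, 0.600
```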


Multiple Regression Analysis

The Model with Two Explanatory Variables

y = β0 + β1x1 + β2x2 + u

“Zero conditional mean” assumption:

E(u|x1, x2) = 0

In other words, for all combinations of x1 and x2, the expected value of the unobservables, u, is zero.

E.g., in the wage equation:

E(u|educ, exper) = 0

This means that unobservable factors affecting wage are, on average, unrelated to educ and exper.

If ability is a part of u, then average ability levels must be the same across all combinations of education and experience in the population.


Multiple Regression Analysis

The model with two independent variables

E(u|x1, x2) = 0

Test scores and average family income:

E(u|expend, avginc) = 0

All other factors affecting average test scores (such as quality of schools, student characteristics, etc.) are, on average, unrelated to education expenditures and family income.

In the quadratic consumption model:

E(u|inc, inc²) = E(u|inc) = 0

Since inc² is automatically known when inc is known, we do not need to include it separately in the conditional expectation.


The Model with k Independent Variables

Model

y = β0 + β1x1 + β2x2 + … + βkxk + u

In this model we have k explanatory variables and k + 1 unknown β parameters.

The definition of the error term is the same: it represents all other factors affecting y that are not included in the model.


The Model with k Explanatory Variables

βj: measures the change in y in response to a unit change in xj, holding all other x's and the unobservable factors u fixed.

For example:

log(salary) = β0 + β1 log(sales) + β2 ceoten + β3 ceoten² + u

ceoten: CEO's tenure with the current company (in years)

Interpretation of β1: the elasticity of salary with respect to sales. Ceteris paribus, salary will change by β1% in response to a 1% change in sales.

β2: does not by itself measure the impact of ceoten; we need to take the quadratic term into account.
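For concreteness, differentiating the model with respect to ceoten (a step implied but not written out on the slide) gives the approximate ceteris paribus effect of one more year of tenure:

∆log(salary)/∆ceoten ≈ β2 + 2β3 ceoten

so 100·(β2 + 2β3 ceoten) approximates the percentage change in salary, and, as with the MPC example, the effect depends on the current level of ceoten.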


The Model with k Explanatory Variables

Zero Conditional Mean Assumption

E(u|x1, x2, . . . , xk) = 0

The error term is unrelated to the explanatory variables.

If u is correlated with any of the x's, then the OLS estimators will in general be biased, and estimation results will be unreliable.

If important variables affecting y are omitted from the model, this assumption may not hold. This may lead to "omitted variable bias".

This assumption also means that the functional form is correctly specified.


The Model with k Explanatory Variables: OLS Estimation

Sample Regression Function - SRF

ŷ = β̂0 + β̂1x1 + β̂2x2 + … + β̂kxk

The Ordinary Least Squares (OLS) estimators minimize the Sum of Squared Residuals (SSR):

OLS Objective Function

∑_{i=1}^{n} ûi² = ∑_{i=1}^{n} (yi − β̂0 − β̂1xi1 − β̂2xi2 − … − β̂kxik)²

The OLS estimators β̂j, j = 0, 1, 2, …, k, can be found by solving the First Order Conditions (FOC): k + 1 equations in k + 1 unknowns.
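A minimal sketch (Python with NumPy, on synthetic data; names and numbers are illustrative, not from the course datasets) of computing the OLS estimates by solving the normal equations that these first order conditions produce:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 2

# Synthetic data generated from y = 1 + 2*x1 - 3*x2 + u
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])  # first column = intercept
beta_true = np.array([1.0, 2.0, -3.0])
y = X @ beta_true + rng.normal(size=n)

# OLS: the k+1 first order conditions are the normal equations X'X b = X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

u_hat = y - X @ beta_hat   # residuals
ssr = u_hat @ u_hat        # minimized sum of squared residuals
print("beta_hat:", beta_hat.round(3), "SSR:", round(ssr, 2))
```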


The Model with k Explanatory Variables: OLS Estimation

OLS First Order Conditions

∑_{i=1}^{n} (yi − β̂0 − β̂1xi1 − β̂2xi2 − … − β̂kxik) = 0

∑_{i=1}^{n} xi1(yi − β̂0 − β̂1xi1 − β̂2xi2 − … − β̂kxik) = 0

∑_{i=1}^{n} xi2(yi − β̂0 − β̂1xi1 − β̂2xi2 − … − β̂kxik) = 0

⋮

∑_{i=1}^{n} xik(yi − β̂0 − β̂1xi1 − β̂2xi2 − … − β̂kxik) = 0


The Model with k Explanatory Variables: OLS Estimation

OLS and Method of Moments

OLS first order conditions can also be obtained using the method of moments. They are the sample counterparts to the following population moment conditions:

E(u) = 0

E(x1u) = 0

E(x2u) = 0

⋮

E(xku) = 0
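Replacing each expectation by its sample analogue, and u by the OLS residual, reproduces the first order conditions on the previous slide (divided by n); for the j-th condition:

(1/n) ∑_{i=1}^{n} xij(yi − β̂0 − β̂1xi1 − … − β̂kxik) = 0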


Interpretation of SRF

The model with 2 explanatory variables

ŷ = β̂0 + β̂1x1 + β̂2x2

The slope estimates, β̂j, give us the predicted change in y given changes in the x variables, ceteris paribus.

β̂1: holding x2 fixed, i.e., ∆x2 = 0,

∆ŷ = β̂1∆x1

Similarly, holding x1 fixed,

∆ŷ = β̂2∆x2


Example: Determinants of College Success, gpa1.gdt

colGPA Estimation Results

colGPA = 1.29 + 0.453 hsGPA + 0.0094 ACT

n = 141 students; colGPA: university grade point average (GPA, points out of 4); hsGPA: high school grade point average; ACT: achievement test score

The intercept is estimated as β̂0 = 1.29. This gives us the average colGPA when hsGPA = 0 and ACT = 0. Because there is no observation with ACT = 0 and hsGPA = 0, this interpretation is not meaningful.

The slope parameter on hsGPA is estimated to be 0.453. Holding ACT fixed, given a one-point increase in high school GPA, college GPA is predicted to increase by 0.453 points, almost half a point.
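A small sketch (Python; the coefficients below are simply the estimates reported above, hard-coded because the gpa1.gdt data themselves are not reproduced here) of using the fitted equation for prediction:

```python
# Fitted equation from the slide: colGPA = 1.29 + 0.453*hsGPA + 0.0094*ACT
def colgpa_hat(hsGPA: float, ACT: float) -> float:
    """Predicted college GPA from the estimated regression."""
    return 1.29 + 0.453 * hsGPA + 0.0094 * ACT

# Holding ACT fixed, a one-point increase in hsGPA raises the
# predicted college GPA by exactly the hsGPA coefficient, 0.453.
print(colgpa_hat(3.5, 24.0) - colgpa_hat(2.5, 24.0))  # 0.453 (up to floating point)
```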


Alternative interpretation of 0.453

If we choose two students, A and B, with the same ACT score, but student A's hsGPA is one point higher than student B's, then we predict that student A's college GPA is 0.453 points higher than student B's.

The coefficient on ACT has a positive sign, but its effect is very small.


Example: Determinants of College Success, gpa1.gdt

Regression with only the ACT variable:

colGPA simple regression results

colGPA = 2.4 + 0.0271 ACT

The coefficient estimate on ACT is almost three times as large as the previous estimate.

But this regression does not allow us to compare two students with the same high school GPA.

The impact of ACT decreases considerably after controlling for high school GPA.


Interpretation of Estimated Regression Function

SRF - Sample Regression Function

ŷ = β̂0 + β̂1x1 + β̂2x2 + … + β̂kxk

In terms of changes:

∆ŷ = β̂1∆x1 + β̂2∆x2 + … + β̂k∆xk

Interpretation of the coefficient on x1, β̂1: holding all other variables fixed, i.e., ∆x2 = 0, ∆x3 = 0, …, ∆xk = 0,

∆ŷ = β̂1∆x1

Holding all other independent variables fixed (i.e., controlling for all other x variables), the change in ŷ given a one-unit change in x1 is β̂1 (in y's unit of measurement).


Example: Logarithmic Wage Equation

Estimated equation

log(wage) = 0.284 + 0.092 educ + 0.0041 exper + 0.022 tenure

n = 526 workers.

Functional form: log-level; 100·βj gives us the % change in wage.

For example, holding experience and tenure fixed, another year of education is predicted to increase wages by 9.2%.

In other words, given two workers with the same level of experience and tenure, if one of the workers has one more year of education than the other, then the predicted difference between their wages is 9.2%.
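Strictly speaking, 9.2% is the log approximation to the percentage change. The exact ceteris paribus difference implied by the log-level form is 100·(e^0.092 − 1) ≈ 9.6%, close to 9.2% because the coefficient is small.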


Changing More than One Explanatory Variable Simultaneously

Sometimes we want to change more than one x variable at the same time to find their combined impact on y.

Also, in some cases, when one of the x variables changes, another automatically changes with it.

For example, in the wage equation, when tenure increases by one year, experience also increases by one year.

In this case the total effect on wage is 2.61%:

∆log(wage) = 0.0041∆exper + 0.022∆tenure = 0.0041 + 0.022 = 0.0261


OLS Fitted Values and Residuals

Fitted (predicted) value for observation i

ŷi = β̂0 + β̂1xi1 + β̂2xi2 + … + β̂kxik

Residual for observation i:

ûi = yi − ŷi

ûi > 0 implies yi > ŷi: underprediction

ûi < 0 implies yi < ŷi: overprediction


Algebraic Properties of Residuals

The sum (and also average) of OLS residuals is zero:

n∑i=1

ui = 0, ¯u = 0

follows directly from the first moment condition.

Sample covariance between OLS residuals and each xj is zero:

n∑i=1

xij ui = 0, j = 1, 2, . . . , k

also follows directly from the population moment conditions.Imposed by the OLS FOC.

The point (xj , y : j = 1, 2, . . . , k) is always on the OLSregression line.

y = ¯y

Econometrics I: Multiple Regression: Estimation - H. Tastan 22

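To make these properties concrete, here is a minimal numerical sketch (not from the original slides): it fits an OLS regression on simulated data with numpy and checks each algebraic property. All variable names and the data-generating process are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])        # design matrix with intercept
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ beta_hat                             # fitted values
u_hat = y - y_hat                                # residuals

print(u_hat.sum())              # ~0: residuals sum (and average) to zero
print(X[:, 1] @ u_hat)          # ~0: residuals orthogonal to x1
print(X[:, 2] @ u_hat)          # ~0: residuals orthogonal to x2
print(y_hat.mean(), y.mean())   # equal up to rounding: mean of fitted values = ybar
```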

An Alternative Derivation of Coefficient Estimates

Consider the fitted model with two explanatory variables:

\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2

The slope estimator on x_1 can be written as

\hat{\beta}_1 = \frac{\sum_{i=1}^{n} \hat{r}_{i1} y_i}{\sum_{i=1}^{n} \hat{r}_{i1}^2}

where \hat{r}_{i1} is the residual from the regression of x_1 on x_2:

x_{i1} = \hat{\alpha}_0 + \hat{\alpha}_1 x_{i2} + \hat{r}_{i1}

This means that \hat{\beta}_1 is the slope estimate obtained from regressing y on these residuals:

y_i = \hat{\beta}_0 + \hat{\beta}_1 \hat{r}_{i1} + \text{residual}

Ceteris paribus interpretation (partialling out, netting out): \hat{\beta}_1 measures the sample relationship between y and x_1 after x_2 has been partialled out.

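The partialling-out result can be checked numerically. A minimal sketch, assuming simulated data in which x_1 and x_2 are correlated, using only numpy:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)             # x1 correlated with x2
y = 2.0 + 1.5 * x1 - 1.0 * x2 + rng.normal(size=n)

# Full multiple regression of y on (1, x1, x2)
X = np.column_stack([np.ones(n), x1, x2])
b_full, *_ = np.linalg.lstsq(X, y, rcond=None)

# Partialling out: residuals r1 from regressing x1 on (1, x2)
Z = np.column_stack([np.ones(n), x2])
a_hat, *_ = np.linalg.lstsq(Z, x1, rcond=None)
r1 = x1 - Z @ a_hat

# Slope from regressing y on r1 (r1 has mean zero, so no intercept is needed)
b1_partial = (r1 @ y) / (r1 @ r1)

print(b_full[1], b1_partial)   # identical up to rounding
```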

Comparison of Simple and Multiple Regression Estimates

Two fitted models:

\tilde{y} = \tilde{\beta}_0 + \tilde{\beta}_1 x_1 \quad \text{vs.} \quad \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2

These regressions will generally give different estimates of the slope on x_1.

There are two special cases in which they give the same estimate, \tilde{\beta}_1 = \hat{\beta}_1:

1. The partial effect of x_2 on y is zero in the sample: \hat{\beta}_2 = 0.

2. x_1 and x_2 are uncorrelated in the sample.

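A small sketch of the second special case, assuming a constructed sample in which x_1 and x_2 are exactly uncorrelated (orthogonal, mean-zero patterns); the simple and multiple regression slopes on x_1 then coincide:

```python
import numpy as np

rng = np.random.default_rng(8)
# Construct exactly uncorrelated regressors (orthogonal, mean-zero patterns)
x1 = np.tile([1.0, -1.0], 50)
x2 = np.tile([1.0, 1.0, -1.0, -1.0], 25)
n = x1.size
y = 2.0 + 1.0 * x1 + 3.0 * x2 + rng.normal(size=n)

# Simple regression slope of y on x1
x1c = x1 - x1.mean()
b1_simple = (x1c @ y) / (x1c @ x1c)

# Multiple regression slope on x1
X = np.column_stack([np.ones(n), x1, x2])
b1_multi = np.linalg.lstsq(X, y, rcond=None)[0][1]

print(np.corrcoef(x1, x2)[0, 1])   # exactly 0 in this sample
print(b1_simple, b1_multi)         # identical slopes (up to rounding)
```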

Goodness-of-Fit and Sums of Squares

SST, the total sum of squares, measures the total variation in y:

SST = \sum_{i=1}^{n} (y_i - \bar{y})^2

Note that Var(y) = SST/(n-1).

SSE, the explained sum of squares, measures the variation in the fitted values:

SSE = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2

SSR, the residual sum of squares, measures the unexplained variation:

SSR = \sum_{i=1}^{n} \hat{u}_i^2

Thus the total variation in y can be decomposed as

SST = SSE + SSR


Goodness-of-fit

Dividing both sides of this decomposition by SST:

1 = \frac{SSE}{SST} + \frac{SSR}{SST}

Coefficient of determination: the ratio of the explained variation to the total variation:

R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST}

Since SSE can never be larger than SST, we have 0 \le R^2 \le 1.

R^2 measures the fraction of the variation in y that is explained by variation in the x variables (multiplied by 100, a percentage). It can be used as a goodness-of-fit measure.

R^2 can also be computed as the squared sample correlation between y and the fitted values: R^2 = Corr(y, \hat{y})^2.

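A sketch verifying the decomposition and both R^2 formulas on simulated data (numpy only; the data-generating process is illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
x1, x2 = rng.normal(size=(2, n))
y = 1.0 + 0.8 * x1 + 0.4 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ b
u_hat = y - y_hat

SST = ((y - y.mean()) ** 2).sum()
SSE = ((y_hat - y.mean()) ** 2).sum()
SSR = (u_hat ** 2).sum()

print(SST, SSE + SSR)                      # SST = SSE + SSR
print(SSE / SST, 1 - SSR / SST)            # two equivalent R^2 formulas
print(np.corrcoef(y, y_hat)[0, 1] ** 2)    # same as the squared correlation
```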

Goodness-of-fit

Coefficient of determination:

R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST}

When a new x variable is added to the regression, R^2 always increases (or stays the same); it never decreases.

The reason is that SSR never increases when a new variable is added.

For this reason R^2 may not be a good criterion for model selection.

Instead of R^2 we will use the adjusted R^2 for model selection.

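A quick simulation of this point, assuming a pure-noise regressor (here called junk) that is unrelated to y; R^2 still does not fall when it is added:

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an OLS fit of y on the columns of X (intercept included in X)."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ b
    return 1 - (u @ u) / (((y - y.mean()) ** 2).sum())

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 + rng.normal(size=n)
junk = rng.normal(size=n)                  # pure noise, unrelated to y

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([X_small, junk])

print(r_squared(X_small, y))
print(r_squared(X_big, y))     # never smaller, even though 'junk' is irrelevant
```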

Goodness-of-fit: Example

College GPA:

\widehat{colGPA} = 1.29 + 0.453\, hsGPA + 0.0094\, ACT

n = 141, \quad R^2 = 0.176

The coefficient of determination is 0.176: 17.6% of the total variation in college GPA is explained by variation in hsGPA and ACT.

This is not very large, because many factors affecting GPA are not included in the model.


Regression through the Origin

The predicted y is 0 when all x_j are 0:

\hat{y} = \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \ldots + \hat{\beta}_k x_k

Note that the regression does not have a constant term.

R^2 (computed as 1 - SSR/SST) can be negative in this regression. In that case R^2 is either treated as 0, or the regression is re-estimated with an intercept term.

R^2 < 0 means that the sample average of y, \bar{y}, explains the total variation better than the model including the x variables.

When the constant term in the PRF is different from 0, the OLS estimators in the regression through the origin will be biased.

On the other hand, including a constant term when the true intercept is in fact 0 increases the variances of the OLS estimators.

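A sketch of the negative-R^2 case, assuming a data-generating process with a large true intercept that the through-the-origin fit cannot capture:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100
x = rng.normal(size=n)
y = 10.0 + 0.1 * x + rng.normal(size=n)    # large true intercept

# Regression through the origin: y on x only, no constant
b1 = (x @ y) / (x @ x)
u = y - b1 * x

# Using the centered SST, R^2 = 1 - SSR/SST can fall below zero
R2 = 1 - (u @ u) / (((y - y.mean()) ** 2).sum())
print(R2)   # negative: ybar predicts y better than b1*x does
```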

Assumptions for Unbiasedness of OLS Estimators

MLR.1 Linear in Parameters

y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_k x_k + u

The population model is linear in parameters.

MLR.2 Random Sampling

We have a random sample of n observations drawn from the population model defined in MLR.1:

\{(x_{i1}, x_{i2}, \ldots, x_{ik}, y_i) : i = 1, 2, \ldots, n\}


Assumptions for Unbiasedness of OLS Estimators

MLR.3 Zero Conditional Mean

E(u \mid x_1, x_2, \ldots, x_k) = 0

This assumption states that the explanatory variables are strictly exogenous: the random error term is uncorrelated with the explanatory variables.

There are several ways this assumption can fail:

1. Functional form misspecification: if the functional relationship between the explained and explanatory variables is misspecified, the assumption can fail.

2. Omitting an important variable that is correlated with any of the explanatory variables.

3. Measurement error in the explanatory variables.

If this assumption fails, we have endogenous explanatory variables.


Assumptions for Unbiasedness of OLS Estimators

MLR.4 No Perfect Collinearity

There is no perfect linear relationship among the independent x variables: none of the x variables can be written as a linear combination of the others. If this assumption fails, we have perfect collinearity.

If the x variables are perfectly collinear, the OLS estimators cannot be determined mathematically.

This assumption allows the independent variables to be correlated; they just cannot be perfectly correlated.

If we did not allow for any correlation among the independent variables, then multiple regression would be of very limited use for econometric analysis.

For example, in a regression of student GPA on education expenditures and family income, we suspect that expenditure and income are correlated, and we would like to hold income fixed to find the impact of expenditure on GPA.

Page 128: MULTIPLE REGRESSION ANALYSIS: ESTIMATION - …tastan/teaching/03 Multiple Regression... · Multiple Regression Analysis In the simple regression analysis with only one explanatory

Assumptions for Unbiasedness of OLS Estimators

MLR.4 No Perfect Collinearity

There is no perfect linear relationship between all independent xvariables. Any of the x variables cannot be written as a linearcombination of other independent variables. If this assumption failswe have perfect collinearity.

If x variables are perfectly collinear then it is not possible tomathematically determine OLS estimators.This assumption allows the independent variables to becorrelated: but they cannot be perfectly correlated.If we did not allow for any correlation among the independentvariables, then multiple regression would be of very limited usefor econometric analysis.For example, in the regression of student GPA on educationexpenditures and family income, we suspect that expenditureand income may be correlated and so we would like to holdincome fixed to find the impact of expenditure on GPA.

Econometrics I: Multiple Regression: Estimation - H. Tastan 32

Page 129: MULTIPLE REGRESSION ANALYSIS: ESTIMATION - …tastan/teaching/03 Multiple Regression... · Multiple Regression Analysis In the simple regression analysis with only one explanatory

Assumptions for Unbiasedness of OLS Estimators

MLR.4 No Perfect Collinearity

There is no perfect linear relationship between all independent xvariables. Any of the x variables cannot be written as a linearcombination of other independent variables. If this assumption failswe have perfect collinearity.

If x variables are perfectly collinear then it is not possible tomathematically determine OLS estimators.This assumption allows the independent variables to becorrelated: but they cannot be perfectly correlated.If we did not allow for any correlation among the independentvariables, then multiple regression would be of very limited usefor econometric analysis.For example, in the regression of student GPA on educationexpenditures and family income, we suspect that expenditureand income may be correlated and so we would like to holdincome fixed to find the impact of expenditure on GPA.

Econometrics I: Multiple Regression: Estimation - H. Tastan 32

Page 130: MULTIPLE REGRESSION ANALYSIS: ESTIMATION - …tastan/teaching/03 Multiple Regression... · Multiple Regression Analysis In the simple regression analysis with only one explanatory

Assumptions for Unbiasedness of OLS Estimators

MLR.4 No Perfect Collinearity

There is no perfect linear relationship between all independent xvariables. Any of the x variables cannot be written as a linearcombination of other independent variables. If this assumption failswe have perfect collinearity.

If x variables are perfectly collinear then it is not possible tomathematically determine OLS estimators.This assumption allows the independent variables to becorrelated: but they cannot be perfectly correlated.If we did not allow for any correlation among the independentvariables, then multiple regression would be of very limited usefor econometric analysis.For example, in the regression of student GPA on educationexpenditures and family income, we suspect that expenditureand income may be correlated and so we would like to holdincome fixed to find the impact of expenditure on GPA.

Econometrics I: Multiple Regression: Estimation - H. Tastan 32
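A minimal illustration of why estimation breaks down, assuming a simulated regressor that is an exact linear function of another: the design matrix loses full column rank, so X'X is singular and the OLS estimator is not uniquely defined.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50
x1 = rng.normal(size=n)
x2 = 3 * x1 - 2                     # exact linear function of x1: perfect collinearity

X = np.column_stack([np.ones(n), x1, x2])
print(np.linalg.matrix_rank(X))     # 2 < 3 columns: X'X is singular,
                                    # so the normal equations have no unique solution
```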

Finite Sample Properties of OLS Estimators

THEOREM: Unbiasedness of OLS

Under Assumptions MLR.1 through MLR.4, the OLS estimators are unbiased:

E(\hat{\beta}_j) = \beta_j, \quad j = 0, 1, 2, \ldots, k

The centers of the sampling distributions (i.e., the expectations) of the OLS estimators equal the unknown population parameters.

Remember that an individual estimate cannot be "unbiased": it is just a number computed from one sample, and in general we cannot say anything about its proximity to the true value.

Including Irrelevant Variables in a Regression Model

What happens if we add an irrelevant variable to the model (overspecify the model)?

Irrelevance of a variable means that its coefficient in the population model is zero.

E.g., suppose that in the regression below the partial effect of x_3 is zero, \beta_3 = 0:

y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u

Taking the conditional expectation, we have

E(y \mid x_1, x_2, x_3) = E(y \mid x_1, x_2) = \beta_0 + \beta_1 x_1 + \beta_2 x_2

Even though this is the true model, x_3 is added to the estimated model by mistake.


Including Irrelevant Variables in a Regression Model

In this case the SRF is

\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{\beta}_3 x_3

The OLS estimators are still unbiased:

E(\hat{\beta}_0) = \beta_0, \quad E(\hat{\beta}_1) = \beta_1, \quad E(\hat{\beta}_2) = \beta_2, \quad E(\hat{\beta}_3) = 0

The true parameter value for the irrelevant variable is 0. Since this variable has no explanatory power in the population, the expected value of its OLS estimator is also zero, i.e., the estimator is unbiased.

However, even though the estimators remain unbiased, their variances will generally be larger when the model is overspecified (see the simulation sketch below).

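A Monte Carlo sketch of both claims, assuming an irrelevant regressor x_3 (true coefficient zero) that is correlated with x_1: the slope estimator on x_1 stays centered on the truth in both models, but its sampling variance is larger in the overspecified one.

```python
import numpy as np

rng = np.random.default_rng(6)
n, reps = 100, 2000
b1_correct, b1_over = [], []

for _ in range(reps):
    x1 = rng.normal(size=n)
    x3 = 0.8 * x1 + rng.normal(size=n)   # irrelevant (beta3 = 0) but correlated with x1
    y = 1.0 + 0.5 * x1 + rng.normal(size=n)

    Xc = np.column_stack([np.ones(n), x1])        # correctly specified
    Xo = np.column_stack([np.ones(n), x1, x3])    # overspecified
    b1_correct.append(np.linalg.lstsq(Xc, y, rcond=None)[0][1])
    b1_over.append(np.linalg.lstsq(Xo, y, rcond=None)[0][1])

print(np.mean(b1_correct), np.mean(b1_over))  # both ~0.5: unbiased either way
print(np.var(b1_correct), np.var(b1_over))    # overspecified model has larger variance
```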

Omitting a Relevant Variable

What happens if we exclude an important variable?

If a relevant variable is omitted, its parameter is not 0 in the PRF. This is called underspecification of the model.

In this case the OLS estimators will generally be biased.

E.g., suppose that the PRF includes two independent variables:

y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u

Suppose that we omit x_2 because, say, it is unobservable. Now the SRF is

\tilde{y} = \tilde{\beta}_0 + \tilde{\beta}_1 x_1

Is the OLS estimator on x_1, \tilde{\beta}_1, still unbiased?


Omitting a Relevant Variable

The impact of the omitted variable is absorbed into the error term:

y = \beta_0 + \beta_1 x_1 + \nu

Since the true PRF includes x_2, the error term \nu can be written as

\nu = \beta_2 x_2 + u

The OLS estimator of \beta_1 in the short model is

\tilde{\beta}_1 = \frac{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1) y_i}{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2}

To determine the magnitude and sign of the bias, we substitute y into the formula for \tilde{\beta}_1, rearrange, and take the expectation.


Omitting a Relevant Variable

Substituting y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + u_i into the formula:

\tilde{\beta}_1 = \frac{\sum (x_{i1} - \bar{x}_1)(\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + u_i)}{\sum (x_{i1} - \bar{x}_1)^2}
               = \beta_1 + \beta_2 \frac{\sum (x_{i1} - \bar{x}_1) x_{i2}}{\sum (x_{i1} - \bar{x}_1)^2} + \frac{\sum (x_{i1} - \bar{x}_1) u_i}{\sum (x_{i1} - \bar{x}_1)^2}

Taking the (conditional) expectation, and using E(u_i) = 0 from MLR.3, the last term drops out and we obtain

E(\tilde{\beta}_1) = \beta_1 + \beta_2 \left( \frac{\sum (x_{i1} - \bar{x}_1) x_{i2}}{\sum (x_{i1} - \bar{x}_1)^2} \right)

Omitting a Relevant Variable

E(\tilde{\beta}_1) = \beta_1 + \beta_2 \left( \frac{\sum (x_{i1} - \bar{x}_1) x_{i2}}{\sum (x_{i1} - \bar{x}_1)^2} \right)

The expression in parentheses multiplying \beta_2 is just the slope estimate from the regression of x_2 on x_1:

\tilde{x}_2 = \tilde{\delta}_0 + \tilde{\delta}_1 x_1

Thus

E(\tilde{\beta}_1) = \beta_1 + \beta_2 \tilde{\delta}_1

bias = E(\tilde{\beta}_1) - \beta_1 = \beta_2 \tilde{\delta}_1

This is called omitted variable bias.

Omitted Variable Bias

bias = E(\tilde{\beta}_1) - \beta_1 = \beta_2 \tilde{\delta}_1

The bias is zero if and only if \tilde{\delta}_1 = 0 or \beta_2 = 0.

The sign of the bias depends on both the sign of \beta_2 and the sign of the correlation between the omitted variable (x_2) and the included variable (x_1).

This correlation cannot be estimated when the omitted variable is unobserved.

The following table summarizes the possible cases:

Direction of Bias

                 Corr(x_1, x_2) > 0    Corr(x_1, x_2) < 0
\beta_2 > 0      positive bias         negative bias
\beta_2 < 0      negative bias         positive bias


Page 158: MULTIPLE REGRESSION ANALYSIS: ESTIMATION - …tastan/teaching/03 Multiple Regression... · Multiple Regression Analysis In the simple regression analysis with only one explanatory

Omitted Variable Bias

bias = E(β1)− β1 = β2δ1

The size of the bias is also important: it depends on both δ̃1 and β2. A bias that is small relative to β1 may not be a problem in practice, but in most cases we cannot calculate its size.

In some cases we may still have an idea about the direction of the bias. For example, suppose that in the wage equation the true PRF contains both education and ability, and that ability is omitted because it cannot be observed, leading to omitted variable bias. Here we can say the sign of the bias is positive: it is reasonable to think that people with more ability tend to acquire more education, and ability is positively related to wage.


The effect of the omitted variable is absorbed into the error term, so MLR.3 fails.

wage = β0 + β1educ+ β2ability + u

Instead of this model we estimate

wage = β0 + β1educ+ ν

ν = β2ability + u

Education will be correlated with the error term (ν).

E(ν|educ) ≠ 0

Education is endogenous. If we omit ability, the effect of education on wages will be overestimated, because part of the measured effect of education actually comes from ability.


In models with more than two explanatory variables, omitting a relevant variable generally makes all of the OLS estimators biased.

The true PRF is given by:

y = β0 + β1x1 + β2x2 + β3x3 + u

Suppose x3 is omitted and the following model is estimated:

ỹ = β̃0 + β̃1 x1 + β̃2 x2

Also suppose that x3 is correlated with x1 but uncorrelated with x2.

In this case both β̃1 and β̃2 will generally be biased. Only if x1 and x2 are uncorrelated will β̃2 be unbiased (see the sketch below).
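A small simulation along the same lines makes the point; the DGP is again an illustrative assumption. x3 is omitted, x3 loads on x1 but not on x2, and x2 is drawn independently of x1, so β̃1 is biased while β̃2 stays (approximately) centered on β2.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 500, 2_000
b0, b1, b2, b3 = 1.0, 1.0, 1.0, 1.0   # illustrative true parameters

# Fixed regressors: x3 correlated with x1, independent of x2
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 0.8 * x1 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])   # estimated model omits x3
b1_t = np.empty(reps); b2_t = np.empty(reps)
for r in range(reps):
    y = b0 + b1*x1 + b2*x2 + b3*x3 + rng.normal(size=n)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    b1_t[r], b2_t[r] = coef[1], coef[2]

print("mean beta1_tilde:", b1_t.mean())  # clearly above 1.0: biased upward
print("mean beta2_tilde:", b2_t.mean())  # close to 1.0: essentially unbiased
```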


The Variance of the OLS Estimators

Assumption MLR.5: Homoscedasticity

This assumption states that, conditional on the x variables, the error term has constant variance:

Var(u|x1, x2, . . . , xk) = σ2

If this assumption fails, the model exhibits heteroscedasticity.

This assumption is essential for deriving the variances and standard errors of the OLS estimators and for showing that OLS is efficient.

We do not need this assumption for unbiasedness.

For example, in the wage equation this assumption implies that the variance of the unobserved factors does not change with the factors included in the model (education, experience, tenure, etc.); a small numerical sketch follows.
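Here is a hypothetical sketch of what MLR.5 rules out, with all numbers made up for illustration: two error processes over the same education values, one with constant conditional spread (MLR.5 holds) and one whose spread grows with education (MLR.5 fails).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
educ = rng.uniform(8, 20, size=n)

u_hom = rng.normal(scale=2.0, size=n)          # Var(u|educ) constant: MLR.5 holds
u_het = rng.normal(scale=0.5 * educ, size=n)   # spread grows with educ: MLR.5 fails

# Compare the error spread at low and high education levels
for lo, hi in [(8, 12), (16, 20)]:
    m = (educ >= lo) & (educ < hi)
    print(f"educ in [{lo},{hi}): homosc. sd = {u_hom[m].std():.2f}, "
          f"heterosc. sd = {u_het[m].std():.2f}")
```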


Gauss-Markov Assumptions

Assumptions MLR.1 through MLR.5 are collectively called the Gauss-Markov assumptions:
MLR.1: Linear in parameters,
MLR.2: Random sampling,
MLR.3: Zero conditional mean,
MLR.4: No perfect collinearity,
MLR.5: Homoscedasticity.

These assumptions are stated for cross-sectional data; they need to be modified for time series data.

Assumptions MLR.3 and MLR.5 can be restated in terms of the dependent variable:

E(y|x1, x2, . . . , xk) = β0 + β1x1 + β2x2 + . . .+ βkxk

Var(y|x1, x2, . . . , xk) = Var(u|x1, x2, . . . , xk) = σ2


Theorem: Variances of the β̂j

Under the Gauss-Markov assumptions (MLR.1-MLR.5):

Var(β̂j) = σ² / [SSTj (1 − R²j)],   j = 1, 2, . . . , k

where

SSTj = Σⁿᵢ₌₁ (xij − x̄j)²

is the total sample variation in xj, and R²j is the R-squared from regressing xj on all the other independent variables (including an intercept term).

Var(β̂j) increases with σ² and decreases with SSTj. To increase SSTj we need to collect more data (increase n); to reduce σ² we need to find good explanatory variables.
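The formula translates directly into code. The sketch below is my own illustration (function name and test data are assumptions): it computes SSTj, obtains R²j from the auxiliary regression of xj on the other regressors, and returns Var(β̂j) for a given σ².

```python
import numpy as np

def var_beta_j(X, j, sigma2):
    """Var(beta_j_hat) = sigma2 / (SST_j * (1 - R2_j)).
    X holds the regressors without the intercept column; j indexes a column."""
    n = X.shape[0]
    xj = X[:, j]
    SST_j = ((xj - xj.mean()) ** 2).sum()

    # Auxiliary regression: x_j on the other regressors plus an intercept
    Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
    fitted = Z @ np.linalg.lstsq(Z, xj, rcond=None)[0]
    R2_j = 1.0 - ((xj - fitted) ** 2).sum() / SST_j

    return sigma2 / (SST_j * (1.0 - R2_j))

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 3))
print(var_beta_j(X, j=0, sigma2=1.0))
```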


Var(β̂j) also depends on R²j, which measures the degree of linear relationship between xj and the other x variables. This term did not appear in simple regression analysis because there was only one explanatory variable. As the correlation among the x variables increases, the variances of the OLS estimators grow. A high (but imperfect) degree of linear dependence among the x variables is called the multicollinearity problem. In the limit R²j = 1, the variance is infinite and β̂j is not defined, but MLR.4 rules this case out.
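A quick numerical illustration of how the variance blows up; the correlation levels are arbitrary choices. As Corr(x1, x2) rises toward 1, R²1 rises toward 1 and the factor 1/(1 − R²1), often called the variance inflation factor, explodes.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5_000
x1 = rng.normal(size=n)
e = rng.normal(size=n)

for rho in [0.0, 0.5, 0.9, 0.99, 0.999]:
    x2 = rho * x1 + np.sqrt(1 - rho**2) * e    # Corr(x1, x2) ~ rho
    # R2_1 from the auxiliary regression of x1 on x2 (with intercept)
    Z = np.column_stack([np.ones(n), x2])
    fit = Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]
    R2 = 1 - ((x1 - fit)**2).sum() / ((x1 - x1.mean())**2).sum()
    print(f"rho={rho:<6} R2_1={R2:.4f}  1/(1-R2_1)={1/(1-R2):8.1f}")
```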


[Figure: the relationship between Var(β̂j) and R²j. By the formula above, the variance grows slowly for small R²j and without bound as R²j → 1, as the numerical sketch above also illustrates.]


Estimating Variances

An unbiased estimator of the error variance is:

σ̂² = (1 / (n − k − 1)) Σⁿᵢ₌₁ ûᵢ² = SSR / (n − k − 1)

Degrees of freedom:

dof = n − (k + 1) = number of observations − number of estimated parameters

The dof count comes from the OLS first order conditions. Recall that there are k + 1 of them; in other words, k + 1 restrictions are imposed on the residuals. This means that given n − (k + 1) of the residuals, the remaining k + 1 residuals are determined: the residuals have only n − k − 1 degrees of freedom. Notice that the errors ui themselves have n degrees of freedom.
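The role of the degrees-of-freedom correction can be verified by simulation; the DGP below is chosen arbitrarily. SSR/(n − k − 1) averages to the true σ², while SSR/n is biased downward.

```python
import numpy as np

rng = np.random.default_rng(5)
n, k, sigma2 = 50, 2, 4.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta = np.array([1.0, 2.0, -1.0])

reps = 5_000
ssr = np.empty(reps)
for r in range(reps):
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    ssr[r] = resid @ resid          # sum of squared residuals

print("mean SSR/(n-k-1):", (ssr / (n - k - 1)).mean())  # ~ 4.0, unbiased
print("mean SSR/n      :", (ssr / n).mean())            # below 4.0, biased down
```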


Standard deviations of the β̂j:

sd(β̂j) = σ / √(SSTj (1 − R²j)),   j = 1, 2, . . . , k

Standard errors of the β̂j (σ replaced by its estimate):

se(β̂j) = σ̂ / √(SSTj (1 − R²j)),   j = 1, 2, . . . , k

σ̂ = √σ̂² is the standard error of the regression (SER).

The SER is an estimator of the standard deviation of the error term; it may increase or decrease when a new variable is added to the model.

se(β̂j) is used to construct confidence intervals and to test hypotheses.
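The component formula above agrees with the usual matrix formula, in which the standard errors are the square roots of the diagonal of σ̂²(X′X)⁻¹. The sketch below (arbitrary data) computes both routes and shows they match.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 300
x = rng.normal(size=(n, 2))
X = np.column_stack([np.ones(n), x])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - X.shape[1])        # SSR/(n - k - 1), k = 2

# Matrix route: sqrt of the diagonal of sigma2_hat * (X'X)^{-1}
se_matrix = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(X.T @ X)))

# Component route: se(beta_j) = sigma_hat / sqrt(SST_j * (1 - R2_j))
def se_component(j):
    xj = X[:, j + 1]                                  # skip the intercept column
    SST = ((xj - xj.mean()) ** 2).sum()
    Z = np.delete(X, j + 1, axis=1)                   # intercept + other regressor
    fit = Z @ np.linalg.lstsq(Z, xj, rcond=None)[0]
    R2 = 1 - ((xj - fit) ** 2).sum() / SST
    return np.sqrt(sigma2_hat / (SST * (1 - R2)))

print("matrix route   :", se_matrix[1:])
print("component route:", [round(se_component(j), 6) for j in range(2)])
```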


Efficiency of OLS

Gauss-Markov Theorem

Under assumptions MLR.1 through MLR.5, the OLS estimators are the best linear unbiased estimators (BLUE) of the unknown population parameters. In other words, under MLR.1-MLR.5, the OLS estimators β̂0, β̂1, β̂2, . . . , β̂k are BLUE for β0, β1, β2, . . . , βk.

The Gauss-Markov theorem provides a justification for using OLS.

When the theorem holds, we need not look beyond OLS: within the class of linear unbiased estimators, OLS is the most efficient (best, minimum-variance) estimator.

If any of the five assumptions fails, the Gauss-Markov theorem does not hold: when MLR.3 fails, unbiasedness is lost; when MLR.5 fails, efficiency is lost. The sketch below illustrates the efficiency claim.
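One way to see what "best" means is to pit OLS against another linear unbiased estimator. In this illustrative simulation (DGP assumed, not from the text), a grouping estimator that contrasts mean y across high-x and low-x observations is also linear in y and unbiased given x, but its sampling variance is larger than that of OLS, as the theorem predicts.

```python
import numpy as np

rng = np.random.default_rng(7)
n, reps, beta1 = 100, 5_000, 2.0
x = rng.normal(size=n)                    # regressors held fixed across reps
hi, lo = x > np.median(x), x <= np.median(x)

ols = np.empty(reps)
grp = np.empty(reps)
for r in range(reps):
    y = 1.0 + beta1 * x + rng.normal(size=n)
    ols[r] = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    # grouping estimator: also a linear function of y, unbiased given x
    grp[r] = (y[hi].mean() - y[lo].mean()) / (x[hi].mean() - x[lo].mean())

print("means    :", ols.mean(), grp.mean())   # both near 2.0 (unbiased)
print("variances:", ols.var(), grp.var())     # OLS variance is smaller (BLUE)
```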


The Meaning of Linearity of Estimators in BLUE

Linearity of Estimators

An estimator β̂j of βj is linear if and only if it can be expressed as a linear function of the data on the dependent variable:

β̂j = Σⁿᵢ₌₁ wij yi

where each weight wij can be a function of the sample values of all the independent variables.

The OLS estimators can be written in this form:

β̂j = Σⁿᵢ₌₁ r̂ij yi / Σⁿᵢ₌₁ r̂ij² = Σⁿᵢ₌₁ wij yi,   where wij = r̂ij / Σⁿᵢ₌₁ r̂ij²

where r̂ij is the residual obtained from the regression of xj on all the other explanatory variables.
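This representation can be checked directly: partial xj out of the other regressors, form the weights wij = r̂ij / Σ r̂ij², and confirm that Σ wij yi reproduces the OLS coefficient. The data below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, 3.0]) + rng.normal(size=n)

j = 1                                        # check the weights for beta_1_hat
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

# r_ij: residuals from regressing x_j on all other regressors (incl. intercept)
Z = np.delete(X, j, axis=1)
r = X[:, j] - Z @ np.linalg.lstsq(Z, X[:, j], rcond=None)[0]

w = r / (r @ r)                              # w_ij = r_ij / sum_i r_ij^2
print(beta_hat[j], w @ y)                    # identical up to rounding
```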
