Christopher Dougherty EC220 - Introduction to econometrics (chapter 2) Slideshow: random components,...

Christopher Dougherty

EC220 - Introduction to econometrics (chapter 2)Slideshow: random components, unbiasedness of the regression coefficients

Original citation:

Dougherty, C. (2012) EC220 - Introduction to econometrics (chapter 2). [Teaching Resource]

© 2012 The Author

This version available at: http://learningresources.lse.ac.uk/128/

Available in LSE Learning Resources Online: May 2012

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License. This license allows the user to remix, tweak, and build upon the work even for commercial purposes, as long as the user credits the author and licenses their new creations under the identical terms. http://creativecommons.org/licenses/by-sa/3.0/

http://learningresources.lse.ac.uk/

http://learningresources.lse.ac.uk/128/

http://creativecommons.org/licenses/by-sa/3.0/

1

uXY 21

XbbY 21ˆ

The regression coefficients are special types of random variable. We will demonstrate this using the simple regression model in which Y depends on X. The two equations show the true model and the fitted regression.

RANDOM COMPONENTS, UNBIASEDNESS OF THE REGRESSION COEFFICIENTS

True model

Fitted model

2

22

XX

YYXXb

i

iiuXY 21

XbbY 21ˆ

We will investigate the behavior of the ordinary least squares (OLS) estimator of the slope coefficient, shown above.


True model

Fitted model

3

22

XX

YYXXb

i

iiuXY 21

XbbY 21ˆ

Y has two components: a nonrandom component that depends on X and the parameters, and the random component u. Since b2 depends on Y, it indirectly depends on u.


True model

Fitted model

4

22

XX

YYXXb

i

iiuXY 21

XbbY 21ˆ

If the values of u in the sample had been different, we would have had different values of Y, and hence a different value for b2. We can in theory decompose b2 into its nonrandom and random components.


True model

Fitted model

We will start with the numerator, substituting for Y and its sample mean from the true model.

5

22

XX

YYXXb

i

ii

uuXXXX

uuXXXX

uXuXXXYYXX

iii

iii

iiiii

22

2

2121

][][

][][

uXY 21

XbbY 21ˆ


True model

Fitted model

The 1 terms in the second factor cancel. We rearrange the remaining terms.

6

22

XX

YYXXb

i

ii

uuXXXX

uuXXXX

uXuXXXYYXX

iii

iii

iiiii

22

2

2121

][][

][][

uXY 21

XbbY 21ˆ


True model

Fitted model

We expand the product.

7

22

XX

YYXXb

i

ii

uuXXXX

uuXXXX

uXuXXXYYXX

iii

iii

iiiii

22

2

2121

][][

][][

uXY 21

XbbY 21ˆ


True model

Fitted model

Substituting this for the numerator in the expression for b2, we decompose b2 into the true value 2 and an error term that depends on the values of X and u.

8

22

XX

YYXXb

i

ii

uuXXXX

uuXXXX

uXuXXXYYXX

iii

iii

iiiii

22

2

2121

][][

][][

222

22

2XX

uuXX

XX

uuXXXXb

i

ii

i

iii

uXY 21 True model

XbbY 21ˆ

Fitted model


9

22

XX

YYXXb

i

ii

uuXXXX

uuXXXX

uXuXXXYYXX

iii

iii

iiiii

22

2

2121

][][

][][

222

22

2XX

uuXX

XX

uuXXXXb

i

ii

i

iii

uXY 21

XbbY 21ˆ

The error term depends on the value of the disturbance term in every observation in the sample, and thus it is a special type of random variable.


True model

Fitted model

10

22

XX

YYXXb

i

ii

uuXXXX

uuXXXX

uXuXXXYYXX

iii

iii

iiiii

22

2

2121

][][

][][

222

22

2XX

uuXX

XX

uuXXXXb

i

ii

i

iii

uXY 21

XbbY 21ˆ

The error term is responsible for the variations of b2 around its fixed component 2. If we wish, we can express the decomposition more tidily.


True model

Fitted model

11

2222

XX

uuXX

XX

YYXXb

i

ii

i

ii

This is the decomposition so far.


12

2222

XX

uuXX

XX

YYXXb

i

ii

i

ii

The next step is to make a small simplification of the numerator of the error term. First, we expand it as shown.

ii

iii

iiiii

uXX

XXuuXX

uXXuXXuuXX


13

2222

XX

uuXX

XX

YYXXb

i

ii

i

ii

The mean value of u is a common factor of the second summation, so it can be taken outside.

ii

iii

iiiii

uXX

XXuuXX

uXXuXXuuXX


14

2222

XX

uuXX

XX

YYXXb

i

ii

i

ii

The second term then vanishes because the sum of the deviations of X around its sample mean is automatically zero.

ii

iii

iiiii

uXX

XXuuXX

uXXuXXuuXX

0 XnXnXnXXX ii

n

XX i


15

2222

XX

uuXX

XX

YYXXb

i

ii

i

ii

Thus we can rewrite the decomposition as shown. For convenience, the denominator has been denoted .

iiii uXXuuXX

iiii

ii

iiii

uauXX

uXX

uXXuXX

b

222

222

1

1

2XX i


16

2222

XX

uuXX

XX

YYXXb

i

ii

i

ii

A further small rearrangement of the expression for the error term.

iiii uXXuuXX

iiii

ii

iiii

uauXX

uXX

uXXuXX

b

222

222

1

1


17

2222

XX

uuXX

XX

YYXXb

i

ii

i

ii

Another one.

iiii uXXuuXX

iiii

ii

iiii

uauXX

uXX

uXXuXX

b

222

222

1

1


18

2222

XX

uuXX

XX

YYXXb

i

ii

i

ii

One more.

iiii uXXuuXX

iiii

ii

iiii

uauXX

uXX

uXXuXX

b

222

222

1

1


19

2222

XX

uuXX

XX

YYXXb

i

ii

i

ii

Thus we have shown that b2 is equal to the true value and plus a weighted linear combination of the values of the disturbance term in the sample, where the weights are functions of the values of X in the observations in the sample.

iiii uXXuuXX

iiii

ii

iiii

uauXX

uXX

uXXuXX

b

222

222

1

1

2XX

XXXXa

j

iii


20

2222

XX

uuXX

XX

YYXXb

i

ii

i

ii

As you can see, every value of the disturbance term in the sample affects the sample value of b2.

iiii uXXuuXX

iiuab 22

2XX

XXXXa

j

iii


21

2222

XX

uuXX

XX

YYXXb

i

ii

i

ii

Before moving on, it may be helpful to clarify a mathematical technicality. In the summation in the denominator of the expression for ai, the subscript has been changed to j. Why?

iiii uXXuuXX

iiuab 22

2XX

XXXXa

j

iii


22

The denominator is the sum, from 1 to n, of the squared deviations of X from its sample mean. This is made explicit in the version of the expression in the box.

iiuab 22

2XX

XXXXa

j

iii

221

2 ... XXXX

XX

XX

XXXXa

n

i

j

iii


23

Written this way, the meaning of the denominator is clear, but the form is clumsy. Obviously, we should use –notation to compress it.

iiuab 22

221

2 ... XXXX

XX

XX

XXXXa

n

i

j

iii

n

jj

iii

XX

XXXXa

1

2


24

For the -notation, we need to choose an index symbol that changes as we go from the first squared deviation to the last. We can use anything we like, EXCEPT i, because we are already using i for a completely different purpose in the numerator.

iiuab 22

221

2 ... XXXX

XX

XX

XXXXa

n

i

j

iii

n

jj

iii

XX

XXXXa

1

2


25

We have used j here, but this was quite arbitrary. We could have used anything for the summation index (except i), as long as the meaning is clear. We could have used a smiley ☻ instead (please don’t).

iiuab 22

221

2 ... XXXX

XX

XX

XXXXa

n

i

j

iii

n

jj

iii

XX

XXXXa

1

2


26

iiuab 22

221

2 ... XXXX

XX

XX

XXXXa

n

i

j

iii

n

jj

iii

XX

XXXXa

1

2

The error term depends on the value of the disturbance term in every observation in the sample, and thus it is a special type of random variable.


27

iiuab 22

221

2 ... XXXX

XX

XX

XXXXa

n

i

j

iii

n

jj

iii

XX

XXXXa

1

2

We will show that the error term has expected value zero, and hence that the ordinary least squares (OLS) estimator of the slope coefficient in a simple regression model is unbiased.


28

it

i

ii uaXX

YYXXb 222

uXY 21

XbbY 21ˆ

n

jj

ii

XX

XXa

1

2

2

22

22

iiii

ii

uEauaE

uaEEbE

The expected value of b2 is equal to the expected value of 2 and the expected value of the weighted sum of the values of the disturbance term.


True model

Fitted model

29

it

i

ii uaXX

YYXXb 222

uXY 21

XbbY 21ˆ

n

jj

ii

XX

XXa

1

2

2

22

22

iiii

ii

uEauaE

uaEEbE

2 is fixed so it is unaffected by taking expectations. The first expectation rule (Review chapter) states that the expectation of a sum of several quantities is equal to the sum of their expectations.

iinnnnii uaEuaEuaEuauaEuaE ...... 1111


True model

Fitted model

30

it

i

ii uaXX

YYXXb 222

uXY 21

XbbY 21ˆ

n

jj

ii

XX

XXa

1

2

2

22

22

iiii

ii

uEauaE

uaEEbE

Now for each i, E(aiui) = aiE(ui). This is a really important step and we can make it only with Model A.


True model

Fitted model

31

it

i

ii uaXX

YYXXb 222

uXY 21

XbbY 21ˆ

n

jj

ii

XX

XXa

1

2

2

22

22

iiii

ii

uEauaE

uaEEbE

Under Model A, we are assuming that the values of X in the observations are nonstochastic. It follows that each ai is nonstochastic, since it is just a combination of the values of X.


True model

Fitted model

32

it

i

ii uaXX

YYXXb 222

uXY 21

XbbY 21ˆ

n

jj

ii

XX

XXa

1

2

2

22

22

iiii

ii

uEauaE

uaEEbE

Thus it can be treated as a constant, allowing us to take it out of the expectation using the second expected value rule (Review chapter).


True model

Fitted model

33

it

i

ii uaXX

YYXXb 222

uXY 21

XbbY 21ˆ

n

jj

ii

XX

XXa

1

2

2

22

22

iiii

ii

uEauaE

uaEEbE

Under Assumption A.3, E(ui) = 0 for all i, and so the estimator is unbiased. The proof of the unbiasedness of the estimator of the intercept will be left as an exercise.


True model

Fitted model

uXY 21

It is important to realize that the OLS estimators of the parameters are not the only unbiased estimators. We will give an example of another.


34

True model

uXY 21

Someone who had never heard of regression analysis, seeing a scatter diagram of a sample of observations, might estimate the slope by joining the first and the last observations, and dividing the increase in the height by the horizontal distance between them.

1

12

1

112

1

112121

1

12

XX

uu

XX

uuXX

XX

uXuX

XX

YYb

n

n

n

nn

n

nn

n

n

Y

Y1

X1 Xn

Yn

X

Yn – Y1

Xn – X1


35

True model

uXY 21

The estimator is thus (Yn–Y1) divided by (Xn–X1). We will investigate whether it is biased or unbiased.

1

12

1

112

1

112121

1

12

XX

uu

XX

uuXX

XX

uXuX

XX

YYb

n

n

n

nn

n

nn

n

n

Y

Y1

X1 Xn

Yn

X

Yn – Y1

Xn – X1


36

True model

uXY 21

To do this, we start by substituting for the Y components in the expression.

1

12

1

112

1

112121

1

12

XX

uu

XX

uuXX

XX

uXuX

XX

YYb

n

n

n

nn

n

nn

n

n

Y

Y1

X1 Xn

Yn

X

nnn uXY 21

11211 uXY


37

True model

uXY 21

The 1 terms cancel out and the rest of the expression simplifies as shown. Thus we have decomposed this naïve estimator into two components, the true value and an error term. This decomposition is parallel to that for the OLS estimator, but the error term is different.

1

12

1

112

1

112121

1

12

XX

uu

XX

uuXX

XX

uXuX

XX

YYb

n

n

n

nn

n

nn

n

n


38

True model

uXY 21

1

12

1

112

1

112121

1

12

XX

uu

XX

uuXX

XX

uXuX

XX

YYb

n

n

n

nn

n

nn

n

n

211

2

1

122

)(1

)()(

uuEXX

XXuu

EEbE

nn

n

n

We now take expectations to investigate unbiasedness.


39

True model

uXY 21

1

12

1

112

1

112121

1

12

XX

uu

XX

uuXX

XX

uXuX

XX

YYb

n

n

n

nn

n

nn

n

n

211

2

1

122

)(1

)()(

uuEXX

XXuu

EEbE

nn

n

n

The denominator of the error term can be taken outside because the values of X are nonstochastic.


40

True model

uXY 21

1

12

1

112

1

112121

1

12

XX

uu

XX

uuXX

XX

uXuX

XX

YYb

n

n

n

nn

n

nn

n

n

211

2

1

122

)(1

)()(

uuEXX

XXuu

EEbE

nn

n

n

Given Assumption A.3, the expectations of un and u1 are zero. Therefore, despite being naïve, this estimator is unbiased.


41

True model

uXY 21

1

12

1

112

1

112121

1

12

XX

uu

XX

uuXX

XX

uXuX

XX

YYb

n

n

n

nn

n

nn

n

n

211

2

1

122

)(1

)()(

uuEXX

XXuu

EEbE

nn

n

n

It is intuitively easy to see that we would not prefer the naïve estimator to OLS. Unlike OLS, which takes account of every observation, it employs only the first and the last and is wasting most of the information in the sample.


42

True model

uXY 21

1

12

1

112

1

112121

1

12

XX

uu

XX

uuXX

XX

uXuX

XX

YYb

n

n

n

nn

n

nn

n

n

211

2

1

122

)(1

)()(

uuEXX

XXuu

EEbE

nn

n

n

The naïve estimator will be sensitive to the value of the disturbance term u in those two observations, whereas the OLS estimator combines all the disturbance term values and takes greater advantage of the possibility that to some extent they cancel each other out.


43

True model

uXY 21

1

12

1

112

1

112121

1

12

XX

uu

XX

uuXX

XX

uXuX

XX

YYb

n

n

n

nn

n

nn

n

n

211

2

1

122

)(1

)()(

uuEXX

XXuu

EEbE

nn

n

n

More rigorously, it can be shown that the population variance of the naïve estimator is greater than that of the OLS estimator, and that the naïve estimator is therefore less efficient.


44

True model

Copyright Christopher Dougherty 2011.

These slideshows may be downloaded by anyone, anywhere for personal use.

Subject to respect for copyright and, where appropriate, attribution, they may be

used as a resource for teaching an econometrics course. There is no need to

refer to the author.

The content of this slideshow comes from Section 2.3 of C. Dougherty,

Introduction to Econometrics, fourth edition 2011, Oxford University Press.

Additional (free) resources for both students and instructors may be

downloaded from the OUP Online Resource Centre

http://www.oup.com/uk/orc/bin/9780199567089/.

Individuals studying econometrics on their own and who feel that they might

benefit from participation in a formal course should consider the London School

of Economics summer school course

EC212 Introduction to Econometrics

http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx

or the University of London International Programmes distance learning course

20 Elements of Econometrics

www.londoninternational.ac.uk/lse.

11.07.25

http://www.oup.com/uk/orc/bin/9780199567089/

http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx

Christopher Dougherty EC220 - Introduction to econometrics (chapter 2) Slideshow: random components,...

Documents

Transcript of Christopher Dougherty EC220 - Introduction to econometrics (chapter 2) Slideshow: random components,...