Christopher Dougherty
EC220 - Introduction to econometrics (review chapter)
Slideshow: the central limit theorem

Original citation: Dougherty, C. (2012) EC220 - Introduction to econometrics (review chapter). [Teaching Resource]

© 2012 The Author

This version available at: http://learningresources.lse.ac.uk/141/
Available in LSE Learning Resources Online: May 2012

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License. This license allows the user to remix, tweak, and build upon the work even for commercial purposes, as long as the user credits the author and licenses their new creations under the identical terms. http://creativecommons.org/licenses/by-sa/3.0/

http://learningresources.lse.ac.uk/


THE CENTRAL LIMIT THEOREM

If a random variable X has a normal distribution, its sample mean $\bar{X}$ will also have a normal distribution. This fact is useful for the construction of t statistics and confidence intervals if we are employing $\bar{X}$ as an estimator of the population mean.

However, what happens if we are not able to assume that X is normally distributed?

The standard response is to make use of a central limit theorem. Loosely speaking, a central limit theorem states that the distribution of $\bar{X}$ will approximate a normal distribution as the sample size becomes large, even when the distribution of X itself is not normal.

There are a number of central limit theorems, differing only in the assumptions that they make in order to obtain this result. Here we shall be content with using the simplest one, the Lindeberg–Levy central limit theorem.

It states that, provided that the $X_i$ in the sample are all drawn independently from the same distribution (the distribution of X), and provided that this distribution has finite population mean and variance, the distribution of $\bar{X}$ will converge on a normal distribution.

This means that our t statistics and confidence intervals will be approximately valid after all, provided that the sample size is large enough.

The figure shows the distribution of $\bar{X}$ for the case where X has a uniform distribution with range 0 to 1, for 10,000,000 samples. A uniform distribution is one in which all values over a finite range are equally likely.

[Figure: distribution of $\bar{X}$ when X is uniform on 0 to 1, shown for n = 1; horizontal axis from 0 to 1, vertical axis from 0 to 15.]

For a sample of 1, the distribution of $\bar{X}$ is the uniform distribution itself, and so it is a horizontal line.

We now show the distribution of $\bar{X}$ for a sample of size 10, for 10,000,000 samples. It can be seen that $\bar{X}$ has a distribution very close to a normal distribution even when the sample size is quite small.

[Figure: distribution of $\bar{X}$ when X is uniform on 0 to 1, shown for n = 1 and n = 10; horizontal axis from 0 to 1, vertical axis from 0 to 15.]

Here is the distribution of $\bar{X}$ for 10,000,000 samples, each of size 25. It is even closer to normal.

[Figure: distribution of $\bar{X}$ when X is uniform on 0 to 1, shown for n = 1, n = 10 and n = 25; horizontal axis from 0 to 1, vertical axis from 0 to 15.]

Here is the distribution of $\bar{X}$ for 10,000,000 samples, each of size 100. It is indistinguishable from normal.

[Figure: distribution of $\bar{X}$ when X is uniform on 0 to 1, shown for n = 1, n = 10, n = 25 and n = 100; horizontal axis from 0 to 1, vertical axis from 0 to 15.]
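The experiment behind these figures can be imitated on a small scale. The following is a minimal sketch in Python, not the author's code: it uses 20,000 samples rather than 10,000,000 so that it runs quickly with the standard library alone, and the function names `simulate_sample_means` and `summarize` are made up for illustration. It compares the simulated mean and variance of the sample mean with the values implied by a uniform distribution on 0 to 1, namely 0.5 and (1/12)/n.

```python
import random
import statistics


def simulate_sample_means(n, replications, rng):
    """Return `replications` sample means, each computed from n Uniform(0, 1) draws."""
    return [
        statistics.fmean([rng.random() for _ in range(n)])
        for _ in range(replications)
    ]


def summarize(n, replications=20_000, seed=0):
    """Mean and variance of the simulated sample means for sample size n."""
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    means = simulate_sample_means(n, replications, rng)
    return statistics.fmean(means), statistics.pvariance(means)


if __name__ == "__main__":
    for n in (1, 10, 25, 100):
        mean, var = summarize(n)
        # Theory for Uniform(0, 1): E(X-bar) = 0.5 and Var(X-bar) = (1/12) / n.
        print(f"n={n:3d}  mean={mean:.4f}  var={var:.6f}  theory={1 / (12 * n):.6f}")
```

A histogram of the simulated means would be needed to reproduce the curves themselves; the summary statistics only show the distribution tightening around 0.5 as n grows.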

If X had a different distribution, the sample size required for a good approximation would be different. The figure shows the case where X has a lognormal distribution. As you can see, it is heavily skewed.

[Figure: lognormal distribution of X, shown for n = 1; horizontal axis from 0 to 7, vertical axis from 0.0 to 2.0.]

Here is the distribution of $\bar{X}$ for sample size 10, for 10,000,000 samples. It is less skewed.

[Figure: distribution of $\bar{X}$ when X is lognormal, shown for n = 1 and n = 10; horizontal axis from 0 to 7, vertical axis from 0.0 to 2.0.]

With sample size 25, the distribution is increasingly symmetrical.

[Figure: distribution of $\bar{X}$ when X is lognormal, shown for n = 1, n = 10 and n = 25; horizontal axis from 0 to 7, vertical axis from 0.0 to 2.0.]

However, even with sample size 100, the distribution is only an approximation to a normal distribution. Notice the difference in the shapes of the tails.

[Figure: distribution of $\bar{X}$ when X is lognormal, shown for n = 1, n = 10, n = 25 and n = 100; horizontal axis from 0 to 7, vertical axis from 0.0 to 2.0.]
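The slow loss of skewness in the lognormal case can be checked numerically. The sketch below is illustrative only: the slides do not state the parameters of the lognormal distribution, so it assumes log X ~ N(0, 1), and it uses 5,000 samples rather than 10,000,000. The helper names `skewness` and `skewness_of_sample_mean` are hypothetical. A normal distribution has skewness zero, so the estimate should shrink toward zero as n grows.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: mean of the cubed standardized deviations."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values, mu=mean)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])


def skewness_of_sample_mean(n, replications=5_000, seed=0):
    """Skewness of the simulated sample mean of n lognormal draws.

    Assumption: log X ~ N(0, 1); the source does not give the parameters.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.lognormvariate(0.0, 1.0) for _ in range(n)])
        for _ in range(replications)
    ]
    return skewness(means)


if __name__ == "__main__":
    for n in (1, 10, 25, 100):
        print(f"n={n:3d}  skewness of sample mean = {skewness_of_sample_mean(n):.2f}")
```

Sample skewness is itself a noisy statistic for heavy-tailed data, so the printed values should be read as rough indications rather than precise estimates.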

In asserting that the distribution of $\bar{X}$ tends to become normal as the sample size increases, we have glossed over an important technical point that needs to be addressed. The central limit theorem applies only in the limit, as the sample size tends to infinity.

However, as the sample size tends to infinity, the distribution of $\bar{X}$ degenerates to a spike located at the population mean. So how can we talk about the limiting distribution being normal?
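One way to see why the distribution collapses to a spike is Chebyshev's inequality. As a sketch, write $\mu$ for the population mean and $\sigma^2$ for the finite population variance of X, assume independent draws so that the variance of the sample mean is $\sigma^2/n$, and let $\varepsilon$ be any fixed positive tolerance (a symbol introduced here purely for illustration):

```latex
\[
P\left(|\bar{X} - \mu| \ge \varepsilon\right)
\le \frac{\operatorname{Var}(\bar{X})}{\varepsilon^2}
= \frac{\sigma^2}{n\varepsilon^2}
\longrightarrow 0
\quad \text{as } n \to \infty .
\]
```

Under these assumptions, the probability that the sample mean lies outside any fixed interval around $\mu$ vanishes, which is what the spike represents.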

The answer is to transform the estimator in an appropriate way so that the transformation does have a limiting distribution. Having established the limiting distribution of the transformation, we may be able to work backwards to the properties of the estimator.

Suppose that X has variance $\sigma^2$.

Then, for sample size n, $\bar{X}$ has variance $\sigma^2/n$.

It follows that $\sqrt{n}\,\bar{X}$ has variance $\sigma^2$.

However, as the sample size becomes large, $\bar{X}$ tends to $\mu$, the population mean of X. Hence $\sqrt{n}\,\bar{X}$ tends to $\sqrt{n}\,\mu$. As a consequence, $\sqrt{n}\,\bar{X}$ does not have a limiting distribution. It increases indefinitely with n.

To deal with this, we consider instead the statistic $\sqrt{n}(\bar{X} - \mu)$.

The Lindeberg–Levy central limit theorem relates to this statistic, and not directly to $\bar{X}$. It states that

$$\sqrt{n}(\bar{X} - \mu) \xrightarrow{d} N(0, \sigma^2)$$

Let us write the variance of X as $\sigma^2$. Then the variance of the sample mean is $\sigma^2/n$. It follows that $\sqrt{n}\,\bar{X}$ has variance $\sigma^2$, which is independent of n. We are making progress in finding the appropriate transformation.
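The two variance results quoted here follow from the independence of the observations. A short derivation, assuming the $X_i$ are independent draws with common variance $\sigma^2$, as the Lindeberg–Levy conditions require:

```latex
\begin{align*}
\operatorname{Var}(\bar{X})
  &= \operatorname{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
   = \frac{\sigma^2}{n}, \\
\operatorname{Var}\left(\sqrt{n}\,\bar{X}\right)
  &= n\operatorname{Var}(\bar{X})
   = \sigma^2 .
\end{align*}
```

Multiplying by $\sqrt{n}$ therefore exactly offsets the shrinking variance of the sample mean.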

However, as the sample size becomes large, the sample mean tends to the population mean of X, which we will denote $\mu$. Thus $\sqrt{n}\,\bar{X}$ tends to $\sqrt{n}\,\mu$. This increases with n, so it cannot have a limiting distribution.

To deal with this, we consider instead the statistic $\sqrt{n}(\bar{X} - \mu)$. This is what we need. The Lindeberg–Levy central limit theorem states that, as n tends to infinity, the distribution of this statistic tends to a normal distribution with mean zero and variance $\sigma^2$.
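The effect of the transformation can be illustrated by simulation. The following sketch, which is an illustration and not part of the original slides, draws from a uniform distribution on 0 to 1 (population mean 0.5, population variance 1/12) and estimates the variance of the sample mean and of the statistic $\sqrt{n}(\bar{X} - \mu)$ for several sample sizes. The function name `variances` and the number of replications are arbitrary choices.

```python
import math
import random
import statistics

MU = 0.5           # population mean of Uniform(0, 1)
SIGMA2 = 1.0 / 12  # population variance of Uniform(0, 1)


def variances(n, replications=10_000, seed=0):
    """Estimate (Var of X-bar, Var of sqrt(n) * (X-bar - mu)) by simulation."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.random() for _ in range(n)])
        for _ in range(replications)
    ]
    scaled = [math.sqrt(n) * (m - MU) for m in means]
    return statistics.pvariance(means), statistics.pvariance(scaled)


if __name__ == "__main__":
    for n in (1, 10, 100):
        var_mean, var_scaled = variances(n)
        # Var(X-bar) shrinks like sigma^2 / n; the scaled statistic stays near sigma^2.
        print(f"n={n:3d}  Var(mean)={var_mean:.5f}  Var(scaled)={var_scaled:.5f}  sigma^2={SIGMA2:.5f}")
```

The first column should fall toward zero while the second stays close to 1/12, which is why the scaled statistic can have a non-degenerate limiting distribution.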

The arrow with a d over it, as in $\sqrt{n}(\bar{X} - \mu) \xrightarrow{d} N(0, \sigma^2)$, is mathematical shorthand that means ‘has limiting distribution as n tends to infinity’.

So far we have talked of ‘the’ central limit theorem. In fact, there are numerous CLTs. They differ in the assumptions required for their use. The Lindeberg–Levy CLT is a particularly simple one and sufficient for the present analysis.

Now this relationship is true only as n goes to infinity. However, from the limiting distribution, we can start working back tentatively to finite samples. We can say that, for large n, the relationship may hold approximately.

$$\sqrt{n}(\bar{X} - \mu) \xrightarrow{d} N(0, \sigma^2)$$

This is an exact statement. However, having established it, we can say that, as an approximation, for sufficiently large n,

$$(\bar{X} - \mu) \sim N\left(0, \frac{\sigma^2}{n}\right)$$

and hence that, as an approximation, for sufficiently large n,

$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$

Then, dividing the statistic by $\sqrt{n}$, we can say that, for sufficiently large n, the second equation, $(\bar{X} - \mu) \sim N(0, \sigma^2/n)$, is approximately true. (The symbol ~ means ‘is distributed as’.)

This implies the third equation, $\bar{X} \sim N(\mu, \sigma^2/n)$. We knew that the sample mean was distributed with mean $\mu$ and variance $\sigma^2/n$. What we have shown is that its distribution is approximately normal in sufficiently large samples. This enables us to perform the usual tests.
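As an illustration of what the approximation makes possible, the sketch below computes a large-sample confidence interval for the population mean. It is a simplified example under stated assumptions: the unknown $\sigma$ is replaced by the sample standard deviation, the critical value is taken from the standard normal distribution rather than from a t distribution, and the function name `approx_confidence_interval` is hypothetical.

```python
import math
import statistics


def approx_confidence_interval(sample, level=0.95):
    """Large-sample interval for the population mean: X-bar +/- z * s / sqrt(n).

    Assumption: n is large enough for X-bar to be approximately normal,
    and the sample standard deviation s stands in for the unknown sigma.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value from the standard normal (about 1.96 for 95%).
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * std_error, mean + z * std_error


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0] * 20  # made-up data, 100 observations
    low, high = approx_confidence_interval(data)
    print(f"approximate 95% interval: ({low:.3f}, {high:.3f})")
```

Whether such an interval has close to its nominal coverage depends on the sample size and on the shape of the distribution of X.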

Of course, this raises the question of what might be considered to be ‘sufficiently large n’. To answer this question, the analysis must be supplemented by simulation.
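One possible form for such a simulation is a coverage check: draw many samples, build a nominal 95% interval from each, and count how often the interval contains the true mean. The sketch below is one way to do this, not the method used for the slides. It assumes a lognormal distribution with log X ~ N(0, 1) (the slides do not give the parameters), uses the sample standard deviation in place of $\sigma$, and all names are illustrative.

```python
import math
import random
import statistics

Z_95 = statistics.NormalDist().inv_cdf(0.975)  # about 1.96


def coverage(draw, true_mean, n, replications=4_000, seed=0):
    """Share of nominal 95% intervals X-bar +/- z*s/sqrt(n) that contain true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replications):
        sample = [draw(rng) for _ in range(n)]
        mean = statistics.fmean(sample)
        half_width = Z_95 * statistics.stdev(sample) / math.sqrt(n)
        if abs(mean - true_mean) <= half_width:
            hits += 1
    return hits / replications


def draw_uniform(rng):
    """One draw from Uniform(0, 1), whose mean is 0.5."""
    return rng.random()


def draw_lognormal(rng):
    """One draw with log X ~ N(0, 1) (assumed parameters), whose mean is exp(0.5)."""
    return rng.lognormvariate(0.0, 1.0)


if __name__ == "__main__":
    for n in (10, 25, 100):
        cu = coverage(draw_uniform, 0.5, n)
        cl = coverage(draw_lognormal, math.exp(0.5), n)
        print(f"n={n:3d}  uniform coverage={cu:.3f}  lognormal coverage={cl:.3f}")
```

Coverage well below 0.95 at a given n suggests that n is not yet sufficiently large for that distribution.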

The figure shows the distribution of $\sqrt{n}(\bar{X} - \mu)$ for the uniform distribution when n = 1. It is, of course, just the uniform distribution itself, with the mean of 0.5 subtracted.

[Figure: distribution of $\sqrt{n}(\bar{X} - \mu)$ for the uniform distribution; horizontal axis from -1.5 to 1.5, vertical axis from 0.0 to 2.0.]

Here is the distribution of $\sqrt{n}(\bar{X} - \mu)$ when n = 10. It looks very like a normal distribution.

Here is the same figure with the theoretical limiting normal distribution superimposed (the dashed curve in red). It confirms that the distribution for the sample mean has virtually converged to normality with a sample size of only 10.

The curve for n = 25 has been added. There is hardly any change because convergence has already been achieved.

Of course, the curve for n = 100 also coincides.

Now consider the example of the lognormal distribution. Here is the distribution of $\sqrt{n}(\bar{X} - \mu)$ for n = 1. It is just the lognormal distribution itself with the mean subtracted.

[Figure: distribution of $\sqrt{n}(\bar{X} - \mu)$ for the lognormal distribution; horizontal axis from -6 to 6, vertical axis from 0 to 0.5.]

Here is the distribution of $\sqrt{n}(\bar{X} - \mu)$ for n = 10. The theoretical limiting distribution is also shown. Clearly, n = 10 is far from being ‘sufficiently large’.

Here is the distribution of $\sqrt{n}(\bar{X} - \mu)$ for n = 25. It is closer to the limiting distribution but there is still a long way to go.

Here is the distribution of $\sqrt{n}(\bar{X} - \mu)$ for n = 100. It is closer still to the limiting distribution but convergence has not been achieved. In the case of the lognormal distribution, a sample size of 100 is clearly not ‘sufficiently large’. We should try 200, perhaps 500.
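Trying larger sample sizes is straightforward to script. The sketch below measures how far the simulated distribution of $\sqrt{n}(\bar{X} - \mu)$ is from its limiting normal distribution, using the largest gap between the empirical and theoretical cumulative distribution functions as a rough yardstick. This distance measure is a choice made here for illustration, not one taken from the slides. The lognormal parameters are again assumed (log X ~ N(0, 1)), and only 2,000 samples are used per sample size, so gaps of a few hundredths are within simulation noise.

```python
import math
import random
import statistics

# Assumption: log X ~ N(0, 1); the source does not give the lognormal parameters.
MU = math.exp(0.5)                        # E(X) under that assumption
SIGMA = math.sqrt((math.e - 1) * math.e)  # sd(X), since Var(X) = (e - 1) * e


def scaled_statistic(n, rng):
    """One realization of sqrt(n) * (X-bar - mu) from n lognormal draws."""
    mean = statistics.fmean([rng.lognormvariate(0.0, 1.0) for _ in range(n)])
    return math.sqrt(n) * (mean - MU)


def max_cdf_gap(n, replications=2_000, seed=0):
    """Largest gap between the simulated CDF of the statistic and N(0, sigma^2)."""
    rng = random.Random(seed)
    limit = statistics.NormalDist(0.0, SIGMA)
    draws = sorted(scaled_statistic(n, rng) for _ in range(replications))
    gap = 0.0
    for i, x in enumerate(draws):
        theoretical = limit.cdf(x)
        # Check both sides of the jump in the empirical CDF at x.
        gap = max(gap,
                  abs((i + 1) / replications - theoretical),
                  abs(i / replications - theoretical))
    return gap


if __name__ == "__main__":
    for n in (10, 25, 100, 200, 500):
        print(f"n={n:3d}  max CDF gap = {max_cdf_gap(n):.3f}")
```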


Copyright Christopher Dougherty 2011.

These slideshows may be downloaded by anyone, anywhere for personal use. Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author.

The content of this slideshow comes from Section R.15 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre http://www.oup.com/uk/orc/bin/9780199567089/.

Individuals studying econometrics on their own and who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx or the University of London International Programmes distance learning course 20 Elements of Econometrics www.londoninternational.ac.uk/lse.

11.07.25