Lecture 3

Today:
• Statistical Review cont’d:
  • Unbiasedness and efficiency
  • Sample equivalents of variance, covariance and correlation
  • Probability limits and consistency (quick)
• The Simple Regression Model


© Christopher Dougherty 1999–2006

SAMPLING AND ESTIMATORS

We will next demonstrate that the variance of the distribution of the sample mean $\bar{X}$ is smaller than that of X, as depicted in the diagram.

[Figure: probability density functions of X and of $\bar{X}$, both centred on $\mu_X$; the distribution of $\bar{X}$ is the narrower of the two.]

We start by replacing $\bar{X}$ by its definition and then using variance rule 2 to take 1/n out of the expression as a common factor.

\[
\sigma_{\bar{X}}^2 = \operatorname{var}(\bar{X}) = \operatorname{var}\left\{\frac{1}{n}(X_1 + \dots + X_n)\right\} = \frac{1}{n^2}\operatorname{var}(X_1 + \dots + X_n)
\]

Next we use variance rule 1 to replace the variance of a sum with a sum of variances. In principle there are many covariance terms as well, but they are zero if we assume that the sample values are generated independently.

\[
\frac{1}{n^2}\operatorname{var}(X_1 + \dots + X_n) = \frac{1}{n^2}\{\operatorname{var}(X_1) + \dots + \operatorname{var}(X_n)\}
\]

Now we come to the bit that requires thought. Start with $X_1$. When we are still at the planning stage, we do not know what the value of $X_1$ will be.

All we know is that it will be generated randomly from the distribution of X. The variance of $X_1$, as a beforehand concept, will therefore be $\sigma_X^2$. The same is true for all the other sample components, thinking about them beforehand. Hence we write this line.

\[
\frac{1}{n^2}\{\operatorname{var}(X_1) + \dots + \operatorname{var}(X_n)\} = \frac{1}{n^2}(\sigma_X^2 + \dots + \sigma_X^2)
\]

\[
\begin{aligned}
\sigma_{\bar{X}}^2 = \operatorname{var}(\bar{X}) &= \operatorname{var}\left\{\frac{1}{n}(X_1 + \dots + X_n)\right\} \\
&= \frac{1}{n^2}\operatorname{var}(X_1 + \dots + X_n) \\
&= \frac{1}{n^2}\{\operatorname{var}(X_1) + \dots + \operatorname{var}(X_n)\} \\
&= \frac{1}{n^2}(\sigma_X^2 + \dots + \sigma_X^2) \\
&= \frac{1}{n^2}(n\sigma_X^2) = \frac{\sigma_X^2}{n}
\end{aligned}
\]

Thus we have demonstrated that the variance of the sample mean is equal to the variance of X divided by n, a result with which you will be familiar from your statistics course.
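The result that the variance of the sample mean equals the variance of X divided by n can be checked numerically. The sketch below is an illustration, not part of the lecture: it assumes a normally distributed X with mean 100 and standard deviation 50 (illustrative values), and the function name is hypothetical.

```python
import random
import statistics


def simulate_sample_mean_variance(n, sigma=50.0, mu=100.0, reps=20000, seed=0):
    """Estimate var(X-bar) by drawing `reps` independent samples of size n.

    Assumption: X is normal with mean `mu` and standard deviation `sigma`.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]
    # Variance of the simulated sample means around their own average.
    return statistics.pvariance(means)


if __name__ == "__main__":
    for n in (1, 4, 25):
        simulated = simulate_sample_mean_variance(n)
        theoretical = 50.0 ** 2 / n
        print(f"n={n:>2}  simulated={simulated:8.1f}  sigma^2/n={theoretical:8.1f}")
```

With enough replications the simulated variance should lie close to the theoretical value for each n, though it will not match exactly because it is itself an estimate.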

UNBIASEDNESS AND EFFICIENCY

Unbiasedness of $\bar{X}$:

\[
\begin{aligned}
E(\bar{X}) &= E\left\{\frac{1}{n}(X_1 + \dots + X_n)\right\} = \frac{1}{n}E(X_1 + \dots + X_n) \\
&= \frac{1}{n}\{E(X_1) + \dots + E(X_n)\} = \frac{1}{n}(n\mu_X) = \mu_X
\end{aligned}
\]

However, the sample mean is not the only unbiased estimator of the population mean. We will demonstrate this supposing that we have a sample of two observations (to keep it simple).

Generalized estimator $Z = \lambda_1 X_1 + \lambda_2 X_2$:

\[
\begin{aligned}
E(Z) &= E(\lambda_1 X_1 + \lambda_2 X_2) = E(\lambda_1 X_1) + E(\lambda_2 X_2) \\
&= \lambda_1 E(X_1) + \lambda_2 E(X_2) = (\lambda_1 + \lambda_2)\mu_X \\
&= \mu_X \quad \text{if } (\lambda_1 + \lambda_2) = 1
\end{aligned}
\]

Thus Z is an unbiased estimator of $\mu_X$ if the sum of the weights is equal to one. An infinite number of combinations of $\lambda_1$ and $\lambda_2$ satisfy this condition, not just the sample mean (for which $\lambda_i = 1/n$, here 1/2).

Generalized estimator $Z = \lambda_1 X_1 + \lambda_2 X_2$ is an unbiased estimator of $\mu_X$ if the sum of the weights is equal to one. An infinite number of combinations of the $\lambda_i$ satisfy this condition, not just the sample mean.

How do we choose among them? The answer is to use the most efficient estimator, the one with the smallest population variance, because it will tend to be the most accurate.

In the diagram, A and B are both unbiased estimators but B is superior because it is more efficient.

[Figure: probability density functions of estimators A and B, both centred on $\mu_X$; B has the smaller variance.]

Generalized estimator $Z = \lambda_1 X_1 + \lambda_2 X_2$

We will analyze the variance of the generalized estimator and find out what condition the weights must satisfy in order to minimize it.

\[
\sigma_Z^2 = \operatorname{var}(\lambda_1 X_1 + \lambda_2 X_2)
\]

The first variance rule is used to decompose the variance.

\[
\operatorname{var}(\lambda_1 X_1 + \lambda_2 X_2) = \operatorname{var}(\lambda_1 X_1) + \operatorname{var}(\lambda_2 X_2) + 2\operatorname{cov}(\lambda_1 X_1, \lambda_2 X_2)
\]

Note that we are assuming that $X_1$ and $X_2$ are independent observations and so their covariance is zero. The second variance rule is used to bring $\lambda_1$ and $\lambda_2$ out of the variance expressions.

\[
\sigma_Z^2 = \lambda_1^2 \operatorname{var}(X_1) + \lambda_2^2 \operatorname{var}(X_2)
\]

The variance of $X_1$, at the planning stage, is $\sigma_X^2$. The same goes for the variance of $X_2$.

\[
\sigma_Z^2 = (\lambda_1^2 + \lambda_2^2)\sigma_X^2
\]

At this step, you can use the following result,

“If $\lambda_1 + \lambda_2 = 1$, then $\lambda_1^2 + \lambda_2^2 \ge \tfrac{1}{2}$.”

to show that the sample mean is the most efficient of these estimators: with $\lambda_1 = \lambda_2 = \tfrac{1}{2}$ the sum of the squared weights equals its lower bound of ½, so no other unbiased choice of weights has a lower variance.
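The inequality quoted above is stated without proof. One way to see it, offered here as a short supplementary derivation rather than as the lecture's own argument, is to substitute $\lambda_2 = 1 - \lambda_1$ and complete the square:

```latex
\[
\lambda_1^2 + \lambda_2^2
  = \lambda_1^2 + (1 - \lambda_1)^2
  = 2\lambda_1^2 - 2\lambda_1 + 1
  = 2\left(\lambda_1 - \tfrac{1}{2}\right)^2 + \tfrac{1}{2}
  \;\ge\; \tfrac{1}{2},
\]
% Equality holds only when \lambda_1 = 1/2, and hence \lambda_2 = 1/2.
```

Equality holds only at $\lambda_1 = \lambda_2 = \tfrac{1}{2}$, which is the sample mean of the two observations.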

Or, you can use calculus as follows:

We take account of the condition for unbiasedness and rewrite the variance of Z, substituting for $\lambda_2$.

\[
\sigma_Z^2 = (\lambda_1^2 + [1 - \lambda_1]^2)\sigma_X^2 \quad \text{if } (\lambda_1 + \lambda_2) = 1
\]

The quadratic is expanded. To minimize the variance of Z, we must choose $\lambda_1$ so as to minimize the final expression.

\[
\sigma_Z^2 = (2\lambda_1^2 - 2\lambda_1 + 1)\sigma_X^2
\]

We differentiate with respect to $\lambda_1$ to obtain the first-order condition.

\[
\frac{d\sigma_Z^2}{d\lambda_1} = 0 \;\Rightarrow\; 4\lambda_1 - 2 = 0 \;\Rightarrow\; \lambda_1 = 0.5
\]

The expression is minimized for $\lambda_1 = 0.5$. It follows that $\lambda_2 = 0.5$ as well. So we have demonstrated that the sample mean is the most efficient unbiased estimator, at least in this example. (Note that the second derivative, $4\sigma_X^2$, is positive, confirming that we have a minimum.)
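The minimization can also be checked by brute force. The following sketch, with hypothetical function names, evaluates the variance of Z relative to the variance of X for unbiased weight pairs, taking $\lambda_2 = 1 - \lambda_1$ and independent observations as in the derivation, and searches a grid for the smallest value.

```python
def variance_factor(lam1):
    """Return var(Z) / var(X) for Z = lam1*X1 + (1 - lam1)*X2.

    Assumes X1 and X2 are independent with a common variance, and that the
    weights sum to one so that Z is unbiased.
    """
    lam2 = 1.0 - lam1
    return lam1 ** 2 + lam2 ** 2


def best_weight(steps=100):
    """Grid search over lam1 in [0, 1] for the smallest variance factor."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=variance_factor)


if __name__ == "__main__":
    for lam1 in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"lambda1={lam1:4.2f}  var(Z)/var(X)={variance_factor(lam1):.4f}")
    print("minimising weight on the grid:", best_weight())
```

The grid search is only a numerical check; the calculus argument is what establishes the minimum.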

CONFLICTS BETWEEN UNBIASEDNESS AND MINIMUM VARIANCE

Suppose that you have alternative estimators of a population characteristic $\theta$, one unbiased, the other biased but with a smaller variance. How do you choose between them?

[Figure: probability density functions of estimator A, unbiased, and estimator B, biased but with a smaller variance; $\theta$ is marked on the horizontal axis.]

A widely-used loss function is the mean square error of the estimator, defined as the expected value of the square of the deviation of the estimator about the true value of the population characteristic.

\[
\operatorname{MSE}(Z) = E\{(Z - \theta)^2\} = \sigma_Z^2 + (\mu_Z - \theta)^2
\]

The mean square error involves a trade-off between the variance of the estimator and its bias. Suppose you have a biased estimator like estimator B above, with expected value $\mu_Z$.

[Figure: probability density function of estimator B, with $\theta$ and $\mu_Z$ marked on the horizontal axis; the distance between them is the bias.]

The mean square error can be shown to be equal to the sum of the variance of the estimator and the square of the bias.

\[
\operatorname{MSE}(Z) = \sigma_Z^2 + (\mu_Z - \theta)^2
\]

To demonstrate this, we start by subtracting and adding $\mu_Z$.

\[
\operatorname{MSE}(Z) = E\{(Z - \theta)^2\} = E\{(Z - \mu_Z + \mu_Z - \theta)^2\}
\]

We expand the quadratic using the rule $(a + b)^2 = a^2 + b^2 + 2ab$, where $a = Z - \mu_Z$ and $b = \mu_Z - \theta$.

\[
\operatorname{MSE}(Z) = E\{(Z - \mu_Z)^2 + (\mu_Z - \theta)^2 + 2(Z - \mu_Z)(\mu_Z - \theta)\}
\]

We use the first expected value rule to break up the expectation into its three components.

\[
\operatorname{MSE}(Z) = E\{(Z - \mu_Z)^2\} + E\{(\mu_Z - \theta)^2\} + E\{2(Z - \mu_Z)(\mu_Z - \theta)\}
\]

The first term in the expression is by definition the variance of Z.

\[
E\{(Z - \mu_Z)^2\} = \sigma_Z^2
\]

$(\mu_Z - \theta)$ is a constant, so the second term is the expectation of a constant and is equal to that constant.

\[
E\{(\mu_Z - \theta)^2\} = (\mu_Z - \theta)^2
\]

In the third term, $(\mu_Z - \theta)$ may be brought out of the expectation, again because it is a constant, using the second expected value rule.

\[
E\{2(Z - \mu_Z)(\mu_Z - \theta)\} = 2(\mu_Z - \theta)E(Z - \mu_Z)
\]

Now $E(Z)$ is $\mu_Z$, and $E(-\mu_Z)$ is $-\mu_Z$.

\[
E(Z - \mu_Z) = E(Z) + E(-\mu_Z) = \mu_Z - \mu_Z = 0
\]

Hence the third term is zero and the mean square error of Z is shown to be the sum of the variance of Z and the bias squared.

\[
\begin{aligned}
\operatorname{MSE}(Z) = E\{(Z - \theta)^2\} &= E\{(Z - \mu_Z + \mu_Z - \theta)^2\} \\
&= E\{(Z - \mu_Z)^2 + (\mu_Z - \theta)^2 + 2(Z - \mu_Z)(\mu_Z - \theta)\} \\
&= E\{(Z - \mu_Z)^2\} + E\{(\mu_Z - \theta)^2\} + E\{2(Z - \mu_Z)(\mu_Z - \theta)\} \\
&= \sigma_Z^2 + (\mu_Z - \theta)^2 + 2(\mu_Z - \theta)E(Z - \mu_Z) \\
&= \sigma_Z^2 + (\mu_Z - \theta)^2 + 2(\mu_Z - \theta)(\mu_Z - \mu_Z) \\
&= \sigma_Z^2 + (\mu_Z - \theta)^2
\end{aligned}
\]
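The decomposition of the mean square error into variance plus squared bias can be verified exactly for a small discrete distribution. In the sketch below the distribution of the estimator Z, the true value and the function name are all hypothetical, chosen only so that the arithmetic can be done with exact fractions.

```python
from fractions import Fraction


def mse_decomposition(values, probs, theta):
    """Return (mse, variance, bias) for an estimator with a discrete distribution.

    `values` are the possible outcomes of the estimator Z, `probs` their
    probabilities, and `theta` the true value of the population characteristic.
    """
    mean = sum(p * z for z, p in zip(values, probs))
    variance = sum(p * (z - mean) ** 2 for z, p in zip(values, probs))
    mse = sum(p * (z - theta) ** 2 for z, p in zip(values, probs))
    bias = mean - theta
    return mse, variance, bias


# Hypothetical biased estimator: Z takes the values 1, 2, 4 and theta is 2.
values = [Fraction(1), Fraction(2), Fraction(4)]
probs = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
theta = Fraction(2)

mse, variance, bias = mse_decomposition(values, probs, theta)
print("MSE =", mse, " variance =", variance, " bias =", bias)
print("variance + bias^2 =", variance + bias ** 2)
```

Because the calculation uses exact fractions, the two sides agree exactly rather than only up to rounding.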

In the case of the estimators shown, estimator B is probably a little better than estimator A according to the MSE criterion.

[Figure: probability density functions of estimators A and B of $\theta$.]

ESTIMATORS OF VARIANCE, COVARIANCE, AND CORRELATION

Variance:

\[
\sigma_X^2 = \operatorname{var}(X) = E\{(X - \mu_X)^2\}
\]

Estimator:

\[
s_X^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2
\]

Given a sample of n observations, the usual estimator of the variance is the sum of the squared deviations around the sample mean divided by n – 1, typically denoted $s_X^2$.

Since the variance is the expected value of the squared deviation of X about its mean, it makes intuitive sense to use the average of the sample squared deviations as an estimator. But why divide by n – 1 rather than by n?

The reason is that the sample mean is by definition in the middle of the sample, while the unknown population mean is not, except by coincidence.

As a consequence, the sum of the squared deviations from the sample mean tends to be slightly smaller than the sum of the squared deviations from the population mean.

Hence a simple average of the squared sample deviations is a downwards biased estimator of the variance. However, it can be shown that its expected value is (n – 1)/n times the true variance. Thus one can allow for the bias by dividing the sum of the squared deviations by n – 1 instead of n. The proof is in the appendix of the review chapter.
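The effect of dividing by n – 1 rather than n can be seen exactly in a small example. The sketch below is an illustration under stated assumptions: the population is a hypothetical fair die, samples are drawn independently with replacement, and the function name is invented. It averages both versions of the estimator over every possible sample of size 3.

```python
from fractions import Fraction
from itertools import product


def expected_variance_estimates(population, n):
    """Average the divide-by-n and divide-by-(n-1) estimators over every
    equally likely sample of size n drawn with replacement."""
    samples = list(product(population, repeat=n))
    total_div_n = Fraction(0)
    total_div_n_minus_1 = Fraction(0)
    for sample in samples:
        mean = Fraction(sum(sample), n)
        sum_sq = sum((x - mean) ** 2 for x in sample)
        total_div_n += sum_sq / n
        total_div_n_minus_1 += sum_sq / (n - 1)
    count = len(samples)
    return total_div_n / count, total_div_n_minus_1 / count


population = [1, 2, 3, 4, 5, 6]  # hypothetical population: a fair die
mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

biased, unbiased = expected_variance_estimates(population, n=3)
print("population variance:       ", sigma2)
print("E[divide by n] estimator:  ", biased)
print("E[divide by n-1] estimator:", unbiased)
```

In this example the divide-by-n estimator averages to (n – 1)/n times the population variance, while the divide-by-(n – 1) estimator averages to the population variance itself.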

Covariance:

\[
\sigma_{XY} = \operatorname{cov}(X, Y) = E\{(X - \mu_X)(Y - \mu_Y)\}
\]

Estimator:

\[
s_{XY} = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})
\]

A similar adjustment has to be made when estimating a covariance. For two random variables X and Y an unbiased estimator of the covariance $\sigma_{XY}$ is given by the sum of the products of the deviations around the sample means divided by n – 1.

Correlation:

\[
\rho_{XY} = \frac{\sigma_{XY}}{\sqrt{\sigma_X^2 \sigma_Y^2}}
\]

Estimator:

\[
r_{XY} = \frac{s_{XY}}{\sqrt{s_X^2 s_Y^2}}
= \frac{\frac{1}{n-1}\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\frac{1}{n-1}\sum (X - \bar{X})^2 \; \frac{1}{n-1}\sum (Y - \bar{Y})^2}}
\]

The population correlation coefficient $\rho_{XY}$ for two variables X and Y is defined to be their covariance divided by the square root of the product of their variances. The sample correlation coefficient, $r_{XY}$, is obtained from this by replacing the covariance and variances by their estimators.

The 1/(n – 1) terms in the numerator and the denominator cancel and one is left with a straightforward expression.

\[
r_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}}
\]
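The cancellation of the 1/(n – 1) terms can be confirmed in code. The sketch below computes the sample correlation coefficient in both ways, first from the estimators of the covariance and variances and then from the simplified expression; the function names and the data are hypothetical.

```python
import math


def correlation_from_estimators(xs, ys):
    """r_XY built from s_XY, s_X^2 and s_Y^2, each with the 1/(n - 1) factor."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_xx = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    s_yy = sum((y - y_bar) ** 2 for y in ys) / (n - 1)
    return s_xy / math.sqrt(s_xx * s_yy)


def correlation_simplified(xs, ys):
    """r_XY from the simplified expression, after the 1/(n - 1) terms cancel."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = math.sqrt(
        sum((x - x_bar) ** 2 for x in xs) * sum((y - y_bar) ** 2 for y in ys)
    )
    return num / den


# Hypothetical data, chosen so that the answer is easy to check by hand.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
print(correlation_from_estimators(xs, ys), correlation_simplified(xs, ys))
```

For these data the sum of cross-products is 8 and both sums of squares are 10, so either route should give a correlation of 0.8 up to floating-point rounding.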


Probability Limits and Consistency

ASYMPTOTIC PROPERTIES OF ESTIMATORS: PLIMS AND CONSISTENCY

If n is equal to 1, the sample consists of a single observation. $\bar{X}$ is the same as X and its standard deviation is 50.

[Figure: probability density function of $\bar{X}$ for n = 1.]

We will see how the shape of the distribution changes as the sample size is increased.

[Figure: probability density function of $\bar{X}$ for n = 4, where its standard deviation is 25.]

The distribution becomes more concentrated about the population mean.

[Figure: probability density function of $\bar{X}$ for n = 25, where its standard deviation is 10.]

To see what happens for n greater than 100, we will have to change the vertical scale.

[Figure: probability density function of $\bar{X}$ for n = 100, where its standard deviation is 5.]

We have increased the vertical scale by a factor of 10.

[Figure: probability density function of $\bar{X}$ for n = 100, redrawn with the vertical axis running to 0.8 instead of 0.08.]

The distribution continues to contract about the population mean.

[Figure: probability density function of $\bar{X}$ for n = 1000, where its standard deviation is 1.6.]

In the limit, the variance of the distribution tends to zero. The distribution collapses to a spike at the true value. The plim of the sample mean is therefore the population mean.

n       standard deviation of $\bar{X}$
1       50
4       25
25      10
100     5
1000    1.6
5000    0.7

[Figure: probability density function of $\bar{X}$ for n = 5000.]
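The standard deviations quoted for each sample size follow from the earlier result that the variance of the sample mean is the variance of X divided by n. The short sketch below reproduces them, taking the standard deviation of X to be 50 as in the n = 1 case; the function and constant names are hypothetical.

```python
import math

SIGMA_X = 50.0  # standard deviation of X, as given for the case n = 1


def sd_of_sample_mean(n, sigma=SIGMA_X):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


if __name__ == "__main__":
    for n in (1, 4, 25, 100, 1000, 5000):
        print(f"n={n:>4}  sd of sample mean = {sd_of_sample_mean(n):.1f}")
```

The standard deviation shrinks in proportion to the square root of n, so a hundredfold increase in sample size is needed to reduce it by a factor of ten.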

Consistency

An estimator of a population characteristic is said to be consistent if it satisfies two conditions:

(1) It possesses a probability limit, and so its distribution collapses to a spike as the sample size becomes large, and

(2) The spike is located at the true value of the population characteristic.

Hence we can say plim $\bar{X} = \mu_X$.

The sample mean in our example satisfies both conditions and so it is a consistent estimator of $\mu_X$. Most standard estimators in simple applications satisfy the first condition because their variances tend to zero as the sample size becomes large.

[Figure: probability density function of $\bar{X}$ for n = 5000.]

The only issue then is whether the distribution collapses to a spike at the true value of the population characteristic. A sufficient condition for consistency is that the estimator should be unbiased and that its variance should tend to zero as n becomes large.

It is easy to see why this is a sufficient condition. If the estimator is unbiased for a finite sample, it must stay unbiased as the sample size becomes large.

Meanwhile, if the variance of its distribution tends to zero, its distribution must collapse to a spike. Since the estimator remains unbiased, this spike must be located at the true value. The sample mean is an example of an estimator that satisfies this sufficient condition.
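The sufficient condition can be made precise with Chebyshev's inequality. This is a supplementary sketch rather than part of the lecture, and the notation $Z_n$ for an estimator based on n observations is introduced here only for illustration.

```latex
% For an unbiased estimator Z_n of \theta with variance \sigma_{Z_n}^2,
% Chebyshev's inequality gives, for any \varepsilon > 0,
\[
P\left(|Z_n - \theta| \ge \varepsilon\right) \le \frac{\sigma_{Z_n}^2}{\varepsilon^2}.
\]
% If \sigma_{Z_n}^2 \to 0 as n \to \infty, the right-hand side tends to zero,
% so plim Z_n = \theta. For the sample mean of independent observations:
\[
P\left(|\bar{X} - \mu_X| \ge \varepsilon\right) \le \frac{\sigma_X^2}{n\varepsilon^2} \to 0 .
\]
```

The probability of the estimator lying further than any fixed distance from the true value therefore tends to zero, which is what the collapse to a spike at the true value means.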

Why are we interested in consistency, when in practice we have finite samples?

As a first approximation, the answer is that if we can show that an estimator is consistent, then we may be optimistic about its finite sample properties, whereas if the estimator is inconsistent, we know that for finite samples it will definitely be biased.


However, there are reasons for being cautious about preferring consistent estimators to inconsistent ones.

First, a consistent estimator may be biased for finite samples.

Second, we are usually also interested in variances. If a consistent estimator has a larger variance than an inconsistent one, the latter might be preferable if judged by the mean square error or similar criterion that allows a trade-off between bias and variance.

How can you resolve these issues? Mathematically they are intractable, otherwise we would not have resorted to large sample analysis in the first place.


The Simple Regression Model

SIMPLE REGRESSION MODEL

Suppose that a variable Y is a linear function of another variable X, with unknown parameters $\beta_1$ and $\beta_2$ that we wish to estimate.

\[
Y = \beta_1 + \beta_2 X
\]

Suppose that we have a sample of 4 observations with X values as shown.

[Figure: the line $Y = \beta_1 + \beta_2 X$ with intercept $\beta_1$; $X_1$, $X_2$, $X_3$ and $X_4$ are marked on the horizontal axis.]

If the relationship were an exact one, the observations would lie on a straight line and we would have no trouble obtaining accurate estimates of $\beta_1$ and $\beta_2$.

[Figure: points $Q_1$ to $Q_4$ lying on the line $Y = \beta_1 + \beta_2 X$ above $X_1$ to $X_4$.]

In practice, most economic relationships are not exact and the actual values of Y are different from those corresponding to the straight line.

[Figure: observations $P_1$ to $P_4$ scattered about the line $Y = \beta_1 + \beta_2 X$, with $Q_1$ to $Q_4$ on the line.]

To allow for such divergences, we will write the model as

$Y = \beta_1 + \beta_2 X + u$, where u is a disturbance term.

Each value of Y thus has a non-random component, $\beta_1 + \beta_2 X$, and a random component, u. The first observation has been decomposed into these two components.

[Figure: observation $P_1$ decomposed into $\beta_1 + \beta_2 X_1$, the height of $Q_1$, and the disturbance $u_1$.]

In practice we can see only the P points.

[Figure: observations $P_1$ to $P_4$ alone, plotted against $X_1$ to $X_4$.]

Obviously, we can use the P points to draw a line which is an approximation to the line $Y = \beta_1 + \beta_2 X$.

If we write this line $\hat{Y} = b_1 + b_2 X$, $b_1$ is an estimate of $\beta_1$ and $b_2$ is an estimate of $\beta_2$.

[Figure: the fitted line $\hat{Y} = b_1 + b_2 X$, with intercept $b_1$, drawn through the observations $P_1$ to $P_4$.]

The line is called the fitted model and the values of Y predicted by it are called the fitted values of Y. They are given by the heights of the R points.

[Figure: fitted line $\hat{Y} = b_1 + b_2 X$ with points $R_1$ to $R_4$ on it, marking $\hat{Y}$ (fitted value), and observations $P_1$ to $P_4$, marking Y (actual value).]

The discrepancies between the actual and fitted values of Y are known as the residuals.

\[
Y - \hat{Y} = e
\]

[Figure: residuals $e_1$ to $e_4$ shown as the vertical distances between the observations $P_1$ to $P_4$ and the fitted points $R_1$ to $R_4$.]

Page 61: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

Note that the values of the residuals are not the same as the values of the disturbance term. The diagram now shows the true unknown relationship as well as the fitted line.

[Figure: the observations P1 to P4 and fitted values R1 to R4, with both the fitted line $\hat{Y} = b_1 + b_2 X$ (intercept $b_1$) and the true relationship $Y = \beta_1 + \beta_2 X$ (intercept $\beta_1$), plotted against X1 to X4.]

SIMPLE REGRESSION MODEL

Page 62: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

The disturbance term in each observation is responsible for the divergence between the non-random component of the true relationship and the actual observation.

[Figure: the observations P1 to P4 and the points Q1 to Q4 on the true relationship $Y = \beta_1 + \beta_2 X$ (intercept $\beta_1$), with the fitted line $\hat{Y} = b_1 + b_2 X$ (intercept $b_1$) also shown, plotted against X1 to X4.]

SIMPLE REGRESSION MODEL

Page 63: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

The residuals are the discrepancies between the actual and the fitted values.

If the fit is a good one, the residuals and the values of the disturbance term will be similar, but they must be kept apart conceptually.

[Figure: the observations P1 to P4 and the fitted values R1 to R4 on the fitted line $\hat{Y} = b_1 + b_2 X$ (intercept $b_1$), with the true relationship $Y = \beta_1 + \beta_2 X$ (intercept $\beta_1$) also shown, plotted against X1 to X4.]

SIMPLE REGRESSION MODEL

Page 64: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

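Both of these lines will be used in our analysis. Each permits a decomposition of the value of Y. The decompositions will be illustrated with the fourth observation.

[Figure: the fourth observation P4 with the true relationship $Y = \beta_1 + \beta_2 X$ and the fitted line $\hat{Y} = b_1 + b_2 X$. Q4 lies on the true relationship at height $\beta_1 + \beta_2 X_4$, and $u_4$ is the distance from Q4 to P4. R4 lies on the fitted line at height $b_1 + b_2 X_4$, and $e_4$ is the distance from R4 to P4.]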

SIMPLE REGRESSION MODEL

Page 65: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

Using the theoretical relationship, Y can be decomposed into its non-stochastic component $\beta_1 + \beta_2 X$ and its random component u.

$$Y = \beta_1 + \beta_2 X + u$$

This is a theoretical decomposition because we do not know the values of $\beta_1$ or $\beta_2$, or the values of the disturbance term. We shall use it in our analysis of the properties of the regression coefficients.

The other decomposition is with reference to the fitted line. In each observation, the actual value of Y is equal to the fitted value plus the residual. This is an operational decomposition which we will use for practical purposes.

$$Y = b_1 + b_2 X + e = \hat{Y} + e$$
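The two decompositions can be illustrated with a small simulation. This is only a sketch: the "true" parameters `BETA1` and `BETA2`, the X values, and the fitted coefficients are hypothetical numbers chosen for illustration, and in real data the true parameters and the disturbances would not be observable.

```python
import random

# Hypothetical "true" parameters (assumed for illustration; unknown in practice).
BETA1, BETA2 = 2.0, 0.5

random.seed(0)  # fixed seed so the run is reproducible
X = [1.0, 2.0, 3.0, 4.0]
u = [random.gauss(0.0, 1.0) for _ in X]              # disturbance terms
Y = [BETA1 + BETA2 * x + ui for x, ui in zip(X, u)]  # theoretical decomposition

# An arbitrary fitted line (not necessarily the least squares line).
b1, b2 = 1.5, 0.7
Y_hat = [b1 + b2 * x for x in X]                     # fitted values
e = [y - yh for y, yh in zip(Y, Y_hat)]              # residuals

for i in range(len(X)):
    # Both decompositions reproduce the same Y, but u and e differ.
    print(f"obs {i + 1}: Y={Y[i]:.3f}  u={u[i]:+.3f}  e={e[i]:+.3f}")
```

Both decompositions add up to the same observed Y, yet the residuals differ from the disturbances because the fitted line is not the true line.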

SIMPLE REGRESSION MODEL

Page 66: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

Least squares criterion:

Minimize RSS (residual sum of squares), where

$$RSS = \sum_{i=1}^{n} e_i^2 = e_1^2 + \dots + e_n^2$$

To begin with, we will draw the fitted line so as to minimize the sum of the squares of the residuals, RSS. This is described as the least squares criterion.

SIMPLE REGRESSION MODEL

Page 67: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

Why the squares of the residuals? Why not just minimize the sum of the residuals?

Least squares criterion:

Minimize RSS (residual sum of squares), where

$$RSS = \sum_{i=1}^{n} e_i^2 = e_1^2 + \dots + e_n^2$$

Why not minimize

$$\sum_{i=1}^{n} e_i = e_1 + \dots + e_n$$

SIMPLE REGRESSION MODEL

Page 68: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

The answer is that you would get an apparently perfect fit by drawing a horizontal line through the mean value of Y. The sum of the residuals would be zero.

You must prevent negative residuals from cancelling positive ones, and one way to do this is to use the squares of the residuals.

Of course there are other ways of dealing with the problem. The least squares criterion has the attraction that the estimators derived with it have desirable properties, provided that certain conditions are satisfied.

[Figure: the observations P1 to P4 plotted against X1 to X4, with a horizontal line drawn at the mean value $\bar{Y}$.]
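The cancellation problem can be checked numerically. The sketch below uses three illustrative observations (assumed values, not data from the slide) and fits a horizontal line through the mean of Y: the residuals sum to zero even though the line ignores X entirely, while the sum of squared residuals stays clearly positive.

```python
# Three illustrative observations (assumed for this sketch).
X = [1, 2, 3]
Y = [3, 5, 6]

y_bar = sum(Y) / len(Y)

# Horizontal line through the mean: the fitted value is y_bar for every X.
residuals = [y - y_bar for y in Y]

sum_e = sum(residuals)                 # positive and negative residuals cancel
rss = sum(e ** 2 for e in residuals)   # squaring prevents the cancellation

print(f"sum of residuals = {sum_e:.6f}")
print(f"sum of squared residuals = {rss:.4f}")
```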

SIMPLE REGRESSION MODEL

Page 69: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

True model: $Y = \beta_1 + \beta_2 X + u$

Fitted line: $\hat{Y} = b_1 + b_2 X$

Next, we’ll see how the regression coefficients for a simple regression model are derived, using the least squares criterion (OLS, for ordinary least squares).

We will start with a numerical example with just three observations: (1,3), (2,5), and (3,6).

[Figure: scatter diagram of the three observations $Y_1$, $Y_2$, $Y_3$, with X from 0 to 3 on the horizontal axis and Y from 0 to 6 on the vertical axis.]

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 70: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

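True model: $Y = \beta_1 + \beta_2 X + u$

Fitted line: $\hat{Y} = b_1 + b_2 X$

Writing the fitted regression as $\hat{Y} = b_1 + b_2 X$, we will determine the values of $b_1$ and $b_2$ that minimize RSS, the sum of the squares of the residuals.

$$\hat{Y}_1 = b_1 + b_2, \qquad \hat{Y}_2 = b_1 + 2b_2, \qquad \hat{Y}_3 = b_1 + 3b_2$$

[Figure: scatter diagram of $Y_1$, $Y_2$, $Y_3$ with a fitted line of intercept $b_1$ and slope $b_2$; the fitted values $\hat{Y}_1$, $\hat{Y}_2$, $\hat{Y}_3$ are marked on the line.]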

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 71: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

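True model: $Y = \beta_1 + \beta_2 X + u$

Fitted line: $\hat{Y} = b_1 + b_2 X$

Given our choice of $b_1$ and $b_2$, the residuals are as shown.

$$\begin{aligned}
e_1 &= Y_1 - \hat{Y}_1 = 3 - b_1 - b_2 \\
e_2 &= Y_2 - \hat{Y}_2 = 5 - b_1 - 2b_2 \\
e_3 &= Y_3 - \hat{Y}_3 = 6 - b_1 - 3b_2
\end{aligned}$$

[Figure: scatter diagram of $Y_1$, $Y_2$, $Y_3$ with the fitted line (intercept $b_1$, slope $b_2$) and the fitted values $\hat{Y}_1 = b_1 + b_2$, $\hat{Y}_2 = b_1 + 2b_2$, $\hat{Y}_3 = b_1 + 3b_2$.]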

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 72: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

$$\begin{aligned}
e_1 &= Y_1 - \hat{Y}_1 = 3 - b_1 - b_2 \\
e_2 &= Y_2 - \hat{Y}_2 = 5 - b_1 - 2b_2 \\
e_3 &= Y_3 - \hat{Y}_3 = 6 - b_1 - 3b_2
\end{aligned}$$

$$RSS = e_1^2 + e_2^2 + e_3^2 = (3 - b_1 - b_2)^2 + (5 - b_1 - 2b_2)^2 + (6 - b_1 - 3b_2)^2$$

The sum of the squares of the residuals is thus as shown above.

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 73: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

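$$\begin{aligned}
RSS &= e_1^2 + e_2^2 + e_3^2 = (3 - b_1 - b_2)^2 + (5 - b_1 - 2b_2)^2 + (6 - b_1 - 3b_2)^2 \\
&= 9 + b_1^2 + b_2^2 - 6b_1 - 6b_2 + 2b_1 b_2 \\
&\quad + 25 + b_1^2 + 4b_2^2 - 10b_1 - 20b_2 + 4b_1 b_2 \\
&\quad + 36 + b_1^2 + 9b_2^2 - 12b_1 - 36b_2 + 6b_1 b_2
\end{aligned}$$

The quadratics have been expanded.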

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 74: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

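$$\begin{aligned}
RSS &= e_1^2 + e_2^2 + e_3^2 = (3 - b_1 - b_2)^2 + (5 - b_1 - 2b_2)^2 + (6 - b_1 - 3b_2)^2 \\
&= 9 + b_1^2 + b_2^2 - 6b_1 - 6b_2 + 2b_1 b_2 \\
&\quad + 25 + b_1^2 + 4b_2^2 - 10b_1 - 20b_2 + 4b_1 b_2 \\
&\quad + 36 + b_1^2 + 9b_2^2 - 12b_1 - 36b_2 + 6b_1 b_2 \\
&= 70 + 3b_1^2 + 14b_2^2 - 28b_1 - 62b_2 + 12b_1 b_2
\end{aligned}$$

Like terms have been added together.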
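As a quick sanity check on the algebra, the sum of squared residuals for the three observations (1,3), (2,5), (3,6) can be compared with the collected polynomial 70 + 3b1² + 14b2² − 28b1 − 62b2 + 12b1b2. The function names below are illustrative, and the grid of trial values is arbitrary.

```python
def rss_direct(b1, b2):
    """Sum of squared residuals computed observation by observation."""
    return (3 - b1 - b2) ** 2 + (5 - b1 - 2 * b2) ** 2 + (6 - b1 - 3 * b2) ** 2


def rss_expanded(b1, b2):
    """The same quantity after expanding and collecting like terms."""
    return 70 + 3 * b1**2 + 14 * b2**2 - 28 * b1 - 62 * b2 + 12 * b1 * b2


# Compare the two forms on a small grid of integer trial values.
mismatches = [
    (b1, b2)
    for b1 in range(-3, 4)
    for b2 in range(-3, 4)
    if rss_direct(b1, b2) != rss_expanded(b1, b2)
]
print("mismatches:", mismatches)
```

Integer trial values keep the comparison exact, so any mismatch would indicate an algebra slip rather than rounding.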

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 75: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

$$RSS = 70 + 3b_1^2 + 14b_2^2 - 28b_1 - 62b_2 + 12b_1 b_2$$

$$\frac{\partial RSS}{\partial b_1} = 0 \;\Rightarrow\; 6b_1 + 12b_2 - 28 = 0$$

$$\frac{\partial RSS}{\partial b_2} = 0 \;\Rightarrow\; 12b_1 + 28b_2 - 62 = 0$$

For a minimum, the partial derivatives of RSS with respect to $b_1$ and $b_2$ should be zero. (We should also check a second-order condition.)
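The second-order condition mentioned here can be sketched for the numerical example. The working below assumes the standard two-variable test (positive second derivative and positive determinant of the matrix of second derivatives), applied to RSS = 70 + 3b1² + 14b2² − 28b1 − 62b2 + 12b1b2; the slide itself does not carry out this step.

```latex
\frac{\partial^2 RSS}{\partial b_1^2} = 6, \qquad
\frac{\partial^2 RSS}{\partial b_2^2} = 28, \qquad
\frac{\partial^2 RSS}{\partial b_1 \, \partial b_2} = 12

\frac{\partial^2 RSS}{\partial b_1^2} = 6 > 0, \qquad
\frac{\partial^2 RSS}{\partial b_1^2} \cdot \frac{\partial^2 RSS}{\partial b_2^2}
  - \left( \frac{\partial^2 RSS}{\partial b_1 \, \partial b_2} \right)^2
  = 6 \times 28 - 12^2 = 24 > 0
```

Both quantities are positive, so the stationary point found from the first-order conditions is a minimum of RSS.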

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 76: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

$$RSS = 70 + 3b_1^2 + 14b_2^2 - 28b_1 - 62b_2 + 12b_1 b_2$$

$$\frac{\partial RSS}{\partial b_1} = 0 \;\Rightarrow\; 6b_1 + 12b_2 - 28 = 0$$

$$\frac{\partial RSS}{\partial b_2} = 0 \;\Rightarrow\; 12b_1 + 28b_2 - 62 = 0$$

The first-order conditions give us two equations in two unknowns.

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 77: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

$$RSS = 70 + 3b_1^2 + 14b_2^2 - 28b_1 - 62b_2 + 12b_1 b_2$$

$$\frac{\partial RSS}{\partial b_1} = 0 \;\Rightarrow\; 6b_1 + 12b_2 - 28 = 0$$

$$\frac{\partial RSS}{\partial b_2} = 0 \;\Rightarrow\; 12b_1 + 28b_2 - 62 = 0$$

$$b_1 = 1.67, \qquad b_2 = 1.50$$

Solving them, we find that RSS is minimized when $b_1$ and $b_2$ are equal to 1.67 and 1.50, respectively.
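The two first-order conditions, 6b1 + 12b2 = 28 and 12b1 + 28b2 = 62, form a linear system that can be solved exactly. The sketch below uses Cramer's rule with Python's `fractions` module; the choice of Cramer's rule is an assumption made for illustration, and any method for solving two simultaneous equations would do.

```python
from fractions import Fraction

# First-order conditions written as a 2x2 linear system:
#    6*b1 + 12*b2 = 28
#   12*b1 + 28*b2 = 62
a11, a12, c1 = Fraction(6), Fraction(12), Fraction(28)
a21, a22, c2 = Fraction(12), Fraction(28), Fraction(62)

det = a11 * a22 - a12 * a21        # determinant of the coefficient matrix
b1 = (c1 * a22 - a12 * c2) / det   # Cramer's rule for b1
b2 = (a11 * c2 - c1 * a21) / det   # Cramer's rule for b2

print(f"b1 = {b1} (about {float(b1):.2f})")
print(f"b2 = {b2} (about {float(b2):.2f})")
```

The exact solutions are 5/3 and 3/2, which round to the 1.67 and 1.50 quoted on the slide.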

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 78: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

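True model: $Y = \beta_1 + \beta_2 X + u$

Fitted line: $\hat{Y} = b_1 + b_2 X$

Here is the scatter diagram again.

[Figure: scatter diagram of $Y_1$, $Y_2$, $Y_3$ with the fitted line (intercept $b_1$, slope $b_2$) and the fitted values $\hat{Y}_1 = b_1 + b_2$, $\hat{Y}_2 = b_1 + 2b_2$, $\hat{Y}_3 = b_1 + 3b_2$.]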

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 79: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

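True model: $Y = \beta_1 + \beta_2 X + u$

Fitted line: $\hat{Y} = 1.67 + 1.50X$

The fitted line and the fitted values of Y are as shown.

$$\hat{Y}_1 = 3.17, \qquad \hat{Y}_2 = 4.67, \qquad \hat{Y}_3 = 6.17$$

[Figure: scatter diagram of $Y_1$, $Y_2$, $Y_3$ with the fitted line of intercept 1.67 and slope 1.50, and the fitted values marked on the line.]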
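The fitted values can be reproduced from the data and the estimated coefficients. This sketch uses the exact fractions 5/3 and 3/2 behind the rounded values 1.67 and 1.50, and also reports the residuals and the resulting RSS; the variable names are illustrative.

```python
from fractions import Fraction

X = [1, 2, 3]
Y = [3, 5, 6]
b1, b2 = Fraction(5, 3), Fraction(3, 2)   # exact values behind 1.67 and 1.50

fitted = [b1 + b2 * x for x in X]                # fitted values
residuals = [y - f for y, f in zip(Y, fitted)]   # actual minus fitted
rss = sum(e ** 2 for e in residuals)             # residual sum of squares

for x, y, f, e in zip(X, Y, fitted, residuals):
    print(f"X={x}  Y={y}  fitted={float(f):.2f}  residual={float(e):+.2f}")
print(f"RSS = {float(rss):.4f}")
```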

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 80: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

True model: $Y = \beta_1 + \beta_2 X + u$

Fitted line: $\hat{Y} = b_1 + b_2 X$

Now we will do the same thing for the general case with n observations.

[Figure: scatter diagram for the general case, with observations $Y_1$ to $Y_n$ plotted at $X_1$ to $X_n$.]

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 81: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

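True model: $Y = \beta_1 + \beta_2 X + u$

Fitted line: $\hat{Y} = b_1 + b_2 X$

Given our choice of $b_1$ and $b_2$, we will obtain a fitted line as shown.

$$\hat{Y}_1 = b_1 + b_2 X_1, \qquad \hat{Y}_n = b_1 + b_2 X_n$$

[Figure: scatter diagram for the general case with a fitted line of intercept $b_1$ and slope $b_2$; the observations $Y_1$ and $Y_n$ and the fitted values $\hat{Y}_1$ and $\hat{Y}_n$ are marked at $X_1$ and $X_n$.]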

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 82: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

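True model: $Y = \beta_1 + \beta_2 X + u$

Fitted line: $\hat{Y} = b_1 + b_2 X$

$$\begin{aligned}
e_1 &= Y_1 - \hat{Y}_1 = Y_1 - b_1 - b_2 X_1 \\
&\;\;\vdots \\
e_n &= Y_n - \hat{Y}_n = Y_n - b_1 - b_2 X_n
\end{aligned}$$

The residual for the first observation is defined.

[Figure: scatter diagram for the general case with the fitted line (intercept $b_1$, slope $b_2$); the residual $e_1$ is marked between $Y_1$ and $\hat{Y}_1 = b_1 + b_2 X_1$.]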

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 83: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

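True model: $Y = \beta_1 + \beta_2 X + u$

Fitted line: $\hat{Y} = b_1 + b_2 X$

Similarly we define the residuals for the remaining observations. That for the last one is marked.

$$\begin{aligned}
e_1 &= Y_1 - \hat{Y}_1 = Y_1 - b_1 - b_2 X_1 \\
&\;\;\vdots \\
e_n &= Y_n - \hat{Y}_n = Y_n - b_1 - b_2 X_n
\end{aligned}$$

[Figure: scatter diagram for the general case with the fitted line (intercept $b_1$, slope $b_2$); the residuals $e_1$ and $e_n$ are marked between the observations $Y_1$, $Y_n$ and the fitted values $\hat{Y}_1 = b_1 + b_2 X_1$, $\hat{Y}_n = b_1 + b_2 X_n$.]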

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 84: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

General case:

$$RSS = e_1^2 + \dots + e_n^2 = (Y_1 - b_1 - b_2 X_1)^2 + \dots + (Y_n - b_1 - b_2 X_n)^2$$

Numerical example:

$$RSS = e_1^2 + e_2^2 + e_3^2 = (3 - b_1 - b_2)^2 + (5 - b_1 - 2b_2)^2 + (6 - b_1 - 3b_2)^2$$

RSS, the sum of the squares of the residuals, is defined for the general case. The data for the numerical example are shown for comparison.

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 85: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

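General case:

$$\begin{aligned}
RSS &= e_1^2 + \dots + e_n^2 = (Y_1 - b_1 - b_2 X_1)^2 + \dots + (Y_n - b_1 - b_2 X_n)^2 \\
&= Y_1^2 + b_1^2 + b_2^2 X_1^2 - 2b_1 Y_1 - 2b_2 X_1 Y_1 + 2b_1 b_2 X_1 \\
&\quad + \dots \\
&\quad + Y_n^2 + b_1^2 + b_2^2 X_n^2 - 2b_1 Y_n - 2b_2 X_n Y_n + 2b_1 b_2 X_n
\end{aligned}$$

Numerical example:

$$\begin{aligned}
RSS &= (3 - b_1 - b_2)^2 + (5 - b_1 - 2b_2)^2 + (6 - b_1 - 3b_2)^2 \\
&= 9 + b_1^2 + b_2^2 - 6b_1 - 6b_2 + 2b_1 b_2 \\
&\quad + 25 + b_1^2 + 4b_2^2 - 10b_1 - 20b_2 + 4b_1 b_2 \\
&\quad + 36 + b_1^2 + 9b_2^2 - 12b_1 - 36b_2 + 6b_1 b_2
\end{aligned}$$

The quadratics are expanded.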

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 86: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

General case:

$$\begin{aligned}
RSS &= Y_1^2 + b_1^2 + b_2^2 X_1^2 - 2b_1 Y_1 - 2b_2 X_1 Y_1 + 2b_1 b_2 X_1 \\
&\quad + \dots \\
&\quad + Y_n^2 + b_1^2 + b_2^2 X_n^2 - 2b_1 Y_n - 2b_2 X_n Y_n + 2b_1 b_2 X_n \\
&= \sum Y_i^2 + n b_1^2 + b_2^2 \sum X_i^2 - 2b_1 \sum Y_i - 2b_2 \sum X_i Y_i + 2b_1 b_2 \sum X_i
\end{aligned}$$

Like terms are added together.

Numerical example:

$$RSS = 70 + 3b_1^2 + 14b_2^2 - 28b_1 - 62b_2 + 12b_1 b_2$$

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 87: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

General case:

$$RSS = \sum Y_i^2 + n b_1^2 + b_2^2 \sum X_i^2 - 2b_1 \sum Y_i - 2b_2 \sum X_i Y_i + 2b_1 b_2 \sum X_i$$

Numerical example:

$$RSS = 70 + 3b_1^2 + 14b_2^2 - 28b_1 - 62b_2 + 12b_1 b_2$$

Note that in this equation the observations on X and Y are just data that determine the coefficients in the expression for RSS.

The choice variables in the expression are $b_1$ and $b_2$. This may seem a bit strange because in elementary calculus courses $b_1$ and $b_2$ are usually constants and X and Y are variables.

However, if you have any doubts, compare what we are doing in the general case with what we did in the numerical example.
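The comparison suggested here can be made concrete: the data sums in the general expression for RSS are just numbers once the observations are known. The sketch below computes them for the three observations (1,3), (2,5), (3,6); the dictionary keys are illustrative labels for the terms of the polynomial in b1 and b2.

```python
X = [1, 2, 3]
Y = [3, 5, 6]
n = len(X)

sum_y2 = sum(y * y for y in Y)              # sum of Y_i squared
sum_x2 = sum(x * x for x in X)              # sum of X_i squared
sum_y = sum(Y)                              # sum of Y_i
sum_xy = sum(x * y for x, y in zip(X, Y))   # sum of X_i * Y_i
sum_x = sum(X)                              # sum of X_i

# Coefficients of RSS viewed as a polynomial in b1 and b2.
coefficients = {
    "constant": sum_y2,
    "b1^2": n,
    "b2^2": sum_x2,
    "b1": -2 * sum_y,
    "b2": -2 * sum_xy,
    "b1*b2": 2 * sum_x,
}
print(coefficients)
```

The values match the numerical example term by term, which is the sense in which X and Y act as data and b1 and b2 as the choice variables.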

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 88: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

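$$RSS = \sum Y_i^2 + n b_1^2 + b_2^2 \sum X_i^2 - 2b_1 \sum Y_i - 2b_2 \sum X_i Y_i + 2b_1 b_2 \sum X_i$$

The first derivative with respect to $b_1$.

$$\frac{\partial RSS}{\partial b_1} = 0 \;\Rightarrow\; 2n b_1 - 2\sum Y_i + 2b_2 \sum X_i = 0$$

Numerical example, for comparison:

$$\frac{\partial RSS}{\partial b_1} = 0 \;\Rightarrow\; 6b_1 + 12b_2 - 28 = 0$$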

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 89: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

$$\frac{\partial RSS}{\partial b_1} = 0 \;\Rightarrow\; 2n b_1 - 2\sum Y_i + 2b_2 \sum X_i = 0$$

With some simple manipulation we obtain a tidy expression for $b_1$.

$$n b_1 = \sum Y_i - b_2 \sum X_i$$

$$b_1 = \bar{Y} - b_2 \bar{X}$$

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 90: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

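The first derivative with respect to $b_2$.

$$RSS = \sum Y_i^2 + n b_1^2 + b_2^2 \sum X_i^2 - 2b_1 \sum Y_i - 2b_2 \sum X_i Y_i + 2b_1 b_2 \sum X_i$$

$$\frac{\partial RSS}{\partial b_1} = 0 \;\Rightarrow\; 2n b_1 - 2\sum Y_i + 2b_2 \sum X_i = 0, \qquad b_1 = \bar{Y} - b_2 \bar{X}$$

$$\frac{\partial RSS}{\partial b_2} = 0 \;\Rightarrow\; 2b_2 \sum X_i^2 - 2\sum X_i Y_i + 2b_1 \sum X_i = 0$$

Numerical example, for comparison:

$$\frac{\partial RSS}{\partial b_2} = 0 \;\Rightarrow\; 12b_1 + 28b_2 - 62 = 0$$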

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 91: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

$$\frac{\partial RSS}{\partial b_2} = 0 \;\Rightarrow\; 2b_2 \sum X_i^2 - 2\sum X_i Y_i + 2b_1 \sum X_i = 0$$

$$b_2 \sum X_i^2 - \sum X_i Y_i + b_1 \sum X_i = 0$$

Divide through by 2.

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 92: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

$$b_2 \sum X_i^2 - \sum X_i Y_i + b_1 \sum X_i = 0$$

$$b_2 \sum X_i^2 - \sum X_i Y_i + (\bar{Y} - b_2 \bar{X}) \sum X_i = 0$$

We now substitute for $b_1$ using the expression obtained for it, $b_1 = \bar{Y} - b_2 \bar{X}$, and we thus obtain an equation that contains $b_2$ only.

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 93: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

$$b_2 \sum X_i^2 - \sum X_i Y_i + (\bar{Y} - b_2 \bar{X}) \sum X_i = 0$$

$$b_2 \sum X_i^2 - \sum X_i Y_i + (\bar{Y} - b_2 \bar{X})\, n\bar{X} = 0$$

The definition of the sample mean has been used.

$$\bar{X} = \frac{\sum X_i}{n} \;\Rightarrow\; \sum X_i = n\bar{X}$$

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 94: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

$$b_2 \sum X_i^2 - \sum X_i Y_i + (\bar{Y} - b_2 \bar{X})\, n\bar{X} = 0$$

$$b_2 \sum X_i^2 - \sum X_i Y_i + n\bar{X}\bar{Y} - n b_2 \bar{X}^2 = 0$$

The last two terms have been disentangled.

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 95: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

$$b_2 \sum X_i^2 - \sum X_i Y_i + n\bar{X}\bar{Y} - n b_2 \bar{X}^2 = 0$$

$$b_2 \left( \sum X_i^2 - n\bar{X}^2 \right) = \sum X_i Y_i - n\bar{X}\bar{Y}$$

Terms not involving $b_2$ have been transferred to the right side.

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 96: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

Hence we obtain an expression for $b_2$.

$$b_2 \left( \sum X_i^2 - n\bar{X}^2 \right) = \sum X_i Y_i - n\bar{X}\bar{Y}$$

$$b_2 = \frac{\sum X_i Y_i - n\bar{X}\bar{Y}}{\sum X_i^2 - n\bar{X}^2}$$

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 97: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

In practice, we shall use an alternative expression. We will demonstrate that it is equivalent.

$$b_2 = \frac{\sum X_i Y_i - n\bar{X}\bar{Y}}{\sum X_i^2 - n\bar{X}^2}$$

$$b_2 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$$

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 98: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

Expanding the numerator, we obtain the terms shown.

$$\sum (X_i - \bar{X})(Y_i - \bar{Y}) = \sum X_i Y_i - \sum X_i \bar{Y} - \sum \bar{X} Y_i + \sum \bar{X}\bar{Y}$$

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 99: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

In the second term the mean value of Y is a common factor. In the third, the mean value of X is a common factor. The last term is the same for all i.

$$\begin{aligned}
\sum (X_i - \bar{X})(Y_i - \bar{Y}) &= \sum X_i Y_i - \sum X_i \bar{Y} - \sum \bar{X} Y_i + \sum \bar{X}\bar{Y} \\
&= \sum X_i Y_i - \bar{Y} \sum X_i - \bar{X} \sum Y_i + n\bar{X}\bar{Y}
\end{aligned}$$

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 100: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

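We use the definitions of the sample means to simplify the expression.

$$\bar{X} = \frac{\sum X_i}{n} \;\Rightarrow\; \sum X_i = n\bar{X}$$

$$\begin{aligned}
\sum (X_i - \bar{X})(Y_i - \bar{Y}) &= \sum X_i Y_i - \bar{Y} \sum X_i - \bar{X} \sum Y_i + n\bar{X}\bar{Y} \\
&= \sum X_i Y_i - \bar{Y}(n\bar{X}) - \bar{X}(n\bar{Y}) + n\bar{X}\bar{Y}
\end{aligned}$$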

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 101: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

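Hence we have shown that the numerators of the two expressions are the same.

$$\begin{aligned}
\sum (X_i - \bar{X})(Y_i - \bar{Y}) &= \sum X_i Y_i - \sum X_i \bar{Y} - \sum \bar{X} Y_i + \sum \bar{X}\bar{Y} \\
&= \sum X_i Y_i - \bar{Y} \sum X_i - \bar{X} \sum Y_i + n\bar{X}\bar{Y} \\
&= \sum X_i Y_i - \bar{Y}(n\bar{X}) - \bar{X}(n\bar{Y}) + n\bar{X}\bar{Y} \\
&= \sum X_i Y_i - n\bar{X}\bar{Y}
\end{aligned}$$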

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 102: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

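The denominator is mathematically a special case of the numerator, replacing Y by X. Hence the expressions are equivalent.

$$\sum (X_i - \bar{X})(Y_i - \bar{Y}) = \sum X_i Y_i - n\bar{X}\bar{Y}, \qquad \sum (X_i - \bar{X})^2 = \sum X_i^2 - n\bar{X}^2$$

$$b_2 = \frac{\sum X_i Y_i - n\bar{X}\bar{Y}}{\sum X_i^2 - n\bar{X}^2} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$$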
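The equivalence of the two expressions for b2 can also be checked numerically. The sketch below evaluates both with exact rational arithmetic; the function names `b2_raw_sums` and `b2_deviations` are illustrative, the second data set is made up for the check, and integer-valued observations are assumed so that `Fraction` can be used.

```python
from fractions import Fraction


def b2_raw_sums(X, Y):
    """Slope from (sum XY - n*Xbar*Ybar) / (sum X^2 - n*Xbar^2)."""
    n = len(X)
    x_bar, y_bar = Fraction(sum(X), n), Fraction(sum(Y), n)
    numerator = sum(x * y for x, y in zip(X, Y)) - n * x_bar * y_bar
    denominator = sum(x * x for x in X) - n * x_bar**2
    return numerator / denominator


def b2_deviations(X, Y):
    """Slope from sum (X - Xbar)(Y - Ybar) / sum (X - Xbar)^2."""
    n = len(X)
    x_bar, y_bar = Fraction(sum(X), n), Fraction(sum(Y), n)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
    denominator = sum((x - x_bar) ** 2 for x in X)
    return numerator / denominator


example = ([1, 2, 3], [3, 5, 6])             # the numerical example
other = ([2, 4, 7, 11], [1, 6, 5, 12])       # made-up data for a second check

for X, Y in (example, other):
    print(b2_raw_sums(X, Y), b2_deviations(X, Y))
```

Both functions assume the X values are not all identical, since otherwise the denominator is zero and the slope is undefined.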

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 103: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

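True model: $Y = \beta_1 + \beta_2 X + u$

Fitted line: $\hat{Y} = b_1 + b_2 X$

The scatter diagram is shown again. We will summarize what we have done. We hypothesized that the true model is as shown, we obtained some data, and we fitted a line.

[Figure: scatter diagram for the general case with the fitted line (intercept $b_1$, slope $b_2$); the observations $Y_1$, $Y_n$ and the fitted values $\hat{Y}_1 = b_1 + b_2 X_1$, $\hat{Y}_n = b_1 + b_2 X_n$ are marked at $X_1$ and $X_n$.]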

DERIVING LINEAR REGRESSION COEFFICIENTS

Page 104: Lecture 3 Today: Statistical Review cont’d: Unbiasedness and efficiency Sample equivalents of variance, covariance and correlation Probability limits and.

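True model: $Y = \beta_1 + \beta_2 X + u$

Fitted line: $\hat{Y} = b_1 + b_2 X$

We chose the parameters of the fitted line so as to minimize the sum of the squares of the residuals. As a result, we derived the expressions for $b_1$ and $b_2$.

$$b_1 = \bar{Y} - b_2 \bar{X}$$

$$b_2 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$$

[Figure: scatter diagram for the general case with the fitted line (intercept $b_1$, slope $b_2$); the observations $Y_1$, $Y_n$ and the fitted values $\hat{Y}_1 = b_1 + b_2 X_1$, $\hat{Y}_n = b_1 + b_2 X_n$ are marked at $X_1$ and $X_n$.]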
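The two final expressions translate directly into a short routine. This is a minimal sketch rather than a production estimator: the function name `ols_fit` and its error handling are assumptions made for illustration, and it covers only the simple regression case with one explanatory variable.

```python
def ols_fit(X, Y):
    """Return (b1, b2) for the fitted line Y_hat = b1 + b2*X."""
    n = len(X)
    if n != len(Y) or n < 2:
        raise ValueError("need at least two paired observations")

    x_bar = sum(X) / n
    y_bar = sum(Y) / n

    # Denominator: sum of squared deviations of X about its mean.
    sxx = sum((x - x_bar) ** 2 for x in X)
    if sxx == 0:
        raise ValueError("all X values are identical; the slope is undefined")

    # Numerator: sum of cross-products of deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))

    b2 = sxy / sxx            # slope
    b1 = y_bar - b2 * x_bar   # intercept
    return b1, b2


b1, b2 = ols_fit([1, 2, 3], [3, 5, 6])
print(f"b1 = {b1:.2f}, b2 = {b2:.2f}")
```

Applied to the three observations of the numerical example, the routine should agree with the values found earlier by solving the first-order conditions directly.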

DERIVING LINEAR REGRESSION COEFFICIENTS