Copyright © 2007 Thomson Asia Pte. Ltd. All rights reserved.

Multiple Regression Analysis: Further Issues

y = β0 + β1x1 + β2x2 + . . . + βkxk + u



Redefining Variables

Changing the scale of the y variable will lead to a corresponding change in the scale of the coefficients and standard errors, so no change in the significance or interpretation

Changing the scale of one x variable will lead to a change in the scale of that coefficient and standard error, so no change in the significance or interpretation
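
As a quick check of these claims, here is a minimal numpy sketch with simulated data (the variable names and scaling factors are illustrative, not from the slides): it rescales y by 1/16 and x by 1/20 and confirms that coefficients and standard errors move by the same factor while t statistics do not change.

import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)          # simulated data

def ols(y, x):
    """OLS of y on a constant and x; returns (coefficients, standard errors, t statistics)."""
    X = np.column_stack([np.ones_like(x), x])
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return b, se, b / se

b, se, t = ols(y, x)
b_y, se_y, t_y = ols(y / 16, x)       # rescale y (e.g. ounces -> pounds)
b_x, se_x, t_x = ols(y, x / 20)       # rescale x (e.g. cigarettes -> packs)

print(b_y / b, se_y / se)             # all coefficients and se's divided by 16
print(b_x[1] / b[1], se_x[1] / se[1]) # slope and its se multiplied by 20
print(t, t_y, t_x)                    # t statistics unchanged in every case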


Recall the key estimates and statistics from the multiple regression:

β̂1 = Σi (xi1 − x̄1)(yi − ȳ) / Σi (xi1 − x̄1)²  in the simple regression case, and more generally β̂j = Σi r̂ij yi / Σi r̂ij², where the r̂ij are the residuals from regressing xj on the other regressors

Var(β̂j) = σ² / [SSTj (1 − Rj²)]

R² = SSE/SST = 1 − SSR/SST

t = β̂j / se(β̂j),   CI: β̂j ± c·se(β̂j)

F = [(SSRr − SSRur)/q] / [SSRur/(n − k − 1)]

σ̂² = SSR/(n − k − 1),   SER = [SSR/(n − k − 1)]^(1/2)

Rescaling y or an xj runs through these formulas: β̂j and se(β̂j) change by the same factor, so the t statistics, F statistics, and R² are unaffected.


Effects of the units of measurement

Statistic      | y rescaled from ounces to pounds (1 pound = 16 ounces) | x = cigarettes smoked per day rescaled to packs (1 pack = 20 cigarettes)
β̂j             | β̂j / 16                                                | β̂packs = 20·β̂cigs
se(β̂j)         | se(β̂j) / 16                                            | se(β̂packs) = 20·se(β̂cigs)
tj             | unchanged                                              | tpacks unchanged
F              | unchanged                                              | unchanged
R²             | unchanged                                              | unchanged
SSR            | SSR / 16²                                              | unchanged
SER            | SER / 16                                               | unchanged
CI for β̂j      | β̂j/16 ± c·se(β̂j)/16                                    | 20·β̂cigs ± c·20·se(β̂cigs)


When the dependent variable or an independent variable appears in logarithmic form, changing its units of measurement does not affect the slope coefficients; it only changes the intercept:

log(c1·yi) = log(c1) + log(yi)

log(cj·xij) = log(cj) + log(xij)
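
A minimal sketch (simulated data) of why only the intercept moves: multiplying y by a constant adds log of that constant to every log(y), which the intercept absorbs.

import numpy as np

rng = np.random.default_rng(1)
n = 200
x = np.exp(rng.normal(size=n))
y = np.exp(0.5 + 0.8 * np.log(x) + 0.3 * rng.normal(size=n))

X = np.column_stack([np.ones(n), np.log(x)])
b1 = np.linalg.lstsq(X, np.log(y), rcond=None)[0]         # log(y) on log(x)
b2 = np.linalg.lstsq(X, np.log(1000 * y), rcond=None)[0]  # y rescaled by 1000

print(b1, b2)   # slopes identical; intercepts differ by log(1000)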


Beta Coefficients

Occasional you’ll see reference to a “standardized coefficient” or “beta coefficient” which has a specific meaning

Idea is to replace y and each x variable with a standardized version – i.e. subtract mean and divide by standard deviation

The coefficient then measures the change in y, in standard deviations of y, for a one standard deviation change in x


Beta Coefficients (cont)

zy = b̂1 z1 + b̂2 z2 + ... + b̂k zk + error

where zy denotes the z-score of y, z1 is the z-score of x1, and so on. The b̂j = (σ̂xj / σ̂y)·β̂j for j = 1, ..., k are called beta coefficients.
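
A small numpy sketch (simulated data) of the two equivalent routes to beta coefficients stated above: regress the z-score of y on the z-scores of the x's, or rescale the ordinary OLS slopes by σ̂xj/σ̂y.

import numpy as np

rng = np.random.default_rng(2)
n = 300
X = rng.normal(size=(n, 2)) * np.array([1.0, 5.0])       # x2 has a much larger scale
y = 1.0 + 0.5 * X[:, 0] + 0.2 * X[:, 1] + rng.normal(size=n)

def slopes(y, X):
    """OLS slope coefficients (a constant is added internally)."""
    Z = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(Z, y, rcond=None)[0][1:]

beta_hat = slopes(y, X)

# Route 1: standardize y and each x, then run OLS on the z-scores
zy = (y - y.mean()) / y.std(ddof=1)
zX = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
beta_coef_1 = slopes(zy, zX)

# Route 2: rescale the original slopes by sd(x_j)/sd(y)
beta_coef_2 = beta_hat * X.std(axis=0, ddof=1) / y.std(ddof=1)

print(beta_coef_1, beta_coef_2)   # the two routes agree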


Example 6.1: Effects of Pollution on Housing Prices


Functional Form

OLS can be used for relationships that are not strictly linear in x and y by using nonlinear functions of x and y – will still be linear in the parameters

Can take the natural log of x, y, or both
Can use quadratic forms of x
Can use interactions of x variables


Interpretation of Log Models

ln(y) = β0 + β1ln(x) + u
β1 is the elasticity of y with respect to x

ln(y) = β0 + β1x + u
100·β1 is approximately the percentage change in y given a 1 unit change in x

y = β0 + β1ln(x) + u
β1 is approximately the change in y for a 100 percent change in x


Why use log models?

Log models are invariant to the scale of the variables, since they measure percent changes
They give a direct estimate of elasticity
For models with y > 0, the conditional distribution of y is often heteroskedastic or skewed, while that of ln(y) is much less so
The distribution of ln(y) is narrower, limiting the effect of outliers

Drawbacks: logs cannot be used if a variable takes on zero or negative values, and it is more difficult to predict the original variable y when the dependent variable is in log form


Some Rules of Thumb

Variables often used in log form: dollar amounts that must be positive; very large variables, such as population
Variables often used in level form: variables measured in years; variables that are a proportion or percent


For log(ŷ) = β̂0 + β̂1x1 + β̂2x2, a change Δx2 gives Δlog(ŷ) = β̂2Δx2.

The exact percentage change in the prediction is

%Δŷ = 100·[exp(β̂2Δx2) − 1]

since log(ŷ1) − log(ŷ0) = β̂2Δx2 implies ŷ1/ŷ0 = exp(β̂2Δx2) and hence (ŷ1 − ŷ0)/ŷ0 = exp(β̂2Δx2) − 1. The familiar 100·β̂2Δx2 is only an approximation, and it deteriorates for large changes.
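
A quick numeric check of the exact formula against the 100·β̂2Δx2 approximation (the coefficient value is hypothetical):

import numpy as np

b2 = 0.257          # hypothetical slope on x2 in a log(y) model
for dx2 in (0.1, 0.5, 1.0, 2.0):
    approx = 100 * b2 * dx2                    # approximate % change in y
    exact = 100 * (np.exp(b2 * dx2) - 1.0)     # exact % change in y
    print(dx2, round(approx, 2), round(exact, 2))
# the approximation deteriorates as the implied change gets large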


Quadratic Models

For a model of the form y = β0 + β1x + β2x² + u, we can't interpret β1 alone as measuring the change in y with respect to x; we need to take β2 into account as well, since

Δŷ ≈ (β̂1 + 2β̂2x)Δx, so Δŷ/Δx ≈ β̂1 + 2β̂2x

Setting β̂1 + 2β̂2x = 0 gives x* = |β̂1/(2β̂2)|, the value of x at which ŷ reaches its maximum or minimum.
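
A minimal sketch of the turning-point calculation, using hypothetical coefficient values:

import numpy as np

b1, b2 = 0.298, -0.0061      # hypothetical quadratic-model coefficients
x_star = abs(b1 / (2 * b2))  # turning point of y = b0 + b1*x + b2*x^2

x = np.linspace(0, 40, 9)
marginal_effect = b1 + 2 * b2 * x   # slope of y with respect to x at each x

print(x_star)                       # about 24.4: effect positive before, negative after
print(np.column_stack([x, marginal_effect]))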


More on Quadratic Models

Suppose that the coefficient on x is positive and the coefficient on x2 is negative

Then y is increasing in x at first, but will eventually turn around and be decreasing in x

For β̂1 > 0 and β̂2 < 0, the turning point will be at x* = |β̂1/(2β̂2)|


More on Quadratic Models

Suppose that the coefficient on x is negative and the coefficient on x2 is positive

Then y is decreasing in x at first, but will eventually turn around and be increasing in x

For β̂1 < 0 and β̂2 > 0, the turning point will be at x* = |β̂1/(2β̂2)|, which is the same as when β̂1 > 0 and β̂2 < 0


How to describe decreasing effect


How to describe increasing effect


Interaction Terms

For a model of the form y = β0 + β1x1 + β2x2 + β3x1x2 + u, we can't interpret β1 alone as measuring the change in y with respect to x1; we need to take β3 into account as well, since

Δy/Δx1 = β1 + β3x2, so to summarize the effect of x1 on y we typically evaluate the above at x̄2, the sample mean of x2.
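
A minimal numpy sketch (simulated data) of evaluating the partial effect β̂1 + β̂3·x̄2 at the mean of x2, together with the reparameterization trick used in the attendance example on the next slides: replacing x1x2 with x1·(x2 − x̄2) makes the coefficient on x1 equal that effect and gives it a standard error directly.

import numpy as np

rng = np.random.default_rng(3)
n = 400
x1 = rng.normal(size=n)
x2 = rng.normal(loc=2.0, size=n)
y = 1 + 0.5 * x1 - 0.3 * x2 + 0.4 * x1 * x2 + rng.normal(size=n)

def fit(y, *cols):
    """OLS with a constant; returns (coefficients, standard errors)."""
    X = np.column_stack([np.ones(len(y))] + list(cols))
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    s2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))
    return b, se

# Original parameterization: effect of x1 at the mean of x2 is b1 + b3*mean(x2)
b, _ = fit(y, x1, x2, x1 * x2)
effect_at_mean = b[1] + b[3] * x2.mean()

# Reparameterized model: the coefficient on x1 is that same effect, now with its own se
b_c, se_c = fit(y, x1, x2, x1 * (x2 - x2.mean()))
print(effect_at_mean, b_c[1], se_c[1])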


Example: Effects of Attendance on Final Exam Performance

The coefficient on atndrte is negative. Does this mean that attending class has a negative effect on final exam performance? β1 measures the effect only when priGPA = 0.

The t statistics on atndrte and priGPA·atndrte are individually insignificant. Does this mean that neither affects the final exam score? The p-value of the joint F test is 0.014.


The partial effect of atndrte on stndfnl:

Δstndfnl/Δatndrte = β̂1 + β̂6·priGPA = −0.0067 + 0.0056(2.59) ≈ 0.0078 at the mean priGPA of 2.59.

Interpretation: at the average priGPA (2.59), raising atndrte by 10 percentage points increases stndfnl by about 0.078 standard deviations relative to the mean final exam score.

To get a standard error for this effect, define θ1 = β1 + 2.59β6 and rewrite the model:

stndfnl = β0 + β1·atndrte + ... + β6·priGPA·atndrte + u
        = β0 + θ1·atndrte + ... + β6·(priGPA − 2.59)·atndrte + u

Regressing stndfnl on atndrte, the other controls, and (priGPA − 2.59)·atndrte gives θ̂1 = 0.0078 with se(θ̂1) = 0.0026, so t ≈ 3.


Adjusted R-Squared

Recall that the R2 will always increase as more variables are added to the model

The adjusted R2 takes into account the number of variables in a model, and may decrease

R² = 1 − SSR/SST = 1 − (SSR/n)/(SST/n)

The population R² is 1 − σu²/σy²; SSR/n and SST/n are biased estimators of σu² and σy². Replacing them with the unbiased estimators σ̂² = SSR/(n − k − 1) and SST/(n − 1) gives the adjusted R²:

R̄² = 1 − [SSR/(n − k − 1)]/[SST/(n − 1)] = 1 − σ̂²/[SST/(n − 1)]


The role of the adjusted R²: it imposes a penalty for adding additional independent variables to a model.

As a new independent variable is added to the regression, SSR falls, but the degrees of freedom, df = n − k − 1, fall as well. Hence SSR/(n − k − 1) can either rise or fall.

As a result:

Adding a single variable to a regression raises the adjusted R² if and only if the absolute value of its t statistic is greater than 1;

Adding a group of variables raises the adjusted R² if and only if the F statistic for their joint significance is greater than 1.


Adjusted R-Squared (cont)

It’s easy to see that the adjusted R2 is just (1 – R2)(n – 1) / (n – k – 1), but most packages will give you both R2 and adj-R2

You can compare the fit of 2 models (with the same y) by comparing the adj-R2

You cannot use the adj-R2 to compare models with different y’s (e.g. y vs. ln(y))
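
A short sketch of that formula with purely illustrative numbers, showing that a small gain in R² from one extra regressor can still lower the adjusted R²:

def adj_r2(r2, n, k):
    """Adjusted R-squared: 1 - (1 - R^2)(n - 1)/(n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(adj_r2(0.300, 50, 3))   # about 0.254
print(adj_r2(0.305, 50, 4))   # about 0.243, lower despite the higher R^2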


Using adjusted R-squared to choose between nonnested models.

R̄² = 0.6211 vs. R̄² = 0.6226

R² = 0.061, R̄² = 0.030 vs. R² = 0.148, R̄² = 0.090


SST(salary) = 391,732,982

SST(log(salary)) = 66.72


Controlling for too many factors in regression analysis

Important not to fixate too much on adj-R² and lose sight of theory and common sense
If economic theory clearly predicts a variable belongs, generally leave it in
Don't want to include a variable that prohibits a sensible interpretation of the variable of interest – remember the ceteris paribus interpretation of multiple regression


In a regression model studying the effect of beer taxes on traffic fatalities, should per capita beer consumption be included in the model?

Holding beercons fixed, the coefficient on tax would measure the difference in the fatality rate caused by a one percentage point increase in the beer tax. Does that interpretation make sense?


Adding regressors to reduce the error variance

Adding a new independent variable to a regression can exacerbate the multicollinearity problem; on the other hand, pulling some factors out of the error term and into the regression reduces the error variance.

We should include independent variables that affect y and are uncorrelated with all of the independent variables of interest.


Prediction and Residual Analysis

Suppose we want to use our estimates to obtain a specific prediction.

First, suppose that we want an estimate of E(y | x1 = c1, ..., xk = ck) = θ0 = β0 + β1c1 + ... + βkck

This is easy to obtain by substituting the x's in our estimated model with the c's, but what about a standard error?

Really this is just a test of a linear combination of the βj


Predictions (cont)

Can rewrite as β0 = θ0 − β1c1 − … − βkck

Substitute in to obtain y = θ0 + β1(x1 − c1) + … + βk(xk − ck) + u

So, if you regress yi on (xij − cj), the intercept will give the predicted value and its standard error

Note that the standard error will be smallest when the c's equal the means of the x's
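
A minimal numpy sketch (simulated data, hypothetical c values) of this recentering trick: after regressing y on the shifted regressors, the intercept is the prediction at (c1, c2) and its standard error is the standard error of that prediction.

import numpy as np

rng = np.random.default_rng(4)
n = 150
x1, x2 = rng.normal(size=(2, n))
y = 1 + 0.7 * x1 - 0.4 * x2 + rng.normal(size=n)
c1, c2 = 0.5, -1.0                     # point at which we want E(y | x1=c1, x2=c2)

X = np.column_stack([np.ones(n), x1 - c1, x2 - c2])   # recentered regressors
b = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ b
s2 = resid @ resid / (n - X.shape[1])
se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))

y0_hat, se_y0 = b[0], se[0]   # intercept = prediction at (c1, c2), with its se
print(y0_hat, se_y0)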


Example: CI for predicted college GPA

At sat0 = 1200, hsperc0 = 30, hsize0 = 5: the predicted college GPA is 2.70, and the 95% CI is 2.70 ± 1.96(0.020), i.e. 2.66 to 2.74.


Predictions (cont)

This standard error for the expected value is not the same as the standard error for a new outcome on y

We also need to take into account the variance of the unobserved error


For a new observation with values x1⁰, ..., xk⁰, the outcome is y⁰ = β0 + β1x1⁰ + ... + βkxk⁰ + u⁰, and our prediction is ŷ⁰ = β̂0 + β̂1x1⁰ + ... + β̂kxk⁰. The prediction error is

ê⁰ = y⁰ − ŷ⁰ = β0 + β1x1⁰ + ... + βkxk⁰ + u⁰ − ŷ⁰

Since E(ŷ⁰) = E(β̂0) + E(β̂1)x1⁰ + ... + E(β̂k)xk⁰ = β0 + β1x1⁰ + ... + βkxk⁰, we have E(ê⁰) = E(u⁰) = 0.

Because u⁰ is uncorrelated with ŷ⁰,

Var(ê⁰) = Var(ŷ⁰) + Var(u⁰) = Var(ŷ⁰) + σ², so se(ê⁰) = {[se(ŷ⁰)]² + σ̂²}^(1/2)

The variance of ê⁰ has two sources. The first is the sampling error in ŷ⁰, which comes from our estimation of the βj; because Var(ŷ⁰) is proportional to 1/n, se(ŷ⁰) shrinks as the sample size grows. The second is the variance of the population error, σ², which does not change with the sample size.


Under the CLM assumptions, the β̂j and u⁰ are normally distributed, so ê⁰ is also normally distributed, and ê⁰/se(ê⁰) has a t distribution with n − k − 1 degrees of freedom.

Since P(−t0.025 ≤ ê⁰/se(ê⁰) ≤ t0.025) = 0.95, a 95% prediction interval for y⁰ is

ŷ⁰ ± t0.025·se(ê⁰)


Prediction interval

ê⁰/se(ê⁰) ~ t(n−k−1), so that given se(ê⁰) we have a 95% prediction interval for y⁰: ŷ⁰ ± t0.025·se(ê⁰)

Usually the estimate of σ² is much larger than the variance of the prediction, thus

This prediction interval will be a lot wider than the simple confidence interval for the prediction
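
A sketch of both intervals on simulated data, assuming scipy is available for the t critical value; it combines se(ŷ⁰) from the recentering trick with σ̂² and shows how much wider the prediction interval is.

import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n = 150
x1, x2 = rng.normal(size=(2, n))
y = 1 + 0.7 * x1 - 0.4 * x2 + rng.normal(size=n)
c1, c2 = 0.5, -1.0

# Recentered regression: intercept gives y0_hat and se(y0_hat)
X = np.column_stack([np.ones(n), x1 - c1, x2 - c2])
b = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ b
df = n - X.shape[1]
sigma2_hat = resid @ resid / df
se_b = np.sqrt(np.diag(sigma2_hat * np.linalg.inv(X.T @ X)))
y0_hat, se_y0 = b[0], se_b[0]

# se(e0) combines the sampling error in y0_hat with the error variance
se_e0 = np.sqrt(se_y0**2 + sigma2_hat)
t975 = stats.t.ppf(0.975, df)
print(y0_hat - t975 * se_e0, y0_hat + t975 * se_e0)   # 95% prediction interval for y0
print(y0_hat - t975 * se_y0, y0_hat + t975 * se_y0)   # much narrower CI for E(y0)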


Residual Analysis

Information can be obtained from looking at the residuals, ûi = yi − ŷi (i.e. predicted vs. observed values)

Example: Regress the price of cars on characteristics – big negative residuals indicate a good deal

Example: Regress average earnings for students from a school on student characteristics – big positive residuals indicate the greatest value-added
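
A minimal sketch (simulated data, hypothetical "price on characteristics" setting) of ranking observations by their residuals to flag apparent good deals:

import numpy as np

rng = np.random.default_rng(6)
n = 100
size = rng.normal(2000, 300, size=n)         # hypothetical house characteristics
beds = rng.integers(1, 6, size=n)
price = 20 + 0.1 * size + 15 * beds + rng.normal(0, 30, size=n)

X = np.column_stack([np.ones(n), size, beds])
b = np.linalg.lstsq(X, price, rcond=None)[0]
resid = price - X @ b

best_deals = np.argsort(resid)[:5]           # most negative residuals: cheapest
print(best_deals, resid[best_deals])         # relative to the predicted price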


For example, in the housing price model estimated with HPRICE1.RAW:

price = β0 + β1·lotsize + β2·sqrft + β3·bdrms + u

the residual for house 81 is û81 = price81 − ŷ81 = −120.206, a large negative residual (the house sold for far less than its predicted price).


Predicting y in a log model

Simple exponentiation of the predicted ln(y) will underestimate the expected value of y

Instead need to scale this up by an estimate of the expected value of exp(u)


log(y) = β0 + β1x1 + ... + βkxk + u, so y = exp(β0 + β1x1 + ... + βkxk)·exp(u)

E(y|x) = exp(β0 + β1x1 + ... + βkxk)·E[exp(u)|x]

If u ~ Normal(0, σ²) and is independent of the x's, then E[exp(u)] = exp(σ²/2). In this case we can predict y as

ŷ = exp(σ̂²/2)·exp(log(y)^)

where log(y)^ denotes the fitted value from the log(y) regression. More generally, with α0 = E[exp(u)],

ŷ = α̂0·exp(log(y)^)


Predicting y when the dependent variable is log(y):

(1) Regress log(y) on x1, x2, ..., xk and obtain the fitted value log(yi)^ for each observation i.
(2) For each observation i, compute m̂i = exp(log(yi)^).
(3) Regress yi on m̂i without an intercept; the coefficient on m̂i (the only coefficient) is the estimate α̂0.
(4) For the given values of x1, x2, ..., xk, compute log(y)^ = β̂0 + β̂1x1 + ... + β̂kxk.
(5) Obtain the prediction directly as ŷ = α̂0·exp(log(y)^).
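
A numpy sketch of these five steps on simulated data (the variable names and values are placeholders, not the CEO salary data):

import numpy as np

rng = np.random.default_rng(7)
n = 300
x = rng.normal(size=(n, 2))
u = rng.normal(scale=0.5, size=n)
y = np.exp(1.0 + 0.6 * x[:, 0] + 0.3 * x[:, 1] + u)

# (1) regress log(y) on the x's and get the fitted values
X = np.column_stack([np.ones(n), x])
b = np.linalg.lstsq(X, np.log(y), rcond=None)[0]
logy_hat = X @ b

# (2) m_i = exp(fitted log y)
m = np.exp(logy_hat)

# (3) regress y on m with no intercept; the slope is alpha0_hat
alpha0_hat = (m @ y) / (m @ m)

# (4)-(5) prediction at a new point x0 (constant included)
x0 = np.array([1.0, 0.2, -0.5])
y0_hat = alpha0_hat * np.exp(x0 @ b)

print(alpha0_hat, y0_hat)
# exp(logy_hat) alone would understate E(y|x); multiplying by alpha0_hat corrects this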


Example 6.7: Predicting CEO Salaries

For every observation in the sample, compute m̂i = exp(lsalaryi^), where lsalaryi^ is the fitted value from the lsalary regression;

Regress salary on m̂ (with no intercept) to obtain α̂0 = 1.117.

For sales = 5000, mktval = 10000, and ceoten = 10:

lsalary^ = 4.504 + 0.163·log(5000) + 0.109·log(10000) + 0.0117·(10) = 7.013

salary^ = 1.117·exp(7.013) = 1240.967, or about $1,240,967.


Comparing log and level models

A by-product of the previous procedure is a method to compare a model in logs with one in levels.

Take the fitted values from the auxiliary regression, and find the sample correlation between this and y

Compare the R² from the levels regression with this correlation squared

[corr(y, ŷ)]² = [Σi (yi − ȳ)(ŷi − mean(ŷ))]² / [Σi (yi − ȳ)²·Σi (ŷi − mean(ŷ))²],  compared with the R² from the levels regression
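
A short sketch of the comparison described above, on simulated data: square the correlation between y and the fitted values implied by the log model, and set it against the R² of the levels regression.

import numpy as np

rng = np.random.default_rng(8)
n = 300
x = rng.normal(size=(n, 2))
y = np.exp(1.0 + 0.6 * x[:, 0] + 0.3 * x[:, 1] + rng.normal(scale=0.5, size=n))
X = np.column_stack([np.ones(n), x])

# Levels model and its R^2
b_lev = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ b_lev
r2_levels = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

# Log model: fitted y values via the alpha0 adjustment, then corr^2 with y
b_log = np.linalg.lstsq(X, np.log(y), rcond=None)[0]
m = np.exp(X @ b_log)
alpha0_hat = (m @ y) / (m @ m)
y_hat_log = alpha0_hat * m
corr_sq = np.corrcoef(y, y_hat_log)[0, 1] ** 2

print(r2_levels, corr_sq)   # the larger value indicates the better-fitting model for y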


Example 6.8: Predicting CEO Salaries

Using ŷ = α̂0·exp(lsalary^), the sample correlation between ŷ and salary is 0.493, so [corr(ŷ, salary)]² ≈ 0.243.

The levels model salary = β0 + β1·sales + β2·mktval + β3·ceoten + u has R² = 0.201.

Since 0.243 > 0.201, the log model explains more of the variation in salary.