Part II The Simple Regression Model -...

52
The Simple Regression Model Part II The Simple Regression Model As of Sep 16, 2020 Seppo Pynn¨ onen Econometrics I

Transcript of Part II The Simple Regression Model -...

Page 1: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Part II

The Simple Regression Model

As of Sep 16, 2020Seppo Pynnonen Econometrics I

Page 2: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Definition1 The Simple Regression Model

Definition

Estimation of the model, OLS

OLS Statistics

Algebraic properties

Goodness-of-Fit, the R-square

Units of Measurement and Functional Form

Scaling and translation

Nonlinearities

Expected Values and Variances of the OLS Estimators

Unbiasedness

Variances

Estimating the Error Variance

Standard Errors of β0 and β1

Seppo Pynnonen Econometrics I

Page 3: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Definition

Two (observable) variables ”y” and ”x”.

y = β0 + β1x + u. (1)

Equation (1) defines the simple regression model.

Terminology:

y x

Dependent variable Independent variableExplained variable Explanatory variableResponse variable Control variablePredicted variable PredictorRegressand Regressor

Seppo Pynnonen Econometrics I

Page 4: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Definition

The variable u, called the error term or disturbance, accountsfactors other than x that affect y .

β0 is the intercept (called also the constant term) and β1 is theslope of the regression line β0 + β1x . That is, if the other factors(in u) are held fixed, so that the change in u is zero,

∆y = β1∆x , if ∆u = 0.

Thus, the (expected) change in y is β1 times the change in x .

In particular if x changes by one unit, the (expected) change in yis β1 units.

Seppo Pynnonen Econometrics I

Page 5: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Definition

Error term ui is a combination of a number of effects, like:

1 Omitted variables: Accounts the effects of variables omittedfrom the model

2 Nonlinearities: Captures the effects of nonlinearities betweeny and x . Thus, if the true model is yi = β0 + β1xi + γx2

i + vi ,and we assume that it is yi = β0 + β1x + ui , then the effect ofx2i is absorbed to ui . In fact, ui = γx2

i + vi .

3 Measurement errors: Errors in measuring y and x areabsorbed in ui .

4 Unpredictable effects: ui includes also inherently unpredictablerandom effects.

Seppo Pynnonen Econometrics I

Page 6: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Estimation of the model, OLS1 The Simple Regression Model

Definition

Estimation of the model, OLS

OLS Statistics

Algebraic properties

Goodness-of-Fit, the R-square

Units of Measurement and Functional Form

Scaling and translation

Nonlinearities

Expected Values and Variances of the OLS Estimators

Unbiasedness

Variances

Estimating the Error Variance

Standard Errors of β0 and β1

Seppo Pynnonen Econometrics I

Page 7: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Estimation of the model, OLS

Given observations (xi , yi ), i = 1, . . . , n, the task for aneconometrician is to estimate the population parameters β0

(intercept parameter) and β1 (slope parameter) of (1) in anoptimal way by utilizing the sample information.

Assumptions (classical assumptions)

1 Zero average error: E[ui ] = 0 for all i .

2 Homoskedasticity: var[ui ] = σ2 for all i .

3 Uncorrelated errors: cov[ui , uj ] = 0, for all i 6= j .

4∑n

i=1(xi − x)2 > 0.

5 cov[ui , xi ] = 0.

Seppo Pynnonen Econometrics I

Page 8: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Estimation of the model, OLS

Remark 1

Assumptions 1 and 2 are related to distributional properties ofthe error term of which Assumption 1 can always be fulfilledwhen the intercept term β0 is included to the regression in(1). We discuss Assumption 2 later in more detail.

Assumption 3. is mostly relevant in time series data or”clustered” data.

Assumption 5 is the key assumption how x and u are related,expressed sometimes in terms the stronger assumption ofmean independence, E[ui |xi ] = E[ui ], which impliescov[ui , xi ] = 0 when E[ui ] = 0.

Seppo Pynnonen Econometrics I

Page 9: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Estimation of the model, OLS

Exploring further 2.1, Wooldridge 5e, p.23

Suppose that a score on a final exam, score, depends on classesattended (attend) and unobserved factors that affect exam performance(such as student ability). Then

score = β0 + β1attend + u. (2)

When would you expect this model to satisfy E[u|attend] = E[u]?

Seppo Pynnonen Econometrics I

Page 10: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Estimation of the model, OLS

The goal in the estimation is to find values for β0 and β1 of model(1) such that the error terms is as small as possible (in a suitablesense).

Under the above classical assumptions, the Ordinary Least Squares(OLS) minimizes the residual sum of squares of the error termsui = yi − β0 − β1xi , which produces optimal estimates for theparameters (the optimality criteria are discussed later).

Denote the sum of squares as

f (β0, β1) =n∑

i=1

(yi − β0 − β1xi )2. (3)

The first order conditions (foc) for the minimum are found bysetting the partial derivatives equal to zero. Denote by β0 and β1

the values satisfying the foc.

Seppo Pynnonen Econometrics I

Page 11: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Estimation of the model, OLS

First order conditions:

∂f (β0, β1)

∂β0= −2

n∑i=1

(yi − β0 − β1xi ) = 0 (4)

∂f (β0, β1)

∂β1= −2

n∑i=1

xi (yi − β0 − β1xi ) = 0 (5)

These yield so called normal equations

nβ0 + β1∑

xi =∑

yi

β0∑

xi + β1∑

x2i =

∑xiyi ,

(6)

where the summation is from 1 to n.

Seppo Pynnonen Econometrics I

Page 12: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Estimation of the model, OLS

The explicit solutions for β0 and β1 are (OLS estimators of β0 andβ1)

β1 =

∑ni=1(xi − x)(yi − y)∑n

i=1(xi − x)2(7)

β0 = y − β1x , (8)

where

x =1

n

n∑i=1

xi and y =1

n

n∑i=1

yi

are the sample means.

In the solutions (7) and (8) we have used the propertiesn∑

i=1

(xi − x)(yi − y) =n∑

i=1

xiyi − nx y (9)

andn∑

i=1

(xi − x)2 =n∑

i=1

x2i − nx2. (10)

Seppo Pynnonen Econometrics I

Page 13: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Estimation of the model, OLS

Fitted regression line:y = β0 + β1x . (11)

Residuals:

ui = yi − yi

= (β0 − β0) + (β1 − β1)xi + ui .(12)

Thus the residual component ui consist of the pure error term uiand the sample errors due to the estimation of the parameters β0

and β1.

Seppo Pynnonen Econometrics I

Page 14: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Estimation of the model, OLS

Fitted regression line y = β0 + β1x in (11) is called also as thesample regression function (SRF) while E[y |x ] = β0 + β1x (recallE[u|x ] = 0) is called the population regression function (PRF).

It is notable that PRF is something fixed, but unknown, in thepopulation.

SRF is obtained for a given sample of data, a new sample willgenerate new a slope and intercept in (11), and therefore differentlines (that vary around PRF).

Seppo Pynnonen Econometrics I

Page 15: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Estimation of the model, OLS

Example 1

Suppose the population regression is y = β0 + β1x + u with u ∼ N(0, σ2) inwhich β0 = 5, β1 = 1.2, and σ2 = 4.

For a sample of n = 50 observations (x generated from the normal distributionN(5, 3)), we have the sample regression line shown in the top right hand plot.

Repeating sampling 5 times, the respective SRFs are shown in the bottom leftfigure along with the PRF.

Data

Obs x y1 5.03 10.242 4.68 9.953 2.62 10.894 3.96 14.035 5.51 12.62. . .. . .. . .

1 2 3 4 5 6 7

46

810

1214

PRF and a SDF form a sampleof n = 50 observations

xy

PRFSRF

1 2 3 4 5 6 7

46

810

1214

Population regression line andfive OLS sample regression lines

x

y

PRFSRFs

Seppo Pynnonen Econometrics I

Page 16: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Estimation of the model, OLS

Remark 2

The slope coefficient β1 in terms of sample covariance of x and y andvariance of x .

Sample covariance:

sxy =1

n − 1

n∑i=1

(xi − x)(yi − y) (13)

Sample variance:

s2x =

1

n − 1

n∑i=1

(xi − x)2. (14)

Thusβ1 =

sxys2x

. (15)

Seppo Pynnonen Econometrics I

Page 17: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Estimation of the model, OLS

Remark 3

The slope coefficient β1 in terms of sample correlation and standarddeviations of x and y .

Sample correlation:

rxy =

∑ni=1(xi − x)(yi − y)√∑n

i=1(xi − x)2∑n

i=1(yi − y)2=

sxysxsy

, (16)

where sx =√

s2x and sy =

√s2y are the sample standard deviations of x

and y , respectively.

Thus we can also write the slope coefficient in terms of sample standarddeviations and correlation as

β1 =sysxrxy . (17)

Seppo Pynnonen Econometrics I

Page 18: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Estimation of the model, OLS

Example 2

Relationship between wage and eduction.

wage = average hourly earningseduc = years of educationData is collected in 1976, n = 526

SAS commands for reading data, printing a few lines, sample statistics,generating a scatter plot, and estimating the regression.

SAS Studio excerpt:

Seppo Pynnonen Econometrics I

Page 19: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Estimation of the model, OLS

Scatter plot of the data with regression line.

/* Scatter plot with regression line imposed */

proc sgplot data = a;

reg x = educ y = wage / lineattrs = (color = "red"); * generates scatter plot with regression line;

xaxis label = "Education (years)"; * x-axis label;

yaxis label = "Hourly wage"; * y-axis label

run;

Figure 2.2: Wages and education.Seppo Pynnonen Econometrics I

Page 20: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Estimation of the model, OLS

Sample statistics:

/* Some sample statistics */

proc means data = a min max mean std skew kurt maxdec = 2; * stats with output rounded to two decimals;

var wage educ;

run;

Seppo Pynnonen Econometrics I

Page 21: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Estimation of the model, OLS

SAS PROC REG results:

The estimated model isy = −0.905 + 0.541x .

Thus, the model predicts that an additional year increases hourly wage onaverage by 0.54 dollars.

Using (17) you can verify the OLS estimate for β1 can be computed using the

correlation (rwage,educ = 0.406) and standard deviations in the above sample

statistics table. After that, applying (8) you get OLS estimate for the intercept.

Thus in all, the estimates can be derived from the basic sample statistics.

Seppo Pynnonen Econometrics I

Page 22: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

OLS Statistics1 The Simple Regression Model

Definition

Estimation of the model, OLS

OLS Statistics

Algebraic properties

Goodness-of-Fit, the R-square

Units of Measurement and Functional Form

Scaling and translation

Nonlinearities

Expected Values and Variances of the OLS Estimators

Unbiasedness

Variances

Estimating the Error Variance

Standard Errors of β0 and β1

Seppo Pynnonen Econometrics I

Page 23: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

OLS Statistics

n∑i=1

ui = 0. (18)

n∑i=1

xi ui = 0. (19)

y = β0 + β1x . (20)

SST =n∑

i=1

(yi − y)2. (21)

SSE =n∑

i=1

(yi − y)2. (22)

SSR =n∑

i=1

(yi − yi )2. (23)

Seppo Pynnonen Econometrics I

Page 24: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

OLS Statistics

It can be shown that

n∑i=1

(yi − y)2 =n∑

i=1

(yi − y)2 +n∑

i=1

(yi − yi )2, (24)

that isSST = SSE + SSR. (25)

Prove this!

Remark 4

It is unfortunate that different books and different statistical packages

use different definitions, particularly for SSR and SSE. In many the

former means Regression sum of squares and the latter Error sum of

squares. I.e., just the opposite we have here!

Seppo Pynnonen Econometrics I

Page 25: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Goodness-of-Fit, the R-square1 The Simple Regression Model

Definition

Estimation of the model, OLS

OLS Statistics

Algebraic properties

Goodness-of-Fit, the R-square

Units of Measurement and Functional Form

Scaling and translation

Nonlinearities

Expected Values and Variances of the OLS Estimators

Unbiasedness

Variances

Estimating the Error Variance

Standard Errors of β0 and β1

Seppo Pynnonen Econometrics I

Page 26: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Goodness-of-Fit, the R-square

R-square (coefficient of determination)

R2 =SSE

SST= 1− SSR

SST. (26)

The positive square root of R2, denoted as R, is called themultiple correlation.

Remark 5

Here in the case of simple regression R2 = r2xy , i.e. R = |rxy |. These do

not hold in the general case (multiple regression)!

Prove Remark 5 by yourself.

Remark 6

Generally it holds for the OLS estimation, however, that R = ryy , i.e.

correlation between the observed and fitted (or predicted) values.

Seppo Pynnonen Econometrics I

Page 27: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Goodness-of-Fit, the R-square

Remark 7

It is obvious that 0 ≤ R2 ≤ 1 with R2 = 0 representing no linear relation

between x and y and R2 = 1 representing a perfect fit.

Adjusted R-square:

R2 = 1− s2u

s2y

, (27)

where

s2u =

1

n − 2

n∑i=1

(yi − yi )2 (28)

is an estimate of the residual varianceσ2u = var[u].

We find easily that

R2 = 1− n − 2

n − 1(1− R2). (29)

One finds immediately that R2 < R2.Seppo Pynnonen Econometrics I

Page 28: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Goodness-of-Fit, the R-square

Example 3

In the previous example R2 = 0.1648 and adjusted R-squared,

R2 = 0.1632. The R2 tells that about 16.5 percent of the variation in the

hourly earnings can be explained by education. However, the rest 83.5

percent is not accounted by the model.

Seppo Pynnonen Econometrics I

Page 29: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Units of Measurement and Functional Form1 The Simple Regression Model

Definition

Estimation of the model, OLS

OLS Statistics

Algebraic properties

Goodness-of-Fit, the R-square

Units of Measurement and Functional Form

Scaling and translation

Nonlinearities

Expected Values and Variances of the OLS Estimators

Unbiasedness

Variances

Estimating the Error Variance

Standard Errors of β0 and β1

Seppo Pynnonen Econometrics I

Page 30: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Units of Measurement and Functional Form

Consider the simple regression model

yi = β0 + β1xi + ui (30)

with σ2u = var[ui ].

Let y∗i = a0 + a1yi and x∗i = b0 + b1xi , a1 6= 0 and b1 6= 0. Then(30) becomes

y∗i = β∗0 + β∗1x∗i + u∗i , (31)

whereβ∗0 = a1β0 + a0 −

a1

b1β1b0, (32)

β∗1 =a1

b1β1, (33)

andσ2u∗ = a2

1σ2u. (34)

Seppo Pynnonen Econometrics I

Page 31: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Units of Measurement and Functional Form

In short the effects on the regression coefficients are:

The slope coefficient (β1) is affected only by scaling of themeasurements.

Scaling y scales the slope coefficient by the same amount.Scaling x scales the slope coefficient inversely.

The intercept is affected by both the scaling and origin shift.

Seppo Pynnonen Econometrics I

Page 32: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Units of Measurement and Functional Form

Remark 8

Coefficients a1 and b1 scale the measurements and a0 and b0 shift themeasurements origin.

For example, if y is temperature measured in Celsius, then

y∗ = 32 +9

5y

gives temperature in Fahrenheit.

Seppo Pynnonen Econometrics I

Page 33: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Units of Measurement and Functional Form

Example 4

Suppose an econometrician has estimated the following wage equation

wage = 8 + 0.8 educ,

where wage is wage per hour in US dollars and educ is education innumber of years.

How much an additional year of education increases wage in euros if oneeuro is the exchange rate is 1.10 (i.e. one euro is worth of 1.10 dollars)?

Because wagee = wage$/1.10, βe = β$/1.10 = 0.8/1.10 ≈ 0.73.

Seppo Pynnonen Econometrics I

Page 34: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Units of Measurement and Functional Form

Example 5

Let the estimated model be

yi = β0 + β1xi .

”Demeaned” observations:

y∗i = yi − y and xi − x . So a0 = −y , b0 = −x , and a1 = b1 = 1.

Because β0 = y − β1x , we obtain from (32)

β∗0 = β0 − (y − β1x) = 0.

Soy∗ = β1x

∗.

(Note β1 remains unchanged).

Seppo Pynnonen Econometrics I

Page 35: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Units of Measurement and Functional Form

If we further define a1 = 1/sy and b1 = 1/sx , where sy and sx are thesample standard deviations of y and x , respectively. Applying thetransformation yields standardized observations

y∗i =

yi − y

syand x∗i =

xi − x

sx.

Then again β0 = 0. The slope coefficient becomes

β∗1 =

sxsyβ1,

which is called standardized regression coefficient.

As an exercise show that in this case β∗1 = rxy , the correlation coefficient

of x and y .

Seppo Pynnonen Econometrics I

Page 36: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Units of Measurement and Functional Form1 The Simple Regression Model

Definition

Estimation of the model, OLS

OLS Statistics

Algebraic properties

Goodness-of-Fit, the R-square

Units of Measurement and Functional Form

Scaling and translation

Nonlinearities

Expected Values and Variances of the OLS Estimators

Unbiasedness

Variances

Estimating the Error Variance

Standard Errors of β0 and β1

Seppo Pynnonen Econometrics I

Page 37: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Units of Measurement and Functional Form

Logarithmic transformation is one of the most appliedtransformation for economic variables.

Table 2.1 Functional forms including log-transformations

Dependent Independent InterpretationModel variable variable of β1

level-level y x ∆y = β1∆x

level-log y log(x) ∆y = (β1/100)%∆x

log-level log(y) x %∆y = (100β1)∆x

log-log log(y) log(x) %∆y = β1%∆x

∆x , ∆y are changes in x andy , and %∆x = 100∆x/x , %∆y = 100∆y/y are

percentage changes.

Can you find the rationale for the interpretations?

Remark 9

Log-transformation can be only applied to variables that assume strictly

positive values!

Seppo Pynnonen Econometrics I

Page 38: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Units of Measurement and Functional Form

Example 6

Consider again the wage example.

Suppose we believe that instead of the absolute change a better choice isto consider the percentage change of wage (y) as a function of education(x). Then we would consider the model

log(y) = β0 + β1x + u.

Estimation of this model yields

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) 0.583773 0.097336 5.998 3.74e-09 ***

educ 0.082744 0.007567 10.935 < 2e-16 ***

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.4801 on 524 degrees of freedom

Multiple R-squared: 0.1858,Adjusted R-squared: 0.1843

F-statistic: 119.6 on 1 and 524 DF, p-value: < 2.2e-16

Seppo Pynnonen Econometrics I

Page 39: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Units of Measurement and Functional Form

That islog(y) = 0.584 + 0.083x (35)

n = 526, R2 = 0.186. Note that R-squares of this model and thelevel-level model are not comparable.

The model predicts that an additional year of education increases on

average hourly earnings by 8.3%.

Seppo Pynnonen Econometrics I

Page 40: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Units of Measurement and Functional Form

Remark 10

Typically all models where transformations on y and x are functions ofthese variables alone can be cast to the form of linear model. That is, ifhave generally

g(y) = β0 + β1h(x) + u, (36)

where g and h are functions, then defining y∗ = g(y) and x∗ = h(x) wehave a linear model

y∗ = β0 + β1x∗ + u.

Note, however, that all models cannot be cast to a linear form. Anexample is

cons =1

β0 + β1income+ u.

Seppo Pynnonen Econometrics I

Page 41: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Expected Values and Variances of the OLS Estimators1 The Simple Regression Model

Definition

Estimation of the model, OLS

OLS Statistics

Algebraic properties

Goodness-of-Fit, the R-square

Units of Measurement and Functional Form

Scaling and translation

Nonlinearities

Expected Values and Variances of the OLS Estimators

Unbiasedness

Variances

Estimating the Error Variance

Standard Errors of β0 and β1

Seppo Pynnonen Econometrics I

Page 42: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Expected Values and Variances of the OLS Estimators

We say generally that an estimator of θ of a parameter θ is

unbiased if E[θ]

= θ.

Theorem 1

Under the classical assumptions 1–5

E[β0

]= β0 and E

[β1

]= β1. (37)

Proof: Given observations x1, . . . , xn the expectations are conditional onthe given xi -values.

We prove first the unbiasedness of β1. Now

β1 =∑

(xi−x)(yi−y)∑(xi−x)2 =

∑(xi−x)yi∑(xi−x)2

=∑

(xi−x)β0∑(xi−x)2 + β1

∑(xi−x)xi∑(xi−x)2 +

∑(xi−x)ui∑(xi−x)2

= β1 + 1∑(xi−x)2

∑(xi − x)ui .

(38)

Seppo Pynnonen Econometrics I

Page 43: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Expected Values and Variances of the OLS Estimators

I.e.,

β1 = β1 +1∑

(xi − x)2

∑(xi − x)ui . (39)

Because xi s fixed we get

E[β1

]= β1 +

1∑(xi − x)2

∑(xi − x)E[ui ] = β1 (40)

Because E[ui ] = 0 by assumption 1. Thus β1 is unbiased.

Proof of unbiasedness of β0 is left to students.

Seppo Pynnonen Econometrics I

Page 44: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Expected Values and Variances of the OLS Estimators1 The Simple Regression Model

Definition

Estimation of the model, OLS

OLS Statistics

Algebraic properties

Goodness-of-Fit, the R-square

Units of Measurement and Functional Form

Scaling and translation

Nonlinearities

Expected Values and Variances of the OLS Estimators

Unbiasedness

Variances

Estimating the Error Variance

Standard Errors of β0 and β1

Seppo Pynnonen Econometrics I

Page 45: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Expected Values and Variances of the OLS Estimators

Theorem 2

Under the classical assumptions 1 through 5 and given x1, . . . , xn

var[β1

]=

σ2u∑n

i=1(xi − x)2, (41)

var[β0

]=

(1

n+

x2∑(xi − x)2

)σ2u. (42)

and for y = β0 + β1x with given x

var[y ] =

(1

n+

(x − x)2∑(xi − x)2

)σ2u. (43)

Seppo Pynnonen Econometrics I

Page 46: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Expected Values and Variances of the OLS Estimators

Proof: Again we prove as an example only (41). Using (39) and theproperties of variance with x1, . . . , xn given

var[β1

]= var

[β1 + 1∑

(xi−x)2

∑(xi − x)ui

]

=(

1∑(xi−x)2

)2∑(xi − x)2var[ui ]

=(

1∑(xi−x)2

)2∑(xi − x)2σ2

u

=σ2u

∑(xi−x)2

(∑

(xi−x)2)2

=σ2u∑

(xi−x)2 .

(44)

Seppo Pynnonen Econometrics I

Page 47: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Expected Values and Variances of the OLS Estimators

Remark 11

(42) can be written equivalently as

var[β0

]=

σ2u

∑x2i

n∑

(xi − x)2. (45)

Seppo Pynnonen Econometrics I

Page 48: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Expected Values and Variances of the OLS Estimators1 The Simple Regression Model

Definition

Estimation of the model, OLS

OLS Statistics

Algebraic properties

Goodness-of-Fit, the R-square

Units of Measurement and Functional Form

Scaling and translation

Nonlinearities

Expected Values and Variances of the OLS Estimators

Unbiasedness

Variances

Estimating the Error Variance

Standard Errors of β0 and β1

Seppo Pynnonen Econometrics I

Page 49: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Expected Values and Variances of the OLS Estimators

Recalling from equation (12) the residual ui = yi − yi is of the form

ui = ui − (β0 − β0)− (β1 − β1)xi . (46)

This reminds us about the difference between the error term ui andthe residual term ui .

An unbiased estimator of the error variance σ2u = var[ui ] is

σ2u =

1

n − 2

n∑i=1

u2i . (47)

Taking the (positive) square root gives an estimator for the errorstandard deviation σu =

√σ2u, called usually the standard error of

regression

σu =

√1

n − 2

∑u2i . (48)

Seppo Pynnonen Econometrics I

Page 50: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Expected Values and Variances of the OLS Estimators

Theorem 3

Under the assumption (1)–(5)

E[σ2u

]= σ2

u,

i.e., σ2 is an unbiased estimator of σ2u.

Proof: Omitted.

Seppo Pynnonen Econometrics I

Page 51: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Expected Values and Variances of the OLS Estimators1 The Simple Regression Model

Definition

Estimation of the model, OLS

OLS Statistics

Algebraic properties

Goodness-of-Fit, the R-square

Units of Measurement and Functional Form

Scaling and translation

Nonlinearities

Expected Values and Variances of the OLS Estimators

Unbiasedness

Variances

Estimating the Error Variance

Standard Errors of β0 and β1

Seppo Pynnonen Econometrics I

Page 52: Part II The Simple Regression Model - Univaasalipas.uwasa.fi/~sjp/Teaching/ecm/lectures/ecmc2.pdf · The Simple Regression Model Estimation of the model, OLS Example 1 Suppose the

The Simple Regression Model

Expected Values and Variances of the OLS Estimators

Replacing in (41) and (42) σ2u by σ2

u and taking square roots givethe standard error of β1 and β0

se(β1) =σu√∑

(xi − x)2(49)

and

se(β0) = σu

√1

n+

x2∑(xi − x)2

. (50)

These belong to the standard output of regression estimation, seecomputer print-outs above.

Seppo Pynnonen Econometrics I