Econometrics II - Vaasan yliopistolipas.uwasa.fi/~sjp/Teaching/ecmii/lectures/ecmiic1.pdf ·...
Transcript of Econometrics II - Vaasan yliopistolipas.uwasa.fi/~sjp/Teaching/ecmii/lectures/ecmiic1.pdf ·...
Econometrics II
Seppo Pynnonen
Department of Mathematics and Statistics, University of Vaasa, Finland
Spring 2018
Seppo Pynnonen Econometrics II
Introduction
Part I
Introduction
As of Jan 8, 2018Seppo Pynnonen Econometrics II
Introduction
Econometrics1 Introduction
Econometrics
Types of Economic Data
Cross-sectional
Time Series Data
Pooled Cross-sections
Panel Data
The linear regression model
Regression statistics
Inference
Nonlinear hypotheses
Maximum Likelihood
Maximum Likelihood Estimation
Properties of Maximum Likelihood Estimators
Likelihood Ratio, Wald, and Lagrange Multiplier tests
Seppo Pynnonen Econometrics II
Introduction
Econometrics
Econometrics is a discipline of statistics, specialized for using anddeveloping mathematical and statistical tools for empiricalestimation of economic relationships, testing economic theories,making economic predictions, and evaluating government andbusiness policy.
Data: Nonexperimental (observational)
Major tool: Regression analysis (in wide sense)
Seppo Pynnonen Econometrics II
Introduction
Types of Economic Data1 Introduction
Econometrics
Types of Economic Data
Cross-sectional
Time Series Data
Pooled Cross-sections
Panel Data
The linear regression model
Regression statistics
Inference
Nonlinear hypotheses
Maximum Likelihood
Maximum Likelihood Estimation
Properties of Maximum Likelihood Estimators
Likelihood Ratio, Wald, and Lagrange Multiplier tests
Seppo Pynnonen Econometrics II
Introduction
Types of Economic Data1 Introduction
Econometrics
Types of Economic Data
Cross-sectional
Time Series Data
Pooled Cross-sections
Panel Data
The linear regression model
Regression statistics
Inference
Nonlinear hypotheses
Maximum Likelihood
Maximum Likelihood Estimation
Properties of Maximum Likelihood Estimators
Likelihood Ratio, Wald, and Lagrange Multiplier tests
Seppo Pynnonen Econometrics II
Introduction
Types of Economic Data
(a) Cross-sectionalData collected at given point of time. E.g. a sample ofhouseholds or firms, from each of which are a number ofvariables like turnover, operating margin, market value ofshares, etc., are measured.From econometric point of view it is important that theobservations consist a random sample from the underlyingpopulation.
Seppo Pynnonen Econometrics II
Introduction
Types of Economic Data1 Introduction
Econometrics
Types of Economic Data
Cross-sectional
Time Series Data
Pooled Cross-sections
Panel Data
The linear regression model
Regression statistics
Inference
Nonlinear hypotheses
Maximum Likelihood
Maximum Likelihood Estimation
Properties of Maximum Likelihood Estimators
Likelihood Ratio, Wald, and Lagrange Multiplier tests
Seppo Pynnonen Econometrics II
Introduction
Types of Economic Data
(b) Time SeriesA time series consist of observations on a variable(s) overtime. Typical examples are daily share prices, interest rates,CPI values.An important additional feature over cross-sectional data isthe ordering of the observations, which may convey importantinformation.An additional feature is data frequency which may requirespecial attention.
Seppo Pynnonen Econometrics II
Introduction
Types of Economic Data1 Introduction
Econometrics
Types of Economic Data
Cross-sectional
Time Series Data
Pooled Cross-sections
Panel Data
The linear regression model
Regression statistics
Inference
Nonlinear hypotheses
Maximum Likelihood
Maximum Likelihood Estimation
Properties of Maximum Likelihood Estimators
Likelihood Ratio, Wald, and Lagrange Multiplier tests
Seppo Pynnonen Econometrics II
Introduction
Types of Economic Data
(c) Pooled Cross-sectionsBoth time series and cross-section features.For example a number of firms are randomly selected, say in1990, and another sample is selected in 2000.If in both samples the same features are measured, combiningboth years form a pooled cross-section data set.Pooled cross-section data is analyzed much the same way asusual cross-section data.However, it may be important to pay special attention to thefact that there are 10 years in between.Usually the interest is whether there are some importantchanges between the time points. Statistical tools are usuallythe same as those used for analysis of differences between twoindependently sampled populations.
Seppo Pynnonen Econometrics II
Introduction
Types of Economic Data1 Introduction
Econometrics
Types of Economic Data
Cross-sectional
Time Series Data
Pooled Cross-sections
Panel Data
The linear regression model
Regression statistics
Inference
Nonlinear hypotheses
Maximum Likelihood
Maximum Likelihood Estimation
Properties of Maximum Likelihood Estimators
Likelihood Ratio, Wald, and Lagrange Multiplier tests
Seppo Pynnonen Econometrics II
Introduction
Types of Economic Data
(d) Panel dataPanel data (longitudinal data) consists of (time series) datafor the same cross section units over time.Allows to analyze much richer dependencies than pure crosssection data.
Seppo Pynnonen Econometrics II
Introduction
Types of Economic Data
Example 1
Job training data from Holzer et al. (1993) Are training subsidieseffective? The Michigan experience, Industrial and Labor RelationsReview 19, 625–636.
Excerpt from the data:
year fcode employ sales avgsal
1987 410032 100 4.70E+07 35000
1988 410032 131 4.30E+07 37000
1989 410032 123 4.90E+07 39000
1987 410440 12 1560000 10500
1988 410440 13 1970000 11000
1989 410440 14 2350000 11500
1987 410495 20 750000 17680
1988 410495 25 110000 18720
1989 410495 24 950000 19760
1987 410500 200 2.37E+07 13729
1988 410500 155 1.97E+07 14287
1989 410500 80 2.60E+07 15758
1987 410501 . 6000000 .
1988 410501 . 8000000 .
1989 410501 . 1.00E+07 .
etc
Seppo Pynnonen Econometrics II
Introduction
The linear regression model1 Introduction
Econometrics
Types of Economic Data
Cross-sectional
Time Series Data
Pooled Cross-sections
Panel Data
The linear regression model
Regression statistics
Inference
Nonlinear hypotheses
Maximum Likelihood
Maximum Likelihood Estimation
Properties of Maximum Likelihood Estimators
Likelihood Ratio, Wald, and Lagrange Multiplier tests
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
The linear regression model is the single most useful tool ineconometrics.
Assumption: each observation i , i = 1, . . . , n is generated by theunderlying process described by
yi = β0 + β1xi1 + · · ·+ βkxik + ui , (1)
where yi is the dependent or explained variable and xi1, xi2, . . . , xikare independent or explanatory variables, u is the error term, andβ0, β1, . . . , βk are regression coefficients (slope coefficients) (β0 iscalled the intercept term or constant term).
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
A notational convenience:
yi = x′iβ + ui , (2)
where xi = (1, x1i , . . . , xik)′ andβ = (β0, β1, . . . , βk)′ are k + 1 column vectors.
Stacking the x-observation vectors to an n × (k + 1) matrix
X =
x′1x′2...x′i...x′n
=
1 x11 x12 . . . x1k1 x21 x22 . . . x2k...
......
...1 xi1 xi2 . . . xik...
......
...1 xn1 xn2 . . . xnk
(3)
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
we can writey = Xβ + u, (4)
where y = (y1, . . . , yn)′, and u = (u1, . . . , un)′.
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
Example 2
In Example 1 the interest is whether grant for employee educationdecreases product failures. The estimated model is assumed to be
log(scrap) = β0 + β1grant + β2grant−1 + u, (5)
where scrap is scarp rate (per 100 items), grant = 1 if firm received
grant in year t, grant = 0 otherwise, and grant−1 = 1 if firm received
grant in the previous year, grant−1 = 0 otherwise.
The above model does not take into account that the data consist of
three consecutive year measurements from the same firms (i.e., panel
data).
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
Ordinaty Least Squares (OLS) Estimation yields (Stata):
regress lscrap grant grant_1
Source | SS df MS Number of obs = 162
----------+------------------------------ F( 2, 159) = 0.30
Model | 1.34805124 2 .67402562 Prob > F = 0.7395
Residual | 354.397022 159 2.22891209 R-squared = 0.0038
----------+------------------------------ Adj R-squared = -0.0087
Total | 355.745073 161 2.20959673 Root MSE = 1.493
---------------------------------------------------
lscrap | Coef. Std. Err. t P>|t|
--+------------------------------------------------
grant | .0543534 .310501 0.18 0.861
grant_1 | -.2652102 .36995 -0.72 0.474
_cons | .4150563 .139828 2.97 0.003
---------------------------------------------------
Neither of the coefficients are statistically significant and grant has evenpositive sign, although close to zero.
Dealing later with the panel estimation we will see that the situation can
be improved.
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
The problem with the above estimation is that the OLSassumptions are not met.
The OLS assumptions are:
(i) E[ui |X] = 0 for all i
(ii) var[ui |X] = σ2u for all i
(iii) cov[ui , uj |X] = 0 for all i 6= j ,
(iv) X is a n × (k + 1) matrix with rank k + 1
Remark 1.1: Assumption (1) implies
cov[ui ,X] = 0, (6)
which is crucial in OLS-estimation.
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
In panel data typically the homoscedasticity assumption (ii) isviolated.
In order to capture the implied heteroscedasticity the error term ismodeled generally as
uit = αi + δt + vit (7)
in which vit satisfied the assumption (ii), αi captures the(unobserved time invariant) individual effects and δt captures the(unobserved individual invariant) time effect.
We return to these matters more closely later.
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
Under assumptions (i)–(iv) the OLS estimator
β = (X′X)−1X′y (8)
is the Best Linear Unbiased Estimator (BLUE) of the regressioncoefficients β of the linear model in equation (4).
This is known as the Gauss-Markov theorem.
Seppo Pynnonen Econometrics II
Introduction
The linear regression model1 Introduction
Econometrics
Types of Economic Data
Cross-sectional
Time Series Data
Pooled Cross-sections
Panel Data
The linear regression model
Regression statistics
Inference
Nonlinear hypotheses
Maximum Likelihood
Maximum Likelihood Estimation
Properties of Maximum Likelihood Estimators
Likelihood Ratio, Wald, and Lagrange Multiplier tests
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
Sum of Squares (SS) identity:
SST = SSR + SSE, (9)
where
Total: SST =n∑
i=1
(yi − y)2 (10)
Model: SSR =n∑
i=1
(yi − y)2, (11)
Residual: SSE =n∑
i=1
(yi − yi )2 (12)
with yi = x′i β, and y = 1n
∑ni=1 yi , the sample mean.
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
Goodness of fit: R-square, R2
R2 =SSR
SST= 1− SSE
SST, (13)
Adjusted R-square (Adj R-square), R2
R2 = 1− SSE/(n − k − 1)
SST/(n − 1)= 1− s2u
s2y, (14)
where
s2u =1
n − k − 1
n∑i=1
u2i =SSE
n − k − 1(15)
is an estimator of the variance σ2u = var[ui ] of the error term(su =
√s2u , ”Root MSE” in the Stata output), and
s2y =1
n − 1
n∑i=1
(yi − y)2 (16)
is the sample variance of y .Seppo Pynnonen Econometrics II
Introduction
The linear regression model1 Introduction
Econometrics
Types of Economic Data
Cross-sectional
Time Series Data
Pooled Cross-sections
Panel Data
The linear regression model
Regression statistics
Inference
Nonlinear hypotheses
Maximum Likelihood
Maximum Likelihood Estimation
Properties of Maximum Likelihood Estimators
Likelihood Ratio, Wald, and Lagrange Multiplier tests
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
Assumption
(v) u ∼ N(0, σ2uI),
where I is an n × n identity matrix.
Individual coefficient restrictions:
Hypotheses are of the form
H0 : βj = β∗j , (17)
where β∗j is a given constant.
t-statistics:
t =βj − β∗js.e(βj)
, (18)
where
s.e(βj) = su
√(X′X)jj , (19)
and (X′X)jj is the jth diagonal element of (X′X)−1.Seppo Pynnonen Econometrics II
Introduction
The linear regression model
F -test:
The overall hypothesis that none of the explanatory variablesinfluence the y -variable, i.e.,
H0 : β1 = β2 = · · · = βk = 0 (20)
is tested by F -test
F =SSR/k
SST/(n − k − 1), (21)
which is F -distributed with degrees of freedom f1 = k andf2 = n − k − 1 if the null hypothesis is true.
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
General (linear) restrictions:
H0 : Rβ = q, (22)
where R is a fixed m × (k + 1) matrix and q is a fixed m-vector.
m indicates the number of independent linear restrictions imposedto the coefficients.
The alternative hypothesis is
H1 : Rβ 6= q. (23)
The null hypothesis in (22) can be tested with an F -statistic thatunder the null hypothesis has the F -distribution with degrees offreedom f1 = m and f2 = n − k − 1.
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
Example 3
Consider model
y = β0 + β1x1 + β2x2 + β3x3 + β4x4 + u. (24)
In terms of the general linear hypothesis (22) testing for singlecoefficients, e.g.,
H0 : β1 = 0 (25)
is obtained by selecting
R = (0 1 0 0 0) and q = 0. (26)
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
The null hypothesis in (20), i.e.,
H0 : β1 = β2 = β3 = β4 = 0 (27)
is obtained by selecting
R =
0 1 0 0 00 0 1 0 00 0 0 1 00 0 0 0 1
and q =
0000
(28)
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
A null hypothesis of the form
H0 : β1 + β2 = 1, β3 = β4 (29)
corresponds to
R =
(0 1 1 0 00 0 0 1 −1
)and q =
(10
). (30)
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
Example 4
Consider the following consumption function
Ct = β0 + β1Yt + β2Ct−1 + ut . (31)
Then β1 is called the short-run MPC (marginal propensity to consume).
The long-run MPC is
βlrmpc =β1
1− β2. (32)
Test the hypothesis whether the long run MPC = 1, i.e.,
H0 :β1
1− β2= 1. (33)
This is equivalent to β1 + β2 = 1.
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
Thus, the non-linear hypothesis (33) reduces in this case to the linearhypothesis
H0 : β1 + β2 = 1, (34)
and we can use the general linear hypothesis of the form (22) with
R = (0 1 1) and q = 1. (35)
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
Remark 1.2: Hypotheses of the form (34) can be easily tested with thestandard t-test by re-parameterizing the model.
Defining Zt = Ct−1 − Yt , equation (31) is (statistically) equivalent to
Ct = β0 + γYt + β2Zt + ut , (36)
where γ = β1 + β2.
Thus, in terms of (36) testing hypothesis (33) reduces to testing
H0 : γ = 1, (37)
which can be worked out with the usual t-statistic.
t =γ − 1
s.e(γ). (38)
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
Example 5
Generalized Gobb-Douglas production function in transportationindustrya Yi = value added (output), L = labor, K = capital, and N =the number of establishments in the transportation industry.
y = β0 + β1k + β2l + u, (39)
where y = log(Y /N), k = log(K/N), and l = log(L/N) (log is thenatural logarithm).
lm(formula = y ~ k + l, data = dfr)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.29326 0.10718 21.396 3.24e-16 ***
k 0.27898 0.08069 3.458 0.00224 **
l 0.92731 0.09832 9.431 3.46e-09 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 0.1885 on 22 degrees of freedom
Multiple R-squared: 0.9597,Adjusted R-squared: 0.9561
F-statistic: 262.2 on 2 and 22 DF, p-value: 4.501e-16
aZellner, A and N. Revankar (1970). Review of Economic Studies 37, 241–250. Data:
http://pages.stern.nyu.edu/∼wgreene/Text/Edition7/tablelist8new.htm Table F7.2
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
According to the results the capital elasticity is 0.279 and the laborelasticity is 0.927, thus labor intensive.
Let us test for the constant return to scale, i.e.,
H0 : β1 + β2 = 1. (40)
Using R car package linearHypothesis(), the general restrictedhypothesis method (22) yields
Res.Df RSS Df Sum of Sq F Pr(>F)
1 23 1.3079
2 22 0.7814 1 0.52645 14.822 0.000869 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
which rejects the null hypothesis.
In order to demonstrate the re-parametrization approach, defineregression model
log(Y /N) = β0 + γ log(K/N) + β2 log(L/K) + u. (41)
Estimation of this specification yieldsSeppo Pynnonen Econometrics II
Introduction
The linear regression model
lm(formula = y ~ k + lpk, data = dfr)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.29326 0.10718 21.396 3.24e-16 ***
k 1.20629 0.05358 22.512 < 2e-16 ***
lpk 0.92731 0.09832 9.431 3.46e-09 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 0.1885 on 22 degrees of freedom
Multiple R-squared: 0.9597,Adjusted R-squared: 0.9561
F-statistic: 262.2 on 2 and 22 DF, p-value: 4.501e-16
All the goodness-of-fit of these models are exactly the same, indicating
the equivalence equivalence of the models in statistical sense.
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
The null hypothesis of the constant returns to scale in terms of thismodel is
H0 : γ = 1. (42)
The t-value is
t =γ − 1
s.e(γ)=
1.206294− 1
0.053584≈ 3.84 (43)
with p-value = 0.0009, exactly the same as above, again rejecting the
null hypothesis.
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
Remark 1.3: Estimation of the regression parameters under the
restrictions of the form Rβ = q are obtained by using restricted Least
Squares, provided by modern statistical packages.
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
Confidence intervals:
A 100(1− α)% confidence interval for a single parameter is of theform
βj ± tα/2s.e(βj), (44)
where tα/2 is the 1− α/2 percentile of the t-distribution withn − k − 1 degrees of freedom.
Seppo Pynnonen Econometrics II
Introduction
The linear regression model1 Introduction
Econometrics
Types of Economic Data
Cross-sectional
Time Series Data
Pooled Cross-sections
Panel Data
The linear regression model
Regression statistics
Inference
Nonlinear hypotheses
Maximum Likelihood
Maximum Likelihood Estimation
Properties of Maximum Likelihood Estimators
Likelihood Ratio, Wald, and Lagrange Multiplier tests
Seppo Pynnonen Econometrics II
Introduction
The linear regression model
Economic theory implies sometimes nonlinear hypotheses.
In fact, the long-run MPC example is an example of non-linearhypothesis, which we could transform to a linear hypothesis.
This is not always possible.
For example a hypothesis of the form
H0 : β1β2 = 1 (45)
is nonlinear.
Non-linear hypotheses can be tested using Wald-test, Lagrangemultiplier test, or Likelihood Ratio, LR-test.
Each of these has an asymptotic χ2-distribution with degrees offreedom equal to the number of imposed restrictions on theparameters.
These tests will be considered more closely later.Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood1 Introduction
Econometrics
Types of Economic Data
Cross-sectional
Time Series Data
Pooled Cross-sections
Panel Data
The linear regression model
Regression statistics
Inference
Nonlinear hypotheses
Maximum Likelihood
Maximum Likelihood Estimation
Properties of Maximum Likelihood Estimators
Likelihood Ratio, Wald, and Lagrange Multiplier tests
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
In addition to OLS the Maximum Likelihood (ML) is one of themost popular estimation method in econometrics.
This method can be utilized if the joint distribution of the randomvariables is known.
Likelihood function
Generally, suppose that the probability distribution of a randomvariable, Y , depends on a set of parameters, θ, then theprobability density for the random variable is denoted as fY (y ;θ).
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
The joint density of random variables Y1, . . . ,Yn isfY(y1, . . . , yn;θ), where Y = (Y1, . . . ,Yn)′ is the (column) vectorof the observations (prime denotes transposition).
In particular, if the random variables are independently distributedthen
fY(y1, . . . , yn;θ) =n∏
i=1
fYi(yi ;θ), (46)
where
n∏i=1
fYi(yi ;θ) = fY1(y1;θ)fY2(y2;θ) · · · fYn(yn;θ) (47)
is the product of the marginal densities fYi(yi ;θ).
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
Moreover, if the random variables Yi are independently andidentically distributed (i.i.d.) then the distributions fYi
(·;θ) are thesame for all i = 1, . . . , n.
For simplicity, we denote then f (yi ;θ) ≡ fYi(yi ;θ).
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
In statistical analysis we can consider a sample of observationsy1, . . . , yn as a realization (observed values) of independentrandom variables Y1, . . . ,Yn.
Furthermore, assuming i.i.d., with observed values yi s, the functionin equation (46) becomes a function of θ and we can write
L(θ) ≡ L(θ; y1, . . . , yn) =n∏
i=1
f (yi ;θ), (48)
which is called the likelihood function.
Taking (natural) logarithms on both sides, we get the loglikelihood function
`(θ) ≡ log L(θ) =n∑
i=1
log f (yi ;θ). (49)
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
Denoting the log-likelihoods of individual observations as`i (θ) = log f (yi ;θ), we can write (49) as
`(θ) =n∑
i=1
`i (θ). (50)
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
Example 6
Under the normality assumption of the error term, ui , of the regression
yi = x′iβ + ui (51)
ui ∼ N(0, σ2u). (52)
It follows that given xi
yi |xi ∼ N(x′iβ, σ2u). (53)
Thus, with θ = (β′, σ2)′, the (conditional) density function is
f (yi |xi ;θ) =1√
2πσ2u
e− (yi−x′i β)2
2σ2u , (54)
`i (θ) = −1
2log(2π)− 1
2log σ2
u −1
2
(yi − x′iβ)2
σ2u
, (55)
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
and
`(θ) = −n
2log(2π)− n
2log σ2
u −1
2
n∑i=1
(yi − x′iβ)2
σ2u
. (56)
In matrix form (56) becomes
`(θ) = −n
2log(2π)− n
2log σ2
u −1
2σ2u
(y − Xβ)′(y − Xβ). (57)
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood1 Introduction
Econometrics
Types of Economic Data
Cross-sectional
Time Series Data
Pooled Cross-sections
Panel Data
The linear regression model
Regression statistics
Inference
Nonlinear hypotheses
Maximum Likelihood
Maximum Likelihood Estimation
Properties of Maximum Likelihood Estimators
Likelihood Ratio, Wald, and Lagrange Multiplier tests
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
We say that the parameter vector θ is identified or estimable if forany other parameter vector θ∗ 6= θ, for some data data y,L(θ∗; y) 6= L(θ; y).
Given data y the maximum likelihood estimate (MLE) of θ is thevalue θ of the parameter for which
L(θ) = maxθ
L(θ), (58)
i.e., the parameter value that maximizes the likelihood function.
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
In practice it is usually more convenient to maximize thelog-likelihood, such that the MLE of θ is the value θ which satisfies
l(θ) = maxθ`(θ). (59)
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
Example 7
Consider the simple regression model
yi = β0 + β1xi + ui , (60)
with ui ∼ N(0, σ2u).
Given a sample of observations (y1, x1), (y2, x2), . . . , (yn, xn),
`(θ) = −1
2
(log(2π) + log σ2 +
n∑i=1
(yi − β0 − β2xi )2/σ2u
), (61)
where θ = (β0, β1, σ2u).
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
The maximum of (61) can be found by setting the partial derivatives tozero.
I.e.,∂`∂β0
=∑n
i=1(yi − β0 − β1xi )/σ2u = 0
∂`∂β1
=∑n
i=1 xi (yi − β0 − β1xi )/σ2u = 0
∂`∂σ2
u= −1/σ2
u − 1(σ2
u)2
∑ni=1(yi − β0 − β1xi )2 = 0
(62)
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
Solving these gives
β1 =
∑ni=1(xi − x)(yi − y)∑n
i=1(xi − x)2(63)
β0 = y − β1x (64)
and
σ2u =
1
n
n∑i=1
u2i , (65)
whereui = yi − β0 − β1xi (66)
is the regression residual and y and x are the sample means of yi and xi .
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
In this particular case the ML estimators of the regressionparameters, β0 and β1 coincide the OLS estimators.
In OLS the error variance σ2u estimator is
s2 =1
n − 2
n∑i=1
u2i =n
n − 2σ2u. (67)
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood1 Introduction
Econometrics
Types of Economic Data
Cross-sectional
Time Series Data
Pooled Cross-sections
Panel Data
The linear regression model
Regression statistics
Inference
Nonlinear hypotheses
Maximum Likelihood
Maximum Likelihood Estimation
Properties of Maximum Likelihood Estimators
Likelihood Ratio, Wald, and Lagrange Multiplier tests
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
Let θ0 be the population value of the parameter (vector) θ and letθ be the MLE of θ0.
Then
(a) Consistency: plim θ = θ0, i.e., θ is a consistent estimator of θ0
(b) Asymptotic normality: θ ∼ N(θ0, I(θ0)−1
)asymptotically,
where
I(θ0) = −E[∂2`(θ)
∂θ∂θ′
]θ=θ0
. (68)
That is, θ is asymptotically normally distributed.
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
(c) Asymptotic efficiency: θ is asymptotically efficient. That is, inthe limit as the sample size grows, MLE is unbiased and its(limiting) variance is smallest among estimators that areasymptotically unbiased.
(d) Invariance: The MLE of γ0 = g(θ0) is g(θ), where g is a(continuously differentiable) function.
Example 8
In Example 7 the MLE of the error variance σ2u is given by σ2
u defined in
equation (65). Using property (d), the MLE of the standard deviation
σu =√σ2u is σu =
√σ2u.
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
Remark 1.4: The above so called large sample properties (a)–(c) of the
ML (and OLS) estimators stem from two important results in probability
theory: The Law of Large Numbers (LLN) and the Central Limit
Theorem (CLT)
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
The LLN:
Theorem 1
Suppose Y1, . . . ,Yn are independent random variables with E[Yi ] = µand var[Yi ] = σ2 <∞ then for any ε > 0
P(|Yn − µ| > ε
)→ 0 (69)
as n→∞, where
Yn =1
n
n∑i=1
Yi (70)
is the sample mean. We denote (69) in short
plim Yn = µ
as n→∞. Alternatively it is also denoted as YnP→ µ.
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
The CLT:
Theorem 2
Suppose Y1, . . . ,Yn are i.i.d random variables with E[Yi ] = µ andvar[Yi ] = σ2 then the distribution of
√n(Yn − µ)
σ(71)
approaches to the standard normal distribution, N(0, 1), as n→∞,where Yn is as defined in (70).
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
Remark 1.5: The I(θ0) defined in (68) is called the
Fisher information matrix and
H =∂2`(θ)
∂θ∂θ′(72)
is called the Hessian of the log-likelihood.
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood1 Introduction
Econometrics
Types of Economic Data
Cross-sectional
Time Series Data
Pooled Cross-sections
Panel Data
The linear regression model
Regression statistics
Inference
Nonlinear hypotheses
Maximum Likelihood
Maximum Likelihood Estimation
Properties of Maximum Likelihood Estimators
Likelihood Ratio, Wald, and Lagrange Multiplier tests
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
For testing general restrictions of the form
H0 : c(θ) = 0, (73)
where c(·) is some (vector valued) function, there are three generalpurpose test methods:
Remark 1.6: We could specify the above hypothesis alternatively
H0 : r(θ) = q, (74)
where r(·) is some function and q is some constant.
Defining c(θ) = r(θ)− q reduces back to hypothesis (73).
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
Likelihood ratio test (LR-test)
LR = −2 log
(LRLU
)= −2(`R − `U), (75)
whereLR = max
θ,c(θ)=0L(θ)
is the maximum of the likelihood under the restriction ofhypothesis (73),
LU = maxθ
L(θ)
is the unrestricted maximum of the likelihood function(`U = log LU and `R = log LR).
Remark 1.7: Use of the LR test requires computing both the restricted
MLE of θ (to compute `R) and the unrestricted MLE (to compute `U).
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
Wald test
W = c(θ)′V−1c(θ), (76)
where V is the asymptotic variance covariance matrix of c(θ).
Remark 1.8: Use of the Wald test requires only to find the unrestricted
MLE.
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
Lagrange multiplier test (LM)
LM =
(∂`(θR)
∂θ
)′ [I(θR)
]−1(∂`(θR)
∂θ
), (77)
where θR is the restricted MLE satisfying the restriction c(θR) = 0of the general hypothesis (73).
Remark 1.9: Use of the LM test requires only the restricted MLE.
Seppo Pynnonen Econometrics II
Introduction
Maximum Likelihood
Under the null hypothesis (73) each of these test statistics isasymptotically χ2-distributed with degrees of freedom equal to thenumber of restrictions.
Thus, they are asymptotically equivalent. In small samplesnumerical values may differ, however.
Seppo Pynnonen Econometrics II