Ps602 notes part1
Advanced Quantitative Methods - PS 602 Notes – Version as of 8/30/12
Robert D. Duval - WVU Dept of Political Science
Class: 116 Woodburn; Office: 301A Woodburn; Office hours: Th 9:00-11:30, M-F 12:00-1:00; Phone 304-293-9537 (office)
304-599-8913 (home)
email: [email protected] (Do not use Mix email address!)
April 10, 2023
Syllabus
Required texts
Additional readings
Computer exercises
Course requirements:
Midterm - in class, open book (30%)
Final - in class, open book (30%)
Research paper (30%)
Participation (10%)
http://www.polsci.wvu.edu/duval/ps602/602syl.htm
Slide 2
Prerequisites
A fundamental understanding of calculus
An informal but intuitive understanding of the mathematics of probability
A sense of humor
Slide 3
Statistics is an innate cognitive skill
We all possess the ability to do rudimentary statistical analysis - in our heads - intuitively.
The cognitive machinery for stats is built into us, just like it is for calculus.
This is part of how we process information about the world.
It is not simply mysterious arcane jargon; it is simply the mysterious arcane way you already think. Much of it formalizes simple intuition.
Why do we set alpha (α) to .05?
Slide 4
Introduction
This course is about regression analysis, the principal method in the social sciences.
Three basic parts to the course:
An introduction to the general model
The formal assumptions and what they mean
Selected special topics that relate to regression and linear models
Slide 5
Introduction: The General Linear Model
The General Linear Model (GLM) is a phrase used to indicate a class of statistical models which include simple linear regression analysis.
Regression is the predominant statistical tool used in the social sciences due to its simplicity and versatility.
Also called Linear Regression Analysis, Multiple Regression, or Ordinary Least Squares.
Slide 6
Simple Linear Regression: The Basic Mathematical Model
Regression is based on the concept of the simple proportional relationship - also known as the straight line.
We can express this idea mathematically!
Theoretical aside: All theoretical statements of relationship imply a mathematical theoretical structure.
Just because it isn't explicitly stated doesn't mean that the math isn't implicit in the language itself!
Slide 7
Math and language
When we speak about the world, we often have embedded implicit relationships that we are referring to.
For instance:
Increasing taxes will send us into recession.
Decreasing taxes will spur economic growth.
Slide 8
From language to models
The idea that reducing taxes on the wealthy will spur economic growth (or that increasing taxes will harm economic growth) suggests that there is a proportional relationship between tax rates and growth in domestic product.
So let's look!
Disclaimer! The "models" that follow are meant to be examples. They are not "good" models, only useful ones to talk about!
Slide 9
Sources: (1) US Bureau of Economic Analysis, http://www.bea.gov/iTable/iTable.cfm?ReqID=9&step=1 (2) US Internal Revenue Service, http://www.irs.gov/pub/irs-soi/09in05tr.xls
GDP and Average Tax Rates: 1986-2009
The Stats
Slide 11
regress gdp avetaxrate
      Source |       SS       df       MS              Number of obs =      24
-------------+------------------------------           F(  1,    22) =    5.11
       Model |    44236671     1    44236671           Prob > F      =  0.0340
    Residual |   190356376    22  8652562.57           R-squared     =  0.1886
-------------+------------------------------           Adj R-squared =  0.1517
       Total |   234593048    23  10199697.7           Root MSE      =  2941.5

------------------------------------------------------------------------------
         gdp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  avetaxrate |  -1341.942   593.4921    -2.26   0.034    -2572.769   -111.1148
       _cons |   26815.88   7910.084     3.39   0.003     10411.37    43220.39
------------------------------------------------------------------------------
But wait, there's more…
What is the model? There is a directly proportional relationship between tax rates and economic growth.
How about an equation? We will get back to this…
Can you critique the "model"?
Can we look at it differently?
Slide 13
Effective Tax Rate and Growth in GDP:1986-2009
Sources: (1) US Bureau of Economic Analysis, http://www.bea.gov/iTable/iTable.cfm?ReqID=9&step=1 (2) US Internal Revenue Service, http://www.irs.gov/pub/irs-soi/09in05tr.xls
. regress gdpchange avetaxrate
Source | SS df MS Number of obs = 23
-------------+------------------------------ F( 1, 21) = 6.15
Model | 22.4307279 1 22.4307279 Prob > F = 0.0217
Residual | 76.5815075 21 3.64673845 R-squared = 0.2265
-------------+------------------------------ Adj R-squared = 0.1897
Total | 99.0122354 22 4.50055615 Root MSE = 1.9096
------------------------------------------------------------------------------
gdpchange | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
avetaxrate | .9889805 .3987662 2.48 0.022 .1597008 1.81826
_cons | -7.977765 5.292757 -1.51 0.147 -18.98466 3.029126
------------------------------------------------------------------------------
Slide 15
Finally…sort of
Slide 16
. regress gdpchange taxratechange
      Source |       SS       df       MS              Number of obs =      23
-------------+------------------------------           F(  1,    21) =   11.00
       Model |   34.0305253     1   34.0305253         Prob > F      =  0.0033
    Residual |   64.9817101    21   3.09436715         R-squared     =  0.3437
-------------+------------------------------           Adj R-squared =  0.3124
       Total |   99.0122354    22   4.50055615         Root MSE      =  1.7591

-------------------------------------------------------------------------------
    gdpchange |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
taxratechange |   .2803104   .0845261     3.32   0.003     .1045288     .456092
        _cons |   5.414708   .3780098    14.32   0.000     4.628593    6.200822
-------------------------------------------------------------------------------
Alternate Mathematical Notation for the Line
Alternate mathematical notation for the straight line - don't ask why!
10th Grade Geometry: y = mx + b
Statistics Literature: Yi = a + bXi + ei
Econometrics Literature (like your textbook): Yi = B0 + B1Xi + ei
Slide 17
Alternate Mathematical Notation for the Line – cont.
These are all equivalent. We simply have to live with this inconsistency.
We won’t use the geometric tradition, and so you just need to remember that B0 and a are both the same thing.
Slide 18
Linear Regression: the Linguistic Interpretation
In general terms, the linear model states that the dependent variable is directly proportional to the value of the independent variable.
Thus if we state that some variable Y increases in direct proportion to some increase in X, we are stating a specific mathematical model of behavior - the linear model.
Hence, if we say that the crime rate goes up as unemployment goes up, we are stating a simple linear model.
Slide 19
Linear Regression:A Graphic Interpretation
Slide 20
[Figure: "The Straight Line" - a plot of Y (0 to 12) against X (1 to 10)]
The linear model is represented by a simple picture
Slide 21
[Figure: "Simple Linear Regression" - a scatterplot of Y (0 to 12) against X (1 to 10) with a fitted line]
The Mathematical Interpretation: The Meaning of the Regression Parameters
a = the intercept
the point where the line crosses the Y-axis
(the value of the dependent variable when all of the independent variables = 0)
b = the slope
the increase in the dependent variable per unit change in the independent variable (also known as the 'rise over the run')
Slide 22
The Error Term
Such models do not predict behavior perfectly.
So we must add a component to adjust or compensate for the errors in prediction.
Having fully described the linear model, the rest of the semester (as well as several more) will be spent on the error.
Slide 23
The Nature of Least Squares Estimation
There is 1 essential goal and there are 4 important concerns with any OLS Model
Slide 24
The 'Goal' of Ordinary Least Squares
Ordinary Least Squares (OLS) is a method of finding the linear model which minimizes the sum of the squared errors.
Such a model provides the best explanation/prediction of the data.
Slide 25
Why Least Squared Error?
Why not simply minimum error?
The errors about the line sum to 0.0!
Minimum absolute deviation (error) models now exist, but they are mathematically cumbersome.
Try algebra with |absolute value| signs!
Slide 26
Other models are possible...
Best parabola...? (i.e. nonlinear or curvilinear relationships)
Best maximum likelihood model ... ?
Best expert system...?
Complex systems…?
Chaos/non-linear systems models
Catastrophe models
Others
Slide 27
The Simple Linear Virtue
I think we overemphasize the linear model.
It does, however, embody the rather important notion that Y is proportional to X.
As noted, we can state such relationships in simple English:
As unemployment increases, so does the crime rate.
As domestic conflict increases, national leaders will seek to distract their populations by initiating foreign disputes.
Slide 28
The Notion of Linear Change
The linear aspect means that the same amount of increase in unemployment will have the same effect on crime at both low and high unemployment.
A nonlinear change would mean that as unemployment increases, its impact upon the crime rate might increase at higher unemployment levels.
Slide 29
Why squared error? Because:
(1) the sum of the errors expressed as deviations would be zero as it is with standard deviations, and
(2) some feel that big errors should be more influential than small errors.
Therefore, we wish to find the values of a and b that produce the smallest sum of squared errors.
Slide 30
The Parameter estimates
In order to do this, we must find parameter estimates which accomplish this minimization.
In calculus, if you wish to know when a function is at its minimum, you set the first derivative equal to zero.
In this case we must take partial derivatives since we have two parameters (a & b) to worry about.
We will look closer at this and it’s not a pretty sight!
Slide 31
Decomposition of the error in LS
Slide 32
Goodness of Fit
Since we are interested in how well the model performs at reducing error, we need to develop a means of assessing that error reduction.
Since the mean of the dependent variable represents a good benchmark for comparing predictions, we calculate the improvement in the prediction of Yi relative to the mean of Y (the best guess of Y with no other information).
Slide 33
Sum of Squares Terminology
In mathematical jargon we seek to minimize the Unexplained Sum of Squares (USS), where:
Slide 34
USS = Σ(Yi − Ŷi)² = Σei²
Sums of Squares
This gives us the following 'sum-of-squares' measures:
Total Variation = Explained Variation + Unexplained Variation
TSS = Total Sum of Squares = Σ(Yi − Ȳ)²
ESS = Explained Sum of Squares = Σ(Ŷi − Ȳ)²
USS = Unexplained Sum of Squares = Σ(Yi − Ŷi)²
Slide 35
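The decomposition above can be checked numerically. A small Python sketch with made-up data (the helper `ols_fit` just applies the standard bivariate OLS formulas):

```python
# Made-up illustrative data; any OLS fit with an intercept satisfies TSS = ESS + USS.
def ols_fit(x, y):
    """Bivariate OLS: returns intercept a and slope b."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
        sum((xi - xbar) ** 2 for xi in x)
    a = ybar - b * xbar
    return a, b

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]
a, b = ols_fit(x, y)
yhat = [a + b * xi for xi in x]
ybar = sum(y) / len(y)

TSS = sum((yi - ybar) ** 2 for yi in y)          # total variation
ESS = sum((yh - ybar) ** 2 for yh in yhat)       # explained variation
USS = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))  # unexplained variation
print(abs(TSS - (ESS + USS)) < 1e-9)  # the decomposition holds
```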
Sums of Squares Confusion
Note: Occasionally you will run across ESS and RSS, which generate confusion since they can be used interchangeably.
ESS can be the error sum-of-squares, or alternatively, the estimated or explained SSQ.
Likewise RSS can be the residual SSQ, or the regression SSQ.
Hence the use of USS for unexplained SSQ in this treatment.
Gujarati uses ESS (explained sum of squares) and RSS (residual sum of squares).
Slide 36
The Parameter Estimates
In order to find the 'best' parameters, we must find parameter estimates which accomplish this minimization - that is, the parameters that have the smallest sum of squared errors.
In calculus, if you wish to know when a function is at its minimum, you take the first derivative and set it equal to 0.0. (The second derivative must be positive as well, but we do not need to go there.)
In this case we must take partial derivatives since we have two parameters to worry about.
Slide 37
Deriving the Parameter Estimates
Since
USS = Σ(Yi − Ŷi)² = Σei² = Σ(Yi − a − bXi)²
we can take the partial derivative of USS with respect to both a and b.
Slide 38
∂USS/∂a = −2Σ(Yi − a − bXi)(1)
∂USS/∂b = −2Σ(Yi − a − bXi)(Xi)
Deriving the Parameter Estimates (cont.)
Which simplifies to
∂USS/∂a = −2Σ(Yi − a − bXi) = 0
∂USS/∂b = −2ΣXi(Yi − a − bXi) = 0
We also set these derivatives to 0 to indicate that we are at a minimum.
Slide 39
Deriving the Parameter Estimates (cont.)
We now add a 'hat' to the parameters to indicate that the results are estimators.
We also set these derivatives equal to zero:
∂USS/∂â = −2Σ(Yi − â − b̂Xi) = 0
∂USS/∂b̂ = −2ΣXi(Yi − â − b̂Xi) = 0
Slide 40
Deriving the Parameter Estimates (cont.)
Dividing through by −2 and rearranging terms, we get the normal equations:
ΣYi = an + b(ΣXi)
ΣXiYi = a(ΣXi) + b(ΣXi²)
Slide 41
Deriving the Parameter Estimates (cont.)
We can solve these equations simultaneously to get our estimators:
b = [nΣXiYi − (ΣXi)(ΣYi)] / [nΣXi² − (ΣXi)²]
  = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²
a = Ȳ − bX̄
Slide 42
Deriving the Parameter Estimates (cont.)
The estimator for a also shows that the regression line always goes through the point (X̄, Ȳ) - the intersection of the two means.
This formula is quite manageable for bivariate regression. If there are two or more independent variables, the formula for b2, etc. becomes unmanageable!
See matrix algebra in POLS603!
Slide 43
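The two forms of b are algebraically equivalent, and a = Ȳ − bX̄ forces the line through the point of means. A Python sketch with made-up numbers:

```python
# Made-up illustrative data.
x = [2.0, 4.0, 6.0, 8.0, 10.0]
y = [1.0, 2.0, 2.0, 4.0, 5.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

# "Computational" form: b = [n*Sum(XY) - Sum(X)Sum(Y)] / [n*Sum(X^2) - (Sum(X))^2]
b1 = (n * sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y)) / \
     (n * sum(xi ** 2 for xi in x) - sum(x) ** 2)

# "Deviation" form: b = Sum((X - Xbar)(Y - Ybar)) / Sum((X - Xbar)^2)
b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)

a = ybar - b2 * xbar
print(abs(b1 - b2) < 1e-9)                 # the two forms agree
print(abs((a + b2 * xbar) - ybar) < 1e-9)  # line passes through (Xbar, Ybar)
```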
Tests of Inference
t-tests for coefficients
F-test for entire model
Since we are interested in how well the model performs at reducing error, we need to develop a means of assessing that error reduction.
Since the mean of the dependent variable represents a good benchmark for comparing predictions, we calculate the improvement in the prediction of Yi relative to the mean of Y.
Remember that the mean of Y is your best guess of Y with no other information. (Well, often - assuming the data are normally distributed!)
Slide 44
t-Tests
Since we wish to make probability statements about our model, we must do tests of inference.
Fortunately,
t(n−2) = B̂ / seB
OK, so what is the seB?
Slide 45
Standard Errors of Estimates
These estimates have variation associated with them
Slide 46
This gives us the F test:
The F-test tells us whether the full model is significant.
Note that F statistics have 2 different degrees of freedom: k−1 and n−k, where k is the number of regressors in the model.
F(k−1, n−k) = [ESS/(k−1)] / [USS/(n−k)]
Slide 47
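To see the arithmetic, the F statistic in the gdpchange ~ avetaxrate output above can be recomputed from its Model (ESS) and Residual (USS) sums of squares; here k counts both estimated parameters (intercept and slope):

```python
# ESS = Model SS, USS = Residual SS from the gdpchange ~ avetaxrate output; n = 23.
ESS, USS = 22.4307279, 76.5815075
n, k = 23, 2
F = (ESS / (k - 1)) / (USS / (n - k))
print(round(F, 2))  # ≈ 6.15, matching the printed "F(1, 21)"
```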
More on the F-testIn addition, the F test for the entire model must
be adjusted to compensate for the changed degrees of freedom.
Note that F increases as n or R2 increases and decreases as k – the number of independent variables - increases.
Slide 48
F test for different Models
The F test can tell us whether two different models of the same dependent variable are significantly different - i.e., whether adding a new variable will significantly improve estimation.
F = [(R²new − R²old) / number of new regressors] / [(1 − R²new) / (n − number of parameters in the new model)]
Slide 49
The Correlation Coefficient
A measure of how close the residuals are to the regression line.
It ranges between −1.0 and +1.0.
It is closely related to the slope.
Slide 50
R² (r-square, r²)
The r² (or R-square) is also called the coefficient of determination.
It is the percent of the variation in Y explained by X.
It must range between 0.0 and 1.0.
An r² of .95 means that 95% of the variation about the mean of Y is "caused" (or at least explained) by the variation in X.
Slide 51
r² = ESS/TSS = 1 − USS/TSS
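The printed R-squared in the earlier gdpchange ~ avetaxrate output can be recovered from its sums of squares:

```python
# Model SS (ESS) and Residual SS (USS) from the gdpchange ~ avetaxrate output.
ESS, USS = 22.4307279, 76.5815075
TSS = ESS + USS          # 99.0122354, the printed Total SS
r2 = ESS / TSS           # equivalently: 1 - USS / TSS
print(round(r2, 4))      # ≈ 0.2265, matching the printed "R-squared"
```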
Observations on r2
R² always increases as you add independent variables. The r² will go up even if X2, or any new variable, is a completely random variable.
r2 is an important statistic, but it should never be seen as the focus of your analysis.
The coefficient values, their interpretation, and the tests of inference are really more important.
Beware of r2 = 1.0 !!!
Slide 52
The Adjusted R2
Since R2 always increases with the addition of a new variable, the adjusted R2 compensates for added explanatory variables.
Note that it may range < 0.0 and greater than 1.0!!! But these values indicate poorly formed models.
Adjusted R² = 1 − [USS/(n − k − 1)] / [TSS/(n − 1)]
where k is the number of regressors.
Slide 53
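Plugging the same Stata run's numbers into this formula (k = 1 regressor, n = 23):

```python
# Total SS and Residual SS (USS) from the gdpchange ~ avetaxrate output.
TSS, USS = 99.0122354, 76.5815075
n, k = 23, 1
adj_r2 = 1 - (USS / (n - k - 1)) / (TSS / (n - 1))
print(round(adj_r2, 4))  # ≈ 0.1897, matching the printed "Adj R-squared"
```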
Comments on the adjusted R-squared
R-squared will always go up with the addition of a new variable.
Adjusted R-squared will go down if the variable contributes nothing new to the explanation of the model.
As a rule of thumb, if the new variable has a t-value greater than 1.0, it increases the adjusted R-squared.
Slide 54
The assumptions of the model
We will spend the next 7 weeks on this!
Slide 55
The Multiple Regression Model: The Scalar Version
The basic multiple regression model is a simple extension of the bivariate equation.
By adding extra independent variables, we are creating a multiple-dimensioned space, where the model is fit as a surface in that space.
For instance, if there are two independent variables, we are fitting the points to a 'plane in space'.
Visualizing this in more dimensions is a good trick.
Slide 56
The Scalar Equation
The basic linear model:
Yi = a + b1X1i + b2X2i + ... + bkXki + ei
If bivariate regression can be described as a line on a plane, multiple regression represents a k-dimensional object in a (k+1)-dimensional space.
Slide 57
The Matrix Model
We can use a different type of mathematical structure to describe the regression model, frequently called matrix (or linear) algebra.
The multiple regression model may be easily represented in matrix terms:
Y = XB + e
where Y, X, B and e are all matrices of data, coefficients, or residuals.
Slide 58
The Matrix Model (cont.)
The matrices in Y = XB + e are represented by
Slide 59

Y = [Y1, Y2, ..., Yn]'        (n × 1 vector of observations on Y)

    [ X11  X12  ...  X1k ]
X = [ X21  X22  ...  X2k ]    (n × k matrix of observations on the X's)
    [ ...  ...  ...  ... ]
    [ Xn1  Xn2  ...  Xnk ]

B = [B1, B2, ..., Bk]'        (k × 1 vector of coefficients)
e = [e1, e2, ..., en]'        (n × 1 vector of residuals)

Note that we postmultiply X by B, since this order makes them conformable.
Also note that X1 - the first column of X - is a column of 1's, so that B1 serves as the intercept term.
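In the bivariate case the matrix estimator B = (X'X)⁻¹X'Y can be written out by hand, since X'X is only 2 × 2. A Python sketch with made-up data:

```python
# Made-up illustrative data; X = [1 | x], i.e. a column of 1's plus one regressor.
x = [2.0, 4.0, 6.0, 8.0, 10.0]
y = [1.0, 2.0, 2.0, 4.0, 5.0]
n = len(x)

# Entries of X'X and X'Y
Sx = sum(x)
Sxx = sum(xi * xi for xi in x)
Sy = sum(y)
Sxy = sum(xi * yi for xi, yi in zip(x, y))

det = n * Sxx - Sx * Sx            # |X'X|; zero would mean perfect collinearity
B0 = (Sxx * Sy - Sx * Sxy) / det   # intercept row of (X'X)^(-1) X'Y
B1 = (n * Sxy - Sx * Sy) / det     # slope row of (X'X)^(-1) X'Y
print(round(B0, 4), round(B1, 4))  # → -0.2 0.5
```

These match the scalar formulas for a and b exactly, which is the point: the matrix form is the same estimator written compactly for any number of regressors.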
Assumptions of the modelScalar Version
The OLS model has seven fundamental assumptions. (This count varies from author to author, based upon what each author thinks is a separate assumption!)
These assumptions form the foundation for all regression analysis.
Failure of a model to conform to these assumptions frequently presents problems for estimation and inference. The "problem" may range from minor to severe!
These problems, or violations of the assumptions, almost invariably arise out of substantive or theoretical problems!
Slide 60
The Assumptions of the Model
Scalar Version (cont.)
1. The ei's are normally distributed.
2. E(ei) = 0
3. E(ei²) = σ²
4. E(eiej) = 0 (i ≠ j)
5. The X's are nonstochastic, with values fixed in repeated samples, and Σ(Xi − X̄)²/n is a finite nonzero number.
6. The number of observations is greater than the number of coefficients estimated.
7. No exact linear relationship exists between any of the explanatory variables.
Slide 61
The Assumptions of the Model
The English Version
The errors have a normal distribution.
The errors are homoskedastic. (The variation in the errors doesn't change across values of the independent or dependent variables.)
There is no serial correlation in the errors. (The errors are unrelated to their neighbors.)
There is no multicollinearity. (No variable is a perfect function of another variable.)
The X's are fixed (non-stochastic) and have some variation, but no infinite values.
There are more data points than unknowns.
The model is linear in its parameters. (All modeled relationships are directly proportional.)
OK… so it's not really English….
Slide 62
The Assumptions of the Model:
The Matrix Version
These same assumptions expressed in matrix format are:
1. e ~ N(0, Σ)
2. Σ = E(ee') = σ²I
3. The elements of X are fixed in repeated samples, and (1/n)X'X is nonsingular and its elements are finite.
Slide 63
Properties of Estimators (θ̂)
Since we are concerned with error, we will be concerned with those properties of estimators which have to do with the errors produced by the estimates.
We use the symbol θ̂ to denote a general parameter estimate.
It could represent a regression slope (B), a sample mean (X̄), a standard deviation (s), or many other statistics or estimators based on some sample.
Slide 64
Types of estimator error
Estimators are seldom exactly correct due to any number of reasons, most notably sampling error and biased selection.
There are several important concepts that we need to understand in examining how well estimators do their job.
Slide 65
Sampling Error
Sampling error is simply the difference between the true value of a parameter and its estimate in any given sample.
This sampling error means that an estimator will vary from sample to sample, and therefore estimators have variance.
Slide 66
Sampling error = θ̂ − θ
Var(θ̂) = E[θ̂ − E(θ̂)]² = E(θ̂²) − [E(θ̂)]²
Bias
The bias of an estimate is the difference between its expected value and its true value.
If the estimator is always low (or high) then the estimator is biased.
An estimator is unbiased if
E(θ̂) = θ
and its bias is
Bias = E(θ̂) − θ
Slide 67
Mean Squared Error
The mean square error (MSE) is different from the estimator's variance in that the variance measures dispersion about the estimated parameter, while mean squared error measures the dispersion about the true parameter.
If the estimator is unbiased, then the variance and MSE are the same.
Slide 68
Mean square error = E(θ̂ − θ)²
Mean Squared Error (cont.)
The MSE is important for time series and forecasting since it allows for both bias and efficiency:
MSE = variance + (bias)²
These concepts lead us to look at the properties of estimators. Estimators may behave differently in large and small samples, so we look at both small-sample and large (asymptotic) sample properties.
Slide 69
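This identity can be checked by simulation. A Python sketch using the biased variance estimator S² = Σ(Xi − X̄)²/n (all sample values drawn at random, purely illustrative):

```python
# Simulate repeated samples of n = 10 from Normal(0, sd = 2), so true variance = 4.
import random

random.seed(1)
true_var = 4.0
estimates = []
for _ in range(20000):
    sample = [random.gauss(0, 2) for _ in range(10)]
    m = sum(sample) / len(sample)
    # S^2 with divisor n (not n-1): a biased estimator of the variance
    estimates.append(sum((v - m) ** 2 for v in sample) / len(sample))

mean_est = sum(estimates) / len(estimates)
bias = mean_est - true_var                                          # negative: S^2 underestimates
variance = sum((e - mean_est) ** 2 for e in estimates) / len(estimates)
mse = sum((e - true_var) ** 2 for e in estimates) / len(estimates)
print(abs(mse - (variance + bias ** 2)) < 1e-6)  # MSE = variance + bias^2
```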
Small Sample Properties
These are the ideal properties. We desire these to hold:
Unbiasedness
Efficiency
Best Linear Unbiased Estimator
If the small sample properties hold, then by extension, the large sample properties hold.
Slide 70
Bias
A parameter estimate is unbiased if
E(θ̂) = θ
In other words, the average value of the estimator in repeated sampling equals the true parameter.
Note that whether an estimator is biased or not implies nothing about its dispersion.
Slide 71
Bias
[Figure: two probability densities over the parameter space - an unbiased Normal (s = 1.0) centered on the true value, and a second, biased series centered away from it]
Slide 72
Efficiency
An estimator is efficient if it is unbiased and its variance is less than that of any other unbiased estimator of the parameter:
θ̂ is unbiased;
Var(θ̂) ≤ Var(θ̃), where θ̃ is any other unbiased estimator of θ.
There might be instances in which we might choose a biased estimator, if it has a smaller variance.
Slide 73
Efficiency
Slide 74
Series 1 (s=1.0) is more efficient than Series 2 (s=2.0)
BLUE (Best Linear Unbiased Estimate)
An estimator θ̂ is described as a BLUE estimator if it:
is a linear function
is unbiased
satisfies Var(θ̂) ≤ Var(θ̃), where θ̃ is any other linear unbiased estimator of θ
Slide 75
What is a linear estimator?
A linear estimator looks like the formula for a straight line:
θ̂ = a1x1 + a2x2 + a3x3 + ... + anxn
The linearity referred to is linearity in the parameters.
Note that the sample mean is an example of a linear estimator:
X̄ = (1/n)X1 + (1/n)X2 + (1/n)X3 + ... + (1/n)Xn
Slide 76
BLUE is Bueno
If your estimator (e.g. the regression B) is the BLUE estimator, then you have a very good estimator - relative to other regression-style estimators.
The problem is that if certain assumptions are violated, then OLS may no longer be the "best" estimator. There might be a better one!
You can still hope that the large sample properties hold, though!
Slide 77
Asymptotic (Large Sample) Properties
Asymptotically unbiased
Consistency
Asymptotic efficiency
Slide 78
Asymptotic Bias
An estimator is asymptotically unbiased if
lim(n→∞) E(θ̂n) = θ
As the sample size gets larger, the estimated parameter gets closer to the true value.
For instance, for S² = Σ(Xi − X̄)²/n:
E(S²) = σ²(1 − 1/n), which approaches σ² as n → ∞.
Slide 79
Consistency
The point at which a distribution collapses is called the probability limit (plim).
If the bias and variance both decrease as the sample size gets larger, the estimator is consistent.
Usually noted by
plim(n→∞) θ̂ = θ, i.e. lim(n→∞) P(|θ̂ − θ| < δ) = 1 for any δ > 0
Slide 80
Asymptotic Efficiency
An estimator is asymptotically efficient if:
it has an asymptotic distribution with finite mean and variance
it is consistent
no other estimator has smaller asymptotic variance
Slide 81
Rifle and Target Analogy
Small sample properties:
Bias: The shots cluster around some spot other than the bull's-eye.
Efficiency: One rifle's cluster is smaller than another's.
BLUE: Smallest scatter for rifles of a particular type of simple construction.
Slide 82
Rifle and Target Analogy (cont.)
Asymptotic properties: Think of increased sample size as getting closer to the target.
Asymptotic unbiasedness means that as the sample size gets larger, the center of the point cluster moves closer to the target center.
With consistency, the point cluster moves closer to the target center and the cluster shrinks in size.
If it is asymptotically efficient, then no other rifle has a smaller cluster that is closer to the true center.
When all of the assumptions of the OLS model hold, its estimators are:
unbiased
minimum variance, and
BLUE
Slide 83
Assumption Violations: How we will approach the question
Definition
Implications
Causes
Tests
Remedies
Slide 84
Non-zero Mean for the Residuals (Definition)
Definition: The residuals have a mean other than 0.0.
Note that this refers to the true residuals: the estimated residuals will have a mean of 0.0 even while the true residuals are non-zero.
Slide 85
Non-zero Mean for the Residuals (Implications)
The true regression line is
Yi = a + bXi + ei
Therefore the intercept is biased.
The slope, b, is unbiased. There is also no way of separating out a and the non-zero error mean.
Slide 86
Non-zero Mean for the Residuals (Causes, Tests, Remedies)
Causes: Non-zero means result from some form of specification error. Something has been omitted from the model which accounts for that mean in the estimation.
We will discuss tests and remedies when we look closely at specification errors.
Slide 87
Non-normally Distributed Errors: Definition
The residuals are not NID(0, σ²).
Slide 88
[Figure: histogram of the residuals of rate90, ranging from about −1000 to 2000]
Normality Tests Section (5% level):
Skewness 5.1766, p = 0.000000 - Rejected
Kurtosis 4.6390, p = 0.000004 - Rejected
Omnibus 48.3172, p = 0.000000 - Rejected
Non-normally Distributed Errors: Implications
The existence of residuals which are not normally distributed has several implications.
First, it implies that the model is to some degree misspecified.
A collection of truly stochastic disturbances should have a normal distribution: the central limit theorem states that as the number of random variables increases, the sum of their distributions tends to be a normal distribution.
(Distribution theory - beyond the scope of this course.)
Slide 89
Non-normally Distributed Errors: Implications (cont.)
If the residuals are not normally distributed, then the estimators of a and b are also not normally distributed.
Estimates are, however, still BLUE: unbiased, with minimum variance among linear unbiased estimators.
They are no longer efficient relative to all possible estimators, even though they are asymptotically unbiased and consistent.
It is only our hypothesis tests which are suspect.
Slide 90
Non-normally Distributed Errors: Causes
Generally caused by a misspecification error - usually an omitted variable.
Can also result from:
Outliers in the data
Wrong functional form
Slide 91
Non-normally Distributed Errors: Tests for Non-normality
Chi-square goodness of fit:
Since the cumulative normal frequency distribution has a chi-square distribution, we can test for the normality of the error terms using a standard chi-square statistic.
We take our residuals, group them, and count how many occur in each group, along with how many we would expect in each group.
Slide 92
Non-normally Distributed Errors: Tests for Non-normality (cont.)
We then calculate the simple χ² statistic:
χ² = Σ(i = 1 to k) (Oi − Ei)² / Ei
This statistic has (N − 1) degrees of freedom, where N is the number of classes.
Slide 93
Non-normally Distributed Errors: Tests for Non-normality (cont.)
Jarque-Bera test:
This test examines both the skewness and kurtosis of a distribution to test for normality:
JB = n[S²/6 + (K − 3)²/24]
where S is the skewness and K is the kurtosis of the residuals.
JB has a χ² distribution with 2 df.
Slide 94
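A Python sketch of the JB computation on a deliberately skewed set of "residuals" (made-up data, shifted exponential draws):

```python
# Skewed fake residuals: exponential draws recentered to mean ~0.
import random

random.seed(3)
resid = [random.expovariate(1.0) - 1.0 for _ in range(500)]

n = len(resid)
m = sum(resid) / n
m2 = sum((r - m) ** 2 for r in resid) / n   # 2nd central moment
m3 = sum((r - m) ** 3 for r in resid) / n   # 3rd central moment
m4 = sum((r - m) ** 4 for r in resid) / n   # 4th central moment
S = m3 / m2 ** 1.5                          # skewness (0 for a normal)
K = m4 / m2 ** 2                            # kurtosis (3 for a normal)
JB = n * (S ** 2 / 6 + (K - 3) ** 2 / 24)
print(JB > 5.99)  # exceeds the chi-square(2) 5% critical value: reject normality
```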
Non-normally Distributed Errors: Remedies
Try to modify your theory. Omitted variable? Outlier needing specification?
Modify your functional form by taking some variance-transforming step such as square root, exponentiation, logs, etc. Be mindful that you are changing the nature of the model.
Bootstrap it! (From the shameless commercial division!)
Slide 95
Multicollinearity: Definition
Multicollinearity is the condition where the independent variables are related to each other. Causation is not implied by multicollinearity.
As any two (or more) variables become more and more closely correlated, the condition worsens, and ‘approaches singularity’.
Since the X's are fixed (or they are supposed to be, anyway), this is a sample problem.
Since multicollinearity is almost always present, it is a problem of degree, not merely existence.
Slide 96
Multicollinearity: Implications
Consider the following cases
A. No multicollinearity:
The regression would appear to be identical to separate bivariate regressions.
This produces variances which are biased upward (too large) making t-tests too small.
The coefficients are unbiased.
For multiple regression this satisfies the assumption.
Slide 97
Multicollinearity: Implications (cont.)
B. Perfect Multicollinearity
Some variable Xi is a perfect linear combination of one or more other variables Xj, therefore X'X is singular, and |X'X| = 0.
This is matrix algebra notation. It means that one variable is a perfect linear function of another. (e.g. X2 = X1+3.2)
The effects of X1 and X2 cannot be separated.
The standard errors for the B’s are infinite.
A model cannot be estimated under such circumstances. The computer dies.
And takes your model down with it…
Slide 98
Multicollinearity: Implications (cont.)
C. A high degree of multicollinearity:
When the independent variables are highly correlated, the variances and covariances of the Bi's are inflated (t-ratios are lower) and R² tends to be high as well.
The B's are unbiased (but perhaps useless due to their imprecise measurement, a result of their variances being too large). In fact they are still BLUE.
OLS estimates tend to be sensitive to small changes in the data.
Relevant variables may be discarded.
Slide 99
Multicollinearity: Causes
Sampling mechanism: poorly constructed design and measurement scheme, or limited range. Too small a sample range.
Constrained theory: X1 does affect X2 (e.g. electricity consumption = wealth + house size).
Statistical model specification: adding polynomial terms or trend indicators.
Too many variables in the model: the model is over-determined.
Theoretical specification is wrong: inappropriate construction of theory or even measurement. If your dependent variable is constructed using an independent variable, the two are related by construction.
Slide 100
Multicollinearity: Tests/Indicators
|X'X| approaches 0.0.
If the variance-covariance matrix is singular, its determinant is 0.0.
Since the determinant is a function of variable scale, this measure doesn't help a whole lot. We could, however, use the determinant of the correlation matrix and thereby bound the range from 0.0 to 1.0.
Slide 101
Multicollinearity: Tests/Indicators (cont.)
Tolerance: Tolj = 1 − Rj² = 1/VIFj
If the tolerance equals 1, the variables are unrelated. If Tolj = 0, then they are perfectly correlated.
To calculate, regress each independent variable on all the other independent variables.
Variance Inflation Factors (VIFs): VIFk = 1 / (1 − Rk²)
Slide 102
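The regress-each-X-on-the-others recipe can be coded directly. A minimal sketch (the `vif` helper is hypothetical, assuming NumPy; X is the design matrix without a constant column):

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X.
    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    X_j on the remaining columns plus an intercept."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        Z = np.column_stack([np.ones(n), others])   # add intercept
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        tss = np.sum((y - y.mean())**2)
        r2 = 1.0 - (resid @ resid) / tss
        out.append(1.0 / (1.0 - r2))
    return np.array(out)
```

Orthogonal columns give VIFs of exactly 1.0; nearly collinear columns push the VIFs toward infinity.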
Interpreting VIFs
No multicollinearity produces VIFs = 1.0.
If the VIF is greater than 10.0, then multicollinearity is probably severe: 90% of the variance of Xj is explained by the other X's.
In small samples, a VIF of about 5.0 may indicate problems.
Slide 103
Multicollinearity: Tests/Indicators (cont.)
R2 deletes: try all possible models of the X's, including/excluding based on small changes in R2 with the inclusion/omission of the variables (taken 1 at a time).
F is significant, but no t value is.
Adjusted R2 declines with a new variable.
Multicollinearity is of concern when either
rX1X2 > rX1Y  or  rX1X2 > rX2Y
Slide 104
Multicollinearity: Tests/Indicators (cont.)
I would avoid the rule of thumb rX1X2 > .6.
Beta's are > 1.0 or < -1.0.
Sign changes occur with the introduction of a new variable.
The R2 is high, but few t-ratios are significant.
Eigenvalues and Condition Index: if this topic is beyond Gujarati, it's beyond me.
Slide 105
Multicollinearity: Remedies
Increase sample size (pooled cross-sectional time series).
Thereby introducing all sorts of new problems!
Omit variables.
Scale construction/transformation.
Factor analysis.
Constrain the estimation, such as the case where you can set the value of one coefficient relative to another.
Slide 106
Multicollinearity: Remedies (cont.)
Change design (LISREL maybe, or pooled cross-sectional time series).
Thereby introducing all sorts of new problems!
Ridge regression: this technique introduces a small amount of bias into the coefficients to reduce their variance.
Ignore it: report adjusted R2 and claim it warrants retention in the model.
Slide 107
Heteroskedasticity: Definition
Heteroskedasticity is a problem where the error terms do not have a constant variance:
E(ei²) = σi²
That is, they may have a larger variance when values of some Xi (or the Yi's themselves) are large (or small).
Slide 108
Heteroskedasticity: Definition
This often gives the plots of the residuals by the dependent variable (or appropriate independent variables) a characteristic fan or funnel shape.
[Figure: scatter plot of residuals fanning out as X increases]
Slide 109
Heteroskedasticity: Implications
The regression B's are unbiased.
But they are no longer the best estimator. They are not BLUE (not minimum variance - hence not efficient).
They are, however, consistent.
Slide 110
Heteroskedasticity: Implications (cont.)
The estimator variances are not asymptotically efficient, and they are biased, so confidence intervals are invalid.
What do we know about the direction of the bias of the variance?
If Yi is positively correlated with ei, the bias is negative (hence t values will be too large).
With positive bias, many t's will be too small.
Slide 111
Heteroskedasticity: Implications (cont.)
Types of heteroskedasticity
There are a number of types of heteroskedasticity:
Additive
Multiplicative
ARCH (autoregressive conditional heteroskedastic), a time series problem.
Slide 112
Heteroskedasticity: Causes
It may be caused by:
Model misspecification: omitted variable or improper functional form.
Learning behaviors across time.
Changes in data collection or definitions.
Outliers or breakdown in the model.
Frequently observed in cross-sectional data sets where demographics are involved (population, GNP, etc.).
Slide 113
Heteroskedasticity: Tests
Informal methods
Plot the data and look for patterns!
Plot the residuals by the predicted dependent variable (resids on the Y-axis).
Plotting the squared residuals actually makes more sense, since that is what the assumption refers to!
Homoskedasticity will appear as a random scatter horizontally across the plot.
Slide 114
Heteroskedasticity: Tests (cont.)
Park test
As an exploratory test, log the squared residuals and regress them on the logged values of the suspected independent variable:
ln ûi² = ln σ² + B ln Xi + vi = a + B ln Xi + vi
If the B is significant, then heteroskedasticity may be a problem.
Slide 115
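The Park regression is just OLS on the logged squared residuals. A minimal sketch (the `park_test` helper and its return values are hypothetical, assuming NumPy; it reports the slope and its t-ratio):

```python
import numpy as np

def park_test(resid, x):
    """Park test sketch: regress ln(u^2) on ln(x).
    A significant slope suggests heteroskedasticity related to x."""
    lu2 = np.log(np.asarray(resid, dtype=float)**2)
    lx = np.log(np.asarray(x, dtype=float))
    n = lx.size
    Z = np.column_stack([np.ones(n), lx])
    beta, *_ = np.linalg.lstsq(Z, lu2, rcond=None)
    e = lu2 - Z @ beta
    s2 = (e @ e) / (n - 2)                       # error variance estimate
    se_b = np.sqrt(s2 / np.sum((lx - lx.mean())**2))
    return beta[1], beta[1] / se_b               # slope and its t-ratio
```

If the residual spread grows roughly proportionally with x, the estimated slope should come out near 2 (since u² then scales with x²).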
Heteroskedasticity: Tests (cont.)
Glejser Test
This test is quite similar to the Park test, except that it uses the absolute values of the residuals, and a variety of transformed X's:
|ûi| = B1 + B2 Xi + vi
|ûi| = B1 + B2 √Xi + vi
|ûi| = B1 + B2 (1/Xi) + vi
|ûi| = B1 + B2 (1/√Xi) + vi
A significant B2 indicates heteroskedasticity.
Easy test, but has problems.
Slide 116
Heteroskedasticity: Tests (cont.)
Goldfeld-Quandt test
Order the n cases by the X that you think is correlated with ei².
Drop a section of c cases out of the middle (one-fifth is a reasonable number).
Run separate regressions on both upper and lower samples.
Slide 117
Heteroskedasticity: Tests (cont.)
Goldfeld-Quandt test (cont.)
Do an F-test for the difference in error variances:
F[(n−c−2k)/2, (n−c−2k)/2] = s²e2 / s²e1
F has (n − c − 2k)/2 degrees of freedom for each sample.
Slide 118
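The sort, drop-the-middle, and compare-variances steps above can be sketched directly (the `goldfeld_quandt` helper is hypothetical, assuming NumPy and a simple one-regressor model, so k = 2 parameters per fit):

```python
import numpy as np

def goldfeld_quandt(y, x, drop_frac=0.2):
    """Goldfeld-Quandt sketch: sort by x, drop the middle fifth,
    fit OLS on each tail, and compare error variances with an F ratio."""
    order = np.argsort(x)
    y = np.asarray(y, dtype=float)[order]
    x = np.asarray(x, dtype=float)[order]
    n = len(x)
    c = int(n * drop_frac)                 # middle cases to drop
    half = (n - c) // 2
    lo, hi = slice(0, half), slice(n - half, n)

    def rss(ys, xs):
        Z = np.column_stack([np.ones(len(xs)), xs])
        b, *_ = np.linalg.lstsq(Z, ys, rcond=None)
        e = ys - Z @ b
        return e @ e

    k = 2                                  # intercept + slope
    df = (n - c - 2 * k) / 2
    F = (rss(y[hi], x[hi]) / df) / (rss(y[lo], x[lo]) / df)
    return F, df
```

A large F (upper-tail variance much bigger than lower-tail) points toward heteroskedasticity increasing in x.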
Heteroskedasticity: Tests (cont.)
Breusch-Pagan-Godfrey Test (Lagrangian multiplier test)
Estimate the model with OLS and obtain the residuals.
Obtain σ̃² = Σ ûi² / n
Construct the variables pi = ûi² / σ̃²
Slide 119
Heteroskedasticity: Tests (cont.)
Breusch-Pagan-Godfrey Test (cont.)
Regress pi on the X (and other?!) variables:
pi = α1 + α2 Z2i + α3 Z3i + … + αm Zmi + vi
Calculate Θ = ESS / 2
Note that Θ ~ χ²(m−1)
Slide 120
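The BPG recipe, scale the squared residuals, regress them on the Z's, and take half the explained sum of squares, fits in a few lines (the `breusch_pagan` helper is hypothetical, assuming NumPy; Z here holds the suspected variables without a constant column):

```python
import numpy as np

def breusch_pagan(resid, Z):
    """BPG sketch: p_i = u_i^2 / sigma~^2 regressed on the Z's;
    the statistic is Theta = ESS / 2."""
    u2 = np.asarray(resid, dtype=float)**2
    sigma2 = u2.mean()                     # sigma~^2 = sum(u_i^2) / n
    p = u2 / sigma2
    Zc = np.column_stack([np.ones(len(p)), np.asarray(Z, dtype=float)])
    b, *_ = np.linalg.lstsq(Zc, p, rcond=None)
    fitted = Zc @ b
    ess = np.sum((fitted - p.mean())**2)   # explained sum of squares
    return ess / 2.0
```

Theta is then compared against χ² with m − 1 degrees of freedom.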
Heteroskedasticity: Tests (cont.)
White's generalized heteroskedasticity test
Estimate the model with OLS and obtain the residuals.
Run the following auxiliary regression (shown here for two regressors):
ûi² = α1 + α2 X2i + α3 X3i + α4 X2i² + α5 X3i² + α6 X2i X3i + vi
Higher powers may also be used, along with more X's.
Slide 121
Heteroskedasticity: Tests (cont.)
White's generalized heteroskedasticity test (cont.)
Note that n·R² ~ χ²
The degrees of freedom is the number of coefficients estimated above.
Slide 122
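The auxiliary regression and the n·R² statistic can be sketched as follows (the `white_test` helper is hypothetical, assuming NumPy and a design matrix X without a constant column; df here counts the slope coefficients):

```python
import numpy as np

def white_test(resid, X):
    """White test sketch: regress u^2 on the X's, their squares, and
    cross products; n * R^2 from that auxiliary regression is
    chi-square with df = number of slope coefficients."""
    u2 = np.asarray(resid, dtype=float)**2
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    cols = [np.ones(n)]
    cols += [X[:, j] for j in range(k)]                     # levels
    cols += [X[:, j]**2 for j in range(k)]                  # squares
    cols += [X[:, i] * X[:, j]                              # cross products
             for i in range(k) for j in range(i + 1, k)]
    Z = np.column_stack(cols)
    b, *_ = np.linalg.lstsq(Z, u2, rcond=None)
    e = u2 - Z @ b
    r2 = 1.0 - (e @ e) / np.sum((u2 - u2.mean())**2)
    return n * r2, Z.shape[1] - 1
```

A statistic near n (R² near 1) indicates the squared residuals are almost fully explained by the X's, strong heteroskedasticity.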
Heteroskedasticity: Remedies
GLS: we will cover this after autocorrelation.
Weighted Least Squares: si² is a consistent estimator of σi².
Use the same (BLUE) formula to get a and ß.
Slide 123
Iteratively weighted least squares (IWLS)
1. Obtain estimates of ei² using OLS: Yi = a + bXi + ei
2. Use these to get "1st round" estimates of σi.
3. Using the formula above, replace wi with 1/si and obtain new estimates for a and ß.
4. Adjust the data: Yi* = Yi / si ,  Xi* = Xi / si
5. Use these to re-estimate Yi* = α* + ß*Xi* + ei*
6. Repeat steps 3-5 until a and ß converge.
Slide 124
White's corrected standard errors
For normal OLS:
Var(ß̂) = σ² / TSSx
We can restate this as
Var(ß̂) = Σ_{i=1}^{n} (xi − x̄)² σi² / TSSx²
Since TSSx = Σ_{i=1}^{n} (xi − x̄)², this is the same when σi² = σ².
Slide 125
White's corrected standard errors (cont.)
White's solution is to use the robust estimator
Vâr(ß̂j) = Σ_{i=1}^{n} r̂ij² ûi² / RSSj²
where the r̂ij are the residuals from regressing Xj on the other X's and RSSj is the residual sum of squares from that regression.
When you see robust standard errors, it usually refers to this estimator.
Slide 126
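For the one-regressor case the robust estimator reduces to Σ(xi − x̄)² ûi² / TSSx². A minimal sketch (the `white_se_slope` helper is hypothetical, assuming NumPy; no degrees-of-freedom correction is applied, i.e. this is the HC0 flavor):

```python
import numpy as np

def white_se_slope(y, x):
    """Robust (HC0) standard error for the slope in a simple regression:
    sqrt( sum((x_i - xbar)^2 * u_i^2) / TSS_x^2 )."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    Z = np.column_stack([np.ones(len(x)), x])
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    u = y - Z @ b                      # OLS residuals
    dx = x - x.mean()
    var_b = np.sum(dx**2 * u**2) / np.sum(dx**2)**2
    return np.sqrt(var_b)
```

Under exact homoskedasticity (all ûi² equal) this collapses back to the classical formula, without the n − k correction.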
Obtaining robust errors
In Stata, just add ", r" to the regress command:
regress approval unemrate
becomes
regress approval unemrate, r
Slide 127
Autocorrelation: Definition
Autocorrelation is simply the presence of correlation between adjacent (contemporaneous) residuals.
If a residual is negative (or positive), then its neighbors tend to also be negative (or positive).
Most often autocorrelation is between adjacent observations; however, lagged or seasonal patterns can also occur.
Autocorrelation is also usually a function of order by time, but it can occur for other orderings as well, such as firm or state size.
Slide 128
Autocorrelation: Definition (cont.)
The assumption violated is E(ei ej) = 0 for i ≠ j.
This means that the Pearson's r (correlation coefficient) between the residuals from OLS and the same residuals lagged one period (or more) is non-zero.
Slide 129
Autocorrelation: Definition (cont.)
Most autocorrelation is what we call 1st order autocorrelation, meaning that the residuals are related to their contiguous values.
Autocorrelation can be rather complex, producing counterintuitive patterns and correlations.
Slide 130
Autocorrelation: Definition (cont.)
Types of autocorrelation:
Autoregressive processes
Moving averages
Slide 131
Autocorrelation: Definition (cont.)
Autoregressive processes AR(p)
The residuals are related to their preceding values:
et = ρ et−1 + ut
This is classic 1st order autocorrelation.
Slide 132
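A tiny sketch of how AR(1) errors are built from white-noise innovations (the `ar1_errors` helper is hypothetical, assuming NumPy; the first error is simply set to the first innovation):

```python
import numpy as np

def ar1_errors(rho, u):
    """Generate e_t = rho * e_{t-1} + u_t from innovations u (e_0 = u_0)."""
    u = np.asarray(u, dtype=float)
    e = np.empty(len(u))
    e[0] = u[0]
    for t in range(1, len(u)):
        e[t] = rho * e[t - 1] + u[t]
    return e
```

A single unit shock decays geometrically at rate ρ, which is the "long memory" behavior described on the later ARMA slides.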
Autocorrelation: Definition (cont.)
Autoregressive processes (cont.)
In 2nd order autocorrelation the residuals are related to their t−2 values as well:
et = ρ1 et−1 + ρ2 et−2 + ut
Larger order processes may occur as well:
et = ρ1 et−1 + ρ2 et−2 + … + ρp et−p + ut
Slide 133
Autocorrelation: Definition (cont.)
Moving average processes MA(q)
The error term is a function of some random error plus a portion of the previous random error:
et = ut + θ ut−1
Slide 134
Autocorrelation: Definition (cont.)
Moving average processes (cont.)
Higher order processes for MA(q) also exist:
et = ut + θ1 ut−1 + θ2 ut−2 + … + θq ut−q
Slide 135
Autocorrelation: Definition (cont.)
Mixed processes ARMA(p,q)
The error term is a complex function of both autoregressive and moving average processes:
et = ρ1 et−1 + … + ρp et−p + ut + θ1 ut−1 + … + θq ut−q
Slide 136
Autocorrelation: Definition (cont.)
There are substantive interpretations that can be placed on these processes.
AR processes represent shocks to systems that have long-term memory.
MA processes are quick shocks to a system that can handle the process 'efficiently,' having only short-term memory.
Slide 137
Autocorrelation: Implications
Coefficient estimates are unbiased, but the estimates are not BLUE
The variances are often greatly underestimated (biased small)
Hence hypothesis tests are exceptionally suspect. In fact, strongly significant t-tests (P < .001) may
well be insignificant once the effects of autocorrelation are removed.
Slide 138
Autocorrelation: Causes
Specification error:
Omitted variable (e.g. inflation)
Wrong functional form
Lagged effects
Data transformations:
Interpolation of missing data
Differencing
Slide 139
Autocorrelation: Tests
Observation of residuals: graph/plot them!
Runs of signs (Geary test)
Slide 140
Autocorrelation: Tests (cont.)
Durbin-Watson d
d = Σ_{t=2}^{n} (ût − ût−1)² / Σ_{t=1}^{n} ût²
Criteria for the hypothesis of autocorrelation:
Reject if d < dL
Do not reject if d > dU
Test is inconclusive if dL ≤ d ≤ dU
Slide 141
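The d statistic is easy to compute directly from the residuals (the `durbin_watson` helper is hypothetical, assuming NumPy):

```python
import numpy as np

def durbin_watson(resid):
    """d = sum((u_t - u_{t-1})^2) / sum(u_t^2).
    Near 2: no autocorrelation; near 0: positive AC; near 4: negative AC."""
    u = np.asarray(resid, dtype=float)
    return np.sum(np.diff(u)**2) / np.sum(u**2)
```

Constant-sign runs drive d toward 0, while strict sign alternation drives it toward 4, matching the symmetry-about-2 point on the next slide.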
Autocorrelation: Tests (cont.)
Durbin-Watson d (cont.)
Note that d is symmetric about 2.0, so negative autocorrelation will be indicated by a d > 2.0.
Use the same distances above 2.0 as upper and lower bounds.
Slide 142
Analysis of Time Series. http://cnx.org/content/m34544/latest/
Autocorrelation: Tests (cont.)
Durbin's h
Cannot use the DW d if there is a lagged endogenous variable in the model.
h = (1 − d/2) √( T / (1 − T sc²) )
sc² is the estimated variance of the coefficient on the Yt−1 term.
h has a standard normal distribution.
Slide 143
Autocorrelation: Tests (cont.)
Tests for higher order autocorrelation
Ljung-Box Q (χ² statistic), also called the Portmanteau test:
Q' = T(T + 2) Σ_{j=1}^{L} rj² / (T − j)
Breusch-Godfrey
Slide 144
Autocorrelation: Remedies
Generalized least squares: later!
First difference method:
Take 1st differences of your X's and Y.
Regress ΔY on ΔX.
Assumes that ρ = 1!
This changes your model from one that explains rates to one that explains changes.
Generalized differences: requires that ρ be known.
Slide 145
Autocorrelation: Remedies
Cochrane-Orcutt method
(1) Estimate the model using OLS and obtain the residuals, ût.
(2) Using the residuals from the OLS, run the following regression:
ût = ρ̂ ût−1 + vt
Slide 146
Autocorrelation: Cochrane-Orcutt method (cont.)
(3) Using the ρ̂ obtained, perform the regression on the generalized differences:
Yt − ρ̂Yt−1 = B1(1 − ρ̂) + B2(Xt − ρ̂Xt−1) + (ut − ρ̂ut−1)
(4) Substitute the values of B1 and B2 into the original regression to obtain new estimates of the residuals.
(5) Return to step 2 and repeat until ρ̂ no longer changes.
"No longer changes" means (approximately) changes of less than 3 significant digits, or, for instance, at the 3rd decimal place.
Slide 147
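The iteration above can be sketched for the one-regressor case (the `cochrane_orcutt` helper, its convergence tolerance, and the iteration cap are illustrative choices, not the deck's code; assumes NumPy):

```python
import numpy as np

def cochrane_orcutt(y, x, tol=1e-3, max_iter=50):
    """Cochrane-Orcutt sketch: alternate between estimating rho from
    the residuals and re-fitting OLS on generalized differences."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)

    def ols(yy, xx):
        Z = np.column_stack([np.ones(len(xx)), xx])
        b, *_ = np.linalg.lstsq(Z, yy, rcond=None)
        return b

    b = ols(y, x)                 # step 1: OLS starting values
    rho = 0.0
    for _ in range(max_iter):
        u = y - (b[0] + b[1] * x)
        # step 2: regress u_t on u_{t-1} (no intercept) to get rho-hat
        new_rho = (u[1:] @ u[:-1]) / (u[:-1] @ u[:-1])
        # step 3: OLS on the generalized differences
        ys = y[1:] - new_rho * y[:-1]
        xs = x[1:] - new_rho * x[:-1]
        bs = ols(ys, xs)
        b = np.array([bs[0] / (1.0 - new_rho), bs[1]])  # undo B1(1 - rho)
        if abs(new_rho - rho) < tol:    # step 5: stop when rho settles
            rho = new_rho
            break
        rho = new_rho
    return b, rho
```

The transformed intercept is divided by (1 − ρ̂) to recover B1 on the original scale.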
Autocorrelation with lagged dependent variables
The presence of a lagged dependent variable causes special estimation problems.
Essentially you must purge the lagged error term of its autocorrelation by using a two-stage IV solution.
Be careful with lagged dependent variable models: the lagged dependent variable may simply scoop up all the variance to be explained.
A variety of models use lagged dependent variables:
Adaptive expectations
Partial adjustment
Rational expectations
Slide 148
Model Specification: Definition
The analyst should understand one fundamental "truth" about statistical models: they are all misspecified.
We exist in a world of incomplete information at best; hence model misspecification is an ever-present danger.
We do, however, need to come to terms with the problems associated with misspecification so we can develop a feeling for the quality of information, description, and prediction produced by our models.
Slide 149
Criteria for a "Good Model": Hendry & Richard criteria
Be data admissible: predictions must be logically possible.
Be consistent with theory.
Have weakly exogenous regressors (errors and X's uncorrelated).
Exhibit parameter stability: the relationship cannot vary over time, unless modeled in that way.
Exhibit data coherency: random residuals.
Be encompassing: contain or explain the results of other models.
Slide 150
Model Specification: Definition (cont.)
There are basically 4 types of misspecification we need to examine:functional form
inclusion of an irrelevant variable
exclusion of a relevant variable
measurement error and misspecified error term
Slide 151
Model Specification: Implications
If an omitted variable is correlated with the included variables, the estimates are biased as well as inconsistent.
In addition, the error variance is incorrect, and usually overestimated.
If the omitted variable is uncorrelated with the included variables, the error variance is still biased, even though the B's are not.
Slide 152
Model Specification: Implications
Incorrect functional form can result in autocorrelation or heteroskedasticity.
See the notes for these problems for the implications of each.
Slide 153
Model Specification: Causes
This one is easy: theoretical design.
Something is omitted, irrelevantly included, mismeasured, or non-linear.
This problem is explicitly theoretical.
Slide 154
Data Mining
There are techniques out there that look for variables to add.
These are often atheoretical, but can they work?
Note that data mining may alter the 'true' level of significance.
With c candidates for variables in the model, and k actually chosen with an α = .05, the true level of significance is:
α* = 1 − (1 − α)^(c/k)
Note the similarity to the Bonferroni correction.
Slide 155
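Plugging hypothetical numbers into the formula (20 candidate variables, 5 retained, nominal α = .05; these counts are illustrative, not from the deck) shows how far the true level drifts:

```python
# True significance level after searching c candidate regressors
# and retaining k, at a nominal alpha of .05
alpha = 0.05
c, k = 20, 5                                   # hypothetical counts
true_alpha = 1.0 - (1.0 - alpha) ** (c / k)    # 1 - 0.95**4 = 0.18549375
```

So a nominal 5% test behaves more like an 18.5% test once the search over candidates is accounted for.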
Model Specification: Tests
Actual Specification Tests
No test can reveal poor theoretical construction per se.
The best indicator that your model is misspecified is the discovery that the model has some undesirable statistical property; e.g. a misspecified functional form will often be indicated by a significant test for autocorrelation.
Sometimes time-series models will have negative autocorrelation as a result of poor design.
Slide 156
Ramsey RESET Test
The Ramsey RESET test is a "Regression Specification Error Test."
You add powers of the predicted values of Y (e.g. Ŷ², Ŷ³) to the regression model.
If they have a significant coefficient, then the errors are related to the predicted values, indicating that there is a specification error.
This is based on demonstrating that there is some non-random behavior left in the residuals.
Slide 157
Model Specification: Tests
Specification criteria for lagged designs
Most useful for comparing time series models with the same set of variables but differing numbers of parameters.
Slide 158
Model Specification: Tests (cont)
Schwarz Criterion
ln SC = (m/n) ln n + ln σ̃²
where σ̃² equals RSS/n, m is the number of lags (variables), and n is the number of observations.
Note that this is designed for time series.
Slide 159
Model Specification: Tests (cont)
AIC (Akaike Information Criterion)
ln AIC = (2k/n) + ln(RSS/n)
Both of these criteria (AIC and Schwarz) are to be minimized for improved model specification. Note that they both have a lower bound which is a function of sample size and number of parameters.
Slide 160
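Both criteria are simple functions of RSS, n, and the parameter count. Minimal sketches of the two formulas above (the `ln_aic` and `ln_sc` helper names are hypothetical, assuming NumPy):

```python
import numpy as np

def ln_aic(rss, n, k):
    """ln AIC = 2k/n + ln(RSS/n); smaller is better."""
    return 2.0 * k / n + np.log(rss / n)

def ln_sc(rss, n, m):
    """ln SC = (m/n) ln n + ln(RSS/n); smaller is better."""
    return (m / n) * np.log(n) + np.log(rss / n)
```

Adding a parameter raises the penalty term, so an extra variable must reduce RSS enough to pay for itself.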
Model Specification: Remedies
Model building
A. "Theory trimming" (Pedhazur: 616)
B. Hendry and the LSE school of "top-down" modeling.
C. Nested models
D. Stepwise regression
Stepwise regression is a process of including the variables in the model "one step at a time." This is a highly controversial technique.
Slide 161
Model Specification: Remedies (cont.)
Stepwise regression
Twelve things someone else says are wrong with stepwise:
Philosophical problems
1. Completely atheoretical
2. Subject to spurious correlation
3. Information tossed out: insignificant variables may be useful
4. Computer replacing the scientist
5. Utterly mechanistic
Slide 162
Model Specification: Remedies (cont.)
Stepwise regression
Statistical problems
6. Population model from sample data
7. Large N: statistical significance can be an artifact
8. Inflates the alpha level
9. The scientist becomes beholden to the significance tests
10. Overestimates the effect of the variables added early, and underestimates the variables added later
11. Prevents data exploration
12. Not even least squares for stagewise
Slide 163
Model Specification: Remedies (cont.)
Stepwise regression
Twelve responses:
1. Selection of the data for the procedure implies some minimal level of theorization.
2. All analysis is subject to spurious correlation. If you think it might be spurious, omit it.
3. True, but this can happen anytime.
4. All the better.
5. If it "works," is this bad? We use statistical decision rules in a mechanistic manner anyway.
Slide 164
Model Specification: Remedies (cont.)
Stepwise regression (responses, cont.)
6. This is true of regular regression as well.
7. This is true of regular regression as well.
8. No.
9. No more than OLS.
10. Not true.
11. Also not true: this is a data exploration technique.
12. Huh? An antiquated view of stepwise; probably not accurate in the last 20 years.
Slide 165
Measurement Error
Not much to say.
If the measurement error is random, estimates are unbiased, but results are weaker.
If measurement is biased, results are biased.
Occasionally non-random measurement error produces other statistical problems: e.g. heteroskedasticity.
Slide 166