Econ 488

Lecture 3
Cameron Kaplan

Announcements
Midterm Date Change: Now October 22
Syllabus will be updated soon.
Library Session: October 8

How Regression Works
Estimate the slope of the line that passes through the origin given the following data.
Try this one: Estimate the slope of the line that passes through the origin given the following data.

Answers
1. (3,1): Slope = 1/3
2. (4,2): Slope = 1/2
How did you get that? Slope = Y/X

How Regression Works
Now suppose we have 2 values: (3,1) and (4,2).
Estimate the slope of the line that passes through the origin given the following data.

Possible Estimators
1. Average of the two slopes.

Possible Estimators
2. Midpoint Estimator:
$\hat{\beta} = \bar{Y} / \bar{X}$ = (1.5)/(3.5) = 3/7
Or
$\hat{\beta} = (Y_1 + Y_2) / (X_1 + X_2)$ = (1+2)/(3+4) = 3/7

Possible Estimators
3. Ordinary Least Squares (OLS)
We want a line that is as close as possible to all of the points.

Ordinary Least Squares
[Figure: the data points and a line through the origin, with residuals e1 and e2 marked]
We want to find a line that makes these residuals, $e_1$ and $e_2$, as small as possible.

Ordinary Least Squares
Equation of the line:
$\hat{Y}_i = \hat{\beta} X_i$  (1)
(pronounced: "y i hat is equal to beta-hat x i")
The underlying data generating process is:
$Y_i = \beta X_i + \varepsilon_i$  (2)
(notice there are no hats)
Finally, the observed values of X and Y can be described by:
$Y_i = \hat{\beta} X_i + e_i$  (3)
$e_i$ is the residual, which is actually observed. $\varepsilon_i$ is the stochastic error term, which is never observed.

Ordinary Least Squares
By equations (1) and (3), we can see that:
$e_i = Y_i - \hat{Y}_i$
So, we want to choose a line so that $e_i$ is as small as possible.
But $e_i$ can be negative or positive, so we can't just minimize $e_i$.
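A small numerical check can make this point concrete. The sketch below uses the lecture's two points, (3,1) and (4,2); the candidate slopes and the helper name `residuals` are chosen here purely for illustration. It suggests that the raw sum of residuals is not a usable objective: positive and negative residuals cancel, and the sum just keeps falling as the slope grows.

```python
# Data from the lecture's two-point example: (X, Y) = (3, 1) and (4, 2).
X = [3, 4]
Y = [1, 2]

def residuals(beta_hat):
    """Residuals e_i = Y_i - beta_hat * X_i for a line through the origin."""
    return [y - beta_hat * x for x, y in zip(X, Y)]

# Candidate slopes picked only for illustration (not from the lecture).
for beta_hat in (0.3, 3 / 7, 0.6, 10.0):
    e = residuals(beta_hat)
    sum_e = sum(e)                       # signed residuals can cancel out
    sum_abs = sum(abs(v) for v in e)     # absolute deviations
    sum_sq = sum(v * v for v in e)       # squared residuals
    print(f"beta_hat={beta_hat:.4f}  sum e={sum_e:+.4f}  "
          f"sum |e|={sum_abs:.4f}  sum e^2={sum_sq:.4f}")
```

At a slope of 3/7 the two residuals are -2/7 and +2/7, so they sum to zero even though neither point lies on the line.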

Ordinary Least Squares
We could choose a $\hat{\beta}$ that minimizes the sum of the absolute values of the $e_i$. That is,
$\min_{\hat{\beta}} \sum_i |Y_i - \hat{\beta} X_i|$
This is what is called the Least Absolute Deviations method.
However, this is mathematically difficult, and there is another way that is better: minimize $e_i^2$!
$e_i^2$ is never negative, so we can minimize it.
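For the one-coefficient, through-the-origin case, the Least Absolute Deviations slope can still be found by brute force, which may help in comparing it with the squared-residual approach. The sketch below relies on one fact: the sum of absolute residuals is piecewise linear and convex in the slope, so a minimum occurs at one of the kinks, and the kinks are the slopes Y_i/X_i that pass exactly through a data point. The function name is hypothetical, and the sketch assumes integer data with nonzero X values.

```python
from fractions import Fraction

def lad_slope_through_origin(xs, ys):
    """Least Absolute Deviations slope for Y = beta * X (illustrative sketch).

    Only the slopes Y_i / X_i need checking, because the objective is
    piecewise linear and convex with kinks exactly at those values.
    """
    candidates = [Fraction(y, x) for x, y in zip(xs, ys) if x != 0]

    def sum_abs_residuals(beta_hat):
        return sum(abs(y - beta_hat * x) for x, y in zip(xs, ys))

    return min(candidates, key=sum_abs_residuals)

# The lecture's two points: (X, Y) = (3, 1) and (4, 2).
print(lad_slope_through_origin([3, 4], [1, 2]))
```

With more than one coefficient there is no comparably simple enumeration, which is one reason the squared-residual criterion is easier to work with.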

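Ordinary Least Squares
Choose a $\hat{\beta}$ that minimizes $\sum_i e_i^2$.
Remember, $e_i = Y_i - \hat{\beta} X_i$. We want to minimize the sum of its squares.
This is equivalent to:
$\min_{\hat{\beta}} \sum_i (Y_i - \hat{\beta} X_i)^2$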

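Ordinary Least Squares
Using calculus, the first order condition (FOC) for a minimum is that the first derivative is equal to zero.
Take the derivative with respect to $\hat{\beta}$:
$\sum_i -2 X_i (Y_i - \hat{\beta} X_i) = 0$
Solve for $\hat{\beta}$:
$\hat{\beta} = \sum_i X_i Y_i / \sum_i X_i^2$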

Our Example
What is the OLS slope estimate for our example?
$Y_1 = 1$, $X_1 = 3$
$Y_2 = 2$, $X_2 = 4$
So,
$\hat{\beta} = (3 \cdot 1 + 4 \cdot 2) / (3^2 + 4^2) = 11/25$
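The calculation for this example can be reproduced in a few lines. This is a minimal sketch, assuming integer data so that exact fractions can be used; the function names are illustrative. It applies the through-the-origin formula, sum of X_i Y_i divided by sum of X_i squared, and then checks that nudging the slope in either direction does not lower the sum of squared residuals.

```python
from fractions import Fraction

def ols_slope_through_origin(xs, ys):
    """beta_hat = sum(X_i * Y_i) / sum(X_i ** 2) for the model Y = beta * X."""
    numerator = sum(x * y for x, y in zip(xs, ys))
    denominator = sum(x * x for x in xs)
    return Fraction(numerator, denominator)  # assumes integer inputs

def sum_squared_residuals(beta_hat, xs, ys):
    """Sum of e_i ** 2 where e_i = Y_i - beta_hat * X_i."""
    return sum((y - beta_hat * x) ** 2 for x, y in zip(xs, ys))

xs, ys = [3, 4], [1, 2]
beta_hat = ols_slope_through_origin(xs, ys)
print(beta_hat, float(beta_hat))  # 11/25 0.44

# Sanity check: moving the slope slightly either way raises the objective.
best = sum_squared_residuals(beta_hat, xs, ys)
step = Fraction(1, 100)
assert best < sum_squared_residuals(beta_hat + step, xs, ys)
assert best < sum_squared_residuals(beta_hat - step, xs, ys)
```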

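Possible Estimators
Now we have 3 estimators:
1. Average of slopes to each point:
$\hat{\beta} = \frac{1}{2}(Y_1/X_1 + Y_2/X_2)$
OR
$\hat{\beta} = \frac{1}{n} \sum_i Y_i/X_i$
= 5/12 ≈ 0.4167

Possible Estimators
2. Midpoint:
$\hat{\beta} = \bar{Y}/\bar{X}$ = 3/7 ≈ 0.4286

3. Ordinary Least Squares:
$\hat{\beta} = \sum_i X_i Y_i / \sum_i X_i^2$ = 11/25 = 0.44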
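The three estimates can be computed side by side for the lecture's two points. The sketch below is illustrative only (the function names are not from the lecture) and also reports each estimator's sum of squared residuals (SSR). OLS has the smallest SSR by construction, so this comparison alone does not settle which estimator is "best" in a statistical sense.

```python
from fractions import Fraction

X = [3, 4]
Y = [1, 2]

def average_of_slopes(xs, ys):
    """Mean of the individual slopes Y_i / X_i."""
    return sum(Fraction(y, x) for x, y in zip(xs, ys)) / len(xs)

def midpoint(xs, ys):
    """Slope to the midpoint: mean(Y) / mean(X) = sum(Y) / sum(X)."""
    return Fraction(sum(ys), sum(xs))

def ols(xs, ys):
    """OLS slope through the origin: sum(X*Y) / sum(X**2)."""
    return Fraction(sum(x * y for x, y in zip(xs, ys)), sum(x * x for x in xs))

def ssr(beta_hat, xs, ys):
    """Sum of squared residuals for a line through the origin."""
    return sum((y - beta_hat * x) ** 2 for x, y in zip(xs, ys))

estimators = [("average of slopes", average_of_slopes),
              ("midpoint", midpoint),
              ("OLS", ols)]
for name, estimator in estimators:
    b = estimator(X, Y)
    print(f"{name:18s} beta_hat = {b} ~ {float(b):.4f}, "
          f"SSR = {float(ssr(b, X, Y)):.4f}")
```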

Possible Estimators
Which estimator is best?

Let's try an exercise.

OLS with an intercept term
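The slide's own derivation for the intercept case is not in this transcript. As a hedged sketch, the standard textbook formulas for a simple regression with an intercept are slope = sum of (X_i - X-bar)(Y_i - Y-bar) divided by sum of (X_i - X-bar) squared, and intercept = Y-bar minus slope times X-bar. The height and shoe-size numbers below are hypothetical, invented only to make the example runnable.

```python
def ols_with_intercept(xs, ys):
    """Return (intercept, slope) for Y = b0 + b1 * X using the
    standard closed-form simple-regression formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data (NOT from the lecture): heights in inches, shoe sizes.
heights_in = [64, 66, 68, 70, 72]
shoe_sizes = [7.0, 8.0, 9.5, 10.0, 11.5]

b0, b1 = ols_with_intercept(heights_in, shoe_sizes)
print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
```

With an intercept in the model, the residuals sum to zero, which is not guaranteed for the through-the-origin line used earlier.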

Example
Height and Shoe Size

Sum of Squares
How much of the variation in the dependent variable is explained by the estimated regression equation?
Total Sum of Squares (TSS): How spread out are the y values in the sample?
$TSS = \sum_i (Y_i - \bar{Y})^2$
Explained Sum of Squares (ESS): The sample variation in $\hat{Y}$
$ESS = \sum_i (\hat{Y}_i - \bar{Y})^2$

Sum of Squares
Residual Sum of Squares (RSS): The sample variation in $e_i$
$RSS = \sum_i e_i^2$
TSS = ESS + RSS
Some of the variation in y can be explained by the regression, and some cannot.
If the RSS is small relative to the TSS, the equation is a good fit.
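The decomposition TSS = ESS + RSS can be checked numerically. The sketch below fits a line with an intercept (the identity is stated here for OLS with an intercept term) and computes the three sums of squares. The data are hypothetical and the function names are illustrative, not taken from the lecture.

```python
def fit_line(xs, ys):
    """OLS with an intercept: returns (intercept, slope)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope

def sums_of_squares(xs, ys):
    """Return (TSS, ESS, RSS) for the fitted line."""
    b0, b1 = fit_line(xs, ys)
    y_bar = sum(ys) / len(ys)
    fitted = [b0 + b1 * x for x in xs]
    tss = sum((y - y_bar) ** 2 for y in ys)             # total variation in Y
    ess = sum((f - y_bar) ** 2 for f in fitted)         # variation in Y-hat
    rss = sum((y - f) ** 2 for y, f in zip(ys, fitted)) # variation in residuals
    return tss, ess, rss

# Hypothetical data (NOT from the lecture).
xs = [64, 66, 68, 70, 72]
ys = [7.0, 8.0, 9.5, 10.0, 11.5]

tss, ess, rss = sums_of_squares(xs, ys)
print(f"TSS = {tss:.3f}, ESS = {ess:.3f}, RSS = {rss:.3f}, "
      f"ESS + RSS = {ess + rss:.3f}")
```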

R-squared
R-squared (or $R^2$) is the proportion of the variation in Y that is explained by the regression.
$R^2 = ESS/TSS = 1 - RSS/TSS$
$0 \le R^2 \le 1$
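The two ends of the range can be illustrated directly. This minimal sketch computes R-squared as 1 - RSS/TSS for any list of fitted values (the function name and data are hypothetical). Predicting the sample mean for every observation explains nothing and gives 0; fitted values that match the data exactly give 1.

```python
def r_squared(ys, fitted):
    """R^2 = 1 - RSS / TSS for observed values ys and fitted values."""
    y_bar = sum(ys) / len(ys)
    tss = sum((y - y_bar) ** 2 for y in ys)
    rss = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return 1 - rss / tss

# Hypothetical observations (NOT from the lecture).
ys = [7.0, 8.0, 9.5, 10.0, 11.5]

mean_only = [sum(ys) / len(ys)] * len(ys)   # explains none of the variation
perfect = list(ys)                          # explains all of the variation
partial = [7.0, 8.1, 9.2, 10.3, 11.4]       # hypothetical fitted line

print(r_squared(ys, mean_only))  # 0.0
print(r_squared(ys, perfect))    # 1.0
print(round(r_squared(ys, partial), 4))
```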

Multiple Regression
Each coefficient is a partial regression coefficient.
$\beta_2$ is the change in Y associated with a one unit increase in $X_2$, holding the other Xs (i.e. $X_1$, $X_3$, $X_4$, etc.) constant.
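The "holding the other Xs constant" interpretation can be seen in a small experiment. The sketch below is illustrative only: it builds hypothetical, noise-free data from Y = 1 + 2*X1 + 3*X2, estimates the coefficients by solving the normal equations in plain Python, and then regresses Y on X1 alone. Because X1 and X2 are correlated in this made-up sample, the one-variable slope on X1 picks up part of X2's effect, while the multiple regression recovers the partial coefficients.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(columns, y):
    """OLS coefficients [intercept, slope_1, ...] from X'X b = X'y."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical, noise-free data built as Y = 1 + 2*X1 + 3*X2.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [1, 1, 2, 2, 3, 4]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

both = ols([x1, x2], y)    # partial coefficients, close to [1, 2, 3]
x1_only = ols([x1], y)     # slope on X1 absorbs part of X2's effect
print([round(c, 4) for c in both])
print([round(c, 4) for c in x1_only])
```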

Multiple Regression Example
Suppose we run a regression of hourly wages on education, experience, and tenure, and get a coefficient of 0.599 on education.
This means that, on average, a one year increase in education is associated with a $0.599 per hour increase in wages, holding experience and tenure constant.

Degrees of Freedom
How many more observations do you have than coefficients you are trying to estimate?
Can you estimate the slope and intercept given just one point?
You always need at least as many observations as the number of coefficients you are estimating.
But having more is better.
Extra observations are extra degrees of freedom.
Degrees of Freedom = n - k - 1, where n is the number of observations and k is the number of explanatory variables (the extra 1 is for the intercept).

R-squared vs. Adjusted R-squared
Whenever you add an extra variable, $R^2$ will go up (it can never go down).
Why? The extra variable will add at least some explanatory power to the regression.
However, by adding another variable, you have an additional coefficient to estimate.
Degrees of freedom go down.
So there is a benefit of adding an extra variable ($R^2$ goes up) and a cost (d.f. go down).
Adjusted $R^2$ adjusts the $R^2$ to account for the loss in degrees of freedom.

Adjusted R-squared
$\bar{R}^2 = 1 - \frac{RSS/(n-k-1)}{TSS/(n-1)}$
Note that it is possible to get a negative adjusted R-squared.
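A short calculation shows how a negative value can arise. The sketch below uses the usual definition of adjusted R-squared, 1 - (1 - R^2)(n - 1)/(n - k - 1), which is algebraically the same as 1 - [RSS/(n-k-1)] / [TSS/(n-1)]. The function name and the input numbers are hypothetical, chosen only for illustration.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1).

    n is the number of observations, k the number of explanatory
    variables (not counting the intercept).
    """
    df = n - k - 1  # degrees of freedom
    if df <= 0:
        raise ValueError("need more observations than coefficients (n > k + 1)")
    return 1 - (1 - r2) * (n - 1) / df

# Hypothetical inputs (NOT from the lecture).
decent_fit = adjusted_r_squared(0.50, n=30, k=3)  # slightly below 0.50
weak_fit = adjusted_r_squared(0.05, n=20, k=4)    # falls below zero
print(round(decent_fit, 4), round(weak_fit, 4))
```

A low R-squared combined with several regressors and few observations is the typical case where the penalty outweighs the fit.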