Post on 19-Dec-2015
What makes one estimator better than another
“Estimator” is a jargon term for a method of estimating.
Estimate
• The estimator produces an estimate.
• The estimate is the number.
• The estimator is the method.
What makes one estimator better than another
A better estimator is more likely to be close to the true line.
How close is our regression line to the true line?
• To answer, we must make assumptions.
• Assumption 1 is right in the question above. It’s that there is a true line that we’re trying to find.
• Assumptions are needed to assess an estimator.
• To see where we’re going with the assumptions…
True line demo review
• $Y_i = \alpha + \beta X_i + e_i$
• (spreadsheet)
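For readers without the spreadsheet, the true-line idea can be sketched in a few lines of Python. This is only an illustration under assumed values: the true intercept 2.0, the true slope 0.5, the error spread `sigma`, and the helper names `simulate` and `least_squares` are all hypothetical and do not come from the demo itself.

```python
import random

def simulate(alpha, beta, xs, sigma, rng):
    """Generate Y_i = alpha + beta * X_i + e_i with normal errors e_i."""
    return [alpha + beta * x + rng.gauss(0.0, sigma) for x in xs]

def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

rng = random.Random(0)            # fixed seed so the run is repeatable
xs = list(range(20))              # hypothetical X values
ys = simulate(alpha=2.0, beta=0.5, xs=xs, sigma=1.0, rng=rng)
a_hat, b_hat = least_squares(xs, ys)
print("true line: Y = 2.0 + 0.5 X")
print(f"estimated: Y = {a_hat:.3f} + {b_hat:.3f} X")
```

The estimated line differs from the true line only because of the random errors; rerunning with a different seed gives a different estimate from the same method.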
Least squares demo review
Errors’ expected value is 0.
– Assumption 2
• Why we draw our regression line through the middle of the points’ pattern
• Implies that the least squares estimator is unbiased
• Estimator = Method
Bias
• Unbiased means aimed at target.
– Bias demo
• The expected value of the least squares slope is the true slope.
• Same for intercept.
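A rough way to see unbiasedness without the bias demo is to repeat the estimation many times and average the results. The sketch below assumes a hypothetical true line (intercept 2.0, slope 0.5) and normal errors with expected value 0; the function name `ls_fit` and all numbers are illustrative choices, not part of the original demo.

```python
import random

def ls_fit(xs, ys):
    """Least squares (intercept, slope) for one sample."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - slope * x_bar, slope

rng = random.Random(1)
alpha, beta, sigma = 2.0, 0.5, 1.0   # hypothetical true line and error spread
xs = list(range(20))
trials = 2000

intercepts, slopes = [], []
for _ in range(trials):
    # Fresh errors each trial, always with expected value 0.
    ys = [alpha + beta * x + rng.gauss(0.0, sigma) for x in xs]
    a_hat, b_hat = ls_fit(xs, ys)
    intercepts.append(a_hat)
    slopes.append(b_hat)

mean_intercept = sum(intercepts) / trials
mean_slope = sum(slopes) / trials
print(f"average estimated slope:     {mean_slope:.4f} (true {beta})")
print(f"average estimated intercept: {mean_intercept:.4f} (true {alpha})")
```

Individual estimates miss the target in both directions, but their average settles near the true slope and intercept, which is what "aimed at target" means.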
All errors have the same variance
– Assumption 3
• Why you give each point equal consideration
Errors not correlated with each other
– Assumption 4
• Correlated means a linear relationship that lets you predict one error once you know another error.
• Serial correlation would be if one error helps you anticipate the direction of the next error.
Errors not correlated with each other
• Why you predict on the regression line rather than above or below it.
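One possible way to check for serial correlation is to correlate each error with the one that follows it. The sketch below is an assumption-laden illustration: it builds serially correlated errors with a simple rule where each error is 0.8 times the previous error plus fresh noise (a hypothetical choice, not something the slides specify), and compares them with independent errors. The helper names are made up for this example.

```python
import math
import random

def correlation(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    suu = sum((a - u_bar) ** 2 for a in u)
    svv = sum((b - v_bar) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def lag1_correlation(errors):
    """Correlation between each error and the next one."""
    return correlation(errors[:-1], errors[1:])

rng = random.Random(2)
n = 5000

# Independent errors: knowing one tells you nothing about the next.
independent = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Serially correlated errors (hypothetical rule): each error carries over
# 80% of the previous one.
rho = 0.8
serial = []
previous = 0.0
for _ in range(n):
    previous = rho * previous + rng.gauss(0.0, 1.0)
    serial.append(previous)

r_independent = lag1_correlation(independent)
r_serial = lag1_correlation(serial)
print(f"lag-1 correlation, independent errors: {r_independent:.3f}")
print(f"lag-1 correlation, serial errors:      {r_serial:.3f}")
```

With independent errors the best guess for the next error is 0, so the prediction belongs on the line. With serially correlated errors, a positive error suggests the next one is positive too.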
Normal distribution for errors
– Assumption 5
• Normal distribution results from the accumulation of small disturbances. Random walk with small steps.
• Normal distribution demos show how tight the normal distribution is.
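The random-walk idea can be sketched as follows. This is an illustration only: the step distribution (uniform between -1 and 1), the number of steps, and the number of walks are assumptions chosen for the example, not taken from the demos.

```python
import math
import random

rng = random.Random(3)
steps = 100      # small disturbances per walk (hypothetical)
walks = 5000     # number of independent walks (hypothetical)

# Each endpoint is the accumulation of many small, independent steps.
endpoints = [sum(rng.uniform(-1.0, 1.0) for _ in range(steps))
             for _ in range(walks)]

# A uniform(-1, 1) step has variance 1/3, so the sum of `steps` of them
# has standard deviation sqrt(steps / 3).
sd = math.sqrt(steps / 3.0)

within_1 = sum(abs(e) <= sd for e in endpoints) / walks
within_2 = sum(abs(e) <= 2 * sd for e in endpoints) / walks
print(f"share of walks ending within 1 sd: {within_1:.3f}")
print(f"share of walks ending within 2 sd: {within_2:.3f}")
```

For a normal distribution about 68% of values fall within one standard deviation of the mean and about 95% within two, so shares close to those figures suggest the accumulated disturbances are approximately normal.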
Normal distribution for errors
• Least squares is best.
– Unbiased
– Least variance (most efficient) of any estimator that is unbiased
• Efficiency demos
• Can do hypothesis testing.
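Efficiency can be illustrated by comparing least squares with some other unbiased method. The sketch below uses one hypothetical competitor, the slope of the line through the first and last points, which is also unbiased but ignores the points in between. The true line, error spread, and function names are assumptions for illustration, and a single comparison like this does not prove the general claim.

```python
import random

def ls_slope(xs, ys):
    """Least squares slope."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx

def endpoint_slope(xs, ys):
    """Slope of the line through the first and last points only."""
    return (ys[-1] - ys[0]) / (xs[-1] - xs[0])

def mean(values):
    return sum(values) / len(values)

def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

rng = random.Random(5)
alpha, beta, sigma = 2.0, 0.5, 1.0   # hypothetical true line and error spread
xs = list(range(20))

ls_slopes, ep_slopes = [], []
for _ in range(2000):
    ys = [alpha + beta * x + rng.gauss(0.0, sigma) for x in xs]
    ls_slopes.append(ls_slope(xs, ys))
    ep_slopes.append(endpoint_slope(xs, ys))

var_ls = variance(ls_slopes)
var_ep = variance(ep_slopes)
print(f"least squares: mean {mean(ls_slopes):.4f}, variance {var_ls:.5f}")
print(f"endpoints:     mean {mean(ep_slopes):.4f}, variance {var_ep:.5f}")
```

Both methods average out near the true slope, but the least squares estimates cluster more tightly around it.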
1A spreadsheet adds …
• Standard error of coefficient for the slope
• T-statistic
– Coefficient ÷ its standard error
• R-squared
• Standard error of the regression
Standard error of coefficient
• Shows how near the estimated coefficient might be to the true coefficient.
t
• A unitless number with a known distribution, if the assumptions about the errors are true.
• Used here to test the hypothesis that the true slope parameter is 0.
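As a sketch of how these two numbers might be computed by hand, the code below uses the usual textbook formula for simple regression: the standard error of the slope is s divided by the square root of the sum of squared deviations of X from its mean. The five data points are made up purely for illustration.

```python
import math

# Made-up illustrative data, not from the spreadsheet.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residuals: vertical distances from the points to the fitted line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
ssr = sum(r ** 2 for r in residuals)

s = math.sqrt(ssr / (n - 2))       # standard error of the regression
se_slope = s / math.sqrt(sxx)      # standard error of the slope coefficient
t_stat = slope / se_slope          # coefficient divided by its standard error

print(f"slope = {slope:.4f}, standard error = {se_slope:.4f}, t = {t_stat:.2f}")
```

A t far from 0 means the estimated slope is many standard errors away from 0, which counts as evidence against the hypothesis that the true slope is 0.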
R²
• Between 0 and 1. Demo
• Least squares maximizes this.
• The correlation coefficient r is its square root (taking the sign of the slope).
$$R^2 = 1 - \frac{\text{Sum of squares of residuals}}{\text{Sum of squares of } (Y - \bar{Y})}$$
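A small sketch of this calculation, on made-up data, also checks that the square of the correlation coefficient r matches R² in a one-variable regression. The data values and variable names are illustrative assumptions.

```python
import math

# Made-up illustrative data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)   # sum of squares of (Y - Y-bar)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

slope = sxy / sxx
intercept = y_bar - slope * x_bar
ssr = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

r_squared = 1.0 - ssr / syy
r = sxy / math.sqrt(sxx * syy)            # correlation coefficient

print(f"R-squared = {r_squared:.6f}")
print(f"r = {r:.6f}, r * r = {r * r:.6f}")
```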
Standard error of the regression
• “s”
• Should be called standard residual
– But it isn’t
s
• Root-mean-square average size of the residuals
• $s^2$ is an estimate of $\sigma^2$
$$s^2 = \frac{\text{Sum of squares of residuals}}{\text{Number of observations} - 2}$$
$s^2$ and $\sigma^2$
$$s^2 = \frac{\text{Sum of squares of residuals}}{N - 2}$$
$$\sigma^2 = \frac{\text{Expected value of sum of squares of the errors}}{N}$$
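The different divisors can be checked with a rough simulation. In the sketch below, the true line, the error variance of 4, the sample size of 10, and all names are hypothetical choices for illustration. It averages three quantities over many samples: the sum of squared residuals divided by N - 2, the same sum divided by N, and the sum of squared true errors divided by N.

```python
import random

def sum_sq_residuals(xs, ys):
    """Sum of squared residuals around the least squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

rng = random.Random(4)
alpha, beta, sigma = 2.0, 0.5, 2.0   # hypothetical; true variance sigma**2 = 4
xs = list(range(10))
n = len(xs)
trials = 4000

total_s2 = 0.0
total_ssr_over_n = 0.0
total_err_over_n = 0.0
for _ in range(trials):
    errors = [rng.gauss(0.0, sigma) for _ in xs]
    ys = [alpha + beta * x + e for x, e in zip(xs, errors)]
    ssr = sum_sq_residuals(xs, ys)
    total_s2 += ssr / (n - 2)                          # s squared
    total_ssr_over_n += ssr / n                        # residuals, divided by N
    total_err_over_n += sum(e * e for e in errors) / n # true errors, divided by N

mean_s2 = total_s2 / trials
mean_ssr_over_n = total_ssr_over_n / trials
mean_err_over_n = total_err_over_n / trials
print(f"average of SSR / (N - 2):             {mean_s2:.3f}")
print(f"average of SSR / N:                   {mean_ssr_over_n:.3f}")
print(f"average of sum of squared errors / N: {mean_err_over_n:.3f}")
```

Residuals are measured from the fitted line, which was itself pulled toward the points, so they run a little smaller than the true errors. Dividing by N - 2 rather than N compensates for the two estimated coefficients.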