Stats Chapter 5 - Least Squares Regression Definition of a regression line: A regression line is a...


Page 1: Stats Chapter 5 - Least Squares Regression Definition of a regression line: A regression line is a straight line that describes how a response variable.

Stats Chapter 5 - Least Squares Regression

Definition of a regression line: A regression line is a straight line that describes how a response variable (y) changes as an explanatory variable (x) changes…

• Used to predict a y value given an x value.

• Requires an explanatory and a response variable.

• Given as an equation of a line in slope-intercept form:

ŷ = a + bx

Read as: “y-hat”, where a = y-intercept and b = slope

Page 2:

How It Works:

ŷ = a + bx

[Figure: graph of the regression line with axes x and y. Using the regression line to predict a y-value.]

Page 3:

Vertical Distance = Observed y - Predicted y

Close-Up: We are trying to find the line that minimizes the sum of the squares of the vertical distances…

[Figure: close-up of data points and the line. The vertical distance is negative for a point below the line and positive for a point above it.]

Page 4:

Least-Squares Regression Line:

• The least-squares regression line of y on x is the line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible.

• The slope is the amount of change in ŷ when x increases by one unit.

• The intercept of the line is the predicted value ŷ when x = 0.
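The slides do not show how a and b are computed. As a minimal sketch, the standard closed-form formulas are b = Sxy / Sxx and a = ȳ - b·x̄, where Sxy and Sxx are sums of products of deviations from the means. The data below are hypothetical and chosen only so the arithmetic is easy to follow; the function names are illustrative.

```python
def least_squares_fit(xs, ys):
    """Return (a, b) for the line y-hat = a + b*x that minimizes
    the sum of squared vertical distances."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept
    return a, b


def sum_squared_distances(a, b, xs, ys):
    """Sum of squared vertical distances (observed - predicted)."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, not from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_fit(xs, ys)
print(f"y-hat = {a:.1f} + {b:.1f}x")  # y-hat = 2.2 + 0.6x

# Any other line gives a larger sum of squares than the fitted one.
best = sum_squared_distances(a, b, xs, ys)
nudged = sum_squared_distances(a, b + 0.1, xs, ys)
print(best < nudged)  # True
```

Nudging the slope or the intercept away from the fitted values always increases the sum of squares, which is what "least squares" means.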

Page 5:

Calculator Procedure

1) Enter Data into lists…

[Screenshot: data entered in List 1 and List 2]

Page 6:

2) Run Stat > Calc > 8

3) Write the regression line from the calculated a and b values:

ŷ = a + bx

ŷ = 1.089 + 0.189x

• y-int = predicted gas used when degree days = 0

• slope = increase in predicted gas used when degree days increase by one
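To make the interpretation concrete, here is a short sketch that plugs values into the fitted line from this slide. The coefficients 1.089 and 0.189 come from the slide; the inputs of 20 and 21 degree days and the function name are hypothetical, chosen only for illustration.

```python
# Coefficients from the fitted line on the slide: y-hat = 1.089 + 0.189x
A = 1.089  # y-intercept: predicted gas used when degree days = 0
B = 0.189  # slope: change in predicted gas used per extra degree day


def predict_gas(degree_days):
    """Predicted gas used for a given number of degree days."""
    return A + B * degree_days


pred_20 = predict_gas(20)  # hypothetical input
pred_21 = predict_gas(21)  # one more degree day

print(f"{pred_20:.3f}")            # 4.869
print(f"{pred_21 - pred_20:.3f}")  # 0.189, the slope
```

Raising x by one unit changes the prediction by exactly the slope, and setting x = 0 returns the intercept.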

Page 7:

Correlation vs. Regression

• The square of the correlation (r²) is the fraction of the variation in the values of y that is explained by the least-squares regression line of y on x.

• 0 ≤ r² ≤ 1

• When reporting a regression, give r² as a measure of how successful the regression was in explaining the response.

• ex: 5.4 pg 134.
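The "fraction of variation explained" reading of r² can be checked numerically. The sketch below uses hypothetical data (not the textbook example) and computes r² two ways: by squaring the correlation r, and as 1 - SSE/SST, where SSE is the sum of squared residuals and SST is the total variation of y about its mean. All variable names are illustrative.

```python
from math import sqrt

# Hypothetical data, not from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)  # total variation in y (SST)

# Correlation and its square.
r = sxy / sqrt(sxx * syy)
r_squared = r ** 2

# Least-squares line and the variation it leaves unexplained (SSE).
b = sxy / sxx
a = y_bar - b * x_bar
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
explained_fraction = 1 - sse / syy

print(f"{r_squared:.2f}")           # 0.60
print(f"{explained_fraction:.2f}")  # 0.60
```

Both routes give the same number: here the line explains 60% of the variation in y.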

Page 8:

Residuals

• Residual = the difference between the observed and predicted y-values.

• Residual = y - ŷ

• Residual Plot - plots the residual values on the y-axis vs. the explanatory variable on the x-axis.

• Makes patterns easier to see.

• Used to assess the fit of a regression line.

[Figure: example residual plot]
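As a rough illustration of the definition above, the following sketch computes residuals for a small hypothetical data set and pairs each one with its x-value, which gives the points a residual plot would show. The data and names are assumptions made for the example.

```python
# Hypothetical data, not from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Fit the least-squares line y-hat = a + b*x.
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))
a = y_bar - b * x_bar

# Residual = observed y - predicted y.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Points for a residual plot: x on the horizontal axis, residual on the vertical.
plot_points = list(zip(xs, residuals))

print([round(r, 1) for r in residuals])  # [-0.8, 0.6, 1.0, -0.6, -0.2]
print(abs(sum(residuals)) < 1e-9)        # True
```

The residuals from a least-squares line always sum to zero (up to rounding), so a residual plot is centered on the horizontal line at 0.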

Page 9:

Calculator Procedure

1) Run regression

2) Go into Stat Plot 1

3) Set Y-List to ‘RESID’

4) Set window values to match data range

5) Graph

2nd > STAT > 7

Page 10:

Residual Patterns

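Ideal: residuals scattered randomly above and below 0 with no pattern.

Curved: a curved pattern shows that the relationship is not linear.

Spread: increasing or decreasing spread shows that predictions are less accurate for some x-values than for others.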

Page 11:

Outliers:

• Vertical (in the y direction)

• Horizontal (in the x direction)

Page 12:

Cautions About Correlation and Regression

• Both describe LINEAR relationships

• Both are affected by outliers

• Always plot your data before interpreting

• Beware of EXTRAPOLATION

• Beware of LURKING VARIABLES

CORRELATION DOES NOT IMPLY CAUSATION!!!
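The caution about outliers can be seen in a small numerical sketch. The data are hypothetical: five points with a positive trend, plus one added point far out in the x direction. The helper name fit_slope is illustrative.

```python
def fit_slope(xs, ys):
    """Slope b of the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical data with a positive trend.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# The same data plus one hypothetical outlier in the x direction.
xs_outlier = xs + [15]
ys_outlier = ys + [2]

slope_without = fit_slope(xs, ys)
slope_with = fit_slope(xs_outlier, ys_outlier)

print(f"{slope_without:.2f}")  # 0.60
print(f"{slope_with:.2f}")     # -0.11
```

A single point far from the others in x is enough to flip the sign of the slope here, which is one reason to plot the data before interpreting a regression.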