Chapter 7 Linear Regression. Bivariate data x variable: is the independent or explanatory variable...
-
Upload
tracy-atkinson -
Category
Documents
-
view
217 -
download
0
description
Transcript of Chapter 7 Linear Regression. Bivariate data x variable: is the independent or explanatory variable...
Chapter 7Linear Regression
Bivariate data• x – variable: is the independent or
explanatory variable• y- variable: is the dependent or
response variable• Use x to predict y.
bxay ˆ
• (y-hat) means the predicted yb• is the slope• it is the amount by which y increases when x
increases by 1 unita• is the y-intercept• it is the height of the line when x = 0• in some situations, the y-intercept has no
meaning
y
Be sure to put the hat on the y
Sum of the squares = 61.25
ˆ 0.5 4y x
-4
4.5
-5
y =0.5(0) + 4 = 4
0 – 4 = -4
(0,0)
(3,10)
(6,2)
(0,0)
y =0.5(3) + 4 = 5.5
10 – 5.5 = 4.5
y =0.5(6) + 4 = 7
2 – 7 = -5
(Estimate)
(0,0)
(3,10)
(6,2)
Sum of the squares = 54
331
xy
Use a calculator to find the line of
best fit
Find y - y
-3
6
-3
What is the sum of the deviations
from the line?
Will it always be zero?
The line that minimizes the sum of the squares of the deviations from the line
is the LSRL.
YES
Least Squares Regression Line(LSRL)
• The line that gives the best fit to the data set
• The line that minimizes the sum of the squares of the deviations from the line
Slope:
For each unit increase in x, there is an approximate increase/decrease of b in y.
Interpretations
Correlation coefficient:There is a direction, strength, type of association between x and y.
The ages (in months) and heights (in inches) of seven children are given.
x 16 24 42 60 75 102 120
y 24 30 35 40 48 56 60
Find the LSRL and r.
Interpret the correlation coefficient and slope in the context of the problem.
0.342 20.4040.994
y xr
$
Correlation coefficient:
There is a strong, positive, linear association between the age and height of children.
Slope:For each increase in age of one month, there is an approximate increase of 0.34 inches in heights of children.
The ages (in months) and heights (in inches) of seven children are given.
x 16 24 42 60 75 102 120
y 24 30 35 40 48 56 60
Use y-hat to predict the height of a child who is 4.5 years old.
Use y-hat to predict the height of someone who is 20 years old.
4.5 years 38.876 inchesy$
20 years 102.504 inchesy$
0.342 20.404 y x$
4.5 years 54 months
Extrapolation• The LSRL should not be used to
predict y for values of x outside the data set.
• It is unknown whether the pattern observed in the scatterplot continues outside this range.
The ages (in months) and heights (in inches) of seven children are given.
x 16 24 42 60 75 102 120
y 24 30 35 40 48 56 60
Calculate x & y.
Draw the LSRL.
Plot the point (x, y) on the LSRL.
62.714, 41.857x y
YES
Will this point always be on the LSRL?
The correlation coefficient and the LSRL are both non-resistant measures.
Formulas – on chart0 1ˆ y b b x
1 y
x
sb r
s
1 2
i i
i
x x y yb
x x
0 1 b y b x
The following statistics are found for the variables posted speed limit (x) and the average number of accidents (y).
40, 11.6,18, 8.4, 0.9981
x
y
x sy s r
Find the LSRL & predict the number of accidents for a posted speed limit of 50 mph.
ˆ 0.723 10.91 y x ˆ 50 y
1 y
x
sb r
s 8.40.9981
11.6
0.7227...
0 1 b y b x 18 0.7227... 40 10.9104...
25.24 accidents
Reading the LSRL and Correlation Coefficient from Computer-Generated Data
The regression output for the association between student scores on Ms. Havens' Statistics tests and time spent studying (in hours) is shown.
What is the LSRL?
What is the correlation coefficient?
Predictor Coef StDev T PConstant 8.30125 2.6354 23.1 0.000Time 30.4902 0.7951 11.2 0.000
S = 2.513 R-Sq = 86.2% R-Sq (Adj.) = 84.2%
ˆ 30.4902 8.30125 y x 0.862r0.92843rPositive because the slope is positive
Assignments
• WS Regression #1– Due on Friday, 30 October 2015.
• WS Regression #2– Due on Friday, 30 October 2015.
• Friday QUIZ: Correllation and LSRL