Calculating Residuals © Christine Crisp “Teach A Level Maths” Vol. 2: A2 Core Modules.
"Certain images and/or photos on this presentation are the copyrighted property of JupiterImages and are being used with permission under license. These images and/or photos may not be copied or downloaded without permission from JupiterImages"
Statistics 1 (AQA, EDEXCEL, OCR)
[Scatter diagram: Foot length and height of UK children. Axes: Height (cm), Foot length (cm).]
Once we have found a regression line, we may need to know how close any particular observation is to the line. To do this, we find a residual. For the height and foot length data . . .
[Diagram: the scatter diagram with the y on x regression line and a point $(x_A, y_A)$.]
To find the residual for the point $(x_A, y_A)$ we find $y_A - y$, where $y$ is the y-coordinate on the regression line at $x = x_A$.
e.g. The marks for 10 students in Maths and Physics are as follows:
             A   B   C   D   E   F   G   H   I   J
Maths, x    41  37  38  39  47  42  34  35  48  49
Physics, y  36  20  31  24  35  42  26  27  29  37
The regression line for y on x is
$y = 1.81 + 0.70x$
Residual of point A = $y_A - y$. (The residual is negative if the point is below the line.)
To find $y$, substitute the value of $x$ at point A into the regression line:
$y = 1.81 + 0.70(41) = 30.51$
$y_A - y = 36 - 30.51 = 5.49$
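The worked example can be checked numerically. The following is a minimal Python sketch, assuming the line was obtained with the usual least-squares formulas $b = S_{xy}/S_{xx}$ and $a = \bar{y} - b\bar{x}$ (the slides do not show this step). The function name `regression_line` is illustrative only.

```python
# Marks from the worked example (students A to J).
maths = [41, 37, 38, 39, 47, 42, 34, 35, 48, 49]    # x
physics = [36, 20, 31, 24, 35, 42, 26, 27, 29, 37]  # y


def regression_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


a, b = regression_line(maths, physics)

# Round the coefficients to 2 decimal places before substituting,
# as in the worked example.
a_rounded, b_rounded = round(a, 2), round(b, 2)

# Residual of point A = y_A - y, with y read off the line at x = x_A.
x_A, y_A = maths[0], physics[0]
y_on_line = a_rounded + b_rounded * x_A
residual_A = y_A - y_on_line

print(f"y = {a_rounded:.2f} + {b_rounded:.2f}x")
print(f"y on the line at x = {x_A}: {y_on_line:.2f}")
print(f"residual of A: {residual_A:.2f}")
```

Rounding the coefficients to 2 decimal places before substituting gives a residual of about 5.49. The unrounded coefficients give about 5.30, because x = 41 happens to be the mean Maths mark and the least-squares line passes through the point of means.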
SUMMARY
To find the residual for a particular observation, A,
• calculate the y-coordinate on the regression line corresponding to the x-value at A,
• find $y_A - y$
• The residual is negative if the point is below the line
• Since $y = a + bx$, the residual at A is also given by $y_A - a - bx_A$
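The steps in the summary can be sketched as one small Python function. The function name `residual` and the line y = 2 + 0.5x used to try it out are hypothetical, chosen only to show the sign convention.

```python
def residual(a, b, x_obs, y_obs):
    """Residual of the observation (x_obs, y_obs) from the line y = a + b*x."""
    y_line = a + b * x_obs   # y-coordinate on the regression line at x_obs
    return y_obs - y_line    # the same as y_obs - a - b*x_obs


# Hypothetical line y = 2 + 0.5x, used only for illustration.
above = residual(2, 0.5, 4, 5)  # the line gives y = 4, the point is above it
below = residual(2, 0.5, 4, 3)  # the point is below the line

print(above, below)
```

A point above the line gives a positive residual and a point below the line gives a negative one.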
Outliers
Outliers are points that lie well away from the regression line.
Since a residual measures the vertical distance of a point from the regression line, residuals are used to identify outliers.
Outliers can have a considerable effect on a regression line and make it unreliable.
e.g. The diagram is a scatter diagram of the data shown in the table.
x y
1 5
2 18
3 12
4 14
5 12
6 11
7 7
8 3
If we were to draw the line “by eye”, the 1st point . . . would lie well away from the line we would want to draw.
However, the calculation of the regression line includes the 1st point and distorts the position of the line.
The diagram shows the y on x regression line for all the data. The residuals are shown by the red lines.
$y = 14.21 - 0.88x$
The left-hand end of the line is further down than it would be without the 1st point.
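The residuals drawn on the diagram can be listed numerically. This is a minimal Python sketch, assuming the line for all the data is y = 14.21 - 0.88x with the coefficients rounded to 2 decimal places. The variable names are illustrative only.

```python
# Data from the table and the regression line for all 8 points.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [5, 18, 12, 14, 12, 11, 7, 3]
a, b = 14.21, -0.88  # y = a + b*x, coefficients rounded to 2 d.p.

# Residual of each point = observed y minus the y-value on the line.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

for x, y, r in zip(xs, ys, residuals):
    print(f"x = {x}, y = {y:2d}, residual = {r:6.2f}")

# Index of the residual that is largest in size, ignoring its sign.
largest = max(range(len(xs)), key=lambda i: abs(residuals[i]))
print("largest residual (in size) is at x =", xs[largest])
```

The 1st point has a residual of about -8.33, the largest in size, which is consistent with treating it as a possible outlier.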
Removing the 1st point gives
$y = 21.36 - 2.07x$
With all the data, the sum of the squares of the residuals is $\sum R^2 = 139$.
Without the 1st point, the sum of the squares of the residuals is $\sum R^2 = 19.9$.
Without the 1st point, we have a regression line that is a much better fit.
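The comparison of the two fits can be reproduced with a short Python sketch. It assumes the usual least-squares formulas for the y on x line; the function name `fit_and_sse` is illustrative only.

```python
def fit_and_sse(xs, ys):
    """Fit y = a + b*x by least squares; return (a, b, sum of squared residuals)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [5, 18, 12, 14, 12, 11, 7, 3]

# Fit once with all the data and once with the 1st point removed.
a_all, b_all, sse_all = fit_and_sse(xs, ys)
a_cut, b_cut, sse_cut = fit_and_sse(xs[1:], ys[1:])

print(f"all data:          y = {a_all:.2f} + ({b_all:.2f})x, sum of squares = {sse_all:.1f}")
print(f"without 1st point: y = {a_cut:.2f} + ({b_cut:.2f})x, sum of squares = {sse_cut:.1f}")
```

The sum of squares falls from about 139 to about 19.9 when the 1st point is removed.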
Exercise
1. The table shows the number of accidents to children as a percentage of those to adults, y, in 9 areas of London together with the percentage of open space in those areas, x.
Area                            A     B     C     D     E     F     G     H     I
Open Spaces (%), x              5   1.3   1.4     7   4.5   5.2   6.3  14.6  14.8
Children’s Accidents (%), y  46.3  42.9    40  38.2    37  33.6  30.8  23.8  17.1
(a) Find the equation of the regression line of y on x.
(b) Estimate the percentage of accidents to children in an area with 10% open space.
(c) Find the residual for A.
Solutions
(a) Find the equation of the regression line of y on x.
(b) Estimate the percentage of accidents to children in an area with 10% open space.
(c) Find the residual for A.
Solution:
(a) The equation of the regression line for y on x is $y = 45.40 - 1.65x$
(b) $x = 10 \Rightarrow y = 28.9$, so the number of accidents to children is estimated to be nearly 29% of the number to adults.
(c) At $A(5, 46.3)$, $x = 5 \Rightarrow y = 45.40 - 1.65(5) = 37.15$
Residual = $y_A - y = 46.3 - 37.15 = 9.15$
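The three answers can be checked with a short Python sketch. It assumes the usual least-squares formulas and rounds the coefficients to 2 decimal places before substituting. The variable names are illustrative only.

```python
# Exercise data for areas A to I.
open_space = [5, 1.3, 1.4, 7, 4.5, 5.2, 6.3, 14.6, 14.8]           # x (%)
accidents = [46.3, 42.9, 40, 38.2, 37, 33.6, 30.8, 23.8, 17.1]     # y (%)

n = len(open_space)
x_bar = sum(open_space) / n
y_bar = sum(accidents) / n

# (a) Least-squares line of y on x: y = a + b*x.
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(open_space, accidents))
     / sum((x - x_bar) ** 2 for x in open_space))
a = y_bar - b * x_bar
a2, b2 = round(a, 2), round(b, 2)  # coefficients to 2 decimal places

# (b) Estimate at 10% open space.
estimate = a2 + b2 * 10

# (c) Residual for A = y_A minus the y-value on the line at x_A.
x_A, y_A = open_space[0], accidents[0]
residual_A = y_A - (a2 + b2 * x_A)

print(f"(a) y = {a2:.2f} + ({b2:.2f})x")
print(f"(b) estimate at x = 10: {estimate:.1f}")
print(f"(c) residual for A: {residual_A:.2f}")
```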
The following slides contain repeats of information on earlier slides, shown without colour, so that they can be printed and photocopied. For most purposes the slides can be printed as “Handouts” with up to 6 slides per sheet.
SUMMARY
To find the residual for a particular observation, A,
• calculate the y-coordinate on the regression line corresponding to the x-value at A,
• find $y_A - y$
• The residual is negative if the point is below the line
• Since $y = a + bx$, the residual at A is also given by $y_A - a - bx_A$
e.g. The marks for 10 students in Maths and Physics are as follows:
             A   B   C   D   E   F   G   H   I   J
Maths, x    41  37  38  39  47  42  34  35  48  49
Physics, y  36  20  31  24  35  42  26  27  29  37
The regression line for y on x is
$y = 1.81 + 0.70x$
Residual of point A = $y_A - y$. (The residual is negative if the point is below the line.)
To find $y$, substitute the value of $x$ at point A into the regression line:
$y = 1.81 + 0.70(41) = 30.51$
$y_A - y = 36 - 30.51 = 5.49$
Outliers
Outliers are points that lie well away from the regression line.
Since a residual measures the vertical distance of a point from the regression line, residuals are used to identify outliers.
Outliers can have a considerable effect on a regression line and make it unreliable.
e.g. The diagram is a scatter diagram of the data shown in the table.
x y
1 5
2 18
3 12
4 14
5 12
6 11
7 7
8 3
If we were to draw the line “by eye”, the 1st point . . . would lie well away from the line we would want to draw.
However, the calculation of the regression line includes the 1st point and distorts the position of the line.
e.g. The diagram shows the y on x regression line for the data in the table. The residuals are shown by the lines parallel to the y-axis.
With all the data, the regression line is $y = 14.21 - 0.88x$ and the sum of the squares of the residuals is $\sum R^2 = 139$.
The 1st point has the largest residual.
Without the 1st point, the regression line is $y = 21.36 - 2.07x$ and the sum of the squares of the residuals is $\sum R^2 = 19.9$.
Without the 1st point, we have a regression line that is a much better fit.