© Boardworks Ltd 2005
AS-Level Maths: Statistics 1 for Edexcel
S1.2 Calculating means and standard deviations
Contents
Means
Calculating means
Calculating standard deviations
Coding
Mean
The mean is the most widely used average in statistics. It is
found by adding up all the values in the data and dividing by
how many values there are.
Notation: If the data values are $x_1, x_2, x_3, \ldots, x_n$, then the
mean is
$$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{\sum x_i}{n}$$
Here $\bar{x}$ is the mean symbol, and $\sum x_i$ means the total of all the
x values.
Note: The mean takes into account every piece of
data, so it is affected by outliers in the data. The
median is preferred over the mean if the data
contains outliers or is skewed.
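The definition of the mean can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `mean` and both sample lists are assumptions chosen to show how a single outlier pulls the mean upwards.

```python
def mean(values):
    """Return the arithmetic mean: the total of the values divided by their count."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Illustrative data (an assumption, not taken from the slides).
sample_mean = mean([2, 4, 4, 5, 10])    # (2 + 4 + 4 + 5 + 10) / 5
with_outlier = mean([2, 4, 4, 5, 100])  # the same data with one extreme value

print(sample_mean)   # 5.0
print(with_outlier)  # 23.0
```

Changing one value from 10 to 100 moves the mean from 5 to 23, while the median of both lists stays at 4.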
Mean
If data are presented in a frequency table:

Value    Frequency
$x_1$    $f_1$
$x_2$    $f_2$
…        …
$x_n$    $f_n$

then the mean is
$$\bar{x} = \frac{x_1 f_1 + x_2 f_2 + \cdots + x_n f_n}{\sum f_i} = \frac{\sum x_i f_i}{\sum f_i}$$
Example: The table shows the results of a survey
into household size. Find the mean size.
To find the mean, we add a third column to the table.

Household size, x    Frequency, f    x × f
1                    20              20
2                    28              56
3                    25              75
4                    19              76
5                    16              80
6                    6               36
TOTAL                114             343
Mean = 343 ÷ 114 = 3.01
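The frequency-table calculation can be checked with a short Python sketch. The function name `frequency_mean` and the list-of-pairs layout are assumptions made for illustration; the data are the household sizes from the table.

```python
def frequency_mean(table):
    """Mean of data given as (value, frequency) pairs: sum(x * f) / sum(f)."""
    total_xf = sum(x * f for x, f in table)
    total_f = sum(f for _, f in table)
    return total_xf / total_f


# (household size, frequency) pairs from the survey table.
household = [(1, 20), (2, 28), (3, 25), (4, 19), (5, 16), (6, 6)]
household_mean = frequency_mean(household)

print(round(household_mean, 2))  # 3.01
```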
Standard deviation
There are three commonly used measures of spread (or
dispersion): the range, the inter-quartile range and the
standard deviation.
The standard deviation is widely used in statistics to measure
spread. It is based on all the values in the data, so it is
sensitive to the presence of outliers in the data.
The variance is related to the standard deviation:
$$\text{variance} = (\text{standard deviation})^2$$
The following formulae can be used to find the variance and s.d.
$$\text{variance} = \frac{\sum (x_i - \bar{x})^2}{n} \qquad \text{s.d.} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}}$$
Example: The mid-day temperatures (in °C) recorded for
one week in June were: 21, 23, 24, 19, 19, 20, 21
First we find the mean:
$$\bar{x} = \frac{21 + 23 + \cdots + 21}{7} = \frac{147}{7} = 21\ \text{°C}$$
Then we use
$$\text{variance} = \frac{\sum (x_i - \bar{x})^2}{n}$$

$x_i$     $x_i - \bar{x}$    $(x_i - \bar{x})^2$
21        0                  0
23        2                  4
24        3                  9
19        -2                 4
19        -2                 4
20        -1                 1
21        0                  0
Total:                       22

So variance = 22 ÷ 7 = 3.143
So, s.d. = 1.77°C (3 s.f.)
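The same steps (find the mean, square each deviation, divide the total by n) can be sketched in Python. The function name `variance` is an assumption for illustration; note that it divides by n, as the slides do, rather than by n - 1.

```python
import math


def variance(values):
    """Variance as defined on the slides: sum((x - mean)^2) / n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


temps = [21, 23, 24, 19, 19, 20, 21]  # mid-day temperatures in °C
temp_variance = variance(temps)
temp_sd = math.sqrt(temp_variance)

print(round(temp_variance, 3), round(temp_sd, 2))  # 3.143 1.77
```

Python's standard library offers `statistics.pvariance` and `statistics.pstdev`, which use the same divide-by-n definition.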
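There is an alternative formula which is usually a more
convenient way to find the variance:
$$\text{variance} = \frac{\sum (x_i - \bar{x})^2}{n}$$
But,
$$\begin{aligned}
\sum (x_i - \bar{x})^2 &= \sum (x_i^2 - 2x_i\bar{x} + \bar{x}^2) \\
&= \sum x_i^2 - 2\bar{x}\sum x_i + n\bar{x}^2 \\
&= \sum x_i^2 - 2\bar{x}(n\bar{x}) + n\bar{x}^2 \\
&= \sum x_i^2 - n\bar{x}^2
\end{aligned}$$
Therefore,
$$\text{variance} = \frac{\sum x_i^2}{n} - \bar{x}^2 \quad \text{and} \quad \text{s.d.} = \sqrt{\frac{\sum x_i^2}{n} - \bar{x}^2}$$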
Example (continued): Looking again at the temperature
data for June: 21, 23, 24, 19, 19, 20, 21
We know that
$$\bar{x} = \frac{147}{7} = 21\ \text{°C}$$
Also,
$$\sum x_i^2 = 21^2 + 23^2 + \cdots + 21^2 = 3109$$
So,
$$\text{variance} = \frac{\sum x_i^2}{n} - \bar{x}^2 = \frac{3109}{7} - 21^2 = 3.143$$
$$\text{s.d.} = 1.77\ \text{°C}$$
Note: Essentially the standard deviation is a measure
of how close the values are to the mean value.
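A short Python sketch can confirm that the alternative formula gives the same variance as the definition for this data. The function name `variance_shortcut` is an illustrative assumption.

```python
import math


def variance_shortcut(values):
    """Alternative formula: sum(x^2) / n - mean^2."""
    n = len(values)
    mean = sum(values) / n
    return sum(x * x for x in values) / n - mean * mean


temps = [21, 23, 24, 19, 19, 20, 21]
sum_of_squares = sum(x * x for x in temps)
shortcut = variance_shortcut(temps)
# Definition, using the known mean of 21 for comparison.
by_definition = sum((x - 21) ** 2 for x in temps) / len(temps)

print(sum_of_squares)                       # 3109
print(math.isclose(shortcut, by_definition))  # True
print(round(math.sqrt(shortcut), 2))        # 1.77
```

The comparison uses `math.isclose` because the two routes can differ by a tiny floating-point rounding error. When the mean is very large compared with the spread, the shortcut subtracts two nearly equal numbers and can lose precision, so the definition is the safer choice in code.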
Calculating standard deviation from a table
When the data is presented in a frequency table, the formula
for finding the standard deviation needs to be adjusted slightly:
$$\text{s.d.} = \sqrt{\frac{\sum f_i x_i^2}{\sum f_i} - \bar{x}^2}$$
Example: A class of 20 students were asked how many times
they exercise in a normal week.
Find the mean and the standard deviation.

Number of times exercise taken    Frequency
0                                 5
1                                 3
2                                 5
3                                 4
4                                 2
5                                 1
The table can be extended to help find the mean and the s.d.

No. of times exercise taken, x    Frequency, f    x × f    x² × f
0                                 5               0        0
1                                 3               3        3
2                                 5               10       20
3                                 4               12       36
4                                 2               8        32
5                                 1               5        25
TOTAL:                            20              38       116

$$\bar{x} = \frac{38}{20} = 1.9$$
$$\text{s.d.} = \sqrt{\frac{\sum f_i x_i^2}{\sum f_i} - \bar{x}^2} = \sqrt{\frac{116}{20} - 1.9^2} = 1.48$$
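The extended-table method can be sketched in Python: accumulate the totals of f, x × f and x² × f, then apply the frequency-table formula. The function name `frequency_mean_sd` and the list-of-pairs layout are illustrative assumptions.

```python
import math


def frequency_mean_sd(table):
    """Mean and s.d. from (value, frequency) pairs, dividing by sum(f)."""
    total_f = sum(f for _, f in table)
    total_xf = sum(x * f for x, f in table)
    total_x2f = sum(x * x * f for x, f in table)
    mean = total_xf / total_f
    sd = math.sqrt(total_x2f / total_f - mean ** 2)
    return mean, sd


# (number of times exercise taken, frequency) pairs from the example.
exercise = [(0, 5), (1, 3), (2, 5), (3, 4), (4, 2), (5, 1)]
exercise_mean, exercise_sd = frequency_mean_sd(exercise)

print(exercise_mean, round(exercise_sd, 2))  # 1.9 1.48
```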
If data is presented in a grouped frequency table, it is only
possible to estimate the mean and the standard deviation.
This is because the exact data values are not known.
An estimate is obtained by using the mid-point of an interval to
represent each of the values in that interval.
Example: The table shows the annual mileage for the employees
of an insurance company.
Estimate the mean and standard deviation.

Annual mileage, x        Frequency
0 ≤ x < 5000             6
5000 ≤ x < 10,000        17
10,000 ≤ x < 15,000      14
15,000 ≤ x < 20,000      5
20,000 ≤ x < 30,000      3
Mileage                  Frequency, f    Mid-point, x    f × x      f × x²
0 ≤ x < 5000             6               2500            15,000     37,500,000
5000 ≤ x < 10,000        17              7500            127,500    956,250,000
10,000 ≤ x < 15,000      14              12,500          175,000    2,187,500,000
15,000 ≤ x < 20,000      5               17,500          87,500     1,531,250,000
20,000 ≤ x < 30,000      3               25,000          75,000     1,875,000,000
TOTAL                    45                              480,000    6,587,500,000

$$\bar{x} = \frac{480{,}000}{45} = 10{,}667 \text{ miles}$$
$$\text{s.d.} = \sqrt{\frac{6{,}587{,}500{,}000}{45} - \bar{x}^2} = 5711 \text{ miles}$$
The s.d. is given to the nearest mile using the unrounded mean;
substituting the rounded value 10,667 gives 5710 instead.
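The mid-point estimate can be sketched in Python. The function name `grouped_estimates` and the (lower, upper, frequency) layout are illustrative assumptions; the data are the mileage classes from the table.

```python
import math


def grouped_estimates(groups):
    """Estimate mean and s.d. from (lower, upper, frequency) classes,
    using each class mid-point to stand in for every value in the class."""
    total_f = total_xf = total_x2f = 0
    for lower, upper, f in groups:
        mid = (lower + upper) / 2
        total_f += f
        total_xf += f * mid
        total_x2f += f * mid * mid
    mean = total_xf / total_f
    sd = math.sqrt(total_x2f / total_f - mean ** 2)
    return mean, sd


mileage = [
    (0, 5000, 6),
    (5000, 10000, 17),
    (10000, 15000, 14),
    (15000, 20000, 5),
    (20000, 30000, 3),
]
est_mean, est_sd = grouped_estimates(mileage)

print(round(est_mean), round(est_sd))  # 10667 5711
```

Because the sketch keeps the unrounded mean throughout, it gives 5711 miles for the standard deviation.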
Notes about standard deviation
Here are some notes to consider about standard deviation.
In many distributions, particularly those that are roughly
symmetrical and bell-shaped, about two-thirds of the data
(roughly 68% for a normal distribution) will lie within
1 standard deviation of the mean, whilst nearly all the
data values (about 95%) will lie within 2 standard deviations
of the mean.
Values that lie more than 2 standard deviations from the
mean are sometimes classed as outliers. Any such
values should be treated carefully.
Standard deviation is measured in the same units as the
original data. Variance is measured in those units
squared.
Most calculators have a built-in function which will find
the standard deviation for you. Learn how to use this
facility on your calculator.
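The rule of thumb about values more than 2 standard deviations from the mean can be sketched in Python. The function name `beyond_two_sd` and the extra reading of 40 are assumptions made for illustration; the first seven values are the June temperatures used earlier.

```python
import math


def beyond_two_sd(values):
    """Return the values lying more than 2 standard deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return [x for x in values if abs(x - mean) > 2 * sd]


temps = [21, 23, 24, 19, 19, 20, 21]
flagged_original = beyond_two_sd(temps)
flagged_with_extra = beyond_two_sd(temps + [40])  # 40 is a made-up extreme reading

print(flagged_original)    # []
print(flagged_with_extra)  # [40]
```

A flagged value is only a candidate outlier; whether to keep it is a judgement about the data, not something the rule decides.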
Examination-style question
The ages of the people in a cinema queue one Monday
afternoon are shown in the stem-and-leaf diagram:

Key: 2 | 3 means 23 years old
2 | 3 6
3 | 1 6 6
4 | 1 2 5 6 9
5 | 0 4 7
6 | 1

a) Explain why the diagram suggests that the mean and
standard deviation can be sensibly used as measures of
location and spread respectively.
b) Calculate the mean and the standard deviation of the ages.
c) The mean and the standard deviation of the ages of the
people in the queue on Monday evening were 29 and
6.2 respectively. Compare the ages of the people
queuing at the cinema in the afternoon with those in the
evening.
a) The mean and the standard deviation are appropriate, as
the distribution of ages is roughly symmetrical and
there are no outliers.
b) $\sum x_i = 597$ so,
$$\bar{x} = \frac{597}{14} = 42.64286 = 42.6 \text{ (3 s.f.)}$$
$\sum x_i^2 = 27{,}131$ so,
$$\text{s.d.} = \sqrt{\frac{27{,}131}{14} - 42.64286^2} = 10.9 \text{ (3 s.f.)}$$
c) The cinemagoers in the evening had a smaller mean
age, meaning that they were, on average, younger
than those in the afternoon.
The standard deviation for the ages in the evening was
also smaller, suggesting that the evening audience were
closer together in age.
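The totals in part b) can be checked with a Python sketch that expands the stem-and-leaf diagram into individual ages. The dictionary layout (stem mapped to its leaves) is an illustrative assumption.

```python
import math

# Stem-and-leaf diagram: stem (tens) mapped to its leaves (units).
stem_and_leaf = {
    2: [3, 6],
    3: [1, 6, 6],
    4: [1, 2, 5, 6, 9],
    5: [0, 4, 7],
    6: [1],
}

# Rebuild each age as 10 * stem + leaf.
ages = [10 * stem + leaf for stem, leaves in stem_and_leaf.items() for leaf in leaves]

n = len(ages)
total = sum(ages)
total_sq = sum(a * a for a in ages)
age_mean = total / n
age_sd = math.sqrt(total_sq / n - age_mean ** 2)

print(n, total, total_sq)                    # 14 597 27131
print(round(age_mean, 1), round(age_sd, 1))  # 42.6 10.9
```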
Sometimes in examination questions you are asked to pool
two sets of data together.
Combining sets of data
Example: Six male and five female students sit an
A-level examination.
The mean marks were 52% and 57% for the males
and females respectively. The standard deviations
were 14 and 18 respectively.
Find the combined mean and the standard deviation
for the marks of all 11 students.
Let $x_1, \ldots, x_6$ be the marks for the 6 male students.
Let $y_1, \ldots, y_5$ be the marks of the 5 female students.
To find the overall mean, we first need to find the total
marks for all 11 students.
As $\bar{x} = 52$, $\sum x = 6 \times 52 = 312$
As $\bar{y} = 57$, $\sum y = 5 \times 57 = 285$
Therefore
$$\sum x + \sum y = 312 + 285 = 597$$
So the combined mean is:
$$\frac{597}{11} = 54.2727\ldots = 54.3\% \text{ (to 3 s.f.)}$$
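To find the overall standard deviation, we need to find the
total of the marks squared for all 11 students.
Notice that the formula
$$\text{s.d.} = \sqrt{\frac{\sum x_i^2}{n} - \bar{x}^2}$$
rearranges to give
$$\sum x^2 = n\left((\text{s.d.})^2 + \bar{x}^2\right)$$
As $\text{s.d.}_x = 14$, $\sum x^2 = 6 \times (14^2 + 52^2) = 17{,}400$
As $\text{s.d.}_y = 18$, $\sum y^2 = 5 \times (18^2 + 57^2) = 17{,}865$
Therefore,
$$\sum x^2 + \sum y^2 = 35{,}265$$
So the combined s.d. is:
$$\sqrt{\frac{35{,}265}{11} - 54.27^2} = 16.1\% \text{ (to 3 s.f.)}$$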
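The pooling method can be sketched in Python: recover each group's total and total of squares from its size, mean and s.d., add them, then apply the usual formulae. The function name `combine` and the (n, mean, sd) layout are illustrative assumptions.

```python
import math


def combine(groups):
    """Pool groups given as (n, mean, sd) triples into an overall (mean, sd).

    Uses sum(x) = n * mean and sum(x^2) = n * (sd^2 + mean^2) for each group.
    """
    total_n = sum(n for n, _, _ in groups)
    total_x = sum(n * mean for n, mean, _ in groups)
    total_x2 = sum(n * (sd ** 2 + mean ** 2) for n, mean, sd in groups)
    pooled_mean = total_x / total_n
    pooled_sd = math.sqrt(total_x2 / total_n - pooled_mean ** 2)
    return pooled_mean, pooled_sd


# 6 male students (mean 52, s.d. 14) and 5 female students (mean 57, s.d. 18).
pooled_mean, pooled_sd = combine([(6, 52, 14), (5, 57, 18)])

print(round(pooled_mean, 1), round(pooled_sd, 1))  # 54.3 16.1
```

The combined standard deviation is not an average of 14 and 18, because the gap between the two group means also adds to the overall spread.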
Coding
Coding is a technique that can simplify the numerical effort
required in finding a mean or standard deviation.
Adding
If a number b is added to each piece of data, the
mean value is also increased by b.
The standard deviation is unchanged.
Multiplying
If each piece of data is multiplied by a, the mean value
is multiplied by a.
The standard deviation is also multiplied by a
(for positive a; in general it is multiplied by |a|).
More formally, if $y_i = a x_i + b$ then:
$$\bar{y} = a\bar{x} + b$$
$$\text{s.d.}_y = a \times \text{s.d.}_x \quad (a > 0)$$
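The two coding rules can be checked numerically with a short Python sketch. The values a = 3 and b = 10 are arbitrary choices made for illustration, and the data are the June temperatures from the earlier example; `statistics.pstdev` is the standard library's divide-by-n standard deviation.

```python
import math
import statistics

data = [21, 23, 24, 19, 19, 20, 21]
a, b = 3, 10  # arbitrary illustrative coding: y = a * x + b

coded = [a * x + b for x in data]

mean_x = statistics.mean(data)
sd_x = statistics.pstdev(data)
mean_y = statistics.mean(coded)
sd_y = statistics.pstdev(coded)

# The mean follows the full transformation; the s.d. ignores the shift b.
print(math.isclose(mean_y, a * mean_x + b))  # True
print(math.isclose(sd_y, a * sd_x))          # True
```

With a negative multiplier the spread cannot become negative, so the coded standard deviation is |a| times the original.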
Example: Find the mean and the standard deviation of the
values in the table. Use the transformation below to help you.
$$y = \frac{1}{10}x - 5$$
Using the given transformation, add a y column to the table.

x     Frequency    y
50    3            0
60    5            1
70    7            2
80    4            3
90    1            4
y        Frequency, f    y × f    y² × f
0        3               0        0
1        5               5        5
2        7               14       28
3        4               12       36
4        1               4        16
Total    20              35       85

To find the mean:
$$\bar{y} = \frac{35}{20} = 1.75$$
To find the s.d.:
$$\text{s.d.}_y = \sqrt{\frac{\sum f_i y_i^2}{\sum f_i} - \bar{y}^2} = \sqrt{\frac{85}{20} - 1.75^2} = 1.09$$
You have now found the mean and standard deviation of y.
To find them for the x values, you must reverse the coding.
We can rearrange:
$$y = \frac{1}{10}x - 5$$
to get:
$$x = 10y + 50$$
Therefore the mean of x is:
$$\bar{x} = 10\bar{y} + 50 = 10 \times 1.75 + 50 = 67.5$$
And the standard deviation of x is: 10 × 1.09 = 10.9
Note how the coding helped to simplify the
calculations by making the numbers smaller.
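The whole coding example can be sketched in Python: code the x values, find the mean and s.d. of y, reverse the coding, and compare with a direct calculation on x. The helper name `frequency_mean_sd` is an illustrative assumption.

```python
import math


def frequency_mean_sd(table):
    """Mean and s.d. from (value, frequency) pairs, dividing by sum(f)."""
    total_f = sum(f for _, f in table)
    mean = sum(v * f for v, f in table) / total_f
    sd = math.sqrt(sum(v * v * f for v, f in table) / total_f - mean ** 2)
    return mean, sd


table_x = [(50, 3), (60, 5), (70, 7), (80, 4), (90, 1)]

# Code the data with y = x/10 - 5, written as (x - 50) // 10 to stay in integers.
table_y = [((x - 50) // 10, f) for x, f in table_x]

mean_y, sd_y = frequency_mean_sd(table_y)

# Reverse the coding: x = 10y + 50, so the mean is shifted and scaled,
# while the s.d. is only scaled.
mean_x = 10 * mean_y + 50
sd_x = 10 * sd_y

# Direct calculation on the uncoded values, for comparison.
direct_mean, direct_sd = frequency_mean_sd(table_x)

print(mean_y, round(sd_y, 2))  # 1.75 1.09
print(mean_x, round(sd_x, 1))  # 67.5 10.9
```

Both routes agree; the coded route simply works with smaller numbers, which matters more for hand calculation than for code.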