Chapter 3.4 Measures of Central Tendency
Uploaded by: nilanjan-bhaumik
Measures of
Central Tendency
AGBS Bangalore | 2013
Central Tendency
• In general terms, central tendency is a
statistical measure that determines a
single value that accurately describes the
center of the distribution and represents
the entire distribution of scores.
• The goal of central tendency is to identify
the single value that is the best
representative for the entire set of data.
Central Tendency (cont.)
• By identifying the "average score," central tendency allows researchers to summarize or condense a large set of data into a single value.
• Thus, central tendency serves as a descriptive statistic because it allows researchers to describe or present a set of data in a very simplified, concise form.
• In addition, it is possible to compare two (or more) sets of data by simply comparing the average score (central tendency) for one set versus the average score for another set.
The Mean, the Median,
& the Mode
• It is essential that central tendency be
determined by an objective and well-defined
procedure so that others will understand exactly
how the "average" value was obtained and can
duplicate the process.
• No single procedure always produces a
good, representative value. Therefore,
researchers have developed three commonly
used techniques for measuring central tendency:
the mean, the median, and the mode.
The Mean
• The mean is the most commonly used measure of central tendency.
• Computation of the mean requires scores that are numerical values measured on an interval or ratio scale.
• The mean is obtained by computing the sum, or total, for the entire set of scores, then dividing this sum by the number of scores.
The Mean (cont.)
Conceptually, the mean can also be defined as:
1. The mean is the amount that each individual
receives when the total (ΣX) is divided equally
among all N individuals.
2. The mean is the balance point of the
distribution because the sum of the distances
below the mean is exactly equal to the sum of
the distances above the mean.
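Both definitions can be checked numerically. The following is a minimal sketch, assuming a small set of made-up scores (they are not taken from the slides); the function name mean is chosen here for illustration.

```python
def mean(scores):
    """Arithmetic mean: the total (sum of X) divided equally among N scores."""
    if not scores:
        raise ValueError("mean requires at least one score")
    return sum(scores) / len(scores)


scores = [3, 5, 6, 10]  # hypothetical scores, not from the slides
m = mean(scores)        # total is 24, N is 4

# Balance point: total distance below the mean equals total distance above it.
below = sum(m - x for x in scores if x < m)
above = sum(x - m for x in scores if x > m)
print(m, below, above)
```

With these scores the mean is 6, and the distances on each side of it both add up to 4.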
Calculate the Mean number of credit hours
Calculate the Mean Salary
Calculate the Median Salary
Changing the Mean
• Because the calculation of the mean involves every score in the distribution, changing the value of any score will change the value of the mean.
• Modifying a distribution by discarding scores or by adding new scores will usually change the value of the mean.
• To determine how the mean will be affected for any specific situation you must consider: 1) how the number of scores is affected, and 2) how the sum of the scores is affected.
Changing the Mean (cont.)
• If a constant value is added to every score
in a distribution, then the same constant
value is added to the mean.
• Also, if every score is multiplied by a
constant value, then the mean is also
multiplied by the same constant value.
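These rules for changing the mean can be illustrated with a short sketch. The scores and constants below are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
from statistics import mean

scores = [2, 4, 6, 8]                 # hypothetical scores; sum 20, N 4
mean_original = mean(scores)          # 20 / 4

# Adding a new score changes both the sum and the number of scores.
with_new = scores + [10]              # sum 30, N 5
mean_with_new = mean(with_new)

# Adding a constant to every score adds that constant to the mean.
mean_shifted = mean([x + 10 for x in scores])

# Multiplying every score by a constant multiplies the mean by it.
mean_scaled = mean([x * 3 for x in scores])

print(mean_original, mean_with_new, mean_shifted, mean_scaled)
```

Here the original mean of 5 becomes 6 after the new score is added, 15 after adding 10 to every score, and 15 after multiplying every score by 3.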
When the Mean Won’t Work
• Although the mean is the most commonly used measure of central tendency, there are situations where the mean does not provide a good, representative value, and there are situations where you cannot compute a mean at all.
• When a distribution contains a few extreme scores (or is very skewed), the mean will be pulled toward the extremes (displaced toward the tail). In this case, the mean will not provide a "central" value.
When the Mean Won’t Work (cont.)
• With data from a nominal scale it is impossible to compute a mean, and when data are measured on an ordinal scale (ranks), it is usually inappropriate to compute a mean.
• Thus, the mean does not always work as a measure of central tendency and it is necessary to have alternative procedures available.
The Median
• If the scores in a distribution are listed in order from smallest to largest, the median is defined as the midpoint of the list.
• The median divides the scores so that 50% of the scores in the distribution have values that are equal to or less than the median.
• Computation of the median requires scores that can be placed in rank order (smallest to largest).
The Median (cont.)
Usually, the median can be found by a
simple counting procedure:
1. With an odd number of scores, list the
values in order, and the median is the
middle score in the list.
2. With an even number of scores, list the
values in order, and the median is halfway
between the middle two scores.
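The counting procedure above can be sketched in a few lines. This is an illustrative implementation, not code from the slides, and the example scores are made up.

```python
def median(scores):
    """Median by counting: sort, then take the middle score (odd N)
    or the point halfway between the two middle scores (even N)."""
    ordered = sorted(scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("median requires at least one score")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_example = median([3, 1, 7])       # ordered: 1, 3, 7
even_example = median([8, 1, 4, 3])   # ordered: 1, 3, 4, 8
print(odd_example, even_example)
```

The odd-sized list gives the middle score, 3, and the even-sized list gives 3.5, halfway between 3 and 4.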
The Median (cont.)
• One advantage of the median is that it is
relatively unaffected by extreme
scores.
• Thus, the median tends to stay in the
"center" of the distribution even when
there are a few extreme scores or when
the distribution is very skewed. In these
situations, the median serves as a good
alternative to the mean.
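A small comparison shows the effect. The two score lists below are hypothetical and differ only in their largest value.

```python
from statistics import mean, median

typical = [4, 5, 5, 6, 7]    # hypothetical scores with no extreme value
skewed = [4, 5, 5, 6, 70]    # same scores, but the largest is now extreme

mean_typical, median_typical = mean(typical), median(typical)
mean_skewed, median_skewed = mean(skewed), median(skewed)

print(mean_typical, median_typical)
print(mean_skewed, median_skewed)
```

The single extreme score pulls the mean from 5.4 up to 18, while the median stays at 5 in both lists.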
Median for Grouped Frequency Distribution
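The worked slides for this topic are not preserved in the transcript. As a hedged sketch, the usual textbook interpolation formula is Median = L + ((N/2 - cf) / f) × h, where L is the lower limit of the median class, cf is the cumulative frequency below that class, f is its frequency, and h is the class width. The function name, its parameters, and the example distribution below are assumptions made for illustration, and equal class widths are assumed.

```python
def grouped_median(lower_limits, width, freqs):
    """Median of a grouped frequency distribution by linear interpolation.

    Assumes contiguous classes of equal width, listed from lowest to highest.
    """
    half = sum(freqs) / 2
    cumulative = 0  # cumulative frequency below the current class
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cumulative + f >= half:
            # Median class found: interpolate inside it.
            return lower + (half - cumulative) * width / f
        cumulative += f
    raise ValueError("distribution has no observations")


# Hypothetical classes 0-10, 10-20, 20-30, 30-40 (not from the slides).
example_median = grouped_median([0, 10, 20, 30], 10, [5, 10, 20, 5])
print(example_median)
```

In this example N/2 = 20 falls in the 20-30 class (15 scores lie below it), so the median is 20 + (5 / 20) × 10 = 22.5.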
The Mode
• The most common observation in a group of scores.
– Distributions can be unimodal, bimodal, or multimodal.
• If the data is categorical (measured on the nominal scale)
then only the mode can be calculated.
• The most frequently occurring score (mode) below is
Vanilla.
[Figure: bar chart of frequency (f) by flavor, vertical axis from 0 to 30]
Flavor          f
Vanilla        28
Chocolate      22
Strawberry     15
Neapolitan      8
Butter Pecan   12
Rocky Road      9
Fudge Ripple    6
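Finding the mode from a frequency table amounts to picking the category with the highest count. The sketch below uses the flavor frequencies from the table. The helper name modes is hypothetical, and returning an empty list when every value is equally frequent is an assumed convention for "no mode".

```python
from collections import Counter

flavor_counts = Counter({
    "Vanilla": 28, "Chocolate": 22, "Strawberry": 15, "Neapolitan": 8,
    "Butter Pecan": 12, "Rocky Road": 9, "Fudge Ripple": 6,
})


def modes(counts):
    """Return every value tied for the highest frequency.

    Assumed convention: if there are several distinct values and all of
    them are equally frequent, report no mode (an empty list).
    """
    if not counts:
        return []
    top = max(counts.values())
    tied = [value for value, f in counts.items() if f == top]
    if len(counts) > 1 and len(tied) == len(counts):
        return []
    return tied


print(modes(flavor_counts))
print(modes(Counter([1, 1, 2, 2, 3])))  # a bimodal example
```

For the flavor table this returns Vanilla alone; the second call returns both 1 and 2, a bimodal distribution.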
The Characteristics of the Mode
• Value that occurs most often
• Not affected by extreme values
• Used for either numerical or categorical
(nominal) data
• There may be no mode
• There may be several modes
[Figure: two number-line plots. On the 0 to 14 axis, Mode = 9. On the 0 to 6 axis, No Mode.]
The Mode
• The mode can also be calculated with
ordinal and higher data, but it often is not
appropriate.
– If other measures can be calculated, the
mode would never be the first choice!
• 7, 7, 7, 20, 23, 23, 24, 25, 26 has a mode
of 7, but 7 is the smallest score and says
little about the center of the distribution.
Mode for Grouped Frequency Distribution
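The slide content for this topic is not preserved in the transcript. As a hedged sketch, the usual textbook interpolation formula is Mode = L + ((f1 - f0) / (2·f1 - f0 - f2)) × h, where L is the lower limit of the modal class, f1 is its frequency, f0 and f2 are the frequencies of the classes just below and just above it, and h is the class width. The function name, parameters, and example distribution are assumptions made for illustration.

```python
def grouped_mode(lower_limits, width, freqs):
    """Mode of a grouped frequency distribution by interpolation.

    Assumes contiguous classes of equal width and a single modal class
    (ties are resolved in favor of the first class with the top frequency).
    """
    i = max(range(len(freqs)), key=freqs.__getitem__)  # index of modal class
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0               # class below, if any
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0  # class above, if any
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("modal class is not distinct from its neighbors")
    return lower_limits[i] + (f1 - f0) * width / denominator


# Hypothetical classes 0-10, 10-20, 20-30, 30-40 (not from the slides).
example_mode = grouped_mode([0, 10, 20, 30], 10, [5, 10, 20, 5])
print(example_mode)
```

Here the modal class is 20-30, so the mode is 20 + (10 / 25) × 10 = 24.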
Calculate the Mode
What is the mean rate of return here?
An investment of $100,000 declined to $50,000 at the end of year one and rebounded to $100,000 at end of year two:
The overall two-year return is zero, since it started and ended
at the same level.
X_1 = $100,000    X_2 = $50,000    X_3 = $100,000
Year one: 50% decrease (X_1 to X_2). Year two: 100% increase (X_2 to X_3).
The Geometric Mean & The Geometric Rate of Return
Geometric mean
Used to measure the rate of change of a variable over time
\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}
Geometric mean rate of return
Measures the status of an investment over time
\bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1
where R_i is the rate of return in time period i
The Geometric Mean Rate of Return: Example (continued)
Use the 1-year returns to compute the arithmetic mean and the geometric mean:
Arithmetic mean rate of return (misleading result):
\bar{X} = \frac{(-0.5) + (1)}{2} = 0.25 = 25\%
Geometric mean rate of return (more representative result):
\bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1
= [(1 + (-0.5)) \times (1 + (1))]^{1/2} - 1
= [(0.50) \times (2)]^{1/2} - 1 = 1^{1/2} - 1 = 0\%
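The same comparison can be reproduced in code. This is a minimal sketch using the two yearly returns from the investment example (a 50% decrease followed by a 100% increase); the function names are chosen here for illustration.

```python
import math


def arithmetic_mean_return(returns):
    """Simple average of the per-period rates of return."""
    return sum(returns) / len(returns)


def geometric_mean_return(returns):
    """[(1 + R_1) * (1 + R_2) * ... * (1 + R_n)]^(1/n) - 1."""
    growth = math.prod(1 + r for r in returns)
    if growth < 0:
        raise ValueError("a return below -100% makes the product negative")
    return growth ** (1 / len(returns)) - 1


returns = [-0.5, 1.0]  # year one: 50% decrease; year two: 100% increase
arithmetic = arithmetic_mean_return(returns)
geometric = geometric_mean_return(returns)
print(arithmetic, geometric)
```

The arithmetic mean reports 25% per year even though the investment ended where it started; the geometric mean reports 0%, which matches the overall two-year return.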
Measures of Central Tendency: Summary
• Arithmetic Mean: \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}
• Median: middle value in the ordered array
• Mode: most frequently observed value
• Geometric Mean: \bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}, the rate of change of a variable over time
Which measure to use?
The mean is generally considered the best measure of central tendency and
is the most frequently used one. However, there are some situations where
the other measures of central tendency are preferred.
The median is preferred to the mean when:
• There are a few extreme scores in the distribution.
• Some scores have undetermined values.
• There is an open-ended distribution.
• Data are measured on an ordinal scale.
The mode is the preferred measure when data are measured on a nominal
scale.
A geometric mean is often used when comparing different items (finding
a single "figure of merit" for them) when each item has multiple
properties that have different numeric ranges.
Quiz Time
1) A teacher gives a 10-point quiz to a class of 9 students. All
the scores are whole numbers and nobody got a 0 or a
perfect score of 10. If the median is 7, what is the lowest
possible mean? What is the highest possible mean?
2) A poll reports that out of 100 families surveyed, the mean
number of children per family was 2.038, the median was 1.9,
and the mode was 1.82. Which of these values must be
wrong (independently)?
3) What is the mode here?
Solve this!
Wages      0-10  10-20  20-30  30-40  40-50  50-60  60-70  Total
Frequency     4     16      ?      ?      ?      6      4    230
Median = 33.5; Mode = 34. Calculate the missing frequencies.
Now solve this!
Variable   10-20  20-30  30-40  40-50  50-60  60-70  70-80  Total
Frequency     12     30      ?     65      ?     25     18    229
Median = 46. Calculate the missing frequencies.