Measure of Central Tendency
One of the powerful tools of analysis is to calculate a
single average value that represents the entire mass of
the data. The word average is very commonly used in
day-to-day conversation. An “average” is a single
value which is considered as the most representative
or typical value for a given set of data. Such value lies
somewhere in the middle of the group. For this reason
an average is frequently referred to as a measure of
central tendency or central value.
Measures of central tendency show the central value
around which the data tend to cluster.
Objectives of Averaging:-
• To get a single value that describes the characteristics
of the entire data.
• To facilitate comparison.
Characteristics of a good average
• It should be easy to understand.
• It should be simple to compute.
• It should be based on all the observations.
• It should be rigidly defined.
• It should have sampling stability.
• It should be capable of further algebraic treatment.
• It should not be unduly affected by the presence of
extreme values.
The various measures of central tendency or averages commonly used are:-
• Arithmetic Mean (Simple Arithmetic Mean and Weighted Arithmetic Mean)
• Geometric Mean
• Harmonic Mean
• Median
• Mode
Arithmetic mean
The arithmetic mean (AM) is the most popular and widely used
measure for representing the entire data. Its value is obtained
by adding together all the observations and by
dividing this total by the number of observations.
Calculation of AM - Ungrouped data:-
Direct Method:-
$$\bar{X} = \frac{x_1 + x_2 + \dots + x_N}{N}$$
i.e.
$$\bar{X} = \frac{\sum_{i=1}^{N} x_i}{N}$$
Short – Cut Method:-
The AM can also be calculated by taking deviations from
any arbitrary point. In that case the formula is
$$\bar{X} = A + \frac{\sum d}{N}$$
where
A = arbitrary point or Assumed Mean
d = x - A, the deviation of each observation from A
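The direct and short-cut methods for ungrouped data can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function names and the `marks` data are hypothetical and chosen only to show that both methods give the same mean.

```python
def mean_direct(values):
    """Direct method: sum of the observations divided by their number N."""
    return sum(values) / len(values)


def mean_shortcut(values, assumed_mean):
    """Short-cut method: A + (sum of deviations d = x - A) / N."""
    deviations = [x - assumed_mean for x in values]
    return assumed_mean + sum(deviations) / len(values)


# Hypothetical observations, used only for illustration.
marks = [40, 50, 55, 78, 58]

print(mean_direct(marks))
print(mean_shortcut(marks, assumed_mean=55))
```

Any value may be used as the assumed mean; a value near the centre of the data simply keeps the deviations small.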
Calculation of AM – Grouped Data:-
Direct Method:-
$$\bar{X} = \frac{\sum f x}{N}$$
Where, x = mid point of various classes
f = frequency of each class
N = Total frequency i.e. $N = \sum f$
Short – Cut Method:-
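$$\bar{X} = A + \frac{\sum f d}{N} \times h$$
Where,
A = arbitrary point or Assumed Mean
d = $\frac{x - A}{h}$, with x the mid point of each class
N = Total frequency i.e. $N = \sum f$
h = the width of the class interval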
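For grouped data, the direct method and the short-cut method with deviations scaled by the class width can be sketched as below. The frequency table is hypothetical (classes 0-10, 10-20, ..., 40-50), and the function names are illustrative assumptions.

```python
def grouped_mean_direct(midpoints, freqs):
    """Direct method: sum(f * x) / N, where N is the total frequency."""
    total = sum(freqs)
    return sum(f * x for x, f in zip(midpoints, freqs)) / total


def grouped_mean_shortcut(midpoints, freqs, assumed_mean, width):
    """Short-cut method: A + (sum(f * d) / N) * h, with d = (x - A) / h."""
    total = sum(freqs)
    scaled_dev = [(x - assumed_mean) / width for x in midpoints]
    return assumed_mean + sum(f * d for f, d in zip(freqs, scaled_dev)) / total * width


# Hypothetical frequency table: classes 0-10, 10-20, 20-30, 30-40, 40-50.
midpoints = [5, 15, 25, 35, 45]
freqs = [3, 7, 12, 6, 2]

print(grouped_mean_direct(midpoints, freqs))
print(grouped_mean_shortcut(midpoints, freqs, assumed_mean=25, width=10))
```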
Mathematical properties of Arithmetic Mean
• The algebraic sum of the deviations of all
observations from AM is always zero.
i.e. $\sum (x - \bar{x}) = 0$
• The sum of the squared deviations of all the
observations from AM is minimum.
i.e. $\sum (x - \bar{x})^2 \le \sum (x - A)^2$ for any value A
• If we have the AM and number of the observations
of two or more related groups, we can
compute the combined average of these groups:
$$\bar{x}_{12} = \frac{N_1 \bar{x}_1 + N_2 \bar{x}_2}{N_1 + N_2}$$
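These properties can be checked numerically. The sketch below uses two small hypothetical groups; `combined_mean` is an illustrative helper name, not something defined in the slides.

```python
def mean(values):
    return sum(values) / len(values)


def combined_mean(n1, mean1, n2, mean2):
    """Combined AM of two groups from their sizes and individual means."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


# Hypothetical groups, used only for illustration.
group_a = [10, 20, 30]
group_b = [40, 50]

# Property 1: deviations from the AM sum to zero.
dev_sum = sum(x - mean(group_a) for x in group_a)

# Property 2: squared deviations are smallest when taken about the AM.
sq_about_mean = sum((x - mean(group_a)) ** 2 for x in group_a)
sq_about_other = sum((x - 25) ** 2 for x in group_a)  # 25 is an arbitrary point A

# Property 3: the combined mean equals the mean of the pooled data.
pooled = mean(group_a + group_b)
combined = combined_mean(len(group_a), mean(group_a), len(group_b), mean(group_b))

print(dev_sum, sq_about_mean, sq_about_other, pooled, combined)
```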
Merits:-
• The calculation of AM is simple and it is unique,
that is, every data has one and only one mean.
• The calculation of AM is based on all the values
given in the data set.
• The AM is a reliable single value that reflects all
values in the data set.
• The AM is least affected by fluctuations in the
sample size.
Limitations:-
• The value of AM cannot be calculated accurately for
unequal and open ended class intervals.
• It is very much affected by the extreme observations
which are not representative of the rest of the data.
• The calculation of the AM sometimes becomes
difficult because every data element is used in the
calculation.
Weighted Arithmetic Mean:
The AM as discussed earlier, gives equal importance
to each observation in the data set. However, there
are situations in which values of individual
observations in the data set are not of equal
importance. Under these circumstances, we may
attach to each observation a ‘weight’ $w_1, w_2, \ldots, w_n$
as an indicator of its importance. The
formula for computing the weighted AM is
$$\bar{X}_w = \frac{\sum w x}{\sum w}$$
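A minimal sketch of the weighted AM follows. The scores and weights are hypothetical, and `weighted_mean` is an illustrative name.

```python
def weighted_mean(values, weights):
    """Weighted AM: sum(w * x) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Hypothetical scores with weights reflecting their relative importance.
scores = [70, 80, 90]
weights = [1, 2, 3]

print(weighted_mean(scores, weights))
print(weighted_mean(scores, [1, 1, 1]))  # equal weights give the simple AM
```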
Geometric Mean:-
Geometric mean is defined as the Nth root of the product of N observations of a given data. If there are two observations, we take the square root; if three, the cube root; and so on.
$$GM = \sqrt[N]{x_1 x_2 x_3 \cdots x_N}$$
To simplify calculations, logarithms are used:
$$\log GM = \frac{1}{N}\left(\log x_1 + \log x_2 + \dots + \log x_N\right) = \frac{\sum \log x}{N}$$
$$GM = \text{antilog}\left(\frac{\sum \log x}{N}\right)$$
For grouped data the GM is calculated as
$$GM = \left(x_1^{f_1} \, x_2^{f_2} \cdots x_n^{f_n}\right)^{1/N}$$
$$GM = \text{antilog}\left(\frac{\sum f \log x}{N}\right)$$
where $N = \sum f$.
Application of GM:-
1. The GM is used to find the average percent increase in sales, production, population or other economic or business data.
2. It is an average which is most suitable when large weights have to be given to small values of observation and vice-versa.
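The logarithmic form of the GM can be sketched as below. Natural logarithms are used here as an assumption of convenience (the base does not change the result), and the growth factors are hypothetical figures for three years of sales growth of 10%, 20% and 30%.

```python
import math


def geometric_mean(values):
    """GM via logarithms: antilog(sum(log x) / N). Requires positive values."""
    if any(x <= 0 for x in values):
        raise ValueError("GM is undefined for zero or negative observations")
    return math.exp(sum(math.log(x) for x in values) / len(values))


def grouped_geometric_mean(values, freqs):
    """Grouped GM: antilog(sum(f * log x) / N), with N the total frequency."""
    total = sum(freqs)
    return math.exp(sum(f * math.log(x) for x, f in zip(values, freqs)) / total)


# Hypothetical yearly growth factors (1.10 means a 10% increase).
growth_factors = [1.10, 1.20, 1.30]
average_growth = geometric_mean(growth_factors) - 1

print(geometric_mean([2, 8]))
print(average_growth)
```

Averaging the growth factors rather than the percentages is what makes the GM suitable here: compounding the average factor three times reproduces the total growth.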
Merits:-
1. The value of GM is not much affected by extreme observations and is computed by taking all the observations into account.
2. It is useful in averaging ratios and percentages as well as in determining rates of increase and decrease.
Limitations:-
1. The calculation of GM is more difficult than that of AM.
2. The value of GM cannot be calculated when any observation in the data set is negative or zero.
Harmonic Mean:
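The harmonic mean (HM) of a set of observations is defined as the reciprocal of the arithmetic mean of the reciprocals of the observations, i.e.
$$HM = \frac{N}{\frac{1}{x_1} + \frac{1}{x_2} + \dots + \frac{1}{x_N}} = \frac{N}{\sum \frac{1}{x}}$$
For Grouped Data:
$$HM = \frac{N}{\sum f \left(\frac{1}{x}\right)}$$
or
$$HM = \frac{N}{\sum \frac{f}{x}}$$
where $N = \sum f$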
Applications:-
The harmonic mean is a measure of central tendency for data expressed as rates, such as km per hour, tonnes per day, quantity per liter etc.
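The use of the HM for rates can be sketched with a hypothetical trip: the same distance is covered once at 40 km per hour and once at 60 km per hour. The function names and figures are illustrative assumptions.

```python
def harmonic_mean(values):
    """HM: N divided by the sum of the reciprocals of the observations."""
    return len(values) / sum(1 / x for x in values)


def grouped_harmonic_mean(values, freqs):
    """Grouped HM: N / sum(f / x), with N the total frequency."""
    return sum(freqs) / sum(f / x for x, f in zip(values, freqs))


# Hypothetical speeds over two legs of equal distance.
speeds = [40, 60]

print(harmonic_mean(speeds))                  # average speed for the whole trip
print(grouped_harmonic_mean(speeds, [1, 1]))  # same data written as a frequency table
```

The simple AM of the two speeds would be 50, which overstates the true average speed because more time is spent on the slower leg.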
Merits:-
1. The HM of given data is based on all the observations.
2. It is useful in special cases for averaging rates.
Limitations:-
1. The HM is not often used for analyzing business problems.
2. Computing the HM involves complicated calculations.
Relationship among AM, GM and HM:-
For any set of positive observations, the AM, GM and HM are related by
$$AM \ge GM \ge HM$$
The sign of ‘=’ holds if and only if all the observations are identical.
If the values of any two means are given, the value of the third mean can be calculated (the relation is exact for two observations):-
$$GM = \sqrt{\bar{X} \cdot HM}$$
Or
$$GM^2 = \bar{X} \cdot HM$$
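The ordering of the three means can be checked numerically. The sketch below assumes a pair of positive observations, for which the product relation between AM, GM and HM holds exactly; the data and function names are hypothetical.

```python
import math


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    return math.exp(sum(math.log(x) for x in values) / len(values))


def harmonic_mean(values):
    return len(values) / sum(1 / x for x in values)


# Hypothetical pair of positive observations.
pair = [4, 9]
am = arithmetic_mean(pair)
gm = geometric_mean(pair)
hm = harmonic_mean(pair)

print(am, gm, hm)
print(am >= gm >= hm)                   # ordering AM >= GM >= HM
print(math.isclose(gm ** 2, am * hm))   # GM squared equals AM times HM for two values
```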
Median:-
Median may be defined as the middle value in the data
set when its elements are arranged in sequential
order, i.e. ascending or descending. Half the
observations in a set of data are lower than it and
half of the observations are greater than it.
Median is also known as positional average.
Calculation of Median:- Ungrouped Data
Arrange the data in ascending or descending order of magnitude.
If the number of observations (N) is an odd number, then
Median = size or value of the $\left(\frac{N+1}{2}\right)$th observation in the data set.
If the number of observations (N) is an even number, then the median is
$$\text{Median} = \frac{\left(\frac{N}{2}\right)\text{th observation} + \left(\frac{N}{2} + 1\right)\text{th observation}}{2}$$
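The odd and even cases can be sketched as follows. The function name `median` and the sample data are illustrative; note that Python lists are 0-based, so the kth observation sits at index k - 1.

```python
def median(values):
    """Median of ungrouped data after arranging it in ascending order."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd N: the ((N + 1) / 2)th observation, at 0-based index N // 2.
        return ordered[mid]
    # Even N: average of the (N / 2)th and (N / 2 + 1)th observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical observations.
print(median([7, 3, 9]))
print(median([7, 3, 9, 1]))
```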
Calculation of Median – Grouped Data
First identify the class interval which contains the median value, i.e. the $\left(\frac{N}{2}\right)$th observation of the data set.
$$\text{Median} = L + \frac{\frac{N}{2} - c.f.}{f} \times h$$
Where L is lower limit of median class
c.f. is the cumulative frequency of the class preceding the
median class
f is frequency of the median class
h is the width of the class interval of the median class
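The two steps for grouped data, locating the median class and then interpolating within it, can be sketched as below. The sketch assumes classes of equal width listed in ascending order; the table and the name `grouped_median` are hypothetical.

```python
def grouped_median(lower_limits, freqs, width):
    """Median = L + ((N / 2 - c.f.) / f) * h for equal-width classes."""
    total = sum(freqs)
    if total == 0:
        raise ValueError("frequency table is empty")
    target = total / 2
    cumulative = 0  # c.f. of the classes before the current one
    for lower, f in zip(lower_limits, freqs):
        if cumulative + f >= target:
            # This is the median class: it contains the (N / 2)th observation.
            return lower + (target - cumulative) / f * width
        cumulative += f


# Hypothetical frequency table: classes 0-10, 10-20, 20-30, 30-40, 40-50.
lower_limits = [0, 10, 20, 30, 40]
freqs = [3, 7, 12, 6, 2]

print(grouped_median(lower_limits, freqs, width=10))
```

Here N = 30, so the 15th observation falls in the class 20-30, whose preceding cumulative frequency is 10.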
Merits:-
1. Median is unique, i.e. like the mean, there is only one median for a set of data.
2. The value of median is easy to understand and may be calculated from any type of data.
3. The sum of absolute differences of all the observations in the data set from the median value is minimum, i.e. $\sum |x - \text{Med}|$ is minimum.
4. The extreme values in the data set do not affect the calculation of the median value.
5. The median value may be calculated for an open-ended distribution of data set.
6. The median is considered the best statistical technique for studying the qualitative attributes of an observation in the data set.
Limitations:-
1. The median is not capable of algebraic treatment, i.e. the combined median of two or more sets of data cannot be determined from their individual medians.
2. The median is more affected by sampling fluctuations than the mean.
3. Median is an average of position, therefore arranging the data in ascending or descending order of magnitude is time consuming in case of a large number of observations.
Related positional measures i.e. Partition Values:
Quartiles:-
The values of observations in a data set, when arranged in an ordered sequence can be divided into four equal parts or quarters, using three quartiles namely Q1,Q2 and Q3.
The generalized formula for calculating quartiles in case of grouped data is:
$$Q_i = L + \frac{\frac{iN}{4} - c.f.}{f} \times h$$
For i = 1, 2, 3
Symbols have their usual meanings.
Deciles:-
The values of observations in a data set when arranged
in an ordered sequence can be divided into ten equal
parts, using nine deciles, Dj (j = 1, 2, …, 9).
The generalized formula for calculating deciles in case
of grouped data is:
$$D_j = L + \frac{\frac{jN}{10} - c.f.}{f} \times h$$
For j = 1, 2, 3, …, 9
Percentiles:
The value of observations in a data set when arranged
in an ordered sequence can be divided into hundred
equal parts, using ninety-nine percentiles, Pk (k = 1, 2,
…, 99).
The generalized formula for calculating percentiles in
case of grouped data is:
$$P_k = L + \frac{\frac{kN}{100} - c.f.}{f} \times h$$
For k = 1, 2, 3, …, 99
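Quartiles, deciles and percentiles for grouped data share one pattern: find the class containing the required fraction of N, then interpolate within it. A minimal sketch of that shared pattern follows, assuming equal-width classes in ascending order; `partition_value`, its parameters and the table are hypothetical.

```python
def partition_value(lower_limits, freqs, width, i, parts):
    """i-th partition value when the data are divided into `parts` equal parts.

    parts = 4 gives quartiles, 10 gives deciles, 100 gives percentiles.
    """
    if not 0 < i < parts:
        raise ValueError("i must lie strictly between 0 and parts")
    total = sum(freqs)
    if total == 0:
        raise ValueError("frequency table is empty")
    target = i * total / parts
    cumulative = 0  # c.f. of the classes before the current one
    for lower, f in zip(lower_limits, freqs):
        if cumulative + f >= target:
            return lower + (target - cumulative) / f * width
        cumulative += f


# Hypothetical frequency table: classes 0-10, 10-20, 20-30, 30-40, 40-50.
lower_limits = [0, 10, 20, 30, 40]
freqs = [3, 7, 12, 6, 2]

q1 = partition_value(lower_limits, freqs, 10, i=1, parts=4)
q3 = partition_value(lower_limits, freqs, 10, i=3, parts=4)
d5 = partition_value(lower_limits, freqs, 10, i=5, parts=10)
p50 = partition_value(lower_limits, freqs, 10, i=50, parts=100)
print(q1, q3, d5, p50)
```

The second quartile, fifth decile and fiftieth percentile all coincide with the median, which is a quick consistency check on any implementation.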
Mode:-
Mode is defined as that value which occurs the
maximum number of times i.e. having the maximum
frequency.
The concept of mode is of great use to large-scale
manufacturers of consumer items such as ready-made
garments and shoes. In all such
cases it is important to know the size that fits most
persons rather than the ‘mean size’.
Calculation of Mode:-
Ungrouped Data:- To determine the mode, count the
number of times each value occurs;
the value which occurs the
maximum number of times is the modal value.
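The counting rule for ungrouped data can be sketched as below. The shoe sizes are hypothetical, and `modes` is an illustrative name; it returns a list because more than one value may share the highest frequency.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs the maximum number of times."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


# Hypothetical shoe sizes sold in a day.
sizes = [7, 8, 8, 9, 8, 10, 9]

print(modes(sizes))
print(modes([1, 1, 2, 2, 3]))  # two values tie, so both are returned
```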
Grouped Data:- In discrete and continuous series if
items are concentrated at one value only then mode
can be calculated easily. But if items are
concentrated at more than one value, we find the
item of concentration by the method of grouping.
After finding the modal class we will use the following formula:-
$$\text{Mode} = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h$$
Where L is lower limit of the modal class
f1 is frequency of the modal class
f2 is frequency of the class succeeding the
modal class
f0 is frequency of the class preceding the
modal class
h is class interval of modal class
It must be noted that the value of mode must lie in
the modal class. If it does not lie in modal class, it
is considered to be incorrect. In such a situation we
use the following alternative formula
$$\text{Mode} = L + \frac{f_2}{f_0 + f_2} \times h$$
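The grouped-data mode, including the fall-back to the alternative formula when the first result lies outside the modal class, can be sketched as follows. The function name, parameter names and frequencies are hypothetical; the second call assumes a modal class chosen by the method of grouping whose own frequency is not the largest.

```python
def grouped_mode(lower, width, f0, f1, f2):
    """Mode of grouped data.

    lower: lower limit L of the modal class, width: class interval h,
    f0 / f1 / f2: frequencies of the preceding, modal and succeeding classes.
    """
    denominator = 2 * f1 - f0 - f2
    if denominator != 0:
        mode = lower + (f1 - f0) / denominator * width
        if lower <= mode <= lower + width:
            return mode
    if f0 + f2 == 0:
        raise ValueError("neighbouring frequencies are both zero")
    # Alternative formula, used when the first value falls outside the modal class.
    return lower + f2 / (f0 + f2) * width


# Hypothetical modal class 20-30 with neighbours 7 and 6.
print(grouped_mode(lower=20, width=10, f0=7, f1=12, f2=6))

# Hypothetical case where the first formula leaves the modal class.
print(grouped_mode(lower=20, width=10, f0=10, f1=8, f2=3))
```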
Merits:-
1. Mode value is easy to understand and to calculate.
Modal class can also be located by inspection.
2. The mode is not affected by the extreme values in
the distribution. The mode value can also be
calculated for open-ended frequency distributions.
3. The mode can be used to describe qualitative as
well as quantitative data.
Limitations:-
1. Mode is not a rigidly defined measure as there are
several methods for calculating its value.
2. It is difficult to locate modal class in the case of
multi-modal frequency distributions.
3. Mode is not suitable for algebraic manipulations.
4. When data sets contain more than one mode, such
values are difficult to interpret and compare.
Relationship between Mean, Median and Mode:-
In a symmetrical distribution, the values of mean, median and mode are equal. When these three values are not equal to each other, the distribution is not symmetrical.
For a moderately asymmetrical distribution, Karl Pearson suggested an empirical relationship between these three measures of central tendency:
$$\text{Mean} - \text{Mode} = 3(\text{Mean} - \text{Median})$$
or
$$\text{Mode} = 3\,\text{Median} - 2\,\text{Mean}$$
or
$$\text{Median} = \frac{1}{3}(\text{Mode} + 2\,\text{Mean})$$
or
$$\text{Mean} = \frac{1}{2}(3\,\text{Median} - \text{Mode})$$
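The empirical relationship lets any one of the three measures be estimated from the other two. A minimal sketch follows; the function names and the figures are hypothetical, and the results are approximations that are only meaningful when the distribution is moderately skewed.

```python
def empirical_mode(mean, median):
    """Mode = 3 * Median - 2 * Mean."""
    return 3 * median - 2 * mean


def empirical_median(mean, mode):
    """Median = (Mode + 2 * Mean) / 3."""
    return (mode + 2 * mean) / 3


def empirical_mean(median, mode):
    """Mean = (3 * Median - Mode) / 2."""
    return (3 * median - mode) / 2


# Hypothetical summary figures for a slightly skewed distribution.
mean_value, median_value = 24.0, 24.5
mode_value = empirical_mode(mean_value, median_value)

print(mode_value)
print(empirical_median(mean_value, mode_value))
print(empirical_mean(median_value, mode_value))
```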