Measures of Central Tendency

Dr. Vijay Kumar


Measure of Central Tendency

One of the powerful tools of analysis is to calculate a single average value that represents the entire mass of data. The word "average" is very commonly used in day-to-day conversation. An average is a single value that is considered the most representative or typical value for a given set of data. Such a value lies somewhere in the middle of the group. For this reason, an average is frequently referred to as a measure of central tendency or central value.


Measures of central tendency show the tendency of the data to cluster around some central value.

Objectives of Averaging:-

• To get a single value that describes the characteristics of the entire data.

• To facilitate comparison.


Characteristics of a good average

• It should be easy to understand.

• It should be simple to compute.

• It should be based on all the observations.

• It should be rigidly defined.

• It should have sampling stability.

• It should be capable of further algebraic treatment.

• It should not be unduly affected by the presence of extreme values.


The various measures of central tendency or averages commonly used are:-

• Arithmetic Mean (Simple Arithmetic Mean and Weighted Arithmetic Mean)

• Geometric Mean

• Harmonic Mean

• Median

• Mode


Arithmetic mean

The arithmetic mean (AM) is the most popular and widely used measure for representing the entire data. Its value is obtained by adding together all the observations and dividing this total by the number of observations.

Calculation of AM - Ungrouped data:-

Direct Method:-

$$\bar{X} = \frac{x_1 + x_2 + \dots + x_N}{N}$$

i.e.

$$\bar{X} = \frac{\sum_{i=1}^{N} x_i}{N}$$


Short – Cut Method:-

The AM can also be calculated by taking deviations from any arbitrary point. In that case the formula is

$$\bar{X} = A + \frac{\sum d}{N}$$

where

A = arbitrary point or assumed mean

$d = x - A$, the deviation of each observation from A
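As a minimal sketch, the direct and short-cut methods for ungrouped data can be compared in Python. The function names and the `marks` values are hypothetical and chosen only for illustration; whichever assumed mean is picked, both methods should give the same result.

```python
def mean_direct(values):
    """Direct method: sum of the observations divided by their number."""
    return sum(values) / len(values)


def mean_short_cut(values, assumed_mean):
    """Short-cut method: assumed mean plus the average deviation from it."""
    deviations = [x - assumed_mean for x in values]
    return assumed_mean + sum(deviations) / len(values)


marks = [12, 15, 11, 18, 14]      # hypothetical observations
print(mean_direct(marks))         # 14.0
print(mean_short_cut(marks, 15))  # 14.0
```

The short-cut method is mainly a convenience for hand calculation, because the deviations are smaller numbers than the raw observations.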


Calculation of AM – Grouped Data:-

Direct Method:-

$$\bar{X} = \frac{\sum f x}{N}$$

Where, x = mid-point of the various classes

f = frequency of each class

N = total frequency, i.e. $N = \sum f$


Short – Cut Method:-

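$$\bar{X} = A + \frac{\sum f d}{N} \times h$$

Where,

A = arbitrary point or assumed mean

$d = \frac{x - A}{h}$, the step deviation of each class mid-point x from A

N = total frequency, i.e. $N = \sum f$

h = the width of the class interval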
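The two grouped-data formulas can be checked against each other with a short sketch. The frequency table below (classes 0-10, 10-20, ..., 40-50) is hypothetical, the function names are illustrative, and equal class widths are assumed.

```python
def grouped_mean_direct(midpoints, freqs):
    """Direct method: sum of f*x divided by the total frequency N."""
    n = sum(freqs)
    return sum(f * x for x, f in zip(midpoints, freqs)) / n


def grouped_mean_short_cut(midpoints, freqs, assumed_mean, width):
    """Short-cut method with step deviations d = (x - A) / h."""
    n = sum(freqs)
    total_fd = sum(f * (x - assumed_mean) / width
                   for x, f in zip(midpoints, freqs))
    return assumed_mean + total_fd / n * width


midpoints = [5, 15, 25, 35, 45]   # mid-points of the hypothetical classes
freqs = [3, 7, 10, 6, 4]          # hypothetical class frequencies, N = 30

print(grouped_mean_direct(midpoints, freqs))             # about 25.33
print(grouped_mean_short_cut(midpoints, freqs, 25, 10))  # about 25.33
```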


Mathematical properties of Arithmetic Mean

• The algebraic sum of the deviations of all observations from the AM is always zero, i.e.

$$\sum (x - \bar{x}) = 0$$

• The sum of the squared deviations of all the observations from the AM is minimum, i.e. for any value A,

$$\sum (x - \bar{x})^2 \le \sum (x - A)^2$$
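Both properties can be illustrated numerically. The following is a small sketch on hypothetical data, not a proof; the helper names are illustrative.

```python
def sum_of_deviations(values, point):
    """Algebraic sum of the deviations of the values from `point`."""
    return sum(x - point for x in values)


def sum_of_squared_deviations(values, point):
    """Sum of the squared deviations of the values from `point`."""
    return sum((x - point) ** 2 for x in values)


data = [12, 15, 11, 18, 14]       # hypothetical observations
mean = sum(data) / len(data)      # 14.0

print(sum_of_deviations(data, mean))          # 0.0
print(sum_of_squared_deviations(data, mean))  # 30.0
print(sum_of_squared_deviations(data, 15))    # 35, larger than at the mean
```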


• If we have the AM and the number of observations of two or more related groups, we can compute the combined average of these groups. For two groups,

$$\bar{x}_{12} = \frac{N_1 \bar{x}_1 + N_2 \bar{x}_2}{N_1 + N_2}$$
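A minimal sketch of the combined mean, written for any number of groups. The group sizes and means are hypothetical.

```python
def combined_mean(sizes, means):
    """Combined AM of several groups from their sizes and individual means."""
    total = sum(sizes)
    return sum(n * m for n, m in zip(sizes, means)) / total


# Hypothetical groups: 40 observations with mean 50, 60 observations with mean 55.
print(combined_mean([40, 60], [50.0, 55.0]))  # 53.0
```

The combined mean is pulled towards the mean of the larger group, so it is not simply the average of the two group means (52.5 here).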


Merits:-

• The calculation of the AM is simple and it is unique, that is, every data set has one and only one mean.

• The calculation of the AM is based on all the values given in the data set.

• The AM is a reliable single value that reflects all values in the data set.

• The AM is least affected by sampling fluctuations, that is, its value varies least from one sample to another.


Limitations:-

• The value of the AM cannot be calculated accurately for unequal and open-ended class intervals.

• It is very much affected by extreme observations which are not representative of the rest of the data.

• The calculation of the AM sometimes becomes difficult because every data element is used in the calculation.


Weighted Arithmetic Mean:

The AM, as discussed earlier, gives equal importance to each observation in the data set. However, there are situations in which the values of individual observations in the data set are not of equal importance. Under these circumstances, we may attach to each observation a value called a ‘weight’, $w_1, w_2, \dots, w_n$, as an indicator of its importance. The formula for computing the weighted AM is

$$\bar{X}_w = \frac{\sum w x}{\sum w}$$
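A short sketch of the weighted AM follows. The scores and weights are hypothetical; with equal weights the result reduces to the simple AM.

```python
def weighted_mean(values, weights):
    """Weighted AM: sum of w*x divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


scores = [70, 80, 90]    # hypothetical observations
weights = [1, 2, 3]      # hypothetical importance attached to each score

print(weighted_mean(scores, weights))    # about 83.33
print(weighted_mean(scores, [1, 1, 1]))  # 80.0, the simple AM
```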


Geometric Mean:-

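Geometric mean (GM) is defined as the Nth root of the product of the N observations of a given data set. If there are two observations we take the square root, if three the cube root, and so on.

$$GM = \sqrt[N]{x_1 \, x_2 \, x_3 \cdots x_N}$$

To simplify calculations, logarithms are used:

$$\log GM = \frac{1}{N} \left( \log x_1 + \log x_2 + \dots + \log x_N \right)$$

$$GM = \operatorname{antilog} \left( \frac{\sum \log x}{N} \right)$$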
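The logarithmic form translates directly into code. This sketch uses natural logarithms, an assumption made for convenience; any base gives the same GM as long as the antilog uses the same base. The function name and data are illustrative.

```python
import math


def geometric_mean(values):
    """GM as the antilog of the average logarithm of the observations."""
    if any(x <= 0 for x in values):
        raise ValueError("GM is not defined for zero or negative observations")
    mean_log = sum(math.log(x) for x in values) / len(values)
    return math.exp(mean_log)   # antilog for natural logarithms


print(geometric_mean([2, 8]))     # approximately 4, the square root of 16
print(geometric_mean([1, 3, 9]))  # approximately 3, the cube root of 27
```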


For grouped data the GM is calculated as

$$GM = \left( x_1^{f_1} \, x_2^{f_2} \cdots x_n^{f_n} \right)^{1/N}$$

$$GM = \operatorname{antilog} \left( \frac{\sum f \log x}{N} \right)$$

where $N = \sum f$.

Application of GM:-

1. The GM is used to find the average percent increase in sales, production, population or other economic or business data.

2. It is the average most suitable when large weights have to be given to small values of observations and vice versa.


Merits:-

1. The value of GM is not much affected by extreme observations and is computed by taking all the observations into account.

2. It is useful in averaging ratios and percentages as well as in determining rates of increase and decrease.

Limitations:-

1. The calculation of GM is more difficult than that of AM.

2. The value of GM cannot be calculated when any of the observations in the data set is either negative or zero.


Harmonic Mean:

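The harmonic mean (HM) of a set of observations is defined as the reciprocal of the arithmetic mean of the reciprocals of the observations, i.e.

$$HM = \frac{N}{\frac{1}{x_1} + \frac{1}{x_2} + \dots + \frac{1}{x_N}} = \frac{N}{\sum \frac{1}{x}}$$

For Grouped Data:

$$HM = \frac{N}{\sum f \left( \frac{1}{x} \right)}$$

or

$$HM = \frac{N}{\sum \frac{f}{x}}$$

where $N = \sum f$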
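A minimal sketch covering both the ungrouped and the grouped formula. The optional `freqs` argument is an illustrative design choice: when it is omitted, every observation is given frequency 1. The speed figures are hypothetical.

```python
def harmonic_mean(values, freqs=None):
    """HM = N / sum(f / x), with every f = 1 for ungrouped data."""
    if freqs is None:
        freqs = [1] * len(values)
    if any(x == 0 for x in values):
        raise ValueError("the reciprocal of a zero observation is undefined")
    n = sum(freqs)
    return n / sum(f / x for x, f in zip(values, freqs))


# Hypothetical rates: the same distance covered at 60 km/h and then at 40 km/h.
print(harmonic_mean([60, 40]))        # approximately 48, the average speed

# Hypothetical grouped data: value 2 occurs once, value 4 occurs twice.
print(harmonic_mean([2, 4], [1, 2]))  # 3.0
```

In the speed example the simple AM would give 50 km/h, which overstates the average speed because more time is spent travelling at the lower speed.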


Applications:-

The harmonic mean is a measure of central tendency for data expressed as rates, such as km per hour, tonnes per day, quantity per liter, etc.

Merits:-

1. The HM of given data is based on all the observations.

2. It is useful in special cases for averaging rates.

Limitations:-

1. The HM is not often used for analyzing business problems.

2. Computing the HM involves complicated calculations.


Relationship among AM, GM and HM:-

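For any set of positive observations, the AM, GM and HM are related to each other by

$$AM \ge GM \ge HM$$

The sign of ‘=’ holds if and only if all the observations are identical.

If the values of any two of the means are given, then the value of the third mean can be calculated from the following relation, which holds exactly when there are two observations:-

$$GM = \sqrt{\bar{X} \cdot HM}$$

Or

$$GM^2 = \bar{X} \cdot HM$$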
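The ordering of the three means, and the product relation between them, can be checked numerically. This is a sketch on hypothetical positive data, with illustrative helper names. It also shows that the product relation is exact for a pair of observations but need not hold for three or more.

```python
import math


def am(values):
    return sum(values) / len(values)


def gm(values):
    return math.exp(sum(math.log(x) for x in values) / len(values))


def hm(values):
    return len(values) / sum(1 / x for x in values)


pair = [4, 9]        # two hypothetical observations
triple = [1, 2, 6]   # three hypothetical observations

for data in (pair, triple):
    print(am(data), gm(data), hm(data))   # each row is in decreasing order

# GM squared equals AM times HM for the pair (both are about 36) ...
print(gm(pair) ** 2, am(pair) * hm(pair))
# ... but not for the triple (about 5.24 against about 5.4).
print(gm(triple) ** 2, am(triple) * hm(triple))
```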


Median:-

The median may be defined as the middle value in the data set when its elements are arranged in sequential order, i.e. ascending or descending. Half of the observations in a set of data are lower than it and half of the observations are greater than it.

The median is also known as a positional average.


Calculation of Median:- Ungrouped Data

Arrange the data in ascending or descending order of magnitude.

If the number of observations (N) is an odd number, then

Median = size or value of the $\left( \frac{N+1}{2} \right)$th observation in the data set.

If the number of observations (N) is an even number, then the median is

$$\text{Median} = \frac{\left( \frac{N}{2} \right)\text{th observation} + \left( \frac{N}{2} + 1 \right)\text{th observation}}{2}$$
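The two cases can be sketched as follows. Python lists are indexed from 0, so the $\left( \frac{N+1}{2} \right)$th observation sits at index `n // 2` when N is odd. The data are hypothetical.

```python
def median(values):
    """Median of ungrouped data, handling odd and even N separately."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("at least one observation is required")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                      # the ((N + 1) / 2)th observation
    return (ordered[mid - 1] + ordered[mid]) / 2  # average of the two middle ones


print(median([7, 3, 9, 5, 1]))  # 5
print(median([7, 3, 9, 5]))     # 6.0
```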


Calculation of Median – Grouped Data

First identify the class interval which contains the median value, i.e. the $\left( \frac{N}{2} \right)$th observation of the data set.

$$\text{Median} = L + \frac{\frac{N}{2} - c.f.}{f} \times h$$

Where L is the lower limit of the median class

c.f. is the cumulative frequency of the class preceding the median class

f is the frequency of the median class

h is the width of the class interval of the median class
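A sketch of the grouped-data calculation, assuming continuous classes of equal width listed in ascending order. The frequency table (classes 0-10 to 40-50) and the function name are hypothetical.

```python
def grouped_median(lower_limits, freqs, width):
    """Median = L + ((N/2 - c.f.) / f) * h for equal-width classes."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("the total frequency must be positive")
    half = n / 2
    cumulative = 0   # c.f. of the classes before the current one
    for lower, f in zip(lower_limits, freqs):
        if cumulative + f >= half:   # this class contains the (N/2)th observation
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("median class not found")


lower_limits = [0, 10, 20, 30, 40]   # hypothetical class lower limits
freqs = [3, 7, 10, 6, 4]             # hypothetical frequencies, N = 30

print(grouped_median(lower_limits, freqs, 10))  # 25.0
```

Here N/2 = 15, the median class is 20-30 with c.f. = 10 and f = 10, so the median is 20 + (5/10) × 10 = 25.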


Merits:-

1. Median is unique, i.e. like the mean, there is only one median for a set of data.

2. The value of the median is easy to understand and may be calculated from any type of data.

3. The sum of the absolute differences of all the observations in the data set from the median value is minimum, i.e. $\sum |x - \text{Med}|$ is minimum.

4. The extreme values in the data set do not affect the calculation of the median value.

5. The median value may be calculated for an open-ended distribution of a data set.

6. The median is considered the best statistical technique for studying the qualitative attributes of an observation in the data set.


Limitations:-

1. The median is not capable of algebraic treatment, i.e. the median of two or more combined sets of data cannot be determined from the medians of the individual sets.

2. The median is more affected by sampling fluctuations than the mean.

3. The median is an average of position; therefore, arranging the data in ascending or descending order of magnitude is time-consuming in the case of a large number of observations.


Related positional measures i.e. Partition Values:

Quartiles:-

The values of observations in a data set, when arranged in an ordered sequence, can be divided into four equal parts or quarters using three quartiles, namely Q1, Q2 and Q3.

The generalized formula for calculating quartiles in case of grouped data is:

$$Q_i = L + \frac{\frac{iN}{4} - c.f.}{f} \times h$$

For i = 1, 2, 3

Symbols have their usual meanings, taken for the class that contains $Q_i$.


Deciles:-

The values of observations in a data set, when arranged in an ordered sequence, can be divided into ten equal parts using nine deciles, Dj (j = 1, 2, …, 9).

The generalized formula for calculating deciles in case of grouped data is:

$$D_j = L + \frac{\frac{jN}{10} - c.f.}{f} \times h$$

For j = 1, 2, 3, …, 9


Percentiles:

The values of observations in a data set, when arranged in an ordered sequence, can be divided into a hundred equal parts using ninety-nine percentiles, Pk (k = 1, 2, …, 99).

The generalized formula for calculating percentiles in case of grouped data is:

$$P_k = L + \frac{\frac{kN}{100} - c.f.}{f} \times h$$

For k = 1, 2, 3, …, 99
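Quartiles, deciles and percentiles for grouped data share one pattern: locate the class containing the (i × N / parts)th observation and interpolate within it. The sketch below captures this with a single hypothetical helper, `partition_value`, assuming continuous classes of equal width; the frequency table is also hypothetical.

```python
def partition_value(lower_limits, freqs, width, i, parts):
    """i-th partition value: parts=4 quartiles, 10 deciles, 100 percentiles."""
    if not 1 <= i < parts:
        raise ValueError("i must lie between 1 and parts - 1")
    n = sum(freqs)
    if n == 0:
        raise ValueError("the total frequency must be positive")
    target = i * n / parts
    cumulative = 0   # c.f. of the classes before the current one
    for lower, f in zip(lower_limits, freqs):
        if cumulative + f >= target:
            return lower + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("partition class not found")


lower_limits = [0, 10, 20, 30, 40]   # hypothetical classes 0-10, ..., 40-50
freqs = [3, 7, 10, 6, 4]             # hypothetical frequencies, N = 30

print(partition_value(lower_limits, freqs, 10, 1, 4))     # Q1, about 16.43
print(partition_value(lower_limits, freqs, 10, 2, 4))     # Q2 = 25.0
print(partition_value(lower_limits, freqs, 10, 5, 10))    # D5 = 25.0
print(partition_value(lower_limits, freqs, 10, 90, 100))  # P90 = 42.5
```

Q2, D5 and P50 all coincide with the median, which serves as a quick consistency check.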


Mode:-

The mode is defined as the value which occurs the maximum number of times, i.e. the value having the maximum frequency.

The concept of mode is of great use to large-scale manufacturers of consumable items such as ready-made garments, shoes and so on. In all such cases it is important to know the size that fits most persons rather than the ‘mean size’.


Calculation of Mode:-

Ungrouped Data:- To determine the mode, count the number of times each value occurs; the value which occurs the maximum number of times is the modal value.
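For ungrouped data the counting can be sketched with `collections.Counter` from the Python standard library. The function returns a list because a data set may have more than one value with the highest frequency; the shoe sizes are hypothetical.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs the maximum number of times."""
    if not values:
        raise ValueError("at least one observation is required")
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


sizes = [7, 8, 8, 9, 8, 7, 10]   # hypothetical shoe sizes sold
print(modes(sizes))              # [8]
print(modes([1, 1, 2, 2, 3]))    # [1, 2], a bimodal data set
```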

Grouped Data:- In discrete and continuous series, if items are concentrated at one value only, then the mode can be calculated easily. But if items are concentrated at more than one value, we find the item of concentration by the method of grouping.


After finding the modal class we will use the following formula:-

$$\text{Mode} = L + \frac{f_1 - f_0}{2 f_1 - f_0 - f_2} \times h$$

Where L is the lower limit of the modal class

$f_1$ is the frequency of the modal class

$f_2$ is the frequency of the class succeeding the modal class

$f_0$ is the frequency of the class preceding the modal class

h is the width of the class interval of the modal class


It must be noted that the value of the mode must lie in the modal class. If it does not lie in the modal class, it is considered to be incorrect. In such a situation we use the following alternative formula:

$$\text{Mode} = L + \frac{f_2}{f_0 + f_2} \times h$$
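Both grouped-data formulas for the mode can be sketched as small functions. The function names and the modal class used here (20-30, with neighbouring frequencies 7 and 6) are hypothetical.

```python
def grouped_mode(lower, width, f0, f1, f2):
    """Mode = L + ((f1 - f0) / (2*f1 - f0 - f2)) * h."""
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula undefined when 2*f1 equals f0 + f2")
    return lower + (f1 - f0) / denominator * width


def grouped_mode_alternative(lower, width, f0, f2):
    """Alternative formula: Mode = L + (f2 / (f0 + f2)) * h."""
    if f0 + f2 == 0:
        raise ValueError("formula undefined when f0 + f2 is zero")
    return lower + f2 / (f0 + f2) * width


# Hypothetical modal class 20-30 with f0 = 7, f1 = 10, f2 = 6 and h = 10.
print(grouped_mode(20, 10, 7, 10, 6))          # about 24.29
print(grouped_mode_alternative(20, 10, 7, 6))  # about 24.62
```

In this example both values fall inside the modal class 20-30, so the first formula would be used.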


Merits:-

1. The mode value is easy to understand and to calculate. The modal class can also be located by inspection.

2. The mode is not affected by the extreme values in the distribution. The mode value can also be calculated for open-ended frequency distributions.

3. The mode can be used to describe qualitative as well as quantitative data.


Limitations:-

1. Mode is not a rigidly defined measure, as there are several methods for calculating its value.

2. It is difficult to locate the modal class in the case of multi-modal frequency distributions.

3. Mode is not suitable for algebraic manipulations.

4. When data sets contain more than one mode, such values are difficult to interpret and compare.


Relationship between Mean, Median and Mode:-

In a symmetrical distribution, the values of the mean, median and mode are equal. When these three values are not equal to each other, the distribution is not symmetrical.

For an asymmetrical distribution, Karl Pearson has suggested a relationship between these three measures of central tendency as

$$\text{Mean} - \text{Mode} = 3 \, (\text{Mean} - \text{Median})$$

or

$$\text{Mode} = 3 \, \text{Median} - 2 \, \text{Mean}$$

or

$$\text{Median} = \frac{1}{3} \left( 2 \, \text{Mean} + \text{Mode} \right)$$

or

$$\text{Mean} = \frac{1}{2} \left( 3 \, \text{Median} - \text{Mode} \right)$$
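The empirical relation lets any one of the three measures be estimated from the other two. A minimal sketch with hypothetical figures follows; the function names are illustrative, and the results are approximations that depend on how closely the data follow the relation.

```python
def empirical_mode(mean, median):
    """Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean


def empirical_median(mean, mode):
    """Median = (2 Mean + Mode) / 3."""
    return (2 * mean + mode) / 3


def empirical_mean(median, mode):
    """Mean = (3 Median - Mode) / 2."""
    return (3 * median - mode) / 2


# Hypothetical summary figures: mean 26 and median 25.
mode = empirical_mode(26, 25)
print(mode)                        # 23
print(empirical_median(26, mode))  # 25.0
print(empirical_mean(25, mode))    # 26.0
```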