Mba i qt unit-2.1_measures of variations

Course: MBA Subject: Quantitative Techniques Unit: 2.1


Ch 5_2

Ch 5_3

What is meant by variability?

Variability refers to the extent to which the observations vary from one another or from some average. A measure of variation is designed to state the extent to which the individual measures differ, on average, from the mean.

Continued…..

Ch 5_4

What are the purposes of measuring

variation ?

Measures of variation are needed for four basic

purposes:

To determine the reliability of an average;

To serve as a basis for the control of the variability;

To compare two or more series with regard to their

variability;

To facilitate the use of other statistical measures

Ch 5_5

What are the properties of a good measure of variation?

A good measure of variation should possess the following properties:

It should be simple to understand.

It should be easy to compute.

It should be rigidly defined.

It should be based on each and every observation of

the distribution.

It should be amenable to further algebraic treatment.

It should have sampling stability.

It should not be unduly affected by extreme

observations.

Ch 5_6

What are the methods of studying variation ?

The following are the important methods of studying

variation:

The Range

The Interquartile Range or Quartile Deviation.

The Average Deviation

The Standard Deviation

The Lorenz Curve.

Of these, the first four are mathematical and the last

is a graphical one.

Ch 5_7

What is meant by range ?

The range is defined as the distance between the

highest and lowest scores in a distribution.

It may also be defined as the difference between the

value of the smallest observation and the value of the

largest observation included in the distribution.

Ch 5_8

What are the uses of range?

Despite serious limitations, range is useful in the following cases:

Quality control: Range helps check quality of a

product. The object of quality control is to keep a

check on the quality of the product without 100%

inspection.

Fluctuation in the share prices: Range is useful in

studying the variations in the prices of stocks and

shares and other commodities etc.

Weather forecasts: The meteorological department

does make use of the range in determining the

difference between the minimum temperature and

maximum temperature.

Ch 5_9

What are the merits of range ?

Merits:

Among all the methods of studying variation, range

is the simplest to understand and the easiest to

compute.

It takes minimum time to calculate the value of

range. Hence, if one is interested in getting a quick

rather than a very accurate picture of variability, one

may compute range.

Ch 5_10

What are the limitations of range ?

Limitations:

Range is not based on each and every observation

of the distribution.

It is subject to fluctuations of considerable magnitude from sample to sample.

Range cannot be computed in case of open-end

distributions.

Range cannot tell anything about the character of

the distribution within two extreme observations.

Ch 5_11

Example: Observe the following three series

Series A: 6, 46 46 46 46 46 46 46

Series B: 6 6 6 6 46 46 46 46

Series C: 6 10 15 25 30 32 40 46

In all three series the range is the same (i.e., 46 - 6 = 40), but this does not mean that the distributions are alike. The range takes no account of the form of the distribution within the range. Range is, therefore, most unreliable as a guide to the variation of the values within a distribution.
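The point of this example can be checked with a short Python sketch. The dictionary name `series` and the helper `value_range` are illustrative names only, not part of the original slides.

```python
# The three series from the example; all share the same extremes.
series = {
    "A": [6, 46, 46, 46, 46, 46, 46, 46],
    "B": [6, 6, 6, 6, 46, 46, 46, 46],
    "C": [6, 10, 15, 25, 30, 32, 40, 46],
}


def value_range(values):
    """Range = largest observation minus smallest observation."""
    return max(values) - min(values)


ranges = {name: value_range(values) for name, values in series.items()}
print(ranges)
```

Every series gives a range of 40, although the values inside the range are spread very differently.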

Ch 5_12

What is meant by inter-quartile range or

deviation?

Inter-quartile range represents the difference between

the third quartile and the first quartile. In measuring

inter-quartile range the variation of extreme

observations is discarded.

Continued…..

Ch 5_13

How is inter-quartile range or deviation measured?

One quartile of the observations at the lower end and another quartile of the observations at the upper end of the distribution are excluded in computing the inter-quartile range. In other words, inter-quartile range represents the difference between the third quartile and the first quartile. Symbolically,

\[ \text{Interquartile range} = Q_3 - Q_1 \]

Very often the interquartile range is reduced to the

form of the semi-interquartile range or quartile

deviation by dividing it by 2.

Ch 5_14

The formula for computing quartile deviation is stated as under:

\[ \text{Q.D.} = \frac{Q_3 - Q_1}{2} \]

where Q.D. = Quartile deviation.

Quartile deviation gives the average amount by which the two quartiles differ from the median. In a symmetrical distribution, the two quartiles (Q1 and Q3) are equidistant from the median, i.e., Median ± Q.D. covers exactly 50 per cent of the observations.

Ch 5_15

Co-efficient of Quartile deviation

When quartile deviation is very small it describes high

uniformity or small variation of the central 50%

observations, and a high quartile deviation means that

the variation among the central observations is large.

Quartile deviation is an absolute measure of variation.

The relative measure corresponding to this measure,

called the coefficient of quartile deviation, is

calculated as follows:

\[ \text{Coefficient of Q.D.} = \frac{Q_3 - Q_1}{Q_3 + Q_1} \]

Coefficient of quartile deviation can be used to

compare the degree of variation in different

distributions.

Ch 5_16

How is quartile deviation computed?

The process of computing quartile deviation is very

simple. It is computed based on the values of the upper

and lower quartiles. The following illustration would

clarify the procedure.

Example:

You are given the frequency distribution of 292 workers

of a factory according to their average weekly income.

Calculate quartile deviation and its coefficient from the

following data:

Continued…………

Ch 5_17

Example:

Continued…………

Weekly Income No. of workers

Below 1350 8

1350-1370 16

1370-1390 39

1390-1410 58

1410-1430 60

1430-1450 40

1450-1470 22

1470-1490 15

1490-1510 15

1510-1530 9

1530 & above 10

Ch 5_18

Example:

Continued…………

Calculation of Quartile deviation

Weekly Income    No. of workers    c.f.
Below 1350       8                 8
1350-1370        16                24
1370-1390        39                63
1390-1410        58                121
1410-1430        60                181
1430-1450        40                221
1450-1470        22                243
1470-1490        15                258
1490-1510        15                273
1510-1530        9                 282
1530 & above     10                292
                 N = 292

Ch 5_19

Example:

Continued…………

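\[ \text{Median} = \text{size of the } \frac{N}{2}\text{th observation} = \frac{292}{2} = 146\text{th observation} \]

Median lies in the class 1410 - 1430.

\[ \text{Median} = L + \frac{N/2 - \text{p.c.f.}}{f} \times i = 1410 + \frac{146 - 121}{60} \times 20 = 1410 + 8.333 = 1418.333 \]

Here L is the lower limit of the class, p.c.f. is the cumulative frequency of the preceding class, f is the frequency of the class and i is the class interval.

\[ Q_1 = \text{size of the } \frac{N}{4}\text{th observation} = \frac{292}{4} = 73\text{rd observation} \]

Q1 lies in the class 1390 - 1410.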

Ch 5_20

Example:

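\[ Q_1 = L + \frac{N/4 - \text{p.c.f.}}{f} \times i = 1390 + \frac{73 - 63}{58} \times 20 = 1390 + 3.448 = 1393.448 \]

\[ Q_3 = \text{size of the } \frac{3N}{4}\text{th observation} = \frac{3 \times 292}{4} = 219\text{th observation} \]

Q3 lies in the class 1430 - 1450.

\[ Q_3 = L + \frac{3N/4 - \text{p.c.f.}}{f} \times i = 1430 + \frac{219 - 181}{40} \times 20 = 1430 + 19 = 1449 \]

\[ \text{Q.D.} = \frac{Q_3 - Q_1}{2} = \frac{1449 - 1393.448}{2} = \frac{55.552}{2} = 27.776 \]

\[ \text{Coefficient of Q.D.} = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{1449 - 1393.448}{1449 + 1393.448} = \frac{55.552}{2842.448} = 0.020 \]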
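The interpolation used for Q1 and Q3 in this example can be sketched in Python as follows. The function name `grouped_position_value` and the nominal limit 1330 given to the open-ended "Below 1350" class are assumptions made for illustration; the open-ended classes do not contain Q1 or Q3, so the results are not affected by them.

```python
# Class lower limits and frequencies from the weekly-income example.
# The open-ended first class gets a nominal lower limit (an assumption).
lower_limits = [1330, 1350, 1370, 1390, 1410, 1430, 1450, 1470, 1490, 1510, 1530]
frequencies = [8, 16, 39, 58, 60, 40, 22, 15, 15, 9, 10]
class_width = 20


def grouped_position_value(position, lowers, freqs, width):
    """Interpolate inside the class that holds the given observation:
    L + (position - preceding cumulative frequency) / f * i."""
    preceding_cf = 0
    for lower, f in zip(lowers, freqs):
        if preceding_cf + f >= position:
            return lower + (position - preceding_cf) / f * width
        preceding_cf += f
    raise ValueError("position exceeds total frequency")


n = sum(frequencies)  # 292 workers
q1 = grouped_position_value(n / 4, lower_limits, frequencies, class_width)
q3 = grouped_position_value(3 * n / 4, lower_limits, frequencies, class_width)
quartile_deviation = (q3 - q1) / 2
coefficient_qd = (q3 - q1) / (q3 + q1)

print(round(q1, 3), round(q3, 3))
print(round(quartile_deviation, 3), round(coefficient_qd, 4))
```

With the data above this gives Q1 of about 1393.448, Q3 of 1449, a quartile deviation of about 27.776 and a coefficient of about 0.0195, which rounds to the 0.020 shown in the worked example.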

Ch 5_21

What are the merits of quartile deviation?

Merits:

In certain respects it is superior to range as a

measure of variation

It has a special utility in measuring variation in case of open-end distributions or one in which the data may be ranked but not measured quantitatively.

It is also useful in erratic or highly skewed distributions, where the other measures of variation would be warped by extreme values.

The quartile deviation is not affected by the presence

of extreme values.

Ch 5_22

What are the limitations of quartile deviation?

Limitations:

Quartile deviation ignores 50% of the items, i.e., the first 25% and the last 25%. As the value of quartile deviation does not depend upon every observation, it cannot be regarded as a good method of measuring variation.

It is not capable of mathematical manipulation.

Its value is very much affected by sampling fluctuations.

It is in fact not a measure of variation as it really does not show the scatter around an average but rather a distance on a scale, i.e., quartile deviation is not itself measured from an average, but it is a positional average.

Ch 5_23

What is average deviation?

Average deviation refers to the average of the absolute deviations of the scores around the mean.

How is it calculated?

It is obtained by calculating the absolute deviations of each observation from the median (or mean), and then averaging these deviations by taking their arithmetic mean.

Ch 5_24

Ungrouped data

The formula for average deviation may be written as:

\[ \text{A.D.}(\text{Med}) = \frac{\sum |X - \text{Med}|}{N} \]

If the distribution is symmetrical, the average (mean or median) ± average deviation is the range that will include 57.5 per cent of the observations in the series. If it is moderately skewed, then we may expect approximately 57.5 per cent of the observations to fall within this range. Hence if average deviation is small, the distribution is highly compact or uniform, since more than half of the cases are concentrated within a small range around the mean.

Ch 5_25

The relative measure corresponding to the average

deviation, called the coefficient of average deviation, is

obtained, by dividing average deviation by the

particular average used in computing average

deviation. Thus, if average deviation has been

computed from median, the coefficient of average

deviation shall be obtained by dividing average

deviation by the median.

\[ \text{Coefficient of A.D.} = \frac{\text{A.D.}(\text{Med})}{\text{Median}} \]

If mean has been used while calculating the value of

average deviation, in such a case coefficient of average

deviation is obtained by dividing average deviation by

the mean.

Ch 5_26

Example:

Calculate the average deviation and coefficient of average deviation of the two income groups of five and seven workers working in two different branches of a firm:

Branch I        Branch II
Income (Tk)     Income (Tk)
4,000           3,000
4,200           4,000
4,400           4,200
4,600           4,400
4,800           4,600
                4,800
                5,800

Ch 5_27

Calculation of Average deviation

Branch I                         Branch II
Income (Tk)    |X – Med|         Income (Tk)    |X – Med|
               (Med. = 4,400)                   (Med. = 4,400)
4,000          400               3,000          1,400
4,200          200               4,000          400
4,400          0                 4,200          200
4,600          200               4,400          0
4,800          400               4,600          200
                                 4,800          400
                                 5,800          1,400
N = 5          Σ|X – Med| = 1,200    N = 7      Σ|X – Med| = 4,000

Ch 5_28

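Branch I:

\[ \text{A.D.} = \frac{\sum |X - \text{Med}|}{N} = \frac{1200}{5} = 240 \]

\[ \text{Coeff. of A.D.} = \frac{\text{A.D.}}{\text{Median}} = \frac{240}{4{,}400} = 0.054 \]

Branch II:

\[ \text{A.D.} = \frac{\sum |X - \text{Med}|}{N} = \frac{4{,}000}{7} = 571.43 \]

\[ \text{Coeff. of A.D.} = \frac{\text{A.D.}}{\text{Median}} = \frac{571.43}{4{,}400} = 0.130 \]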
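The branch calculations can be reproduced with a minimal Python sketch. The helper name `average_deviation_from_median` is hypothetical; the income figures are those of the example.

```python
from statistics import median


def average_deviation_from_median(values):
    """Return (A.D., coefficient of A.D.), taking deviations from the median."""
    med = median(values)
    ad = sum(abs(x - med) for x in values) / len(values)
    return ad, ad / med


branch_1 = [4000, 4200, 4400, 4600, 4800]
branch_2 = [3000, 4000, 4200, 4400, 4600, 4800, 5800]

ad_1, coeff_1 = average_deviation_from_median(branch_1)
ad_2, coeff_2 = average_deviation_from_median(branch_2)

print(round(ad_1, 2), round(coeff_1, 3))
print(round(ad_2, 2), round(coeff_2, 3))
```

Both branches have the same median (Tk 4,400), yet the coefficient for Branch II is more than twice that of Branch I, so incomes in Branch II are relatively more scattered.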

Ch 5_29

Grouped data

In case of grouped data, the formula for calculating average deviation is:

\[ \text{A.D.}(\text{Med}) = \frac{\sum f\,|X - \text{Med}|}{N} \]

Ch 5_30

Example:

Calculate the average deviation from the mean from the following data:

Sales (in thousand Tk)    No. of days
10 – 20                   3
20 – 30                   6
30 – 40                   11
40 – 50                   3
50 – 60                   2

Ch 5_31

Calculation of Average deviation

Sales               m.p.    f     d = (X – 35)/10    fd     |X – X̄|    f|X – X̄|
(in thousand Tk)    X
10 – 20             15      3     –2                 –6     18          54
20 – 30             25      6     –1                 –6     8           48
30 – 40             35      11    0                  0      2           22
40 – 50             45      3     +1                 +3     12          36
50 – 60             55      2     +2                 +4     22          44
                            N = 25                   Σfd = –5           Σf|X – X̄| = 204

Ch 5_32

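\[ \bar{X} = A + \frac{\sum fd}{N} \times i = 35 + \frac{-5}{25} \times 10 = 35 - 2 = 33 \]

\[ \text{A.D.} = \frac{\sum f\,|X - \bar{X}|}{N} = \frac{204}{25} = 8.16 \]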

Thus the average sales are Tk. 33 thousand per day

and the average deviation of sales is Tk. 8.16

thousand.
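As a cross-check on this grouped example, the mean and the average deviation can be computed directly from the class midpoints. This is a minimal sketch; the variable names are illustrative only.

```python
# Class midpoints (thousand Tk) and number of days from the sales example.
midpoints = [15, 25, 35, 45, 55]
days = [3, 6, 11, 3, 2]

n = sum(days)

# Mean of grouped data: sum of f * X divided by N.
mean_sales = sum(f * x for x, f in zip(midpoints, days)) / n

# Average deviation from the mean: sum of f * |X - mean| divided by N.
average_deviation = sum(f * abs(x - mean_sales) for x, f in zip(midpoints, days)) / n

print(mean_sales, round(average_deviation, 2))
```

The direct computation agrees with the step-deviation route used in the table: a mean of 33 and an average deviation of 8.16 (thousand Tk).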

Ch 5_33

What are the areas suitable for use of average deviation?

It is especially effective in reports presented to the

general public or to groups not familiar with statistical

methods.

This measure is useful for small samples with no

elaborate analysis required.

Research on forecasting business cycles has found that the average deviation is the most practical measure of variation to use for this purpose.

Ch 5_34

What are the merits of average deviation?

Merits:

The outstanding advantage of the average deviation is

its relative simplicity. It is simple to understand and

easy to compute.

Any one familiar with the concept of the average can

readily appreciate the meaning of the average

deviation.

It is based on each and every observation of the data.

Consequently change in the value of any observation

would change the value of average deviation.

Ch 5_35

What are the merits of average deviation?

Merits:

Average deviation is less affected by the values of extreme observations.

Since deviations are taken from a central value,

comparison about formation of different distributions

can easily be made.

Ch 5_36

What are the limitations of average deviation?

Limitations:

The greatest drawback of this method is that

algebraic signs are ignored while taking the

deviations of the items. If the signs of the deviations

are not ignored, the net sum of the deviations will be

zero if the reference point is the mean, or

approximately zero if the reference point is median.

The method may not give us very accurate results.

The reason is that average deviation gives us best

results when deviations are taken from median. But

median is not a satisfactory measure when the degree

of variability in a series is very high.

Continued…….

Ch 5_37

What are the limitations of average deviation?

Limitations:

Computing average deviation from the mean is also not desirable because the sum of the deviations from the mean (ignoring signs) is greater than the sum of the deviations from the median (ignoring signs).

If average deviation is computed from mode that also

does not solve the problem because the value of

mode cannot always be determined.

It is not capable of further algebraic treatment.

It is rarely used in sociological and business studies.

Continued…….

Ch 5_38

What is meant by Standard Deviation?

Standard deviation is the square root of the sum of the squared deviations of the scores around the mean divided by N. S represents the standard deviation of a sample; σ, the standard deviation of a population.

Standard deviation is also known as root mean square deviation for the reason that it is the square root of the mean of the squared deviations from the arithmetic mean. The formula for measuring standard deviation is as follows:

\[ \sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}} \]

Ch 5_39

What is meant by Variance?

If we square standard deviation, we get what is called variance. Hence

\[ \text{Variance} = \sigma^2 \quad \text{or} \quad \sigma = \sqrt{\text{Variance}} \]

Variance refers to the sum of the squared deviations of the scores around the mean divided by N. This measure of dispersion is used primarily in inferential statistics and also in correlation and regression techniques; S² represents the variance of a sample; σ², the variance of a population.

Ch 5_40

How is standard deviation calculated?

Ungrouped data

Standard deviation may be computed by applying

any of the following two methods:

By taking deviations from the actual mean

By taking deviations from an assumed mean

Continued……..

Ch 5_41

How is standard deviation calculated?

Ungrouped data

By taking deviations from the actual mean:

When deviations are taken from the actual mean,

the following formula is applied:

\[ \sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}} \]

If we calculate standard deviation without taking

deviations, the above formula after simplification

(opening the brackets) can be used and is given by:

Continued……..

Ch 5_42

Formula:

\[ \sigma = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2} \quad \text{or} \quad \sigma = \sqrt{\frac{\sum X^2}{N} - \bar{X}^2} \]

By taking deviations from an assumed mean: When the actual mean is in fractions, say 87.297, it would be too cumbersome to take deviations from it and then find squares of these deviations. In such a case either the mean may be approximated or else the deviations be taken from an assumed mean and the necessary adjustment be made in the value of standard deviation.

Ch 5_43

How is standard deviation calculated ?

The former method of approximation is less accurate

and therefore, invariably in such a case deviations are

taken from assumed mean.

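When deviations are taken from an assumed mean, the following formula is applied:

\[ \sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} \]

where d = X – A and A is the assumed mean.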

Ch 5_44

Example:

Find the standard deviation of the weekly wages of ten workers working in a factory:

Workers Weekly wages (Tk)

A 1320

B 1310

C 1315

D 1322

E 1326

F 1340

G 1325

H 1321

I 1320

J 1331

Ch 5_45

Calculations of Standard Deviation

Workers    Weekly wages (Tk)    (X – X̄)         (X – X̄)²
           X
A          1320                 –3               9
B          1310                 –13              169
C          1315                 –8               64
D          1322                 –1               1
E          1326                 +3               9
F          1340                 +17              289
G          1325                 +2               4
H          1321                 –2               4
I          1320                 –3               9
J          1331                 +8               64
N = 10     ΣX = 13230           Σ(X – X̄) = 0    Σ(X – X̄)² = 622

Ch 5_46

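We know

\[ \sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}} \qquad \text{(i)} \]

Since d = X – A,

\[ X = A + d, \qquad \bar{X} = A + \bar{d} \]

Subtracting X̄ from X, we get

\[ X - \bar{X} = d - \bar{d} \]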

Continued…….

Ch 5_47

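Substituting the value of X – X̄ in (i), mentioned above:

\[ \sigma = \sqrt{\frac{\sum (d - \bar{d})^2}{N}} = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} \]

For the weekly wages data, taking deviations from the actual mean:

\[ \bar{X} = \frac{\sum X}{N} = \frac{13230}{10} = \text{Tk. } 1323 \]

\[ \sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}} = \sqrt{\frac{622}{10}} = \sqrt{62.2} = 7.89 \]

If, in the above question, deviations are taken from 1320 instead of the actual mean 1323, the assumed mean method will be applied and the calculations would be as follows: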

Continued…….

Ch 5_48

Calculation of standard deviation (assumed mean method)

Workers    Weekly wages (Tk)    d = X – A      d²
           X                    (A = 1320)
A          1320                 0              0
B          1310                 –10            100
C          1315                 –5             25
D          1322                 +2             4
E          1326                 +6             36
F          1340                 +20            400
G          1325                 +5             25
H          1321                 +1             1
I          1320                 0              0
J          1331                 +11            121
N = 10                          Σd = 30        Σd² = 712

Ch 5_49

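\[ \sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} = \sqrt{\frac{712}{10} - \left(\frac{30}{10}\right)^2} = \sqrt{71.2 - 9} = \sqrt{62.2} = 7.89 \]

Thus the answer remains the same by both the methods. It should be noted that when the actual mean is not a whole number, the assumed mean method should be preferred because it simplifies calculations.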
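That the two methods agree can be verified with a short Python sketch using the weekly wages of the ten workers. The variable names are illustrative, and the divisor is N throughout, as in the formulas of this unit.

```python
from math import sqrt

# Weekly wages (Tk) of workers A to J from the example.
wages = [1320, 1310, 1315, 1322, 1326, 1340, 1325, 1321, 1320, 1331]
n = len(wages)

# Method 1: deviations from the actual mean.
actual_mean = sum(wages) / n
sd_actual = sqrt(sum((x - actual_mean) ** 2 for x in wages) / n)

# Method 2: deviations from an assumed mean A = 1320.
assumed_mean = 1320
d = [x - assumed_mean for x in wages]
sd_assumed = sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)

print(actual_mean, round(sd_actual, 2), round(sd_assumed, 2))
```

Both routes reduce to the square root of 62.2, about 7.89. Any other assumed mean would give the same result, because the correction term (Σd/N)² removes the effect of the shift.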

Ch 5_50

Grouped data

In grouped frequency distribution, standard deviation

can be calculated by applying any of the following two

methods:

By taking deviations from actual mean.

By taking deviations from assumed mean.

Continued……..

Ch 5_51

Grouped data

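Deviations taken from actual mean: When deviations are taken from the actual mean, the following formula is used:

\[ \sigma = \sqrt{\frac{\sum f(X - \bar{X})^2}{N}} \]

If we calculate standard deviation without taking deviations, then this formula after simplification (opening the brackets) can be used and is given by

\[ \sigma = \sqrt{\frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2} \quad \text{or} \quad \sigma = \sqrt{\frac{\sum fX^2}{N} - \bar{X}^2} \]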

Ch 5_52

Grouped data

Deviations taken from assumed mean: When deviations are taken from an assumed mean, the following formula is applied:

\[ \sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i, \qquad \text{where } d = \frac{X - A}{i} \]

Ch 5_53

Example:

A purchasing agent obtained samples of 60 watt bulbs from two companies. He had the samples tested in his own laboratory for length of life with the following results:

Length of life (in hours)    Samples from
                             Company A    Company B
1,700 and under 1,900        10           3
1,900 and under 2,100        16           40
2,100 and under 2,300        20           12
2,300 and under 2,500        8            3
2,500 and under 2,700        6            2

Continued……..

Ch 5_54

1. Which Company’s bulbs do you think are better in

terms of average life?

2. If prices of both types are the same, which company’s

bulbs would you buy and why?

Example:

Continued……..

Ch 5_55

Example:

Continued……..

Here d = (X – A)/i, where A = assumed mean = 2200 and i = 200.

Length of life    Midpoint    Samples from Co. A            Samples from Co. B
(in hours)        X           f     d     fd     fd²        f     d     fd     fd²
1,700–1,900       1800        10    –2    –20    40         3     –2    –6     12
1,900–2,100       2000        16    –1    –16    16         40    –1    –40    40
2,100–2,300       2200        20    0     0      0          12    0     0      0
2,300–2,500       2400        8     +1    +8     8          3     +1    +3     3
2,500–2,700       2600        6     +2    +12    24         2     +2    +4     8
                              N = 60      Σfd = –16  Σfd² = 88    N = 60      Σfd = –39  Σfd² = 63

Ch 5_56

Example:

Continued……..

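For Company A:

Here N = 60, A = 2,200, i = 200, Σfd = –16, Σfd² = 88.

\[ \bar{X} = A + \frac{\sum fd}{N} \times i = 2{,}200 + \frac{-16}{60} \times 200 = 2146.67 \]

\[ \sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i = \sqrt{\frac{88}{60} - \left(\frac{-16}{60}\right)^2} \times 200 = \sqrt{1.467 - 0.071} \times 200 = 1.182 \times 200 = 236.4 \]

\[ \text{C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{236.4}{2146.67} \times 100 \approx 11.0 \text{ per cent} \]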

Ch 5_57

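For Company B:

Here N = 60, A = 2,200, i = 200, Σfd = –39, Σfd² = 63.

\[ \bar{X} = A + \frac{\sum fd}{N} \times i = 2200 + \frac{-39}{60} \times 200 = 2200 - 130 = 2070 \]

\[ \sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i = \sqrt{\frac{63}{60} - \left(\frac{-39}{60}\right)^2} \times 200 = \sqrt{1.05 - 0.42} \times 200 = 0.794 \times 200 = 158.8 \]

\[ \text{C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{158.8}{2070} \times 100 = 7.67 \text{ per cent} \]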
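The step-deviation calculation for both companies can be sketched in Python as below. The function name `step_deviation_stats` is hypothetical. Because the code carries full precision instead of rounding intermediate figures, its standard deviations (about 236.3 and 158.4) and coefficients of variation (about 11.0 and 7.65 per cent) differ slightly from the hand-rounded values; the comparison between the companies is unchanged.

```python
from math import sqrt


def step_deviation_stats(midpoints, freqs, assumed_mean, width):
    """Mean, standard deviation and C.V. (per cent) of grouped data,
    using step deviations d = (X - A) / i."""
    n = sum(freqs)
    d = [(x - assumed_mean) / width for x in midpoints]
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))
    mean = assumed_mean + sum_fd / n * width
    sd = sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * width
    return mean, sd, sd / mean * 100


midpoints = [1800, 2000, 2200, 2400, 2600]
company_a = [10, 16, 20, 8, 6]
company_b = [3, 40, 12, 3, 2]

mean_a, sd_a, cv_a = step_deviation_stats(midpoints, company_a, 2200, 200)
mean_b, sd_b, cv_b = step_deviation_stats(midpoints, company_b, 2200, 200)

print(round(mean_a, 2), round(sd_a, 2), round(cv_a, 2))
print(round(mean_b, 2), round(sd_b, 2), round(cv_b, 2))
```

Company A has the higher mean life, while Company B has the lower coefficient of variation.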

Ch 5_58

Illustration 18:

You are given the data pertaining to kilowatt hours of electricity consumed by 100 persons in Delhi. Calculate the mean and the standard deviation.

Consumption (kilowatt hours)    No. of users
0 but less than 10              6
10 but less than 20             25
20 but less than 30             36
30 but less than 40             20
40 but less than 50             13

Continued……..

Ch 5_59

Solution:

Calculation of Mean and Standard Deviation (taking deviations from assumed mean)

Consumption         m.p.    No. of users    d = (X – 25)/10    fd     fd²    c.f.
(kilowatt hours)    (X)     (f)
0–10                5       6               –2                 –12    24     6
10–20               15      25              –1                 –25    25     31
20–30               25      36              0                  0      0      67
30–40               35      20              +1                 +20    20     87
40–50               45      13              +2                 +26    52     100
                            N = 100                            Σfd = 9       Σfd² = 121

Continued……..

Ch 5_60

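\[ \text{(i)} \quad \bar{X} = A + \frac{\sum fd}{N} \times i = 25 + \frac{9}{100} \times 10 = 25.9 \text{ kilowatt hours} \]

\[ \text{(ii)} \quad \sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i = \sqrt{\frac{121}{100} - \left(\frac{9}{100}\right)^2} \times 10 = \sqrt{1.21 - 0.008} \times 10 = 1.096 \times 10 = 10.96 \]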

Ch 5_61

Calculation of Mean and the Standard Deviation (taking deviations from actual mean)

Consumption         Midpoint    No. of users    fX      (X – X̄)    (X – X̄)²    f(X – X̄)²
(kilowatt hours)    X           f
0–10                5           6               30      –20.9       436.81       2620.86
10–20               15          25              375     –10.9       118.81       2970.25
20–30               25          36              900     –0.9        0.81         29.16
30–40               35          20              700     9.1         82.81        1656.20
40–50               45          13              585     19.1        364.81       4742.53
                                N = 100         ΣfX = 2590                       Σf(X – X̄)² = 12019

Continued……..

Ch 5_62

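\[ \bar{X} = \frac{\sum fX}{N} = \frac{2590}{100} = 25.9 \]

\[ \sigma = \sqrt{\frac{\sum f(X - \bar{X})^2}{N}} = \sqrt{\frac{12019}{100}} = \sqrt{120.19} = 10.96 \]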
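The actual-mean route for this illustration can be checked with a few lines of Python. This is a minimal sketch; the variable names are illustrative only.

```python
from math import sqrt

# Class midpoints (kilowatt hours) and number of users from the illustration.
midpoints = [5, 15, 25, 35, 45]
users = [6, 25, 36, 20, 13]

n = sum(users)

# Mean of grouped data: sum of f * X divided by N.
mean_kwh = sum(f * x for x, f in zip(midpoints, users)) / n

# Standard deviation from the actual mean: sqrt(sum of f * (X - mean)^2 / N).
variance = sum(f * (x - mean_kwh) ** 2 for x, f in zip(midpoints, users)) / n
sd_kwh = sqrt(variance)

print(round(mean_kwh, 1), round(variance, 2), round(sd_kwh, 2))
```

The mean is 25.9 and the standard deviation about 10.96, the same values obtained with the assumed mean of 25.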

Ch 5_63

Conclusions for the bulb example (Companies A and B):

1. Since the average length of life is greater in the case of Company A, the bulbs of Company A are better.

2. The coefficient of variation is less for Company B. Hence if prices are the same, we will prefer to buy Company B's bulbs because their burning hours are more uniform.

Ch 5_64

Example:

For two firms A and B belonging to the same industry, the following details are available:

                         Firm A       Firm B
Number of employees      100          200
Average monthly wage     Tk. 4,800    Tk. 5,100
Standard deviation       Tk. 600      Tk. 540

Find

i. Which firm pays out the larger amount as wages?

ii. Which firm shows greater variability in the distribution of wages?

iii. Find the average monthly wage and the standard deviation of the wages of all employees in both the firms.

Ch 5_65

i. For finding out which firm pays the larger amount, we have to find out ΣX.

\[ \bar{X} = \frac{\sum X}{N} \quad \text{or} \quad \sum X = N\bar{X} \]

Firm A : N = 100, X̄ = 4800, ΣX = 100 × 4800 = 4,80,000

Firm B : N = 200, X̄ = 5100, ΣX = 200 × 5100 = 10,20,000

Hence firm B pays the larger amount as monthly wages.

Ch 5_66

ii. For finding out which firm shows greater variability in the distribution of wages, we have to calculate the coefficient of variation.

Firm A :

\[ \text{C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{600}{4800} \times 100 = 12.50 \]

Firm B :

\[ \text{C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{540}{5100} \times 100 = 10.59 \]

Since the coefficient of variation is greater in the case of firm A, it shows greater variability in the distribution of wages.

Ch 5_67

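iii. Combined average monthly wage:

\[ \bar{X}_{12} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2}{N_1 + N_2} \]

Here N1 = 100, X̄1 = 4800, N2 = 200, X̄2 = 5100.

\[ \bar{X}_{12} = \frac{100 \times 4800 + 200 \times 5100}{100 + 200} = \frac{480000 + 1020000}{300} = \text{Tk. } 5{,}000 \]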

Ch 5_68

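Combined Standard Deviation

\[ \sigma_{12} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}} \]

Here N1 = 100, σ1 = 600, N2 = 200, σ2 = 540, and

\[ d_1 = |\bar{X}_1 - \bar{X}_{12}| = |4800 - 5000| = 200 \]

\[ d_2 = |\bar{X}_2 - \bar{X}_{12}| = |5100 - 5000| = 100 \]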

Ch 5_69

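\[ \sigma_{12} = \sqrt{\frac{100(600)^2 + 200(540)^2 + 100(200)^2 + 200(100)^2}{100 + 200}} = \sqrt{\frac{36000000 + 58320000 + 4000000 + 2000000}{300}} = \sqrt{\frac{100{,}320{,}000}{300}} = \sqrt{334{,}400} = 578.27 \]

Hence the combined standard deviation is Tk. 578.27.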
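The combined mean and combined standard deviation can be sketched as one small Python function. The name `combine_two_groups` is hypothetical; the figures are those of firms A and B.

```python
from math import sqrt


def combine_two_groups(n1, mean1, sd1, n2, mean2, sd2):
    """Combined mean and standard deviation of two groups,
    given each group's size, mean and standard deviation."""
    n = n1 + n2
    combined_mean = (n1 * mean1 + n2 * mean2) / n
    d1 = mean1 - combined_mean  # deviation of group 1 mean from combined mean
    d2 = mean2 - combined_mean  # deviation of group 2 mean from combined mean
    combined_variance = (n1 * (sd1 ** 2 + d1 ** 2) + n2 * (sd2 ** 2 + d2 ** 2)) / n
    return combined_mean, sqrt(combined_variance)


combined_mean, combined_sd = combine_two_groups(100, 4800, 600, 200, 5100, 540)
print(combined_mean, round(combined_sd, 2))
```

Only the squares of d1 and d2 enter the formula, so their signs do not matter. The square root of 334,400 evaluates to about 578.27.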

Ch 5_70

Which measure of variation to use?

The choice of a suitable measure depends on the

following two factors:

The type of data available

The purpose of investigation

Ch 5_71

What is Lorenz curve?

It is a cumulative percentage curve in which the percentage of items is combined with the percentage of other things such as wealth, profit, turnover, etc. The Lorenz curve is a graphic method of studying variation.

Ch 5_72

What is the procedure of drawing the

Lorenz curve?

While drawing the Lorenz curve the following

procedure is used:

The size of items and frequencies are both cumulated and the percentages are obtained for the various cumulative values.

On the X-axis, start from 0 to 100 and take the per cent of frequencies.

On the Y-axis, start from 0 to 100 and take the per cent of the variable.

Continued…..

Ch 5_73

What is the procedure of drawing the

Lorenz curve?

Draw a diagonal line joining 0 with 100. This is

known as line of equal distribution. Any point on this

line shows the same per cent on X as on Y.

Plot the various points corresponding to X and Y

and join them. The distribution so obtained, unless it

is exactly equal, will always curve below the diagonal

line.

Ch 5_74

How is interpretation of the Lorenz

curve done?

If two curves of distribution are shown on the Lorenz

presentation, the curve that is farthest from the

diagonal line represents the greater inequality. Clearly

the line of actual distribution can never cross the line of

equal distribution.

Ch 5_75

Example:

In the following table is given the number of companies

belonging to two areas A and B according to the amount of

profits earned by them. Draw in the same diagram their

Lorenz curves and interpret them.

Profits earned in Tk.'000 No. of companies

Area A Area B

6 6 2

25 11 38

60 13 52

84 14 28

105 15 38

150 17 26

170 10 12

400 14 4

Continued……

Ch 5_76

Solution:

Calculation for drawing the Lorenz curve

Profits                              Area A                           Area B
Profits       Cum.       Cum.        No. of      Cum.      Cum.       No. of      Cum.      Cum.
(Tk. '000)    profits    %           companies   number    %          companies   number    %
6             6          0.6         6           6         6          2           2         1
25            31         3.1         11          17        17         38          40        20
60            91         9.1         13          30        30         52          92        46
84            175        17.5        14          44        44         28          120       60
105           280        28.0        15          59        59         38          158       79
150           430        43.0        17          76        76         26          184       92
170           600        60.0        10          86        86         12          196       98
400           1000       100.0       14          100       100        4           200       100
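The cumulative percentages needed for plotting can be produced with a short Python sketch. The helper name `cumulative_percentages` is hypothetical, and the plotting step itself is left out.

```python
from itertools import accumulate


def cumulative_percentages(values):
    """Running total of the values expressed as a percentage of the grand total."""
    total = sum(values)
    return [100 * running / total for running in accumulate(values)]


profits = [6, 25, 60, 84, 105, 150, 170, 400]   # Tk. '000
area_a = [6, 11, 13, 14, 15, 17, 10, 14]        # no. of companies, Area A
area_b = [2, 38, 52, 28, 38, 26, 12, 4]         # no. of companies, Area B

profit_pct = cumulative_percentages(profits)
area_a_pct = cumulative_percentages(area_a)
area_b_pct = cumulative_percentages(area_b)

# Each Lorenz curve joins (0, 0) to the points (per cent of companies, per cent of profits).
curve_a = [(0.0, 0.0)] + list(zip(area_a_pct, profit_pct))
curve_b = [(0.0, 0.0)] + list(zip(area_b_pct, profit_pct))
print(curve_a[1], curve_b[1])
```

Each curve starts at the origin and ends at (100, 100); the points in between show how far it falls from the line of equal distribution.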

Ch 5_77

[Figure: Lorenz curve. X-axis: Per Cent of Companies; Y-axis: Per Cent of Profits.]
