Session 9 Nagesh S J C E


Session – 9

Measures of Dispersion

Standard Deviation

Standard deviation is the square root of the sum of the squared deviations from the mean, divided by the number of observations. It is also called the mean error deviation, the mean square error deviation, or the root mean square deviation. It is a second-moment measure of dispersion. Since the sum of squares of the deviations is a minimum when they are taken about the mean, the deviations are taken from the mean only (not from the median or mode).

The standard deviation is the Root Mean Square (RMS) average of all the deviations from the mean. It is denoted by sigma ($\sigma$).

Characteristics of standard deviation

1. Standard deviation and coefficient of variation possess all the properties that a good measure of dispersion should have.

2. Squaring the deviations eliminates negative signs and makes mathematical computation easier.

Merits

1. It is based on all observations.

2. It lends itself readily to algebraic treatment.

3. It is a well-defined and definite measure of dispersion.

4. It is of great importance when comparing the variability of two series.

Demerits

1. It is difficult to calculate and understand.

2. It gives more weight to extreme values because the deviations are squared.

3. It is not useful in economic studies.

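Standard deviation

If the variate $x_i$ takes the values $x_1, x_2, \ldots, x_n$, the standard deviation, denoted by $\sigma$, is defined by

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}$$

where $\bar{x}$ is the arithmetic mean. The quantity $\sigma^2$ is called the variance.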


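Alternate Expressions

For raw data

$$\sigma^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2$$

For grouped data

$$\sigma^2 = \frac{\sum f x^2}{N} - \left(\frac{\sum f x}{N}\right)^2, \qquad N = \sum f$$

For grouped data with the step deviation method

$$\sigma = h \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2}, \qquad d = \frac{x - a}{h}$$

where $a$ is the assumed mean and $h$ is the class width.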
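The raw-data expression can be checked by expanding the square in the root-mean-square definition of $\sigma^2$. A short derivation, assuming only that $\bar{x} = \sum x_i / n$:

```latex
\begin{aligned}
\sigma^2 &= \frac{1}{n}\sum (x_i - \bar{x})^2 \\
         &= \frac{1}{n}\sum x_i^2 \;-\; 2\bar{x}\cdot\frac{1}{n}\sum x_i \;+\; \bar{x}^2 \\
         &= \frac{\sum x_i^2}{n} - 2\bar{x}^2 + \bar{x}^2 \\
         &= \frac{\sum x_i^2}{n} - \left(\frac{\sum x_i}{n}\right)^2
\end{aligned}
```

The grouped-data form follows in the same way, with each term weighted by its frequency $f$ and $n$ replaced by $N = \sum f$.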

Coefficient of variation

It is defined as the ratio of the standard deviation to the mean.

The percentage form of CV is given by

$$\text{CV} = \frac{\sigma}{\bar{x}} \times 100$$


Problems

1. Ten students of a class have obtained the following marks in a particular subject out of 100. Calculate SD and CV for the given data below.

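| Sl. No. | Marks ($x_i$) | $d = x_i - \bar{x}$ ($\bar{x} = 38.5$) | $d^2 = (x_i - \bar{x})^2$ |
|---|---|---|---|
| 1 | 5 | -33.5 | 1122.25 |
| 2 | 10 | -28.5 | 812.25 |
| 3 | 20 | -18.5 | 342.25 |
| 4 | 25 | -13.5 | 182.25 |
| 5 | 40 | 1.5 | 2.25 |
| 6 | 42 | 3.5 | 12.25 |
| 7 | 45 | 6.5 | 42.25 |
| 8 | 48 | 9.5 | 90.25 |
| 9 | 70 | 31.5 | 992.25 |
| 10 | 80 | 41.5 | 1722.25 |
| Total | $\sum x = 385$ | | $\sum d^2 = 5320.50$ |

$$\bar{x} = \frac{\sum x}{n} = \frac{385}{10} = 38.5$$

$$\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}} = \sqrt{\frac{5320.50}{10}} = \sqrt{532.05} = 23.066$$

$$\text{CV} = \frac{\sigma}{\bar{x}} \times 100 = \frac{23.066}{38.5} \times 100$$

CV = 59.9%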
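The arithmetic in this problem can be reproduced with a few lines of Python. This is only a sketch: the function names `population_sd` and `coefficient_of_variation` are illustrative choices, and the divisor is $n$ (not $n - 1$), matching the definition used in these notes.

```python
import math

def population_sd(values):
    """Root mean square deviation from the mean (divisor n, as in the notes)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

def coefficient_of_variation(values):
    """CV in percent: standard deviation divided by the mean, times 100."""
    mean = sum(values) / len(values)
    return population_sd(values) / mean * 100

# Marks of the ten students from Problem 1
marks = [5, 10, 20, 25, 40, 42, 45, 48, 70, 80]
sd = population_sd(marks)
cv = coefficient_of_variation(marks)
print(round(sd, 3), round(cv, 1))
```

Python's standard library offers `statistics.pstdev` for the same divisor-$n$ standard deviation; the explicit version is shown so that each step of the hand calculation is visible.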


2. Compute the standard deviation and coefficient of variation for the following data on the marks of 100 students.

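| Class | $f$ | Class boundaries | Mid-point $x$ | $d = (x - a)/h$ | $fd$ | $fd^2$ |
|---|---|---|---|---|---|---|
| 1-10 | 3 | 0.5-10.5 | 5.5 | -2 | -6 | 12 |
| 11-20 | 16 | 10.5-20.5 | 15.5 | -1 | -16 | 16 |
| 21-30 | 26 | 20.5-30.5 | 25.5 | 0 | 0 | 0 |
| 31-40 | 31 | 30.5-40.5 | 35.5 | 1 | 31 | 31 |
| 41-50 | 16 | 40.5-50.5 | 45.5 | 2 | 32 | 64 |
| 51-60 | 8 | 50.5-60.5 | 55.5 | 3 | 24 | 72 |
| Total | $N = \sum f = 100$ | | | | $\sum fd = 65$ | $\sum fd^2 = 195$ |

Here $a = 25.5$ and $h = 10$.

$$\bar{x} = a + h \, \frac{\sum fd}{N} = 25.5 + 10 \times \frac{65}{100} = 25.5 + 6.5 = 32$$

$$\sigma = h \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} = 10 \sqrt{\frac{195}{100} - \left(\frac{65}{100}\right)^2} = 10 \sqrt{1.5275} = 12.359$$

$$\text{CV} = \frac{\sigma}{\bar{x}} \times 100 = \frac{12.359}{32} \times 100 = 38.62\%$$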
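The step deviation method used here can be sketched in Python as follows. The helper name `step_deviation_stats` is hypothetical, and the assumed mean `a = 25.5` and class width `h = 10` are the values chosen in the worked solution.

```python
import math

def step_deviation_stats(midpoints, freqs, a, h):
    """Mean and SD of grouped data via step deviations d = (x - a) / h."""
    n = sum(freqs)
    d = [(x - a) / h for x in midpoints]
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))
    mean = a + h * sum_fd / n
    sd = h * math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)
    return mean, sd

# Mid-points and frequencies from Problem 2
midpoints = [5.5, 15.5, 25.5, 35.5, 45.5, 55.5]
freqs = [3, 16, 26, 31, 16, 8]

mean, sd = step_deviation_stats(midpoints, freqs, a=25.5, h=10)
cv = sd / mean * 100
print(round(mean, 2), round(sd, 3), round(cv, 2))
```

Any class mid-point may be used as the assumed mean `a`; the result does not depend on that choice, only the size of the intermediate numbers does.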

3. The AM and SD of a set of nine items are 43 and 5 respectively. If an item of value 63 is added, find the mean and SD of the enlarged set.


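$\sum x_i = \bar{x} \times N$

$\sum x_i = 43 \times 9$

$\sum x = 387$ for 9 items

$\sum x = 387 + 63 = 450$ for 10 items

Modified mean $= \frac{450}{10} = 45$

For the 9 items, $\bar{x} = 43$ and $\sigma = 5$:

$$\sigma^2 = \frac{\sum x^2}{n} - \bar{x}^2$$

$$25 = \frac{\sum x^2}{9} - (43)^2$$

$$25 + 1849 = \frac{\sum x^2}{9}$$

$$\frac{\sum x^2}{9} = 1874$$

$\sum x^2 = 1874 \times 9 = 16866$ for 9 items

If 63 is added,

$\sum x^2 = 16866 + (63)^2 = 20835$ for 10 items

Modified $\sigma^2 = \frac{20835}{10} - (45)^2 = 2083.5 - 2025 = 58.5$

$\sigma = \sqrt{58.5} \approx 7.65$ is the modified SD.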
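The same update can be written as a small Python routine that recovers $\sum x$ and $\sum x^2$ from the old mean and SD, appends the new item, and recomputes. The function name `add_item` is an illustrative assumption, and the SD uses divisor $n$ as in the notes.

```python
import math

def add_item(n, mean, sd, new_value):
    """Mean and population SD after appending one value to n items."""
    sum_x = mean * n
    sum_x2 = (sd ** 2 + mean ** 2) * n   # from sigma^2 = sum(x^2)/n - mean^2
    n_new = n + 1
    sum_x += new_value
    sum_x2 += new_value ** 2
    new_mean = sum_x / n_new
    new_var = sum_x2 / n_new - new_mean ** 2
    return new_mean, math.sqrt(new_var)

# Nine items with mean 43 and SD 5, then an item of value 63 is added
new_mean, new_sd = add_item(9, 43, 5, 63)
print(new_mean, round(new_sd, 2))
```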

4. The mean of 5 observations is 4.4 and the variance is 8.24. If three of the five observations are 1, 2 and 6, find the values of the other two observations.


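We know that $\bar{x} = \frac{\sum x}{n}$, so

$\sum x = 4.4 \times 5 = 22$

$$\sigma^2 = \frac{\sum x^2}{n} - \bar{x}^2$$

$$8.24 = \frac{\sum x^2}{5} - (4.4)^2$$

$$8.24 = \frac{\sum x^2}{5} - 19.36$$

$$8.24 + 19.36 = \frac{\sum x^2}{5}$$

$\sum x^2 = 27.6 \times 5 = 138$

Let the two unknown observations be $x_1$ and $x_2$.

$\sum x^2 = 1^2 + 2^2 + 6^2 + x_1^2 + x_2^2$

$138 = 1 + 4 + 36 + x_1^2 + x_2^2$

$x_1^2 + x_2^2 = 97$ ---- (1)

$\sum x = 1 + 2 + 6 + x_1 + x_2$

$22 = 9 + x_1 + x_2$

$x_1 + x_2 = 13$ ---- (2)

From (2), $x_2 = 13 - x_1$. Substituting in (1):

$x_1^2 + (13 - x_1)^2 = 97$

$x_1^2 + 169 + x_1^2 - 26x_1 = 97$

$2x_1^2 - 26x_1 + 72 = 0$

$x_1^2 - 13x_1 + 36 = 0$

$$x_1 = \frac{13 \pm \sqrt{(13)^2 - 4(1)(36)}}{2} = \frac{13 \pm \sqrt{169 - 144}}{2} = \frac{13 \pm 5}{2}$$

$x_1 = 6.5 \pm 2.5$

$x_1 = 9$ or $x_1 = 4$

Taking $x_1 = 9$ gives $x_2 = 4$ (and vice versa), so the other two observations are 9 and 4.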
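The elimination above generalises: once $x_1 + x_2$ and $x_1^2 + x_2^2$ are known, the two values are the roots of a quadratic. A minimal Python sketch, in which the function name `missing_pair` and its argument names are assumptions made for illustration:

```python
import math

def missing_pair(n, mean, variance, known):
    """Recover two unknown observations from the mean, variance and the rest."""
    total = mean * n                          # sum of all observations
    total_sq = (variance + mean ** 2) * n     # sum of squares of all observations
    s = total - sum(known)                    # x1 + x2
    q = total_sq - sum(k * k for k in known)  # x1^2 + x2^2
    # x1, x2 are roots of t^2 - s*t + p = 0 with p = (s^2 - q) / 2,
    # whose discriminant s^2 - 4p simplifies to 2q - s^2.
    root = math.sqrt(2 * q - s * s)
    return (s + root) / 2, (s - root) / 2

x1, x2 = missing_pair(5, 4.4, 8.24, [1, 2, 6])
print(round(x1, 6), round(x2, 6))
```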


5. The mean and S.D. of the frequency distribution of a continuous random variable x are 40.604 and 7.92 respectively. The distribution, after a change of origin and scale, is given below. Determine the actual class intervals.

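| d | -3 | -2 | -1 | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|---|---|---|
| f | 3 | 15 | 45 | 57 | 50 | 36 | 25 | 9 |

The working table is given below; the mid-value (MV) and class interval (CI) columns are filled in once $a$ and $h$ have been found.

| d | f | fd | fd² | MV | CI |
|---|---|---|---|---|---|
| -3 | 3 | -9 | 27 | 22.5 | 20-25 |
| -2 | 15 | -30 | 60 | 27.5 | 25-30 |
| -1 | 45 | -45 | 45 | 32.5 | 30-35 |
| 0 | 57 | 0 | 0 | 37.5 | 35-40 |
| 1 | 50 | 50 | 50 | 42.5 | 40-45 |
| 2 | 36 | 72 | 144 | 47.5 | 45-50 |
| 3 | 25 | 75 | 225 | 52.5 | 50-55 |
| 4 | 9 | 36 | 144 | 57.5 | 55-60 |
| Total | N = 240 | $\sum fd = 149$ | $\sum fd^2 = 695$ | | |

$$\bar{x} = a + h \, \frac{\sum fd}{N}$$

$40.604 = a + \frac{149}{240} h$

$40.604 = a + 0.62h$ ----- (1)

$$\sigma = h \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2}$$

$$7.92 = h \sqrt{\frac{695}{240} - \left(\frac{149}{240}\right)^2}$$

$$7.92 = h \sqrt{2.8958 - 0.3854}$$

$7.92 = h \times 1.584$

$h = 4.998$

$h \approx 5$

Put $h = 5$ in equation (1)

$40.604 = a + 0.62 \times 5$

$a = 37.5$

The mid-values are $a + dh$, and each class interval extends $h/2 = 2.5$ on either side of its mid-value.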
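The recovery of $h$ and $a$ from the coded distribution can be sketched in Python. The rounding of `h` to a whole number and of `a` to one decimal place is an assumption that mirrors the rounding in the worked solution; the variable names are illustrative.

```python
import math

d = [-3, -2, -1, 0, 1, 2, 3, 4]
f = [3, 15, 45, 57, 50, 36, 25, 9]
mean, sd = 40.604, 7.92          # given mean and SD of the original variable

n = sum(f)
mean_d = sum(fi * di for fi, di in zip(f, d)) / n
sd_d = math.sqrt(sum(fi * di * di for fi, di in zip(f, d)) / n - mean_d ** 2)

# sd = h * sd_d and mean = a + h * mean_d
h = round(sd / sd_d)             # class width, rounded to a whole number
a = round(mean - mean_d * h, 1)  # origin (assumed mean), rounded to 1 decimal

# Each class runs h/2 on either side of its mid-value a + d*h
intervals = [(a + di * h - h / 2, a + di * h + h / 2) for di in d]
for di, (lo, hi) in zip(d, intervals):
    print(di, a + di * h, f"{lo:g}-{hi:g}")
```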

Combined Standard Deviation


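Suppose we have different samples of sizes $n_1, n_2, n_3, \ldots$ having means $\bar{x}_1, \bar{x}_2, \bar{x}_3, \ldots$ and standard deviations $\sigma_1, \sigma_2, \sigma_3, \ldots$; the combined standard deviation $\sigma$ can then be computed by the following formula, written here for two samples.

$$\sigma^2 (n_1 + n_2) = n_1 (\sigma_1^2 + d_1^2) + n_2 (\sigma_2^2 + d_2^2)$$

$$d_1 = \bar{x}_1 - \bar{x}$$

$$d_2 = \bar{x}_2 - \bar{x}$$

where $\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$ is the combined mean.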

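1. The means of two samples of sizes 50 and 100 are 54.1 and 50.3 respectively, and their standard deviations are 8 and 7 respectively. Obtain the SD for the combined group.

$n_1 = 50$, $\bar{x}_1 = 54.1$, $\sigma_1 = 8$

$n_2 = 100$, $\bar{x}_2 = 50.3$, $\sigma_2 = 7$

$$\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2} = \frac{50(54.1) + 100(50.3)}{150} = \frac{7735}{150} = 51.57$$

$$\sigma^2 (n_1 + n_2) = n_1 (\sigma_1^2 + d_1^2) + n_2 (\sigma_2^2 + d_2^2)$$

$d_1 = \bar{x}_1 - \bar{x} = 54.1 - 51.57$

$d_1 = 2.53$, $d_1^2 = 6.40$

$d_2 = \bar{x}_2 - \bar{x} = 50.3 - 51.57$

$d_2 = -1.27$, $d_2^2 = 1.61$

$\sigma^2 \times 150 = 50 (8^2 + 6.40) + 100 (7^2 + 1.61)$

Dividing both sides by 50:

$3\sigma^2 = (64 + 6.40) + 2 (49 + 1.61)$

$3\sigma^2 = 70.40 + 2 \times 50.61 = 171.62$

$\sigma^2 = 57.21$

$\sigma = 7.56$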


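2. The mean wage is Rs. 75 per day and the SD is Rs. 5 per day for a group of 1000 workers; the corresponding figures are Rs. 60 and Rs. 4.5 for another group of 1500 workers. Find the mean and standard deviation for the entire group.

We have by data, $\bar{x}_1 = 75$, $\sigma_1 = 5$, $n_1 = 1000$

$\bar{x}_2 = 60$, $\sigma_2 = 4.5$, $n_2 = 1500$

Let $\bar{x}$ and $\sigma$ be the mean and SD of the entire group.

Consider

$$\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$$

i.e.,

$$\bar{x} = \frac{1000(75) + 1500(60)}{1000 + 1500} = \frac{165000}{2500} = 66$$

Also we have,

$$(n_1 + n_2) \sigma^2 = n_1 (\sigma_1^2 + d_1^2) + n_2 (\sigma_2^2 + d_2^2),$$

where $d_1 = \bar{x}_1 - \bar{x} = 75 - 66 = 9$; $d_2 = \bar{x}_2 - \bar{x} = 60 - 66 = -6$

$(1000 + 1500) \sigma^2 = 1000 (5^2 + 9^2) + 1500 (4.5^2 + (-6)^2)$

$2500 \sigma^2 = 106000 + 84375 = 190375$

$\sigma^2 = 76.15$ or $\sigma = 8.73$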
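The combined-group formula extends directly to any number of groups by summing $n_i(\sigma_i^2 + d_i^2)$ over all of them. A minimal Python sketch, where the function name `combine` and the tuple layout `(n, mean, sd)` are assumptions made for illustration; it is applied to both combined-group problems above.

```python
import math

def combine(groups):
    """groups: list of (n, mean, sd) tuples; returns combined (mean, sd)."""
    total_n = sum(n for n, _, _ in groups)
    mean = sum(n * m for n, m, _ in groups) / total_n
    # Each group contributes n * (sd^2 + d^2), with d = group mean - combined mean
    var = sum(n * (s ** 2 + (m - mean) ** 2) for n, m, s in groups) / total_n
    return mean, math.sqrt(var)

# Problem 2: wages of 1000 and 1500 workers
wage_mean, wage_sd = combine([(1000, 75, 5), (1500, 60, 4.5)])

# Problem 1: samples of sizes 50 and 100
sample_mean, sample_sd = combine([(50, 54.1, 8), (100, 50.3, 7)])

print(round(wage_mean, 2), round(wage_sd, 2))
print(round(sample_mean, 2), round(sample_sd, 2))
```

Because the code carries the combined mean at full precision, intermediate values such as $d_1^2$ may differ from hand-rounded figures in the second decimal place, while the final SD agrees to two decimals.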


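3. The arithmetic means of the runs scored by three batsmen A, B and C are 50, 48 and 12 respectively. The SDs of their runs are 15, 12 and 2 respectively. Who is the most consistent of the three batsmen? If one of the three is to be selected, who should it be?

| | A | B | C |
|---|---|---|---|
| AM ($\bar{x}$) | 50 | 48 | 12 |
| SD ($\sigma$) | 15 | 12 | 2 |

$$\text{CV}_A = \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{15}{50} \times 100$$

$\text{CV}_A = 30\%$

$$\text{CV}_B = \frac{\sigma_B}{\bar{x}_B} \times 100 = \frac{12}{48} \times 100$$

$\text{CV}_B = 25\%$

$$\text{CV}_C = \frac{\sigma_C}{\bar{x}_C} \times 100 = \frac{2}{12} \times 100$$

$\text{CV}_C = 16.67\%$

Evaluation Criteria

1. A lower CV indicates a more consistent player; hence the most consistent player is C.

2. The highest run scorer is A, with a mean of 50.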


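4. The coefficients of variation of two series are 75% and 90%, with SDs 15 and 18 respectively. Compute their means.

$\text{CV}_A = 75\%$

$\text{CV}_B = 90\%$

$\sigma_A = 15$

$\sigma_B = 18$

$$\text{CV} = \frac{\sigma}{\bar{x}} \times 100$$

$$75 = \frac{15}{\bar{x}_A} \times 100, \qquad 90 = \frac{18}{\bar{x}_B} \times 100$$

$\bar{x}_A = 20$, $\bar{x}_B = 20$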

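5. Goals scored by two teams A and B in a football season are shown below. By calculating the CV of each, find which team may be considered more consistent.

| No. of goals ($x$) | Matches, Team A ($f$) | Matches, Team B ($f$) | $fx$, Team A | $fx$, Team B | $fx^2$, Team A | $fx^2$, Team B |
|---|---|---|---|---|---|---|
| 0 | 27 | 17 | 0 | 0 | 0 | 0 |
| 1 | 9 | 9 | 9 | 9 | 9 | 9 |
| 2 | 8 | 6 | 16 | 12 | 32 | 24 |
| 3 | 5 | 5 | 15 | 15 | 45 | 45 |
| 4 | 4 | 3 | 16 | 12 | 64 | 48 |
| Total | $\sum f = 53$ | $\sum f = 40$ | $\sum fx = 56$ | $\sum fx = 48$ | $\sum fx^2 = 150$ | $\sum fx^2 = 126$ |

$$\bar{x}_A = \frac{\sum fx}{N} = \frac{56}{53} = 1.0566$$

$$\bar{x}_B = \frac{\sum fx}{N} = \frac{48}{40} = 1.2$$

$$\sigma_A = \sqrt{\frac{\sum fx^2}{N} - \bar{x}_A^2} = \sqrt{\frac{150}{53} - (1.0566)^2} = \sqrt{1.7138} = 1.3091$$

$$\sigma_B = \sqrt{\frac{\sum fx^2}{N} - \bar{x}_B^2} = \sqrt{\frac{126}{40} - (1.2)^2} = \sqrt{1.71} = 1.3077$$

$$\text{CV}_A = \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{1.3091}{1.0566} \times 100 = 123.9\%$$

$$\text{CV}_B = \frac{\sigma_B}{\bar{x}_B} \times 100 = \frac{1.3077}{1.2} \times 100 = 109\%$$

Since $\text{CV}_B < \text{CV}_A$, team B is the more consistent team.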
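The comparison of the two teams can be reproduced with a short Python sketch that computes the CV of a discrete frequency distribution. The function name `freq_cv` is a hypothetical helper, and the SD uses the divisor $N = \sum f$ as in the notes.

```python
import math

def freq_cv(values, freqs):
    """CV (percent) of a discrete frequency distribution, divisor N = sum(f)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    var = sum(f * x * x for x, f in zip(values, freqs)) / n - mean ** 2
    return math.sqrt(var) / mean * 100

goals = [0, 1, 2, 3, 4]
cv_a = freq_cv(goals, [27, 9, 8, 5, 4])   # Team A: matches per goal count
cv_b = freq_cv(goals, [17, 9, 6, 5, 3])   # Team B

# The team with the smaller CV is the more consistent one
more_consistent = "A" if cv_a < cv_b else "B"
print(round(cv_a, 1), round(cv_b, 1), more_consistent)
```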

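6. The prices of shares A and B (x and y respectively) are given below. State which share is more stable in value.

| Price A ($x$) | $x - \bar{x}$ ($\bar{x} = 53$) | $(x - \bar{x})^2$ | Price B ($y$) | $y - \bar{y}$ ($\bar{y} = 105$) | $(y - \bar{y})^2$ |
|---|---|---|---|---|---|
| 55 | 2 | 4 | 108 | 3 | 9 |
| 54 | 1 | 1 | 107 | 2 | 4 |
| 52 | -1 | 1 | 105 | 0 | 0 |
| 53 | 0 | 0 | 105 | 0 | 0 |
| 56 | 3 | 9 | 106 | 1 | 1 |
| 58 | 5 | 25 | 107 | 2 | 4 |
| 52 | -1 | 1 | 104 | -1 | 1 |
| 50 | -3 | 9 | 103 | -2 | 4 |
| 51 | -2 | 4 | 104 | -1 | 1 |
| 49 | -4 | 16 | 101 | -4 | 16 |
| $\sum x = 530$ | | $\sum (x - \bar{x})^2 = 70$ | $\sum y = 1050$ | | $\sum (y - \bar{y})^2 = 40$ |

$$\bar{x}_A = \frac{\sum x}{n} = \frac{530}{10} = 53$$

$$\bar{x}_B = \frac{\sum y}{n} = \frac{1050}{10} = 105$$

$$\sigma_A = \sqrt{\frac{70}{10}} = 2.646, \qquad \sigma_B = \sqrt{\frac{40}{10}} = 2$$

$$\text{CV}_A = \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{2.646}{53} \times 100 = 4.99\%$$

$$\text{CV}_B = \frac{\sigma_B}{\bar{x}_B} \times 100 = \frac{2}{105} \times 100 = 1.905\%$$

Since $\text{CV}_B$ is smaller, share B is more stable.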

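7. A student, while computing the coefficient of variation, obtained the mean and SD of 100 observations as 40 and 5.1 respectively. It was later discovered that he had wrongly copied an observation as 50 instead of 40. Calculate the correct coefficient of variation.

>> We have $\bar{x} = \frac{\sum x}{n}$, i.e. $40 = \frac{\sum x}{100}$

$\sum x$ (incorrect) $= 4000$

Now correct $\sum x = 4000 - 50 + 40 = 3990$

Correct $\bar{x} = \frac{3990}{100} = 39.9$

Let us consider $\sigma^2 = \frac{\sum x^2}{n} - \bar{x}^2$

i.e. $(5.1)^2 = \frac{\sum x^2}{100} - (40)^2$, so $\frac{\sum x^2}{100} = 26.01 + 1600 = 1626.01$

$\sum x^2$ (incorrect) $= 100 \times 1626.01 = 162601$

Now correct $\sum x^2 = 162601 - (50)^2 + (40)^2 = 161701$

Correct $\sigma^2 = \frac{\text{correct } \sum x^2}{n} - (\text{correct } \bar{x})^2$

i.e., correct $\sigma^2 = \frac{161701}{100} - (39.9)^2 = 1617.01 - 1592.01 = 25$, so correct $\sigma = 5$

Now correct coefficient of variation $= \frac{\sigma}{\bar{x}} \times 100 = \frac{5}{39.9} \times 100$

Hence correct C.V. = 12.53%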
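The correction procedure (recover the totals, swap the wrong value for the right one, recompute) can be sketched in Python. The function name `correct_observation` and its keyword arguments are illustrative assumptions, not part of the notes.

```python
import math

def correct_observation(n, mean, sd, wrong, right):
    """Mean and population SD after replacing one miscopied observation."""
    sum_x = n * mean - wrong + right
    # sum of squares recovered from sigma^2 = sum(x^2)/n - mean^2
    sum_x2 = n * (sd ** 2 + mean ** 2) - wrong ** 2 + right ** 2
    new_mean = sum_x / n
    new_sd = math.sqrt(sum_x2 / n - new_mean ** 2)
    return new_mean, new_sd

# 100 observations, mean 40, SD 5.1; the value 40 was copied as 50
mean, sd = correct_observation(100, 40, 5.1, wrong=50, right=40)
cv = sd / mean * 100
print(round(mean, 2), round(sd, 2), round(cv, 2))
```

Omitting an observation, as in the next problem, works the same way except that the value is subtracted from both totals and $n$ is reduced by one.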


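8. The mean and SD of 21 observations are 30 and 5 respectively. It was subsequently noted that one of the observations, 10, was incorrect. Omit it and determine the mean and SD of the rest.

>> We have $\bar{x} = \frac{\sum x}{n}$, i.e. $30 = \frac{\sum x}{21}$

Incorrect $\sum x = 630$

Now omitting the incorrect value 10,

New $\sum x = 630 - 10 = 620$

$n = 21 - 1 = 20$

New $\bar{x} = \frac{620}{20} = 31$

Next consider $\sigma^2 = \frac{\sum x^2}{n} - \bar{x}^2$

i.e. $25 = \frac{\sum x^2}{21} - (30)^2$, so incorrect $\sum x^2 = 21 \times 925 = 19425$

Again omitting the incorrect value 10,

New $\sum x^2 = 19425 - (10)^2 = 19325$, $n = 20$

Hence new $\sigma^2 = \frac{19325}{20} - (31)^2 = 966.25 - 961 = 5.25$

New $\sigma = \sqrt{5.25} = 2.29$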

9. The mean of 200 items was 50. Later it was discovered that two items had been misread as 92 and 8 instead of 192 and 88. Find the correct mean.

>> We have $\bar{x} = \frac{\sum x}{n}$, i.e. $50 = \frac{\sum x}{200}$

Incorrect $\sum x = 10000$

Correct $\sum x = 10000 - 92 - 8 + 192 + 88 = 10180$

Correct mean $= \frac{10180}{200} = 50.9$


10. Find the missing frequencies in the following data given that the median is 137.2.

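| Class | 100-110 | 110-120 | 120-130 | 130-140 | 140-150 | 150-160 | 160-170 | 170-180 |
|---|---|---|---|---|---|---|---|---|
| Frequency | 15 | 44 | 133 | $f_1$ | 125 | $f_2$ | 35 | 16 |

N = 600

>> We prepare the table with the column of cumulative frequencies and use the formula for the median.

| Class | Frequency | cf | |
|---|---|---|---|
| 100-110 | 15 | 15 | |
| 110-120 | 44 | 59 | |
| 120-130 | 133 | 192 | |
| 130-140 | $f_1$ | $192 + f_1$ | Median class |
| 140-150 | 125 | $317 + f_1$ | |
| 150-160 | $f_2$ | $317 + f_1 + f_2$ | |
| 160-170 | 35 | $352 + f_1 + f_2$ | |
| 170-180 | 16 | $368 + f_1 + f_2$ | |
| | N = 600 | | |

$$\text{Median} = l + \frac{h}{f}\left(\frac{N}{2} - c\right)$$

We can take the median class as 130-140 since the median is given to be 137.2.

Here $l = 130$, $\frac{N}{2} = 300$, $h = 10$, $f = f_1$, $c = 192$

$$137.2 = 130 + \frac{10}{f_1}(300 - 192)$$

i.e., $137.2 - 130 = \frac{1080}{f_1}$, i.e., $7.2 f_1 = 1080$ or $f_1 = 150$

But the last cumulative frequency must be equal to N = 600,

i.e. $368 + f_1 + f_2 = 600$

$368 + 150 + f_2 = 600$, so $f_2 = 82$

Thus $f_1 = 150$, $f_2 = 82$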
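Solving the median formula for the unknown frequency of the median class can be sketched in Python as below. The function name `missing_frequencies` and its parameter names are hypothetical; `known_total` is the sum of the six known frequencies (368), and the results are rounded to whole numbers on the assumption that frequencies are counts.

```python
def missing_frequencies(median, lower, width, cf_before, known_total, n):
    """Solve Median = l + (h / f1) * (N/2 - c) for f1, then f2 from the total."""
    f1 = width * (n / 2 - cf_before) / (median - lower)
    f2 = n - known_total - f1
    return round(f1), round(f2)

f1, f2 = missing_frequencies(
    median=137.2,     # given median
    lower=130,        # lower limit l of the median class 130-140
    width=10,         # class width h
    cf_before=192,    # cumulative frequency c up to 120-130
    known_total=368,  # 15 + 44 + 133 + 125 + 35 + 16
    n=600,
)
print(f1, f2)
```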


Relationship between various measures of dispersion

For a normal (or approximately normal) distribution, the following relationships hold, exactly or approximately, among the various measures of dispersion:

1. Mean $\pm$ QD covers 50% of the observations of the distribution

2. Mean $\pm$ MD covers 57.5% of the observations

3. Mean $\pm 1\sigma$ includes 68.27% of the observations

4. Mean $\pm 2\sigma$ includes 95.45% of the observations

5. Mean $\pm 3\sigma$ includes 99.73% of the observations

6. QD $= \frac{2}{3}\sigma$

7. MD $= \frac{4}{5}\sigma$

8. QD $= \frac{5}{6}$ MD

9. Combining the results, we get 3 QD = 2 SD and 5 MD = 4 SD, the latter also being equal to 6 QD.

10. Range = 6 times SD.
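The coverage percentages in this list can be checked numerically for a normal distribution using the error function from Python's standard library. This is a sketch under the assumption of exact normality; the constants `md_over_sd` (equal to $\sqrt{2/\pi}$) and `qd_over_sd` (the tabulated value 0.6745) are standard normal-distribution values that are not stated in the notes.

```python
import math

def coverage(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

md_over_sd = math.sqrt(2 / math.pi)  # mean deviation / SD for a normal law
qd_over_sd = 0.6745                  # quartile deviation / SD (table value)

rows = [("QD", qd_over_sd), ("MD", md_over_sd), ("1 SD", 1), ("2 SD", 2), ("3 SD", 3)]
for label, k in rows:
    print(f"mean +/- {label}: {100 * coverage(k):.2f}%")

# The ratios 2/3 and 4/5 quoted in the list are convenient roundings
print(round(qd_over_sd, 3), round(md_over_sd, 3))
```

The exact ratios come out near 0.674 and 0.798, which is why the rules QD = (2/3) SD and MD = (4/5) SD are approximations rather than identities.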

