Why statisticians were created Measure of dispersion FETP India.


Page 1: Why statisticians were created Measure of dispersion FETP India.

Why statisticians were created

Measure of dispersion

FETP India

Page 2: Why statisticians were created Measure of dispersion FETP India.

Competency to be gained from this lecture

Calculate a measure of variation that is adapted to the sample studied

Page 3: Why statisticians were created Measure of dispersion FETP India.

Key issues

• Range

• Inter-quartile range

• Standard deviation

Page 4: Why statisticians were created Measure of dispersion FETP India.

Measures of spread, dispersion or variability

• The measure of central tendency provides important information about the distribution

• However, it does not provide information concerning the relative position of other data points in the sample

• Measures of spread, dispersion or variability are needed to describe this

Range

Page 5: Why statisticians were created Measure of dispersion FETP India.

Why one needs to measure variability

Marks obtained by three students

Student     Biology   Physics   Chemistry
1           200       199       100
2           200       200       200
3           200       201       300
Mean        200       200       200
Variation   Nil       Slight    Substantial
Range       0         2         200

Range

Page 6: Why statisticians were created Measure of dispersion FETP India.

Every concept comes from a failure of the previous concept

• Mean is distorted by outliers

• Median takes care of the outliers

Range

Page 7: Why statisticians were created Measure of dispersion FETP India.

The range: A simple measure of dispersion

• Take the difference between the lowest value and the highest value

• Limitations:
  The range says nothing about the values between the extreme values
  The range is not stable: as the sample size increases, the range can change dramatically
  The range is not amenable to further statistical treatment

Range

Page 8: Why statisticians were created Measure of dispersion FETP India.

Example of a range

• Take a sample of 10 heights: 70, 95, 100, 103, 105, 107, 110, 112, 115 and 140 cm

• Lowest (minimum) value: 70 cm

• Highest (maximum) value: 140 cm

• Range: 140 - 70 = 70 cm
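The calculation above can be reproduced in a few lines of Python. This is only a minimal sketch; the variable names are illustrative and not part of the lecture.

```python
# Heights in cm, taken from the example on this slide
heights_cm = [70, 95, 100, 103, 105, 107, 110, 112, 115, 140]

lowest = min(heights_cm)        # minimum value
highest = max(heights_cm)       # maximum value
value_range = highest - lowest  # range = maximum - minimum

print(f"Minimum: {lowest} cm, maximum: {highest} cm, range: {value_range} cm")
```

Only the two extreme values enter the result, which is why the range says nothing about the eight heights in between.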

Range

Page 9: Why statisticians were created Measure of dispersion FETP India.

Three different distributions with the same range (35 kg)

[Figure: three dot plots on a common axis from 30 to 70 kg, labelled Even, Uneven and Clumped. The points are spread differently in each plot, but the range is the same.]

Range

Page 10: Why statisticians were created Measure of dispersion FETP India.

The range increases with the sample size

Set                       Values                     Min   Max   Range
Initial set (5 values)    30 40 53 58 65             30    65    35
New set (3 more values)   30 40 53 58 65 48 51 64    30    65    35
New set (3 more values)   30 40 53 58 65 48 51 70    30    70    40
New set (3 more values)   30 40 53 58 65 28 51 70    28    70    42

Two ranges based on different sample sizes are not comparable

Range
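The effect of adding observations can be checked with a short Python sketch. It uses the four sets of values from this slide; the helper name `value_range` is an illustrative choice, not something defined in the lecture.

```python
def value_range(values):
    """Range = highest value minus lowest value."""
    return max(values) - min(values)

initial = [30, 40, 53, 58, 65]

# Three alternative groups of 3 extra values, as on the slide
additions = [[48, 51, 64], [48, 51, 70], [28, 51, 70]]

ranges = [value_range(initial)]
for extra in additions:
    ranges.append(value_range(initial + extra))

print(ranges)
```

Adding values can leave the range unchanged or widen it, but it can never narrow it, so a range from a larger sample tends to be larger.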

Page 11: Why statisticians were created Measure of dispersion FETP India.

Percentiles and quartiles

• Percentiles
  Those values in a series of observations, arranged in ascending order of magnitude, which divide the distribution into 100 equal parts
  The median is the 50th percentile

• Quartiles
  The values which divide a series of observations, arranged in ascending order, into 4 equal parts
  The median is the 2nd quartile

Inter-quartile range

Page 12: Why statisticians were created Measure of dispersion FETP India.

Sorting the data in increasing order

First 25% | Q1 | 2nd 25% | Q2 (Median) | 3rd 25% | Q3 | 4th 25%

• Median
  Middle value (if n is odd)
  Average of the two middle values (if n is even)
  A measure of the “centre” of the data

• Quartiles divide the set of ordered values into 4 equal parts

Page 13: Why statisticians were created Measure of dispersion FETP India.

The inter-quartile range

• The central portion of the distribution

• Calculated as the difference between the third quartile and the first quartile

• Includes about one-half of the observations

• Leaves out one quarter of the observations at each end

• Limitations:
  Only takes into account two values
  Not a mathematical concept upon which theories can be developed

Inter-quartile range

Page 14: Why statisticians were created Measure of dispersion FETP India.

The inter-quartile range: Example

• Values: 29, 31, 24, 29, 30, 25

• Arranged in increasing order: 24, 25, 29, 29, 30, 31

• Q1: value at position (n+1)/4 = 7/4 = 1.75, so Q1 = 24 + 0.75 x (25 - 24) = 24.75

• Q3: value at position 3(n+1)/4 = 21/4 = 5.25, so Q3 = 30 + 0.25 x (31 - 30) = 30.25

• Inter-quartile range = Q3 - Q1 = 30.25 - 24.75 = 5.5

Inter-quartile range
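The (n+1) position rule used in this example can be sketched in Python as follows. The helper `value_at_position` is a hypothetical name introduced for illustration, and it assumes linear interpolation between the two neighbouring ordered values. Other quartile conventions exist and can give slightly different results.

```python
def value_at_position(sorted_values, position):
    """Value at a 1-based, possibly fractional, position in sorted data.

    Interpolates linearly between the two neighbouring values.
    """
    whole = int(position)      # rank of the value just below the position
    fraction = position - whole
    if whole < 1:
        return sorted_values[0]
    if whole >= len(sorted_values):
        return sorted_values[-1]
    below = sorted_values[whole - 1]
    above = sorted_values[whole]
    return below + fraction * (above - below)

values = sorted([29, 31, 24, 29, 30, 25])
n = len(values)

q1 = value_at_position(values, (n + 1) / 4)      # position 1.75
q3 = value_at_position(values, 3 * (n + 1) / 4)  # position 5.25
iqr = q3 - q1

print(f"Q1 = {q1}, Q3 = {q3}, inter-quartile range = {iqr}")
```

Python's standard `statistics.quantiles` function, with its default "exclusive" method, is based on the same (n+1) positions, so it should agree with this sketch for these data.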

Page 15: Why statisticians were created Measure of dispersion FETP India.

Graphic representation of the inter-quartile range

Inter-quartile range

Page 16: Why statisticians were created Measure of dispersion FETP India.

The mean deviation from the mean

• Calculate the mean of all values

• Calculate the difference between each value and the mean

• Calculate the average difference between each value and the mean

• Limitations:
  Negative and positive deviations cancel each other out, so the average is 0 even when there is substantial variation

Standard deviation

Page 17: Why statisticians were created Measure of dispersion FETP India.

The mean deviation from the mean:Example

Data: 10 20 30 40 50 60 70
Mean = 280/7 = 40
Deviations from the mean: 10-40, 20-40, ... = -30 -20 -10 0 10 20 30
Sum = 0

Standard deviation

Page 18: Why statisticians were created Measure of dispersion FETP India.

Absolute mean deviation from the mean

• Calculate the mean of all values

• Calculate the difference between each value and the mean and take the absolute value

• Calculate the average of these absolute differences

• Limitations:
  Absolute values are awkward to handle mathematically

Standard deviation

Page 19: Why statisticians were created Measure of dispersion FETP India.

Absolute mean deviation from the mean: Example

Data: 10 20 30 40 50 60 70
Mean = 280/7 = 40
Deviations from the mean: 10-40, 20-40, ... = -30 -20 -10 0 10 20 30
Absolute values: 30 20 10 0 10 20 30
Absolute mean deviation from the mean = 120/7 = 17.1

Standard deviation
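The two examples (signed and absolute deviations) can be compared side by side in Python. This is a minimal sketch using the seven values from the slides; the variable names are illustrative.

```python
data = [10, 20, 30, 40, 50, 60, 70]
n = len(data)

mean = sum(data) / n                      # 280 / 7 = 40
deviations = [x - mean for x in data]     # -30, -20, ..., 30

# Signed deviations cancel out, so their average is always 0
mean_deviation = sum(deviations) / n

# Taking absolute values removes the cancellation
absolute_mean_deviation = sum(abs(d) for d in deviations) / n  # 120 / 7

print(f"Mean deviation: {mean_deviation}")
print(f"Absolute mean deviation: {absolute_mean_deviation:.1f}")
```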

Page 20: Why statisticians were created Measure of dispersion FETP India.

Calculating the variance (1/2)

1. Calculate the mean as a measure of central location (MEAN)

2. Calculate the difference between each observation and the mean (DEVIATION)

3. Square the differences (SQUARED DEVIATION)
   • Negative and positive deviations will not cancel each other out
   • Values further from the mean have a bigger impact

Standard deviation

Page 21: Why statisticians were created Measure of dispersion FETP India.

Calculating the variance (2/2)

4. Sum up these squared deviations (SUM OF THE SQUARED DEVIATIONS)

5. Divide this SUM OF THE SQUARED DEVIATIONS by the total number of observations minus 1 (n-1) to give the VARIANCE

• Why divide by n - 1?
  Adjustment for the fact that the mean is just an estimate of the true population mean
  Tends to make the variance larger than it would be when dividing by n
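The five steps can be followed one by one in Python. The sketch below reuses the seven values from the mean deviation example (10 to 70) purely as an illustration, and cross-checks the result against `statistics.variance` from the standard library, which also divides by n - 1.

```python
import statistics

data = [10, 20, 30, 40, 50, 60, 70]
n = len(data)

mean = sum(data) / n                                  # 1. MEAN
deviations = [x - mean for x in data]                 # 2. DEVIATION
squared_deviations = [d ** 2 for d in deviations]     # 3. SQUARED DEVIATION
sum_of_squares = sum(squared_deviations)              # 4. SUM OF THE SQUARED DEVIATIONS
variance = sum_of_squares / (n - 1)                   # 5. VARIANCE (divide by n - 1)

# For comparison only: dividing by n gives a smaller value
variance_divided_by_n = sum_of_squares / n

print(f"Sum of squared deviations: {sum_of_squares}")
print(f"Variance (n - 1): {variance:.2f}")
print(f"Divided by n instead: {variance_divided_by_n:.2f}")
print(f"statistics.variance: {statistics.variance(data):.2f}")
```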

Standard deviation

Page 22: Why statisticians were created Measure of dispersion FETP India.

The standard deviation

• Take the square root of the variance

• Limitations:
  Sensitive to outliers

$$SD = \sqrt{\frac{n \sum x_i^2 - \left(\sum x_i\right)^2}{n(n-1)}}$$

Standard deviation

Page 23: Why statisticians were created Measure of dispersion FETP India.

Example

Patient   No. of X-rays   Deviation from mean   Absolute deviation   Squared deviation   Square of observation
A         10              10 - 9 = 1            1                    1² = 1              10² = 100
B         8               8 - 9 = -1            1                    (-1)² = 1           8² = 64
C         6               6 - 9 = -3            3                    (-3)² = 9           6² = 36
D         12              12 - 9 = 3            3                    3² = 9              12² = 144
E         9               9 - 9 = 0             0                    0² = 0              9² = 81
Total     45              0                     8                    20                  425

Mean = 45/5 = 9 x-rays
Absolute mean deviation = 8/5 = 1.6 x-rays

Variance = 20/(5-1) = 20/4 = 5 x-rays²
Standard deviation = √5 = 2.2 x-rays
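The worked example can be verified in Python. The sketch below computes the variance in two ways: from the squared deviations, and from a commonly used computational form, SD = sqrt((n Σx² - (Σx)²) / (n(n - 1))), which needs only the column totals. The variable names are illustrative.

```python
import math

# Number of X-rays per patient, from the table above
xrays = {"A": 10, "B": 8, "C": 6, "D": 12, "E": 9}
values = list(xrays.values())
n = len(values)

mean = sum(values) / n                                    # 45 / 5 = 9
absolute_mean_deviation = sum(abs(x - mean) for x in values) / n   # 8 / 5

# Definition: sum of squared deviations divided by n - 1
sum_squared_deviations = sum((x - mean) ** 2 for x in values)      # 20
variance = sum_squared_deviations / (n - 1)                        # 20 / 4

# Computational form using only the totals of x and x squared
sum_x = sum(values)                    # 45
sum_x2 = sum(x ** 2 for x in values)   # 425
variance_from_totals = (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

sd = math.sqrt(variance)

print(f"Mean = {mean}, absolute mean deviation = {absolute_mean_deviation}")
print(f"Variance = {variance} (from totals: {variance_from_totals})")
print(f"Standard deviation = {sd:.1f}")
```

Both routes give the same variance, which is why the table carries a "Square of observation" column alongside the deviations.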

Page 24: Why statisticians were created Measure of dispersion FETP India.

Properties of the standard deviation

• Unaffected if the same constant is added to (or subtracted from) every observation

• If each value is multiplied (or divided) by a constant, the standard deviation is also multiplied (or divided) by the same constant
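Both properties can be checked numerically. The sketch below reuses the X-ray counts from the earlier example; the constants 100 and 3 are arbitrary choices made only for illustration.

```python
import statistics

values = [10, 8, 6, 12, 9]
sd = statistics.stdev(values)          # sample SD (divides by n - 1)

# Adding a constant shifts every value but leaves the spread unchanged
shifted = [x + 100 for x in values]
sd_shifted = statistics.stdev(shifted)

# Multiplying by a constant stretches the spread by the same factor
scaled = [x * 3 for x in values]
sd_scaled = statistics.stdev(scaled)

print(f"Original SD:        {sd:.3f}")
print(f"After adding 100:   {sd_shifted:.3f}")
print(f"After multiplying by 3: {sd_scaled:.3f}")
```

This is the reason a change of unit, such as metres to centimetres, rescales the standard deviation by the conversion factor.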

Standard deviation

Page 25: Why statisticians were created Measure of dispersion FETP India.

Need for a measure of variation that is independent of the measurement unit

• The standard deviation is expressed in the same unit as the mean: e.g., 3 cm for height, 1.4 kg for weight

• Sometimes, it is useful to express variability as a percentage of the mean
  e.g., in the case of laboratory tests, the experimental variation is ± 5% of the mean

Standard deviation

Page 26: Why statisticians were created Measure of dispersion FETP India.

The coefficient of variation

• Calculate the standard deviation

• Divide by the mean
  The standard deviation becomes “unit free”

• Coefficient of variation (%) = [SD / Mean] x 100 (pure number)
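As an illustration, the coefficient of variation can be computed for the X-ray counts used in the earlier example (mean 9, standard deviation about 2.2). This is a minimal sketch; the variable names are not part of the lecture.

```python
import statistics

values = [10, 8, 6, 12, 9]

mean = statistics.mean(values)    # 9
sd = statistics.stdev(values)     # about 2.24 (sample SD, n - 1)

# Coefficient of variation: SD expressed as a percentage of the mean
cv_percent = sd / mean * 100

print(f"Coefficient of variation = {cv_percent:.1f}%")
```

Because the unit appears in both the numerator and the denominator, it cancels out, leaving a pure number.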

Standard deviation

Page 27: Why statisticians were created Measure of dispersion FETP India.

Uses of the coefficient of variation

• Compare the variability in two variables which are measured in different units
  Height (cm) and weight (kg)

• Compare the variability in two groups with widely different mean values
  Incomes of persons in different socio-economic groups

Standard deviation

Page 28: Why statisticians were created Measure of dispersion FETP India.

A summary of measures of dispersion

Measure                Advantages                                        Disadvantages
Range                  Obvious; easy to calculate                        Uses only 2 observations; increases with the sample size; can be distorted by outliers
Inter-quartile range   Not affected by extreme values                    Uses only 2 observations; not amenable to further statistical treatment
Standard deviation     Uses every value; suitable for further analysis   Highly influenced by extreme values

Page 29: Why statisticians were created Measure of dispersion FETP India.

Choosing a measure of central tendency and a measure of dispersion

Type of distribution         Measure of central tendency   Measure of dispersion
Normal                       Mean                          Standard deviation
Skewed                       Median                        Inter-quartile range
Exponential or logarithmic   Geometric mean                Consult with the statistician

Page 30: Why statisticians were created Measure of dispersion FETP India.

Key messages

• Report the range but be aware of its limitations

• Report the inter-quartile range when you use the median

• Report the standard deviation when you use a mean