Department of Quantitative Methods & Information Systems ... · Describing Data Using Numerical...


QMIS 120, by Dr. M. Zainal

Chapter 3 Student Lecture Notes 3-1

Chapter 3

Describing Data Using Numerical Measures

Business Statistics

Department of Quantitative Methods & Information Systems

Dr. Mohammad Zainal QMIS 120

Chapter Goals

After completing this chapter, you should be able to:

Compute and interpret the mean, median, and mode for a set of data

Compute the range, variance, and standard deviation and know what these values mean

Construct and interpret a box and whisker graph

Compute and explain the coefficient of variation and z scores

Use numerical measures along with graphs, charts, and tables to describe data


Chapter Topics

Measures of Center and Location

Mean, median, mode

Other measures of Location

Weighted mean, percentiles, quartiles

Measures of Variation

Range, interquartile range, variance and standard deviation, coefficient of variation

Using the mean and standard deviation together

Coefficient of variation, z-scores

Summary Measures

Describing Data Numerically

Center and Location: Mean, Median, Mode, Weighted Mean

Other Measures of Location: Percentiles, Quartiles

Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation


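Measures of Center and Location

Overview

Center and Location: Mean, Median, Mode, Weighted Mean

Mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$ (sample), $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$ (population)

Weighted mean: $\bar{X}_W = \frac{\sum w_i x_i}{\sum w_i}$ (sample), $\mu_W = \frac{\sum w_i x_i}{\sum w_i}$ (population)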

Mean (Arithmetic Average)

The Mean is the arithmetic average of data values

Sample mean (n = Sample Size):

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$

Population mean (N = Population Size):

$\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}$


Mean (Arithmetic Average) (continued)

The most common measure of central tendency

Mean = sum of values divided by the number of values

Affected by extreme values (outliers)

Data 1, 2, 3, 4, 5: Mean = $\frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$

Data 1, 2, 3, 4, 10: Mean = $\frac{1 + 2 + 3 + 4 + 10}{5} = \frac{20}{5} = 4$

Median

In an ordered array, the median is the “middle” number, i.e., the number that splits the distribution in half

The median is not affected by extreme values

[Figure: two dot plots on a 0 to 10 axis, each with Median = 3]


Median (continued)

To find the median, sort the n data values from low to high (sorted data is called a data array)

Find the value in the i = (1/2)n position

The ith position is called the Median Index Point

If i is not an integer, round up to the next highest integer

If i is an integer, the usual convention is to average the values in positions i and i + 1

Median Example (continued)

Data array:

4, 4, 5, 5, 9, 11, 12, 14, 16, 19, 22, 23, 24

Note that n = 13

Find the i = (1/2)n position:

i = (1/2)(13) = 6.5

Since 6.5 is not an integer, round up to 7

The median is the value in the 7th position:

Md = 12
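The median index rule can be sketched in Python as follows. This is a minimal illustration rather than part of the original notes: the function name `median_by_index` is hypothetical, and averaging positions i and i + 1 when i is an integer is an assumption based on the usual convention.

```python
import math

def median_by_index(values):
    """Median via the index-point rule i = (1/2)n, rounding up when i is not an integer."""
    data = sorted(values)              # the ordered "data array"
    n = len(data)
    i = n / 2                          # median index point
    if i != int(i):
        position = math.ceil(i)        # round up to the next integer
        return data[position - 1]      # positions are 1-based, list indices 0-based
    # Assumption: when i is an integer, average the values in positions i and i + 1.
    i = int(i)
    return (data[i - 1] + data[i]) / 2

data = [4, 4, 5, 5, 9, 11, 12, 14, 16, 19, 22, 23, 24]
print(median_by_index(data))  # 12
```

With n = 13 the index is 6.5, so the function returns the value in the 7th position, matching Md = 12.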


Shape of a Distribution

Describes how data is distributed

Symmetric: identical on both sides of its central point

Skewed: the tail on one side is longer than the tail on the other side

Shape of a Distribution

Left-Skewed: Mean < Median (longer tail extends to left)

Symmetric: Mean = Median

Right-Skewed: Median < Mean (longer tail extends to right)


Mode

A measure of location

The value that occurs most often

Not affected by extreme values

Used for either numerical or categorical data

There may be no mode

There may be several modes

[Figure: two dot plots. On a 0 to 14 axis, Mode = 5; on a 0 to 6 axis, No Mode]
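A short Python sketch of the three cases (one mode, no mode, several modes). The helper name `modes` and the three data sets are made up for illustration, since the values behind the slide's dot plots are not recoverable. Treating "every value equally frequent" as "no mode" is an assumed convention.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest count, or [] when all values occur equally often."""
    counts = Counter(values)
    highest = max(counts.values())
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []                      # every value is equally frequent: no mode
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([1, 3, 5, 5, 5, 9, 14]))   # [5]
print(modes([0, 1, 2, 3, 4, 5, 6]))    # []
print(modes([1, 1, 2, 2, 3]))          # [1, 2]
```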

Weighted Mean

Used when values are grouped by frequency or relative importance

Example: Sample of 26 Repair Projects

Days to Complete    Frequency
5                   4
6                   12
7                   8
8                   2
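As a worked illustration, the weighted mean for the repair-project table can be computed in a few lines of Python. The variable names are chosen here for readability and are not from the notes. The frequencies act as the weights w and the days as the values x.

```python
days = [5, 6, 7, 8]          # values x_i (days to complete)
freq = [4, 12, 8, 2]         # weights w_i (number of projects)

weighted_sum = sum(w * x for w, x in zip(freq, days))   # 20 + 72 + 56 + 16 = 164
total_weight = sum(freq)                                # 26 projects
weighted_mean = weighted_sum / total_weight

print(round(weighted_mean, 2))  # 6.31
```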


Review Example

Five houses on a hill by the beach

House Prices:

$2,000,000
500,000
300,000
100,000
100,000

Summary Statistics
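One way to obtain the summary statistics for the five house prices is Python's standard `statistics` module. This is a sketch for checking hand calculations, not the method used in the notes, and the variable names are illustrative.

```python
import statistics

prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

mean_price = statistics.mean(prices)      # 3,000,000 / 5
median_price = statistics.median(prices)  # middle value of the sorted prices
mode_price = statistics.mode(prices)      # most frequent price

print(mean_price, median_price, mode_price)  # 600000 300000 100000
```

The single $2,000,000 house pulls the mean to twice the median, which shows how strongly one extreme value can affect the mean.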


Which measure of location is the “best”?

Mean is generally used, unless extreme values (outliers) exist

Then Median is often used, since the median is not sensitive to extreme values.

Example: Median home prices may be reported for a region because the median is less sensitive to outliers

Which measure of location is the “best”?

Example: Find the mean and median for the following two sets of measurements

Set 1: 2, 9, 11, 5, 6

Set 2: 2, 9, 110, 5, 6


Measures of Variation

Variation: Range, Interquartile Range, Variance (Population Variance, Sample Variance), Standard Deviation (Population Standard Deviation, Sample Standard Deviation), Coefficient of Variation

Variation

Measures of variation give information on the spread or variability of the data values.

[Figure: two distributions with the same center and different variation]


Range

Simplest measure of variation

Difference between the largest and the smallest observations:

Range = $x_{\text{maximum}} - x_{\text{minimum}}$

Example (data plotted on a 0 to 14 axis):

Range = 14 - 1 = 13

Disadvantages of the Range

Ignores the way in which data are distributed

[Figure: two differently shaped distributions on a 7 to 12 axis, each with Range = 12 - 7 = 5]

Sensitive to outliers

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5

Range = 5 - 1 = 4

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120

Range = 120 - 1 = 119
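The sensitivity of the range to a single outlier can be checked with a small Python sketch. The helper `value_range` is a hypothetical name, and the two lists reproduce the 25-value data sets from the slide.

```python
base = [1] * 11 + [2] * 8 + [3] * 4 + [4, 5]
with_outlier = base[:-1] + [120]   # same data, largest value replaced by 120

def value_range(values):
    """Range = largest observation minus smallest observation."""
    return max(values) - min(values)

print(value_range(base))          # 4
print(value_range(with_outlier))  # 119
```

Changing one of the 25 values moves the range from 4 to 119, even though the other 24 observations are identical.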


Variance

Average of squared deviations of values from the mean

Population variance:

$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

Sample variance:

$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

Standard Deviation

Most commonly used measure of variation

Shows variation about the mean

Has the same units as the original data

Population standard deviation:

$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$

Sample standard deviation:

$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$


Calculation Example: Sample Standard Deviation

Sample Data (Xi): 10 12 14 15 17 18 18 24
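The calculation for this sample can be traced step by step in Python. This is a sketch that follows the sample standard deviation formula (divide by n - 1) and uses variable names chosen for illustration.

```python
import math

sample = [10, 12, 14, 15, 17, 18, 18, 24]
n = len(sample)

mean = sum(sample) / n                                # 128 / 8 = 16.0
squared_devs = sum((x - mean) ** 2 for x in sample)   # sum of squared deviations = 130.0
variance = squared_devs / (n - 1)                     # divide by n - 1 for a sample
std_dev = math.sqrt(variance)

print(round(variance, 4), round(std_dev, 4))  # 18.5714 4.3095
```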

Comparing Standard Deviations

Same mean, but different standard deviations (each data set plotted on an 11 to 21 axis):

Data A: Mean = 15.5, s = 3.338

Data B: Mean = 15.5, s = .9258

Data C: Mean = 15.5, s = 4.57


Using Microsoft Excel

Descriptive Statistics are easy to obtain from Microsoft Excel

Use menu choice:

Data / data analysis / descriptive statistics

Enter details in dialog box

Using Excel

Select:

Data / data analysis / descriptive statistics


Using Excel (continued)

Enter dialog box details

Check box for summary statistics

Click OK

Excel output

Microsoft Excel descriptive statistics output, using the house price data:

House Prices:

$2,000,000
500,000
300,000
100,000
100,000


Coefficient of Variation

Measures relative variation

Always in percentage (%)

Shows variation relative to mean

Is used to compare two or more sets of data measured in different units

Population: $CV = \left(\frac{\sigma}{\mu}\right) \cdot 100\%$

Sample: $CV = \left(\frac{s}{\bar{x}}\right) \cdot 100\%$

Comparing Coefficients of Variation

Stock A:

Average price last year = $50

Standard deviation = $5

$CV_A = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$

Stock B:

Average price last year = $100

Standard deviation = $5

$CV_B = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%$

Both stocks have the same standard deviation, but stock B is less variable relative to its price
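The stock comparison can be reproduced with a small Python helper. The function name `coefficient_of_variation` is hypothetical and simply encodes CV = (standard deviation / mean) · 100%.

```python
def coefficient_of_variation(std_dev, mean):
    """Relative variation as a percentage: 100 * standard deviation / mean."""
    return 100 * std_dev / mean

cv_a = coefficient_of_variation(5, 50)    # Stock A: s = $5, mean = $50
cv_b = coefficient_of_variation(5, 100)   # Stock B: s = $5, mean = $100

print(cv_a, cv_b)  # 10.0 5.0
```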


Mean for grouped data

Once we group the data, we no longer know the values of individual observations.

Thus, we find an approximation for the sum of these values.

Mean for population: $\mu = \frac{\sum mf}{N}$

Mean for sample: $\bar{x} = \frac{\sum mf}{n}$

Where m is the midpoint and f is the frequency of a class.

Variance and Standard Deviation for grouped data

Also, we find an approximation for the variance and standard deviation of grouped data.

Population: $\sigma^2 = \frac{\sum m^2 f - \frac{(\sum mf)^2}{N}}{N}$

Sample: $s^2 = \frac{\sum m^2 f - \frac{(\sum mf)^2}{n}}{n - 1}$

Where m is the midpoint and f is the frequency of a class.

$\sigma = \sqrt{\sigma^2}$ and $s = \sqrt{s^2}$


Example: Grouped Data

The table below gives the frequency distribution of the daily commuting times (in minutes) from home to CBA for all 25 students in QMIS 120. Calculate the mean and the standard deviation of the daily commuting times.

Daily commuting time (min)    f
0 to less than 10             4
10 to less than 20            9
20 to less than 30            6
30 to less than 40            4
40 to less than 50            2
Total                         25
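A possible way to work the commuting-time example in Python is sketched below. It uses the class midpoints m and frequencies f with the grouped-data formulas. Because the table covers all 25 students, the sketch treats the data as a population (divide by N); that reading of the problem, and the variable names, are assumptions of this illustration.

```python
import math

# (lower limit, upper limit, frequency) for each class in the commuting-time table
classes = [(0, 10, 4), (10, 20, 9), (20, 30, 6), (30, 40, 4), (40, 50, 2)]

N = sum(f for _, _, f in classes)                                  # 25 students
sum_mf = sum((lo + hi) / 2 * f for lo, hi, f in classes)           # sum of m * f
sum_m2f = sum(((lo + hi) / 2) ** 2 * f for lo, hi, f in classes)   # sum of m^2 * f

mean = sum_mf / N
variance = (sum_m2f - sum_mf ** 2 / N) / N   # population formula for grouped data
std_dev = math.sqrt(variance)

print(mean, round(variance, 2), round(std_dev, 2))  # 21.4 135.04 11.62
```

For a sample, the final division would use n - 1 instead of N.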


Other Location Measures

Other Measures of Location: Percentiles, Quartiles

Percentiles

The pth percentile in a data array:

p% are less than or equal to this value

(100 – p)% are greater than or equal to this value

(where 0 ≤ p ≤ 100)

Quartiles

1st quartile = 25th percentile

2nd quartile = 50th percentile = median

3rd quartile = 75th percentile

Percentiles

The pth percentile in an ordered array of n values is the value in ith position, where

$i = \frac{p}{100}(n)$

If i is not an integer, round up to the next higher integer value

If i is an integer, the usual convention is to average the values in positions i and i + 1


Percentiles

Example: Find the 60th percentile in an ordered array of the following 19 values.

6 7 2 3 5 1 8 6 7 3 9 4 2 1 5 4 3 9 6

Solution:
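The location-index rule i = (p/100)(n) can be sketched in Python and applied to the 19 values. The function name `percentile` is illustrative, the sketch is meant for 0 < p < 100, and averaging positions i and i + 1 for an integer i is an assumption based on the usual convention.

```python
import math

def percentile(values, p):
    """pth percentile (0 < p < 100) by the location-index rule i = (p / 100) * n."""
    data = sorted(values)               # the ordered array
    n = len(data)
    i = p * n / 100
    if i != int(i):
        return data[math.ceil(i) - 1]   # round up; positions are 1-based
    # Assumption: for an integer i, average the values in positions i and i + 1.
    i = int(i)
    return (data[i - 1] + data[i]) / 2

values = [6, 7, 2, 3, 5, 1, 8, 6, 7, 3, 9, 4, 2, 1, 5, 4, 3, 9, 6]
print(sorted(values))
print(percentile(values, 60))  # i = 11.4, so the 12th value: 6
```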

Quartiles

Quartiles split the ranked data into 4 equal groups:

25% | Q1 | 25% | Q2 | 25% | Q3 | 25%

Note that the second quartile (the 50th percentile) is the median


Quartiles

Example: Find the first quartile and third quartile

Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22

(n = 9)

Q1 = 25th percentile, so find i: $i = \frac{25}{100}(9) = 2.25$

so round up and use the value in the 3rd position: Q1 = 13

Q3 = 75th percentile, so find i: $i = \frac{75}{100}(9) = 6.75$

so round up and use the value in the 7th position: Q3 = 18

Box and Whisker Plot

A graphical display of data using a central “box” and extended “whiskers”:

Example:

[Figure: box and whisker plot. The box spans Q1 to Q3 with a line at Q2, the whiskers reach the smallest and largest values, each of the four sections holds 25% of the data, and outliers are marked with * beyond the lower and upper fences]


Box and Whisker Plot

Box and whisker plot gives a graphic presentation of data using five measures:

Q1, Q2, Q3, smallest, and largest values*.

Can help to visualize the center, the spread, and the skewness of a data set.

A very good tool for comparing more than one distribution.

Box and Whisker Plot

Can help in detecting outliers.

Detecting an outlier:

Lower fence: Q1 – 1.5(Q3 – Q1)

Upper fence: Q3 + 1.5(Q3 – Q1)

If a data point is larger than the upper fence or smaller than the lower fence, it is considered to be an outlier.


Constructing the Box and Whisker Plot

[Figure: box and whisker plot labeled, left to right, Lower Limit, 1st Quartile, Median, 3rd Quartile, Upper Limit, with outliers marked *]

The lower limit is Q1 – 1.5 (Q3 – Q1)

The upper limit is Q3 + 1.5 (Q3 – Q1)

The center box extends from Q1 to Q3

The line within the box is the median

The whiskers extend to the smallest and largest values within the calculated limits

Outliers are plotted outside the calculated limits

Shape of Box and Whisker Plots

The Box and central line are centered between the endpoints if data is symmetric around the median

(A Box and Whisker plot can be shown in either vertical or horizontal format)



Distribution Shape and Box and Whisker Plot

[Figure: three box and whisker plots, each marked Q1, Q2, Q3, for left-skewed, symmetric, and right-skewed distributions]

Box-and-Whisker Plot Example

Below is a Box-and-Whisker plot for the following data:

0 2 2 2 3 3 4 5 6 11 27

Min = 0, Q1 = 2, Q2 = 3, Q3 = 6, Max = 11, Outlier = 27 (plotted as *)

Upper limit = Q3 + 1.5 (Q3 – Q1) = 6 + 1.5 (6 – 2) = 12

27 is above the upper limit so is shown as an outlier

This data is right skewed, as the plot depicts
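The quartiles, limits, and outlier in this example can be verified with a short Python sketch. The helper `value_at` is a hypothetical name. It applies the location index i = (p/100)(n) and rounds up, which is sufficient here because none of the three indices is an integer for n = 11.

```python
import math

data = sorted([0, 2, 2, 2, 3, 3, 4, 5, 6, 11, 27])
n = len(data)

def value_at(p):
    # Location index i = (p / 100) * n; for this data set i is never an integer, so round up.
    return data[math.ceil(p * n / 100) - 1]

q1, q2, q3 = value_at(25), value_at(50), value_at(75)
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

outliers = [x for x in data if x < lower_limit or x > upper_limit]
inside = [x for x in data if lower_limit <= x <= upper_limit]

print(q1, q2, q3)                  # 2 3 6
print(lower_limit, upper_limit)    # -4.0 12.0
print(min(inside), max(inside))    # whisker ends: 0 11
print(outliers)                    # [27]
```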


Interquartile Range

Can eliminate some outlier problems by using the interquartile range

Eliminate some high- and low-valued observations and calculate the range from the remaining values.

Interquartile range = 3rd quartile – 1st quartile

Interquartile Range Example

Example:

X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70 (25% of the data in each of the four sections)

Interquartile range = 57 – 30 = 27


Standardized Data Values

A standardized data value refers to the number of standard deviations a value is from the mean

Standardized data values are sometimes referred to as z-scores

Standardized Population Values

$z = \frac{x - \mu}{\sigma}$

where:

x = original data value

μ = population mean

σ = population standard deviation

z = standard score (number of standard deviations x is from μ)


Standardized Sample Values

$z = \frac{x - \bar{x}}{s}$

where:

x = original data value

$\bar{x}$ = sample mean

s = sample standard deviation

z = standard score (number of standard deviations x is from $\bar{x}$)

Standardized Value Example

IQ scores in a large population have a bell-shaped distribution with mean μ = 100 and standard deviation σ = 15

Find the standardized score (z-score) for a person with an IQ of 121.

Answer:
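The z-score for this example follows directly from z = (x − μ)/σ. A minimal Python sketch, with a hypothetical helper name `z_score`:

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

z = z_score(121, 100, 15)   # (121 - 100) / 15
print(z)  # 1.4
```

An IQ of 121 is therefore 1.4 standard deviations above the mean.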


The Empirical Rule

If the data distribution is bell-shaped, then the interval:

$\mu \pm 1\sigma$ contains about 68% of the values in the population or the sample

The Empirical Rule

$\mu \pm 2\sigma$ contains about 95% of the values in the population or the sample

$\mu \pm 3\sigma$ contains about 99.7% of the values in the population or the sample


The Empirical Rule

Example: The age distribution of a sample of 5000 persons is bell-shaped with a mean of 40 years and a standard deviation of 12 years. Determine the approximate percentage of people who are 16 to 64 years old.
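One way to reason through the age example in Python is to express both interval ends as a number of standard deviations from the mean and then look up the empirical rule. The lookup table `EMPIRICAL_RULE` and the variable names are constructs of this sketch, not part of the notes.

```python
# k -> approximate % of values within mean ± k standard deviations (bell-shaped data)
EMPIRICAL_RULE = {1: 68.0, 2: 95.0, 3: 99.7}

mean, std_dev = 40, 12
low, high = 16, 64

k_low = (mean - low) / std_dev     # 24 / 12 = 2.0 standard deviations below the mean
k_high = (high - mean) / std_dev   # 24 / 12 = 2.0 standard deviations above the mean

if k_low == k_high and k_low in EMPIRICAL_RULE:
    pct = EMPIRICAL_RULE[k_low]
    print(f"About {pct}% of the people are {low} to {high} years old")
    # About 95.0% of the people are 16 to 64 years old
```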

Detecting Outliers

Sometimes a data set may have one or more observations with unusually small or large values.

An experienced statistician may face the following situations and need to take action:

Outlier                                                                  Action
A data value that was incorrectly recorded                               Correct it before any further analysis
A data value that was incorrectly included                               Remove it before any further analysis
A data value that belongs to the data set and was correctly recorded     Keep it!


Detecting Outliers

This extreme value is called an outlier and can be detected using the z-score and the empirical rule for data with a bell-shaped distribution.

[Figure: bell curve with 99.7% of values within μ ± 3σ; values outside that interval are marked as outliers]

Tchebysheff’s Theorem

Regardless of how the data are distributed, at least $(1 - 1/k^2)$ of the values will fall within k standard deviations of the mean

Examples:

At least $(1 - 1/1^2) = 0\%$ within k = 1 (μ ± 1σ)

At least $(1 - 1/2^2) = 75\%$ within k = 2 (μ ± 2σ)

At least $(1 - 1/3^2) \approx 89\%$ within k = 3 (μ ± 3σ)


Tchebysheff’s Theorem

[Figure: two panels illustrating the theorem for k = 2 and k = 3]

Tchebysheff’s Theorem

Example: The average systolic blood pressure for 4000 women who were screened for high blood pressure was found to be 187 with a standard deviation of 22. Using Tchebysheff’s theorem, find at least what percentage of women in this group have a systolic blood pressure between 143 and 231.
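A Python sketch of how the blood pressure example might be worked: find k from the interval, then apply the bound 1 − 1/k². The function name `chebyshev_min_fraction` is hypothetical.

```python
def chebyshev_min_fraction(k):
    """Lower bound on the fraction of values within k standard deviations of the mean (k > 1)."""
    return 1 - 1 / k ** 2

mean, std_dev = 187, 22
low, high = 143, 231

k = (high - mean) / std_dev            # 44 / 22 = 2.0
assert k == (mean - low) / std_dev     # the interval is symmetric about the mean

print(f"At least {chebyshev_min_fraction(k):.0%}")  # At least 75%
```

Unlike the empirical rule, this bound requires no assumption about the shape of the distribution.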


Measures of association between two variables

So far we have studied numerical methods to describe data with one variable.

Often decision makers are interested in the relationship between two variables.

To do so, we will use a descriptive measure called covariance.

Covariance assigns a numerical value to the linear relationship between two variables (see scatter diagram)


Measures of association between two variables

The covariance is given by:

Sample covariance: $s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$

Population covariance: $\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$

A big disadvantage of the covariance is that it depends on the units of measurement for x and y.

Measures of association between two variables

For the same data set, we will get different covariance values depending on the units (e.g., measuring height in meters rather than centimeters makes a big difference).

Pearson’s correlation coefficient is a good remedy for that problem, as it can only range from -1 to 1.


Measures of association between two variables

Pearson’s correlation coefficient is given by

Sample correlation coefficient: $r_{xy} = \frac{s_{xy}}{s_x s_y}$

Population correlation coefficient: $\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$

Example

Example: A golfer is interested in investigating the relationship, if any, between driving distance and 18-hole score.

Average Driving Distance (m)    Average 18-Hole Score
277.6                           69
259.5                           71
269.1                           70
267.0                           70
255.6                           71
272.9                           69
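The golf data can be used to illustrate both measures of association. The Python sketch below computes the sample covariance and then Pearson's sample correlation coefficient as the covariance divided by the product of the two sample standard deviations. Variable names are chosen for this illustration.

```python
import math

distance = [277.6, 259.5, 269.1, 267.0, 255.6, 272.9]   # x: average driving distance
score = [69, 71, 70, 70, 71, 69]                        # y: average 18-hole score

n = len(distance)
x_bar = sum(distance) / n
y_bar = sum(score) / n

# Sample covariance: sum of cross-deviations divided by n - 1
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(distance, score)) / (n - 1)

# Sample standard deviations of x and y
s_x = math.sqrt(sum((x - x_bar) ** 2 for x in distance) / (n - 1))
s_y = math.sqrt(sum((y - y_bar) ** 2 for y in score) / (n - 1))

r = s_xy / (s_x * s_y)   # Pearson's sample correlation coefficient

print(round(s_xy, 2), round(r, 4))  # -7.08 -0.9631
```

The negative covariance and a correlation close to -1 indicate a strong negative linear relationship: longer average drives go with lower scores in this sample.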


Problems

Consider the following sample of four purchases of one stock in the KSE. Find the average cost of the stock.

Purchase    Price    Quantity
1           .300     5,000
2           .325     15,000
3           .350     10,000
4           .295     20,000


Problems

Construct a box plot for the following data:

340 300 400 360 320 290 260 330


Problems

Example: You are given 8 measurements: 3, 5, 4, 6, 12, 5, 6, 7. Find

a) The mean

b) The median

c) The mode


Problems

Find the covariance and Pearson’s correlation coefficient for the following data

Week        Number of commercials (x)    Sales in $ (y)
1           2                            50
2           5                            57
3           1                            41
4           3                            54
5           4                            54
6           1                            38
7           5                            63
8           3                            48
9           4                            59
10          2                            46
Mean        3                            51
St. Dev.    1.49                         7.93


Chapter Summary

Described measures of center and location

Mean, median, mode, weighted mean

Discussed percentiles and quartiles

Created Box and Whisker Plots

Illustrated distribution shapes

Symmetric, skewed


Chapter Summary (continued)

Described measures of variation

Range, interquartile range, variance, standard deviation, coefficient of variation

Discussed Tchebysheff’s Theorem

Calculated standardized data values


Copyright

The materials of this presentation were mostly taken from the PowerPoint files accompanying Business Statistics: A Decision-Making Approach, 7e © 2008 Prentice-Hall, Inc.