Business Statistics Chapter 1 - CA Sri Lanka


Transcript of Business Statistics Chapter 1 - CA Sri Lanka

Page 1: Business Statistics Chapter 1 - CA Sri Lanka


Business Statistics

Chapter 1

By:

Chinthaka Amila Kankanamge


Page 2: Business Statistics Chapter 1 - CA Sri Lanka


GOALS:

Explain what is meant by statistics.

Explain what is meant by descriptive statistics and inferential statistics.

Distinguish between a qualitative variable and a quantitative variable, and between a discrete variable and a continuous variable.

Define the terms mutually exclusive and exhaustive.

Distinguish among the nominal, ordinal, interval, and ratio levels of measurement.

Page 3: Business Statistics Chapter 1 - CA Sri Lanka

1-3

What is meant by Statistics?

• Statistics is the science of data, which involves collecting, classifying, summarizing, organizing, analyzing, and interpreting numerical information to assist in making effective decisions.

Page 4: Business Statistics Chapter 1 - CA Sri Lanka

1-4

Types of Statistics

Descriptive Statistics: Methods of organizing, summarizing, and presenting data in an informative way.

Inferential Statistics: Methods used to reach a decision, estimate, prediction, or generalization about a population, based on a sample.


Page 6: Business Statistics Chapter 1 - CA Sri Lanka

1-6

Types of Statistics (examples of inferential statistics)

Eg 1: TV networks constantly monitor the popularity of their programs by hiring Nielsen and other organizations to sample the preferences of TV viewers.

Eg 2: The accounting department of a large firm will select a sample of invoices to check the accuracy of all the company's invoices.

Eg 3: Wine tasters sip a few drops of wine to make a decision with respect to all the wine waiting to be released for sale.

Page 7: Business Statistics Chapter 1 - CA Sri Lanka

1-7

Types of Statistics

A population is a collection of all possible individuals, objects, or measurements of interest.

A parameter is a descriptive measure of the entire population of all observations of interest.

A sample is a portion, or part, of the population of interest.

A statistic describes a sample and serves as an estimate of the corresponding population parameter.

Page 8: Business Statistics Chapter 1 - CA Sri Lanka

1-8

Types of Variables

DATA

Qualitative or attribute (type of car owned)

Quantitative or numerical

– Discrete (number of children)

– Continuous (time taken for an exam)

Page 9: Business Statistics Chapter 1 - CA Sri Lanka

1-9

Types of Variables

For a Qualitative or Attribute variable the characteristic being studied is nonnumeric.

Examples: gender, religious affiliation, type of automobile owned, state of birth, eye color.

In a Quantitative variable information is reported numerically.

Examples: balance in your checking account, minutes remaining in class, or number of children in a family.

Quantitative variables can be classified as either discrete or continuous.

Page 10: Business Statistics Chapter 1 - CA Sri Lanka

1-10

Discrete variables can only assume certain values, and there are usually “gaps” between values.

Examples: the number of bedrooms in a house, or the number of hammers sold at the local Home Depot (1, 2, 3, …, etc.).

Continuous variables can assume any value within a specified range.

Examples: the pressure in a tire, the weight of a pork chop, or the height of students in a class.

Page 11: Business Statistics Chapter 1 - CA Sri Lanka

1-11

Sources of Data

Primary data: Collected for a specific purpose

– Direct observation

– Questionnaires

– Interviewing

Secondary data: Collected for another purpose

Page 12: Business Statistics Chapter 1 - CA Sri Lanka

1-12

Levels of Measurement

Level of Data: Nominal, Ordinal, Interval, Ratio

Nominal level:

Data that is classified into categories and cannot be arranged in any particular order.

Examples: eye color, gender, religious affiliation.

Mutually exclusive:

An individual, object, or measurement is included in only one category.

Page 13: Business Statistics Chapter 1 - CA Sri Lanka

1-13

Levels of Measurement

Ordinal level:

Involves data arranged in some order, but the differences between data values cannot be determined or are meaningless.

Example: During a taste test of 4 soft drinks, Mellow Yellow was ranked number 1, Sprite number 2, Seven-up number 3, and Orange Crush number 4.

Exhaustive: Each individual, object, or measurement must appear in one of the categories.

Page 14: Business Statistics Chapter 1 - CA Sri Lanka

1-14

Levels of Measurement (Cont..)

Interval level:

Similar to the ordinal level, with the additional property that meaningful amounts of differences between data values can be determined. There is no natural zero point.

Example: Temperature on the Fahrenheit scale.

Page 15: Business Statistics Chapter 1 - CA Sri Lanka

1-15

Ratio level:

The interval level with an inherent zero starting point. Differences and ratios are meaningful for this level of measurement.

Examples: Monthly income of surgeons, or distance traveled by manufacturer’s representatives per month.

Page 16: Business Statistics Chapter 1 - CA Sri Lanka

1-16

Level of data   Property                                      Example
Nominal         Data may only be classified                   Classification of students by district
Ordinal         Data are ranked                               Your rank for this course module
Interval        Meaningful difference between values          Temperature
Ratio           Meaningful 0 point and ratio between values   Number of study hours

Page 17: Business Statistics Chapter 1 - CA Sri Lanka

1-17

Who Uses Statistics?

Statistical techniques are used extensively by people working in marketing, accounting, and quality control, as well as by consumers, hospital administrators, educators, politicians, physicians, and others.

Page 18: Business Statistics Chapter 1 - CA Sri Lanka

1-18

Sources of Statistical Data

Researching problems usually requires published data. Statistics on these problems can be found in published articles, journals, and magazines.

Published data is not always available on a given subject. In such cases, information will have to be collected and analyzed.

One way of collecting data is via questionnaires.

What are the other data collection methods?

Page 19: Business Statistics Chapter 1 - CA Sri Lanka

1-19


Chapter Two

Describing Data: Frequency Distributions and Graphic

Presentation

Page 20: Business Statistics Chapter 1 - CA Sri Lanka

1-20

Presentation of Data

Raw data reveals very little.

Data presentation shows how a large data set can be organized and managed to provide a quick visual interpretation of the message the data convey.

Methods of Data Presentation

i. Data Array

ii. Tabulation of Data

iii. Stem-and-Leaf display

iv. Frequency Distribution

Page 21: Business Statistics Chapter 1 - CA Sri Lanka

1-21

Data Array

Arrange data in a systematic way (ascending data array or descending data array).

Tabulation of Data

Arrange data in tables (rows and columns).

           1st Year   2nd Year
Physical   320        160
Bio        246        126

Page 22: Business Statistics Chapter 1 - CA Sri Lanka

1-22

Frequency Distribution

A Frequency distribution is a grouping of data into mutually exclusive categories showing the number of observations in each class.

Construction of a Frequency Distribution

Page 23: Business Statistics Chapter 1 - CA Sri Lanka

1-23

Frequency Distribution

Class midpoint:

A point that divides a class into two equal parts. This is the average of the upper and lower class limits.

Class frequency:

The number of observations in each class.

Class interval:

The class interval is obtained by subtracting the lower limit of a class from the lower limit of the next class.

Page 24: Business Statistics Chapter 1 - CA Sri Lanka

1-24

Eg 1: Dr. Tillman is Dean of the School of Business at Socastee University. He wishes to prepare a report showing the number of hours per week students spend studying. He selects a random sample of 30 students and determines the number of hours each student studied last week.

15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7, 17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9, 10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6.

Organize the data into a frequency distribution.

Page 25: Business Statistics Chapter 1 - CA Sri Lanka

1-25

Eg 1: continued

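There are 30 observations.

Two raised to the fifth power is 32, the first power of two that is at least 30.

Therefore, we should have at least 5 classes.

It turns out we will need 6, because five classes of width 5 hrs starting at 7.5 hrs end at 32.5 hrs and would not cover the highest value, 33.8 hrs.

The range is 23.5 hrs, found by 33.8 hrs – 10.3 hrs.

We choose an interval of 5 hrs.

The lower limit of the first class is 7.5 hrs.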

Page 26: Business Statistics Chapter 1 - CA Sri Lanka

1-26

Hours studying     Frequency, f
7.5 up to 12.5     1
12.5 up to 17.5    12
17.5 up to 22.5    10
22.5 up to 27.5    5
27.5 up to 32.5    1
32.5 up to 37.5    1
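The construction steps for Eg 1 can be checked with a short Python sketch. It assumes the class limits chosen on the slide (first lower limit 7.5 hrs, interval 5 hrs, 6 classes) and treats each class as "lower limit up to, but not including, the upper limit"; the variable names are illustrative only.

```python
# Study hours for the 30 sampled students (Eg 1).
hours = [15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7,
         17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
         10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6]

# "2 to the k" rule: smallest k such that 2**k is at least n.
n = len(hours)
k = 1
while 2 ** k < n:
    k += 1

# Range = highest value - lowest value (rounded to avoid float noise).
data_range = round(max(hours) - min(hours), 1)

# Class limits taken from the slide: start at 7.5, width 5, six classes.
lower, width, num_classes = 7.5, 5.0, 6
counts = [0] * num_classes
for h in hours:
    idx = int((h - lower) // width)  # index of the class containing h
    counts[idx] += 1

for c, f in enumerate(counts):
    lo = lower + c * width
    print(f"{lo:4.1f} up to {lo + width:4.1f}  {f}")
```

The counts produced this way should match the frequency column of the table above.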

Page 27: Business Statistics Chapter 1 - CA Sri Lanka

1-27

Suggestions on Constructing a Frequency

Distribution

The class intervals used in the frequency distribution should be equal.

Determine a suggested class interval by using the formula:

$i \ge \frac{\text{Highest value} - \text{Lowest value}}{\text{Number of classes}}$

Use the computed suggested class interval to construct the frequency distribution.

Page 28: Business Statistics Chapter 1 - CA Sri Lanka

1-28

Suggestions on Constructing a Frequency

Distribution

Note: this is a suggested class interval; if the computed class interval is 97, it may be better to use 100.

Count the number of values in each class.

A relative frequency distribution shows the percent of observations in each class.

Page 29: Business Statistics Chapter 1 - CA Sri Lanka

1-29

EXAMPLE: Mr. Jayatissa wishes to prepare a report showing the number of hours per week students spend studying. He selects a random sample of 30 students and determines the number of hours each student studied last week.

15.0, 23.7, 19.7, 15.4, 18.3,

23.0, 14.2, 20.8, 13.5, 20.7,

17.4, 18.6, 12.9, 20.3, 13.7,

21.4, 18.3, 29.8, 17.1, 18.9,

10.3, 26.1, 15.7, 14.0, 17.8,

33.8, 23.2, 12.9, 27.1, 16.6.

Organize the data into a frequency distribution.

Page 30: Business Statistics Chapter 1 - CA Sri Lanka

1-30

Relative Frequency Distribution

Hours              f     Relative Frequency
7.5 up to 12.5     1     1/30 = .0333
12.5 up to 17.5    12    12/30 = .4000
17.5 up to 22.5    10    10/30 = .3333
22.5 up to 27.5    5     5/30 = .1667
27.5 up to 32.5    1     1/30 = .0333
32.5 up to 37.5    1     1/30 = .0333
TOTAL              30    30/30 = 1
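As a quick check of the relative frequency column, each class frequency can be divided by the total number of observations. This is a minimal sketch; the list name `counts` is an illustrative choice, and the values are the class frequencies from the table.

```python
# Class frequencies for the six "hours studying" classes.
counts = [1, 12, 10, 5, 1, 1]

total = sum(counts)                      # total number of observations
relative = [f / total for f in counts]   # relative frequency of each class

for f, r in zip(counts, relative):
    print(f"{f:2d}/{total} = {r:.4f}")

# The relative frequencies of all classes add up to 1.
print("sum =", round(sum(relative), 4))
```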

Page 31: Business Statistics Chapter 1 - CA Sri Lanka

1-31

Stem-and-leaf Displays

Stem-and-leaf display:

A statistical technique for displaying a set of data. Each numerical value is divided into two parts: the leading digits become the stem and the trailing digits the leaf.


Note: an advantage of the stem-and-leaf display over a frequency distribution is we do not lose the identity of each observation.

Page 32: Business Statistics Chapter 1 - CA Sri Lanka

1-32

Eg 2 : Colin achieved the following scores on his

twelve accounting quizzes this semester:

86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85.

Construct a stem-and-leaf chart.

stem | leaf
6    | 9
7    | 8 9
8    | 2 3 4 5 6 8
9    | 1 2 6
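The stem-and-leaf chart for Colin's scores can be reproduced with a small Python sketch. It assumes two-digit scores, so the tens digit is the stem and the units digit is the leaf; the names `stems` and `display` are illustrative.

```python
from collections import defaultdict

# Colin's twelve accounting quiz scores (Eg 2).
scores = [86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85]

stems = defaultdict(list)
for s in sorted(scores):
    stem, leaf = divmod(s, 10)   # tens digit is the stem, units digit the leaf
    stems[stem].append(leaf)

# Order the stems so the display reads from smallest to largest.
display = dict(sorted(stems.items()))
for stem, leaves in display.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Because every leaf is kept, each original score can still be read back from the display, which is the advantage noted on the previous slide.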

Page 33: Business Statistics Chapter 1 - CA Sri Lanka

1-33

Graphic Presentation of a Frequency

Distribution

The three commonly used graphic forms are histograms, frequency polygons, and cumulative frequency distributions.

A Histogram is a graph in which the classes are

marked on the horizontal axis and the class

frequencies on the vertical axis.

The class frequencies are represented by the heights

of the bars and the bars are drawn adjacent to each

other.

Page 34: Business Statistics Chapter 1 - CA Sri Lanka

1-34

Graphic Presentation of a Frequency

Distribution

A frequency polygon consists of line

segments connecting the points formed by

the class midpoint and the class frequency.


A cumulative frequency distribution is used

to determine how many or what proportion of

the data values are below or above a certain

value.

Page 35: Business Statistics Chapter 1 - CA Sri Lanka

1-35

Histogram for Hours Spent Studying

[Figure: histogram. Horizontal axis: hours spent studying (10 to 35). Vertical axis: frequency (0 to 14).]

Page 36: Business Statistics Chapter 1 - CA Sri Lanka

1-36

Frequency Polygon for Hours Spent Studying

[Figure: frequency polygon. Horizontal axis: hours spent studying (10 to 35). Vertical axis: frequency (0 to 14).]

Page 37: Business Statistics Chapter 1 - CA Sri Lanka

1-37

Cumulative Frequency Distribution For Hours Studying

[Figure: cumulative frequency distribution. Horizontal axis: hours spent studying (10 to 35). Vertical axis: frequency (0 to 35).]

Page 38: Business Statistics Chapter 1 - CA Sri Lanka

1-38

Bar Chart

A bar chart can be used to depict any of the levels of measurement (nominal, ordinal, interval, or ratio).

Eg 3: Construct a bar chart for the number of unemployed per 100,000 population for selected cities during 2001.

City               Number of unemployed per 100,000 population
Atlanta, GA        7300
Boston, MA         5400
Chicago, IL        6700
Los Angeles, CA    8900
New York, NY       8200
Washington, D.C.   8900

Page 39: Business Statistics Chapter 1 - CA Sri Lanka

1-39

Bar Chart for the Unemployment Data

[Figure: bar chart. Horizontal axis: cities (Atlanta, Boston, Chicago, Los Angeles, New York, Washington). Vertical axis: number unemployed per 100,000 (0 to 10,000). Bar values: 7300, 5400, 6700, 8900, 8200, 8900.]

Page 40: Business Statistics Chapter 1 - CA Sri Lanka

1-40

Pie Chart

A pie chart is useful for displaying a relative frequency distribution. A circle is divided proportionally to the relative frequency and portions of the circle are allocated for the different groups.

Eg 4: A sample of 200 runners were asked to indicate their favorite type of running shoe. Draw a pie chart based on the following information.

Type of shoes   # of runners
Nike            92
Adidas          49
Reebok          37
Asics           13
Other           9

Page 41: Business Statistics Chapter 1 - CA Sri Lanka

1-41

Pie Chart for Running Shoes

[Figure: pie chart with slices for Nike, Adidas, Reebok, Asics, and Other.]

Page 42: Business Statistics Chapter 1 - CA Sri Lanka

1-42


Chapter Three

Describing Data: Measures of Central Tendency

Page 43: Business Statistics Chapter 1 - CA Sri Lanka

1-43

Characteristics of the Mean

The arithmetic mean is the most widely used measure of location.

It is calculated by summing the values and dividing by the number of values.

The major characteristics of the mean are:

It requires at least interval-level data.

All values are used.

It is unique.

The sum of the deviations from the mean is 0.

Page 44: Business Statistics Chapter 1 - CA Sri Lanka

1-44

Population Mean

For ungrouped data, the population mean is

the sum of all the population values divided

by the total number of population values:

$\mu = \frac{\sum X}{N}$

where µ is the population mean,

N is the total number of observations,

X is a particular value, and

Σ indicates the operation of adding.

Page 45: Business Statistics Chapter 1 - CA Sri Lanka

1-45

Eg 1: The Kiers family owns four cars. The following is the current mileage on each of the four cars:

56,000, 23,000, 42,000, 73,000

Find the mean mileage for the cars.

$\mu = \frac{\sum X}{N} = \frac{56{,}000 + \dots + 73{,}000}{4} = 48{,}500$

A Parameter is a measurable characteristic of a population.
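The Kiers example can be verified in a few lines of Python. The sketch treats the four cars as the whole population, as the slide does, and also checks the property that the deviations from the mean sum to zero; the variable names are illustrative.

```python
# Current mileage on each of the Kiers family's four cars (the whole population).
mileage = [56_000, 23_000, 42_000, 73_000]

# Population mean: sum of all values divided by the number of values, N.
N = len(mileage)
mu = sum(mileage) / N
print("population mean:", mu)

# Property of the mean: deviations from the mean add up to zero.
deviations = [x - mu for x in mileage]
print("sum of deviations:", sum(deviations))
```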

Page 46: Business Statistics Chapter 1 - CA Sri Lanka

1-46

Sample Mean

• For ungrouped data, the sample mean is the sum of all the sample values divided by the number of sample values:

$\bar{X} = \frac{\sum X}{n}$

where n is the total number of values in the sample.

Page 47: Business Statistics Chapter 1 - CA Sri Lanka

1-47

Eg 2: A sample of five executives received the following bonus last year ($000):

14.0, 15.0, 17.0, 16.0, 15.0

$\bar{X} = \frac{\sum X}{n} = \frac{14.0 + \dots + 15.0}{5} = \frac{77}{5} = 15.4$

A statistic is a measurable characteristic of a sample.

Page 48: Business Statistics Chapter 1 - CA Sri Lanka

1-48

Properties of the Arithmetic Mean

• Every set of interval-level and ratio-level data

has a mean.

• All the values are included in computing the

mean.

• A set of data has a unique mean.

• The mean is affected by unusually large or small

data values.

• The arithmetic mean is the only measure of

central tendency where the sum of the deviations

of each value from the mean is zero.

Page 49: Business Statistics Chapter 1 - CA Sri Lanka

1-49

Weighted Mean

Eg 3: Consider the set of values: 3, 8, and 4. The mean is 5. Illustrating the fifth property:

$\sum (X - \bar{X}) = (3 - 5) + (8 - 5) + (4 - 5) = 0$

The weighted mean of a set of numbers X1, X2, ..., Xn, with corresponding weights w1, w2, ..., wn, is computed from the following formula:

$\bar{X}_w = \frac{w_1 X_1 + w_2 X_2 + \dots + w_n X_n}{w_1 + w_2 + \dots + w_n}$

Page 50: Business Statistics Chapter 1 - CA Sri Lanka

1-50

Eg 6: During a one hour period on a hot Saturday afternoon cabana boy Chris served fifty drinks. He sold five drinks for $0.50, fifteen for $0.75, fifteen for $0.90, and fifteen for $1.15. Compute the weighted mean of the price of the drinks.

$\bar{X}_w = \frac{5(\$0.50) + 15(\$0.75) + 15(\$0.90) + 15(\$1.15)}{5 + 15 + 15 + 15} = \frac{\$44.50}{50} = \$0.89$
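The weighted mean for the drinks example can be checked with the sketch below. It assumes the fourth price is $1.15, the figure used in the worked computation and the one that reproduces the $0.89 result; the list names are illustrative.

```python
# Prices of the drinks and the number sold at each price (the weights).
# Assumption: the fourth price is $1.15, as in the worked computation.
prices = [0.50, 0.75, 0.90, 1.15]
weights = [5, 15, 15, 15]

# Weighted mean: sum of (weight * value) divided by the sum of the weights.
weighted_sum = sum(w * x for w, x in zip(weights, prices))
weighted_mean = weighted_sum / sum(weights)
print(f"weighted mean price: ${weighted_mean:.2f}")
```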

Page 51: Business Statistics Chapter 1 - CA Sri Lanka

1-51

The Median

• The Median is the midpoint of the values after

they have been ordered from the smallest to the

largest. There are as many values above the median as below it in the data array.

For an even number of values, the median is the arithmetic average of the two middle numbers.

Page 52: Business Statistics Chapter 1 - CA Sri Lanka

1-52

Eg 4 : The ages for a sample of five college students are:

21, 25, 19, 20, 22

Arranging the data in ascending order gives: 19, 20, 21, 22, 25. Thus the median is 21.

Eg 5 : The heights of four basketball players, in

inches, are:

76, 73, 80, 75

Arranging the data in ascending order gives:

73, 75, 76, 80. Thus the median is 75.5
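Both median examples can be reproduced with Python's standard `statistics` module, which sorts the data and averages the two middle values when the count is even. This is only a sketch for checking the slide's answers.

```python
from statistics import median

# Eg 4: ages of five college students (odd number of values).
ages = [21, 25, 19, 20, 22]
median_age = median(ages)          # middle value of the sorted data

# Eg 5: heights of four basketball players (even number of values).
heights = [76, 73, 80, 75]
median_height = median(heights)    # average of the two middle values

print(sorted(ages), "median:", median_age)
print(sorted(heights), "median:", median_height)
```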

Page 53: Business Statistics Chapter 1 - CA Sri Lanka

1-53

Properties of the Median

• There is a unique median for each data set.

• It is not affected by extremely large or small values and is therefore a valuable measure of central tendency when such values occur.

• It can be computed for ratio-level, interval-level, and ordinal-level data.

• It can be computed for an open-ended frequency distribution if the median does not lie in an open-ended class.

Page 54: Business Statistics Chapter 1 - CA Sri Lanka

1-54

The Mode

• The mode is the value of the observation

that appears most frequently.

Eg 5: The exam scores for ten students are: 81, 93,

84, 75, 68, 87, 81, 75, 81, 87. Because the score

of 81 occurs the most often, it is the mode.
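A tally of the exam scores confirms the mode. The sketch uses `collections.Counter` and `statistics.multimode` from the standard library (the latter requires Python 3.8 or later); `multimode` returns every value tied for the highest count, which is useful when a data set has more than one mode.

```python
from collections import Counter
from statistics import multimode

# Exam scores for the ten students.
scores = [81, 93, 84, 75, 68, 87, 81, 75, 81, 87]

tally = Counter(scores)      # how many times each score occurs
modes = multimode(scores)    # list of the most frequent value(s)

print("counts:", dict(tally))
print("mode(s):", modes)
```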

Page 55: Business Statistics Chapter 1 - CA Sri Lanka

1-55

Geometric Mean

• The geometric mean (GM) of a set of n numbers is defined as the nth root of the product of the n numbers. The formula is:

$GM = \sqrt[n]{(X_1)(X_2)(X_3)\cdots(X_n)}$

The geometric mean is used to average percents, indexes, and relatives.

Page 56: Business Statistics Chapter 1 - CA Sri Lanka

1-56

• The interest rates on three bonds were 5, 21, and 4 percent.

• The geometric mean is $GM = \sqrt[3]{(5)(21)(4)} = 7.49$

• The arithmetic mean is (5+21+4)/3 = 10.0

• The GM gives a more conservative profit figure because it is not heavily weighted by the rate of 21 percent.
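The bond example can be checked by computing the nth root of the product directly and comparing it with `statistics.geometric_mean` (available in Python 3.8 and later). The variable names are illustrative.

```python
from math import prod
from statistics import geometric_mean, mean

# Interest rates (in percent) on the three bonds.
rates = [5, 21, 4]

# Geometric mean from the definition: nth root of the product of n values.
gm_by_definition = prod(rates) ** (1 / len(rates))

# Same quantity using the standard library helper.
gm_library = geometric_mean(rates)

print("geometric mean:", round(gm_by_definition, 2))
print("arithmetic mean:", mean(rates))
```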

Page 57: Business Statistics Chapter 1 - CA Sri Lanka

1-57

Geometric Mean continued

• Another use of the geometric mean is to determine the percent increase in sales, production or other business or economic series from one time period to another.

$GM = \sqrt[n]{\frac{\text{Value at end of period}}{\text{Value at beginning of period}}} - 1$

where n is the number of periods.

Page 58: Business Statistics Chapter 1 - CA Sri Lanka

1-58

Eg 8 : The total number of females enrolled in American colleges increased from 755,000 in 1992 to 835,000 in 2000.

$GM = \sqrt[8]{\frac{835{,}000}{755{,}000}} - 1 = 0.0127$

That is, the geometric mean rate of increase is 1.27% per year.
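The rate-of-increase calculation can be wrapped in a small helper. The function name `mean_growth_rate` is hypothetical, chosen for this sketch; it takes the value at the beginning of the period, the value at the end, and the number of periods (8 years from 1992 to 2000).

```python
def mean_growth_rate(start, end, periods):
    """Geometric mean rate of increase per period."""
    return (end / start) ** (1 / periods) - 1

# Eg 8: female enrollment grew from 755,000 (1992) to 835,000 (2000).
rate = mean_growth_rate(755_000, 835_000, 8)
print(f"mean rate of increase: {rate:.2%} per year")

# Applying the rate for 8 periods recovers the ending value (up to rounding).
recovered = 755_000 * (1 + rate) ** 8
print("recovered end value:", round(recovered))
```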

Page 59: Business Statistics Chapter 1 - CA Sri Lanka

1-59

Eg 9 : There are many flights from Houston to Little Rock, AR each day. The data below shows the number of minutes a flight was late (or early) in arriving in Little Rock for a sample of 5 flights. To explain, a positive number means the flight was late, a value of 0 indicates it arrived on time, and a negative number indicates it was early. So the first flight was 4 minutes late and the last flight 10 minutes early.

4 12 -9 6 -10

a. Determine the mean amount flights were late (or early).

b. Determine the median amount flights were late (or early).

Page 60: Business Statistics Chapter 1 - CA Sri Lanka

1-60

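a. Determine the mean amount flights were late (or early).

$\bar{X} = \frac{\sum X}{n} = \frac{3}{5} = 0.60$ minutes late

b. Determine the median amount flights were late (or early).

Arranging the data in ascending order gives -10, -9, 4, 6, 12. Thus Median = 4 minutes late.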

Page 61: Business Statistics Chapter 1 - CA Sri Lanka

1-61

Eg10: Suppose your cousin started a management job with Ford Motor Company in 1990 at $30,000 per year. In the year 2002 her salary was $65,000. What was the geometric mean rate of increase per year for the period?

$GM = \sqrt[12]{\frac{\$65{,}000}{\$30{,}000}} - 1.0 = 1.06655 - 1.0 = 0.06655$

Eg 11: For a sample of 50 stocks traded yesterday on the American Stock Exchange, 10 showed a decline of $1.00, 15 showed no change, and 25 increased by $2.00. Find the weighted mean.

$\bar{X}_w = \frac{10(-\$1.00) + 15(\$0) + 25(\$2.00)}{10 + 15 + 25} = \frac{\$40.00}{50} = \$0.80$

Page 62: Business Statistics Chapter 1 - CA Sri Lanka

1-62

The Mean of Grouped Data

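The mean of a sample of data organized in a frequency distribution is computed by the following formula:

$\bar{X} = \frac{\sum fX}{n}$

where X is the class midpoint, f is the class frequency, and n is the total number of observations.

Eg12: A sample of ten movie theaters in a large metropolitan area tallied the total number of movies showing last week. Compute the mean number of movies showing.

$\bar{X} = \frac{\sum fX}{n} = \frac{66}{10} = 6.6$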

Page 63: Business Statistics Chapter 1 - CA Sri Lanka

1-63

Eg 12 : continued

Movies showing   Frequency f   Class midpoint X   (f)(X)
1 up to 3        1             2                  2
3 up to 5        2             4                  8
5 up to 7        3             6                  18
7 up to 9        1             8                  8
9 up to 11       3             10                 30
Total            10                               66
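The grouped-mean table can be reproduced with a short sketch: each class is represented by its midpoint, multiplied by its frequency, and the products are summed. The tuple layout `(lower, upper, frequency)` is an illustrative choice.

```python
# Each class: (lower limit, upper limit, frequency) for "movies showing".
classes = [(1, 3, 1), (3, 5, 2), (5, 7, 3), (7, 9, 1), (9, 11, 3)]

n = sum(f for _, _, f in classes)            # total number of theaters

# Sum of f * X, where X is the class midpoint.
total_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)

grouped_mean = total_fx / n
print("n =", n, " sum of fX =", total_fx, " mean =", grouped_mean)
```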

Page 64: Business Statistics Chapter 1 - CA Sri Lanka

1-64

The Median of Grouped Data

• The median of a sample of data organized in a frequency distribution is computed by:

$\text{Median} = L + \frac{\frac{n}{2} - CF}{f}(i)$

where L is the lower limit of the median class, CF is the cumulative frequency preceding the median class, f is the frequency of the median class, and i is the median class interval.

Page 65: Business Statistics Chapter 1 - CA Sri Lanka

1-65

Finding the Median Class

To determine the median class for grouped

data:

– Construct a cumulative frequency distribution.

– Divide the total number of data values by 2.

– Determine which class will contain this value.

For example, if n=50, 50/2 = 25, then

determine which class will contain the 25th

value.

Page 66: Business Statistics Chapter 1 - CA Sri Lanka

1-66

Eg13 :

Movies showing   Frequency   Cumulative Frequency
1 up to 3        1           1
3 up to 5        2           3
5 up to 7        3           6
7 up to 9        1           7
9 up to 11       3           10

From the table, L=5, n=10, f=3, i=2, CF=3

$\text{Median} = L + \frac{\frac{n}{2} - CF}{f}(i) = 5 + \frac{\frac{10}{2} - 3}{3}(2) = 6.33$
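The median-class search and the interpolation formula can be sketched as follows. The code walks the classes while accumulating frequencies, stops at the first class whose cumulative frequency reaches n/2, and then applies Median = L + ((n/2 − CF)/f)(i). The names are illustrative, and the stopping rule (cumulative frequency at least n/2) is an assumption of this sketch.

```python
# Each class: (lower limit, upper limit, frequency) for "movies showing".
classes = [(1, 3, 1), (3, 5, 2), (5, 7, 3), (7, 9, 1), (9, 11, 3)]

n = sum(f for _, _, f in classes)
half = n / 2            # position of the median value

cf = 0                  # cumulative frequency preceding the current class
grouped_median = None
for lo, hi, f in classes:
    if cf + f >= half:  # this class contains the (n/2)th value
        grouped_median = lo + (half - cf) / f * (hi - lo)
        break
    cf += f

print("L =", lo, " CF =", cf, " f =", f, " i =", hi - lo)
print("median =", round(grouped_median, 2))
```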

Page 67: Business Statistics Chapter 1 - CA Sri Lanka

1-67

The Mode of Grouped Data

• The mode for grouped data is approximated

by the midpoint of the class with the largest

class frequency.

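The modes in Eg 13 are 6 and 10, the midpoints of the two classes with the largest frequency (5 up to 7 and 9 up to 11, each with a frequency of 3). When two values occur a large number of times, the distribution is called bimodal, as in Eg 13.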

Page 68: Business Statistics Chapter 1 - CA Sri Lanka

1-68

Eg14: The following frequency distribution reports the number of students enrolled in each of the 50 sections of various courses taught in the College of Business last summer.

Students       Frequency
0 up to 10     3
10 up to 20    8
20 up to 30    16
30 up to 40    10
40 up to 50    9
50 up to 60    4
Total          50

a. Determine the mean number of students per section.

$\bar{X} = \frac{\sum fX}{n} = \frac{1510}{50} = 30.2$

b. Determine the median number of students per section.

$\text{Median} = 20 + \frac{25 - 11}{16}(10) = 28.75$

Page 69: Business Statistics Chapter 1 - CA Sri Lanka

1-69

Symmetric Distribution

Zero skewness: mode = median = mean

Page 70: Business Statistics Chapter 1 - CA Sri Lanka

1-70

Right Skewed Distribution

Positively skewed: Mean and Median are to the right of the Mode.

Mode < Median < Mean

Page 71: Business Statistics Chapter 1 - CA Sri Lanka

1-71

Left Skewed Distribution

Negatively Skewed: Mean and Median are to the

left of the Mode.

Mean < Median < Mode

Page 72: Business Statistics Chapter 1 - CA Sri Lanka

1-72

End of the Chapter

• Thank you

• Questions
