Probability & Statistics Chapter 02 Summarizing and Graphing Data.

33
Probability & Statistics Chapter 02 Summarizing and Graphing Data

Transcript of Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Page 1: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Probability & Statistics

Chapter 02Summarizing and Graphing Data

Page 2: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Describing Data- 1. Frequency distribution

• Frequency distribution: A grouping of data into categories showing the number of observations in each mutually exclusive category.

Page 3: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

The steps for organizing data into a frequency distribution

1. Set up grouping called classes. Decide on the size of the class interval.

2. Tally the raw data or ungrouped data into classes.

3. Count the number of tallies in each class. The number of observation in the is called the class frequency.

Page 4: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

• Class: A class is one of the categories into which data can be classified.

• Class interval for a frequency distribution is obtained by subtracting the lower limit of a class from the lower limit of the next class or subtracting the higher limit of a class from the higher limit of the next class

Page 5: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

• Class limits: represent the smallest and largest data values that can be included in a class

• The class boundaries are used to separate the classes so that there are no gaps in the frequency distribution

Page 6: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

• Class mark (midpoint): A point that divides a class into two equal parts. This is the average between the upper and lower class limits.

Page 7: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

EXAMPLE 1

• Prof. Munasinghe is the head of the Department of Industrial Management (IM) and wishes to determine the amount of studying IM students do. He selects a random sample of 30 students and determines the number of hours each student studies per week: 15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7, 17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9, 10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6,

• Organize the data into a frequency distribution.

Page 8: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Guidelines for Constructing a Frequency Distribution

• To determine the number of classes “2 to the k rule” can be usedselect the smallest integer (k) such that

2k n

n is the number of observations• There should be between 5 and 20 classes.

Page 9: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Guidelines for Constructing a Frequency Distribution

classes of number

value lowest - value highest i

• Determine the class interval based on the number of suggested classes by using the formula:

• The class intervals used in the frequency distribution should be equal.

• Use the computed suggested class interval to construct the frequency distribution. Note: this is a suggested class interval; if the computed class interval is 97, it may be better to use 100.

Page 10: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Guidelines for Constructing a Frequency Distribution

• Avoid open ended classes• Do not have overlapping classes• Count the number of values in each class.

Page 11: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

EXAMPLE 1

• 15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7, 17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9, 10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6

• Number of possible classes: 2k nneed to find a value for k where 2k 30 then k = 5number of possible classes = 5

• Class interval:i = (highest value – lowest value) / number of classesi = (33.8 – 10.3) / 5 = 4.7better to use i = 5

Page 12: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

EXAMPLE 1

Classes Tally Frequency

10.3 – 15.2 |||| ||| 8

15.3 – 20.2 |||| |||| | 11

20.3 – 25.2 |||| || 7

25.3 – 30.2 ||| 3

30.3 – 35.2 | 1

Page 13: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Relative Frequency Distribution

• Shows the percentage of the observation in each class

• The relative frequency of a class is obtained by dividing the class frequency by the total frequency.

• This allows us to make statements regarding the number of observations in a single class relative to the entire sample

Page 14: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Relative Frequency Distribution for Exercise 1

Classes Frequency Relative Frequency

10.3 – 15.2 8 8/30 = 26.7%

15.3 – 20.2 11 11/30 = 36.7%

20.3 – 25.2 7 7/30 = 23.3%

25.3 – 30.2 3 3/30 = 10%

30.3 – 35.2 1 1/30 = 3.3%

Page 15: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Cumulative Relative Frequency Distribution

• The cumulative relative frequency distribution express the cumulative frequency of each class relative to the entire sample

Page 16: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Cumulative Relative Frequency Distribution for Exercise 1

Classes Frequency Cumulative Frequency

Cumulative Relative Frequency

10.3 – 15.2 8 8 8/30 = 26.7%

15.3 – 20.2 11 19 19/30 = 63.3%

20.3 – 25.2 7 26 26/30 = 86.7%

25.3 – 30.2 3 29 29/30 = 96.7%

30.3 – 35.2 1 30 30/30 = 100%

Page 17: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Contingency Table

Complaint Origin Reason for complaint

Electrical Mechanical Appearance

During guarantee period

18% 13% 32%

After guarantee period

12% 22% 3%

• Contingency table indicates the number of observations for both variables that fall jointly in each category.

• Example: Distribution of product complaint

Page 18: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Graphic Presentation

• There are three commonly used graphic forms of a frequency distribution – histograms, – frequency polygons, – cumulative frequency distribution (ogive).

Page 19: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Histogram

• A graph in which – the classes are marked on the horizontal axis – the class frequencies on the vertical axis.– the class frequencies are represented by the

heights of the bars– the bars are drawn adjacent to each other.

Page 20: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Histogram for Exercise 1

Class Midpoint

Frequency

12.75 8

17.75 11

22.75 7

27.75 3

32.75 1

Histrogram for study hours of IM student

0

4

8

12

12.75 17.75 22.75 27.75 32.75

Number of hours

Fre

qu

en

cy

Page 21: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Frequency Polygon

• A frequency polygon consists of line segments connecting the points formed by the class midpoint and the class frequency.

Page 22: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Frequency Polygon for Exercise 1

Class Midpoint

Frequency

7.75 0

12.75 8

17.75 11

22.75 7

27.75 3

32.75 1

37.75 0

Frequency Polygon for study hours of IM students

0

2

4

6

8

10

12

7.75 12.75 17.75 22.75 27.75 32.75 37.75

Number of hours

Fre

qu

ency

Page 23: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Cumulative Frequency distribution

• A cumulative frequency distribution (ogive) is used to determine how many or what proportion of the data values are below or above a certain value.

Page 24: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Ogive for Exercise 1

Class Midpoint

FrequencyLess than cumulative frequency

More than cumulative frequency

7.75 0 0 30

12.75 8 8 30

17.75 11 19 22

22.75 7 26 11

27.75 3 29 4

32.75 1 30 1

37.75 0 30 0

Page 25: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Ogive for Exercise 1

Ogive for study hours if IM students

05

1015

20253035

7.75 12.75 17.75 22.75 27.75 32.75 37.75

Number of hours

Fre

qu

ency

Less than cumulative frequency More than cumulative frequency

Page 26: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Bar Chart

CityNumber of

unemployed (in ‘000)

Gampaha 730

Matale 540

Chilaw 670

Colombo 890

Galle 820

Kandy 890

• A bar chart can be used to depict any of the levels of measurement (nominal, ordinal, interval, or ratio).

2-17

EXAMPLE : Construct a bar chart for the number of unemployed people per 100,000 population for selected cities of 1995.

Page 27: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

EXAMPLE continued

Unemployment data

0

200

400

600

800

1000

Gampaha Matale Chilaw Colombo Galle Kandy

City

nu

mb

er o

f p

erso

ns

2-18

a bar chart for the number of unemployed people per 100,000 population for selected cities of 1995

Page 28: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Pie Chart

Type of shoe

# of runners

Nike 92

Adidas 49

Reebok 37

Asics 13

• A pie chart is especially useful in displaying a relative frequency distribution. A circle is divided proportionally to the relative frequency and portions of the circle are allocated for the different groups.

• Example: Draw a pie chart based on the following information.

Page 29: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

EXAMPLE continued

Pie chart for running shoes

Nike45%

Asics7%

Other5%

Reebok19%

Adidas24%

Type of shoe

# of runners

Nike 92

Adidas 49

Reebok 37

Asics 13

Page 30: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Line Chart

• A graph of data that is mapped by a series of lines. Line charts show changes in data or categories of data over time and can be used to document trends

CityNumber of

unemployed (in ‘000)

Gampaha 730

Matale 540

Chilaw 670

Colombo 890

Galle 820

Kandy 890

EXAMPLE : Construct a bar chart for the number of unemployed people per 100,000 population for selected cities of 1995.

Page 31: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

EXAMPLE continued

Unemployment data

0

200

400

600

800

1000

Gampa

ha

Mata

le

Chilaw

Colom

boGall

e

Kandy

City

Nu

mb

er

of

pe

rso

n

Page 32: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Stem-and-Leaf Displays

• Stem-and-Leaf Display: A statistical technique for displaying a set of data. Each numerical value is divided into two parts: the leading digits become the stem and the trailing digits become the leaf.

Note: An advantage of the stem-and-leaf display over a frequency distribution is we do not lose the identity of each observation.

Page 33: Probability & Statistics Chapter 02 Summarizing and Graphing Data.

EXAMPLE

stem leaf

6 9

7 8 9

8 2 3 4 5 6 8

9 1 2 6

• Saman achieved the following scores on his twelve accounting quizzes this semester: 86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85. Construct a stem-and-leaf chart for the data.