CHAPTER 2 : ORGANIZING DATA - · 10/18/2016 1 ORGANIZING AND GRAPHING QUANTITATIVE DATA 36...


ORGANIZING AND GRAPHING QUANTITATIVE DATA

October 19, 2016

Organizing & Graphing Quantitative Data

• Ordered array
• Frequency Distributions
  – Constructing Frequency Distribution Tables
  – Relative and Percentage Distributions
• Graphing Grouped Data
  – Stem and leaf plots
  – Histograms
  – Polygons

Organizing & Grouping Data

• Before the days of computers, data were grouped mainly to facilitate the calculation of descriptive measures such as percentages and averages.
• The main purpose in grouping data now is summarization.
• Summarization is a way of making it easier to understand the information in data.

Ordered array

• A first step in organizing data
• An ordered array is a listing of the values of a collection (either population or sample) in order of magnitude from the smallest value to the largest value.
• If the number of measurements to be ordered is of any appreciable size, the use of a computer is highly desirable.


STEM-AND-LEAF DISPLAYS

Definition

In a stem-and-leaf display of quantitative data, each value is divided into two portions – a stem and a leaf. The leaves for each stem are shown separately in a display.


Example 2-8

The following are the scores of 30 college students on a statistics test:

75 52 80 96 65 79 71 87 93 95
69 72 81 61 76 86 79 68 50 92
83 84 77 64 71 87 72 92 57 98

Construct a stem-and-leaf display.

Solution 2-8

To construct a stem-and-leaf display for these scores, we split each score into two parts. The first part contains the first digit, which is called the stem. The second part contains the second digit, which is called the leaf.

We observe from the data that the stems for all scores are 5, 6, 7, 8, and 9 because all the scores lie in the range 50 to 98.

Stems
5 | 2    (leaf for 52)
6 |
7 | 5    (leaf for 75)
8 |
9 |

Figure 2.13 Stem-and-leaf display.


After we have listed the stems, we read the leaves for all scores and record them next to the corresponding stems on the right side of the vertical line.

5 | 2 0 7
6 | 5 9 1 8 4
7 | 5 9 1 2 6 9 7 1 2
8 | 0 7 1 6 3 4 7
9 | 6 3 5 2 2 8

Figure 2.14 Stem-and-leaf display of test scores.

5 | 0 2 7
6 | 1 4 5 8 9
7 | 1 1 2 2 5 6 7 9 9
8 | 0 1 3 4 6 7 7
9 | 2 2 3 5 6 8

Figure 2.15 Ranked stem-and-leaf display of test scores.
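The splitting and ranking steps of Solution 2-8 can be sketched in Python. This is only an illustration and not part of the original slides: the function name `stem_and_leaf` and its `ranked` flag are assumptions made for this example. The scores are entered in an order that reproduces the unranked leaves of Figure 2.14.

```python
from collections import defaultdict

# Scores from Example 2-8.
scores = [75, 52, 80, 96, 65, 79, 71, 87, 93, 95,
          69, 72, 81, 61, 76, 86, 79, 68, 50, 92,
          83, 84, 77, 64, 71, 87, 72, 92, 57, 98]

def stem_and_leaf(values, ranked=False):
    """Map each stem (tens digit) to its list of leaves (units digit)."""
    display = defaultdict(list)
    for v in values:
        stem, leaf = divmod(v, 10)      # e.g. 75 -> stem 7, leaf 5
        display[stem].append(leaf)
    if ranked:
        for leaves in display.values():
            leaves.sort()               # ranked display: leaves in increasing order
    return dict(sorted(display.items()))

unranked = stem_and_leaf(scores)
ranked = stem_and_leaf(scores, ranked=True)

for stem, leaves in ranked.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Sorting the leaves is the only difference between the two displays; the stems and the number of leaves per stem stay the same.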

Example 2-9

The following data are monthly rents paid by a sample of 30 households selected from a small city.

 880 1081  721 1075 1023  775 1235  750  965  960
1210  985 1231  932  850  825 1000  915 1191 1035
1151  630 1175  952 1100 1140  750 1140 1370 1280

Construct a stem-and-leaf display for these data.

Solution 2-9

 6 | 30
 7 | 75 50 21 50
 8 | 80 25 50
 9 | 32 52 15 60 85 65
10 | 23 81 35 75 00
11 | 91 51 40 75 40 00
12 | 10 31 35 80
13 | 70

Figure 2.16 Stem-and-leaf display of rents.
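When the leaves have two digits, as in Figure 2.16, the same idea applies with the split made at the hundreds place. The sketch below is an illustration only; the function name `stem_and_leaf_two_digit` is an assumption for this example. The order of leaves within a stem depends on the order in which the data are read, so it need not match the figure.

```python
# Monthly rents from Example 2-9.
rents = [880, 1081, 721, 1075, 1023, 775, 1235, 750, 965, 960,
         1210, 985, 1231, 932, 850, 825, 1000, 915, 1191, 1035,
         1151, 630, 1175, 952, 1100, 1140, 750, 1140, 1370, 1280]

def stem_and_leaf_two_digit(values):
    """Split each value into a hundreds stem and a two-digit leaf."""
    display = {}
    for v in values:
        stem, leaf = divmod(v, 100)     # e.g. 1081 -> stem 10, leaf 81
        display.setdefault(stem, []).append(leaf)
    return dict(sorted(display.items()))

rent_display = stem_and_leaf_two_digit(rents)

for stem, leaves in rent_display.items():
    # Zero-pad the leaves so that a leaf of 0 prints as "00", as in the figure.
    print(f"{stem:>2} |", " ".join(f"{leaf:02d}" for leaf in leaves))
```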


Information from a stem-and-leaf display

• Provides information regarding the range of the data set
• Shows the location of the highest concentration of measurements
• Reveals the presence or absence of symmetry

Example 2-10

The following stem-and-leaf display is prepared for the number of hours that 25 students spent working on computers during the last month.

0 | 6
1 | 1 7 9
2 | 2 6
3 | 2 4 7 8
4 | 1 5 6 9 9
5 | 3 6 8
6 | 2 4 4 5 7
7 |
8 | 5 6

Prepare a new stem-and-leaf display by grouping the stems.

Solution 2-10

0 – 2 | 6 * 1 7 9 * 2 6
3 – 5 | 2 4 7 8 * 1 5 6 9 9 * 3 6 8
6 – 8 | 2 4 4 5 7 * * 5 6

Figure 2.17 Grouped stem-and-leaf display.

An asterisk separates the leaves of one stem from those of the next stem in the same group. The two consecutive asterisks in the last row show that stem 7 has no leaves.

Stem-and-leaf displays

• Most effective with relatively small data sets.

• As a rule they are not suitable for use in annual reports or other communications aimed at the general public.

• They are primarily of value in helping researchers and decision makers understand the nature of their data.


Frequency Distributions

• A frequency distribution for quantitative data lists all the classes and the number of values that belong to each class.
• Data presented in the form of a frequency distribution are called grouped data.

Table 2.7 Weekly Earnings of 100 Employees of a Company

Weekly Earnings (dollars)    Number of Employees (f)
401 to 600                    9
601 to 800                   22
801 to 1000                  39
1001 to 1200                 15
1201 to 1400                  9
1401 to 1600                  6

Parts of the table:

• Variable: Weekly Earnings (dollars)
• Frequency column: Number of Employees (f)
• Third class: 801 to 1000
• Frequency of the third class: 39
• Lower limit of the sixth class: 1401
• Upper limit of the sixth class: 1600
• Class width: 200 (for example, 601 – 401)

Essential Question: How do we construct a frequency distribution table?

Process of Constructing a Frequency Table


STEP 1. Determine the tentative number of classes (k).

k = 1 + 3.322 log N

Here N is the number of observations and log is the base-10 logarithm. Always round up to the next integer.

Note: The number of classes should be between 5 and 15. The actual number of classes may be affected by convenience or other subjective factors.

STEP 2. Determine the range (R).

R = Highest Value – Lowest Value

STEP 3. Find the class width (c) by dividing the range by the number of classes. Always round up.

c = R / k = Range / Number of classes

STEP 4. Write the classes or categories starting with the lowest score. Stop when a class includes the highest score.

Add the class width to the starting point to get the second lower class limit. Add the class width to the second lower class limit to get the third, and so on. List the lower class limits in a vertical column and enter the upper class limits, which can be easily identified at this stage.

STEP 5. Determine the frequency for each class by referring to the tally columns and present the results in a table.


When constructing frequency tables, the following guidelines should be followed.

• The classes must be mutually exclusive. That is, each score must belong to exactly one class.
• Include all classes, even if the frequency might be zero.
• All classes should have the same width, although it is sometimes impossible to avoid open-ended intervals such as “65 years or older”.
• The number of classes should be between 5 and 15.

Let’s Try!!!

• Time magazine collected information on all 464 people who died from gunfire in the Philippines during one week. Here are the ages of 50 men randomly selected from that population.

• Construct a frequency distribution table.

19 18 30 40 41 33 73 25

23 25 21 33 65 17 20 76

47 69 20 31 18 24 35 24

17 36 65 70 22 25 65 16

24 29 42 37 26 46 27 63

21 27 23 25 71 37 75 25

27 23

Determine the tentative number of classes (K).

K = 1 + 3.322 log N
  = 1 + 3.322 log 50
  = 1 + 3.322 (1.69897)
  = 6.64

* Round up the result to the next integer if the decimal part exceeds 0.

K = 7

Determine the range.

R = Highest Value – Lowest Value
R = 76 – 16 = 60


Find the class width (c).

c = R / k = 60 / 7 = 8.57 ≈ 9

* Round up the quotient if the decimal part exceeds 0.

Write the classes starting with the lowest score.

Classes    Tally Marks             Freq.
70 – 78    /////                    5
61 – 69    /////                    5
52 – 60                             0
43 – 51    //                       2
34 – 42    /////-//                 7
25 – 33    /////-/////-////        14
16 – 24    /////-/////-/////-//    17
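The five steps can be checked with a short Python sketch. It is an illustration under stated assumptions, not the slides' own method: rounding up is done with `math.ceil`, the first lower class limit is taken to be the lowest age, and all variable names are chosen for this example.

```python
import math

# Ages of the 50 men from the "Let's Try" exercise.
ages = [19, 18, 30, 40, 41, 33, 73, 25,
        23, 25, 21, 33, 65, 17, 20, 76,
        47, 69, 20, 31, 18, 24, 35, 24,
        17, 36, 65, 70, 22, 25, 65, 16,
        24, 29, 42, 37, 26, 46, 27, 63,
        21, 27, 23, 25, 71, 37, 75, 25,
        27, 23]

n = len(ages)
k = math.ceil(1 + 3.322 * math.log10(n))     # Step 1: tentative number of classes
data_range = max(ages) - min(ages)           # Step 2: range
width = math.ceil(data_range / k)            # Step 3: class width, rounded up

# Step 4: build classes from the lowest score until the highest score is covered.
classes = []
lower = min(ages)
while not classes or classes[-1][1] < max(ages):
    classes.append((lower, lower + width - 1))
    lower += width

# Step 5: count how many ages fall inside each class (limits are inclusive).
frequencies = [sum(lo <= a <= hi for a in ages) for lo, hi in classes]

# Print from the highest class down, as in the table.
for (lo, hi), f in zip(reversed(classes), reversed(frequencies)):
    print(f"{lo} - {hi}: {f}")
```

Because the loop in Step 4 stops only when the highest score is covered, the final number of classes can differ from the tentative k for other data sets.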

Using the table:

• What is the lower class limit of the highest class?
• What is the upper class limit of the lowest class?
• Find the class mark of the class 43 – 51.
• What is the frequency of the class 16 – 24?

Concept of true class boundaries

Classes    True Class Boundaries    Tally Marks             Freq.    x
70 – 78    69.5 – 78.5              /////                    5       74
61 – 69    60.5 – 69.5              /////                    5       65
52 – 60    51.5 – 60.5                                       0       56
43 – 51    42.5 – 51.5              //                       2       47
34 – 42    33.5 – 42.5              /////-//                 7       38
25 – 33    24.5 – 33.5              /////-/////-////        14       29
16 – 24    15.5 – 24.5              /////-/////-/////-//    17       20
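The boundary and class mark (x) columns follow directly from the class limits. The sketch below assumes whole-number data, so each true boundary lies half a unit beyond the class limit, and takes the class mark as the midpoint of the limits. The function names are hypothetical, chosen for this example.

```python
# Class limits from the age table, lowest class first.
classes = [(16, 24), (25, 33), (34, 42), (43, 51), (52, 60), (61, 69), (70, 78)]

def true_boundaries(lower, upper, unit=1):
    """Extend each class limit by half the measurement unit (assumed to be 1)."""
    return (lower - unit / 2, upper + unit / 2)

def class_mark(lower, upper):
    """Midpoint of the lower and upper class limits."""
    return (lower + upper) / 2

boundaries = [true_boundaries(lo, hi) for lo, hi in classes]
marks = [class_mark(lo, hi) for lo, hi in classes]

for (lo, hi), (b_lo, b_hi), x in zip(classes, boundaries, marks):
    print(f"{lo} - {hi}   {b_lo} - {b_hi}   x = {x:g}")
```

With true boundaries, the upper boundary of one class equals the lower boundary of the next, so there are no gaps between classes.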

Example 2-3

Table 2.9 gives the total home runs hit by all players of each of the 30 Major League Baseball teams during the 2002 season.

Construct a frequency distribution table.


Table 2.9 Home Runs Hit by Major League Baseball Teams During the 2002 Season

Team                 Home Runs    Team                 Home Runs
Anaheim              152          Milwaukee            139
Arizona              165          Minnesota            167
Atlanta              164          Montreal             162
Baltimore            165          New York Mets        160
Boston               177          New York Yankees     223
Chicago Cubs         200          Oakland              205
Chicago White Sox    217          Philadelphia         165
Cincinnati           169          Pittsburgh           142
Cleveland            192          St. Louis            175
Colorado             152          San Diego            136
Detroit              124          San Francisco        198
Florida              146          Seattle              152
Houston              167          Tampa Bay            133
Kansas City          140          Texas                230
Los Angeles          155          Toronto              187

Solution 2-3

Approximate width of each class = (230 – 124) / 5 = 21.2

Now we round this approximate width to a convenient number, say 22.

The lower limit of the first class can be taken as 124 or any number less than 124. Suppose we take 124 as the lower limit of the first class. Then our classes will be

124 – 145, 146 – 167, 168 – 189, 190 – 211, and 212 – 233

Table 2.10 Frequency Distribution for the Data of Table 2.9

Total Home Runs    Tally             f
124 – 145          |||| |             6
146 – 167          |||| |||| |||     13
168 – 189          ||||               4
190 – 211          ||||               4
212 – 233          |||                3
                                     ∑f = 30
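The tally in Table 2.10 can be reproduced with a few lines of Python. This sketch follows the choices made in Solution 2-3 (five classes, width 22, first lower limit 124); the variable names are assumptions for this example.

```python
# Home runs from Table 2.9, first column of teams followed by the second.
home_runs = [152, 165, 164, 165, 177, 200, 217, 169, 192, 152,
             124, 146, 167, 140, 155, 139, 167, 162, 160, 223,
             205, 165, 142, 175, 136, 198, 152, 133, 230, 187]

num_classes = 5
approx_width = (max(home_runs) - min(home_runs)) / num_classes  # (230 - 124) / 5
width = 22            # approximate width rounded to a convenient number
first_lower = 124     # lower limit of the first class

# Each class runs from its lower limit to one unit below the next lower limit.
hr_classes = [(first_lower + i * width, first_lower + (i + 1) * width - 1)
              for i in range(num_classes)]

# Count the teams whose total falls inside each class (limits are inclusive).
hr_freq = [sum(lo <= x <= hi for x in home_runs) for lo, hi in hr_classes]

for (lo, hi), f in zip(hr_classes, hr_freq):
    print(f"{lo} - {hi}: {f}")
print("Sum of f =", sum(hr_freq))
```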

Relative Frequency and Percentage Distributions

A relative frequency distribution lists the categories and the proportion with which each occurs.

Calculating Relative Frequency of a Category

Relative frequency of a category = Frequency of that category / Sum of all frequencies

For a class of grouped data:

Relative frequency of a class = Frequency of that class / Sum of all frequencies = f / ∑f

Percentage = (Relative frequency) × 100
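As a quick check of these two formulas, the sketch below applies them to the frequencies of Table 2.10. The variable names are assumptions for this example. Rounding each relative frequency to three decimals explains why a column of rounded values can sum to slightly less than 1.

```python
labels = ["124 - 145", "146 - 167", "168 - 189", "190 - 211", "212 - 233"]
frequencies = [6, 13, 4, 4, 3]          # f column of Table 2.10

total = sum(frequencies)                # sum of all frequencies
relative = [f / total for f in frequencies]
percentages = [100 * r for r in relative]

for label, freq, rel, pct in zip(labels, frequencies, relative, percentages):
    print(f"{label}   f = {freq:>2}   relative = {rel:.3f}   percentage = {pct:.1f}")

# Sum of the relative frequencies after rounding each to three decimals.
rounded_sum = sum(round(r, 3) for r in relative)
print("Sum of rounded relative frequencies:", round(rounded_sum, 3))
```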

Example 2-4

Calculate the relative frequencies and percentages for Table 2.10

Solution 2-4

Total Home Runs    Class Boundaries            Relative Frequency    Percentage
124 – 145          123.5 to less than 145.5    .200                  20.0
146 – 167          145.5 to less than 167.5    .433                  43.3
168 – 189          167.5 to less than 189.5    .133                  13.3
190 – 211          189.5 to less than 211.5    .133                  13.3
212 – 233          211.5 to less than 233.5    .100                  10.0
                                               Sum = .999            Sum = 99.9%

Table 2.11 Relative Frequency and Percentage Distributions for Table 2.10

Graphing Grouped Data

Histogram

• A way of presenting a grouped frequency distribution graphically
• A histogram is a bar graph in which classes are marked on the horizontal axis and the frequencies, relative frequencies, or percentages are marked on the vertical axis.
• The frequencies, relative frequencies, or percentages are represented by the heights of the bars.
• In a histogram, the bars are drawn adjacent to each other.
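A rough text rendering can show how bar size tracks frequency. The sketch below is an assumption-laden illustration, not a true histogram: it draws the bars horizontally (one line per class) instead of vertically, and the function name `text_histogram` is hypothetical. It uses the frequencies of Table 2.10.

```python
hr_labels = ["124 - 145", "146 - 167", "168 - 189", "190 - 211", "212 - 233"]
hr_freq = [6, 13, 4, 4, 3]              # f column of Table 2.10

def text_histogram(labels, freqs, symbol="#"):
    """Return one line per class; the bar length equals the class frequency."""
    return [f"{label} | {symbol * f} ({f})" for label, f in zip(labels, freqs)]

lines = text_histogram(hr_labels, hr_freq)
print("\n".join(lines))
```

Replacing the frequencies with relative frequencies or percentages changes only the scale of the bars, not their relative sizes.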

Figure 2.3 Frequency histogram for Table 2.10.

[Histogram: horizontal axis "Total home runs" with classes 124 – 145, 146 – 167, 168 – 189, 190 – 211, and 212 – 233 (shown a second time labeled by the lower limits 124, 146, 168, 190, 212); vertical axis "Frequency" marked from 0 to 15 in steps of 3.]

Figure 2.4 Relative frequency histogram for Table 2.10.

[Histogram: horizontal axis "Total home runs" with classes 124 – 145, 146 – 167, 168 – 189, 190 – 211, and 212 – 233; vertical axis "Relative Frequency" marked from 0 to .50 in steps of .10.]

Information from a histogram…

• Provides information regarding the range of the data set

• Shows the location of the highest concentration of measurements

• Reveals the presence or absence of symmetry.

Graphing Grouped Data cont.

Polygon

• A graph formed by joining the midpoints of the tops of successive bars in a histogram with straight lines is called a polygon.
• A special kind of line graph
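The points joined by a frequency polygon can be computed from the class limits and frequencies of Table 2.10. In this sketch the variable names are assumptions. Closing the polygon with a zero-frequency point one class width beyond each end is a common convention that the slide does not state.

```python
hr_classes = [(124, 145), (146, 167), (168, 189), (190, 211), (212, 233)]
hr_freq = [6, 13, 4, 4, 3]              # f column of Table 2.10
width = 22                              # class width used in Solution 2-3

# Midpoint of each class: the horizontal position of the top-centre of its bar.
midpoints = [(lo + hi) / 2 for lo, hi in hr_classes]

# Vertices of the polygon: (class midpoint, frequency).
polygon_points = list(zip(midpoints, hr_freq))

# Assumed convention: add a zero-frequency point one width beyond each end.
closed_polygon = ([(midpoints[0] - width, 0)]
                  + polygon_points
                  + [(midpoints[-1] + width, 0)])

for x, f in closed_polygon:
    print(f"({x}, {f})")
```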


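Figure 2.5 Frequency polygon for Table 2.10.

[Polygon: horizontal axis with classes 124 – 145, 146 – 167, 168 – 189, 190 – 211, and 212 – 233; vertical axis "Frequency" marked from 0 to 15 in steps of 3.]

Figure 2.6 Frequency Distribution curve.

[Smooth curve: horizontal axis x; vertical axis "Frequency".]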

Some reflections on similarities of Histograms and Stem & Leaf Displays

Advantages of the stem-and-leaf display over the histogram

• It preserves the information contained in the individual measurements.
• It can be constructed during the tallying process, so the intermediate step of preparing an ordered array is eliminated.

Example 2-5

The following data give the average travel time from home to work (in minutes) for 50 states. The data are based on a sample survey of 700,000 households conducted by the Census Bureau (USA TODAY, August 6, 2001).

22.4 19.7 21.6 15.4 21.1 18.2 27.0 21.9 22.1 25.4
23.7 21.7 23.2 19.6 24.9 19.8 17.6 16.0 21.4 25.5
26.7 17.7 16.1 23.8 20.1 23.4 22.5 22.3 21.9 17.1
23.5 23.7 24.4 21.9 22.5 21.2 28.7 15.6 24.3 29.2
19.9 22.7 26.7 26.1 31.2 23.6 24.2 22.7 22.6 20.8

Construct a frequency distribution table. Calculate the relative frequencies and percentages for all classes.

Solution 2-5

Approximate width of each class = (31.2 – 15.4) / 6 = 2.63

Class Boundaries      f     Relative Frequency    Percentage
15 to less than 18     7    .14                   14
18 to less than 21     7    .14                   14
21 to less than 24    23    .46                   46
24 to less than 27     9    .18                   18
27 to less than 30     3    .06                    6
30 to less than 33     1    .02                    2
                      Σf = 50    Sum = 1.00       Sum = 100%

Table 2.12 Frequency, Relative Frequency, and Percentage Distributions of Average Travel Time to Work
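For data with decimals, the "less than" classes of Table 2.12 map naturally onto a half-open comparison. The sketch below is an illustration under that assumption: each class includes its lower boundary and excludes its upper boundary. The variable names are chosen for this example.

```python
# Average travel times (minutes) from Example 2-5.
travel_times = [22.4, 19.7, 21.6, 15.4, 21.1, 18.2, 27.0, 21.9, 22.1, 25.4,
                23.7, 21.7, 23.2, 19.6, 24.9, 19.8, 17.6, 16.0, 21.4, 25.5,
                26.7, 17.7, 16.1, 23.8, 20.1, 23.4, 22.5, 22.3, 21.9, 17.1,
                23.5, 23.7, 24.4, 21.9, 22.5, 21.2, 28.7, 15.6, 24.3, 29.2,
                19.9, 22.7, 26.7, 26.1, 31.2, 23.6, 24.2, 22.7, 22.6, 20.8]

num_classes = 6
approx_width = (max(travel_times) - min(travel_times)) / num_classes  # about 2.63
width = 3                               # approximate width rounded up
lower_bounds = [15 + i * width for i in range(num_classes)]

# "lo to less than lo + width": lower boundary included, upper boundary excluded.
tt_freq = [sum(lo <= t < lo + width for t in travel_times) for lo in lower_bounds]
tt_rel = [f / len(travel_times) for f in tt_freq]
tt_pct = [100 * r for r in tt_rel]

for lo, f, r, p in zip(lower_bounds, tt_freq, tt_rel, tt_pct):
    print(f"{lo} to less than {lo + width}   f = {f:>2}   {r:.2f}   {p:.0f}%")
```

With this method a value such as 27.0 belongs to the class "27 to less than 30", not to "24 to less than 27".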

Work to do!

• Please read Chapter 2 of the course book and solve the exercises at the end of the chapter.