Statistical Analysis - Chapter 2 “Organizing and Analyzing Data” Fashion Institute of Technology Dr. Roderick Graham

Page 1:

Statistical Analysis - Chapter 2 “Organizing and Analyzing Data”

Fashion Institute of Technology

Dr. Roderick Graham

Page 2:

Showing Data Graphically

When we collect a sample, we initially want to get a picture of how the data “looks”.

We can easily show our “stakeholders” what the patterns in the data are. What do we mean by “stakeholders”?

Three of the ways to show data are Histograms, Frequency Polygons, and Circle Graphs.

Page 3:

Showing Data Graphically

Look at the listing of numbers on p.17. This is called “ungrouped” data.

Sometimes it is better to “group” data into categories…this makes it easier to represent data graphically (p.18).

Page 4:

Histograms

Let’s look at the move from “ungrouped data” to the construction of a histogram in your textbook…(pp. 17 – 18)

1. Start with a survey of numbers…or “ungrouped data”

2. Decide on the categories you want to use and “group” each number into the category that fits it

3. Now the data has been changed from a series of ages to GROUPS of ages

4. We can compute statistics for both grouped and ungrouped data

Page 5:

Histograms

Let’s figure out this histogram (taken from actual data I am using)…

1 = 18 – 24, 2 = 25 – 34, 3 = 35 – 44, 4 = 45 – 54, 5 = 55 – 64, 6 = 65+

How many people are between ages 45 and 54?
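The move from ungrouped ages to grouped counts can be sketched in a few lines of Python. This is only an illustration: the age codes come from the list above, but the ages themselves are made up (the actual data behind the histogram is not reproduced here), and the helper name `age_code` is hypothetical.

```python
from collections import Counter

# Age brackets as coded above: code -> (lowest age, highest age).
# The 65+ bracket has no upper bound, marked here with None.
BRACKETS = {
    1: (18, 24),
    2: (25, 34),
    3: (35, 44),
    4: (45, 54),
    5: (55, 64),
    6: (65, None),
}

def age_code(age):
    """Return the bracket code for one age, or None if the age is below 18."""
    for code, (low, high) in BRACKETS.items():
        if age >= low and (high is None or age <= high):
            return code
    return None

# Hypothetical ungrouped ages, invented for this example.
ages = [19, 22, 31, 47, 52, 38, 66, 45, 29, 58, 23, 49]

# Grouped data: how many people fall into each bracket.
frequencies = Counter(age_code(a) for a in ages)

for code in sorted(BRACKETS):
    print(code, frequencies.get(code, 0))
```

The count printed for code 4 is the height of the 45 – 54 bar, which is exactly what the question above asks you to read off the histogram.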

Page 6:

Frequency Polygon (Line Graph)

This is a line graph representing the shape of a histogram.

Usually, when you have “too many bars” (categories), you may want to use a line graph.

It can show trends more easily than a histogram.

Page 7:

Circle Graph

These graphs are used to show what percentage (proportion) of a sample is doing what.

Your textbook goes into some detail about how to create circle graphs with a protractor…lucky for us we have Excel!

Below is an example from the CDC showing the percentages of how people have become infected with HIV…

Page 8:

Key Points

It is up to you (the researcher) to decide what graph is most important for presenting your data. For me…

1. If I am showing a small number of categories, I use a histogram

2. If I am showing trends through time, or a large number of categories, I use a line graph

3. If I want to show percentages, I use a circle graph (this is always the best way to show percentages)

Page 9:

Our first “statistics”

Remember that statistics are values that we compute from the sample of data that we have collected. We will learn two basic and important types of statistics:

Measures of Central Tendency – What are the middle values for our data?

Measures of Dispersion or Spread – How diverse is our data…or how widely scattered is our data?

You can compute these statistics for both grouped and ungrouped data.

Page 10:

Measures of Central Tendency (ungrouped)

What if we had collected data about one measure, and we wanted to know what the middle value was for this measure?

Ex. What is the middle value, in age, for those who listen to Lady Gaga?

Ex. How many times do young Hispanic women report shopping at H&M?

Knowing this middle, or central, value is important for describing our data.

There are three measures of central tendency…

Page 11:

Measures of Central Tendency (ungrouped)

Mean (p.24): This is the mathematical average of a set of numbers.

Median (p.26): This is the middle value of a set of data that has been arranged from lowest to highest.

Mode (p.27): The value that occurs the most in a set of data.

We can use income as a good way of discussing these three measures. Imagine that we wanted to know the average income for FIT students. Imagine that we took a random sample of incomes for FIT students. …

Page 12:

Measures of Central Tendency (ungrouped)

The sample gives these values: 5000, 6000, 30000, 110000, 15000, 6000, 17000, 13000, 12000, 11000, 8000, 6000, 15000, 6000, 11500

The Mean

This is the average….

Sum of values = 271500

Total N = 15

Mean = 271500 / 15 = 18100
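The arithmetic for the mean can be checked with a short Python sketch. The variable names are just illustrative; the income values are the ones listed for the sample.

```python
# Income sample of 15 FIT students, as listed above.
incomes = [5000, 6000, 30000, 110000, 15000, 6000, 17000, 13000,
           12000, 11000, 8000, 6000, 15000, 6000, 11500]

total = sum(incomes)   # sum of values
n = len(incomes)       # total N
mean = total / n       # mathematical average

print(total, n, mean)  # 271500 15 18100.0
```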

Page 13:

Measures of Central Tendency (ungrouped)

The sample gives these values: 5000, 6000, 30000, 110000, 15000, 6000, 17000, 13000, 12000, 11000, 8000, 6000, 15000, 6000, 11500

The Median

This is the middle value once the data is arranged from lowest to highest: 5000, 6000, 6000, 6000, 6000, 8000, 11000, 11500, 12000, 13000, 15000, 15000, 17000, 30000, 110000

The median here is 11500 (the 8th of the 15 sorted values).

In cases where there are two middle values, we average the two.
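A minimal Python sketch of the median rule, covering both the odd case above and the "two middle values" case. The function name `median_by_hand` is hypothetical, and the 14-value sample is created here only to illustrate an even N; `statistics.median` from the standard library is used as a cross-check.

```python
import statistics

# Income sample of 15 FIT students, as listed above.
incomes = [5000, 6000, 30000, 110000, 15000, 6000, 17000, 13000,
           12000, 11000, 8000, 6000, 15000, 6000, 11500]

def median_by_hand(values):
    """Sort the values, then take the middle one (or average the two middle ones)."""
    ordered = sorted(values)
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]                          # odd N: one middle value
    return (ordered[middle - 1] + ordered[middle]) / 2  # even N: average the two

print(median_by_hand(incomes))   # 11500

# Illustration of an even N: drop the highest income, leaving 14 values.
even_sample = sorted(incomes)[:-1]
print(median_by_hand(even_sample))   # 11250.0, the average of 11000 and 11500
```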

Page 14:

Measures of Central Tendency (ungrouped)

The sample gives these values: 5000, 6000, 30000, 110000, 15000, 6000, 17000, 13000, 12000, 11000, 8000, 6000, 15000, 6000, 11500

The Mode

This is the most numerous value: 5000, 6000, 6000, 6000, 6000, 8000, 11000, 11500, 12000, 13000, 15000, 15000, 17000, 30000, 110000

The Mode here is 6000.

Sometimes there is no mode…or even two modes!
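Counting how often each value occurs makes the mode easy to find. The sketch below uses `collections.Counter` and `statistics.multimode` (Python 3.8 or later) from the standard library; the two-mode list at the end is made up for illustration.

```python
import statistics
from collections import Counter

# Income sample of 15 FIT students, as listed above.
incomes = [5000, 6000, 30000, 110000, 15000, 6000, 17000, 13000,
           12000, 11000, 8000, 6000, 15000, 6000, 11500]

counts = Counter(incomes)
print(counts.most_common(1))            # [(6000, 4)]: 6000 occurs four times

# multimode returns every value tied for the highest count.
print(statistics.multimode(incomes))    # [6000]

# A made-up list with two modes.
two_modes = [1, 1, 2, 2, 3]
print(statistics.multimode(two_modes))  # [1, 2]
```

If every value occurs exactly once, `multimode` returns all of them; that is the situation described above as having no mode.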

Page 15:

Measures of Central Tendency (ungrouped)

So given these values…

5000, 6000, 6000, 6000, 6000, 8000, 11000, 11500, 12000, 13000, 15000, 15000, 17000, 30000, 110000

…what is the best measure of central tendency for this random sample of FIT students?

Mean?...18100

Median?...11500

Mode?...6000

Page 16:

Measures of Dispersion or Spread (ungrouped)

Range (p.29): The highest value minus the lowest value…. From our last example, the range would be:

110000 – 5000 = 105000

Standard Deviation (pp. 29 – 35): This is roughly the average distance your values have from the mean score. Best shown through example…

Page 17:

Measures of Dispersion or Spread (ungrouped)

Standard Deviation

Let’s return to our FIT random sample…

5000, 6000, 6000, 6000, 6000, 8000, 11000, 11500, 12000, 13000, 15000, 15000, 17000, 30000, 110000

Follow these steps while we (I) calculate the standard deviation as a class on the board:

1. Calculate the mean…which is 18100

2. Find the distance that each value has from the mean

3. Square each distance

4. Add up these squared distances and divide by the sample size – 1 (at this point, this number is called the variance)

5. Then we take the square root of this number
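The five steps can be followed one line at a time in Python. This is a sketch of the sample standard deviation (dividing by the sample size – 1, as in step 4); the variable names are illustrative, and `statistics.stdev` is used only as a cross-check.

```python
import math
import statistics

# Income sample of 15 FIT students, as listed above.
incomes = [5000, 6000, 6000, 6000, 6000, 8000, 11000, 11500,
           12000, 13000, 15000, 15000, 17000, 30000, 110000]

mean = sum(incomes) / len(incomes)            # step 1: the mean (18100)
distances = [x - mean for x in incomes]       # step 2: distance from the mean
squared = [d ** 2 for d in distances]         # step 3: square each distance
variance = sum(squared) / (len(incomes) - 1)  # step 4: divide by sample size - 1
std_dev = math.sqrt(variance)                 # step 5: square root

print(round(std_dev))                         # 26219
```

The single income of 110000 contributes most of the squared distance, which is why the standard deviation ends up larger than the mean itself.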

Page 18:

Standard Deviation

X         Mean (x-bar)   X – x-bar   (X – x-bar)²
5000      18100          -13100      171,610,000
6000      18100          -12100      146,410,000
6000      18100          -12100      146,410,000
6000      18100          -12100      146,410,000
6000      18100          -12100      146,410,000
8000      18100          -10100      102,010,000
11000     18100          -7100       50,410,000
11500     18100          -6600       43,560,000
12000     18100          -6100       37,210,000
13000     18100          -5100       26,010,000
15000     18100          -3100       9,610,000
15000     18100          -3100       9,610,000
17000     18100          -1100       1,210,000
30000     18100          11900       141,610,000
110000    18100          91900       8,445,610,000

Page 19:

Standard Deviation

We sum (X – x-bar)², divide by the sample size – 1 to get the variance, and take the square root of the result. This is the standard deviation.

The sum of the squared distances is 9,624,100,000. Dividing by 14 gives a variance of about 687,435,714.

What is the square root of the variance? Appx. 26,219

Right now, this number means very little…but in the following chapters we will gain a better understanding of the standard deviation.

Page 20:

Measures of Central Tendency and Dispersion (Grouped Data)

Remember that grouped data is a collection of data that has been placed into categories…

Thus we need to calculate the mean and standard deviation differently, but the idea is the same.

Pp. 36 – 39 show the formulas for these measures.

Page 21:

Calculating the Mean for Grouped Data

Let’s say we took a random sample of FIT students and asked them their GPA. We decided to group GPA into categories. Here is the data:

GPA Category    Number of Students
3.5 – 4.0       15
3.0 – 3.49      25
2.0 – 2.9       50
Below 2.0       11

So…what is the mean? Look at pages 36 – 38 and I will wait for someone to tell me how to go about answering this question.

Page 22:

Calculating the Mean for Grouped Data

X = the midpoint of each category (the average of its lowest and highest values)

f = number of students

So can someone answer this question on the board (with help from classmates)?

GPA Category           X        Number of Students (f)
3.5 – 4.0              3.75     15
3.0 – 3.49             3.245    25
2.0 – 2.9              2.45     50
Below 2.0 (0 – 1.9)    0.95     11
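One way to answer the question is sketched below. It assumes the textbook's grouped mean is the usual one: multiply each category's X by its f, add the products, and divide by the total number of students. The formula itself is on pp. 36 – 38 and is not reproduced here, so treat this as an assumption; the variable names are illustrative.

```python
# (category label, midpoint X, number of students f) from the table above.
gpa_groups = [
    ("3.5 - 4.0", 3.75, 15),
    ("3.0 - 3.49", 3.245, 25),
    ("2.0 - 2.9", 2.45, 50),
    ("Below 2.0 (0 - 1.9)", 0.95, 11),
]

n = sum(f for _, _, f in gpa_groups)                 # total number of students
weighted_sum = sum(x * f for _, x, f in gpa_groups)  # sum of X times f
grouped_mean = weighted_sum / n

print(n, round(grouped_mean, 2))                     # 101 2.68
```

Because every student in a category is treated as if they had the category's midpoint GPA, the result is an estimate of the mean rather than the exact value.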

Page 23:

Calculating the Standard Deviation of Grouped Data

Now let’s calculate the standard deviation for this same set of data…

Who can do this one on the board?

GPA Category    Number of Students
3.5 – 4.0       15
3.0 – 3.49      25
2.0 – 2.9       50
Below 2.0       11
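A hedged sketch of the grouped standard deviation follows. It reuses the category midpoints from the previous page and assumes the same "divide by sample size – 1" convention used for ungrouped data; the textbook formula (pp. 36 – 39) is not reproduced in these slides and may be written differently, so check it before relying on this.

```python
import math

# (midpoint X, number of students f) for each GPA category.
gpa_groups = [(3.75, 15), (3.245, 25), (2.45, 50), (0.95, 11)]

n = sum(f for _, f in gpa_groups)
grouped_mean = sum(x * f for x, f in gpa_groups) / n

# Each squared distance from the mean is counted once per student in the category.
sum_squares = sum(f * (x - grouped_mean) ** 2 for x, f in gpa_groups)

variance = sum_squares / (n - 1)   # assumed convention: sample size - 1
std_dev = math.sqrt(variance)

print(round(std_dev, 2))
```

The only change from the ungrouped steps is the multiplication by f, which stands in for listing each midpoint once per student.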

Page 24:

Writing Research Reports (pp. 48 – 50)

Background Statement (5 pts)

I will give you data…use your imagination. Why was the study performed (why was the data collected)?

Design and Procedures of the Study (10 pts)

How did you conduct the study? How was the study internally valid/externally valid?

These two sections are not the most important…simply use your imagination to complete these two sections.

Page 25:

Writing Research Reports (pp. 48 – 50)

Results (55 pts.)

The most important section. For this first report, this is where you present your data graphically and show measures of dispersion and central tendency.

Analysis and Discussion (10 pts.)

What is interesting to you about the results?

Conclusions and Recommendations (20 pts.)

(You will not do this section for your report…this is where you present your results and analysis to the class. The class can ask you questions, so be on point!)

Page 26:

END