
Statistics: Interpreting Data and Making Predictions

4 April 2014

Statistics: Interpreting Data and Making Predictions 4 April 2014 1/26


Today we’ll see applications of standard deviation and how it can be used to make predictions.

Recall that the standard deviation of a set of data is a number which helps to determine how much variability is in the data. The more variability the data has, the larger the standard deviation.

The graph on the right has more variability, so it has the higher standard deviation.


Some Coin Flipping Data

Let’s look at the experiment of flipping a coin repeatedly. We will simulate this with Excel and the computer program Maple.

In the experiment we simulate a bunch of people each flipping 100 coins and determining the percentage of flips which came up heads.

We’ll first look at an Excel spreadsheet, Coin Flip Distribution.xlsx.
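For readers without Excel or Maple, the same experiment can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet or the Maple worksheet used in class; the function names, the seed, and the choice of 1,000 simulated people are assumptions made for the example.

```python
import random

def percent_heads(num_flips, rng):
    """Flip a fair coin num_flips times and return the percentage of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return 100.0 * heads / num_flips

def simulate(num_people, num_flips=100, seed=0):
    """Each simulated person flips num_flips coins and reports percent heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [percent_heads(num_flips, rng) for _ in range(num_people)]

results = simulate(num_people=1000)
print("smallest and largest percentage:", min(results), max(results))
print("average percentage:", sum(results) / len(results))
```

The average should come out close to 50, while individual people can land several percentage points away from it.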


Excel isn’t the best way to simulate a large amount of data. The following charts were created with the program Maple. We used this program to demonstrate RSA encryption in the beginning of the semester.


Simulation of flipping 100 coins 1,000 times


Simulation of flipping 100 coins 10,000 times


Simulation of flipping 100 coins 100,000 times


Simulation of flipping 100 coins 500,000 times


Simulation of flipping 100 coins 1,000,000 times


As the number of flips gets larger and larger, the graph looks more and more regular.

Note that the graphs are roughly symmetric, and that the middle of each graph is at 50, the “expected” percentage of heads.
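One way to check the "more regular" and "roughly symmetric" observations numerically is sketched below in Python. It is not the Maple code behind the charts; the helper names and the asymmetry measure (the largest gap between the relative frequencies of 50 - k and 50 + k heads) are assumptions chosen for this illustration. Drawing 100 random bits stands in for 100 fair coin flips.

```python
import random
from collections import Counter

def heads_in_100_flips(rng):
    # 100 random bits stand in for 100 fair flips; each 1 bit counts as a head.
    return bin(rng.getrandbits(100)).count("1")

def histogram(num_trials, seed=0):
    """Count how often each number of heads (0..100) occurs in num_trials runs."""
    rng = random.Random(seed)
    return Counter(heads_in_100_flips(rng) for _ in range(num_trials))

def asymmetry(counts, num_trials):
    """Largest gap between the relative frequencies of 50-k and 50+k heads."""
    return max(abs(counts[50 - k] - counts[50 + k]) / num_trials
               for k in range(1, 51))

for n in (1_000, 10_000, 100_000):
    counts = histogram(n)
    most_common_heads = counts.most_common(1)[0][0]
    print(n, most_common_heads, round(asymmetry(counts, n), 4))
```

With more trials the asymmetry figure tends to shrink and the most common outcome settles near 50 heads, which matches what the charts show.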

Q Does the shape of the graphs, especially the latter ones, look at all familiar?

A Yes

B No


The Bell Curve (or Normal Curve)


The importance of the bell curve is that as the number of trials gets larger and larger, histograms generally look more and more like a bell curve.

The particular shape of the bell curve reflects the mean and the standard deviation. The center of the curve represents the mean. How wide or narrow the curve is indicates the standard deviation. The larger the standard deviation, the wider the curve.


Bell Curves with Different Standard Deviations

Standard Deviation = 5

Standard Deviation = 2

For both of these graphs the mean is 50.
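The difference between the two curves can also be seen from the standard formula for the height of a normal curve, which is not given on the slides. The Python sketch below evaluates that formula for mean 50 with standard deviations 5 and 2; the function name `normal_pdf` and the sample point 55 are assumptions for the illustration.

```python
import math

def normal_pdf(x, mean, sd):
    """Height at x of the normal (bell) curve with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

for sd in (5, 2):
    peak = normal_pdf(50, mean=50, sd=sd)
    at_55 = normal_pdf(55, mean=50, sd=sd)
    print(f"sd={sd}: height at 50 = {peak:.4f}, height at 55 = {at_55:.4f}")
```

The curve with standard deviation 2 is taller at the mean but drops off faster, so at 55 it is already lower than the curve with standard deviation 5. That is the "wide versus thin" difference in the pictures.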


The blue and red graphs have mean 0, and the green graph has mean −2. The blue graph has the smallest standard deviation, followed by the green graph, and finally by the red graph, which has the largest standard deviation.


Standard Deviation and Number of Coins Flipped

Let’s go back to the coin flipping experiment. How do things change if each person changes the number of flips?

The following graphs represent a simulation of 50,000 people flipping a coin repeatedly. Each person determines the percentage of heads on their flips.

• The first graph represents each person flipping a coin 10 times and recording the percentage of heads.

• The second graph represents each person flipping 100 times.
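How much the spread should change with the number of flips can be estimated ahead of time. A standard fact, not stated on the slides, is that for a fair coin the standard deviation of the percentage of heads in n flips is 100 times the square root of 0.25/n. The Python sketch below compares that formula with a small simulation; the function names, the seed, and the use of 5,000 simulated people (rather than the 50,000 in the graphs) are assumptions to keep the example quick.

```python
import math
import random
import statistics

def theoretical_sd(num_flips):
    """SD, in percentage points, of the percent heads in num_flips fair flips."""
    return 100 * math.sqrt(0.5 * 0.5 / num_flips)

def simulated_sd(num_flips, num_people=5000, seed=0):
    """Estimate the same SD by simulating num_people people."""
    rng = random.Random(seed)
    # num_flips random bits stand in for num_flips fair flips; 1 bits are heads.
    percents = [100 * bin(rng.getrandbits(num_flips)).count("1") / num_flips
                for _ in range(num_people)]
    return statistics.pstdev(percents)

for n in (10, 100):
    print(n, round(theoretical_sd(n), 2), round(simulated_sd(n), 2))
```

The formula gives about 15.8 percentage points for 10 flips and 5 percentage points for 100 flips, so the 10-flip graph should be roughly three times as spread out.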


Clicker Question

Which graph represents the most variability in the data?

10 flips per person 100 flips per person

A The first graph

B The second graph


Answer

The first graph has more variability. While nearly all the data of the second graph is between 40 and 60, a lot of the data in the first graph is outside that range.
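The phrases "nearly all" and "a lot" can be made concrete by counting outcomes exactly instead of simulating. The Python sketch below is an illustration only; the function name `fraction_outside` and its default range of 40 to 60 are assumptions that mirror the range mentioned in the answer.

```python
from math import comb

def fraction_outside(num_flips, low=40, high=60):
    """Exact chance that the percent heads of a fair coin falls outside [low, high]."""
    total = 2 ** num_flips  # number of equally likely flip sequences
    outside = sum(comb(num_flips, k) for k in range(num_flips + 1)
                  if not low <= 100 * k / num_flips <= high)
    return outside / total

print("10 flips: ", fraction_outside(10))
print("100 flips:", fraction_outside(100))
```

For 10 flips about a third of people land outside the 40 to 60 range, while for 100 flips only a few percent do.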


In terms of the normal curve, standard deviation can be interpreted approximately with the following rules of thumb:

• 68% of all data is within 1 standard deviation of the mean.

• 95% of all data is within 2 standard deviations of the mean.

• 99.7% of all data is within 3 standard deviations of the mean.
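These three percentages are rounded areas under the normal curve. As a check, the Python sketch below computes them with the error function from the standard library, using the identity that the fraction within k standard deviations is erf(k/√2). The function name `fraction_within` is an assumption for this example.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * fraction_within(k):.1f}%")
```

The computed values are about 68.3%, 95.4%, and 99.7%, which the rules of thumb round to 68%, 95%, and 99.7%.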


The letter σ is an abbreviation for the standard deviation, and µ for the mean.

Finding the area under a curve was one of the problems that led to the development of Calculus in the 17th century.


How to Tell if a Coin is Unfair?

We’ll use the normal distribution to get some sense of how to tell if a coin is fair. Suppose you flip a coin and get at least 60% heads. Can you conclude the coin is unfair?


Clicker Question

Suppose you flip a coin 10 times and get 6 heads. Do you think this is good evidence to conclude the coin is unfair?

A Yes

B No


Answer

No, it really isn’t. If you flip a fair coin 10 times, you are fairly likely to get 6 or more heads: it happens a little over a third of the time.
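The chance of 6 or more heads in 10 flips of a fair coin can be computed exactly by counting flip sequences. The Python sketch below does this; the function name `prob_at_least` is an assumption for the illustration.

```python
from math import comb

def prob_at_least(k_min, num_flips):
    """Exact chance of at least k_min heads in num_flips flips of a fair coin."""
    favorable = sum(comb(num_flips, k) for k in range(k_min, num_flips + 1))
    return favorable / 2 ** num_flips

print(prob_at_least(6, 10))  # 386/1024, about 0.377
```

Of the 1,024 equally likely sequences of 10 flips, 386 contain at least 6 heads, so the result is far too common to count as evidence against the coin.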


Let’s suppose we flip a coin 100 times and get at least 60% heads. We can ask what the probability of that is. Let’s imagine we do this many, many times. As our simulations indicate, we can treat the distribution of results as a bell curve.

Based on the data from Coin Flip Distribution.xlsx, the mean for this curve is 50% and the standard deviation is 5%. Then 60% is two standard deviations to the right of the mean. Since about 95% of the data lies within two standard deviations of the mean, roughly half of the remaining 5% lies to the right of 60%; a more precise value is about 2.3% of the data.

So, there is only about a 2% chance that flipping a coin 100 times results in at least 60% heads, but that is not so small. Unless you had a reason to think the coin might be unfair, it would probably be hard to argue from this data that the coin is unfair, even though it is not too likely to get at least 60% heads.
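The tail area can be checked in two ways: from the normal curve, as in the argument above, and by counting flip sequences exactly. The Python sketch below does both. It is an illustration only; the function names are assumptions, and the mean of 50 and standard deviation of 5 are the values quoted from the spreadsheet.

```python
import math
from math import comb

def normal_tail(z):
    """Area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def binomial_tail(k_min, num_flips):
    """Exact chance of at least k_min heads in num_flips flips of a fair coin."""
    favorable = sum(comb(num_flips, k) for k in range(k_min, num_flips + 1))
    return favorable / 2 ** num_flips

z = (60 - 50) / 5  # 60% is two standard deviations above the mean
print("normal-curve estimate:", round(normal_tail(z), 4))
print("exact count:          ", round(binomial_tail(60, 100), 4))
```

The exact count comes out somewhat larger than the normal-curve estimate, because the bell curve only approximates the step-shaped histogram, but both are in the range of 2% to 3%.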


To think a little further about it, suppose a class of 100 students each flipped a coin 100 times. Even if everybody had a fair coin, we’d expect, on average, 2 of the 100 students to get at least 60% heads.

Therefore, while getting this many heads isn’t too likely for any one person, with enough people it will happen.
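The class example can be worked out in a couple of lines. The Python sketch below takes the rounded 2% chance per student from the discussion above as its input; the function names are assumptions, and it also assumes the students flip independently of one another.

```python
def expected_count(num_students, p):
    """Average number of students who reach the result, each with chance p."""
    return num_students * p

def prob_at_least_one(num_students, p):
    """Chance that at least one of num_students independent students reaches it."""
    return 1 - (1 - p) ** num_students

p = 0.02  # rounded chance of at least 60% heads for one student with a fair coin
print("expected number of students:", expected_count(100, p))
print("chance at least one does:   ", round(prob_at_least_one(100, p), 3))
```

With these inputs the chance that at least one student in the class sees 60% or more heads is well above one half, which is the sense in which "with enough people it will happen."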


Next Week and Homework #7

Next week’s topic is graph theory. There is more than one notion of a graph in mathematics. The one we will discuss isn’t the one that comes up in algebra courses. It is a more modern idea that has many applications to computer science and applied problems.

We’ll discuss aspects of graph theory, how it arose historically, some of its applications, and how it can be used to help understand and classify different types of surfaces.

Homework #7 is on the class website. It is due a week from today.
