Stats chapter 11

Transcript of Stats chapter 11

Page 1: Stats chapter 11

Chapter 11

Testing a Claim

Page 2: Stats chapter 11

11.1 SIGNIFICANCE TESTS: THE BASICS

Page 3: Stats chapter 11

The Pizza Problem

• Let us suppose that a certain pizza company claims that they deliver their pizza in an average of 20 minutes

• Now, we are told “average time” so it’s possible that they’ve delivered a pizza in 5 minutes, and it’s also possible that they delivered a pizza in 30 minutes

• If we order pizza 10 times, what average delivery time would convince you that their claim is wrong?

• Welcome to significance testing!

Page 4: Stats chapter 11

How significance testing works

1. Assume that a claim about an average or proportion is true

2. Compute the average or proportion of a sample

3. Compare the sample with the sampling distribution for the claim and sample size.

4. If the probability of obtaining a sample average or proportion at least that extreme is too low, we conclude that the claim is improbable and reject it.
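The four steps above can be sketched for the Pizza Problem. This is only an illustration under stated assumptions: the slides do not give a standard deviation or an observed sample average, so sigma = 5 minutes and an observed average of 24 minutes are hypothetical, and delivery times are assumed to be Normal so that the sampling distribution of x-bar is Normal even for n = 10.

```python
import math
from statistics import NormalDist


def upper_tail_probability(claimed_mean, sigma, n, sample_mean):
    """P(x-bar >= sample_mean) if the claimed mean is true and sigma is known."""
    std_error = sigma / math.sqrt(n)                 # spread of the sampling distribution
    sampling_dist = NormalDist(mu=claimed_mean, sigma=std_error)
    return 1 - sampling_dist.cdf(sample_mean)


# Step 1: assume the claim (mean = 20 minutes) is true.
# Step 2: the sample average (24 minutes, hypothetical) comes from 10 orders.
# Step 3: compare it with the sampling distribution for n = 10.
p = upper_tail_probability(claimed_mean=20, sigma=5, n=10, sample_mean=24)

# Step 4: a small probability makes the claim look improbable.
print(f"P(x-bar >= 24 | mean = 20) = {p:.4f}")
```

With these made-up numbers the probability is well under 1%, so a 24-minute average over 10 orders would be hard to reconcile with the 20-minute claim.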

Page 5: Stats chapter 11

How significance testing works

In all cases, we are comparing the sample with the sampling distribution for the claim and sample size

Page 6: Stats chapter 11

PHANTOMS (a framework)

As with Confidence Intervals, there is an acronym to help you remember the steps of a significance test

• State the Parameter

• State the Hypothesis pair

• Check the Assumptions

• State the Name of the test

• Find the value of the Test Statistic

• Obtain a P-value

• Make a decision

• Summarize

Page 7: Stats chapter 11

State Parameters

• Parameters work the same way they did in Confidence Intervals

• μ = the true average of the Pizza Company’s delivery times

• x-bar = the average delivery time for a sample of 10 deliveries from the Pizza Company

• p = the proportion of all deliveries from the Pizza Company that are delivered in less than 20 minutes

• p-hat = the proportion of a sample of 10 deliveries from the Pizza Company that are delivered in less than 20 minutes

Page 8: Stats chapter 11

Stating Hypotheses

Hypotheses come in pairs:

• “the null hypothesis” – H0 (“H naught”)

– This is the presumed claim

– For our purposes, our null hypothesis will always be in one of the forms: “μ = ___” or “p = ___”

Page 9: Stats chapter 11

Stating Hypotheses

Hypotheses come in pairs:

• “the alternative hypothesis” – Ha

– This is the suspicion of the researcher

– There are 3 alternative hypotheses that we can test

1. “μ ≠ ___” (two-sided alternative)

2. “p > ___” (one-sided alternative)

3. “μ < ___” (one-sided alternative)

Page 10: Stats chapter 11

Stating Hypotheses

Notice: Hypotheses are always about the parameter (μ or p, never x-bar or p-hat)

Written Examples

“H0: μ = 20 minutes

Ha: μ > 20 minutes”

“H0: p = 0.5

Ha: p < 0.5”

Page 11: Stats chapter 11

Checking the Assumptions

• Since we are comparing our samples to a sampling distribution (just like the last chapter), the assumptions are the same

• We will review them now:

Page 12: Stats chapter 11

Checking the Assumptions

Assumptions for means

• SRS

• Independence: N > 10n

• Normality (a, b, or c must be true)

(a) population is Normal, or

(b) n > 30; Central Limit Theorem, or

(c) sample is approximately Normal: (1) histogram is single-peaked and symmetric, (2) Normal probability plot is linear, (3) no outliers

Page 13: Stats chapter 11

Checking the Assumptions

Assumptions for proportions

• SRS

• Independence: N > 10n

• Normality: np > 10 and nq > 10

Page 14: Stats chapter 11

Name of Test

• “one-sided z test for means”

• “two-sided z test for means”

• “one-sided t test for means”

• “two-sided t test for means”

• “one-sided z test for proportions”

• “two-sided z test for proportions”

• More on these later

Page 15: Stats chapter 11

Test Statistics

• Test statistics are always of the form:

$$\text{test statistic} = \frac{\text{estimate} - \text{null hypothesis value}}{\text{std dev of sampling dist (std error)}}$$

• The standard deviation of the sampling distribution depends on the characteristic tested

Page 16: Stats chapter 11

Test Statistics

• Std Dev for mean (σ known): $\sigma / \sqrt{n}$

• Std Dev for mean (σ unknown): $s / \sqrt{n}$

• Std Dev for proportions: $\sqrt{pq / n}$

Notice that we use ‘p’ and not ‘p-hat’
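The three standard errors and the general test-statistic form can be written as small helper functions. This is a sketch, not part of the original slides: the function names are made up for illustration, and the sample values in the usage lines are borrowed from the worked examples in this chapter.

```python
import math


def se_mean_sigma_known(sigma, n):
    """Standard deviation of x-bar when the population sigma is known (z test)."""
    return sigma / math.sqrt(n)


def se_mean_sigma_unknown(s, n):
    """Standard error of x-bar when only the sample s is available (t test)."""
    return s / math.sqrt(n)


def se_proportion(p_null, n):
    """Standard deviation of p-hat, built from the hypothesized p (not p-hat)."""
    return math.sqrt(p_null * (1 - p_null) / n)


def standardized_statistic(estimate, null_value, std_error):
    """(estimate - null hypothesis value) / (std dev of the sampling distribution)."""
    return (estimate - null_value) / std_error


# Usage with numbers from this chapter's examples
t_stat = standardized_statistic(22, 20, se_mean_sigma_unknown(1.53, 24))
z_stat = standardized_statistic(129.93, 128, se_mean_sigma_known(15, 72))
print(f"t = {t_stat:.3f}, z = {z_stat:.3f}")
```

Keeping the standard error separate from the statistic mirrors the slide: the numerator is always the same, and only the denominator changes with the characteristic being tested.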

Page 17: Stats chapter 11

P-values

• The P-value is the probability, computed assuming H0 is true, of obtaining a test statistic at least as extreme as the one observed

• At its most basic, computing the P-Value is the same as computing area from a Normal curve or Student’s t-distribution

• Computation varies slightly when using 2-sided alternative vs. 1-sided alternative

Page 18: Stats chapter 11

P-values

Two-sided alternatives

• For these alternative hypotheses, we calculate a P-value based on area “from two tails”

Page 19: Stats chapter 11

P-values

Example:

• Let’s assume our sample of 24 has: x-bar = 22 and s = 1.53

• H0: μ = 20

Ha: μ ≠ 20

• “2-sided t-test for means”

Page 20: Stats chapter 11

P-values

Example (cont)

• Test Statistic:

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{22 - 20}{1.53 / \sqrt{24}} = 6.404$$

• P-value (t distribution with 23 degrees of freedom):

$$\text{P-val} = 2\,P(t_{23} > |\text{test stat}|) = 2\,P(t_{23} > 6.404) = 2(0.000000778) = 0.00000155$$
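The two-sided t example above can be checked numerically. The sketch below uses only the Python standard library, which has no t distribution, so the tail area is approximated by integrating the t density with Simpson's rule; the helper names and the integration settings (`width`, `steps`) are choices made for this illustration. A statistics package such as SciPy (`scipy.stats.t.sf`) would normally do this step.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, width=50.0, steps=10_000):
    """Approximate P(T > t) with Simpson's rule on [t, t + width] (steps must be even)."""
    h = width / steps
    total = t_pdf(t, df) + t_pdf(t + width, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2          # Simpson weights: 4 on odd nodes, 2 on even
        total += weight * t_pdf(t + i * h, df)
    return total * h / 3


x_bar, mu_0, s, n = 22, 20, 1.53, 24
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))

# Two-sided alternative: double the area in one tail
p_value = 2 * t_upper_tail(abs(t_stat), df=n - 1)
print(f"t = {t_stat:.3f}, two-sided P-value = {p_value:.3g}")
```

Doubling the one-tail area is what “from two tails” means: a sample average as far below 20 as 22 is above it would count as equally extreme.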

Page 27: Stats chapter 11

P-values

One-sided alternatives

• Calculate the area of the tail indicated by the alternative hypothesis for the P-value

Page 28: Stats chapter 11

P-values

One-sided alternatives

• If H0: p = 0.53 and Ha: p > 0.53, then P-val = P(z > test stat)

• If Ha: μ < 10, then P-val = P(t < test stat)

Page 29: Stats chapter 11

P-values

Example

• Let’s assume: H0: p = 0.22 and Ha: p < 0.22

• p-hat = 0.20 from n = 55

• Test statistic (the standard error uses p = 0.22 from H0, not p-hat):

$$z = \frac{\hat{p} - p}{\sqrt{pq / n}} = \frac{0.20 - 0.22}{\sqrt{(0.22)(0.78) / 55}} = -0.358$$

• P-value:

$$\text{P-val} = P(z < -0.358) = 0.360$$

• Would you say this is “likely” or “unlikely”?
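A minimal sketch of this one-sided z test for a proportion, using the standard library's Normal distribution. It builds the standard error from the hypothesized p = 0.22, as the Test Statistics slide specifies, and also checks the Normality condition for proportions; the variable names are illustrative.

```python
import math
from statistics import NormalDist

p_null, p_hat, n = 0.22, 0.20, 55
q_null = 1 - p_null

# Normality condition for proportions: np > 10 and nq > 10
normality_ok = n * p_null > 10 and n * q_null > 10

# Standard error uses the hypothesized p, not p-hat
std_error = math.sqrt(p_null * q_null / n)
z = (p_hat - p_null) / std_error

# Ha: p < 0.22 points to the left tail
p_value = NormalDist().cdf(z)

print(f"conditions met: {normality_ok}, z = {z:.3f}, P-value = {p_value:.3f}")
```

Because the alternative is “less than,” only the area to the left of z counts toward the P-value.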

Page 34: Stats chapter 11

Making a decision

• The P-value serves as the indicator

• If the test statistic is likely under the presumed sampling distribution (i.e. the p-value is large), then we have no reason to reject the null hypothesis

• If the test statistic is unlikely (i.e. the p-value is small), then we have reason to reject the null-hypothesis.

• “If the p-value is low, reject the Hoe”

Page 35: Stats chapter 11

Making a decision

Significance level (‘alpha’, α)

• This is the probability level at which we will reject H0

• Typical sig levels: α = 0.10, 0.05, 0.01

• If no significance level is given, we will generally reject at the α = 0.05 level.

• When p-val < α, then we “reject H0 at the α = __ level”

• When H0 is rejected, we say the data are “statistically significant at the α = __ level”

Page 36: Stats chapter 11

Making a decision

“Reject or Fail to Reject”

• When p-val > alpha, we “fail to reject H0”

– This means that we do not have evidence to show H0 is incorrect

– This does not mean H0 is “correct”

• When p-val < alpha, we “reject H0”

– This means that data like ours would be unlikely if H0 were true

– The new estimate for μ or p comes from our sample data (x-bar or p-hat)

Page 37: Stats chapter 11

WOW

• That was a lot of information!

• We will be going over this information again at a slower pace in the coming weeks.

• We’ll work out the mechanics later

• Understanding the basics and the “whys” right now will help you in the future!

Page 38: Stats chapter 11

Assignment 11.1

• Page 693 #3, 5, 7-8, 11-14

Page 39: Stats chapter 11

11.2 CARRYING OUT SIGNIFICANCE TESTS

Page 40: Stats chapter 11

z-test for a population mean

• This is the appropriate test when σ is known.

• Test Statistic:

$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

Page 41: Stats chapter 11

z-test for a population mean

• P-value:

Page 42: Stats chapter 11

Example 11.10

The mean systolic blood pressure for males 35 to 44 years of age is 128, and the standard deviation in this population is 15. The medical records of 72 male executives in this age group show a mean systolic blood pressure of 129.93. Is this evidence that the mean blood pressure for all the company’s younger male executives is different from the national average?

Page 43: Stats chapter 11

Example 11.10

• We are going to check to see if our sample comes from a population with the same μ and σ as the national population.

• Because of this, our parameter will come from the national averages.

• The null hypothesis will assume that younger male executives have the same mean blood pressure as the national average.

• The null hypothesis will always assume “things are equal”

Page 44: Stats chapter 11

Example 11.10

Parameter

• “Let μ = average blood pressure of all younger male executives in the company”

• “Let x-bar = average blood pressure in the sample of 72 younger male executives from the company”

Page 45: Stats chapter 11

Example 11.10

Hypotheses

• H0: μ = 128

• Ha: μ ≠ 128

• Notice that we will need the 2-sided P-value

Page 46: Stats chapter 11

Example 11.10

Assumptions

• Simple Random Sample: “We are not told that our sample is an SRS. We should check how this sample was chosen. We will proceed as though this sample was an SRS”

• Independence: “We are not told the size of the population of young male executives. We should check that the population is greater than 10(72) = 720.”

• Normality: “Because we have a large sample (n = 72 > 30), the Central Limit Theorem tells us that the sampling distribution is approximately Normal”

Page 47: Stats chapter 11

Example 11.10

Assumptions (cont.)

• The preceding example illustrates ‘what to do’ if you think that an assumption is not met.

• If you believe that an assumption is not met: (1) state the condition that must be qualified, (2) mention that it “needs to be checked,” and (3) state you will “proceed as though this assumption was met”

Always try to carry out the significance test.

Page 48: Stats chapter 11

Example 11.10

Name of the Test

• “We will conduct a z-test for a population mean”

Page 49: Stats chapter 11

Example 11.10

Test Statistic

$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{129.93 - 128}{15 / \sqrt{72}} = 1.092$$

Page 50: Stats chapter 11

Example 11.10

P-value

$$p\text{-value} = 2\,P(z > 1.092) = 0.2748$$

Page 53: Stats chapter 11

Example 11.10

Make a Decision

• We are not given an α in this example, so we should use the standard 0.05 significance level.

• The p-value is larger than our α, so we should fail to reject the null hypothesis

• Note: nothing needs to be written for this part of PHANTOMS

Page 54: Stats chapter 11

Example 11.10

Summarize

“If the true mean were 128, approximately 27% of the time a sample of size n = 72 would produce an average at least as extreme as 129.93. Since this p-value is larger than a presumed α = 0.05, we cannot reject our null hypothesis. We have no evidence to suggest that the mean systolic blood pressure of young executives is not 128.”
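The computational steps of Example 11.10 (test statistic, P-value, decision) can be reproduced in a few lines. This is a sketch using the standard library's Normal distribution; α = 0.05 is the presumed level from the slides, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu_0, sigma, n = 128, 15, 72     # national mean and sd, sample size
x_bar = 129.93                   # sample mean for the executives
alpha = 0.05                     # presumed, since no level is given

# z-test for a population mean (sigma known)
z = (x_bar - mu_0) / (sigma / math.sqrt(n))

# Two-sided alternative (Ha: mu != 128): double the upper-tail area
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"z = {z:.3f}, P-value = {p_value:.4f}, decision: {decision}")
```

The printed P-value can differ from the slide's 0.2748 in the fourth decimal place because the code does not round z to 1.092 before looking up the tail area.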

Page 55: Stats chapter 11

Example 11.10

Summarize (cont.)

Note that the summary contains 3 parts:

(1) Interpret the p-value: “If the true mean were 128, approximately 27% of the time a sample of size n = 72 would produce an average at least as extreme as 129.93.”

(2) Compare the p-value with α: “Since this p-value is larger than a presumed α = 0.05, we cannot reject our null hypothesis.”

(3) Interpret the conclusion in context: “We have no evidence to suggest that the mean systolic blood pressure of young executives is not 128.”

Page 61: Stats chapter 11

Tests and Confidence Intervals

• A two-sided significance test and a confidence interval lead to the same conclusion.

• A test will reject the null hypothesis in favor of a two-sided alternative at level α exactly when the hypothesized value falls outside the confidence interval with confidence level 1 − α

• The link between confidence intervals and two-sided tests is called “duality”

• Refer to example 11.12
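Duality can be illustrated with the blood pressure numbers from Example 11.10 (this is not the textbook's Example 11.12, which is not reproduced in these slides). The sketch below builds a 95% z confidence interval and checks that the two-sided test at α = 0.05 rejects exactly when the hypothesized mean falls outside it; the function names and the second hypothesized value of 125 are made up for illustration.

```python
import math
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, alpha):
    """Confidence interval for mu with confidence level 1 - alpha (sigma known)."""
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_star * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


def two_sided_z_test_rejects(x_bar, mu_0, sigma, n, alpha):
    """True when the two-sided z test rejects H0: mu = mu_0 at level alpha."""
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


x_bar, sigma, n, alpha = 129.93, 15, 72, 0.05
low, high = z_confidence_interval(x_bar, sigma, n, alpha)

for mu_0 in (128, 125):          # 125 is a hypothetical second null value
    outside = not (low <= mu_0 <= high)
    rejects = two_sided_z_test_rejects(x_bar, mu_0, sigma, n, alpha)
    print(f"mu_0 = {mu_0}: outside interval = {outside}, test rejects = {rejects}")
```

The two columns of output always agree, which is the point of duality: the interval is the set of null values the test would not reject.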

Page 62: Stats chapter 11

Assignment 11.2

• Page 709 #27, 29, 31-33

Page 63: Stats chapter 11

11.3 USE AND ABUSE OF TESTS

Page 64: Stats chapter 11

More on Significance Levels

• The significance level for a test is informed by the plausibility of H0.

– If H0 is particularly “strong” or has many years of support behind it, then the evidence must also be “strong” (small α)

– If we were trying to disprove the gravitational constant, the α would have to be very, very small!

Page 65: Stats chapter 11

More on Significance Levels

• What are the consequences of rejecting H0?

– There will always be a cost/benefit to rejecting H0

– If it is more expensive to reject than it is to fail to reject, then the evidence must be strong (small α)

– Consider the Toyota brake recall 2009

Page 66: Stats chapter 11

More on Significance Levels

• There is no “hard line” between reject and fail to reject

– There isn’t a real difference between α = 0.10 and α = 0.11

– There is no sharp border between “statistically significant” and “statistically insignificant”

– As P-value decreases, the strength of the evidence increases

– Although α = 0.05 is a ‘handy rule of thumb,’ it is not a universal rule

Page 67: Stats chapter 11

Cautions

• Don’t forget to examine the data

– The presence of outliers can affect whether the significance tests are plausible

• “Statistically Significant” is not the same thing as “Important”

– Lack of significance may signal an important conclusion

• A Test of Significance is not appropriate for all data sets

Page 68: Stats chapter 11

11.4 USING INFERENCE TO MAKE DECISIONS

Page 69: Stats chapter 11

“What if” we made the wrong decision?

There are two kinds of wrong decisions:

• Reject an H0 that was actually true

– This is a “TYPE I ERROR”

• Fail to reject an H0 that was false

– This is a “TYPE II ERROR”

• Some students find it helpful to think: “You can reject one hoe, but who can fail to reject two hoes”

– whatever floats your boat, eh?

Page 70: Stats chapter 11

“What if” we made the wrong decision?

TYPE I ERROR

• The null hypothesis was true!

• The probability that we made this error will be the same as α (since H0 was true)

– You will need to know how to recognize this error in context and

– You will need to know the probability of making a Type I error
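A short simulation can make the link between α and Type I errors concrete. The sketch below repeatedly draws samples from a population where H0 is actually true and counts how often a two-sided z test rejects anyway. The population values reuse Example 11.10 (μ = 128, σ = 15, n = 72); the Normal population, the number of trials, and the seed are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, fmean


def type_i_error_rate(mu_0, sigma, n, alpha, trials, seed=0):
    """Fraction of simulated samples (drawn with H0 true) where the z test rejects."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - alpha / 2)    # two-sided critical value
    std_error = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu_0, sigma) for _ in range(n)]   # H0 is true here
        z = (fmean(sample) - mu_0) / std_error
        if abs(z) > z_star:
            rejections += 1                          # a Type I error
    return rejections / trials


rate = type_i_error_rate(mu_0=128, sigma=15, n=72, alpha=0.05, trials=5000)
print(f"Simulated Type I error rate: {rate:.3f} (alpha = 0.05)")
```

The simulated rate should land close to 0.05, since every rejection in this setup is a wrong decision about a true H0.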

Page 71: Stats chapter 11

“What if” we made the wrong decision?

TYPE II ERROR

• In this case, the null hypothesis was incorrect, but we failed to reject it

• The probability of making a Type II error is a “what if” calculation

– “What if μ is actually 42? What’s the probability that I fail to reject?”

• The probability of making a Type II error is known as β

Page 72: Stats chapter 11

Type II Errors

[Figure: the alternative sampling distribution (remember: H0 is presumed false), with the area representing β marked]

Page 78: Stats chapter 11

Type II Error

• β is the area of the tail for the sampling distribution of the “what if” parameter value

• H0: μ = 5, x-bar = 5.8, σ = 0.7, n = 40

• Calculation of β when μa = 6

• Since 5.8 < 6, we need to calculate the left tail area

$$\beta = P\left(z < \frac{5.8 - 6}{0.7 / \sqrt{40}}\right) = P(z < -1.807) = 0.0354$$

Page 79: Stats chapter 11

Type II Error

• Mercifully, the AP exam will never ask you to compute β

• You will be asked to interpret β

• Remember that β is always dependent on an alternative value of the parameter

Page 80: Stats chapter 11

Power

• The probability that the significance test will reject H0 at an α level for an alternative value of the parameter is the power of the test against the alternative.

• Power = 1 − β

• Power is the probability of not making a TYPE II error

• Lots of power is a good thing!

Page 81: Stats chapter 11

How to increase power

(1) Increase the significance level (α)

(2) Consider an alternative parameter that is further away from the null hypothesis

(3) Increase the sample size

(4) Decrease σ

All the above have the effect of decreasing β.

Less β = More power
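The four ways to increase power can be checked with a small calculation. This is a sketch under stated assumptions, not a formula from the slides: it takes a one-sided z test of H0: μ = μ0 against Ha: μ > μ0 with σ known, rejects when x-bar exceeds the level-α cutoff, and computes β as the chance of landing below that cutoff when the true mean is μa. The baseline uses μ0 = 5, σ = 0.7 and n = 40 from the Type II error slide; μa = 5.2 and α = 0.05 are hypothetical.

```python
import math
from statistics import NormalDist


def power_one_sided_z(mu_0, mu_a, sigma, n, alpha):
    """Power of the z test of H0: mu = mu_0 vs Ha: mu > mu_0 when the true mean is mu_a."""
    std_error = sigma / math.sqrt(n)
    # Reject H0 when x-bar is above this cutoff
    cutoff = mu_0 + NormalDist().inv_cdf(1 - alpha) * std_error
    # beta = P(fail to reject | true mean is mu_a)
    beta = NormalDist(mu=mu_a, sigma=std_error).cdf(cutoff)
    return 1 - beta


base = power_one_sided_z(mu_0=5, mu_a=5.2, sigma=0.7, n=40, alpha=0.05)
bigger_alpha = power_one_sided_z(5, 5.2, 0.7, 40, alpha=0.10)    # (1) larger alpha
farther_alt = power_one_sided_z(5, 5.4, 0.7, 40, alpha=0.05)     # (2) alternative further away
bigger_n = power_one_sided_z(5, 5.2, 0.7, 80, alpha=0.05)        # (3) larger sample
smaller_sigma = power_one_sided_z(5, 5.2, 0.5, 40, alpha=0.05)   # (4) smaller sigma

for label, value in [("baseline", base), ("larger alpha", bigger_alpha),
                     ("farther alternative", farther_alt),
                     ("larger n", bigger_n), ("smaller sigma", smaller_sigma)]:
    print(f"{label:>20}: power = {value:.3f}")
```

Each change shrinks β relative to the baseline, so each line after the first should show higher power.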
