Download - October 15. In Chapter 9: 9.1 Null and Alternative Hypotheses 9.2 Test Statistic 9.3 P-Value 9.4 Significance Level 9.5 One-Sample z Test 9.6 Power and.

Apr 19, 2023

Chapter 9: Chapter 9: Basics of Hypothesis TestingBasics of Hypothesis Testing

In Chapter 9:

9.1 Null and Alternative Hypotheses

9.2 Test Statistic

9.3 P-Value

9.4 Significance Level

9.5 One-Sample z Test

9.6 Power and Sample Size

Review: Basics of Inference• Population all possible values• Sample a portion of the population • Statistical inference generalizing from a

sample to a population with calculated degree of certainty

• Two forms of statistical inference – Hypothesis testing– Estimation

• Parameter a numerical characteristic of a population, e.g., population mean µ, population proportion p

• Statistic a calculated value from data in the sample, e.g., sample mean ( ), sample proportion ( ) x p̂

Distinctions Between Parameters and Statistics (Chapter 8 review)

Parameters Statistics

Source Population Sample

Notation Greek (e.g., ) Roman (e.g., xbar)

Vary No Yes

Calculated No Yes

Ch 8 Review: How A Sample Mean Varies

Sampling distributions of means tend to be Normal with an expected value equal to population mean µ and standard deviation (SE) = σ/√n

nSE

SENx

x

x

where

,~

Introduction to Hypothesis Testing

• Hypothesis testing is also called significance testing

• The objective of hypothesis testing is to test claims about parameters

• The first step is to state the problem!

• Then, follow these four steps --

Hypothesis Testing Steps

A. Null and alternative hypotheses

B. Test statistic

C. P-value and interpretation

D. Significance level (optional)

Understand all four steps of testing, not just the calculations.

§9.1 Null and Alternative Hypotheses

• Convert the research question to null and alternative hypotheses

• The null hypothesis (H0) is a claim of “no difference in the population”

• The alternative hypothesis (Ha) is a claim of “difference”

• Seek evidence against H0 as a way of bolstering Ha

Illustrative Example: “Body Weight”

• Statement of the problem: In the 1970s, 20–29 year old men in the U.S. had a mean body weight μ of 170 pounds. Standard deviation σ = 40 pounds. We want to test whether mean body weight in the population now differs.

• H0: μ = 170 (“no difference”)

• The alternative hypothesis can be stated in one of two ways:Ha: μ > 170 (one-sided alternative) Ha: μ ≠ 170 (two-sided alternative)

§9.2 Test Statistic

nSE

SE

x

x

x

and

trueis hypothesis null theassumingmean population the where

z

0

0stat

This chapter introduces the z statistic for one-sample problems about means.

Illustrative Example: z statistic• For the illustrative example, μ0.= 170

• We take an SRS of n = 64 and know We know that σ = 40 so the standard error of the mean is

• If we found a sample mean of 173, then

• If we found a sample mean of 185, then

564

40

nSEx

60.05

170173 0

stat

xSE

xz

00.35

170185 0

stat

xSE

xz

Reasoning Behind the zstat

5,170~ NxThis is the sampling distribution of the mean when μ = 170, σ = 40, and n = 64.

§9.3 P-value• The P-value answer the question: What is the

probability of the observed a test statistic equal to or one more extreme than the current statistic assuming H0 is true?

• This corresponds to the AUC in the tail of the Z sampling distribution beyond the zstat. Use Table B or a software utility to find this AUC. (See next slide).

• Smaller and smaller P-values provider stronger and stronger evidence against H0

The one-sided P-value for zstat = 0.6 is 0.2743

The one-sided P-value for zstat = 3.0 is 0.0010

Two-Sided P-Value

• For one-sided Ha use the area in the tail beyond the z statistic

• For two-sided Ha consider deviations “up” and “down” from expected double the one-sided P-value

• For example, if the one-sided P-value = 0.0010, then the two-sided P-value = 2 × 0.0010 = 0.0020.

• If the one-sided P = 0.2743, then the two-sided P = 2 × 0.2743 = 0.5486.

Two-tailed P-value

§9.4 Significance Level

• Smaller and smaller P-values provide stronger and stronger evidence against H0

• Although it unwise to draw firm cutoffs, here are conventions that used as a starting point::P > 0.10 non-significant evidence against H0

0.05 < P 0.10 marginally significant against H0

0.01 < P 0.05 significant evidence against H0

P 0.01 highly significant evidence against H0

• ExamplesP =.27 is non-significant evidence against H0

P =.01 is highly significant evidence against H0

Decision Based on α Level

• Let α represent the probability of erroneously rejecting H0

• Set α threshold of acceptable error (e.g., let α = .10, let α = .05, or whatever level is acceptable)

• Reject H0 when P ≤ α

• Example: Set α = .10. Find P = 0.27. Since P > α, retain H0

• Example: Set α = .05. Find P = .01. Since P < α reject H0

§9.5 One-Sample z Test (Summary)

Test conditions: • Quantitative response• Good data• SRS (or facsimile)• σ known (not calculated)

• Population approximately Normal or sample large (central limit theorem)

Test procedure

A. H0: µ = µ0 vs. Ha: µ ≠ µ0 (two-sided) or Ha: µ < µ0 (left-sided) orHa: µ > µ0 (right-sided)

B. Test statistic

C. P-value: convert zstat to P value [Table B or software]

D. Significance level (optional)

nSE

SE

xx

x

where z 0

stat

Example: The “Lake Wobegon Problem”

• Typically, Weschler Adult Intelligence Scores are Normal with µ = 100 and = 15

• Take an SRS of n = 9• Measure scores {116, 128, 125, 119, 89,

99, 105, 116, 118}• Calculate x-bar = 112.8 • Does this sample mean provide statistically

reliable evidence that population mean μ is greater than 100?

Illustrative Example: Lake Wobegon

A. Hypotheses: H0: µ = 100 Ha: µ > 100 (one-sided)

B. Test statistic:

56.25

1008.112

59

15

0stat

x

x

SE

xz

nSE

C. P-value: Pr(Z ≥ 2.56) = 0.0052 (Table B)

D. Significance level (optional): This level of evidence is significant at α 0.01 (reject H0).

P =.0052 highly significant evidence against H0

Two-Sided Alternative (Lake Wobegon Illustration)

Two-sided alternativeHa: µ ≠100 doubles

P = 2 × 0.0052 = 0.0104

P = .0104 provides significant against H0

§9.6 Power and Sample Size

Truth

Decision H0 true H0 false

Retain H0 Correct retention

Type II error

Reject H0 Type I error Correct rejectionα ≡ probability of a Type I error

β ≡ Probability of a Type II error

Two types of decision errors:Type I error = erroneous rejection of a true

H0

Type II error = erroneous retention of a false H0

Power

• The traditional hypothesis testing paradigm considers only Type I errors

• However, we should also consider Type II errors

• β ≡ probability of a Type II error

• 1 – “Power” ≡ probability of avoiding a Type II error

Power of a z test

• where Φ(z) represent the cumulative probability of Standard Normal z (e.g., Φ(0) = 0.5)

• μ0 represent the population mean under the null hypothesis

• μa represents the population mean under an alternative hypothesis

nz a ||

1 01

2

Calculating Power: Example

A study of n = 16 retains H0: μ = 170 at α = 0.05 (two-sided). What was the power of the test conditions to identify a significant difference if the population mean was actually 190?

B] table[From 5160.004.0

40

16|190170|96.1

||1 0

1 2

n

z a

Reasoning Behind Power

• Consider two competing sampling distribution models– One model assumes H0 is true (top curve, next page) – An other model assumes Ha is true μa = 190 (bottom

curve, next page)

• When α = 0.05 (two-sided), we will reject H0 when the sample mean exceeds 189.6 (right tail, top curve)

• The probability of getting a value greater than 189.6 on the bottom curve is 0.5160, corresponding to the power of the test

Sample Size RequirementsThe required sample size for a two-sided z test to achieve a given power is

where 1 – β ≡ desired power of the test α ≡ desired significance level σ ≡ population standard deviation

Δ = μ0 – μa ≡ the difference worth detecting

2

2

112

2

zzn

Illustrative Example: Sample Size Requirement

• How large a sample is needed for a one-sample z test with 90% power and α = 0.05 (two-tailed) when σ = 40. The null hypothesis assumes μ = 170 and the alternative assumes μ = 190. We look for difference Δ = μ0 − μa = 170 – 190 = −20

• Round this up to 42 to ensure adequate power.

99.41

20

)96.128.1(402

22

2

211

2

2

zzn

Example showing the conditions for 90% power.