The Neyman-Pearson Paradigm - Furman

The Neyman-Pearson Paradigm

Mathematics 47: Lecture 26

Dan Sloughter

Furman University

April 21, 2006

Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 1 / 10

Hypothesis testing as a decision rule

- Suppose we wish to test H0 versus HA based on a random sample X1, X2, ..., Xn, and we must make a decision in favor of H0 or HA.

- Let T be a statistic and suppose we have selected subsets A ⊂ ℝ and B ⊂ ℝ, where A ∪ B = ℝ and A ∩ B = ∅, such that we will accept H0 if T ∈ A and reject H0 (that is, accept HA) if T ∈ B.

- We will use the following terminology:

  - T is the test statistic,
  - A is the acceptance region,
  - B is the rejection, or critical, region,
  - Rejecting H0 when H0 is true is a type I error,
  - Accepting H0 when H0 is false is a type II error,
  - α = P(T ∈ B | H0), the probability of a type I error, is the significance level of the test, also called the size of the type I error,
  - β = P(T ∈ A | HA), the probability of a type II error, is called the size of the type II error, and
  - 1 − β = P(T ∈ B | HA), the probability of rejecting H0 when H0 is false, is called the power of the test.
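Since α, β, and the power are all probabilities that T lands in a given region, they can be approximated by simulation. The following is a minimal sketch, not part of the lecture: it assumes a hypothetical test of H0: µ = 0 based on the mean of n = 25 observations from N(µ, 1), with critical region |T| ≥ 1.96/√25, and estimates the error rates at the single alternative µ = 0.5. The names `sim_T`, `alpha_hat`, `power_hat`, and `beta_hat`, the seed, and all numeric values are illustrative assumptions.

```r
set.seed(42)                       # arbitrary seed, only for reproducibility
n <- 25                            # assumed sample size
reps <- 10000                      # number of simulated samples
cutoff <- qnorm(0.975) / sqrt(n)   # critical region B: |T| >= cutoff

# Simulate the test statistic T (the sample mean) when the true mean is mu.
sim_T <- function(mu) replicate(reps, mean(rnorm(n, mean = mu, sd = 1)))

alpha_hat <- mean(abs(sim_T(0))   >= cutoff)  # estimates P(T in B | H0)
power_hat <- mean(abs(sim_T(0.5)) >= cutoff)  # estimates P(T in B | mu = 0.5)
beta_hat  <- 1 - power_hat                    # estimates P(T in A | mu = 0.5)

c(alpha = alpha_hat, beta = beta_hat, power = power_hat)
```

The estimate of α should land near the nominal 0.05 used to pick the cutoff, while the estimates of β and the power depend on which alternative mean is assumed.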


Controlling α and β

- Our object is to design a test so that α and β are both small.

- For a given test and sample size, decreasing α will increase β.

- In the Neyman-Pearson testing paradigm, one typically fixes α as the acceptable rate of committing type I errors (usually α = 0.05 or α = 0.01), and then controls β through the choice of the test statistic T and the sample size.
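The trade-off between α and β can be seen numerically. The sketch below assumes a two-sided z test for the mean of a normal sample with known standard deviation. The function name `beta_z` and the values µ0 = 0, µ1 = 0.5, σ = 1 are hypothetical choices for illustration, not taken from the lecture.

```r
# Type II error probability of a two-sided z test of H0: mu = mu0 at level
# alpha, evaluated at one alternative mean mu1 (sigma assumed known).
beta_z <- function(alpha, n, mu0 = 0, mu1 = 0.5, sigma = 1) {
  se <- sigma / sqrt(n)                      # standard deviation of the sample mean
  half_width <- qnorm(1 - alpha / 2) * se    # acceptance region is mu0 +/- half_width
  pnorm(mu0 + half_width, mean = mu1, sd = se) -
    pnorm(mu0 - half_width, mean = mu1, sd = se)
}

# Fixed n: shrinking alpha widens the acceptance region, so beta grows.
beta_by_alpha <- sapply(c(0.10, 0.05, 0.01), beta_z, n = 20)

# Fixed alpha: a larger sample shrinks beta.
beta_by_n <- sapply(c(10, 20, 40, 80), function(n) beta_z(alpha = 0.05, n = n))

beta_by_alpha
beta_by_n
```

With the sample size held fixed, the only way to lower β in this sketch is to accept a larger α. With α held fixed, β is reduced by increasing n.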


Example

- A machine is designed to fill boxes with 16 ounces of cereal.

- If X is the weight of a filled box, suppose X is N(µ, 0.16), that is, normal with variance 0.16 and standard deviation 0.4.

- To test the hypotheses

      H0 : µ = 16

      HA : µ ≠ 16,

  we decide to sample 10 boxes and reject H0 if |X̄ − 16| ≥ 0.25.

- That is, we take X̄ as our test statistic, setting an acceptance region of

      A = (15.75, 16.25)

  and a critical region of

      B = (−∞, 15.75] ∪ [16.25, ∞).
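As a sketch of how this decision rule would be applied, the R code below draws one simulated sample of 10 box weights and checks which region X̄ falls in. The true mean `mu_true` and the seed are assumptions made only to generate data. In practice `x` would hold the measured weights.

```r
set.seed(1)                                # arbitrary seed, only for reproducibility
mu_true <- 16                              # assumed true mean for this simulated run
x <- rnorm(10, mean = mu_true, sd = 0.4)   # simulated weights of 10 sampled boxes

xbar <- mean(x)                            # observed value of the test statistic
reject <- abs(xbar - 16) >= 0.25           # TRUE when xbar falls in the critical region B

xbar
reject
```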


Example (cont’d)

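- Then the significance level of the test is

\[
\begin{aligned}
\alpha &= P(\bar{X} \in B \mid H_0) \\
&= P(|\bar{X} - 16| \ge 0.25 \mid \mu = 16) \\
&= 1 - P(-0.25 < \bar{X} - 16 < 0.25 \mid \mu = 16) \\
&= 1 - P\left( \frac{-0.25}{0.4/\sqrt{10}} < \frac{\bar{X} - 16}{0.4/\sqrt{10}} < \frac{0.25}{0.4/\sqrt{10}} \,\middle|\, \mu = 16 \right) \\
&= 1 - (\Phi(1.98) - \Phi(-1.98)) \\
&= 1 - (0.9762 - 0.0238) \\
&= 0.0476.
\end{aligned}
\]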
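The same calculation can be checked in R without rounding the standardized cutoff. This is a minimal sketch, and the variable names are illustrative. Because the derivation above rounds the cutoff to 1.98 before evaluating Φ, the unrounded result may differ slightly in the third decimal place.

```r
se <- 0.4 / sqrt(10)                  # standard deviation of the sample mean
z  <- 0.25 / se                       # standardized cutoff, about 1.976 before rounding
alpha <- 1 - (pnorm(z) - pnorm(-z))   # P(|Xbar - 16| >= 0.25 | mu = 16)
alpha
```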


Example (cont’d)

- Note: we cannot compute

      β = P(X̄ ∈ A | HA)

  because HA is composite, and so does not uniquely specify a value of µ.

- However, we can evaluate the operating characteristic

      β(µ) = P(X̄ ∈ A | µ),

  the probability of accepting H0 when µ is the true mean.

- And we can evaluate the power function,

      π(µ) = 1 − β(µ) = P(X̄ ∈ B | µ),

  the probability of rejecting H0 when µ is the true mean.
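Both functions are straightforward to write down in R for this example. The sketch below uses the hypothetical names `oc` and `power_fn`, and the grid of means passed to them is an arbitrary illustration.

```r
se <- 0.4 / sqrt(10)   # standard deviation of the sample mean for n = 10

# Operating characteristic: probability that Xbar lands in A = (15.75, 16.25).
oc <- function(mu) {
  pnorm(16.25, mean = mu, sd = se) - pnorm(15.75, mean = mu, sd = se)
}

# Power function: probability that Xbar lands in the critical region B.
power_fn <- function(mu) 1 - oc(mu)

mu_grid <- c(15.5, 15.75, 16, 16.25, 16.5)
data.frame(mu = mu_grid, beta = oc(mu_grid), power = power_fn(mu_grid))
```

Because `pnorm` is vectorized over `mean`, both functions accept a whole vector of candidate means at once.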


Example (cont’d)

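- Note: π(16) = α, the significance level of the test.

- For example,

\[
\begin{aligned}
\beta(16.5) &= P(\bar{X} \in A \mid \mu = 16.5) \\
&= P(15.75 < \bar{X} < 16.25 \mid \mu = 16.5) \\
&= P\left( \frac{15.75 - 16.5}{0.4/\sqrt{10}} < \frac{\bar{X} - 16.5}{0.4/\sqrt{10}} < \frac{16.25 - 16.5}{0.4/\sqrt{10}} \,\middle|\, \mu = 16.5 \right) \\
&= \Phi(-1.98) - \Phi(-5.93) \\
&= 0.0238,
\end{aligned}
\]

  and π(16.5) = 1 − β(16.5) = 0.9762.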
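The standardization step in this derivation can be mirrored directly in R. This is a sketch with illustrative variable names. It keeps the standardized endpoints unrounded, so the result may differ slightly from the table-based 0.0238 in the third or fourth decimal place.

```r
se   <- 0.4 / sqrt(10)           # standard deviation of the sample mean
z_lo <- (15.75 - 16.5) / se      # lower endpoint of A, standardized at mu = 16.5
z_hi <- (16.25 - 16.5) / se      # upper endpoint of A, standardized at mu = 16.5

beta_165  <- pnorm(z_hi) - pnorm(z_lo)   # P(Xbar in A | mu = 16.5)
power_165 <- 1 - beta_165                # P(Xbar in B | mu = 16.5)

c(z_lo = z_lo, z_hi = z_hi, beta = beta_165, power = power_165)
```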


Example (cont’d)

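- The following graph of the power function π(µ) was created using the R commands:

      # Grid of candidate true means
      m <- seq(15, 17, .01)
      # beta(mu): probability that the sample mean falls in A = (15.75, 16.25)
      b <- pnorm(16.25, mean = m, sd = .4/sqrt(10)) -
        pnorm(15.75, mean = m, sd = .4/sqrt(10))
      # Power is the complement of beta
      p <- 1 - b
      plot(m, p, type = "l", xlab = "Mean", ylab = "Power")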


Example (cont’d)

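- Graph of the power function:

  [Figure: power function π(µ) plotted against the mean; horizontal axis "Mean" with ticks from 15.0 to 17.0, vertical axis "Power" with ticks from 0.2 to 1.0.]

- What should the graph of an “ideal” power function look like?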
