The Neyman-Pearson Paradigm - Furman
Transcript of The Neyman-Pearson Paradigm - Furman
The Neyman-Pearson ParadigmMathematics 47: Lecture 26
Dan Sloughter
Furman University
April 21, 2006
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 1 / 10
Hypothesis testing as a decision rule
I Suppose we wish to test H0 versus HA based on a random sample X1,X2, . . . , Xn, and we must make a decision in favor of H0 or HA.
I Let T be a statistic and suppose we have selected subsets A ⊂ R andB ⊂ R, where A∪B = R and A∩B = ∅, such that we will accept H0
if T ∈ A and reject H0 (that is, accept HA) if T ∈ B.I We will use the following terminology:
I T is the test statistic,I A is the acceptance region,I B is the rejection, or critical, region,I Rejecting H0 when H0 is true is a type I error,I Accepting H0 when H0 is false is a type II error,I α = P(T ∈ B | H0), the probability of a type I error, is the significance
level of the test, also called the size of the type I error,I β = P(T ∈ A | HA), the probability of a type II error, is called the size
of the type II error, andI 1− β = P(T ∈ B | HA), the probability of rejecting H0 when H0 is
false, is called the power of the test.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 2 / 10
Hypothesis testing as a decision rule
I Suppose we wish to test H0 versus HA based on a random sample X1,X2, . . . , Xn, and we must make a decision in favor of H0 or HA.
I Let T be a statistic and suppose we have selected subsets A ⊂ R andB ⊂ R, where A∪B = R and A∩B = ∅, such that we will accept H0
if T ∈ A and reject H0 (that is, accept HA) if T ∈ B.
I We will use the following terminology:
I T is the test statistic,I A is the acceptance region,I B is the rejection, or critical, region,I Rejecting H0 when H0 is true is a type I error,I Accepting H0 when H0 is false is a type II error,I α = P(T ∈ B | H0), the probability of a type I error, is the significance
level of the test, also called the size of the type I error,I β = P(T ∈ A | HA), the probability of a type II error, is called the size
of the type II error, andI 1− β = P(T ∈ B | HA), the probability of rejecting H0 when H0 is
false, is called the power of the test.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 2 / 10
Hypothesis testing as a decision rule
I Suppose we wish to test H0 versus HA based on a random sample X1,X2, . . . , Xn, and we must make a decision in favor of H0 or HA.
I Let T be a statistic and suppose we have selected subsets A ⊂ R andB ⊂ R, where A∪B = R and A∩B = ∅, such that we will accept H0
if T ∈ A and reject H0 (that is, accept HA) if T ∈ B.I We will use the following terminology:
I T is the test statistic,I A is the acceptance region,I B is the rejection, or critical, region,I Rejecting H0 when H0 is true is a type I error,I Accepting H0 when H0 is false is a type II error,I α = P(T ∈ B | H0), the probability of a type I error, is the significance
level of the test, also called the size of the type I error,I β = P(T ∈ A | HA), the probability of a type II error, is called the size
of the type II error, andI 1− β = P(T ∈ B | HA), the probability of rejecting H0 when H0 is
false, is called the power of the test.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 2 / 10
Hypothesis testing as a decision rule
I Suppose we wish to test H0 versus HA based on a random sample X1,X2, . . . , Xn, and we must make a decision in favor of H0 or HA.
I Let T be a statistic and suppose we have selected subsets A ⊂ R andB ⊂ R, where A∪B = R and A∩B = ∅, such that we will accept H0
if T ∈ A and reject H0 (that is, accept HA) if T ∈ B.I We will use the following terminology:
I T is the test statistic,
I A is the acceptance region,I B is the rejection, or critical, region,I Rejecting H0 when H0 is true is a type I error,I Accepting H0 when H0 is false is a type II error,I α = P(T ∈ B | H0), the probability of a type I error, is the significance
level of the test, also called the size of the type I error,I β = P(T ∈ A | HA), the probability of a type II error, is called the size
of the type II error, andI 1− β = P(T ∈ B | HA), the probability of rejecting H0 when H0 is
false, is called the power of the test.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 2 / 10
Hypothesis testing as a decision rule
I Suppose we wish to test H0 versus HA based on a random sample X1,X2, . . . , Xn, and we must make a decision in favor of H0 or HA.
I Let T be a statistic and suppose we have selected subsets A ⊂ R andB ⊂ R, where A∪B = R and A∩B = ∅, such that we will accept H0
if T ∈ A and reject H0 (that is, accept HA) if T ∈ B.I We will use the following terminology:
I T is the test statistic,I A is the acceptance region,
I B is the rejection, or critical, region,I Rejecting H0 when H0 is true is a type I error,I Accepting H0 when H0 is false is a type II error,I α = P(T ∈ B | H0), the probability of a type I error, is the significance
level of the test, also called the size of the type I error,I β = P(T ∈ A | HA), the probability of a type II error, is called the size
of the type II error, andI 1− β = P(T ∈ B | HA), the probability of rejecting H0 when H0 is
false, is called the power of the test.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 2 / 10
Hypothesis testing as a decision rule
I Suppose we wish to test H0 versus HA based on a random sample X1,X2, . . . , Xn, and we must make a decision in favor of H0 or HA.
I Let T be a statistic and suppose we have selected subsets A ⊂ R andB ⊂ R, where A∪B = R and A∩B = ∅, such that we will accept H0
if T ∈ A and reject H0 (that is, accept HA) if T ∈ B.I We will use the following terminology:
I T is the test statistic,I A is the acceptance region,I B is the rejection, or critical, region,
I Rejecting H0 when H0 is true is a type I error,I Accepting H0 when H0 is false is a type II error,I α = P(T ∈ B | H0), the probability of a type I error, is the significance
level of the test, also called the size of the type I error,I β = P(T ∈ A | HA), the probability of a type II error, is called the size
of the type II error, andI 1− β = P(T ∈ B | HA), the probability of rejecting H0 when H0 is
false, is called the power of the test.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 2 / 10
Hypothesis testing as a decision rule
I Suppose we wish to test H0 versus HA based on a random sample X1,X2, . . . , Xn, and we must make a decision in favor of H0 or HA.
I Let T be a statistic and suppose we have selected subsets A ⊂ R andB ⊂ R, where A∪B = R and A∩B = ∅, such that we will accept H0
if T ∈ A and reject H0 (that is, accept HA) if T ∈ B.I We will use the following terminology:
I T is the test statistic,I A is the acceptance region,I B is the rejection, or critical, region,I Rejecting H0 when H0 is true is a type I error,
I Accepting H0 when H0 is false is a type II error,I α = P(T ∈ B | H0), the probability of a type I error, is the significance
level of the test, also called the size of the type I error,I β = P(T ∈ A | HA), the probability of a type II error, is called the size
of the type II error, andI 1− β = P(T ∈ B | HA), the probability of rejecting H0 when H0 is
false, is called the power of the test.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 2 / 10
Hypothesis testing as a decision rule
I Suppose we wish to test H0 versus HA based on a random sample X1,X2, . . . , Xn, and we must make a decision in favor of H0 or HA.
I Let T be a statistic and suppose we have selected subsets A ⊂ R andB ⊂ R, where A∪B = R and A∩B = ∅, such that we will accept H0
if T ∈ A and reject H0 (that is, accept HA) if T ∈ B.I We will use the following terminology:
I T is the test statistic,I A is the acceptance region,I B is the rejection, or critical, region,I Rejecting H0 when H0 is true is a type I error,I Accepting H0 when H0 is false is a type II error,
I α = P(T ∈ B | H0), the probability of a type I error, is the significancelevel of the test, also called the size of the type I error,
I β = P(T ∈ A | HA), the probability of a type II error, is called the sizeof the type II error, and
I 1− β = P(T ∈ B | HA), the probability of rejecting H0 when H0 isfalse, is called the power of the test.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 2 / 10
Hypothesis testing as a decision rule
I Suppose we wish to test H0 versus HA based on a random sample X1,X2, . . . , Xn, and we must make a decision in favor of H0 or HA.
I Let T be a statistic and suppose we have selected subsets A ⊂ R andB ⊂ R, where A∪B = R and A∩B = ∅, such that we will accept H0
if T ∈ A and reject H0 (that is, accept HA) if T ∈ B.I We will use the following terminology:
I T is the test statistic,I A is the acceptance region,I B is the rejection, or critical, region,I Rejecting H0 when H0 is true is a type I error,I Accepting H0 when H0 is false is a type II error,I α = P(T ∈ B | H0), the probability of a type I error, is the significance
level of the test, also called the size of the type I error,
I β = P(T ∈ A | HA), the probability of a type II error, is called the sizeof the type II error, and
I 1− β = P(T ∈ B | HA), the probability of rejecting H0 when H0 isfalse, is called the power of the test.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 2 / 10
Hypothesis testing as a decision rule
I Suppose we wish to test H0 versus HA based on a random sample X1,X2, . . . , Xn, and we must make a decision in favor of H0 or HA.
I Let T be a statistic and suppose we have selected subsets A ⊂ R andB ⊂ R, where A∪B = R and A∩B = ∅, such that we will accept H0
if T ∈ A and reject H0 (that is, accept HA) if T ∈ B.I We will use the following terminology:
I T is the test statistic,I A is the acceptance region,I B is the rejection, or critical, region,I Rejecting H0 when H0 is true is a type I error,I Accepting H0 when H0 is false is a type II error,I α = P(T ∈ B | H0), the probability of a type I error, is the significance
level of the test, also called the size of the type I error,I β = P(T ∈ A | HA), the probability of a type II error, is called the size
of the type II error, and
I 1− β = P(T ∈ B | HA), the probability of rejecting H0 when H0 isfalse, is called the power of the test.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 2 / 10
Hypothesis testing as a decision rule
I Suppose we wish to test H0 versus HA based on a random sample X1,X2, . . . , Xn, and we must make a decision in favor of H0 or HA.
I Let T be a statistic and suppose we have selected subsets A ⊂ R andB ⊂ R, where A∪B = R and A∩B = ∅, such that we will accept H0
if T ∈ A and reject H0 (that is, accept HA) if T ∈ B.I We will use the following terminology:
I T is the test statistic,I A is the acceptance region,I B is the rejection, or critical, region,I Rejecting H0 when H0 is true is a type I error,I Accepting H0 when H0 is false is a type II error,I α = P(T ∈ B | H0), the probability of a type I error, is the significance
level of the test, also called the size of the type I error,I β = P(T ∈ A | HA), the probability of a type II error, is called the size
of the type II error, andI 1− β = P(T ∈ B | HA), the probability of rejecting H0 when H0 is
false, is called the power of the test.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 2 / 10
Controlling α and β
I Our object is to design a test so that α and β are both small.
I For a given test and sample size, decreasing α will increase β.
I In the Neyman-Pearson testing paradigm, one typically fixes α as theacceptable rate of committing type I errors (usually α = 0.05 orα = 0.01), and then controls β through the choice of the test statisticT and the sample size.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 3 / 10
Controlling α and β
I Our object is to design a test so that α and β are both small.
I For a given test and sample size, decreasing α will increase β.
I In the Neyman-Pearson testing paradigm, one typically fixes α as theacceptable rate of committing type I errors (usually α = 0.05 orα = 0.01), and then controls β through the choice of the test statisticT and the sample size.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 3 / 10
Controlling α and β
I Our object is to design a test so that α and β are both small.
I For a given test and sample size, decreasing α will increase β.
I In the Neyman-Pearson testing paradigm, one typically fixes α as theacceptable rate of committing type I errors (usually α = 0.05 orα = 0.01), and then controls β through the choice of the test statisticT and the sample size.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 3 / 10
Example
I A machine is designed to fill boxes with 16 ounces of cereal.
I If X is the weight of a filled box, suppose X is N(µ, 0.16).
I To test the hypotheses
H0 : µ = 16
HA : µ 6= 16,
we decide to sample 10 boxes and reject H0 if |X̄ − 16| ≥ 0.25.
I That is, we take X̄ as our test statistic, setting an acceptance regionof
A = (15.75, 16.25)
and a critical region of
B = (−∞, 15.75] ∪ [16.25,∞).
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 4 / 10
Example
I A machine is designed to fill boxes with 16 ounces of cereal.
I If X is the weight of a filled box, suppose X is N(µ, 0.16).
I To test the hypotheses
H0 : µ = 16
HA : µ 6= 16,
we decide to sample 10 boxes and reject H0 if |X̄ − 16| ≥ 0.25.
I That is, we take X̄ as our test statistic, setting an acceptance regionof
A = (15.75, 16.25)
and a critical region of
B = (−∞, 15.75] ∪ [16.25,∞).
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 4 / 10
Example
I A machine is designed to fill boxes with 16 ounces of cereal.
I If X is the weight of a filled box, suppose X is N(µ, 0.16).
I To test the hypotheses
H0 : µ = 16
HA : µ 6= 16,
we decide to sample 10 boxes and reject H0 if |X̄ − 16| ≥ 0.25.
I That is, we take X̄ as our test statistic, setting an acceptance regionof
A = (15.75, 16.25)
and a critical region of
B = (−∞, 15.75] ∪ [16.25,∞).
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 4 / 10
Example
I A machine is designed to fill boxes with 16 ounces of cereal.
I If X is the weight of a filled box, suppose X is N(µ, 0.16).
I To test the hypotheses
H0 : µ = 16
HA : µ 6= 16,
we decide to sample 10 boxes and reject H0 if |X̄ − 16| ≥ 0.25.
I That is, we take X̄ as our test statistic, setting an acceptance regionof
A = (15.75, 16.25)
and a critical region of
B = (−∞, 15.75] ∪ [16.25,∞).
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 4 / 10
Example
I A machine is designed to fill boxes with 16 ounces of cereal.
I If X is the weight of a filled box, suppose X is N(µ, 0.16).
I To test the hypotheses
H0 : µ = 16
HA : µ 6= 16,
we decide to sample 10 boxes and reject H0 if |X̄ − 16| ≥ 0.25.
I That is, we take X̄ as our test statistic, setting an acceptance regionof
A = (15.75, 16.25)
and a critical region of
B = (−∞, 15.75] ∪ [16.25,∞).
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 4 / 10
Example (cont’d)
I Then the significance level of the test is
α = P(X̄ ∈ B | H0)
= P(|X̄ − 16| ≥ 0.25 | µ = 16)
= 1− P(−0.25 < X̄ − 16 < 0.25 | µ = 16)
= 1− P
(−0.25
0.4√10
<X̄ − 16
0.4√10
<0.250.4√10
∣∣∣ µ = 16
)= 1− (Φ(1.98)− Φ(−1.98))
= 1− (0.9762− 0.0238)
= 0.0476.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 5 / 10
Example (cont’d)
I Then the significance level of the test is
α = P(X̄ ∈ B | H0)
= P(|X̄ − 16| ≥ 0.25 | µ = 16)
= 1− P(−0.25 < X̄ − 16 < 0.25 | µ = 16)
= 1− P
(−0.25
0.4√10
<X̄ − 16
0.4√10
<0.250.4√10
∣∣∣ µ = 16
)= 1− (Φ(1.98)− Φ(−1.98))
= 1− (0.9762− 0.0238)
= 0.0476.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 5 / 10
Example (cont’d)
I Note: we cannot compute
β = P(X̄ ∈ A | HA)
because HA is composite, and so does not uniquely specify a value ofµ.
I However, we can evaluate the operating characteristic
β(µ) = P(X̄ ∈ A | µ),
the probability of accepting H0 when µ is the true mean.
I And we can evaluate the power function,
π(µ) = 1− β(µ) = P(X̄ ∈ B | µ),
the probability of rejecting H0 when µ is the true mean.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 6 / 10
Example (cont’d)
I Note: we cannot compute
β = P(X̄ ∈ A | HA)
because HA is composite, and so does not uniquely specify a value ofµ.
I However, we can evaluate the operating characteristic
β(µ) = P(X̄ ∈ A | µ),
the probability of accepting H0 when µ is the true mean.
I And we can evaluate the power function,
π(µ) = 1− β(µ) = P(X̄ ∈ B | µ),
the probability of rejecting H0 when µ is the true mean.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 6 / 10
Example (cont’d)
I Note: we cannot compute
β = P(X̄ ∈ A | HA)
because HA is composite, and so does not uniquely specify a value ofµ.
I However, we can evaluate the operating characteristic
β(µ) = P(X̄ ∈ A | µ),
the probability of accepting H0 when µ is the true mean.
I And we can evaluate the power function,
π(µ) = 1− β(µ) = P(X̄ ∈ B | µ),
the probability of rejecting H0 when µ is the true mean.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 6 / 10
Example (cont’d)
I Note: we cannot compute
β = P(X̄ ∈ A | HA)
because HA is composite, and so does not uniquely specify a value ofµ.
I However, we can evaluate the operating characteristic
β(µ) = P(X̄ ∈ A | µ),
the probability of accepting H0 when µ is the true mean.
I And we can evaluate the power function,
π(µ) = 1− β(µ) = P(X̄ ∈ B | µ),
the probability of rejecting H0 when µ is the true mean.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 6 / 10
Example (cont’d)
I Note: π(16) = α, the significance level of the test.
I For example,
β(16.5) = P(X̄ ∈ A | µ = 16.5)
= P(15.75 < X̄ < 16.25 | µ = 16.5)
= P
(15.75− 16.5
0.4√10
<X̄ − 16.5
0.4√10
<16.25− 16.5
0.4√10
∣∣∣ µ = 16.5
)= Φ(−1.98)− Φ(−5.93)
= 0.0238,
andπ(16.5) = 1− β(16.5) = 0.9762.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 7 / 10
Example (cont’d)
I Note: π(16) = α, the significance level of the test.
I For example,
β(16.5) = P(X̄ ∈ A | µ = 16.5)
= P(15.75 < X̄ < 16.25 | µ = 16.5)
= P
(15.75− 16.5
0.4√10
<X̄ − 16.5
0.4√10
<16.25− 16.5
0.4√10
∣∣∣ µ = 16.5
)= Φ(−1.98)− Φ(−5.93)
= 0.0238,
andπ(16.5) = 1− β(16.5) = 0.9762.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 7 / 10
Example (cont’d)
I Note: π(16) = α, the significance level of the test.
I For example,
β(16.5) = P(X̄ ∈ A | µ = 16.5)
= P(15.75 < X̄ < 16.25 | µ = 16.5)
= P
(15.75− 16.5
0.4√10
<X̄ − 16.5
0.4√10
<16.25− 16.5
0.4√10
∣∣∣ µ = 16.5
)= Φ(−1.98)− Φ(−5.93)
= 0.0238,
andπ(16.5) = 1− β(16.5) = 0.9762.
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 7 / 10
Example (cont’d)
I The following graph of the power function π(µ) was created using theR commands:
> m <- seq(15,17,.01)> b <- pnorm(16.25,mean=m,sd=.4/sqrt(10))
- pnorm(15.75,mean=m,sd=.4/sqrt(10))> p <- 1 - b> plot(m,p,type="l",xlab="Mean",ylab="Power")
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 9 / 10
Example (cont’d)
I Graph of the power function:
15.0 15.5 16.0 16.5 17.0
0.2
0.4
0.6
0.8
1.0
Mean
Pow
er
I What should the graph of an “ideal” power function look like?
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 10 / 10
Example (cont’d)
I Graph of the power function:
15.0 15.5 16.0 16.5 17.0
0.2
0.4
0.6
0.8
1.0
Mean
Pow
er
I What should the graph of an “ideal” power function look like?
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 10 / 10
Example (cont’d)
I Graph of the power function:
15.0 15.5 16.0 16.5 17.0
0.2
0.4
0.6
0.8
1.0
Mean
Pow
er
I What should the graph of an “ideal” power function look like?
Dan Sloughter (Furman University) The Neyman-Pearson Paradigm April 21, 2006 10 / 10