Slides Prepared by JOHN S. LOUCKS St. Edward’s University
Slide 1
© 2006 Thomson/South-Western
Slides Prepared by JOHN S. LOUCKS, St. Edward’s University
Slide 2
Chapter 11: Comparisons Involving Proportions and a Test of Independence
Inferences About the Difference Between Two Population Proportions
Test of Independence: Contingency Tables
Hypothesis Test for Proportions of a Multinomial Population
Slide 3
Inferences About the Difference Between Two Population Proportions
Interval Estimation of $p_1 - p_2$
Hypothesis Tests About $p_1 - p_2$
Slide 4
Sampling Distribution of $\bar{p}_1 - \bar{p}_2$
Expected Value
$$E(\bar{p}_1 - \bar{p}_2) = p_1 - p_2$$
Standard Deviation (Standard Error)
$$\sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$
where: $n_1$ = size of sample taken from population 1
$n_2$ = size of sample taken from population 2
Slide 5
Sampling Distribution of $\bar{p}_1 - \bar{p}_2$
If the sample sizes are large, the sampling distribution of $\bar{p}_1 - \bar{p}_2$ can be approximated by a normal probability distribution.
The sample sizes are sufficiently large if all of these conditions are met:
$n_1 p_1 \geq 5$    $n_1(1 - p_1) \geq 5$
$n_2 p_2 \geq 5$    $n_2(1 - p_2) \geq 5$
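The four large-sample conditions can be checked mechanically. The sketch below is only an illustration: the function name is hypothetical, and because the population proportions are unknown in practice, it plugs in the sample proportions from the Market Research Associates example used later in these slides (an assumption, not something the slides prescribe).

```python
def sample_sizes_large_enough(n1, p1, n2, p2, minimum=5):
    """Return True when all four large-sample conditions hold."""
    conditions = [n1 * p1, n1 * (1 - p1), n2 * p2, n2 * (1 - p2)]
    return all(value >= minimum for value in conditions)

# Survey counts from the Market Research Associates example;
# sample proportions stand in for the unknown p1 and p2 (an assumption).
ok = sample_sizes_large_enough(250, 120 / 250, 150, 60 / 150)
print(ok)
```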
Slide 6
Sampling Distribution of $\bar{p}_1 - \bar{p}_2$
[Figure: normal curve for the sampling distribution of $\bar{p}_1 - \bar{p}_2$, centered at $p_1 - p_2$, with standard error $\sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$]
Slide 7
Interval Estimation of $p_1 - p_2$
Interval Estimate
$$\bar{p}_1 - \bar{p}_2 \pm z_{\alpha/2}\sqrt{\frac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \frac{\bar{p}_2(1 - \bar{p}_2)}{n_2}}$$
Slide 8
Interval Estimation of $p_1 - p_2$
Example: Market Research Associates
Market Research Associates is conducting research to evaluate the effectiveness of a client’s new advertising campaign. Before the new campaign began, a telephone survey of 150 households in the test market area showed 60 households “aware” of the client’s product.
The new campaign has been initiated with TV and newspaper advertisements running for three weeks.
Slide 9
Interval Estimation of $p_1 - p_2$
Example: Market Research Associates
A survey conducted immediately after the new campaign showed 120 of 250 households “aware” of the client’s product.
Does the data support the position that the advertising campaign has provided an increased awareness of the client’s product?
Slide 10
Point Estimator of the Difference Between Two Population Proportions
$p_1$ = proportion of the population of households “aware” of the product after the new campaign
$p_2$ = proportion of the population of households “aware” of the product before the new campaign
$\bar{p}_1$ = sample proportion of households “aware” of the product after the new campaign
$\bar{p}_2$ = sample proportion of households “aware” of the product before the new campaign
$$\bar{p}_1 - \bar{p}_2 = \frac{120}{250} - \frac{60}{150} = .48 - .40 = .08$$
Slide 11
Interval Estimation of $p_1 - p_2$
For $\alpha$ = .05, $z_{.025}$ = 1.96:
$$.48 - .40 \pm 1.96\sqrt{\frac{.48(.52)}{250} + \frac{.40(.60)}{150}}$$
$$.08 \pm 1.96(.0510)$$
$$.08 \pm .10$$
Hence, the 95% confidence interval for the difference in before and after awareness of the product is -.02 to +.18.
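The interval computation for the Market Research Associates data can be reproduced in a few lines. This is a minimal sketch using only the Python standard library; the function name and its parameters are illustrative choices, not part of the slides.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportion_interval(x1, n1, x2, n2, confidence=0.95):
    """Large-sample interval estimate of p1 - p2 from two independent samples."""
    p1_bar = x1 / n1
    p2_bar = x2 / n2
    # z value with alpha/2 in the upper tail (about 1.96 for 95% confidence)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_error = sqrt(p1_bar * (1 - p1_bar) / n1 + p2_bar * (1 - p2_bar) / n2)
    margin = z * std_error
    point = p1_bar - p2_bar
    return point - margin, point + margin

# After campaign: 120 of 250 aware; before campaign: 60 of 150 aware.
lower, upper = diff_proportion_interval(120, 250, 60, 150)
print(f"{lower:.2f} to {upper:.2f}")
```

Because the interval includes 0, it does not by itself establish that awareness increased.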
Slide 12
Hypothesis Tests about $p_1 - p_2$
Hypotheses
We focus on tests involving no difference between the two population proportions (i.e. $p_1 = p_2$)
Left-tailed: $H_0: p_1 - p_2 \geq 0$, $H_a: p_1 - p_2 < 0$
Right-tailed: $H_0: p_1 - p_2 \leq 0$, $H_a: p_1 - p_2 > 0$
Two-tailed: $H_0: p_1 - p_2 = 0$, $H_a: p_1 - p_2 \neq 0$
Slide 13
Hypothesis Tests about $p_1 - p_2$
Pooled Estimate of Standard Error of $\bar{p}_1 - \bar{p}_2$
$$\sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
where:
$$\bar{p} = \frac{n_1\bar{p}_1 + n_2\bar{p}_2}{n_1 + n_2}$$
Slide 14
Hypothesis Tests about $p_1 - p_2$
Test Statistic
$$z = \frac{\bar{p}_1 - \bar{p}_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
Slide 15
Hypothesis Tests about $p_1 - p_2$
Example: Market Research Associates
Can we conclude, using a .05 level of significance, that the proportion of households aware of the client’s product increased after the new advertising campaign?
Slide 16
Hypothesis Tests about $p_1 - p_2$
$p$-Value and Critical Value Approaches
1. Develop the hypotheses.
$H_0: p_1 - p_2 \leq 0$
$H_a: p_1 - p_2 > 0$
$p_1$ = proportion of the population of households “aware” of the product after the new campaign
$p_2$ = proportion of the population of households “aware” of the product before the new campaign
Slide 17
Hypothesis Tests about $p_1 - p_2$
$p$-Value and Critical Value Approaches
2. Specify the level of significance. $\alpha$ = .05
3. Compute the value of the test statistic.
$$\bar{p} = \frac{250(.48) + 150(.40)}{250 + 150} = \frac{180}{400} = .45$$
$$s_{\bar{p}_1 - \bar{p}_2} = \sqrt{.45(.55)\left(\frac{1}{250} + \frac{1}{150}\right)} = .0514$$
$$z = \frac{(.48 - .40) - 0}{.0514} = \frac{.08}{.0514} = 1.56$$
Slide 18
Hypothesis Tests about $p_1 - p_2$
$p$-Value Approach
4. Compute the $p$-value.
For $z$ = 1.56, the $p$-value = .0594
5. Determine whether to reject $H_0$.
Because $p$-value > $\alpha$ = .05, we cannot reject $H_0$.
We cannot conclude that the proportion of households aware of the client’s product increased after the new campaign.
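For an upper-tail test, the p-value is the area under the standard normal curve to the right of the observed z. A minimal sketch with the standard library's `statistics.NormalDist` (variable names are illustrative):

```python
from statistics import NormalDist

z = 1.56        # test statistic from step 3
alpha = 0.05    # level of significance

# Upper-tail area to the right of z under the standard normal curve
p_value = 1 - NormalDist().cdf(z)
reject_h0 = p_value <= alpha

print(round(p_value, 4), reject_h0)
```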
Slide 19
Hypothesis Tests about $p_1 - p_2$
Critical Value Approach
4. Determine the critical value and rejection rule.
For $\alpha$ = .05, $z_{.05}$ = 1.645
Reject $H_0$ if $z \geq 1.645$
5. Determine whether to reject $H_0$.
Because 1.56 < 1.645, we cannot reject $H_0$.
We cannot conclude that the proportion of households aware of the client’s product increased after the new campaign.
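The critical value itself comes from the inverse of the standard normal CDF. A brief sketch, assuming the same upper-tail test and the z value of 1.56 computed earlier:

```python
from statistics import NormalDist

alpha = 0.05
z = 1.56

# Value with area alpha in the upper tail of the standard normal distribution
z_critical = NormalDist().inv_cdf(1 - alpha)
reject_h0 = z >= z_critical

print(round(z_critical, 3), reject_h0)
```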
Slide 20
Hypothesis (Goodness of Fit) Test for Proportions of a Multinomial Population
1. Set up the null and alternative hypotheses.
2. Select a random sample and record the observed frequency, $f_i$, for each of the $k$ categories.
3. Assuming $H_0$ is true, compute the expected frequency, $e_i$, in each category by multiplying the category probability by the sample size.
Slide 21
Hypothesis (Goodness of Fit) Test for Proportions of a Multinomial Population
4. Compute the value of the test statistic.
$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$$
where:
$f_i$ = observed frequency for category $i$
$e_i$ = expected frequency for category $i$
$k$ = number of categories
Note: The test statistic has a chi-square distribution with $k$ – 1 df provided that the expected frequencies are 5 or more for all categories.
Slide 22
Hypothesis (Goodness of Fit) Test for Proportions of a Multinomial Population
5. Rejection rule:
$p$-value approach: Reject $H_0$ if $p$-value $\leq \alpha$
Critical value approach: Reject $H_0$ if $\chi^2 \geq \chi^2_\alpha$
where $\alpha$ is the significance level and there are $k$ - 1 degrees of freedom
Slide 23
Multinomial Distribution Goodness of Fit Test
Example: Finger Lakes Homes (A)
Finger Lakes Homes manufactures four models of prefabricated homes, a two-story colonial, a log cabin, a split-level, and an A-frame. To help in production planning, management would like to determine if previous customer purchases indicate that there is a preference in the style selected.
Slide 24
Multinomial Distribution Goodness of Fit Test
Example: Finger Lakes Homes (A)
The number of homes sold of each model for 100 sales over the past two years is shown below.
Model     Colonial   Log   Split-Level   A-Frame
# Sold    30         20    35            15
Slide 25
Multinomial Distribution Goodness of Fit Test
Hypotheses
$H_0$: $p_C = p_L = p_S = p_A = .25$
$H_a$: The population proportions are not $p_C = .25$, $p_L = .25$, $p_S = .25$, and $p_A = .25$
where:
$p_C$ = population proportion that purchase a colonial
$p_L$ = population proportion that purchase a log cabin
$p_S$ = population proportion that purchase a split-level
$p_A$ = population proportion that purchase an A-frame
Slide 26
Multinomial Distribution Goodness of Fit Test
Rejection Rule
With $\alpha$ = .05 and $k$ - 1 = 4 - 1 = 3 degrees of freedom
Reject $H_0$ if $p$-value $\leq$ .05 or $\chi^2$ > 7.815.
[Figure: chi-square distribution with the “Do Not Reject $H_0$” region below 7.815 and the “Reject $H_0$” region above 7.815]
Slide 27
Multinomial Distribution Goodness of Fit Test
Expected Frequencies
$e_1$ = .25(100) = 25    $e_2$ = .25(100) = 25
$e_3$ = .25(100) = 25    $e_4$ = .25(100) = 25
Test Statistic
$$\chi^2 = \frac{(30 - 25)^2}{25} + \frac{(20 - 25)^2}{25} + \frac{(35 - 25)^2}{25} + \frac{(15 - 25)^2}{25}$$
= 1 + 1 + 4 + 4
= 10
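The expected frequencies and test statistic for the Finger Lakes Homes (A) data can be computed as follows. This is a minimal sketch; the function name is hypothetical, and the check that every expected frequency is at least 5 mirrors the note attached to the test statistic.

```python
def goodness_of_fit_statistic(observed, probabilities):
    """Chi-square goodness of fit statistic for a multinomial population."""
    n = sum(observed)
    # Expected frequency = category probability under H0 times sample size
    expected = [p * n for p in probabilities]
    if min(expected) < 5:
        raise ValueError("every expected frequency should be 5 or more")
    return sum((f - e) ** 2 / e for f, e in zip(observed, expected))

# Colonial, Log, Split-Level, A-Frame; H0 says each proportion is .25
chi_square = goodness_of_fit_statistic([30, 20, 35, 15], [0.25] * 4)
print(chi_square)
```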
Slide 28
Multinomial Distribution Goodness of Fit Test
Conclusion Using the $p$-Value Approach
Area in Upper Tail        .10     .05     .025    .01      .005
$\chi^2$ Value (df = 3)   6.251   7.815   9.348   11.345   12.838
Because $\chi^2$ = 10 is between 9.348 and 11.345, the area in the upper tail of the distribution is between .025 and .01.
The $p$-value $\leq \alpha$. We can reject the null hypothesis.
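The table lookup on this slide amounts to finding the two tabulated values that bracket the test statistic. The sketch below hard-codes the df = 3 row from the table above; the function name and return convention are illustrative assumptions.

```python
# (upper-tail area, chi-square value) pairs for df = 3, copied from the slide's table
CHI_SQUARE_DF3 = [(0.10, 6.251), (0.05, 7.815), (0.025, 9.348),
                  (0.01, 11.345), (0.005, 12.838)]

def bracket_p_value(statistic, table=CHI_SQUARE_DF3):
    """Return (lower, upper) bounds on the upper-tail area for the statistic."""
    upper_bound = 1.0   # used when the statistic is below every tabulated value
    lower_bound = 0.0   # used when the statistic exceeds every tabulated value
    for area, value in table:   # values increase as areas decrease
        if statistic >= value:
            upper_bound = area
        else:
            lower_bound = area
            break
    return lower_bound, upper_bound

print(bracket_p_value(10))
```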
Slide 29
Multinomial Distribution Goodness of Fit Test
Conclusion Using the Critical Value Approach
$\chi^2$ = 10 > 7.815
We reject, at the .05 level of significance, the assumption that there is no home style preference.
Slide 30
Test of Independence: Contingency Tables
1. Set up the null and alternative hypotheses.
2. Select a random sample and record the observed frequency, $f_{ij}$, for each cell of the contingency table.
3. Compute the expected frequency, $e_{ij}$, for each cell.
$$e_{ij} = \frac{(\text{Row } i \text{ Total})(\text{Column } j \text{ Total})}{\text{Sample Size}}$$
Slide 31
Test of Independence: Contingency Tables
4. Compute the test statistic.
$$\chi^2 = \sum_i \sum_j \frac{(f_{ij} - e_{ij})^2}{e_{ij}}$$
5. Determine the rejection rule.
Reject $H_0$ if $p$-value $\leq \alpha$ or $\chi^2 \geq \chi^2_\alpha$,
where $\alpha$ is the significance level and, with $n$ rows and $m$ columns, there are ($n$ - 1)($m$ - 1) degrees of freedom.
Slide 32
Contingency Table (Independence) Test
Example: Finger Lakes Homes (B)
Each home sold by Finger Lakes Homes can be classified according to price and to style. Finger Lakes’ manager would like to determine if the price of the home and the style of the home are independent variables.
Slide 33
Contingency Table (Independence) Test
Example: Finger Lakes Homes (B)
The number of homes sold for each model and price for the past two years is shown below. For convenience, the price of the home is listed as either “$99,000 or less” or “more than $99,000”.
Price        Colonial   Log   Split-Level   A-Frame
≤ $99,000    18         6     19            12
> $99,000    12         14    16            3
Slide 34
Contingency Table (Independence) Test
Hypotheses
$H_0$: Price of the home is independent of the style of the home that is purchased
$H_a$: Price of the home is not independent of the style of the home that is purchased
Slide 35
Contingency Table (Independence) Test
Expected Frequencies
Price     Colonial   Log   Split-Level   A-Frame   Total
≤ $99K    18         6     19            12        55
> $99K    12         14    16            3         45
Total     30         20    35            15        100
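The expected frequency for each cell follows from the row and column totals. A short sketch of that calculation for the Finger Lakes Homes (B) table, with a hypothetical function name and the observed counts entered as a list of rows:

```python
def expected_frequencies(observed):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in column_totals] for r in row_totals]

observed = [
    [18, 6, 19, 12],   # price <= $99K: Colonial, Log, Split-Level, A-Frame
    [12, 14, 16, 3],   # price >  $99K
]
expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

Every expected frequency here is at least 5, so the chi-square approximation is reasonable for this table.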
Slide 36
Contingency Table (Independence) Test
Rejection Rule
With $\alpha$ = .05 and (2 - 1)(4 - 1) = 3 d.f., $\chi^2_{.05}$ = 7.815
Reject $H_0$ if $p$-value $\leq$ .05 or $\chi^2 \geq 7.815$
Test Statistic
$$\chi^2 = \frac{(18 - 16.5)^2}{16.5} + \frac{(6 - 11)^2}{11} + \cdots + \frac{(3 - 6.75)^2}{6.75}$$
= .1364 + 2.2727 + . . . + 2.0833 = 9.149
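Summing the contribution of all eight cells reproduces the value of the test statistic. The sketch below is self-contained and uses illustrative names; it recomputes the expected frequencies from the observed table rather than hard-coding them.

```python
def independence_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(observed):
        for j, f_ij in enumerate(row):
            e_ij = row_totals[i] * column_totals[j] / n   # expected frequency
            chi_square += (f_ij - e_ij) ** 2 / e_ij
    degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi_square, degrees_of_freedom

chi_square, df = independence_statistic([[18, 6, 19, 12], [12, 14, 16, 3]])
print(round(chi_square, 3), df)
```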
Slide 37
Contingency Table (Independence) Test
Conclusion Using the $p$-Value Approach
Area in Upper Tail        .10     .05     .025    .01      .005
$\chi^2$ Value (df = 3)   6.251   7.815   9.348   11.345   12.838
Because $\chi^2$ = 9.149 is between 7.815 and 9.348, the area in the upper tail of the distribution is between .05 and .025.
The $p$-value $\leq \alpha$. We can reject the null hypothesis.
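Instead of bracketing the p-value from the table, the upper-tail area can be computed directly. For 3 degrees of freedom the chi-square upper-tail area has a closed form, $\mathrm{erfc}(\sqrt{x/2}) + \sqrt{2x/\pi}\,e^{-x/2}$, which the sketch below evaluates with the standard library. The function name is hypothetical, and the formula applies only to df = 3; other degrees of freedom would need a statistics library.

```python
from math import erfc, exp, pi, sqrt

def chi_square_upper_tail_df3(x):
    """Upper-tail area of the chi-square distribution with 3 degrees of freedom."""
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)

p_value = chi_square_upper_tail_df3(9.149)
print(0.025 < p_value < 0.05)
```

The result falls between .025 and .05, in agreement with the table lookup.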
Slide 38
Contingency Table (Independence) Test
Conclusion Using the Critical Value Approach
$\chi^2$ = 9.149 > 7.815
We reject, at the .05 level of significance, the assumption that the price of the home is independent of the style of home that is purchased.
Slide 39
End of Chapter 11