Slides Prepared by JOHN S. LOUCKS St. Edward’s University

Post on 04-Jan-2016


Slide 1

© 2006 Thomson/South-Western

Slides Prepared by
JOHN S. LOUCKS
St. Edward’s University

Slide 2

Chapter 11
Comparisons Involving Proportions and a Test of Independence

Inferences About the Difference Between Two Population Proportions

Hypothesis Test for Proportions of a Multinomial Population

Test of Independence: Contingency Tables

Slide 3

Inferences About the Difference Between Two Population Proportions

Interval Estimation of $p_1 - p_2$

Hypothesis Tests About $p_1 - p_2$

Slide 4

Sampling Distribution of $\bar{p}_1 - \bar{p}_2$

Expected Value

$$E(\bar{p}_1 - \bar{p}_2) = p_1 - p_2$$

Standard Deviation (Standard Error)

$$\sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$

where:
$n_1$ = size of sample taken from population 1
$n_2$ = size of sample taken from population 2

Slide 5

Sampling Distribution of $\bar{p}_1 - \bar{p}_2$

If the sample sizes are large, the sampling distribution of $\bar{p}_1 - \bar{p}_2$ can be approximated by a normal probability distribution.

The sample sizes are sufficiently large if all of these conditions are met:

$n_1 p_1 \geq 5$    $n_1(1 - p_1) \geq 5$
$n_2 p_2 \geq 5$    $n_2(1 - p_2) \geq 5$
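The standard error formula and the four large-sample conditions can be sketched in a few lines of Python. This is only an illustration: the function names `standard_error` and `large_enough` are hypothetical, and the input values at the bottom are made up rather than taken from the slides.

```python
from math import sqrt


def standard_error(p1: float, n1: int, p2: float, n2: int) -> float:
    """Standard deviation of the difference between two sample proportions."""
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


def large_enough(p1: float, n1: int, p2: float, n2: int, minimum: float = 5) -> bool:
    """True when n*p and n*(1 - p) reach `minimum` for both samples."""
    counts = [n1 * p1, n1 * (1 - p1), n2 * p2, n2 * (1 - p2)]
    return all(count >= minimum for count in counts)


# Illustrative values only (not taken from the slides).
print(large_enough(0.5, 40, 0.1, 30))      # n2 * p2 is about 3, so the check fails
print(standard_error(0.5, 100, 0.5, 100))  # roughly 0.0707
```

In practice the population proportions are unknown, so a check like this would typically be run on the sample proportions instead.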

Slide 6

Sampling Distribution of $\bar{p}_1 - \bar{p}_2$

[Figure: normal curve for the sampling distribution of $\bar{p}_1 - \bar{p}_2$, centered at $p_1 - p_2$, with standard deviation $\sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$]

Slide 7

Interval Estimation of $p_1 - p_2$

Interval Estimate

$$\bar{p}_1 - \bar{p}_2 \pm z_{\alpha/2} \sqrt{\frac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \frac{\bar{p}_2(1 - \bar{p}_2)}{n_2}}$$

Slide 8

Interval Estimation of $p_1 - p_2$

Example: Market Research Associates

Market Research Associates is conducting research to evaluate the effectiveness of a client’s new advertising campaign. Before the new campaign began, a telephone survey of 150 households in the test market area showed 60 households “aware” of the client’s product.

The new campaign has been initiated with TV and newspaper advertisements running for three weeks.

Slide 9

Interval Estimation of $p_1 - p_2$

Example: Market Research Associates

A survey conducted immediately after the new campaign showed 120 of 250 households “aware” of the client’s product.

Does the data support the position that the advertising campaign has provided an increased awareness of the client’s product?

Slide 10

Point Estimator of the Difference Between Two Population Proportions

$p_1$ = proportion of the population of households “aware” of the product after the new campaign
$p_2$ = proportion of the population of households “aware” of the product before the new campaign
$\bar{p}_1$ = sample proportion of households “aware” of the product after the new campaign
$\bar{p}_2$ = sample proportion of households “aware” of the product before the new campaign

$$\bar{p}_1 - \bar{p}_2 = \frac{120}{250} - \frac{60}{150} = .48 - .40 = .08$$

Slide 11

Interval Estimation of $p_1 - p_2$

For $\alpha$ = .05, $z_{.025}$ = 1.96:

$$.48 - .40 \pm 1.96 \sqrt{\frac{.48(.52)}{250} + \frac{.40(.60)}{150}}$$

$$.08 \pm 1.96(.0510)$$

$$.08 \pm .10$$

Hence, the 95% confidence interval for the difference in before and after awareness of the product is -.02 to +.18.
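The interval calculation above can be reproduced with the Python standard library. This is a minimal sketch: the function name `proportion_diff_interval` is hypothetical, and the z value comes from `statistics.NormalDist` rather than a printed table, so intermediate digits differ slightly from the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist


def proportion_diff_interval(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 from two independent samples."""
    p1_bar, p2_bar = x1 / n1, x2 / n2
    # z value that leaves (1 - confidence) / 2 in each tail
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_error = sqrt(p1_bar * (1 - p1_bar) / n1 + p2_bar * (1 - p2_bar) / n2)
    margin = z * std_error
    diff = p1_bar - p2_bar
    return diff - margin, diff + margin


# After the campaign: 120 of 250 aware; before: 60 of 150 aware.
lower, upper = proportion_diff_interval(120, 250, 60, 150)
print(f"{lower:.2f} to {upper:.2f}")  # prints: -0.02 to 0.18
```

Because the interval includes zero, it does not by itself establish that awareness increased.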

Slide 12

Hypothesis Tests about $p_1 - p_2$

Hypotheses

We focus on tests involving no difference between the two population proportions (i.e., $p_1 = p_2$).

Left-tailed:   $H_0: p_1 - p_2 \geq 0$   $H_a: p_1 - p_2 < 0$
Right-tailed:  $H_0: p_1 - p_2 \leq 0$   $H_a: p_1 - p_2 > 0$
Two-tailed:    $H_0: p_1 - p_2 = 0$   $H_a: p_1 - p_2 \neq 0$
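The three forms of the test differ only in which tail area counts as the p-value. The sketch below illustrates that mapping; the function name `p_value` and the `tail` labels are hypothetical, and the z value is illustrative rather than taken from the slides.

```python
from statistics import NormalDist


def p_value(z: float, tail: str) -> float:
    """p-value of a z statistic for a 'left', 'right', or 'two' tailed test."""
    cdf = NormalDist().cdf
    if tail == "left":       # Ha: p1 - p2 < 0
        return cdf(z)
    if tail == "right":      # Ha: p1 - p2 > 0
        return 1 - cdf(z)
    if tail == "two":        # Ha: p1 - p2 != 0
        return 2 * (1 - cdf(abs(z)))
    raise ValueError(f"unknown tail: {tail!r}")


# Illustrative z value, not taken from the slides.
for tail in ("left", "right", "two"):
    print(tail, round(p_value(1.0, tail), 4))
```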

Slide 13

Hypothesis Tests about $p_1 - p_2$

Pooled Estimate of Standard Error of $\bar{p}_1 - \bar{p}_2$

$$s_{\bar{p}_1 - \bar{p}_2} = \sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

where:

$$\bar{p} = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2}$$

Slide 14

Hypothesis Tests about $p_1 - p_2$

Test Statistic

$$z = \frac{(\bar{p}_1 - \bar{p}_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

Slide 15

Hypothesis Tests about $p_1 - p_2$

Example: Market Research Associates

Can we conclude, using a .05 level of significance, that the proportion of households aware of the client’s product increased after the new advertising campaign?

Slide 16

Hypothesis Tests about $p_1 - p_2$

p-Value and Critical Value Approaches

1. Develop the hypotheses.

$H_0: p_1 - p_2 \leq 0$
$H_a: p_1 - p_2 > 0$

$p_1$ = proportion of the population of households “aware” of the product after the new campaign
$p_2$ = proportion of the population of households “aware” of the product before the new campaign

Slide 17

Hypothesis Tests about $p_1 - p_2$

p-Value and Critical Value Approaches

2. Specify the level of significance.  $\alpha$ = .05

3. Compute the value of the test statistic.

$$\bar{p} = \frac{250(.48) + 150(.40)}{250 + 150} = \frac{180}{400} = .45$$

$$s_{\bar{p}_1 - \bar{p}_2} = \sqrt{.45(.55)\left(\frac{1}{250} + \frac{1}{150}\right)} = .0514$$

$$z = \frac{(.48 - .40) - 0}{.0514} = \frac{.08}{.0514} = 1.56$$
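The pooled estimate, its standard error, and the z statistic can be checked with a short script. This is a sketch under the assumption that raw counts (120 of 250, 60 of 150) are available; the function name `pooled_z` is hypothetical.

```python
from math import sqrt


def pooled_z(x1, n1, x2, n2):
    """Return (pooled proportion, standard error, z) for H0: p1 - p2 = 0."""
    p1_bar, p2_bar = x1 / n1, x2 / n2
    # Pooled estimate: n1*p1_bar + n2*p2_bar is just the total count x1 + x2.
    pooled = (x1 + x2) / (n1 + n2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return pooled, std_error, (p1_bar - p2_bar) / std_error


pooled, std_error, z = pooled_z(120, 250, 60, 150)
print(f"pooled = {pooled:.2f}, std error = {std_error:.4f}, z = {z:.2f}")
# prints: pooled = 0.45, std error = 0.0514, z = 1.56
```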

Slide 18

Hypothesis Tests about $p_1 - p_2$

p-Value Approach

4. Compute the p-value.

For z = 1.56, the p-value = .0594

5. Determine whether to reject $H_0$.

Because p-value > $\alpha$ = .05, we cannot reject $H_0$.

We cannot conclude that the proportion of households aware of the client’s product increased after the new campaign.

Slide 19

Hypothesis Tests about $p_1 - p_2$

Critical Value Approach

4. Determine the critical value and rejection rule.

For $\alpha$ = .05, $z_{.05}$ = 1.645

Reject $H_0$ if z $\geq$ 1.645

5. Determine whether to reject $H_0$.

Because 1.56 < 1.645, we cannot reject $H_0$.

We cannot conclude that the proportion of households aware of the client’s product increased after the new campaign.
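Both decision rules for the right-tailed test can be sketched side by side. The snippet assumes the rounded test statistic z = 1.56 from the worked example and uses `statistics.NormalDist` in place of a printed normal table; the variable names are illustrative.

```python
from statistics import NormalDist

ALPHA = 0.05
Z = 1.56  # rounded test statistic from the worked example

standard_normal = NormalDist()

# p-value approach: area in the upper tail beyond the test statistic
p_value = 1 - standard_normal.cdf(Z)
reject_by_p_value = p_value <= ALPHA

# Critical value approach: z value with ALPHA in the upper tail
critical_value = standard_normal.inv_cdf(1 - ALPHA)
reject_by_critical_value = Z >= critical_value

print(f"p-value = {p_value:.4f}, critical value = {critical_value:.3f}")
# prints: p-value = 0.0594, critical value = 1.645
print(reject_by_p_value, reject_by_critical_value)  # prints: False False
```

The two approaches always agree, since both compare the same tail area against the same significance level.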

Slide 20

Hypothesis (Goodness of Fit) Test for Proportions of a Multinomial Population

1. Set up the null and alternative hypotheses.

2. Select a random sample and record the observed frequency, $f_i$, for each of the $k$ categories.

3. Assuming $H_0$ is true, compute the expected frequency, $e_i$, in each category by multiplying the category probability by the sample size.

Slide 21

Hypothesis (Goodness of Fit) Test for Proportions of a Multinomial Population

4. Compute the value of the test statistic.

$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$$

where:
$f_i$ = observed frequency for category $i$
$e_i$ = expected frequency for category $i$
$k$ = number of categories

Note: The test statistic has a chi-square distribution with $k$ - 1 df provided that the expected frequencies are 5 or more for all categories.

Slide 22

Hypothesis (Goodness of Fit) Test for Proportions of a Multinomial Population

5. Rejection rule:

p-value approach: Reject $H_0$ if p-value $\leq \alpha$

Critical value approach: Reject $H_0$ if $\chi^2 \geq \chi^2_\alpha$

where $\alpha$ is the significance level and there are $k$ - 1 degrees of freedom

Slide 23

Multinomial Distribution Goodness of Fit Test

Example: Finger Lakes Homes (A)

Finger Lakes Homes manufactures four models of prefabricated homes, a two-story colonial, a log cabin, a split-level, and an A-frame. To help in production planning, management would like to determine if previous customer purchases indicate that there is a preference in the style selected.

Slide 24

Multinomial Distribution Goodness of Fit Test

Example: Finger Lakes Homes (A)

The number of homes sold of each model for 100 sales over the past two years is shown below.

Model     Colonial   Log   Split-Level   A-Frame
# Sold       30       20       35           15

Slide 25

Multinomial Distribution Goodness of Fit Test

Hypotheses

$H_0$: $p_C = p_L = p_S = p_A = .25$

$H_a$: The population proportions are not $p_C = .25$, $p_L = .25$, $p_S = .25$, and $p_A = .25$

where:
$p_C$ = population proportion that purchase a colonial
$p_L$ = population proportion that purchase a log cabin
$p_S$ = population proportion that purchase a split-level
$p_A$ = population proportion that purchase an A-frame

Slide 26

Multinomial Distribution Goodness of Fit Test

Rejection Rule

With $\alpha$ = .05 and $k$ - 1 = 4 - 1 = 3 degrees of freedom:

Reject $H_0$ if p-value $\leq$ .05 or $\chi^2$ > 7.815.

[Figure: chi-square distribution with the critical value 7.815 marked on the $\chi^2$ axis; "Do Not Reject H0" region to the left, "Reject H0" region to the right]

Slide 27

Multinomial Distribution Goodness of Fit Test

Expected Frequencies

$e_1$ = .25(100) = 25    $e_2$ = .25(100) = 25
$e_3$ = .25(100) = 25    $e_4$ = .25(100) = 25

Test Statistic

$$\chi^2 = \frac{(30 - 25)^2}{25} + \frac{(20 - 25)^2}{25} + \frac{(35 - 25)^2}{25} + \frac{(15 - 25)^2}{25}$$

$$= 1 + 1 + 4 + 4 = 10$$
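The expected frequencies and test statistic for the Finger Lakes Homes (A) data can be recomputed as follows. This is a minimal sketch: the function name `chi_square_statistic` is hypothetical, and the critical value 7.815 is the table value quoted for $\alpha$ = .05 with 3 degrees of freedom rather than something the code derives.

```python
def chi_square_statistic(observed, expected):
    """Sum of (f - e)^2 / e over all categories."""
    if any(e < 5 for e in expected):
        raise ValueError("every expected frequency should be 5 or more")
    return sum((f - e) ** 2 / e for f, e in zip(observed, expected))


observed = [30, 20, 35, 15]            # colonial, log, split-level, A-frame
sample_size = sum(observed)            # 100 sales
expected = [0.25 * sample_size] * 4    # equal preference under H0

statistic = chi_square_statistic(observed, expected)
critical_value = 7.815                 # table value for alpha = .05, df = 3
print(statistic, statistic > critical_value)  # prints: 10.0 True
```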

Slide 28

Multinomial Distribution Goodness of Fit Test

Conclusion Using the p-Value Approach

Area in Upper Tail       .10     .05    .025     .01    .005
$\chi^2$ Value (df = 3)   6.251   7.815   9.348  11.345  12.838

Because $\chi^2$ = 10 is between 9.348 and 11.345, the area in the upper tail of the distribution is between .025 and .01.

The p-value $\leq \alpha$. We can reject the null hypothesis.
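The table lookup described above amounts to finding the two tabulated values that bracket the test statistic. The sketch below illustrates that idea using only the df = 3 row shown on the slide; the names `TABLE_DF3` and `bracket_p_value` are hypothetical.

```python
# Upper-tail areas and chi-square values for df = 3, copied from the table row.
TABLE_DF3 = [(0.10, 6.251), (0.05, 7.815), (0.025, 9.348), (0.01, 11.345), (0.005, 12.838)]


def bracket_p_value(statistic, table=TABLE_DF3):
    """Return (lower, upper) bounds on the p-value implied by the table."""
    upper = 1.0   # statistics below the first table value: p-value above .10
    for area, value in table:
        if statistic < value:
            return area, upper
        upper = area
    return 0.0, upper  # beyond the last table value


low, high = bracket_p_value(10)
print(f"{low} < p-value <= {high}")  # prints: 0.01 < p-value <= 0.025
```

Since the upper bound .025 is below $\alpha$ = .05, the bracket alone is enough to reject the null hypothesis.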

Slide 29

Multinomial Distribution Goodness of Fit Test

Conclusion Using the Critical Value Approach

$\chi^2$ = 10 > 7.815

We reject, at the .05 level of significance, the assumption that there is no home style preference.

Slide 30

Test of Independence: Contingency Tables

1. Set up the null and alternative hypotheses.

2. Select a random sample and record the observed frequency, $f_{ij}$, for each cell of the contingency table.

3. Compute the expected frequency, $e_{ij}$, for each cell.

$$e_{ij} = \frac{(\text{Row } i \text{ Total})(\text{Column } j \text{ Total})}{\text{Sample Size}}$$

Slide 31

Test of Independence: Contingency Tables

4. Compute the test statistic.

$$\chi^2 = \sum_i \sum_j \frac{(f_{ij} - e_{ij})^2}{e_{ij}}$$

5. Determine the rejection rule.

Reject $H_0$ if p-value $\leq \alpha$ or $\chi^2 \geq \chi^2_\alpha$.

where $\alpha$ is the significance level and, with $n$ rows and $m$ columns, there are ($n$ - 1)($m$ - 1) degrees of freedom.

Slide 32

Contingency Table (Independence) Test

Example: Finger Lakes Homes (B)

Each home sold by Finger Lakes Homes can be classified according to price and to style. Finger Lakes’ manager would like to determine if the price of the home and the style of the home are independent variables.

Slide 33

Contingency Table (Independence) Test

Example: Finger Lakes Homes (B)

The number of homes sold for each model and price for the past two years is shown below. For convenience, the price of the home is listed as either $99,000 or less or more than $99,000.

Price        Colonial   Log   Split-Level   A-Frame
≤ $99,000       18       6        19           12
> $99,000       12      14        16            3

Slide 34

Contingency Table (Independence) Test

Hypotheses

$H_0$: Price of the home is independent of the style of the home that is purchased

$H_a$: Price of the home is not independent of the style of the home that is purchased

Slide 35

Contingency Table (Independence) Test

Expected Frequencies

Price     Colonial   Log   Split-Level   A-Frame   Total
≤ $99K       18       6        19           12       55
> $99K       12      14        16            3       45
Total        30      20        35           15      100
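The expected frequency for each cell follows from the row and column totals in the table above. The sketch below applies (row total)(column total) / (sample size) to every cell; the variable names and dictionary layout are illustrative choices, not part of the slides.

```python
observed = {
    "<= $99K": [18, 6, 19, 12],   # colonial, log, split-level, A-frame
    "> $99K": [12, 14, 16, 3],
}

row_totals = {price: sum(counts) for price, counts in observed.items()}
column_totals = [sum(column) for column in zip(*observed.values())]
sample_size = sum(row_totals.values())

# e_ij = (row i total)(column j total) / sample size
expected = {
    price: [row_totals[price] * col_total / sample_size for col_total in column_totals]
    for price in observed
}

for price, row in expected.items():
    print(price, row)
# prints: <= $99K [16.5, 11.0, 19.25, 8.25]
#         > $99K [13.5, 9.0, 15.75, 6.75]
```

Every expected frequency is 5 or more, so the chi-square approximation condition is satisfied for this table.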

Slide 36

Contingency Table (Independence) Test

Rejection Rule

With $\alpha$ = .05 and (2 - 1)(4 - 1) = 3 d.f., $\chi^2_{.05}$ = 7.815

Reject $H_0$ if p-value $\leq$ .05 or $\chi^2 \geq$ 7.815

Test Statistic

$$\chi^2 = \frac{(18 - 16.5)^2}{16.5} + \frac{(6 - 11)^2}{11} + \cdots + \frac{(3 - 6.75)^2}{6.75}$$

$$= .1364 + 2.2727 + \cdots + 2.0833 = 9.149$$
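The full sum behind the abbreviated test statistic can be verified by looping over all eight cells. This is a sketch using the observed counts from the Finger Lakes Homes (B) table; the variable names are illustrative.

```python
observed = [[18, 6, 19, 12],    # <= $99K
            [12, 14, 16, 3]]    # >  $99K

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
sample_size = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, f_ij in enumerate(row):
        e_ij = row_totals[i] * column_totals[j] / sample_size
        chi_square += (f_ij - e_ij) ** 2 / e_ij

degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
print(f"chi-square = {chi_square:.3f}, df = {degrees_of_freedom}")
# prints: chi-square = 9.149, df = 3
```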

Slide 37

Contingency Table (Independence) Test

Conclusion Using the p-Value Approach

Area in Upper Tail       .10     .05    .025     .01    .005
$\chi^2$ Value (df = 3)   6.251   7.815   9.348  11.345  12.838

Because $\chi^2$ = 9.149 is between 7.815 and 9.348, the area in the upper tail of the distribution is between .05 and .025.

The p-value $\leq \alpha$. We can reject the null hypothesis.
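Instead of bracketing the p-value with table entries, the upper-tail area can be computed directly. The sketch below relies on a closed form that holds only for 3 degrees of freedom, which is an assumption specific to this example; the function name `chi_square_upper_tail_df3` is hypothetical, and a statistics library would normally be used for general degrees of freedom.

```python
from math import erfc, exp, pi, sqrt


def chi_square_upper_tail_df3(x: float) -> float:
    """Upper-tail area of the chi-square distribution with 3 degrees of freedom.

    Uses the closed form erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2),
    which holds for df = 3 only.
    """
    if x <= 0:
        return 1.0
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)


p_value = chi_square_upper_tail_df3(9.149)
print(0.025 < p_value < 0.05)   # prints: True, consistent with the table bracket
```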

Slide 38

Contingency Table (Independence) Test

Conclusion Using the Critical Value Approach

$\chi^2$ = 9.149 > 7.815

We reject, at the .05 level of significance, the assumption that the price of the home is independent of the style of home that is purchased.

Slide 39

End of Chapter 11