Sample Size, Study Design and Comparing Two Proportions with Confidence Intervals


Page 1: Sample Size, Study Design and Comparing Two Proportions with Confidence Intervals€¦ ·  · 2007-10-17Comparing Two Proportions with Confidence Intervals. Review Up to this point,

Sample Size, Study Design and Comparing Two Proportions with

Confidence Intervals


Review

Up to this point, we have discussed:
– how to state a question in the form of two hypotheses (null and alternative),
– how to assess the data, and
– how to answer the question by using a general test statistic and an associated measure of the probability of observing our statistic (p-value), given the current state (null hypothesis).

In addition, we have addressed estimation with confidence by using the standard normal distribution to place upper and lower confidence bounds about an estimate.


Previously, we have been using observed data to test an assumed proportion.

The sample size is involved in testing the assumptions under the null hypothesis and in calculating the standard error under the null hypothesis.

What about comparing two observed proportions? How does sample size figure into the calculations and affect variability and precision when testing or comparing two observed proportions?


Gallup Poll

On Election Day this year, residents in Michigan will vote on whether to amend their state constitution to make marriages between same-sex couples illegal.

As part of a special poll of Michigan registered and likely voters, Gallup sought to find out how the people in that state plan to vote on this contentious issue.


Results

• According to the poll, a bare majority of Michigan registered voters -- 51% -- say they would vote against the proposal to ban gay marriages. This compares with 44% who would vote for the proposal.

• Among likely Michigan voters, the results are essentially the same, with 51% against the ban and 45% in favor.


Question

• If the election were being held today, would you vote – for the proposal, which would pass a ban on gay marriages or against the proposal, which would defeat the ban on gay marriages?


Survey Method

Results are based on telephone interviews with 829 registered voters in Michigan, aged 18 and older, conducted Sept. 10-13, 2004.

For results based on this sample, one can say with 95% confidence that the margin of sampling error is ±4 percentage points.


Confidence Interval

What is the confidence interval about the proportion opposing the ban, 51%?

\[
\begin{aligned}
0.51 \pm 1.96\sqrt{\frac{0.51(1-0.51)}{829}} &= 0.51 \pm 1.96\,(0.017) \\
&= 0.51 \pm 0.034 \\
&= [0.48,\ 0.54]
\end{aligned}
\]


Conclusion

That is, we are 95% confident that the proportion of registered voters opposing the ban on gay marriages is between 48% and 54%.

Notice that the lower bound is still greater than the proportion supporting the ban on gay marriages (44%).
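The interval above can be reproduced with a few lines of code. This is a minimal sketch of the normal-approximation interval used on the slide; the function name `wald_ci` is an illustrative choice, not something from the lecture.

```python
import math

def wald_ci(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a single proportion."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # estimated standard error
    margin = z * se                          # half-width of the interval
    return p_hat - margin, p_hat + margin

# Gallup Michigan poll: 51% opposed, 829 registered voters
lower, upper = wald_ci(0.51, 829)
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")  # 95% CI: [0.48, 0.54]
```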


Sample Size and Precision

How reliable, or variable, are these numbers? Could it change to 30% next time and 90% the next? Or would a “re-do” just change it to 50% or 52%?

Instead of using confidence intervals, Gallup uses “margin of error.”

The Gallup “survey methods” says: “that the margin of sampling error is ±4 percentage points.”


Margin of sampling error

How do they figure the margin is “± 4”?

The half-width of the confidence interval, commonly called the “margin of error,” depends on 3 things:
• The sample size, n,
• The level of confidence, usually 95%, and
• The proportion we are trying to estimate.
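To see how each of the three factors moves the margin of error, the sketch below varies one at a time. It assumes the usual normal-approximation formula d = z·sqrt(p(1−p)/n); the function name `margin_of_error` and the particular values of n, confidence level and p tried here are illustrative choices only.

```python
import math
from statistics import NormalDist

def margin_of_error(p, n, confidence=0.95):
    """Half-width of the normal-approximation CI for a proportion."""
    # two-sided critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p * (1 - p) / n)

# 1) Larger samples shrink the margin
for n in (100, 829, 3000):
    print(f"n={n:5d}  margin={margin_of_error(0.51, n):.3f}")

# 2) Higher confidence widens the margin
for conf in (0.90, 0.95, 0.99):
    print(f"confidence={conf:.2f}  margin={margin_of_error(0.51, 829, conf):.3f}")

# 3) The margin is largest when p is near 0.5
for p in (0.10, 0.30, 0.50):
    print(f"p={p:.2f}  margin={margin_of_error(p, 829):.3f}")
```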


Sample size calculation

Recall that the z that corresponds to 95% confidence is 1.96. An equation for calculating the sample size necessary to estimate a proportion p to within a margin of error d, with reliability coefficient z, is:


\[
n = \frac{z^2\,\hat{p}(1-\hat{p})}{d^2}
\]


For this example:

\[
\begin{aligned}
n &= \frac{z^2\,\hat{p}(1-\hat{p})}{d^2} \\
  &= \frac{1.96^2 \cdot 0.51(1-0.51)}{0.04^2} \\
  &= 600.01
\end{aligned}
\]


Conclusion

Rounding up, we need 601 subjects to estimate p with the level of error and confidence specified.

If we’d guessed p = 0.54, then the required n would be 597, a roughly similar value.
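The two sample sizes quoted above can be checked with a short sketch of the same formula. The function name `required_n` is an illustrative choice, and the last line (p = 0.50) is an added comparison, not a figure from the slides.

```python
import math

def required_n(p, d, z=1.96):
    """Smallest n giving margin of error d for proportion p (normal approximation)."""
    n_exact = z**2 * p * (1 - p) / d**2
    return math.ceil(n_exact)  # always round up to a whole subject

print(required_n(0.51, 0.04))  # 601
print(required_n(0.54, 0.04))  # 597
print(required_n(0.50, 0.04))  # p = 0.5 gives the largest n for a fixed d
```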

So, where does Gallup come up with n=829?


Use the formula to estimate d from the sample size you have:

\[
\begin{aligned}
d &= z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \\
  &= 1.96\sqrt{\frac{0.51(1-0.51)}{829}} \\
  &= 0.034 \approx 0.04?
\end{aligned}
\]


Another Gallup Example

In a new Gallup Poll, conducted Sept. 13-15, President George W. Bush leads Democratic candidate John Kerry by 55% to 42% among likely voters, and by 52% to 44% among registered voters.


Likely voters        Kerry/Edwards   Bush/Cheney   NEITHER (vol.)   No opinion
                     %               %             %                %
2004 Sep 13-15       42              55            1                2
2004 Sep 3-5         45              52            1                2
2004 Aug 23-25       47              50            1                2
2004 Aug 9-11        47              50            1                2
2004 Jul 30-Aug 1    47              51            *                2
2004 Jul 19-21       49              47            2                2
2004 Jul 8-11        50              46            2                2
2004 Jun 21-23       48              49            1                2
2003 Jun 3-6         50              44            2                3


Survey Methods

Results based on likely voters are based on the sub-sample of 767 survey respondents deemed most likely to vote in the November 2004 general election, according to a series of questions measuring current voting intentions and past voting behavior.

For results based on the total sample of likely voters, one can say with 95% confidence that the margin of sampling error is ±4 percentage points.


Confidence Intervals

What is the confidence interval about the proportion of likely voters supporting Bush, 0.55?

\[
\begin{aligned}
0.55 \pm 1.96\sqrt{\frac{0.55(1-0.55)}{767}} &= 0.55 \pm 1.96\,(0.018) \\
&= 0.55 \pm 0.035 \\
&= [0.515,\ 0.585] \approx [0.52,\ 0.59]
\end{aligned}
\]


Conclusion

That is, we are 95% confident that the proportion of likely voters supporting Bush is between 52% and 59%.

Notice that the lower bound is still greater than the proportion of likely voters supporting Kerry (42%).


Did Gallup use enough people?

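With p = 0.55, Gallup wants a margin of error d = 0.04 with 95% confidence, so:

\[
\begin{aligned}
n &= \frac{z^2\,\hat{p}(1-\hat{p})}{d^2} \\
  &= \frac{1.96^2 \cdot 0.55(1-0.55)}{0.04^2} \\
  &= 594.2
\end{aligned}
\]

Rounding up, 595 subjects would be needed, so the sub-sample of 767 likely voters is more than enough.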


Local Poll

• For the Columbiana City Council, District 2, 46% of polled voters support Derrik Bryant and 54% of polled voters support Danny Kelley.

• Only 93 subjects were polled. What is the 95% confidence interval about each proportion?


For Bryant:

\[
\begin{aligned}
0.46 \pm 1.96\sqrt{\frac{0.46(1-0.46)}{93}} &= 0.46 \pm 1.96\,(0.052) \\
&= 0.46 \pm 0.10 \\
&= [0.36,\ 0.56]
\end{aligned}
\]

We are 95% confident that the proportion of voters supporting Bryant is between 36% and 56%.


For Kelley:

\[
\begin{aligned}
0.54 \pm 1.96\sqrt{\frac{0.54(1-0.54)}{93}} &= 0.54 \pm 1.96\,(0.052) \\
&= 0.54 \pm 0.10 \\
&= [0.44,\ 0.64]
\end{aligned}
\]

We are 95% confident that the proportion of voters supporting Kelley is between 44% and 64%.


Notice

• The standard error for both CIs is the same, since we are dealing with two proportions that total 100%, so p(1 − p) is identical for each.

• The confidence intervals for both candidates overlap; this is likely because such a small n was used.
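Both observations can be checked numerically. The sketch below recomputes the two intervals for n = 93 and tests whether they overlap; the helper names and the comparison sample size of 1000 are hypothetical choices made for illustration, not values from the poll.

```python
import math

def wald_ci(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a single proportion."""
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def intervals_overlap(a, b):
    """True when two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

for n in (93, 1000):  # 93 is the poll's n; 1000 is a hypothetical larger sample
    bryant = wald_ci(0.46, n)
    kelley = wald_ci(0.54, n)
    print(f"n={n}: Bryant [{bryant[0]:.2f}, {bryant[1]:.2f}], "
          f"Kelley [{kelley[0]:.2f}, {kelley[1]:.2f}], "
          f"overlap={intervals_overlap(bryant, kelley)}")
```

With the same observed proportions, the intervals overlap at n = 93 but separate at the larger hypothetical sample size.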


What should n have been?

\[
\begin{aligned}
n &= \frac{1.96^2 \cdot 0.46(1-0.46)}{0.04^2} \\
  &= 596.4
\end{aligned}
\]

Rounding up, we needed 597 subjects to estimate p with the level of error and confidence specified.


Are they statistically different?

• What about declaring statistical significance?

• How do we determine if two observed proportions are statistically different?


Case Study

Recall the question that was actually asked in the CPR study reported in the NEJM.

• Do we need to give mouth-to-mouth ventilation and chest compression?

• Or will just doing chest compression alone be just as effective?


Summary:

• In the Seattle study, heart-attack victims were randomly assigned to two groups: full CPR or chest compression alone.

• They found a 10.4% survival rate for those receiving full CPR (x = 29, n = 278) and a 14.6% survival rate for those receiving chest compression alone (x = 35, n = 240).

• The trial was designed to detect a 3.5% improvement of chest compression alone over full CPR.


Question:

• Is there any difference in the survival proportions of dispatcher-instructed, bystander-administered CPR depending on whether mouth-to-mouth ventilation is used or not?
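One common way to approach this question is a confidence interval for the difference between the two observed proportions. The sketch below uses the normal approximation with an unpooled standard error; this is an assumed, standard textbook choice and not necessarily the method used in the study or intended for the exercise. The function name `diff_ci` is illustrative.

```python
import math

def diff_ci(x1, n1, x2, n2, z=1.96):
    """Normal-approximation CI for p2 - p1 with an unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    # variances of the two independent sample proportions add
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p2 - p1
    return diff, diff - z * se, diff + z * se

# Group 1: full CPR (x = 29, n = 278); group 2: chest compression alone (x = 35, n = 240)
diff, lower, upper = diff_ci(29, 278, 35, 240)
print(f"difference = {diff:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
print("interval contains 0:", lower <= 0 <= upper)
```

Under these assumptions the interval runs from roughly −1.6 to 9.9 percentage points and includes zero, so on this approach the observed 4.2-point difference would not be declared statistically significant at the 5% level.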


Exercise

• Briefly write out the 10 steps as they pertain to comparing two observed proportions.

• Do your best to make a statement about each step, even if you are unsure of the statistical terms or formulas that will be applied.