scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how...

246
Lecture 15: Power and Confidence Intervals/Interval Estimation API-201Z Maya Sen Harvard Kennedy School http://scholar.harvard.edu/msen

Transcript of scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how...

Page 1: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Lecture 15:Power and

Confidence Intervals/Interval EstimationAPI-201Z

Maya Sen

Harvard Kennedy Schoolhttp://scholar.harvard.edu/msen

Page 2: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Announcements

I Second midterm just over two weeks, November 15

I Will be in-class, closed book and closed note (will provideformula sheet, probability tables)

I Have posted old exams and problem sets

I Review Session Tuesday 11/13, 4-5:15pm, Rubenstein 304

Page 3: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Announcements

I Second midterm just over two weeks, November 15

I Will be in-class, closed book and closed note (will provideformula sheet, probability tables)

I Have posted old exams and problem sets

I Review Session Tuesday 11/13, 4-5:15pm, Rubenstein 304

Page 4: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Announcements

I Second midterm just over two weeks, November 15

I Will be in-class, closed book and closed note (will provideformula sheet, probability tables)

I Have posted old exams and problem sets

I Review Session Tuesday 11/13, 4-5:15pm, Rubenstein 304

Page 5: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Announcements

I Second midterm just over two weeks, November 15

I Will be in-class, closed book and closed note (will provideformula sheet, probability tables)

I Have posted old exams and problem sets

I Review Session Tuesday 11/13, 4-5:15pm, Rubenstein 304

Page 6: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Announcements

I Second midterm just over two weeks, November 15

I Will be in-class, closed book and closed note (will provideformula sheet, probability tables)

I Have posted old exams and problem sets

I Review Session Tuesday 11/13, 4-5:15pm, Rubenstein 304

Page 7: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Roadmap

I Be comfortable with four kinds of hypothesis tests:I Single meanI Difference in meansI Single proportionI Difference in proportions

I Proper interpretation of Hypothesis TestsI Today:

I Power for hypothesis testingI Interval estimationI Confidence IntervalsI Proper Interpretation

Page 8: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Roadmap

I Be comfortable with four kinds of hypothesis tests:

I Single meanI Difference in meansI Single proportionI Difference in proportions

I Proper interpretation of Hypothesis TestsI Today:

I Power for hypothesis testingI Interval estimationI Confidence IntervalsI Proper Interpretation

Page 9: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Roadmap

I Be comfortable with four kinds of hypothesis tests:I Single mean

I Difference in meansI Single proportionI Difference in proportions

I Proper interpretation of Hypothesis TestsI Today:

I Power for hypothesis testingI Interval estimationI Confidence IntervalsI Proper Interpretation

Page 10: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Roadmap

I Be comfortable with four kinds of hypothesis tests:I Single meanI Difference in means

I Single proportionI Difference in proportions

I Proper interpretation of Hypothesis TestsI Today:

I Power for hypothesis testingI Interval estimationI Confidence IntervalsI Proper Interpretation

Page 11: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Roadmap

I Be comfortable with four kinds of hypothesis tests:I Single meanI Difference in meansI Single proportion

I Difference in proportions

I Proper interpretation of Hypothesis TestsI Today:

I Power for hypothesis testingI Interval estimationI Confidence IntervalsI Proper Interpretation

Page 12: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Roadmap

I Be comfortable with four kinds of hypothesis tests:I Single meanI Difference in meansI Single proportionI Difference in proportions

I Proper interpretation of Hypothesis TestsI Today:

I Power for hypothesis testingI Interval estimationI Confidence IntervalsI Proper Interpretation

Page 13: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Roadmap

I Be comfortable with four kinds of hypothesis tests:I Single meanI Difference in meansI Single proportionI Difference in proportions

I Proper interpretation of Hypothesis Tests

I Today:I Power for hypothesis testingI Interval estimationI Confidence IntervalsI Proper Interpretation

Page 14: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Roadmap

I Be comfortable with four kinds of hypothesis tests:I Single meanI Difference in meansI Single proportionI Difference in proportions

I Proper interpretation of Hypothesis TestsI Today:

I Power for hypothesis testingI Interval estimationI Confidence IntervalsI Proper Interpretation

Page 15: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Roadmap

I Be comfortable with four kinds of hypothesis tests:I Single meanI Difference in meansI Single proportionI Difference in proportions

I Proper interpretation of Hypothesis TestsI Today:

I Power for hypothesis testing

I Interval estimationI Confidence IntervalsI Proper Interpretation

Page 16: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Roadmap

I Be comfortable with four kinds of hypothesis tests:I Single meanI Difference in meansI Single proportionI Difference in proportions

I Proper interpretation of Hypothesis TestsI Today:

I Power for hypothesis testingI Interval estimation

I Confidence IntervalsI Proper Interpretation

Page 17: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Roadmap

I Be comfortable with four kinds of hypothesis tests:I Single meanI Difference in meansI Single proportionI Difference in proportions

I Proper interpretation of Hypothesis TestsI Today:

I Power for hypothesis testingI Interval estimationI Confidence Intervals

I Proper Interpretation

Page 18: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Roadmap

I Be comfortable with four kinds of hypothesis tests:I Single meanI Difference in meansI Single proportionI Difference in proportions

I Proper interpretation of Hypothesis TestsI Today:

I Power for hypothesis testingI Interval estimationI Confidence IntervalsI Proper Interpretation

Page 19: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type I versus Type II Errors

H0 is true H0 is not true

Reject H0

Do not reject H0

Page 20: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type I versus Type II Errors

H0 is true H0 is not true

Reject H0

Do not reject H0

Page 21: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type I versus Type II Errors

H0 is true H0 is not true

Reject H0 GoodDo not reject H0 Good

Page 22: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type I versus Type II Errors

H0 is true H0 is not true

Reject H0 Bad! GoodDo not reject H0 Good Bad!

Page 23: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type I versus Type II Errors

H0 is true H0 is not true

Reject H0 Type 1 GoodDo not reject H0 Good Type 2

Page 24: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type I versus Type II Errors

I Both types of errors can occur w/ a hypothesis test

I Good practice: Before study carried out, set limits on howmuch error we would tolerate

Page 25: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type I versus Type II Errors

I Both types of errors can occur w/ a hypothesis test

I Good practice: Before study carried out, set limits on howmuch error we would tolerate

Page 26: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type I versus Type II Errors

I Both types of errors can occur w/ a hypothesis test

I Good practice: Before study carried out, set limits on howmuch error we would tolerate

Page 27: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type I Error

I Type I: Null hypothesis rejected when in fact it is trueI Akin to false positive

I Mammogram analogy: Positive test, despite not having thedisease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance (which gives us acritical value)

I Rejection region: Values of X to left/right of values of thecritical value for which the hypothesis test would reject thenull

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Page 28: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type I Error

I Type I: Null hypothesis rejected when in fact it is true

I Akin to false positiveI Mammogram analogy: Positive test, despite not having the

disease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance (which gives us acritical value)

I Rejection region: Values of X to left/right of values of thecritical value for which the hypothesis test would reject thenull

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Page 29: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type I Error

I Type I: Null hypothesis rejected when in fact it is trueI Akin to false positive

I Mammogram analogy: Positive test, despite not having thedisease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance (which gives us acritical value)

I Rejection region: Values of X to left/right of values of thecritical value for which the hypothesis test would reject thenull

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Page 30: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type I Error

I Type I: Null hypothesis rejected when in fact it is trueI Akin to false positive

I Mammogram analogy: Positive test, despite not having thedisease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance (which gives us acritical value)

I Rejection region: Values of X to left/right of values of thecritical value for which the hypothesis test would reject thenull

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Page 31: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type I Error

I Type I: Null hypothesis rejected when in fact it is trueI Akin to false positive

I Mammogram analogy: Positive test, despite not having thedisease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance (which gives us acritical value)

I Rejection region: Values of X to left/right of values of thecritical value for which the hypothesis test would reject thenull

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Page 32: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type I Error

I Type I: Null hypothesis rejected when in fact it is trueI Akin to false positive

I Mammogram analogy: Positive test, despite not having thedisease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance (which gives us acritical value)

I Rejection region: Values of X to left/right of values of thecritical value for which the hypothesis test would reject thenull

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Page 33: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type I Error

I Type I: Null hypothesis rejected when in fact it is trueI Akin to false positive

I Mammogram analogy: Positive test, despite not having thedisease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance (which gives us acritical value)

I Rejection region: Values of X to left/right of values of thecritical value for which the hypothesis test would reject thenull

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Page 34: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type I Error

I Type I: Null hypothesis rejected when in fact it is trueI Akin to false positive

I Mammogram analogy: Positive test, despite not having thedisease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance (which gives us acritical value)

I Rejection region: Values of X to left/right of values of thecritical value for which the hypothesis test would reject thenull

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Page 35: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type I Error

I Type I: Null hypothesis rejected when in fact it is trueI Akin to false positive

I Mammogram analogy: Positive test, despite not having thedisease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance (which gives us acritical value)

I Rejection region: Values of X to left/right of values of thecritical value for which the hypothesis test would reject thenull

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Page 36: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type I Error

I Type I: Null hypothesis rejected when in fact it is trueI Akin to false positive

I Mammogram analogy: Positive test, despite not having thedisease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance (which gives us acritical value)

I Rejection region: Values of X to left/right of values of thecritical value for which the hypothesis test would reject thenull

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Page 37: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type II Error

I Type II: Null hypothesis is not rejected when in fact it is falseI Akin to false negative

I Mammogram analogy: Negative test, despite having thedisease

I P(Type II error) = P(Not rejecting H0|H0 false)

I Often set at P(Type II error) = 0.20 = β

Page 38: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type II Error

I Type II: Null hypothesis is not rejected when in fact it is false

I Akin to false negativeI Mammogram analogy: Negative test, despite having the

disease

I P(Type II error) = P(Not rejecting H0|H0 false)

I Often set at P(Type II error) = 0.20 = β

Page 39: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type II Error

I Type II: Null hypothesis is not rejected when in fact it is falseI Akin to false negative

I Mammogram analogy: Negative test, despite having thedisease

I P(Type II error) = P(Not rejecting H0|H0 false)

I Often set at P(Type II error) = 0.20 = β

Page 40: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type II Error

I Type II: Null hypothesis is not rejected when in fact it is falseI Akin to false negative

I Mammogram analogy: Negative test, despite having thedisease

I P(Type II error) = P(Not rejecting H0|H0 false)

I Often set at P(Type II error) = 0.20 = β

Page 41: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type II Error

I Type II: Null hypothesis is not rejected when in fact it is falseI Akin to false negative

I Mammogram analogy: Negative test, despite having thedisease

I P(Type II error) = P(Not rejecting H0|H0 false)

I Often set at P(Type II error) = 0.20 = β

Page 42: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type II Error

I Type II: Null hypothesis is not rejected when in fact it is falseI Akin to false negative

I Mammogram analogy: Negative test, despite having thedisease

I P(Type II error) = P(Not rejecting H0|H0 false)

I Often set at P(Type II error) = 0.20 = β

Page 43: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type II Error and Power

I 1 − β = P(Rejecting H0|H0 false)

I Probability of correctly rejecting the null

I Known as the power of a test

I Sometimes considered less important from substantiveperspective

I However: consider your specific problem → in some instances,either error may be more costly

I Note: Exact power calculation will depend on your alternativeHa

Page 44: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type II Error and Power

I 1 − β = P(Rejecting H0|H0 false)

I Probability of correctly rejecting the null

I Known as the power of a test

I Sometimes considered less important from substantiveperspective

I However: consider your specific problem → in some instances,either error may be more costly

I Note: Exact power calculation will depend on your alternativeHa

Page 45: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type II Error and Power

I 1 − β = P(Rejecting H0|H0 false)

I Probability of correctly rejecting the null

I Known as the power of a test

I Sometimes considered less important from substantiveperspective

I However: consider your specific problem → in some instances,either error may be more costly

I Note: Exact power calculation will depend on your alternativeHa

Page 46: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type II Error and Power

I 1 − β = P(Rejecting H0|H0 false)

I Probability of correctly rejecting the null

I Known as the power of a test

I Sometimes considered less important from substantiveperspective

I However: consider your specific problem → in some instances,either error may be more costly

I Note: Exact power calculation will depend on your alternativeHa

Page 47: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type II Error and Power

I 1 − β = P(Rejecting H0|H0 false)

I Probability of correctly rejecting the null

I Known as the power of a test

I Sometimes considered less important from substantiveperspective

I However: consider your specific problem → in some instances,either error may be more costly

I Note: Exact power calculation will depend on your alternativeHa

Page 48: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type II Error and Power

I 1 − β = P(Rejecting H0|H0 false)

I Probability of correctly rejecting the null

I Known as the power of a test

I Sometimes considered less important from substantiveperspective

I However: consider your specific problem → in some instances,either error may be more costly

I Note: Exact power calculation will depend on your alternativeHa

Page 49: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Type II Error and Power

I 1 − β = P(Rejecting H0|H0 false)

I Probability of correctly rejecting the null

I Known as the power of a test

I Sometimes considered less important from substantiveperspective

I However: consider your specific problem → in some instances,either error may be more costly

I Note: Exact power calculation will depend on your alternativeHa

Page 50: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analyses

I Power Analysis: If a given alternative hypothesis were true,how good would our test be at (correctly) rejecting the null?

I Somewhat similar to hypothesis test set up, but for purposesof calculating power, assume alternative true:P(Rejecting H0|Ha true)

I Usually approached in one of two ways:I 1) You are asked to calculate the sample size you will need to

detect an effect (difference between null and alternative) of acertain size

I Need to state a precise alternative null

I 2) You are asked to calculate what is the smallest effect(difference) you could detect given a sample size

→ Both mean that power analyses usually done before data arecollected (and $ spent!)

Page 51: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analyses

I Power Analysis: If a given alternative hypothesis were true,how good would our test be at (correctly) rejecting the null?

I Somewhat similar to hypothesis test set up, but for purposesof calculating power, assume alternative true:P(Rejecting H0|Ha true)

I Usually approached in one of two ways:I 1) You are asked to calculate the sample size you will need to

detect an effect (difference between null and alternative) of acertain size

I Need to state a precise alternative null

I 2) You are asked to calculate what is the smallest effect(difference) you could detect given a sample size

→ Both mean that power analyses usually done before data arecollected (and $ spent!)

Page 52: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analyses

I Power Analysis: If a given alternative hypothesis were true,how good would our test be at (correctly) rejecting the null?

I Somewhat similar to hypothesis test set up, but for purposesof calculating power, assume alternative true:P(Rejecting H0|Ha true)

I Usually approached in one of two ways:I 1) You are asked to calculate the sample size you will need to

detect an effect (difference between null and alternative) of acertain size

I Need to state a precise alternative null

I 2) You are asked to calculate what is the smallest effect(difference) you could detect given a sample size

→ Both mean that power analyses usually done before data arecollected (and $ spent!)

Page 53: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analyses

I Power Analysis: If a given alternative hypothesis were true,how good would our test be at (correctly) rejecting the null?

I Somewhat similar to hypothesis test set up, but for purposesof calculating power, assume alternative true:P(Rejecting H0|Ha true)

I Usually approached in one of two ways:

I 1) You are asked to calculate the sample size you will need todetect an effect (difference between null and alternative) of acertain size

I Need to state a precise alternative null

I 2) You are asked to calculate what is the smallest effect(difference) you could detect given a sample size

→ Both mean that power analyses usually done before data arecollected (and $ spent!)

Page 54: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analyses

I Power Analysis: If a given alternative hypothesis were true,how good would our test be at (correctly) rejecting the null?

I Somewhat similar to hypothesis test set up, but for purposesof calculating power, assume alternative true:P(Rejecting H0|Ha true)

I Usually approached in one of two ways:I 1) You are asked to calculate the sample size you will need to

detect an effect (difference between null and alternative) of acertain size

I Need to state a precise alternative null

I 2) You are asked to calculate what is the smallest effect(difference) you could detect given a sample size

→ Both mean that power analyses usually done before data arecollected (and $ spent!)

Page 55: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analyses

I Power Analysis: If a given alternative hypothesis were true,how good would our test be at (correctly) rejecting the null?

I Somewhat similar to hypothesis test set up, but for purposesof calculating power, assume alternative true:P(Rejecting H0|Ha true)

I Usually approached in one of two ways:I 1) You are asked to calculate the sample size you will need to

detect an effect (difference between null and alternative) of acertain size

I Need to state a precise alternative null

I 2) You are asked to calculate what is the smallest effect(difference) you could detect given a sample size

→ Both mean that power analyses usually done before data arecollected (and $ spent!)

Page 56: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analyses

I Power Analysis: If a given alternative hypothesis were true,how good would our test be at (correctly) rejecting the null?

I Somewhat similar to hypothesis test set up, but for purposesof calculating power, assume alternative true:P(Rejecting H0|Ha true)

I Usually approached in one of two ways:I 1) You are asked to calculate the sample size you will need to

detect an effect (difference between null and alternative) of acertain size

I Need to state a precise alternative null

I 2) You are asked to calculate what is the smallest effect(difference) you could detect given a sample size

→ Both mean that power analyses usually done before data arecollected (and $ spent!)

Page 57: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analyses

I Power Analysis: If a given alternative hypothesis were true,how good would our test be at (correctly) rejecting the null?

I Somewhat similar to hypothesis test set up, but for purposesof calculating power, assume alternative true:P(Rejecting H0|Ha true)

I Usually approached in one of two ways:I 1) You are asked to calculate the sample size you will need to

detect an effect (difference between null and alternative) of acertain size

I Need to state a precise alternative null

I 2) You are asked to calculate what is the smallest effect(difference) you could detect given a sample size

→ Both mean that power analyses usually done before data arecollected (and $ spent!)

Page 58: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I You are an urban policy expert studying the effects of recentconstruction on commuting times

I Suppose known average time spent commuting is 110minutes/week, with standard deviation of 30 minutes

I Take sample of 81 commutersI Find power of a hypothesis test when alternative hypothesis is

that commuting hours equals 120 minutesI So alternative is that the effect of construction is 20 minutes

I So H0 = 110, HA = 120, σ = 30, n = 81

Page 59: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I You are an urban policy expert studying the effects of recentconstruction on commuting times

I Suppose known average time spent commuting is 110minutes/week, with standard deviation of 30 minutes

I Take sample of 81 commutersI Find power of a hypothesis test when alternative hypothesis is

that commuting hours equals 120 minutesI So alternative is that the effect of construction is 20 minutes

I So H0 = 110, HA = 120, σ = 30, n = 81

Page 60: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I You are an urban policy expert studying the effects of recentconstruction on commuting times

I Suppose known average time spent commuting is 110minutes/week, with standard deviation of 30 minutes

I Take sample of 81 commutersI Find power of a hypothesis test when alternative hypothesis is

that commuting hours equals 120 minutesI So alternative is that the effect of construction is 20 minutes

I So H0 = 110, HA = 120, σ = 30, n = 81

Page 61: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I You are an urban policy expert studying the effects of recentconstruction on commuting times

I Suppose known average time spent commuting is 110minutes/week, with standard deviation of 30 minutes

I Take sample of 81 commuters

I Find power of a hypothesis test when alternative hypothesis isthat commuting hours equals 120 minutes

I So alternative is that the effect of construction is 20 minutes

I So H0 = 110, HA = 120, σ = 30, n = 81

Page 62: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I You are an urban policy expert studying the effects of recentconstruction on commuting times

I Suppose known average time spent commuting is 110minutes/week, with standard deviation of 30 minutes

I Take sample of 81 commutersI Find power of a hypothesis test when alternative hypothesis is

that commuting hours equals 120 minutes

I So alternative is that the effect of construction is 20 minutes

I So H0 = 110, HA = 120, σ = 30, n = 81

Page 63: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I You are an urban policy expert studying the effects of recentconstruction on commuting times

I Suppose known average time spent commuting is 110minutes/week, with standard deviation of 30 minutes

I Take sample of 81 commutersI Find power of a hypothesis test when alternative hypothesis is

that commuting hours equals 120 minutesI So alternative is that the effect of construction is 20 minutes

I So H0 = 110, HA = 120, σ = 30, n = 81

Page 64: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I You are an urban policy expert studying the effects of recentconstruction on commuting times

I Suppose known average time spent commuting is 110minutes/week, with standard deviation of 30 minutes

I Take sample of 81 commutersI Find power of a hypothesis test when alternative hypothesis is

that commuting hours equals 120 minutesI So alternative is that the effect of construction is 20 minutes

I So H0 = 110, HA = 120, σ = 30, n = 81

Page 65: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

Steps:

I Because you’ll eventually do a hypothesis test assuming thenull, first calculate the rejection region under standard set-up(given your α values)

I Calculate critical values to define the rejection region

I Assume alternative hypothesis is true, µA = 120

I Calculate probability of not being in rejection region when thisalternative is true (this is β)

I And then finally calculate 1 − β to get Power

I Note: Can repeat for different values of µA (e.g.,µA = 120, 130, 140, etc) to plot a power curve or powerfunction

Page 66: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

Steps:

I Because you’ll eventually do a hypothesis test assuming thenull, first calculate the rejection region under standard set-up(given your α values)

I Calculate critical values to define the rejection region

I Assume alternative hypothesis is true, µA = 120

I Calculate probability of not being in rejection region when thisalternative is true (this is β)

I And then finally calculate 1 − β to get Power

I Note: Can repeat for different values of µA (e.g.,µA = 120, 130, 140, etc) to plot a power curve or powerfunction

Page 67: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

Steps:

I Because you’ll eventually do a hypothesis test assuming thenull, first calculate the rejection region under standard set-up(given your α values)

I Calculate critical values to define the rejection region

I Assume alternative hypothesis is true, µA = 120

I Calculate probability of not being in rejection region when thisalternative is true (this is β)

I And then finally calculate 1 − β to get Power

I Note: Can repeat for different values of µA (e.g.,µA = 120, 130, 140, etc) to plot a power curve or powerfunction

Page 68: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

Steps:

I Because you’ll eventually do a hypothesis test assuming thenull, first calculate the rejection region under standard set-up(given your α values)

I Calculate critical values to define the rejection region

I Assume alternative hypothesis is true, µA = 120

I Calculate probability of not being in rejection region when thisalternative is true (this is β)

I And then finally calculate 1 − β to get Power

I Note: Can repeat for different values of µA (e.g.,µA = 120, 130, 140, etc) to plot a power curve or powerfunction

Page 69: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

Steps:

I Because you’ll eventually do a hypothesis test assuming thenull, first calculate the rejection region under standard set-up(given your α values)

I Calculate critical values to define the rejection region

I Assume alternative hypothesis is true, µA = 120

I Calculate probability of not being in rejection region when thisalternative is true (this is β)

I And then finally calculate 1 − β to get Power

I Note: Can repeat for different values of µA (e.g.,µA = 120, 130, 140, etc) to plot a power curve or powerfunction

Page 70: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

Steps:

I Because you’ll eventually do a hypothesis test assuming thenull, first calculate the rejection region under standard set-up(given your α values)

I Calculate critical values to define the rejection region

I Assume alternative hypothesis is true, µA = 120

I Calculate probability of not being in rejection region when thisalternative is true (this is β)

I And then finally calculate 1 − β to get Power

I Note: Can repeat for different values of µA (e.g.,µA = 120, 130, 140, etc) to plot a power curve or powerfunction

Page 71: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

Steps:

I Because you’ll eventually do a hypothesis test assuming thenull, first calculate the rejection region under standard set-up(given your α values)

I Calculate critical values to define the rejection region

I Assume alternative hypothesis is true, µA = 120

I Calculate probability of not being in rejection region when thisalternative is true (this is β)

I And then finally calculate 1 − β to get Power

I Note: Can repeat for different values of µA (e.g.,µA = 120, 130, 140, etc) to plot a power curve or powerfunction

Page 72: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

Steps:

I Because you’ll eventually do a hypothesis test assuming thenull, first calculate the rejection region under standard set-up(given your α values)

I Calculate critical values to define the rejection region

I Assume alternative hypothesis is true, µA = 120

I Calculate probability of not being in rejection region when thisalternative is true (this is β)

I And then finally calculate 1 − β to get Power

I Note: Can repeat for different values of µA (e.g.,µA = 120, 130, 140, etc) to plot a power curve or powerfunction

Page 73: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Intuition

100 110 120 130

0.00

0.05

0.10

0.15

Normal distribution under H0 and H1

x

dens

ity

Distribution under H0

Page 74: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Intuition

100 110 120 130

0.00

0.05

0.10

0.15

Normal distribution under H0 and H1

x

dens

ity

Distribution under H0

Critical value

Page 75: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Intuition

100 110 120 130

0.00

0.05

0.10

0.15

Normal distribution under H0 and H1

x

dens

ity

Distribution under H0

Critical value

α

Page 76: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Intuition

100 110 120 130

0.00

0.05

0.10

0.15

Normal distribution under H0 and H1

x

dens

ity

Distribution under H0

Critical value

α

Distribution under H1

Page 77: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Intuition

100 110 120 130

0.00

0.05

0.10

0.15

Normal distribution under H0 and H1

x

dens

ity

Distribution under H0

Critical value

α

Distribution under H1

β

Page 78: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Intuition

100 110 120 130

0.00

0.05

0.10

0.15

Normal distribution under H0 and H1

x

dens

ity

Distribution under H0

Critical value

α

Distribution under H1

β

Power

Page 79: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I Step 1: Calculate rejection region under the null hypothesisbeing true

I Remember that test statistic for single mean is

z =X − µ

σ/√n

I Assuming one-tailed test, we reject H0 if z > 1.645 (α = 0.05)

I Step 2: Calculate critical value

1.645 <X − 110

30/√

81

115.48 < X

I That is, we would reject null for all X > 115.48 (our rejectionregion)

Page 80: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I Step 1: Calculate rejection region under the null hypothesisbeing true

I Remember that test statistic for single mean is

z =X − µ

σ/√n

I Assuming one-tailed test, we reject H0 if z > 1.645 (α = 0.05)

I Step 2: Calculate critical value

1.645 <X − 110

30/√

81

115.48 < X

I That is, we would reject null for all X > 115.48 (our rejectionregion)

Page 81: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I Step 1: Calculate rejection region under the null hypothesisbeing true

I Remember that test statistic for single mean is

z =X − µ

σ/√n

I Assuming one-tailed test, we reject H0 if z > 1.645 (α = 0.05)

I Step 2: Calculate critical value

1.645 <X − 110

30/√

81

115.48 < X

I That is, we would reject null for all X > 115.48 (our rejectionregion)

Page 82: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I Step 1: Calculate rejection region under the null hypothesisbeing true

I Remember that test statistic for single mean is

z =X − µ

σ/√n

I Assuming one-tailed test, we reject H0 if z > 1.645 (α = 0.05)

I Step 2: Calculate critical value

1.645 <X − 110

30/√

81

115.48 < X

I That is, we would reject null for all X > 115.48 (our rejectionregion)

Page 83: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I Step 1: Calculate rejection region under the null hypothesisbeing true

I Remember that test statistic for single mean is

z =X − µ

σ/√n

I Assuming one-tailed test, we reject H0 if z > 1.645 (α = 0.05)

I Step 2: Calculate critical value

1.645 <X − 110

30/√

81

115.48 < X

I That is, we would reject null for all X > 115.48 (our rejectionregion)

Page 84: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I Step 1: Calculate rejection region under the null hypothesisbeing true

I Remember that test statistic for single mean is

z =X − µ

σ/√n

I Assuming one-tailed test, we reject H0 if z > 1.645 (α = 0.05)

I Step 2: Calculate critical value

1.645 <X − 110

30/√

81

115.48 < X

I That is, we would reject null for all X > 115.48 (our rejectionregion)

Page 85: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I Step 3: Assume alternative true, µA = 120

I Given alternative being true, how often would we (correctly)reject null?

I That is, what is P(X > 115.48) if µ = 120?

z =X − µ

σ/√n

=115.48 − 120

30/√

81

= −1.36

I Step 4: Finally β = P(Z < −1.36), which is 0.087

I Step 5: Power = 1 − P(Z < −1.36) = 0.913

Page 86: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I Step 3: Assume alternative true, µA = 120

I Given alternative being true, how often would we (correctly)reject null?

I That is, what is P(X > 115.48) if µ = 120?

z =X − µ

σ/√n

=115.48 − 120

30/√

81

= −1.36

I Step 4: Finally β = P(Z < −1.36), which is 0.087

I Step 5: Power = 1 − P(Z < −1.36) = 0.913

Page 87: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I Step 3: Assume alternative true, µA = 120

I Given alternative being true, how often would we (correctly)reject null?

I That is, what is P(X > 115.48) if µ = 120?

z =X − µ

σ/√n

=115.48 − 120

30/√

81

= −1.36

I Step 4: Finally β = P(Z < −1.36), which is 0.087

I Step 5: Power = 1 − P(Z < −1.36) = 0.913

Page 88: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I Step 3: Assume alternative true, µA = 120

I Given alternative being true, how often would we (correctly)reject null?

I That is, what is P(X > 115.48) if µ = 120?

z =X − µ

σ/√n

=115.48 − 120

30/√

81

= −1.36

I Step 4: Finally β = P(Z < −1.36), which is 0.087

I Step 5: Power = 1 − P(Z < −1.36) = 0.913

Page 89: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I Step 3: Assume alternative true, µA = 120

I Given alternative being true, how often would we (correctly)reject null?

I That is, what is P(X > 115.48) if µ = 120?

z =X − µ

σ/√n

=115.48 − 120

30/√

81

= −1.36

I Step 4: Finally β = P(Z < −1.36), which is 0.087

I Step 5: Power = 1 − P(Z < −1.36) = 0.913

Page 90: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I Step 3: Assume alternative true, µA = 120

I Given alternative being true, how often would we (correctly)reject null?

I That is, what is P(X > 115.48) if µ = 120?

z =X − µ

σ/√n

=115.48 − 120

30/√

81

= −1.36

I Step 4: Finally β = P(Z < −1.36), which is 0.087

I Step 5: Power = 1 − P(Z < −1.36) = 0.913

Page 91: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power Analysis Example

I Step 3: Assume alternative true, µA = 120

I Given alternative being true, how often would we (correctly)reject null?

I That is, what is P(X > 115.48) if µ = 120?

z =X − µ

σ/√n

=115.48 − 120

30/√

81

= −1.36

I Step 4: Finally β = P(Z < −1.36), which is 0.087

I Step 5: Power = 1 − P(Z < −1.36) = 0.913

Page 92: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power

I Note: Power dependent on a) sample size, b) your HA (size ofeffect), and c) α level (and so type of test)

I Larger sample size → more power

I Bigger difference between null and alternative you are testing→ requires less power

I A two-sided hypothesis test has less power than the one- sidedhypothesis, since it is more conservative

I Rule of thumb: 80% power

I Higher power may be better, but perhaps not if it comes inchanging the type of test (b/c it affects Type 1 error)

Page 93: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power

I Note: Power dependent on a) sample size, b) your HA (size ofeffect), and c) α level (and so type of test)

I Larger sample size → more power

I Bigger difference between null and alternative you are testing→ requires less power

I A two-sided hypothesis test has less power than the one- sidedhypothesis, since it is more conservative

I Rule of thumb: 80% power

I Higher power may be better, but perhaps not if it comes inchanging the type of test (b/c it affects Type 1 error)

Page 94: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power

I Note: Power dependent on a) sample size, b) your HA (size ofeffect), and c) α level (and so type of test)

I Larger sample size → more power

I Bigger difference between null and alternative you are testing→ requires less power

I A two-sided hypothesis test has less power than the one- sidedhypothesis, since it is more conservative

I Rule of thumb: 80% power

I Higher power may be better, but perhaps not if it comes inchanging the type of test (b/c it affects Type 1 error)

Page 95: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power

I Note: Power dependent on a) sample size, b) your HA (size ofeffect), and c) α level (and so type of test)

I Larger sample size → more power

I Bigger difference between null and alternative you are testing→ requires less power

I A two-sided hypothesis test has less power than the one- sidedhypothesis, since it is more conservative

I Rule of thumb: 80% power

I Higher power may be better, but perhaps not if it comes inchanging the type of test (b/c it affects Type 1 error)

Page 96: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power

I Note: Power dependent on a) sample size, b) your HA (size ofeffect), and c) α level (and so type of test)

I Larger sample size → more power

I Bigger difference between null and alternative you are testing→ requires less power

I A two-sided hypothesis test has less power than the one- sidedhypothesis, since it is more conservative

I Rule of thumb: 80% power

I Higher power may be better, but perhaps not if it comes inchanging the type of test (b/c it affects Type 1 error)

Page 97: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power

I Note: Power dependent on a) sample size, b) your HA (size ofeffect), and c) α level (and so type of test)

I Larger sample size → more power

I Bigger difference between null and alternative you are testing→ requires less power

I A two-sided hypothesis test has less power than the one- sidedhypothesis, since it is more conservative

I Rule of thumb: 80% power

I Higher power may be better, but perhaps not if it comes inchanging the type of test (b/c it affects Type 1 error)

Page 98: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Power

I Note: Power dependent on a) sample size, b) your HA (size ofeffect), and c) α level (and so type of test)

I Larger sample size → more power

I Bigger difference between null and alternative you are testing→ requires less power

I A two-sided hypothesis test has less power than the one- sidedhypothesis, since it is more conservative

I Rule of thumb: 80% power

I Higher power may be better, but perhaps not if it comes inchanging the type of test (b/c it affects Type 1 error)

Page 99: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Interval Estimation

I Hypothesis testing → Useful to compare sample/s against anull hypothesis

I Interval estimation → Useful for calculating a range ofpossible values for the true population proportion/mean

I Most commonly used are confidence intervals (CIs)

I Takes into account not only the point estimate (for example,π), but also variability and sample size

I Caution: Frequently incorrectly interpreted!

I Most basic example → Confidence interval for a mean

Page 100: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Interval Estimation

I Hypothesis testing → Useful to compare sample/s against anull hypothesis

I Interval estimation → Useful for calculating a range ofpossible values for the true population proportion/mean

I Most commonly used are confidence intervals (CIs)

I Takes into account not only the point estimate (for example,π), but also variability and sample size

I Caution: Frequently incorrectly interpreted!

I Most basic example → Confidence interval for a mean

Page 101: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Interval Estimation

I Hypothesis testing → Useful to compare sample/s against anull hypothesis

I Interval estimation → Useful for calculating a range ofpossible values for the true population proportion/mean

I Most commonly used are confidence intervals (CIs)

I Takes into account not only the point estimate (for example,π), but also variability and sample size

I Caution: Frequently incorrectly interpreted!

I Most basic example → Confidence interval for a mean

Page 102: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Interval Estimation

I Hypothesis testing → Useful to compare sample/s against anull hypothesis

I Interval estimation → Useful for calculating a range ofpossible values for the true population proportion/mean

I Most commonly used are confidence intervals (CIs)

I Takes into account not only the point estimate (for example,π), but also variability and sample size

I Caution: Frequently incorrectly interpreted!

I Most basic example → Confidence interval for a mean

Page 103: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Interval Estimation

I Hypothesis testing → Useful to compare sample/s against anull hypothesis

I Interval estimation → Useful for calculating a range ofpossible values for the true population proportion/mean

I Most commonly used are confidence intervals (CIs)

I Takes into account not only the point estimate (for example,π), but also variability and sample size

I Caution: Frequently incorrectly interpreted!

I Most basic example → Confidence interval for a mean

Page 104: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Interval Estimation

I Hypothesis testing → Useful to compare sample/s against anull hypothesis

I Interval estimation → Useful for calculating a range ofpossible values for the true population proportion/mean

I Most commonly used are confidence intervals (CIs)

I Takes into account not only the point estimate (for example,π), but also variability and sample size

I Caution: Frequently incorrectly interpreted!

I Most basic example → Confidence interval for a mean

Page 105: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Interval Estimation

I Hypothesis testing → Useful to compare sample/s against anull hypothesis

I Interval estimation → Useful for calculating a range ofpossible values for the true population proportion/mean

I Most commonly used are confidence intervals (CIs)

I Takes into account not only the point estimate (for example,π), but also variability and sample size

I Caution: Frequently incorrectly interpreted!

I Most basic example → Confidence interval for a mean

Page 106: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

Use our old friend, the CLT:

I (1) The sums and means of random samples of observationshave an approximately normal distribution

I (2) This distribution becomes more and more normal the moreobservations are included in the sum or the mean

I Via the large of large numbers, this will be centered aroundthe true population mean/proportion

I CLT tells us about the underlying behavior of the sampleproportion/mean across all different kinds of data

Page 107: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

Use our old friend, the CLT:

I (1) The sums and means of random samples of observationshave an approximately normal distribution

I (2) This distribution becomes more and more normal the moreobservations are included in the sum or the mean

I Via the large of large numbers, this will be centered aroundthe true population mean/proportion

I CLT tells us about the underlying behavior of the sampleproportion/mean across all different kinds of data

Page 108: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

Use our old friend, the CLT:

I (1) The sums and means of random samples of observationshave an approximately normal distribution

I (2) This distribution becomes more and more normal the moreobservations are included in the sum or the mean

I Via the large of large numbers, this will be centered aroundthe true population mean/proportion

I CLT tells us about the underlying behavior of the sampleproportion/mean across all different kinds of data

Page 109: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

Use our old friend, the CLT:

I (1) The sums and means of random samples of observationshave an approximately normal distribution

I (2) This distribution becomes more and more normal the moreobservations are included in the sum or the mean

I Via the large of large numbers, this will be centered aroundthe true population mean/proportion

I CLT tells us about the underlying behavior of the sampleproportion/mean across all different kinds of data

Page 110: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

Use our old friend, the CLT:

I (1) The sums and means of random samples of observationshave an approximately normal distribution

I (2) This distribution becomes more and more normal the moreobservations are included in the sum or the mean

I Via the large of large numbers, this will be centered aroundthe true population mean/proportion

I CLT tells us about the underlying behavior of the sampleproportion/mean across all different kinds of data

Page 111: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

Use our old friend, the CLT:

I (1) The sums and means of random samples of observationshave an approximately normal distribution

I (2) This distribution becomes more and more normal the moreobservations are included in the sum or the mean

I Via the large of large numbers, this will be centered aroundthe true population mean/proportion

I CLT tells us about the underlying behavior of the sampleproportion/mean across all different kinds of data

Page 112: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Stated more formally for the sample mean:

I If X is the mean of n measurements x1, x2, ..., xn, then as ngoes up, X approaches:

X ∼ N(µ,σ2/n)

I And b/c this is Normal, we can standardize

X − µ

σ/√n

I We replace true standard error, σ2/n with the estimate fromour sample, s:

X − µ

s/√n

Page 113: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Stated more formally for the sample mean:

I If X is the mean of n measurements x1, x2, ..., xn, then as ngoes up, X approaches:

X ∼ N(µ,σ2/n)

I And b/c this is Normal, we can standardize

X − µ

σ/√n

I We replace true standard error, σ2/n with the estimate fromour sample, s:

X − µ

s/√n

Page 114: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Stated more formally for the sample mean:

I If X is the mean of n measurements x1, x2, ..., xn, then as ngoes up, X approaches:

X ∼ N(µ,σ2/n)

I And b/c this is Normal, we can standardize

X − µ

σ/√n

I We replace true standard error, σ2/n with the estimate fromour sample, s:

X − µ

s/√n

Page 115: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Stated more formally for the sample mean:

I If X is the mean of n measurements x1, x2, ..., xn, then as ngoes up, X approaches:

X ∼ N(µ,σ2/n)

I And b/c this is Normal, we can standardize

X − µ

σ/√n

I We replace true standard error, σ2/n with the estimate fromour sample, s:

X − µ

s/√n

Page 116: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Stated more formally for the sample mean:

I If X is the mean of n measurements x1, x2, ..., xn, then as ngoes up, X approaches:

X ∼ N(µ,σ2/n)

I And b/c this is Normal, we can standardize

X − µ

σ/√n

I We replace true standard error, σ2/n with the estimate fromour sample, s:

X − µ

s/√n

Page 117: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Stated more formally for the sample mean:

I If X is the mean of n measurements x1, x2, ..., xn, then as ngoes up, X approaches:

X ∼ N(µ,σ2/n)

I And b/c this is Normal, we can standardize

X − µ

σ/√n

I We replace true standard error, σ2/n with the estimate fromour sample, s:

X − µ

s/√n

Page 118: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Stated more formally for the sample mean:

I If X is the mean of n measurements x1, x2, ..., xn, then as ngoes up, X approaches:

X ∼ N(µ,σ2/n)

I And b/c this is Normal, we can standardize

X − µ

σ/√n

I We replace true standard error, σ2/n with the estimate fromour sample, s:

X − µ

s/√n

Page 119: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Stated more formally for the sample mean:

I If X is the mean of n measurements x1, x2, ..., xn, then as ngoes up, X approaches:

X ∼ N(µ,σ2/n)

I And b/c this is Normal, we can standardize

X − µ

σ/√n

I We replace true standard error, σ2/n with the estimate fromour sample, s:

X − µ

s/√n

Page 120: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Here’s where confidence intervals differ from hypothesis tests:I For CI’s: Leverage fact that we know:

I Approx 68% of probability falling within 1 SD of meanI Approx 95% of probability falling within 2 SD of meanI Approx 99.7% of probability falling within 3 SD of mean

I Can be more exact than this

I For example, know that 95% of probability mass of a standardNormal falls more precisely between -1.96 and 1.96

I That means we know that:

P(−1.96 6X − µ

s/√n

6 1.96) = 0.95

Page 121: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Here’s where confidence intervals differ from hypothesis tests:

I For CI’s: Leverage fact that we know:I Approx 68% of probability falling within 1 SD of meanI Approx 95% of probability falling within 2 SD of meanI Approx 99.7% of probability falling within 3 SD of mean

I Can be more exact than this

I For example, know that 95% of probability mass of a standardNormal falls more precisely between -1.96 and 1.96

I That means we know that:

P(−1.96 6X − µ

s/√n

6 1.96) = 0.95

Page 122: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Here’s where confidence intervals differ from hypothesis tests:I For CI’s: Leverage fact that we know:

I Approx 68% of probability falling within 1 SD of meanI Approx 95% of probability falling within 2 SD of meanI Approx 99.7% of probability falling within 3 SD of mean

I Can be more exact than this

I For example, know that 95% of probability mass of a standardNormal falls more precisely between -1.96 and 1.96

I That means we know that:

P(−1.96 6X − µ

s/√n

6 1.96) = 0.95

Page 123: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Here’s where confidence intervals differ from hypothesis tests:I For CI’s: Leverage fact that we know:

I Approx 68% of probability falling within 1 SD of mean

I Approx 95% of probability falling within 2 SD of meanI Approx 99.7% of probability falling within 3 SD of mean

I Can be more exact than this

I For example, know that 95% of probability mass of a standardNormal falls more precisely between -1.96 and 1.96

I That means we know that:

P(−1.96 6X − µ

s/√n

6 1.96) = 0.95

Page 124: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Here’s where confidence intervals differ from hypothesis tests:I For CI’s: Leverage fact that we know:

I Approx 68% of probability falling within 1 SD of meanI Approx 95% of probability falling within 2 SD of mean

I Approx 99.7% of probability falling within 3 SD of mean

I Can be more exact than this

I For example, know that 95% of probability mass of a standardNormal falls more precisely between -1.96 and 1.96

I That means we know that:

P(−1.96 6X − µ

s/√n

6 1.96) = 0.95

Page 125: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Here’s where confidence intervals differ from hypothesis tests:I For CI’s: Leverage fact that we know:

I Approx 68% of probability falling within 1 SD of meanI Approx 95% of probability falling within 2 SD of meanI Approx 99.7% of probability falling within 3 SD of mean

I Can be more exact than this

I For example, know that 95% of probability mass of a standardNormal falls more precisely between -1.96 and 1.96

I That means we know that:

P(−1.96 6X − µ

s/√n

6 1.96) = 0.95

Page 126: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Here’s where confidence intervals differ from hypothesis tests:I For CI’s: Leverage fact that we know:

I Approx 68% of probability falling within 1 SD of meanI Approx 95% of probability falling within 2 SD of meanI Approx 99.7% of probability falling within 3 SD of mean

I Can be more exact than this

I For example, know that 95% of probability mass of a standardNormal falls more precisely between -1.96 and 1.96

I That means we know that:

P(−1.96 6X − µ

s/√n

6 1.96) = 0.95

Page 127: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Here’s where confidence intervals differ from hypothesis tests:I For CI’s: Leverage fact that we know:

I Approx 68% of probability falling within 1 SD of meanI Approx 95% of probability falling within 2 SD of meanI Approx 99.7% of probability falling within 3 SD of mean

I Can be more exact than this

I For example, know that 95% of probability mass of a standardNormal falls more precisely between -1.96 and 1.96

I That means we know that:

P(−1.96 6X − µ

s/√n

6 1.96) = 0.95

Page 128: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Here’s where confidence intervals differ from hypothesis tests:I For CI’s: Leverage fact that we know:

I Approx 68% of probability falling within 1 SD of meanI Approx 95% of probability falling within 2 SD of meanI Approx 99.7% of probability falling within 3 SD of mean

I Can be more exact than this

I For example, know that 95% of probability mass of a standardNormal falls more precisely between -1.96 and 1.96

I That means we know that:

P(−1.96 6X − µ

s/√n

6 1.96) = 0.95

Page 129: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Here’s where confidence intervals differ from hypothesis tests:I For CI’s: Leverage fact that we know:

I Approx 68% of probability falling within 1 SD of meanI Approx 95% of probability falling within 2 SD of meanI Approx 99.7% of probability falling within 3 SD of mean

I Can be more exact than this

I For example, know that 95% of probability mass of a standardNormal falls more precisely between -1.96 and 1.96

I That means we know that:

P(−1.96 6X − µ

s/√n

6 1.96) = 0.95

Page 130: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Take this and work backwards:

P(−1.96 6X − µ

s/√n

6 1.96) = 0.95

P(−1.96s√n6 X − µ 6 1.96

s√n) = 0.95

P(−X − 1.96s√n6 −µ 6 −X + 1.96

s√n) = 0.95

P(X + 1.96s√n> µ > X − 1.96

s√n) = 0.95

Page 131: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Take this and work backwards:

P(−1.96 6X − µ

s/√n

6 1.96) = 0.95

P(−1.96s√n6 X − µ 6 1.96

s√n) = 0.95

P(−X − 1.96s√n6 −µ 6 −X + 1.96

s√n) = 0.95

P(X + 1.96s√n> µ > X − 1.96

s√n) = 0.95

Page 132: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Take this and work backwards:

P(−1.96 6X − µ

s/√n

6 1.96) = 0.95

P(−1.96s√n6 X − µ 6 1.96

s√n) = 0.95

P(−X − 1.96s√n6 −µ 6 −X + 1.96

s√n) = 0.95

P(X + 1.96s√n> µ > X − 1.96

s√n) = 0.95

Page 133: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Take this and work backwards:

P(−1.96 6X − µ

s/√n

6 1.96) = 0.95

P(−1.96s√n6 X − µ 6 1.96

s√n) = 0.95

P(−X − 1.96s√n6 −µ 6 −X + 1.96

s√n) = 0.95

P(X + 1.96s√n> µ > X − 1.96

s√n) = 0.95

Page 134: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Take this and work backwards:

P(−1.96 6X − µ

s/√n

6 1.96) = 0.95

P(−1.96s√n6 X − µ 6 1.96

s√n) = 0.95

P(−X − 1.96s√n6 −µ 6 −X + 1.96

s√n) = 0.95

P(X + 1.96s√n> µ > X − 1.96

s√n) = 0.95

Page 135: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Take this and work backwards:

P(−1.96 6X − µ

s/√n

6 1.96) = 0.95

P(−1.96s√n6 X − µ 6 1.96

s√n) = 0.95

P(−X − 1.96s√n6 −µ 6 −X + 1.96

s√n) = 0.95

P(X + 1.96s√n> µ > X − 1.96

s√n) = 0.95

Page 136: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I This gives us the 95% confidence interval for X

X ± 1.96× SE [X ]

I This is shorthand [LB,UB]:

LB = X − 1.96× SE [X ]

UB = X + 1.96× SE [X ]

I Can rewrite as general formula for a (1 − α)% confidenceinterval:

X ± zα/2 × SE [X ]

I Where we use zα/2 from standard normal (or tα/2 fromStudent’s t)

Page 137: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I This gives us the 95% confidence interval for X

X ± 1.96× SE [X ]

I This is shorthand [LB,UB]:

LB = X − 1.96× SE [X ]

UB = X + 1.96× SE [X ]

I Can rewrite as general formula for a (1 − α)% confidenceinterval:

X ± zα/2 × SE [X ]

I Where we use zα/2 from standard normal (or tα/2 fromStudent’s t)

Page 138: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I This gives us the 95% confidence interval for X

X ± 1.96× SE [X ]

I This is shorthand [LB,UB]:

LB = X − 1.96× SE [X ]

UB = X + 1.96× SE [X ]

I Can rewrite as general formula for a (1 − α)% confidenceinterval:

X ± zα/2 × SE [X ]

I Where we use zα/2 from standard normal (or tα/2 fromStudent’s t)

Page 139: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I This gives us the 95% confidence interval for X

X ± 1.96× SE [X ]

I This is shorthand [LB,UB]:

LB = X − 1.96× SE [X ]

UB = X + 1.96× SE [X ]

I Can rewrite as general formula for a (1 − α)% confidenceinterval:

X ± zα/2 × SE [X ]

I Where we use zα/2 from standard normal (or tα/2 fromStudent’s t)

Page 140: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I This gives us the 95% confidence interval for X

X ± 1.96× SE [X ]

I This is shorthand [LB,UB]:

LB = X − 1.96× SE [X ]

UB = X + 1.96× SE [X ]

I Can rewrite as general formula for a (1 − α)% confidenceinterval:

X ± zα/2 × SE [X ]

I Where we use zα/2 from standard normal (or tα/2 fromStudent’s t)

Page 141: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I This gives us the 95% confidence interval for X

X ± 1.96× SE [X ]

I This is shorthand [LB,UB]:

LB = X − 1.96× SE [X ]

UB = X + 1.96× SE [X ]

I Can rewrite as general formula for a (1 − α)% confidenceinterval:

X ± zα/2 × SE [X ]

I Where we use zα/2 from standard normal (or tα/2 fromStudent’s t)

Page 142: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I This gives us the 95% confidence interval for X

X ± 1.96× SE [X ]

I This is shorthand [LB,UB]:

LB = X − 1.96× SE [X ]

UB = X + 1.96× SE [X ]

I Can rewrite as general formula for a (1 − α)% confidenceinterval:

X ± zα/2 × SE [X ]

I Where we use zα/2 from standard normal (or tα/2 fromStudent’s t)

Page 143: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I This gives us the 95% confidence interval for X

X ± 1.96× SE [X ]

I This is shorthand [LB,UB]:

LB = X − 1.96× SE [X ]

UB = X + 1.96× SE [X ]

I Can rewrite as general formula for a (1 − α)% confidenceinterval:

X ± zα/2 × SE [X ]

I Where we use zα/2 from standard normal (or tα/2 fromStudent’s t)

Page 144: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Means we can easily calculate confidence intervals forcommonly used α values

I α = 0.05 → 95% Confidence Interval

X ± 1.96SE [X ]

I α = 0.1 → 90% Confidence Interval

X ± 1.645SE [X ]

I α = 0.1 → 99% Confidence Interval

X ± 2.58SE [X ]

Page 145: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Means we can easily calculate confidence intervals forcommonly used α values

I α = 0.05 → 95% Confidence Interval

X ± 1.96SE [X ]

I α = 0.1 → 90% Confidence Interval

X ± 1.645SE [X ]

I α = 0.1 → 99% Confidence Interval

X ± 2.58SE [X ]

Page 146: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Means we can easily calculate confidence intervals forcommonly used α values

I α = 0.05 → 95% Confidence Interval

X ± 1.96SE [X ]

I α = 0.1 → 90% Confidence Interval

X ± 1.645SE [X ]

I α = 0.1 → 99% Confidence Interval

X ± 2.58SE [X ]

Page 147: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Means we can easily calculate confidence intervals forcommonly used α values

I α = 0.05 → 95% Confidence Interval

X ± 1.96SE [X ]

I α = 0.1 → 90% Confidence Interval

X ± 1.645SE [X ]

I α = 0.1 → 99% Confidence Interval

X ± 2.58SE [X ]

Page 148: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Means we can easily calculate confidence intervals forcommonly used α values

I α = 0.05 → 95% Confidence Interval

X ± 1.96SE [X ]

I α = 0.1 → 90% Confidence Interval

X ± 1.645SE [X ]

I α = 0.1 → 99% Confidence Interval

X ± 2.58SE [X ]

Page 149: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Means we can easily calculate confidence intervals forcommonly used α values

I α = 0.05 → 95% Confidence Interval

X ± 1.96SE [X ]

I α = 0.1 → 90% Confidence Interval

X ± 1.645SE [X ]

I α = 0.1 → 99% Confidence Interval

X ± 2.58SE [X ]

Page 150: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Means we can easily calculate confidence intervals forcommonly used α values

I α = 0.05 → 95% Confidence Interval

X ± 1.96SE [X ]

I α = 0.1 → 90% Confidence Interval

X ± 1.645SE [X ]

I α = 0.1 → 99% Confidence Interval

X ± 2.58SE [X ]

Page 151: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Means

I Means we can easily calculate confidence intervals forcommonly used α values

I α = 0.05 → 95% Confidence Interval

X ± 1.96SE [X ]

I α = 0.1 → 90% Confidence Interval

X ± 1.645SE [X ]

I α = 0.1 → 99% Confidence Interval

X ± 2.58SE [X ]

Page 152: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Other Quantities ofInterest

I CLT means we can calculate confidence intervals for anyestimator that approximates the Normal distribution as n goesup

I MeansI Difference in meansI ProportionsI Difference in proportions

I All follow general form of

Point Estimate ± zα/2SE

I where zα/2SE refers to the margin of error

I CI’s most generally are:

Point Estimate ±Margin of Error

Page 153: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Other Quantities ofInterest

I CLT means we can calculate confidence intervals for anyestimator that approximates the Normal distribution as n goesup

I MeansI Difference in meansI ProportionsI Difference in proportions

I All follow general form of

Point Estimate ± zα/2SE

I where zα/2SE refers to the margin of error

I CI’s most generally are:

Point Estimate ±Margin of Error

Page 154: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Other Quantities ofInterest

I CLT means we can calculate confidence intervals for anyestimator that approximates the Normal distribution as n goesup

I Means

I Difference in meansI ProportionsI Difference in proportions

I All follow general form of

Point Estimate ± zα/2SE

I where zα/2SE refers to the margin of error

I CI’s most generally are:

Point Estimate ±Margin of Error

Page 155: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Other Quantities ofInterest

I CLT means we can calculate confidence intervals for anyestimator that approximates the Normal distribution as n goesup

I MeansI Difference in means

I ProportionsI Difference in proportions

I All follow general form of

Point Estimate ± zα/2SE

I where zα/2SE refers to the margin of error

I CI’s most generally are:

Point Estimate ±Margin of Error

Page 156: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Other Quantities ofInterest

I CLT means we can calculate confidence intervals for anyestimator that approximates the Normal distribution as n goesup

I MeansI Difference in meansI Proportions

I Difference in proportions

I All follow general form of

Point Estimate ± zα/2SE

I where zα/2SE refers to the margin of error

I CI’s most generally are:

Point Estimate ±Margin of Error

Page 157: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Other Quantities ofInterest

I CLT means we can calculate confidence intervals for anyestimator that approximates the Normal distribution as n goesup

I MeansI Difference in meansI ProportionsI Difference in proportions

I All follow general form of

Point Estimate ± zα/2SE

I where zα/2SE refers to the margin of error

I CI’s most generally are:

Point Estimate ±Margin of Error

Page 158: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Other Quantities ofInterest

I CLT means we can calculate confidence intervals for anyestimator that approximates the Normal distribution as n goesup

I MeansI Difference in meansI ProportionsI Difference in proportions

I All follow general form of

Point Estimate ± zα/2SE

I where zα/2SE refers to the margin of error

I CI’s most generally are:

Point Estimate ±Margin of Error

Page 159: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Other Quantities ofInterest

I CLT means we can calculate confidence intervals for anyestimator that approximates the Normal distribution as n goesup

I MeansI Difference in meansI ProportionsI Difference in proportions

I All follow general form of

Point Estimate ± zα/2SE

I where zα/2SE refers to the margin of error

I CI’s most generally are:

Point Estimate ±Margin of Error

Page 160: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Other Quantities ofInterest

I CLT means we can calculate confidence intervals for anyestimator that approximates the Normal distribution as n goesup

I MeansI Difference in meansI ProportionsI Difference in proportions

I All follow general form of

Point Estimate ± zα/2SE

I where zα/2SE refers to the margin of error

I CI’s most generally are:

Point Estimate ±Margin of Error

Page 161: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Other Quantities ofInterest

I CLT means we can calculate confidence intervals for anyestimator that approximates the Normal distribution as n goesup

I MeansI Difference in meansI ProportionsI Difference in proportions

I All follow general form of

Point Estimate ± zα/2SE

I where zα/2SE refers to the margin of error

I CI’s most generally are:

Point Estimate ±Margin of Error

Page 162: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Calculating Confidence Intervals for Other Quantities ofInterest

I CLT means we can calculate confidence intervals for anyestimator that approximates the Normal distribution as n goesup

I MeansI Difference in meansI ProportionsI Difference in proportions

I All follow general form of

Point Estimate ± zα/2SE

I where zα/2SE refers to the margin of error

I CI’s most generally are:

Point Estimate ±Margin of Error

Page 163: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Intervals for Proportions

I Sample proportion, π, has a normal sampling distributionunder CLT

π ∼ N(π,π(1 − π)

n)

I Because this is normal, use the same general guidelines, whereSE [π] =

√π(1 − π)/n

I E.g., 95% CI: π± 1.96SE [π]

I E.g., 90% CI: π± 1.645SE [π]

I E.g., 99% CI: π± 2.58SE [π]

Page 164: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Intervals for Proportions

I Sample proportion, π, has a normal sampling distributionunder CLT

π ∼ N(π,π(1 − π)

n)

I Because this is normal, use the same general guidelines, whereSE [π] =

√π(1 − π)/n

I E.g., 95% CI: π± 1.96SE [π]

I E.g., 90% CI: π± 1.645SE [π]

I E.g., 99% CI: π± 2.58SE [π]

Page 165: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Intervals for Proportions

I Sample proportion, π, has a normal sampling distributionunder CLT

π ∼ N(π,π(1 − π)

n)

I Because this is normal, use the same general guidelines, whereSE [π] =

√π(1 − π)/n

I E.g., 95% CI: π± 1.96SE [π]

I E.g., 90% CI: π± 1.645SE [π]

I E.g., 99% CI: π± 2.58SE [π]

Page 166: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Intervals for Proportions

I Sample proportion, π, has a normal sampling distributionunder CLT

π ∼ N(π,π(1 − π)

n)

I Because this is normal, use the same general guidelines, whereSE [π] =

√π(1 − π)/n

I E.g., 95% CI: π± 1.96SE [π]

I E.g., 90% CI: π± 1.645SE [π]

I E.g., 99% CI: π± 2.58SE [π]

Page 167: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Intervals for Proportions

I Sample proportion, π, has a normal sampling distributionunder CLT

π ∼ N(π,π(1 − π)

n)

I Because this is normal, use the same general guidelines, whereSE [π] =

√π(1 − π)/n

I E.g., 95% CI: π± 1.96SE [π]

I E.g., 90% CI: π± 1.645SE [π]

I E.g., 99% CI: π± 2.58SE [π]

Page 168: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Intervals for Proportions

I Sample proportion, π, has a normal sampling distributionunder CLT

π ∼ N(π,π(1 − π)

n)

I Because this is normal, use the same general guidelines, whereSE [π] =

√π(1 − π)/n

I E.g., 95% CI: π± 1.96SE [π]

I E.g., 90% CI: π± 1.645SE [π]

I E.g., 99% CI: π± 2.58SE [π]

Page 169: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Intervals for Proportions

I Sample proportion, π, has a normal sampling distributionunder CLT

π ∼ N(π,π(1 − π)

n)

I Because this is normal, use the same general guidelines, whereSE [π] =

√π(1 − π)/n

I E.g., 95% CI: π± 1.96SE [π]

I E.g., 90% CI: π± 1.645SE [π]

I E.g., 99% CI: π± 2.58SE [π]

Page 170: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Intervals for Difference in Means

I Difference in two sample means, X1 − X2, has a normalsampling distribution under CLT

X1 − X2 ∼ N(µ1 − µ2,σ21n1

+σ22n2

)

I Because this is normal, use the same general guidelines, where

SE [X1 − X2] =

√s21n1

+s22n2

I E.g., 95% CI: π± 1.96SE [X1 − X2]

I E.g., 90% CI: π± 1.645SE [X1 − X2]

I E.g., 99% CI: π± 2.58SE [X1 − X2]

Page 171: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Intervals for Difference in Means

I Difference in two sample means, X1 − X2, has a normalsampling distribution under CLT

X1 − X2 ∼ N(µ1 − µ2,σ21n1

+σ22n2

)

I Because this is normal, use the same general guidelines, where

SE [X1 − X2] =

√s21n1

+s22n2

I E.g., 95% CI: π± 1.96SE [X1 − X2]

I E.g., 90% CI: π± 1.645SE [X1 − X2]

I E.g., 99% CI: π± 2.58SE [X1 − X2]

Page 172: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Intervals for Difference in Means

I Difference in two sample means, X1 − X2, has a normalsampling distribution under CLT

X1 − X2 ∼ N(µ1 − µ2,σ21n1

+σ22n2

)

I Because this is normal, use the same general guidelines, where

SE [X1 − X2] =

√s21n1

+s22n2

I E.g., 95% CI: π± 1.96SE [X1 − X2]

I E.g., 90% CI: π± 1.645SE [X1 − X2]

I E.g., 99% CI: π± 2.58SE [X1 − X2]

Page 173: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Intervals for Difference in Means

I Difference in two sample means, X1 − X2, has a normalsampling distribution under CLT

X1 − X2 ∼ N(µ1 − µ2,σ21n1

+σ22n2

)

I Because this is normal, use the same general guidelines, where

SE [X1 − X2] =

√s21n1

+s22n2

I E.g., 95% CI: π± 1.96SE [X1 − X2]

I E.g., 90% CI: π± 1.645SE [X1 − X2]

I E.g., 99% CI: π± 2.58SE [X1 − X2]

Page 174: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Intervals for Difference in Means

I Difference in two sample means, X1 − X2, has a normalsampling distribution under CLT

X1 − X2 ∼ N(µ1 − µ2,σ21n1

+σ22n2

)

I Because this is normal, use the same general guidelines, where

SE [X1 − X2] =

√s21n1

+s22n2

I E.g., 95% CI: π± 1.96SE [X1 − X2]

I E.g., 90% CI: π± 1.645SE [X1 − X2]

I E.g., 99% CI: π± 2.58SE [X1 − X2]

Page 175: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Intervals for Difference in Means

I Difference in two sample means, X1 − X2, has a normalsampling distribution under CLT

X1 − X2 ∼ N(µ1 − µ2,σ21n1

+σ22n2

)

I Because this is normal, use the same general guidelines, where

SE [X1 − X2] =

√s21n1

+s22n2

I E.g., 95% CI: π± 1.96SE [X1 − X2]

I E.g., 90% CI: π± 1.645SE [X1 − X2]

I E.g., 99% CI: π± 2.58SE [X1 − X2]

Page 176: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Intervals for Difference in Means

I Difference in two sample means, X1 − X2, has a normalsampling distribution under CLT

X1 − X2 ∼ N(µ1 − µ2,σ21n1

+σ22n2

)

I Because this is normal, use the same general guidelines, where

SE [X1 − X2] =

√s21n1

+s22n2

I E.g., 95% CI: π± 1.96SE [X1 − X2]

I E.g., 90% CI: π± 1.645SE [X1 − X2]

I E.g., 99% CI: π± 2.58SE [X1 − X2]

Page 177: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Intervals for Difference in Proportions

I Difference in two sample proportions, π1 − π2, has a normalsampling distribution under CLT

∼ N(π1 − π2,π1(1 − π1)

n1+π2(1 − π2)

n2)

I Because this is normal, use the same general guidelines, where

non-pooled SE [π1 − π2] =√π1(1−π1)

n1+π2(1−π2)

n2

I E.g., 95% CI: π± 1.96SE [π1 − π2]

I E.g., 90% CI: π± 1.645SE [π1 − π2]

I E.g., 99% CI: π± 2.58SE [π1 − π2]

Page 178: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Intervals for Difference in Proportions

I Difference in two sample proportions, π1 − π2, has a normalsampling distribution under CLT

∼ N(π1 − π2,π1(1 − π1)

n1+π2(1 − π2)

n2)

I Because this is normal, use the same general guidelines, where

non-pooled SE [π1 − π2] =√π1(1−π1)

n1+π2(1−π2)

n2

I E.g., 95% CI: π± 1.96SE [π1 − π2]

I E.g., 90% CI: π± 1.645SE [π1 − π2]

I E.g., 99% CI: π± 2.58SE [π1 − π2]

Page 179: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Intervals for Difference in Proportions

I Difference in two sample proportions, π1 − π2, has a normalsampling distribution under CLT

∼ N(π1 − π2,π1(1 − π1)

n1+π2(1 − π2)

n2)

I Because this is normal, use the same general guidelines, where

non-pooled SE [π1 − π2] =√π1(1−π1)

n1+π2(1−π2)

n2

I E.g., 95% CI: π± 1.96SE [π1 − π2]

I E.g., 90% CI: π± 1.645SE [π1 − π2]

I E.g., 99% CI: π± 2.58SE [π1 − π2]

Page 180: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Intervals for Difference in Proportions

I Difference in two sample proportions, π1 − π2, has a normalsampling distribution under CLT

∼ N(π1 − π2,π1(1 − π1)

n1+π2(1 − π2)

n2)

I Because this is normal, use the same general guidelines, where

non-pooled SE [π1 − π2] =√π1(1−π1)

n1+π2(1−π2)

n2

I E.g., 95% CI: π± 1.96SE [π1 − π2]

I E.g., 90% CI: π± 1.645SE [π1 − π2]

I E.g., 99% CI: π± 2.58SE [π1 − π2]

Page 181: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Intervals for Difference in Proportions

I Difference in two sample proportions, π1 − π2, has a normalsampling distribution under CLT

∼ N(π1 − π2,π1(1 − π1)

n1+π2(1 − π2)

n2)

I Because this is normal, use the same general guidelines, where

non-pooled SE [π1 − π2] =√π1(1−π1)

n1+π2(1−π2)

n2

I E.g., 95% CI: π± 1.96SE [π1 − π2]

I E.g., 90% CI: π± 1.645SE [π1 − π2]

I E.g., 99% CI: π± 2.58SE [π1 − π2]

Page 182: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Interval Example

I Question asked by Gallup:

I “In general, are you satisfied or dissatisfied with the waythings are going in the United States at this time?”

I Suppose of n = 1017 respondents, 248 said “yes, satisfied”

I Calculate 95% confidence interval for true π (true share ofAmericans who think country moving in right direction)

Page 183: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Interval Example

I Question asked by Gallup:

I “In general, are you satisfied or dissatisfied with the waythings are going in the United States at this time?”

I Suppose of n = 1017 respondents, 248 said “yes, satisfied”

I Calculate 95% confidence interval for true π (true share ofAmericans who think country moving in right direction)

Page 184: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Interval Example

I Question asked by Gallup:

I “In general, are you satisfied or dissatisfied with the waythings are going in the United States at this time?”

I Suppose of n = 1017 respondents, 248 said “yes, satisfied”

I Calculate 95% confidence interval for true π (true share ofAmericans who think country moving in right direction)

Page 185: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Interval Example

I Question asked by Gallup:

I “In general, are you satisfied or dissatisfied with the waythings are going in the United States at this time?”

I Suppose of n = 1017 respondents, 248 said “yes, satisfied”

I Calculate 95% confidence interval for true π (true share ofAmericans who think country moving in right direction)

Page 186: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Interval Example

I Question asked by Gallup:

I “In general, are you satisfied or dissatisfied with the waythings are going in the United States at this time?”

I Suppose of n = 1017 respondents, 248 said “yes, satisfied”

I Calculate 95% confidence interval for true π (true share ofAmericans who think country moving in right direction)

Page 187: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Interval Example

I A 95% CI for π isπ± 1.96SE [π]

I where SE [π] =√π(1 − π)/n

I This means we have

0.24± 1.96√

0.24(1 − 0.24)/1017

0.24± 0.026

→ [0.214, 0.266]

I How do we interpret this?

Page 188: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Interval Example

I A 95% CI for π is

π± 1.96SE [π]

I where SE [π] =√π(1 − π)/n

I This means we have

0.24± 1.96√

0.24(1 − 0.24)/1017

0.24± 0.026

→ [0.214, 0.266]

I How do we interpret this?

Page 189: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Interval Example

I A 95% CI for π isπ± 1.96SE [π]

I where SE [π] =√π(1 − π)/n

I This means we have

0.24± 1.96√

0.24(1 − 0.24)/1017

0.24± 0.026

→ [0.214, 0.266]

I How do we interpret this?

Page 190: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Interval Example

I A 95% CI for π isπ± 1.96SE [π]

I where SE [π] =√π(1 − π)/n

I This means we have

0.24± 1.96√

0.24(1 − 0.24)/1017

0.24± 0.026

→ [0.214, 0.266]

I How do we interpret this?

Page 191: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Interval Example

I A 95% CI for π isπ± 1.96SE [π]

I where SE [π] =√π(1 − π)/n

I This means we have

0.24± 1.96√

0.24(1 − 0.24)/1017

0.24± 0.026

→ [0.214, 0.266]

I How do we interpret this?

Page 192: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Interval Example

I A 95% CI for π isπ± 1.96SE [π]

I where SE [π] =√π(1 − π)/n

I This means we have

0.24± 1.96√

0.24(1 − 0.24)/1017

0.24± 0.026

→ [0.214, 0.266]

I How do we interpret this?

Page 193: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Confidence Interval Example

I A 95% CI for π isπ± 1.96SE [π]

I where SE [π] =√π(1 − π)/n

I This means we have

0.24± 1.96√

0.24(1 − 0.24)/1017

0.24± 0.026

→ [0.214, 0.266]

I How do we interpret this?

Page 194: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret Confidence Intervals?

I CIs one of most misinterpreted estimators

I Remember: Calculation of confidence interval depends on thesampling distribution (from CLT)

I Different sample → different confidence interval

I With some samples, calculated CI would “capture” true % ofAmericans satisfied with country direction

I With some samples, calculated CI would not “capture” true %of Americans satisfied with country direction

I For all of the CIs we could calculated with repeat sampling,95% of them would cover true population parameter

I 95% of the time, the CI we construct in this fashion wouldcapture the % of Americans satisfied with country direction

Page 195: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret Confidence Intervals?

I CIs one of most misinterpreted estimators

I Remember: Calculation of confidence interval depends on thesampling distribution (from CLT)

I Different sample → different confidence interval

I With some samples, calculated CI would “capture” true % ofAmericans satisfied with country direction

I With some samples, calculated CI would not “capture” true %of Americans satisfied with country direction

I For all of the CIs we could calculated with repeat sampling,95% of them would cover true population parameter

I 95% of the time, the CI we construct in this fashion wouldcapture the % of Americans satisfied with country direction

Page 196: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret Confidence Intervals?

I CIs one of most misinterpreted estimators

I Remember: Calculation of confidence interval depends on thesampling distribution (from CLT)

I Different sample → different confidence interval

I With some samples, calculated CI would “capture” true % ofAmericans satisfied with country direction

I With some samples, calculated CI would not “capture” true %of Americans satisfied with country direction

I For all of the CIs we could calculated with repeat sampling,95% of them would cover true population parameter

I 95% of the time, the CI we construct in this fashion wouldcapture the % of Americans satisfied with country direction

Page 197: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret Confidence Intervals?

I CIs one of most misinterpreted estimators

I Remember: Calculation of confidence interval depends on thesampling distribution (from CLT)

I Different sample → different confidence interval

I With some samples, calculated CI would “capture” true % ofAmericans satisfied with country direction

I With some samples, calculated CI would not “capture” true %of Americans satisfied with country direction

I For all of the CIs we could calculated with repeat sampling,95% of them would cover true population parameter

I 95% of the time, the CI we construct in this fashion wouldcapture the % of Americans satisfied with country direction

Page 198: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret Confidence Intervals?

I CIs one of most misinterpreted estimators

I Remember: Calculation of confidence interval depends on thesampling distribution (from CLT)

I Different sample → different confidence interval

I With some samples, calculated CI would “capture” true % ofAmericans satisfied with country direction

I With some samples, calculated CI would not “capture” true %of Americans satisfied with country direction

I For all of the CIs we could calculated with repeat sampling,95% of them would cover true population parameter

I 95% of the time, the CI we construct in this fashion wouldcapture the % of Americans satisfied with country direction

Page 199: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret Confidence Intervals?

I CIs one of most misinterpreted estimators

I Remember: Calculation of confidence interval depends on thesampling distribution (from CLT)

I Different sample → different confidence interval

I With some samples, calculated CI would “capture” true % ofAmericans satisfied with country direction

I With some samples, calculated CI would not “capture” true %of Americans satisfied with country direction

I For all of the CIs we could calculated with repeat sampling,95% of them would cover true population parameter

I 95% of the time, the CI we construct in this fashion wouldcapture the % of Americans satisfied with country direction

Page 200: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret Confidence Intervals?

I CIs one of most misinterpreted estimators

I Remember: Calculation of confidence interval depends on thesampling distribution (from CLT)

I Different sample → different confidence interval

I With some samples, calculated CI would “capture” true % ofAmericans satisfied with country direction

I With some samples, calculated CI would not “capture” true %of Americans satisfied with country direction

I For all of the CIs we could calculated with repeat sampling,95% of them would cover true population parameter

I 95% of the time, the CI we construct in this fashion wouldcapture the % of Americans satisfied with country direction

Page 201: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret Confidence Intervals?

I CIs one of most misinterpreted estimators

I Remember: Calculation of confidence interval depends on thesampling distribution (from CLT)

I Different sample → different confidence interval

I With some samples, calculated CI would “capture” true % ofAmericans satisfied with country direction

I With some samples, calculated CI would not “capture” true %of Americans satisfied with country direction

I For all of the CIs we could calculated with repeat sampling,95% of them would cover true population parameter

I 95% of the time, the CI we construct in this fashion wouldcapture the % of Americans satisfied with country direction

Page 202: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret CIs?

I Show this with simulation

I For sake of simulation, assume observations come fromNormal distribution w/ mean 1 and variance of 10

I I sample 500 observations, 100 timesI For each sample, calculate 95% CI

I X ± 1.96× ˆSE [X ]

Page 203: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret CIs?

I Show this with simulation

I For sake of simulation, assume observations come fromNormal distribution w/ mean 1 and variance of 10

I I sample 500 observations, 100 timesI For each sample, calculate 95% CI

I X ± 1.96× ˆSE [X ]

Page 204: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret CIs?

I Show this with simulation

I For sake of simulation, assume observations come fromNormal distribution w/ mean 1 and variance of 10

I I sample 500 observations, 100 timesI For each sample, calculate 95% CI

I X ± 1.96× ˆSE [X ]

Page 205: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret CIs?

I Show this with simulation

I For sake of simulation, assume observations come fromNormal distribution w/ mean 1 and variance of 10

I I sample 500 observations, 100 times

I For each sample, calculate 95% CII X ± 1.96× ˆSE [X ]

Page 206: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret CIs?

I Show this with simulation

I For sake of simulation, assume observations come fromNormal distribution w/ mean 1 and variance of 10

I I sample 500 observations, 100 timesI For each sample, calculate 95% CI

I X ± 1.96× ˆSE [X ]

Page 207: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret CIs?

I Show this with simulation

I For sake of simulation, assume observations come fromNormal distribution w/ mean 1 and variance of 10

I I sample 500 observations, 100 timesI For each sample, calculate 95% CI

I X ± 1.96× ˆSE [X ]

Page 208: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret CIs?

0.6

0.8

1.0

1.2

1.4

500 observations drawn from N(1,10)

95%

Con

fiden

ce In

terv

al

Page 209: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret CIs?

0.6

0.8

1.0

1.2

1.4

500 observations drawn from N(1,10)

95%

Con

fiden

ce In

terv

al

Page 210: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret CIs?

0.6

0.8

1.0

1.2

1.4

500 observations drawn from N(1,10)

95%

Con

fiden

ce In

terv

al

Page 211: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret CIs?

0.6

0.8

1.0

1.2

1.4

500 observations drawn from N(1,10)

95%

Con

fiden

ce In

terv

al

Page 212: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret CIs?

0.6

0.8

1.0

1.2

1.4

500 observations drawn from N(1,10)

95%

Con

fiden

ce In

terv

al

Page 213: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret CIs?

0.6

0.8

1.0

1.2

1.4

500 observations drawn from N(1,10)

95%

Con

fiden

ce In

terv

al

Page 214: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret CIs?

I For all of the CIs we could calculated with repeat sampling,95% of them would cover true population parameter

I Confidence intervals one of most frequently misinterpretedestimators

I “There is a 95% probability that this interval I’ve calculatedcontains the true population parameter”

I → Not correct: Once you have calculated the CI, it eithercontains the true value or not

I “95% of the confidence intervals I calculate using this formulausing repeated sampling will contain the true populationparameter”

I → Correct!

Page 215: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret CIs?

I For all of the CIs we could calculated with repeat sampling,95% of them would cover true population parameter

I Confidence intervals one of most frequently misinterpretedestimators

I “There is a 95% probability that this interval I’ve calculatedcontains the true population parameter”

I → Not correct: Once you have calculated the CI, it eithercontains the true value or not

I “95% of the confidence intervals I calculate using this formulausing repeated sampling will contain the true populationparameter”

I → Correct!

Page 216: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret CIs?

I For all of the CIs we could calculated with repeat sampling,95% of them would cover true population parameter

I Confidence intervals one of most frequently misinterpretedestimators

I “There is a 95% probability that this interval I’ve calculatedcontains the true population parameter”

I → Not correct: Once you have calculated the CI, it eithercontains the true value or not

I “95% of the confidence intervals I calculate using this formulausing repeated sampling will contain the true populationparameter”

I → Correct!

Page 217: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret CIs?

I For all of the CIs we could calculated with repeat sampling,95% of them would cover true population parameter

I Confidence intervals one of most frequently misinterpretedestimators

I “There is a 95% probability that this interval I’ve calculatedcontains the true population parameter”

I → Not correct: Once you have calculated the CI, it eithercontains the true value or not

I “95% of the confidence intervals I calculate using this formulausing repeated sampling will contain the true populationparameter”

I → Correct!

Page 218: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret CIs?

I For all of the CIs we could calculated with repeat sampling,95% of them would cover true population parameter

I Confidence intervals one of most frequently misinterpretedestimators

I “There is a 95% probability that this interval I’ve calculatedcontains the true population parameter”

I → Not correct: Once you have calculated the CI, it eithercontains the true value or not

I “95% of the confidence intervals I calculate using this formulausing repeated sampling will contain the true populationparameter”

I → Correct!

Page 219: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret CIs?

I For all of the CIs we could calculated with repeat sampling,95% of them would cover true population parameter

I Confidence intervals one of most frequently misinterpretedestimators

I “There is a 95% probability that this interval I’ve calculatedcontains the true population parameter”

I → Not correct: Once you have calculated the CI, it eithercontains the true value or not

I “95% of the confidence intervals I calculate using this formulausing repeated sampling will contain the true populationparameter”

I → Correct!

Page 220: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

How to Interpret CIs?

I For all of the CIs we could calculated with repeat sampling,95% of them would cover true population parameter

I Confidence intervals one of most frequently misinterpretedestimators

I “There is a 95% probability that this interval I’ve calculatedcontains the true population parameter”

I → Not correct: Once you have calculated the CI, it eithercontains the true value or not

I “95% of the confidence intervals I calculate using this formulausing repeated sampling will contain the true populationparameter”

I → Correct!

Page 221: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs and Sample Size

Notes on sample size:

I Small samples → Follow same rules as hypothesis tests forwhen to switch to Student’s t distribution

I Larger sample size → will shrink standard errors → will leadto smaller CIs

I Ex) Doubling sample size will reduce the width of theconfidence interval for a sample mean by a half

I X ± σ/√n

I X ± σ/√

4n→ X ± 12σ/√n

I Problems may ask you to calculate minimum sample size,given α and standard deviation

Page 222: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs and Sample Size

Notes on sample size:

I Small samples → Follow same rules as hypothesis tests forwhen to switch to Student’s t distribution

I Larger sample size → will shrink standard errors → will leadto smaller CIs

I Ex) Doubling sample size will reduce the width of theconfidence interval for a sample mean by a half

I X ± σ/√n

I X ± σ/√

4n→ X ± 12σ/√n

I Problems may ask you to calculate minimum sample size,given α and standard deviation

Page 223: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs and Sample Size

Notes on sample size:

I Small samples → Follow same rules as hypothesis tests forwhen to switch to Student’s t distribution

I Larger sample size → will shrink standard errors → will leadto smaller CIs

I Ex) Doubling sample size will reduce the width of theconfidence interval for a sample mean by a half

I X ± σ/√n

I X ± σ/√

4n→ X ± 12σ/√n

I Problems may ask you to calculate minimum sample size,given α and standard deviation

Page 224: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs and Sample Size

Notes on sample size:

I Small samples → Follow same rules as hypothesis tests forwhen to switch to Student’s t distribution

I Larger sample size → will shrink standard errors → will leadto smaller CIs

I Ex) Doubling sample size will reduce the width of theconfidence interval for a sample mean by a half

I X ± σ/√n

I X ± σ/√

4n→ X ± 12σ/√n

I Problems may ask you to calculate minimum sample size,given α and standard deviation

Page 225: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs and Sample Size

Notes on sample size:

I Small samples → Follow same rules as hypothesis tests forwhen to switch to Student’s t distribution

I Larger sample size → will shrink standard errors → will leadto smaller CIs

I Ex) Doubling sample size will reduce the width of theconfidence interval for a sample mean by a half

I X ± σ/√n

I X ± σ/√

4n→ X ± 12σ/√n

I Problems may ask you to calculate minimum sample size,given α and standard deviation

Page 226: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs and Sample Size

Notes on sample size:

I Small samples → Follow same rules as hypothesis tests forwhen to switch to Student’s t distribution

I Larger sample size → will shrink standard errors → will leadto smaller CIs

I Ex) Doubling sample size will reduce the width of theconfidence interval for a sample mean by a half

I X ± σ/√n

I X ± σ/√

4n

→ X ± 12σ/√n

I Problems may ask you to calculate minimum sample size,given α and standard deviation

Page 227: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs and Sample Size

Notes on sample size:

I Small samples → Follow same rules as hypothesis tests forwhen to switch to Student’s t distribution

I Larger sample size → will shrink standard errors → will leadto smaller CIs

I Ex) Doubling sample size will reduce the width of theconfidence interval for a sample mean by a half

I X ± σ/√n

I X ± σ/√

4n→ X ± 12σ/√n

I Problems may ask you to calculate minimum sample size,given α and standard deviation

Page 228: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs and Sample Size

Notes on sample size:

I Small samples → Follow same rules as hypothesis tests forwhen to switch to Student’s t distribution

I Larger sample size → will shrink standard errors → will leadto smaller CIs

I Ex) Doubling sample size will reduce the width of theconfidence interval for a sample mean by a half

I X ± σ/√n

I X ± σ/√

4n→ X ± 12σ/√n

I Problems may ask you to calculate minimum sample size,given α and standard deviation

Page 229: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs and Sample Size

Show this graphically → What happens to CIs when n goes up?

0.6

0.8

1.0

1.2

1.4

500 observations drawn from N(1,10)

95%

Con

fiden

ce In

terv

al

Page 230: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs and Sample SizeShow this graphically → What happens to CIs when n goes up?

0.6

0.8

1.0

1.2

1.4

500 observations drawn from N(1,10)

95%

Con

fiden

ce In

terv

al

Page 231: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs and Sample SizeShow this graphically → What happens to CIs when n goes up?

0.6

0.8

1.0

1.2

1.4

500 observations drawn from N(1,10)

95%

Con

fiden

ce In

terv

al

Page 232: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs and Sample SizeShow this graphically → What happens to CIs when n goes up?

0.6

0.8

1.0

1.2

1.4

1500 observations drawn from N(1,10)

95%

Con

fiden

ce In

terv

al

Page 233: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs versus Hypothesis Tests

I Close relationship between CIs and HTs

I If value A not in 95% CI → would be rejected by a two-sidedhypothesis test at the 5% level

I If value A in 95% CI → would not be rejected by a two-sidedhypothesis test at the 5% level

I → A 95% confidence interval is all of null hypotheses thatwould not be rejected at the 0.05 level

I → A (1 − α)% confidence interval is all of null hypothesesthat would not be rejected at the α level

Page 234: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs versus Hypothesis Tests

I Close relationship between CIs and HTs

I If value A not in 95% CI → would be rejected by a two-sidedhypothesis test at the 5% level

I If value A in 95% CI → would not be rejected by a two-sidedhypothesis test at the 5% level

I → A 95% confidence interval is all of null hypotheses thatwould not be rejected at the 0.05 level

I → A (1 − α)% confidence interval is all of null hypothesesthat would not be rejected at the α level

Page 235: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs versus Hypothesis Tests

I Close relationship between CIs and HTs

I If value A not in 95% CI → would be rejected by a two-sidedhypothesis test at the 5% level

I If value A in 95% CI → would not be rejected by a two-sidedhypothesis test at the 5% level

I → A 95% confidence interval is all of null hypotheses thatwould not be rejected at the 0.05 level

I → A (1 − α)% confidence interval is all of null hypothesesthat would not be rejected at the α level

Page 236: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs versus Hypothesis Tests

I Close relationship between CIs and HTs

I If value A not in 95% CI → would be rejected by a two-sidedhypothesis test at the 5% level

I If value A in 95% CI → would not be rejected by a two-sidedhypothesis test at the 5% level

I → A 95% confidence interval is all of null hypotheses thatwould not be rejected at the 0.05 level

I → A (1 − α)% confidence interval is all of null hypothesesthat would not be rejected at the α level

Page 237: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs versus Hypothesis Tests

I Close relationship between CIs and HTs

I If value A not in 95% CI → would be rejected by a two-sidedhypothesis test at the 5% level

I If value A in 95% CI → would not be rejected by a two-sidedhypothesis test at the 5% level

I → A 95% confidence interval is all of null hypotheses thatwould not be rejected at the 0.05 level

I → A (1 − α)% confidence interval is all of null hypothesesthat would not be rejected at the α level

Page 238: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs versus Hypothesis Tests

I Close relationship between CIs and HTs

I If value A not in 95% CI → would be rejected by a two-sidedhypothesis test at the 5% level

I If value A in 95% CI → would not be rejected by a two-sidedhypothesis test at the 5% level

I → A 95% confidence interval is all of null hypotheses thatwould not be rejected at the 0.05 level

I → A (1 − α)% confidence interval is all of null hypothesesthat would not be rejected at the α level

Page 239: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs versus Hypothesis Tests

I Both CIs and HTs useful tools in your inference toolkit

I HTs → useful when comparing groups or trying to test atheory

I CIs → useful for thinking about range of possible values,providing additional information about a single sample

I Many people prefer CIs:I Can give you information over all possible null hypotheses that

would be rejected (as opposed to one), conditional on α valueI Many find margin of error intuitive (although many incorrectly

interpret)

Page 240: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs versus Hypothesis Tests

I Both CIs and HTs useful tools in your inference toolkit

I HTs → useful when comparing groups or trying to test atheory

I CIs → useful for thinking about range of possible values,providing additional information about a single sample

I Many people prefer CIs:I Can give you information over all possible null hypotheses that

would be rejected (as opposed to one), conditional on α valueI Many find margin of error intuitive (although many incorrectly

interpret)

Page 241: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs versus Hypothesis Tests

I Both CIs and HTs useful tools in your inference toolkit

I HTs → useful when comparing groups or trying to test atheory

I CIs → useful for thinking about range of possible values,providing additional information about a single sample

I Many people prefer CIs:I Can give you information over all possible null hypotheses that

would be rejected (as opposed to one), conditional on α valueI Many find margin of error intuitive (although many incorrectly

interpret)

Page 242: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs versus Hypothesis Tests

I Both CIs and HTs useful tools in your inference toolkit

I HTs → useful when comparing groups or trying to test atheory

I CIs → useful for thinking about range of possible values,providing additional information about a single sample

I Many people prefer CIs:I Can give you information over all possible null hypotheses that

would be rejected (as opposed to one), conditional on α valueI Many find margin of error intuitive (although many incorrectly

interpret)

Page 243: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs versus Hypothesis Tests

I Both CIs and HTs useful tools in your inference toolkit

I HTs → useful when comparing groups or trying to test atheory

I CIs → useful for thinking about range of possible values,providing additional information about a single sample

I Many people prefer CIs:

I Can give you information over all possible null hypotheses thatwould be rejected (as opposed to one), conditional on α value

I Many find margin of error intuitive (although many incorrectlyinterpret)

Page 244: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs versus Hypothesis Tests

I Both CIs and HTs useful tools in your inference toolkit

I HTs → useful when comparing groups or trying to test atheory

I CIs → useful for thinking about range of possible values,providing additional information about a single sample

I Many people prefer CIs:I Can give you information over all possible null hypotheses that

would be rejected (as opposed to one), conditional on α value

I Many find margin of error intuitive (although many incorrectlyinterpret)

Page 245: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

CIs versus Hypothesis Tests

I Both CIs and HTs useful tools in your inference toolkit

I HTs → useful when comparing groups or trying to test atheory

I CIs → useful for thinking about range of possible values,providing additional information about a single sample

I Many people prefer CIs:I Can give you information over all possible null hypotheses that

would be rejected (as opposed to one), conditional on α valueI Many find margin of error intuitive (although many incorrectly

interpret)

Page 246: scholar.harvard.eduPower Analyses I Power Analysis: If a given alternative hypothesis were true, how good would our test be at (correctly) rejecting the null? I Somewhat similar to

Next time

I Comparing groups that have paired data