HS 167, Unit 8: Comparing Two Means (Two Independent Means)



Sampling Considerations

One sample or two?

If two samples, paired or independent?

Is the response variable quantitative or categorical?

Am I interested in the mean difference?

This chapter → two independent samples → quantitative response → interest in mean difference


One sample

Comparisons made to an external reference population

SRS from one population


Paired Sample

Just like a one-sample problem, except inferences are directed toward the within-pair differences (DELTA)

Two samples with each observation in sample 1 matched to a unique observation in sample 2


Independent sample inference

No matching or pairing

Independent samples from two populations


What type of sampling method?

1. Measure vitamin content in loaves of bread and see if the average meets national standards.

2. Compare vitamin content of bread immediately after baking versus 3 days later (same loaves are used on day one and 3 days later)

3. Compare vitamin content of bread immediately after baking versus loaves that have been on shelf for 3 days

1 = single sample

2 = paired samples

3 = independent samples


Illustrative example: independent samples

Fasting cholesterol (mg/dl)

Group 1 (type A personality): 233, 291, 312, 250, 246, 197, 268, 224, 239, 239, 254, 276, 234, 181, 248, 252, 202, 218, 212, 325

Group 2 (type B personality): 344, 185, 263, 246, 224, 212, 188, 250, 148, 169, 226, 175, 242, 252, 153, 183, 137, 202, 194, 213

Goal: compare response variable in two groups


Data setup for independent samples

Two columns:

Response variable in one column

Explanatory variable in the other column
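As a rough illustration of this two-column layout, the sketch below stores the first three observations from each group as one row per subject. The column name CHOL is an assumption made for the example; GROUP is borrowed from the boxplot axis label.

```python
# Long ("stacked") layout: one row per subject.
# GROUP is the explanatory variable, CHOL (assumed name) is the response.
rows = [
    {"GROUP": 1, "CHOL": 233},
    {"GROUP": 1, "CHOL": 291},
    {"GROUP": 1, "CHOL": 312},
    {"GROUP": 2, "CHOL": 344},
    {"GROUP": 2, "CHOL": 185},
    {"GROUP": 2, "CHOL": 263},
]

# Split the response variable by the explanatory variable.
by_group = {}
for row in rows:
    by_group.setdefault(row["GROUP"], []).append(row["CHOL"])

print(by_group)
```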


Side-by-side boxplots

[Figure: side-by-side boxplots of cholesterol (mg/dl) by GROUP (1 and 2), N = 20 per group; vertical axis from 100 to 400]

Interpretation:

(1) Different locations (group 1 > group 2)

(2) Different spreads (group 1 < group 2)

(3) Shape: fairly symmetrical (but both with outside values)

Compare locations, spreads, and shapes


Summary statistics by group

Group | n | mean | std dev

1 | 20 | 245.05 | 36.64

2 | 20 | 210.30 | 48.34

Take time to look at your results.

If no major departures from Normality, report means and standard deviations (and sample sizes)
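The group summaries can be reproduced from the raw data with a short Python sketch that uses only the standard library. The helper name `summarize` is made up for this example.

```python
from statistics import mean, stdev

# Fasting cholesterol (mg/dl) from the illustrative example.
group1 = [233, 291, 312, 250, 246, 197, 268, 224, 239, 239,
          254, 276, 234, 181, 248, 252, 202, 218, 212, 325]
group2 = [344, 185, 263, 246, 224, 212, 188, 250, 148, 169,
          226, 175, 242, 252, 153, 183, 137, 202, 194, 213]

def summarize(values):
    """Return (n, mean, sample standard deviation) for one group."""
    return len(values), mean(values), stdev(values)

for label, values in (("1", group1), ("2", group2)):
    n, xbar, s = summarize(values)
    print(f"Group {label}: n = {n}, mean = {xbar:.2f}, std dev = {s:.2f}")
```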


Notation for independent samples

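Parameters (population)

Group 1: $N_1$, $\mu_1$, $\sigma_1$

Group 2: $N_2$, $\mu_2$, $\sigma_2$

Statistics (sample)

Group 1: $n_1$, $\bar{x}_1$, $s_1$

Group 2: $n_2$, $\bar{x}_2$, $s_2$

$\bar{x}_1 - \bar{x}_2$ is the sample mean difference; it estimates $\mu_1 - \mu_2$.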


Sampling distribution of mean difference

$\bar{x}_1 - \bar{x}_2 \sim N(\mu_1 - \mu_2,\ SE_{\bar{x}_1 - \bar{x}_2})$

The sampling distribution of the mean difference is key to inference

{FIGURE DRAWN ON BOARD}

The sampling distribution of the mean difference tends to be Normal with expectation μ1 − μ2 and standard deviation SE (SE is discussed on the next slide).
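A small simulation can make this sampling distribution concrete. The sketch below is only an illustration: the population values (means of 245 and 210, a common standard deviation of 40) are hypothetical choices, not figures from the slides. It repeatedly draws two independent samples of 20 and records the difference in sample means.

```python
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the run is reproducible

# Hypothetical population values, chosen only for illustration.
MU1, MU2, SIGMA = 245.0, 210.0, 40.0
N1 = N2 = 20
REPS = 5000

def simulated_difference():
    """Draw one independent sample per population; return xbar1 - xbar2."""
    sample1 = [random.gauss(MU1, SIGMA) for _ in range(N1)]
    sample2 = [random.gauss(MU2, SIGMA) for _ in range(N2)]
    return mean(sample1) - mean(sample2)

differences = [simulated_difference() for _ in range(REPS)]

# With a common sigma, the standard deviation of the difference is
# sigma * sqrt(1/n1 + 1/n2).
theoretical_se = SIGMA * (1 / N1 + 1 / N2) ** 0.5
print(f"mean of differences: {mean(differences):.2f} (expected {MU1 - MU2:.2f})")
print(f"SD of differences:   {stdev(differences):.2f} (theory {theoretical_se:.2f})")
```

The simulated differences should center near μ1 − μ2 and have a spread close to the theoretical standard error.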


Pooled Standard Error

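Illustrative data (summary statistics)

Group | $n_i$ | $s_i$ | $\bar{x}_i$

1 | 20 | 36.64 | 245.05

2 | 20 | 48.34 | 210.30

$df_1 = n_1 - 1 = 20 - 1 = 19$

$df_2 = n_2 - 1 = 20 - 1 = 19$

$df = df_1 + df_2 = 19 + 19 = 38$

$s^2_{pooled} = \dfrac{(df_1)(s_1^2) + (df_2)(s_2^2)}{df} = \dfrac{(19)(36.64^2) + (19)(48.34^2)}{38} = 1839.623$

$SE_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2_{pooled}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = \sqrt{1839.623\left(\dfrac{1}{20} + \dfrac{1}{20}\right)} = 13.56$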
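The pooled standard error arithmetic can be checked with a few lines of Python. This is a minimal sketch working from the summary statistics; the function name `pooled_se` is made up for the example.

```python
from math import sqrt

def pooled_se(n1, s1, n2, s2):
    """Return (df, pooled variance, standard error of xbar1 - xbar2)."""
    df1, df2 = n1 - 1, n2 - 1
    df = df1 + df2
    # Pooled variance: df-weighted average of the two sample variances.
    s2_pooled = (df1 * s1**2 + df2 * s2**2) / df
    se = sqrt(s2_pooled * (1 / n1 + 1 / n2))
    return df, s2_pooled, se

df, s2_pooled, se = pooled_se(20, 36.64, 20, 48.34)
print(f"df = {df}, pooled variance = {s2_pooled:.3f}, SE = {se:.2f}")
```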


Confidence interval for µ1 – µ2

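(1 − α)100% confidence interval for µ1 – µ2:

$(\bar{x}_1 - \bar{x}_2) \pm (t_{df,\,1-\alpha/2})(SE_{\bar{x}_1 - \bar{x}_2})$

Illustrative example (cholesterol in type A and B men), 95% confidence:

$(\bar{x}_1 - \bar{x}_2) \pm (t_{38,\,.975})(SE_{\bar{x}_1 - \bar{x}_2}) = (245.05 - 210.30) \pm (2.02)(13.56) = 34.75 \pm 27.39 = (7.36,\ 62.14)$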
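The interval arithmetic from the illustrative example can be verified with a short Python sketch. The t critical value 2.02 (38 df, 95% confidence) is taken from the slide rather than computed, and the function name `ci_mean_difference` is made up for the example.

```python
def ci_mean_difference(xbar1, xbar2, se, t_crit):
    """Return (lower, upper) confidence limits for mu1 - mu2."""
    diff = xbar1 - xbar2
    margin = t_crit * se  # margin of error
    return diff - margin, diff + margin

# Values from the illustrative example; t_crit = 2.02 is the tabled
# value for 38 df at 95% confidence, as used on the slide.
lower, upper = ci_mean_difference(245.05, 210.30, se=13.56, t_crit=2.02)
print(f"95% CI for mu1 - mu2: ({lower:.2f}, {upper:.2f})")
```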


Comparison of CI formulas

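$(\text{point estimate}) \pm (t^*)(SE)$

Type of sample | Point estimate | df for t* | SE

single | $\bar{x}$ | $n - 1$ | $s / \sqrt{n}$

paired | $\bar{x}_d$ | $n_d - 1$ | $s_d / \sqrt{n_d}$

independent | $\bar{x}_1 - \bar{x}_2$ | $(n_1 - 1) + (n_2 - 1)$ | $\sqrt{s^2_{pooled}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$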


Independent t test

A. H0: µ1 = µ2 vs. H1: µ1 > µ2 or H1: µ1 < µ2 or H1: µ1 ≠ µ2

B. Independent t statistic (pooled t statistic):

$t_{stat} = \dfrac{\bar{x}_1 - \bar{x}_2}{SE_{\bar{x}_1 - \bar{x}_2}}$ with $df = df_1 + df_2 = (n_1 - 1) + (n_2 - 1)$

C. P-value: use a t table or software utility to convert $t_{stat}$ to a P-value

D. Significance level

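Illustrative example

$\bar{x}_1 - \bar{x}_2 = 245.05 - 210.30 = 34.75$

$SE_{\bar{x}_1 - \bar{x}_2} = 13.56$

$df = 19 + 19 = 38$

$t_{stat} = \dfrac{\bar{x}_1 - \bar{x}_2}{SE_{\bar{x}_1 - \bar{x}_2}} = \dfrac{34.75}{13.56} = 2.56$

One-sided P between 0.005 and 0.01

Two-sided P between 0.01 and 0.02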
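In place of a t table, the P-value can be approximated numerically. The sketch below is one possible approach under stated assumptions, not the method used in the course: it integrates the Student's t density with Simpson's rule using only the standard library. The helper names `t_density` and `upper_tail` are made up for the example.

```python
from math import gamma, pi, sqrt

def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """One-sided P(T >= |t|): 0.5 minus Simpson's-rule area from 0 to |t|."""
    t = abs(t)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_density(0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 0.5 - total * h / 3

# Values from the illustrative example.
t_stat = (245.05 - 210.30) / 13.56
df = 38
p_one = upper_tail(t_stat, df)
p_two = 2 * p_one
print(f"t = {t_stat:.2f}, df = {df}")
print(f"one-sided P = {p_one:.4f}, two-sided P = {p_two:.4f}")
```

The computed values should fall inside the table-based ranges given in the example.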


SPSS output

These are the pooled (equal variance) statistics calculated in HS 167


Conditions necessary for t procedures

Validity assumptions:

good information (no information bias)

good sample (“no selection bias”)

good comparison (“no confounding”; no lurking variables)

Distributional assumptions:

sampling independence

Normality

equal variance


Sample size requirements for confidence intervals

$n = \left(\dfrac{1.96\,\sigma}{d}\right)^2$

This will restrict the margin of error to no bigger than plus or minus d


Sample size requirement for CI

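Suppose you have a variable with σ = 15. (The calculations below use 4 as a round approximation of 1.96² ≈ 3.84.)

For d = 5, use $n = \dfrac{4 \cdot 15^2}{5^2} = 36$

For d = 2.5, use $n = \dfrac{4 \cdot 15^2}{2.5^2} = 144$

For d = 1, use $n = \dfrac{4 \cdot 15^2}{1^2} = 900$

Sample size requirements increase when you need more precision.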
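The three sample size calculations can be reproduced with a short Python sketch. It uses the rounded multiplier 4 seen in the worked examples; the function name `n_for_margin` is made up for the example.

```python
from math import ceil

def n_for_margin(sigma, d):
    """Sample size that keeps the margin of error at or below d.

    Uses 4 as a rounded stand-in for 1.96**2, as in the worked examples.
    """
    return ceil(4 * sigma**2 / d**2)

for d in (5, 2.5, 1):
    print(f"d = {d}: n = {n_for_margin(15, d)}")
```

Halving d quadruples the required n, because d enters the formula squared.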


Sample size for significance test

Goal: to conduct a significance test with adequate power to detect “a difference worth detecting”

The difference worth detecting is a difference worth finding.

In a study of anti-hypertensives, for instance, a drop of 10 mm Hg might be worth detecting, while a drop of 1 mm Hg might not be worth detecting.

In a study on weight loss, a drop of 5 pounds might be meaningful in a population of runway models, but may be meaningless in a morbidly obese population.


Determinants of sample size requirements

“Difference worth detecting” (Δ)

Standard deviation of data (σ)

Type I error rate (α); we consider only two-sided α = .05

Power of test (we consider only 80% power)


Sample size requirements for test

Approx. sample size needed for 80% power at alpha = .05 (two-sided) to detect a difference of Δ:

$n = \dfrac{16\sigma^2}{\Delta^2} + 1$

Illustrative example: Suppose Δ = 25 and σ = 45 …

$n = \dfrac{(16)(45^2)}{25^2} + 1 = 52.8$, rounded up to 53
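The approximate sample size formula for the test can be checked the same way. This is a minimal sketch; the function name `n_for_test` is made up for the example, and the formula applies only to the stated conditions (80% power, two-sided alpha = .05).

```python
from math import ceil

def n_for_test(sigma, delta):
    """Approximate n for 80% power at two-sided alpha = .05.

    Implements n = 16 * sigma^2 / delta^2 + 1 (before rounding up).
    """
    return 16 * sigma**2 / delta**2 + 1

n_exact = n_for_test(sigma=45, delta=25)
print(f"n = {n_exact:.2f}, rounded up to {ceil(n_exact)}")
```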
