Copyright © 2011 by Pearson Education, Inc. All rights reserved

Statistics for the Behavioral and Social Sciences:

A Brief Course Fifth Edition

Arthur Aron, Elaine N. Aron, Elliot Coups

Prepared by: Genna Hymowitz

Stony Brook University

This multimedia product and its contents are protected under copyright law. The following are prohibited by law:

- any public performance or display, including transmission of any image over a network;
- preparation of any derivative work, including the extraction, in whole or in part, of any images;
- any rental, lease, or lending of the program.


The t Test for Independent Means

Chapter 9


Chapter Outline

• The Distribution of Differences Between Means
• Estimating the Population Variance
• Hypothesis Testing with a t Test for Independent Means
• Assumptions of the t Test for Independent Means
• Effect Size and Power for the t Test for Independent Means
• Review and Comparison of the Three Kinds of Tests
• The t Test for Independent Means in Research Articles


t Tests for Independent Means

• Hypothesis-testing procedure used for studies with two sets of scores
– Each set of scores is from an entirely different group of people, and the population variance is not known.
• e.g., a study that compares a treatment group to a control group


The Distribution of Differences Between Means

• When you have one score for each person with two different groups of people, you can compare the mean of one group to the mean of the other group.
– The t test for independent means focuses on the difference between the means of the two groups.
• The comparison distribution is a distribution of differences between means. Conceptually, it is created as follows:
– Randomly select one mean from the distribution of means for the first group’s population.
– Randomly select one mean from the distribution of means for the second group’s population.
– Subtract the mean from the second distribution of means from the mean from the first distribution of means to create a difference score between the two selected means.
– Repeat this process a large number of times and you will have a distribution of differences between means.
» Note that this is not the actual way a distribution of differences between means is created, but conceptually this is what a distribution of differences between means is.
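The conceptual procedure above can be sketched as a small simulation. This is only an illustration under stated assumptions: both populations are assumed normal with the same mean and standard deviation (so the null hypothesis is true), and the population values, sample sizes, and function name are hypothetical. Drawing a fresh sample and taking its mean stands in for "selecting one mean from the distribution of means."

```python
import random
import statistics


def simulate_differences(pop_mean=50.0, pop_sd=10.0, n1=10, n2=10,
                         reps=5000, seed=0):
    """Build a distribution of differences between means by brute force.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    differences = []
    for _ in range(reps):
        # One sample mean from each population (same population mean,
        # so the null hypothesis is true in this simulation).
        m1 = statistics.fmean(rng.gauss(pop_mean, pop_sd) for _ in range(n1))
        m2 = statistics.fmean(rng.gauss(pop_mean, pop_sd) for _ in range(n2))
        # Difference score between the two selected means.
        differences.append(m1 - m2)
    return differences


diffs = simulate_differences()
sim_mean = statistics.fmean(diffs)
sim_variance = statistics.variance(diffs)
print(round(sim_mean, 2), round(sim_variance, 2))
```

With these assumed values, the simulated mean should land near 0 and the simulated variance near 100/10 + 100/10 = 20, the sum of the variances of the two distributions of means.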


The Logic of a Distribution of Differences Between Means

• The null hypothesis is that Population M1 = Population M2

– If the null hypothesis is true, the two population means from which the samples are drawn are the same.

• The population variances are estimated from the sample scores.

• The variance of the distribution of differences between means is based on estimated population variances.
– The goal of a t test for independent means is to decide whether the difference between means of your two actual samples is a more extreme difference than the cutoff difference on this distribution of differences between means.


Mean of the Distribution of Differences Between Means

• With a t test for independent means, two populations are considered.
– An experimental group is taken from one of these populations and a control group is taken from the other population.
• If the null hypothesis is true:
– The populations have equal means.
– The distribution of differences between means has a mean of 0.


Estimating the Population Variance

• In a t test for independent means, you calculate two estimates of the population variance.
– Each estimate is weighted by a proportion consisting of its sample’s degrees of freedom divided by the total degrees of freedom for both samples.
• The estimates are weighted to account for differences in sample size.
– More weight is given to the larger sample.
» The most weight is given to the sample that provides the most information as determined by the degrees of freedom each sample provides.
– The weighted estimates are averaged.
• This is known as the pooled estimate of the population variance.
– $S^2_{\text{Pooled}} = \frac{df_1}{df_{\text{Total}}}(S^2_1) + \frac{df_2}{df_{\text{Total}}}(S^2_2)$
– $df_{\text{Total}} = df_1 + df_2$
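A minimal sketch of the pooled estimate, assuming you already have each sample's variance estimate and sample size. The function name is hypothetical. The first call uses the variance estimates from this chapter's expressive writing example; the second uses made-up numbers to show the larger sample pulling the pooled value toward its own estimate.

```python
def pooled_variance(s2_1, n1, s2_2, n2):
    """Weight each variance estimate by its share of the total degrees of freedom."""
    df1 = n1 - 1
    df2 = n2 - 1
    df_total = df1 + df2
    return (df1 / df_total) * s2_1 + (df2 / df_total) * s2_2


# Equal sample sizes: each estimate gets a weight of 9/18 = .5.
equal_n = pooled_variance(94.44, 10, 111.33, 10)

# Hypothetical unequal sample sizes: weights are 10/40 = .25 and 30/40 = .75.
unequal_n = pooled_variance(60.0, 11, 80.0, 31)

print(round(equal_n, 2), unequal_n)
```

With unequal sample sizes the pooled value (75) sits closer to the larger sample's estimate (80) than a simple average (70) would.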


Figuring the Variance of Each of the Two Distributions of Means

• The pooled estimate of the population variance is the best estimate for both populations.

• Even though the two populations have the same variance, if the samples are not the same size, the distributions of means taken from them do not have the same variance.
– This is because the variance of a distribution of means is the population variance divided by the sample size.
• $S^2_{M_1} = S^2_{\text{Pooled}} / N_1$
• $S^2_{M_2} = S^2_{\text{Pooled}} / N_2$


The Variance and Standard Deviation of the Distribution of Differences Between Means

• The variance of the distribution of differences between means ($S^2_{\text{Difference}}$) is the variance of Population 1’s distribution of means plus the variance of Population 2’s distribution of means.
– $S^2_{\text{Difference}} = S^2_{M_1} + S^2_{M_2}$
• The standard deviation of the distribution of differences between means ($S_{\text{Difference}}$) is the square root of the variance.
– $S_{\text{Difference}} = \sqrt{S^2_{\text{Difference}}}$


Steps to Find the Standard Deviation of the Distribution of Differences Between Means

• Figure the estimated population variances based on each sample.
– $S^2 = \frac{\sum (X - M)^2}{N - 1}$
• Figure the pooled estimate of the population variance.
– $S^2_{\text{Pooled}} = \frac{df_1}{df_{\text{Total}}}(S^2_1) + \frac{df_2}{df_{\text{Total}}}(S^2_2)$
– $df_1 = N_1 - 1$ and $df_2 = N_2 - 1$; $df_{\text{Total}} = df_1 + df_2$
• Figure the variance of each distribution of means.
– $S^2_{M_1} = S^2_{\text{Pooled}} / N_1$
– $S^2_{M_2} = S^2_{\text{Pooled}} / N_2$
• Figure the variance of the distribution of differences between means.
– $S^2_{\text{Difference}} = S^2_{M_1} + S^2_{M_2}$
• Figure the standard deviation of the distribution of differences between means.
– $S_{\text{Difference}} = \sqrt{S^2_{\text{Difference}}}$
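The five steps can be followed line by line starting from raw scores. The sketch below is one way to do that with Python's standard library; the function name and the two small score lists are hypothetical and not taken from the text.

```python
import math
import statistics


def s_difference(scores1, scores2):
    """Return (S_Difference, df_total), following the five steps in order."""
    n1, n2 = len(scores1), len(scores2)
    # Step A: estimated population variance from each sample (divides by N - 1).
    s2_1 = statistics.variance(scores1)
    s2_2 = statistics.variance(scores2)
    # Step B: pooled estimate, weighted by degrees of freedom.
    df1, df2 = n1 - 1, n2 - 1
    df_total = df1 + df2
    s2_pooled = (df1 / df_total) * s2_1 + (df2 / df_total) * s2_2
    # Step C: variance of each distribution of means.
    s2_m1 = s2_pooled / n1
    s2_m2 = s2_pooled / n2
    # Step D: variance of the distribution of differences between means.
    s2_difference = s2_m1 + s2_m2
    # Step E: standard deviation is the square root of that variance.
    return math.sqrt(s2_difference), df_total


# Hypothetical scores for two independent groups of 7 people each.
group1 = [6, 4, 9, 7, 7, 3, 6]
group2 = [6, 1, 5, 3, 1, 1, 4]
sd_diff, df_total = s_difference(group1, group2)
print(round(sd_diff, 2), df_total)
```

For these made-up scores the variance estimates are 4.00 and 4.33, the pooled estimate is about 4.17, and the result is a standard deviation of about 1.09 with 12 degrees of freedom.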


The Shape of the Distribution of Differences Between Means

• Since the distribution of differences between means is based on estimated population variances:
– The distribution of differences between means is a t distribution.
– The variance of this distribution is figured based on population variance estimates from two samples.
• The degrees of freedom of this t distribution are the sum of the degrees of freedom of the two samples.
– $df_{\text{Total}} = df_1 + df_2$


The t score for the Difference Between the Two Actual Means

• Figure the difference between your two samples’ means.

• Figure out where this difference is on the distribution of differences between means.
– $t = (M_1 - M_2) / S_{\text{Difference}}$


Hypothesis Testing with a t Test for Independent Means

• The comparison distribution is a distribution of differences between means.

• The degrees of freedom for finding the cutoff on the t table are based on both samples.

• Your samples’ score on the comparison distribution is based on the difference between your two means.


Summary of t Test for Independent Means (Steps 1 and 2)

• Restate the question as a research hypothesis and a null hypothesis about the populations.
• Determine the characteristics of the comparison distribution.
– The mean will be 0.
– Figure the standard deviation:
» Figure the estimated population variances based on each sample: $S^2 = \frac{\sum (X - M)^2}{N - 1}$
» Figure the pooled estimate of the population variance: $S^2_{\text{Pooled}} = \frac{df_1}{df_{\text{Total}}}(S^2_1) + \frac{df_2}{df_{\text{Total}}}(S^2_2)$, where $df_1 = N_1 - 1$, $df_2 = N_2 - 1$, and $df_{\text{Total}} = df_1 + df_2$
» Figure the variance of each distribution of means: $S^2_{M_1} = S^2_{\text{Pooled}} / N_1$ and $S^2_{M_2} = S^2_{\text{Pooled}} / N_2$
» Figure the variance of the distribution of differences between means: $S^2_{\text{Difference}} = S^2_{M_1} + S^2_{M_2}$
» Figure the standard deviation of the distribution of differences between means: $S_{\text{Difference}} = \sqrt{S^2_{\text{Difference}}}$
– The comparison distribution will be a t distribution with $df_{\text{Total}}$ degrees of freedom.


Summary of t Test for Independent Means (Steps 3, 4, & 5)

• Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected.
– Determine the degrees of freedom ($df_{\text{Total}}$), desired significance level, and tails in the test (one or two).
– Look up the appropriate cutoff in a t table.
– If the exact df is not given, use the df below it.
• Determine your sample’s score on the comparison distribution.
– $t = (M_1 - M_2) / S_{\text{Difference}}$
• Decide whether to reject the null hypothesis.
– Compare your samples’ score on the comparison distribution to the cutoff t score.
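The rule "if the exact df is not given, use the df below it" can be sketched as a table lookup. The dictionary below is only a short excerpt of two-tailed .05 cutoffs as they appear in standard t tables; the names are hypothetical, and a real table has many more rows plus one-tailed and .01 columns.

```python
import bisect

# Excerpt of a t table: df -> cutoff for a two-tailed test at the .05 level.
T_CUTOFFS_TWO_TAILED_05 = {
    10: 2.228, 15: 2.131, 18: 2.101, 20: 2.086,
    25: 2.060, 30: 2.042, 40: 2.021, 60: 2.000,
}


def cutoff_for(df_total, table=T_CUTOFFS_TWO_TAILED_05):
    """Return (df_used, cutoff), falling back to the nearest tabled df below."""
    tabled_dfs = sorted(table)
    # Position of the first tabled df strictly greater than df_total.
    i = bisect.bisect_right(tabled_dfs, df_total)
    if i == 0:
        raise ValueError("df is smaller than every df in this table excerpt")
    df_used = tabled_dfs[i - 1]
    return df_used, table[df_used]


exact = cutoff_for(18)      # df = 18 is in the table
fallback = cutoff_for(35)   # df = 35 is not, so df = 30 is used
print(exact, fallback)
```

Using the lower df gives a slightly larger cutoff, so the rule errs on the side of making the null hypothesis harder to reject.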


Example of The t Test for Independent Means: Step 1

• Use the expressive writing study example from the text.
– You have a sample of 20 students who were recruited to take part in the study.
– 10 students were randomly assigned to the expressive writing group and wrote about their thoughts and feelings associated with their most traumatic life events.
– 10 students were randomly assigned to the control group and wrote about their plans for the day.
– One month later, all of the students rated their overall level of physical health on a scale from 0 (very poor health) to 100 (perfect health).
• Restate the question as a research hypothesis and a null hypothesis about the populations.
– Population 1: students who do expressive writing
– Population 2: students who write about a neutral topic (their plans for the day)
– Research hypothesis: Population 1 students would rate their health differently from Population 2 students (a two-tailed test).
– Null hypothesis: Population 1 students would rate their health the same as Population 2 students.


Example of t Test for Independent Means: Step 2

• Determine the characteristics of the comparison distribution.
– The comparison distribution is a distribution of differences between means.
– Its mean = 0.
– Figure the estimated population variances based on each sample.
• $S^2_1 = 94.44$ and $S^2_2 = 111.33$
– Figure the pooled estimate of the population variance.
• $S^2_{\text{Pooled}} = 102.89$
– Figure the variance of each distribution of means.
• $S^2_M = S^2_{\text{Pooled}} / N$
• $S^2_{M_1} = 10.29$
• $S^2_{M_2} = 10.29$
– Figure the variance of the distribution of differences between means.
• Adding up the variances of the two distributions of means gives $S^2_{\text{Difference}} = 20.58$
– Figure the standard deviation of the distribution of differences between means.
• $S_{\text{Difference}} = \sqrt{S^2_{\text{Difference}}} = \sqrt{20.58} = 4.54$
– The shape of the comparison distribution will be a t distribution with a total of 18 degrees of freedom.



Example of t Test for Independent Means: Step 3

• Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected.
– You will use a two-tailed test.
– If you also chose a significance level of .05, the cutoff scores from the t table would be 2.101 and -2.101.


Example of t Test for Independent Means: Step 4

• Determine your sample’s score on the comparison distribution.
– $t = (M_1 - M_2) / S_{\text{Difference}}$
– $t = (79.00 - 68.00) / 4.54 = 11.00 / 4.54 = 2.42$


Example of t Test for Independent Means: Step 5

• Decide whether to reject the null hypothesis.
– Compare your samples’ score on the comparison distribution to the cutoff t score.
– Your samples’ score is 2.42, which is larger than the cutoff score of 2.101.
– You can reject the null hypothesis.
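As a check on the arithmetic, the whole expressive writing example can be reproduced from its summary numbers (means of 79.00 and 68.00, variance estimates of 94.44 and 111.33, 10 students per group, cutoff of 2.101). The function name is hypothetical, and the cutoff is passed in by hand rather than computed.

```python
import math


def independent_t(m1, s2_1, n1, m2, s2_2, n2):
    """Return (t, df_total, s_difference) from each sample's summary numbers."""
    df1, df2 = n1 - 1, n2 - 1
    df_total = df1 + df2
    s2_pooled = (df1 / df_total) * s2_1 + (df2 / df_total) * s2_2
    # Variance of the distribution of differences = sum of the two S^2_M values.
    s_difference = math.sqrt(s2_pooled / n1 + s2_pooled / n2)
    t = (m1 - m2) / s_difference
    return t, df_total, s_difference


t, df_total, s_diff = independent_t(79.00, 94.44, 10, 68.00, 111.33, 10)

cutoff = 2.101                 # two-tailed, .05 level, df = 18 (from the t table)
reject_null = abs(t) > cutoff  # two-tailed: compare the absolute value of t

print(round(s_diff, 2), df_total, round(t, 2), reject_null)
```

Carrying full precision instead of rounding at each step gives the same values to two decimal places: an $S_{\text{Difference}}$ of 4.54 and a t of 2.42, which is beyond the cutoff.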


Assumptions of the t Test for Independent Means

• The population distributions are normal.
• The two populations have the same variance.
• Even if the distributions are not exactly normal or the variances are not exactly equal, the t test is still fairly accurate.
– If the populations are very far from normal, the variances are very different, or both, then the t test does not give accurate results and alternative tests should be used.


Effect Size and Power for the t Test for Independent Means

• Estimated Effect Size = (M1 – M2) / Spooled

– The estimated effect size is the difference between the sample means divided by the pooled estimate of the population's standard deviation.

• Power
– determined using a power table, computer software, or a power calculator (found online)
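A minimal sketch of the estimated effect size formula, applied to the expressive writing example's numbers (means of 79.00 and 68.00, pooled variance estimate of 102.89). The function name is hypothetical. Note that the divisor is the pooled standard deviation, the square root of the pooled variance, not $S_{\text{Difference}}$.

```python
import math


def estimated_effect_size(m1, m2, s2_pooled):
    """Difference between sample means divided by the pooled standard deviation."""
    s_pooled = math.sqrt(s2_pooled)
    return (m1 - m2) / s_pooled


effect_size = estimated_effect_size(79.00, 68.00, 102.89)
print(round(effect_size, 2))
```

For these numbers the pooled standard deviation is about 10.14, so the estimated effect size is about 1.08.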


Power When Sample Sizes Are Not Equal

• Harmonic Mean
– special average influenced more by smaller numbers

• It is used in a t test for independent means when the number of scores in the two groups differs.

– In such cases, the harmonic mean is used as the equivalent of each group’s sample size when determining power.

• It gives the equivalent sample size for what you would have if you had two equal samples.

• It is two times the first sample size multiplied by the second sample size—all divided by the sum of the two sample sizes.

Harmonic Mean = [2(N1)(N2)] / (N1 + N2)
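The harmonic mean formula can be sketched in a couple of lines. The function name and the sample sizes (6 and 34) are hypothetical, chosen to show how strongly the smaller group dominates.

```python
def harmonic_mean_n(n1, n2):
    """Equivalent per-group sample size for two groups of unequal size."""
    return (2 * n1 * n2) / (n1 + n2)


unequal = harmonic_mean_n(6, 34)   # 40 participants split very unevenly
equal = harmonic_mean_n(20, 20)    # the same 40 participants split evenly
print(unequal, equal)
```

With 40 participants split 6 and 34, the harmonic mean is 10.2, so power is about what you would get with two groups of roughly 10 each. Splitting the same 40 evenly gives 20 per group, which is why power is greatest when the two sample sizes are equal.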


Review of the t Test for a Single Sample, t Test for Dependent Means, and the t Test for Independent Means

• t Test for a Single Sample
– Population variance is not known.
– Population mean is known.
– There is 1 score for each participant.
– The comparison distribution is a t distribution.
– $df = N - 1$
– Formula: $t = (M - \text{Population } M) / S_M$
• t Test for Dependent Means
– Population variance is not known.
– Population mean is not known.
– There are 2 scores for each participant.
– The comparison distribution is a t distribution.
– The t test is carried out on a difference score.
– $df = N - 1$
– Formula: $t = (M - \text{Population } M) / S_M$
• t Test for Independent Means
– Population variance is not known.
– Population mean is not known.
– There is 1 score for each participant.
– The comparison distribution is a t distribution.
– $df_{\text{Total}} = df_1 + df_2$ ($df_1 = N_1 - 1$; $df_2 = N_2 - 1$)
– Formula: $t = (M_1 - M_2) / S_{\text{Difference}}$


The t Test for Independent Means in Research Articles

• When found in research articles, the results of these tests are accompanied by reporting of the means and sometimes the standard deviations.
– $t(df_{\text{Total}}) = x.xx$, $p < .01$


Key Points

• A t test for independent means is used for hypothesis testing with scores from two entirely separate groups of people. The comparison distribution is a distribution of differences between means of samples. The distribution can be thought of as being built up in two steps. Each population of individuals produces a distribution of means, and then a new distribution is created of differences between pairs of means selected from these two distributions of means.

• The distribution of differences between means has a mean of 0 and is a t distribution with the total of the degrees of freedom from the two samples. Its standard deviation is figured in several steps.
– Figure the estimated population variance based on each sample.
– Figure the pooled estimate of the population variance.
– Figure the variance of each distribution of means.
– Figure the variance of the distribution of differences between means.
– Figure the standard deviation of the distribution of differences between means.

• The assumptions of the t test for independent means are that the two populations have a normal distribution and have the same variance. However, the t test gives fairly accurate results when the distribution is not exactly normal or the variances are not exactly equal.

• Estimated effect size for a t test for independent means is the difference between the samples’ means divided by the pooled estimate of the population standard deviation. Power is greatest when the sample sizes of the two groups are equal. When they are not equal, you use the harmonic mean of the two sample sizes when looking up power on a table. Power for a t test for independent means can be determined using a power table, a power software package, or an internet power calculator.

• t tests for independent means are usually reported in research articles with the means of the two groups and the degrees of freedom, t score, and significance level. Results may also be reported in a table where significant differences are noted by asterisks.
