
Hypothesis Testing II

The Two-Sample Case

Introduction

In this chapter, we look at the difference between two separate populations, as opposed to the difference between a sample and a population, which was the subject of Chapter 8.

Examples: males compared with females, or people with no children compared with people with at least one child.

We cannot test all males and all females, so we need to draw random samples from the populations.

We want to determine whether the difference between the samples is real (statistically significant) rather than due to random chance.

Summary of Chapter

Difference between two groups' means for large samples

Difference between two groups' means for small samples

Difference between two groups' proportions for large samples

The chapter ends with the limitations of hypothesis testing

Hypothesis Testing with Sample Means

Large Samples

Assumptions

We need to assume that each sample is random, and also that the two samples are independent of each other.

When random samples are drawn in such a way that the selection of a case for one sample has no effect on the selection of cases for the other sample, the samples are independent.

To satisfy this requirement, you may randomly select cases from one list of the population, then subdivide that sample according to the trait of interest.

More Assumptions

In the two-sample case, the null is still a statement of “no difference”, but now we are saying that the two populations are “no different” from each other.

The null stated symbolically:

$$H_0: \mu_1 = \mu_2$$

Null Hypothesis

We know that the means of our samples are different, but we are stating in the null that the means are the same in the two populations.

If the test statistic falls in the critical region, we may conclude that the difference is unlikely to have occurred by random chance alone, and that there is a real difference between the two groups.

Test Statistic

In this chapter, the test statistic will be the difference in sample means.

If sample size is large, meaning that the combined number of cases in the two samples is larger than 100, the sampling distribution of the differences in sample means will be normal in form, and the standard normal curve can be used to find critical regions.

Instead of plotting sample means or proportions in the sampling distribution, we will plot the difference between the means of each sample.

Formula for Z (Obtained)

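The Formula:

$$Z(\text{obtained}) = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{X}-\bar{X}}}$$

where

$(\bar{X}_1 - \bar{X}_2)$ = the difference in the sample means

$(\mu_1 - \mu_2)$ = the difference in the population means

$\sigma_{\bar{X}-\bar{X}}$ = the standard deviation of the sampling distribution of the differences in sample means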

Revised Formula

We do not know the means of the populations in this chapter. We only know the means of the samples.

The expression for the difference in the population means is dropped from the equation because it equals zero: the null hypothesis assumes that the two population means are the same.

New Formula for Z (Obtained)

The Formula:

$$Z(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}-\bar{X}}}$$

Pooled Estimate

Use Formula 9.4 for the denominator if we do not know the population standard deviations (this is called the pooled estimate):

$$\sigma_{\bar{X}-\bar{X}} = \sqrt{\frac{s_1^2}{N_1 - 1} + \frac{s_2^2}{N_2 - 1}}$$

Interpretation

We are testing the hypothesis that women will be more supportive of gun control than men. First we need a statistical interpretation.

We know that there is a difference between the means of the two samples.

We are doing the test of hypothesis to see if the difference is large enough to justify the conclusion that it did not occur by random chance alone but reflects a significant difference between men and women on this issue.

We find that Z (obtained) is -2.80, and Z (critical) is plus or minus 1.96.

The test statistic falls in the critical region, so it is unlikely that the null is true.

So, we can conclude that the difference did not occur by random chance.
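The large-sample test described above can be sketched in a few lines of Python. This is only an illustration: the sample means, standard deviations, and sample sizes below are hypothetical values chosen so that the result lands near the Z (obtained) of -2.80 quoted above. They are assumptions, not figures taken from these slides. The function names are also made up for this sketch.

```python
import math
from statistics import NormalDist

def pooled_se_large(s1, n1, s2, n2):
    """Pooled estimate of the standard deviation of the sampling
    distribution of differences in sample means (large samples)."""
    return math.sqrt(s1**2 / (n1 - 1) + s2**2 / (n2 - 1))

def z_obtained(mean1, mean2, se):
    """Difference in sample means divided by the pooled estimate."""
    return (mean1 - mean2) / se

# Hypothetical sample statistics (assumed for illustration only).
men_mean, men_s, men_n = 6.2, 1.3, 324
women_mean, women_s, women_n = 6.5, 1.4, 317

se = pooled_se_large(men_s, men_n, women_s, women_n)
z = z_obtained(men_mean, women_mean, se)

# Two-tailed critical value at alpha = 0.05 from the standard normal curve.
alpha = 0.05
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

# Reject the null when the test statistic falls in the critical region.
reject = abs(z) > z_critical
print(f"Z(obtained) = {z:.2f}, Z(critical) = +/-{z_critical:.2f}, reject null: {reject}")
```

With these assumed inputs, the test statistic falls beyond the critical value, which matches the decision to reject the null described above.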

Sociological Interpretation

Begin by looking at which group has the lower mean.

For our groups, we find that men have a lower average score on the Support for Gun Control Scale, so they are less supportive of gun control than women.

We know that men and women are different in terms of their support for gun control. Why would this be true?

Hypothesis Testing with Sample Means

Small Samples

Distribution

With small samples, we cannot use the Z distribution for the sampling distribution of the difference between sample means.

Instead, we will use the t distribution to find the critical region for unlikely sample outcomes.

We will need to make two adjustments: the degrees of freedom and the formula for the pooled estimate.

The degrees of freedom now will be (N1 + N2) - 2.

Second Assumption

An additional assumption is that the variances of the populations of interest are equal.

We may assume equal population variances if the sample sizes are approximately equal.

If one sample is large and the other is small, we cannot use this test.

Formula for the Pooled Estimate

The formula for the pooled estimate of the standard deviation of the sampling distribution is different for small samples than for large samples:

$$\sigma_{\bar{X}-\bar{X}} = \sqrt{\frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}} \sqrt{\frac{N_1 + N_2}{N_1 N_2}}$$

Formula for t (obtained)

It has the same form as the formula for Z (obtained):

$$t(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}-\bar{X}}}$$
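As a rough sketch of the small-sample procedure, the snippet below computes the pooled estimate, t (obtained), and the degrees of freedom (N1 + N2) - 2. The sample statistics are hypothetical values assumed for illustration, not data from these slides, and the function names are invented for this example. The critical value is not computed here because the Python standard library has no t distribution. It would be looked up in a t table using the degrees of freedom.

```python
import math

def pooled_se_small(s1, n1, s2, n2):
    """Pooled estimate of the standard deviation of the sampling
    distribution for small samples (assumes equal population variances)."""
    pooled = math.sqrt((n1 * s1**2 + n2 * s2**2) / (n1 + n2 - 2))
    return pooled * math.sqrt((n1 + n2) / (n1 * n2))

def t_obtained(mean1, mean2, se):
    """Same form as Z(obtained): difference in means over the pooled estimate."""
    return (mean1 - mean2) / se

# Hypothetical sample statistics (assumed for illustration only).
mean1, s1, n1 = 11.3, 0.6, 42   # sample 1
mean2, s2, n2 = 10.8, 0.5, 37   # sample 2

se = pooled_se_small(s1, n1, s2, n2)
t = t_obtained(mean1, mean2, se)
df = (n1 + n2) - 2  # degrees of freedom for the two-sample t test

print(f"t(obtained) = {t:.2f} with df = {df}")
```

The obtained t would then be compared with the critical t for the chosen alpha level and these degrees of freedom.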

Interpretation of the Results

We are testing the hypothesis that people with children are happier than people without children.

Statistical interpretation:

We will use a two-tailed test, since no direction has been predicted.

The test statistic falls in the critical region, so married people with no children and married people with at least one child are significantly different on the variable satisfaction with family life.

Sociological Interpretation

Begin by comparing the means. Higher scores indicate greater satisfaction.

Who is in each sample? The samples were divided into respondents with no children and respondents with at least one child.

We find that the respondents with no children scored higher on this attitude scale. They are more satisfied with family life.

We know this difference is not due to chance, but is a real difference.

It completely contradicts our hypothesis.

Hypothesis Testing With Sample Proportions (Large Samples)

The null hypothesis states that no significant difference exists between the populations from which the samples are drawn.

We will use the formulas for proportions when the question involves a percentage.

Formula for Z (obtained)

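$$Z(\text{obtained}) = \frac{P_{s1} - P_{s2}}{\sigma_{p-p}}$$

where the standard deviation of the sampling distribution of the differences in sample proportions is

$$\sigma_{p-p} = \sqrt{P_u (1 - P_u)} \sqrt{\frac{N_1 + N_2}{N_1 N_2}}$$

and the pooled estimate of the population proportion is

$$P_u = \frac{N_1 P_{s1} + N_2 P_{s2}}{N_1 + N_2}$$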
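The test for sample proportions can be sketched as follows: estimate the pooled population proportion from the two sample proportions, use it to estimate the standard deviation of the sampling distribution, then divide the difference in sample proportions by that estimate. The sample sizes and proportions below are hypothetical values assumed for illustration, and the function names are invented for this sketch.

```python
import math
from statistics import NormalDist

def pooled_proportion(p1, n1, p2, n2):
    """Pooled estimate of the population proportion (P_u)."""
    return (n1 * p1 + n2 * p2) / (n1 + n2)

def se_proportions(p_u, n1, n2):
    """Standard deviation of the sampling distribution of the
    differences in sample proportions."""
    return math.sqrt(p_u * (1 - p_u)) * math.sqrt((n1 + n2) / (n1 * n2))

# Hypothetical sample proportions and sizes (assumed for illustration only).
p1, n1 = 0.34, 83
p2, n2 = 0.25, 103

p_u = pooled_proportion(p1, n1, p2, n2)
se = se_proportions(p_u, n1, n2)
z = (p1 - p2) / se

# Two-tailed critical value at alpha = 0.05.
z_critical = NormalDist().inv_cdf(0.975)
reject = abs(z) > z_critical
print(f"P_u = {p_u:.3f}, Z(obtained) = {z:.2f}, reject null: {reject}")
```

With these assumed inputs, the test statistic does not reach the critical region, so the null of no difference would not be rejected.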

The Limitations of Hypothesis Testing

For All Tests of Hypothesis

Probability of Rejecting the Null

The probability of rejecting the null is a function of four independent factors:

The size of the observed difference. The greater the difference, the more likely we are to reject the null.

The alpha level. The higher the alpha level, the greater the probability of rejecting the null hypothesis.

The use of one- or two-tailed tests. The use of the one-tailed test increases the probability of rejecting the null.

The size of the sample. The value of all test statistics increases with sample size (the relationship is direct, not inverse). The larger the sample, the higher the probability of rejecting the null hypothesis.
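The effect of sample size and of the choice between one- and two-tailed tests can be illustrated with a small sketch. It holds a hypothetical difference in means and a hypothetical standard deviation fixed (both are assumed values, not data from these slides) and recomputes the large-sample Z (obtained) for increasing sample sizes.

```python
import math
from statistics import NormalDist

def z_for_equal_groups(diff, s, n):
    """Large-sample Z(obtained) when both groups have size n and the
    same sample standard deviation s (a simplifying assumption)."""
    se = math.sqrt(s**2 / (n - 1) + s**2 / (n - 1))
    return diff / se

# Hypothetical fixed difference in means and standard deviation.
diff, s = 0.1, 1.0
sizes = [50, 200, 1000, 5000]

alpha = 0.05
two_tailed = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
one_tailed = NormalDist().inv_cdf(1 - alpha)      # about 1.645

# The same small difference yields a larger test statistic as N grows.
zs = [z_for_equal_groups(diff, s, n) for n in sizes]
for n, z in zip(sizes, zs):
    print(f"N per group = {n:5d}  Z = {z:.2f}  "
          f"reject (two-tailed): {z > two_tailed}  "
          f"reject (one-tailed): {z > one_tailed}")
```

The same small difference goes from non-significant to significant purely because the samples get larger, and the one-tailed critical value is easier to exceed than the two-tailed one at the same alpha level.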

Two Things to Remember about Sample Size

Larger samples are better approximations of the populations they represent, so decisions about rejecting or failing to reject the null that are based on larger samples can be regarded as more trustworthy.

The relationship between sample size and the test statistic shows the most significant limitation of hypothesis testing.

Limitation of Hypothesis Testing

The fact that a difference is statistically significant does not guarantee that it is important in any other sense.

This is particularly true with very large samples (N’s in excess of 1,000), where very small differences may be statistically significant.

Even with small samples, trivial differences may be statistically significant, since they represent differences in relation to the standard deviation of the population.

So, statistical significance is a necessary but not sufficient condition for theoretical importance

Once a research result has been found to be significant, the researcher still faces the task of evaluating the results in terms of the theory that guides the inquiry

Conclusion

A difference between samples that is shown to be statistically significant may not be theoretically important, practically important, or sociologically important.

Logic will have to determine that, along with measures of association that show the strength of the association.