Class24 chi squaretestofindependenceposthoc(1)


SW318 Social Work Statistics

Slide 1: Chi-square Test of Independence

Reviewing the Concept of Independence

Steps in Testing Chi-square Test of Independence Hypotheses

Chi-square Test of Independence in SPSS

Slide 2: Chi-square Test of Independence

The chi-square test of independence is probably the most frequently used hypothesis test in the social sciences.

In this exercise, we will use the chi-square test of independence to evaluate group differences when the test variable is nominal, dichotomous, ordinal, or grouped interval.

The chi-square test of independence can be used for any variable; the group (independent) and the test variable (dependent) can be nominal, dichotomous, ordinal, or grouped interval.

Slide 3: Independence Defined

Two variables are independent if, for all cases, the classification of a case into a particular category of one variable (the group variable) has no effect on the probability that the case will fall into any particular category of the second variable (the test variable).

When two variables are independent, there is no relationship between them. We would expect the frequency breakdowns of the test variable to be similar for all groups.

Slide 4: Independence Demonstrated

Suppose we are interested in the relationship between gender and attending college.

If there is no relationship between gender and attending college and 40% of our total sample attend college, we would expect 40% of the males in our sample to attend college and 40% of the females to attend college.

If there is a relationship between gender and attending college, we would expect a higher proportion of one group to attend college than the other group, e.g. 60% to 20%.
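The comparison on this slide can be sketched in a few lines of Python. The counts are hypothetical (100 males and 100 females, chosen only to reproduce the 40% and 60%/20% figures above), and `college_rates` is an illustrative helper, not an SPSS function.

```python
def college_rates(attend_by_group, size_by_group):
    """Proportion attending college for each group and for the total sample."""
    rates = {g: attend_by_group[g] / size_by_group[g] for g in size_by_group}
    rates["Total"] = sum(attend_by_group.values()) / sum(size_by_group.values())
    return rates

# Hypothetical sample sizes: 100 males and 100 females.
sizes = {"Males": 100, "Females": 100}

# Independent: both groups match the overall rate of 40%.
independent = college_rates({"Males": 40, "Females": 40}, sizes)

# Dependent: 60% of males versus 20% of females, still 40% overall.
dependent = college_rates({"Males": 60, "Females": 20}, sizes)

print(independent)
print(dependent)
```

In the independent case every group rate equals the total rate; in the dependent case the group rates sit on either side of it.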

Slide 5: Displaying Independent and Dependent Relationships

[Figure: bar chart "Independent Relationship between Gender and College". Vertical axis: Proportion Attending College (0% to 100%). Males 40%, Females 40%, Total 40%.]

[Figure: bar chart "Dependent Relationship between Gender and College". Vertical axis: Proportion Attending College (0% to 100%). Males 60%, Females 20%, Total 40%.]

When the variables are independent, the proportion in each group is close to the proportion for the total sample.

When group membership makes a difference, the dependent relationship is indicated by one group having a higher proportion than the proportion for the total sample.

Slide 6: Expected Frequencies

Expected frequencies are computed as if there is no difference between the groups, i.e. both groups have the same proportion as the total sample in each category of the test variable.

Since the proportion of subjects in each category of the group variable can differ, we take group category into account in computing expected frequencies as well.

To summarize, the expected frequencies for each cell are computed to be proportional to both the breakdown for the test variable and the breakdown for the group variable.

Slide 7: Expected Frequency Calculation

The table “Observed Frequencies for Sample Data” is the source of the information needed to compute the expected frequencies. Percentages are computed for the column of all students and for the row of all GPAs. For each cell, the corresponding row percentage and column percentage are multiplied together and then by the total number of students in the sample (453) to give the expected frequency for that cell.
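As a rough illustration of this calculation, the sketch below computes expected frequencies as row total times column total divided by the grand total, which is the same as multiplying the row proportion, the column proportion, and the sample size. The observed counts are hypothetical (the 453-student table is not reproduced here), and `expected_frequencies` is an illustrative name.

```python
def expected_frequencies(observed):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Hypothetical observed counts (rows: test variable categories, columns: groups).
observed = [[30, 10],
            [20, 40]]

for row in expected_frequencies(observed):
    print(row)
```

Each expected row keeps the same row total as the observed table, and each expected column keeps the same column total; only the spread across cells changes.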

Slide 8: Expected Frequencies versus Observed Frequencies

The chi-square test of independence plugs the observed frequencies and expected frequencies into a formula that measures how much the pattern of observed frequencies differs from the pattern of expected frequencies.

Probabilities for the test statistic can be obtained from the chi-square probability distribution so that we can test hypotheses.

Slide 9: Independent and Dependent Variables

The two variables in a chi-square test of independence each play a specific role. The group variable is also known as the independent variable because it has an influence on the test variable.

The test variable is also known as the dependent variable because its value is believed to be dependent on the value of the group variable.

The chi-square test of independence is a test of the influence or impact that a subject’s value on one variable has on the same subject’s value for a second variable.

Slide 10: Step 1. Assumptions for the Chi-square Test

The chi-square test of independence can be used for variables at any level of measurement, including interval level variables grouped in a frequency distribution. It is most useful for nominal variables, for which we do not have another option.

Assumption: No cell has an expected frequency less than 5.

If this assumption is violated, the chi-square distribution will give us misleading probabilities.
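A minimal sketch of this check, assuming the expected counts are already available as a list of rows. The helper name `min_expected_ok` and both tables are hypothetical; the returned values mirror the kind of note SPSS prints under its Chi-Square Tests table (number of cells below 5 and the minimum expected count).

```python
def min_expected_ok(expected, minimum=5):
    """Return (assumption met?, number of cells below minimum, smallest expected count)."""
    cells = [e for row in expected for e in row]
    n_small = sum(1 for e in cells if e < minimum)
    return n_small == 0, n_small, min(cells)

# Hypothetical tables of expected counts.
large_sample = [[20.0, 20.0], [30.0, 30.0]]
small_sample = [[3.2, 6.8], [12.8, 27.2]]

print(min_expected_ok(large_sample))
print(min_expected_ok(small_sample))
```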

Slide 11: Step 2. Hypotheses and alpha

The research hypothesis states that the two variables are dependent or related. This will be true if the observed counts for the categories of the variables in the sample are different from the expected counts.

The null hypothesis is that the two variables are independent. This will be true if the observed counts in the sample are similar to the expected counts.

The amount of difference needed to make a decision about difference or similarity is the amount corresponding to the alpha level of significance, which will be either 0.05 or 0.01. The value to use will be stated in the problem.

Slide 12: Step 3. Sampling distribution and test statistic

To test the relationship, we use the chi-square test statistic, which follows the chi-square distribution.

If we were calculating the statistic by hand, we would have to compute the degrees of freedom to identify the probability of the test statistic. SPSS will print out the degrees of freedom and the probability of the test statistic for us.
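For readers curious about the hand calculation, here is a hedged Python sketch. The degrees of freedom for a table with r rows and c columns are (r - 1)(c - 1). The upper-tail probability is built from the closed forms for 1 and 2 degrees of freedom using only the standard `math` module. The function names are illustrative, and the 3 x 2 table shape used in the example is an assumption about the first practice problem (chi-square = 2.821), since the slides do not state the number of categories.

```python
import math

def degrees_of_freedom(n_rows, n_cols):
    """Degrees of freedom for an r x c contingency table: (r - 1) * (c - 1)."""
    return (n_rows - 1) * (n_cols - 1)

def chi2_upper_tail(x, df):
    """P(chi-square with integer df >= x), built up from the df = 1 or df = 2 closed form."""
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)             # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))  # closed form for df = 1
    while a < df / 2.0:                      # add one term for each extra 2 df
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q

df = degrees_of_freedom(3, 2)                # assumed 3 x 2 table
print(df, round(chi2_upper_tail(2.821, df), 3))
```

Under that assumed table shape, the computed probability agrees with the p = 0.244 reported in the first practice problem.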

Slide 13: Step 4. Computing the Test Statistic

Conceptually, the chi-square test of independence statistic is computed by summing, over all cells in the table, the squared difference between the observed and expected frequencies divided by the expected frequency for the cell.

We identify the value and probability for this test statistic from the SPSS statistical output.
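To make the computation concrete, the sketch below adds up each cell's contribution, (observed - expected) squared divided by expected. The counts are hypothetical and `chi_square_statistic` is an illustrative name; in the homework the value is read from the SPSS output rather than computed this way.

```python
def chi_square_statistic(observed, expected):
    """Sum over all cells of (observed - expected)^2 / expected."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

observed = [[30, 10], [20, 40]]          # hypothetical observed counts
expected = [[20.0, 20.0], [30.0, 30.0]]  # row total * column total / N

print(round(chi_square_statistic(observed, expected), 3))
```

Squaring keeps positive and negative differences from cancelling, and dividing by the expected count puts cells of different sizes on a comparable scale.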

Slide 14: Step 5. Decision and Interpretation

If the probability of the test statistic is less than or equal to the probability of the alpha error rate, we reject the null hypothesis and conclude that our data supports the research hypothesis. We conclude that there is a relationship between the variables.

If the probability of the test statistic is greater than the probability of the alpha error rate, we fail to reject the null hypothesis. We conclude that the data do not show a relationship between the variables, i.e. we treat them as independent.
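The decision rule can be written as a small function. This is only a sketch; `decide` is an illustrative name, and the two probabilities come from the practice problems later in these slides (p = 0.244, and p < 0.001 represented here by its upper bound 0.001).

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject the null hypothesis when p <= alpha."""
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

print(decide(0.244, alpha=0.05))   # practice problem 1
print(decide(0.001, alpha=0.05))   # practice problem 2, using 0.001 as an upper bound
```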

Slide 15: Which Cell or Cells Caused the Difference

We are only concerned with this procedure if the result of the chi-square test was statistically significant.

One of the problems in interpreting chi-square tests is the determination of which cell or cells produced the statistically significant difference. Examination of percentages in the contingency table and expected frequency table can be misleading.

The residual, or the difference, between the observed frequency and the expected frequency is a more reliable indicator, especially if the residual is converted to a z-score and compared to a critical value equivalent to the alpha for the problem.

Slide 16: Standardized Residuals

SPSS prints out the standardized residual (converted to a z-score) computed for each cell. It does not produce the probability or significance.

Without a probability, we will compare the size of the standardized residuals to the critical values that correspond to an alpha of 0.05 (+/-1.96) or an alpha of 0.01 (+/-2.58). The problems will tell you which value to use. This is equivalent to testing the null hypothesis that the actual frequency equals the expected frequency for a specific cell versus the research hypothesis of a difference greater than zero.

There can be 0, 1, 2, or more cells with statistically significant standardized residuals to be interpreted.
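As a sketch of the comparison described on this slide, the standardized residual for a cell can be taken as the residual divided by the square root of the expected frequency, then compared with the critical value. The function names and the cell counts below are hypothetical.

```python
import math

def standardized_residual(observed, expected):
    """(observed - expected) / sqrt(expected) for a single cell."""
    return (observed - expected) / math.sqrt(expected)

def is_significant(z, critical=1.96):
    """True when the residual falls beyond +/- the critical value."""
    return abs(z) > critical

z = standardized_residual(30, 20.0)       # hypothetical cell
print(round(z, 3))
print(is_significant(z, critical=1.96))   # alpha = 0.05
print(is_significant(z, critical=2.58))   # alpha = 0.01
```

The same cell can be significant at an alpha of 0.05 but not at 0.01, which is why the problem has to state which critical value to use.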

Slide 17: Interpreting Standardized Residuals

Standardized residuals that have a positive value mean that the cell was over-represented in the actual sample, compared to the expected frequency, i.e. there were more subjects in this category than we expected.

Standardized residuals that have a negative value mean that the cell was under-represented in the actual sample, compared to the expected frequency, i.e. there were fewer subjects in this category than we expected.

Slide 18: Interpreting Cell Differences in a Chi-square Test - 1

A chi-square test of independence of the relationship between sex and marital status finds a statistically significant relationship between the variables.

Slide 19: Interpreting Cell Differences in a Chi-square Test - 2

Researchers often try to identify which cell or cells are the major contributors to the significant chi-square test by examining the pattern of column percentages.

Based on the column percentages, we would identify cells on the married row and the widowed row as the ones producing the significant result because they show the largest differences: 8.2% on the married row (50.9%-42.7%) and 9.0% on the widowed row (13.1%-4.1%).

Slide 20: Interpreting Cell Differences in a Chi-square Test - 3

Using a level of significance of 0.05, the critical values for a standardized residual would be -1.96 and +1.96. Using standardized residuals, we would find that only the cells on the widowed row are significant contributors to the chi-square relationship between sex and marital status.

If we interpreted the contribution of the married marital status, we would be mistaken. Basing the interpretation on column percentages can be misleading.

Slide 21: Chi-Square Test of Independence: post hoc practice problem 1

This question asks you to use a chi-square test of independence and, if significant, to do a post hoc test using 1.96 as the critical value.

First of all, the level of measurement for the independent and the dependent variable can be any level that defines groups (dichotomous, nominal, ordinal, or grouped interval). "degree of religious fundamentalism" [fund] is ordinal and "sex" [sex] is dichotomous, so the level of measurement requirements are satisfied.

Slide 22: Chi-Square Test of Independence: post hoc test in SPSS (1)

You can conduct a chi-square test of independence with the Crosstabs procedure in SPSS by selecting:

Analyze > Descriptive Statistics > Crosstabs…

Slide 23: Chi-Square Test of Independence: post hoc test in SPSS (2)

First, select and move the variables for the question to the “Row(s):” and “Column(s):” list boxes.

The variable mentioned first in the problem, [sex], is used as the independent variable and is moved to the “Column(s):” list box.

The variable mentioned second in the problem, [fund], is used as the dependent variable and is moved to the “Row(s):” list box.

Second, click on the “Statistics…” button to request the test statistic.

Slide 24: Chi-Square Test of Independence: post hoc test in SPSS (3)

First, click on “Chi-square” to request the chi-square test of independence.

Second, click on the “Continue” button to close the Statistics dialog box.

Slide 25: Chi-Square Test of Independence: post hoc test in SPSS (4)

Now click on the “Cells…” button to specify the contents of the cells in the crosstabs table.

Slide 26: Chi-Square Test of Independence: post hoc test in SPSS (5)

First, make sure both “Observed” and “Expected” in the “Counts” section of the “Crosstabs: Cell Display” dialog box are checked.

In the “Residuals” section, select “Unstandardized” and “Standardized” residuals, then click on the “Continue” and “OK” buttons.
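To show what these cell options correspond to, here is a Python sketch (not SPSS syntax) that builds the four quantities for every cell from raw case data: observed count, expected count, unstandardized residual, and standardized residual. The category labels, the counts, and the helper name `crosstab_cells` are all hypothetical.

```python
from collections import Counter
import math

def crosstab_cells(pairs):
    """Observed, expected, residual and standardized residual for each (row, column) cell."""
    observed = Counter(pairs)
    row_totals = Counter(r for r, _ in pairs)
    col_totals = Counter(c for _, c in pairs)
    n = len(pairs)
    cells = {}
    for r in sorted(row_totals):
        for c in sorted(col_totals):
            o = observed[(r, c)]
            e = row_totals[r] * col_totals[c] / n
            cells[(r, c)] = (o, e, o - e, (o - e) / math.sqrt(e))
    return cells

# Hypothetical raw data: one (row category, column category) pair per case.
pairs = ([("high", "male")] * 10 + [("high", "female")] * 30
         + [("low", "male")] * 40 + [("low", "female")] * 20)

for cell, (o, e, resid, z) in crosstab_cells(pairs).items():
    print(cell, o, round(e, 1), round(resid, 1), round(z, 2))
```

The row variable plays the role of the dependent variable and the column variable the role of the independent variable, matching the list boxes used in the dialog.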

Slide 27: Chi-Square Test of Independence: post hoc test in SPSS (6)

In the Chi-Square Tests table, SPSS also tells us that “0 cells have expected count less than 5 and the minimum expected count is 70.63”.

The sample size requirement for the chi-square test of independence is satisfied.

Slide 28: Chi-Square Test of Independence: post hoc test in SPSS (7)

The probability of the chi-square test statistic (chi-square=2.821) was p=0.244, greater than the alpha level of significance of 0.05. The null hypothesis that differences in "degree of religious fundamentalism" are independent of differences in "sex" is not rejected.

The research hypothesis that differences in "degree of religious fundamentalism" are related to differences in "sex" is not supported by this analysis.

Thus, the answer for this question is False. We do not interpret cell differences unless the chi-square test statistic supports the research hypothesis.

Slide 29: Chi-Square Test of Independence: post hoc practice problem 2

This question asks you to use a chi-square test of independence and, if significant, to do a post hoc test using -1.96 as the critical value.

First of all, the level of measurement for the independent and the dependent variable can be any level that defines groups (dichotomous, nominal, ordinal, or grouped interval). [empathy3] is ordinal and [sex] is dichotomous, so the level of measurement requirements are satisfied.

Slide 30: Chi-Square Test of Independence: post hoc test in SPSS (8)

You can conduct a chi-square test of independence with the Crosstabs procedure in SPSS by selecting:

Analyze > Descriptive Statistics > Crosstabs…

Slide 31: Chi-Square Test of Independence: post hoc test in SPSS (9)

First, select and move the variables for the question to the “Row(s):” and “Column(s):” list boxes.

The variable mentioned first in the problem, [sex], is used as the independent variable and is moved to the “Column(s):” list box.

The variable mentioned second in the problem, [empathy3], is used as the dependent variable and is moved to the “Row(s):” list box.

Second, click on the “Statistics…” button to request the test statistic.

Slide 32: Chi-Square Test of Independence: post hoc test in SPSS (10)

First, click on “Chi-square” to request the chi-square test of independence.

Second, click on the “Continue” button to close the Statistics dialog box.

Slide 33: Chi-Square Test of Independence: post hoc test in SPSS (11)

Now click on the “Cells…” button to specify the contents of the cells in the crosstabs table.

Slide 34: Chi-Square Test of Independence: post hoc test in SPSS (12)

First, make sure both “Observed” and “Expected” in the “Counts” section of the “Crosstabs: Cell Display” dialog box are checked.

In the “Residuals” section, select “Unstandardized” and “Standardized” residuals, then click on the “Continue” and “OK” buttons.

Slide 35: Chi-Square Test of Independence: post hoc test in SPSS (13)

In the Chi-Square Tests table, SPSS also tells us that “0 cells have expected count less than 5 and the minimum expected count is 6.79”.

The sample size requirement for the chi-square test of independence is satisfied.

Slide 36: Chi-Square Test of Independence: post hoc test in SPSS (14)

The probability of the chi-square test statistic (chi-square=23.083) was p<0.001, less than or equal to the alpha level of significance of 0.05. The null hypothesis that differences in "accuracy of the description of feeling protective toward people being taken advantage of" are independent of differences in "sex" is rejected.

The research hypothesis that differences in "accuracy of the description of feeling protective toward people being taken advantage of" are related to differences in "sex" is supported by this analysis.

Now, you can examine the post hoc test using the given critical value.

Slide 37: Chi-Square Test of Independence: post hoc test in SPSS (15)

The residual is the difference between the actual frequency and the expected frequency (58-79.2=-21.2).

When converted to a z-score, the standardized residual (-2.4) was smaller than the critical value (-1.96), supporting a specific finding that among survey respondents who were male, there were fewer who said that feeling protective toward people being taken advantage of describes them very well than would be expected.

The answer to the question is True.
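The numbers on this slide can be checked with a short sketch, assuming the standardized residual is the residual divided by the square root of the expected frequency. Only the observed count (58), the expected count (79.2), and the critical value (-1.96) come from the slide; the variable names are illustrative.

```python
import math

observed, expected = 58, 79.2   # values reported on this slide
critical = -1.96                # negative critical value given in the problem

residual = observed - expected
z = residual / math.sqrt(expected)

print(round(residual, 1), round(z, 1))
if z < critical:
    print("Cell is under-represented: fewer cases than expected.")
```

Rounded to one decimal place, the results match the residual and standardized residual quoted above.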

Slide 38: Steps in solving chi-square test of independence: post hoc problems - 1

The following is a guide to the decision process for answering homework problems about chi-square test of independence post hoc problems:

Are the dependent and independent variables nominal, ordinal, dichotomous, or grouped interval?

No: Incorrect application of a statistic.

Yes: Continue to the check of expected cell counts.

Slide 39: Steps in solving chi-square test of independence: post hoc problems - 2

Expected cell counts less than 5?

Yes: Incorrect application of a statistic.

No: Compute the chi-square test of independence, requesting standardized residuals in the output.

Is the p-value for the chi-square test of independence <= alpha?

No: The answer is False.

Yes: Continue to the standardized residual for the specified cell.

Slide 40: Steps in solving chi-square test of independence: post hoc problems - 3

Identify the cell in the crosstabs table that contains the specific relationship in the problem.

Is the value of the standardized residual for the specified cell larger (smaller) than the positive (negative) critical value given in the problem?

No: The answer is False.

Yes: Is the relationship correctly described?

No: The answer is False.

Yes: The answer is True.
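The whole decision process from the last three slides can be sketched as one function. The function name and its parameters are hypothetical; the two calls use the figures reported for the practice problems (minimum expected counts of 70.63 and 6.79, p = 0.244 and p < 0.001 represented by 0.001, and a standardized residual of -2.4 against a critical value of -1.96).

```python
def answer_post_hoc_problem(levels_ok, min_expected, p_value, alpha,
                            std_residual, critical, described_correctly):
    """Walk the decision process and return the homework answer."""
    if not levels_ok:
        return "Incorrect application of a statistic"
    if min_expected < 5:
        return "Incorrect application of a statistic"
    if p_value > alpha:
        return "False"
    # Positive critical value: residual must be larger; negative: smaller.
    if critical > 0:
        beyond = std_residual > critical
    else:
        beyond = std_residual < critical
    if not beyond:
        return "False"
    return "True" if described_correctly else "False"

# Practice problem 1: the chi-square test is not significant, so residuals are never examined.
print(answer_post_hoc_problem(True, 70.63, 0.244, 0.05, None, 1.96, True))

# Practice problem 2: significant test, residual beyond the negative critical value.
print(answer_post_hoc_problem(True, 6.79, 0.001, 0.05, -2.4, -1.96, True))
```

Checking the overall test before any residual keeps the post hoc step from being interpreted when the chi-square result does not support the research hypothesis.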