Www Psych Utah Edu Stat Introstats Web Text Chi Square Assoc
8/13/2019 Www Psych Utah Edu Stat Introstats Web Text Chi Square Assoc
Chi Square Test of Association
Copyright 2000, Tom Malloy
Test of Association
There are two different Chi Square tools. We discussed the Goodness of Fit Chi
Square in the previous lecture. Now we will discuss the Chi Square Test of
Association.
The Chi Square Test of Association was derived mathematically by Karl Pearson
early in the twentieth century, and is often known as Pearson's Chi Square Test of
Association. Pearson showed that the formulas for the Goodness of Fit statistic
we learned last lecture and the Test of Association statistic we will learn in this
lecture both have Chi Square as their sampling distribution. In this class we don't
have sufficient mathematical prerequisites to follow these proofs, but suffice it to
say they were great insights in their time, insights which have allowed substantial
scientific sophistication to be brought to bear on frequency-categorical data.
Quite often we have two categorical variables such as gender (male, female) and
job status (managers, clerks). We wonder whether there is an association (or
correlation) between the two categorical variables; that is, is there a relationship
between a person's gender and their job status? As another example, we might be
interested in the potential association of Political Party (Republican, Democratic,
Independent, Other) and Environmental Attitudes (Preservation, Development,
Other). The Chi Square Test of Association allows us to evaluate associations
(i.e., correlations) between categorical variables such as these.
We will begin our discussion of the Chi Square Test of Association with the three criteria that are necessary for its appropriate use.
THREE CRITERIA
1) Recall that in the Chi Square Goodness of Fit we needed a partition (a system of
mutually exclusive and exhaustive categories). Now, in the Chi Square Test of
Association, we need TWO partitions. 2) We also need some number of independent
observations. 3) Finally, we need frequency data.
CRITERION #1: 2 Partitions. In probability theory a partition is a mutually exclusive and exhaustive set of categories. As you know, categories are pigeon holes,
or places where we can put things conceptually. To be a partition, a set of categories has
to be both mutually exclusive and exhaustive. Mutually exclusive means that an observation
can go into one and only one category. No idea (nor thing, nor observation) may go in
more than one category.
Exhaustive means that the set of categories
covers every possible case. It means that every
object we observe can be put into one of the categories. More detail on these terms is
available in the Chi Square Goodness of Fit
lecture.
Often when you fill out a survey you'll find that
there is an extra category called "Other." "Other" is a great category because it ensures
that whatever set of choices is offered must be exhaustive, because if you don't fit in any
of the existing categories then you're in "Other".
Examples of Partitions
As we discussed in the Goodness of Fit lecture,
gender is a partition. It is a set of mutually
exclusive and exhaustive categories. There are
only two categories in gender--male and female.
You cannot be both at the same time, so male
and female are mutually exclusive. Moreover,
for human beings, gender is exhaustive
because everybody goes in one of the two categories. Male and female exhaust the
possibilities.
Let's look at another example of a partition. Let's say at a particular corporation we classify people according to their job status. At this particular corporation the job status
is either managerial or clerical and there are no other categories. (This is unrealistic, but it
keeps the calculations in this example simple.) These categories are mutually exclusive.
Any employee is either going to be a manager or they're going to be a clerk; they can't
be both. And, since we've said that those are the only job types, those two job types
exhaust all the possibilities.
Example using 2 Partitions
Now let's create a running example for this
lecture. We will classify every employee of a
business both by gender and by job status.
That is, we will apply two partitions in
categorizing people. As you can see from the
graphic, that results in four possibilities: Male
clerk, Female clerk, Male manager, and Female
manager.
Typically, in this kind of chi square you will make some kind of table. For this small
example we have a two by two (2 x 2) table; but the table could be a seven by four (7
x 4) table or whatever, depending on how many pigeon holes are in each partition.
CATEGORICAL VARIABLES. Another way to speak about this example is that we have
two categorical variables: Gender and Job Status. When you classify by two categorical
variables, thus creating all the possible combinations of the two (clerical male, clerical
female, managerial male, managerial female) it is called "crossing" the variables.
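As a minimal sketch of what crossing looks like in practice, the Python below tallies a handful of made-up observations into the four cells. The five sample records and the names `observations`, `cells`, and `cell_counts` are illustrative assumptions, not data from the lecture.

```python
from collections import Counter

# Hypothetical observations: each person is classified by BOTH partitions.
observations = [
    ("male", "clerk"),
    ("female", "clerk"),
    ("female", "clerk"),
    ("male", "manager"),
    ("female", "manager"),
]

genders = ["male", "female"]         # first partition
job_statuses = ["clerk", "manager"]  # second partition

# Crossing the two partitions gives every (gender, job status) combination.
cells = [(gender, status) for status in job_statuses for gender in genders]

# Frequency data: count how many observations fall into each cell.
cell_counts = Counter(observations)

for cell in cells:
    print(cell, cell_counts[cell])
```

Because each partition is mutually exclusive and exhaustive, every observation lands in exactly one of the four cells, so the cell counts add up to the number of observations.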
To do a Chi Square Test of Association you must classify each of your observations not
with just one but with two partitions.
In this simple example, male and female is one dimension and job status is the other
dimension. You'll notice that our requirement is that every person we observe in our
study can be classified by each of the partitions. So everybody will go into one of these
"cells" created by crossing these two partitions with each other.
CRITERION #2: N Independent Observations.
The second criterion for the Chi Square Test
of Association is that we have some number
of observations, each independent of the
others. In our running example, say that we
classify 200 people who work at a corporation by gender and job status. We will assume that these people are independent of each other and that they are just picked out at
random.
In the Goodness of Fit lecture we discussed the case of a rat in a T-maze. In a T-maze a
rat has to turn right or left so you have a partition for each trial. However, if you observe the
same rat over many trials, these observations would not be independent because the
particular rat may have a turning bias.
The important point is that you need to think about the observations that you are
making and decide whether they're independent or not. For our example we are
assuming that for each person who works at this particular company, their gender is
independent of the next person's. The fact that one person was born a man doesn't
have anything to do with someone else who works in a different part of the corporation
being a woman. Their births are independent of each other and have no relationship to
each other. The same must be true for job status.
CRITERION #3: Frequency Data. The third criterion is frequency data.
Frequency data means that we don't measure anything when we observe; all we do is put
the observations in categories and count the number of observations that fall into each
category.
Each time someone falls into a category you can make a little hash mark in one of the cells
of the table. We are just counting, not measuring.
Think about how simply counting is different from the measuring we did for the t tests. Our
dependent variable in t-tests has been things like someone's height or someone's weight,
or some number of puzzles solved correctly. In those examples when we observe a person
our dependent variable generated a measurement number - how many inches tall they are;
how many pounds they weigh; how many puzzles they solve; and we actually assign that number to that person. That kind of data is called measurement data; and it requires
statistics like t-tests, correlation coefficients, and so on.
For Chi Square, we do not measure the participants; we just count their frequency in
various categories.
Scientific Hypotheses. Let's say that the scientific hypothesis is that there is gender bias in hiring and promotion in this corporation. That means job status will depend
upon gender. More specifically it is more likely that the men will be managers and the
women will be clerks, which would be a classic kind of gender bias.
In contrast, the skeptic (or maybe the corporation's lawyer or public relations representative)
will say that hiring and promotion are completely fair. They will note that there will be
different proportions of men and women in different categories but this is simply due to
chance and there is nothing systematic about how different genders fall into different job
status categories.
So we have our scientific hypothesis of gender bias, and our skeptical hypothesis which
is saying hiring practices are fair and if the data shows differently then it is only due to
chance.
What is the Question? The essential question being asked by the Chi Square Test of Association is: "Is one way of categorizing things related to the other way of categorizing things, or are
they independent?" Another way of putting it is, "Are the two partitions (categorical variables) correlated
(associated) with each other or are they unrelated (independent)?"
In our example we are trying to determine whether gender is related to job status. Is there a correlation
(association) between a person's gender and her/his job status?
To answer this question we collect some data. We go to the business and collect the relevant information
on 200 employees. We put the information into a table like the one we've been showing which crosses
job status with gender.
ASSOCIATION MEANS PREDICTABILITY. So let's repeat our essential question in yet another way. Are
the two classifications correlated or associated or are they independent? That is, can you predict one
from the other? More specifically, can you predict someone's job status from their gender? If there is an
association between gender and status, it will mean that you can make a good guess about their job
status by knowing their gender.
Let's examine how the frequency data in the table would look in several cases from complete
dependence to complete independence. We examine 200 people from the business. I've made the example so that of these 200, 110 are women and 90 are men.
Complete Dependence
We'll start with the extreme case--complete
dependence between gender and job status. If
your study showed the observed frequencies in
the table on the graphic (not a single male clerk
and not a single female manager), that data would
have to come from watching old sitcoms from the
1950's. In any event, if your data looked like this,
where all 90 of the men are managers and all 110 of the women are clerks, the data would indicate complete dependence or complete predictability. In this example, the data would
demonstrate a case of extreme gender bias. The association between gender and job
status is as high as it can be.
If I told you that someone is a clerk and then asked you to guess what their gender is, you
would be able to predict their gender perfectly. If a person is a clerk, then she must be
female. If I said someone is a man, you could guess his job status perfectly; he must be a
manager. This is complete dependence of gender and status.
Strong Dependence
Here is an example of observed frequency data
indicating partial but strong dependence. Although this
example isn't quite so clear cut, there is still a
great deal of predictability between the two
categorical variables.
In the current graphic, out of 200 people, there
were 20 male clerks, 70 male managers; there
were 100 clerical females, and 10 female managers. In other words, 22% of the men are
clerks whereas 91% of the women are clerks.
Now you can't make a perfect prediction from gender to job status, but you can make a pretty good guess. If I said that somebody is a manager and asked you to guess whether
they're a male or a female, you could make a pretty good guess. You'd guess a manager
would be male. Now you wouldn't be right all of the time, but you'd be right most of the
time.
In this example there's strong predictability from gender to job status and vice versa. In
other words there is a strong association between gender and job status.
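To make the "pretty good guess" concrete, here is a small Python sketch that recomputes the percentages from the four observed frequencies in this example. The dictionary layout and the helper names `percent_clerks` and `percent_male` are assumptions made for illustration.

```python
# Observed frequencies from the "strong dependence" example (N = 200).
observed = {
    ("male", "clerk"): 20,
    ("male", "manager"): 70,
    ("female", "clerk"): 100,
    ("female", "manager"): 10,
}


def percent_clerks(gender):
    """Percentage of one gender who are clerks."""
    clerks = observed[(gender, "clerk")]
    managers = observed[(gender, "manager")]
    return 100 * clerks / (clerks + managers)


def percent_male(status):
    """Percentage of one job status who are male."""
    males = observed[("male", status)]
    females = observed[("female", status)]
    return 100 * males / (males + females)


print(round(percent_clerks("male")))    # 22
print(round(percent_clerks("female")))  # 91
print(percent_male("manager"))          # 87.5
```

With these counts, guessing "male" for a manager would be right for 70 of the 80 managers, or 87.5% of the time: most of the time, but not always.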
Independence
What would the data look like if the partitions
were independent or had no relationship
whatsoever? In such a case, you would not
be able to predict job status from gender any
better than chance.
I've changed the example such that half the employees are managers and half are clerks.
In line with this, you'll notice that half of the women are clerical and half are managers.
The same goes for the men, half of them are clerical and half of them are managers. In
other words, 50% of all employees are managers, AND 50% of the men are managers
and 50% of the women are managers.
I've made the example such that men and women are not equal in number. Out of every
100 people there are 45 men and 55 women. So 45% of the employees are men and 55%
of the employees are women. If we look at the 100 clerks, we find that 45% of them are
men and 55% of them are women. That is the same as the percentage of men and
women in the company. There appears to be no bias whatsoever, not even chance
variations.
NO ASSOCIATION. In this case then, we have independence of the two partitions or the
two category systems. The category system called gender is independent of the
category system called job status. Or in the language used in the Chi Square, there is
no association between gender and job status.
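The independence pattern can be checked with a short Python sketch. The four cell counts below are reconstructed from the description (100 clerks and 100 managers, each group 45% men and 55% women), since the graphic itself is not reproduced here; the function name `proportion_managers` is hypothetical.

```python
# Counts reconstructed from the text: 100 clerks and 100 managers,
# each group made up of 45 men and 55 women.
observed = {
    ("male", "clerk"): 45,
    ("female", "clerk"): 55,
    ("male", "manager"): 45,
    ("female", "manager"): 55,
}

n_total = sum(observed.values())


def proportion_managers(gender=None):
    """Proportion of managers overall, or within one gender."""
    if gender is None:
        managers = sum(
            count for (_, status), count in observed.items() if status == "manager"
        )
        return managers / n_total
    managers = observed[(gender, "manager")]
    return managers / (managers + observed[(gender, "clerk")])


print(proportion_managers())          # 0.5
print(proportion_managers("male"))    # 0.5
print(proportion_managers("female"))  # 0.5
```

Knowing a person's gender leaves the chance of being a manager unchanged at 50%, which is what independence of the two partitions means.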
So that's the question for this kind of test statistic. It's a different kind of question than the one you
would have for a t test which uses measurement data.
The Research Data
For our running example, let's suppose that our
research yields the data shown in the current
graphic.
N = 200 total people. Males = 90 out of 200 or
45%. Females = 110 out of 200 or 55%. Clerks =
120 out of 200 or 60%. Managers = 80 out of 200
or 40%.
DATA PATTERN
In this particular case, the data pattern fits the scientific hypothesis. You can argue it any number of ways. For instance, 120 out of 200, or 60% of all employees are clerical, but 90
out of 110, or 82% of females are clerical, and 30 out of 90 or 33% of the males are clerical.
These are the kinds of arguments that someone who thought there was gender bias would make. They'd say "Look, 60% of all employees are clerical, but 82% of the females
and only 33% of the males are clerical. There appears to be gender bias."
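The percentages in this argument can be reproduced with a few lines of Python. This is only a sketch: the four cell counts are inferred from the totals and percentages quoted in the text (the graphic is not available here), and the row/column layout and variable names are assumptions.

```python
# Observed frequencies for the running example.
# Rows are job status (clerks, managers); columns are gender (males, females).
observed = [
    [30, 90],  # clerks: 30 males, 90 females
    [60, 20],  # managers: 60 males, 20 females
]

row_totals = [sum(row) for row in observed]           # clerks, managers
column_totals = [sum(col) for col in zip(*observed)]  # males, females
n_total = sum(row_totals)

pct_clerical_all = 100 * row_totals[0] / n_total
pct_clerical_females = 100 * observed[0][1] / column_totals[1]
pct_clerical_males = 100 * observed[0][0] / column_totals[0]

print(row_totals, column_totals, n_total)
print(round(pct_clerical_all), round(pct_clerical_females), round(pct_clerical_males))
# 60 82 33
```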
CHANCE
The skeptic would say, "Well I think that data
pattern is just happening by chance." The plausible competing hypothesis (PCH)
of chance says that these data could have
come about by chance variations in hiring and
promotion.
EVALUATE THE PCH OF CHANCE
To deal with the plausible competing hypothesis
(PCH) of chance, we're going to do a Chi Square
Test of Association.
Let's calculate the Expected Frequencies
Review
The slide to the left and the two slides below
summarize the example. Review these three
slides and then we'll start calculating the expected
frequencies.
FORMULA FOR EXPECTED FREQUENCIES
This formula will make most sense after you
have worked through the example.
Calculating the Expected
Frequencies
We already have the observed frequencies for
each cell in our table. Now we will calculate an
expected frequency for each cell. The symbol
for expected frequency is fe. We will learn how
to calculate the expected frequencies by going
through all the cells in the example.
NOTATION. We want to be able to note and communicate which of the four cells we are
talking about at any given moment. The table is 2-dimensional, so we need two
dimensions to describe a location in it. By convention, we will call these two dimensions
"j" and "k." As the blue arrow in the graphic shows, the index j runs down rows; it tells
us which row we are in. And the index k runs across columns; it tells us which column
we are in. [It's arbitrary which dimension we call j and which we call k. We just have to
agree on which is which.] The general symbol for the expected frequency in a particular,
unspecified cell is fe(jk). This can be read as the expected frequency for the cell where
row j intersects with column k. [Generally, "jk" is a subscript but it is currently difficult
to write subscripts in web html text. So I'll use parentheses around jk when I need to be
clear. In obvious cases I'll just write the indices without the parentheses.]
Our notation is such that we put row (j) first and column (k) last when indexing a cell in a
table. So fe(11) is the expected frequency for the first row in the first column. fe(12)
indicates the expected frequency in row 1, column 2. fe(21) indicates the second row in
the first column; and fe(22) is the cell that is in the second row of the second column.
CALCULATING THE EXPECTED FREQUENCY FOR CELL 1-1. To repeat, the symbol,
fe11, is the designation we give for the cell where j =1 and k =1. In this example cell 1-1 is
the upper left hand cell (which is where we placed clerical male employees). The
expected frequency for that cell is determined by the total number in that row (which is
120) times the total number in that column (which is 90) divided by the total number of
observations in the whole table (which is 200).
This is somewhat confusing to describe in words. But it's really simple if you look at the
examples in the graphics. All you need to do is find the row total, multiply it times the
column total, and divide by the total total.
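Written out as a formula, the rule just described looks like the following. The symbols R_j (total of row j), C_k (total of column k), and N (total number of observations) are labels chosen here for illustration and may differ from the notation in the original graphic.

```latex
f_{e(jk)} = \frac{R_j \, C_k}{N},
\qquad
f_{e(11)} = \frac{120 \times 90}{200} = 54
```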
EXPECTED FREQUENCY FOR CELL 1,1. So you calculate the expected frequency for
the first cell as 120 times 90 over 200. fe11 is equal to 54.
Attempt to find the expected frequencies for the other cells on your own before you look
at the results below.
Expected Frequencies for
the Other Cells
Go ahead and find the expected frequency
for cell 1-2 (row 1, column 2) or the female
clerks cell.
One convenient way to summarize the
information you will need when we
eventually get to the formula is to write the
expected frequency in the cell with the
observed frequency.
fe(1,2). The expected frequency for row 1, column 2 is 120 times 110, over 200 which
you can see equals 66.
fe(2,1). For row 2, column 1 the expected frequency works out to be 90 times 80 over 200, or 36.
fe(2,2). Then the expected frequency for the cell in row two, column two is 110 times 80 over 200, or 44.
In the final graphic of the series, the table has observed frequencies, which are the black
colored numbers, and expected frequencies which are shown in blue.
The observed frequencies are the data. The expected frequencies are what the data would
have come out to be if Gender were NOT associated with Job Status.
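The four hand calculations above can be checked with a short Python sketch. It assumes the observed table is laid out with job status as rows (clerks, managers) and gender as columns (males, females), with cell counts inferred from the totals in the text; the function name `expected_frequencies` is hypothetical.

```python
def expected_frequencies(observed):
    """Expected frequency for each cell: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(col) for col in zip(*observed)]
    n_total = sum(row_totals)
    return [
        [row_total * column_total / n_total for column_total in column_totals]
        for row_total in row_totals
    ]


# Rows: clerks, managers. Columns: males, females.
observed = [
    [30, 90],
    [60, 20],
]

expected = expected_frequencies(observed)
print(expected)  # [[54.0, 66.0], [36.0, 44.0]]
```

The same function works for larger tables, such as a 7 x 4 table, because it only relies on row totals, column totals, and N.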
The Formula
Let's look at the formula. It tells you to sum up
the squared differences between the expected
and the observed frequencies and divide each
squared difference by the expected frequency.
This is the same as the Goodness of Fit, except that it is for a 2-dimensional case, so the
formula uses a double summation notation.
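Since the formula graphic is not reproduced here, the following is a reconstruction from the verbal description, not a copy of the original slide. It uses fo for an observed frequency and fe for an expected frequency, with J as the number of rows and K as the number of columns.

```latex
\chi^2 = \sum_{j=1}^{J} \sum_{k=1}^{K}
\frac{\left(f_{o(jk)} - f_{e(jk)}\right)^2}{f_{e(jk)}}
```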
DEGREES OF FREEDOM. The degrees of
freedom are the number of rows minus one
times the number of columns minus one. In terms of notation, capital J represents the
number of rows, and capital K represents the
number of columns.
Calculations
Here is what our 2 x 2 table would look like
given our data and the calculated expected
frequencies.
Next we determine the deviation between
observed frequency and the expected
frequency for each cell. We square the
deviation for each cell. Finally we divide the
squared deviation by the expected frequency
for that cell. The four graphics below show all
the calculations.
IN WORDS. Get a value for each cell by determining (fo - Fe) squared over Fe. In the
example the values for the four cells are 10.67, 8.73, 16, and 13.09.
Then sum up the values for all the cells to get the final chi square value. In the example chi
square = 10.67 + 8.73 + 16 + 13.09 = 48.49.
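The cell-by-cell calculation can be sketched as below. Treat it as an illustration under stated assumptions: the observed counts are NOT copied from the original graphics (which are not reproduced here) but are hypothetical values chosen to be consistent with the quoted cell values, and the expected frequencies 54 and 66 for row one are likewise inferred. The function name is made up for this sketch.

```python
def chi_square(observed, expected):
    """Return the per-cell values (fo - Fe)^2 / Fe and their sum."""
    cells = [
        [(fo - fe) ** 2 / fe for fo, fe in zip(o_row, e_row)]
        for o_row, e_row in zip(observed, expected)
    ]
    total = sum(sum(row) for row in cells)
    return cells, total


# Expected frequencies: 36 and 44 are quoted in the text; 54 and 66 are inferred.
expected = [[54, 66], [36, 44]]
# HYPOTHETICAL observed counts: each differs from its expected value by 24,
# the deviation implied by the quoted cell values. The direction of each
# deviation is an assumption and does not change the chi square.
observed = [[78, 42], [12, 68]]

cells, chi2 = chi_square(observed, expected)
print([[round(v, 2) for v in row] for row in cells])
print(round(chi2, 2))
```

With these inputs the four cell values round to 10.67, 8.73, 16, and 13.09, and the total comes to about 48.5.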
Degrees of Freedom
The formula for degrees of freedom is (the
number of rows minus one) times (the
number of columns minus one). It can be
symbolized by (J - 1) times (K - 1), or (J -
1)(K - 1). In this case, the degrees of freedom
equal (2 minus 1) times (2 minus 1), or just 1.
[Note: Capital J is used to indicate the
number of rows. As we've already said, little j
is the index for some particular row. The
same is true of K and k.]
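As a quick check of the (J - 1)(K - 1) rule, here is a minimal sketch. The function and parameter names are made up for illustration.

```python
def degrees_of_freedom(num_rows, num_cols):
    """Degrees of freedom for a J x K table: (J - 1)(K - 1)."""
    return (num_rows - 1) * (num_cols - 1)


print(degrees_of_freedom(2, 2))  # our 2 x 2 table -> 1
```

The same function gives 6 for a table with 3 rows and 4 columns.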
Now we can go on to the topic of statistical
conclusion validity.
STATISTICAL HYPOTHESES
The observed frequencies (fo's) are the data we
collect. The expected frequencies are what the
data should be if the two categorical variables are independent (not associated).
NULL HYPOTHESIS. The skeptic thinks that there is no association between the categorical
variables (in this case, between gender and status), so the observed frequencies should equal the expected frequencies apart from
chance differences. So the corresponding null hypothesis is that we expect the
difference between observed and expected frequency in each cell to equal zero.
ALTERNATIVE HYPOTHESIS. The scientist thinks that there IS an association between
the two categorical variables. So the scientist thinks the data will differ from the expected
frequencies. So the corresponding alternative hypothesis is that the difference between the
observed and expected frequencies will NOT be zero.
Statistical Conclusion Validity
Here we show the sampling distribution of chi
square again. The number line along the bottom
of the graph goes from zero to positive infinity
for chi square. Chi square is a squared entity -
every deviation in it is squared. Even if you get
negative differences in your cell calculations, they are going to be squared and made into
positive numbers. You cannot get a chi square
below zero. If you calculate a chi square below
zero, you made a mistake.
So the range of the Chi Square test statistic goes from zero to positive infinity.
What H0 predicts
If you picture the null hypothesis in your mind,
you'll remember that we expect the difference
between observed and expected frequencies to be
zero. If the data matched H0 exactly, then every term (for
every cell) in the Chi Square formula would be
zero. That is, there would be no differences
between the observed and the expected
frequencies in any cell. Therefore, each of those
differences would be zero, zero squared is zero,
and the whole Chi Square would be equal to
zero. So H0 is predicting values of Chi Square
near zero. So high values of Chi Square are not what H0 is predicting.
The Rejection Region
To find the critical value you need to know the
degrees of freedom and your selected alpha
level. Then you just look the critical value up in
your table. The critical value of chi square, with
one degree of freedom and alpha of .05, is 3.84.
Chi Square tables are available on the course
web site.
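If you don't have a table handy, the critical value for one degree of freedom can be recovered with Python's standard library. This is a sketch that only works for df = 1, where chi square is the square of a standard normal score, so the critical value is the two-tailed z critical value squared. The function name is made up for illustration.

```python
from statistics import NormalDist


def chi_square_critical_df1(alpha):
    """Critical chi square for df = 1 only: the squared two-tailed z value."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = .05
    return z ** 2


print(round(chi_square_critical_df1(0.05), 2))  # 3.84
```

For more than one degree of freedom this shortcut does not apply, and you would go back to the table.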
You draw your "Reject H0" and "Do not reject
H0" regions based on this critical value of 3.84.
We found that our calculated chi square of
48.49 falls in the rejection region; therefore, we
would reject H0.
Why we Reject H0
H0 is predicting that there will be no difference between expected and observed frequencies
and so you should get a chi square in the
neighborhood of zero if H0 is true. By chance
alone, you may get a Chi Square value bigger
than zero. However, the chance of getting a
value of Chi Square beyond 3.84 by chance
alone is small: only .05, our alpha level.
Therefore, using the logic we have used before
with other statistical tests, we'll reject H0
because it's very improbable that you would get
a Chi Square of this magnitude by chance
alone.
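To put a number on "very improbable," here is a sketch of the tail probability for one degree of freedom. It relies on the fact that for df = 1 the chance of a chi square beyond x equals erfc(sqrt(x / 2)); the function name is made up, and 48.48 is used as the approximate chi square from our example.

```python
import math


def chi_square_tail_df1(x):
    """P(chi square > x) when H0 is true, for df = 1 only."""
    return math.erfc(math.sqrt(x / 2))


print(round(chi_square_tail_df1(3.84), 3))   # 0.05, the alpha level
print(chi_square_tail_df1(48.48) < 1e-10)    # True
```

So a value at the critical point has about a 5 in 100 chance of occurring under H0, while a value near 48.5 is far less likely than that.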
Sampling Distribution of Chi Square
Let's review the sampling distribution of chi
square. This slide shows the overall four-step
process and is the same slide you saw for the
Goodness of Fit Chi Square.
The sampling distribution of the test statistic
you just calculated is called the Chi Square
probability distribution. This distribution starts
at zero and goes to positive infinity. Notice also
that it is not symmetrical. It's different than a
bell curve: it has a big lump down by zero,
where most of the probability is, and it has only
one tail, going off toward positive infinity.
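The shape described above can be seen in a small simulation. This is a sketch, assuming one degree of freedom, where a chi square value can be simulated as a squared standard normal score. The seed and sample size are arbitrary choices made for this illustration.

```python
import random

random.seed(0)  # arbitrary seed so the run is repeatable

# For df = 1, a chi square value behaves like a squared standard normal score.
draws = [random.gauss(0, 1) ** 2 for _ in range(100_000)]

min_draw = min(draws)
share_below_1 = sum(d < 1 for d in draws) / len(draws)
share_above_crit = sum(d > 3.84 for d in draws) / len(draws)

print(min_draw >= 0)                 # never below zero
print(round(share_below_1, 2))       # the big lump down by zero
print(round(share_above_crit, 2))    # the thin tail past the critical value
```

Roughly two thirds of the simulated values land below 1, and only about 5 in 100 land beyond 3.84, which is the lopsided, one-tailed picture the text describes.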
Copyright 1997, 2000 Tom Malloy