Date posted: 22-Dec-2015
Cross-Tabulations
Cross-Tabs
The level of measurement used for cross-tabulations is mostly nominal. Even when continuous variables (such as age and income) are used, they are first converted to categorical variables.
When continuous variables are converted to categorical variables, important information (variation) is lost.
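To make the loss of variation concrete, here is a minimal sketch of binning a continuous age into categories. The bracket boundaries are an assumption borrowed from the age brackets on the survey form in this deck (< 18, 18 - 26, > 26); the helper name `age_category` and the sample ages are hypothetical.

```python
from bisect import bisect_right

# Assumed brackets, taken from the survey form: < 18, 18 - 26, > 26.
AGE_EDGES = [18, 27]                      # lower edges of the 2nd and 3rd brackets
AGE_LABELS = ["< 18", "18 - 26", "> 26"]

def age_category(age):
    """Map an age in whole years to its bracket label."""
    return AGE_LABELS[bisect_right(AGE_EDGES, age)]

ages = [17, 18, 22, 26, 27, 45]           # made-up illustrative values
categories = [age_category(a) for a in ages]
print(categories)
```

After binning, ages 18, 22 and 26 are indistinguishable, and so are 27 and 45. That collapsed within-bracket variation is the information that is lost.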
Prentice-Hall
Data Types
Data
• Numerical (Quantitative)
  – Discrete
  – Continuous
• Categorical (Qualitative)
Categorical Data
• Categorical random variables yield responses that classify
  – Example: Gender (female, male)
• Measurement reflects number in category
• Nominal or ordinal scale
  – Examples
    • Did you attend a community college?
    • Do you live on-campus or off-campus?
Why Concerned about Categorical Random Variables?
• Survey data tends to be categorical … hot/comfortable/cold, sunny/cloudy/fog/rain, yes/no…
• Know limitations
  – nature of relationship
  – causality
• Widely used in marketing for decision-making
Cross-Tabs
The chi-square (χ²) statistic is used to test the null hypothesis that the two variables are not dependent.
[Unfortunately, chi-square, like many other statistics that indicate statistical significance, tells us nothing about the magnitude of the relation.]
χ² Test of Independence
• Shows whether a relationship exists between two categorical variables
  – One sample is drawn
  – Does not show nature of relationship
  – Does not show causality
• Used widely in marketing
• Uses contingency table
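χ² Table (Portion)
What is the critical χ² value if the table has 2 rows and 3 columns, α = .05?
df = (2 - 1)(3 - 1) = 2

      Upper Tail Area
DF    .995    …    .95     …    .05
1     ...     …    0.004   …    3.841
2     0.010   …    0.103   …    5.991

Critical value: χ² = 5.991
[Figure: χ² distribution starting at 0, with the upper-tail area α = .05 to the right of 5.991 marked "Reject" and the area to the left marked "Do not reject H0".]
If fo = fe, χ² = 0.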
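The two critical values used in this deck (5.991 for df = 2 and 3.841 for df = 1) can be reproduced without a printed table, because the chi-square distribution has simple closed forms for those two cases. This is only a sketch under that restriction; the function name `chi2_critical_value` is hypothetical, and a general-purpose statistics library would be the usual choice for other degrees of freedom.

```python
import math
from statistics import NormalDist

def chi2_critical_value(alpha, df):
    """Upper-tail critical value of the chi-square distribution.

    Only df = 1 and df = 2 are handled, because both have closed forms.
    """
    if df == 1:
        # A chi-square variable with 1 df is a squared standard normal,
        # so the upper-tail area alpha splits into two normal tails.
        z = NormalDist().inv_cdf(1 - alpha / 2)
        return z * z
    if df == 2:
        # With 2 df the upper-tail area beyond x is exp(-x / 2).
        return -2 * math.log(alpha)
    raise ValueError("this sketch only covers df = 1 and df = 2")

print(round(chi2_critical_value(0.05, 2), 3))  # 5.991
print(round(chi2_critical_value(0.05, 1), 3))  # 3.841
```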
χ² Test of Independence: Hypotheses & Statistic
• Hypotheses
  – H0: Variables are not dependent
  – H1: Variables are dependent (related)
• Test statistic
  $$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$
  where $f_o$ is the observed frequency and $f_e$ is the expected frequency
• Degrees of freedom: (r - 1)(c - 1)
χ² Test of Independence: Expected Frequencies
• Statistical independence means joint probability equals product of marginal probabilities
  – P(A and B) = P(A)·P(B)
• Compute marginal probabilities
• Multiply for joint probability
• Expected frequency is sample size times joint probability
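The three steps above (marginal probabilities, their product, then times the sample size) can be sketched as follows. The counts are the Diet Coke by Diet Pepsi counts from the worked example in this deck; the function name `expected_frequencies` is hypothetical.

```python
def expected_frequencies(observed):
    """Expected cell counts under independence for a table of observed counts."""
    n = sum(sum(row) for row in observed)
    # Marginal probabilities of each row category and each column category.
    row_probs = [sum(row) / n for row in observed]
    col_probs = [sum(col) / n for col in zip(*observed)]
    # Joint probability under independence is the product of the marginals;
    # the expected frequency is the sample size times that joint probability.
    return [[n * pr * pc for pc in col_probs] for pr in row_probs]

# Rows: Diet Coke (No, Yes); columns: Diet Pepsi (No, Yes).
observed = [[84, 32], [48, 122]]
expected = expected_frequencies(observed)
for row in expected:
    print([round(fe, 1) for fe in row])
```

Rounded to one decimal place, this gives 53.5 and 62.5 for the first row and 78.5 and 91.5 for the second.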
χ² Test of Independence: An Example
You’re a marketing research analyst. You ask a random sample of 286 consumers if they purchase Diet Pepsi or Diet Coke. At the 0.05 level of significance, is there evidence of a relationship?

              Diet Pepsi
Diet Coke     No      Yes     Total
No            84      32      116
Yes           48      122     170
Total         132     154     286
Expected Frequencies
Expected frequency = (Row total × Column total) / Grand total
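Expected Frequencies (fe ≥ 1 in all cells)

              Diet Pepsi
              No              Yes
Diet Coke     Obs.    Exp.    Obs.    Exp.    Total
No            84      53.5    32      62.5    116
Yes           48      78.5    122     91.5    170
Total         132     132     154     154     286

Diet Coke No, Diet Pepsi No: 116·132/286 = 53.5
Diet Coke No, Diet Pepsi Yes: 116·154/286 = 62.5
Diet Coke Yes, Diet Pepsi No: 170·132/286 = 78.5
Diet Coke Yes, Diet Pepsi Yes: 170·154/286 = 91.5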
χ² Test of Independence
Cell fo fe fo - fe (fo - fe)² (fo - fe)²/ fe
1,1 84 53.5 +30.5 930.25 17.3879
1,2 32 62.5 -30.5 930.25 14.8840
2,1 48 78.5 -30.5 930.25 11.8503
2,2 122 91.5 +30.5 930.25 10.1667
Total 286 286 54.2889
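The table's arithmetic can be checked with a short sketch. It sums (fo - fe)²/fe over all cells, once with the rounded expected frequencies shown in the table and once with unrounded ones; the function name `chi_square_statistic` is hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (fo - fe)^2 / fe over all cells of two same-shaped tables."""
    return sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(observed, expected)
        for fo, fe in zip(obs_row, exp_row)
    )

observed = [[84, 32], [48, 122]]

# Expected frequencies rounded to one decimal, as in the table.
rounded_expected = [[53.5, 62.5], [78.5, 91.5]]
stat_rounded = chi_square_statistic(observed, rounded_expected)

# Unrounded expected frequencies: row total * column total / grand total.
n = sum(map(sum, observed))
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
exact_expected = [[r * c / n for c in col_totals] for r in row_totals]
stat_exact = chi_square_statistic(observed, exact_expected)

print(round(stat_rounded, 2), round(stat_exact, 2))
```

With the rounded expected frequencies the statistic is about 54.29, matching the table. Carrying the expected frequencies at full precision gives about 54.15, so the last digits of 54.2889 reflect rounding. Either value is far beyond the critical value.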
χ² Test of Independence
H0: Not Dependent
H1: Dependent
α = .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s): χ² = 3.841 (rejection region: χ² > 3.841, upper-tail area α = .05)
Test Statistic:
$$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e} = 54.2889$$
Decision: Reject at α = .05
Conclusion: There is evidence of a relationship
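As a cross-check on the decision, the same conclusion follows from a p-value. The sketch below relies on the fact that a chi-square variable with 1 degree of freedom is a squared standard normal, so it applies only to df = 1; the function name `chi2_p_value_df1` is hypothetical.

```python
import math

def chi2_p_value_df1(statistic):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    # P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    return math.erfc(math.sqrt(statistic / 2))

statistic = 54.2889      # test statistic from the worked example
critical_value = 3.841   # chi-square critical value for df = 1, alpha = .05
alpha = 0.05

p_value = chi2_p_value_df1(statistic)
reject = statistic > critical_value
print("Reject H0:", reject, "| p-value below alpha:", p_value < alpha)
```

The critical-value rule and the p-value rule always agree. Here both say to reject H0.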
Cross-Tabs
Please provide the requested information by checking (once) in each category. What is your:
age            ____ < 18    ____ 18 - 26    ____ > 26
gender         ____ male    ____ female
course load    ____ < 6 units    ____ 6 - 12 units    ____ > 12 units
gpa            ____ < 2.0    ____ 2.0 - 2.5    ____ 2.6 - 3.0    ____ 3.1 - 3.5    ____ > 3.5
annual income  ____ < $15k    ____ $15k - $40k    ____ > $40k
Cross-Tabs
The information is coded and entered in the file student.sf by letting the first response be recorded as a 1, the second as a 2, etc.
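The coding rule (first response is 1, second is 2, and so on) can be sketched as a lookup on the ordered list of options for each question. The dictionary `OPTIONS` and the function `encode` are hypothetical names for illustration; the layout of student.sf itself is not shown in this deck.

```python
# Ordered response options as they appear on the survey form.
OPTIONS = {
    "age": ["< 18", "18 - 26", "> 26"],
    "gender": ["male", "female"],
    "course load": ["< 6 units", "6 - 12 units", "> 12 units"],
    "gpa": ["< 2.0", "2.0 - 2.5", "2.6 - 3.0", "3.1 - 3.5", "> 3.5"],
    "annual income": ["< $15k", "$15k - $40k", "> $40k"],
}

def encode(question, answer):
    """Return 1 for the first option of a question, 2 for the second, etc."""
    return OPTIONS[question].index(answer) + 1

# One made-up respondent, for illustration only.
respondent = {"age": "18 - 26", "gender": "female", "gpa": "3.1 - 3.5"}
coded = {q: encode(q, a) for q, a in respondent.items()}
print(coded)
```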
Cross-Tabs
The hypothesis test is generally referred to as a test of dependence.
The researcher wishes to determine whether the variables are dependent, that is, whether they exhibit a relationship.
Cross-Tabs
Let’s investigate whether a relationship between a student’s gpa and units attempted exists.
H0: GPA and UNITS are not dependent
H1: GPA and UNITS are dependent.
Cross-Tabs
Chi-Square Test
------------------------------------------
Chi-Square Df P-Value
------------------------------------------
3.67 8 0.8853
------------------------------------------
Cross-Tabs
p-value = 0.8853, Retain H0
thus, GPA and UNITS are not dependent
[Based on our data, there is no evidence to support the concept that a relationship exists between gpa and units attempted.]
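The reported p-value can be checked approximately from the printed statistic. For an even number of degrees of freedom, the chi-square upper-tail area has a finite closed form, which the sketch below uses; the function name `chi2_p_value_even_df` is hypothetical, and the statistic 3.67 is itself rounded, so agreement is only to about three decimal places.

```python
import math

def chi2_p_value_even_df(statistic, df):
    """Upper-tail p-value of a chi-square statistic when df is even.

    Uses P(X > x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only covers positive even df")
    half = statistic / 2
    term = 1.0    # (x/2)^k / k! for k = 0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

p_gpa_units = chi2_p_value_even_df(3.67, 8)
print(round(p_gpa_units, 3))
```

The result is close to 0.885, far above .05, which is consistent with retaining H0.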
Cross-Tabs
Let’s investigate whether a relationship between a student’s age and units attempted exists.
H0: AGE and UNITS are not dependent
H1: AGE and UNITS are dependent.
Cross-Tabs
Chi-Square Test
------------------------------------------
Chi-Square Df P-Value
------------------------------------------
9.89 4 0.0423
------------------------------------------
Cross-Tabs
p-value = 0.0423, Reject H0
thus, AGE and UNITS are dependent
[Based on our data, there is sufficient evidence to support the concept that a relationship exists between age and units attempted.]
Cross-Tabs
Frequency Table for age by units
(each cell shows the count and its percentage of the column total)

                        Units
Age            <6        6-12      >12       AGE Total
---------------------------------------------------------
<18            10        19        17        46
               17.24%    20.88%    33.33%    23.00%
---------------------------------------------------------
18-26          24        22        16        62
               41.38%    24.18%    31.37%    31.00%
---------------------------------------------------------
>26            24        50        18        92
               41.38%    54.95%    35.29%    46.00%
---------------------------------------------------------
UNITS Total    58        91        51        200
               29.00%    45.50%    25.50%    100.00%
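The chi-square output reported earlier for age by units (9.89 with 4 degrees of freedom, p-value 0.0423) can be recomputed from the counts in this frequency table. The following is a minimal sketch using only the standard library; the p-value line uses the closed form that holds for df = 4 specifically.

```python
import math

# Rows: age (<18, 18-26, >26); columns: units (<6, 6-12, >12).
observed = [
    [10, 19, 17],
    [24, 22, 16],
    [24, 50, 18],
]

n = sum(map(sum, observed))
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]

# Expected frequency = row total * column total / grand total.
statistic = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / n
        statistic += (fo - fe) ** 2 / fe

df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 4 the upper-tail area is exp(-x/2) * (1 + x/2).
p_value = math.exp(-statistic / 2) * (1 + statistic / 2)

print(round(statistic, 2), df, round(p_value, 4))
```

The statistic comes out at about 9.89 with df = 4 and a p-value of about 0.042, in line with the reported output.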
Questions?
ANOVA