Oct. 9 - personal.psu.edu/drh20/100/lectures/lecture17Oct9.pdf · Oct. 9...
Oct. 9 Assignment: Read Chapter 12
Cell phone ownership: Fall 2001
So 66.2% of women in the sample say yes but only 45.7% of men in the sample say yes. Are they statistically significantly different?
Rows: sex   Columns: cellphone

          no       yes      All
female    26       51       77
          33.8%    66.2%    100.0%
male      19       16       35
          54.3%    45.7%    100.0%
Expected counts are below observed counts

          no       yes      Total
Female    26       51       77
          30.94    46.06
Male      19       16       35
          14.06    20.94
Total     45       67       112

Chi-Sq = 0.788 + 0.529 + 1.734 + 1.164 = 4.215
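The arithmetic behind the table above can be sketched in a few lines of Python. This is only an illustration, not the software used to produce the slide output; the variable names are made up for the example. Each expected count is (row total × column total) / grand total, and each cell contributes (observed − expected)² / expected to the chi-squared statistic.

```python
# Observed counts from the Fall 2001 table: rows = sex, columns = (no, yes).
observed = [[26, 51],   # female
            [19, 16]]   # male

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for a cell = (row total * column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-squared adds (observed - expected)^2 / expected over all four cells.
chi_sq = sum((o - e) ** 2 / e
             for obs_row, exp_row in zip(observed, expected)
             for o, e in zip(obs_row, exp_row))

print([[round(e, 2) for e in row] for row in expected])
print(round(chi_sq, 3))
```

Rounded to the same number of places, the expected counts and the statistic should agree with the values shown in the table.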
FALL 2001 Cell Phone Results
What is the correct conclusion from a chi-squared statistic of 4.215?
(A) We have evidence that the skeptic is correct. (B) We have evidence that the skeptic is incorrect. (C) We do not have evidence that the skeptic is correct. (D) We do not have evidence that the skeptic is incorrect.
[Figure: chi-squared distribution with 1 degree of freedom. The area above 3.84 is .05, so 95% of the area lies below 3.84 and 5% lies above it. If chi-squared falls in the upper 5% region, it is declared large and the research advocate wins. Our chi-squared, 4.215, is marked to the right of 3.84.]
But our chi-squared is 4.215 so the research advocate wins! There was a statistically significant difference in 2001.
We see a significant difference
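The comparison with the cutoff can be written as a small rule. The sketch below also computes the area to the right of a chi-squared value for the 1-degree-of-freedom case, using the fact that a chi-squared variable with 1 df is the square of a standard normal variable. The function names are invented for this illustration, and the p-value formula applies only when df = 1.

```python
import math

CUTOFF_1DF = 3.84  # from the lecture: the area above 3.84 is .05 when df = 1


def p_value_1df(chi_sq):
    """Area to the right of chi_sq under the chi-squared curve with 1 df."""
    # With 1 df, chi-squared is Z^2 for a standard normal Z, so
    # P(Z^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi_sq / 2))


def is_significant(chi_sq, cutoff=CUTOFF_1DF):
    """True when chi-squared lands in the upper 5% region."""
    return chi_sq > cutoff


print(is_significant(4.215))
print(p_value_1df(4.215))
```

A statistic of exactly 3.84 gives an area of about .05, which is where the cutoff comes from.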
Change over time: Cell phone ownership for STAT 100 students
Semester       Women              Men                Significant difference?
Fall 2001      66.2% (of 77)      45.7% (of 35)      Yes
Spring 2004    91.2% (of 136)     86.1% (of 101)     No
Spring 2005    97.4% (of 117)     94.2% (of 103)     No
Fall 2005      97.8% (of 138)     95.1% (of 82)      No
Fall 2008      100.0% (of 130)    100.0% (of 96)     No
Spring 2005: A cautionary tale
Rows: Sex   Columns: Cellphone

          No       Yes      All
Female    3        114      117
          4.8      112.2
Male      6        97       103
          4.2      98.8
All       9        211      220
Note that two of the expected counts are smaller than 5.
This can make our results somewhat iffy.
The best approach in this case: Report the result (no significant difference) but point out the small expected counts of 4.79 and 4.21.
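A quick way to spot this situation is to compute the expected counts and list any that fall below 5. The following is a minimal sketch using the Spring 2005 counts; the threshold of 5 is the one named in the slide, and the variable names are chosen only for this example.

```python
# Spring 2005 observed counts: rows = sex, columns = (No, Yes).
observed = [[3, 114],   # Female
            [6, 97]]    # Male

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count = (row total * column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Collect every expected count below the rule-of-thumb threshold of 5.
small_expected = [round(e, 2) for row in expected for e in row if e < 5]
print(small_expected)
```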
Why 1 degree of freedom?

          No       Yes      All
Female                      136
Male                        101
All       26       211      237
The gray box is the ONLY one we can fill freely. Once that box is filled, all others are determined by margins!
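To see why one cell pins down the rest, here is a small sketch that fills a 2x2 table from a single cell and the margins. It assumes the free cell is the Female/No cell, which the text does not specify, and the value 12 is chosen for illustration (it is consistent with the Spring 2004 percentages). The function name is hypothetical.

```python
def fill_2x2(top_left, row_totals, col_totals):
    """Fill a 2x2 table given one cell and the row and column totals."""
    r1, r2 = row_totals
    c1, _c2 = col_totals
    top_right = r1 - top_left        # the rest of row 1
    bottom_left = c1 - top_left      # the rest of column 1
    bottom_right = r2 - bottom_left  # the rest of row 2
    return [[top_left, top_right],
            [bottom_left, bottom_right]]


# Margins from the slide: Female 136, Male 101; No 26, Yes 211.
table = fill_2x2(12, row_totals=(136, 101), col_totals=(26, 211))
print(table)
```

Every other cell is obtained by subtraction, so there is only one free choice.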
Consider a 2x3 table (data from FA13):

Rows: Sex   Columns: Voted in 2012?

          Ineligible   No       Yes      All
Female    15           31       22       68
          22.1%        45.6%    32.4%    100.0%
Male      14           15       19       48
          29.2%        31.3%    39.6%    100.0%
All       29           46       41       116
How many degrees of freedom here?
          Always    Sometimes   Never    All
Women     One df    Two df               68
Men                                      48
All       29        46          41       116
Degrees of freedom (df) always equal
(Number of rows – 1) × (Number of columns – 1)
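The formula is easy to express as a one-line function. This is a minimal sketch; the function name is invented for the example.

```python
def degrees_of_freedom(n_rows, n_cols):
    """df for a chi-squared test on a table with n_rows rows and n_cols columns."""
    return (n_rows - 1) * (n_cols - 1)


print(degrees_of_freedom(2, 2))  # the cell phone tables
print(degrees_of_freedom(2, 3))  # the voting table
```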
Here is the chi-squared test:
Chi-Square = 2.44, DF = 2, cutoff for DF=2 is 5.991
There is no evidence of different 2012 voting patterns between men and women (in whatever population is represented here).
Rows: Sex   Columns: Voted in 2012?

          Ineligible   No       Yes      All
Female    15           31       22       68
          17.0         27.0     24.0
Male      14           15       19       48
          12.0         19.0     17.0
All       29           46       41       116
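The same calculation works for a table of any size. The sketch below is one possible way to write it, not the software that produced the output above; it takes the cutoff 5.991 for DF = 2 from the slide rather than computing it, and the function name is hypothetical.

```python
def chi_squared(observed):
    """Return (statistic, expected counts, df) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count = (row total * column total) / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum((o - e) ** 2 / e
                    for obs_row, exp_row in zip(observed, expected)
                    for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, expected, df


# Voted in 2012? Columns: Ineligible, No, Yes.
voted = [[15, 31, 22],   # Female
         [14, 15, 19]]   # Male

statistic, expected, df = chi_squared(voted)
CUTOFF_2DF = 5.991  # cutoff for DF = 2, as given in the lecture
print(round(statistic, 2), df, statistic > CUTOFF_2DF)
```

Because the statistic falls below the cutoff, the comparison comes out False, matching the conclusion of no evidence of a difference.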
Now for a 2x4 table (from STAT 100 FA08):

Rows: Eyelens   Columns: Eyecol

            Blue     Brown    Green    Hazel    All
No Lens     39       50       18       15       122
            32.0%    41.0%    14.8%    12.3%    100.0%
Yes Lens    33       40       11       19       103
            32.0%    38.8%    10.7%    18.4%    100.0%
Total       72       90       29       34       225
Chi-Square = 2.18, DF = 3, cutoff for DF=3 is 7.815
There is no evidence of different eye-color patterns between the lens-wearing population and the non-lens-wearing population.
How many degrees of freedom are there for a chi-squared test on a 3x5 table?
(A) 35 (B) 15 (C) 12 (D) 8 (E) 4
Health studies and risk
Hypothetical research question: Do strong electromagnetic fields cause cancer?
50 dogs randomly split into two groups: no field, yes field.
The response is whether they get lymphoma.
Rows: mag field   Columns: cancer

        no       yes      All
no      20       5        25
yes     10       15       25
All     30       20       50
Rows: mag field   Columns: cancer
Observed counts are above the expected counts

        no       yes      All
no      20       5        25
        15.00    10.00    25.00
yes     10       15       25
        15.00    10.00    25.00
All     30       20       50

Chi-Square = 8.333 (compare to 3.84)
Research advocate wins!
Terminology and jargon: In the mag field group, 15/25 of the dogs got cancer. Therefore, the following are all equivalent:
1. 60% of the sampled dogs in this group got cancer.
2. The sample proportion of dogs in this group that got cancer is 0.6.
3. The sample probability that a dog in this group got cancer is 0.6.
4. The sample risk of cancer in this group is 0.6.
One more: The sample odds of cancer in this group are 3/2.
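The link between risk and odds can be checked directly from the counts. In this sketch the counts come from the magnetic field row of the table (15 with cancer, 10 without), and exact fractions are used so that 3/2 appears as 3/2. The variable names are chosen for illustration.

```python
from fractions import Fraction

# Magnetic field group: 15 dogs got cancer, 10 did not.
cancer, no_cancer = 15, 10

# Risk compares the "yes" count to the whole group.
risk = Fraction(cancer, cancer + no_cancer)

# Odds compare the "yes" count to the "no" count.
odds = Fraction(cancer, no_cancer)

print(risk, odds)
# The two are linked: odds = risk / (1 - risk).
print(odds == risk / (1 - risk))
```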
More terminology and jargon:
1. Identify the 'bad' response category: In this example, cancer
2. Treatment risk: 15 / 25 or .60 or 60%
3. Baseline risk: 5 / 25 or .20 or 20%
4. Relative risk: Treatment risk over Baseline risk = .60 / .20 = 3
5. Increased risk: By how much does the risk increase for treatment as compared to control? (See next page)
6. Odds ratio: Ratio of treatment odds to baseline odds. (Later.)
So magnetic field risk is 3 times higher than baseline risk. (This language is often how relative risk is expressed.)
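These quantities can all be computed from the two rows of the dog table. The sketch below uses exact fractions and invented helper names. The odds ratio line simply applies the definition in item 6 (treatment odds over baseline odds) and is included only as an illustration.

```python
from fractions import Fraction

# Counts from the dog table: cancer "no" and "yes" in each group.
baseline = {"no": 20, "yes": 5}     # no magnetic field
treatment = {"no": 10, "yes": 15}   # magnetic field


def risk(group):
    """Proportion of the group in the 'bad' category (cancer)."""
    return Fraction(group["yes"], group["no"] + group["yes"])


def odds(group):
    """Count in the 'bad' category relative to the count outside it."""
    return Fraction(group["yes"], group["no"])


relative_risk = risk(treatment) / risk(baseline)
odds_ratio = odds(treatment) / odds(baseline)

print(risk(treatment), risk(baseline), relative_risk)
print(odds(treatment), odds(baseline), odds_ratio)
```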
Increased risk (percentage change in risk): Compare the Change to the Original by dividing:

$$\frac{\text{Treatment} - \text{Baseline}}{\text{Baseline}} = \frac{.60 - .20}{.20} = \frac{.4}{.2} = 2$$
So the percentage change is 200%. We might say: The magnetic field risk is 200% higher than the no-field risk.
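The percentage change can be sketched as a short function. The name increased_risk is made up for this example, and exact fractions are used to avoid rounding noise. The last line checks a handy identity: the increased risk always equals the relative risk minus 1.

```python
from fractions import Fraction


def increased_risk(treatment_risk, baseline_risk):
    """Change in risk divided by the original (baseline) risk."""
    return (treatment_risk - baseline_risk) / baseline_risk


treatment_risk = Fraction(15, 25)   # .60 in the magnetic field group
baseline_risk = Fraction(5, 25)     # .20 in the no-field group

change = increased_risk(treatment_risk, baseline_risk)
percent_change = change * 100
relative_risk = treatment_risk / baseline_risk

print(change, percent_change)
print(change == relative_risk - 1)
```

This is why a relative risk of 3 goes with an increased risk of 200%, not 300%.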
Final note: When the chi-squared test is statistically significant, it makes sense to compute the various risk statements. If there is no statistical significance, the skeptic wins (more accurately, the skeptic doesn't lose): there is no evidence in the data for differences in risk for the categories of the explanatory variable.