Oct. 9 - personal.psu.edupersonal.psu.edu/drh20/100/lectures/lecture17Oct9.pdf · Oct. 9...

Oct. 9 Assignment: Read Chapter 12

Cell phone ownership: Fall 2001

So 66.2% of women in the sample say yes but only 45.7% of men in the sample say yes. Are they statistically significantly different?

Rows: sex   Columns: cellphone

            no      yes     All
female      26       51      77
         33.8%    66.2%  100.0%
male        19       16      35
         54.3%    45.7%  100.0%

Expected counts are below observed counts

            no      yes   Total
Female      26       51      77
         30.94    46.06
Male        19       16      35
         14.06    20.94
Total       45       67     112

Chi-Sq = 0.788 + 0.529 + 1.734 + 1.164 = 4.215
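The expected counts and the chi-squared statistic for this table can be reproduced in a few lines of Python. This is a minimal sketch, not part of the original lecture, and the variable names are illustrative. Each expected count is (row total × column total) / grand total, and each cell contributes (observed − expected)² / expected to the statistic.

```python
# Observed counts from the Fall 2001 survey (rows: female, male; columns: no, yes).
observed = [[26, 51],
            [19, 16]]

row_totals = [sum(row) for row in observed]          # 77 and 35
col_totals = [sum(col) for col in zip(*observed)]    # 45 and 67
grand_total = sum(row_totals)                        # 112

# Expected count for each cell: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Contribution of each cell: (observed - expected)^2 / expected.
contributions = [
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
]
chi_sq = sum(contributions)

print([round(e, 2) for row in expected for e in row])  # expected counts
print(round(chi_sq, 3))                                # chi-squared statistic
```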

FALL 2001 Cell Phone Results

What is the correct conclusion from a chi-squared statistic of 4.215?

(A)  We have evidence that the skeptic is correct.
(B)  We have evidence that the skeptic is incorrect.
(C)  We do not have evidence that the skeptic is correct.
(D)  We do not have evidence that the skeptic is incorrect.

[Figure: Chi-squared distribution with 1 degree of freedom. The area above 3.84 is .05, so 95% of the area lies below 3.84 and 5% lies above it. If chi-squared is in the upper 5% region, it is declared large and the research advocate wins. Our chi-squared is 4.215.]

But our chi-squared is 4.215 so the research advocate wins! There was a statistically significant difference in 2001.
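The comparison against 3.84 can also be checked numerically. The sketch below is not from the lecture. It relies on the fact that a chi-squared variable with 1 degree of freedom is the square of a standard normal variable, so the area to the right of x is erfc(sqrt(x / 2)). The function name is a hypothetical helper.

```python
import math

def chi_sq_tail_area_df1(x):
    """Area to the right of x under the chi-squared curve with 1 df.

    With 1 df the statistic is the square of a standard normal variable,
    so the tail area equals erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2))

cutoff = 3.84    # cutoff used in the lecture for 1 degree of freedom
chi_sq = 4.215   # Fall 2001 statistic

print(chi_sq_tail_area_df1(cutoff))   # close to .05
print(chi_sq_tail_area_df1(chi_sq))   # smaller than .05
print(chi_sq > cutoff)                # True: the research advocate wins
```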

We see a significant difference

Change over time: Cell phone ownership for STAT 100 students

Semester       Women              Men                Significant difference?
Fall 2001      66.2% (of 77)      45.7% (of 35)      Yes
Spring 2004    91.2% (of 136)     86.1% (of 101)     No
Spring 2005    97.4% (of 117)     94.2% (of 103)     No
Fall 2005      97.8% (of 138)     95.1% (of 82)      No
Fall 2008      100.0% (of 130)    100.0% (of 96)     No

Spring 2005: A cautionary tale

Rows: Sex   Columns: Cellphone

            No      Yes     All
Female       3      114     117
           4.8    112.2
Male         6       97     103
           4.2     98.8
All          9      211     220

Note that two of the expected counts are smaller than 5.

This can make our results somewhat iffy.

The best approach in this case: Report the result (no significant difference) but point out the small expected counts of 4.79 and 4.21.
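Checking for small expected counts is easy to automate. The following is a minimal sketch, not from the lecture, that recomputes the Spring 2005 expected counts and lists those below 5. The threshold of 5 is the one the slide mentions; the variable names are illustrative.

```python
# Observed counts for Spring 2005 (rows: female, male; columns: no, yes).
observed = [[3, 114],
            [6, 97]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Collect the expected counts that fall below 5.
small = [round(e, 2) for row in expected for e in row if e < 5]
print(small)   # the two small expected counts quoted above
```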

Why 1 degree of freedom?

            No      Yes     All
Female                      136
Male                        101
All         26      211     237

The gray box is the ONLY one we can fill freely. Once that box is filled, all others are determined by margins!
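To see why only one cell is free, the sketch below fills in a 2x2 table from its margins once a single cell is chosen. It is not part of the lecture. The free cell (Female / No) and its value of 12 are illustrative assumptions. The value 12 happens to agree with 91.2% of 136 women answering yes.

```python
# Margins from the table above.
row_totals = {"Female": 136, "Male": 101}
col_totals = {"No": 26, "Yes": 211}

def fill_table(female_no):
    """Fill a 2x2 table from its margins and one chosen cell (Female / No)."""
    female_yes = row_totals["Female"] - female_no
    male_no = col_totals["No"] - female_no
    male_yes = row_totals["Male"] - male_no
    return {"Female": {"No": female_no, "Yes": female_yes},
            "Male": {"No": male_no, "Yes": male_yes}}

# 12 is an illustrative value for the free cell (an assumption).
table = fill_table(12)
print(table)

# Every other cell is forced by the margins.
assert table["Female"]["Yes"] + table["Male"]["Yes"] == col_totals["Yes"]
```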

Consider a 2x3 table (data from FA13):

Rows: Sex   Columns: Voted in 2012?

          Ineligible      No     Yes     All
Female            15      31      22      68
               22.1%   45.6%   32.4%  100.0%
Male              14      15      19      48
               29.2%   31.3%   39.6%  100.0%
All               29      46      41     116

How many degrees of freedom here?

          Always    Sometimes    Never    All
Women     One df    Two df                 68
Men                                        48
All           29           46       41    116

Degrees of freedom (df) always equal

(Number of rows – 1) × (Number of columns – 1)
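As a quick illustration of the formula, here is a small Python sketch. The function name is hypothetical and not from the lecture.

```python
def degrees_of_freedom(n_rows, n_cols):
    """Degrees of freedom for a chi-squared test on an r x c table."""
    return (n_rows - 1) * (n_cols - 1)

print(degrees_of_freedom(2, 2))  # 1, as in the cell phone tables
print(degrees_of_freedom(2, 3))  # 2
print(degrees_of_freedom(2, 4))  # 3
```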

Here is the chi-squared test:

Chi-Square = 2.44, DF = 2, cutoff for DF=2 is 5.991

There is no evidence of different 2012 voting patterns between men and women (in whatever population is represented here).

Rows: Sex   Columns: Voted in 2012?

          Ineligible      No     Yes     All
Female            15      31      22      68
                17.0    27.0    24.0
Male              14      15      19      48
                12.0    19.0    17.0
All               29      46      41     116
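The same calculation works for a table of any size. The sketch below is not part of the lecture and uses an illustrative function name. It computes the statistic and degrees of freedom for the 2x3 voting table and compares the result with the cutoffs quoted in these notes (3.84, 5.991 and 7.815 for 1, 2 and 3 degrees of freedom).

```python
def chi_squared(observed):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

# Cutoffs quoted in the lecture, keyed by degrees of freedom.
cutoffs = {1: 3.84, 2: 5.991, 3: 7.815}

# Rows: female, male; columns: ineligible, no, yes.
voted = [[15, 31, 22],
         [14, 15, 19]]

stat, df = chi_squared(voted)
print(round(stat, 2), df)      # statistic and degrees of freedom
print(stat > cutoffs[df])      # False: no evidence of a difference
```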

Now for a 2x4 table (from STAT 100 FA08):

Rows: Eyelens   Columns: Eyecol

            Blue   Brown   Green   Hazel     All
No Lens       39      50      18      15     122
           32.0%   41.0%   14.8%   12.3%  100.0%
Yes Lens      33      40      11      19     103
           32.0%   38.8%   10.7%   18.4%  100.0%
Total         72      90      29      34     225

Chi-Square = 2.18, DF = 3, cutoff for DF=3 is 7.815

There is no evidence of different eye-color patterns between the lens-wearing population and the non-lens-wearing population.

How many degrees of freedom are there for a chi-squared test on a 3x5 table?

(A)  35
(B)  15
(C)  12
(D)  8
(E)  4

Health studies and risk

Hypothetical research question: Do strong electromagnetic fields cause cancer?

50 dogs randomly split into two groups: no field, yes field.

The response is whether they get lymphoma.

Rows: mag field   Columns: cancer

          no      yes     All
no        20        5      25
yes       10       15      25
All       30       20      50

Rows: mag field   Columns: cancer (observed counts above the expected counts)

          no      yes     All
no        20        5      25
       15.00    10.00   25.00
yes       10       15      25
       15.00    10.00   25.00
All       30       20      50

Chi-Square = 8.333 (compare to 3.84)

Research advocate wins!

Terminology and jargon: In the mag field group, 15/25 of the dogs got cancer. Therefore, the following are all equivalent:

1.  60% of the sampled dogs in this group got cancer.

2.  The sample proportion of dogs in this group that got cancer is 0.6.

3.  The sample probability that a dog in this group got cancer is 0.6.

4.  The sample risk of cancer in this group is 0.6.

One more: The sample odds of cancer in this group are 3/2.
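The link between the sample risk and the sample odds can be checked with exact fractions. This is a minimal sketch, not part of the lecture, with illustrative variable names.

```python
from fractions import Fraction

# Dogs in the magnetic-field group: 15 got cancer, 10 did not.
cancer, no_cancer = 15, 10

risk = Fraction(cancer, cancer + no_cancer)   # proportion, probability, or risk
odds = Fraction(cancer, no_cancer)            # cases per non-case

print(risk, float(risk))            # 3/5 0.6
print(odds)                         # 3/2
print(risk / (1 - risk) == odds)    # True: odds = risk / (1 - risk)
```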

1.  Identify the 'bad' response category: In this example, cancer

2.  Treatment risk: 15 / 25 or .60 or 60%

3.  Baseline risk: 5 / 25 or .20 or 20%

4.  Relative risk: Treatment risk over Baseline risk = .60 / .20=3

5.  Increased risk: By how much does the risk increase for treatment as compared to control? (See next page)

6.  Odds ratio: Ratio of treatment odds to baseline odds. ( Later.)

More terminology and jargon:

So magnetic field risk is 3 times higher than baseline risk. (This language is often how relative risk is expressed.)

Increased risk (percentage change in risk): Compare the Change to the Original by dividing:

$$\frac{\text{Treatment} - \text{Baseline}}{\text{Baseline}} = \frac{.60 - .20}{.20} = \frac{.4}{.2} = 2$$

So the percentage change is 200% We might say: The magnetic field risk is 200% higher than the no-field risk.
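The relative risk and increased risk for the dog example can be computed as follows. This is a minimal sketch, not part of the lecture. It uses exact fractions to avoid floating-point rounding, and the variable names are illustrative.

```python
from fractions import Fraction

# Risk of cancer in each group of 25 dogs.
treatment_risk = Fraction(15, 25)   # magnetic field group: .60
baseline_risk = Fraction(5, 25)     # no-field group: .20

# Relative risk: treatment risk over baseline risk.
relative_risk = treatment_risk / baseline_risk

# Increased risk: change in risk divided by the baseline risk.
increased_risk = (treatment_risk - baseline_risk) / baseline_risk

print(relative_risk)                      # 3
print(f"{float(increased_risk):.0%}")     # 200%
```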

Final note: When the chi-squared test is statistically significant then it makes sense to compute the various risk statements. If there is no statistical significance then the skeptic wins (more accurately, the skeptic doesn't lose). There is no evidence in the data for differences in risk for the categories of the explanatory variable.
