Contingency analysis. Sample Test statistic Null hypothesis Null distribution compare How unusual is...


Page 1: Contingency analysis. Sample Test statistic Null hypothesis Null distribution compare How unusual is this test statistic? P < 0.05 P > 0.05 Reject H o.

Contingency analysis

Page 2:

Sample → Test statistic

Null hypothesis → Null distribution

Compare: How unusual is this test statistic?

P < 0.05: Reject Ho

P > 0.05: Fail to reject Ho

Page 3:

Using one tail in the χ² test

• We always use only one tail for a χ² test

• Why?

Page 4:

[Figure: χ² axis starting at 0]

0: Data match null expectation exactly

Greater than 0: Data deviate from null expectation in some way

Page 5:

                     Reality
Result               Ho true         Ho false
Reject Ho            Type I error    correct
Do not reject Ho     correct         Type II error

Page 6:

Test statistic

If null hypothesis is really true…

Do not reject Ho: Correct answer

Reject Ho: Type I error

Page 7:

Test statistic

If null hypothesis is really false…

Do not reject Ho: Type II error

Reject Ho: Correct answer

Page 8:

Errors and statistics

• These are theoretical: you usually don’t know for sure if you’ve made an error

• Pr[Type I error] = α (the significance level)

• Pr[Type II error] = …

– Requires power analysis

– Depends on sample size
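A rough simulation can make Pr[Type I error] concrete. The sketch below is an illustration only: the fair-coin experiment, the function names, and the simulation sizes are assumptions, not part of the slides. It repeatedly samples from a population where the null hypothesis is true and counts how often a χ² test with critical value 3.84 (α = 0.05, df = 1) rejects anyway.

```python
import random


def chi2_coin(heads, n):
    """Chi-square goodness-of-fit statistic for `heads` out of `n` flips vs. a fair coin."""
    expected = n / 2
    tails = n - heads
    return (heads - expected) ** 2 / expected + (tails - expected) ** 2 / expected


def type_i_error_rate(n_flips=100, n_experiments=2000, critical=3.84, seed=1):
    """Fraction of simulated experiments that reject a TRUE null hypothesis (fair coin)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    rejections = 0
    for _ in range(n_experiments):
        heads = sum(rng.random() < 0.5 for _ in range(n_flips))
        if chi2_coin(heads, n_flips) > critical:
            rejections += 1
    return rejections / n_experiments


rate = type_i_error_rate()
print(f"Simulated Type I error rate: {rate:.3f}")
```

Because the counts are discrete, the simulated rate should land near 0.05 rather than exactly on it.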

Page 9:

Contingency analysis

• Estimates and tests for an association between two or more categorical variables

Page 10:

Music and wine buying

OBSERVED                      French music playing   German music playing   Totals
Bottles of French wine sold   40                     12                     52
Bottles of German wine sold   8                      22                     30
Totals                        48                     34                     82

Page 11:

Mosaic plot

Page 12:

Odds ratio

• Odds of success = probability of success divided by the probability of failure

$$O = \frac{p}{1 - p}$$

Page 13:

Estimating the Odds ratio

• Odds of success = probability of success divided by the probability of failure

$$\hat{O} = \frac{\hat{p}}{1 - \hat{p}}$$

$$\hat{p} = \frac{x}{n}$$

Page 14:

Music and wine buying

OBSERVED                      French music playing
Bottles of French wine sold   40
Bottles of German wine sold   8
Totals                        48

Page 15:

Example

• Out of 48 bottles of wine, 40 were French

$$\hat{O} = \frac{\hat{p}}{1 - \hat{p}}$$

$$\hat{p} = \frac{x}{n}$$

Page 16:

Example

• Out of 48 bottles of wine, 40 were French

$$\hat{p} = \frac{40}{48} = 0.833$$

$$\hat{O} = \frac{0.833}{1 - 0.833} = 5.00$$

Interpretation: with French music playing, people are about 5 times as likely to buy a French wine as a German wine
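The odds estimate above can be checked with a few lines of Python. This is a minimal sketch; the helper name `odds` is made up for illustration, and it assumes at least one failure (otherwise the division is by zero).

```python
def odds(successes, n):
    """Estimated odds: p_hat / (1 - p_hat), where p_hat = successes / n."""
    p_hat = successes / n
    return p_hat / (1 - p_hat)  # undefined when successes == n


# French music playing: 40 of the 48 bottles sold were French wine.
o_hat = odds(40, 48)
print(f"p_hat = {40 / 48:.3f}, odds = {o_hat:.2f}")  # p_hat = 0.833, odds = 5.00
```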

Page 17:

[Figure: odds scale centered on O = 1]

O = 1: Success and failure equally likely

O > 1: Success more likely

O < 1: Failure more likely

Page 18:

Odds ratio

• The odds of success in one group divided by the odds of success in a second group

$$OR = \frac{O_1}{O_2}$$

Page 19:

Estimating the Odds ratio

• The odds of success in one group divided by the odds of success in a second group

$$\widehat{OR} = \frac{\hat{O}_1}{\hat{O}_2}$$

Page 20:

Music and wine buying

• Group 1 = French music, Group 2 = German music

• Success = French wine

$$\widehat{OR} = \frac{\hat{O}_1}{\hat{O}_2}$$

Page 21:

Group 2

• Out of 34 bottles of wine, 12 were French

$$\hat{p} = \frac{12}{34} = 0.353$$

$$\hat{O}_2 = \frac{0.353}{1 - 0.353} = 0.55$$

Page 22:

Music and wine buying

• Group 1 = French music, Group 2 = German music

• Success = French wine

$$\hat{O}_1 = 5.00$$

$$\hat{O}_2 = 0.55$$

$$\widehat{OR} = \frac{\hat{O}_1}{\hat{O}_2} = \frac{5.00}{0.55} = 9.09$$

Page 23:

Music and wine buying

• Group 1 = French music, Group 2 = German music

• Success = French wine

$$\widehat{OR} = \frac{\hat{O}_1}{\hat{O}_2} = \frac{5.00}{0.55} = 9.09$$

Interpretation: the odds of buying French wine are about 9 times higher in Group 1 (French music) than in Group 2 (German music)
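As a cross-check, the odds ratio can be computed directly from the counts. This is a sketch with illustrative helper names (`odds`, `odds_ratio`), not code from the course. Note that the slides' 9.09 comes from dividing by the rounded odds 0.55; without intermediate rounding the estimate is about 9.17, which leads to the same interpretation.

```python
def odds(successes, n):
    """Estimated odds: p_hat / (1 - p_hat), where p_hat = successes / n."""
    p_hat = successes / n
    return p_hat / (1 - p_hat)


def odds_ratio(x1, n1, x2, n2):
    """Odds of success in group 1 divided by odds of success in group 2."""
    return odds(x1, n1) / odds(x2, n2)


# Group 1 = French music (40 of 48 French wine); Group 2 = German music (12 of 34).
or_hat = odds_ratio(40, 48, 12, 34)
print(f"Estimated odds ratio: {or_hat:.2f}")
```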

Page 24:

[Figure: odds ratio scale centered on OR = 1]

OR = 1: Success equally likely in both groups

OR > 1: Success more likely in Group 1

OR < 1: Success more likely in Group 2

Page 25:

Hypothesis testing

• Contingency analysis

• Is there a difference in odds between two groups?

Page 26:

Hypothesis testing

• Contingency analysis

• Is there an association between two categorical variables?

Page 27:

Music and wine buying

OBSERVED                      French music playing   German music playing   Totals
Bottles of French wine sold   40                     12                     52
Bottles of German wine sold   8                      22                     30
Totals                        48                     34                     82

Page 28:

Contingency analysis

• Is there a difference in the odds of buying French wine depending on the music that is playing?

• Is there an association between wine bought and music playing?

• Is the nationality of the wine independent of the music playing when it is sold?

Page 29:

Hypotheses

• H0: The nationality of the bottle of wine is independent of the nationality of the music played when it is sold.

• HA: The nationality of the bottle of wine sold depends on the nationality of the music being played when it is sold.

Page 30:

Calculating the expectations

With independence,

Pr[French wine AND French music] = Pr[French wine] × Pr[French music]

Page 31:

Calculating the expectations

Pr[French wine] = 52/82 = 0.634

Pr[French music] = 48/82 = 0.585

OBS.               French music   German music   Totals
French wine sold                                 52
German wine sold                                 30
Totals             48             34             82

By H0, Pr[French wine AND French music] = (0.634)(0.585) = 0.371

Page 32:

Calculating the expectations

EXP.               French music        German music   Totals
French wine sold   0.371 (82) = 30.4                  52
German wine sold                                      30
Totals             48                  34             82

By H0, Pr[French wine AND French music] = (0.634)(0.585) = 0.371

Page 33:

Calculating the expectations

EXP.               French music        German music   Totals
French wine sold   0.371 (82) = 30.4   21.6           52
German wine sold   17.6                12.4           30
Totals             48                  34             82
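The expected counts follow the same rule in every cell: (row total × column total) / grand total, which is the independence product scaled by the sample size. A minimal Python sketch (the function name `expected_counts` is illustrative) reproduces the table:

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Rows: French wine, German wine. Columns: French music, German music.
observed = [[40, 12], [8, 22]]
expected = expected_counts(observed)
for row in expected:
    print([round(e, 1) for e in row])
# [30.4, 21.6]
# [17.6, 12.4]
```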

Page 34:

χ²

$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$

$$= \frac{(40 - 30.4)^2}{30.4} + \frac{(12 - 21.6)^2}{21.6} + \frac{(8 - 17.6)^2}{17.6} + \frac{(22 - 12.4)^2}{12.4} = 20.0$$
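The same sum can be sketched in Python. The function names are illustrative. The slides' 20.0 uses expected counts rounded to one decimal place; with unrounded expectations the statistic is slightly smaller (about 19.8), which does not change the conclusion.

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


observed = [[40, 12], [8, 22]]
exact = expected_counts(observed)
rounded = [[round(e, 1) for e in row] for row in exact]  # as shown on the slides

chi2 = chi_square(observed, exact)
chi2_rounded = chi_square(observed, rounded)
print(f"unrounded expectations: {chi2:.2f}")
print(f"rounded expectations:   {chi2_rounded:.2f}")
```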

Page 35:

Degrees of freedom

For a χ² contingency test,

df = # categories - 1 - # parameters

df = (# columns - 1)(# rows - 1)

For music/wine example, df = (2 - 1)(2 - 1) = 1

Page 36:

Conclusion

$$\chi^2 = 20.0 \gg \chi^2_{0.05,1} = 3.84$$

So we can reject the null hypothesis of independence, and say that the nationality of the wine sold did depend on what music was played.
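To see how far into the tail 20.0 falls, the P-value for df = 1 can be computed with the standard library alone. This sketch relies on the fact that a χ² variable with one degree of freedom is the square of a standard normal variable, so its upper-tail probability is erfc(√(x/2)). The function name is illustrative, and the shortcut does not apply to other degrees of freedom.

```python
import math


def chi2_upper_tail_df1(x):
    """P[chi-square with df = 1 exceeds x]; valid only for one degree of freedom."""
    return math.erfc(math.sqrt(x / 2))


p_critical = chi2_upper_tail_df1(3.84)  # should be very close to 0.05
p_value = chi2_upper_tail_df1(20.0)     # P-value for the music/wine statistic
print(f"tail area beyond 3.84: {p_critical:.4f}")
print(f"tail area beyond 20.0: {p_value:.2e}")
```

Only the upper tail is used, consistent with the one-tailed nature of the χ² test.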

Page 37:

Assumptions

• This χ² test is just a special case of the χ² goodness-of-fit test, so the same rules apply.

• No expected count may be less than 1, and no more than 20% of the expected counts may be less than 5.
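These two rules are easy to check mechanically once the expected counts are known. A small sketch, with an illustrative function name and the expected counts copied from the music/wine example:

```python
def expected_counts_ok(expected):
    """True if no expected count is below 1 and at most 20% of them are below 5."""
    cells = [e for row in expected for e in row]
    none_below_1 = all(e >= 1 for e in cells)
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    return none_below_1 and share_below_5 <= 0.20


wine_expected = [[30.4, 21.6], [17.6, 12.4]]
print(expected_counts_ok(wine_expected))  # True
```

In a 2 x 2 table, a single expected count below 5 already makes up 25% of the cells, so the check fails.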

Page 38:

Fisher’s exact test

• For 2 x 2 contingency analysis

• Does not make assumptions about the size of expectations

• JMP will do it, but cumbersome to do by hand
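For readers without JMP, the calculation can be sketched in Python. This is an illustration under stated assumptions, not JMP's procedure: with the row and column totals held fixed, the top-left cell follows a hypergeometric distribution, and the two-sided P-value here is taken as the total probability of all tables no more probable than the observed one (a common convention, but not the only one). The function name is made up.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small tolerance so ties with the observed table are counted despite rounding.
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )


p_fisher = fisher_exact_two_sided(40, 12, 8, 22)
print(f"Fisher's exact P-value: {p_fisher:.2e}")
```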

Page 39:

Other extensions you might see

• Yates correction for continuity

• G-test

• Read about these in your book
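As a quick preview of both extensions, the sketch below uses their common textbook forms, which may differ in detail from the book's presentation: the Yates correction subtracts 0.5 from each |O - E| before squaring, and the G statistic is 2 Σ O ln(O/E). The function names are illustrative.

```python
import math


def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def paired_cells(observed):
    """Yield (observed, expected) pairs for every cell of the table."""
    expected = expected_counts(observed)
    for obs_row, exp_row in zip(observed, expected):
        yield from zip(obs_row, exp_row)


def yates_chi_square(observed):
    """Chi-square with Yates continuity correction (intended for 2 x 2 tables)."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in paired_cells(observed))


def g_statistic(observed):
    """G = 2 * sum of O * ln(O / E); cells with O = 0 contribute nothing."""
    return 2 * sum(o * math.log(o / e) for o, e in paired_cells(observed) if o > 0)


observed = [[40, 12], [8, 22]]
yates = yates_chi_square(observed)
g = g_statistic(observed)
print(f"Yates-corrected chi-square: {yates:.2f}")
print(f"G statistic:                {g:.2f}")
```

Both statistics are compared against the same χ² distribution with df = 1 in this example.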