Biostatistics Unit 10 Categorical Data Analysis 1.

Page 1: Biostatistics Unit 10 Categorical Data Analysis 1.

Biostatistics

Unit 10

Categorical Data Analysis

Page 2: Biostatistics Unit 10 Categorical Data Analysis 1.

Categorical Data Analysis

• Categorical data analysis deals with discrete data that can be organized into categories.

• The data are organized into a contingency table. The basic structure consists of two columns and two rows.

• The χ² (chi-square) distribution is used in categorical data analysis.

Page 3: Biostatistics Unit 10 Categorical Data Analysis 1.

Basic Contingency Table Structure

• The basic structure of a 2×2 contingency table has two columns and two rows.

Page 4: Biostatistics Unit 10 Categorical Data Analysis 1.

Structure of Contingency Tables

• Cells are labeled A through D.

• Columns and rows are added for labels.
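A layout consistent with the formulas used later in this unit (absolute risk read along each row, odds ratio as AD divided by BC) is sketched below. The row and column labels are assumed; only the cell letters come from the slide.

```latex
\begin{array}{l|cc|c}
 & \text{Disease} & \text{No disease} & \text{Total} \\ \hline
\text{Exposed}     & A     & B     & A + B \\
\text{Not exposed} & C     & D     & C + D \\ \hline
\text{Total}       & A + C & B + D & N
\end{array}
```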

Page 5: Biostatistics Unit 10 Categorical Data Analysis 1.

Using the contingency table as a comparison table

• Contingency tables are used to compare outcomes, for example the results of laboratory tests.

Page 6: Biostatistics Unit 10 Categorical Data Analysis 1.

Absolute and Relative Risk

• Relative risk is the ratio of two proportions. Each row of the table gives an absolute risk of getting the disease.

• The ratio of these two proportions is the relative risk.

Page 7: Biostatistics Unit 10 Categorical Data Analysis 1.

Absolute Risk
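Assuming the usual layout, in which row 1 holds the exposed group (cells A, B), row 2 the unexposed group (cells C, D), and the first column counts cases of disease, the two absolute risks can be written as:

```latex
AR_{\text{exposed}} = \frac{A}{A + B}, \qquad
AR_{\text{not exposed}} = \frac{C}{C + D}
```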

Page 8: Biostatistics Unit 10 Categorical Data Analysis 1.

Relative Risk
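With cells A, B assumed for the exposed row and C, D for the unexposed row (disease counted in the first column), the relative risk is the exposed-row risk divided by the unexposed-row risk:

```latex
RR = \frac{A / (A + B)}{C / (C + D)}
```

A value of 1 means both groups carry the same risk.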

Page 9: Biostatistics Unit 10 Categorical Data Analysis 1.

Example

A total of 452 children in elementary schools in Georgia and Florida were served burritos for lunch. Among these, 304 children reported eating the burritos. Among those who ate burritos, 155 reported getting sick from bacterial contamination.

There were also 148 children who did not eat burritos. Among these, 10 cases of illness were reported.

A case of disease was defined as gastrointestinal upset, fever and other symptoms. The CDC studied this event using categorical data analysis. They reported relative risk, significance and a confidence interval.

Page 10: Biostatistics Unit 10 Categorical Data Analysis 1.

Contingency Table

• Data from the reports of the incident were entered into a contingency table.
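As a sketch, the four cells can be derived from the counts given in the example. The cell names a to d and the assignment of rows to exposure groups are assumptions that follow the A through D labelling used earlier.

```python
# Counts as stated in the example text.
ate_total, ate_sick = 304, 155
not_ate_total, not_ate_sick = 148, 10

# Assumed layout: rows = exposure (ate / did not eat), columns = sick / not sick.
a, b = ate_sick, ate_total - ate_sick              # ate burritos
c, d = not_ate_sick, not_ate_total - not_ate_sick  # did not eat burritos

table = [[a, b], [c, d]]
grand_total = sum(sum(row) for row in table)

# Print the table with row and column totals.
print(f"{'':<14}{'Sick':>6}{'Not sick':>10}{'Total':>7}")
for label, row in zip(("Ate", "Did not eat"), table):
    print(f"{label:<14}{row[0]:>6}{row[1]:>10}{sum(row):>7}")
print(f"{'Total':<14}{a + c:>6}{b + d:>10}{grand_total:>7}")
```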

Page 11: Biostatistics Unit 10 Categorical Data Analysis 1.

Absolute risk—ate burritos

Page 12: Biostatistics Unit 10 Categorical Data Analysis 1.

Absolute risk—did not eat burritos

Page 13: Biostatistics Unit 10 Categorical Data Analysis 1.

Relative Risk

• Relative risk is the ratio of the two absolute risk probabilities.

• Conclusion: A child who ate burritos had 7.06 times the probability of getting sick as one who did not.
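The arithmetic can be checked with a short sketch; the function and variable names are illustrative. With the counts as stated in the example (155 of 304 and 10 of 148) the ratio comes out near 7.55. The 7.06 quoted above is what 145 ill children among the 304 who ate would give, so one of the two figures may be a transcription slip.

```python
def absolute_risk(cases: int, group_size: int) -> float:
    """Proportion of a group that developed the disease."""
    return cases / group_size


def relative_risk(cases_exp, n_exp, cases_unexp, n_unexp) -> float:
    """Ratio of the exposed group's absolute risk to the unexposed group's."""
    return absolute_risk(cases_exp, n_exp) / absolute_risk(cases_unexp, n_unexp)


ar_ate = absolute_risk(155, 304)               # about 0.510
ar_not_ate = absolute_risk(10, 148)            # about 0.068
rr_stated = relative_risk(155, 304, 10, 148)   # about 7.55 (counts as stated)
rr_145 = relative_risk(145, 304, 10, 148)      # about 7.06 (assumed count of 145)

print(round(ar_ate, 3), round(ar_not_ate, 3), round(rr_stated, 2), round(rr_145, 2))
```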

Page 14: Biostatistics Unit 10 Categorical Data Analysis 1.

Significance in relative risk

Significance in relative risk is found using the χ² distribution. The general formula is below.
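Assuming the slide showed the standard Pearson form, the statistic compares each observed cell count O with the count E expected if exposure and outcome were unrelated:

```latex
\chi^2 = \sum_{\text{cells}} \frac{(O - E)^2}{E},
\qquad
E = \frac{(\text{row total}) \times (\text{column total})}{N}
```

For a 2×2 table the statistic has 1 degree of freedom.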

Page 15: Biostatistics Unit 10 Categorical Data Analysis 1.

Significance in relative risk

In contingency table calculations, the values from the table are used to give a χ² value according to the formula below.
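For a 2×2 table with cells A through D and N = A + B + C + D, the standard shortcut form (assumed here to be the formula the slide refers to) is:

```latex
\chi^2 = \frac{N \, (AD - BC)^2}{(A + B)(C + D)(A + C)(B + D)}
```

This is algebraically the same as summing (O − E)²/E over the four cells, with no continuity correction.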

Page 16: Biostatistics Unit 10 Categorical Data Analysis 1.

Find significance using the TI-83

A. Matrix setup

Page 17: Biostatistics Unit 10 Categorical Data Analysis 1.

Find significance using the TI-83

B. Calculation results

Conclusion: With p this small, the result is highly significant.
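The same test can be sketched in Python. The block below uses the counts stated in the example, computes expected counts from the margins, and obtains the p-value for 1 degree of freedom from the complementary error function. No continuity correction is applied, on the assumption that this matches the calculator's χ² test; the function name is illustrative.

```python
import math


def chi_square_2x2(observed):
    """Pearson chi-square for a 2x2 table of counts, without continuity correction."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # expected count under independence
            chi2 += (o - e) ** 2 / e
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: ate / did not eat; columns: sick / not sick (counts as stated in the example).
chi2, p = chi_square_2x2([[155, 149], [10, 138]])
print(f"chi-square = {chi2:.2f}, p = {p:.1e}")
```

The statistic is about 84 on 1 degree of freedom, so the p-value is far below any conventional threshold.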

Page 18: Biostatistics Unit 10 Categorical Data Analysis 1.

CI for a Relative Risk Calculation

The confidence interval consists of the usual components of estimator, reliability coefficient and standard error. Standard error is found using the formula
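The interval reported later in this unit (3.575 to 13.93 around 7.06) is reproduced by the sum-of-reciprocals form on the log scale when 145 of the 304 children who ate are counted as ill, so that is presumably the formula intended here. Many texts instead use the second form for a relative risk. Cells A through D follow the contingency table labelling.

```latex
\begin{aligned}
SE(\ln RR) &\approx \sqrt{\tfrac{1}{A} + \tfrac{1}{B} + \tfrac{1}{C} + \tfrac{1}{D}}
  && \text{(matches the reported interval)} \\
SE(\ln RR) &\approx \sqrt{\tfrac{1}{A} - \tfrac{1}{A+B} + \tfrac{1}{C} - \tfrac{1}{C+D}}
  && \text{(common textbook form)}
\end{aligned}
```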

Page 19: Biostatistics Unit 10 Categorical Data Analysis 1.

CI for a Relative Risk Calculation

Logarithmic transformation is used because of the shape of the χ² curve with 1 df, which is hyperbolic. The antilog gives the boundaries of the confidence interval.
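On the log scale the interval has the usual form of estimator ± reliability coefficient × standard error, and exponentiating the limits returns them to the ratio scale. A sketch for a 95% interval, taking 1.96 as the reliability coefficient (an assumption, consistent with the 95% confidence level used in this unit):

```latex
\ln RR \pm 1.96 \times SE(\ln RR)
\quad\Longrightarrow\quad
\left( e^{\,\ln RR - 1.96\, SE(\ln RR)},\; e^{\,\ln RR + 1.96\, SE(\ln RR)} \right)
```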

Page 20: Biostatistics Unit 10 Categorical Data Analysis 1.

CI for a Relative Risk Calculation

Page 21: Biostatistics Unit 10 Categorical Data Analysis 1.

CI for a Relative Risk Calculation

Page 22: Biostatistics Unit 10 Categorical Data Analysis 1.

CI for a Relative Risk Calculation

• Take the antilog of each limit to complete the calculation.

• Conclusion: The relative risk is 7.06. We are 95% confident that the true value lies between 3.575 and 13.93.
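The steps can be sketched in Python as follows. Two assumptions are made so that the output matches the figures quoted above: the exposed group is taken to have 145 ill children (the count that yields 7.06; the example text states 155), and the standard error uses the sum of reciprocal cell counts. The function name is illustrative.

```python
import math


def log_ratio_ci(estimate, se_log, z=1.96):
    """Confidence limits for a ratio: build the interval on the log scale, then take the antilog."""
    centre = math.log(estimate)
    return math.exp(centre - z * se_log), math.exp(centre + z * se_log)


# Assumed cells: a = 145 (not the 155 stated in the text), b = 304 - 145, c = 10, d = 148 - 10.
a, b, c, d = 145, 159, 10, 138

rr = (a / (a + b)) / (c / (c + d))
se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # form that reproduces the quoted interval
lower, upper = log_ratio_ci(rr, se)

print(f"RR = {rr:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

With the more common standard error for a relative risk, sqrt(1/a − 1/(a+b) + 1/c − 1/(c+d)), the same counts give a narrower interval of roughly 3.8 to 13.0.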

Page 23: Biostatistics Unit 10 Categorical Data Analysis 1.

Odds Ratio

• Odds are the ratio of two proportions: the probability that an event occurs divided by the probability that it does not.

• The odds ratio is the ratio of these two odds.

• The odds ratio is generally calculated from data in a case-control study.

• The following gives the theoretical basis for the calculation of the odds ratio. The result is the cross-product ratio.

Page 24: Biostatistics Unit 10 Categorical Data Analysis 1.

Contingency Table

Page 25: Biostatistics Unit 10 Categorical Data Analysis 1.

Odds ratio and the contingency table

• The probability of getting sick when exposed (success) is P(E). The probability of not getting sick when exposed (failure) is 1 – P(E).

• The probability of getting sick when not exposed is P(E’), while the probability of not getting sick when not exposed is 1 – P(E’).

Page 26: Biostatistics Unit 10 Categorical Data Analysis 1.

Determining Odds Ratio

Odds of getting sick when exposed

Odds of getting sick when not exposed
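Using P(E) and P(E') as defined for the contingency table, the standard definition of odds (probability of the event divided by the probability of its complement) gives:

```latex
\text{odds}_{E} = \frac{P(E)}{1 - P(E)}, \qquad
\text{odds}_{E'} = \frac{P(E')}{1 - P(E')}
```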

Page 27: Biostatistics Unit 10 Categorical Data Analysis 1.

Determining Odds Ratio

Odds ratio is the ratio of these two odds

The probability values are related to the cells in the contingency table.

Page 28: Biostatistics Unit 10 Categorical Data Analysis 1.

Determining Odds Ratio

The final ratio of cells to find odds ratio

The odds ratio is therefore the cross-product ratio: AD divided by BC.
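The reduction to the cross-product can be written out step by step, assuming cells A, B for the exposed row (sick, not sick) and C, D for the unexposed row, so that P(E) = A/(A+B) and P(E') = C/(C+D):

```latex
\begin{aligned}
OR &= \frac{P(E) / \bigl(1 - P(E)\bigr)}{P(E') / \bigl(1 - P(E')\bigr)} \\[4pt]
   &= \frac{\dfrac{A}{A+B} \Big/ \dfrac{B}{A+B}}{\dfrac{C}{C+D} \Big/ \dfrac{D}{C+D}}
    = \frac{A / B}{C / D}
    = \frac{AD}{BC}
\end{aligned}
```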

Page 29: Biostatistics Unit 10 Categorical Data Analysis 1.

Case study for odds ratio

The case-control study involved 52 children. Of the 13 children who ate the burritos, 8 got sick. Of the 39 children who did not eat the burritos, 6 reported symptoms of the illness. The odds ratio was calculated.

Page 30: Biostatistics Unit 10 Categorical Data Analysis 1.

Odds Ratio Calculation

Conclusion: The odds ratio is 8.8
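As a check, the cells can be built from the counts in the case study and the cross-product evaluated directly. The variable names and the row and column assignment are assumptions that follow the A through D labelling.

```python
# Case study counts: 13 ate (8 sick), 39 did not eat (6 sick).
a, b = 8, 13 - 8     # ate burritos: sick, not sick
c, d = 6, 39 - 6     # did not eat: sick, not sick

odds_exposed = a / b              # odds of illness among those who ate
odds_unexposed = c / d            # odds of illness among those who did not
odds_ratio = (a * d) / (b * c)    # cross-product form, AD / BC

print(round(odds_exposed, 3), round(odds_unexposed, 3), round(odds_ratio, 1))
```

Dividing the two odds gives the same value as the cross-product, 8.8.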

Page 31: Biostatistics Unit 10 Categorical Data Analysis 1.

Find significance using the TI-83

A. Matrix setup

Page 32: Biostatistics Unit 10 Categorical Data Analysis 1.

Find significance using the TI-83

B. Calculation results

Conclusion: p < .001
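A sketch of the same test in Python, using the 2×2 shortcut form of the chi-square statistic without continuity correction (assumed to match the calculator's χ² test) and the case study counts. For these counts the statistic is about 10.56 and the p-value about 0.0012, which is close to, but not strictly below, .001.

```python
import math


def chi_square_shortcut(a, b, c, d):
    """Chi-square for a 2x2 table via N(AD - BC)^2 divided by the product of the four margins."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))   # upper tail for 1 degree of freedom
    return chi2, p_value


# Cells from the case study: ate (8 sick, 5 not), did not eat (6 sick, 33 not).
chi2, p = chi_square_shortcut(8, 5, 6, 33)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```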

Page 33: Biostatistics Unit 10 Categorical Data Analysis 1.

CI for an Odds Ratio Calculation

Calculation for SE after logarithmic transformation
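The standard error of the log odds ratio is commonly taken as the square root of the sum of reciprocal cell counts. This is assumed to be the calculation shown on the slide, and it reproduces the interval reported in the conclusion. With the case study cells A = 8, B = 5, C = 6, D = 33:

```latex
SE(\ln OR) = \sqrt{\frac{1}{A} + \frac{1}{B} + \frac{1}{C} + \frac{1}{D}}
           = \sqrt{\frac{1}{8} + \frac{1}{5} + \frac{1}{6} + \frac{1}{33}}
           \approx 0.722
```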

Page 34: Biostatistics Unit 10 Categorical Data Analysis 1.

CI for an Odds Ratio Calculation

Page 35: Biostatistics Unit 10 Categorical Data Analysis 1.

CI for an Odds Ratio Calculation

Take the antilog of each limit to complete the calculation.

Conclusion: The odds ratio is 8.8. We are 95% confident that the true value lies between 2.14 and 36.3.
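The whole calculation can be reproduced with a short sketch, assuming a reliability coefficient of 1.96 for 95% confidence and the sum-of-reciprocals standard error on the log scale. The variable names are illustrative.

```python
import math

a, b, c, d = 8, 5, 6, 33      # case study cells: exposed (sick, not sick), unexposed (sick, not sick)
z = 1.96                      # assumed reliability coefficient for 95% confidence

odds_ratio = (a * d) / (b * c)
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

log_or = math.log(odds_ratio)
lower = math.exp(log_or - z * se_log_or)   # antilog of the lower log-scale limit
upper = math.exp(log_or + z * se_log_or)   # antilog of the upper log-scale limit

print(f"OR = {odds_ratio:.1f}, 95% CI = ({lower:.2f}, {upper:.1f})")
```

The limits are not symmetric around 8.8 because the interval is symmetric only on the log scale.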

Page 36: Biostatistics Unit 10 Categorical Data Analysis 1.

fin