Organizing and Displaying Epidemiologic Data with Tables and Graphs.


Page 1: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Organizing and Displaying Epidemiologic Data with Tables and Graphs

Page 2: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Displaying Data

Learning Objectives

Discuss the difference between tables and graphs for written reports versus oral presentations
Create and interpret one- and two-variable tables
Create and interpret a line graph
Create and interpret an epidemic curve
Create and interpret one- and two-variable bar charts
Describe when to use each type of table, graph, and chart

Page 3: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Can you summarize the age and sex of the case-patients at a glance?

Case No.  Date of Onset  Age  Sex
1         21 Nov         9    M
2         21 Nov         39   M
3         22 Nov         29   F

Page 4: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Can you summarize the age and sex of the case-patients at a glance?

Case No.  Date of Onset  Age  Sex
1         21 Nov         9    M
2         21 Nov         39   M
3         22 Nov         29   F
4         21 Nov         10   M
5         22 Nov         55   F
6         22 Nov         11   M

Page 5: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Can you summarize the age and sex of the case-patients at a glance?

Case No.  Age  Sex    Case No.  Age  Sex    Case No.  Age  Sex    Case No.  Age  Sex
1         9    M      11        10   M      21        38   F      31        10   M
2         39   M      12        6    M      22        34   F      32        31   F
3         29   F      13        9    M      23        9    M      33        8    F
4         10   M      14        40   M      24        10   M      34        9    M
5         55   F      15        40   F      25        6    F      35        10   F
6         11   M      16        10   M      26        11   M      36        11   M
7         9    M      17        11   M      27        9    M      37        38   M
8         7    F      18        43   F      28        41   M      38        11   M
9         17   M      19        71   F      29        6    M      39        7    M
10        10   M      20        9    F      30        11   M      40        16   F

Page 6: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Basic Methods for Organizing and Presenting Data

Data can be organized through creation of:
– Tables
– Graphs
– Charts

Page 7: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Why organize and present data?

To summarize when a data set has too many records to look at individually
To become familiar with the data before analysis, and to catch errors
To look for (and display):
– Patterns
– Trends
– Relationships
– Exceptions / outliers
To communicate findings to others

Page 8: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Written vs. Oral Presentation

Written: time unlimited; details OK; white, grey, and black
Oral: time < 1 min; less detail; colors possible

Page 9: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

How to organize data

Identify what data you have
Use tables and graphs to summarize; catch errors; identify patterns, relationships
Decide how best to summarize the data to communicate the findings
Use tables and graphs to communicate the findings effectively

Page 10: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Tables

Page 11: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Tables

Data are arranged in rows and columns
Quantitative information
Usually present the frequency of occurrence of some event or characteristic in different subgroups

Page 12: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Tables

Type of injury by sex, Port-au-Prince field hospital, Haiti, January 13 – May 28, 2010

Sex       Earthquake-related injury   Other injury   Total
Male                             74            259     333
Female                           85            151     236
Unknown                           3              9      12
Total                           162            419     581

Elements labeled on the slide: column; row; cell; clear, concise labels; row totals; column totals; descriptive title (what, where, when)

CDC. Post-earthquake injuries treated at a field hospital — Haiti, 2010. MMWR 59:1673-1677.

Footnote, source

Unknown, if needed

Page 13: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Types of Tables

1-variable table (frequency distribution)
– Range of values of a single variable
– Number of observations with each value
2-variable table
– Counts shown according to 2 variables at once
3-variable table
– Counts shown according to 3 variables at once
Composite (combination) tables

Page 14: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Example of 1-Variable Table: Tuberculosis Cases by Sex, U.S., 2009

Table 1. Number of Reported Cases of Tuberculosis, by Sex, United States, 2009

Sex       # Cases
Males       6,990
Females     4,544
Unknown        11
Total      11,545

CDC. Reported Tuberculosis in the U.S., 2009. Atlanta: CDC, October 2010.

Page 15: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Example of 1-Variable Table: Tuberculosis Cases by Age, U.S., 2009

Table 2. Number of Reported Cases of Tuberculosis, by Age, United States, 2009

Age Group (years)   # Cases
< 5                     401
5 – 14                  245
15 – 24               1,274
25 – 44               3,893
45 – 64               3,434
≥ 65                  2,292
Unknown                   6
Total                11,545

CDC. Reported Tuberculosis in the U.S., 2009. Atlanta: CDC, October 2010.

Page 16: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Example of 1-Variable Table, with Percent Column

Table 2. Number of Reported Cases of Tuberculosis, by Age, United States, 2009

Age Group (years)   # Cases   Percent
< 5                     401      3.5%
5 – 14                  245      2.1%
15 – 24               1,274     11.0%
25 – 44               3,893     33.7%
45 – 64               3,434     29.7%
≥ 65                  2,292     19.9%
Unknown                   6      0.1%
Total                11,545    100.0%

CDC. Reported Tuberculosis in the U.S., 2009. Atlanta: CDC, October 2010.
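The percent column can be derived directly from the counts. The sketch below is one way to do it in Python, using the counts from Table 2; the function name `percent_column` is made up for this illustration, and the youngest group is written as `< 5` so that it does not overlap `5-14`.

```python
# Counts from Table 2 (reported TB cases by age group, United States, 2009).
cases_by_age = {
    "< 5": 401,
    "5-14": 245,
    "15-24": 1274,
    "25-44": 3893,
    "45-64": 3434,
    ">= 65": 2292,
    "Unknown": 6,
}


def percent_column(counts):
    """Return {category: percent of the grand total}, rounded to one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


percents = percent_column(cases_by_age)
for label, n in cases_by_age.items():
    print(f"{label:<8} {n:>6,} {percents[label]:>5.1f}%")
print(f"{'Total':<8} {sum(cases_by_age.values()):>6,} 100.0%")
```

Because each percent is rounded separately, the rounded values will not always add up to exactly 100.0; the total row is therefore printed as 100.0% rather than summed from the rounded column.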

Page 17: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Creating Categories

Mutually exclusive, all inclusive
Choices
– Standard categories for the disease
– Equal intervals
– Equal numbers within each group
Include category for unknown values
When analyzing data, begin with more categories, then collapse into a smaller number of categories for presentation
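As a rough illustration of "mutually exclusive, all inclusive", the sketch below assigns each age to exactly one group by looking up the group's lower bound, and sends missing values to an "Unknown" category. The boundaries and the name `age_group` are assumptions chosen for the example, not a prescribed standard.

```python
from bisect import bisect_right

# Lower bound (in years) of each age group; illustrative boundaries only.
LOWER_BOUNDS = [0, 5, 15, 25, 45, 65]
LABELS = ["< 5", "5-14", "15-24", "25-44", "45-64", ">= 65"]


def age_group(age):
    """Map an age in years to exactly one label.

    Missing (None) or negative ages go to 'Unknown', so every record
    lands in some category and no record lands in two.
    """
    if age is None or age < 0:
        return "Unknown"
    # bisect_right counts how many lower bounds are <= age.
    return LABELS[bisect_right(LOWER_BOUNDS, age) - 1]


for age in (4, 5, 14, 15, 71, None):
    print(age, "->", age_group(age))
```

Defining groups by lower bound alone makes overlap impossible: a boundary age such as 5 can only belong to the group that starts at 5.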

Page 18: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Some Standard Categories in U.S.

Notifiable Diseases: < 1 year, 1–4, 5–9, 10–14, 15–19, 20–24, 25–29, 30–39, 40–49, 50–59, ≥ 60, Not stated, Total

P&I mortality: < 28 days, 28 d – 1 yr, 1–14, 15–24, 25–44, 45–64, 65–74, 75–84, ≥ 85, Unknown, Total

NCHS mortality: < 1 year, 1–4, 5–14, 15–24, 25–34, 35–44, 45–54, 55–64, 65–74, 75–84, ≥ 85, Not stated, Total

HIV/AIDS: < 5 years, 5–12, 13–14, 15–19, 20–24, 25–29, 30–34, 35–39, 40–44, 45–49, 50–54, 55–59, 60–64, ≥ 65, Total

Page 19: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Two-Variable Tables

Show counts according to two variables simultaneously
Also called “cross-tab” or contingency tables

Page 20: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Example of Two-variable Table

Table 3. Number of Reported Cases of Tuberculosis, by Age and Sex, United States, 2009

Age Group   Females   Males   Unk    Total
< 5             187     214     0      401
5 – 14          119     126     0      245
15 – 24         559     713     2    1,274
25 – 44       1,641   2,247     5    3,893
45 – 64       1,153   2,278     3    3,434
≥ 65            882   1,409     1    2,292
Unknown           3       3     0        6
Total         4,544   6,990    11   11,545

CDC. Reported Tuberculosis in the U.S., 2009. Atlanta: CDC, October 2010.
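A two-variable table can be built from a line listing by counting records in each (row, column) cell. The sketch below does this for case-patients 1 through 10 of the line listing shown earlier; the two age bands and the helper names (`crosstab`, `age_band`, `sex`) are assumptions made for the illustration.

```python
from collections import Counter

# (age in years, sex) for case-patients 1-10 of the line listing.
cases = [(9, "M"), (39, "M"), (29, "F"), (10, "M"), (55, "F"),
         (11, "M"), (9, "M"), (7, "F"), (17, "M"), (10, "M")]


def crosstab(records, row_of, col_of):
    """Count records in each (row, column) cell of a two-variable table."""
    return Counter((row_of(r), col_of(r)) for r in records)


def age_band(record):
    # Two illustrative bands; a real report would use standard categories.
    return "< 15" if record[0] < 15 else ">= 15"


def sex(record):
    return record[1]


cells = crosstab(cases, age_band, sex)
rows, cols = ["< 15", ">= 15"], ["F", "M"]

# Print the table with row totals and column totals.
print(f"{'Age group':<10}" + "".join(f"{c:>6}" for c in cols) + f"{'Total':>7}")
for r in rows:
    counts = [cells[(r, c)] for c in cols]
    print(f"{r:<10}" + "".join(f"{n:>6}" for n in counts) + f"{sum(counts):>7}")
col_totals = [sum(cells[(r, c)] for r in rows) for c in cols]
print(f"{'Total':<10}" + "".join(f"{n:>6}" for n in col_totals) + f"{sum(col_totals):>7}")
```

Checking that the row totals and column totals both add up to the number of records is a quick way to catch a record that fell into no category.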

Page 21: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Example of Two-by-Two Table

Drank from stream near Campsite 6?

         Ill   Well   Total
Yes       18      4      22
No         5     39      44
Total     23     43      66

Page 22: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Example of Two-by-Two Table

Drank from stream near Campsite 6?

         Ill   Well   Total   Attack Rate (%)
Yes       18      4      22   81.8%
No         5     39      44   11.4%
Total     23     43      66
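The attack rate in each row is the number ill divided by the row total. A minimal sketch of that arithmetic, using the counts from the table above (the function name `attack_rate` is chosen for this example):

```python
def attack_rate(ill, well):
    """Percent ill among everyone in the exposure group."""
    return 100 * ill / (ill + well)


exposed = attack_rate(ill=18, well=4)     # drank from the stream
unexposed = attack_rate(ill=5, well=39)   # did not drink from the stream
print(f"Yes: {exposed:.1f}%   No: {unexposed:.1f}%")
```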

Page 23: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Example of Three-variable Table

Table 3. Number of Reported Cases of Tuberculosis, by Age, Sex, and Birth Country, United States, 2009

                 Females             Males
Age group     U.S.    Other      U.S.    Other     Total
< 5            167       20       183       31       401
5–14            82       37        72       54       245*
15–24          178      377       207      499     1,274*
25–44          411    1,215       635    1,591     3,893*
45–64          463      669     1,172    1,080     3,434*
65+            365      509       631      761     2,292*
Total        1,667*   2,829*    2,900*   4,019*   11,545*

* Totals include cases with missing age, sex, or birth country

CDC. Reported Tuberculosis in the U.S., 2009. Atlanta: CDC, October 2010.

Page 24: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Composite (Combination) Tables

Combines two or more 1-way or 2-way tables
Uses limited space efficiently
Well suited for written and oral presentations, but simple tables must be prepared first

Page 25: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Composite Table Example

Ortiz, Katz, Mahmoud, et al. J Infect Dis 2007;196:1685-1691

Page 26: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Why Tables?

When too many records, summarize in table (or graph)

Allow you to identify, explore, understand, and present distributions, trends, relationships, variations, and exceptions in the data

Tables serve as basis for graphs – always create a table first!

Page 27: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Some Tips for Creating Printed Tables

Keep it simple
Should be self-explanatory
Title (what, where, when) with table number
Label each row and column clearly and concisely
Include units of measurement (years, mg/dl, etc.)
Show totals for rows and columns
Explain codes, abbreviations, symbols
Note any exclusions in a footnote
Note source in a footnote

Page 28: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Graphs

Page 29: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Graphs

Display quantitative data using a set of coordinates

Rectangular graphs (x, y coordinates) most common

x axis along bottom = method of classification, often time

y axis along side = frequency, usually number, percent or rate

Page 30: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Graphs: Advantages and Disadvantages

Advantages
Easy to understand and interpret
Reveal patterns in data
– Useful for generating hypotheses
– Useful before formal data analysis
Disadvantage
Loss of detail

Page 31: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Graph Types

Arithmetic-scale line graph
Histogram
Many other types, not covered in this lecture
– Semilogarithmic-scale line graph
– Frequency polygon
– Cumulative frequency curve
– Survival curve
– Scatter diagram

Page 32: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Arithmetic Scale Line Graph

Figure: example line graph; y-axis labeled "# Cases"

Intervals on x-axis are equal
Intervals on y-axis are equal
Start y-axis at 0; use scale breaks only if you must
Useful to portray data collected over time

Page 33: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Creating a Line Graph

Make x-axis longer than y-axis (best ratio 5:3)
X-axis: Match x-axis scale to intervals used during data collection
Y-axis:
– Always start y-axis with 0
– Identify largest value, round up for maximum Y value
– Select reasonable intervals for y-axis
Plot data
Create title
Add comments, footnotes
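The y-axis steps above (start at 0, round the largest value up, pick reasonable intervals) can be sketched in a few lines of Python. This is only one possible rule of thumb: the function name `y_axis`, the default of five intervals, and the "1, 2, or 5 times a power of ten" step sizes are assumptions, not part of the lecture. The example value 481,530 is the 1963 measles count shown on a later slide.

```python
import math


def y_axis(largest_value, n_intervals=5):
    """Return (axis_max, tick_values) for a y-axis that starts at 0.

    The step is the smallest of 1, 2, 5, or 10 times a power of ten that
    covers largest_value in about n_intervals steps; the axis maximum is
    largest_value rounded up to a multiple of that step.
    """
    if largest_value < n_intervals:
        raise ValueError("this sketch expects counts of at least n_intervals")
    raw_step = largest_value / n_intervals
    magnitude = 10 ** math.floor(math.log10(raw_step))
    step = next(m * magnitude for m in (1, 2, 5, 10) if m * magnitude >= raw_step)
    axis_max = step * math.ceil(largest_value / step)
    ticks = list(range(0, axis_max + step, step))
    return axis_max, ticks


axis_max, ticks = y_axis(481530)
print(axis_max)
print(ticks)
```

Whatever rule is used, the result should keep equal intervals along the axis and include 0 as the first tick.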

Page 34: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Creating a Line Graph: X-axis and Y-axis

Figure: blank axes, labeled "Y-axis" and "X-axis"

Page 35: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Creating a Line Graph: Complete X-axis, Label X-axis

Figure: x-axis completed with data for years 1960 – 2008

Page 36: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Creating a Line Graph: Complete Y-axis, Label Y-axis

Figure: y-axis labeled "Number of Cases"; largest value is 481,530 cases in 1963

Page 37: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Creating a Line Graph: Plot the Data

Figure: data plotted; y-axis labeled "Number of Cases"

Page 38: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Creating a Line Graph: Add Title

Figure title: Number of Reported Cases of Measles by Year, United States, 1960–2008
Y-axis label: Number of Cases

Page 39: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Creating a Line Graph: Add Comments, Footnotes, Source

Figure title: Number of Reported Cases of Measles by Year, United States, 1960–2008
Y-axis label: Number of Cases
Annotation: Vaccine licensed

CDC. Summary of Notifiable Diseases, U.S., 2008. Atlanta: CDC, June 2010.

Page 40: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Graph with Inset

Figure title: Number of Reported Cases of Measles by Year, United States, 1960–2008
Y-axis label: Number of Cases
Annotation: Vaccine licensed

CDC. Summary of Notifiable Diseases, U.S., 2008. Atlanta: CDC, June 2010.

Page 41: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Age-Adjusted Death Rates for Leading Causes of Death, United States, 1987-2005

Y-axis label: Deaths per 100,000

Page 42: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Comments on Arithmetic-Scale Line Graph

Method of choice for plotting rates over time
X-axis almost always time (rarely, age)
Y-axis can be counts, proportions, or rates
– Y-axis should start with 0
– Determine largest value of Y needed to plot
– Round off that number and divide into intervals
Set distance on either axis represents same quantity anywhere on that axis
Good for comparing 2 or more sets of data

Page 43: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Histogram

“Epidemic curve” in outbreak investigations
Frequency distribution of quantitative data
x axis continuous, usually time (onset or diagnosis date)
No spaces between adjacent columns, i.e., adjacent columns “touch”
Easiest to interpret with equal class (x) intervals
Column height proportional to number of observations in that interval
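To make the "equal class intervals" point concrete, the sketch below counts cases per day of onset and keeps days with zero cases, so every interval on the x-axis has the same width. The onset dates are invented for illustration only, and the function name `epidemic_curve` is an assumption.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical onset dates, invented for illustration only.
onsets = [date(2024, 3, d) for d in (4, 5, 5, 6, 6, 6, 7, 7, 9)]


def epidemic_curve(onset_dates):
    """Count cases per day for every day from first to last onset.

    Days with no cases are kept (count 0) so x-axis intervals stay equal.
    """
    counts = Counter(onset_dates)
    first, last = min(onset_dates), max(onset_dates)
    n_days = (last - first).days + 1
    days = [first + timedelta(days=i) for i in range(n_days)]
    return [(day, counts[day]) for day in days]


curve = epidemic_curve(onsets)
for day, n in curve:
    print(f"{day:%d %b} {'X' * n}")   # one X per case, like one box per case
```

The text output is only a stand-in for a drawn histogram; in a real epidemic curve the columns for adjacent days would touch.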

Page 44: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Histogram

Page 45: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Number of Cases of Salmonella Enteritidis by Date of Onset, Chicago, February 2000

Figure: epidemic curve. X-axis: Date and Time of Symptom Onset, Feb. 13 to 21. Legend: one box = one case. Annotations: "Party"; "No spaces between adjacent columns".

Page 46: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Number of Cases of Salmonella Enteritidis by Date of Onset, Chicago, February 2000

Figure: epidemic curve. X-axis: Date and Time of Symptom Onset, Feb. 13 to 21. Legend: one box = one case. Annotation: "Party".

Page 47: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Number of Cases of Salmonella Enteritidis by Date of Onset, Chicago, February 2000

Figure: epidemic curve. X-axis: Date and Time of Symptom Onset, Feb. 13 to 21. Legend: probable case; culture-confirmed case. Annotation: "Party".

Page 48: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Charts

Page 49: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Charts

Display quantitative data using only one coordinate
Most appropriate for comparing data with discrete categories
Common types include:
– Bar charts
– Pie charts
– Maps
– Other

Page 50: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Bar Charts

Can be vertical or horizontal
Use for variable with discrete, non-linear categories, such as county
Has space between “columns”, since categories are not continuous
4 types: simple, grouped, stacked, 100% component
Best type depends on desired emphasis

Page 51: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Reported TB Cases by Race/Ethnicity, United States, 2001 (Simple Bar)

Page 54: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

HCV Prevalence by Selected Groups, United States

Figure: bar chart. Axis label: Average Percent Anti-HCV Positive. Groups shown: hemophilia; injecting drug users; surgeons; hemodialysis; general population adults; military personnel; STD clients; pregnant women.

Page 55: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Number of Reported Tuberculosis Cases by Birth Country and Year, U.S., 1991-2007 (Grouped Bar Chart)

Y-axis label: Number of Cases

Page 56: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Number of Reported Tuberculosis Cases by Birth Country and Year, U.S., 1991-2007 (Stacked Bar Chart)

Y-axis label: Number of Cases

Page 58: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

100% Component Bar Chart

All bars same height (100%)
Components shown as proportions of the total, not actual values
Good for comparing how components contribute to the whole within a group
Not useful for comparing relative sizes of the components across different groups because the denominator changes
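The conversion behind a 100% component bar chart is a per-group division: each component is divided by its own group's total. A small sketch follows, using two age groups from the age-by-sex tuberculosis table shown earlier; the function name `to_percent_components` is made up for the example.

```python
def to_percent_components(groups):
    """Convert each group's component counts to percentages of that group's total."""
    result = {}
    for group, components in groups.items():
        total = sum(components.values())  # the denominator differs by group
        result[group] = {name: 100 * n / total for name, n in components.items()}
    return result


# Two age groups taken from the age-by-sex tuberculosis table.
tb_cases = {
    "25-44": {"Females": 1641, "Males": 2247, "Unknown": 5},
    "45-64": {"Females": 1153, "Males": 2278, "Unknown": 3},
}

shares = to_percent_components(tb_cases)
for group, parts in shares.items():
    print(group, {name: round(pct, 1) for name, pct in parts.items()})
```

Because each bar has its own denominator, a larger female share in one group does not mean that group has more female cases; the counts are needed for that comparison.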

Page 59: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Number of Reported Tuberculosis Cases by Birth Country and Year, U.S., 1991-2007 (100% Component Bar Chart)

Y-axis label: Proportion of Cases

Page 60: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Pie Charts

Show components of a whole
Size of “slice” = proportional contribution of each component
Hard to compare two or more pie charts
Begin at 12 o’clock with largest slice and proceed clockwise
Provide label and percent for each slice
Don’t use 3-D!
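The "largest slice first, starting at 12 o'clock, proceeding clockwise" rule amounts to sorting the components and accumulating angles. The sketch below works under that reading; the function name `pie_slices` is an assumption, and the percentages are the rounded values from the race/ethnicity pie chart slide that follows, which sum to 99, so the code scales by their total.

```python
def pie_slices(percents):
    """Order slices largest first and return (label, start, end) angles.

    Angles are in degrees, measured clockwise from 12 o'clock.
    """
    total = sum(percents.values())
    ordered = sorted(percents.items(), key=lambda item: item[1], reverse=True)
    slices, start = [], 0.0
    for label, value in ordered:
        end = start + 360 * value / total
        slices.append((label, start, end))
        start = end
    return slices


tb_by_group = {
    "Hispanic": 25,
    "Black, non-Hispanic": 30,
    "Asian/Pacific Islander": 22,
    "White, non-Hispanic": 21,
    "American Indian/Alaska Native": 1,
}

slices = pie_slices(tb_by_group)
for label, start, end in slices:
    print(f"{label:<32} {start:6.1f} to {end:6.1f} degrees")
```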

Page 61: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Reported TB Cases by Race/Ethnicity, United States, 2001 (Pie Chart)

Hispanic (25%)
Black, non-Hispanic (30%)
Asian/Pacific Islander (22%)
White, non-Hispanic (21%)
American Indian/Alaska Native (1%)

Page 62: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Some Tips for Creating Printed Graphs

Should be self-explanatory
Title (what, where, when) with figure number
Label each axis clearly and concisely
Include units of measurement (years, mg/dl, etc.)
In epidemiology, start Y-axis at zero
Epidemic curve = histogram

Page 63: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Selecting the Right Presentation Method 1

Type of Graph or Diagram: Application

Arithmetic-scale line graph: Number, proportion, or rate over time
Histogram:
1. Frequency distribution for a continuous variable
2. Number of cases during an epidemic (epidemic curve) or over time

Page 64: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Selecting the Right Presentation Method 2

Type of Graph or Diagram: Application

Simple bar chart: Compare the size or frequency of different categories of the same variable
Grouped bar chart: Compare the size or frequency of different categories across 2 or more variables
Stacked bar chart: Compare totals and display component parts for 2 or more categories of second variable
Pie chart: Display parts of a whole

Page 65: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Question 1: What’s Wrong With This Graph?

Figure title: Reported Tuberculosis Cases, United States, 1981-2007
X-axis label: Year
Y-axis label: No. of Cases

Source: http://wonder.cdc.gov/tb-v2007.html

Page 66: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Answer 1: Misleading

Figure title: Reported Tuberculosis Cases, United States, 1981-2007
X-axis label: Year
Y-axis label: No. of Cases

Source: http://wonder.cdc.gov/tb-v2007.html

Page 67: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Question 2: What’s Wrong With This Epi Curve?

Page 68: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Number of Cases of Gastroenteritis, Warehouse Workers, TN, August 2003

Figure: epidemic curve. Annotation: "Catered dinner".

* Not counted as case

Page 69: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Question 3: What’s Wrong With This Graph?

Figure title: Rate* of Invasive Pneumococcal Disease by Age Group, United States, 1998

* Rate per 100,000 population

Page 70: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Figure title: Rate* of Invasive Pneumococcal Disease by Age Group, U.S., 1998

* Rate per 100,000 population

Page 71: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Question 4: What’s Wrong With This Table?

Number of Reported Cases of Syphilis (P&S) by Age, United States, 2002

Age Group (years)   # Cases
< 15                     15
15 – 20                 351
20 – 25                 842
25 – 30                 895
30 – 35               1,097
35 – 40               1,367
40 – 45               1,023
45 – 55                 982
55+                     284
Total                 6,862

Page 72: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Number of Reported Cases of Syphilis (P&S) by Age, United States, 2002

Age Group (years)   # Cases
≤ 14                     15
15 – 19                 351
20 – 24                 842
25 – 29                 895
30 – 34               1,097
35 – 39               1,367
40 – 44               1,023
45 – 54                 982
≥ 55                    284
Total                 6,862
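The difference between the two syphilis tables is that the first uses age groups that share a boundary (20 appears in both "15 – 20" and "20 – 25"). A check like the one sketched below can flag such overlaps, as well as gaps, before a table is published. The function name `check_categories` and the (low, high) encoding are assumptions made for this example; the youngest group is encoded as ages 0 through 14.

```python
def check_categories(bounds):
    """Report overlaps and gaps between consecutive integer age groups.

    `bounds` is a list of (low, high) pairs in ascending order, inclusive
    on both ends; only the last group may have high=None (open-ended).
    """
    problems = []
    for (lo1, hi1), (lo2, hi2) in zip(bounds, bounds[1:]):
        if hi1 is None:
            problems.append(f"open-ended group {lo1}+ is not last")
        elif lo2 <= hi1:
            problems.append(f"ages {lo2}-{hi1} fall in two groups")
        elif lo2 > hi1 + 1:
            problems.append(f"ages {hi1 + 1}-{lo2 - 1} fall in no group")
    return problems


flawed = [(0, 14), (15, 20), (20, 25), (25, 30), (30, 35),
          (35, 40), (40, 45), (45, 55), (55, None)]
fixed = [(0, 14), (15, 19), (20, 24), (25, 29), (30, 34),
         (35, 39), (40, 44), (45, 54), (55, None)]

print(check_categories(flawed))
print(check_categories(fixed))
```

The check only covers the category boundaries; it says nothing about whether the counts in each row add up to the stated total.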

Page 73: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Summary

Data can be organized through the creation of tables, graphs and charts
The purpose of creating these visual displays is to:
1. verify and analyze the data
2. explore patterns and trends
3. communicate information to others
An effective figure should be interpretable without any additional information

Page 74: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Summary 2

Tables can illustrate the number of people with particular characteristics and can provide valuable information about relationships between 2 variables
Line graphs are useful for showing patterns or trends over some variable, usually time
Histograms are most commonly used in epidemiology for epidemic curves (cases by time)
Bar charts provide a visual display of data from a one-variable table, but grouped bar charts can show 2 variables

Page 75: Organizing and Displaying Epidemiologic Data with Tables and Graphs.

Conclusion

Choose the tool that best serves the data and purpose
Start with tables
Use appropriate titles and labels
Print ≠ PowerPoint
KISS (message, colors, dimensions)