Classifying Health-related Data for Disease Mapping IRWIN F. CHAVEZ FACULTY OF TROPICAL MEDICINE MAHIDOL UNIVERSITY APRIL 24, 2018 THAI GIS NET MEETUP SERIES


Page 1: Classifying Health-related Data for Disease Mappingthaigis.net/wp-content/uploads/2018/04/Classifying... · 2018-04-27 · Mitchell, Andy. (1999). The ESRI® Guide to GIS Analysis

Classifying Health-related Data for Disease Mapping

IRWIN F. CHAVEZ

FACULTY OF TROPICAL MEDICINE

MAHIDOL UNIVERSITY

APRIL 24, 2018 THAI GIS NET MEETUP SERIES


Map of zoonotic pathogens from wildlife, shown from lowest occurrence (green) to highest (red)

SOURCES: http://www.columbia.edu/cu/news/08/02/hotspots.html; https://www.nature.com/articles/nature06536.pdf


In disease mapping…

Data can be presented in an infinite number of ways

Mapping paradox

◦ To show something, you have to “hide” something

◦ How to Lie with Maps by Mark Monmonier

◦ Mapmakers are compelled to tell “white lies”

◦ Multitude of ways to “lie”

◦ The “lies” are either a) necessary, b) deliberate, c) unintended


District-level map of Thailand

926 features


Equal interval

Intervals are based on the data range (maximum value minus minimum value) divided by the number of intended classes

EXAMPLE:

Value range: 0 to 600

Classified into 5 groups

◦ 0 to 120

◦ 121 to 240

◦ 241 to 360

◦ 361 to 480

◦ 481 to 600

DOES NOT take into account data distribution
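The equal interval rule above can be sketched in a few lines of Python. This is a minimal illustration, not the routine any particular GIS package uses: the function names and the sample values are hypothetical, and the sample is deliberately skewed to show what ignoring the data distribution looks like in practice.

```python
from bisect import bisect_left
from collections import Counter


def equal_interval_breaks(values, n_classes):
    """Upper class boundaries: the data range split into equal widths."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes + 1)]


def classify(value, breaks):
    """Index of the first class whose upper boundary is >= value."""
    # The clamp guards against floating-point error in the last boundary.
    return min(bisect_left(breaks, value), len(breaks) - 1)


# Hypothetical, skewed sample with the same 0 to 600 range as the example.
values = [0, 3, 8, 12, 20, 35, 44, 90, 250, 600]
breaks = equal_interval_breaks(values, 5)

counts = Counter(classify(v, breaks) for v in values)
features_per_class = [counts.get(i, 0) for i in range(5)]

print(breaks)
print(features_per_class)
```

With a sample like this, most features fall into the lowest class and some classes stay empty, because the boundaries depend only on the minimum and maximum.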


Equal interval map of dengue fever/dengue hemorrhagic fever (DF/DHF) cases

876 features; 50 missing data

652/5 = 130.4, rounded up to a class width of 131

Number of features per category


Natural breaks

Also called Jenks’ optimization method

Organizes data into “natural groups”

Class boundaries are recomputed iteratively until the within-class deviations are as low as possible

Takes into account data distribution and variability and uses those to identify “natural” categories
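The objective behind natural breaks, the lowest possible within-class deviation, can be illustrated with a small sketch. Note the substitution: instead of the iterative reassignment described above, this sketch searches all contiguous groupings of the sorted data with dynamic programming and minimizes the within-class sum of squared deviations. The function name and sample values are hypothetical, and it assumes there are at least as many values as classes.

```python
def natural_breaks(values, n_classes):
    """Upper class boundaries minimizing within-class squared deviation."""
    data = sorted(values)
    n = len(data)

    # Prefix sums give the squared deviation of any slice in O(1).
    s1, s2 = [0.0], [0.0]
    for v in data:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def ssd(i, j):
        """Sum of squared deviations from the mean for data[i:j]."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: best total deviation when data[:j] forms k classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                candidate = cost[k - 1][i] + ssd(i, j)
                if candidate < cost[k][j]:
                    cost[k][j] = candidate
                    split[k][j] = i

    # Walk the stored split points back to recover each class's top value.
    uppers, j = [], n
    for k in range(n_classes, 0, -1):
        uppers.append(data[j - 1])
        j = split[k][j]
    return uppers[::-1]


# Hypothetical values with three visible clusters.
sample = [1, 2, 3, 50, 51, 52, 200, 201, 202]
print(natural_breaks(sample, 3))
```

The exhaustive search is slow for large datasets; it is shown here only because it makes the optimization target explicit.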


Map of DF/DHF cases using Jenks natural breaks

876 features; 50 missing data


Standard deviation

Based on the data’s mean and standard deviation

Takes into account data distribution and dispersion in building class boundaries


Standard deviational map of DF/DHF cases

876 features; 50 missing data

Mean = 44.5; SD = 66.3
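As a rough sketch of how class boundaries follow from these two numbers, the snippet below places a boundary at the mean plus or minus whole multiples of the standard deviation. The function names and the choice of multiples (-1, 0, 1, 2) are assumptions for illustration; GIS packages differ in the multiples they offer and in whether they use the population or sample standard deviation.

```python
from statistics import mean, pstdev


def std_dev_breaks(center, spread, multiples=(-1, 0, 1, 2)):
    """Class boundaries at center + k * spread for each multiple k."""
    return [round(center + k * spread, 1) for k in multiples]


def summarize(values):
    """Mean and population standard deviation (assumed variant)."""
    return mean(values), pstdev(values)


# Boundaries from the mean and SD reported for the DF/DHF case counts.
breaks = std_dev_breaks(44.5, 66.3)
print(breaks)

# The lowest boundary (mean minus one SD) is negative here, a value that
# a case count can never take.
print(breaks[0] < 0)
```

When the standard deviation exceeds the mean, as it does here, every class below the mean collapses into a single usable class, which is worth checking before choosing this method for count data.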



Quantile

Based on ranks

4 – quartile; 5 – quintile; 10 – decile; 100 – centile

Reduces the effect of outliers
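A minimal sketch of rank-based classification, assuming the simplest possible rule: sort the values and cut the ranks into equally sized groups. The function name and sample values are hypothetical, and real packages usually interpolate cut points rather than assign classes directly from ranks.

```python
def quantile_classes(values, n_classes):
    """Assign each value a class from its rank, not its magnitude."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for rank, idx in enumerate(order):
        classes[idx] = rank * n_classes // len(values)
    return classes


# Hypothetical values with one extreme outlier (1000).
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]
print(quantile_classes(values, 5))
```

Because only the rank matters, the outlier simply joins the top quintile alongside the value 9 instead of stretching the class boundaries, and every class ends up with the same number of features.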


Quantile classification of DF/DHF cases

876 features; 50 missing data

5 categories (quintiles)


Quantile

Based on ranks

4 – quartile; 5 – quintile; 10 – decile; 100 – centile

WAIT… a few considerations

◦ Nature of data

◦ Is there ranking?

◦ Tied observations


Quantile classification of DF/DHF incidence rate per 10,000 population

876 features; 50 missing data

5 categories (quintiles)
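The incidence rate mapped here is the case count scaled by population. A short sketch, with a hypothetical function name and made-up district figures, shows why the same count can mean very different things:

```python
def incidence_rate(cases, population, per=10_000):
    """Cases per `per` population."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing to keep simple inputs exact.
    return cases * per / population


# Two hypothetical districts reporting the same number of cases.
small_district = incidence_rate(50, 20_000)
large_district = incidence_rate(50, 500_000)
print(small_district, large_district)
```

Both districts report 50 cases, yet the rate in the smaller district is 25 times higher, so the two would rank very differently on a rate map than on a count map.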



Quantile classification is not suitable for counts
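One way to see the problem with tied observations is to rank a set of counts that contains many zeros. The sketch below uses hypothetical counts and a simple rank-based rule (an assumption, not the behavior of any specific GIS package) and reports which values end up in more than one class.

```python
from collections import defaultdict


def rank_classes(values, n_classes):
    """Rank-based classes; ties are broken by position in the list."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for rank, idx in enumerate(order):
        classes[idx] = rank * n_classes // len(values)
    return classes


def split_ties(values, classes):
    """Values whose tied observations were spread over several classes."""
    seen = defaultdict(set)
    for value, cls in zip(values, classes):
        seen[value].add(cls)
    return {v: sorted(c) for v, c in seen.items() if len(c) > 1}


# Hypothetical case counts: most districts report zero cases.
counts = [0, 0, 0, 0, 0, 0, 1, 2, 5, 40]
classes = rank_classes(counts, 5)
print(split_ties(counts, classes))
```

Districts with an identical count of zero are spread across three different classes, so the map would shade them differently even though their data are the same.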


Custom classification

Counts of reported malaria cases in Mekong countries, mapped for administrative level 1 features

SOURCE: Mekong Malaria II Update of malaria, multi-drug resistance and economic development in the Mekong region of Southeast Asia; The Southeast Asian Journal of Tropical Medicine and Public Health, vol 34 suppl 4, 2003.
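In a custom classification the mapmaker sets the class boundaries by hand. The sketch below shows the mechanics only: the thresholds and labels are hypothetical and are not the ones used in the Mekong malaria map.

```python
from bisect import bisect_left

# Hypothetical upper bounds chosen by the mapmaker (inclusive).
CUSTOM_UPPER_BOUNDS = [0, 100, 1_000, 10_000]
LABELS = [
    "no reported cases",
    "1 to 100",
    "101 to 1,000",
    "1,001 to 10,000",
    "more than 10,000",
]


def custom_class(count):
    """Label of the hand-defined class that a case count falls into."""
    return LABELS[bisect_left(CUSTOM_UPPER_BOUNDS, count)]


for count in (0, 1, 100, 101, 25_000):
    print(count, "->", custom_class(count))
```

Hand-picked boundaries like these let zero get its own class and let round, meaningful thresholds replace statistically derived ones.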


Custom classification

“Malaria control era”

Species ratio highlights predominating species at administrative level 1 features in the Mekong region (2000-2001)

SOURCE: Mekong Malaria II Update of malaria, multi-drug resistance and economic development in the Mekong region of Southeast Asia; The Southeast Asian Journal of Tropical Medicine and Public Health, vol 34 suppl 4, 2003.


Map of zoonotic pathogens from wildlife, shown from lowest occurrence (green) to highest (red)

SOURCES: http://www.columbia.edu/cu/news/08/02/hotspots.html; https://www.nature.com/articles/nature06536.pdf


Take home messages

Know and understand options to present data

Be critical

GIS is data-centric – keep the statistician in you awake

Be creative in designing maps that optimize data use, but not at the expense of clarity

Don’t worry about the “lies” in your maps – stay true to your map’s objective(s)

… THANK YOU!


LINKS/REFERENCES

Monmonier, Mark. (2005). Lying with Maps. Statistical Science, 20. doi:10.1214/088342305000000241. (https://goo.gl/RX1U6u)

Mitchell, Andy. (1999). The ESRI® Guide to GIS Analysis Vol 1: Geographic Patterns & Relationships. Environmental Systems Research Institute, Inc.

Krygier, John and Wood, Denis. (2005). Making Maps: A Visual Guide to Map Design for GIS. The Guilford Press.

Singhasivanon, Pratap et al. (2003). Mekong Malaria II Update of malaria, multi-drug resistance and economic development in the Mekong region of Southeast Asia. The Southeast Asian Journal of Tropical Medicine and Public Health, Volume 34, Supplement 4.

http://www.columbia.edu/cu/news/08/02/hotspots.html

https://www.nature.com/articles/nature06536.pdf (MU access required to download)

http://open.lib.umn.edu/mapping/chapter/7-lying-with-maps/

https://gisgeography.com/choropleth-maps-data-classification/

https://en.wikipedia.org/wiki/Jenks_natural_breaks_optimization