Project Tycho: US disease data of past century
Large scale historical data for public health: do climate and demographics explain disease patterns?
Wilbert van Panhuis, Dan Bain, Erin Jenkins, Xi Zhang, Yongxu Huang, Patrick Manning
1. Project Tycho: US disease data of past century
2. Integrating disease, climate and demographic data
3. Compilation of demographic data
4. Climate data and seasonality of measles and polio
A project to digitize and render computable public health data from around the world and to provide open access to these data.
Goal: Increased use of public health data for decision making
Vision: Centralized, coordinated access to disaggregated public health data
Strategy:
- Set example using data already in public domain
- Demonstrate value through analyses for decision making
- Establish collaborations to link data and enhance data use
- Explore barriers to Open Access in interdisciplinary context
- Create international guidelines for public health data sharing
Tycho Brahe (1546 – 1601)
Danish nobleman who made accurate and comprehensive observations of the positions of the stars and planets. After his death, Tycho’s assistant Johannes Kepler used these data to derive the laws of planetary motion.
Digitization: 2 years (Year 1, Year 2), 200M keystrokes, 35,000 files
Web interface in beta testing: www.tycho.pitt.edu
Measles, London, 1944-1966
Measles, Pittsburgh, 1906-1953
2. Integrating disease, climate and demographic data
Demographic drivers of disease patterns
Sources: Science, 01-28-2000; Science, 07-17-2009
Axes: crude birth rate (/1000); timing of peak activity
Climate drivers of disease patterns
Science, 01-28-2000
% polio cases per month by latitude: 1956-57, 1965-69
Latitude bands: 35-70ºN, 10-35ºN, 10ºN-10ºS, 10-25ºS, 25-55ºS
Sources: WHSQ, 1979; AJE, 1979
Integrating different data sets
Variable    Disease data     Climate data       Demographic data
Location    Cities/States    Weather stations   Cities, Counties, States
Time        Week             Day                Decade, year

Data transformation required:
- Max. and min. temperature per day -> per week
- Precipitation per day -> per week
- Decennial census data -> interpolations per year
  - Assume no change within years
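
The daily-to-weekly step for the climate data can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the record layout, the use of ISO calendar weeks, and the choice of weekly mean for temperature and weekly total for precipitation are all assumptions, and the values are made up.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical daily station records: (date, tmax, tmin, precipitation).
# 1940-01-01 was a Monday, so these 14 days fill ISO weeks 1 and 2 of 1940.
daily = [
    (date(1940, 1, 1) + timedelta(days=i), 10.0 + i, float(i), 1.0)
    for i in range(14)
]

def to_weekly(records):
    """Aggregate daily records to ISO weeks.

    Assumption: temperatures are averaged and precipitation is summed;
    the slides do not say which weekly statistic was used.
    """
    buckets = defaultdict(list)
    for day, tmax, tmin, precip in records:
        iso = day.isocalendar()
        buckets[(iso[0], iso[1])].append((tmax, tmin, precip))
    weekly = {}
    for key, rows in sorted(buckets.items()):
        n = len(rows)
        weekly[key] = {
            "tmax_mean": sum(r[0] for r in rows) / n,
            "tmin_mean": sum(r[1] for r in rows) / n,
            "precip_total": sum(r[2] for r in rows),
        }
    return weekly

weekly = to_weekly(daily)
```

Keying by (ISO year, ISO week) keeps the first days of January attached to the correct week when a week spans two calendar years; the disease data may use a different week convention, in which case the key would need to match it.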
3. Compilation of demographic data
Demographic data: ICPSR and others
Sources:
- ICPSR: Decennial census (state and county), City-county data books
- US Census Bureau: State populations by year (interpolations)
- State Health Departments: State variables by year (e.g. birth rates)
Interpolations of state population
Panels: Difference by year; Difference by state
Difference between linear interpolation and census interpolated data:
- Linear overestimates between 1940 and 1950
- Similar variance across states
Plot labels: Census higher; Linear higher
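
The comparison above can be illustrated with a small sketch that interpolates decennial counts linearly to one value per year and subtracts an intercensal estimate. The population figures and the `census_estimate` value are hypothetical and only show the sign convention (positive means the linear value is higher).

```python
def interpolate_yearly(census):
    """Linearly interpolate decennial counts to one value per year.

    Each year gets a single value, i.e. no change is assumed within a year.
    """
    years = sorted(census)
    yearly = {}
    for start, end in zip(years, years[1:]):
        p0, p1 = census[start], census[end]
        for year in range(start, end):
            yearly[year] = p0 + (p1 - p0) * (year - start) / (end - start)
    yearly[years[-1]] = float(census[years[-1]])
    return yearly

# Hypothetical decennial counts for one state.
census = {1940: 1_000_000, 1950: 1_200_000}
linear = interpolate_yearly(census)

# Hypothetical intercensal estimate for the same state.
census_estimate = {1945: 1_080_000}

# Positive difference: linear interpolation is higher than the estimate.
difference = {year: linear[year] - census_estimate[year] for year in census_estimate}
```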
Yearly birth and death rates for states
Axes: crude birth rate (/1000); crude death rate (/1000)
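
For reference, a crude rate per 1000 is the number of events in a year divided by the population, times 1000. A minimal sketch with made-up numbers (the function name `crude_rate` is only illustrative):

```python
def crude_rate(events, population, per=1000):
    """Events per `per` population, e.g. births or deaths per 1000 per year."""
    return events * per / population

# Hypothetical state-year: 25,000 births and 11,000 deaths among 1,000,000 people.
birth_rate = crude_rate(25_000, 1_000_000)
death_rate = crude_rate(11_000, 1_000_000)
```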
4. Climate data and seasonality of measles and polio
Climate data: sources
NCDC: climate indicators by day for individual weather stations
PRISM: data by month for weather stations
Climate and seasonality
Distribution of disease incidence rates /100,000 by calendar week for US states before vaccine introduction
Panels: Polio; Measles (x-axis: calendar week)
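
One way to summarize such a distribution is to convert weekly case counts to rates per 100,000 and take the median across years for each calendar week. The sketch below assumes a simple record layout and uses made-up counts; it is not taken from the Tycho code.

```python
from collections import defaultdict
from statistics import median

# Hypothetical weekly records: (year, calendar_week, cases, population).
records = [
    (1930, 1, 50, 500_000),
    (1931, 1, 100, 500_000),
    (1932, 1, 150, 500_000),
    (1930, 2, 20, 500_000),
    (1931, 2, 40, 500_000),
    (1932, 2, 30, 500_000),
]

def incidence_per_100k(cases, population):
    """Cases per 100,000 population."""
    return cases * 100_000 / population

def median_rate_by_week(rows):
    """Median incidence rate across years for each calendar week."""
    by_week = defaultdict(list)
    for _year, week, cases, population in rows:
        by_week[week].append(incidence_per_100k(cases, population))
    return {week: median(rates) for week, rates in sorted(by_week.items())}

weekly_median = median_rate_by_week(records)
```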
Eight cities along N-S gradient
Portland, ME
Boston, MA
New York, NY
Philadelphia, PA
Baltimore, MD
Richmond, VA
Raleigh, NC
Charleston, SC
Brunswick, GA
North-South Gradient of Temperature
Median of maximum temperatures per calendar day for 8 cities using daily data from 1900-2010
Axes: calendar day; maximum temperature. Legend: South, North
North-South Gradient of measles?
Median incidence rates by calendar week for US cities using weekly data between 1906-1948
Axes: calendar week; measles incidence rate. Marker: start of epidemic cycle
Association between epidemic start and climate
Measles incidence rates for Boston: 1906-1948
Starting points of epidemic cycles identified (length of bar is week number)
Axes: week; measles incidence rate; start week of new cycle
Measles incidence and relative humidity
Weekly median cases by city (red) and relative humidity anomaly (value - mean)
Axes: week; relative humidity (value - mean); measles incidence rate
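
The relative humidity anomaly is described only as "value - mean". A minimal sketch, assuming the mean is taken over the whole weekly series for a city (the slides do not say which baseline was used) and using made-up humidity values:

```python
from statistics import mean

def anomalies(values):
    """Subtract the series mean from every value (value - mean)."""
    baseline = mean(values)
    return [v - baseline for v in values]

# Hypothetical weekly relative humidity (%) for one city.
weekly_rh = [60.0, 70.0, 80.0, 70.0]
rh_anomaly = anomalies(weekly_rh)
```

Positive values mark weeks more humid than the baseline and negative values drier weeks, which is what makes the series comparable with weekly measles incidence across cities with different typical humidity.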
Next steps
1. Fully integrate disease, demographic and climate data
2. Continue example analyses:
a. Climate and measles seasonality
b. Climate and polio seasonality
c. Explore additional climate indices
d. Birth rates and measles multi-annual seasonality
3. Direct linking between Tycho, climate and demographic databases (establish collaborations)
4. Open access to enhance opportunities for discovery
Acknowledgements
Tycho database team: Don Burke, Wilbert van Panhuis, John Grefenstette, Shawn Brown, Ernesto Marques, Bruce Lee, Derek Cummings, Vladimir Zadorozhny, Steve Wisniewski, Su Yon Jung, Nian Shong Chok, Heather Eng, Anne Cross, David Galloway, Suzanne Cake, Raaka Kumbhakar
Dataverse team on climate and demography: Patrick Manning, Dan Bain, Xi Zhang, Yongxu Huang, Erin Jenkins