Marrying models and data: Adventures in Modeling, Data Wrangling and Software Design
Anne E. Thessen, Elizabeth North, Sean McGinnis and Ian Mitchell
LTRANS
• Lagrangian Transport Model
• Open Source
• http://northweb.hpl.umces.edu/LTRANS.htm
• Used to predict transport of particles, subsurface hydrocarbons, and surface oil slicks (in development)
GISR Deepwater Horizon Database
• Over 7 million georeferenced data points
• Over 9 GB
• Over 2000 analytes and parameters
Number of Data Points
Database Contents
• Oceanographic Data
– Salinity
– Temperature
– Oxygen
– More
• Chemistry Data
– Hydrocarbons
– Heavy metals
– Nutrients
– More
• Air
• Water
• Tissue
• Sediment/Soil
Heterogeneity
• Heterogeneity
– Terms
– Units
– Format
– Structure
Carboxybenzene
Benzoic Acid
E210
C7H6O2
Dracylic Acid
Benzoic Acid
2,016 1,848
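One way to handle this kind of term heterogeneity is a synonym lookup that maps every reported name to a single canonical analyte name. The following is a minimal sketch using only the names shown on this slide; the table `SYNONYMS` and the function `canonical_name` are hypothetical illustrations, not part of the GISR database code.

```python
# Hypothetical synonym table: each key is a name that may appear in a source
# data set; each value is the canonical name chosen for the database.
# Caution: a molecular formula such as C7H6O2 can match several compounds,
# so formula entries in a real table would need manual review.
SYNONYMS = {
    "carboxybenzene": "Benzoic Acid",
    "e210": "Benzoic Acid",
    "c7h6o2": "Benzoic Acid",
    "dracylic acid": "Benzoic Acid",
    "benzoic acid": "Benzoic Acid",
}


def canonical_name(reported):
    """Return the canonical analyte name, or None if the term is unknown."""
    # Collapse repeated whitespace and ignore case before the lookup.
    key = " ".join(reported.split()).lower()
    return SYNONYMS.get(key)


print(canonical_name("Dracylic  acid"))
print(canonical_name("n-Decane"))
```

Returning `None` for unknown terms, rather than guessing, leaves unmatched names visible so they can be reviewed by hand.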
n-Decane
ppb
ppbv
ng/g
μg/g
mg/kg
ppt
parts per trillion
μg/kg
103 66
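Unit heterogeneity like this can be reduced by converting every concentration to one target unit. The sketch below converts the mass-based units listed on this slide to μg/kg. It assumes that "ppb" and "ppt" mean parts per billion and parts per trillion by mass, which may not hold for every data set; `TO_UG_PER_KG` and `to_ug_per_kg` are hypothetical names. Volume-based units such as ppbv are rejected because they cannot be converted without further information.

```python
# Multipliers that convert a mass-fraction unit to micrograms per kilogram.
# Assumption: "ppb" and "ppt" are parts per billion / trillion BY MASS.
TO_UG_PER_KG = {
    "ug/kg": 1.0,
    "ng/g": 1.0,                  # 1 ng/g = 1 ug/kg
    "ppb": 1.0,                   # 1 part per billion by mass = 1 ug/kg
    "ug/g": 1000.0,               # 1 ug/g = 1 mg/kg = 1000 ug/kg
    "mg/kg": 1000.0,
    "ppt": 0.001,                 # 1 part per trillion by mass = 1 ng/kg
    "parts per trillion": 0.001,
}


def to_ug_per_kg(value, unit):
    """Convert a concentration to ug/kg, or raise ValueError if unsupported."""
    # Treat the micro sign (U+00B5) and Greek mu (U+03BC) as ASCII "u".
    key = unit.strip().lower().replace("\u00b5", "u").replace("\u03bc", "u")
    if key not in TO_UG_PER_KG:
        raise ValueError("cannot convert unit: " + repr(unit))
    return value * TO_UG_PER_KG[key]


print(to_ug_per_kg(2.0, "mg/kg"))
```

Raising an error for unrecognized units keeps ambiguous values out of the merged data until someone has checked them.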
Metadata
• Metadata
– Missing
– Not computable
0.23
Unit
Name
Location
Time
Attribution
Uncertainty
Method
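A value such as 0.23 is only usable when the metadata named on this slide travels with it. The check below is a minimal sketch of how missing metadata could be flagged. The field names in `REQUIRED_FIELDS`, the function `missing_metadata`, and the sample record are hypothetical; treating unit and name as two separate fields is also an assumption.

```python
# Hypothetical required fields, taken from the labels on this slide.
REQUIRED_FIELDS = (
    "unit", "name", "location", "time",
    "attribution", "uncertainty", "method",
)


def missing_metadata(record):
    """Return the required fields that are absent or empty in a record."""
    return [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]


# Illustrative record: a bare value with only a unit and a name attached.
record = {"value": 0.23, "unit": "ug/kg", "name": "n-Decane"}
print(missing_metadata(record))
```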
The Great Data Hunt
• Discovery
– Project directory
– Funding agency records
– Literature
– Internet search
Total data sets discovered: n = 140
We identified 90 relevant data sets
• Access
– Online
– Ask directly
– Literature
We received responses to 59% of our inquiries and obtained 34% of the identified data sets
41% of those responses were received within 24 hours and 29% were received within the first week
Figure: frequency of responses vs. days to response
0-20 email exchanges per data set
Figure: frequency vs. number of emails
• Citation
– Literature
– Existing requirements
– Generate new
Why didn’t people share?
• Paper not published yet – 35%
• Passed the buck – 20%
• Too busy – 10%
• Medical problems – 10%
• Poor quality – 10%
Why should anyone share?
• Mandated
• Increased citation and visibility
• Early access to GISR database
• New insights
Future Work
• Incorporate data as available
• Incorporate user feedback
• Web Access
• Users’ Guide
• Manuscripts
Thank You to Data Providers
• NOAA/NOS Office of Response and Restoration
• Commonwealth Scientific and Industrial Research Organization
• Environmental Protection Commission of Hillsborough County
• National Estuarine Research Reserves
• Sarah Allan
• Kim Anderson
• Jamie Pierson
• Nan Walker
• Ed Overton
• Richard Aronson
• Ryan Moody
• Charlotte Brunner
• William Patterson
• Kyeong Park
• Kendra Daly
• Liz Kujawinski
• Jana Goldman
• Jay Lunden
• Samuel Georgian
• Leslie Wade
• Joe Montoya
• Terry Hazen
• Mandy Joye
• Richard Camilli
• Chris Reddy
• John Kessler
• David Valentine
• Tom Soniat
• Matt Tarr
• Tom Bianchi
• Tom Miller
• Elise Gornish
• Terry Wade
• Steven Lohrenz
• Dick Snyder
• Paul Montagna
• Patrick Bieber
• Wei Wu
• Mitchell Roffer
• Dongjoo Joung
• Mark Williams
• Don Blake
• Jordan Pino
• John Valentine
• Jeffrey Baguely
• Gary Ervin
• Erik Cordes
• Michael Perdue
• Bill Stickle
• Andrew Zimmerman
• Andrew Whitehead
• Alice Ortmann
• Alan Shiller
• Laodong Guo
• A. Ravishankara
• Ken Aikin
• Tom Ryerson
• Prabhakar Clement
• Christine Ennis
• Eric Williams
• Ed Sherwood
• Julie Bosch
• Wade Jeffrey
• Chet Pilley
• Just Cebrian
• Ambrose Bordelon