Data Management Needs and Challenges for Telemetry Scientists

Josh M London, Wildlife Biologist, Polar Ecosystems Program, National Marine Mammal Laboratory, NOAA NMFS Alaska Fisheries Science Center


Temptation to identify biologists as the source for the raw data

The Tip of a Complex Iceberg

Field Work and Study Design: hypothesis; agency needs/mandates; funding initiatives; opportunistic vs. planned; tag design/vendor; tag programming; deployment of tags (location, age/sex, time)

Data Management: data quality control; synthesis; movement model

Derived products: publications; contract reports; Status/Listing Review

Narrowing Bottleneck

Many biologists lack the skills and training for effective, scalable database design and data management practices

Field Work & Tag Deployment

When? Where? Which Tag/Vendor? Which Age? Which Sex? (Do we have a choice?)

Tag Programming

Deployment Length (attachment type)

Limited Tools for Managing Raw Telemetry Data

‘raw’ data

via Argos as CSV/Text

Process w/ Vendor Software (behavior data)

Typically output as CSV

Field data about animal (e.g. ID, species, sex, age, health)

needs

Explore ‘raw’ data

Address hypotheses

Visualize movement/use

Synthesize w/ dependent (e.g. health, age) and independent data (e.g. other animals, remote sensed)
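As a rough illustration of the "synthesize" need above, the sketch below joins location records to the field data recorded for each animal. The column names (`tag_id`, `fix_time`, `age_class`, and so on) and the sample rows are hypothetical and do not reflect the actual Argos or vendor export format; only the Python standard library is assumed.

```python
import csv
import io

# Hypothetical location records; NOT the real Argos/vendor layout.
LOCATIONS_CSV = """tag_id,fix_time,lat,lon
T01,2010-05-01T00:00:00,60.10,-170.20
T01,2010-05-01T06:00:00,60.25,-170.05
T02,2010-05-01T00:00:00,58.90,-168.40
"""

# Hypothetical field data recorded when each tag was deployed.
ANIMALS_CSV = """tag_id,species,sex,age_class
T01,example species A,F,adult
T02,example species B,M,subadult
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_locations_with_animals(locations, animals):
    """Attach each animal's field data to every location from its tag."""
    by_tag = {row["tag_id"]: row for row in animals}
    joined = []
    for loc in locations:
        animal = by_tag.get(loc["tag_id"])
        if animal is None:
            continue  # location from a tag with no field record; skipped here
        merged = dict(loc)
        merged.update({k: v for k, v in animal.items() if k != "tag_id"})
        joined.append(merged)
    return joined


joined = join_locations_with_animals(
    read_rows(LOCATIONS_CSV), read_rows(ANIMALS_CSV)
)
for row in joined:
    print(row["tag_id"], row["fix_time"], row["sex"], row["age_class"])
```

In practice this kind of join is usually done in a database or with a data-frame library; the point here is only that locations and animal attributes live in separate sources and must be linked by a shared key.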

Biologists Not Trained in Large Scale Data Management

Biologists

Excel and/or Access

ESRI ArcMap (shapefiles)

Google Earth

Mouse Click Interaction

Programming (Visual Basic, R, Python): recipe driven … not developers

Data Manager

Postgres/PostGIS, Oracle, MySQL, SQL Server

Normalization and Efficient Design

Scripting, Jobs, Transactions

Data Integrity

Automation, Reproducible
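To make "normalization", "transactions", and "data integrity" concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for the server databases listed above. The three-table layout (`animal`, `deployment`, `location`) and every column name are assumptions chosen for illustration, not a schema taken from the presentation.

```python
import sqlite3

# Hypothetical normalized layout: animal attributes are stored once and
# referenced by key instead of being repeated on every location row.
SCHEMA = """
CREATE TABLE animal (
    animal_id INTEGER PRIMARY KEY,
    species   TEXT NOT NULL,
    sex       TEXT CHECK (sex IN ('F', 'M', 'U'))
);
CREATE TABLE deployment (
    deployment_id INTEGER PRIMARY KEY,
    animal_id     INTEGER NOT NULL REFERENCES animal(animal_id),
    tag_serial    TEXT NOT NULL UNIQUE,
    deployed_on   TEXT NOT NULL
);
CREATE TABLE location (
    deployment_id INTEGER NOT NULL REFERENCES deployment(deployment_id),
    fix_time      TEXT NOT NULL,
    lat           REAL NOT NULL CHECK (lat BETWEEN -90 AND 90),
    lon           REAL NOT NULL CHECK (lon BETWEEN -180 AND 180),
    PRIMARY KEY (deployment_id, fix_time)
);
"""


def build_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only if asked
    conn.executescript(SCHEMA)
    return conn


def insert_fix(conn, deployment_id, fix_time, lat, lon):
    """Insert one location in its own transaction; return False if rejected."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO location VALUES (?, ?, ?, ?)",
                (deployment_id, fix_time, lat, lon),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_db()
with conn:
    conn.execute("INSERT INTO animal VALUES (1, 'example species', 'F')")
    conn.execute("INSERT INTO deployment VALUES (1, 1, 'TAG-0001', '2010-05-01')")

ok = insert_fix(conn, 1, "2010-05-01T00:00:00", 60.1, -170.2)
orphan = insert_fix(conn, 99, "2010-05-01T00:00:00", 60.1, -170.2)  # no such deployment
bad_lat = insert_fix(conn, 1, "2010-05-01T06:00:00", 95.0, -170.2)  # impossible latitude
print(ok, orphan, bad_lat)
```

The database itself rejects the orphaned and out-of-range rows, which is the kind of guarantee a spreadsheet or a large flat table cannot give.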

My Perspective

To address complex questions related to marine mammal telemetry and understanding animal ecology, I had to become more of a data manager. And, in the process, I’ve become less of a biologist.

Start (2006)

Argos Monthly CDs

SatPack

Access Database

Excel Files (limited to 56k)

Large, Flat Tables

No Central Repository

Current System

Nightly FTP Argos Push

Nightly Data Processing

CSV/External Oracle Table

PL/SQL Procedures

Developed/Designed with Training via Google Search
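One property a nightly load needs is that it can be rerun safely: if a delivered file overlaps the previous night's data, rows already stored should not be duplicated. The sketch below shows that idea with Python's `sqlite3` module standing in for the Oracle external table and PL/SQL procedures named above. The `fixes` table, its columns, and the sample files are hypothetical, and this is not the author's actual pipeline.

```python
import csv
import io
import sqlite3


def open_store():
    """Create an in-memory table keyed so that repeated rows are detectable."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fixes ("
        " tag_id TEXT NOT NULL, fix_time TEXT NOT NULL,"
        " lat REAL NOT NULL, lon REAL NOT NULL,"
        " PRIMARY KEY (tag_id, fix_time))"
    )
    return conn


def load_nightly_file(conn, csv_text):
    """Load one delivered file; return how many rows were actually new."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        (r["tag_id"], r["fix_time"], float(r["lat"]), float(r["lon"]))
        for r in reader
    ]
    before = conn.execute("SELECT COUNT(*) FROM fixes").fetchone()[0]
    with conn:  # the whole file is loaded in one transaction
        # OR IGNORE skips rows whose (tag_id, fix_time) key already exists.
        conn.executemany("INSERT OR IGNORE INTO fixes VALUES (?, ?, ?, ?)", rows)
    after = conn.execute("SELECT COUNT(*) FROM fixes").fetchone()[0]
    return after - before


# Hypothetical deliveries; the second overlaps the first by one row.
NIGHT_1 = """tag_id,fix_time,lat,lon
T01,2010-05-01T00:00:00,60.10,-170.20
T01,2010-05-01T06:00:00,60.25,-170.05
"""
NIGHT_2 = """tag_id,fix_time,lat,lon
T01,2010-05-01T06:00:00,60.25,-170.05
T01,2010-05-01T12:00:00,60.40,-169.90
"""

conn = open_store()
first = load_nightly_file(conn, NIGHT_1)
second = load_nightly_file(conn, NIGHT_2)
print(first, second)
```

Keying on tag and timestamp is one simple choice; a production system would also need to decide how to handle corrected or reprocessed fixes for the same timestamp.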

My Perspective

Current Limitations

Data access requires a minimum level of technical skills (basic SQL, Oracle framework, Oracle APEX, R spatial tools, ArcMap)

Single Point of Access/Failure (me)

Limited Documentation of Design

Design May Not be Optimal/Appropriate

Main Objective to Provide Data to Analysts; not necessarily designed for providing data to the public

My Perspective

Greatest Needs – Research Program

Data Management and Design Consultation

Data Design & Documentation Portal (user-friendly metadata)

Low Tech Exploration Tools

Database and Application Developers (data flow and data input)

Training Opportunities

My Perspective

Greatest Needs – External to Program?

Provide Meaningful Public Access to Data

A Clear Data Sharing Policy w/ Best Practices

Encourage/Facilitate Scientific Collaboration

Meet Agency Needs and Requirements

How to Communicate Scientific Knowledge in the Modern/Digital Age: sharing knowledge/expertise is just as important as sharing data

Publish Data Once

My Perspective

Challenges / Road Blocks

Limited Funds and Priorities: appropriate resources for doing the priority analysis and science are not available, let alone the resources to distribute data responsibly

Database design/management often in the hands of the least skilled users

IT Policies, Investments, and Infrastructure Varied Across Institutions

No standard(s) for communicating and sharing ‘raw’ animal telemetry data. What is ‘raw’ data?