1
CDIAC Data Support for SPRUCE and NGEE
Les A. Hook and Ranjeet Devarakonda
Environmental Sciences Division
Oak Ridge National Laboratory
CDIAC User Working Group Meeting, September 27-28, 2010
ORNL research was sponsored by the U.S. Department of Energy and performed at Oak Ridge National Laboratory (ORNL). ORNL is managed by UT-Battelle, LLC, for the U.S. Department of Energy under contract DE-AC05-00OR22725.
2
Overview
Data Support for SPRUCE
• Data Management Planning
• Goals outlined in the Science Plan
• Requirements identified in the Data Policy
• Actions and resources needed to meet requirements are in the Data Management Plan
• Implementation
• SPRUCE web site
• Resources and products accessible on the web site
Data Support for NGEE
• Data Management Planning
• Expect planning to be similar to SPRUCE
• NGEE Web Site
Shared Development Effort for Acquisition and Processing of Sensor Data
3
Science Plan for the Climate Change Response Scientific Focus Area
3.11 Data and informatics
Goals for Response SFA data management are to ensure the fidelity and accessibility of the SFA data, minimize the amount of time research personnel need to spend on data management activities while achieving high quality data and metadata, and ensure that the data and metadata can be located and used by project personnel (initially) and the broader scientific community. The suite of activities that collectively comprise this component of the SFA will naturally evolve over the life of the SFA, and they will be done in collaboration with data management components of other Climate SFAs. Initial data management work will focus on defining the data collection and distribution requirements, identifying key leverage points across SFAs and other projects, ensuring that site characterization data is maintained, and resolving any critical informatics knowledge gaps identified in the requirements definition. As the experiments begin to collect high resolution data, the data management activities will shift to ensuring that the experimental data are properly archived and distributed according to the SFA’s data access policy. Data from the Response SFA will be a combination of observational data recorded by researchers and data collected by automated equipment. Further details can be found in Annex C.
The data management component will leverage the expertise and tools in the Environmental Data Science and Systems (EDSS) group, particularly the Carbon Dioxide Information Analysis Center (CDIAC) and the Atmospheric Radiation Measurement (ARM) program archive, to ensure that both observational and automated data are robustly archived in relational data models with necessary timestamp, spatial, temporal, and provenance metadata.
Goals for SPRUCE Data Management
• Ensure the fidelity and accessibility of SPRUCE data to the participants to facilitate all the pertinent science questions;
• Minimize the amount of time research personnel need to spend on data management activities while achieving high quality data and metadata; and
• Ensure that the data and metadata can be located and used by project personnel (initially) and, when appropriately quality-checked data are available, by the broader scientific community and public.
Approach to Data Management Planning
• Provide a structured framework to capture the project-defined requirements.
• Provide data management guidance and best practices.
• It is the responsibility of the ORNL SPRUCE research group, the Task Leaders in particular, and Forest Service staff to reach a consensus about what needs to be controlled, to provide processing details, and to establish who is responsible for implementation. Accountability is key.
Planning Considerations
• The plan supports field sampling, measurements, monitoring, and analyses.
• Data management information collected pre-experiment will inform the final experimental data management processes.
• SPRUCE tasks are subject to change or modification, and experimental technology will evolve. The data management plan will have to be flexible and updated as needed, with version control.
4
Version 1.2 2010/05/10
SPRUCE Data Policy: Archiving, Sharing, and Fair-Use
The open sharing of all SPRUCE experiment data among researchers, the broader scientific community, and the public is critical to advancing the mission of DOE’s Program of Terrestrial Ecosystem Science.
SPRUCE is implementing an experimental platform for the long-term testing of the mechanisms controlling the vulnerability of organisms, ecosystems, and ecosystem functions to increases in temperature and exposure to elevated CO2 treatments within the northern peatland high-carbon ecosystem. All data collected at the SPRUCE facility, all results of any analysis or synthesis of information, and all model algorithms and codes developed in support of SPRUCE will be submitted to the SPRUCE Data Archive in a timely manner such that data will be available for use by SPRUCE researchers and, following publication, the public.
This policy is applicable to all SPRUCE participants including the SPRUCE Research Group at the Oak Ridge National Laboratory (ORNL), the U.S. Forest Service, cooperating independent researchers, and to the users of SPRUCE data products (see the Data Fair-Use Statement).
SPRUCE data policies are consistent with the sponsoring U.S. DOE Program for Terrestrial Ecosystem Science Data Policy and with the Memorandum of Understanding between the U.S. Forest Service and UT-Battelle.
Data Management Requirements are identified in the Data Policy
5
Data Archiving and Discovery
• Archive at Carbon Dioxide Information Analysis Center (CDIAC)
• Two levels of data accessibility.
• First is for sharing recently collected, derived, and processed data products among SPRUCE participants.
• Second is for access to mature data products by the broader scientific community and public.
• Public access will be concurrent with open literature or web site publication of SPRUCE results.
• Discovery facilitated through the compilation of descriptive companion metadata records and their inclusion in searchable metadata databases and clearinghouses.
Data Policy, continued
6
Data Sharing
Timeliness of Data Availability
• Researchers will actively process, quality assure, and document environmental measurements, etc.
• Task Leaders will define a schedule for submitting data to the Archive for their given measurements.
Suggested guidelines for submitting data to the Archive for sharing among SPRUCE participants:
• Environmental measurements (automated instruments) -- 30 days after the completion of a month of measurements
• Annual surveys and seasonal measurement efforts -- 120 days from the completion of the survey
• Laboratory analyses of vegetation nutrient concentrations -- 60 days from completion of analyses
Suggested guidelines for submitting data to the Archive for public access:
• Environmental measurements (automated instruments) -- annual updates
• Annual surveys and seasonal measurement efforts -- with publication of papers
• Laboratory analyses of vegetation nutrient concentrations -- with publication of papers
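The project-sharing guidelines above amount to a fixed lag after a completion date. A minimal sketch of that arithmetic follows; the dictionary keys and the function name are hypothetical labels chosen for illustration, not SPRUCE identifiers, and a Task Leader's actual schedule may differ.

```python
from datetime import date, timedelta

# Suggested lags (in days) for sharing data among SPRUCE participants,
# taken from the guidelines above. The keys are hypothetical labels.
PROJECT_SHARING_LAG_DAYS = {
    "environmental_automated": 30,   # after completion of a month of measurements
    "survey_or_seasonal": 120,       # after completion of the survey
    "lab_nutrient_analysis": 60,     # after completion of analyses
}


def project_submission_due(data_type: str, completed_on: date) -> date:
    """Return the suggested latest date for submitting data to the Archive."""
    return completed_on + timedelta(days=PROJECT_SHARING_LAG_DAYS[data_type])


if __name__ == "__main__":
    # A month of automated measurements that ended on 2010-09-30.
    print(project_submission_due("environmental_automated", date(2010, 9, 30)))
```

The public-access guidelines are tied to publication or annual updates rather than to a fixed lag, so they are left out of this sketch.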
Quality Assurance of Data
• Task Leader will define the quality assurance checks to be performed prior to data sharing among SPRUCE participants (Quality Level 1) and prior to public access (Quality Level 2).
• Suggested guidelines for defining data Quality Levels: Level 1 and Level 2
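One way to picture how the two Quality Levels relate to the two levels of data accessibility is a simple gate on the quality level a data set has reached. The sketch below is an illustration only: the `UNCHECKED` level, the audience names, and the function are assumptions, and the policy's additional condition that public access is concurrent with publication is not modeled.

```python
from enum import IntEnum


class QualityLevel(IntEnum):
    UNCHECKED = 0  # hypothetical: Task Leader checks not yet applied
    LEVEL_1 = 1    # checks required before sharing among SPRUCE participants
    LEVEL_2 = 2    # checks required before public access


def allowed_audiences(level: QualityLevel) -> set:
    """Return the audiences a data set may be shared with at this level."""
    audiences = set()
    if level >= QualityLevel.LEVEL_1:
        audiences.add("project")
    if level >= QualityLevel.LEVEL_2:
        audiences.add("public")
    return audiences


if __name__ == "__main__":
    for level in QualityLevel:
        print(level.name, sorted(allowed_audiences(level)))
```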
Data Policy, continued
7
Data Fair-Use Statement
The SPRUCE data provided on the public archive are freely available and were furnished by the SPRUCE Research Group at ORNL, U.S. Forest Service, and cooperating independent researchers who encourage their use.
Please inform SPRUCE scientist(s) of your use of the archived data and of any publications.
Check the Archive frequently to ensure that you are using the latest version of the data.
Please acknowledge (1) data products as a citation as provided in the data archive documentation, (2) web site information downloads as a bibliographic web citation, or (3) general SPRUCE information as an acknowledgment or personal communication if no other citation form is applicable.
When publishing original analyses and results using these data, please acknowledge the agency or organization that supported the collection of the original data.
Please include these terms as publication keywords as applicable: SPRUCE Experiment, ORNL, U.S. DOE Office of Science, Marcell Experimental Forest, Northern Research Station, U.S. Forest Service.
Please provide an electronic reprint of your independent work to the SPRUCE Project so that all publications can be tracked by CDIAC.
Disclaimer of Liability
Data Policy, continued
8
Data and Metadata Reporting
• Reporting Sampling and Measurement Dates and Times
• Identifying Descriptive Field Variables, Biological Measurements, Chemical and Physical Variables
• Reporting Units for Chemical, Physical, and Descriptive Variables
• Reporting Values below Detection Limits
• Reporting Missing Data
• Reporting Uncertainty Estimates
• Reporting Conventions for Meteorological Data, and Temperature and Pressure Conditions
• Assigning Project-Specific Data Quality Flags
Organization
• Data Policy
• Data Flow
• Project Name Information
• Identifying Measurement and Sampling Sites
Data Processing
• Data Entry, Transfer, and Transformation
• Managing Hardcopy Format Project Records
• Managing Electronic Format Project Records
• Names and Reporting Formats for Data Files
• Scripted Programs for Processing and Analysis
• Quality Level of Data
Data Documentation and Archiving
• Planning to Archive Data for Public Release
• Creating Archive Documentation
• Providing Metadata to Searchable Indexes and Clearinghouses
• Assigning Descriptive Data Set Titles
Data Systems Management
• Day-to-Day Operation of Data Management Systems
• Data Management System and Software Configuration Control Guidelines
Actions and resources needed to meet requirements are in the Data Management Plan
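Several of the reporting items in the plan (missing data, values below detection limits, project-specific quality flags) lend themselves to a small scripted check. The following is a sketch under stated assumptions: the record fields and the flag strings are hypothetical and are not the conventions defined in the SPRUCE Data Management Plan.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Measurement:
    variable: str
    value: Optional[float]                   # None means the value is missing
    units: str
    detection_limit: Optional[float] = None  # None if no limit applies


def quality_flag(m: Measurement) -> str:
    """Assign a hypothetical quality flag to a single measurement."""
    if m.value is None:
        return "MISSING"
    if m.detection_limit is not None and m.value < m.detection_limit:
        return "BELOW_DETECTION_LIMIT"
    return "OK"


if __name__ == "__main__":
    sample = Measurement("nitrate", 0.02, "mg/L", detection_limit=0.05)
    print(sample.variable, quality_flag(sample))
```

Keeping the flag separate from the value leaves the reported number untouched, so a later reader can still see what was measured.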
9
SPRUCE Data Flow
Compiled by Les Hook, 2010/05/10
[Diagram: data move from Sources, through Processing/QA, to the Destination archive, and are accessed through the SPRUCE web site.]
Sources
• Task: Environmental Measurements: Automated Instruments
• Task R2: Plant growth phenology and NPP: Periodic Observations
• Task R3: Community composition: Periodic Observations
• Task R4: Plant Physiology: Periodic Observations
• Task R5: Biogeochemical cycling responses: Periodic Observations
• Task R6: Modeling of terrestrial ecosystem responses to temperature and CO2: Inputs and Outputs ?
• Existing/Historical Data: MEF, NADP, Remote Sensing; Ground penetrating radar assessments; Additional links to existing data ?
• Supplemental Information: Photos, Videos, Additional ?
Processing/QA Frequency (labels shown on the flows)
• 30-60 days after collection
• 120 days after survey, 60 days after sample analyses
• Selected data uploaded; periodic updates with new data and products
• Timing ?
Destination
• SPRUCE Data Archive (CDIAC)
• Project Data Sharing
• Public Data Sharing, with publication or per schedule
Access
SPRUCE Web Site: Project and Public Access to Data and Resources
• Project Data Access: 100% open for Project Team; permission needed by others
• Project Resources: Common reference sources; Metadata Content Editor
• Public Data Archive: 100% open to Public; Data and Metadata Search; Relational Database (e.g., FACE) ?
10
http://mnspruce.ornl.gov
11
http://ngee.ornl.gov
12
Shared Development Effort for Acquisition and Processing of Sensor Data
SPRUCE Sensors and data loggers
Acquisition and evaluation software
Independent processing steps
Next for SPRUCE and NGEE
• Number of sensors 25X
• Need advanced automated processing, displays, and alarms
• Web accessible
• Other needs?
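The need for automated processing and alarms can be illustrated with a basic range check on logged readings. This is a generic sketch, not the Campbell Scientific software or any SPRUCE code; the function name, the alarm labels, and the thresholds in the example are assumptions.

```python
def find_alarms(readings, low, high):
    """Return (index, reason) pairs for missing or out-of-range readings.

    `readings` is a sequence of floats, with None marking a missing value.
    `low` and `high` are hypothetical acceptable bounds for the sensor.
    """
    alarms = []
    for index, value in enumerate(readings):
        if value is None:
            alarms.append((index, "missing"))
        elif value < low or value > high:
            alarms.append((index, "out_of_range"))
    return alarms


if __name__ == "__main__":
    # Example: temperatures with one gap and one implausible spike.
    print(find_alarms([4.8, None, 42.0, 5.0], low=0.0, high=30.0))
```

With many more sensors, a check like this would run per sensor on each logger upload, with the results feeding whatever display or notification system is chosen.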
13
Shared Development Effort for Acquisition and Processing of Sensor Data
Next Steps:
• Purchasing Campbell Scientific (CS) software with more capabilities.
• Meeting with CS Technical Representative for planning guidance.
• Making connections with ORNL CS power users.
• Learning from SPRUCE and NGEE prototypes.
• Starting to look beyond acquisition and processing to analysis.
14
Additional Data Flow Diagrams
• Overview of Task Inputs and Resources
• S1 Bog Vegetation Survey Task
15
Overview of Task Inputs and Resources
Compiled by Les Hook, 2010/05/10
[Diagram: task-specific inputs and shared project resources feed the SPRUCE Data Archive.]
Task-Specific Inputs
• Task EM
• Task R2
• Task R3
• Task R4
• Task R5
• Task R6
• Existing/Historical Data
• Supplemental Information
Task Information
• Task Description
• ID Measurements
• Field Sampling & Measurement Description
• Laboratory Analysis Description
• Data Processing
• Archive Schedule
• QA Level Defined
• Task Metadata
• Task Data
Resources
SPRUCE Web Site: Project Access to Data and Resources
• Data Policy
• Data Flow
Project Resources
Common references:
• SPRUCE Task Description template
• SPRUCE Variable Name template
• SPRUCE Project Names template
• Site Information template
• Data Collection Guides
Project Data Archive
• 100% open for Project Team
• Permission needed by others
Destination
• SPRUCE Data Archive (CDIAC)
• Project Data Sharing
16
S1 Bog Vegetation Survey Task >>> Data Management Planning
Compiled by Les Hook, 2010/04/30, updated 2010/09/20
[Diagram: Data Management Plan sections applied to the S1 Bog Vegetation Survey Task, with pointers to the Data Collection Guides (DCG).]
Task-Specific Inputs
• Forest Service: Survey Plot Coordinates
Data Management Plan sections
Organization
• Data Policy
• Data Flow
• Project Name Information
• Identifying Measurement and Sampling Sites
Data and Metadata Reporting
• Reporting Sampling and Measurement Dates and Times
• Identifying Descriptive Field Variables, Biological Measurements, Chemical and Physical Variables
• Reporting Units for Chemical, Physical, and Descriptive Variables
• Reporting Values below Detection Limits
• Reporting Missing Data
• Reporting Uncertainty Estimates
• Reporting Conventions for Meteorological Data, and Temperature and Pressure Conditions
• Assigning Project-Specific Data Quality Flags
Data Processing
• Data Entry, Transfer, and Transformation
• Managing Hardcopy Format Project Records
• Managing Electronic Format Project Records
• Names and Reporting Formats for Data Files
• Scripted Programs for Processing and Analysis
• Quality Level of Data
Data Documentation and Archiving
• Planning to Archive Data for Public Release
• Creating Archive Documentation
• Providing Metadata to Searchable Indexes and Clearinghouses
• Assigning Descriptive Data Set Titles
Data Systems Management
• Day-to-Day Operation of Data Management Systems
• Data Management System and Software Configuration Control Guidelines
Task Metadata
• Task Description
• Field Sampling & Measurement Description
• Laboratory Analysis Description
• QA Level Defined
• Archive Schedule
Annotations
• See DCG – Site Information
• See DCG – Hardcopy Forms
• See DCG – Task Plan
• Project Master List of Site Information
• Data and Metadata Compilation
Destination
SPRUCE Web Site: Project and Public Access to Data and Resources
• SPRUCE Data Archive (CDIAC)
• Project Data Sharing