Providing Researchers High Performance Web Service Access to Big Downscaled Climate Data
Internet2 Technical Workshop
July 2014
University of Idaho GIS/CI Day 2013
Erich Seamon, M.S., PMP, GISP
Environmental Data Manager
Regional Approaches to Climate Change for Pacific Northwest Agriculture (REACCHPNA)
College of Agricultural and Life Sciences
University of Idaho
[email protected]
Palouse region, Northern Idaho
Luke Sheneman, Ph.D.
Technical Services Manager
Northwest Knowledge Network
University of Idaho
[email protected]
www.northwestknowledge.net
northwestknowledge.net reacchpna.org
Boulder Workshop Summer 2014
Paul Gessler, Ph.D.
Professor
Department of Forest, Rangeland, and Fire Sciences
College of Natural Resources, University of Idaho
[email protected]
Presentation Overview
• Overview of data science efforts under the University of Idaho’s Northwest Knowledge Network (NKN) research consortium
• Data-centric research efforts in agriculture within NKN
• NKN scientific research collaboration efforts using multi-dimensional datasets
• Methods for heterogeneous systems/tool integration
• Lessons learned
Goals
• Climatic data storage and access
• Climatic science integrative research
• Climate science value to the public and regional stakeholders
Analysis Tools
• Extensible Data Cataloging
• Interactive Python
• THREDDS
• ArcGIS Server
• PostgreSQL
Climate data and the Pacific Northwest
REACCH Cyberinfrastructure and Data Management Overview, 2013
NKN Services Model
February 14th, 2013 REACCH Cyberinfrastructure and Data Management Overview, 2013
• NKN functions as a unit under the Office of VP for Research
• Functions as an ‘Online Data Observatory’, providing customer projects with data cataloging and archiving
• Working to extend capabilities to serve all data/metadata via web services
• NKN functions under a service center model
• Currently refining a cost model to distribute services/technology efforts to projects on an equitable basis
• Development of a shared technology research environment is a focused priority for big data research
NKN Systems and Networking Efforts
• On-going systems and network upgrades are an essential component of NKN’s technology strategy
• NSF award #1341040 helped the University of Idaho and NKN extend our research networking. Upgrades included:
1. The UI campus core network;
2. The Northwest Knowledge Network data repository; and
3. The DoE Idaho National Laboratory (INL), for replication/mirroring of NKN data with proximate access to significant High Performance Computing (HPC) and visualization resources for researchers.
This complements previous institutional and NSF-funded improvements and enables true 10 Gigabit per second (Gbps) end-to-end data transfers to support all researchers at the UI.
• Scope: Core Network in Moscow, ID and INL HPC in Idaho Falls, ID
• Status: 90% complete as of March 29, 2014
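As a rough illustration of what a 10 Gbps end-to-end path means for large climate datasets, the sketch below estimates ideal transfer times. The 20 TB figure is the dataset size quoted for the western US elsewhere in these slides; the function name and the `efficiency` parameter are assumptions for illustration, and real throughput depends on protocol overhead, disk speed, and path conditions.

```python
def transfer_seconds(size_bytes, link_bits_per_second, efficiency=1.0):
    """Ideal time to move size_bytes over a link, scaled by an assumed efficiency."""
    if link_bits_per_second <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed must be positive and efficiency in (0, 1]")
    return size_bytes * 8 / (link_bits_per_second * efficiency)


TB = 10**12   # decimal terabyte, in bytes
GBPS = 10**9  # gigabit per second, in bits per second

if __name__ == "__main__":
    for label, speed in (("1 Gbps", 1 * GBPS), ("10 Gbps", 10 * GBPS)):
        hours = transfer_seconds(20 * TB, speed) / 3600
        print(f"20 TB over {label}: about {hours:.1f} hours at line rate")
```

At line rate the 10 Gbps path moves 20 TB in roughly four and a half hours, against nearly two days at 1 Gbps, which is why bottleneck testing along the path matters.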
• Includes perfSONAR servers
• Excludes stateful firewalls
– Science DMZ specifies flexible router ACLs for high performance network security
• Collaborate with other institutions
– Included Idaho National Lab HPC, which provides hosting for:
  • University of Idaho
  • Boise State University
  • Idaho State University
Focused Project Overview: REACCH
The Regional Approaches to Climate Change (REACCH) project is a five-year, $20M coordinated regional agricultural project funded by the National Institute of Food and Agriculture. It aims to improve the long-term profitability of cereal production systems in the Pacific Northwest under ongoing and projected climate change, while contributing to climate change mitigation by reducing greenhouse gas emissions.
REACCH includes efforts in research, extension, and education that integrate diverse elements, including climate modeling, cropping systems modeling, economics, agronomy, crop protection, and others, in a transdisciplinary manner.
www.reacchpna.org
• Inland Pacific Northwest (IPNW) is a critical agricultural region
• Diverse research efforts abound – UI, WSU, OSU, UW, USDA/ARS, NSF, NOAA
• Clear connection between climate change and agriculture processes
Integrating climate modeling with other research efforts is paramount for integrative research
• NetCDF formats
• Gridded model dataset outputs
• Gridded meteorological datasets
• Over 20 TB for the western US
• http://nimbus.cos.uidaho.edu/MACA
• John Abatzoglou/University of Idaho
The REACCHPNA project is divided into ten functional objective teams (listed to the left), with lead investigators for each area, examining:
• the relationship between climate change and cereal crops, primarily winter wheat
• how climate change might affect cereal crops
• how production practices might contribute to or help mitigate climate change
• what farming methods might help these crops withstand climate change
• factors that influence decisions about crop management
REACCH/NKN Systems Model
[Diagram: system components: Geoportal Server; ArcGIS Server (geospatial); THREDDS (aggregation); IPython, PHP, REST, JavaScript (programmatic); PostgreSQL and MySQL (database)]
• THREDDS: aggregation and interrogation of NetCDF datasets
• Geoportal Server: metadata cataloging, modified to allow data uploading
• IPython: Interactive Python (Python in a web browser); can be used to compile and document research processes
• ArcGIS Server: web server technology used for geospatial mapping processes
• PostgreSQL: open source enterprise database
REACCH Data Library
• Based on ESRI’s Geoportal Server software
• Linux/Tomcat/Java
• Library can be accessed at data.reacchpna.org
REACCH THREDDS Server
• Thematic Real-time Environmental Distributed Data Services (THREDDS)
• Developed by UCAR/Unidata
• Aggregates and subsets multi-dimensional datasets (NetCDF)
thredds.reacchpna.org
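Subsetting through THREDDS can be sketched with an OPeNDAP (DAP2) constraint expression, which requests only an index range of a variable rather than the whole file. The example below only builds the request URL and does not contact a server. The dataset path and variable name are hypothetical, and the `/thredds/dodsC/` prefix is the conventional THREDDS OPeNDAP mount point rather than something confirmed in these slides.

```python
def hyperslab(start, stop, stride=1):
    """Format one DAP2 index range as [start:stride:stop] (stop is inclusive)."""
    if start < 0 or stop < start or stride < 1:
        raise ValueError("need 0 <= start <= stop and stride >= 1")
    return f"[{start}:{stride}:{stop}]"


def opendap_subset_url(base_url, variable, ranges, response="ascii"):
    """Build a DAP2 URL requesting one variable over the given index ranges.

    ranges is a sequence of (start, stop) or (start, stop, stride) tuples,
    one per dimension, in the variable's dimension order.
    """
    constraint = variable + "".join(hyperslab(*r) for r in ranges)
    return f"{base_url}.{response}?{constraint}"


# Hypothetical dataset path and variable name, for illustration only.
BASE = "http://thredds.reacchpna.org/thredds/dodsC/example/tasmax_daily.nc"

if __name__ == "__main__":
    # First 31 time steps and a 50 x 50 block of grid cells.
    print(opendap_subset_url(BASE, "tasmax", [(0, 30), (100, 149), (200, 249)]))
```

In practice a client library that understands OPeNDAP can usually open the dataset URL directly and slice it lazily; the hand-built URL is shown only to make the subsetting request visible.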
REACCH Data Analysis Library
• Use of geoprocessing services for analytics
• Climate time series
• Subsetting and aggregation
• Integrative data queries (e.g., biotic and climatic data)
• More applied tools:
– Growing Degree Day analysis
– Crop buffering
analysis.reacchpna.org
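Growing Degree Day analysis is commonly computed by accumulating the amount by which each day's mean temperature exceeds a crop-specific base temperature. The sketch below uses that simple averaging method on plain Python lists. The 0 °C base temperature and the sample values are assumptions for illustration, not the settings of the REACCH tool.

```python
def daily_gdd(tmin_c, tmax_c, base_c=0.0):
    """Growing degree days for one day: mean temperature above base, floored at zero."""
    return max(0.0, (tmin_c + tmax_c) / 2.0 - base_c)


def accumulated_gdd(tmin_series, tmax_series, base_c=0.0):
    """Running total of daily GDD over paired daily min/max temperature series."""
    if len(tmin_series) != len(tmax_series):
        raise ValueError("tmin and tmax series must have the same length")
    total, running = 0.0, []
    for tmin, tmax in zip(tmin_series, tmax_series):
        total += daily_gdd(tmin, tmax, base_c)
        running.append(total)
    return running


if __name__ == "__main__":
    tmin = [-2.0, 1.0, 4.0, 6.0]   # made-up daily minimums (deg C)
    tmax = [1.0, 9.0, 14.0, 18.0]  # made-up daily maximums (deg C)
    print(accumulated_gdd(tmin, tmax))
```

Applied per grid cell to a gridded daily temperature dataset, the same accumulation yields a GDD surface for a chosen date range.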
Data Library Integrative Analysis Tool Examples
Interactive Python Server
• Useful for collaboration and informal scientific analysis
• Allows for arcpy integration
• IPython Notebook server available
ipython.reacchpna.org
Summary Overview
• Development of a data architecture that emphasizes the use of web services (OPeNDAP, REST) has given PNW researchers access to more robust datasets for varied and integrative analysis
• Computing capabilities have been enhanced by placing data closer to HPC, and by using perfSONAR to test for network bottlenecks
• Subsetting and aggregation methods have been very valuable
• Python geoservices are a nice way to encapsulate and deploy geographic and temporal data transformations
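A temporal transformation of the kind mentioned above can be as simple as aggregating a daily series to monthly means before returning it from a web service. The following is a minimal standard-library sketch with made-up values; the function name is hypothetical and does not reflect the deployed geoservices.

```python
from collections import defaultdict
from datetime import date


def monthly_means(daily_values):
    """Aggregate (date, value) pairs to {(year, month): mean value}."""
    buckets = defaultdict(list)
    for day, value in daily_values:
        buckets[(day.year, day.month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}


if __name__ == "__main__":
    series = [
        (date(2014, 1, 30), 2.0),
        (date(2014, 1, 31), 4.0),
        (date(2014, 2, 1), 6.0),
    ]
    print(monthly_means(series))
```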
References
• CC:NIE – Support Big Science Data at U. of Idaho
– http://www.nsf.gov/awardsearch/showAward?AWD_ID=1341040
• Northwest Knowledge Network (NKN)
– https://www.northwestknowledge.net/
• perfSONAR
– http://www.perfsonar.net/
• Science DMZ Security
– http://fasterdata.es.net/science-dmz
REACCH Information Access
www.reacchpna.org
data.reacchpna.org
analysis.reacchpna.org
policy.reacchpna.org
research.reacchpna.org
education.reacchpna.org
extension.reacchpna.org
press.reacchpna.org
dictionary.reacchpna.org
help.reacchpna.org
[email protected]@[email protected]
Questions?
Presentation and contact info available @:
[email protected]@uidaho.edu