Finding, browsing, and getting data easily using SPDF web services

Page 1: Finding, browsing, and getting data easily using SPDF web services

Finding, browsing, and getting data easily using SPDF web services

Space Physics Data Facility <http://spdf.gsfc.nasa.gov>

NASA Goddard Space Flight Center, Greenbelt MD 20771

Page 2: Finding, browsing, and getting data easily using SPDF web services

CDAWeb data browser

• Handles vast heterogeneity of instruments and parameters (>1000 datasets, 2M files, 10 TB, 8500 parameters)

• Yet a simple browsing interface meets 80% of researchers’ needs without many bells & whistles

• Based on
– standard file format (CDF)
– standardized metadata
– metadata-driven IDL software
– provider metadata overridden by Master CDFs

Page 3: Finding, browsing, and getting data easily using SPDF web services

CDAWeb usage statistics

Page 4: Finding, browsing, and getting data easily using SPDF web services

CDAWeb process

• Top static web page calls Perl routines to generate selection forms, with instrument types, spacecraft, instruments, datasets and available time ranges filled from text lists (compiled each night from data CDFs, no formal database)

• Form POST calls a Perl routine to generate IDL calls to determine data files to fill user’s request, followed by a call to read_myCDF and then either plotmaster, list_mystruct or write_mycdf, depending on the user’s request.

• CDAWlib IDL routines read data CDFs into IDL structures, plot or list data, and return URLs to temporary output files (ASCII listings, movie or static GIF files, or created sub- or super-setted CDF files)

• CDAS SOAP and REST API calls can also query the text lists and generate the IDL calls

Page 5: Finding, browsing, and getting data easily using SPDF web services

Metadata (ISTP and Cluster)

• Global attributes provide mission, instrument type, dataset info (project, source_name, data_type, descriptor, etc.)

• Variable attributes (Catdesc, Fieldnam, var_notes)

• Depend_0, _1, _2 and Delta_plus/minus_var point to related variables

• Display_type, Var_type (data, support data)

• Fill_val, Validmin/max for ranges

• LablAxis/Label_ptr, Units/Unit_ptr for plots

• Format for listing

• Virtual variables to create variables on-the-fly
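To make the role of range attributes concrete, here is a minimal Python sketch (not CDAWlib code, which is written in IDL) of how Fill_val and Validmin/max could drive generic cleaning of a variable. The attribute dictionary, its upper-case key spellings, and the sample values are assumptions made up for illustration.

```python
import math

def mask_invalid(values, attrs):
    """Replace fill values and out-of-range samples with NaN.

    `attrs` is a plain dict standing in for a CDF variable's attributes
    (hypothetical structure; key spellings are assumed).
    """
    fill = attrs.get("FILLVAL")
    vmin = attrs.get("VALIDMIN", -math.inf)
    vmax = attrs.get("VALIDMAX", math.inf)
    cleaned = []
    for v in values:
        is_fill = fill is not None and v == fill
        if is_fill or not (vmin <= v <= vmax):
            cleaned.append(math.nan)
        else:
            cleaned.append(v)
    return cleaned

# Made-up attributes and samples for a field-magnitude variable.
b_attrs = {"FILLVAL": -1.0e31, "VALIDMIN": 0.0, "VALIDMAX": 500.0, "UNITS": "nT"}
b_mag = [5.2, -1.0e31, 4.9, 1200.0]
cleaned = mask_invalid(b_mag, b_attrs)
```

Because the limits come from the metadata rather than from the code, the same function would work unchanged for any variable that carries these attributes.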

Page 6: Finding, browsing, and getting data easily using SPDF web services

CDAWlib IDL library <http://spdf.gsfc.nasa.gov/CDAWlib.html>

• Read_MyCDF reads CDF variables into an IDL structure, along with support variables and metadata

• PlotMaster determines best display for the variables and creates plots in GIF and PS/PDF

• Assorted plot routines: time_series, spectrogram, ionogram, radar_vector, image, orbit, stack_plot, map, plasmagram, movie, map_movie, time_text, etc.

Page 7: Finding, browsing, and getting data easily using SPDF web services

Other useful CDAWlib routines

• List_MyStruct creates text listings using all available/defined global and variable level metadata, e.g. label_axis, label_ptr1, format, depends, units, etc.

• CDFx <http://cdaweb.gsfc.nasa.gov/cdaweb/cdfx/> is an IDL GUI to read, list and plot data from CDFs located on the user’s machine; it allows some customization of plotting options, e.g. color palettes, scale ranges, etc.

• Auroral_Image maps 2-D data using map projections and various fill and expansion techniques (generally auroral images onto the Earth)

• Spectrogram plots a color spectrogram of a 2-dim variable in contiguous or non-contiguous blocks

Page 8: Finding, browsing, and getting data easily using SPDF web services

Making CDFs

• Create skeleton text file (hopefully with standard metadata) by hand, or more easily with SKTeditor <http://SSCweb.gsfc.nasa.gov/skteditor/>

• Convert skeleton file to empty CDF with SkeletonCDF (or automatically in SKTeditor)

• Add data with IDL CDAWlib routines or use MakeCDF C-language tool to put ASCII and binary data into a CDF file <http://spdf.gsfc.nasa.gov/makecdf.html>

• ISTP/IACG Guidelines for recommended naming of datasets and filenames, global and variable attributes <http://spdf.gsfc.nasa.gov/sp_use_of_cdf.html>

Page 9: Finding, browsing, and getting data easily using SPDF web services

Making CDFs with CDAWlib

• Create empty CDF with SKTeditor or manually

• Read CDF structure into an IDL structure with read_master_cdf.pro

• Fill IDL structure data fields with data

• Create new CDF file with the contents of the filled structure with write_data_to_cdf.pro (both in IDLmakecdf.pro)

Page 10: Finding, browsing, and getting data easily using SPDF web services

CDAWeb-resident data direct to IDL (Beta)

• <http://cdaweb.gsfc.nasa.gov/WebServices/REST/CdasIdlLibrary.html>

• d = spdfgetdata('AC_K2_MFI', ['Magnitude' , 'BGSEc'], ['2009-06-01T00:00:00.000Z', '2009-06-03T00:00:00.000Z'])

• IDL GUI interface to CDAWeb-held data: spdfcdawebchooser

Page 11: Finding, browsing, and getting data easily using SPDF web services
Page 12: Finding, browsing, and getting data easily using SPDF web services

CDAWeb Lessons

• Standard data file format and metadata enable powerful and extendible services

• Code is metadata-driven (from the data files themselves), rather than coding lots of special cases (with attendant high maintenance)

• Variable type and dimensions are often enough to determine a desirable plot format, but can be overridden by metadata

• Metadata from Master (no data) CDFs overrides metadata (or the lack of it) in data CDFs
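The last two lessons can be sketched in a few lines of Python. This is only an illustration of the idea, not the CDAWlib logic: the attribute dictionaries, the upper-case key spellings, and the default mapping from dimensionality to plot type are all assumptions.

```python
def effective_attrs(data_attrs, master_attrs):
    """Merge attributes so that Master CDF values win over data CDF values."""
    merged = dict(data_attrs)
    merged.update(master_attrs)
    return merged

def choose_display(record_dims, attrs):
    """Pick a plot type from the per-record dimensionality unless the
    metadata names one explicitly (hypothetical default rules)."""
    override = attrs.get("DISPLAY_TYPE")
    if override:
        return override
    defaults = {0: "time_series", 1: "spectrogram", 2: "image"}
    return defaults.get(record_dims, "time_series")

# Made-up attributes: the data CDF lacks a display type, the Master supplies one.
data_attrs = {"UNITS": "nT"}
master_attrs = {"DISPLAY_TYPE": "stack_plot"}

plot_default = choose_display(1, effective_attrs(data_attrs, {}))
plot_master = choose_display(1, effective_attrs(data_attrs, master_attrs))
```

Here `plot_default` falls back to the dimensionality rule, while `plot_master` follows the Master CDF, with no dataset-specific branch in the code.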

Page 13: Finding, browsing, and getting data easily using SPDF web services

SPDF Web Services

• Satellite Situation Center (SSC)
– SOAP RPC-encoded since 2002
– SOAP document-literal since 2007

• Coordinated Data Analysis System (CDAS)
– SOAP RPC-encoded since 2003
– REST since 2010

Page 14: Finding, browsing, and getting data easily using SPDF web services

Web Service Styles

• Remote Procedure Call (RPC)
– Tightly coupled
– Early SOAP, CORBA, DCOM, RMI

• Service-Oriented Architecture (SOA)
– Loosely coupled
– Later (message/document) SOAP

• Resource-Oriented Architecture (ROA)
– Loosely coupled
– Representational State Transfer (REST)

Page 15: Finding, browsing, and getting data easily using SPDF web services

SOAP Web Services

• Popular before REST

• Supported by most major software vendors

• Criticized for being complex (but simple to use with advanced tools/frameworks)

• Focuses on “message-oriented” services

• Supported in most popular programming environments (Java, .NET, PHP, Python, Perl, etc.)

Page 16: Finding, browsing, and getting data easily using SPDF web services

REST Web Services

• Focus on interacting with stateless resources using well-known HTTP standard operations (GET, POST, PUT, DELETE, etc.)

• Simpler protocol requires simpler libraries/frameworks
– Available in more programming environments

• Details less defined than SOAP (e.g., messages can be plain text instead of complex XML conforming to SOAP specifications)
– Different services can be very different
– “RESTful Web Services”, Leonard Richardson & Sam Ruby, 2007, O'Reilly Media, Inc.

Page 17: Finding, browsing, and getting data easily using SPDF web services

REST vs SOAP

• CDAWeb REST sits on top of CDAS SOAP

• Trade-offs: REST URLs are ugly, but easy to understand and to use in a browser

• SOAP is easy to call from Java with SOAP libraries

• OPeNDAP, FTP, HTTP, TSDS

Page 18: Finding, browsing, and getting data easily using SPDF web services

More SOAP vs REST

• IDL interface much easier with REST rather than convincing scientists to install the Java bridge and our library along with the IDL code

• Finally and most important, we can create a system where spacecraft, instruments, datasets, and time periods are treated as objects and can be referenced in the event list server and in papers/reports

• SOAP is opaque and more difficult to explain and to use

• Can describe a REST service in WSDL, particularly version 2, and also in WADL <http://research.sun.com/spotlight/2006/2006-04-24-TR153.html>, although WSDL alone doesn’t always work for SOAP either.

Page 19: Finding, browsing, and getting data easily using SPDF web services

CDAWeb REST Complications

• Metadata is complicated, requiring REST interface to return XML/JSON

• Results are complicated, requiring REST interface to return XML/JSON

• Resources don’t conform to a simple hierarchical structure, requiring REST to support a POST (with XML request) for some resources in addition to the simpler GET method
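As an illustration of why a POST body is needed, the following Python sketch assembles an XML request that pairs each dataset with its own variables, something a single path cannot express. The element names are hypothetical and are not taken from the CDAS schema; the second dataset and its variable are placeholders.

```python
import xml.etree.ElementTree as ET

def build_request(start, stop, datasets):
    """Build an XML request body covering several datasets.

    `datasets` maps a dataset id to its list of variable names.
    All element names here are hypothetical, for illustration only.
    """
    root = ET.Element("DataRequest")
    interval = ET.SubElement(root, "TimeInterval")
    ET.SubElement(interval, "Start").text = start
    ET.SubElement(interval, "End").text = stop
    for dataset_id, variables in datasets.items():
        ds = ET.SubElement(root, "DatasetRequest")
        ET.SubElement(ds, "DatasetId").text = dataset_id
        for name in variables:
            ET.SubElement(ds, "VariableName").text = name
    return ET.tostring(root, encoding="unicode")

body = build_request(
    "2009-06-01T00:00:00.000Z",
    "2009-06-03T00:00:00.000Z",
    {
        "AC_K2_MFI": ["Magnitude", "BGSEc"],
        "EXAMPLE_DATASET_2": ["var_a"],  # placeholder dataset and variable
    },
)
```

The resulting string would be sent as the entity body of the POST, with each dataset's variables kept together inside its own element.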

Page 20: Finding, browsing, and getting data easily using SPDF web services

• Calling CDAWeb SOAP from IDL
– Use Java Bridge (Java's SOAP) and CDAWeb-specific JAR (automatically produced from WSDL)
  • Newer versions of IDL include/configure Java during installation
– 1 JAR to add to classpath
– IDL call to CDAWeb:
  • dataviews = cdas->getAllViewDescriptions()

• Calling CDAWeb REST from IDL
– Use IDLnetURL and IDLffXMLDOM
– Requires 1000s of lines* of hand-written, CDAWeb-specific data-binding and serialization/deserialization code (that uses IDLffXMLDOM for marshalling/unmarshalling)
– IDL call to CDAWeb:
  • dataviews = cdas->getDataviews()

Page 21: Finding, browsing, and getting data easily using SPDF web services

CDAS REST IDL Library: > 4518 LOC (including comment and blank lines)

Component                            LOC
Total                               6502
spdfAuthenticator                   -199
spdfCdawebChooser                   -980
spdfCdawebChooserAuthenticator      -116
spdfGetData                         -155
spdfHttpErrorDialog                 -100
spdfHttpErrorReporter                -97
spdfWsExample                       -337
Current Library                     4518
Missing pieces                        +?
Complete Library                   >4518

Page 22: Finding, browsing, and getting data easily using SPDF web services

• Calling CDAWeb REST from Java
– JAXB (Java Architecture for XML Binding) can generate, from the CDAS.xsd schema, the thousands of lines of code that must be hand-written in IDL
– Only requires a little more code (to send/receive HTTP) than SOAP.

• Conclusion:
– Calling REST is only a little more work than SOAP if the client environment has something like JAXB (which IDL doesn’t) or if someone else writes the extra code.

Page 23: Finding, browsing, and getting data easily using SPDF web services

SOAP Lesson 1 Applied To REST

• Cannot reduce metadata to simple array of string values
– That is, metadata is more complicated (structured) and simplification makes it less useful
– This resulted in the REST interface returning XML/JSON instead of something simpler

Page 24: Finding, browsing, and getting data easily using SPDF web services

SOAP Lesson 2 Applied To REST

• CDAWeb results (plot, listing, CDF) are too complicated to be returned in an HTTP entity body
– A single request can produce multiple image files
– A result may have many message, status, warning, and error entities associated with it
– Results may be thumbnail images that require special processing to obtain expanded images

• REST interface returns XML/JSON that contains URL of actual result (plot, listing, CDF)
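A client therefore handles a result in two steps: parse the returned document, then fetch the files it points to. The Python sketch below shows the first step on a made-up response. The element names and example.org URLs are hypothetical and are not the actual CDAS response format.

```python
import xml.etree.ElementTree as ET

# Made-up response: two plot files plus a status and a warning.
SAMPLE_RESULT = """<DataResult>
  <FileDescription><Name>https://example.org/tmp/plot_000.gif</Name></FileDescription>
  <FileDescription><Name>https://example.org/tmp/plot_001.gif</Name></FileDescription>
  <Status>2 plots created</Status>
  <Warning>Some records contained fill values</Warning>
</DataResult>"""

NOTE_TAGS = ("Message", "Status", "Warning", "Error")

def summarize(xml_text):
    """Return (result file URLs, list of (kind, text) notes) from a result document."""
    root = ET.fromstring(xml_text)
    urls = [name.text for name in root.iter("Name")]
    notes = [(child.tag, child.text) for child in root if child.tag in NOTE_TAGS]
    return urls, notes

urls, notes = summarize(SAMPLE_RESULT)
```

Each entry in `urls` would then be downloaded with an ordinary HTTP GET, while `notes` can be shown to the user alongside the result.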

Page 25: Finding, browsing, and getting data easily using SPDF web services

CDAWeb Resource URI Design

• Attempted to design URI that was meaningful, well structured, and used path variables when possible

• /dataviews/{dataview}/datasets/{dataset}/data/{start-time},{stop-time}/{var1},{varn}?format=...

• Cannot represent multi-dataset (with dataset associated variables) resources with a single path

• Multi-dataset resources must be requested with a POST containing an XML description of request
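For the single-dataset case, filling in the path template is straightforward. The Python sketch below builds the path from its parts; the dataview name and the format value passed in the example are placeholders, and the time strings reuse the form shown in the spdfgetdata example, which may differ from what the service actually expects.

```python
from urllib.parse import quote

def _seg(text, safe=""):
    """Percent-encode one path segment."""
    return quote(text, safe=safe)

def data_path(dataview, dataset, start, stop, variables, fmt):
    """Fill the URI template for a single-dataset data request."""
    segments = [
        "dataviews", _seg(dataview),
        "datasets", _seg(dataset),
        "data",
        _seg(start, safe=":") + "," + _seg(stop, safe=":"),
        ",".join(_seg(v) for v in variables),
    ]
    return "/" + "/".join(segments) + "?format=" + _seg(fmt)

path = data_path(
    "example_dataview",            # placeholder dataview name
    "AC_K2_MFI",
    "2009-06-01T00:00:00.000Z",
    "2009-06-03T00:00:00.000Z",
    ["Magnitude", "BGSEc"],
    "cdf",                         # placeholder format value
)
```

The function takes exactly one dataset, which mirrors the limitation noted above: variables from several datasets cannot be placed unambiguously in one path, so those requests go through POST.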

Page 26: Finding, browsing, and getting data easily using SPDF web services

Backups

Page 27: Finding, browsing, and getting data easily using SPDF web services

Abstract

The NASA GSFC Space Physics Data Facility (SPDF) provides heliophysics science-enabling information services for enhancing scientific research and enabling integration of these services into the Heliophysics Data Environment paradigm, via SOAP and REST web services in addition to web browser, FTP, and OPeNDAP interfaces. We describe these interfaces and the philosophies behind these web services, and show how to call them from various languages, such as IDL and Perl. We are working towards a "one simple line to call" philosophy extolled in the recent VxO discussions. Combining data from many instruments and missions enables broad research analysis and correlation and coordination with other experiments and missions.

Page 28: Finding, browsing, and getting data easily using SPDF web services

Coordinated Data Analysis Web (CDAWeb) <http://cdaweb.gsfc.nasa.gov>

• Data browsing system provides plotting, listing and open access via FTP, HTTP, and web services (REST, SOAP, OPeNDAP) for data from most NASA Heliophysics missions

• Combining data from many instruments and missions enables broad research analysis and correlation and coordination with other experiments and missions

• Collecting data and making it usable is the biggest effort, but also the most important

• Space weather, planetary studies, in situ/remote data

Page 29: Finding, browsing, and getting data easily using SPDF web services

Common Data Format (CDF) <http://cdf.gsfc.nasa.gov>

• Standard self-describing multidimensional data format

• Platform- and discipline-independent

• Associated scientific data management package (“CDF Library”) makes actual data format completely transparent to the user and accessible through a consistent set of interface routines

• IDL and Matlab routines

• Also callable by Fortran, C, C#, Perl, Java

• Open source

• Internal compression, checksums

Page 30: Finding, browsing, and getting data easily using SPDF web services

Format Translator

• CDF project also maintains software and services for translating between many standard formats (CDF, netCDF, HDF, FITS, XML) <http://cdf.gsfc.nasa.gov/html/dtws.html>

Page 31: Finding, browsing, and getting data easily using SPDF web services

IDL CDF Interface

Creating CDFs: CDF_CREATE, CDF_VARCREATE, CDF_ATTPUT, CDF_VARPUT, CDF_CLOSE

Reading CDFs: CDF_OPEN, CDF_INQUIRE, CDF_CONTROL, CDF_VARINQ, CDF_VARGET, CDF_ATTINQ, CDF_ATTGET, CDF_CLOSE

Info: CDF_CONTROL, CDF_COMPRESSION, CDF_DOC, CDF_INQUIRE, CDF_LIB_INFO

Time: CDF_ENCODE_EPOCH, CDF_EPOCH, CDF_PARSE_EPOCH, CDF_EPOCH_DIFF, CDF_EPOCH_COMPARE

Other: CDF_SET_MD5CHECKSUM, CDF_SET_CDF27_BACKWARD_COMPATIBLE
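For readers working outside IDL, the time routines above operate on CDF_EPOCH values, which count milliseconds from 0000-01-01T00:00:00.000. The following Python sketch shows the arithmetic only; it is not part of the CDF library, and the function name is made up for illustration.

```python
from datetime import datetime

_MS_PER_DAY = 86_400_000
# Python's datetime starts at year 1, so add the 366 days of year 0
# (a leap year in the proleptic Gregorian calendar).
_DAYS_IN_YEAR_0 = 366

def to_cdf_epoch(dt):
    """Convert a naive UTC datetime to a CDF_EPOCH value in milliseconds."""
    delta = dt - datetime(1, 1, 1)
    ms = (
        delta.days * _MS_PER_DAY
        + delta.seconds * 1000
        + delta.microseconds // 1000
    )
    return float(ms + _DAYS_IN_YEAR_0 * _MS_PER_DAY)

epoch_2000 = to_cdf_epoch(datetime(2000, 1, 1))
epoch_start = to_cdf_epoch(datetime(2009, 6, 1))  # start time from the spdfgetdata example
```

Integer arithmetic is used until the final conversion so that millisecond precision is not lost to floating-point rounding.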