Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory



SOAP Web & Grid Services

Wilson, 01/04/2005. ESIP Federation Meeting, Jan. 4-6, 2005

A Conceptual Grid

[Figure: a scientist reaches virtualized, “on-demand” resources (data, storage, algorithms, CPU, networks) through a portal.]


Vision

• Open Your Databases!
  – Support Query on and Retrieval of ALL Metadata
  – Could support remote SQL queries on your entire database.
  – Distributed/Local Metadata - not centralized as in ECHO
  – Support XML formats with schemas & OWL/RDF semantics
  – Examples: Amazon & Google SOAP Web services
• Open Your Architectures – Modularity, Reuse
  – Think in XML!! (design in XML first, not Java objects)
  – Expose algorithms & application modules via SOAP
  – Massive distributed computing via XML message exchange
  – Reuse modules from OpenDAP/DODS and WMS/WCS
• Web Services Choreography / Grid Dataflow
  – Leverage WS-Coordination, -Context, -Security, etc.
  – Leverage WS-Resource Framework (WSRF, Globus v4)
  – Semantic Web Reasoning on top of WS substrate


Outline

• Web Services
  – Using XML, SOAP, & WSDL
  – Thinking in XML: REST versus SOAP
  – Architecting Data Query/Access Services using SOAP
  – SOAP as a Universal Interoperability Glue
• Classifying Data Services
  – Dimensions / Categories
  – Interop between SOAP, OpenDAP, & OGC WMS/WCS
  – Semantic Web Support
  – Data Discovery to Inventory Level -- Enabling SOAP services
• WS Standards
  – WS-Coordination, -Notification, -Security, -ReliableMessaging
  – Convergence of Web & Grid Standards
  – WS-Resource Framework (WSRF)
  – Globus Toolkit v4


Three Generations of the Web

• 1st: Static HTML pages with Pictures!
  – Hyperlinks: click and jump!
  – Easy authoring of text with graphics (HTML layout).
  – Killer App: Having your own home page is hip.
  – Cons: Too static, One-way communication.
• 2nd: Dynamic HTML with streaming audio & video
  – Browser as an all-purpose, ubiquitous user interface.
  – Fancy clients using embedded Java applets.
  – Killer App: Fill out your time card on-line.
  – Cons: Applets clunky, ease of authoring disappears, information is still HTML (semi-structured).
• 3rd: SOAP-based Web Computing & Semantic Web
  – Exchange structured data in XML format (no fragile HTML); semantics (“meaning”) kept with data.
  – Programmatic interfaces rather than just GUI for a human.
  – Killer Apps: Grid Computing, automated data processing.


What is Simple Object Access Protocol (SOAP)?

• Distributed Computing by Exchange of XML Messages
• Lightweight, Loosely-Coupled API
  – Programming language independent (unlike Java RMI)
  – Transport protocol independent
• Multiple Transport Protocols Possible
  – HTTP or HTTPS (HTTP using SSL encryption) POST
  – Email (SMTP), Instant Messaging (MQSeries, Jabber)
  – Store & Forward Reliable Messaging Services (Java JMS)
  – Even Peer-to-Peer (P2P) protocols
• SOAP Toolkits for all languages
  – Apache Axis for Java, modules for python/perl, Visual C# .Net.
• Web Services Description Language (WSDL)
  – Generate call to a service automatically from WSDL document.
  – Data types and formats expressed in XML schema.
  – Publish services in UDDI catalogs for automated discovery.
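To make "distributed computing by exchange of XML messages" concrete, here is a minimal sketch that builds a SOAP 1.1 envelope with only the Python standard library. The operation name `getGranuleCount`, the namespace `urn:example`, and the `dataset` parameter are hypothetical and not taken from any real service. The result is plain text, which is why it could be carried over HTTP, SMTP, or any other transport.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_envelope(operation, namespace, params):
    """Wrap a call to `operation` in a SOAP 1.1 envelope and return it as text."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV)
    call = ET.SubElement(body, "{%s}%s" % (namespace, operation))
    for name, value in params.items():
        # Each parameter becomes a child element of the operation element.
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation and namespace, for illustration only.
message = build_envelope("getGranuleCount", "urn:example", {"dataset": "AIRS"})
print(message)
```

In practice a SOAP toolkit generates this envelope from the WSDL, so the sketch only shows what travels on the wire.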


Third Generation of the Web

• SOAP-based Web Computing & Semantic Web
  – Exchange structured data in XML format (not HTML)
  – Semantics or “meaning” kept with the data
  – Emphasize programmatic interfaces
  – Grid Services – WSRF and Globus Toolkit v4
  – Leverage many WS-* standards (Coordination, etc.)
• Simple Object Access Protocol (SOAP)
  – Distributed Computing by Exchange of XML Messages
  – Lightweight, Loosely-Coupled API
  – Programming language independent
  – Multiple Transport Protocols Possible (HTTP, P2P)
  – Web Services Description Language (WSDL)
  – Publish Services in catalogs for automated discovery


Why use SOAP to create scientific services?

• New paradigm of loosely-coupled distributed computing
  – SOAP messaging is exploding in the business world.
  – Used in Grid Computing: WSRF and Globus Toolkit v4
  – Asynchronous workflow (return results when available)
• Leverage XML standards (WSDL, UDDI, WS-*)
  – Service API is XML messages, not code.
  – Data types and formats expressed in XML schema.
• Exchanging large scientific data sets
  – Small objects in XML format
  – Even medium-size objects (2D grid slices) can be in XML.
  – Large objects as references (ftp or http URL) to HDF-EOS or netCDF binary containers (files).
• Security & authentication less important in science
  – Use HTTPS, Basic (user/password) Authentication, or WS-Security & WS-Authentication
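The rule of thumb for large data sets (small objects inline as XML, large objects as URL references to binary files) can be sketched as follows. The element names `Variable`, `Values`, and `URL`, the size cutoff, and the example URL are all assumptions made for illustration; they are not a schema defined in this talk.

```python
import xml.etree.ElementTree as ET

MAX_INLINE_VALUES = 10000  # assumed cutoff; a real service would tune this


def describe_variable(name, values=None, url=None):
    """Return a <Variable> element carrying data inline or by reference."""
    var = ET.Element("Variable", name=name)
    if values is not None and len(values) <= MAX_INLINE_VALUES:
        # Small object: embed the numbers directly in the XML message.
        ET.SubElement(var, "Values").text = " ".join(str(v) for v in values)
    elif url is not None:
        # Large object: send only a reference to the binary container.
        ET.SubElement(var, "URL").text = url
    else:
        raise ValueError("large objects need a URL to a binary container")
    return var


small = describe_variable("temperature", values=[308.6, 302.9])
large = describe_variable("TAirStd", url="http://example.org/data/granule.hdf")
print(ET.tostring(small, encoding="unicode"))
print(ET.tostring(large, encoding="unicode"))
```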


SOAP Software Toolkits

• Java (free)
  – Install Java 1.4, Tomcat 4.1, and Apache Axis 1.0
  – Optionally install WSDL4J, UDDI4J, & WSIF from IBM
  – Configure Tomcat & Axis: classpaths, etc.
• Perl (free)
  – Install perl 5.6 & SOAP::Lite module (contains UDDI::Lite)
• Python (free)
  – Install python 2.2 and SOAP.py
  – Optionally install UDDI4py from IBM
• DotNet (not free, but cheap)
  – Install Visual Studio .Net with C# or VB compilers
  – Send a payment to MS
• C++ (some free libraries)
  – But more work required


Toolkits Do the Work

• Transparent Serialization
  – Java JDBC ResultSet in the server -> XML format -> Java ResultSet Object in the client
  – Or ResultSet -> C# DataTable Object in a C# client
  – Or ResultSet -> Perl 2D array or hash in a perl client
• Custom Serialization Possible
  – Write a custom serializer for any object
• Automatic Deployment
  – Axis: Place Java code in the deployment directory and the service is magically exposed.
    • Can customize deployment using WSDD
  – C# .Net: Simply annotate the function as a [WebMethod]
  – Perl & Python: A few lines of code.
• Security & Authentication: Use capabilities of Apache
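As a rough picture of what the server half of "transparent serialization" does, the sketch below turns query rows into a `<ResultSet>` document with one `<Result>` per row. It is a hand-written stand-in, not the serializer that Axis or any other toolkit actually uses; the column names and the sample row are taken from the query output shown later in this talk.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(columns, rows):
    """Serialize query rows as <ResultSet><Result>...</Result></ResultSet>."""
    result_set = ET.Element("ResultSet")
    for row in rows:
        result = ET.SubElement(result_set, "Result")
        for column, value in zip(columns, row):
            # One child element per column, named after the column.
            ET.SubElement(result, column).text = str(value)
    return ET.tostring(result_set, encoding="unicode")


xml_text = rows_to_xml(
    ["datasetid", "lat", "lon"],
    [("20010801_1350sac_g43_1p0", 6.6839553, -38.501096)],
)
print(xml_text)
```

A client in any language can rebuild its own native table type from this document, which is the point of the ResultSet -> XML -> ResultSet chain above.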


Python Client Example

Interoperable SOAP Web Services

#!/usr/local/bin/python
# soap_sql.py
# Reads a database name (first line) and an SQL query (remaining lines)
# from stdin, then calls the executeQuery SOAP service.

import sys

from SOAPpy import SOAP
#import SOAP

db = sys.stdin.readline().strip()   # first input line: database name
lines = sys.stdin.readlines()       # remaining lines: text of the SQL query
sql = ''.join(lines)                # each line already ends with a newline

server = SOAP.SOAPProxy(
    "https://genesis:[email protected]:80/axis/GenesisSQL.jws",
    namespace="urn:Genesis_SQL",
    encoding=None)

print(server.executeQuery(SOAP.stringType(sql), SOAP.stringType(db)))

Input:
jplsacc
select * from L2data where datasetid == '20010801_1350sac_g43_1p0'
and datatype == 'gpsProfile'


REST versus SOAP

• REpresentational State Transfer (REST)
  – Roy Fielding’s Ph.D. Thesis
  – Everything is an object or document with a URI.
  – Problems:
    • Hidden state: cookies, sessions, CGI db, etc.
    • What about virtual datasets?
    • Need to generate too many URI’s.
• Service Oriented Architecture (SOA)
  – Everything is a service.
  – What about stateful services? WS-Resource
• Same Old Tradeoff
  – Like object-oriented programming versus functional programming (need both).
  – It’s both an object and a service.


Amazon E-Commerce Services

• Item Search – SOAP Service input
  – <SubscriptionId>[Your SOAP key here]</SubscriptionId>
    <Request>
      <Keywords>free lunch hoagie</Keywords>
      <SearchIndex>Books</SearchIndex>
    </Request>
• WSDL File
  – http://webservices.amazon.com/AWSECommerceService/AWSECommerceService.wsdl
• Returns XML Document
  – Contains TotalResults count and a list of Items
  – Each item contains many attributes (title, authors, ASIN, etc.)


Amazon E-Commerce Services

• Also support REST paradigm for request
  – http://webservices.amazon.com/onca/xml?Service=AWSECommerceService&Operation=ItemSearch&SubscriptionId=[Your SOAP Key here]&Keywords=free%20lunch%20hoagie&SearchIndex=Books&ResponseGroup=Medium
• Response is same XML Document
  – With results tailored by ResponseGroup(s)
• Why are they opening their database up?
  – Allow 3rd parties to build apps on top of db
  – Build partnerships
  – Promulgate their software architecture/standard
  – Goal: Amazon lookup part of every business web app.
  – Makes sense even for commercial entities
  – Certainly best for scientific data archives
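The REST form of the request is just a URL with encoded parameters, so it can be assembled with the standard library alone. This sketch reproduces the URL shown on this slide; the function name `item_search_url` and the placeholder key `MY-KEY` are illustrative, and the endpoint is the one quoted in the talk, which may no longer be live.

```python
from urllib.parse import quote, urlencode

# Endpoint as quoted on the slide.
BASE_URL = "http://webservices.amazon.com/onca/xml"


def item_search_url(subscription_id, keywords,
                    search_index="Books", response_group="Medium"):
    """Build the one-line REST URL for an ItemSearch request."""
    params = [
        ("Service", "AWSECommerceService"),
        ("Operation", "ItemSearch"),
        ("SubscriptionId", subscription_id),
        ("Keywords", keywords),
        ("SearchIndex", search_index),
        ("ResponseGroup", response_group),
    ]
    # quote_via=quote encodes spaces as %20, matching the slide's URL.
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


url = item_search_url("MY-KEY", "free lunch hoagie")
print(url)
```

The same six parameters appear as XML elements in the SOAP request, which is why the two request styles can share one response document.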


Example: General SQL Query Service

• Using Informix Object-Relational Database
  – Contains all Level 2 GPS occultation profile retrievals
  – Multiple parameters (temperature, water vapor pressure, refractivity, & pressure) as a function of altitude
  – GeoSpatial DataBlade provides 4D R-tree index on time, latitude, longitude, & altitude.
  – Support “nearest” semantics (multiple hits sorted by distance)
  – GetNearest service & BatchGetNearest (batch of requests)
  – Implemented using Informix ORDBMS & Geodetic DataBlade
• SQL queries using 4D R-tree index on time, lat, lon, height
• executeQuery SOAP Service
  – Two input arguments
    • Database to search
    • Text of arbitrary SQL Query
  – Single output
    • SQL ResultSet in XML format
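The "nearest" semantics (multiple hits sorted by distance) can be illustrated with a brute-force sketch. This is not how the service described above works: it relies on the database's 4D R-tree index, whereas the code below simply sorts an in-memory list by great-circle distance and ignores time and altitude. The function names and the sample profile list are hypothetical.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance between two lat/lon points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def get_nearest(profiles, lat, lon, max_hits=3):
    """Return up to max_hits profiles, closest first (linear scan, no index)."""
    ranked = sorted(
        profiles,
        key=lambda p: great_circle_km(lat, lon, p["lat"], p["lon"]),
    )
    return ranked[:max_hits]


# Hypothetical profile locations.
profiles = [
    {"id": "C", "lat": -30.0, "lon": 100.0},
    {"id": "A", "lat": 6.68, "lon": -38.50},
    {"id": "B", "lat": 10.0, "lon": -40.0},
]
hits = get_nearest(profiles, 6.7, -38.5, max_hits=2)
print([h["id"] for h in hits])
```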


Output of the executeQuery Service

<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="http://gilaan.usc.edu:80/axis/xsldir/xsl271059363124909.xsl"?>
<ResultSet>
  <Result>
    <datasetid>20010801_1350sac_g43_1p0</datasetid>
    <datatype>gpsProfile</datatype>
    <height>0.0840427222</height>
    <lat>6.6839553</lat>
    <lon>-38.501096</lon>
    <refractivity>359.099346</refractivity>
    <temperature>308.641494</temperature>
    <pressure>1009.2816</pressure>
    <wvpressure>23.3353547</wvpressure>
  </Result>
  <Result>
    <datasetid>20010801_1350sac_g43_1p0</datasetid>
    <datatype>gpsProfile</datatype>
    <height>0.404808732</height>
    <lat>6.71812814</lat>
    <lon>-38.513238</lon>
    <refractivity>346.437323</refractivity>
    <temperature>302.964981</temperature>
    <pressure>973.213908</pressure>
    <wvpressure>21.5868849</wvpressure>
  </Result>
  . . .
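On the client side, a document like the one above is easy to turn back into native data. The following sketch parses a trimmed copy of the output (only three of the fields, and a closed `<ResultSet>`) into a list of dictionaries and then extracts a temperature profile as (height, temperature) pairs. The set of numeric field names is inferred from the sample output, not from a published schema.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the service output shown above.
SAMPLE = """<ResultSet>
  <Result>
    <datasetid>20010801_1350sac_g43_1p0</datasetid>
    <height>0.0840427222</height>
    <temperature>308.641494</temperature>
  </Result>
  <Result>
    <datasetid>20010801_1350sac_g43_1p0</datasetid>
    <height>0.404808732</height>
    <temperature>302.964981</temperature>
  </Result>
</ResultSet>"""

# Fields assumed to be numeric, based on the sample output.
NUMERIC_FIELDS = {"height", "lat", "lon", "refractivity",
                  "temperature", "pressure", "wvpressure"}


def parse_result_set(xml_text):
    """Convert <ResultSet> XML into a list of dicts, one per <Result>."""
    rows = []
    for result in ET.fromstring(xml_text).findall("Result"):
        row = {}
        for field in result:
            text = (field.text or "").strip()
            row[field.tag] = float(text) if field.tag in NUMERIC_FIELDS else text
        rows.append(row)
    return rows


rows = parse_result_set(SAMPLE)
profile = [(r["height"], r["temperature"]) for r in rows]
print(profile)
```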


executeQuery SOAP Message

Accept: text/xml
Accept: multipart/*
Content-Length: 663
Content-Type: text/xml; charset=utf-8
SOAPAction: "urn:Genesis_SQL#executeQuery"

<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope
    xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/1999/XMLSchema"
    xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <namesp1:executeQuery xmlns:namesp1="urn:Genesis_SQL">
      <c-gensym3 xsi:type="xsd:string">select * from L2data where datasetid == '20010801_1350sac_g43_1p0' and datatype == 'gpsProfile' </c-gensym3>
      <c-gensym5 xsi:type="xsd:string">jplsacc</c-gensym5>
    </namesp1:executeQuery>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>


executeQueryResponse SOAP Message

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope
    xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/1999/XMLSchema"
    xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance">
  <soapenv:Body>
    <ns1:executeQueryResponse
        soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
        xmlns:ns1="urn:Genesis_SQL">
      <ns1:executeQueryReturn xsi:type="ns2:string"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xmlns:ns2="http://www.w3.org/2001/XMLSchema">

        [Returned XML doc here, encoded]

      </ns1:executeQueryReturn>
    </ns1:executeQueryResponse>
  </soapenv:Body>
</soapenv:Envelope>
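Because the returned document travels as an encoded string inside `executeQueryReturn`, a client that does not use a SOAP toolkit has to unwrap it in two steps: parse the envelope, then parse the string it carries. The sketch below assumes the encoding is ordinary XML entity escaping (an assumption; the talk does not say which encoding is used) and builds a tiny stand-in response to demonstrate the round trip.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
SERVICE_NS = "urn:Genesis_SQL"


def extract_result_document(response_text):
    """Pull the embedded XML document out of an executeQueryResponse."""
    envelope = ET.fromstring(response_text)
    path = "{%s}Body/{%s}executeQueryResponse/{%s}executeQueryReturn" % (
        SOAP_ENV, SERVICE_NS, SERVICE_NS)
    payload = envelope.find(path)
    if payload is None or payload.text is None:
        raise ValueError("no executeQueryReturn element in response")
    # The parser has already undone the entity escaping, so payload.text
    # is the inner document as plain XML text.
    return ET.fromstring(payload.text)


# Stand-in response for illustration; a real one comes from the service.
inner = "<ResultSet><Result><datatype>gpsProfile</datatype></Result></ResultSet>"
response = (
    '<soapenv:Envelope xmlns:soapenv="%s">'
    "<soapenv:Body>"
    '<ns1:executeQueryResponse xmlns:ns1="%s">'
    "<ns1:executeQueryReturn>%s</ns1:executeQueryReturn>"
    "</ns1:executeQueryResponse>"
    "</soapenv:Body>"
    "</soapenv:Envelope>"
) % (SOAP_ENV, SERVICE_NS, escape(inner))

result_set = extract_result_document(response)
print(result_set.find("Result/datatype").text)
```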


[Figure: two charts, each plotting Productivity / Ease of Use against Time.
History of Distributed Computing, with labels: Specialized, DCE, CORBA, DCOM, Java RMI, XML-RPC, SOAP.
History of Programming Languages, with labels: Fortran, C, C++, Object Pascal, Turbo Pascal, Perl, Java, Python.]


Sad History of Function Call Interfaces

• Fortran – partially typed
  – Positional arguments only, by address, partially typed
• C/Pascal – fully typed
  – Added varargs for unknown number of arguments
  – But used pointers to return multiple results
• C++
  – Added function/operator overloading
• Java
  – Pointers hidden (yes!), but lost operator overloading
• Perl, Python, IDL – func(arg1, arg2, *array, **hash)
  – Added implicit array to catch variable number of args
  – Added implicit hash to catch named args
• XML – input & output are hierarchical XML documents
  – Flexible interface since can ignore unneeded tags
  – Hierarchy & dynamic typing imply rich interface
  – Use XQuery/XPath to extract/update needed tags
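The claim that an XML interface is flexible because the receiver can ignore unneeded tags can be shown in a few lines. In this sketch a handler looks up only the tags it understands, using the limited XPath support in Python's ElementTree, so an older and a newer client can both call it. The `<Query>` document and its tag names are hypothetical.

```python
import xml.etree.ElementTree as ET


def read_request(xml_text):
    """Extract only the tags this service understands; ignore the rest."""
    root = ET.fromstring(xml_text)
    return {
        "dataset": root.findtext("DatasetName"),
        "start": root.findtext("StartTime"),
        "end": root.findtext("EndTime"),  # None when the tag is absent
    }


# An older client sends two tags; a newer one adds tags the service may not know.
old_client = ("<Query><DatasetName>AIRS</DatasetName>"
              "<StartTime>2003-01-01</StartTime></Query>")
new_client = ("<Query><DatasetName>AIRS</DatasetName>"
              "<Priority>high</Priority>"
              "<StartTime>2003-01-01</StartTime>"
              "<EndTime>2003-01-02</EndTime></Query>")

old_request = read_request(old_client)
new_request = read_request(new_client)
print(old_request)
print(new_request)
```

Unlike a positional function signature, neither the extra `Priority` tag nor the missing `EndTime` tag breaks the call.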


Thinking in XML

• Design the XML metadata and data representations first. Add an ontology.
  – Then create objects in your favorite programming language.
  – Then create objects in more languages for broader use.
• 4DRectilinearGrid datatype
  – <4DRectilinearGrid>
      <Latitudes> <Series>-90, 90, 2
      <Longitudes> <URL>http://gsfc.nasa.gov/cgi-bin/dods/AIRS.hdf?Longitude
      <Altitudes>
      <Times>
      <Variable> <Name> <Unit> <MissingValueMarker> </Variable>
      <Grid> <URL>http://gsfc.nasa.gov/cgi-bin/dods/AIRSgranule.hdf?TAirStd </Grid>
    </4DRectilinearGrid>
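Once the XML representation is designed, objects in a programming language follow from it. As one small example, the sketch below expands a `<Series>` value such as `-90, 90, 2` into explicit coordinates. It assumes the three numbers mean start, stop, and step, which the slide does not state, and the function name `expand_series` is hypothetical.

```python
def expand_series(text):
    """Expand 'start, stop, step' into a list of evenly spaced values.

    Assumes the three comma-separated numbers are start, stop, and step,
    with both end points included.
    """
    start, stop, step = (float(part) for part in text.split(","))
    if step <= 0:
        raise ValueError("step must be positive")
    count = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(count)]


latitudes = expand_series("-90, 90, 2")
print(len(latitudes), latitudes[0], latitudes[-1])
```

A compact series like this keeps regular axes small in the XML message, while irregular axes can be given by URL as in the `<Longitudes>` element.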


New Paradigm for Scientific Computing

• Use two or more channels
  – XML Messaging on one
  – Binary data protocol (like OpenDAP) on the other
• Two channels can use different transports
  – XML Messaging over HTTP or dark P2P
  – OpenDAP over LambdaRail or scalable P2P caching network
• XML Messaging Channel for:
  – Control, Service Chaining, Workflow, Metadata Exchange
  – Configuration of Binary channel
• Binary Data Channel(s)
  – Out-of-band from XML point of view
  – Bulk data transfer, replication, or caching
  – Could be multi-protocol (OpenDAP, GridFtp, LambdaGrid)
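A minimal sketch of the two-channel idea: the XML (control) channel delivers a URL, and the client picks a binary transport from the URL scheme. The mapping table, the function name `plan_transfer`, and the example host are hypothetical, and the code only chooses a transport; it does not move any data.

```python
from urllib.parse import urlparse

# Hypothetical mapping from URL scheme to the bulk-transfer method to use.
BINARY_TRANSPORTS = {
    "http": "HTTP GET",
    "https": "HTTP GET",
    "ftp": "FTP",
    "gsiftp": "GridFTP",
}


def plan_transfer(data_url):
    """Pick the binary channel for a URL received on the XML channel."""
    scheme = urlparse(data_url).scheme.lower()
    if scheme not in BINARY_TRANSPORTS:
        raise ValueError("no binary transport for scheme %r" % scheme)
    return BINARY_TRANSPORTS[scheme], data_url


# Stand-in for the content of a control message received as XML.
control_message = {
    "variable": "TAirStd",
    "url": "gsiftp://data.example.org/airs/granule.hdf",
}
transport, url = plan_transfer(control_message["url"])
print(transport, url)
```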


Data Query/Access Services

• Services Chain goes beyond simple data access
  – Time & Geolocation Query yielding data inventory
  – Query on Quality Flags
  – General Metadata Query catalog satisfying conditions
  – Data Access - by object or granule ID and variable name
  – Data Slicing - as in OpenDAP
  – Data Subsetting – by lat, lon, alt, & time ranges
  – Parameter Subsetting – select only desired physical variables
  – Data Reformatting – choose output format as in WMS/WCS
  – On-Demand Grid Computations – grid diff in GrADS/DODS
  – Variable Bundles – external metadata from scientist
  – Return Composite Data Objects
    • 4DRectilinearGrid, 4DCurvilinearGrid (swath)
    • XML & Native Binary representations
  – User Composites – add custom object to data model
  – Semantic Support
    • Use generic variable names
    • Reason about variables to be bundled


Classifying Data Services - Dimensions

• Services Provided
  – Is it discovery, query, access, and/or subsetting?
• Degree of Metadata Support
• Richness of the Data Model
• Request Interface / Protocol
• Response Interface / Protocol
• Representation of Large Binary Data Objects
• Transport Layers (http, GridFtp, p2p, LambdaRail)
• Permanent Names & Version Support
• Service Chaining
• Semantic Web Support
• Interoperability with other services
• Maturity, Support, Open or Proprietary, Cost


Classifying Data Services (I)

• Services Provided
  – Is it discovery, query, access, and/or subsetting?
  – Taxonomy on earlier slide
• Degree of Metadata Support
  – Time & Geolocation Query?
  – General Metadata Query?
  – Retrieve selected metadata fields?
  – Bind external metadata into the dataset?
• Richness of the Data Model
  – Supports point, swath, grid, and vector data types?
  – Is the data model exposed to the user?
  – Interoperability? (i.e., between OpenDAP, OGC, HDF-EOS)
  – Composite objects, like 4D geo-referenced grid?
  – Add user composites? (as in Java object serialization)
  – Compute grid averages or differences on demand? (Ferret)


Classifying Data Services (II)

• Request Interface / Protocol
  – REST (one-line HTTP URL’s) versus SOAP
  – OpenDAP & WMS/WCS use REST
  – REST tends to be more ad hoc than structured XML/SOAP
• Response Interface / Protocol
  – OpenDAP uses custom binary wire protocol
  – OGC WMS/WCS uses REST
  – REST uses HTTP with MIME types & multi-part
  – SOAP uses XML documents with possible attachments
• Representation for Large Binary Data Objects
  – OpenDAP: custom binary wire protocol
  – WMS/WCS: MIME multi-part embedding
  – SOAP: embedded, attachments, or URL references
    • Binary files referenced by ftp, http, or DODS URL’s
  – Serialized Binary Objects (as in C++, Java, or C# .Net)


Classifying Data Services (III)

• Transport Layers (http, GridFtp, p2p, LambdaRail)
  – REST uses http for both request & response
  – SOAP can use http, smtp, Instant Messaging, Secure Messaging, or p2p
• Degree of Modularity & Layering
  – Is the software separated into well-defined protocol layers?
  – Can functions/modules be reused separately?
  – Example: OpenDAP contains a rich data model, data slicing syntax, and a binary wire protocol. Reuse each?
  – Clean design or somewhat ad hoc?
• Permanent Names & Version Support
  – Globally unique ID’s for data objects / granules
  – Permanent names for service endpoints
  – Versions for data objects & services
  – Is it a service or an object? (REST vs. SOAP)
  – It’s both. And they both need to be permanent.


Classifying Data Services (IV)

• Service Chaining
  – Can modules be chained together?
  – In REST, here one reaches limits of one-line URL’s
  – In SOAP, have rich world of WS Choreography or Grid Workflow solutions
  – Use semantic web to discover and chain services
• Semantic Web Support
  – Exposes scientific domain info. for data and service discovery?
  – Generic dataset and variable names?
  – Are the data categorized in an ontology?
  – Is the service categorized in an ontology?
  – Can the “meaning” of the service and the meaning of each of its inputs and outputs be inferred from an ontology?
  – Is the service interoperable with others through semantic mediation?


Time & Geolocation Query Service

• Inputs
  – StartTime, EndTime, Lat/Lon Box, DatasetName, VariableName, MetadataGroup(s)
• Output
  – List of items
  – Each item has unique ID, start & end time, lat/lon box
  – More (optional) metadata groups
• Datasets
  – Implemented for GENESIS GPS database, and for AIRS, MODIS, & MISR granules in DataPool.
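The core of a time and geolocation query is an interval-overlap test plus a box-intersection test. The sketch below applies both to an in-memory list; it is an illustration under simplifying assumptions (boxes given as south, north, west, east in degrees, no handling of boxes that cross the date line), not the implementation behind the service. The granule IDs and the function names are hypothetical.

```python
from datetime import datetime


def boxes_intersect(a, b):
    """True if two (south, north, west, east) boxes overlap.

    Simplification: boxes crossing the date line are not handled.
    """
    south_a, north_a, west_a, east_a = a
    south_b, north_b, west_b, east_b = b
    return (south_a <= north_b and south_b <= north_a
            and west_a <= east_b and west_b <= east_a)


def time_geo_query(items, start, end, box):
    """Return the items whose time span and lat/lon box overlap the query."""
    hits = []
    for item in items:
        overlaps_time = item["start"] <= end and start <= item["end"]
        if overlaps_time and boxes_intersect(item["box"], box):
            hits.append(item)
    return hits


# Hypothetical inventory of three granules.
granules = [
    {"id": "granule-001", "start": datetime(2003, 1, 1, 0, 0),
     "end": datetime(2003, 1, 1, 0, 6), "box": (0.0, 20.0, -50.0, -30.0)},
    {"id": "granule-002", "start": datetime(2003, 1, 2, 0, 0),
     "end": datetime(2003, 1, 2, 0, 6), "box": (0.0, 20.0, -50.0, -30.0)},
    {"id": "granule-003", "start": datetime(2003, 1, 1, 0, 0),
     "end": datetime(2003, 1, 1, 0, 6), "box": (40.0, 60.0, 100.0, 120.0)},
]
hits = time_geo_query(granules, datetime(2003, 1, 1),
                      datetime(2003, 1, 1, 12), (5.0, 10.0, -40.0, -35.0))
print([h["id"] for h in hits])
```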


SciFlo Data Query/Access (SOAP) Services

• GeoLocationQuery(startTime, endTime, lat, lon, timeTolerance, distanceTolerance, variable, metadataGroups)
  – Returns granule ID’s and geolocation info. for AIRS L2 swaths that intersect lat/lon point or are near enough.
• GeoRegionQuery(startTime, endTime, lat/lon region, . . .)
  – Returns granule ID’s and geolocation info. for AIRS L2 swaths that intersect the lat/lon region or are near enough.
• QueryByMetadata(variable, ListOfConstraints, groups)
  – Returns granule ID’s and selected metadata for AIRS granules that satisfy the metadata constraint expression: (min1 <= field1 <= max1 and/or min2 <= field2 <= max2 . . .).
• GetDataById(IdList, UrlOptions)
  – Given list of unique ID’s, returns list of ftp, http, or DODS URL’s pointing to the granules (files). Uses cache and redirection server.
• Other semantic interfaces possible
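The constraint expression accepted by QueryByMetadata (min1 <= field1 <= max1 and/or min2 <= field2 <= max2 ...) can be evaluated with a few lines of code. In this sketch each constraint is a (field, min, max) tuple and a single `combine` flag selects "and" or "or"; that representation, the field names `cloudFraction` and `solarZenith`, and the granule records are assumptions for illustration, not the service's actual argument format.

```python
def satisfies(metadata, constraints, combine="and"):
    """Check range constraints given as (field, minimum, maximum) tuples."""
    checks = [
        field in metadata and low <= metadata[field] <= high
        for field, low, high in constraints
    ]
    return all(checks) if combine == "and" else any(checks)


def query_by_metadata(granules, constraints, combine="and"):
    """Return IDs of granules whose metadata satisfy the constraints."""
    return [g["id"] for g in granules
            if satisfies(g["metadata"], constraints, combine)]


# Hypothetical granule metadata.
granules = [
    {"id": "g1", "metadata": {"cloudFraction": 0.1, "solarZenith": 30.0}},
    {"id": "g2", "metadata": {"cloudFraction": 0.8, "solarZenith": 35.0}},
    {"id": "g3", "metadata": {"cloudFraction": 0.2, "solarZenith": 80.0}},
]
constraints = [("cloudFraction", 0.0, 0.3), ("solarZenith", 0.0, 60.0)]

both = query_by_metadata(granules, constraints, combine="and")
either = query_by_metadata(granules, constraints, combine="or")
print(both, either)
```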


SciFlo Data Query/Access (SOAP) Services

• GetMetadata(type, groups)
  – Returns metadata describing scientific domain, dataset info, list of metadata fields, generic name translation table, location of XML schema documents, location of related ontologies (OWL/RDF), etc.
• GetHelp(type, groups)
  – A SOAP service that itself returns help documentation describing the available SOAP services.
• These Services Combined Can Support:
  – Data Catalog, General Query, Data Access, Generic Names,
  – Data Discovery by Domain and Dataset Keywords
  – Type checking via XML schemas
  – Hooks to semantic web


Data Discovery to Inventory Level

• SOAP Services just described provide:
  – Domain description Metadata
  – Dataset Info. and Keywords
  – Generic VariableName Lookup Table
  – Query by Metadata or just Time & GeoRegion
• Layer more semantics on top of these services
  – Dataset Discovery by keyword search
  – Bind in additional metadata provided by scientist
  – Tie ontologies into keywords and genericVariableNames
  – Semantic Inference: AIRS and MODIS both provide atmosphericTemperature and cloudProperties. Hmmm. Compare them.
  – XML/SOAP provides substrate; possibilities only limited by ontology development.
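The simplest form of the inference mentioned above is a set intersection over generic variable names. The sketch below uses a hand-written lookup table in place of a real ontology or reasoner. Only the fact that AIRS and MODIS both provide atmosphericTemperature and cloudProperties comes from the talk; every other entry in the table is a made-up placeholder.

```python
# Stand-in for the generic variable-name lookup table. Apart from the two
# variables shared by AIRS and MODIS, the entries are hypothetical.
DATASET_VARIABLES = {
    "AIRS": {"atmosphericTemperature", "cloudProperties", "waterVapor"},
    "MODIS": {"atmosphericTemperature", "cloudProperties", "aerosolOpticalDepth"},
    "MISR": {"aerosolOpticalDepth", "cloudProperties"},
}


def comparable_variables(dataset_a, dataset_b):
    """Return the generic variables that both datasets provide, sorted."""
    shared = DATASET_VARIABLES[dataset_a] & DATASET_VARIABLES[dataset_b]
    return sorted(shared)


shared = comparable_variables("AIRS", "MODIS")
print(shared)
```

A real semantic layer would derive such matches from OWL/RDF relationships rather than from identical strings.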


Evolving Grid Computing Standards

[Figure with label GT4. From the Globus Toolkit “Ecosystem” presentation at GGF11 by Lee Liming.]


Evolving Grid Computing Standards

• History of Scientific Computing as a Utility
  – The Grid began as effort to tightly couple multiple super- or cluster computers together (e.g., Globus Toolkit v1 & v2).
  – Needed job scheduling, submission, monitoring, steering, etc.
  – SETI@HOME success
• OGSI: Open Grid Services Infrastructure
  – Globus v3.2 is open-source implementation using Java/C.
  – A service is Grid-enabled by inheriting from Java class.
  – Standard is complex and growing.
  – Challenge: Ease of installation & use.
• WS-Resource Framework (WSRF)
  – Capabilities treated as storage or computing resources exposed on the web.
  – Globus Toolkit v4 coming



Grid Applications

[Figure: layered diagram of a Grid application.
Layers: Users work with client applications; Application services organize VOs & enable access to other services; Collective services aggregate &/or virtualize resources; Resources implement standard access & management interfaces.
Components shown: Web Browser, Data Viewer Tool, Chat Tool, Simulation Tool, Telepresence Monitor, Web Portal, Registration Service, Credential Repository, Certificate authority, Data Catalog, Compute Server (2), Database service (3), Camera (2). Labels A to E also appear.
Component counts by source: Application Developer 9, Off the Shelf 13, Globus Toolkit 0, Grid Community 0.]


Grid Apps Using Globus Middleware

[Figure: the same layered diagram with Globus middleware substituted.
Layers: Users work with client applications; Application services organize VOs & enable access to other services; Collective services aggregate &/or virtualize resources; Resources implement standard access & management interfaces.
Components shown: Web Browser, Data Viewer Tool, CHEF Chat Teamlet, Simulation Tool, Telepresence Monitor, CHEF, Globus Index Service, MyProxy, Certificate Authority, Globus MCS/RLS, Globus GRAM (2), Globus DAI (3), Compute Server (2), Database service (3), Camera (2).
Component counts by source: Application Developer 2, Off the Shelf 9, Globus Toolkit 4, Grid Community 4.]


Components in Globus Toolkit 3.2

[Figure: component chart.
Column headings: Security, Data Management, WS Core, Resource Management, Information Services.
Components shown: GSI, WS-Security, CAS (OGSI), SimpleCA, RFT (OGSI), RLS, OGSI-DAI, WU GridFTP, XIO, Java WS Core (OGSI), OGSI C Bindings, OGSI Python Bindings (contributed), pyGlobus (contributed), MDS2, WS-Index (OGSI), Pre-WS GRAM, WS GRAM (OGSI).]


Planned Components in GT 4.0

[Figure: component chart.
Column headings: Security, Data Management, WS Core, Resource Management, Information Services.
Components shown: GSI, WS-Security, CAS (WSRF), SimpleCA, Authz Framework, RFT (WSRF), RLS, OGSI-DAI, New GridFTP, XIO, Java WS Core (WSRF), C WS Core (WSRF), pyGlobus (contributed), MDS2, WS-Index (WSRF), Pre-WS GRAM, WS-GRAM (WSRF), CSF (contribution).]

Page 40: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Page 41: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

WS-* Alphabet Soup

Page 42: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

WS Architecture

Page 43: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

WS-Resource Framework Model

Page 44: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Using a Web Service to Access a WS-Resource

Page 45: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Overview of WS-Coordination

Page 46: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Evolving Grid Computing Standards

[From Globus Toolkit “Ecosystem” presentation at GGF11 by Lee Liming]

[Figure label: GT4]

Page 47: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Compare/Contrast Template
• Services Provided
• Implemented Standards / Protocols
• Inputs, Outputs, & Parameters
• Metadata
• Maturity
  – Number of Users, Data Volume
• Support
  – Level, Type, Availability
• Open / Proprietary
  – Open? Proprietary? Published? Costs?
• Activities
  – Interoperability?
• Future Plans
  – Services to be Implemented

Page 48: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Services Provided by WS
• Distributed Computing via XML Message Exchange
• Service Oriented Architecture (SOA)
• Many transport protocols supported (http, smtp, messaging, p2p)
• Security, Authentication, Authorization
• Reliable Messaging
• Notification
• Transactions
• Stateful Resources – creation, lifetime (WSRF)
• WS Choreography
• Grid Workflow

Page 49: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Implemented Standards / Protocols
• XML schema
• SOAP, WSDL, UDDI
• WS-Resource Framework
• WS-ResourceProperties, -ResourceLifetime, -BaseFaults
• WS-Addressing (endpoints with state pointers)
• WS-Notification (publish & subscribe)
• WS-Security, -Policy, -SecurityPolicy
• WS-Transaction, -ReliableMessaging
• WS-Coordination, -Context (choreography)
• BPEL4WS (IBM workflow engine)
• SciFlo Engine coming

Page 50: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Inputs, Outputs, & Parameters
• Too numerous to list.
• See SOAP and WS-* specifications.

Page 51: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Metadata
• See WS-* specifications.
• ??

Page 52: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Maturity
• SOAP, WSDL, UDDI are mature but still evolving.
• Latest versions: SOAP v1.2, WSDL v2.0, UDDI v3
• Many WS-* specifications are just being finalized.
• WSRF and Globus Toolkit v4 coming soon.

Page 53: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Support
• From numerous companies.
• IBM Websphere, etc.
• ??

Page 54: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Open / Proprietary
• Both
• WS-* are standards, with interop. features.
• But companies provide added value and push changes in standards.
• Already WS stacks and development environments for sale.

Page 55: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Future Plans
• Interoperability testing is proceeding.
• More WS-* standards and versions will emerge.

Page 56: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Services Provided by SciFlo
• Data Query / Access (SOAP services)
• GPS Occultation Database (atmospheric retrievals)
• AIRS, MODIS, & MISR L2/L3 granules
• Grid Workflow Engine
  – Abstract XML dataflow documents translated to concrete flows, with missing steps inserted automatically.
  – Parallel dataflow execution engine (scheduler)
  – Semantic inference using XML metadata
  – Operators can be SOAP services, local scripts, or local native executables.
  – Move authenticated operators/executables to the data.
  – SOAP architecture, but also P2P functionality.
  – Every node is both client & server; easy node replication.
  – One-click installation onto server or desktop nodes.
  – Support Linux & Windows.
  – Initiate grid computations from your desktop.
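The parallel dataflow execution idea can be sketched in a few lines of Python. This is a toy scheduler, not the SciFlo engine: the operator names and the `run_flow` helper are hypothetical, and operators here are plain local callables rather than SOAP services or executables. It uses the standard-library `graphlib` module (Python 3.9+) to find operators whose inputs are ready and runs each ready batch in a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_flow(operators, depends_on, max_workers=4):
    """Run every operator once all of its upstream results are available.

    operators:  name -> callable taking its dependencies' results, in order
    depends_on: name -> list of upstream operator names
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the flow is not a DAG
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            ready = sorter.get_ready()  # operators whose inputs are all done
            futures = {
                name: pool.submit(
                    operators[name],
                    *(results[dep] for dep in depends_on.get(name, [])),
                )
                for name in ready
            }
            for name, future in futures.items():
                results[name] = future.result()
                sorter.done(name)
    return results


# Toy flow (hypothetical operators): two independent queries feed one merge step.
operators = {
    "query_airs": lambda: [1, 2, 3],
    "query_gps": lambda: [10, 20],
    "coregister": lambda airs, gps: [(a, g) for a in airs for g in gps],
    "count": lambda pairs: len(pairs),
}
depends_on = {
    "query_airs": [],
    "query_gps": [],
    "coregister": ["query_airs", "query_gps"],
    "count": ["coregister"],
}
results = run_flow(operators, depends_on)
```

The two query operators have no dependencies, so they are submitted together; `coregister` waits for both. A real engine would also need failure handling and remote dispatch, which this sketch leaves out.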

Page 57: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Dataflow / Workflow Engines

Grid:
• Schedule & submit cluster computing jobs
• Operator tree is a Directed Acyclic Graph (DAG)
• CONDOR, CONDOR-G, DAGMan
• Globus Alliance Standards: GSI, GRAM, MDS, RLS, XIO, etc.
• Chimera -> Pegasus -> DAGMan -> Executing Grid Job

Web:
• Several web choreography standards
• IBM’s Business Process Execution Language (BPEL4WS)
• Less convergence here than in OGSI/WSRF
• Marketplace winners?
• 10 workflow groups spoke at Global Grid Forum (GGF) meeting
• SciFlo will use some Globus capabilities via python bindings (pyGlobus).
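As a rough illustration of handing an operator DAG to DAGMan, the sketch below renders a DAGMan input file from a dependency table. `JOB` and `PARENT ... CHILD` are DAGMan's own keywords; the job names, submit file names, and the `to_dagman` helper are made up for the example, and nothing here reflects how Pegasus or SciFlo actually generate their DAGs.

```python
def to_dagman(jobs, parents):
    """Render a DAGMan input file as text.

    jobs:    job name -> Condor submit description file
    parents: child job name -> list of parent job names
    """
    lines = [f"JOB {name} {submit_file}" for name, submit_file in jobs.items()]
    for child, ups in parents.items():
        if ups:  # jobs without parents need no PARENT line
            lines.append(f"PARENT {' '.join(ups)} CHILD {child}")
    return "\n".join(lines) + "\n"


# Hypothetical three-job flow: two independent steps feeding a merge.
jobs = {"subset": "subset.sub", "regrid": "regrid.sub", "merge": "merge.sub"}
parents = {"merge": ["subset", "regrid"]}
dag_text = to_dagman(jobs, parents)
```

The resulting text could be written to a `.dag` file and submitted with `condor_submit_dag`.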

Page 58: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Semantic Web

History
• DAML: DARPA Agent Markup Language
• OWL: Web Ontology Language (from DAML+OIL)
• Numerous inference engines & ontologies being developed

Semantics for Web Services
• OWL-S: OWL-based Web service ontology
• Describe properties & capabilities in computer-interpretable form (beyond WSDL).

SciFlo Semantics & Inference
• Use WSDL+ & OWL-S to describe local operators (executables), remote services, & grid computing jobs.
• Discover & select operators to fill in missing steps in a dataflow.

Page 59: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

SciFlo Engine

iEarth Vision will be enabled by the open-source SciFlo Engine.

Automate large-scale, multi-instrument science processing by authoring a dataflow document that specifies a tree of executable operators.
• iEarth Visual Authoring Tool
• Distributed Dataflow Execution Engine
• Move operators (executables) to the data.
• Built-in reusable operators provided for many tasks such as subsetting, co-registration, regridding, data fusion, etc.
• Custom operators easily plugged in by scientists.
• Leverage convergence of Web Services (SOAP) with Grid Services (Globus v3.2).

Hierarchical namespace of objects, types, & operators.
• sciflo.data.EOS.AIRS.L2.atmosphericParameters
• sciflo.operator.EOS.coregistration.PointToSwath

Page 60: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

SciFlo Document

Page 61: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Page 62: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Elaborating Workflow Documents

• Abstract (skeleton) Workflow is more easily authored.
• Trivial data format & unit conversions auto-inserted.
• As toolbox of known & reliable operators grows, even complex ops like regridding become trivial.
• Could use other backends if desired (BPEL, DAGMan).
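A minimal sketch of what auto-inserting a unit conversion could look like, assuming each step declares the unit it consumes and produces. The converter registry, the operator names, and the `elaborate` helper are all hypothetical; the deck does not describe SciFlo's actual elaboration rules.

```python
# Hypothetical registry: (unit produced, unit needed) -> converter operator name.
CONVERTERS = {
    ("hPa", "Pa"): "convert.hPaToPa",
    ("degC", "K"): "convert.celsiusToKelvin",
}


def elaborate(abstract_flow):
    """Return a concrete flow with conversion steps inserted where units differ.

    Each step is a dict with 'name', 'in_unit', and 'out_unit'.
    """
    concrete = []
    for step in abstract_flow:
        if concrete:
            produced = concrete[-1]["out_unit"]
            needed = step["in_unit"]
            if produced != needed:
                if (produced, needed) not in CONVERTERS:
                    raise ValueError(
                        f"no known operator converts {produced} to {needed}")
                concrete.append({
                    "name": CONVERTERS[(produced, needed)],
                    "in_unit": produced,
                    "out_unit": needed,
                })
        concrete.append(step)
    return concrete


# The author writes only the two science steps; the conversion is filled in.
abstract = [
    {"name": "readProfile", "in_unit": None, "out_unit": "hPa"},
    {"name": "regrid", "in_unit": "Pa", "out_unit": "Pa"},
]
concrete = elaborate(abstract)
```

Raising an error when no converter is registered keeps the engine from silently running a flow with mismatched units.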

Page 63: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Distributed Computing Using SciFlo

Inject data query or flow execution request into SciFlo network from any node.

[Diagram labels: SciFlo Server at JPL; Front-end Receives Requests; Queue of Pending Flows; Master Server Plans Execution; Parallel Execution Engine; SOAP (3 links); SciFlo Server at Data Portal; Redirect Flow; GENESIS ESIP; Co-registration SOAP Call; Goddard Data Subsetting; SciFlo Server at Langley; SOAP Query; Execute SubFlow; Grid-Enabled Computing Cluster; Condor DAGMan]

Page 64: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Data Access by Naming

Permanent Hierarchical Names (“Holy Grail”)
• Naming Authority assigned at each namespace level
• Distributed P2P namespace (P2P catalog lookup)

Proper Names
• AIRS Level2 Parameter Retrieval Dataset (granules): sciflo.data.EOS.AIRS.L2.atmosphericParameters (or metadata)
• Generic Point-To-Swath Co-registration Operator: sciflo.operator.EOS.coregistration.PointToSwath

Generic Names
• Atmospheric Temperature Data: sciflo.data.atmosphere.temperature.profile (or .grid)
• Name resolves to list of EOS datasets
• Semantics attached (3DGeoParameterGrid of temperature)
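The naming scheme can be illustrated with a small in-memory lookup. Everything in this sketch is an assumption made for illustration: the authority labels, the catalog contents (including which proper name the generic temperature name maps to), and the helper functions. A real deployment would do the lookup over the distributed P2P catalog rather than a local dict.

```python
# Hypothetical naming authorities, registered at different namespace levels.
AUTHORITIES = {
    "sciflo": "root-authority",
    "sciflo.data.EOS.AIRS": "airs-authority",
}

# Hypothetical catalog: generic name -> proper names it resolves to.
GENERIC_NAMES = {
    "sciflo.data.atmosphere.temperature.profile": [
        "sciflo.data.EOS.AIRS.L2.atmosphericParameters",
    ],
}


def naming_authority(name):
    """Return the authority for the longest registered prefix of a dotted name."""
    parts = name.split(".")
    for end in range(len(parts), 0, -1):
        prefix = ".".join(parts[:end])
        if prefix in AUTHORITIES:
            return AUTHORITIES[prefix]
    raise KeyError(f"no naming authority covers {name}")


def resolve(name):
    """Expand a generic name to proper names; a proper name resolves to itself."""
    return GENERIC_NAMES.get(name, [name])
```

For example, `naming_authority("sciflo.data.EOS.AIRS.L2.atmosphericParameters")` walks up from the full name until it finds the `sciflo.data.EOS.AIRS` entry, while an operator name falls through to the root entry.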

Page 65: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

SciFlo’s Strength Lies in Combining Many Elements into a Single Open-Source System
• Abstract XML dataflow documents translated to concrete flows.
• Parallel dataflow execution engine
• Semantic inference using XML metadata
• Move operators to the data.
• SOAP architecture, but also P2P functionality.
• Every node is both client & server; easy node replication.
• One-click installation onto server or desktop nodes.
• Initiate grid computations from your desktop.

Access data objects by naming them!
• P2P Distributed Namespace of data sources & operators

Server architecture
• Group of interacting SOAP services (replaceable modules)
• Implementation in XML, python, & C/C++ (not Java)

Strength in Numbers: Let a million nodes bloom!

Page 66: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

SciFlo Data Access (SOAP) Services

getAirsMetaData(startTime, endTime, lat, lon)
• Returns granule IDs and geolocation info. for AIRS swaths that intersect lat/lon point.

getAirsUrlsFromGranuleIds(idList or xmlString)
• Redirection server: returns URLs pointing to AIRS granules in local cache or DataPool.

getGpsMetaData(startTime, endTime, NW, SW, SE, NE)
• Returns occultation IDs and geolocation info. for GPS limb scans that are inside a lat/lon region.

getGpsProfiles(idList or xmlString)
• Returns URL pointing to netCDF file containing GPS profiles for all occultation IDs in the list.

getGpsProfile(occID)
• Returns a single GPS profile in XML format.
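To make the message exchange concrete, here is a sketch of building a SOAP 1.1 request for getAirsMetaData with the Python standard library. The operation and parameter names come from the slide; the service namespace, the ISO 8601 time format, and the `build_request` helper are assumptions, since the deck gives no WSDL or endpoint.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:sciflo"  # placeholder; the real service namespace is not given


def build_request(operation, params):
    """Serialize a SOAP envelope whose body holds one operation element."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Example values are illustrative only.
request_xml = build_request(
    "getAirsMetaData",
    {
        "startTime": "2003-01-15T00:00:00Z",  # time format is an assumption
        "endTime": "2003-01-16T00:00:00Z",
        "lat": -65.0,
        "lon": 10.0,
    },
)
```

The string would be sent as the body of an HTTP POST to the service endpoint; a SOAP toolkit would normally generate this envelope from the service's WSDL.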

Page 67: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

AIRS versus GPS Flowchart (part 1)

Page 68: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

AIRS versus GPS Flowchart (part 2)

Page 69: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

AIRS & GPS Retrievals Matchup Demo

Interface: HTML web form auto-generated from XML dataflow doc.

Input: User enters start/end time & other co-registration criteria.

Flow Execution: Calls 4 SOAP data query services & total of 15 operators on 4 computers.
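The co-registration criteria boil down to a space and time proximity test. The sketch below pairs observations that fall within a distance and time window, using the haversine great-circle formula. The observation tuples, thresholds, and function names are invented for illustration; the demo's actual criteria and operators are not specified in the slides.

```python
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def matchups(airs, gps, max_km, max_dt):
    """Pair AIRS and GPS observations that are close in both space and time.

    Observations are (id, time, lat, lon) tuples.
    """
    return [
        (a[0], g[0])
        for a in airs
        for g in gps
        if abs(a[1] - g[1]) <= max_dt
        and distance_km(a[2], a[3], g[2], g[3]) <= max_km
    ]


# Toy observations, not real granule or occultation IDs.
airs = [("A1", datetime(2003, 1, 15, 12, 0), 73.0, 10.0)]
gps = [
    ("G1", datetime(2003, 1, 15, 12, 30), 73.0, 12.0),  # near in space and time
    ("G2", datetime(2003, 1, 15, 18, 0), 73.0, 10.0),   # too late
    ("G3", datetime(2003, 1, 15, 12, 0), 0.0, 10.0),    # too far away
]
pairs = matchups(airs, gps, max_km=100.0, max_dt=timedelta(hours=1))
```

This brute-force loop compares every pair, which is fine for a handful of points; matching against full swaths would call for a spatial index or a swath-aware operator.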

Page 70: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

AIRS & GPS Retrievals Matchup Demo

Results Page: Shows status updates during execution and then final results.

Caching: Reuse intermediate data products or force recompute.

Results: Merged data in netCDF file & plots as Flash movie.
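One common way to implement "reuse or force recompute" is to key each intermediate product by a hash of the operator name and its parameters. The sketch below assumes that approach; the `ProductCache` class is hypothetical, keeps products in memory, and says nothing about how SciFlo actually caches results.

```python
import hashlib
import json


class ProductCache:
    """Cache operator outputs keyed by operator name and parameters."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def key(operator, params):
        # sort_keys makes the key independent of parameter order
        payload = json.dumps({"op": operator, "params": params}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def run(self, operator, func, params, force=False):
        """Return the cached product, computing it if missing or if force is set."""
        k = self.key(operator, params)
        if force or k not in self._store:
            self._store[k] = func(**params)
        return self._store[k]
```

Calling `run` twice with the same operator and parameters invokes the function once; passing `force=True` recomputes and overwrites the stored product.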

Page 71: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

AIRS/GPS Matchups

Page 72: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

AIRS/GPS Temperature & Water Vapor Comparison Plots

Page 73: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

AIRS/GPS Temperature & Water Vapor Comparison Plots

Page 74: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

General Features:

• High-latitude measurements (~73° N) separated by ~2 deg in longitude and ~0.5 hours

• Sharper tropopause measured by GPS. AIRS overestimates tropopause temperature by ~5 deg.

• GPS measures a wavy structure in the stratosphere that is absent in AIRS

Temperature as a function of pressure from AIRS (blue) and a coincident GPS occultation (red)

Page 75: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

AIRS/GPS Temperature Difference Statistics

Mean temperature profile for 41 matchups, latitude region -60 to -70 deg., Jan-Apr, 2003
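Difference statistics of this kind can be computed level by level once each matched pair of profiles is on a common set of pressure levels. The sketch below assumes that interpolation has already been done; the function name and the two-level toy profiles are made up and are not the values behind the plot.

```python
from statistics import mean, stdev


def difference_profile(airs_profiles, gps_profiles):
    """Per-level mean and sample standard deviation of AIRS minus GPS temperature.

    Each profile is a list of temperatures on the same pressure levels,
    and the two lists hold the matched pairs in the same order.
    """
    diffs = [
        [a - g for a, g in zip(airs, gps)]
        for airs, gps in zip(airs_profiles, gps_profiles)
    ]
    levels = zip(*diffs)  # regroup differences by pressure level
    return [(mean(level), stdev(level)) for level in levels]


# Two toy matchups with two levels each (synthetic numbers).
airs_profiles = [[220.0, 250.0], [222.0, 252.0]]
gps_profiles = [[218.0, 250.0], [219.0, 251.0]]
stats = difference_profile(airs_profiles, gps_profiles)
```

`stdev` needs at least two matchups per level; with a single pair it raises `StatisticsError`.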

Page 76: Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald Manipon, and Dominic Mazzoni Jet Propulsion Laboratory.

Summary

SciFlo’s Innovation Lies in Combining Many Elements into a Single Open-Source System
• Abstract XML dataflow documents
• Semantic inference using XML metadata
• Parallel dataflow execution engine
• Move operators to the data.
• Every node is both client & server; easy node replication.
• SOAP architecture, but also P2P functionality.
• Initiate grid computations from your desktop.

Goal: SciFlo nodes inside all Science Data Centers

Multi-Instrument Earth Science
• Instrument Cross-Comparisons
• Multi-Instrument Science Portals
• Large-scale multivariate statistical studies and verification of weather/climate models.