Mazzoni problemi entomologici del vigneto - broni 13-02-2014-r
Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald...
-
Upload
flora-wilkerson -
Category
Documents
-
view
219 -
download
0
Transcript of Architecting Scientific Data Services Using XML, SOAP, & the Grid Brian Wilson, Tom Yunck, Gerald...
Architecting Scientific Data ServicesUsing XML, SOAP, & the Grid
Brian Wilson, Tom Yunck,Gerald Manipon, and Dominic MazzoniJet Propulsion Laboratory
SOAP Web & Grid Services
Wilson, 01/04/2005 2ESIP Federation Meeting, Jan. 4-6, 2005
DataData StorageStorage AlgorithmsAlgorithmsCPUCPU
A Conceptual Grid
Virtualization
PortalPortal
“On-Demand”
Scientist
NetworksNetworks
SOAP Web & Grid Services
Wilson, 01/04/2005 3ESIP Federation Meeting, Jan. 4-6, 2005
Carbon Cycle Open Your Databases!
Support Query on and Retrieval of ALL Metadata Could support remote SQL queries on your entire database. Distributed/Local Metadata - not centralized as in ECHO Support XML formats with schemas & OWL/RDF semantics Examples: Amazon & Google SOAP Web services
Open Your Architectures – Modularity, Reuse Think in XML!! (design in XML first, not Java objects) Expose algorithms & application modules via SOAP Massive distributed computing via XML message exchange Reuse modules from OpenDAP/DODS and WMS/WCS
Web Services Choreography / Grid Dataflow Leverage WS-Coordination, -Context, -Security, etc. Leverage WS-Resource Framework (WSRF, Globus v4) Semantic Web Reasoning on top of WS substrate
Vision
SOAP Web & Grid Services
Wilson, 01/04/2005 4ESIP Federation Meeting, Jan. 4-6, 2005
Carbon Cycle Web Services
Using XML, SOAP, & WSDL Thinking in XML: REST versus SOAP Architecting Data Query/Access Services using SOAP SOAP as a Universal Interoperability Glue
Classifying Data Services Dimensions / Categories Interop between SOAP, OpenDAP, & OGC WMS/WCS Semantic Web Support Data Discovery to Inventory Level -- Enabling SOAP services
WS Standards WS-Coordination, -Notification, -Security, -ReliableMessaging Convergence of Web & Grid Standards WS-Resource Framework (WSRF) Globus Toolkit v4
Outline
SOAP Web & Grid Services
Wilson, 01/04/2005 5ESIP Federation Meeting, Jan. 4-6, 2005
Carbon Cycle 1st: Static HTML pages with Pictures!
Hyperlinks: click and jump! Easy authoring of text with graphics (HTML layout). Killer App: Having your own home page is hip. Cons: Too static, One-way communication.
2nd: Dynamic HTML with streaming audio & video Browser as an all-purpose, ubiquitous user interface. Fancy clients using embedded Java applets. Killer App: Fill out your time card on-line. Cons: Applets clunky, ease of authoring disappears,
information is still HTML (semi-structured).
3rd: SOAP-based Web Computing & Semantic Web Exchange structured data in XML format (no fragile HTML);
semantics (“meaning”) kept with data. Programmatic interfaces rather than just GUI for a human. Killer Apps: Grid Computing, automated data processing.
Three Generations of the Web
SOAP Web & Grid Services
Wilson, 01/04/2005 6ESIP Federation Meeting, Jan. 4-6, 2005
Carbon Cycle Distributed Computing by Exchange of XML Messages
Lightweight, Loosely-Coupled API Programming language independent (unlike Java RMI) Transport protocol independent
Multiple Transport Protocols Possible HTTP or HTTPS (HTTP using SSL encryption) POST Email (SMTP), Instant Messaging (MQSeries, Jabber) Store & Forward Reliable Messaging Services (Java JMS) Even Peer-to-Peer (P2P) protocols
SOAP Toolkits for all languages Apache Axis for Java, modules for python/perl, Visual C# .Net.
Web Service Description Language (WSDL) Generate call to a service automatically from WSDL document. Data types and formats expressed in XML schema. Publish services in UDDI catalogs for automated discovery.
What is Simple Object Access Protocol (SOAP)?
SOAP Web & Grid Services
Wilson, 01/04/2005 7ESIP Federation Meeting, Jan. 4-6, 2005
Carbon Cycle SOAP-based Web Computing & Semantic Web Exchange structured data in XML format (not HTML) Semantics or “meaning” kept with the data Emphasize programmatic interfaces Grid Services – WSRF and Globus Toolkit v4 Leverage many WS-* standards (Coordination, etc.)
Simple Object Access Protocol (SOAP) Distributed Computing by Exchange of XML Messages Lightweight, Loosely-Coupled API Programming language independent Multiple Transport Protocols Possible (HTTP, P2P) Web Services Description Language (WSDL) Publish Services in catalogs for automated discovery
Third Generation of the Web
SOAP Web & Grid Services
Wilson, 01/04/2005 8ESIP Federation Meeting, Jan. 4-6, 2005
Carbon Cycle New paradigm of loosely-coupled distributed computing
SOAP messaging is exploding in the business world. Used in Grid Computing: WSRF and Globus Toolkit v4 Asynchronous workflow (return results when available)
Leverage XML standards (WSDL, UDDI, WS-*) Service API is XML messages, not code. Data types and formats expressed in XML schema.
Exchanging large scientific data sets Small objects in XML format Even medium-size objects (2D grid slices) can be in XML. Large objects as references (ftp or http URL) to HDF-EOS or
netCDF binary containers (files).
Security & authentication less important in science Use HTTPS, Basic (user/password) Authentication, or WS-
Security & WS-Authentication
Why use SOAP to create scientific services?
SOAP Web & Grid Services
Wilson, 01/04/2005 9ESIP Federation Meeting, Jan. 4-6, 2005
SOAP Software Toolkits• Java (free)
– Install Java 1.4, Tomcat 4.1, and Apache Axis 1.0– Optionally install WSDL4J, UDDI4J, & WSIF from IBM– Configure Tomcat & Axis: classpaths, etc.
• Perl (free)– Install perl 5.6 & SOAP::Lite module (contains UDDI::Lite)
• Python (free)– Install python 2.2 and SOAP.py– Optionally install UDDI4py from IBM
• DotNet (not free, but cheap)
• Install Visual Studio .Net with C# or VB compilers
• Send a payment to MS
• C++ (some free libraries)– But more work required
SOAP Web & Grid Services
Wilson, 01/04/2005 10ESIP Federation Meeting, Jan. 4-6, 2005
Toolkits Do the Work• Transparent Serialization
– Java JDBC ResultSet in the server -> XML format -> Java ResultSet Object in the client
– Or ResultSet -> C# DataTable Object in a C# client– Or ResultSet -> Perl 2D array or hash in a perl client
• Custom Serialization Possible– Write a custom serializer for any object
• Automatic Deployment– Axis: Place Java code in the deployment directory and the service is
magically exposed.
• Can customize deployment using WSDD
– C# .Net: Simply annotate the function as a [Web Method]
– Perl & Python: A few lines of code.
• Security & Authentication: Use capabilities of Apache
SOAP Web & Grid Services
Wilson, 01/04/2005 11ESIP Federation Meeting, Jan. 4-6, 2005
Python Client Example
Interoperable SOAP Web Services
#!/usr/local/bin/python# soap_sql.py
from SOAPpy import SOAP#import SOAP
db = sys.stdin.readline()lines = sys.stdin.readlines()sql = string.join(lines, '\n')
server = SOAP.SOAPProxy( "https://genesis:[email protected]:80/axis/GenesisSQL.jws", namespace = "urn:Genesis_SQL", encoding = None)
print server.executeQuery(SOAP.stringType(sql), SOAP.stringType(db))
Input:jplsaccselect * from L2data where datasetid == '20010801_1350sac_g43_1p0'and datatype == ‘gpsProfile’
SOAP Web & Grid Services
Wilson, 01/04/2005 12ESIP Federation Meeting, Jan. 4-6, 2005
REST versus SOAP• REpresentational State Transfer (REST)
– Roy Fielding’s Ph.D. Thesis– Everything is an object or document with a URI.– Problems:
• Hidden state: cookies, sessions, CGI db, etc.• What about virtual datasets?• Need to generate too many URI’s.
• Service Oriented Architecture (SOA)– Everything is a service.– What about stateful services? WS-Resource
• Same Old Tradeoff– Like object-oriented programming versus functional programming (need
both).– It’s both an object and a service.
SOAP Web & Grid Services
Wilson, 01/04/2005 13ESIP Federation Meeting, Jan. 4-6, 2005
Amazon E-Commerce Services• Item Search – SOAP Service input
– <SubscriptionId>[Your SOAP key here]</><Request> <Keywords>free lunch hoagie</> <SearchIndex>Books</></Request>
• WSDL File– http://webservices.amazon.com/AWSECommerceService/
AWSECommerceService.wsdl
• Returns XML Document– Contains TotalResults count and a list of Items– Each item contains many attributes (title, authors, ASIN, etc.)
SOAP Web & Grid Services
Wilson, 01/04/2005 14ESIP Federation Meeting, Jan. 4-6, 2005
Amazon E-Commerce Services• Also support REST paradigm for request
– http://webservices.amazon.com/onca/xml?Service=AWSECommerceService&Operation=ItemSearch&SubscriptionId=[Your SOAP Key here]&Keywords=free%20lunch%20hoagie&SearchIndex=Books&ResponseGroup=Medium
• Response is same XML Document– With results tailored by ResponseGroup(s)
• Why are they opening their database up?– Allow 3rd parties to build apps on top of db– Build partnerships– Promulgate their software architecture/standard– Goal: Amazon lookup part of every business web app.– Makes sense even for commercial entities– Certainly best for scientific data archives
SOAP Web & Grid Services
Wilson, 01/04/2005 15ESIP Federation Meeting, Jan. 4-6, 2005
Example: General SQL Query Service• Using Informix Object-Relational Database
– Contains all Level 2 GPS occultation profile retrievals– Multiple parameters (temperature, water vapor pressure, refractivity, & pressure)
as a function of altitude– GeoSpatial Datablade provides 4D R-tree index on time, latitude, longitude, &
altitude.– Support “nearest” semantics (multiple hits sorted by distance)– GetNearest service & BatchGetNearest (batch of requests)– Implemented using Informix ORDBMS & Geodetic DataBlade
• SQL queries using 4D R-tree index on time, lat, lon, height
• executeQuery SOAP Service– Two input arguments
• Database to search• Text of arbitrary SQL Query
– Single output• SQL ResultSet in XML format
SOAP Web & Grid Services
Wilson, 01/04/2005 16ESIP Federation Meeting, Jan. 4-6, 2005
Output of the executeQuery Serviceversion="1.0" encoding="utf-8" ?> <?xml-stylesheet type="text/xsl" href="http://gilaan.usc.edu:80/axis/xsldir/xsl271059363124909.xsl"?> <ResultSet> <Result>
<datasetid>20010801_1350sac_g43_1p0</datasetid> <datatype>gpsProfile</datatype> <height>0.0840427222</height> <lat>6.6839553</lat> <lon>-38.501096</lon> <refractivity>359.099346</refractivity> <temperature>308.641494</temperature> <pressure>1009.2816</pressure> <wvpressure>23.3353547</wvpressure> </Result> <Result> <datasetid>20010801_1350sac_g43_1p0</datasetid> <datatype>gpsProfile</datatype> <height>0.404808732</height> <lat>6.71812814</lat> <lon>-38.513238</lon> <refractivity>346.437323</refractivity> <temperature>302.964981</temperature> <pressure>973.213908</pressure> <wvpressure>21.5868849</wvpressure> </Result> . . .
SOAP Web & Grid Services
Wilson, 01/04/2005 17ESIP Federation Meeting, Jan. 4-6, 2005
executeQuery SOAP MessageAccept: text/xmlAccept: multipart/*Content-Length: 663Content-Type: text/xml; charset=utf-8SOAPAction: "urn:Genesis_SQL#executeQuery"
<?xml version="1.0" encoding="UTF-8"?><SOAP-ENV:Envelope xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/” xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/1999/XMLSchema” xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance” SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/”> <SOAP-ENV:Body> <namesp1:executeQuery xmlns:namesp1="urn:Genesis_SQL"> <c-gensym3 xsi:type="xsd:string">select * from L2data where datasetid == '20010801_1350sac_g43_1p0’ and datatype == 'gpsProfile' </c-gensym3> <c-gensym5 xsi:type="xsd:string">jplsacc</c-gensym5> </namesp1:executeQuery> </SOAP-ENV:Body></SOAP-ENV:Envelope>
SOAP Web & Grid Services
Wilson, 01/04/2005 18ESIP Federation Meeting, Jan. 4-6, 2005
executeQueryResponse SOAP Message<?xml version="1.0" encoding="UTF-8"?><soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/1999/XMLSchema" xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"> <soapenv:Body> <ns1:executeQueryResponse soapenv:encodingStyle= "http://schemas.xmlsoap.org/soap/encoding/" xmlns:ns1="urn:Genesis_SQL"> <ns1:executeQueryReturn xsi:type="ns2:string" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ns2="http://www.w3.org/2001/XMLSchema">
[Returned XML doc here, encoded]
</ns1:executeQueryReturn> </ ns1:executeQueryResponse> </ soapenv:Body></ soapenv:Envelope>
SOAP Web & Grid Services
Wilson, 01/04/2005 19ESIP Federation Meeting, Jan. 4-6, 2005
CORBA
Specialized
DCE
DCOM
SOAP
Time
Pro
duct
ivity
/ E
ase
of U
se
XML-RPC
History of Distributed Computing
C++
Fortran
C
Object Pascal
Perl
Time
Pro
duct
ivity
/ E
ase
of U
se
Java
History of Programming Languages
Turbo Pascal
Python
Java RMI
SOAP Web & Grid Services
Wilson, 01/04/2005 20ESIP Federation Meeting, Jan. 4-6, 2005
Sad History of Function Call Interfaces Fortran – partially typed
– Positional arguments only, by address, partially typed C/Pascal – fully typed
– Added varargs for unknown number of arguments– But used pointers to return multiple results
C++– Added function/operator overloading
Java– Pointers hidden (yes!), but lost operator overloading
Perl, Python, IDL – func(arg1, arg2, *array, **hash)– Added implicit array to catch variable number of args– Added implicit hash to catch named args
XML – input & output are hierarchical XML documents– Flexible interface since can ignore unneeded tags– Hierarchy & dynamic typing imply rich interface– Use XQuery/XPath to extract/update needed tags
SOAP Web & Grid Services
Wilson, 01/04/2005 21ESIP Federation Meeting, Jan. 4-6, 2005
Thinking in XML Design the XML metadata and data representations
first. Add an ontology.– Then create objects in your favorite programming language.– Then create objects in more languages for broader use.
4DRectilinearGrid datatype – <4DRectilinearGrid>
<Latitudes> <Series>-90, 90, 2 <Longitudes> <URL>http://gsfc.nasa.gov/cgi-bin/dods/AIRS.hdf?Longitude <Altitudes> <Times> <Variable> <Name> <Unit> <MissingValueMarker> </Variable> <Grid> <URL>http://gsfc.nasa.gov/cgi-bin/dods/AIRSgranule.hdf?TAirStd </Grid></4DRectilinearGrid>
SOAP Web & Grid Services
Wilson, 01/04/2005 22ESIP Federation Meeting, Jan. 4-6, 2005
New Paradigm for Scientific Computing Use two or more channels
– XML Messaging on one– Binary data protocol (like OpenDAP) on the other
Two channels can use different transports– XML Messaging over HTTP or dark P2P– OpenDAP over LambdaRail or scalable P2P caching network
XML Messaging Channel for:– Control, Service Chaining, Workflow, Metadata Exchange– Configuration of Binary channel
Binary Data Channel(s)– Out-of-band from XML point of view– Bulk data transfer, replication, or caching– Could be multi-protocol (OpenDAP, GridFtp, LambdaGrid)
SOAP Web & Grid Services
Wilson, 01/04/2005 23ESIP Federation Meeting, Jan. 4-6, 2005
Data Query/Access Services Services Chain goes beyond simple data access
– Time & Geolocation Query yielding data inventory– Query on Quality Flags– General Metadata Query catalog satisfying conditions– Data Access - by object or granule ID and variable name– Data Slicing - as in OpenDAP– Data Subsetting – by lat, lon, alt, & time ranges– Parameter Subsetting – select only desired physical variables– Data Reformatting – choose output format as in WMS/WCS– On-Demand Grid Computations – grid diff in GraDS/DODS– Variable Bundles – external metadata from scientist– Return Composite Data Objects
• 4DRectilinearGrid, 4DCurvilinearGrid (swath)• XML & Native Binary representations
– User Composites – add custom object to data model– Semantic Support
• Use generic variable names• Reason about variables to be bundled
SOAP Web & Grid Services
Wilson, 01/04/2005 24ESIP Federation Meeting, Jan. 4-6, 2005
Classifying Data Services - Dimensions
• Services Provided Is it discovery, query, access, and/or subsetting?
• Degree of Metadata Support• Richness of the Data Model• Request Interface / Protocol• Response Interface / Protocol• Representation of Large Binary Data Objects• Transport Layers (http, GridFtp, p2p, LambdaRail)• Permanent Names & Version Support• Service Chaining• Semantic Web Support• Interoperability with other services• Maturity, Support, Open or Proprietary, Cost
SOAP Web & Grid Services
Wilson, 01/04/2005 25ESIP Federation Meeting, Jan. 4-6, 2005
Classifying Data Services (I) Services Provided
– Is it discovery, query, access, and/or subsetting?– Taxonomy on earlier slide
• Degree of Metadata Support– Time & Geolocation Query?– General Metadata Query?– Retrieve selected metadata fields?– Bind external metadata into the dataset?
• Richness of the Data Model– Supports point, swath, grid, and vector data types?– Is the data model exposed to the user?– Interoperability? (i.e., between OpenDAP, OGC, HDF-EOS)– Composite objects, like 4D geo-referenced grid?– Add user composites? (as in Java object serialization)– Compute grid averages or differences on demand? (Ferret)
SOAP Web & Grid Services
Wilson, 01/04/2005 26ESIP Federation Meeting, Jan. 4-6, 2005
Classifying Data Services (II)• Request Interface / Protocol
– REST (one-line HTTP URL’s) versus SOAP– OpenDAP & WMS/WCS use REST– REST tends to be more ad hoc than structured XML/SOAP
• Response Interface / Protocol– OpenDAP uses custom binary wire protocol– OGC WMS/WCS uses REST– REST uses HTTP with MIME types & multi-part– SOAP uses XML documents with possible attachments
• Representation for Large Binary Data Objects– OpenDAP: custom binary wire protocol– WMS/WCS: MIME multi-part embedding– SOAP: embedded, attachments, or URL references
• Binary files referenced by ftp, http, or DODS URL’s– Serialized Binary Objects (as in C++, Java, or C# .Net)
SOAP Web & Grid Services
Wilson, 01/04/2005 27ESIP Federation Meeting, Jan. 4-6, 2005
Classifying Data Services (III)• Transport Layers (http, GridFtp, p2p, LambdaRail)
– REST uses http for both request & response– SOAP can use http, smtp, Instant Messaging, Secure
Messaging, or p2p
• Degree of Modularity & Layering– Is the software separated into well-defined protocol layers?– Can functions/modules be reused separately?– Example: OpenDAP contains a rich data model, data slicing
syntax, and a binary wire protocol. Reuse each?– Clean design or somewhat ad hoc?
• Permanent Names & Version Support– Globally unique ID’s for data objects / granules– Permanent names for service endpoints– Versions for data objects & services– Is it a service or an object? (REST vs. SOAP)– It’s both. And they both need to be permanent.
SOAP Web & Grid Services
Wilson, 01/04/2005 28ESIP Federation Meeting, Jan. 4-6, 2005
Classifying Data Services (IV)• Service Chaining
– Can modules be chained together?– In REST, here one reaches limits of one-line URL’s– In SOAP, have rich world of WS Choreography or Grid
Workflow solutions– Use semantic web to discover and chain services
• Semantic Web Support– Exposes scientific domain info. for data and service
discovery?– Generic dataset and variable names?– Are the data categorized in an ontology?– Is the service categorized in an ontology?– Can the “meaning” of the service and the meaning of each of
its inputs and outputs be inferred from an ontology?– Is the service interoperable with others through semantic
mediation?
SOAP Web & Grid Services
Wilson, 01/04/2005 29ESIP Federation Meeting, Jan. 4-6, 2005
Time & Geolocation Query Service
• Inputs– StartTime, EndTime, Lat/Lon Box, DatasetName,
VariableName, MetadataGroup(s)
• Output– List of items– Each item has unique ID, start & end time, lat/lon box– More (optional) metadata groups
• Datasets– Implemented for GENESIS GPS database, and for
AIRS, MODIS, & MISR granules in DataPool.
SOAP Web & Grid Services
Wilson, 01/04/2005 30ESIP Federation Meeting, Jan. 4-6, 2005
SciFlo Data Query/Access (SOAP) Services• GeoLocationQuery(startTime, endTime, lat, lon, timeTolerance,
distanceTolerace, variable, metadataGroups)– Returns granule ID’s and geolocation info. for AIRS L2 swaths that
intersect lat/lon point or are near enough.
• GeoRegionQuery(startTime, endTime, lat/lon region, . . .)– Returns granule ID’s and geolocation info. for AIRS L2 swaths that
intersect the lat/lon region or are near enough.
• QueryByMetadata(variable, ListOfConstraints, groups)– Returns granule ID’s and selected metadata for AIRS granules that
satisfy the metadata constraint expression:(min1 <= field1 <= max1 and/or min2 <= field2 <= max2 . . .).
• GetDataById(IdList, UrlOptions)– Given list of unique ID’s, returns list of ftp, http, or DODS URL’s
pointing to the granules (files). Uses cache and redirection server.
• -- Other semantic interfaces possible
SOAP Web & Grid Services
Wilson, 01/04/2005 31ESIP Federation Meeting, Jan. 4-6, 2005
SciFlo Data Query/Access (SOAP) Services
• GetMetadata(type, groups)– Returns metadata describing scientific domain, dataset info, list of
metadata fields, generic name translation table, location of XML schema documents, location of related ontologies (OWL/RDF), etc.
• GetHelp(type, groups)– A SOAP service that itself returns help documentation describing
the available SOAP services.
• --These Services Combined Can Support:– Data Catalog, General Query, Data Access, Generic Names,– Data Discovery by Domain and Dataset Keywords– Type checking via XML schemas– Hooks to semantic web
SOAP Web & Grid Services
Wilson, 01/04/2005 32ESIP Federation Meeting, Jan. 4-6, 2005
Data Discovery to Inventory Level• SOAP Services just described provide:
– Domain description Metadata
– Dataset Info. and Keywords
– Generic VariableName Lookup Table
– Query by Metadata or just Time & GeoRegion
• Layer more semantics on top of these services– Dataset Discovery by keyword search
– Bind in additional metadata provided by scientist
– Tie ontologies into keywords and genericVariableNames
– Semantic Inference: AIRS and MODIS both provide atmosphericTemperature and cloudProperties. Hmmm. Compare them.
– XML/SOAP provides substrate; possibilities only limited by ontology development.
SOAP Web & Grid Services
Wilson, 01/04/2005 33ESIP Federation Meeting, Jan. 4-6, 2005
Carbon Cycle
Evolving Grid Computing Standards
[From Globus Toolkit “Ecosystem”presentation at GGF11 by Lee Liming]
GT4
SOAP Web & Grid Services
Wilson, 01/04/2005 34ESIP Federation Meeting, Jan. 4-6, 2005
Carbon Cycle History of Scientific Computing as a Utility
The Grid began as effort to tightly couple multiple super- or cluster computers together (e.g., Globus Toolkit v1 & v2).
Needed job scheduling, submission, monitoring, steering, etc. SETI@HOME success
OGSI: Open Grid Services Infrastructure Globus v3.2 is open-source implementation using Java/C. A service is Grid-enabled by inheriting from Java class. Standard is complex and growing. Challenge: Ease of installation & use.
WS-Resource Framework (WSRF) Capabilities treated as storage or computing resources exposed
on the web.
Globus Toolkit v4 coming
Evolving Grid Computing Standards
SOAP Web & Grid Services
Wilson, 01/04/2005 35ESIP Federation Meeting, Jan. 4-6, 2005
Carbon Cycle
Evolving Grid Computing Standards
[From Globus Toolkit “Ecosystem”presentation at GGF11 by Lee Liming]
GT4
SOAP Web & Grid Services
Wilson, 01/04/2005 36ESIP Federation Meeting, Jan. 4-6, 2005
Grid Applications
WebBrowser
ComputeServer
DataCatalog
DataViewer
Tool
Certificateauthority
ChatTool
CredentialRepository
WebPortal
ComputeServer
Resources implement standard access & management interfaces
Collective services aggregate &/or
virtualize resources
Users work with client applications
Application services organize VOs & enable
access to other services
Databaseservice
Databaseservice
Databaseservice
SimulationTool
Camera
Camera
TelepresenceMonitor
RegistrationService
A
B
C
D
E
Application Developer
9
Off the Shelf 13
Globus Toolkit
0
Grid Community
0
SOAP Web & Grid Services
Wilson, 01/04/2005 37ESIP Federation Meeting, Jan. 4-6, 2005
Grid Apps Using Globus Middleware
WebBrowser
ComputeServer
GlobusMCS/RLS
DataViewer
Tool
CertificateAuthority
CHEF ChatTeamlet
MyProxy
CHEF
ComputeServer
Resources implement standard access & management interfaces
Collective services aggregate &/or
virtualize resources
Users work with client applications
Application services organize VOs & enable
access to other services
Databaseservice
Databaseservice
Databaseservice
SimulationTool
Camera
Camera
TelepresenceMonitor
Globus IndexService
GlobusGRAM
GlobusGRAM
GlobusDAI
GlobusDAI
GlobusDAI
Application Developer
2
Off the Shelf 9
Globus Toolkit
4
Grid Community
4
SOAP Web & Grid Services
Wilson, 01/04/2005 38ESIP Federation Meeting, Jan. 4-6, 2005
Components in Globus Toolkit 3.2
GSI
WS-Security
CAS(OGSI)
SimpleCA
Data Managemen
tSecurity
WSCore
Resource Managemen
t
Information Services
RFT(OGSI)
RLS
OGSI-DAI
WU GridFTP
XIO
JAVAWS Core(OGSI)
OGSI C Bindings
MDS2
WS-Index(OGSI)
Pre-WSGRAM
WS GRAM(OGSI)
OGSI Python Bindings
(contributed)
pyGlobus(contributed)
SOAP Web & Grid Services
Wilson, 01/04/2005 39ESIP Federation Meeting, Jan. 4-6, 2005
Planned Components in GT 4.0
GSI
WS-Security
CAS(WSRF)
SimpleCA
Data Managemen
tSecurity
WSCore
Resource Managemen
t
Information Services
Authz Framework
RFT(WSRF)
RLS
OGSI-DAI
New GridFTP
XIO
JAVAWS Core(WSRF)
C WS Core(WSRF)
MDS2
WS-Index(WSRF)
Pre-WSGRAM
WS-GRAM(WSRF)
CSF(contribution)
pyGlobus(contributed)
SOAP Web & Grid Services
Wilson, 01/04/2005 40ESIP Federation Meeting, Jan. 4-6, 2005
SOAP Web & Grid Services
Wilson, 01/04/2005 41ESIP Federation Meeting, Jan. 4-6, 2005
WS-* Alphabet Soup
SOAP Web & Grid Services
Wilson, 01/04/2005 42ESIP Federation Meeting, Jan. 4-6, 2005
WS Architecture
SOAP Web & Grid Services
Wilson, 01/04/2005 43ESIP Federation Meeting, Jan. 4-6, 2005
WS-Resource Framework Model
SOAP Web & Grid Services
Wilson, 01/04/2005 44ESIP Federation Meeting, Jan. 4-6, 2005
Using a Web Service to Access a WS-Resource
SOAP Web & Grid Services
Wilson, 01/04/2005 45ESIP Federation Meeting, Jan. 4-6, 2005
Overview of WS-Coordination
SOAP Web & Grid Services
Wilson, 01/04/2005 46ESIP Federation Meeting, Jan. 4-6, 2005
Carbon Cycle
Evolving Grid Computing Standards
[From Globus Toolkit “Ecosystem”presentation at GGF11 by Lee Liming]
GT4
SOAP Web & Grid Services
Wilson, 01/04/2005 47ESIP Federation Meeting, Jan. 4-6, 2005
Compare/Contrast Template• Services Provided• Implemented Standards / Protocols• Inputs, Outputs, & Parameters• Metadata• Maturity
– Number of Users, Data Volume• Support
– Level, Type, Availability• Open / Proprietary
– Open? Proprietary? Published? Costs?• Activities
– Interoperability?• Future Plans
– Services to be Implemented
SOAP Web & Grid Services
Wilson, 01/04/2005 48ESIP Federation Meeting, Jan. 4-6, 2005
Services Provided by WS• Distributed Computing via XML Message Exchange• Service Oriented Architecture (SOA)• Many transport protocols supported (http, smtp,
messaging, p2p)• Security, Authentication, Authorization• Reliable Messaging• Notification• Transactions• Stateful Resources – creation, lifetime (WSRF)
• WS Choreography• Grid Workflow
SOAP Web & Grid Services
Wilson, 01/04/2005 49ESIP Federation Meeting, Jan. 4-6, 2005
Implemented Standards / Protocols• XML schema• SOAP, WSDL, UDDI• WS-Resource Framework
• WS-ResourceProperties, -ResourceLifetime, -BaseFaults• WS-Addressing (endpoints with state pointers)• WS-Notification (publish & subscribe)• WS-Security, -Policy, -SecurityPolicy• WS-Transaction, -ReliableMessaging• WS-Coordination, -Context (choreography)
• BPEL4WS (IBM workflow engine)• SciFlo Engine coming
SOAP Web & Grid Services
Wilson, 01/04/2005 50ESIP Federation Meeting, Jan. 4-6, 2005
Inputs, Outputs, & Parameters • Too numerous to list.• See SOAP and WS-* specifications.
SOAP Web & Grid Services
Wilson, 01/04/2005 51ESIP Federation Meeting, Jan. 4-6, 2005
Metadata • See WS-* specifications.• ??
SOAP Web & Grid Services
Wilson, 01/04/2005 52ESIP Federation Meeting, Jan. 4-6, 2005
Maturity • SOAP, WSDL, UDDI are mature but still evolving.
• Latest versions: SOAP v1.2, WSDL v2.0, UDDI v3
• Many WS-* specifications are just being finalized.• WSRF and Globus Toolkit v4 coming soon.
SOAP Web & Grid Services
Wilson, 01/04/2005 53ESIP Federation Meeting, Jan. 4-6, 2005
Support • From numerous companies.
• IBM Websphere, etc.• ??
SOAP Web & Grid Services
Wilson, 01/04/2005 54ESIP Federation Meeting, Jan. 4-6, 2005
Open / Proprietary • Both• WS-* are standards, with interop. features.• But companies provide added value and push
changes in standards.• Already WS stacks and development environments
for sale.
SOAP Web & Grid Services
Wilson, 01/04/2005 55ESIP Federation Meeting, Jan. 4-6, 2005
Future Plans • Interoperability testing is proceeding.• More WS-* standards and versions will emerge.
SOAP Web & Grid Services
Wilson, 01/04/2005 56ESIP Federation Meeting, Jan. 4-6, 2005
Services Provided by SciFlo• Data Query / Access (SOAP services)
• GPS Occultation Database (atmospheric retrievals)• AIRS, MODIS, & MISR L2/L3 granules
• Grid Workflow Engine– Abstract XML dataflow documents translated to concrete
flows, with missing steps inserted automatically.– Parallel dataflow execution engine (scheduler)– Semantic inference using XML metadata– Operators can be SOAP services, local scripts, or local
native executables.– Move authenticated operators/executables to the data.– SOAP architecture, but also P2P functionality.– Every node is both client & server; easy node replication.– One-click installation onto server or desktop nodes.– Support Linux & Windows.– Initiate grid computations from your desktop.
SOAP Web & Grid Services
Wilson, 01/04/2005 57ESIP Federation Meeting, Jan. 4-6, 2005
Carbon Cycle Grid:
Schedule & submit cluster computing jobs Operator tree is a Directed Acyclic Graph (DAG) CONDOR, CONDOR-G, DAGMan Globus Alliance Standards: GSI, GRAM, MDS, RLS, XIO, etc. Chimera -> Pegasus -> DAGMan -> Executing Grid Job
Web: Several web choreography standards IBM’s Business Process Execution Language (BPEL4WS)
Less convergence here than in OGSI/WSRF Marketplace winners? 10 workflow groups spoke at Global Grid Forum (GGF) meeting Sciflo will use some Globus capabilities via python bindings
(pyGlobus).
Dataflow / Workflow Engines
SOAP Web & Grid Services
Wilson, 01/04/2005 58ESIP Federation Meeting, Jan. 4-6, 2005
Carbon Cycle History
DAML: DARPA Agent Markup Language OWL: Ontology Web Language (from DAML+OIL) Numerous inference engines & ontologies being developed
Semantics for Web Services OWL-S: OWL-based Web service ontology Describe properties & capabilities in computer-interpretable form
(beyond WSDL).
SciFlo Semantics & Inference Use WSDL+ & OWL-S to describe local operators (executables),
remote services, & grid computing jobs. Discover & select operators to fill in missing steps in a dataflow.
Semantic Web
SOAP Web & Grid Services
Wilson, 01/04/2005 59ESIP Federation Meeting, Jan. 4-6, 2005
Carbon Cycle iEarth Vision will be enabled by the open-source SciFlo
Engine.
Automate large-scale, multi-instrument science processing by authoring a dataflow document that specifies a tree of executable operators.
iEarth Visual Authoring Tool Distributed Dataflow Execution Engine Move operators (executables) to the data. Built-in reusable operators provided for many tasks such as
subsetting, co-registration, regridding, data fusion, etc. Custom operators easily plugged in by scientists. Leverage convergence of Web Services (SOAP) with Grid
Services (Globus v3.2).
Hierarchical namespace of objects, types, & operators. sciflo.data.EOS.AIRS.L2.atmosphericParameters sciflo.operator.EOS.coregistration.PointToSwath
SciFlo Engine
SOAP Web & Grid Services
Wilson, 01/04/2005 60ESIP Federation Meeting, Jan. 4-6, 2005
SciFlo Document
SOAP Web & Grid Services
Wilson, 01/04/2005 61ESIP Federation Meeting, Jan. 4-6, 2005
SOAP Web & Grid Services
Wilson, 01/04/2005 62ESIP Federation Meeting, Jan. 4-6, 2005
Abstract (skeleton) Workflow is more easily authored.
Trivial data format & unit conversions auto-inserted.
As toolbox of known & reliable operators grows, even complex ops like regridding become trivial.
Could use other backends if desired (BPEL, DAGMan).
Elaborating Workflow Documents
SOAP Web & Grid Services
Wilson, 01/04/2005 63ESIP Federation Meeting, Jan. 4-6, 2005
Carbon Cycle
Distributed Computing Using SciFlo
Inject data query or flow execution requestinto SciFlo network from any node.
Front-endReceives Requests
Queue ofPending Flows
SOAP
Master ServerPlans
Execution
SciFlo Server at JPL
Parallel ExecutionEngine
SOAP
SOAP
SciFlo ServerAt Data Portal
RedirectFlow
GENESIS ESIP
Co-registrationSOAP Call
Goddard DataSubsetting
SciFlo ServerAt Langley
SOAPQuery
Execute
Grid-EnabledComputing
Cluster
CondorDAGMan
SubFlow
SOAP Web & Grid Services
Wilson, 01/04/2005 64ESIP Federation Meeting, Jan. 4-6, 2005
Carbon Cycle Permanent Hierarchical Names (“Holy Grail”)
Naming Authority assigned at each namespace level Distributed P2P namespace (P2P catalog lookup)
Proper Names AIRS Level2 Parameter Retrieval Dataset (granules):
sciflo.data.EOS.AIRS.L2.atmosphericParameters (or metadata) Generic Point-To-Swath Co-registration Operator:
sciflo.operator.EOS.coregistration.PointToSwath
Generic Names Atmospheric Temperature Data:
sciflo.data.atmosphere.temperature.profile (or .grid)
Name resolves to list of EOS datasets
Semantics attached (3DGeoParameterGrid of temperature)
Data Access by Naming
SOAP Web & Grid Services
Wilson, 01/04/2005 65ESIP Federation Meeting, Jan. 4-6, 2005
Carbon Cycle
SciFlo’s Strength Lies in Combining Many Elements into a Single Open-Source System
Abstract XML dataflow documents translated to concrete flows. Parallel dataflow execution engine Semantic inference using XML metadata Move operators to the data. SOAP architecture, but also P2P functionality. Every node is both client & server; easy node replication. One-click installation onto server or desktop nodes. Initiate grid computations from your desktop.
Access data objects by naming them! P2P Distributed Namespace of data sources & operators
Server architecture Group of interacting SOAP services (replaceable modules) Implementation in XML, python, & C/C++ (not Java)
Strength in Numbers: Let a million nodes bloom!
SOAP Web & Grid Services
Wilson, 01/04/2005 66ESIP Federation Meeting, Jan. 4-6, 2005
Carbon Cycle getAirsMetaData(startTime, endTime, lat, lon)
Returns granule ID’s and geolocation info. for AIRS swaths that intersect lat/lon point.
getAirsUrlsFromGranuleIds(idList or xmlString) Redirection server – Returns URLs pointing to AIRS granules in
local cache or DataPool.
getGpsMetaData(startTime, endTime, NW, SW, SE, NE) Returns occultation ID’s and geolocation info. for GPS limb scans
that are inside a lat/lon region.
getGpsProfiles(idList or xmlString) Returns URL pointing to netCDF file containing GPS profiles for
all occultation ID’s in the list.
getGpsProfile(occID) Returns a single GPS profile in XML format.
SciFlo Data Access (SOAP) Services
SOAP Web & Grid Services
Wilson, 01/04/2005 67ESIP Federation Meeting, Jan. 4-6, 2005
AIRS versus GPS Flowchart (part 1)
SOAP Web & Grid Services
Wilson, 01/04/2005 68ESIP Federation Meeting, Jan. 4-6, 2005
AIRS versus GPS Flowchart (part 2)
SOAP Web & Grid Services
Wilson, 01/04/2005 69ESIP Federation Meeting, Jan. 4-6, 2005
AIRS & GPS Retrievals Matchup Demo
Interface: HTML web form auto- generated from XML dataflow doc.
Input: User enters start/end time & other co-registration criteria.
Flow Execution: Calls 4 SOAP data query services & total of 15 operators on 4 computers.
SOAP Web & Grid Services
Wilson, 01/04/2005 70ESIP Federation Meeting, Jan. 4-6, 2005
AIRS & GPS Retrievals Matchup Demo
Results Page: Shows status updates during execution and then final results.
Caching: Reuse intermediate data products or force recompute.
Results: Merged data in netCDF file & plots as Flash movie.
SOAP Web & Grid Services
Wilson, 01/04/2005 71ESIP Federation Meeting, Jan. 4-6, 2005
AIRS/GPS Matchups
SOAP Web & Grid Services
Wilson, 01/04/2005 72ESIP Federation Meeting, Jan. 4-6, 2005
AIRS/GPS Temperature & Water Vapor Comparison Plots
SOAP Web & Grid Services
Wilson, 01/04/2005 73ESIP Federation Meeting, Jan. 4-6, 2005
AIRS/GPS Temperature & Water Vapor Comparison Plots
SOAP Web & Grid Services
Wilson, 01/04/2005 74ESIP Federation Meeting, Jan. 4-6, 2005
General Features:
• High latitude measurements (~73N deg) separated by ~2 deg in longitude and ~0.5 hours
• Sharper tropopause measured by GPS. AIRS overestimates tropopause temperature by ~5 deg.
• GPS measure a wavy structure in the stratosphere which is absent in AIRS
Temperature as a function of pressure from AIRS (blue) and a coincident GPS occultation (red)
SOAP Web & Grid Services
Wilson, 01/04/2005 75ESIP Federation Meeting, Jan. 4-6, 2005
AIRS/GPS Temperature Difference Statistics
Mean temperature profilefor 41 matchups, latitude
region -60 to -70 deg.,Jan-Apr, 2003
SOAP Web & Grid Services
Wilson, 01/04/2005 76ESIP Federation Meeting, Jan. 4-6, 2005
Carbon Cycle SciFlo’s Innovation Lies in Combining Many Elements
into a Single Open-Source System Abstract XML dataflow documents Semantic inference using XML metadata Parallel dataflow execution engine Move operators to the data. Every node is both client & server; easy node replication. SOAP architecture, but also P2P functionality. Initiate grid computations from your desktop.
Goal: SciFlo nodes inside all Science Data Centers
Multi-Instrument Earth Science Instrument Cross-Comparisons Multi-Instrument Science Portals Large-scale multivariate statistical studies and verification of
weather/climate models.
Summary