Building Science Gateways


Page 1: Building Science Gateways

Building Science Gateways

Marlon Pierce

Community Grids Laboratory

Indiana University

Page 2: Building Science Gateways

What Is a Web Portal?

Web container that aggregates content from multiple sources into a single display.
“Start Pages”
  Typically consume RSS/Atom news feeds.
  More powerful versions these days support Flickr, calendars, games, etc.
  Gadgets, widgets
Examples: iGoogle, Netvibes, My Yahoo!

Page 3: Building Science Gateways

Grid Computing Overview

Grid computing software is designed to integrate large supercomputing facilities.
  TeraGrid, Open Science Grid, EGEE, etc.
  This is done via network services.
Key Service Components
  Authentication and authorization framework (MyProxy)
  Remote process access and control (GRAM, Condor)
  Remote file, I/O access (GridFTP)
Additional Services
  Information services, replica management, database federation, storage management, schedulers, etc.
Example Grid Software Stacks: CTSS and VDT

Page 4: Building Science Gateways

TeraGrid Supercomputing Resources (GPIR)

Page 5: Building Science Gateways

Science Portals and Gateways

Science Gateways adapt Web portal technology to build user interfaces to the Grid.
Science portals resemble standard portals, but must also
  Support access to computing and storage resources.
  Allow users remote, Unix-like access to these resources.
  Provide access to science applications and data sets.
And we must provide value-added services as well as user interfaces.

Page 6: Building Science Gateways

My 2002 “octopus” SOA diagram, from the archives.

Diagram labels: Browser Interface; HTTP(S); Portlets + Client Stubs; SOAP/HTTP; WSDL (on each service); DB Service; JDBC; DB; Job Sub/Mon and File Services; Operating and Queuing Systems; Visualization Service; Host 1, Host 2, Host 3.

Page 7: Building Science Gateways

Terminology

Portlet: a standard Java component that generates HTML and can also act as a client to a remote service.
  Lives in a portal container.
  I will also use this term generically.
Web Service: a remotely invocable function on the Internet.
  SOAP: the XML message envelope for carrying commands over HTTP.
  WSDL: describes the service’s API in XML.
  REST: a variation of this approach.
Lots more info: http://grids.ucs.indiana.edu/ptliupages/presentations/I590WebService.ppt

Page 8: Building Science Gateways

But Why?

Three-tiered Service Oriented Architecture is the network equivalent of the famous Model-View-Controller design pattern.
  View: the user interface components.
  Controller: Web service middleware.
  Model: the backend resources.
Independence of tiers gives flexibility.
  Services can be reused with alternative user interfaces.
    Workflow composers like Taverna
  User interfaces can work with different service implementations.
Drawback: reliability and robustness are issues.

Page 9: Building Science Gateways

Two Approaches to the Middle Tier

Diagram comparing a fat client and a thin client. Labels: Portal Client, Grid Client, Web Service, Grid Service, Backend Resource; links labeled Grid Protocol (SOAP) and HTTP + SOAP.

Page 10: Building Science Gateways

Disloc output converted to KML and plotted.

Page 11: Building Science Gateways

GeoFEST Finite Element Modeling portlet and plotting tools

Page 12: Building Science Gateways

What’s In the Screenshots?

GeoFEST and Disloc Portlets
  Live on gf7.ucs.indiana.edu
  Manage the user’s display: Web forms, links to output, graphics.
  Save user session state persistently.
QuakeTables Fault DB Web Service
  Lives on gf2.ucs.indiana.edu
  Contains geometric fault models.
GeoFEST and Disloc Execution Web Services
  Live on gf19.ucs.indiana.edu
  Generate input files from fault models.
  Run and manage codes.

Page 13: Building Science Gateways

Best Practice for Scientific Web Services

There are many tools to choose from.
  .NET, Apache Axis, Sun WS, Ruby on Rails, etc.
Make them self-contained.
  If possible, generate input files within the service.
  Or have an input file generating service.
  Remember that they may be used by other people with other client tools.
Communicate data files with URLs.
Be very careful about exposing the state of the service.
  Don’t assume persistent connections.
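Two of the practices above (exchange data files as URLs, and do not depend on a persistent connection) can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class `UrlJobServiceSketch`, the `JobTicket` type, and the `http://example.org/results/` base location are hypothetical names invented here, not part of QuakeSim, OGCE, or any toolkit named in these slides.

```java
import java.net.URI;
import java.util.UUID;

/** Hypothetical sketch: a job service that exchanges data by URL and keeps no connection open. */
final class UrlJobServiceSketch {

    /** Immutable reply holding everything a client needs to come back later. */
    static final class JobTicket {
        final String jobId;
        final URI inputUrl;
        final URI outputUrl;

        JobTicket(String jobId, URI inputUrl, URI outputUrl) {
            this.jobId = jobId;
            this.inputUrl = inputUrl;
            this.outputUrl = outputUrl;
        }
    }

    // Assumed location where the service would publish result files.
    private static final URI OUTPUT_BASE = URI.create("http://example.org/results/");

    /** Takes the input file as a URL and returns a ticket instead of streaming results back. */
    static JobTicket submit(String code, URI inputUrl) {
        String scheme = inputUrl.getScheme();
        if (!"http".equals(scheme) && !"https".equals(scheme)) {
            throw new IllegalArgumentException("input must be an http(s) URL: " + inputUrl);
        }
        String jobId = code + "-" + UUID.randomUUID();
        // The client polls or fetches this URL later; no session state is shared with it.
        URI outputUrl = OUTPUT_BASE.resolve(jobId + "/output.txt");
        return new JobTicket(jobId, inputUrl, outputUrl);
    }

    public static void main(String[] args) {
        JobTicket ticket = submit("disloc", URI.create("http://example.org/faults/model1.txt"));
        System.out.println(ticket.jobId + " -> " + ticket.outputUrl);
    }
}
```

Because the reply carries only an identifier and URLs, any client tool (a portlet, a workflow composer, a script) can pick up the result without knowing how the service stores its internal state.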

Page 14: Building Science Gateways

Components for Portals

Open Grid Computing Environments

Examples. See http://www.collab-ogce.org/

Page 15: Building Science Gateways

Components for Science Portals

OGCE is founded on the principle that portals should be built out of reusable parts.

Key standard in our first phase: the JSR 168 portlet specification.

Portlets can run in multiple containers: uPortal, Sakai, GridSphere, Liferay, etc.

Allows us to build Grid-specific components and deploy them alongside other goodies: Sakai collaboration tools, contributed portlets, etc.

Future: OpenSocial-compliant Google Gadgets

Page 16: Building Science Gateways

OGCE GPIR portlet can interoperate with TeraGrid and your own GPIR services.

Page 17: Building Science Gateways

Manage TeraGrid MyProxy credentials with the OGCE ProxyManager portlets.

Page 18: Building Science Gateways

OGCE file management client portlets interact with TeraGrid GridFTP servers.

Page 19: Building Science Gateways

General-purpose batch and interactive job submission to GRAM and WS-GRAM is supported.

Page 20: Building Science Gateways

Dashboard Portlet

The dashboard portlet allows users to track jobs on the selected resource. The user can view either his own set of jobs or get information on all submitted jobs.

Page 21: Building Science Gateways
Page 22: Building Science Gateways

Queue forecasting portlets work with the NWS QBETS to predict wait times and deadlines.

Page 23: Building Science Gateways

PURSe portlets manage user requests for portal accounts and Grid credentials.

Page 24: Building Science Gateways

Condor and Condor-G

Page 25: Building Science Gateways

OGCE IFrame Portlet can be used to integrate external sites.

Page 26: Building Science Gateways

Client Libraries for Grid Computing

Page 27: Building Science Gateways

Two Major Grid Client Efforts

The Java CoG Kit
  Supports several versions of Globus and SSH.
Condor-G
  Has a Web Service interface (BirdBath) and Java client libraries.
  Supports Globus (v2 and v4) and several other Grid middleware systems.
You can build either portlets or Web services with either of these.
  OGCE portlets primarily use CoG.
  We prefer Condor-G based Web services for long-running jobs.

Page 28: Building Science Gateways

CoG Abstraction Layers

Diagram labels: Applications (Nanomaterials, Bio-Informatics, Disaster Management, Portals); Development Support (CoG GridIDE); CoG Gridfaces Layer; CoG Data and Task Management Layer; CoG Abstraction Layer; CoG provider modules for GT2, GT3(X), GT4 WS-RF, Condor, Unicore, SSH, and others.

Page 29: Building Science Gateways

Class diagram labels: Task, TaskHandler, Service, TaskSpecification, SecurityContext, ServiceContact.

The class diagram is the same for all grid tasks (running jobs, modifying files, moving data).

Classes also abstract toolkit provider differences. You set these as parameters: GT2, GT4, etc.
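The idea that one set of classes covers every kind of task, with the toolkit chosen by a parameter, can be illustrated with a small Java sketch. Everything below is hypothetical: `TaskAbstractionSketch`, its `Task` fields, and the `TaskHandler` interface are invented for illustration and are NOT the Java CoG Kit API. The security context is left out for brevity, and the handlers are stand-ins that only format a string.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of provider-neutral task classes; NOT the Java CoG Kit API. */
final class TaskAbstractionSketch {

    /** Describes what to do, independent of the toolkit that will do it. */
    static final class Task {
        final String type;           // e.g. "job" or "file-transfer"
        final String specification;  // e.g. the executable to run
        final String serviceContact; // the host the task targets
        final String provider;       // toolkit name, e.g. "GT2" or "GT4"

        Task(String type, String specification, String serviceContact, String provider) {
            this.type = type;
            this.specification = specification;
            this.serviceContact = serviceContact;
            this.provider = provider;
        }
    }

    /** One handler per toolkit provider hides the toolkit-specific calls. */
    interface TaskHandler {
        String submit(Task task);
    }

    private final Map<String, TaskHandler> handlers = new HashMap<>();

    void register(String provider, TaskHandler handler) {
        handlers.put(provider, handler);
    }

    /** Dispatches on the provider parameter; calling code never changes. */
    String submit(Task task) {
        TaskHandler handler = handlers.get(task.provider);
        if (handler == null) {
            throw new IllegalArgumentException("no handler for provider " + task.provider);
        }
        return handler.submit(task);
    }

    public static void main(String[] args) {
        TaskAbstractionSketch sketch = new TaskAbstractionSketch();
        // Stand-in handlers: real ones would call the corresponding toolkit.
        sketch.register("GT2", t -> "GT2 " + t.type + " on " + t.serviceContact);
        sketch.register("GT4", t -> "GT4 " + t.type + " on " + t.serviceContact);
        Task ls = new Task("job", "/bin/ls", "cobalt.ncsa.teragrid.org", "GT4");
        System.out.println(sketch.submit(ls));
    }
}
```

Switching the last constructor argument from "GT4" to "GT2" is the only change needed to target a different toolkit, which is the point of setting the provider as a parameter.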

Page 30: Building Science Gateways

Coupling CoG Tasks

The CoG abstractions also simplify creating coupled tasks.
Tasks can be assembled into task graphs with dependencies.
  “Do Task B after successful Task A”
Graphs can be nested.
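A task graph with "run B only after A succeeds" semantics, including nesting, can be sketched as follows. This is a minimal illustration, assuming each task is just a function that reports success or failure. The class `TaskGraphSketch` and its methods are hypothetical and are NOT the CoG Kit's task graph classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of a dependency-ordered task graph; NOT the Java CoG Kit API. */
final class TaskGraphSketch implements BooleanSupplier {

    private final Map<String, BooleanSupplier> tasks = new LinkedHashMap<>();
    private final Map<String, List<String>> dependsOn = new HashMap<>();
    private final List<String> log = new ArrayList<>();

    /** Adds a task that runs only after all named prerequisites have succeeded. */
    TaskGraphSketch add(String name, BooleanSupplier work, String... prerequisites) {
        tasks.put(name, work);
        dependsOn.put(name, Arrays.asList(prerequisites));
        return this;
    }

    /** Runs the whole graph; true only if every task ran and succeeded. */
    @Override
    public boolean getAsBoolean() {
        log.clear();
        Map<String, Boolean> result = new HashMap<>();
        boolean allOk = true;
        for (String name : tasks.keySet()) {
            allOk &= run(name, result, new ArrayList<>());
        }
        return allOk;
    }

    private boolean run(String name, Map<String, Boolean> result, List<String> path) {
        if (result.containsKey(name)) {
            return result.get(name); // already ran (or was skipped) earlier
        }
        if (path.contains(name)) {
            throw new IllegalStateException("dependency cycle at " + name);
        }
        if (!tasks.containsKey(name)) {
            throw new IllegalArgumentException("unknown task " + name);
        }
        path.add(name);
        boolean ready = true;
        for (String dependency : dependsOn.get(name)) {
            ready &= run(dependency, result, path);
        }
        path.remove(path.size() - 1);
        // The task body executes only when every prerequisite succeeded.
        boolean ok = ready && tasks.get(name).getAsBoolean();
        log.add(name + (ready ? (ok ? ":ok" : ":failed") : ":skipped"));
        result.put(name, ok);
        return ok;
    }

    List<String> log() {
        return log;
    }

    public static void main(String[] args) {
        // A nested graph is itself a task, because the graph implements BooleanSupplier.
        TaskGraphSketch inner = new TaskGraphSketch()
                .add("stage-in", () -> true)
                .add("run", () -> true, "stage-in");
        TaskGraphSketch outer = new TaskGraphSketch()
                .add("A", inner)
                .add("B", () -> false, "A")
                .add("C", () -> true, "B");
        System.out.println(outer.getAsBoolean() + " " + outer.log());
    }
}
```

In the `main` example, task C is skipped because its prerequisite B reports failure, while the nested graph runs as the single task A.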

Page 31: Building Science Gateways

Problems with Grid Client Development

Grid portlets typically wrap each single Grid capability in a separate portlet.
  The problem is that Grid portlets need to combine these operations.
  Portlets are entire web applications, so we need a component model for portlets: reusable portlet parts.
Even with the CoG Abstraction Layer, we must still do a lot of coding to build new applications.
To address these problems we have adopted JavaServer Faces (JSF).
  Provides several nice Model-View-Controller features.
  JSF provides an extensible framework (tag libraries) for making reusable components.
  The Apache JSF portlet bridge allows you to convert standalone JSF applications (development phase) into portlets (deployment phase).

Page 32: Building Science Gateways

GTLAB Example

<html>
  <body>
    <f:form>
      <o:submit id="test" action="next_page">
        <o:myproxy id="pr" hostname="gf1.ucs.indiana.edu" port="7512"
                   lifetime="2" username="mnacar" password="***" />
        <o:jobsubmit id="task" hostname="cobalt.ncsa.teragrid.org" provider="GT4"
                     executable="/bin/ls" stdout="tmp/result" stderr="tmp/error" />
      </o:submit>
    </f:form>
  </body>
</html>

Page 33: Building Science Gateways

Grid Tags         Associated Grid Beans   Features
<submit/>         ComponentBuilderBean    Creating components, job handlers, submitting jobs
<handler/>        MonitorBean             Handling monitoring page actions
<multitask/>      MultitaskBean           Constructing simple workflow
<dependency/>     MultitaskBean           Defining dependencies among sub jobs
<myproxy/>        MyproxyBean             Retrieving MyProxy credential
<fileoperation/>  FileOprationBean        Providing GridFTP operations
<jobsubmission/>  JobSubmitBean           Providing GRAM job submissions
<filetransfer/>   FileTransferBean        Providing GridFTP file transfer

ResourceBean: describes common properties among all tags and beans, and passes values given by standard visual JSF components.

Page 34: Building Science Gateways

Managing Scientific Workflows

Page 35: Building Science Gateways

Scientific Workflows

Portal interfaces encode scientific use cases.
  If you have a rich set of services, it is a lot of work to make portlets for all possible use cases.
  And power users will always want something more.
  Example: our CICC project has dozens of chemical informatics Web services.
    http://www.chembiogrid.org/wiki/
Workflow composers can simplify this.
  Allow users to encode and execute their own use cases.

Page 36: Building Science Gateways

Web Services and Workflows

Perform a similarity search on the NIH DTP Human Tumor data.

Filter the results based on Pharmacokinetic properties (FILTER)

Convert to 3D (OMEGA)

Docking into a pre-defined protein (FRED)

Visualize (JMOL).

Taverna workflow connects remote services.

Page 37: Building Science Gateways

OGCE’s XBaya Workflow Composer

Page 38: Building Science Gateways

Future of Science Gateways

Page 39: Building Science Gateways

Updating the Octopus

Diagram labels: Browser Interface; HTTP(S); Social Gadgets + AJAX; RSS, JSON/HTTP; REST (on each service, with one WSDL label remaining); DB Service; JDBC; DB; Job Sub/Mon and File Services; Operating and Queuing Systems; Visualization Service; Host 1, Host 2, Host 3.

Page 40: Building Science Gateways

Enterprise Approach                         Web 2.0 Approach
JSR 168 Portlets                            Gadgets, Widgets
Server-side integration and processing      AJAX, client-side integration and processing, JavaScript
SOAP                                        RSS, Atom, JSON
WSDL                                        REST (GET, PUT, DELETE, POST)
Portlet Containers                          Open Social Containers (Orkut, LinkedIn, Shindig); Facebook; StartPages
User Centric Gateways                       Social Networking Portals
Workflow managers (Taverna, Kepler, etc.)   Mash-ups
Grid computing: Globus, Condor, etc.        Cloud computing: Amazon WS Suite, Xen Virtualization
Semantic Web: RDF, OWL, ontologies          Microformats, folksonomies

Page 41: Building Science Gateways

Microformats, KML, and GeoRSS feeds used to deliver SAR data to multiple clients.

Page 42: Building Science Gateways

More Information

Contact me: [email protected]

See what I’m up to: http://communitygrids.blogspot.com/

OGCE software: http://collab-ogce.org/

QuakeSim: http://www.quakesim.org/

CICC: http://www.chembiogrid.org/wiki/

Lots of people worked on all of these.