Ian Foster
Computation Institute
Argonne National Lab & University of Chicago
Services for Science
Thanks!
DOE Office of Science
NSF Office of Cyberinfrastructure
National Institutes of Health
Colleagues at Argonne, U.Chicago, USC/ISI, OSU, Manchester, and elsewhere
Scientific Communication, ~1600
Brahe Kepler
1980
Scientific Communication, ~2000
Data Archives
User
Analysis tools
Gateway
Discovery tools
Figure: S. G. Djorgovski
Service-Oriented Science
Application Scenario
Location A: Microarray, Protein, Image data
Location B: Microarray, Protein, Image data
Location C: Microarray, Protein, Image data
Location C: Image Analysis
Location D: Image Analysis
Microarray and protein databases at other institutions
Different database systems, data representations, security
Different program invocation, remote access, data transfer
caBIG: sharing of infrastructure, applications, and data.
Data Integration!
Services & Cancer Biology
Globus
Service-Oriented Science
People create services (data, code, instr.) …
which I discover (& decide whether to use) …
& compose to create a new function ...
& then publish as a new service.
I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!
I hope that this “someone else” can manage security, reliability, scalability, …
“Service-Oriented Science”, Science, 2005
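The create, discover, compose, publish cycle above can be sketched in a few lines of Python. This is only an illustrative sketch: the `Registry` class, the `compose` helper, and the service names are hypothetical stand-ins, not part of Globus or any other toolkit named in this talk, and the "services" are local functions rather than networked ones.

```python
from typing import Any, Callable, Dict, List


class Registry:
    """Hypothetical in-memory registry mapping service names to callables."""

    def __init__(self) -> None:
        self._services: Dict[str, Callable[[Any], Any]] = {}

    def publish(self, name: str, service: Callable[[Any], Any]) -> None:
        """Make a service available under a name."""
        self._services[name] = service

    def discover(self, keyword: str) -> List[str]:
        """Return the sorted names of services whose name contains the keyword."""
        return sorted(n for n in self._services if keyword in n)

    def get(self, name: str) -> Callable[[Any], Any]:
        return self._services[name]


def compose(*services: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Chain services so that each one consumes the previous one's output."""
    def pipeline(value: Any) -> Any:
        for service in services:
            value = service(value)
        return value
    return pipeline


registry = Registry()

# People create services (made-up data and analysis functions).
registry.publish("data.fetch", lambda sample_id: [1.0, 2.0, 3.0])
registry.publish("analysis.mean", lambda values: sum(values) / len(values))

# ... which I discover, compose into a new function, and publish again.
found = registry.discover("analysis")
sample_mean = compose(registry.get("data.fetch"), registry.get("analysis.mean"))
registry.publish("analysis.sample_mean", sample_mean)

result = registry.get("analysis.sample_mean")("sample-42")
```

The composed function is published under its own name, so a later user can discover and reuse it exactly like the services it was built from.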
Creating Services
Anatomy of a Service
Diagram labels: operations (op1 … opN); (meta)data; Implementation(s); Clients; Registry; Management; Service; Attribute Authority; Persistence
Creating Services (~2005)
“This full-day tutorial provides an introduction to programming Java services with the latest version of the Globus Toolkit version 4 (GT4). The tutorial teaches how to build a Java Service that makes use of GT4 mechanisms for state management, security, registry and related topics.”
Creating Services in 2008: Introduce and gRAVI
Introduce: Define service; Create skeleton; Discover types; Add operations; Configure security
gRAVI (Grid Remote Application Virtualization Infrastructure): Wrap executables
Diagram labels: Application service; Introduce; Container; Repository Service; Index service; Create; Store; Advertise; Discover; Invoke, get results; Transfer GAR; Deploy
Ohio State University and Argonne/U.Chicago
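The idea of wrapping an executable as a service operation can be illustrated with a minimal Python sketch. It does not use or reproduce gRAVI's actual interfaces; `wrap_executable` is a hypothetical helper, and the wrapped "application" is just a one-line Python command chosen so the example runs anywhere.

```python
import subprocess
import sys
from typing import Callable, List


def wrap_executable(argv_prefix: List[str]) -> Callable[..., str]:
    """Return a callable that runs the executable and returns its stdout.

    Hypothetical helper: a real service wrapper would also handle
    staging of input files, security, and asynchronous execution.
    """
    def operation(*args: str) -> str:
        completed = subprocess.run(
            [*argv_prefix, *args],
            capture_output=True,  # collect stdout and stderr
            text=True,            # decode output as str
            check=True,           # raise CalledProcessError on non-zero exit
        )
        return completed.stdout
    return operation


# Stand-in "application": upper-cases its first command-line argument.
to_upper = wrap_executable(
    [sys.executable, "-c", "import sys; print(sys.argv[1].upper())"]
)

output = to_upper("microarray")
```

Callers see only a function that takes arguments and returns text; how and where the program runs stays hidden behind the wrapper.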
Demonstration: Creating Services
Introduce + gRAVI
Shannon Hastings, Scott Oster, David Ervin, Stephen Langella, Kyle Chard, Ravi Madduri
Center for Enabling Distributed Petascale Science
Workflow Automation at DOE Facilities
Automation
Reproducibility
Security
Reusability
Storage
Metadata
Analysis
Visualization
Advanced Photon Source
Discovering Services
Assume success
Syntax, semantics
Permissions
Reputation
Billions of services
Types, ontologies
Can I use it?
The ultimate arbiter?
Diagram labels: A, B
Discovery (1): Registries
Discovery (2): Standardized Vocabularies
Diagram components: Core Services; Grid Service; Cancer Data Standards Repository; Enterprise Vocabulary Services; Service Metadata; Index Service; Discovery Client API
Diagram relationships: Uses Terminology Described In; References Objects Defined In; Publishes; Subscribes to and Aggregates; Queries Service Metadata Aggregated In; Registers To
Discovery (3): Tagging & Social Networking
GLOSS: Generalized Labels Over Scientific data Sources (Foster, Nestorov)
Discovery (3): Tagging & Social Networking
David de Roure, Carole Goble, et al.
Composing Services
Composing Services: E.g., BPEL Workflow System
Diagram labels: Researcher or Client App; <BPEL Workflow Doc>; <Workflow Inputs>; <Workflow Results>; BPEL Engine; Data Service @ uchicago.edu; Analytic service @ osu.edu; Analytic service @ duke.edu; link
caBIG: https://cabig.nci.nih.gov/; BPEL work: Ravi Madduri et al.
See also Kepler & Taverna
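The pattern on this slide, where a client hands an engine a workflow document plus inputs and gets results back, can be sketched as follows. This is not BPEL and not the caBIG implementation: the `SERVICES` table, the `run_workflow` function, and the behavior of each endpoint are assumptions made for illustration, with local functions standing in for the remote data and analytic services.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical stand-ins for remote services, keyed by made-up endpoint names.
SERVICES: Dict[str, Callable[[Any], Any]] = {
    "data@uchicago.edu": lambda query: [3, 1, 2],        # returns a data set
    "sort@osu.edu": lambda values: sorted(values),       # first analytic step
    "max@duke.edu": lambda values: values[-1],           # second analytic step
}


def run_workflow(steps: List[str], workflow_input: Any) -> Dict[str, Any]:
    """Invoke each listed service in order, feeding each output to the next.

    The trace of (endpoint, output) pairs is kept so that the final
    result can be explained afterwards.
    """
    trace: List[Tuple[str, Any]] = []
    value = workflow_input
    for endpoint in steps:
        value = SERVICES[endpoint](value)
        trace.append((endpoint, value))
    return {"result": value, "trace": trace}


# The "workflow document" here is just an ordered list of endpoints.
workflow_doc = ["data@uchicago.edu", "sort@osu.edu", "max@duke.edu"]
outcome = run_workflow(workflow_doc, "tumor-samples")
```

Keeping the workflow as data rather than code is what lets an engine run it on the user's behalf and record what happened at each step.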
Composing Services: Taverna
caGrid Scavenger with semantic/metadata-based caGrid service query
A sample caGrid workflow
Composing Services
Demonstration: Composing Services
Taverna + GT4
Taverna team, Wei Tan, Ravi Madduri
Publishing Services
Description: Syntax, semantics
State: Availability, load, …
Policies: Who, what, when, …
Hosting: Location, scalability, …
Authorization: SAML & XACML
VOMS, Shibboleth, LDAP, PERMIS, …
Diagram labels: GT4 Client; GT4 Server; PIP (×3); PDP; Attributes; Authorization Decision; SAML; XACML
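The division of labor between policy information points (PIPs), which supply attributes about a requester, and a policy decision point (PDP), which turns those attributes into a decision, can be sketched in Python. This is a toy model, not the GT4 authorization framework and not real SAML or XACML processing: the attribute names, the two example PIPs, and the single-role policy are all assumptions. Only the "Permit" and "Deny" decision values are borrowed from XACML.

```python
from typing import Callable, Dict, List, Set

Attributes = Dict[str, Set[str]]
PIP = Callable[[str], Attributes]


def collect_attributes(subject: str, pips: List[PIP]) -> Attributes:
    """Merge the attributes that every PIP reports for the subject."""
    merged: Attributes = {}
    for pip in pips:
        for key, values in pip(subject).items():
            merged.setdefault(key, set()).update(values)
    return merged


def pdp_decide(attributes: Attributes, action: str,
               required_role: str, permitted_actions: Set[str]) -> str:
    """Permit only if the subject holds the role and the action is allowed."""
    has_role = required_role in attributes.get("role", set())
    if has_role and action in permitted_actions:
        return "Permit"
    return "Deny"


# Hypothetical attribute sources (stand-ins for VOMS, LDAP, and so on).
def role_pip(subject: str) -> Attributes:
    return {"role": {"researcher"}} if subject == "alice" else {}


def vo_pip(subject: str) -> Attributes:
    return {"vo": {"cabig"}} if subject in {"alice", "bob"} else {}


def authorize(subject: str, action: str) -> str:
    attrs = collect_attributes(subject, [role_pip, vo_pip])
    return pdp_decide(attrs, action, "researcher", {"query", "invoke"})


alice_decision = authorize("alice", "query")
bob_decision = authorize("bob", "query")
```

Separating attribute collection from the decision lets a site change where attributes come from without touching the policy logic.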
Hosting Services
The Two Dimensions of Service-Oriented Science
Decompose across network
Clients integrate dynamically
Select & compose services
Select “best of breed” providers
Publish result as new services
Decouple resource & service providers
Diagram labels: Function; Resource; Data Archives; Analysis tools; Discovery tools; Users
Fig: S. G. Djorgovski
The geWorkbench/caGrid/TeraGrid Interface
Putting It Together for the Example Scenario
Location A: Microarray, Protein, Image data
Location B: Microarray, Protein, Image data
Location C: Microarray, Protein, Image data
Location C: Image Analysis
Location D: Image Analysis
caGrid Service Interfaces
caGrid Environment
Registered Object Definitions
Advertisement
Log on, Grid credentials
Query and Analysis Workflow
Discovery
Microarray & protein databases at other institutions
Lessons Learned
A convenient higher-level abstraction
Suitable for a subset of scientific use cases
Infrastructure needs to be sustainable
Integrates well with hospital/cancer center/experimental facility IT infrastructure
Workflows are attractive to users
Scalability and provenance are important
No vendor lock-in (if you are careful)
User experience remains ambiguous
Early adopters are enthusiastic (50+ services)
Cancer centers seek clear ROI
Services for Science
A new approach to communicating
A (not-so-new) approach to structuring systems
They’re real
Excellent infrastructure and tools (Globus, Introduce, gRAVI, Taverna, Swift, etc., etc.)
Substantial numbers of services out there
They’re challenging
Sociology: incentives, rewards
Infrastructure: hosting
Provenance: justifying “results”
Scaling: services, requests