Grid Computing and the Open Grid Service Architecture
Ian Foster
Argonne National Laboratory
University of Chicago
http://www.mcs.anl.gov/~foster
2nd IEEE Intl Symp. on Network Computing & Applications, Boston, April 17, 2003
[email protected] ARGONNE CHICAGO
Abstract
In both e-business and e-science, we often need to integrate services across distributed, heterogeneous, dynamic "virtual organizations" formed from the disparate resources within a single enterprise and/or via external resource sharing relationships. This integration can be technically challenging due to the need to achieve various qualities of service in heterogeneous environments. I introduce this "Grid opportunity," discuss the origins and applications of Grid technologies in the world of science, and present recent work on an Open Grid Services Architecture that seeks to generalize Grid computing concepts to create a powerful framework for distributed resource sharing and management.
Partial Acknowledgements
Open Grid Services Architecture design
– Carl Kesselman, Karl Czajkowski @ USC/ISI
– Steve Tuecke @ ANL
– Jeff Nick, Steve Graham, Jeff Frey @ IBM
Grid services collaborators at ANL
– Kate Keahey, Gregor von Laszewski
– Thomas Sandholm, Jarek Gawor, John Bresnahan
Globus Toolkit R&D also involves many fine scientists & engineers at ANL, USC/ISI, and elsewhere (see www.globus.org)
Strong links with many EU, UK, US Grid projects
Support from DOE, NASA, NSF, IBM, Microsoft
Overview
Grid: why and what
Evolution of Grid technology
– Open Grid Services Architecture
Future directions
– Towards lightweight VOs: dynamic trust relationships
– Towards global knowledge communities: virtual data and dynamic workspaces
Why the Grid? (1) Revolution in Science
Pre-Internet
– Theorize &/or experiment, alone or in small teams; publish paper
Post-Internet
– Construct and mine large databases of observational or simulation data
– Develop simulations & analyses
– Access specialized devices remotely
– Exchange information within distributed multidisciplinary teams
Why the Grid? (2) Revolution in Business
Pre-Internet
– Central data processing facility
Post-Internet
– Enterprise computing is highly distributed, heterogeneous, inter-enterprise (B2B)
– Business processes increasingly computing- & data-rich
– Outsourcing becomes feasible => service providers of various sorts
New Opportunities Demand New Technology
“Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations”
“When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances” (George Gilder)
Grid Communities & Technologies
Yesterday
– Small, static communities, primarily in science
– Focus on sharing of computing resources
– Globus Toolkit as technology base
Today
– Larger communities in science; early industry
– Focused on sharing of data and computing
– Open Grid Services Architecture emerging
Tomorrow
– Large, dynamic, diverse communities that share a wide variety of services, resources, data
– New issues: Trust, distributed RM, knowledge
NSF TeraGrid
NCSA, SDSC, Argonne, Caltech
Unprecedented capability
– 13.6 trillion flop/s
– 600 terabytes of data
– 40 gigabits per second
– Accessible to thousands of scientists working on advanced research
www.teragrid.org
Data Grids for High Energy Physics
Enable international community of 1000s to access & analyze petabytes of data
Harness computing & storage worldwide
Virtual data concepts: manage programs, data, workflow
Distributed system management
NEESgrid Earthquake Engineering Collaboratory
[Figure: Network for Earthquake Engineering Simulation. Labels: Field Equipment; Laboratory Equipment (Faculty and Students); Remote Users (K-12 Faculty and Students); Remote Users (Faculty, Students, Practitioners); High-Performance Network(s); Instrumented Structures and Sites; Leading Edge Computation; Curated Data Repository; Global Connections (fully developed FY 2005 – FY 2014); U. Nevada Reno]
www.neesgrid.org
Grid Computing
By M. Mitchell Waldrop, May 2002
Hook enough computers together and what do you get? A new kind of utility that offers supercomputer processing on tap.
Is Internet history about to repeat itself?
Industrial Perspective on Grids: A Wide Range of Applications
Unique by Industry with Common Characteristics
“Gridified” Infrastructure
Financial Services
– Derivatives Analysis
– Statistical Analysis
– Portfolio Risk Analysis
Manufacturing
– Mechanical/Electronic Design
– Process Simulation
– Finite Element Analysis
– Failure Analysis
LS / Bioinformatics
– Cancer Research
– Drug Discovery
– Protein Folding
– Protein Sequencing
Energy
– Seismic Analysis
– Reservoir Analysis
Entertainment
– Digital Rendering
– Massive Multi-Player Games
– Streaming Media
Other
– Web Applications
– Weather Analysis
– Code Breaking/Simulation
– Academic
[Chart: Grid Services Market Opportunity 2005]
Sources: IDC, 2000 and Bear Stearns - Internet 3.0 - 5/01; analysis by SAI
Open Grid Services Architecture
Service-oriented architecture
– Key to virtualization, discovery, composition, local-remote transparency
Leverage industry standards
– Internet, Web services
Distributed service management
– A “component model for Web services”
A framework for the definition of composable, interoperable services
“The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration”, Foster, Kesselman, Nick, Tuecke, 2002
Web Services
XML-based distributed computing technology
Web service = a server process that exposes typed ports to the network
Described by the Web Services Description Language (WSDL), an XML document that contains
– Type of message(s) the service understands & types of responses & exceptions it returns
– “Methods” bound together as “port types”
– Port types bound to protocols as “ports”
A WSDL document completely defines a service and how to access it
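The vocabulary on this slide (messages, "methods," port types, ports) can be pictured as a small data model. The Python sketch below is illustrative only: the class names, the FileTransfer example, the SOAP/HTTP binding, and the example.org address are all assumptions made for this illustration, and a real WSDL description is an XML document, not a set of Python objects.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Operation:
    """A "method": the message it understands and what it may return."""
    name: str
    input_message: str
    output_message: str
    faults: Tuple[str, ...] = ()


@dataclass(frozen=True)
class PortType:
    """Operations bound together into one abstract interface."""
    name: str
    operations: Tuple[Operation, ...]


@dataclass(frozen=True)
class Port:
    """A port type bound to a concrete protocol and network address."""
    port_type: PortType
    protocol: str
    address: str


def describe(port: Port) -> str:
    """Render one summary line per operation, roughly what a client needs to know."""
    lines = []
    for op in port.port_type.operations:
        faults = ", ".join(op.faults) or "none"
        lines.append(
            f"{port.port_type.name}.{op.name}({op.input_message}) -> "
            f"{op.output_message} [faults: {faults}] "
            f"via {port.protocol} at {port.address}"
        )
    return "\n".join(lines)


# Hypothetical service description, invented for this example.
transfer = PortType(
    name="FileTransfer",
    operations=(
        Operation("start", "StartRequest", "StartResponse", ("NoSuchFile",)),
        Operation("cancel", "CancelRequest", "CancelResponse"),
    ),
)
port = Port(transfer, protocol="SOAP/HTTP", address="http://example.org/transfer")
print(describe(port))
```

Keeping the port type separate from the port mirrors the split between an abstract interface and its protocol binding: the same port type could be exposed through a second port with a different protocol or address.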
OGSA Structure
A standard substrate: the Grid service
– Standard interfaces and behaviors that address key distributed system issues
– A refactoring and extension of the Globus Toolkit protocol suite
… supports standard service specifications
– Resource management, databases, workflow, security, diagnostics, etc., etc.
– Target of current & planned GGF efforts
… and arbitrary application-specific services based on these & other definitions
Open Grid Services Infrastructure
[Figure: anatomy of a Grid service]
– Client: Grid Service Handle, Grid Service Reference, handle resolution
– GridService (required): data access to service data elements
– Other standard interfaces: factory, notification, collections
– Lifetime management: explicit destruction; soft-state lifetime
– Introspection: What port types? What policy? What state?
– Implementation, with its service data elements
– Hosting environment/runtime (“C”, J2EE, .NET, …)
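Soft-state lifetime management means that a service instance carries a termination time and goes away unless a client keeps extending it, while explicit destruction ends it immediately. A minimal Python sketch of that idea, together with a simple service data lookup, follows. The class and method names are hypothetical and do not reproduce the OGSI interface definitions; the injectable clock is there only to make the example deterministic.

```python
import time


class GridServiceSketch:
    """Toy service instance with service data and a soft-state lifetime."""

    def __init__(self, lifetime_s, clock=time.monotonic):
        self._clock = clock
        self._termination_time = clock() + lifetime_s
        self._destroyed = False
        self._service_data = {}

    def set_service_data(self, name, value):
        """Publish or update one named service data element."""
        self._service_data[name] = value

    def find_service_data(self, name):
        """Introspection: return a service data element, or None if absent."""
        return self._service_data.get(name)

    def keep_alive(self, extra_s):
        """Soft state: push the termination time out; never pull it in."""
        if not self.is_alive():
            raise RuntimeError("service instance has already terminated")
        self._termination_time = max(
            self._termination_time, self._clock() + extra_s
        )
        return self._termination_time

    def destroy(self):
        """Explicit destruction: terminate now, regardless of lifetime."""
        self._destroyed = True

    def is_alive(self):
        return not self._destroyed and self._clock() < self._termination_time


# Usage with a fake clock so the timeline is explicit.
now = [0.0]
service = GridServiceSketch(lifetime_s=10, clock=lambda: now[0])
service.set_service_data("state", "idle")

now[0] = 5.0
service.keep_alive(10)       # client is still interested: lifetime now ends at t=15
now[0] = 12.0
alive_after_extension = service.is_alive()
now[0] = 16.0
alive_after_silence = service.is_alive()   # no further keep-alive arrived
print(alive_after_extension, alive_after_silence)
```

The appeal of soft state is that a crashed or forgetful client needs no cleanup protocol: instances nobody refreshes simply expire.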
Open Grid Services Infrastructure
GWD-R (draft-ggf-ogsi-gridservice-23)
Open Grid Services Infrastructure (OGSI)
http://www.ggf.org/ogsi-wg
Editors: S. Tuecke, ANL; K. Czajkowski, USC/ISI; I. Foster, ANL; J. Frey, IBM; S. Graham, IBM; C. Kesselman, USC/ISI; D. Snelling, Fujitsu Labs; P. Vanderbilt, NASA
February 17, 2003
Example: Reliable File Transfer Service
[Figure: Reliable File Transfer service]
– Interfaces: GridService, File Transfer, Notf’n Source, Policy
– Internal state
– Service data elements: Pending, Performance, Policy, Faults
– Clients, Fault Monitor, Perf. Monitor: query &/or subscribe to service data
– Request and manage file transfer operations
– Data transfer operations
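The "query &/or subscribe to service data" pattern in this example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Reliable File Transfer implementation: the class name, the callback signature, and the sample values are invented, and only the element names "Faults" and "Performance" come from the slide.

```python
from collections import defaultdict


class ServiceDataNotifier:
    """Toy notification source: clients may poll (query) or register callbacks."""

    def __init__(self):
        self._values = {}
        self._subscribers = defaultdict(list)

    def query(self, name):
        """Pull model: return the current value of a service data element."""
        return self._values.get(name)

    def subscribe(self, name, callback):
        """Push model: call callback(name, value) on every later update."""
        self._subscribers[name].append(callback)

    def update(self, name, value):
        """Called by the service itself when its state changes."""
        self._values[name] = value
        for callback in self._subscribers[name]:
            callback(name, value)


rft = ServiceDataNotifier()

# A fault monitor subscribes; a performance monitor polls instead.
fault_log = []
rft.subscribe("Faults", lambda name, value: fault_log.append(value))

rft.update("Performance", {"bytes_per_s": 1_000_000})
rft.update("Faults", "transfer 7: connection reset")

print(fault_log)
print(rft.query("Performance"))
```

Subscription suits rare events such as faults, where polling would mostly return nothing new, while querying suits values a monitor samples on its own schedule.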
Open Grid Service Architecture: Next Steps
Technical specifications
– Open Grid Services Infrastructure is complete
– Security, data access, Java binding, common resource models, etc., etc., in the pipeline
Implementations and compliant products
– Here: OGSA-based Globus Toolkit v3, …
– Announced: IBM, Avaki, Platform, Sun, NEC, HP, Oracle, UD, Entropia, Insors, …, …
Rich set of service defns & implementations
Globus Toolkit v3 (GT3): Open Source OGSA Technology
Implements OGSI interfaces
Supports primary GT2 interfaces
– High degree of backward compatibility
Multiple platforms & hosting environments
– J2EE, Java, C, .NET, Python
New services
– SLA negotiation, service registry, community authorization, data management, …
Rapidly growing adoption and contributions: “Linux for the Grid”
Future Directions
Grids are about computers, certainly
– “On-demand” access to computing, etc.
– Challenging future issues here: e.g., scale
CMS Event Simulation Production
Production Run on the Integration Testbed
– Simulate 1.5 million full CMS events for physics studies: ~500 sec per event on 850 MHz processor
– 2 months continuous running across 5 testbed sites
– Managed by a single person at the US-CMS Tier 1
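A back-of-envelope calculation shows the scale these bullets imply. The Python sketch below uses only the event count and the approximate per-event time from the slide; the 61-day length for "2 months" and the 365.25-day year are assumptions, and the result counts simulation time only, so any overhead (slower processors, failed or repeated jobs, data handling) would push the real total higher.

```python
SECONDS_PER_DAY = 86_400

events = 1_500_000            # from the slide
seconds_per_event = 500       # approximate, on an 850 MHz processor

total_cpu_seconds = events * seconds_per_event
cpu_years = total_cpu_seconds / (365.25 * SECONDS_PER_DAY)

run_days = 61                 # assumption: "2 months" taken as 61 days
avg_busy_cpus = total_cpu_seconds / (run_days * SECONDS_PER_DAY)

print(f"{cpu_years:.1f} CPU-years of simulation time")
print(f"about {avg_busy_cpus:.0f} processors busy on average over the run")
```

Under these assumptions the simulation time alone comes to roughly 24 CPU-years, which corresponds to about 140 processors kept continuously busy across the five sites for the whole run.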
1.5 Million Events Delivered to CMS Physicists! (nearly 30 CPU years)
Future Directions
Grids are about computers, certainly
– “On-demand” access to computing, etc.
– Challenging future issues here: e.g., scale
But they are ultimately about people, their activities, and their interactions
– New interaction modalities supported by on-demand formation of lightweight VOs
– New technologies needed: e.g., trust, security, data and knowledge integration
Convergence of interest between “Compute” and “Collaboration” Grids?
Example Issue: Trust and Security
Effective VO operation depends critically on
– Trust: can I rely on you?
– Protection mechanisms to govern actions
Suffers from VO-organization policy mismatch
Goal: collaborations no longer defined by slow centralized mechanisms but can
– form spontaneously;
– be managed in a distributed manner; and
– be protected by an infrastructure that maintains and enforces trust relationships
Grid Security Services
[Figure: Grid security services. Labels: Requestor Application; Service Provider Application; Requestor's Domain; Service Provider's Domain; VO Domain; WS-Stub; Secure Conversation; Credential Validation Service; Authorization Service; Attribute Service; Trust Service; Audit/Secure-Logging Service; Privacy Service; Bridge/Translation Service]
Understanding and Enhancing VO Trust and Security
[Figure: layers Community, Trust, Policy, Mechanism, with annotations]
– Trust rules: trust ( Tr, Te, As, L ) <- Cs; recommend ( Rr, Re, As, L ) <- Cs;
– Policy: allowed (S, O, A, C)
– Mechanism: VOTA, PKI, VPN, etc.
– Analyses: social network analysis, workflow analysis, risk analysis, usability analysis, other analyses
– Feasibility analysis wrt cost, legality, etc.
– Factoring wrt environment
– Establishment, enhancement, maintenance, verification
– Monitoring for reputation, compliance, intrusion detection, etc.
Virtual Data for Collaborative Science
Much collaboration is concerned with the development & use of knowledge, whether
– Programs for data analysis and generation
– Computations involving those programs
– Metadata concerning data, programs, computations, and their interrelationships
In a distributed, heterogeneous, fractal (?) environment with widely varying
– Data and analysis program formats
– Degrees of formality and scale
– Scientific goals and sharing policies
Virtual Data Concept
Capture and manage information about relationships among
– Data (of widely varying representations)
– Programs (& their execution needs)
– Computations (& execution environments)
Apply this information to, e.g.
– Discovery: Data and program discovery
– Workflow: Structured paradigm for organizing, locating, specifying, & requesting data
– Explanation: provenance
– Planning and scheduling
– Other uses we haven’t thought of
Motivations
[Figure: Transformation, Derivation, and Data, linked by the relationships created-by, execution-of, and consumed-by/generated-by]
“I’ve detected a calibration error in an instrument and want to know which derived data to recompute.”
“I’ve come across some interesting data, but I need to understand the nature of the corrections applied when it was constructed before I can trust it for my purposes.”
“I want to search an astronomical database for galaxies with certain characteristics. If a program that performs this analysis exists, I won’t have to write one from scratch.”
“I want to apply an astronomical analysis program to millions of objects. If the results already exist, I’ll save weeks of computation.”
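The calibration-error question ("which derived data to recompute?") becomes a graph traversal once the consumed-by/generated-by relationships are recorded. The Python sketch below is a minimal illustration of that idea and not the Chimera system's interface: the record layout, the dataset names, and the function name are all hypothetical.

```python
from collections import defaultdict, deque

# Each record: (transformation, datasets consumed, datasets generated).
# All names are invented for this example.
derivations = [
    ("calibrate", ["raw_image"], ["calibrated_image"]),
    ("find_galaxies", ["calibrated_image"], ["galaxy_catalog"]),
    ("find_clusters", ["galaxy_catalog"], ["cluster_catalog"]),
    ("plot_sky", ["raw_sky_map"], ["sky_plot"]),
]


def data_to_recompute(derivations, changed):
    """Return every dataset derived, directly or indirectly, from `changed`."""
    # Index: dataset -> datasets generated by derivations that consumed it.
    generated_from = defaultdict(list)
    for _transformation, consumed, generated in derivations:
        for dataset in consumed:
            generated_from[dataset].extend(generated)

    stale = set()
    queue = deque([changed])
    while queue:                       # breadth-first walk downstream
        for dataset in generated_from[queue.popleft()]:
            if dataset not in stale:
                stale.add(dataset)
                queue.append(dataset)
    return stale


print(sorted(data_to_recompute(derivations, "raw_image")))
```

Walking the same records in the opposite direction, from a dataset back to what generated it, would address the second motivation: understanding which corrections were applied before trusting a piece of data.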
Example: Sloan Galaxy Cluster Analysis
[Figure: Sloan data processed through a DAG to produce the galaxy cluster size distribution, plotted as Number of Clusters (1 to 100000) against Number of Galaxies (1 to 100)]
Jim Annis, Steve Kent, Vijay Sehkri, Fermilab; Michael Milligan, Yong Zhao, Chicago
Integrating Provenance Data
[Figure: provenance hierarchy. Labels: Collaboration-wide index; Collaboration-level index; Group Index; Personal Index; Collaboration VDS; Group VDS; Personal VDS; entries marked TR, DV, and DS]
Summary
Yesterday
– Small, static communities, primarily in science
– Focus on sharing of computing resources
– Globus Toolkit as technology base
Today
– Larger communities in science; early industry
– Focused on sharing of data and computing
– Open Grid Services Architecture emerging
Tomorrow
– Large, dynamic, diverse communities that share a wide variety of services, resources, data
– New issues: Trust, distributed RM, knowledge
For More Information
The Globus Project™
– www.globus.org
Technical articles
– www.mcs.anl.gov/~foster
Open Grid Services Arch.
– www.globus.org/ogsa
Chimera
– www.griphyn.org/chimera
Global Grid Forum
– www.gridforum.org