Grid Computing and the Open Grid Service Architecture · 2012-08-16 · Grid Communities &...

Grid Computing and the Open Grid Service Architecture Ian Foster Argonne National Laboratory University of Chicago http://www.mcs.anl.gov/~foster 2nd IEEE Intl Symp. on Network Computing & Applications, Boston, April 17, 2003

Grid Computing and the Open Grid Service Architecture

Ian Foster

Argonne National Laboratory

University of Chicago

http://www.mcs.anl.gov/~foster

2nd IEEE Intl Symp. on Network Computing & Applications, Boston, April 17, 2003

[email protected] ARGONNE CHICAGO

Abstract

In both e-business and e-science, we often need to integrate services across distributed, heterogeneous, dynamic "virtual organizations" formed from the disparate resources within a single enterprise and/or via external resource sharing relationships. This integration can be technically challenging due to the need to achieve various qualities of service in heterogeneous environments. I introduce this "Grid opportunity," discuss the origins and applications of Grid technologies in the world of science, and present recent work on an Open Grid Services Architecture that seeks to generalize Grid computing concepts to create a powerful framework for distributed resource sharing and management.

Partial Acknowledgements

Open Grid Services Architecture design
– Carl Kesselman, Karl Czajkowski @ USC/ISI
– Steve Tuecke @ ANL
– Jeff Nick, Steve Graham, Jeff Frey @ IBM

Grid services collaborators at ANL
– Kate Keahey, Gregor von Laszewski
– Thomas Sandholm, Jarek Gawor, John Bresnahan

Globus Toolkit R&D also involves many fine scientists & engineers at ANL, USC/ISI, and elsewhere (see www.globus.org)

Strong links with many EU, UK, US Grid projects

Support from DOE, NASA, NSF, IBM, Microsoft

Overview

Grid: why and what

Evolution of Grid technology
– Open Grid Services Architecture

Future directions
– Towards lightweight VOs: dynamic trust relationships
– Towards global knowledge communities: virtual data and dynamic workspaces

Why the Grid? (1) Revolution in Science

Pre-Internet
– Theorize &/or experiment, alone or in small teams; publish paper

Post-Internet
– Construct and mine large databases of observational or simulation data
– Develop simulations & analyses
– Access specialized devices remotely
– Exchange information within distributed multidisciplinary teams

Why the Grid? (2) Revolution in Business

Pre-Internet
– Central data processing facility

Post-Internet
– Enterprise computing is highly distributed, heterogeneous, inter-enterprise (B2B)
– Business processes increasingly computing- & data-rich
– Outsourcing becomes feasible => service providers of various sorts

“Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations”

“When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances” (George Gilder)

New Opportunities Demand New Technology

Grid Communities & Technologies

Yesterday
– Small, static communities, primarily in science
– Focus on sharing of computing resources
– Globus Toolkit as technology base

Today
– Larger communities in science; early industry
– Focused on sharing of data and computing
– Open Grid Services Architecture emerging

Tomorrow
– Large, dynamic, diverse communities that share a wide variety of services, resources, data
– New issues: Trust, distributed RM, knowledge

NSF TeraGrid

NCSA, SDSC, Argonne, Caltech

Unprecedented capability
– 13.6 trillion flop/s
– 600 terabytes of data
– 40 gigabits per second
– Accessible to thousands of scientists working on advanced research

www.teragrid.org

Data Grids for High Energy Physics

Enable international community of 1000s to access & analyze petabytes of data

Harness computing & storage worldwide

Virtual data concepts: manage programs, data, workflow

Distributed system management

NEESgrid Earthquake Engineering Collaboratory

Network for Earthquake Engineering Simulation

[Figure: NEESgrid components linked by high-performance network(s): field equipment; laboratory equipment (faculty and students); instrumented structures and sites; leading edge computation; curated data repository; remote users (faculty, students, practitioners); remote users (K-12 faculty and students); global connections (fully developed FY 2005 – FY 2014). Site labeled: U. Nevada Reno.]

www.neesgrid.org

Grid Computing

“Grid Computing,” by M. Mitchell Waldrop, May 2002

Hook enough computers together and what do you get? A new kind of utility that offers supercomputer processing on tap.

Is Internet history about to repeat itself?

Industrial Perspective on Grids: A Wide Range of Applications

Unique by Industry with Common Characteristics

“Gridified” Infrastructure

Grid Services Market Opportunity 2005

Financial Services
– Derivatives Analysis
– Statistical Analysis
– Portfolio Risk Analysis

Manufacturing
– Mechanical/Electronic Design
– Process Simulation
– Finite Element Analysis
– Failure Analysis

LS / Bioinformatics
– Cancer Research
– Drug Discovery
– Protein Folding
– Protein Sequencing

Energy
– Seismic Analysis
– Reservoir Analysis

Entertainment
– Digital Rendering
– Massive Multi-Player Games
– Streaming Media

Other
– Web Applications
– Weather Analysis
– Code Breaking/Simulation
– Academic

Sources: IDC, 2000 and Bear Stearns - Internet 3.0 - 5/01; analysis by SAI

Overview

Grid: why and what

Evolution of Grid technology
– Open Grid Services Architecture

Future directions
– Towards lightweight VOs: dynamic trust relationships
– Towards global knowledge communities: virtual data and dynamic workspaces

Open Grid Services Architecture

Service-oriented architecture
– Key to virtualization, discovery, composition, local-remote transparency

Leverage industry standards
– Internet, Web services

Distributed service management
– A “component model for Web services”

A framework for the definition of composable, interoperable services

“The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration”, Foster, Kesselman, Nick, Tuecke, 2002

Web Services

XML-based distributed computing technology

Web service = a server process that exposes typed ports to the network

Described by a Web Services Description Language (WSDL) document, an XML document that contains
– Types of message(s) the service understands & types of responses & exceptions it returns
– “Methods” bound together as “port types”
– Port types bound to protocols as “ports”

A WSDL document completely defines a service and how to access it
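The structure a WSDL document describes (messages, operations grouped into port types, and port types bound to protocols as ports) can be sketched as a small data model. This is an illustrative Python sketch only, not a WSDL parser or any toolkit's API; the class names, the `FileTransfer` port type, the message names, and the endpoint URL are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Operation:
    """One "method": the message it accepts and the messages it can return."""
    name: str
    input_message: str
    output_message: str
    fault_messages: tuple = ()


@dataclass
class PortType:
    """A named group of operations (an abstract interface)."""
    name: str
    operations: dict = field(default_factory=dict)

    def add(self, op: Operation) -> None:
        self.operations[op.name] = op


@dataclass
class Port:
    """A port type bound to a concrete protocol and network address."""
    port_type: PortType
    protocol: str
    address: str


def describe(port: Port) -> list:
    """List what a client can learn from the description alone."""
    return [
        f"{port.protocol} {port.address} "
        f"{op.name}({op.input_message}) -> {op.output_message}"
        for op in port.port_type.operations.values()
    ]


# Hypothetical service description, for illustration only.
transfer = PortType("FileTransfer")
transfer.add(Operation("start", "StartRequest", "StartResponse", ("TransferFault",)))
port = Port(transfer, "SOAP/HTTP", "http://example.org/transfer")
summary = describe(port)
```

The point of the separation is that the same port type can be bound to more than one protocol without changing the abstract interface.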

OGSA Structure

A standard substrate: the Grid service
– Standard interfaces and behaviors that address key distributed system issues
– A refactoring and extension of the Globus Toolkit protocol suite

… supports standard service specifications
– Resource management, databases, workflow, security, diagnostics, etc., etc.
– Target of current & planned GGF efforts

… and arbitrary application-specific services based on these & other definitions

Open Grid Services Infrastructure

[Figure: anatomy of a Grid service, with the labels below]

Service interfaces
– GridService (required)
– Other standard interfaces: factory, notification, collections

Implementation

Service data elements

GridService capabilities
– Data access
– Lifetime management: explicit destruction, soft-state lifetime
– Introspection: What port types? What policy? What state?

Client: Grid Service Handle, handle resolution, Grid Service Reference

Hosting environment/runtime (“C”, J2EE, .NET, …)
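Handle resolution and soft-state lifetime management can be illustrated with a minimal sketch. It assumes a simplified model in which a stable handle is resolved to the live service object itself and a service is reclaimed once its termination time passes without renewal. The class and method names are hypothetical and are not the OGSI operation names; the `now` parameter exists only to make the example deterministic.

```python
import time


class GridServiceInstance:
    """Hypothetical stand-in for a service instance with a soft-state lifetime."""

    def __init__(self, handle, lifetime_s, now=None):
        self.handle = handle
        now = time.time() if now is None else now
        self.termination_time = now + lifetime_s
        self.destroyed = False

    def extend_lifetime(self, lifetime_s, now=None):
        """Keepalive: push the termination time further into the future."""
        now = time.time() if now is None else now
        self.termination_time = max(self.termination_time, now + lifetime_s)

    def destroy(self):
        """Explicit destruction, independent of the termination time."""
        self.destroyed = True

    def is_alive(self, now=None):
        now = time.time() if now is None else now
        return not self.destroyed and now < self.termination_time


class HandleResolver:
    """Maps a stable handle to the current reference (here: the live instance)."""

    def __init__(self):
        self._instances = {}

    def register(self, instance):
        self._instances[instance.handle] = instance

    def resolve(self, handle, now=None):
        instance = self._instances.get(handle)
        if instance is None or not instance.is_alive(now):
            # Expired or destroyed services are reclaimed on lookup.
            self._instances.pop(handle, None)
            raise LookupError(f"no live service for handle {handle!r}")
        return instance


# Illustrative use with a made-up handle and explicit clock values.
resolver = HandleResolver()
service = GridServiceInstance("handle-1", lifetime_s=10, now=0)
resolver.register(service)
service.extend_lifetime(10, now=8)   # renewed before expiry
```

Soft state means a client that crashes simply stops sending keepalives, and the service's resources are eventually recovered without any explicit cleanup message.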

Open Grid Services Infrastructure

Open Grid Services Infrastructure (OGSI)

GWD-R (draft-ggf-ogsi-gridservice-23)

http://www.ggf.org/ogsi-wg

February 17, 2003

Editors: S. Tuecke, ANL; K. Czajkowski, USC/ISI; I. Foster, ANL; J. Frey, IBM; S. Graham, IBM; C. Kesselman, USC/ISI; D. Snelling, Fujitsu Labs; P. Vanderbilt, NASA

Example: Reliable File Transfer Service

[Figure: a Reliable File Transfer service and its clients, with the labels below]

Interfaces
– File Transfer
– GridService
– Notification Source
– Policy

Service data elements
– Pending
– Performance
– Policy
– Faults

Internal State

Clients
– Fault Monitor
– Perf. Monitor
– Client

Interactions
– Query &/or subscribe to service data
– Request and manage file transfer operations
– Data transfer operations
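The two ways of reading service data named on this slide, query and subscribe, can be sketched as a pull call and a push callback. This is a minimal sketch under stated assumptions: the `ServiceData` class, its method names, and the element values are hypothetical and do not reproduce the OGSI interfaces.

```python
class ServiceData:
    """Hypothetical container for named service data elements."""

    def __init__(self):
        self._values = {}
        self._subscribers = {}

    def query(self, name):
        """Pull model: return the current value of one element."""
        return self._values[name]

    def subscribe(self, name, callback):
        """Push model: call `callback(name, value)` on every later change."""
        self._subscribers.setdefault(name, []).append(callback)

    def set(self, name, value):
        """Update an element and notify its subscribers."""
        self._values[name] = value
        for callback in self._subscribers.get(name, []):
            callback(name, value)


# Illustrative use: a fault monitor subscribes, a performance monitor polls.
service_data = ServiceData()
fault_log = []
service_data.subscribe("faults", lambda name, value: fault_log.append((name, value)))

service_data.set("performance", {"mb_per_s": 12.5})   # made-up value
service_data.set("faults", ["timeout"])               # made-up value

latest_performance = service_data.query("performance")
```

A monitor that only needs occasional snapshots can poll with `query`, while one that must react to every fault is better served by `subscribe`.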

Open Grid Service Architecture: Next Steps

Technical specifications
– Open Grid Services Infrastructure is complete
– Security, data access, Java binding, common resource models, etc., etc., in the pipeline

Implementations and compliant products
– Here: OGSA-based Globus Toolkit v3, …
– Announced: IBM, Avaki, Platform, Sun, NEC, HP, Oracle, UD, Entropia, Insors, …, …

Rich set of service definitions & implementations

Globus Toolkit v3 (GT3): Open Source OGSA Technology

Implements OGSI interfaces

Supports primary GT2 interfaces
– High degree of backward compatibility

Multiple platforms & hosting environments
– J2EE, Java, C, .NET, Python

New services
– SLA negotiation, service registry, community authorization, data management, …

Rapidly growing adoption and contributions: “Linux for the Grid”

Overview

Grid: why and what

Evolution of Grid technology
– Open Grid Services Architecture

Future directions
– Towards lightweight VOs: dynamic trust relationships
– Towards global knowledge communities: virtual data and dynamic workspaces

Future Directions

Grids are about computers, certainly
– “On-demand” access to computing, etc.
– Challenging future issues here: e.g., scale

CMS Event Simulation Production

Production Run on the Integration Testbed
– Simulate 1.5 million full CMS events for physics studies: ~500 sec per event on 850 MHz processor
– 2 months continuous running across 5 testbed sites
– Managed by a single person at the US-CMS Tier 1

1.5 Million Events Delivered to CMS Physicists! (nearly 30 CPU years)
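As a rough cross-check of the scale of this run, the stated event count and per-event time can be multiplied out. The sketch below uses only the slide's figures plus two labeled assumptions: a year of 365.25 days and "2 months" taken as 61 days. The simulation step alone comes to roughly 24 CPU years; the slide does not say what accounts for the rest of the "nearly 30 CPU years" it quotes.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumption: Julian year

events = 1_500_000
seconds_per_event = 500                  # on an 850 MHz processor, per the slide

cpu_seconds = events * seconds_per_event
cpu_years = cpu_seconds / SECONDS_PER_YEAR

# Assumption: "2 months" of continuous running is taken as 61 days.
wall_seconds = 61 * 24 * 3600
average_busy_processors = cpu_seconds / wall_seconds

print(f"{cpu_years:.1f} CPU years for simulation alone")
print(f"about {average_busy_processors:.0f} processors busy on average")
```

The second figure is only an average implied by these assumptions, not a reported testbed size.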

Future Directions

Grids are about computers, certainly
– “On-demand” access to computing, etc.
– Challenging future issues here: e.g., scale

But they are ultimately about people, their activities, and their interactions
– New interaction modalities supported by on-demand formation of lightweight VOs
– New technologies needed: e.g., trust, security, data and knowledge integration

Convergence of interest between “Compute” and “Collaboration” Grids?

Global Knowledge Communities

Example Issue: Trust and Security

Effective VO operation depends critically on
– Trust: can I rely on you?
– Protection mechanisms to govern actions

Suffers from VO-organization policy mismatch

Goal: collaborations no longer defined by slow centralized mechanisms but can
– form spontaneously;
– be managed in a distributed manner; and
– be protected by an infrastructure that maintains and enforces trust relationships

Grid Security Services

[Figure: Grid security services across three domains, with the labels below]

Domains
– Requestor's Domain
– VO Domain
– Service Provider's Domain

Applications
– Requestor Application
– Service Provider Application
– WS-Stub on each side, linked by Secure Conversation

Services
– Credential Validation Service
– Authorization Service
– Audit/Secure-Logging Service
– Attribute Service
– Trust Service
– Privacy Service
– Bridge/Translation Service

Understanding and Enhancing VO Trust and Security

[Figure: relationships among trust, policy, mechanism, and community in a VO, with the labels below]

Labels: Trust, Policy, Mechanism, Community

Notation and technologies shown
– trust ( Tr, Te, As, L ) <- Cs; recommend ( Rr, Re, As, L ) <- Cs;
– allowed (S, O, A, C)
– VOTA, PKI, VPN, etc.

Activities and analyses
– Social network analysis; other analyses
– Workflow analysis; risk analysis
– Usability analysis
– Feasibility analysis wrt cost, legality, etc.
– Factoring wrt environment
– Establishment, enhancement, maintenance, verification
– Monitoring for reputation, compliance, intrusion detection, etc.

Virtual Data for Collaborative Science

Much collaboration is concerned with the development & use of knowledge, whether
– Programs for data analysis and generation
– Computations involving those programs
– Metadata concerning data, programs, computations, and their interrelationships

In a distributed, heterogeneous, fractal (?) environment with widely varying
– Data and analysis program formats
– Degrees of formality and scale
– Scientific goals and sharing policies

Sloan Digital Sky Survey Production System

Virtual Data Concept

Capture and manage information about relationships among
– Data (of widely varying representations)
– Programs (& their execution needs)
– Computations (& execution environments)

Apply this information to, e.g.
– Discovery: Data and program discovery
– Workflow: Structured paradigm for organizing, locating, specifying, & requesting data
– Explanation: provenance
– Planning and scheduling
– Other uses we haven’t thought of

Motivations

[Figure: virtual data schema relating Transformation, Derivation, and Data through the relationships created-by, execution-of, and consumed-by/generated-by]

“I’ve detected a calibration error in an instrument and want to know which derived data to recompute.”

“I’ve come across some interesting data, but I need to understand the nature of the corrections applied when it was constructed before I can trust it for my purposes.”

“I want to search an astronomical database for galaxies with certain characteristics. If a program that performs this analysis exists, I won’t have to write one from scratch.”

“I want to apply an astronomical analysis program to millions of objects. If the results already exist, I’ll save weeks of computation.”
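Two of these motivations, finding which derived data to recompute and understanding how a data item was constructed, reduce to walking recorded consumed-by/generated-by relationships. The sketch below assumes a simplified in-memory record of derivations; it is not the Chimera system's data model or API, and every dataset and derivation name is hypothetical.

```python
from collections import deque

# Each derivation records which data it consumed and which it generated.
# All names are hypothetical, for illustration only.
derivations = {
    "calibrate": {"consumes": ["raw_images"], "generates": ["calibrated_images"]},
    "find_objects": {"consumes": ["calibrated_images"], "generates": ["object_catalog"]},
    "find_clusters": {"consumes": ["object_catalog"], "generates": ["cluster_catalog"]},
    "sky_map": {"consumes": ["survey_footprint"], "generates": ["coverage_map"]},
}


def data_to_recompute(bad_data, derivations):
    """Return all data derived, directly or indirectly, from `bad_data`."""
    stale = set()
    queue = deque([bad_data])
    while queue:
        current = queue.popleft()
        for step in derivations.values():
            if current in step["consumes"]:
                for output in step["generates"]:
                    if output not in stale:
                        stale.add(output)
                        queue.append(output)
    return stale


def lineage(data, derivations):
    """Return the names of all derivations that contributed to `data`."""
    steps = set()
    frontier = [data]
    while frontier:
        current = frontier.pop()
        for name, step in derivations.items():
            if current in step["generates"] and name not in steps:
                steps.add(name)
                frontier.extend(step["consumes"])
    return steps


# A calibration error in the raw images invalidates everything downstream.
stale = data_to_recompute("raw_images", derivations)
# Provenance of the cluster catalog: every step that led to it.
history = lineage("cluster_catalog", derivations)
```

Data that does not depend on the faulty input, such as the coverage map here, is left alone, which is where the saved computation comes from.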

Example: Sloan Galaxy Cluster Analysis

[Figure: Sloan data processed through a workflow DAG to produce the galaxy cluster size distribution, plotted on logarithmic axes as number of clusters (1 to 100000) against number of galaxies (1 to 100)]

Jim Annis, Steve Kent, Vijay Sehkri, Fermilab; Michael Milligan, Yong Zhao, Chicago

Integrating Provenance Data

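[Figure: provenance data held in virtual data services (VDS) at three levels, Personal VDS, Group VDS, and Collaboration VDS, each containing entries labeled TR, DV, and DS; these are integrated through Personal Indexes, a Group Index, a Collaboration-level index, and a Collaboration-wide index]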

Summary

Yesterday
– Small, static communities, primarily in science
– Focus on sharing of computing resources
– Globus Toolkit as technology base

Today
– Larger communities in science; early industry
– Focused on sharing of data and computing
– Open Grid Services Architecture emerging

Tomorrow
– Large, dynamic, diverse communities that share a wide variety of services, resources, data
– New issues: Trust, distributed RM, knowledge

For More Information

The Globus Project™
– www.globus.org

Technical articles
– www.mcs.anl.gov/~foster

Open Grid Services Arch.
– www.globus.org/ogsa

Chimera
– www.griphyn.org/chimera

Global Grid Forum
– www.gridforum.org