Pegasus: Planning for Execution in Grids

Ewa Deelman, Information Sciences Institute, University of Southern California


Transcript of Pegasus: Planning for Execution in Grids

Page 1: Pegasus:  Planning for Execution in Grids

Pegasus: Planning for Execution in Grids
Ewa Deelman
Information Sciences Institute, University of Southern California

Page 2: Pegasus:  Planning for Execution in Grids


Pegasus Acknowledgements
Ewa Deelman, Carl Kesselman, Saurabh Khurana, Gaurang Mehta, Sonal Patil, Gurmeet Singh, Mei-Hui Su, Karan Vahi (ISI)
James Blythe, Yolanda Gil (ISI)
http://pegasus.isi.edu
Research funded as part of the NSF GriPhyN, NVO, and SCEC projects.

Page 3: Pegasus:  Planning for Execution in Grids


Outline
General scientific workflow issues on the Grid
Mapping complex applications onto the Grid
Pegasus
Pegasus Application Portal
LIGO: gravitational-wave physics
Montage: astronomy
Incremental workflow refinement
Futures

Page 4: Pegasus:  Planning for Execution in Grids


Grid Applications
Grid applications are increasing in their level of complexity:
Use of individual application components
Reuse of individual intermediate data products (files)
Description of data products using metadata attributes
The execution environment is complex and very dynamic:
Resources come and go
Data is replicated
Components can be found at various locations or staged in on demand
There is a separation between the application description and the actual execution description.

Page 5: Pegasus:  Planning for Execution in Grids


[Diagram: the Application Development and Execution Process, from the application domain down to the execution environment.]
Application domain: an FFT is to be applied to filea.
Abstract workflow: FFT filea (logical transformations and logical files).
Concrete workflow: /usr/local/bin/fft /home/file1, preceded by a transfer of filea from host1://home/filea to host2://home/file1.
Execution environment: the hosts (host1, host2) and the data they hold.
Abstract workflow generation and concrete workflow generation involve application component selection, transformation instance selection, resource selection, data replica selection, and data transfer.
Failure recovery methods: retry, pick different resources, or specify a different workflow.

Page 6: Pegasus:  Planning for Execution in Grids


Why Automate Workflow Generation?
Usability: limit the Grid knowledge the user needs (Monitoring and Discovery Service, Replica Location Service).
Complexity: the user needs to make choices among alternative application components, alternative files, and alternative locations; the user may reach a dead end; many different interdependencies may occur among components.
Solution cost: evaluate the costs of alternative solutions in terms of performance, reliability, and resource usage.
Global cost: minimizing cost within a community or a virtual organization requires reasoning about individual users' choices in light of other users' choices.

Page 7: Pegasus:  Planning for Execution in Grids


Executable Workflow Construction
Chimera builds an abstract workflow based on VDL descriptions.
Pegasus takes the abstract workflow and produces an executable (concrete) workflow for the Grid.
Condor's DAGMan executes the workflow.
[Diagram: Chimera produces the abstract workflow, Pegasus turns it into a concrete workflow, and DAGMan runs the resulting jobs.]
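To make the hand-off concrete, here is a minimal Python sketch (the class and field names are invented for illustration, not actual Chimera or Pegasus interfaces) of what the abstract workflow passed from Chimera to Pegasus contains: logical jobs that refer only to logical transformation and file names, with dependencies implied by shared files.

from dataclasses import dataclass, field

@dataclass
class Job:
    name: str
    transformation: str                        # logical name, e.g. "FFT"
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)

@dataclass
class AbstractWorkflow:
    jobs: list = field(default_factory=list)

    def parents(self, job):
        # A job depends on every job that produces one of its inputs.
        return [j for j in self.jobs if set(j.outputs) & set(job.inputs)]

# The FFT example from the earlier slide; the output file name is made up for the sketch.
fft = Job("fft1", "FFT", inputs=["filea"], outputs=["filea.fft"])
aw = AbstractWorkflow([fft])
print([j.name for j in aw.jobs])

The reduction and concretization sketches on the later slides assume the same kind of structure.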

Page 8: Pegasus:  Planning for Execution in Grids


Pegasus: Planning for Execution in Grids
Maps from an abstract to a concrete workflow using algorithmic and AI-based techniques.
Automatically locates physical locations for both components (transformations) and data, using the Globus RLS and the Transformation Catalog.
Finds appropriate resources to execute on via Globus MDS.
Reuses existing data products where applicable.
Publishes newly derived data products in the Chimera virtual data catalog.

Page 9: Pegasus:  Planning for Execution in Grids


[Diagram: workflow planning and execution architecture.]
Abstract workflows are described in the Virtual Data Language and built by Chimera. (Chimera is developed at ANL by I. Foster, M. Wilde, and J. Voeckler.)
Workflow planning: a Request Manager coordinates Workflow Reduction, a Replica and Resource Selector, Data Publication, and a Submission and Monitoring System.
Information and models come from the Globus Replica Location Service (replica locations), the Globus Monitoring and Discovery Service (available resources), and the Transformation Catalog; dynamic information flows back from the Grid.
The resulting concrete workflow is handed to the workflow executor (DAGMan) for execution on the Grid, where the detector produces the raw data.

Page 10: Pegasus:  Planning for Execution in Grids


Example Workflow Reduction
Original abstract workflow: job d1 derives file b from file a, and job d2 derives file c from b.
If "b" already exists (as determined by a query to the RLS), the workflow can be reduced: only d2 needs to run, producing c from the existing b.
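A minimal Python sketch of that reduction step, under stated assumptions: jobs are plain dicts in topological order, the RLS is stubbed as a set of already-registered logical file names, and the a/d1/b/d2/c structure is read off the figure. A job is kept only if some file it produces is still needed and not already available.

def reduce_workflow(jobs, requested, in_rls):
    """jobs: topologically ordered dicts with "name", "inputs", "outputs".
       requested: logical files the user asked for.
       in_rls: set of logical files already registered in the RLS."""
    needed = {f for f in requested if f not in in_rls}
    kept = []
    for job in reversed(jobs):                      # walk leaves first
        if set(job["outputs"]) & needed:            # this job still has to run
            kept.append(job)
            needed |= {f for f in job["inputs"] if f not in in_rls}
    kept.reverse()
    return kept

# Figure example: d1 derives b from a, d2 derives c from b; "b" is already in the RLS.
d1 = {"name": "d1", "inputs": ["a"], "outputs": ["b"]}
d2 = {"name": "d2", "inputs": ["b"], "outputs": ["c"]}
print([j["name"] for j in reduce_workflow([d1, d2], requested=["c"], in_rls={"a", "b"})])
# ['d2']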

Page 11: Pegasus:  Planning for Execution in Grids


Mapping from Abstract to Concrete
Query the RLS, MDS, and TC; schedule computation and data movement:
Move b from A to B
Execute d2 at B
Move c from B to U
Register c in the RLS
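Continuing the sketch (the function and site names are hypothetical, not Pegasus APIs): once the RLS reports where replicas live, the Transformation Catalog says where the code is installed, and resource selection picks site B, the reduced job is wrapped with exactly the data-movement and registration tasks listed above.

def concretize(job, run_site, replicas, user_site):
    """replicas: logical file -> a site holding a copy (as reported by the RLS)."""
    tasks = []
    for f in job["inputs"]:
        src = replicas[f]
        if src != run_site:                               # stage in only if the data is elsewhere
            tasks.append(f"transfer {f} from {src} to {run_site}")
    tasks.append(f"execute {job['name']} at {run_site}")
    for f in job["outputs"]:
        tasks.append(f"transfer {f} from {run_site} to {user_site}")   # stage out to the user's site
        tasks.append(f"register {f} in the RLS")                       # publish the new data product
    return tasks

d2 = {"name": "d2", "inputs": ["b"], "outputs": ["c"]}
for task in concretize(d2, run_site="B", replicas={"b": "A"}, user_site="U"):
    print(task)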

Page 12: Pegasus:  Planning for Execution in Grids


[Diagram: simplified view of the SC 2003 portal.]
A user authenticates to the Pegasus Portal (via MyProxy) and supplies metadata through a LIGO-specific or Montage-specific interface.
The metadata is resolved through the Metadata Catalog Service and turned into VDL and an abstract workflow by Chimera.
Pegasus uses Globus MDS, Globus RLS, and the Transformation Catalog to produce the concrete workflow, which DAGMan submits as jobs to the Grid; execution records and status information flow back to the portal.

Page 13: Pegasus:  Planning for Execution in Grids


LIGO Scientific Collaboration
Continuous gravitational waves are expected to be produced by a variety of celestial objects.
Only a small fraction of potential sources are known, so blind searches are needed, scanning the regions of the sky where we have no a priori information about the presence of a source.
Wide-area, wide-frequency searches: the search is performed for potential sources of continuous periodic waves near the Galactic Center and the galactic core.
The search is very compute and data intensive.
The LSC used the occasion of SC2003 to initiate a month-long production run with science data collected during 8 weeks in the Spring of 2003.

Page 14: Pegasus:  Planning for Execution in Grids


Additional resources used: Grid3 iVDGL resources

Page 15: Pegasus:  Planning for Execution in Grids


LIGO Acknowledgements
Bruce Allen, Scott Koranda, Brian Moe, Xavier Siemens, University of Wisconsin Milwaukee, USA
Stuart Anderson, Kent Blackburn, Albert Lazzarini, Dan Kozak, Hari Pulapaka, Peter Shawhan, Caltech, USA
Steffen Grunewald, Yousuke Itoh, Maria Alessandra Papa, Albert Einstein Institute, Germany
Many others involved in the testbed
www.ligo.caltech.edu
www.lsc-group.phys.uwm.edu/lscdatagrid/
http://pandora.aei.mpg.de/merlin/
LIGO Laboratory operates under NSF cooperative agreement PHY-0107417.

Page 16: Pegasus:  Planning for Execution in Grids


Montage
Montage (NASA and NVO) delivers science-grade custom mosaics on demand.
It produces mosaics from a wide range of data sources (possibly in different spectra).
Users specify parameters of projection, coordinates, size, rotation, and spatial sampling.
Mosaic created by Pegasus-based Montage from a run of the M101 galaxy images on the TeraGrid.

Page 17: Pegasus:  Planning for Execution in Grids


Small Montage Workflow

~1200 nodes

Page 18: Pegasus:  Planning for Execution in Grids


Montage Acknowledgments
Bruce Berriman, John Good, Anastasia Laity, Caltech/IPAC
Joseph C. Jacob, Daniel S. Katz, JPL
http://montage.ipac.caltech.edu/
Testbed for Montage: Condor pools at USC/ISI and UW Madison, and TeraGrid resources at NCSA, PSC, and SDSC.
Montage is funded by the National Aeronautics and Space Administration's Earth Science Technology Office, Computational Technologies Project, under Cooperative Agreement Number NCC5-626 between NASA and the California Institute of Technology.

Page 19: Pegasus:  Planning for Execution in Grids


Other Applications Using Chimera and Pegasus
Other GriPhyN applications:
High-energy physics: ATLAS, CMS (many)
Astronomy: SDSS (Fermi Lab, ANL)
Astronomy: galaxy morphology (NCSA, JHU, Fermi, many others; NVO-funded)
Biology: BLAST (ANL, PDQ-funded)
Neuroscience: tomography (SDSC, NIH-funded)

Page 20: Pegasus:  Planning for Execution in Grids


Current System
[Diagram: in the current Pegasus, the original abstract workflow is handed to Pegasus(Abstract Workflow), which produces the concrete workflow; DAGMan(CW) then carries out workflow execution.]

Page 21: Pegasus:  Planning for Execution in Grids


Workflow Refinement and Execution
[Diagram: workflow refinement and execution over time, at decreasing levels of abstraction.]
The user's request is expressed with application-level knowledge, refined to the relevant components and then a full abstract workflow of logical tasks, and finally to tasks bound to resources and sent for execution.
Refinement proceeds while parts of the workflow execute, so at any point some of the workflow is already executed, some is in partial execution, and some is not yet executed.
A task matchmaker, workflow repair, and policy information feed the refinement process.

Page 22: Pegasus:  Planning for Execution in Grids


Incremental Refinement
Partition the abstract workflow into partial workflows PW A, PW B, and PW C.
[Diagram: a particular partitioning of the original workflow and the resulting new abstract workflow over the partitions.] One possible partitioning strategy is sketched below.
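The partitioning strategy itself is not spelled out on the slide; as one plausible (assumed) strategy, the sketch below cuts the abstract workflow level by level, so each partial workflow contains the jobs at the same depth in the DAG.

from collections import defaultdict

def partition_by_level(jobs, edges):
    """One simple partitioning strategy (level by level); the real planner may use others.
       jobs:  list of job names.
       edges: list of (parent, child) pairs."""
    parents = defaultdict(set)
    for p, c in edges:
        parents[c].add(p)

    level = {}
    def depth(j):
        if j not in level:
            level[j] = 1 + max((depth(p) for p in parents[j]), default=-1)
        return level[j]

    partitions = defaultdict(list)
    for j in jobs:
        partitions[depth(j)].append(j)
    return [partitions[k] for k in sorted(partitions)]

# Hypothetical example: two independent top-level jobs feeding a third.
print(partition_by_level(["a", "b", "c"], [("a", "c"), ("b", "c")]))
# [['a', 'b'], ['c']]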

Page 23: Pegasus:  Planning for Execution in Grids


Meta-DAGMan
For each partition X, Pegasus(X) generates the concrete workflow and the submit files for X, denoted Su(X), and DAGMan(Su(X)) executes that concrete workflow.
[Diagram: a meta-workflow in which each partition X has a Pegasus(X) node followed by a DAGMan(Su(X)) node, chained according to the dependencies among partitions A, B, and C.] A sketch of how such a meta-workflow could be generated follows.
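A sketch of how such a meta-workflow could be written out as a DAGMan input file: one planning node and one execution node per partition, wired so that a partition is planned only after the partitions it depends on have run. The helper and the submit-file names are hypothetical; only the JOB and PARENT/CHILD keywords are standard DAGMan syntax.

def meta_dag(partitions, deps):
    """partitions: e.g. ["A", "B", "C"]; deps: (earlier, later) pairs between partitions.
       plan_X runs Pegasus on partition X and produces Su(X);
       run_X is assumed to be a submit file that launches DAGMan on Su(X)."""
    lines = []
    for x in partitions:
        lines.append(f"JOB plan_{x} pegasus_{x}.sub")
        lines.append(f"JOB run_{x} dagman_su_{x}.sub")
        lines.append(f"PARENT plan_{x} CHILD run_{x}")
    for earlier, later in deps:
        lines.append(f"PARENT run_{earlier} CHILD plan_{later}")
    return "\n".join(lines)

print(meta_dag(["A", "B", "C"], [("A", "B"), ("B", "C")]))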

Page 24: Pegasus:  Planning for Execution in Grids


Future Directions
Incorporate AI-planning technologies in production software (the Virtual Data Toolkit).
Investigate various scheduling techniques.
Investigate fault tolerance issues: selecting resources based on their reliability, and responding to failures.
http://pegasus.isi.edu