P-GRADE Portal tutorial at EGEE'09. Gergely Sipos, MTA SZTAKI. EGEE Training and Induction.


Page 1: P-GRADE Portal tutorial at EGEE'09

www.lpds.sztaki.hu/gasuc
www.portal.p-grade.hu

Gergely Sipos
MTA SZTAKI

[email protected]

EGEE Training and Induction
EGEE Application Porting Support

Page 2: Agenda of the morning

• Introduction to workflow concept
• Workflow hands-on

~ Break

• Parameter studies
• Parameter study hands-on

• Further information and next steps

Page 3: Workflow

The automation of a business process, in whole or part, during which documents, information or tasks are passed from one participant to another for action, according to a set of procedural rules to achieve, or contribute to, an overall business goal.

• A workflow management system (WFMS) is the software that performs this automation

Source: Workflow Reference Model, 19/11/1998, www.wfmc.org

Page 4: Why use workflows in Grid?

• Build distributed applications through orchestration of multiple services

• A single job or a single service is good for nothing…

• Integration of multiple teams involved
• Collaborative work

• Unit of reuse
• (E-)science requires traceable, repeatable analysis

• (Typically) ease of use of grids
• Graphical representation

Page 5: Grid WFMS

Source: Jia Yu and Rajkumar Buyya: A Taxonomy of Workflow Management Systems for Grid Computing, Journal of Grid Computing, Volume 3, Numbers 3-4 / September, 2005

Page 6: (Some of the) available grid workflow systems

http://www.gridworkflow.org

Categories for
– Composition tools
– Description languages
  • Scientific
  • Industrial
  • Formalism
– Engines

Some relevant tools for ARC, gLite, Globus, UNICORE grid users

• Condor DAGMan
  – Used as an enactor in P-GRADE Portal, Pegasus, …
  – Uses DAGMan WF language (DAG = Directed Acyclic Graph)

• MOTEUR
  – Interfaced with “pilot job” framework on EGEE (pull style job execution)
  – Uses SCUFL WF language

• gLite WMS
  – Describe workflows in JDL
  – Share Input-Output sandboxes with multiple jobs

• Taverna
  – Mainly for cluster computing
  – ARC interface is available from Lübeck University

• …

Page 7: P-GRADE Portal

A Grid WFMS

www.portal.p-grade.hu

Page 8: Short History of P-GRADE portal

• Parallel Grid Application Development Environment

• Initial development started in the Hungarian SuperComputing Grid project in 2003

• It has been continuously developed since 2003
• Around 30 man-years of development + training + user support

• Detailed information: http://portal.p-grade.hu/
• Open Source community development since January 2008: https://sourceforge.net/projects/pgportal/

• Current version: 2.8

Page 9: Current P-GRADE Portal related projects

• GGF GIN (Since 2006)
  – Providing the GIN Resource Testing portal

• EU EGEE-II, EGEE-III (2006-2010)
  – Tool recommended for application development
  – Intensively used in new users’ training

• EU SEE-GRID-SCI (2008-2010)
  – Interfacing to DSpace-based workflow storage
  – Infrastructure testing workflows

• EU CancerGrid (2007-2009)
  – Development of new generation P-GRADE (gUSE and WS-PGRADE)
  – Integration with desktop grids

• EU EDGeS (2008-2009)
  – Transparent access to Desktop Grid systems

Page 10: Portal installations

P-GRADE Portal services:
– SEE-GRID infrastructure
– Several VOs of EGEE:
  • Biomed, Astronomy, Central European, NA4, ...
– GILDA: Training VO of EGEE
– Many national Grids (UK National Grid Service, HunGrid, Turkish Grid, etc.)
– US Open Science Grid, TeraGrid
– OGF Grid Interoperability Now (GIN) VO
– …

Portal services and account request: http://portal.p-grade.hu/index.php?m=3&s=0
Account request form on portal login page

Page 11: Multi-Grid portal installation: www.lpds.sztaki.hu/multi-grid

Page 12: Design principles of P-GRADE portal

• P-GRADE Portal is not only a user interface, it is a
  – General purpose
  – Workflow-level
  – Multi-Grid
  – Application Development and Execution Environment

• P-GRADE Portal includes a high-level middleware layer for orchestrating jobs on grid resources
  – inside a grid
  – among several different grids (and several VOs)

• P-GRADE Portal is grid-neutral:
  – Unlike many existing grid portals it is not tailored to any particular grid type
  – Can be connected to various grids based on different grid middleware
    • LCG-2, gLite, GT2, GT4, ARC, Unicore, etc.
  – Implements the high-level grid middleware services on top of the existing grid middleware services
  – The workflow interface is the same no matter which type of grid is connected to it

Page 13: What is a P-GRADE Portal workflow?

• A directed acyclic graph where
  – Nodes represent jobs (batch programs to be executed on a computing element)
  – Ports represent input/output files the jobs expect/produce
  – Arcs represent file transfer operations

• Semantics of the workflow:
  – A job can be executed if all of its input files are available
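The execution rule above (a job may run once all of its input files are available) can be sketched in a few lines of Python. This is only an illustration of the rule, not the portal's or DAGMan's implementation; the function name `execution_waves`, the job names, and the file names are all hypothetical.

```python
def execution_waves(jobs, initial_files):
    """Group jobs into waves; a job is ready once all its input files exist.

    jobs maps a job name to (input_files, output_files). The names and the
    data layout are illustrative, not the portal's internal format.
    """
    available = set(initial_files)
    pending = dict(jobs)
    waves = []
    while pending:
        # Jobs whose inputs are all available can run in parallel.
        ready = sorted(
            name for name, (inputs, _outputs) in pending.items()
            if set(inputs) <= available
        )
        if not ready:
            raise ValueError("no runnable job: an input is missing or the graph has a cycle")
        for name in ready:
            _inputs, outputs = pending.pop(name)
            available.update(outputs)  # outputs become inputs for later jobs
        waves.append(ready)
    return waves


# Hypothetical four-job workflow: one split, two parallel branches, one merge.
example_jobs = {
    "split": (["input.dat"], ["part_a", "part_b"]),
    "left": (["part_a"], ["result_a"]),
    "right": (["part_b"], ["result_b"]),
    "merge": (["result_a", "result_b"], ["final.dat"]),
}
print(execution_waves(example_jobs, ["input.dat"]))
```

Jobs that land in the same wave have no file dependency on each other, which is where workflow-level (branch) parallelism comes from.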

Page 14: Three levels of parallelism

– PS workflow level: Parameter study execution of the workflow. Multiple instances of the same workflow process different data files.

– Workflow level: Parallel execution among workflow nodes (WF branch parallelism). Multiple jobs run in parallel.

– Job level: Parallel execution inside a workflow node (MPI job as workflow component). Each job can be a parallel program.

Page 15: Example: Computational Chemistry

Department of Chemistry, University of Perugia

Solution of the Schrödinger equation for triatomic systems using a time-dependent (RWAVEPR) or time-independent (ABC) method

• Sequential Fortran 90
• A single execution can take between 5 and 10 hours
• Many simulations at the same time: ~100 independent jobs to run

Full story: EGEE Grid Application Porting Support - http://www.lpds.sztaki.hu/gasuc/index.php?m=7&s=3

Page 16: Typical user scenario: Job compilation phase

[Diagram: Client, Portal server, Grid services. Steps: UPLOAD JOB SOURCE(S); COMPILE – EDIT; DOWNLOAD BINARY(IES)]

Page 17: Typical user scenario: Workflow development phase

[Diagram: Client, Portal server, Grid services, DSpace WF repository. Steps: START EDITOR; OPEN & EDIT WORKFLOW; IMPORT WORKFLOW; ADD BINARIES; SAVE WORKFLOW]

Page 18: Typical user scenarios: Workflow execution phase

[Diagram: Client, Portal server, MyProxy certificate servers, Grid services. Steps: DOWNLOAD PROXY CERTIFICATES; TRANSFER FILES, SUBMIT JOBS; MONITOR JOBS; VISUALIZE JOBS and WORKFLOW PROGRESS; DOWNLOAD (SMALL) RESULTS]

Page 19: Accessing local and remote files

[Diagram: Portal server, Grid services, Computing elements, Storage elements and File catalogs. Data flows: LOCAL INPUT FILES & EXECUTABLES; LOCAL OUTPUT FILES; REMOTE INPUT FILES; REMOTE OUTPUT FILES. Note on the diagram: Only the permanent files!]

Use legacy executables with Grid files without touching the code

Page 20: P-GRADE Portal structural overview

[Diagram components: Web browser; Java Webstart workflow editor; Extended DAGMan WF specification; Extended DAGMan; Globus and gLite command line clients + scripts; Globus GIIS, gLite BDII; DSpace repository; EGEE, Globus (and ARC) Grid services + MyProxy service (gLite WMS, LFC,…; Globus GRAM, …)]

Page 21: Web interface - Portlets

Page 22: Email notifications

NOTIFY

Page 23: Workflow portlet

WORKFLOW EDITOR

Page 24: Graphical workflow editing

• To define a graph:
  1. Drag & drop components: jobs and ports
  2. Define their properties
  3. Connect ports by channels (no cycles, no loops)

System generates JDL for each job automatically
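To give a feel for what a generated job description might look like, the sketch below renders a minimal gLite-style JDL text from a job's properties. It is not the portal's actual generator: the function `job_to_jdl`, the file names, and the choice of attributes are assumptions made for illustration; only the attribute names themselves (`Executable`, `Arguments`, `StdOutput`, `StdError`, `InputSandbox`, `OutputSandbox`) are standard JDL.

```python
def jdl_list(items):
    """Render a Python list as a JDL list of quoted strings."""
    return "{" + ", ".join(f'"{item}"' for item in items) + "}"


def job_to_jdl(executable, arguments, input_files, output_files):
    """Build a minimal JDL description for one workflow job (illustrative only)."""
    lines = [
        f'Executable = "{executable}";',
        f'Arguments = "{arguments}";',
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        # The executable and local inputs travel with the job in the input sandbox.
        f"InputSandbox = {jdl_list([executable] + list(input_files))};",
        # Standard streams and declared outputs come back in the output sandbox.
        f"OutputSandbox = {jdl_list(['std.out', 'std.err'] + list(output_files))};",
    ]
    return "\n".join(lines)


# Hypothetical job: a script reading file.in and writing file.out.
example_jdl = job_to_jdl("simulate.sh", "file.in", ["file.in"], ["file.out"])
print(example_jdl)
```

The point of the automatic generation is that the user only fills in job and port properties in the editor and never writes such a file by hand.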

Page 25: Workflow Editor: Properties of a job

Properties of a job:
• Executable file
• Type of executable (Sequential / Parallel)
• Command line parameters
• Which resource to use?
  • Which VO?
  • Broker or Computing element?

Page 26: Workflow Editor: Defining input-output files

File properties
• Type:
  – input: the executable reads
  – output: the executable generates
• File type:
  – local: comes from my desktop
  – remote: comes from an SE
• File: location of the file
• Internal file name: Executable uses this, e.g. fopen(“file.in”, …)
• File storage type (output files only):
  – Permanent: final result
  – Volatile: temp. data channel

Page 27: How to refer to an I/O file?

Local file
• Client side location:
  – Input file: c:\experiments\11-04.dat
  – Output file: result.dat

Remote file
• LFC logical file name (LFC file catalog is required – EGEE VOs):
  – Input file: lfn:/grid/gilda/sipos/11-04.dat
  – Output file: lfn:/grid/gilda/sipos/11-04_-_result.dat
• GridFTP address (in Globus Grids):
  – Input file: gsiftp://somengshost.ac.uk/mydir/11-04.dat
  – Output file: gsiftp://somengshost.ac.uk/mydir/result.dat
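The three reference styles on this slide can be told apart by their prefix. The following sketch shows one way to classify a reference string; the function name `classify_file_reference` and the returned labels are hypothetical, and the portal itself may distinguish them differently (for example through the "File type" setting in the editor).

```python
def classify_file_reference(reference):
    """Classify an I/O file reference by its prefix (illustrative only)."""
    if reference.startswith("lfn:"):
        return "remote: LFC logical file name"
    if reference.startswith("gsiftp://"):
        return "remote: GridFTP address"
    # Anything else is treated as a path on the user's own machine.
    return "local: client side location"


# The example references shown on the slide.
for ref in (
    r"c:\experiments\11-04.dat",
    "lfn:/grid/gilda/sipos/11-04.dat",
    "gsiftp://somengshost.ac.uk/mydir/11-04.dat",
):
    print(ref, "->", classify_file_reference(ref))
```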

Page 28: Upload a workflow from client side or from FTP server

UPLOAD

STORED on FTP server

Page 29: Importing an application

INCOMPLETE WORKFLOW: Open it in editor and save it again

Page 30: Import a workflow from DSpace repository

Page 31: External access to DSpace: http://pgrade-dspace.sztaki.hu

Page 32: Certificate and proxy management Portlet

Page 33: OGF GIN interoperability portal by P-GRADE

Accessing Globus, gLite and ARC based grids/VOs simultaneously

[Diagram: P-GRADE GEMLCA Portal; GEMLCA Repository; P-GRADE portal; Proxy 1 to Proxy 6]

Page 34: Application execution

Page 35: Fault-tolerant execution

• Utilizing
  – Condor DAGMan’s rescue mechanism
  – EGEE job resubmission mechanism of WMS

• If the EGEE broker leaves a job stuck in a CE’s queue, the portal automatically
  – kills the job on this site and
  – resubmits the job to the broker, prohibiting this site.

• As a result
  – the portal guarantees the correct submission of a job as long as there exists at least one matching resource
  – job submission is reliable even in an unreliable grid
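The kill-and-resubmit loop described above can be sketched as follows. This is a simplified model under stated assumptions, not the portal's code: `broker` and `is_stuck` are hypothetical callables standing in for the EGEE broker and the portal's stuck-job detection, and killing the job is only indicated by a comment.

```python
def submit_with_resubmission(job, sites, broker, is_stuck):
    """Resubmit a job, excluding every site where it got stuck (illustrative).

    broker(job, candidates) returns the site chosen for the job;
    is_stuck(job, site) reports whether the job is stuck in that site's queue.
    Both are hypothetical stand-ins for the real grid services.
    """
    prohibited = set()
    while True:
        candidates = [site for site in sites if site not in prohibited]
        if not candidates:
            raise RuntimeError("no matching resource left for " + job)
        site = broker(job, candidates)
        if not is_stuck(job, site):
            return site, prohibited
        # Here the real system would kill the job on this site.
        prohibited.add(site)


# Toy setup: the broker always picks the first candidate; two sites are broken.
broken_sites = {"ce1.example.org", "ce2.example.org"}
all_sites = ["ce1.example.org", "ce2.example.org", "ce3.example.org"]
chosen, excluded = submit_with_resubmission(
    "job-42",
    all_sites,
    broker=lambda job, candidates: candidates[0],
    is_stuck=lambda job, site: site in broken_sites,
)
print(chosen, sorted(excluded))
```

Because a site is never retried once prohibited, the loop terminates after at most one attempt per site, which matches the slide's guarantee of success as long as at least one matching resource works.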

Page 36: Information system visualization

Page 37: LFC-SE file browser portlet

Page 38: Compilation support

Page 39: WORKFLOW HANDS-ON

Page 40: From workflows to parameter studies

Advanced execution patterns

Page 41: Scaling up a workflow to a parameter study

Complete workflow

• P-GRADE Portal: Files in the same LFC catalog (e.g. /grid/gilda/sipos/myinputs)
• P-GRADE Portal: Results produced in the same catalog

Page 42: Advanced parameter studies

• Generator component(s): take the initial input data and generate input, or cut input into smaller pieces
• Complete workflow
• Collector component(s): aggregate result

• P-GRADE Portal: Files in the same LFC catalog (e.g. /grid/gilda/sipos/myinputs)
• P-GRADE Portal: Results produced in the same catalog

Page 43: Concept of parameter study workflows

[Diagram: a GEN job, followed by several SEQ jobs, followed by a COLL job]

• Generator part generates the input parameter space
• Parameter study part
• Collector part evaluates and integrates the results

Page 44: Turning a WF into a parameter study

By switching at least one of the open input ports into a “PS Input port” the WF is turned into a Parameter Study

Page 45: Input-output files are stored in SEs

/grid/gilda/sipos/InputImages: Image.0, Image.1

/grid/gilda/sipos/XCoordinates: XCoordinate.0, XCoordinate.1

/grid/gilda/sipos/YCoordinates: YCoordinate.0, YCoordinate.1

/grid/gilda/sipos/Output: ImagePart.0, ImagePart.1, . . .

CROSS PRODUCT of data items: 2 x 2 x 2 = 8 executions of the whole workflow
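The count of 8 follows directly from taking the cross product of the three input directories. A short Python sketch with the file names from the slide makes this concrete; the variable names are illustrative, and how the portal numbers the resulting `ImagePart` files is not shown here because the slide does not specify it.

```python
from itertools import product

# Contents of the three PS input directories, as listed on the slide.
input_images = ["Image.0", "Image.1"]
x_coordinates = ["XCoordinate.0", "XCoordinate.1"]
y_coordinates = ["YCoordinate.0", "YCoordinate.1"]

# One workflow execution per combination of one file from each directory.
workflow_runs = list(product(input_images, x_coordinates, y_coordinates))

for run in workflow_runs:
    print(run)
print(len(workflow_runs), "executions of the whole workflow")
```

Adding a third file to any one directory would raise the count to 3 x 2 x 2 = 12, so the number of executions grows multiplicatively with the size of each input set.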

Page 46: Typical data-flow compositions

Inputs in each case: {A1, A2, A3} and {B1, B2, B3}

• CROSS ITERATOR (A X B): all-to-all
• DOT ITERATOR: one-to-one
• MATCH ITERATOR (A M B): Ai and Bj are combined if Ai and Bj have a common ancestor

Find these in e.g. TAVERNA, MOTEUR
P-GRADE Portal supports this
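The three iteration strategies can be compared side by side in Python. This is a conceptual sketch only: the function names and the `ancestors` mapping (which data item was derived from which source) are hypothetical, and the real workflow engines track provenance in their own ways.

```python
from itertools import product


def dot_iterator(a_items, b_items):
    """One-to-one: pair items by position."""
    return list(zip(a_items, b_items))


def cross_iterator(a_items, b_items):
    """All-to-all: every A item with every B item."""
    return list(product(a_items, b_items))


def match_iterator(a_items, b_items, ancestors):
    """Pair Ai with Bj only if they share at least one ancestor."""
    return [
        (a, b)
        for a, b in product(a_items, b_items)
        if ancestors.get(a, set()) & ancestors.get(b, set())
    ]


a_set = ["A1", "A2", "A3"]
b_set = ["B1", "B2", "B3"]
# Hypothetical provenance: A1 and B2 derive from X, A2 and B1 from Y.
provenance = {
    "A1": {"X"}, "B2": {"X"},
    "A2": {"Y"}, "B1": {"Y"},
    "A3": {"Z"}, "B3": {"W"},
}

print(dot_iterator(a_set, b_set))
print(len(cross_iterator(a_set, b_set)))
print(match_iterator(a_set, b_set, provenance))
```

With three items on each side, the dot iterator yields 3 workflow runs and the cross iterator 9, while the match iterator depends entirely on the provenance of the data.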

Page 47: PS Input Port

Grid Directory instead of FILE reference

Page 48: Parameter generator

Generator can be attached to any parameter input port

Generator can be
• Auto generator: to generate text files
• Custom generator: to generate any content

Generated files are moved into SE by the portal

Page 49: Definition Window of Auto Generator Job

User defines the template of the text file

User puts key(s) into the template

User defines values for the key(s)
• Integer number
• Real number
• Custom set
• …
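The template-and-keys idea can be illustrated with Python's `string.Template`. This is a sketch of the concept, not the Auto Generator itself: the `$KEY` placeholder syntax is Python's, not necessarily the portal's, and combining several keys by taking all combinations of their values is an assumption made here for illustration.

```python
from itertools import product
from string import Template


def generate_inputs(template_text, key_values):
    """Produce one text file body per combination of key values (illustrative).

    template_text uses Python's $KEY placeholders; key_values maps each key
    to the list of values the user defined for it.
    """
    template = Template(template_text)
    keys = list(key_values)
    files = []
    for combination in product(*(key_values[key] for key in keys)):
        files.append(template.substitute(dict(zip(keys, combination))))
    return files


# Hypothetical template with an integer key T and a real-valued key P.
generated = generate_inputs(
    "temperature = $T\npressure = $P\n",
    {"T": [300, 350], "P": [1.0, 2.0]},
)
for body in generated:
    print(body)
```

Each generated body would correspond to one input file that the portal then moves into a Storage Element for the parameter study.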

Page 50: Placement of result

Page 51: Placement of result

• Will contain one compressed file for each execution of the workflow.
• Use the default value!
• Choose a “reliable” Storage Element

Page 52: Executing PS workflows

PS Details for parameter sweep workflow applications

Page 53: Detailed view of a PS workflow

• Overall statistics of workflow instances
• Generator job(s)
• Workflow instances
• Collector job(s)

Page 54: PARAMETER STUDY HANDS-ON

Page 55: Thank you!

[email protected]

Learn once, use everywhere
Develop once, execute anywhere

Page 56: Backup slides to answer questions

Page 57: Proxy delegations

[Diagram components: MyProxy server; P-GRADE Portal server; VOMS server; GILDA services. Labels: username/password (login & psw based authentication); Proxy; Proxy with VOMS ext. (proxy based authentication)]

Page 58: Settings

Portal administrator can
– connect the portal to several grids
– register default resources of the connected grids

Page 59: Settings

User can customize the connected grids by adding and removing resources