Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Hong-Linh Truong and Thomas Fahringer

Distributed and Parallel Systems Group, Institute of Computer Science, University of Innsbruck

{truong,tf}@dps.uibk.ac.at

http://dps.uibk.ac.at/projects/pma

Hong-Linh Truong, Dagstuhl seminar on Automatic Performance Analysis, 15 December 2005.

With contributions from: Bartosz Balis, Marian Bubak, Peter Brunner, Schahram Dustdar, Jakub Dziwisz, Andreas Hoheisel, Tilman Linden, Radu Prodan, Francesco Nerieri, Kuba Rozkwitalski, Robert Samborski

description

The complexity and quantity of Grid workflows are so overwhelming that new performance analysis techniques are required to instrument multilingual workflow-based applications and to select, measure and analyze various performance metrics at different levels of abstraction. Moreover, Grid workflow performance analysis tools have to support Grids based on service-oriented architecture (SOA) and to address well-known issues of Grid computing such as diversity, dynamics and scalability.

In this talk, we present methods and techniques that are being applied to the performance monitoring and analysis of Grid workflows in the K-WfGrid project and the ASKALON toolkit. An ontology is used for describing performance data of Grid workflows, and performance data is associated with multiple levels of abstraction. Various instrumentation techniques are utilized to support multilingual and service-based Grid workflows. A novel overhead classification for Grid workflows is introduced, and the performance analysis is conducted in a distributed fashion. Performance monitoring and analysis components are implemented as P2P-based Grid services. All performance data, monitoring requests and analysis requests are described in XML, thus simplifying the interaction and integration between various instrumentation, monitoring and analysis services and increasing the interoperability of the performance tools.

Transcript of Performance Analysis of Grid Workflows in K-WfGrid and ASKALON


Page 2: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Outline

Motivation

Objectives and approach

Performance metrics and ontologies

Architecture of Grid monitoring and analysis

Conclusions

Page 3: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Performance Instrumentation, Monitoring, and Analysis for the Grid

Challenging task because of the dynamic nature of the Grid and its applications

• Combination of fine-grained and coarse-grained models
• Multilingual applications
• Heterogeneous and dynamic systems

Not just performance but also dependability

The focus of the talk

• Our concepts, architecture, interfaces and integration

Page 4: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

ASKALON Toolkit

Programming and execution environment for Grid workflows

• Integrates various services for scheduling, executing, monitoring and analyzing Grid workflows

Target applications

• Workflows of scientific applications (C/Fortran): material science, flooding, astrophysics, movie rendering, etc.

http://dps.uibk.ac.at/projects/askalon/

Page 5: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

The K-WfGrid Project

An FP6 EU project

www.kwfgrid.net

Focuses

• Automatic construction and reuse of workflows based on knowledge gathered through execution

Target applications

• Workflows of Grid/Web services
• Grid/Web services may invoke legacy applications
• Business (Coordinated Traffic Management, EPR) and scientific (e.g., Food simulation) applications

[Figure: the K-WfGrid cycle. Labels: Construct workflow, Execute workflow, Monitor environment, Analyze information, Capture knowledge, Reuse knowledge]

Page 6: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

The K-WfGrid Project

Many Grid users and developers we know emphasize interoperability, integration and reliability, not just performance

Page 7: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Objectives, Requirements, Approaches

Performance monitoring and analysis for Grid workflows of Web/Grid services and multilingual components

Performance and dependability metrics

Monitoring and performance analysis

• Service-oriented distributed architecture and peer-to-peer model
• Unified monitoring and performance analysis system covering infrastructure and applications
• Standardized data representations for monitoring data and events
• Adaptive and generic sensors, distributed analysis, performance bottleneck search

Page 8: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Hierarchical View of Grid Workflows

[Figure: hierarchical view of a Grid workflow. Levels shown: Workflow, Workflow Region n, Activity m, Invoked Application m, Code Region 1 ... Code Region q]

XML fragments shown in the figure:

<parallel> ... </parallel>

<activity name="mProject1"><executable name="mProject1"/></activity>

<activity name="mProject2"><executable name="mProject2"/></activity>

Java fragments shown in the figure (mProject1Service.java):

public void mProject1() { ... }

A();

while () {...}

Page 9: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Hierarchical View of Grid Workflows

Monitoring and analysis at multiple levels of abstraction

Page 10: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Workflow Execution Model (Simplified)

Workflow execution

• Spanning multiple Grid sites
• Highly inter-organizational, inter-related and dynamic

Multiple levels of job scheduling

• At workflow execution engine (part of WfMS)
• At Grid sites

[Figure label: Local scheduler]

Page 11: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Performance Metrics of Grid Workflows

Interesting performance metrics associated with multiple levels of abstraction

• Metrics can be used in workflow composition, for comparing different invoked applications of a single activity, adaptive scheduling, etc.

Five levels of abstraction

• Code region, Invoked application, Activity, Workflow Region, Workflow
• Top-down PMA: from a higher level to a lower one
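The five levels and the top-down direction of analysis can be sketched as a small tree that is searched from the workflow down to the code regions. This is an illustrative sketch only: the `Node` class, the `elapsed` field and the `drill_down` helper are hypothetical names, not part of ASKALON or K-WfGrid, and all timings are made-up sample values.

```python
from dataclasses import dataclass, field
from typing import List

# The five levels of abstraction named on the slide, from highest to lowest.
LEVELS = ["Workflow", "Workflow Region", "Activity",
          "Invoked Application", "Code Region"]


@dataclass
class Node:
    """One element of the workflow hierarchy (hypothetical structure)."""
    name: str
    level: str
    elapsed: float  # elapsed time in seconds (sample values only)
    children: List["Node"] = field(default_factory=list)


def drill_down(node: Node) -> List[str]:
    """Follow the most expensive child at each level, top-down."""
    path = [f"{node.level}: {node.name}"]
    while node.children:
        node = max(node.children, key=lambda child: child.elapsed)
        path.append(f"{node.level}: {node.name}")
    return path


# Sample hierarchy; the activity names echo the mProject1/mProject2 fragments.
workflow = Node("sample-workflow", "Workflow", 120.0, [
    Node("parallel-1", "Workflow Region", 90.0, [
        Node("mProject1", "Activity", 30.0, [
            Node("mProject1Service", "Invoked Application", 28.0, [
                Node("A()", "Code Region", 5.0),
                Node("while-loop", "Code Region", 20.0),
            ]),
        ]),
        Node("mProject2", "Activity", 85.0, [
            Node("mProject2Service", "Invoked Application", 80.0),
        ]),
    ]),
])

hot_path = drill_down(workflow)
print(" -> ".join(hot_path))
```

A real tool would choose the next node to inspect from measured metrics at each level rather than from a single elapsed-time field; the sketch only shows the direction of the search.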

Page 12: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Monitoring and Measuring Performance Metrics

Performance monitoring and analysis tools

• Operate at multiple levels
• Correlate performance metrics from multiple levels

Middleware and application instrumentation

• Instrument execution engine: the execution engine can be distributed or centralized
• Instrument applications: distributed, spanning multiple Grid sites

Challenging problems: the complexity of performance tools and data

• Integrate multiple performance monitoring tools executed on multiple Grid sites
• Integrate performance data produced by various tools

We need common concepts for performance data associated with Grid workflows

Page 13: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Utilizing Ontologies for Describing Performance Data of Grid Workflows

Metrics ontology

• Specifies which performance metrics a tool can provide
• Simplifies the access to performance metrics provided by various tools

Performance data integration

• Performance data integration based on common concepts
• High-level search and retrieval of performance data

Knowledge base of performance data of Grid workflows

• Utilized by high-level tools such as schedulers, workflow composition tools, etc.
• Used to (re)discover workflow patterns and interactions in workflows, to check correct execution, etc.

Distributed performance analysis

• Performance analysis requests can be built based on ontologies
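The idea that a shared metrics vocabulary lets a client find out which tool provides a given metric can be sketched as a small registry. The tool names, metric names and helper functions below are hypothetical, and plain Python dictionaries stand in for what would be an ontology in the real system.

```python
from typing import Dict, List, Set

# Hypothetical catalogue: each tool advertises the metrics it can provide,
# using names taken from one shared vocabulary.
TOOL_METRICS: Dict[str, Set[str]] = {
    "engine-monitor": {"ElapsedTime", "QueuingTime"},
    "code-profiler": {"ElapsedTime", "L2CacheMisses"},
    "network-sensor": {"Bandwidth", "Latency"},
}


def providers_of(metric: str) -> List[str]:
    """Return the tools that advertise the given metric, sorted by name."""
    return sorted(tool for tool, metrics in TOOL_METRICS.items()
                  if metric in metrics)


def unsupported(requested: Set[str]) -> Set[str]:
    """Metrics in a request that no registered tool can provide."""
    available = set().union(*TOOL_METRICS.values())
    return requested - available


print(providers_of("ElapsedTime"))
print(unsupported({"Bandwidth", "EnergyUse"}))
```

Because every tool uses the same metric names, a request can be checked and routed without knowing anything tool-specific, which is the integration benefit the slide points at.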

Page 14: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Workflow Performance Ontology

WfPerfOnto (Ontology describing Performance data of Grid Workflows)

Specifies performance metrics

Basic concepts

• Concepts reflect the hierarchical structure of a workflow
• Static and dynamic workflow performance data

Relationships

• Static and dynamic relationships among concepts

Page 15: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

WfPerfOnto

Page 16: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Monitoring and Analysis Scenario

Page 17: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Components for Grid PMA

Three main parts of a unified performance monitoring, instrumentation and analysis system

• Monitoring and instrumentation services
• Performance analysis services
• Performance service interfaces and data representations

We must reuse existing tools and techniques as much as possible

Integration model

• Loosely coupled: among Grid sites/organizations, utilizing SOA for performance tools
• Tightly coupled: within services deployed in a single Grid site/organization, interfacing to existing (parallel) performance tools

Page 18: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

K-WfGrid Monitoring and Analysis Architecture

Page 19: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Self-Managing Sensor-Based Middleware

Integrating diverse types of sensors into a single system

• Event-driven and demand-driven sensors for system and application monitoring, rule-based monitoring

Self-managing services

• Service-based operations and TCP-based stream data delivery
• Peer-to-peer Grid services for the monitoring middleware

Query and subscription of monitoring data

• Data query and subscription
• Group-based data query and subscription, and notification
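The difference between querying monitoring data on demand and subscribing to it can be shown with a minimal in-process sketch. The `SensorManager` class, its method names and the sensor names are hypothetical and only stand in for the Grid services of the real middleware; no networking, TCP streaming or peer-to-peer lookup is shown.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Callback = Callable[[str, float], None]


class SensorManager:
    """Hypothetical in-process stand-in for a monitoring service."""

    def __init__(self) -> None:
        self._latest: Dict[str, float] = {}
        self._subscribers: Dict[str, List[Callback]] = defaultdict(list)

    def publish(self, sensor: str, value: float) -> None:
        """Called by a sensor: store the value and notify subscribers."""
        self._latest[sensor] = value
        for callback in self._subscribers[sensor]:
            callback(sensor, value)

    def query(self, sensor: str) -> float:
        """Demand-driven access: return the latest known value."""
        return self._latest[sensor]

    def subscribe(self, sensors: List[str], callback: Callback) -> None:
        """Group-based subscription: one callback for several sensors."""
        for sensor in sensors:
            self._subscribers[sensor].append(callback)


manager = SensorManager()
received = []
manager.subscribe(["cpu.load", "net.bandwidth"],
                  lambda sensor, value: received.append((sensor, value)))

manager.publish("cpu.load", 0.75)
manager.publish("net.bandwidth", 4.2)
manager.publish("disk.free", 120.0)  # nobody subscribed; query only

print(received)
print(manager.query("disk.free"))
```

Subscribers are notified as soon as a value is published, while `query` pulls the last stored value, which mirrors the event-driven and demand-driven styles named on the slide.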

Page 20: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Self-Managing Sensor-Based Monitoring Middleware

Page 21: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Workflow Instrumentation

Issues to be addressed

• Multiple levels of instrumentation
• Instrumentation of multilingual applications (C/Java/Fortran)
• Must be dynamic (enabled) instrumentation

ASKALON and K-WfGrid approach

• Utilizing existing instrumentation techniques
• OCM-G (Roland Wismueller and Marian Bubak): dynamically enabled instrumentation for C/Fortran
• Dyninst (Barton Miller, Jeff Hollingsworth): for binary code generated from C/Fortran
• Source/dynamic instrumentation of Java (from Java 1.5)
• Using APART standardized intermediate representation (SIR)
• XML-based request for controlling instrumentation

Page 22: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

DIPAS: Distributed Performance Analysis

[Figure: DIPAS architecture. Component labels: DIPAS Portal, DIPAS Gateway, Grid analysis agents, Monitoring Service, GWES (Execution Service), GOM (Ontology Memory), Knowledge Builder Agent]

Grid analysis agent accepts WARL requests and returns performance metrics described in XML as a tree of metrics

WARL (workflow analysis request language): based on concepts and properties in WfPerfOnto
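The WARL syntax itself is not shown on the slide. To illustrate only the shape of the exchange (an XML request in, a tree of metrics out), the sketch below builds a made-up request and walks a made-up metric tree with Python's standard `xml.etree.ElementTree`. Every element, attribute and metric name here (`warlRequest`, `concept`, `metric`, `ElapsedTime`, ...) is an assumption, not the actual WARL or DIPAS schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical request: ask for two metrics of one activity.
# Element and attribute names are invented for illustration.
request = ET.Element("warlRequest")
target = ET.SubElement(request, "concept",
                       {"type": "Activity", "name": "mProject1"})
for metric_name in ("ElapsedTime", "QueuingTime"):
    ET.SubElement(target, "metric", {"name": metric_name})
request_xml = ET.tostring(request, encoding="unicode")

# Hypothetical response: metrics returned as a tree (sample values).
response_xml = """
<metrics concept="mProject1">
  <metric name="ElapsedTime" value="30.0">
    <metric name="QueuingTime" value="4.5"/>
    <metric name="ProcessingTime" value="25.5"/>
  </metric>
</metrics>
"""


def flatten(element, prefix=""):
    """Walk the metric tree and return {path: value} for every metric."""
    result = {}
    for child in element.findall("metric"):
        path = f"{prefix}/{child.get('name')}"
        result[path] = float(child.get("value"))
        result.update(flatten(child, path))
    return result


metrics = flatten(ET.fromstring(response_xml))
print(request_xml)
print(metrics)
```

Keeping both directions in XML means any client that can build and parse such documents can talk to an analysis agent, which is the interoperability argument made in the talk.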

Page 23: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Workflow Overhead Classification

Middleware

• Scheduler
• Resource Manager
• Execution management: control of parallelism, loss of parallelism

Loss of parallelism

• Load imbalance
• Serialization
• Replicated job

Data transfer

Activity

• Parallel processing overheads
• External load

[Figure: tree of temporal overheads with the branches middleware, loss of parallelism, data transfer and activity. Node labels: scheduling, execution management, security, optimization algorithm, performance prediction, resource brokerage, job submission, submission decision policy, queue, queue waiting, queue residency, data transfer imbalance, access to database, third party data transfer, parallel overheads, external load, rescheduling, control of parallelism, fork construct, barrier, job preparation, service latency, restart, job failure, input from user, unidentified overhead, GRAM submission, GRAM polling, file staging, stage in, stage out, load imbalance, serialisation, archive/compression of data, extract/decompression, resource broker, scheduler, job cancel, replicated job]
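One way to work with such a classification is to store measured overheads in a tree and sum them per category. The sketch below uses only category names that appear on this slide; the exact nesting of the dictionary, the helper name and all timing values are illustrative assumptions. Treating total overhead as measured time minus an ideal time is one common convention and is not necessarily the authors' definition.

```python
from typing import Dict, Union

OverheadTree = Dict[str, Union[float, "OverheadTree"]]

# Sample measurements in seconds (made-up values), grouped by the
# top-level categories listed on the slide.
overheads: OverheadTree = {
    "middleware": {
        "scheduler": 2.0,
        "resource manager": 1.5,
        "execution management": 3.0,
    },
    "loss of parallelism": {
        "load imbalance": 4.0,
        "serialization": 1.0,
        "replicated job": 0.5,
    },
    "data transfer": 6.0,
    "activity": {
        "parallel processing overheads": 2.5,
        "external load": 1.5,
    },
}


def total(tree: Union[float, OverheadTree]) -> float:
    """Sum all leaf values below a node of the classification."""
    if isinstance(tree, dict):
        return sum(total(child) for child in tree.values())
    return float(tree)


measured_time = 100.0  # observed workflow execution time (sample value)
ideal_time = 75.0      # assumed ideal time without overheads (sample value)

per_category = {name: total(sub) for name, sub in overheads.items()}
identified = total(overheads)
# Whatever the classification cannot explain is reported separately,
# mirroring the "unidentified overhead" label in the figure.
unidentified = (measured_time - ideal_time) - identified

print(per_category)
print(identified, unidentified)
```

With these sample numbers the identified overheads add up to 22.0 seconds, leaving 3.0 seconds of the 25.0-second gap unexplained.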

Page 24: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Current status of the implementation

SOA-based

• Monitoring, Instrumentation and Analysis services are GT4 based
• XML-based for performance data representations and requests

Monitoring and Instrumentation Services

• gSOAP, GSI-based dynamic instrumentation service
• Java-based dynamic instrumentation
• OCM-G
• But we need to integrate them into a single framework for workflows, and it is a non-trivial task

Analysis services

• GT4-based with distributed components
• Simple language for workflow analysis request (WARL), designed based on WfPerfOnto
• Metrics are described in XML

Page 25: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Example: Dynamic Instrumentation

Page 26: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Snapshot: Online System Monitoring

Page 27: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Rule-based Monitoring

Sensors use rules to analyze monitoring data

• Rules are based on IBM ABLE toolkit

Example:

• A network path in the Austrian Grid, bandwidth < 5MB/s (based on Iperf)
• Define a fuzzy variable with VERY LOW, LOW, MEDIUM, HIGH, VERY HIGH
• Fuzzy rules: events are sent when bandwidth is VERY LOW or VERY HIGH
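The fuzzy rule in this example can be illustrated without the ABLE toolkit. The following plain-Python sketch does not use or imitate the IBM ABLE API; the triangular membership functions, their breakpoints (in MB/s, chosen to cover the 0 to 5 MB/s range of the example) and the 0.5 firing threshold are all assumptions made for illustration.

```python
from typing import Dict, Optional


def triangle(x: float, left: float, peak: float, right: float) -> float:
    """Triangular membership function with value 1.0 at `peak`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy sets for bandwidth in MB/s: (left, peak, right).
BANDWIDTH_SETS = {
    "VERY LOW": (-1.0, 0.0, 1.5),
    "LOW": (0.5, 1.5, 2.5),
    "MEDIUM": (1.5, 2.5, 3.5),
    "HIGH": (2.5, 3.5, 4.5),
    "VERY HIGH": (3.5, 5.0, 6.0),
}


def fuzzify(bandwidth: float) -> Dict[str, float]:
    """Degree of membership of a measurement in each fuzzy set."""
    return {name: triangle(bandwidth, *shape)
            for name, shape in BANDWIDTH_SETS.items()}


def check(bandwidth: float, threshold: float = 0.5) -> Optional[str]:
    """Return an event name if the rule fires, otherwise None."""
    degrees = fuzzify(bandwidth)
    for term in ("VERY LOW", "VERY HIGH"):
        if degrees[term] >= threshold:
            return f"bandwidth {term}"
    return None


for sample in (0.3, 2.5, 4.8):
    print(sample, check(sample))
```

Only the two extreme terms trigger an event, so a sensor applying such a rule stays silent while the measured bandwidth is in the LOW to HIGH range.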

Page 28: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Online Infrastructure Monitoring

Page 29: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

DAG-based workflow monitoring and analysis

[Figure panels: Monitoring data of workflow activity execution; Online application profiling; Monitoring data of computational node]

Page 30: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

K-WfGrid: Online Workflow Monitoring and Analysis

Page 31: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Online Workflow Analysis

Page 32: Performance Analysis of Grid Workflows in K-WfGrid and ASKALON

Conclusions

The architecture of monitoring and analysis service must tackle the dynamics and the diversity of the Grid

• Service-oriented, peer-to-peer model, adaptive sensors

Integration and reuse are important issues

• Loosely coupled and tightly coupled
• Do not neglect data representations and service interfaces

Performance metrics and ontology for Grid workflows

• What performance metrics are important and how to measure them
• Common and generic concepts and relationships monitored and analyzed

Given well-defined service interfaces, data representation, performance metrics and ontology

• Simplify the integration among components
• Towards automatic, intelligent and distributed performance analysis

UIBK perf. work: http://dps.uibk.ac.at/projects/pma/

Papers: http://dps.uibk.ac.at/index.pl/publications