Scheduling Parameter Sweep Workflow in the Grid

Faculty of Information Technology
Monash University
Sucha Smanchat

Supervisors: Dr Maria Indrawan and Dr Chris (Sea) Ling


2

Content
• Background & Motivation
• Grid Workflow Scheduling
• Objective & Definitions
• The Proposed Algorithm
• Scheduler Implementation
• Simulation Results
• Conclusion & Future Work

3

Background

• The Grid has become an important infrastructure to support e-Science due to its ability to provide a high-performance computing environment.

• Workflow technology is used to automate scientific processes which are then executed by scientific workflow management systems e.g. Kepler [1].

• Scientific workflows used in e-Science usually require high computation power which the Grid infrastructure can provide.

4

Background

• To execute a Grid workflow, tasks need to be scheduled onto Grid resources.

• Scheduling workflow tasks in the Grid is necessary to improve performance.
  – Makespan (overall execution time)
  – Cost
  – Resource Utilisation
  – Reliability
  – Security

5

Motivation

• Parameter Sweep Workflows
  – Workflows for parametric studies
  – Are executed repeatedly with varying input parameters to study or optimise those parameters
  – Execution
    > One by one / Loop
    > Parallel using Nimrod/K [3]
  – E.g. Quantum Chemical Calculations experiment [2]

6

Motivation

• Executing multiple parameter sweep workflow instances in parallel raises new scheduling challenges.
  – Every instance is derived from the same workflow definition
  – This causes resource competition among concurrent task instances of the same definition
  – The scheduling algorithm needs to manage multiple workflow instances

7

Nimrod/K
• A tool for parametric studies being developed by MeSsAGE Lab, Monash University [3]
• Built on top of Kepler [1]
• Allows workflows in Kepler to execute in the Grid/Cloud via the Nimrod toolset.
• Features the ability to clone a workflow into multiple copies (instances) and execute them in parallel.
• Also provides many different parameter tools for various e-Research experiments
  – sweeping, optimising and experimental design techniques
http://www.messagelab.monash.edu.au/Nimrod

8

Existing Grid Workflow Scheduling Algorithms
• Batch Mode
• Dependency Mode
• Meta-heuristics based scheduling
• Most algorithms are divided into two phases
  – Task Prioritising Phase
  – Resource Selection Phase

Classification based on Yu et al. [4]

9

Examples of Batch Mode Algorithms

• Min-Min, Max-Min, Sufferage [5]
• XSufferage [6]
• QoS-guided Min-Min [7]
• Selective Min-Min Max-Min [8]
• Min-Min Average [9]
  – Supports multiple instances but does not consider resource competition.

10

Examples of Dependency Mode Algorithms
• HEFT [10]
• Hybrid HEFT [11]
• SDC [12]
  – Tasks with “scarce capable resources” (the tasks that fewer resources are able to execute) are given higher priority (rank)

11

What is missing?
• Most existing Grid workflow scheduling algorithms do not consider the dependencies of the tasks on resources, which leads to a resource competition problem
• They are designed mainly for Directed Acyclic Graphs (DAGs) and rely on loop unrolling to handle loop structures.
  – Need to know (or predict) the number of loop iterations

12

Scheduling Objective

• Given a Grid workflow, our objective is to schedule tasks across multiple instances of the Grid workflow to Grid resources, based on resource scarcity and resource competition, so as to minimise the makespan of the entire execution.

13

Parameter Sweep Workflow Model
• PS = (V, E, R, P) where
  – V is a set of nodes representing tasks in the workflow;
  – E is a set of edges representing the precedence dependencies between tasks or nodes in the workflow;
  – R is a set of resources for executing tasks in the workflow;
  – P is a set of parameter combinations which is used to determine the total number of workflow instances

14

Grid Resource
• We assume that each task can only be executed by certain resources as in [12] and each node can only execute one task at a time
• A set of Capable Resources that can execute task ti is denoted as CR(ti).
• E.g. CR(t2) = {r2, r3}

      t1   t2   t3   t4   t5
r1     8    -    6    -    5
r2     9    6    -    -    -
r3     8    5    4    -    -
r4     7    -    -    8    6
r5     -    -    5    -    4
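The capability table can be held as a nested mapping from which CR(t) is derived. The sketch below is only an illustration: the slide does not say what the numbers mean, so treating them as execution-time estimates is an assumption, and the names `TABLE` and `capable_resources` are hypothetical.

```python
# Values copied from the table on this slide; None stands for "-" (cannot execute).
# Assumption: the numbers are execution-time estimates per (resource, task).
TABLE = {
    "r1": {"t1": 8, "t2": None, "t3": 6, "t4": None, "t5": 5},
    "r2": {"t1": 9, "t2": 6, "t3": None, "t4": None, "t5": None},
    "r3": {"t1": 8, "t2": 5, "t3": 4, "t4": None, "t5": None},
    "r4": {"t1": 7, "t2": None, "t3": None, "t4": 8, "t5": 6},
    "r5": {"t1": None, "t2": None, "t3": 5, "t4": None, "t5": 4},
}


def capable_resources(task):
    """Return CR(task): every resource that has an entry for the task."""
    return {r for r, row in TABLE.items() if row[task] is not None}


print(sorted(capable_resources("t2")))  # ['r2', 'r3']
print(sorted(capable_resources("t4")))  # ['r4']
```

The first call reproduces the slide's example CR(t2) = {r2, r3}.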

15

Resource Nodes
• In the Grid, a resource may refer to a cluster which governs multiple compute nodes.
• We define a function to retrieve the number of available nodes in a resource r as

  nodes(r) = number of available nodes in r    (1)

• We denote the total number of nodes in all capable resources CR(ti) as CRN(ti)

  $$\mathrm{CRN}(t_i) = \sum_{j=1}^{|CR(t_i)|} \mathrm{nodes}(r_j) \qquad (2)$$

16

Resource Scarcity
• Directly adopted from [12]
• The resource scarcity (RS) of a task t is defined as

  $$\mathrm{RS}(t) = \frac{\mathrm{CRN}(t)}{\sum_{k=1}^{|R|} \mathrm{nodes}(r_k)} \qquad (3)$$

• Resource scarcity is the ratio between the number of capable resource nodes for t and the number of all resource nodes.
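As a worked illustration of CRN and RS, the sketch below takes CR(t1) and CR(t2) from the capability table on the Grid Resource slide and combines them with node counts. The node counts in `NODES` are invented for the example; the slides give no concrete values.

```python
from fractions import Fraction

# Hypothetical node counts per resource (not from the slides).
NODES = {"r1": 4, "r2": 2, "r3": 2, "r4": 1, "r5": 1}

# Capable resources for two tasks, read off the capability table.
CR = {
    "t1": ["r1", "r2", "r3", "r4"],
    "t2": ["r2", "r3"],
}


def crn(task):
    """CRN(t): total number of nodes over all capable resources of t."""
    return sum(NODES[r] for r in CR[task])


def rs(task):
    """RS(t): capable nodes of t divided by all nodes in the Grid."""
    return Fraction(crn(task), sum(NODES.values()))


print(crn("t2"), rs("t2"))  # 4 2/5
print(crn("t1"), rs("t1"))  # 9 9/10
```

Under these assumed counts t2 has the smaller RS value, so it is the scarcer task and would sort ahead of t1 in a non-decreasing ordering by RS.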

17

Resource Competition
• Resource competition is calculated from the proportion of unscheduled task instances in the workflow instances that require each Grid resource to execute.
• We define a function to count the number of unscheduled instances of a task that have been instantiated as

  unschdInst(t) = number of unscheduled instances of t    (4)

18

Resource Competition
• The resource competition (RC) of a Grid resource r is defined as

  $$\mathrm{RC}(r) = \frac{\sum_{i=1}^{|RT(r)|} \dfrac{\mathrm{unschdInst}(t_i)}{\mathrm{CRN}(t_i)}}{\sum_{j=1}^{|V|} \mathrm{unschdInst}(t_j)} \qquad (5)$$

• where ti is a member of RT(r) and tj is a member of V
• RT(r) is the set of tasks that can be executed by r
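A small numeric sketch may help with equation (5), read here as the sum of unschdInst(t)/CRN(t) over the tasks in RT(r), divided by the total number of unscheduled task instances. The capability layout, node counts and instance counts below are all hypothetical.

```python
from fractions import Fraction

# Hypothetical setup: one node per resource, three task definitions.
NODES = {"r1": 1, "r2": 1, "r3": 1}
CR = {"t1": ["r1", "r2", "r3"], "t2": ["r2"], "t3": ["r2", "r3"]}
UNSCHEDULED = {"t1": 6, "t2": 4, "t3": 2}  # unschdInst(t)


def crn(task):
    """CRN(t): nodes available across the capable resources of t."""
    return sum(NODES[r] for r in CR[task])


def rc(resource):
    """RC(r) under the reading of equation (5) stated above."""
    runnable = [t for t, capable in CR.items() if resource in capable]  # RT(r)
    contention = sum(Fraction(UNSCHEDULED[t], crn(t)) for t in runnable)
    return contention / sum(UNSCHEDULED.values())


for r in ("r1", "r2", "r3"):
    print(r, rc(r))
# r1 1/6
# r2 7/12
# r3 1/4
```

r2 gets the highest value because it is the only resource that can run t2 and one of two that can run t3.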

19

Resource Competition
• The numerator represents the contention over the Grid resource
• The denominator represents the total number of unscheduled task instances across all workflow instances that have been instantiated
• Resource competition is used to reserve resources in high demand for the tasks that depend on such resources, in order to avoid bottlenecks

20

Resource Metrics: EET
• EET (Estimated Execution Time): the time taken for a resource to execute a task
• Most important to scheduling algorithms
• Uses history
  – Assumed to be zero if no previous history is recorded, so that the history gets populated

21

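Resource Metrics: EWT
• EWT (Estimated Wait Time): the time until a resource can start executing a task
• Calculated from the task instances waiting in the queue of the resource

  $$\mathrm{EWT}(r) = \begin{cases} 0; & \text{if there is an idle node in } r \\ \dfrac{\min_{i=1}^{|Exec|} \mathrm{EET}(x_i) + \sum_{j=1}^{|Queue|} \mathrm{EET}(x_j)}{\mathrm{nodes}(r)}; & \text{otherwise} \end{cases}$$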

22

Resource Metrics: ETT
• ETT (Estimated Transfer Time): the time taken to transfer the required files to the allocated resource.
• Uses history
• Multiple file servers
  – Select the server with minimum transfer time
• Multiple files: how to determine ETT?
  – Sum all ETTs (worst case, sequential transfers)
  – Maximum of all ETTs (best case, all parallel transfers)
  – Max of Sums of ETTs originating from each file source
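The three ways of combining per-file transfer times can be compared side by side. This is a minimal sketch with made-up transfer times; the function names and the `(source, seconds)` tuple layout are assumptions for illustration only.

```python
from collections import defaultdict


def ett_sum(transfers):
    """Worst case: every file is transferred one after another."""
    return sum(seconds for _, seconds in transfers)


def ett_max(transfers):
    """Best case: every file is transferred in parallel."""
    return max(seconds for _, seconds in transfers)


def ett_max_of_source_sums(transfers):
    """Files from the same source are sequential; sources run in parallel."""
    per_source = defaultdict(float)
    for source, seconds in transfers:
        per_source[source] += seconds
    return max(per_source.values())


# Hypothetical per-file estimates as (source resource, seconds).
transfers = [("r1", 15.0), ("r1", 5.0), ("r3", 12.0)]

print(ett_sum(transfers))                 # 32.0
print(ett_max(transfers))                 # 15.0
print(ett_max_of_source_sums(transfers))  # 20.0
```

The third option always lies between the other two, which is why it is a plausible middle ground when transfers from different sources overlap.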

23

Resource Metric: ECT
• Estimated Completion Time (ECT) is used in the Ranking Function.
• The ECT of a task t when executed by a resource r is calculated using the following equation [2]

ECT(r, t) = EET(r, t) + maximum( EWT(r), ETT(r) ) (6)

24

Ranking Function
• The rank of each resource rj executing a task ti is calculated by

  $$\mathrm{Rank}(r_j, t_i) = \mathrm{ECT}(r_j, t_i) \times e^{K \times \mathrm{RC}(r_j)} \qquad (7)$$

• The constant K is used to control the effect of RC on the rank value [14]
• A lower rank is better since it means the resource has lower competition and can also execute the task faster.
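To see how K trades completion time against competition, the sketch below evaluates equation (6) and the ranking function, reading (7) as ECT multiplied by e raised to K times RC. The input numbers are invented, and `ect` and `rank` are illustrative names.

```python
import math


def ect(eet, ewt, ett):
    """Equation (6): execution time plus the longer of waiting and transfer."""
    return eet + max(ewt, ett)


def rank(ect_value, rc, k=2.4):
    """Ranking function as read above: ECT scaled by exp(K * RC)."""
    return ect_value * math.exp(k * rc)


# Hypothetical candidates: a fast but contended resource and a slower, quiet one.
fast_contended = rank(ect(eet=8, ewt=2, ett=1), rc=0.5)
slow_quiet = rank(ect(eet=10, ewt=0, ett=2), rc=0.1)

print(fast_contended > slow_quiet)      # True: the quiet resource ranks better
print(rank(ect(8, 2, 1), rc=0.5, k=0))  # 10.0: with K = 0 the rank is just the ECT
```

With K = 0 the competition term disappears and the ranking falls back to plain ECT, which corresponds to ignoring resource competition.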

25

The Besom Algorithm

• Designed as a batch mode scheduling algorithm so the scheduler can gather and schedule ready tasks from multiple workflow instances.

• Separated into three phases in order to support multiple instances
  – Instance Generation Phase
  – Task Prioritising Phase
  – Resource Selection Phase


26

Besom Algorithm Overview

[Diagram: Instance Generation, Task Prioritising, Resource Selection, Iteration Control]

27

Instance Generation Phase

• A threshold is used to limit the number of task instances in each scheduling round

• More task instances in the scheduling pool lead to better load balancing.

• Too many task instances may lead to a useless schedule

while number of unscheduled task instances in set T < predefined threshold
    Generate one workflow instance w
    Add task instances tw from the new workflow instance to set T
end while
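The instance generation loop can be sketched as follows. This is a simplified illustration: the pool is a plain list of (task, parameter) pairs, the parameter combinations are stand-in integers, and the extra guard for running out of parameter combinations is an addition that the slide's pseudocode does not show.

```python
from collections import deque


def generate_instances(pool, pending_params, task_names, threshold):
    """Add task instances to `pool` until it holds at least `threshold`
    unscheduled instances or no parameter combination is left."""
    while len(pool) < threshold and pending_params:
        params = pending_params.popleft()  # one new workflow instance
        pool.extend((task, params) for task in task_names)
    return pool


pending = deque(range(1, 101))  # 100 hypothetical parameter combinations
pool = generate_instances([], pending, ["t1", "t2", "t3", "t4", "t5"], threshold=12)

print(len(pool), len(pending))  # 15 97
```

Because whole workflow instances are added at a time, the pool can overshoot the threshold, here reaching 15 task instances for a threshold of 12.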

28

Task Prioritising Phase

Update available Grid resources in set R
Determine the task instances in set T that are ready and add them to RB
for each task instance tw in RB do
    Find capable resources for its task definition t and update the set CR(t)
    Calculate RS(t) using (3) and assign it to tw
end for
Sort task instances in RB in non-decreasing order based on RS

• Find ready task instances and use Resource Scarcity to prioritise them
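The ordering step amounts to a stable sort on the RS value of each instance's task definition. In this sketch the RS values and the ready batch are invented, and `prioritise` is a hypothetical helper name.

```python
def prioritise(ready_batch, rs):
    """Sort ready task instances in non-decreasing order of resource scarcity.

    `ready_batch` holds (task, workflow_instance) pairs; `rs` maps a task
    definition to its RS value. sorted() is stable, so ties keep arrival order.
    """
    return sorted(ready_batch, key=lambda instance: rs[instance[0]])


rs = {"t1": 0.9, "t2": 0.4, "t3": 0.6}           # hypothetical RS values
ready = [("t1", "w2"), ("t3", "w1"), ("t2", "w1"), ("t1", "w3")]

print(prioritise(ready, rs))
# [('t2', 'w1'), ('t3', 'w1'), ('t1', 'w2'), ('t1', 'w3')]
```

The task with the fewest capable nodes (lowest RS) is placed first, so it gets to pick a resource before less constrained tasks.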

29

Resource Selection Phase

while RB is not empty do
    Get the first task instance tfirst from RB
    for each capable resource rj in CR(tfirst) do
        Calculate RC(rj) using (5)
        Calculate ECT using (6)
        Calculate Rank(rj, tfirst) using (7)
    end for
    Assign tfirst to the Grid resource rmin with the lowest rank
    Update EWT of rmin
    Remove tfirst from RB and T
end while
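A compact sketch of the selection loop is given below. It simplifies the slide's procedure in ways that should be read as assumptions: RC values are passed in rather than recomputed each time, transfer time is taken as zero so ECT reduces to EET plus EWT, each resource is treated as a single node when its wait time is updated, and the rank is read as ECT times e raised to K times RC. All data values are invented.

```python
import math


def select_resources(batch, capable, eet, ewt, rc, k=2.4):
    """Assign each task instance in `batch` (already ordered by RS) to the
    capable resource with the lowest rank, then extend that resource's wait."""
    schedule = {}
    for task, workflow in batch:
        def rank(resource):
            ect = eet[resource][task] + ewt[resource]  # ETT assumed to be 0
            return ect * math.exp(k * rc[resource])

        best = min(capable[task], key=rank)
        schedule[(task, workflow)] = best
        ewt[best] += eet[best][task]  # simplistic EWT update for one node
    return schedule


capable = {"t1": ["r1", "r2"], "t2": ["r2"]}
eet = {"r1": {"t1": 5}, "r2": {"t1": 4, "t2": 5}}
ewt = {"r1": 0, "r2": 0}
rc = {"r1": 0.1, "r2": 0.6}

print(select_resources([("t2", "w1"), ("t1", "w1")], capable, eet, ewt, rc))
# {('t2', 'w1'): 'r2', ('t1', 'w1'): 'r1'}
```

Although r2 would execute t1 faster, its higher competition value pushes t1 to r1, leaving r2 for the task that has no alternative.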

30

Iteration Control
• After the three phases, the scheduler waits for the next scheduling round, which is triggered when
  – a certain period of time elapses
  – a new resource joins the Grid
  – a task instance finishes execution
• Repeat the three phases in each scheduling round while
  – there are more workflow instances to schedule, OR
  – the set of unscheduled tasks T is not empty

31

Scheduling Loop

• To handle single-level feedback loop structure
• Utilise the mechanism from scientific workflow management systems of scheduling tasks during run-time

• Invoke scheduling algorithm as a task instance becomes ready for execution

• Avoid loop unrolling

32

An Approach to Loop Scheduling

• t2 of the second iteration cannot start until t4 of the first iteration finishes

• Only need to consider the task instances in the next loop iteration

• Resource Competition needs to be modified

[Figure: workflow with tasks t1, t2, t3, t4, t5]

33

Resource Competition (Revised)
• We define U to be the set of task instances across all instantiated workflow instances.
• To represent task instances in the next iteration, members of the set U are
  – the unscheduled task instances outside the loop structure
  – all task instances inside the loop structure while the execution has not exited the loop.
• We define a function to count the instances of a task t that are also members of U as

  inst(t, U) = number of instances of t that are members of U    (8)

34

Resource Competition (Revised)
• The resource competition (RC) of a Grid resource r is redefined as

  $$\mathrm{RC}(r) = \frac{\sum_{i=1}^{|RT(r)|} \dfrac{\mathrm{inst}(t_i, U)}{\mathrm{CRN}(t_i)}}{\sum_{j=1}^{|V|} \mathrm{inst}(t_j, U)} \qquad (9)$$

• where ti is a member of RT(r) and tj is a member of V
• RT(r) is the set of tasks that can be executed by r

35

The Extended Besom Algorithm

• Management of the members of the set U
  – Task instances are added to U as a new workflow instance is generated (unchanged)
  – Task instances outside the loop are removed after being scheduled (unchanged)
  – Different from the original version, instances of the tasks inside the loop are removed when the execution of that particular workflow instance exits the loop
  – E.g. instances of t2, t3 and t4 are removed when t5 becomes ready for execution
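The removal rule for loop task instances can be sketched as a small set operation. The representation of U as a set of (task, workflow instance) pairs and the function name `exit_loop` are assumptions made for this illustration.

```python
def exit_loop(u, ready_instance, loop_tasks, exit_task):
    """Return U after `ready_instance` becomes ready.

    If the ready instance is the loop's exit task, every instance of a loop
    task belonging to the same workflow instance is dropped from U.
    """
    task, workflow = ready_instance
    if task != exit_task:
        return set(u)
    return {(t, w) for (t, w) in u if not (w == workflow and t in loop_tasks)}


u = {("t2", "w1"), ("t3", "w1"), ("t4", "w1"), ("t5", "w1"), ("t2", "w2")}
loop_tasks = {"t2", "t3", "t4"}

print(sorted(exit_loop(u, ("t5", "w1"), loop_tasks, exit_task="t5")))
# [('t2', 'w2'), ('t5', 'w1')]
```

Instances of the same loop tasks in other workflow instances, such as t2 of w2, stay in U because that instance has not left its loop yet.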

36

The Extended Besom Algorithm

• Instance Generation phase and Iteration Control remain identical to the original version

• Task Prioritising Phase and Resource Selection Phase are modified to accommodate the management of set U

37

Task Prioritising Phase

Update available Grid resources in set R
for each task instance tw in U do
    if tw is ready to execute then
        Add tw to RB
        if tw is an instance of the texit of a loop l = (LT, texit) then
            Remove from U the instances of tasks in the loop of the same workflow instance w
        end if
    end if
end for
for each task instance tw in RB do
    Let t be the task definition of tw
    Find capable resources for t and update the set CR(t)
    Calculate RS(t) using (3) and assign it to tw
end for
Sort task instances in RB in non-decreasing order based on RS

38

Resource Selection Phase

while RB is not empty do
    Get the first task instance tfirst from RB
    for each capable resource rj in CR(tfirst) do
        Calculate RC(rj) using (9)
        Calculate ECT using (6)
        Calculate Rank(rj, tfirst) using (7)
    end for
    Assign tfirst to the Grid resource rmin with the lowest rank
    Update EWT of rmin
    Remove tfirst from RB
    Let t be the task definition of tfirst
    if t is NOT a member of LT for every loop l = (LT, texit) then
        Remove tfirst from U
    else
        Do nothing
    end if
end while

39

Implementing Scheduler

[Diagram: Iteration Control, Instance Generation, Task Prioritising and Resource Selection, divided between a Generator and an Executor]

• The Generator may implement different scheduling algorithms

• The Executor is the same for any algorithm but requires workflow system functionality to execute tasks

40

Scheduler Overview

[Diagram labels: Schedule Generator, Schedule Generation, Scheduling Interface, Nimrod/K, Nimrod Scheduler, Schedule Executor, Initialisation, Instance Generation, Prefire, Execution, Wait, Terminate]

41

Schedule Generator
• Implements scheduling algorithms to generate schedules
  – Besom
  – Min-Min, Max-Min, XSufferage
• A separate package which can be used by any workflow management system
• Additional scheduling algorithms can be added with minimal modification

42

Schedule Executor
• Uses the Nimrod/K API to provide the algorithm with the information required to generate a Grid workflow schedule
• Starts tasks in the workflow according to the generated schedule
• Specific to Nimrod/K

43

Evaluation
• Performance (makespan) against Min-Min, Max-Min and XSufferage algorithms
• Three goals
  – Minimise bottleneck
  – Manage multiple workflow instances
  – Incorporate loop scheduling without loop unrolling
• K constant that controls the effect of resource competition
• Threshold for instance generation

44

Simulation Setup
• The simulation environment is implemented in Nimrod/K
• A Parameter Sweep Actor (PS) is used to generate instances and does not require Grid execution.
• A dummy actor sleeps for a period of time equal to the ECT, as if the task were executed by a particular resource.
• File transfer is also simulated using sleeps
• Three different workflow structures: sequential, parallel, and mixed structure.
• Three different execution settings featuring different degrees of resource competition.

45

The Three Workflow Structures

[Figure: three workflow structures built from tasks t1 to t5 with the PS actor: a) Sequential (S), b) Parallel (P), c) Mixed Structure (M). File sizes shown: InputT1 900k, InputT2 600k, InputT3 800k, InputT4 700k, OutputT1 400k, OutputT2 500k, OutputT3 500k, OutputT4 400k]

46

The Three Execution Settings

• In setting (1), the resource r2 has a very high resource competition (RC) value as it is the only resource that can run t2, t3 and t4

• In settings (2) and (3), the RC of r2 is distributed to r1 and r3

Setting 1
      t1   t2   t3   t4   t5
r1     5    -    -    -    4
r2     4    5    4    6    3
r3     6    -    -    -    4
r4     8    -    -    -    6
r5     7    -    -    -    6

Setting 2
      t1   t2   t3   t4   t5
r1     5    -    4    -    4
r2     4    5    -    6    3
r3     6    -    -    -    4
r4     8    -    -    -    6
r5     7    -    -    -    6

Setting 3
      t1   t2   t3   t4   t5
r1     5    -    4    -    4
r2     4    5    -    -    3
r3     6    -    -    6    4
r4     8    -    -    -    6
r5     7    -    -    -    6

47

Simulation Scenarios

• Total 9 scenarios
  – S1, S2, S3
  – P1, P2, P3
  – M1, M2, M3
  (e.g. S1 is the sequential workflow using the first execution setting)

• Transfer rates between Grid resources are assumed to be static

48

Base-Run
• Use all 9 scenarios
• Threshold in Instance Generation phase is defined using the number of unscheduled task instances and the number of free resource nodes
• “Threshold Multiplier” is set to 2
• Workflow instances are generated until the number of unscheduled instances is at least twice the number of free nodes
• K constant is set to 2.4
• Execute 1, 5, 10, 20, 50, and 100 workflow instances
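The threshold rule stated here can be written as a one-line check. This is only a sketch of that rule; the function name and argument names are hypothetical.

```python
def needs_more_instances(unscheduled, free_nodes, multiplier=2):
    """True while the pool of unscheduled task instances is still smaller
    than `multiplier` times the number of free resource nodes."""
    return unscheduled < multiplier * free_nodes


print(needs_more_instances(unscheduled=5, free_nodes=4))   # True: 5 < 8
print(needs_more_instances(unscheduled=8, free_nodes=4))   # False: 8 >= 8
```

Raising the multiplier enlarges the pool that each scheduling round works on, which is the quantity varied later in the threshold evaluation.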

49

Improvement in Sequential Scenarios

[Figure: improvement (ms) against Min-Min, Max-Min and XSufferage by number of instances (1, 5, 10, 20, 50, 100); panels S1, S2, S3]

50

Result: Sequential Scenarios
• In S1 and S2 only little improvement can be observed since r2 becomes a bottleneck. No resource selection actually takes place for t2, t3 and t4.
• Slight improvement is gained at the beginning and near the end due to the consideration of Resource Competition and transfer time
• S2 shows a weakness inherent to XSufferage
• The Besom algorithm improves the performance over Min-Min in S3 by avoiding allocating r2, r3, and r4 to tasks t1 and t5, which can run on other resources.

51

Improvement in Parallel Scenarios

[Figure: improvement (ms) by number of instances (1, 5, 10, 20, 50, 100); panels P1, P2, P3]

52

Result: Parallel Scenarios
• The parallel scenarios do not show any significant improvement.
• In P1 and P2, performance is again restricted by the bottleneck caused by r2
• In P3, t2, t3, and t4 run on their exclusive resources in parallel so the performance of both algorithms is almost the same.

53

Improvement in Mixed Structure Scenarios

[Figure: improvement (ms) by number of instances (1, 5, 10, 20, 50, 100); panels M1, M2, M3]

54

Result: Mixed Structure Scenarios
• The scenario M1 is also restricted by the bottleneck.
• M2 and M3 show significant improvement in the longer executions.
• In M2, XSufferage performs similarly to Besom because the transfer cost between some instances of the tasks t1 and t2 is eliminated.
• M3 demonstrates the situation where Besom performs best
  – Complex workflow structure and execution setting
  – Resource requirements overlap across sequential branches
  – The bottleneck involves multiple resources

55

Single Workflow Instance

• Besom algorithm can be used for scheduling conventional scientific workflows that execute only once, with at least the same performance as the XSufferage algorithm

[Figure: Single Workflow Instance, improvement (ms) by scenario (S1, S2, S3, P1, P2, P3, M1, M2, M3)]

56

Evaluation of K Constant
• Use scenarios S3, P3, M3
• Execute 100 workflow instances, Besom only
• Result:
  – The value of 2.3 is optimal
  – Ignoring resource competition (K=0) yields worse makespans

57

Evaluation of K Constant: Result

[Figure: makespan (ms) by K constant value (0, 1, 1.5, 2, 2.2, 2.3, 2.4, 2.5, 3, 3.5, 4, 5); panels S3, P3, M3]

58

Threshold Evaluation
• Use all 9 scenarios
• K = 2.3, execute 100 workflow instances, against XSufferage
• Result:
  – Optimal integer value for Threshold Multiplier is 6
  – Resources are underutilised when the multiplier is set too low
  – “Saturation state” is reached when the queues of the resources causing the bottleneck are heavily populated

59

Threshold Evaluation: Result (S)

[Figure: makespan (ms) of Besom and XSufferage by Threshold Multiplier (1x to 7x); panels S1, S2, S3]

60

Threshold Evaluation: Result (P)

[Figure: makespan (ms) of Besom and XSufferage by Threshold Multiplier (1x to 7x); panels P1, P2, P3]

61

Threshold Evaluation: Result (M)

[Figure: makespan (ms) of Besom and XSufferage by Threshold Multiplier (1x to 7x); panels M1, M2, M3]

62

Simulation Using Optimal Threshold
• Use all 9 scenarios, against all 3 algorithms
• K = 2.3, Threshold = 6, execute 100 workflow instances
• Result:

– When the resource queues are heavily populated, the transfer time can be ignored as the wait time is much greater

– Long wait time resulting from the heavily populated resource queues minimises the advantages gained from both the consideration of ETT and the use of resource competition in resource selection

• It might not be possible to control the threshold in real workflows

63

Simulation Using Optimal Threshold: Result

[Figure: improvement (ms) by scenario (S1, S2, S3, P1, P2, P3, M1, M2, M3)]

64

Loop Scenarios
• Scenarios L1, L2 and L3 against all 3 algorithms
• K = 2.3, Threshold = 6, execute 100 workflow instances
• Loop iterates 3 times

[Figure: loop workflow with the PS actor and tasks t1 to t5. File sizes shown: InputT1 900k, OutputT1 400k, OutputT2 500k, OutputT3 500k, OutputT4 400k]

65

Result: Loop Scenarios

• Besom successfully schedules workflows with a single-level feedback loop

• Performance is similar to that of the sequential scenarios

[Figure: improvement (ms) for scenarios L1, L2, L3]

66

Summary
• The Besom algorithm can perform better when scheduling complex workflow structures in which the bottleneck involves multiple resources
• The Besom algorithm can avoid the weakness of XSufferage
• The Besom algorithm does not present any advantage when scheduling parallel independent tasks
• When using the optimal threshold, all algorithms perform similarly.
• Loop limitation: Besom may not perform optimally during the very last loop iteration

67

Summary: Base-Run

Scenario   Scheduling Algorithm (Min-Min, Max-Min, XSufferage)
S1         ★ ★
S2         ★ ★ ★★★
S3         ★★ ★
P1
P2         ★
P3         ★
M1         ★ ★
M2         ★★ ★★ ★
M3         ★★★ ★★★ ★★★

★ - slight improvement, ★★ - moderate improvement, ★★★ - strong improvement

68

Conclusion
• We propose the Besom algorithm for Grid workflow scheduling, which is able to manage multiple workflow instances and single-level feedback loops.

• The Besom algorithm can reduce the effect of bottlenecks by considering resource competition, and performs better in complex workflow structures.

• We demonstrate the importance of the number of input task instances to the scheduling (through the threshold multiplier).

69

Conclusion
• The Besom algorithm is also applicable to conventional scientific workflows
• An independent scheduling software package is implemented, which can be used by any workflow system through the provided programming interfaces.
  – Besom
  – Min-Min
  – Max-Min
  – XSufferage
• The simulation environment is implemented in Nimrod/K

70

Future Work
• Scheduling parameter sweep in sub-workflow
  – Instance generation is not under the control of the scheduler
• Consideration of conditional branches
  – Approach analogous to loop scheduling
• Workflow verification
  – In order for the workflow to execute properly
• Workflow structure analyser
  – Which task is or is not in any loop or branch?

71

Future Work
• Selective algorithm
  – Switch between algorithms during run-time
  – Use the best algorithm for the current situation
  – Context-aware / Adaptive scheduling algorithm?
• Other scheduling objectives
  – Cost
  – Deadline
  – Resource utilisation
  – etc.

72

List of Publications
• S. Smanchat, S. Ling and M. Indrawan, “Toward grid workflow scheduling based on resource competition,” in Proceedings of the 13th Enterprise Distributed Object Computing Conference Workshops (EDOCW 2009), 2009, pp. 126-130.

• S. Smanchat, M. Indrawan, S. Ling, C. Enticott, and D. Abramson, “Scheduling Multiple Parameter Sweep Workflow Instances on the Grid,” in Proceedings of the 5th IEEE International Conference on e-Science (e-Science '09), Oxford, UK, 2009, pp. 300-306.

• S. Smanchat, M. Indrawan, S. Ling, C. Enticott, and D. Abramson, “A Scheduler based on Resource Competition for Parameter Sweep Workflow,” in Proceedings of the International Conference on Computational Science (ICCS 2011), 2011, pp. 176-185.

73

References
1. B. Ludäscher, I. Altintas, C. Berkley, D. Higgins, E. Jaeger, M. Jones, E. A. Lee, J. Tao, and Y. Zhao, "Scientific workflow management and the Kepler system," Concurr. Comput. : Pract. Exper., vol. 18, pp. 1039-1065, 2006.

2. D. Abramson, C. Enticott, and I. Altintas, "Nimrod/K: towards massively parallel dynamic grid workflows," in Proceedings of the 2008 ACM/IEEE conference on Supercomputing. Austin, Texas: IEEE Press, 2008.

3. Nimrod Toolkit, http://www.messagelab.monash.edu.au/Nimrod

4. J. Yu, R. Buyya, and K. Ramamohanarao, "Workflow Scheduling Algorithms for Grid Computing," in Metaheuristics for Scheduling in Distributed Computing Environments, 2008, pp. 173-214.

5. M. Maheswaran, S. Ali, H. J. Siegel, D. Hensgen, and R. F. Freund, "Dynamic matching and scheduling of a class of independent tasks onto heterogeneous computing systems," in Proceedings of the 8th Heterogeneous Computing Workshop (HCW '99), 1999, pp. 30-44.

6. H. Casanova, A. Legrand, D. Zagorodnov, and F. Berman, "Heuristics for scheduling parameter sweep applications in grid environments," in Proceedings of the 9th Heterogeneous Computing Workshop (HCW 2000) 2000, pp. 349-363.

7. X. He, X. Sun, and G. v. Laszewski, "QoS guided min-min heuristic for grid task scheduling," J. Comput. Sci. Technol., vol. 18, pp. 442-451, 2003.

74

8. K. Etminani and M. Naghibzadeh, "A Min-Min Max-Min selective algorihtm for grid task scheduling," in Proceedings of the 3rd IEEE/IFIP International Conference in Central Asia on Internet (ICI 2007), 2007, pp. 1-7.

9. K. Liu, J. Chen, H. Jin, and Y. Yang, "A Min-Min Average Algorithm for Scheduling Transaction-Intensive Grid Workflows," in Proceedings of the 7th Australasian Symposium on Grid Computing and e-Research (AusGrid 2009). Wellington, New Zealand: Australian Computer Society, Inc., 2009, pp. 41-48.

10. H. Topcuoglu, S. Hariri, and W. Min-You, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, pp. 260-274, 2002.

11. R. Sakellariou and H. Zhao, "A hybrid heuristic for DAG scheduling on heterogeneous systems," in Proceedings of the 18th International Conference on Parallel and Distributed Processing Symposium, 2004, pp. 111.

12. Z. Shi and J. J. Dongarra, "Scheduling workflow applications on processors with different capabilities," Future Gener. Comput. Syst., vol. 22, pp. 665-675, 2006.

13. Smanchat, S., et al., Scheduling Multiple Parameter Sweep Workflow Instances on the Grid, in Proceedings of the 5th IEEE International Conference on e-Science (e-Science '09). 2009: Oxford, UK. p. 300-306.

75

14. Pinedo, M.L., Scheduling: Theory, Algorithms, and Systems. Third ed. 2008: Springer.

15. Aalst, W.M.P.v.d., The Application of Petri Nets to Workflow Management. The Journal of Circuits, Systems and Computers, 1998. 8(1): p. 21-66.

76

Question?


77

Appendix

78

Scheduling Algorithms: Batch Mode

• Task Prioritising Phase
  – Gather independent tasks that are ready to execute into the batch
  – Calculate performance metrics such as Estimated Completion Time (ECT)
• Resource Selection Phase
  – Allocate the tasks in the batch to the resource with the “best” performance metric depending on each heuristic
    > Fastest resource
    > Resource with minimum file transfer time

79

Scheduling Algorithms: Dependency Mode
• Task Prioritising Phase
  – Rank all tasks in the workflow based on performance metrics such as ECT and the position of each task in the workflow.
  – Tasks closer to the beginning of the workflow are ranked higher.
• Resource Selection Phase
  – Allocate the task with the highest rank to the resource with the “best” performance metric depending on each heuristic, e.g. the resource with the Earliest Finish Time

80

RCPSP vs Grid Workflow Scheduling

RCPSP (Resource-Constrained Project Scheduling Problem)
• Focuses on when to start activities and in which particular order so that the schedule is feasible given the resource availability constraint
• Organisations have complete control over their resources
• Resources are mainly considered homogeneous with respect to activities
• Time lag is more static

Grid Workflow Scheduling
• Needs to decide by which resources the tasks will be executed and in which particular order, according to the performance of each resource
• No centralised control
• Resources are mainly heterogeneous
• Network condition is dynamic

81

Tasks & Task Instances
• Grid tasks: need to be scheduled to a Grid resource for execution, e.g. computational tasks
• Local tasks: run on the local processor and are not considered in the scheduling, e.g. Boolean expressions and flow controls
• A task instance is denoted as tw where w is the workflow instance identifier
• Task instances t1w1 and t1w2 are different (same task but of different workflow instances)

82

Firing Resource Queues
• The task queue of each resource is maintained by the scheduler.
• To execute scheduled tasks, the scheduler goes through each queue and starts the tasks within.
• The scheduler can:
  – strictly follow the schedule
  – skip waiting tasks
  – combine the two

83

Firing Resource Queues: Alternative 1
• Strictly follow the schedule
  – The scheduler stops as soon as it encounters a task that cannot start (waiting for a transfer or a free resource node)
  – The scheduler misses the opportunity to start the subsequent tasks in the queue that are ready to start
  – Maintains the order of execution as scheduled
  – Useful when the number of free resource nodes is not known

84

Firing Resource Queues: Alternative 2
• Skip the waiting tasks
  – The scheduler skips tasks that are waiting for transfer and starts the ready ones.
  – Free resource nodes are used immediately (higher utilisation)
  – The skipped task might have to wait even longer to execute and this delay may propagate through the rest of the workflow.
  – The number of free resource nodes must be known to the scheduler.

85

Firing Resource Queues
• The scheduler skips tasks that are waiting while reserving enough resource nodes for the skipped tasks
• Fire a ready task if there are enough resource nodes for it and all the skipped tasks; otherwise stop and proceed to the next resource queue.
• The schedule may be violated
• Task execution is not unnecessarily blocked by file transfer.
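The combined strategy, skipping waiting tasks while keeping a node in reserve for each of them, can be sketched as below. The sketch assumes every task needs exactly one node and that readiness is already known for each queued task; `fire_queue` and the queue layout are hypothetical.

```python
def fire_queue(queue, free_nodes):
    """Walk one resource queue in scheduled order and return the tasks fired.

    `queue` is a list of (task_name, is_ready) pairs. A waiting task is
    skipped but reserves one node; a ready task is fired only if a node
    remains after honouring all reservations, otherwise the walk stops.
    """
    fired = []
    reserved = 0
    for name, is_ready in queue:
        if not is_ready:
            reserved += 1  # keep a node for the skipped task
            continue
        if free_nodes - reserved >= 1:
            fired.append(name)
            free_nodes -= 1
        else:
            break  # proceed to the next resource queue
    return fired


queue = [("a", True), ("b", False), ("c", True), ("d", True)]
print(fire_queue(queue, free_nodes=3))  # ['a', 'c']
```

Task d is held back even though it is ready, because the last free node is reserved for the skipped task b.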

86

Scheduler Termination
• Parameter sweep workflows involve several workflow instances and scheduling rounds
• The scheduler needs to know when to stop
  – No task is waiting for execution or is being executed
  – No unscheduled task instance remains
  – No workflow instance remains
  – No input remains at any task
• We assume that the workflows scheduled based on our algorithm need to be “well-structured” and “bounded” following the definitions in [15]
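The four stopping conditions can be combined into a single check. This is a minimal sketch; the argument names are hypothetical counters that a scheduler would have to maintain.

```python
def should_terminate(waiting_or_running, unscheduled, remaining_instances, pending_inputs):
    """True only when all four stopping conditions hold at once."""
    return (
        waiting_or_running == 0      # no task is waiting or being executed
        and unscheduled == 0         # no unscheduled task instance remains
        and remaining_instances == 0 # no workflow instance is left to generate
        and pending_inputs == 0      # no input remains at any task
    )


print(should_terminate(0, 0, 0, 0))  # True
print(should_terminate(0, 0, 3, 0))  # False: instances are still to be generated
```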

87


88

Transfer Rates

• In kilobytes per second

      r1    r2    r3    r4    r5
r1     -    60    70   100    90
r2   110     -   100    60    70
r3   100    90     -   130    80
r4    50    80   120     -    90
r5    90    70   110   120     -
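With static rates, a transfer time estimate is simply file size divided by rate. The sketch below makes two assumptions that the slides do not state: that the "900k" file sizes are kilobytes, and that a table row is the source and a column the destination. Only the two rates needed for the example are copied from the table.

```python
# Two entries from the transfer rate table, in kilobytes per second.
# Assumption: keys are (source, destination).
RATES = {("r1", "r2"): 60, ("r3", "r2"): 90}


def transfer_time(size_kb, source, destination):
    """Estimated transfer time in seconds for one file."""
    return size_kb / RATES[(source, destination)]


def best_source(size_kb, sources, destination):
    """Pick the file server with the minimum transfer time."""
    return min(sources, key=lambda s: transfer_time(size_kb, s, destination))


print(transfer_time(900, "r1", "r2"))        # 15.0
print(transfer_time(900, "r3", "r2"))        # 10.0
print(best_source(900, ["r1", "r3"], "r2"))  # r3
```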
