Scheduling Parameter Sweep Workflow in the Grid
Faculty of Information Technology, Monash University
Sucha Smanchat
Supervisors: Dr Maria Indrawan and Dr Chris (Sea) Ling
2
Content
• Background & Motivation
• Grid Workflow Scheduling
• Objective & Definitions
• The Proposed Algorithm
• Scheduler Implementation
• Simulation Results
• Conclusion & Future Work
3
Background
• The Grid has become an important infrastructure to support e-Science due to its ability to provide a high-performance computing environment.
• Workflow technology is used to automate scientific processes which are then executed by scientific workflow management systems e.g. Kepler [1].
• Scientific workflows used in e-Science usually require high computation power which the Grid infrastructure can provide.
4
Background
• To execute a Grid workflow, tasks need to be scheduled onto Grid resources.
• Scheduling workflow tasks in the Grid is necessary to improve performance.
– Makespan (overall execution time)
– Cost
– Resource Utilisation
– Reliability
– Security
5
Motivation
• Parameter Sweep Workflows
– Workflows for parametric studies
– Are executed repeatedly with varying input parameters in order to study or optimise those parameters
– Execution
> One by one / Loop
> Parallel using Nimrod/K [3]
– E.g. Quantum Chemical Calculations experiment [2]
6
Motivation
• Executing multiple parameter sweep workflow instances in parallel raises new scheduling challenges.
– Every instance is derived from the same workflow definition
– Hence, concurrent task instances of the same definition compete for the same resources
– The scheduling algorithm needs to manage multiple workflow instances
7
Nimrod/K
• A tool for parametric studies being developed by MeSsAGE Lab, Monash University [3]
• Built on top of Kepler [1]
• Allows workflows in Kepler to execute in the Grid/Cloud via the Nimrod toolset
• Features the ability to clone a workflow into multiple copies (instances) and execute them in parallel
• Also provides many different parameter tools for various e-Research experiments
– sweeping, optimising and experimental design techniques
http://www.messagelab.monash.edu.au/Nimrod
8
Existing Grid Workflow Scheduling Algorithms
• Batch Mode
• Dependency Mode
• Meta-heuristics based scheduling
• Most algorithms are divided into two phases
– Task Prioritising Phase
– Resource Selection Phase
Classification based on Yu et al. [4]
9
Examples of Batch Mode Algorithms
• Min-Min, Max-Min, Sufferage [5]
• XSufferage [6]
• QoS-guided Min-Min [7]
• Selective Min-Min Max-Min [8]
• Min-Min Average [9]
– Supports multiple instances but does not consider resource competition.
10
Examples of Dependency Mode Algorithms
• HEFT [10]
• Hybrid HEFT [11]
• SDC [12]
– Tasks with “scarce capable resources” (the tasks that fewer resources are able to execute) are given higher priority (rank)
11
What is missing?
• Most existing Grid workflow scheduling algorithms do not consider the dependencies of tasks on resources, which leads to a resource competition problem
• They are designed mainly for Directed Acyclic Graphs (DAGs) and rely on loop unrolling to handle loop structures.
– Loop unrolling requires knowing (or predicting) the number of loop iterations
12
Scheduling Objective
• Given a Grid workflow, our objective is to schedule tasks across multiple instances of the Grid workflow to Grid resources, based on resource scarcity and resource competition, so as to minimise the makespan of the entire execution.
13
Parameter Sweep Workflow Model
• PS = (V, E, R, P) where
• V is a set of nodes representing tasks in the workflow;
• E is a set of edges representing the precedence dependencies between tasks or nodes in the workflow;
• R is a set of resources for executing tasks in the workflow;
• P is a set of parameter combinations, which determines the total number of workflow instances
14
Grid Resource
• We assume that each task can only be executed by certain resources as in [12] and each node can only execute one task at a time
• A set of Capable Resources that can execute task ti is denoted as CR(ti).
• E.g. CR(t2) = {r2, r3}

      t1   t2   t3   t4   t5
r1     8    -    6    -    5
r2     9    6    -    -    -
r3     8    5    4    -    -
r4     7    -    -    8    6
r5     -    -    5    -    4
15
Resource Nodes
• In the Grid, a resource may refer to a cluster which governs multiple compute nodes.
• We define a function to retrieve the number of available nodes in a resource r as

nodes(r) = number of available nodes in r (1)

• We denote the total number of nodes in all capable resources CR(ti) as CRN(ti)

\[
CRN(t) = \sum_{j=1}^{|CR(t)|} nodes(r_j) \tag{2}
\]
16
Resource Scarcity
• Directly adopted from [12]
• The resource scarcity (RS) of a task t is defined as

\[
RS(t) = \frac{CRN(t)}{\sum_{k=1}^{|R|} nodes(r_k)} \tag{3}
\]

• Resource scarcity is the ratio between the number of capable resource nodes for t and the number of all resource nodes.
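As a minimal sketch of equations (1) to (3), the following Python computes CR(t), CRN(t) and RS(t) for the capability table shown on the Grid Resource slide. The node counts in `NODES` are an assumption made for illustration; the slides do not give them.

```python
# Capability table from the "Grid Resource" slide: a key is present only
# when the resource can execute the task.
CAPABILITY = {
    "r1": {"t1": 8, "t3": 6, "t5": 5},
    "r2": {"t1": 9, "t2": 6},
    "r3": {"t1": 8, "t2": 5, "t3": 4},
    "r4": {"t1": 7, "t4": 8, "t5": 6},
    "r5": {"t3": 5, "t5": 4},
}

# Assumption: number of available nodes per resource (hypothetical values).
NODES = {"r1": 2, "r2": 1, "r3": 4, "r4": 2, "r5": 1}


def capable_resources(task):
    """CR(t): the set of resources able to execute the task."""
    return {r for r, tasks in CAPABILITY.items() if task in tasks}


def crn(task):
    """CRN(t), equation (2): total nodes over all capable resources."""
    return sum(NODES[r] for r in capable_resources(task))


def resource_scarcity(task):
    """RS(t), equation (3): capable nodes divided by all nodes."""
    return crn(task) / sum(NODES.values())


if __name__ == "__main__":
    tasks = ["t1", "t2", "t3", "t4", "t5"]
    # Non-decreasing RS: tasks with the scarcest resources come first.
    for t in sorted(tasks, key=resource_scarcity):
        print(t, sorted(capable_resources(t)), resource_scarcity(t))
```

With these assumed node counts, t4 has the lowest RS because only r4 can execute it, so it would be prioritised first.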
17
Resource Competition
• Resource competition is calculated from the proportion of unscheduled task instances, across the workflow instances, that require each Grid resource in order to execute.
• We define a function to count the number of unscheduled instances of a task that have been instantiated as

unschdInst(t) = number of unscheduled instances of t (4)
18
Resource Competition
• The resource competition (RC) of a Grid resource r is defined as

\[
RC(r) = \frac{\sum_{i=1}^{|RT(r)|} \dfrac{unschdInst(t_i)}{CRN(t_i)}}{\sum_{j=1}^{|V|} unschdInst(t_j)} \tag{5}
\]

• where ti is a member of RT(r) and tj is a member of V
• RT(r) is the set of tasks that can be executed by r
19
Resource Competition
• The numerator represents the contention over the Grid resource
• The denominator represents the total number of unscheduled task instances across all workflow instances that have been instantiated
• Resource competition is used to reserve resources in high demand for the tasks that depend on them, in order to avoid bottlenecks
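The description of the numerator and denominator above can be sketched in Python as follows. The capable-resource sets come from the Grid Resource slide; the one-node-per-resource setting and the unscheduled-instance counts in `UNSCHD` are hypothetical values chosen only for illustration.

```python
# CR(t) for each task, taken from the "Grid Resource" slide.
CAPABLE = {
    "t1": ["r1", "r2", "r3", "r4"],
    "t2": ["r2", "r3"],
    "t3": ["r1", "r3", "r5"],
    "t4": ["r4"],
    "t5": ["r1", "r4", "r5"],
}

# Assumption: every resource exposes exactly one node.
NODES = {"r1": 1, "r2": 1, "r3": 1, "r4": 1, "r5": 1}

# Assumption: unschdInst(t), the unscheduled instances of each task.
UNSCHD = {"t1": 0, "t2": 4, "t3": 4, "t4": 4, "t5": 4}


def crn(task):
    """CRN(t): total nodes over the capable resources of the task."""
    return sum(NODES[r] for r in CAPABLE[task])


def resource_competition(resource):
    """RC(r): contention over r divided by all unscheduled instances."""
    total = sum(UNSCHD.values())
    if total == 0:
        return 0.0  # nothing left to schedule, so no competition
    # RT(r) is the set of tasks whose capable resources include r.
    contention = sum(
        UNSCHD[t] / crn(t) for t, resources in CAPABLE.items() if resource in resources
    )
    return contention / total


if __name__ == "__main__":
    for r in NODES:
        print(r, round(resource_competition(r), 4))
```

In this example r4 receives the highest RC because it is the only resource that can execute t4, which is the situation in which the algorithm tries to keep r4 free for t4.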
20
Resource Metrics: EET
• EET (Estimated Execution Time): the time taken for a resource to execute a task
• Most important to scheduling algorithms
• Use history
– Assumed to be zero if no previous history is recorded, in order to populate the history
21
Resource Metrics: EWT
• EWT (Estimated Wait Time): the time until a resource can start executing a task
• Calculated from the task instances waiting in the queue of the resource

\[
EWT(r) =
\begin{cases}
0, & \text{if there is an idle node in } r \\
\dfrac{\min_{i=1}^{|Exec|} EET(x_i) + \sum_{j=1}^{|Queue|} EET(x_j)}{nodes(r)}, & \text{otherwise}
\end{cases}
\]
22
Resource Metrics: ETT
• ETT (Estimated Transfer Time): the time taken to transfer the required files to the allocated resource.
• Use history
• Multiple file servers
– Select the server with minimum transfer time
• Multiple files - how to determine ETT?
– Sum all ETTs (worst case, sequential transfers)
– Maximum of all ETTs (best case, all parallel transfers)
– Max of Sums of ETTs originating from each file source
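The three options for combining the ETTs of multiple files can be compared with a short sketch. The function names and the `(source, seconds)` input format are assumptions for illustration, not part of the scheduler described in the slides.

```python
from collections import defaultdict


def ett_sequential(transfers):
    """Worst case: all files are transferred one after another."""
    return sum(seconds for _, seconds in transfers)


def ett_parallel(transfers):
    """Best case: all files are transferred fully in parallel."""
    return max((seconds for _, seconds in transfers), default=0.0)


def ett_per_source(transfers):
    """Files from the same source are sequential; sources run in parallel."""
    per_source = defaultdict(float)
    for source, seconds in transfers:
        per_source[source] += seconds
    return max(per_source.values(), default=0.0)


if __name__ == "__main__":
    # Hypothetical transfers: (file source, estimated transfer time in seconds).
    transfers = [("r1", 15.0), ("r1", 5.0), ("r3", 12.0)]
    print(ett_sequential(transfers))  # 32.0
    print(ett_parallel(transfers))    # 15.0
    print(ett_per_source(transfers))  # 20.0
```

The per-source estimate always lies between the other two, which is why it can serve as a middle ground when the degree of parallelism is unknown.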
23
Resource Metric: ECT
• Estimated Completion Time (ECT) is used in the Ranking Function.
• The ECT of a task t when executed by a resource r is calculated using the following equation [2]

ECT(r, t) = EET(r, t) + maximum( EWT(r), ETT(r) ) (6)
24
Ranking Function
• The rank of each resource rj executing a task ti is calculated by

\[
Rank(r, t) = ECT(r, t) \times e^{K \cdot RC(r)} \tag{7}
\]

• K constant is used to control the effect of RC on the rank value [14]
• A lower rank is better since it means the resource has lower competition and can also execute the task faster.
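A small numeric sketch can show how K and RC influence the rank. It assumes the rank is ECT multiplied by e raised to K times RC, which is how the ranking equation is read here; the input values are hypothetical.

```python
import math


def ect(eet, ewt, ett):
    """Equation (6): execution time plus the larger of wait and transfer time."""
    return eet + max(ewt, ett)


def rank(eet, ewt, ett, rc, k=2.3):
    """Assumed form of the ranking function: ECT scaled by exp(K * RC)."""
    return ect(eet, ewt, ett) * math.exp(k * rc)


if __name__ == "__main__":
    # Hypothetical candidates for one task (times in ms).
    fast_but_contended = rank(eet=4000, ewt=0, ett=1000, rc=0.5)
    slow_but_free = rank(eet=5000, ewt=0, ett=1000, rc=0.1)
    print(round(fast_but_contended), round(slow_but_free))
    # The slower resource has the lower (better) rank because it is in less demand.
    print(slow_but_free < fast_but_contended)
```

Setting `k=0` makes the rank equal to the ECT, which corresponds to ignoring resource competition altogether.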
25
The Besom Algorithm
• Designed as a batch mode scheduling algorithm so that the scheduler can gather and schedule ready tasks from multiple workflow instances.
• Separated into three phases in order to support multiple instances
– Instance Generation Phase
– Task Prioritising Phase
– Resource Selection Phase
Image from http://www.hecatescauldron.org/Besom%20Chants.htm
26
Besom Algorithm Overview
[Figure: Instance Generation, Task Prioritising, Resource Selection, Iteration Control]
27
Instance Generation Phase
• Threshold is used to limit the number of task instances in each scheduling round
• More task instances in the scheduling pool leads to better load balancing.
• Too many task instances may lead to a useless schedule

while number of unscheduled task instances in set T < predefined threshold
    Generate one workflow instance w
    Add task instances tw from the new workflow instance to set T
end while
28
Task Prioritising Phase
• Find ready task instances and use Resource Scarcity to prioritise them

Update available Grid resources in set R
Determine the task instances in set T that are ready and add them to RB
for each task instance tw in RB do
    Find capable resources for its task definition t and update the set CR(t)
    Calculate RS(t) using (3) and assign it to tw
end for
Sort task instances in RB in non-decreasing order based on RS
29
Resource Selection Phase

while RB is not empty do
    Get the first task instance tfirst from RB
    for each capable resource rj in CR(tfirst) do
        Calculate RC(rj) using (5)
        Calculate ECT using (6)
        Calculate Rank(rj, tfirst) using (7)
    end for
    Assign tfirst to the Grid resource rmin with the lowest rank
    Update EWT of rmin
    Remove tfirst from RB and T
end while
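The Task Prioritising and Resource Selection phases can be sketched together in Python. This is a simplified illustration rather than the scheduler's actual implementation: transfer time is ignored (ETT = 0), the rank is assumed to be ECT multiplied by exp(K × RC), and the EWT update is a crude approximation. All function and parameter names are hypothetical.

```python
import math


def schedule_round(ready, eet, nodes, unschd, ewt, k=2.3):
    """One scheduling round over the ready batch RB.

    ready:  list of (task definition, workflow instance id)
    eet:    {resource: {task: estimated execution time}}
    nodes:  {resource: number of nodes}, assumed positive
    unschd: {task: unscheduled instance count}, updated in place
    ewt:    {resource: estimated wait time}, updated in place
    """
    def capable(task):
        return [r for r in eet if task in eet[r]]

    def crn(task):
        return sum(nodes[r] for r in capable(task))

    total_nodes = sum(nodes.values())

    def rs(task):
        return crn(task) / total_nodes

    def rc(resource):
        total = sum(unschd.values())
        if total == 0:
            return 0.0
        return sum(unschd.get(t, 0) / crn(t) for t in eet[resource]) / total

    assignments = []
    # Task Prioritising: non-decreasing resource scarcity.
    for task, instance in sorted(ready, key=lambda ti: rs(ti[0])):
        # Resource Selection: lowest rank wins (ETT assumed to be zero).
        best = min(
            capable(task),
            key=lambda r: (eet[r][task] + ewt[r]) * math.exp(k * rc(r)),
        )
        assignments.append((task, instance, best))
        ewt[best] += eet[best][task] / nodes[best]  # simplified EWT update
        unschd[task] -= 1
    return assignments


if __name__ == "__main__":
    eet = {"r1": {"t1": 5, "t5": 4}, "r2": {"t1": 4, "t2": 5}}
    nodes = {"r1": 1, "r2": 1}
    unschd = {"t1": 1, "t2": 1, "t5": 0}
    ewt = {"r1": 0.0, "r2": 0.0}
    print(schedule_round([("t1", "w2"), ("t2", "w1")], eet, nodes, unschd, ewt))
```

In the example, t2 is scheduled first because only r2 can run it; t1 is then sent to r1 even though r2 would execute it faster, since r2 is already occupied.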
30
Iteration Control
• After the three phases, the scheduler waits for the next scheduling round, which is triggered when
– a certain period of time elapses
– a new resource joins the grid
– a task instance finishes execution
• Repeat the three phases in each scheduling round while
– there are more workflow instances to schedule OR
– set of unscheduled tasks T is not empty
31
Scheduling Loop
• To handle single-level feedback loop structure
• Utilise the mechanism from scientific workflow management systems of scheduling tasks during run-time
• Invoke scheduling algorithm as a task instance becomes ready for execution
• Avoid loop unrolling
32
An Approach to Loop Scheduling
• t2 of the second iteration cannot start until t4 of the first iteration finishes
• Only need to consider the task instances in the next loop iteration
• Resource Competition needs to be modified
[Figure: workflow of tasks t1, t2, t3, t4, t5 with a feedback loop from t4 back to t2]
33
Resource Competition (Revised)
• We define U to be the set of task instances across all instantiated workflow instances.
• To represent task instances in the next iteration, members of the set U are
– the unscheduled task instances outside the loop structure
– all task instances inside the loop structure when the execution has not exited the loop.
• We define a function to count the instances of a task t that are also members of U as

inst(t,U) = number of instances of t that are members of U (8)
34
Resource Competition (Revised)
• The resource competition (RC) of a Grid resource r is redefined as

\[
RC(r) = \frac{\sum_{i=1}^{|RT(r)|} \dfrac{inst(t_i, U)}{CRN(t_i)}}{\sum_{j=1}^{|V|} inst(t_j, U)} \tag{9}
\]

• where ti is a member of RT(r) and tj is a member of V
• RT(r) is the set of tasks that can be executed by r
35
The Extended Besom Algorithm
• Management of the members of the set U
– Task instances are added to U as a new workflow instance is generated (unchanged)
– Task instances outside the loop are removed after being scheduled (unchanged)
– Different from the original version, instances of the tasks inside the loop are removed when the execution of that particular workflow instance exits the loop
– E.g. instances of t2, t3 and t4 are removed when t5 becomes ready for execution
36
The Extended Besom Algorithm
• Instance Generation phase and Iteration Control remain identical to the original version
• Task Prioritising Phase and Resource Selection Phase are modified to accommodate the management of set U
37
Task Prioritising Phase
Update available Grid resources in set R
for each task instance tw in U do
    if tw is ready to execute then
        Add tw to RB
        if tw is an instance of the texit of a loop l = (LT, texit) then
            Remove from U the instances of tasks in loop of the same workflow instance w
        end if
    end if
end for
for each task instance tw in RB do
    Let t be the task definition of tw
    Find capable resources for t and update the set CR(t)
    Calculate RS(t) using (3) and assign it to tw
end for
Sort task instances in RB in non-decreasing order based on RS
38
Resource Selection Phase
while RB is not empty do
    Get the first task instance tfirst from RB
    for each capable resource rj in CR(tfirst) do
        Calculate RC(rj) using (9)
        Calculate ECT using (6)
        Calculate Rank(rj, tfirst) using (7)
    end for
    Assign tfirst to the Grid resource rmin with the lowest rank
    Update EWT of rmin
    Remove tfirst from RB
    Let t be the task definition of tfirst
    if t is NOT a member of any loop task set LT then
        Remove tfirst from U
    else
        Do nothing
    end if
end while
39
Implementing Scheduler
[Figure: Iteration Control, Instance Generation, Task Prioritising and Resource Selection, divided between a Generator and an Executor]
• The Generator may implement different scheduling algorithms
• The Executor is the same for any algorithm but requires workflow system functionality to execute tasks
40
Scheduler Overview
[Figure: Schedule Generator (Schedule Generation), Scheduling Interface, Nimrod/K Nimrod Scheduler, Schedule Executor (Initialisation, Instance Generation, Prefire, Execution, Wait, Terminate)]
41
Schedule Generator
• Implements scheduling algorithms to generate a schedule
– Besom
– Min-Min, Max-Min, XSufferage
• A separate package which can be used by any workflow management system
• Additional scheduling algorithms can be added with minimal modification
42
Schedule Executor
• Uses the Nimrod/K API to provide the algorithm with the information required to generate a Grid workflow schedule
• Starts tasks in the workflow according to the generated schedule
• Specific to Nimrod/K
43
Evaluation
• Performance (makespan) against Min-Min, Max-Min and XSufferage algorithms
• Three goals
– Minimise bottleneck
– Manage multiple workflow instances
– Incorporate loop scheduling without loop unrolling
• K constant that controls the effect of resource competition
• Threshold for instance generation
44
Simulation Setup
• The simulation environment is implemented in Nimrod/K
• A Parameter Sweep Actor (PS) is used to generate instances and does not require Grid execution.
• A dummy actor sleeps for a period of time equal to the ECT, as if it were executed by a particular resource.
• File transfer is also simulated using sleeps
• Three different workflow structures: sequential, parallel, and mixed structure.
• Three different execution settings featuring different degrees of resource competition.
45
The Three Workflow Structures
[Figure: a) Sequential (S), b) Parallel (P), c) Mixed Structure (M). Each workflow consists of tasks t1 to t5 fed by a Parameter Sweep actor (PS). File sizes shown: InputT1 900k, InputT2 600k, InputT3 800k, InputT4 700k, OutputT1 400k, OutputT2 500k, OutputT3 500k, OutputT4 400k]
46
The Three Execution Settings
• In Setting (1), the resource r2 has a very high resource competition (RC) value as it is the only resource that can run t2, t3 and t4
• In Settings (2) and (3), the RC of r2 is distributed to r1 and r3

Setting 1
      t1   t2   t3   t4   t5
r1     5    -    -    -    4
r2     4    5    4    6    3
r3     6    -    -    -    4
r4     8    -    -    -    6
r5     7    -    -    -    6

Setting 2
      t1   t2   t3   t4   t5
r1     5    -    4    -    4
r2     4    5    -    6    3
r3     6    -    -    -    4
r4     8    -    -    -    6
r5     7    -    -    -    6

Setting 3
      t1   t2   t3   t4   t5
r1     5    -    4    -    4
r2     4    5    -    -    3
r3     6    -    -    6    4
r4     8    -    -    -    6
r5     7    -    -    -    6
47
Simulation Scenarios
• Total 9 scenarios
– S1, S2, S3
– P1, P2, P3
– M1, M2, M3
(e.g. S1 is the sequential workflow using the first execution setting)
• Transfer rates between Grid resources are assumed to be static
48
Base-Run
• Use all 9 scenarios
• The threshold in the Instance Generation phase is defined using the number of unscheduled task instances and the number of free resource nodes
• “Threshold Multiplier” is set to 2
• Workflow instances are generated until the number of unscheduled instances is at least twice the number of free nodes
• K constant is set to 2.4
• Execute 1, 5, 10, 20, 50, and 100 workflow instances
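The threshold rule above can be sketched as a small Python function. It assumes that every workflow instance contributes the same number of task instances; the function and parameter names are hypothetical.

```python
def generate_instances(unscheduled, free_nodes, tasks_per_instance,
                       remaining_instances, multiplier=2):
    """Generate workflow instances until the pool reaches the threshold.

    The threshold is multiplier * free_nodes. Returns the number of newly
    generated instances and the resulting count of unscheduled task instances.
    """
    threshold = multiplier * free_nodes
    generated = 0
    while unscheduled < threshold and remaining_instances > 0:
        unscheduled += tasks_per_instance  # add the new instance's tasks to T
        remaining_instances -= 1
        generated += 1
    return generated, unscheduled


if __name__ == "__main__":
    # Five free nodes and five tasks per workflow instance.
    print(generate_instances(0, 5, 5, 100, multiplier=2))  # (2, 10)
    print(generate_instances(0, 5, 5, 100, multiplier=6))  # (6, 30)
```

A larger multiplier places more task instances in the pool per round, which is the quantity varied in the threshold evaluation.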
49
Improvement in Sequential Scenarios
[Figure: three panels (S1, S2, S3) plotting Improvement (ms) against Number of Instances (1, 5, 10, 20, 50, 100) for Min-Min, Max-Min and XSufferage]
50
Result: Sequential Scenarios
• In S1 and S2 only little improvement can be observed since r2 becomes a bottleneck. No resource selection actually takes place for t2, t3 and t4.
• Slight improvement is gained at the beginning and near the end due to the consideration of Resource Competition and transfer time
• S2 shows a weakness inherent to XSufferage
• The Besom algorithm improves the performance over Min-Min in S3 by avoiding allocating r2, r3, and r4 to tasks t1 and t5, which can run on other resources.
51
Improvement in Parallel Scenarios
[Figure: three panels (P1, P2, P3) plotting Improvement (ms) against Number of Instances (1, 5, 10, 20, 50, 100) for Min-Min, Max-Min and XSufferage]
52
Result: Parallel Scenarios
• The parallel scenarios do not show any significant improvement.
• In P1 and P2, performance is again restricted by the bottleneck caused by r2
• In P3, t2, t3, and t4 run on their exclusive resources in parallel so the performance of both algorithms is almost the same.
53
Improvement in Mixed Structure Scenarios
[Figure: three panels (M1, M2, M3) plotting Improvement (ms) against Number of Instances (1, 5, 10, 20, 50, 100) for Min-Min, Max-Min and XSufferage]
54
Result: Mixed Structure Scenarios
• The scenario M1 is also restricted by the bottleneck.
• M2 and M3 show significant improvement in the longer executions.
• In M2, XSufferage performs similarly to Besom because the transfer cost between some instances of the tasks t1 and t2 is eliminated.
• M3 demonstrates the situation where Besom performs best
– Complex workflow structure and execution setting
– Resource requirements overlap across sequential branches
– The bottleneck involves multiple resources
55
Single Workflow Instance
• Besom algorithm can be used for scheduling conventional scientific workflows that execute only once, with at least the same performance as the XSufferage algorithm
[Figure: Single Workflow Instance, plotting Improvement (ms) per Scenario (S1, S2, S3, P1, P2, P3, M1, M2, M3) for Min-Min, Max-Min and XSufferage]
56
Evaluation of K Constant
• Use scenarios S3, P3, M3
• Execute 100 workflow instances, Besom only
• Result:
– The value of 2.3 is optimal
– Ignoring resource competition (K=0) yields worse makespans
57
Evaluation of K Constant: Result
[Figure: three panels (S3, P3, M3) plotting Makespan (ms) against K Constant Value (0, 1, 1.5, 2, 2.2, 2.3, 2.4, 2.5, 3, 3.5, 4, 5)]
58
Threshold Evaluation
• Use all 9 scenarios
• K = 2.3, execute 100 workflow instances, against XSufferage
• Result:
– Optimal integer value for Threshold Multiplier is 6
– Resources are underutilised when the multiplier is set too low
– “Saturation state” is reached when the queues of the resources causing the bottleneck are heavily populated
59
Threshold Evaluation: Result (S)
[Figure: three panels (S1, S2, S3) plotting Makespan (ms) against Threshold Multiplier (1x to 7x) for Besom and XSufferage]
60
Threshold Evaluation: Result (P)
[Figure: three panels (P1, P2, P3) plotting Makespan (ms) against Threshold Multiplier (1x to 7x) for Besom and XSufferage]
61
Threshold Evaluation: Result (M)
[Figure: three panels (M1, M2, M3) plotting Makespan (ms) against Threshold Multiplier (1x to 7x) for Besom and XSufferage]
62
Simulation Using Optimal Threshold
• Use all 9 scenarios, against all 3 algorithms
• K = 2.3, Threshold = 6, execute 100 workflow instances
• Result:
– When the resource queues are heavily populated, the transfer time can be ignored as the wait time is much greater
– Long wait time resulting from the heavily populated resource queues minimises the advantages gained from both the consideration of ETT and the use of resource competition in resource selection
• It might not be possible to control the threshold in real workflows
63
Simulation Using Optimal Threshold: Result
[Figure: Improvement (ms) per Scenario (S1, S2, S3, P1, P2, P3, M1, M2, M3) for Min-Min, Max-Min and XSufferage]
64
Loop Scenarios
• Scenarios L1, L2 and L3 against all 3 algorithms
• K = 2.3, Threshold = 6, execute 100 workflow instances
• Loop iterates 3 times
[Figure: loop workflow with a PS actor and tasks t1 to t5. File sizes shown: InputT1 900k, OutputT1 400k, OutputT2 500k, OutputT3 500k, OutputT4 400k]
65
Result: Loop Scenarios
• Besom successfully schedules workflows with a single-level feedback loop
• Performance is similar to that of the sequential scenarios
[Figure: Improvement (ms) for scenarios L1, L2 and L3 for Min-Min, Max-Min and XSufferage]
66
Summary
• Besom algorithm can perform better when scheduling complex workflow structures in which the bottleneck involves multiple resources
• Besom algorithm can avoid the weakness of XSufferage
• Besom algorithm does not present any advantage when scheduling parallel independent tasks
• When using the optimal threshold, all algorithms perform similarly.
• Loop limitation: Besom may not perform optimally during the very last loop iteration
67
Summary: Base-Run
Scenario | Scheduling Algorithm: Min-Min, Max-Min, XSufferage
S1 ★ ★
S2 ★ ★ ★★★
S3 ★★ ★
P1
P2 ★
P3 ★
M1 ★ ★
M2 ★★ ★★ ★
M3 ★★★ ★★★ ★★★
★ - slight improvement, ★★ - moderate improvement, ★★★ - strong improvement
68
Conclusion
• We propose the Besom algorithm for Grid workflow scheduling, which is able to manage multiple workflow instances and single-level feedback loops.
• Besom algorithm can reduce the effect of bottlenecks by considering resource competition, and performs better in complex workflow structures.
• We demonstrate the importance of the number of input task instances to the scheduling (through the threshold multiplier).
69
Conclusion
• Besom algorithm is also applicable to conventional scientific workflows
• An independent scheduling software package is implemented, which can be used by any workflow system through the provided programming interfaces.
– Besom
– Min-Min
– Max-Min
– XSufferage
• Simulation environment is implemented in Nimrod/K
70
Future Work
• Scheduling parameter sweep in sub-workflow
– Instance generation is not under the control of scheduler
• Consideration of conditional branches
– Approach analogous to loop scheduling
• Workflow verification
– In order for the workflow to execute properly
• Workflow structure analyser
– Which task is or is not in any loop or branch?
71
Future Work
• Selective algorithm
– Switch between algorithms during run-time
– Use the best algorithm for the current situation
– Context-aware / Adaptive scheduling algorithm?
• Other scheduling objectives
– Cost
– Deadline
– Resource utilisation
– etc.
72
List of Publications
• S. Smanchat, S. Ling and M. Indrawan, “Toward grid workflow scheduling based on resource competition,” in Proceedings of the 13th Enterprise Distributed Object Computing Conference Workshops (EDOCW 2009), 2009, pp. 126-130.
• S. Smanchat, M. Indrawan, S. Ling, C. Enticott, and D. Abramson, “Scheduling Multiple Parameter Sweep Workflow Instances on the Grid,” in Proceedings of the 5th IEEE International Conference on e-Science (e-Science '09), Oxford, UK, 2009, pp. 300-306.
• S. Smanchat, M. Indrawan, S. Ling, C. Enticott, and D. Abramson, “A Scheduler based on Resource Competition for Parameter Sweep Workflow,” in Proceedings of the International Conference on Computational Science (ICCS 2011), 2011, pp. 176-185.
73
References
1. B. Ludäscher, I. Altintas, C. Berkley, D. Higgins, E. Jaeger, M. Jones, E. A. Lee, J. Tao, and Y. Zhao, "Scientific workflow management and the Kepler system," Concurr. Comput. : Pract. Exper., vol. 18, pp. 1039-1065, 2006.
2. D. Abramson, C. Enticott, and I. Altintas, "Nimrod/K: towards massively parallel dynamic grid workflows," in Proceedings of the 2008 ACM/IEEE conference on Supercomputing. Austin, Texas: IEEE Press, 2008.
3. Nimrod Toolkit, http://www.messagelab.monash.edu.au/Nimrod
4. J. Yu, R. Buyya, and K. Ramamohanarao, "Workflow Scheduling Algorithms for Grid Computing," in Metaheuristics for Scheduling in Distributed Computing Environments, 2008, pp. 173-214.
5. M. Maheswaran, S. Ali, H. J. Siegel, D. Hensgen, and R. F. Freund, "Dynamic matching and scheduling of a class of independent tasks onto heterogeneous computing systems," in Proceedings of the 8th Heterogeneous Computing Workshop (HCW '99), 1999, pp. 30-44.
6. H. Casanova, A. Legrand, D. Zagorodnov, and F. Berman, "Heuristics for scheduling parameter sweep applications in grid environments," in Proceedings of the 9th Heterogeneous Computing Workshop (HCW 2000) 2000, pp. 349-363.
7. X. He, X. Sun, and G. v. Laszewski, "QoS guided min-min heuristic for grid task scheduling," J. Comput. Sci. Technol., vol. 18, pp. 442-451, 2003.
74
8. K. Etminani and M. Naghibzadeh, "A Min-Min Max-Min selective algorihtm for grid task scheduling," in Proceedings of the 3rd IEEE/IFIP International Conference in Central Asia on Internet (ICI 2007), 2007, pp. 1-7.
9. K. Liu, J. Chen, H. Jin, and Y. Yang, "A Min-Min Average Algorithm for Scheduling Transaction-Intensive Grid Workflows," in Proceedings of the 7th Australasian Symposium on Grid Computing and e-Research (AusGrid 2009). Wellington, New Zealand: Australian Computer Society, Inc., 2009, pp. 41-48.
10. H. Topcuoglu, S. Hariri, and W. Min-You, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, pp. 260-274, 2002.
11. R. Sakellariou and H. Zhao, "A hybrid heuristic for DAG scheduling on heterogeneous systems," in Proceedings of the 18th International Conference on Parallel and Distributed Processing Symposium, 2004, pp. 111.
12. Z. Shi and J. J. Dongarra, "Scheduling workflow applications on processors with different capabilities," Future Gener. Comput. Syst., vol. 22, pp. 665-675, 2006.
13. S. Smanchat, M. Indrawan, S. Ling, C. Enticott, and D. Abramson, "Scheduling Multiple Parameter Sweep Workflow Instances on the Grid," in Proceedings of the 5th IEEE International Conference on e-Science (e-Science '09), Oxford, UK, 2009, pp. 300-306.
75
14. M. L. Pinedo, Scheduling: Theory, Algorithms, and Systems, 3rd ed. Springer, 2008.
15. W. M. P. van der Aalst, "The Application of Petri Nets to Workflow Management," The Journal of Circuits, Systems and Computers, vol. 8, no. 1, pp. 21-66, 1998.
76
Question?
Image from http://services.flikie.com/view/v3/android/wallpapers/16790538
77
Appendix
78
Scheduling Algorithms: Batch Mode
• Task Prioritising Phase
– Gather independent tasks that are ready to execute into the batch
– Calculate performance metrics such as Estimated Completion Time (ECT)
• Resource Selection Phase
– Allocate the tasks in the batch to the resource with the “best” performance metric depending on each heuristic
> Fastest resource
> Resource with minimum file transfer time
79
Scheduling Algorithms: Dependency Mode
• Task Prioritising Phase
– Rank all tasks in the workflow based on performance metrics such as ECT and the position of each task in the workflow.
– Tasks closer to the beginning of the workflow are ranked higher.
• Resource Selection Phase
– Allocate the task with the highest rank to the resource with the “best” performance metric depending on each heuristic e.g. resource with Earliest Finish Time
80
RCPSP vs Grid Workflow Scheduling

RCPSP
• Focuses on when to start activities and in which particular order so that the schedule is feasible given the resource availability constraint
• Organisations have complete control over their resources
• Resources are mainly considered homogeneous with respect to activities
• Time lag is more static

Grid Workflow Scheduling
• Needs to decide by which resources the tasks will be executed in which particular order, according to the performance of each resource
• No centralised control
• Resources are mainly heterogeneous
• Network condition is dynamic
81
Tasks & Task Instances
• Grid tasks: need to be scheduled to a Grid resource for execution, e.g. computational tasks
• Local tasks: run on the local processor and are not considered in the scheduling, e.g. Boolean expressions and flow controls
• Task instance is denoted as tw where w is the workflow instance identifier
• Task instances t1w1 and t1w2 are different (same task but of different workflow instances)
82
Firing Resource Queues
• The task queue of each resource is maintained by the scheduler.
• To execute scheduled tasks, the scheduler goes through each queue and starts the tasks within.
• Scheduler can:
– strictly follow the schedule
– skip waiting tasks
– combine the two
83
Firing Resource Queues: Alternatives 1
• Strictly follow the schedule
– The scheduler stops as soon as it encounters a task that cannot start (waiting for a transfer or a free resource node)
– The scheduler misses the opportunity to start the subsequent tasks in the queue that are ready to start
– Maintains the order of execution as scheduled
– Useful when the number of free resource nodes is not known
84
Firing Resource Queues: Alternatives 2
• Skip the waiting tasks
– The scheduler skips tasks that are waiting for transfer and starts the ready ones.
– Free resource nodes are used immediately (higher utilisation)
– The skipped task might have to wait even longer to execute and this delay may propagate through the rest of the workflow.
– Number of free resource nodes must be known to the scheduler.
85
Firing Resource Queues
• The scheduler skips tasks that are waiting while reserving enough resource nodes for the skipped tasks
• Fire a ready task if there are enough resource nodes for it and all the skipped tasks; otherwise stop and proceed to the next resource queue.
• The schedule may be violated
• Task execution is not unnecessarily blocked by file transfer.
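The combined strategy, skipping waiting tasks while reserving nodes for them, can be sketched as follows. The sketch assumes each task occupies exactly one node and that readiness is already known; `fire_queue` and its input format are hypothetical.

```python
def fire_queue(queue, free_nodes):
    """Return the tasks to start from one resource queue.

    queue: list of (task id, is_ready) pairs in scheduled order.
    A waiting task is skipped but one node is reserved for it; a ready task
    is fired only if a node remains for it and for every skipped task.
    """
    started = []
    reserved = 0
    for task_id, is_ready in queue:
        if not is_ready:
            reserved += 1  # keep one node aside for the skipped task
            continue
        nodes_left = free_nodes - len(started)
        if nodes_left >= reserved + 1:
            started.append(task_id)
        else:
            break  # stop and proceed to the next resource queue
    return started


if __name__ == "__main__":
    queue = [("a", False), ("b", True), ("c", True)]
    print(fire_queue(queue, free_nodes=3))  # ['b', 'c']
    print(fire_queue(queue, free_nodes=2))  # ['b']
    print(fire_queue(queue, free_nodes=1))  # []
```

With two free nodes, task b starts ahead of a (so the scheduled order is violated), but one node stays reserved so that a can start as soon as its transfer completes.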
86
Scheduler Termination
• Parameter sweep workflows involve several workflow instances and scheduling rounds
• The scheduler needs to know when to stop
– No task is waiting for execution or is being executed
– No unscheduled task instance remains
– No workflow instance remains
– No input remains at any task
• We assume that the workflows scheduled based on our algorithm need to be “well-structured” and “bounded” following the definitions in [15]
87
[Figure: the three workflow structures: Sequential (S), Parallel (P), Mixed Structure (M)]
88
Transfer Rates

• In kilobytes per second

       r1    r2    r3    r4    r5
r1      -    60    70   100    90
r2    110     -   100    60    70
r3    100    90     -   130    80
r4     50    80   120     -    90
r5     90    70   110   120     -
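As a worked example of how the static transfer rates translate into transfer times, the sketch below divides a file size by the rate between two resources. It assumes that rows are the sending resource and columns the receiving resource, and that the "k" sizes in the workflow figures are kilobytes; the slides do not state either.

```python
# Transfer rates in kilobytes per second, copied from the table above.
# Assumption: outer key = sending resource, inner key = receiving resource.
RATE = {
    "r1": {"r2": 60, "r3": 70, "r4": 100, "r5": 90},
    "r2": {"r1": 110, "r3": 100, "r4": 60, "r5": 70},
    "r3": {"r1": 100, "r2": 90, "r4": 130, "r5": 80},
    "r4": {"r1": 50, "r2": 80, "r3": 120, "r5": 90},
    "r5": {"r1": 90, "r2": 70, "r3": 110, "r4": 120},
}


def transfer_time(source, destination, size_kb):
    """Seconds needed to move size_kb kilobytes between two resources."""
    if source == destination:
        return 0.0  # the file is already on the resource
    return size_kb / RATE[source][destination]


if __name__ == "__main__":
    # A 400 kB output of t1 moved from r1 to r2 and from r2 to r1.
    print(round(transfer_time("r1", "r2", 400), 2))
    print(round(transfer_time("r2", "r1", 400), 2))
```

Because the table is not symmetric, the same file takes a different time in each direction, and scheduling consecutive tasks on the same resource removes the transfer cost entirely.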