Yili Gong, Marlon E. Pierce and Geoffrey C. Fox, Community Grids Lab, Indiana University
Workflow Matchmaking in Grids
Target Execution Environment
TeraGrid
Target Problem
Decide when each job in the workflow should run, and on which resource.
Assumptions
If two jobs have no logic or data dependency, they can run simultaneously on a computing resource, provided no limit is exceeded.
Jobs can only run on some of the resources.
Motivation
It is better to group resource-critical jobs together than to map them individually.
The Resource-Critical Workflow Matchmaking Algorithm
Ranking
Sort the jobs in non-ascending order of their rank values.
Grouping
Get the first ungrouped node in the sorted node list as the first node of a new group.
Check each of its children: if the child's ancestors are all grouped and its resource match ratio is below a certain threshold, add it to the group and go on to check its children.
Matchmaking
Match the nodes in each group to resources.
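The grouping step can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `group_nodes`, the dictionary-based DAG representation, and the edge list in the usage part are all hypothetical and chosen only for illustration. The sketch checks direct parents only, which should be sufficient when nodes are visited in non-ascending rank order and an ancestor always ranks higher than its descendants, because every ancestor is then grouped before the node is reached.

```python
def group_nodes(sorted_nodes, children, parents, match_ratio, threshold):
    """Group nodes as described in the Grouping step (illustrative sketch).

    sorted_nodes: node ids in non-ascending rank order.
    children, parents: dicts mapping a node id to a list of node ids.
    match_ratio: dict mapping a node id to its resource match ratio.
    threshold: match ratio threshold; children below it join the group.
    """
    group_of = {}  # node id -> index of its group
    groups = []
    for seed in sorted_nodes:
        if seed in group_of:
            continue
        # The first ungrouped node starts a new group.
        gid = len(groups)
        groups.append([seed])
        group_of[seed] = gid
        stack = [seed]
        while stack:
            node = stack.pop()
            for child in children.get(node, []):
                if child in group_of:
                    continue
                parents_grouped = all(p in group_of for p in parents.get(child, []))
                if parents_grouped and match_ratio[child] < threshold:
                    groups[gid].append(child)
                    group_of[child] = gid
                    stack.append(child)  # check its children further on
    return groups


# Hypothetical 10-node DAG (this edge list is an assumption for illustration).
children = {0: [1, 2, 3, 4, 5], 1: [7], 2: [8], 3: [7, 8],
            4: [6], 5: [6], 6: [9], 7: [9], 8: [9]}
parents = {}
for p, cs in children.items():
    for c in cs:
        parents.setdefault(c, []).append(p)

# Match ratios: fraction of the three resources each job can run on.
match_ratio = {0: 1.0, 1: 1.0, 2: 1.0, 3: 2 / 3, 4: 1.0,
               5: 1 / 3, 6: 1.0, 7: 1.0, 8: 2 / 3, 9: 2 / 3}
ranking = [0, 1, 5, 4, 3, 2, 7, 6, 8, 9]

groups = group_nodes(ranking, children, parents, match_ratio, threshold=0.8)
print(groups)
```

With this hypothetical edge list, the sketch yields the groups {0, 3, 5}, {1}, {4}, {2, 8}, {7}, {6, 9}.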
Example
[Figure: DAG for the workflow (nodes 0 to 9) and sizes of data transferred between jobs. Edge labels: 14, 18, 22, 13, 25, 15, 26, 20, 14, 21, 17, 26, 20, 19.]
[Figure: Data transfer rates between resources R0, R1 and R2. Link labels: 1.4, 0.9, 1.0.]
Execution times for jobs on resources:
Node R0 R1 R2
0 17 19 21
1 22 27 23
2 15 15 9
3 ∞ 8 9
4 17 14 20
5 ∞ ∞ 30
6 17 16 15
7 49 49 25
8 16 22 ∞
9 23 ∞ 19
Example
Step 1: Ranking
The weight of a node is the average of the job’s execution times on all possible resources.
The weight of an edge is the average of the jobs’ communication times on all possible resource combinations.
[Figure: DAG annotated with node and edge weights. Edge weights: 18.3, 22.7, 13.2, 28.5, 15.2, 26.8, 19.1, 14.2, 20.0, 29.0, 27.6, 21.2, 21.7, 14.2.]
Node Weight
0 19.0
1 24.0
2 13.0
3 8.5
4 17.0
5 30.0
6 16.0
7 41.0
8 19.0
9 21.0
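The node weights can be checked with a few lines of Python. This is a small sketch: the dictionary `exec_times` transcribes the execution-time table of the example, while the names `exec_times`, `node_weight` and `weights` are illustrative choices, and ∞ is represented by `math.inf` to mark a resource the job cannot run on.

```python
import math

INF = math.inf

# Execution times from the example table (columns: R0, R1, R2).
exec_times = {
    0: [17, 19, 21],
    1: [22, 27, 23],
    2: [15, 15, 9],
    3: [INF, 8, 9],
    4: [17, 14, 20],
    5: [INF, INF, 30],
    6: [17, 16, 15],
    7: [49, 49, 25],
    8: [16, 22, INF],
    9: [23, INF, 19],
}


def node_weight(times):
    """Average execution time over the resources the job can run on."""
    feasible = [t for t in times if t != INF]
    return sum(feasible) / len(feasible)


weights = {n: round(node_weight(t), 1) for n, t in exec_times.items()}
for n, w in weights.items():
    print(n, w)
```

For example, node 3 can run only on R1 and R2, so its weight is (8 + 9) / 2 = 8.5.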
Example
Step 1: Ranking
Use the same upward rank computing approach as in HEFT. Sort the jobs in non-ascending order of the rank values.
Node Rank
0 168.2
1 122.5
2 93.8
3 99.9
4 114.5
5 120.8
6 64.6
7 83.2
8 61.7
9 21.0
Ranking: 0, 1, 5, 4, 3, 2, 7, 6, 8, 9
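As a rough illustration of the upward rank used in HEFT, the rank of a node is its weight plus the maximum, over its children, of the edge weight plus the child's rank; an exit node's rank is just its own weight. The sketch below assumes this standard definition. The function name `upward_ranks` and the four-node DAG with its weights are hypothetical and are not taken from the slides.

```python
from functools import lru_cache


def upward_ranks(node_w, edge_w, children):
    """Compute HEFT-style upward ranks for every node of a DAG.

    node_w: dict node -> average execution time (node weight).
    edge_w: dict (parent, child) -> average communication time (edge weight).
    children: dict node -> list of child nodes.
    """
    @lru_cache(maxsize=None)
    def rank(n):
        succ = children.get(n, [])
        if not succ:
            return node_w[n]  # exit node: rank equals its own weight
        return node_w[n] + max(edge_w[(n, c)] + rank(c) for c in succ)

    return {n: rank(n) for n in node_w}


# Hypothetical four-node DAG, for illustration only.
node_w = {"a": 10.0, "b": 5.0, "c": 8.0, "d": 4.0}
edge_w = {("a", "b"): 2.0, ("a", "c"): 1.0, ("b", "d"): 3.0, ("c", "d"): 6.0}
children = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}

ranks = upward_ranks(node_w, edge_w, children)
# Sort the jobs in non-ascending order of rank.
order = sorted(ranks, key=ranks.get, reverse=True)
print(ranks)
print(order)
```

Sorting by non-ascending upward rank places every node ahead of its descendants as long as the weights are positive, which is why the sorted list can be scanned front to back in the later steps.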
Example
Execution times for jobs on resources:
Node R0 R1 R2
0 17 19 21
1 22 27 23
2 15 15 9
3 ∞ 8 9
4 17 14 20
5 30 ∞ ∞
6 17 16 15
7 49 49 25
8 ∞ 22 16
9 23 ∞ 19
Step 1: Ranking: 0, 1, 5, 4, 3, 2, 7, 6, 8, 9
Step 2: Grouping
Match Ratio: the ratio of the number of resources the job can run on to the total number of resources.
Match Ratio Threshold α = 0.8
Groups: {0, 3, 5}, {1}, {4}, {2, 8}, {7}, {6, 9}
[Figure: DAG annotated with the match ratio of each node (values 1, 0.67 and 0.33).]
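The match ratio can be computed directly from the execution-time table. This is a minimal sketch: the rows transcribe the table shown on this slide, with ∞ represented as `math.inf`, and the names `match_ratio`, `ALPHA` and `below_threshold` are illustrative assumptions.

```python
import math

INF = math.inf
ALPHA = 0.8  # match ratio threshold from the slide

# Execution times per node on R0, R1, R2 (INF: the job cannot run there).
exec_times = {
    0: [17, 19, 21],
    1: [22, 27, 23],
    2: [15, 15, 9],
    3: [INF, 8, 9],
    4: [17, 14, 20],
    5: [30, INF, INF],
    6: [17, 16, 15],
    7: [49, 49, 25],
    8: [INF, 22, 16],
    9: [23, INF, 19],
}


def match_ratio(times):
    """Number of resources the job can run on, over the total number of resources."""
    return sum(1 for t in times if t != INF) / len(times)


ratios = {n: match_ratio(t) for n, t in exec_times.items()}
# Nodes whose match ratio falls below the threshold are candidates for grouping.
below_threshold = [n for n, r in ratios.items() if r < ALPHA]
print({n: round(r, 2) for n, r in ratios.items()})
print(below_threshold)
```

With this table, nodes 3, 5, 8 and 9 fall below α = 0.8, which agrees with these being the only nodes that join a group started by another node.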
Example
Step 3: Matchmaking the group 0, 3, 5
Experimental Evaluation -- Setting
DAG Generator
Parameter Sweep Applications
Heterogeneity Model
Match Ratio
Communication Bandwidth
Communication-to-Computation Ratio (CCR)
Match Ratio Threshold (MRT)
Experimental Evaluation -- Metrics
Compare our resource-critical algorithm with the minimum EFT algorithm.
Difference ratio of NSL
Normalized Schedule Length (NSL)
The ratio of the real makespan to a fixed cost of the critical path:
$NSL = \frac{L}{\sum_{n_i \in CP} w(n_i)}$
where $L$ is the real makespan and $w(n_i)$ is the cost of node $n_i$ on the critical path $CP$.
Average Improvement Ratio
The average of the difference ratios of all (200) cases in a certain setting.
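The metrics can be illustrated with a short Python sketch. The slides do not spell out the difference ratio, so the form used here, the relative reduction in NSL with respect to the min EFT baseline, is an assumption, as are the function names and all numeric inputs, which are made up for illustration.

```python
def nsl(makespan, critical_path_costs):
    """Normalized Schedule Length: makespan over the summed critical-path cost."""
    return makespan / sum(critical_path_costs)


def difference_ratio(nsl_baseline, nsl_candidate):
    # ASSUMPTION: relative NSL reduction versus the baseline (min EFT).
    # The slides name this metric but do not define it.
    return (nsl_baseline - nsl_candidate) / nsl_baseline


def average_improvement_ratio(cases):
    """Average of the difference ratios over all cases in one setting."""
    ratios = [difference_ratio(base, cand) for base, cand in cases]
    return sum(ratios) / len(ratios)


# Hypothetical numbers, for illustration only.
example_nsl = nsl(120.0, [20.0, 30.0, 10.0])
cases = [(2.0, 1.5), (4.0, 3.0)]  # (NSL of min EFT, NSL of resource-critical)
avg_ratio = average_improvement_ratio(cases)
print(example_nsl, avg_ratio)
```

Under this assumed definition, a positive average improvement ratio means the resource-critical algorithm produced shorter normalized schedules than min EFT on average.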
Results -- Influence of MRT
Branch Number = 4, Depth = 8
[Figure: influence of MRT, with two annotated regions.]
Region 1: The resource-critical algorithm performs worse than the min EFT algorithm.
Region 2: The resource-critical algorithm performs better than the min EFT algorithm.
Results -- Influence of CCR
Branch Number = 4, Depth = 8
The average improvement ratio increases from 23% to 43% as CCR varies from 0.1 to 1. This shows that the resource-critical algorithm works better when communication cost plays a bigger role.
Results -- Influence of the Shape of DAGs
CCR = 1.0, MRT = 0.5
[Figure: two panels, one with Depth = 24 and one with Branch Number = 4.]
The branch number has little influence on the performance of the resource-critical algorithm, whereas the depth does affect it.
Contact: Yili Gong: [email protected]
Website: http://grids.ucs.indiana.edu/ptliupages/publications/
Questions?