Yili Gong, Marlon E. Pierce and Geoffrey C. Fox, Community Grids Lab, Indiana University.


Workflow Matchmaking in Grids

Target Execution Environment

TeraGrid

Target Problem

Decide when, and on which resource, each job in the workflow should run.

Assumptions

Two jobs that have no logic or data dependency can run simultaneously on a computing resource, provided no limit is exceeded.

Jobs can only run on some of the resources.

Motivation

It is better to group resource-critical jobs together than to map them individually.

The Resource-Critical Workflow Matchmaking Algorithm

Ranking

Sort the jobs in non-ascending order of their rank values.

Grouping

Take the first ungrouped node in the sorted node list as the first node of a new group.

Check each of its children: if the child's ancestors are grouped and its resource match ratio is below a certain threshold, add it to the group and check its children in turn.

Matchmaking

Perform matchmaking for the nodes in each group.
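The grouping step described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `group_nodes`, the dictionary-based DAG representation, and the toy DAG with its match ratios are all hypothetical, and "its ancestors are grouped" is read here as "all of its parents already belong to some group".

```python
from collections import deque


def group_nodes(ranked, children, parents, match_ratio, threshold):
    """Group resource-critical children with the node that heads the group.

    ranked: nodes in non-ascending rank order.
    ASSUMPTION: a child joins the current group when all of its parents are
    already grouped and its match ratio is below the threshold.
    """
    group_of = {}   # node -> index of the group it belongs to
    groups = []
    for head in ranked:
        if head in group_of:
            continue                      # already absorbed into an earlier group
        gid = len(groups)
        groups.append([head])
        group_of[head] = gid
        queue = deque([head])
        while queue:
            node = queue.popleft()
            for child in children.get(node, []):
                if child in group_of:
                    continue
                parents_grouped = all(p in group_of for p in parents.get(child, []))
                if parents_grouped and match_ratio[child] < threshold:
                    groups[gid].append(child)
                    group_of[child] = gid
                    queue.append(child)   # check this child's children in turn
    return groups


# Hypothetical toy DAG (not the DAG from the slides): 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
children = {0: [1, 2], 1: [3], 2: [3], 3: []}
parents = {0: [], 1: [0], 2: [0], 3: [1, 2]}
match_ratio = {0: 1.0, 1: 1.0, 2: 0.33, 3: 0.67}
ranked = [0, 1, 2, 3]

groups = group_nodes(ranked, children, parents, match_ratio, threshold=0.8)
print(groups)
```

In this toy case node 2 joins the group headed by node 0, while node 3 has to wait until node 1 is grouped, because one of its parents is still ungrouped when node 2 is processed.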

Example

[Figure: DAG for the workflow (nodes 0 to 9) and sizes of data transferred between jobs.]

[Figure: resources R0, R1 and R2 and the data transfer rates between them (1.4, 0.9, 1.0).]

Execution times for jobs on resources:

Node  R0  R1  R2
0     17  19  21
1     22  27  23
2     15  15   9
3      ∞   8   9
4     17  14  20
5      ∞   ∞  30
6     17  16  15
7     49  49  25
8     16  22   ∞
9     23   ∞  19

Example

Step 1: Ranking

The weight of a node is the average of the job’s execution times on all possible resources.

The weight of an edge is the average of the jobs’ communication times on all possible resource combinations.

[Figure: workflow DAG annotated with node weights and edge weights.]

Node weights:

Node  Weight
0     19.0
1     24.0
2     13.0
3      8.5
4     17.0
5     30.0
6     16.0
7     41.0
8     19.0
9     21.0
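As an illustration of the node-weight definition above, the following sketch averages each job's execution time over the resources it can actually run on, using the execution-time table from the example. The names `exec_time` and `node_weight` are hypothetical, and treating ∞ as "cannot run on this resource" is an assumption about how the table should be read.

```python
import math

INF = math.inf  # ASSUMPTION: an infinite time means the job cannot run there

# Execution times from the example table (columns: R0, R1, R2).
exec_time = {
    0: [17, 19, 21],
    1: [22, 27, 23],
    2: [15, 15, 9],
    3: [INF, 8, 9],
    4: [17, 14, 20],
    5: [INF, INF, 30],
    6: [17, 16, 15],
    7: [49, 49, 25],
    8: [16, 22, INF],
    9: [23, INF, 19],
}


def node_weight(times):
    """Average execution time over the resources the job can run on."""
    feasible = [t for t in times if t != INF]
    return sum(feasible) / len(feasible)


weights = {node: round(node_weight(times), 1) for node, times in exec_time.items()}
print(weights)
```

Edge weights are left out of this sketch because the mapping of transfer rates to resource pairs is not recoverable from the slide text.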

Example

Step 1: Ranking

Use the same upward rank computation as in HEFT, then sort the jobs in non-ascending order of their rank values.

[Figure: workflow DAG annotated with the upward rank of each node.]

Node  Rank
0     168.2
1     122.5
2      93.8
3      99.9
4     114.5
5     120.8
6      64.6
7      83.2
8      61.7
9      21.0

Resulting order: 0, 1, 5, 4, 3, 2, 7, 6, 8, 9
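The upward rank used in HEFT is defined recursively: the rank of a node is its weight plus the maximum, over its successors, of the edge weight plus the successor's rank, and an exit node's rank is just its weight. A minimal sketch of that recursion follows; the function name `upward_ranks` and the toy DAG, weights and communication costs are hypothetical and are not the values from the slides.

```python
from functools import lru_cache


def upward_ranks(weight, succ, comm):
    """rank_u(i) = w(i) + max over successors j of (c(i, j) + rank_u(j))."""

    @lru_cache(maxsize=None)
    def rank(i):
        # An exit node has no successors, so its rank is just its weight.
        return weight[i] + max(
            (comm[(i, j)] + rank(j) for j in succ[i]), default=0.0
        )

    return {i: rank(i) for i in weight}


# Hypothetical toy DAG: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
weight = {0: 10.0, 1: 5.0, 2: 8.0, 3: 4.0}
succ = {0: [1, 2], 1: [3], 2: [3], 3: []}
comm = {(0, 1): 2.0, (0, 2): 1.0, (1, 3): 3.0, (2, 3): 1.0}

ranks = upward_ranks(weight, succ, comm)
# Sort the jobs in non-ascending order of rank.
order = sorted(ranks, key=lambda node: -ranks[node])
print(ranks)
print(order)
```

Sorting by descending upward rank always places a node before its successors, which is why the ranked list can be processed front to back in the grouping step.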

Example

Step 2: Grouping

Match Ratio: the ratio of the number of resources the job can run on to the total number of resources.

Match Ratio Threshold α = 0.8

Order from Step 1 (Ranking): 0, 1, 5, 4, 3, 2, 7, 6, 8, 9

[Figure: workflow DAG annotated with the match ratio of each node.]

Node  R0  R1  R2
0     17  19  21
1     22  27  23
2     15  15   9
3      ∞   8   9
4     17  14  20
5     30   ∞   ∞
6     17  16  15
7     49  49  25
8      ∞  22  16
9     23   ∞  19

Groups: {0, 3, 5}, {1}, {4}, {2, 8}, {7}, {6, 9}
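To make the match ratio definition concrete, the sketch below counts, for each node in the table of this slide, the resources with a finite execution time and divides by the total number of resources. The names `exec_time`, `match_ratio` and `ALPHA` are hypothetical, and reading ∞ as "cannot run on this resource" is an assumption.

```python
import math

INF = math.inf  # ASSUMPTION: an infinite time means the job cannot run there

# Execution times as listed on the grouping slide (columns: R0, R1, R2).
exec_time = {
    0: [17, 19, 21],
    1: [22, 27, 23],
    2: [15, 15, 9],
    3: [INF, 8, 9],
    4: [17, 14, 20],
    5: [30, INF, INF],
    6: [17, 16, 15],
    7: [49, 49, 25],
    8: [INF, 22, 16],
    9: [23, INF, 19],
}

ALPHA = 0.8  # match ratio threshold from the slide


def match_ratio(times):
    """Fraction of resources on which the job can run."""
    return sum(1 for t in times if t != INF) / len(times)


ratios = {node: round(match_ratio(times), 2) for node, times in exec_time.items()}
# Nodes whose match ratio is below the threshold are candidates for grouping.
below_threshold = [node for node, ratio in ratios.items() if ratio < ALPHA]
print(ratios)
print(below_threshold)
```

With α = 0.8 and three resources, any job that cannot run on at least one resource falls below the threshold.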

Example

Step 3: Matchmaking a group 0, 3, 5

[Figure: workflow DAG, resources R0, R1 and R2 with their data transfer rates, and the execution-time table.]

Experimental Evaluation -- Setting

DAG Generator

Parameter Sweep Applications

Heterogeneity Model

Match Ratio

Communication Bandwidth

Communication-to-Computation Ratio (CCR)

Match Ratio Threshold (MRT)

Experimental Evaluation -- Metrics

Compare our resource-critical algorithm with the minimum EFT algorithm.

Difference ratio of NSL

Normalized Schedule Length (NSL)

The ratio of the real makespan to a fixed cost of the critical path:

$$\mathrm{NSL} = \frac{L}{\sum_{n_i \in CP} w(n_i)}$$

where L is the real makespan and CP is the critical path.

Average Improvement Ratio

The average of the difference ratios of all (200) cases in a certain setting.
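The metrics above can be sketched as follows. The slides do not spell out the difference ratio, so this sketch assumes it is the relative reduction in NSL against the minimum EFT baseline; that formula, the function names, and the sample NSL values are all hypothetical.

```python
def nsl(makespan, critical_path_cost):
    """Normalized Schedule Length: makespan over a fixed critical-path cost."""
    return makespan / critical_path_cost


def difference_ratio(nsl_min_eft, nsl_resource_critical):
    # ASSUMPTION: relative NSL reduction against the min EFT baseline;
    # the slides name this metric but do not define it.
    return (nsl_min_eft - nsl_resource_critical) / nsl_min_eft


def average_improvement_ratio(nsl_pairs):
    """Mean difference ratio over all cases in one experimental setting."""
    ratios = [difference_ratio(base, ours) for base, ours in nsl_pairs]
    return sum(ratios) / len(ratios)


# Hypothetical (min EFT NSL, resource-critical NSL) pairs; the slides use 200 cases.
cases = [(2.0, 1.5), (4.0, 3.0), (1.0, 1.0)]
average = average_improvement_ratio(cases)
print(average)
```

Under this assumed definition a positive average means the resource-critical algorithm produced shorter normalized schedules than the baseline on average.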

Results -- Influence of MRT

Branch Number = 4, Depth = 8

Region annotations on the plot:

The resource-critical algorithm performs worse than the min EFT algorithm.

The resource-critical algorithm performs better than the min EFT algorithm.

Results -- Influence of CCR

Branch Number = 4, Depth = 8

The average improvement ratio increases from 23% to 43% as CCR varies from 0.1 to 1. This shows that the resource-critical algorithm works better when communication cost plays a bigger role.

Results -- Influence of the Shape of DAGs

CCR = 1.0, MRT = 0.5

Panel titles: Depth = 24; Branch Number = 4

The branch number has little influence on the performance of the resource-critical algorithm, whereas the depth does influence it.

Contact: Yili Gong, [email protected]

Website: http://grids.ucs.indiana.edu/ptliupages/publications/

Questions?
