Automating Data-Throttling Analysis for Data-Intensive...
Transcript of Automating Data-Throttling Analysis for Data-Intensive...
Automating Data-Throttling Analysis for Data-Intensive
Workflow
Ricardo J. Rodrıguez, Rafael Tolosana-Calasanz, Omer F. Rana{rjrodriguez, rafaelt}@unizar.es, [email protected]
Universidad de Zaragoza Cardiff UniversityZaragoza, Spain Cardiff, United Kingdom
May 15th, 2012
CCGrid’12: 12th IEEE/ACM International Symposium on Cluster,Cloud and Grid Computing
Ottawa, Canada
Motivation
Outline
1 MotivationKnowing the problemData-movement policiesThe goal & our approach
2 BackgroundPetri netsFrom DAG to PNSlack Concept
3 Automating Data-Throttling Analysis
4 Experiments and ResultsImpact on the Workflow MakespanInput Buffers and Network Bandwidth Usage
5 Related Work
6 Conclusions and Future Work
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 2 / 25
Motivation Knowing the problem
Motivation (I): data movement issue
Scientific Workflows
Control/data flow graph
Set of tasksDependencies
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 3 / 25
Motivation Knowing the problem
Motivation (I): data movement issue
Scientific Workflows
Control/data flow graph
Set of tasksDependencies
Data-Intensive Workflows
Data workload ≥ computational workload
How to deal with this?
Minimise data movementMove data via higher capacity links as fast as possible
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 3 / 25
Motivation Data-movement policies
Motivation (II): data-movement policies
Pipeline workflow
Transmit as fast as possible → OK!
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 4 / 25
Motivation Data-movement policies
Motivation (II): data-movement policies
Pipeline workflow
Transmit as fast as possible → OK!
Directed Acyclic Graph workflow
Transmit as fast as possible →WRONG!
Cluster tasks → minimise datamovement
Problems may arise
Isolation view: networking + buffering(merge tasks)In-the-large: harmful (finite buffercapacity → cannot be reused)
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 4 / 25
Motivation The goal & our approach
Motivation (III): deriving data-throttling
The goal
Balance workflow paths. . .
. . . eliminating unnecessary bandwidth usage
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 5 / 25
Motivation The goal & our approach
Motivation (III): deriving data-throttling
The goal
Balance workflow paths. . .
. . . eliminating unnecessary bandwidth usage
Efficient buffer usageEfficient network bandwidth usage
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 5 / 25
Motivation The goal & our approach
Motivation (III): deriving data-throttling
The goal
Balance workflow paths. . .
. . . eliminating unnecessary bandwidth usage
Efficient buffer usageEfficient network bandwidth usage
Open question: WHAT strategy should I use?
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 5 / 25
Motivation The goal & our approach
Motivation (III): deriving data-throttling
The goal
Balance workflow paths. . .
. . . eliminating unnecessary bandwidth usage
Efficient buffer usageEfficient network bandwidth usage
Open question: WHAT strategy should I use?
Our approach
Automatically derive values for data-throttling
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 5 / 25
Motivation The goal & our approach
Motivation (III): deriving data-throttling
The goal
Balance workflow paths. . .
. . . eliminating unnecessary bandwidth usage
Efficient buffer usageEfficient network bandwidth usage
Open question: WHAT strategy should I use?
Our approach
Automatically derive values for data-throttling
Directed Acyclic Graphs (DAGs) → Petri nets (performance model)
Analysis on Petri net model (explained later :))
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 5 / 25
Background
Outline
1 MotivationKnowing the problemData-movement policiesThe goal & our approach
2 BackgroundPetri netsFrom DAG to PNSlack Concept
3 Automating Data-Throttling Analysis
4 Experiments and ResultsImpact on the Workflow MakespanInput Buffers and Network Bandwidth Usage
5 Related Work
6 Conclusions and Future Work
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 6 / 25
Background Petri nets
Background (I): Petri nets (PNs)
Mathematical formalism
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 7 / 25
Background Petri nets
Background (I): Petri nets (PNs)
Mathematical formalism
Places (circles, pX )
Transitions (bars, tX ). Associated delay
Immediate (δtX = 0)Timed (δtX exponential distribution →Stochastic Petri Nets)
Tokens (black dots)
Directed arcs
Place → TransitionTransition → Place
Initial marking: no. tokens on places
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 7 / 25
Background Petri nets
Background (I): Petri nets (PNs)
Mathematical formalism
Places (circles, pX )
Transitions (bars, tX ). Associated delay
Immediate (δtX = 0)Timed (δtX exponential distribution →Stochastic Petri Nets)
Tokens (black dots)
Directed arcs
Place → TransitionTransition → Place
Initial marking: no. tokens on places
Enables modelling
ConcurrencySynchronisation
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 7 / 25
Background From DAG to PN
Background (II): From DAG to PN
Transforming from a DAG to a PN
Computation task → place + timed transition
Delay equal to computation task time
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 8 / 25
Background From DAG to PN
Background (II): From DAG to PN
Transforming from a DAG to a PN
Computation task → place + timed transition
Delay equal to computation task time
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 8 / 25
Background From DAG to PN
Background (II): From DAG to PN
Transforming from a DAG to a PN
Computation task → place + timed transition
Delay equal to computation task time
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 8 / 25
Background From DAG to PN
Background (II): From DAG to PN
Transforming from a DAG to a PN
Computation task → place + timed transition
Delay equal to computation task time
Transmission link → place + timed transition
Delay equal to transmission time: δt =data size transmitted
link bandwidth
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 8 / 25
Background Slack Concept
Background (III): Slack concept (1)For human beings: an example
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 9 / 25
Background Slack Concept
Background (III): Slack concept (1)For human beings: an example
δ execution time; tx transmission timeR.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 9 / 25
Background Slack Concept
Background (III): Slack concept (1)For human beings: an example
δ execution time; tx transmission timeR.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 9 / 25
Background Slack Concept
Background (III): Slack concept (1)For human beings: an example
δ execution time; tx transmission timeR.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 9 / 25
Background Slack Concept
Background (III): Slack concept (1)For human beings: an example
δ execution time; tx transmission time
Slowest path:
Θ =1
∑(δ + tx)
=1
8(Θ: inverse of execution time ofslowest path)
Slack: how much faster is apath w.r.t. another until asynchronisation point
µ1,3 =3− 2
8=
1
8
µ1,5 =7− 2
8=
5
8
µ3,5 =7− 6
8=
1
8
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 9 / 25
Background Slack Concept
Background (III): Slack concept (2)For tough guys/gals – Maths fans
Upper performance bound vs. exact analysis
Exact analysis needs reachability graph → NP-hard problem
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 10 / 25
Background Slack Concept
Background (III): Slack concept (2)For tough guys/gals – Maths fans
Upper performance bound vs. exact analysis
Exact analysis needs reachability graph → NP-hard problem
Use of Linear Programming (LP) techniques → polynomial complexity
Based on Petri net theory: tight marking (m) [p. 3 on the paper]
Further reading: google “RJ-EPEW-10”
m(p) ≥ δ(p•) ·Θ → m(p) = δ(p•) ·Θ+ µ(p) (1)
δ(p•) delay of transition after place p;Θ inverse of execution time of slowest path;
µ(p) slack of place p
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 10 / 25
Automating Data-Throttling Analysis
Outline
1 MotivationKnowing the problemData-movement policiesThe goal & our approach
2 BackgroundPetri netsFrom DAG to PNSlack Concept
3 Automating Data-Throttling Analysis
4 Experiments and ResultsImpact on the Workflow MakespanInput Buffers and Network Bandwidth Usage
5 Related Work
6 Conclusions and Future Work
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 11 / 25
Automating Data-Throttling Analysis
Automating Data-Throttling Analysis (I)
Inputs: Performance estimation(i.e., DAX annotations) +PN-based model
Outputs: Data-throttling values+ Analysis results
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 12 / 25
Automating Data-Throttling Analysis
Automating Data-Throttling Analysis (I)
Inputs: Performance estimation(i.e., DAX annotations) +PN-based model
Outputs: Data-throttling values+ Analysis results
4 steps1 Compute slack values2 Cluster slacks
Why? In the next slide!
3 Compute data-throttlingvalues
4 Performance analysis
With and w/outdata-throttling
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 12 / 25
Automating Data-Throttling Analysis
Automating Data-Throttling Analysis (II)The need of clustering slacks
Influence between slacks
Order of adjusting ISIMPORTANT
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 13 / 25
Automating Data-Throttling Analysis
Automating Data-Throttling Analysis (II)The need of clustering slacks
Influence between slacks
Order of adjusting ISIMPORTANT
From output to input
Cluster slacks, then:1 Compute data-throttling
values2 Recompute slacks values
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 13 / 25
Automating Data-Throttling Analysis
Automating Data-Throttling Analysis (II)The need of clustering slacks
Influence between slacks
Order of adjusting ISIMPORTANT
From output to input
Cluster slacks, then:1 Compute data-throttling
values2 Recompute slacks values
In the example:1 {µ3,5, µ1,5}2 µ1,3
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 13 / 25
Automating Data-Throttling Analysis
Automating Data-Throttling Analysis (III)Deriving data-throttling values from slack
µ(p) → some delay may be added on the path
txnew = txold + α, α =µ(p)
Θ
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 14 / 25
Automating Data-Throttling Analysis
Automating Data-Throttling Analysis (III)Deriving data-throttling values from slack
µ(p) → some delay may be added on the path
txnew = txold + α, α =µ(p)
Θ
tx =data size
link bandwidth(BW )
BWnew =1
1
BWold
+µ(p)
Θ · data size
(2)
µ slack; tx transmission time;
Θ inverse of execution time of slowest path
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 14 / 25
Automating Data-Throttling Analysis
Automating Data-Throttling Analysis (IV)Applying to an example
Recall: slacks on synchronisation points
Assumptions
BW=100Mbps, latency 1e−4s
Data-sets equal to 10MiB
Dedicated network topology
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 15 / 25
Automating Data-Throttling Analysis
Automating Data-Throttling Analysis (IV)Applying to an example
Recall: slacks on synchronisation points
Assumptions
BW=100Mbps, latency 1e−4s
Data-sets equal to 10MiB
Dedicated network topology
Slowest path: 2 → 5 → 6
Slacks: µ1,4, µ3,6, µ4,6
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 15 / 25
Automating Data-Throttling Analysis
Automating Data-Throttling Analysis (IV)Applying to an example
Recall: slacks on synchronisation points
Makespan: 5.6779 seconds
Assumptions
BW=100Mbps, latency 1e−4s
Data-sets equal to 10MiB
Dedicated network topology
Slowest path: 2 → 5 → 6
Slacks: µ1,4, µ3,6, µ4,6
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 15 / 25
Automating Data-Throttling Analysis
Automating Data-Throttling Analysis (IV)Applying to an example
Recall: slacks on synchronisation points
Makespan: 5.6779 seconds
Assumptions
BW=100Mbps, latency 1e−4s
Data-sets equal to 10MiB
Dedicated network topology
Slowest path: 2 → 5 → 6
Slacks: µ1,4, µ3,6, µ4,6
1 → 4 adjust to 28.57%3 → 6 adjust to 35.15%4 → 6 adjust to 44.55%
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 15 / 25
Automating Data-Throttling Analysis
Automating Data-Throttling Analysis (IV)Applying to an example
Recall: slacks on synchronisation points
Makespan: 5.7750 seconds
(negligent impact on makespan BUT
input buffer waiting times reduced)
Assumptions
BW=100Mbps, latency 1e−4s
Data-sets equal to 10MiB
Dedicated network topology
Slowest path: 2 → 5 → 6
Slacks: µ1,4, µ3,6, µ4,6
1 → 4 adjust to 28.57%3 → 6 adjust to 35.15%4 → 6 adjust to 44.55%
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 15 / 25
Experiments and Results
Outline
1 MotivationKnowing the problemData-movement policiesThe goal & our approach
2 BackgroundPetri netsFrom DAG to PNSlack Concept
3 Automating Data-Throttling Analysis
4 Experiments and ResultsImpact on the Workflow MakespanInput Buffers and Network Bandwidth Usage
5 Related Work
6 Conclusions and Future Work
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 16 / 25
Experiments and Results
Experiments and Results (I): Description
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 17 / 25
Experiments and Results
Experiments and Results (I): Description
Montage Workflow – 5inputs
Performance estimation fromDAX (Pegasus Wf System)
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 17 / 25
Experiments and Results
Experiments and Results (I): Description
Montage Workflow – 5inputs
Performance estimation fromDAX (Pegasus Wf System)
Network topologies assumed:
Single vs. dedicated data-link(throttling in dedicated)
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 17 / 25
Experiments and Results
Experiments and Results (I): Description
Montage Workflow – 5inputs
Performance estimation fromDAX (Pegasus Wf System)
Network topologies assumed:
Single vs. dedicated data-link(throttling in dedicated)
Tools used
Slack computation: MATLAB
Performance analysis: SimGrid
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 17 / 25
Experiments and Results
Experiments and Results (I): Description
Montage Workflow – 5inputs
Performance estimation fromDAX (Pegasus Wf System)
Network topologies assumed:
Single vs. dedicated data-link(throttling in dedicated)
Tools used
Slack computation: MATLAB
Performance analysis: SimGrid
Experiments performed
Impact on makespan
Input buffers & net BW usage
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 17 / 25
Experiments and Results Impact on the Workflow Makespan
Experiments and Results (II): Experiments (1)Impact on the Workflow Makespan
Network topologyNetwork bandwidth
10Mbps 100Mbps 1Gbps
Single output 193.20s 61.18s 47.98s
PP without BW throttling 153.15s 57.17s 47.58s
PP with BW throttling 153.32s 57.21s 47.58s
Data-throttling looses (insignificantly)
↑ net. BW implies smaller impact on makespan
Correlationdata transmission
computation(as indicated by Park & Humphrey)
Verified by our results
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 18 / 25
Experiments and Results Input Buffers and Network Bandwidth Usage
Experiments and Results (II): Experiments (2)Input Buffers and Network Bandwidth Usage – some plots
10Mbps 100Mbps 1Gbps0
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
1.8
Bu
ffer
wai
tin
g t
ime
(s)
Single output
PP w/o. BW throttling
PP w. BW throttling
10Mbps 100Mbps 1Gbps0
20
40
60
80
100
120
140
Buff
er w
aiti
ng t
ime
(s)
Single output
PP w/o. BW throttling
PP w. BW throttling
(a) Workflow task mImgTbl (b) Workflow task mConcatFit
Data-throttling has greatimpact on input buffers
Outperforms both othertopologies
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 19 / 25
Related Work
Outline
1 MotivationKnowing the problemData-movement policiesThe goal & our approach
2 BackgroundPetri netsFrom DAG to PNSlack Concept
3 Automating Data-Throttling Analysis
4 Experiments and ResultsImpact on the Workflow MakespanInput Buffers and Network Bandwidth Usage
5 Related Work
6 Conclusions and Future Work
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 20 / 25
Related Work
Related Work
PNs already used in scientific workflow community
GWorkflowDL, GridFlow, FlowManager
Overhead analysis (Nerieri et al.)
Load imbalance + data movement
Data throttling issue (Park & Humphrey)
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 21 / 25
Related Work
Related Work
PNs already used in scientific workflow community
GWorkflowDL, GridFlow, FlowManager
Overhead analysis (Nerieri et al.)
Load imbalance + data movement
Data throttling issue (Park & Humphrey)
Performance analysis
Hybrid Bayesian-neural network (Duan et al., predicts execution time oftasks)Parametrised PN-based model (Tolosana-Calasanz et al.)
Structural analysis (Van der Alst & Van Hee)WF-nets
Operations: sequence, choice, synchroniser, fork, mergeAnalysis: correctness, deadlock analysis, liveness
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 21 / 25
Nice work, but. . .
How can I use this *fancy* approach?
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 22 / 25
Nice work, but. . .
How can I use this *fancy* approach?
Workflows characteristics
Synchronisation points/merge tasks
DAX (DAG in XML format) → PNML (PN in XML format)
Automatic transformation
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 22 / 25
Nice work, but. . .
How can I use this *fancy* approach?
Workflows characteristics
Synchronisation points/merge tasks
DAX (DAG in XML format) → PNML (PN in XML format)
Automatic transformation
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 22 / 25
Conclusions and Future Work
Outline
1 MotivationKnowing the problemData-movement policiesThe goal & our approach
2 BackgroundPetri netsFrom DAG to PNSlack Concept
3 Automating Data-Throttling Analysis
4 Experiments and ResultsImpact on the Workflow MakespanInput Buffers and Network Bandwidth Usage
5 Related Work
6 Conclusions and Future Work
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 23 / 25
Conclusions and Future Work
Conclusions and Future Work
Conclusions
Transfer as-fast-as-possible is not always the best option
Network bandwidth misuseInput buffer misuse
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 24 / 25
Conclusions and Future Work
Conclusions and Future Work
Conclusions
Transfer as-fast-as-possible is not always the best option
Network bandwidth misuseInput buffer misuse
Strategy for computing data-throttling values proposedMain drawbacks
Needs previous performance informationScalability
Future Work
Test in more realistic environments
Extend to other workflows
Improve strategy computation (reduce its complexity)
R.J. Rodrıguez et al. Automating Data-Throttling Analysis for Data-Intensive Workflow CCGrid’12 24 / 25
Automating Data-Throttling Analysis for Data-Intensive
Workflow
Ricardo J. Rodrıguez, Rafael Tolosana-Calasanz, Omer F. Rana{rjrodriguez, rafaelt}@unizar.es, [email protected]
Universidad de Zaragoza Cardiff UniversityZaragoza, Spain Cardiff, United Kingdom
May 15th, 2012
CCGrid’12: 12th IEEE/ACM International Symposium on Cluster,Cloud and Grid Computing
Ottawa, Canada