Scientific Workflow Interchanging Through Patterns: Reversals and Lessons Learned Bruno Fernandes...

Scientific Workflow Interchanging Through Patterns:

Reversals and Lessons Learned

Bruno Fernandes Bastos

Regina Maria Maciel Braga

Antônio Tadeu Azevedo Gomes

Agenda

• Introduction
  – Problem Formulation and Initial Hypothesis
• Envisioned Solution
• Preliminary Experiments
• Reformulated Hypothesis
• Qualitative Analysis of the Research Material
  – The myExperiment Repository
• Related Work
• Conclusions

Introduction

• Scientific workflows are used for tackling complex problems in different e-science domains
  – They may be described as a directed graph where the vertices represent the tasks and the edges represent the data relationships between the tasks
• Several Scientific Workflow Management Systems (SWfMSs) have been developed
  – Specifying scientific workflows with higher-level abstractions (Workflow Specification Languages - WfSL) than scripts,
  – Orchestrating the execution of the tasks, and
  – Managing the data consumed and produced by these workflows.
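
The directed-graph view above can be illustrated with a minimal Python sketch. It is not taken from any SWfMS: the task names and the `topological_order` helper are hypothetical, and the only assumption is that a task may run once all tasks it depends on for data have finished.

```python
from collections import defaultdict, deque


def topological_order(tasks, data_links):
    """Return one valid execution order for a workflow graph.

    tasks: iterable of task names (the vertices).
    data_links: iterable of (producer, consumer) pairs (the edges).
    """
    tasks = list(tasks)
    successors = defaultdict(list)
    pending_inputs = {task: 0 for task in tasks}
    for producer, consumer in data_links:
        successors[producer].append(consumer)
        pending_inputs[consumer] += 1

    # Tasks with no incoming data dependency can start immediately.
    ready = deque(task for task in tasks if pending_inputs[task] == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for consumer in successors[task]:
            pending_inputs[consumer] -= 1
            if pending_inputs[consumer] == 0:
                ready.append(consumer)

    if len(order) != len(tasks):
        raise ValueError("workflow graph contains a cycle")
    return order


# Hypothetical three-task workflow: fetch -> align -> plot.
example_tasks = ["fetch", "align", "plot"]
example_links = [("fetch", "align"), ("align", "plot")]
print(topological_order(example_tasks, example_links))
```

Orchestration in a real SWfMS involves much more (data staging, retries, provenance); the sketch only shows how the graph fixes a task ordering.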

Problem Formulation

• We formulated our research problem
  – The state of the art in SWfMSs does not allow a scientist to easily reuse workflow specifications previously modeled in SWfMSs other than those the scientist is used to working with.

Initial Hypothesis

• The use of workflow patterns could help in keeping the semantics of a workflow
  – The use of workflow patterns combined with software architecture concepts to capture the key semantics expressed in workflow specifications enables the establishment of automated processes that transform these specifications across different SWfMSs.
• These processes allow for a reduction in the effort scientists would make to reuse workflow specifications developed by other research groups in SWfMSs that are not part of the usual tooling these scientists employ in their daily work

Envisioned Solution

• A novel language for interchanging workflow specifications
  – Using the Acme architecture description interchange language
    • It was based on the specification of a single architectural style where the components were the tasks and the connectors were the patterns
  – Definition of an interchangeable workflow: workflow composed of a set of “interchangeable elements”
    • Constants, subworkflows and web service tasks
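
The definition above can be sketched in Python. This is an illustration only, not the Acme-based language from the talk: the `Task`, `Connector` and `Workflow` classes and the string labels for task kinds are hypothetical, and the check assumes a workflow is interchangeable when every task is a constant, a subworkflow or a web service task.

```python
from dataclasses import dataclass, field

# Task kinds the slide lists as "interchangeable elements" (labels are assumed).
INTERCHANGEABLE_KINDS = {"constant", "subworkflow", "webservice"}


@dataclass(frozen=True)
class Task:
    """A component in the architectural style: one workflow task."""
    name: str
    kind: str


@dataclass(frozen=True)
class Connector:
    """A connector in the architectural style: one workflow pattern."""
    pattern: str
    sources: tuple
    targets: tuple


@dataclass
class Workflow:
    tasks: list = field(default_factory=list)
    connectors: list = field(default_factory=list)

    def is_interchangeable(self):
        # Assumption: every task must be of an interchangeable kind.
        return all(task.kind in INTERCHANGEABLE_KINDS for task in self.tasks)


# Hypothetical workflow: a constant feeding a web service task.
workflow = Workflow(
    tasks=[Task("threshold", "constant"), Task("blast", "webservice")],
    connectors=[Connector("Sequence", ("threshold",), ("blast",))],
)
print(workflow.is_interchangeable())
```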

Envisioned Solution: Patterns

Structural
• Sequence: binds a single output port to a single input port;
• Parallel Split: binds a single output port to two or more input ports, replicating the same data from the output port to all input ports;
• Simple Merge: binds two or more output ports to a single input port, feeding the input port with data received from each output port in an interleaved way;

Behavioral
• Synchronization: similar in structure to the Simple Merge pattern, but the task with the input port may only be executed when data coming from all the output ports have been received and grouped according to some criteria;
• Exclusive Choice: similar in structure to the Parallel Split pattern, but only one of the input ports may receive data from the output port, according to some condition.
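
As a rough illustration of the structural definitions above, the following Python sketch classifies port bindings by fan-out and fan-in. It is an assumption-laden sketch, not the identification procedure used in the work: the function name and the port naming scheme are hypothetical. Because Synchronization and Exclusive Choice share their structure with Simple Merge and Parallel Split, structure alone cannot tell them apart, so only the three structural patterns are reported.

```python
from collections import defaultdict


def classify_structural_patterns(bindings):
    """Classify (output_port, input_port) bindings into structural patterns.

    Returns a list of (pattern, output_ports, input_ports) tuples.
    """
    fan_out = defaultdict(set)  # output port -> input ports it feeds
    fan_in = defaultdict(set)   # input port -> output ports feeding it
    for out_port, in_port in bindings:
        fan_out[out_port].add(in_port)
        fan_in[in_port].add(out_port)

    patterns = []
    # One output port bound to two or more input ports.
    for out_port, in_ports in fan_out.items():
        if len(in_ports) >= 2:
            patterns.append(("Parallel Split", (out_port,), tuple(sorted(in_ports))))
    # Two or more output ports bound to one input port.
    for in_port, out_ports in fan_in.items():
        if len(out_ports) >= 2:
            patterns.append(("Simple Merge", tuple(sorted(out_ports)), (in_port,)))
    # A single output port bound to a single input port, and nothing else.
    for out_port, in_ports in fan_out.items():
        if len(in_ports) == 1:
            (in_port,) = in_ports
            if len(fan_in[in_port]) == 1:
                patterns.append(("Sequence", (out_port,), (in_port,)))
    return patterns


# Hypothetical bindings: a -> b, b splits to c and d, c and d merge into e.
example_bindings = [
    ("a.out", "b.in"),
    ("b.out", "c.in"),
    ("b.out", "d.in"),
    ("c.out", "e.in"),
    ("d.out", "e.in"),
]
for found in classify_structural_patterns(example_bindings):
    print(found)
```

Distinguishing the behavioral patterns would need extra information, such as the condition attached to a split or the grouping criteria of a merge, which this sketch does not model.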

Workflow Pattern Identification

• Patterns may be implemented in different ways
  – Depending on the features each SWfMS supports
  – E.g.: Exclusive Choice Pattern

Preliminary Experiments

• Experiment Planning
  – 4 VisTrails, 46 Kepler and 1452 Taverna specifications
• For the 1st hypothesis the task type matters
  – VisTrails has only one Web Service and it is not available
  – Kepler has 45 types of tasks but none of them is a Web Service
  – Taverna has more than 100 types and many Web Services

Preliminary Experiments

• Analysis of the workflow transformations
  – 53% of the Taverna tasks were interchangeable

[Charts: Quantity of Tasks; Quantity of Interchangeable Workflows]

Reformulated Hypothesis

The use of workflow patterns and software architecture concepts to capture the key structural semantics expressed in workflow specifications enables the establishment of semi-automated processes that transform these specifications across different WfSLs. These processes allow for a reduction in the effort scientists would make to reuse structurally complex workflow specifications (in the sense of having a large number of tasks and dependency relationships between these tasks) developed by other research groups in SWfMSs that are not part of the usual tooling these scientists employ in their daily work.

Further Experiments

• After interchanging the workflow structures, we could interchange almost all workflows (98.28%)
  – Problems related to pattern identification

Qualitative Analysis of the Research Material

• The myExperiment repository
  – Web service tasks implemented as either local, inaccessible, or authenticated, which made it impossible to execute these workflows, even in their source specifications
  – Lack of documentation: most of the analyzed workflows have little or no metadata
• Similar problems reported in the Wf4Ever project
  – Proposal of a new myExperiment repository

Qualitative Analysis of the Research Material

• The studied systems
  – Once a task has its type defined and its input and output ports linked to other tasks, it cannot have its type changed; therefore, it needs to be removed
    • Once removed, the relations are gone!
    • It reduces the utility of our approach
  – Some SWfMSs have limitations
    • VisTrails does not export subworkflows

Related Work

• Taverna 2-Galaxy and Tavaxy
  – Limited to two SWfMSs and their adaptability to a broader range of SWfMSs would depend on a complete reformulation of their architectures
    • Although Tavaxy brings the patterns approach
• IWIR
  – Most similar to ours
  – Syntactical structures that are quite similar to those defined for the SWfMSs
• Other works

Conclusions

• This research endeavor started with exploratory studies aiming at identifying whether it would be possible to establish “future-proof” automated processes for transforming workflows between different SWfMSs.

• It was unclear whether the perceived problem actually exists, and the experimental data we employed may point in a different direction.

• The fact that the myExperiment repository is full of “toy” workflows made it harder to carry out a proof of concept.

Questions