Using Provenance to Improve Workflow Design
Frederico Tosta, Leonardo Murta, Claudia Werner, Marta Mattoso
{ftoliveira, murta, werner, marta}@cos.ufrj.br
COPPE – Federal University of Rio de Janeiro - Brazil
Summary
• Motivation
• Introduction & Background
• Goal
• Approach & Implementation
• Conclusion
COPPE/UFRJ
Motivation
Pieces of workflows that occurred in the past may occur again in the future.
Motivation
• The number of services and bioinformatics operations is growing: Taverna has over 3500 services (2007), and VisTrails has over 1200 modules (2008).
[Figure: workflows and WF services]
Motivation
How can we find the pieces or services that are useful during the design of a new workflow in an automatic and systematic way?
Software Reuse
• Software reuse is the process of creating software systems from existing software [Krueger, 1992].
[Figure: software reuse, surrounded by quality, reliability, reduced cost, and productivity]
Recommendation Systems
• E-commerce: apply data mining techniques to the problem of helping users find the items they would like to purchase.
E-commerce concepts mapped into scientific experiment concepts:

| E-commerce | Scientific experiment |
|------------|-----------------------|
| Customer | Scientist |
| Product* | Component / Actor |
| Cart | Workflow (Goble, 2007) |
| Preference | Context |

* what is recommended by e-commerce sites
Goal
• Propose a proactive recommendation service that aims at suggesting frequent combinations of scientific programs for reuse.
Approach
[Figure: design for reuse and recommendation. Diagram labels: Design, Workflow specification, Provenance, DB]
Approach
[Figure: design with reuse and recommendation. Diagram labels: Design, Workflow specification, Provenance, DB, Proactive Recommendation]
Implementation
• Populating the database:
  VisTrails workflows:
  - Parse provenance XML files to extract the relations.
  MySQL database:
  - The relations are mapped into a database.
  - Each relation contains the modules and how they are connected.
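
The population step could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the XML layout (`<connection>` elements with `source`, `destination`, `sourcePort`, and `destPort` attributes) is a hypothetical simplification rather than the actual VisTrails provenance schema, the function and table names are made up for illustration, and Python's built-in `sqlite3` stands in for MySQL so that the example is self-contained.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, simplified provenance document. The real VisTrails XML
# schema is different; this layout is assumed purely for illustration.
SAMPLE_XML = """
<workflow>
  <connection source="HmmBuild" sourcePort="StdOut"
              destination="HmmCalibrate" destPort="HmmPath"/>
  <connection source="HmmBuild" sourcePort="DestinationDir"
              destination="Cat" destPort="Dir"/>
</workflow>
"""


def extract_relations(xml_text):
    """Return (source, destination, source_port, dest_port) tuples."""
    root = ET.fromstring(xml_text)
    return [
        (c.get("source"), c.get("destination"),
         c.get("sourcePort"), c.get("destPort"))
        for c in root.iter("connection")
    ]


def store_relations(conn, relations):
    """Map the extracted relations into a single table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS relation ("
        "source TEXT, destination TEXT, source_port TEXT, dest_port TEXT)"
    )
    conn.executemany("INSERT INTO relation VALUES (?, ?, ?, ?)", relations)
    conn.commit()


# In-memory SQLite database used here as a stand-in for MySQL.
db = sqlite3.connect(":memory:")
store_relations(db, extract_relations(SAMPLE_XML))
rows = db.execute("SELECT * FROM relation").fetchall()
print(rows)
```

Each stored row holds one relation: the two connected modules and the ports that link them.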
Implementation
VisTrails workflow design with recommendation
| Source | Destination | Source Port | Dest Port |
|--------|-------------|-------------|-----------|
| HmmBuild | HmmCalibrate | DestinationDir | SourceDir |
| HmmBuild | Cat | DestinationDir | Dir |
| HmmBuild | HmmCalibrate | DestinationDir | HmmPath |
| HmmBuild | HmmCalibrate | StdOut | HmmPath |
| HmmBuild | HmmCalibrate | StdOut | HmmPath |
Ports 1 and 2 are the output ports DestinationDir and StdOut, respectively. Ports 3, 4 and 5 are the input ports SourceDir, HmmPath and Dir, respectively.
• Recommendation metric: From the example, we can infer that port StdOut of HmmBuild has been connected to port HmmPath of HmmCalibrate in 40% of previously designed workflows (2 of the 5 connections recorded above).
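
The 40% figure can be reproduced with a short calculation. The sketch below assumes the metric is simply the share of recorded connections leaving a given source module that use a particular combination of source port, destination module, and destination port. The slides do not specify the prototype's actual query, and the function name `recommend` is hypothetical.

```python
from collections import Counter

# The five relations from the example table:
# (source, destination, source port, destination port)
RELATIONS = [
    ("HmmBuild", "HmmCalibrate", "DestinationDir", "SourceDir"),
    ("HmmBuild", "Cat", "DestinationDir", "Dir"),
    ("HmmBuild", "HmmCalibrate", "DestinationDir", "HmmPath"),
    ("HmmBuild", "HmmCalibrate", "StdOut", "HmmPath"),
    ("HmmBuild", "HmmCalibrate", "StdOut", "HmmPath"),
]


def recommend(relations, source):
    """Rank the connections previously made from `source` by frequency."""
    outgoing = [r for r in relations if r[0] == source]
    counts = Counter(outgoing)
    total = len(outgoing)
    # Each entry pairs a relation with its share of all outgoing connections.
    return [(rel, n / total) for rel, n in counts.most_common()]


ranking = recommend(RELATIONS, "HmmBuild")
for (src, dst, src_port, dst_port), share in ranking:
    print(f"{src}.{src_port} -> {dst}.{dst_port}: {share:.0%}")
```

The top-ranked entry is the StdOut to HmmPath connection between HmmBuild and HmmCalibrate, with a share of 2/5, matching the 40% stated in the example.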
Implementation
[Figure: VisTrails workflow design with recommendation]
Conclusion
• We expect that this approach may help to propagate the benefits of software reuse to the context of scientific workflows.
• Reduce the time to design workflows.
• Increase the quality of workflows designed.
Conclusion
• Limitations: The current version of our prototype recommends only a subsequent component, based on previously used connections.
• Future work:
  - Improve the approach by recommending a component after investigating the whole path.
  - Specify a context for each workflow.
  - Apply a weight to each relation based on workflow usage.