Status update OEG - Nov 2012
![Page 1: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/1.jpg)
Date: 29/11/2012
Work at ISI, Current Status, Next Steps
Daniel Garijo Verdejo, Oscar Corcho, Yolanda Gil
Ontology Engineering Group. Laboratorio de Inteligencia Artificial
Departamento de Inteligencia Artificial
Facultad de Informática
Universidad Politécnica de Madrid
![Page 2: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/2.jpg)
What am I working on?
• Creation of abstractions in scientific workflows
  • Workflow traces and template representation
  • Provenance representation
  • Plan representation
  • Abstraction catalog
  • Find ways to link the definitions to the provenance traces automatically
• Understandability and reuse of scientific workflows
![Page 3: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/3.jpg)
Outline
Index
1. Motivation
2. Overview
3. Workflow systems used
4. Summary of work done in my previous visit to ISI
  • OPMW and provenance publishing
5. Summary of work done before second visit to ISI
  • Workflow motif catalog
6. Summary of work done in my second visit to ISI
  • OPMW-PROV and P-PLAN
  • Automatic macro abstraction detection
7. Next Steps
8. Future work
![Page 4: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/4.jpg)
Motivation
• As a designer: Discovery
  • Workflows with similar functionality fragments/methods
  • Design based on previous templates
• As a user/reuser: Understandability
  • Search workflows by functionality
  • Commonalities between execution runs
  • Component categorization
![Page 5: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/5.jpg)
Overview
Abstraction definitions and categorization
Provenance representation
Plan representation
Algorithms for finding the different abstractions automatically
Experiment publication
Vocabularies
RDF Stores
Data mining tools, graph analysis, etc.
Descriptions/PSMS/Ontologies
![Page 6: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/6.jpg)
Taverna and Wings
IEEE eScience 2012. Chicago, USA
http://www.taverna.org.uk/
http://www.wings-workflows.org/
![Page 7: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/7.jpg)
Summary: Previous Work at ISI
Abstraction definitions and categorization
Provenance representation
Plan representation
Experiment Publication
OPMW
Virtuoso, Pubby, Wings (+Plugin)
Algorithms for finding the different abstractions automatically
![Page 8: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/8.jpg)
High level architecture
[Figure: high-level architecture diagram. Stages shown: Wings workflow generation, OPM conversion, Publication, Share, Reuse. Three WINGS installations (on a local laptop, on a shared host, on a web server) are each shown with a Core, a Portal, a Workflow Template, a Workflow Instance and an OPM export. Also shown: Linked Data Publication, Users, Other workflow environments, and two access paths: interactive browsing (Pubby frontend) and programmatic access (external apps).]
![Page 9: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/9.jpg)
OPMW: Process view
![Page 10: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/10.jpg)
OPMW: Attribution view
![Page 11: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/11.jpg)
Work prior to second visit to ISI
Abstraction definitions and categorization
Provenance representation
Plan representation
Algorithms for automatic matching
Experiment Publication
OPMW
Virtuoso, Pubby, Wings (+Plugin)
Motif Detection
![Page 12: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/12.jpg)
Overview
• Empirical analysis of 177 workflow templates from Taverna and Wings
• Catalog of recurring patterns: scientific workflow motifs
  • Data-Oriented Motifs
  • Workflow-Oriented Motifs
• Understandability and reuse
Catalog
http://sensefinancial.com/wp-content/uploads/2012/02/contribution.jpg
![Page 13: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/13.jpg)
Approach
• Reverse-engineer the set of current practices in workflow development through an analysis of empirical evidence
• Identify workflow abstractions that would facilitate understandability and therefore effective re-use
![Page 14: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/14.jpg)
Workflow Motifs
• Workflow motif: domain-independent conceptual abstraction on the workflow steps.
1. Data-oriented motifs (WHAT?): What kind of manipulations does the workflow have?
  • E.g.: data retrieval, data preparation, etc.
2. Workflow-oriented motifs (HOW?): How does the workflow perform its operations?
  • E.g.: stateful steps, stateless steps, human interactions, etc.
![Page 15: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/15.jpg)
Motif Catalog
Data-Oriented Motifs
• Data Retrieval
• Data Preparation
  • Format Transformation
  • Input Augmentation and Output Splitting
  • Data Organisation
• Data Analysis
• Data Curation/Cleaning
• Data Moving
• Data Visualisation
Workflow-Oriented Motifs
• Intra-Workflow Motifs
  • Stateful (Asynchronous) Invocations
  • Stateless (Synchronous) Invocations
  • Internal Macros
  • Human Interactions
• Inter-Workflow Motifs
  • Atomic Workflows
  • Composite Workflows
  • Workflow Overloading
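One way to put the catalog to use is to annotate workflow steps with motifs and count how many workflows exhibit each one. The following is a minimal sketch under stated assumptions: the workflow names, step names and annotations are invented for illustration, only the top-level data-oriented motifs are modelled, and the counting rule (a workflow counts once per motif) is an assumption rather than the method used in the empirical analysis.

```python
from collections import Counter
from enum import Enum


class DataMotif(Enum):
    """Top-level data-oriented motifs from the catalog."""
    DATA_RETRIEVAL = "Data Retrieval"
    DATA_PREPARATION = "Data Preparation"
    DATA_ANALYSIS = "Data Analysis"
    DATA_CURATION = "Data Curation/Cleaning"
    DATA_MOVING = "Data Moving"
    DATA_VISUALISATION = "Data Visualisation"


# Hypothetical annotations: workflow name -> list of (step name, motif).
ANNOTATED_WORKFLOWS = {
    "wf_a": [("fetch", DataMotif.DATA_RETRIEVAL),
             ("convert", DataMotif.DATA_PREPARATION),
             ("classify", DataMotif.DATA_ANALYSIS)],
    "wf_b": [("download", DataMotif.DATA_RETRIEVAL),
             ("plot", DataMotif.DATA_VISUALISATION)],
}


def motif_coverage(workflows):
    """Count, per motif, how many workflows contain at least one such step."""
    counts = Counter()
    for steps in workflows.values():
        # A set ensures each workflow contributes at most once per motif.
        counts.update({motif for _, motif in steps})
    return counts


coverage = motif_coverage(ANNOTATED_WORKFLOWS)
for motif, n in coverage.most_common():
    print(f"{motif.value}: {n} of {len(ANNOTATED_WORKFLOWS)} workflows")
```

Counting per workflow rather than per step keeps a single long workflow from dominating the figures; counting per step would be an equally reasonable choice.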
![Page 16: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/16.jpg)
Summary: Work done at ISI
Abstraction definitions and categorization
Provenance representation
Plan representation
Algorithms for automatic matching
Experiment Publication
OPMW + PROV + P-PLAN
Virtuoso, Pubby, Wings (+Plugin)
Macro abstraction detection
Motif Detection
SUBDUE exploration and integration in RDF
![Page 17: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/17.jpg)
PROV Compatibility
• OPMW fits naturally into PROV
  • Same usage-generation structure
  • Extension for scientific workflows with PROV
• Binary relationships (no n-ary patterns used)
  • Simplicity
• Publication of PROV as well as OPMW
  • Queries can be answered in both languages
  • Flexibility
• http://www.opmw.org/node/8
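The idea of publishing the same usage-generation structure in both vocabularies can be illustrated with plain triples. This is a minimal sketch, assuming the two binary relations are named `opmv:used` / `opmv:wasGeneratedBy` on the OPM side and `prov:used` / `prov:wasGeneratedBy` on the PROV side; the `ex:` resources and the helper functions are invented for illustration and are not part of OPMW.

```python
# Hypothetical trace: one process reads one artifact and writes another.
# The ex: resource names are invented for illustration.
TRACE = [
    ("ex:process1", "used", "ex:inputFile"),
    ("ex:outputFile", "wasGeneratedBy", "ex:process1"),
]

# Assumed property names for the two binary relations in each vocabulary.
VOCABULARIES = {
    "opmv": {"used": "opmv:used", "wasGeneratedBy": "opmv:wasGeneratedBy"},
    "prov": {"used": "prov:used", "wasGeneratedBy": "prov:wasGeneratedBy"},
}


def publish(trace, vocab):
    """Render each binary relation as a triple in the chosen vocabulary."""
    mapping = VOCABULARIES[vocab]
    return [(s, mapping[p], o) for s, p, o in trace]


def inputs_of(triples, process, vocab):
    """Answer 'what did this process use?' in either vocabulary."""
    used = VOCABULARIES[vocab]["used"]
    return [o for s, p, o in triples if s == process and p == used]


# Publish both renderings side by side, as the slide proposes.
published = publish(TRACE, "opmv") + publish(TRACE, "prov")

print(inputs_of(published, "ex:process1", "opmv"))
print(inputs_of(published, "ex:process1", "prov"))
```

Because every relation is binary, the translation is a one-to-one renaming of the predicate, which is why the same question gets the same answer in both languages.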
![Page 18: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/18.jpg)
P-PLAN
• Plans are not provenance
• P-PLAN: simple plan model for binding traces to template representations
• Aligned with OPMW and PROV
• Documentation in progress
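Binding traces to template representations can be pictured as each executed activity pointing back at the planned step it carried out. The sketch below is illustrative only: the `Step` and `Activity` classes, the `corresponds_to` field and all instance names are hypothetical and do not reproduce P-PLAN's actual terms, whose documentation was still in progress.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """A planned step in a workflow template (hypothetical class)."""
    name: str
    plan: str


@dataclass(frozen=True)
class Activity:
    """An executed step in a trace, bound to the step it carried out."""
    name: str
    corresponds_to: Step


def activities_for(step, trace):
    """Return the names of executed activities bound to a planned step."""
    return [a.name for a in trace if a.corresponds_to == step]


sort_step = Step("sort", plan="template1")
merge_step = Step("merge", plan="template1")

# Two runs of the same template; the second run stopped after sorting.
trace = [
    Activity("run1_sort", sort_step),
    Activity("run1_merge", merge_step),
    Activity("run2_sort", sort_step),
]

print(activities_for(sort_step, trace))
```

Keeping the plan and the trace as separate objects linked by one reference mirrors the point that plans are not provenance: the template can exist with no executions, and several executions can share one template.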
![Page 20: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/20.jpg)
Macro abstraction detection
Problem statement:
Given a repository of workflow templates (either abstract or specific) or workflow execution traces, what are the workflow fragments I can deduce from it?
Useful for:
• Systems like Taverna and Wings (many templates, little annotation to relate them)
  • Finding relationships between workflows and sub-workflows
  • Most used fragments, most executed, etc.
• Systems like GenePattern and Galaxy (many runs, nearly no templates published)
  • Proposing new templates with the popular fragments
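The "most used fragments" question can be made concrete with a deliberately naive approach: enumerate linear chains of components in every template and count how many templates contain each chain. This is a sketch under stated assumptions, not the sub-graph mining approach the slides describe; the templates and component names are invented, and only linear chains (not arbitrary sub-graphs) are considered.

```python
from collections import Counter

# Hypothetical templates: each is a list of (producer, consumer) component pairs.
TEMPLATES = {
    "t1": [("Tokenize", "Stem"), ("Stem", "Classify")],
    "t2": [("Tokenize", "Stem"), ("Stem", "Cluster")],
    "t3": [("Download", "Tokenize"), ("Tokenize", "Stem")],
}


def chains(edges, length):
    """Enumerate linear chains of `length` components in one template."""
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, []).append(dst)

    def extend(path):
        if len(path) == length:
            yield tuple(path)
            return
        for nxt in successors.get(path[-1], []):
            if nxt not in path:  # never revisit a component within a chain
                yield from extend(path + [nxt])

    nodes = {node for edge in edges for node in edge}
    for start in sorted(nodes):
        yield from extend([start])


def fragment_support(templates, length=2):
    """Count, per chain, the number of templates that contain it."""
    support = Counter()
    for edges in templates.values():
        support.update(set(chains(edges, length)))
    return support


support = fragment_support(TEMPLATES)
print(support.most_common(1))
```

Fragments with high support are candidates for reusable sub-workflows; on a repository of runs instead of templates, the same count would point at popular fragments worth proposing as new templates.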
![Page 21: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/21.jpg)
Macro abstraction detection
• Work in progress (implementation and evaluation)
  • WINGS traces
• Similar to sub-graph isomorphism
• Kind of "graph clustering"
• Early results
  • Tool for finding common sub-graphs
    • Sequential graphs
    • Efficient
    • Scalable
  • Integration with RDF (by me)
• TO DO:
  • Finish implementation: inference
  • Evaluation!!
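Integrating a sub-graph finding tool with RDF mostly means turning triples into the labelled vertices and edges the tool expects. The sketch below is a guess at what such a conversion could look like: the `v <id> <label>` and `d <from> <to> <label>` line layout is an assumption about the kind of text input a SUBDUE-style miner reads, and the triples, type map and function name are hypothetical. Vertices are labelled with their type rather than their URI so that the same structure in different traces looks identical to the miner.

```python
def triples_to_graph_lines(triples, node_types):
    """Turn (subject, predicate, object) triples into vertex and edge lines.

    Assumed output layout (not verified against any specific tool):
      v <id> <label>         one line per vertex, ids starting at 1
      d <from> <to> <label>  one line per directed edge
    """
    ids = {}
    lines = []
    # Emit every vertex first so that edges only reference known ids.
    for s, _, o in triples:
        for node in (s, o):
            if node not in ids:
                ids[node] = len(ids) + 1
                lines.append(f"v {ids[node]} {node_types[node]}")
    for s, p, o in triples:
        lines.append(f"d {ids[s]} {ids[o]} {p}")
    return lines


# Hypothetical trace fragment and type map.
TRIPLES = [
    ("ex:run1_sort", "generated", "ex:run1_sorted"),
    ("ex:run1_sorted", "usedBy", "ex:run1_merge"),
]
NODE_TYPES = {
    "ex:run1_sort": "Sort",
    "ex:run1_sorted": "File",
    "ex:run1_merge": "Merge",
}

graph_lines = triples_to_graph_lines(TRIPLES, NODE_TYPES)
print("\n".join(graph_lines))
```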
![Page 22: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/22.jpg)
Next Steps
![Page 23: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/23.jpg)
Next Steps
• Thesis:
  • Finish up implementation
  • How to evaluate results?
• Publications:
  • Workshop:
    • Provenance Corpus (with Taverna Team). To have something citable
  • Conference:
    • KCAP: Macro detection implementation and evaluation
  • Journal:
    • Decay analysis publication in journal (January)
    • OPMW - PROV - P-PLAN publication in journal (December)
    • Motif extension publication in journal (invited by special issue) (Now)
![Page 24: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/24.jpg)
Future work
• Thesis:
  • Other methods for detecting workflow abstractions automatically
  • Metadata and file analysis (diff, etc.): filter, merge, etc.
  • Provenance reconstruction
• Project:
  • RO model specifications
  • Test cases
  • Workflow abstraction with Isoco
![Page 25: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/25.jpg)
Thanks!
[Figure: collaboration graph with the nodes :danielGarijo, :oscarCorcho, :yolandGil, :khalidBelhajjame, :varunRatnakar, :caroleGoble and :pinarAlper, connected by :collaboratesWith and :supervises edges.]
![Page 26: Status update OEG - Nov 2012](https://reader033.fdocuments.net/reader033/viewer/2022042715/5580cde7d8b42a8e558b51ac/html5/thumbnails/26.jpg)
Date: 03/10/2011
Daniel Garijo Verdejo
Ontology Engineering Group. Laboratorio de Inteligencia Artificial
Departamento de Inteligencia Artificial
Facultad de Informática
Universidad Politécnica de Madrid
Work at ISI, relation with wf4Ever, future steps