ISI work
Date: 13/10/2011
Work at ISI, relation with Wf4Ever, future steps
Daniel Garijo Verdejo, Yolanda Gil
Ontology Engineering Group, Laboratorio de Inteligencia Artificial, Departamento de Inteligencia Artificial
Facultad de Informática, Universidad Politécnica de Madrid
The TB Drugome
Project goals
Typical Published Article:
• Text: Narrative of method, software packages used
• Data: Key datasets and figures/plots
Reproducible Article (Weaver, GenePattern GRRD, etc.):
• Text: Narrative of method, software packages used
• Workflow: Workflow/scripts describing dataflow, codes, and parameters
• Data: Key datasets and figures/plots
NOT published, loosely recorded:
• Software: scripted codes + manual steps + notes/emails
Problem with existing approaches
Only the executable workflow is published:
1. Must have the same codes to re-execute the workflow, but:
– Codes become unavailable
• E.g.: eHiTS was proprietary and replaced by AutoDock Vina
– Different labs prefer different codes
• E.g.: R vs Matlab
• E.g.: visualization in Cytoscape vs yEd
2. Must have the same workflow framework to re-execute the workflow
– Must have R for Weaver
3. Must import files to local file system and workflow framework
– Must import bundle of workflow/data/code files to reproduce
[Figure: Reproducible Article (Weaver, GenePattern GRRD, etc.), made up of Workflow (workflow/scripts describing dataflow, codes, and parameters), Text (narrative of method, software packages used), and Data (key datasets and figures/plots).]
Key Features of our approach
• Publish an abstract workflow in addition to the executable workflow
– Description of the workflow that is independent of the codes executed
– Maps to the codes executed (the “executable workflow”)
• Publish both abstract and executable workflows using the OPM standard
– OPM (Open Provenance Model) is independent of workflow framework and is widely implemented
– Other groups can import to their own workflow framework
• Publish data and workflows as Linked Data on the Web
– All workflows and related files are web-accessible
– Simple mechanism to share across local file systems
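The split between abstract and executable workflow can be sketched as a list of code-independent steps plus a separate binding from each step to the code that actually ran. This is only an illustration: the class names, step names, and data names below are assumptions made for the example (the tool names come from the examples in these slides), and it does not reproduce the WINGS data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbstractStep:
    """A code-independent description of one workflow step."""
    name: str
    inputs: tuple
    outputs: tuple


@dataclass(frozen=True)
class ExecutableStep:
    """The concrete code that realised an abstract step in one run."""
    abstract: AbstractStep
    code: str  # hypothetical label for the software actually executed


def bind(abstract_steps, bindings):
    """Map every abstract step to the code chosen for this execution."""
    missing = [s.name for s in abstract_steps if s.name not in bindings]
    if missing:
        raise KeyError(f"no code bound for steps: {missing}")
    return [ExecutableStep(s, bindings[s.name]) for s in abstract_steps]


# Illustrative abstract workflow (step and data names are assumptions).
abstract_workflow = [
    AbstractStep("docking", ("ligands", "protein_structures"), ("docking_scores",)),
    AbstractStep("visualization", ("docking_scores",), ("network_figure",)),
]

# Two labs can realise the same abstract workflow with different codes.
lab_a = bind(abstract_workflow, {"docking": "eHiTS", "visualization": "Cytoscape"})
lab_b = bind(abstract_workflow, {"docking": "AutoDock Vina", "visualization": "yEd"})

for a, b in zip(lab_a, lab_b):
    print(f"{a.abstract.name}: {a.code} | {b.code}")
```

Both executable workflows point back to the same abstract steps, so they stay comparable even when the codes differ.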
High level architecture
[Figure: stages Wings workflow generation, OPM conversion, Linked Data publication, Share, Reuse. WINGS is shown on a local laptop, on a shared host, and on a web server; each installation has a Portal and a Core, holds a Workflow Template and a Workflow Instance, and provides an OPM export. The Linked Data publication is reached by users through interactive browsing (Pubby frontend), by external apps through programmatic access, and by other workflow environments.]
High level architecture (2)
[Figure: components shown]
• Wings, accessed from a web browser, with OPM export
• Linked Data publication of the Abstract Workflow (OPM) and the Executable Workflow (OPM)
• RDF upload interface, RDF triple store, and SPARQL endpoint
• Permanent web-accessible file store for workflow data, components, etc. (needed if the workflow was developed on a local host instead of a public server)
• OPM import into other workflow frameworks
• Hosts: ISI web servers (http://wings.isi.edu/…) and Amazon EC2 cloud (http://ec2-184-72-160-64.…)
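Programmatic access through the SPARQL endpoint could look roughly like the sketch below, which only builds the request and sends nothing over the network. The endpoint address is a placeholder (the slides do not give the real one), and the query shape is an assumption about how the published OPM data might be explored. The `opmv:` terms come from the public OPM Vocabulary namespace.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumption: placeholder endpoint; the real address is not given in the slides.
ENDPOINT = "http://example.org/sparql"

# Namespace of the OPM Vocabulary (OPMV).
OPMV = "http://purl.org/net/opmv/ns#"


def build_query(limit=10):
    """Return a SPARQL query listing artifacts and the process that generated each."""
    return (
        f"PREFIX opmv: <{OPMV}>\n"
        "SELECT ?artifact ?process WHERE {\n"
        "  ?artifact a opmv:Artifact ;\n"
        "            opmv:wasGeneratedBy ?process .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(endpoint, query):
    """Encode the query as a SPARQL-protocol GET request URL."""
    return f"{endpoint}?{urlencode({'query': query})}"


query = build_query(limit=5)
url = build_request_url(ENDPOINT, query)

# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlparse(url).query)["query"][0]
print(url)
print(decoded == query)
```

An external application would send this URL with any HTTP client and ask for a SPARQL results format in the `Accept` header.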
Executable and abstract workflow
OPMV extended model
[Figure: OPM graph with its workflow extension. Red: OPM model. Black: OPM profile (extension).]
• Workflow Template (account): Workflow template, Abstract template Node, Input artifact1, Input artifact2, Output artifact1, Abstract component
• Execution Results (account): Execution account, Execution Node, Execution Input1, Execution Input2, Execution result, Specific component, user
• OPM classes shown: Process, Artifact, Agent, Account, OPM Graph
• Relations shown: hasArtifact, hasProcess, hasWorkflowTemplate, hasArtifactTemplate, hasProcessTemplate, hasAbstractComponent, hasSpecificComponent, used, wasGeneratedBy, wasControlledBy, subClassOf
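The way an execution trace and its template might be tied together can be sketched as a small set of RDF triples. The wiring below is an assumption about how the figure connects its nodes, and the `PROFILE` and `EX` namespaces are placeholders, since the slides do not give the profile's real URIs. Only `used`, `wasGeneratedBy`, and `wasControlledBy` are taken from the OPM Vocabulary itself.

```python
OPMV = "http://purl.org/net/opmv/ns#"
# Assumption: placeholder namespaces for the profile terms and the example data.
PROFILE = "http://example.org/opm-profile#"
EX = "http://example.org/run1/"

triples = [
    # Plain OPM part: what ran, on which inputs, producing what, controlled by whom.
    (EX + "executionNode", OPMV + "used", EX + "executionInput1"),
    (EX + "executionNode", OPMV + "used", EX + "executionInput2"),
    (EX + "executionResult", OPMV + "wasGeneratedBy", EX + "executionNode"),
    (EX + "executionNode", OPMV + "wasControlledBy", EX + "user"),
    # Profile part (assumed wiring): link each execution element to its template.
    (EX + "executionNode", PROFILE + "hasProcessTemplate", EX + "abstractTemplateNode"),
    (EX + "executionInput1", PROFILE + "hasArtifactTemplate", EX + "inputArtifact1"),
    (EX + "executionInput2", PROFILE + "hasArtifactTemplate", EX + "inputArtifact2"),
    (EX + "executionResult", PROFILE + "hasArtifactTemplate", EX + "outputArtifact1"),
    (EX + "executionAccount", PROFILE + "hasWorkflowTemplate", EX + "workflowTemplate"),
]


def to_ntriples(rows):
    """Serialize (subject, predicate, object) URI triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in rows)


def template_of(rows, resource):
    """Return the template linked to an execution resource, or None if absent."""
    for s, p, o in rows:
        if s == resource and p.startswith(PROFILE) and p.endswith("Template"):
            return o
    return None


print(to_ntriples(triples))
print(template_of(triples, EX + "executionResult"))
```

Keeping the template links in a separate namespace means a consumer that only understands plain OPM can still read the execution trace and ignore the extension.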
Reproducibility
• 3 perspectives:
– Reproducibility by an expert
– Basic reproducibility by non-experts
– Reproducibility by students from text only
• Or, not reproducible at all
Reproducibility Maps
• Comparison of ligand binding sites using SMAP
• Comparison of dissimilar protein structures using FATCAT
• Docking using eHiTS/AutoDock Vina
Reproducibility maps: accessing the scripts and intermediate data
How can we use this in Wf4Ever?
• The abstract workflow notion can be reused and imported into the workflows used in ROs (Research Objects).
– Complements the workflow, helping to understand it better.
– Allows tackling incomplete provenance.
• Additional workflow repository for recommendation
– OPM (Open Provenance Model) is independent of workflow framework and is widely implemented (Taverna has an OPM export too)
– Other groups can import to their own workflow framework
• Workflow integration with WINGS.
– Semantic annotation of workflows.
– Distributed workflow execution engine
Next steps
• Keep working on workflow abstraction.
– Research on compatibility with problem solving methods (PSMs).
• Create an OPMV/W3C PROV-O profile for common workflow representation.
– Interoperability between workflow systems (Taverna).
• Work on workflows in different domains.
– Biology, Astronomy.
– Workflow reuse between different domains?
References
• The TB Drugome paper: http://funsite.sdsc.edu/drugome/TB/
• OPMO + OPMV mapped version: http://openprovenance.org/model/opmo
• WINGS workflow system: http://seagull.isi.edu/marbles/
• TB Drugome Wiki (evolution of the work): http://seagull.isi.edu/wings-drugome/index.php/Main_Page
• Thanks to Yolanda Gil for letting me borrow some of the slides, based on UCSD slides, for this presentation.
Date: 03/10/2011