2014-06-03-Taverna-IS-ENES2
Transcript of 2014-06-03-Taverna-IS-ENES2
Taverna
Robert Haines, Stian Soiland-Reyes
myGrid, University of Manchester
[email protected]
http://orcid.org/0000-0002-9538-7919
IS-ENES2 workshop on workflows, Hamburg, 2014-06-03
http://www.taverna.org.uk/
http://www.mygrid.org.uk/
This work is licensed under a Creative Commons Attribution 3.0 Unported License
Taverna in Context
• Comprehensive Scientific Workflow Management System + auxiliary tools/repositories
• Based at Manchester with multiple contributions and collaborations
• Releases: Three major; numerous rolling intermediate. First release 2004.
• Downloads: 90,000+ cumulative; 1000 in first month per intermediate release; user audit for May 2013 had 900+ unique addresses use Taverna
• Users: ~380 sites and institutions have used or use Taverna
• Support: mailing list, community list and Jira
Taverna workflows
• Sophisticated analysis pipelines
• A set of services to analyze or manage data (local or remote)
• Workflows run through the workbench or via a server
• Automation of data flow through services
• Control of service invocation
• Iteration over data sets
• Provenance collection
• Extensible and open source
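The dataflow and iteration bullets above can be pictured with a small Python sketch. This is not Taverna's API: `run_step`, `fetch_record`, `summarise` and `pipeline` are hypothetical names, and the "services" are local functions standing in for remote calls. The sketch only illustrates the idea that a step written for a single item is applied once per element when it is handed a list.

```python
from typing import Any, Callable


def run_step(service: Callable[[Any], Any], value: Any) -> Any:
    """Apply a service to one item, or to every item of a (nested) list."""
    if isinstance(value, list):
        return [run_step(service, item) for item in value]
    return service(value)


def fetch_record(name: str) -> dict:
    """Hypothetical stand-in for a remote data service."""
    return {"name": name, "length": len(name)}


def summarise(record: dict) -> str:
    """Hypothetical stand-in for an analysis service."""
    return f"{record['name']}: {record['length']} characters"


def pipeline(inputs: Any) -> Any:
    """Wire the output of one step into the input of the next."""
    records = run_step(fetch_record, inputs)
    return run_step(summarise, records)


if __name__ == "__main__":
    # The same pipeline accepts a single value or a list of values.
    for line in pipeline(["sample-1", "sample-22"]):
        print(line)
```

The steps themselves never mention lists; the iteration lives in the wiring, which is the property the bullets describe.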
Taverna workflows
• Dataflow
– Graphically connect data between drag-and-dropped services
• Service types
– REST, SOAP, Command Line, web interactions, scripts (R, Python, Beanshell)
– Domain-specific plugins
– Your tool?
Taverna workflows
• Nested workflows
• Components
– Reusable and inter-compatible workflow fragments
– Grouped into families
– Semantically annotated
– Curated
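One way to picture nested workflows and reusable fragments is as functions that share an input and output shape, so that a whole workflow can be dropped in wherever a single step is expected. The sketch below is an illustration under that assumption, not how Taverna represents components; `make_workflow` and the three step functions are hypothetical.

```python
from functools import reduce
from typing import Callable, List

# A step takes a list of strings and returns a list of strings.
Step = Callable[[List[str]], List[str]]


def make_workflow(*steps: Step) -> Step:
    """Chain steps in order; the result is itself a Step, so it can be nested."""
    def workflow(data: List[str]) -> List[str]:
        return reduce(lambda acc, step: step(acc), steps, data)
    return workflow


def strip_blanks(rows: List[str]) -> List[str]:
    """Trim whitespace and drop empty rows."""
    return [r.strip() for r in rows if r.strip()]


def lowercase(rows: List[str]) -> List[str]:
    return [r.lower() for r in rows]


def deduplicate(rows: List[str]) -> List[str]:
    """Remove duplicates and return the rows in sorted order."""
    return sorted(set(rows))


# A reusable fragment, then a larger workflow that nests it as one step.
cleaning = make_workflow(strip_blanks, lowercase)
analysis = make_workflow(cleaning, deduplicate)

if __name__ == "__main__":
    print(analysis([" B ", "a", "", "A"]))
```

Because `cleaning` has the same signature as any single step, it can be reused in other workflows without change.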
[Figure: layered architecture. Layers shown: Application; Runtime Middleware; Execution Activity Plug-ins; Resources/Codes/Services; Infrastructures; Platforms; Repositories; Registries]
[Figure: Taverna mapped onto the layered architecture. Labels shown: Taverna Desktop Workbench; Taverna Online Web Tool; Portals and Applications; Engine; Server; Player; Cmd line; Provenance; Third Party Servers; BioSTIF; Workflows & workflow components; PROV, OPM; Data; Provenance; Registries]
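As a rough illustration of what provenance collection involves, the sketch below wraps each step so that its inputs, output and timing are logged. It does not use Taverna or any PROV library: `provenance_log` and `with_provenance` are hypothetical, and the field names are only loosely borrowed from W3C PROV terms (`used`, `generated`, `startedAtTime`, `endedAtTime`).

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List

# In-memory provenance store; a real system would persist these records.
provenance_log: List[Dict[str, Any]] = []


def with_provenance(activity: str, step: Callable[..., Any],
                    inputs: Dict[str, Any]) -> Any:
    """Run a step and record what it used and what it generated."""
    started = datetime.now(timezone.utc)
    output = step(**inputs)
    ended = datetime.now(timezone.utc)
    provenance_log.append({
        "activity": activity,
        "used": dict(inputs),        # copy, so later changes do not alter the record
        "generated": output,
        "startedAtTime": started.isoformat(),
        "endedAtTime": ended.isoformat(),
    })
    return output


def scale(values: List[float], factor: float) -> List[float]:
    """Hypothetical analysis step."""
    return [v * factor for v in values]


if __name__ == "__main__":
    with_provenance("scale", scale, {"values": [1.0, 2.0], "factor": 3.0})
    for record in provenance_log:
        print(record["activity"], record["used"], "->", record["generated"])
```

Recording at the point of invocation means the log describes what actually ran, independently of how the workflow was drawn.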
Workflow Clients for People
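[Figure: workflow clients arranged by user type, from Technical Computational Scientist to Domain Scientist. Clients shown: Workbench; Components; Java library; Ruby Gem; Player; CmdLine. Axis labels: Workflow Visibility; Concept Knowledge; Taverna; Domain; High; Low]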
Simulation characteristics
• Platform
• Data
• Incorporating codes/services
• Scale
• Parameter and data sweeps
• Interacting
• Reporting
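The "parameter and data sweeps" item can be sketched as enumerating every combination of data set and parameter values and running the same simulation once per combination. This is a generic Python illustration, not Taverna's mechanism; `sweep`, `simulate` and the parameter names `resolution` and `threshold` are hypothetical.

```python
from itertools import product
from typing import Any, Dict, Iterable, List


def sweep(datasets: Iterable[str],
          parameters: Dict[str, Iterable[Any]]) -> List[Dict[str, Any]]:
    """List one run configuration per (dataset, parameter combination)."""
    names = list(parameters)
    combos = product(*(parameters[name] for name in names))
    runs = []
    for dataset, values in product(datasets, combos):
        runs.append({"dataset": dataset, **dict(zip(names, values))})
    return runs


def simulate(run: Dict[str, Any]) -> float:
    """Hypothetical stand-in for a simulation code or service."""
    return run["resolution"] * run["threshold"]


if __name__ == "__main__":
    runs = sweep(["region-a", "region-b"],
                 {"resolution": [1, 2], "threshold": [0.5]})
    for run in runs:
        print(run, "->", simulate(run))
```

Each configuration is independent of the others, which is why sweeps of this kind are natural candidates for running in parallel on larger infrastructures.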
Biodiversity marine monitoring and health assessment
ecological niche modelling
Data Intensive Science
Collaborative Science
Pilumnus hirtellus
Enclosed sea problem (Ready et al., 2010)
Sarah Bourlat
Data discovery
Data assembly, cleaning, and refinement
Ecological Niche Modeling
Statistical analysis
Analytical cycle
Data collection
Insights Scholarly Communication & Reporting
www.biovel.eu
Ecological Niche Modeling Workflow (ENM)
data
configuration
parameters
steps
Data and Parameter Sweeps
[Figure: Patients; Patient Avatar; Disease Simulation Workflow; Patient Avatar updated; Systemic Factors; Gene Expression Profile; RISK]
http://www.vph-share.eu/
VPH-Share @neurIST Aneurysm Morphology Workflow
Implementation in VPH-Share
The @neurIST morphological workflow specification in Taverna
Make Deploy
Save and share
Taverna Workbench
VPH-Share
Taverna Player
Command-line Taverna (batch)
Workflow repository
Integrate and run
Analysis portals
Taverna Server
Taverna Workbench
• Desktop application
• GUI
• Plug-in Framework
• Themed editions
• Intermediate results views
• Search for Web Services in catalogues
• Search and publish to myExperiment
Web apps to create and run workflows
Taverna Online
• Dr Vadim Surpin and Vitaly Sharanutsa
– Institute for Information Transmission Problems of the Russian Academy of Sciences (IITP RAS)
• An online, in-browser application for assembling and running Taverna workflows over an HPC platform
Web apps to create and run workflows
Service Chaining Editor
Pete Walker et al., Plymouth Marine Laboratory
For chaining OGC Web Processing Service geospatial Web Services
Desktop Client
http://www.xworx.org/
Data Centric Interface
BIFI (Beautiful Interfaces for Inputs) Taverna Workbench Plug-in, GUI definition language
Taverna Server family
• Taverna Server
– Multiple clients, Multi-user
– Local and large scale infrastructures
– Site Replication
• Taverna Server Amazon Image
– Can have local tools and services (e.g. R)
– Multiple instances in Amazon Cloud and as required, for multiple users/uses and different security scenarios
• Taverna Virtual Machine
• Taverna Command Line
• Bundled Servers, Services and Tools
Interacting with a workflow
• Many tasks need user interaction
• A workflow on a server does not need to be “press a button and wait”
– VPH-Share opens a VNC connection to the spawned instance.
• Taverna Interaction Service
– Users interact with a workflow (wherever it is running) in a Web browser.
– Interaction Service Plug-in in workbench
iPython integration
https://www.youtube.com/watch?v=QVQwSOX5S08
http://goo.gl/hm0qCN
Analysis Portals
https://www.youtube.com/watch?v=s3D8JXc-tSM
– Find and share workflows (Taverna, RapidMiner, Kepler, Galaxy, Trident, VisTrails, etc.)
– Track updates of workflows
– Social curation: comments, tags, stars
– Themed groups of users (projects, domain)
– Organize packs of related data (source files, results)
Over 7500 members, 300 groups, 2500 workflows, 600 files and 300 packs (research objects)
VPH-Share Models of Human Physiology
Eagle Genomics Next Generation Sequencing based Patient Diagnostics
Astronomy & HelioPhysics
Document Preservation
Digitisation
Systems Biology
OpenTox Project
Chemistry Development Kit
Drug Toxicity
Ecological Niche Modelling
Population Modelling
Meta-genomics
Phylo-genetics
• Data cleaning
• Data movement
• Data retrieval and annotation
• Data analysis
• Data mining
• Knowledge management
• Data curation and data warehouse population
• Data visualisation
• Parameter sweeps over simulations
Drug discovery, small molecules, targets, compounds OpenPHACTS
Open source development
• Taverna is open source software (LGPL)
– https://github.com/taverna/
– License allows integration in closed-source products
• Open development
– Developer documentation and tutorials
– Public mailing lists, issue trackers, wiki
– Contributors from around the world
• Taverna Plugins
– APIs and plugin system
• Applying to join the Apache Foundation
Summary
• Taverna Suite for interactive and batch workflows
• Flexible Plug-ins and Flexibly Plugged-in
• Themed Taverna
• Moving to the Apache Foundation
• We welcome collaboration/contribution
• http://www.taverna.org.uk
Integrating with Taverna
• “Anything” can be extended by plugins:
– Service types
– Service discovery
– Menus and toolbars
– File types (e.g. SVG, PDF, CSV)
– Complete views/perspectives
• Documentation and tutorials for plugin developers
• Installable/updatable from plugin sites
• Many plugins get included in Taverna Core or domain-specific editions (e.g. AstroTaverna++ became Taverna Astronomy edition)
• AstroTaverna plugin
• OAuth plugin
• Taverna PROV plugin
• VAMDC plugin
• BIFI plugin
• VPH-Share plugin
• Interaction plugin
• XPath plugin
• REST plugin
• BioCatalogue plugin
• PBS plugin
• SADI plugin
• External Tools plugin
• UNICORE plugin
• CDK plugin
• caGrid plugin
• XWS plugin
• gLite plugin
• WPS plugin
• ...
http://dev.mygrid.org.uk/wiki/display/developer/Creating+plugins+for+Taverna+2