The Taverna 2 Platform
DEVELOPING WITH THE TAVERNA CODE
An introduction to the T2 Platform
Tom Oinn, [email protected], 7th October 2008
Updated 4th November 2008
A WARNING!
In this presentation ‘Taverna’ is distinct from ‘The Taverna Workbench’.
‘Taverna’ should be interpreted as ‘the functionality provided by the core components’:
  Workflow creation and modification
  Workflow enactment, monitoring, provenance
  ... and many more
‘Taverna’ in this context applies to version 2 of the codebase in cases where a distinction is significant.
‘Service’ is used in the sense of a software component providing a service rather than any particular technology such as SOAP or REST. ‘Services’ here are plain Java objects.
THE PROBLEM
Taverna (both 1.x and 2) is complex
  Inherent complexity
    Wide-ranging functionality
  Mechanistic complexity
    Loaded through Raven to resolve dependency issues
    Plug-in mechanism
    Many different extension points
    Requires unusual knowledge of classloading to resolve issues
Taverna is valuable
  Developers want to use Taverna to:
    Extend Taverna through plug-in implementations
    Embed and use Taverna in other systems
The complexity obscures the value
APPLICATION DEVELOPMENT SUPPORT
APPLICATION SUPPORT
Guiding principles:
1. Do not require knowledge of classloading
2. Do not require a specific application environment
3. Allow minimal code
  Provide ‘exactly as much’ functionality as required
  Explicitly support common usage patterns
4. Prevent dependency on implementation
5. Extensibility and openness
WHAT’S IN AN APPLICATION?
An application using Taverna will contain:
  The Taverna API jars
  Support jars used by the application
  The application code itself
Applies to all applications
  Command Line
  GUI
  Web app
  Axis service
  ...
[Diagram: application structure. Labels: Your Application Code; Application Specific Libraries for your code; T2 Core API (Platform API, Spring etc.); Platform service APIs; Platform Libraries, T2 interfaces.]
TAVERNA IMPLEMENTATION CLASSES
The application must also contain...
  Implementations of the enactor, workflow model and similar.
    Loaded through Raven
    Inherit and implement Taverna APIs from the parent class loader
[Diagram: Raven Class Loaders. Labels: T2 Core Impl; Implementation Dependencies; Implementations of T2 Interfaces; shown alongside the T2 Core API (Platform API, Spring etc.; Platform service APIs; Platform Libraries, T2 interfaces).]
PLUG-INS
Taverna is plug-in based
Each plug-in is isolated in its own class loader and contains:
  Plug-in specific libraries
  The extension point implementation code
The plug-in inherits the Taverna APIs from the parent class loader
  These APIs include the extension point interfaces
A plug-in can contain many extension point implementations
[Diagram: Plug-in Class Loaders 1..n. Labels: Plug-in specific libraries (not T2 APIs); Plug-in Implementation; shown alongside the T2 Core API (Platform API, Spring etc.; Platform service APIs; Platform Libraries, T2 interfaces).]
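The isolation described above follows the standard Java parent-delegation model, which can be sketched with nothing more than java.net.URLClassLoader. This is a minimal illustration, not Taverna or Raven code: the class name PluginLoaderSketch and both helper methods are hypothetical, the empty URL array stands in for a plug-in's own jars, and java.lang.Runnable stands in for an extension point interface held by the parent.

```java
import java.net.URL;
import java.net.URLClassLoader;

final class PluginLoaderSketch {

    // Creates an isolated loader for one plug-in. The parent supplies the
    // shared API classes; pluginJars would list the plug-in's own libraries.
    static URLClassLoader newPluginLoader(URL[] pluginJars, ClassLoader parent) {
        return new URLClassLoader(pluginJars, parent);
    }

    // True when both loaders resolve the name to the very same Class object,
    // which is what happens for classes inherited from a common parent.
    static boolean seeSameClass(ClassLoader a, ClassLoader b, String className) {
        try {
            return a.loadClass(className) == b.loadClass(className);
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        ClassLoader apiLoader = PluginLoaderSketch.class.getClassLoader();
        // Two plug-ins, each with its own loader but the same parent.
        try (URLClassLoader pluginA = newPluginLoader(new URL[0], apiLoader);
             URLClassLoader pluginB = newPluginLoader(new URL[0], apiLoader)) {
            System.out.println("distinct loaders: " + (pluginA != pluginB));
            System.out.println("shared API class: "
                    + seeSameClass(pluginA, pluginB, "java.lang.Runnable"));
        }
    }
}
```

Because the API type is resolved by the parent, an object created inside one plug-in can be handed to application code as that interface without any class cast problems, while each plug-in's private libraries stay invisible to the others.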
SERVICE BEANS
Configured instances
Implement service APIs
Use T2 implementation classes from Raven
Act as a bridge between Raven driven implementations and application code
Think of as a ‘toolbox’ for working with Taverna facilities
Configured, linked and instantiated through Spring XML configuration
Set shown here is indicative
  We will provide more than this
  3rd parties can also provide components here
[Diagram: Service Beans (Workflow Exporter, Workflow Parser, Reference Service, Edit Kit, Activity Kit, Enactor, Monitor Factory, Plug-in Service, ...), labelled ‘XML Context Configuration’, shown with the Raven Class Loaders (T2 Core Impl; Implementation Dependencies; Implementations of T2 Interfaces) and the T2 Core API (Platform API, Spring etc.; Platform service APIs; Platform Libraries, T2 interfaces).]
FINAL TAVERNA APPLICATION STRUCTURE
[Diagram: Root Class Loader from Application, Web-App, Web Service ... Labels: Your Application Code; Application Specific Libraries for your code; T2 Core API (Platform API, Spring etc.; Platform service APIs; Platform Libraries, T2 interfaces); Service Beans (Workflow Exporter, Workflow Parser, Reference Service, Edit Kit, Activity Kit, Enactor, Monitor Factory, Plug-in Service, ...); XML Context Configuration; Raven Class Loaders (T2 Core Impl; Implementation Dependencies; Implementations of T2 Interfaces); Plug-in Class Loaders 1..n (Plug-in specific libraries, not T2 APIs; Plug-in Implementation). Key: classes only available through services; classes available to all code.]
USING THE PLATFORM 1-2-3
1. Initialize platform (once per application)
  Downloads implementation code where required
  Configures and instantiates service beans
  Configures plug-in manager, loads plug-ins
  Specified by Spring XML based configuration files
    We will provide base configurations as part of the platform
2. Fetch appropriate service bean(s) by name
  The standard Taverna platform will define a set of base services and their service names
3. Use service bean(s) to access Taverna functionality
  Code purely against interface based APIs – service beans perform any object instantiation such as creation of new workflow model instances.
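The three steps can be sketched in plain Java without Spring. Everything below is a stand-in written for illustration: the Platform class, its typed getBean(name, type) method and the WorkflowLoader interface are assumptions, and only the bean name t2.workflow.loader is taken from the example that follows. A real platform would build its beans from the XML configuration rather than in a constructor.

```java
import java.util.HashMap;
import java.util.Map;

final class NamedBeanSketch {

    // Hypothetical service API: application code sees only this interface.
    interface WorkflowLoader {
        String parseWorkflow(String url);
    }

    // Stand-in for the platform: holds configured service beans by name.
    static final class Platform {
        private final Map<String, Object> beans = new HashMap<>();

        // Step 1: initialise once. The implementation is created here and
        // never exposed by type to the caller.
        Platform() {
            WorkflowLoader impl = url -> "workflow loaded from " + url;
            beans.put("t2.workflow.loader", impl);
        }

        // Step 2: fetch a bean by name, typed by its interface.
        <T> T getBean(String name, Class<T> api) {
            Object bean = beans.get(name);
            if (bean == null) {
                throw new IllegalArgumentException("no bean named " + name);
            }
            return api.cast(bean);
        }
    }

    public static void main(String[] args) {
        Platform plat = new Platform();
        WorkflowLoader loader = plat.getBean("t2.workflow.loader", WorkflowLoader.class);
        // Step 3: use the bean through its interface only.
        System.out.println(loader.parseWorkflow("http://foo.com/wf.xml"));
    }
}
```

Passing the interface type to the lookup is one way to avoid the cast that the pseudo-code leaves out; an untyped lookup followed by a cast would work equally well.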
A SHORT EXAMPLE
An example use of the platform to load a workflow, acquire input data, run the workflow and handle the output.
I’m using pseudo-code here rather than Java. In general one line of pseudo-code corresponds to one line of Java; standard Java constructs such as casts and type declarations are left out for brevity.
Also consider that the final ‘base’ set of service beans is not yet defined, so service names and interfaces are illustrative rather than definitive.
INITIALIZATION & WORKFLOW LOADING
The ‘loader’ here is a platform service bean, a tool in the toolbox defined by the platform.
This particular tool can be used to create a new workflow model from an XML definition, in this case from a URL.
Important point – only one explicit object instantiation: the platform itself.
  In a web application or web service this would already have been provided in the servlet context, but for conventional applications we need it explicitly.
Aside – the workflow model in Taverna 2 is read-only. If we wanted to modify the model we just loaded we would need an appropriate ‘workflow editor’ service. In our current code this is the Edits interface.
plat = new T2Platform("path/to/conf.xml");
loader = plat.getBean("t2.workflow.loader");
workflow = loader.parseWorkflow("http://foo.com/wf.xml");
DATA REGISTRATION
‘ref_service’ is a Reference Service. This is the component of Taverna used to register data and obtain an internal identifier for it. This identifier is then used as input when running the workflow.
Two inputs are registered:
  ‘input1’ is data held in a file locally
  ‘input2’ is a literal string value
The real Reference Service interface is slightly more complex as it registers data asynchronously but the principle is the same. As before there is no direct object construction (other than the file), so no direct link to any implementation classes.
ref_service = plat.getBean("t2.reference.service");
input1 = ref_service.register(new File("some.data"));
input2 = ref_service.register("string value");
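To give a feel for what asynchronous registration might look like, here is a small self-contained sketch. It is not the real Reference Service interface: the ReferenceStore class, the "ref-N" identifier format and the use of CompletableFuture are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class AsyncRegisterSketch {

    // Hypothetical reference store: data goes in, an opaque identifier comes out.
    static final class ReferenceStore {
        private final Map<String, String> data = new ConcurrentHashMap<>();
        private final AtomicInteger counter = new AtomicInteger();

        // Registers the value on another thread and completes with its identifier.
        CompletableFuture<String> register(String value) {
            return CompletableFuture.supplyAsync(() -> {
                String id = "ref-" + counter.incrementAndGet();
                data.put(id, value);
                return id;
            });
        }

        // Renders the data behind an identifier back to a string.
        String getAsString(String id) {
            return data.get(id);
        }
    }

    public static void main(String[] args) {
        ReferenceStore refService = new ReferenceStore();
        // join() waits for the asynchronous registration to finish.
        String input2 = refService.register("string value").join();
        System.out.println(input2 + " -> " + refService.getAsString(input2));
    }
}
```

The caller only ever holds the identifier, which is the property that matters here: the data itself can live wherever the store chooses to keep it.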
ENACTMENT
Following the same pattern as before, we obtain an enactor service bean from the platform.
This tool creates workflow instances from workflow definitions. In this particular example we’re ignoring the workflow context: following the ‘minimal code’ principle, the service bean API would include this simple version and use some sensible default context.
Having obtained the workflow instance we can push the previously registered data into its (in this case two) inputs. Workflows in T2 are pipeline based, so the act of pushing data in starts the enactment; there is no explicit ‘start’ operation (except for cases where there are no workflow inputs).
enactor = plat.getBean("t2.enactor");
wf_instance = enactor.createInstance(workflow);
wf_instance.push("input1", input1);
wf_instance.push("input2", input2);
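The ‘push starts the run’ idea can be shown with a toy model. The WorkflowInstance class below is hypothetical and deliberately simplified: it waits until every declared input has arrived and then runs a single function, whereas a real pipeline could begin work on partial data. The point is only that push is the sole trigger.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

final class PushToStartSketch {

    // Hypothetical workflow instance: runs once every declared input has arrived.
    static final class WorkflowInstance {
        private final Set<String> expected;
        private final Function<Map<String, String>, String> body;
        private final Map<String, String> received = new HashMap<>();
        private String output; // null until the workflow has run

        WorkflowInstance(Set<String> inputs, Function<Map<String, String>, String> body) {
            this.expected = new TreeSet<>(inputs);
            this.body = body;
        }

        // Pushing data is the only trigger; there is no separate start() call.
        void push(String port, String value) {
            if (!expected.contains(port)) {
                throw new IllegalArgumentException("unknown input " + port);
            }
            received.put(port, value);
            if (received.keySet().containsAll(expected)) {
                output = body.apply(received);
            }
        }

        boolean hasRun() { return output != null; }

        String output() { return output; }
    }

    public static void main(String[] args) {
        WorkflowInstance wf = new WorkflowInstance(
                new TreeSet<>(Arrays.asList("input1", "input2")),
                in -> in.get("input1") + "+" + in.get("input2"));
        wf.push("input1", "a");
        System.out.println(wf.hasRun()); // input2 is still missing
        wf.push("input2", "b");
        System.out.println(wf.output());
    }
}
```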
HANDLING RESULTS
I’m taking some liberty with the ‘real’ API here but the principle is the same. We don’t have a blocking fetch method at the moment but maybe we should!
We call a method on the workflow instance to fetch the output which will block until it’s available.
The result is in the form of an internal identifier in the reference system, so we then use the reference service to render the data referenced by that identifier to a string and print it to the console.
result = wf_instance.blocking_fetch("output");
System.out.println(ref_service.getAsString(result));
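Since there is no blocking fetch in the current API, the following is only a sketch of how one could be layered over an asynchronous result. The OutputPort class, its deliver and blockingFetch methods and the timeout parameter are all assumptions; a background thread plays the part of the enactor.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BlockingFetchSketch {

    // Hypothetical output port: completed by the enactor, awaited by the caller.
    static final class OutputPort {
        private final CompletableFuture<String> result = new CompletableFuture<>();

        // Called by the (simulated) enactor when the output reference is ready.
        void deliver(String reference) {
            result.complete(reference);
        }

        // Blocks until the output arrives or the timeout expires.
        String blockingFetch(long timeout, TimeUnit unit) {
            try {
                return result.get(timeout, unit);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            } catch (ExecutionException | TimeoutException e) {
                throw new IllegalStateException("no output available", e);
            }
        }
    }

    public static void main(String[] args) {
        OutputPort output = new OutputPort();
        // Simulate the workflow finishing on another thread.
        new Thread(() -> output.deliver("ref-42")).start();
        System.out.println(output.blockingFetch(5, TimeUnit.SECONDS));
    }
}
```

A timeout is included because a fetch that can block forever is awkward to use from a GUI or a web request; whether a real API should offer one is a design question this sketch does not settle.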
THE ENTIRE APPLICATION
plat = new T2Platform("path/to/conf.xml");
loader = plat.getBean("t2.workflow.loader");
workflow = loader.parseWorkflow("http://foo.com/wf.xml");
ref_service = plat.getBean("t2.reference.service");
input1 = ref_service.register(new File("some.data"));
input2 = ref_service.register("string value");
enactor = plat.getBean("t2.enactor");
wf_instance = enactor.createInstance(workflow);
wf_instance.push("input1", input1);
wf_instance.push("input2", input2);
result = wf_instance.blocking_fetch("output");
System.out.println(ref_service.getAsString(result));
CONCLUSION
Going back to the ‘guiding principles’, does this code satisfy them?
  There’s no reference to classloading, and the application would just run from the normal command line.
  There’s no more code than required to do the job. This is a simple example and didn’t address various issues such as context configuration, but those issues weren’t in the summary of the application either, so their corresponding absence in the code is another sign of success.
  The code never refers to implementation types; it is therefore resilient in the face of implementation updates and changes.
  The code is almost self-documenting.
  We can’t make any statements about the extensibility of the platform, but all the other requirements are satisfied.
PLUG-IN SUPPORT
PLUG-IN MANAGEMENT IN TAVERNA 1.X
Plug-in code is used to extend Taverna
  Includes all ‘base’ functionality such as web service invocation, HTTP references etc.
Plug-in development issues
  No tooling support
  Testing is problematic
  Problems with optional Maven dependencies
  Can’t use libraries which are not Maven artifacts
Plug-in usage issues
  Insufficient metadata to properly describe plug-in functionality
  Plug-in management service API missing
  Plug-ins have no presence in the code once loaded
PLUG-IN DEVELOPMENT
Provide tool support
  Plug-in description generator
  Plug-in verifier
  Plug-in registry
Modified class loading strategy
  Single class loader per plug-in
  Allow mix of artifact and non-artifact jars
    Use Raven to fetch artifact jars, but not to obtain class loaders
Testing is a special case of application development
  See previous section
  We can provide archetypes for integration tests using the platform to simplify testing
PLUG-IN DESCRIPTION
Plain text description
Short name
Version, triple numeric with periods
Development status (enumeration, for example ‘alpha | beta | stable’ to be decided)
Author list
  Author name
  Author affiliation (optional)
  Author email (optional)
  Author URL (optional)
  Author icon (optional, 128x128 true colour PNG with transparency)
Tag list
  Free text tag
Homepage URL (can be generated by tooling along with the page itself)
Related link list
  Link description
  Link URI (most likely HTTP but potentially other reference)
Icon, 128x128 true colour PNG with transparency
Plugin manager background image (arbitrary size, true colour PNG with transparency, used as a background for the plugin manager panel for this plugin)
Core API version targeted, specified as an artifact (group, artifact, version)
SPI implementation list
  Plain text description
  Short name
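A plug-in verifier would need to check fields like these. The sketch below covers only three of them and is an assumption throughout: the class and field names are invented, the Status values are the slide's example candidates rather than a decided enumeration, and the version rule is read literally as three numeric fields separated by periods.

```java
import java.util.regex.Pattern;

final class PluginDescriptionSketch {

    // Candidate values only; the final enumeration is still to be decided.
    enum Status { ALPHA, BETA, STABLE }

    // Three numeric fields separated by periods, for example 1.0.2.
    private static final Pattern VERSION = Pattern.compile("\\d+\\.\\d+\\.\\d+");

    static boolean isValidVersion(String version) {
        return version != null && VERSION.matcher(version).matches();
    }

    // A cut-down description holding a few of the listed fields.
    static final class PluginDescription {
        final String shortName;
        final String version;
        final Status status;

        PluginDescription(String shortName, String version, Status status) {
            if (!isValidVersion(version)) {
                throw new IllegalArgumentException("version must be n.n.n: " + version);
            }
            this.shortName = shortName;
            this.version = version;
            this.status = status;
        }
    }

    public static void main(String[] args) {
        PluginDescription d = new PluginDescription("example-plugin", "1.0.2", Status.BETA);
        System.out.println(d.shortName + " " + d.version + " " + d.status);
        System.out.println(isValidVersion("1.0")); // only two fields
    }
}
```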
NEW PLUG-IN MANAGER STRUCTURE
[Diagram: plug-in manager structure. Components: Consumer Code (i.e. Taverna Workbench); Plugin Manager Service Bean; SPI Registry; Instance Registry; Download Manager; Raven Repository with Artifact Cache; Non-artifact Repository with Jar Cache. Arrow labels: ‘Install, list, enable, disable... plugins’; ‘Consume SPI instances, receive notification of addition / removal’; ‘Get artifact / jar’.]
PLUG-IN MANAGER IMPLICATIONS
Most places in application code explicitly referencing artifacts and Raven will now reference the plug-in manager
  Code using SPI and instance registries is unchanged
  Workflow serializer changes to reference plugin rather than artifact specifier
Plug-ins can be disabled on the fly
Plug-in manager appears as infrastructure bean in platform
Plug-in management functionality is generic
  Part of the generic platform rather than the Taverna specific part
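Consuming instances and being told when they are added or removed is an observer arrangement, sketched below. This is not the existing SPI or instance registry code: the Registry class, the RegistryListener interface and every method name are hypothetical, and plain strings stand in for extension point implementations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class SpiRegistrySketch {

    // Hypothetical listener: told when implementations come and go.
    interface RegistryListener<T> {
        void added(T instance);
        void removed(T instance);
    }

    // Hypothetical registry of instances for one extension point type.
    static final class Registry<T> {
        private final List<T> instances = new CopyOnWriteArrayList<>();
        private final List<RegistryListener<T>> listeners = new CopyOnWriteArrayList<>();

        void addListener(RegistryListener<T> listener) {
            listeners.add(listener);
        }

        // Would be called when a plug-in is enabled.
        void register(T instance) {
            instances.add(instance);
            for (RegistryListener<T> l : listeners) {
                l.added(instance);
            }
        }

        // Would be called when a plug-in is disabled; notifies only if present.
        void unregister(T instance) {
            if (instances.remove(instance)) {
                for (RegistryListener<T> l : listeners) {
                    l.removed(instance);
                }
            }
        }

        // Returns a snapshot so callers cannot modify the registry directly.
        List<T> getInstances() {
            return new ArrayList<>(instances);
        }
    }

    public static void main(String[] args) {
        Registry<String> registry = new Registry<>();
        List<String> log = new ArrayList<>();
        registry.addListener(new RegistryListener<String>() {
            public void added(String s) { log.add("+" + s); }
            public void removed(String s) { log.add("-" + s); }
        });
        registry.register("activity-a");
        registry.unregister("activity-a");
        System.out.println(log);
    }
}
```

Consumer code written this way does not care whether an instance arrived through Raven directly or through a plug-in manager, which is consistent with the claim that registry consumers are unchanged.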
PLUG-IN MANAGER IMPLEMENTATION
The plug-in manager is instantiated and configured through Spring
Plug-in manager and related components are available in the root class loader for an application
  Exposed as named beans in the Spring context
Applications access infrastructure properties through the Spring context
  Access through interfaces not implementation types
Next slide shows actual implementation components in the current code
IMPLEMENTATION STRUCTURE
[Diagram: implementation components and the file paths associated with them.
  Download Manager
  Raven
  Jar Manager: base/artifacts/agrp/aid-aver.jar
  POM.xml Parser: base/artifacts/agrp/aid-aver.pom, base/artifacts/agrp/aid-aver.jar
  Plug-in Parser: base/pgrp/pid-pver.xml
  Plug-in Manager: base/pgrp/pid-pver/jars/jarname.jar
  SPI Registry
Key: component, file path. Other labels: system artifact set, base, property, classname.]
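The file paths in the diagram suggest a simple on-disk layout, which the sketch below reproduces with java.nio.file. It rests on a reading of the abbreviations that the slides do not spell out: agrp, aid and aver are taken to be artifact group, id and version, and pgrp, pid and pver the plug-in equivalents. The class and method names are invented for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class RepositoryLayoutSketch {

    // base/artifacts/<group>/<id>-<version>.<ext>, for jar and pom files
    static Path artifactFile(Path base, String group, String id, String version, String ext) {
        return base.resolve("artifacts").resolve(group).resolve(id + "-" + version + "." + ext);
    }

    // base/<pluginGroup>/<pluginId>-<pluginVersion>.xml
    static Path pluginDescriptor(Path base, String group, String id, String version) {
        return base.resolve(group).resolve(id + "-" + version + ".xml");
    }

    // base/<pluginGroup>/<pluginId>-<pluginVersion>/jars/<jarName>
    static Path pluginJar(Path base, String group, String id, String version, String jarName) {
        return base.resolve(group).resolve(id + "-" + version).resolve("jars").resolve(jarName);
    }

    public static void main(String[] args) {
        Path base = Paths.get("base");
        System.out.println(artifactFile(base, "agrp", "aid", "aver", "jar"));
        System.out.println(artifactFile(base, "agrp", "aid", "aver", "pom"));
        System.out.println(pluginDescriptor(base, "pgrp", "pid", "pver"));
        System.out.println(pluginJar(base, "pgrp", "pid", "pver", "jarname.jar"));
    }
}
```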
WHERE NEXT?
TIMESCALE
1st November – Initial generic platform implementation
  Early version released to public CVS 4th November
1st December – Specification of Taverna specific platform services
  Early implementation of enactor, workflow loader and reference service components expected mid-November
1st January – Initial implementation of Taverna services
16th & 17th February – 2 day Developer Workshop in Manchester
  Sign up at http://spreadsheets.google.com/viewform?key=pGRSW--IktWUFVQVLleNzIw&hl=en