The Taverna 2 Platform

DEVELOPING WITH THE TAVERNA CODE: An introduction to the T2 Platform. Tom Oinn, [email protected], 7th October 2008. Updated 4th November 2008.


A summary of the Taverna 2 Platform

Transcript of The Taverna 2 Platform

Page 1: The Taverna 2 Platform

DEVELOPING WITH THE TAVERNA CODE

An introduction to the T2 Platform

Tom Oinn, [email protected], 7th October 2008

Updated 4th November 2008

Page 2: The Taverna 2 Platform

A WARNING!

In this presentation ‘Taverna’ is distinct from ‘The Taverna Workbench’.

‘Taverna’ should be interpreted as ‘the functionality provided by the core components’:
- Workflow creation and modification
- Workflow enactment, monitoring, provenance
- ... and many more

‘Taverna’ in this context applies to the version 2 of the codebase in cases where a distinction is significant.

‘Service’ is used in the sense of a software component providing a service rather than any particular technology such as SOAP or REST. ‘Services’ here are plain Java objects.

Page 3: The Taverna 2 Platform

THE PROBLEM

Taverna (both 1.x and 2) is complex
- Inherent complexity
  - Wide ranging functionality
- Mechanistic complexity
  - Loaded through Raven to resolve dependency issues
  - Plug-in mechanism
  - Many different extension points
  - Requires unusual knowledge of classloading to resolve issues

Taverna is valuable
- Developers want to use Taverna to:
  - Extend Taverna through plug-in implementations
  - Embed and use Taverna in other systems

The complexity obscures the value

Page 4: The Taverna 2 Platform

APPLICATION DEVELOPMENT SUPPORT

Page 5: The Taverna 2 Platform

APPLICATION SUPPORT

Guiding principles:
1. Do not require knowledge of classloading
2. Do not require a specific application environment
3. Allow minimal code
   - Provide ‘exactly as much’ functionality as required
   - Explicitly support common usage patterns
4. Prevent dependency on implementation
5. Extensibility and openness

Page 6: The Taverna 2 Platform

WHAT’S IN AN APPLICATION?

An application using Taverna will contain:
- The Taverna API jars
- Support jars used by the application
- The application code itself

Applies to all applications:
- Command Line
- GUI
- Web app
- Axis service
- ...

[Diagram: layers of an application. Platform Libraries, T2 interfaces (T2 Core API; Platform API, Spring etc.; Platform service APIs); Application Specific Libraries for your code; Your Application Code]

Page 7: The Taverna 2 Platform

TAVERNA IMPLEMENTATION CLASSES

The application must also contain...
- Implementations of the enactor, workflow model and similar.
  - Loaded through Raven
  - Inherit and implement Taverna APIs from the parent class loader

[Diagram: Raven Class Loaders containing Implementations of T2 Interfaces (T2 Core Impl; Implementation Dependencies); Platform Libraries, T2 interfaces (T2 Core API; Platform API, Spring etc.; Platform service APIs)]

Page 8: The Taverna 2 Platform

PLUG-INS

Taverna is plug-in based
- Each plug-in is isolated in its own class loader and contains:
  - Plug-in specific libraries
  - The extension point implementation code
- The plug-in inherits the Taverna APIs from the parent class loader
  - These APIs include the extension point interfaces
- A plug-in can contain many extension point implementations

[Diagram: Plug-in Class Loaders 1..n, each containing Plug-in specific libraries (not T2 APIs) and a Plug-in Implementation; Platform Libraries, T2 interfaces (T2 Core API; Platform API, Spring etc.; Platform service APIs)]
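The class loader arrangement described on this slide can be sketched with the standard `java.net.URLClassLoader`. This is only an illustration of the parent/child idea, not the platform's loader: `ExtensionPoint` and `PluginLoaderSketch` are hypothetical names, and the jar list is left empty so the sketch runs without any plug-in files.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical stand-in for an extension point interface in the shared API layer.
interface ExtensionPoint {
    String name();
}

class PluginLoaderSketch {
    // One isolated loader per plug-in. pluginJars would list the plug-in
    // specific libraries and the implementation jar; the parent supplies the APIs.
    static URLClassLoader newPluginLoader(URL[] pluginJars, ClassLoader apiLoader) {
        return new URLClassLoader(pluginJars, apiLoader);
    }

    // True when two separate plug-in loaders resolve the API interface to the
    // same Class object, i.e. both inherit it from the parent loader.
    static boolean pluginsShareApi() {
        ClassLoader apiLoader = ExtensionPoint.class.getClassLoader();
        String apiName = ExtensionPoint.class.getName();
        try (URLClassLoader a = newPluginLoader(new URL[0], apiLoader);
             URLClassLoader b = newPluginLoader(new URL[0], apiLoader)) {
            return a != b && a.loadClass(apiName) == b.loadClass(apiName);
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("plug-ins share API classes: " + pluginsShareApi());
    }
}
```

Because each plug-in loader delegates to its parent first, implementations from different plug-ins can all be cast to the same extension point interface, while their private libraries stay invisible to one another.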

Page 9: The Taverna 2 Platform

SERVICE BEANS

- Configured instances
- Implement service APIs
- Use T2 implementation classes from Raven
- Act as a bridge between Raven driven implementations and application code
- Think of them as a ‘toolbox’ for working with Taverna facilities
- Configured, linked and instantiated through Spring XML configuration
- Set shown here is indicative
  - We will provide more than this
  - 3rd parties can also provide components here

[Diagram: Service Beans (Workflow Exporter, Workflow Parser, Reference Service, Edit Kit, Activity Kit, Enactor, Monitor Factory, ..., Plug-in Service); XML Context Configuration; Raven Class Loaders containing Implementations of T2 Interfaces (T2 Core Impl; Implementation Dependencies); Platform Libraries, T2 interfaces (T2 Core API; Platform API, Spring etc.; Platform service APIs)]

Page 10: The Taverna 2 Platform

FINAL TAVERNA APPLICATION STRUCTURE

[Diagram: Root Class Loader from Application, Web-App, Web Service ...
- Classes available to all code: Platform Libraries, T2 interfaces (T2 Core API; Platform API, Spring etc.; Platform service APIs); Application Specific Libraries for your code; Your Application Code
- Service Beans: Workflow Exporter, Workflow Parser, Reference Service, Edit Kit, Activity Kit, Enactor, Monitor Factory, ..., Plug-in Service
- XML Context Configuration
- Classes only available through services: Raven Class Loaders containing Implementations of T2 Interfaces (T2 Core Impl; Implementation Dependencies); Plug-in Class Loaders 1..n containing Plug-in specific libraries (not T2 APIs) and Plug-in Implementation]

Page 11: The Taverna 2 Platform

USING THE PLATFORM 1-2-3

1. Initialize platform (once per application)
   - Downloads implementation code where required
   - Configures and instantiates service beans
   - Configures plug-in manager, loads plug-ins
   - Specified by Spring XML based configuration files
   - We will provide base configurations as part of the platform
2. Fetch appropriate service bean(s) by name
   - The standard Taverna platform will define a set of base services and their service names
3. Use service bean(s) to access Taverna functionality
   - Code purely against interface based APIs: service beans perform any object instantiation such as creation of new workflow model instances.

Page 12: The Taverna 2 Platform

A SHORT EXAMPLE

An example use of the platform to load a workflow, acquire input data, run the workflow and handle the output.

I’m using pseudo-code here rather than Java, but in general one line of pseudo-code will correspond to one line of Java. For brevity the pseudo-code leaves out standard Java constructs such as casting and type declarations.

Also consider that the final ‘base’ set of service beans is not yet defined, so service names and interfaces are illustrative rather than definitive.

Page 13: The Taverna 2 Platform

INITIALIZATION & WORKFLOW LOADING

The ‘loader’ here is a platform service bean, a tool in the toolbox defined by the platform.

This particular tool can be used to create a new workflow model from an XML definition, in this case from a URL.

Important point: only one explicit object instantiation, the platform itself. In a web application or web service this would already have been provided in the servlet context, but for conventional applications we need it explicitly.

Aside: the workflow model in Taverna 2 is read-only. If we wanted to modify the model we just loaded we would need an appropriate ‘workflow editor’ service. In our current code this is the Edits interface.

    plat = new T2Platform("path/to/conf.xml");

    loader = plat.getBean("t2.workflow.loader");
    workflow = loader.parseWorkflow("http://foo.com/wf.xml");

Page 14: The Taverna 2 Platform

DATA REGISTRATION

‘ref_service’ is a Reference Service. This is the component of Taverna used to register data and obtain an internal identifier for it. This identifier is then used as input when running the workflow.

Two inputs are registered:
- ‘input1’ is data held in a file locally
- ‘input2’ is a literal string value

The real Reference Service interface is slightly more complex as it registers data asynchronously but the principle is the same. As before there is no direct object construction (other than the file), so no direct link to any implementation classes.

    ref_service = plat.getBean("t2.reference.service");
    input1 = ref_service.register(new File("some.data"));
    input2 = ref_service.register("string value");
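The slide notes that the real Reference Service registers data asynchronously. As a rough illustration of what that could look like from the caller's side, here is a sketch built on the standard `CompletableFuture`. `AsyncReferenceServiceSketch`, its in-memory store and the `ref-` identifier format are all assumptions made for this example, not the Taverna interface.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical asynchronous variant of the reference service; names are
// illustrative and not taken from the Taverna code.
class AsyncReferenceServiceSketch {
    private final Map<String, Object> store = new ConcurrentHashMap<>();

    // Registration returns immediately; the identifier becomes available
    // when the returned future completes.
    CompletableFuture<String> register(Object data) {
        return CompletableFuture.supplyAsync(() -> {
            String id = "ref-" + UUID.randomUUID();
            store.put(id, data);
            return id;
        });
    }

    // Renders the referenced data as a string, or null for an unknown identifier.
    String getAsString(String id) {
        Object data = store.get(id);
        return data == null ? null : data.toString();
    }

    public static void main(String[] args) {
        AsyncReferenceServiceSketch refService = new AsyncReferenceServiceSketch();
        // join() waits for the registration to finish before the identifier is used.
        String input2 = refService.register("string value").join();
        System.out.println(refService.getAsString(input2));
    }
}
```

Calling `join()` straight away recovers the simple synchronous style of the pseudo-code; a caller with several inputs could instead start all registrations first and wait for them together.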

Page 15: The Taverna 2 Platform

ENACTMENT

Following the same pattern as before, we obtain an enactor service bean from the platform.

This tool creates workflow instances from workflow definitions. In this particular example we’re ignoring the workflow context. Following the ‘minimal code’ principle, the service bean API would include this simple version and use some sensible default context.

Having obtained the workflow instance we can push the previously registered data into its (in this case two) inputs. Workflows in T2 are pipeline based, so the act of pushing data in starts the enactment. There is no explicit ‘start’ operation (except for cases where there are no workflow inputs).

    enactor = plat.getBean("t2.enactor");
    wf_instance = enactor.createInstance(workflow);
    wf_instance.push("input1", input1);
    wf_instance.push("input2", input2);

Page 16: The Taverna 2 Platform

HANDLING RESULTS

I’m taking some liberty with the ‘real’ API here but the principle is the same. We don’t have a blocking fetch method at the moment but maybe we should!

We call a method on the workflow instance to fetch the output which will block until it’s available.

The result is in the form of an internal identifier in the reference system, so we then use the reference service to render the data referenced by that identifier to a string and print it to the console.

    result = wf_instance.blocking_fetch("output");
    System.out.println(ref_service.getAsString(result));
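Since the slide says there is no blocking fetch method at the moment, one way such a method could be layered on top of a callback-style result API is sketched below. `CallbackInstanceSketch` is a hypothetical stand-in that delivers results to a listener from another thread; nothing here is the real workflow instance interface.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.BiConsumer;

// Hypothetical callback-style workflow instance: results are delivered to a
// listener as (port name, reference identifier) pairs.
class CallbackInstanceSketch {
    private volatile BiConsumer<String, String> listener = (port, ref) -> { };

    void onResult(BiConsumer<String, String> l) { listener = l; }

    // Simulates the enactor producing a result on another thread.
    void produce(String port, String ref) {
        new Thread(() -> listener.accept(port, ref)).start();
    }
}

class BlockingFetchSketch {
    // Registers the listener up front and returns a future for the named port.
    static CompletableFuture<String> resultFor(CallbackInstanceSketch instance, String portName) {
        CompletableFuture<String> future = new CompletableFuture<>();
        instance.onResult((port, ref) -> {
            if (port.equals(portName)) future.complete(ref);
        });
        return future;
    }

    static String demo() {
        CallbackInstanceSketch instance = new CallbackInstanceSketch();
        CompletableFuture<String> output = resultFor(instance, "output");
        instance.produce("output", "ref-42"); // stands in for pushing inputs
        return output.join();                 // blocks until the callback fires
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The listener is attached before any data is pushed, so a result that arrives quickly cannot be missed; the blocking call is then just a wait on the future.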

Page 17: The Taverna 2 Platform

THE ENTIRE APPLICATION

    plat = new T2Platform("path/to/conf.xml");

    loader = plat.getBean("t2.workflow.loader");
    workflow = loader.parseWorkflow("http://foo.com/wf.xml");

    ref_service = plat.getBean("t2.reference.service");
    input1 = ref_service.register(new File("some.data"));
    input2 = ref_service.register("string value");

    enactor = plat.getBean("t2.enactor");
    wf_instance = enactor.createInstance(workflow);
    wf_instance.push("input1", input1);
    wf_instance.push("input2", input2);

    result = wf_instance.blocking_fetch("output");
    System.out.println(ref_service.getAsString(result));
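To show how the pseudo-code above might look once Java type declarations are added, here is a self-contained sketch. Every type in it (`StubPlatform`, `WorkflowLoader`, `ReferenceService`, `Enactor`, `WorkflowInstance`, `Workflow`) is a hypothetical stand-in written for this example, with trivial in-memory behaviour so that it runs on its own. The stub does not read the configuration file or the data file, and its ‘workflow’ simply joins its two inputs.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

// Hypothetical interfaces standing in for the platform service APIs.
interface Workflow { }
interface WorkflowLoader { Workflow parseWorkflow(String url); }
interface ReferenceService {
    String register(Object data);
    String getAsString(String id);
}
interface WorkflowInstance {
    void push(String port, String ref);
    String blockingFetch(String port);
}
interface Enactor { WorkflowInstance createInstance(Workflow workflow); }

// Stub platform: a real one would build its beans from the Spring XML file.
class StubPlatform {
    private final Map<String, Object> beans = new HashMap<>();
    private final Map<String, Object> data = new HashMap<>();

    StubPlatform(String configPath) {
        final ReferenceService refs = new ReferenceService() {
            public String register(Object d) {
                String id = "ref" + data.size();
                data.put(id, d);
                return id;
            }
            public String getAsString(String id) { return String.valueOf(data.get(id)); }
        };
        WorkflowLoader loader = url -> new Workflow() { };
        // The stub "workflow" joins the data behind its two inputs.
        Enactor enactor = wf -> new WorkflowInstance() {
            private final Map<String, String> in = new HashMap<>();
            public void push(String port, String ref) { in.put(port, ref); }
            public String blockingFetch(String port) {
                String joined = refs.getAsString(in.get("input1")) + "+"
                        + refs.getAsString(in.get("input2"));
                return refs.register(joined);
            }
        };
        beans.put("t2.reference.service", refs);
        beans.put("t2.workflow.loader", loader);
        beans.put("t2.enactor", enactor);
    }

    // Typed lookup keeps the cast in one place.
    <T> T getBean(String name, Class<T> type) { return type.cast(beans.get(name)); }
}

class PlatformExample {
    static String run() {
        StubPlatform plat = new StubPlatform("path/to/conf.xml");

        WorkflowLoader loader = plat.getBean("t2.workflow.loader", WorkflowLoader.class);
        Workflow workflow = loader.parseWorkflow("http://foo.com/wf.xml");

        ReferenceService refService = plat.getBean("t2.reference.service", ReferenceService.class);
        String input1 = refService.register(new File("some.data"));
        String input2 = refService.register("string value");

        Enactor enactor = plat.getBean("t2.enactor", Enactor.class);
        WorkflowInstance wfInstance = enactor.createInstance(workflow);
        wfInstance.push("input1", input1);
        wfInstance.push("input2", input2);

        String result = wfInstance.blockingFetch("output");
        return refService.getAsString(result);
    }

    public static void main(String[] args) { System.out.println(run()); }
}
```

The body of `run()` matches the pseudo-code line for line; the only additions are type declarations and the class argument to `getBean`, which replaces an explicit cast.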

Page 18: The Taverna 2 Platform

CONCLUSION

Going back to the ‘guiding principles’, does this code satisfy them?
- There’s no reference to classloading, and the application would just run from the normal command line.
- There’s no more code than required to do the job. This is a simple example and didn’t address various issues such as context configuration, but those issues weren’t in the summary of the application either, so their corresponding absence in the code is another sign of success.
- The code never refers to implementation types; it is therefore resilient in the face of implementation updates and changes.
- The code is almost self-documenting.
- We can’t make any statements about the extensibility of the platform, but all the other requirements are satisfied.

Page 19: The Taverna 2 Platform

PLUG-IN SUPPORT

Page 20: The Taverna 2 Platform

PLUG-IN MANAGEMENT IN TAVERNA 1.X

- Plug-in code is used to extend Taverna
  - Includes all ‘base’ functionality such as web service invocation, HTTP references etc.
- Plug-in development issues
  - No tooling support
  - Testing is problematic
  - Problems with optional Maven dependencies
  - Can’t use libraries which are not Maven artifacts
- Plug-in usage issues
  - Insufficient metadata to properly describe plug-in functionality
  - Plug-in management service API missing
  - Plug-ins have no presence in the code once loaded

Page 21: The Taverna 2 Platform

PLUG-IN DEVELOPMENT

- Provide tool support
  - Plug-in description generator
  - Plug-in verifier
  - Plug-in registry
- Modified class loading strategy
  - Single class loader per plug-in
  - Allow mix of artifact and non-artifact jars
  - Use Raven to fetch artifact jars, but not to obtain class loaders
- Testing is a special case of application development
  - See previous section
  - We can provide archetypes for integration tests using the platform to simplify testing

Page 22: The Taverna 2 Platform

PLUG-IN DESCRIPTION

- Plain text description
- Short name
- Version, triple numeric with periods
- Development status (enumeration, for example ‘alpha | beta | stable’, to be decided)
- Author list
  - Author name
  - Author affiliation (optional)
  - Author email (optional)
  - Author URL (optional)
  - Author Icon (optional, 128x128 true colour PNG with transparency)
- Tag list
  - Free text tag
- Homepage URL (can be generated by tooling along with the page itself)
- Related link list
  - Link description
  - Link URI (most likely HTTP but potentially other reference)
- Icon, 128x128 true colour PNG with transparency
- Plugin manager background image (arbitrary size, true colour PNG with transparency, used as a background for the plugin manager panel for this plugin)
- Core API version targeted, specified as an artifact (group, artifact, version)
- SPI implementation list
  - Plain text description
  - Short name
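The ‘triple numeric with periods’ version field is the kind of thing a plug-in verifier could check mechanically. A minimal sketch follows; `PluginVersion` and the field names `major`, `minor` and `patch` are assumptions made for this example, since the slide only specifies three numbers separated by periods.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical value type for the "triple numeric with periods" version field.
final class PluginVersion implements Comparable<PluginVersion> {
    private static final Pattern TRIPLE = Pattern.compile("(\\d+)\\.(\\d+)\\.(\\d+)");
    final int major;
    final int minor;
    final int patch;

    private PluginVersion(int major, int minor, int patch) {
        this.major = major;
        this.minor = minor;
        this.patch = patch;
    }

    // True only for strings made of exactly three numbers separated by periods.
    static boolean isValid(String text) {
        return TRIPLE.matcher(text).matches();
    }

    // Parses strings such as "1.2.0"; rejects anything else.
    static PluginVersion parse(String text) {
        Matcher m = TRIPLE.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a numeric triple: " + text);
        }
        return new PluginVersion(Integer.parseInt(m.group(1)),
                Integer.parseInt(m.group(2)), Integer.parseInt(m.group(3)));
    }

    // Compares numerically, component by component.
    @Override
    public int compareTo(PluginVersion other) {
        if (major != other.major) return Integer.compare(major, other.major);
        if (minor != other.minor) return Integer.compare(minor, other.minor);
        return Integer.compare(patch, other.patch);
    }

    @Override
    public String toString() {
        return major + "." + minor + "." + patch;
    }
}
```

Comparing components as numbers rather than as text means that 1.10.0 sorts after 1.9.3, which a plain string comparison would get wrong.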

Page 23: The Taverna 2 Platform

NEW PLUG-IN MANAGER STRUCTURE

[Diagram. Components: Consumer Code (i.e. Taverna Workbench); Plugin Manager Service Bean; SPI Registry (x2); Instance Registry (x2); Download Manager; Raven Repository; Artifact Cache; Non-artifact Repository; Jar Cache. Arrow labels: ‘Install, list, enable, disable... plugins’; ‘Consume SPI instances, receive notification of addition / removal’; ‘Get artifact / jar’.]

Page 24: The Taverna 2 Platform

PLUG-IN MANAGER IMPLICATIONS

- Most places in application code explicitly referencing artifacts and Raven will now reference the plug-in manager
- Code using SPI and instance registries is unchanged
- Workflow serializer changes to reference plugin rather than artifact specifier
- Plug-ins can be disabled on the fly
- Plug-in manager appears as infrastructure bean in platform
- Plug-in management functionality is generic
  - Part of the generic platform rather than the Taverna specific part
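Consumer code receives notification when instances are added or removed, for example when a plug-in is disabled on the fly. A registry with that behaviour could be sketched as below. `InstanceRegistrySketch`, `RegistryListener` and the instance name `wsdl-activity` are hypothetical, and this is not the registry interface in the current code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical listener type; consumers implement it to track changes.
interface RegistryListener<T> {
    void added(T instance);
    void removed(T instance);
}

// Hypothetical registry of extension point instances.
class InstanceRegistrySketch<T> {
    private final List<T> instances = new CopyOnWriteArrayList<>();
    private final List<RegistryListener<T>> listeners = new CopyOnWriteArrayList<>();

    void addListener(RegistryListener<T> listener) { listeners.add(listener); }

    // Called when a plug-in providing this instance is enabled.
    void register(T instance) {
        instances.add(instance);
        for (RegistryListener<T> l : listeners) l.added(instance);
    }

    // Called when the providing plug-in is disabled; notifies only if present.
    void unregister(T instance) {
        if (instances.remove(instance)) {
            for (RegistryListener<T> l : listeners) l.removed(instance);
        }
    }

    // Snapshot of the instances currently available.
    List<T> current() { return new ArrayList<>(instances); }
}

class RegistryDemo {
    static List<String> run() {
        final List<String> events = new ArrayList<>();
        InstanceRegistrySketch<String> registry = new InstanceRegistrySketch<>();
        registry.addListener(new RegistryListener<String>() {
            public void added(String s) { events.add("added " + s); }
            public void removed(String s) { events.add("removed " + s); }
        });
        registry.register("wsdl-activity");
        registry.unregister("wsdl-activity");
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

A consumer such as a workbench would use the two callbacks to refresh whatever it shows to the user, without needing to know which plug-in supplied an instance.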

Page 25: The Taverna 2 Platform

PLUG-IN MANAGER IMPLEMENTATION

- The plug-in manager is instantiated and configured through Spring
- Plug-in manager and related components are available in the root class loader for an application
- Exposed as named beans in the Spring context
- Applications access infrastructure properties through the Spring context
- Access through interfaces not implementation types
- Next slide shows actual implementation components in the current code

Page 26: The Taverna 2 Platform

IMPLEMENTATION STRUCTURE

[Diagram: implementation components and their file paths.
- Download Manager
- Raven
- Jar Manager: base/artifacts/agrp/aid-aver.jar
- POM.xml Parser: base/artifacts/agrp/aid-aver.pom, base/artifacts/agrp/aid-aver.jar
- Plug-in Parser: base/pgrp/pid-pver.xml
- Plug-in Manager: base/pgrp/pid-pver/jars/jarname.jar
- SPI Registry
Other labels: system artifact set; base (x4); property; classname. Key: Component, file path.]
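The file path patterns in the diagram can be written down as a small helper. This sketch reads `agrp`, `aid` and `aver` as an artifact's group, id and version, and `pgrp`, `pid` and `pver` as the same for a plug-in; that reading is an assumption, as is the class name `RepositoryLayoutSketch`.

```java
// Hypothetical helper mirroring the path patterns shown in the diagram.
// The meaning given to group, id and version is an interpretation of the
// abbreviations agrp/aid/aver and pgrp/pid/pver, not confirmed by the source.
class RepositoryLayoutSketch {
    // base/artifacts/agrp/aid-aver.jar
    static String artifactJar(String base, String group, String id, String version) {
        return base + "/artifacts/" + group + "/" + id + "-" + version + ".jar";
    }

    // base/artifacts/agrp/aid-aver.pom
    static String artifactPom(String base, String group, String id, String version) {
        return base + "/artifacts/" + group + "/" + id + "-" + version + ".pom";
    }

    // base/pgrp/pid-pver.xml
    static String pluginDescriptor(String base, String group, String id, String version) {
        return base + "/" + group + "/" + id + "-" + version + ".xml";
    }

    // base/pgrp/pid-pver/jars/jarname.jar
    static String pluginJar(String base, String group, String id, String version, String jarName) {
        return base + "/" + group + "/" + id + "-" + version + "/jars/" + jarName + ".jar";
    }

    public static void main(String[] args) {
        System.out.println(artifactJar("base", "agrp", "aid", "aver"));
        System.out.println(pluginJar("base", "pgrp", "pid", "pver", "jarname"));
    }
}
```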

Page 27: The Taverna 2 Platform

WHERE NEXT?

Page 28: The Taverna 2 Platform

TIMESCALE

- 1st November: Initial generic platform implementation
  - Early version released to public CVS 4th November
- 1st December: Specification of Taverna specific platform services
  - Early implementation of enactor, workflow loader and reference service components expected mid-November
- 1st January: Initial implementation of Taverna services
- 16th & 17th February: 2 day Developer Workshop in Manchester
  - Sign up at http://spreadsheets.google.com/viewform?key=pGRSW--IktWUFVQVLleNzIw&hl=en