Automated Syntactic Mediation for Web Service Integration

Post on 11-May-2015



Martin Szomszor (mns03r@ecs.soton.ac.uk)

Presentation Outline

• Contemporary workflow design pattern
  • Using workflow to capture experimentation process
  • Discovery of services using semantics

• Problem description
  • Syntactic incompatibility

• Using ontologies for mediation

• Architecture to support syntactic mediation

• Mapping Language
  • Overview of mapping mechanics
  • Implementation description

• Future work
  • Dynamic discovery of Mappings

In Silico Experimentation

• Computational experimentation

• Access to resources provided by Web Services

• Users map experimental process to workflow

• Tasks are realised by service instances

Service Discovery

• Users need to find services to fulfil given tasks, e.g.
  • Retrieve sequence data
  • Sequence alignment (Blast)

• There are lots of services!

• Interface definitions can be terse, often undocumented and sometimes cryptic

• Limited semantic value

• Manual discovery not ideal

Semantic Discovery

• Support users in the discovery of services according to domain specific terminology

• Annotate service descriptions with concepts from an ontology (PEDRO annotation tool)
  • Input and output types are assigned a semantic type by a reference to an ontology concept

• Discover services by:
  • Task performed
  • Resources used
  • Input and output semantic types
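Discovery by task and by input and output semantic types can be pictured as a filter over annotated service descriptions. The following is only a minimal sketch: the ServiceDescription class, the registry entries and the concept names are hypothetical and are not taken from PEDRO or from any real registry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDescription:
    name: str
    task: str         # task performed, as an ontology concept name
    input_type: str   # semantic type of the input
    output_type: str  # semantic type of the output


# Hypothetical annotated services; all names are illustrative assumptions.
REGISTRY = [
    ServiceDescription("GetEntry", "retrieving", "Accession_ID", "Sequence_Data_Record"),
    ServiceDescription("Blast", "aligning", "Sequence_Data_Record", "Alignment_Report"),
    ServiceDescription("Plot", "rendering", "Alignment_Report", "Image"),
]


def discover(registry, task=None, input_type=None, output_type=None):
    """Return the services that match every criterion that was supplied."""
    return [
        s for s in registry
        if (task is None or s.task == task)
        and (input_type is None or s.input_type == input_type)
        and (output_type is None or s.output_type == output_type)
    ]


# Which services consume a sequence data record?
matches = discover(REGISTRY, input_type="Sequence_Data_Record")
print([s.name for s in matches])
```

A real registry would reason over the ontology (for example, subsumption between concepts) rather than compare strings for equality.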

Use Case

• Common bioinformatics task:
  I. Find sequence data for a given id (accession number)
  II. Perform sequence alignment to discover similar sequence data
  III. Obtain results

• Itself a complete workflow, but likely to feature in larger workflows too

Semantically Driven Workflow Design

• When building workflows, users connect services because they are deemed semantically compatible:
  • Output semantic type equivalent to input semantic type

Syntactic Compatibility

• However, semantically compatible service interfaces may not be syntactically compatible (i.e. they may use different data formats)
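The distinction between the two kinds of compatibility can be made concrete with a small check. This is a sketch under stated assumptions: the Port class, its two fields and the format names are hypothetical, and semantic equivalence is reduced to string equality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    semantic_type: str  # ontology concept assigned by annotation
    data_format: str    # concrete schema used on the wire


def needs_mediation(output: Port, input_: Port) -> bool:
    """True when two semantically compatible ports use different data formats."""
    if output.semantic_type != input_.semantic_type:
        raise ValueError("ports are not semantically compatible")
    return output.data_format != input_.data_format


# Hypothetical ports: same concept, different formats.
producer = Port("Sequence_Data_Record", "DDBJ-XML")
consumer = Port("Sequence_Data_Record", "FASTA")
print(needs_mediation(producer, consumer))
```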

Syntactic Mediation

• When a mismatch in data formats occurs within a workflow, a translation component is required

• Current solutions are manual:
  • Identify when mismatch occurs
  • Derive conversion requirements
  • Find suitable conversion tool
  • Create new translation components if necessary

• These conversion components come in a variety of guises:
  • Translation scripts (e.g. XSLT)
  • Bespoke code (Java and Perl)
  • Web Services

• Simple solution: an adaptor for each pair of compatible data formats
  • O(n²) adaptors for n formats
  • Poor scalability

• Alternative: introduce an intermediate representation
  • O(n) adaptors
  • Less effort introducing new formats

• Data integration problem
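The scaling argument can be checked with a little arithmetic. The sketch below assumes one-way adaptors and that every format must be convertible to every other; the function names are illustrative only.

```python
def direct_adaptors(n: int) -> int:
    """One-way adaptors needed to convert between every ordered pair of n formats."""
    return n * (n - 1)


def intermediate_adaptors(n: int) -> int:
    """One-way adaptors needed when each format maps to and from one shared representation."""
    return 2 * n


for n in (3, 6, 12):
    print(n, direct_adaptors(n), intermediate_adaptors(n))
```

With six formats this gives 30 direct adaptors against 12 via the intermediate representation, and adding a seventh format costs 12 new direct adaptors but only 2 intermediate ones.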

Conversion Approaches

[Figure: two diagrams of conversions among the data formats a, b, c, d, e and f]

Three Layer View

• Physical Layer
  • Data can be stored in different formats:
    • E.g. binary, text, XML, relational database, etc.

• Logical Layer
  • Organisation of data elements described by a schema:
    • E.g. XML Schema, relational database model

• Conceptual Layer
  • What the data means (semantics)
    • E.g. ontology, description logic, Entity Relation Diagram

Intermediate Representation

• The data integration field has used this solution in similar application domains:
  • TAMBIS Project [Stevens et al 2003]
    • Complex query formulation over diverse bioinformatics information sources
  • SEEK Project [Bowers and Ludascher 2004]
    • An ontology-driven framework for geographic data transformation in scientific workflows

• Intermediate representation in the form of a conceptual model
  • E.g. ontology, description logic

Architecture Requirements

1) OWL ontologies capture data format structure and semantics:
  • Existing service ontologies [e.g. C. Wroe et al 2003] can be extended with concepts and properties to describe data contents

2) Modular and composable mapping language
  • Mapping overhead reduced when service providers expose multiple operations over single schemas
  • When schemas are combined to form new datasets, existing mappings can be reused

Architecture Requirements

3) Invocation of arbitrary Web Services
  • Grid and WS applications pull resources from multiple providers into a dynamic and volatile environment
  • Must be able to invoke previously unseen services

4) Minimise annotation overhead
  • Reuse existing Semantic Web Service description methods
  • Input and output types are assigned a concept (semantic type)

Mapping XML to OWL

• Problem can be simplified by assuming a canonical XML representation for OWL concept instances [OWL-XI]
  • XML serialisations of OWL concepts commonly used

• However, XML Schemas to validate individuals do not exist

• To support validation, OWL instance Schemas [OWL-XIS] are generated from ontologies
  • Concept hierarchies computed
  • Jena + Java implementation

• Enables us to view the translation as an XML to XML transformation

Architecture Diagram

Service providers describe their Web Service interfaces using WSDL. Data consumed and produced is defined using XML Schema.

OWL Ontologies are created to describe the information contained within Bioinformatics data structures.

Serialisation and Realisation Mappings describe how to transform XML documents to and from [OWL-XI].

Semantic Annotations associate each WSDL Message part with a concept from the ontology.

[OWL-XIS] are generated to validate ontology instances.

Configurable Mediator

• Input:
  • Source data instance
  • Source schema
  • Realisation Mapping (source format -> ontology)
  • Ontology Definition
  • Serialisation Mapping (ontology -> destination format)
  • Destination Schema

• Output: Destination data instance

• Conversion performed via intermediate OWL concept instance
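The two-step conversion through an intermediate concept instance can be sketched as a composition of two mapping functions. This is not the mediator described in the talk: the mediate function, the two example mappings and the destination "entry" element are hypothetical, element names are used without namespaces, and schema validation is left out.

```python
import xml.etree.ElementTree as ET
from typing import Callable

# A mapping is modelled here as a function from one XML tree to another (assumption).
Mapping = Callable[[ET.Element], ET.Element]


def mediate(source: ET.Element, realise: Mapping, serialise: Mapping) -> ET.Element:
    """Translate source -> intermediate concept instance -> destination."""
    concept_instance = realise(source)   # source format -> ontology
    return serialise(concept_instance)   # ontology -> destination format


def realise_ddbj(doc: ET.Element) -> ET.Element:
    """Hypothetical realisation mapping: build a concept instance from a DDBJ-like record."""
    inst = ET.Element("Sequence_Data_Record")
    ET.SubElement(inst, "accession_id").text = doc.findtext("ACCESSION")
    return inst


def serialise_entry(inst: ET.Element) -> ET.Element:
    """Hypothetical serialisation mapping: write the instance in an invented target format."""
    out = ET.Element("entry")
    out.set("id", inst.findtext("accession_id"))
    return out


src = ET.fromstring("<DDBJXML><ACCESSION>AB000059</ACCESSION></DDBJXML>")
dst = mediate(src, realise_ddbj, serialise_entry)
print(ET.tostring(dst, encoding="unicode"))
```

Because the two mappings only meet at the concept instance, either one can be swapped without touching the other, which is the point of configuring the mediator rather than hard-wiring a converter.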


Mapping Mechanics

Source Document:

<S>
  <X>foo</X>
  <X>bar</X>
</S>

Destination Document:

<D>
  <Y>foo</Y>
  <Y>bar</Y>
</D>

Mappings:

m1: S/X -> D/Y
m2: X/$ -> Y/$

Mapping Mechanics

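[Figure: tree views of the source document (S with two X children holding the xsd:string values "foo" and "bar") and the destination document (D with two Y children holding the xsd:string values "foo" and "bar"), related by the mappings m1: S/X -> D/Y and m2: X/$ -> Y/$]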
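The effect of m1 and m2 on the example document can be reproduced by hand. The sketch below hard-codes the two mappings instead of interpreting a mapping document, and it assumes that all matches share a single D root while each matched X produces its own Y; the translate function is illustrative, not the talk's implementation.

```python
import xml.etree.ElementTree as ET


def translate(source: ET.Element) -> ET.Element:
    """Apply m1 (S/X -> D/Y) and m2 (X/$ -> Y/$) to an S document."""
    if source.tag != "S":
        raise ValueError("expected an <S> root element")
    dest = ET.Element("D")             # one D root shared by every match
    for x in source.findall("X"):      # m1: each X child of S yields a Y child of D
        y = ET.SubElement(dest, "Y")
        y.text = x.text                # m2: the literal value of X becomes the literal of Y
    return dest


src = ET.fromstring("<S><X>foo</X><X>bar</X></S>")
dst = translate(src)
print(ET.tostring(dst, encoding="unicode"))  # <D><Y>foo</Y><Y>bar</Y></D>
```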

Example M-Binding

<binding xmlns="http://www.ecs.soton.ac.uk/~mns03r/mapping/example"
         xmlns:sns="http://jaco.ecs.soton.ac.uk/schema/source"
         xmlns:dns="http://jaco.ecs.soton.ac.uk/schema/destination">

  <!-- m1: S/X -> D/Y -->
  <mapping id="1">
    <source match="sns:S/sns:X"/>
    <destination create="dns:D[join]/dns:Y[branch]"/>
  </mapping>

  <!-- m2: X/$ -> Y/$ -->
  <mapping id="2">
    <source match="sns:X/$"/>
    <destination create="dns:Y[join]/$"/>
  </mapping>

</binding>

Bio Example

<ddbj:DDBJXML>
  <ddbj:ACCESSION>AB000059</ddbj:ACCESSION>
  <ddbj:FEATURES>
    <ddbj:source>
      <ddbj:location>1..1755</ddbj:location>
      <ddbj:qualifiers name="isolate">Som1</ddbj:qualifiers>
      <ddbj:qualifiers name="lab_host">Felis domesticus</ddbj:qualifiers>
    </ddbj:source>
  </ddbj:FEATURES>
</ddbj:DDBJXML>

<ont:Sequence_Data_Record>
  <ont:accession_id>AB000059</ont:accession_id>
  <ont:has_feature>
    <ont:Feature_Source>
      <ont:isolate>Som1</ont:isolate>
      <ont:lab_host>Felis domesticus</ont:lab_host>
      <ont:location>
        <ont:Feature_Location>
          <ont:start>1</ont:start>
          <ont:end>1755</ont:end>
        </ont:Feature_Location>
      </ont:location>
    </ont:Feature_Source>
  </ont:has_feature>
</ont:Sequence_Data_Record>

• Simple One-to-One
• Element and literal
• Many-to-Many
• Split literal value
• Predicate evaluation
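The translation from the DDBJ record to the ontology instance can be written out by hand to show what each kind of mapping has to do. This is a sketch, not the mapping language itself: the realise function is hypothetical, namespaces are dropped for brevity, and splitting the location on ".." is an assumption based on the single value shown.

```python
import xml.etree.ElementTree as ET

DDBJ = """
<DDBJXML>
  <ACCESSION>AB000059</ACCESSION>
  <FEATURES>
    <source>
      <location>1..1755</location>
      <qualifiers name="isolate">Som1</qualifiers>
      <qualifiers name="lab_host">Felis domesticus</qualifiers>
    </source>
  </FEATURES>
</DDBJXML>
"""


def realise(doc: ET.Element) -> ET.Element:
    """Build a Sequence_Data_Record instance from a DDBJ-like record."""
    record = ET.Element("Sequence_Data_Record")
    # One-to-one element and literal: ACCESSION -> accession_id
    ET.SubElement(record, "accession_id").text = doc.findtext("ACCESSION")
    for src in doc.findall("FEATURES/source"):
        feature = ET.SubElement(ET.SubElement(record, "has_feature"), "Feature_Source")
        # Predicate evaluation: select each qualifier by the value of its name attribute
        for name in ("isolate", "lab_host"):
            qualifier = src.find(f"qualifiers[@name='{name}']")
            if qualifier is not None:
                ET.SubElement(feature, name).text = qualifier.text
        # Split literal value: "1..1755" -> start and end
        location_text = src.findtext("location")
        if location_text:
            start, _, end = location_text.partition("..")
            location = ET.SubElement(ET.SubElement(feature, "location"), "Feature_Location")
            ET.SubElement(location, "start").text = start
            ET.SubElement(location, "end").text = end
    return record


record = realise(ET.fromstring(DDBJ))
print(ET.tostring(record, encoding="unicode"))
```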

Example M-Binding

<binding xmlns="http://www.ecs.soton.ac.uk/~mns03r/mapping/ddbj-to-ont-mapping"
         xmlns:sns="http://jaco.ecs.soton.ac.uk/schema/DDBJ"
         xmlns:dns="http://jaco.ecs.soton.ac.uk/ont/sequencedata">

  <!-- The ACCESSION element becomes an accession_id -->
  <mapping id="1">
    <source match="sns:DDBJXML/sns:ACCESSION"/>
    <destination create="dns:Sequence_Data_Record[join]/dns:accession_id[branch]"/>
  </mapping>

  <!-- The ACCESSION literal is copied across -->
  <mapping id="2">
    <source match="sns:ACCESSION/$"/>
    <destination create="dns:accession_id[join]/$"/>
  </mapping>

  <!-- Each FEATURES/source becomes a has_feature/Feature_Source -->
  <mapping id="3">
    <source match="sns:DDBJXML/sns:FEATURES/sns:source"/>
    <destination create="dns:Sequence_Data_Record[join]/dns:has_feature[branch]/dns:Feature_Source[branch]"/>
  </mapping>

  <!-- Only qualifiers named lab_host match; the nested mapping copies the literal -->
  <mapping id="4">
    <source match='sns:source/sns:qualifiers[sns:qualifiers/sns:name/$ = "lab_host"]'/>
    <destination create="dns:Feature_Source[join]/dns:lab_host[branch]"/>
    <mapping>
      <source match="sns:qualifiers/$"/>
      <destination create="dns:lab_host[join]/$"/>
    </mapping>
  </mapping>

  <!-- The part of the location literal before the first "." becomes the start -->
  <mapping id="5">
    <source match="sns:location/$^[^.]+"/>
    <destination create="dns:Location[join]/dns:start[branch]/$"/>
  </mapping>

</binding>
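Mapping 5 appears to attach a regular expression to the literal selector. Reading the text after "$" as a pattern applied to the literal value is an assumption, and the pattern used for the end value below is a hypothetical counterpart that does not appear in the slides. Under that reading, the split of the location literal can be checked as follows.

```python
import re

location = "1..1755"  # literal value of the location element in the Bio Example

# Pattern taken from mapping 5: everything before the first "."
start = re.search(r"^[^.]+", location).group(0)

# Assumed counterpart for the end value (not in the source): everything after the last "."
end = re.search(r"[^.]+$", location).group(0)

print(start, end)
```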

Conclusions

• Provide infrastructure to support syntactic mediation:
  • OWL Ontologies to capture data format structure and semantics (reuse existing annotations)
  • Mapping Language to describe relationships between XML Schemas and OWL Ontologies
    • Modular and composable
  • Configurable Mediator to consume mappings and perform document translation
  • Dynamic Web Service Invoker

Future Work

• Dynamic discovery of Mappings
  • Already implemented using the GRIMOIRES registry and WSDL to describe mapping capabilities
    • [Szomszor, Payne, Moreau 2006] UK All Hands, Nottingham

• Annotation Tool
  • Mappings are complex and difficult to write by hand
  • Web based annotation tool

Questions and Comments?