OGC Sept 2010 Meta-propagation of uncertainties within workflows

© 2010 Open Geospatial Consortium, Inc. Workflow Uncertainty using a Metamodel Framework and Metadata for Data and Processes. OGC Technical Committee, September 20-24, 2010, Toulouse, France. Didier G Leibovici and Amir Pourabdollah, Centre for Geospatial Science, University of Nottingham.

description

To begin with, let us quote QA4EO (Quality Assurance for Earth Observation): “If the vision of GEOSS is to be achieved, Quality Indicators (QIs) should be ascribed to data and, in particular, to delivered information products, at each stage of the data processing chain - from collection and processing to delivery. A QI should provide sufficient information to allow all users to readily evaluate a product’s suitability for their particular application, i.e. its “fitness for purpose”. To ensure that this process is internationally harmonised and consistent, the QI needs to be based on a documented and quantifiable assessment of evidence demonstrating the level of traceability to internationally agreed (where possible SI) reference standards. Such standards may be manmade, natural or intrinsic in nature. The documented evidence should include a description of the processes used, together with an uncertainty budget (or other appropriate quality performance measure). The guidelines of QA4EO provide a template and guidance on how to achieve this in a harmonised and robust manner.”

For interoperability purposes, each dataset and process registered within EuroGEOSS possesses appropriate metadata elements. The metadata description and the semantics attached to each component of a workflow (datasets and processing services) allow these components to be updated or swapped. When the quality of the components varies, the outputs of the workflow can become unreliable. Knowing the level of uncertainty in each dataset involved and the sensitivity of the processing steps, it is possible to define the quality of a workflow and the level of uncertainty of its outputs by error propagation principles.

Reusing a given model encapsulated in a scientific workflow implies running the workflow either with the same datasets, not necessarily coming from the same sources, or with different datasets, which do not necessarily have the scale required by the workflow. From error propagation principles and the quality metadata of the workflow components, the effect on workflow quality of using datasets from different sources or at different scales can be assessed. As part of the integrated modelling activity, this assessment will help the modeller choose appropriate datasets or refine the workflow model, for example by considering data assimilation, downscaling, or multiple-scale integration steps within the scientific model and its associated workflow. The workflow quality assessment will also help the modeller swap or refine the processing steps. Under these modelling activities, the workflow is seen as the concrete support of a conceptual model and evolves as the conceptual model does. On top of the quality descriptors existing in ISO 19157, the present document describes the requirements for uncertainty analysis within scientific workflows.

Transcript of OGC Sept 2010 Meta-propagation of uncertainties within workflows

Page 1: OGC Sept 2010 Meta-propagation of uncertainties within workflows

© 2010 Open Geospatial Consortium, Inc.

Workflow Uncertainty using a Metamodel Framework and Metadata for Data and Processes

OGC Technical Committee

September 20-24, 2010

Toulouse, France

Didier G Leibovici and Amir Pourabdollah

Centre for Geospatial Science

University of Nottingham

Page 2: OGC Sept 2010 Meta-propagation of uncertainties within workflows

outline

• integrated modelling / scientific workflow
model building / reusing / user’s perspective / rescaling / quality assessment

• uncertainty / sensitivity analyses for workflows
error propagation / uncertainty analysis / emulator (“metamodelling”) / use of metadata

• metadata for data and for processes
quality metadata / UncertML / quality principles & measures for processes

• metamodel for workflows
notation / encoding / enrichment

• towards Web Workflow Service?
WPS / WWS / requirements for workflow assessment

FP7 European project

Page 3: OGC Sept 2010 Meta-propagation of uncertainties within workflows

OGC initiatives related to workflows

• OWS-5 http://www.opengeospatial.org/projects/initiatives/ows-5

conflation workflow and SWE workflow

• OWS-6 http://www.opengeospatial.org/projects/initiatives/ows-6

GeoProcessing Workflow, Decision Support Service

http://www.opengeospatial.org/pub/www/ows6/web_files/ows6.html

Page 4: OGC Sept 2010 Meta-propagation of uncertainties within workflows

OGC OWS-5 conflation workflow

Page 5: OGC Sept 2010 Meta-propagation of uncertainties within workflows

OGC OWS-6 landslide sensor geoprocessing workflow

Page 6: OGC Sept 2010 Meta-propagation of uncertainties within workflows

Debris flow operational scenario

Page 7: OGC Sept 2010 Meta-propagation of uncertainties within workflows

integrated modelling / scientific workflow

model building

reusing

user’s perspective

multidiscipline

rescaling

quality assessment

uncertainties

Page 8: OGC Sept 2010 Meta-propagation of uncertainties within workflows

integrated modelling / scientific workflow

• representation: BPMN

toy example: greenness model

Data3 = P1(Data1, Data2)

Data3 = P1’(Data1, Data2, Data7)

Data6 = P2(Data3, Data4, Data5)

[BPMN diagram of the toy workflow; labels: P1’, D7]

Page 9: OGC Sept 2010 Meta-propagation of uncertainties within workflows

uncertainty / accuracy / sensitivity

Page 10: OGC Sept 2010 Meta-propagation of uncertainties within workflows

Page 11: OGC Sept 2010 Meta-propagation of uncertainties within workflows

uncertainty / accuracy / sensitivity

Page 12: OGC Sept 2010 Meta-propagation of uncertainties within workflows

uncertainty / accuracy / sensitivity

• error propagation (via the model)

– variables interaction

– spatial dependence of uncertainties

sensitivity and uncertainty analysis

sampling design and model building

sampling design and propagation
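
For reference, error propagation through a model is often approximated to first order (Taylor expansion). The notation below, with a model Y = f(X_1, ..., X_n) and input variances \sigma_{X_i}^2, is assumed here for illustration and does not come from the slide. The covariance term is where the "variables interaction" bullet enters; spatial dependence of uncertainties would appear in the same way, as covariances between locations.

```latex
\operatorname{Var}(Y) \approx
  \sum_{i=1}^{n} \left(\frac{\partial f}{\partial X_i}\right)^{2} \sigma_{X_i}^{2}
  + 2 \sum_{i<j} \frac{\partial f}{\partial X_i}\,\frac{\partial f}{\partial X_j}\,
    \operatorname{Cov}(X_i, X_j)
```

When the inputs are independent, the second sum vanishes and the output variance is a weighted sum of the input variances.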

Page 13: OGC Sept 2010 Meta-propagation of uncertainties within workflows

uncertainty / accuracy / sensitivity

• uncertainty analysis: what is the output uncertainty?

• and sensitivity analysis: where does the output uncertainty come from?

1. use quality metadata about inputs (distribution, variance, ...)

2. design the sampling accordingly

3. look at the output distribution, variance, ... and compare with inputs

A. using the model

B. using an emulator (see UncertWeb project)

C. can we do a simple estimation without 2 and 3?

for each atomic process

Workflow level
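
Steps 1 to 3 with option A (using the model itself) can be sketched as a small Monte Carlo experiment. This is only an illustration under stated assumptions: `toy_process` is a hypothetical atomic process, the inputs are taken as independent and Gaussian, and the `(mean, standard deviation)` pairs stand in for values that would be read from quality metadata.

```python
import random
import statistics

def toy_process(x1, x2):
    """Hypothetical atomic process standing in for one workflow step."""
    return 2.0 * x1 + 0.5 * x2

def monte_carlo(process, inputs, n=20000, seed=42):
    """Propagate input uncertainty through `process` by random sampling.

    `inputs` maps an input name to (mean, standard deviation); inputs are
    assumed independent and Gaussian.
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(n):
        sample = {k: rng.gauss(m, s) for k, (m, s) in inputs.items()}
        outputs.append(process(**sample))
    return statistics.mean(outputs), statistics.variance(outputs)

def variance_shares(process, inputs, **kw):
    """Crude sensitivity analysis: output variance when only one input varies."""
    shares = {}
    for name in inputs:
        frozen = {k: (m, s if k == name else 0.0) for k, (m, s) in inputs.items()}
        shares[name] = monte_carlo(process, frozen, **kw)[1]
    return shares

# Step 1: quality metadata about the inputs (assumed values).
quality_metadata = {"x1": (10.0, 1.0), "x2": (4.0, 2.0)}
# Steps 2 and 3: sample, run the model, look at the output variance.
mean_y, var_y = monte_carlo(toy_process, quality_metadata)
# Sensitivity: where does the output variance come from?
shares = variance_shares(toy_process, quality_metadata)
```

For this linear toy process the analytical output variance is 4 x 1 + 0.25 x 4 = 5, of which 4 comes from `x1` and 1 from `x2`, so the sampled estimates can be checked against those values.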

Page 14: OGC Sept 2010 Meta-propagation of uncertainties within workflows

propagating thematic uncertainty

[Diagram: two versions of a process with inputs Z, X̂1, X̂2, X̂3 and estimated output Ŷ; the question mark asks whether the output variance of one is =, > or < that of the other.]

Page 15: OGC Sept 2010 Meta-propagation of uncertainties within workflows

propagating thematic uncertainty

[Diagram: the same two versions of the process with inputs Z, X̂1, X̂2, X̂3 and output Y, compared using =, ~, >, <, <<, >>.]

• is […] in the “tolerance” of […] according to […]? (~)

• if […] then […]

• if […]

X̂1

Sensitivity information

Page 16: OGC Sept 2010 Meta-propagation of uncertainties within workflows

propagating thematic uncertainty

[Diagram: the same two versions of the process with inputs Z, X̂1, X̂2, X̂3 and output Y, compared using =, ~, >, <, <<, >>.]

Need more than sensitivity information

Need a kind of meta-sensitivity, i.e. for various sampling variances, a variance transfer function
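
One way to picture a variance transfer function is as a table of output variances computed once for a range of input variances, then stored with the process and looked up later without re-running the model. The sketch below assumes a hypothetical non-linear process `toy_process`, a single Gaussian input, and piecewise-linear interpolation between tabulated points; none of these choices come from the slides.

```python
import random
import statistics

def toy_process(x):
    """Hypothetical non-linear process; any callable could be used instead."""
    return x * x

def output_variance(process, mean, variance, n=20000, seed=1):
    """Monte Carlo estimate of Var(process(X)) for X ~ N(mean, variance)."""
    rng = random.Random(seed)
    sd = variance ** 0.5
    return statistics.variance([process(rng.gauss(mean, sd)) for _ in range(n)])

def variance_transfer_table(process, mean, input_variances):
    """Tabulate the output variance for a range of input variances."""
    return [(v, output_variance(process, mean, v)) for v in input_variances]

def interpolate(table, v):
    """Piecewise-linear lookup in the tabulated transfer function."""
    for (v0, w0), (v1, w1) in zip(table, table[1:]):
        if v0 <= v <= v1:
            return w0 + (w1 - w0) * (v - v0) / (v1 - v0)
    raise ValueError("input variance outside the tabulated range")

# Computed once, then kept as process metadata.
table = variance_transfer_table(toy_process, mean=3.0,
                                input_variances=[0.0, 0.5, 1.0, 2.0])
# Later: estimate the output variance for a new dataset without sampling.
estimate = interpolate(table, 0.75)
```

For this toy process the exact relation is Var(Y) = 4 x mean^2 x v + 2 x v^2, so the table can be compared with 0, 18.5, 38 and 80 for the four tabulated input variances.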

Page 17: OGC Sept 2010 Meta-propagation of uncertainties within workflows

metadata for data and for processes

• ISO standards (data and services)

19115, 19113, 19114, 19135, 19138, 19119, (19139)

ISO 19113: Quality principles; ISO 19114: Quality evaluation procedures; ISO 19115: Metadata; ISO 19138: Data quality measures; ISO 19135: Registration

• UncertML (OGC discussion paper)

encoding uncertainty measures

Page 18: OGC Sept 2010 Meta-propagation of uncertainties within workflows

metadata for data

Table 1: Data quality elements and data quality sub-elements with definitions (ISO 19113)

Page 19: OGC Sept 2010 Meta-propagation of uncertainties within workflows

metadata for data

Page 20: OGC Sept 2010 Meta-propagation of uncertainties within workflows

metadata for processes (proposal)

Page 21: OGC Sept 2010 Meta-propagation of uncertainties within workflows

metadata for processes (proposal)

Page 22: OGC Sept 2010 Meta-propagation of uncertainties within workflows

Metadata for processes / basic measures

Page 23: OGC Sept 2010 Meta-propagation of uncertainties within workflows

Metadata for processes / basic measures

• encoding using the same structure as in ISO 19115 / ISO 19139 for data quality

DQ_element → PQ_element

• registration of measures (ISO 19135)

PQ_ConflationInformationLoss, PQ_ThematicClassificationPropagation, PQ_QuantitativeAttributePropagation, PQ_ConceptualSemanticConformance, PQ_DomainConsistency, PQ_TopologicalPreservation

Page 24: OGC Sept 2010 Meta-propagation of uncertainties within workflows

Metadata workflow quality / metadata propagation

Dynamic metadata, e.g.

- discrepancy of scales (data chosen vs expected input)

- capitalising on uses: dynamic also by web 2.0

- parameter choices

model building

reusing

user’s perspective

multidiscipline

rescaling

quality assessment

Page 25: OGC Sept 2010 Meta-propagation of uncertainties within workflows

metamodel for workflows

• representing / storing & navigating / executing

• notation / encoding / enrichment / engine

BPMN / XPDL / (extensions) / XPDL or BPEL engine

PNML (Petri nets)

• enrichment with metadata (quality element)

• enrichment with semantics related to quality (tags)

e.g. greenery / greenness model

Page 26: OGC Sept 2010 Meta-propagation of uncertainties within workflows

XPDL 2.1 process meta-model

attached with quality metadata

Page 27: OGC Sept 2010 Meta-propagation of uncertainties within workflows

XPDL 2.1 linking with BPMN

attached with quality metadata

Page 28: OGC Sept 2010 Meta-propagation of uncertainties within workflows

Extended attributes

• Without namespace

• With namespace
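
The two styles of extended attribute can be illustrated by generating a small XPDL fragment. This is a sketch, not the encoding shown on the slide: the XPDL 2.1 namespace URI is given as we understand it, while the `pq` namespace, its `value` child element, and the value 0.15 are hypothetical; only the measure name PQ_QuantitativeAttributePropagation comes from the presentation.

```python
import xml.etree.ElementTree as ET

XPDL_NS = "http://www.wfmc.org/2008/XPDL2.1"   # XPDL 2.1 namespace (assumed)
PQ_NS = "http://example.org/process-quality"    # hypothetical quality namespace

ET.register_namespace("xpdl", XPDL_NS)
ET.register_namespace("pq", PQ_NS)

def q(ns, tag):
    """Build a namespace-qualified tag name for ElementTree."""
    return f"{{{ns}}}{tag}"

activity = ET.Element(q(XPDL_NS, "Activity"), {"Id": "P1", "Name": "P1"})
ext = ET.SubElement(activity, q(XPDL_NS, "ExtendedAttributes"))

# Without namespace: a plain Name/Value pair on the ExtendedAttribute.
ET.SubElement(ext, q(XPDL_NS, "ExtendedAttribute"),
              {"Name": "PQ_QuantitativeAttributePropagation", "Value": "0.15"})

# With namespace: a structured child element from another vocabulary.
holder = ET.SubElement(ext, q(XPDL_NS, "ExtendedAttribute"),
                       {"Name": "processQuality"})
measure = ET.SubElement(holder, q(PQ_NS, "PQ_QuantitativeAttributePropagation"))
ET.SubElement(measure, q(PQ_NS, "value")).text = "0.15"

xml_text = ET.tostring(activity, encoding="unicode")
```

The namespaced form is more verbose but lets the quality element keep its own structure, which a flat Name/Value pair cannot carry.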


Page 29: OGC Sept 2010 Meta-propagation of uncertainties within workflows

BPMN/XPDL Example

Data3 = P1(Data1, Data2)

Page 30: OGC Sept 2010 Meta-propagation of uncertainties within workflows

BPMN/XPDL Example – Step 2

Data3 = P1(Data1, Data2)

Data6 = P2(Data3, Data4, Data5)

Page 31: OGC Sept 2010 Meta-propagation of uncertainties within workflows

BPMN/XPDL Example – Step 3

Data3 = P1’(Data1, Data2, Data7)

Data6 = P2(Data3, Data4, Data5)
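
The three steps of the toy workflow can be used to sketch what propagating uncertainty metadata through a workflow might look like. Everything numeric below is an assumption made for illustration: each process is given a hypothetical variance transfer function (here simple linear formulas), and the input variances are invented. Only the workflow shape, Data3 from P1 or P1’ and Data6 from P2, follows the slides.

```python
# Each process carries a hypothetical variance transfer function as metadata:
# it maps the variances of its inputs to the variance of its output.

def p1(v):
    """Data3 = P1(Data1, Data2)"""
    return v["Data1"] + v["Data2"]

def p1_prime(v):
    """Data3 = P1'(Data1, Data2, Data7)"""
    return 0.5 * (v["Data1"] + v["Data2"]) + v["Data7"]

def p2(v):
    """Data6 = P2(Data3, Data4, Data5)"""
    return 2.0 * v["Data3"] + v["Data4"] + v["Data5"]

def propagate(steps, variances):
    """Walk the workflow in order, filling in the variance of every output."""
    known = dict(variances)
    for output, transfer in steps:
        known[output] = transfer(known)
    return known

# Invented input variances, as they might be read from quality metadata.
inputs = {"Data1": 1.0, "Data2": 2.0, "Data4": 0.5, "Data5": 0.5, "Data7": 0.25}

original = propagate([("Data3", p1), ("Data6", p2)], inputs)
swapped = propagate([("Data3", p1_prime), ("Data6", p2)], inputs)
```

Comparing `original["Data6"]` with `swapped["Data6"]` is the kind of check a modeller could run before committing to the swap of P1 for P1’, without executing either workflow.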

Page 32: OGC Sept 2010 Meta-propagation of uncertainties within workflows

towards Web Workflow Service?

• need to easily combine / assess / refine web data/process services

• in a “WPS” fashion (WPS are atomic workflows)

• and other things: validation using PNML

Page 33: OGC Sept 2010 Meta-propagation of uncertainties within workflows

towards Web Workflow Service?

• WPS executing a workflow

see OWS-5 and OWS-6 (“hard-coded” and/or using a BPEL engine)

• WPS acting like a workflow service; WPS GetCapabilities:

. specific operations stored as available processes (Op)

. list of the workflow processes (Wkf)

the principle is that the Ops inform on a Wkf by returning an enriched XPDL file representing the workflow

• WWS: the “WPS acting” has unbalanced intrinsic properties of the existing processes living in the WPS

Page 34: OGC Sept 2010 Meta-propagation of uncertainties within workflows

towards Web Workflow Service?

• WPS acting like a workflow service; WPS GetCapabilities:

. specific operations stored as available processes (Op)

. list of the workflow processes (Wkf)

the principle is that the Ops inform on a Wkf by returning an enriched XPDL file representing the workflow

1. OpShow Id_Wkf returns the XPDL (enriched) of a Wkf

2. OpSet data/processes (modifiable entries of Wkf) returns the updated XPDL file with the updated metadata (particularly propagated metadata)

3. OpExecute, same as OpSet but runs the Wkf as an “aggregated process”, returns an XPDL containing as well the links for the outputs.

4. OpStatus returns the status per node of the Wkf in an XPDL file
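
The behaviour of the four operations can be mimicked with a small in-memory stand-in. This is a sketch only: the class `WorkflowStore`, its method names, the `register` helper, and the `urn:example:` output links are all hypothetical, and plain dictionaries are returned where the proposal returns enriched XPDL files.

```python
class WorkflowStore:
    """In-memory stand-in for the four proposed operations (hypothetical)."""

    def __init__(self):
        self._workflows = {}

    def register(self, wkf_id, nodes):
        """Store a workflow; `nodes` maps a node id to its modifiable entries."""
        self._workflows[wkf_id] = {n: dict(v, status="idle") for n, v in nodes.items()}

    def op_show(self, wkf_id):
        """OpShow: return the (enriched) description of a workflow."""
        return {n: dict(v) for n, v in self._workflows[wkf_id].items()}

    def op_set(self, wkf_id, node, **entries):
        """OpSet: update modifiable entries and return the updated description."""
        self._workflows[wkf_id][node].update(entries)
        return self.op_show(wkf_id)

    def op_execute(self, wkf_id, node=None, **entries):
        """OpExecute: like OpSet, then run; here outputs are only placeholder links."""
        if node is not None:
            self.op_set(wkf_id, node, **entries)
        for n, v in self._workflows[wkf_id].items():
            v["status"] = "done"
            v["output"] = f"urn:example:{wkf_id}:{n}"
        return self.op_show(wkf_id)

    def op_status(self, wkf_id):
        """OpStatus: return the status of every node."""
        return {n: v["status"] for n, v in self._workflows[wkf_id].items()}


store = WorkflowStore()
store.register("greenness", {"P1": {"source": "Data1"}, "P2": {"source": "Data3"}})
updated = store.op_set("greenness", "P1", source="Data7")
status_before = store.op_status("greenness")
result = store.op_execute("greenness")
status_after = store.op_status("greenness")
```

In a real service the propagated quality metadata would be recomputed inside `op_set`, so that the returned description already reflects the swap.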


Page 35: OGC Sept 2010 Meta-propagation of uncertainties within workflows

towards Web Workflow Service?

• WWS

• GetCapabilities: OGC generic request

• DescribeWorkflow: request to retrieve the definition of a workflow in a number of standard formats, in which XPDL is the primary choice. It corresponds to OpShow.

• DefineWorkflow: like OpSet, allows setting or modifying a workflow (fixed workflow with user’s inputs, partially modifiable workflow with user’s inputs and swaps of internal processes or data, or user’s workflow)

• ExecuteWorkflow: as OpExecute, launches the execution in “instant” or “delayed” mode, as in WPS, and requests the execution status as XPDL or “other workflow format”.

Parameters to manage the

- different levels of aggregation/hierarchy (e.g. an erosion model may have a precipitation model and a run-off model, among other sub-models)

- incomplete but published conceptual workflows (collaborations)
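
If the proposed WWS followed the key-value-pair request style common to OGC web services, client requests might be built as below. This is a guess at a possible shape, not a specification: the endpoint, the `version` value, and the parameter names `identifier`, `format` and `mode` are all hypothetical; only the request names and the “delayed” mode come from the slide.

```python
from urllib.parse import urlencode

BASE_URL = "http://example.org/wws"  # hypothetical endpoint

def wws_request(request, **params):
    """Build a key-value-pair GET URL in the style of OGC web services."""
    query = {"service": "WWS", "version": "0.1.0", "request": request}
    query.update(params)
    return f"{BASE_URL}?{urlencode(query)}"

capabilities_url = wws_request("GetCapabilities")
describe_url = wws_request("DescribeWorkflow", identifier="greenness", format="XPDL")
execute_url = wws_request("ExecuteWorkflow", identifier="greenness", mode="delayed")
```

DefineWorkflow would more naturally be a POST carrying the modified XPDL document, which is why it is left out of this GET-only sketch.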


Page 36: OGC Sept 2010 Meta-propagation of uncertainties within workflows

summary

• integrated modelling / scientific workflow
model building / reusing / user’s perspective / rescaling / quality assessment

• uncertainty / sensitivity analyses for workflows
error propagation / uncertainty analysis / emulator (“metamodelling”) / use of metadata

• metadata for data and for processes
quality metadata / UncertML / quality principles & measures for processes

• metamodel for workflows
notation / encoding / enrichment

• towards Web Workflow Service?
WPS / WWS / requirements for workflow assessment

FP7 European project