Presentazione jrc 24 ottobre

Alberto Belussi, Mauro Negri, Giuseppe Pelagatti, Politecnico di Milano. October 24, 2013. Inspire Data Model Browser

Presentation given on 24 October 2013 at JRC on: the GeoUML tools (Catalogue and Validator) and the INSPIRE Data Model Browser prototype. In the webinar of 21 November only some of the slides of this presentation were shown.

Transcript of Presentazione jrc 24 ottobre

Page 1: Presentazione jrc 24 ottobre

Alberto Belussi, Mauro Negri, Giuseppe Pelagatti

Politecnico di Milano

October 24, 2013

Inspire Data Model Browser

Page 2: Presentazione jrc 24 ottobre

© 2013 - Pelagatti Giuseppe

Index

1. General Architecture, derivation from GeoUML tools and state of the art

2. Inspire Data Model (IDM) and Browser (IDMB):
– IDM rules
– use of IDMB by non-UML experts
– use of IDMB by UML experts for Schema Maintenance

3. Inspire DB and SQL Mapping Generator:
– motivation
– comparison with similar approaches/tools

Discussion – how to go on

Page 3: Presentazione jrc 24 ottobre

1) General Architecture

[Architecture diagram. Components shown:]

• INSPIRE DATABASE (e.g. PostGIS)
• WFS and WFS Configuration
• INSPIRE DATA MODEL BROWSER (IDMB) (Inspire Data Specifications)
• Source Databases and SQL scripts
• SQL mapping generator and SQL Schema
• GML datasets
• VALIDATOR
• INSPIRE SCHEMA LOADER, with a schema in EAP format

Page 4: Presentazione jrc 24 ottobre

Derived from GeoUML Tools

[Diagram of the GeoUML tools. Components shown:]

• GeoUML Catalogue (Data Specification)
• XML Data Specification (scs format)
• Mapping Generator
• Physical Structure
• Physical Model
• Reader
• Source Dataset
• GeoUML Validator
• Diagnostic Information

6 different Implementation Models on different technologies:
• SQL
• Shapefiles
• GML

Page 5: Presentazione jrc 24 ottobre

Editor and Constraints

Page 6: Presentazione jrc 24 ottobre

[Diagram of the Validator workflow. Labels shown:]

• Data product (shapefile and DBF)
• loading
• SC.scs (XML file): SC, SF, MappingDomains
• Reporting DB
• Report Generator (e.g. iReport)
• OpenJUMP
• Database Client
• Validator

Page 7: Presentazione jrc 24 ottobre

Page 8: Presentazione jrc 24 ottobre

GZ_STR.Posizione (TC) esiste TP_STR.Tracciato

Page 9: Presentazione jrc 24 ottobre

Missing el_str

Page 10: Presentazione jrc 24 ottobre

GeoUML Tools Experience

• The Catalogue has been adopted to model the Italian National Core data specification (legally mandatory for Italian public administrations since 2012): 150 classes and 90 spatial constraints

• Each administration can use the Catalogue (Editor) to extend the National Core for its specific requirements, provided that the mandatory content is preserved

• Each administration can choose its own Implementation Model

• The Validator has been applied in several productions, some of which are rather large, e.g.:
– Provincia di Sondrio: 78 municipalities, 3195 km², 100 classes and 50 constraints
– Regione Basilicata: 10000 km² (full National Core)

Page 11: Presentazione jrc 24 ottobre

GeoUML Tools Experience - 2

• Provincia Autonoma di Trento has financed an extension to the Validator in order to use it in its Information System: the Validator can now be invoked through an API

• Regione Lombardia is using the tools for its Regional SDI (essentially for transferring local updates to the central database):
– the interchange format has been defined using the Catalogue and the ShapeFlat Implementation Model
– the use of the Validator for checking the quality of updates is being evaluated

Page 12: Presentazione jrc 24 ottobre

GeoUML Tools Experience - 3

Distribution and downloads:

– Currently only the Catalogue Viewer is distributed to everybody; the Editor and Validator are distributed only to public administrations

– From the next release (November 2013) these restrictions will be removed

– About 280 distinct downloads of the Viewer (50% by private users)

Page 13: Presentazione jrc 24 ottobre

2) Inspire Data Model – Main Rules

• Defined in Inspire Generic Conceptual Model and Referenced Standards

• Rules = Restrictions to generic UML class diagrams:

a. Feature Classes cannot be the type of an attribute

b. Datatypes, Codelists and Enumerations cannot be the Target of a Role

c. Datatype containment cannot be cyclic (including inheritance)

d. Inheritance Hierarchies are homogeneous in stereotype
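
As a rough illustration of how rules (a) to (d) could be checked mechanically, the following Python sketch applies them to a toy model. The `Cls` structure, the stereotype strings, the class names and the `check_idm_rules` function are all assumptions made for this example; they are not the IDMB implementation. Rule (c) is interpreted here as "a datatype must not contain itself through its own or inherited attributes", which is one possible reading of the slide.

```python
from dataclasses import dataclass, field
from typing import Optional


# Hypothetical, simplified picture of a UML class (not the IDMB data structures).
@dataclass
class Cls:
    name: str
    stereotype: str                                 # "featureType", "dataType", "codeList", "enumeration"
    parent: Optional[str] = None                    # single supertype, for simplicity
    attributes: dict = field(default_factory=dict)  # attribute name -> type name
    roles: dict = field(default_factory=dict)       # role name -> target class name


NON_FEATURE = {"dataType", "codeList", "enumeration"}


def check_idm_rules(classes):
    """Return a list of human-readable violations of rules (a)-(d)."""
    by_name = {c.name: c for c in classes}
    errors = []
    for c in classes:
        # Rule (a): a feature class cannot be the type of an attribute.
        for attr, type_name in c.attributes.items():
            t = by_name.get(type_name)
            if t is not None and t.stereotype == "featureType":
                errors.append(f"(a) {c.name}.{attr} has feature type {type_name}")
        # Rule (b): datatypes, codelists and enumerations cannot be role targets.
        for role, target in c.roles.items():
            t = by_name.get(target)
            if t is not None and t.stereotype in NON_FEATURE:
                errors.append(f"(b) role {c.name}.{role} targets {t.stereotype} {target}")
        # Rule (d): a class and its supertype must share the same stereotype.
        p = by_name.get(c.parent) if c.parent else None
        if p is not None and p.stereotype != c.stereotype:
            errors.append(f"(d) {c.name} ({c.stereotype}) inherits from {p.name} ({p.stereotype})")

    def contained(name):
        """Datatypes used as attribute types by `name` or by any of its supertypes."""
        result, visited = set(), set()
        c = by_name.get(name)
        while c is not None and c.name not in visited:
            visited.add(c.name)
            for type_name in c.attributes.values():
                t = by_name.get(type_name)
                if t is not None and t.stereotype == "dataType":
                    result.add(type_name)
            c = by_name.get(c.parent) if c.parent else None
        return result

    def has_cycle(start):
        """Depth-first search: can `start` be reached again through containment?"""
        stack, seen = list(contained(start)), set()
        while stack:
            n = stack.pop()
            if n == start:
                return True
            if n not in seen:
                seen.add(n)
                stack.extend(contained(n))
        return False

    # Rule (c): datatype containment cannot be cyclic.
    for c in classes:
        if c.stereotype == "dataType" and has_cycle(c.name):
            errors.append(f"(c) datatype {c.name} contains itself")
    return errors


# Toy model with one violation each of rules (a) and (b) and a containment cycle A <-> B.
model = [
    Cls("Road", "featureType", attributes={"crossing": "Bridge"}, roles={"surface": "SurfaceValue"}),
    Cls("Bridge", "featureType"),
    Cls("SurfaceValue", "codeList"),
    Cls("A", "dataType", attributes={"b": "B"}),
    Cls("B", "dataType", attributes={"a": "A"}),
]
violations = check_idm_rules(model)
for v in violations:
    print(v)
```

A real checker would also have to deal with multiple inheritance and with types defined outside the loaded packages, which this sketch silently ignores.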

Page 14: Presentazione jrc 24 ottobre

Additional Restrictions of the current Release

Applied in the current IDMB to reduce development cost and time (can be removed)

e. Union stereotypes currently not supported

f. Datatypes, Codelists and Enumerations do not have Multiple Inheritance

g. If an attribute is of type DT and DT is a datatype having subdatatypes DT1, DT2, … then DT, DT1, DT2, … cannot have attributes of type datatype DTX which has subdatatypes DTX1, DTX2 …

h. limitations on cardinality domain

Page 15: Presentazione jrc 24 ottobre

What data are loaded in IDMB?

• Extracted from Inspire2010-04-26_r937.EAP

• Without:
– Foundation Schemas
– class INSPIRE THEME XX::SpatialObjectWithAName

• some additional <basic types>:
– Spatial types (GM_...) used by Annex 1
– ….

• + Selected ISO Types:
– from ISO 00639 Language Codes: <codelist> LanguageCode
– from ISO 19103 Schema Language: Bit, Digit, Sign, StandardUnits, UnitsList
– from ISO 19115

Page 16: Presentazione jrc 24 ottobre

A Few Problems Discovered in Loading

• 4 Classes used as Attribute Types (violation of rule (a)):
– Percentage for attributes:
    • percentageUnderDesignation of class DesignationType
    • percentageUnderHabitat of class PresentHabitatType
    • percentageOfSiteCoveredByProtectedEntity of class ProtectedEntityType
– NetworkArea for attribute: officialsiteArea of class ProtectedSite
– LinkSequence for attribute: aggregatedLink of class LRSElement
– Link for attribute: link of class LRSElement

• Cardinality without Role (DirectedLink)

• Roles which are not represented in the EA diagrams (example: "SpeciesAggregationUnit" in package "Species distribution (Annex III)" has role "isAggregatedOver" to featuretype "Protected Site")

Page 17: Presentazione jrc 24 ottobre

IDMB for non-UML experts

• To some extent an answer to the requirement stated in Florence by Anders Ryden in «Analysis of Inspire DS – A stepwise approach»

• DEMO (in parallel with EA diagrams):
– Road
– NetworkProperty
– ProtectedSite (full)
– designationvalue
– …

Page 18: Presentazione jrc 24 ottobre

IDMB for design and maintenance of an Inspire Data Schema

• Checks the IDM rules (EA does not)

• Contributes to the understanding of a UML schema developed in EA:
– different navigation
– statistics are possible (e.g. how many feature types, percentage of abstract classes, average depth of inheritance hierarchies, …)

• Requires the development of the IDM Loader (currently ad hoc operations) and the elimination of the restrictions of the current Release
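
The statistics mentioned above are simple to compute once the schema is available as data. The Python sketch below is only an illustration on a toy model: the `Cls` structure, the class names and the definition of "average depth" (the mean, over all root classes, of the longest inheritance chain below each root) are assumptions of this example, not the IDMB definitions.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical, simplified picture of a UML class (not the IDMB data structures).
@dataclass
class Cls:
    name: str
    stereotype: str
    abstract: bool = False
    parent: Optional[str] = None


def schema_statistics(classes):
    by_name = {c.name: c for c in classes}

    def chain(c):
        # Return (root name, number of inheritance steps from c up to that root).
        steps = 0
        while c.parent is not None:
            c = by_name[c.parent]
            steps += 1
        return c.name, steps

    # Depth of a hierarchy = longest chain from any class up to the hierarchy root.
    hierarchy_depth = {}
    for c in classes:
        root, steps = chain(c)
        hierarchy_depth[root] = max(hierarchy_depth.get(root, 0), steps)

    return {
        "feature_types": sum(c.stereotype == "featureType" for c in classes),
        "abstract_pct": 100.0 * sum(c.abstract for c in classes) / len(classes),
        "avg_depth": sum(hierarchy_depth.values()) / len(hierarchy_depth),
    }


# Toy model: one three-level hierarchy and two stand-alone classes.
model = [
    Cls("Element", "featureType", abstract=True),
    Cls("Link", "featureType", abstract=True, parent="Element"),
    Cls("RoadLink", "featureType", parent="Link"),
    Cls("Site", "featureType"),
    Cls("Designation", "dataType"),
]
stats = schema_statistics(model)
print(stats)
```

Counting stand-alone classes as hierarchies of depth 0 lowers the average; a tool could just as reasonably exclude them.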

Page 19: Presentazione jrc 24 ottobre

IDMB and INSPIRE MIF

[Diagram. Components shown:]

• INSPIRE SCHEMA LOADER, with a schema in EAP format
• INSPIRE DATA MODEL BROWSER (IDMB) (Inspire Data Specifications)
• Updates
• Errors (violations of IDM rules)
• Statistics and Analysis by different navigation

Page 20: Presentazione jrc 24 ottobre

3) Inspire DB, Mapping and WFS

Goals of Inspire DB:

– Supporting the implementation of WFS for spatial data having complex structure (this is the case of INSPIRE datasets)

– Guaranteeing that the database content satisfies the integrity constraints, thus ensuring high quality of the provided data

– Possibly improving the performance of spatial queries and XLink navigation (WFS 2.0)

– Possibly providing the ability to manage incoming data through a WFS-T service

– Being an SQL Reference Implementation

Page 21: Presentazione jrc 24 ottobre

Features of Inspire Database

• Primary key and unique constraints derived from Data Specification are added to the SQL schema

• Foreign key constraints are also specified in order to deal with referential integrity of roles and other links among tables. Constraints involving different tables are handled.

• Cardinality constraints are added in order to check required attributes and roles.

• Views can possibly be added to obtain the extent of superclasses in hierarchies

• Geometric domains are specialized and constrained in order to guarantee that the represented geometries have the correct type (POINT, LINESTRING, POLYGON, …) and the correct coordinate dimension

• Spatial indices are defined on every geometric attribute

• No redundancy is introduced for the representation of a role together with its corresponding inverse role.
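
To make the effect of these constraints concrete, here is a small Python sketch using the standard `sqlite3` module. It is not the schema produced by the Mapping Generator: the table and column names are invented, SQLite stands in for PostGIS, and the geometry is stored as WKT text with a `CHECK` on its prefix purely to imitate a typed geometry column. Spatial indices are left out because plain SQLite has none.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE protected_site (
    class_id   INTEGER PRIMARY KEY,                              -- primary key
    inspire_id TEXT NOT NULL UNIQUE,                             -- required and unique
    geom_wkt   TEXT NOT NULL CHECK (geom_wkt LIKE 'POLYGON(%')   -- imitates a typed geometry
);
CREATE TABLE site_name (
    class_ref INTEGER NOT NULL REFERENCES protected_site(class_id),  -- referential integrity
    name      TEXT NOT NULL
);
""")


def try_insert(sql, params):
    """Run one INSERT and report whether the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


polygon = "POLYGON((0 0, 1 0, 1 1, 0 0))"
results = [
    # Valid row.
    try_insert("INSERT INTO protected_site VALUES (?, ?, ?)", (1, "IT.PS.1", polygon)),
    # Duplicate identifier: violates the UNIQUE constraint.
    try_insert("INSERT INTO protected_site VALUES (?, ?, ?)", (2, "IT.PS.1", polygon)),
    # Wrong geometry type: violates the CHECK constraint.
    try_insert("INSERT INTO protected_site VALUES (?, ?, ?)", (3, "IT.PS.3", "POINT(0 0)")),
    # Reference to a site that does not exist: violates the foreign key.
    try_insert("INSERT INTO site_name VALUES (?, ?)", (99, "Orphan")),
]
for r in results:
    print(r)
```

Only the first insert is accepted; the other three are refused by the database itself, which is the point of pushing the Data Specification constraints into the SQL schema.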

Page 22: Presentazione jrc 24 ottobre

SQL schema generation rules

Main general rules:

• Each class has a reference table (master) with a unique identifier (ClassID) used for providing the gml:id

• Each atomic property of a class with max cardinality equal to one (mono-property) is represented in the class reference table; the same behaviour is adopted for each attribute of a datatype property of a class, provided that it is mono-property.

• Each atomic or datatype property with max cardinality greater than one or equal to ‘*’ (multi-property) produces an additional table (slave) having a foreign key (ClassREF) with an integrity constraint towards the master table.
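
The master/slave rule can be sketched in a few lines of Python. This is an illustration only: the `generate_tables` function, the `<class>_<property>` naming of slave tables and the example class are assumptions of this sketch, and the flattening of datatype properties into the master table is not covered. The generated DDL is run against an in-memory SQLite database just to confirm that it is valid SQL.

```python
import sqlite3


def generate_tables(class_name, properties):
    """Generate CREATE TABLE statements for one class.

    properties: list of (name, sql_type, max_cardinality), where
    max_cardinality is an int or '*'.
    """
    master_cols = ["ClassID INTEGER PRIMARY KEY"]
    slaves = []
    for name, sql_type, max_card in properties:
        if max_card == 1:
            # Mono-property: a column of the master table.
            master_cols.append(f"{name} {sql_type}")
        else:
            # Multi-property: a slave table with a foreign key to the master.
            slaves.append(
                f"CREATE TABLE {class_name}_{name} (\n"
                f"    ClassREF INTEGER NOT NULL REFERENCES {class_name}(ClassID),\n"
                f"    {name} {sql_type}\n"
                f")"
            )
    master = f"CREATE TABLE {class_name} (\n    " + ",\n    ".join(master_cols) + "\n)"
    return [master] + slaves


# Hypothetical class with two mono-properties and one multi-property.
ddl = generate_tables("Road", [("name", "TEXT", 1), ("lanes", "INTEGER", 1), ("alias", "TEXT", "*")])

conn = sqlite3.connect(":memory:")
for statement in ddl:
    print(statement)
    conn.execute(statement)

tables = sorted(row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'"))
print(tables)
```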

Page 23: Presentazione jrc 24 ottobre

SQL schema generation rules

Main general rules (cont.):

• Each role with max cardinality greater than one or equal to ‘*’ (multi-role) without inverse role or having an inverse multi-role produces an additional table (slave) having two foreign keys (ClassREF_0, ClassREF_1) with integrity constraints towards the master tables.

• Additional views are added to handle the reference to the features/objects/codes belonging to a hierarchy of feature classes/datatypes/code lists.
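
The multi-role rule can be illustrated with a short Python and SQLite sketch. The classes, the role name and the composite primary key on the slave table are assumptions of this example; only the ClassID, ClassREF_0 and ClassREF_1 column names come from the slides. The point shown is that a single slave table answers both the role and its inverse, so nothing is stored twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE Road         (ClassID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Municipality (ClassID INTEGER PRIMARY KEY, name TEXT);

-- One slave table represents the multi-role and its inverse multi-role.
CREATE TABLE Road_crosses (
    ClassREF_0 INTEGER NOT NULL REFERENCES Road(ClassID),
    ClassREF_1 INTEGER NOT NULL REFERENCES Municipality(ClassID),
    PRIMARY KEY (ClassREF_0, ClassREF_1)
);

INSERT INTO Road VALUES (1, 'R1'), (2, 'R2');
INSERT INTO Municipality VALUES (10, 'M1'), (11, 'M2');
INSERT INTO Road_crosses VALUES (1, 10), (1, 11), (2, 10);
""")

# Role: municipalities crossed by road 1.
crossed = [row[0] for row in conn.execute("""
    SELECT m.name FROM Road_crosses l
    JOIN Municipality m ON m.ClassID = l.ClassREF_1
    WHERE l.ClassREF_0 = 1 ORDER BY m.name
""")]

# Inverse role: roads crossing municipality 10, read from the same table.
crossing = [row[0] for row in conn.execute("""
    SELECT r.name FROM Road_crosses l
    JOIN Road r ON r.ClassID = l.ClassREF_0
    WHERE l.ClassREF_1 = 10 ORDER BY r.name
""")]

print(crossed)
print(crossing)
```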

Page 24: Presentazione jrc 24 ottobre

Mapping example on GeoUML Catalogue

Page 25: Presentazione jrc 24 ottobre

Mapping example on GeoUML Catalogue

Page 26: Presentazione jrc 24 ottobre

Comparison with database generated from XSD (deegree approach)

• Primary key and unique constraints cannot always be derived from XSD syntax

• Foreign keys cannot be specified in XSD syntax, and are therefore not defined in an SQL schema derived from XSD

• Cardinality constraints and domain constraints can be derived from XSD syntax, but are not always specified in the SQL schema.

• Geometric types cannot always be derived from XSD syntax.

• Spatial indices are a possible choice when a geometric attribute is defined; however, schema generators do not always produce them

• Redundancy in role/inverse role implementation cannot be avoided starting from XSD syntax, since inverse roles are not specified.

Page 27: Presentazione jrc 24 ottobre

Example: a DB generated by deegree

[Screenshot of the generated schema, with annotations:]

• Primary key completely useless
• no Foreign key constraint
• no Foreign key constraint and redundancy (twice)

Page 28: Presentazione jrc 24 ottobre

Comparison with GO Loader

PostGIS

Page 29: Presentazione jrc 24 ottobre

Comparison with GO Loader

Page 30: Presentazione jrc 24 ottobre

The Inspire DB as a target for mapping from existing production database

Data producers have to define and implement the SQL scripts which transfer the data from their Source Database to the Inspire Database. This could be easier than directly producing the input to a WFS.
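
A transfer script of this kind is typically an `INSERT … SELECT` that renames columns and translates codes. The Python sketch below imitates the situation with two in-memory SQLite databases; the source table `strade`, its codes, the target table `Road` and its constraints are all invented for this example and do not come from any real production database or from the generated Inspire schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # plays the role of the Inspire Database
conn.executescript("""
ATTACH DATABASE ':memory:' AS src;  -- plays the role of the Source Database

-- Hypothetical source table with local names and codes.
CREATE TABLE src.strade (id INTEGER, nome TEXT, tipo TEXT);
INSERT INTO src.strade VALUES (1, 'Via Roma', 'U'), (2, 'A4', 'A'), (3, NULL, 'U');

-- Hypothetical target table with constraints, as in the Inspire Database.
CREATE TABLE Road (
    ClassID   INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    road_type TEXT NOT NULL CHECK (road_type IN ('urban', 'motorway'))
);
""")

# The transfer script: rename columns, translate codes, skip rows that
# cannot satisfy the target constraints (here, roads without a name).
TRANSFER_SCRIPT = """
INSERT INTO Road (ClassID, name, road_type)
SELECT id,
       nome,
       CASE tipo WHEN 'U' THEN 'urban' WHEN 'A' THEN 'motorway' END
FROM src.strade
WHERE nome IS NOT NULL;
"""
conn.executescript(TRANSFER_SCRIPT)

rows = conn.execute("SELECT ClassID, name, road_type FROM Road ORDER BY ClassID").fetchall()
print(rows)
```

Because the target table carries the constraints, a script that maps a code wrongly or forgets a required attribute fails at load time instead of producing invalid data downstream.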

[Diagram: Source Database → SQL scripts → INSPIRE DATABASE]

Page 31: Presentazione jrc 24 ottobre

4) How should we go on?

Depends on the answer to the following 3 questions:

1. Do you think that IDMB can be useful for facilitating the understanding of Inspire DS by non-UML experts?

2. Do you think that IDMB + Inspire Schema Loader can be useful for Inspire Schema Maintenance?

3. Do you think that the Mapping Generator and Inspire Database can be useful for the Inspire Data Producers?

A positive answer to each one of these questions can have consequences for the continuation of this project.

Page 32: Presentazione jrc 24 ottobre

IDMB for facilitating the understanding of Inspire DS by non-UML experts

• Actions (already decided by SpatialDBGroup):
– complete the features of IDMB
– distribute it free

• Problems and Opportunities:
– how to provide some kind of certification (formal or not)
– how to disseminate
– use in training?

Page 33: Presentazione jrc 24 ottobre

IDMB + Inspire Schema Loader for Inspire Schema Maintenance

• Actions:
– refine requirements for maintenance (e.g. useful statistics…)
– produce a usable Inspire Schema Loader

• Problems:
– financial support

Page 34: Presentazione jrc 24 ottobre

Mapping Generator and Inspire Database for the Inspire Data Producers

• Actions:
– find interested parties (stakeholders)
– define detailed rules for critical aspects
– complete the production of the Schema Generator

• Inspire Data Producers could also be interested in:
– Applying the Validator on the IDM

Page 35: Presentazione jrc 24 ottobre

IDM and Italian National Core: a Mapping project

• We are studying the possibility and usefulness of supporting the mapping between a NC database and the Inspire database by connecting the GeoUML Catalogue and the IDMB

• If this approach proves useful, we will ask CISIS (or other interested Italian agencies) for financial support for this project
