ESWC'08 - Tutorial 3 - Transitioning Applications to Ontologies - June, 1st - Tenerife
Transitioning Relational Databases to Ontologies
Farid Cerbah, Dassault Aviation
Outline
- Problem statement
- Previous work
- The RDBToOnto tool and the RTAXON method
- Improving the process through database optimisation
- A case study in aircraft maintenance
- Extending RDBToOnto
- Conclusion
Problem statement
- Relational databases are valuable heterogeneous sources for ontology learning
  - Better accuracy can be expected than from text corpora
- Ontology learning from relational databases is not a new research issue
- Limitations of existing support
  - Problem often restricted to finding automated ways to import "tables" into ontologies
  - Derivation of ontologies with flat structure that look like the source databases
Our contribution
- RDBToOnto platform
  - Comprehensive software support for learning fine-tuned ontologies
  - A framework that eases the development of and experimentation with transitioning methods
- RTAXON method
  - Finds taxonomies hidden in the data
A motivating example
[Figure: example mapping from a database to an ontology. Annotations: "Typical mappings covered by several methods" and "Specific to RTAXON".]
Previous work (1)
RDB -> Ontology Transformation
- Database Reverse Engineering
  - Many transformation rules from this domain are reused for ontology learning
  - [Behm et al. 1997], [Ramanathan & Hodges 1997], …
- Approaches mostly based on an analysis of the RDB schema
  - Data correlations are considered but with the restriction "Data ≡ Key Values"
  - Key inclusion may express inheritance
- Exploiting null values semantics [Lammari et al. 2007]
  - Partitioning of a table on the basis of null values may reveal concept hierarchies
  - Involves data from non-key attributes
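
The null-value idea can be made concrete with a small sketch. It is only an illustration and not the algorithm of [Lammari et al. 2007]: the table, its column names and its rows are invented, and grouping rows by the set of non-key attributes that are filled is a simplifying assumption.

```python
from collections import defaultdict

# Hypothetical "Person" relation; None stands for SQL NULL.
ROWS = [
    {"id": 1, "name": "Ann", "salary": 4200, "student_no": None},
    {"id": 2, "name": "Bob", "salary": None, "student_no": "S-17"},
    {"id": 3, "name": "Cy", "salary": 3900, "student_no": None},
    {"id": 4, "name": "Di", "salary": None, "student_no": "S-42"},
]


def partition_by_null_pattern(rows, key="id"):
    """Group row keys by the set of non-key attributes that are non-null."""
    groups = defaultdict(list)
    for row in rows:
        filled = frozenset(
            col for col, val in row.items() if col != key and val is not None
        )
        groups[filled].append(row[key])
    return dict(groups)


def candidate_subclasses(rows, key="id"):
    """Attributes filled in every group are attributed to the parent concept;
    the attributes left over in each group hint at a candidate subconcept."""
    groups = partition_by_null_pattern(rows, key)
    common = frozenset.intersection(*groups.keys())
    return {tuple(sorted(filled - common)): ids for filled, ids in groups.items()}


if __name__ == "__main__":
    for extra_attributes, ids in candidate_subclasses(ROWS).items():
        print(extra_attributes, ids)
```

Here the rows split into one group that carries a salary and one that carries a student number, which suggests two candidate subconcepts below the concept derived from the table.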
Previous work (2)
Mapping languages and tools
- D2RQ
  - RDB to OWL/RDF mapping
  - Ontology-based access to relational databases
  - Rewriting SPARQL queries into SQL
- Relational.OWL
  - A minimal ontology of 'tables' and 'columns' and a processor to populate this ontology with data from relational databases
  - Can be used to exchange data between databases
- Triplify
  - Plugin for web applications
  - Converts the result of SQL queries into RDF
- KAON Reverse
  - Software support to interactively map an RDB schema to a predefined ontology
- DataMaster
  - Protégé plugin to import table data into ontologies
RDBToOnto
- A user-oriented tool with a full-fledged user interface
- Supports an extensive process, from data access to ontology generation
- Includes the RTAXON converter
- Though automated to a large extent, local constraints can be interactively included to progressively refine the ontologies
- Types of local constraints
  - Table and column exclusion
  - Naming patterns for classes and instances
  - Categorisation patterns
The RTAXON method
- Major improvement over existing methods
  - Further refine the classes derived from the schema with subclasses found in the content of the relations
  - Focus on reliable categorisation patterns

Demo

Access Zones (X 516)
A/C   Codes   Description                        Type
F7X   2103    nose cone                          DOOR
F7X   281FL   windshield retainers               PANEL
F7X   300ZZ   umbrella access panel No.1         PANEL
F7X   243DF   servicing compartment floor No.1   FLOOR
F7X   342EZ   rear under pylon fairing           FAIRG

Categorising attribute: Type
Resulting hierarchy: Access Zone, with subclasses Door, Panel, Fairing, Floor

- Two sources involved in the identification of categorising attributes
  - Attribute names
    - Revealed by lexical clues
  - Redundancy in attribute extensions
    - Entropy-based approach to find good profiles

Formal definition of RTAXON
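
The two sources of evidence can be sketched on the Access Zones excerpt. The slides do not give RTAXON's formula, so this is a guess at the general shape rather than the method itself: the normalised Shannon entropy, the thresholds and the list of lexical clues are all assumptions.

```python
import math
from collections import Counter

# Rows from the Access Zones excerpt (A/C, Codes, Description, Type).
COLUMNS = ["A/C", "Codes", "Description", "Type"]
ACCESS_ZONES = [
    ("F7X", "2103", "nose cone", "DOOR"),
    ("F7X", "281FL", "windshield retainers", "PANEL"),
    ("F7X", "300ZZ", "umbrella access panel No.1", "PANEL"),
    ("F7X", "243DF", "servicing compartment floor No.1", "FLOOR"),
    ("F7X", "342EZ", "rear under pylon fairing", "FAIRG"),
]

# Assumed lexical clues; the slides do not list the ones RTAXON uses.
LEXICAL_CLUES = ("type", "category", "kind", "class")


def normalised_entropy(values):
    """Shannon entropy of the value distribution divided by log2(row count).
    Close to 0: one repeated value. Close to 1: all values distinct."""
    n = len(values)
    if n < 2:
        return 0.0
    counts = Counter(values)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return entropy / math.log2(n)


def categorising_candidates(columns, rows, low=0.05, high=0.95):
    """Return (column, entropy, has_lexical_clue) for columns whose values
    are redundant (entropy < high) without being constant (entropy > low).
    The thresholds are arbitrary illustration values."""
    result = []
    for i, name in enumerate(columns):
        e = normalised_entropy([row[i] for row in rows])
        clue = any(c in name.lower() for c in LEXICAL_CLUES)
        if low < e < high:
            result.append((name, round(e, 3), clue))
    return result


def subclasses_from(column, columns, rows):
    """Distinct values of the categorising attribute, one per candidate subclass."""
    i = columns.index(column)
    return sorted({row[i] for row in rows})


if __name__ == "__main__":
    print(categorising_candidates(COLUMNS, ACCESS_ZONES))
    print(subclasses_from("Type", COLUMNS, ACCESS_ZONES))
```

On this five-row excerpt only `Type` survives the filter: `A/C` is constant, while `Codes` and `Description` are distinct in every row. On a real table the thresholds would need tuning against the full column extensions.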
Optimising the source databases
- Another key improvement is the inclusion of a database optimisation step
  - Many input databases suffer from data duplication problems
  - Optimisation -> eliminate data duplication through the processing of inclusion dependencies

Data Duplication

Companies (X 105)
Name                Cage_Code (PKEY)
Dassault-Aviation   F0214
Messier-Dowty       F564
Parker              F0086

WorkPackages (X 82)
Company Name        Company Code   WP Number   WP Title
ABS                 B45335                     Wheels, Brakes and Braking
Dassault-Aviation   F0214          34A         Landing Gear Emergency Control System
Messier-Dowty       F564           34          Landing Gears
Parker              F0086          33          Hydraulic Power

Inclusion dependency
WorkPackages[Company Code, Company Name] ⊆ Companies[Cage Code, Name]
WorkPackages[Company Code]   Companies[Name]

Foreign Key Relationship

Companies (X 106)

WorkPackages (X 82)
Company Code   WP Number   WP Title
B45335                     Wheels, Brakes and Braking
F0214          34A         Landing Gear Emergency Control System
F564           34          Landing Gears
F0086          33          Hydraulic Power
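
The Companies/WorkPackages example can be checked mechanically. The sketch below is an assumption about how such a step might look, not RDBToOnto's implementation: it tests whether an inclusion dependency holds on three of the sample rows and, if so, drops the duplicated column so that only the foreign key remains. The function names are hypothetical.

```python
# Sample rows taken from the slide; dict keys follow the column headers.
COMPANIES = [
    {"Cage_Code": "F0214", "Name": "Dassault-Aviation"},
    {"Cage_Code": "F564", "Name": "Messier-Dowty"},
    {"Cage_Code": "F0086", "Name": "Parker"},
]
WORK_PACKAGES = [
    {"WP Number": "34A", "Company Code": "F0214", "Company Name": "Dassault-Aviation"},
    {"WP Number": "34", "Company Code": "F564", "Company Name": "Messier-Dowty"},
    {"WP Number": "33", "Company Code": "F0086", "Company Name": "Parker"},
]


def inclusion_holds(dep_rows, dep_cols, ref_rows, ref_cols):
    """True if every dep_cols tuple of dep_rows also occurs as a ref_cols
    tuple of ref_rows, i.e. dep[dep_cols] is included in ref[ref_cols]."""
    referenced = {tuple(row[c] for c in ref_cols) for row in ref_rows}
    return all(tuple(row[c] for c in dep_cols) in referenced for row in dep_rows)


def remove_duplication(dep_rows, dep_cols, ref_rows, ref_cols, fk_col):
    """Keep fk_col as a foreign key and drop the other dependent columns,
    which can be recovered from the referenced table."""
    if not inclusion_holds(dep_rows, dep_cols, ref_rows, ref_cols):
        raise ValueError("inclusion dependency does not hold on the data")
    redundant = {c for c in dep_cols if c != fk_col}
    return [
        {k: v for k, v in row.items() if k not in redundant} for row in dep_rows
    ]


if __name__ == "__main__":
    slim = remove_duplication(
        WORK_PACKAGES, ["Company Code", "Company Name"],
        COMPANIES, ["Cage_Code", "Name"],
        fk_col="Company Code",
    )
    for row in slim:
        print(row)
```

Once the company name is gone from WorkPackages, a converter can treat `Company Code` as a reference to Companies and emit an object property instead of two unrelated datatype values.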
Effect of inclusion dependency processing
- Processing inclusion dependencies yields more inter-class relations (i.e. object properties)
[Figure: generated ontology without ID identification and with ID identification]
Identification of inclusion dependencies
- RDBToOnto includes an editor to interactively define inclusion dependencies
- Automated identification of inclusion dependencies
  - A data mining approach
  - Based on LATINO
    - See presentation in this tutorial on ontology learning by Miha Grčar (JSI)
  - Dependencies discovered by LATINO are exported to RDBToOnto and can be validated in the ID editor
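
To give a feel for what automated identification involves, here is a brute-force sketch that proposes unary inclusion dependencies by comparing column value sets. It is not the LATINO approach, whose details are not given in these slides; the function names are hypothetical and the sample data is a subset of the Companies/WorkPackages rows.

```python
from itertools import permutations

TABLES = {
    "Companies": [
        {"Cage_Code": "F0214", "Name": "Dassault-Aviation"},
        {"Cage_Code": "F564", "Name": "Messier-Dowty"},
        {"Cage_Code": "F0086", "Name": "Parker"},
    ],
    "WorkPackages": [
        {"WP Number": "34A", "Company Code": "F0214", "Company Name": "Dassault-Aviation"},
        {"WP Number": "34", "Company Code": "F564", "Company Name": "Messier-Dowty"},
    ],
}


def column_values(rows, column):
    """Set of non-null values of a column."""
    return {row[column] for row in rows if row[column] is not None}


def unary_inclusion_candidates(tables):
    """Return (dep_table, dep_col, ref_table, ref_col) for every pair of
    columns from different tables where the dependent values are all
    contained in the referenced values. Quadratic in the number of columns."""
    columns = [(t, c) for t, rows in tables.items() if rows for c in rows[0]]
    found = []
    for (t1, c1), (t2, c2) in permutations(columns, 2):
        if t1 == t2:
            continue
        dependent = column_values(tables[t1], c1)
        if dependent and dependent <= column_values(tables[t2], c2):
            found.append((t1, c1, t2, c2))
    return found


if __name__ == "__main__":
    for candidate in unary_inclusion_candidates(TABLES):
        print(candidate)
```

Containment of values is only evidence, since small or coincidentally overlapping columns produce false positives. That is consistent with the slide's point that discovered dependencies are validated in the ID editor.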
Mining inclusion dependencies with LATINO
A case study in aircraft maintenance
[Figure: case study architecture. Components: KCIT (GATE-based annotator), RDBToOnto + LATINO, Radiant, OWLIM.]
The ontology acquisition process
- The legacy data
  - LSA database: a heterogeneous relational database that gathers all information related to maintenance activity
    - Required logistic resources
    - Aircraft parts (product tree)
    - Scheduling data
  - Standards: documents including widely shared conceptual models
- The ontology acquisition process
  - A multi-step transitioning process that favours modular design
Model Bootstrapping + Ontology Normalisation
[Figure: acquisition pipeline. Labels: Legacy Data; Ontology Learning Tools; Model Bootstrapping; Ontology Normalisation; Reusable Ontologies (MSG-3, SNS/ATA, FOAF); ATA (imports); OWLIM/HKS Repository.]
The defined RDBToOnto conversion project
- 75 constraints
  - Mostly naming patterns and inclusion dependencies
- Resulting ontology
  - Ontology model: 115 classes, 334 datatypes, 54 object properties
  - Population: 49617 class instances, 51449 object property instances
- No constraints for categorisation
  - The ten hierarchies discovered by RTAXON are relevant
  - Good behaviour when faced with categorisation conflicts
The generated class hierarchy
Identified object properties
RDBToOnto extension capabilities
- RDBToOnto is a user-oriented tool but it is also a framework
  - Written in Java
  - OWL as target language (exploiting Jena 2.5 API)
- Two types of components can be added
  - Database readers to cover more database formats
  - Converters to implement new learning methods
- New converters can have their specific global options, local constraints and GUI
Structure of RDBToOnto
[Class diagram]
DBReader
  Database getDatabase()
  Table ReadData(String name)
  …
  Subclasses: MSAccessReader, DB2Reader
Database
RDBToOntoConverter
  OntModel Convert(Database db)
  OntClass CreateClass(TableDef)
  …
  Subclasses: RTAXON, BasicConverter
Can be extended by the users
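
The extension pattern in the diagram (an abstract reader and an abstract converter, each with user-supplied subclasses) can be sketched as follows. RDBToOnto itself is written in Java on top of Jena, so this Python mirror is not its API. The method names are adapted from the diagram, while `InMemoryReader`, `OneClassPerTableConverter` and the dict-based stand-ins for `Database` and `OntModel` are hypothetical.

```python
from abc import ABC, abstractmethod


class DBReader(ABC):
    """Counterpart of the DBReader box: exposes a source in a neutral form."""

    @abstractmethod
    def get_database(self):
        """Return {table name: [column names]} (stand-in for Database)."""

    @abstractmethod
    def read_data(self, name):
        """Return the rows of table `name` as a list of tuples."""


class RDBToOntoConverter(ABC):
    """Counterpart of the converter box: neutral model in, ontology out."""

    def convert(self, db):
        # The real tool returns a Jena OntModel; a plain dict stands in here.
        return {"classes": [self.create_class(table) for table in sorted(db)]}

    @abstractmethod
    def create_class(self, table_name):
        """Return the name of the class derived from a table."""


class InMemoryReader(DBReader):
    """Hypothetical reader over Python dicts, where the diagram has
    MSAccessReader and DB2Reader."""

    def __init__(self, schema, data):
        self._schema = schema
        self._data = data

    def get_database(self):
        return dict(self._schema)

    def read_data(self, name):
        return list(self._data.get(name, []))


class OneClassPerTableConverter(RDBToOntoConverter):
    """Hypothetical converter that maps each table to a class of the same name."""

    def create_class(self, table_name):
        return table_name


if __name__ == "__main__":
    reader = InMemoryReader(
        {"Companies": ["Cage_Code", "Name"], "WorkPackages": ["WP Number"]},
        {"Companies": [("F0214", "Dassault-Aviation")]},
    )
    print(OneClassPerTableConverter().convert(reader.get_database()))
```

Keeping the reader and the converter behind separate abstractions means a new database format and a new learning method can be added independently of each other.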
The neutral database model
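[Class diagram: the neutral database model, input to any converter. Classes shown: Database, DBSchema, TableDef, Key, PrimaryKey, ForeignKey, Attribute, Table, Column, Values. A String field friendlyNames is shown, and the associations carry * multiplicities.]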
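
A rough impression of such a neutral model, written as plain data classes. Only the class names come from the diagram. The fields, the relationships between classes and the `referenced_table` attribute are assumptions, and `Column`, `Values` and `friendlyNames` are left out because their roles cannot be read from the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Attribute:
    name: str


@dataclass
class Key:
    attributes: List[Attribute]


@dataclass
class PrimaryKey(Key):
    pass


@dataclass
class ForeignKey(Key):
    referenced_table: str = ""  # assumed field, not shown on the slide


@dataclass
class TableDef:
    name: str
    attributes: List[Attribute] = field(default_factory=list)
    primary_key: Optional[PrimaryKey] = None
    foreign_keys: List[ForeignKey] = field(default_factory=list)


@dataclass
class DBSchema:
    tables: List[TableDef] = field(default_factory=list)


@dataclass
class Table:
    definition: TableDef
    rows: List[Tuple] = field(default_factory=list)

    def column(self, name):
        """Values of one attribute, in row order."""
        index = [a.name for a in self.definition.attributes].index(name)
        return [row[index] for row in self.rows]


@dataclass
class Database:
    schema: DBSchema
    tables: List[Table] = field(default_factory=list)


def example_database():
    """Two tables from the optimisation example, after duplicate removal."""
    code, name = Attribute("Cage_Code"), Attribute("Name")
    companies = TableDef("Companies", [code, name], PrimaryKey([code]))
    wp_number, wp_code = Attribute("WP Number"), Attribute("Company Code")
    work_packages = TableDef(
        "WorkPackages",
        [wp_number, wp_code],
        foreign_keys=[ForeignKey([wp_code], referenced_table="Companies")],
    )
    return Database(
        DBSchema([companies, work_packages]),
        [
            Table(companies, [("F0214", "Dassault-Aviation"), ("F564", "Messier-Dowty")]),
            Table(work_packages, [("34A", "F0214"), ("34", "F564")]),
        ],
    )
```

Because every converter receives the same structure, a method that inspects column extensions needs no knowledge of whether the rows came from MS Access, DB2 or another reader.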
Conclusion
- We presented comprehensive tool support for transitioning relational databases to ontologies
- RDBToOnto and the RTAXON method have been evaluated on significant databases
- RTAXON is just a first step, as many extensions can be studied
  - Learning two-level hierarchies
  - Automatically generating local constraints (e.g. naming patterns)
- More resources are available on the TAO project web site, including
  - User Guide and demos
  - Development Guide
  - A fully implemented sample showing how to extend the tool