© sebis, 030502-Wi-sebis-Master

Next-Generation User-Centered Information Management

Ontology-based Information Representation

Software Engineering betrieblicher Informationssysteme (sebis)

Ernst Denert-Stiftungslehrstuhl

Lehrstuhl für Informatik 19, Institut für Informatik

TU München

wwwmatthes.in.tum.de

Ontology-based Information Representation

Outline

Motivation

Semantic Models for Information Representation

Taxonomy

Thesaurus

Topic Map

Ontology

The Semantic Web

URI, XML, RDF, RDFS, OWL

Jena

Ontology-Based Information Visualization with Cluster Maps

Conclusion

Motivation (1)

Information Representation

Data: information resources described by concepts

Semantic Structure: select, filter, classify, merge... based on terms

Representation: organized information resources

Search for information

Visualize search results

Navigate through search results

[Figure: Data, Semantic Structure and Representation shown as successive stages, annotated "what" and "how"]

Motivation (2)

Metadata

Information about information resources

Object-based information representation

Example: Dublin Core

- Best-known vocabulary for metadata, a set of 15 properties describing an information resource

Document management properties: title, creator, publisher, date, language

Semantic properties: subject

Metadata about a document in a simple textfield without restrictions?

Context-based information representation

Grouping information resources by subjects they are about

Semantic models for information representation

Ontology-based Information Representation

Outline

Motivation

Semantic Models for Information Representation

Taxonomy

Thesaurus

Topic Map

Ontology

The Semantic Web

URI, XML, NS, XMLS

RDF, RDFS, OWL

Jena

Ontology-Based Information Visualization with Cluster Maps

Conclusion

Taxonomy (1)

Taxonomy

Biologically motivated: classification of organisms (Carl von Linné)

Classification that arranges terms into a hierarchy

Based on inheritance (is-a relationship)

[ABiilsma]

Taxonomy (2)

Taxonomy of Visual Elements

[JHugo]

Taxonomy (3)

Person Taxonomy

[Figure: tree rooted at Person. Nodes shown: Child, Adult, Boy, Girl, Man, Woman, Baby, Toddler, School-Boy, School-Girl, Student, Employee, Pensioner. Baby, Toddler and Student appear more than once.]

Taxonomy (4)

Properties of Taxonomies

Hierarchy based on inheritance (is-a relationship)

A mammal is an animal.

Grouping of related terms

No explicit definition of how terms relate

Synonyms

Terms with some degree of similarity

Redundancy when a subclass belongs to more than one superclass

Baby, Toddler and Student appear more than once in the Person taxonomy.
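
The redundancy noted above comes from drawing the hierarchy as a tree. As a minimal sketch (not the deck's own model), the is-a relationship can instead be stored as a graph in which each term exists once and may have several superclasses. The class name `TaxonomySketch` and the placement of Student under both Boy and Girl are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical is-a hierarchy; the term placement is assumed, not read off the figure. */
final class TaxonomySketch {
    // term -> direct superclasses (a term may have more than one)
    private final Map<String, Set<String>> parents = new HashMap<>();

    void addIsA(String term, String superTerm) {
        parents.computeIfAbsent(term, k -> new LinkedHashSet<>()).add(superTerm);
    }

    /** True if term equals ancestor or reaches it by following is-a links. */
    boolean isA(String term, String ancestor) {
        Deque<String> todo = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        todo.push(term);
        while (!todo.isEmpty()) {
            String current = todo.pop();
            if (current.equals(ancestor)) {
                return true;
            }
            if (seen.add(current)) {
                for (String p : parents.getOrDefault(current, Set.of())) {
                    todo.push(p);
                }
            }
        }
        return false;
    }

    static TaxonomySketch personExample() {
        TaxonomySketch t = new TaxonomySketch();
        t.addIsA("Child", "Person");
        t.addIsA("Adult", "Person");
        t.addIsA("Boy", "Child");
        t.addIsA("Girl", "Child");
        t.addIsA("Student", "Boy");   // assumed placement
        t.addIsA("Student", "Girl");  // same term, second superclass, stored once
        return t;
    }

    public static void main(String[] args) {
        TaxonomySketch t = personExample();
        System.out.println(t.isA("Student", "Person"));
        System.out.println(t.isA("Person", "Student"));
    }
}
```

The trade-off is that the structure is no longer a tree, so it cannot be drawn without crossing edges or repeated nodes.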

Thesaurus (1)

Thesaurus

Motivated by linguistics

Classification of terms based on inheritance, similarity and synonymity

ISO standards: ISO 2788 for monolingual and ISO 5964 for multilingual thesauri

[Creighton]

Thesaurus (2)

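Example of Thesaurus for "Person"

[Figure: the terms Child, Boy, Girl, Baby, Toddler, School-Boy, School-Girl and Student, connected by Similarity and Synonym relationships in addition to the hierarchy. The figure pairs Toddler with Baby, Student with School-Girl and Student with School-Boy.]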

Thesaurus (3)

Properties of Thesauri

Hierarchy based on inheritance (is-a relationship): same as taxonomy

Much richer vocabulary for describing relationships

Related term: term with similar meaning

USE: points from a synonym to the preferred term; UF (used for): the inverse

Property: scope note

annotation, string attached to the term explaining its meaning

Homonyms (same word, different meaning) cannot be distinguished

Still redundancy when a subclass belongs to more than one superclass

Baby, Toddler and Student appear more than once in the thesaurus.
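
The thesaurus relationships listed above (USE, UF, related term, scope note) can be sketched as a small data structure. This is an illustrative reading only: the class `ThesaurusSketch`, its method names, and the choice of which Person terms count as synonyms or related terms are assumptions, since the figure does not survive in this transcript.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical thesaurus with USE/UF, related terms and scope notes. */
final class ThesaurusSketch {
    private final Map<String, String> use = new HashMap<>();           // synonym -> preferred term
    private final Map<String, Set<String>> usedFor = new HashMap<>();  // preferred term -> synonyms
    private final Map<String, Set<String>> related = new HashMap<>();  // symmetric "related term"
    private final Map<String, String> scopeNotes = new HashMap<>();

    void addSynonym(String synonym, String preferred) {
        use.put(synonym, preferred);
        usedFor.computeIfAbsent(preferred, k -> new TreeSet<>()).add(synonym);
    }

    void addRelated(String a, String b) {
        related.computeIfAbsent(a, k -> new TreeSet<>()).add(b);
        related.computeIfAbsent(b, k -> new TreeSet<>()).add(a);
    }

    void setScopeNote(String term, String note) {
        scopeNotes.put(term, note);
    }

    /** Follows USE; a term without a USE entry is its own preferred term. */
    String preferredTerm(String term) {
        return use.getOrDefault(term, term);
    }

    Set<String> usedFor(String preferred) {
        return usedFor.getOrDefault(preferred, Set.of());
    }

    Set<String> relatedTerms(String term) {
        return related.getOrDefault(term, Set.of());
    }

    String scopeNote(String term) {
        return scopeNotes.get(term);
    }

    static ThesaurusSketch personExample() {
        ThesaurusSketch t = new ThesaurusSketch();
        t.addSynonym("School-Girl", "Student");  // assumed synonym pair
        t.addRelated("Toddler", "Baby");         // assumed similarity pair
        t.setScopeNote("Baby", "A very young child");  // invented note text
        return t;
    }

    public static void main(String[] args) {
        ThesaurusSketch t = personExample();
        System.out.println(t.preferredTerm("School-Girl"));
        System.out.println(t.relatedTerms("Baby"));
    }
}
```

Because terms are keyed by their plain string, two homonyms would collide in these maps, which is the limitation the slide points out.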

Topic Map (1)

Topic Map

Motivated by mathematical models of how long-term memory works

Classification of terms represented by topics based on

Inheritance

Similarity, synonyms

User-defined relationships

XML Topic Maps

Standard XML format for TM

Open Vocabulary

www.TopicMaps.org

[TM2]

Topic Map (2)

Information resource optionally identified by URI

Hierarchy of concepts, each represented by a topic described by

Name with the properties

- Scope – a set of topics representing a context

- Type – a set of topics, a kind of an association between topics

Occurrences (properties) connect a topic to an information resource; optionally scope and type

Association (Relationship); optionally scope and type

[TM3]

Topic Map (3)

Example of Topic Map for "Person"

[Figure: topics Person, Child, Adult, Boy, Girl, Baby, Toddler, School-Boy, School-Girl and Student with Similarity and Synonym links (Toddler/Baby, Student/School-Girl, Student/School-Boy); additional topics Name and Age; associations hasChild, hasParent, isChildOf and isSiblingOf.]

Topic Map (4)

Properties of Topic Maps

Flexible network of concepts structured by an open vocabulary

More powerful (precise) searches

Flexible navigation

Composition, association (user-defined relationship types) possible

Able to distinguish between homonyms due to the concept's type

Name and Age on the same conceptual level as Boy and Girl

Disambiguation of homonyms

Paris (France), Paris (Greek Mythology)

Still redundancy when a subclass belongs to more than one superclass

Model in its infancy
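
To illustrate how a topic's type lets a topic map tell homonyms apart, here is a minimal sketch. The class `TopicMapSketch`, the type names "City" and "Mythological figure", and the example.org occurrence URIs are all hypothetical; the slides only give the Paris example itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, much simplified topic map: topics carry a name, a type and occurrences. */
final class TopicMapSketch {
    static final class Topic {
        final String name;
        final String type;  // in a real topic map the type is itself a topic
        final List<String> occurrences = new ArrayList<>();  // URIs of information resources

        Topic(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    private final List<Topic> topics = new ArrayList<>();

    Topic add(String name, String type) {
        Topic t = new Topic(name, type);
        topics.add(t);
        return t;
    }

    /** A search by name alone returns every homonym. */
    List<Topic> byName(String name) {
        List<Topic> out = new ArrayList<>();
        for (Topic t : topics) {
            if (t.name.equals(name)) {
                out.add(t);
            }
        }
        return out;
    }

    /** Adding the type narrows the result to one meaning. */
    List<Topic> byNameAndType(String name, String type) {
        List<Topic> out = new ArrayList<>();
        for (Topic t : byName(name)) {
            if (t.type.equals(type)) {
                out.add(t);
            }
        }
        return out;
    }

    static TopicMapSketch parisExample() {
        TopicMapSketch tm = new TopicMapSketch();
        tm.add("Paris", "City").occurrences.add("http://example.org/paris-france");
        tm.add("Paris", "Mythological figure").occurrences.add("http://example.org/paris-myth");
        return tm;
    }

    public static void main(String[] args) {
        TopicMapSketch tm = parisExample();
        System.out.println(tm.byName("Paris").size());
        System.out.println(tm.byNameAndType("Paris", "City").get(0).occurrences);
    }
}
```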

Ontology (1)

Ontology

Originally motivated by philosophy: „the science of being“ (Aristotle)

Definition: „a formal explicit specification of a shared conceptualization“ (Gruber)

Vocabulary + Structure = Taxonomy

Taxonomy + Relationships, Constraints, Rules = Ontology

„Model for describing the world that consists of

- a set of types,

- properties, and

- a set of relationship types“ (Garshol)

Classification of terms for objects and individuals

Open set of terms

Open language for describing relationships

Ontology (2)

Ontology for "Person"

[Figure: classes Person, Child, Adult, Boy, Girl, Baby, Toddler, School-Boy, School-Girl and Student; properties Name and Age; relationships isChildOf, isSiblingOf, hasChild and hasParent; an example instance with Name "John Big" and Age "6 months".]

Rules

A isChildOf B, B isChildOf C => A isGrandChildOf C

A isChildOf B => B isParentOf A

A isChildOf B => A hasParent B
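
The three rules on this slide can be applied mechanically to a set of facts. The following is a minimal sketch of that idea, not a description of any particular reasoner: facts are plain (subject, property, object) string triples, the class name `FamilyRulesSketch` is hypothetical, and the person "Ann" in the demo is invented to give the grandchild rule something to fire on.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical one-pass application of the slide's three isChildOf rules. */
final class FamilyRulesSketch {
    /** Returns the given facts plus everything the three rules derive from them. */
    static Set<List<String>> infer(Set<List<String>> facts) {
        Set<List<String>> all = new LinkedHashSet<>(facts);
        for (List<String> f : facts) {
            if (!f.get(1).equals("isChildOf")) {
                continue;
            }
            String a = f.get(0);
            String b = f.get(2);
            all.add(List.of(b, "isParentOf", a));  // A isChildOf B => B isParentOf A
            all.add(List.of(a, "hasParent", b));   // A isChildOf B => A hasParent B
            for (List<String> g : facts) {
                // A isChildOf B, B isChildOf C => A isGrandChildOf C
                if (g.get(1).equals("isChildOf") && g.get(0).equals(b)) {
                    all.add(List.of(a, "isGrandChildOf", g.get(2)));
                }
            }
        }
        return all;
    }

    public static void main(String[] args) {
        Set<List<String>> facts = new LinkedHashSet<>();
        facts.add(List.of("John", "isChildOf", "Jane"));
        facts.add(List.of("Jane", "isChildOf", "Ann"));  // "Ann" is invented for the demo
        for (List<String> fact : infer(facts)) {
            System.out.println(String.join(" ", fact));
        }
    }
}
```

A single pass is enough here because none of the rules produces a new isChildOf fact; rule sets that feed their own premises would need to iterate until nothing new is derived.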

Ontology (3)

Properties of Ontologies

Clearly defined relationships (inverse, transitive, symmetrical... )

Constraints, rules

Open vocabulary

Machine-readability

Rule-based (logical) inferencing

Descriptive power

Precise searching, visualization, navigation

Managed redundancy

Easily extensible

Not only meta-model but also instances

Common standard between several parties

- Binding data from heterogeneous sources

The Semantic Web (1)

Motivation

Extend existing markup with semantic markup

Define a standard web ontology language

Common syntax in order to share semantics

Provide tools and services to help users to

Design and maintain high quality ontologies

Store instances of ontology classes

Query ontology classes and instances

Integrate and align multiple ontologies

The Semantic Web (2)

The Semantic Web

A product of W3C (World Wide Web Consortium) headed by Berners-Lee

Goal: lead W3 to its full potential

Develop common protocols

Control evolution of W3

Maintain interoperability of W3

Relational Data

Semantics and Reasoning

Data Exchange

XML (1)

XML and XML Schema

eXtensible Markup Language

Open vocabulary: extensibility

Strict syntax: well-formedness

Separation of content and presentation: different renderings of tree-like documents

XML Schema

Validity

NameSpace

URI that a vocabulary is associated with; it need not point to a document

- Uniform Resource Identifier (URI): the set of all addresses that refer to resources

- Resource: any object that a URI can point to

- URL: subtype of URI

Unambiguous interpretation of identifiers

RDF (1)

RDF

Resource Description Framework:

Standardization of description of resources

Extensible and flexible hierarchy based on XML

Open vocabulary: classes with properties and relationships

Namespaces: range and domain of properties, need be an existing document

Directed Graph built using statements

Statement specifies properties and values of web resources:

John (Object) name (Property) „John Big“ (Value)

John (Object) age (Property) „6 months“ (Value)

John (Object) isChildOf (Property) Jane (Object)

John (Object) isChildOf (Property) Tom (Object)
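
The four statements above differ in one respect that matters for the graph: two values are literals and two are resources. As a minimal sketch, the statements can be written out in N-Triples, a line-based W3C serialization of RDF, where resources appear in angle brackets and literals in quotes. The class `StatementSketch` is hypothetical; the URIs are the ones used in this deck's RDF graph example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder for one (object, property, value) statement. */
final class StatementSketch {
    final String subject;
    final String property;
    final String value;
    final boolean literal;  // true: value is a literal string, false: value is a resource URI

    StatementSketch(String subject, String property, String value, boolean literal) {
        this.subject = subject;
        this.property = property;
        this.value = value;
        this.literal = literal;
    }

    /** Renders the statement as one N-Triples line. */
    String toNTriples() {
        String object = literal
                ? "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\""
                : "<" + value + ">";
        return "<" + subject + "> <" + property + "> " + object + " .";
    }

    static List<StatementSketch> johnExample() {
        String john = "http://www.person.bgr/john";
        List<StatementSketch> list = new ArrayList<>();
        list.add(new StatementSketch(john, "http://www.person.bgr/name", "John Big", true));
        list.add(new StatementSketch(john, "http://www.person.bgr/age", "6 months", true));
        list.add(new StatementSketch(john, "http://www.family.org/isChildOf", "http://www.person.bgr/jane", false));
        list.add(new StatementSketch(john, "http://www.family.org/isChildOf", "http://www.person.bgr/tom", false));
        return list;
    }

    public static void main(String[] args) {
        for (StatementSketch s : johnExample()) {
            System.out.println(s.toNTriples());
        }
    }
}
```

Each printed line is one edge of the directed graph: the subject and a resource value are nodes that can have further edges, while a literal is always a leaf.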

RDF (2)

RDF Document: one description per resource with a list of properties

Description element

may be anonymous (no attributes)

possible attributes for a class (object) definition

- rdf:about to describe a resource (via URI) or

- rdf:ID to define a resource (via a fragment identifier without #)

Fundamental Concepts

Object: resource defined by URI

Property: resource

Value: resource or literal

Only fact-stating, basic data model for object, property, value

RDF schema vocabulary (RDF Schema Building Blocks)

RDF (3)

[Figure: RDF graph. The resource http://www.person.bgr/john is linked by http://www.family.org/isChildOf to http://www.person.bgr/jane and to http://www.person.bgr/tom, by http://www.person.bgr/name to the literal "John Big", and by http://www.person.bgr/age to the literal "6 months". A further edge labelled http://purl.org/dc/elements/1.1/creator points to mailto:[email protected].]

RDF (4)

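<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:person="http://www.person.bgr/"
         xmlns:family="http://www.family.org/">
  <rdf:Description rdf:about="http://www.big.bgr/john">
    <!-- literal values are element content; rdf:resource is only for URIs -->
    <person:name>John Big</person:name>
    <person:age>6 months</person:age>
    <family:isChildOf rdf:resource="http://www.person.org/jane"/>
    <family:isChildOf rdf:resource="http://www.person.org/tom"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.big.bgr" dc:creator="[email protected]"/>
</rdf:RDF>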

RDFS (1)

Valid RDF

Provides information about interpretation of RDF statements

Class definition

Subclass definition using rdfs:subClassOf

Subproperty definition using rdfs:subPropertyOf

Domain and Range restrictions

Example for Music use

<Music rdf:resource="http://www.music.bgr/"/>

RDFS (2)

<!DOCTYPE rdf:RDF [
  <!ENTITY rdf 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
  <!ENTITY rdfs 'http://www.w3.org/2000/01/rdf-schema#'>
]>
<rdf:RDF xmlns:rdf="&rdf;" xmlns:rdfs="&rdfs;">
  <rdf:Description rdf:ID="Music">
    <rdf:type rdf:resource="&rdfs;Class"/>
  </rdf:Description>
  <rdf:Description rdf:ID="Symphony">
    <rdf:type rdf:resource="&rdfs;Class"/>
    <rdfs:subClassOf rdf:resource="#Music"/>
  </rdf:Description>
  <rdf:Description rdf:ID="Concerto">
    <rdf:type rdf:resource="&rdfs;Class"/>
    <rdfs:subClassOf rdf:resource="#Music"/>
  </rdf:Description>
</rdf:RDF>

RDFS (3)

Weakness of RDFS: it cannot describe resources in sufficient detail

No localized range and domain constraints: the range of hasChild is

- person when applied to person

- animal when applied to animal

No cardinality constraints:

- Person has exactly two parents

No existence constraints:

- all instances of person have a mother that is also a person

No transitive, inverse, symmetrical properties:

- isDescendantOf is a transitive property (isChildOf itself is not: the child of a child is a grandchild)

- isChildOf is the inverse of isParentOf

- isSiblingOf is symmetrical
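
Because RDFS cannot state a constraint such as "Person has exactly two parents", an application that needs it has to check it by hand. The sketch below shows such a check over plain string triples; it is a closed-world count over the statements that happen to be present, which is an assumption of this example and not how an OWL reasoner treats cardinality. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical hand-written check for "every person has exactly two parents". */
final class CardinalityCheckSketch {
    /** Returns the persons whose number of distinct isChildOf values is not 2. */
    static List<String> withoutExactlyTwoParents(List<String> persons, List<String[]> statements) {
        Map<String, Set<String>> parentsOf = new HashMap<>();
        for (String[] st : statements) {
            // st = {subject, property, object}
            if ("isChildOf".equals(st[1])) {
                parentsOf.computeIfAbsent(st[0], k -> new HashSet<>()).add(st[2]);
            }
        }
        List<String> violations = new ArrayList<>();
        for (String person : persons) {
            Set<String> parents = parentsOf.get(person);
            int count = parents == null ? 0 : parents.size();
            if (count != 2) {
                violations.add(person);
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        List<String[]> statements = new ArrayList<>();
        statements.add(new String[] {"John", "isChildOf", "Jane"});
        statements.add(new String[] {"John", "isChildOf", "Tom"});
        System.out.println(withoutExactlyTwoParents(List.of("John", "Jane", "Tom"), statements));
    }
}
```

With a language that supports cardinality constraints, this rule lives in the ontology itself instead of being repeated in every application.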

OWL (1)

OWL

Web Ontology Language

General Public Licence

Based on RDF; open vocabulary

Logical combinations of classes (union, intersection, complement)

Extended properties: transitive, symmetrical, inverse

Web Ontology Language Requirements

Easy to understand and use

Formally specified, of adequate expressive power

Providing an automated reasoning support

OWL (2)

OWL Types

OWL Full

Greatest expressive power

OWL DL

Extension of the description logic (DL) subset of RDF

Well-defined semantics

User-friendly syntax

OWL Lite

Simple syntax, tractable inference

[OWL]

OWL (3)

Example of Ontology for two books about African Lion

OWL (4)

Example of Ontology for "Man"

<owl:Class rdf:ID="Man">
  <rdfs:subClassOf rdf:resource="#Person"/>
  <rdfs:subClassOf rdf:resource="#Adult"/>
  <owl:disjointWith rdf:resource="#Woman"/>
</owl:Class>

Example of Ontology for Property "isChildOf"

<owl:ObjectProperty rdf:ID="isChildOf">
  <owl:inverseOf rdf:resource="#isParentOf"/>
</owl:ObjectProperty>

OWL (5)

Extension towards including instances

Use of OWL and Ontologies

Data integration: ontology mapping

- Minimization of intellectual effort involved in developing an ontology by re-use

- Composition of ontologies and adoption

Data interchange: Jena

Data querying: RDQL

Data visualization: Cluster Maps

Jena (1)

Jena Semantic Web Toolkit (Open Source, HP)

Java framework for building Semantic Web applications

OWL Lite based on RDF

Jena (2)

Jena Architecture

ModelFactory creates an empty model to which resources, properties and statements can be added

Model model = ModelFactory.createDefaultModel();

ModelFactory

createDefaultModel(): Model

Jena (4)

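Model

createResource(String): Resource

createProperty(String): Property

createStatement(Resource, Property, RDFNode): Statement

Creation of resources, properties and statements

// familyURI and relationshipURI are String constants defined elsewhere
Resource john = model.createResource(familyURI + "john");
Resource jane = model.createResource(familyURI + "jane");
Property childOf = model.createProperty(relationshipURI);
Statement statement = model.createStatement(john, childOf, jane);
model.add(statement); // createStatement alone does not add the statement to the model

Querying of a model

// null acts as a wildcard; the cast selects the RDFNode overload
NodeIterator parents = model.listObjectsOfProperty(childOf);
StmtIterator johnsParents = model.listStatements(john, childOf, (RDFNode) null);

Model

listStatements(Resource, Property, RDFNode): StmtIterator

listObjectsOfProperty(Property): NodeIterator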
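
The query `listStatements(john, childOf, null)` treats null as "match anything". The idea behind such a pattern query can be sketched without Jena, using only the standard library. This is an illustration of the matching principle and not Jena's implementation; the class `PatternMatchSketch` and the string-array representation of statements are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical triple-pattern matcher; null in a position matches any value. */
final class PatternMatchSketch {
    /** Returns all statements {subject, property, object} that fit the pattern. */
    static List<String[]> match(List<String[]> statements, String s, String p, String o) {
        List<String[]> out = new ArrayList<>();
        for (String[] st : statements) {
            boolean subjectOk = s == null || s.equals(st[0]);
            boolean propertyOk = p == null || p.equals(st[1]);
            boolean objectOk = o == null || o.equals(st[2]);
            if (subjectOk && propertyOk && objectOk) {
                out.add(st);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> data = new ArrayList<>();
        data.add(new String[] {"john", "childOf", "jane"});
        data.add(new String[] {"john", "childOf", "tom"});
        data.add(new String[] {"john", "name", "John Big"});
        for (String[] st : match(data, "john", "childOf", null)) {
            System.out.println(String.join(" ", st));
        }
    }
}
```

A linear scan like this is fine for a handful of statements; a real store would index statements by subject, property and object.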

Jena (5)

Addition of properties to subjects

john.addProperty(childOf, jane);

Querying of properties

john.listProperties(siblingOf);

Resource

addProperty(Property, RDFNode): Resource

listProperties(Property): StmtIterator

Jena (6)

RDF Data Query Language (RDQL)

Keywords: SELECT, FROM, WHERE, AND, USING

SELECT ?x
WHERE (?x, <http://www.family.org/child#>, "John Big")

==================

http://www.big.bgr/john

==================

SELECT ?resource
FROM <http://www.big.bgr>
WHERE (?resource, info:age, ?age)
AND ?age >= 2
USING info FOR <http://www.big.bgr/peopleInfo#>

===================

http://www.big.bgr/jane

http://www.big.bgr/tom

Cluster Maps (1)

Clustering based on similarity

Tasks:

Data Analysis: different ontologies, same dataset

Data comparison: same ontology, multiple data sets

Query relaxation: find result set to queries for which no exact matches exist

Data Analysis: Search on jobs offered by economic sector

Visible size

Differentiation

Cluster Maps (2)

Data Analysis: Search on jobs offered by economic sector

Various overlaps

Cluster Maps (3)

Data Analysis: Search on jobs offered by region

Visible size

Geographical closeness is preserved

Cluster Maps (4)

Data Comparison: services offered by two banks

Same ontology, different data sets

Cluster Maps (5)

Query relaxation: query about a holiday in France

colour intensity for the cases

- no exact matches

- matches based on query relaxation

Cluster Maps (6)

Clustering based on similarity for Search, Navigation, Visualization

Advantages

Visible and configurable size of the result set

Similarity between the instances of the result set

Intuitive search and navigation process
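
One plausible reading of "clustering based on similarity", stated here as an assumption and not as the documented algorithm of any cluster map tool, is that items belonging to exactly the same set of ontology classes form one cluster, so that overlaps between classes become visible as clusters of their own. The sketch below groups items that way; the class `ClusterSketch` and the job and sector names are invented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical grouping of items by the exact set of classes they belong to. */
final class ClusterSketch {
    /** Maps each distinct class set to the items that carry exactly that set. */
    static Map<Set<String>, List<String>> cluster(Map<String, Set<String>> classesOfItem) {
        Map<Set<String>, List<String>> clusters = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> e : classesOfItem.entrySet()) {
            Set<String> key = new TreeSet<>(e.getValue());  // copy, so the key is never mutated
            clusters.computeIfAbsent(key, k -> new ArrayList<>()).add(e.getKey());
        }
        return clusters;
    }

    static Map<String, Set<String>> jobExample() {
        Map<String, Set<String>> jobs = new LinkedHashMap<>();
        jobs.put("job1", Set.of("IT"));
        jobs.put("job2", Set.of("IT", "Finance"));
        jobs.put("job3", Set.of("Finance", "IT"));
        jobs.put("job4", Set.of("Finance"));
        return jobs;
    }

    public static void main(String[] args) {
        for (Map.Entry<Set<String>, List<String>> e : cluster(jobExample()).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().size() + " item(s)");
        }
    }
}
```

The size of each list gives the "visible size" of a cluster, and a key with more than one class corresponds to an overlap region between classes.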

Conclusion

Use of Ontologies

[Ont15]

User-centered Information

Management!

Context-dependent Information

Personalized Information

Information Sharing

Information Visualization

Share your opinion ...

Can we expect maturity in the field of ontology engineering in 5, 10, 15 years from now?

Is there a way to make information find you rather than look for it?

Is XML the best format to build on? How does it influence ontologies today?

References

[ABiilsma] Allard Biilsma. De Rode Planeet. www.drp.nl/openmind/voorbeelden.htm

[JHugo] Jacques Hugo. Visual Literacy and Software Design. http://www.chi-sa.org.za/articles/vislit2.htm

[Creighton] Technology in the Secondary Schools. http://spahp.creighton.edu/chapman/EDU342/lesson3word/thesaurus_word.htm

[TM1] www.media-style.com/gfx/assets/topicmap.gif

[TM2] http://www.ontopia.net/topicmaps/materials/tm-vs-thesauri.html

[TM3] http://sys-con.com/xml

[CM1] www.touchgraph.com/news2001.html

[CM2] www.infovis.net/

[OWL] http://www.cs.vu.nl/~guus/public/2004-webont-zeist/all.htm

[RDFS] http://www.kanzaki.com/docs/sw/

Application Area Search Engine

Graphical Representation of the results of a search engine

Source: www.kartoo.com

Topic Map

Topic Map of Documents

Distances between topics proportional to semantics

Colour intensity proportional to percentage

[TM1]