Ontology engineering: Ontology alignment
guus-schreiber -
Ontology Alignment
Course “Ontology Engineering”
Goals of the lecture
Understand why ontology alignment is done
Know what constructs can be used to express an alignment between two concepts
Know what options there are to find mappings
Agenda
Why ontology alignment?
Alignment relations
Alignment techniques
Why is Ontology Alignment done?
Interoperability problem II
A private company wants to participate in a marketplace
E.g. eBay: Home > Buy > Cameras & Photo > Digital Cameras >
Digital SLR > Nikon > D40
Needed: correspondences between entries of its catalogs and entries of a common catalog of a marketplace.
Example use of vocabulary alignment
“Tokugawa”
SVCN period Edo
SVCN is local in-house ethnology thesaurus
AAT style/period Edo (Japanese period) Tokugawa
AAT is Getty’s Art & Architecture Thesaurus
Alignment architecture for P2P
Two kinds of interoperability
Syntactic interoperability
– Using data formats that you can share
– XML family is the preferred option
Semantic interoperability
– How to share meaning / concepts
– Technology for finding and representing semantic links
Reusing vocabularies
The myth of a unified vocabulary
There will always be multiple ontologies
Partly overlapping
In multiple languages
Each with their own perspective
Links between ontologies
“Ontology Alignment” / “Ontology Mapping”
– Use ontologies jointly by defining a limited set of links
– Benefit from knowledge encoded in the other ontology
– Enable access across applications/collections
– Partial by nature!
Why ontology alignment?
Summary:
There is no single ontology of the world
People work with different viewpoints and thus multiple conceptualizations
But: these concepts often overlap
Semantic relations between ontologies help to integrate information sources
Currently seen as a major issue in the development of distributed (web) systems
How do we represent the alignment between two concepts?
Link types between concepts in different ontologies
Equality: owl:sameAs (individual to individual), e.g. “Den Haag” = “The Hague”
Equivalence: owl:equivalentClass (class to class), e.g. wood-material = wood
Subclass: rdfs:subClassOf (class to class), e.g. aat:Artist is a subclass of wn:Artist
Instance of: rdf:type (individual to class), e.g. tgn:Africa is an instance of wn:Continent
Disjoint: owl:disjointWith (class to class), e.g. aat:wood is disjoint with wn:plastic
Types of links between concepts in different thesauri
SKOS mapping properties
skos:mappingRelation
- skos:closeMatch
- skos:exactMatch
- skos:broadMatch
- skos:narrowMatch
- skos:relatedMatch
- skos:closeMatch: symmetric property
- skos:exactMatch: subPropertyOf skos:closeMatch; transitive property; symmetric property
- skos:broadMatch: subPropertyOf skos:broader; inverseOf skos:narrowMatch
- skos:narrowMatch: subPropertyOf skos:narrower; inverseOf skos:broadMatch
- skos:relatedMatch: subPropertyOf skos:related; symmetric property
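The property characteristics listed above determine which extra statements follow from a single mapping. The following is a minimal sketch of that idea, not a SKOS reasoner: it applies only the symmetric, inverse and sub-property rules (transitivity of skos:exactMatch is left out), and the concept identifiers svcn:Edo and aat:Edo are hypothetical placeholders.

```python
# Characteristics of the SKOS mapping properties, as listed on the slide.
SYMMETRIC = {"skos:closeMatch", "skos:exactMatch", "skos:relatedMatch"}
INVERSE = {
    "skos:broadMatch": "skos:narrowMatch",
    "skos:narrowMatch": "skos:broadMatch",
}
SUPER_PROPERTY = {
    "skos:exactMatch": "skos:closeMatch",
    "skos:broadMatch": "skos:broader",
    "skos:narrowMatch": "skos:narrower",
    "skos:relatedMatch": "skos:related",
}


def entailed(triples):
    """Return the input triples plus everything the three rules derive."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(closed):
            new = set()
            if p in SYMMETRIC:
                new.add((o, p, s))
            if p in INVERSE:
                new.add((o, INVERSE[p], s))
            if p in SUPER_PROPERTY:
                new.add((s, SUPER_PROPERTY[p], o))
            if not new <= closed:
                closed |= new
                changed = True
    return closed


# Hypothetical identifiers, used only for illustration.
mapping = {("svcn:Edo", "skos:exactMatch", "aat:Edo")}
for triple in sorted(entailed(mapping)):
    print(triple)
```

A single exactMatch statement yields the reverse exactMatch plus a closeMatch in both directions, which is why exactMatch is the strongest of the mapping links.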
Example: partial alignment between citations
Example: alignment between XML Schemas
Example: alignment between thesauri
Types of links between properties in different ontologies
Links between properties: equivalentProperty, subPropertyOf, inverseOf
E.g. painterOf – creatorOf
Trick: wn:hyponym subPropertyOf rdfs:subClassOf
Types of links between concepts in different ontologies
Domain-specific links
– Van Gogh (ULAN) born-in Groot-Zundert (TGN)
– Derain (ULAN) related-to Fauve (AAT)
– Wandelkaart Pyreneeën RANDO.07 Haute-Ariège - Vicdessos (Pied à Terre) related-to Pyrénées (TGN)
– Part-of relations
Alignment Techniques
Alignment tools
Input: two ontologies, each consisting of a set of discrete entities
• HTML table headers
• XML elements
• Classes
• Properties
Output: relationships holding between these entities (equivalence, subsumption, etc.) + confidence measure
Cardinality (e.g., 1:1, 1:m)
Alignment techniques
Syntax: comparison of characters of the terms
– Measures of syntactic distance
– Language processing
• E.g. tokenization, singular/plural
Relate to lexical resource
– Relate terms to place in WordNet hierarchy
Taxonomy comparison
– Look for common parents/children in taxonomy
Instance-based mapping
– Two classes are similar if their instances are similar
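The instance-based idea can be illustrated with a set-overlap score. This is a minimal sketch under two assumptions that the slides do not state: instances in both ontologies share identifiers, and Jaccard overlap is used as the similarity measure. The function name and the sample instance sets are hypothetical.

```python
def jaccard_similarity(instances_a, instances_b):
    """Jaccard overlap of two instance sets: |A & B| / |A | B|."""
    a, b = set(instances_a), set(instances_b)
    if not a and not b:
        return 0.0  # no evidence either way
    return len(a & b) / len(a | b)


# Hypothetical instance sets of two classes from different catalogs.
shop_digital_slr = {"D40", "D80", "EOS 400D", "K10D"}
market_digital_slr = {"D40", "D80", "EOS 400D", "Alpha 100"}

score = jaccard_similarity(shop_digital_slr, market_digital_slr)
print(score)  # 3 shared instances out of 5 distinct ones
```

A threshold on this score could then decide whether the two classes are proposed as a mapping candidate.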
String-based techniques (1)
Exact string match
Prefix
– takes as input two strings and checks whether the first string starts with the second one
– net = network; but also hot = hotel
Suffix
– takes as input two strings and checks whether the first string ends with the second one
– ID = PID; but also word = sword
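The prefix and suffix checks above map directly onto standard string operations. A minimal sketch, assuming case-insensitive comparison (the slides do not say how case is handled); the function names are illustrative only.

```python
def prefix_match(first, second):
    """True if `first` starts with `second`, ignoring case."""
    return first.lower().startswith(second.lower())


def suffix_match(first, second):
    """True if `first` ends with `second`, ignoring case."""
    return first.lower().endswith(second.lower())


print(prefix_match("network", "net"))  # intended match
print(prefix_match("hotel", "hot"))    # also matches: a false positive
print(suffix_match("PID", "ID"))       # intended match
print(suffix_match("sword", "word"))   # also matches: a false positive
```

All four calls return True, which shows why these tests are cheap but need a second check before a mapping is accepted.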
String-based techniques (2)
Edit distance
– takes as input two strings and calculates the number of edit operations (e.g., insertions, deletions, substitutions) of characters required to transform one string into another, normalized by the length of the longer string
– EditDistance(NKN, Nikon) = 0.4 (2/5)
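The normalized edit distance can be sketched with the standard Levenshtein dynamic program. One assumption is needed to reproduce the slide's value: the strings are compared case-insensitively, since NKN and Nikon differ by two insertions only after lowercasing. The function names are illustrative.

```python
def levenshtein(s, t):
    """Minimum number of insertions, deletions and substitutions
    needed to turn s into t (row-by-row dynamic programming)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(s, t):
    """Edit distance divided by the length of the longer string.
    Assumption: comparison is case-insensitive."""
    s, t = s.lower(), t.lower()
    longest = max(len(s), len(t))
    if longest == 0:
        return 0.0
    return levenshtein(s, t) / longest


print(normalized_edit_distance("NKN", "Nikon"))  # 2 edits / 5 characters = 0.4
```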
Language-based techniques
Tokenization
– parses names into tokens by recognizing punctuation, cases
– Hands-Free Kits => hands, free, kits
Lemmatization
– analyses tokens morphologically in order to find all their possible basic forms
– Kits => Kit
Elimination
– discards “empty” tokens that are articles, prepositions, conjunctions . . .
– a, the, by, type of, their, from
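The three steps above can be chained into a small normalization pipeline. This is a rough sketch: the lemmatizer is a naive plural-stripping stand-in for real morphological analysis, the stop-word set covers only the single-word examples from the slide (the phrase "type of" is left out for simplicity), and all names are hypothetical.

```python
import re

# Single-word "empty" tokens taken from the slide's examples.
STOP_WORDS = {"a", "the", "by", "of", "their", "from"}


def tokenize(name):
    """Split on punctuation, whitespace and lower-to-upper case changes."""
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", name)
    return [tok.lower() for tok in re.split(r"[^A-Za-z0-9]+", spaced) if tok]


def naive_lemma(token):
    """Very rough stand-in for lemmatization: strip a plural 's'."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def normalize(name):
    """Tokenize, drop stop words, then reduce tokens to a basic form."""
    return [naive_lemma(t) for t in tokenize(name) if t not in STOP_WORDS]


print(tokenize("Hands-Free Kits"))   # ['hands', 'free', 'kits']
print(normalize("Hands-Free Kits"))  # ['hand', 'free', 'kit']
```

Two labels can then be compared on their normalized token lists instead of their raw strings.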
Linguistic techniques using WordNet senses
A subClassOf B if A is a hyponym of B
– Pine subClassOf Tree
A hasPart B if A is a holonym of B
– Europe hasPart Greece
A = B if they are synonyms
– Quantity = Amount
A disjoint B if they are antonyms or are siblings in the same part of the hierarchy
– Pine disjoint Oak
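The rules above can be tried out on a toy lexicon. The sketch below does not query WordNet itself: the three small dictionaries are hypothetical stand-ins built from the slide's examples, antonyms are not modelled, and part-of is checked for direct links only.

```python
# Toy stand-in for WordNet, limited to the slide's examples.
HYPERNYM = {"pine": "tree", "oak": "tree", "tree": "plant"}  # term -> parent
PART_OF = {"greece": "europe"}                                # part -> whole
SYNONYMS = [{"quantity", "amount"}]


def ancestors(term):
    """All hypernyms of a term, nearest first."""
    found = []
    while term in HYPERNYM:
        term = HYPERNYM[term]
        found.append(term)
    return found


def relation(a, b):
    """Suggest an alignment relation between two terms, or None."""
    a, b = a.lower(), b.lower()
    if any(a in group and b in group for group in SYNONYMS):
        return "equivalent"
    if b in ancestors(a):
        return "subClassOf"
    if PART_OF.get(b) == a:
        return "hasPart"
    if a != b and a in HYPERNYM and HYPERNYM[a] == HYPERNYM.get(b):
        return "disjoint"  # siblings under the same parent
    return None


print(relation("Pine", "Tree"))        # subClassOf
print(relation("Europe", "Greece"))    # hasPart
print(relation("Quantity", "Amount"))  # equivalent
print(relation("Pine", "Oak"))         # disjoint
```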
Linguistic techniques: gloss-based
WordNet gloss comparison
– The number of the same words occurring in both input glosses increases the similarity value.
– The equivalence relation is returned if the resulting similarity value exceeds a given threshold
– Maltese dog is a breed of toy dogs having a long straight silky white coat
– Afghan hound is a tall graceful breed of hound with a long silky coat
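Gloss comparison can be sketched as counting shared words between the two glosses. This is a minimal illustration: the stop-word list is an assumption of this sketch, and the slides do not specify how the count is turned into a similarity value or what threshold is used.

```python
import re

# Assumed stop words; function words are ignored when comparing glosses.
STOP_WORDS = {"a", "an", "the", "of", "with", "is", "having"}


def gloss_words(gloss):
    """Set of lower-cased content words in a gloss."""
    return set(re.findall(r"[a-z]+", gloss.lower())) - STOP_WORDS


def gloss_overlap(gloss_a, gloss_b):
    """Words that occur in both glosses."""
    return gloss_words(gloss_a) & gloss_words(gloss_b)


maltese = "a breed of toy dogs having a long straight silky white coat"
afghan = "a tall graceful breed of hound with a long silky coat"

shared = gloss_overlap(maltese, afghan)
print(sorted(shared))  # ['breed', 'coat', 'long', 'silky']
print(len(shared))     # 4
```

Whether four shared words are enough to return a relation depends entirely on the chosen threshold.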
Structural technique: taxonomy comparison
Techniques for Part-of Relations
Phrase (Hearst) patterns:
add <part> to <whole>
<whole> is made of <part>
<part> gives the <whole> its
<whole>-containing <part>
<whole> consists of <part>
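Phrase patterns of this kind can be applied to text with regular expressions. The sketch below covers three of the patterns listed above and assumes that part and whole are single words; the sample sentences are invented for illustration.

```python
import re

WORD = r"[A-Za-z]+"  # assumption: part and whole are single words

PATTERNS = [
    rf"\badd (?P<part>{WORD}) to (?P<whole>{WORD})",
    rf"\b(?P<whole>{WORD}) is made of (?P<part>{WORD})",
    rf"\b(?P<whole>{WORD}) consists of (?P<part>{WORD})",
]


def extract_part_of(text):
    """Return (part, whole) pairs found by any of the patterns."""
    pairs = []
    for pattern in PATTERNS:
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            pairs.append((match.group("part").lower(),
                          match.group("whole").lower()))
    return pairs


# Invented sample text.
text = "Bread is made of flour. Add salt to soup. The team consists of players."
print(extract_part_of(text))
```

Each extracted pair is only a candidate part-of relation and would normally be counted over a large corpus before being trusted.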
Overview of alignment techniques
Alignment issues (1)
Nature of the input
– Underlying data models
– Schema-level vs. instance-level
– Example: link WordNet to Wikipedia
Interpretation of the output
– Approximate vs. exact
– Graded vs. absolute confidence
Performance varies, hence semi-automatic alignment
Involving the human in alignment evaluation
Evaluation of alignments
Judging individual alignments
– Precision
Comparison to a reference alignment
– Recall
– Precision?
Comparing the logical consequences of the models
End-to-end evaluation
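Comparison to a reference alignment uses the usual definitions: precision is the share of found mappings that are in the reference, and recall is the share of reference mappings that were found. A minimal sketch, with hypothetical mapping pairs and exact mappings only (no confidence values):

```python
def precision_recall(found, reference):
    """Precision and recall of a set of mappings against a reference set."""
    found, reference = set(found), set(reference)
    correct = found & reference
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical mappings, written as (source concept, target concept).
reference = {("a:1", "b:1"), ("a:2", "b:2"), ("a:3", "b:3"),
             ("a:4", "b:4"), ("a:5", "b:5")}
found = {("a:1", "b:1"), ("a:2", "b:2"), ("a:3", "b:3"), ("a:4", "b:9")}

precision, recall = precision_recall(found, reference)
print(precision)  # 3 correct out of 4 found = 0.75
print(recall)     # 3 correct out of 5 in the reference = 0.6
```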
The intrinsic fuzziness of alignment
[Figure: WordNet, AAT]
Literature / acknowledgment
Some slides from this lecture are based on a tutorial by Pavel Shvaiko and Jerome Euzenat: http://dit.unitn.it/~accord/Presentations/ESWC'05-MatchingHandOuts.pdf
Some slides are from Antoine Isaac (STICH)