Leveraging Data and Structure in Ontology Integration
![Page 1: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/1.jpg)
Leveraging Data and Structure in Ontology Integration
Octavian Udrea¹
Lise Getoor¹
Renée J. Miller²
¹ University of Maryland, College Park
² University of Toronto
![Page 2: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/2.jpg)
Contents
• Motivation and goals
• Short overview of OWL Lite
• The ILIADS method
• Experimental evaluation
![Page 3: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/3.jpg)
ILIADS
Goal: Produce high-quality integration via a flexible method able to adapt to a wide variety of ontology sizes and structures
Method:
• Combine statistical and logical inference
• Use schema (structure) and data (instances) effectively
Solution: Integrated Learning In Alignment of Data and Schema (ILIADS)
![Page 4: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/4.jpg)
Contributions
• Show how to combine statistical and logical inference effectively
• Show that a small amount of inference yields a large gain in quality
• Show that the parameters needed to perform inference over data and structure are robust
• Provide a thorough evaluation on 30 pairs of real-world ontologies (with ground truth)
![Page 5: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/5.jpg)
![Page 6: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/6.jpg)
Example OWL Lite ontologies
(discoveredBy, owl:inverseOf, discoverer); (discoveredBy, owl:type, owl:FunctionalProperty)
(discoveredBy, owl:inverseOf, discoverer); (associatedWith, owl:type, owl:TransitiveProperty)
(resultsFrom, rdfs:subPropertyOf, associatedWith)
![Page 7: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/7.jpg)
![Page 8: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/8.jpg)
![Page 9: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/9.jpg)
Example OWL Lite ontologies
An entity can be a:
• Class
• Instance
• Property
![Page 10: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/10.jpg)
![Page 11: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/11.jpg)
Inference in OWL Lite
![Page 12: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/12.jpg)
Inference in OWL Lite
![Page 13: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/13.jpg)
Inference in OWL Lite
![Page 14: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/14.jpg)
The integration problem
![Page 15: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/15.jpg)
The integration problem
![Page 16: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/16.jpg)
The integration problem
![Page 17: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/17.jpg)
The integration problem
![Page 18: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/18.jpg)
![Page 19: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/19.jpg)
State of the art
• Robust statistical methods
  • Well-known similarity measures
  • Used for matching data (entities) and schema
  • May use graph structure of schema
• Logical inference
  • Not combined with statistical inference
  • Basis for most schema mapping and ontology integration methods
  • Approaches integrate schema, but not data
![Page 20: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/20.jpg)
Issues
• How to combine statistical inference with logical inference
  • Takes into account data, structure, etc., so it’s no longer obvious
  • In particular, how to quantify the results of logical inference into a similarity-like form?
• How to do logical inference in a tractable manner
  • For OWL Lite, inference over the entire ontology is EXPTIME-complete in the worst case
![Page 21: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/21.jpg)
The ILIADS algorithm
repeat until no more candidates
  1. Compute local similarities
  2. Select promising candidates
  3. For each candidate
     a. Select relationship
     b. Perform logical inference
     c. Update score with the inference similarity
  4. Select the candidate with the best score
end
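The loop above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function `integrate` and its callable parameters (`local_sim`, `select_relationship`, `infer`, `update_score`) are hypothetical names, and dropping every remaining pair that shares an entity with the accepted pair is an assumption made here so that the loop terminates.

```python
from typing import Callable, Iterable, List, Tuple

Pair = Tuple[str, str]


def integrate(
    pairs: Iterable[Pair],
    local_sim: Callable[[Pair], float],
    select_relationship: Callable[[Pair], str],
    infer: Callable[[Pair, str], List[Pair]],
    update_score: Callable[[float, List[float]], float],
    threshold: float,
) -> List[Tuple[str, str, str]]:
    """Greedy alignment loop; every callable is a hypothetical plug-in."""
    remaining = set(pairs)
    alignment = []
    while True:
        # 1. Compute local similarities.
        sims = {p: local_sim(p) for p in remaining}
        # 2. Select promising candidates (similarity above the threshold).
        candidates = [p for p in remaining if sims[p] > threshold]
        if not candidates:
            break
        # 3. Score each candidate using its logical consequences.
        scored = []
        for cand in candidates:
            relation = select_relationship(cand)              # 3a
            consequences = infer(cand, relation)              # 3b
            cons_sims = [local_sim(c) for c in consequences]
            score = update_score(sims[cand], cons_sims)       # 3c
            scored.append((score, cand, relation))
        # 4. Keep the candidate with the best score.
        _, (e1, e2), relation = max(scored, key=lambda item: item[0])
        alignment.append((e1, relation, e2))
        # Assumption: each entity takes part in at most one accepted pair.
        remaining = {p for p in remaining if p[0] != e1 and p[1] != e2}
    return alignment
```

Passing the four steps in as functions keeps the sketch independent of how similarities and inference are actually computed.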
![Page 22: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/22.jpg)
![Page 23: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/23.jpg)
Computing local similarities
• sim_lexical: Jaro-Winkler and WordNet
• sim_structural: Jaccard for neighborhoods
• sim_extensional: Jaccard on extensions
• Parameters: λx, λs, λe (different for classes, instances and properties)

$$\mathrm{sim}(e,e') = \lambda_x\,\mathrm{sim}_{lexical}(e,e') + \lambda_s\,\mathrm{sim}_{structural}(e,e') + \lambda_e\,\mathrm{sim}_{extensional}(e,e')$$
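The weighted combination of the three local similarities can be illustrated with a short Python sketch. Several things here are assumptions rather than the method from the slides: `SequenceMatcher` from the standard library stands in for Jaro-Winkler plus WordNet, neighborhoods and extensions are plain sets that are assumed to be directly comparable, and the default weights are illustrative only.

```python
from difflib import SequenceMatcher


def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient |a & b| / |a | b|, defined as 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def lexical_sim(name1: str, name2: str) -> float:
    # Stand-in for Jaro-Winkler + WordNet: a generic string similarity ratio.
    return SequenceMatcher(None, name1.lower(), name2.lower()).ratio()


def local_sim(name1, name2, neigh1, neigh2, ext1, ext2,
              lam_x=0.2, lam_s=0.5, lam_e=0.3):
    """Weighted sum of lexical, structural and extensional similarity."""
    return (lam_x * lexical_sim(name1, name2)
            + lam_s * jaccard(neigh1, neigh2)
            + lam_e * jaccard(ext1, ext2))
```

In practice the weights would be chosen separately for classes, instances and properties, as the slide notes.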
![Page 24: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/24.jpg)
![Page 25: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/25.jpg)
Selecting promising candidates
1. Select candidates with sim(e,e’) > λt
2. Use a policy based on entity type to order, e.g.:
   • Class alignments first
   • Instance alignments first
   • Alternate between classes and instances
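A minimal Python sketch of these two steps is given below. The names `select_candidates` and `POLICIES` are hypothetical, the position of properties in each ordering is an assumption, and the alternating policy is left out for brevity.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]

# Hypothetical ordering policies: a lower rank is processed first.
POLICIES = {
    "classes_first": {"class": 0, "instance": 1, "property": 2},
    "instances_first": {"instance": 0, "class": 1, "property": 2},
}


def select_candidates(sims: Dict[Pair, float], kinds: Dict[Pair, str],
                      lam_t: float, policy: str = "classes_first") -> List[Pair]:
    """Keep pairs with sim > lam_t, ordered by entity type and then by similarity."""
    rank = POLICIES[policy]
    promising = [p for p, s in sims.items() if s > lam_t]
    return sorted(promising, key=lambda p: (rank[kinds[p]], -sims[p]))
```

Within one entity type, the sketch breaks ties by descending similarity, which is also an assumption.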
![Page 26: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/26.jpg)
![Page 27: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/27.jpg)
Selecting relationship
• Must decide on relation type
  • subClassOf vs. equivalentClass
  • subPropertyOf vs. equivalentProperty
• Determination is difficult, especially under the OWL open-world semantics
• Use a simple extension-based technique with a threshold λr
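The transcript does not spell out the rule itself, so the following Python sketch shows only one plausible extension-based rule, assumed purely for illustration: compare how much of each extension is covered by the other and treat the pair as equivalent when the two coverage values differ by at most λr. The function name and the returned labels are hypothetical.

```python
def select_relationship(ext1: set, ext2: set, lam_r: float) -> str:
    """Return 'equivalent', 'sub' (e is below e') or 'super' (e' is below e).

    Assumed rule, not taken from the slides: coverage values that differ by
    at most lam_r indicate equivalence; otherwise the entity whose extension
    is covered more fully by the other is treated as the more specific one.
    """
    common = ext1 & ext2
    cover1 = len(common) / len(ext1) if ext1 else 0.0  # share of e inside e'
    cover2 = len(common) / len(ext2) if ext2 else 0.0  # share of e' inside e
    if abs(cover1 - cover2) <= lam_r:
        return "equivalent"
    return "sub" if cover1 > cover2 else "super"
```

With no instance data on either side, both coverage values are zero and the sketch falls back to equivalence, which is again an assumption.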
![Page 28: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/28.jpg)
Selecting relationship
![Page 29: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/29.jpg)
Selecting relationship
![Page 30: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/30.jpg)
![Page 31: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/31.jpg)
Performing logical inference
For the candidate pair (e,e’):
• Select an axiom to apply
• The logical consequences are the pairs of entities (e(i), e(j)) that have just become equivalent
• Repeat a small number of times (5) to maintain tractability
![Page 32: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/32.jpg)
Performing logical inference
![Page 33: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/33.jpg)
Performing logical inference
![Page 34: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/34.jpg)
Performing logical inference
![Page 35: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/35.jpg)
Performing logical inference
(TheodorEscherich, owl:sameAs, T.S. Escherich) is a logical consequence of the candidate (E-ColiPoisoning, owl:sameAs, E-Coli)
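The example above can be reproduced with a small Python sketch of bounded inference over functional properties. This is an illustration under stated assumptions: only the functional-property axiom is handled, equivalences are tracked as direct pairs without transitive closure, the function name `consequences` is hypothetical, and the two `discoveredBy` instance triples in the usage note are assumed.

```python
from itertools import combinations


def consequences(triples, functional, candidate, max_steps=5):
    """Return the sameAs pairs implied by `candidate` through functional properties.

    Each step applies one axiom instance; at most `max_steps` steps are taken.
    """
    same = {frozenset(candidate)}  # equivalences known so far
    derived = []

    def equal(a, b):
        return a == b or frozenset((a, b)) in same

    for _ in range(max_steps):
        new = None
        for (s1, p1, o1), (s2, p2, o2) in combinations(triples, 2):
            # Functional property: equal subjects force equal objects.
            if p1 == p2 and p1 in functional and equal(s1, s2) and not equal(o1, o2):
                new = (o1, o2)
                break
        if new is None:
            break
        same.add(frozenset(new))
        derived.append(new)
    return derived
```

Calling `consequences` with the triples `("E-ColiPoisoning", "discoveredBy", "TheodorEscherich")` and `("E-Coli", "discoveredBy", "T.S. Escherich")`, the functional set `{"discoveredBy"}` and the candidate `("E-ColiPoisoning", "E-Coli")` yields the single pair `("TheodorEscherich", "T.S. Escherich")`.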
![Page 36: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/36.jpg)
![Page 37: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/37.jpg)
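Updating score
For the candidate pair (e,e’):
• Initial local similarity: sim(e,e’)
• Inference similarity over all consequences (e(i), e(j)):

$$P = \prod_{(e^{(i)},\,e^{(j)})} \frac{\mathrm{sim}(e^{(i)}, e^{(j)})}{1 - \mathrm{sim}(e^{(i)}, e^{(j)})}$$

• Updated similarity:

$$\mathrm{sim}_{updated}(e,e') = \mathrm{sim}(e,e') \cdot P$$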
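As a worked illustration, the update can be written as a short Python sketch: the inference similarity is a product of odds-like ratios over the consequences, and it multiplies the initial local similarity. The function names are hypothetical, and clamping similarities just below 1 to avoid division by zero is an assumption added here.

```python
from math import prod


def inference_similarity(consequence_sims, eps=1e-9):
    """Product of sim / (1 - sim) over all consequences (1.0 if there are none)."""
    # Assumption: clamp each similarity below 1 so the ratio stays finite.
    clamped = [min(s, 1.0 - eps) for s in consequence_sims]
    return prod(s / (1.0 - s) for s in clamped)


def updated_similarity(sim, consequence_sims):
    """Initial local similarity multiplied by the inference similarity."""
    return sim * inference_similarity(consequence_sims)
```

A consequence with similarity above 0.5 contributes a factor greater than 1 and raises the score, while one below 0.5 lowers it; a candidate with no consequences keeps its initial similarity.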
![Page 38: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/38.jpg)
Updating score
![Page 39: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/39.jpg)
Updating score
![Page 40: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/40.jpg)
Consistency
• The constructed alignment is not guaranteed to be consistent
  • ILIADS can only detect inconsistencies that appear in the few logical inference steps
• Pellet is used to check consistency after ILIADS
  • Experimentally, inconsistent ontologies arise in fewer than 0.5% of runs
![Page 41: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/41.jpg)
![Page 42: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/42.jpg)
Experimental framework
• 30 pairs of real-world ontologies
  • From 194 to over 20,000 triples
  • From a variety of domains: medical, geographical, economical, biological
• Ground truth provided by human reviewers
  • Multiple iterations to ensure the best human-provided alignment
• Datasets available: http://www.cs.umd.edu/linqs/projects/iliads
![Page 43: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/43.jpg)
Experimental framework
• Evaluation: precision, recall and F1 quality
  • F1 = 2 * Precision * Recall / (Precision + Recall)
  • 7 independent runs
• ILIADS variations:
  • ILIADS-tailored uses the best set of parameters for each pair of ontologies
  • ILIADS-fixed uses one set of parameters for all pairs of ontologies
  • Used to evaluate robustness of the parameters
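The evaluation measures can be computed as in the following Python sketch, which compares a predicted alignment with a ground-truth alignment. Representing an alignment as a set of entity pairs is an assumption made for the example, as is the function name.

```python
def precision_recall_f1(predicted: set, truth: set):
    """Precision, recall and F1 of a predicted alignment against ground truth."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    denominator = precision + recall
    # F1 = 2 * Precision * Recall / (Precision + Recall)
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1
```

For instance, four predicted pairs of which two are among the two ground-truth pairs give a precision of 0.5, a recall of 1.0 and an F1 of about 0.67.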
![Page 44: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/44.jpg)
Experimental framework
ILIADS compared to two leading tools:
• FCA-merge [Stumme and Maedche, IJCAI 2001]: uses formal concept analysis and an external document corpus
• COMA++ [Aumueller et al., SIGMOD 2005]: implements multiple match strategies, including fragment and reuse-based matching
![Page 45: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/45.jpg)
Precision/recall
![Page 46: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/46.jpg)
Precision/recall
![Page 47: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/47.jpg)
Precision/recall
![Page 48: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/48.jpg)
Precision/recall
![Page 49: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/49.jpg)
Precision/recall comparison
![Page 50: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/50.jpg)
Precision/recall for ontologies with substantial instance data
![Page 51: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/51.jpg)
Number of inference steps
![Page 52: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/52.jpg)
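ILIADS parameters

| | λx^c | λx^i | λx^p | λs^c | λs^i | λs^p | λe^c | λe^p | λt | λr |
|---|---|---|---|---|---|---|---|---|---|---|
| ILIADS-fixed | 0.2 | 0.4 | 0.1 | 0.5 | 0.6 | 0.4 | 0.3 | 0.5 | 0.7 | 0.2 |
| Min ILIADS-tailored | 0.15 | 0.4 | 0 | 0.3 | 0.45 | 0.35 | 0.2 | 0.35 | 0.65 | 0.2 |
| Max ILIADS-tailored | 0.25 | 0.45 | 0.1 | 0.65 | 0.7 | 0.5 | 0.35 | 0.65 | 0.7 | 0.2 |

λx: lexical parameters; λs: structural parameters; λe: extensional parameters. Superscripts c, i and p denote classes, instances and properties.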
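For illustration, the ILIADS-fixed row can be written down as a Python configuration. The grouping below is an assumption about the column order of the slide (three lexical weights, three structural weights, two extensional weights, then λt and λr), and the dictionary keys are hypothetical names. Under that reading, the weights for each entity type sum to 1.

```python
import math

# Assumed reading of the ILIADS-fixed row; key names are illustrative.
ILIADS_FIXED = {
    "lexical": {"class": 0.2, "instance": 0.4, "property": 0.1},
    "structural": {"class": 0.5, "instance": 0.6, "property": 0.4},
    "extensional": {"class": 0.3, "property": 0.5},
    "lam_t": 0.7,  # candidate selection threshold
    "lam_r": 0.2,  # relationship selection threshold
}


def weight_sum(params, kind):
    """Sum of the lexical, structural and extensional weights for one entity type."""
    groups = ("lexical", "structural", "extensional")
    return sum(params[group].get(kind, 0.0) for group in groups)


for kind in ("class", "instance", "property"):
    # Each entity type's weights add up to 1 under the assumed column order.
    assert math.isclose(weight_sum(ILIADS_FIXED, kind), 1.0)
```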
![Page 53: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/53.jpg)
Choosing ILIADS parameters
• Despite the number of parameters, the method is quite robust
  • Parameters are stable around the ILIADS-fixed values if the two ontologies in a pair are not very different
• Strong correlations between
  • Structural similarity coefficients and the average node degree
  • Extensional coefficients and the ratio of instances to classes
![Page 54: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/54.jpg)
False negative analysis
![Page 55: Leveraging Data and Structure in Ontology Integration](https://reader036.fdocuments.net/reader036/viewer/2022062408/5681449a550346895db142b3/html5/thumbnails/55.jpg)
Concluding remarks
• New ontology integration algorithm
  • First to combine statistical and logical inference
• Evaluated feasibility of combined inference
  • A small number of logical inference steps is sufficient for integration decisions
  • Inference is stable to parameter settings
  • Parameters permit principled tuning based on ontology characteristics
• Dataset and code available at: http://www.cs.umd.edu/linqs/projects/iliads