A Study of Hybrid Similarity Measures for Semantic Relation Extraction
Intelligent Database Systems Lab
Presenter : BEI-YI JIANG
Authors : UNIVERSITÉ CATHOLIQUE DE LOUVAIN, BELGIUM
2012. ASSOCIATION FOR COMPUTING MACHINERY
Outline
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments
Motivation
• The quality of the relations provided by existing extractors is still lower than that of manually constructed relations.
• Most studies still do not take into account the whole range of existing measures and combine different methods only sporadically.
Objectives
• To develop new relation extraction methods.
• The approach is a systematic analysis of 16 baseline measures and of their combinations, using 8 fusion methods and 3 techniques for selecting the combination set.
Methodology
• norm function
• similarity scores
• knn function
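The formulas for this slide did not survive in the transcript, so the following is only a minimal sketch of how the three pieces could fit together. It assumes that the norm function is a min-max rescaling and that the knn function keeps, for each term, its k most similar other terms as extracted relations; both choices and the names `min_max_norm` and `knn_relations` are illustrative assumptions, not taken from the slides.

```python
import numpy as np


def min_max_norm(S):
    """Rescale similarity scores to [0, 1] (assumed normalization)."""
    S = np.asarray(S, dtype=float)
    lo, hi = S.min(), S.max()
    if hi == lo:
        return np.zeros_like(S)
    return (S - lo) / (hi - lo)


def knn_relations(S, terms, k):
    """Return (term, neighbour) pairs for the k most similar terms of each term."""
    S = np.asarray(S, dtype=float)
    relations = []
    for i, term in enumerate(terms):
        scores = S[i].copy()
        scores[i] = -np.inf  # a term is never its own neighbour
        top = np.argsort(-scores, kind="stable")[:k]
        relations.extend((term, terms[j]) for j in top)
    return relations
```

For example, with terms `["cat", "dog", "car"]` and k = 1, each term is paired with its single most similar neighbour in the similarity matrix.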
Methodology-Single Similarity Measures
• Measures Based on a Semantic Network (5)
– exploit the lengths of the shortest paths between terms in a network
– probability of terms derived from a corpus
– Wu and Palmer, Leacock and Chodorow, Resnik, Jiang and Conrath, and Lin
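As an illustration of a path-based network measure, here is a small sketch of the Wu and Palmer score, 2 · depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest common ancestor of the two terms. The toy taxonomy `TAXONOMY`, the helper names, and the convention that the root has depth 1 are assumptions made for the example; the study itself relies on a real semantic network.

```python
# Hypothetical toy taxonomy: child -> parent (a tree with root "entity").
TAXONOMY = {
    "animal": "entity",
    "vehicle": "entity",
    "cat": "animal",
    "dog": "animal",
    "car": "vehicle",
}


def path_to_root(node, parent):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def wu_palmer(a, b, parent):
    """Wu and Palmer similarity on a tree; the root is counted as depth 1."""
    path_a = path_to_root(a, parent)
    ancestors_b = set(path_to_root(b, parent))
    # First node on a's path that is also an ancestor of b: the deepest common ancestor.
    lcs = next(n for n in path_a if n in ancestors_b)
    depth = lambda n: len(path_to_root(n, parent))
    return 2 * depth(lcs) / (depth(a) + depth(b))
```

In this toy tree, "cat" and "dog" share the ancestor "animal" and therefore score higher than "cat" and "car", which only share the root.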
• Web-based Measures (3)
– Web search engines
– rely on the number of times the terms co-occur in the documents
– Normalized Google Distance (NGD)
– Measures of Semantic Relatedness (MSR)
– YAHOO!, BING, GOOGLE over the domain wikipedia.org
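The co-occurrence idea behind NGD can be sketched from hit counts alone. The function below follows the standard Normalized Google Distance definition; the argument names (`fx`, `fy`, `fxy`, `n`) are illustrative, the counts would come from a search engine or an index, and how the study turns this distance into a similarity score is not stated on the slide.

```python
import math


def ngd(fx, fy, fxy, n):
    """Normalized Google Distance from hit counts.

    fx, fy : number of documents containing each term
    fxy    : number of documents containing both terms
    n      : total number of indexed documents (must exceed min(fx, fy))
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf  # terms never (co-)occur: treat as maximally distant
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))
```

Terms that always occur together get a distance of 0, and the distance grows as the co-occurrence count shrinks relative to the individual counts.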
• Corpus-based Measures (5)
– Distributional Measures
› Bag-of-words Distributional Analysis (BDA)
› Syntactic Distributional Analysis (SDA)
– Pattern-based Measure
› PatternWiki
– Other Corpus-based Measures
› Latent Semantic Analysis (LSA)
› Normalized Google Distance (NGD)
• Definition-based Measures (3)
– WktWiki
– Gloss Vectors
– Extended Lesk
Methodology-Hybrid Similarity Measures
• Combination Methods
– Input: a set of similarity matrices {S1, ..., SK} produced by K single measures
– Output: a combined similarity matrix Scmb
› 1. Mean
› 2. Mean-Nnz
› 3. Mean-Zscore
› 4. Median
› 5. Max
› 6. Rank Fusion
› 7. Relation Fusion
› 8. Logit
• Combination Methods
– Mean. A mean of K pairwise similarity scores.
– Mean-Nnz. A mean of those pairwise similarity scores that have a non-zero value.
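A minimal sketch of these two fusion rules, assuming the K similarity matrices have the same shape and are aligned on the same term pairs. The function names are illustrative, and returning 0 for a pair on which every measure is zero is an assumption.

```python
import numpy as np


def mean_fusion(mats):
    """Element-wise mean of K similarity matrices."""
    return np.mean(np.stack(mats), axis=0)


def mean_nnz_fusion(mats):
    """Element-wise mean over the non-zero scores only.

    Pairs for which every measure returns 0 keep the score 0 (assumption).
    """
    stack = np.stack(mats).astype(float)
    nnz = np.count_nonzero(stack, axis=0)   # how many measures scored each pair
    total = stack.sum(axis=0)
    return np.divide(total, nnz, out=np.zeros_like(total), where=nnz > 0)
```

Mean-Nnz avoids penalizing a pair just because some measure has no coverage for it: a score of 0.4 from one measure and 0 from another averages to 0.2 under Mean but stays 0.4 under Mean-Nnz.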
• Combination Methods
– Mean-Zscore. A mean of K similarity scores transformed into Z-scores.
– Median. A median of K pairwise similarities.
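These two rules can be sketched as follows, assuming each matrix is standardized with its own overall mean and standard deviation before averaging. That choice of statistics and the function names are assumptions, since the slide formulas are not in the transcript.

```python
import numpy as np


def mean_zscore_fusion(mats):
    """Mean of the K matrices after converting each one to Z-scores."""
    z_mats = []
    for S in mats:
        S = np.asarray(S, dtype=float)
        sd = S.std()
        # A constant matrix carries no information; map it to all zeros.
        z_mats.append((S - S.mean()) / sd if sd > 0 else np.zeros_like(S))
    return np.mean(z_mats, axis=0)


def median_fusion(mats):
    """Element-wise median of K similarity matrices."""
    return np.median(np.stack(mats), axis=0)
```

Standardizing first makes measures with different score ranges comparable: a matrix and the same matrix multiplied by 10 contribute identical Z-scores.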
• Combination Methods
– Max. A maximum of K pairwise similarities.
– Rank Fusion.
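The slide gives no details for Rank Fusion, so the sketch below rests on an assumption: each measure's scores are replaced by their rank within each row (one row per term), and the ranks are then averaged over the K measures. Max is the plain element-wise maximum. Function names are illustrative, and ties are broken by column order.

```python
import numpy as np


def max_fusion(mats):
    """Element-wise maximum of K similarity matrices."""
    return np.max(np.stack(mats), axis=0)


def rank_fusion(mats):
    """Average of per-row ranks (assumed reading of Rank Fusion).

    Within each row, rank 0 is the least similar term and rank n-1 the most
    similar one, so a larger combined value still means "more similar".
    """
    ranked = []
    for S in mats:
        S = np.asarray(S, dtype=float)
        order = np.argsort(S, axis=1, kind="stable")
        ranked.append(np.argsort(order, axis=1, kind="stable"))  # rank of each column
    return np.mean(ranked, axis=0)
```

Because only the ordering of scores matters, rank-based fusion is insensitive to the scale of the individual measures.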
• Combination Methods
– Relation Fusion.
– Logit.
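Neither method is spelled out in the transcript, so this is a sketch under two assumptions. Relation Fusion is read as: every measure extracts its own k nearest neighbours per term, and a relation is ranked by how many measures proposed it. Logit is read as: a logistic regression is trained on labelled term pairs, using the K similarity scores of a pair as features. The plain gradient-descent trainer, its hyperparameters, and all function names are illustrative.

```python
from collections import Counter

import numpy as np


def relation_fusion(mats, k):
    """Count, for each (i, j) pair, how many measures put j in i's top k."""
    votes = Counter()
    for S in mats:
        S = np.asarray(S, dtype=float)
        for i in range(S.shape[0]):
            scores = S[i].copy()
            scores[i] = -np.inf  # ignore self-similarity
            for j in np.argsort(-scores, kind="stable")[:k]:
                votes[(i, int(j))] += 1
    return votes


def fit_logit(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression weights by batch gradient descent."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad = p - y                      # derivative of the log-loss w.r.t. the logit
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


def logit_score(x, w, b):
    """Combined similarity of one pair: probability that it is a relation."""
    return 1.0 / (1.0 + np.exp(-(np.asarray(x, dtype=float) @ w + b)))
```

In use, each row of `X` would hold the K scores of one term pair and `y` would mark whether that pair is a known relation; `logit_score` then supplies the combined score for unseen pairs.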
• Combination Sets
– Expert choice of measures
– Forward stepwise procedure
– Logistic regression
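A forward stepwise procedure can be sketched as a greedy loop: start from an empty set, repeatedly add the measure that most improves the quality of the combined matrix, and stop when no candidate helps. The sketch assumes Mean fusion for the combination and takes the quality criterion as a caller-supplied function `quality`; both are assumptions, since the slide does not say which fusion rule or criterion is used.

```python
import numpy as np


def forward_stepwise(mats, quality):
    """Greedily select indices of measures whose mean fusion maximizes `quality`.

    mats    : list of K similarity matrices (same shape)
    quality : callable mapping a combined matrix to a score (higher is better)
    """
    selected, best = [], -np.inf
    remaining = list(range(len(mats)))
    while remaining:
        # Score every candidate extension of the current set.
        scored = [
            (quality(np.mean([mats[i] for i in selected + [c]], axis=0)), c)
            for c in remaining
        ]
        score, cand = max(scored)
        if score <= best:
            break  # no candidate improves the combination
        selected.append(cand)
        remaining.remove(cand)
        best = score
    return selected, best
```

The criterion could be, for instance, agreement with a gold-standard matrix or extraction precision on a development set; the loop itself does not depend on that choice.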
Experiments
• Evaluation
– Human Judgements Datasets
› MC, RG, WordSim353
– Semantic Relations Datasets
› BLESS, SN
Conclusions
• The results have shown that the hybrid measures outperform the single measures on all datasets.
• A combination of 15 baseline corpus-, web-, network-, and dictionary-based measures with Logistic Regression provided the best results.
Comments
• Advantages
– higher performance
• Applications