Presenter : Jian-Ren Chen Authors : Gonenc Ercan , Ilyas Cicekli * 2007.IPM
Intelligent Database Systems Lab
Presenter : JIAN-REN CHEN
Authors : GONENC ERCAN, ILYAS CICEKLI *
2007.IPM
Using lexical chains for keyword extraction
Outlines
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments
Motivation
• Assigning keywords to documents by hand is a hard and time-consuming task.
• Automatic keyphrase extraction algorithms are limited to phrases that appear in the text.
Objectives
• We present a keyword extraction method that uses features based on lexical chains to select keywords for a text.
Methodology
1. first occurrence position
2. frequency
3. last occurrence position
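The three features listed above can be illustrated with a minimal sketch. The function name `position_features`, the whitespace tokenization, and the normalization by document length are assumptions made for this example and are not taken from the slides.

```python
def position_features(tokens, word):
    """Return the three baseline features for `word` in a token list.

    Assumption: positions and frequency are normalized by the document
    length so that values are comparable across documents.
    """
    positions = [i for i, t in enumerate(tokens) if t == word]
    if not positions:
        return None  # the word does not occur in the text
    n = len(tokens)
    return {
        "first_occurrence": positions[0] / n,
        "frequency": len(positions) / n,
        "last_occurrence": positions[-1] / n,
    }


# Toy document; a real system would tokenize and filter candidates first.
tokens = "lexical chains link related words and chains capture topics".split()
features = position_features(tokens, "chains")
```

Here `features` holds the relative position of the first and last occurrence of "chains" and its relative frequency in the nine-token toy text.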
Methodology – Lexical chains
• WordNet - synonym, hyponym/hypernym
1) more than one relationship
2) up to 3 levels of depth
3) identify nouns
4) only consider individual words
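A rough sketch of how related nouns might be grouped into lexical chains is given below. It is not the authors' algorithm: the hand-written table `TOY_RELATIONS` is a hypothetical stand-in for WordNet lookups, the greedy merging strategy is an assumption, and the depth limit mentioned above is not modeled.

```python
# Hypothetical relation table standing in for WordNet synonym and
# hypernym/hyponym lookups between individual nouns.
TOY_RELATIONS = {
    ("car", "automobile"): "synonym",
    ("car", "vehicle"): "hypernym",
    ("truck", "vehicle"): "hypernym",
    ("apple", "fruit"): "hypernym",
}


def relation(a, b):
    """Return the relation type between two words, or None if unrelated."""
    return TOY_RELATIONS.get((a, b)) or TOY_RELATIONS.get((b, a))


def build_chains(nouns):
    """Greedily group nouns: a word joins (and merges) every chain that
    already contains the word itself or a word related to it."""
    chains = []
    for word in nouns:
        linked = [c for c in chains
                  if any(w == word or relation(word, w) for w in c)]
        merged = {word}
        for c in linked:
            merged |= c
            chains.remove(c)
        chains.append(merged)
    return chains


chains = build_chains(["car", "apple", "truck", "vehicle", "fruit", "automobile"])
```

With the toy table, the six nouns fall into two chains, one about vehicles and one about fruit.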
Methodology – lexical chains
1. Lexical chain score of a word
2. Direct lexical chain score of a word
3. Lexical chain span score of a word
4. Direct lexical chain span score of a word
59 (=7 · 7 + 10)
28 (=4 · 7)
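The two worked values above can be reproduced with a small sketch under stated assumptions: the chain score is taken to be the sum of the weights of all relations in a word's chain, and the direct score the sum over only those relations that involve the word itself. The chain below is hypothetical; only the weights 7 and 10 and the totals 59 and 28 come from the slide, and which relation type carries which weight is not specified here.

```python
# Hypothetical chain: each relation is (word_a, word_b, weight).
# Seven relations of weight 7 and one of weight 10, as in 7 * 7 + 10.
chain = [
    ("w", "a", 7), ("w", "b", 7), ("w", "c", 7), ("w", "d", 7),
    ("a", "b", 7), ("b", "c", 7), ("c", "d", 7),
    ("a", "e", 10),
]


def chain_score(relations):
    """Sum the weights of every relation in the chain (assumed definition)."""
    return sum(weight for _, _, weight in relations)


def direct_chain_score(relations, word):
    """Sum the weights of the relations that `word` takes part in
    (assumed definition)."""
    return sum(weight for a, b, weight in relations if word in (a, b))


total = chain_score(chain)               # 7 * 7 + 10
direct = direct_chain_score(chain, "w")  # 4 * 7
```

Under these assumptions the word "w" gets a chain score of 59 and a direct score of 28, matching the figures on the slide.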
Experiments - Datasets
Corpus name       Texts in corpus   Training documents   Testing documents   Total documents
Journal Articles  full texts        55                   20                  75
Global corpus     abstracts         110                  45                  155
- extract 1, 5, 10, 15 keywords for each document
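Evaluation at several cut-offs can be sketched as precision over the top k extracted keywords. The function name `precision_at_k`, the case-insensitive exact matching, and the toy keyword lists are assumptions for illustration; the slides do not state how matches were counted.

```python
def precision_at_k(ranked_candidates, gold_keywords, k):
    """Fraction of the top-k extracted keywords that appear in the gold set.

    Assumption: matching is exact and case-insensitive, and the denominator
    is the number of keywords actually returned (at most k).
    """
    top_k = ranked_candidates[:k]
    if not top_k:
        return 0.0
    gold = {g.lower() for g in gold_keywords}
    hits = sum(1 for c in top_k if c.lower() in gold)
    return hits / len(top_k)


# Toy data, not taken from the experiments.
ranked = ["lexical chain", "keyword", "wordnet", "corpus", "precision"]
gold = ["keyword", "lexical chain", "semantics"]
scores = {k: precision_at_k(ranked, gold, k) for k in (1, 5, 10, 15)}
```

Dividing by the number of returned keywords rather than by k avoids penalizing short candidate lists; dividing by k would be an equally reasonable choice.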
Experiments
1. Lexical chain score of a word
2. Direct lexical chain score of a word
3. Lexical chain span score of a word
4. Direct lexical chain span score of a word
Conclusions
• The results show that the lexical chain features significantly improve precision in keyword extraction.
Comments
• Advantages
– Lexical chains take the semantic relationships between words into account
• Applications
– keyword extraction