Presenter : Jian-Ren Chen Authors : Gonenc Ercan , Ilyas Cicekli * 2007.IPM


Page 1:

Intelligent Database Systems Lab

Presenter : JIAN-REN CHEN

Authors : GONENC ERCAN, ILYAS CICEKLI *

2007.IPM

Using lexical chains for keyword extraction

Page 2:

Outlines

• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments

Page 3:

Motivation

• Assigning keywords to documents by hand is a hard and time-consuming task.
• Automatic keyphrase extraction algorithms are limited to phrases that appear in the text.

Page 4:

Objectives

• We present a keyword extraction method that uses features based on lexical chains to select keywords for a text.

Page 5:

Methodology

1. first occurrence position
2. frequency
3. last occurrence position
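The three features listed above can be sketched for a single word as follows. This is a minimal illustration, not the authors' implementation: the function name `baseline_features`, the regex tokenizer, and the choice to normalize positions by the token count are all assumptions.

```python
import re


def baseline_features(text, word):
    """Return (first_position, frequency, last_position) for `word` in `text`.

    Positions are token indices divided by the number of tokens, so they fall
    in [0, 1). This normalization is an assumption, not taken from the slides.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    target = word.lower()
    positions = [i for i, tok in enumerate(tokens) if tok == target]
    if not positions:
        return None  # the word does not occur in the text
    n = len(tokens)
    return positions[0] / n, len(positions), positions[-1] / n


# Hypothetical example text with 8 tokens; "chains" is token 0 and token 4.
example = "Chains link words. Lexical chains group related words."
features = baseline_features(example, "chains")
```

Here `features` holds the first position, the count, and the last position of "chains" in the example sentence.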

Page 6:

Methodology – Lexical chains

• WordNet relations: synonym, hyponym/hypernym
1) more than one relationship
2) up to 3 levels of depth
3) identify nouns
4) only consider individual words
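The constraints above can be illustrated with a toy chain builder. This is a sketch under stated assumptions: `TOY_RELATIONS` is a hypothetical hand-written stand-in for WordNet, the greedy "first matching chain" strategy is a simplification, and only the depth limit of 3 and the restriction to single nouns come from the slide.

```python
# Hypothetical stand-in for WordNet: each noun maps to its directly related
# nouns (synonyms and hypernyms/hyponyms are not distinguished here).
TOY_RELATIONS = {
    "car": {"automobile", "vehicle"},
    "automobile": {"car"},
    "vehicle": {"car", "truck"},
    "truck": {"vehicle"},
    "apple": {"fruit"},
    "fruit": {"apple"},
}


def related(a, b, max_depth=3):
    """True if `b` is reachable from `a` in at most `max_depth` relation steps."""
    frontier, seen = {a}, {a}
    for _ in range(max_depth):
        frontier = {n for w in frontier for n in TOY_RELATIONS.get(w, ())} - seen
        if b in frontier:
            return True
        seen |= frontier
    return False


def build_chains(nouns):
    """Greedily put each noun into the first chain holding a related word."""
    chains = []
    for noun in nouns:
        for chain in chains:
            if any(noun == member or related(noun, member) for member in chain):
                chain.append(noun)
                break
        else:
            chains.append([noun])  # no related chain found: start a new one
    return chains


chains = build_chains(["car", "apple", "truck", "fruit", "automobile"])
```

With this toy relation map, "truck" joins the chain of "car" through the two-step path via "vehicle", while "apple" and "fruit" form a separate chain.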

Page 7:

Methodology – lexical chains

1. Lexical chain score of a word
2. Direct lexical chain score of a word
3. Lexical chain span score of a word
4. Direct lexical chain span score of a word

59 (=7 · 7 + 10)

28 (=4 · 7)
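The two worked values above are consistent with a score computed as a weighted sum over relations. The sketch below reproduces that arithmetic; the weights 7 and 10 are read off the slide's numbers, while the mapping of each weight to a relation type and the name `chain_score` are assumptions.

```python
# Assumed weights, inferred from the arithmetic 59 = 7*7 + 10 and 28 = 4*7.
# Which relation type carries which weight is a guess, not stated on the slide.
ASSUMED_WEIGHTS = {"hypernym_hyponym": 7, "synonym": 10}


def chain_score(relation_counts, weights=ASSUMED_WEIGHTS):
    """Sum weight * count over the relation types present in a chain."""
    return sum(weights[rel] * count for rel, count in relation_counts.items())


# Seven relations of weight 7 plus one of weight 10, as in the first value.
first_value = chain_score({"hypernym_hyponym": 7, "synonym": 1})
# Four relations of weight 7, as in the second value.
second_value = chain_score({"hypernym_hyponym": 4})
```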

Page 8:

Experiments - Datasets

Corpus Name        Texts in corpus   Training Documents   Testing Documents   Total Documents
Journal Articles   full texts        55                   20                  75
Global corpus      abstracts         110                  45                  155

- extract 1, 5, 10, 15 keywords for each document
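Extracting a fixed number of keywords per document lends itself to a precision-at-k measure. The sketch below is one possible way to compute it and is not taken from the slides: the function name, the case-insensitive exact matching, and the sample keyword lists are hypothetical.

```python
def precision_at_k(ranked_keywords, gold_keywords, k):
    """Fraction of the top-k extracted keywords found in the gold set.

    Matching is exact and case-insensitive; this is an assumption, since the
    slides do not say how extracted and assigned keywords are compared.
    """
    top_k = ranked_keywords[:k]
    if not top_k:
        return 0.0
    gold = {g.lower() for g in gold_keywords}
    hits = sum(1 for kw in top_k if kw.lower() in gold)
    return hits / len(top_k)


# Hypothetical system output and author-assigned keywords for one document.
ranked = ["lexical chain", "keyword", "wordnet", "corpus", "noun"]
gold = {"keyword", "lexical chain", "extraction"}
scores = {k: precision_at_k(ranked, gold, k) for k in (1, 5)}
```

Dividing by the number of keywords actually returned, rather than by k, avoids penalizing a document for which fewer than k candidates exist; either convention could be used.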

Page 9:

Experiments

1. Lexical chain score of a word
2. Direct lexical chain score of a word
3. Lexical chain span score of a word
4. Direct lexical chain span score of a word

Page 10:

Experiments

Page 11:

Experiments

Page 12:

Conclusions

According to the results obtained, the lexical chain features significantly improve precision in keyword extraction.

Page 13:

Comments

• Advantages
– Lexical chains take the semantic relationships between words into account.
• Applications
– keyword extraction