AUTOMATIC TEXT SUMMARIZATION By Chetana Gavankar Subhabrata Mukherjee Kedharnath Narahari Sarbartha Sengupta under guidance of: Prof Pushpak Bhattacharya


AUTOMATIC TEXT SUMMARIZATION

By

Chetana Gavankar, Subhabrata Mukherjee, Kedharnath Narahari, Sarbartha Sengupta

under guidance of: Prof Pushpak Bhattacharya


PRESENTATION CONTENT

• Motivation
• Types of summaries
• Challenges
• Single-Document Summarization
  o Early work
  o Machine Learning Methods
     Supervised Methods
     Unsupervised Method
  o Deep Natural Language Analysis methods
• Multi-Document Summarization
• Evaluation
• Conclusion


MOTIVATION

• Download 1000+ papers and get a summary of them.

• You have a list of emails about a sports event: get a summary of those emails in one paragraph.

• You have to study loads of books for an exam, and the summarizer gives you the key concepts of the books as a few pages of notes.

Value for researchers
• Get me everything the papers say about “Automatic Text Summarization”



DEFINITION

Automatic Summaries

• Should be less than half of the original text
• Should convey important information
• May be produced from single or multiple documents (Radev et al.)

Dipanjan Das, Andre F.T. Martins (2007). A Survey on Automatic Text Summarization. Literature Survey for the Language and Statistics II course at CMU, Pittsburgh


APPLICATIONS - NEWS AGGREGATOR

http://24eyes.com/


APPLICATIONS – MOVIE REVIEWS


TYPES OF SUMMARIES

With respect to content:
• Indicative: provide an idea of what the text is about, but do not render the content
• Informative: shortened versions of the text

With respect to the way of creating:
• Extracts: identify important sections of the text
• Abstracts: produce important material in a new way

Dipanjan Das, Andre F.T. Martins (2007). A Survey on Automatic Text Summarization. Literature Survey for the Language and Statistics II course at CMU, Pittsburgh


TYPES OF SUMMARIES

With respect to Input
• Restricted vs. Unrestricted domain
• Single-document vs. Multiple-document

With respect to Purpose
• Generic vs. Query based
• Background vs. just-the-news

Eduard Hovy and Lin C. Y. "Automated Text Summarization in SUMMARIST", MIT Press


Eduard Hovy and Lin C. Y. (1998). "Automated Text Summarization and the SUMMARIST system", TIPSTER III Final Report (SUMMAC)


TOOLS – WORD SUMMARIZER

Microsoft Word 2007 - AutoSummarize


TOOLS – GNOME SUMMARIZER


TOOLS – SWESUM SUMMARIZER

http://www.csc.kth.se/~xmartin/swesum_lab/index-eng.html


CHALLENGES

• Selecting pieces from the input and concatenating them to yield a summary

• High reduction rates, as for a headline

• Methods for evaluating summaries

• Multiple languages

• Multiple hybrid sources

Hahn U. and Mani I. (2000) “The Challenges of Automatic Summarization”, Computer, IEEE Computer Society



EARLY WORK

ATS has its roots in the late 1950s and has been developed continuously since.

• A word-frequency-based ATS [Luhn, 1958].
• An ATS based on multiple features [Edmundson, 1969].
• ...
• Still unsolved!


WORD FREQUENCY

• Content words indicate the topic of a text
• Frequency of a content word: a measure of its significance
• Retrieve the top n most frequently occurring content words
• Rank a sentence according to the frequency of those words present in it

[Figure: Resolving power of significant words (axes: WORDS, FREQUENCY)]

Luhn, H. P. (1958). The automatic creation of literature abstracts. IBM Journal of Research and Development, 2(2):159-165
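The word-frequency ranking listed on this slide can be sketched roughly as follows. This is a minimal illustration of the simplified procedure in the bullets, not Luhn's full method; the stop-word list, the `tokenize` and `rank_sentences` helpers, and the sample sentences are all assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not taken from Luhn's paper).
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "is", "are", "and", "to", "it", "for"}

def tokenize(text):
    """Return the lower-cased alphabetic tokens of a text."""
    return re.findall(r"[a-z]+", text.lower())

def rank_sentences(sentences, top_n=5):
    """Score each sentence by the summed document frequency of the
    top-n content words it contains, then sort by descending score."""
    content = [w for s in sentences for w in tokenize(s) if w not in STOP_WORDS]
    top = dict(Counter(content).most_common(top_n))
    scored = []
    for s in sentences:
        words = set(tokenize(s))
        scored.append((sum(freq for w, freq in top.items() if w in words), s))
    # The sort is stable, so tied sentences keep their document order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

doc = [
    "Summarization systems select important sentences.",
    "Important sentences contain frequent content words.",
    "The weather is nice.",
]
ranking = rank_sentences(doc, top_n=3)
```

Here the off-topic third sentence contains none of the frequent content words, so it lands at the bottom of the ranking.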


Position of a Sentence

• Sentences occurring under certain headings are positively relevant

• Topic sentences tend to occur very early or very late in a document and its paragraphs

• Optimum Position Policy: a ranked list that indicates in what ordinal positions in the text the high-topic-bearing sentences tend to occur (Lin and Hovy, 1997).

Ex: [T1, P1S1, P1S2, ...] for a News Article

Edmundson, H. P. (1969). New methods in automatic extracting. Journal of the ACM, 16(2): 264-285


Cue words in a Text

Probable relevance of a sentence is affected by cue words:

• Bonus words: positively affecting the relevance of a sentence (e.g. “Significant”, “Greatest”)
• Stigma words: negatively affecting the relevance of a sentence (e.g. “Impossible”, “Hardly”)
• Null words: irrelevant

Edmundson, H. P. (1969). New methods in automatic extracting. Journal of the ACM, 16(2): 264-285
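A cue-word score along these lines might be sketched as below. The word lists reuse the slide's examples; the `cue_score` function and the +1 / -1 weights are assumptions for illustration, not the weights used by Edmundson.

```python
# Example cue words taken from the slide; real lists would be much larger.
BONUS = {"significant", "greatest"}
STIGMA = {"impossible", "hardly"}

def cue_score(sentence, bonus_weight=1.0, stigma_weight=-1.0):
    """Sum the weights of the cue words in a sentence.
    Null words (anything not in either list) contribute nothing.
    The default weights are illustrative assumptions."""
    words = [w.strip(".,;:!?\"'").lower() for w in sentence.split()]
    score = 0.0
    for w in words:
        if w in BONUS:
            score += bonus_weight
        elif w in STIGMA:
            score += stigma_weight
    return score
```

For instance, `cue_score("This is the greatest and most significant result.")` counts two bonus words, while a sentence containing only null words scores zero.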


NAÏVE BAYES

• Let s be a particular sentence, S the set of sentences making up the summary and F1, …, Fk be the set of features

• Assume feature independence

$$P(s \in S \mid F_1, F_2, \ldots, F_k) = \frac{\prod_{i=1}^{k} P(F_i \mid s \in S) \cdot P(s \in S)}{\prod_{i=1}^{k} P(F_i)}$$

– Additional Features
  • Sentence Length
  • Presence of Uppercase Words

– Position, Cue features with Sentence Length performed best

Kupiec, J., Pedersen, J., and Chen, F. (1995). A trainable document summarizer, In Proceedings SIGIR '95, pages 68-73
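The formula above can be evaluated directly once the probabilities are estimated from labelled sentences. The sketch below assumes binary features and plain relative-frequency estimates without smoothing; the `estimate` and `summary_score` helpers and the four-sentence training set are hypothetical.

```python
from math import prod

def estimate(examples, features):
    """Estimate P(s in S), P(F_i | s in S) and P(F_i) by relative frequency.
    `examples` is a list of (feature_tuple, in_summary) pairs and
    `features` is the feature tuple of the sentence being scored."""
    n = len(examples)
    summary_rows = [f for f, in_summary in examples if in_summary]
    prior = len(summary_rows) / n
    likelihoods = [
        sum(1 for f in summary_rows if f[i] == value) / len(summary_rows)
        for i, value in enumerate(features)
    ]
    evidences = [
        sum(1 for f, _ in examples if f[i] == value) / n
        for i, value in enumerate(features)
    ]
    return prior, likelihoods, evidences

def summary_score(prior, likelihoods, evidences):
    """P(s in S) * prod_i P(F_i | s in S) / prod_i P(F_i)."""
    return prior * prod(likelihoods) / prod(evidences)

# Hypothetical features: (sentence is near the start, sentence has a cue word).
TRAINING = [
    ((True, True), True),
    ((True, False), True),
    ((False, False), False),
    ((False, True), False),
]
score_early_cue = summary_score(*estimate(TRAINING, (True, True)))
score_late_plain = summary_score(*estimate(TRAINING, (False, False)))
```

Because the independence assumption is only an approximation, the value is best read as a ranking score: sentences are sorted by it and the top ones are extracted.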


Naïve Bayes Contd…

• Richer features
  – Tf-idf to derive signature words
  – Named Entity Tagger to retrieve tokens
  – Shallow Discourse Analysis to maintain cohesion
  – Synonyms and morphological variants of lexical terms merged using WordNet

Aone, C., Okurowski, M. E., Gorlinsky, J., and Larsen, B. (1999), A trainable summarizer with knowledge acquired from robust nlp techniques, In Mani, I. and Maybury, M. T., editors, Advances in Automatic Text Summarization, pages 71-80


Decision Tree

• Feature independence assumption is not valid in real-world situations
• Creation of feature vector
  – Baseline
  – Title
  – tf & tf-idf scores
  – Position score
  – Query Signature
  – IR Signature
  – Sentence Length
  – Average Lexical Connectivity
  – Numerical Data
  – Proper Name
  – Pronoun & Adjective
  – Weekday & Month
  – Quotation
  – First Sentence

Lin, C.-Y. (1999). Training a selection function for extraction, In Proceedings of CIKM '99, pages 55-62


Decision Tree Contd…

• Scores of all the features are combined by an automated learning process using a decision tree, and normalized

• Remarks
  – Decision Tree performs best over the whole dataset
  – Naïve combination beats Decision Tree in 3 topics

• Possible reason? Features were independent


Hidden Markov Model

• Drawbacks of earlier approaches
  – Feature-based bag-of-words model
  – Non-sequential

• Use a sequential model to account for local dependencies between sentences

• Features
  – Position of the sentence in the document
  – Number of terms in the sentence
  – Likelihood of the sentence terms given the document terms

Conroy, J. M. and O'Leary, D. P. (2001). Text summarization via hidden Markov models, In Proceedings of SIGIR '01, pages 406-407


Hidden Markov Model Contd…

• 2s+1 states, alternating between s summary states and s+1 non-summary states

• Odd states are summary states, even states are non-summary states (this matches the counts if states are numbered from 0: s odd states and s+1 even states)

• Transition matrix M, whose element (i, j) is the probability of a transition from state i to state j

• Output function b_i(O) = Pr(O | state i), where O is an observed vector of features

• Assumption: features are multivariate normal

• M and the parameters of the output functions b_i are learnt from training data


Log-Linear Models

• Let c be a label, d the item we are interested in labeling, f_i the i-th feature and λ_i the corresponding feature weight

$$p(c \mid d) = \frac{1}{Z(d)} \exp\left(\sum_i \lambda_i f_i(c, d)\right)$$

• Z(d) = Σ_c exp(Σ_i λ_i f_i(c, d)) is the normalization constant

• f_{w,c'}(d, c) = 0 if c ≠ c', 1 if c = c'
• Larger value of λ_i means f_i is a strong indicator of class c
• GIS, IIS used to iteratively tune model parameters
• Outperformed Naïve Bayes
• DRAWBACK: Overfitting

Osborne, M. (2002). Using maximum entropy for sentence extraction. In Proceedings of the ACL'02 Workshop on Automatic Summarization
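To make the log-linear form concrete, the sketch below computes p(c | d) for two labels from fixed weights. The feature function, the label names and the weights are hypothetical; in practice the weights would be tuned on training data (for example with GIS or IIS) rather than set by hand.

```python
from math import exp

LABELS = ("summary", "non-summary")

def features(c, d):
    """Hypothetical feature vector f(c, d): each feature fires only for the
    'summary' label and only if the sentence has the given property."""
    return [
        1.0 if c == "summary" and d["first_sentence"] else 0.0,
        1.0 if c == "summary" and d["long"] else 0.0,
    ]

def log_linear_probs(weights, d, labels=LABELS):
    """Return p(c | d) = exp(sum_i lambda_i * f_i(c, d)) / Z(d) for every label c."""
    unnormalized = {
        c: exp(sum(lam * f for lam, f in zip(weights, features(c, d))))
        for c in labels
    }
    z = sum(unnormalized.values())  # Z(d), the normalization constant
    return {c: value / z for c, value in unnormalized.items()}

weights = [1.0, 0.5]  # illustrative values, not learned
probs = log_linear_probs(weights, {"first_sentence": True, "long": False})
```

With these numbers only the first feature fires, so p(summary | d) = e / (1 + e), roughly 0.73.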


Drawbacks of Earlier Methods

1. Performance might degrade if the text consists of multiple topics.

2. Anaphors in the extracted sentences might not have any antecedent in the summary.

3. The summary might be incoherent, since the sentences are just extracted from various parts of the text.


Drawbacks Contd …

• Consider the following two sequences:

• 1. “Dr. Kenny has invented an anesthetic machine. This device controls the rate at which an anesthetic is pumped into the blood.”

• 2. “Dr. Kenny has invented an anesthetic machine. The doctor spent two years on this research.”

• “Dr. Kenny” appears once in both sequences and so does “machine”. But sequence 1 is about the machine, and sequence 2 is about the doctor.


Cohesion

• Cohesion (Halliday & Hasan 1976)
  – “Stitching together” different parts of the text
  – Use of semantically related terms, co-reference, ellipsis, conjunctions

• Lexical Cohesion: semantically related words
  – Reiteration category
    • Repetitions, synonyms, hyponyms
  – Collocation category
    • Words occurring in the same lexical context
    • Ex: She works as a teacher in the school.


Lexical Chain General Approach

• 1. Select a set of candidate words;
• 2. For each candidate word, find an appropriate chain relying on a relatedness criterion among members of the chains;
• 3. If it is found, insert the word in the chain and update it accordingly.

• Relations
  – Extra-strong (between a word and its repetition)
    • No restriction
  – Strong (between 2 words connected by a WordNet relation)
    • Window of 7 sentences
  – Medium-strong (path length > 1 hop)
    • Window of 3 sentences
• Preference: Extra-strong > Strong > Medium-strong
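The three-step procedure can be sketched as a greedy loop. In this illustration a small hand-written relation table stands in for WordNet, only the extra-strong and strong relations are modelled, the sentence-window limits are omitted, and a word with no related chain starts a new one; all of these are simplifying assumptions.

```python
# Hypothetical stand-in for WordNet relations between word pairs.
RELATED = {frozenset({"machine", "device"}), frozenset({"doctor", "physician"})}

def relation_strength(a, b):
    """2 = extra-strong (repetition), 1 = strong (related pair), 0 = unrelated."""
    if a == b:
        return 2
    if frozenset({a, b}) in RELATED:
        return 1
    return 0

def build_chains(candidates):
    """Insert each candidate word into the chain it is most strongly related to,
    preferring extra-strong over strong; start a new chain if none is related."""
    chains = []
    for word in candidates:
        best, best_strength = None, 0
        for chain in chains:
            strength = max(relation_strength(word, member) for member in chain)
            if strength > best_strength:
                best, best_strength = chain, strength
        if best is None:
            chains.append([word])
        else:
            best.append(word)
    return chains

chains = build_chains(["machine", "doctor", "device", "machine", "physician"])
```

On this input the words fall into two chains, one about the machine and one about the doctor.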


Drawback

• Mr. Kenny is the person that invented an anesthetic machine which uses micro-computers to control the rate at which an anesthetic is pumped into the blood. Such machines are nothing new. But his device uses two micro-computers to achieve much closer monitoring of the pump feeding the anesthetic into the patient. (Morris & Hirst 1991)

• [lex "Mr.", sense {mister, Mr.}]
• [lex "person", sense {person, individual, someone, man, mortal, human, soul}]

• The first sense of “machine” in WordNet, “an efficient person”, is a holonym of “person”, so “machine” is wrongly disambiguated


Component and Graph Connectivity

Barzilay, R. and Elhadad, M. (1997). Using lexical chains for text summarization. In Proceedings ISTS'97


Component & Graph Connectivity Contd…


Multi Document Summarization

• Multiple sources of information
  – Similarity between topics
  – Supplement each other
  – Occasionally contradictory

• Key Tasks
  – Identifying key concepts across documents
  – Coping with redundancy
  – Ensuring the final summary is coherent and complete

• Applications: news clustering systems
  – Google News, Columbia NewsBlaster, News in Essence etc.


TOPIC-DRIVEN SUMMARIZATION

Carbonell, J. and Goldstein, J. (1998). The use of MMR, diversity-based re-ranking for reordering documents and producing summaries.


TOPIC DRIVEN SUMMARIZATION(contd.)

Carbonell, J. and Goldstein, J. (1998). The use of MMR, diversity-based re-ranking for reordering documents and producing summaries. In Proceedings of SIGIR '98.


GRAPH SPREADING

Mani, I. and Bloedorn, E. (1997). Multi-document summarization by graph search and matching. In AAAI/IAAI, pages 622-628.


Example for nodes and links in the graph (Mani and Bloedorn, 1997)

Mani, I. and Bloedorn, E. (1997). Multi-document summarization by graph search and matching. In AAAI/IAAI, pages 622-628.


GRAPH SPREADING(Contd..)

Words and phrases are initialized according to their TF-IDF scores.

For each sentence in both documents, two scores are computed.

One score reflects the presence of common nodes, and is computed as the average weight of these nodes.

The other score is instead the average weight of the difference nodes.

The sentences that have higher common and difference scores are highlighted, and the output is generated accordingly.
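The two per-sentence scores could be computed along the following lines. The sketch covers only the scoring step described above, not the graph construction or spreading activation; the node weights, the set of common nodes and the `sentence_scores` helper are hypothetical.

```python
def average(values):
    """Mean of a list of numbers, or 0.0 for an empty list."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0

def sentence_scores(sentence_terms, weights, common_nodes):
    """Return (common score, difference score) for one sentence:
    the average weight of its terms that occur in both documents and
    the average weight of its terms that occur in only one of them."""
    common = [weights[t] for t in sentence_terms
              if t in weights and t in common_nodes]
    different = [weights[t] for t in sentence_terms
                 if t in weights and t not in common_nodes]
    return average(common), average(different)

# Illustrative TF-IDF-style weights (assumed, not computed here).
weights = {"quake": 0.9, "tokyo": 0.6, "rescue": 0.3}
common_nodes = {"quake", "tokyo"}  # terms that appear in both documents
common_score, difference_score = sentence_scores(
    ["quake", "rescue", "tokyo"], weights, common_nodes
)
```

Sentences would then be ranked by these two scores to pick out the shared and the distinguishing content of the document pair.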


CENTROID-BASED SUMMARY

• Does not make use of a language generation model. Documents are modeled as bag-of-words.
• Topic Detection: clustering algorithm that uses TF-IDF vector representations of documents
• Successively adds documents and recomputes centroids.
• Centroids are pseudo-documents that include words with a TF-IDF score above some threshold.

d’ – truncated document; C_j – j-th cluster

Radev, D. R., Jing, H., and Budzikowska, M. (2000). Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies.
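A centroid in this sense can be sketched as the average TF-IDF vector of a cluster, truncated at a threshold. The `centroid` helper, the IDF values and the threshold below are assumptions for illustration, and the clustering step that assigns documents to clusters is not shown.

```python
from collections import Counter

def centroid(cluster, idf, threshold):
    """Average TF*IDF of each word over the documents of a cluster,
    keeping only the words whose average exceeds the threshold."""
    totals = Counter()
    for doc in cluster:
        for word, tf in Counter(doc).items():
            totals[word] += tf * idf.get(word, 0.0)
    averaged = {word: total / len(cluster) for word, total in totals.items()}
    return {word: score for word, score in averaged.items() if score > threshold}

# Two tokenized documents in one cluster, with made-up IDF values.
cluster = [
    ["earthquake", "hits", "city"],
    ["earthquake", "rescue", "city", "city"],
]
idf = {"earthquake": 2.0, "city": 1.0, "hits": 0.5, "rescue": 0.5}
pseudo_document = centroid(cluster, idf, threshold=1.0)
```

The resulting pseudo-document keeps only the words that characterize the cluster as a whole.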


CENTROID BASED SUMMARIZATION

Sentence Identification: 2 metrics
o Cluster-Based Relative Utility: how relevant a sentence is to the particular topic of the cluster.
o Cross-Sentence Informational Subsumption: a measure of redundancy among sentences.

Radev, D. R., Jing, H., and Budzikowska, M. (2000). Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies. In NAACL-ANLP 2000 Workshop on Automatic summarization, pages 21-30, Morristown, NJ, USA.


MULTILINGUAL MULTI-DOCUMENT SUMMARY

Target language (English) in which the summary is written.

Source documents are present in both the preferred language and a foreign language (Arabic).

Use IBM’s translation model to translate documents in the source language to the target language.

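Check for similarity between sentences in the translated documents and in the preferred-language documents.

If similar sentences are found, retain the sentences from the documents originally written in the preferred language, since they are more likely to be grammatically correct than machine translations.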

Evans, D. K. (2005). Similarity-based multilingual multi-document summarization. Technical Report CUCS-014-05, Columbia University.


SHORT SUMMARIES

Witbrock, M. J. and Mittal, V. O. (1999). Ultra-summarization (poster abstract): a statistical approach to generating highly condensed non-extractive summaries.


Witbrock, M. J. and Mittal, V. O. (1999). Ultra-summarization (poster abstract): a statistical approach to generating highly condensed non-extractive summaries. In Proceedings of SIGIR '99, pages 315-316, New York, NY, USA.


SENTENCE COMPRESSION

Compression of sentences is used for summarization.

Uses a noisy-channel model, which considers that one starts with a short summary s, drawn according to the source model P(s).

s is subjected to a noisy channel to produce the full sentence t, in a process guided by the channel model P(t | s).

Now, observing t, recover the summary as:

$$s' = \arg\max_s P(s \mid t) = \arg\max_s P(s) \cdot P(t \mid s)$$

Advantage: decouples the goals of grammatical correctness and preserving important information.

Knight, K. and Marcu, D. (2000). Statistics-based summarization - step one: Sentence compression. In AAAI/IAAI, pages 703-710
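The decoding rule s' = argmax_s P(s) · P(t | s) can be illustrated with a toy candidate set. The candidate compressions and both probability tables below are invented for the example; the real source and channel models are estimated from data and search over a much larger space.

```python
def best_compression(t, candidates, source_prob, channel_prob):
    """Return the candidate s maximizing P(s) * P(t | s) for the observed sentence t."""
    return max(candidates, key=lambda s: source_prob[s] * channel_prob[(t, s)])

t = "the very old man slept"
candidates = ["the man slept", "the old man slept", "very slept"]

# Made-up source model P(s): favours short, grammatical strings.
source_prob = {
    "the man slept": 0.5,
    "the old man slept": 0.3,
    "very slept": 0.01,
}
# Made-up channel model P(t | s): how plausibly s expands into t.
channel_prob = {
    (t, "the man slept"): 0.2,
    (t, "the old man slept"): 0.4,
    (t, "very slept"): 0.5,
}
compressed = best_compression(t, candidates, source_prob, channel_prob)
```

The ungrammatical candidate is penalized by the source model even though the channel model likes it, which is the decoupling the slide refers to.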


Knight, K. and Marcu, D. (2000). Statistics-based summarization - step one: Sentence compression. In AAAI/IAAI, pages 703-710.


Evaluation

• Difficult task. (There does not exist an ideal summary for a given document or set of documents.)

• Agreement between human summarizers is quite low.

• Difficult to evaluate the summary content.

• Absence of a standard human or automatic evaluation metric.


Evaluation

• Lin and Hovy (2002)

Describe and compare various human and automatic metrics to evaluate summaries.

Focus on the evaluation procedure used in the Document Understanding Conference 2001 (DUC-2001).

Compared manually written ideal summaries with summaries generated automatically by summarization systems and with baseline summaries.

Lin, C.-Y. and Hovy, E. (2002). Manual and automatic evaluation of summaries. In Proceedings of the ACL-02 Workshop on Automatic Summarization, pages 45-51


Evaluation

• Lin and Hovy (2002)

Each text was decomposed into a list of units (sentences).

Stepped through each model unit (MU) from the ideal summaries.

Marked all system units (SU) sharing content with the current model unit.

The marked system units were judged to express All (4), Most (3), Some (2) or Hardly any (1) of the content of the current model unit.

Grammaticality, cohesion, and coherence were also rated.


Evaluation

• Lin and Hovy (2002)

Compute the weighted recall at threshold t (t = 1 to 4).

Outline an accumulative n-gram matching score (which they call NAMS).

Taken from: Dipanjan Das, Andre F.T. Martins (2007). A Survey on Automatic Text Summarization.


Evaluation

• Recall-Oriented Understudy for Gisting Evaluation (ROUGE) (Lin 2004).

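Let R = {r_1, …, r_m} be a set of reference summaries, and let s be a summary generated automatically by a system. Let Φ_n(d) be a binary vector representing the n-grams contained in a document d.

The metric ROUGE-N is an n-gram recall based statistic:

$$\text{ROUGE-N}(s) = \frac{\sum_{r \in R} \langle \Phi_n(r), \Phi_n(s) \rangle}{\sum_{r \in R} \langle \Phi_n(r), \Phi_n(r) \rangle}$$

where ⟨·,·⟩ denotes the usual inner product of vectors.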

Lin, C.-Y. (2004). Rouge: A package for automatic evaluation of summaries. In Marie-Francine Moens, S. S., editor, Text Summarization Branches Out: Proceedings of the ACL-04 Workshop, pages 74-81


Evaluation

• Recall-Oriented Understudy for Gisting Evaluation (ROUGE) (Lin 2004). Example:

d := “Ram helped Shyam to learn German.”
r1 := “Shyam learnt German by Ram.”
r2 := “Ram taught German to Shyam.”
s := “Shyam learn German from Ram.”

Φn(r1) := [10101]
Φn(r2) := [10111]
Φn(s) := [11101]

ROUGE-N(s) = 6/7
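The 6/7 in this example can be reproduced from the three binary vectors: the matched n-grams are summed over both references and divided by the total number of reference n-grams. The `inner` and `rouge_n` helper names are assumptions; the vectors are the ones given on the slide.

```python
from fractions import Fraction

def inner(u, v):
    """Usual inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def rouge_n(reference_vectors, system_vector):
    """n-gram recall pooled over all references:
    sum_r <phi(r), phi(s)> / sum_r <phi(r), phi(r)>."""
    matched = sum(inner(r, system_vector) for r in reference_vectors)
    total = sum(inner(r, r) for r in reference_vectors)
    return Fraction(matched, total)

phi_r1 = [1, 0, 1, 0, 1]
phi_r2 = [1, 0, 1, 1, 1]
phi_s = [1, 1, 1, 0, 1]

score = rouge_n([phi_r1, phi_r2], phi_s)  # (3 + 3) / (3 + 4)
```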


Evaluation

• Recall-Oriented Understudy for Gisting Evaluation (ROUGE) (Lin 2004).

ROUGE-N can be used for multiple reference summaries.

An alternative is taking the most similar summary in the reference set.


Evaluation

• Recall-Oriented Understudy for Gisting Evaluation (ROUGE) (Lin 2004). Example:

d := “Ram helped Shyam to learn German.”
r1 := “Shyam learnt German by Ram.”
r2 := “Ram taught German to Shyam.”
s := “Shyam learn German from Ram.”

Φn(r1) := [10101]
Φn(r2) := [10111]
Φn(s) := [11101]

ROUGE-N with the most similar reference (r1) = 3/3
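For the alternative that uses only the most similar reference, the per-reference recall is computed separately and the maximum is taken. The helper names below are assumptions; the vectors are the ones from the example.

```python
from fractions import Fraction

def inner(u, v):
    """Usual inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def rouge_n_best_reference(reference_vectors, system_vector):
    """Highest n-gram recall of the system summary against any single reference."""
    return max(
        Fraction(inner(r, system_vector), inner(r, r))
        for r in reference_vectors
    )

phi_r1 = [1, 0, 1, 0, 1]
phi_r2 = [1, 0, 1, 1, 1]
phi_s = [1, 1, 1, 0, 1]

best = rouge_n_best_reference([phi_r1, phi_r2], phi_s)  # max(3/3, 3/4)
```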


Evaluation

• Recall-Oriented Understudy for Gisting Evaluation (ROUGE) (Lin 2004).

Another metric in (Lin, 2004) applies the concept of longest common subsequence (LCS).

Let r1,…,ru be the reference sentences of the documents in R, and s a candidate summary


Evaluation

• Recall-Oriented Understudy for Gisting Evaluation (ROUGE) (Lin 2004). Example:

s := “Ram pushed Shyam into the Pool.”
r1 := “Ram push Shyam into the Pool.”
r2 := “Shyam push Ram into the Pool.”

ROUGE-N: r1 = r2 (“Ram”, “Shyam into the Pool”)

ROUGE-L:
r1: 5/6 (“Ram”, “Shyam into the Pool”)
r2: 4/6 (“Ram”, “into the Pool”)
r1 > r2
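The 5/6 and 4/6 values are longest-common-subsequence lengths divided by the six words of s. A minimal dynamic-programming sketch, with `lcs_length` as an assumed helper name and whitespace tokenization of the example sentences:

```python
from fractions import Fraction

def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]

s = "Ram pushed Shyam into the Pool".split()
r1 = "Ram push Shyam into the Pool".split()
r2 = "Shyam push Ram into the Pool".split()

score_r1 = Fraction(lcs_length(s, r1), len(s))  # 5 shared tokens in order
score_r2 = Fraction(lcs_length(s, r2), len(s))  # 4 shared tokens in order
```

Unlike the unigram count, the LCS respects word order, which is why r1 scores higher than r2 here.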


Evaluation

• Recall-Oriented Understudy for Gisting Evaluation (ROUGE) (Lin 2004).

Yet another measure is ROUGE-S, which can be seen as a modified version of ROUGE-N for n = 2 that is based on skip-bigrams.


Evaluation

• Recall-Oriented Understudy for Gisting Evaluation (ROUGE) (Lin 2004). Example:

s := “Ram pushed Shyam into the Pool.”
r1 := “Ram push Shyam into the Pool.”
r2 := “Shyam push Ram into the Pool.”
r3 := “Shyam into the Pool Ram pushed.”

ROUGE-N: r3 > r1 = r2
ROUGE-L: r1 > r2 = r3
ROUGE-S:
r1: 10/15
r2: 9/15
r3: 7/15
r1 > r2 > r3
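The ROUGE-S counts can be checked by enumerating skip-bigrams, that is, all ordered word pairs of a sentence with any gap between them. The sketch below assumes whitespace tokenization and relies on every word being unique within each sentence, so a set of pairs is sufficient; `skip_bigrams` is a hypothetical helper.

```python
from itertools import combinations

def skip_bigrams(tokens):
    """All in-order word pairs of a sentence (any gap allowed)."""
    return set(combinations(tokens, 2))

s = "Ram pushed Shyam into the Pool".split()
others = {
    "r1": "Ram push Shyam into the Pool".split(),
    "r2": "Shyam push Ram into the Pool".split(),
    "r3": "Shyam into the Pool Ram pushed".split(),
}

total = len(skip_bigrams(s))  # C(6, 2) = 15
matches = {
    name: len(skip_bigrams(s) & skip_bigrams(tokens))
    for name, tokens in others.items()
}
```

The resulting counts over 15 give the ordering r1 > r2 > r3 shown in the example.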


Evaluation

• Recall-Oriented Understudy for Gisting Evaluation (ROUGE) (Lin 2004).

The various versions of ROUGE were evaluated by computing the correlation coefficient between ROUGE scores and human judgment scores


Conclusion

• There is a need to develop efficient and accurate summarization systems.

• Attention has drifted from summarizing scientific articles to news articles, electronic mail messages, advertisements, and blogs.

• Both abstractive and extractive approaches have been attempted.

• Simple extraction of sentences has produced satisfactory results in large-scale applications.

• This survey emphasizes extractive approaches to summarization using statistical methods.


References

• Primary Reference
  o Dipanjan Das, Andre F.T. Martins (2007). A Survey on Automatic Text Summarization. Literature Survey for the Language and Statistics II course at CMU, Pittsburgh

• Secondary References
  o Edmundson, H. P. (1969). New methods in automatic extracting. Journal of the ACM, 16(2):264-285
  o Luhn, H. P. (1958). The automatic creation of literature abstracts. IBM Journal of Research and Development, 2(2):159-165
  o Lin, C.-Y. (1999). Training a selection function for extraction. In Proceedings of CIKM '99, pages 55-62
  o Ono, K., Sumita, K., and Miike, S. (1994). Abstract generation based on rhetorical structure extraction. In Proceedings of Coling '94, pages 344-348
  o Barzilay, R. and Elhadad, M. (1997). Using lexical chains for text summarization. In Proceedings ISTS'97
  o Mani, I. and Bloedorn, E. (1997). Multi-document summarization by graph search and matching. In AAAI/IAAI, pages 622-628


References Contd…

• Evans, D. K. (2005). Similarity-based multilingual multi-document summarization. Technical Report CUCS-014-05, Columbia University
• Radev, D. R., Jing, H., and Budzikowska, M. (2000). Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies. In NAACL-ANLP 2000 Workshop on Automatic summarization, pages 21-30
• Witbrock, M. J. and Mittal, V. O. (1999). Ultra-summarization (poster abstract): a statistical approach to generating highly condensed non-extractive summaries. In Proceedings of SIGIR '99, pages 315-316
• Knight, K. and Marcu, D. (2000). Statistics-based summarization - step one: Sentence compression. In AAAI/IAAI, pages 703-710
• Lin, C.-Y. and Hovy, E. (2002). Manual and automatic evaluation of summaries. In Proceedings of the ACL-02 Workshop on Automatic Summarization, pages 45-51
• Lin, C.-Y. (2004). Rouge: A package for automatic evaluation of summaries. In Marie-Francine Moens, S. S., editor, Text Summarization Branches Out: Proceedings of the ACL-04 Workshop, pages 74-81
• Lin, C.-Y., Cao, G., Gao, J., and Nie, J.-Y. (2006). An information-theoretic approach to automatic evaluation of summaries. In Proceedings of HLT-NAACL '06, pages 463-470