From Extracting to Abstracting:
Generating Quasi-abstractive Summaries
Zhuli Xie
Application & Software Research Center
Motorola Labs
Barbara Di Eugenio, Peter C. Nelson
Department of Computer Science
University of Illinois at Chicago
Outline

Introduction
Quasi-abstractive summaries
Model & Approach
Experimental Results
Conclusion & Discussion
Introduction

Types of text summaries
– Extractive: composed of whole sentences or clauses from the source text. The paradigm adopted by most automatic text summarization systems.
– Abstractive: obtained using various techniques such as paraphrasing. Equivalent to human-written abstracts. Still well beyond the state of the art.
Quasi-abstractive Summaries

Composed not of whole sentences from the source text, but of fragments that form new sentences [Jing 02]
We will show they are more similar to human-written abstracts, as measured with cosine similarity and the ROUGE-1,2 metrics
Quasi-abstractive: Rationale

Two sentences from a human-written abstract:
A1: We introduce the bilingual dual-coding theory as a model for bilingual mental representation.
A2: Based on this model, lexical selection neural networks are implemented for a connectionist transfer project in machine translation.

Extractive summary (by ADAMS):
E1: We have explored an information theoretical neural network that can acquire the verbal associations in the dual-coding theory.
E2: The bilingual dual-coding theory partially answers the above questions.

Candidate sentence set for A1:
S1: The bilingual dual-coding theory partially answers the above questions.
S2: There is a well-known debate in psycholinguistics concerning the bilingual mental representation. . .

Candidate sentence set for A2:
S3: We have explored an information theoretical neural network that can acquire the verbal associations in the dual-coding theory.
S4: It provides a learnable lexical selection sub-system for a connectionist transfer project in machine translation.
Model & Approach

1. Learn a model that can identify Candidate Sentence Sets (CSS's)
   a. Label: generate patterns of correspondence
   b. Train classifier: to identify the CSS's
2. Generate a summary for a new document
   a. Generate CSS's
   b. Realize summary
CSS's Discovery Diagram
Learn the CSS Model (1)

Label:
– decomposition of abstract sentences based on string overlaps
– 70.8% of abstract sentences are composed of fragments of length >= 2 which can be found in the text to be summarized, in our test data (the CMP-LG corpus)
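The labeling step above can be sketched as a greedy longest-match decomposition. This is a minimal illustration, assuming whitespace tokenization and plain substring matching; `decompose` and `min_len` are hypothetical names, not the paper's actual procedure:

```python
def decompose(abstract_sentence, source_sentences, min_len=2):
    """Greedily decompose an abstract sentence into word fragments
    (length >= min_len) that literally occur in the source text."""
    words = abstract_sentence.lower().split()
    source_text = " ".join(s.lower() for s in source_sentences)
    fragments, i = [], 0
    while i < len(words):
        # find the longest fragment starting at position i found in the source
        best_end = None
        for j in range(len(words), i + min_len - 1, -1):
            # note: substring check, so partial-word matches are possible
            if " ".join(words[i:j]) in source_text:
                best_end = j
                break
        if best_end is not None:
            fragments.append(" ".join(words[i:best_end]))
            i = best_end
        else:
            i += 1  # no fragment starts here; skip this word
    return fragments
```

Running it on a toy pair of source sentences yields the two fragments that cover the abstract sentence.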
Learn the CSS Model (2)

Train classifier: given documents where all CSS's have been labelled, transform each document into a set of sentence pairs. Each instance is represented by a feature vector; the target feature is whether the pair belongs to the same CSS.
– Used Decision Trees; also tried Support Vector Machines [Joachims, 2002] and Naïve Bayes classifiers [Borgelt, 1999]
– Sparse data problem: [Japkowicz 2000; Chawla et al., 2003]
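The pair-instance construction can be sketched as follows. The two features used here (Jaccard word overlap and positional distance) are illustrative assumptions, since the slide does not list the actual feature set:

```python
from itertools import combinations

def pair_instances(sentences, css_labels):
    """Turn one labelled document into sentence-pair training instances.
    css_labels[i] is the CSS id of sentence i, or None if unlabelled."""
    instances = []
    for i, j in combinations(range(len(sentences)), 2):
        wi = set(sentences[i].lower().split())
        wj = set(sentences[j].lower().split())
        overlap = len(wi & wj) / len(wi | wj)   # Jaccard word overlap
        distance = j - i                        # positional distance
        same_css = (css_labels[i] is not None
                    and css_labels[i] == css_labels[j])  # target feature
        instances.append(([overlap, distance], same_css))
    return instances
```

The resulting (feature vector, label) pairs could then be fed to any off-the-shelf classifier such as a decision tree.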
Summary Generation

Generate CSS's for unseen documents:
– Use the classifier to identify sentence pairs belonging to the same CSS and merge them
– CSS formation exhibits a natural order, since sentences and sentence pairs are labeled sequentially: i.e., the first CSS will contain at least one fragment which appears earlier in the source text than any fragment in the second CSS
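The pair-merging step can be sketched with a small union-find over sentence indices, assuming classifier-positive pairs as input; this illustrates the merging idea, not the paper's code:

```python
def merge_pairs(n_sentences, same_css_pairs):
    """Merge classifier-positive sentence pairs into CSS's via union-find."""
    parent = list(range(n_sentences))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in same_css_pairs:
        parent[find(i)] = find(j)  # union the two groups

    groups = {}
    for i in range(n_sentences):
        groups.setdefault(find(i), []).append(i)
    # keep multi-sentence groups, ordered by their earliest sentence,
    # matching the natural order of CSS formation noted above
    return sorted((g for g in groups.values() if len(g) > 1),
                  key=lambda g: g[0])
```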
Summary Realization
Summary Realization

Simple Quasi-abstractive (SQa)
– A new sentence is generated by appending a new word to the previously generated sequence, according to n-gram probabilities calculated from the CSS
– Each CSS is used only once
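A minimal SQa-style realization sketch, using bigram counts over one CSS and greedy next-word selection; the paper may use higher-order n-grams and a different selection rule:

```python
from collections import Counter, defaultdict

def bigram_model(css_sentences):
    """Bigram counts over a CSS, with <s>/</s> boundary markers."""
    counts = defaultdict(Counter)
    for sent in css_sentences:
        tokens = ["<s>"] + sent.lower().split() + ["</s>"]
        for a, b in zip(tokens, tokens[1:]):
            counts[a][b] += 1
    return counts

def generate(counts, max_len=30):
    """Repeatedly append the most probable next word until </s>."""
    word, out = "<s>", []
    while len(out) < max_len and counts[word]:
        word = counts[word].most_common(1)[0][0]
        if word == "</s>":
            break
        out.append(word)
    return " ".join(out)
```

On a tiny CSS the greedy walk simply follows the majority continuation at each step.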
Summary Realization

Quasi-abstractive with Salient Topics (QaST)
– Salient-NPs model based on social networks [Wasserman & Faust, 94; Xie 2005]
– Sort the predicted salient NPs according to their lengths
– Traverse the list of salient NPs and the CSS-based n-gram probabilities in parallel to generate a sentence: use the highest-ranked NP which has not been used yet, and the first n-gram probability model that contains this NP
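The parallel traversal can be sketched as follows, representing each CSS-based n-gram model simply by its vocabulary; `pick_np_and_model` is a hypothetical helper, not the paper's implementation:

```python
def pick_np_and_model(ranked_nps, used, css_vocabs):
    """Pair the highest-ranked unused salient NP with the first CSS-based
    model whose vocabulary contains all of the NP's words."""
    for np in ranked_nps:
        if np in used:
            continue  # each NP anchors at most one sentence
        for k, vocab in enumerate(css_vocabs):
            if all(w in vocab for w in np.split()):
                used.add(np)
                return np, k  # NP plus index of the chosen n-gram model
    return None, None  # nothing left to anchor
```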
Topic Prediction

Salient NPs
– The abstract should contain the salient topics of the article
– Topics are often expressed by NPs
– We assume that the NPs in an abstract represent the most salient topics in the article

NP Network & NP Centrality
– Collocated NPs can be connected, and hence a network can be formed
– Social network analysis techniques are used to analyze the network [Wasserman & Faust 94] and calculate centrality for nodes [Xie 05]
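A minimal sketch of the NP network, using degree centrality as a simple stand-in for the centrality measure actually used in [Xie 05]:

```python
from collections import defaultdict
from itertools import combinations

def np_centrality(sentences_nps):
    """Connect NPs that co-occur in a sentence, then rank the NPs
    by degree centrality (number of distinct neighbors)."""
    neighbors = defaultdict(set)
    for nps in sentences_nps:
        for a, b in combinations(sorted(set(nps)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return sorted(neighbors, key=lambda np: len(neighbors[np]), reverse=True)
```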
Experiments

Data: 178 documents from the CMP-LG corpus, 3-fold cross-validation

Four models:
– Lead: first sentence from each of the first m paragraphs
– ADAMS: top m sentences ranked according to the sentence ranking function ADAMS learned
– SQa: uses n-gram probabilities over the first m discovered CSS's to generate new sentences
– QaST: anchors the choice of a specific set of n-gram probabilities in salient topics; stops after m sentences have been generated
Evaluation Metrics

Cosine similarity: bag-of-words method
ROUGE-1,2 [Lin 2004]
– A recall measure that compares a machine-generated summary with its reference summaries
– Still a bag-of-words/n-gram method
– But has shown high correlation with human judges
$$
\mathrm{ROUGE\text{-}N} = \frac{\sum_{S \in \{\mathrm{reference\ summaries}\}} \; \sum_{gram_n \in S} \mathrm{Count}_{\mathrm{match}}(gram_n)}{\sum_{S \in \{\mathrm{reference\ summaries}\}} \; \sum_{gram_n \in S} \mathrm{Count}(gram_n)}
$$
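The two metrics can be sketched directly from their definitions: `rouge_n` follows the recall formula above with clipped n-gram counts, and `cosine_sim` is plain bag-of-words cosine similarity:

```python
from collections import Counter
import math

def rouge_n(candidate, references, n=1):
    """ROUGE-N recall: matched n-grams over total reference n-grams."""
    def ngrams(text):
        toks = text.lower().split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    cand = ngrams(candidate)
    matched = total = 0
    for ref in references:
        r = ngrams(ref)
        total += sum(r.values())
        # clip each n-gram's match count by its count in the candidate
        matched += sum(min(c, cand[g]) for g, c in r.items())
    return matched / total

def cosine_sim(a, b):
    """Bag-of-words cosine similarity between two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0
```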
Experimental Results

                    Lead    ADAMS   SQa     QaST
Cosine similarity   0.330   0.375   0.293   0.427
ROUGE-1 score       0.333   0.426   0.334   0.438
ROUGE-2 score       0.098   0.135   0.078   0.161

– SQa's performance is even lower than Lead's
– ADAMS achieved +13.6%, +27.9%, and +37.8% improvements over Lead on the three metrics
– QaST achieved +29.4%, +31.5%, and +64.3% improvements over Lead, and +13.9%, +2.8%, and +19.3% over ADAMS
– All differences between QaST and the other models are statistically significant (two-sample t-test), except ADAMS on ROUGE-1
Generated Sentence Sample

QaST:
In collaborative expert-consultation dialogues, two participants ( executing agent and the consultant bring to the plan construction task different knowledge about the domain and the desirable characteristics of the resulting domain plan.

Original:
In collaborative expert-consultation dialogues, two participants (executing agent and consultant) work together to construct a plan for achieving the executing agent’s domain goal. The executing agent and the consultant bring to the plan construction task different knowledge about the domain and the desirable characteristics of the resulting domain plan.
Sample Summary

QaST: In this paper, we present a plan-based architecture for response generation in collaborative consultation dialogues, with emphasis on cases in which the user has indicated preferences. to an existing tripartite model might require inferring a chain of actions for addition to the shared plan, can appropriately respond to user queries that are motivated by ill-formed or suboptimal solutions, and handles in a unified manner the negotiation of proposed domain actions, proposed problem-solving actions, and beliefs proposed by discourse actions as well as the relationship amongst them. In collaborative expert-consultation dialogues, two participants( executing agent and the consultant bring to the plan construction task different knowledge about the domain and the desirable characteristics of the resulting domain plan. In suggesting better alternatives, our system differs from van Beek’s in a number of ways.

Abstract: This paper presents a plan-based architecture for response generation in collaborative consultation dialogues, with emphasis on cases in which the system (consultant) and user (executing agent) disagree. Our work contributes to an overall system for collaborative problem-solving by providing a plan-based framework that captures the Propose-Evaluate-Modify cycle of collaboration, and by allowing the system to initiate subdialogues to negotiate proposed additions to the shared plan and to provide support for its claims. In addition, our system handles in a unified manner the negotiation of proposed domain actions, proposed problem-solving actions, and beliefs proposed by discourse actions. Furthermore, it captures cooperative responses within the collaborative framework and accounts for why questions are sometimes never answered.
Conclusion & Discussion

New type of machine-generated summary: the quasi-abstractive summary
An n-gram model anchored by salient NPs gives good results
Further investigation is needed in several aspects
– CSS discovery with cost-sensitive classifiers [Domingos, 1999; Ting, 2002]
– Grammaticality and length of generated summaries [Wan et al., 2007]