Towards Automated Related Work Summarization (ReWoS)


Page 1: Towards Automated Related Work Summarization (ReWoS)

Towards Automated Related Work Summarization

(ReWoS)

HOANG Cong Duy Vu

03/12/2010

Page 2: Towards Automated Related Work Summarization (ReWoS)

Outline

• Recall
• A Motivating Example
• The Proposed Approach
  – General Content Summarization (GCSum)
  – Specific Content Summarization (SCSum)
  – Generation
• Experiments & Results
• Future Work
• Conclusion

Page 3: Towards Automated Related Work Summarization (ReWoS)

Recall

[Figure] The user provides a set of articles, a topic hierarchy tree (the assumption of this work), and a desired summary length; the RW Summarizer produces a RW summary.

RW: related work

Page 4: Towards Automated Related Work Summarization (ReWoS)

A Motivating Example

A related work section extracted from “Bilingual Topic Aspect Classification with A Few Training Examples” (Wu et al., 2008)

Page 5: Towards Automated Related Work Summarization (ReWoS)

The Proposed Approach

[Figure] The ReWoS architecture. Decision edges are labeled as (T)rue, (F)alse or (R)elevant; GCSum handles internal nodes and SCSum handles leaf nodes.

Page 6: Towards Automated Related Work Summarization (ReWoS)

The Proposed Approach

• Pre-Processing (a filter sketch follows this list)
  – Based on heuristic rules over sentence length and lexical clues:
    • Remove sentences whose token-based length is too short (< 7) or too long (> 80)
    • Remove sentences referring to future tenses
    • Remove sentences containing obviously redundant clues such as: “in the section ...”, “figure XXX shows ...”, “for instance” ...
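A minimal sketch of this filter in Python, assuming whitespace tokenization; the clue-phrase lists are illustrative and shorter than the real ones, and the future-tense test here is a naive keyword check rather than the rule actually used:

```python
import re

# Illustrative (not exhaustive) lexical clues; the real lists are assumed to be longer.
REDUNDANT_CLUES = ("in the section", "for instance")
FIGURE_CLUE = re.compile(r"figure\s+\S+\s+shows")
FUTURE_CLUES = ("we will", "in the future", "future work")

def keep_sentence(sentence: str) -> bool:
    """Return True if the sentence survives the heuristic pre-processing rules."""
    tokens = sentence.split()
    if len(tokens) < 7 or len(tokens) > 80:            # too short or too long
        return False
    lowered = sentence.lower()
    if any(clue in lowered for clue in FUTURE_CLUES):  # refers to future work/tense (naive check)
        return False
    if any(clue in lowered for clue in REDUNDANT_CLUES) or FIGURE_CLUE.search(lowered):
        return False                                   # obviously redundant clue
    return True
```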

Page 7: Towards Automated Related Work Summarization (ReWoS)

The Proposed Approach

• Agent-based rule (a sketch follows this list)
  – Attempts to distinguish whether a sentence describes the author’s own work or not
  – Based on the presence of tokens that signal work done by the author, such as “we”, “our”, “us”, “this approach”, and “this method” ...
  – If a sentence does not satisfy this rule, it is routed to GCSum; otherwise, to SCSum
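A sketch of the routing decision under the assumption that plain token/phrase matching is enough (the marker list below is just the slide's examples):

```python
import re

AGENTIVE_MARKERS = ("we", "our", "us", "this approach", "this method")

def route(sentence: str) -> str:
    """Send author-as-agent sentences to SCSum, everything else to GCSum."""
    lowered = sentence.lower()
    words = set(re.findall(r"[a-z]+", lowered))
    is_agentive = any(
        (marker in lowered) if " " in marker else (marker in words)
        for marker in AGENTIVE_MARKERS
    )
    return "SCSum" if is_agentive else "GCSum"

# route("We propose a new phrase-based model.")             -> "SCSum"
# route("Text classification assigns labels to documents.") -> "GCSum"
```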

Page 8: Towards Automated Related Work Summarization (ReWoS)

General Content Summarization (GCSum)

• The objective of GCSum is to extract sentences containing useful background information on the topics of the internal node in focus.

Page 9: Towards Automated Related Work Summarization (ReWoS)

General Content Summarization (GCSum)

General content is either informative or indicative.

Informative examples:
1) Text classification is a task that assigns a certain number of pre-defined labels for a given text.
2) Statistical machine translation (SMT) seeks to develop mathematical models of the translation process whose parameters can be automatically estimated from a parallel corpus.

Indicative examples:
1) Many previous studies have approached the problem of mono-lingual text classification.
2) This paper refers to the problem of sentiment analysis.

Page 10: Towards Automated Related Work Summarization (ReWoS)

General Content Summarization (GCSum)

• Informative sentences
  – Give detail on a specific aspect of the problem, e.g. definitions, purpose or application of the topic
• Indicative sentences
  – Simpler, inserted to make the topic transition explicit and rhetorically sound
• Summarization issue
  – Given a topic:
    • For indicative sentences, generate from pre-defined templates (a template sketch follows this list)
    • For informative sentences, extract from the input articles
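For the indicative case, a hypothetical template sketch; the actual templates used by ReWoS are not shown on this slide, so the wording and placeholders below are invented for illustration:

```python
# Hypothetical indicative templates; {topic} and {citations} are filled from the
# topic node's keywords and its attached references.
INDICATIVE_TEMPLATES = (
    "Many previous studies have approached the problem of {topic} ({citations}).",
    "There is a rich body of work on {topic} ({citations}).",
)

def generate_indicative(topic: str, citations: list[str], template_id: int = 0) -> str:
    """Fall back to a pre-defined template when no informative sentence is found."""
    return INDICATIVE_TEMPLATES[template_id].format(
        topic=topic, citations="; ".join(citations)
    )

# generate_indicative("mono-lingual text classification", ["Author et al., 2009"])
```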

Page 11: Towards Automated Related Work Summarization (ReWoS)

General Content Summarization (GCSum)

• GCSum first checks the subject of each candidate sentence, filtering out those whose subjects do not contain at least one topic keyword. (Subject-based rule)

• Or GCSum checks whether stock verb phrases (i.e., “based on”, “make use of” and 23 other patterns) are used as the main verb. (Verb-based rule)

• Or GCSum checks for the presence of at least one citation, since general sentences may list a set of citations as examples. (Citation-based rule)

Important: if no informative sentence can be found in the input articles, indicative sentences are generated instead. A sketch combining the three rules follows.
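A rough sketch of how these rules might be combined; the "subject" is approximated by the first few tokens, and the verb and citation patterns below are partial stand-ins for the full lists mentioned on the slide:

```python
import re

STOCK_VERB_PHRASES = ("based on", "make use of", "makes use of")  # partial; the slide mentions 23 more patterns
CITATION = re.compile(r"\(\s*[A-Z][\w-]+(?:\s+et al\.)?,?\s*\d{4}\s*\)|\[\d+\]")

def is_informative_candidate(sentence: str, topic_keywords: set[str]) -> bool:
    """Keep a sentence if it passes the subject-, verb-, or citation-based rule."""
    lowered = sentence.lower()
    tokens = lowered.split()
    # Subject-based rule: approximate the subject by the first few tokens and
    # require at least one topic keyword among them (crude heuristic).
    if set(tokens[:6]) & topic_keywords:
        return True
    # Verb-based rule: a stock verb phrase is used (here: appears anywhere).
    if any(phrase in lowered for phrase in STOCK_VERB_PHRASES):
        return True
    # Citation-based rule: the sentence lists at least one citation.
    return bool(CITATION.search(sentence))
```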

Page 12: Towards Automated Related Work Summarization (ReWoS)

General Content Summarization (GCSum)

• Topic relevance computation (GCSum)
  – Ranks sentences based on keyword content
  – States that the topic of an internal node is affected by its surrounding nodes: ancestors, descendants and others

score_S = score_S^QA + score_S^Q − score_S^QR

  – score_S is the final relevance score
  – score_S^QA, score_S^Q, and score_S^QR are the component relevance scores of the sentence S with respect to the ancestor, current, and other remaining nodes, respectively.

Page 13: Towards Automated Related Work Summarization (ReWoS)

General Content Summarization (GCSum)

• Topic relevance computation (GCSum)

[Figure: a topic tree with nodes 1-7, highlighting the internal node in focus ("itself"), its ancestors, and the other remaining nodes]

The linear combination: score_S = score_S^QA (ancestors) + score_S^Q (itself) − score_S^QR (others)

The maximum number of sentences for each intermediate node is 2-3.

Page 14: Towards Automated Related Work Summarization (ReWoS)

General Content Summarization (GCSum)

• To obtain each component relevance score, we employ TF×ISF relevance computation
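A simplified sketch of one component score using TF×ISF (term frequency × inverse sentence frequency) and the linear combination from the previous slides; the length normalization and smoothing here are assumptions, since the exact formula is not reproduced on the slide:

```python
import math
from collections import Counter

def tf_isf_score(sentence: str, query_keywords: set[str], all_sentences: list[str]) -> float:
    """Component relevance of a sentence with respect to one query (a set of topic keywords)."""
    tokens = sentence.lower().split()
    counts = Counter(tokens)
    n = len(all_sentences)
    score = 0.0
    for kw in query_keywords:
        sf = sum(1 for s in all_sentences if kw in s.lower().split())  # sentence frequency
        if counts[kw] and sf:
            score += counts[kw] * math.log(n / sf)                     # TF x ISF
    return score / (len(tokens) or 1)                                  # length normalization (assumption)

def gcsum_relevance(sentence, q_ancestors, q_current, q_others, all_sentences) -> float:
    """score_S = score_S^QA + score_S^Q - score_S^QR (the linear combination above)."""
    return (tf_isf_score(sentence, q_ancestors, all_sentences)
            + tf_isf_score(sentence, q_current, all_sentences)
            - tf_isf_score(sentence, q_others, all_sentences))
```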

Page 15: Towards Automated Related Work Summarization (ReWoS)

Specific Content Summarization (SCSum)

• Sentences that are marked with author-as-agent are input to the Specific Content Summarization (SCSum) module.

• SCSum aims to extract sentences that contain detailed information about a specific author’s work that is relevant to the input leaf nodes’ topic.

Page 16: Towards Automated Related Work Summarization (ReWoS)

Specific Content Summarization (SCSum)

• Topic relevance computation (SCSum)

[Figure: a topic tree highlighting the leaf node in focus ("itself"), its ancestors, and its sibling nodes]

The linear combination mirrors GCSum: a sentence's relevance is increased by matches against the ancestor nodes and the leaf node itself, and decreased by matches against its sibling nodes.

Initially, the number of sentences allotted to each leaf node is assigned equally.

The relevance score is computed using a formula similar to the GCSum one presented earlier, with sibling nodes in place of the "other" nodes.

Page 17: Towards Automated Related Work Summarization (ReWoS)

Specific Content Summarization (SCSum)

• Context modeling
  – Motivation: single sentences occasionally do not contain enough context to clearly express the idea mentioned in the original articles
  – Try to use the contexts to increase the confidence of agent-based sentences

[Diagram] final_score(sentence) = score(sentence) + score(contexts), both computed with respect to the topic

Page 18: Towards Automated Related Work Summarization (ReWoS)

SCSum - Context modeling

Example extracted from (Bannard and Callison-Burch 2005):

Agent-based sentence:
*** We evaluated the accuracy of each of the paraphrases that was extracted from the manually aligned data, as well as the top ranked paraphrases from the experimental conditions detailed below in Section 3.3.

Adjacent sentences (context):
*** Because the accuracy of paraphrases can vary depending on context, we substituted each set of candidate paraphrases into between 2-10 sentences which contained the original phrase.
*** Figure 4 shows the paraphrases for under control substituted into one of the sentences in which it occurred.
*** We created a total of 289 such evaluation sets, with a total of 1366 unique sentences created through substitution.
*** We had two native English speakers produce judgments as to whether the new sentences preserved the meaning of the original phrase and as to whether they remained grammatical.
*** Paraphrases that were judged to preserve both meaning and grammaticality were considered to be correct, and examples which failed on either judgment were considered to be incorrect.

Summary sentence:
*** (Bannard and Callison-Burch 2005) replaced phrases with paraphrases in a number of sentences and asked judges whether the substitutions “preserved meaning and remained grammatical.”

Page 19: Towards Automated Related Work Summarization (ReWoS)

Specific Content Summarization (SCSum)

• Context modeling
  – Choose nearby sentences within a contextual window (size 5) after the agent-based sentence to better represent the given topic (a sketch follows).
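A sketch of the context bonus under the assumption that the scores of the window sentences are simply added to the agent-based sentence's own topic-relevance score, as suggested by the diagram two slides back:

```python
def final_score(article_sentences: list[str], idx: int, relevance, window: int = 5) -> float:
    """Combine an agent-based sentence's relevance with that of its following context.

    article_sentences : sentences of one article, in document order
    idx               : index of the agent-based sentence
    relevance         : callable mapping a sentence to its topic-relevance score
    """
    own = relevance(article_sentences[idx])
    context = article_sentences[idx + 1 : idx + 1 + window]   # window after the sentence
    return own + sum(relevance(s) for s in context)
```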

Page 20: Towards Automated Related Work Summarization (ReWoS)

Specific Content Summarization (SCSum)

• Weighting
  – Observation: the presence of keywords from one or more of the current, ancestor and sibling nodes may affect the final score from the topic relevance computation
  – Add a new weighting coefficient to the score computed by the topic relevance computation (SCSum); the coefficient takes on differing values based on the presence of keywords in the sentence

Values as follows:

If sentence contains no keywords in siblings:

+ Keywords in both ancestors & itself 1

+ Keywords in itself only 0.5

+ Keywords in ancestors only 0.25

If sentence contains keywords in siblings 0.1 (penalty)
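The coefficient values above written out as a small function; treating the keyword tests as plain token-set membership checks is an assumption about the implementation:

```python
def weighting_coefficient(sentence: str,
                          kw_itself: set[str],
                          kw_ancestors: set[str],
                          kw_siblings: set[str]) -> float:
    """Multiplier applied to the SCSum topic-relevance score of the sentence."""
    tokens = set(sentence.lower().split())
    if tokens & kw_siblings:
        return 0.1                     # penalty: mentions a sibling topic
    in_itself = bool(tokens & kw_itself)
    in_ancestors = bool(tokens & kw_ancestors)
    if in_itself and in_ancestors:
        return 1.0                     # keywords in both ancestors and itself
    if in_itself:
        return 0.5                     # keywords in itself only
    if in_ancestors:
        return 0.25                    # keywords in ancestors only
    return 0.0                         # no keywords at all (case not covered by the slide; assumption)
```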

Page 21: Towards Automated Related Work Summarization (ReWoS)

Specific Content Summarization (SCSum)

• Ranking & Re-ranking
  – Sentences are ranked in descending order of their relevance scores
  – Then, a simplified MMR (SimRank) is performed (a sketch follows):
    • A sentence X is removed if its maximum cosine similarity with any sentence Y already chosen in previous steps of SimRank exceeds a pre-defined threshold (0.75).
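A sketch of the simplified MMR step with a plain bag-of-words cosine similarity; the slide fixes only the 0.75 threshold, so the vector representation is an assumption:

```python
import math
from collections import Counter

def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two sentences."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

def simrank(ranked_sentences: list[str], threshold: float = 0.75) -> list[str]:
    """Greedily keep sentences (best first), dropping any too similar to an already kept one."""
    selected: list[str] = []
    for sent in ranked_sentences:               # assumed sorted by descending relevance score
        if all(cosine(sent, kept) <= threshold for kept in selected):
            selected.append(sent)
    return selected
```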

Page 22: Towards Automated Related Work Summarization (ReWoS)

Post-Processing

• Two steps:
  – First, replace agentive forms (e.g., “we”, “our”, “this study”, ...) with a citation to the source article
  – Second, resolve abbreviations found in the extracted sentences
    • E.g. SMT → Statistical Machine Translation
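A rough sketch of the two steps; the agentive-form pattern is just the slide's examples, and real abbreviation resolution would need expansions harvested from the source article (the glossary argument here stands in for that):

```python
import re

AGENTIVE_FORMS = re.compile(r"\b(?:we|our|us|this study|this approach|this method)\b", re.IGNORECASE)

def replace_agentive_forms(sentence: str, citation: str) -> str:
    """Step 1: replace agentive forms with a citation to the source article (crude substitution)."""
    return AGENTIVE_FORMS.sub(citation, sentence)

def expand_abbreviations(sentence: str, glossary: dict[str, str]) -> str:
    """Step 2: expand abbreviations found in the extracted sentence."""
    for abbr, expansion in glossary.items():
        sentence = re.sub(rf"\b{re.escape(abbr)}\b", f"{expansion} ({abbr})", sentence)
    return sentence

# expand_abbreviations("SMT systems are trained on parallel corpora.",
#                      {"SMT": "Statistical Machine Translation"})
# -> "Statistical Machine Translation (SMT) systems are trained on parallel corpora."
```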

Page 23: Towards Automated Related Work Summarization (ReWoS)

Generation

• In this work, we generate the related work summaries using only depth-first traversals to form the ordering of topic nodes in the topic tree

Node ordering (depth-first): 1 − 4 − 2 − 3 − 5 − 6 − 7
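A minimal sketch of the node ordering; the tree below is an illustrative shape chosen so that a depth-first walk reproduces the ordering on the slide, not the actual topic tree from the example:

```python
def depth_first_order(tree: dict[int, list[int]], root: int) -> list[int]:
    """Return topic-node ids in the order their summary sentences are emitted."""
    order: list[int] = []

    def visit(node: int) -> None:
        order.append(node)
        for child in tree.get(node, []):   # children are visited in stored order
            visit(child)

    visit(root)
    return order

# Illustrative tree: 1 -> [4, 5], 4 -> [2, 3], 5 -> [6, 7]
# depth_first_order({1: [4, 5], 4: [2, 3], 5: [6, 7]}, 1)  ->  [1, 4, 2, 3, 5, 6, 7]
```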

Page 24: Towards Automated Related Work Summarization (ReWoS)

Experiments & Results

• Dataset
  – Use RWSData described before, consisting of 20 sets
    • 10 out of the 20 sets were evaluated both automatically and manually
• Baselines
  – LEAD (title- and abstract-based RW)
  – MEAD (centroid + cosine similarity)
• Proposed systems
  – ReWoS-WCM (ReWoS without context modeling)
  – ReWoS-CM (ReWoS with context modeling)

Page 25: Towards Automated Related Work Summarization (ReWoS)

Experiments & Results

• Automatic evaluation
  – Use ROUGE variants (ROUGE-1, ROUGE-2, ROUGE-S4, ROUGE-SU4); a simplified ROUGE-1 sketch follows this list
• Manual evaluation (measured on a 5-point scale from 1 (very poor) to 5 (very good))
  – Correctness: Is the summary content actually relevant to the hierarchical topics given?
  – Novelty: Does the summary introduce novel information that is significant in comparison with the human-created summary?
  – Fluency: Does the summary’s exposition flow well, in terms of syntax as well as discourse?
  – Usefulness: Is the summary acceptable in terms of its usefulness in helping researchers quickly grasp the related work relevant to the hierarchical topics given?

• Summary length: 1% of the original relevant articles, measured in sentences
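For intuition only, a simplified ROUGE-1 recall computation against a single reference; this is an illustration, not the official ROUGE toolkit, and the actual evaluation also used ROUGE-2, ROUGE-S4 and ROUGE-SU4:

```python
from collections import Counter

def rouge_1_recall(candidate: str, reference: str) -> float:
    """Fraction of reference unigrams covered by the candidate (clipped counts)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(cand[t], ref[t]) for t in ref)
    return overlap / (sum(ref.values()) or 1)

# rouge_1_recall("statistical machine translation models",
#                "models for statistical machine translation")  -> 0.8
```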

Page 26: Towards Automated Related Work Summarization (ReWoS)

Experiments & Results

- ROUGE evaluation seems to behave unreliably when dealing with verbose summaries, which MEAD often produces.

- Related work summaries are multi-topic summaries of multi-article references. This can cause miscounting of overlapping n-grams that occur across multiple topics or references.

Page 27: Towards Automated Related Work Summarization (ReWoS)

Experiments & Results

- The table shows that both ReWoS-WCM and ReWoS-CM perform significantly better than the baselines in terms of correctness, novelty, and usefulness.

- The comparison with LEAD shows that the necessary information is located not only in titles and abstracts, but also in relevant portions of the research article body.

- ReWoS-CM (with context modeling) performed equivalently to ReWoS-WCM (without it) in terms of correctness and usefulness.

- For novelty, ReWoS-CM is better than ReWoS-WCM. This indicates that the proposed context modeling component is useful in providing new information.

Page 28: Towards Automated Related Work Summarization (ReWoS)

Future work

• Overcome the assumption that a topic hierarchy tree is given
• Investigate better generation
  – Focus on local coherence and topic transitions

Page 29: Towards Automated Related Work Summarization (ReWoS)

Conclusion

• To the best of our knowledge, automated related work summarization has not been studied before.

• This work took initial steps towards solving this problem, by dividing the task into general and specific summarization processes.

• Initial results showed an improvement over generic multi-document baselines in both automatic and human evaluation.

Page 30: Towards Automated Related Work Summarization (ReWoS)

Thank you!

• Questions???