Ranking Related News Predictions


Transcript of Ranking Related News Predictions

Page 1: Ranking Related News Predictions

Ranking Related News Predictions


Nattiya Kanhabua (1), Roi Blanco (2), and Michael Matthews (2)

(1) Norwegian University of Science and Technology, Norway

(2) Yahoo! Research, Barcelona, Spain

SIGIR’2011, Beijing

Page 2: Ranking Related News Predictions

Outline

Introduction: Problem Statement, Related Work, Contributions
Task Definition: System Architecture, Models
Approach: Features, Ranking Method
Evaluation: Experiment Setting, Experimental Results


Page 7: Ranking Related News Predictions

Introduction

Problem Statement

People are naturally curious about the future.
◮ How long will a war in the Middle East last?
◮ What is the latest health care plan?
◮ What will happen to EU economies in the next 5 years?
◮ What will be the potential effects of climate change?

Over 32% of 2.5M documents from Yahoo! News (July 2009 to July 2010) contain at least one prediction.

A new task called ranking related news predictions:
◮ Retrieve predictions related to a news story from news archives.
◮ Rank them according to their relevance to the news story.



Page 12: Ranking Related News Predictions


Introduction

Problem Statement

Related News Predictions

Query = <gas, emission, percent, european, global, climate>


Page 14: Ranking Related News Predictions


Introduction

Related Work

Future-related Information Analyzing Tools

Recorded Future

Difference: a user must specify a query in advance using “predefined” entities.

Page 15: Ranking Related News Predictions


Introduction

Related Work

Future-related Information Analyzing Tools

Yahoo!'s Time Explorer

Difference: No ranking or performance evaluation is done.

Page 16: Ranking Related News Predictions


Introduction

Related Work

Previous Work on Future Retrieval

R. Baeza-Yates. Searching the future. SIGIR'2005 Workshop on Mathematical/Formal Methods in IR.

◮ Extract temporal expressions from news articles.
◮ Retrieve future information using a probabilistic model, i.e., multiplying term similarity and a time confidence.
◮ Only a small dataset and a year granularity are used.

Page 17: Ranking Related News Predictions


Introduction

Related Work

Previous Work on Future Retrieval

A. Jatowt et al. Supporting analysis of future-related information in news archives and the web. JCDL'2009.

◮ Extract future mentions from news snippets obtained from search engines.
◮ Summarize and aggregate results using clustering methods.
◮ Does not focus on relevance and ranking of future information.


Page 19: Ranking Related News Predictions


Introduction

Contributions

I. Formally define the task of ranking related news predictions.

II. Four classes of features: term similarity, entity-based similarity, topic similarity, and temporal similarity.

III. Extensive evaluation using a dataset with over 6,000 judgments from the NYT Annotated Corpus.


Page 21: Ranking Related News Predictions


Task Definition

System Architecture

Step 1: Document annotation.

◮ Extract temporal expressions using time and event recognition.
◮ Normalize them to dates so they can be anchored on a timeline.
◮ Output: sentences annotated with named entities and dates, i.e., predictions.

Page 22: Ranking Related News Predictions


Task Definition

System Architecture

Step 2: Retrieving predictions.

◮ Automatically generate a query from the news article being read.
◮ Retrieve predictions that match the query.
◮ Rank predictions by relevance. A prediction is "relevant" if it is about the topics of the article.


Page 24: Ranking Related News Predictions


Task Definition

Models

Annotated Document Model

Collection C = {d1, . . . , dn}.

Document d = {{w1, . . . , wn}, time(d)}.

◮ time(d) gives the publication date of d.

Annotated document d̂ is composed of:

◮ Named entities d̂e = {e1, . . . , en}
◮ Temporal expressions d̂t = {t1, . . . , tm}
◮ Sentences d̂s = {s1, . . . , sz}
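The annotated document and prediction models can be sketched as plain data structures. This is a hypothetical illustration only: the class and field names follow the slides' notation, not the authors' implementation.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

@dataclass
class AnnotatedDocument:
    """An annotated news article d-hat from the collection C."""
    words: List[str]           # {w1, ..., wn}
    pub_date: date             # time(d)
    entities: List[str]        # d-hat_e = {e1, ..., en}
    temporal_exprs: List[str]  # d-hat_t = {t1, ..., tm}
    sentences: List[str]       # d-hat_s = {s1, ..., sz}

@dataclass
class Prediction:
    """A sentence p with an extracted future date, per the prediction model."""
    id: str
    parent_id: str   # links p back to its parent document dp
    title: str
    text: str
    context: str
    entity: str
    future_date: date
    pub_date: date

# The example record from the prediction-model slide (TEXT/CONTEXT truncated):
p = Prediction(
    id="1136243_1",
    parent_id="1136243",
    title="Gore Pledges A Health Plan For Every Child",
    text="Vice President Al Gore proposed today to guarantee access to "
         "affordable health insurance for all children by 2005 ...",
    context="Mr. Gore acknowledged that the number of Americans without "
            "health coverage had increased steadily ...",
    entity="Al Gore",
    future_date=date(2005, 1, 1),  # FUTURE_DATE 2005; day resolution assumed
    pub_date=date(1999, 9, 8),
)
```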


Page 26: Ranking Related News Predictions


Task Definition

Models

Prediction Model

Let dp be the parent document of a prediction p.

p is a sentence containing field/value pairs:

ID: 1136243_1
PARENT_ID: 1136243
TITLE: Gore Pledges A Health Plan For Every Child
TEXT: Vice President Al Gore proposed today to guarantee access to affordable health insurance for all children by 2005, expanding on a program enacted two years ago that he conceded had had limited success so far.
CONTEXT: Mr. Gore acknowledged that the number of Americans without health coverage had increased steadily since he and President Clinton took office.
ENTITY: Al Gore
FUTURE_DATE: 2005
PUB_DATE: 1999/09/08

Page 27: Ranking Related News Predictions


Task Definition

Models

Query Model

Query q is extracted from the news article being read, dq:

1. Keywords qtext

2. Time constraints qtime

Page 28: Ranking Related News Predictions


Task Definition

Models

Query Keywords

[Diagram: query keyword extraction from the news article being read, producing (1) an entity query QE, (2) a term query QT, and (3) a combined query QC, which are matched against a prediction's fields (ID, PARENT_ID, TITLE, TEXT, ENTITY, CONTEXT, FUTURE_DATE, PUB_DATE).]

QE = {e1, . . . , em}, e.g., 〈Barack Obama, Iraq, America〉

Page 29: Ranking Related News Predictions


Task Definition

Models

Query Keywords


QT = {w1, . . . , wn}, e.g., 〈troop, war, withdraw〉

Page 30: Ranking Related News Predictions


Task Definition

Models

Query Keywords


QC = {e1, . . . , em} ∪ {w1, . . . , wn}, e.g., 〈Barack Obama, Iraq, America, troop, war, withdraw〉
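The three query types reduce to simple set construction over the extracted entities and terms. A minimal sketch, assuming the entities and terms arrive pre-ranked by importance (the function name and the top-m/top-n cut-offs, which the evaluation later sets to m = 11 and n = 10, are illustrative):

```python
def build_queries(entities, terms, m=11, n=10):
    """Build the entity query QE, term query QT, and combined query QC
    from the news article being read (inputs assumed ranked by importance)."""
    q_e = list(entities[:m])                       # entity query QE
    q_t = list(terms[:n])                          # term query QT
    q_c = q_e + [t for t in q_t if t not in q_e]   # QC = QE ∪ QT
    return q_e, q_t, q_c

q_e, q_t, q_c = build_queries(
    ["Barack Obama", "Iraq", "America"],
    ["troop", "war", "withdraw"],
)
# q_c combines both, matching the slide's example query
```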

Page 31: Ranking Related News Predictions


Task Definition

Models

Query Time

Time constraints qtime

1. Only predictions with future dates in (time(dq), tmax], i.e., future to time(dq).

2. Only predictions from articles published within [tmin, time(dq)].

[Timeline: the query sits at "now"; predictions P are anchored at past publication dates and future target dates (e.g., 1999, 2002, 2006, 2016, 2018, 2033).]
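The two time constraints can be expressed as a simple filter. A minimal sketch under the definitions above; the function name and argument layout are assumptions, not the paper's code:

```python
from datetime import date

def satisfies_time_constraints(pred_future_date, pred_pub_date, query_date,
                               t_min, t_max):
    """Keep a prediction only if (1) its future date lies in (time(dq), tmax]
    and (2) its parent article was published within [tmin, time(dq)]."""
    future_ok = query_date < pred_future_date <= t_max
    pub_ok = t_min <= pred_pub_date <= query_date
    return future_ok and pub_ok

# A prediction published in 2005 about 2010, judged against a 2006 query:
keep = satisfies_time_constraints(date(2010, 1, 1), date(2005, 6, 1),
                                  date(2006, 1, 1),
                                  t_min=date(2002, 1, 1),
                                  t_max=date(2012, 1, 1))
```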


Page 33: Ranking Related News Predictions


Approach

Features

Term Similarity

Capture the term-similarity between q and p.

1. retScore(q, p): Lucene's TF-IDF scoring function
◮ Problem: keyword matching over short texts.
◮ Predictions not containing query terms are not retrieved.

2. bm25f(q, p): field-aware ranking function
◮ Extend a sentence's structure with its surrounding sentences.
◮ Search CONTEXT in addition to TEXT [Blanco et al. 2010].
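A simplified field-aware scorer in the spirit of bm25f can be sketched as follows. This is a rough illustration only, assuming per-field boosts and per-field length normalization combined before the saturation step; it is not Lucene's or the authors' implementation.

```python
import math
from collections import Counter

def bm25f_score(query_terms, fields, boosts, df, n_docs, avg_len,
                k1=1.2, b=0.75):
    """BM25F-style score over a prediction's fields (e.g. TEXT, CONTEXT).

    Term frequencies from each field are scaled by the field boost and a
    per-field length normalization, combined into one pseudo-frequency per
    term, then passed through the usual BM25 saturation with IDF.
    """
    score = 0.0
    for term in query_terms:
        tf = 0.0  # combined, field-boosted pseudo term frequency
        for name, text in fields.items():
            counts = Counter(text.lower().split())
            length = sum(counts.values()) or 1
            norm = 1.0 - b + b * length / avg_len[name]
            tf += boosts.get(name, 1.0) * counts[term.lower()] / norm
        if tf == 0.0:
            continue  # term absent from every field
        d = df.get(term, 0)
        idf = math.log(1.0 + (n_docs - d + 0.5) / (d + 0.5))
        score += idf * tf / (k1 + tf)
    return score
```

A higher boost on TEXT than CONTEXT (as in the parameter-setting slide) makes a match in the prediction sentence itself count more than one in its surrounding context.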


Page 36: Ranking Related News Predictions


Approach

Features

Entity-based Similarity

Measure the similarity between q and p by exploiting the annotated entities in dp, p, and q.

◮ Only applicable for QE and QC.
◮ Features commonly employed in entity ranking tasks.
◮ Time distance captures the relationship between a term and time.

ID / Feature:
1 entitySim(q, p)
2 title(e, dp)
3 titleSim(e, dp)
4 senPos(e, dp)
5 senLen(e, dp)
6 cntSenSubj(e, dp)
7 cntEvent(e, dp)
8 cntFuture(e, dp)
9 cntEventSubj(e, dp)
10 cntFutureSubj(e, dp)
11 timeDistEvent(e, dp)
12 timeDistFuture(e, dp)
13 tagSim(e, dp)
14 isSubj(e, p)
15 timeDist(e, p)

Page 37: Ranking Related News Predictions


Approach

Features

Topic Similarity

Compute the similarity between q and p at a topic level.
◮ Latent Dirichlet allocation [Blei et al. 2003] for modeling topics.

1. Train a topic model
2. Infer topics
3. Compute topic similarity

Page 38: Ranking Related News Predictions


Approach

Features

Topic Similarity

Step 1: Learn a topic model.

◮ Partition DN into sub-collections, called document snapshots Dtrain,tk.
◮ For each Dtrain,tk, randomly select documents for training a topic model.
◮ Output: topic models at different time snapshots, e.g., φtk at tk.

Page 39: Ranking Related News Predictions


Approach

Features

Topic Similarity

Step 2: Infer topics.

◮ Determine topics for q and p using their contents, called topic inference.
◮ Both q and p are represented by a probability distribution over topics.
◮ pφ = p(z1), . . . , p(zn), where p(z) is the probability of a topic z.

Page 40: Ranking Related News Predictions


Approach

Features

Topic Similarity

I. Which model snapshot should be used for inference?

Select a topic model φtk for inference in 2 ways:
◮ tk = time(dq)
◮ tk = time(dp)

II. Which contents should be used for inference?

For a query q, the parent document dq is used. For a prediction p, the contents can be:

◮ Only the text ptxt
◮ Both the text ptxt and the context pctx
◮ The parent document dp

Page 41: Ranking Related News Predictions

Ranking Related News Predictions

Approach

Features

Topic Similarity

I. Which model snapshot should be used for inference?

Select a topic model φtk for inference in 2 ways:◮ tk = time(dq)◮ tk = time(dp)

II. Which contents should be used for inference?

For a query q, the parent document dq is used. For aprediction p, the contents can be:

◮ Only text ptxt◮ Both text ptxt and context pctx◮ Parent document dp

Page 42: Ranking Related News Predictions

Ranking Related News Predictions

Approach

Features

Topic Similarity

I. Which model snapshot should be used for inference?

Select a topic model φtk for inference in 2 ways:◮ tk = time(dq)◮ tk = time(dp)

II. Which contents should be used for inference?

For a query q, the parent document dq is used. For aprediction p, the contents can be:

◮ Only text ptxt◮ Both text ptxt and context pctx◮ Parent document dp

Page 43: Ranking Related News Predictions

Ranking Related News Predictions

Approach

Features

Topic Similarity

I. Which model snapshot should be used for inference?

Select a topic model φtk for inference in 2 ways:◮ tk = time(dq)◮ tk = time(dp)

II. Which contents should be used for inference?

For a query q, the parent document dq is used. For aprediction p, the contents can be:

◮ Only text ptxt◮ Both text ptxt and context pctx◮ Parent document dp

Page 44: Ranking Related News Predictions


Approach

Features

Topic Similarity

Step 3: Measuring topic similarity.

◮ q and p are represented by topic distributions:
◮ qφ = p(z1), . . . , p(zn)
◮ pφ = p(z1), . . . , p(zn)
◮ Compute the topic similarity using cosine similarity:

topicSim(q, p) = (qφ · pφ) / (‖qφ‖ · ‖pφ‖) = Σz∈Z qφ,z pφ,z / (√(Σz∈Z qφ,z²) · √(Σz∈Z pφ,z²))
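The cosine similarity above translates directly into code. A minimal sketch over plain lists of topic probabilities:

```python
import math

def topic_sim(q_phi, p_phi):
    """Cosine similarity between the topic distributions of q and p."""
    dot = sum(q * p for q, p in zip(q_phi, p_phi))
    q_norm = math.sqrt(sum(q * q for q in q_phi))
    p_norm = math.sqrt(sum(p * p for p in p_phi))
    if q_norm == 0.0 or p_norm == 0.0:
        return 0.0  # no inferred topics on one side
    return dot / (q_norm * p_norm)
```

Identical distributions score 1.0; distributions with no shared topic mass score 0.0.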

Page 45: Ranking Related News Predictions


Approach

Features

Temporal Similarity

Hypothesis I. Predictions closer in time to the query are more relevant.

[Timeline: time distance between the query at "now" and each prediction's future date.]

Page 46: Ranking Related News Predictions


Approach

Features

Temporal Similarity

Hypothesis II. Predictions extracted from more recent documents are more relevant.

[Timeline: time distance between the query and each prediction's publication date.]

◮ Timestamp-based Uncertainty (TSU) [Kanhabua and Nørvåg 2010]
◮ FuzzySet (FS) [Kalczynski and Chou 2005]
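Both hypotheses reduce to features that decay with time distance. The sketch below is a generic exponential-decay stand-in, not the exact TSU or FuzzySet formulas; the decay rate of 0.5 matches the parameter-setting slide, while the two-year unit is an assumption.

```python
from datetime import date

def temporal_sim(query_date, pred_date, decay_rate=0.5, unit_days=730):
    """Decay-based temporal similarity: the score shrinks as the time
    distance between the query and the prediction grows (hypothetical
    stand-in for TSU / FuzzySet)."""
    distance = abs((pred_date - query_date).days) / unit_days
    return decay_rate ** distance

# A prediction dated one year out scores higher than one dated ten years out.
near = temporal_sim(date(2000, 1, 1), date(2001, 1, 1))
far = temporal_sim(date(2000, 1, 1), date(2010, 1, 1))
```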


Page 48: Ranking Related News Predictions


Approach

Ranking Method

Learning-to-rank: given an unseen pair (q, p), p is ranked using a model trained over a set of labeled query/prediction pairs.

score(q, p) = Σi=1..N wi × fi

◮ SVMMAP [Yue et al. 2007]
◮ RankSVM [Joachims 2002]
◮ SGD-SVM [Zhang 2004]
◮ PegasosSVM [Shalev-Shwartz et al. 2007]
◮ PA-Perceptron [Crammer et al. 2006]
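The linear scoring function can be sketched as follows, with the weights assumed to come from one of the trained SVM-based learners listed above (function names are illustrative):

```python
def rank_predictions(query, predictions, feature_fns, weights):
    """Score each prediction as score(q, p) = sum_i w_i * f_i(q, p)
    and return the predictions sorted by decreasing score."""
    def score(p):
        return sum(w * f(query, p) for w, f in zip(weights, feature_fns))
    return sorted(predictions, key=score, reverse=True)

# Toy usage: one feature (prediction length), weight 1.0.
ranked = rank_predictions("q", ["a", "bb", "ccc"],
                          [lambda q, p: float(len(p))], [1.0])
```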


Page 50: Ranking Related News Predictions


Evaluation

Experiment Setting

Document collection

NYT Annotated Corpus: 1.8M documents from 1987 to 2007.
◮ More than 25% contain at least one prediction.

The annotation process uses several language processing tools:
◮ OpenNLP for tokenizing, sentence splitting, part-of-speech tagging, and shallow parsing
◮ SuperSense tagger for named entity recognition
◮ TARSQI for extracting temporal expressions

Apache Lucene for indexing and retrieval:
◮ 44,335,519 sentences and 548,491 predictions
◮ 939,455 future dates (avg. 1.7 future dates per prediction)

Page 51: Ranking Related News Predictions


Evaluation

Experiment Setting

Relevance judgments

42 future-related topics

POLITICS: president election, Iraq war
ENVIRONMENT: global warming, energy efficiency
SPACE: Mars, Moon
SCIENCE: earthquake, tsunami
PHYSICS: particle physics, Big Bang
HEALTH: bird flu, influenza
BUSINESS: subprime, financial crisis
SPORT: Olympics, World Cup
TECHNOLOGY: Internet, search engine

Page 52: Ranking Related News Predictions


Evaluation

Experiment Setting

Relevance judgments

Human assessors gave a relevance score Grade(q, p, t):
◮ 4 (very relevant), 3 (relevant), 2 (related), 1 (non-relevant), and 0 (incorrectly tagged date)
◮ relevant if Grade(q, p, t) ≥ 3 and non-relevant if 1 ≤ Grade(q, p, t) ≤ 2

In total, assessors judged 52 queries.
◮ On average, 94 predictions were retrieved per query.
◮ 4,888 query/prediction pairs (approximately 6,032 judged triples).

Available for download at:www.idi.ntnu.no/~nattiya/data/sigir2011/futurepredictions.zip

Page 53: Ranking Related News Predictions


Evaluation

Experiment Setting

Parameter setting

BM25F: b = 0.75, k1 = 1.2 [Robertson et al. 1994]
◮ boost(TEXT) = 5.0
◮ boost(CONTEXT) = 1.0
◮ boost(TITLE) = 2.0

LDA: Stanford Topic Modeling Toolbox
◮ Randomly select 4% of documents in each year for training.
◮ Filter out the 100 most common terms and terms occurring in fewer than 15 documents.
◮ Number of topics Nz is 500.
◮ Collapsed variational Bayes approximation algorithm.

Temporal features:
◮ DecayRate = 0.5, λ = 0.5, µ = 2y
◮ n = 2, m = 2, smin = 4y, smax = 2y
◮ α1 = time(dq) − 4y, α2 = time(dq) + 2y


Page 55: Ranking Related News Predictions


Evaluation

Experimental Results

Methods for comparison

Baseline: QE, QT, QC
◮ Rank using Lucene's default ranking function.

Our approach: Re-QE, Re-QT, Re-QC
◮ Re-rank the baseline results using learning-to-rank.

Metrics: P@1, P@3, MRR
◮ Typically, a user is interested in only a few top predictions.

Page 56: Ranking Related News Predictions


Evaluation

Experimental Results

Selecting top-m entities and top-n terms

Select m and n that give a reasonable improvement on a hold-out set.
◮ Using QE to retrieve predictions, choose m = 11.
◮ Observing the performance of QC when m = 11, choose n = 10.

[Plots: P@10 and MAP when varying the number of top-m entities (m = 1..20) and top-n terms (n = 1..20).]

Page 57: Ranking Related News Predictions


Evaluation

Experimental Results

Compare all other methods against QE

[Bar charts: P@1, P@3, and MRR for QE, QT, QC and Re-QE, Re-QT, Re-QC.]

Results:
◮ QE performs worst among the baselines, while QC is superior to QT.
◮ Re-QC achieves the highest effectiveness, followed by Re-QT.
◮ The re-ranking approach yields improvements in all cases except Re-QE.

Page 58: Ranking Related News Predictions

Ranking Related News Predictions

Evaluation

Experimental Results

Compare all other methods against QE


Analysis:
◮ QE did not retrieve any relevant results in the judged pool, which makes re-ranking difficult.
◮ Entity-based features perform well for some topics.

Page 59: Ranking Related News Predictions


Evaluation

Experimental Results

Feature analysis

Top-5 features with the highest and lowest weights for each query type:

QE, highest: tagSim 1.00; FS1 0.97; TSU2 0.88; LDA1,txt,k 0.87; LDA1,txt,all 0.82
QE, lowest: cntSenSubj 0.01; cntEventSubj 0.01; isInTitle 0.00; cntEventSen 0.00; querySim −0.01

QT, highest: bm25f 1.00; retScore 0.60; LDA1,parent,k 0.55; LDA2,parent,k 0.51; LDA1,parent,all 0.49
QT, lowest: timeDistEvent −0.03; timeDistFuture −0.11; cntEventSen −0.12; cntFutureSen −0.12; senLen −0.16

QC, highest: LDA1,parent,k 1.00; retScore 0.99; LDA1,parent,all 0.96; bm25f 0.93; isSubj 0.87
QC, lowest: cntEventSen −0.02; querySim −0.05; cntFutureSen −0.10; timeDistFuture −0.14; senLen −0.18

◮ Topic-based features play an important role in the re-ranking model.
◮ Although they rely on terms, retScore and bm25f help to re-rank predictions.
◮ The lowest-weighted features mostly come from the entity-based class.


Page 61: Ranking Related News Predictions


Conclusions

Conclusions and Future Work

Conclusions and future work

◮ Defined the task of ranking related news predictions.
◮ Employed learning-to-rank incorporating 4 feature classes.
◮ Conducted extensive experiments and created an evaluation dataset with over 6,000 relevance judgments.
◮ Future work:
  ◮ Combining multiple sources of future-related information (Wikipedia, blogs, homepages, etc.).
  ◮ Sentiment analysis of future-related information.

Page 62: Ranking Related News Predictions


Conclusions

Conclusions and Future Work

Acknowledgment: Thanks to Hugo Zaragoza for his help at the early stages of this work.

Thank you!