Clustering and Exploring Search Results using Timeline Constructions

Intelligent Database Systems Lab, N.Y.U.S.T., I.M. Clustering and Exploring Search Results using Timeline Constructions. Presenter: Tsai Tzung Ruei. Authors: Omar Alonso, Michael Gertz, Ricardo Baeza-Yates. CIKM 2009. National Yunlin University of Science and Technology.


Transcript of Clustering and Exploring Search Results using Timeline Constructions

Page 1: Clustering and Exploring Search Results using Timeline Constructions

Intelligent Database Systems Lab

N.Y.U.S.T.

I. M.

Clustering and Exploring Search Results using Timeline Constructions

Presenter: Tsai Tzung Ruei Authors: Omar Alonso, Michael Gertz, Ricardo Baeza-Yates

CIKM 2009

國立雲林科技大學 (National Yunlin University of Science and Technology)

Page 2: Clustering and Exploring Search Results using Timeline Constructions

Outline

Motivation
Objective
Time annotated document model
Methodology
Experiments
Conclusion
Comments

Page 3: Clustering and Exploring Search Results using Timeline Constructions

Motivation

None of the current search engines exploits the temporal information embedded in documents.

1. Do you think current timelines for organizing or clustering search results (such as in Google’s timeline) are useful for some of your daily search activities?

2. Do you use (or would use) timelines to explore search results?

3. Please indicate some search scenarios where you use timelines or would like to use timelines to organize search results.

4. Please give some examples of search scenarios where current search engines do not sufficiently support the concept of timelines to organize and explore search results.

5. What other features would you like to see in the context of timelines?

Timeline

Page 4: Clustering and Exploring Search Results using Timeline Constructions

Objective

To present an add-on to traditional information retrieval applications in which we exploit various temporal information associated with documents to present and cluster documents along timelines.


Page 5: Clustering and Exploring Search Results using Timeline Constructions

TIME ANNOTATED DOCUMENT MODEL

Time and Timelines
Temporal Expressions
Temporal Document Profiles

Our base timeline, denoted Td, is an interval of consecutive day chronons. Example: "March 12, 2002; March 13, 2002; March 14, 2002".

Explicit temporal expressions. Example: "December 2004".
Implicit temporal expressions. Example: "Valentine's Day 2006".
Relative temporal expressions. Example: "today".

[Figure labels: explicit, implicit, relative, timestamps]
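The three kinds of temporal expressions can be illustrated by mapping each one onto an interval of day chronons on the base timeline. The following is a minimal sketch under stated assumptions, not the tagger used in the paper: the function names, the `IMPLICIT_DAYS` lookup table, and the relative offsets are all hypothetical, and relative expressions are assumed to be resolved against the document timestamp.

```python
import calendar
from datetime import date, timedelta

# Hypothetical lookup for implicit expressions: name -> (month, day).
IMPLICIT_DAYS = {"valentine's day": (2, 14)}

MONTHS = {name: i + 1 for i, name in enumerate([
    "january", "february", "march", "april", "may", "june", "july",
    "august", "september", "october", "november", "december"])}


def explicit_month(expr):
    """Map an explicit expression such as 'December 2004' to (first day, last day)."""
    month_name, year = expr.split()
    year, month = int(year), MONTHS[month_name.lower()]
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 1), date(year, month, last_day)


def implicit_day(expr):
    """Map an implicit expression such as "Valentine's Day 2006" to a one-day interval."""
    name, year = expr.rsplit(" ", 1)
    month, day = IMPLICIT_DAYS[name.lower()]
    d = date(int(year), month, day)
    return d, d


def relative_day(expr, timestamp):
    """Resolve a relative expression such as 'today' against the document timestamp."""
    offsets = {"yesterday": -1, "today": 0, "tomorrow": 1}  # assumed vocabulary
    d = timestamp + timedelta(days=offsets[expr.lower()])
    return d, d


def chronons(interval):
    """Expand an interval into its consecutive day chronons."""
    start, end = interval
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


days = chronons(explicit_month("December 2004"))
```

Representing every expression as a (start, end) pair of days keeps all three kinds comparable on the same timeline, whatever their original granularity.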

Page 6: Clustering and Exploring Search Results using Timeline Constructions

Methodology

PROTOTYPE: Process Overview

[Figure: process overview. Components shown: corpora, Alembic (POS tagger), GUTime temporal tagger, Oracle, XML document (tdp)]

Page 7: Clustering and Exploring Search Results using Timeline Constructions

Methodology

TCluster

Constructing a Time Outline for the documents in the hit list Lq
Document Clustering
Ranking Documents in a Cluster

Input: a hit list Lq = [d1, d2, ..., dk] of k documents
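The idea of clustering a hit list along a timeline can be sketched as follows. This is a simplified illustration, not the TCluster algorithm itself: the slides do not give its details, so the bucketing by granularity, the `cluster_by_time` function, the sample hit list, and the mention-count ranking are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date


def bucket(d, granularity):
    """Truncate a day chronon to the chosen time granularity."""
    if granularity == "year":
        return (d.year,)
    if granularity == "month":
        return (d.year, d.month)
    if granularity == "day":
        return (d.year, d.month, d.day)
    raise ValueError(f"unknown granularity: {granularity}")


def cluster_by_time(hit_list, granularity="year"):
    """Group documents by the time buckets their temporal expressions fall in.

    hit_list: list of (doc_id, [date, ...]) pairs.
    Returns a dict in chronological order: bucket -> [(doc_id, mentions), ...].
    A document can appear in several clusters.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for doc_id, dates in hit_list:
        for d in dates:
            counts[bucket(d, granularity)][doc_id] += 1

    clusters = {}
    for key in sorted(counts):  # the sorted keys form the time outline
        # Assumed ranking heuristic: more mentions in the bucket ranks higher.
        clusters[key] = sorted(counts[key].items(), key=lambda kv: (-kv[1], kv[0]))
    return clusters


# Hypothetical hit list Lq with already-resolved temporal expressions.
Lq = [
    ("d1", [date(2006, 6, 9), date(2006, 7, 9)]),
    ("d2", [date(2002, 5, 31), date(2006, 6, 9)]),
    ("d3", [date(2002, 6, 30)]),
]

by_year = cluster_by_time(Lq, "year")
by_month = cluster_by_time(Lq, "month")
```

Switching the granularity from year to month splits the same hit list into finer clusters, which is the kind of drill-down a timeline view is meant to support.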

Page 8: Clustering and Exploring Search Results using Timeline Constructions

Experiments

DMOZ: a multilingual open content directory

2010, 2006, 2002, 1998 and 1994

document clusters

World Cup documents: pre-defined categories (5) < TCluster (21)

Each World Cup document has a single event as the main theme.

Result: documents are well classified by users in terms of the actual event.

Page 9: Clustering and Exploring Search Results using Timeline Constructions

Experiments

The TimeBank 1.2 corpus

It contains news articles that have been annotated using TimeML with temporal expressions related to events, times and temporal links between events and times.

Result: a 50% increase in the number of clusters discovered by TCluster.

Page 10: Clustering and Exploring Search Results using Timeline Constructions

Experiments

Relevance evaluation using AMT (Amazon Mechanical Turk), a crowdsourcing platform

Result: the average response was 4.04 (with an 80% agreement level).

Page 11: Clustering and Exploring Search Results using Timeline Constructions

Conclusion

MAJOR CONTRIBUTION

The TCluster algorithm provides great flexibility and allows users to explore clusters of search result documents that are organized along well-defined timelines, supporting different levels of time granularity.

The work shows the utility of time-based clustering over existing approaches that cluster documents based only on document timestamps.

FUTURE WORK

The authors want to study the weighting of relative temporal expressions as well as different sentence distance functions for determining the rank of documents in a cluster.

Page 12: Clustering and Exploring Search Results using Timeline Constructions

Comment

Advantage: provides a new method of time-based searching

Drawback: some mistakes

Application: information retrieval, clustering