Searching Questions by Identifying Question Topic and Question Focus

Rick Liu 1

Huizhong Duan, Yunbo Cao, Chin-Yew Lin and Yong Yu
Shanghai Jiao Tong University & MSRA

ACL 2008

Searching Questions by Identifying Question Topic and Question Focus

2008/7/9

Introduction

Questions and their answers
▪ Very large archives
▪ Built up by online services

Examples
▪ Traditional FAQ services
▪ Community-based Q&A services (emerging): Yahoo! Answers, Live QnA, Baidu Zhidao

Motivation

Question search
▪ Helps users search previous answers

Queried question:
▪ Any cool clubs in Berlin or Hamburg?

Candidate questions:
▪ What are the best/most fun clubs in Berlin?
▪ Any nice hotels in Berlin or Hamburg?
▪ How long does it take to Hamburg from Berlin?
▪ Cheap hotels in Berlin?

Motivation

Any cool clubs in Berlin or Hamburg?

[Figure: the question annotated with its Question Topic and Question Focus]

Approach

Identifying question topic & focus
▪ Question tree
▪ Determining the tree cut

Modeling question topic & focus for search
▪ Language model

Question Tree

Topic terms: BaseNP, WH-ngram

Topic profile: probability distribution of categories

Specificity: inverse of the entropy of the topic profile

Topic chain: topic terms ordered by specificity value (desc)

Topic tree
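The definitions above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the category counts are invented, the function names are hypothetical, and the small constant `EPSILON` is an assumption added so that a zero-entropy profile does not cause a division by zero.

```python
import math

EPSILON = 0.001  # assumed smoothing constant; avoids division by zero


def topic_profile(category_counts):
    """Turn raw per-category counts for one term into a probability distribution."""
    total = sum(category_counts.values())
    return {c: n / total for c, n in category_counts.items() if n > 0}


def specificity(profile, eps=EPSILON):
    """Inverse of the entropy of the topic profile (plus the assumed eps)."""
    entropy = -sum(p * math.log(p) for p in profile.values())
    return 1.0 / (entropy + eps)


def topic_chain(terms, counts_by_term):
    """Order topic terms by specificity, most specific first."""
    return sorted(
        terms,
        key=lambda t: specificity(topic_profile(counts_by_term[t])),
        reverse=True,
    )


# Hypothetical counts: "berlin" occurs in one category, "clubs" in two.
counts_by_term = {
    "berlin": {"Germany": 10},
    "clubs": {"Germany": 5, "France": 5},
}
chain = topic_chain(["clubs", "berlin"], counts_by_term)
```

A term concentrated in a single category has entropy 0 and therefore the highest specificity, so it moves to the front of the chain.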

Question Tree Example
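The slide's figure is not preserved in this transcript. As a stand-in, here is a minimal sketch that assumes the question tree is a prefix tree (trie) built over topic chains, which the slides do not spell out. The chains and the function name `build_question_tree` are hypothetical.

```python
def build_question_tree(chains):
    """Build a nested-dict prefix tree; chains sharing a prefix share a path."""
    root = {}
    for chain in chains:
        node = root
        for term in chain:
            node = node.setdefault(term, {})
    return root


# Hypothetical topic chains, each already ordered by descending specificity.
chains = [
    ["hamburg", "hotels"],
    ["hamburg", "clubs"],
    ["berlin", "clubs"],
]
tree = build_question_tree(chains)
```

Here the two chains starting with "hamburg" share one branch, so the more specific terms sit near the root and the less specific ones near the leaves.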

Tree Cut Model

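Model: $M = (\Gamma, \theta)$
▪ $\Gamma = [C_1, C_2, \ldots, C_k]$: the tree cut
▪ $\theta = [P(C_1), P(C_2), \ldots, P(C_k)]$: the probability parameter vector, with $\sum_{i=1}^{k} P(C_i) = 1$

A cut is any set of nodes in the tree that defines a partition of all the nodes.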

Tree Cut Model Example

[n0, n11], [n12, n21, n22, n23], [n13, n24]

[n11, n21, n22, n23, n24]

MDL-based Tree Cut Model

Minimum Description Length

Ref: Li and Abe, 1998
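The slide's formula is not preserved, so the following is only a sketch of how an MDL criterion can compare candidate cuts. It assumes the description length is a parameter cost of (number of free parameters / 2) × log |S| plus a data cost of the negative log-likelihood, with each class's probability spread uniformly over its nodes and k − 1 free parameters for k classes. The function name and the toy frequencies are hypothetical.

```python
import math


def description_length(classes, freq):
    """Description length (in bits) of a sample under one candidate tree cut.

    classes: list of lists of node names, one list per class of the cut.
    freq:    dict mapping node name -> observed count in the sample S.
    """
    total = sum(freq.get(n, 0) for c in classes for n in c)  # |S|
    k_free = len(classes) - 1                                # assumed free parameters
    param_cost = 0.5 * k_free * math.log2(total)
    data_cost = 0.0
    for c in classes:
        f_c = sum(freq.get(n, 0) for n in c)
        if f_c == 0:
            continue  # a class with no observations contributes nothing
        p_node = (f_c / total) / len(c)  # P(C) shared uniformly by its nodes
        data_cost += -f_c * math.log2(p_node)
    return param_cost + data_cost


# Toy sample: all observations fall on nodes "a" and "b".
freq = {"a": 4, "b": 4, "c": 0, "d": 0}
coarse = description_length([["a", "b", "c", "d"]], freq)
fine = description_length([["a", "b"], ["c", "d"]], freq)
```

Under these assumptions the cut with the smaller total would be preferred. In the toy sample the finer cut pays for one extra parameter but describes the data more compactly.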

Determining the Tree Cut

[Figure: the tree cut splits a topic chain into a HEAD and a TAIL]

Modeling for Search

$P(q \mid \tilde{q})$
▪ $q$: queried question
▪ $\tilde{q}$: targeted question
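The slide gives only the quantity being modeled. The sketch below shows one way such a score could combine topic and focus, assuming a linear interpolation weighted by λ between a topic language model and a focus language model. The add-one smoothing, the function names, and the toy term lists are all assumptions made for illustration, not details from the slides.

```python
import math
from collections import Counter


def log_likelihood(query_terms, target_terms, vocab_size, alpha=1.0):
    """Log-probability of query terms under an add-alpha smoothed unigram model."""
    counts = Counter(target_terms)
    denom = len(target_terms) + alpha * vocab_size
    return sum(math.log((counts[t] + alpha) / denom) for t in query_terms)


def score(q_topic, q_focus, t_topic, t_focus, vocab_size, lam=0.7):
    """Assumed form: lam * P(topic part) + (1 - lam) * P(focus part)."""
    p_topic = math.exp(log_likelihood(q_topic, t_topic, vocab_size))
    p_focus = math.exp(log_likelihood(q_focus, t_focus, vocab_size))
    return lam * p_topic + (1.0 - lam) * p_focus


# Hypothetical split of questions into topic and focus terms.
same_focus = score(["berlin"], ["clubs"], ["berlin"], ["clubs"], vocab_size=10)
other_focus = score(["berlin"], ["clubs"], ["berlin"], ["hotels"], vocab_size=10)
```

With the same topic, the target that also shares the focus term scores higher, and raising `lam` narrows that gap because more weight goes to the topic match.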

Experimental Data

Yahoo! Answers
▪ Resolved questions
▪ travel: 314,616 items
▪ computers & internet: 210,785 items

Three fields
▪ title (the only field used)
▪ description
▪ answers

Ground Truth

Employed the Vector Space Model to retrieve candidate results
▪ Manual judgments: relevant / irrelevant

Baselines: VSM, LMIR

Evaluation: MAP, R-precision, MRR
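For reference, the three evaluation measures can be computed per query as follows and then averaged over all queries (MAP is the mean of average precision, MRR the mean of reciprocal rank). This is a minimal sketch using the standard definitions; the function names and the toy ranking are illustrative.

```python
def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def r_precision(ranked, relevant):
    """Precision within the top R results, where R is the number of relevant items."""
    r = len(relevant)
    if r == 0:
        return 0.0
    return sum(1 for item in ranked[:r] if item in relevant) / r


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0 if none is retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean(values):
    """Average a per-query metric over all queries."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Toy query: four ranked results, two of which were judged relevant.
ranked = ["a", "b", "c", "d"]
relevant = {"b", "d"}
ap = average_precision(ranked, relevant)
rp = r_precision(ranked, relevant)
rr = reciprocal_rank(ranked, relevant)
```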

Results for ‘travel’

Results for ‘computers & internet’

About the λ

More emphasis is placed on the question topic

Error Analysis ( travel )

Examined the correctness of the identified question topics and question foci

200 queried questions => 69 questions incorrect
▪ (a) Only the head part is present (59)
▪ (b) Incorrect order (10)

(a) explains why λ is 0.7

Related Work

FAQ data

Community-based data

Jeon et al., 2005: compared four different retrieval methods
▪ Vector space model
▪ Okapi
▪ Language model
▪ Translation-based model

The translation-based model performed the best

Translation Model

Lexical chasm
▪ Where to stay in Hamburg?
▪ The best hotel in Hamburg?

IBM Model 1
▪ Uses question titles and question descriptions as the parallel corpus
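To make the IBM Model 1 step concrete, here is a minimal sketch of its EM training loop on a toy parallel corpus. It is the textbook algorithm without the NULL word, not the setup used in the experiments. The sentence pairs are invented, and treating the first element of each pair as the source side is an assumption.

```python
from collections import defaultdict


def train_ibm1(pairs, iterations=10):
    """Estimate word translation probabilities t[(target, source)] = P(target | source).

    pairs: list of (source_tokens, target_tokens) sentence pairs.
    """
    tgt_vocab = {w for _, tgt in pairs for w in tgt}
    t = defaultdict(lambda: 1.0 / len(tgt_vocab))  # uniform initialization
    for _ in range(iterations):
        count = defaultdict(float)  # expected (target, source) alignment counts
        total = defaultdict(float)  # expected counts per source word
        # E-step: distribute each target word over the source words of its pair.
        for src, tgt in pairs:
            for w in tgt:
                norm = sum(t[(w, s)] for s in src)
                for s in src:
                    frac = t[(w, s)] / norm
                    count[(w, s)] += frac
                    total[s] += frac
        # M-step: renormalize expected counts into probabilities.
        for (w, s), c in count.items():
            t[(w, s)] = c / total[s]
    return t


# Hypothetical pairs standing in for (title, description) word sequences.
pairs = [
    (["hotel", "hamburg"], ["stay", "hamburg"]),
    (["hotel", "berlin"], ["stay", "berlin"]),
]
t = train_ibm1(pairs)
```

Because "hotel" co-occurs with "stay" in both pairs, the model learns to associate the two even though they never match lexically, which is how a translation model bridges the lexical chasm.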

Results

Conclusions and Future Work

1) Data structure
2) Use of the MDL-based tree cut model to identify question topic and focus
3) A new form of language modeling for question search
4) Extensive experiments

Currently only community-based data
▪ Next: questions from forum sites / FAQ sites

Thanks

Modeling for Search

Translation Probability
