Incorporating Non-Relevance Information in the Estimation of Query Models
Incorporating Non-Relevance Information in the Estimation of Query Models
Chen, Yi-wen
Dept. of Computer Science & Information Engineering
National Taiwan Normal University
2012/5/15
2008 TREC
Edgar Meij, Wouter Weerkamp, Jiyin He, Maarten de Rijke
2
Introduction
• We introduce a new model which incorporates information from relevant and non-relevant documents to improve the estimation of query models.
3
Retrieval Framework
$$\log P(D|Q) \propto \log P(D) + \sum_{t \in Q} P(t|Q) \cdot \log P(t|D) \qquad (1)$$

$$P(t|D) = (1 - \lambda_D) \cdot P(t|\theta_D) + \lambda_D \cdot P(t) \qquad (2)$$

$$P(t) = P(t|C) = \frac{\sum_D n(t,D)}{|C|} \qquad (3)$$

$$P(t|\theta_D) = \frac{n(t,D)}{\sum_{t'} n(t',D)}$$

$$P(t|Q) = \lambda_Q \cdot P(t|\hat{\theta}_Q) + (1 - \lambda_Q) \cdot P(t|\tilde{Q}) \qquad (4)$$

$$P(t|\hat{\theta}_Q) \propto \sum_{D \in R} P(t|D) \, P(D|R)$$

$$P(D|R) = \frac{\mathrm{NLLR}(D|R)}{\sum_{D'} \mathrm{NLLR}(D'|R)} \qquad (6)$$

Alternatively, $P(D|R)$ can be set to a uniform distribution.
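As a minimal sketch of this retrieval framework, the scoring of Eqs. 1–3 can be written in Python. The function name, the smoothing value $\lambda_D = 0.5$, and the uniform document prior (which drops the $\log P(D)$ term from the ranking) are illustrative assumptions, not part of the slides; query terms are assumed to occur somewhere in the collection.

```python
import math
from collections import Counter

def ql_score(query_probs, doc, collection, lam=0.5):
    """Query-likelihood score, Eqs. 1-3 (uniform document prior assumed,
    so the log P(D) term is constant and omitted from the ranking).

    query_probs: dict term -> P(t|Q)
    doc: list of tokens; collection: list of such token lists.
    """
    doc_tf = Counter(doc)                          # n(t, D)
    doc_len = sum(doc_tf.values())
    coll_tf = Counter(t for d in collection for t in d)
    coll_len = sum(coll_tf.values())               # |C|
    score = 0.0
    for t, p_tq in query_probs.items():
        p_ml = doc_tf[t] / doc_len                 # P(t|theta_D), ML estimate
        p_c = coll_tf[t] / coll_len                # P(t) = P(t|C), Eq. 3
        p_td = (1 - lam) * p_ml + lam * p_c        # Eq. 2, Jelinek-Mercer smoothing
        score += p_tq * math.log(p_td)             # Eq. 1
    return score
```

On a toy collection, a document matching the query terms more heavily receives a higher log-likelihood score, as Eq. 1 intends.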
4
Modeling Non-Relevance
Kraaij (2004) defines the NLLR (Normalized Log Likelihood Ratio) measure as being equivalent to determining the negative KL-divergence for document retrieval.
$$\mathrm{NLLR}(Q|D) = H(Q, C) - H(Q, D) \qquad (5)$$

where $H(X, Y) = -\sum_t P(t|\theta_X) \log P(t|\theta_Y)$ is the cross-entropy between two multinomial language models.
Eq. 5 can be interpreted as the relationship between two language models Q and D, normalized by a third language model C.
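Under this definition, Eq. 5 can be sketched directly as a difference of cross-entropies. The function names are my own; each language model is a plain dict of term probabilities, and the document model is assumed to be smoothed so that no query term has zero probability.

```python
import math

def cross_entropy(p_x, p_y):
    """H(X, Y) = -sum_t P(t|theta_X) * log P(t|theta_Y)."""
    return -sum(p * math.log(p_y[t]) for t, p in p_x.items() if p > 0)

def nllr(p_q, p_d, p_c):
    """Eq. 5: NLLR(Q|D) = H(Q, C) - H(Q, D).

    Positive when D explains the query model better than the
    collection C does, negative otherwise.
    """
    return cross_entropy(p_q, p_c) - cross_entropy(p_q, p_d)
```

Note that the difference of cross-entropies collapses to $\sum_t P(t|\theta_Q) \log \frac{P(t|\theta_D)}{P(t|\theta_C)}$, which makes the "normalized by a third language model C" reading explicit.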
5
Modeling Non-Relevance
$$P(D|R) = \frac{\mathrm{NLLR}(D|R)}{\sum_{D'} \mathrm{NLLR}(D'|R)} \qquad (6)$$

where $\mathrm{NLLR}(D|R)$ is computed using Eq. 5, with the non-relevance model $\bar{R}$, estimated from the judged non-relevant documents, in place of the collection model:

$$P(t|\bar{R}) = \frac{\sum_{D \in \bar{R}} P(t|D)}{|\bar{R}|}$$
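The document weighting of Eq. 6, combined with the query-model expansion $P(t|\hat{\theta}_Q) \propto \sum_{D \in R} P(t|D)\,P(D|R)$ from the retrieval framework, can be sketched as follows. This is a minimal illustration with my own function names; the NLLR scores are assumed positive here (the slides do not specify how negative scores are handled).

```python
def query_model(doc_models, nllr_scores):
    """Expanded query model: P(t|theta_hat) ∝ sum_{D in R} P(t|D) * P(D|R),
    with P(D|R) = NLLR(D|R) / sum_{D'} NLLR(D'|R)   (Eq. 6).

    doc_models: dict doc_id -> {term: P(t|D)}
    nllr_scores: dict doc_id -> NLLR(D|R), assumed positive.
    """
    total = sum(nllr_scores.values())
    p_d_r = {d: s / total for d, s in nllr_scores.items()}   # Eq. 6
    qm = {}
    for d, model in doc_models.items():
        for t, p in model.items():
            qm[t] = qm.get(t, 0.0) + p * p_d_r[d]
    norm = sum(qm.values())
    return {t: p / norm for t, p in qm.items()}              # normalize to sum to 1
```

With this weighting, documents that deviate more strongly from the non-relevance model contribute more mass to the expanded query model.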
6
Runs
(A) no relevance judgements
(B) 3 relevant documents
(C) 3 relevant and 3 non-relevant documents
(D) 10 judged documents (division unknown)
(E) large set of judgements (division and number unknown)
We compare our model using explicit non-relevance information to the same model using the collection as a non-relevance model, by submitting two distinct runs:
• met6, using the set of non-relevant documents
• met9, using only the collection (viz. Eq. 7)
Preprocessing:
• standard stopword removal
• stemming using a Porter stemmer
Due to a small bug in the implementation, however, a parameter was not properly normalized, causing a slight difference in the retrieval results for met6 on set B.
7
Runs
Baseline
8
Runs
When we zoom out and look at the full range of available topics (Table 2), we observe that both models improve statMAP over the baseline (set A) for the full set of topics. When the feedback set is small, met9 improves statMAP more effectively than met6, i.e., the background model performs better than the non-relevant documents.
The difference does seem to suggest that the amount of non-relevance information needs to reach a certain size to outperform the estimation on the collection.
9
Runs
Figure 1 shows a per-topic breakdown of the difference in MAP between the two submitted runs. We observe that most topics are helped more using the collection-based estimates. We have to conclude that, for the TREC Terabyte topics, the estimation on the collection yields the highest retrieval performance and is thus a better estimate of non-relevance than the judged non-relevant documents.
10
Runs
Given a limited number of non-relevant documents (sets C and D), our model especially improves early precision, although not significantly. A larger number of non-relevant documents (set E) decreases overall retrieval effectiveness. From Figure 2a we observe that set E only outperforms the other sets at the very ends of the graph.
11
Conclusion
• For met6, a larger number of judged non-relevant documents improves retrieval effectiveness the most.
• The collection is a viable and sufficient alternative (met9).
12
Future Work
• First, we intend to investigate the impact of the available judged (non-)relevant documents and their properties with respect to the estimates on the collection.
• Second, given the relevance assessments, we will try to find better ways of estimating the true distribution of the (non-)relevant information within our framework.