Information Retrieval Quality of a Search Engine.
Information Retrieval
Quality of a Search Engine
Is it good?
How fast does it index: number of documents per hour (for a given average document size)
How fast does it search: latency as a function of index size
Expressiveness of the query language
Measures for a search engine
All of the preceding criteria are measurable.
The key measure is user happiness: useless answers won't make a user happy.
Happiness: elusive to measure
The commonest proxy is the relevance of the search results. How do we measure it?
Measuring relevance requires three elements:
1. A benchmark document collection
2. A benchmark suite of queries
3. A binary assessment of Relevant or Irrelevant for each query-doc pair
Evaluating an IR system
Standard benchmarks:
TREC: the National Institute of Standards and Technology (NIST) has run a large IR test bed for many years
Other document collections: marked by human experts, who judge Relevant or Irrelevant for each query and each doc
On the Web everything is more complicated, since we cannot mark the entire corpus!
General scenario
[Venn diagram: the Retrieved set and the Relevant set within the whole collection]
Precision vs. Recall
[Venn diagram: the Retrieved set and the Relevant set within the collection]
Precision: % of retrieved docs that are relevant [the issue: how much "junk" is found]
Recall: % of relevant docs that are retrieved [the issue: how much of the "info" is found]
How to compute them
Precision: fraction of retrieved docs that are relevant, P = tp / (tp + fp)
Recall: fraction of relevant docs that are retrieved, R = tp / (tp + fn)

|               | Relevant            | Not Relevant        |
|---------------|---------------------|---------------------|
| Retrieved     | tp (true positive)  | fp (false positive) |
| Not Retrieved | fn (false negative) | tn (true negative)  |
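As a minimal sketch of these two formulas in Python (the function names and the example counts are hypothetical, not from the slides):

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of retrieved docs that are relevant: tp / (tp + fp)."""
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of relevant docs that are retrieved: tp / (tp + fn)."""
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0


# Hypothetical counts: 30 docs retrieved, 20 of them relevant;
# 40 relevant docs exist in the whole collection.
tp, fp, fn = 20, 10, 20
print(precision(tp, fp))  # 20/30 ≈ 0.667
print(recall(tp, fn))     # 20/40 = 0.5
```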
Some considerations
You can get high recall (but low precision) by retrieving all docs for all queries!
Recall is a non-decreasing function of the number of docs retrieved.
Precision usually decreases as more docs are retrieved.
Precision vs. Recall: four extreme cases
[Four Venn diagrams of the Relevant and Retrieved sets, illustrating in turn:]
1. Highest precision, very low recall
2. Lowest precision and recall
3. Low precision and very high recall
4. Very high precision and recall
Precision-Recall curve
We measure Precision at various levels of Recall. Note: it is an AVERAGE over many queries.
[Plot: precision (y-axis) vs. recall (x-axis), with the measured points marked]
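As a sketch of where such points come from for a single query, assuming a ranked result list and a set of known relevant doc ids (both hypothetical; averaging over many queries is done afterwards):

```python
def pr_points(ranked_ids, relevant_ids):
    """(recall, precision) after each rank position of one ranked result list."""
    relevant_ids = set(relevant_ids)
    points, tp = [], 0
    for k, doc in enumerate(ranked_ids, start=1):
        if doc in relevant_ids:
            tp += 1
        points.append((tp / len(relevant_ids), tp / k))
    return points


# Hypothetical single-query run: docs 1, 3, 5 are the relevant ones.
for r, p in pr_points([1, 2, 3, 4, 5], {1, 3, 5}):
    print(f"recall={r:.2f}  precision={p:.2f}")
```

The output shows recall never decreasing down the ranking while precision drifts downward, matching the considerations above.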
A common picture
[Plot: precision (y-axis) vs. recall (x-axis) for a typical system; precision tends to fall as recall grows]
Interpolated precision
If you can increase precision by increasing recall, then you should get to count that: the interpolated precision at recall level r is the maximum precision measured at any recall level ≥ r.
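A minimal sketch of that rule, reusing the (recall, precision) points produced by the pr_points helper above (the function name is mine):

```python
def interpolated_precision(points, r):
    """Maximum precision over all measured points whose recall is >= r."""
    candidates = [p for rec, p in points if rec >= r]
    return max(candidates) if candidates else 0.0
```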
Other measures
Precision at fixed retrieval level: most appropriate for web search, e.g., precision in the top 10 results.
11-point interpolated average precision: the standard measure for TREC. You take the interpolated precision at 11 recall levels, varying from 0% to 100% in steps of 10%, and average them.
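A sketch of the 11-point average for one query, assuming the interpolated_precision helper from the previous snippet; TREC then averages this value over all benchmark queries:

```python
def eleven_point_avg_precision(points):
    """Average interpolated precision at recall 0.0, 0.1, ..., 1.0."""
    levels = [i / 10 for i in range(11)]
    return sum(interpolated_precision(points, r) for r in levels) / len(levels)
```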
F measure
Combined measure (weighted harmonic mean):

1/F = α·(1/P) + (1 − α)·(1/R), i.e., F = (β² + 1)·P·R / (β²·P + R) with β² = (1 − α)/α

People usually use the balanced F1 measure, i.e., with β = 1 (equivalently α = ½), thus 1/F = ½·(1/P + 1/R).
Use this if you need to optimize a single measure that balances precision and recall.
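A minimal sketch of the formula (the function name is mine; β defaults to the balanced F1 case):

```python
def f_measure(p: float, r: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall."""
    if p == 0.0 and r == 0.0:
        return 0.0
    return (beta**2 + 1) * p * r / (beta**2 * p + r)


print(f_measure(2/3, 0.5))          # balanced F1 ≈ 0.571
print(f_measure(2/3, 0.5, beta=2))  # beta > 1 weights recall more heavily
```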