Page 1: Information Retrieval Quality of a Search Engine.

Information Retrieval

Quality of a Search Engine

Page 2: Information Retrieval Quality of a Search Engine.

Is it good?

- How fast does it index? Number of documents/hour (average document size)
- How fast does it search? Latency as a function of index size
- Expressiveness of the query language

Page 3: Information Retrieval Quality of a Search Engine.

Measures for a search engine

All of the preceding criteria are measurable. The key measure, however, is user happiness: useless answers won't make a user happy.

Page 4: Information Retrieval Quality of a Search Engine.

Happiness is elusive to measure. The most common approach is to measure the relevance of the search results. How do we measure it?

Measuring relevance requires three elements:
1. A benchmark document collection
2. A benchmark suite of queries
3. A binary assessment, Relevant or Irrelevant, for each query-doc pair
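
To make this concrete, here is a minimal sketch in Python of the three elements; the document and query identifiers are hypothetical, chosen purely for illustration:

# 1. A benchmark document collection (doc id -> text).
documents = {"d1": "...", "d2": "...", "d3": "..."}
# 2. A benchmark suite of queries (query id -> query text).
queries = {"q1": "search engine quality"}
# 3. A binary Relevant/Irrelevant assessment for each query-doc pair.
judgments = {("q1", "d1"): True, ("q1", "d2"): False, ("q1", "d3"): True}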

Page 5: Information Retrieval Quality of a Search Engine.

Evaluating an IR system: standard benchmarks

- TREC: the National Institute of Standards and Technology (NIST) has run a large IR testbed for many years.
- Other document collections have been marked by human experts: for each query and for each doc, Relevant or Irrelevant.
- On the Web everything is more complicated, since we cannot mark the entire corpus!

Page 6: Information Retrieval Quality of a Search Engine.

General scenario

[Venn diagram: the set of Retrieved docs and the set of Relevant docs overlap within the whole collection]

Page 7: Information Retrieval Quality of a Search Engine.

Precision vs. Recall

[Venn diagram: Retrieved and Relevant sets within the collection]

- Precision: % of retrieved docs that are relevant [the issue: how much junk is found]
- Recall: % of relevant docs that are retrieved [the issue: how much of the info is found]

Page 8: Information Retrieval Quality of a Search Engine.

How to compute them
Precision: fraction of retrieved docs that are relevant.
Recall: fraction of relevant docs that are retrieved.

Precision P = tp / (tp + fp)
Recall    R = tp / (tp + fn)

                 Relevant              Not Relevant
Retrieved        tp (true positive)    fp (false positive)
Not Retrieved    fn (false negative)   tn (true negative)
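
A direct Python sketch of the two formulas above (the function names are ours; the guards against empty denominators are an assumption about how edge cases should be handled):

def precision(tp, fp):
    # P = tp / (tp + fp): fraction of retrieved docs that are relevant.
    return tp / (tp + fp) if tp + fp > 0 else 0.0

def recall(tp, fn):
    # R = tp / (tp + fn): fraction of relevant docs that are retrieved.
    return tp / (tp + fn) if tp + fn > 0 else 0.0

# Example: 30 docs retrieved, 20 of them relevant; 40 relevant docs exist.
print(precision(tp=20, fp=10))  # 0.667
print(recall(tp=20, fn=20))     # 0.5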

Page 9: Information Retrieval Quality of a Search Engine.

Some considerations

- You can get high recall (but low precision) by retrieving all docs for all queries!
- Recall is a non-decreasing function of the number of docs retrieved.
- Precision usually decreases as the number of retrieved docs grows.

Page 10: Information Retrieval Quality of a Search Engine.

Precision vs. Recall

[Venn diagram of Retrieved vs. Relevant: highest precision, very low recall]

Page 11: Information Retrieval Quality of a Search Engine.

Precision vs. Recall

[Venn diagram of Retrieved vs. Relevant: lowest precision and recall]

Page 12: Information Retrieval Quality of a Search Engine.

Precision vs. Recall

[Venn diagram of Retrieved vs. Relevant: low precision and very high recall]

Page 13: Information Retrieval Quality of a Search Engine.

Precision vs. Recall

[Venn diagram of Retrieved vs. Relevant: very high precision and recall]

Page 14: Information Retrieval Quality of a Search Engine.

Precision-Recall curve
We measure Precision at various levels of Recall. Note: the curve is an AVERAGE over many queries.

[Plot: precision (y-axis) vs. recall (x-axis), with measured points marked x]
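
For a single query, the measured points can be obtained by sweeping down the ranked result list and recording (recall, precision) each time another relevant doc is found; averaging such per-query curves gives the plot above. A sketch with hypothetical data:

def pr_points(ranked_docs, relevant):
    # One (recall, precision) point per relevant doc retrieved.
    points, tp = [], 0
    for rank, doc in enumerate(ranked_docs, start=1):
        if doc in relevant:
            tp += 1
            points.append((tp / len(relevant), tp / rank))
    return points

# Hypothetical ranking of six docs, three of which are relevant.
print(pr_points(["d1", "d2", "d3", "d4", "d5", "d6"], {"d1", "d3", "d6"}))
# roughly: [(0.33, 1.0), (0.67, 0.67), (1.0, 0.5)]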

Page 15: Information Retrieval Quality of a Search Engine.

A common picture

[Plot: precision vs. recall, with measured points marked x]

Page 16: Information Retrieval Quality of a Search Engine.

Interpolated precision
If you can increase precision by increasing recall, then you should get to count that… Formally, the interpolated precision at recall level r is the maximum precision observed at any recall level ≥ r.
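
Under that definition, here is a minimal sketch operating on the (recall, precision) points from the previous sketch:

def interpolate(points):
    # Sweep from the highest recall down, carrying the best precision seen.
    interp, best = [], 0.0
    for r, p in sorted(points, reverse=True):
        best = max(best, p)
        interp.append((r, best))
    return list(reversed(interp))

print(interpolate([(0.33, 1.0), (0.67, 0.4), (1.0, 0.5)]))
# -> [(0.33, 1.0), (0.67, 0.5), (1.0, 0.5)]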

Page 17: Information Retrieval Quality of a Search Engine.

Other measures

- Precision at a fixed retrieval level (e.g., precision over the top 10 results): most appropriate for web search.
- 11-point interpolated average precision, the standard measure for TREC: take the interpolated precision at 11 recall levels, from 0% to 100% in steps of 10%, and average them.
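
A sketch of the 11-point measure for one query, reusing the (recall, precision) points from above (TREC then averages this value over the whole query set):

def eleven_point_avg(points):
    # Interpolated precision at level l: max precision at any recall >= l.
    def interp_at(level):
        return max((p for r, p in points if r >= level), default=0.0)
    levels = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0
    return sum(interp_at(l) for l in levels) / len(levels)

print(eleven_point_avg([(0.33, 1.0), (0.67, 0.67), (1.0, 0.5)]))  # ~0.73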

Page 18: Information Retrieval Quality of a Search Engine.

F measure
Combined measure (weighted harmonic mean of precision P and recall R):

1/F = α · (1/P) + (1 − α) · (1/R),  i.e.  F = (β² + 1) · P · R / (β² · P + R)  with  α = 1/(β² + 1)

People usually use the balanced F1 measure, i.e., with β = 1 (equivalently α = ½), thus 1/F = ½ · (1/P + 1/R), that is F1 = 2PR / (P + R).

Use this if you need to optimize a single measure that balances precision and recall.
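
A sketch of the general Fβ and the balanced F1 in Python (function name and zero-division guard are our choices):

def f_measure(p, r, beta=1.0):
    # F = (beta^2 + 1) * P * R / (beta^2 * P + R); beta = 1 gives F1.
    if p == 0.0 and r == 0.0:
        return 0.0
    b2 = beta * beta
    return (b2 + 1) * p * r / (b2 * p + r)

print(f_measure(0.667, 0.5))             # balanced F1, ~0.57
print(f_measure(0.667, 0.5, beta=2.0))   # beta > 1 weights recall more, ~0.53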