Preference Based Evaluation Measures for Novelty and Diversity
![Page 1: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/1.jpg)
Preference Based Evaluation Measures for Novelty and Diversity
Date: 2014/04/08
Author: Praveen Chandar and Ben Carterette
Source: SIGIR'13
Advisor: Jia-Ling Koh
Speaker: Sheng-Chih Chu
![Page 2: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/2.jpg)
2
Outline
• Introduction
• Preference Based framework
• Preference-Based Evaluation Measure
• Experiments
• Conclusion
![Page 3: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/3.jpg)
3
Introduction
• Traditional IR evaluation relies on the assumption that each document's relevance is independent of the other documents.
• Subtopic-based evaluation judges relevance to the query, but it does not depend on the user or the scenario.
Query: Living in India
Subtopics:
• information for visitors and immigrants
• how people live in India
• history about life and culture in India
![Page 4: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/4.jpg)
4
Introduction
• User profiles can be used to represent the combination of relevant subtopics and other factors that shape a user's preferences.
• Goal: propose an evaluation framework and metrics based on user preferences for the novelty and diversity task.
![Page 5: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/5.jpg)
5
Outline
• Introduction
• Preference Based framework
• Preference-Based Evaluation Measure
• Experiments
• Conclusion
![Page 6: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/6.jpg)
6
Preference Based framework
• Some issues with subtopic-based evaluation:
• Subtopics are challenging to identify and not easy to enumerate.
• Measures often require many parameters.
• Measures assume subtopics are independent of each other, which is not true in reality.
![Page 7: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/7.jpg)
7
Preference Based framework
• Preference judgments:
1. simple pairwise preference judgments
2. conditional preference judgments
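The sketch below shows one way the two kinds of judgments could be turned into utilities. It rests on assumptions not stated on the slide: that a document's utility is the fraction of its pairings in which assessors preferred it, and that a conditional judgment is the same comparison made after the assessor has already seen a context document. The function name, the tuple layout, and the sample judgments are hypothetical.

```python
from collections import defaultdict


def utilities_from_preferences(judgments):
    """Estimate utilities from preference judgments.

    Each judgment is a tuple (winner, loser, context). context is None for
    a simple pairwise judgment, or the id of the document the assessor had
    already seen for a conditional judgment.
    Assumption: utility = times preferred / times shown in a pair.
    """
    wins = defaultdict(int)
    shown = defaultdict(int)
    for winner, loser, context in judgments:
        wins[(winner, context)] += 1
        shown[(winner, context)] += 1
        shown[(loser, context)] += 1
    return {key: wins[key] / shown[key] for key in shown}


# Made-up judgments, for illustration only.
judgments = [
    ("d1", "d2", None),   # simple: d1 preferred over d2
    ("d1", "d3", None),   # simple: d1 preferred over d3
    ("d3", "d2", None),   # simple: d3 preferred over d2
    ("d3", "d2", "d1"),   # conditional: given d1, d3 preferred over d2
]
utils = utilities_from_preferences(judgments)
print(utils[("d3", None)], utils[("d3", "d1")])
```

Keying the result by (document, context) keeps simple and conditional utilities in one table, so U(d) and U(d | d') can be looked up the same way.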
![Page 8: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/8.jpg)
8
Outline
• Introduction
• Preference Based framework
• Preference-Based Evaluation Measure
• Experiments
• Conclusion
![Page 9: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/9.jpg)
9
Preference-Based Evaluation Measure
• Browsing model
• Document utility
• Utility accumulation
• The user scans documents down a ranked list one by one and stops at some rank k.
![Page 10: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/10.jpg)
10
Preference-Based Evaluation Measure
• Ex:
S: the set of previously ranked documents
i = 1: U(d1)
i = 2: U(d2 | d1)
i = 3: F({U(d3 | d2), U(d3 | d1)})
i = 4: F({U(d4 | d3), U(d4 | d2), U(d4 | d1)})
...
Ex: U(d3 | d2) = 9/10, U(d3 | d1) = 4/5
F() can be one of two functions:
Average: (0.9 + 0.8) / 2 = 0.85
Minimum: min({0.9, 0.8}) = 0.8
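The example above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name and the dictionary layout (keys of the form (document, context)) are assumptions, and only the two choices of F() named on the slide are supported.

```python
def conditional_utility(doc, ranked_before, utility, combine="average"):
    """Utility of `doc` given S, the documents ranked above it.

    `utility` maps (doc, context) to a number; context is None for the
    unconditional utility U(doc) used at rank 1.
    """
    if not ranked_before:
        return utility[(doc, None)]
    values = [utility[(doc, prev)] for prev in ranked_before]
    if combine == "average":
        return sum(values) / len(values)
    if combine == "minimum":
        return min(values)
    raise ValueError(f"unknown combine function: {combine}")


# Values taken from the slide: U(d3|d2) = 9/10, U(d3|d1) = 4/5.
utility = {("d3", "d2"): 0.9, ("d3", "d1"): 0.8}
avg = conditional_utility("d3", ["d1", "d2"], utility, "average")
low = conditional_utility("d3", ["d1", "d2"], utility, "minimum")
print(round(avg, 2), low)
```

The minimum is the more conservative choice: a document is only as useful as its worst comparison against anything the user has already seen.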
![Page 11: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/11.jpg)
11
Preference-Based Evaluation Measure
• K = 5, 10, 20
• Final step: normalize
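A rough sketch of accumulation and normalization at the cutoffs K = 5, 10, 20 follows. The slide's formula did not survive extraction, so the details here are assumptions: a log2 rank discount in the style of nDCG stands in for the browsing model, and the per-rank utilities of an ideal ranking are assumed to be supplied by the caller. All names and sample values are hypothetical.

```python
import math


def accumulate(utilities_at_rank, k):
    """Sum the discounted utilities of the top-k ranks.

    Assumption: a log2 rank discount (as in nDCG) models the chance that
    the user reaches each rank.
    """
    top_k = utilities_at_rank[:k]
    return sum(u / math.log2(rank + 1) for rank, u in enumerate(top_k, start=1))


def normalized_score(system_utils, ideal_utils, k):
    """Divide the system's accumulated utility by that of an ideal ranking."""
    ideal = accumulate(ideal_utils, k)
    return accumulate(system_utils, k) / ideal if ideal > 0 else 0.0


# Made-up per-rank utilities, for illustration only.
system_utils = [0.5, 1.0, 0.85]
ideal_utils = [1.0, 0.9, 0.85]
scores = {k: normalized_score(system_utils, ideal_utils, k) for k in (5, 10, 20)}
print(scores)
```

Normalizing by the ideal ranking keeps scores in a comparable range across queries that have different numbers of useful documents.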
![Page 12: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/12.jpg)
12
Outline
• Introduction
• Preference Based framework
• Preference-Based Evaluation Measure
• Experiments
• Conclusion
![Page 13: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/13.jpg)
13
Data set
• Uses the ClueWeb09 dataset (English documents).
• A total of 150 queries have been developed and judged for the TREC Web track.
• Subtopics per query: 3 to 8
• Based on TREC profile
![Page 14: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/14.jpg)
14
Analysis
• System Ranking Comparison
![Page 15: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/15.jpg)
15
Analysis
• Rank Correlation Between Measures
![Page 16: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/16.jpg)
16
Analysis
• Rank Correlation Between Measures
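For readers unfamiliar with rank correlation between measures, here is a small sketch. The slide does not name the statistic, so Kendall's tau (the tau-a variant, which ignores tied pairs in the numerator) is an assumption, and the system names and scores are made up purely for illustration.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two measures' scores for the same systems."""
    systems = list(scores_a)
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        # Positive product: both measures order the pair the same way.
        sign = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = len(systems) * (len(systems) - 1) / 2
    return (concordant - discordant) / pairs


# Hypothetical scores of four systems under two evaluation measures.
measure_a = {"sysA": 0.42, "sysB": 0.37, "sysC": 0.30, "sysD": 0.25}
measure_b = {"sysA": 0.61, "sysB": 0.64, "sysC": 0.50, "sysD": 0.41}
tau = kendall_tau(measure_a, measure_b)
print(round(tau, 3))
```

A value near 1 means the two measures rank the systems almost identically; here only one pair of systems (sysA and sysB) is ordered differently.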
![Page 17: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/17.jpg)
17
Analysis
• Evaluating Multiple User Profiles
![Page 18: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/18.jpg)
18
Analysis
![Page 19: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/19.jpg)
19
CONCLUSION
![Page 20: Preference Based Evaluation Measures for Novelty and Diversity](https://reader034.fdocuments.net/reader034/viewer/2022051421/56816110550346895dd06c27/html5/thumbnails/20.jpg)
20
Conclusion
• The authors proposed a novel evaluation framework and a family of measures for IR.
• It can incorporate any property that influences user preferences for one document over another.