Evaluating Hierarchical Clustering of Search Results
Departamento de Lenguajes y
Sistemas Informáticos
UNED, Spain
Juan Cigarrán
Anselmo Peñas
Julio Gonzalo
Felisa Verdejo
nlp.uned.es
SPIRE 2005, Buenos Aires
Overview
Scenario
Assumptions
Features of a Good Hierarchical Clustering
Evaluation Measures
– Minimal Browsing Area (MBA)
– Distillation Factor (DF)
– Hierarchy Quality (HQ)
Conclusion
Scenario
Complex information needs
– Compile information from different sources
– Inspect the whole list of documents
• More than 100 documents
Help to
– Find the relevant topics
– Discriminate relevant from irrelevant documents
Approach
– Hierarchical Clustering
– Formal Concept Analysis
Problem
How to define and measure the quality of a hierarchical clustering?
How to compare different clustering approaches?
Previous assumptions
Each cluster contains only those documents fully described by its descriptors
                   d1   d2   d3   d4
Physics            X    X    X    X
Nuclear physics         X    X
Astrophysics                      X
[Figure: two hierarchies for this table. In the first, each document hangs only from its most specific node: Physics {d1}, with Nuclear physics {d2, d3} and Astrophysics {d4} below it. In the second, the Physics node instead gathers all documents indexed with Physics: Physics {d1, d2, d3, d4}, again with Nuclear physics {d2, d3} and Astrophysics {d4} below.]
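As a small illustration of this assumption (a sketch only; the dictionaries and names below are ours, not from the paper), a document is attached to exactly the cluster whose full description, its own descriptor plus the inherited ones, matches the document's descriptors:

```python
# Document descriptions, taken from the table above
docs = {
    "d1": {"Physics"},
    "d2": {"Physics", "Nuclear physics"},
    "d3": {"Physics", "Nuclear physics"},
    "d4": {"Physics", "Astrophysics"},
}

# Full description of each cluster: its own descriptor plus those inherited from its parents
clusters = {
    "Physics":         {"Physics"},
    "Nuclear physics": {"Physics", "Nuclear physics"},
    "Astrophysics":    {"Physics", "Astrophysics"},
}

# Under the assumption, a document belongs only to the cluster(s) that fully describe it
assignment = {
    name: sorted(d for d, descr in docs.items() if descr == intent)
    for name, intent in clusters.items()
}
print(assignment)
# {'Physics': ['d1'], 'Nuclear physics': ['d2', 'd3'], 'Astrophysics': ['d4']}
```

This reproduces the first hierarchy in the figure above, where d2, d3 and d4 do not reappear in the Physics node.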
Previous assumptions
‘Open world’ perspective
                      d1   d2   d3
Physics               X         X
Jokes                      X    X
Jokes about physics             X
[Figure: two lattices for this table, with d1 attached to Physics, d2 to Jokes and d3 to Jokes about physics, which lies below both Physics and Jokes.]
Good Hierarchical Clustering
The content of the clusters
– Clusters should not mix relevant with non-relevant information

[Figure: clusters of relevant (+) and non-relevant (-) documents; good clusterings keep + and - documents in separate clusters instead of mixing them.]
Good Hierarchical Clustering
The hierarchical arrangement of the clusters
– Relevant information should be in the same path

[Figure: two hierarchies over the same documents; in one the relevant (+) documents lie along the same path, in the other they are scattered across branches.]
Good Hierarchical Clustering
The number of clusters
– Number of clusters substantially lower than the number of documents
How clusters are described
– Cognitive load of reading a cluster description
– Ability to predict the relevance of the information that it contains (not addressed here)
Evaluation Measures
Criterion
– Minimize the browsing effort for finding ALL relevant information
Baseline
– The original document list returned by a search engine

Quality(lattice) = cognitive load(ranked list) / cognitive load(lattice)
Evaluation Measures
Consider
– Content of clusters
– Hierarchical arrangement of clusters
– Size of the hierarchy
– Cognitive load of reading a document (in the baseline): k_d
– Cognitive load of reading a node descriptor (in the hierarchy): k_n
Requirement
– Relevance assessments are available
Minimal Browsing Area (MBA)
The minimal set of nodes the user has to traverse to find ALL the relevant documents while minimising the number of irrelevant ones

[Figure: lattice with relevant (+) and non-relevant (-) documents; the MBA is the smallest set of nodes that reaches every + document.]
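A minimal sketch of how such an area could be computed, assuming a tree-shaped hierarchy in which each document hangs from exactly one node and the user browses top-down from the root; in the general lattice case choosing among alternative access paths becomes an optimisation problem. Class and function names below are illustrative, not from the paper.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # identity-based hashing, so nodes can be stored in a set
class Node:
    name: str
    docs: list = field(default_factory=list)      # (doc_id, is_relevant) pairs
    children: list = field(default_factory=list)  # child Node objects

def minimal_browsing_area(root):
    """Nodes that must be traversed, top-down, to reach every node holding
    at least one relevant document; the whole access path is kept, since a
    node can only be reached through its ancestors."""
    mba = set()

    def visit(node, path):
        path = path + [node]
        if any(rel for _, rel in node.docs):
            mba.update(path)
        for child in node.children:
            visit(child, path)

    visit(root, [])
    return mba
```

The documents read while browsing are then those attached to the nodes of this area, which is what the DF and HQ measures below count.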
Distillation Factor (DF)
Ability to isolate relevant information compared with the original document list (a gain factor: DF > 1 means the hierarchy helps)
DF(L) = cognitive load(ranked list) / cognitive load(lattice)
      = (k_d · |D_Ranked List|) / (k_d · |D_MBA|)

Considers only the cognitive load of reading documents

Equivalent to:

DF(L) = precision(MBA) / precision(Ranked List)
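A minimal sketch of this computation, assuming the documents of the ranked list and those read inside the MBA are given as boolean relevance lists (function and parameter names are ours, not from the paper):

```python
def precision(judgements):
    """Fraction of relevant (True) items among a list of relevance judgements."""
    return sum(judgements) / len(judgements)

def distillation_factor(ranked_list, mba_docs):
    """DF(L) = precision(MBA) / precision(ranked list).

    ranked_list: relevance judgements of the documents in the original list.
    mba_docs:    relevance judgements of the documents read inside the MBA.
    Only the cost of reading documents (k_d) is counted, so it cancels out.
    """
    return precision(mba_docs) / precision(ranked_list)
```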
Distillation Factor (DF)
Example

Document list:
Doc 1 +
Doc 2 -
Doc 3 +
Doc 4 +
Doc 5 -
Doc 6 -
Doc 7 +

Precision = 4/7
Precision(MBA) = 4/5
DF(L) = 7/5 = 1.4

[Figure: lattice built over the same result set; the MBA contains 5 documents, 4 of them relevant.]
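Using the distillation_factor sketch above on this example:

```python
ranked = [True, False, True, True, False, False, True]  # Doc 1 .. Doc 7
mba    = [True, True, True, True, False]                # 5 documents read inside the MBA
print(distillation_factor(ranked, mba))                 # (4/5) / (4/7) ≈ 1.4
```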
Distillation Factor (DF)
Counterexample

[Figure: clustering of 8 documents (+ + - - + - - +) in which the MBA reads only the 4 relevant documents.]

Precision = 4/8
Precision(MBA) = 4/4
DF = 8/4 = 2

A bad clustering can still obtain a good DF. To fix this, extend the DF measure with the cognitive cost of taking browsing decisions: the Hierarchy Quality (HQ) measure.
Hierarchy Quality (HQ)
Assumption:
– When a node (in the MBA) is explored, all its lower neighbours have to be considered: some will in turn be explored, some will be discarded
– N_view: subset of lower neighbours of each node belonging to the MBA

[Figure: lattice with relevant (+) and non-relevant (-) documents; the nodes in the MBA have 8 lower neighbours that must be considered, so |N_view| = 8.]
Hierarchy Quality (HQ)

k_n and k_d are directly related to the retrieval scenario in which the experiments take place
The researcher must tune K = k_n / k_d before conducting the experiment
HQ > 1 indicates an improvement of the clustering over the original list

HQ(L) = cognitive load(ranked list) / cognitive load(lattice)
      = (k_d · |D_Ranked List|) / (k_d · |D_MBA| + k_n · |N_view|)
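A minimal sketch of this formula (parameter names are ours; the counts would come from the MBA computation sketched earlier):

```python
def hierarchy_quality(n_list_docs, n_mba_docs, n_view, k_d=1.0, k_n=1.0):
    """HQ(L) = (k_d * |D_Ranked List|) / (k_d * |D_MBA| + k_n * |N_view|).

    n_list_docs: documents in the baseline ranked list
    n_mba_docs:  documents read inside the minimal browsing area
    n_view:      lower neighbours considered while traversing the MBA
    k_d, k_n:    cognitive cost of reading a document / a node descriptor
    """
    return (k_d * n_list_docs) / (k_d * n_mba_docs + k_n * n_view)
```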
Hierarchy Quality (HQ)
Example

Hierarchy from the previous slide (10 documents in the ranked list, 5 read in the MBA, |N_view| = 8):
HQ(L) = (10 · k_d) / (5 · k_d + 8 · k_n)
Cutting value: k_d / k_n = 1.6 (HQ > 1 only above this ratio)

Counterexample (8 documents in the list, 4 read in the MBA, |N_view| = 12):
HQ(L) = (8 · k_d) / (4 · k_d + 12 · k_n)
Cutting value: k_d / k_n = 3
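For instance, applying the hierarchy_quality sketch above to the counterexample:

```python
print(hierarchy_quality(8, 4, 12, k_d=1.0, k_n=1.0))  # 0.5   -> worse than the plain list
print(hierarchy_quality(8, 4, 12, k_d=4.0, k_n=1.0))  # ~1.14 -> k_d/k_n exceeds the cutting value 3
```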
Conclusions and Future Work
Framework for comparing different clustering approaches, taking into account:
– Content of clusters
– Hierarchical arrangement of clusters
– Cognitive load to read document and node descriptions
Adaptable to the retrieval scenario in which experiments take place
Future work
– Conduct user studies to compare their results with the automatic evaluation
• Results will reflect the quality of the descriptors
• Will be used to fine-tune the k_d and k_n parameters
Thank you!