Modeling term relevancies in information retrieval using Graph Laplacian Kernels
Transcript of Modeling term relevancies in information retrieval using Graph Laplacian Kernels
![Page 1: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/1.jpg)
Modeling term relevancies in information retrieval
using Graph Laplacian Kernels
Shuguang Wang
Joint work with Saeed Amizadeh and Milos Hauskrecht
![Page 2: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/2.jpg)
A Problem in Document Retrieval
• There is a ‘gap’ between search queries and documents.
Query: car
![Page 5: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/5.jpg)
A Problem in Document Retrieval
• There is a ‘gap’ between search queries and documents.
Google.com, Bing.com, Yahoo.com, …
Query: car
Good enough?
![Page 6: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/6.jpg)
A Problem in Document Retrieval
• What about the documents about automobiles, BMW, Benz, …?
• There are various expressions for the same entity.
• One solution is to expand the original user query with some ‘relevant’ terms.
![Page 9: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/9.jpg)
Traditional Query Expansion Methods
• Human and/or computer-generated thesauri
– Zhou et al., SIGIR 2007 proposed to expand queries with MeSH concepts.
• Human relevance feedback
– Implicit feedback from humans, such as tracking eye movements (Buscher et al., SIGIR 2009).
– User click information (Yin et al., ECIR 2009).
• Automatic query expansion
– Pseudo-relevance feedback, first proposed in (Xu and Croft, SIGIR 1996).
• Use the top ‘n’ documents from the initial search as implicit feedback and select ‘relevant’ terms from these ‘n’ documents.
– Analyze the query-flow graph (Bordino et al., SIGIR 2010).
Expensive and time-consuming
Human input
![Page 10: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/10.jpg)
Traditional Query Expansion Methods
• Human and/or computer generated thesauri– Zhou et al., SIGIR 2007 proposed to expand query with MeSH concepts.
• Human Relevance feedback– Implicit feedback from human such as tracking eye movement (Buscher
et al., SIGIR 2009).– User click information (Yin et al., ECIR 2009)
• Automatic query expansion– Pseudo Relevance Feedback first proposed in (Xu and Croft, SIGIR
1996).• Use top ‘n’ documents from the initial search as the implicit feedback and
select ‘relevant’ terms from these ‘n’ documents.– Analyze query flow graph in (Bordino et al., SIGIR 2010)
Expensive, and time consuming
Human Input
![Page 11: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/11.jpg)
A Different View
• What we really need here is a way to estimate term-term relevance.
• Problem of finding expansion terms for user queries → problem of finding ‘relevant’ terms given a similarity metric.
• How to derive a term-term similarity metric?
![Page 12: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/12.jpg)
Term-Term Similarity
• Hypothesis: the metric ‘d’ should be smooth, i.e., d(t1) ≈ d(t2) if ‘t1’ and ‘t2’ are similar/relevant.
• Why not graph Laplacian kernels?!
– We easily get the smoothness property.
– We can also define distance metrics with them.
![Page 13: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/13.jpg)
Define Affinity Graph
• Nodes are terms
• Edges are co-occurrences
• The weight of an edge is the number of documents in which the two terms co-occur
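A minimal Python sketch of this affinity-graph construction; the `build_affinity_graph` helper and the toy documents are invented for illustration, not part of the talk:

```python
from collections import Counter
from itertools import combinations

def build_affinity_graph(documents):
    """Nodes are terms; an edge weight is the number of
    documents in which the two terms co-occur."""
    weights = Counter()
    for doc in documents:
        terms = sorted(set(doc.split()))      # unique terms per document
        for a, b in combinations(terms, 2):
            weights[(a, b)] += 1              # one count per co-occurring document
    return dict(weights)

docs = ["car engine repair", "car automobile dealer", "engine repair shop"]
graph = build_affinity_graph(docs)
```

The resulting weighted edge list is exactly the adjacency information needed for the graph Laplacian on the following slides.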
![Page 14: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/14.jpg)
Graph Laplacian Kernels
• General form definition: K = ∑_{i=1}^{n} g(λ_i) φ_i φ_iᵀ
(Recall: L = ∑_{i=1}^{n} λ_i φ_i φ_iᵀ)
• Resistance: g(λ) = 1/λ
• Diffusion: g(λ) = e^(−σ²λ/2)
• P-step random walk: g(λ) = (a − λ)^p
• …
![Page 15: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/15.jpg)
n
i
Tii i
gK1
)( Recall:
Graph Laplacian Kernels• General Form Definition:
• Resistance:
• Diffusion:
• P-step Random Walk:• • …
1)(g
2
2
)(
eg
pg )()(
How to choose hyper parameters?
![Page 16: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/16.jpg)
Graph Laplacian Kernels
• General form definition: K = ∑_{i=1}^{n} g(λ_i) φ_i φ_iᵀ
(Recall: L = ∑_{i=1}^{n} λ_i φ_i φ_iᵀ)
• Resistance: g(λ) = 1/λ
• Diffusion: g(λ) = e^(−σ²λ/2)
• P-step random walk: g(λ) = (a − λ)^p
• …
How to choose g(λ)?
How to choose hyperparameters?
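These spectral transforms are straightforward to apply numerically. A minimal sketch (the `laplacian_kernel` helper, toy weight matrix, and parameter values are invented for illustration):

```python
import numpy as np

def laplacian_kernel(W, g):
    """Compute K = sum_i g(lambda_i) phi_i phi_i^T from an affinity matrix W."""
    L = np.diag(W.sum(axis=1)) - W      # unnormalized graph Laplacian
    lam, phi = np.linalg.eigh(L)        # eigenvalues and orthonormal eigenvectors
    return (phi * g(lam)) @ phi.T       # scale each eigenvector column by g(lambda_i)

# Toy symmetric affinity matrix over three terms.
W = np.array([[0., 2., 1.],
              [2., 0., 0.],
              [1., 0., 0.]])

sigma, a, p = 1.0, 4.0, 2
K_diff = laplacian_kernel(W, lambda lam: np.exp(-sigma**2 * lam / 2))  # diffusion
K_walk = laplacian_kernel(W, lambda lam: (a - lam) ** p)               # p-step random walk
# The resistance kernel uses g(lambda) = 1/lambda on the nonzero
# eigenvalues only, i.e. the pseudoinverse of L.
```

Note that the p-step random-walk kernel built this way equals the matrix power (aI − L)^p, which is a quick sanity check on the spectral construction.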
![Page 17: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/17.jpg)
Non-parametric kernel
• Learn the transformation g(λ) directly from training data.
– If we know some terms are similar, we want to maximize their similarities.
– At the same time, we want a smoother metric.
![Page 18: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/18.jpg)
An Optimization Problem
The set of eigenvalues of the original graph Laplacian.
t_in and t_jn are pairs of similar terms in training document n.
![Page 19: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/19.jpg)
An Optimization Problem
The set of eigenvalues of the original graph Laplacian.
t_in and t_jn are pairs of similar terms in training document n.
Maximize the similarity for known similar terms t_in and t_jn.
Penalize large eigenvalues more.
![Page 23: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/23.jpg)
Kernel to Distances
• Given the kernel K, we can define the distance between any pair of nodes, d(i, j), in the graph.
Recall: K = ∑_{i=1}^{n} g(λ_i) φ_i φ_iᵀ
We define: μ_i = g(λ_i) φ_i, so the terms t_i and t_j become points in the space spanned by μ_1, μ_2, …, μ_n.
d(i, j) = K_ii + K_jj − 2K_ij (Euclidean distance)
The distance metric derived from the graph Laplacian kernel is the Euclidean distance in the kernel space.
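The kernel-induced distance can be read directly off K. A minimal sketch; the `kernel_distance` helper and the toy kernel matrix are invented for illustration:

```python
import numpy as np

def kernel_distance(K, i, j):
    # d(i, j) = K_ii + K_jj - 2 K_ij: the (squared) Euclidean distance
    # between the feature-space images of nodes i and j.
    return K[i, i] + K[j, j] - 2.0 * K[i, j]

# Toy positive semi-definite kernel over three terms.
K = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
d01 = kernel_distance(K, 0, 1)   # strongly related pair -> smaller distance
d02 = kernel_distance(K, 0, 2)   # weakly related pair -> larger distance
```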
![Page 24: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/24.jpg)
Using term-term similarity in IR
• Deal with similarity between sets and terms.
– In query expansion tasks, the set of query terms is ‘S’ and a candidate expansion term is ‘t’.
• Transform the pair-wise distances, ‘d’, into a set-to-term similarity.
– Naïve methods:
• dmax = max(d(S, t))
• davg = avg(d(S, t))
• dmin = min(d(S, t))
![Page 25: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/25.jpg)
Set-to-term Similarity
• Query collapsing
![Page 26: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/26.jpg)
Query Collapsing
• We have to compute the eigen-decomposition again for each query.
– It is too expensive for the online task.
• Approximation is possible.
– We want to approximate the projection of the ‘new’ point ‘S’ in the kernel space.
– We need to add one element to each original eigenvector.
[Figure: the eigenvectors over μ1, μ2, …, μn, extended with a new entry for S.]
![Page 28: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/28.jpg)
Nyström Approximation
• For all nodes in the graph Laplacian, we have
∑_j L(i, j) φ_k(j) = λ_k φ_k(i)
• If the new point s′ was in the graph, it would satisfy the above as well:
∑_j L(s′, j) φ_k(j) = λ_k φ_k(s′)
• Separating out the s′ term:
∑_{j≠s′} L(s′, j) φ_k(j) + L(s′, s′) φ_k(s′) = λ_k φ_k(s′)
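A minimal sketch of this Nyström extension, which rearranges the eigenvector equation to solve for φ_k(s′); the `nystrom_extend` helper and the toy graph are invented for illustration:

```python
import numpy as np

def nystrom_extend(L_row, L_ss, lam, phi):
    """Approximate the eigenvector entries phi_k(s') for a new point s'.

    Rearranging  sum_{j != s'} L(s',j) phi_k(j) + L(s',s') phi_k(s') = lambda_k phi_k(s')
    gives        phi_k(s') = (sum_{j != s'} L(s',j) phi_k(j)) / (lambda_k - L(s',s')),
    assuming lambda_k != L(s',s').

    L_row : Laplacian entries L(s', j) for the existing nodes j.
    L_ss  : the diagonal entry L(s', s').
    lam   : eigenvalues of the original Laplacian.
    phi   : eigenvectors (as columns), restricted to the existing nodes j.
    """
    return (L_row @ phi) / (lam - L_ss)

# Sanity check on a toy graph: treating an existing node as the "new"
# point should recover that node's own eigenvector entries.
W = np.array([[0.0, 2.0, 1.0],
              [2.0, 0.0, 0.0],
              [1.0, 0.0, 0.0]])
L = np.diag(W.sum(axis=1)) - W
lam, phi = np.linalg.eigh(L)
est = nystrom_extend(L[0, 1:], L[0, 0], lam, phi[1:, :])
```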
![Page 30: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/30.jpg)
Evaluation
• Two tasks:
– Term prediction (scientific publications)
• Given the terms in the abstract, predict the possible terms in the full body.
• Compare with TFIDF, PRF, PLSI.
– Query expansion
• Compare with Lemur/Indri + PRF and Terrier + PRF.
• Kernels:
– Diffusion (optimized by line search)
– Resistance
– Non-parametric (optimized by line search)
• Set-to-term:
– Average
– Query collapse
![Page 31: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/31.jpg)
Term prediction
• 6000 articles about 10 cancers downloaded from PubMed.
– 80% as training and 20% as testing.
• Given the terms in the abstracts, rank all the candidate terms using the distance metrics.
– The smaller the distance between a candidate term and the query terms, the higher the term is ranked.
• Use AUC to evaluate (Joachims, ICML 2005).
![Page 32: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/32.jpg)
Results
![Page 33: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/33.jpg)
Query Expansion
• Four TREC datasets: Genomics 03 & 04, Ad hoc TREC 7 & 8.
• We built graphs using different sets of terms in these datasets:
– genes/proteins on Genomics 03 data
– 5000 terms with the highest TFIDF scores on Genomics 04 data
– 25% subsamples of all (~100k) unique terms from TREC 7 & 8
• Use Mean Average Precision (MAP) to evaluate the performance.
• Only the Resistance kernel is used.
![Page 34: Modeling term relevancies in information retrieval using Graph Laplacian Kernels](https://reader036.fdocuments.net/reader036/viewer/2022062323/568165af550346895dd89ebd/html5/thumbnails/34.jpg)
Results