Entity Aware Click Graph
Mendes, Mika, Zaragoza, Blanco. Measuring Website Similarity using an Entity-Aware Click Graph 1/16
Measuring Website Similarity using an Entity-Aware Click Graph
Pablo N. Mendes (1), Peter Mika (2), Hugo Zaragoza (2), Roi Blanco (2)
(1) Freie Universität Berlin
(2) Yahoo! Research Barcelona
Nov 1st 2012, Maui, CIKM 2012
Introduction: query log analysis
● Query logs record user interaction with Web search engines
● Query log analysis has proven critical to improving search
● For search engines – ranking, autosuggest, “Also try”, etc.
● For site owners – insight into user needs, optimization of Web presence, etc.
Introduction: website similarity
● Click graph: relates queries and websites; edges are clicks
● Allows modeling website relatedness based on the queries shared by each pair of websites
[Figure: a click graph and the site similarity graph (SG) derived from it]
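As a rough illustration of how a site similarity graph can be derived from a click graph, the sketch below links two sites whenever at least one query led to clicks on both. The slides do not state how edges are weighted; Jaccard overlap of the query sets is an assumption made here, and the function name and the `.example` domains are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def site_similarity(clicks):
    """clicks: iterable of (query, site) pairs.
    Returns {(site_a, site_b): score} for site pairs with shared queries."""
    queries_by_site = defaultdict(set)
    for query, site in clicks:
        queries_by_site[site].add(query)

    similarity = {}
    for a, b in combinations(sorted(queries_by_site), 2):
        shared = queries_by_site[a] & queries_by_site[b]
        if shared:
            # Assumption: Jaccard weighting; the slides give no formula.
            union = queries_by_site[a] | queries_by_site[b]
            similarity[(a, b)] = len(shared) / len(union)
    return similarity

# Toy click log (hypothetical data).
clicks = [
    ("brad pitt", "movies.example"),
    ("brad pitt", "gossip.example"),
    ("brad pitt movies", "movies.example"),
    ("cheap flights", "travel.example"),
]
sg = site_similarity(clicks)
```

In this toy log only one pair of sites ends up connected, because "brad pitt movies" and "brad pitt" count as entirely different queries. That is the kind of sparsity the following slides discuss.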
Problems: Sparsity
● 44% of queries occur only once even when considering a full year of data [1]
● Using “shared queries” as a relatedness measure becomes tough in the long tail
[1] Baeza-Yates. Relating content through web usage. In HT ’09, 2009.
Problems: partial overlaps
● Breaking queries up into words distorts semantics
– “Forest” vs “Forest Gump”
– “Pitt” vs “Brad Pitt”
Introduction
● >62% of queries contain an entity name or type [20]
[20] Pound, Mika, & Zaragoza. Ad-hoc object retrieval in the web of data. In WWW’10, 2010.
Entity-aware Click Graph
● Websites can share entities and/or modifiers
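One way to picture the entity-aware click graph is to split each query into an entity and the remaining modifier words before attaching it to the clicked site. The sketch below is only an illustration: the slides do not describe the entity recognizer, so a longest-match lookup against a tiny hand-made dictionary (`ENTITIES`, hypothetical) stands in for matching against Freebase, and all names and domains are invented.

```python
from collections import defaultdict

# Hypothetical toy dictionary standing in for a knowledge base.
ENTITIES = {"brad pitt", "maui"}

def split_query(query, entities=ENTITIES):
    """Return (entity, modifier) using the longest dictionary match.
    If no entity is found, return (None, whole query)."""
    tokens = query.lower().split()
    n = len(tokens)
    for length in range(n, 0, -1):              # try longest spans first
        for start in range(n - length + 1):
            span = " ".join(tokens[start:start + length])
            if span in entities:
                rest = tokens[:start] + tokens[start + length:]
                return span, " ".join(rest)
    return None, " ".join(tokens)

def entity_aware_click_graph(clicks):
    """Map each site to the entities and modifiers of the queries leading to it."""
    graph = defaultdict(lambda: {"entities": set(), "modifiers": set()})
    for query, site in clicks:
        entity, modifier = split_query(query)
        if entity:
            graph[site]["entities"].add(entity)
        if modifier:
            graph[site]["modifiers"].add(modifier)
    return graph

clicks = [
    ("brad pitt movies", "movies.example"),
    ("brad pitt news", "gossip.example"),
    ("maui news", "local.example"),
    ("pitt university", "uni.example"),
]
graph = entity_aware_click_graph(clicks)
```

Here `movies.example` and `gossip.example` share the entity "brad pitt" even though no full query is shared, `gossip.example` and `local.example` share the modifier "news", and `uni.example` is not linked to the "brad pitt" sites merely because the word "pitt" appears in its query.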
Entity-aware Website Similarity Graph
● More connected
● Preserves semantics
● Allows analysis of how websites relate to entities and modifiers
Experiments
● Website similarity
– Find top K similar sites
– Evaluation: two sites are “similar” if they are in the same category in ODP (Open Directory Project)
● Website characteristics from the searcher POV
– What entities lead to a website
– What context words lead to a website
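The website similarity experiment can be sketched as a top-K lookup followed by a category check. The slides give only the similarity criterion (same ODP category); scoring the result as precision at K is an assumption made here, and the function names, scores, and category labels below are hypothetical.

```python
def top_k_similar(site, similarity, k):
    """similarity: {(site_a, site_b): score}.
    Return the k highest-scoring neighbours of `site`."""
    scored = []
    for (a, b), score in similarity.items():
        if site == a:
            scored.append((score, b))
        elif site == b:
            scored.append((score, a))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:k]]

def precision_at_k(site, similarity, categories, k):
    """Fraction of the top-k neighbours in the same category as `site`."""
    neighbours = top_k_similar(site, similarity, k)
    if not neighbours:
        return 0.0
    own_category = categories[site]
    hits = sum(categories.get(n) == own_category for n in neighbours)
    return hits / len(neighbours)

# Toy similarity graph and category labels (hypothetical data).
similarity = {
    ("a.example", "b.example"): 0.8,
    ("a.example", "c.example"): 0.5,
    ("b.example", "c.example"): 0.1,
}
categories = {
    "a.example": "Arts/Movies",
    "b.example": "Arts/Movies",
    "c.example": "Recreation/Travel",
}
p = precision_at_k("a.example", similarity, categories, k=2)
```

With these toy values, one of the two nearest neighbours of `a.example` shares its category, so `p` is 0.5.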
Dataset Statistics: Query Log
● 1 month of queries from Yahoo!, 45M sessions
● 5M entities from Freebase
Results 1
● Similarity edge prediction
Results 1
● Similarity edge prediction with credit to partial category overlap
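The slides do not say how credit for partial category overlap is computed. Purely as an illustration, and assuming categories are slash-separated paths in a hierarchy, one simple option is the length of the shared path prefix divided by the length of the longer path. The function name and the category strings are hypothetical.

```python
def partial_overlap(category_a, category_b):
    """Credit in [0, 1]: shared leading path segments over the longer path length.
    Assumption: this scoring rule is illustrative, not taken from the slides."""
    a = category_a.split("/")
    b = category_b.split("/")
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return shared / max(len(a), len(b))

same = partial_overlap("Arts/Movies", "Arts/Movies")
siblings = partial_overlap("Arts/Movies/Titles", "Arts/Movies/Actors")
unrelated = partial_overlap("Arts/Movies", "Recreation/Travel")
```

Under this rule identical categories get full credit, sibling categories get credit for the ancestors they share, and categories with different top-level nodes get none.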
Results 2
[Figure: entropy of distribution of entities vs. entropy of distribution of modifiers. Labeled regions: many entities, many modifiers; few entities, many modifiers; many entities, few modifiers]
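The entropy values behind this plot can be illustrated with a small sketch. It assumes the standard Shannon entropy (base 2) over the counts of clicks that each entity, or each modifier, contributes to a site; the slides do not give the exact formula, and the counts below are invented.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits of a distribution given as {item: count}."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(
        (c / total) * math.log2(total / c)
        for c in counts.values()
        if c > 0
    )

# Hypothetical per-site click counts by entity.
focused_site = Counter({"brad pitt": 8})
broad_site = Counter({"brad pitt": 2, "maui": 2, "cikm": 2, "freebase": 2})

h_focused = entropy(focused_site)
h_broad = entropy(broad_site)
```

A site reached through a single entity has entropy 0, while a site whose clicks are spread evenly over four entities has entropy 2 bits. Computing the same quantity over modifiers gives the second axis of the plot.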
Results 2
Conclusion
● Recognizing entities in Web search logs allows for click graphs that account for internal composition of queries
● New similarity graphs built from entity-aware click graphs enable more robust and flexible similarity analysis (evaluated for website similarity)
● Future:
– Exploit the knowledge base (e.g. type hierarchy)
– More complex queries
– etc.
Thank you!
● Web: http://pablomendes.com
● E-mail: [email protected]
● Twitter: @pablomendes
● Slideshare: slideshare.net/pablomendes
Questions?