PEERSPECTIVE.MPI-SWS.ORG
ALAN MISLOVE
KRISHNA P. GUMMADI
PETER DRUSCHEL
BY RAGHURAM KRISHNAMACHARI
Exploiting Social Networks for Internet Search
Motivation
WWW, search engines, social networking
Hyperlinks: placed by human authors, used by search engines to index and rank
Social networks
  No study has yet examined how information is exchanged
  Explicit links connect users, not content
  Can these links be used by search engines?
In this paper
  Compare mechanisms for publishing and locating content
  Experiment: social network-based Web search
  Challenges in leveraging social networks in the future
The Web versus Social Networks
Publishing
  Users place documents on a Web server
  Authors place hyperlinks on Web pages that refer to related pages
  Links are also placed to increase rank and promote indexing
Locating
  Web search engines employ sophisticated technologies
  Google: uses hyperlink structure and query/page relevance
  Limitations:
    New pages: must be discovered and indexed, and links to them must be created and discovered
    Number of links determines relevance, which reflects the interests/biases of the Web
    Ignored: unlinked/private pages, pages with insufficient relevance
The Web versus Social Networks
Ex: A user shares Web content with friends; the content is invisible to others; the content is now linked between users
Publishing
  Content is posted by a user and recommended by others
  Links among users: directed (distinct) and undirected (mutual)
Locating
  Traversing the social network, keyword search, top-10 lists
  Timely, relevant and reliable textual and non-textual information can be found
  Content is rated by consumers, not producers
  Content is rated almost immediately; rating does not rely on discovery
Integrating Web search and social networks
Problem
  No unified tool for searching and finding content
  Social network-based search is not used on the Web, and vice versa
Questions
  Can social network links be leveraged to improve search results?
  What are the benefits of social network-based Web search?
Solution
  Conduct an experiment to explore these questions
PeerSpective: The experiment
Web content viewed by 10 students/researchers is shared
An HTTP proxy indexes all URLs visited by a user
When a Google search (query) is performed
  The local proxy forwards the query to Google and to all peer proxies
  All proxies execute the query on their local index and return results
  Results are collated and presented alongside the Google results
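The query flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `PeerProxy`, the function `peer_search`, the example URLs, and the in-memory inverted index are all hypothetical stand-ins (the deployment itself used Lucene and FreePastry), and ranking by the number of peers that returned a URL is an assumed collation rule, not one taken from the slides.

```python
from collections import defaultdict


class PeerProxy:
    """Hypothetical stand-in for one user's proxy and its local index."""

    def __init__(self, name):
        self.name = name
        self.index = defaultdict(set)  # term -> set of URLs containing it

    def record_visit(self, url, text):
        """Index the text of a page the user has just viewed."""
        for term in text.lower().split():
            self.index[term].add(url)

    def search(self, query):
        """Return URLs whose indexed text contains every query term."""
        terms = query.lower().split()
        if not terms:
            return set()
        hits = set(self.index.get(terms[0], set()))
        for term in terms[1:]:
            hits &= self.index.get(term, set())
        return hits


def peer_search(query, proxies):
    """Fan the query out to every proxy and collate the answers.

    Assumed collation rule: URLs returned by more peers come first,
    with ties broken alphabetically.
    """
    counts = defaultdict(int)
    for proxy in proxies:
        for url in proxy.search(query):
            counts[url] += 1
    return sorted(counts, key=lambda url: (-counts[url], url))


alice, bob = PeerProxy("alice"), PeerProxy("bob")
alice.record_visit("http://a.example/pastry", "pastry overlay routing")
bob.record_visit("http://a.example/pastry", "pastry overlay routing")
bob.record_visit("http://b.example/lucene", "lucene text search overlay")

results = peer_search("overlay", [alice, bob])
print(results)
```

In a real deployment the collated list would be shown next to the Google results for the same query; the sketch leaves the Google request out entirely.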
PeerSpective: Measurements & Experiences
In a month-long experimental deployment (10 users)
  439,384 HTTP requests
  198,492 distinct URLs (45%)
  113,800 HTML and PDF requests (25.9%)
User base is small, with highly specialized interests
  The results may not represent a large, diverse group
Technology
  Local text search engine: Lucene
  Local peer-to-peer overlay engine: FreePastry
Limits of hyperlink-based search
Web search engines index only well-linked content
Limit: URLs visited by users but not indexed by Google
Reasons why a page might not be indexed
  The page could be too new (blogs, news)
  The page could be in the deep web and not well connected
  The page could be in the dark web (private pages)
PeerSpective versus Google
For each HTTP request
  Does Google's index contain this URL?
  Has some peer in PeerSpective viewed this URL?
Static HTML content (no GET/POST)
  6,679 requests (<6%) for 3,987 URLs (2%)
Google index
  Covers 62.5% of the requests, 68.1% of the distinct URLs
  About one third of all URL requests cannot be retrieved by Google
PeerSpective index
  Covers 30.4% of requested URLs
  Achieves about half of Google's coverage with a much smaller index
  13.3% of the URLs were in PeerSpective but not in Google's index
  19.5% improvement by PeerSpective compared to Google search
What are the documents that interest our users, but not Google?
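The coverage figures above can be reproduced with simple set arithmetic. The sketch below is an assumption-laden illustration: the helper names `coverage` and `relative_gain` and the toy URL sets are hypothetical, and reading the 19.5% improvement as the ratio of PeerSpective-only coverage (13.3%) to Google's coverage of distinct URLs (68.1%) is an interpretation that happens to match the numbers, not something the slides state.

```python
def coverage(viewed, indexed):
    """Fraction of viewed URLs that appear in a given index."""
    if not viewed:
        raise ValueError("no viewed URLs to measure coverage against")
    return len(viewed & indexed) / len(viewed)


def relative_gain(base_coverage, extra_coverage):
    """Extra coverage expressed relative to the baseline coverage."""
    return extra_coverage / base_coverage


# Toy data (hypothetical): URLs the users viewed and what each index holds.
viewed = {"u1", "u2", "u3", "u4"}
google_index = {"u1", "u2", "u9"}
peer_index = {"u2", "u3"}

google_cov = coverage(viewed, google_index)                 # 2 of 4 URLs
peer_only_cov = coverage(viewed, peer_index - google_index)  # 1 of 4 URLs
toy_gain = relative_gain(google_cov, peer_only_cov)

# Same calculation with the percentages quoted on the slide.
slide_gain = relative_gain(0.681, 0.133)
print(f"{slide_gain:.1%}")  # prints 19.5%
```

Under this reading, the improvement is relative to what Google already covers, so it should not be added to the 68.1% as if it were a percentage-point increase.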
Benefits of social network-based search
Search engines have to rank pages
  Users rarely go beyond the first 20 search results
1,730 Google searches were observed
  First-page results (average): Google 9.45, PeerSpective 5.17
  1,079 (62.3%) resulted in clicks on result(s)
  307 (17.7%) were followed by a refined query
  Users gave up 344 (19.8%) of the time
  933 (86.5%) of clicked results were returned only by Google
  83 (7.7%) of clicked results were returned only by PeerSpective
  63 (5.7%) of clicked results were returned by both
  9% improvement in search result clicks over Google alone
How PeerSpective outperforms Google
Disambiguation
  Search terms have multiple meanings depending on the context
Ranking
  Search engine: top rank; social network: nearby pages
Serendipity
  Making unexpected or fortunate discoveries
Opportunities and Challenges
Online social networking enables new forms of information exchange
  Users can very easily and conveniently publish information
  Makes it possible to locate and access word-of-mouth ("WOM") information
  Organizes information according to the tastes and preferences of smaller groups of individuals
Opportunities and Challenges
  Privacy: willingness of individuals to share information
  Membership and clustering of social networks
  Content rating and ranking (page rank, views)
  System architecture (centralized or distributed)
Thank You