BOSS: HackU IIT Bombay
HackU: IIT Bombay, 5 Feb 2009
Chris Heilmann, Saurabh Sahni
Build your Own Search Service
- 2 -
Outline
• Search engines using BOSS
• About BOSS API
  – What?
  – Why?
  – Features
• How to use it
  – BOSS API
  – BOSS Mashup framework
- 3 -
Search engines using BOSS
- 4 -
hakia: http://hakia.com/
- 7 -
Cluuz: http://cluuz.com
- 10 -
Keyword finder - http://keywordfinder.org/
- 11 -
askBOSS: http://ask-boss.appspot.com/
- 16 -
About BOSS API
- 17 -
What?
• Open Yahoo’s core search features via web services to let 3rd parties revolutionize Search
• Unrestricted
http://developer.yahoo.com/search/boss
- 19 -
Usage
Opening the search technology stack
50B pages * 20ms page download = 31 years
[Diagram of the search technology stack: Crawl, Extract, Spam <-> Gold, Analyze, Index, Rank Assist, Index, Web Map, Retrieve, with a Web API layer connecting to "Your App here"]
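The crawl-time figure on this slide can be checked with a few lines of arithmetic. The sketch below assumes pages are downloaded strictly one after another and uses a 365.25-day year; both are assumptions made for illustration, not something the slide states.

```python
# Back-of-the-envelope check of "50B pages * 20ms page download = 31 years".
PAGES = 50_000_000_000                 # 50B pages (from the slide)
SECONDS_PER_PAGE = 0.020               # 20 ms per page download (from the slide)
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: average calendar year


def sequential_crawl_years(pages, seconds_per_page):
    """Years needed to fetch every page one at a time."""
    return pages * seconds_per_page / SECONDS_PER_YEAR


years = sequential_crawl_years(PAGES, SECONDS_PER_PAGE)
print(f"{years:.1f} years")  # about 31.7, matching the slide's rounded "31 years"
```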
- 20 -
Why?
• Removes entry barriers
  – Massive capital investment
  – Access to top technical talent
• Asset to innovate
  – Develop new relevance models
    • Leverage user insights
    • Use tags, bookmarks
  – Change presentation style
• Search anywhere
  – Improve vertical quality w/ Web comprehensiveness
  – Fragment the market, foster more players, choice, competition
- 21 -
BOSS API features
• Unlimited queries per day
• No branding or attribution
• No restrictions on presentation
• Ability to re-order results and blend in additional content
• Access to multiple verticals (web search, image, news)
• Spell checks, keyword suggestions
• 40+ supported language and region pairs
• Ability to monetize
- 22 -
How to use it?
- 23 -
Get Started
• Register for an application ID: http://developer.yahoo.com/wsregapp/
• Documentation: http://developer.yahoo.com/search/boss/boss_guide/
• Code samples (JavaScript, PHP and Python): http://www.saurabhsahni.com/boss-examples.zip
- 24 -
BOSS API
Searching Slumdog Millionaire
(Source: http://en.wikipedia.org/wiki/File:Slumdog_Millionaire_poster.jpg)
- 25 -
BOSS API
• Search for slumdog millionaire:
  – http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&format=xml
- 26 -
BOSS API: XML response
http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&format=xml
- 27 -
BOSS API
• Exact search for “slumdog millionaire”
  – http://boss.yahooapis.com/ysearch/web/v1/%22slumdog+millionaire%22?appid=xyz&format=xml
- 28 -
BOSS API
• Search for slumdog millionaire only on indiatimes.com:
  – Add site:indiatimes.com to your query
  – http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire+site%3Aindiatimes.com?appid=xyz&format=xml
• Search for slumdog millionaire on selected movie sites
  – Add param sites=indiatimes.com,movies.yahoo.com,imdb.com
  – http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&sites=indiatimes.com%2Cmovies.yahoo.com&format=xml
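The query part of these URLs (plain terms, an exact phrase in quotes, a site: restriction) is ordinary URL encoding. Here is a minimal sketch of how it could be produced with Python's standard library; `build_query` is a hypothetical helper name, not part of any BOSS SDK.

```python
from urllib.parse import quote_plus


def build_query(terms, exact=False, site=None):
    """Return the URL-encoded {query} path segment.

    exact=True wraps the terms in double quotes (phrase search);
    site="example.com" appends a site: restriction to the query text.
    """
    text = f'"{terms}"' if exact else terms
    if site:
        text += f" site:{site}"
    # quote_plus turns spaces into '+', '"' into %22 and ':' into %3A
    return quote_plus(text)


print(build_query("slumdog millionaire"))
print(build_query("slumdog millionaire", exact=True))
print(build_query("slumdog millionaire", site="indiatimes.com"))
```

The three printed values correspond to the path segments used in the plain, exact-phrase and site-restricted URLs on the slides.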
- 29 -
http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&sites=indiatimes.com%2Cmovies.yahoo.com&format=xml
- 30 -
BOSS API
• Find related keywords
  – Add parameter view=keyterms
  – http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&view=keyterms&format=xml
- 31 -
http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&view=keyterms&format=xml
- 32 -
BOSS API
• Search images
  – http://boss.yahooapis.com/ysearch/images/v1/slumdog+millionaire?dimensions=small
- 33 -
http://boss.yahooapis.com/ysearch/images/v1/slumdog+millionaire?dimensions=small
- 34 -
BOSS API
• Search news
  – http://boss.yahooapis.com/ysearch/news/v1/slumdog+millionaire?age=15d
- 35 -
http://boss.yahooapis.com/ysearch/news/v1/slumdog+millionaire?age=15d
- 36 -
BOSS API
Spell check request
http://boss.yahooapis.com/ysearch/spelling/v1/milionare?format=xml
Response
- 37 -
BOSS API REST Interface
• {query}: term to look for (URL-encoded)
• {vert} := {web, news, images, spelling}
• @ required
  – appid
• @ optional
  – start, count, lang, region, format, callback, sites
http://boss.yahooapis.com/ysearch/{vert}/v1/{query}
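The URL template above can be filled in programmatically. The following is a rough sketch, assuming only what the slide shows: the base URL, the four verticals, the required appid, and optional parameters passed as a query string. `boss_url` is a hypothetical helper name. The code only builds the URL string and makes no request, since the endpoint dates from these 2009 slides and may no longer respond.

```python
from urllib.parse import quote_plus, urlencode

BOSS_BASE = "http://boss.yahooapis.com/ysearch"       # from the slide
VERTICALS = {"web", "news", "images", "spelling"}     # the {vert} values listed


def boss_url(vert, query, appid, **optional):
    """Fill in http://boss.yahooapis.com/ysearch/{vert}/v1/{query}.

    appid is required; anything in **optional (start, count, lang, region,
    format, callback, sites, ...) is appended as query-string parameters.
    """
    if vert not in VERTICALS:
        raise ValueError(f"unknown vertical: {vert!r}")
    params = {"appid": appid, **optional}
    return f"{BOSS_BASE}/{vert}/v1/{quote_plus(query)}?{urlencode(params)}"


# Reproduces the site-restricted web search URL from the earlier slide.
print(boss_url("web", "slumdog millionaire", "xyz",
               sites="indiatimes.com,movies.yahoo.com", format="xml"))
# Spell check request for a misspelled term.
print(boss_url("spelling", "milionare", "xyz", format="xml"))
```

Keeping appid as a positional argument makes it hard to forget the one required parameter, while the keyword arguments mirror the optional list on the slide.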
- 38 -
BOSS Mashup Framework
• Python (v2.5+) library
• BOSS Search SDK plus …
• SQL for remixing arbitrary XML/JSON sources
http://developer.yahoo.com/search/boss/mashup.html
- 39 -
BMF + Google App Engine
• Enhanced version of BMF for the Google App Engine (GAE) platform
• http://zooie.wordpress.com/2008/08/04/yahoo-boss-google-app-engine-integrated/
• Enables quick deployment of BOSS applications online
- 40 -
One more thing…
- 41 -
BOSS in Academic Research
• The biggest dataset available on the web
• Very useful for web-mining research experiments
  – Natural language processing
  – Semantic extraction
  – Related keywords
  – Similarity detection
  – Clustering algorithms
  – Spelling corrections
- 42 -
Questions?
Thank You
More: http://developer.yahoo.com/search/boss/
- 43 -
Appendix
- 44 -
http://www.yahoo.com
Search UI Templates are Included in the BOSS Mashup Framework
BOSS Mashup Framework simplifies aggregating and presenting multiple data sources
- 45 -
BMF Features
• select, group, sort, union, joins, UDFs, where
• Text normalization and duplicate removal
• Auto-transformation of resource-oriented API results into tables w/o parsing
• All-in-memory storage and retrieval operations
• Ability to join lists of tables via an arbitrary predicate function (map-like)
• Search UI template framework
• Single search function provides total access to BOSS REST API
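To make two of the features above concrete (duplicate removal after text normalization, and joining tables via an arbitrary predicate), here is a small plain-Python sketch. It does not use the BOSS Mashup Framework API; the `join` and `dedupe` helpers and all sample rows are hypothetical and made up for illustration.

```python
def join(left, right, predicate):
    """Nested-loop join: merge each (lrow, rrow) pair where predicate is true."""
    return [{**lrow, **rrow}
            for lrow in left
            for rrow in right
            if predicate(lrow, rrow)]


def dedupe(rows, key):
    """Keep only the first row seen for each normalized key value."""
    seen, unique = set(), []
    for row in rows:
        k = key(row)
        if k not in seen:
            seen.add(k)
            unique.append(row)
    return unique


# Made-up rows for illustration only; not real BOSS output.
web = [
    {"url": "http://example.com/a", "title": "Slumdog Millionaire"},
    {"url": "http://example.com/b", "title": "slumdog millionaire "},
    {"url": "http://example.org/c", "title": "Another page"},
]
bookmarks = [{"bookmark": "http://example.com/a", "tags": ["film"]}]

# Normalize titles (trim, lowercase) before checking for duplicates.
unique = dedupe(web, key=lambda r: r["title"].strip().lower())
# Join search results to bookmarks wherever the URLs match.
tagged = join(unique, bookmarks, lambda w, b: w["url"] == b["bookmark"])
print(tagged)
```

Everything stays in memory as lists of dictionaries, which is the same general shape as the "all-in-memory" tables the slide describes.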