Indextank east bay ruby meetup slides
Finding anything: Real-time search with IndexTank
Tim Spence
April 19, 2011
About the Presenter
Tim Spence
● Senior Infrastructure Engineer at MedHelp (http://www.medhelp.org/)
● Former .NET developer
● Recently converted to Ruby
● In love with Open Source Software
● More at http://whyhello.im/tim
Agenda
● State of search today
● Quick survey: how much time/effort did YOU spend implementing search on your webapp?
● Examples of services that need improved search
● IndexTank to the rescue
● Case study: reddit.com
Agenda, continued
● How I found out about IndexTank
● Two apps I built with IndexTank
● Live Demo
The State of Search Today
● Not well implemented at all
– Search works, but...
– Barely
● How many pages of results do you typically browse through before finding what you were looking for?
● Or do you give up and head for Google site search instead?
Survey Time!
● How much time/effort did YOU spend implementing search on your webapp?
● How many times have you iterated on your search feature?
● When was the last time someone thanked you for building a powerful, reliable search feature for your webapp?
My Opinion
● Search as an in-app feature is an afterthought
● Minimal implementation is the norm
● If it weren't for MySQL/MS-SQL full-text indexing, most apps probably wouldn't even have a search feature
● Most good web apps don't make it easy for users to find specific content outside of predetermined navigation
Let's pick on some apps!
● These are companies with great products, but their search comes up short
● Don't worry–they can take it!
App #1: Github
● Interface is decent
– Search repos, code, users, or everything
– Search by language
● However...
– Can't do much with results but browse
– Check out this example
App #1: Github
● Why these results aren't so hot
– Can't search by most recently maintained
– Can't search by most popular (most watched)
– Are you ready to browse 1,297 results?
● Advanced search capabilities exist, but the interface is not the best
– Recency/popularity are implemented, but require specific arguments
App #2: Amazon Web Services
● "Hey, I bet I can find an AMI from the community for the exact EC2 setup I need"
● Fact: probably not
App #2: Amazon Web Services
● Notice something missing?
– No search
– Only sort by date, title
● Ready to browse 934 results?
– I'd rather build my own AMI
● Incredible missed opportunity
– OS search
– Stack search
– etc...
Fact: Github & Amazon aren't the only ones
● Lots of good web services
● Massive quantities of quality content
● Unfortunately not discoverable in meaningful ways
Interlude: Sites with great search
● Foodspotting
– Proximity
– Recency
– Rating
● MedHelp
– Content category
– Promoted content
● Other sites I overlooked? Whose search do you like?
What was the point of that last slide?
● Search can be useful if it is valued as a feature
● Any company willing to invest in the resources can build and host a high quality search engine
● However, must you roll your own?
Enter Search as a Service
● No need for you to invest in additional infrastructure
● No need to reinvent the wheel
– Search is a solved problem
– Let the experts refine it
IndexTank to the rescue!
● Hosted–no load on your infrastructure
● Powerful
– We'll get into the details next
● Always Improving
– Search IS their product
● Freemium
● Easy to implement
Let's talk features
● Real-time search
– Real-time indexing–results immediately available
● Custom scoring
● Autocomplete
● Faceting
● Geo search
● Advanced text search
Real-time search
● Real-time indexing
– Results immediately available
● Index multiple docs/sec
● Overwrite existing docs as you wish
– Changes also immediately available
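To make "immediately available" concrete, here is a minimal in-memory sketch of the behavior described above: a document is searchable as soon as it is added, and re-adding under the same id overwrites the old version. This is not IndexTank's implementation or its client API; the `TinyIndex` class and its method names are invented for illustration.

```ruby
require 'set'

# Toy in-memory index (hypothetical, not IndexTank code).
# Documents become searchable as soon as add returns.
class TinyIndex
  def initialize
    @docs = {}                                       # doc_id => text
    @postings = Hash.new { |h, k| h[k] = Set.new }   # term => set of doc_ids
  end

  # Add a document, or overwrite the one already stored under doc_id.
  def add(doc_id, text)
    remove(doc_id) if @docs.key?(doc_id)
    @docs[doc_id] = text
    tokenize(text).each { |term| @postings[term] << doc_id }
  end

  # Return the sorted ids of documents that contain every query term.
  def search(query)
    terms = tokenize(query)
    return [] if terms.empty?
    terms.map { |t| @postings.fetch(t, Set.new) }.reduce(:&).to_a.sort
  end

  private

  # Drop a document and all of its postings.
  def remove(doc_id)
    tokenize(@docs.delete(doc_id)).each { |term| @postings[term].delete(doc_id) }
  end

  # Lowercase the text and split it into unique alphanumeric terms.
  def tokenize(text)
    text.downcase.scan(/[a-z0-9]+/).uniq
  end
end

index = TinyIndex.new
index.add('post-1', 'Real-time search with IndexTank')
index.search('search')   # finds post-1 right away
index.add('post-1', 'Now this post is about faceting')
index.search('search')   # empty: the overwrite took effect immediately
```

A hosted service does the same thing behind an HTTP API, with the index living on someone else's infrastructure instead of in your process.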
Custom Scoring
● Implementer has full control over how results are returned
● Choose which fields are searched
● Use pre-written scoring functions
● Or write your own
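As a rough illustration of what "write your own" can mean, the sketch below ranks results with a formula that mixes text relevance, popularity, and recency, which is the ordering the Github example earlier was missing. It is plain Ruby, not IndexTank's scoring-function syntax; the `Repo` fields, the formula, and the sample data are all made up.

```ruby
# Hypothetical result records; the field names are illustrative only.
Repo = Struct.new(:name, :relevance, :watchers, :days_since_push)

# A scoring function is just a formula over per-document values.
# This one boosts text relevance by popularity and decays it with age.
POPULAR_AND_FRESH = lambda do |repo|
  repo.relevance * Math.log(repo.watchers + 1) / (1.0 + repo.days_since_push / 30.0)
end

# Order results from highest score to lowest.
def rank(repos, scorer)
  repos.sort_by { |repo| -scorer.call(repo) }
end

repos = [
  Repo.new('old-but-famous', 1.0, 5000, 900),
  Repo.new('new-and-active', 1.0, 300, 3),
  Repo.new('abandoned', 1.0, 2, 1200)
]
rank(repos, POPULAR_AND_FRESH).map(&:name)
```

Swapping in a different lambda changes the ordering without touching the indexed data, which is the point of keeping scoring separate from indexing.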
Everyone loves autocomplete
● Saves users time
● Potentially avoids spelling errors
– Not for hunters/peckers
● Adds a degree of intelligence to the search process
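For a sense of what autocomplete does under the hood, here is a small prefix-matching sketch over a sorted term list. It is an assumption about one simple way to do it, not a description of how IndexTank implements the feature; the `Autocomplete` class is hypothetical.

```ruby
# Prefix autocomplete over a sorted list of known terms (hypothetical sketch).
class Autocomplete
  def initialize(terms)
    @terms = terms.map(&:downcase).uniq.sort
  end

  # Return up to `limit` terms that start with the given prefix.
  def suggest(prefix, limit = 5)
    prefix = prefix.downcase
    return [] if prefix.empty?
    # Binary search for the first term that sorts at or after the prefix.
    start = @terms.bsearch_index { |term| term >= prefix }
    return [] if start.nil?
    @terms[start..-1].take_while { |term| term.start_with?(prefix) }.first(limit)
  end
end

completer = Autocomplete.new(%w[ruby rails rack redis search])
completer.suggest('ra')
```

Because suggestions come only from terms that already exist, the user is steered toward queries that will return results.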
Faceting
● Does it make sense for you to categorize documents in your index?
– In all cases, YES
● Consider your advanced users and the narrow results they seek
– Don't make anyone sift through irrelevant results
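The idea behind faceting can be sketched in a few lines: count the matching documents per category value, then let the user narrow by picking a value. The helper names and the sample attendee records below are invented for illustration and are not part of any IndexTank client.

```ruby
# Count how many documents fall under each value of each category.
def facet_counts(docs, categories)
  categories.each_with_object({}) do |category, counts|
    values = docs.map { |doc| doc[category] }.compact
    counts[category] = values.each_with_object(Hash.new(0)) { |v, h| h[v] += 1 }
  end
end

# Keep only the documents that match every selected category value.
def narrow(docs, selected)
  docs.select { |doc| selected.all? { |category, value| doc[category] == value } }
end

# Made-up search results, each tagged with two categories.
attendees = [
  { name: 'Ann',  company: 'Acme',    city: 'Austin' },
  { name: 'Bob',  company: 'Acme',    city: 'Oakland' },
  { name: 'Cruz', company: 'Initech', city: 'Austin' }
]

facet_counts(attendees, [:company, :city])
narrow(attendees, company: 'Acme', city: 'Austin')
```

The counts are what a UI shows next to each filter link, and `narrow` is what runs when the user clicks one.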
Geo
● It's 2011
– Location is more relevant than ever before
– Mobile is skyrocketing–every client has a GPS
● IndexTank has built-in geo proximity search capability
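Proximity search boils down to computing a distance between the searcher and each document, then filtering and sorting on it. The sketch below uses the standard haversine formula; it is not IndexTank's geo API, and the helper names and coordinates are assumptions chosen for the example.

```ruby
EARTH_RADIUS_KM = 6371.0

# Great-circle distance between two lat/lon points, in kilometers (haversine).
def haversine_km(lat1, lon1, lat2, lon2)
  to_rad = Math::PI / 180.0
  dlat = (lat2 - lat1) * to_rad
  dlon = (lon2 - lon1) * to_rad
  a = Math.sin(dlat / 2)**2 +
      Math.cos(lat1 * to_rad) * Math.cos(lat2 * to_rad) * Math.sin(dlon / 2)**2
  2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a))
end

# Names of places within radius_km of the given point, closest first.
def nearest(places, lat, lon, radius_km)
  places
    .map { |place| [place, haversine_km(lat, lon, place[:lat], place[:lon])] }
    .select { |_, km| km <= radius_km }
    .sort_by { |_, km| km }
    .map { |place, _| place[:name] }
end

# Approximate city-center coordinates, for illustration only.
places = [
  { name: 'San Francisco', lat: 37.7749, lon: -122.4194 },
  { name: 'Oakland',       lat: 37.8044, lon: -122.2712 },
  { name: 'Austin',        lat: 30.2672, lon: -97.7431 }
]

# Search from roughly Berkeley, within 50 km.
nearest(places, 37.8716, -122.2727, 50)
```

The same distance value can also feed a scoring function, so that closer results rank higher rather than simply being filtered in or out.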
Advanced Text Search (Beta)
● Fuzzy search (Did you mean...?)
● Stemming
– Alternate word forms (tense, possession, etc...)
● Alternate spellings
– Misspellings
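A "Did you mean...?" suggestion can be approximated with edit distance: find the known term that needs the fewest single-character changes to match what the user typed. This is a generic Levenshtein sketch, not a claim about how IndexTank's beta feature works; the function names and the distance threshold are assumptions.

```ruby
# Levenshtein distance: minimum number of single-character insertions,
# deletions, or substitutions needed to turn string a into string b.
def edit_distance(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Suggest the closest known term, or nil if nothing is close enough.
def did_you_mean(query, vocabulary, max_distance = 2)
  best = vocabulary.min_by { |term| edit_distance(query, term) }
  best if best && edit_distance(query, best) <= max_distance
end

did_you_mean('serach', %w[search ruby rails])
```

Stemming is a separate technique (reducing words to a common root before indexing) and is not covered by this sketch.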
Other Benefits
● Zero maintenance
● Scalability included for free
● Easy implementation
– Clients available in many languages
– Excellent documentation–Let's check it out
● Excellent support
– Humans or bots? You decide
● Dog food: their site search is done well
Case Study: reddit.com
● High-traffic news aggregator (> 1 billion pageviews/month) with tons of content
● Who remembers how bad reddit's search was?
– When it even worked
● Can't blame them for trying
– Many attempts, but none worked
● IndexTank excelled in all areas
● Let's check it out now
My experience with IndexTank
● Discovered through Heroku/IndexTank contest
● Built my first irl Rails app in an afternoon/evening w/ fellow hacker Chris Saylor (@cwsaylor)
● Didn't win the contest but learned how easy it is to quickly create highly targeted search
App #1: Toxosis
● Searchable database of toxic release data supplied by U.S. E.P.A.
● Hosted at http://toxosis.heroku.com/
● Search enabled on many fields including city/state/zip, toxin
● Additional fields can be added to index
– When I have time, of course...
More personal backstory
● Still in the business of reinventing myself as a Rails developer
● How to get a Rails gig? Develop multiple Rails apps and show them off
● Opportunities are everywhere–contests, hackathons, and weekend hacks for developer community
App #2: SXSWdex
● Searchable database of 2011 SXSW attendees
● Hosted at http://sxswdex.heroku.com/
● Design goal: do a better job than SXSW official site
● Search within bio, company, location, name
● Facets: company, city/state
The moment we've all been waiting for
● Let's build an app!
Questions?
● Q&A time with an IndexTank engineer