Internalizing location services with geo names

Internalizing location services with GeoNames. John Marc Imbrescia, Senior Software Engineer - Etsy. Wednesday, May 1, 13

description

Presented by John Marc Imbrescia, Senior Software Engineer, Etsy.com. Etsy recently chose to bring our location services in house. We used the open source GeoNames data set and built the tools we needed to let members select their location, to show translations of place names, and to feed data into our search database for local, regional, and country-based searches. This talk will cover the implementation details and the decisions we made along the way: how we mapped places from our old data set to the GeoNames data; the internal tools we built, including a Solr core for place-name autosuggest; the modifications to our Listings Search and Shop Search cores; and the different ways we use location-based search around the site, both distance-based and region-based, using GeoNames hierarchy data. There will also be a discussion of our choice to release some of the tools we built for this project as open source, and of the decisions behind the non-search elements of the project (display, etc.), the tools we chose for them, and why.

Transcript of Internalizing location services with geo names

Page 1: Internalizing location services with geo names

Internalizing location services with GeoNames

John Marc Imbrescia
Senior Software Engineer - Etsy

Wednesday, May 1, 13

Page 2: Internalizing location services with geo names

Internalizing location services with GeoNames

May 2 2013

Page 3: Internalizing location services with geo names

The world’s online handmade marketplace.
What is Etsy.com?

Page 4: Internalizing location services with geo names

What is Etsy.com?
• 20 million unique items
• 18 million daily item searches
• 800,000 sellers
• 28 million unique views per month
• Developer blog: codeascraft.etsy.com
• 450 worldwide employees

Page 5: Internalizing location services with geo names

Our Problem
Location names were only in English

• Search based on English names
• Display and search needed to be i18n friendly
• API limits and speed concerns meant we needed a new solution

Page 6: Internalizing location services with geo names

What do we use Location for?
More than just search

• Display
• Local Search

• No Mapping
• No Bounding boxes

Page 7: Internalizing location services with geo names

What do we use Location for?
Item Search

Page 8: Internalizing location services with geo names

What do we use Location for?
Item Search

Page 9: Internalizing location services with geo names

What do we use Location for?
Item Search

Page 10: Internalizing location services with geo names

What do we use Location for?
Location Display

Page 11: Internalizing location services with geo names

How did this use to work?
• Yahoo API
• Every lookup was an API call
• Stored user input and API response
• Searched based on text match of API response
• Not radius using lat/lon
• No way to internationalize

Page 12: Internalizing location services with geo names

What services did Etsy need to internalize location services?

• Lookup - Autosuggest
• Update - Scripts to refresh data
• Display - Built into the PHP stack
• Search - Existing, modified for new pattern

Page 13: Internalizing location services with geo names

What we have now
• GeoNames as a data source
• Feeds “geonamessuggest” Solr core
• SQLite database for place name lookup
• GeoName IDs used for local search
• Leverages GeoName hierarchy data
• Built-in internationalization
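
The slides do not show the layout of the SQLite database that is pushed to the webservers, so the following is only a minimal sketch of what a localized place name lookup could look like. The `place_names` table, its columns, the sample rows, and the fall-back-to-English rule are all assumptions made for illustration, not Etsy's actual schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical place_names table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE place_names ("
        " geoname_id INTEGER NOT NULL,"
        " lang TEXT NOT NULL,"
        " name TEXT NOT NULL,"
        " PRIMARY KEY (geoname_id, lang))"
    )
    # Hypothetical IDs and rows, for illustration only.
    rows = [
        (1, "en", "Germany"),
        (1, "de", "Deutschland"),
        (1, "fr", "Allemagne"),
        (2, "en", "Brooklyn"),
    ]
    conn.executemany("INSERT INTO place_names VALUES (?, ?, ?)", rows)
    return conn


def place_name(conn, geoname_id, lang, fallback="en"):
    """Return the name in `lang`, else in `fallback`, else None."""
    for candidate in (lang, fallback):
        row = conn.execute(
            "SELECT name FROM place_names WHERE geoname_id = ? AND lang = ?",
            (geoname_id, candidate),
        ).fetchone()
        if row:
            return row[0]
    return None
```

Because the database file lives on each webserver, a display lookup like `place_name(conn, 1, "de")` is a local read rather than a call to an external API.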

Page 14: Internalizing location services with geo names

How did we get here?
• Mapped old locations to GeoNames
• Added GeoName ID hierarchy to listing search
• Pushed out SQLite database to webservers
• Slowly transitioned lookup and search services
• Did side-by-side testing to look for anomalies

Page 15: Internalizing location services with geo names

What are the data types?
GeoNames

Page 16: Internalizing location services with geo names

Schemas
GeoNames

• 775k Entries
• 1.4m alternate spellings
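
The schema image from this slide did not survive in the transcript. As a rough illustration, the sketch below parses one row of the public GeoNames dump, assuming the tab-separated, 19-column layout described in the GeoNames export readme. The `Place` type, the choice of which columns to keep, and the sample row are hypothetical and not taken from the talk.

```python
from typing import List, NamedTuple

# Column order of the main GeoNames dump, as described in its readme.
COLUMNS = [
    "geonameid", "name", "asciiname", "alternatenames", "latitude",
    "longitude", "feature class", "feature code", "country code", "cc2",
    "admin1 code", "admin2 code", "admin3 code", "admin4 code",
    "population", "elevation", "dem", "timezone", "modification date",
]


class Place(NamedTuple):
    """Hypothetical subset of the columns a suggest index might need."""
    geoname_id: int
    name: str
    alternate_names: List[str]
    latitude: float
    longitude: float
    feature_code: str
    country_code: str
    population: int


def parse_row(line: str) -> Place:
    """Parse one tab-separated dump line into a Place."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    rec = dict(zip(COLUMNS, fields))
    return Place(
        geoname_id=int(rec["geonameid"]),
        name=rec["name"],
        # Alternate names are a comma-separated list in the main dump.
        alternate_names=[n for n in rec["alternatenames"].split(",") if n],
        latitude=float(rec["latitude"]),
        longitude=float(rec["longitude"]),
        feature_code=rec["feature code"],
        country_code=rec["country code"],
        population=int(rec["population"] or 0),
    )


# A made-up row, not real GeoNames data.
SAMPLE_ROW = "\t".join([
    "42", "Exampleville", "Exampleville", "Exampleburg,Ville d'Exemple",
    "40.5", "-73.9", "P", "PPL", "US", "", "NY", "", "", "",
    "12345", "", "10", "America/New_York", "2013-05-01",
])
```

Calling `parse_row(SAMPLE_ROW)` yields a `Place` whose `alternate_names` list holds the two made-up spellings.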

Page 17: Internalizing location services with geo names

Schemas
GeoNames

Page 18: Internalizing location services with geo names

Geonamessuggest
Our autosuggest for place names

• Localized
• GeoIP
• Population

Page 19: Internalizing location services with geo names

Geonamessuggest
Schema

Page 20: Internalizing location services with geo names

Distance and population come first
Sort function
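
The actual sort function shown on this slide is not in the transcript. The sketch below shows one plausible way to rank suggestions so that distance from the user's GeoIP position and population both count: candidates are grouped into coarse distance bands, and the larger population wins within a band. The 500 km band width, the function names, and the rounded coordinates and populations are assumptions for illustration only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to guard against tiny floating point overshoot above 1.0.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def rank_suggestions(candidates, user_lat, user_lon, band_km=500.0):
    """Order candidates by distance band, then by population (largest first)."""
    def sort_key(place):
        distance = haversine_km(user_lat, user_lon, place["lat"], place["lon"])
        return (int(distance // band_km), -place["population"], place["name"])
    return sorted(candidates, key=sort_key)


# Rounded, approximate figures used only to exercise the ranking.
PARIS_FR = {"name": "Paris, France", "lat": 48.85, "lon": 2.35,
            "population": 2_100_000}
PARIS_TX = {"name": "Paris, Texas", "lat": 33.66, "lon": -95.55,
            "population": 25_000}
```

With this ordering, a user located near Dallas who types "Paris" would see Paris, Texas first, while a user near London would see Paris, France first.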

Page 21: Internalizing location services with geo names

GeonameId Hierarchy
Local listing search

• Each listing gets a hierarchy of geonameids
• Local search is a filter on this ID
• Fast & Reliable
• Enables powerful functionality
• Kept old data fields
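
To make the idea of a per-listing hierarchy concrete, here is a small sketch under stated assumptions: the parent map, the small integer IDs, and the listing records are all hypothetical stand-ins for real GeoNames hierarchy data. Each listing stores every ID from its own place up to the country, so a local search reduces to a membership test on a single ID.

```python
# Hypothetical child -> parent links: 4 is a neighborhood in city 3,
# cities 3 and 5 are in state 2, and state 2 is in country 1.
PARENT_OF = {4: 3, 3: 2, 5: 2, 2: 1}


def hierarchy(geoname_id, parent_of):
    """Return the chain of IDs from a place up to its top-level ancestor."""
    chain = []
    seen = set()
    current = geoname_id
    # The `seen` set guards against cycles in bad hierarchy data.
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = parent_of.get(current)
    return chain


def listings_in(place_id, listings):
    """IDs of listings whose stored hierarchy contains place_id."""
    return [item["id"] for item in listings if place_id in item["geoname_ids"]]


LISTINGS = [
    {"id": "A", "geoname_ids": hierarchy(4, PARENT_OF)},  # [4, 3, 2, 1]
    {"id": "B", "geoname_ids": hierarchy(5, PARENT_OF)},  # [5, 2, 1]
]
```

Filtering on the city ID 3 matches only listing A, while filtering on the state ID 2 or the country ID 1 matches both, which is how one field can serve local, regional, and country searches.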

Page 22: Internalizing location services with geo names

GeonameId Collection
Local listing search

• Each listing gets a hierarchy of geonameids
• Local search is a filter query on this ID
• Fast & Reliable
• Enables powerful functionality
• Kept old data fields
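
As a hedged illustration of "a filter query on this ID", the sketch below builds the query string for a Solr select request that restricts keyword results to one place using Solr's standard `fq` parameter. The field name `geoname_ids`, the default row count, and the example ID are assumptions; the talk does not show the real field names in the listing search core.

```python
from urllib.parse import urlencode


def local_search_params(keywords, geoname_id, field="geoname_ids", rows=24):
    """Build a Solr query string that filters results to one GeoNames ID."""
    params = {
        "q": keywords,                        # the user's keyword query
        "fq": f"{field}:{int(geoname_id)}",   # filter query on the place ID
        "rows": rows,
        "wt": "json",
    }
    return urlencode(params)
```

For example, `local_search_params("ceramic mug", 3)` produces a string that can be appended to a core's `/select?` URL. Keeping the place restriction in `fq` rather than in `q` means it narrows the result set without affecting relevance scoring, and Solr can cache the filter separately from the keyword query.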

Page 23: Internalizing location services with geo names

CONFERENCE PARTY
The Tipsy Crow: 770 5th Ave
Starts after Stump The Chump
Your conference badge gets you in the door

TOMORROW
Breakfast starts at 7:30
Keynotes start at 8:30

CONTACT
John Marc Imbrescia
@[email protected]
