Jeremy Schiff, Senior Manager, Data Science, OpenTable at MLconf NYC

Recommendation Architecture Jeremy Schiff MLConf 2015 03/27/2015

Transcript of Jeremy Schiff, Senior Manager, Data Science, OpenTable at MLconf NYC


[Diagram: DINERS and RESTAURANTS across BEFORE, DURING, and AFTER the meal, with the stages "Attracting & Planning" and "Understanding & Evolving"]

OpenTable: Deliver great experiences at every step, based on who you are

2

OpenTable in Numbers
• Our network connects diners with more than 32,000 restaurants worldwide.
• Our diners have spent more than $30 billion at our partner restaurants.
• OpenTable seats more than 16 million diners each month.
• Every month, OpenTable diners write more than 450,000 restaurant reviews.

3  

Recommendations >> Collaborative Filtering

4  

So what are recommendations?

5  

Building Recommendation Systems
• Importance of A/B Testing
• Generating Recommendations
• Recommendation Explanations

6  

What’s the Goal?
Minimizing Engineering Time to Improve the Metric that Matters
• Make it Easy to Measure
• Make it Easy to Iterate
• Reduce Iteration Cycle Times

7  

Importance of A/B Testing
• If you don’t measure it, you can’t improve it
• Metrics Drive Behavior
• Continued Forward Progress

8  

Pick Your Business Metric
Revenue, Conversions
• OpenTable
• Amazon
Engagement
• Netflix
• Pandora
• Spotify
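For a conversion-style business metric, one common way to read an A/B test is a two-proportion z-test on the control and treatment conversion rates. The sketch below is only illustrative: the talk does not say which statistical test OpenTable uses, and the function name and counts are hypothetical.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts, for illustration only.
z, p = two_proportion_z_test(conv_a=500, n_a=10_000, conv_b=560, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the difference between variants is unlikely to be noise; how small is small enough is a decision for the team running the test.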

9  

Measuring & The Iteration Loop

• A/B Testing (Weeks): Measure

10  

Measuring & The Iteration Loop

• Optimize Models (Days): Predict
• A/B Testing (Weeks): Measure

11  

Measuring & The Iteration Loop

• Analyze & Introspect (Hours): Insights
• Optimize Models (Days): Predict
• A/B Testing (Weeks): Measure

12  

Ranking Objectives
Objectives:
• Training Error
  - Minimize Loss Function
    § Often Convex
• Generalization Error
  - Precision at K
• A/B Metric
  - Conversion / Engagement
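Precision at K, the generalization metric named above, is the fraction of the top K recommended items that turn out to be relevant. A minimal sketch, with hypothetical restaurant IDs and "relevant" taken to mean restaurants the diner actually booked:

```python
def precision_at_k(ranked_items, relevant_items, k):
    """Fraction of the top-k ranked items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant_items)
    return hits / k

# Hypothetical ranking produced by a model, and held-out bookings.
ranked = ["r7", "r2", "r9", "r4", "r1"]
booked = {"r2", "r4", "r8"}
print(precision_at_k(ranked, booked, k=3))
```

Unlike the training loss, this metric is not differentiable, which is one reason the slide lists it separately from the loss being minimized.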

13  

Training, Generalization, and Online Error

• Training: Train on your specific dataset
  - Dealing with Sparseness
• Test/Generalization: How does it generalize to unseen data?
  - Hyper-Parameter Tuning
• Online: How does it perform in the wild?
  - Model interaction effects between recommended items (diversity)

Fundamental Differences in Usage

Right now vs. Planning

Cost of Being Wrong

Search vs. Recommendations

15  

Recommendation Stack

• Query Interpretation
• Retrieval
• Ranking – Item & Explanation
• Index Building
• Context for Query & User
• Model Building
• Explanation Content
• Visualization
• Collaborative Filters
• Item / User Metadata

16  

Using Context, Frequency & Sentiment
• Context
  - Implicit: Location, Time, Mobile/Web
  - Explicit: Query
• High End Restaurant for Dinner
  - Low Frequency, High Sentiment
• Fast, Mediocre Sushi for Lunch
  - High Frequency, Moderate Sentiment

17  

How to use this data
• Frequency Data:
  - General: Popularity
  - Personalized: Implicit CF
• Sentiment Data:
  - General: Good Experience
  - Personalized: Explicit CF
• Good Recommendation
  - Use both to drive your Business Metric
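To make the frequency/sentiment split concrete, the sketch below derives all four signals from a single visit log. The log format, diner names, and restaurant names are hypothetical; the talk does not describe OpenTable's actual schema.

```python
from collections import Counter, defaultdict

# Hypothetical visit log: (diner, restaurant, star rating or None if no review).
visits = [
    ("ann", "sushi_go", None),
    ("ann", "sushi_go", 3),
    ("ann", "chez_luxe", 5),
    ("bob", "sushi_go", None),
    ("bob", "sushi_go", 4),
    ("cat", "chez_luxe", 5),
]

# General frequency signal: popularity as total visits per restaurant.
popularity = Counter(restaurant for _, restaurant, _ in visits)

# General sentiment signal: mean star rating per restaurant.
ratings = defaultdict(list)
for _, restaurant, stars in visits:
    if stars is not None:
        ratings[restaurant].append(stars)
mean_rating = {r: sum(s) / len(s) for r, s in ratings.items()}

# Personalized signals: the sparse inputs to implicit and explicit CF.
implicit = Counter((diner, restaurant) for diner, restaurant, _ in visits)
explicit = {(d, r): s for d, r, s in visits if s is not None}

print(popularity, mean_rating)
```

In this toy data the sushi place is visited often but rated moderately, while the high-end restaurant is visited rarely but rated highly, which is why neither signal alone is enough.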

18  

Ranking
Phase 1: Bootstrap through heuristics
Phase 2: Learn to Rank
• Many models
  - E [ Revenue | Query, Position, Item, User ]
  - E [ Engagement | Query, Position, Item, User ]
  - Regression, RankSVM, LambdaMART…

• Modeling Diversity is Important
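One simple way to account for diversity is a greedy re-ranking in the style of maximal marginal relevance (MMR): each pick trades off an item's relevance against its similarity to items already chosen. This is a swapped-in illustration, not the method from the talk, which does not say how diversity is modeled. The scores, cuisines, and `trade_off` value are hypothetical.

```python
def rerank_with_diversity(candidates, similarity, k, trade_off=0.7):
    """Greedy MMR-style selection.

    candidates: dict mapping item -> relevance score
    similarity: function (item_a, item_b) -> value in [0, 1]
    trade_off:  weight on relevance; (1 - trade_off) penalizes redundancy
    """
    remaining = dict(candidates)
    selected = []
    while remaining and len(selected) < k:
        def marginal(item):
            # Redundancy is the similarity to the closest already-selected item.
            redundancy = max((similarity(item, s) for s in selected), default=0.0)
            return trade_off * remaining[item] - (1 - trade_off) * redundancy
        best = max(remaining, key=marginal)
        selected.append(best)
        del remaining[best]
    return selected

# Hypothetical restaurants with relevance scores and cuisines.
scores = {"sushi_a": 0.90, "sushi_b": 0.85, "taqueria": 0.80, "bistro": 0.60}
cuisine = {"sushi_a": "sushi", "sushi_b": "sushi",
           "taqueria": "mexican", "bistro": "french"}

def same_cuisine(a, b):
    return 1.0 if cuisine[a] == cuisine[b] else 0.0

print(rerank_with_diversity(scores, same_cuisine, k=3))
```

With `trade_off=1.0` the function falls back to plain relevance ordering, so the second sushi restaurant would take the second slot.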

19  

Training Example
• Context Free (Collaborative Filtering)
  - Train for Content Based and Collaborative Filtering models.
  - Create an Ensemble Model
  - Perform Hyper-Parameter Tuning for each model
• With Context (Search)
  - Train a model using query (implicit & explicit)
    § Includes Context-Free Model
  - Perform Hyper-Parameter Tuning
• Evaluate Model using A/B
  - Change models, objective functions, etc.
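The ensemble and hyper-parameter tuning steps can be sketched as a grid search over a single blending weight, scored by precision at K on held-out data. This assumes a linear blend of two models, which the talk does not specify, and runs in one process rather than as a Spark batch job. All scores and item names are hypothetical.

```python
def precision_at_k(ranked, relevant, k):
    return sum(1 for item in ranked[:k] if item in relevant) / k

def blend(cf_scores, content_scores, weight):
    """Linear ensemble: weight on the CF model, (1 - weight) on content."""
    return {item: weight * cf_scores[item] + (1 - weight) * content_scores[item]
            for item in cf_scores}

def tune_weight(validation, grid, k):
    """Pick the ensemble weight with the best mean precision@k on held-out data."""
    def mean_precision(weight):
        total = 0.0
        for cf_scores, content_scores, relevant in validation:
            scores = blend(cf_scores, content_scores, weight)
            ranked = sorted(scores, key=scores.get, reverse=True)
            total += precision_at_k(ranked, relevant, k)
        return total / len(validation)
    return max(grid, key=mean_precision)

# Hypothetical held-out users: (CF scores, content scores, items actually booked).
validation = [
    ({"a": 0.9, "b": 0.2, "c": 0.1}, {"a": 0.1, "b": 0.3, "c": 0.8}, {"a"}),
    ({"a": 0.2, "b": 0.8, "c": 0.3}, {"a": 0.6, "b": 0.1, "c": 0.2}, {"b"}),
    ({"a": 0.6, "b": 0.5, "c": 0.1}, {"a": 0.2, "b": 0.9, "c": 0.1}, {"b"}),
]
best = tune_weight(validation, grid=[0.0, 0.25, 0.5, 0.75, 1.0], k=1)
print(best)
```

In this toy data neither model alone ranks every user's booking first, but a blend does. The weight chosen offline would still need to be confirmed with an A/B test, as the last bullet says.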

Training DataFlow

• Collaborative Filter Service (Realtime)
• Collaborative Filter HyperParameter Tuning (Batch with Spark)
• Collaborative Filter Training (Batch with Spark)

Training DataFlow

• Collaborative Filter Service (Realtime)
• Collaborative Filter HyperParameter Tuning (Batch with Spark)
• Collaborative Filter Training (Batch with Spark)
• Search Service (Realtime)
• Search HyperParameter Tuning (Batch with Spark)
• Search Training (Batch with Spark)

Training DataFlow

• Collaborative Filter Service (Realtime)
• Collaborative Filter HyperParameter Tuning (Batch with Spark)
• Collaborative Filter Training (Batch with Spark)
• Search Service (Realtime)
• Search HyperParameter Tuning (Batch with Spark)
• Search Training (Batch with Spark)
• User Interaction Logs (Kafka)
• A/B Testing Dashboards
• Other Services

Compelling Recommendations

24  

Recommendation Explanations
• Amazon
• Ness
• Netflix
• Ness - Social

25  

Summarizing Content
• Essential for Mobile
• Balance Utility With Trust?
  - Summarize, but surface raw data
• Example:
  - Initially, read every review
  - Later, use average star rating

26  

Summarizing Restaurant Attributes

27  

Dish Recommendation
• What to try once I have arrived?

28  

29

Analyzing Review Content

30  

The ingredients of a spectacular dining experience…

31  

… and a spectacularly bad one

32  

Content Features
Pandora
• Music Genome Project
Natural Language Processing
• Topics & Tags

33  

Topic Modeling Methods
We applied two main topic modeling methods:
• Latent Dirichlet Allocation (LDA)
  - (Blei et al. 2003)
• Non-negative Matrix Factorization (NMF)
  - (Arora et al. 2012)

34  

The food was great! I loved the view of the sailboats.

Bag of Words Model

food   great   chicken   sailboat   view   service
1      1       0         1          1      0
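The vector on this slide can be reproduced with a few lines of code. The slide matches "sailboats" to the vocabulary word "sailboat", which implies some normalization step; the `normalize` helper below is a crude, hypothetical stand-in for a real stemmer or lemmatizer.

```python
import re

VOCABULARY = ["food", "great", "chicken", "sailboat", "view", "service"]

def normalize(token):
    """Crude plural stripping, a stand-in for a real stemmer or lemmatizer."""
    return token[:-1] if token.endswith("s") and len(token) > 3 else token

def bag_of_words(text, vocabulary=VOCABULARY):
    tokens = {normalize(t) for t in re.findall(r"[a-z]+", text.lower())}
    # Binary indicator per vocabulary word, matching the 0/1 row on the slide.
    return [1 if word in tokens else 0 for word in vocabulary]

review = "The food was great! I loved the view of the sailboats."
print(bag_of_words(review))  # [1, 1, 0, 1, 1, 0]
```

Word order is discarded entirely, and a count-based variant would store how many times each word occurs instead of a 0/1 flag.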

35  

Topics with NMF using TF-IDF

            Word 1   Word …   Word N
Review 1    0.8      0.9      0
Review …    0.6      0        0.8
Review N    0.9      0        0.8

(Reviews X Words) ≈ (Reviews X Topics) · (Topics X Words)
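The factorization on this slide can be sketched with the classic multiplicative-update rules for NMF, which keep both factors non-negative. The talk does not say which solver was used, and in practice a library implementation would replace this loop. The matrix reuses the illustrative TF-IDF values from the slide; the topic count, iteration count, and seed are arbitrary assumptions.

```python
import random

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Squared reconstruction error between v and the product w * h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factor v (reviews x words) into w (reviews x topics) and h (topics x words)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n_rows)]
    return w, h

# Illustrative TF-IDF weights from the slide (rows: reviews, columns: words).
tfidf = [
    [0.8, 0.9, 0.0],
    [0.6, 0.0, 0.8],
    [0.9, 0.0, 0.8],
]
w, h = nmf(tfidf, n_topics=2)
print("squared reconstruction error:", round(frobenius_error(tfidf, w, h), 4))
```

Each row of `w` says how strongly a review expresses each topic, and each row of `h` says which words define a topic.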

36  

Describing Restaurants as Topics

Each review for a given restaurant has a certain topic distribution.
Combining them, we identify the top topics for that restaurant.

[Figure: topic distributions over Topic 01 to Topic 05 for review 1, review 2, …, review N, combined into a single topic distribution for the restaurant]
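One simple way to combine per-review topic distributions is to average them and keep the highest-weighted topics. The talk does not say how the combination is done, so the averaging below is an assumption, and the distributions are hypothetical.

```python
def top_topics(review_topic_dists, n_top=2):
    """Average per-review topic distributions and return the strongest topics."""
    n_reviews = len(review_topic_dists)
    n_topics = len(review_topic_dists[0])
    combined = [sum(dist[t] for dist in review_topic_dists) / n_reviews
                for t in range(n_topics)]
    ranked = sorted(range(n_topics), key=lambda t: combined[t], reverse=True)
    return combined, ranked[:n_top]

# Hypothetical topic distributions over five topics for three reviews.
reviews = [
    [0.60, 0.10, 0.10, 0.10, 0.10],
    [0.20, 0.50, 0.10, 0.10, 0.10],
    [0.50, 0.30, 0.10, 0.05, 0.05],
]
combined, top = top_topics(reviews)
print(top)
```

Topic indices here are zero-based, so index 0 corresponds to "Topic 01" in the figure.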

37  

Examples of Topics

38  

Varying Topic By Region
• San Francisco
• London
• Chicago
• New York

39  

Building Recommendation Systems
• Importance of A/B Testing
• Generating Recommendations
• Recommendation Explanations

40  

Thanks!

Jeremy Schiff