Jeremy Schiff, Senior Manager, Data Science, OpenTable at MLconf NYC

Recommendation Architecture

Jeremy Schiff MLConf 2015 03/27/2015

[Figure: DINERS and RESTAURANTS across BEFORE, DURING, and AFTER, with the labels "Attracting & Planning" and "Understanding & Evolving"]

OpenTable: Deliver great experiences at every step, based on who you are


OpenTable in Numbers

• Our network connects diners with more than 32,000 restaurants worldwide.
• Our diners have spent more than $30 billion at our partner restaurants.
• OpenTable seats more than 16 million diners each month.
• Every month, OpenTable diners write more than 450,000 restaurant reviews.


Recommendations >> Collaborative Filtering

So what are recommendations?


Building Recommendation Systems

• Importance of A/B Testing
• Generating Recommendations
• Recommendation Explanations

What’s the Goal

Minimizing Engineering Time to Improve the Metric that Matters

• Make it Easy to Measure
• Make it Easy to Iterate
• Reduce Iteration Cycle Times

Importance of A/B Testing

• If you don’t measure it, you can’t improve it
• Metrics Drive Behavior
• Continued Forward Progress

Pick Your Business Metric

Revenue, Conversions
• OpenTable
• Amazon

Engagement
• Netflix
• Pandora
• Spotify

Measuring & The Iteration Loop

• Analyze & Introspect: Hours (Insights)
• Optimize Models: Days (Predict)
• A/B Testing: Weeks (Measure)
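The Measure step of the loop comes down to comparing a business metric between a control and a variant. The following is a minimal sketch, assuming the metric is a conversion rate and that a two-proportion z-test is an acceptable significance check. The talk does not say which test OpenTable uses, and the function name and counts are made up for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a is the control arm, conv_b / n_b is the variant arm.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both arms convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    if not 0 < pooled < 1:
        raise ValueError("pooled conversion rate must be strictly between 0 and 1")
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 20,000 sessions per arm.
z, p_value = two_proportion_z_test(1000, 20000, 1100, 20000)
```

A small p-value only says the difference is unlikely to be noise. Whether the lift matters is still a judgment about the business metric.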


Ranking Objectives

Objectives:
• Training Error
  - Minimize Loss Function
    § Often Convex
• Generalization Error
  - Precision at K
• A/B Metric
  - Conversion / Engagement
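Precision at K, the generalization metric named above, is the fraction of the top K recommended items that turn out to be relevant. A minimal sketch follows. The restaurant IDs and the definition of "relevant" (for example, the diner later booked the restaurant) are assumptions for illustration.

```python
def precision_at_k(ranked_items, relevant_items, k):
    """Fraction of the top-k ranked items that appear in relevant_items."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant_items)
    # Dividing by k (not len(top_k)) penalizes lists shorter than k.
    return hits / k


# Hypothetical held-out data for one diner.
ranked = ["r1", "r2", "r3", "r4", "r5"]
relevant = {"r2", "r5", "r9"}
p_at_3 = precision_at_k(ranked, relevant, 3)
p_at_5 = precision_at_k(ranked, relevant, 5)
```

In practice the value would be averaged over many held-out diners before comparing models.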


Training, Generalization, and Online Error

• Training: Train on your specific dataset
  - Dealing with Sparseness
• Test/Generalization: How does it generalize to unseen data?
  - Hyper-Parameter Tuning
• Online: How does it perform in the wild?
  - Model interaction effects between recommended items (diversity)

Fundamental Differences in Usage

Right now vs. Planning

Cost of Being Wrong

Search vs. Recommendations


Recommendation Stack

Query Interpretation
Retrieval
Ranking – Item & Explanation
Index Building
Context for Query & User
Model Building
Explanation Content
Visualization
Collaborative Filters
Item / User Metadata

Using Context, Frequency & Sentiment

• Context
  - Implicit: Location, Time, Mobile/Web
  - Explicit: Query
• High End Restaurant for Dinner
  - Low Frequency, High Sentiment
• Fast, Mediocre Sushi for Lunch
  - High Frequency, Moderate Sentiment

How to use this data

• Frequency Data:
  - General: Popularity
  - Personalized: Implicit CF
• Sentiment Data:
  - General: Good Experience
  - Personalized: Explicit CF
• Good Recommendation
  - Use both to drive your Business Metric
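As one illustration of the "Personalized: Implicit CF" bullet, frequency data such as visit counts can drive a simple item-based collaborative filter. This is a sketch under stated assumptions: the diners, restaurants, and counts are invented, and cosine similarity between restaurants is just one common choice, not necessarily what OpenTable runs.

```python
import math
from collections import defaultdict

# Hypothetical visit counts: diner -> {restaurant: number of visits}.
visits = {
    "ann": {"sushi_bar": 5, "taqueria": 3},
    "bob": {"sushi_bar": 4, "taqueria": 2, "steakhouse": 1},
    "cat": {"steakhouse": 2, "bistro": 3},
    "dan": {"sushi_bar": 1, "bistro": 2},
}


def item_vectors(visits):
    """Invert the data: restaurant -> {diner: number of visits}."""
    vectors = defaultdict(dict)
    for user, counts in visits.items():
        for item, count in counts.items():
            vectors[item][user] = count
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[key] * v[key] for key in u if key in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(user, visits, top_n=3):
    """Score unseen restaurants by similarity to the ones the diner visits."""
    vectors = item_vectors(visits)
    seen = visits[user]
    scores = {}
    for candidate, candidate_vector in vectors.items():
        if candidate in seen:
            continue
        # Weight each similarity by how often the diner visited the seen item.
        scores[candidate] = sum(
            cosine(candidate_vector, vectors[item]) * count
            for item, count in seen.items()
        )
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


suggestions = recommend("ann", visits)
```

The same structure works with star ratings in place of visit counts, which would correspond to the explicit, sentiment-based variant.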


Ranking

Phase 1: Bootstrap through heuristics
Phase 2: Learn to Rank
• Many models
  - E [ Revenue | Query, Position, Item, User ]
  - E [ Engagement | Query, Position, Item, User ]
  - Regression, RankSVM, LambdaMART…
• Modeling Diversity is Important
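The slide does not say how diversity is modeled. One widely used option, shown here purely as an illustration, is greedy Maximal Marginal Relevance (MMR) re-ranking: each pick trades a model's relevance score against similarity to items already chosen. The relevance scores, cuisines, and the `trade_off` parameter below are hypothetical.

```python
def rerank_with_diversity(candidates, relevance, similarity, k, trade_off=0.7):
    """Greedy MMR: prefer relevant items that differ from earlier picks.

    trade_off=1.0 ranks purely by relevance; lower values favor diversity.
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def mmr(item):
            # Penalize by the closest match among items already selected.
            max_sim = max((similarity(item, s) for s in selected), default=0.0)
            return trade_off * relevance[item] - (1 - trade_off) * max_sim

        best = max(remaining, key=mmr)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical scores from a ranking model, plus a cuisine per restaurant.
relevance = {"sushi_a": 0.9, "sushi_b": 0.85, "italian": 0.8, "sushi_c": 0.75}
cuisine = {"sushi_a": "sushi", "sushi_b": "sushi", "italian": "italian", "sushi_c": "sushi"}


def same_cuisine(a, b):
    """Toy similarity: 1.0 for the same cuisine, else 0.0."""
    return 1.0 if cuisine[a] == cuisine[b] else 0.0


diverse_top3 = rerank_with_diversity(list(relevance), relevance, same_cuisine, k=3)
```

With these numbers the Italian restaurant moves ahead of the second sushi place, even though its raw relevance score is lower.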


Training Example

• Context Free (Collaborative Filtering)
  - Train for Content Based and Collaborative Filtering models.
  - Create an Ensemble Model
  - Perform Hyper-Parameter Tuning for each model
• With Context (Search)
  - Train a model using query (implicit & explicit)
    § Includes Context-Free Model
  - Perform Hyper-Parameter Tuning
• Evaluate Model using A/B
  - Change models, objective functions, etc.

Training DataFlow

Collaborative Filter Service (Realtime)
Collaborative Filter HyperParameter Tuning (Batch with Spark)
Collaborative Filter Training (Batch with Spark)

Search Service (Realtime)
Search HyperParameter Tuning (Batch with Spark)
Search Training (Batch with Spark)

User Interaction Logs (Kafka)
A/B Testing Dashboards
Other Services

Compelling Recommendations


Recommendation Explanations

• Amazon
• Ness
• Netflix
• Ness - Social

Summarizing Content

• Essential for Mobile
• Balance Utility With Trust?
  - Summarize, but surface raw data
• Example:
  - Initially, read every review
  - Later, use average star rating

Summarizing Restaurant Attributes


Dish Recommendation

• What to try once I have arrived?


Analyzing Review Content


The ingredients of a spectacular dining experience…

… and a spectacularly bad one


Content Features

Pandora
• Music Genome Project

Natural Language Processing
• Topics & Tags

Topic Modeling Methods

We applied two main topic modeling methods:
• Latent Dirichlet Allocation (LDA)
  - (Blei et al. 2003)
• Non-negative Matrix Factorization (NMF)
  - (Aurora et al. 2012)

Bag of Words Model

"The food was great! I loved the view of the sailboats."

food  great  chicken  sailboat  view  service
1     1      0        1         1     0
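The vector above can be reproduced with a few lines of code. This is a minimal sketch, assuming binary presence (not counts) and a crude plural-stripping rule standing in for real stemming, so that "sailboats" matches the vocabulary entry "sailboat". The function names are illustrative.

```python
import re

# Vocabulary in the same column order as the slide.
VOCABULARY = ["food", "great", "chicken", "sailboat", "view", "service"]


def normalize(token):
    """Lowercase and strip a trailing plural 's' (a rough stand-in for stemming)."""
    token = token.lower()
    if token.endswith("s") and len(token) > 3:
        token = token[:-1]
    return token


def bag_of_words(text, vocabulary=VOCABULARY):
    """Return a 0/1 vector marking which vocabulary words occur in text."""
    tokens = {normalize(t) for t in re.findall(r"[A-Za-z]+", text)}
    return [1 if word in tokens else 0 for word in vocabulary]


review = "The food was great! I loved the view of the sailboats."
vector = bag_of_words(review)
```

Word order and grammar are discarded, which is the defining trade-off of the bag-of-words representation.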


Topics with NMF using TF-IDF

          Word 1  Word …  Word N
Review 1  0.8     0.9     0
Review …  0.6     0       0.8
Review N  0.9     0       0.8

[Reviews × Words] ≈ [Reviews × Topics] · [Topics × Words]
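The factorization of the reviews-by-words matrix into reviews-by-topics and topics-by-words can be sketched with the classic multiplicative update rules for NMF (Lee and Seung). The talk does not say which solver was used. This pure-Python version is only meant to show the shapes involved and would be far too slow for real review data. The input reuses the TF-IDF values from the slide.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factor V (reviews x words) into W (reviews x topics), H (topics x words)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    # Strictly positive random start so the updates can move every entry.
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        numer = matmul(wt, v)
        denom = matmul(matmul(wt, w), h)
        h = [[h[k][j] * numer[k][j] / (denom[k][j] + eps)
              for j in range(n_cols)] for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        numer = matmul(v, ht)
        denom = matmul(w, matmul(h, ht))
        w = [[w[i][k] * numer[i][k] / (denom[i][k] + eps)
              for k in range(n_topics)] for i in range(n_rows)]
    return w, h


# TF-IDF values from the slide: 3 reviews x 3 words.
tfidf = [[0.8, 0.9, 0.0],
         [0.6, 0.0, 0.8],
         [0.9, 0.0, 0.8]]
review_topics, topic_words = nmf(tfidf, n_topics=2)
```

Each row of `topic_words` weights the vocabulary for one topic, and each row of `review_topics` says how strongly a review expresses each topic.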


Describing Restaurants as Topics

Each review for a given restaurant has a certain topic distribution.
Combining them, we identify the top topics for that restaurant.

[Figure: review 1, review 2, ..., review N, each with weights over Topic 01 through Topic 05, combined into one Restaurant]
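The slide does not specify how the per-review distributions are combined. A minimal sketch, assuming a plain average over reviews followed by picking the largest entries, is shown below. The weights are invented, and topics are indexed from 0 in the code.

```python
def top_topics(review_topic_weights, n_top=2):
    """Average per-review topic distributions and return the strongest topics.

    review_topic_weights: one list of topic weights per review, all the same length.
    Returns a list of (topic_index, mean_weight), strongest first.
    """
    n_reviews = len(review_topic_weights)
    n_topics = len(review_topic_weights[0])
    mean = [
        sum(review[t] for review in review_topic_weights) / n_reviews
        for t in range(n_topics)
    ]
    ranked = sorted(range(n_topics), key=lambda t: -mean[t])
    return [(t, mean[t]) for t in ranked[:n_top]]


# Hypothetical topic distributions for three reviews of one restaurant.
reviews = [
    [0.70, 0.10, 0.10, 0.05, 0.05],
    [0.20, 0.60, 0.10, 0.05, 0.05],
    [0.60, 0.20, 0.10, 0.05, 0.05],
]
restaurant_topics = top_topics(reviews)
```

Weighting reviews by recency or length would be a natural variation, but nothing in the slides indicates whether that was done.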


Examples of Topics


Varying Topic By Region

• San Francisco
• London
• Chicago
• New York

Building Recommendation Systems

• Importance of A/B Testing
• Generating Recommendations
• Recommendation Explanations

Thanks!

Jeremy Schiff
