FlexRecs: Expressing and Combining Flexible Recommendations
Article by Georgia Koutrika, Benjamin Bercovitz and Hector Garcia-Molina
Presented by Amit Portnoy
  • date post

    19-Dec-2015

Page 2:

Can we make the recommendations more flexible for the user?

I’m only interested in books from the same genre

I’m only interested in books from the same country as the author

Consider only what people from Israel bought

Consider only what people who gave this book a high rating bought

Recommendation is basically adding a score to entities and then displaying the highest-scoring results to the user

Page 3:

Short Introduction to Recommender Systems

Change it so it will only recommend friends that have at least one hobby in common with the user

Is it easy to change the friend recommendation implementation (is it hard-wired)?

Add a feature that checks that the recommended user is not deceased

Page 4:

The Problem

No Flexibility

Recommendations generally aren't customizable by the user

Hard Wired

Hard to modify and to experiment with different approaches

Page 5:

The suggested solution is a framework called FlexRecs that enables a declarative approach to recommendations

We let the designer describe what he wants and not how to do it

Much like the way SQL queries specify which tuples we need but not how to get them.

In fact we’ll be extending SQL with new operators (called ‘recommend’, ‘extend’ and ‘blend’)

Such a query may contain parameters that will be determined by the end user through a graphical user interface.

Page 6:

FlexRecs’ Approach

Page 7:

The example we’ll work with:

Register to courses

Plan schedule

Write reviews for courses

Rank courses

Check course average rankings and reviews

Get a list of recommended courses (using FlexRecs)

Page 8:

CourseRank is flexible; students can get recommended courses based on:

What people liked?

What is more critical for completing my major?

What is recommended by students in my major?

What is recommended by students who liked the same courses I did?

Page 9:

CourseRank is not hard-wired, which makes it easy to research different approaches:

What is the best way to blend two recommendations?

What is the best way to calculate a course ranking?

Can we give a recommendation that will predict the expected grade of the student?

Page 10:

The University Database

We will meet the following relations later in the examples:

Departments(DepID, DepCode, Name)
Courses(CourseID, DepID, Title, Description, Units, Days)
CourseSched(CourseID, Year, Term, InstrID, Location, TimeSlot, Days)
Instructors(InstrID, Name, Url)
Students(SuID, Name, Class, GPA, Status)
StudentStudies(SuID, StudyPrgID)
StudyPrograms(StudyPrgID, ProgramName, Classification, DepID)
StudentHistory(SuID, CourseID, Year, Term, Grade)
Comments(SuID, CourseID, Year, Term, Text, Rating, Date)

Page 11:

Example 1: BROWSING “PROGRAMMING: PART ONE” (WITH ID ‘C22’), I WOULD LIKE TO SEE ALL RELATED COURSES IN THE SAME YEAR

In this case related courses will be decided by calculating the Jaccard similarity between the titles of two courses (shared words divided by all distinct words):

Jaccard(“Programming: Part Two”, “Programming: Part One”) = 2/4
Jaccard(“Greek philosophy: Part One”, “Programming: Part One”) = 2/5

To do that we need to:
1) Get all the courses given this year
2) Give each course a score by comparing its title to “Programming: Part One”

New Operator: “Recommend”!

For each of the tuples in the left relation: add a score attribute by comparing it with the right relation

RelatedCourses

CourseID  Title                       …  Score
C23       Programming: Part Two       …  0.5
C55       Greek philosophy: Part One  …  0.4
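
The two scores above can be reproduced with a few lines of Python. This is only a sketch: the function name `jaccard` and the tokenization rule (lowercase words, punctuation ignored) are assumptions, not something the slides specify.

```python
import re

def jaccard(title_a, title_b):
    """Jaccard similarity between the word sets of two titles."""
    # Assumed tokenization: lowercase alphanumeric words, punctuation dropped.
    words_a = set(re.findall(r"\w+", title_a.lower()))
    words_b = set(re.findall(r"\w+", title_b.lower()))
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0

# Score the two candidate courses against the course being browsed.
related = {
    title: jaccard(title, "Programming: Part One")
    for title in ["Programming: Part Two", "Greek philosophy: Part One"]
}
```

With this tokenization, "Programming: Part Two" shares 2 of 4 distinct words with the browsed title and "Greek philosophy: Part One" shares 2 of 5, matching the Score column.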

Page 12:

Recommend Operator

R1 Recommend R2:

For each tuple t in R1:
1. Compare it to every tuple in R2 (each comparison result is a scalar value)
2. Aggregate the results of t’s comparisons into a single scalar value
3. Add the result to t as a new attribute named ‘score’

Formally:

cf – a function that compares two tuples and outputs a scalar

a – a function that maps a tuple t and a set of tuples S to a scalar value, using cf to compare t to every tuple in S and then aggregating the results
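
The three steps can be sketched in Python over relations represented as lists of dictionaries. Everything here besides the roles of `cf` and `a` is an assumption made for illustration: the helper names `shared_words` and `average`, the sample titles, and the choice of an average as the aggregate.

```python
def recommend(r1, r2, cf, a):
    """Return the tuples of r1, each with an added 'score' attribute."""
    # a(t, S, cf) compares t to every tuple in S with cf and aggregates.
    return [{**t, "score": a(t, r2, cf)} for t in r1]

def shared_words(t, s):
    """Hypothetical cf: number of words the two titles have in common."""
    return len(set(t["Title"].lower().split()) & set(s["Title"].lower().split()))

def average(t, S, cf):
    """Hypothetical a: mean of the comparison results of t against S."""
    results = [cf(t, s) for s in S]
    return sum(results) / len(results)

r1 = [
    {"CourseID": "C23", "Title": "Programming: Part Two"},
    {"CourseID": "C55", "Title": "Greek philosophy: Part One"},
]
r2 = [
    {"CourseID": "C22", "Title": "Programming: Part One"},
    {"CourseID": "C30", "Title": "Databases: Part One"},  # made-up course
]
scored = recommend(r1, r2, shared_words, average)
```

Passing `cf` and `a` as arguments mirrors the point of the operator: the comparison and aggregation can be swapped without touching the rest of the workflow.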

Page 13:

Example 2: I WOULD LIKE TO SEE COURSES THAT ARE RELATED TO COURSES THAT I HAVE ALREADY TAKEN

1) Get all my previously taken courses
2) Get all courses given this year
3) Calculate the Jaccard distance between each given course and all my previous courses
4) Take the minimum of those as the score of the candidate course
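
A small sketch of these four steps, following the slide's wording literally. It assumes the Jaccard distance is 1 minus the word-set similarity of the titles and uses the minimum as the aggregate, so under this reading a lower score means a closer match. The function names and sample titles are illustrative.

```python
import re

def title_words(title):
    return set(re.findall(r"\w+", title.lower()))

def jaccard_distance(title_a, title_b):
    """Assumed definition: 1 - |A ∩ B| / |A ∪ B| over title words."""
    a, b = title_words(title_a), title_words(title_b)
    return 1 - len(a & b) / len(a | b)

def score_candidates(offered, taken):
    """Score each offered course by its minimum distance to any taken course."""
    return {c: min(jaccard_distance(c, t) for t in taken) for c in offered}

taken = ["Programming: Part One"]                                   # step 1
offered = ["Programming: Part Two", "Greek philosophy: Part One"]   # step 2
scores = score_candidates(offered, taken)                           # steps 3-4
```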

Page 14:

Example 3: I WOULD LIKE TO SEE ONLY COURSES THAT ARE RECOMMENDED BY PEOPLE WITH SIMILAR TASTE TO MY FRIEND MOSHE (ID 444)

First we need to rank the students!
Giving higher scores to those who have similar taste to Moshe

As we recall:
Students(SuID,Name,Class,GPA,Status)
Comments(SuID,CourseID,Year,Term,Text,Rating,Date)

We would like to see the students’ ratings as an attribute of a student
For this we get a new operator: the extend operator

Page 15:

The Extend Operator

comments

SuID  CourseID  Rating  Date
1     C3        5       Feb 2008
1     C2        6       Dec 2007
2     C1        5       Mar 2007
2     C2        6       Dec 2007
2     C7        5.6     Dec 2007

students

SuID  Name
1     Paul Little
2     John Doe

students extended with comments

SuID  Name         Comments(CourseID, Rating, Date)
1     Paul Little  (C3, 5, Feb 2008) (C2, 6, Dec 2007)
2     John Doe     (C1, 5, Mar 2007) (C2, 6, Dec 2007) (C7, 5.6, Dec 2007)
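
The nesting shown in the tables can be sketched in Python as follows. The function name `extend`, its parameters, and the list-of-dictionaries representation are assumptions; the data is taken from the slide.

```python
from collections import defaultdict

def extend(left, right, key, attr):
    """Attach to each tuple of `left` the matching `right` tuples as a nested attribute."""
    nested = defaultdict(list)
    for row in right:
        # Keep only the attributes of `right` other than the join key.
        nested[row[key]].append({k: v for k, v in row.items() if k != key})
    return [{**row, attr: nested.get(row[key], [])} for row in left]

students = [
    {"SuID": 1, "Name": "Paul Little"},
    {"SuID": 2, "Name": "John Doe"},
]
comments = [
    {"SuID": 1, "CourseID": "C3", "Rating": 5, "Date": "Feb 2008"},
    {"SuID": 1, "CourseID": "C2", "Rating": 6, "Date": "Dec 2007"},
    {"SuID": 2, "CourseID": "C1", "Rating": 5, "Date": "Mar 2007"},
    {"SuID": 2, "CourseID": "C2", "Rating": 6, "Date": "Dec 2007"},
    {"SuID": 2, "CourseID": "C7", "Rating": 5.6, "Date": "Dec 2007"},
]
extended = extend(students, comments, key="SuID", attr="Comments")
```

Each student tuple now carries its ratings as a set-valued attribute, which is what the comparison functions in the following slides operate on.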

Page 16:

Example 3 (Cont’): I WOULD LIKE TO SEE ONLY COURSES THAT ARE RECOMMENDED BY PEOPLE WITH SIMILAR TASTE TO MY FRIEND MOSHE (ID 444)

First we need to rank the students!
Giving higher scores to those who have similar taste to Moshe

Page 17:

Recommend Operator: Comparison Functions Examples

Euclidean[E,A1,A2](t,s) =

E = Comments, A1 = CourseID, A2 = Rating

SuID  Name         Comments(CourseID, Rating, Date)
312   Paul Little  (C1, 5, 2 Feb 2008) (C7, 10, 3 Dec 2007)
124   John Doe     (C1, 5, 15 Mar 2007) (C2, 6, 12 Dec 2007) (C5, 6.6, 22 Jun 2007) (C7, 7, 22 Jun 2007)
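
One plausible reading of this comparison function, sketched in Python: take the courses (A1) that both students rated and compute the Euclidean distance between their ratings (A2). Restricting to commonly rated courses and the `1 / (1 + d)` form of the inverse are assumptions, since the slides do not spell either out.

```python
import math

def euclidean(t, s, e="Comments", a1="CourseID", a2="Rating"):
    """Euclidean distance between the ratings of t and s on courses both rated."""
    ratings_t = {c[a1]: c[a2] for c in t[e]}
    ratings_s = {c[a1]: c[a2] for c in s[e]}
    common = ratings_t.keys() & ratings_s.keys()
    return math.sqrt(sum((ratings_t[k] - ratings_s[k]) ** 2 for k in common))

def inverse_euclidean(t, s):
    """Assumed inverse form: a smaller distance gives a larger score."""
    return 1 / (1 + euclidean(t, s))

paul = {"SuID": 312, "Name": "Paul Little", "Comments": [
    {"CourseID": "C1", "Rating": 5}, {"CourseID": "C7", "Rating": 10}]}
john = {"SuID": 124, "Name": "John Doe", "Comments": [
    {"CourseID": "C1", "Rating": 5}, {"CourseID": "C2", "Rating": 6},
    {"CourseID": "C5", "Rating": 6.6}, {"CourseID": "C7", "Rating": 7}]}

distance = euclidean(paul, john)            # common courses: C1 and C7
similarity = inverse_euclidean(paul, john)
```

For the two students on the slide, the ratings agree on C1 and differ by 3 on C7, so the distance is 3 and the assumed inverse score is 0.25.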

Page 18:

Example 3 (Cont’): I WOULD LIKE TO SEE ONLY COURSES THAT ARE RECOMMENDED BY PEOPLE WITH SIMILAR TASTE TO MY FRIEND MOSHE (ID 444)

Now we can give a score to the courses based on RStds:

The score will be the weighted average of the ratings in RStds

We will give higher weights to students with a higher score in RStds (students with similar taste to Moshe)

Page 19:

Recommend Operator: Comparison Functions Examples

Weighted average Aggregation

W_Avg[cf,A](t,s) =

    SuID  Name         Comments(CourseID, Rating, Date)                                                       Score
s1  312   Paul Little  (C1, 5, 2 Feb 2008) (C7, 10, 3 Dec 2007)                                               0.5
s2  124   John Doe     (C1, 5, 15 Mar 2007) (C2, 6, 12 Dec 2007) (C5, 6.6, 22 Jun 2007) (C7, 7, 22 Jun 2007)  0.2

Given a course t, its score will be calculated as the weighted average:

We want ‘cf’ to find us t’s rating in si (find the rating given to this course by the student we are currently comparing with)

Page 20:

Recommend Operator: Comparison Functions Examples

Identify[A,E,B](t,s) =

Identify[CourseID,Comments,Rating](specificCourse,specificStudent) = the rating that the student gave this course

SuID  Name         Comments(CourseID, Rating, Date)
312   Paul Little  (C1, 5, 2 Feb 2008) (C7, 10, 3 Dec 2007)
124   John Doe     (C1, 5, 15 Mar 2007) (C2, 6, 12 Dec 2007) (C5, 6.6, 22 Jun 2007) (C7, 7, 22 Jun 2007)
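
A sketch of how Identify can serve as the ‘cf’ inside the weighted-average aggregation. The formula used is an assumption: the sum of (student score × rating) divided by the sum of student scores, taken over the students who actually rated the course. The student scores 0.5 and 0.2 are the ones shown with the weighted-average slide.

```python
def identify(t, s, a="CourseID", e="Comments", b="Rating"):
    """Return the rating student s gave course t, or None if s did not rate it."""
    for item in s[e]:
        if item[a] == t[a]:
            return item[b]
    return None

def w_avg(t, students, cf=identify, weight="Score"):
    """Assumed weighted average: sum(w_i * cf(t, s_i)) / sum(w_i) over raters."""
    pairs = [(s[weight], cf(t, s)) for s in students]
    pairs = [(w, r) for w, r in pairs if r is not None]  # skip non-raters
    total = sum(w for w, _ in pairs)
    return sum(w * r for w, r in pairs) / total if total else None

paul = {"SuID": 312, "Score": 0.5, "Comments": [
    {"CourseID": "C1", "Rating": 5}, {"CourseID": "C7", "Rating": 10}]}
john = {"SuID": 124, "Score": 0.2, "Comments": [
    {"CourseID": "C1", "Rating": 5}, {"CourseID": "C2", "Rating": 6},
    {"CourseID": "C5", "Rating": 6.6}, {"CourseID": "C7", "Rating": 7}]}

c7_score = w_avg({"CourseID": "C7"}, [paul, john])  # (0.5*10 + 0.2*7) / 0.7
c2_score = w_avg({"CourseID": "C2"}, [paul, john])  # only John rated C2
```

Under this reading, Paul's higher score pulls the C7 result toward his rating of 10 rather than John's 7.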

Page 21:

Example 3 (Cont’): I WOULD LIKE TO SEE ONLY COURSES THAT ARE RECOMMENDED BY PEOPLE WITH SIMILAR TASTE TO MY FRIEND MOSHE (ID 444)

Extend students with comments (so we can view a student’s ratings as an attribute of the student)

Get Moshe as a relation

Get the other students as a relation

Rank the other students based on the inverse Euclidean distance from their ratings to Moshe’s ratings.

Rank the courses by looking at all the ratings a course got and performing a weighted average on those ratings, giving higher weights to students who got a higher score in the previous step.

Page 22:

Example 4: I WOULD LIKE TO COMBINE RESULTS FROM EXAMPLES 2 AND 3

I.e. I want to see courses that are related to courses I’ve already taken and are recommended by students with similar taste to Moshe

We’ll introduce a new operator: blend

Blend will create a union of both recommendations

Page 23:

Blend Operator

R1 Blend R2:

1. Create a unified relation with all the tuples from R1 and R2

2. For each tuple r in the unified relation, calculate a scalar value using a blending function

3. Add the result to r as a new attribute named ‘bscore’

* R1 and R2 must be union compatible (same attributes)

Formally:

M – a blending function which maps a tuple to a scalar value (using the input relations as parameters)

Page 24:

Blend Operator

R1 Blend R2:

Example: A simple blend is to normalize the results from two recommendations (tuples with a ‘score’ attribute)

First input:

…  score
A  5
B  4
C  10
D  2
E  1

Second input:

…  score
a  400
b  43
c  1000
d  32

Blended result:

…  score  bscore
a  400    0.4
b  43     0.043
c  1000   1
d  32     0.032
A  5      0.5
B  4      0.4
C  10     1
D  2      0.2
E  1      0.1

For tuples a to d: bscore = score/1000
For tuples A to E: bscore = score/10
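
The normalizing blend can be sketched as: union the inputs and divide every score by the largest score in the input it came from (1000 and 10 in the example). The function name `normalize_blend` and the `id` attribute standing in for the elided columns are assumptions.

```python
def normalize_blend(*relations):
    """Union the inputs, adding 'bscore' = score / max score of the tuple's own input."""
    blended = []
    for relation in relations:
        top = max(row["score"] for row in relation)
        for row in relation:
            blended.append({**row, "bscore": row["score"] / top})
    return blended

first = [{"id": k, "score": v} for k, v in
         [("A", 5), ("B", 4), ("C", 10), ("D", 2), ("E", 1)]]
second = [{"id": k, "score": v} for k, v in
          [("a", 400), ("b", 43), ("c", 1000), ("d", 32)]]

result = normalize_blend(second, first)
bscores = {row["id"]: row["bscore"] for row in result}
```

Normalizing per input puts both recommendations on a common 0 to 1 scale, so a score of 400 out of 1000 and a score of 4 out of 10 end up with the same bscore.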

Page 25:

Example 4 (Cont’): I WOULD LIKE TO COMBINE RESULTS FROM EXAMPLES 2 AND 3

I.e. I want to see courses that are related to courses I’ve already taken and are recommended by students with similar taste to Moshe

To combine the recommendations we’ll use a weighted average blend

We’ll define Example 2’s weight to be 0.7 and Example 3’s weight to be 1. The ‘bscore’ of a course ‘t’ in the blended relation will be:

Input relations (Example2, Example3):

CourseID  …  Score
C23       …  0.25
C55       …  0.44
C43       …  0.2

CourseID  …  Score
C23       …  0.1
C55       …  0.6

Blended relation:

CourseID  …  Score  bscore
C23       …  0.1    0.18
C55       …  0.6    0.5
C23       …  0.25   0.18
C55       …  0.44   0.5
C43       …  0.2    0.11
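
A sketch of a weighted average blend that is consistent with the numbers in the tables. Two things are assumptions: the formula (weighted sum of a course's scores divided by the sum of the weights, with a missing score counted as 0), and the pairing of weights to inputs. The slide layout does not make clear which input list is Example2 and which is Example3; giving weight 1 to the three-row list and 0.7 to the two-row list is the pairing that reproduces the bscore column up to truncation.

```python
def weighted_blend(weighted_relations):
    """weighted_relations: list of (weight, {course_id: score}) pairs."""
    total_weight = sum(w for w, _ in weighted_relations)
    courses = set().union(*(scores.keys() for _, scores in weighted_relations))
    return {
        # A course absent from an input contributes 0 for that input (assumption).
        c: sum(w * scores.get(c, 0.0) for w, scores in weighted_relations) / total_weight
        for c in courses
    }

three_row_input = {"C23": 0.25, "C55": 0.44, "C43": 0.2}
two_row_input = {"C23": 0.1, "C55": 0.6}

bscore = weighted_blend([(1.0, three_row_input), (0.7, two_row_input)])
```

For instance, C23 gets (1 × 0.25 + 0.7 × 0.1) / 1.7, which is about 0.188, and C43 gets 0.2 / 1.7, which is about 0.118; the slide shows 0.18 and 0.11.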

Page 26:

Example 5 – MOVIE RECOMMENDATION

Movies(Mid,Title,Year)
Viewer(Vid,Name,Age)
Saw(Vid,Mid,Rating)

ExtMovie1

Mid  Mid2  Saw(Vid,Mid,Ratings)
1    2     Tuples of Mid 1 ratings
1    3     Tuples of Mid 1 ratings
2    1     Tuples of Mid 2 ratings
2    3     Tuples of Mid 2 ratings
3    1     Tuples of Mid 3 ratings
3    2     Tuples of Mid 3 ratings

ExtMovie2

Mid  Mid2  Saw(Vid,Mid2,Ratings)
1    2     Tuples of Mid 2 ratings
1    3     Tuples of Mid 3 ratings
2    1     Tuples of Mid 1 ratings
2    3     Tuples of Mid 3 ratings
3    1     Tuples of Mid 1 ratings
3    2     Tuples of Mid 2 ratings

Page 27:

Example 5 – MOVIE RECOMMENDATION

Movies(Mid,Title,Year)
Viewer(Vid,Name,Age)
Saw(Vid,Mid,Rating)

For each movie pair:
look for viewers who saw both movies
if the viewers gave the movies similar ratings, the score of the movie pair will be higher

Page 28:

Example 5 – MOVIE RECOMMENDATION

Movies(Mid,Title,Year)
Viewer(Vid,Name,Age)
Saw(Vid,Mid,Rating)

Look only at movies I (Vid=123) saw (Mid)
For each movie I didn’t see (Mid2):
Mid2’s score will be the sum of the scores of the pairs (Mid,Mid2)

(if many users rated Mid2 similarly to my movies it will get a higher score)
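
The two stages of the movie example can be sketched as below. The pair score used here, `1 / (1 + Euclidean distance)` over viewers who saw both movies, is an assumption chosen so that similar ratings give a higher score; the slides do not state the exact comparison function. The `Saw` tuples are made up for illustration.

```python
import math

def pair_score(saw, m1, m2):
    """Score a movie pair from the ratings of viewers who saw both movies."""
    r1 = {vid: rating for vid, mid, rating in saw if mid == m1}
    r2 = {vid: rating for vid, mid, rating in saw if mid == m2}
    both = r1.keys() & r2.keys()
    if not both:
        return 0.0
    dist = math.sqrt(sum((r1[v] - r2[v]) ** 2 for v in both))
    return 1 / (1 + dist)  # assumed form: similar ratings -> higher score

def recommend_movies(saw, vid):
    """Rank unseen movies by the summed pair scores with the viewer's movies."""
    mine = {mid for v, mid, _ in saw if v == vid}
    unseen = {mid for _, mid, _ in saw} - mine
    scores = {m2: sum(pair_score(saw, m1, m2) for m1 in mine) for m2 in unseen}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Saw(Vid, Mid, Rating) tuples, hypothetical data
saw = [(123, 1, 5), (123, 2, 3), (7, 1, 5), (7, 3, 5), (8, 2, 3), (8, 4, 1)]
ranking = recommend_movies(saw, 123)
```

In this toy data, viewer 7 rated movies 1 and 3 identically, so movie 3 ranks first for viewer 123; viewer 8 rated movies 2 and 4 two points apart, so movie 4 gets a lower score.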

Page 29:

Recap - FlexRecs’ Approach

The examples we saw are called recommendation workflows. In FlexRecs such a workflow is the declarative way to specify a recommendation.

A recommendation workflow is a query in extended SQL using:

Traditional relational operators: Select, Project, Join |x|

And additional operators: Extend, Recommend |>, Blend

FlexRecs also enables us to parametrize the workflow; for instance:

in example 3 we used Moshe’s ID; we could, however, have marked the ID as a parameter and left it to the end user

Page 30:

System Architecture

Workflow Manager
• Define high-level workflow
• Define mapping to user interface

Workflow Parser
• Construct an expression tree (we’ll see such a tree in the next slide)

Recommendation Plan Generator
• Generates an execution plan composed of a sequence of SQL statements

Recommendation Generator
• Executes a plan

Page 31:
Page 32:

Performance

Reasonable performance when working with 10,000 students and 20,000 courses

What if in the movie example we had a database of 1,000,000 viewers and 100,000 movies?

How do we describe offline processes?

How do we handle data that is not stored in a relational database?

Page 33:

FlexRecs

Simple framework for defining recommendation workflows.

Page 34:

Recommend Operator: Comparison Functions Examples

Probability[A,B,R](t,s) = (number of customers who bought both products) / (number of customers who bought product S)

Sells := (customerID, ProductID)

What’s the probability of a customer buying product T given that he bought product S?

A = ProductID, B = customerID, R = Sells
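
The ratio above can be computed directly from the Sells relation. A minimal sketch, with `sells` as a list of (customerID, ProductID) pairs; the function name and the sample purchases are made up for illustration.

```python
def probability(sells, t, s):
    """Estimate P(customer bought t | customer bought s) from purchase pairs."""
    bought_s = {customer for customer, product in sells if product == s}
    bought_t = {customer for customer, product in sells if product == t}
    if not bought_s:
        return 0.0  # nobody bought s, so the ratio is undefined; return 0 by convention
    return len(bought_t & bought_s) / len(bought_s)

sells = [(1, "S"), (1, "T"), (2, "S"), (3, "T"), (4, "S"), (4, "T")]
p = probability(sells, "T", "S")  # customers 1 and 4 out of 1, 2 and 4
```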

Page 35:

Recommend Operator: Comparison Functions Examples

Jaccard[A](t,s) = |t[A] ∩ s[A]| / |t[A] ∪ s[A]|

Animal(name,keywords,classification)

Tiger’s keywords: “Mammal carnivore endangered”

Dog’s keywords: “Mammal carnivore common”

Jaccard[keywords](Tiger,Dog): the Jaccard similarity is 2/4 = 0.5

Distance[A](t,s) = t[A] - s[A]

Page 36:

More Recommendation Workflows Examples

Example 4: Using the blend operator we can now combine results from the two previous queries:

Page 37:

The Extend Operator

Extending a relation (Ri) by another relation (Rj)

For each tuple in Ri:
Perform a join with Rj
Project only the attributes from Rj on the created tuples
Add the created tuples to Ri as a new attribute

Where A is the set of attributes of Rj

Page 38:

Complex Comparison Function Example

Probability[A,B,R](t,s) = (number of customers who bought both products) / (number of customers who bought product S)

Sells := (customerID, ProductID)

What’s the probability of a customer buying product T given that he bought product S?

A = ProductID, B = customerID, R = Sells