Overview of recommender system

Overview of Recommender System STANLEY WANG SOLUTION ARCHITECT, TECH LEAD @SWANG68 http://www.linkedin.com/in/stanley-wang-a2b143b

Transcript of Overview of recommender system

Page 1: Overview of recommender system

Overview of Recommender System

STANLEY WANG SOLUTION ARCHITECT, TECH LEAD @SWANG68 http://www.linkedin.com/in/stanley-wang-a2b143b

Page 2: Overview of recommender system

Recommender System

Page 3: Overview of recommender system

What is a Recommender System?

Page 4: Overview of recommender system

Feedback to Recommender System

Page 5: Overview of recommender system

Which Areas Can Recommender Systems Benefit?

Page 6: Overview of recommender system

Typical Architecture of Recommender System

Page 7: Overview of recommender system

Recommender System Types

• Collaborative Filtering System – aggregates consumers' preferences and makes recommendations to other users based on similarity in behavioral patterns

• Content-based System – uses supervised machine learning to induce a classifier that discriminates between items that are interesting and uninteresting to the user

• Knowledge-based System – uses knowledge about users and products to reason about what meets the user's requirements, using discrimination trees, decision support tools, and case-based reasoning

Page 8: Overview of recommender system

Paradigms of Recommender: Collaborative Filtering

Collaborative: "Tell me what's popular among my peers"

Page 9: Overview of recommender system

Paradigms of Recommender: Content Based

Content-based: "Show me more of what I've liked"

Page 10: Overview of recommender system

Paradigms of Recommender: Knowledge Based

Knowledge-based: "Tell me what fits based on my needs"

Page 11: Overview of recommender system

Paradigms of Recommender: Hybrid

Hybrid: combinations of various inputs and/or composition of different mechanisms

Page 12: Overview of recommender system

Technology Evolution of Recommender

Page 13: Overview of recommender system

How Does Collaborative Filtering Work?

"People who liked this also liked ..."

User-to-User

Recommendations are made by finding users with similar tastes. Jane and Tim both liked Item 2 and disliked Item 3; it seems they might have similar taste, which suggests that in general Jane agrees with Tim. This makes Item 1 a good recommendation for Tim. This approach does not scale well for millions of users.

Item-to-Item

Recommendations are made by finding items that have similar appeal to many users. Tom and Sandra are two users who liked both Item 1 and Item 4. That suggests that, in general, people who liked Item 4 will also like Item 1, so Item 1 will be recommended to Tim. This approach is scalable to millions of users and millions of items.

Page 14: Overview of recommender system

Collaborative Filtering

• The most prominent approach to generate recommendations

o used by large, commercial e-commerce sites

o well-understood, various algorithms and variations exist

o applicable in many domains (books, movies, songs, ...)

• Approach

o use the "wisdom of the crowd" to recommend items

• Basic assumption and idea

o Users give ratings to catalog items (implicitly or explicitly)

o Customers who had similar tastes in the past will have similar tastes in the future

Page 15: Overview of recommender system

Collaborative Filtering Toolkit

• Implemented Big Graph ML Algorithms, including:

o Alternating Least Squares (ALS)

o Sparse-ALS

o SVD++

o LibFM (factorization machines)

o GenSGD

o Item-similarity based methods

Page 16: Overview of recommender system

User-based Nearest-Neighbor CF

• The basic technique:

o Given an "active user" (Alice) and an item I not yet seen by Alice

o The goal is to estimate Alice's rating for this item, e.g., by

• finding a set of users (peers) who liked the same items as Alice in the past and who have rated item I

• using, e.g., the average of their ratings to predict whether Alice will like item I

• doing this for all items Alice has not seen and recommending the best-rated

Page 17: Overview of recommender system

User-based Nearest-Neighbor CF

• Some first questions

o How do we measure similarity?

o How many neighbors should we consider?

o How do we generate a prediction from the neighbors' ratings?

Item1 Item2 Item3 Item4 Item5

Alice 5 3 4 4 ?

User1 3 1 2 3 3

User2 4 3 4 3 5

User3 3 3 1 5 4

User4 1 5 5 2 1

Page 18: Overview of recommender system

Commonly Used Similarity Measure

Page 19: Overview of recommender system

k-Nearest Neighbour (kNN) Methods

• unseen item that needs to be classified

• positively rated items

• negatively rated items

• k = 3: negative

• k = 5: positive

A user-based kNN collaborative filtering method consists of two primary phases:

• the neighborhood formation phase

• the recommendation phase

Page 20: Overview of recommender system

Measuring user similarity

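• A popular similarity measure in user-based CF: Pearson correlation

$$sim(a,b) = \frac{\sum_{p \in P} (r_{a,p} - \bar{r}_a)(r_{b,p} - \bar{r}_b)}{\sqrt{\sum_{p \in P} (r_{a,p} - \bar{r}_a)^2}\,\sqrt{\sum_{p \in P} (r_{b,p} - \bar{r}_b)^2}}$$

o a, b: users

o r_{a,p}: rating of user a for item p

o P: set of items rated by both a and b

o \bar{r}_a, \bar{r}_b: the users' average ratings

o Possible similarity values lie between -1 and 1

Item1 Item2 Item3 Item4 Item5

Alice 5 3 4 4 ?

User1 3 1 2 3 3

User2 4 3 4 3 5

User3 3 3 1 5 4

User4 1 5 5 2 1

sim(Alice, User1) = 0.85, sim(Alice, User2) = 0.70, sim(Alice, User4) = -0.79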
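The similarity values on this slide can be checked with a short sketch. It assumes that each user's average is taken over the items both users have rated (the slide does not say which average is used); the names `ratings` and `pearson` are illustrative only.

```python
from math import sqrt

# Ratings from the table above; None marks Alice's unknown rating for Item5.
ratings = {
    "Alice": [5, 3, 4, 4, None],
    "User1": [3, 1, 2, 3, 3],
    "User2": [4, 3, 4, 3, 5],
    "User3": [3, 3, 1, 5, 4],
    "User4": [1, 5, 5, 2, 1],
}

def pearson(a, b):
    """Pearson correlation over the items rated by both users a and b."""
    common = [(x, y) for x, y in zip(ratings[a], ratings[b])
              if x is not None and y is not None]
    # Assumption: averages are computed over the co-rated items only.
    mean_a = sum(x for x, _ in common) / len(common)
    mean_b = sum(y for _, y in common) / len(common)
    num = sum((x - mean_a) * (y - mean_b) for x, y in common)
    den = (sqrt(sum((x - mean_a) ** 2 for x, _ in common))
           * sqrt(sum((y - mean_b) ** 2 for _, y in common)))
    return num / den if den else 0.0

for user in ("User1", "User2", "User3", "User4"):
    print(user, round(pearson("Alice", user), 2))
```

Under that assumption the results agree with the slide's values for User1, User2 and User4 up to rounding; User3 comes out uncorrelated with Alice.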

Page 21: Overview of recommender system

Making predictions

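• A common prediction function:

$$pred(a,p) = \bar{r}_a + \frac{\sum_{b \in N} sim(a,b)\,(r_{b,p} - \bar{r}_b)}{\sum_{b \in N} sim(a,b)}$$

where N is the set of neighbors of a who have rated item p

• Calculate whether the neighbors' ratings for the unseen item i are higher or lower than their average

• Combine the rating differences – use the similarity as a weight

• Add/subtract the neighbors' bias from the active user's average and use this as a prediction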
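The three steps above can be sketched for Alice and Item5. This is a minimal illustration, not a definitive implementation: it takes the two positive similarity values reported on the previous slide (0.85 and 0.70, which correspond to User1 and User2 when recomputed from the rating table), and it assumes each neighbor's average is taken over all of that neighbor's ratings. The function name `predict` is hypothetical.

```python
def predict(active_mean, neighbors):
    """Similarity-weighted prediction.

    neighbors: list of (similarity, neighbor's rating for the item,
    neighbor's average rating). Only positively correlated neighbors
    are assumed here, so the similarities can be summed directly.
    """
    num = sum(sim * (rating - mean) for sim, rating, mean in neighbors)
    den = sum(sim for sim, _, _ in neighbors)
    return active_mean + num / den if den else active_mean

alice_mean = (5 + 3 + 4 + 4) / 4  # Alice's average over her four known ratings

neighbors = [
    (0.85, 3, (3 + 1 + 2 + 3 + 3) / 5),  # User1: rated Item5 with 3
    (0.70, 5, (4 + 3 + 4 + 3 + 5) / 5),  # User2: rated Item5 with 5
]

prediction = predict(alice_mean, neighbors)
print(round(prediction, 2))
```

Both neighbors rated Item5 above their own average, so the predicted rating lands above Alice's average of 4.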

Page 22: Overview of recommender system

Item-based Collaborative Filtering

• Basic idea:

o Use the similarity between items (and not users) to make predictions

• Example:

o Look for items that are similar to Item5

o Take Alice's ratings for these items to predict the rating for Item5

Item1 Item2 Item3 Item4 Item5

Alice 5 3 4 4 ?

User1 3 1 2 3 3

User2 4 3 4 3 5

User3 3 3 1 5 4

User4 1 5 5 2 1

Page 23: Overview of recommender system

Pre-processing for Item-based CF

• Item-based filtering does not solve the scalability problem itself

• Pre-processing approach by Amazon.com in 2003

o Calculate all pair-wise item similarities in advance

o The neighborhood to be used at run-time is typically rather small, because only items that the user has rated are taken into account

o Item similarities are supposed to be more stable than user similarities

• Memory requirements

o Up to N² pair-wise similarities to be memorized (N = number of items) in theory

o In practice, this is significantly lower (items with no co-ratings)

o Further reductions possible

• Minimum threshold for co-ratings (items that are rated by at least n users)

• Limit the size of the neighborhood (might affect recommendation accuracy)

Page 24: Overview of recommender system

Similarity Measure for Item based CF

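• Cosine similarity

o Produces better results in item-to-item filtering for some datasets; there is no consistent picture in the literature

o Ratings are seen as vectors in n-dimensional space

o Similarity is calculated based on the angle between the vectors

$$sim(\vec{a},\vec{b}) = \frac{\vec{a} \cdot \vec{b}}{|\vec{a}|\,|\vec{b}|}$$

• Adjusted cosine similarity

o takes average user ratings into account and transforms the original ratings

o U: set of users who have rated both items a and b

$$sim(a,b) = \frac{\sum_{u \in U} (r_{u,a} - \bar{r}_u)(r_{u,b} - \bar{r}_u)}{\sqrt{\sum_{u \in U} (r_{u,a} - \bar{r}_u)^2}\,\sqrt{\sum_{u \in U} (r_{u,b} - \bar{r}_u)^2}}$$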
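As a rough illustration of adjusted cosine similarity, the sketch below compares Item5 with Item1 using the four users from the running example who rated both items (Alice is left out because her Item5 rating is unknown). It assumes each user's average is taken over all of that user's ratings; `adjusted_cosine` is an illustrative name.

```python
from math import sqrt

# Users who have rated every item in the running example.
ratings = {
    "User1": [3, 1, 2, 3, 3],
    "User2": [4, 3, 4, 3, 5],
    "User3": [3, 3, 1, 5, 4],
    "User4": [1, 5, 5, 2, 1],
}

def adjusted_cosine(a, b):
    """Adjusted cosine similarity between item columns a and b (0-based)."""
    num = norm_a = norm_b = 0.0
    for user_ratings in ratings.values():
        # Subtract the user's own average so that generous and strict
        # raters become comparable.
        mean = sum(user_ratings) / len(user_ratings)
        da = user_ratings[a] - mean
        db = user_ratings[b] - mean
        num += da * db
        norm_a += da * da
        norm_b += db * db
    den = sqrt(norm_a) * sqrt(norm_b)
    return num / den if den else 0.0

sim_item5_item1 = adjusted_cosine(4, 0)
print(round(sim_item5_item1, 2))
```

On this data Item1 turns out to be the item most similar to Item5, followed by Item4; Item2 and Item3 are negatively correlated with it.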

Page 25: Overview of recommender system

Recommendation for Item-based CF

• After computing the similarity between items, we select the set of k items most similar to the target item and generate a predicted value of user u's rating

$$p_{u,i} = \frac{\sum_{j \in J} r_{u,j} \cdot sim(i,j)}{\sum_{j \in J} sim(i,j)}$$

where J is the set of k similar items
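A minimal sketch of this weighted average for Alice and Item5 follows. The similarity values are not given in the slides; they are adjusted-cosine values recomputed from the example rating table and rounded to two decimals, so treat them as an assumption. The function name `predict_item_based` and the choice k = 2 are illustrative.

```python
def predict_item_based(user_ratings, sims_to_target, k=2):
    """Weighted average of the user's ratings on the k items most similar
    to the target item, using the similarities as weights."""
    rated = [(sim, user_ratings[item])
             for item, sim in sims_to_target.items()
             if item in user_ratings]
    top_k = sorted(rated, reverse=True)[:k]  # highest similarity first
    den = sum(sim for sim, _ in top_k)
    if not den:
        return None
    return sum(sim * rating for sim, rating in top_k) / den

alice = {"Item1": 5, "Item2": 3, "Item3": 4, "Item4": 4}

# Assumed values: adjusted cosine similarity of each item to Item5,
# recomputed from the example table and rounded.
sims_to_item5 = {"Item1": 0.80, "Item2": -0.91, "Item3": -0.76, "Item4": 0.43}

prediction = predict_item_based(alice, sims_to_item5, k=2)
print(round(prediction, 2))
```

With k = 2 only Item1 and Item4 contribute, so the prediction sits between Alice's ratings for those two items and closer to her rating for Item1.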

Page 26: Overview of recommender system

What is Latent Factor Model?

Latent variables are introduced to account for the underlying reasons for a user's choice. Once the connections between the latent variables and the observed variables (user, product, rating, etc.) have been estimated during training, recommendations can be made to users by computing their possible interactions with each product through the latent variables.
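One common way to estimate such latent variables is matrix factorization trained with stochastic gradient descent. The sketch below is a minimal illustration under stated assumptions, not the method of any particular toolkit: it learns k latent factors per user and per item on the five-user rating table used in the collaborative filtering slides, and the hyperparameters (`k`, `lr`, `reg`, `epochs`) and function names are arbitrary illustrative choices.

```python
import random

def train_mf(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02,
             epochs=500, seed=0):
    """Learn user factors P and item factors Q so that the dot product
    of P[u] and Q[i] approximates the observed rating r."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(pf * qf for pf, qf in zip(P[u], Q[i]))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on the regularized squared error.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(ratings, P, Q):
    """Root mean squared error on the given (user, item, rating) triples."""
    se = sum((r - sum(pf * qf for pf, qf in zip(P[u], Q[i]))) ** 2
             for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5

# Rows: Alice, User1..User4; None marks Alice's unknown rating for Item5.
table = [
    [5, 3, 4, 4, None],
    [3, 1, 2, 3, 3],
    [4, 3, 4, 3, 5],
    [3, 3, 1, 5, 4],
    [1, 5, 5, 2, 1],
]
observed = [(u, i, r) for u, row in enumerate(table)
            for i, r in enumerate(row) if r is not None]

P0, Q0 = train_mf(observed, 5, 5, epochs=0)  # untrained starting point
P, Q = train_mf(observed, 5, 5)              # trained factors
print(rmse(observed, P0, Q0), rmse(observed, P, Q))

# Predicted rating for the missing cell (Alice, Item5).
alice_item5 = sum(pf * qf for pf, qf in zip(P[0], Q[4]))
```

The missing rating is predicted by combining Alice's learned factors with Item5's learned factors, which is the "interaction through the latent variables" described above.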

Page 27: Overview of recommender system

Matrix Factorization Approach

Page 28: Overview of recommender system

How does LSM Work?

Page 29: Overview of recommender system
Page 30: Overview of recommender system

Latent Factor Model Algorithm

Page 31: Overview of recommender system

LSM Algorithm: Alternating Least Squares

Page 32: Overview of recommender system

LSM Algorithm: Alternating Least Squares

Page 33: Overview of recommender system

LSM Algorithm: Stochastic Gradient Descent

Page 34: Overview of recommender system

Context-Based Recommender Systems Overview

The recommender system uses additional data about the context in which an item is consumed.

For example, in the case of a restaurant, the time or the location may be used to improve the recommendation compared with what could be achieved without this additional source of information.

A restaurant recommendation for a Saturday evening when you go with your spouse should differ from a restaurant recommendation for a workday afternoon when you go with co-workers.

Page 35: Overview of recommender system

Context-Based Recommender Systems

Motivating Examples

Recommend a vacation: winter vs. summer

Recommend a purchase (e-retailer): gift vs. for yourself

Recommend a movie: to a student who wants to watch it on Saturday night with his girlfriend in a movie theater

Page 36: Overview of recommender system

Context-Based Recommender Systems

Motivating Examples

Recommend music

The music that we like to hear is greatly affected by context, which can be thought of as a mixture of our feelings (mood) and the situation or location (the theme) we associate it with.

Listening to Bruce Springsteen's "Born in the U.S.A." while driving along the 101.

Listening to Mozart's "The Magic Flute" while walking in Salzburg.

Page 37: Overview of recommender system

Context-Based Recommender Systems

What do simple recommendation techniques ignore?

What is the user [...] when asking for a recommendation?

Where (and when) is the user?

What does the user want (e.g., to improve his knowledge or to really buy a product)?

Is the user alone or with other people?

Are there many products to choose from or only a few?

Plain recommendation technologies forget to take the user context into account.

Page 38: Overview of recommender system

Context-Based Recommender Systems

Major obstacles for contextual computing

Obtaining sufficient and reliable data describing the user context

Selecting the right information, i.e., the information relevant to a particular personalization task

Understanding the impact of contextual dimensions on the personalization process

Computationally modeling the contextual dimensions within a more classical recommendation technology

For instance: how to extend Collaborative Filtering to include contextual dimensions?

Page 39: Overview of recommender system

Context-Based Recommender Systems

Item Split - Intuition and Approach

Each item in the database is a candidate for splitting

Context defines all possible splits of an item's ratings vector

We test all the possible splits – we do not have many contextual features

We choose one split (using a single contextual feature) that maximizes an impurity measure and whose impurity is higher than a threshold
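The item-splitting procedure above can be sketched as follows. The slides do not name the impurity measure, so this sketch assumes Welch's t statistic between the ratings given inside and outside a context; the rating data, the feature names (`season`, `company`) and the function names are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def t_statistic(a, b):
    """Welch's t statistic between two rating samples (assumed impurity measure)."""
    if len(a) < 2 or len(b) < 2:
        return 0.0  # too few ratings on one side to judge the split
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return float("inf") if mean(a) != mean(b) else 0.0
    return abs(mean(a) - mean(b)) / se

def best_split(item_ratings, threshold):
    """item_ratings: list of (rating, context dict) for ONE item.
    Returns (feature, value, score) for the best split above the
    threshold, or None if the item should not be split."""
    candidates = sorted({(f, v) for _, ctx in item_ratings
                         for f, v in ctx.items()})
    best = None
    for feature, value in candidates:
        inside = [r for r, ctx in item_ratings if ctx.get(feature) == value]
        outside = [r for r, ctx in item_ratings if ctx.get(feature) != value]
        score = t_statistic(inside, outside)
        if best is None or score > best[2]:
            best = (feature, value, score)
    return best if best is not None and best[2] > threshold else None

# Hypothetical ratings of a single item under two contextual features.
item_ratings = [
    (5, {"season": "summer", "company": "alone"}),
    (4, {"season": "summer", "company": "friends"}),
    (5, {"season": "summer", "company": "alone"}),
    (5, {"season": "summer", "company": "friends"}),
    (2, {"season": "winter", "company": "alone"}),
    (1, {"season": "winter", "company": "friends"}),
    (2, {"season": "winter", "company": "alone"}),
    (3, {"season": "winter", "company": "friends"}),
]

split = best_split(item_ratings, threshold=2.0)
print(split)
```

In this made-up data the ratings differ strongly by season and hardly at all by company, so the item would be split into a summer version and a non-summer version, each treated as a separate item by the underlying recommender.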

Page 40: Overview of recommender system


Page 41: Overview of recommender system

Context-Aware Splitting Approaches

Page 42: Overview of recommender system

Types of Context

Page 43: Overview of recommender system

Different Views of Context

Page 44: Overview of recommender system

Model of Context-Based Recommender Systems

Page 45: Overview of recommender system

Context-Based Pre Filtering Model

Page 46: Overview of recommender system

Context-Based Post Filtering Model

Page 47: Overview of recommender system

Context-Based Contextual Model

Page 48: Overview of recommender system

Trust-based Collaborative Filtering

Users tend to take advice from people they trust, such as trusted friends, who can be defined explicitly by the users or inferred from social networks.

[Figure: the active user, the active user's trusted friends, and the rating prediction]

Page 49: Overview of recommender system

Metrics of Trust-based Recommender

• Global Metrics: compute a single global trust value for every single user (reputation on the network)

• Pros:

o Based on the whole community opinion

• Cons:

o Trust is subjective (controversial users)

[Figure: trust graph over users a, b, c, d with weighted edges]

Page 50: Overview of recommender system

Metrics of Trust-based Recommender

• Local Metrics: predict (different) trust scores that are personalized from the point of view of every single user

• Pros:

o More accurate

o Attack resistant

• Cons:

o Ignoring the “wisdom of the crowd”

[Figure: trust graph over users a, b, c, d with weighted edges and one unknown trust value]

Page 51: Overview of recommender system

Content-Based Recommender System

• In content-based recommendation, the system tries to recommend items that match the User Profile

• The Profile is based on items the user has liked in the past or on explicit interests that the user defines

• A content-based recommender system matches the profile of the item to the user profile to decide on its relevance to the user

Page 52: Overview of recommender system

Example of Content-Based Recommender

[Figure: the books a user reads update the User Profile; the recommender system matches new books against the User Profile to produce a recommendation]

Page 53: Overview of recommender system

What is the “Content”?

• The genre is actually not part of the content of a book

• Most CB-recommendation methods originate from the Information Retrieval (IR) field:

o The item descriptions are usually automatically extracted (important words)

o Goal is to find and rank interesting text documents (news articles, web pages)

• Here are some examples:

o Classical IR-based methods based on keywords

o No expert recommendation knowledge involved

o User profiles (preferences) are learned rather than explicitly elicited

Page 54: Overview of recommender system

Content Representation

• Items stored in a database table

Page 55: Overview of recommender system

Content Representation

• Structured data

o Small number of attributes

o Each item is described by the same set of attributes

o Known set of values that the attributes may have

• Straightforward to work with

o User’s profile contains positive ratings for 1001, 1002, 1003

o Would the user be interested in, say, Oscar, French cuisine, table service?

• Unstructured data

o No attribute names with well-defined values

o Need to impose structure on free text before it can be used

o Natural language complexity: the same word with different meanings, different words with the same meaning

Page 56: Overview of recommender system

Term-Frequency - Inverse Document Frequency

• Simple keyword representation has its problems

o In particular when automatically extracted, because

• Not every word has similar importance

• Longer documents have a higher chance to have an overlap with the user profile

• Standard measure: TF-IDF

o Encodes text documents as weighted term vectors

o TF: measures how often a term appears (density in a document)

• Assuming that important terms appear more often

• Normalization has to be done in order to take document length into account

o IDF: aims to reduce the weight of terms that appear in all documents

Page 57: Overview of recommender system

TF - IDF Weighting

• Term frequency tf_{t,d} of a term t in a document d

$$tf_{t,d} = \frac{n_{t,d}}{\sum_k n_{k,d}}$$

• Inverse document frequency idf_t of a term t

$$idf_t = \log \frac{N}{df_t}$$

• TF*IDF weighting

$$w_{t,d} = tf_{t,d} \times idf_t$$
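A small sketch of this weighting scheme, reading the symbols in the usual way (an assumption, since the slide does not define them): the term's count in the document divided by the document's total term count gives TF, and the logarithm of the number of documents divided by the number of documents containing the term gives IDF. The three example documents and the function name `tf_idf` are made up for illustration.

```python
from collections import Counter
from math import log

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = sum(counts.values())  # normalizes for document length
        weights.append({term: (count / total) * log(n_docs / df[term])
                        for term, count in counts.items()})
    return weights

# Hypothetical item descriptions, tokenized on whitespace.
docs = [
    "the hobbit is a fantasy novel".split(),
    "the martian is a science fiction novel".split(),
    "the silmarillion is a fantasy collection".split(),
]

weights = tf_idf(docs)
for term, weight in sorted(weights[0].items(), key=lambda kv: -kv[1]):
    print(f"{term:10s} {weight:.3f}")
```

Terms that occur in every document, such as "the", get weight zero, while a term unique to one description gets the highest weight in that document.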

Page 58: Overview of recommender system

Example TF IDF Representation

Page 59: Overview of recommender system

User Profiles

• User profile consists of two main types of information

o A model of the user’s preferences, e.g., a function that for any item predicts the likelihood that the user is interested in that item

o User’s interaction history, e.g., items viewed by a user, items purchased by a user, search queries, etc.

• “Manual” recommending approaches

o Provide a “check box” interface that lets users construct their own profiles of interests

o A simple database matching process is used to find items that meet the specified criteria and recommend these to users

• Rule-based Recommendation

o The system has rules to recommend other products based on user history

o Rule to recommend the sequel to a book or movie to customers who purchased the previous item in the series

o Can capture common reasons for making recommendations