Recommender System
Read any good books lately?
Ramón y Cajal (Spain), 1906 Nobel Prize in Physiology or Medicine: "principal representative and advocate of modern neuroscience". Excuses he warned against: "the most important problems have already been solved", "excessive focus on applied science", "believing oneself to lack ability".
Outline Today
What: Recommender System
How: Collaborative Filtering (CF) Algorithm (User-based, Item-based, Model-based)
Evaluation of recommender systems
What is Recommender System?
The Problem
Classification
Retrieval
Are there more effective means?
Recommendation
This title is a textbook-style exposition on the topic, with its information organized very clearly into topics such as compression, indexing, and so forth. In addition to diagrams and example text transformations, the authors use "pseudo-code" to present algorithms in a language-independent manner wherever possible. They also supplement the reading with mg--their own implementation of the techniques. The mg C language source code is freely available on the Web.
Personalized Recommendation
Everyday Examples of Recommender Systems…
Bestseller lists Top 40 music lists The “recent returns” shelf at the library Many weblogs “Read any good books lately?” ....
Common insight: personal tastes are correlated. If Mary and Bob both like X, and Mary likes Y, then Bob is more likely to like Y, especially (perhaps) if Bob knows Mary.
Rec System: Applications
Ecommerce: product recommendations (Amazon)
Corporate Intranets: recommendation, finding domain experts, ...
Digital Libraries: finding pages/books people will like
Medical Applications: matching patients to doctors, clinical trials, ...
Customer Relationship Management: matching customer problems to internal experts
Recommender Systems
Given a set of users and items (items can be documents, products, other users, ...),
recommend items to a user based on:
attribute information of users and items (age, genre, price, ...)
the past behavior of this user and of other users (who has viewed/bought/liked what?)
to help people make decisions and maintain awareness.
Recommender systems are software applications that aim to support users in their decision-making while interacting with large information spaces.
Recommender systems help overcome the information overload problem by exposing users to the most interesting items, and by offering novelty, surprise, and relevance.
The Web, they say, is leaving the era of search and entering one of discovery. What's the difference? Search is what you do when you're looking for something. Discovery is when something wonderful that you didn't know existed, or didn't know how to ask for, finds you.
Collaborative Filtering Algorithm
Ad Hoc Retrieval and Filtering
Ad hoc retrieval (the document collection stays fixed; the queries change)
[Figure: a "fixed-size" collection queried by an incoming stream of queries Q1-Q5]
Ad Hoc Retrieval and Filtering
Filtering (the user's information need stays fixed; the documents stream in)
[Figure: a stream of documents matched against User 1's and User 2's profiles, producing docs filtered for each user]
Inputs - more detail
Explicit role/domain/content info: content/attributes of documents, document taxonomies, role in an enterprise, interest profiles
Past transactions/behavior info from users: which docs were viewed, browsing history, searches issued, which products were purchased, pages bookmarked, explicit ratings (movies, books, ...)
The Recommendation Space
Users and items form a large, extremely sparse space.
[Figure: users and items connected by three kinds of links: item-item links (derived from similar attributes, similar content, explicit cross references), user-user links (derived from similar attributes, explicit connections), and observed preferences (ratings, purchases, page views, laundry lists, play lists)]
Definitions
A recommender system provides recommendations/predictions/opinions on items to users.
Rule-based systems use manual rules to do this.
An item similarity/clustering system uses item links.
A classic collaborative filtering system uses links between users and items.
Commonly one has hybrid systems that use all three kinds of links.
Link types
User attributes-based recommendation: Male, 18-35 → recommend The Matrix
Item attributes-based (content similarity): you liked The Matrix → recommend The Matrix Reloaded
Collaborative filtering: people with interests like yours also liked Forrest Gump
Example - behavior only
[Figure: users U1, U2 and the docs d1, d2, d3 they viewed]
U1 viewed d1, d2, d3.
U2 views d1, d2.
Recommend d3 to U2?
Expert finding - simple example
Recommend U1 to U2 as someone to talk to?
[Figure: U1 and U2 both linked to docs d1, d2, d3]
Simplest Algorithm: Neighbors Voting
U viewed d1, d2, d5. Look at who else viewed d1, d2, or d5. Recommend to U the doc that is most "popular" among those users.
[Figure: user U and neighbors V, W linked through docs d1, d2, d5]
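The neighbors-voting idea above can be sketched in a few lines; the function name and the toy viewing data are illustrative, not from the slides:

```python
# Minimal sketch of "neighbors voting": users who share a viewed doc with the
# target each vote for their other docs; recommend the most-voted doc.
def recommend_by_voting(viewed, target):
    """viewed: dict user -> set of docs; target: the user to recommend for."""
    seen = viewed[target]
    votes = {}
    for user, docs in viewed.items():
        if user == target or not (docs & seen):
            continue  # only users who share at least one viewed doc get a vote
        for d in docs - seen:
            votes[d] = votes.get(d, 0) + 1
    # recommend the doc most "popular" among those neighbors
    return max(votes, key=votes.get) if votes else None

views = {"U": {"d1", "d2", "d5"}, "V": {"d1", "d2", "d3"}, "W": {"d2", "d5", "d3"}}
print(recommend_by_voting(views, "U"))  # d3 (two neighbor votes)
```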
Simple algorithm - shortcoming
It treats all other users equally.
In fact, past behavior data shows that different users resemble U to different degrees.
[Figure: user U and neighbors V, W linked through docs d1, d2, d5]
How can we improve? How do we weight each user's importance to U? User-based nearest neighbors.
Matrix View
Users-Items matrix A: Aij = 1 if user i viewed item j, 0 otherwise.

Aij     Airplane  Matrix  Room with a View  ...  Hidalgo
Joe        1        1            1          ...     1
Carol      1        0            1          ...     0
...       ...      ...          ...         ...    ...
Kumar      1        1            0          ...     1

Number of items co-viewed by each pair of users = AAᵗ.
Voting Algorithm
Take the row vector ri of AAᵗ: its jth entry is the number of items viewed by both user i and user j.
ri·A is a vector whose kth entry gives a weighted vote count to item k.
Recommend the items with the highest vote counts.
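The matrix form of the voting algorithm can be sketched with NumPy; the toy matrix encodes the U/V/W example from the earlier slide (the data is illustrative):

```python
import numpy as np

# A[i, j] = 1 if user i viewed item j; the voting algorithm as matrix products.
A = np.array([[1, 1, 0, 0, 1],   # user U viewed d1, d2, d5
              [1, 1, 1, 0, 0],   # user V
              [0, 1, 1, 0, 1]])  # user W

co_viewed = A @ A.T        # (i, j) entry: # items viewed by both user i and user j
r = co_viewed[0].copy()    # row for user U
r[0] = 0                   # ignore U's own self-overlap
votes = r @ A              # kth entry: weighted vote count for item k
votes[A[0] == 1] = 0       # don't re-recommend items U already viewed
print(int(np.argmax(votes)))  # index 2, i.e. d3
```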
Add Rating to Algorithm
User i gives a real-valued rating Vik for item k.
Each user i has a ratings vector vi (sparse, with many missing values).
For each pair of users i, j, compute a similarity measure wij of how much the pair agrees.
Predict user i's utility for item k: as in the voting algorithm, sum over user i's nearest neighbors j: Σj wij Vjk.
Recommend item k to user i by this value.

Vik     Airplane  Matrix  Room with a View  ...  Hidalgo
Joe        9        7            2          ...     7
Carol      8        ?            9          ...     ?
...       ...      ...          ...         ...    ...
Kumar      9        3            ?          ...     6
Similarity Measure
Cosine similarity (from IR): w(a, b) = (va · vb) / (||va|| ||vb||)
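A minimal sketch of cosine similarity between two users' rating vectors, using the Joe/Kumar rows of the table above (Kumar's unrated cell is set to 0 here for simplicity):

```python
import numpy as np

# Cosine similarity of two rating vectors (dot product over product of norms).
def cos_sim(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

joe   = np.array([9.0, 7.0, 2.0, 7.0])
kumar = np.array([9.0, 3.0, 0.0, 6.0])   # unrated item treated as 0
print(round(cos_sim(joe, kumar), 3))     # ≈ 0.948
```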
Real data problems
Users have their own rating bias:

Vik     Airplane  Matrix  Room with a View  ...  Hidalgo
Joe       50        10          40          ...    40
Carol    100        ?           80          ...     ?
...       ...      ...          ...         ...    ...
Kumar     95        85           ?          ...    75
Similarity Measure
Correlation between two random variables X and Y:
mean: μX = E[X]
standard deviation: σX = sqrt(E[(X − μX)²])
Pearson's correlation ρ(X, Y) = E[(X − μX)(Y − μY)] / (σX σY), indicating the degree of linear dependence between the variables.
Discussion on Pearson Correlation
Significance weighting: does it matter whether two users have co-rated only a few items (on which they may agree by chance) or whether there are many items on which they agree?
Inverse user frequency / variance weighting factor: an agreement by two users on a more controversial item has more "value" than an agreement on a generally liked item.
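A sketch of Pearson correlation restricted to co-rated items; the Joe/Carol rows are taken from the rating-bias table above, with `None` standing in for unrated cells:

```python
import numpy as np

# Pearson correlation over the items two users have both rated (None = unrated).
def pearson(ra, rb):
    common = [(x, y) for x, y in zip(ra, rb) if x is not None and y is not None]
    if len(common) < 2:
        return 0.0  # not enough co-rated items to correlate
    a = np.array([x for x, _ in common], dtype=float)
    b = np.array([y for _, y in common], dtype=float)
    da, db = a - a.mean(), b - b.mean()
    denom = np.sqrt((da ** 2).sum() * (db ** 2).sum())
    return float((da * db).sum() / denom) if denom else 0.0

joe   = [50, 10, 40, 40]
carol = [100, None, 80, None]
print(pearson(joe, carol))  # 1.0: same taste despite Carol's higher rating scale
```

Mean-centering is what makes the two users' different rating scales cancel out.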
Neighborhood Selection
Define a specific minimum threshold of user similarity
Limit the size to a fixed number k 20 to 50 neighbors seems reasonable
Voting Algorithm - implementation issues
Computational complexity? Computing the user similarities w(a, i) is a matrix multiply; then find the k nearest neighbors. All rating data is held in memory (a memory-based algorithm), which raises a scalability problem.
Does pre-computation of the w matrix work?
Let v(i, j) be the vote of user i on item j, and I_i the set of items user i has voted on. The mean vote for user i is

    v̄_i = (1 / |I_i|) · Σ_{j∈I_i} v(i, j)

The user u, v similarity (Pearson correlation over co-voted items) is

    w(u, v) = Σ_j (v(u, j) − v̄_u)(v(v, j) − v̄_v) / sqrt( Σ_j (v(u, j) − v̄_u)² · Σ_j (v(v, j) − v̄_v)² )

Mean-centering avoids overestimating the similarity of users who happen to have rated a few items identically.
User-based Nearest Neighbor Algorithm
Choose the set V of user u's nearest neighbors, then compute u's predicted vote for item j as

    p(u, j) = v̄_u + κ · Σ_{v∈V} w(u, v) · (v(v, j) − v̄_v)

where κ normalizes the similarity weights.
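A minimal user-based nearest-neighbor predictor in the common mean-centered form (the function name, toy ratings, and similarity values are illustrative, not from the slides):

```python
import numpy as np

# pred(u, j) = mean(u) + sum_v w(u,v) * (r[v][j] - mean(v)) / sum_v |w(u,v)|
def predict(ratings, sims, u, j, k=2):
    """ratings: dict user -> {item: rating}; sims: dict (u, v) -> similarity."""
    mean_u = np.mean(list(ratings[u].values()))
    neighbors = [v for v in ratings if v != u and j in ratings[v]]
    neighbors.sort(key=lambda v: -sims[(u, v)])   # most similar first
    top = neighbors[:k]
    num = sum(sims[(u, v)] * (ratings[v][j] - np.mean(list(ratings[v].values())))
              for v in top)
    den = sum(abs(sims[(u, v)]) for v in top)
    return float(mean_u + num / den) if den else float(mean_u)

ratings = {"u": {"a": 4, "b": 3},
           "v": {"a": 5, "b": 2, "c": 5},
           "w": {"a": 3, "b": 4, "c": 2}}
sims = {("u", "v"): 0.8, ("u", "w"): 0.4}
print(round(predict(ratings, sims, "u", "c"), 3))  # ≈ 3.833
```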
Item-based vs. User-based
Amazon online shop (2003): 29 million users and millions of catalog items; real-time prediction with user-based methods is infeasible at that scale.
Pre-computation is much more stable for item similarity than for user similarity.
Item-based Algorithm
Let U be the set of users that rated both items a and b, and compute the similarity sim(a, b) over the ratings of the users in U.
Predict the rating of user u for a product p as

    pred(u, p) = Σ_{i∈ratedItems(u)} sim(i, p) · r(u, i) / Σ_{i∈ratedItems(u)} sim(i, p)

also limited to the k nearest neighbor items.
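The item-based prediction above can be sketched as follows; the function name, the item similarities, and the toy ratings are illustrative assumptions:

```python
# pred(u, p) = sum_i sim(i, p) * r(u, i) / sum_i sim(i, p),
# where i ranges over the items user u has rated.
def predict_item_based(user_ratings, item_sims, p):
    """user_ratings: {item: rating} for one user; item_sims: {(i, p): similarity}."""
    num = den = 0.0
    for i, r in user_ratings.items():
        s = item_sims.get((i, p), 0.0)
        if s > 0:                 # only positively similar neighbor items contribute
            num += s * r
            den += s
    return num / den if den else None

ratings = {"matrix": 5.0, "hidalgo": 3.0}
sims = {("matrix", "matrix_reloaded"): 0.9, ("hidalgo", "matrix_reloaded"): 0.1}
print(round(predict_item_based(ratings, sims, "matrix_reloaded"), 2))  # 4.8
```

The item similarities here would come from the pre-computed, relatively stable item-item table the slide mentions.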
Model-based Algorithm
The item-based algorithm is still memory-based: the original rating database is held in memory and used directly for generating recommendations.
In a model-based approach, only a precomputed or "learned" model is required to make predictions at runtime, e.g. matrix factorization / latent factor models.
Matrix factorization
LSI/SVD: dimensionality reduction and noise removal.
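A low-rank reconstruction via truncated SVD can be sketched with NumPy. The toy matrix and the choice k = 2 are illustrative; unrated cells are crudely set to 0 here, whereas real systems handle missing values more carefully:

```python
import numpy as np

# Low-rank approximation of a ratings matrix via truncated SVD.
R = np.array([[5.0, 3.0, 0.0, 1.0],
              [4.0, 0.0, 0.0, 1.0],
              [1.0, 1.0, 0.0, 5.0],
              [1.0, 0.0, 0.0, 4.0]])

U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 2                                      # keep the top-k latent factors
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print(np.round(R_hat, 2))                  # dense "denoised" scores for all cells
```

The reconstruction fills every cell, so it can score items the user never rated, which is exactly what the runtime model needs.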
Challenges of Nearest-Neighbor CF
What is the optimal weight calculation to use? It requires fine tuning of the weighting algorithm for the particular data set.
What do we do when the target user has not voted enough to provide a reliable set of nearest neighbors?
One approach: use default votes (popular items) to populate the matrix for items that neither the target user nor the nearest neighbors have voted on.
A different approach: model-based prediction using Dirichlet priors to smooth the votes.
Other factors include relative vote counts for all items between users, thresholding, and clustering (see Sarwar, 2000).
Summary of Advantages of Pure CF
No expensive and error-prone user attributes or item attributes are needed.
Incorporates quality and taste: we want not just things that are similar, but things that are similar and good.
Works on any rate-able item; one model is applicable to many content domains.
Users understand it: it's rather like asking your friends' opinions.
Evaluation
Netflix Prize
Netflix: an online DVD-rental company with a collection of 100,000 titles and over 10 million subscribers. They have over 55 million discs and ship 1.9 million a day, on average.
The competition provided a training set of over 100 million ratings that over 480,000 users gave to nearly 18,000 movies.
Submitted predictions are scored against the true grades in terms of root mean squared error (RMSE).
Netflix Prize
Grand prize of $1,000,000.
A trivial algorithm got an RMSE of 1.0540; Netflix's own system, Cinematch, got an RMSE of 0.9514 on the quiz data, a 9.6% improvement.
To win: improve 10% over Cinematch on the test set; a progress prize of $50,000 was granted every year for the best result so far.
By June 2007, over 20,000 teams from over 150 countries had registered for the competition.
On June 26, 2009 the team "BellKor's Pragmatic Chaos", a merger of teams "BellKor in BigChaos" and "Pragmatic Theory", achieved a 10.05% improvement over Cinematch (an RMSE of 0.8558).
Measuring collaborative filtering
How good are the predictions? How much previous opinion do we need? How do we motivate people to offer their opinions?
Measuring recommendations
Typically, machine learning methodology: get a dataset of opinions and mask "half" of them; train the system with the other half, then validate on the masked opinions. Studies vary the masked fraction. Compare various algorithms (correlation metrics).
Each record has the form <User, Item, Grade>.
Common Prediction Accuracy Metric
Mean absolute error (MAE):

    E = (1/N) · Σ_{i=1}^{N} |p_i − r_i|

Root mean square error (RMSE):

    E = sqrt( (1/N) · Σ_{i=1}^{N} (p_i − r_i)² )

where p_i is the predicted rating and r_i the true rating.
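Both metrics are one-liners; the toy prediction/rating vectors below are illustrative:

```python
import numpy as np

# MAE and RMSE as defined above (p = predicted ratings, r = true ratings).
def mae(p, r):
    return float(np.mean(np.abs(np.asarray(p) - np.asarray(r))))

def rmse(p, r):
    return float(np.sqrt(np.mean((np.asarray(p) - np.asarray(r)) ** 2)))

p, r = [4.0, 3.0, 5.0], [3.0, 3.0, 1.0]
print(mae(p, r))   # (1 + 0 + 4) / 3
print(rmse(p, r))  # sqrt((1 + 0 + 16) / 3)
```

Note how RMSE punishes the single large error (the 5-vs-1 miss) much harder than MAE does.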
McLaughlin & Herlocker 2004
Argues that the current well-known algorithms give a poor user experience. Nearest neighbor algorithms are the most frequently cited and most widely implemented CF algorithms, and are consistently rated the top-performing algorithms in a variety of publications.
But many of their top recommendations are terrible: these algorithms perform poorly where it matters most, in the recommendations shown to users.
Characteristics of MAE
Assumes that errors at all levels in the ranking have equal weight.
Works well for measuring how accurately the algorithm predicts the rating of a randomly selected item.
Seems inappropriate for the "Find Good Items" task: the limitations of the MAE metric have concealed the flaws of previous algorithms, since it looks at all predictions, not just the top predictions. What about precision?
Precision of top k
Concealed because past evaluation was done mainly on offline datasets, not with real users. Many unrated items exist but do not participate in the evaluation.
[Figure: test-data row "100 ? 80 ... ?" vs. prediction row "96 97 70 ... 95"; items the user never rated appear in the recommendation list but are not counted in precision. What are those predictions worth?]
Improve the Precision Measure
Precision at top k has wrongly been computed on the top k rated movies. Instead, treat not-rated as disliked (an underestimate); this captures that people pre-filter movies.
In the precision computation, non-rated items should be counted as non-relevant.
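The modified measure can be sketched as follows: every item in the top k that the user did not rate as relevant counts against precision, including never-rated items (the function name and the toy lists are illustrative):

```python
# Modified precision@k: items the user never rated count as non-relevant
# (the plain variant would skip them, inflating the score).
def modified_precision_at_k(recommended, rated_relevant, k):
    """recommended: ranked item list; rated_relevant: set of items rated relevant."""
    top = recommended[:k]
    return sum(1 for item in top if item in rated_relevant) / k

recs = ["m1", "m2", "m3", "m4", "m5"]   # m3, m5 were never rated by this user
relevant = {"m1", "m4"}                  # rated and liked
print(modified_precision_at_k(recs, relevant, 5))  # 0.4
```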
Novelty versus Trust
There is a trade-off.
High-confidence recommendations tend to be obvious and of low utility for the user; however, they build trust, and users like to see some recommendations that they know are right.
Recommendations with a high predicted rating but lower confidence have higher variability of error, but higher novelty and thus higher utility for the user.
McLaughlin and Herlocker argue that "very obscure" recommendations are often bad (e.g., hard to obtain).
Results from the SIGIR 2004 Paper
Much better at predicting top movies; the cost is that it tends to predict blockbuster movies often.
A serendipity/trust trade-off.
Modified Precision at Top-N
[Figure: modified precision at Top-N (Top 1, 5, 10, 15, 20) for the User-to-User, Item-Item, and Distribution algorithms; modified precision ranges from 0 to about 0.3]
Recommender Systems
Early systems
GroupLens (U of Minn; Resnick, Iacovou, Bergstrom, Riedl) → the netPerceptions company; based on a nearest neighbor recommendation model.
Tapestry (Goldberg/Nichols/Oki/Terry).
Ringo (MIT Media Lab; Shardanand/Maes): experiments with variants of these algorithms.
Datasets @ GroupLens
MovieLens data sets: 100,000 ratings for 1682 movies by 943 users; 1 million ratings for 3900 movies by 6040 users.
Book-Crossing data set: 278,858 users (anonymized but with demographic information) providing 1,149,780 ratings (explicit/implicit) about 271,379 books.
Jester Joke data set: 4.1 million continuous ratings (-10.00 to +10.00) of 100 jokes from 73,496 users.
EachMovie data set: 2,811,983 ratings entered by 72,916 users for 1628 different movies.
Strands Recommendation Engine
Resources
GroupLens: http://citeseer.nj.nec.com/resnick94grouplens.html and http://www.grouplens.org (has available data sets, including MovieLens)
Breese et al., UAI 1998: http://research.microsoft.com/users/breese/cfalgs.html
McLaughlin and Herlocker, SIGIR 2004: http://portal.acm.org/citation.cfm?doid=1009050
CoFE "Collaborative Filtering Engine": open source Java reference implementations of many popular CF algorithms, http://eecs.oregonstate.edu/iis/CoFE
C/Matlab Toolkit for Collaborative Filtering: http://www.cs.cmu.edu/~lebanon/IR-lab.htm
Related Conferences
http://recsys.acm.org/
Books
Recommender Systems: An Introduction
This book offers an overview of approaches to developing state-of-the-art recommender systems. The authors present current algorithmic approaches for generating personalized buying proposals, such as collaborative and content-based filtering, as well as more interactive and knowledge-based approaches. They also discuss how to measure the effectiveness of recommender systems and illustrate the methods with practical case studies. The final chapters cover emerging topics such as recommender systems in the social web and consumer buying behavior theory.
Readings
[1] MIW Ch8
[2] R. M. Matthew and L. H. Jonathan, "A collaborative filtering algorithm and evaluation metric that accurately model the user experience," in Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval. Sheffield, United Kingdom: ACM, 2004.
Summary
Collaborative Filtering: the input data space, especially the User-Item links.
Nearest Neighbor CF: weighting schemes.
Evaluation of CF: the failure of MAE.
Thank You!
Q&A
Challenges of Nearest-Neighbor CF
Structure-based recommendations: recommendations based on similarities between items with positive votes (as opposed to the votes of other users).
The structure of item dependencies is modeled through dimensionality reduction via singular value decomposition (SVD), aka latent semantic indexing: approximate the set of row-vector votes as a linear combination of basis column-vectors, i.e. find the set of columns that least-squares minimizes the difference between the row estimations and their true values.
Then perform nearest-neighbor calculations to project predictions for all items.
GroupLens Collaborative Filtering Scheme
Prediction for active user a on item q (a weighted average of preferences):

    p(a, q) = v̄_a + κ · Σ_{i=1}^{n} w(a, i) · z(i, q)

Similarity weight between the active user a and user i:

    w(a, i) = Σ_k z(a, k) · z(i, k)

z-score of user i's rating for item q:

    z(i, q) = (v(i, q) − v̄_i) / σ_i

where v(i, q) is the rating of user i on item q, and the mean vote for user i over the set I_i of items i has rated is

    v̄_i = (1 / |I_i|) · Σ_{j∈I_i} v(i, j)
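The GroupLens-style z-score scheme can be sketched with NumPy; the toy fully-observed ratings matrix and the choice of normalizer κ are illustrative assumptions:

```python
import numpy as np

# GroupLens-style prediction: z-normalize each user's ratings, weight neighbors
# by the inner product of their z-score vectors, then de-normalize back.
V = np.array([[5.0, 3.0, 4.0],    # rows: users, cols: items (toy data)
              [4.0, 2.0, 3.0],
              [1.0, 5.0, 2.0]])
means = V.mean(axis=1, keepdims=True)
stds = V.std(axis=1, keepdims=True)
Z = (V - means) / stds                 # z(i, q) = (v(i, q) - mean_i) / std_i

a, q = 0, 2                            # predict for active user a on item q
w = Z @ Z[a]                           # w(a, i) = sum_k z(a, k) * z(i, k)
w[a] = 0.0                             # exclude the active user himself
kappa = 1.0 / np.abs(w).sum()          # normalizing factor
p_aq = means[a, 0] + kappa * (w * Z[:, q]).sum()
print(round(float(p_aq), 3))
```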
Nearest-Neighbor CF
Basic principle: use a user's vote history to predict future votes/recommendations based on "nearest neighbors".
A typical normalized prediction scheme: the goal is to predict the vote for item j based on other users, weighted towards those whose past votes are similar to the target user a's.