Nicholas Ruozzi University of Texas at Dallas
Transcript of Nicholas Ruozzi University of Texas at Dallas
![Page 1: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/1.jpg)
Collaborative Filtering
Nicholas RuozziUniversity of Texas at Dallas
based on the slides of Alex Smola & Narges Razavian
![Page 2: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/2.jpg)
Collaborative Filtering
β’ Combining information among collaborating entities to make recommendations and predictions
β’ Can be viewed as a supervised learning problem (with some caveats)
β’ Because of its many, many applications, it gets a special name
2
![Page 3: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/3.jpg)
Examples
β’ Movie/TV recommendation (Netflix, Hulu, iTunes)
β’ Product recommendation (Amazon)
β’ Social recommendation (Facebook)
β’ News content recommendation (Yahoo)
β’ Priority inbox & spam filtering (Google)
β’ Online dating (OK Cupid)
3
![Page 4: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/4.jpg)
Netflix Movie Recommendation
user movie rating
1 14 3
1 200 4
1 315 1
2 15 5
2 136 1
3 235 3
4 79 3
4
user movie rating
1 50 ?
1 28 ?
2 94 ?
2 32 ?
3 11 ?
4 99 ?
4 54 ?
Training Data Test Data
![Page 5: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/5.jpg)
Recommender Systems
β’ Content-based recommendations
β’ Recommendations based on a user profile (specific interests) or previously consumed content
β’ Collaborative filtering
β’ Recommendations based on the content preferences of similar users
β’ Hybrid approaches
5
![Page 6: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/6.jpg)
Collaborative Filtering
β’ Widely-used recommendation approaches:
β’ ππ-nearest neighbor methods
β’ Matrix factorization based methods
β’ Predict the utility of items for a user based on the items previously rated by other like-minded users
6
![Page 7: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/7.jpg)
Collaborative Filtering
β’ Make recommendations based on user/item similarities
β’ User similarity
β’ Works well if number of items is much smaller than the number of users
β’ Works well if the items change frequently
β’ Item similarity (recommend new items that were also liked by the same users)
β’ Works well if the number of users is small
7
![Page 8: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/8.jpg)
ππ-Nearest Neighbor
β’ Create a similarity matrix for pairs of users (or items)
β’ Use ππ-NN to find the ππ closest users to a target user
β’ Use the ratings of the ππ nearest neighbors to make predictions
8
![Page 9: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/9.jpg)
User-User Similarity
β’ Let πππ’π’,ππ be the rating of the πππ‘π‘π‘ item under user π’π’, οΏ½πππ’π’ be the average rating of user π’π’, and ππππππ(π’π’, π£π£) be the set of items rated by both user π’π’ and user π£π£
β’ One notion of similarity between user π’π’ and user π£π£ is given by Pearsonβs correlation coefficient
π π ππππ π’π’, π£π£ =βππβππππππ π’π’,π£π£ πππ’π’,ππ β οΏ½πππ’π’ πππ£π£,ππ β οΏ½πππ£π£
βππβππππππ π’π’,π£π£ πππ’π’,ππ β οΏ½πππ’π’2 βππβππππππ π’π’,π£π£ πππ£π£,ππ β οΏ½πππ£π£
2
9
![Page 10: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/10.jpg)
User-User Similarity
β’ Let πππ’π’,ππ be the rating of the πππ‘π‘π‘ item under user π’π’, οΏ½πππ’π’ be the average rating of user π’π’, and ππππππ(π’π’, π£π£) be the set of items rated by both user π’π’ and user π£π£
β’ One notion of similarity between user π’π’ and user π£π£ is given by Pearsonβs correlation coefficient
π π ππππ π’π’, π£π£ =βππβππππππ π’π’,π£π£ πππ’π’,ππ β οΏ½πππ’π’ πππ£π£,ππ β οΏ½πππ£π£
βππβππππππ π’π’,π£π£ πππ’π’,ππ β οΏ½πππ’π’2 βππβππππππ π’π’,π£π£ πππ£π£,ππ β οΏ½πππ£π£
2
10
Empirical covariance of ratings
![Page 11: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/11.jpg)
User-User Similarity
β’ Let πππ’π’,ππ be the rating of the πππ‘π‘π‘ item under user π’π’, οΏ½πππ’π’ be the average rating of user π’π’, and ππππππ(π’π’, π£π£) be the set of items rated by both user π’π’ and user π£π£
β’ One notion of similarity between user π’π’ and user π£π£ is given by Pearsonβs correlation coefficient
π π ππππ π’π’, π£π£ =βππβππππππ π’π’,π£π£ πππ’π’,ππ β οΏ½πππ’π’ πππ£π£,ππ β οΏ½πππ£π£
βππβππππππ π’π’,π£π£ πππ’π’,ππ β οΏ½πππ’π’2 βππβππππππ π’π’,π£π£ πππ£π£,ππ β οΏ½πππ£π£
2
11
Empirical standard deviation of user uβs ratings for common items
![Page 12: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/12.jpg)
User-User Similarity
β’ Let ππππ(π’π’) denote the set of ππ-NN to π’π’
β’ πππ’π’,ππ, the predicted rating for the πππ‘π‘π‘ item of user π’π’, is given by
πππ’π’,ππ = οΏ½πππ’π’ +βπ£π£βππππ(π’π’) π π ππππ π’π’, π£π£ β πππ£π£,ππ β οΏ½πππ£π£
βπ£π£βππππ(π’π’) π π ππππ(π’π’, π£π£)
β’ This is the average rating of user π’π’ plus the weighted average of the ratings of π’π’βs ππ nearest neighbors
12
![Page 13: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/13.jpg)
User-User Similarity
β’ Issue: could be expensive to find the ππ-NN if the number of users is very large
β’ Possible solutions?
13
![Page 14: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/14.jpg)
Item-Item Similarity
β’ Use Pearsonβs correlation coefficient to compute the similarity between pairs of items
β’ Let ππππππ(ππ, ππ) be the set of users common to items ππ and ππ
β’ The similarity between items ππ and ππ is given by
π π ππππ ππ, ππ =βπ’π’βππππππ ππ,ππ πππ’π’,ππ β οΏ½πππ’π’ πππ’π’,ππ β οΏ½πππ’π’
βπ’π’βππππππ ππ,ππ πππ’π’,ππ β οΏ½πππ’π’2 βπ’π’βππππππ ππ,ππ πππ’π’,ππ β οΏ½πππ’π’
2
14
![Page 15: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/15.jpg)
Item-Item Similarity
β’ Let ππππ(ππ) denote the set of ππ-NN to ππ
β’ πππ’π’,ππ, the predicted rating for the πππ‘π‘π‘ item of user π’π’, is given by
πππ’π’,ππ =βππβππππ(ππ) π π ππππ ππ, ππ β πππ’π’,ππ
βππβππππ(ππ) π π ππππ(ππ, ππ)
β’ This is the weighted average of the ratings of ππβs ππ nearest neighbors
15
![Page 16: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/16.jpg)
ππ-Nearest Neighbor
β’ Easy to train
β’ Easily adapts to new users/items
β’ Can be difficult to scale (finding closest pairs requires forming the similarity matrix)
β’ Less of a problem for item-item assuming number of items is much smaller than the number of users
β’ Not sure how to choose ππ
β’ Can lead to poor accuracy
16
![Page 17: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/17.jpg)
ππ-Nearest Neighbor
β’ Tough to use without any ratings information to start with
β’ βCold Startβ
β’ New users should rate some initial items to have personalized recommendations
β Could also have new users describe tastes, etc.
β’ New Item/Movie may require content analysis or a non-CF based approach
17
![Page 18: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/18.jpg)
Matrix Factorization
β’ There could be a number of latent factors that affect the recommendation
β’ Style of movie: serious vs. funny vs. escapist
β’ Demographic: is it preferred more by men or women
β’ Alternative approach: view CF as a matrix factorization problem
18
![Page 19: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/19.jpg)
Matrix Factorization
β’ Express a matrix ππ β βππΓππ approximately as a product of factors π΄π΄ β βππΓππ and π΅π΅ β βππΓππ
ππ~π΄π΄ β π΅π΅
β’ Approximate the user Γ items matrix as a product of matrices in this way
β’ Similar to SVD decompositions that we saw earlier (SVD canβt be used for a matrix with missing entries)
β’ Think of the entries of ππ as corresponding to an inner product of latent factors
19
![Page 20: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/20.jpg)
Matrix Factorization
20 [from slides of Alex Smola]
![Page 21: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/21.jpg)
Matrix Factorization
21 [from slides of Alex Smola]
![Page 22: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/22.jpg)
Matrix Factorization
22 [from slides of Alex Smola]
![Page 23: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/23.jpg)
Matrix Factorization
β’ We can express finding the βclosestβ matrix as an optimization problem
minπ΄π΄,π΅π΅
οΏ½π’π’,ππ πππππππππππ£π£ππππ
πππ’π’,ππ β π΄π΄π’π’,:,π΅π΅:,ππ2 + ππ π΄π΄ πΉπΉ
2 + π΅π΅ πΉπΉ2
23
![Page 24: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/24.jpg)
Matrix Factorization
β’ We can express finding the βclosestβ matrix as an optimization problem
minπ΄π΄,π΅π΅
οΏ½π’π’,ππ πππππππππππ£π£ππππ
πππ’π’,ππ β π΄π΄π’π’,:,π΅π΅:,ππ2 + ππ π΄π΄ πΉπΉ
2 + π΅π΅ πΉπΉ2
24
Computes the error in the approximation
of the observed matrix entries
![Page 25: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/25.jpg)
Matrix Factorization
β’ We can express finding the βclosestβ matrix as an optimization problem
minπ΄π΄,π΅π΅
οΏ½π’π’,ππ πππππππππππ£π£ππππ
πππ’π’,ππ β π΄π΄π’π’,:,π΅π΅:,ππ2 + ππ π΄π΄ πΉπΉ
2 + π΅π΅ πΉπΉ2
25
Regularization preferences matrices with small Frobenius
norm
![Page 26: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/26.jpg)
Matrix Factorization
β’ We can express finding the βclosestβ matrix as an optimization problem
minπ΄π΄,π΅π΅
οΏ½π’π’,ππ πππππππππππ£π£ππππ
πππ’π’,ππ β π΄π΄π’π’,:,π΅π΅:,ππ2 + ππ π΄π΄ πΉπΉ
2 + π΅π΅ πΉπΉ2
β’ How to optimize this objective?
26
![Page 27: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/27.jpg)
Matrix Factorization
β’ We can express finding the βclosestβ matrix as an optimization problem
minπ΄π΄,π΅π΅
οΏ½π’π’,ππ πππππππππππ£π£ππππ
πππ’π’,ππ β π΄π΄π’π’,:,π΅π΅:,ππ2 + ππ π΄π΄ πΉπΉ
2 + π΅π΅ πΉπΉ2
β’ How to optimize this objective?
β’ (Stochastic) gradient descent!
27
![Page 28: Nicholas Ruozzi University of Texas at Dallas](https://reader033.fdocuments.net/reader033/viewer/2022050403/627023d5650fdc62f7254a1a/html5/thumbnails/28.jpg)
Extensions
β’ The basic matrix factorization approach doesnβt take into account the observation that some people are tougher reviewers than others and that some movies are over-hyped
β’ Can correct for this by introducing a bias term for each user and a global bias
minπ΄π΄,π΅π΅,ππ,ππ
οΏ½π’π’,ππ πππππππππππ£π£ππππ
πππ’π’,ππ β ππ β ππππ β πππ’π’ β π΄π΄π’π’,:,π΅π΅:,ππ2
+ππ π΄π΄ πΉπΉ2 + π΅π΅ πΉπΉ
2 + ππ οΏ½ππ
ππππ2 + οΏ½π’π’
πππ’π’2
28