Semantically-Enhanced Recommendation Algorithms
CCIA 2012
Victor Codina & Luigi Ceccaroni
Departament de Llenguatges i Sistemes Informàtics
Knowledge Engineering and Machine Learning Group
Health Informatics
Personalized Computational Medicine
Semantically-Enhanced Recommendation Algorithms - Victor Codina & Luigi Ceccaroni 2
The value of recommendations
Netflix: 2/3 of the movies rented are recommended
Google News: 38% more clickthrough
Amazon: 35% of sales come from recommendations
All these systems employ a Collaborative Filtering (CF) approach as a main component
But in most online services the CF approach does not work so well
Why?
Usually: lack of data
Other reasons: lack of context-awareness, domain-specific particularities
Outline
Cold-start problem and existing solutions
Proposed solution to overcome cold start
Evaluation and results
Outline
Cold-start problem and existing solutions
o Cold-start problem
o Hybrid recommenders
Proposed solution to overcome cold start
Evaluation and results
What is the cold-start problem?
Narrow view
o No ratings at all associated with items or users
Wider view
o Few ratings associated with items or users
Cold-start scenarios:
                       Users: many ratings    Users: few ratings
Items: many ratings    Normal                 New user
Items: few ratings     New item               New user & item
Typical solution: hybrid recommender combining CF with content-based filtering
Past solution: traditional content-based filtering + collaborative filtering
o Limitation: lack of understanding and exploitation of domain semantics
More recent solution: semantically-enhanced content-based filtering + collaborative filtering
o Limitation: the need for domain ontologies describing explicit metadata relations
Cold-start scenarios labelled in the figure: new item, new user
Outline
Cold-start problem and existing solutions
Proposed solution to overcome cold start
o Acquisition of implicit semantics
o Methods for semantics exploitation
Evaluation and results
Acquisition of implicit domain semantics
Implicit semantics = semantic similarities among item attributes extracted from Vector Space Models (VSMs)
Distributional hypothesis: “words that share similar contexts share similar meaning”
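Pipeline (figure):
1. Build an attribute-by-context matrix with weights w_{a,c}, where the contexts are either items or users
2. Apply a matrix transformation (SVD, conditional probabilities)
3. Apply a similarity measure (cosine, Jaccard) to obtain the attribute semantic similarities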
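The similarity step of this pipeline can be illustrated with a minimal sketch, assuming cosine similarity and no matrix transformation. The function names and the toy matrix are hypothetical and are not taken from the slides; each attribute is represented by its vector of weights over the contexts (items or users).

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def attribute_similarities(matrix):
    """matrix maps each attribute to its weights w_{a,c}, one per context."""
    names = list(matrix)
    return {a: {b: cosine(matrix[a], matrix[b]) for b in names} for a in names}


# Hypothetical toy data: three tags described over four contexts.
toy = {
    "sci-fi": [1.0, 1.0, 0.0, 0.0],
    "space": [1.0, 0.0, 0.0, 0.0],
    "comedy": [0.0, 0.0, 1.0, 1.0],
}
sims = attribute_similarities(toy)
```

Attributes that never share a context, such as "sci-fi" and "comedy" in the toy data, get similarity 0; swapping the contexts from items to users changes the vectors and therefore the similarities.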
Semantic similarities are context-dependent
Item-based
o Similarity is measured in terms of how many items are similarly described by both attributes
User-based
o Similarity is measured in terms of how many users are similarly interested in both attributes
Example: top-5 tags similar to "Sci-Fi", calculated using cosine similarity without matrix transformation
Rank    User-based               Item-based
1       Scifi     0.79598457     Scifi     0.48631117
2       future    0.6889696      aliens    0.42508063
3       space     0.65459067     dystopia  0.34769687
4       aliens    0.6110453      space     0.32580933
5       robots    0.59465224     future    0.27470198
Exploitation of implicit semantics in content-based filtering
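User modeling
o A user modeling technique combines the user ratings r_{u,i} with the item attribute weights w_{i,a} (attribute relevance, in [0,1]) to produce the user interests w_{u,a} (degree of interest, in [-1,1])
Prediction generation
o Vector-based matching between the user interests w_{u,a} and the item attributes w_{i,a} produces the score
Two ways of exploiting implicit semantics:
1. Profile expansion: the user interests are replaced by expanded user interests (user modeling)
2. Semantic matching: applied when matching user interests to item attributes (prediction generation)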
Method 1: User profile expansion by constrained spreading activation
Attribute semantic similarities (similarities can be symmetric or not depending on the similarity measure used):
      a1    a2    a3    a4    a5
a1    1     0.5   0.2   0     0.3
a2    0.5   1     0.3   0     0.1
a3    0.2   0.3   1     0.7   0.8
a4    0     0     0.7   1     0
a5    0.3   0.1   0.8   0     1
Method hyper-parameters:
- fan-out threshold = 0.25
- activation threshold = 0.25
- max. expansion levels = 1
                                  a1     a2    a3     a4   a5
User interests [-1,1]             0      0.5   -0.1   0    0
Expanded user interests [-1,1]    0.25   0.5   0.05   0    0
In this example a2 is the activated node. It spreads to a1 over the link with similarity 0.5 (new interest: 0.5 x 0.5 = 0.25) and to a3 over the link with similarity 0.3 (weight updated: -0.1 + 0.5 x 0.3 = 0.05).
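The expansion example can be reproduced with a minimal sketch. It assumes one reading of the hyper-parameters that is consistent with the numbers on the slide: an attribute is activated when the absolute value of its interest reaches the activation threshold, and activation travels only over links whose similarity reaches the fan-out threshold. The function `expand_profile` and its argument names are hypothetical, and the authors' actual method may differ in details such as how several expansion levels are combined.

```python
def expand_profile(interests, sims, activation_threshold=0.25,
                   fan_out_threshold=0.25, max_levels=1):
    """Constrained spreading activation over an attribute similarity graph.

    interests: attribute -> degree of interest in [-1, 1]
    sims: source attribute -> target attribute -> similarity
    """
    expanded = dict(interests)
    # Initial frontier: attributes whose interest is strong enough to activate.
    frontier = {a: w for a, w in interests.items()
                if abs(w) >= activation_threshold}
    visited = set(frontier)
    for _ in range(max_levels):
        received = {}
        for source, weight in frontier.items():
            for target, sim in sims[source].items():
                if target == source or sim < fan_out_threshold:
                    continue  # skip self-links and weak links
                received[target] = received.get(target, 0.0) + weight * sim
        next_frontier = {}
        for target, delta in received.items():
            # Add the propagated activation and keep the result in [-1, 1].
            value = max(-1.0, min(1.0, expanded.get(target, 0.0) + delta))
            expanded[target] = value
            if target not in visited and abs(value) >= activation_threshold:
                next_frontier[target] = value
                visited.add(target)
        frontier = next_frontier
    return expanded


names = ["a1", "a2", "a3", "a4", "a5"]
rows = [
    [1.0, 0.5, 0.2, 0.0, 0.3],
    [0.5, 1.0, 0.3, 0.0, 0.1],
    [0.2, 0.3, 1.0, 0.7, 0.8],
    [0.0, 0.0, 0.7, 1.0, 0.0],
    [0.3, 0.1, 0.8, 0.0, 1.0],
]
sims = {a: dict(zip(names, row)) for a, row in zip(names, rows)}
interests = dict(zip(names, [0.0, 0.5, -0.1, 0.0, 0.0]))
expanded = expand_profile(interests, sims)
```

With the slide's thresholds only a2 is activated, and only its links to a1 (0.5) and a3 (0.3) pass the fan-out threshold, so a4 and a5 keep an interest of 0.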
Method 2: Prediction generation by pair-wise semantic matching strategies
                         a1   a2    a3     a4   a5
Item attributes [0,1]    0    0.3   0      0    0.7
User interests [-1,1]    0    0.5   -0.1   0    0
Attribute semantic similarities (similarities can be symmetric or not depending on the similarity measure used):
      a1    a2    a3    a4    a5
a1    1     0.5   0.2   0     0.3
a2    0.5   1     0.3   0     0.1
a3    0.2   0.3   1     0.7   0.8
a4    0     0     0.7   1     0
a5    0.3   0.1   0.8   0     1
Method hyper-parameter:
- similarity threshold = 0.05
Results (using the product as aggregation function):
Direct matching (approach: vector-based matching)
o Only a2 matches itself (similarity 1): 0.5 x 0.3 = 0.15
Best-pairs matching
o a3 is additionally paired with its most similar item attribute, a5 (similarity 0.8): 0.15 - 0.056 = 0.094
All-pairs matching
o The pairs a3-a2 (similarity 0.3) and a2-a5 (similarity 0.1) are also counted: 0.15 - 0.009 + 0.035 - 0.056 = 0.12
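The three matching results can be reproduced with a minimal sketch, assuming each pair contributes the product of user interest, item weight and similarity, and that pairs below the similarity threshold are ignored. The function names are hypothetical, and reading the similarity as `sims[user_attribute][item_attribute]` is an assumption that only matters when the measure is asymmetric.

```python
def direct_matching(user, item):
    """Vector-based matching: only identical attributes are paired."""
    return sum(w * item.get(a, 0.0) for a, w in user.items())


def all_pairs_matching(user, item, sims, threshold=0.05):
    """Every (user attribute, item attribute) pair above the threshold counts."""
    score = 0.0
    for a, w_u in user.items():
        for b, w_i in item.items():
            sim = sims[a][b]
            if w_u != 0 and w_i != 0 and sim >= threshold:
                score += w_u * w_i * sim
    return score


def best_pairs_matching(user, item, sims, threshold=0.05):
    """Each user attribute is paired with its single most similar item attribute."""
    score = 0.0
    for a, w_u in user.items():
        if w_u == 0:
            continue
        candidates = [(sims[a][b], w_i) for b, w_i in item.items()
                      if w_i != 0 and sims[a][b] >= threshold]
        if candidates:
            sim, w_i = max(candidates, key=lambda c: c[0])
            score += w_u * w_i * sim
    return score


names = ["a1", "a2", "a3", "a4", "a5"]
rows = [
    [1.0, 0.5, 0.2, 0.0, 0.3],
    [0.5, 1.0, 0.3, 0.0, 0.1],
    [0.2, 0.3, 1.0, 0.7, 0.8],
    [0.0, 0.0, 0.7, 1.0, 0.0],
    [0.3, 0.1, 0.8, 0.0, 1.0],
]
sims = {a: dict(zip(names, row)) for a, row in zip(names, rows)}
user = dict(zip(names, [0.0, 0.5, -0.1, 0.0, 0.0]))
item = dict(zip(names, [0.0, 0.3, 0.0, 0.0, 0.7]))

direct = direct_matching(user, item)
best = best_pairs_matching(user, item, sims)
every = all_pairs_matching(user, item, sims)
```

Under these assumptions the three scores agree with the slide's 0.15, 0.094 and 0.12 up to floating-point rounding.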
Outline
Cold-start problem and existing solutions
Proposed solution to overcome cold start
Evaluation and results
o MovieLens data set
o Experimental results
Offline experimentation with a MovieLens data set extended with movie metadata
Data set statistics after pruning unusual attribute values and movies with few attributes:
Users                      2113
Movies                     1646
Attributes                 4 (genres, directors, actors and tags)
Attribute values           2886
Ratings per user (avg.)    239
Rating density             14%
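As a quick consistency check on these statistics, the rating density can be recomputed from the other figures. This is a minimal sketch: the total number of ratings is not given on the slide, so it is approximated here from the rounded average of 239 ratings per user.

```python
users = 2113
movies = 1646
avg_ratings_per_user = 239  # rounded average reported on the slide

# Approximate total, since the per-user average is itself rounded.
total_ratings = users * avg_ratings_per_user

# Density = observed ratings / possible (user, movie) pairs,
# which simplifies to avg_ratings_per_user / movies.
density = total_ratings / (users * movies)

print(f"approx. ratings: {total_ratings}, density: {density:.1%}")
```

The result is about 14.5%, in line with the reported rating density of 14%.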
Evaluation of methods for semantics exploitation
Baseline = Traditional CB using hybrid user modeling technique
Expansion-CB = CSA-same + User-based + raw frequencies
Matching-CB = Best-pairs-same + User-based + Forbes-Zhu method
BPR-MF = CF based on matrix factorization optimized for ranking
Conclusions
The cold-start problem can be very critical
o Above all in systems with small databases
Existing solutions have some limitations
o Traditional CB cannot solve the new user scenario
o Semantically-enhanced CB requires domain ontologies to work
Exploitation of implicit semantics can be a good alternative to overcome the cold-start problem
o User-based semantics is more effective than item-based
o The best-pairs semantic matching method is more effective than profile expansion based on spreading activation
Future work
Experimenting with data sets of different domains
o Million Song data set
Extending the study of Vector Space Models
o Probabilistic similarity measures (e.g. Kullback-Leibler)
Applying the same approach to enhance cold-start performance of context-aware recommenders
o Implicit semantics of contextual conditions can also be acquired from user data
o Similarly, pair-wise semantic strategies can be employed to enhance contextual user modeling
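For reference, the Kullback-Leibler divergence mentioned among the probabilistic measures can be sketched as follows. This is an illustrative sketch only and not something evaluated in the slides: it assumes each attribute is described by non-negative weights over the contexts (for example raw frequencies), and the smoothing constant and function names are hypothetical.

```python
import math


def to_distribution(weights, smoothing=1e-6):
    """Turn non-negative weights into a probability distribution.

    Additive smoothing keeps every probability strictly positive,
    so the logarithm below is always defined.
    """
    total = sum(weights) + smoothing * len(weights)
    return [(w + smoothing) / total for w in weights]


def kl_divergence(p, q):
    """D_KL(p || q) for two distributions over the same contexts."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Hypothetical context weights for two attributes.
p = to_distribution([3.0, 1.0, 0.0])
q = to_distribution([1.0, 1.0, 2.0])

forward = kl_divergence(p, q)
backward = kl_divergence(q, p)
```

The divergence is 0 for identical distributions and is generally asymmetric, so `forward` and `backward` differ. It measures dissimilarity and would have to be converted, for instance by negation or inversion, before being used as a similarity.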