Lecture 4: Social Web Personalization (2012)

Social Web Lecture IV: Personalization on the Social Web (some slides were adopted from Fabian Abel) Lora Aroyo The Network Institute VU University Amsterdam Monday, February 27, 12

description

This is the fourth lecture in the Social Web course at the VU University Amsterdam. More information is on the course website (http://semanticweb.cs.vu.nl/socialweb2012/). Thanks to Fabian Abel for letting me adopt slides from his lectures.

Transcript of Lecture 4: Social Web Personalization (2012)

Page 1: Lecture 4: Social Web Personalization (2012)

Social Web Lecture IV: Personalization on the Social Web

(some slides were adopted from Fabian Abel)

Lora Aroyo, The Network Institute

VU University Amsterdam

Monday February 27 12

Personalization & Social Web

• Applications on the Social Web use web data [last week] & are ‘social’

• To design ‘social’ functionality, we need to understand how the application can derive relevant information from the data (that is, what users perceive as relevant)

• Therefore, we need to understand:

  • how good personalization techniques (recommenders) are

  • how good the user models they are based on are

• In this lecture we consider theory & techniques for designing and evaluating recommenders and user models (for use in Social Web applications)

Total transparency: is it desired?

User Modeling

http://farm5.staticflickr.com/4038/4553496383_5b6a5f1485_o.jpg

How to infer & represent user information that supports a given application or context?

User Modeling Challenge

• Application has to obtain, understand & exploit information about the user

• Information (need & context) about user

• Inferring information about user & representing it so that it can be consumed by the application

• Data relevant for inferring information about user

User & Usage Data is everywhere

• People leave traces on the Web and on their computers:

  • Usage data, e.g. query logs, click-through data

  • Social data, e.g. tags, (micro-)blog posts, comments, bookmarks, friend connections

  • Documents, e.g. pictures, videos

  • Personal data, e.g. affiliations, locations

  • Products, applications, services: bought, used, installed

• Not only a user’s behavior, but also interactions of other users

  • “people can make statements about me”, “people who are similar to me can reveal information about me” --> “social learning”, collaborative recommender systems

UM Basic Concepts

• User Profile: a data structure that characterizes a user at a particular moment in time. It represents what, from a given (system) perspective, there is to know about the user. The data in the profile can be given explicitly by the user or derived by the system.

• User Model: contains the definitions & rules for interpreting observations about the user and for translating that interpretation into the characteristics in a user profile. The user model is the recipe for obtaining and interpreting user profiles.

• User Modeling: the process of representing the user

User Modeling Approaches

• Overlay User Modeling: describe user characteristics, e.g. “knowledge of a user”, “interests of a user”, with respect to “ideal” characteristics

• Customizing: user explicitly provides & adjusts elements of the user profile

• User model elicitation: ask & observe the user; learn & improve user profile successively, “interactive user modeling”

• Stereotyping: stereotypical characteristics to describe a user

• User Relevance Modeling: learn/infer probabilities that a given item or concept is relevant for a user

Related scientific conference: http://umap2011.org Related journal: http://umuai.org

Which approach best suits the conditions of applications?

http://farm7.staticflickr.com/6240/6346803873_e756dd9bae_b.jpg

Overlay User Models

• among the oldest user models

• used for modeling student knowledge

• the user is typically characterized in terms of domain concepts & hypotheses of the user’s knowledge about these concepts in relation to an (ideal) expert’s knowledge

• concept-value pairs
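As a minimal sketch of an overlay model, the learner and the (ideal) expert can both be stored as concept-value pairs and compared concept by concept. The concepts, values and the function name `knowledge_gaps` are hypothetical and not taken from the lecture.

```python
# Hypothetical overlay model: each concept maps to an estimated knowledge level in [0, 1].
expert = {"recursion": 1.0, "sorting": 1.0, "graphs": 1.0}  # assumed "ideal" expert
learner = {"recursion": 0.8, "sorting": 0.3}                # unobserved concepts are absent


def knowledge_gaps(user, ideal):
    """Return (concept, gap) pairs, largest gap to the ideal first."""
    gaps = {concept: ideal[concept] - user.get(concept, 0.0) for concept in ideal}
    return sorted(gaps.items(), key=lambda pair: pair[1], reverse=True)


for concept, gap in knowledge_gaps(learner, expert):
    print(f"{concept}: {gap:.1f} below expert level")
```

A tutoring application could, for instance, use the first entries of this list to decide what to teach next.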

User Model Elicitation

• Ask the user explicitly, then learn:

  • NLP, intelligent dialogues

  • Bayesian networks, Hidden Markov models

• Observe the user, then learn:

  • Logs, machine learning

  • Clustering, classification, data mining

• Interactive user modeling: mixture of direct inputs of a user, observations and inferences

http://hunch.com

Stereotyping

• stereotype: a set of characteristics (e.g. attribute-value pairs) that describe a group of users

• a user is not assigned to a single stereotype: the user profile can feature characteristics of several different stereotypes
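The idea that one profile can mix several stereotypes can be sketched as follows. The stereotype names, attributes, weights and the max-based combination rule are all assumptions made for illustration; the lecture does not prescribe how stereotypes are merged.

```python
# Hypothetical stereotypes: attribute -> typical interest weight in [0, 1].
STEREOTYPES = {
    "student": {"textbooks": 0.8, "budget travel": 0.6},
    "sports fan": {"tennis": 0.9, "football": 0.7},
}


def profile_from_stereotypes(memberships):
    """Combine stereotypes; `memberships` maps stereotype name -> degree in [0, 1]."""
    profile = {}
    for name, degree in memberships.items():
        for attribute, weight in STEREOTYPES[name].items():
            # Keep the strongest evidence when stereotypes share an attribute.
            profile[attribute] = max(profile.get(attribute, 0.0), degree * weight)
    return profile


# A user who fully matches "student" and partly matches "sports fan".
profile = profile_from_stereotypes({"student": 1.0, "sports fan": 0.5})
```

Stereotypes give a usable profile before any individual observations exist, which is one reason they help with new users.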

Why are stereotypes useful?

http://farm1.staticflickr.com/155/413650229_31ef379b0b_b.jpg

Can we use Social Web data for user modeling?

Can we infer a Twitter-based user profile?

[Diagram: User Modeling (4 building blocks); Semantic Enrichment, Linkage and Alignment; Profile; Personalized News Recommender. User: “I want my personalized news recommendations”]

Example from Abel et al. (2011)

User Modeling Building Blocks

Profile: concept, weight

1. Temporal Constraints: Which tweets of the user should be analyzed?

(a) time period: start, end (timeline: June 27, July 4, July 11)

(b) temporal patterns: weekends; Morning, Afternoon, Night
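The two kinds of temporal constraints, (a) a time period and (b) a temporal pattern such as weekends, could be sketched as a simple filter over timestamped tweets. The function name `select_tweets`, its parameters and the sample tweets are made up for illustration.

```python
from datetime import datetime


def select_tweets(tweets, start=None, end=None, weekends_only=False):
    """Filter (text, timestamp) pairs by time period and, optionally, weekend pattern."""
    selected = []
    for text, posted in tweets:
        if start is not None and posted < start:
            continue
        if end is not None and posted > end:
            continue
        if weekends_only and posted.weekday() < 5:  # Monday=0 ... Sunday=6
            continue
        selected.append((text, posted))
    return selected


# Made-up tweets for illustration.
tweets = [
    ("Watching the final", datetime(2010, 6, 27, 15, 0)),   # a Sunday
    ("Back at work", datetime(2010, 6, 30, 9, 0)),          # a Wednesday
    ("Tennis with friends", datetime(2010, 7, 10, 11, 0)),  # a Saturday
]
period = select_tweets(tweets, start=datetime(2010, 6, 27), end=datetime(2010, 7, 4))
weekend = select_tweets(tweets, weekends_only=True)
```

Here `period` keeps the first two tweets and `weekend` keeps the first and the third, so the two constraints yield different input for the profile.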

User Modeling Building Blocks

Profile: concept, weight

1. Temporal Constraints (timeline: June 27, July 4, July 11)

2. Profile Type: What type of concepts should represent “interests”?

Example tweet: “Francesca Schiavone won French Open #fo2010”

• entity-based: Francesca Schiavone, French Open

• topic-based: Sport

• hashtag-based: #fo2010

User Modeling Building Blocks

Profile: concept, weight

1. Temporal Constraints

2. Profile Type

3. Semantic Enrichment: Further enrich the semantics of tweets?

Example tweet: “Francesca Schiavone won http://bit.ly/2f4t7a”

(a) tweet-based: Francesca Schiavone

(b) further enrichment: the linked article (“Francesca wins French Open: Thirty in women's tennis is primordially old, an age when agility and desire recedes as the …”) adds French Open, Tennis

User Modeling Building Blocks

Profile: concept, weight

1. Temporal Constraints

2. Profile Type

3. Semantic Enrichment

4. Weighting Scheme: How to weight the concepts?

• Concept frequency (TF)

• TFxIDF

• Time-sensitive

Weights to compute: weight(Francesca Schiavone), weight(French Open), weight(Tennis). Example values shown on the slide: 4, 3, 6 (timeline: June 27, July 4, July 11)
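The three weighting schemes named on the slide could look roughly like this. This is a sketch under assumptions: the IDF formula `log(N / df)`, the exponential decay with a `half_life_days` parameter, and all function names are choices made here, not details given in the lecture or in Abel et al.

```python
import math
from collections import Counter


def tf_weights(concepts):
    """Concept frequency: relative frequency of each concept in the user's tweets."""
    counts = Counter(concepts)
    total = sum(counts.values())
    return {concept: n / total for concept, n in counts.items()}


def tf_idf_weights(concepts, all_user_concepts):
    """TF x IDF: down-weight concepts that occur in many users' profiles."""
    tf = tf_weights(concepts)
    n_users = len(all_user_concepts)
    weights = {}
    for concept, value in tf.items():
        df = sum(1 for other in all_user_concepts if concept in other)
        weights[concept] = value * math.log(n_users / df) if df else 0.0
    return weights


def time_sensitive_weights(dated_concepts, half_life_days=7.0):
    """Each occurrence counts less the older it is (age given in days)."""
    weights = Counter()
    for concept, age_days in dated_concepts:
        weights[concept] += 0.5 ** (age_days / half_life_days)
    return dict(weights)
```

With a half-life of 7 days, a mention from today counts 1.0, one from a week ago 0.5, and one from two weeks ago 0.25, so recent interests dominate the profile.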

Observations

• Profile characteristics:

  • Semantic enrichment solves sparsity problems

  • Profiles change over time: fresh profiles better reflect current user demands

  • Temporal patterns: weekend profiles differ significantly from weekday profiles

• Impact on news recommendations:

  • The more fine-grained the concepts, the better the recommendation performance: entity-based > topic-based > hashtag-based

  • Semantic enrichment improves recommendation quality

  • Time-sensitivity (adapting to trends) improves performance

User Modeling: it is not about putting everything in a user profile, it is about making the right choices

User Adaptation

Knowing the user: this knowledge can be applied to adapt a system or interface to the user, in order to improve the system functionality and the user experience

User-Adaptive Systems

[Diagram: observations, data and information about user --> user modeling --> user profile --> profile analysis --> adaptation decisions]

A. Jameson. Adaptive interfaces and agents. The HCI handbook: fundamentals, evolving technologies and emerging applications, pp. 305–330, 2003.

Last.fm adapts to your music taste

[Diagram: history of songs (like, ban, pause, skip) --> user modeling (infer current musical taste) --> user profile: interests in genres, artists, tags --> compare profile with possible next songs to play --> next song to be played]

Google Adaptive Search

http://www.google.com/goodtoknow

Issues in User-Adaptive Systems

• Overfitting, “bubble effects”, loss of serendipity problem:

  • systems may adapt too strongly to the interests/behavior

  • e.g. an adaptive radio station may always play the same or very similar songs

  • We search for the right balance between novelty and relevance for the user

• “Lost in Hyperspace” problem:

  • when adapting the navigation, i.e. the links on which users can click to find/access information

  • e.g. re-ordering/hiding of menu items may lead to confusion

http://www.flickr.com/photos/bellarosebyliz/4729613108

What is good user modeling & personalization?

Success perspectives

• From the consumer perspective of an adaptive system: the adaptive system maximizes satisfaction of the user (hard to measure/obtain)

• From the provider perspective of an adaptive system: the adaptive system maximizes the profit (influence of UM & personalization may be hard to measure/obtain)

Evaluation Strategies

• User studies: clean-room study; ask/observe (selected) people whether you did a good job

• Log analysis: analyze (click) data and infer whether you did a good job, e.g. cross-validation by “Leave-one-out”

• Evaluation of user modeling:

  • measure quality of profiles directly, e.g. measure overlap with existing (true) profiles, or let people judge the quality of the generated user profiles

  • measure quality of application that exploits the user profile, e.g. apply user modeling strategies in a recommender system

Possible Metrics

• The usual IR metrics:

  • Precision: fraction of retrieved items that are relevant

  • Recall: fraction of relevant items that have been retrieved

  • F-Measure: (harmonic) mean of precision and recall

• Metrics for evaluating recommendation (rankings):

  • Mean Reciprocal Rank (MRR) of first relevant item

  • Success@k: probability that a relevant item occurs within the top k

  • If a true ranking is given: rank correlations

  • Precision@k, Recall@k & F-Measure@k

• Metrics for evaluating prediction of user preferences:

  • MAE = Mean Absolute Error

  • True/False Positives/Negatives

[Chart: performance of strategy X vs. baseline over runs. Is strategy X better than the baseline?]
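Several of the metrics listed above can be written down in a few lines. The following sketch uses function names chosen here for illustration; per-query values such as the reciprocal rank and Success@k would be averaged over all test users or queries to obtain MRR and the Success@k probability.

```python
def precision_recall_f1(retrieved, relevant):
    """Precision, recall and their harmonic mean for one result set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def reciprocal_rank(ranking, relevant):
    """1 / rank of the first relevant item (0 if none); average over queries for MRR."""
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def success_at_k(ranking, relevant, k):
    """1 if a relevant item occurs within the top k, else 0."""
    return 1.0 if any(item in relevant for item in ranking[:k]) else 0.0


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)
```

For example, `reciprocal_rank(["x", "a"], {"a"})` is 0.5 because the first relevant item appears at rank 2.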

Example Evaluation

• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)

• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation

• Steps:

1. Gather a dataset of tag data, part of which can be used as input, and aim to test the recommendation on the remaining tag data

2. Use the input data and calculate the predictions for the different strategies

3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)

4. Test the results for statistical significance using Student’s t-test, relative to the baseline (e.g. existing approach, competitive approach)

[Rae et al. Improving Tag Recommendations Using Social Networks. RIAO’10]
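The evaluation steps above can be sketched with the standard library only. This is an illustration, not the setup of Rae et al.: the function names are made up, the split is a plain prefix/suffix split, and the paired form of the t statistic is an assumption (the slide only says Student's t-test).

```python
import math
import statistics


def split_tags(tags, n_input):
    """Step 1: use the first n_input tags as input, hold out the rest for testing."""
    return tags[:n_input], set(tags[n_input:])


def precision_at_k(ranking, relevant, k=5):
    """Step 3: fraction of the top-k recommended tags that were held out (P@k)."""
    return sum(1 for tag in ranking[:k] if tag in relevant) / k


def average_precision(ranking, relevant):
    """Step 3: average precision for one test case; the mean over cases is MAP."""
    hits, total = 0, 0.0
    for rank, tag in enumerate(ranking, start=1):
        if tag in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def paired_t_statistic(scores_x, scores_baseline):
    """Step 4: t statistic of per-case differences (undefined if all differences are equal)."""
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))
```

The resulting t value would then be compared against the t distribution with n - 1 degrees of freedom to decide whether strategy X differs significantly from the baseline.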

Example Evaluation

• [Guy et al.] shows another example of a similar evaluation approach

• Here the strategies differ in the way people and tags are used. In these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the aspects of these relationships that are relevant for modeling and recommendation

• Here the baseline is the ‘most popular’ tags strategy. This baseline is often used: the globally most popular tags are compared with the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and can outperform the easily available baseline

[Guy et al. Social Media Recommendation based on People and Tags. SIGIR’10]
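A ‘most popular’ baseline is cheap to build, which is why it is a useful yardstick. A minimal sketch, assuming tag assignments are available as (user, item, tag) triples; the data and the function name are made up for illustration.

```python
from collections import Counter


def most_popular_tags(assignments, k=5):
    """Non-personalized baseline: the k globally most frequent tags."""
    counts = Counter(tag for _user, _item, tag in assignments)
    return [tag for tag, _count in counts.most_common(k)]


# Made-up (user, item, tag) assignments.
assignments = [
    ("u1", "i1", "web"), ("u2", "i1", "web"), ("u3", "i2", "web"),
    ("u1", "i2", "python"), ("u2", "i3", "python"),
    ("u3", "i3", "music"),
]
baseline = most_popular_tags(assignments, k=2)
```

Every user receives the same list from this baseline, so a personalization strategy is only worth its cost if it scores better than this list on the chosen metrics.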

user interactions (level & type) instead of general social context: better for recommendations

does hybrid always work worse?

Recommendation Systems

Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)

it’s often a ranking task


http://www.wired.com/magazine/2011/11/mf_artsy/all/1

Collaborative Filtering

[Diagram: users u1 and u2 linked to items by ‘likes’ edges; u1 likes Pulp Fiction?]

• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users

• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes

• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster

• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences

• Others: rule-based, other data mining techniques
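The memory-based variant (user-item matrix, user-user similarity) can be sketched as below. The ratings are made up, and two choices are assumptions of this sketch rather than of the lecture: cosine similarity over sparse rating dictionaries, and a similarity-weighted average of neighbor ratings as the predicted score.

```python
import math

# Made-up user-item ratings (1-5).
ratings = {
    "u1": {"Pulp Fiction": 5, "Kill Bill": 4},
    "u2": {"Pulp Fiction": 5, "Kill Bill": 5, "Jackie Brown": 4},
    "u3": {"Titanic": 5, "Jackie Brown": 1, "Notting Hill": 4},
}


def cosine(r1, r2):
    """Cosine similarity of two sparse rating dicts (missing ratings count as 0)."""
    dot = sum(r1[item] * r2[item] for item in r1 if item in r2)
    norm = math.sqrt(sum(v * v for v in r1.values())) * math.sqrt(sum(v * v for v in r2.values()))
    return dot / norm if norm else 0.0


def recommend(user, ratings, top_n=3):
    """Score unseen items by the similarity-weighted ratings of similar users."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue  # users with no overlap contribute nothing
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    ranked = [(item, scores[item] / weights[item]) for item in scores]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)[:top_n]


recommendations = recommend("u1", ratings)
```

In this toy data only u2 overlaps with u1, so the single recommendation for u1 is "Jackie Brown" with u2's rating; u3's items are ignored because u3 shares no rated item with u1.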

Memory vs Model-based

Memory-based:

• complete input data is required

• pre-computation not possible

• does not scale well (“tricks” are needed)

• high quality of recommendations

Model-based:

• abstraction (model) of input data

• pre-computation (partially) possible (model has to be re-built from time to time)

• scales better

• abstraction may reduce recommendation quality

Social Networks & Interest Similarity

• collaborative filtering: ‘neighborhoods’ of people with similar interests & recommending items based on likings in the neighborhood

• limitations: next to ‘cold start’ and ‘sparsity’, the lack of control (over one’s neighborhood) is also a problem, i.e. one cannot add ‘trusted’ people nor exclude ‘strange’ ones

• therefore there is interest in ‘social recommenders’, where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike):

  • does a social connection indicate user interest similarity?

  • how much does users’ interest similarity depend on the strength of their connection?

  • is it feasible to use a social network as a source of personalized recommendations?

[Lin & Brusilovsky. Social Networks and Interest Similarity: The Case of CiteULike. HT’10]

Conclusions

• unilaterally connected pairs have more information items, metadata and tags in common than non-connected pairs

• the similarity was largest for direct connections and decreased as the distance between users in the social network increased

• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship

• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems

• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation
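One way to check such conclusions on one's own data is to compare the average overlap of item collections for connected and non-connected user pairs. The sketch below uses Jaccard similarity, which is an assumption made here (the slide does not name the measure used by Lin & Brusilovsky); the users and libraries are made up.

```python
def jaccard(items_a, items_b):
    """Overlap of two item collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(items_a), set(items_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mean_similarity(pairs, libraries):
    """Average item-level similarity over a list of user pairs."""
    sims = [jaccard(libraries[u], libraries[v]) for u, v in pairs]
    return sum(sims) / len(sims) if sims else 0.0


# Made-up bookmark libraries and connections.
libraries = {"ann": {"p1", "p2", "p3"}, "bob": {"p2", "p3", "p4"}, "cat": {"p9"}}
connected = [("ann", "bob")]
unconnected = [("ann", "cat"), ("bob", "cat")]

print(mean_similarity(connected, libraries), mean_similarity(unconnected, libraries))
```

The same comparison could be repeated for metadata or tags instead of items, and for pairs grouped by their distance in the social network.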

Social recommendations on conflicting groups

Are all interests of our friends relevant? Is it application generic?

Content-based Recommendations

• Input: characteristics of items & interests of a user in characteristics of items => Recommend items that feature characteristics which meet the user’s interests

• Techniques:

  • Data mining methods: Cluster items based on their characteristics => Infer users’ interests in clusters

  • IR methods: Represent items & users as term vectors => Compute similarity between user profile vector and items

  • Utility-based methods: Utility function that gets an item as input; the parameters of the utility function are customized via preferences of a user

Content Features

Example news item: “Government stops renovation of tower bridge” (Oct 13th 2011)

Tower Bridge today: Under construction

Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames.

Category: politics, england. Related Twiper news: bob: “Why do they stop to… [more]”, mary: “London stops reno… [more]”

Item vector over the concepts (db:Politics, db:Sports, db:Education, db:London, db:Tower_Bridge, db:Government, db:UK):

a = (0.2, 0, 0, 0.2, 0.4, 0.1, 0.1)

Weighting strategy:

- occurrence frequency

- normalize vectors (1-norm: sum of vector equals 1)
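The weighting strategy (occurrence frequency followed by 1-norm normalization) can be sketched as follows. The function name `concept_vector` and the example concept list are illustrative; how concepts are extracted from a news item is outside this sketch.

```python
from collections import Counter


def concept_vector(concepts, vocabulary):
    """Occurrence frequency per concept, normalized so that the vector sums to 1."""
    counts = Counter(concepts)
    total = sum(counts[c] for c in vocabulary)
    return [counts[c] / total if total else 0.0 for c in vocabulary]


vocabulary = ["db:Politics", "db:London", "db:Tower_Bridge"]
# Hypothetical concepts extracted from one news item.
extracted = ["db:Tower_Bridge", "db:Tower_Bridge", "db:London", "db:Politics"]
vector = concept_vector(extracted, vocabulary)
```

Normalizing by the 1-norm makes long and short items comparable, since every vector distributes a total weight of 1 over the concepts.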

User Model

User’s Twitter history:

• “RT Government stops renovation of tower bridge” (Oct 13th 2011)

• “I am in London at the moment” (Oct 13th 2011)

• “I am doing sports” (Oct 12th 2011)

User vector over the concepts (db:Politics, db:Sports, db:Education, db:London, db:Tower_Bridge, db:Government, db:UK):

u = (0, 0.1, 0, 0.5, 0.2, 0.2, 0)

Weighting strategy:

- occurrence frequency (e.g. smoothed by occurrence time: recent concepts are more important)

- normalize vectors (1-norm: sum of vector equals 1)

Recommendations

User vector u and candidate items a, b, c:

Concept           u     a     b     c
db:Politics       0     0.2   0     0
db:Sports         0.1   0     0     0.5
db:Education      0     0     0     0.2
db:London         0.5   0.2   0.8   0
db:Tower_Bridge   0.2   0.4   0.2   0
db:Government     0.2   0.1   0     0
db:UK             0     0.1   0     0.3

Cosine similarities with u: a = 0.67, b = 0.92, c = 0.14

Ranking of recommended items: 1. b, 2. a, 3. c
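The ranking on this slide can be reproduced with a few lines of code. The vectors below are the ones from the slide, read in the concept order db:Politics, db:Sports, db:Education, db:London, db:Tower_Bridge, db:Government, db:UK; the variable and function names are chosen here for illustration.

```python
import math

u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]  # user profile vector
candidates = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}


def cosine(x, y):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm = math.sqrt(sum(xi * xi for xi in x)) * math.sqrt(sum(yi * yi for yi in y))
    return dot / norm if norm else 0.0


scores = {name: cosine(u, vector) for name, vector in candidates.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

print({name: round(score, 2) for name, score in scores.items()})  # {'a': 0.67, 'b': 0.92, 'c': 0.14}
print(ranking)  # ['b', 'a', 'c']
```

Item b ranks first because nearly all of its weight lies on db:London, the strongest concept in the user profile.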

RecSys Problems

• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users

• Changing User Preferences: user interests may change over time

• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users have rated or tagged an item

• Lack of Diversity / Overfitting: for many applications, good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When a system adapts too strongly to the preferences of a user, the user might see the same/similar recommendations again and again

• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people

• Research challenge: find the right balance between serendipity & personalization

• Research challenge: find the right way to use the influence of the recommendations on the user’s behavior

What is true personalization?

Hands-on Teaser

• Build your own recommender system 101

• Recommend pages on delicious

• Recommend pages to your Facebook friends

image source: http://www.flickr.com/photos/bionicteaching/1375254387

Page 2: Lecture 4: Social Web Personalization (2012)

Personalization amp Social Web

bull Applications on the Social Web use web data [last week] amp are lsquosocialrsquo

bull To design lsquosocialrsquo functionality we need to understand how out of the data the application can provide relevant information (what users perceive as relevant)

bull Therefore we need to understand

bull how good personalization (recommenders) are

bull how good the user models are they are based on

bull In this lecture we consider theory amp techniques for how to design and evaluate recommenders and user models (for use in SW applications)

Monday February 27 12

total transparency - is it desired

Monday February 27 12

User Modeling

httpfarm5staticflickrcom40384553496383_5b6a5f1485_ojpg

How to infer amp represent user information that supports a given application or context

Monday February 27 12

User Modeling Challengebull Application has to obtain

understand amp exploit information about the user

bull Information (need amp context) about user

bull Inferring information about user amp representing it so that it can be consumed by the application

bull Data relevant for inferring information about user

Monday February 27 12

User amp Usage Data is everywhere

bull People leave traces on the Web and on their computers

bull Usage data eg query logs click-through-data

bull Social data eg tags (micro-)blog posts comments bookmarks friend connections

bull Documents eg pictures videos

bull Personal data eg affiliations locations

bull Products applications services - bought used installed

bull Not only a userrsquos behavior but also interactions of other users

bull ldquopeople can make statements about merdquo ldquopeople who are similar to me can reveal information about merdquo --gt ldquosocial learningrdquo collaborative recommender systems

Monday February 27 12

UM Basic Concepts

bull User Profile a data structure that represents a characterization of a user at a particular moment of time represents what from a given (system) perspective there is to know about a user The data in the profile can be explicitly given by the user or derived by the system

bull User Model contains the definitions amp rules for the interpretation of observations about the user and about the translation of that interpretation into the characteristics in a user profile user model is the recipe for obtaining and interpreting user profiles

bull User Modeling the process of representing the user

Monday February 27 12

User Modeling Approaches

bull Overlay User Modeling describe user characteristics eg ldquoknowledge of a userrdquo ldquointerests of a userrdquo with respect to ldquoidealrdquo characteristics

bull Customizing user explicitly provides amp adjusts elements of the user profile

bull User model elicitation ask amp observe the user learn amp improve user profile successively ldquointeractive user modelingrdquo

bull Stereotyping stereotypical characteristics to describe a user

bull User Relevance Modeling learninfer probabilities that a given item or concept is relevant for a user

Related scientific conference httpumap2011org Related journal httpumuaiorg

Monday February 27 12

Which approach suits

best the conditions of applications

httpfarm7staticflickrcom62406346803873_e756dd9bae_bjpg

Monday February 27 12

Overlay User Models

bull among the oldest user models

bull used for modeling student knowledge

bull the user is typically characterized in terms of domain concepts amp hypotheses of the userrsquos knowledge about these concepts in relation to an (ideal) expertrsquos knowledge

bull concept-value pairs

Monday February 27 12

User Model Elicitation

bull Ask the user explicitly learn

bull NLP intelligent dialogues

bull Bayesian networks Hidden Markov models

bull Observe the user learn

bull Logs machine learning

bull Clustering classification data mining

bull Interactive user modeling mixture of direct inputs of a user observations and inferences

Monday February 27 12

httphunchcomMonday February 27 12

Monday February 27 12

Stereotyping

bull set of characteristics (eg attribute-value pairs) that describe a group of users

bull user is not assigned to a single stereotype - user profile can feature characteristics of several different stereotypes

Monday February 27 12

Why are stereotypes

useful

httpfarm1staticflickrcom155413650229_31ef379b0b_bjpg

Monday February 27 12

Can we use Social Web

data for user

modeling

Monday February 27 12

Can we infer a Twitter-based user profile

User Modeling (4 building blocks)

Semantic Enrichment Linkage and Alignment

Personalized News Recommender

Profile

I want my

personalized news recommendations

Example from Abel et al (2011)Monday February 27 12

User Modeling Building Blocks

Profile concept weight

time

1 Which tweets of the user should be

analyzed

Morning Afternoon Night

1 Temporal Constraints

June 27 July 4 July 11

(b) temporal patterns

weekends start end

(a) time period

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

Francesca Schiavone won French Open fo2010

Francesca Schiavone

French Open

Francesca Schiavone French Open entity-based

Sport T

T topic-based

2 What type of concepts should represent ldquointerestsrdquo

fo2010

fo2010 hashtag-based

1 Temporal Constraints

time

June 27 July 4 July 11

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

Francesca Schiavone won httpbitly2f4t7a

Francesca Schiavone

3 Further enrich the semantics of tweets

1 Temporal Constraints

3 Semantic Enrichment

Francesca Schiavone

Francesca wins French Open Thirty in womens tennis is primordially old an age when agility and desire recedes as the hellip

French Open

Tennis

French Open

Tennis

(b) further enrichment

(a) tweet-based

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

4 How to weight the concepts

1 Temporal Constraints

3 Semantic Enrichment

Francesca Schiavone

French Open

Tennis

4 Weighting Scheme

time

June 27 July 4 July 11

weight(Francesca Schiavone)

Concept frequency (TF)

4

3 6

TFxIDF Time-sensitive

weight(French Open)

weight(Tennis)

Monday February 27 12

Observationsbull Profile characteristics

bull Semantic enrichment solves sparsity problems

bull Profiles change over time fresh profiles reflect better current user demands

bull Temporal patterns weekend profiles differ significantly from weekday profiles

bull Impact on news recommendations

bull The more fine-grained the concepts the better the recommendation performance entity-based gt topic-based gt hashtag-based

bull Semantic enrichment improves recommendation quality

bull Time-sensitivity (adapting to trends) improves performance

Monday February 27 12

User Modelingit is not about putting everything in a user profile

it is about making the right choices

Monday February 27 12

User AdaptationKnowing the user - this knowledge - can be applied to adapt

a system or interface to the user to improve the system functionality and user experience

Monday February 27 12

Monday February 27 12

user modeling

user profile

observations data and information about user

profile analysis

adaptation decisions

User-Adaptive Systems

A Jameson Adaptive interfaces and agents The HCI handbook fundamentals evolving technologies and emerging applications pp 305ndash330 2003

Monday February 27 12

user modeling (infer current musical taste)

user profile interests in

genres artists tags

history of songs like ban pause skip

compare profile with possible next

songs to play

next song to be played

Lastfm adapts to your music taste

Monday February 27 12

Google Adaptive Search

httpwwwgooglecomgoodtoknow

Monday February 27 12

Issues in User-Adaptive Systems

bull Overfitting ldquobubble effectsrdquo loss of serendipity problem

bull systems may adapt too strongly to the interestsbehavior

bull eg an adaptive radio station may always play the same or very similar songs

bull We search for the right balance between novelty and relevance for the user

bull ldquoLost in Hyperspacerdquo problem

bull when adapting the navigation ndash ie the links on which users can click to findaccess information

bull eg re-orderinghiding of menu items may lead to confusion

Monday February 27 12

httpwwwflickrcomphotosbellarosebyliz4729613108

What is good user modeling amp personalization

Monday February 27 12

Success perspectives

bull From the consumer perspective of an adaptive system

bull From the provider perspective of an adaptive system

Adaptive system maximizes satisfaction of the user

hard to measureobtain

Adaptive system maximizes the profit

influence of UM amp personalization may be hard to measureobtain

Monday February 27 12

Evaluation Strategiesbull User studies Clean-room study askobserve (selected) people

whether you did a good job

bull Log analysis Analyze (click) data and infer whether you did a good job eg cross-validation by ldquoLeave-one-outrdquo

bull Evaluation of user modeling

bull measure quality of profiles directly eg measure overlap with existing (true) profiles or let people judge the quality of the generated user profiles

bull measure quality of application that exploits the user profile eg apply user modeling strategies in a recommender system

Monday February 27 12

Possible Metricsbull The usual IR metrics

bull Precision fraction of retrieved items that are relevant

bull Recall fraction of relevant items that have been retrieved

bull F-Measure (harmonic) mean of precision and recall

bull Metrics for evaluating recommendation (rankings)

bull Mean Reciprocal Rank (MRR) of first relevant item

bull Successk probability that a relevant item occurs within the top k

bull If a true ranking is given rank correlations

bull Precisionk Recallk amp F-Measurek

bull Metrics for evaluating prediction of user preferences

bull MAE = Mean Absolute Error

bull TrueFalse PositivesNegatives

runs

performance strategy X baseline

Is strategy X better than the baseline

Monday February 27 12

Example Evaluationbull [Rae et al] shows a typical example of how to investigate and evaluate a

proposal for improving (tag) recommendations (using social networks)

bull Task test how well the different strategies (here different tag contexts) can be used for tag predictionrecommendation

bull Steps

1 Gather a dataset of tag data part of which can be used as input and aim to test the recommendation on the remaining tag data

2 Use the input data and calculate for the different strategies the predictions

3 Measure the performance using standard (IR) metrics Precision of the top 5 recommended tags (P5) Mean Reciprocal Rank (MRR) Mean Average Precision (MAP)

4 Test the results for statistical significance using Studentrsquos T-test relative to the baseline (eg existing approach competitive approach)

[Rae et al Improving Tag Recommendations Using Social Networks RIAOrsquo10]]

Monday February 27 12

Example Evaluationbull [Guy et al] shows another example of a similar evaluation

approach

bull Here the different strategies differ in the way people and tags are used in the strategies with these tag-based systems there are complex relationships between users tags and items and strategies aim to find the relevant aspects of these relationships for modeling and recommendation

bull Here their baseline is the strategy of the lsquomost popularrsquo tags this is a strategy often used to compare the globally most popular tags to the tags predicted by a particular personalization strategy thus investigating whether the personalization is worth the effort and is able to outperform the easily available baseline

[Guy et al Social Media Recommendation based on People and Tags SIGIRrsquo10]]

Monday February 27 12

user interactions (level amp type) instead of general social context - better for recommendations

does hybrid always work worseMonday February 27 12

Recommendation Systems

Predict items that are relevantusefulinteresting (and to what extent)

for given user (in a given context)

itrsquos often a ranking task

Monday February 27 12

Monday February 27 12

Monday February 27 12

Monday February 27 12

httpwwwwiredcommagazine201111mf_artsyall1Monday February 27 12

Collaborative Filtering

[Figure: "likes" relations between users u1 and u2 and movies, e.g. Pulp Fiction]

• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users

• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes

• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster

• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences

• Others: rule-based, other data mining techniques
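
The memory-based variant can be sketched in a few lines. This is a minimal illustration, not a production recommender: the ratings are invented (only the title Pulp Fiction comes from the slide), cosine similarity is one of several possible user-similarity measures, and `recommend` is a hypothetical helper.

```python
import math


def cosine(r1, r2):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    common = set(r1) & set(r2)
    dot = sum(r1[i] * r2[i] for i in common)
    norm1 = math.sqrt(sum(v * v for v in r1.values()))
    norm2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0


def recommend(target, ratings, top_n=3):
    """Score unseen items by the similarity-weighted ratings of other users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        sim = cosine(ratings[target], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical user-item matrix stored as nested dicts.
ratings = {
    "u1": {"Kill Bill": 5, "Reservoir Dogs": 4},
    "u2": {"Kill Bill": 5, "Reservoir Dogs": 5, "Pulp Fiction": 5},
    "u3": {"Titanic": 5, "Kill Bill": 1},
}

recommendations = recommend("u1", ratings)
print(recommendations)
```

Because u2 rates the shared movies much like u1 does, the item that only u2 has rated is ranked ahead of the item that only the dissimilar user u3 has rated.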

Memory vs Model-based

Memory-based:

• complete input data is required

• pre-computation not possible

• does not scale well ("tricks" are needed)

• high quality of recommendations

Model-based:

• abstraction (model) of input data

• pre-computation (partially) possible (model has to be re-built from time to time)

• scales better

• abstraction may reduce recommendation quality

Social Networks & Interest Similarity

• collaborative filtering: 'neighborhoods' of people with similar interests & recommending items based on likings in the neighborhood

• limitations: next to 'cold start' and 'sparsity', the lack of control (over one's neighborhood) is also a problem, i.e. users cannot add 'trusted' people nor exclude 'strange' ones

• therefore interest in 'social recommenders', where the presence of social connections defines the similarity in interests (e.g. social tagging: CiteULike)

• does a social connection indicate user interest similarity?

• how much does users' interest similarity depend on the strength of their connection?

• is it feasible to use a social network as a personalized recommendation?

[Lin & Brusilovsky, Social Networks and Interest Similarity: The Case of CiteULike, HT'10]

Conclusions

• pairs that are unilaterally connected have more common information items, metadata and tags than non-connected pairs

• the similarity was largest for direct connections and decreased with increasing distance between users in the social network

• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship

• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems

• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation

Social recommendations on conflicting groups

are all interests of our friends relevant? Is it application generic?

Content-based Recommendations

• Input: characteristics of items & interests of a user in characteristics of items => Recommend items that feature characteristics which meet the user's interests

• Techniques:

  • Data mining methods: cluster items based on their characteristics => infer users' interests in clusters

  • IR methods: represent items & users as term vectors => compute similarity between the user profile vector and items

  • Utility-based methods: a utility function gets an item as input; the parameters of the utility function are customized via the preferences of a user

Content Features

Item a:

Government stops renovation of tower bridge (Oct 13th 2011)

Tower Bridge today: Under construction

Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames

Category: politics, england. Related Twiper news: bob: Why do they stop to… [more]; mary: London stops reno… [more]

Concept           a
db:Politics       0.2
db:Sports         0
db:Education      0
db:London         0.2
db:Tower_Bridge   0.4
db:Government     0.1
db:UK             0.1

Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)

User Model

User's Twitter history:

RT Government stops renovation of tower bridge (Oct 13th 2011)

I am in London at the moment (Oct 13th 2011)

I am doing sports (Oct 12th 2011)

Concept           u
db:Politics       0
db:Sports         0.1
db:Education      0
db:London         0.5
db:Tower_Bridge   0.2
db:Government     0.2
db:UK             0

Weighting strategy:
- occurrence frequency (e.g. smoothened by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)
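
The weighting strategy for the user model can be sketched as follows. Several things here are assumptions made for illustration: the concept annotations attached to each tweet are guessed, the exponential decay with a half-life is just one possible way to make recent concepts count more, and `build_profile` is a hypothetical helper. The sketch is not meant to reproduce the exact weights of the vector u on the slide.

```python
from collections import defaultdict
from datetime import date


def build_profile(annotated_tweets, today, half_life_days=None):
    """Return {concept: weight} with weights summing to 1 (1-norm)."""
    raw = defaultdict(float)
    for posted_on, concepts in annotated_tweets:
        if half_life_days is None:
            decay = 1.0  # plain occurrence frequency
        else:
            age = (today - posted_on).days
            decay = 0.5 ** (age / half_life_days)  # recent concepts count more
        for concept in concepts:
            raw[concept] += decay
    total = sum(raw.values())
    return {c: w / total for c, w in raw.items()} if total else {}


# Hypothetical concept annotations for the three tweets in the history.
tweets = [
    (date(2011, 10, 13), ["db:Tower_Bridge", "db:Government", "db:London"]),
    (date(2011, 10, 13), ["db:London"]),
    (date(2011, 10, 12), ["db:Sports"]),
]

plain = build_profile(tweets, today=date(2011, 10, 13))
recent = build_profile(tweets, today=date(2011, 10, 13), half_life_days=1)
print(plain)
print(recent)
```

With the time-sensitive variant the one-day-old concept db:Sports receives a smaller share of the profile than with plain occurrence frequency, while both profiles still sum to 1.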

Recommendations

Concept           user u   item a   item b   item c
db:Politics       0        0.2      0        0
db:Sports         0.1      0        0        0.5
db:Education      0        0        0        0.2
db:London         0.5      0.2      0.8      0
db:Tower_Bridge   0.2      0.4      0.2      0
db:Government     0.2      0.1      0        0
db:UK             0        0.1      0        0.3

Cosine similarities:

      a      b      c
u     0.67   0.92   0.14

Ranking of recommended items: 1. b  2. a  3. c
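
The ranking of the worked example can be recomputed with a short script. The vectors are the ones from the example; the assumption is that their components follow the concept order db:Politics, db:Sports, db:Education, db:London, db:Tower_Bridge, db:Government, db:UK.

```python
import math

CONCEPTS = ["db:Politics", "db:Sports", "db:Education", "db:London",
            "db:Tower_Bridge", "db:Government", "db:UK"]

# User profile vector u and candidate item vectors a, b, c (1-norm normalized).
u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}


def cosine(x, y):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    norm_y = math.sqrt(sum(yi * yi for yi in y))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0


# Every vector must have one component per concept.
assert all(len(vec) == len(CONCEPTS) for vec in [u, *items.values()])

similarities = {name: cosine(u, vec) for name, vec in items.items()}
ranking = sorted(similarities, key=similarities.get, reverse=True)
for name in ranking:
    print(name, round(similarities[name], 2))
```

Rounded to two decimals this gives 0.92 for b, 0.67 for a and 0.14 for c, matching the similarities and the ranking b, a, c of the example.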

RecSys Problems

• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users

• Changing User Preferences: user interests may change over time

• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item

• Lack of Diversity / Overfitting: for many applications good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior etc.). When adapting too strongly to the preferences of a user, the user might see the same or similar recommendations again and again

• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people

• Research challenge: find the right balance between serendipity & personalization

• Research challenge: find the right way to use the influence of the recommendations on the user's behavior

What is true personalization?

Hands-on Teaser

• Build your own recommender system 101

• Recommend pages on Delicious

• Recommend pages to your Facebook friends

image source: http://www.flickr.com/photos/bionicteaching/1375254387


Page 4: Lecture 4: Social Web Personalization (2012)

User Modeling

httpfarm5staticflickrcom40384553496383_5b6a5f1485_ojpg

How to infer amp represent user information that supports a given application or context

Monday February 27 12

User Modeling Challengebull Application has to obtain

understand amp exploit information about the user

bull Information (need amp context) about user

bull Inferring information about user amp representing it so that it can be consumed by the application

bull Data relevant for inferring information about user

Monday February 27 12

User amp Usage Data is everywhere

bull People leave traces on the Web and on their computers

bull Usage data eg query logs click-through-data

bull Social data eg tags (micro-)blog posts comments bookmarks friend connections

bull Documents eg pictures videos

bull Personal data eg affiliations locations

bull Products applications services - bought used installed

bull Not only a userrsquos behavior but also interactions of other users

bull ldquopeople can make statements about merdquo ldquopeople who are similar to me can reveal information about merdquo --gt ldquosocial learningrdquo collaborative recommender systems

Monday February 27 12

UM Basic Concepts

bull User Profile a data structure that represents a characterization of a user at a particular moment of time represents what from a given (system) perspective there is to know about a user The data in the profile can be explicitly given by the user or derived by the system

bull User Model contains the definitions amp rules for the interpretation of observations about the user and about the translation of that interpretation into the characteristics in a user profile user model is the recipe for obtaining and interpreting user profiles

bull User Modeling the process of representing the user

Monday February 27 12

User Modeling Approaches

bull Overlay User Modeling describe user characteristics eg ldquoknowledge of a userrdquo ldquointerests of a userrdquo with respect to ldquoidealrdquo characteristics

bull Customizing user explicitly provides amp adjusts elements of the user profile

bull User model elicitation ask amp observe the user learn amp improve user profile successively ldquointeractive user modelingrdquo

bull Stereotyping stereotypical characteristics to describe a user

bull User Relevance Modeling learninfer probabilities that a given item or concept is relevant for a user

Related scientific conference httpumap2011org Related journal httpumuaiorg

Monday February 27 12

Which approach suits

best the conditions of applications

httpfarm7staticflickrcom62406346803873_e756dd9bae_bjpg

Monday February 27 12

Overlay User Models

bull among the oldest user models

bull used for modeling student knowledge

bull the user is typically characterized in terms of domain concepts amp hypotheses of the userrsquos knowledge about these concepts in relation to an (ideal) expertrsquos knowledge

bull concept-value pairs

Monday February 27 12

User Model Elicitation

bull Ask the user explicitly learn

bull NLP intelligent dialogues

bull Bayesian networks Hidden Markov models

bull Observe the user learn

bull Logs machine learning

bull Clustering classification data mining

bull Interactive user modeling mixture of direct inputs of a user observations and inferences

Monday February 27 12

httphunchcomMonday February 27 12

Monday February 27 12

Stereotyping

bull set of characteristics (eg attribute-value pairs) that describe a group of users

bull user is not assigned to a single stereotype - user profile can feature characteristics of several different stereotypes

Monday February 27 12

Why are stereotypes

useful

httpfarm1staticflickrcom155413650229_31ef379b0b_bjpg

Monday February 27 12

Can we use Social Web

data for user

modeling

Monday February 27 12

Can we infer a Twitter-based user profile

User Modeling (4 building blocks)

Semantic Enrichment Linkage and Alignment

Personalized News Recommender

Profile

I want my

personalized news recommendations

Example from Abel et al (2011)Monday February 27 12

User Modeling Building Blocks

Profile concept weight

time

1 Which tweets of the user should be

analyzed

Morning Afternoon Night

1 Temporal Constraints

June 27 July 4 July 11

(b) temporal patterns

weekends start end

(a) time period

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

Francesca Schiavone won French Open fo2010

Francesca Schiavone

French Open

Francesca Schiavone French Open entity-based

Sport T

T topic-based

2 What type of concepts should represent ldquointerestsrdquo

fo2010

fo2010 hashtag-based

1 Temporal Constraints

time

June 27 July 4 July 11

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

Francesca Schiavone won httpbitly2f4t7a

Francesca Schiavone

3 Further enrich the semantics of tweets

1 Temporal Constraints

3 Semantic Enrichment

Francesca Schiavone

Francesca wins French Open Thirty in womens tennis is primordially old an age when agility and desire recedes as the hellip

French Open

Tennis

French Open

Tennis

(b) further enrichment

(a) tweet-based

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

4 How to weight the concepts

1 Temporal Constraints

3 Semantic Enrichment

Francesca Schiavone

French Open

Tennis

4 Weighting Scheme

time

June 27 July 4 July 11

weight(Francesca Schiavone)

Concept frequency (TF)

4

3 6

TFxIDF Time-sensitive

weight(French Open)

weight(Tennis)

Monday February 27 12

Observationsbull Profile characteristics

bull Semantic enrichment solves sparsity problems

bull Profiles change over time fresh profiles reflect better current user demands

bull Temporal patterns weekend profiles differ significantly from weekday profiles

bull Impact on news recommendations

bull The more fine-grained the concepts the better the recommendation performance entity-based gt topic-based gt hashtag-based

bull Semantic enrichment improves recommendation quality

bull Time-sensitivity (adapting to trends) improves performance

Monday February 27 12

User Modelingit is not about putting everything in a user profile

it is about making the right choices

Monday February 27 12

User AdaptationKnowing the user - this knowledge - can be applied to adapt

a system or interface to the user to improve the system functionality and user experience

Monday February 27 12

Monday February 27 12

user modeling

user profile

observations data and information about user

profile analysis

adaptation decisions

User-Adaptive Systems

A Jameson Adaptive interfaces and agents The HCI handbook fundamentals evolving technologies and emerging applications pp 305ndash330 2003

Monday February 27 12

user modeling (infer current musical taste)

user profile interests in

genres artists tags

history of songs like ban pause skip

compare profile with possible next

songs to play

next song to be played

Lastfm adapts to your music taste

Monday February 27 12

Google Adaptive Search

httpwwwgooglecomgoodtoknow

Monday February 27 12

Issues in User-Adaptive Systems

bull Overfitting ldquobubble effectsrdquo loss of serendipity problem

bull systems may adapt too strongly to the interestsbehavior

bull eg an adaptive radio station may always play the same or very similar songs

bull We search for the right balance between novelty and relevance for the user

bull ldquoLost in Hyperspacerdquo problem

bull when adapting the navigation ndash ie the links on which users can click to findaccess information

bull eg re-orderinghiding of menu items may lead to confusion

Monday February 27 12

httpwwwflickrcomphotosbellarosebyliz4729613108

What is good user modeling amp personalization

Monday February 27 12

Success perspectives

bull From the consumer perspective of an adaptive system

bull From the provider perspective of an adaptive system

Adaptive system maximizes satisfaction of the user

hard to measureobtain

Adaptive system maximizes the profit

influence of UM amp personalization may be hard to measureobtain

Monday February 27 12

Evaluation Strategies

• User studies: clean-room study; ask/observe (selected) people whether you did a good job

• Log analysis: analyze (click) data and infer whether you did a good job, e.g. cross-validation by "leave-one-out"

• Evaluation of user modeling:

  • measure quality of profiles directly, e.g. measure overlap with existing (true) profiles, or let people judge the quality of the generated user profiles

  • measure quality of application that exploits the user profile, e.g. apply user modeling strategies in a recommender system


Possible Metrics

• The usual IR metrics:

  • Precision: fraction of retrieved items that are relevant

  • Recall: fraction of relevant items that have been retrieved

  • F-Measure: (harmonic) mean of precision and recall

• Metrics for evaluating recommendation (rankings):

  • Mean Reciprocal Rank (MRR) of first relevant item

  • Success@k: probability that a relevant item occurs within the top k

  • If a true ranking is given: rank correlations

  • Precision@k, Recall@k & F-Measure@k

• Metrics for evaluating prediction of user preferences:

  • MAE = Mean Absolute Error

  • True/False Positives/Negatives

[Figure: performance of strategy X and of the baseline over several runs]

Is strategy X better than the baseline?
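The metrics listed above can be written down in a few lines. The following is a minimal sketch, not a reference implementation: the function names and the toy data are illustrative assumptions, and items are identified by plain strings.

```python
from typing import Sequence, Set


def precision(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of retrieved items that are relevant."""
    if not retrieved:
        return 0.0
    return sum(1 for item in retrieved if item in relevant) / len(retrieved)


def recall(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of relevant items that have been retrieved."""
    if not relevant:
        return 0.0
    return len(set(retrieved) & relevant) / len(relevant)


def f_measure(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


def reciprocal_rank(ranking: Sequence[str], relevant: Set[str]) -> float:
    """1 / position of the first relevant item (0 if none is found).
    MRR is the mean of this value over all users or queries."""
    for position, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def success_at_k(ranking: Sequence[str], relevant: Set[str], k: int) -> float:
    """1 if a relevant item occurs within the top k, else 0.
    Averaged over users this estimates the Success@k probability."""
    return 1.0 if any(item in relevant for item in ranking[:k]) else 0.0


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute difference between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Toy data (hypothetical): a ranked list and the set of truly relevant items.
ranking = ["a", "b", "c", "d"]
relevant = {"b", "d", "e"}
p, r = precision(ranking, relevant), recall(ranking, relevant)
print(p, r, f_measure(p, r), reciprocal_rank(ranking, relevant))
```

Precision@k and Recall@k follow by passing `ranking[:k]` instead of the full ranking.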

Example Evaluation

• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)

• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation

• Steps:

  1. Gather a dataset of tag data, part of which can be used as input, and aim to test the recommendation on the remaining tag data

  2. Use the input data and calculate the predictions for the different strategies

  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)

  4. Test the results for statistical significance using Student's t-test, relative to the baseline (e.g. existing approach, competitive approach)

[Rae et al.: Improving Tag Recommendations Using Social Networks, RIAO'10]
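Steps 3 and 4 can be sketched as follows. This is an illustration under stated assumptions, not the procedure from the paper: the function names and per-user scores are hypothetical, and the paired form of the t-test is assumed because both strategies are scored on the same users. Only the t statistic is computed; turning it into a p-value needs the t distribution with n - 1 degrees of freedom.

```python
import math
from statistics import mean, stdev


def precision_at_k(ranking, relevant, k=5):
    """Fraction of the top-k recommended tags that are in the held-out set."""
    return sum(1 for tag in ranking[:k] if tag in relevant) / k


def average_precision(ranking, relevant):
    """Mean of the precision values at each rank where a relevant tag occurs.
    MAP is the mean of this value over all test cases."""
    hits, total = 0, 0.0
    for position, tag in enumerate(ranking, start=1):
        if tag in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


def paired_t_statistic(scores_x, scores_baseline):
    """t statistic of the per-case differences between strategy X and baseline."""
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(len(diffs)))


# Hypothetical test case: tags recommended for one photo vs. held-out tags.
recommended = ["london", "uk", "bridge", "river", "thames"]
held_out = {"london", "bridge"}
print(precision_at_k(recommended, held_out), average_precision(recommended, held_out))

# Hypothetical per-user P@5 scores for strategy X and for the baseline.
print(paired_t_statistic([0.5, 0.6, 0.7, 0.8], [0.4, 0.4, 0.6, 0.5]))
```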

Example Evaluation

• [Guy et al.] shows another example of a similar evaluation approach

• Here the strategies differ in the way people and tags are used. In these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the aspects of these relationships that are relevant for modeling and recommendation

• Here the baseline is the strategy of the 'most popular' tags. This baseline is often used: the globally most popular tags are compared to the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and is able to outperform the easily available baseline

[Guy et al.: Social Media Recommendation based on People and Tags, SIGIR'10]
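The 'most popular' baseline is cheap to compute, which is why it is a common point of comparison. A minimal sketch, assuming tag assignments are available as hypothetical (user, item, tag) triples:

```python
from collections import Counter


def most_popular_tags(assignments, k=3):
    """Return the k globally most frequent tags, ignoring who used them.
    Ties are broken alphabetically so the result is deterministic."""
    counts = Counter(tag for _user, _item, tag in assignments)
    ordered = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [tag for tag, _count in ordered[:k]]


# Hypothetical tag assignments: (user, item, tag).
assignments = [
    ("u1", "i1", "python"),
    ("u2", "i1", "web"),
    ("u2", "i2", "python"),
    ("u3", "i3", "social"),
    ("u3", "i1", "web"),
    ("u1", "i2", "python"),
]
baseline = most_popular_tags(assignments, k=2)
print(baseline)
```

Every user receives the same list from this baseline, so a personalization strategy is only worth its cost if it scores measurably better on the chosen metric.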

user interactions (level & type) instead of general social context - better for recommendations

does hybrid always work worse?

Recommendation Systems

Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)

it's often a ranking task

http://www.wired.com/magazine/2011/11/mf_artsy/all/1


Collaborative Filtering

[Figure: users u1 and u2 connected to items by "likes" edges, leading to "u1 likes Pulp Fiction"]

• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users

• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes

• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster

• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences

• Others: rule-based, other data mining techniques
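The memory-based variant can be sketched directly on the user-item matrix. This is a minimal illustration, assuming ratings are stored as nested dictionaries and that cosine similarity is used between users; the similarity measure and every movie title other than Pulp Fiction are assumptions for the example.

```python
import math


def cosine(r1, r2):
    """Cosine similarity between two sparse rating dictionaries."""
    shared = set(r1) & set(r2)
    dot = sum(r1[i] * r2[i] for i in shared)
    n1 = math.sqrt(sum(v * v for v in r1.values()))
    n2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0


def recommend(user, ratings, top_n=3):
    """Score unseen items by the similarity-weighted ratings of other users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue  # users with nothing in common contribute nothing
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical "likes" data (1 = likes).
ratings = {
    "u1": {"Kill Bill": 1, "Reservoir Dogs": 1},
    "u2": {"Kill Bill": 1, "Reservoir Dogs": 1, "Pulp Fiction": 1},
    "u3": {"Titanic": 1},
}
print(recommend("u1", ratings))
```

Because u2 shares both of u1's likes, the item u2 likes and u1 has not seen is recommended; u3 has no overlap with u1 and is ignored.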

Memory vs Model-based

Memory-based:

• complete input data is required

• pre-computation not possible

• does not scale well ("tricks" are needed)

• high quality of recommendations

Model-based:

• abstraction (model) of input data

• pre-computation (partially) possible (model has to be re-built from time to time)

• scales better

• abstraction may reduce recommendation quality
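To make the pre-computation point concrete, the sketch below builds a small item-to-item model once and then answers recommendation requests from the model alone. It estimates P(likes B | likes A) from simple co-occurrence counts, which is an assumption chosen for brevity and not a full Bayesian network; all names and data are hypothetical.

```python
from collections import defaultdict
from itertools import permutations
from typing import Dict, List, Set, Tuple


def build_item_model(likes: Dict[str, Set[str]]) -> Dict[Tuple[str, str], float]:
    """Pre-compute P(likes b | likes a) = count(a and b) / count(a)."""
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in likes.values():
        for item in items:
            item_count[item] += 1
        for a, b in permutations(items, 2):
            pair_count[(a, b)] += 1
    return {(a, b): c / item_count[a] for (a, b), c in pair_count.items()}


def recommend_from_model(model, liked: Set[str], top_n: int = 3) -> List[str]:
    """Rank unseen items by their best conditional probability given a liked item."""
    scores = defaultdict(float)
    for (a, b), probability in model.items():
        if a in liked and b not in liked:
            scores[b] = max(scores[b], probability)
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical input data; the model is built once, offline.
likes = {"u1": {"A", "B"}, "u2": {"A", "B", "C"}, "u3": {"A"}}
model = build_item_model(likes)

# At request time only the model is consulted, not the raw user data.
print(recommend_from_model(model, {"A"}))
```

The model goes stale as new likes arrive, which is the "re-built from time to time" cost noted above.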

Social Networks & Interest Similarity

• collaborative filtering: 'neighborhoods' of people with similar interests & recommending items based on likings in the neighborhood

• limitations: next to 'cold start' and 'sparsity', the lack of control (over one's neighborhood) is also a problem, i.e. one cannot add 'trusted' people nor exclude 'strange' ones

• therefore: interest in 'social recommenders', where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike):

  • does a social connection indicate user interest similarity?

  • how much does users' interest similarity depend on the strength of their connection?

  • is it feasible to use a social network as a personalized recommendation?

[Lin & Brusilovsky: Social Networks and Interest Similarity: The Case of CiteULike, HT'10]

Conclusions

• unilaterally connected pairs have more common information items, metadata, and tags than non-connected pairs

• the similarity was largest for direct connections and decreased as the distance between users in the social network increased

• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship

• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems

• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation

Social recommendations on conflicting groups

are all interests of our friends relevant? Is it application generic?


Content-based Recommendations

• Input: characteristics of items & interests of a user in characteristics of items => Recommend items that feature characteristics which meet the user's interests

• Techniques:

  • Data mining methods: Cluster items based on their characteristics => Infer users' interests in clusters

  • IR methods: Represent items & users as term vectors => Compute similarity between user profile vector and items

  • Utility-based methods: Utility function that gets an item as input; the parameters of the utility function are customized via preferences of a user


Content Features

Example item a (news article):

Government stops renovation of tower bridge (Oct 13th 2011)

Tower Bridge today: Under construction

Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames

Category: politics, england. Related Twiper news: bob: Why do they stop to… [more]; mary: London stops reno… [more]

Concept vector a:

db:Politics        0.2
db:Sports          0
db:Education       0
db:London          0.2
db:Tower_Bridge    0.4
db:Government      0.1
db:UK              0.1

Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)
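The weighting strategy (occurrence frequency followed by 1-norm normalization) can be sketched as below. The concept mentions are hypothetical counts chosen so that the result matches the vector a on the slide; how concepts are extracted from the article text is outside this sketch.

```python
from collections import Counter

VOCABULARY = [
    "db:Politics", "db:Sports", "db:Education", "db:London",
    "db:Tower_Bridge", "db:Government", "db:UK",
]


def concept_vector(mentions, vocabulary=VOCABULARY):
    """Count concept occurrences and divide by the total (1-norm),
    so that the entries of the vector sum to 1."""
    counts = Counter(mentions)
    total = sum(counts[concept] for concept in vocabulary)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[concept] / total for concept in vocabulary]


# Hypothetical concept mentions extracted from the Tower Bridge article.
mentions = (
    ["db:Politics"] * 2 + ["db:London"] * 2 + ["db:Tower_Bridge"] * 4
    + ["db:Government"] + ["db:UK"]
)
a = concept_vector(mentions)
print(a)
```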

User Model

User's Twitter history:

RT Government stops renovation of tower bridge (Oct 13th 2011)

I am in London at the moment (Oct 13th 2011)

I am doing sports (Oct 12th 2011)

Concept vector u:

db:Politics        0
db:Sports          0.1
db:Education       0
db:London          0.5
db:Tower_Bridge    0.2
db:Government      0.2
db:UK              0

Weighting strategy:
- occurrence frequency (e.g. smoothened by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)
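One way to make recent concepts count more is to let each occurrence decay with its age before normalizing. The slide does not say which smoothing function is used, so the exponential decay and the `half_life_days` parameter below are assumptions for illustration only.

```python
from datetime import date


def time_weighted_profile(observations, vocabulary, today, half_life_days=7.0):
    """Sum exponentially decayed occurrences per concept, then 1-normalize.
    An occurrence that is `half_life_days` old counts half as much as a fresh one."""
    weights = {concept: 0.0 for concept in vocabulary}
    for concept, day in observations:
        if concept in weights:
            age_in_days = (today - day).days
            weights[concept] += 0.5 ** (age_in_days / half_life_days)
    total = sum(weights.values())
    return {c: (w / total if total else 0.0) for c, w in weights.items()}


# Hypothetical concept occurrences taken from the two dated tweets on the slide.
observations = [
    ("db:London", date(2011, 10, 13)),
    ("db:Sports", date(2011, 10, 12)),
]
profile = time_weighted_profile(
    observations, ["db:London", "db:Sports"], today=date(2011, 10, 13),
    half_life_days=1.0,
)
print(profile)
```

With a one-day half-life the concept from the older tweet receives half the raw weight of the concept from the newer one.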

Recommendations

User profile u and candidate items a, b, c:

                   u      a      b      c
db:Politics        0      0.2    0      0
db:Sports          0.1    0      0      0.5
db:Education       0      0      0      0.2
db:London          0.5    0.2    0.8    0
db:Tower_Bridge    0.2    0.4    0.2    0
db:Government      0.2    0.1    0      0
db:UK              0      0.1    0      0.3

Cosine similarities:

       a      b      c
u      0.67   0.92   0.14

Ranking of recommended items: 1. b, 2. a, 3. c
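The similarities on this slide can be recomputed from the four vectors. The sketch below uses the standard cosine formula, dot(u, x) / (|u| · |x|), with the concept order db:Politics, db:Sports, db:Education, db:London, db:Tower_Bridge, db:Government, db:UK; the variable and function names are illustrative.

```python
import math


def cosine_similarity(x, y):
    """Dot product divided by the product of the Euclidean norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}

scores = {name: cosine_similarity(u, vector) for name, vector in items.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

print({name: round(score, 2) for name, score in scores.items()})
# -> {'a': 0.67, 'b': 0.92, 'c': 0.14}
print(ranking)
# -> ['b', 'a', 'c']
```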

RecSys Problems

• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users

• Changing User Preferences: user interests may change over time

• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item

• Lack of Diversity / Overfitting: for many applications good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same or similar recommendations again and again

• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people

• Research challenge: find the right balance between serendipity & personalization

• Research challenge: find the right way to use the influence of the recommendations on the user's behavior


What is true personalization?

Hands-on Teaser

• Build your own recommender system 101

• Recommend pages on delicious

• Recommend pages to your Facebook friends

image source: http://www.flickr.com/photos/bionicteaching/1375254387


Page 5: Lecture 4: Social Web Personalization (2012)

User Modeling Challenge

• Application has to obtain, understand & exploit information about the user

• Information (need & context) about user

• Inferring information about user & representing it so that it can be consumed by the application

• Data relevant for inferring information about user


User & Usage Data is everywhere

• People leave traces on the Web and on their computers:

  • Usage data, e.g. query logs, click-through data

  • Social data, e.g. tags, (micro-)blog posts, comments, bookmarks, friend connections

  • Documents, e.g. pictures, videos

  • Personal data, e.g. affiliations, locations

  • Products, applications, services - bought, used, installed

• Not only a user's behavior, but also interactions of other users:

  • "people can make statements about me", "people who are similar to me can reveal information about me" --> "social learning", collaborative recommender systems


UM Basic Concepts

• User Profile: a data structure that represents a characterization of a user at a particular moment in time; it represents what, from a given (system) perspective, there is to know about a user. The data in the profile can be explicitly given by the user or derived by the system

• User Model: contains the definitions & rules for the interpretation of observations about the user and for the translation of that interpretation into the characteristics in a user profile; the user model is the recipe for obtaining and interpreting user profiles

• User Modeling: the process of representing the user
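The distinction between profile (data) and model (recipe) can be illustrated with two small classes. Everything here is a hypothetical sketch: the class names, the rule signature and the `tag:` observation format are assumptions made for the example and are not taken from any particular system.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, Dict, List


@dataclass
class UserProfile:
    """Characterization of one user at a particular moment in time."""
    user_id: str
    created_at: datetime
    characteristics: Dict[str, float] = field(default_factory=dict)


@dataclass
class UserModel:
    """The recipe: rules that turn raw observations into profile characteristics."""
    rules: List[Callable[[List[str]], Dict[str, float]]]

    def build_profile(self, user_id: str, observations: List[str],
                      now: datetime) -> UserProfile:
        # User modeling = applying the model to observations to obtain a profile.
        profile = UserProfile(user_id, now)
        for rule in self.rules:
            profile.characteristics.update(rule(observations))
        return profile


def tag_interest_rule(observations: List[str]) -> Dict[str, float]:
    """Hypothetical rule: relative frequency of each tag the user assigned."""
    tags = [o[len("tag:"):] for o in observations if o.startswith("tag:")]
    return {tag: tags.count(tag) / len(tags) for tag in set(tags)}


model = UserModel(rules=[tag_interest_rule])
profile = model.build_profile(
    "u1", ["tag:jazz", "tag:jazz", "tag:rock", "click:home"], datetime(2012, 2, 27)
)
print(profile.characteristics)
```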

User Modeling Approaches

• Overlay User Modeling: describe user characteristics, e.g. "knowledge of a user", "interests of a user", with respect to "ideal" characteristics

• Customizing: user explicitly provides & adjusts elements of the user profile

• User model elicitation: ask & observe the user; learn & improve user profile successively ("interactive user modeling")

• Stereotyping: stereotypical characteristics to describe a user

• User Relevance Modeling: learn/infer probabilities that a given item or concept is relevant for a user

Related scientific conference: http://umap2011.org Related journal: http://umuai.org


Which approach best suits the conditions of applications?

http://farm7.static.flickr.com/6240/6346803873_e756dd9bae_b.jpg


Overlay User Models

• among the oldest user models

• used for modeling student knowledge

• the user is typically characterized in terms of domain concepts & hypotheses of the user's knowledge about these concepts in relation to an (ideal) expert's knowledge

• concept-value pairs
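An overlay model can be represented as concept-value pairs laid over the expert's values for the same concepts. The sketch below is illustrative only: the concepts, the 0 to 1 knowledge scale and the `threshold` parameter are assumptions.

```python
def overlay_gaps(expert, learner, threshold=0.2):
    """Return the concepts where the learner's estimated knowledge lies more
    than `threshold` below the expert's, together with the size of the gap."""
    gaps = {}
    for concept, expert_value in expert.items():
        difference = expert_value - learner.get(concept, 0.0)
        if difference > threshold:
            gaps[concept] = difference
    return gaps


# Hypothetical concept-value pairs on a scale from 0 (unknown) to 1 (mastered).
expert = {"recursion": 1.0, "loops": 1.0, "pointers": 1.0}
learner = {"recursion": 0.4, "loops": 0.9}

print(overlay_gaps(expert, learner))
```

A tutoring application could use the returned gaps to decide which concept to present next.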

User Model Elicitation

• Ask the user explicitly => learn

  • NLP, intelligent dialogues

  • Bayesian networks, Hidden Markov models

• Observe the user => learn

  • Logs, machine learning

  • Clustering, classification, data mining

• Interactive user modeling: mixture of direct inputs of a user, observations and inferences

http://hunch.com


Stereotyping

• set of characteristics (e.g. attribute-value pairs) that describe a group of users

• user is not assigned to a single stereotype - user profile can feature characteristics of several different stereotypes
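A profile that draws on several stereotypes can be sketched as a merge of attribute-value sets. The stereotypes, the membership degrees and the conflict rule (the stereotype with the strongest membership wins) are all assumptions made for this illustration.

```python
def profile_from_stereotypes(memberships, stereotypes):
    """Combine the attribute-value pairs of all stereotypes a user belongs to.
    When two stereotypes disagree on an attribute, the value of the stereotype
    with the higher membership degree is kept."""
    profile = {}
    strength = {}
    for name, degree in memberships.items():
        for attribute, value in stereotypes.get(name, {}).items():
            if degree > strength.get(attribute, 0.0):
                profile[attribute] = value
                strength[attribute] = degree
    return profile


# Hypothetical stereotypes described by attribute-value pairs.
stereotypes = {
    "student": {"budget": "low", "topic": "textbooks"},
    "musician": {"topic": "instruments", "format": "audio"},
}

# Hypothetical user who partly fits both groups.
profile = profile_from_stereotypes({"student": 0.8, "musician": 0.6}, stereotypes)
print(profile)
```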

Why are stereotypes useful?

http://farm1.static.flickr.com/155/413650229_31ef379b0b_b.jpg

Can we use Social Web data for user modeling?


Can we infer a Twitter-based user profile?

[Figure: pipeline with the components Semantic Enrichment, Linkage and Alignment; User Modeling (4 building blocks); Profile; Personalized News Recommender. The user says: "I want my personalized news recommendations"]

Example from Abel et al. (2011)


User Modeling Building Blocks

1. Temporal Constraints: Which tweets of the user should be analyzed?

(a) time period: start and end on the time axis (ticks: June 27, July 4, July 11)

(b) temporal patterns: e.g. weekends; morning, afternoon, night

[Figure: the resulting profile is a list of (concept, weight) pairs]
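Both kinds of temporal constraint amount to filtering the user's tweets before the profile is built. A minimal sketch, assuming tweets are (timestamp, text) pairs; the function name, the tweet texts and the year 2010 are illustrative assumptions, since the slide shows dates without a year.

```python
from datetime import datetime


def select_tweets(tweets, start=None, end=None, weekends_only=False):
    """Keep tweets inside the time period [start, end] and, optionally,
    only those posted on a Saturday or Sunday."""
    selected = []
    for timestamp, text in tweets:
        if start is not None and timestamp < start:
            continue
        if end is not None and timestamp > end:
            continue
        if weekends_only and timestamp.weekday() < 5:  # Monday=0 ... Friday=4
            continue
        selected.append((timestamp, text))
    return selected


# Hypothetical tweets.
tweets = [
    (datetime(2010, 6, 27, 10, 0), "A"),  # Sunday
    (datetime(2010, 6, 30, 9, 0), "B"),   # Wednesday
    (datetime(2010, 7, 4, 20, 0), "C"),   # Sunday
    (datetime(2010, 7, 12, 8, 0), "D"),   # Monday, after the period
]
period = dict(start=datetime(2010, 6, 27), end=datetime(2010, 7, 11, 23, 59))

in_period = select_tweets(tweets, **period)
weekend_only = select_tweets(tweets, weekends_only=True, **period)
print([text for _, text in in_period], [text for _, text in weekend_only])
```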

User Modeling Building Blocks

2. Profile Type: What type of concepts should represent "interests"?

Example tweet: Francesca Schiavone won French Open #fo2010

• entity-based: Francesca Schiavone, French Open

• topic-based: Sport

• hashtag-based: #fo2010


User Modeling Building Blocks

3. Semantic Enrichment: Further enrich the semantics of tweets

Example tweet: Francesca Schiavone won http://bit.ly/2f4t7a

Linked article: Francesca wins French Open. Thirty in women's tennis is primordially old, an age when agility and desire recedes as the …

(a) tweet-based: Francesca Schiavone

(b) further enrichment: French Open, Tennis


User Modeling Building Blocks

4. Weighting Scheme: How to weight the concepts?

• Concept frequency (TF)

• TFxIDF

• Time-sensitive

[Figure: weight(Francesca Schiavone), weight(French Open) and weight(Tennis) computed over the time axis (June 27, July 4, July 11); example values shown: 4, 3, 6]
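The first two weighting schemes can be contrasted in a short sketch. The data and function names are hypothetical, and the IDF variant used here, log(number of users / number of users mentioning the concept), is one common choice and an assumption, since the slide does not specify the formula. The time-sensitive scheme is not shown.

```python
import math
from collections import Counter


def tf_weights(user_concepts):
    """Concept frequency: how often each concept occurs in the user's tweets."""
    return dict(Counter(user_concepts))


def tf_idf_weights(user_concepts, all_users_concepts):
    """TF scaled down for concepts that many users mention."""
    n_users = len(all_users_concepts)
    weights = {}
    for concept, frequency in Counter(user_concepts).items():
        users_with_concept = sum(1 for cs in all_users_concepts if concept in cs)
        idf = math.log(n_users / max(1, users_with_concept))
        weights[concept] = frequency * idf
    return weights


# Hypothetical concepts extracted from one user's tweets.
user = ["Tennis", "Tennis", "French Open"]

# Hypothetical concept sets of all users, including the user above.
everyone = [set(user), {"Tennis", "Football"}, {"Football"}, {"Tennis"}]

tf = tf_weights(user)
tfidf = tf_idf_weights(user, everyone)
print(tf, tfidf)
```

In this toy data TF ranks the frequently mentioned concept highest, while TFxIDF favors the concept that is specific to this user.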

Observations

• Profile characteristics:

  • Semantic enrichment solves sparsity problems

  • Profiles change over time: fresh profiles better reflect current user demands

  • Temporal patterns: weekend profiles differ significantly from weekday profiles

• Impact on news recommendations:

  • The more fine-grained the concepts, the better the recommendation performance: entity-based > topic-based > hashtag-based

  • Semantic enrichment improves recommendation quality

  • Time-sensitivity (adapting to trends) improves performance


User Modeling

it is not about putting everything in a user profile

it is about making the right choices



Page 6: Lecture 4: Social Web Personalization (2012)

User amp Usage Data is everywhere

bull People leave traces on the Web and on their computers

bull Usage data eg query logs click-through-data

bull Social data eg tags (micro-)blog posts comments bookmarks friend connections

bull Documents eg pictures videos

bull Personal data eg affiliations locations

bull Products applications services - bought used installed

bull Not only a userrsquos behavior but also interactions of other users

bull ldquopeople can make statements about merdquo ldquopeople who are similar to me can reveal information about merdquo --gt ldquosocial learningrdquo collaborative recommender systems

Monday February 27 12

UM Basic Concepts

bull User Profile a data structure that represents a characterization of a user at a particular moment of time represents what from a given (system) perspective there is to know about a user The data in the profile can be explicitly given by the user or derived by the system

bull User Model contains the definitions amp rules for the interpretation of observations about the user and about the translation of that interpretation into the characteristics in a user profile user model is the recipe for obtaining and interpreting user profiles

bull User Modeling the process of representing the user

Monday February 27 12

User Modeling Approaches

bull Overlay User Modeling describe user characteristics eg ldquoknowledge of a userrdquo ldquointerests of a userrdquo with respect to ldquoidealrdquo characteristics

bull Customizing user explicitly provides amp adjusts elements of the user profile

bull User model elicitation ask amp observe the user learn amp improve user profile successively ldquointeractive user modelingrdquo

bull Stereotyping stereotypical characteristics to describe a user

bull User Relevance Modeling learninfer probabilities that a given item or concept is relevant for a user

Related scientific conference httpumap2011org Related journal httpumuaiorg

Monday February 27 12

Which approach suits

best the conditions of applications

httpfarm7staticflickrcom62406346803873_e756dd9bae_bjpg

Monday February 27 12

Overlay User Models

bull among the oldest user models

bull used for modeling student knowledge

bull the user is typically characterized in terms of domain concepts amp hypotheses of the userrsquos knowledge about these concepts in relation to an (ideal) expertrsquos knowledge

bull concept-value pairs

Monday February 27 12

User Model Elicitation

bull Ask the user explicitly learn

bull NLP intelligent dialogues

bull Bayesian networks Hidden Markov models

bull Observe the user learn

bull Logs machine learning

bull Clustering classification data mining

bull Interactive user modeling mixture of direct inputs of a user observations and inferences

Monday February 27 12

httphunchcomMonday February 27 12

Monday February 27 12

Stereotyping

bull set of characteristics (eg attribute-value pairs) that describe a group of users

bull user is not assigned to a single stereotype - user profile can feature characteristics of several different stereotypes

Monday February 27 12

Why are stereotypes

useful

httpfarm1staticflickrcom155413650229_31ef379b0b_bjpg

Monday February 27 12

Can we use Social Web

data for user

modeling

Monday February 27 12

Can we infer a Twitter-based user profile

User Modeling (4 building blocks)

Semantic Enrichment Linkage and Alignment

Personalized News Recommender

Profile

I want my

personalized news recommendations

Example from Abel et al (2011)Monday February 27 12

User Modeling Building Blocks

Profile concept weight

time

1 Which tweets of the user should be

analyzed

Morning Afternoon Night

1 Temporal Constraints

June 27 July 4 July 11

(b) temporal patterns

weekends start end

(a) time period

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

Francesca Schiavone won French Open fo2010

Francesca Schiavone

French Open

Francesca Schiavone French Open entity-based

Sport T

T topic-based

2 What type of concepts should represent ldquointerestsrdquo

fo2010

fo2010 hashtag-based

1 Temporal Constraints

time

June 27 July 4 July 11

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

Francesca Schiavone won httpbitly2f4t7a

Francesca Schiavone

3 Further enrich the semantics of tweets

1 Temporal Constraints

3 Semantic Enrichment

Francesca Schiavone

Francesca wins French Open Thirty in womens tennis is primordially old an age when agility and desire recedes as the hellip

French Open

Tennis

French Open

Tennis

(b) further enrichment

(a) tweet-based

Monday February 27 12

User Modeling Building Blocks

4. Weighting Scheme: How to weight the concepts?

• Concept frequency (TF)
• TFxIDF
• Time-sensitive

[Figure: weight(Francesca Schiavone), weight(French Open) and weight(Tennis) computed from tweets on a timeline (June 27, July 4, July 11); example weights shown: 4, 3, 6]
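The three weighting schemes named on this slide (concept frequency, TFxIDF, time-sensitive) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used by Abel et al.: the function names, the toy concept lists, computing IDF over other users' concept sets, and the 7-day half-life are all hypothetical choices made for the example.

```python
import math
from collections import Counter

def tf_weights(concepts):
    """Concept frequency (TF): count how often each concept occurs."""
    return Counter(concepts)

def tfidf_weights(user_concepts, all_users_concepts):
    """TF x IDF: down-weight concepts that many users mention.

    Assumes the user's own concept set is part of all_users_concepts,
    so every concept has a document frequency of at least 1.
    """
    n_users = len(all_users_concepts)
    weights = {}
    for concept, count in Counter(user_concepts).items():
        df = sum(1 for other in all_users_concepts if concept in other)
        weights[concept] = count * math.log(n_users / df)
    return weights

def time_sensitive_weights(dated_concepts, today, half_life_days=7.0):
    """Time-sensitive: each occurrence decays exponentially with its age in days."""
    weights = Counter()
    for concept, day in dated_concepts:
        weights[concept] += 0.5 ** ((today - day) / half_life_days)
    return dict(weights)

# Toy data (hypothetical): concepts extracted from one user's tweets.
mine = ["Tennis", "Tennis", "French Open", "Francesca Schiavone"]
everyone = [set(mine), {"Tennis", "Football"}, {"Politics"}]

tf = tf_weights(mine)
tfidf = tfidf_weights(mine, everyone)
recent = time_sensitive_weights([("Tennis", 0), ("French Open", 14)], today=14)
```

With plain TF the most frequent concept wins; TFxIDF favors concepts that distinguish this user from others; the time-sensitive variant favors what the user mentioned most recently.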

Observations

• Profile characteristics
  • Semantic enrichment solves sparsity problems
  • Profiles change over time: fresh profiles better reflect current user demands
  • Temporal patterns: weekend profiles differ significantly from weekday profiles
• Impact on news recommendations
  • The more fine-grained the concepts, the better the recommendation performance: entity-based > topic-based > hashtag-based
  • Semantic enrichment improves recommendation quality
  • Time-sensitivity (adapting to trends) improves performance

User Modeling

It is not about putting everything in a user profile; it is about making the right choices.

User Adaptation

Knowledge about the user can be applied to adapt a system or interface to that user, in order to improve the system functionality and the user experience.

User-Adaptive Systems

[Figure: observations, data and information about user → user modeling → user profile → profile analysis → adaptation decisions]

A. Jameson. Adaptive interfaces and agents. The HCI handbook: fundamentals, evolving technologies and emerging applications, pp. 305–330, 2003.

Last.fm adapts to your music taste

[Figure: history of songs (like, ban, pause, skip) → user modeling (infer current musical taste) → user profile (interests in genres, artists, tags) → compare profile with possible next songs to play → next song to be played]

Google Adaptive Search

http://www.google.com/goodtoknow

Issues in User-Adaptive Systems

• Overfitting, "bubble effects", loss of serendipity problem
  • systems may adapt too strongly to the interests/behavior
  • e.g. an adaptive radio station may always play the same or very similar songs
  • We search for the right balance between novelty and relevance for the user
• "Lost in Hyperspace" problem
  • arises when adapting the navigation, i.e. the links on which users can click to find/access information
  • e.g. re-ordering/hiding of menu items may lead to confusion

What is good user modeling & personalization?

http://www.flickr.com/photos/bellarosebyliz/4729613108

Success perspectives

• From the consumer perspective of an adaptive system: the adaptive system maximizes satisfaction of the user (hard to measure/obtain)
• From the provider perspective of an adaptive system: the adaptive system maximizes the profit (influence of UM & personalization may be hard to measure/obtain)

Evaluation Strategies

• User studies: clean-room study; ask/observe (selected) people whether you did a good job
• Log analysis: analyze (click) data and infer whether you did a good job, e.g. cross-validation by "leave-one-out"
• Evaluation of user modeling
  • measure quality of profiles directly, e.g. measure overlap with existing (true) profiles, or let people judge the quality of the generated user profiles
  • measure quality of an application that exploits the user profile, e.g. apply user modeling strategies in a recommender system
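As a rough illustration of "leave-one-out" log analysis, the sketch below hides one logged item at a time, lets a recommender see the rest of the user's history, and checks whether the hidden item comes back in the top k. The function names, the toy logs and the popularity-based recommender are assumptions made for the example; any recommender with the same call shape could be plugged in.

```python
def leave_one_out_hit_rate(histories, recommend, k=5):
    """Hide one item per user at a time and check whether it is recommended.

    histories: dict user -> list of items the user interacted with
    recommend: hypothetical function (user, known_items, k) -> list of items
    Returns the fraction of hidden items found in the top-k recommendations.
    """
    hits, trials = 0, 0
    for user, items in histories.items():
        for i, hidden in enumerate(items):
            known = items[:i] + items[i + 1:]
            if hidden in recommend(user, known, k):
                hits += 1
            trials += 1
    return hits / trials if trials else 0.0

# Toy recommender (assumption): suggests globally popular items
# that the user has not interacted with yet.
POPULAR = ["a", "b", "c", "d"]

def popular_recommender(user, known_items, k):
    return [item for item in POPULAR if item not in known_items][:k]

logs = {"u1": ["a", "b"], "u2": ["a", "z"]}
rate = leave_one_out_hit_rate(logs, popular_recommender, k=2)
```

Three of the four hidden items are recovered here; the unpopular item "z" is the one a popularity-only recommender cannot predict.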

Possible Metrics

• The usual IR metrics
  • Precision: fraction of retrieved items that are relevant
  • Recall: fraction of relevant items that have been retrieved
  • F-Measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings)
  • Mean Reciprocal Rank (MRR) of first relevant item
  • Success@k: probability that a relevant item occurs within the top k
  • If a true ranking is given: rank correlations
  • Precision@k, Recall@k & F-Measure@k
• Metrics for evaluating prediction of user preferences
  • MAE = Mean Absolute Error
  • True/False Positives/Negatives

[Figure: performance over runs, strategy X vs. baseline] Is strategy X better than the baseline?
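The metrics listed on this slide are simple to compute once recommendations and relevance judgments are available as lists and sets. The following Python sketch shows one straightforward way to do so; the function names and the toy inputs are illustrative assumptions, not part of the lecture.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall and their harmonic mean (F-measure)."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)
    p = tp / len(retrieved) if retrieved else 0.0
    r = tp / len(relevant) if relevant else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

def reciprocal_rank(ranking, relevant):
    """1 / position of the first relevant item, or 0 if there is none."""
    for pos, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / pos
    return 0.0

def mean_reciprocal_rank(rankings, relevants):
    pairs = list(zip(rankings, relevants))
    return sum(reciprocal_rank(r, rel) for r, rel in pairs) / len(pairs)

def success_at_k(rankings, relevants, k):
    """Fraction of rankings with at least one relevant item in the top k."""
    pairs = list(zip(rankings, relevants))
    hits = sum(1 for r, rel in pairs if any(i in rel for i in r[:k]))
    return hits / len(pairs)

def mean_absolute_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Toy inputs (hypothetical)
p, r, f = precision_recall_f1(["a", "b", "c", "d"], ["a", "c", "e"])
rankings = [["x", "a", "b"], ["c", "y", "z"]]
relevants = [{"a"}, {"z"}]
mrr = mean_reciprocal_rank(rankings, relevants)
s2 = success_at_k(rankings, relevants, k=2)
mae = mean_absolute_error([4, 3, 5], [5, 3, 3])
```

To answer "is strategy X better than the baseline", the same metric is computed for both strategies over the same runs and the two score lists are then compared.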

Example Evaluation

• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation
• Steps:
  1. Gather a dataset of tag data, part of which can be used as input, and aim to test the recommendation on the remaining tag data
  2. Use the input data and calculate the predictions for the different strategies
  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
  4. Test the results for statistical significance using Student's t-test, relative to the baseline (e.g. existing approach, competitive approach)

[Rae et al., Improving Tag Recommendations Using Social Networks, RIAO'10]
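Steps 3 and 4 of this evaluation recipe can be sketched as follows. The code computes P@5 and average precision for one ranked tag list, and a paired t statistic for per-user scores of a strategy versus a baseline. It is a minimal sketch under assumptions: the tag data and score lists are invented, the paired form of the t-test is assumed rather than taken from the paper, and turning the statistic into a p-value would additionally need a t-distribution table or a statistics library.

```python
import math
from statistics import mean, stdev

def precision_at_k(ranking, relevant, k=5):
    """Fraction of the top-k recommended tags that are in the held-out set."""
    return sum(1 for tag in ranking[:k] if tag in relevant) / k

def average_precision(ranking, relevant):
    """Mean of the precision values at each position where a relevant tag occurs."""
    hits, total = 0, 0.0
    for pos, tag in enumerate(ranking, start=1):
        if tag in relevant:
            hits += 1
            total += hits / pos
    return total / len(relevant) if relevant else 0.0

def paired_t_statistic(scores_x, scores_baseline):
    """t = mean(d) / (stdev(d) / sqrt(n)) over per-user score differences d.

    Requires at least two users and non-identical differences.
    """
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Hypothetical held-out tags and one strategy's ranked predictions
held_out = {"paris", "eiffel"}
predicted = ["paris", "france", "eiffel", "tower", "night"]
p5 = precision_at_k(predicted, held_out, k=5)
ap = average_precision(predicted, held_out)

# Hypothetical per-user P@5 scores for strategy X and the baseline
t = paired_t_statistic([0.6, 0.5, 0.7, 0.4], [0.4, 0.4, 0.5, 0.3])
```

MAP is the mean of `average_precision` over all test users; a large positive t suggests that strategy X beats the baseline consistently rather than by chance.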

Example Evaluation

• [Guy et al.] shows another example of a similar evaluation approach
• Here the strategies differ in the way people and tags are used. In these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the aspects of these relationships that are relevant for modeling and recommendation
• Their baseline is the strategy of the 'most popular' tags. This baseline is often used: the globally most popular tags are compared to the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and is able to outperform the easily available baseline

[Guy et al., Social Media Recommendation based on People and Tags, SIGIR'10]

User interactions (level & type) instead of general social context: better for recommendations

Does hybrid always work worse?

Recommendation Systems

Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)

It's often a ranking task

http://www.wired.com/magazine/2011/11/mf_artsy/all/1

Collaborative Filtering

[Figure: users u1 and u2 and the items they like; u1 likes Pulp Fiction]

• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques
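The memory-based variant can be illustrated with a small user-based recommender: compute the cosine similarity between the target user's ratings and everyone else's, then score unseen items by the similarity-weighted ratings of the nearest neighbors. This is a minimal sketch; the rating data, function names and neighborhood size are assumptions for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    shared = set(u) & set(v)
    dot = sum(u[i] * v[i] for i in shared)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

def recommend_user_based(ratings, target, n_neighbors=2):
    """Rank items the target has not rated by similarity-weighted neighbor ratings."""
    neighbors = sorted(
        ((cosine(ratings[target], r), other)
         for other, r in ratings.items() if other != target),
        reverse=True,
    )[:n_neighbors]
    scores = {}
    for sim, other in neighbors:
        for item, rating in ratings[other].items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical user-item matrix stored as nested dicts
ratings = {
    "u1": {"Pulp Fiction": 5, "Kill Bill": 4},
    "u2": {"Pulp Fiction": 5, "Kill Bill": 5, "Reservoir Dogs": 4},
    "u3": {"Titanic": 5, "Kill Bill": 1},
}
recs = recommend_user_based(ratings, "u1")
```

Because u2's ratings resemble u1's far more than u3's do, the item liked by u2 is ranked first.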

Memory vs Model-based

Memory-based:
• complete input data is required
• pre-computation not possible
• does not scale well ("tricks" are needed)
• high quality of recommendations

Model-based:
• abstraction (model) of input data
• pre-computation (partially) possible (model has to be re-built from time to time)
• scales better
• abstraction may reduce recommendation quality
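To make the contrast concrete, here is a hedged sketch of the model-based route using an item-item similarity model: the model is pre-computed offline from all ratings, and serving a recommendation afterwards only needs lookups in that model. The data, function names and the choice of cosine similarity are assumptions for the example.

```python
import math
from collections import defaultdict

def build_item_model(ratings):
    """Offline step: item-item cosine similarities from user ratings."""
    by_item = defaultdict(dict)  # item -> {user: rating}
    for user, items in ratings.items():
        for item, rating in items.items():
            by_item[item][user] = rating
    norms = {i: math.sqrt(sum(r * r for r in users.values()))
             for i, users in by_item.items()}
    model = defaultdict(dict)
    for a in by_item:
        for b in by_item:
            shared = set(by_item[a]) & set(by_item[b])
            if a == b or not shared:
                continue
            dot = sum(by_item[a][u] * by_item[b][u] for u in shared)
            model[a][b] = dot / (norms[a] * norms[b])
    return model

def recommend_item_based(model, user_ratings, top_n=3):
    """Online step: score unseen items via the pre-computed similarities."""
    scores = defaultdict(float)
    for item, rating in user_ratings.items():
        for other, sim in model.get(item, {}).items():
            if other not in user_ratings:
                scores[other] += sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical ratings (all positive, so no norm is zero)
ratings = {
    "u1": {"A": 5, "B": 4},
    "u2": {"A": 4, "B": 5, "C": 5},
    "u3": {"C": 4, "D": 5},
}
model = build_item_model(ratings)            # rebuilt from time to time
recs = recommend_item_based(model, {"A": 5})  # cheap at request time
```

The model is an abstraction of the input data: item D is never recommended to a fan of A because no user rated both, which is the kind of quality loss the slide refers to.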

Social Networks & Interest Similarity

• collaborative filtering: 'neighborhoods' of people with similar interests & recommending items based on likings in the neighborhood
• limitations: next to 'cold start' and 'sparsity', the lack of control (over one's neighborhood) is also a problem, i.e. users cannot add 'trusted' people nor exclude 'strange' ones
• therefore: interest in 'social recommenders', where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike)
  • does a social connection indicate user interest similarity?
  • how much does users' interest similarity depend on the strength of their connection?
  • is it feasible to use a social network as a personalized recommendation?

[Lin & Brusilovsky, Social Networks and Interest Similarity: The Case of CiteULike, HT'10]

Conclusions

• unilaterally connected pairs have more information items, metadata and tags in common than non-connected pairs
• the similarity was largest for direct connections and decreased as the distance between users in the social network increased
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation

Social recommendations on conflicting groups

Are all interests of our friends relevant? Is it application generic?

Content-based Recommendations

• Input: characteristics of items & interests of a user in characteristics of items => Recommend items that feature characteristics which meet the user's interests
• Techniques
  • Data mining methods: cluster items based on their characteristics => infer users' interests in clusters
  • IR methods: represent items & users as term vectors => compute similarity between the user profile vector and the items
  • Utility-based methods: a utility function gets an item as input; the parameters of the utility function are customized via the preferences of a user

Content Features

News item a: "Government stops renovation of tower bridge" (Oct 13th 2011)

[Figure: Twiper news page with the photo caption "Tower Bridge today: Under construction", the description "Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames", Category: politics, england, and related tweets by bob ("Why do they stop to… [more]") and mary ("London stops reno… [more]")]

Concept          a
dbPolitics       0.2
dbSports         0
dbEducation      0
dbLondon         0.2
dbTower_Bridge   0.4
dbGovernment     0.1
dbUK             0.1

Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)

User Model

User's Twitter history:
- "RT Government stops renovation of tower bridge" (Oct 13th 2011)
- "I am in London at the moment" (Oct 13th 2011)
- "I am doing sports" (Oct 12th 2011)

Concept          u
dbPolitics       0
dbSports         0.1
dbEducation      0
dbLondon         0.5
dbTower_Bridge   0.2
dbGovernment     0.2
dbUK             0

Weighting strategy:
- occurrence frequency (e.g. smoothed by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)

Recommendations

User u and candidate items a, b, c:

Concept          u     a     b     c
dbPolitics       0     0.2   0     0
dbSports         0.1   0     0     0.5
dbEducation      0     0     0     0.2
dbLondon         0.5   0.2   0.8   0
dbTower_Bridge   0.2   0.4   0.2   0
dbGovernment     0.2   0.1   0     0
dbUK             0     0.1   0     0.3

Cosine similarities:

     a      b      c
u    0.67   0.92   0.14

Ranking of recommended items: 1. b  2. a  3. c
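The similarity values on this slide can be reproduced with a few lines of Python. The sketch below takes the vectors for u, a, b and c as read from the slide; the assignment of each weight to a particular concept is an interpretation of the flattened slide layout, although the cosine values do not depend on it as long as all vectors use the same concept order. The helper names and the raw counts passed to `l1_normalize` are hypothetical.

```python
import math

# Assumed concept order for every vector below
CONCEPTS = ["dbPolitics", "dbSports", "dbEducation", "dbLondon",
            "dbTower_Bridge", "dbGovernment", "dbUK"]

def l1_normalize(counts):
    """Scale occurrence counts so that the vector sums to 1 (1-norm)."""
    total = sum(counts)
    return [c / total for c in counts] if total else list(counts)

def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]          # user model
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],  # Tower Bridge news item
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}

sims = {name: round(cosine(u, vec), 2) for name, vec in items.items()}
ranking = sorted(sims, key=sims.get, reverse=True)

# Hypothetical raw counts that would normalize to vector a
a_from_counts = l1_normalize([2, 0, 0, 2, 4, 1, 1])
```

Item b ranks first because it is concentrated on dbLondon, the user's strongest concept; item c shares only dbSports with the user and therefore ranks last.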

RecSys Problems

• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users
• Changing User Preferences: user interests may change over time
• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item
• Lack of Diversity / Overfitting: for many applications good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior etc.). When adapting too strongly to the preferences of a user, the user might see the same or similar recommendations again and again
• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user's behavior

What is true personalization?

Hands-on Teaser

• Build your own recommender system 101
• Recommend pages on Delicious
• Recommend pages to your Facebook friends

image source: http://www.flickr.com/photos/bionicteaching/1375254387


UM Basic Concepts

• User Profile: a data structure that represents a characterization of a user at a particular moment in time; it represents what, from a given (system) perspective, there is to know about a user. The data in the profile can be explicitly given by the user or derived by the system
• User Model: contains the definitions & rules for the interpretation of observations about the user and for the translation of that interpretation into the characteristics in a user profile; the user model is the recipe for obtaining and interpreting user profiles
• User Modeling: the process of representing the user
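The distinction between profile (data) and model (recipe) can be made concrete with a small sketch. Everything in it is a hypothetical illustration: the class and field names, the rule format and the observation labels are assumptions chosen for the example, not definitions from the lecture.

```python
from dataclasses import dataclass, field

@dataclass
class UserProfile:
    """Snapshot of what the system knows about a user at one moment in time."""
    user_id: str
    timestamp: str
    characteristics: dict = field(default_factory=dict)  # concept -> value

class UserModel:
    """The recipe: rules that translate observations into profile characteristics."""

    def __init__(self, rules):
        # rules: hypothetical mapping observation label -> (concept, weight)
        self.rules = rules

    def build_profile(self, user_id, timestamp, observations, explicit=None):
        # Start from data the user provided explicitly, then add derived data.
        profile = UserProfile(user_id, timestamp, dict(explicit or {}))
        for obs in observations:
            if obs in self.rules:  # observations without a rule are ignored
                concept, weight = self.rules[obs]
                current = profile.characteristics.get(concept, 0)
                profile.characteristics[concept] = current + weight
        return profile

# User modeling: applying the model to observations yields a profile.
model = UserModel({
    "clicked_tennis_article": ("Tennis", 1.0),
    "skipped_football_video": ("Football", -0.5),
})
profile = model.build_profile(
    "u1", "2012-02-27",
    ["clicked_tennis_article", "clicked_tennis_article", "unknown_event"],
    explicit={"language": "en"},
)
```

The resulting profile mixes an explicitly given characteristic (language) with one derived by the system (interest in Tennis).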

User Modeling Approaches

• Overlay User Modeling: describe user characteristics, e.g. "knowledge of a user" or "interests of a user", with respect to "ideal" characteristics
• Customizing: the user explicitly provides & adjusts elements of the user profile
• User model elicitation: ask & observe the user; learn & improve the user profile successively ("interactive user modeling")
• Stereotyping: stereotypical characteristics to describe a user
• User Relevance Modeling: learn/infer probabilities that a given item or concept is relevant for a user

Related scientific conference: http://umap2011.org
Related journal: http://umuai.org

Which approach best suits the conditions of applications?

httpfarm7staticflickrcom62406346803873_e756dd9bae_bjpg

Overlay User Models

• among the oldest user models
• used for modeling student knowledge
• the user is typically characterized in terms of domain concepts & hypotheses of the user's knowledge about these concepts in relation to an (ideal) expert's knowledge
• concept-value pairs
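An overlay model can be represented as concept-value pairs laid over an expert's knowledge of the same concepts. The sketch below is a minimal illustration with made-up concepts and mastery levels in [0, 1]; the names `EXPERT` and `knowledge_gaps` are hypothetical.

```python
# Expert ("ideal") knowledge: concept -> mastery level (hypothetical values)
EXPERT = {"precision": 1.0, "recall": 1.0, "cosine similarity": 1.0}

def knowledge_gaps(overlay, expert=EXPERT):
    """Return concepts ordered by how far the user is below the expert level.

    overlay: the user's concept-value pairs; missing concepts count as 0.
    """
    gaps = {c: level - overlay.get(c, 0.0) for c, level in expert.items()}
    return sorted((c for c in gaps if gaps[c] > 0), key=gaps.get, reverse=True)

# The student's overlay: hypotheses about what the student knows
student = {"precision": 0.9, "recall": 0.4}
todo = knowledge_gaps(student)
```

A tutoring application could use the resulting order to decide which concept to teach next.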

User Model Elicitation

• Ask the user explicitly => learn
  • NLP, intelligent dialogues
  • Bayesian networks, Hidden Markov models
• Observe the user => learn
  • Logs, machine learning
  • Clustering, classification, data mining
• Interactive user modeling: mixture of direct inputs of a user, observations and inferences

http://hunch.com


Page 8: Lecture 4: Social Web Personalization (2012)

User Modeling Approaches

bull Overlay User Modeling describe user characteristics eg ldquoknowledge of a userrdquo ldquointerests of a userrdquo with respect to ldquoidealrdquo characteristics

bull Customizing user explicitly provides amp adjusts elements of the user profile

bull User model elicitation ask amp observe the user learn amp improve user profile successively ldquointeractive user modelingrdquo

bull Stereotyping stereotypical characteristics to describe a user

bull User Relevance Modeling learninfer probabilities that a given item or concept is relevant for a user

Related scientific conference httpumap2011org Related journal httpumuaiorg

Monday February 27 12

Which approach suits

best the conditions of applications

httpfarm7staticflickrcom62406346803873_e756dd9bae_bjpg

Monday February 27 12

Overlay User Models

bull among the oldest user models

bull used for modeling student knowledge

bull the user is typically characterized in terms of domain concepts amp hypotheses of the userrsquos knowledge about these concepts in relation to an (ideal) expertrsquos knowledge

bull concept-value pairs

Monday February 27 12

User Model Elicitation

bull Ask the user explicitly learn

bull NLP intelligent dialogues

bull Bayesian networks Hidden Markov models

bull Observe the user learn

bull Logs machine learning

bull Clustering classification data mining

bull Interactive user modeling mixture of direct inputs of a user observations and inferences

Monday February 27 12

httphunchcomMonday February 27 12

Monday February 27 12

Stereotyping

bull set of characteristics (eg attribute-value pairs) that describe a group of users

bull user is not assigned to a single stereotype - user profile can feature characteristics of several different stereotypes

Monday February 27 12

Why are stereotypes

useful

httpfarm1staticflickrcom155413650229_31ef379b0b_bjpg

Monday February 27 12

Can we use Social Web

data for user

modeling

Monday February 27 12

Can we infer a Twitter-based user profile

User Modeling (4 building blocks)

Semantic Enrichment Linkage and Alignment

Personalized News Recommender

Profile

I want my

personalized news recommendations

Example from Abel et al (2011)Monday February 27 12

User Modeling Building Blocks

Profile concept weight

time

1 Which tweets of the user should be

analyzed

Morning Afternoon Night

1 Temporal Constraints

June 27 July 4 July 11

(b) temporal patterns

weekends start end

(a) time period

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

Francesca Schiavone won French Open fo2010

Francesca Schiavone

French Open

Francesca Schiavone French Open entity-based

Sport T

T topic-based

2 What type of concepts should represent ldquointerestsrdquo

fo2010

fo2010 hashtag-based

1 Temporal Constraints

time

June 27 July 4 July 11

Monday February 27 12

User Modeling Building Blocks

Profile: (concept, weight)

1. Temporal Constraints
2. Profile Type
3. Semantic Enrichment: Further enrich the semantics of tweets?

Example tweet: "Francesca Schiavone won http://bit.ly/2f4t7a"
  (a) tweet-based: Francesca Schiavone
  (b) further enrichment from the linked article ("Francesca wins French Open. Thirty in women's tennis is primordially old, an age when agility and desire recedes as the …"): Francesca Schiavone, French Open, Tennis

User Modeling Building Blocks

Profile: (concept, weight)

1. Temporal Constraints
2. Profile Type
3. Semantic Enrichment
4. Weighting Scheme: How to weight the concepts?

Concepts to weight: weight(Francesca Schiavone), weight(French Open), weight(Tennis) (timeline: June 27, July 4, July 11)

Candidate weighting schemes:
  • Concept frequency (TF) (example values shown on the slide: 4, 3, 6)
  • TFxIDF
  • Time-sensitive

Observations

• Profile characteristics
  • Semantic enrichment solves sparsity problems
  • Profiles change over time: fresh profiles better reflect current user demands
  • Temporal patterns: weekend profiles differ significantly from weekday profiles
• Impact on news recommendations
  • The more fine-grained the concepts, the better the recommendation performance: entity-based > topic-based > hashtag-based
  • Semantic enrichment improves recommendation quality
  • Time-sensitivity (adapting to trends) improves performance

User Modeling

It is not about putting everything in a user profile; it is about making the right choices.

User Adaptation

Knowing the user: this knowledge can be applied to adapt a system or interface to the user, to improve the system functionality and user experience.

User-Adaptive Systems

[Diagram: observations, data and information about user → user modeling → user profile → profile analysis → adaptation decisions]

A. Jameson. Adaptive interfaces and agents. The HCI handbook: fundamentals, evolving technologies and emerging applications, pp. 305–330, 2003.

Last.fm adapts to your music taste

[Diagram: history of songs (like, ban, pause, skip) → user modeling (infer current musical taste) → user profile (interests in genres, artists, tags) → compare profile with possible next songs to play → next song to be played]

Google Adaptive Search

http://www.google.com/goodtoknow

Issues in User-Adaptive Systems

• Overfitting, "bubble effects", loss of serendipity problem
  • systems may adapt too strongly to the interests/behavior
  • e.g. an adaptive radio station may always play the same or very similar songs
  • We search for the right balance between novelty and relevance for the user
• "Lost in Hyperspace" problem
  • when adapting the navigation, i.e. the links on which users can click to find/access information
  • e.g. re-ordering/hiding of menu items may lead to confusion

What is good user modeling & personalization?

http://www.flickr.com/photos/bellarosebyliz/4729613108

Success perspectives

• From the consumer perspective of an adaptive system: the adaptive system maximizes satisfaction of the user (hard to measure/obtain)
• From the provider perspective of an adaptive system: the adaptive system maximizes the profit (influence of UM & personalization may be hard to measure/obtain)

Evaluation Strategies

• User studies: clean-room study, ask/observe (selected) people whether you did a good job
• Log analysis: analyze (click) data and infer whether you did a good job, e.g. cross-validation by "leave-one-out"
• Evaluation of user modeling:
  • measure quality of profiles directly, e.g. measure overlap with existing (true) profiles, or let people judge the quality of the generated user profiles
  • measure quality of application that exploits the user profile, e.g. apply user modeling strategies in a recommender system

Possible Metrics

• The usual IR metrics:
  • Precision: fraction of retrieved items that are relevant
  • Recall: fraction of relevant items that have been retrieved
  • F-Measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings):
  • Mean Reciprocal Rank (MRR) of first relevant item
  • Success@k: probability that a relevant item occurs within the top k
  • If a true ranking is given: rank correlations
  • Precision@k, Recall@k & F-Measure@k
• Metrics for evaluating prediction of user preferences:
  • MAE = Mean Absolute Error
  • True/False Positives/Negatives

[Chart: performance of strategy X vs. baseline over several runs. Is strategy X better than the baseline?]
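The metrics listed on this slide are short enough to write out directly. The sketch below uses the standard textbook definitions; the function names and the convention of returning 0.0 for empty inputs are choices made for this example.

```python
def precision_recall_f1(retrieved, relevant):
    """Precision, recall and their harmonic mean (F-measure)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def reciprocal_rank(ranking, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is ranked."""
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(rankings, relevants):
    """MRR: average reciprocal rank over all users (or queries)."""
    pairs = list(zip(rankings, relevants))
    return sum(reciprocal_rank(r, rel) for r, rel in pairs) / len(pairs)


def success_at_k(rankings, relevants, k):
    """Success@k: share of users with a relevant item in their top k."""
    pairs = list(zip(rankings, relevants))
    hits = sum(1 for r, rel in pairs if any(item in rel for item in r[:k]))
    return hits / len(pairs)


def mean_absolute_error(predicted, actual):
    """MAE between predicted and actual preference values (e.g. ratings)."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```

For example, `precision_recall_f1(["a", "b", "c", "d"], ["a", "c", "e"])` finds two hits among four retrieved and three relevant items, giving precision 1/2 and recall 2/3.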

Example Evaluation

• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (here, different tag contexts) can be used for tag prediction/recommendation
• Steps:
  1. Gather a dataset of tag data, part of which can be used as input, and aim to test the recommendation on the remaining tag data
  2. Use the input data and calculate the predictions for the different strategies
  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
  4. Test the results for statistical significance using Student's t-test, relative to the baseline (e.g. existing approach, competitive approach)

[Rae et al., Improving Tag Recommendations Using Social Networks, RIAO'10]
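Steps 3 and 4 of this procedure can be sketched as below. This is an illustration under assumptions, not the code of Rae et al.: the slide only says "Student's t-test", so the paired variant (one score per user for each strategy) is assumed here, and only the t statistic is computed, to be compared against a t-distribution with n - 1 degrees of freedom.

```python
import math
from statistics import mean, stdev


def precision_at_k(ranking, relevant, k=5):
    """P@k: fraction of the top-k recommended tags that are relevant."""
    return sum(1 for item in ranking[:k] if item in relevant) / k


def average_precision(ranking, relevant):
    """Mean of the precision values at each rank where a relevant tag occurs."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(rankings, relevants):
    """MAP: average precision averaged over all users."""
    return mean(average_precision(r, rel) for r, rel in zip(rankings, relevants))


def paired_t_statistic(scores_x, scores_baseline):
    """t statistic of the per-user score differences (strategy X - baseline).

    Assumes at least two users and non-identical differences; otherwise
    the standard deviation is undefined or zero.
    """
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))
```

A large positive t value suggests that strategy X beats the baseline consistently across users rather than by chance on a few of them.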

Example Evaluation

• [Guy et al.] shows another example of a similar evaluation approach
• Here the strategies differ in the way people and tags are used: in these tag-based systems there are complex relationships between users, tags and items, and strategies aim to find the relevant aspects of these relationships for modeling and recommendation
• Their baseline is the 'most popular' tags strategy: the globally most popular tags are compared to the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and is able to outperform the easily available baseline

[Guy et al., Social Media Recommendation based on People and Tags, SIGIR'10]
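A 'most popular' baseline of the kind described here needs very little code, which is exactly why it makes a good yardstick. The sketch below assumes tag assignments are available as (user, item, tag) triples; that data layout and the function name are illustrative assumptions, not taken from Guy et al.

```python
from collections import Counter


def most_popular_tags(assignments, k=5):
    """Non-personalized baseline: the k globally most frequent tags.

    assignments is an iterable of (user, item, tag) triples; every user
    receives the same recommendation list.
    """
    counts = Counter(tag for _, _, tag in assignments)
    return [tag for tag, _ in counts.most_common(k)]


# Hypothetical tag assignments for illustration.
assignments = [
    ("u1", "i1", "web"),
    ("u2", "i1", "web"),
    ("u2", "i2", "python"),
    ("u3", "i3", "web"),
    ("u3", "i3", "python"),
    ("u1", "i2", "music"),
]
baseline = most_popular_tags(assignments, k=2)
```

A personalization strategy is then only worth its cost if its per-user rankings score better than this shared list on the chosen metric.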

User interactions (level & type) instead of general social context: better for recommendations

Does hybrid always work worse?

Recommendation Systems

Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)

It's often a ranking task

http://www.wired.com/magazine/2011/11/mf_artsy/all/1

Collaborative Filtering

[Diagram: users u1 and u2 connected by "likes" edges to items, e.g. u1 likes Pulp Fiction]

• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques
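The memory-based variant (user-item matrix, similarity between users, recommend items of similar users) can be sketched in a few lines. The ratings below are invented for illustration (only "Pulp Fiction" appears on the slide), and cosine similarity with a similarity-weighted sum of ratings is just one common choice among several.

```python
import math

# Hypothetical user-item matrix: ratings on a 1-5 scale.
ratings = {
    "u1": {"Pulp Fiction": 5, "Kill Bill": 4, "Amelie": 1},
    "u2": {"Pulp Fiction": 5, "Kill Bill": 5, "Jackie Brown": 4},
    "u3": {"Amelie": 5, "Chocolat": 4, "Pulp Fiction": 1},
}


def cosine(a, b):
    """Cosine similarity between two sparse rating dictionaries."""
    dot = sum(a[item] * b[item] for item in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(user, ratings, top_n=3):
    """Rank items the user has not rated by similarity-weighted ratings."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + similarity * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

Here `recommend("u1", ratings)` puts "Jackie Brown" first, because u2 rates the films u1 likes in a similar way, whereas u3's taste barely overlaps with u1's.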

Memory vs Model-based

Memory-based:
• complete input data is required
• pre-computation not possible
• does not scale well ("tricks" are needed)
• high quality of recommendations

Model-based:
• abstraction (model) of input data
• pre-computation (partially) possible (model has to be re-built from time to time)
• scales better
• abstraction may reduce recommendation quality

Social Networks & Interest Similarity

• collaborative filtering: 'neighborhoods' of people with similar interest & recommending items based on likings in neighborhood
• limitations: next to 'cold start' and 'sparsity', the lack of control (over one's neighborhood) is also a problem, i.e. cannot add 'trusted' people, nor exclude 'strange' ones
• therefore interest in 'social recommenders', where presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike):
  • does a social connection indicate user interest similarity?
  • how much does users' interest similarity depend on the strength of their connection?
  • is it feasible to use a social network as a personalized recommendation?

[Lin & Brusilovsky, Social Networks and Interest Similarity: The Case of CiteULike, HT'10]

Conclusions

• pairs unilaterally connected have more common information items, metadata and tags than non-connected pairs
• the similarity was largest for direct connections and decreased with increasing distance between users in the social network
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation

Social recommendations on conflicting groups

Are all interests of our friends relevant? Is it application generic?

Content-based Recommendations

• Input: characteristics of items & interests of a user in characteristics of items => recommend items that feature characteristics which meet the user's interests
• Techniques:
  • Data mining methods: cluster items based on their characteristics => infer users' interests in clusters
  • IR methods: represent items & users as term vectors => compute similarity between user profile vector and items
  • Utility-based methods: utility function that gets an item as input; the parameters of the utility function are customized via preferences of a user

Content Features

Example item a: news article "Government stops renovation of tower bridge" (Oct 13th 2011)
  Image caption: Tower Bridge today: Under construction
  Text: Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames
  Category: politics, england
  Related Twiper news: bob: "Why do they stop to… [more]"; mary: "London stops reno… [more]"

Concept vector a:

  dbPolitics       0.2
  dbSports         0
  dbEducation      0
  dbLondon         0.2
  dbTower_Bridge   0.4
  dbGovernment     0.1
  dbUK             0.1

Weighting strategy:
  - occurrence frequency
  - normalize vectors (1-norm: sum of vector equals 1)
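The weighting strategy on this slide (occurrence frequency followed by 1-norm normalization) can be sketched as follows. The concept order matches the slide; the raw occurrence counts are hypothetical and were chosen only so that the result reproduces vector a.

```python
from collections import Counter

CONCEPTS = ["dbPolitics", "dbSports", "dbEducation", "dbLondon",
            "dbTower_Bridge", "dbGovernment", "dbUK"]


def concept_vector(occurrences, concepts=CONCEPTS):
    """Count concept occurrences, then scale so the vector sums to 1."""
    counts = Counter(occurrences)
    total = sum(counts[c] for c in concepts)
    if total == 0:
        return [0.0] * len(concepts)
    return [counts[c] / total for c in concepts]


# Hypothetical concept occurrences detected in the example article.
occurrences_a = (["dbPolitics"] * 2 + ["dbLondon"] * 2
                 + ["dbTower_Bridge"] * 4 + ["dbGovernment"] + ["dbUK"])
a = concept_vector(occurrences_a)
```

Normalizing by the 1-norm makes long and short articles comparable, since each vector then describes the share of attention a concept receives rather than its raw count.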

User Model

User's Twitter history:
  • "RT Government stops renovation of tower bridge" (Oct 13th 2011)
  • "I am in London at the moment" (Oct 13th 2011)
  • "I am doing sports" (Oct 12th 2011)

Concept vector u:

  dbPolitics       0
  dbSports         0.1
  dbEducation      0
  dbLondon         0.5
  dbTower_Bridge   0.2
  dbGovernment     0.2
  dbUK             0

Weighting strategy:
  - occurrence frequency (e.g. smoothened by occurrence time: recent concepts are more important)
  - normalize vectors (1-norm: sum of vector equals 1)

Recommendations

User vector u and candidate item vectors a, b, c:

                   u     a     b     c
  dbPolitics       0     0.2   0     0
  dbSports         0.1   0     0     0.5
  dbEducation      0     0     0     0.2
  dbLondon         0.5   0.2   0.8   0
  dbTower_Bridge   0.2   0.4   0.2   0
  dbGovernment     0.2   0.1   0     0
  dbUK             0     0.1   0     0.3

Cosine similarities:

        a      b      c
  u     0.67   0.92   0.14

Ranking of recommended items: 1. b, 2. a, 3. c
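The cosine similarities and the ranking on this slide can be reproduced with a short script. The vectors are the ones from the slide, in the concept order dbPolitics, dbSports, dbEducation, dbLondon, dbTower_Bridge, dbGovernment, dbUK; the variable and function names are choices made for this sketch.

```python
import math

# User profile vector u and candidate item vectors a, b, c from the slide.
u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}


def cosine_similarity(x, y):
    """Dot product of x and y divided by the product of their 2-norms."""
    dot = sum(p * q for p, q in zip(x, y))
    norm_x = math.sqrt(sum(p * p for p in x))
    norm_y = math.sqrt(sum(q * q for q in y))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0


similarities = {name: cosine_similarity(u, vec) for name, vec in items.items()}
ranking = sorted(similarities, key=similarities.get, reverse=True)

for name in ranking:
    print(name, round(similarities[name], 2))
```

Item b ranks first because its weight is concentrated on dbLondon and dbTower_Bridge, the two concepts that also dominate the user profile; item c shares only dbSports with the user and ranks last.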

RecSys Problems

• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users
• Changing User Preferences: user interests may change over time
• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item
• Lack of Diversity / Overfitting: for many applications good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same/similar recommendations again and again
• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user's behavior

What is true personalization?

Hands-on Teaser

• Build your own recommender system 101
• Recommend pages on delicious
• Recommend pages to your Facebook friends

image source: http://www.flickr.com/photos/bionicteaching/1375254387



Page 10: Lecture 4: Social Web Personalization (2012)

Overlay User Models

bull among the oldest user models

bull used for modeling student knowledge

bull the user is typically characterized in terms of domain concepts amp hypotheses of the userrsquos knowledge about these concepts in relation to an (ideal) expertrsquos knowledge

bull concept-value pairs

Monday February 27 12

User Model Elicitation

bull Ask the user explicitly learn

bull NLP intelligent dialogues

bull Bayesian networks Hidden Markov models

bull Observe the user learn

bull Logs machine learning

bull Clustering classification data mining

bull Interactive user modeling mixture of direct inputs of a user observations and inferences

Monday February 27 12

httphunchcomMonday February 27 12

Monday February 27 12

Stereotyping

bull set of characteristics (eg attribute-value pairs) that describe a group of users

bull user is not assigned to a single stereotype - user profile can feature characteristics of several different stereotypes

Monday February 27 12

Why are stereotypes

useful

httpfarm1staticflickrcom155413650229_31ef379b0b_bjpg

Monday February 27 12

Can we use Social Web

data for user

modeling

Monday February 27 12

Can we infer a Twitter-based user profile

User Modeling (4 building blocks)

Semantic Enrichment Linkage and Alignment

Personalized News Recommender

Profile

I want my

personalized news recommendations

Example from Abel et al (2011)Monday February 27 12

User Modeling Building Blocks

Profile concept weight

time

1 Which tweets of the user should be

analyzed

Morning Afternoon Night

1 Temporal Constraints

June 27 July 4 July 11

(b) temporal patterns

weekends start end

(a) time period

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

Francesca Schiavone won French Open fo2010

Francesca Schiavone

French Open

Francesca Schiavone French Open entity-based

Sport T

T topic-based

2 What type of concepts should represent ldquointerestsrdquo

fo2010

fo2010 hashtag-based

1 Temporal Constraints

time

June 27 July 4 July 11

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

Francesca Schiavone won httpbitly2f4t7a

Francesca Schiavone

3 Further enrich the semantics of tweets

1 Temporal Constraints

3 Semantic Enrichment

Francesca Schiavone

Francesca wins French Open Thirty in womens tennis is primordially old an age when agility and desire recedes as the hellip

French Open

Tennis

French Open

Tennis

(b) further enrichment

(a) tweet-based

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

4 How to weight the concepts

1 Temporal Constraints

3 Semantic Enrichment

Francesca Schiavone

French Open

Tennis

4 Weighting Scheme

time

June 27 July 4 July 11

weight(Francesca Schiavone)

Concept frequency (TF)

4

3 6

TFxIDF Time-sensitive

weight(French Open)

weight(Tennis)

Monday February 27 12

Observationsbull Profile characteristics

bull Semantic enrichment solves sparsity problems

bull Profiles change over time fresh profiles reflect better current user demands

bull Temporal patterns weekend profiles differ significantly from weekday profiles

bull Impact on news recommendations

bull The more fine-grained the concepts the better the recommendation performance entity-based gt topic-based gt hashtag-based

bull Semantic enrichment improves recommendation quality

bull Time-sensitivity (adapting to trends) improves performance

Monday February 27 12

User Modeling

It is not about putting everything in a user profile; it is about making the right choices.

User Adaptation

Knowledge about the user can be applied to adapt a system or interface to the user, improving the system functionality and user experience.

User-Adaptive Systems

[Diagram: observations, data and information about user → user modeling → user profile → profile analysis → adaptation decisions]

A. Jameson. Adaptive interfaces and agents. The HCI handbook: fundamentals, evolving technologies and emerging applications, pp. 305–330, 2003.

Last.fm adapts to your music taste

[Diagram: history of songs (like, ban, pause, skip) → user modeling (infer current musical taste) → user profile (interests in genres, artists, tags) → compare profile with possible next songs to play → next song to be played]

Google Adaptive Search

http://www.google.com/goodtoknow

Issues in User-Adaptive Systems

• Overfitting, "bubble effects", loss of serendipity problem
  • systems may adapt too strongly to the interests/behavior
  • e.g. an adaptive radio station may always play the same or very similar songs
  • We search for the right balance between novelty and relevance for the user
• "Lost in Hyperspace" problem
  • when adapting the navigation – i.e. the links on which users can click to find/access information
  • e.g. re-ordering/hiding of menu items may lead to confusion

What is good user modeling & personalization?

image source: http://www.flickr.com/photos/bellarosebyliz/4729613108

Success perspectives

• From the consumer perspective of an adaptive system: the adaptive system maximizes satisfaction of the user (hard to measure/obtain)
• From the provider perspective of an adaptive system: the adaptive system maximizes the profit (influence of UM & personalization may be hard to measure/obtain)

Evaluation Strategies

• User studies: clean-room study; ask/observe (selected) people whether you did a good job
• Log analysis: analyze (click) data and infer whether you did a good job, e.g. cross-validation by "leave-one-out"
• Evaluation of user modeling:
  • measure quality of profiles directly, e.g. measure overlap with existing (true) profiles, or let people judge the quality of the generated user profiles
  • measure quality of application that exploits the user profile, e.g. apply user modeling strategies in a recommender system
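The "leave-one-out" log analysis mentioned above can be sketched as follows. This is only one possible reading: each logged (user, item) pair is hidden in turn and the recommender is asked to recover it. The toy `most_popular` recommender and the log format (user mapped to a set of items) are assumptions made for this example.

```python
def leave_one_out(user_items, recommend, k=5):
    """Hide one logged item at a time and check whether the recommender,
    given the remaining data, ranks the hidden item within its top k."""
    hits = trials = 0
    for user, items in user_items.items():
        for held_out in items:
            # Copy the log so the original data is never modified.
            training = {u: set(i) for u, i in user_items.items()}
            training[user].discard(held_out)
            top_k = recommend(training, user)[:k]
            hits += held_out in top_k
            trials += 1
    return hits / trials if trials else 0.0

def most_popular(training, user):
    """Toy recommender: rank items the user has not seen by global popularity."""
    counts = {}
    for items in training.values():
        for item in items:
            counts[item] = counts.get(item, 0) + 1
    unseen = [i for i in counts if i not in training[user]]
    return sorted(unseen, key=lambda i: (-counts[i], i))

# Hypothetical click log: user -> set of clicked items.
logs = {"ann": {"a", "b"}, "bob": {"a", "b", "c"}, "eve": {"a"}}
hit_rate = leave_one_out(logs, most_popular, k=1)
```

Any recommender with the same `(training, user) -> ranked list` signature can be plugged in, so a personalized strategy and a popularity baseline can be compared on the same log.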

Possible Metrics

• The usual IR metrics
  • Precision: fraction of retrieved items that are relevant
  • Recall: fraction of relevant items that have been retrieved
  • F-Measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings)
  • Mean Reciprocal Rank (MRR) of first relevant item
  • Success@k: probability that a relevant item occurs within the top k
  • If a true ranking is given: rank correlations
  • Precision@k, Recall@k & F-Measure@k
• Metrics for evaluating prediction of user preferences
  • MAE = Mean Absolute Error
  • True/False Positives/Negatives

[Figure: performance of strategy X and of the baseline over several runs] Is strategy X better than the baseline?
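The metrics listed above follow directly from their definitions. The Python sketch below is one straightforward way to compute them; the function names and the input conventions (a ranking is a list, the relevant items a set) are choices made for this example.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall and their harmonic mean (F-measure)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    p = hits / len(retrieved) if retrieved else 0.0
    r = hits / len(relevant) if relevant else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranking[:k] if item in relevant) / k

def reciprocal_rank(ranking, relevant):
    """1 / position of the first relevant item, or 0 if none is ranked."""
    for pos, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / pos
    return 0.0

def mean_reciprocal_rank(rankings, relevants):
    """Average reciprocal rank over several users or queries."""
    rr = [reciprocal_rank(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(rr) / len(rr)

def success_at_k(rankings, relevants, k):
    """Share of rankings with at least one relevant item in the top k."""
    hits = [any(i in rel for i in r[:k]) for r, rel in zip(rankings, relevants)]
    return sum(hits) / len(hits)

def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and true ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```

For example, retrieving four items of which two are among three relevant ones gives precision 1/2, recall 2/3 and F-measure 4/7.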

Example Evaluation

• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation
• Steps:
  1. Gather a dataset of tag data, part of which can be used as input, and aim to test the recommendation on the remaining tag data
  2. Use the input data and calculate the predictions for the different strategies
  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
  4. Test the results for statistical significance using Student's t-test, relative to the baseline (e.g. existing approach, competitive approach)

[Rae et al. Improving Tag Recommendations Using Social Networks. RIAO'10]
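Steps 3 and 4 can be sketched in Python as follows. The sketch computes average precision for one ranking (MAP is its mean over all test users) and the t statistic of a paired t-test over per-user scores. Using the paired variant is an assumption of this example; the slide only says "Student's t-test". The score lists are made-up numbers, not results from Rae et al.

```python
import math
import statistics

def average_precision(ranking, relevant):
    """Mean of the precision values at each position holding a relevant item."""
    hits, total = 0, 0.0
    for pos, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            total += hits / pos
    return total / len(relevant) if relevant else 0.0

def paired_t_statistic(scores_x, scores_baseline):
    """t statistic and degrees of freedom for paired per-user scores.

    Assumes the differences are not all identical (non-zero deviation).
    """
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    n = len(diffs)
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return mean / (sd / math.sqrt(n)), n - 1

# Hypothetical per-user P@5 scores for strategy X and the baseline.
strategy_x = [0.5, 0.6, 0.7, 0.8]
baseline = [0.4, 0.4, 0.6, 0.5]
t, df = paired_t_statistic(strategy_x, baseline)
```

The resulting t value still has to be compared with the critical value of the t distribution for `df` degrees of freedom (from a table or a statistics library) to decide whether the difference is significant.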

Example Evaluation

• [Guy et al.] shows another example of a similar evaluation approach
• Here the strategies differ in the way people and tags are used: in these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the relevant aspects of these relationships for modeling and recommendation
• Their baseline is the 'most popular' tags strategy: the globally most popular tags are often compared to the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and is able to outperform the easily available baseline

[Guy et al. Social Media Recommendation based on People and Tags. SIGIR'10]

user interactions (level & type) instead of general social context - better for recommendations

does hybrid always work worse?

Recommendation Systems

Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context).

It's often a ranking task.

http://www.wired.com/magazine/2011/11/mf_artsy/all/1

Collaborative Filtering

[Diagram: users u1 and u2 connected by "likes" edges to items, including Pulp Fiction]

• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques
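The memory-based variant can be sketched in a few lines of Python: compute the similarity between users from the user-item matrix, then score unseen items by the ratings of similar users. The ratings, the movie titles other than Pulp Fiction, and the choice of cosine similarity with a similarity-weighted sum are assumptions made for this example.

```python
import math

# Hypothetical user-item matrix: user -> {item: rating}.
ratings = {
    "u1": {"Pulp Fiction": 5, "Amelie": 3},
    "u2": {"Pulp Fiction": 5, "Amelie": 3, "Kill Bill": 5},
    "u3": {"Pulp Fiction": 1, "Titanic": 5},
}

def cosine(a, b):
    """Cosine similarity of two sparse rating vectors (missing rating = 0)."""
    dot = sum(a[i] * b[i] for i in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(user, ratings, n=3):
    """Score items the user has not rated by similarity-weighted ratings
    of all other users, and return the n best."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=lambda i: (-scores[i], i))[:n]
```

Here u2 rates like u1, so u2's additional item ranks first for u1, ahead of the item liked by the dissimilar user u3.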

Memory vs. Model-based

Memory-based:
• complete input data is required
• pre-computation not possible
• does not scale well ("tricks" are needed)
• high quality of recommendations

Model-based:
• abstraction (model) of input data
• pre-computation (partially) possible (model has to be re-built from time to time)
• scales better
• abstraction may reduce recommendation quality
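To illustrate pre-computation in a model-based approach, the Python sketch below estimates P(u likes item B | u likes item A) from co-occurrence counts and stores the result as an item-item model. This is a plain frequency estimate, not a full Bayesian network, and the data and function names are assumptions made for this example.

```python
from collections import defaultdict
from itertools import permutations

def build_model(likes):
    """Pre-compute P(likes B | likes A) for every ordered pair of items
    that at least one user likes together."""
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in likes.values():
        for a in items:
            item_count[a] += 1
        for a, b in permutations(items, 2):
            pair_count[(a, b)] += 1
    return {(a, b): n / item_count[a] for (a, b), n in pair_count.items()}

def recommend(model, liked, n=3):
    """Rank unseen items by the highest P(item | some liked item)."""
    scores = defaultdict(float)
    for (a, b), p in model.items():
        if a in liked and b not in liked:
            scores[b] = max(scores[b], p)
    return sorted(scores, key=lambda i: (-scores[i], i))[:n]

# Hypothetical preference data: user -> set of liked items.
likes = {"u1": {"A", "B"}, "u2": {"A", "B", "C"}, "u3": {"A"}}
model = build_model(likes)  # has to be re-built when new data arrives
```

Recommending then only needs the model, not the raw user data, which is why this style scales better but has to be refreshed from time to time.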

Social Networks & Interest Similarity

• collaborative filtering: 'neighborhoods' of people with similar interests & recommending items based on likings in the neighborhood
• limitations: next to 'cold start' and 'sparsity', the lack of control (over one's neighborhood) is also a problem, i.e. one cannot add 'trusted' people nor exclude 'strange' ones
• therefore: interest in 'social recommenders', where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike)
  • does a social connection indicate user interest similarity?
  • how much does users' interest similarity depend on the strength of their connection?
  • is it feasible to use a social network as a personalized recommendation?

[Lin & Brusilovsky. Social Networks and Interest Similarity: The Case of CiteULike. HT'10]

Conclusions

• pairs unilaterally connected have more common information items, metadata and tags than non-connected pairs
• the similarity was largest for direct connections and decreased with increasing distance between users in the social network
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation

Social recommendations on conflicting groups

Are all interests of our friends relevant? Is it application generic?

Content-based Recommendations

• Input: characteristics of items & interests of a user in characteristics of items => recommend items that feature characteristics which meet the user's interests
• Techniques:
  • Data mining methods: cluster items based on their characteristics => infer users' interests in clusters
  • IR methods: represent items & users as term vectors => compute similarity between user profile vector and items
  • Utility-based methods: utility function that gets an item as input; the parameters of the utility function are customized via preferences of a user

Content Features

Example item a: "Government stops renovation of tower bridge" (Oct 13th 2011)
Tower Bridge today: Under construction
Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames.
Category: politics, england
Related Twiper news: bob: "Why do they stop to…" [more]; mary: "London stops reno…" [more]

Vector a:
concept          weight
db:Politics      0.2
db:Sports        0
db:Education     0
db:London        0.2
db:Tower_Bridge  0.4
db:Government    0.1
db:UK            0.1

Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)
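The weighting strategy (occurrence frequency followed by 1-norm normalization) can be sketched as follows. The raw counts are hypothetical: they are chosen so that normalization reproduces the weights of item a, since the slide itself only shows the normalized vector.

```python
def l1_normalize(counts):
    """Divide every count by the total so that the weights sum to 1."""
    total = sum(counts.values())
    return {c: (n / total if total else 0.0) for c, n in counts.items()}

# Hypothetical occurrence counts of each concept in the news item.
counts_a = {
    "db:Politics": 2,
    "db:Sports": 0,
    "db:Education": 0,
    "db:London": 2,
    "db:Tower_Bridge": 4,
    "db:Government": 1,
    "db:UK": 1,
}
a = l1_normalize(counts_a)
```

Normalizing makes items with many and with few concept occurrences comparable, because only the relative share of each concept remains.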

User Model

User's Twitter history:
• "RT Government stops renovation of tower bridge" (Oct 13th 2011)
• "I am in London at the moment" (Oct 13th 2011)
• "I am doing sports" (Oct 12th 2011)

Vector u:
concept          weight
db:Politics      0
db:Sports        0.1
db:Education     0
db:London        0.5
db:Tower_Bridge  0.2
db:Government    0.2
db:UK            0

Weighting strategy:
- occurrence frequency (e.g. smoothened by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)

Recommendations

User u and candidate items a, b, c:

concept          u     a     b     c
db:Politics      0     0.2   0     0
db:Sports        0.1   0     0     0.5
db:Education     0     0     0     0.2
db:London        0.5   0.2   0.8   0
db:Tower_Bridge  0.2   0.4   0.2   0
db:Government    0.2   0.1   0     0
db:UK            0     0.1   0     0.3

Cosine similarities with u: a = 0.67, b = 0.92, c = 0.14

Ranking of recommended items: 1. b, 2. a, 3. c
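The similarities on this slide can be reproduced with a few lines of Python, reading the slide's numbers as decimals (0.2, 0.4, ...) over the seven concepts in the order Politics, Sports, Education, London, Tower_Bridge, Government, UK. The variable and function names are choices made for this example.

```python
import math

# Concept order: Politics, Sports, Education, London,
# Tower_Bridge, Government, UK (as on the slide).
u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}

def cosine(x, y):
    """Cosine similarity: dot product divided by the product of the lengths."""
    dot = sum(p * q for p, q in zip(x, y))
    norm = math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in y))
    return dot / norm if norm else 0.0

similarities = {name: round(cosine(u, vec), 2) for name, vec in items.items()}
ranking = sorted(similarities, key=similarities.get, reverse=True)
```

Rounded to two decimals this gives 0.67 for a, 0.92 for b and 0.14 for c, and therefore the ranking b, a, c shown on the slide.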

RecSys Problems

• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users
• Changing User Preferences: user interests may change over time
• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item
• Lack of Diversity / Overfitting: for many applications, good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same or similar recommendations again and again
• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user's behavior
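One common way to trade relevance against diversity is a greedy re-ranking in the style of maximal marginal relevance. The lecture does not name this technique; it is used here only to make the balance concrete. The `trade_off` parameter, the song data and the genre-based similarity are assumptions made for this example.

```python
def rerank(candidates, relevance, similarity, k, trade_off=0.7):
    """Greedily pick k items, rewarding relevance and penalizing
    similarity to the items that were already picked."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(item):
            redundancy = max((similarity(item, s) for s in selected), default=0.0)
            return trade_off * relevance[item] - (1 - trade_off) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical candidates for an adaptive radio station.
relevance = {"song1": 0.9, "song1_remix": 0.85, "jazz_tune": 0.6}
genre = {"song1": "rock", "song1_remix": "rock", "jazz_tune": "jazz"}

def same_genre(x, y):
    """Toy similarity: 1.0 for items of the same genre, otherwise 0.0."""
    return 1.0 if genre[x] == genre[y] else 0.0

diverse = rerank(relevance, relevance, same_genre, k=3, trade_off=0.7)
by_relevance = rerank(relevance, relevance, same_genre, k=3, trade_off=1.0)
```

With `trade_off=1.0` the list is ordered purely by relevance; lowering it moves the less similar jazz item ahead of the near-duplicate remix.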

What is true personalization?

Hands-on Teaser

• Build your own recommender system 101
• Recommend pages on delicious
• Recommend pages to your Facebook friends

image source: http://www.flickr.com/photos/bionicteaching/1375254387


User Model Elicitation

• Ask the user explicitly => learn
  • NLP, intelligent dialogues
  • Bayesian networks, Hidden Markov models
• Observe the user => learn
  • Logs, machine learning
  • Clustering, classification, data mining
• Interactive user modeling: mixture of direct inputs of a user, observations and inferences

http://hunch.com

Stereotyping

• set of characteristics (e.g. attribute-value pairs) that describe a group of users
• user is not assigned to a single stereotype: a user profile can feature characteristics of several different stereotypes
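A stereotype as a set of attribute-value pairs, and a user profile that combines several of them, can be sketched as follows. The stereotype names, attributes and values are invented for illustration.

```python
# Hypothetical stereotypes: name -> attribute-value pairs describing a group.
STEREOTYPES = {
    "student": {"budget": "low", "interest": "lecture notes"},
    "sports_fan": {"interest": "tennis"},
}

def build_profile(stereotype_names, stereotypes=STEREOTYPES):
    """Merge the characteristics of all stereotypes a user matches.

    Each attribute keeps a set of values, so characteristics of
    different stereotypes can coexist in one profile.
    """
    profile = {}
    for name in stereotype_names:
        for attribute, value in stereotypes[name].items():
            profile.setdefault(attribute, set()).add(value)
    return profile

profile = build_profile(["student", "sports_fan"])
```

Keeping a set per attribute reflects the second bullet: the user is not forced into one stereotype, and overlapping attributes such as "interest" accumulate values.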

Why are stereotypes useful?

http://farm1.static.flickr.com/155/413650229_31ef379b0b_b.jpg

Can we use Social Web data for user modeling?

Can we infer a Twitter-based user profile?

[Diagram: Semantic Enrichment, Linkage and Alignment → User Modeling (4 building blocks) → Profile → Personalized News Recommender; user: "I want my personalized news recommendations"]

Example from Abel et al. (2011)

User Modeling Building Blocks

Profile: concept, weight

1. Temporal Constraints: Which tweets of the user should be analyzed?
(a) time period: start, end (timeline: June 27, July 4, July 11)
(b) temporal patterns: weekends; Morning, Afternoon, Night


Page 12: Lecture 4: Social Web Personalization (2012)

httphunchcomMonday February 27 12

Monday February 27 12

Stereotyping

bull set of characteristics (eg attribute-value pairs) that describe a group of users

bull user is not assigned to a single stereotype - user profile can feature characteristics of several different stereotypes

Monday February 27 12

Why are stereotypes

useful

httpfarm1staticflickrcom155413650229_31ef379b0b_bjpg

Monday February 27 12

Can we use Social Web

data for user

modeling

Monday February 27 12

Can we infer a Twitter-based user profile

User Modeling (4 building blocks)

Semantic Enrichment Linkage and Alignment

Personalized News Recommender

Profile

I want my

personalized news recommendations

Example from Abel et al (2011)Monday February 27 12

User Modeling Building Blocks

Profile concept weight

time

1 Which tweets of the user should be

analyzed

Morning Afternoon Night

1 Temporal Constraints

June 27 July 4 July 11

(b) temporal patterns

weekends start end

(a) time period

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

Francesca Schiavone won French Open fo2010

Francesca Schiavone

French Open

Francesca Schiavone French Open entity-based

Sport T

T topic-based

2 What type of concepts should represent ldquointerestsrdquo

fo2010

fo2010 hashtag-based

1 Temporal Constraints

time

June 27 July 4 July 11

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

Francesca Schiavone won httpbitly2f4t7a

Francesca Schiavone

3 Further enrich the semantics of tweets

1 Temporal Constraints

3 Semantic Enrichment

Francesca Schiavone

Francesca wins French Open Thirty in womens tennis is primordially old an age when agility and desire recedes as the hellip

French Open

Tennis

French Open

Tennis

(b) further enrichment

(a) tweet-based

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

4 How to weight the concepts

1 Temporal Constraints

3 Semantic Enrichment

Francesca Schiavone

French Open

Tennis

4 Weighting Scheme

time

June 27 July 4 July 11

weight(Francesca Schiavone)

Concept frequency (TF)

4

3 6

TFxIDF Time-sensitive

weight(French Open)

weight(Tennis)

Monday February 27 12

Observationsbull Profile characteristics

bull Semantic enrichment solves sparsity problems

bull Profiles change over time fresh profiles reflect better current user demands

bull Temporal patterns weekend profiles differ significantly from weekday profiles

bull Impact on news recommendations

bull The more fine-grained the concepts the better the recommendation performance entity-based gt topic-based gt hashtag-based

bull Semantic enrichment improves recommendation quality

bull Time-sensitivity (adapting to trends) improves performance

Monday February 27 12

User Modelingit is not about putting everything in a user profile

it is about making the right choices

Monday February 27 12

User AdaptationKnowing the user - this knowledge - can be applied to adapt

a system or interface to the user to improve the system functionality and user experience

Monday February 27 12

Monday February 27 12

user modeling

user profile

observations data and information about user

profile analysis

adaptation decisions

User-Adaptive Systems

A Jameson Adaptive interfaces and agents The HCI handbook fundamentals evolving technologies and emerging applications pp 305ndash330 2003

Monday February 27 12

user modeling (infer current musical taste)

user profile interests in

genres artists tags

history of songs like ban pause skip

compare profile with possible next

songs to play

next song to be played

Lastfm adapts to your music taste

Monday February 27 12

Google Adaptive Search

httpwwwgooglecomgoodtoknow

Monday February 27 12

Issues in User-Adaptive Systems

bull Overfitting ldquobubble effectsrdquo loss of serendipity problem

bull systems may adapt too strongly to the interestsbehavior

bull eg an adaptive radio station may always play the same or very similar songs

bull We search for the right balance between novelty and relevance for the user

bull ldquoLost in Hyperspacerdquo problem

bull when adapting the navigation ndash ie the links on which users can click to findaccess information

bull eg re-orderinghiding of menu items may lead to confusion

Monday February 27 12

httpwwwflickrcomphotosbellarosebyliz4729613108

What is good user modeling amp personalization

Monday February 27 12

Success perspectives

bull From the consumer perspective of an adaptive system

bull From the provider perspective of an adaptive system

Adaptive system maximizes satisfaction of the user

hard to measureobtain

Adaptive system maximizes the profit

influence of UM amp personalization may be hard to measureobtain

Monday February 27 12

Evaluation Strategiesbull User studies Clean-room study askobserve (selected) people

whether you did a good job

bull Log analysis Analyze (click) data and infer whether you did a good job eg cross-validation by ldquoLeave-one-outrdquo

bull Evaluation of user modeling

bull measure quality of profiles directly eg measure overlap with existing (true) profiles or let people judge the quality of the generated user profiles

bull measure quality of application that exploits the user profile eg apply user modeling strategies in a recommender system

Monday February 27 12

Possible Metrics

• The usual IR metrics:
  • Precision: fraction of retrieved items that are relevant
  • Recall: fraction of relevant items that have been retrieved
  • F-Measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings):
  • Mean Reciprocal Rank (MRR) of first relevant item
  • Success@k: probability that a relevant item occurs within the top k
  • If a true ranking is given: rank correlations
  • Precision@k, Recall@k & F-Measure@k
• Metrics for evaluating prediction of user preferences:
  • MAE = Mean Absolute Error
  • True/False Positives/Negatives

[Figure: performance of strategy X vs. baseline over several runs. Is strategy X better than the baseline?]
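The metrics listed above are short enough to write down directly. The following is a minimal sketch; the function names are chosen here for illustration and are not taken from the lecture or from any particular library.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall and their harmonic mean (F-measure)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(item in relevant for item in ranking[:k]) / k

def reciprocal_rank(ranking, relevant):
    """1 / rank of the first relevant item, or 0 if none is found."""
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(rankings, relevants):
    """Average reciprocal rank over all users/queries."""
    pairs = list(zip(rankings, relevants))
    return sum(reciprocal_rank(r, rel) for r, rel in pairs) / len(pairs)

def success_at_k(rankings, relevants, k):
    """Share of rankings with at least one relevant item in the top k."""
    pairs = list(zip(rankings, relevants))
    return sum(any(i in rel for i in r[:k]) for r, rel in pairs) / len(pairs)

def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and true ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)
```

Ranking metrics take one ranked list per user (or query) together with that user's set of relevant items; MAE takes two equally long lists of predicted and observed preference values.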

Example Evaluation

• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation
• Steps:
  1. Gather a dataset of tag data, part of which can be used as input, and aim to test the recommendation on the remaining tag data
  2. Use the input data and calculate the predictions for the different strategies
  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
  4. Test the results for statistical significance using Student's t-test, relative to the baseline (e.g. existing approach, competitive approach)

[Rae et al.: Improving Tag Recommendations Using Social Networks, RIAO'10]
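Steps 3 and 4 can be sketched in a few lines. The code below is an illustration, not the procedure of Rae et al.: it computes average precision per test case and the paired t statistic of strategy X against the baseline. All scores are hypothetical. Turning the statistic into a p-value additionally needs the t distribution with n-1 degrees of freedom, e.g. from a table or from SciPy's `scipy.stats.ttest_rel`.

```python
import math
from statistics import mean, stdev

def average_precision(ranking, relevant):
    """Mean of precision@rank over the ranks at which a relevant item occurs."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(rankings, relevants):
    """MAP: average precision, averaged over all test cases."""
    pairs = list(zip(rankings, relevants))
    return sum(average_precision(r, rel) for r, rel in pairs) / len(pairs)

def paired_t_statistic(scores_x, scores_baseline):
    """t statistic of the per-test-case differences (paired Student's t-test).

    Undefined (division by zero) if all differences are identical.
    """
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Hypothetical per-user P@5 scores of strategy X and of the baseline.
strategy_x = [0.6, 0.7, 0.8, 0.9]
baseline = [0.5, 0.5, 0.6, 0.8]
print(paired_t_statistic(strategy_x, baseline))
```

The test is paired because both strategies are scored on the same test cases, so only the per-case differences matter.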

Example Evaluation

• [Guy et al.] shows another example of a similar evaluation approach
• Here, the strategies differ in the way people and tags are used. In these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the relevant aspects of these relationships for modeling and recommendation
• Here, the baseline is the strategy of the 'most popular' tags. This baseline is often used: the globally most popular tags are compared to the tags predicted by a particular personalization strategy, thus investigating whether the personalization is worth the effort and is able to outperform the easily available baseline

[Guy et al.: Social Media Recommendation based on People and Tags, SIGIR'10]
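A 'most popular' baseline is simple to build, which is why it makes a useful reference point. The sketch below assumes tag assignments are available as (user, tag) pairs; the data and the "personalized" list are made up for illustration and do not come from Guy et al.

```python
from collections import Counter

def most_popular_tags(assignments, k):
    """Baseline: the k globally most frequent tags, identical for every user."""
    counts = Counter(tag for _user, tag in assignments)
    return sorted(counts, key=lambda t: (-counts[t], t))[:k]

def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k recommended tags that the user really used."""
    return sum(tag in relevant for tag in ranking[:k]) / k

# Hypothetical training assignments and held-out tags of one test user.
train = [("u1", "web"), ("u2", "web"), ("u1", "python"),
         ("u3", "semantic"), ("u2", "python"), ("u3", "web")]
held_out = {"python", "rdf"}

baseline = most_popular_tags(train, k=2)
personalized = ["python", "rdf"]  # output of some personalization strategy (assumed)
print(precision_at_k(baseline, held_out, 2), precision_at_k(personalized, held_out, 2))
```

Personalization is only "worth the effort" if its score is reliably higher than that of the baseline over many test users.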

user interactions (level & type) instead of general social context - better for recommendations

does hybrid always work worse?

Recommendation Systems

Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)

it's often a ranking task

http://www.wired.com/magazine/2011/11/mf_artsy/all/1

Collaborative Filtering

[Diagram: users u1 and u2 connected to items by "likes" edges; u1 likes Pulp Fiction]

• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques
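The memory-based variant can be sketched as follows. This is one common formulation (cosine similarity between users, similarity-weighted average of neighbour ratings), chosen here as an assumption; the lecture does not prescribe a specific similarity measure. The rating data are hypothetical.

```python
import math

def cosine(r1, r2):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    common = set(r1) & set(r2)
    dot = sum(r1[i] * r2[i] for i in common)
    norm1 = math.sqrt(sum(v * v for v in r1.values()))
    norm2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0

def recommend_user_based(ratings, user, k=3):
    """Predict ratings of unseen items from the ratings of similar users."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue  # users without overlap contribute nothing
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted, key=lambda i: (-predicted[i], i))[:k]

# Hypothetical User-Item matrix, stored sparsely.
ratings = {
    "u1": {"Kill Bill": 5, "Reservoir Dogs": 4},
    "u2": {"Kill Bill": 5, "Reservoir Dogs": 5, "Pulp Fiction": 5},
    "u3": {"Kill Bill": 1, "Titanic": 3},
}
print(recommend_user_based(ratings, "u1"))
```

Note that every recommendation request scans the complete rating data, which is exactly the scaling concern raised for memory-based approaches.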

Memory vs Model-based

Memory-based:
• complete input data is required
• pre-computation not possible
• does not scale well ("tricks" are needed)
• high quality of recommendations

Model-based:
• abstraction (model) of input data
• pre-computation (partially) possible (model has to be re-built from time to time)
• scales better
• abstraction may reduce recommendation quality
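The difference in pre-computation can be made concrete with an item-item model. In this sketch (an illustration under assumptions, with cosine similarity and hypothetical ratings) the expensive similarity computation happens once, offline; answering a request afterwards only needs lookups in the model, not the raw ratings.

```python
import math
from collections import defaultdict

def build_item_model(ratings):
    """Offline step: precompute item-item cosine similarities from user ratings."""
    by_item = defaultdict(dict)
    for user, user_ratings in ratings.items():
        for item, rating in user_ratings.items():
            by_item[item][user] = rating
    model = defaultdict(dict)
    items = sorted(by_item)
    for a in items:
        for b in items:
            if a == b:
                continue
            common = set(by_item[a]) & set(by_item[b])
            dot = sum(by_item[a][u] * by_item[b][u] for u in common)
            norm_a = math.sqrt(sum(v * v for v in by_item[a].values()))
            norm_b = math.sqrt(sum(v * v for v in by_item[b].values()))
            if dot > 0:
                model[a][b] = dot / (norm_a * norm_b)
    return model

def recommend_item_based(model, liked_items, k=3):
    """Online step: sum the precomputed similarities to the items the user likes."""
    scores = defaultdict(float)
    for liked in liked_items:
        for other, sim in model.get(liked, {}).items():
            if other not in liked_items:
                scores[other] += sim
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]

# Hypothetical ratings; the model would be re-built from time to time.
ratings = {
    "u1": {"A": 5, "B": 4},
    "u2": {"A": 5, "B": 5, "C": 5},
    "u3": {"A": 1, "D": 3},
}
model = build_item_model(ratings)
print(recommend_item_based(model, {"A", "B"}))
```

New ratings only affect recommendations after the next rebuild, which is the price of the abstraction.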

Social Networks & Interest Similarity

• collaborative filtering: 'neighborhoods' of people with similar interests & recommending items based on likings in the neighborhood
• limitations: next to 'cold start' and 'sparsity', the lack of control (over one's neighborhood) is also a problem, i.e. one cannot add 'trusted' people nor exclude 'strange' ones
• therefore, interest in 'social recommenders', where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike):
  • does a social connection indicate user interest similarity?
  • how much does users' interest similarity depend on the strength of their connection?
  • is it feasible to use a social network as a personalized recommendation?

[Lin & Brusilovsky: Social Networks and Interest Similarity: The Case of CiteULike, HT'10]

Conclusions

• unilaterally connected pairs have more common information items, metadata and tags than non-connected pairs
• the similarity was largest for direct connections and decreased with increasing distance between users in the social network
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation
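The kind of comparison behind these conclusions can be sketched as follows. The sketch assumes Jaccard overlap of bookmarked items as the similarity measure, which is a choice made here for illustration and not necessarily the measure used by Lin & Brusilovsky; the libraries and connections are toy data and do not reproduce their results.

```python
from itertools import combinations

def jaccard(a, b):
    """Share of items two users have in common, relative to all their items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def mean_similarity_by_connection(libraries, connections):
    """Average item overlap of connected vs. non-connected user pairs.

    `connections` holds directed (watcher, watched) pairs; a pair counts as
    connected if a link exists in either direction.
    """
    connected, unconnected = [], []
    for u, v in combinations(sorted(libraries), 2):
        sim = jaccard(libraries[u], libraries[v])
        linked = (u, v) in connections or (v, u) in connections
        (connected if linked else unconnected).append(sim)

    def average(values):
        return sum(values) / len(values) if values else 0.0

    return average(connected), average(unconnected)

# Hypothetical bookmark collections and one unilateral connection.
libraries = {
    "ann": {"p1", "p2", "p3"},
    "bob": {"p2", "p3", "p4"},
    "cem": {"p5"},
    "dee": {"p1", "p5"},
}
connections = {("ann", "bob")}
print(mean_similarity_by_connection(libraries, connections))
```

Separating reciprocal from unidirectional links would only require checking whether both (u, v) and (v, u) are present.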

Social recommendations on conflicting groups

are all interests of our friends relevant? Is it application generic?

Content-based Recommendations

• Input: characteristics of items & interests of a user in characteristics of items => recommend items that feature characteristics which meet the user's interests
• Techniques:
  • Data mining methods: cluster items based on their characteristics => infer users' interests in clusters
  • IR methods: represent items & users as term vectors => compute similarity between user profile vector and items
  • Utility-based methods: utility function that gets an item as input; the parameters of the utility function are customized via preferences of a user
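A utility-based method can be as small as a weighted sum. The sketch below assumes a linear utility over item features, with the user's preferences as the weights; the form of the function, the feature names and the items are all hypothetical.

```python
def make_utility(preferences):
    """Return a utility function parameterized by one user's preference weights."""
    def utility(item_features):
        # Linear utility (assumption): sum of feature value * preference weight.
        return sum(preferences.get(feature, 0.0) * value
                   for feature, value in item_features.items())
    return utility

# Hypothetical items described by feature values in [0, 1].
items = {
    "hotel_a": {"near_center": 1.0, "cheap": 0.2},
    "hotel_b": {"near_center": 0.3, "cheap": 0.9},
}
utility = make_utility({"cheap": 0.8, "near_center": 0.2})
ranked = sorted(items, key=lambda name: utility(items[name]), reverse=True)
print(ranked)
```

Changing the preference weights changes the ranking without touching the item descriptions, which is what "customized via preferences of a user" amounts to.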

Content Features

Item a: "Government stops renovation of tower bridge", Oct 13th 2011

Tower Bridge today: Under construction

Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames

Category: politics, england
Related Twiper news: bob: "Why do they stop to… [more]", mary: "London stops reno… [more]"

a = (dbPolitics 0.2, dbSports 0, dbEducation 0, dbLondon 0.2, dbTower_Bridge 0.4, dbGovernment 0.1, dbUK 0.1)

Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)
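The weighting strategy (occurrence frequency, then 1-norm normalization) can be sketched as follows. The concept mentions are hypothetical counts chosen so that the result matches the item vector a of the slide; how concepts are extracted from the article text is assumed to be solved elsewhere.

```python
from collections import Counter

CONCEPTS = ["dbPolitics", "dbSports", "dbEducation", "dbLondon",
            "dbTower_Bridge", "dbGovernment", "dbUK"]

def concept_vector(mentions, concepts=CONCEPTS):
    """Occurrence frequency per concept, scaled so that the entries sum to 1."""
    counts = Counter(mentions)
    total = sum(counts[c] for c in concepts)
    return [counts[c] / total if total else 0.0 for c in concepts]

# Hypothetical concept mentions found in the news item (10 in total).
mentions = (["dbTower_Bridge"] * 4 + ["dbPolitics"] * 2 + ["dbLondon"] * 2
            + ["dbGovernment", "dbUK"])
print(concept_vector(mentions))
```

Normalizing by the sum makes long and short texts comparable, since only the relative frequency of a concept remains.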

User Model

User's Twitter history:
- "RT Government stops renovation of tower bridge", Oct 13th 2011
- "I am in London at the moment", Oct 13th 2011
- "I am doing sports", Oct 12th 2011

u = (dbPolitics 0, dbSports 0.1, dbEducation 0, dbLondon 0.5, dbTower_Bridge 0.2, dbGovernment 0.2, dbUK 0)

Weighting strategy:
- occurrence frequency (e.g. smoothened by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)
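Smoothing the occurrence frequency by time can be done in several ways; the slide does not say which one is used. The sketch below assumes an exponential decay with a configurable half-life, so it illustrates the idea but is not meant to reproduce the exact weights of u. The (concept, date) observations are assumed to come from an earlier concept-extraction step.

```python
from collections import defaultdict
from datetime import date

def user_profile(observations, today, half_life_days=7.0):
    """Concept weights where each mention counts less the older it is."""
    weights = defaultdict(float)
    for concept, day in observations:
        age_days = (today - day).days
        # Exponential decay (assumption): weight halves every `half_life_days`.
        weights[concept] += 0.5 ** (age_days / half_life_days)
    total = sum(weights.values())
    # 1-norm normalization: the profile weights sum to 1.
    return {c: w / total for c, w in weights.items()} if total else {}

# Hypothetical concepts extracted from the three tweets of the slide.
observations = [
    ("dbTower_Bridge", date(2011, 10, 13)),
    ("dbGovernment", date(2011, 10, 13)),
    ("dbLondon", date(2011, 10, 13)),
    ("dbSports", date(2011, 10, 12)),
]
print(user_profile(observations, today=date(2011, 10, 13), half_life_days=1.0))
```

With a short half-life the profile follows current interests closely; a long half-life approaches plain occurrence frequency.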

Recommendations

User u and candidate items a, b, c:

concept           u     a     b     c
dbPolitics        0     0.2   0     0
dbSports          0.1   0     0     0.5
dbEducation       0     0     0     0.2
dbLondon          0.5   0.2   0.8   0
dbTower_Bridge    0.2   0.4   0.2   0
dbGovernment      0.2   0.1   0     0
dbUK              0     0.1   0     0.3

cosine similarities:

      a      b      c
u     0.67   0.92   0.14

Ranking of recommended items: 1. b, 2. a, 3. c
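The similarities and the ranking on this slide can be checked with a few lines of code. The vectors are the ones from the slide, in the concept order dbPolitics, dbSports, dbEducation, dbLondon, dbTower_Bridge, dbGovernment, dbUK; only the helper name `cosine` is introduced here.

```python
import math

def cosine(x, y):
    """Cosine similarity between two equally long weight vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]          # user profile
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],  # Tower Bridge news item
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}

similarities = {name: round(cosine(u, vec), 2) for name, vec in items.items()}
ranking = sorted(similarities, key=similarities.get, reverse=True)
print(similarities)  # {'a': 0.67, 'b': 0.92, 'c': 0.14}
print(ranking)       # ['b', 'a', 'c']
```

Item b wins although the user retweeted item a, because b concentrates its weight on dbLondon, the strongest concept in the profile.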

RecSys Problems

• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users
• Changing User Preferences: user interests may change over time
• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item
• Lack of Diversity / Overfitting: for many applications good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same/similar recommendations again and again
• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user's behavior

What is true personalization?

Hands-on Teaser

• Build your own recommender system 101
• Recommend pages on delicious
• Recommend pages to your Facebook friends

image source: http://www.flickr.com/photos/bionicteaching/1375254387




bull traditional item-level similarity may be less reliable way to find similar users in social bookmarking systems

bull items collections of peers connected by self-defined social connections could be a useful source for cross-recommendation

Monday February 27 12

Social recommendations on conflicting groups

are all interests of our friends relevant Is it application generic

Monday February 27 12

Content-based Recommendations

bull Input characteristics of items amp interests of a user into characteristics of items =gt Recommend items that feature characteristics which meet the userrsquos interests

bull Techniques

bull Data mining methods Cluster items based on their characteristics =gt Infer usersrsquo interests into clusters

bull IR methods Represent items amp users as term vectors =gt Compute similarity between user profile vector and items

bull Utility-based methods Utility function that gets an item as input the parameters of the utility function are customized via preferences of a user

Monday February 27 12

Government stops renovation of tower bridge Oct 13th 2011

Tower Bridge today Under construction

Tower Bridge is a combined bascule and suspension bridge in London England over the River Thames

Category politics england Related Twiper news bob Why do they stop tohellip [more] mary London stops renohellip [more]

dbPolitics dbSports

dbEducation dbLondon

dbTower_Bridge dbGovernment

dbUK

02 0 0

02 04 01 01

= a

Weighting strategy -  occurrence frequency -  normalize vectors (1-norm sum of vector equals 1)

Content Features

Monday February 27 12

RT Government stops renovation of tower bridge Oct 13th 2011

Userrsquos Twitter history

I am in London at the moment Oct 13th 2011

I am doing sports Oct 12th 2011

dbPolitics dbSports

dbEducation dbLondon

dbTower_Bridge dbGovernment

dbUK

0 01 0

05 02 02 0

= u

Weighting strategy -  occurrence frequency (eg smoothened by occurrence time recent concepts are more important -  normalize vectors (1-norm sum of vector equals 1)

User Model

Monday February 27 12

dbPolitics dbSports

dbEducation dbLondon

dbTower_Bridge dbGovernment

dbUK

u 0

01 0

05 02 02 0

candidate items user a

02 0 0

02 04 01 01

b 0 0 0

08 02 0 0

c 0

05 02 0 0 0

03

cosine similarities

a b c

u 067 092 014

Ranking of recommended items 1  b 2  a 3  c

Recommendations

Monday February 27 12

RecSys Problemsbull Cold-start problem New User problem no or little data is available to infer

the preferences of new users

bull Changing User Preferences user interests may change over time

bull Sparsity problem New Item problem item descriptions are sparse eg not many user rated or tagged an item

bull Lack of Diversity Overfitting for many applications good recommendations should be relevant and new to the user (exceptions predicting re-visiting behavior etc) When adapting too strongly to the preferences of a user the user might see again and again samesimilar recommendations

bull Use the right context users do lots of things which might not be relevant for their user model eg try out things do stuff for other people

bull Research challenge find right balance between serendipity amp personalization

bull Research challenge find right way to use the influence of the recommendations on the userrsquos behavior

Monday February 27 12

What is true personalization

Monday February 27 12

Monday February 27 12

Hands-on Teaser

bull Build your own recommender system 101

bull Recommend pages on delicious

bull Recommend pages to your Facebook friends

image source httpwwwflickrcomphotosbionicteaching1375254387

Monday February 27 12

Page 15: Lecture 4: Social Web Personalization (2012)

Why are stereotypes useful?

http://farm1.static.flickr.com/155/413650229_31ef379b0b_b.jpg

Can we use Social Web data for user modeling?

Can we infer a Twitter-based user profile?

(figure: Semantic Enrichment, Linkage and Alignment; User Modeling (4 building blocks); Profile; Personalized News Recommender)

User: “I want my personalized news recommendations”

Example from Abel et al. (2011)

User Modeling Building Blocks

Profile: a list of (concept, weight) pairs

1. Temporal Constraints: Which tweets of the user should be analyzed?

(a) time period: start and end (timeline in the figure: June 27, July 4, July 11)

(b) temporal patterns: e.g. weekends; morning, afternoon, night

User Modeling Building Blocks

2. Profile Type: What type of concepts should represent “interests”?

Example tweet: “Francesca Schiavone won French Open #fo2010”

• entity-based: Francesca Schiavone, French Open

• topic-based: Sport

• hashtag-based: #fo2010

User Modeling Building Blocks

3. Semantic Enrichment: Further enrich the semantics of tweets?

Example tweet: “Francesca Schiavone won http://bit.ly/2f4t7a”

Linked article: “Francesca wins French Open: Thirty in women’s tennis is primordially old, an age when agility and desire recedes as the …”

(a) tweet-based: Francesca Schiavone

(b) further enrichment: Francesca Schiavone, French Open, Tennis

User Modeling Building Blocks

4. Weighting Scheme: How to weight the concepts?

• Concept frequency (TF)

• TFxIDF

• Time-sensitive

(figure: timeline June 27, July 4, July 11 with example concept counts 4, 3, 6; the resulting profile holds weight(Francesca Schiavone), weight(French Open), weight(Tennis))
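
The four building blocks can be read as parameters of one profile-construction function. The following is a minimal sketch, assuming the tweets have already been annotated with concepts (the semantic enrichment step is taken as given). The names `Tweet` and `build_profile`, the exponential decay with `half_life_days`, and the example dates are illustrative assumptions, not the method of Abel et al.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class Tweet:
    day: date
    concepts: list  # concepts already extracted (entities, topics or hashtags)


def build_profile(tweets, start, end, now=None, half_life_days=None):
    """Return a {concept: weight} profile whose weights sum to 1.

    start/end: temporal constraint (time period).
    half_life_days: if set, a tweet's contribution halves every
    half_life_days (an assumed time-sensitive scheme); otherwise plain
    concept frequency (TF) is used.
    """
    now = now or end
    weights = Counter()
    for tweet in tweets:
        if not (start <= tweet.day <= end):
            continue  # outside the selected time period
        if half_life_days is None:
            contribution = 1.0
        else:
            age_days = (now - tweet.day).days
            contribution = 0.5 ** (age_days / half_life_days)
        for concept in tweet.concepts:
            weights[concept] += contribution
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()} if total else {}


# Made-up tweets for illustration only.
tweets = [
    Tweet(date(2010, 6, 27), ["Francesca Schiavone", "French Open"]),
    Tweet(date(2010, 7, 4), ["Tennis"]),
    Tweet(date(2010, 7, 11), ["Tennis", "French Open"]),
]
tf_profile = build_profile(tweets, date(2010, 6, 27), date(2010, 7, 11))
recent_profile = build_profile(
    tweets, date(2010, 6, 27), date(2010, 7, 11),
    now=date(2010, 7, 11), half_life_days=7,
)
```

With the time-sensitive variant, concepts that only occur in old tweets lose weight relative to the plain frequency profile, which is one way to follow changing interests.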

Observations

• Profile characteristics

  • Semantic enrichment solves sparsity problems

  • Profiles change over time: fresh profiles better reflect current user demands

  • Temporal patterns: weekend profiles differ significantly from weekday profiles

• Impact on news recommendations

  • The more fine-grained the concepts, the better the recommendation performance: entity-based > topic-based > hashtag-based

  • Semantic enrichment improves recommendation quality

  • Time-sensitivity (adapting to trends) improves performance

User Modeling

It is not about putting everything in a user profile; it is about making the right choices.

User Adaptation

Knowledge about the user can be applied to adapt a system or interface to that user, in order to improve the system functionality and the user experience.

User-Adaptive Systems

(figure: observations, data and information about user → user modeling → user profile → profile analysis → adaptation decisions)

A. Jameson. Adaptive interfaces and agents. The HCI handbook: fundamentals, evolving technologies and emerging applications, pp. 305–330, 2003.

Last.fm adapts to your music taste

(figure: history of songs (like, ban, pause, skip) → user modeling (infer current musical taste) → user profile (interests in genres, artists, tags) → compare profile with possible next songs to play → next song to be played)

Google Adaptive Search

http://www.google.com/goodtoknow

Issues in User-Adaptive Systems

• Overfitting, “bubble effects”, loss of serendipity problem

  • systems may adapt too strongly to the interests/behavior

  • e.g. an adaptive radio station may always play the same or very similar songs

  • We search for the right balance between novelty and relevance for the user

• “Lost in Hyperspace” problem

  • when adapting the navigation, i.e. the links on which users can click to find/access information

  • e.g. re-ordering/hiding of menu items may lead to confusion

What is good user modeling & personalization?

http://www.flickr.com/photos/bellarosebyliz/4729613108

Success perspectives

• From the consumer perspective of an adaptive system: the adaptive system maximizes satisfaction of the user (hard to measure/obtain)

• From the provider perspective of an adaptive system: the adaptive system maximizes the profit (influence of UM & personalization may be hard to measure/obtain)

Evaluation Strategies

• User studies: clean-room study, ask/observe (selected) people whether you did a good job

• Log analysis: analyze (click) data and infer whether you did a good job, e.g. cross-validation by “leave-one-out”

• Evaluation of user modeling

  • measure quality of profiles directly, e.g. measure overlap with existing (true) profiles, or let people judge the quality of the generated user profiles

  • measure quality of application that exploits the user profile, e.g. apply user modeling strategies in a recommender system
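
Log analysis by leave-one-out can be sketched as follows: hide one logged item at a time, generate recommendations from the remaining log, and check whether the hidden item comes back in the top k. This is a sketch under assumptions: the function names, the `recommend(training, user, k)` signature and the toy popularity recommender are hypothetical and only serve the illustration.

```python
from collections import Counter


def leave_one_out_hit_rate(logs, recommend, k=3):
    """logs: {user: [items the user interacted with]}.

    For every (user, item) pair, hide the item, ask `recommend` for a
    top-k list based on the remaining data, and count a hit if the
    hidden item is recommended. Returns the fraction of hits.
    """
    hits = trials = 0
    for user, items in logs.items():
        for held_out in items:
            training = {u: list(its) for u, its in logs.items()}
            training[user].remove(held_out)
            top_k = recommend(training, user, k)
            hits += held_out in top_k
            trials += 1
    return hits / trials if trials else 0.0


def most_popular(training, user, k):
    """Toy recommender: globally most popular items the user has not seen."""
    seen = set(training[user])
    counts = Counter(i for its in training.values() for i in its)
    ranked = [i for i, _ in counts.most_common() if i not in seen]
    return ranked[:k]


# Made-up click log for illustration only.
logs = {
    "ann": ["p1", "p2", "p3"],
    "bob": ["p1", "p2"],
    "cat": ["p1", "p4"],
}
hit_rate = leave_one_out_hit_rate(logs, most_popular, k=1)
```

Any user modeling or recommendation strategy with the same call signature can be plugged in for `most_popular`, so different strategies are compared on identical held-out data.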

Possible Metrics

• The usual IR metrics

  • Precision: fraction of retrieved items that are relevant

  • Recall: fraction of relevant items that have been retrieved

  • F-Measure: (harmonic) mean of precision and recall

• Metrics for evaluating recommendation (rankings)

  • Mean Reciprocal Rank (MRR) of first relevant item

  • Success@k: probability that a relevant item occurs within the top k

  • If a true ranking is given: rank correlations

  • Precision@k, Recall@k & F-Measure@k

• Metrics for evaluating prediction of user preferences

  • MAE = Mean Absolute Error

  • True/False Positives/Negatives

(figure: performance over runs for strategy X and the baseline) Is strategy X better than the baseline?
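
The metrics above are short enough to write out directly. The sketch below uses plain Python and made-up toy data; the function names are illustrative, and each function follows the definition given in the list (precision and recall restricted to the top k, reciprocal rank of the first relevant item, and so on).

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    return sum(item in relevant for item in ranked[:k]) / k


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top k."""
    return sum(item in relevant for item in ranked[:k]) / len(relevant)


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0 if none is found."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def success_at_k(ranked, relevant, k):
    """1 if at least one relevant item occurs within the top k, else 0."""
    return float(any(item in relevant for item in ranked[:k]))


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Toy data: one ranked list per user and the items each user really liked.
rankings = {"u1": ["a", "b", "c", "d"], "u2": ["c", "a", "d", "b"]}
liked = {"u1": {"b", "d"}, "u2": {"c"}}

# MRR and Success@k are averages of the per-user values.
mrr = sum(reciprocal_rank(rankings[u], liked[u]) for u in rankings) / len(rankings)
p2 = precision_at_k(rankings["u1"], liked["u1"], 2)
r2 = recall_at_k(rankings["u1"], liked["u1"], 2)
```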

Example Evaluation

• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)

• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation

• Steps:

  1. Gather a dataset of tag data, part of which is used as input; the recommendations are tested on the remaining tag data

  2. Use the input data to calculate the predictions for each of the strategies

  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)

  4. Test the results for statistical significance using Student’s t-test, relative to the baseline (e.g. existing approach, competitive approach)

[Rae et al., Improving Tag Recommendations Using Social Networks, RIAO’10]
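
Steps 3 and 4 can be sketched on toy data: compute a per-user score for each strategy (here Average Precision, whose mean over users is MAP) and compare the two score lists with a paired t statistic. Whether Rae et al. used a paired test is an assumption made here; the data and function names are made up. The sketch only computes the statistic and degrees of freedom; a p-value would be looked up from the t distribution (for example with `scipy.stats.ttest_rel`, not used here to keep the sketch dependency-free).

```python
import math
import statistics


def average_precision(ranked, relevant):
    """Mean of precision@rank over the ranks where a relevant tag appears."""
    hits, total = 0, 0.0
    for rank, tag in enumerate(ranked, start=1):
        if tag in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def paired_t_statistic(scores_x, scores_baseline):
    """t statistic and degrees of freedom for paired per-user scores."""
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_diff = statistics.stdev(diffs)  # sample standard deviation
    return mean_diff / (std_diff / math.sqrt(n)), n - 1


# Held-out tags per user and the tags predicted by two strategies (toy data).
held_out = {"u1": {"rome", "italy"}, "u2": {"cat"},
            "u3": {"beach", "sun"}, "u4": {"music"}}
strategy_x = {"u1": ["rome", "italy", "travel"], "u2": ["cat", "pet", "cute"],
              "u3": ["sun", "sea", "beach"], "u4": ["live", "music", "band"]}
baseline = {"u1": ["travel", "rome", "photo"], "u2": ["pet", "cute", "cat"],
            "u3": ["sea", "photo", "sun"], "u4": ["band", "live", "music"]}

users = sorted(held_out)
ap_x = [average_precision(strategy_x[u], held_out[u]) for u in users]
ap_base = [average_precision(baseline[u], held_out[u]) for u in users]
map_x = statistics.mean(ap_x)
map_baseline = statistics.mean(ap_base)
t_stat, df = paired_t_statistic(ap_x, ap_base)
```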

Example Evaluation

• [Guy et al.] shows another example of a similar evaluation approach

• Here the strategies differ in the way people and tags are used. In these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the relevant aspects of these relationships for modeling and recommendation

• Their baseline is the ‘most popular’ tags strategy. It is often used to compare the globally most popular tags with the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and is able to outperform this easily available baseline

[Guy et al., Social Media Recommendation based on People and Tags, SIGIR’10]
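
A ‘most popular’ baseline takes only a few lines, which is why it is an easily available point of comparison. The sketch below contrasts it with a deliberately simple personalized strategy; both functions, their names and the (user, item, tag) triples are illustrative assumptions and not the strategies evaluated by Guy et al.

```python
from collections import Counter


def most_popular_tags(assignments, k):
    """Globally most frequent tags; the same list is recommended to everyone."""
    counts = Counter(tag for _, _, tag in assignments)
    return [tag for tag, _ in counts.most_common(k)]


def personal_tags(assignments, user, k):
    """Toy personalized strategy: the user's own most frequently used tags."""
    counts = Counter(tag for u, _, tag in assignments if u == user)
    return [tag for tag, _ in counts.most_common(k)]


# (user, item, tag) triples; made-up data for illustration only.
assignments = [
    ("ann", "i1", "python"), ("ann", "i2", "python"), ("ann", "i3", "web"),
    ("bob", "i1", "news"), ("bob", "i4", "news"), ("bob", "i5", "news"),
    ("cat", "i2", "news"), ("cat", "i6", "web"), ("cat", "i7", "web"),
]
baseline = most_popular_tags(assignments, 2)
for_ann = personal_tags(assignments, "ann", 2)
```

Both lists can then be scored against held-out tags with the same metric, so that any gain of the personalized strategy over the baseline is directly visible.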

user interactions (level & type) instead of general social context - better for recommendations

does hybrid always work worse?

Recommendation Systems

Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)

It’s often a ranking task

http://www.wired.com/magazine/2011/11/mf_artsy/all/1

Collaborative Filtering

(figure: graph of “likes” edges between users u1, u2 and items, including “u1 likes Pulp Fiction”)

• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users

• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes

• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster

• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences

• Others: rule-based, other data mining techniques
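
The memory-based variant can be sketched directly on a small user-item matrix: compute the similarity between the target user and every other user, then score unseen items by the similarity-weighted ratings of those users. This is a minimal sketch with made-up ratings; cosine similarity and the weighted-average prediction are common choices assumed here, not prescribed by the slides.

```python
import math


def cosine(ratings_a, ratings_b):
    """Cosine similarity of two sparse rating dicts {item: rating}."""
    shared = set(ratings_a) & set(ratings_b)
    dot = sum(ratings_a[i] * ratings_b[i] for i in shared)
    norm_a = math.sqrt(sum(r * r for r in ratings_a.values()))
    norm_b = math.sqrt(sum(r * r for r in ratings_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(ratings, user, k=2):
    """Score unseen items by similarity-weighted ratings of other users."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue  # users without overlap contribute nothing
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted, key=predicted.get, reverse=True)[:k]


# Made-up ratings on a 1-5 scale; only "Pulp Fiction" appears on the slide.
ratings = {
    "u1": {"Kill Bill": 5, "Reservoir Dogs": 4},
    "u2": {"Kill Bill": 5, "Reservoir Dogs": 5, "Pulp Fiction": 5},
    "u3": {"Kill Bill": 1, "Titanic": 3},
}
top = recommend(ratings, "u1", k=1)
```

Here u2 rates like u1, so the item u2 likes and u1 has not seen is ranked first.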

Memory vs. Model-based

Memory-based:

• complete input data is required

• pre-computation not possible

• does not scale well (“tricks” are needed)

• high quality of recommendations

Model-based:

• abstraction (model) of input data

• pre-computation (partially) possible (model has to be re-built from time to time)

• scales better

• abstraction may reduce recommendation quality
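
The pre-computation trade-off is easiest to see in an item-item model: an offline step builds the similarity model from all ratings, and the online step needs only that model plus the current user's own ratings. The following sketch assumes cosine similarity between item rating columns and uses made-up data; `build_item_model` and `recommend_from_model` are hypothetical names.

```python
import math
from collections import defaultdict


def build_item_model(ratings):
    """Offline step: cosine similarity between item rating columns."""
    columns = defaultdict(dict)  # item -> {user: rating}
    for user, user_ratings in ratings.items():
        for item, rating in user_ratings.items():
            columns[item][user] = rating
    norms = {i: math.sqrt(sum(r * r for r in col.values()))
             for i, col in columns.items()}
    model = defaultdict(dict)
    for i in columns:
        for j in columns:
            if i == j:
                continue
            shared = set(columns[i]) & set(columns[j])
            dot = sum(columns[i][u] * columns[j][u] for u in shared)
            if dot > 0:
                model[i][j] = dot / (norms[i] * norms[j])
    return model


def recommend_from_model(model, user_ratings, k=2):
    """Online step: only the pre-computed model and the user's own ratings."""
    scores = defaultdict(float)
    for item, rating in user_ratings.items():
        for neighbour, sim in model.get(item, {}).items():
            if neighbour not in user_ratings:
                scores[neighbour] += sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Made-up ratings for illustration only.
ratings = {
    "u1": {"A": 5, "B": 4},
    "u2": {"A": 5, "B": 5, "C": 5},
    "u3": {"A": 1, "D": 5},
}
model = build_item_model(ratings)  # re-built from time to time
top = recommend_from_model(model, ratings["u1"], k=1)
```

New ratings only influence recommendations after the model is re-built, which is the quality cost that the abstraction brings.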

Social Networks & Interest Similarity

• collaborative filtering: ‘neighborhoods’ of people with similar interests & recommending items based on likings in the neighborhood

• limitations: next to ‘cold start’ and ‘sparsity’, the lack of control (over one’s neighborhood) is also a problem, i.e. one cannot add ‘trusted’ people nor exclude ‘strange’ ones

• therefore interest in ‘social recommenders’, where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike)

  • does a social connection indicate user interest similarity?

  • how much does users’ interest similarity depend on the strength of their connection?

  • is it feasible to use a social network as a personalized recommendation?

[Lin & Brusilovsky, Social Networks and Interest Similarity: The Case of CiteULike, HT’10]

Conclusions

• unilaterally connected pairs have more common information items, metadata and tags than non-connected pairs

• the similarity was largest for direct connections and decreased as the distance between users in the social network increased

• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship

• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems

• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation

Social recommendations on conflicting groups

Are all interests of our friends relevant? Is it application generic?

Content-based Recommendations

• Input: characteristics of items & interests of a user in characteristics of items => Recommend items that feature characteristics which meet the user’s interests

• Techniques

  • Data mining methods: Cluster items based on their characteristics => Infer users’ interests in clusters

  • IR methods: Represent items & users as term vectors => Compute similarity between user profile vector and items

  • Utility-based methods: Utility function that gets an item as input; the parameters of the utility function are customized via preferences of a user

Content Features

Example item a (news article): “Government stops renovation of tower bridge”, Oct 13th 2011

Image caption: “Tower Bridge today: Under construction”

Text: “Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames”

Category: politics, england. Related Twiper news: bob: “Why do they stop to… [more]”, mary: “London stops reno… [more]”

Item vector a:

concept          weight
db:Politics      0.2
db:Sports        0
db:Education     0
db:London        0.2
db:Tower_Bridge  0.4
db:Government    0.1
db:UK            0.1

Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)

User Model

User’s Twitter history:

• “RT: Government stops renovation of tower bridge” (Oct 13th 2011)

• “I am in London at the moment” (Oct 13th 2011)

• “I am doing sports” (Oct 12th 2011)

User vector u:

concept          weight
db:Politics      0
db:Sports        0.1
db:Education     0
db:London        0.5
db:Tower_Bridge  0.2
db:Government    0.2
db:UK            0

Weighting strategy:
- occurrence frequency (e.g. smoothened by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)

Recommendations

concept          user u   item a   item b   item c
db:Politics      0        0.2      0        0
db:Sports        0.1      0        0        0.5
db:Education     0        0        0        0.2
db:London        0.5      0.2      0.8      0
db:Tower_Bridge  0.2      0.4      0.2      0
db:Government    0.2      0.1      0        0
db:UK            0        0.1      0        0.3

Cosine similarities with u: a = 0.67, b = 0.92, c = 0.14

Ranking of recommended items: 1. b, 2. a, 3. c
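
The similarities and the ranking on this slide can be reproduced in a few lines. The sketch below reads the slide's vector entries as decimal weights over the seven concepts (Politics, Sports, Education, London, Tower_Bridge, Government, UK, in that order); the variable and function names are illustrative.

```python
import math

# Dimension order: Politics, Sports, Education, London,
# Tower_Bridge, Government, UK (as on the slide).
user = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}


def cosine(v, w):
    """Cosine similarity: dot product divided by the product of 2-norms."""
    dot = sum(x * y for x, y in zip(v, w))
    norm = math.sqrt(sum(x * x for x in v)) * math.sqrt(sum(y * y for y in w))
    return dot / norm if norm else 0.0


similarities = {name: cosine(user, vec) for name, vec in items.items()}
ranking = sorted(similarities, key=similarities.get, reverse=True)
```

Item b ranks first because almost all of its weight lies on db:London and db:Tower_Bridge, the concepts that also dominate the user vector.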

RecSys Problems

• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users

• Changing User Preferences: user interests may change over time

• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users have rated or tagged an item

• Lack of Diversity / Overfitting: for many applications good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior etc.). When adapting too strongly to the preferences of a user, the user might see the same or similar recommendations again and again

• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people

• Research challenge: find the right balance between serendipity & personalization

• Research challenge: find the right way to use the influence of the recommendations on the user’s behavior

What is true personalization?

Hands-on Teaser

• Build your own recommender system 101

• Recommend pages on delicious

• Recommend pages to your Facebook friends

image source: http://www.flickr.com/photos/bionicteaching/1375254387

Page 16: Lecture 4: Social Web Personalization (2012)

Can we use Social Web

data for user

modeling

Monday February 27 12

Can we infer a Twitter-based user profile

User Modeling (4 building blocks)

Semantic Enrichment Linkage and Alignment

Personalized News Recommender

Profile

I want my

personalized news recommendations

Example from Abel et al (2011)Monday February 27 12

User Modeling Building Blocks

Profile concept weight

time

1 Which tweets of the user should be

analyzed

Morning Afternoon Night

1 Temporal Constraints

June 27 July 4 July 11

(b) temporal patterns

weekends start end

(a) time period

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

Francesca Schiavone won French Open fo2010

Francesca Schiavone

French Open

Francesca Schiavone French Open entity-based

Sport T

T topic-based

2 What type of concepts should represent ldquointerestsrdquo

fo2010

fo2010 hashtag-based

1 Temporal Constraints

time

June 27 July 4 July 11

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

Francesca Schiavone won httpbitly2f4t7a

Francesca Schiavone

3 Further enrich the semantics of tweets

1 Temporal Constraints

3 Semantic Enrichment

Francesca Schiavone

Francesca wins French Open Thirty in womens tennis is primordially old an age when agility and desire recedes as the hellip

French Open

Tennis

French Open

Tennis

(b) further enrichment

(a) tweet-based

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

4 How to weight the concepts

1 Temporal Constraints

3 Semantic Enrichment

Francesca Schiavone

French Open

Tennis

4 Weighting Scheme

time

June 27 July 4 July 11

weight(Francesca Schiavone)

Concept frequency (TF)

4

3 6

TFxIDF Time-sensitive

weight(French Open)

weight(Tennis)

Monday February 27 12

Observationsbull Profile characteristics

bull Semantic enrichment solves sparsity problems

bull Profiles change over time fresh profiles reflect better current user demands

bull Temporal patterns weekend profiles differ significantly from weekday profiles

bull Impact on news recommendations

bull The more fine-grained the concepts the better the recommendation performance entity-based gt topic-based gt hashtag-based

bull Semantic enrichment improves recommendation quality

bull Time-sensitivity (adapting to trends) improves performance

Monday February 27 12

User Modelingit is not about putting everything in a user profile

it is about making the right choices

Monday February 27 12

User AdaptationKnowing the user - this knowledge - can be applied to adapt

a system or interface to the user to improve the system functionality and user experience

Monday February 27 12

Monday February 27 12

user modeling

user profile

observations data and information about user

profile analysis

adaptation decisions

User-Adaptive Systems

A Jameson Adaptive interfaces and agents The HCI handbook fundamentals evolving technologies and emerging applications pp 305ndash330 2003

Monday February 27 12

user modeling (infer current musical taste)

user profile interests in

genres artists tags

history of songs like ban pause skip

compare profile with possible next

songs to play

next song to be played

Lastfm adapts to your music taste

Monday February 27 12

Google Adaptive Search

httpwwwgooglecomgoodtoknow

Monday February 27 12

Issues in User-Adaptive Systems

bull Overfitting ldquobubble effectsrdquo loss of serendipity problem

bull systems may adapt too strongly to the interestsbehavior

bull eg an adaptive radio station may always play the same or very similar songs

bull We search for the right balance between novelty and relevance for the user

bull ldquoLost in Hyperspacerdquo problem

bull when adapting the navigation ndash ie the links on which users can click to findaccess information

bull eg re-orderinghiding of menu items may lead to confusion

Monday February 27 12

httpwwwflickrcomphotosbellarosebyliz4729613108

What is good user modeling amp personalization

Monday February 27 12

Success perspectives

bull From the consumer perspective of an adaptive system

bull From the provider perspective of an adaptive system

Adaptive system maximizes satisfaction of the user

hard to measureobtain

Adaptive system maximizes the profit

influence of UM amp personalization may be hard to measureobtain

Monday February 27 12

Evaluation Strategiesbull User studies Clean-room study askobserve (selected) people

whether you did a good job

bull Log analysis Analyze (click) data and infer whether you did a good job eg cross-validation by ldquoLeave-one-outrdquo

bull Evaluation of user modeling

bull measure quality of profiles directly eg measure overlap with existing (true) profiles or let people judge the quality of the generated user profiles

bull measure quality of application that exploits the user profile eg apply user modeling strategies in a recommender system

Monday February 27 12

Possible Metricsbull The usual IR metrics

bull Precision fraction of retrieved items that are relevant

bull Recall fraction of relevant items that have been retrieved

bull F-Measure (harmonic) mean of precision and recall

bull Metrics for evaluating recommendation (rankings)

bull Mean Reciprocal Rank (MRR) of first relevant item

bull Successk probability that a relevant item occurs within the top k

bull If a true ranking is given rank correlations

bull Precisionk Recallk amp F-Measurek

bull Metrics for evaluating prediction of user preferences

bull MAE = Mean Absolute Error

bull TrueFalse PositivesNegatives

runs

performance strategy X baseline

Is strategy X better than the baseline

Monday February 27 12

Example Evaluationbull [Rae et al] shows a typical example of how to investigate and evaluate a

proposal for improving (tag) recommendations (using social networks)

bull Task test how well the different strategies (here different tag contexts) can be used for tag predictionrecommendation

bull Steps

1 Gather a dataset of tag data part of which can be used as input and aim to test the recommendation on the remaining tag data

2 Use the input data and calculate for the different strategies the predictions

3 Measure the performance using standard (IR) metrics Precision of the top 5 recommended tags (P5) Mean Reciprocal Rank (MRR) Mean Average Precision (MAP)

4 Test the results for statistical significance using Studentrsquos T-test relative to the baseline (eg existing approach competitive approach)

[Rae et al Improving Tag Recommendations Using Social Networks RIAOrsquo10]]

Monday February 27 12

Example Evaluationbull [Guy et al] shows another example of a similar evaluation

approach

bull Here the different strategies differ in the way people and tags are used in the strategies with these tag-based systems there are complex relationships between users tags and items and strategies aim to find the relevant aspects of these relationships for modeling and recommendation

bull Here their baseline is the strategy of the lsquomost popularrsquo tags this is a strategy often used to compare the globally most popular tags to the tags predicted by a particular personalization strategy thus investigating whether the personalization is worth the effort and is able to outperform the easily available baseline

[Guy et al Social Media Recommendation based on People and Tags SIGIRrsquo10]]

Monday February 27 12

user interactions (level amp type) instead of general social context - better for recommendations

does hybrid always work worseMonday February 27 12

Recommendation Systems

Predict items that are relevantusefulinteresting (and to what extent)

for given user (in a given context)

itrsquos often a ranking task

Monday February 27 12

Monday February 27 12

Monday February 27 12

Monday February 27 12

httpwwwwiredcommagazine201111mf_artsyall1Monday February 27 12

Collaborative Filtering

u1 likes u2

likes likes u1 likes Pulp Fiction

bull Memory-based User-Item matrix ratingspreferences of users =gt compute similarity between users amp recommend items of similar users

bull Model-based Item-Item matrix similarity (eg based on user ratings) between items =gt recommend items that are similar to the ones the user likes

bull Model-based Clustering cluster users according to their preferences =gt recommend items of users that belong to the same cluster

bull Model-based Bayesian networks P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B learn probabilities from user ratingspreferences

bull Others rule-based other data mining techniques

bull

Monday February 27 12

Memory vs Model-based

bull complete input data is required

bull pre-computation not possible

bull does not scale well (ldquotricksrdquo are needed)

bull high quality of recommendations

bull abstraction (model) of input data

bull pre-computation (partially) possible (model has to be re-built from time to time)

bull scales better

bull abstraction may reduce recommendation quality

Monday February 27 12

Social Networks amp Interest Similarity

bull collaborative filtering lsquoneighborhoodsrsquo of people with similar interest amp recommending items based on likings in neighborhood

bull limitations next to lsquocold startrsquo and lsquosparsityrsquo the lack of control (over onersquos neighborhood) is also a problem ie cannot add lsquotrustedrsquo people nor exclude lsquostrangersquo ones

bull therefore interest in lsquosocial recommendersrsquo where presence of social connections defines the similarity in interests (eg social tagging CiteULike)

bull does a social connection indicate user interest similarity

bull how much users interest similarity depends on the strength of their connection

bull is it feasible to use a social network as a personalized recommendation

[Lin amp Brusilovsky Social Networks and Interest Similarity The Case of CiteULike HTrsquo10]Monday February 27 12

Conclusionsbull pairs unilaterally connected have more common information

items metadata and tags than non-connected pairs

bull the similarity was largest for direct connections and decreased with the increase of distance between users in the social networks

bull users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship

bull traditional item-level similarity may be less reliable way to find similar users in social bookmarking systems

bull items collections of peers connected by self-defined social connections could be a useful source for cross-recommendation

Monday February 27 12

Social recommendations on conflicting groups

are all interests of our friends relevant Is it application generic

Monday February 27 12

Content-based Recommendations

bull Input characteristics of items amp interests of a user into characteristics of items =gt Recommend items that feature characteristics which meet the userrsquos interests

bull Techniques

bull Data mining methods Cluster items based on their characteristics =gt Infer usersrsquo interests into clusters

bull IR methods Represent items amp users as term vectors =gt Compute similarity between user profile vector and items

bull Utility-based methods Utility function that gets an item as input the parameters of the utility function are customized via preferences of a user

Monday February 27 12

Government stops renovation of tower bridge Oct 13th 2011

Tower Bridge today Under construction

Tower Bridge is a combined bascule and suspension bridge in London England over the River Thames

Category politics england Related Twiper news bob Why do they stop tohellip [more] mary London stops renohellip [more]

dbPolitics dbSports

dbEducation dbLondon

dbTower_Bridge dbGovernment

dbUK

02 0 0

02 04 01 01

= a

Weighting strategy -  occurrence frequency -  normalize vectors (1-norm sum of vector equals 1)

Content Features

Monday February 27 12

User Model

User's Twitter history:

- "RT Government stops renovation of tower bridge" (Oct 13th 2011)

- "I am in London at the moment" (Oct 13th 2011)

- "I am doing sports" (Oct 12th 2011)

Concept vector u:

concept          weight
dbPolitics       0
dbSports         0.1
dbEducation      0
dbLondon         0.5
dbTower_Bridge   0.2
dbGovernment     0.2
dbUK             0

Weighting strategy:

- occurrence frequency (e.g. smoothed by occurrence time: recent concepts are more important)

- normalize vectors (1-norm: sum of vector equals 1)

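One possible way to smooth occurrence frequency by occurrence time is an exponential decay, so that recent concepts count more. The sketch below is an assumption, not the method behind the slide's numbers: the function name `time_decayed_profile`, the half-life of 7 days and the concept-to-tweet assignment are all hypothetical.

```python
from datetime import date

def time_decayed_profile(observations, today, half_life_days=7.0):
    """observations: list of (concept, date). Recent mentions weigh more."""
    weights = {}
    for concept, day in observations:
        age = (today - day).days
        # Each mention contributes 1.0 today and half as much per half-life.
        decay = 0.5 ** (age / half_life_days)
        weights[concept] = weights.get(concept, 0.0) + decay
    total = sum(weights.values())
    # 1-norm normalization: the weights sum to 1.
    return {c: w / total for c, w in weights.items()} if total else {}

# Hypothetical concept observations taken from the three example tweets.
history = [
    ("dbLondon", date(2011, 10, 13)),
    ("dbTower_Bridge", date(2011, 10, 13)),
    ("dbGovernment", date(2011, 10, 13)),
    ("dbSports", date(2011, 10, 12)),
]
u = time_decayed_profile(history, today=date(2011, 10, 13))
print(u)
```

With these inputs dbSports, mentioned one day earlier, ends up with a slightly lower weight than the concepts mentioned on the current day.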

Recommendations

User profile u and candidate items a, b, c:

concept          u     a     b     c
dbPolitics       0     0.2   0     0
dbSports         0.1   0     0     0.5
dbEducation      0     0     0     0.2
dbLondon         0.5   0.2   0.8   0
dbTower_Bridge   0.2   0.4   0.2   0
dbGovernment     0.2   0.1   0     0
dbUK             0     0.1   0     0.3

Cosine similarities:

      a      b      c
u     0.67   0.92   0.14

Ranking of recommended items: 1. b  2. a  3. c

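The cosine similarities on this slide can be reproduced with a short Python sketch. The vectors are the ones from the slide, in the concept order dbPolitics, dbSports, dbEducation, dbLondon, dbTower_Bridge, dbGovernment, dbUK; the helper name `cosine` is just an illustrative choice.

```python
import math

# Concept order: dbPolitics, dbSports, dbEducation, dbLondon,
#                dbTower_Bridge, dbGovernment, dbUK
u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}

def cosine(x, y):
    """Cosine of the angle between two equally long vectors."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    norm_y = math.sqrt(sum(yi * yi for yi in y))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0

scores = {name: cosine(u, vec) for name, vec in items.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(name, round(scores[name], 2))
# b 0.92
# a 0.67
# c 0.14
```

Item b ranks first because almost all of its weight sits on dbLondon, the strongest concept in the user profile, while item c shares only dbSports with the user.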

RecSys Problems

• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users

• Changing User Preferences: user interests may change over time

• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item

• Lack of Diversity / Overfitting: for many applications good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same or similar recommendations again and again

• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people

• Research challenge: find the right balance between serendipity & personalization

• Research challenge: find the right way to use the influence of the recommendations on the user's behavior

What is true personalization?

Hands-on Teaser

• Build your own recommender system 101

• Recommend pages on delicious

• Recommend pages to your Facebook friends

image source: http://www.flickr.com/photos/bionicteaching/1375254387


Can we infer a Twitter-based user profile?

[Figure: User Modeling (4 building blocks); Semantic Enrichment, Linkage and Alignment; Profile; Personalized News Recommender; user: "I want my personalized news recommendations"]

Example from Abel et al. (2011)

User Modeling Building Blocks

Profile: (concept, weight)

1. Temporal Constraints: Which tweets of the user should be analyzed?

   (a) time period: start, end (timeline: June 27, July 4, July 11)

   (b) temporal patterns: e.g. weekends; morning, afternoon, night

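The two kinds of temporal constraints, (a) a time period and (b) a temporal pattern such as weekends, amount to filtering the tweet history before profiling. A minimal sketch, assuming tweets are available as (timestamp, text) pairs; the function name `select_tweets`, the year 2010 and the example tweets are hypothetical.

```python
from datetime import datetime

def select_tweets(tweets, start=None, end=None, weekends_only=False):
    """Keep the tweets that satisfy the given temporal constraints."""
    selected = []
    for timestamp, text in tweets:
        if start is not None and timestamp < start:
            continue  # before the time period
        if end is not None and timestamp > end:
            continue  # after the time period
        # weekday(): Monday is 0, Saturday is 5, Sunday is 6
        if weekends_only and timestamp.weekday() < 5:
            continue
        selected.append((timestamp, text))
    return selected

# Hypothetical tweet history (timestamps and texts are made up).
tweets = [
    (datetime(2010, 6, 27, 10, 0), "tweet 1"),  # Sunday
    (datetime(2010, 6, 30, 21, 0), "tweet 2"),  # Wednesday
    (datetime(2010, 7, 4, 15, 0), "tweet 3"),   # Sunday
    (datetime(2010, 7, 12, 9, 0), "tweet 4"),   # Monday
]

in_period = select_tweets(tweets, start=datetime(2010, 6, 27),
                          end=datetime(2010, 7, 11))
weekend = select_tweets(tweets, weekends_only=True)
print(len(in_period), len(weekend))
```

A time-of-day pattern (morning, afternoon, night) could be handled the same way by testing `timestamp.hour`.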

User Modeling Building Blocks

Profile: (concept, weight)

1. Temporal Constraints

2. Profile Type: What type of concepts should represent "interests"?

Example tweet: "Francesca Schiavone won French Open #fo2010"

   - entity-based: Francesca Schiavone, French Open

   - topic-based: Sport

   - hashtag-based: #fo2010

User Modeling Building Blocks

Profile: (concept, weight)

1. Temporal Constraints

2. Profile Type

3. Semantic Enrichment: Further enrich the semantics of tweets?

Example tweet: "Francesca Schiavone won! http://bit.ly/2f4t7a"

   (a) tweet-based: Francesca Schiavone

   (b) further enrichment, using the linked article ("Francesca wins French Open: Thirty in women's tennis is primordially old, an age when agility and desire recedes as the …"): French Open, Tennis

User Modeling Building Blocks

Profile: (concept, weight)

1. Temporal Constraints

2. Profile Type

3. Semantic Enrichment

4. Weighting Scheme: How to weight the concepts?

   - Concept frequency (TF)

   - TFxIDF

   - Time-sensitive

[Figure: occurrences of the concepts Francesca Schiavone, French Open and Tennis on a timeline (June 27, July 4, July 11), giving weight(Francesca Schiavone), weight(French Open) and weight(Tennis); example concept frequencies shown: 4, 3, 6]

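To make the difference between concept frequency (TF) and TFxIDF concrete, here is a minimal Python sketch. It assumes each user profile is a plain list of concept mentions; the function names, the second and third profile, and the assignment of the counts 4, 3 and 6 to the three concepts are illustrative assumptions.

```python
import math
from collections import Counter

def tf_weights(user_concepts):
    """Concept frequency: how often the user mentioned each concept."""
    return dict(Counter(user_concepts))

def tfidf_weights(user_concepts, all_profiles):
    """TFxIDF: down-weight concepts that occur in many users' profiles.

    all_profiles is a list of concept lists, one per user, and is
    expected to include the profile of the user being weighted.
    """
    n = len(all_profiles)
    weights = {}
    for concept, freq in Counter(user_concepts).items():
        df = sum(1 for profile in all_profiles if concept in profile)
        weights[concept] = freq * math.log(n / max(df, 1))
    return weights

# Hypothetical data: one user of interest and two other users.
alice = (["Francesca Schiavone"] * 4 + ["French Open"] * 3 + ["Tennis"] * 6)
profiles = [alice, ["Tennis", "Football"], ["Tennis"]]

tf = tf_weights(alice)
tfidf = tfidf_weights(alice, profiles)
print(tf)
print(tfidf)
```

In this toy data Tennis has the highest TF weight but a TFxIDF weight of 0, because every user mentions it and it therefore does not distinguish this user from the others. A time-sensitive scheme would additionally discount old mentions.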

Observations

• Profile characteristics:

  - Semantic enrichment solves sparsity problems

  - Profiles change over time: fresh profiles better reflect current user demands

  - Temporal patterns: weekend profiles differ significantly from weekday profiles

• Impact on news recommendations:

  - The more fine-grained the concepts, the better the recommendation performance: entity-based > topic-based > hashtag-based

  - Semantic enrichment improves recommendation quality

  - Time-sensitivity (adapting to trends) improves performance

User Modeling

It is not about putting everything in a user profile; it is about making the right choices

User Adaptation

Knowing the user: this knowledge can be applied to adapt a system or interface to the user, in order to improve the system functionality and the user experience

User-Adaptive Systems

[Figure: observations, data and information about the user -> user modeling -> user profile -> profile analysis -> adaptation decisions]

A. Jameson. Adaptive interfaces and agents. The HCI handbook: fundamentals, evolving technologies and emerging applications, pp. 305–330, 2003

Last.fm adapts to your music taste

[Figure: history of songs (like, ban, pause, skip) -> user modeling (infer current musical taste) -> user profile (interests in genres, artists, tags) -> compare profile with possible next songs to play -> next song to be played]

Google Adaptive Search

http://www.google.com/goodtoknow

Issues in User-Adaptive Systems

• Overfitting, "bubble effects", loss of serendipity problem:

  - systems may adapt too strongly to the interests/behavior

  - e.g. an adaptive radio station may always play the same or very similar songs

  - We search for the right balance between novelty and relevance for the user

• "Lost in Hyperspace" problem:

  - when adapting the navigation, i.e. the links on which users can click to find/access information

  - e.g. re-ordering/hiding of menu items may lead to confusion

What is good user modeling & personalization?

image source: http://www.flickr.com/photos/bellarosebyliz/4729613108

Success perspectives

• From the consumer perspective of an adaptive system: the adaptive system maximizes satisfaction of the user (hard to measure/obtain)

• From the provider perspective of an adaptive system: the adaptive system maximizes the profit (influence of UM & personalization may be hard to measure/obtain)

Evaluation Strategies

• User studies: clean-room study, ask/observe (selected) people whether you did a good job

• Log analysis: analyze (click) data and infer whether you did a good job, e.g. cross-validation by "leave-one-out"

• Evaluation of user modeling:

  - measure quality of profiles directly, e.g. measure overlap with existing (true) profiles, or let people judge the quality of the generated user profiles

  - measure quality of the application that exploits the user profile, e.g. apply user modeling strategies in a recommender system

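A leave-one-out evaluation on log data can be sketched as follows: hide one item from a user's history, let the recommender work with the rest, and check whether the hidden item shows up in the top k. Everything here is an assumption made for illustration: the function names, the toy logs and the simple popularity-based recommender that is plugged in.

```python
from collections import Counter

def leave_one_out_hit_rate(user_items, recommend, k=5):
    """Fraction of held-out items that reappear in the top-k list.

    user_items: dict mapping user -> list of items from the log.
    recommend:  function (user, known_items, k) -> ranked list of items.
    """
    hits = trials = 0
    for user, items in user_items.items():
        for held_out in items:
            known = [i for i in items if i != held_out]
            if held_out in recommend(user, known, k):
                hits += 1
            trials += 1
    return hits / trials if trials else 0.0

# Hypothetical click logs.
logs = {"u1": ["a", "b", "c"], "u2": ["a", "b"], "u3": ["b", "c"]}

def popular_recommender(user, known, k):
    """Toy recommender: items popular among other users, minus known ones."""
    counts = Counter(i for other, items in logs.items()
                     if other != user for i in items)
    ranked = [i for i, _ in counts.most_common() if i not in known]
    return ranked[:k]

rate_top1 = leave_one_out_hit_rate(logs, popular_recommender, k=1)
rate_top3 = leave_one_out_hit_rate(logs, popular_recommender, k=3)
print(rate_top1, rate_top3)
```

Because the recommender is passed in as a function, the same loop can compare several user modeling strategies on the same logs.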

Possible Metrics

• The usual IR metrics:

  - Precision: fraction of retrieved items that are relevant

  - Recall: fraction of relevant items that have been retrieved

  - F-Measure: (harmonic) mean of precision and recall

• Metrics for evaluating recommendation (rankings):

  - Mean Reciprocal Rank (MRR) of first relevant item

  - Success@k: probability that a relevant item occurs within the top k

  - If a true ranking is given: rank correlations

  - Precision@k, Recall@k & F-Measure@k

• Metrics for evaluating prediction of user preferences:

  - MAE = Mean Absolute Error

  - True/False Positives/Negatives

[Figure: performance of strategy X and of the baseline over a number of runs] Is strategy X better than the baseline?

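Most of the metrics listed on this slide are short enough to write down directly. The Python sketch below uses the standard textbook definitions; the function names and the example data are illustrative choices, and averaging over users or queries (as in MRR) is left to the caller.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall and their harmonic mean (F-measure)."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    p = true_positives / len(retrieved) if retrieved else 0.0
    r = true_positives / len(relevant) if relevant else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranking[:k] if item in relevant) / k

def success_at_k(ranking, relevant, k):
    """1.0 if at least one relevant item occurs in the top k, else 0.0."""
    return 1.0 if any(item in relevant for item in ranking[:k]) else 0.0

def reciprocal_rank(ranking, relevant):
    """1 / position of the first relevant item; MRR is the mean over users."""
    for position, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0

def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

ranking = ["b", "a", "c"]   # hypothetical recommender output
relevant = {"a"}            # hypothetical ground truth
print(reciprocal_rank(ranking, relevant))        # 0.5
print(precision_at_k(ranking, relevant, 2))      # 0.5
print(success_at_k(ranking, relevant, 1))        # 0.0
print(mean_absolute_error([4, 3], [5, 1]))       # 1.5
```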

Example Evaluation

• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)

• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation

• Steps:

  1. Gather a dataset of tag data, part of which can be used as input, and aim to test the recommendation on the remaining tag data

  2. Use the input data and calculate the predictions for the different strategies

  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)

  4. Test the results for statistical significance using Student's t-test, relative to the baseline (e.g. existing approach, competitive approach)

[Rae et al., Improving Tag Recommendations Using Social Networks, RIAO'10]

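Steps 3 and 4 can be sketched with the standard library alone. The code below computes average precision for one ranked tag list (MAP is its mean over all test cases) and a paired t statistic for per-case scores of a strategy against a baseline. The function names and all numbers are made up, and the choice of a paired test is an assumption; the slide only says that Student's t-test is used relative to the baseline.

```python
import math
from statistics import mean, stdev

def average_precision(ranking, relevant):
    """Mean of the precision values at each rank where a relevant tag occurs."""
    hits, total = 0, 0.0
    for position, tag in enumerate(ranking, start=1):
        if tag in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0

def paired_t_statistic(scores_x, scores_baseline):
    """t statistic for paired samples (one score per test case and strategy).

    Assumes at least two cases and differences that are not all identical.
    """
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Hypothetical ranked tag recommendation and held-out true tags.
ap = average_precision(["t1", "t2", "t3", "t4"], {"t1", "t3"})

# Hypothetical per-case P@5 scores for strategy X and the baseline.
strategy_x = [0.5, 0.6, 0.7, 0.8]
baseline = [0.4, 0.4, 0.5, 0.5]
t = paired_t_statistic(strategy_x, baseline)
print(round(ap, 3), round(t, 3))
```

The t statistic still has to be compared against the t distribution with n - 1 degrees of freedom to obtain a p-value; if SciPy is available, `scipy.stats.ttest_rel` performs the paired test and returns the p-value directly.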

Example Evaluation

• [Guy et al.] shows another example of a similar evaluation approach

• Here the strategies differ in the way people and tags are used. In these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the aspects of these relationships that are relevant for modeling and recommendation

• Here the baseline is the strategy of the 'most popular' tags. This baseline is often used: the globally most popular tags are compared to the tags predicted by a particular personalization strategy, thus investigating whether the personalization is worth the effort and is able to outperform the easily available baseline

[Guy et al., Social Media Recommendation based on People and Tags, SIGIR'10]

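The 'most popular' baseline is easy to state in code, which is exactly why it is a useful yardstick. A minimal sketch, assuming tag assignments are available as (user, item, tag) triples; the function name `most_popular_tags` and the data are hypothetical.

```python
from collections import Counter

def most_popular_tags(tag_assignments, k=5):
    """Non-personalized baseline: the k globally most frequent tags."""
    counts = Counter(tag for _user, _item, tag in tag_assignments)
    return [tag for tag, _count in counts.most_common(k)]

# Hypothetical tag assignments (user, item, tag).
assignments = [
    ("u1", "i1", "web"),
    ("u2", "i1", "web"),
    ("u2", "i2", "social"),
    ("u3", "i3", "web"),
    ("u3", "i3", "social"),
    ("u1", "i2", "python"),
]

top_tags = most_popular_tags(assignments, k=2)
print(top_tags)  # ['web', 'social']
```

Every user gets the same list, so a personalized strategy has to beat this list on the chosen metrics to justify its extra effort.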

User interactions (level & type) instead of general social context: better for recommendations

Does hybrid always work worse?

Recommendation Systems

Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)

It's often a ranking task

http://www.wired.com/magazine/2011/11/mf_artsy/all/1

Collaborative Filtering

[Figure: users u1 and u2 and the items they like; question: does u1 like Pulp Fiction?]

• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users

• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes

• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster

• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences

• Others: rule-based, other data mining techniques

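The memory-based variant (user-item matrix, similarity between users, items of similar users) can be sketched as below. This is one simple way to do it, not a reference implementation: cosine similarity over the raw ratings and a similarity-weighted sum are assumed design choices, and the ratings and all movie titles other than Pulp Fiction are made up.

```python
import math

# Hypothetical user-item matrix: user -> {item: rating on a 1-5 scale}.
ratings = {
    "u1": {"Kill Bill": 5, "Reservoir Dogs": 4},
    "u2": {"Kill Bill": 5, "Reservoir Dogs": 5, "Pulp Fiction": 5},
    "u3": {"Kill Bill": 1, "Titanic": 5},
}

def user_similarity(r1, r2):
    """Cosine similarity between two users' rating dictionaries."""
    common = set(r1) & set(r2)
    if not common:
        return 0.0
    dot = sum(r1[i] * r2[i] for i in common)
    norm1 = math.sqrt(sum(v * v for v in r1.values()))
    norm2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (norm1 * norm2)

def recommend(user, ratings, k=3):
    """Rank unseen items by the similarity-weighted ratings of other users."""
    scores = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = user_similarity(ratings[user], theirs)
        for item, rating in theirs.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend("u1", ratings))  # ['Pulp Fiction', 'Titanic']
```

Here u2 rates the shared movies much like u1 does, so u2's favorite Pulp Fiction comes out on top, while the recommendation from the dissimilar user u3 ranks lower. Note that all similarities are computed at request time from the complete ratings, which is the scaling issue the memory-based approach is known for.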

Memory vs Model-based

Memory-based:

• complete input data is required

• pre-computation not possible

• does not scale well ("tricks" are needed)

• high quality of recommendations

Model-based:

• abstraction (model) of input data

• pre-computation (partially) possible (model has to be re-built from time to time)

• scales better

• abstraction may reduce recommendation quality

Social Networks & Interest Similarity

• collaborative filtering: 'neighborhoods' of people with similar interests & recommending items based on likings in the neighborhood

• limitations: next to 'cold start' and 'sparsity', the lack of control (over one's neighborhood) is also a problem, i.e. one cannot add 'trusted' people nor exclude 'strange' ones

• therefore, interest in 'social recommenders', where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike):

  - does a social connection indicate user interest similarity?

  - how much does users' interest similarity depend on the strength of their connection?

  - is it feasible to use a social network for personalized recommendation?

[Lin & Brusilovsky, Social Networks and Interest Similarity: The Case of CiteULike, HT'10]



Page 19: Lecture 4: Social Web Personalization (2012)

User Modeling Building Blocks

Profile concept weight

2 Profile Type

Francesca Schiavone won French Open fo2010

Francesca Schiavone

French Open

Francesca Schiavone French Open entity-based

Sport T

T topic-based

2 What type of concepts should represent ldquointerestsrdquo

fo2010

fo2010 hashtag-based

1 Temporal Constraints

time

June 27 July 4 July 11

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

Francesca Schiavone won httpbitly2f4t7a

Francesca Schiavone

3 Further enrich the semantics of tweets

1 Temporal Constraints

3 Semantic Enrichment

Francesca Schiavone

Francesca wins French Open Thirty in womens tennis is primordially old an age when agility and desire recedes as the hellip

French Open

Tennis

French Open

Tennis

(b) further enrichment

(a) tweet-based

Monday February 27 12

User Modeling Building Blocks

Profile concept weight

2 Profile Type

4 How to weight the concepts

1 Temporal Constraints

3 Semantic Enrichment

Francesca Schiavone

French Open

Tennis

4 Weighting Scheme

time

June 27 July 4 July 11

weight(Francesca Schiavone)

Concept frequency (TF)

4

3 6

TFxIDF Time-sensitive

weight(French Open)

weight(Tennis)

Monday February 27 12

Observationsbull Profile characteristics

bull Semantic enrichment solves sparsity problems

bull Profiles change over time fresh profiles reflect better current user demands

bull Temporal patterns weekend profiles differ significantly from weekday profiles

bull Impact on news recommendations

bull The more fine-grained the concepts the better the recommendation performance entity-based gt topic-based gt hashtag-based

bull Semantic enrichment improves recommendation quality

bull Time-sensitivity (adapting to trends) improves performance

Monday February 27 12

User Modelingit is not about putting everything in a user profile

it is about making the right choices

Monday February 27 12

User Adaptation

Knowledge about the user can be applied to adapt a system or interface to that user, in order to improve system functionality and user experience

User-Adaptive Systems

observations, data and information about user => user modeling => user profile => profile analysis => adaptation decisions

A. Jameson: Adaptive interfaces and agents. The HCI handbook: fundamentals, evolving technologies and emerging applications, pp. 305–330, 2003

Last.fm adapts to your music taste

history of songs (like, ban, pause, skip) => user modeling (infer current musical taste) => user profile (interests in genres, artists, tags) => compare profile with possible next songs to play => next song to be played

Google Adaptive Search

http://www.google.com/goodtoknow

Issues in User-Adaptive Systems

• Overfitting, “bubble effects”, loss of serendipity problem
  • systems may adapt too strongly to the interests/behavior
  • e.g. an adaptive radio station may always play the same or very similar songs
  • We search for the right balance between novelty and relevance for the user
• “Lost in Hyperspace” problem
  • when adapting the navigation – i.e. the links on which users can click to find/access information
  • e.g. re-ordering/hiding of menu items may lead to confusion

What is good user modeling & personalization?

image source: http://www.flickr.com/photos/bellarosebyliz/4729613108

Success perspectives

• From the consumer perspective of an adaptive system: adaptive system maximizes satisfaction of the user (hard to measure/obtain)
• From the provider perspective of an adaptive system: adaptive system maximizes the profit (influence of UM & personalization may be hard to measure/obtain)

Evaluation Strategies

• User studies: clean-room study, ask/observe (selected) people whether you did a good job
• Log analysis: analyze (click) data and infer whether you did a good job, e.g. cross-validation by “leave-one-out”
• Evaluation of user modeling
  • measure quality of profiles directly, e.g. measure overlap with existing (true) profiles, or let people judge the quality of the generated user profiles
  • measure quality of application that exploits the user profile, e.g. apply user modeling strategies in a recommender system
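The log-analysis strategy with “leave-one-out” can be sketched as below: hide one logged item at a time and check whether the recommender brings it back. The click logs, the popularity-based placeholder recommender and all function names are hypothetical and serve only to make the sketch runnable.

```python
from collections import Counter

# Hypothetical click logs: the items each user interacted with.
logs = {
    "u1": ["a", "b", "c"],
    "u2": ["a", "b", "d"],
    "u3": ["a", "c", "d"],
}

def recommend_popular(train_logs, user, k):
    """Placeholder strategy: the k most popular items the user has not seen."""
    seen = set(train_logs[user])
    counts = Counter(item for items in train_logs.values() for item in items)
    ranked = [item for item, _ in counts.most_common() if item not in seen]
    return ranked[:k]

def leave_one_out_hit_rate(logs, recommend, k=2):
    """Hide one logged item at a time and check whether it is recommended."""
    hits = trials = 0
    for user, items in logs.items():
        for held_out in items:
            train = dict(logs)
            train[user] = [item for item in items if item != held_out]
            hits += held_out in recommend(train, user, k)
            trials += 1
    return hits / trials

hit_rate = leave_one_out_hit_rate(logs, recommend_popular, k=2)
```

Any strategy with the same `(train_logs, user, k)` signature can be passed in instead of `recommend_popular`, which makes it easy to compare a personalization strategy against a baseline on the same logs.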

Possible Metrics

• The usual IR metrics
  • Precision: fraction of retrieved items that are relevant
  • Recall: fraction of relevant items that have been retrieved
  • F-Measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings)
  • Mean Reciprocal Rank (MRR) of first relevant item
  • Success@k: probability that a relevant item occurs within the top k
  • If a true ranking is given: rank correlations
  • Precision@k, Recall@k & F-Measure@k
• Metrics for evaluating prediction of user preferences
  • MAE = Mean Absolute Error
  • True/False Positives/Negatives

Chart: performance of strategy X vs. baseline over several runs. Is strategy X better than the baseline?
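The metrics listed on this slide are short enough to write out directly. The following sketch uses the standard textbook definitions; the function names and the example ranking are illustrative assumptions, and Precision@k is taken here as the number of relevant items in the top k divided by k.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved items that are relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0

def recall(retrieved, relevant):
    """Fraction of relevant items that have been retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0

def f_measure(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0

def precision_at_k(ranking, relevant, k):
    """Relevant items among the top k, divided by k."""
    return sum(1 for item in ranking[:k] if item in relevant) / k

def reciprocal_rank(ranking, relevant):
    """1 / position of the first relevant item (0 if there is none)."""
    for position, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0

def mean_reciprocal_rank(rankings, relevants):
    """Average reciprocal rank over several users or queries."""
    pairs = list(zip(rankings, relevants))
    return sum(reciprocal_rank(r, rel) for r, rel in pairs) / len(pairs)

def success_at_k(rankings, relevants, k):
    """Share of rankings with at least one relevant item in the top k."""
    pairs = list(zip(rankings, relevants))
    return sum(any(i in rel for i in r[:k]) for r, rel in pairs) / len(pairs)

def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and true ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

# Illustrative data: one ranked recommendation list and the relevant items.
ranking = ["a", "b", "c", "d"]
relevant = {"b", "d", "e"}
p, r = precision(ranking, relevant), recall(ranking, relevant)
```

For this example two of the four recommended items are relevant and two of the three relevant items are found, so precision is 0.5 and recall is 2/3; the first relevant item sits at position 2, giving a reciprocal rank of 0.5.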

Example Evaluation

• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation
• Steps:
  1. Gather a dataset of tag data, part of which can be used as input, and aim to test the recommendation on the remaining tag data
  2. Use the input data and calculate the predictions for the different strategies
  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
  4. Test the results for statistical significance using Student’s t-test, relative to the baseline (e.g. existing approach, competitive approach)

[Rae et al.: Improving Tag Recommendations Using Social Networks. RIAO’10]
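Steps 3 and 4 can be sketched with the standard library alone. The slide does not say which variant of the t-test was used; the sketch below assumes a paired test, because strategy and baseline are typically scored on the same test cases. All function names and score lists are hypothetical.

```python
import math
from statistics import mean, stdev

def average_precision(ranking, relevant):
    """Mean of the precision values at the positions of the relevant items."""
    hits, total = 0, 0.0
    for position, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(rankings, relevants):
    """MAP: average precision averaged over all test cases."""
    pairs = list(zip(rankings, relevants))
    return sum(average_precision(r, rel) for r, rel in pairs) / len(pairs)

def paired_t_statistic(scores_x, scores_baseline):
    """t statistic of the per-case score differences (assumed paired design)."""
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Hypothetical per-user scores (e.g. P@5) for strategy X and the baseline.
scores_x = [0.5, 0.6, 0.7]
scores_baseline = [0.4, 0.4, 0.4]
t = paired_t_statistic(scores_x, scores_baseline)
```

The statistic still has to be compared against the t distribution with n - 1 degrees of freedom to obtain a p-value; in practice a statistics package such as SciPy (`scipy.stats.ttest_rel`) does both steps.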

Example Evaluation

• [Guy et al.] shows another example of a similar evaluation approach
• Here the strategies differ in the way people and tags are used: in these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the aspects of these relationships that are relevant for modeling and recommendation
• Their baseline is the ‘most popular’ tags strategy: the globally most popular tags are compared to the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and is able to outperform the easily available baseline

[Guy et al.: Social Media Recommendation based on People and Tags. SIGIR’10]

user interactions (level & type) instead of general social context: better for recommendations

does hybrid always work worse?

Recommendation Systems

Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)

it’s often a ranking task

http://www.wired.com/magazine/2011/11/mf_artsy/all/1

Collaborative Filtering

Diagram: users u1 and u2 and the items they like; question: does u1 like Pulp Fiction?

• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques
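The memory-based variant (user-item matrix, similarity between users, items of similar users) can be sketched as follows. The ratings, the choice of cosine similarity and the similarity-weighted scoring are assumptions made for illustration; the slide does not prescribe a particular similarity measure.

```python
import math

# Hypothetical user-item matrix (user -> item -> rating on a 1-5 scale).
ratings = {
    "u1": {"Pulp Fiction": 5, "Kill Bill": 4},
    "u2": {"Pulp Fiction": 5, "Kill Bill": 5, "Jackie Brown": 4},
    "u3": {"Titanic": 5, "Kill Bill": 1},
}

def cosine(r1, r2):
    """Cosine similarity of two sparse rating vectors."""
    common = set(r1) & set(r2)
    dot = sum(r1[item] * r2[item] for item in common)
    n1 = math.sqrt(sum(v * v for v in r1.values()))
    n2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def recommend(user, ratings, k=3):
    """Score unseen items by the similarity-weighted ratings of other users."""
    scores = {}
    for other, their_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine(ratings[user], their_ratings)
        for item, rating in their_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + similarity * rating
    return sorted(scores, key=scores.get, reverse=True)[:k]

recommendations = recommend("u1", ratings)
```

Because u2 rates the shared films much like u1 does, u2's other film ranks above the one contributed by the dissimilar user u3.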

Memory vs. Model-based

Memory-based:
• complete input data is required
• pre-computation not possible
• does not scale well (“tricks” are needed)
• high quality of recommendations

Model-based:
• abstraction (model) of input data
• pre-computation (partially) possible (model has to be re-built from time to time)
• scales better
• abstraction may reduce recommendation quality
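To make the pre-computation point concrete, here is a sketch of a very small model-based recommender: it estimates P(u likes item B | u likes item A) once from co-occurrence counts and answers later requests from that table alone. The `likes` data and the function names are hypothetical, and simple counting stands in for a full Bayesian network.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical preference data: the set of items each user likes.
likes = {
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"A", "C"},
    "u4": {"B"},
}

def build_model(likes):
    """Pre-compute P(likes b | likes a) for every ordered pair (a, b)."""
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in likes.values():
        for item in items:
            item_count[item] += 1
        for a, b in permutations(items, 2):
            pair_count[(a, b)] += 1
    return {pair: n / item_count[pair[0]] for pair, n in pair_count.items()}

def recommend(user_items, model, k=2):
    """Rank unseen items by their best conditional probability."""
    scores = defaultdict(float)
    for (a, b), probability in model.items():
        if a in user_items and b not in user_items:
            scores[b] = max(scores[b], probability)
    return sorted(scores, key=scores.get, reverse=True)[:k]

model = build_model(likes)  # the abstraction that is re-built from time to time
recommendations = recommend(likes["u4"], model)
```

Only `build_model` needs the complete input data; `recommend` works on the pre-computed table, which is why this style scales better but can lag behind new ratings until the model is re-built.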

Social Networks & Interest Similarity

• collaborative filtering: ‘neighborhoods’ of people with similar interests & recommending items based on likings in the neighborhood
• limitations: next to ‘cold start’ and ‘sparsity’, the lack of control (over one’s neighborhood) is also a problem, i.e. one cannot add ‘trusted’ people nor exclude ‘strange’ ones
• therefore interest in ‘social recommenders’, where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike)
  • does a social connection indicate user interest similarity?
  • how much does users’ interest similarity depend on the strength of their connection?
  • is it feasible to use a social network as a personalized recommendation?

[Lin & Brusilovsky: Social Networks and Interest Similarity: The Case of CiteULike. HT’10]

Conclusions

• unilaterally connected pairs have more information items, metadata and tags in common than non-connected pairs
• the similarity was largest for direct connections and decreased with increasing distance between users in the social network
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation

Social recommendations on conflicting groups

are all interests of our friends relevant? Is it application generic?

Content-based Recommendations

• Input: characteristics of items & a user’s interests in characteristics of items => Recommend items that feature characteristics which meet the user’s interests
• Techniques
  • Data mining methods: Cluster items based on their characteristics => Infer users’ interests in clusters
  • IR methods: Represent items & users as term vectors => Compute similarity between user profile vector and items
  • Utility-based methods: Utility function that gets an item as input; the parameters of the utility function are customized via preferences of a user

Content Features

Item a: “Government stops renovation of tower bridge” (Oct 13th 2011)

Tower Bridge today: Under construction

Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames

Category: politics, england
Related Twiper news: bob: “Why do they stop to…” [more]; mary: “London stops reno…” [more]

a = (dbPolitics 0.2, dbSports 0, dbEducation 0, dbLondon 0.2, dbTower_Bridge 0.4, dbGovernment 0.1, dbUK 0.1)

Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)
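The weighting strategy for content features (occurrence frequency, then 1-norm normalization) can be sketched as follows. The list of concept mentions is hypothetical and was chosen so that the result matches the slide's item vector when its values are read as decimals (e.g. 02 as 0.2); how the concepts are actually extracted from the article is not covered here.

```python
from collections import Counter

# Vector dimensions in the order used on the slide.
CONCEPTS = ["dbPolitics", "dbSports", "dbEducation", "dbLondon",
            "dbTower_Bridge", "dbGovernment", "dbUK"]

def concept_vector(mentions, concepts=CONCEPTS):
    """Count concept occurrences and scale them so that they sum to 1."""
    counts = Counter(mentions)
    total = sum(counts[c] for c in concepts)
    if total == 0:
        return [0.0] * len(concepts)
    return [counts[c] / total for c in concepts]

# Hypothetical concept mentions found in the article about Tower Bridge.
mentions = (["dbPolitics"] * 2 + ["dbLondon"] * 2 + ["dbTower_Bridge"] * 4
            + ["dbGovernment"] + ["dbUK"])
a = concept_vector(mentions)
```

Normalizing to a sum of 1 keeps long and short items comparable: a long article that mentions every concept twice as often ends up with the same vector.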

User Model

User’s Twitter history:
• “RT Government stops renovation of tower bridge” (Oct 13th 2011)
• “I am in London at the moment” (Oct 13th 2011)
• “I am doing sports” (Oct 12th 2011)

u = (dbPolitics 0, dbSports 0.1, dbEducation 0, dbLondon 0.5, dbTower_Bridge 0.2, dbGovernment 0.2, dbUK 0)

Weighting strategy:
- occurrence frequency (e.g. smoothened by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)
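One way to smooth occurrence frequency by occurrence time is exponential decay, sketched below. The slide does not specify the smoothing function, so the half-life decay, the `half_life_days` parameter and the concept observations are all assumptions for illustration; the resulting numbers are not meant to reproduce the slide's user vector.

```python
from datetime import date

def time_weighted_profile(observations, today, half_life_days=7.0):
    """Each mention counts 0.5 ** (age / half-life); the result sums to 1."""
    weights = {}
    for concept, day in observations:
        age_in_days = (today - day).days
        decay = 0.5 ** (age_in_days / half_life_days)
        weights[concept] = weights.get(concept, 0.0) + decay
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()} if total else {}

# Hypothetical (concept, date) pairs extracted from the user's tweets.
observations = [
    ("dbLondon", date(2011, 10, 13)),
    ("dbTower_Bridge", date(2011, 10, 13)),
    ("dbGovernment", date(2011, 10, 13)),
    ("dbSports", date(2011, 10, 12)),
]
profile = time_weighted_profile(observations, today=date(2011, 10, 13),
                                half_life_days=1.0)
```

With a half-life of one day, the concept from the day-old tweet receives half the weight of a concept mentioned today, so recent interests dominate the profile.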

Recommendations

Vector dimensions: dbPolitics, dbSports, dbEducation, dbLondon, dbTower_Bridge, dbGovernment, dbUK

user:
u = (0, 0.1, 0, 0.5, 0.2, 0.2, 0)

candidate items:
a = (0.2, 0, 0, 0.2, 0.4, 0.1, 0.1)
b = (0, 0, 0, 0.8, 0.2, 0, 0)
c = (0, 0.5, 0.2, 0, 0, 0, 0.3)

cosine similarities:
     a     b     c
u    0.67  0.92  0.14

Ranking of recommended items: 1. b  2. a  3. c
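The ranking on this slide can be recomputed in a few lines. The vector values are read from the slide as decimals (02 as 0.2, and so on), an interpretation that reproduces the similarities shown there; the variable and function names are illustrative.

```python
import math

# Dimension order: dbPolitics, dbSports, dbEducation, dbLondon,
# dbTower_Bridge, dbGovernment, dbUK
u = [0.0, 0.1, 0.0, 0.5, 0.2, 0.2, 0.0]
items = {
    "a": [0.2, 0.0, 0.0, 0.2, 0.4, 0.1, 0.1],
    "b": [0.0, 0.0, 0.0, 0.8, 0.2, 0.0, 0.0],
    "c": [0.0, 0.5, 0.2, 0.0, 0.0, 0.0, 0.3],
}

def cosine(x, y):
    """Cosine of the angle between two dense vectors of equal length."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    norm_y = math.sqrt(sum(yi * yi for yi in y))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0

similarities = {name: cosine(u, vector) for name, vector in items.items()}
ranking = sorted(similarities, key=similarities.get, reverse=True)
```

Item b wins because nearly all of its weight lies on dbLondon, the user's strongest concept, while item c shares only the weak dbSports interest with the user.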

RecSys Problems

• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users
• Changing User Preferences: user interests may change over time
• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item
• Lack of Diversity / Overfitting: for many applications good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same/similar recommendations again and again
• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user’s behavior

What is true personalization?

Hands-on Teaser

• Build your own recommender system 101
• Recommend pages on delicious
• Recommend pages to your Facebook friends

image source: http://www.flickr.com/photos/bionicteaching/1375254387


Page 21: Lecture 4: Social Web Personalization (2012)

User Modeling Building Blocks

Profile concept weight

2 Profile Type

4 How to weight the concepts

1 Temporal Constraints

3 Semantic Enrichment

Francesca Schiavone

French Open

Tennis

4 Weighting Scheme

time

June 27 July 4 July 11

weight(Francesca Schiavone)

Concept frequency (TF)

4

3 6

TFxIDF Time-sensitive

weight(French Open)

weight(Tennis)

Monday February 27 12

Observationsbull Profile characteristics

bull Semantic enrichment solves sparsity problems

bull Profiles change over time fresh profiles reflect better current user demands

bull Temporal patterns weekend profiles differ significantly from weekday profiles

bull Impact on news recommendations

bull The more fine-grained the concepts the better the recommendation performance entity-based gt topic-based gt hashtag-based

bull Semantic enrichment improves recommendation quality

bull Time-sensitivity (adapting to trends) improves performance

Monday February 27 12

User Modelingit is not about putting everything in a user profile

it is about making the right choices

Monday February 27 12

User AdaptationKnowing the user - this knowledge - can be applied to adapt

a system or interface to the user to improve the system functionality and user experience

Monday February 27 12

Monday February 27 12

user modeling

user profile

observations data and information about user

profile analysis

adaptation decisions

User-Adaptive Systems

A Jameson Adaptive interfaces and agents The HCI handbook fundamentals evolving technologies and emerging applications pp 305ndash330 2003

Monday February 27 12

user modeling (infer current musical taste)

user profile interests in

genres artists tags

history of songs like ban pause skip

compare profile with possible next

songs to play

next song to be played

Lastfm adapts to your music taste

Monday February 27 12

Google Adaptive Search

httpwwwgooglecomgoodtoknow

Monday February 27 12

Issues in User-Adaptive Systems

bull Overfitting ldquobubble effectsrdquo loss of serendipity problem

bull systems may adapt too strongly to the interestsbehavior

bull eg an adaptive radio station may always play the same or very similar songs

bull We search for the right balance between novelty and relevance for the user

bull ldquoLost in Hyperspacerdquo problem

bull when adapting the navigation ndash ie the links on which users can click to findaccess information

bull eg re-orderinghiding of menu items may lead to confusion

Monday February 27 12

httpwwwflickrcomphotosbellarosebyliz4729613108

What is good user modeling amp personalization

Monday February 27 12

Success perspectives

bull From the consumer perspective of an adaptive system

bull From the provider perspective of an adaptive system

Adaptive system maximizes satisfaction of the user

hard to measureobtain

Adaptive system maximizes the profit

influence of UM amp personalization may be hard to measureobtain

Monday February 27 12

Evaluation Strategiesbull User studies Clean-room study askobserve (selected) people

whether you did a good job

bull Log analysis Analyze (click) data and infer whether you did a good job eg cross-validation by ldquoLeave-one-outrdquo

bull Evaluation of user modeling

bull measure quality of profiles directly eg measure overlap with existing (true) profiles or let people judge the quality of the generated user profiles

bull measure quality of application that exploits the user profile eg apply user modeling strategies in a recommender system

Monday February 27 12

Possible Metricsbull The usual IR metrics

bull Precision fraction of retrieved items that are relevant

bull Recall fraction of relevant items that have been retrieved

bull F-Measure (harmonic) mean of precision and recall

bull Metrics for evaluating recommendation (rankings)

bull Mean Reciprocal Rank (MRR) of first relevant item

bull Successk probability that a relevant item occurs within the top k

bull If a true ranking is given rank correlations

bull Precisionk Recallk amp F-Measurek

bull Metrics for evaluating prediction of user preferences

bull MAE = Mean Absolute Error

bull TrueFalse PositivesNegatives

runs

performance strategy X baseline

Is strategy X better than the baseline

Monday February 27 12

Example Evaluationbull [Rae et al] shows a typical example of how to investigate and evaluate a

proposal for improving (tag) recommendations (using social networks)

bull Task test how well the different strategies (here different tag contexts) can be used for tag predictionrecommendation

bull Steps

1 Gather a dataset of tag data part of which can be used as input and aim to test the recommendation on the remaining tag data

2 Use the input data and calculate for the different strategies the predictions

3 Measure the performance using standard (IR) metrics Precision of the top 5 recommended tags (P5) Mean Reciprocal Rank (MRR) Mean Average Precision (MAP)

4 Test the results for statistical significance using Studentrsquos T-test relative to the baseline (eg existing approach competitive approach)

[Rae et al Improving Tag Recommendations Using Social Networks RIAOrsquo10]]

Monday February 27 12

Example Evaluationbull [Guy et al] shows another example of a similar evaluation

approach

bull Here the different strategies differ in the way people and tags are used in the strategies with these tag-based systems there are complex relationships between users tags and items and strategies aim to find the relevant aspects of these relationships for modeling and recommendation

bull Here their baseline is the strategy of the lsquomost popularrsquo tags this is a strategy often used to compare the globally most popular tags to the tags predicted by a particular personalization strategy thus investigating whether the personalization is worth the effort and is able to outperform the easily available baseline

[Guy et al Social Media Recommendation based on People and Tags SIGIRrsquo10]]

Monday February 27 12

user interactions (level amp type) instead of general social context - better for recommendations

does hybrid always work worseMonday February 27 12

Recommendation Systems

Predict items that are relevantusefulinteresting (and to what extent)

for given user (in a given context)

itrsquos often a ranking task

Monday February 27 12

Monday February 27 12

Monday February 27 12

Monday February 27 12

httpwwwwiredcommagazine201111mf_artsyall1Monday February 27 12

Collaborative Filtering

u1 likes u2

likes likes u1 likes Pulp Fiction

bull Memory-based User-Item matrix ratingspreferences of users =gt compute similarity between users amp recommend items of similar users

bull Model-based Item-Item matrix similarity (eg based on user ratings) between items =gt recommend items that are similar to the ones the user likes

bull Model-based Clustering cluster users according to their preferences =gt recommend items of users that belong to the same cluster

bull Model-based Bayesian networks P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B learn probabilities from user ratingspreferences

bull Others rule-based other data mining techniques

bull

Monday February 27 12

Memory vs Model-based

Memory-based:

• complete input data is required

• pre-computation not possible

• does not scale well ("tricks" are needed)

• high quality of recommendations

Model-based:

• abstraction (model) of input data

• pre-computation (partially) possible (model has to be re-built from time to time)

• scales better

• abstraction may reduce recommendation quality

Social Networks & Interest Similarity

• Collaborative filtering: 'neighborhoods' of people with similar interests & recommending items based on the likings in the neighborhood

• Limitations: next to 'cold start' and 'sparsity', the lack of control (over one's neighborhood) is also a problem, i.e. users cannot add 'trusted' people nor exclude 'strange' ones

• Therefore the interest in 'social recommenders', where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike)

• Does a social connection indicate user interest similarity?

• How much does users' interest similarity depend on the strength of their connection?

• Is it feasible to use a social network as a source of personalized recommendations?

[Lin & Brusilovsky, Social Networks and Interest Similarity: The Case of CiteULike, HT'10]

Conclusions

• Unilaterally connected pairs have more common information items, metadata and tags than non-connected pairs

• The similarity was largest for direct connections and decreased as the distance between users in the social network increased

• Users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship

• Traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems

• Item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation

Social recommendations on conflicting groups

Are all interests of our friends relevant? Is it application generic?

Content-based Recommendations

• Input: characteristics of items & interests of a user in characteristics of items => recommend items that feature characteristics which meet the user's interests

• Techniques:

  • Data mining methods: cluster items based on their characteristics => infer users' interests in clusters

  • IR methods: represent items & users as term vectors => compute similarity between the user profile vector and the items

  • Utility-based methods: utility function that gets an item as input; the parameters of the utility function are customized via the preferences of a user

Content Features

Item a (news article): "Government stops renovation of tower bridge", Oct 13th 2011

[Figure: Tower Bridge today, under construction]

Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames

Category: politics, england
Related Twiper news: bob: "Why do they stop to… [more]", mary: "London stops reno… [more]"

Concept vector a:

db:Politics      0.2
db:Sports        0
db:Education     0
db:London        0.2
db:Tower_Bridge  0.4
db:Government    0.1
db:UK            0.1

Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)
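The weighting strategy for content features (occurrence frequency, then 1-norm normalization) can be sketched as follows. The `db:` spelling of the concept labels and the raw mention counts are assumptions; the counts were chosen only so that the result matches the vector a shown on the slide.

```python
from collections import Counter

# Concept vocabulary from the slide (the "db:" prefix spelling is assumed).
CONCEPTS = [
    "db:Politics", "db:Sports", "db:Education", "db:London",
    "db:Tower_Bridge", "db:Government", "db:UK",
]


def concept_vector(mentions):
    """Count concept occurrences and scale them so the weights sum to 1."""
    counts = Counter(mentions)
    total = sum(counts[c] for c in CONCEPTS)
    if total == 0:
        return [0.0] * len(CONCEPTS)
    return [counts[c] / total for c in CONCEPTS]


# Hypothetical concept mentions extracted from the Tower Bridge article.
mentions = (
    ["db:Tower_Bridge"] * 4
    + ["db:Politics"] * 2
    + ["db:London"] * 2
    + ["db:Government", "db:UK"]
)

a = concept_vector(mentions)
print(dict(zip(CONCEPTS, a)))
```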

User Model

User's Twitter history:
- "RT Government stops renovation of tower bridge", Oct 13th 2011
- "I am in London at the moment", Oct 13th 2011
- "I am doing sports", Oct 12th 2011

User profile vector u:

db:Politics      0
db:Sports        0.1
db:Education     0
db:London        0.5
db:Tower_Bridge  0.2
db:Government    0.2
db:UK            0

Weighting strategy:
- occurrence frequency (e.g. smoothed by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)
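One possible way to smooth occurrence frequency by time is an exponential decay with a half-life, sketched below. The slide only says that recent concepts should count more, so the decay function, the `half_life_days` parameter and the concept assignments per tweet are all assumptions. The sketch does not attempt to reproduce the exact numbers of vector u.

```python
from datetime import date


def user_profile(observations, today, half_life_days=7.0):
    """Build a 1-norm normalized profile where older mentions weigh less.

    observations: iterable of (concept, date) pairs.
    A mention that is half_life_days old counts half as much as one from today.
    """
    weights = {}
    for concept, day in observations:
        age_days = (today - day).days
        decay = 0.5 ** (age_days / half_life_days)
        weights[concept] = weights.get(concept, 0.0) + decay
    total = sum(weights.values())
    if total == 0:
        return {}
    return {concept: w / total for concept, w in weights.items()}


# Hypothetical concepts extracted from the three tweets on the slide.
observations = [
    ("db:Government", date(2011, 10, 13)),
    ("db:Tower_Bridge", date(2011, 10, 13)),
    ("db:London", date(2011, 10, 13)),
    ("db:Sports", date(2011, 10, 12)),
]

profile = user_profile(observations, today=date(2011, 10, 13), half_life_days=1.0)
print(profile)
```

With a one-day half-life, the sports tweet from the previous day contributes half the weight of each concept mentioned today.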

Recommendations

Concept          user u   item a   item b   item c
db:Politics      0        0.2      0        0
db:Sports        0.1      0        0        0.5
db:Education     0        0        0        0.2
db:London        0.5      0.2      0.8      0
db:Tower_Bridge  0.2      0.4      0.2      0
db:Government    0.2      0.1      0        0
db:UK            0        0.1      0        0.3

Cosine similarities:

     a      b      c
u    0.67   0.92   0.14

Ranking of recommended items: 1. b  2. a  3. c
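The cosine similarities and the resulting ranking can be checked with a short script. The vectors are the ones from the slide, read in the concept order Politics, Sports, Education, London, Tower_Bridge, Government, UK. That order is an assumption about the slide layout, but it does reproduce the reported similarities.

```python
import math

# User profile and candidate items over the seven concepts from the slide.
u = [0.0, 0.1, 0.0, 0.5, 0.2, 0.2, 0.0]
items = {
    "a": [0.2, 0.0, 0.0, 0.2, 0.4, 0.1, 0.1],
    "b": [0.0, 0.0, 0.0, 0.8, 0.2, 0.0, 0.0],
    "c": [0.0, 0.5, 0.2, 0.0, 0.0, 0.0, 0.3],
}


def cosine(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    norm_y = math.sqrt(sum(yi * yi for yi in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)


scores = {name: cosine(u, vec) for name, vec in items.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(name, round(scores[name], 2))
```

Item b ranks first because it concentrates its weight on db:London and db:Tower_Bridge, the two concepts that also dominate the user profile.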

RecSys Problems

• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users

• Changing User Preferences: user interests may change over time

• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item

• Lack of Diversity / Overfitting: for many applications good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior etc.). When adapting too strongly to the preferences of a user, the user might see the same or similar recommendations again and again

• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people

• Research challenge: find the right balance between serendipity & personalization

• Research challenge: find the right way to use the influence of the recommendations on the user's behavior

What is true personalization?

Hands-on Teaser

• Build your own recommender system 101

• Recommend pages on delicious

• Recommend pages to your Facebook friends

image source: http://www.flickr.com/photos/bionicteaching/1375254387

Page 22: Lecture 4: Social Web Personalization (2012)

Observations

• Profile characteristics:

  • Semantic enrichment solves sparsity problems

  • Profiles change over time: fresh profiles better reflect current user demands

  • Temporal patterns: weekend profiles differ significantly from weekday profiles

• Impact on news recommendations:

  • The more fine-grained the concepts, the better the recommendation performance: entity-based > topic-based > hashtag-based

  • Semantic enrichment improves recommendation quality

  • Time-sensitivity (adapting to trends) improves performance

User Modeling

It is not about putting everything in a user profile, it is about making the right choices

User Adaptation

Knowing the user: this knowledge can be applied to adapt a system or interface to the user, to improve the system functionality and the user experience

User-Adaptive Systems

[Figure: observations, data and information about user -> user modeling -> user profile -> profile analysis -> adaptation decisions]

A. Jameson. Adaptive interfaces and agents. The HCI handbook: fundamentals, evolving technologies and emerging applications, pp. 305–330, 2003

Last.fm adapts to your music taste

[Figure: history of songs (like, ban, pause, skip) -> user modeling (infer current musical taste) -> user profile (interests in genres, artists, tags) -> compare profile with possible next songs to play -> next song to be played]

Google Adaptive Search

http://www.google.com/goodtoknow

Issues in User-Adaptive Systems

• Overfitting, "bubble effects", loss of serendipity problem:

  • systems may adapt too strongly to the interests/behavior

  • e.g. an adaptive radio station may always play the same or very similar songs

  • We search for the right balance between novelty and relevance for the user

• "Lost in Hyperspace" problem:

  • when adapting the navigation, i.e. the links on which users can click to find/access information

  • e.g. re-ordering/hiding of menu items may lead to confusion

What is good user modeling & personalization?

image source: http://www.flickr.com/photos/bellarosebyliz/4729613108

Success perspectives

• From the consumer perspective of an adaptive system: the adaptive system maximizes satisfaction of the user (hard to measure/obtain)

• From the provider perspective of an adaptive system: the adaptive system maximizes the profit (influence of UM & personalization may be hard to measure/obtain)

Evaluation Strategies

• User studies: clean-room study; ask/observe (selected) people whether you did a good job

• Log analysis: analyze (click) data and infer whether you did a good job, e.g. cross-validation by "leave-one-out"

• Evaluation of user modeling:

  • measure quality of profiles directly, e.g. measure overlap with existing (true) profiles, or let people judge the quality of the generated user profiles

  • measure quality of the application that exploits the user profile, e.g. apply user modeling strategies in a recommender system
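A leave-one-out evaluation on log data can be sketched as follows: each logged item is hidden in turn, the recommender sees the remaining data, and a hit is counted when the hidden item shows up in the top k. The popularity recommender, the function names and the toy logs are assumptions used only to make the sketch runnable.

```python
from collections import Counter


def popularity_recommender(train, user, k):
    """Hypothetical recommender: most frequent items the user has not seen."""
    seen = train.get(user, set())
    counts = Counter(item for items in train.values() for item in items)
    ranked = sorted(counts, key=lambda i: (-counts[i], i))
    return [item for item in ranked if item not in seen][:k]


def leave_one_out_hit_rate(logs, recommend, k=1):
    """Hide each (user, item) pair once and test whether it is recommended."""
    hits = trials = 0
    for user, items in logs.items():
        for held_out in sorted(items):
            # Copy the logs and remove only the held-out interaction.
            train = {u: set(its) for u, its in logs.items()}
            train[user].discard(held_out)
            trials += 1
            if held_out in recommend(train, user, k):
                hits += 1
    return hits / trials if trials else 0.0


# Hypothetical click logs: user -> set of visited pages.
logs = {
    "ann": {"p1", "p2", "p3"},
    "bob": {"p1", "p2"},
    "cat": {"p1", "p4"},
}

rate = leave_one_out_hit_rate(logs, popularity_recommender, k=1)
print(f"hit rate at k=1: {rate:.2f}")
```

The same loop works for any recommender with the `(train, user, k)` signature, so different strategies can be compared on identical held-out items.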

Possible Metrics

• The usual IR metrics:

  • Precision: fraction of retrieved items that are relevant

  • Recall: fraction of relevant items that have been retrieved

  • F-Measure: (harmonic) mean of precision and recall

• Metrics for evaluating recommendation (rankings):

  • Mean Reciprocal Rank (MRR) of the first relevant item

  • Success@k: probability that a relevant item occurs within the top k

  • If a true ranking is given: rank correlations

  • Precision@k, Recall@k & F-Measure@k

• Metrics for evaluating prediction of user preferences:

  • MAE = Mean Absolute Error

  • True/False Positives/Negatives

[Figure: performance of strategy X and of the baseline over several runs]

Is strategy X better than the baseline?
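Several of these metrics fit in a few lines each. The sketch below follows the standard definitions; the function names and the toy ranking are assumptions made up for the example.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall and their harmonic mean (F-measure)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranking[:k] if item in relevant) / k


def success_at_k(ranking, relevant, k):
    """1.0 if any relevant item occurs within the top k, else 0.0."""
    return 1.0 if any(item in relevant for item in ranking[:k]) else 0.0


def reciprocal_rank(ranking, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is ranked."""
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(rankings, relevants):
    """Average reciprocal rank over several users or queries."""
    pairs = list(zip(rankings, relevants))
    return sum(reciprocal_rank(r, rel) for r, rel in pairs) / len(pairs)


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and true ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical ranking and ground truth.
ranking = ["b", "a", "c", "d"]
relevant = {"a", "d"}
print(precision_recall_f1(ranking, relevant))
print(precision_at_k(ranking, relevant, 2), reciprocal_rank(ranking, relevant))
```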

Example Evaluation

• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)

• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation

• Steps:

  1. Gather a dataset of tag data; use part of it as input and test the recommendations on the remaining tag data

  2. Use the input data to calculate the predictions for the different strategies

  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)

  4. Test the results for statistical significance using Student's t-test relative to the baseline (e.g. existing approach, competitive approach)

[Rae et al., Improving Tag Recommendations Using Social Networks, RIAO'10]
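Steps 3 and 4 can be sketched with the standard library only. The code below computes average precision for one ranked list (MAP is its mean over users) and the paired t statistic between per-user scores of a strategy and a baseline. It is an illustration of the procedure, not the code of Rae et al.; the per-user scores are hypothetical, and the p-value lookup is left to a t-table or a statistics package.

```python
import math
from statistics import mean, stdev


def average_precision(ranking, relevant):
    """Mean of the precision values at each rank where a relevant tag occurs."""
    hits = 0
    total = 0.0
    for rank, tag in enumerate(ranking, start=1):
        if tag in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def paired_t_statistic(strategy_scores, baseline_scores):
    """t statistic of the per-user score differences (paired Student's t-test).

    Needs at least two users and non-identical differences, otherwise the
    sample standard deviation is undefined or zero.
    """
    diffs = [s - b for s, b in zip(strategy_scores, baseline_scores)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Hypothetical P@5 scores per user for strategy X and the baseline.
strategy = [0.6, 0.8, 0.4, 0.8]
baseline = [0.4, 0.4, 0.2, 0.6]

t = paired_t_statistic(strategy, baseline)
print(f"t = {t:.2f} with {len(strategy) - 1} degrees of freedom")
print(average_precision(["a", "x", "b"], {"a", "b"}))
```

A paired test is used because both strategies are scored on the same users, so the per-user differences are the quantity of interest.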



Page 24: Lecture 4: Social Web Personalization (2012)

User AdaptationKnowing the user - this knowledge - can be applied to adapt

a system or interface to the user to improve the system functionality and user experience

Monday February 27 12

Monday February 27 12

user modeling

user profile

observations data and information about user

profile analysis

adaptation decisions

User-Adaptive Systems

A Jameson Adaptive interfaces and agents The HCI handbook fundamentals evolving technologies and emerging applications pp 305ndash330 2003

Monday February 27 12

user modeling (infer current musical taste)

user profile interests in

genres artists tags

history of songs like ban pause skip

compare profile with possible next

songs to play

next song to be played

Lastfm adapts to your music taste

Monday February 27 12

Google Adaptive Search

httpwwwgooglecomgoodtoknow

Monday February 27 12

Issues in User-Adaptive Systems

bull Overfitting ldquobubble effectsrdquo loss of serendipity problem

bull systems may adapt too strongly to the interestsbehavior

bull eg an adaptive radio station may always play the same or very similar songs

bull We search for the right balance between novelty and relevance for the user

bull ldquoLost in Hyperspacerdquo problem

bull when adapting the navigation ndash ie the links on which users can click to findaccess information

bull eg re-orderinghiding of menu items may lead to confusion

Monday February 27 12

httpwwwflickrcomphotosbellarosebyliz4729613108

What is good user modeling amp personalization

Monday February 27 12

Success perspectives

bull From the consumer perspective of an adaptive system

bull From the provider perspective of an adaptive system

Adaptive system maximizes satisfaction of the user

hard to measureobtain

Adaptive system maximizes the profit

influence of UM amp personalization may be hard to measureobtain

Monday February 27 12

Evaluation Strategiesbull User studies Clean-room study askobserve (selected) people

whether you did a good job

bull Log analysis Analyze (click) data and infer whether you did a good job eg cross-validation by ldquoLeave-one-outrdquo

bull Evaluation of user modeling

bull measure quality of profiles directly eg measure overlap with existing (true) profiles or let people judge the quality of the generated user profiles

bull measure quality of application that exploits the user profile eg apply user modeling strategies in a recommender system

Monday February 27 12

Possible Metrics

• The usual IR metrics:
  • Precision: fraction of retrieved items that are relevant
  • Recall: fraction of relevant items that have been retrieved
  • F-Measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings):
  • Mean Reciprocal Rank (MRR) of the first relevant item
  • Success@k: probability that a relevant item occurs within the top k
  • If a true ranking is given: rank correlations
  • Precision@k, Recall@k & F-Measure@k
• Metrics for evaluating prediction of user preferences:
  • MAE = Mean Absolute Error
  • True/False Positives/Negatives

[Chart: performance of strategy X vs. baseline over runs]
Is strategy X better than the baseline?
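The metrics listed above are simple to compute once recommendations and relevance judgments are available. The sketch below uses the standard textbook definitions; the function names and the toy data are assumptions chosen for illustration.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall and their harmonic mean (F-measure)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(item in relevant for item in ranking[:k]) / k

def reciprocal_rank(ranking, relevant):
    """1 / rank of the first relevant item, or 0 if none is ranked."""
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(rankings, relevants):
    """Average reciprocal rank over several users or queries."""
    pairs = list(zip(rankings, relevants))
    return sum(reciprocal_rank(r, rel) for r, rel in pairs) / len(pairs)

def success_at_k(rankings, relevants, k):
    """Share of rankings with at least one relevant item in the top k."""
    pairs = list(zip(rankings, relevants))
    return sum(any(i in rel for i in r[:k]) for r, rel in pairs) / len(pairs)

def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Toy data: two users, each with a ranked list and a set of relevant items.
rankings = [["a", "b", "c"], ["x", "y", "z"]]
relevants = [{"b"}, {"z", "q"}]
mrr = mean_reciprocal_rank(rankings, relevants)
s2 = success_at_k(rankings, relevants, k=2)
prf = precision_recall_f1(["a", "b", "c"], {"b", "d"})
mae = mean_absolute_error([4, 3, 5], [5, 3, 3])
```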

Example Evaluation

• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation
• Steps:
  1. Gather a dataset of tag data, part of which is used as input; the recommendations are tested on the remaining tag data
  2. Use the input data to calculate the predictions for each of the strategies
  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
  4. Test the results for statistical significance using Student's t-test, relative to the baseline (e.g. existing approach, competitive approach)

[Rae et al.: Improving Tag Recommendations Using Social Networks. RIAO'10]
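Steps 3 and 4 can be sketched in a few lines. The code below is an illustration under assumptions, not the setup used by Rae et al.: it computes average precision for one ranked tag list (MAP is the mean of this value over all test cases) and a paired t statistic over per-test-case scores. The paired variant of the t-test is assumed here because both strategies are scored on the same test cases, and the score lists are made-up numbers.

```python
from math import sqrt
from statistics import mean, stdev

def average_precision(ranking, relevant):
    """Mean of precision@rank taken at each rank where a relevant item occurs."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def paired_t_statistic(scores_x, scores_baseline):
    """t statistic of the per-case score differences (n - 1 degrees of freedom).

    Raises ZeroDivisionError if all differences are identical.
    """
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical per-test-case P@5 scores for strategy X and the baseline.
strategy_x = [0.6, 0.8, 0.7, 0.9]
baseline = [0.5, 0.6, 0.6, 0.6]

ap = average_precision(["t1", "t2", "t3", "t4"], {"t1", "t3"})
t = paired_t_statistic(strategy_x, baseline)
print(ap, t)
```

The t statistic still has to be compared against the t distribution to obtain a p-value. If SciPy is available, `scipy.stats.ttest_rel` returns both the statistic and the p-value.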

Example Evaluation

• [Guy et al.] shows another example of a similar evaluation approach
• Here the strategies differ in the way people and tags are used: in tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the aspects of these relationships that are relevant for modeling and recommendation
• The baseline is the 'most popular' tags strategy: the globally most popular tags are compared to the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and is able to outperform the easily available baseline

[Guy et al.: Social Media Recommendation based on People and Tags. SIGIR'10]
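A 'most popular' baseline is cheap to build, which is what makes it a useful yardstick. The sketch below compares it with a deliberately simple personalized strategy (the user's own most frequent tags). Both the personalized strategy and the tag assignments are hypothetical and only serve to show the comparison; they are not the strategies evaluated by Guy et al.

```python
from collections import Counter

def top_tags(assignments, k, user=None):
    """Return the k most frequent tags, globally or for a single user."""
    counts = Counter(tag for u, tag in assignments if user is None or u == user)
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    return ranked[:k]

def precision_at_k(predicted, relevant, k):
    """Fraction of the top-k predicted tags that the user really used later."""
    return sum(tag in relevant for tag in predicted[:k]) / k

# Made-up (user, tag) training assignments and ann's held-out tags.
train = [
    ("ann", "python"), ("ann", "python"), ("ann", "web"),
    ("bob", "music"), ("bob", "music"), ("bob", "jazz"),
    ("cat", "music"), ("cat", "news"),
]
held_out = {"python", "web"}

baseline = top_tags(train, k=2)               # globally most popular tags
personal = top_tags(train, k=2, user="ann")   # ann's own most frequent tags

p_baseline = precision_at_k(baseline, held_out, k=2)
p_personal = precision_at_k(personal, held_out, k=2)
print(baseline, p_baseline, personal, p_personal)
```

A personalization strategy is only worth its extra cost if it beats this baseline consistently across users, not just for one hand-picked example like the one above.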

user interactions (level & type) instead of general social context - better for recommendations

does hybrid always work worse?

Recommendation Systems

Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)

it's often a ranking task

http://www.wired.com/magazine/2011/11/mf_artsy/all/1

Collaborative Filtering

[Figure: users u1 and u2 connected to items by "likes" edges, e.g. "u1 likes Pulp Fiction"]

• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques
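The memory-based variant can be sketched directly on a user-item rating dictionary. This is a minimal illustration: cosine similarity over rating vectors and a similarity-weighted sum are assumptions (other similarity measures and aggregation rules are common), and all ratings and titles except "Pulp Fiction" are made up.

```python
from math import sqrt

def cosine(r1, r2):
    """Cosine similarity of two sparse rating dicts (missing rating = 0)."""
    dot = sum(r1[i] * r2[i] for i in set(r1) & set(r2))
    n1 = sqrt(sum(v * v for v in r1.values()))
    n2 = sqrt(sum(v * v for v in r2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def recommend(ratings, user, k=2):
    """Score unseen items by the similarity-weighted ratings of other users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]

# Toy user-item matrix (hypothetical ratings on a 1-5 scale).
ratings = {
    "u1": {"Kill Bill": 5, "Reservoir Dogs": 4},
    "u2": {"Kill Bill": 5, "Reservoir Dogs": 5, "Pulp Fiction": 5},
    "u3": {"Titanic": 5, "Kill Bill": 1},
}
suggestions = recommend(ratings, "u1")
print(suggestions)
```

Because u2 rates the shared items much like u1 does, u2's extra item ranks first for u1. Note that every call walks over the complete rating data, which is the scaling issue of memory-based approaches.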

Memory vs Model-based

Memory-based:
• complete input data is required
• pre-computation not possible
• does not scale well ("tricks" are needed)
• high quality of recommendations

Model-based:
• abstraction (model) of input data
• pre-computation (partially) possible (model has to be re-built from time to time)
• scales better
• abstraction may reduce recommendation quality
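The pre-computation trade-off can be made concrete with an item-item model. In the sketch below, which is an illustration under assumptions rather than a reference implementation, `build_item_model` is the expensive offline step that has to be re-run from time to time, while `recommend_from_model` only looks up precomputed similarities. The function names and the toy ratings are hypothetical.

```python
from math import sqrt

def build_item_model(ratings):
    """Offline step: precompute cosine similarity for every ordered item pair."""
    by_item = {}
    for user, items in ratings.items():
        for item, rating in items.items():
            by_item.setdefault(item, {})[user] = rating
    norms = {i: sqrt(sum(v * v for v in users.values())) for i, users in by_item.items()}
    model = {}
    for a in by_item:
        for b in by_item:
            if a == b or not norms[a] or not norms[b]:
                continue
            common = set(by_item[a]) & set(by_item[b])
            dot = sum(by_item[a][u] * by_item[b][u] for u in common)
            model[(a, b)] = dot / (norms[a] * norms[b])
    return model

def recommend_from_model(model, liked, k=2):
    """Online step: rank unseen items by summed similarity to the liked items."""
    scores = {}
    for (a, b), sim in model.items():
        if a in liked and b not in liked:
            scores[b] = scores.get(b, 0.0) + sim
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]

# Toy binary "likes" data.
ratings = {
    "u1": {"A": 1, "B": 1},
    "u2": {"A": 1, "B": 1, "C": 1},
    "u3": {"C": 1, "D": 1},
}
model = build_item_model(ratings)
suggestions = recommend_from_model(model, liked={"A"})
print(suggestions)
```

New ratings are invisible to the recommender until the model is rebuilt, which is the "abstraction may reduce recommendation quality" side of the trade-off.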

Social Networks & Interest Similarity

• collaborative filtering: 'neighborhoods' of people with similar interests & recommending items based on likings in the neighborhood
• limitations: next to 'cold start' and 'sparsity', the lack of control (over one's neighborhood) is also a problem, i.e. one cannot add 'trusted' people nor exclude 'strange' ones
• therefore: interest in 'social recommenders', where the presence of social connections defines the similarity in interests (e.g. social tagging: CiteULike)
  • does a social connection indicate user interest similarity?
  • how much does users' interest similarity depend on the strength of their connection?
  • is it feasible to use a social network as a personalized recommendation?

[Lin & Brusilovsky: Social Networks and Interest Similarity: The Case of CiteULike. HT'10]

Conclusions

• unilaterally connected pairs have more information items, metadata, and tags in common than non-connected pairs
• the similarity was largest for direct connections and decreased as the distance between users in the social network increased
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation

Social recommendations on conflicting groups

are all interests of our friends relevant? Is it application generic?

Content-based Recommendations

• Input: characteristics of items & interests of a user in characteristics of items => Recommend items that feature characteristics which meet the user's interests
• Techniques:
  • Data mining methods: cluster items based on their characteristics => infer users' interests in clusters
  • IR methods: represent items & users as term vectors => compute similarity between the user profile vector and the items
  • Utility-based methods: a utility function gets an item as input; the parameters of the utility function are customized via the preferences of a user

Content Features

Example item: "Government stops renovation of tower bridge" (Oct 13th 2011)
  Tower Bridge today: Under construction
  Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames
  Category: politics, england
  Related Twiper news: bob: Why do they stop to… [more]; mary: London stops reno… [more]

Content vector a:
  db:Politics       0.2
  db:Sports         0
  db:Education      0
  db:London         0.2
  db:Tower_Bridge   0.4
  db:Government     0.1
  db:UK             0.1

Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)
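The weighting strategy on this slide (occurrence frequency, then 1-norm normalization) can be sketched as below. The concept counts are hypothetical: they are chosen so that the result matches the slide's vector a when its numbers are read in the listed concept order, and the `db:` prefix and the `concept_vector` name are assumptions as well.

```python
from collections import Counter

# Concept vocabulary in the order listed on the slide.
VOCABULARY = [
    "db:Politics", "db:Sports", "db:Education", "db:London",
    "db:Tower_Bridge", "db:Government", "db:UK",
]

def concept_vector(concepts, vocabulary=VOCABULARY):
    """Count concept occurrences and scale the counts so that they sum to 1."""
    counts = Counter(concepts)
    total = sum(counts[c] for c in vocabulary)
    return [counts[c] / total if total else 0.0 for c in vocabulary]

# Hypothetical concept mentions extracted from the news item.
mentions = (
    ["db:Tower_Bridge"] * 4 + ["db:Politics"] * 2 + ["db:London"] * 2
    + ["db:Government"] + ["db:UK"]
)
a = concept_vector(mentions)
print(a)
```

How the concepts are extracted from the text (for instance by an entity recognition service) is outside the scope of this sketch.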

User Model

User's Twitter history:
• RT: Government stops renovation of tower bridge (Oct 13th 2011)
• I am in London at the moment (Oct 13th 2011)
• I am doing sports (Oct 12th 2011)

User profile vector u:
  db:Politics       0
  db:Sports         0.1
  db:Education      0
  db:London         0.5
  db:Tower_Bridge   0.2
  db:Government     0.2
  db:UK             0

Weighting strategy:
- occurrence frequency (e.g. smoothed by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)
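One way to make recent concepts count more is to weight each occurrence by its age before normalizing. The slide does not say which smoothing function is meant, so the exponential decay with a half-life used below is an assumption, as are the function name, the concept assignments, and the one-day half-life.

```python
from datetime import date

def user_profile(observations, today, half_life_days=30.0):
    """Sum time-decayed concept occurrences, then 1-norm normalize.

    Each occurrence contributes 0.5 ** (age_in_days / half_life_days),
    so an occurrence that is one half-life old counts half as much.
    """
    weights = {}
    for concept, day in observations:
        age = (today - day).days
        weights[concept] = weights.get(concept, 0.0) + 0.5 ** (age / half_life_days)
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()} if total else {}

# Hypothetical concepts extracted from the three tweets on the slide.
observations = [
    ("db:Tower_Bridge", date(2011, 10, 13)),
    ("db:Government", date(2011, 10, 13)),
    ("db:London", date(2011, 10, 13)),
    ("db:Sports", date(2011, 10, 12)),
]
profile = user_profile(observations, today=date(2011, 10, 13), half_life_days=1.0)
print(profile)
```

With a one-day half-life, the sports tweet from the day before gets half the weight of each same-day concept. A longer half-life makes the profile more stable but slower to follow changing interests.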

Recommendations

                    user    candidate items
                    u       a       b       c
  db:Politics       0       0.2     0       0
  db:Sports         0.1     0       0       0.5
  db:Education      0       0       0       0.2
  db:London         0.5     0.2     0.8     0
  db:Tower_Bridge   0.2     0.4     0.2     0
  db:Government     0.2     0.1     0       0
  db:UK             0       0.1     0       0.3

cosine similarities
        a       b       c
  u     0.67    0.92    0.14

Ranking of recommended items: 1. b  2. a  3. c
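The ranking on this slide can be reproduced with a few lines of code. The sketch assumes that the flattened numbers are read in the listed concept order (Politics, Sports, Education, London, Tower_Bridge, Government, UK); with that reading, the computed cosine similarities match the values on the slide.

```python
from math import sqrt

def cosine(x, y):
    """Cosine similarity of two equally long dense vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = sqrt(sum(a * a for a in x))
    ny = sqrt(sum(b * b for b in y))
    return dot / (nx * ny) if nx and ny else 0.0

# Component order: Politics, Sports, Education, London, Tower_Bridge, Government, UK
u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}

scores = {name: cosine(u, vector) for name, vector in items.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(name, round(scores[name], 2))
```

Item b wins because it concentrates its weight on db:London, the user's strongest interest, even though item a overlaps with the profile in more concepts.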

RecSys Problems

• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users
• Changing User Preferences: user interests may change over time
• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item
• Lack of Diversity / Overfitting: for many applications good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same/similar recommendations again and again
• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user's behavior

What is true personalization?

Hands-on Teaser

• Build your own recommender system 101
• Recommend pages on delicious
• Recommend pages to your Facebook friends

image source: http://www.flickr.com/photos/bionicteaching/1375254387

Page 27: Lecture 4: Social Web Personalization (2012)

user modeling (infer current musical taste)

user profile interests in

genres artists tags

history of songs like ban pause skip

compare profile with possible next

songs to play

next song to be played

Lastfm adapts to your music taste

Monday February 27 12

Google Adaptive Search

httpwwwgooglecomgoodtoknow

Monday February 27 12

Issues in User-Adaptive Systems

• Overfitting, “bubble effects”, loss-of-serendipity problem:
  • systems may adapt too strongly to the interests/behavior
  • e.g. an adaptive radio station may always play the same or very similar songs
  • we search for the right balance between novelty and relevance for the user
• “Lost in Hyperspace” problem:
  • arises when adapting the navigation, i.e. the links on which users can click to find/access information
  • e.g. re-ordering/hiding of menu items may lead to confusion

What is good user modeling & personalization?

image source: http://www.flickr.com/photos/bellarosebyliz/4729613108

Success perspectives

• From the consumer perspective of an adaptive system: the adaptive system maximizes the satisfaction of the user (hard to measure/obtain)
• From the provider perspective of an adaptive system: the adaptive system maximizes the profit (the influence of UM & personalization may be hard to measure/obtain)

Evaluation Strategies

• User studies: clean-room study, ask/observe (selected) people whether you did a good job
• Log analysis: analyze (click) data and infer whether you did a good job, e.g. cross-validation by “leave-one-out”
• Evaluation of user modeling:
  • measure the quality of profiles directly, e.g. measure the overlap with existing (true) profiles, or let people judge the quality of the generated user profiles
  • measure the quality of an application that exploits the user profile, e.g. apply user modeling strategies in a recommender system
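The “leave-one-out” idea from log analysis can be sketched as follows. This is a minimal illustration, assuming each user's log is just a list of item ids; the function names, the toy `logs` data and the popularity-based recommender are hypothetical and not taken from the lecture.

```python
from collections import Counter

def leave_one_out_hit_rate(logs, recommend, k=5):
    """Hide one logged item at a time, recommend from the rest, count hits in the top k."""
    hits = trials = 0
    for user, items in logs.items():
        for i, held_out in enumerate(items):
            training = items[:i] + items[i + 1:]  # everything except the held-out item
            ranked = recommend(user, training, logs)[:k]
            hits += held_out in ranked
            trials += 1
    return hits / trials if trials else 0.0

def popular_recommender(user, training, logs):
    """Hypothetical recommender: items most frequent among other users, minus known items."""
    counts = Counter(item for other, items in logs.items() if other != user for item in items)
    return [item for item, _ in counts.most_common() if item not in training]

logs = {"ann": ["a", "b"], "bob": ["a", "c"], "eve": ["a", "b", "c"]}
rate = leave_one_out_hit_rate(logs, popular_recommender, k=1)
```

Any recommender with the same `(user, training, logs)` signature could be plugged in, so different user modeling strategies can be compared on the same log.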

Possible Metrics

• The usual IR metrics:
  • Precision: fraction of retrieved items that are relevant
  • Recall: fraction of relevant items that have been retrieved
  • F-Measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings):
  • Mean Reciprocal Rank (MRR) of the first relevant item
  • Success@k: probability that a relevant item occurs within the top k
  • If a true ranking is given: rank correlations
  • Precision@k, Recall@k & F-Measure@k
• Metrics for evaluating prediction of user preferences:
  • MAE = Mean Absolute Error
  • True/False Positives/Negatives

[Chart: performance per run, strategy X vs. baseline] Is strategy X better than the baseline?
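The metrics listed above can be written down in a few lines. The following is a sketch under simple assumptions: rankings are lists of item ids, relevance judgments are sets, and `runs` (a hypothetical name) holds one (ranking, relevant set) pair per user or query.

```python
def precision_recall_f1(retrieved, relevant):
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)  # relevant items that were retrieved
    p = tp / len(retrieved) if retrieved else 0.0
    r = tp / len(relevant) if relevant else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0  # harmonic mean of p and r
    return p, r, f1

def precision_at_k(ranking, relevant, k):
    return sum(item in relevant for item in ranking[:k]) / k

def reciprocal_rank(ranking, relevant):
    for pos, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / pos  # rank of the first relevant item
    return 0.0

def mean_reciprocal_rank(runs):
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)

def success_at_k(runs, k):
    # fraction of runs with at least one relevant item in the top k
    return sum(any(i in rel for i in r[:k]) for r, rel in runs) / len(runs)

def mean_absolute_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

runs = [(["a", "b", "c"], {"b"}), (["x", "y", "z"], {"x", "z"})]
```

For the two toy runs, the first relevant item sits at rank 2 and rank 1 respectively, so the MRR is the mean of 1/2 and 1.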

Example Evaluation

• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation
• Steps:
  1. Gather a dataset of tag data; use part of it as input and test the recommendations on the remaining tag data
  2. Use the input data to calculate the predictions for the different strategies
  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
  4. Test the results for statistical significance using Student’s t-test, relative to the baseline (e.g. existing approach, competitive approach)

[Rae et al.: Improving Tag Recommendations Using Social Networks, RIAO’10]
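Steps 3 and 4 can be illustrated with a small sketch. It assumes that every strategy yields one score per user (so the t-test is the paired variant) and that the differences are not all identical; the function names and the example scores are made up for illustration and are not results from Rae et al.

```python
import math
import statistics

def average_precision(ranking, relevant):
    """Mean of the precision values at each rank where a relevant item occurs."""
    hits, total = 0, 0.0
    for pos, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            total += hits / pos
    return total / len(relevant) if relevant else 0.0

def paired_t_statistic(scores_x, scores_baseline):
    """t statistic of a paired Student's t-test and its degrees of freedom (n - 1)."""
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    n = len(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return statistics.mean(diffs) / (sd / math.sqrt(n)), n - 1

# hypothetical per-user scores (e.g. average precision) for strategy X and the baseline
scores_x = [0.6, 0.7, 0.8, 0.9]
scores_baseline = [0.5, 0.5, 0.7, 0.6]
t, df = paired_t_statistic(scores_x, scores_baseline)
```

MAP is the mean of `average_precision` over all users. The resulting `t` is then compared with the critical value of the t distribution for `df` degrees of freedom at the chosen significance level.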

Example Evaluation

• [Guy et al.] shows another example of a similar evaluation approach
• Here, the strategies differ in the way people and tags are used: in these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the relevant aspects of these relationships for modeling and recommendation
• Their baseline is the ‘most popular’ tags strategy, a frequently used baseline: the globally most popular tags are compared to the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and is able to outperform the easily available baseline

[Guy et al.: Social Media Recommendation based on People and Tags, SIGIR’10]
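The ‘most popular’ baseline is cheap to build, which is why it is a natural yardstick. A minimal sketch, assuming tag assignments are available as (user, item, tag) triples; the data and the function name are illustrative only.

```python
from collections import Counter

def most_popular_tags(assignments, k=5):
    """Return the k globally most used tags, ignoring who used them and on what."""
    counts = Counter(tag for _, _, tag in assignments)
    return [tag for tag, _ in counts.most_common(k)]

assignments = [
    ("ann", "p1", "web"), ("bob", "p1", "web"), ("bob", "p2", "python"),
    ("eve", "p3", "web"), ("eve", "p3", "python"), ("ann", "p4", "music"),
]
baseline = most_popular_tags(assignments, k=2)
```

Every user gets the same list from this baseline, so a personalization strategy only earns its cost if it scores better on the chosen metrics.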

user interactions (level & type) instead of general social context - better for recommendations

does hybrid always work worse?

Recommendation Systems

Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)

it’s often a ranking task

http://www.wired.com/magazine/2011/11/mf_artsy/all/1

Collaborative Filtering

[Diagram: users u1 and u2 connected to items by “likes” edges; “u1 likes Pulp Fiction”]

• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn the probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques
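The memory-based variant can be sketched directly on a User-Item matrix. The example below assumes ratings are stored as sparse dictionaries and uses cosine similarity between users, which is one common choice and not necessarily the one intended on the slide; all titles other than Pulp Fiction, and all ratings, are invented.

```python
import math

def cosine(r1, r2):
    """Cosine similarity between two sparse rating dicts {item: rating}."""
    shared = set(r1) & set(r2)
    dot = sum(r1[i] * r2[i] for i in shared)
    n1 = math.sqrt(sum(v * v for v in r1.values()))
    n2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def recommend_user_based(ratings, user, top_n=3):
    """Score unseen items by the similarity-weighted ratings of similar users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue  # users without overlap contribute nothing
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

ratings = {
    "u1": {"Kill Bill": 5, "Reservoir Dogs": 4},
    "u2": {"Kill Bill": 5, "Reservoir Dogs": 5, "Pulp Fiction": 5},
    "u3": {"Titanic": 5, "Notting Hill": 4},
}
recommended = recommend_user_based(ratings, "u1")
```

Because u1 shares ratings only with u2, the only candidate is the item u2 liked and u1 has not seen yet.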

Memory vs Model-based

Memory-based:
• complete input data is required
• pre-computation not possible
• does not scale well (“tricks” are needed)
• high quality of recommendations

Model-based:
• abstraction (model) of input data
• pre-computation (partially) possible (model has to be re-built from time to time)
• scales better
• abstraction may reduce recommendation quality
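The pre-computation advantage of model-based approaches can be illustrated with a small item-item model. This sketch estimates P(likes B | likes A) from plain co-occurrence counts, which is a simplification and not a full Bayesian network; the function names and the toy `likes` data are assumptions.

```python
from collections import defaultdict
from itertools import permutations

def build_item_model(likes):
    """Pre-compute P(likes B | likes A) for all co-occurring item pairs.

    likes: {user: set_of_liked_items}
    """
    liked_by = defaultdict(int)  # number of users who like A
    both = defaultdict(int)      # number of users who like both A and B
    for items in likes.values():
        for a in items:
            liked_by[a] += 1
        for a, b in permutations(items, 2):
            both[(a, b)] += 1
    return {(a, b): n / liked_by[a] for (a, b), n in both.items()}

def recommend_from_model(model, liked, top_n=3):
    """Rank unseen items by their highest conditional probability given a liked item."""
    scores = defaultdict(float)
    for (a, b), p in model.items():
        if a in liked and b not in liked:
            scores[b] = max(scores[b], p)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

likes = {"u1": {"A", "B"}, "u2": {"A", "B", "C"}, "u3": {"A"}, "u4": {"B", "C"}}
model = build_item_model(likes)  # built offline, re-built from time to time
```

At recommendation time only `model` is consulted, so the raw user data is not needed online; the price is that the model goes stale until it is re-built.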

Social Networks amp Interest Similarity

• collaborative filtering: ‘neighborhoods’ of people with similar interests & recommending items based on likings in the neighborhood
• limitations: next to ‘cold start’ and ‘sparsity’, the lack of control (over one’s neighborhood) is also a problem, i.e. one cannot add ‘trusted’ people nor exclude ‘strange’ ones
• therefore: interest in ‘social recommenders’, where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike):
  • does a social connection indicate user interest similarity?
  • how much does users’ interest similarity depend on the strength of their connection?
  • is it feasible to use a social network as a personalized recommendation?

[Lin & Brusilovsky: Social Networks and Interest Similarity: The Case of CiteULike, HT’10]

Conclusions

• pairs that are unilaterally connected have more common information items, metadata and tags than non-connected pairs
• the similarity was largest for direct connections and decreased as the distance between users in the social network increased
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation
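The first conclusion compares the overlap of connected and non-connected pairs. A comparison of that kind could look like the sketch below, which assumes Jaccard overlap of bookmarked items as the similarity measure and treats connections as undirected; the study itself may have used other measures, and the data here is invented.

```python
from itertools import combinations

def jaccard(a, b):
    """Share of items two users have in common, relative to all items of both."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def mean_similarity_by_connection(bookmarks, connections):
    """Average item overlap for connected (True) vs. non-connected (False) user pairs.

    bookmarks: {user: set_of_items}; connections: set of frozenset({u, v}) pairs.
    """
    sims = {True: [], False: []}
    for u, v in combinations(sorted(bookmarks), 2):
        connected = frozenset((u, v)) in connections
        sims[connected].append(jaccard(bookmarks[u], bookmarks[v]))
    return {c: sum(s) / len(s) if s else 0.0 for c, s in sims.items()}

bookmarks = {"ann": {"p1", "p2", "p3"}, "bob": {"p2", "p3", "p4"}, "eve": {"p5"}}
connections = {frozenset(("ann", "bob"))}
result = mean_similarity_by_connection(bookmarks, connections)
```

Distinguishing unidirectional from reciprocal relationships, as in the third conclusion, would require storing directed connections instead of unordered pairs.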

Social recommendations on conflicting groups

are all interests of our friends relevant? Is it application generic?

Content-based Recommendations

• Input: characteristics of items & interests of a user in characteristics of items => recommend items that feature characteristics which meet the user’s interests
• Techniques:
  • Data mining methods: cluster items based on their characteristics => infer users’ interests in clusters
  • IR methods: represent items & users as term vectors => compute similarity between the user profile vector and items
  • Utility-based methods: utility function that gets an item as input; the parameters of the utility function are customized via the preferences of a user

Content Features

Item a: “Government stops renovation of tower bridge” (Oct 13th 2011)
Tower Bridge today: Under construction
Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames.
Category: politics, england
Related Twiper news: bob: “Why do they stop to…” [more]; mary: “London stops reno…” [more]

Content feature vector a:

db:Politics      0.2
db:Sports        0
db:Education     0
db:London        0.2
db:Tower_Bridge  0.4
db:Government    0.1
db:UK            0.1

Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)

User Model

User’s Twitter history:
• RT: Government stops renovation of tower bridge (Oct 13th 2011)
• I am in London at the moment (Oct 13th 2011)
• I am doing sports (Oct 12th 2011)

User profile vector u:

db:Politics      0
db:Sports        0.1
db:Education     0
db:London        0.5
db:Tower_Bridge  0.2
db:Government    0.2
db:UK            0

Weighting strategy:
- occurrence frequency (e.g. smoothed by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)
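The weighting strategy can be sketched as a small profile builder. It assumes the concepts have already been extracted from the tweets, and it uses an exponential half-life as one possible way to make recent concepts count more; the decay scheme, the function name and the observations are assumptions, and the sketch does not try to reproduce the exact weights of the slide.

```python
from collections import defaultdict

def build_profile(observations, now, half_life_days=None):
    """Turn (concept, day) observations into 1-norm normalized weights.

    With half_life_days=None every occurrence counts 1 (plain occurrence frequency);
    otherwise an occurrence that is half_life_days old counts 0.5 (assumed decay).
    """
    weights = defaultdict(float)
    for concept, day in observations:
        age = now - day
        weights[concept] += 1.0 if half_life_days is None else 0.5 ** (age / half_life_days)
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()} if total else {}

# hypothetical concept occurrences, with the day of the month as timestamp
observations = [
    ("db:London", 13), ("db:Tower_Bridge", 13), ("db:Government", 13),
    ("db:London", 13), ("db:Sports", 12),
]
plain = build_profile(observations, now=13)
decayed = build_profile(observations, now=13, half_life_days=1)
```

In the decayed profile the one-day-old sports concept receives a smaller share than in the plain frequency profile, while both profiles still sum to 1.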

Recommendations

                 user   candidate items
                 u      a      b      c
db:Politics      0      0.2    0      0
db:Sports        0.1    0      0      0.5
db:Education     0      0      0      0.2
db:London        0.5    0.2    0.8    0
db:Tower_Bridge  0.2    0.4    0.2    0
db:Government    0.2    0.1    0      0
db:UK            0      0.1    0      0.3

Cosine similarities:

     a      b      c
u    0.67   0.92   0.14

Ranking of recommended items: 1. b, 2. a, 3. c
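The ranking can be re-computed from the vectors. The sketch below reads the slide's weights as decimal fractions in the concept order given by `CONCEPTS`; the variable and function names are illustrative.

```python
import math

# order of the vector components
CONCEPTS = ["db:Politics", "db:Sports", "db:Education", "db:London",
            "db:Tower_Bridge", "db:Government", "db:UK"]

u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]  # user profile
items = {                          # candidate items
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}

def cosine(x, y):
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm = math.sqrt(sum(xi * xi for xi in x)) * math.sqrt(sum(yi * yi for yi in y))
    return dot / norm if norm else 0.0

similarities = {name: round(cosine(u, vec), 2) for name, vec in items.items()}
ranking = sorted(similarities, key=similarities.get, reverse=True)
# similarities -> {'a': 0.67, 'b': 0.92, 'c': 0.14}; ranking -> ['b', 'a', 'c']
```

Item b wins because almost all of its weight lies on db:London, the strongest concept in the user profile, whereas item c shares only the weak db:Sports concept with the user.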

RecSys Problems

• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users
• Changing User Preferences: user interests may change over time
• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item
• Lack of Diversity / Overfitting: for many applications, good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same/similar recommendations again and again
• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user’s behavior
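One way to trade relevance against novelty is a greedy re-ranking that penalizes items similar to those already selected. The sketch below follows the spirit of maximal marginal relevance, a technique that is not named in the lecture; the `trade_off` parameter, the genre-based similarity and the songs are assumptions made for illustration.

```python
def rerank_with_diversity(candidates, relevance, similarity, trade_off=0.7, top_n=3):
    """Greedily pick items that are relevant but dissimilar to those already picked.

    relevance: {item: score}; similarity(a, b) returns a value in [0, 1];
    trade_off=1.0 means pure relevance, lower values favour novelty.
    """
    selected, remaining = [], list(candidates)
    while remaining and len(selected) < top_n:
        def score(item):
            # redundancy = similarity to the most similar item picked so far
            redundancy = max((similarity(item, s) for s in selected), default=0.0)
            return trade_off * relevance[item] - (1 - trade_off) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

genre = {"song1": "rock", "song2": "rock", "song3": "jazz"}
relevance = {"song1": 0.9, "song2": 0.85, "song3": 0.6}

def same_genre(a, b):
    return 1.0 if genre[a] == genre[b] else 0.0

pure = rerank_with_diversity(genre, relevance, same_genre, trade_off=1.0, top_n=2)
diverse = rerank_with_diversity(genre, relevance, same_genre, trade_off=0.7, top_n=2)
```

With pure relevance both rock songs are returned; with the penalty the second slot goes to the less relevant but different jazz song.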

What is true personalization?

Hands-on Teaser

• Build your own recommender system 101
• Recommend pages on delicious
• Recommend pages to your Facebook friends

image source: http://www.flickr.com/photos/bionicteaching/1375254387

Page 30: Lecture 4: Social Web Personalization (2012)

httpwwwflickrcomphotosbellarosebyliz4729613108

What is good user modeling amp personalization

Monday February 27 12

Success perspectives

• From the consumer perspective of an adaptive system: the adaptive system maximizes the satisfaction of the user (hard to measure/obtain)

• From the provider perspective of an adaptive system: the adaptive system maximizes the profit (the influence of UM & personalization may be hard to measure/obtain)

Evaluation Strategies

• User studies: clean-room study, ask/observe (selected) people whether you did a good job

• Log analysis: analyze (click) data and infer whether you did a good job, e.g. cross-validation by “leave-one-out”

• Evaluation of user modeling:

  • measure the quality of profiles directly, e.g. measure the overlap with existing (true) profiles, or let people judge the quality of the generated user profiles

  • measure the quality of the application that exploits the user profile, e.g. apply the user modeling strategies in a recommender system
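A minimal sketch of the “leave-one-out” idea from the log-analysis bullet: hide one logged item per user, recommend from the remaining data, and check whether the hidden item comes back in the top k. The toy log, the function names and the popularity-based recommender are assumptions made for illustration; they are not taken from the lecture.

```python
from collections import Counter

def leave_one_out_hit_rate(histories, recommend, k=3):
    """Hide each logged item in turn and count how often it is recommended back."""
    hits, trials = 0, 0
    for user, items in histories.items():
        for held_out in items:
            # Training data: the full log minus the single held-out interaction.
            training = {
                u: [i for i in its if not (u == user and i == held_out)]
                for u, its in histories.items()
            }
            hits += held_out in recommend(user, training, k)
            trials += 1
    return hits / trials if trials else 0.0

def most_popular(user, training, k):
    """Toy recommender (assumption): most frequent items the user has not seen."""
    seen = set(training[user])
    counts = Counter(i for its in training.values() for i in its if i not in seen)
    return [item for item, _ in counts.most_common(k)]

# Hypothetical click log: user -> items the user interacted with.
histories = {
    "u1": ["a", "b", "c"],
    "u2": ["a", "b"],
    "u3": ["a", "d"],
}
hit_rate = leave_one_out_hit_rate(histories, most_popular, k=2)
print(f"leave-one-out hit rate: {hit_rate:.2f}")
```

Any recommender with the same `(user, training, k)` signature can be plugged in, which makes it easy to compare strategies on the same log.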

Possible Metrics

• The usual IR metrics:

  • Precision: fraction of retrieved items that are relevant

  • Recall: fraction of relevant items that have been retrieved

  • F-Measure: (harmonic) mean of precision and recall

• Metrics for evaluating recommendation (rankings):

  • Mean Reciprocal Rank (MRR) of the first relevant item

  • Success@k: probability that a relevant item occurs within the top k

  • If a true ranking is given: rank correlations

  • Precision@k, Recall@k & F-Measure@k

• Metrics for evaluating the prediction of user preferences:

  • MAE = Mean Absolute Error

  • True/False Positives/Negatives

[Figure: performance of strategy X vs. the baseline over a number of runs]

Is strategy X better than the baseline?
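The metrics listed on this slide can be sketched in a few lines of Python. The function names and the toy inputs are illustrative assumptions; the definitions follow the standard IR formulations.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall and their harmonic mean (F-measure)."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)  # true positives
    precision = tp / len(retrieved) if retrieved else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(item in relevant for item in ranking[:k]) / k

def reciprocal_rank(ranking, relevant):
    """1 / rank of the first relevant item, 0 if none is ranked."""
    for pos, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / pos
    return 0.0

def mean_reciprocal_rank(rankings, relevants):
    """Average reciprocal rank over several users/queries."""
    rr = [reciprocal_rank(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(rr) / len(rr) if rr else 0.0

def success_at_k(rankings, relevants, k):
    """Share of users/queries with at least one relevant item in the top k."""
    hits = [any(i in rel for i in r[:k]) for r, rel in zip(rankings, relevants)]
    return sum(hits) / len(hits) if hits else 0.0

def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Toy data (hypothetical).
print(precision_recall_f1(["a", "b", "c", "d"], ["a", "c", "e"]))
print(mean_reciprocal_rank([["x", "a"], ["b", "y"]], [{"a"}, {"b"}]))
print(mean_absolute_error([4, 3], [5, 1]))
```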

Example Evaluation

• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)

• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation

• Steps:

  1. Gather a dataset of tag data, part of which can be used as input, and aim to test the recommendation on the remaining tag data

  2. Use the input data and calculate the predictions for the different strategies

  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)

  4. Test the results for statistical significance using Student’s t-test, relative to the baseline (e.g. existing approach, competitive approach)

[Rae et al.: Improving Tag Recommendations Using Social Networks, RIAO’10]
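Steps 3 and 4 can be sketched as follows. This is an illustration under stated assumptions, not the procedure of Rae et al.: it assumes one score per user for both the strategy and the baseline, so a paired t statistic applies, and all scores are made up. Only the t statistic is computed; turning it into a p-value needs a t-distribution table or a statistics library.

```python
import math
from statistics import mean, stdev

def average_precision(ranking, relevant):
    """Mean of the precision values at the rank of each relevant item."""
    hits, total = 0, 0.0
    for pos, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            total += hits / pos
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(rankings, relevants):
    """MAP: average precision averaged over all users."""
    aps = [average_precision(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(aps) / len(aps) if aps else 0.0

def paired_t_statistic(scores_x, scores_baseline):
    """Paired Student's t statistic on per-user score differences."""
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    sd = stdev(diffs)  # sample standard deviation, needs at least 2 users
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(len(diffs)))

# Hypothetical per-user P@5 scores for strategy X and the baseline.
strategy_x = [0.5, 0.6, 0.7]
baseline = [0.4, 0.4, 0.4]
t = paired_t_statistic(strategy_x, baseline)
print(f"t = {t:.3f} with {len(strategy_x) - 1} degrees of freedom")
```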

Example Evaluation

• [Guy et al.] shows another example of a similar evaluation approach

• Here the strategies differ in the way people and tags are used. In these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the aspects of these relationships that are relevant for modeling and recommendation

• Their baseline is the ‘most popular’ tags strategy, which is often used as a baseline: the globally most popular tags are compared to the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and is able to outperform the easily available baseline

[Guy et al.: Social Media Recommendation based on People and Tags, SIGIR’10]
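The ‘most popular’ baseline is simple enough to sketch. The tag assignments, the held-out tags and the toy personalized strategy below are hypothetical and only show how a personalized strategy would be compared against the baseline; they do not reproduce the strategies of Guy et al.

```python
from collections import Counter

def most_popular_tags(assignments, k):
    """Baseline: the k globally most frequent tags, identical for every user."""
    counts = Counter(tag for _user, _item, tag in assignments)
    return [tag for tag, _ in counts.most_common(k)]

def own_frequent_tags(assignments, user, k):
    """Toy personalized strategy (assumption): the user's own most used tags."""
    counts = Counter(tag for u, _item, tag in assignments if u == user)
    return [tag for tag, _ in counts.most_common(k)]

def precision_at_k(predicted, relevant, k):
    """Fraction of the top-k predicted tags that are relevant."""
    return sum(tag in relevant for tag in predicted[:k]) / k

# Hypothetical (user, item, tag) assignments used as input data.
assignments = [
    ("bob", "p1", "london"), ("bob", "p2", "london"), ("bob", "p3", "bridge"),
    ("mary", "p1", "news"), ("mary", "p4", "news"), ("mary", "p5", "news"),
    ("ann", "p6", "news"), ("ann", "p7", "london"),
]
held_out = {"london", "bridge"}  # tags bob gives to a held-out page

baseline_p2 = precision_at_k(most_popular_tags(assignments, 2), held_out, 2)
personal_p2 = precision_at_k(own_frequent_tags(assignments, "bob", 2), held_out, 2)
print(f"baseline P@2 = {baseline_p2}, personalized P@2 = {personal_p2}")
```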

user interactions (level & type) instead of general social context: better for recommendations

does hybrid always work worse?

Recommendation Systems

Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)

it’s often a ranking task

http://www.wired.com/magazine/2011/11/mf_artsy/all/1

Collaborative Filtering

[Figure: ‘likes’ relations between users u1, u2 and items; does u1 like Pulp Fiction?]

• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users

• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes

• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster

• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn the probabilities from user ratings/preferences

• Others: rule-based, other data mining techniques
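A minimal sketch of the memory-based (User-Item matrix) variant: compute the similarity between users from their ratings and recommend unseen items of similar users. The ratings, the movie titles other than Pulp Fiction, and the choice of cosine similarity with a similarity-weighted sum are assumptions for illustration; other similarity measures and aggregation schemes are common.

```python
import math

def cosine(r1, r2):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    dot = sum(r1[i] * r2[i] for i in set(r1) & set(r2))
    n1 = math.sqrt(sum(v * v for v in r1.values()))
    n2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def recommend(user, ratings, k=2):
    """Score unseen items by the similarity-weighted ratings of other users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue  # users without overlap do not contribute
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical User-Item matrix (user -> {item: rating}).
ratings = {
    "u1": {"Kill Bill": 5, "Jackie Brown": 4},
    "u2": {"Kill Bill": 5, "Jackie Brown": 4, "Pulp Fiction": 5},
    "u3": {"Titanic": 5, "The Notebook": 4},
}
print(recommend("u1", ratings))
```

Because the similarities are computed from the full rating data at request time, this also shows why the memory-based approach does not pre-compute well.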

Memory vs. Model-based

Memory-based:

• complete input data is required

• pre-computation not possible

• does not scale well (“tricks” are needed)

• high quality of recommendations

Model-based:

• abstraction (model) of input data

• pre-computation (partially) possible (model has to be re-built from time to time)

• scales better

• abstraction may reduce recommendation quality

Social Networks & Interest Similarity

• collaborative filtering: ‘neighborhoods’ of people with similar interests & recommending items based on likings in the neighborhood

• limitations: next to ‘cold start’ and ‘sparsity’, the lack of control (over one’s neighborhood) is also a problem, i.e. one cannot add ‘trusted’ people nor exclude ‘strange’ ones

• therefore: interest in ‘social recommenders’, where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike)

  • does a social connection indicate user interest similarity?

  • how much does users’ interest similarity depend on the strength of their connection?

  • is it feasible to use a social network as a personalized recommendation?

[Lin & Brusilovsky: Social Networks and Interest Similarity: The Case of CiteULike, HT’10]

Conclusions

• unilaterally connected pairs have more common information items, metadata and tags than non-connected pairs

• the similarity was largest for direct connections and decreased as the distance between users in the social network increased

• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship

• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems

• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation

Social recommendations on conflicting groups

are all interests of our friends relevant? Is it application generic?

Content-based Recommendations

• Input: characteristics of items & interests of a user in characteristics of items => recommend items that feature characteristics which meet the user’s interests

• Techniques:

  • Data mining methods: cluster items based on their characteristics => infer users’ interests in clusters

  • IR methods: represent items & users as term vectors => compute similarity between the user profile vector and items

  • Utility-based methods: a utility function gets an item as input; the parameters of the utility function are customized via the preferences of a user

Content Features

Example item:

  Government stops renovation of tower bridge (Oct 13th 2011)
  Tower Bridge today: Under construction
  Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames
  Category: politics, england
  Related Twiper news:
    bob: Why do they stop to… [more]
    mary: London stops reno… [more]

Item vector a:

  concept           a
  db:Politics       0.2
  db:Sports         0
  db:Education      0
  db:London         0.2
  db:Tower_Bridge   0.4
  db:Government     0.1
  db:UK             0.1

Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)
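The weighting strategy for content features (occurrence frequency, then 1-norm normalization) can be sketched as below. How concepts are extracted from the text is not covered here, and the occurrence counts are hypothetical: they were chosen so that the normalized weights match the item vector a of this example (0.2, 0, 0, 0.2, 0.4, 0.1, 0.1).

```python
from collections import Counter

def concept_vector(concepts):
    """Weight each concept by its occurrence frequency, normalized to sum 1."""
    counts = Counter(concepts)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}

# Hypothetical list of concepts detected in the example article.
detected = (
    ["db:Tower_Bridge"] * 4
    + ["db:Politics"] * 2
    + ["db:London"] * 2
    + ["db:Government", "db:UK"]
)
a = concept_vector(detected)
for concept, weight in sorted(a.items()):
    print(f"{concept:16s} {weight:.1f}")
```

Concepts that never occur (here db:Sports and db:Education) are simply absent from the dict, which corresponds to a weight of 0.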

User Model

User’s Twitter history:

  RT: Government stops renovation of tower bridge (Oct 13th 2011)
  I am in London at the moment (Oct 13th 2011)
  I am doing sports (Oct 12th 2011)

User vector u:

  concept           u
  db:Politics       0
  db:Sports         0.1
  db:Education      0
  db:London         0.5
  db:Tower_Bridge   0.2
  db:Government     0.2
  db:UK             0

Weighting strategy:
- occurrence frequency (e.g. smoothened by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)

Recommendations

                    user   candidate items
  concept           u      a      b      c
  db:Politics       0      0.2    0      0
  db:Sports         0.1    0      0      0.5
  db:Education      0      0      0      0.2
  db:London         0.5    0.2    0.8    0
  db:Tower_Bridge   0.2    0.4    0.2    0
  db:Government     0.2    0.1    0      0
  db:UK             0      0.1    0      0.3

cosine similarities:

        a      b      c
  u     0.67   0.92   0.14

Ranking of recommended items: 1. b, 2. a, 3. c
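The cosine similarities of this example can be recomputed with a short script. The vectors are transcribed from the slide in the concept order db:Politics, db:Sports, db:Education, db:London, db:Tower_Bridge, db:Government, db:UK; the function and variable names are illustrative.

```python
import math

def cosine(x, y):
    """Cosine similarity of two equally long dense vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    nx = math.sqrt(sum(p * p for p in x))
    ny = math.sqrt(sum(q * q for q in y))
    return dot / (nx * ny) if nx and ny else 0.0

# User profile and candidate items, as read from the slide.
u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}

sims = {name: round(cosine(u, vec), 2) for name, vec in items.items()}
ranking = sorted(sims, key=sims.get, reverse=True)
print(sims)     # {'a': 0.67, 'b': 0.92, 'c': 0.14}
print(ranking)  # ['b', 'a', 'c']
```

Item b ranks first because almost all of its weight lies on db:London, which is also the strongest concept in the user profile.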

RecSys Problems

• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users

• Changing User Preferences: user interests may change over time

• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item

• Lack of Diversity / Overfitting: for many applications good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same/similar recommendations again and again

• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people

• Research challenge: find the right balance between serendipity & personalization

• Research challenge: find the right way to use the influence of the recommendations on the user’s behavior

What is true personalization?

Hands-on Teaser

• Build your own recommender system 101

• Recommend pages on delicious

• Recommend pages to your Facebook friends

image source: http://www.flickr.com/photos/bionicteaching/1375254387



Page 33: Lecture 4: Social Web Personalization (2012)

Possible Metricsbull The usual IR metrics

bull Precision fraction of retrieved items that are relevant

bull Recall fraction of relevant items that have been retrieved

bull F-Measure (harmonic) mean of precision and recall

bull Metrics for evaluating recommendation (rankings)

bull Mean Reciprocal Rank (MRR) of first relevant item

bull Successk probability that a relevant item occurs within the top k

bull If a true ranking is given rank correlations

bull Precisionk Recallk amp F-Measurek

bull Metrics for evaluating prediction of user preferences

bull MAE = Mean Absolute Error

bull TrueFalse PositivesNegatives

runs

performance strategy X baseline

Is strategy X better than the baseline

Monday February 27 12

Example Evaluationbull [Rae et al] shows a typical example of how to investigate and evaluate a

proposal for improving (tag) recommendations (using social networks)

bull Task test how well the different strategies (here different tag contexts) can be used for tag predictionrecommendation

bull Steps

1 Gather a dataset of tag data part of which can be used as input and aim to test the recommendation on the remaining tag data

2 Use the input data and calculate for the different strategies the predictions

3 Measure the performance using standard (IR) metrics Precision of the top 5 recommended tags (P5) Mean Reciprocal Rank (MRR) Mean Average Precision (MAP)

4 Test the results for statistical significance using Studentrsquos T-test relative to the baseline (eg existing approach competitive approach)

[Rae et al Improving Tag Recommendations Using Social Networks RIAOrsquo10]]

Monday February 27 12

Example Evaluation

• [Guy et al.] shows another example of a similar evaluation approach

• Here the strategies differ in the way people and tags are used: in these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the relevant aspects of these relationships for modeling and recommendation

• Their baseline is the 'most popular' tags strategy: the globally most popular tags are often compared to the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and is able to outperform the easily available baseline

[Guy et al.: Social Media Recommendation based on People and Tags, SIGIR'10]

user interactions (level & type) instead of general social context - better for recommendations

does hybrid always work worse?
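
A 'most popular' baseline of the kind described above could look like the sketch below. It is an assumption-laden illustration, not the setup of [Guy et al.]: the tagging data, the held-out tags and the "personalized" output are all hypothetical.

```python
from collections import Counter


def most_popular_tags(taggings, k):
    """Baseline: the k tags used most often across all users."""
    counts = Counter(tag for _user, tag in taggings)
    return [tag for tag, _count in counts.most_common(k)]


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended tags that are relevant."""
    return sum(1 for tag in recommended[:k] if tag in relevant) / k


# Hypothetical (user, tag) assignments used as input data.
taggings = [
    ("u1", "web"), ("u2", "web"), ("u3", "web"),
    ("u1", "python"), ("u2", "python"),
    ("u3", "music"),
]

# Hypothetical held-out tags of user u1 and output of a personalized strategy.
held_out_u1 = {"python", "recsys"}
personalized_u1 = ["python", "recsys"]

baseline = most_popular_tags(taggings, 2)
baseline_p2 = precision_at_k(baseline, held_out_u1, 2)
personalized_p2 = precision_at_k(personalized_u1, held_out_u1, 2)
```

The personalization is only worth the effort if its score is reliably higher than the baseline's across many users, which is what the significance test checks.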

Recommendation Systems

Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)

it's often a ranking task

http://www.wired.com/magazine/2011/11/mf_artsy/all/1

Collaborative Filtering

[Figure: graph of users u1 and u2 with 'likes' edges to items, including Pulp Fiction]

• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users

• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes

• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster

• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences

• Others: rule-based, other data mining techniques
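
The memory-based variant (user-item matrix, similarity between users) can be sketched as below. This is a minimal sketch under assumptions: the ratings are toy data, cosine similarity is one possible similarity measure among several, and candidate items are scored by a similarity-weighted sum of the other users' ratings.

```python
from math import sqrt


def cosine(r1, r2):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    shared = set(r1) & set(r2)
    dot = sum(r1[i] * r2[i] for i in shared)
    norm1 = sqrt(sum(v * v for v in r1.values()))
    norm2 = sqrt(sum(v * v for v in r2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0


def recommend(user, ratings):
    """Rank items the user has not rated by similarity-weighted ratings."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical user-item matrix (only "Pulp Fiction" appears on the slide).
ratings = {
    "u1": {"Pulp Fiction": 5, "Kill Bill": 4},
    "u2": {"Pulp Fiction": 5, "Kill Bill": 5, "Jackie Brown": 4},
    "u3": {"Titanic": 5, "Kill Bill": 1},
}

recommendations = recommend("u1", ratings)
```

Here u2 rates the shared items much like u1, so u2's unseen item is ranked above the item coming from the dissimilar user u3.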

Memory vs. Model-based

Memory-based:

• complete input data is required

• pre-computation not possible

• does not scale well ("tricks" are needed)

• high quality of recommendations

Model-based:

• abstraction (model) of input data

• pre-computation (partially) possible (model has to be re-built from time to time)

• scales better

• abstraction may reduce recommendation quality
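
To illustrate why pre-computation is possible in the model-based case, the sketch below builds an item-item similarity model once and then answers recommendation requests from that model alone. The data and function names are assumptions made here; similarity is taken as the cosine over the sets of users who like each item.

```python
from math import sqrt


def build_item_model(likes):
    """Pre-compute item-item similarities from user -> set of liked items."""
    users_of = {}
    for user, items in likes.items():
        for item in items:
            users_of.setdefault(item, set()).add(user)
    model = {}
    for i in users_of:
        model[i] = {}
        for j in users_of:
            if i != j:
                overlap = len(users_of[i] & users_of[j])
                model[i][j] = overlap / sqrt(len(users_of[i]) * len(users_of[j]))
    return model


def recommend(user, likes, model):
    """Score unseen items by their summed similarity to the user's liked items."""
    scores = {}
    for liked in likes[user]:
        for other, sim in model[liked].items():
            if other not in likes[user] and sim > 0:
                scores[other] = scores.get(other, 0.0) + sim
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical preference data.
likes = {"u1": {"A", "B"}, "u2": {"A", "B", "C"}, "u3": {"C"}}

model = build_item_model(likes)  # built offline, re-built from time to time
for_u1 = recommend("u1", likes, model)
for_u3 = recommend("u3", likes, model)
```

The model goes stale as new preferences arrive, which is the reason it has to be re-built from time to time.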

Social Networks & Interest Similarity

• collaborative filtering: 'neighborhoods' of people with similar interests & recommending items based on likings in the neighborhood

• limitations: next to 'cold start' and 'sparsity', the lack of control (over one's neighborhood) is also a problem, i.e. cannot add 'trusted' people nor exclude 'strange' ones

• therefore: interest in 'social recommenders', where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike)

• does a social connection indicate user interest similarity?

• how much does users' interest similarity depend on the strength of their connection?

• is it feasible to use a social network as a personalized recommendation?

[Lin & Brusilovsky: Social Networks and Interest Similarity: The Case of CiteULike, HT'10]

Conclusions

• unilaterally connected pairs have more common information items, metadata and tags than non-connected pairs

• the similarity was largest for direct connections and decreased with increasing distance between users in the social network

• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship

• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems

• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation
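
One way to compare the interest similarity of connected and non-connected pairs is sketched below. It is not the method of [Lin & Brusilovsky]: Jaccard overlap of item collections is an assumed similarity measure, and the libraries and connections are toy data that say nothing about the reported results.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap of two item collections: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mean_similarity(pairs, libraries):
    """Average Jaccard similarity over a list of user pairs."""
    values = [jaccard(libraries[x], libraries[y]) for x, y in pairs]
    return sum(values) / len(values) if values else 0.0


# Hypothetical bookmark collections and social connections.
libraries = {
    "ann": {"p1", "p2", "p3"},
    "bob": {"p2", "p3", "p4"},
    "cat": {"p9"},
}
connected = [("ann", "bob")]

all_pairs = list(combinations(sorted(libraries), 2))
non_connected = [p for p in all_pairs if p not in connected]

sim_connected = mean_similarity(connected, libraries)
sim_non_connected = mean_similarity(non_connected, libraries)
```

The same comparison can be repeated for metadata and tags, or broken down by network distance and by reciprocal versus unidirectional connections.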

Social recommendations on conflicting groups

are all interests of our friends relevant? Is it application generic?

Content-based Recommendations

• Input: characteristics of items & interests of a user in characteristics of items => Recommend items that feature characteristics which meet the user's interests

• Techniques:

• Data mining methods: Cluster items based on their characteristics => Infer users' interests in clusters

• IR methods: Represent items & users as term vectors => Compute similarity between user profile vector and items

• Utility-based methods: Utility function that gets an item as input; the parameters of the utility function are customized via preferences of a user

Content Features

Government stops renovation of tower bridge (Oct 13th 2011)

Tower Bridge today: Under construction

Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames

Category: politics, england

Related Twiper news:
bob: Why do they stop to… [more]
mary: London stops reno… [more]

Concept vector a of the article:

db:Politics      0.2
db:Sports        0
db:Education     0
db:London        0.2
db:Tower_Bridge  0.4
db:Government    0.1
db:UK            0.1

Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)
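
The weighting strategy (occurrence frequency, then 1-norm normalization) can be sketched as follows. The concept mentions below are hypothetical counts chosen so that the result matches the slide's vector a; how concepts are actually extracted from the article is not shown here.

```python
from collections import Counter

# Concept vocabulary in the order used on the slide.
VOCABULARY = [
    "db:Politics", "db:Sports", "db:Education", "db:London",
    "db:Tower_Bridge", "db:Government", "db:UK",
]


def concept_vector(mentions, vocabulary):
    """Count concept occurrences and scale them so the vector sums to 1."""
    counts = Counter(mentions)
    total = sum(counts[c] for c in vocabulary)
    return [counts[c] / total if total else 0.0 for c in vocabulary]


# Hypothetical concept mentions extracted from the news article.
mentions = (
    ["db:Politics"] * 2 + ["db:London"] * 2 + ["db:Tower_Bridge"] * 4
    + ["db:Government"] + ["db:UK"]
)

a = concept_vector(mentions, VOCABULARY)
```

Normalizing by the 1-norm makes long and short documents comparable, since only the relative frequency of each concept remains.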

User Model

User's Twitter history:

RT Government stops renovation of tower bridge (Oct 13th 2011)

I am in London at the moment (Oct 13th 2011)

I am doing sports (Oct 12th 2011)

Concept vector u of the user:

db:Politics      0
db:Sports        0.1
db:Education     0
db:London        0.5
db:Tower_Bridge  0.2
db:Government    0.2
db:UK            0

Weighting strategy:
- occurrence frequency (e.g. smoothened by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)
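
Smoothing by occurrence time can be done in many ways; the sketch below assumes an exponential decay with a configurable half-life, which is a choice made here and not stated on the slide. The occurrences are toy data, given as (concept, age in days) pairs.

```python
def recency_weighted_profile(occurrences, half_life_days):
    """Weight each concept occurrence by 0.5 ** (age / half-life), then 1-normalize."""
    weights = {}
    for concept, age_days in occurrences:
        decay = 0.5 ** (age_days / half_life_days)
        weights[concept] = weights.get(concept, 0.0) + decay
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()} if total else {}


# Hypothetical occurrences: London mentioned today, sports one day ago.
occurrences = [("db:London", 0), ("db:Sports", 1)]

profile = recency_weighted_profile(occurrences, half_life_days=1.0)
```

A shorter half-life makes the profile follow changing interests more quickly, at the cost of forgetting stable long-term interests.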

Recommendations

User vector u and candidate items a, b, c:

                 u     a     b     c
db:Politics      0     0.2   0     0
db:Sports        0.1   0     0     0.5
db:Education     0     0     0     0.2
db:London        0.5   0.2   0.8   0
db:Tower_Bridge  0.2   0.4   0.2   0
db:Government    0.2   0.1   0     0
db:UK            0     0.1   0     0.3

Cosine similarities:

     a     b     c
u    0.67  0.92  0.14

Ranking of recommended items: 1. b  2. a  3. c
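
The cosine similarities and the ranking can be recomputed with a short sketch. The vectors are the ones read from the slide, in the concept order db:Politics, db:Sports, db:Education, db:London, db:Tower_Bridge, db:Government, db:UK; the function and variable names are assumptions made here.

```python
from math import sqrt


def cosine_similarity(x, y):
    """Dot product of x and y divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = sqrt(sum(a * a for a in x))
    norm_y = sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0


# User profile and candidate item vectors from the slide.
u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}

similarities = {name: cosine_similarity(u, vec) for name, vec in items.items()}

# Recommend the items most similar to the user profile first.
ranking = sorted(similarities, key=similarities.get, reverse=True)
```

Item b ranks first because it shares the user's two heaviest concepts (db:London and db:Tower_Bridge), while item c overlaps with the profile only on db:Sports.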

RecSys Problems

• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users

• Changing User Preferences: user interests may change over time

• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item

• Lack of Diversity / Overfitting: for many applications good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same/similar recommendations again and again

• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people

• Research challenge: find the right balance between serendipity & personalization

• Research challenge: find the right way to use the influence of the recommendations on the user's behavior

What is true personalization?

Hands-on Teaser

• Build your own recommender system 101

• Recommend pages on delicious

• Recommend pages to your Facebook friends

image source: http://www.flickr.com/photos/bionicteaching/1375254387




Page 37: Lecture 4: Social Web Personalization (2012)

Recommendation Systems

Predict items that are relevantusefulinteresting (and to what extent)

for given user (in a given context)

itrsquos often a ranking task

Monday February 27 12

Monday February 27 12

Monday February 27 12

Monday February 27 12

httpwwwwiredcommagazine201111mf_artsyall1Monday February 27 12

Collaborative Filtering

u1 likes u2

likes likes u1 likes Pulp Fiction

bull Memory-based User-Item matrix ratingspreferences of users =gt compute similarity between users amp recommend items of similar users

bull Model-based Item-Item matrix similarity (eg based on user ratings) between items =gt recommend items that are similar to the ones the user likes

bull Model-based Clustering cluster users according to their preferences =gt recommend items of users that belong to the same cluster

bull Model-based Bayesian networks P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B learn probabilities from user ratingspreferences

bull Others rule-based other data mining techniques

bull

Monday February 27 12

Memory vs Model-based

bull complete input data is required

bull pre-computation not possible

bull does not scale well (ldquotricksrdquo are needed)

bull high quality of recommendations

bull abstraction (model) of input data

bull pre-computation (partially) possible (model has to be re-built from time to time)

bull scales better

bull abstraction may reduce recommendation quality

Monday February 27 12

Social Networks amp Interest Similarity

bull collaborative filtering lsquoneighborhoodsrsquo of people with similar interest amp recommending items based on likings in neighborhood

bull limitations next to lsquocold startrsquo and lsquosparsityrsquo the lack of control (over onersquos neighborhood) is also a problem ie cannot add lsquotrustedrsquo people nor exclude lsquostrangersquo ones

bull therefore interest in lsquosocial recommendersrsquo where presence of social connections defines the similarity in interests (eg social tagging CiteULike)

bull does a social connection indicate user interest similarity

bull how much users interest similarity depends on the strength of their connection

bull is it feasible to use a social network as a personalized recommendation

[Lin amp Brusilovsky Social Networks and Interest Similarity The Case of CiteULike HTrsquo10]Monday February 27 12

Conclusions

• unilaterally connected pairs have more information items, metadata, and tags in common than non-connected pairs

• the similarity was largest for direct connections and decreased as the distance between users in the social network increased

• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship

• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems

• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation

Social recommendations on conflicting groups

Are all interests of our friends relevant? Is it application-generic?

Content-based Recommendations

• Input: characteristics of items & the user's interests in those characteristics => recommend items whose characteristics match the user's interests

• Techniques:

  • Data mining methods: cluster items based on their characteristics => infer the user's interests in the clusters

  • IR methods: represent items & users as term vectors => compute the similarity between the user profile vector and the items

  • Utility-based methods: a utility function takes an item as input; the parameters of the utility function are customized via the preferences of a user
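The utility-based idea can be sketched as a weighted sum. This is an illustration under assumptions: the characteristic names, their values (higher means better), the preference weights, and the linear form of the function are all hypothetical and do not come from the slides.

```python
from typing import Dict

# Hypothetical item characteristics, scaled to [0, 1] (higher is better).
items: Dict[str, Dict[str, float]] = {
    "article_1": {"recency": 0.9, "local_relevance": 0.4, "depth": 0.7},
    "article_2": {"recency": 0.5, "local_relevance": 0.9, "depth": 0.8},
}


def utility(item: Dict[str, float], preferences: Dict[str, float]) -> float:
    """Weighted sum of item characteristics; the weights are the user's preferences."""
    return sum(preferences.get(name, 0.0) * value for name, value in item.items())


# Hypothetical user who cares mostly about local relevance.
user_prefs = {"recency": 0.2, "local_relevance": 0.6, "depth": 0.2}

ranked = sorted(items, key=lambda name: utility(items[name], user_prefs), reverse=True)
print(ranked)
```

Changing `user_prefs` changes the ranking without touching the items, which is the sense in which the function's parameters are customized per user.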

Content Features

Example item a (news article): "Government stops renovation of tower bridge" (Oct 13th 2011)

Tower Bridge today: Under construction

Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames.

Category: politics, england

Related Twiper news: bob: "Why do they stop to… [more]"; mary: "London stops reno… [more]"

Concept vector a:

dbPolitics       0.2
dbSports         0
dbEducation      0
dbLondon         0.2
dbTower_Bridge   0.4
dbGovernment     0.1
dbUK             0.1

Weighting strategy:

• occurrence frequency

• normalize vectors (1-norm: the sum of the vector equals 1)
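The weighting strategy (occurrence frequency followed by 1-norm normalization) can be sketched as follows. The raw concept counts are an assumption: the slides do not give them, and the counts here are chosen only so that the result matches the weights 0.2, 0.2, 0.4, 0.1, 0.1 of item a. The function name `concept_vector` is hypothetical.

```python
from collections import Counter
from typing import Dict, List


def concept_vector(concepts: List[str]) -> Dict[str, float]:
    """Count concept occurrences and scale so the weights sum to 1 (1-norm)."""
    counts = Counter(concepts)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {concept: n / total for concept, n in counts.items()}


# Assumed concept mentions extracted from the article (10 in total).
mentions = (["dbTower_Bridge"] * 4 + ["dbPolitics"] * 2 + ["dbLondon"] * 2
            + ["dbGovernment"] + ["dbUK"])

a = concept_vector(mentions)
print(a)
```

Concepts that never occur (dbSports and dbEducation for this item) are simply absent from the dictionary, which corresponds to a weight of 0.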

User Model

User's Twitter history:

• "RT Government stops renovation of tower bridge" (Oct 13th 2011)

• "I am in London at the moment" (Oct 13th 2011)

• "I am doing sports" (Oct 12th 2011)

User profile vector u:

dbPolitics       0
dbSports         0.1
dbEducation      0
dbLondon         0.5
dbTower_Bridge   0.2
dbGovernment     0.2
dbUK             0

Weighting strategy:

• occurrence frequency (e.g. smoothed by occurrence time: recent concepts are more important)

• normalize vectors (1-norm: the sum of the vector equals 1)
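One possible way to make recent concepts count more is exponential decay before normalization. This is a sketch under assumptions: the slides only say that frequency may be smoothed by occurrence time, so the decay function, the `half_life_days` parameter, and the concept-to-tweet mapping below are hypothetical, and the output is not meant to reproduce the slide's numbers.

```python
from datetime import date
from typing import Dict, List, Tuple


def user_profile(events: List[Tuple[str, date]], today: date,
                 half_life_days: float = 7.0) -> Dict[str, float]:
    """Sum exponentially decayed concept occurrences, then 1-norm normalize."""
    weights: Dict[str, float] = {}
    for concept, day in events:
        age_days = (today - day).days
        # An occurrence loses half of its weight every `half_life_days`.
        decayed = 0.5 ** (age_days / half_life_days)
        weights[concept] = weights.get(concept, 0.0) + decayed
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()} if total else {}


# Assumed concept occurrences derived from the three tweets.
events = [
    ("dbGovernment", date(2011, 10, 13)),
    ("dbTower_Bridge", date(2011, 10, 13)),
    ("dbLondon", date(2011, 10, 13)),
    ("dbSports", date(2011, 10, 12)),
]

u = user_profile(events, today=date(2011, 10, 13), half_life_days=1.0)
print(u)
```

With a one-day half-life, the sports tweet from the previous day contributes half as much as each concept mentioned today, so dbSports ends up with the smallest weight.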

Recommendations

Concept          user u   item a   item b   item c
dbPolitics       0        0.2      0        0
dbSports         0.1      0        0        0.5
dbEducation      0        0        0        0.2
dbLondon         0.5      0.2      0.8      0
dbTower_Bridge   0.2      0.4      0.2      0
dbGovernment     0.2      0.1      0        0
dbUK             0        0.1      0        0.3

Cosine similarities: sim(u, a) = 0.67, sim(u, b) = 0.92, sim(u, c) = 0.14

Ranking of recommended items: 1. b  2. a  3. c
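The cosine similarities and the resulting ranking can be re-computed with a short sketch. It assumes the concept order dbPolitics, dbSports, dbEducation, dbLondon, dbTower_Bridge, dbGovernment, dbUK and reads the slide's weights as decimals between 0 and 1; the variable and function names are hypothetical.

```python
from math import sqrt
from typing import Dict, List

# Assumed concept order: dbPolitics, dbSports, dbEducation, dbLondon,
# dbTower_Bridge, dbGovernment, dbUK.
u: List[float] = [0.0, 0.1, 0.0, 0.5, 0.2, 0.2, 0.0]
candidates: Dict[str, List[float]] = {
    "a": [0.2, 0.0, 0.0, 0.2, 0.4, 0.1, 0.1],
    "b": [0.0, 0.0, 0.0, 0.8, 0.2, 0.0, 0.0],
    "c": [0.0, 0.5, 0.2, 0.0, 0.0, 0.0, 0.3],
}


def cosine(x: List[float], y: List[float]) -> float:
    """Cosine of the angle between two equally long vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    norm = sqrt(sum(p * p for p in x)) * sqrt(sum(q * q for q in y))
    return dot / norm if norm else 0.0


similarities = {name: cosine(u, vec) for name, vec in candidates.items()}
ranking = sorted(similarities, key=similarities.get, reverse=True)

print({name: round(sim, 2) for name, sim in similarities.items()})
print(ranking)
```

Rounded to two decimals this gives 0.67 for a, 0.92 for b, and 0.14 for c, hence the ranking b, a, c. Item b wins because its weight is concentrated on dbLondon and dbTower_Bridge, the user's two strongest concepts.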

RecSys Problems

• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users

• Changing User Preferences: user interests may change over time

• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item

• Lack of Diversity / Overfitting: for many applications, good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same or similar recommendations again and again

• Use the right context: users do lots of things that might not be relevant for their user model, e.g. try out things, do stuff for other people

• Research challenge: find the right balance between serendipity & personalization

• Research challenge: find the right way to use the influence of the recommendations on the user's behavior
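One way to picture the lack-of-diversity problem is a greedy re-ranking that trades relevance against redundancy, in the spirit of maximal marginal relevance (MMR). This technique is not taken from the slides; the item names, the vectors, and the `trade_off` parameter are hypothetical and serve only as an illustration.

```python
from math import sqrt
from typing import Dict, List


def cosine(x: List[float], y: List[float]) -> float:
    """Cosine of the angle between two equally long vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    norm = sqrt(sum(p * p for p in x)) * sqrt(sum(q * q for q in y))
    return dot / norm if norm else 0.0


def rerank_with_diversity(user: List[float], items: Dict[str, List[float]],
                          k: int, trade_off: float = 0.7) -> List[str]:
    """Greedily pick items that are relevant but not redundant with earlier picks."""
    selected: List[str] = []
    remaining = dict(items)
    while remaining and len(selected) < k:
        def score(name: str) -> float:
            relevance = cosine(user, remaining[name])
            redundancy = max((cosine(remaining[name], items[s]) for s in selected),
                             default=0.0)
            return trade_off * relevance - (1 - trade_off) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected


# Hypothetical two-concept space: [london, sports].
user = [0.8, 0.6]
items = {
    "london_news_1": [1.0, 0.0],
    "london_news_2": [1.0, 0.0],  # near-duplicate of the first item
    "sports_news": [0.0, 1.0],
}

diverse = rerank_with_diversity(user, items, k=2)
relevance_only = rerank_with_diversity(user, items, k=2, trade_off=1.0)
print(diverse, relevance_only)
```

With `trade_off=1.0` the two near-identical London items fill the list; with the default of 0.7 the second slot goes to the less similar sports item. Where to set the balance is exactly the open question raised in the research-challenge bullets.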

What is true personalization?

Hands-on Teaser

• Build your own recommender system 101

• Recommend pages on Delicious

• Recommend pages to your Facebook friends

image source: http://www.flickr.com/photos/bionicteaching/1375254387

Page 42: Lecture 4: Social Web Personalization (2012)

Collaborative Filtering

u1 likes u2

likes likes u1 likes Pulp Fiction

bull Memory-based User-Item matrix ratingspreferences of users =gt compute similarity between users amp recommend items of similar users

bull Model-based Item-Item matrix similarity (eg based on user ratings) between items =gt recommend items that are similar to the ones the user likes

bull Model-based Clustering cluster users according to their preferences =gt recommend items of users that belong to the same cluster

bull Model-based Bayesian networks P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B learn probabilities from user ratingspreferences

bull Others rule-based other data mining techniques

bull

Monday February 27 12

Memory vs Model-based

bull complete input data is required

bull pre-computation not possible

bull does not scale well (ldquotricksrdquo are needed)

bull high quality of recommendations

bull abstraction (model) of input data

bull pre-computation (partially) possible (model has to be re-built from time to time)

bull scales better

bull abstraction may reduce recommendation quality

Monday February 27 12

Social Networks amp Interest Similarity

bull collaborative filtering lsquoneighborhoodsrsquo of people with similar interest amp recommending items based on likings in neighborhood

bull limitations next to lsquocold startrsquo and lsquosparsityrsquo the lack of control (over onersquos neighborhood) is also a problem ie cannot add lsquotrustedrsquo people nor exclude lsquostrangersquo ones

bull therefore interest in lsquosocial recommendersrsquo where presence of social connections defines the similarity in interests (eg social tagging CiteULike)

bull does a social connection indicate user interest similarity

bull how much users interest similarity depends on the strength of their connection

bull is it feasible to use a social network as a personalized recommendation

[Lin amp Brusilovsky Social Networks and Interest Similarity The Case of CiteULike HTrsquo10]Monday February 27 12

Conclusionsbull pairs unilaterally connected have more common information

items metadata and tags than non-connected pairs

bull the similarity was largest for direct connections and decreased with the increase of distance between users in the social networks

bull users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship

bull traditional item-level similarity may be less reliable way to find similar users in social bookmarking systems

bull items collections of peers connected by self-defined social connections could be a useful source for cross-recommendation

Monday February 27 12

Social recommendations on conflicting groups

are all interests of our friends relevant Is it application generic

Monday February 27 12

Content-based Recommendations

• Input: characteristics of items & interests of a user in characteristics of items => Recommend items that feature characteristics which meet the user’s interests

• Techniques:

  • Data mining methods: cluster items based on their characteristics => infer users’ interests in clusters

  • IR methods: represent items & users as term vectors => compute similarity between the user profile vector and the items

  • Utility-based methods: a utility function takes an item as input; the parameters of the utility function are customized via the preferences of a user
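A minimal sketch of the utility-based idea, assuming a simple weighted sum as the utility function (the slides do not prescribe a particular form). The feature names, weights, and items below are hypothetical.

```python
def make_utility(preferences):
    """Build a utility function whose parameters are a user's preference weights."""
    def utility(item):
        # Assumed form: weighted sum over the features the user cares about.
        return sum(weight * item.get(feature, 0.0)
                   for feature, weight in preferences.items())
    return utility


# Hypothetical preferences: this user dislikes high prices and likes high ratings.
prefs = {"price": -0.5, "rating": 1.0}

# Hypothetical candidate items described by their characteristics.
items = {
    "x": {"price": 2.0, "rating": 4.0},
    "y": {"price": 1.0, "rating": 3.0},
}

utility = make_utility(prefs)
best = max(items, key=lambda name: utility(items[name]))
print(best, utility(items[best]))
```

Changing the weights in `prefs` changes the ranking without touching the items, which is the point of customizing the function per user.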

Content Features

News item: “Government stops renovation of tower bridge” (Oct 13th 2011)

Tower Bridge today: Under construction

Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames.

Category: politics, england. Related Twiper news: bob: “Why do they stop to… [more]”, mary: “London stops reno… [more]”

Concept            a
db:Politics        0.2
db:Sports          0
db:Education       0
db:London          0.2
db:Tower_Bridge    0.4
db:Government      0.1
db:UK              0.1

Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)
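The weighting strategy can be sketched as follows. The concept order follows the slide; the list of concept mentions is a hypothetical extraction result, chosen only so that the resulting vector matches the item vector a (0.2, 0, 0, 0.2, 0.4, 0.1, 0.1). How concepts are actually extracted from the text is not covered here.

```python
from collections import Counter

CONCEPTS = ["db:Politics", "db:Sports", "db:Education", "db:London",
            "db:Tower_Bridge", "db:Government", "db:UK"]


def concept_vector(mentions, concepts=CONCEPTS):
    """Occurrence-frequency vector, normalized so that its entries sum to 1."""
    counts = Counter(m for m in mentions if m in concepts)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(concepts)
    return [counts[c] / total for c in concepts]


# Hypothetical mentions found in the news item (10 in total).
mentions = (["db:Politics"] * 2 + ["db:London"] * 2 +
            ["db:Tower_Bridge"] * 4 + ["db:Government", "db:UK"])

a = concept_vector(mentions)
print(a)
```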

User Model

User’s Twitter history:

“RT Government stops renovation of tower bridge” (Oct 13th 2011)

“I am in London at the moment” (Oct 13th 2011)

“I am doing sports” (Oct 12th 2011)

Concept            u
db:Politics        0
db:Sports          0.1
db:Education       0
db:London          0.5
db:Tower_Bridge    0.2
db:Government      0.2
db:UK              0

Weighting strategy:
- occurrence frequency (e.g. smoothed by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)
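One way to make recent concepts count more is an exponential decay on the age of each observation. This is a sketch under assumptions: the decay form, the `half_life_days` parameter, and the observation list are hypothetical and not taken from the slides, which only state that recent concepts should weigh more.

```python
from datetime import date

CONCEPTS = ["db:Politics", "db:Sports", "db:Education", "db:London",
            "db:Tower_Bridge", "db:Government", "db:UK"]


def decayed_profile(observations, concepts, today, half_life_days=30.0):
    """Time-smoothed occurrence frequencies, normalized to sum to 1."""
    weights = {c: 0.0 for c in concepts}
    for concept, day in observations:
        if concept not in weights:
            continue  # ignore concepts outside the vector space
        age_in_days = (today - day).days
        # Assumed smoothing: an observation loses half its weight every half-life.
        weights[concept] += 0.5 ** (age_in_days / half_life_days)
    total = sum(weights.values())
    if total == 0:
        return [0.0] * len(concepts)
    return [weights[c] / total for c in concepts]


# Hypothetical concept observations taken from two of the user's tweets.
OBSERVATIONS = [
    ("db:London", date(2011, 10, 13)),
    ("db:Sports", date(2011, 10, 12)),
]

profile = decayed_profile(OBSERVATIONS, CONCEPTS,
                          today=date(2011, 10, 13), half_life_days=1.0)
print(dict(zip(CONCEPTS, profile)))
```

With a one-day half-life the concept seen today gets twice the weight of the one seen yesterday; a longer half-life flattens that difference.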

Recommendations

Concept            u     a     b     c
db:Politics        0     0.2   0     0
db:Sports          0.1   0     0     0.5
db:Education       0     0     0     0.2
db:London          0.5   0.2   0.8   0
db:Tower_Bridge    0.2   0.4   0.2   0
db:Government      0.2   0.1   0     0
db:UK              0     0.1   0     0.3

(u = user; a, b, c = candidate items)

Cosine similarities:

       a      b      c
u      0.67   0.92   0.14

Ranking of recommended items: 1. b, 2. a, 3. c
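The similarity values and the ranking can be recomputed from the vectors of the example. The sketch below uses the concept order db:Politics, db:Sports, db:Education, db:London, db:Tower_Bridge, db:Government, db:UK; the function and variable names are illustrative.

```python
import math


def cosine(x, y):
    """Cosine similarity of two equally long vectors (0.0 if either is all zeros)."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    norm_y = math.sqrt(sum(yi * yi for yi in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)


# User profile and candidate items from the example.
u = [0.0, 0.1, 0.0, 0.5, 0.2, 0.2, 0.0]
items = {
    "a": [0.2, 0.0, 0.0, 0.2, 0.4, 0.1, 0.1],
    "b": [0.0, 0.0, 0.0, 0.8, 0.2, 0.0, 0.0],
    "c": [0.0, 0.5, 0.2, 0.0, 0.0, 0.0, 0.3],
}

scores = {name: cosine(u, vector) for name, vector in items.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(name, round(scores[name], 2))
```

Item b ranks first because almost all of its weight sits on db:London and db:Tower_Bridge, the two strongest concepts in the user profile.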

RecSys Problems

• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users

• Changing User Preferences: user interests may change over time

• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item

• Lack of Diversity / Overfitting: for many applications, good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same or similar recommendations again and again

• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people

• Research challenge: find the right balance between serendipity & personalization

• Research challenge: find the right way to use the influence of the recommendations on the user’s behavior

What is true personalization?

Hands-on Teaser

• Build your own recommender system 101

• Recommend pages on Delicious

• Recommend pages to your Facebook friends

image source: http://www.flickr.com/photos/bionicteaching/1375254387

Memory vs. Model-based

Memory-based:

• complete input data is required

• pre-computation not possible

• does not scale well (“tricks” are needed)

• high quality of recommendations

Model-based:

• abstraction (model) of input data

• pre-computation (partially) possible (model has to be re-built from time to time)

• scales better

• abstraction may reduce recommendation quality
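The contrast can be made concrete with a small sketch. It rests on assumptions: the ratings are hypothetical toy data, the memory-based variant is a plain user-neighborhood scan, and a pre-computed item co-occurrence table stands in for the "model". Real systems would use more elaborate models.

```python
import math
from collections import defaultdict

# Hypothetical ratings: user -> {item: rating}.
RATINGS = {
    "ann": {"i1": 5, "i2": 4},
    "bob": {"i1": 5, "i2": 5, "i3": 4},
    "eve": {"i3": 5, "i4": 4},
}


def cosine(r1, r2):
    """Cosine similarity of two sparse rating dictionaries."""
    dot = sum(r1[k] * r2[k] for k in set(r1) & set(r2))
    n1 = math.sqrt(sum(v * v for v in r1.values()))
    n2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0


def memory_based(user, ratings):
    """Query time: scan the complete rating data for similar users."""
    scores = defaultdict(float)
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], theirs)
        for item, rating in theirs.items():
            if item not in ratings[user]:
                scores[item] += sim * rating
    return sorted(scores, key=scores.get, reverse=True)


def build_model(ratings):
    """Offline: item -> {other item: number of users who rated both}."""
    model = defaultdict(lambda: defaultdict(int))
    for rated in ratings.values():
        for i in rated:
            for j in rated:
                if i != j:
                    model[i][j] += 1
    return model


def model_based(user, ratings, model):
    """Query time: look only at the user's own items and the pre-built model."""
    scores = defaultdict(float)
    for i in ratings[user]:
        for j, count in model[i].items():
            if j not in ratings[user]:
                scores[j] += count
    return sorted(scores, key=scores.get, reverse=True)


model = build_model(RATINGS)  # has to be re-built when the ratings change
print(memory_based("ann", RATINGS))
print(model_based("ann", RATINGS, model))
```

The memory-based function touches every user's ratings on each request, while the model-based one reads a table that was computed once and goes stale until it is re-built.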
