User Profiling based on Folksonomy Information in Web 2.0 for Personalized Recommender Systems

Huizhi (Elly) Liang

Supervisors: Yue Xu, Yuefeng Li, Richi Nayak

Queensland University of Technology, Australia

Agenda

1 Introduction

2 Literature Review

3 The Proposed Approaches

4 Experiments

5 Conclusion

1 Introduction

Information overload

Personalization

“Personalization is the ability to provide content and services tailored to individuals based on knowledge about their preferences and behaviours” (Hagen, 1999)

Recommender systems User profiling

Explicit user profiles Explicit ratings

Implicit user profiling Web log Other information sources

Web 2.0

Web 2.0: Read and Write web (O’Reilly Media, 2004)

A platform for users to conduct online participation, collaboration and interaction

Expressing opinions, sharing information, building networks

Wikipedia, Facebook, Delicious, Twitter

Plenty of new user information

Folksonomy (tags), reviews, networks, blogs, micro-blogs, etc.

Opportunities Providing possible new solutions to profile users

Folksonomy

Folksonomy = folk + taxonomy

Tags: Typical Web 2.0 information

Keywords given by users to organize and classify items

The wisdom of crowds

Multiple functions

Item organizing and sharing

Building networks

Expressing users’ explicit topic interests and opinions

Tag Cloud

Folksonomy Tags

Taxonomy categories

Taxonomy

Given by experts

Standard vocabulary & structural relationship

Well recognized as common knowledge

Independent of user communities

No information about users’ personal viewpoints or preferences

Folksonomy

Given by users explicitly and proactively

Reflecting users’ personal viewpoints and topic preferences

Less intrusive & multiple functions

Lightweight textual information

Contains a lot of noise

2 Literature Review

User Profiling

Web User profiling Web content & structure Web log & Web usage Taxonomy & Ontology

User Profiling in Web 2.0 New user information sources

Folksonomy, blogs, reviews, micro-blogs Videos, audios, images Friends, trust network, followers, following

User Profiling 2

User Profiling based on folksonomy Approaches

Users’ own tags Associated tags Latent topics of tags Popular tags

Challenges Distinctive features of tags

Tag quality problem

Semantic ambiguity and synonyms

About 60% of tags are personal tags

Recommender system

Recommendation tasks

Top N Recommendation (Precision, Recall, F1)

Rating Prediction (Mean Absolute Error, Root Mean Squared Error)
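The evaluation measures named above can be sketched as follows. This is a minimal illustration using the standard definitions of each metric; the function names and the sample item ids are assumptions made for the example, not taken from the slides.

```python
import math

def precision_recall_f1(recommended, relevant):
    """Top-N metrics for one user.

    recommended: the ranked top-N list produced by the recommender.
    relevant: the held-out items the user actually collected.
    """
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def mae(predicted, actual):
    """Mean Absolute Error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root Mean Squared Error between predicted and actual ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Hypothetical example: 4 recommended items, 3 relevant items, 2 hits.
p, r, f = precision_recall_f1(["i1", "i2", "i3", "i4"], {"i2", "i4", "i9"})
```

In a Top N experiment these per-user values are typically averaged over all test users.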

Recommendation approaches Content based

Term vector model Latent Dirichlet Allocation (LDA)

Collaborative Filtering (CF) Memory based CF: User-KNN & Item-KNN Model based CF: Matrix Factorization techniques

Hybrid

Recommender system 2

Recommender systems based on Taxonomy Ziegler’s approach (CIKM, 2004)

Recommender systems based on Folksonomy Tag recommendations

Tensor based approach (KDD, 2009) Graph based approach (SIGIR, 2009)

Item recommendations Tso-Sutter’s approach(SAC, 2008) Clustering (RecSys, 2009) LDA approach (HT, 2009) Graph Rank (2010) Special tag rating function(WWW,2009)

Research Problem

Research Gap Features of folksonomy Noise of folksonomy Combining with taxonomy

Research Problem

Profiling users based on folksonomy information in Web 2.0 and enhancing recommender systems

3 The Proposed Approaches

User Profiling Models User Profiling based on Folksonomy User Profiling based on Taxonomy Hybrid User Profiling

Recommender System Top N item recommendation

Recommendation making

The Proposed Approaches

User Profiling

User Profiling-Folksonomy

User Profiling-Taxonomy

User Profiling-Hybrid

The Relationship Modelling

The Multiple relationships of tagging Two dimensional relationships

User-Item relationship

User-Tag relationship

Item-Tag relationship

Three dimensional relationship Personal tagging behavior

User-Tag-Item relationship (User×Tag)-Item mapping Item-(User×Tag) mapping
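The relationships listed above can all be derived from a list of tagging records. The sketch below assumes each record is a (user, tag, item) triple; the variable names and sample data are hypothetical.

```python
from collections import defaultdict

# Hypothetical tagging records: (user, tag, item).
assignments = [
    ("u1", "apple", "i1"),
    ("u1", "garden", "i1"),
    ("u2", "apple", "i2"),
    ("u2", "internet", "i2"),
    ("u2", "apple", "i3"),
]

# Two dimensional relationships.
user_items = defaultdict(set)   # User-Item
user_tags = defaultdict(set)    # User-Tag
item_tags = defaultdict(set)    # Item-Tag

# Three dimensional relationship, kept as two mappings.
usertag_items = defaultdict(set)  # (User x Tag) -> Items
item_usertags = defaultdict(set)  # Item -> (User x Tag) pairs

for user, tag, item in assignments:
    user_items[user].add(item)
    user_tags[user].add(tag)
    item_tags[item].add(tag)
    usertag_items[(user, tag)].add(item)
    item_usertags[item].add((user, tag))
```

Keeping the (user, tag) pair as a key preserves personal tagging behaviour, which the two dimensional projections lose.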

Part 1: User Profiling Approaches based on Folksonomy

Tag representation-Folksonomy

Item representation-Folksonomy

User representation-Folksonomy


Tag representation-Folksonomy

Reduce the noise of tags Find the personally related tags of each tag Determine the relevance weight

Relevance weight of two tags with respect to a user

The collected items of a tag

The expectation of the probability of a tag being used for the collected items

Probability of a tag being used for an item = (Number of users who used the tag for the item) / (Number of users who collected the item)

Figure: example of tags personally related to “apple” (“garden”, “globalization”, “internet”), with example relevance weights 0.16 and 0.34
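One possible reading of the relevance weight described above is sketched here: for a given user, take the items that user collected with one tag, and average the probability that another tag is used for those items. The function names, the plain averaging, and the sample data are assumptions for illustration; the slides do not give the exact formula.

```python
# Hypothetical tagging records: (user, tag, item).
assignments = [
    ("u1", "apple", "i1"),
    ("u1", "apple", "i2"),
    ("u2", "garden", "i1"),
    ("u3", "garden", "i1"),
    ("u2", "internet", "i2"),
]

def tag_probability(tag, item, assignments):
    """Users who used `tag` for `item` divided by users who collected `item`."""
    collectors = {u for u, _t, i in assignments if i == item}
    taggers = {u for u, t, i in assignments if i == item and t == tag}
    return len(taggers) / len(collectors) if collectors else 0.0

def tag_relevance(user, tag_a, tag_b, assignments):
    """Mean of P(tag_b | item) over the items `user` collected with `tag_a`."""
    items = {i for u, t, i in assignments if u == user and t == tag_a}
    if not items:
        return 0.0
    return sum(tag_probability(tag_b, i, assignments) for i in items) / len(items)

garden_weight = tag_relevance("u1", "apple", "garden", assignments)
internet_weight = tag_relevance("u1", "apple", "internet", assignments)
```

Because the items are restricted to those the user collected with the tag, the same tag can relate to different tags for different users.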

Item representation-Folksonomy

Expand the tags of each item Find the relevant tags of each item Determine the relevance weight

The relevance of an item to a tag User-tag pairs The relevance of two tags with respect to a user Inverse item frequency

Figure: example item represented by weighted tags (“garden”, “apple”, “globalization”, “internet”, “0403”)
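A simplified sketch of an inverse-item-frequency weighting for item tag vectors follows. It assumes the inverse item frequency is defined by analogy with inverse document frequency, log(number of items / number of items carrying the tag), and it omits the personal tag expansion step described above. All names and data are hypothetical.

```python
import math

# Hypothetical data: for each item, how many users applied each tag to it.
item_tag_counts = {
    "i1": {"apple": 2, "garden": 1},
    "i2": {"apple": 1, "internet": 3},
    "i3": {"internet": 1},
}

def inverse_item_frequency(item_tag_counts):
    """log(total items / items carrying the tag), assumed by analogy with IDF."""
    n_items = len(item_tag_counts)
    items_with_tag = {}
    for counts in item_tag_counts.values():
        for tag in counts:
            items_with_tag[tag] = items_with_tag.get(tag, 0) + 1
    return {tag: math.log(n_items / n) for tag, n in items_with_tag.items()}

def item_tag_vector(item, item_tag_counts, iif):
    """Relative tag frequency within the item, scaled by inverse item frequency."""
    counts = item_tag_counts[item]
    total = sum(counts.values())
    return {tag: (c / total) * iif[tag] for tag, c in counts.items()}

iif = inverse_item_frequency(item_tag_counts)
vector_i1 = item_tag_vector("i1", item_tag_counts, iif)
```

Tags that appear on many items receive a lower weight, so widely used tags contribute less to distinguishing one item from another.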

User Representation-Folksonomy

Find users’ preferences to tags The preference weight of a user to a tag

Preferences to one tag The relevance of two tags with respect to a user Inverse user frequency

Preference to one tag = (Number of items collected with the tag by the user) / (Number of items collected by the user)

Figure: example user represented by weighted tags (“garden”, “apple”, “globalization”, “internet”, “0403”)
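The preference of a user to a single tag, scaled by an inverse user frequency, can be sketched as below. The inverse user frequency is assumed here to be log(number of users / number of users who used the tag), and the expansion through personally related tags is left out. Names and data are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical tagging records: (user, tag, item).
assignments = [
    ("u1", "apple", "i1"),
    ("u1", "apple", "i2"),
    ("u1", "garden", "i1"),
    ("u2", "apple", "i3"),
    ("u3", "internet", "i3"),
]

def user_tag_preferences(assignments):
    """Return {user: {tag: weight}} using item share times inverse user frequency."""
    items_by_user = defaultdict(set)
    items_by_user_tag = defaultdict(set)
    users_by_tag = defaultdict(set)
    for user, tag, item in assignments:
        items_by_user[user].add(item)
        items_by_user_tag[(user, tag)].add(item)
        users_by_tag[tag].add(user)

    n_users = len(items_by_user)
    prefs = defaultdict(dict)
    for (user, tag), items in items_by_user_tag.items():
        # Items collected with the tag by the user / items collected by the user.
        share = len(items) / len(items_by_user[user])
        # Assumed definition of inverse user frequency.
        iuf = math.log(n_users / len(users_by_tag[tag]))
        prefs[user][tag] = share * iuf
    return dict(prefs)

prefs = user_tag_preferences(assignments)
```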

User: Item preferences (Implicit ratings), Topic preferences (Tag vocabulary)

Item: Tag vocabulary

Figure: example user and item profiles over the tags “garden”, “apple”, “globalization”, “internet”, “0403”

Part 2: User Profiling based on Taxonomy

Advantages of Taxonomy

Standard vocabulary

Well recognized

Independent of user communities

Experts’ viewpoints

Representations

Item representation-Taxonomy

Tag representation-Taxonomy

User representation-Taxonomy


Item Representation-Taxonomy

Find the relevant taxonomic topics of each item

The relevance of an item to a taxonomic topic

The average weight of a taxonomic topic in all descriptors

The weight of a taxonomic topic in an item descriptor

Deploy weight from leaf topic to root topic

Inverse item frequency

Figure: example item represented by taxonomic topics (“programming”, “book”, “computers”, “networks”)
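Deploying weight from a leaf topic up to the root can be sketched as follows. The slides do not state the propagation rule, so this example assumes a simple geometric decay (the hypothetical `decay` parameter) along the path, normalised so that each descriptor contributes its full weight once. The topic paths are invented for illustration.

```python
def deploy_weight(path, weight=1.0, decay=0.5):
    """Share `weight` along a root-to-leaf topic path.

    The leaf gets the largest share; each step towards the root is
    multiplied by `decay` (an assumed rule). Shares sum to `weight`.
    """
    raw = [decay ** (len(path) - 1 - depth) for depth in range(len(path))]
    total = sum(raw)
    return {topic: weight * r / total for topic, r in zip(path, raw)}

def item_topic_vector(descriptors, decay=0.5):
    """Average the deployed weights over all descriptors of an item."""
    vector = {}
    weight = 1.0 / len(descriptors)
    for path in descriptors:
        for topic, share in deploy_weight(path, weight, decay).items():
            vector[topic] = vector.get(topic, 0.0) + share
    return vector

# Hypothetical item with two taxonomy descriptors (root listed first).
shares = deploy_weight(["book", "computers", "programming"])
vector = item_topic_vector([
    ["book", "computers", "programming"],
    ["book", "computers", "networks"],
])
```

Topics shared by several descriptors, such as the common ancestors here, accumulate weight from each path.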

Tag Representation-Taxonomy

Reduce the noise of tags

Find the personal semantic meaning of each tag

The relevance of a tag to a taxonomic topic with respect to a user

The collected items of a tag

Average relevance weight of a taxonomic topic to the collected items

Figure: the tag “apple” mapped to different taxonomic topics in different cases (“computers”, “programming”, “databases”, “networks” versus “garden”, “flowers”, “fruit”)

User Representation-Taxonomy

Find users’ preferences to taxonomic topics

The preference weight of a user to a taxonomic topic

Preference to a tag

Relevance of a tag to a taxonomic topic with respect to the user

Inverse user frequency

Figure: example user represented by taxonomic topics (“databases”, “programming”, “computers”, “book”) derived from tags such as “0403”

User: Item preferences (Implicit ratings), Topic preferences (Taxonomy vocabulary)

Item: Taxonomy vocabulary

Figure: example user profile over the topics “databases”, “programming”, “computers”, “book” and example item profile over the topics “book”, “computers”, “programming”, “networks”

Part 3: Hybrid User Profiling

Combine Part 1 and Part 2

Wisdom of crowds: Tag vocabulary & Users’ viewpoints

Wisdom of experts: Taxonomy vocabulary & Experts’ viewpoints

Tag representation-Hybrid

Item representation-Hybrid

User representation-Hybrid

Personalized Recommendation Making

Top N item recommendation

Neighborhood Formation

Recommendation Generation



User Profiling-Hybrid

User Profiling

Recommendation Making

Neighbourhood Formation

K-Nearest Neighbourhood User-KNN

Similarity of item preferences Similarity of topic preference

Tags Taxonomic topics

Linear combination

Item Preferences

Topic Preferences

Tags Taxonomic topics

User Similarity
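The linear combination for user similarity can be sketched as below. Cosine similarity and the mixing weight `alpha` are assumptions made for this example; the slides state only that item preference similarity and topic preference similarity are combined linearly. The profiles are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def user_similarity(u, v, alpha=0.5):
    """alpha * item-preference similarity + (1 - alpha) * topic-preference similarity."""
    return (alpha * cosine(u["items"], v["items"])
            + (1 - alpha) * cosine(u["topics"], v["topics"]))

def nearest_neighbours(target, profiles, k, alpha=0.5):
    """Names of the k users most similar to `target`."""
    scored = [(user_similarity(profiles[target], p, alpha), name)
              for name, p in profiles.items() if name != target]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _sim, name in scored[:k]]

# Hypothetical profiles: implicit item ratings plus topic (tag or taxonomy) weights.
profiles = {
    "alice": {"items": {"i1": 1, "i2": 1}, "topics": {"apple": 1.0}},
    "bob": {"items": {"i1": 1, "i2": 1}, "topics": {"garden": 1.0}},
    "carol": {"items": {"i3": 1}, "topics": {"apple": 1.0}},
}
neighbours = nearest_neighbours("alice", profiles, k=1, alpha=0.7)
```

The same structure applies whether the topic vector holds tags, taxonomic topics, or both.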

Neighbourhood Formation 2

K-Nearest Neighbourhood Item-KNN

Similarity of Tags Similarity of Taxonomic topics Linear combination

Tags Taxonomic Topics

Item similarity

Recommendation Generation

Candidate items Neighbour items & Not tagged by the target user

User based recommendation: Prediction Score from User Similarity and Content matching (Tags, Taxonomic Topics)

Item based recommendation: Prediction Score from Item Similarity
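A minimal sketch of user based recommendation generation follows. It scores each candidate item by summing the similarities of the neighbours who collected it, which is a common simplification; the content matching component mentioned above is omitted, and the function name and data are hypothetical.

```python
def recommend(target_items, neighbours, n):
    """Rank candidate items for a target user.

    target_items: set of items the target user has already tagged.
    neighbours: list of (similarity, items_collected_by_neighbour) pairs.
    Candidates are neighbour items not yet tagged by the target user.
    """
    scores = {}
    for similarity, items in neighbours:
        for item in items - target_items:
            scores[item] = scores.get(item, 0.0) + similarity
    # Highest score first; ties broken by item id for a stable order.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:n]

# Hypothetical neighbourhood of three users.
top2 = recommend(
    target_items={"i1"},
    neighbours=[(0.9, {"i1", "i2"}), (0.5, {"i2", "i3"}), (0.4, {"i4"})],
    n=2,
)
```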

4 Experiments

Datasets

D1: Amazon.com 4112 users, 34201 tags, 30467 items, 9919 taxonomic topics

D2: CiteULike “Who-posted-what” dataset 7103 users, 78414 tags, 117279 items

Power Law Distributions

Figure (two panels, Tags and Items): Number of Tags against Number of Users, and Number of Items against Number of Users, for D1 and D2

Experiment setup

Top N item recommendation

5-fold: 80% training & 20% testing
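A 5-fold split in which each fold uses 80% of the records for training and 20% for testing can be sketched as follows. The slides do not say whether the split is made per user or over all records; this example splits a flat list of records, and the function name and seed are assumptions.

```python
import random

def k_fold_splits(records, k=5, seed=0):
    """Yield (train, test) pairs; each record is in exactly one test fold."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    # Deal records round-robin into k folds of near-equal size.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, test

# Hypothetical usage with 100 record ids: 80 for training, 20 for testing per fold.
splits = list(k_fold_splits(range(100)))
```

Reported metrics would then be averaged over the five folds.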

Evaluation Metrics Precision, Recall, F1 Measure

Comparisons

Proposed Models

Folksonomy Model: FM-User, FM-Item

Taxonomy Model: TM-User, TM-Item

Hybrid Model: FTM-User, FTM-Item

Baseline Models

Tag Noise Removing Approaches (Dataset D1) Parameter setting

FM-User: : 0.8-1.0 , 1 : 0.4-0.5

FM-Item: 1 : 0.4-0.5

Results-I Folksonomy Model

Figure (three panels: Precision, Recall and F1 against Top N, N = 1 to 10): FM-User, FM-Item, Clustering, ARTE, LDA

The Comparison of the State-of-the-art approaches (Dataset D1)

Results-I

Figure (three panels: Precision, Recall and F1 against Top N, N = 1 to 10): FM-User, Graph Rank, Tag tf-iuf, Tso-Sutter’s approach, CF-Item

Comparison results of Dataset D2

Results-I

Figure (three panels: Precision, Recall and F1 against Top N, N = 1 to 10): FM-User, Graph Rank, Clustering, CF-Item

Parameter setting (Dataset D1) TM-User:

: 0.8-1.0 , 1 : 0.4-0.5

TM-Item: 1 : 0.4-0.5

Results-2 Taxonomy Model

Figure (three panels: Precision, Recall and F1 against Top N, N = 1 to 10): TM-User, TM-Item, TPR

Parameter setting (Dataset D1) FTM-User: FTM-Item: 1=0.3,

Hybrid Models vs. Single Models

Folksonomy Model vs. Taxonomy Model

Results-3 Hybrid Models

Figure (two panels: Precision and Recall against Top N, N = 1 to 3): FTM-User, FTM-Item, FM-User, TM-User

Results-3

The influence of personal tags

D1 personal tags: 67%, θ = 10: 4.8%

D2 personal tags: 70%, θ = 10: 5.2%

Findings

Personal tags can improve the precision results

Precision values decreased dramatically when a large number (i.e., 90%) of tags (i.e., θ = 5) were removed.
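Removing tags below a usage threshold can be sketched as follows. The example assumes the threshold θ is the minimum number of distinct users who must have used a tag for it to be kept, so that θ = 1 keeps every tag and θ = 2 removes personal tags used by a single user. The function name and data are hypothetical.

```python
from collections import defaultdict

def filter_tags(assignments, theta):
    """Keep only records whose tag was used by at least `theta` distinct users."""
    users_by_tag = defaultdict(set)
    for user, tag, _item in assignments:
        users_by_tag[tag].add(user)
    kept = {tag for tag, users in users_by_tag.items() if len(users) >= theta}
    return [record for record in assignments if record[1] in kept]

# Hypothetical tagging records: (user, tag, item). "0403" is used by one user only.
assignments = [
    ("u1", "apple", "i1"),
    ("u2", "apple", "i2"),
    ("u1", "0403", "i1"),
    ("u3", "garden", "i3"),
    ("u2", "garden", "i1"),
]
without_personal_tags = filter_tags(assignments, theta=2)
```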

Figure: Number of Tags against θ

θ    1      2      3      4      5     6     7     8     9     10
D2   78414  23229  14045  10299  8229  6839  5837  5124  4573  4131
D1   34201  11298  6906   4897   3784  3051  2544  2155  1884  1648

Figure: Top-3 Precision against θ, labelled points given as (number of tags, precision)

FM-User, D1: (34201, 0.31), (11298, 0.28), (4897, 0.24), (1648, 0.2)

FM-User, D2: (78414, 0.21), (23229, 0.19), (4131, 0.16)

TM-User, D1: (9919, 0.24)

Discussions

The proposed approaches outperformed other related work

The Hybrid Model performed the best

Each tag counts

Folksonomy can be used as quality information source (rich personalization information)

5 Conclusions

Conclusions

Web 2.0 New user information Modelling the relationships of tagging behaviour Tag quality problem The wisdom of crowds & experts

Proposed three user profiling models User profiling based on folksonomy User profiling based on taxonomy Hybrid user profiling

Utilized the proposed user profiles to improve recommender systems User based Item based

Evaluation Experiments

Contributions

Advantages Domain free Language free

Information overload User profiling and web personalization Recommender systems Web 2.0

Future Work

Time factor Cross folksonomy recommendations Mobile platform application Integrate with other user information

Explicit ratings Tweets Friendship network

Published Work

Liang, H. et al. (2010). Personalized Recommender System Based on Item Taxonomy and Folksonomy. CIKM

Liang, H. et al. (2010). Connecting Users and Items with Weighted Tags for Personalized Item Recommendations. Hypertext

Liang, H. et al. (2010). A Hybrid Recommender System based on Weighted Tags. SDM Workshop

Liang, H. et al. (2010). Mining Users’ Opinions based on Item Folksonomy and Taxonomy for Personalized Recommender Systems. ICDM Workshop

Liang, H. et al. (2010). Parallel User profiling based on folksonomy for Large Scaled Recommender Systems-An implementation of Cascading MapReduce. ICDM Workshop

Liang, H. et al. (2009). Collaborative Filtering Recommender Systems based on Popular Tags. ADCS

Liang, H. et al. (2009). Tag Based Collaborative Filtering for Recommender Systems. RSKT

Liang, H. et al. (2009). Personalized Recommender Systems Integrating Social tags and Item Taxonomy. WI

Liang, H. et al. (2008). Collaborative Filtering Recommender Systems Using Tag Information. WI Workshop

Bhuiyan, T., Xu, Y., Jøsang, A., & Liang, H. (2010). Developing Trust Networks Based on User Tagging Information for Recommendation Making. WISE

Acknowledgements

Time Supervisor Team HPC group

Panel Members ISS Anonymous Reviewers Papers Staff

Colleagues Friends Google Books Sunshine CSC

Trees Stars Music Trips Blogs Beaches Family …

Questions & Answers
