Post on 04-Jan-2016

The Effect of Dimensionality Reduction in Recommendation Systems

Juntae Kim

Department of Computer Engineering, Dongguk University

Contents

Introduction

Collaborative Recommendation

Data Sparseness Problem

Dimensionality Reduction by using SVD

An Example

Experiments

Conclusion

Introduction

e-CRM

Provides personalized service

Enhances sales through product recommendation, targeted advertisement, etc.

Recommendation System

[Figure: the recommendation system draws on demographic features, item features, sales history, and purchase history to recommend items to the customer]

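Introduction

Use item-to-item similarity – content-based

[Figure: nodes A, B, C with edges labeled "like", "similar contents", and "Recommend"]

Use item-to-item similarity – association

[Figure: nodes A, B, C with edges labeled "like", "high correlation", and "Recommend"]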

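Introduction

Use people-to-people similarity – demographic

[Figure: nodes A, B, C with edges labeled "similar feature", "like", and "Recommend"]

Use people-to-people similarity – collaborative

[Figure: nodes A, B, C with edges labeled "like", "like", "high correlation", and "Recommend"]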

Collaborative Method

Advantages

No need for content analysis: items whose contents are difficult to analyze can be recommended (e.g., movies, music, …)

No need for user information

High precision

Method

1. Find similar users

2. Predict preferences based on the similar users' preferences

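Collaborative Method

Computing similarity

Pearson correlation coefficient (range [-1, 1])

$w_{a,u} = \frac{\sum_i (r_{a,i} - \bar{r}_a)(r_{u,i} - \bar{r}_u)}{\sqrt{\sum_i (r_{a,i} - \bar{r}_a)^2}\,\sqrt{\sum_i (r_{u,i} - \bar{r}_u)^2}}$

$r_{a,i}$: rating of user a for item i

$\bar{r}_a$: average rating of user a

Example (ratings → deviations from the user's average rating)

User a: (1, 8, 9) → (-5, +2, +3)

User b: (2, 9, 7) → (-4, +3, +1)

User c: (9, 3, 3) → (+4, -2, -2)

User a is similar to user b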
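The similarity example above can be checked with a short sketch. It assumes the means are taken over all three items shown (in practice only co-rated items are usually used), and the function name `pearson` is chosen here for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length rating vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Deviations from each user's own average rating
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = sqrt(sum(a * a for a in dx)) * sqrt(sum(b * b for b in dy))
    return num / den if den else 0.0

user_a = [1, 8, 9]
user_b = [2, 9, 7]
user_c = [9, 3, 3]

w_ab = pearson(user_a, user_b)
w_ac = pearson(user_a, user_c)
print(round(w_ab, 2), round(w_ac, 2))  # 0.92 -0.99
```

User a correlates strongly with user b and almost perfectly negatively with user c, which matches the slide's reading that a is similar to b.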

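Collaborative Method

Prediction of preferences

Weighted sum of similar users' preferences

$p_{a,i} = \bar{r}_a + \sum_u w_{a,u}\,(r_{u,i} - \bar{r}_u)$

$p_{a,i}$: predicted preference of user a for item i

$w_{a,u}$: similarity between users a and u

Example

Average rating of user a: 5

User b: (2, 8, 8), $w_{a,b}$ = 0.5

User c: (4, 4, 7), $w_{a,c}$ = 0.1

Preferences of user a = (5, 5, 5) + (-4, 2, 2) × 0.5 + (-1, -1, 2) × 0.1 = (2.9, 5.9, 6.2)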
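The arithmetic in the prediction example can be reproduced with a minimal sketch. It follows the slide's unnormalized weighted sum (no division by the sum of weights), and `predict` and its argument layout are illustrative names, not part of the original material.

```python
def predict(mean_a, neighbors):
    """Predict ratings for the target user.

    mean_a:    average rating of the target user
    neighbors: list of (ratings, weight) pairs, one per similar user
    """
    n_items = len(neighbors[0][0])
    pred = [float(mean_a)] * n_items
    for ratings, weight in neighbors:
        mean_u = sum(ratings) / len(ratings)
        for i, r in enumerate(ratings):
            # Add the neighbor's deviation from their own mean, scaled by similarity
            pred[i] += weight * (r - mean_u)
    return pred

pred = predict(5, [([2, 8, 8], 0.5), ([4, 4, 7], 0.1)])
print([round(p, 1) for p in pred])  # [2.9, 5.9, 6.2]
```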

Data Sparseness Problem

Example data: user-item matrix A

         Monster Co.  Lion King  Pocahontas  Halloween  Scream
user 1        1           0          1           1         0
user 2        0           1          1           0         0
user 3        1           0          0           0         0
user 4        0           0          0           1         1
user 5        0           0          0           1         0
user 6        0           0          0           0         1

Data Sparseness Problem

Explicit ratings are not usually available

Available data: purchases, clicks, etc., recorded as 0 or 1

Computing correlation is not appropriate (no negative preference information)

→ Use cosine similarity

$w_{a,u} = \frac{\mathbf{r}_a \cdot \mathbf{r}_u}{\lVert \mathbf{r}_a \rVert \, \lVert \mathbf{r}_u \rVert}$
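As a rough illustration of cosine similarity on 0/1 purchase vectors, the sketch below uses three rows of the example user-item matrix. The row and column order is an assumption read from the example slide (columns: Monster Co., Lion King, Pocahontas, Halloween, Scream).

```python
from math import sqrt

def cosine(x, y):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

# Assumed rows of the example matrix
user1 = [1, 0, 1, 1, 0]
user2 = [0, 1, 1, 0, 0]
user6 = [0, 0, 0, 0, 1]

print(round(cosine(user1, user2), 3))  # one shared item (Pocahontas)
print(cosine(user2, user6))            # no shared items, so similarity is 0.0
```

Users who share no purchased item always get a similarity of exactly zero, which is the situation that becomes dominant as the data gets sparser.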

Data Sparseness Problem

Available data are usually very sparse

Users buy 2~3 items among thousands of items

Cosine similarity cannot be computed meaningfully (most user pairs share no items)

→ Reduce dimension

[Figure: the 6 × 5 user-item matrix A → ? → reduced matrix]

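Dimensionality Reduction

Using category information

Represent the user preference vector with item categories

Monster Co., Lion King, Pocahontas → animation

Halloween, Scream → horror

User-item matrix A (6 × 5)

         Monster Co.  Lion King  Pocahontas  Halloween  Scream
user 1        1           0          1           1         0
user 2        0           1          1           0         0
user 3        1           0          0           0         0
user 4        0           0          0           1         1
user 5        0           0          0           1         0
user 6        0           0          0           0         1

Reduced user-category matrix (6 × 2)

         animation  horror
user 1       1         1
user 2       1         0
user 3       1         0
user 4       0         1
user 5       0         1
user 6       0         1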
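The category-based reduction can be sketched as follows. The matrix layout and the item-to-category table are assumptions read from the example (columns: Monster Co., Lion King, Pocahontas, Halloween, Scream; categories: animation, horror), and `reduce_by_category` is an illustrative helper name.

```python
from math import sqrt

# Assumed user-item matrix from the example (rows: user 1..6)
A = [
    [1, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
]
# Item-to-category membership; columns: animation, horror
ITEM_CATEGORY = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1]]

def reduce_by_category(matrix, item_category):
    """Mark a category with 1 if the user bought at least one item in it."""
    n_cat = len(item_category[0])
    return [
        [int(any(row[i] and item_category[i][c] for i in range(len(row))))
         for c in range(n_cat)]
        for row in matrix
    ]

def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

A_cat = reduce_by_category(A, ITEM_CATEGORY)
print(A_cat)
# Users 2 and 3 share no item, but both bought only animation titles
print(cosine(A[1], A[2]), cosine(A_cat[1], A_cat[2]))  # 0.0 1.0
```

The last line shows the point of the reduction: two users with zero item overlap become comparable once their purchases are expressed in the smaller category space.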

Dimensionality Reduction

Singular Value Decomposition (SVD)

Decompose the user-item matrix $A_{m \times n}$

$A_{m \times n} = U_{m \times m}\, S_{m \times n}\, (V_{n \times n})^T$

S: diagonal matrix that contains the singular values of A in descending order

U, V: orthogonal matrices

Rotates the axes of the n-dimensional space so that the 1st axis runs along the direction of largest variation
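A minimal sketch of this decomposition using NumPy's `numpy.linalg.svd` is shown below. The matrix is the 6 × 5 example user-item matrix as read from the slides (row and column order assumed), and the choice of NumPy is an assumption of this sketch rather than something the slides specify.

```python
import numpy as np

# Assumed example user-item matrix (rows: user 1..6;
# columns: Monster Co., Lion King, Pocahontas, Halloween, Scream)
A = np.array([
    [1, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
], dtype=float)

# full_matrices=True gives U (m x m) and Vt (n x n), as in A = U S V^T
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Rebuild the m x n matrix S with the singular values on its diagonal
S = np.zeros_like(A)
S[:len(s), :len(s)] = np.diag(s)

print(np.round(s, 2))                     # singular values, in descending order
print(np.allclose(U @ S @ Vt, A))         # the factors reproduce A
print(np.allclose(U.T @ U, np.eye(6)))    # U is orthogonal
print(np.allclose(Vt @ Vt.T, np.eye(5)))  # V is orthogonal
```

The signs of individual singular vectors are not unique, so columns of `U` and rows of `Vt` may differ in sign from any other SVD of the same matrix.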

Dimensionality Reduction

SVD example

U: [6 × 5 matrix]

S = diag(2.16, 1.59, 1.28, 1.00, 0.39)

$V^T$: [5 × 5 matrix]

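Dimensionality Reduction

Approximation of A

Select the k largest singular values

$A'_{m \times n} = U_{m \times k}\, S_{k \times k}\, (V_{n \times k})^T$

Computing user similarity

$AA^T = USV^T (USV^T)^T = USV^T V S^T U^T = (US)(US)^T$

Projection of A into k dimensions

$A'_{m \times n} V_{n \times k} = U_{m \times k}\, S_{k \times k}$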
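The identities above can be verified numerically. This is a sketch under the same assumptions as before: the 6 × 5 example matrix as read from the slides, NumPy as the tool, and k = 2 as in the later example.

```python
import numpy as np

# Assumed example user-item matrix (rows: user 1..6)
A = np.array([
    [1, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
], dtype=float)

# Thin SVD: U is 6 x 5, s has 5 entries, Vt is 5 x 5
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# User similarity: A A^T equals (US)(US)^T because V is orthogonal
US = U @ np.diag(s)
print(np.allclose(A @ A.T, US @ US.T))

# Rank-k approximation A' = U_k S_k V_k^T
k = 2
U_k, S_k, V_k = U[:, :k], np.diag(s[:k]), Vt[:k, :].T
A_k = U_k @ S_k @ V_k.T

# Projecting A' onto the first k right singular vectors gives U_k S_k
print(np.allclose(A_k @ V_k, U_k @ S_k))
```

Each row of `U_k @ S_k` is then a k-dimensional user vector, so user similarities can be computed in k dimensions instead of n.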

An Example

User-item matrix A

         Monster Co.  Lion King  Pocahontas  Halloween  Scream
user 1        1           0          1           1         0
user 2        0           1          1           0         0
user 3        1           0          0           0         0
user 4        0           0          0           1         1
user 5        0           0          0           1         0
user 6        0           0          0           0         1

An Example

Reduction, k = 2

$AV = US$ with k = 2

[Matrix: the 6 × 5 user-item matrix A is reduced to a 6 × 2 matrix with one 2-D row vector per user]

An Example

User-user similarity

$(US)(US)^T$

[Table: pairwise similarities among users 1 to 6 in the reduced 2-D space; diagonal entries are 1.00]
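The user-user similarities for the example can be recomputed with a short sketch. It assumes the example matrix as read from the slides, k = 2, and cosine normalization of the rows of $U_k S_k$ (suggested by the unit diagonal of the similarity table); the variable names are illustrative.

```python
import numpy as np

# Assumed example user-item matrix (rows: user 1..6;
# columns: Monster Co., Lion King, Pocahontas, Halloween, Scream)
A = np.array([
    [1, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
users_2d = U[:, :k] * s[:k]  # rows of U_k S_k: one 2-D vector per user

# Normalize each user vector, then take dot products to get cosine similarity
unit = users_2d / np.linalg.norm(users_2d, axis=1, keepdims=True)
sim = unit @ unit.T

print(np.round(sim, 2))
```

In the reduced space, users 2 and 3 (who share no purchased item) come out as highly similar, while user 2 and user 6 come out as dissimilar, which is the behavior the reduction is meant to provide on sparse data.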

An Example

User vectors in 2-D space

[Figure: users u1 to u6 plotted as vectors in the 2-D space]

Experiments

Dataset – MovieLens

943 users, 1628 movies, ratings of 1~5, 6.4% rated

Change ratings to 0/1 → 3.6% rated

Experiments

Compare the performance of plain collaborative (CF) and reduced-dimension (SVD) recommendation

CF: 60 neighbors

SVD: rank 20

Change sparseness to 2.0%, 1.0%, 0.5%

Experiments

Metric

Hit ratio

Remove 1 rating from each user → test data

Recommend 10 items for each user

If the test data is in the recommended items → hit

Hit ratio = (Total # of hits) / (Total # of test data)

Result

Sparseness 3.6%: SVD improves hit ratio by x %

Sparseness 0.5%: SVD improves hit ratio by x %
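The hit-ratio metric can be sketched as below. The function name, the dictionary layout, and the toy user and item IDs are all hypothetical and only illustrate the leave-one-out counting; they are not the experiment's data.

```python
def hit_ratio(recommendations, held_out):
    """Fraction of users whose held-out item appears in their recommendation list.

    recommendations: dict mapping user -> list of recommended item ids (top N)
    held_out:        dict mapping user -> the single item removed as test data
    """
    if not held_out:
        return 0.0
    hits = sum(
        1 for user, item in held_out.items()
        if item in recommendations.get(user, [])
    )
    return hits / len(held_out)

# Hypothetical toy data: three users, one held-out item each
recs = {"u1": ["i3", "i7", "i9"], "u2": ["i1", "i2"], "u3": ["i5", "i8"]}
test_items = {"u1": "i7", "u2": "i4", "u3": "i5"}

print(round(hit_ratio(recs, test_items), 3))  # 2 hits out of 3 test items
```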

Experiments

Results

[Figure: average recall (axis from 0 to 0.25) versus sparseness (3.6%, 2.0%, 1.0%, 0.5%, 0.1%) for CF 60NN and SVD Rank20]

Conclusion

Solve the data sparseness problem

Reduce dimension – heuristics

Reduce dimension – SVD

Experimental results

SVD shows more performance improvement on sparser data

Future research

Statistical analysis

Combined methods
