Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

The 3rd ACM Conference on Recommender Systems, New York City, NY, USA, October 22-25, 2009

http://lca.epfl.ch/privacy

Reza Shokri, Pedram Pedarsani, George Theodorakopoulos, Jean-Pierre Hubaux

Transcript of Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

Page 1: Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

The 3rd ACM Conference on Recommender Systems, New York City, NY, USA, October 22-25, 2009

http://lca.epfl.ch/privacy

Reza Shokri, Pedram Pedarsani, George Theodorakopoulos, Jean-Pierre Hubaux

Page 2: Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

Privacy in Recommender Systems

• Untrusted Server
– Tracking users’ activities

• Publishing Users’ Profiles
– Re-identification attacks on anonymous datasets

A. Narayanan and V. Shmatikov. Robust de-anonymization of large sparse datasets. In IEEE Symposium on Security and Privacy, 2008.

Page 3: Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

Problem Statement

• Improving users’ privacy with minimal loss of accuracy in the recommendations
– Centralized recommender system
– Contact between users
– Distributed privacy-preserving mechanism

Proposed Solution

• Distributed aggregation of users’ profiles
– Users hide the items they have actually rated by adding items rated by other users to their profiles

Page 4: Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

Outline

• Profile Aggregation
• Aggregation Methods
• Evaluation

Page 5: Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

Profile Aggregation

[Figure: the profiles of Alice and Bob, shown as items with their ratings]

• Each user gives a subset of his items to his contact peer
• Thus, users’ profiles are aggregated after the contact
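The exchange on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `give_items` and `contact`, the dictionary representation of a profile (item to rating), the uniform random choice of the subset, and the rule that a received item never overwrites an existing rating are all assumptions made for the example.

```python
import random

def give_items(profile, count, rng):
    """Pick up to `count` (item, rating) pairs from a profile, uniformly at random."""
    chosen = rng.sample(sorted(profile), min(count, len(profile)))
    return {item: profile[item] for item in chosen}

def contact(profile_a, profile_b, count_a, count_b, rng):
    """Return both profiles after one contact; the inputs are not modified."""
    from_a = give_items(profile_a, count_a, rng)
    from_b = give_items(profile_b, count_b, rng)
    # Assumption: a received item never overwrites a rating already in the profile.
    new_a = {**from_b, **profile_a}
    new_b = {**from_a, **profile_b}
    return new_a, new_b

rng = random.Random(0)
alice = {"item1": 2, "item2": 5, "item3": 4}   # hypothetical ratings
bob = {"item3": 3, "item4": 1, "item5": 5}
alice_after, bob_after = contact(alice, bob, 2, 2, rng)
```

After the call, each profile still contains all of its owner's ratings, plus whichever of the peer's items were handed over.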

Page 6: Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

System Model

[Figure labels: online profile, offline profile, synchronization, contact]

• Actual Profile: Set of items rated by a user
• Offline Profile: Actual profile + aggregated items
• Online Profile: The latest synchronized offline profile on the server
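The three profile types can be modeled as follows. This is a sketch under stated assumptions: the class names `User` and `Server` and the methods `rate`, `receive`, and `synchronize` are hypothetical and chosen only to mirror the definitions on the slide.

```python
from dataclasses import dataclass, field

@dataclass
class User:
    name: str
    actual: dict = field(default_factory=dict)   # items the user has really rated
    offline: dict = field(default_factory=dict)  # actual profile + aggregated items

    def rate(self, item, rating):
        # A genuine rating enters both the actual and the offline profile.
        self.actual[item] = rating
        self.offline[item] = rating

    def receive(self, items):
        # Items obtained from a contact peer only enter the offline profile.
        for item, rating in items.items():
            self.offline.setdefault(item, rating)

class Server:
    def __init__(self):
        self.online = {}  # user name -> latest synchronized offline profile

    def synchronize(self, user):
        self.online[user.name] = dict(user.offline)

server = Server()
alice = User("alice")
alice.rate("item1", 4)
server.synchronize(alice)
alice.receive({"item2": 5})        # aggregated on contact, not yet synchronized
before_sync = dict(server.online["alice"])
server.synchronize(alice)
after_sync = server.online["alice"]
```

The server only ever sees the online profile, which lags behind the offline profile until the next synchronization and never reveals which items belong to the actual profile.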

Page 7: Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

Online Profiles vs. Actual Profiles…

Online profile of users

Actual profile of users

Page 8: Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

Aggregation Methods

• How many items to aggregate?
• Which items to aggregate?

Similarity-based Aggregation (Similarity: Pearson’s correlation coefficient)
– Random Selection (SRS)
– Minimum Rating Frequency (SMRF)

(rating frequency: percentage of users that have rated an item)

[Figure: two example items, IMDB: 167,237 votes and IMDB: 1,625 votes]
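A rough Python sketch of the two selection rules follows. The slide states only that aggregation is similarity-based and that items are chosen either at random (SRS) or by minimum rating frequency (SMRF). The rule used here for "how many", a share of the giver's items equal to the Pearson similarity clipped at zero, is an assumption for illustration, as are all function names.

```python
import math
import random

def pearson(profile_a, profile_b):
    """Pearson correlation over co-rated items; 0.0 when it is undefined."""
    common = sorted(set(profile_a) & set(profile_b))
    if len(common) < 2:
        return 0.0
    xs = [profile_a[i] for i in common]
    ys = [profile_b[i] for i in common]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def how_many(giver, receiver):
    # Assumption: the number of items given grows linearly with similarity.
    similarity = max(0.0, pearson(giver, receiver))
    return round(similarity * len(giver))

def select_srs(giver, receiver, rng):
    """Similarity-based Random Selection: a random subset of the giver's items."""
    return rng.sample(sorted(giver), how_many(giver, receiver))

def select_smrf(giver, receiver, rating_frequency):
    """Similarity-based Minimum Rating Frequency: the least frequently rated items first."""
    ranked = sorted(giver, key=lambda item: (rating_frequency[item], item))
    return ranked[:how_many(giver, receiver)]

alice = {"a": 5, "b": 3, "c": 1, "d": 4}              # hypothetical ratings
bob = {"a": 5, "b": 2, "c": 3, "e": 5}
frequency = {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.3}  # fraction of users who rated each item
srs_items = select_srs(alice, bob, random.Random(1))
smrf_items = select_smrf(alice, bob, frequency)
```

With these toy profiles the similarity is about 0.65, so Alice hands over three of her four items; SMRF picks the three with the lowest rating frequency.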

Page 9: Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

Evaluation Metrics

• Privacy Gain

• Accuracy Loss

Page 10: Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

Privacy Gain

Privacy: How difficult it is for the server to guess the users’ actual profiles, having access to their online profiles

Intuition: Structural difference of two graphs (online and actual) viewed as the difference between corresponding edges

R. Myers, R. C. Wilson, and E. R. Hancock. Bayesian graph edit distance. IEEE Trans. Pattern Anal. Mach. Intell., 22(6), 2000.

[Equation not preserved in the transcript. Its annotated terms: number of users; actual profile of user ‘u’; online profile of user ‘u’; rating frequency of item ‘i’; weight of items added by aggregation; weight of items in online profile]
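The transcript lists the terms of the privacy gain but not the equation itself. The sketch below is one reading of those terms: for each user, the weight of the items added by aggregation divided by the weight of all items in the online profile, averaged over the number of users. How an item's weight depends on its rating frequency is not recoverable from the transcript, so the `weight` function is a caller-supplied assumption, and `1 - f` in the usage lines is purely hypothetical.

```python
def privacy_gain(actual, online, rating_frequency, weight):
    """Mean over users of (weight of added items) / (weight of online profile).

    `weight` maps a rating frequency to an item weight; its form is an
    assumption, since the original equation is not in the transcript.
    """
    if not online:
        return 0.0
    total = 0.0
    for user, online_items in online.items():
        added = set(online_items) - set(actual[user])
        added_weight = sum(weight(rating_frequency[i]) for i in added)
        online_weight = sum(weight(rating_frequency[i]) for i in online_items)
        if online_weight > 0:
            total += added_weight / online_weight
    return total / len(online)

actual = {"u1": {"a", "b"}, "u2": {"c"}}
online = {"u1": {"a", "b", "c"}, "u2": {"c"}}   # u1 aggregated item "c"
frequency = {"a": 0.5, "b": 0.5, "c": 0.5}

def rare_items_weigh_more(f):
    return 1.0 - f   # hypothetical weight function

gain = privacy_gain(actual, online, frequency, rare_items_weigh_more)
no_gain = privacy_gain(actual, actual, frequency, rare_items_weigh_more)
```

Under this reading the gain is zero when online and actual profiles coincide and approaches one as aggregated items dominate the online profile.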

Page 11: Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

Accuracy Loss

The bipartite graph that contains actual ratings

The bipartite graph available to the server

Page 12: Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

Experiment

• Simulation on randomly chosen profiles
– From the Netflix prize dataset
– 300 users
– Average: 30000 ratings and 2500 items in each experiment
• Memory-based CF: user-based
• Testing set: 10% of the actual ratings of each user
• Users select their contact peers at random
• Aggregation methods
– Union
– SRS
– SMRF
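Two steps of this setup, holding out 10% of each user's actual ratings and picking a contact peer at random, can be sketched as below. The function names and the toy data are assumptions for illustration; loading the Netflix prize dataset and the user-based CF predictor are not shown.

```python
import random

def split_profile(profile, test_fraction, rng):
    """Split one user's actual ratings into a training part and a held-out test part."""
    items = sorted(profile)
    rng.shuffle(items)
    n_test = round(test_fraction * len(items))
    test_items = set(items[:n_test])
    train = {i: r for i, r in profile.items() if i not in test_items}
    test = {i: r for i, r in profile.items() if i in test_items}
    return train, test

def random_peer(user, users, rng):
    """Choose a contact peer uniformly at random among the other users."""
    return rng.choice([u for u in users if u != user])

rng = random.Random(42)
profile = {f"item{n}": (n % 5) + 1 for n in range(20)}   # hypothetical ratings
train, test = split_profile(profile, 0.10, rng)
peer = random_peer("u1", ["u1", "u2", "u3"], rng)
```

Only the training part would take part in aggregation and prediction; the held-out ratings serve to measure the accuracy of the recommendations.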

Page 13: Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

Privacy Gain

[Plots: Similarity-based Random Selection (SRS); Similarity-based Minimum Rating Frequency (SMRF)]

Page 14: Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

Accuracy Loss

[Plots: Similarity-based Random Selection (SRS); Similarity-based Minimum Rating Frequency (SMRF)]

Page 15: Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

Tradeoff between Privacy and Accuracy

Page 16: Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

Conclusion

• A novel method for privacy preservation in collaborative filtering recommender systems

• Protection of users’ privacy against an untrusted server

• Considerable improvement of users’ privacy with minimal effect on recommendation accuracy, by aggregating users’ profiles based on their similarities

• The proposed method can also be used to protect the privacy of users in published datasets

Page 17: Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles

Future Work

• The evaluation of the mechanism can be improved by considering more realistic contact patterns between users, e.g., users’ friendships in a social network, or physical vicinity

• We would like to evaluate the practical implications of the method for the maintenance of the profiles

http://lca.epfl.ch/privacy