Transparent User Models for Personalization

Khalid El-Arini, Carnegie Mellon University. Joint work with: Ulrich Paquet, Ralf Herbrich, Jurgen Van Gael, Blaise Agüera y Arcas.

Transcript of Transparent User Models for Personalization

Page 1: Transparent User Models for Personalization

Khalid El-Arini, Carnegie Mellon University

Joint work with: Ulrich Paquet, Ralf Herbrich, Jurgen Van Gael, Blaise Agüera y Arcas

Transparent User Models for Personalization

Page 2: Transparent User Models for Personalization

Personalization is ubiquitous.

Page 3: Transparent User Models for Personalization

• YouTube: 72+ hours/minute of new video
• Facebook: 950 million+ users
• Twitter: 400+ million tweets/day
• Shopping:
  [1994]: 500K unique consumer goods sold in U.S.
  [2010]: Amazon alone offered 24 million.

Personalization is invaluable.

Keyword search is not enough.

Page 4: Transparent User Models for Personalization

Personalization is often wrong.

Page 5: Transparent User Models for Personalization

“Basil…is not a neo-Nazi. Lukas…is not a shadowy stalker. David…is not Korean. […] intent on giving them such labels.”

- J. Zaslow, November 26, 2002

Page 6: Transparent User Models for Personalization

“there's just one way to change its mind: outfox it.” - J. Zaslow, November 26, 2002

What recourse do we have?

Can we do better?

Page 7: Transparent User Models for Personalization

You behave like a vegan hipster

Vegan? Really? Why?

You:
• tweeted with #meatlessmonday
• follow @WholeFoods
• …

We propose an alternative.

Why am I getting this?

Page 8: Transparent User Models for Personalization

We propose an alternative.

Why am I getting this?

You behave like a Brooklyn hipster

Goal: Achieve transparency via interpretable user features, learned from user activity

Page 9: Transparent User Models for Personalization

You behave like a Brooklyn hipster

Goal: Achieve transparency via interpretable user features, learned from user activity

Badges

Page 10: Transparent User Models for Personalization

Approach Model Experiments Summary

Page 11: Transparent User Models for Personalization

1. Define a vocabulary of badges

Apple fanboy, vegan, runner, photographer

Rich, interpretable and explainable

Page 12: Transparent User Models for Personalization

1. Define a vocabulary of badges

2. Identify exemplars

How do I find vegans?

Page 13: Transparent User Models for Personalization

observed label

Take advantage of how users describe themselves
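
A minimal sketch of how exemplars might be found from self-descriptions, assuming each badge is defined by a few label strings that are matched as whole phrases against the user's profile text. The badge vocabulary, the label lists, and the `observed_labels` helper are hypothetical; the slides do not specify the matching rule.

```python
import re

# Hypothetical badge vocabulary: badge name -> label strings to look for.
BADGE_LABELS = {
    "vegan": ["vegan"],
    "runner": ["runner", "marathoner"],
    "Apple fanboy": ["apple fanboy", "apple fan"],
}

def observed_labels(profile_text, badge_labels=BADGE_LABELS):
    """Return the badges whose label appears as a whole phrase in the profile."""
    text = profile_text.lower()
    found = set()
    for badge, labels in badge_labels.items():
        for label in labels:
            # \b keeps a label from matching inside a longer word.
            if re.search(r"\b" + re.escape(label) + r"\b", text):
                found.add(badge)
                break
    return found

# Toy profiles (invented for illustration).
profiles = {
    "alice": "Vegan. Runner. Coffee addict.",
    "bob": "I drink too much and hate vegans.",
    "carol": "Dad, engineer, photographer.",
}
exemplars = {user: observed_labels(bio) for user, bio in profiles.items()}
```

Whole-phrase matching is only one possible choice. A plain substring test would also fire on a bio that merely mentions "vegans", which is one reason to treat observed labels as noisy evidence rather than ground truth.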

Page 14: Transparent User Models for Personalization

Most vegans don’t label themselves as “vegan” on Twitter…

We want to infer the attributes of these users.

Page 15: Transparent User Models for Personalization

1. Define a vocabulary of badges

2. Identify exemplars

3. Model characteristic behavior
• Hashtags: #meatlessmonday
• Retweets: RT @WholeFoods
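
The two action types above can be pulled out of raw tweet text with a short sketch like the following. The regular expressions, the lowercasing, and the function names `extract_actions` and `user_actions` are assumptions made for illustration, not the authors' preprocessing.

```python
import re

HASHTAG = re.compile(r"#(\w+)")
RETWEET = re.compile(r"\bRT @(\w+)")

def extract_actions(tweet):
    """Return the set of actions (hashtags and retweeted accounts) in one tweet."""
    actions = {"#" + tag.lower() for tag in HASHTAG.findall(tweet)}
    actions |= {"RT @" + name.lower() for name in RETWEET.findall(tweet)}
    return actions

def user_actions(tweets):
    """Union of the actions over all of a user's tweets (binary: done or not)."""
    actions = set()
    for tweet in tweets:
        actions |= extract_actions(tweet)
    return actions

# Toy tweets (invented for illustration).
tweets = [
    "Lentil soup tonight #MeatlessMonday",
    "RT @WholeFoods: New seasonal produce is in!",
]
acts = user_actions(tweets)
```

Each user then reduces to a set of binary actions, which matches the later question "Has user u performed action j?".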

Page 16: Transparent User Models for Personalization

Approach Model Experiments Summary

Page 17: Transparent User Models for Personalization

Model sketch

• We have no negative training examples.
  Use a generative model.
• Actions can be explained by multiple badges, even for the same user.
  Noisy-or to combine badges.
• How do we deal with user corrections?
  Observing a latent variable.

Page 18: Transparent User Models for Personalization

B badges, indexed i = 1…B

Page 19: Transparent User Models for Personalization

[Plate diagram: plates u = 1…N, i = 1…B]

N users, indexed u = 1…N

Page 20: Transparent User Models for Personalization

[Plate diagram: plates u = 1…N, i = 1…B, j = 1…F]

F actions, indexed j = 1…F

Page 21: Transparent User Models for Personalization

[Plate diagram: b_i(u); plates u = 1…N, i = 1…B, j = 1…F]

b_i(u): Does user u have badge i?

Page 22: Transparent User Models for Personalization

[Plate diagram: b_i(u), λ_i(u); plates u = 1…N, i = 1…B, j = 1…F]

λ_i(u): Does user u have a label for badge i in his profile?

Page 23: Transparent User Models for Personalization

[Plate diagram: a_j(u), b_i(u), λ_i(u); plates j = 1…F, u = 1…N, i = 1…B]

a_j(u): Has user u performed action j?

Page 24: Transparent User Models for Personalization

[Plate diagram: s_ij, a_j(u), b_i(u), λ_i(u); plates j = 1…F, u = 1…N, i = 1…B]

s_ij: Does badge i explain action j?

Page 25: Transparent User Models for Personalization

[Plate diagram: s_ij, φ_ij, a_j(u), b_i(u), λ_i(u); hyperparameters α_φ, β_φ; plates j = 1…F, u = 1…N, i = 1…B]

φ_ij: What’s the probability that a user with badge i performs action j?

Page 26: Transparent User Models for Personalization

[Plate diagram: s_ij, φ_ij, φ_bg, a_j(u), b_i(u), λ_i(u); hyperparameters α_φ, β_φ; plates j = 1…F, u = 1…N, i = 1…B]

φ_bg: What is the background probability for each action?

Page 27: Transparent User Models for Personalization

[Plate diagram: s_ij, φ_ij, φ_bg, a_j(u), b_i(u), λ_i(u); hyperparameters α_φ, β_φ; plates j = 1…F, u = 1…N, i = 1…B]

Noisy-or: Can at least one of my badges (or the background) explain it?
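
The slides do not write the noisy-or out. One standard way to express it with the variables introduced so far is P(a_j(u) = 1) = 1 − (1 − φ_bg,j) · Π_i (1 − b_i(u) · s_ij · φ_ij): the action fails to occur only if the background and every badge that could explain it all fail. The sketch below assumes that form; the function name and argument layout are hypothetical.

```python
def noisy_or_action_prob(b_u, s_j, phi_j, phi_bg_j):
    """P(a_j(u) = 1) under a noisy-or over the user's badges plus background.

    b_u[i]   -- 1 if user u has badge i, else 0
    s_j[i]   -- 1 if badge i can explain action j, else 0
    phi_j[i] -- probability that a user with badge i performs action j
    phi_bg_j -- background probability of action j
    """
    # Probability that no cause fires, starting with the background.
    p_none = 1.0 - phi_bg_j
    for b, s, phi in zip(b_u, s_j, phi_j):
        if b and s:  # only badges the user has AND that explain j can fire
            p_none *= 1.0 - phi
    return 1.0 - p_none

# Toy numbers (invented): badge 0 is held and explains the action,
# badge 1 is held but does not explain it, badge 2 is not held.
p = noisy_or_action_prob(b_u=[1, 1, 0], s_j=[1, 0, 1],
                         phi_j=[0.5, 0.9, 0.8], phi_bg_j=0.1)
```

With no active badges the probability falls back to the background rate, which is what lets the model account for actions that no badge explains.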

Page 28: Transparent User Models for Personalization

[Plate diagram: s_ij, φ_ij, φ_bg, a_j(u), b_i(u), λ_i(u); hyperparameters α_φ, β_φ; plates j = 1…F, u = 1…N, i = 1…B]

Page 29: Transparent User Models for Personalization

[Plate diagram: s_ij, φ_ij, φ_bg, a_j(u), b_i(u), λ_i(u); hyperparameters α_φ, β_φ; plates j = 1…F, u = 1…N, i = 1…B]

Beta priors to control sparsity

Page 30: Transparent User Models for Personalization

[Plate diagram: s_ij, φ_ij, φ_bg, a_j(u), b_i(u), λ_i(u), γ_i^T, γ_i^F; hyperparameters α_φ, β_φ, α_T, β_T, α_F, β_F; plates j = 1…F, u = 1…N, i = 1…B]

• Beta prior to encode low recall (e.g., 10%)
• Beta prior to encode high precision (e.g., 99.9%)
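
The slides do not give the hyperparameter values. As an illustration of how a Beta prior can encode beliefs like these, the sketch below picks (α, β) from a target mean and a pseudo-count "strength", using the fact that Beta(α, β) has mean α / (α + β). The strength values and the helper names are assumptions.

```python
def beta_from_mean(mean, strength):
    """Return (alpha, beta) for a Beta prior with the given mean.

    `strength` is alpha + beta: larger values give a more concentrated prior.
    """
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if strength <= 0:
        raise ValueError("strength must be positive")
    return mean * strength, (1.0 - mean) * strength

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Hypothetical settings: a prior centred on 10% and one centred on 99.9%.
alpha_low, beta_low = beta_from_mean(0.10, strength=100)
alpha_high, beta_high = beta_from_mean(0.999, strength=1000)
```

A larger strength makes it harder for the data to pull the parameter away from the prior mean, which may be desirable when the self-reported labels are the only supervision available.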

Page 31: Transparent User Models for Personalization

[Plate diagram: η_i, s_ij, φ_ij, φ_bg, a_j(u), b_i(u), λ_i(u), γ_i^T, γ_i^F, ω_i; hyperparameters α_φ, β_φ, α_η, β_η, α_ω, β_ω, α_T, β_T, α_F, β_F; plates j = 1…F, u = 1…N, i = 1…B]

Page 32: Transparent User Models for Personalization

Inference

[Diagram labels: s_ij, φ_ij, φ_bg, b_i(u)]

• Collapsed Gibbs sampler (with MH steps)
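
The sampler itself is not spelled out in the slides. The sketch below is not the collapsed sampler with MH steps; it is a plain Gibbs update for a single b_i(u) in a simplified setting where every other quantity is held fixed. It assumes a noisy-or likelihood for actions, reads ω_i as the prior probability of badge i, and reads γ_i^T and γ_i^F as the probability of a profile label given that the badge is present or absent. All of these readings, and every name in the code, are assumptions.

```python
import math
import random

def action_prob(b_u, j, p):
    """Noisy-or probability that the user performs action j."""
    p_none = 1.0 - p["phi_bg"][j]
    for i, has_badge in enumerate(b_u):
        if has_badge and p["s"][i][j]:
            p_none *= 1.0 - p["phi"][i][j]
    return 1.0 - p_none

def log_likelihood(b_u, a_u, p):
    """Log-probability of the user's binary action vector a_u given badges b_u."""
    total = 0.0
    for j, performed in enumerate(a_u):
        q = action_prob(b_u, j, p)
        total += math.log(q if performed else 1.0 - q)
    return total

def badge_posterior(i, b_u, a_u, label, p):
    """P(b_i(u) = 1 | everything else). Assumes all probabilities lie in (0, 1)."""
    log_w = []
    for value in (0, 1):
        b = list(b_u)
        b[i] = value
        prior = p["omega"][i] if value else 1.0 - p["omega"][i]
        p_label = p["gamma_t"][i] if value else p["gamma_f"][i]
        lw = math.log(prior)
        lw += math.log(p_label if label else 1.0 - p_label)
        lw += log_likelihood(b, a_u, p)
        log_w.append(lw)
    m = max(log_w)  # subtract the max before exponentiating, for stability
    w0, w1 = math.exp(log_w[0] - m), math.exp(log_w[1] - m)
    return w1 / (w0 + w1)

def gibbs_update(i, b_u, a_u, label, p, rng, correction=None):
    """Resample b_i(u) in place; a user correction is treated as observed."""
    if correction is not None:
        b_u[i] = correction
    else:
        b_u[i] = 1 if rng.random() < badge_posterior(i, b_u, a_u, label, p) else 0
    return b_u[i]

# Toy setting (invented): one badge, two actions, no profile label.
params = {
    "s": [[1, 0]], "phi": [[0.8, 0.5]], "phi_bg": [0.05, 0.05],
    "omega": [0.1], "gamma_t": [0.1], "gamma_f": [0.001],
}
post = badge_posterior(0, [0], [1, 0], 0, params)
```

In this toy setting, performing the one action that the badge explains lifts the posterior well above the prior ω even without a profile label. The `correction` argument illustrates "observing a latent variable": when the user confirms or rejects a badge, its value is fixed rather than sampled.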

Page 33: Transparent User Models for Personalization

[Plate diagram: η_i, s_ij, φ_ij, φ_bg, a_j(u), b_i(u), λ_i(u), γ_i^T, γ_i^F, ω_i; hyperparameters α_φ, β_φ, α_η, β_η, α_ω, β_ω, α_T, β_T, α_F, β_F; plates j = 1…F, u = 1…N, i = 1…B]

You behave like a vegan hipster.

Page 34: Transparent User Models for Personalization

[Plate diagram: η_i, s_ij, φ_ij, φ_bg, a_j(u), b_i(u), λ_i(u), γ_i^T, γ_i^F, ω_i; hyperparameters α_φ, β_φ, α_η, β_η, α_ω, β_ω, α_T, β_T, α_F, β_F; plates j = 1…F, u = 1…N, i = 1…B]

You behave like a vegan hipster.

Page 35: Transparent User Models for Personalization

Approach Model Experiments Summary

Page 36: Transparent User Models for Personalization

Data description

• Start with 7 million Twitter users
• Manually define 31 sample badges by specifying labels

Page 37: Transparent User Models for Personalization

Data description

• Start with 7 million Twitter users
• Manually define 31 sample badges by specifying labels
• Gather 2 million tweets from August 2011
• Recall: actions are hashtags and retweets

Remove infrequent actions and inactive users, leaving us with:

75,880 users
32,030 actions
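
The filtering step could look like the sketch below, which drops actions performed by too few users and then drops users left with too few actions. The thresholds, the single-pass strategy, and the name `filter_sparse` are assumptions; the slides do not state the cutoffs that were used.

```python
from collections import Counter

def filter_sparse(user_actions, min_users_per_action=2, min_actions_per_user=1):
    """One pass: drop rare actions, then drop users left with too few actions.

    user_actions maps a user id to the set of actions that user performed.
    Returns (filtered user -> actions mapping, set of kept actions).
    """
    # Count how many distinct users performed each action.
    counts = Counter(a for acts in user_actions.values() for a in acts)
    kept_actions = {a for a, c in counts.items() if c >= min_users_per_action}

    filtered = {}
    for user, acts in user_actions.items():
        kept = acts & kept_actions
        if len(kept) >= min_actions_per_user:
            filtered[user] = kept
    return filtered, kept_actions

# Toy data (invented for illustration).
data = {
    "u1": {"#a", "#b"},
    "u2": {"#a"},
    "u3": {"#c"},
}
filtered, kept = filter_sparse(data)
```

Removing users can push some actions back below the frequency threshold, so a stricter variant would repeat the two steps until nothing changes.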

Page 38: Transparent User Models for Personalization

Badge statistics

[Bar chart over the 31 badges (x-axis: Badges; y-axis: 0 to 20,000). Labeled badges: artist, photographer, country music fan, book worm.]

Page 39: Transparent User Models for Personalization

Can we learn badges?

Page 40: Transparent User Models for Personalization

Vegetarian badge

Page 41: Transparent User Models for Personalization

Runner badge

Page 42: Transparent User Models for Personalization

Hacker badge

Page 43: Transparent User Models for Personalization

Manchester United badge

Page 44: Transparent User Models for Personalization

Do all badges look this good?

No, but most do.

Page 45: Transparent User Models for Personalization

wine lover

Over-generalized

Page 46: Transparent User Models for Personalization

Overwhelmed

Ruby on Rails

Page 47: Transparent User Models for Personalization

Can we just use the labels directly?

Page 48: Transparent User Models for Personalization

Inferred Apple fanboy badge

Self-described Apple fanboys

Page 49: Transparent User Models for Personalization

Comparative Analysis

• Compare to labeled LDA [Ramage+ 2009]
  – LDA extension where each document is labeled with multiple tags
  – One-to-one mapping between topics and tags
  – Document explained only by topics associated with its tags
• Hold out random 10% of labels, treat as ground truth, and try to predict them
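
A minimal sketch of the hold-out protocol, assuming labels are stored as (user, badge) pairs and that a model produces a score per badge for each user. The scoring dictionaries, the tie-breaking rule, and the function names are hypothetical; the slides state only that 10% of labels are held out and then predicted.

```python
import random

def hold_out_labels(labels, fraction=0.10, seed=0):
    """Split (user, badge) label pairs into (kept, held_out) sets."""
    rng = random.Random(seed)
    pairs = sorted(labels)  # sort first so the split is reproducible
    rng.shuffle(pairs)
    n_held = max(1, round(fraction * len(pairs)))
    return set(pairs[n_held:]), set(pairs[:n_held])

def rank_of_badge(scores, badge):
    """1-based rank of `badge` when badges are sorted by descending score."""
    ordered = sorted(scores, key=lambda b: (-scores[b], b))
    return ordered.index(badge) + 1

def mean_held_out_rank(held_out, user_scores):
    """Average rank of each held-out badge among that user's scored badges."""
    ranks = [rank_of_badge(user_scores[u], b) for u, b in held_out]
    return sum(ranks) / len(ranks)

# Toy data (invented): 20 labels, and one user's badge scores.
labels = {("user%d" % k, "vegan") for k in range(20)}
kept, held = hold_out_labels(labels)
scores = {"u": {"vegan": 0.9, "runner": 0.2, "hacker": 0.5}}
avg_rank = mean_held_out_rank({("u", "vegan"), ("u", "runner")}, scores)
```

A lower mean rank means the held-out labels sit nearer the top of each user's predicted badge list.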

Page 50: Transparent User Models for Personalization

Better predictive performance

[Plot: rank of held-out labels; arrow labeled “better”]

Page 51: Transparent User Models for Personalization

Better predictions for active users

[Plot: arrow labeled “better”]

Page 52: Transparent User Models for Personalization

Sparse badges

[Panels: Apple fanboy (badges); Apple fanboy (L-LDA)]

Page 53: Transparent User Models for Personalization

Approach Model Experiments Summary

Page 54: Transparent User Models for Personalization

Leveraged how users describe themselves

Page 55: Transparent User Models for Personalization

Leveraged how users describe themselves to build interpretable user features

You behave like a vegan hipster

Page 56: Transparent User Models for Personalization

Empirically showed we can infer a user’s attributes from his behavior

Page 57: Transparent User Models for Personalization

Thank you

Page 58: Transparent User Models for Personalization

What recourse do we have?

Collaborative filtering

Content-based filtering

Can we do better?

Page 59: Transparent User Models for Personalization

Most vegans don’t label themselves as “vegan” on Twitter…

…but what about non-vegans?

“I drink too much and hate vegans.”