GUIDING USER GROUPINGS
LEARNING AND COMBINING CLASSIFICATION FOR ITEMSET STRUCTURING

Mathias Verbeke, Ilija Subašić and Bettina Berendt

MUSE, September 24, 2012, Bristol

MOTIVATION

Structuring is natural... also for digital items

CONTRIBUTIONS

1. A new DM approach that learns an intensional model of user groupings and uses this to group new items

‣ Identify structuring dynamics

2. New divergence measure

3. A study of grouping behaviour in a social bookmarking system

STRUCTURING

[Figure, built up over several slides: a user's items are arranged in groups labelled CS, Art and Music, and a new item ("?") has to be placed. In later builds the labels Picasso and Monet appear in place of Art, followed by Helmets and The Roman Empire, and then AI, Pop, Hiphop and Rock; finally another new item ("?") arrives.]

... AND ITS DYNAMICS

• Goal: insight into structuring dynamics, via dynamic conceptual clustering that simulates the intellectual structuring process

• Two types of guides:

1. own prior structuring

2. structuring of peers


AT A GLANCE

Combination of 2 data mining tasks:

1. Learn model of structuring (classification) = intension: set of conditions for an object to belong to a certain class (vs. extension: list of objects in class)

2. Use intension or extension to structure new items

A. based on own structuring

B. based on k peers
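The intension/extension distinction can be made concrete with a small Python sketch. The item titles, the keyword set and the helper names below are hypothetical illustrations and do not come from the talk; the keyword rule merely stands in for a learned classifier.

```python
# Extension: the class is defined by listing its members.
music_extension = {"Abbey Road album review", "History of hip hop"}

# Intension: the class is defined by a condition an item must satisfy.
# This keyword rule is a made-up stand-in for a learned classifier.
MUSIC_KEYWORDS = {"album", "band", "hip", "song"}


def in_extension(title: str) -> bool:
    """Membership by lookup in the explicit list of members."""
    return title in music_extension


def music_intension(title: str) -> bool:
    """Membership by checking the (hypothetical) condition on the item itself."""
    words = set(title.lower().split())
    return bool(words & MUSIC_KEYWORDS)


new_item = "Best song of 2012"
seen_by_extension = in_extension(new_item)     # the list knows only old items
seen_by_intension = music_intension(new_item)  # the condition generalizes
```

The extension cannot say anything about an item it has never listed, whereas the intension can be applied to items that arrive later, which is what makes it usable for structuring new items.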

GROUPING GUIDANCE: BASIC NOTATION

• U: the set of all users (used symbols: u, v, w)

• T: the set of all time points {0, 1, ..., tmax}, where tmax represents the time at which the last item arrives

• D: the set of all items (used symbol: d)

‣ Dut ⊆ D: the set of all d ∈ D already considered by u ∈ U at t ∈ T

‣ dut ∈ (D \ Dut): the item assigned to the structure by user u at t

GROUPING GUIDANCE: GROUPINGS AND CLASSIFIERS

• G: (machine-induced) groupings for each user’s items

• C: classifiers (i.e. intensions) learned for these groupings

‣OG: Observed Grouping

‣GS: Simulated Grouping, guided by self

‣Gn: Simulated Grouping, guided by n peers


INITIAL CLASSIFIER LEARNING

Goal: determine intensional definitions for the user-generated groupings

Each group is regarded as a class for which a definition needs to be calculated

These definitions are then used to assign new items to these groups

CLASSIFIER SELECTION

= selection of peer guides

Requires divergence measure between groupings of non-identical item sets

... but existing measures require large overlap between sets

Inter-guide measure of diversity: [formula not preserved in this transcript]

CLASSIFICATION

Selected classifiers are used to classify the item under consideration

Two cases:

• Self-guided classification

• Peer-guided classification

SELF-GUIDED

[Figure: timeline for user u]

OGu0: observed initial grouping; learned: intensional description via classifier learning from extension

• tu1: item a reaches u; apply classifier OCu0 to item a; learn the new classifier CSu1

• tu2: item b reaches u; apply classifier CSu1 to item b; learn the new classifier CSu2

• tu3: item c reaches u; apply classifier CSu2 to item c; learn the new classifier CSu3

Groupings shown for u:

‣ GSu0 (simulated, self-guided)

‣ G1utmax (simulated, 1 peer)

‣ G5utmax (simulated, 5 peers)

‣ G10utmax (simulated, 10 peers)

‣ G20utmax (simulated, 20 peers)

‣ OGutmax (observed)
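The apply-then-relearn loop of the self-guided case can be sketched as follows. This is a minimal illustration under stated assumptions: the word-overlap classifier, the function names `learn_classifier` and `self_guided`, and the example items are all hypothetical stand-ins, not the classifier or data used in the talk.

```python
from collections import Counter


def learn_classifier(grouping):
    """Learn a toy intension: one word-count profile per group (hypothetical stand-in)."""
    profiles = {
        label: Counter(word for item in items for word in item.split())
        for label, items in grouping.items()
    }

    def classify(item):
        words = item.split()
        # Choose the group whose profile shares the most word occurrences
        # with the item; ties go to the alphabetically first label.
        return max(sorted(profiles),
                   key=lambda label: sum(profiles[label][w] for w in words))

    return classify


def self_guided(initial_grouping, stream):
    """Place each arriving item with the current classifier, then relearn."""
    grouping = {label: list(items) for label, items in initial_grouping.items()}
    classifier = learn_classifier(grouping)      # plays the role of OCu0
    for item in stream:
        label = classifier(item)                 # apply the current classifier
        grouping[label].append(item)
        classifier = learn_classifier(grouping)  # plays the role of CSu1, CSu2, ...
    return grouping


initial = {"art": ["monet painting"], "cs": ["neural network"]}
result = self_guided(initial, ["deep neural model", "deep learning"])
```

In this toy run the second item shares no word with the initial groups, so it is only placed in "cs" because the classifier was relearned after the first item arrived; that is the dynamic the timeline describes.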

PEER-GUIDED

[Figure: timelines for user v and two peers, u and w]

OGv0: observed initial grouping; learned: intensional description via classifier learning from extension

User v:

• tv1: item g reaches v; apply classifier OCu0 to item g; learn the new classifier C1v1

• tv2: item h reaches v; apply classifier CSu1 to item h; learn the new classifier C1v2

• tv3: item i reaches v; apply classifier CSw2 to item i; learn the new classifier C1v3

Peer u:

• tu1: item a reaches u

• tu2: item b reaches u

• tu3: item c reaches u

Peer w:

• tw1: item d reaches w

• tw2: item e reaches w

• tw3: item f reaches w

Groupings shown for v (and, analogously, for u and w):

‣ GSv0 (simulated, self-guided)

‣ G1vtmax (simulated, 1 peer)

‣ G5vtmax (simulated, 5 peers)

‣ G10vtmax (simulated, 10 peers)

‣ G20vtmax (simulated, 20 peers)

‣ OGvtmax (observed)
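The one-peer case of this timeline can be sketched in Python as below. Several things here are assumptions rather than facts from the slides: that the peer's group labels can be used directly in v's grouping, that exactly one already-selected peer classifier is supplied per time step, and the toy word-overlap learner. The names `peer_guided` and `learn_word_overlap` are hypothetical.

```python
def learn_word_overlap(grouping):
    """Toy learner (hypothetical): classify by vocabulary overlap with each group."""
    vocab = {label: {w for item in items for w in item.split()}
             for label, items in grouping.items()}

    def classify(item):
        words = set(item.split())
        # Ties go to the alphabetically first label.
        return max(sorted(vocab), key=lambda label: len(words & vocab[label]))

    return classify


def peer_guided(initial_grouping, stream, peer_classifiers, learn):
    """Place each item with the peer classifier chosen for that step, then relearn v's own."""
    grouping = {label: list(items) for label, items in initial_grouping.items()}
    own_classifiers = []
    for item, peer_classifier in zip(stream, peer_classifiers):
        label = peer_classifier(item)                # e.g. OCu0, CSu1, CSw2
        # Assumption: peer labels are valid labels in v's grouping.
        grouping.setdefault(label, []).append(item)
        own_classifiers.append(learn(grouping))      # e.g. C1v1, C1v2, C1v3
    return grouping, own_classifiers


peer_u = learn_word_overlap({"art": ["monet painting"], "cs": ["neural network"]})
peer_w = learn_word_overlap({"art": ["picasso sketch"], "cs": ["sorting algorithm"]})
v_initial = {"art": ["roman helmet"], "cs": ["compiler design"]}
v_stream = ["network protocol", "monet exhibition", "fast sorting algorithm"]

v_grouping, v_classifiers = peer_guided(
    v_initial, v_stream, [peer_u, peer_u, peer_w], learn_word_overlap)
```

How the guiding peer is chosen at each step is the classifier selection problem and is deliberately left outside this sketch; the list of peer classifiers is simply passed in.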

DATASET

CiteULike dataset

sampled with p-core subgraphs to overcome sparsity

# users 377

# documents 11,400

# tags 12,982

timeframe 01/2009 - 02/2010
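The slides do not define the p-core sampling step. A rough sketch, assuming the usage common in folksonomy research (every user, document and tag must occur at least p times in the retained data) and simplifying posts to individual (user, document, tag) triples, could look like this. The function name `p_core` and the example triples are hypothetical.

```python
from collections import Counter


def p_core(assignments, p):
    """Iteratively drop (user, doc, tag) triples until every remaining
    user, doc and tag occurs in at least p remaining triples.
    Simplification: counts triples rather than whole posts."""
    current = set(assignments)
    while True:
        users = Counter(u for u, _, _ in current)
        docs = Counter(d for _, d, _ in current)
        tags = Counter(t for _, _, t in current)
        kept = {
            (u, d, t) for (u, d, t) in current
            if users[u] >= p and docs[d] >= p and tags[t] >= p
        }
        if kept == current:   # fixed point reached
            return kept
        current = kept


triples = [
    ("u1", "d1", "t1"), ("u1", "d2", "t1"),
    ("u2", "d1", "t1"), ("u2", "d2", "t1"),
    ("u3", "d3", "t2"),   # sparse: user, doc and tag each occur only once
]
core_2 = p_core(triples, 2)
core_3 = p_core(triples, 3)
```

The loop is needed because removing one sparse triple can push other users, documents or tags below the threshold.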

INITIAL GROUPING

Tagging as implicit structuring

• First 7 months to learn initial grouping

• Modularity clustering

Initial classifier learning

• High-dimensional input space (BoW of abstracts)

‣ Naive Bayes
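A Naive Bayes classifier over bag-of-words features can be sketched in plain Python as below. This is a generic multinomial variant with add-one smoothing, chosen here for illustration; the slides do not say which variant, smoothing or preprocessing was used, and the function names and example abstracts are hypothetical. The sketch assumes every group contains at least one document.

```python
import math
from collections import Counter


def train_naive_bayes(grouping):
    """grouping: {group label: [abstract text, ...]} -> (word counts, doc counts, vocabulary)."""
    word_counts = {label: Counter() for label in grouping}
    doc_counts = {label: len(docs) for label, docs in grouping.items()}
    for label, docs in grouping.items():
        for doc in docs:
            word_counts[label].update(doc.lower().split())
    vocabulary = set().union(*word_counts.values())
    return word_counts, doc_counts, vocabulary


def classify(model, text):
    """Return the group with the highest log posterior for the given text."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(word_counts):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)   # log prior
        for word in text.lower().split():
            if word in vocabulary:                          # skip unseen words
                # Add-one (Laplace) smoothed word likelihood.
                score += math.log((word_counts[label][word] + 1)
                                  / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


groups = {
    "cs": ["neural network learning", "graph algorithm complexity"],
    "music": ["rock guitar album", "jazz piano album"],
}
model = train_naive_bayes(groups)
```

Working in log space avoids numeric underflow when abstracts are long, which matters in a high-dimensional bag-of-words setting.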


SIMULATING GROUPINGS

• Groups represented by language models

• Jensen-Shannon divergence as inter-group divergence

• Normalized Mutual Information to compare groupings
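The three ingredients above can be sketched with standard textbook definitions. The details are assumptions: unsmoothed unigram language models, base-2 Jensen-Shannon divergence, and NMI normalized by the mean of the two entropies and computed over two labelings of the same items. The slides do not state which variants were used, and all function names are hypothetical.

```python
import math
from collections import Counter


def language_model(texts):
    """Unigram model: relative word frequencies over a group's texts."""
    counts = Counter(word for text in texts for word in text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two word distributions."""
    words = set(p) | set(q)
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in words}

    def kl(a):
        # Kullback-Leibler divergence from the mixture m to a.
        return sum(pa * math.log2(pa / m[w]) for w, pa in a.items() if pa > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def nmi(labels_a, labels_b):
    """Normalized mutual information of two labelings of the same items."""
    n = len(labels_a)
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    def entropy(counts):
        return -sum(c / n * math.log2(c / n) for c in counts.values())

    mutual = sum(c / n * math.log2(c * n / (count_a[a] * count_b[b]))
                 for (a, b), c in joint.items())
    h_a, h_b = entropy(count_a), entropy(count_b)
    if h_a + h_b == 0:          # both labelings put everything in one group
        return 1.0
    return 2 * mutual / (h_a + h_b)


rock = language_model(["rock guitar solo", "rock band"])
jazz = language_model(["jazz piano trio"])
```

With base-2 logarithms the divergence lies between 0 (identical models) and 1 (no shared words), and NMI does not depend on how the groups are named, only on how the items are partitioned.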


RESULTS


SIMILARITY DISTRIBUTION


CONCLUSIONS

• Investigate and simulate collaborative structuring

‣ Learning and combining classifiers for itemset structuring

• New divergence measure

• Tested on a social bookmarking platform for literature management

CONCLUSIONS: LIMITATIONS

• Observed groupings based on tag assignments

• Simple classifier

... but provides initial insights into grouping behaviour and user behaviour in social bookmarking systems

FUTURE WORK

• Applications: (tag) recommendation and social search

‣ Adds new level to individual and social measures

• Regrouping based on peers

• Hybrid measure for itemset structuring


THANKS!

QUESTIONS?
