Page 1: Folksonomy-Based Collabulary Learning

Folksonomy-Based Collabulary Learning

Leandro Balby Marinho, Krisztian Buza, Lars Schmidt-Thieme
{marinho,buza,schmidt-thieme}@ismll.uni-hildesheim.de

Information Systems and Machine Learning Lab (ISMLL), University of Hildesheim, Germany

Page 2: Folksonomy-Based Collabulary Learning

Motivation Scenario

Classic Music

Bossa Nova

Jazz

Girl from Ipanema

Chill out

Chopin

Page 3: Folksonomy-Based Collabulary Learning

Motivation Scenario

Page 4: Folksonomy-Based Collabulary Learning

Outline

• Problem Definition
• Collabulary Learning
• Folksonomy Enrichment
• Frequent Itemset Mining for Ontology Learning from Folksonomies
• Recommender Systems for Ontology Evaluation
• Experiments and Results
• Conclusions and Future Work

Page 5: Folksonomy-Based Collabulary Learning

Problem Definition

• The Semantic Web suffers from a knowledge bottleneck
• Folksonomies can help
  How? Voluntary annotators
• Annotators need to be educated towards shareable annotation
  How? Through a collabulary

Page 6: Folksonomy-Based Collabulary Learning

Problem Definition

“A possible solution to the shortcomings of folksonomies and controlled vocabulary is a collabulary, which can be conceptualized as a compromise between the two: a team of classification experts collaborates with content consumers to create rich, but more systematic content tagging systems.”

Wikipedia article on Folksonomies (http://en.wikipedia.org/wiki/Folksonomy)

Page 7: Folksonomy-Based Collabulary Learning

Problem Definition

An ontology with concepts [...] and a knowledge base with [...] is called a collabulary over a folksonomy and a domain-expert vocabulary.

Problem: Learn a collabulary that best represents the folksonomy and the domain-expert vocabulary.

Page 8: Folksonomy-Based Collabulary Learning

Collabulary Learning

Input: Folksonomy

User     Tag            Resource
Peter    renaissance    Four Seasons
Anna     musical        Hair
Anna     modern         Hair
Peter    musical        Cats
Anna     spiritual      Hair
Peter    modern         Cats
Anna     classic        Four Seasons
Peter    good-to-hear   Cats
Anna     vivaldi        Four Seasons

Input: Domain-Expert Ontology

music
  opera
  musical

1. Folksonomy enrichment

Enriched folksonomy: user + expert knowledge. The folksonomy table is extended with tag assignments by an expert user:

User     Tag            Resource
expert   musical        Hair
expert   musical        Cats
...      ...            ...

2. Ontology Learning

Collabulary: ontology with user + expert knowledge

music
  modern
  classic
  musical
  ...
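The enrichment step on this slide can be sketched in Python. This is only an illustration, assuming the folksonomy is a set of (user, tag, resource) triples as read from the slide's table, and that the expert ontology can be reduced to a mapping from concept to resources. The `enrich` helper and the `expert_concepts` mapping are hypothetical names, and only the two expert rows legible on the slide are used.

```python
# Tag assignments from the slide as (user, tag, resource) triples.
folksonomy = {
    ("Peter", "renaissance", "Four Seasons"),
    ("Anna", "musical", "Hair"),
    ("Anna", "modern", "Hair"),
    ("Peter", "musical", "Cats"),
    ("Anna", "spiritual", "Hair"),
    ("Peter", "modern", "Cats"),
    ("Anna", "classic", "Four Seasons"),
    ("Peter", "good-to-hear", "Cats"),
    ("Anna", "vivaldi", "Four Seasons"),
}

# Assumption: the expert ontology annotates resources with concepts.
# Only the "musical" rows are legible on the slide.
expert_concepts = {"musical": {"Hair", "Cats"}}


def enrich(folksonomy, expert_concepts, expert_user="expert"):
    """Add one tag assignment by an expert user per (concept, resource) pair."""
    extra = {
        (expert_user, concept, resource)
        for concept, resources in expert_concepts.items()
        for resource in resources
    }
    return folksonomy | extra


enriched = enrich(folksonomy, expert_concepts)
print(len(folksonomy), len(enriched))
```

Treating the expert as one more user keeps the enriched folksonomy in the same triple format, so any ontology learner that accepts a folksonomy can be applied unchanged in step 2.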

Page 9: Folksonomy-Based Collabulary Learning

Folksonomy to trivial ontology

Folksonomy (graph of User, Resource and Tag nodes):
Users: User 1, User 2, User 3, User 4
Resources: Res 1, Res 2, Res 3, Res 5, Res 7, Res 8
Tags: stuff_to_chill, awesome_artists, makes_me_happy

Trivial ontology:

root
  stuff_to_chill: r1, r5
  awesome_artists: r7, r8
  makes_me_happy: r1, r2, r3
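A minimal Python sketch of this conversion follows, assuming the folksonomy is given as (user, tag, resource) triples. Only the grouping of resources under tags is taken from the slide. The users attached to each assignment are illustrative, and `trivial_ontology` is a hypothetical helper name.

```python
from collections import defaultdict

# Illustrative tag assignments; the tag-to-resource grouping follows the slide,
# the user of each assignment is an assumption.
assignments = [
    ("User 1", "stuff_to_chill", "r1"),
    ("User 1", "stuff_to_chill", "r5"),
    ("User 2", "awesome_artists", "r7"),
    ("User 3", "awesome_artists", "r8"),
    ("User 4", "makes_me_happy", "r1"),
    ("User 4", "makes_me_happy", "r2"),
    ("User 2", "makes_me_happy", "r3"),
]


def trivial_ontology(assignments, root="root"):
    """Build a two-level tree: root -> tag -> sorted list of resources."""
    resources_of = defaultdict(set)
    for _user, tag, resource in assignments:
        resources_of[tag].add(resource)
    return {root: {tag: sorted(res) for tag, res in resources_of.items()}}


ontology = trivial_ontology(assignments)
for tag, resources in ontology["root"].items():
    print(tag, resources)
```

Every tag becomes a direct child of the root, so the result carries no more structure than the folksonomy itself; it only serves as a baseline in the same format as the other ontologies.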

Page 10: Folksonomy-Based Collabulary Learning

Matching Concepts

(a) Trivial ontology

root
  stuff_to_chill: r1, r5
  awesome_artists: r7, r8
  makes_me_happy: r1, r2, r3

(b) Domain expert ontology

Concepts: root, rock, jazz, rockabilly, alternative, emo, garage
Resource sets at the leaves: r1, r4 / r5, r6 / r7, r8

Page 11: Folksonomy-Based Collabulary Learning

Additional tag assignments

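Figure: User 1, Res 1 and Res 5 with the tag stuff_to_chill, and the expert concept alternative as an additional tag, shown next to the domain expert ontology (concepts: root, rock, jazz, rockabilly, alternative, emo, garage; resource sets: r1, r4 / r5, r6 / r7, r8).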

Page 12: Folksonomy-Based Collabulary Learning

Expert conceptualization

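Figure: User 1, Res 1 and Res 5 with the tags stuff_to_chill and alternative, shown next to the domain expert ontology (concepts: root, rock, jazz, rockabilly, alternative, emo, garage; resource sets: r1, r4 / r5, r6 / r7, r8). An additional user "Expert" is connected to Res1, Res4, Res5, Res6, Res7 and Res8 with the tags Rockabilly and Emo.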

Page 13: Folksonomy-Based Collabulary Learning

Frequent Itemsets for Learning Ontologies from Folksonomies

• Most of the approaches rely on co-occurrence models
• In sparse structures positive correlations carry essential information about the data
• Project folksonomy to transactional database and use state of the art frequent itemsets mining algorithms
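The projection and mining step can be sketched in Python. The slides do not say how the projection is defined, so this sketch assumes one transaction per resource, holding the set of tags assigned to it. It also uses brute-force counting rather than a state-of-the-art miner, which is only workable on toy data. All names and the sample data are illustrative.

```python
from collections import defaultdict
from itertools import combinations


def to_transactions(assignments):
    """Project (user, tag, resource) triples to one tag set per resource."""
    tags_of = defaultdict(set)
    for _user, tag, resource in assignments:
        tags_of[resource].add(tag)
    return list(tags_of.values())


def frequent_itemsets(transactions, min_support):
    """Count every non-empty subset of every transaction (toy data only)."""
    counts = defaultdict(int)
    for transaction in transactions:
        items = sorted(transaction)
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[frozenset(itemset)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


# Illustrative tag assignments, not taken from the slides.
assignments = [
    ("u1", "rock", "r1"), ("u1", "emo", "r1"),
    ("u2", "rock", "r2"), ("u2", "emo", "r2"),
    ("u3", "rock", "r3"), ("u3", "jazz", "r4"),
]
freq = frequent_itemsets(to_transactions(assignments), min_support=2)
for itemset, support in freq.items():
    print(sorted(itemset), support)
```

On real data the subset enumeration grows exponentially with transaction size, which is why a dedicated algorithm such as Apriori or FP-growth would be used instead.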

Page 14: Folksonomy-Based Collabulary Learning

Frequent Itemsets for Learning Ontologies from Folksonomies

Assumptions for relation extraction from frequent itemsets

High Level Tag
• The more popular a tag is, the more general it is
• A tag x is a super-concept of a tag y if there are frequent itemsets containing both tags such that sup({x}) ≥ sup({y})

Frequency
• The higher the support of an itemset, the more strongly correlated are the items in it

Large Itemset
• Preference is given to items contained in larger itemsets
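The High Level Tag rule above can be written as a short Python sketch. It assumes the mined itemsets are available as a mapping from `frozenset` of tags to support. The function name and the sample supports are hypothetical. The Frequency and Large Itemset preferences are not implemented here, because the slides do not specify how they are combined.

```python
def super_concept_pairs(freq):
    """Return (super, sub) pairs: both tags share a frequent itemset
    and sup({super}) >= sup({sub})."""
    pairs = set()
    for itemset in freq:
        if len(itemset) < 2:
            continue
        for x in itemset:
            for y in itemset:
                if x == y:
                    continue
                if freq.get(frozenset({x}), 0) >= freq.get(frozenset({y}), 0):
                    pairs.add((x, y))
    return pairs


# Illustrative supports, not taken from the slides.
freq = {
    frozenset({"rock"}): 3,
    frozenset({"emo"}): 2,
    frozenset({"rock", "emo"}): 2,
}
relations = super_concept_pairs(freq)
print(relations)
```

When two tags have equal support the rule holds in both directions, so a complete learner needs a tie-breaking step; that choice is left open in this sketch.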

Page 15: Folksonomy-Based Collabulary Learning

Frequent Itemsets for Learning Ontologies from Folksonomies

Page 16: Folksonomy-Based Collabulary Learning

Recommender Systems for Ontology Evaluation

• Ontologies can facilitate browsing, search and information finding in folksonomies
• They should be evaluated in this respect
• Recommender Systems are programs for personalized information finding
• Let the recommender tell which is the best ontology

Page 17: Folksonomy-Based Collabulary Learning

Recommender Systems for Ontology Evaluation

Task: Recommend useful resources

Application: Ontology-based collaborative filtering

Ontologies: A trivial ontology (folksonomy), domain-expert and collabulary

Gold Standard: Test Set

Porzel, R., Malaka, R.: A task-based approach for ontology evaluation. In: Proc. of ECAI 2004, Workshop on Ontology Learning and Population, Valencia, Spain

Page 18: Folksonomy-Based Collabulary Learning

Recommender Systems for Ontology Evaluation

Resource-based profile: User 1 := (res1 := 1)^T

Domain expert ontology: concepts root, rock, jazz, rockabilly, alternative, emo, garage; resource sets r1, r4 / r5, r6 / r7, r8

Ontology-based profile: User := (emo := 53.3, alternative := 26.6, rock := 13.3, root := 6.6)^T

Ziegler, C., Schmidt-Thieme, L., Lausen, G.: Exploiting semantic product descriptions for recommender systems. In: Proc. of the 2nd ACM SIGIR Semantic Web and Information Retrieval Workshop (SWIR 2004), Sheffield, UK
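The profile values on this slide can be reproduced with a small Python sketch. It assumes that Res 1 is classified under emo, that the path to the root is emo, alternative, rock, root (as the profile's entries suggest), that each ancestor receives half the score of its child, and that scores are normalized to sum to 100. The decay of 0.5 is an assumption chosen because it matches the slide's numbers up to rounding; the cited work defines the actual propagation scheme.

```python
# Child -> parent links for the one path that the slide's profile reveals.
parent = {"emo": "alternative", "alternative": "rock", "rock": "root"}


def propagate(leaf, parent, decay=0.5, total=100.0):
    """Spread a resource's score from its concept up to the root."""
    scores = {}
    node, weight = leaf, 1.0
    while node is not None:
        scores[node] = weight
        node = parent.get(node)  # None once the root has been scored
        weight *= decay
    norm = total / sum(scores.values())
    return {concept: score * norm for concept, score in scores.items()}


profile = propagate("emo", parent)
for concept, score in profile.items():
    print(concept, round(score, 2))
```

Users who rated different resources under the same branch obtain overlapping concept profiles, which is what lets collaborative filtering find neighbors even when their rated resources do not overlap.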

Page 19: Folksonomy-Based Collabulary Learning

Experiments and results

Datasets
• Last.fm (folksonomy)
• Musicmoz (domain-expert ontology)

Only the resources contained in both were considered

Dataset     |U|     |T|     |R|     |Y|
Last.fm     3532    7081    982     130899
Musicmoz    -       555     982     -

Page 20: Folksonomy-Based Collabulary Learning

Experiments and results

Folksonomy Enrichment: Edit distance to handle duplications

electro

hip hop

chillout

old skool dance

anything else but death

depeche mode

alternative

heavy metal

experimental rock

electronica

house
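A minimal Python sketch of duplicate detection by edit distance follows. It uses the standard Levenshtein distance; the `is_duplicate` helper and its threshold of 1 are assumptions, since the slides do not state which variant or threshold was used.

```python
def edit_distance(a, b):
    """Levenshtein distance via dynamic programming over two rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca by cb (or match)
            ))
        prev = curr
    return prev[-1]


def is_duplicate(tag_a, tag_b, max_distance=1):
    """Treat two tags as the same if they are within max_distance edits."""
    return edit_distance(tag_a.lower(), tag_b.lower()) <= max_distance


print(edit_distance("chillout", "chill out"))
print(is_duplicate("Chill out", "chillout"))
print(is_duplicate("electro", "electronica"))
```

A small threshold merges spelling variants such as "chillout" and "chill out" while keeping distinct tags such as "electro" and "electronica" apart.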

Page 21: Folksonomy-Based Collabulary Learning

Frequent Itemsets for Learning Ontologies from Folksonomies

Page 22: Folksonomy-Based Collabulary Learning

Frequent Itemsets for Learning Ontologies from Folksonomies

Page 23: Folksonomy-Based Collabulary Learning

Recommender Systems for Ontology Evaluation

• Top-10 best recommendations / Allbut1 protocol
• Neighborhood size 20

Recall := Number of hits / Number of test users

[Chart: Recall]
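The recall measure under an AllBut1 protocol can be sketched in Python. The sketch assumes that one resource is withheld per test user and that a hit means the withheld resource appears in the top-N list. The toy recommender and the test cases are hypothetical stand-ins for the ontology-based collaborative filter, and the neighborhood size is not modeled.

```python
def recall_allbut1(recommend, test_cases, n=10):
    """test_cases: list of (training_resources, held_out_resource) per user."""
    hits = sum(
        1 for train, held_out in test_cases
        if held_out in recommend(train)[:n]
    )
    return hits / len(test_cases)


def toy_recommend(train):
    """Stand-in recommender: a fixed ranking minus already known resources."""
    ranking = ["r1", "r2", "r3", "r4"]
    return [r for r in ranking if r not in train]


# Illustrative test users: (known resources, withheld resource).
cases = [
    ({"r1"}, "r2"),
    ({"r2"}, "r4"),
    ({"r3"}, "r1"),
    ({"r1", "r2"}, "r4"),
]
recall = recall_allbut1(toy_recommend, cases, n=2)
print(recall)
```

Because exactly one resource is withheld per user, the number of hits divided by the number of test users is the fraction of users whose withheld resource was recovered.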

Page 24: Folksonomy-Based Collabulary Learning

Conclusions and Future work

Conclusions
• Folksonomies can alleviate the knowledge bottleneck
• Users need to be educated towards a more shareable vocabulary, though
• Collabularies can help

Our Contributions
• Definition of the collabulary learning problem
• An approach for enriching folksonomies with domain expert knowledge
• A new algorithm for learning ontologies from folksonomies
• A new benchmark for task-based ontology evaluation

Future Work
• Non-taxonomic relations?
• Different enrichment strategies?
• Optimized structure for the task with constraints?

Page 25: Folksonomy-Based Collabulary Learning

Thanks for your attention!

Page 26: Folksonomy-Based Collabulary Learning

Frequent Itemsets for Learning Ontologies from Folksonomies