Semantic tagging for crowd computing - SEBD 2010

10
SEBD 2010 - Holiday Inn Hotel, Rimini June 21, 2010 Semantic tagging for crowd computing Roberto Mirizzi 1 , Azzurra Ragone 1,2 , Tommaso Di Noia 1 , Eugenio Di Sciascio 1 1 Politecnico di Bari Via Orabona, 4 70125 Bari (ITALY) 2 University of Trento Via Sommarive, 14 38100 Trento (ITALY)

description

The recent blow up of crowd computing initiatives on the web calls for smarter methodologies and tools to annotate, query and explore repositories. There is the need for scalable techniques able to return also approximate results with respect to a given query as a ranked set of promising alternatives. In this paper we concentrate on annotation and retrieval of software components, exploiting semantic tagging relying on DBpedia. We propose a new hybrid methodology to rank resources in this dataset. Inputs of our ranking system are (i) the DBpedia dataset; (ii) external information sources such as classical search engine results, social tagging systems and wikipedia-related information. We compare our approach with other RDF similarity measures, proving the validity of our algorithm with an extensive evaluation involving real users. The system is available at http://sisinflab.poliba.it/not-only-tag/ SEBD 2010 - 18th Italian Symposium on Advanced Database Systems presented by Roberto Mirizzi (http://sisinflab.poliba.it/mirizzi - roberto.mirizzi -at- gmail.com) Rimini (ITALY), June 20-23, 2010

Transcript of Semantic tagging for crowd computing - SEBD 2010

Page 1: Semantic tagging for crowd computing - SEBD 2010

SEBD 2010 - Holiday Inn Hotel, RiminiJune 21, 2010

Semantic tagging for crowd computing

Roberto Mirizzi1, Azzurra Ragone1,2, Tommaso Di Noia1, Eugenio Di Sciascio1

1Politecnico di BariVia Orabona, 470125 Bari (ITALY)

2University of TrentoVia Sommarive, 14

38100 Trento (ITALY)

Page 3: Semantic tagging for crowd computing - SEBD 2010

SEBD 2010 - Holiday Inn Hotel, RiminiJune 21, 2010

Why not to use Semantic tags?

Plugged into the Web 3.0DisambiguationRelations among tagsMachine understandable

NOT: Not Only Tag

http://sisinflab.poliba.it/not-only-tag/

Page 4: Semantic tagging for crowd computing - SEBD 2010

SEBD 2010 - Holiday Inn Hotel, RiminiJune 21, 2010

DBpedia-Ranker: architectureSystem architecture

Linked Data graph exploration

Rank nodes exploiting external information

Store results as pairs of nodes together with their similarity

Start typing a tag

Query the system for relevant tags (corresponding to DBpedia resources)

Show the semantic tag cloud

1

2

3

1

2

3

SPARQL

Runtime searchOffline classification

1

External Information

Sources

2 3

1

2 3

STORAGE

Delicious

Yahoo!

Google

GRAPH EXPLORER

RANKER

DBpedia

TAGS

WEB INTERFACE

Bing

Page 5: Semantic tagging for crowd computing - SEBD 2010

SEBD 2010 - Holiday Inn Hotel, RiminiJune 21, 2010

DBpedia-Ranker: ranking

?r1 ?r2isSimilar

v

hasValue

)(

),(

)(

),(),(

2

21

1

2121 rf

rrf

rf

rrfrrsim

),()()(

),(),(

2121

2121 rrfrfrf

rrfrrcoOcc

)}(log),(min{loglog

),(log)(log),(logmax),(

21

212121 rfrfN

rrfrfrfrrngd

viceversaand r and rbetween wikilink,2

saor vicever r and rbetween k wikilin,1

r and rbetween wikilink no ,0

),(

21

21

21

21 rrorewikilinkSc

)(

),(),(

2

1221 rl

rrlrroreabstractSc

Page 6: Semantic tagging for crowd computing - SEBD 2010

SEBD 2010 - Holiday Inn Hotel, RiminiJune 21, 2010

DBpedia-Ranker: context analysis

The same similarity measure is used in the context analysis

?r1

?c1

belongsTo

v

hasValue

?c2

?c…

?cN

C

Example:

C = {Programming Languages, Databases, Software}

Does Dennis Ritchie belongs to the given context?

Algorithm:

If(v>THRESHOLD) then r1 belongs to the context; add r1 to the graph exploration queueElse r1 does not belong to the context; exclude r1 from graph explorationEndIf

Page 7: Semantic tagging for crowd computing - SEBD 2010

SEBD 2010 - Holiday Inn Hotel, RiminiJune 21, 2010

Evaluation (I)

http://sisinflab.poliba.it/evaluation

Page 8: Semantic tagging for crowd computing - SEBD 2010

SEBD 2010 - Holiday Inn Hotel, RiminiJune 21, 2010

Evaluation (II)

http://sisinflab.poliba.it/evaluation/data

Page 9: Semantic tagging for crowd computing - SEBD 2010

SEBD 2010 - Holiday Inn Hotel, RiminiJune 21, 2010

Future work

Test our algorithms with different domains

Extract more fine grained contexts

Enrich the extracted context using also relevant properties

Integrate our approach with real existing systems

Use the core system to automatically extract relevant tags (concepts) from a document (or from a collection of documents) exploiting tools for named entities extraction

Page 10: Semantic tagging for crowd computing - SEBD 2010

SEBD 2010 - Holiday Inn Hotel, RiminiJune 21, 2010

Q&A

Semantic tagging for crowd computingRoberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio

[email protected], [email protected], {ragone,dinoia,disciascio}@poliba.it

Thank you for your attention!