SAnno: a unifying framework for semantic annotation

SAnno: a unifying framework for semantic annotation Davide Eynard IDSIA, 01/06/2010

A talk presentation of SAnno at IDSIA, 2010/06/01.

Transcript of SAnno: a unifying framework for semantic annotation

Page 1: SAnno: a unifying framework for semantic annotation

SAnno: a unifying framework for semantic annotation

Davide Eynard

IDSIA, 01/06/2010

Page 2: SAnno: a unifying framework for semantic annotation

Introduction

• S(emantic)Anno(tations)
  • … in Italian, “sanno” also means “they know”

• Basic principle: anyone should be able to say anything about anything else
  • Well, this should hold in general :-)
  • Actually, in our case it is “anything about any URI”
  • And we would like everyone to say that in a formal way

• But first, a little step back in time...

Page 3: SAnno: a unifying framework for semantic annotation

Participation and semantics

[Figure: diagram with the labels “Data” and “Structure”]

Page 4: SAnno: a unifying framework for semantic annotation

SAnno's grandfather: SpeakinAbout [1]

• Purpose: produce semantic annotations about named entities
  • When you read “Harry Potter”, is it the book or the movie?

• Plays with user gratifications
  • When users annotate a string as matching a specific concept, they are shown a list of services/search engines related to it

• Relies on user-provided data:
  • Freebase types
  • User-generated search templates, built inside a wiki system

Page 5: SAnno: a unifying framework for semantic annotation

SAnno's grandfather: SpeakinAbout [1]

Page 6: SAnno: a unifying framework for semantic annotation

SAnno's grandfather: SpeakinAbout [1]

Page 7: SAnno: a unifying framework for semantic annotation

SAnno's father: RDFMonkey [2]

• Purpose: augment the browsing experience by providing information/services related to the visited URL

• Relies on Freebase types
  • … as in SpeakinAbout, but without requiring user interaction
  • Types are found by searching backlinks in Freebase (i.e. which topics link to the visited page)

• Related services as widgets inside a browser extension
  • The app could load widgets at runtime (from Freebase itself or another collaborative system)

Page 8: SAnno: a unifying framework for semantic annotation

SAnno's father: RDFMonkey [2]

[Figure: panels labeled “Musical Artists”, “Cities”, and “Books”]

Page 9: SAnno: a unifying framework for semantic annotation

The problem

• We already have semantics on the annotation (e.g. Annotea), but how can we have semantics within the annotation?

• Good starting points:
  • Some participative systems already provide semi-structured information (e.g. infoboxes in Wikipedia)
  • Some communities of practice have already built their own bottom-up way to structure information (e.g. machine tags)
  • Some (relatively new) systems make it possible, with some additional effort, to save information in a structured way almost without requiring users to be aware of it (e.g. semantic wikis)

• Challenges:
  • Provide a shared way to describe annotations coming from heterogeneous systems
  • Aggregate this information to provide something new and useful

Page 10: SAnno: a unifying framework for semantic annotation

SAnno as a framework

• SAnno is made up of many different parts which, taken together, provide something (we consider) new and useful

• An ontology to describe annotations (the “shells” that contain metadata about a resource)

• An ontology describing the types of properties we are already able to aggregate

• A set of conversion tools which are able to translate existing annotations from other systems into our notation

• A system to show the results of the aggregation of different annotations

• A system to manage provenance, authorship, and filters on incoming annotations

Page 11: SAnno: a unifying framework for semantic annotation

The annotations ontology

• Every annotation can be considered as a “Post-it”, a piece of paper where something is written about something else

• … you can say things about what is written there, but also about the Post-it itself

• The annotation is about a resource; it is created by someone on a specific date, comes from a particular annotation system, and might be connected to a specific community

• Main goal: do not reinvent everything from scratch
  • Reuse well-known ontologies such as DC, SIOC, etc.
  • Use named graphs as an alternative to reification

• Start in an easy way: restriction to URLs
  • Also a way to provide instant gratification to users: show annotations while they are browsing a website
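The “Post-it” idea can be illustrated with a minimal sketch, which is not the actual SAnno ontology. Each annotation is a named graph: the statements written on the Post-it live inside the graph, while statements about the Post-it itself (resource, creator, date, originating system) use the graph URI as their subject. The `creator` and `created` terms come from Dublin Core; the `sanno#` namespace, the `annotates`, `sourceSystem`, and `tag` properties, and all example URIs are assumptions made up for this illustration.

```python
from dataclasses import dataclass, field

DCT = "http://purl.org/dc/terms/"     # Dublin Core terms
SANNO = "http://example.org/sanno#"   # hypothetical namespace, for illustration only


@dataclass
class Annotation:
    graph: str     # URI naming the graph, i.e. the Post-it itself
    about: str     # URL of the annotated resource
    creator: str
    created: str   # ISO 8601 date
    system: str    # originating annotation system
    content: list = field(default_factory=list)  # (s, p, o) triples written on the Post-it

    def quads(self):
        """Return (s, p, o, graph) tuples; graph None stands for the default graph."""
        about_the_post_it = [
            (self.graph, SANNO + "annotates", self.about, None),
            (self.graph, DCT + "creator", self.creator, None),
            (self.graph, DCT + "created", self.created, None),
            (self.graph, SANNO + "sourceSystem", self.system, None),
        ]
        # Content triples are placed inside the named graph instead of being reified.
        written_on_it = [(s, p, o, self.graph) for (s, p, o) in self.content]
        return about_the_post_it + written_on_it


note = Annotation(
    graph="http://example.org/annotations/1",
    about="http://example.org/page.html",
    creator="http://example.org/users/alice",
    created="2010-06-01",
    system="http://example.org/systems/wiki",
    content=[("http://example.org/page.html", SANNO + "tag", "semantics")],
)
for quad in note.quads():
    print(quad)
```

With reification, every content triple would need several extra statements to be talked about; here one graph URI is enough to attach provenance to all of them at once.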

Page 12: SAnno: a unifying framework for semantic annotation

The aggregation ontology

• Aggregation deals with the contents of the annotation (i.e. the triples found in the named graph)

• Objectives
  • Avoid constraining users to a specific vocabulary for annotations
  • Find a way to collect different annotations and provide something new and interesting by aggregating them

• Our approach
  • Properties used inside annotations can be described as belonging to families we already know how to deal with
  • Examples: very specific (tags, ratings), more general (transitive relations)
  • Properties inside some external vocabulary are mapped as subproperties of ours
  • … by whom? Experienced users who have incentives to do this (think about users building templates in Wikipedia...)
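As a rough sketch of the approach above, the mapping from external properties to known families can be treated as a lookup table, with one aggregation rule per family. Everything named here is hypothetical: the property URIs, the `FAMILY_OF` table (standing in for subproperty statements written by experienced users), and the choice of counting tags and averaging ratings. The average also assumes that all ratings share the same scale.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical mapping: external property -> family we know how to aggregate.
# In an RDF setting this would be expressed with subproperty statements.
FAMILY_OF = {
    "http://example.org/bookmarks#tag": "tag",
    "http://example.org/wiki#keyword": "tag",
    "http://example.org/wiki#stars": "rating",
}


def aggregate(triples):
    """Group (subject, property, value) triples by family and summarize them."""
    tags = defaultdict(Counter)
    ratings = defaultdict(list)
    for subject, prop, value in triples:
        family = FAMILY_OF.get(prop)
        if family == "tag":
            tags[subject][value] += 1
        elif family == "rating":
            ratings[subject].append(float(value))
        # Properties that belong to no known family are skipped.
    summary = {}
    for resource in set(tags) | set(ratings):
        scores = ratings.get(resource, [])
        summary[resource] = {
            "tags": dict(tags.get(resource, {})),
            "avg_rating": mean(scores) if scores else None,
        }
    return summary


PAGE = "http://example.org/page.html"
triples = [
    (PAGE, "http://example.org/bookmarks#tag", "semantic-web"),
    (PAGE, "http://example.org/wiki#keyword", "semantic-web"),
    (PAGE, "http://example.org/wiki#keyword", "annotation"),
    (PAGE, "http://example.org/wiki#stars", "3"),
    (PAGE, "http://example.org/wiki#stars", "5"),
    (PAGE, "http://example.org/other#unmapped", "ignored"),
]
print(aggregate(triples))
```

Users keep their own vocabularies; only the mapping table has to grow when a new source is added.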

Page 13: SAnno: a unifying framework for semantic annotation

Conversion tools

• Our worst enemy: the bootstrap
  • Who is going to annotate the first resources? I don't have time!

• Our best friends: already existing annotation systems
  • Why don't we convert existing data to our notation and show the advantages of our approach?

• Different families of conversion tools
  • Easy: already existing APIs with real-time search functionality (e.g. del.icio.us)
  • Medium: conversions from existing structured repositories such as SPARQL endpoints (advantage: the conversion is very clean; you just need one tool and different CONSTRUCT queries)
  • A little harder: Web scraping when no other sources are available
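The “one tool, different CONSTRUCTs” point for the medium family could look roughly like the sketch below: the converter code stays the same and only the query changes per source. The source names, vocabularies, and endpoint URL are invented for illustration, and the sketch only builds the HTTP request (a GET with a `query` parameter, as in the SPARQL protocol) without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# One CONSTRUCT query per source; all vocabularies here are hypothetical.
CONSTRUCTS = {
    "source-a": """
CONSTRUCT { ?page <http://example.org/sanno#tag> ?label }
WHERE     { ?page <http://example.org/source-a#keyword> ?label }
""",
    "source-b": """
CONSTRUCT { ?page <http://example.org/sanno#rating> ?score }
WHERE     { ?page <http://example.org/source-b#stars> ?score }
""",
}


def build_request(endpoint, source):
    """Build (but do not send) a SPARQL protocol GET request for one source."""
    query = CONSTRUCTS[source].strip()
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for Turtle; what an endpoint can actually return depends on the endpoint.
    return Request(url, headers={"Accept": "text/turtle"})


request = build_request("http://example.org/sparql", "source-a")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would run the conversion against a real endpoint; adding a new source then means adding one more entry to `CONSTRUCTS`.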

Page 14: SAnno: a unifying framework for semantic annotation

Annotation client

• Actually, we have two possible clients in mind:
  • a browser extension which shows annotations while users are browsing the Web
  • an independent service which is able to aggregate heterogeneous information related to similar resources (e.g. URLs marked as being MP3 files)

• Filter annotations according to author, date, originating system, and community

• Users should be able to “subscribe” to some annotating communities and ignore others

• The system is designed to be distributed, as data can come from different, unrelated sources
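The filtering and subscription behaviour described above could be sketched as follows. This is only an illustration under assumed names: the `AnnotationInfo` record, the `select` function, and its parameters are hypothetical, and the slides note that the prototype has no subscriptions yet.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AnnotationInfo:
    author: str
    created: date
    system: str      # originating annotation system
    community: str


def select(annotations, subscribed, since=None, systems=None, ignored_authors=frozenset()):
    """Keep annotations from subscribed communities that pass the optional filters."""
    kept = []
    for a in annotations:
        if a.community not in subscribed:
            continue  # the user ignores this community
        if since is not None and a.created < since:
            continue  # too old
        if systems is not None and a.system not in systems:
            continue  # comes from a system the user filtered out
        if a.author in ignored_authors:
            continue
        kept.append(a)
    return kept


annotations = [
    AnnotationInfo("alice", date(2010, 5, 20), "wiki", "music-fans"),
    AnnotationInfo("bob", date(2010, 1, 10), "bookmarks", "music-fans"),
    AnnotationInfo("carol", date(2010, 5, 25), "bookmarks", "spammers"),
]
for a in select(annotations, subscribed={"music-fans"}, since=date(2010, 3, 1)):
    print(a.author, a.system)
```

Because the filter only looks at metadata about each annotation, it works the same way regardless of which source the annotation came from.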

Page 15: SAnno: a unifying framework for semantic annotation

The prototype

• Early annotation ontology
• Property families: tag, rating, generically related URI
• Conversions from SMW, Delicious
• Visualization as a web service + Firefox extension
• No subscriptions yet

Page 16: SAnno: a unifying framework for semantic annotation

The prototype

Page 17: SAnno: a unifying framework for semantic annotation

The end

Thank you! Questions?

References:

• [0] D. Laniado, D. Eynard and M. Colombetti. Using WordNet to turn a folksonomy into a hierarchy of concepts. Semantic Web Application and Perspectives 192–201, 2007.

• [1] D. Eynard and M. Colombetti. Exploiting User Gratification for Collaborative Semantic Annotation. Proceedings of SWUI 2008. April 2008.

• [2] D. Eynard. Using semantics and user participation to customize personalization. HP Labs Technical Report HPL-2008-197. September 2008.

• [3] L. Mazzola, D. Eynard and R. Mazza. GVIS: a framework for graphical mashups of heterogeneous sources to support data interpretation. HSI 2010. May 2010.

Page 18: SAnno: a unifying framework for semantic annotation

Contact Davide Eynard

[email protected]

http://davide.eynard.it

Tel. 02 2399 4010

Fax 02 2399 3411

Project page @AIRLab: http://airwiki.elet.polimi.it