SAnno: a unifying framework for semantic annotation
Davide Eynard
IDSIA, 01/06/2010
Introduction
• S(emantic)Anno(tations)
  • … in Italian, “sanno” also means “they know”
• Basic principle: anyone should be able to say anything about anything else
  • Well, this should hold in general :-)
  • Actually, in our case it is “anything about any URI”
  • And we would like everyone to say that in a formal way
• But first, a little step back in time...
Participation and semantics
[Figure; labels: Data, Structure]
SAnno's grandfather: SpeakinAbout [1]
• Purpose: produce semantic annotations about named entities
  • When you read “Harry Potter”, is it the book or the movie?
• Plays on user gratification
  • When users annotate a string as matching a specific concept, they are shown a list of services/search engines which are related to it
• Relies on user-provided data:
  • Freebase types
  • User-generated search templates, built inside a wiki system
SAnno's father: RDFMonkey [2]
• Purpose: augment the browsing experience by providing information/services related to the visited URL
• Relies on Freebase types
  • … as in SpeakinAbout, but without requiring user interaction
  • Types are found by searching backlinks in Freebase (which topics are linking the visited page)
• Related services as widgets inside a browser extension
  • The app could load widgets at runtime (from Freebase itself or another collaborative system)
[Figure: RDFMonkey examples; labels: Musical Artists, Cities, Books]
The problem
• We already have semantics on the annotation (e.g. Annotea), but how can we have semantics within the annotation?
• Good starting points:
  • Some participative systems already provide semi-structured information (e.g. infoboxes in Wikipedia)
  • Some communities of practice have already built their own bottom-up way to structure information (e.g. machine tags)
  • Some (relatively new) systems let users, with some additional effort, save information in a structured way almost without knowing it (e.g. semantic wikis)
• Challenges:
  • Provide a shared way to describe annotations coming from heterogeneous systems
  • Aggregate this information to provide something new and useful
SAnno as a framework
• SAnno is made up of many different parts, which together provide something (we consider) new and useful
• An ontology to describe annotations (the “shells” that contain metadata about a resource)
• An ontology describing the types of properties we are already able to aggregate
• A set of conversion tools which are able to translate existing annotations from other systems into our notation
• A system to show the results of the aggregation of different annotations
• A system to manage provenance, authorship, and filters on incoming annotations
The annotations ontology
• Every annotation can be considered as a “Post-it”, a piece of paper where something is written about something else
  • … you can say things about what is written there, but also about the Post-it itself
• The annotation is about a resource; it is created by someone on a specific date, comes from a particular annotation system, and might be connected to a specific community
• Main goal: do not reinvent everything from scratch
  • Reuse well-known ontologies such as DC, SIOC, etc.
  • Use named graphs as an alternative to reification
• Start in an easy way: restriction to URLs
  • Also a way to provide instant gratification to users: show annotations while they are browsing a website
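The “Post-it” idea above can be sketched as a small data structure: one named graph holding the content triples, plus statements about the graph itself. This is only an illustrative sketch, not the actual SAnno ontology. The class name `Annotation`, the `ex:` property names, and the example URIs are all assumptions; only the use of Dublin Core for creator and date follows the slide.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Tuple

# A triple is (subject, predicate, object), kept as plain strings for brevity.
Triple = Tuple[str, str, str]


@dataclass
class Annotation:
    """One "Post-it": metadata about the note plus the triples written on it."""
    graph_uri: str                 # name of the named graph holding the content
    about: str                     # the annotated resource (a URL)
    creator: str
    created: date
    source_system: str
    community: Optional[str] = None
    content: List[Triple] = field(default_factory=list)

    def metadata_triples(self) -> List[Triple]:
        """Statements about the Post-it itself, with the graph URI as subject."""
        triples = [
            (self.graph_uri, "ex:annotates", self.about),          # hypothetical property
            (self.graph_uri, "dc:creator", self.creator),
            (self.graph_uri, "dc:date", self.created.isoformat()),
            (self.graph_uri, "ex:sourceSystem", self.source_system),  # hypothetical property
        ]
        if self.community is not None:
            triples.append((self.graph_uri, "ex:community", self.community))
        return triples


note = Annotation(
    graph_uri="http://example.org/graphs/1",
    about="http://example.org/page",
    creator="http://example.org/users/alice",
    created=date(2010, 6, 1),
    source_system="delicious",
    content=[("http://example.org/page", "ex:tag", "semweb")],
)
```

Keeping the metadata keyed on the graph URI rather than on each content triple is what makes named graphs lighter than reification: one set of statements describes the whole note.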
The aggregation ontology
• Aggregation deals with the contents of the annotation (i.e. the triples found in the named graph)
• Objectives:
  • Avoid constraining users to a specific vocabulary for annotations
  • Find a way to collect different annotations and provide something new and interesting by aggregating them
• Our approach:
  • Properties used inside annotations can be described as belonging to families we already know how to deal with
    • Examples: very specific (tags, ratings), more general (transitive relations)
  • Properties from an external vocabulary are mapped as subproperties of ours
    • … by whom? Experienced users who have incentives to do this (think about users building templates in Wikipedia...)
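The family-based approach above might look like the following sketch: external properties are mapped to a known family, and each family has its own aggregation rule (counting for tags, averaging for ratings). The mapping table, the property names, and the `aggregate` function are hypothetical and only illustrate the idea; they are not taken from the SAnno prototype.

```python
from collections import Counter, defaultdict

# Hypothetical subproperty mapping, as an experienced user might define it:
# external property -> family we know how to aggregate.
SUBPROPERTY_OF = {
    "delicious:tag": "family:tag",
    "smw:category": "family:tag",
    "movies:stars": "family:rating",
}


def aggregate(triples):
    """Group content triples by family: count tags, average ratings."""
    tags = defaultdict(Counter)
    ratings = defaultdict(list)
    for subject, prop, value in triples:
        family = SUBPROPERTY_OF.get(prop)
        if family == "family:tag":
            tags[subject][value] += 1
        elif family == "family:rating":
            ratings[subject].append(float(value))
        # Properties with no known family are skipped, not rejected.
    mean_ratings = {s: sum(v) / len(v) for s, v in ratings.items()}
    return tags, mean_ratings


page = "http://example.org/page"
triples = [
    (page, "delicious:tag", "semweb"),
    (page, "smw:category", "semweb"),
    (page, "delicious:tag", "rdf"),
    (page, "movies:stars", "4"),
    (page, "movies:stars", "5"),
    (page, "foaf:knows", "http://example.org/users/bob"),
]
tags, mean_ratings = aggregate(triples)
```

Users stay free to annotate with any vocabulary; only the mapping table has to grow when a new vocabulary should take part in the aggregation.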
Conversion tools
• Our worst enemy: the bootstrap
  • Who is going to annotate the first resources? I don't have time!
• Our best friends: already existing annotation systems
  • Why don't we convert existing data to our notation and show the advantages of our approach?
• Different families of conversion tools:
  • Easy: already existing APIs, with real-time search functionalities (e.g. del.icio.us)
  • Medium: conversions from existing structured repositories such as SPARQL endpoints (advantage: the conversion is very clean, you just need one tool and different CONSTRUCTs)
  • A little harder: Web scraping when no other sources are available
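As a rough illustration of the “easy” family, a converter could take one bookmark record and split it into metadata about the annotation and content triples. The record layout below (keys `url`, `user`, `date`, and a whitespace-separated `tags` string) is an assumption for the sake of the example and does not reproduce any real API response; the `ex:` properties are placeholders as well.

```python
def bookmark_to_triples(bookmark, graph_uri):
    """Convert one bookmark record into (metadata, content) triple lists.

    `bookmark` is a dict with the hypothetical keys "url", "user", "date"
    and "tags" (a whitespace-separated string).
    """
    url = bookmark["url"]
    # Statements about the annotation itself, keyed on the named graph.
    metadata = [
        (graph_uri, "ex:annotates", url),
        (graph_uri, "dc:creator", bookmark["user"]),
        (graph_uri, "dc:date", bookmark["date"]),
        (graph_uri, "ex:sourceSystem", "delicious"),
    ]
    # One content triple per tag; str.split() also drops repeated whitespace.
    content = [(url, "ex:tag", tag) for tag in bookmark["tags"].split()]
    return metadata, content


bookmark = {
    "url": "http://example.org/page",
    "user": "alice",
    "date": "2010-06-01",
    "tags": "semweb  rdf annotation",
}
metadata, content = bookmark_to_triples(bookmark, "http://example.org/graphs/42")
```

For the “medium” family the same split would be expressed once as a query template, with a different CONSTRUCT per source.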
Annotation client
• Actually, we have two possible clients in mind:
  • a browser extension which shows annotations while users are browsing the Web
  • an independent service which is able to aggregate heterogeneous information related to similar resources (e.g. URLs marked as being MP3 files)
• Filter annotations according to author, date, originating system, and community
• Users should be able to “subscribe” to some annotating communities and ignore others
• The system is designed to be distributed, as data can come from different, unrelated sources
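The filtering and subscription behaviour described above could be sketched as a single function over annotation metadata. The dictionary keys and the `filter_annotations` name are hypothetical, and the prototype itself does not support subscriptions yet, so this only shows one possible shape for the feature.

```python
from datetime import date


def filter_annotations(annotations, authors=None, since=None,
                       systems=None, subscribed=None):
    """Return the annotations passing every filter that is not None."""
    kept = []
    for note in annotations:
        if authors is not None and note["creator"] not in authors:
            continue
        if since is not None and note["created"] < since:
            continue
        if systems is not None and note["system"] not in systems:
            continue
        # Subscriptions: keep only notes from communities the user follows.
        if subscribed is not None and note.get("community") not in subscribed:
            continue
        kept.append(note)
    return kept


notes = [
    {"creator": "alice", "created": date(2010, 5, 1),
     "system": "delicious", "community": "semweb"},
    {"creator": "bob", "created": date(2009, 1, 15),
     "system": "smw", "community": "music"},
    {"creator": "carol", "created": date(2010, 3, 9),
     "system": "smw", "community": None},
]
recent_semweb = filter_annotations(notes, since=date(2010, 1, 1),
                                   subscribed={"semweb"})
```

Because the filter only reads metadata, it can run on the client over annotations gathered from unrelated sources.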
The prototype
• Early annotation ontology
• Property families: tag, rating, generically related URI
• Conversions from SMW, Delicious
• Visualization as a web service + Firefox extension
• No subscriptions yet
The end
Thank you! Questions?
References:
• [0] D. Laniado, D. Eynard and M. Colombetti. Using WordNet to turn a folksonomy into a hierarchy of concepts. Semantic Web Application and Perspectives, 192–201, 2007.
• [1] D. Eynard and M. Colombetti. Exploiting User Gratification for Collaborative Semantic Annotation. Proceedings of SWUI 2008. April 2008.
• [2] D. Eynard. Using semantics and user participation to customize personalization. HP Labs Technical Report HPL-2008-197. September 2008.
• [3] L. Mazzola, D. Eynard and R. Mazza. GVIS: a framework for graphical mashups of heterogeneous sources to support data interpretation. HSI 2010. May 2010.
Contact Davide Eynard
http://davide.eynard.it
Tel. 02 2399 4010
Fax 02 2399 3411
Project page @AIRLab: http://airwiki.elet.polimi.it