Interlinking Online Communities and Enriching Social Software with the Semantic Web

Copyright 2008 Digital Enterprise Research Institute. All rights reserved. www.deri.ie Interlinking Online Communities and Enriching Social Software with the Semantic Web Uldis Bojārs 1 , Alexandre Passant 2 , John Breslin 1 1 Digital Enterprise Research Institute, National University of Ireland, Galway 2 LaLIC, Université Paris-Sorbonne / Electricité de France R&D World Wide Web Conference / Beijing, China / 21st April 2008

The 17th International World Wide Web Conference / Beijing, China / 21st April 2008

Page 1: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Copyright 2008 Digital Enterprise Research Institute. All rights reserved.

www.deri.ie

Interlinking Online Communities and Enriching Social Software with the Semantic Web

Uldis Bojārs (1), Alexandre Passant (2), John Breslin (1)

(1) Digital Enterprise Research Institute, National University of Ireland, Galway
(2) LaLIC, Université Paris-Sorbonne / Electricité de France R&D

World Wide Web Conference / Beijing, China / 21st April 2008

Page 2: Interlinking Online Communities and Enriching Social Software with the Semantic Web

URL for the presentation

Full presentation file (40 MB!):

http://url.ie/bwk

Uploading to SlideShare for web browsing:

http://www.slideshare.net/Cloud

Page 3: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Summary

1. Overview of the SIOC Project

2. From Disconnected Communities to Interlinked-Online Communities

3. Creating Semantic Web Data from Social Media Sites

4. Using SIOC with Other Ontologies

5. Finding, Reusing and Searching Semantic Web Data Produced by the Social Web

6. Browsing, Exploring and Consuming Semantic Web Data

7. Portable Data and Re-Use of SIOC Data

8. Industry Applications of Semantic Technologies for Online Communities

9. Leveraging Content Semantics in Social Software

Page 4: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Who are we?

• Alexandre Passant
  – PhD student, Semantic Web researcher
  – LaLIC, Université Paris-Sorbonne / Electricité de France R&D
• John Breslin
  – Social software researcher, adjunct lecturer
  – DERI, National University of Ireland, Galway
• Uldis Bojārs
  – Semantic Web researcher, PhD student
  – DERI, National University of Ireland, Galway

Page 5: Interlinking Online Communities and Enriching Social Software with the Semantic Web

1. Overview of the SIOC Project

John Breslin

Page 6: Interlinking Online Communities and Enriching Social Software with the Semantic Web

timbl on the Semantic Web and online communities

“I think we could have both Semantic Web technology supporting online communities, but at the same time also online communities can support Semantic Web data by being the sources of people voluntarily connecting things together.”

Sir Tim Berners-Lee, podcast interview during ISWC 2005

http://esw.w3.org/topic/IswcPodcast

Page 7: Interlinking Online Communities and Enriching Social Software with the Semantic Web

The Semantic Web in brief

• “The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation” - Tim Berners-Lee, James Hendler, Ora Lassila, Scientific American, May 2001
• A common model to describe data in a machine-readable way:
  – RDF (Resource Description Framework)
  – RDF statements are triples (subject predicate object):

WWW2008 isA Conference .
SIOC isA CoolProject .

• Common semantics for this data, using ontologies:
  – “An ontology is a specification of a conceptualisation” - Tom Gruber
  – RDFS (RDF Schema)
  – OWL (Web Ontology Language)
• The Semantic Web FAQ:
  – http://www.w3.org/2001/sw/SW-FAQ
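
The triple model on this slide can be sketched in a few lines of Python. This is only an illustration: plain tuples stand in for RDF statements, and the `match` helper and the `ex:` names are hypothetical, not part of any RDF library.

```python
# Each RDF statement is a (subject, predicate, object) triple.
triples = [
    ("ex:WWW2008", "rdf:type", "ex:Conference"),
    ("ex:SIOC", "rdf:type", "ex:CoolProject"),
]

def match(graph, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

# Everything that is declared to be a conference.
conferences = [s for s, _, _ in match(triples, p="rdf:type", o="ex:Conference")]
```

Pattern matching with wildcards is, in miniature, what an RDF query language does over a much larger set of triples.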

Page 8: Interlinking Online Communities and Enriching Social Software with the Semantic Web

The (evolving) Semantic Web layer cake

http://www.w3.org/2007/03/layerCake.png

Page 9: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Vision

Page 10: Interlinking Online Communities and Enriching Social Software with the Semantic Web

The aims of SIOC

• To “semantically-interlink online communities”
• To fully describe the content and structure of community sites
• To create new connections between online discussion posts and items, forums and containers
• To enable the integration of online community information
• To browse connected Social Web items in interesting and innovative ways
• To overcome a chicken-and-egg problem with the Semantic Web
• To add a social aspect to the Semantic Web

Page 11: Interlinking Online Communities and Enriching Social Software with the Semantic Web

The steps involved

1. Develop an ontology of terms for representing rich data from the Social Web

2. Create a food chain for producing, collecting and consuming SIOC data from open-source discussion systems and popular community sites

3. As well as disseminating SIOC via academic papers, provide easy-to-read documentation and usage examples at sioc-project.org

• SIOC aims to enrich the Web infrastructure:
  – During the next upgrade cycle, gigabytes of community data become available!

Page 12: Interlinking Online Communities and Enriching Social Software with the Semantic Web

The SIOC ontology

• The main classes and properties are:

Page 13: Interlinking Online Communities and Enriching Social Software with the Semantic Web

The SIOC food chain

Page 14: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Dissemination

Page 15: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Page 16: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Quotes about SIOC

• “I […] think the concept is HOT” – Robert Douglass, Drupal Developer

• “It just dawned on me that the burgeoning SIOC-o-sphere (online communities exporting and exposing content via SIOC Ontology) is actually: Blogosphere 2.0” – Kingsley Idehen, Founder and CEO of OpenLink Software

• “SIOC has the potential to become one of the foundational vocabularies that make Semantic Web applications useful” – Ivan Herman, W3C / ERCIM

• “A project that started back in 2000 called Friend-of-a-Friend (FOAF) represents relationships between people, as well as basic contact details. SIOC does this for groups: it extends the FOAF idea to being able to talk about whole groups of people. I am excited about SIOC because you can use that information to determine trust, to let people in.” – Tim Berners-Lee, Creator of the World Wide Web

Page 17: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Number of SIOC documents pinged via PingTheSemanticWeb

[Chart: number of SIOC documents pinged via PingTheSemanticWeb; vertical axis from 0 to 120,000; horizontal axis dates from 01/09/2007 to 12/04/2008 at two-week intervals]

Page 18: Interlinking Online Communities and Enriching Social Software with the Semantic Web

What SIOC is not

• An ontology to describe the content of social media contributions:
  – Need to use dedicated ontologies
• A way to automagically translate non-semantic social data to RDF:
  – Need to write and use exporters
• A model to describe physical people that contribute to social media websites:
  – This is the role of FOAF (Friend-of-a-Friend)
• An axiomatisation of a domain

Page 19: Interlinking Online Communities and Enriching Social Software with the Semantic Web

2. From Disconnected Communities to Interlinked-Online Communities

Alexandre Passant

Page 20: Interlinking Online Communities and Enriching Social Software with the Semantic Web

What are online communities?

• People form online communities by combining one-to-one (e.g. e-mail and instant messaging), one-to-many (web pages and blogs) and many-to-many (forums, wikis) forms of communication

Pre-Web and Web 1.0:
• BBS services
• Mailing lists
• USENET
• Web-based bulletin boards

Web 2.0:
• Multi-forum sites
• Online social networks
• Weblogs
• Wikis
• Microblogging
• Social tagging services

Page 21: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Evolution of online community sites

Online community sites:
• Provide a valuable source of information
• May contain rich meta-information
• But are isolated from one another:
  – Many sites discussing complementary topics
  – How to relate and interlink them?

Next steps:
• Connect sites together
• Add more value:
  – Let other sites know more about the structure and contents
  – Make more use of tagging and semantic metadata

Page 22: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Existing connections using RSS, Atom syndication

• First step towards connecting online community sites:
  – More visibility through aggregation and search
  – Allow one to subscribe to distributed items
• Benefits:
  – Good tool support:
    • RSS readers, APIs, etc.
  – Many consumers
• Shortcomings:
  – Little information about the structure of the site or community:
    • Mainly represent items, not users or the container
  – Feeds typically include only the last 5 to 20 items:
    • How can we access information about the whole site?

Page 23: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Disconnected communities

• Many different kinds of online communities:
  – Social networks
  – Discussion groups
  – Message boards
  – ...
• Each community is a closed world / walled garden:
  – Moving / reusing data from one community on another?
  – Interlinking networks?
  – Inviting friends?

Page 24: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Need ways to connect these islands

* Source: Pidgin Technologies, www.pidgintech.com

Page 25: Interlinking Online Communities and Enriching Social Software with the Semantic Web

A need for common semantics

• Communities should provide their data in a common, machine-understandable way:
  – RDF (Resource Description Framework) as a data layer
  – One single format for all the data
  – Different serialisations (RDF/XML, N3, etc.)
  – The base of the Semantic Web
• Communities should use common semantics to define this data:
  – Avoiding the use of proprietary APIs
  – Since this means that they can talk together, exchange information, using the same modelling layer for their data
  – Using SIOC for representing content and actions
  – Using FOAF for representing people and networks

Page 26: Interlinking Online Communities and Enriching Social Software with the Semantic Web

What is required to represent a community?

• Represent the data, not only documents:
  – From the WWW to a “GGG” (Giant Global Graph), from hyperlinks to semantic relationships
• A model for all the aspects of a community:
  – User accounts, groups and roles:
    • Reader, reviewer, moderator
  – Content and types:
    • A blog, a blog post, a bulletin board, a wiki page, etc.
  – Actions between users and content:
    • Uldis creates a post, Alex comments on it, John moderates it
• A model for the entire content:
  – Any data: RSS 1.0 and Atom are limited to syndication / latest posts
  – Any user and relationship: new user, new post, replies, etc.

Page 27: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Representing community data with SIOC

• Using SIOC as an ontology to represent the activities of online communities on the Web:
  – Namespace: http://rdfs.org/sioc/ns
  – Five top-level classes: User / Role / Space / Container / Item
  – A “SIOC Types” module for Social Web content
  – Action: A user posts an item in a container
• A Semantic Web citizen:
  – Reusing and interlinking existing ontologies
  – Not reinventing the wheel (connects to DC, FOAF, etc.):
    • http://www.w3.org/Submission/2007/SUBM-sioc-related-20070612/

Page 28: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Example of SIOC data

• Alex wrote a post on his WordPress blog:

:myblogpost rdf:type sioc:Post ;
    dc:title "I'm blogging this" ;
    sioc:has_creator :alex ;
    sioc:has_container :mywpblog .

:mywpblog rdf:type sioc:Forum .

Page 29: Interlinking Online Communities and Enriching Social Software with the Semantic Web

The same model for any website

• John wrote a post on his Drupal-powered site:

:myblogpost rdf:type sioc:Post ;
    dc:title "Another blog post" ;
    sioc:has_creator :john ;
    sioc:has_container :mydrupal .

:mydrupal rdf:type sioc:Forum .
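
Because both sites use the same pattern, one small routine can emit the data for either of them. The following is a rough sketch only: `post_to_turtle` is a hypothetical helper, not part of any SIOC exporter, and prefix declarations are left out just as they are on the slides.

```python
def post_to_turtle(post_id, title, creator, container):
    """Serialise one post as Turtle, following the pattern used on the slides."""
    # Escape backslashes and double quotes so the title stays a valid literal.
    escaped = title.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f":{post_id} rdf:type sioc:Post ;\n"
        f'    dc:title "{escaped}" ;\n'
        f"    sioc:has_creator :{creator} ;\n"
        f"    sioc:has_container :{container} .\n"
        f"\n"
        f":{container} rdf:type sioc:Forum .\n"
    )

# The same function covers the WordPress and the Drupal example.
wordpress = post_to_turtle("myblogpost", "I'm blogging this", "alex", "mywpblog")
drupal = post_to_turtle("myblogpost", "Another blog post", "john", "mydrupal")
```

Only the values differ between the two calls; the vocabulary terms are identical, which is what makes the exported data comparable across sites.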

Page 30: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Interlinking communities

• Since all communities can use the same model to define their data, it is easy to link them from a data point of view
• Interlinking:
  – URIs are used to define things and created objects
  – A post on blog “A” can be semantically linked to a post on blog “B”
• Using SPARQL to query data:
  – Can perform unified queries no matter where the data comes from
  – No need to learn new APIs from data providers:
    • Standard queries for RDF data rather than APIs
  – SPARQL is a W3C recommendation:
    • http://www.w3.org/TR/rdf-sparql-query/

Page 31: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Linking people to user accounts

• FOAF is the main vocabulary used to represent people:
  – http://foaf-project.org
  – foaf:Person class:
    • “The foaf:Person class represents people. Something is a foaf:Person if it is a person.”
  – foaf:holdsAccount property:
    • “The foaf:holdsAccount property relates a foaf:Agent to a foaf:OnlineAccount for which they are the sole account holder.”
  – Linking people to user accounts:
    • sioc:User rdfs:subClassOf foaf:OnlineAccount
    • Links a foaf:Person to various sioc:User(s)
    • As many sioc:User(s) as required can be linked to a single person
    • One person, various identities
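
The "one person, various identities" idea can be illustrated with a small in-memory example. This is a sketch under assumptions: the identifiers (`:alex_wordpress`, `:photo7`, and so on) are invented for illustration, and tuples stand in for RDF triples.

```python
from collections import defaultdict

triples = [
    (":alex", "rdf:type", "foaf:Person"),
    (":alex", "foaf:holdsAccount", ":alex_wordpress"),
    (":alex", "foaf:holdsAccount", ":alex_flickr"),
    (":alex_wordpress", "rdf:type", "sioc:User"),
    (":alex_flickr", "rdf:type", "sioc:User"),
    (":post1", "sioc:has_creator", ":alex_wordpress"),
    (":photo7", "sioc:has_creator", ":alex_flickr"),
]

# Map each user account to the person who holds it.
holder = {o: s for s, p, o in triples if p == "foaf:holdsAccount"}

# Group content by person, across all of that person's accounts.
content_by_person = defaultdict(list)
for s, p, o in triples:
    if p == "sioc:has_creator" and o in holder:
        content_by_person[holder[o]].append(s)
```

Following foaf:holdsAccount from accounts back to the person is what lets content created under different accounts be attributed to the same individual.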

Page 32: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Representing users and online accounts

• The sioc:User class:
  – An online user account
  – Can be thought of as a virtual representation of any person online, within the context of a given social media website or community
  – A subclass of foaf:OnlineAccount
  – Users create and manage content:
    • has_creator and has_modifier properties
    • :blogpost123 sioc:has_creator :john
  – A user can have roles on a given container:
    • (Moderator, Forum 1) ← User A
    • (Contributor, Blog 2) ← User B

Page 33: Interlinking Online Communities and Enriching Social Software with the Semantic Web

A person and their user accounts

Page 34: Interlinking Online Communities and Enriching Social Software with the Semantic Web

3. Creating Semantic Web Data from Social Media Sites

Uldis Bojārs

Page 35: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Page 36: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Producing SW data from social media sites

• We use SIOC as a common framework / data model for expressing social media data on the Semantic Web:
  – http://rdfs.org/sioc/spec/
  – http://rdfs.org/sioc/related/
• The “SIOC Types” module:
  – Additional classes and properties for expressing different kinds of social media / Web 2.0 content
  – Connects SIOC with domain-specific ontologies
• Reuse existing ontologies:
  – Dublin Core, FOAF, SKOS, etc.

Page 37: Interlinking Online Communities and Enriching Social Software with the Semantic Web

The SIOC ontology

• The main classes and properties are:

Page 38: Interlinking Online Communities and Enriching Social Software with the Semantic Web

SIOC data producers

• SIOC applications list:
  – http://rdfs.org/sioc/applications/
• > 20 applications for producing SIOC data:
  – Free and open source
• SIOC export tools for:
  – Blogs and forums: WordPress, phpBB, Drupal, b2evolution
  – “Legacy” applications: mailing lists, IRC
  – New media: Twitter, Jaiku, Facebook, Flickr
  – Enterprise applications: CWE (collaborative work environments)

Page 39: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Case studies

• WordPress SIOC Exporter:
  – http://sioc-project.org/wordpress
  – First SIOC plugin created, custom built
• vBulletin and phpBB SIOC Exporters:
  – http://wiki.sioc-project.org/index.php/VBSIOC
  – http://sioc-project.org/phpbb
  – Uses SIOC API for PHP

Page 40: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Overview of WordPress SIOC Exporter

• Installation:
  – Download from http://sioc-project.org/wordpress
  – “Drop” two files into the WordPress plugins folder
  – Go to the administrator’s user interface
  – Plugins → SIOC Plugin → “Activate”
• SIOC data created for every page:
  – Data describing all blog posts, comments, users, etc.
  – SIOC data can be discovered via RDF autodiscovery links:

<link rel="meta" type="application/rdf+xml" title="SIOC"
      href="http://www.johnbreslin.com/blog/index.php?sioc_type=site" />

• Data can be explored or crawled using existing Semantic Web applications
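
A crawler could pick up such autodiscovery links with a few lines of standard-library Python. This is a minimal sketch, not the code used by any existing tool: the `RDFLinkFinder` class name is hypothetical, and the sample page is built around the link element shown on the slide.

```python
from html.parser import HTMLParser

class RDFLinkFinder(HTMLParser):
    """Collect (title, href) pairs from <link rel="meta" type="application/rdf+xml">."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        a = dict(attrs)
        if (tag == "link" and a.get("rel") == "meta"
                and a.get("type") == "application/rdf+xml" and a.get("href")):
            self.links.append((a.get("title"), a["href"]))

page = """<html><head>
<link rel="stylesheet" type="text/css" href="style.css" />
<link rel="meta" type="application/rdf+xml" title="SIOC"
      href="http://www.johnbreslin.com/blog/index.php?sioc_type=site" />
</head><body></body></html>"""

finder = RDFLinkFinder()
finder.feed(page)
```

After parsing, `finder.links` holds the RDF documents advertised by the page, which a crawler can then fetch.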

Page 41: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Sample export of SIOC data from WordPress

Page 42: Interlinking Online Communities and Enriching Social Software with the Semantic Web

• RDF data from the WordPress SIOC Exporter, displayed in the SIOC RDF Browser

Page 43: Interlinking Online Communities and Enriching Social Software with the Semantic Web

SIOC export APIs

• Benefits:
  – Hides the complexity from application developers
  – Can be used by people who are not Semantic Web experts
  – Automatically updated according to changes in the SIOC ontology and best practices documents
• Existing SIOC APIs:
  – Java
  – Perl (new!)
  – PHP (most used)
  – RDFa on Rails
• See “2.1 SIOC APIs” in http://rdfs.org/sioc/applications/

Page 44: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Overview of vBulletin and phpBB SIOC Exporters

• There is a large amount of structured related information contained within message boards, and this can be leveraged in interesting ways by exposing the semantic data for new applications

• Exporters have been developed for commercial (vBulletin) and open-source (phpBB) message board systems, bringing these islands together and allowing conversations on topics that are taking place across various sites

• vBulletin and phpBB SIOC Exporters are based on the SIOC API for PHP:
  – http://wiki.sioc-project.org/index.php/PHPExportAPI

Page 45: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Sample export of SIOC data from vBulletin

Page 46: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Sample export of SIOC data from vBulletin (2)

Page 47: Interlinking Online Communities and Enriching Social Software with the Semantic Web

SIOC competition with boards.ie

• boards.ie has been publishing social graph information online using FOAF since 2004

• With its 10 years of discussions, boards.ie can serve as a rich source of SIOC data for the Social Semantic Web:
  – The data to be “SIOC-ified” is already all publicly viewable, but it is difficult to leverage without any added semantics because it is embedded in heavily-styled HTML pages

• DERI are sponsoring a competition with prizes (the top prize is €3000) for whoever is judged to have produced the most interesting application(s) that makes use of the SIOC data exported from boards.ie

• To enter, go to http://data.sioc-project.org

Page 48: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Creating your own exporters

• Use SIOC API(s) if possible:
  – Or create new APIs to contribute back to the community
• Creating RDF data is easy:
  – Use the plugin API provided by the host system
  – Collect required information from the host (CMS) system
  – Create in-memory RDF or object model (optional)
  – Serialise RDF data (using RDF API or print templates)
• Seek help from the SIOC developer community:
  – http://sioc-project.org/ or SIOC-Dev mailing list or #sioc on IRC (irc.freenode.net)
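
The collect, model and serialise steps listed above can be sketched as follows. Everything here is illustrative: the `posts` rows, the `http://example.org/blog/` base URI and both helper functions are assumptions, the host system's plugin API is not shown, and a real exporter would more likely use one of the SIOC APIs.

```python
SIOC = "http://rdfs.org/sioc/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"

# Collect: rows as a host CMS might return them (hypothetical data).
posts = [{"id": 1, "title": 'Hello "SIOC"', "author": "alex"}]
BASE = "http://example.org/blog/"  # hypothetical base URI chosen by the developer

def build_triples(rows):
    """Model: build an in-memory list of (subject, predicate, object, is_literal)."""
    out = []
    for row in rows:
        post = f"{BASE}post/{row['id']}"
        out.append((post, RDF_TYPE, SIOC + "Post", False))
        out.append((post, DC_TITLE, row["title"], True))
        out.append((post, SIOC + "has_creator", f"{BASE}user/{row['author']}", False))
        out.append((post, SIOC + "has_container", BASE, False))
    return out

def to_ntriples(triples):
    """Serialise: a print template that writes one N-Triples line per triple."""
    lines = []
    for s, p, o, is_literal in triples:
        if is_literal:
            escaped = o.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
            obj = f'"{escaped}"'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines) + "\n"

export = to_ntriples(build_triples(posts))
```

N-Triples is used here only because it is the simplest format to produce with a print template; the same in-memory model could be written out as RDF/XML or Turtle.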

Page 49: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Things to consider

• URIs to use for SIOC concepts:
  – Developer needs to choose URIs to use and to supply them to SIOC API calls
• Linked data from SIOC tools:
  – http://www.w3.org/DesignIssues/LinkedData.html
  – Choice of URIs is important
  – To use HTTP content negotiation or not?
• Do you need to use any domain-specific ontologies?
  – Structured data within content items
  – Multimedia, etc.
• External or embedded metadata?

Page 50: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Explore more producers of SIOC data

• Sioku:
  – SIOC data from Jaiku microblogging service
  – http://sioku.sioc-project.org/
• SWAML:
  – Exports mailing list archives in RDF
  – http://swaml.berlios.de/
• OpenLink DataSpaces:
  – Uses SIOC as a representation format for multiple social spaces
  – http://virtuoso.openlinksw.com/wiki/main/Main/OdsIndex/
• Use the Semantic Radar extension for Firefox for detecting / exploring SIOC data:
  – http://sioc-project.org/firefox

Page 51: Interlinking Online Communities and Enriching Social Software with the Semantic Web

4. Using SIOC with Other Ontologies

Uldis Bojārs

Page 52: Interlinking Online Communities and Enriching Social Software with the Semantic Web

SIOC and its friends

Page 53: Interlinking Online Communities and Enriching Social Software with the Semantic Web

SIOC and FOAF are used together

• FOAF is the main vocabulary used to represent people:
  – http://foaf-project.org
  – foaf:Person class:
    • “The foaf:Person class represents people. Something is a foaf:Person if it is a person.”
  – foaf:holdsAccount property:
    • “The foaf:holdsAccount property relates a foaf:Agent to a foaf:OnlineAccount for which they are the sole account holder.”
  – Linking people to user accounts:
    • sioc:User rdfs:subClassOf foaf:OnlineAccount
    • Links a foaf:Person to various sioc:User(s)
    • As many sioc:User(s) as required can be linked to a single person
    • One person, various identities

Page 54: Interlinking Online Communities and Enriching Social Software with the Semantic Web

A person and their user accounts

Page 55: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Add SKOS for topics and categories

• Interlinking using common categories:
  – Share tags and topics across different content
• SKOS (Simple Knowledge Organisation System):
  – http://www.w3.org/2004/02/skos/
  – A vocabulary to describe controlled vocabularies
  – Used in the “Tag Ontology”:
    • http://www.holygoat.co.uk/projects/tags/

Page 56: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Interlinking content with SKOS

[Figure labels: skos:isSubjectOf, sioc:topic]

Page 57: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Interlinking content items

• Can create direct links between instances of sioc:Item:
  – Link from a blog post to a bulletin board page
  – sioc:related_to, sioc:links_to
• Interlinking using common categories:
  – SKOS (Simple Knowledge Organisation System):
    • http://www.w3.org/2004/02/skos/
• Interlink using existing URIs as topics:
  – geonames.org, DBpedia, Revyu
  – MOAT (Meaning of a Tag) simplifies linking content to such URIs:
    • http://moat-project.org/
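
Using shared URIs as topics makes items from different sites easy to relate. The sketch below assumes invented item URIs on example.org hosts and plain tuples in place of an RDF store; the `related_items` helper is hypothetical.

```python
from collections import defaultdict

triples = [
    ("http://blog.example.org/post/1", "sioc:topic", "http://dbpedia.org/resource/Galway"),
    ("http://forum.example.org/thread/9", "sioc:topic", "http://dbpedia.org/resource/Galway"),
    ("http://blog.example.org/post/2", "sioc:topic", "http://dbpedia.org/resource/Beijing"),
]

# Index items by the topic URI they point to.
items_by_topic = defaultdict(set)
for item, predicate, topic in triples:
    if predicate == "sioc:topic":
        items_by_topic[topic].add(item)

def related_items(item):
    """Items sharing at least one topic URI with `item`, excluding the item itself."""
    related = set()
    for members in items_by_topic.values():
        if item in members:
            related |= members - {item}
    return related
```

Here a blog post and a forum thread on different sites become related only because both point at the same topic URI, with no direct link between them.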

Page 58: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Various types of content we create and consume…

• Discussions
• Bookmarks
• Annotations
• Profiles
• Microblogs
• Multimedia

…can connect us to other people

Page 59: Interlinking Online Communities and Enriching Social Software with the Semantic Web

The “SIOC Types” module

• Ontology module:
  – Extends the SIOC ontology
• SIOC Types:
  – Defines subclasses and subproperties of core SIOC terms for various types of Social Web content items and containers

Page 60: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Sample classes from the “SIOC Types” module

• Weblog:
  – Describes a weblog (blog), i.e. an online journal
• BlogPost:
  – Describes a post that is specifically made on a weblog
• Comment:
  – Comment is a subtype of sioc:Post and allows one to explicitly indicate that a particular post is a comment
  – Note that comments have a narrower scope than sioc:Post and may not apply to all types of community site

Page 61: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Sample classes from the “SIOC Types” module (2)

• ImageGallery:
  – An image gallery, for example, a photo album containing exif:IFD instances
• AddressBook:
  – A collection of personal or organisational addresses, e.g. foaf:Person (foaf:Agent) or vCard instances
• ReviewArea:
  – An area where reviews are posted, using the Rev or Review vocabularies
• ResumeBank:
  – A collection of resumes, e.g. as defined using Resume-RDF

Page 62: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Extending the SIOC data model with SIOC Types

• John wrote a post on his Drupal-powered blog:

:myblogpost rdf:type sioct:BlogPost ;
    dc:title "Another blog post" ;
    sioc:has_creator :john ;
    sioc:has_container :mydrupal .

:mydrupal rdf:type sioct:Weblog .
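
Because SIOC Types classes are subclasses of core SIOC terms, a consumer that only knows sioc:Post can still find the more specific items. The sketch below hand-codes a tiny subclass table as an assumption (only the Comment link is stated on the slides; the others are added for illustration) and walks it without any RDFS reasoner; the instance names are invented.

```python
# rdfs:subClassOf links, hand-coded for this illustration.
SUBCLASS_OF = {
    "sioct:BlogPost": "sioc:Post",
    "sioct:Comment": "sioc:Post",
    "sioc:Post": "sioc:Item",
}

def is_a(cls, ancestor):
    """True if cls equals ancestor or reaches it by following subclass links."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False

# Hypothetical instances and their declared types.
instances = {
    ":myblogpost": "sioct:BlogPost",
    ":mycomment": "sioct:Comment",
    ":myphoto": "exif:IFD",
}

# Everything that counts as a sioc:Post, including the more specific subtypes.
posts = sorted(name for name, cls in instances.items() if is_a(cls, "sioc:Post"))
```

A store with RDFS inference would derive the same result from the ontology itself rather than from a hand-written table.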

Page 63: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Using SIOC Types to represent Flickr data

• Uldis owns a photo gallery on Flickr:

:myitempost rdf:type exif:IFD ;
    dc:title "Another posted picture" ;
    sioc:has_creator :uldis ;
    sioc:has_container :myflickrgallery .

:myflickrgallery rdf:type sioct:ImageGallery .

• Reusing external vocabularies (e.g. EXIF) to define item types

Page 64: Interlinking Online Communities and Enriching Social Software with the Semantic Web

FlickrRDF using SIOC, FOAF, SIOC Types, Geo

Page 65: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Reviews (structured blog posts) described with linked data from SIOC and other ontologies

• Mapping from the hReview microformat to RDF ontologies:
  – Dublin Core, SIOC, SIOC Types, Review, Creative Commons

hReview field    RDF field(s)
-------------    ------------
summary          dc:title
item type        Classes linked from SIOC Types
item info        sioc:about
reviewer         foaf:maker, foaf:Person, rev:reviewer
dtreviewed       dcterms:created
rating           rev:rating
description      sioc:content, rev:text
tags             sioc:topic
permalink        sioc:link, URL
licence          cc:license
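
Part of this mapping can be applied mechanically. The sketch below covers only the rows that map a field directly onto one or more properties; the `map_review` helper and the sample review are hypothetical, and the rows that need more structure (item type, item info, reviewer, permalink) are left out.

```python
# Direct field-to-property rows taken from the table above.
HREVIEW_TO_RDF = {
    "summary": ["dc:title"],
    "dtreviewed": ["dcterms:created"],
    "rating": ["rev:rating"],
    "description": ["sioc:content", "rev:text"],
    "tags": ["sioc:topic"],
    "licence": ["cc:license"],
}

def map_review(fields):
    """Return (property, value) pairs for hReview fields that have a direct mapping."""
    pairs = []
    for name, value in fields.items():
        for prop in HREVIEW_TO_RDF.get(name, []):
            pairs.append((prop, value))
    return pairs

# A made-up review with three of the mapped fields.
review = {"summary": "Great noodles", "rating": "4", "description": "Would eat again."}
pairs = map_review(review)
```

A field such as description yields two pairs, one per target property, so the same text is available to consumers of either vocabulary.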

Page 66: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Page 67: Interlinking Online Communities and Enriching Social Software with the Semantic Web

5. Finding, Reusing and Searching Semantic Web Data Produced by the Social Web

Alexandre Passant

Page 68: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Page 69: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Motivation for finding and reusing semantic data

• There is a lot of Social Semantic Web data available:
  – From services
  – Via exporters
  – Hand-crafted
• But it is scattered all around the Web:
  – How do we find, browse, query, reuse it?
• These need to be addressed:
  – To provide novel applications that can leverage the interlinked nature of this data from the Social Web
  – To show the benefits of RDF and the Semantic Web

[Figure label: Semantic Web Documents (RDF)]

Page 70: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Finding data from the Social SW

• PingTheSemanticWeb:
  – http://pingthesemanticweb.com
  – A ping service for SW documents
  – REST or XML-RPC
  – Accepts, reads different formats:
    • RDF/XML, N3, Turtle
  – The “blo.gs” of the Semantic Web
• Various ontologies detected by PTSW:
  – FOAF, DOAP, SIOC, etc.
  – About 1M documents, 3.7M pings
• “A Scripting Architecture to Discover and Query Decentralized RDF Data”, The 3rd Workshop on Scripting for the Semantic Web (SFSW 2007), Innsbruck, Austria, June 2007

Page 71: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Advertising RDF data to PTSW

• Direct ping to PingTheSemanticWeb:
  – Blog engines: WordPress, Drupal, etc.
  – Services: Revyu, TalkDigger
• “Semantic Radar” extension for Firefox:
  – http://sioc-project.org/firefox
  – Easy to set up and use (Firefox extension, auto-update)
  – Support for RDFa!
  – Architecture of participation: just browse the Web
  – Discover Semantic Web documents using RDF autodiscovery links (a popular practice for advertising Atom/RSS and FOAF):

<head>
  <link rel="meta" type="application/rdf+xml" title="FOAF"
        href="http://example.com/people/~you/foaf.rdf"/>
</head>

Page 72: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Semantic Radar in action, sending pings to PTSW

Click to view SW data.

Page 73: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Reusing data from PTSW

• PTSW acts as a central access point for RDF data:
  – Subscribe to the service
  – Ask for recent updates
  – Apply namespace restrictions (e.g. export FOAF only)
  – Get fresh Semantic Web data
  – Concentrate on your tools, rather than on finding the data

[Diagram labels: Firefox Semantic Radars; Web Services and Software Agents; Semantic Web Documents (RDF); Ping the Semantic Web; doap:store]

Page 74: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Existing services that can make use of PTSW

• Sindice:
  – Lookup service for Semantic Web documents
• doap:store:
  – DOAP-based projects directory
• SWSE, Zitgist, Swoogle:
  – Semantic Web search engines
• You should be able to write a service after this tutorial

Page 75: Interlinking Online Communities and Enriching Social Software with the Semantic Web

doap:store

Page 76: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Write your own Social Semantic Web application

• Requirements for applications that store RDF data
• Find data:
  – Subscribe to PTSW
  – Make a crontab script to regularly fetch new data
• Store data:
  – Plain-text files
  – RDF stores
• Query the data:
  – SPARQL query language and protocol, a W3C recommendation
  – “Trying to use the Semantic Web without SPARQL is like trying to use a relational database without SQL” - Tim Berners-Lee

Page 77: Interlinking Online Communities and Enriching Social Software with the Semantic Web

Storing RDF data

• RDF stores:
  – Storage systems for triples
  – Better performance than distributed queries
  – Some support inference engines (OWL, RDFS)
  – Many provide an open SPARQL endpoint to let people use data
• Various implementations:
  – YARS (Java)
  – ARC2 (PHP)
  – 3Store (C)
  – Virtuoso, etc.

Page 78: Interlinking Online Communities and Enriching Social Software with the Semantic Web

78

Querying RDF data

• SPARQL language:
– A language to query a set of triples
– REST-protocol between clients and endpoint
– Results in standard formats (XML or JSON)
– http://www.w3.org/TR/rdf-sparql-query/
• SPARQL endpoint:
– Remotely accessible data
– Data openness
– Easy to use, e.g. ARC2 requires just three lines of code:

include_once('path/to/arc/ARC2.php');
$ep = ARC2::getStoreEndpoint(array(...));
$ep->go();
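On the client side, the REST-style protocol mentioned above amounts to sending the query text in a `query` parameter. A minimal sketch of building such a GET URL follows; the endpoint address is a placeholder, not a real service.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def sparql_get_url(endpoint, query):
    """Build a SPARQL protocol GET URL: the query travels in the `query` parameter."""
    separator = "&" if urlsplit(endpoint).query else "?"
    return endpoint + separator + urlencode({"query": query})


# Placeholder endpoint (assumption); substitute the endpoint you actually use.
query_text = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5"
url = sparql_get_url("http://example.org/sparql", query_text)
```

The resulting URL can be fetched with any HTTP client; the result format is usually chosen through the HTTP `Accept` header.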

Page 79: Interlinking Online Communities and Enriching Social Software with the Semantic Web

79

Semantic Web Search Engine (SWSE)

• A large-scale Semantic Web search engine developed and run by DERI Galway:
– http://swse.deri.org/
– Andreas Harth, Jürgen Umbrich, Aidan Hogan, Stefan Decker, “YARS2: A Federated Repository for Querying Graph Structured Data from the Web”, The 6th International Semantic Web Conference (ISWC 2007), pp. 211-224, Busan, Korea, 2007
• A SPARQL endpoint for today’s tutorial:
– http://swse.deri.org/boards/yars2/

Page 80: Interlinking Online Communities and Enriching Social Software with the Semantic Web

What does SWSE do?

• SWSE searches and navigates factual entities collected from over 200,000 data sources
• Components:
– Web-scale crawling and object consolidation
– Fully-distributed RDF storage and SPARQL query processing using YARS2 (already achieved 7 billion synthetically generated triples)
– Advanced schema agnostic ranking
– User interface with guided navigation
• Features:
– Ability to handle various entity types (such as people, places, proteins) and various media types
– Tracking provenance of triples using context / named graphs
• Search and explore the Semantic Web at:
– http://swse.deri.org/

Page 81: Interlinking Online Communities and Enriching Social Software with the Semantic Web

SWSE™ data flow

[Diagram components: Crawler, Index, Query Processor, User Interface]

Page 82: Interlinking Online Communities and Enriching Social Software with the Semantic Web

82

The Sindice lookup index

Page 83: Interlinking Online Communities and Enriching Social Software with the Semantic Web

83

The Sindice SIOC widget

Page 84: Interlinking Online Communities and Enriching Social Software with the Semantic Web

84

Page 85: Interlinking Online Communities and Enriching Social Software with the Semantic Web

85

SPARQLing Social Semantic Web data

• Find all posts and their titles by John, using SELECT, and combining vocabularies (DC, SIOC, SIOC Types):

SELECT ?post ?title
WHERE {
  ?post rdf:type sioct:BlogPost ;
        dc:title ?title ;
        sioc:has_creator <$johns_URI> .
}
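The query above leaves out its PREFIX declarations and uses `$johns_URI` as a placeholder. One way to turn it into a complete query string is sketched below with Python's `string.Template`; the namespace URIs are the standard ones for these vocabularies, while the example URI for John is made up.

```python
from string import Template

# The slide's query with the PREFIX declarations it relies on spelled out.
QUERY = Template("""\
PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dc:    <http://purl.org/dc/elements/1.1/>
PREFIX sioc:  <http://rdfs.org/sioc/ns#>
PREFIX sioct: <http://rdfs.org/sioc/types#>

SELECT ?post ?title
WHERE {
  ?post rdf:type sioct:BlogPost ;
        dc:title ?title ;
        sioc:has_creator <$johns_URI> .
}
""")

# Hypothetical URI, used only for illustration.
complete_query = QUERY.substitute(johns_URI="http://example.org/users/john#me")
```

Plain substitution performs no escaping, so a URI that comes from user input should be validated before it is inserted.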

Page 86: Interlinking Online Communities and Enriching Social Software with the Semantic Web

86

SPARQLing Social Semantic Web data (2)

• Find all users that posted replies to John’s blog since January 2008, introducing the FILTER clause:

SELECT ?who
WHERE {
  ?post rdf:type sioct:BlogPost ;
        dc:title ?title ;
        sioc:has_creator <$johns_URI> .
  ?post sioc:has_reply ?reply .
  ?reply sioc:has_creator ?who ;
         dcterms:created ?date .
  FILTER (?date > "2008-01-01T00:00:00Z"^^xsd:dateTime)
}

Page 87: Interlinking Online Communities and Enriching Social Software with the Semantic Web

87

SPARQLing Social Semantic Web data (3)

• Find all content created by someone with a given OpenID URL:
– Browse someone’s social media contributions posted on various websites using different account names, but for the same person

SELECT ?item
WHERE {
  ?person foaf:openid <$openid> ;
          foaf:holdsAccount ?user .
  ?user sioc:creator_of ?item .
}
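To make the join in this query concrete, here is the same traversal over a handful of in-memory triples. It is a toy sketch: the data and the `ex:`, `blog:` and `forum:` identifiers are invented, and a real application would run the SPARQL query against an RDF store instead.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
SIOC = "http://rdfs.org/sioc/ns#"

# Invented sample data: Alice has accounts on two sites, Bob on one.
triples = [
    ("ex:alice", FOAF + "openid", "http://alice.example.org/"),
    ("ex:alice", FOAF + "holdsAccount", "blog:alice"),
    ("ex:alice", FOAF + "holdsAccount", "forum:al1ce"),
    ("blog:alice", SIOC + "creator_of", "blog:post/1"),
    ("forum:al1ce", SIOC + "creator_of", "forum:thread/7#2"),
    ("ex:bob", FOAF + "openid", "http://bob.example.org/"),
    ("ex:bob", FOAF + "holdsAccount", "blog:bob"),
    ("blog:bob", SIOC + "creator_of", "blog:post/2"),
]


def objects(subject, predicate):
    """All objects of triples matching the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def items_for_openid(openid):
    """Follow openid -> person -> accounts -> created items, as the query does."""
    people = [s for s, p, o in triples if p == FOAF + "openid" and o == openid]
    items = []
    for person in people:
        for account in objects(person, FOAF + "holdsAccount"):
            items.extend(objects(account, SIOC + "creator_of"))
    return items


alice_items = items_for_openid("http://alice.example.org/")
```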

Page 88: Interlinking Online Communities and Enriching Social Software with the Semantic Web

88

Parse SPARQL results

• SPARQL XML
• JSON:
– Easiest
– Many extensions
  • PHP5: json_decode(): JSON data to PHP array
– Many examples
• Most SPARQL endpoints support JSON output
• Can be easily used in JavaScript applications
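As a rough Python counterpart to the `json_decode()` approach, the sketch below flattens a response in the SPARQL JSON results format into one dictionary per result row. The response body is a hand-written sample, not output from a real endpoint.

```python
import json

# Hand-written sample in the SPARQL JSON results format.
response_body = """
{
  "head": {"vars": ["post", "title"]},
  "results": {
    "bindings": [
      {"post": {"type": "uri", "value": "http://example.org/post/1"},
       "title": {"type": "literal", "value": "Hello SIOC"}},
      {"post": {"type": "uri", "value": "http://example.org/post/2"},
       "title": {"type": "literal", "value": "FOAF and OpenID"}}
    ]
  }
}
"""


def rows(sparql_json):
    """Yield one {variable: value} dictionary per solution."""
    document = json.loads(sparql_json)
    variables = document["head"]["vars"]
    for binding in document["results"]["bindings"]:
        # A variable can be unbound in a given solution, hence .get().
        yield {var: binding.get(var, {}).get("value") for var in variables}


parsed = list(rows(response_body))
```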

Page 89: Interlinking Online Communities and Enriching Social Software with the Semantic Web

89

Querying RDF files

• Redland: http://librdf.org
• Bindings:
– Available for PHP, Python, etc.
– SQL2RDF bindings include D2RQ and Triplify
• Example in Python:

import RDF
m = RDF.Model()
m.load('http://apassant.net/foaf.rdf')
# Ask for SPARQL explicitly (assumption: the bindings default to RDQL)
q = RDF.Query("SELECT ?s WHERE { ?s ?p ?o . }", query_language="sparql")
results = q.execute(m)
for result in results:
    print(result['s'])

Page 90: Interlinking Online Communities and Enriching Social Software with the Semantic Web

90

Need more data?

• Translate any data to SIOC:
– Re-use SIOC tools for non-SIOC data
• Semantic Pipes:
– http://pipes.deri.org/
– Remix your RDF data
• SPARQL constructs:
– The “XSLT” of RDF
– Translate a set of RDF data from one graph format to another
– For example:

CONSTRUCT { ?x a sioc:Post . ?x sioc:has_creator ?y }
WHERE { ?x a myont:BlogElement . ?x myont:created_by ?y }
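What this CONSTRUCT query does can be mimicked over plain tuples, which may help when reasoning about the output. The sketch below is only an illustration with invented `ex:` resources; it is not a SPARQL engine.

```python
RDF_TYPE = "rdf:type"

# Invented source data in the hypothetical "myont" vocabulary from the slide.
source = [
    ("ex:entry1", RDF_TYPE, "myont:BlogElement"),
    ("ex:entry1", "myont:created_by", "ex:alice"),
    ("ex:page9", RDF_TYPE, "myont:Page"),
    ("ex:page9", "myont:created_by", "ex:bob"),
]


def construct_sioc(triples):
    """Emit SIOC triples for every myont:BlogElement that has a creator."""
    elements = {s for s, p, o in triples
                if p == RDF_TYPE and o == "myont:BlogElement"}
    result = []
    for s, p, o in triples:
        if s in elements and p == "myont:created_by":
            result.append((s, RDF_TYPE, "sioc:Post"))
            result.append((s, "sioc:has_creator", o))
    # A CONSTRUCT result is a graph, so drop duplicates (order preserved).
    return list(dict.fromkeys(result))


translated = construct_sioc(source)
```

Only `ex:entry1` matches both patterns of the WHERE clause, so `ex:page9` produces no output.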

Page 91: Interlinking Online Communities and Enriching Social Software with the Semantic Web


6. Browsing, Exploring and Consuming Semantic Web Data

Uldis Bojārs

Page 92: Interlinking Online Communities and Enriching Social Software with the Semantic Web

92

Page 93: Interlinking Online Communities and Enriching Social Software with the Semantic Web

93

Consuming SIOC as Semantic Web data

• SIOC = RDF data

• Generic Semantic Web applications can be used:
– RDF APIs (Jena, Redland, etc.)
– RDF crawlers
– RDF browsers (Tabulator, Zitgist, SIOC RDF Browser, etc.)
– More apps: http://www.w3.org/2001/sw/SW-FAQ#tools
• Customised applications can provide more added value and / or better user interfaces:
– SIOC Explorer (faceted browsing of SIOC data):
  • https://launchpad.net/sioc-ex
– Buxon, etc.

Page 94: Interlinking Online Communities and Enriching Social Software with the Semantic Web

94

How can SIOC data be used?

Page 95: Interlinking Online Communities and Enriching Social Software with the Semantic Web

95

Browsing SIOC

Page 96: Interlinking Online Communities and Enriching Social Software with the Semantic Web

96

SIOC RDF Browser

http://sparql.captsolo.net/browser

Page 97: Interlinking Online Communities and Enriching Social Software with the Semantic Web

97

SIOC Store Browser

http://apassant.net/home/2006/06/sioc-browser

Page 98: Interlinking Online Communities and Enriching Social Software with the Semantic Web

98

SIOC Store Browser (2)

Page 99: Interlinking Online Communities and Enriching Social Software with the Semantic Web

99

Demonstration of SIOC Explorer

Page 100: Interlinking Online Communities and Enriching Social Software with the Semantic Web

100

Accessing SIOC content from multiple sources
Browsing SIOC content from one source
Filter by “facet” from all sources

• Facet can be a direct or indirect property:

Direct

– The topic of the content item

– The creator of the item

– The date created

Indirect

– A geographic location of the person who created it

– The gender of the person

– An interest shared by many creators
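The difference between direct and indirect facets can be sketched in a few lines: a direct facet tests a property of the content item itself, while an indirect facet follows the link to the creator first. The records and property names below are invented for illustration and do not reflect how SIOC Explorer is implemented.

```python
# Invented sample data: content items and profiles of their creators.
posts = [
    {"id": "post1", "topic": "semantic web", "creator": "alice"},
    {"id": "post2", "topic": "sioc", "creator": "bob"},
    {"id": "post3", "topic": "sioc", "creator": "alice"},
]
creators = {
    "alice": {"location": "Galway"},
    "bob": {"location": "Paris"},
}


def direct_facet(items, prop, value):
    """Filter on a property of the item itself (topic, creator, date...)."""
    return [item["id"] for item in items if item.get(prop) == value]


def indirect_facet(items, prop, value):
    """Filter on a property of the item's creator (location, gender...)."""
    return [item["id"] for item in items
            if creators.get(item["creator"], {}).get(prop) == value]


sioc_posts = direct_facet(posts, "topic", "sioc")
galway_posts = indirect_facet(posts, "location", "Galway")
```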

Page 101: Interlinking Online Communities and Enriching Social Software with the Semantic Web

101

Exploring implicit social network connections

Page 102: Interlinking Online Communities and Enriching Social Software with the Semantic Web

102

Social SIOC Explorer

Page 103: Interlinking Online Communities and Enriching Social Software with the Semantic Web

103

Browsing SIOC with Piggy Bank

Page 104: Interlinking Online Communities and Enriching Social Software with the Semantic Web

104

Browsing SIOC with Tabulator

Page 105: Interlinking Online Communities and Enriching Social Software with the Semantic Web

105

Browsing SIOC with TimeLine

Page 106: Interlinking Online Communities and Enriching Social Software with the Semantic Web

106

Browsing SIOC with TimeLine (2)

Page 107: Interlinking Online Communities and Enriching Social Software with the Semantic Web

107

Weaving microblogging into the Semantic Web

• Producing AND consuming data

• Interlinking with existing RDF data (e.g. GeoNames)

• Faceted browsing

Page 108: Interlinking Online Communities and Enriching Social Software with the Semantic Web

108

Screenshots

Page 109: Interlinking Online Communities and Enriching Social Software with the Semantic Web

109

Reviews using SW technologies

• Revyu:– http://revyu.com

• Review website combining Web 2.0 / SW technologies

Page 110: Interlinking Online Communities and Enriching Social Software with the Semantic Web


7. Portable Data and Re-Use of SIOC Data

John Breslin

Page 111: Interlinking Online Communities and Enriching Social Software with the Semantic Web

111

What if I use multiple services and I want to…

• Move the stuff I have on one service to another (e.g. move all my blog posts, comments, friends, etc. from WordPress.com to “Acme Blogs”)

• Move all my stuff from multiple services to one third-party service

• Centralise my stuff on my own service, e.g. my blog
• See my stuff on a third-party service providing an aggregate view, like FriendFeed

Page 112: Interlinking Online Communities and Enriching Social Software with the Semantic Web

112

So many social media sites…

* Source: Smashcut Media, www.smashcut-media.com

Page 113: Interlinking Online Communities and Enriching Social Software with the Semantic Web

113

Even more services…

Page 114: Interlinking Online Communities and Enriching Social Software with the Semantic Web

114

It takes a lot of time…

Page 115: Interlinking Online Communities and Enriching Social Software with the Semantic Web

115

Filling out your profiles, re-adding your friends…

Page 116: Interlinking Online Communities and Enriching Social Software with the Semantic Web

116

Uploading posts and content items to “stovepipes”!

Page 117: Interlinking Online Communities and Enriching Social Software with the Semantic Web

117

Social media sites are like data silos

* Source: Pidgin Technologies, www.pidgintech.com

Page 118: Interlinking Online Communities and Enriching Social Software with the Semantic Web

118

Many isolated communities of users and their data

* Source: Pidgin Technologies, www.pidgintech.com

Page 119: Interlinking Online Communities and Enriching Social Software with the Semantic Web

119

Need ways to connect these islands

* Source: Pidgin Technologies, www.pidgintech.com

Page 120: Interlinking Online Communities and Enriching Social Software with the Semantic Web

120

Allowing users to easily move from one to another

* Source: Pidgin Technologies, www.pidgintech.com

Page 121: Interlinking Online Communities and Enriching Social Software with the Semantic Web

121

Enabling users to easily bring their data with them

* Source: Pidgin Technologies, www.pidgintech.com

Page 122: Interlinking Online Communities and Enriching Social Software with the Semantic Web

122

Social networking fatigue

• How many general or niche SNSs are you willing to register and / or interact with?

• People search engine and aggregation sites are now appearing to compensate:
– SocialURL – organise your online identities
– PeekYou – matching web pages with their owners
– Spock – organising information around people
– Rapleaf – reputation lookup and email search
– Wink – free people search engine
– FriendFeed – subscribe to all of your friends’ feeds

Page 123: Interlinking Online Communities and Enriching Social Software with the Semantic Web

123

Social network portability and reusability

• Need distributed social networks and reusable profiles
• Users may have many identities and sets of friends on different social networks, where each identity was created from scratch
• Allow users to import existing profiles and contacts, using a single global identity with different views (e.g. via FOAF, XFN / hCard, OpenID, etc.)
• See also:
– http://bradfitz.com/social-graph-problem/
– http://danbri.org/words/2007/09/13/194
– http://code.google.com/apis/socialgraph/

Page 124: Interlinking Online Communities and Enriching Social Software with the Semantic Web

124

Identity management across networks

• Social media sites (or RDF exporters) create a new foaf:Person instance when they export their data:
– TalkDigger, Revyu, Flickr exporters, etc.
– There is a need to unify URIs so as to represent one's unified identity
• Linked-data principles are to use owl:sameAs and rdfs:seeAlso:
– See: http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/
– owl:sameAs: Used to identify two resources with different URIs as being the same resource
– rdfs:seeAlso: “More information about this resource can be found here”, can be used by Semantic Web tools such as Tabulator
• Inference using owl:InverseFunctionalProperty:
– foaf:mbox, foaf:openid, etc. can be used to identify uniqueness for a foaf:Person
• Unifying aspects of a foaf:Person across networks:
– All relevant sioc:User accounts may be related to one foaf:Person
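The inverse-functional-property idea above can be sketched as follows: any two resources that share a value for foaf:mbox or foaf:openid are taken to describe the same person. The sample URIs are invented, and a real system would leave this to an OWL-aware reasoner.

```python
from collections import defaultdict

# Properties treated as inverse functional in this sketch.
IFPS = ("foaf:mbox", "foaf:openid")

# Invented sample data from three hypothetical exporters.
person_triples = [
    ("flickr:person1", "foaf:mbox", "mailto:alex@example.org"),
    ("revyu:person7", "foaf:mbox", "mailto:alex@example.org"),
    ("talkdigger:u42", "foaf:openid", "http://john.example.org/"),
    ("revyu:person9", "foaf:name", "Someone Else"),
]


def same_person_groups(triples):
    """Group subjects that share a value for an inverse functional property."""
    by_value = defaultdict(set)
    for s, p, o in triples:
        if p in IFPS:
            by_value[(p, o)].add(s)
    return [sorted(group) for group in by_value.values() if len(group) > 1]


groups = same_person_groups(person_triples)
```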

Page 125: Interlinking Online Communities and Enriching Social Software with the Semantic Web

125

URI unification for a foaf:Person

:alex owl:sameAs flickr:33669349@N00 ;
      owl:sameAs twitter:terraces
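Since owl:sameAs is symmetric and transitive, a set of such statements partitions URIs into clusters, one per person. A minimal union-find sketch, using the identifiers from this slide plus one invented pair, could look like this:

```python
def unify(same_as_pairs):
    """Cluster URIs connected by owl:sameAs statements (union-find)."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for left, right in same_as_pairs:
        parent[find(left)] = find(right)

    clusters = {}
    for uri in parent:
        clusters.setdefault(find(uri), set()).add(uri)
    return list(clusters.values())


pairs = [
    (":alex", "flickr:33669349@N00"),
    (":alex", "twitter:terraces"),
    (":john", "ex:jbreslin"),  # invented pair, for contrast only
]
identities = unify(pairs)
```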

Page 126: Interlinking Online Communities and Enriching Social Software with the Semantic Web

126

FOAF and social network connections

• FOAF allows us to represent the connections between people:
– A machine-readable format for social-networking
• Using the foaf:knows property:
– :John foaf:knows :Alex
• Extensions using the RELATIONSHIP vocabulary:
– http://vocab.org/relationship/
– All rel:* properties are subproperties of foaf:knows
– :John rel:worksWith :Uldis
– RDFS inferencing allows tools to answer queries using foaf:knows when people use rel:* alternatives
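The subproperty inference described above follows the RDFS rule that a statement made with a subproperty also holds for its superproperty. A small sketch over the two example statements from this slide, assuming the subproperty hierarchy has no cycles:

```python
# rdfs:subPropertyOf declarations relevant to this example.
SUBPROPERTY_OF = {"rel:worksWith": "foaf:knows"}

stated = [
    (":John", "foaf:knows", ":Alex"),
    (":John", "rel:worksWith", ":Uldis"),
]


def infer(triples, subproperty_of):
    """Add the triples entailed by rdfs:subPropertyOf (acyclic hierarchy assumed)."""
    inferred = set(triples)
    for s, p, o in triples:
        # Walk up the subproperty chain, restating the triple at each level.
        while p in subproperty_of:
            p = subproperty_of[p]
            inferred.add((s, p, o))
    return inferred


def knows(graph, person):
    return sorted(o for s, p, o in graph if s == person and p == "foaf:knows")


without_inference = knows(stated, ":John")
with_inference = knows(infer(stated, SUBPROPERTY_OF), ":John")
```

Without inference a foaf:knows query misses :Uldis; with it, both contacts are returned.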

Page 127: Interlinking Online Communities and Enriching Social Software with the Semantic Web

127

Distributed social networking with FOAF

• Combining networks from multiple FOAF URIs via owl:sameAs:
– Decentralised social networks can represent connections for the same person
– A person’s networks can be merged together
– Any sub-network in the social graph can be reached from a single entry point, via the person’s URI

Page 128: Interlinking Online Communities and Enriching Social Software with the Semantic Web

128

Integrating social networks with FOAF

Common formats, unique URIs

* Source: Sheila Kinsella, Applications of Social Network Analysis 2007

Page 129: Interlinking Online Communities and Enriching Social Software with the Semantic Web

129

Distributed social networking with FOAF

Page 130: Interlinking Online Communities and Enriching Social Software with the Semantic Web

130

Applications for browsing the social (semantic) graph

• FOAFnaut, FOAF Explorer, etc.
• FOAFGear: thanks to common semantics, only 100 lines of code: http://apassant.net/home/2008/01/foafgear/

Page 131: Interlinking Online Communities and Enriching Social Software with the Semantic Web

131

Aggregation of semantic social networks

• Browse / re-use your social graph in personal applications
• Merge identities with pre-defined rules
• Tools:
– Beatnik
– Knowee
– SPARQLpress
– Nepomuk

Page 132: Interlinking Online Communities and Enriching Social Software with the Semantic Web

132

Using OpenID with FOAF

• Can link to your FOAF profile from your OpenID URL, so that services can browse your machine-readable profile when you log-in:

<head>
  <link rel="meta" type="application/rdf+xml" title="FOAF" href="foaf.rdf" />
</head>
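A consuming service could discover this link with a few lines of HTML parsing. The sketch below uses only the Python standard library; the page URL is a placeholder, and the matching rule (rel="meta" with an RDF/XML type) simply mirrors the markup shown above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FoafLinkFinder(HTMLParser):
    """Collect href values of <link rel="meta" type="application/rdf+xml">."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if (tag == "link"
                and attributes.get("rel") == "meta"
                and attributes.get("type") == "application/rdf+xml"
                and attributes.get("href")):
            self.hrefs.append(attributes["href"])


def discover_foaf(page_url, html):
    """Return absolute URLs of the FOAF documents advertised by a page."""
    finder = FoafLinkFinder()
    finder.feed(html)
    return [urljoin(page_url, href) for href in finder.hrefs]


# Placeholder OpenID page (assumption) carrying the markup from the slide.
page = ('<head><link rel="meta" type="application/rdf+xml" '
        'title="FOAF" href="foaf.rdf" /></head>')
foaf_urls = discover_foaf("http://bob.example.org/", page)
```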

Page 133: Interlinking Online Communities and Enriching Social Software with the Semantic Web

133

Example of OpenID used with FOAF

• Bob creates an account on Networkr, a new social networking website, using OpenID

• Networkr retrieves the FOAF URI thanks to an auto-discovery link

• From the FOAF file, it identifies if there are any people already subscribed to Networkr who are listed in Bob’s defined relationships

• If that is the case, Bob can add them as “local connections”, share data with them, etc. without having to once again search for / add his friends

• Specific rules:
– If I know X from Flickr, he / she can see my pictures on Networkr

Page 134: Interlinking Online Communities and Enriching Social Software with the Semantic Web

134

The DataPortability initiative

• http://dataportability.org
• Existing technologies
• Inventing no new ones

Page 135: Interlinking Online Communities and Enriching Social Software with the Semantic Web

135

Other initiatives “near” DataPortability

Page 136: Interlinking Online Communities and Enriching Social Software with the Semantic Web

136

Semantics can help

• By using agreed-upon semantic formats to describe people, content objects and the connections that bind them all together, social media sites can interoperate by appealing to common semantics

• Developers are already using semantic technologies to augment the ways in which they create, reuse, and link profiles and content on social media sites (using FOAF, XFN / hCard, SIOC, etc.)

• In the other direction, object-centered social networks can serve as rich data sources for semantic applications

Page 137: Interlinking Online Communities and Enriching Social Software with the Semantic Web

137

Using SIOC and FOAF to represent portable data

Page 138: Interlinking Online Communities and Enriching Social Software with the Semantic Web

138

Porting social media contributions from data providers to import services

• Importing SIOC data:
– A Semantic Web “building block” for portable data

Page 139: Interlinking Online Communities and Enriching Social Software with the Semantic Web

139

SIOC import tools

• Importing SIOC data is easy:
– Parse SIOC RDF data (e.g. using ARC2 or RAP for PHP)
– Convert SIOC data to the content model of the target system:
  • e.g. content and other properties of blog posts and comments
  • Can use SIOC APIs to hold the data model
– Store data in the target application:
  • The most difficult part
• More info:
– Uldis Bojārs, Alexandre Passant, John Breslin, Stefan Decker, “Social Network and Data Portability using Semantic Web Technologies”, Accepted for the 2nd Workshop on Social Aspects of the Web (SAW 2008), Innsbruck, Austria, May 2008

Page 140: Interlinking Online Communities and Enriching Social Software with the Semantic Web

140

WordPress SIOC Importer

• We have lots of producers of SIOC data, but now we need more applications that can consume it, like the WordPress SIOC Importer:– http://wiki.sioc-project.org/w/SIOC_Import_Plugin

• Just as WordPress can import blog entries from various blogging systems, the WordPress SIOC Importer can import any discussion posts (and comments) represented in SIOC (forum posts, mail messages, IRC chats, etc.)

Page 141: Interlinking Online Communities and Enriching Social Software with the Semantic Web

141

SIOC import process for WordPress

1. Parse RDF data (using the open-source RAP RDF parser)

2. Find all posts, i.e. instances of sioc:Post, which exhibit all of the properties required by the target site

3. For each post found, create a new post using WordPress API calls

• To do:
– Multiple sources
– Authentication
– Synchronisation
– SIOC import APIs
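Steps 2 and 3 of the import process can be sketched independently of WordPress. In the sketch below the parsed RDF is assumed to be already available as dictionaries, the list of required properties is a guess, and `create_post` is a hypothetical stand-in for the target system's API; none of this is the plugin's actual code.

```python
# Properties the target site is assumed to require (hypothetical choice).
REQUIRED = ("dc:title", "sioc:content", "dcterms:created")


def import_posts(resources, create_post, required=REQUIRED):
    """Create a target post for each sioc:Post that has all required properties."""
    imported = 0
    for resource in resources:
        if resource.get("rdf:type") != "sioc:Post":
            continue
        if not all(prop in resource for prop in required):
            continue  # incomplete posts are skipped rather than imported
        create_post({prop: resource[prop] for prop in required})
        imported += 1
    return imported


# Invented sample of already-parsed resources.
parsed_resources = [
    {"rdf:type": "sioc:Post", "dc:title": "Hello", "sioc:content": "First post",
     "dcterms:created": "2008-04-21T09:00:00Z"},
    {"rdf:type": "sioc:Post", "dc:title": "No body"},
    {"rdf:type": "sioc:User", "dc:title": "Not a post"},
]

created = []
count = import_posts(parsed_resources, created.append)
```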

Page 142: Interlinking Online Communities and Enriching Social Software with the Semantic Web


8. Industry Applications of Semantic Technologies for Online Communities

Alexandre Passant

Page 143: Interlinking Online Communities and Enriching Social Software with the Semantic Web

143

Enterprise 2.0

• Web 2.0 includes applications such as blogs, wikis, RSS feeds and social networking, while Enterprise 2.0 is the packaging of those technologies in both corporate IT and workplace environments

• “Enterprise 2.0 is the use of emergent social software platforms within companies, or between companies and their partners or customers”, Harvard Business School’s Professor Andrew McAfee

• “There are direct enterprise equivalents [to Facebook]. You can ask people the status of their projects, what they’re working on, are they travelling, things they’ve learned. All of these things would be very valuable inside an enterprise.”

Page 144: Interlinking Online Communities and Enriching Social Software with the Semantic Web

144

Enterprise 2.0 (2)

• Social media services that people have been using in everyday life on the Web are now entering organisations:
– Blogs
– Wikis
– Social networking
– Tagging
• Lots of companies and products in this space:
– Awareness, Mentor Scout, Contact Networks, Microsoft SharePoint, IBM Lotus Connections, SelectMinds, introNetworks, Tacit, Illumio, Jive Software, Visible Path, Leverage Software, Web Crossing, SocialText
• These new deployments also face the same issues that are on the Web

Page 145: Interlinking Online Communities and Enriching Social Software with the Semantic Web

145

Semantic Web in organisations

• Semantic Web technologies can be leveraged in organisations for:
– Knowledge management
– Data integration
– Reasoning
– Augmented search
• See the SWEO use cases document:
– http://www.w3.org/2001/sw/sweo/public/UseCases/
– More than 25 case studies and use cases
– Vodafone, NASA, Renault, etc.

Page 146: Interlinking Online Communities and Enriching Social Software with the Semantic Web

146

Distributed Web 2.0 corporate information systems

• McAfee’s “SLATES” requirements for Enterprise 2.0:
– Search
– Links
– Authoring
– Tagging
– Extension
– Signals
• The Semantic Web can offer enhanced functionality by interlinking Enterprise 2.0 data with common semantics:
– Use back-end domain ontologies to extend search
– Search by type (e.g. restrict to wiki pages)
– Provide semantic links between documents

Page 147: Interlinking Online Communities and Enriching Social Software with the Semantic Web

147

Interconnecting Enterprise 2.0 services

• RDF bus architecture (Tim Berners-Lee):
– Add-ons to produce RDF data from existing Web 2.0 applications
– Store distributed data using RDF stores
• Create new applications:
– Semantic mashups
– Semantic search
– Open architecture thanks to a SPARQL endpoint, services as plugins to the architecture

Page 148: Interlinking Online Communities and Enriching Social Software with the Semantic Web

148

SIOC use case for Enterprise 2.0

• EDF R&D:
– Enterprise 2.0 systems: blogs, wikis, RSS, etc.
• Interconnecting data with SIOC:
– Exporters for blogs and wikis
– Translation of RSS items to SIOC data
• Maintaining ontology instances:
– Using a semantic wiki for ontology population
– Using MOAT to link data to those instances
• New usages:
– Semantic search engine: based on type and instances rather than plain text
– Geolocation mashups, by interlinking the GeoNames ontology with internal data thanks to a common modelling scheme

Page 149: Interlinking Online Communities and Enriching Social Software with the Semantic Web

149

SIOC use case for Enterprise 2.0 (2)

• Statistics over 2½ years:
– > 500k SIOC triples
– 12852 sioc:Post(s)
  • 11954 sioct:BlogPost(s)
  • 876 sioct:WikiArticle(s)
– 79 sioc:User(s)
• Search:
– Plain text to ontology
– MOAT for identifying instances
– Retrieves blog posts, wikis, etc.
– SPARQL-based queries

Page 150: Interlinking Online Communities and Enriching Social Software with the Semantic Web

150

Suggesting related content

• Using sioc:Post(s) and the sioc:topic property
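One simple way to suggest related content from sioc:topic values is to rank other posts by the number of topics they share with the current one. This is only a guess at the kind of logic involved, with invented data; the deployed system may work differently.

```python
# Invented sample: post identifier -> set of sioc:topic values.
post_topics = {
    "post1": {"sparql", "sioc"},
    "post2": {"sioc"},
    "post3": {"sparql", "sioc", "foaf"},
    "post4": {"wiki"},
}


def related(posts, post_id, limit=3):
    """Rank other posts by how many topics they share with `post_id`."""
    topics = posts[post_id]
    scored = []
    for other, other_topics in posts.items():
        shared = len(topics & other_topics)
        if other != post_id and shared:
            scored.append((shared, other))
    # Most shared topics first; ties broken by identifier for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for shared, other in scored[:limit]]


suggestions = related(post_topics, "post1")
```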

Page 151: Interlinking Online Communities and Enriching Social Software with the Semantic Web

151

Using SIOC in collaborative working environments

Page 152: Interlinking Online Communities and Enriching Social Software with the Semantic Web

152

OpenLink DataSpaces

• ODS provides access to SIOC instance data from a range of ODS application instances including blogs, wikis, aggregated feeds, shared bookmarks, discussions, photo galleries, briefcases, etc.

Page 153: Interlinking Online Communities and Enriching Social Software with the Semantic Web

153

Talis Engage

Page 154: Interlinking Online Communities and Enriching Social Software with the Semantic Web

154

Seesmic

Page 155: Interlinking Online Communities and Enriching Social Software with the Semantic Web


9. From Blogs and Wikis to Semantic Blogging and Semantic Wikis

John Breslin

Page 156: Interlinking Online Communities and Enriching Social Software with the Semantic Web

156

Structure-enhanced blog posts

• Sometimes you have a burning need for more structure, at least some of the time

• When you know a subject deeply, and your observations or analysis recur, you may be best served by filling in a form

• The form will have its own metadata and its own data model

• Uses:
– People get to express themselves, and
– Blogs start to interoperate with enterprise applications

Page 157: Interlinking Online Communities and Enriching Social Software with the Semantic Web

157

Soccer coach example

• An after-game soccer report typically includes:
– which teams played
– where and when
– officials, and
– a list of game events:
  • who scored (and when and how)
  • who received penalties (when and for what), etc.
• Wouldn't it be handy for the coach’s blogging tool to understand this structure, present an editing form, render the form in HTML to their blog, and render the post (including the form) to their RSS feed?
– Great for the World Cup!

Page 158: Interlinking Online Communities and Enriching Social Software with the Semantic Web

158

Integrating readers with structured blogging

• And in the future, news aggregators and news readers should be able to:
– Auto-discover an unknown structure
– Notify the user that a new structure is available
– Learn the structure, including entry forms, pick list sources, rendering guidance, and default style sheet
– Make it available when the blogger is ready to write

Page 159: Interlinking Online Communities and Enriching Social Software with the Semantic Web

159

Structured blogging using WordPress

Page 160: Interlinking Online Communities and Enriching Social Software with the Semantic Web

160

Making use of structured blog posts

Page 161: Interlinking Online Communities and Enriching Social Software with the Semantic Web

161

Why semantic blogging?

• Traditional blogging:
– Little or no query possibilities (except keyword and flat tags)
– Little or no reuse of data (except textual copy and paste)
– Little or no linking between posts (except simple hrefs and trackbacks)
• Semantic blogging:
– Facilitates better querying:
  • More precise
  • Allows aggregation from various sources
– Better reuse potential
– Richer links

Page 162: Interlinking Online Communities and Enriching Social Software with the Semantic Web

162

Why semantic blogging? (2)

• Users collect and create large amounts of structured data on their desktops

• This data is often tied to specific applications and locked within the user's computer

• Semantic blogging can lift this data into the Web

Page 163: Interlinking Online Communities and Enriching Social Software with the Semantic Web

163

Releasing your data to the Web scenario

[Diagram labels: Ina, Ina’s Computer, John, John’s Computer, Web, Blog Post, Metadata; actions: writes Post, annotates Post, publishes Post, reads Post, imports metadata]

Page 164: Interlinking Online Communities and Enriching Social Software with the Semantic Web

164

Positioning of the metadata

Where in the blog will the semantic metadata go?
• Directly in the HTML?
– Validity problems, parsing, restrictions on use of RDF...
• Put it in the newsfeed (RSS 1.0)?
– Would have to change blogging platforms, hard to get accepted
– Newsfeed items disappear over time
• Externally?
– Just add a link to HTML
– à la:

<a type="application/rdf+xml" href="http://bresl.in/foaf/foaf.rdf">John</a>

Page 165: Interlinking Online Communities and Enriching Social Software with the Semantic Web

165

How is this related to structured blogging?

• Structured blogging is mainly based on “Microformats” (http://www.microformats.org/)
– Therefore restricted to specific schemata, not open
– Positioned inline on HTML page (and in feed)
– Can be directly rendered using CSS
– Structured and semantic blogging do not compete
  • Metadata can be added as RDF and using Microformats
– Web-based implementations for generating structured blogging metadata
  • e.g. for WordPress and Movable Type

Page 166: Interlinking Online Communities and Enriching Social Software with the Semantic Web

166

Creating the metadata

Structural metadata:
• Relations within the blogosphere: what relates to what and how (replies, follow-ups or trackbacks, blogroll links and bookmarks, topics, etc.)?
• Closed domain, suggested vocabulary: SIOC
• Plugins for blogging platforms, e.g. WordPress, Drupal
• Produced automatically from a blog’s database

Page 167: Interlinking Online Communities and Enriching Social Software with the Semantic Web

167

Creating the metadata (2)

Content-related metadata:
• What do blog posts talk about (e.g. books, individuals, meetings)?
• Keep the domain open, so that any vocabulary / ontology can be used (BibTeX, FOAF, iCal, ...)
• Web-based approach (à la structured blogging) - user fills in an HTML form
• Desktop-based approach (à la semiBlog) - user selects existing data on their computer, which then gets converted into RDF

Page 168: Interlinking Online Communities and Enriching Social Software with the Semantic Web

168

Creating a semantic blog post with semiBlog

Annotating a blog entry with an address book entry.

<foaf:Person rdf:ID="andreas">
  <foaf:homepage>http://sw.bla.org/~aharth/</foaf:homepage>
  <foaf:surname>Harth</foaf:surname>
  <foaf:firstName>Andreas</foaf:firstName>
  <!-- ... more properties ... -->
  <rdf:value>Andreas Harth</rdf:value>
</foaf:Person>

Page 169: Interlinking Online Communities and Enriching Social Software with the Semantic Web

169

Using the metadata

Once a blog has semantic metadata, it can be...
• Used to query: “Which blog posts talk about papers by Stefan Decker?”
• Used to browse across blogs and other kinds of discussion methods
• Imported into desktop applications of blog readers (AKA “The Web as a Clipboard”)

Page 170: Interlinking Online Communities and Enriching Social Software with the Semantic Web

170

The Web as a clipboard using a suitable reader

• A user can import metadata from here into his / her own applications

Page 171: Interlinking Online Communities and Enriching Social Software with the Semantic Web

171

Structural versus content-related

• Structural metadata:
– Relations between blogs, posts, comments, etc.
– More than just “A links to B” - what kind of relationship?
  • Approval? Criticism? Mentions? Is about?
– ...relations within the blogosphere
• Content-related metadata:
– What is this post about, what is its topic?
– Anything a blog author wishes to discuss
– ...relations between the blogosphere and everything else

Page 172: Interlinking Online Communities and Enriching Social Software with the Semantic Web

172

Argumentative discussion topics similar to IBIS

Page 173: Interlinking Online Communities and Enriching Social Software with the Semantic Web

173

Closed domain versus open domain

• Closed-domain metadata:
– The domain is restricted to a certain set of real-world entities or concepts, e.g. blog structure or scientific publications.
– Allows the definition of one specific domain ontology (e.g. SIOC)
• Open-domain metadata:
– The domain is not restricted, e.g. as in blog content
– Hard to define one all-embracing ontology, very unwieldy, hard to convince people to use it
– Instead divide into closed subdomains, use small, vertical domain ontologies (e.g. FOAF, BibTeX, etc.)

Page 174: Interlinking Online Communities and Enriching Social Software with the Semantic Web

174

Client side versus server side

• Client-side metadata:
– Data to be used resides client-side
– Implementation can best be realised client-side (e.g. harvesting desktop data with semiBlog)
• Server-side metadata:
– Data to be used resides server-side
– Implementation can best be realised server-side (e.g. harvest WordPress database tables with WordPress SIOC plugin)

Page 175: Interlinking Online Communities and Enriching Social Software with the Semantic Web

175

Describing structure and content

Page 176: Interlinking Online Communities and Enriching Social Software with the Semantic Web


Tagging and the Semantic Web

• Tags are powerful but:
– Heterogeneity: different tags, one meaning
– Ambiguity: one tag, different meanings
– Unrelated: may be no relationship between tags

• A common semantics for tags and tagging actions:
– The “Tag Ontology” by Newman from 2005
– tags:Tag rdfs:subClassOf skos:Concept
– A “Tagging” class describes relationships between:
    • A user
    • An annotated resource
    • Some tags
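A tagging action in this model is a three-way relation (user, resource, tags) rather than a plain link. The sketch below shows one way to represent it and flatten it into triples. It is an illustration under assumptions: the namespace and the property names taggedBy, taggedResource and associatedTag are given as recalled from the Tag Ontology and should be checked against its specification; the Tagging dataclass and all example.org URIs are hypothetical.

```python
from dataclasses import dataclass

# Assumed Tag Ontology namespace; verify against the published vocabulary.
TAGS = "http://www.holygoat.co.uk/owl/redwood/0.1/tags/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass(frozen=True)
class Tagging:
    """One tagging action: a user annotates a resource with some tags."""
    uri: str
    user: str
    resource: str
    tags: tuple

    def to_triples(self):
        """Flatten the three-way relation into subject/property/object triples."""
        triples = [
            (self.uri, RDF_TYPE, TAGS + "Tagging"),
            (self.uri, TAGS + "taggedBy", self.user),
            (self.uri, TAGS + "taggedResource", self.resource),
        ]
        # One triple per tag attached in this tagging action.
        triples += [(self.uri, TAGS + "associatedTag", tag) for tag in self.tags]
        return triples


tagging = Tagging(
    uri="http://example.org/tagging/1",
    user="http://example.org/user/alice",
    resource="http://example.org/bookmark/42",
    tags=("http://example.org/tag/sparql", "http://example.org/tag/rdf"),
)

for triple in tagging.to_triples():
    print(triple)
```

Keeping the tagging action as its own node is what lets the same tag on the same resource be attributed to different users.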

Page 177: Interlinking Online Communities and Enriching Social Software with the Semantic Web


Going further with tagging

• SCOT (Social Semantic Cloud of Tags):
– A model to describe tagclouds (tags and co-occurrence)
– Ability to move your own tagcloud from one service to another
– Share tagclouds between services, and between users
– “Tag portability”

• MOAT (Meaning of a Tag):
– A model to define “meanings” of tags using existing URIs
– e.g. SPARQL → http://dbpedia.org/resource/SPARQL
– Tagged content enters the “Linked Data” web
– Collaborative approach:
    • Anyone can define a new meaning for a tag
    • Meanings are shared inside a given community
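The two ideas above, tag co-occurrence (as described by SCOT) and community-shared tag meanings (as described by MOAT), can be illustrated with plain data structures. This is only a sketch of the concepts, not the SCOT or MOAT vocabularies themselves: the function names, the community identifier and the sample data are assumptions invented for the example. Only the SPARQL to DBpedia mapping comes from the slide.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(tagged_items):
    """Count how often each unordered pair of tags appears on the same item."""
    counts = Counter()
    for tags in tagged_items:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(set(tags)), 2):
            counts[pair] += 1
    return counts


# MOAT-style meanings: (community, tag) -> URIs agreed on by that community.
meanings = {
    ("semweb-community", "sparql"): ["http://dbpedia.org/resource/SPARQL"],
}


def add_meaning(community, tag, uri):
    """Anyone can add a new meaning; it is shared within the community."""
    uris = meanings.setdefault((community, tag.lower()), [])
    if uri not in uris:
        uris.append(uri)


def meanings_of(community, tag):
    """Return the URIs a community has associated with a tag (may be empty)."""
    return list(meanings.get((community, tag.lower()), []))


cloud = cooccurrence([["rdf", "sparql"], ["sparql", "rdf", "sioc"]])
print(cloud.most_common(1))
print(meanings_of("semweb-community", "SPARQL"))
```

Because a tag can carry several URIs, an ambiguous tag simply accumulates more than one meaning, and the user picks the relevant one when tagging.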

Page 178: Interlinking Online Communities and Enriching Social Software with the Semantic Web


gnizr

Page 179: Interlinking Online Communities and Enriching Social Software with the Semantic Web


Problems with traditional wikis

• Structured access
• Information reuse

JohnGrisham

He is the author of PelicanBrief.
He lives in Mississippi.
He writes a book each year.
He is published by RandomHouse.

Structured access:
✗ Other books by JohnGrisham (navigation)
✗ All authors that live in Europe? (query)

Information reuse:
✗ The authors from RandomHouse (views)
✗ And what if I don't speak English? (translation)

Page 180: Interlinking Online Communities and Enriching Social Software with the Semantic Web


What are semantic wikis?

• A wiki that has an underlying model of the knowledge described in its pages:
– Semantic wikis make it possible to capture or identify further information about the pages (metadata) and their relations
– The knowledge model is available in a formal language, so that machines can (at least partially) process and reason on it
– A semantic wiki would be able to capture that an "apple" article is a "fruit" (through an inheritance relationship) and present you with further fruits when you look at apple
– Some are used for personal knowledge management, others are aimed at KM for communities

• http://wiki.ontoworld.org/wiki/Swikig
• http://www.semwiki.org/
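The "apple is a fruit" example can be made concrete with a tiny knowledge model. The sketch below is an assumption-laden illustration, not how any particular semantic wiki is implemented: the concept names and the helper functions ancestors and related are invented here, and a real wiki would keep this model in RDF/OWL rather than in a Python dictionary.

```python
# Direct "is a" links between concepts; all names are illustrative.
SUBCLASS_OF = {
    "Apple": "Fruit",
    "Pear": "Fruit",
    "Cherry": "Fruit",
    "Carrot": "Vegetable",
    "Fruit": "Food",
    "Vegetable": "Food",
}


def ancestors(concept):
    """Return every concept reachable by following 'is a' links upwards."""
    found = []
    while concept in SUBCLASS_OF:
        concept = SUBCLASS_OF[concept]
        found.append(concept)
    return found


def related(concept):
    """Return the other concepts that share the same direct parent."""
    parent = SUBCLASS_OF.get(concept)
    if parent is None:
        return []
    return sorted(c for c, p in SUBCLASS_OF.items() if p == parent and c != concept)


print(ancestors("Apple"))  # inherited classification of the Apple page
print(related("Apple"))    # further fruits to suggest on the Apple page
```

Because the relationship is stored explicitly, the "further fruits" list is computed from the model instead of being maintained by hand on each page.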

Page 181: Interlinking Online Communities and Enriching Social Software with the Semantic Web


Structural and content metadata in semantic wikis

Page 182: Interlinking Online Communities and Enriching Social Software with the Semantic Web


Information reuse in SemperWiki

Page 183: Interlinking Online Communities and Enriching Social Software with the Semantic Web


Semantic MediaWiki

• Semantic MediaWiki is an extension of MediaWiki, the open-source wiki system powering Wikipedia:
– Allows users to add structured data to the entries, turning it into a semantic wiki
– Users can classify the “type” of links, e.g. making a relationship such as “capital of” between Berlin and Germany explicit:
    • ... [[capital of::Germany]] ... resulting in the semantic statement "Berlin" "capital of" "Germany"
– On the page about Berlin, users can explicitly define its population by writing:
    • ... the population is [[population:=3,993,933]] ... resulting in the semantic statement "Berlin" "has population" "3993933"
– Currently the most widely-deployed semantic wiki, Semantic MediaWiki is also being used by various organisations, and is being deployed as a service by Centiare and Wikia
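To show how such inline annotations turn into statements, the sketch below extracts them from wiki text with a regular expression. It is a simplified illustration and not Semantic MediaWiki's actual parser: the pattern, the function name extract_statements and the number clean-up rule are assumptions, and the property is reported exactly as written in the markup (so "population" rather than "has population").

```python
import re

# Matches [[property::value]] (typed link) and [[property:=value]] (attribute).
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)(::|:=)([^\[\]|]+)\]\]")
# Numbers written with thousands separators, e.g. 3,993,933.
GROUPED_NUMBER = re.compile(r"\d{1,3}(,\d{3})+")


def extract_statements(page_title, wikitext):
    """Return (subject, property, value) statements found in the page text."""
    statements = []
    for prop, operator, value in ANNOTATION.findall(wikitext):
        value = value.strip()
        if operator == ":=" and GROUPED_NUMBER.fullmatch(value):
            # Normalise attribute values such as 3,993,933 to 3993933.
            value = value.replace(",", "")
        statements.append((page_title, prop.strip(), value))
    return statements


text = (
    "Berlin is the [[capital of::Germany]] and "
    "the population is [[population:=3,993,933]]. See also [[Germany]]."
)
for statement in extract_statements("Berlin", text):
    print(statement)
```

The page title supplies the subject of every statement, which is why the same markup yields different statements on different pages; the plain link [[Germany]] carries no property and is ignored.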

Page 184: Interlinking Online Communities and Enriching Social Software with the Semantic Web


10. Conclusions

John Breslin

Page 185: Interlinking Online Communities and Enriching Social Software with the Semantic Web


Some examples of where SIOC is already in use (about 50 implementations / applications)

Page 186: Interlinking Online Communities and Enriching Social Software with the Semantic Web


A list of some of these SIOC implementations

Creating SIOC data

• SIOC APIs
– SIOC Export API for PHP
– SIOC API for Java

• Weblog, forum and CMS exporters
– WordPress SIOC Exporter
– Dotclear SIOC Exporter
– b2evolution SIOC Exporter
– Drupal SIOC Exporter
– phpBB 2.x SIOC Exporter
– Triplify

• Other exporters
– OpenLink DataSpaces
– TalkDigger
– SWAML
– Mailing List Archives
– Mailing List Exporter
– Twitter2RDF
– IRC2RDF
– Sioku (Jaiku2RDF)
– gnizr
– OpenQabal
– BlogEngine.NET

Using SIOC data

• SPARQL endpoints, querying SIOC data
– ODS demo server and MyOpenLink.net
– #B4mad.Net SPARQL endpoint

• Crawling SIOC data
– SIOC Crawler
– SIOC Browsers
– Buxon
– SIOC Explorer

• Using SIOC for new data
– Fishtank
– BAETLE
– RDFa on Rails
– IkeWiki
– int.ere.st
– OpenLink Virtuoso AMI
– Talis Engage

• Reusing SIOC data
– IKHarvester
– notitio.us and JeromeDL

SIOC utilities

• Finding and indexing SIOC data
– Semantic Radar
– PingTheSemanticWeb.com

Page 187: Interlinking Online Communities and Enriching Social Software with the Semantic Web


A vocabulary onion, building on FOAF, SKOS, SIOC, SIOC Types, DC

Page 188: Interlinking Online Communities and Enriching Social Software with the Semantic Web


Disconnected sites on the Social Web / Web 2.0 can be linked using Semantic Web vocabularies

Page 189: Interlinking Online Communities and Enriching Social Software with the Semantic Web


Find people experienced in using SIOC / suggest improvements / participate in SIOC development

• The SIOC project page and wiki:
– http://sioc-project.org and http://wiki.sioc-project.org

• The SIOC W3C member submission:
– http://www.w3.org/Submission/2007/02/

• A SIOC developer mailing list:
– http://groups.google.com/group/sioc-dev

• Real-time IRC chat channel about SIOC:
– irc://irc.freenode.net/sioc

• A comprehensive list of SIOC applications:
– http://rdfs.org/sioc/applications/

• The SIOC RDF Browser prototype:
– http://sparql.captsolo.net/browser/

• Semantic Radar extension for Firefox:
– https://addons.mozilla.org/en-US/firefox/addon/3886

Page 190: Interlinking Online Communities and Enriching Social Software with the Semantic Web


Contact information / thanks

• Alexandre Passant
– http://apassant.net/

• John Breslin
– http://www.johnbreslin.com/

• Uldis Bojārs
– http://captsolo.net/

• Thanks to Thomas Schandl, Tuukka Hastrup, Sergio Fernandez for help with slides

• The SIOC project is supported by Science Foundation Ireland under grant number SFI/02/CE1/I131