Introduction to the Semantic Web

© Copyright 2009 Digital Enterprise Research Institute. All rights reserved. Digital Enterprise Research Institute www.deri.ie Introduction to the Semantic Web Alexandre Passant Digital Enterprise Research Institute, National University of Ireland, Galway DM110 – Emerging Web Media Week 7 – 02 Nov. 2009


Lecture at DM110 on "Emerging Web Media", NUIG - 2nd November 2009

Transcript of Introduction to the Semantic Web

Page 1: Introduction to the Semantic Web

© Copyright 2009 Digital Enterprise Research Institute. All rights reserved.

Digital Enterprise Research Institute www.deri.ie

Introduction to the Semantic Web Alexandre Passant

Digital Enterprise Research Institute, National University of Ireland, Galway

DM110 – Emerging Web Media Week 7 – 02 Nov. 2009

Page 2: Introduction to the Semantic Web

Agenda

  What is the Semantic Web? What for?
    From Documents to Data

  URIs and RDF
    To identify resources and define statements about these resources

  Ontologies with RDFS/OWL
    Shared semantics to improve interoperability between applications

  Querying data with SPARQL
    To make use of it and create new applications

  NB: The upcoming lectures will cover related topics / subtopics

Page 3: Introduction to the Semantic Web

The initial Proposal (1989)

http://www.w3.org/History/1989/proposal.html

Page 4: Introduction to the Semantic Web

… but so far …


Page 5: Introduction to the Semantic Web

… however …

To a computer, the Web is a flat, boring world, devoid of meaning. This is a pity, as in fact documents on the Web describe real objects and imaginary concepts, and give particular relationships between them. For example, a document might describe a person. The title document to a house describes a house and also the ownership relation with a person. Adding semantics to the Web involves two things: allowing documents which have information in machine-readable forms, and allowing links to be created with relationship values. Only when we have this extra level of semantics will we be able to use computer power to help us exploit the information to a greater extent than our own reading.

Tim Berners-Lee, 1st World Wide Web Conference, Geneva, May 1994

Page 6: Introduction to the Semantic Web

… so ?

http://www.slideshare.net/danbri/when-presentation-849447

Page 7: Introduction to the Semantic Web

The Semantic Web is about

  Bridging the gap from a Web of Documents to a Web of Data
    With typed objects and typed relationships: the Web as a giant decentralized database

  Adding machine-readable meta-data to existing content
    So that information can be parsed, queried, reused

  Defining shared semantics for this meta-data
    For interoperability between applications and for advanced purposes, such as reasoning

  Enabling machine-readable knowledge at Web scale, making information easier to find and process

Page 8: Introduction to the Semantic Web

A Bit of History

  Memex
    1945! - Vannevar Bush
    A memex is “a device in which an individual stores all his books, records, and communications.”

  Augmenting Human Intellect
    1962 - Douglas Engelbart
    “By ‘augmenting human intellect’ we mean increasing the capability of a man to approach a complex problem situation, to gain comprehension to suit his particular needs, and to derive solutions to problems.”

Page 9: Introduction to the Semantic Web

More recently

  SHOE
    http://www.cs.umd.edu/projects/plus/SHOE/
    “SHOE is a small extension to HTML which allows web page authors to annotate their web documents with machine-readable knowledge. SHOE makes real intelligent agent software on the web possible.”

Page 10: Introduction to the Semantic Web

The Semantic Web, right now

  Most standardisation work is done at the W3C
    http://w3.org

  The Semantic Web activity
    http://www.w3.org/2001/sw/

  Various Incubator Groups, Working Groups and Interest Groups
    SPARQL - http://www.w3.org/2009/sparql/wiki/Main_Page
    RDB2RDF - http://www.w3.org/2005/Incubator/rdb2rdf
    RIF - http://www.w3.org/2005/rules/
    HCLS - http://www.w3.org/2001/sw/hcls/
    …

Page 11: Introduction to the Semantic Web

The Semantic Web stack

http://www.w3.org/2007/03/layerCake.png

Page 12: Introduction to the Semantic Web

URIs

  A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract or physical resource, as defined in RFC 3986

  URIs are used to identify everything in a unique and unambiguous way
    Not only pages (as on the current Web), but any resource (people, documents, books, interests …)

  A URI for a person is different from a URI for a document about the person, because a person is not a document!

  Example
    http://apassant.net/alex - myself
    http://apassant.net - my homepage

Page 13: Introduction to the Semantic Web

Content negotiation

  URIs for resources, URIs for documents
    But documents made for people cannot be read by computers (the issue with the current Web)

  Content negotiation
    Provides a way for a resource URI to redirect to the document describing that resource
    Depending on who is accessing it
      –  Human-readable or machine-readable

  Example
    http://dbpedia.org/resource/Galway
    http://dbpedia.org/page/Galway
    http://dbpedia.org/data/Galway
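The redirect choice behind the three DBpedia URIs can be sketched in a few lines of Python. This is only an illustration: the `VARIANTS` table and the `/resource/`, `/page/`, `/data/` layout are assumptions modelled on the example URIs, and a real server would answer with an HTTP redirect rather than return a string.

```python
# Assumed layout, modelled on the DBpedia URIs from the slide.
VARIANTS = {
    "text/html": "page",            # human-readable document
    "application/rdf+xml": "data",  # machine-readable document
}


def parse_accept(header):
    """Return (media_type, q) pairs, highest preference first."""
    preferences = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        preferences.append((media_type, q))
    return sorted(preferences, key=lambda pair: pair[1], reverse=True)


def negotiate(resource_uri, accept_header):
    """Pick the document URI a server could redirect to."""
    prefix, marker, name = resource_uri.rpartition("/resource/")
    if not marker:
        raise ValueError("not a resource URI: " + resource_uri)
    for media_type, q in parse_accept(accept_header):
        if q > 0 and media_type in VARIANTS:
            return f"{prefix}/{VARIANTS[media_type]}/{name}"
    return f"{prefix}/page/{name}"  # fall back to the human-readable page
```

A browser sending `Accept: text/html` would be sent to the `/page/` document, while an RDF client asking for `application/rdf+xml` would be sent to `/data/`.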

Page 14: Introduction to the Semantic Web

RDF

  URIs represent resources
    But how do we state things about these resources?

  RDF – Resource Description Framework
    RDF abstract syntax, a data model: a directed, labeled graph based on URIs

  RDF is not XML! RDF/XML is only one of several ways to serialize RDF data (N3, RDFa …)

  RDF is based on triples
    <subject> <predicate> <object> .

Page 15: Introduction to the Semantic Web

RDF

@prefix dct: <http://purl.org/dc/terms/> .

<http://example.org/dm110-semweb>
    dct:title "Introduction to the Semantic Web" ;
    dct:creator <http://apassant.net/alex> ;
    dct:subject <http://dbpedia.org/resource/Semantic_Web> .
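To make the triple model concrete, the same description can be held in Python as a set of (subject, predicate, object) tuples and written out one statement per line in N-Triples style. This is a minimal sketch: the `Literal` helper is hypothetical, and the author link uses `dct:creator`, the DCMI Terms property for that role, which should be read as an assumption of the sketch.

```python
DCT = "http://purl.org/dc/terms/"


class Literal(str):
    """Marks a plain string value, as opposed to a URI."""


def term(value):
    """Serialize one term: quoted literal or <URI>."""
    if isinstance(value, Literal):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return f"<{value}>"


def to_ntriples(graph):
    """One 'subject predicate object .' line per triple."""
    return "\n".join(
        f"{term(s)} {term(p)} {term(o)} ." for s, p, o in sorted(graph)
    )


lecture = "http://example.org/dm110-semweb"
graph = {
    (lecture, DCT + "title", Literal("Introduction to the Semantic Web")),
    (lecture, DCT + "creator", "http://apassant.net/alex"),
    (lecture, DCT + "subject", "http://dbpedia.org/resource/Semantic_Web"),
}

print(to_ntriples(graph))
```

Because the graph is just a set of triples, merging two descriptions of the same resource is a set union.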

Page 16: Introduction to the Semantic Web

RDF serializations

  RDF/XML
    The most used, probably the most complex!
    E.g. http://geonames.org/2988507/about.rdf

  N3/Turtle
    Easier to read for humans
    E.g. http://dbpedia.org/data/Galway.n3

  RDFa
    Embeds RDF in XHTML, one page for humans and machines
    E.g. http://apassant.net (browse source)

Page 17: Introduction to the Semantic Web

Ontologies

  RDF provides a way to write assertions about URIs
    But what about the semantics of these assertions?
    E.g. how can one know that http://xmlns.com/foaf/0.1/knows identifies an acquaintance relationship?

  Ontologies provide common semantics for resources on the Semantic Web
    “An ontology is a specification of a conceptualization.”

  Developing ontologies for the Semantic Web
    Main languages are RDFS (RDF Schema) and OWL (Web Ontology Language)

Page 18: Introduction to the Semantic Web

Ontologies

  Classes and properties
    :Person a rdfs:Class .
    :father a rdf:Property .
    :father rdfs:domain :Person .
    :father rdfs:range :Person .

Page 19: Introduction to the Semantic Web

RDFS

  RDFS defines classes, properties and subsumption relations between classes and properties
    ex:Person rdfs:subClassOf ex:humanLiving .
    ex:worksWith rdfs:subPropertyOf ex:knows .

  Such relationships are used to infer new statements
    :alex rdf:type ex:Person .
    :alex ex:worksWith :axel .
    These two statements are enough to infer that :alex is an ex:humanLiving and that :alex ex:knows :axel
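The inference on this slide can be reproduced with a small forward-chaining loop. The sketch below covers only the two RDFS rules involved here (subclass and subproperty); prefixed names are kept as plain strings for readability, and `rdfs_closure` is a hypothetical helper name, not part of any library.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
SUBPROPERTY_OF = "rdfs:subPropertyOf"


def rdfs_closure(triples):
    """Apply the subclass and subproperty rules until nothing new appears."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # s is a C, and C is a subclass of D  =>  s is a D
                if p == RDF_TYPE and p2 == SUBCLASS_OF and o == s2:
                    new.add((s, RDF_TYPE, o2))
                # s p o, and p is a subproperty of q  =>  s q o
                if p2 == SUBPROPERTY_OF and p == s2:
                    new.add((s, o2, o))
        if new <= inferred:
            return inferred
        inferred |= new


facts = {
    ("ex:Person", SUBCLASS_OF, "ex:humanLiving"),
    ("ex:worksWith", SUBPROPERTY_OF, "ex:knows"),
    (":alex", RDF_TYPE, "ex:Person"),
    (":alex", "ex:worksWith", ":axel"),
}

closure = rdfs_closure(facts)
```

The closure contains the four asserted triples plus the two inferred ones, without either having been written by hand.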

Page 20: Introduction to the Semantic Web

OWL

  OWL goes further than RDFS by introducing new axioms
    Disjointness (e.g. person / document)
    Transitivity (e.g. ancestor)
    Symmetry (e.g. sibling)
    Cardinality constraints (e.g. ancestor > 1)

  OWL2 has just been standardized by the W3C and introduces a lot of useful features, especially for reasoning
    Property Chains
    parent + brother -> uncle

Page 21: Introduction to the Semantic Web

OWL2 Property chain example

ex:uncle rdf:type owl:ObjectProperty .
ex:parent rdf:type owl:ObjectProperty .
ex:brother rdf:type owl:ObjectProperty .

ex:uncle owl:propertyChainAxiom ( ex:parent ex:brother ) .

:alice ex:parent :bob .
:bob ex:brother :joe .

=>

:alice ex:uncle :joe .
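What a reasoner does with such a chain can be sketched as a join over two properties. The function below is a hypothetical helper that handles two-step chains only; a real OWL 2 reasoner supports chains of any length and combines them with other axioms.

```python
def apply_property_chain(triples, chain, super_property):
    """Infer s super_property o whenever s first m and m second o hold."""
    first, second = chain
    inferred = set(triples)
    for s, p, middle in triples:
        if p != first:
            continue
        for s2, p2, o in triples:
            if s2 == middle and p2 == second:
                inferred.add((s, super_property, o))
    return inferred


family = {
    (":alice", "ex:parent", ":bob"),
    (":bob", "ex:brother", ":joe"),
}

result = apply_property_chain(family, ("ex:parent", "ex:brother"), "ex:uncle")
```

Here `result` holds the two asserted triples plus `(":alice", "ex:uncle", ":joe")`.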

Page 22: Introduction to the Semantic Web

Notable ontologies

  Social networks and social data
    FOAF – Friend Of A Friend
    SIOC – Semantically-Interlinked Online Communities

  Software development
    DOAP – Description Of A Project
    BAETLE – Bug And Enhancement Tracking LanguagE

  Comprehensive / Top-level
    Yago (from Wikipedia)
    OpenCyc

  Taxonomies
    SKOS – Simple Knowledge Organisation System

Page 23: Introduction to the Semantic Web

Zooming in: FOAF Ontology

  A model to describe people and social networks
    http://foaf-project.org

  Concepts
    Person, OnlineAccount, Document, etc.

  Properties
    name, homepage, holdsAccount, knows, etc.

Page 24: Introduction to the Semantic Web

FOAF in use

  Google Social Graph API
    http://code.google.com/intl/fr/apis/socialgraph/

  Uses FOAF information already there on the Web to find your contacts
    http://socialgraph-resources.googlecode.com/svn/trunk/samples/findcontacts.html

  E.g.: http://apassant.net
    –  http://socialgraph-resources.googlecode.com/svn/trunk/samples/findcontacts.html?q=http%3A%2F%2Fapassant.net
    –  Contacts found in various FOAF files that link to myself and to my profile

Page 25: Introduction to the Semantic Web

Which ontologies to use?

  SearchMonkey Vocabularies
    http://developer.yahoo.com/searchmonkey/smguide/profile_vocab.html

Page 26: Introduction to the Semantic Web

Which ontologies to use?

  How to Publish Linked Data on the Web
    http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/

Page 27: Introduction to the Semantic Web

Extending ontologies?

  What if existing ontologies are not enough for your needs?
    Create a new ontology
    … or extend an existing one!

  Ontologies can be extended in a decentralized way
    E.g. you can create a subproperty of foaf:knows, “hasLecturer”, in your own ontology and publish it online

  Open.vocab.org
    A collaborative platform to manage ontologies
    http://open.vocab.org

Page 28: Introduction to the Semantic Web

Warning: Domain and range

  Domain and range of properties in ontologies are descriptive, not prescriptive
    :father rdfs:domain :Person
      –  Not only pre-defined Persons can be fathers
      –  But every father is a Person!

  Consequence 1: One triple is enough to convey several pieces of information

  Consequence 2: Don’t use foaf:homepage for a shoe!

  For details
    Based on RDF semantics (Rule rdfs2)
    http://www.w3.org/TR/rdf-mt/
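The descriptive reading of rdfs:domain can be seen by applying rule rdfs2 by hand: using a property on a subject is enough to type that subject. The sketch below is a hypothetical helper covering the domain rule only, with `:paul` and `:bob` as made-up resources.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def infer_types_from_domain(triples):
    """Rule rdfs2: p has domain C, and s p o  =>  s is a C."""
    inferred = set(triples)
    for prop, p, cls in triples:
        if p != RDFS_DOMAIN:
            continue
        for s, p2, _o in triples:
            if p2 == prop:
                inferred.add((s, RDF_TYPE, cls))
    return inferred


data = {
    (":father", RDFS_DOMAIN, ":Person"),
    (":paul", ":father", ":bob"),  # never declared as a :Person
}

typed = infer_types_from_domain(data)
```

Nothing is rejected: `:paul` was never declared a Person, and the rule simply concludes that he is one. The same mechanism is why foaf:homepage on a shoe would silently give the shoe whatever type that property's domain names.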

Page 29: Introduction to the Semantic Web

Warning: Open World

  The Open World Assumption
    Might be complex to understand when coming from an RDBMS or OOP background

  If a fact is not there, it does not mean it is false
    Bob’s father is Paul. Is Jim Paul’s father?
      –  Cannot be answered unless using cardinality constraints in the ontology (in OWL), e.g. a Person has only 1 father.
    Is Axel speaking today?
      –  Cannot be answered
    Bob’s daughters are Alice and June. Does Bob have 3 daughters?
      –  Cannot be answered

  In practice, most applications use closed-world reasoning / querying
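The difference between the two assumptions fits in two small functions. This is a deliberately simplified sketch with hypothetical names: under the closed-world reading a missing fact is false, while under the open-world reading it is merely unknown, represented here by `None`.

```python
from typing import Optional


def ask_closed_world(facts, triple) -> bool:
    """Database-style answer: what is not stated is false."""
    return triple in facts


def ask_open_world(facts, triple) -> Optional[bool]:
    """Open-world answer: what is not stated is unknown (None)."""
    return True if triple in facts else None


facts = {
    (":bob", "ex:daughter", ":alice"),
    (":bob", "ex:daughter", ":june"),
}

question = (":bob", "ex:daughter", ":mary")  # hypothetical third daughter
closed_answer = ask_closed_world(facts, question)
open_answer = ask_open_world(facts, question)
```

With these facts `closed_answer` is `False` and `open_answer` is `None`: Bob may well have a third daughter that nobody has described yet.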

Page 30: Introduction to the Semantic Web

Creating RDF data using ontologies

  Overview of different methods:
    Create RDF manually (using your favourite text editor or Web-based interfaces)
    Create XHTML+RDFa documents and use GRDDL transformation
      –  For both humans and machines!
    Use exporters / wrappers for existing services
    Use applications that natively expose RDF data
    Provide mappings from RDBMS to RDF data

Page 31: Introduction to the Semantic Web

Getting a FOAF profile

  Or how to give yourself a URI
    Give yourself an identity on the Semantic Web

  Create your FOAF file
    http://www.ldodds.com/foaf/foaf-a-matic (requires hosting, e.g. your FTP space or uploaded via Drupal)
    http://foafbuilder.qdos.com/builder/ (requires OpenID)

  I already have a homepage, what about duplication of information?
    Use RDFa to embed RDF annotations in your homepage!
    More on this topic in a few slides

Page 32: Introduction to the Semantic Web

Extend your FOAF profile

  The foaf:knows property aims to represent social connections between people
    :alex foaf:knows :axel .
    … but it is deliberately a weak relationship (no strong semantics on why / how we know each other)

  Going further with the relationship vocabulary
    http://vocab.org/relationship/
    Various properties can be used: colleagueOf, hasMet …

  You can extend your FOAF file to add colleagues, co-workers, and use different properties for each of them
    Useful for querying a particular type of relationship only

Page 33: Introduction to the Semantic Web

Defining personal interests

  Instead of modeling interests as plain-text strings, use URIs to describe them!
    Since URIs are unique and unambiguous identifiers
    Allows interlinking of various resources for advanced query purposes: “find all people that like movies directed by Tarantino”

  Using the foaf:topic_interest property
    :me foaf:topic_interest :movie .

  But … where to get these URIs?
    Sindice, the Semantic Web index, can be used to find URIs for a given concept
    http://sindice.com
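A FOAF profile that combines typed relationships and URI-based interests can be produced by a short script. The sketch below is only an illustration: the `example.org` URIs and the name are hypothetical, the `rel:` namespace URI is assumed to be the one published for the relationship vocabulary, and the name is assumed to contain no quote characters.

```python
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rel": "http://purl.org/vocab/relationship/",  # assumed namespace URI
}


def foaf_profile(me, name, relations, interests):
    """Build a small Turtle document describing one person."""
    lines = [f"@prefix {prefix}: <{ns}> ." for prefix, ns in PREFIXES.items()]
    lines.append("")
    statements = ["a foaf:Person", f'foaf:name "{name}"']
    for prop, person in relations:
        statements.append(f"{prop} <{person}>")
    for topic in interests:
        statements.append(f"foaf:topic_interest <{topic}>")
    lines.append(f"<{me}>")
    lines.append(" ;\n".join("    " + s for s in statements) + " .")
    return "\n".join(lines)


profile = foaf_profile(
    me="http://example.org/me",  # hypothetical personal URI
    name="Jane Doe",
    relations=[
        ("foaf:knows", "http://example.org/axel"),
        ("rel:colleagueOf", "http://example.org/axel"),
    ],
    interests=["http://dbpedia.org/resource/Quentin_Tarantino"],
)
print(profile)
```

Keeping foaf:knows next to the more specific rel:colleagueOf lets generic FOAF tools still see the connection, while queries can filter on the precise relationship.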

Page 34: Introduction to the Semantic Web

Defining personal interests

Page 35: Introduction to the Semantic Web

Defining personal interests

Page 36: Introduction to the Semantic Web

RDFa and GRDDL

  GRDDL is a mechanism to transform any kind of XML to RDF

  XHTML+RDFa is XML, hence GRDDL can extract it
    Simply embed RDFa annotations in your HTML code
    Indexed by Yahoo! SearchMonkey and Google
    Done via XSLT, available at http://www.w3.org/2008/07/rdfa-xslt

Page 37: Introduction to the Semantic Web

RDFa and GRDDL

  The GRDDL Primer at http://www.w3.org/TR/grddl-primer/#scheduling shows the overall processing of XHTML+RDFa

Page 38: Introduction to the Semantic Web

RDFa and GRDDL example

  http://sdow2009.semanticweb.org

Page 39: Introduction to the Semantic Web

RDFa and GRDDL example

  http://sdow2009.semanticweb.org
    Browse source to check RDFa annotations

Page 40: Introduction to the Semantic Web

RDFa and GRDDL

  http://sdow2009.semanticweb.org
    Header contains prefixes and links to the GRDDL transformation

Page 41: Introduction to the Semantic Web

RDFa and GRDDL example

  http://sdow2009.semanticweb.org
    Webpage can be translated to native RDF/XML using an RDFa distiller - http://www.w3.org/2007/08/pyRdfa/

Page 42: Introduction to the Semantic Web

Other example

  Adding RDFa in one’s profile
    Need to define prefixes in the header, or include them in the markup
    See http://www.w3.org/TR/xhtml-rdfa-primer/

Page 43: Introduction to the Semantic Web

Wrappers for existing sources

  Creating and maintaining a FOAF file by hand can be a time-consuming task
    How can we automatically get RDF data from existing sources?

  What about Web 2.0 services in which we already give lots of personal information?
    Most of them provide APIs to get structured information (JSON, XML …) about the user profiles, content, etc.
    API to RDF wrappers can easily be implemented
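An API-to-RDF wrapper is mostly a mapping from response fields to vocabulary terms. The sketch below assumes a made-up JSON profile format (`username`, `name`, `homepage`, `contacts`) and a hypothetical base URI; a wrapper for a real service would follow that service's actual response format instead.

```python
import json

FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(text):
    """Quote a string as an N-Triples literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def profile_to_ntriples(profile_json, base_uri):
    """Map one (hypothetical) JSON profile to FOAF statements."""
    profile = json.loads(profile_json)
    subject = f"<{base_uri}{profile['username']}>"
    lines = [f"{subject} <{RDF_TYPE}> <{FOAF}Person> ."]
    if "name" in profile:
        lines.append(f"{subject} <{FOAF}name> {literal(profile['name'])} .")
    if "homepage" in profile:
        lines.append(f"{subject} <{FOAF}homepage> <{profile['homepage']}> .")
    for contact in profile.get("contacts", []):
        lines.append(f"{subject} <{FOAF}knows> <{base_uri}{contact}> .")
    return "\n".join(lines)


# Made-up API response, for illustration only.
response = '{"username": "alex", "name": "Alex", "contacts": ["axel"]}'
rdf = profile_to_ntriples(response, "http://example.org/user/")
```

Each contact becomes a foaf:knows link between two URIs minted from the same base, so profiles exported by the wrapper connect to each other.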

Page 44: Introduction to the Semantic Web

Wrappers for Web 2.0 services

  Facebook wrapper
    Generates a FOAF file from your Facebook profile
    If you have a Facebook profile, then you can have a related FOAF file (and escape the Facebook walled-garden!)
    http://www.dcs.shef.ac.uk/~mrowe/foafgenerator.html

  Flickr wrapper
    Generates FOAF + SIOC + links to geographical information (using geonames.org)
    http://apassant.net/home/2007/12/flickrdf

Page 45: Introduction to the Semantic Web

More RDF-ification services

  Translate many structured sources into RDF
    URIBurner
      –  http://linkeddata.uriburner.com/
      –  Open Source, C++, based on Virtuoso
    Any23
      –  Sindice sponsored
      –  Open Source, Java based
    Swignition
      –  http://buzzword.org.uk/swignition/
      –  Perl based
    Triplr
      –  Purely syntactic, fast
      –  http://triplr.org

Page 46: Introduction to the Semantic Web

Native export of RDF data

  CMS can expose RDF data natively using dedicated plug-ins
    SIOC Export for Drupal: http://drupal.org/project/SIOC
    Provides RDF export of each blog post
      –  http://apassant.net/blog/2009/03/07/call-suggested-features-sparql-working-group
      –  http://apassant.net/sioc/node/235
    Using RDF autodiscovery feature in the HTML header
      –  So that RDF can be discovered when browsing HTML
      –  Semantic Radar: http://sioc-project.org/firefox
    RDFa to be included in Drupal7 core!
      –  http://groups.drupal.org/node/16597
      –  100,000s of RDFa-powered websites

Page 47: Introduction to the Semantic Web

Overview: SIOC for vBulletin

Page 48: Introduction to the Semantic Web

Relational to RDF Mapping

  Relational data (RDB) is structured data and can be mapped to RDF in a straightforward way
    Especially useful as many websites are backed by a relational database (e.g. MySQL in the previous Drupal lecture)

  Main issues:
    Closed-world vs. open-world modeling
    Assigning URIs for entities (records)
    Mapping language expressivity

  For a state-of-the-art see http://www.w3.org/2005/Incubator/rdb2rdf/RDB2RDF_SurveyReport.pdf
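The simplest possible mapping turns each row into a resource and each column into a property. The sketch below uses an in-memory SQLite table; the URI scheme (`<base>/<table>/<key>` for rows, `<base>/vocab/<table>#<column>` for properties) is an assumption for illustration, and tools such as D2RQ let you configure all of this.

```python
import sqlite3


def rows_to_triples(conn, table, key_column, base_uri):
    """Map every row of a table to (subject, predicate, value) triples."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        # Assigning URIs for records: one URI per primary key value.
        subject = f"{base_uri}{table}/{record[key_column]}"
        for column, value in record.items():
            if column == key_column or value is None:
                continue  # a NULL is an absent statement, not a false one
            triples.append((subject, f"{base_uri}vocab/{table}#{column}", value))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "Alex", "Galway"), (2, "Axel", None)],
)

triples = rows_to_triples(conn, "person", "id", "http://example.org/")
```

Skipping NULL values reflects the closed-world vs. open-world issue listed above: a missing city produces no triple at all.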

Page 49: Introduction to the Semantic Web

Relational to RDF Mapping

  Standardization
    W3C RDB2RDF Incubator Group 2008/2009
    Upcoming W3C RDB2RDF Working Group

  Current solutions (see state-of-the-art)
    D2RQ
      –  http://www4.wiwiss.fu-berlin.de/bizer/d2rq/
      –  DBLP in RDF: http://dblp.l3s.de/d2r/
    OpenLink’s Virtuoso
      –  http://www.openlinksw.com/virtuoso/
    Triplify
      –  http://triplify.org

Page 50: Introduction to the Semantic Web

SPARQL

  RDF(S)/OWL are useful to produce data
    We then need to query it

  SPARQL
    SPARQL Protocol and RDF Query Language
    The “SQL” of the Semantic Web

  FAQ
    http://www.thefigtrees.net/lee/sw/sparql-faq

  SPARQL Query Recommendation / tutorial
    http://www.w3.org/TR/rdf-sparql-query/

  Currently under standardization for new features
    http://www.w3.org/2009/sparql/wiki/Main_Page

Page 51: Introduction to the Semantic Web

How does it work?

  Basic concept of Graph Pattern Matching
    RDF data is graph data, SPARQL checks if the graph you are looking for belongs to the graph you are querying

  Four different query forms
    SELECT, DESCRIBE, CONSTRUCT, ASK
    Combined with the pattern you want to match and optional features (union, filters …)

  A Protocol
    To query RDF data using SPARQL endpoints via HTTP

  Most endpoints are associated with an RDF store
    A place that stores RDF data and provides open access to it – e.g. http://dbpedia.org/sparql
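On the protocol side, a query is sent to an endpoint as an ordinary HTTP request, with the query text in a `query` parameter. The sketch below only builds the request and does not send it; `build_sparql_request` is a hypothetical helper, and asking for JSON results through the Accept header is an assumption about what the endpoint supports.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(endpoint, query):
    """Prepare an HTTP GET request carrying a SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query})
    # Assumption: the endpoint can return results as SPARQL JSON.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request("http://dbpedia.org/sparql", "ASK { ?s ?p ?o }")
print(request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would actually run the query against the endpoint.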

Page 52: Introduction to the Semantic Web

Example of SELECT queries

“select persons older than 30”

SELECT ?X
WHERE {
  ?X a foaf:Person .
  ?X ex:age ?Y .
  FILTER (?Y > 30)
}
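Graph pattern matching itself can be imitated in a few lines, which may help to see what a SPARQL engine does with this query. This is a toy sketch, not how real stores are implemented: strings starting with `?` are treated as variables, the FILTER is passed as a Python function, the data is made up, and Python 3.8 or later is assumed.

```python
def match_pattern(pattern, triple, binding):
    """Extend a variable binding so that pattern equals triple, or return None."""
    binding = dict(binding)
    for term, value in zip(pattern, triple):
        if isinstance(term, str) and term.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return binding


def select(graph, patterns, keep=lambda binding: True):
    """Match all triple patterns against the graph, then apply the filter."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in graph
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return [binding for binding in bindings if keep(binding)]


# Made-up data for illustration.
people = {
    (":alice", "rdf:type", "foaf:Person"),
    (":alice", "ex:age", 42),
    (":bob", "rdf:type", "foaf:Person"),
    (":bob", "ex:age", 25),
}

older_than_30 = select(
    people,
    [("?X", "rdf:type", "foaf:Person"), ("?X", "ex:age", "?Y")],
    keep=lambda binding: binding["?Y"] > 30,
)
```

Each triple pattern narrows the set of candidate bindings, and the filter is applied to whatever survives, which mirrors the structure of the WHERE clause.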

Page 53: Introduction to the Semantic Web

Query DBpedia

  The Semantic Web aims at creating structured data where there is only HTML data at the moment
    Wikipedia: A great resource for humans, poor for machines
    DBpedia – RDF version of Wikipedia

  New kind of advanced queries
    http://wiki.dbpedia.org/OnlineAccess#h28-5
    People born in Berlin before 1900
    German musicians born in Berlin
    Etc …

  The following queries can be run online
    http://dbpedia.org/snorql

Page 54: Introduction to the Semantic Web

Example query 1

  People Born in Galway
    Simple triple pattern
    http://dbpedia.org/ontology/birthplace

  Answer

SELECT ?who
WHERE {
  ?who <http://dbpedia.org/ontology/birthplace> :Galway .
}

Page 55: Introduction to the Semantic Web

Example query 2

  Japanese name of Galway

  Using the FILTER by LANG clause
    FILTER(lang(?x) = "ja")

  Answer

SELECT ?name
WHERE {
  :Galway rdfs:label ?name .
  FILTER (lang(?name) = "ja") .
}

Page 56: Introduction to the Semantic Web

Example query 3

  Irish cities east of Galway

Page 57: Introduction to the Semantic Web

Example query 3

  FILTER by type and comparison of coordinates

  Answer

PREFIX yago: <http://dbpedia.org/class/yago/>

SELECT DISTINCT ?place ?long WHERE {
  :Galway dbpedia2:westCoord ?glong .
  ?place rdf:type yago:CitiesInTheRepublicOfIreland ;
         dbpedia2:westCoord ?long .
  FILTER (?long < ?glong)
}

Page 58: Introduction to the Semantic Web

Assignment

  Create a FOAF file
    Define your social network (>3) using the relationships vocabulary and add some interests using DBpedia URIs (>3)
    Validate at http://www.w3.org/RDF/Validator/

  Add the same information in your Drupal profile as RDFa
    Check if it translates well using http://www.w3.org/2007/08/pyRdfa/

  Some SPARQL queries over DBpedia (based on the interests defined in your FOAF file)
    Will send the list by e-mail

  Deadline 16 November