ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

Linked Open Data Dan Brickley, Google Denny Vrandečić, Wikimedia Session Linked data, Tuesday, 9:45-11:15


Page 1: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

Linked Open Data Dan Brickley, Google Denny Vrandečić, Wikimedia

Session Linked data, Tuesday, 9:45-11:15

Page 2: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

2

Agenda
• Notation
• Linked Open Data principles
• Applied LOD principles
• Application: schema.org
• Application: Wikidata
• Open questions
• Hands-on intro: On links
• Hands-on: Exploration
• Hands-on: SPARQL
• Hands-on: Spark

22/05/2012

Page 3: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

3

Notation
• URIs here are generally abbreviated with CURIEs (e.g. http://dbpedia.org/resource/Kalamaki = dbpedia:Kalamaki)
• Entities and literals are labeled rectangles
• Blank nodes are circles
• Triples are arrows, labeled with the property, connecting subject and object

[Diagram: a node labeled dbpedia:Kalamaki and an arrow labeled fb:likes]
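
The CURIE abbreviation from the Notation slide can be sketched in a few lines. This is only an illustration: the function names are made up here, and the prefix table contains just the dbpedia mapping shown on the slide.

```python
# Prefix table: the dbpedia entry comes from the slide; nothing else is assumed.
PREFIXES = {"dbpedia": "http://dbpedia.org/resource/"}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand 'prefix:local' into a full URI; raises KeyError for unknown prefixes."""
    prefix, local = curie.split(":", 1)
    return prefixes[prefix] + local


def compact_uri(uri, prefixes=PREFIXES):
    """Abbreviate a URI with the longest matching namespace, or return it unchanged."""
    matches = [(ns, p) for p, ns in prefixes.items() if uri.startswith(ns)]
    if not matches:
        return uri
    ns, p = max(matches, key=lambda m: len(m[0]))
    return p + ":" + uri[len(ns):]


full = expand_curie("dbpedia:Kalamaki")
short = compact_uri(full)
```

Expanding and then compacting should give back the original CURIE, which is the sense in which the two notations are interchangeable on the following slides.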

Page 4: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

4

LOD PRINCIPLES Background

22/05/2012

Page 5: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

5

Linked Open Data principles

1.  Use URIs as names for things

2.  Use HTTP URIs so that they can be looked up

3.  Provide results in standard formats (e.g. RDF, SPARQL)

4.  Link to other URIs

22/05/2012

Page 6: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

6

WHY SEMANTIC COMPUTING? LOD Application

Page 7: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

7

Page 8: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

8

1 2 10 100 10 D

Field 1: Tag
• 0 = Face up
• 1 = Face down

Field 2: Suit
• 1 = Clubs
• 2 = Diamonds
• 3 = Hearts
• 4 = Spades

Field 3: Rank
• 1 = Ace
• 2..10 = 2..10
• 11 = Jack
• 12 = Queen
• 13 = King

Field 4: Address of next card

Field 5: “Human-readable” title

Example from Donald Knuth, The Art of Computer Programming, Volume 1, Section 2.1
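
To make the point about implicit knowledge concrete, here is a small sketch that decodes the record "1 2 10 100 10 D" using the field tables above. The class and function names are invented for this illustration; only the code tables come from the slide.

```python
from dataclasses import dataclass

# Code tables copied from the slide.
SUITS = {1: "Clubs", 2: "Diamonds", 3: "Hearts", 4: "Spades"}
RANKS = {1: "Ace", 11: "Jack", 12: "Queen", 13: "King"}


@dataclass
class Card:
    face_down: bool
    suit: str
    rank: str
    next_address: int
    title: str


def decode(record):
    """Interpret a 5-field record (tag, suit, rank, next, title)."""
    tag, suit, rank, nxt, title = record
    return Card(tag == 1, SUITS[suit], RANKS.get(rank, str(rank)), nxt, title)


card = decode((1, 2, 10, 100, "10 D"))
```

Everything needed to read the record lives in the decoder, not in the data, which is the problem the following slides address.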

Page 9: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

9

1 2 10 100 10 D

Page 10: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

10

[Diagram: record 1 2 10 100 10 D; arrow label cards:next]

Page 11: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

11

[Diagram: record 1 2 10 100 10 D; labels cards:d10, cards:next, cards:card]

Page 12: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

12

http://example.org/cards/d10
• Oh, an unknown term
• It is an HTTP URI!

GET /cards/d10 HTTP/1.1
Host: www.example.org
Accept: text/rdf+n3, application/rdf+xml

HTTP/1.1 200 OK
Content-Type: text/n3; charset=UTF-8

cards:d10 rdf:type cards:Card ;
    rdfs:label "10 of diamonds"@en ;
    cards:suit cards:diamonds ;
    cards:rank cards:rank-10 .
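
The lookup on this slide is ordinary HTTP with an Accept header. A minimal sketch with Python's standard library follows; it only builds the request, because example.org does not really serve this card data. The function name is an assumption of this sketch.

```python
from urllib.request import Request


def build_lookup(uri):
    """Build (but do not send) a GET request that asks for RDF instead of HTML."""
    return Request(
        uri,
        headers={"Accept": "text/rdf+n3, application/rdf+xml"},
        method="GET",
    )


req = build_lookup("http://example.org/cards/d10")
# urllib.request.urlopen(req) would perform the actual lookup; it is left out
# here because the URI is only an example and returns no RDF.
```

Against a real Linked Data server, the response body could then be handed to an RDF parser.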

22/05/2012

Page 13: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

13

[Diagram: record 1 2 10 100 10 D; labels cards:d10, cards:diamonds, cards:rank-10, cards:next, cards:card, cards:rank]

Page 14: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

14

[Diagram: record 1 2 10 100 10 D; labels cards:d10, cards:diamonds, cards:rank-10, "10 of Diamonds"@en, cards:next, cards:card, cards:rank, rdfs:label]

Page 15: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

15

[Diagram: record 1 2 10 100 10 D; labels cards:d10, cards:facedown, cards:diamonds, cards:rank-10, "10 of Diamonds"@en, cards:next, cards:card, cards:rank, rdfs:label]

Page 16: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

16

[Diagram: labels cards:d10, cards:facedown, cards:diamonds, cards:rank-10, "10 of Diamonds"@en, "Karo 10"@de, "10"^^xsd:int, color:red, cards:next, cards:card, cards:rank, rdfs:label]

Page 17: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

17

[Diagram: labels cards:d10, cards:facedown, cards:diamonds, cards:rank-10, "10 of Diamonds"@en, "Karo 10"@de, "10"^^xsd:int, color:red, cards:next, cards:card, cards:rank, rdfs:label]

cards:suit ○ cards:color ⊑ cards:cardcolor
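
The chain axiom cards:suit ○ cards:color ⊑ cards:cardcolor says: if a card has a suit, and that suit has a color, then the card has that card color. A naive sketch of applying such an axiom to a set of triples is below. The helper name is invented, and the triple linking cards:diamonds to color:red is an assumption read off the diagram labels.

```python
def apply_chain(triples, first, second, implied):
    """Return the triples inferred from the chain axiom: first o second ⊑ implied."""
    inferred = set()
    for s, p, o in triples:
        if p != first:
            continue
        # Follow the second property from the object of the first one.
        for s2, p2, o2 in triples:
            if p2 == second and s2 == o:
                inferred.add((s, implied, o2))
    return inferred


triples = {
    ("cards:d10", "cards:suit", "cards:diamonds"),
    ("cards:diamonds", "cards:color", "color:red"),  # assumed edge
}
new = apply_chain(triples, "cards:suit", "cards:color", "cards:cardcolor")
```

The knowledge about colors sits in the data and the axiom, so changing it does not mean editing program code.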

Page 18: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

18

Programming

Classic:
function color(card) {
  if ((card[2] == 1) or (card[2] == 4)) { return 1; } else { return 2; }
}

Symbolic constants:
function color(card) {
  if ((card.suit == cards.clubs) or (card.suit == cards.spades)) { return cards.black; } else { return cards.red; }
}

Wannabe Hacker:
function color(card) {
  return 2 - int((card[2] == 1) or (card[2] == 4));
}

Semantic:
cards:cardcolor
select ?color where { card cards:cardcolor ?color }

Where is the knowledge? How do I edit it?

Page 19: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

19

Page 20: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

20

[Diagram: labels cards:d10, cards:facedown, cards:diamonds, cards:rank-10, "10 of Diamonds"@en, "Karo 10"@de, "10"^^xsd:int, color:red, color:yellow, cards:next, cards:card, cards:rank, rdfs:label]

cards:suit ○ skat:color ⊑ cards:cardcolor

Page 21: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

21

Page 22: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

22

[Diagram: labels cards:d10, cards:facedown, cards:diamonds, cards:rank-10, "10 of Diamonds"@en, "Karo 10"@de, "10"^^xsd:int, color:red, color:yellow, color:purple, poker:color, cards:cardcolor, cards:next, cards:card, cards:rank, rdfs:label]

cards:suit ○ poker:color ⊑ cards:cardcolor

Page 23: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

23

BUT THOSE ARE KNOWLEDGE-BASED SYSTEMS, AS DONE FOR DECADES!

Page 24: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

24

CHRIS WELTY, IBM

“In the Semantic Web, it is not the ‘Semantic’ which is new, it is the ‘Web’ which is new.”

Page 25: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

25

[Diagram: labels cards:d10, cards:diamonds, color:red, cards:card, color:yellow, color:purple, poker:color, aifb:Elena, fb:like]

Page 26: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

26

[Diagram: graph of linked concepts. Node labels: Elena, AIFB, Purple, Tatort, Diamond, 10-Diamond, Queen-Diamond, Queen, King, KIT, Culture, University, Karlsruhe, Education, China, Ceylon, India, Airline, Asia, Hotel, Restaurant, Enterprise, Airport, Advertisement, Animal, Vegetarian restaurant, Cosmos, TV Show, Incheon, Mumbai Airport, Mumbai, Human, Carbon, Diamond, Lao Tse, Religion, Philosophy, Semantic Web]

Page 27: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

27

Semantic Web

22/05/2012

2007

Page 28: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

28

Semantic Web

22/05/2012

2008

Page 29: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

29 22/05/2012

2009

Page 30: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

30 22/05/2012

2010

Page 31: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

31 22/05/2012

2011

Page 32: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

32

SCHEMA.ORG Applications

22/05/2012

Page 33: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

33

Schema.org A quick look.

Page 34: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

34

Page 35: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

35

Page 36: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

36

Page 37: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

37

Yandex

Page 38: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

38

[Diagram: schema.org type hierarchy, including Event, Place, Intangible, LocalBusiness, Organization, CivicStructure, CreativeWork, Landform, UserInteraction]

Page 39: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

39

For example?

Page 40: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

40

Page 41: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

41

<div itemscope itemtype="http://schema.org/VideoObject">
  <h2>Video: <span itemprop="name">My Title</span></h2>
  <meta itemprop="duration" content="T1M33S" />
  <meta itemprop="thumbnailUrl" content="thumbnail.jpg" />
  <meta itemprop="embedUrl"
    content="http://example.com/videoplayer.swf?video=123" />
  <object ...>
    <embed type="application/x-shockwave-flash" ...>
  </object>
  <span itemprop="description">Video description</span>
</div>

Type: http://schema.org/VideoObject
name = My Title
duration = T1M33S
thumbnailUrl = thumbnail.jpg
embedUrl = http://example.com/videoplayer.swf?video=123
description = Video description
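
How the markup turns into the property list can be sketched with Python's built-in HTML parser. This is a deliberately simplified reader for flat, single-item microdata like the example; it is not a full microdata processor, and the class name is made up for this sketch.

```python
from html.parser import HTMLParser


class ItempropExtractor(HTMLParser):
    """Collect the itemtype and itemprop name/value pairs from flat microdata."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.props = {}
        self._open_prop = None  # itemprop whose text content is being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs and "itemtype" in attrs:
            self.item_type = attrs["itemtype"]
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if "content" in attrs:
            # <meta itemprop=... content=...> carries its value in an attribute.
            self.props[prop] = attrs["content"]
        else:
            self._open_prop = prop
            self.props[prop] = ""

    def handle_data(self, data):
        if self._open_prop is not None:
            self.props[self._open_prop] += data

    def handle_endtag(self, tag):
        self._open_prop = None


SAMPLE = '''<div itemscope itemtype="http://schema.org/VideoObject">
  <h2>Video: <span itemprop="name">My Title</span></h2>
  <meta itemprop="duration" content="T1M33S" />
  <span itemprop="description">Video description</span>
</div>'''

parser = ItempropExtractor()
parser.feed(SAMPLE)
```

After feeding the sample, parser.item_type holds the type URI and parser.props the name/value pairs, i.e. the same shape as the list above.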

Page 42: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

42

(this is almost all you need to know about RDF, incidentally)

Page 43: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

43

WIKIDATA Applications

22/05/2012

Page 44: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

44

Page 45: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

45

[Wikidata mockup, English interface]

Sidebar: Main page, Content, API, Random page, Donate to Wikidata. Interaction: Help, About Wikidata, Community, Recent changes. Languages: Català, Česky, Dansk, Eesti, English, Español, Esperanto, Français, Hrvatski, Italiano, O’zbek, Complete list.

Berlin
Capital of Germany. Also known as: City of Berlin

• Continent: Europe [3 sources]
• Country: Germany [2 sources]
• Population: 3,499,879 (as of November 30, 2011; method: extrapolation) [1 source]
• Population: 3,500,000 (as of 2012; method: estimate) [2 sources]
• [further values]
• Phone prefix: 030 (since June 1973) [2 sources]
• Phone prefix: 0311 (before June 1973) [1 source]
• Mayor: Klaus W| (being typed) [no source]
    Suggestions: Klaus Wowereit (German politician); Klaus Wunderlich (German musician); Klaus Wagner (stalker of the British royal family); Klaus Wagner (German mathematician); Klaus Waldeck (Austrian musician and lawyer)
• Registration license: B [1 source]
• Area: 891,85 km² [2 sources]
• Twin city: Los Angeles [no sources]
• [new statement]

Page 46: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

46

[Wikidata mockup, German interface: the same item rendered in another language]

Sidebar: Hauptseite, Inhalt, API, Zufällige Seite, Spende an Wikidata. Interaktion: Hilfe, Über Wikidata, Benutzerportal, Letzte Änderungen. Sprachen: Català, Česky, Dansk, Eesti, English, Español, Esperanto, Français, Hrvatski, Italiano, O’zbek, Vollständige Liste.

Berlin
Hauptstadt von Deutschland. Auch bekannt als: Stadt Berlin

• Kontinent: Europa [3 Quellen]
• Land: Deutschland [2 Quellen]
• Einwohner: 3.499.879 (Stand 30. November 2011; Methode: Fortschreibung) [1 Quelle]
• Einwohner: 3.500.000 (Stand 2012; Methode: Schätzung) [2 Quellen]
• [weitere Werte]
• Telefonvorwahl: 030 (seit Juni 1973) [2 Quellen]
• Telefonvorwahl: 0311 (vor Juni 1973) [1 Quelle]
• Bürgermeister: Klaus W| (being typed) [keine Quellen]
    Suggestions: Klaus Wowereit (Deutscher Politiker); Klaus Wunderlich (Deutscher Musiker); Klaus Wagner (Stalker der Britischen Königsfamilie); Klaus Wagner (Deutscher Mathematiker); Klaus Waldeck (Österreichischer Musiker und Anwalt)
• Amtliches Kennzeichen: B [1 Quelle]
• Fläche: 891,85 km² [2 Quellen]
• Partnerstadt: Los Angeles [keine Quellen]
• [neue Aussage]

Page 47: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

47

Application: Infoboxes
• Now: every article calls an infobox with local values
• In Wikidata: one page with values
• Wikipedias fill infoboxes with Wikidata values

Page 48: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

48

Page 49: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

49

OPEN QUESTIONS Or: A few dozen possible paper, project and thesis topics

Page 50: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

50

UNFINISHED WORK Open questions

Page 51: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

51

Page 52: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

52

Unfinished work
• What does a unifying logic look like?
• How do we export proofs?
• How do we validate proofs?
• How do we express trust?
• How does the crypto stack really work?
• What are usable interfaces to the Semantic Web?
• How are Semantic Web applications created?

Page 53: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

53

IDENTITY AND REPRESENTATION

Open questions

Page 54: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

54

[Diagram: identifiers and names that may refer to Bart Simpson]
• http://simpsons.com/id/Bart
• http://rdf.freebase.com/id/en.bart_simpson
• http://en.wikipedia.org/wiki/Bart_Simpson
• http://dbpedia.org/resource/Bart_Simpson
• Bart
• Bart Simpson
• 4030 (Character ID on ComicbookDB)

Page 55: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

55

Identity and representation
• Is there anything out there?
• How to find the right identifier?
• How to know what an identifier identifies?
• What about the multitude of identifiers?
• How do we know that two identifiers identify the same entity?
• How do we know that two identifiers identify different entities?
• Without this, can we still usefully apply statistical techniques?
• What about creating new identifiers?
• What if identifiers are ambiguous?
• How to find representations for entities fitting my UI?
• How to choose a representation?
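
One common, if naive, answer to the "same entity" question is to treat asserted sameAs links as an equivalence relation and group identifiers into clusters. The sketch below does this with a small union-find. The links themselves are hypothetical and only reuse the Bart identifiers from the earlier slide; whether such links are trustworthy is exactly the open question.

```python
def same_as_clusters(pairs):
    """Group identifiers connected by sameAs-style links (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    clusters = {}
    for x in parent:
        clusters.setdefault(find(x), set()).add(x)
    return list(clusters.values())


# Hypothetical links, for illustration only.
links = [
    ("http://dbpedia.org/resource/Bart_Simpson",
     "http://rdf.freebase.com/id/en.bart_simpson"),
    ("http://rdf.freebase.com/id/en.bart_simpson",
     "http://simpsons.com/id/Bart"),
]
clusters = same_as_clusters(links)
```

Note that the Wikipedia URL is left out on purpose: it identifies a page about Bart, not Bart himself, and a single wrong link would merge everything it touches.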

Page 56: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

56

TRUST AND DIVERSITY Open questions

Page 57: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

57

[Wikidata mockup, English interface]

Sidebar: Main page, Content, API, Random page, Donate to Wikidata. Interaction: Help, About Wikidata, Community, Recent changes. Languages: Català, Česky, Dansk, Eesti, English, Español, Esperanto, Français, Hrvatski, Italiano, O’zbek, Complete list.

Berlin
Capital of Germany. Also known as: City of Berlin

• Continent: Europe [3 sources]
• Country: Germany [2 sources]
• Population: 3,499,879 (as of November 30, 2011; method: extrapolation) [1 source]
• Population: 3,500,000 (as of 2012; method: estimate) [2 sources]
• [further values]
• Phone prefix: 030 (since June 1973) [2 sources]
• Phone prefix: 0311 (before June 1973) [1 source]
• Mayor: Klaus Wowereit [no source]
• Registration license: B [1 source]
• Area: 891,85 km² [2 sources]
• Twin city: Los Angeles [no sources]
• [new statement]

Page 58: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

58

A statement in Wikidata

Item: Berlin
• Population: 3,499,879 (as of November 30, 2011; method: extrapolation) [2 sources]
• Population: 3,500,000 (as of 2012; method: estimate) [1 source]

Page 59: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

59

A statement in Wikidata

Item: Berlin
• Population: 3,499,879 (as of November 30, 2011; method: extrapolation) [2 sources]
• Population: 3,500,000 [1 source]

[Diagram: Statement1 connects item Berlin, property population and value 3499879, with "as of" 2011-11-30 and "method" Extrapolation; a second population value, 3500000, is also shown]

Page 60: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

60

A statement in Wikidata

Item: Berlin
• Population: 8,000 (as of 15th century; method: estimate) [2 sources]
• Population: 3,500,000 [1 source]

[Diagram: Statement1 connects item Berlin, property population and value 8000, with "as of" 15th century and "method" Estimate; Statement2 has property population and value 3500000]

Page 61: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

61

A statement in Wikidata

Item: Berlin
• Population: 3,499,879 (as of November 30, 2011; method: extrapolation) [2 sources]
• Population: 3,500,000 [1 source]

[Diagram: Statement1 connects item Berlin, property population and value 3499879, with "as of" 2011-11-30 and "method" Extrapolation; Statement2 has property population and value 3500000; Source1, Source2 and Source3 are attached to the statements via "reference" arrows]
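
The statement model from these diagrams can be written down as a small data structure. This is only a sketch: the class and field names (including "qualifiers") are chosen here for illustration, and which of Source1 to Source3 belongs to which statement is an assumption based on the "[2 sources]" and "[1 source]" counts.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    item: str
    prop: str
    value: object
    qualifiers: dict = field(default_factory=dict)  # e.g. "as of", "method"
    references: list = field(default_factory=list)  # sources backing the claim


berlin = [
    Statement("Berlin", "population", 3499879,
              qualifiers={"as of": "2011-11-30", "method": "Extrapolation"},
              references=["Source1", "Source2"]),  # assumed assignment
    Statement("Berlin", "population", 3500000,
              references=["Source3"]),             # assumed assignment
]


def values(statements, item, prop):
    """All values claimed for an item/property pair, without picking a winner."""
    return [s.value for s in statements if s.item == item and s.prop == prop]


populations = values(berlin, "Berlin", "population")
```

Keeping both population values side by side, each with its own context and sources, is what distinguishes this model from a plain item/property/value triple.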

Page 62: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

62

Trust and diversity
• How to express provenance information?
• How to store provenance of data?
• Can provenance information be expressed such that the data is still easily accessible?
• How to query data with provenance information?
• How to deal with genuinely diverse data?
• How to match diverse vocabularies?
• How to deal with noisy data?
• Is reification really necessary?
• Do named graphs provide solutions?
• Use one graph per statement?

Page 63: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

63

UNITS AND ACCURACY Open questions

22/05/2012

Page 64: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

64

Units and accuracy
• How to express “17th century” next to literal dates?
• How to express heterogeneous accuracies?
• Is a functional value of 40,000km really inconsistent with 39,987km?
• How to express confidence values?
• How to express units?
• Is 176cm equal to 5ft9? 177cm too? Is equality transitive?
• How to express ranges (e.g. property “active” for bands)?
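
The 176cm / 5ft9 / 177cm question can be played through with a tolerance-based comparison. In this sketch the tolerance of half an inch is an arbitrary assumption, picked only to show that "equal within tolerance" is not transitive.

```python
CM_PER_INCH = 2.54


def feet_inches_to_cm(feet, inches):
    """Convert a feet-and-inches height to centimetres."""
    return (feet * 12 + inches) * CM_PER_INCH


def roughly_equal(a_cm, b_cm, tolerance_cm=CM_PER_INCH / 2):
    """Treat two lengths as equal if they differ by at most the tolerance."""
    return abs(a_cm - b_cm) <= tolerance_cm


five_nine = feet_inches_to_cm(5, 9)  # about 175.26 cm

a_b = roughly_equal(five_nine, 176)  # 5ft9 vs 176cm
b_c = roughly_equal(176, 177)        # 176cm vs 177cm
a_c = roughly_equal(five_nine, 177)  # 5ft9 vs 177cm
```

With this tolerance the first two comparisons hold while the third does not, so merging values by approximate equality depends on the order in which they are compared.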

22/05/2012

Page 65: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

65

SERIALIZATIONS Open questions

Page 66: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

66

[Diagram: RDF graph drawn with full URIs]
Entities: http://simpsons.com/id/Bart, http://simpsons.com/id/Marge, http://simpsons.com/id/Lisa
Classes: http://family.org/id/Child, http://family.org/id/Adult
Properties: http://family.org/id/parent, http://family.org/id/sibling, http://www.w3.org/2000/01/rdf-schema#label, http://www.w3.org/1999/02/22-rdf-syntax-ns#type
Short names shown: Bart, Marge, Lisa, Child, Adult, parent, sibling

Page 67: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

67

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:family="http://family.org/id/">
  <rdf:Description rdf:about="http://simpsons.com/id/Marge">
    <rdf:type rdf:resource="http://family.org/id/Adult"/>
    <rdfs:label>Marge</rdfs:label>
  </rdf:Description>
  <rdf:Description rdf:about="http://simpsons.com/id/Bart">
    <rdfs:label>Bart</rdfs:label>
    <rdf:type rdf:resource="http://family.org/id/Child"/>
    <family:parent rdf:resource="http://simpsons.com/id/Marge"/>
    <family:sibling rdf:resource="http://simpsons.com/id/Lisa"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://simpsons.com/id/Lisa">
    <rdfs:label>Lisa</rdfs:label>
    <rdf:type rdf:resource="http://family.org/id/Child"/>
    <family:parent rdf:resource="http://simpsons.com/id/Marge"/>
  </rdf:Description>
</rdf:RDF>

Page 68: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

68

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix family: <http://family.org/id/> .
@prefix simpsons: <http://simpsons.com/id/> .

simpsons:Marge rdf:type family:Adult ;
    rdfs:label "Marge" .

simpsons:Bart rdf:type family:Child ;
    rdfs:label "Bart" ;
    family:parent simpsons:Marge ;
    family:sibling simpsons:Lisa .

simpsons:Lisa rdf:type family:Child ;
    rdfs:label "Lisa" ;
    family:parent simpsons:Marge .

{ "id" : "Bart", "type" : "Child", "sibling" : "Lisa", "parent" : "Marge" }

Page 69: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

69

Child(Bart).
sibling(Bart, Lisa).
parent(Bart, Marge).

Bart is a son of [[parent::Marge]] and the brother of [[sibling::Lisa]].

0 HEAD
1 FILE simpsons
1 GEDC
2 VERS 5.5
0 @I1@ INDI
1 NAME Marge /Bouvier/
2 SURN Simpson
1 SEX F
1 FAMS @F1@
0 @I2@ INDI
1 NAME Bart /Simpson/
1 SEX M
1 FAMC @F1@
0 @I3@ INDI
1 NAME Lisa /Simpson/
1 SEX F
1 FAMC @F1@
0 @F1@ FAM
1 WIFE @I1@
1 CHIL @I2@
1 CHIL @I3@
0 TRLR

{ "id" : "Bart", "type" : "Child", "sibling" : "Lisa", "parent" : "Marge" }

Page 70: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

70

Serializations
• Do all tools need to understand all serializations?
• Are all serializations lossless?
• How to ensure they are up-to-date?
• What about current tools that don’t understand anything?
• Is the data sufficiently complete?
• How to seamlessly ground and lift data to RDF?
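
"Lifting" the small JSON record from the previous slides to RDF-style triples might look like the sketch below. The mapping rules (which keys become which URIs) are assumptions of this sketch; the namespaces are the ones used in the Simpsons examples.

```python
import json

SIMPSONS = "http://simpsons.com/id/"
FAMILY = "http://family.org/id/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def lift(doc):
    """Turn the flat JSON record into a set of (subject, predicate, object) URI triples."""
    record = json.loads(doc)
    subject = SIMPSONS + record["id"]
    triples = set()
    for key, value in record.items():
        if key == "id":
            continue
        if key == "type":
            triples.add((subject, RDF_TYPE, FAMILY + value))
        else:
            # Assumed rule: every other key is a family: property pointing
            # at another simpsons: entity.
            triples.add((subject, FAMILY + key, SIMPSONS + value))
    return triples


triples = lift('{"id": "Bart", "type": "Child", "sibling": "Lisa", "parent": "Marge"}')
```

The result has no rdfs:label triples and nothing about Marge or Lisa themselves, which is one way to see that the JSON form is not a lossless rendering of the RDF graph.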

Page 71: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

71

ONTOLOGIES Open questions

Page 72: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

72

Ontologies

• “An ontology is a formal specification of a shared conceptualization”
• Defines concepts and their formal relations to each other
• You can understand a concept without having a word for it
• Axiom not possible in OWL L, can only be approximated

parent ○ brother = uncle

Bart

Marge Selma Sideshow Bob ⚭Homer ⚭Herb

parent ○ sister ○ husband V ⊑ sibling

Page 73: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

73

Ontologies

• “An ontology is a formal specification of a shared conceptualization”
• Strict taxonomies
    • Bart a FictionalPerson
• owl:sameAs
    • GDR sameAs Germany
• Classes as individuals
    • Eagle a EndangeredSpecies
• rdfs:domain and rdfs:range
    • family:child rdfs:range foaf:Person
• “Unauthorized” extensions
    • foaf:favouriteMovie

Page 74: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

74

Ontologies
• How to achieve and measure sharedness?
• Who defines the semantics of a term?
• How to achieve correctness?
• Does sharedness mean correctness?
• How to overcome limitations on expressivity?
• How to deal with wishes for more expressivity?
• How to deal with undecidability?
• What does inconsistency mean?
• How to deal with brittleness?

Page 75: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

75

PRIVACY Open questions

Page 76: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

76 76

Page 77: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

77 77

Page 78: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

78

Privacy
• How to ensure privacy?
• What does privacy mean?
• How to publish linked data that is not open?
• What about the ethics of combining data?

Page 79: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

79

SCALABILITY Open questions

Page 80: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

80

Web Data Commons
• Extracts data from Common Crawl (5 billion pages, 20 TB compressed)
• 65,408,946 domains with triples
• 1,222,563,749 typed entities
• 3,294,248,653 triples
• www.webdatacommons.org

22/05/2012

Page 81: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

81

Scalability
• How to efficiently use Semantic Web data?
• How to select the appropriate set?
• How to cache it?
• How to deal with frequent updates?
• How to deal with SPARQL endpoints vs RDF?
• How to do federated queries?
• Who pays for it and when?

Page 82: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

82

QUESTIONS?

Page 83: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

83

WHAT ABOUT THE LINKS? Introduction to Hands-On

22/05/2012

Page 84: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

84

What are the links in "linked data"?

Are they links between things?

Are they links between documents?

How exactly do the "Web hyperlinks" we know and love relate to the factual "typed links" of data modeling?

Page 85: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

85

Links and Links
• These questions motivate and drive the Linked Data project, and have been with the Web from the start.
• They explain our most boring debates ("httpRange-14").
• And show how 'Semantic Web' is a project to improve the mainstream Web itself.

Page 86: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

86

Page 87: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

87

In the beginning...

(1989, 1994, ...)

Page 88: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

88

Page 89: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

89

Page 90: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

90

Page 91: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

91

Page 92: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

92

Page 93: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

93

Page 94: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

94

Page 95: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

95

Page 96: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

96

What's in a (hyper)link?

• Does a node in the graph stand for 'Stephen Fry'-the-person, or for 'a page about Stephen Fry'?
• What about when there are multiple pages about the same person, in different voices, sometimes disagreeing?
• RDF thinks in triples, but data management is often in quads: asking who-said-what in SPARQL
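
The step from triples to quads can be shown without any RDF library: a quad is a triple plus the name of the graph (the source) that asserts it. The identifiers and graph names below are hypothetical placeholders, not data from the hands-on folder.

```python
# (subject, predicate, object, graph): the fourth element names the source.
quads = [
    ("ex:StephenFry", "ex:occupation", "ex:Actor", "graph:sourceA"),
    ("ex:StephenFry", "ex:occupation", "ex:Writer", "graph:sourceB"),
    ("ex:StephenFry", "ex:occupation", "ex:Actor", "graph:sourceB"),
]


def who_said(quads, s, p, o):
    """Return the sorted names of all graphs that assert the given triple."""
    return sorted(g for (qs, qp, qo, g) in quads if (qs, qp, qo) == (s, p, o))


def flatten(quads):
    """Merge all graphs into one set of triples, dropping who-said-what."""
    return {(s, p, o) for (s, p, o, _g) in quads}


sources = who_said(quads, "ex:StephenFry", "ex:occupation", "ex:Actor")
merged = flatten(quads)
```

In SPARQL the same question is asked with a GRAPH pattern; the flattened view is convenient for querying but can no longer say who made which claim.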

Page 97: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

97

1989 again

One flat graph? What if we disagree?

Page 98: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

98

A Graph of Graphs?

• Classic WWW hypertext is a top-level document graph.
• Those documents make claims about the world; factual graphs, e.g. schema.org, RDFa.
• SPARQL lets us store and query all this.
• Each Web 'node' may give us its own 'nodes and links' description, including links.

Page 99: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

99

Page 100: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

100

IMDB

BBC

stephenfry.com

Freebase

sameas.org

dbpedia.org

NewYorkTimes RottenTomatoes

VIAF

Page 101: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

101

We can emphasize the landscape of sites/datasets...

(No single 'correct' view)

Page 102: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

102

(No single 'correct' view)

We can emphasize the landscape of sites/datasets...

Page 103: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

103

Or we can zoom in, and see how records can be merged / flattened into a single set of triples...

Page 104: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

104

Summary

• Linked datasets, pages, real world things...
• ... all of these are represented in RDF datasets.
• To query this hands-on, we can use SPARQL to ask questions, and 'named graphs' to organize factual claims into groups.

Page 105: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

105

EXPLORATION Hands-on

Page 106: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

106

Hands-on
• You will explore datasets about Stephen Fry with SPARQL
• SPARQL yourself and your colleagues
• Spark: SPARQL on the Web

Page 107: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

107

Thinking about data
• We made a data/ folder for you
• Real public RDF data about a real person
• Sources: DBpedia, Freebase, VIAF, sameas.org, New York Times, Identi.ca, BBC, Rotten Tomatoes, IMDB and us.
• I’ll briefly introduce the data now, then see info/data-and-queries-intro.txt

http://192.168.0.20:8080/openrdf-workbench/repositories/Tuesday

Page 108: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

108

What to do

• “Get your hands dirty” with real Linked Data
• If you hit a problem, make a note of it, and ask!
• Most files have RDF describing Stephen Fry; he is real and human, please bear that in mind.
• Study the shape and patterns of the data, ask yourself questions, using SPARQL to explore.

Page 109: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

109

Questions

• What RDF schemas/ontologies do you see?
• How are people and other things identified?
• Are there common patterns across sources?
• Can you write queries that integrate these?
• What bugs in the data are there? How do you think they got there?

Page 110: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

110

Internet Detectives
• For each triple, can you figure out “how it got there”? In whose voice is it?
• Is there a real schema? (if the Wifi is up)
• How would you check its truth? Who “said” it, and how could a machine tell?
• Which sources (or parts) aggregate different points of view within a single RDF graph?

Page 111: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

111

data-and-queries-intro.txt
• See the info/ folder for more details: SPARQL setup and a short querying tutorial.
• Goal is to study the Linked Data Web and understand how it might evolve.
• Identify project and research topics, and ways of helping to improve the Web.

Page 112: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

112

SPARQL YOURSELF Hands-on

Page 113: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

113

SPARQL yourself

SPARQL Web Form: http://192.168.0.20:8080/openrdf-workbench/repositories/Students/query

SPARQL endpoint: http://192.168.0.20:8080/openrdf-sesame/repositories/Students
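
Besides the web form, the endpoint can be called directly: the SPARQL protocol allows a query to be sent as the `query` parameter of an HTTP GET. The sketch below only builds such a URL. The endpoint address is the classroom one from the slide and will not be reachable elsewhere, and the example query is an assumption.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Classroom endpoint from the slide; only reachable on the tutorial network.
ENDPOINT = "http://192.168.0.20:8080/openrdf-sesame/repositories/Students"


def query_url(endpoint, sparql):
    """Build a SPARQL-protocol GET URL with the query URL-encoded."""
    return endpoint + "?" + urlencode({"query": sparql})


SPARQL = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"
url = query_url(ENDPOINT, SPARQL)
```

Fetching this URL with an Accept header such as application/sparql-results+json would, on a reachable endpoint, return the result bindings as JSON.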

Page 114: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

114

SPARK Hands-on

Page 115: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

115

Spark

Page 116: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

116

Spark visualizations

Page 117: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

117

Spark visualizations

Page 118: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

118

Exercise

Page 119: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

119

Exercise

Page 120: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

120

Semantic MediaWiki

Page 121: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

121

Semantic MediaWiki - Export

Page 122: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

122

Task

• Let’s add semanticweb.org as an additional source in order to add Dan from there to the list of the “Friends of Spark”.
• Expand spark.zip, then check test/index.html

Page 123: ESWC SS 2012 - Tuesday Tutorial Dan Brickley and Denny Vrandecic: Linked Open Data

123 22/05/2012