
Semantic Knowledge Graphs

2018-10-01 Dr. Hamed Shariat Yazdi, Prof. Jens Lehmann

Overview

1 “Semantic Web” and “Linked Open Data”

2 RDF Data Model

3 RDF-Serialization

4 SPARQL

5 Summary

Dr. Hamed Shariat Yazdi, Prof. Jens Lehmann Semantic Knowledge Graphs 2

“Semantic Web” and “Linked Open Data”


The Current Web

Immensely successful:

B Huge amounts of information and data

B Syntax standards for transfer of structured data

B Machine-processable, human-readable documents

But:

B Content/knowledge cannot be accessed by machines

B Meaning (semantics) of transferred data is not accessible


The Semantic Web Vision

“The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.” Tim Berners-Lee, James Hendler, Ora Lassila (2001)

[Photo: Tim Berners-Lee (Inventor of WWW). Source: http://flic.kr/p/dRiWjB]


The Semantic Web

W3C Definition: “The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. It is a collaborative effort led by W3C with participation from a large number of researchers and industrial partners.”

Source: https://www.w3.org/RDF/FAQ


Linked Open Data

“Linked Open Data (LOD) is Linked Data which is released under an open license, which does not impede its reuse for free.” – Tim Berners-Lee (https://www.w3.org/DesignIssues/LinkedData.html)

B Linked Data

• Semantic Web is about making links between data

• People and machines can explore the web of data

• Having some data, you can find other related data

B Open Data: “Open data is data that can be freely used, re-used and redistributed by anyone.” (http://opendatahandbook.org/)

• Availability and Access: the data must be readily available in a convenient and modifiable form.

• Re-use and Redistribution: reuse, redistribution and intermixing of data with other datasets must be properly permitted.

• Universal Participation: everyone must be able to use, re-use and redistribute; no discrimination against fields of endeavor or against persons or groups is allowed.

B In short: Linked Open Data = Linked Data + Open Data


[Figure: Linked Open Data Cloud, Sep. 2011 (295 datasets). Original source: http://lod-cloud.net]


[Figure: Linked Open Data Cloud, Jul. 2018 (1220 datasets). Original source: http://lod-cloud.net]


Semantic Web-Stack

B Unicode

• A standard for encoding international character sets

• Allows all human languages to be used

B URI (Uniform Resource Identifier)

• Is a string of a standardized form

• Allows resources (e.g., documents) to be uniquely identified

B XML

• A markup language for creating documents composed of structured data

• Provides a common syntax for the semantic web

[Figure: Semantic Web stack. Layers from top to bottom: User interface and applications; Trust; Proof; Unifying Logic; Querying: SPARQL, Ontologies: OWL, Rules: RIF/SWRL; Taxonomies: RDF Schema; Data interchange: RDF; Syntax: XML; Identifiers: URI, Character set: Unicode. Cryptography runs alongside the layers.]


Semantic Web-Stack

B RDF (Resource Description Framework)

• A framework to represent data as triples, i.e. (subject, predicate, object)

• Data is represented as directed labeled graphs

• Anyone can define a vocabulary of terms used for more detailed description

B RDF Schema

• Is a lightweight, easy-to-use language for defining RDF vocabularies (i.e. ontologies)

• Used to define object-oriented concepts such as classes and properties

• Allows standardized description of taxonomies and other ontological constructs


Semantic Web-Stack

B OWL (Web Ontology Language)

• Is a language derived from description logics

• Like RDFS, OWL is a data modeling language used to describe RDF vocabularies

• Offers a much larger vocabulary and more constructs than RDFS

B SPARQL

• Is a SQL-like query language for the semantic web

• Queries are based on graph pattern matching

• The result consists of the variable bindings for which the query pattern matches the data


Semantic Web-Stack

B RDFS and OWL

• Define semantics which allow reasoning within ontologies and knowledge bases

B RIF (Rule Interchange Format)

• Provides rules beyond the constructs available in RDFS and OWL

• Allows describing relations that cannot be directly described using the description logic used in OWL

B Logic, Proof and Trust

• Together provide a solid infrastructure for the application layer


RDF Data Model


Introduction

[Figure. Original source: http://flic.kr/p/65uF7Z]


RDF – Overview

B RDF = Resource Description Framework

B W3C Recommendation since 1999 (revised as RDF 1.0 in 2004 and RDF 1.1 in 2014)

B RDF is a data model

• originally used for metadata about web resources, but later generalized

• encodes structured information

• universal, machine-readable interchange format

• data is represented as triples, i.e. (subject, predicate, object)

• data is structured in the form of directed labeled graphs


Parts of RDF Graphs

B URIs

• Used to uniquely identify resources

B Literals

• Describe data values that do not have their own identity, e.g., “100km/h”

B Blank Nodes

• A blank node is a node in an RDF graph representing a resource for which a URI or literal is not given

• Enable description of characteristics of entities that do not need to be named

RDF Triple

Components of an RDF triple:

B Based on linguistic categories, but not always consistently

B Allowed assignments:

• Subject: URI or blank node

• Predicate: URI (a.k.a. property)

• Object: URI, blank node or literal
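The position rules above can be sketched in a few lines of Python. This is only a minimal illustration: the class names `URI`, `BlankNode` and `Literal` and the function `is_valid_triple` are hypothetical and not taken from any RDF library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str


def is_valid_triple(subject, predicate, obj):
    """Check the position rules: subject is a URI or blank node,
    predicate is a URI, object is a URI, blank node or literal."""
    return (
        isinstance(subject, (URI, BlankNode))
        and isinstance(predicate, URI)
        and isinstance(obj, (URI, BlankNode, Literal))
    )


leipzig = URI("http://dbpedia.org/resource/Leipzig")
label = URI("http://www.w3.org/2000/01/rdf-schema#label")
name = Literal("Leipzig")

print(is_valid_triple(leipzig, label, name))  # literal in object position
print(is_valid_triple(name, label, leipzig))  # literal in subject position
```

The first call is accepted and the second is rejected, because a literal may only appear as the object.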


Example of an RDF Graph


URI

B URI = Uniform Resource Identifier

B Gives resources globally unique names

B Extension of the URL-concept

• or equivalently URL is a special type of URI

B Every object with a clear identity can be a resource

• Books, places, organizations ...

B ISBN serves the same purpose for books


URI-Syntax

B Not every URI denotes a web document, but the URL is often used as the URI for web documents

B URLs:

• Start with a URL scheme, separated from the rest by ":"

B e.g.: http, ftp, mailto, file

• Typically have a hierarchical structure

B [scheme:][//authority][path][?query][#fragment]
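As a small illustration of this structure, Python's standard `urllib.parse.urlsplit` breaks a URL into the five components. The example URLs are made up for this sketch.

```python
from urllib.parse import urlsplit

# A made-up URL that uses all five components.
parts = urlsplit("http://www.example.org/books/search?title=rdf#results")

print(parts.scheme)    # scheme
print(parts.netloc)    # authority
print(parts.path)      # path
print(parts.query)     # query
print(parts.fragment)  # fragment

# Non-hierarchical schemes such as mailto have no authority part.
mail = urlsplit("mailto:alice@example.org")
print(mail.scheme, mail.path)
```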


IRIs

B IRI = Internationalized Resource Identifier

B Generalization of URI concept

B IRI can contain Unicode

B Example:

• http://www.example.org/Wüste

• http://www.example.org/荒野


Literals

B Used to model data values

B Representation through strings

B Interpretation through data type

B Literals without data type are treated as strings

B Literals may never be the origin of an edge of an RDF graph

B Edges may never be labeled with literals


Literals II

B Example: xsd:decimal


Datatypes in RDF

B So far: literals are untyped and treated as strings: "02" < "100" < "11" < "2"

B Typing allows a semantic interpretation of values

B Data types are identified by URIs and can be chosen freely

B Typically, XML Schema datatypes (XSD) are used

B Syntax: "Data Value"^^Datatype-URI

B rdf:XMLLiteral is the only predefined datatype in RDF 1.0 (RDF 1.1 adds rdf:HTML and rdf:langString)

• Used for HTML and XML fragments
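The difference between untyped and typed interpretation can be reproduced with a short Python sketch. Using `Decimal` as a stand-in for xsd:decimal is an assumption made for illustration only.

```python
from decimal import Decimal

values = ["2", "11", "100", "02"]

# Untyped: the values are compared character by character as strings.
as_strings = sorted(values)

# Typed: the same lexical forms are interpreted as decimal numbers.
as_numbers = sorted(values, key=Decimal)

print(as_strings)
print(as_numbers)
```

Sorted as strings, the order is "02", "100", "11", "2". Sorted as numbers, "2" and "02" denote the same value and come before "11" and "100".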


RDF Schema

Not all triples are meaningful:

Example

Cinema AlbertEinstein 2012

How can we restrict the use of RDF?

RDF Schema allows us to define classes and properties and to restrict their use.

B RDFS is a lightweight, easy to use language for defining RDF vocabularies

B Allows rudimentary relationships to be defined

B Used to define object-oriented concepts such as classes and properties

B Offers limited inferencing capabilities

B Allows standardized description of taxonomies and other ontological constructs


Ontologies

B Provides a frame of reference for the disambiguation and the global interconnection of knowledge

B Allows representation and usage of background knowledge

B Allows us to deal with implicit knowledge

B Has explicit formal semantics

B Can be used as a collective resource, e.g. over the WWW

B Allows integration of the distributed knowledge


OWL - Web Ontology Language

B RDFS vs OWL

• RDFS allows definition of rudimentary relationships

• RDFS has limited inferencing power

• OWL allows definition of much richer relationships

• OWL has much better inferencing power

B W3C Recommendation since 2004

B Semantic fragment of Description Logic

B Three variants: OWL Lite ⊆ OWL DL (Description Logic) ⊆ OWL Full

B OWL DL is decidable and corresponds to the description logic SHOIN(D) (for OWL 2 DL: SROIQ(D))


Ontologies – An Example

[Figure: An example of RDFS and a fragment of the FOAF ontology. Taken from: Moreira da Costa, Thiago (2017), OPP IoT: An ontology-based privacy preservation approach for the Internet of Things.]


Test Questions

1. Can literals occur in the subject position of an RDF triple?

Correct answer: no

2. Why is HTML not machine readable data?

It is digitally accessible, but it cannot be understood by computers; in particular, content cannot easily be combined with existing knowledge and/or directly processed.


RDF-Serialization


Problems with RDF-Syntax

[Figure. Original source: http://milicicvuk.com/blog/2011/07/21/problems-of-the-rdf-syntax/]


Overview of Formats

B RDF is a data model that aims to represent information and its relationships at an abstract level

B Therefore, there is no unique RDF serialization format

B Different serialization formats for different purposes:

• Turtle - a text format with emphasis on human readability

• N-Triples - a text format with emphasis on simple parsing

• RDF/XML - the official XML serialization of RDF

• JSON-LD - W3C Recommendation (2014) for expressing RDF in JSON (JavaScript Object Notation)

• RDFa - a mechanism for embedding RDF in (X)HTML


Turtle – Syntax

B Turtle = Terse RDF Triple Language

B URIs in angle brackets

• <http://dbpedia.org/resource/Leipzig>

B Literals in quotes

• "Leipzig"@de

• "51.333332"^^xsd:float

B Triples are subject-predicate-object sentences terminated with a dot.

• <http://dbpedia.org/resource/Leipzig>

<http://www.w3.org/2000/01/rdf-schema#label>

"Leipzig"@de .

B Whitespace and line breaks are ignored outside of identifiers

B Status: W3C Recommendation 25 February 2014, http://www.w3.org/TR/turtle/


Turtle – Abbreviations

B In Turtle one can use abbreviations

• Syntax: @prefix abbr: <URI> .

• E.g.: @prefix dbr: <http://dbpedia.org/resource/> .

B One can transform:

<http://dbpedia.org/resource/Leipzig>

<http://www.w3.org/2000/01/rdf-schema#label>

"Leipzig"@de .

B into:

@prefix dbr: <http://dbpedia.org/resource/> .

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

dbr:Leipzig rdfs:label "Leipzig"@de .
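Prefix expansion is plain string concatenation, which is why the namespace URI has to end with its separator (here "/" or "#"). A minimal Python sketch follows; the `PREFIXES` table and the `expand` helper are hypothetical names used only for illustration.

```python
# Prefix table corresponding to the @prefix declarations of the example.
PREFIXES = {
    "dbr": "http://dbpedia.org/resource/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(prefixed_name, prefixes=PREFIXES):
    """Replace the prefix of a name such as dbr:Leipzig by its namespace URI."""
    prefix, _, local_name = prefixed_name.partition(":")
    return prefixes[prefix] + local_name


print(expand("dbr:Leipzig"))
print(expand("rdfs:label"))
```

If the trailing "#" were missing from the rdfs namespace, the same concatenation would yield a URI ending in "rdf-schemalabel", which is a different resource.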


Turtle – Grouping

B Triples with the same subject can be grouped together:

@prefix rdf:

...

@prefix geo:

dbr:Leipzig dbp:hasMayor dbr:Burkhard_Jung ;

rdfs:label "Leipzig"@de ;

geo:lat "51.333332"^^xsd:float ;

geo:long "12.383333"^^xsd:float .

B Even triples with the same subject and predicate can be grouped together:

@prefix dbr: <http://dbpedia.org/resource/>.

@prefix dbp: <http://dbpedia.org/property/> .

dbr:Leipzig dbp:locatedIn dbr:Saxony,

dbr:Germany;

dbp:hasMayor dbr:Burkhard_Jung .
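Grouping with ";" and "," is only an abbreviation: every combination of subject, predicate and object still stands for one full triple. The following Python sketch makes this explicit by modelling the grouped form as a nested dictionary. The data structure and the `flatten` helper are assumptions for illustration, not a Turtle parser.

```python
# Nested form: subject -> predicate -> list of objects,
# mirroring the ';' (same subject) and ',' (same subject and predicate) grouping.
grouped = {
    "dbr:Leipzig": {
        "dbp:locatedIn": ["dbr:Saxony", "dbr:Germany"],
        "dbp:hasMayor": ["dbr:Burkhard_Jung"],
    }
}


def flatten(grouped):
    """Expand the grouped form into a list of individual triples."""
    return [
        (subject, predicate, obj)
        for subject, predicates in grouped.items()
        for predicate, objects in predicates.items()
        for obj in objects
    ]


for s, p, o in flatten(grouped):
    print(f"{s} {p} {o} .")
```

The grouped statement expands to three triples, two for dbp:locatedIn and one for dbp:hasMayor.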


N-Triples

B N-Triples is a line-based, plain text format

B N-Triples is a subset of Turtle and Notation 3

• Abbreviations and grouping are not allowed

• Limited to the ASCII character set in the original format (RDF 1.1 N-Triples permits UTF-8)

B Example (one triple per line):

<http://www.w3.org/2001/sw/RDFCore/ntriples/> <http://purl.org/dc/elements/1.1/creator> "Dave Beckett" .

<http://www.w3.org/2001/sw/RDFCore/ntriples/> <http://purl.org/dc/elements/1.1/creator> "Art Barstow" .

<http://www.w3.org/2001/sw/RDFCore/ntriples/> <http://purl.org/dc/elements/1.1/publisher> <http://www.w3.org/> .
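Because the format is line-based and has no abbreviations, a single regular expression is enough to split a statement into its three terms. The sketch below is a simplification and not a conforming N-Triples parser: it ignores comments, blank lines and escape validation, and the names `TRIPLE` and `parse_line` are hypothetical.

```python
import re

TRIPLE = re.compile(
    r'^\s*(<[^>]*>|_:\w+)\s+'   # subject: IRI or blank node
    r'(<[^>]*>)\s+'             # predicate: IRI
    # object: IRI, blank node, or literal with optional language tag or datatype
    r'(<[^>]*>|_:\w+|"(?:[^"\\]|\\.)*"(?:@[\w-]+|\^\^<[^>]*>)?)'
    r'\s*\.\s*$'                # terminating dot
)


def parse_line(line):
    """Split one N-Triples statement into (subject, predicate, object)."""
    match = TRIPLE.match(line)
    if match is None:
        raise ValueError(f"not an N-Triples statement: {line!r}")
    return match.groups()


line = (
    '<http://www.w3.org/2001/sw/RDFCore/ntriples/> '
    '<http://purl.org/dc/elements/1.1/creator> "Dave Beckett" .'
)
print(parse_line(line))
```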


RDFa – Motivation

Left: Browser’s point of view, Right: Person’s point of view

B Can we close the gaps so that browsers/agents see more?

B Solution:

• Augment (X)HTML/XML with extra structured content

• Use processors to extract it and convert it into RDF


RDFa – Syntax

B RDFa = RDF in attributes

B It is a bridge between the web of documents and the web of data

B Developed to embed RDF into HTML and XML

B Embedded triples can be accessed or extracted

B IRIs can be used (XML and HTML are nowadays typically encoded as UTF-8 Unicode)

B W3C Recommendation since June 2012 (http://www.w3.org/TR/rdfa-core/)


RDFa – Example

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"

"http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">

<html version="XHTML+RDFa 1.0" xml:lang="en"

xmlns="http://www.w3.org/1999/xhtml"

xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"

xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"

xmlns:xsd="http://www.w3.org/2001/XMLSchema#"

xmlns:dbp="http://dbpedia.org/property/"

xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">

<head>

<title>Leipzig</title>

</head>

<body about="http://dbpedia.org/resource/Leipzig">

<h1 property="rdfs:label" xml:lang="de">Leipzig</h1>

<p>Leipzig is a city in Germany. It is located at latitude

<span property="geo:lat" datatype="xsd:float">51.3333</span> and longitude

<span property="geo:long" datatype="xsd:float">12.3833</span>.

</p>

</body>

</html>
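How a processor can pull triples out of such a page can be sketched with Python's standard `html.parser`. This is a deliberately simplified illustration and not a conforming RDFa processor: it handles only the `about` and `property` attributes, ignores `datatype`, `xml:lang`, prefix resolution and nesting, and the class name `PropertyCollector` is hypothetical. The markup is a shortened version of the example page.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect (subject, property, text) triples from 'about'/'property' attributes."""

    def __init__(self):
        super().__init__()
        self.subject = None   # value of the most recent 'about' attribute
        self._open = None     # property whose element text is being read
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self.subject = attrs["about"]
        if "property" in attrs:
            self._open = attrs["property"]

    def handle_data(self, data):
        if self._open is not None and data.strip():
            self.triples.append((self.subject, self._open, data.strip()))
            self._open = None

    def handle_endtag(self, tag):
        self._open = None


page = """
<body about="http://dbpedia.org/resource/Leipzig">
  <h1 property="rdfs:label" xml:lang="de">Leipzig</h1>
  <p>Latitude <span property="geo:lat" datatype="xsd:float">51.3333</span>,
     longitude <span property="geo:long" datatype="xsd:float">12.3833</span>.</p>
</body>
"""

collector = PropertyCollector()
collector.feed(page)
for triple in collector.triples:
    print(triple)
```

The text that a person reads on the page (label, latitude, longitude) is thereby also available to a program as three statements about the Leipzig resource.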


JSON-LD

B JSON-LD = JavaScript Object Notation for Linked Data

B http://json-ld.org/

B Easily enriches legacy JSON with semantics

B W3C Recommendation since January 2014 (http://www.w3.org/TR/json-ld/); supersedes RDF/JSON

B @context tells Linked Data processors how to interpret the JSON-LD document: it links the JSON keys to the concepts of a vocabulary (in this case, the context document person.jsonld)

B Example:

{

"@context": "http://json-ld.org/contexts/person.jsonld",

"@id": "http://dbpedia.org/resource/John_Lennon",

"name": "John Lennon",

"born": "1940-10-09",

"spouse": "http://dbpedia.org/resource/Cynthia_Lennon"

}
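The role of @context can be illustrated with a small Python sketch that maps each JSON key to a property IRI and emits triples. The inline `CONTEXT` mapping is a hypothetical stand-in chosen for this example (the published person.jsonld may map the keys differently), and `to_triples` is not part of any JSON-LD library.

```python
import json

# Hypothetical context for illustration: JSON key -> property IRI.
CONTEXT = {
    "name": "http://xmlns.com/foaf/0.1/name",
    "born": "http://schema.org/birthDate",
    "spouse": "http://schema.org/spouse",
}

document = json.loads("""
{
  "@id": "http://dbpedia.org/resource/John_Lennon",
  "name": "John Lennon",
  "born": "1940-10-09",
  "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
}
""")


def to_triples(doc, context):
    """Turn every non-keyword key into a (subject, property IRI, value) triple."""
    subject = doc["@id"]
    return [
        (subject, context[key], value)
        for key, value in doc.items()
        if not key.startswith("@")
    ]


for triple in to_triples(document, CONTEXT):
    print(triple)
```

Without a context, "name" is just a string key; with it, the key is tied to a globally identified property, which is what makes plain JSON usable as Linked Data.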


References

B Linked Data: https://www.w3.org/DesignIssues/LinkedData.html

B Semantic Web: https://www.cse.wustl.edu/~jain/cse570-13/ftp/semantic/index.html, https://www.obitko.com/tutorials/ontologies-semantic-web/semantic-web-architecture.html

B URI: http://www.ietf.org/rfc/rfc1630.txt

B N3: http://www.w3.org/DesignIssues/Notation3.html

B RDF: http://www.w3.org/RDF/

B RDFa Lite: http://www.w3.org/TR/rdfa-lite/

B RDFa 1.1 Core – 2nd Ed.: http://www.w3.org/TR/2013/REC-rdfa-core-20130822/

B RDFa: https://www.slideshare.net/ivan_herman/rdfa-tutorial

B RDF / JSON: http://www.w3.org/blog/SW/2011/09/13/the-state-of-rdf-and-json/

B Xturtle Editor: http://aksw.org/Projects/Xturtle.html

B HTML+RDFa 1.1: http://www.w3.org/TR/html-rdfa/

B XHTML+RDFa 1.1 – 2nd Ed.: http://www.w3.org/TR/2013/REC-xhtml-rdfa-20130822/


SPARQL


SPARQL

SPARQL (pronounced "sparkle") is a recursive acronym that stands for: SPARQL Protocol And RDF Query Language

B SPARQL 1.0 W3C-Recommendation since January 15th 2008

B SPARQL 1.1 W3C-Recommendation since March 21st 2013

B Query language to query instances in RDF documents

B Great practical importance (almost all applications need it)

Parts of the SPARQL specification

B Query language: topic of this lecture

B Output format: Representation of results in XML

B Query protocol: Transmission of queries and results


SPARQL - Query Language for RDF

Example

SELECT * WHERE { jwebsp:John foaf:knows ?friend }


Simple Queries

A simple query example:

PREFIX ex: <http://example.org/>

SELECT ?title ?author

WHERE

{ ?book ex:hasPublisher <http://springer.com/Verlag> .

?book ex:title ?title .

?book ex:author ?author . }

B Main component is a query pattern (WHERE)

• The query pattern uses the Turtle syntax for RDF

• Patterns may contain variables (?variable)

B Abbreviated forms for URIs are allowed (PREFIX)

B The query result is given by a selection of variables (SELECT)


Example Output

[...] ?book ex:hasPublisher <http://springer.com/Verlag> .

?book ex:title ?title .

?book ex:author ?author . }

Example RDF document:

@prefix ex: <http://example.org/> .

ex:SemanticWeb ex:hasPublisher <http://springer.com/Verlag> ;

ex:title "Semantic Web - Foundations" ;

ex:author ex:Hitzler , ex:Krötzsch ,

ex:Rudolph , ex:Sure .

Result of the query: table with one row per result

title                          author
"Semantic Web - Foundations"   http://example.org/Hitzler
"Semantic Web - Foundations"   http://example.org/Krötzsch
"Semantic Web - Foundations"   http://example.org/Rudolph
"Semantic Web - Foundations"   http://example.org/Sure

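How graph pattern matching produces one row per result can be sketched in plain Python. This is a naive illustration under simplifying assumptions, not how a SPARQL engine is implemented: terms are plain strings, any term starting with "?" is treated as a variable, and the helper names `match_pattern` and `evaluate` are hypothetical. The data is the example document with full URIs (the author name is written without umlaut here).

```python
EX = "http://example.org/"
PUBLISHER = "http://springer.com/Verlag"

graph = [
    (EX + "SemanticWeb", EX + "hasPublisher", PUBLISHER),
    (EX + "SemanticWeb", EX + "title", '"Semantic Web - Foundations"'),
    (EX + "SemanticWeb", EX + "author", EX + "Hitzler"),
    (EX + "SemanticWeb", EX + "author", EX + "Krotzsch"),
    (EX + "SemanticWeb", EX + "author", EX + "Rudolph"),
    (EX + "SemanticWeb", EX + "author", EX + "Sure"),
]

patterns = [
    ("?book", EX + "hasPublisher", PUBLISHER),
    ("?book", EX + "title", "?title"),
    ("?book", EX + "author", "?author"),
]


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`; None if impossible."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def evaluate(patterns, graph):
    """Match all triple patterns conjunctively, one pattern after the other."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for binding in solutions
            for triple in graph
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return solutions


rows = [(s["?title"], s["?author"]) for s in evaluate(patterns, graph)]
for row in rows:
    print(row)
```

Since ?book and ?title each have one possible value and ?author has four, the result has four rows, one per author.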

Simple Graph Pattern

The fundamental query patterns are simple graph patterns

B A set of RDF-Triples in Turtle Syntax

B Turtle abbreviations (using , and ;) are allowed

B Variables are denoted by ? or $

• ?variable has the same meaning as $variable

B Variables can be subject, predicate or object


Grouping Graph Pattern

Simple graph patterns can be grouped by { ... }.

B Groups are used to provide additional structure among patterns

B All patterns must match within a group (interpreted conjunctively)

B An empty group {} is also allowed (it matches any data)

Example:

PREFIX ex: <http://example.org/>

SELECT ?title ?author

WHERE

{ { ?book ex:hasPublisher <http://springer.com/Verlag> .

?book ex:title ?title . }

{ }

?book ex:author ?author .

}

Grouping becomes meaningful when additional constructs are used

Optional Patterns

The keyword OPTIONAL allows specification of an optional part of a pattern

Example:

{ ?book ex:hasPublisher <http://springer.com/Verlag> .

OPTIONAL { ?book ex:title ?title . }

OPTIONAL { ?book ex:author ?author . }

}

Parts of a query result can be unbound:

book                       title      author
http://example.org/book1   "title1"   http://example.org/author1
http://example.org/book2   "title2"
http://example.org/book3   "title3"   _:a
http://example.org/book4              _:a
http://example.org/book5

_:a is a blank node.
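The effect of OPTIONAL corresponds to a left outer join of solution sets: a row is extended when the optional part matches and kept unchanged otherwise. The following Python sketch assumes that solutions are dictionaries from variable names to values; the helpers `compatible` and `left_join` and the shortened data are hypothetical.

```python
def compatible(a, b):
    """Two solutions are compatible if they agree on all shared variables."""
    return all(b[var] == value for var, value in a.items() if var in b)


def left_join(left, right):
    """Keep every left solution; extend it where a compatible right solution exists."""
    result = []
    for solution in left:
        extended = [{**solution, **other} for other in right if compatible(solution, other)]
        result.extend(extended if extended else [solution])
    return result


books = [{"book": "book1"}, {"book": "book2"}, {"book": "book5"}]
titles = [{"book": "book1", "title": "title1"}, {"book": "book2", "title": "title2"}]
authors = [{"book": "book1", "author": "author1"}]

rows = left_join(left_join(books, titles), authors)
for row in rows:
    print(row)
```

book1 receives both title and author, book2 only a title, and book5 stays in the result with both variables unbound.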


Alternative Patterns

The keyword UNION allows specification of alternative parts of a pattern.

Example:

{ ?book ex:hasPublisher <http://springer.com/Verlag> .

{ ?book ex:author ?author . } UNION

{ ?book ex:Author ?author . }

}

The result is the union of the results obtained with each of the two alternatives

Note: identical variable names in the two parts of a UNION do not influence each other
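The UNION semantics can be sketched in Python as follows: each alternative is evaluated on its own and the two solution lists are concatenated. The toy graph, the helper `solutions_for` and the shortened names are assumptions made for this illustration.

```python
graph = [
    ("book1", "ex:author", "author1"),
    ("book2", "ex:Author", "author2"),
    ("book3", "ex:title", "title3"),
]


def solutions_for(predicate, graph):
    """Solutions of the single triple pattern  ?book <predicate> ?author ."""
    return [{"book": s, "author": o} for s, p, o in graph if p == predicate]


def union(left, right):
    """UNION: a solution of either alternative is a solution of the whole pattern."""
    return left + right


rows = union(solutions_for("ex:author", graph), solutions_for("ex:Author", graph))
for row in rows:
    print(row)
```

Both spellings of the property contribute rows, while book3, which has neither property, does not appear.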


Combinations of Options and Alternatives (1)

What do combinations of OPTIONAL and UNION mean?

{ ?book ex:hasPublisher <http://springer.com/Verlag> .

{ ?book ex:author ?author . } UNION

{ ?book ex:Author ?author . } OPTIONAL

{ ?author ex:Lastname ?name . }

}

B Union of the two alternative patterns, with the optional pattern added to the result, or

B Union of the two alternative patterns, where only the second has an optional part?

The first interpretation is the correct one:

{ ?book ex:hasPublisher <http://springer.com/Verlag> .

{ { ?book ex:author ?author . } UNION

{ ?book ex:Author ?author . }

} OPTIONAL { ?author ex:Lastname ?name . }

}


Combinations of Options and Alternatives (2)

General Rules

B OPTIONAL always refers to exactly one grouping pattern to the right of it

B OPTIONAL and UNION have equal precedence and are left-associative (like, for example, subtraction)

Example:

{ {s1 p1 o1} OPTIONAL {s2 p2 o2} UNION {s3 p3 o3}

OPTIONAL {s4 p4 o4} OPTIONAL {s5 p5 o5}

}

means

{ { { { {s1 p1 o1} OPTIONAL {s2 p2 o2}

} UNION {s3 p3 o3}

} OPTIONAL {s4 p4 o4}

} OPTIONAL {s5 p5 o5}

}


Why Filters?

Many queries are not possible even with complex graph patterns:

B “Which persons are between 18 and 23 years old?”

B “Which person’s last name contains a dash?”

B “Which German language texts are available in the ontology?”

Filters are the general mechanism for expressing such conditions


Filter in SPARQL

Example:

PREFIX ex: <http://example.org/>

SELECT ?book WHERE

{ ?book ex:hasPublisher <http://springer.com/Publisher> .

?book ex:Price ?price .

FILTER (?price < 35)

}

B Keyword FILTER, followed by filter expression in parentheses

B Filter conditions output truth values (and possibly errors)

B Many filter functions are not specific to RDF; they are partly taken from the XQuery/XPath standard for XML
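The behaviour of a filter, including the treatment of errors, can be sketched in Python: a solution is kept only if the condition evaluates to true, and a solution for which the condition raises an error is dropped. The helper `sparql_filter` and the sample data are hypothetical, and Python exceptions merely stand in for SPARQL evaluation errors.

```python
def sparql_filter(solutions, condition):
    """Keep the solutions for which `condition` is true; errors eliminate the row."""
    kept = []
    for solution in solutions:
        try:
            if condition(solution):
                kept.append(solution)
        except (KeyError, TypeError):
            # Unbound variable or incomparable types: treated as an error.
            continue
    return kept


solutions = [
    {"book": "book1", "price": 29.9},
    {"book": "book2", "price": 49.0},
    {"book": "book3", "price": "n/a"},   # not a number
    {"book": "book4"},                   # price unbound
]

cheap = sparql_filter(solutions, lambda s: s["price"] < 35)
print(cheap)
```

Only book1 passes: book2 is too expensive, and the rows for book3 and book4 are removed because the comparison cannot be evaluated.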


Filter Functions: Comparison and Arithmetic

Comparison operators: <, =, >, <=, >=, !=

B Comparison of data literals according to natural order

B Support for numerical data types, xsd:dateTime, xsd:string (alphabetic ordering), xsd:boolean (true > false)

B For other types and other RDF elements, only = and != are available

B Comparison of literals of incompatible types is not allowed (e.g. xsd:string and xsd:integer)

Arithmetic operators: +, -, *, /

B Support for numerical data types

B Used to combine values in filter conditions

Ex.: FILTER(?weight/(?size*?size)>=25)


Filter Functions: Special Functions for RDF (1)

SPARQL also supports RDF-specific filter functions:

BOUND(A)       true if A is bound to a value. Variables with the value NaN or INF are considered bound.
isURI(A)       true if A is a URI
isBLANK(A)     true if A is a blank node
isLITERAL(A)   true if A is an RDF literal
STR(A)         lexical representation (xsd:string) of RDF literals or URIs
LANG(A)        language code of an RDF literal (xsd:string), or empty string if there is no language code
DATATYPE(A)    datatype URI of an RDF literal (xsd:string for an untyped literal without a language specification)


Filter Functions: Special Functions for RDF (2)

Other RDF-specific filter functions:

sameTERM(A,B)      true if A and B are the same RDF terms
langMATCHES(A,B)   true if the language specification A fits the pattern B
REGEX(A,B)         true if the character string A contains a match for the regular expression B

Example:

PREFIX ex: <http://example.org/>

SELECT ?book WHERE

{ ?book ex:Review ?text .

FILTER ( langMATCHES( LANG(?text), "de") )

}
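What "fits the pattern" means for langMATCHES can be sketched with a prefix comparison on language tags, in the spirit of basic language-range filtering. The function `lang_matches` below is a simplified, hypothetical reimplementation for illustration only.

```python
def lang_matches(tag, language_range):
    """True if `tag` equals the range or is a more specific sub-tag of it.

    The special range "*" matches every non-empty language tag.
    """
    tag, language_range = tag.lower(), language_range.lower()
    if language_range == "*":
        return tag != ""
    return tag == language_range or tag.startswith(language_range + "-")


print(lang_matches("de", "de"))     # exact match
print(lang_matches("de-AT", "de"))  # regional variant of German
print(lang_matches("en", "de"))     # different language
print(lang_matches("", "*"))        # literal without language tag
```

With the range "de", a filter therefore accepts texts tagged "de" as well as "de-AT" or "de-CH", but not English texts or literals without a language tag.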


Filter functions: Boolean Operators

Filter conditions can be linked with boolean operators: &&, ||, !

Partially also expressible through graph pattern:

B Conjunction

• corresponds to specifications of several filters

B Disjunction

• corresponds to application of filters in alternative patterns


Output Formatting with SELECT

Until now, all results were tables: output format SELECT

Syntax: SELECT <Variable-list> or SELECT *

Advantage: Easy sequential processing of results

Disadvantage: Structure/relationships of the objects in the results are not obvious!

→ use CONSTRUCT instead of SELECT to get RDF back directly


Why Modifiers?

Until now, only basic formatting options for results:

B How can one retrieve defined parts of the output set?

B How are the results ordered?

B Can duplicate result rows be removed directly?

Solution: solution sequence modifiers


Sorting the Results

Sorting of results can be done with the keyword ORDER BY

SELECT ?book ?price
WHERE { ?book <http://example.org/Price> ?price . }
ORDER BY ?price

B URIs are sorted alphabetically as sequences of characters

Other possible specifications:

B ORDER BY DESC(?price): descending

B ORDER BY ASC(?price): ascending, default setting

B ORDER BY DESC(?price) ?title: hierarchical sorting criteria (by price first, then by title)
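The hierarchical criterion (descending price, then title) can be sketched in Python. This is an illustration only, assuming result rows are dictionaries; the rows are hypothetical. It relies on Python's sort being stable: sorting by the secondary key first and by the primary key second yields the hierarchical order.

```python
# Hypothetical result rows with ?title and ?price bindings.
rows = [
    {"title": "B", "price": 10},
    {"title": "A", "price": 20},
    {"title": "C", "price": 20},
    {"title": "A", "price": 10},
]

# Secondary criterion first: ascending title.
ordered = sorted(rows, key=lambda r: r["title"])
# Primary criterion second: descending price; ties keep their title order.
ordered = sorted(ordered, key=lambda r: r["price"], reverse=True)

result = [(r["title"], r["price"]) for r in ordered]
```

The title only decides the order among rows that share the same price.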


LIMIT, OFFSET and DISTINCT

Restriction of the output set:

B LIMIT: maximum number of results (table rows)

B OFFSET: position of the first delivered result (number of results skipped)

B SELECT DISTINCT: removal of duplicate table rows

SELECT DISTINCT ?book ?price
WHERE { ?book <http://example.org/Price> ?price . }
ORDER BY ?price LIMIT 5 OFFSET 25

LIMIT and OFFSET only make sense with ORDER BY, because otherwise the order of the results is not guaranteed!
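How the modifiers interact can be sketched as a small pipeline in Python. This is a simplified model, assuming rows are tuples and that the modifiers are applied in the order ORDER BY, DISTINCT, OFFSET, LIMIT; the function apply_modifiers and the data are hypothetical.

```python
def apply_modifiers(rows, order_key, distinct=False, offset=0, limit=None):
    """Apply ORDER BY, DISTINCT, OFFSET and LIMIT to a list of row tuples."""
    result = sorted(rows, key=order_key)  # ORDER BY
    if distinct:  # DISTINCT: keep the first copy of each row
        seen = set()
        unique = []
        for row in result:
            if row not in seen:
                seen.add(row)
                unique.append(row)
        result = unique
    result = result[offset:]  # OFFSET: skip the first `offset` rows
    if limit is not None:
        result = result[:limit]  # LIMIT: keep at most `limit` rows
    return result


# Hypothetical (book, price) rows; the first three rows occur twice.
rows = [(f"ex:book{i}", i) for i in range(1, 41)]
rows += rows[:3]

# Mirrors SELECT DISTINCT ... ORDER BY ?price LIMIT 5 OFFSET 25.
page = apply_modifiers(rows, order_key=lambda r: r[1], distinct=True, offset=25, limit=5)
```

Here page holds the 26th to 30th cheapest distinct books. Without the sorting step, the same OFFSET and LIMIT would cut an arbitrary window out of the results.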


Overview of the SPARQL Features Introduced

Basic structure: PREFIX, WHERE

Output format: SELECT, CONSTRUCT, ASK

Graph patterns: simple graph pattern, {...} grouping, OPTIONAL, UNION

Filter: BOUND, isURI, isBLANK, isLITERAL, STR, LANG, DATATYPE, sameTERM, langMATCHES, REGEX

Modifiers: ORDER BY, LIMIT, OFFSET, DISTINCT


Test Questions

What is this query doing?

SELECT DISTINCT ?type
WHERE {
  ?e rdf:type ?type
}

Short form:

SELECT DISTINCT ?type { ?e a ?type }

Note: The SPARQL keyword a is a shortcut for the common predicate rdf:type, giving the class of a resource.


Test Questions

What is this query doing?

SELECT ?e ?name
WHERE {
  ?e dbp:title dbpedia:List_of_Russian_rulers .
  ?e rdfs:label ?name
}
LIMIT 20
OFFSET 10


Test Questions

What is this query doing?

SELECT ?director_name ?movie_name ?actor_name
WHERE {
  ?movie dbpedia-owl:starring dbpedia:Julia_Roberts .
  ?movie dbpedia-owl:starring ?actor .
  ?movie rdfs:label ?movie_name .
  ?actor rdfs:label ?actor_name .
  ?movie dbpedia-owl:director ?director .
  ?director rdfs:label ?director_name .
  FILTER (langMatches(lang(?movie_name), "EN"))
}
ORDER BY ?director ?movie


Triple Stores

Purpose:

B Loading and serializing RDF

B Persistent storage of RDF knowledge bases, often as quads (G, S, P, O)

B Efficient queries (via SPARQL)

B Partial inference
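The quad representation (G, S, P, O) can be illustrated with a toy in-memory store in Python. This is a minimal sketch under simplifying assumptions (terms are plain strings, no persistence, no indices, no inference); the class TinyQuadStore and all identifiers in the sample data are hypothetical and do not correspond to any real triple store API.

```python
from typing import Iterator, Optional, Set, Tuple

Quad = Tuple[str, str, str, str]  # (graph, subject, predicate, object)


class TinyQuadStore:
    """Toy quad store: a set of quads plus wildcard pattern matching."""

    def __init__(self) -> None:
        self._quads: Set[Quad] = set()

    def add(self, g: str, s: str, p: str, o: str) -> None:
        # A set ignores quads that are already stored.
        self._quads.add((g, s, p, o))

    def __len__(self) -> int:
        return len(self._quads)

    def match(
        self,
        g: Optional[str] = None,
        s: Optional[str] = None,
        p: Optional[str] = None,
        o: Optional[str] = None,
    ) -> Iterator[Quad]:
        """Yield all quads that agree with the pattern; None acts as a variable."""
        pattern = (g, s, p, o)
        for quad in self._quads:
            if all(want is None or want == have for want, have in zip(pattern, quad)):
                yield quad


store = TinyQuadStore()
store.add("ex:g1", "ex:book1", "ex:price", "10")
store.add("ex:g1", "ex:book1", "ex:title", "SPARQL Basics")
store.add("ex:g2", "ex:book2", "ex:price", "25")
store.add("ex:g1", "ex:book1", "ex:price", "10")  # duplicate, not stored twice

priced = sorted(store.match(p="ex:price"))  # all price statements, any graph
in_g1 = sorted(store.match(g="ex:g1"))      # everything in graph ex:g1
```

The linear scan in match is the part that real stores replace with indices over the quad positions to make SPARQL queries efficient.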


Types of Triple Stores

Native:

B Specially optimized for the RDF data model

B Examples: OWLIM, YARS

Database-based:

B Relational DB schema used to store the triples

B Examples: Sesame/SDB, OntoWiki/Erfurt

Hybrids:

B Relational databases extended with special datatypes and indices

B Examples: Oracle, Virtuoso

Evaluation of Triple Stores

The three main types have different advantages and disadvantages. Run-time performance is evaluated using benchmarks such as DBSBM, BSBM, LUBM, etc.


Summary


RDF and its serialization

B broadly supported standard for the storage and communication of data

B graph-based data model

B serializations: Turtle, N-Triples, RDFa, JSON-LD, RDF/XML

B pure RDF is very data-oriented

B RDFS and OWL for schemata on top

SPARQL as a query language for RDF

B W3C standard, very widespread use

B Queries based on graph patterns

B Various extensions (filters, modifiers, output formats)

B Specification of query syntax, output format, query protocol

B Not covered: semantics via translation into the SPARQL algebra


References

Pascal Hitzler, Markus Krötzsch, Sebastian Rudolph, York Sure

Semantic Web Foundations
