Concise, human-readable Prefixes improve readability...3 Working with RDF •Storage •Querying...


1

Terse RDF Triple Language

• Concise, human-readable

• Prefixes improve readability

Turtle

https://www.w3.org/TR/turtle/


2

Turtle

@prefix css: <http://www.example.org/CSS/> .
@prefix ct: <http://bio2rdf.org/clinicaltrials/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ct:NCT00799760 css:title "Evaluation of Efficacity and Safety…"@en ;
    css:phase "Phase 3"@en ;
    css:enrollment "541"^^xsd:integer .

[Diagram: the TrialURI node is linked by "title" to "Evaluation of Efficacity and Safety of Oseltamivir and Zanamivir", by "phase" to "Phase 3", and by "enrollment" to 541]
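To make the role of prefixes concrete, here is a minimal Python sketch of how a prefixed name is expanded to a full IRI. The prefix table is copied from the slide; the helper name `expand` is hypothetical and only illustrates the idea, it is not part of any RDF library.

```python
# Prefix table copied from the @prefix declarations on the slide.
PREFIXES = {
    "css": "http://www.example.org/CSS/",
    "ct": "http://bio2rdf.org/clinicaltrials/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(prefixed_name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'css:title' to a full IRI (hypothetical helper)."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"Undefined prefix: {prefix}")
    return prefixes[prefix] + local


for name in ("ct:NCT00799760", "css:title", "xsd:integer"):
    print(name, "->", expand(name))
```

Each short name in the Turtle file stands for the long IRI built this way, which is why the prefixed form is easier to read.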


3

Working with RDF

• Storage

• Querying

• Creation

Optional Applications

– Apache Jena, Jena Fuseki

• RDF storage, validation, querying

– R or SAS

Instructions provided prior to conference


4

Storing RDF: Triple Stores

Native

• 4Store http://www.4store.org/

• AllegroGraph http://franz.com/agraph/allegrograph/

• Apache Jena TDB http://jena.apache.org/

• GraphDB http://ontotext.com/products/graphdb/

DBMS-backed

• Apache Jena SDB http://jena.apache.org/

• Oracle Spatial and Graph http://www.oracle.com/technetwork/database/options/spatialandgraph/overview/rdfsemantic-graph-1902016.html

Hybrid

• Sesame http://rdf4j.org/

• Virtuoso http://virtuoso.openlinksw.com/

List at the W3C: https://www.w3.org/2001/sw/wiki/Category:Triple_Store

Adapted from Dr. Harald Sack, Knowledge Engineering with Semantic Web Technologies, 2015


5

Introduction to Jena Fuseki

Try or follow along

• Apache-Jena – contains the APIs, SPARQL engine, the TDB native RDF database and command line tools ARQ, RIOT …

• Apache-Jena-Fuseki – the Jena SPARQL server


6

Load a File into Fuseki

Try or follow along

• File: ex001.ttl

@prefix css: <http://www.example.org/CSS/> .

@prefix ct: <http://bio2rdf.org/clinicaltrials/> .

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ct:NCT00799760 css:title "Evaluation of Efficacity…"@en ;

css:phase "Phase 3"@en ;

css:enrollment "541"^^xsd:int .

Instructions sent to attendees/available on wiki


7

Resource Description Framework (RDF)

• Basic Concepts

• SPARQL

• Creating RDF


8

• SPARQL – SPARQL Protocol and RDF Query Language

• Not limited to RDF

– Utilities for relational database, spreadsheets, XML, JSON

• Protocol

– Rules for queries and results exchange

What is SPARQL?

Mr. Sparkle - The Simpsons
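The "protocol" half of SPARQL can be sketched in a few lines of Python: under the SPARQL Protocol, a query can be sent as an HTTP GET with the query text in a `query` parameter. The endpoint below assumes a local Fuseki dataset named `test`; the helper name `build_query_url` is hypothetical. The sketch only builds the URL and does not contact a server.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: a local Fuseki server exposing a dataset named "test".
ENDPOINT = "http://localhost:3030/test/query"

QUERY = "SELECT * WHERE { ?s ?p ?o . } LIMIT 10"


def build_query_url(endpoint, query):
    """Build the GET URL for a SPARQL Protocol query (hypothetical helper)."""
    return endpoint + "?" + urlencode({"query": query})


url = build_query_url(ENDPOINT, QUERY)
print(url)

# Decoding the URL again shows that the query text survives the round trip.
print(parse_qs(urlsplit(url).query)["query"][0])
```

A client would typically also send an `Accept` header such as `application/sparql-results+json` to choose the result format.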


9

Your First SPARQL Query

Try or follow along

File: ex002.rq

PREFIX css: <http://www.example.org/CSS/>

SELECT *

WHERE{

?s ?p ?o .

} LIMIT 10


10

Query #2: Graph Pattern for Title

Query

PREFIX css: <http://www.example.org/CSS/>
PREFIX ct: <http://bio2rdf.org/clinicaltrials/>

SELECT ?nctid ?title
WHERE{
?nctid css:title ?title .
}

Data

ct:NCT00799760 css:title "Evaluation of Efficacity and Safety…"@en ;

[Diagram: the query pattern aligned with the data triple. S: ?nctid matches ct:NCT00799760. P: css:title matches css:title. O: ?title matches the title literal.]
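The alignment of query pattern and data triple can be imitated with a toy matcher in Python. This is only a sketch of the idea, not how a SPARQL engine is implemented; terms are kept as plain strings, and `match_pattern` is a hypothetical helper.

```python
# Data triples from the slides, written as (subject, predicate, object) strings.
TRIPLES = [
    ("ct:NCT00799760", "css:title", '"Evaluation of Efficacity and Safety…"@en'),
    ("ct:NCT00799760", "css:phase", '"Phase 3"@en'),
]


def match_pattern(pattern, triple):
    """Return variable bindings if the triple fits the pattern, else None."""
    bindings = {}
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable matches anything, but must stay consistent if repeated.
            if bindings.get(term, value) != value:
                return None
            bindings[term] = value
        elif term != value:
            # A fixed term must be identical to the data term.
            return None
    return bindings


pattern = ("?nctid", "css:title", "?title")
results = []
for triple in TRIPLES:
    row = match_pattern(pattern, triple)
    if row is not None:
        results.append(row)

for row in results:
    print(row["?nctid"], row["?title"])
```

Only the `css:title` triple matches, so `?nctid` and `?title` are bound once; the `css:phase` triple is skipped because its predicate differs.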


11

Query for Study Title

Try or follow along

File: ex003.rq

PREFIX css: <http://www.example.org/CSS/>

PREFIX ct: <http://bio2rdf.org/clinicaltrials/>

SELECT ?nctid ?title

WHERE{

?nctid css:title ?title .

}


12

Upload another file

Try or follow along

File: ex004.TTL

@prefix css: <http://www.example.org/CSS/> .

@prefix ct: <http://bio2rdf.org/clinicaltrials/> .

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ct:NCT00799760 css:title "Evaluation of Efficacity …"@en ;

css:phase "Phase 3"@en ;

css:enrollment "541"^^xsd:integer ;

css:primOutcome css:outcome1 .

css:outcome1 rdf:type ct:primary-outcome;

ct:measure "RT-PCR for influenza A virus…"@en ;

ct:time-frame "2 days".


13

Query for Primary Outcome

Data

ct:NCT00799760 css:title "Evaluation of Efficacity …"@en ;
    css:phase "Phase 3"@en ;
    css:enrollment "541"^^xsd:integer ;
    css:primOutcome css:outcome1 .

css:outcome1 rdf:type ct:primary-outcome ;
    ct:measure "RT-PCR for influenza A virus…"@en ;
    ct:time-frame "2 days" .

Graph Query

ct:NCT00799760 css:primOutcome ?outURI .
?outURI ct:measure ?outcome .


14

[Diagram: the query pattern overlaid on the data. ct:NCT00799760 -> css:primOutcome -> ?outURI -> ct:measure -> ?outcome. SELECT ?outcome returns "RT-PCR for influenza A virus…"@en]


15

SPARQL Query

PREFIX css: <http://www.example.org/CSS/>

PREFIX ct: <http://bio2rdf.org/clinicaltrials/>

SELECT ?outcome

WHERE

{

ct:NCT00799760 css:primOutcome ?outURI .

?outURI ct:measure ?outcome .

}

Retrieve data that matches the Graph Pattern

[Diagram: NCTID -> primOutcome -> ?outURI -> measure -> ?outcome]


16

Query for Study Outcome

Try or follow along

PREFIX css: <http://www.example.org/CSS/>

PREFIX ct: <http://bio2rdf.org/clinicaltrials/>

SELECT ?outcome

WHERE{

ct:NCT00799760 css:primOutcome ?outURI .

?outURI ct:measure ?outcome . }

File: ex005.rq
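The two-pattern query works because `?outURI` appears in both patterns and so joins them. The Python sketch below imitates that join over the triples from ex004.TTL. It is a toy illustration under simplifying assumptions (terms as plain strings, hypothetical helpers `extend` and `solve`), not the algorithm used by Jena.

```python
# Triples from ex004.TTL that involve the primary outcome, as plain strings.
TRIPLES = [
    ("ct:NCT00799760", "css:primOutcome", "css:outcome1"),
    ("css:outcome1", "rdf:type", "ct:primary-outcome"),
    ("css:outcome1", "ct:measure", '"RT-PCR for influenza A virus…"@en'),
    ("css:outcome1", "ct:time-frame", '"2 days"'),
]


def extend(bindings, pattern, triple):
    """Extend existing bindings so that pattern matches triple, or return None."""
    new = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # setdefault binds a new variable or returns the earlier binding.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def solve(patterns, triples):
    """Match the patterns one after another, carrying the bindings forward."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = extend(bindings, pattern, triple)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


patterns = [
    ("ct:NCT00799760", "css:primOutcome", "?outURI"),
    ("?outURI", "ct:measure", "?outcome"),
]
outcomes = [solution["?outcome"] for solution in solve(patterns, TRIPLES)]
print(outcomes)
```

The first pattern binds `?outURI` to `css:outcome1`; the second pattern can then only match triples whose subject is that same node.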


17

Query with R

R Packages:

• rrdf

• rrdflibs

http://github.com/egonw/rrdf

Requires Java 7 or higher

Willighagen E. (2014) Accessing biological data in R with semantic web technologies. PeerJ PrePrints 2:e185v3 See https://dx.doi.org/10.7287/peerj.preprints.185v3


18

File: queryLocalTTL.R

library(rrdf)

dataSource = load.rdf("<path to the TTL file>/ex004.ttl",

format="N3")

query = 'PREFIX css: <http://www.example.org/CSS/>

PREFIX ct: <http://bio2rdf.org/clinicaltrials/>

SELECT ?primaryOutcome

WHERE

{

ct:NCT00799760 css:primOutcome ?outURI .

?outURI ct:measure ?primaryOutcome .

}'

queryResult = as.data.frame(sparql.rdf(dataSource, query))

queryResult

Try or follow along


19

> library(rrdf)
Loading required package: rJava
Loading required package: rrdflibs
> dataSource = load.rdf("<your path>/ex004.ttl", format="N3")
log4j:WARN No appenders could be found for logger (org.apache.jena.riot.RDFLanguages).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> query = 'PREFIX css: <http://www.example.org/CSS/>
+ PREFIX ct: <http://bio2rdf.org/clinicaltrials/>
+ SELECT ?primaryOutcome
+ WHERE
+ .... [TRUNCATED]
> queryResult = as.data.frame(sparql.rdf(dataSource, query))
> queryResult
  primaryOutcome
1 RT-PCR for influenza A virus in nasal secretion

Ignore log4j warnings

Query result!


20

Query an Endpoint with R

library(rrdf)

endpoint = "http://localhost:3030/test/query"

query = "SELECT * WHERE {?s ?p ?o . } LIMIT 10 "

queryResult = sparql.remote(endpoint, query)

queryResult

File: queryLocalFuseki.R
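For readers working outside R, the following Python sketch shows what an endpoint typically sends back and how it can be turned into rows. It assumes the response uses the SPARQL 1.1 Query Results JSON format; the sample body is illustrative, made up from the slide data rather than captured from a server, and `rows_from_sparql_json` is a hypothetical helper.

```python
import json

# Illustrative response body in the SPARQL 1.1 Query Results JSON format.
# Assumption: this single row is hand-written from the slide data.
SAMPLE = """
{
  "head": {"vars": ["s", "p", "o"]},
  "results": {"bindings": [
    {"s": {"type": "uri", "value": "http://bio2rdf.org/clinicaltrials/NCT00799760"},
     "p": {"type": "uri", "value": "http://www.example.org/CSS/phase"},
     "o": {"type": "literal", "xml:lang": "en", "value": "Phase 3"}}
  ]}
}
"""


def rows_from_sparql_json(text):
    """Flatten a SPARQL JSON result into a list of {variable: value} dicts."""
    doc = json.loads(text)
    variables = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # A variable may be unbound in a given row; represent that as None.
        rows.append({
            var: (binding[var]["value"] if var in binding else None)
            for var in variables
        })
    return rows


rows = rows_from_sparql_json(SAMPLE)
for row in rows:
    print(row)
```

The resulting list of dictionaries plays the same role as the table that `sparql.remote` returns in R.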


21

Query with SAS

SAS Macros:

• %sparqlquery - SPARQL query

• %sparqlupdate - SPARQL update

https://github.com/MarcJAndersen/SAS-SPARQLwrapper

Implementation:

• SAS PROC HTTP to access the service

• Send query/update as text file

• Input result using SAS LIBNAME for XML

Other approaches:

• PROC groovy to execute Java code from Apache Jena

• SAS Java objects to interface to Apache Jena

Requires a running SPARQL service, for example Apache Jena Fuseki


22

Try or follow along

File: queryLocalFuseki.sas

Assumptions:

• Service active at endpoint

• TTL file uploaded to store


23

Trial Triples with SPARQL

http://lod.openlinksw.com/sparql

DESCRIBE <http://bio2rdf.org/clinicaltrials:NCT00799760>

ns1:NCT00799760 rdf:type ns2:Resource ,
    ns2:Clinical-Study .

ns1:NCT00799760 ns3:title "Evaluation of Efficacity and Safety of Oseltamivir and Zanamivir"@en .

ns2:actual-enrollment 541 ;

…AND MUCH MORE….


24

Query a Remote Source At: http://lod.openlinksw.com/sparql


25

Federated Query: Join data across sources

Local Fed Query Example


26


27

More SPARQL

SPARQL Query Language for RDF https://www.w3.org/TR/rdf-sparql-query/

SPARQL 1.1 Query Language https://www.w3.org/TR/sparql11-query/

“Learning SPARQL” - Bob DuCharme

http://www.learningsparql.com/index.html - examples for download


28

Resource Description Framework (RDF)

• Basic Concepts

• SPARQL

• Creating RDF


29

Creating RDF

• Ontologies

• Create RDF with

• SPARQL

• Text editor

» Validate

• R

• SAS

Other Choices

• Python

• Ruby

• Java

• OpenRefine.....
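As one example of the "Python" option, here is a minimal sketch that writes the trial triples as Turtle using plain string formatting. The helpers `turtle_literal` and `to_turtle` are hypothetical and handle only simple cases (prefixed names and short string literals); a real project would more likely use an RDF library.

```python
# Prefixes and property names follow the earlier Turtle examples in this deck.
PREFIXES = {
    "css": "http://www.example.org/CSS/",
    "ct": "http://bio2rdf.org/clinicaltrials/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def turtle_literal(value, lang=None, datatype=None):
    """Format a string as a Turtle literal with optional language tag or datatype."""
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n"))
    if lang:
        return f'"{escaped}"@{lang}'
    if datatype:
        return f'"{escaped}"^^{datatype}'
    return f'"{escaped}"'


def to_turtle(subject, properties, prefixes=PREFIXES):
    """Serialize one subject with (predicate, object) pairs as a Turtle document."""
    lines = [f"@prefix {prefix}: <{iri}> ." for prefix, iri in prefixes.items()]
    lines.append("")
    body = " ;\n    ".join(f"{pred} {obj}" for pred, obj in properties)
    lines.append(f"{subject} {body} .")
    return "\n".join(lines) + "\n"


doc = to_turtle("ct:NCT00799760", [
    ("css:title", turtle_literal(
        "Evaluation of Efficacity and Safety of Oseltamivir and Zanamivir", lang="en")),
    ("css:phase", turtle_literal("Phase 3", lang="en")),
    ("css:enrollment", turtle_literal("541", datatype="xsd:integer")),
])
print(doc)
```

Whatever tool generates the file, it is worth validating the output before loading it into a triple store.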


30

Become a “Triple Maker”…


31

… and not a “Trouble Maker”

The Trouble with Triples…

As you make triples, tame them with:

• Standard Vocabularies/Ontologies

• Data Models

RDF Data Cube

• “Datensparsamkeit” [1]

Store only the data you need. Link to the rest!

[1] http://martinfowler.com/bliki/Datensparsamkeit.html


32

• No clear division between Vocabulary and Ontology. – W3C

• Vocabulary = standard set of words

• Ontology = concepts and their relations, classes, hierarchies. More formal than vocabulary

• RDFS, Web Ontology Language (OWL)

• Standard naming, classification, inferencing, reasoning

• One of the most important components in the Semantic Web

What is an Ontology?


33

• General purpose

– Dublin Core (common metadata) http://dublincore.org

• Modeling

– OWL, RDFS, SKOS

– W3C RDF Data Cube

• Concept Specific

– STATO – general statistics http://stato-ontology.org/

– Ontology of Clinical Research OCRE http://bioportal.bioontology.org/ontologies/OCRE

– CDISC Standards RDF http://www.cdisc.org/rdf

– Provenance, Authoring and Versioning (PAV) http://purl.org/pav/

Example Ontologies


34

Find Ontologies

Linked Open Vocabularies http://lov.okfn.org/dataset/lov/

541 vocabularies (March 2016)


35

Ontology Tools

Protégé

• Free, widely used

• Web/cloud version

http://protege.stanford.edu/

TopBraid Composer from TopQuadrant

• Free edition, commercial edition


36

Resource Description Framework (RDF)

• Ontologies

• Create RDF with

• SPARQL

• Text editor

» Validate

• R

• SAS


37

Create RDF using SPARQL

…similar to SQL

• CREATE

• UPDATE

• INSERT *

• DELETE

* See later SAS example
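As a rough illustration of creating triples with SPARQL, the Python sketch below prepares a SPARQL 1.1 Update request containing an `INSERT DATA` operation. The update endpoint is an assumption (a local Fuseki dataset named `test` with an `update` service), and `build_update_request` is a hypothetical helper. The request is only built, not sent.

```python
import urllib.request

# Assumption: a local Fuseki dataset "test" with an "update" service enabled.
UPDATE_ENDPOINT = "http://localhost:3030/test/update"

UPDATE = """PREFIX css: <http://www.example.org/CSS/>
PREFIX ct: <http://bio2rdf.org/clinicaltrials/>

INSERT DATA {
  ct:NCT00799760 css:phase "Phase 3"@en .
}
"""


def build_update_request(update_text, endpoint=UPDATE_ENDPOINT):
    """Build (but do not send) a SPARQL Protocol update via direct POST."""
    return urllib.request.Request(
        endpoint,
        data=update_text.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update; charset=utf-8"},
        method="POST",
    )


request = build_update_request(UPDATE)
print(request.get_method(), request.full_url)
# With a server running, urllib.request.urlopen(request) would execute the update.
```

`DELETE DATA` requests follow the same shape, with only the update text changed.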


38

Create RDF: Text Editor

@prefix ct: <http://bio2rdf.org/clinicaltrials/> .

@prefix css: <http://www.example.org/CSS/> .

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

@prefix pav: <http://purl.org/pav/> .

ct:NCT00799760 css:enrollment "541"^^xsd:int ;

css:phase "Phase 3"@en ;

css:title "Evaluation of Efficacity a

pav:createdWith "Text Editor"^^xsd:string .

Try or follow along


39

Validate

• Apache Jena RIOT (RDF I/O Technology)

riot --validate CreateTTLFromEditor.TTL

Example errors

1. Forgot PAV prefix

08:45:44 ERROR riot :: [line: 9, col: 16] Undefined prefix: pav

2. Incorrect triples termination

08:45:44 ERROR riot :: [line: 9, col: 32] Unexpected IRI for predicate…

* note: requires Apache Jena in the system path
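When Jena is not installed, the first kind of error (a forgotten prefix) can be caught with a rough check in Python. The sketch below is a regex-based approximation, not a Turtle parser and not a replacement for riot; `undefined_prefixes` is a hypothetical helper, and the sample text deliberately omits the `pav` declaration.

```python
import re


def undefined_prefixes(turtle_text):
    """Return prefixes that are used in the text but never declared with @prefix."""
    declared = set(re.findall(r"@prefix\s+([A-Za-z][\w-]*)\s*:", turtle_text))
    # Drop IRIs and string literals so that colons inside them are not counted.
    stripped = re.sub(r"<[^>]*>", " ", turtle_text)
    stripped = re.sub(r'"(?:[^"\\]|\\.)*"', " ", stripped)
    # A prefix is a name directly followed by a colon, starting at a word boundary.
    used = set(re.findall(r"(?<![\w@])([A-Za-z][\w-]*):", stripped))
    return used - declared


# Sample with the pav prefix declaration left out on purpose.
SAMPLE = """@prefix ct: <http://bio2rdf.org/clinicaltrials/> .
@prefix css: <http://www.example.org/CSS/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ct:NCT00799760 css:enrollment "541"^^xsd:int ;
    css:phase "Phase 3"@en ;
    pav:createdWith "Text Editor"^^xsd:string .
"""

print(undefined_prefixes(SAMPLE))
```

The check reports `pav`, mirroring the "Undefined prefix: pav" message from riot; syntax problems such as wrong triple termination still need a real parser.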