Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case


Page 1: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

★★★★★

Miroslav Líška, Marek Šurek

Datalan (Bratislava, Slovakia)

Five Star Open Data in SR, 16.9.2015

Toward Government Linked Data: A Slovak Case

data.gov.sk-semanticweb

Page 2: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

Agenda

I. Introduction
1. Five Star Open Data
2. History of Semantic Web in Slovakia/Datalan

II. Method
3. Main Principles
4. data.gov.sk Resource General* URI Pattern
5. Supported ontologies (ODP Ontology + SEMIC Recommendations)

III. Process
6. URI Registration – process
7. URI Registration – use case model

IV. Searching for Business Cases
8. Slovpedia (Tripleskop)
9. Slovpedia (PharmaGuard)

*Annex A: data.gov.sk Resource URI Patterns – detailed specification

Page 3: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

I. Introduction

Page 4: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

1) Five Star Open Data

● Slovak Government Data? Sad story.

But this can change! Semantic Web!

Page 5: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

2) History of Semantic Web in SK, Datalan

Datalan

Slovakia

● 2006 … – 1st Workshop on Intelligent and Knowledge oriented Technologies. SAV, FIIT STU, FEI TUKE

● 2009 … – start of Sestate, Susan, Tripleskop, Slovpedia, Pharmaguard, SemTelcoSearch

● 2013 … – DTLN became a member of the Data Standardization Process in SK /Ministry of Finance SR/ (as ITAS Deputy)

● 1st formal proposal of SK semantic standards [too soon]

● 2015 … – 2nd formal proposal of data.gov.sk-semanticweb_1.0 (we expect approval by the end of 2015)

202X

Miroslav Líška

We fought for the Semantic Web

Page 6: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

II. An Approach

Page 7: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

3) Method Overview

[Diagram: Simple / Extendable Government URI System. data.gov.sk Semantic Standards = URI Pattern Rules + Supported Ontologies + URI Registration Process (method + process). URI Pattern Rules: data.gov.sk general URI pattern; Catalog URI, Ontology URI, Class URI, Individuals Template URI, Individual URI, Dataset URI, Dataset Item URI, Object Property URI, DataType Property URI; URI versioning rules; supplementary URI rules.]

Page 8: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

4) data.gov.sk Resource URI Patterns

TYPE
● id = concrete individual („Lukas Liska“, „Datalan“, „Bratislava“ ...)
● def = ontology entity definition
● doc = document, file, electronic form ...
● set = catalog, dataset (codelist), distribution

CLASS - resource classification

IDENTITY – standard relationalDB-like ID (0000001, 0000002 … )

VERSION - resource version/distribution (2015-09-17, 1.0, A, B …)

General URI Pattern for data.gov.sk Resource

http://data.gov.sk/[TYPE]/[CLASS]/[IDENTITY]/{VERSION}

§1

Page 9: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

General URI Pattern for data.gov.sk Resource

http://data.gov.sk/[TYPE]/[CLASS]/[IDENTITY]/{VERSION}

§1

Example – Legal Form Class (ODP Ontology)

http://data.gov.sk/def/ontology/odp/LegalForm

Example – Legal Form 121 represents a joint stock form of company

http://data.gov.sk/def/legalform/121

Example – Legal Forms Codelist

http://data.gov.sk/set/codelist/legalform

Example – Distribution of the Legal Forms codelist

http://data.gov.sk/set/codelist/legalform/2015-09-16

See Annex A for the full specification.

4) data.gov.sk Resource General URI Patterns
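
The general pattern can be tried out with a few lines of Python. This is only a minimal sketch under stated assumptions: the names `BASE`, `RESOURCE_TYPES` and `build_uri` are illustrative and not part of the proposed standard, and the only rule checked is that TYPE is one of the four values listed on the slide.

```python
from typing import Optional

BASE = "http://data.gov.sk"
# The four resource types listed on the slide (id, def, doc, set).
RESOURCE_TYPES = {"id", "def", "doc", "set"}


def build_uri(rtype: str, rclass: str, identity: str,
              version: Optional[str] = None) -> str:
    """Compose a URI following [TYPE]/[CLASS]/[IDENTITY]/{VERSION}.

    Hypothetical helper: VERSION is optional, as the curly braces
    in the pattern suggest.
    """
    if rtype not in RESOURCE_TYPES:
        raise ValueError(f"unknown resource type: {rtype!r}")
    parts = [BASE, rtype, rclass, identity]
    if version is not None:
        parts.append(version)
    return "/".join(parts)


# Rebuild three of the example URIs from the slide.
print(build_uri("def", "legalform", "121"))
print(build_uri("set", "codelist", "legalform"))
print(build_uri("set", "codelist", "legalform", "2015-09-16"))
```

The ontology class example (def/ontology/odp/LegalForm) has an extra path segment, so it is left out of this sketch; Annex A is the place to look for those rules.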

Page 10: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

5) Supported Ontologies (1/2)

A. ODP Ontology: Knowledge Kernel

Mapping to the actual KDP element

Mapping to actual codelist

Mapping to SEMIC recommended ontology

= OntologizationOf (ElementsOf(KDP + MetaIS)) + mapping to SEMIC Ontologies

Page 11: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

5) Supported Ontologies (2/2)

B. SEMIC Recommended ontologies

DCAT = Data Catalog Vocabulary
ADMS = Asset Description Metadata Schema
ADMS.SW = ADMS for Software
CPSV = Core Public Service Vocabulary
ROV = Registered Organization Vocabulary
LOCN = Location Core Vocabulary
PERSON = Person Core Vocabulary

C. Semantic Core Ontologies

RDF = Resource Description Framework
RDFS = Resource Description Framework Schema
OWL = Web Ontology Language
SKOS = Simple Knowledge Organization System
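
For readers who want to use these vocabularies in data, a prefix table is usually the first step. The sketch below is an assumption-laden illustration: the slides name the vocabularies but not their namespace URIs, so the URIs are the ones commonly published for each vocabulary and should be checked against the current specifications. The prefix spellings and the `expand` helper are illustrative choices.

```python
# Commonly published namespace URIs (not taken from the slides; verify before use).
NAMESPACES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "adms": "http://www.w3.org/ns/adms#",
    "admssw": "http://purl.org/adms/sw/",
    "cpsv": "http://purl.org/vocab/cpsv#",
    "rov": "http://www.w3.org/ns/regorg#",
    "locn": "http://www.w3.org/ns/locn#",
    "person": "http://www.w3.org/ns/person#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'dcat:Dataset' into a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in NAMESPACES:
        raise KeyError(f"unknown prefix in {curie!r}")
    return NAMESPACES[prefix] + local


print(expand("dcat:Dataset"))
print(expand("skos:Concept"))
```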

Page 12: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

III. Process

Page 13: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

6) URI Registration – process model

§3

Page 14: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

7) URI Registration – UC model

Page 15: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

IV. Searching for Business Models

Page 16: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

8) Slovpedia/Tripleskop

[Diagram, future status. Labels: Government data (source1, source2 … sourceN; data, definitions); MetaIS (data.gov.sk URI resources definition); data.gov.sk (Linked Data datasets); Slovpedia (enriched data, additional datasets).]

Page 17: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

8) Slovpedia/Tripleskop

[Diagram, current status. Labels: Government data (source1, source2 … sourceN); Tripleskop; Slovpedia (Linked Data); Sestate; City in Mobile 2; PharmaGuard.]

Page 18: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

8) Slovpedia/Tripleskop

Page 19: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

8) Slovpedia/Tripleskop

Page 20: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

8) Slovpedia/Tripleskop

Page 21: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

8) Slovpedia/Tripleskop

Page 22: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

9) Slovpedia/PharmaGuard (1/2)

PharmaGuard.EU (1.0): a drug and medication mobile application based on government drug data (SK data + drugbank.ca)

uses

Líška, M., Šurek, M.: An Approach to NLP based Drug Interactions with Inferencing. Not yet published.

Page 23: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

9) Slovpedia/PharmaGuard (2/2)


Page 24: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

References

http://www.w3.org/standards/semanticweb/
https://joinup.ec.europa.eu/community/semic/description
http://www.openrdf.org/
http://www.w3.org/OWL/
http://www.w3.org/RDF/

http://sk.linkedin.com/in/miroslavliska/
http://sk.linkedin.com/in/mareksurek
http://www.datalan.sk
http://www.slovpedia.com
http://www.tripleskop.com

Thanks for your attention

Annex A: data.gov.sk Resource URI Patterns >>

Page 25: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

TYPE
● id = concrete individual („Lukas Liska“, „Datalan“, „Bratislava“ ...)
● def = ontology entity definition
● doc = document, file, electronic form ...
● set = catalog, dataset (codelist), distribution

CLASS - resource classification

IDENTITY – standard relationalDB-like ID (0000001, 0000002 … )

VERSION - resource version/distribution (2015-09-17, 1.0, A, B …)

General URI Pattern for data.gov.sk Resource

http://data.gov.sk/[TYPE]/[CLASS]/[IDENTITY]/{VERSION}

§1

Annex A: data.gov.sk Resource URI Patterns

Page 26: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

Individual URI http://data.gov.sk/id/[class]/[code]

Example – Bratislava Self-Governing Region

http://data.gov.sk/id/nuts3/SK01

Example – Datalan

http://data.gov.sk/id/corporatebody/35810734

Example – Drug Concor 30x5mg

http://data.gov.sk/id/drug/94164

Example – Andrej Kiska (Slovak President)

http://data.gov.sk/id/president/andrej-kiska

Example – this document

http://data.gov.sk/doc/pdf/method/uri-for-slovak-public-data/201509-09-16

§1.3

Document URI http://data.gov.sk/doc/[docType/filename]/[version]

§1.2

Example – The Public Procurement Information System

http://data.gov.sk/id/isvs/5854

Annex A: data.gov.sk Resource URI Patterns

Page 27: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

Dataset (codelist) http://data.gov.sk/set/[datasetType]/[dataset]

[datasetType]
● codelist = a set that contains codelist elements
● data = a set that contains data „records“

[dataset] = English name of the dataset

Example – Legal Forms codelist

http://data.gov.sk/set/codelist/legalform

Example – Approved and Categorized Drugs Datasets

http://data.gov.sk/set/data/categorizeddrug

§1.4

Annex A: data.gov.sk Resource URI Patterns

Page 28: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

Dataset item http://data.gov.sk/[type]/[class]/[identity]

[type]
● def = the item represents an ontology entity definition (§1.1)
● id = the item represents an individual (§1.4.3)

[class] = type of the item

[identity] = code of the item

Example – Joint Stock Company as the item of the Legal Forms codelist

http://data.gov.sk/def/legalform/121

Example – Bratislava Region as the item of the NUTS3 codelist

http://data.gov.sk/id/nuts3/SK01

Extended example

legalform:121 rdf:type odp:LegalForm .
legalform:121 rdfs:label "Joint Stock Company"@en .
legalform:121 rdfs:label "Akciová spoločnosť"@sk .
legalform:121 rdfs:label "Aktiengesellschaft"@de .

§1.4.1

Annex A: data.gov.sk Resource URI Patterns
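
The extended example can also be produced programmatically. The sketch below emits the same four statements as N-Triples using only the Python standard library. It rests on two assumptions that the slide does not state: that the prefix `legalform:` expands to http://data.gov.sk/def/legalform/ and that `odp:` expands to http://data.gov.sk/def/ontology/odp/ (both guessed from the example URIs in this deck). The function name `codelist_item_triples` is hypothetical.

```python
# Assumed prefix expansions, inferred from the example URIs in the slides.
LEGALFORM_NS = "http://data.gov.sk/def/legalform/"
ODP_NS = "http://data.gov.sk/def/ontology/odp/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def codelist_item_triples(code, rdf_class, labels):
    """Return N-Triples lines typing one codelist item and labelling it."""
    subject = f"<{LEGALFORM_NS}{code}>"
    lines = [f"{subject} <{RDF_TYPE}> <{ODP_NS}{rdf_class}> ."]
    for lang, text in labels.items():
        # Escape backslashes and double quotes inside the literal.
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{subject} <{RDFS_LABEL}> "{escaped}"@{lang} .')
    return lines


for line in codelist_item_triples(
    "121",
    "LegalForm",
    {"en": "Joint Stock Company",
     "sk": "Akciová spoločnosť",
     "de": "Aktiengesellschaft"},
):
    print(line)
```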

Page 29: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

Catalog (set of datasets) http://data.gov.sk/set/cat/[catalog]

Example – drug related datasets group

http://data.gov.sk/set/cat/registered-drugs

§1.4.2

Annex A: data.gov.sk Resource URI Patterns
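
The patterns for individuals, definitions, documents, datasets and catalogs can be read back from a URI by looking at its first path segments. The following is a rough sketch, not part of the proposal: `classify` is a hypothetical helper, it does not check the host name, and it reads a fourth segment under set/codelist or set/data as a version (a distribution), which is an interpretation of the examples.

```python
from urllib.parse import urlparse


def classify(uri: str) -> str:
    """Guess the kind of data.gov.sk resource from the URI path."""
    segments = [s for s in urlparse(uri).path.split("/") if s]
    if len(segments) < 3:
        return "unknown"
    rtype, rclass = segments[0], segments[1]
    if rtype == "id":
        return "individual"
    if rtype == "def":
        return "ontology entity definition"
    if rtype == "doc":
        return "document"
    if rtype == "set":
        if rclass == "cat":
            return "catalog"
        if rclass in ("codelist", "data"):
            # Assumption: an extra trailing segment is a version.
            return "dataset distribution" if len(segments) > 3 else "dataset"
    return "unknown"


for example in (
    "http://data.gov.sk/id/nuts3/SK01",
    "http://data.gov.sk/def/legalform/121",
    "http://data.gov.sk/set/codelist/legalform",
    "http://data.gov.sk/set/cat/registered-drugs",
):
    print(example, "->", classify(example))
```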

Page 30: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

● A versionable resource is a resource whose versions can exist in parallel, such as
● an information system, a service ...
● an ontology
● a dataset distribution …

● Otherwise a resource is unversionable, such as
● a person
● a geo entity
● ...

Example – The Public Procurement Information System, version 1.0

http://data.gov.sk/id/isvs/5854/1.0

Example – A second version of

http://data.gov.sk/set/codelist/legalform/2015-09-04

Example – The Legal Forms Dataset published 2015-09-04

http://data.gov.sk/set/codelist/legalform/2015-09-04

§1.5 Resource versioning

Annex A: data.gov.sk Resource URI Patterns
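
The versioning rule can be illustrated with a small guard that appends a version only to resources that may have one. This is a sketch under assumptions: the set `UNVERSIONABLE_CLASSES` is hypothetical and simply takes the person and geo-entity classes that appear in the examples of this deck (president, nuts3); the proposal itself may define the distinction differently.

```python
from urllib.parse import urlparse

# Hypothetical: classes from the slide examples that stand for a person
# or a geo entity, which the slide calls unversionable.
UNVERSIONABLE_CLASSES = {"president", "nuts3"}


def add_version(uri: str, version: str) -> str:
    """Append a version segment to an unversioned resource URI."""
    segments = [s for s in urlparse(uri).path.split("/") if s]
    if len(segments) != 3:
        raise ValueError("expected a [TYPE]/[CLASS]/[IDENTITY] URI without version")
    if segments[1] in UNVERSIONABLE_CLASSES:
        raise ValueError(f"resources of class {segments[1]!r} are not versionable")
    return uri.rstrip("/") + "/" + version


print(add_version("http://data.gov.sk/id/isvs/5854", "1.0"))
print(add_version("http://data.gov.sk/set/codelist/legalform", "2015-09-04"))
```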

Page 31: Miroslav Liska (Marek Šurek): Toward Government Linked Data: A Slovak Case

URI – identifies content

URL – navigates to content

Example – an eform

<http://data.gov.sk/doc/eform/DCOM_eDemokracia_StaznostFO_sk/1.0>

Example – eforms XSD

<http://data.gov.sk/doc/xsdschema/DCOM_eDemokracia_StaznostFO_sk/1.0>

Example – NOT this

<http://data.gov.sk/doc/eform/DCOM_eDemokracia_StaznostFO_sk/1.0/share/files/schema.xsd>

§1.6 URI is not URL

Annex A: data.gov.sk Resource URI Patterns
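
One way to catch the "NOT this" case automatically is a simple heuristic: a resource URI should not end in a file name. The sketch below is only that, a heuristic. The extension list `FILE_EXTENSIONS` and the function `looks_like_file_url` are hypothetical and not part of the proposal.

```python
from urllib.parse import urlparse

# Hypothetical list of file extensions that hint at a file location (URL)
# rather than a resource identifier (URI).
FILE_EXTENSIONS = (".xsd", ".xml", ".pdf", ".html", ".json", ".csv")


def looks_like_file_url(uri: str) -> bool:
    """Return True if the last path segment looks like a file name."""
    last_segment = urlparse(uri).path.rstrip("/").rsplit("/", 1)[-1]
    return last_segment.lower().endswith(FILE_EXTENSIONS)


for example in (
    "http://data.gov.sk/doc/eform/DCOM_eDemokracia_StaznostFO_sk/1.0",
    "http://data.gov.sk/doc/xsdschema/DCOM_eDemokracia_StaznostFO_sk/1.0",
    "http://data.gov.sk/doc/eform/DCOM_eDemokracia_StaznostFO_sk/1.0/share/files/schema.xsd",
):
    print(example, "->", looks_like_file_url(example))
```

A version such as 1.0 is not mistaken for a file extension here because only the listed extensions are matched.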