Presentation Dutch Ships and Sailors at ISWC2014

Dutch Ships and Sailors Linked Data Cloud

Victor de Boer, Matthias van Rossum, Jur Leinenga, Rik Hoekstra

With input from Andrea Bravo Balado and Robin Ponstein

Netherlands Institute for Sound and Vision / VU University Amsterdam [email protected]

ISWC2014

Transcript of Presentation Dutch Ships and Sailors at ISWC2014


The Problem: ((Maritime) historical) data is not integrated

25+ Maritime datasets; Heterogeneous

The solution

Well, Linked Data obviously!

But why Linked Data?

• Heterogeneous models, one data format
– Link what can be linked
– Keep specificity of original data
– Allow integration at project level (and beyond)

• Links to other sources: re-use knowledge

• Extensible

• Allow multiple levels of semantic enrichment/normalization
– through Named Graphs
– Provenance

KB Delpher

Dutch-Asiatic Shipping (DAS) – Voyages (Huygens ING)

“VOC Opvarenden”: Mustering and payroll information (DANS Easy)

Dutch Ships and Sailors

Modeling in collaboration with historians (1)

[Figure: RDF graph of one muster-roll record. Each dataset-specific (mdb:) term is shown paired with a dss: or external term.]

Classes: mdb:Aanmonstering and mdb:PersoonsContract (dss:Record); mdb:Schip (dss:Schip); mdb:ScheepsType (dss:ShipType); mdb:Person (foaf:Person, dss:Person); mdb:Rang (dss:Rank)

Properties: mdb:ship (dss:ship); mdb:scheepsnaam (dss:shipname, rdfs:label); mdb:scheepstype (dss:shiptype); mdb:inventarisnummer (dcterms:identifier); mdb:rank (dss:rank); mdb:voornaam (foaf:firstname); mdb:achternaam (foaf:lastname); mdb:has_person; mdb:maandgage; mdb:has_KB_article; dss:has_aanmonstering; owl:sameAs

Resources: mdb:aanmonstering-del_gem-1879-101; mdb:persoonscontract-del_gem-1879-101-16858-Pieter_Hoekstra; mdb:schip-del_gem-1879-101-Isadora; mdb:schip-del_gem-1879-137-Isadora; mdb:persoon-del_gem-1879-101-16858; mdb:schoener; mdb:matroos; <http://resolver.kb.nl/resolve?urn=ddd:010063756:mpeg21:a0045:ocr>

Literals: "1870-1894"; "Isadora"; "32"; "Pieter"; "Hoekstra"

Jur Leinenga (Huygens ING): Muster-rolls Northern Provinces 1803-1937

Modeling in collaboration with historians (2)

[Figure: RDF graph of one payroll count ("telling") record. Each dataset-specific (gzmvoc:, das:) term is shown paired with a dss: term.]

Classes: gzmvoc:Telling and das:Voyage (dss:Record); gzmvoc:Schip (dss:Ship); gzmvoc:Scheepstype (dss:ShipType)

Properties: gzmvoc:schip (dss:has_ship); gzmvoc:scheepsnaam (dss:scheepsnaam, rdfs:label); gzmvoc:has_shiptype (dss:has_shiptype); gzmvoc:scheepstype; gzmvoc:aziatischeBemanning; gzmvoc:azAantalMatrozen; dss:azRegistratieKop; gzmvoc:telling; gzmvoc:heeft DAS heenreis

Resources: gzmvoc:telling-1046-De_Berkel; __bnode_1; gzmvoc:schip-1046-De_Berkel; gzmvoc:type-Ship; das:voyage-1918_61

Literals: "1046"; “Schip”; “De Berkel”; “21”; “Moorse mattroosen”

Matthias van Rossum (VU-hist): Payroll information for European vs Asiatic Sailors (17th / 18th C)

Modelling principles

• Model each dataset as directly as possible
– Only “syntactical” transformation to RDF
– No normalization
• Reusability
• Transparency, trust
• Normalize and link in second stage
– store in separate RDF Named Graphs
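The two-stage principle can be sketched in a few lines of plain Python: a toy quad store in which the direct conversion and the later links live in separate named graphs, so a user can choose which layers to load. The graph names and the `add`/`triples_in` helpers are hypothetical, and the assumption that both Isadora records carry the name "Isadora" via mdb:scheepsnaam is a reading of the diagram, not a statement about the actual data.

```python
from collections import defaultdict

# Toy quad store: graph name -> set of (subject, predicate, object) triples.
store = defaultdict(set)

RAW = "graph:mdb-raw"      # hypothetical name: direct conversion, no normalization
LINKS = "graph:mdb-links"  # hypothetical name: second-stage links


def add(graph, s, p, o):
    """Add one triple to the given named graph."""
    store[graph].add((s, p, o))


# Stage 1: "syntactical" transformation of the source records only.
add(RAW, "mdb:schip-del_gem-1879-101-Isadora", "mdb:scheepsnaam", "Isadora")
add(RAW, "mdb:schip-del_gem-1879-137-Isadora", "mdb:scheepsnaam", "Isadora")

# Stage 2: identity links are stored apart from the raw data.
add(LINKS, "mdb:schip-del_gem-1879-101-Isadora", "owl:sameAs",
    "mdb:schip-del_gem-1879-137-Isadora")


def triples_in(graphs):
    """Union of the triples in the selected named graphs."""
    return set().union(*(store[g] for g in graphs))


raw_view = triples_in([RAW])
full_view = triples_in([RAW, LINKS])
print(len(raw_view), len(full_view))
```

Because the links sit in their own graph, a historian who distrusts the automatic matching can work with `raw_view` alone.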

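[Figure: two dataset-specific statements whose properties are linked to a shared DSS property.]

mdb:Schip1 mdb:scheepsType mdb:Kof
das:ShipX das:typeOfShip das:Kofship
mdb:scheepsType rdfs:subPropertyOf dss:has_shipType
das:typeOfShip rdfs:subPropertyOf dss:has_shipType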
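The effect of the rdfs:subPropertyOf links can be illustrated with a small sketch of the standard RDFS rule (if s p o and p is a subproperty of q, then s q o). It uses the triples from the diagram as plain tuples; the `super_properties` and `infer` helpers are illustrative and say nothing about how the ClioPatria triplestore performs this reasoning.

```python
# Triples from the diagram, as (subject, predicate, object) tuples.
triples = {
    ("mdb:Schip1", "mdb:scheepsType", "mdb:Kof"),
    ("das:ShipX", "das:typeOfShip", "das:Kofship"),
    ("mdb:scheepsType", "rdfs:subPropertyOf", "dss:has_shipType"),
    ("das:typeOfShip", "rdfs:subPropertyOf", "dss:has_shipType"),
}


def super_properties(prop, triples):
    """All properties reachable from prop via rdfs:subPropertyOf."""
    found, todo = set(), [prop]
    while todo:
        current = todo.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subPropertyOf" and o not in found:
                found.add(o)
                todo.append(o)
    return found


def infer(triples):
    """RDFS rule: (s p o) and (p subPropertyOf q) entail (s q o)."""
    inferred = set(triples)
    for s, p, o in triples:
        for q in super_properties(p, triples):
            inferred.add((s, q, o))
    return inferred


# One query on the shared property now reaches both datasets.
ship_types = sorted((s, o) for s, p, o in infer(triples)
                    if p == "dss:has_shipType")
print(ship_types)
```

The original statements are left untouched; the shared property only adds a second way to reach them.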

Link properties and classes to interoperability layer

[Figure: the statements mdb:Schip1 mdb:scheepsType mdb:Kof and das:ShipX das:typeOfShip das:Kofship, with three skos:exactMatch links drawn among the ship-type concepts mdb:Kof, das:Kofship, Aat:Kof and Aat:Platbodems.]

Vocabulary Links

Links to DBpedia (Ship types, places, ranks)
Links to Getty AAT (Ship types, ranks)
Links to GeoNames (Places)

http://semanticweb.cs.vu.nl/amalgame/

Identifying ships

• Identify ships within a dataset using Machine Learning techniques
– Based on: name, size, type, destinations etc.
– Background knowledge

• 33,435 owl:sameAs links

Date | ShipName | ShipType | ShipSize | HomePort | CurrentPort | Captain
1852-02-27 | Alberdiena | kof | NULL | NULL | Noorwegen (N) | Wolkammer Albert Augustinus
1852-07-31 | Alberdina | kof | NULL | Farmsum | Friedrichstadt (D) | Wolkammer Albert A.
1861-09-30 | Alberdina | kof | 98 | NULL | Gdansk, Danzig (PL) | Wolkammer Albert Augustinus
1870-03-08 | Alberdina | brik | 222 | NULL | NULL | Wolkammer Albert Augustinus
1875-09-22 | Alberdina | bark | 309 | NULL | Oostzee | Wolkammer Augustinus

– Robin Ponstein
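As a rough illustration of what a matcher has to work with, the sketch below turns two rows of the table into a feature vector for a candidate pair. This is not the method used in the project: the feature set, the `pair_features` helper and the use of Python's difflib for string similarity are all assumptions made for the example. A learned classifier would take such vectors as input.

```python
from difflib import SequenceMatcher

# Two rows from the table (1852-02-27 and 1861-09-30); None stands for NULL.
rec_a = {"name": "Alberdiena", "type": "kof", "size": None,
         "captain": "Wolkammer Albert Augustinus"}
rec_b = {"name": "Alberdina", "type": "kof", "size": 98,
         "captain": "Wolkammer Albert Augustinus"}


def text_sim(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def size_diff(a, b):
    """Relative size difference, or None when a size is missing or zero."""
    if not a or not b:
        return None
    return abs(a - b) / max(a, b)


def pair_features(a, b):
    """Feature vector for one candidate pair (hypothetical feature set)."""
    return {
        "name_sim": text_sim(a["name"], b["name"]),
        "captain_sim": text_sim(a["captain"], b["captain"]),
        "same_type": a["type"] == b["type"],
        "size_diff": size_diff(a["size"], b["size"]),
    }


features = pair_features(rec_a, rec_b)
print(features)
```

The spelling variants "Alberdiena" and "Alberdina" score high on name similarity, while the missing size shows why no single field can decide a match on its own.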

Linking to Historical newspapers

• Use ML to detect links between ships and historical newspaper articles (delpher.nl)
– Features: ship name, time intervals, captain’s names, ship type, named entities, keywords, background knowledge
• 179,120 links
– Andrea Bravo Balado

Example

[HARLINGEN, 24 October.] . «et gestrande Zweedsche schip , waarvan wij ons vorig no. melding maakten , is door de 'eepboot van hier afgebragt en hier binnengede u BiJ die gelegenheid werd ons medegeeeid, dat nog vier vaartuigen op Terschelling aren gestrand. Tevens is het berigt ontvan°e > dat het hier behoorende schoonerschip Transit, kapitein Schaap, in de Noordzee is gezonken, nadat het achterschip was weggeslagen ; een ligtmatroos verloor daarbij het leven. Mede zijn hier drie vreemde schepen met meer en minder zware averij binnengeloopen.

(Key sentence in English: the schooner Transit of this port, Captain Schaap, sank in the North Sea after its stern was knocked away; an ordinary seaman lost his life.)

Spoiler alert! It sank in the North Sea.
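A minimal sketch of how a few of the listed features (ship name, captain's name, time interval) might be computed for one article and one candidate ship. The `link_features` helper and its feature names are hypothetical, and the years are made-up placeholders, since the excerpt gives no year. The project's actual feature extraction is not described on the slides.

```python
import re


def link_features(article_text, article_year, ship):
    """Features for one (article, ship) candidate pair (illustrative set)."""
    words = set(re.findall(r"\w+", article_text.lower()))
    first_year, last_year = ship["active"]
    return {
        "name_mentioned": ship["name"].lower() in words,
        "captain_mentioned": ship["captain"].lower() in words,
        "in_active_interval": first_year <= article_year <= last_year,
    }


# Fragment of the OCR text from the example.
snippet = ("dat het hier behoorende schoonerschip Transit, kapitein Schaap, "
           "in de Noordzee is gezonken")

# Name and captain come from the snippet; the years are placeholders.
ship = {"name": "Transit", "captain": "Schaap", "active": (1870, 1880)}

features = link_features(snippet, 1875, ship)
print(features)
```

Exact word matching is the simplest choice here. OCR noise like that in the excerpt would call for fuzzy matching instead.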

Provenance (PROV-O)

• Individual named graphs have provenance information
– Who made it (people/software?)
– Based on what source
– Content confidence
• Matches historical science requirements
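The three provenance questions can be pictured as one metadata record per named graph. In this sketch, prov:wasAttributedTo and prov:wasDerivedFrom are real PROV-O properties, but the "confidence" key, the graph and agent names, the values and the `trusted_graphs` helper are all hypothetical.

```python
# Provenance per named graph; all names and values are illustrative.
graph_provenance = {
    "graph:mdb-raw": {
        "prov:wasAttributedTo": "agent:conversion-script",  # who made it
        "prov:wasDerivedFrom": "source:mdb-database",       # based on what
        "confidence": 1.0,                                  # content confidence
    },
    "graph:mdb-ship-links": {
        "prov:wasAttributedTo": "agent:ml-ship-matcher",
        "prov:wasDerivedFrom": "graph:mdb-raw",
        "confidence": 0.8,
    },
}


def trusted_graphs(provenance, min_confidence):
    """Names of the graphs whose confidence meets the threshold."""
    return sorted(name for name, meta in provenance.items()
                  if meta["confidence"] >= min_confidence)


strict = trusted_graphs(graph_provenance, 0.9)
lenient = trusted_graphs(graph_provenance, 0.5)
print(strict, lenient)
```

A researcher can then decide per question how much automatically derived material to admit.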

ClioPatria Triplestore

• Data live at Huygens Institute for Dutch History
– http://dutchshipsandsailors.nl/data
– ~30 Million triples
• Dev. Server
– http://semanticweb.cs.vu.nl/dss
• Purl.org URIs redirect to live server w/ content negotiation
• SPARQL endpoint
• Web interface

[Figure: the DSS linked data cloud. Datasets and vocabularies: DAS, GZMVOC, MDB, VOCOPVBegunstigden, VOCOPVSoldijboeken, VOCOPVOpvarenden, PROV, AAT, foaf. Link types: owl:sameAs, dss:hasKBLink, rdfs:subClassOf, rdfs:subPropertyOf, dss:DAS link, skos:exactMatch.]

Data analysis and visualisation

Current work: linking original scans

Take home

• Linked Data principles are a great fit to digital history requirements
– Heterogeneous models/datasets, light-weight reusable integration
– Multiple levels of normalisation, through separate named graphs
– SW Provenance matches Historical Provenance

• Watch out when you sail your Schooner into the North Sea

DataLab

http://dutchshipsandsailors.nl/data

[email protected]