Cultural Linked Data: Some preliminary results of the Linked Heritage project EVA Moscow Conference...

Cultural Linked Data: Some preliminary results of the Linked Heritage project EVA Moscow Conference November 2011 Gordon McKenna International Development Manager Collections Trust, UK Regine Stein Head of Information Technology, Deutsches Dokumentationszentrum für Kunstgeschichte – Bildarchiv Foto Marburg, Germany

Page 1:

Cultural Linked Data: Some preliminary results of the Linked Heritage project

EVA Moscow Conference

November 2011

Gordon McKenna

International Development Manager

Collections Trust, UK

Regine Stein

Head of Information Technology, Deutsches Dokumentationszentrum für Kunstgeschichte – Bildarchiv Foto Marburg, Germany

Page 2:

Context – The Linked Heritage Project

http://www.linkedheritage.org

Page 3:

Project Overview

Basic information:

• Length – 30 months;

• Partners – 38+;

• Budget – €3.85m (80% from EC ICT-PSP Programme);

• Background – Successor to ATHENA (Minerva & MICHAEL)

Objectives:

• To contribute large quantities of new content to Europeana, from both the public and private sectors;

• To demonstrate enhancement of quality of content, in terms of metadata richness, re-use potential and uniqueness;

• To enable improved search, retrieval and use of Europeana content.

Work packages:

• WP 1 Project Management and Coordination (114 person months)

• WP 2 Linking Cultural Heritage Information (53 pm)

• WP 3 Terminology (73 pm)

• WP 4 Public Private Partnership (57 pm)

• WP 5 Technical Integration (38 pm)

• WP 6 Coordination of Content (238 pm)

• WP 7 Dissemination & Training (116 pm)

Page 4:

WP 2 – Selected Overview

Objectives: • To explore the state of the art in linked data;

• To identify appropriate models, processes and technologies for the deployment of linked data;

• To consider how linked data practices can be applied to cultural heritage;

• To explore the state of the art in persistent identifiers.

Tasks and Deliverables:

• T2.1 – Exploring cultural heritage information best practice

o D2.1 – Best practice report on cultural heritage linked data and metadata standards

• T2.2 – Resource identification [PIDs]

o D2.2 – State of the art report on persistent identifier standards and management tools

Page 5:

Project Methodology

1. Carry out research – What exists, survey

2. Make an analysis – Look for patterns and trends.

3. Give simple advice – practical and implementable

4. Reuse or create tools – Easy to use, audience relevant, adaptable, open licence (e.g. multilingual versions possible)

5. Identify further needs – Leading to further work

Page 6:

Partner Survey

Page 7:

Survey Method and Structure

• Aimed at partners in Linked Heritage

• Data collection – Online SurveyMonkey (supported by an RTF document)

• Sections:

1. Participant information

2. Metadata standards and use

3. Linked data use and Europeana agreement

Page 8:

Participant Type

• Museum – 4

• Library – 5

• Archive – 4

• Sound archive – 1

• Aggregator – 10

• Other – 23

• Total – 47

Page 9:

Familiar with the Linked Data Concept?

• Yes: 29 (74.4%)

• No: 10 (25.6%)

Page 10:

Used Linked Data?

• Yes: 6 (15.4%)

• No: 33 (84.6%)

• Details:

o 4 – DBpedia;

o 3 – GeoNames;

o 1 – Freebase;

o 1 – IPTC;

o 1 – SKOS;

o 1 – [in-house]

Page 11:

Published Linked Data?

• Yes: 4 (10.3%)

• No: 35 (89.7%)

• Details:

o http://data.kunstkamera.ru/sparql

o http://data.kunstkamera.ru

o http://nektar.oszk.hu/wiki/Semantic_web

o Thesaurus in SKOS

Page 12:

Know of Linked Data Projects?

• Yes: 15 (38.5%)

• No: 24 (61.5%)

• Activity in:

o France

o Germany

o Israel

o Italy

o Russia

o Spain

o Sweden

o United Kingdom
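The yes/no shares on the four survey slides above can be recomputed from the reported counts. The following is a minimal sketch, assuming that the same 39 respondents answered each question (the counts on every slide sum to 39); the function name `shares` is illustrative only.

```python
# Counts (yes, no) as reported on the four yes/no survey slides.
survey_counts = {
    "Familiar with the Linked Data concept?": (29, 10),
    "Used Linked Data?": (6, 33),
    "Published Linked Data?": (4, 35),
    "Know of Linked Data projects?": (15, 24),
}


def shares(yes: int, no: int) -> tuple[float, float]:
    """Return (yes %, no %) of all answers, rounded to one decimal place."""
    total = yes + no
    return round(100 * yes / total, 1), round(100 * no / total, 1)


for question, (yes, no) in survey_counts.items():
    yes_pct, no_pct = shares(yes, no)
    print(f"{question} yes {yes_pct}% / no {no_pct}% (n={yes + no})")
```

Because both shares are taken over the same total, each pair should add up to 100% within rounding, which gives a quick consistency check on the reported figures.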

Page 13:

Europeana Agreement Questions

• Europeana's new licence requires that providers agree to have the metadata that they provide to Europeana published as Linked Open Data. This means that any 3rd party use, including commercial, is permitted. Does your organisation agree to this?

• Please explain your answer.

Page 14:

Europeana Licence Agreement?

• Yes: 30.6% – Why?

o [No explanation];

o Publishing on Web means Open Data;

o Participated in the ATHENA project;

o Metadata provided to Europeana specifically selected for Open Linked Data

• No: 16.7% – Why?

o Against 3rd party commercial use;

o National policy does not allow commercial use;

o Do not contribute to Europeana;

o [No explanation]

• Not sure: 52.8% – Why?

o Under discussion;

o Metadata not ours (our providers' decision);

o Under discussion (possible legal obstacles);

o Decision not ours (made at a higher level);

o Will provide minimal data;

o Against commercial reuse

Page 15:

Conclusions

• A market for basic information and guidance;

• Significant concerns in cultural organisations about publishing completely open data.

Page 16:

Research into the Linked Open Data Cloud

Page 17:

Linked Data Principles

Tim Berners-Lee 2006 – http://www.w3.org/DesignIssues/LinkedData.html

1. Use URIs as names for things;

2. Use HTTP URIs so that people can look up those names;

3. When someone looks up a URI, provide useful RDF information;

4. Include RDF statements that link to other URIs so that they can discover related things.
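Principles 2 and 3 are commonly put into practice through HTTP content negotiation: the client looks up the URI and asks for an RDF representation instead of an HTML page. The sketch below only prepares such a request and does not send it; the URI is a hypothetical example, and `build_rdf_request` is an illustrative helper, not part of any project tool.

```python
from urllib.request import Request


def build_rdf_request(uri: str) -> Request:
    """Prepare (but do not send) an HTTP GET that asks for RDF/XML."""
    return Request(uri, headers={"Accept": "application/rdf+xml"})


# Hypothetical identifier for an artwork; not a real resource.
request = build_rdf_request("http://example.org/id/artwork/42")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the lookup. Whether RDF comes back depends on how the publisher has configured the server.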

Page 18:

Linked Data – simple rules

• The URI identifies an entity – this can be an artwork, a person, a place, a concept etc.

• If two people create data using the same URI then they are describing the same entity.

• That makes it easy to merge data from different sources – not only in one single database or one portal, but "web-wide".

• This actually means making the web – which currently is a global, universal information space for documents – into a global database.
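The merging rule above can be illustrated with plain sets of triples. This is a minimal sketch under stated assumptions: the artwork and person URIs and the title are hypothetical, the two "sources" are invented for illustration, and only the Dublin Core property URIs are real.

```python
# Triples are (subject, predicate, object) tuples.
ARTWORK = "http://example.org/id/artwork/42"  # hypothetical URI
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

# Two independent sources describe the same entity by using the same URI.
museum_data = {
    (ARTWORK, DC_TITLE, "Example title"),
}
library_data = {
    (ARTWORK, DC_CREATOR, "http://example.org/id/person/7"),
    (ARTWORK, DC_TITLE, "Example title"),  # same statement, made twice
}

# Merging is a set union; identical statements collapse into one.
merged = museum_data | library_data


def describe(graph, subject):
    """Return all (predicate, object) pairs known about one subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


for predicate, obj in describe(merged, ARTWORK):
    print(predicate, "->", obj)
```

No record matching step is needed here because the shared URI already states that both sources mean the same entity.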

Page 19:

Linked Open Data Cloud, May 2007

http://linkeddata.org

12 data packages

Page 20:

Linked Open Data Cloud, March 2009

89 data packages

Page 21:

Linked Open Data Cloud, September 2011

295 data packages

Page 22:

The Data Hub

http://thedatahub.org

• Part of CKAN (Comprehensive Knowledge Archive Network)

• Registry of open [and not open] knowledge

• Packages: > 2,300 packages in total, ~ 300 of them in the LOD cloud

• Projects (and a few closed ones).

Page 23:

Is the LOD Cloud Open?

'Open' = commercial use

311 packages:

• Yes 42.6%

• No 57.4%

c. 38 billion triples:

• Yes 30.9%

• No 69.1%
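The percentages above can be turned into approximate absolute figures. This is a rough sketch, assuming the shares apply to the stated totals of 311 packages and about 38 billion triples; since the triple total is itself approximate, so is the result.

```python
packages_total = 311
triples_total_billion = 38  # "c. 38 billion" on the slide, so approximate

open_package_share = 42.6  # % of packages that allow commercial use
open_triple_share = 30.9   # % of triples that allow commercial use

open_packages = round(packages_total * open_package_share / 100)
open_triples_billion = round(triples_total_billion * open_triple_share / 100, 1)

print(f"About {open_packages} of {packages_total} packages are open")
print(f"About {open_triples_billion} of {triples_total_billion} billion triples are open")
```

The share of open triples is lower than the share of open packages, so on these figures the open packages are on average smaller than the packages that are not open.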

Page 24:

Open Licences Used

Licence – Packages – Triples

• CC BY – 28.8% – 45.8%

• CC BY-SA – 18.2% – 10.2%

• PDDL – 10.6% – 0.2%

• CC0 – 9.1% – 2.9%

• UK Crown Copyright with data.gov.uk rights – 7.6% – 27.4%

• Other (Public Domain) – 6.8% – 7.0%

• Other (Open) – 5.3% – 5.0%

• Other (Attribution) – 3.0% – 0.4%

• UK Open Government Licence (OGL) – 3.0% – 0.1%

• GNU FDL – 3.0% – <0.1%

• ODbL – 2.3% – 0.9%

• GNU GPL – 0.8% – <0.1%

• New and Simplified BSD licences – 0.8% – 0.1%

Page 25:

Not Open Licences Used (or Not)

Licence – Packages – Triples

• [not given] – 69.1% – 89.4%

• None – 14.6% – 0.3%

• CC BY-NC – 7.3% – 5.8%

• Other (Not Open) – 6.7% – <0.1%

• CC BY – 1.1% – 0.6%

• Other (Non-Commercial) – 0.6% – 3.9%

• CC BY-SA – 0.6% – <0.1%

Page 26:

Number of triples per package

• > 1 b – 2.9%

• > 500 m – 1.9%

• > 100 m – 6.1%

• > 50 m – 5.8%

• > 10 m – 14.8%

• > 5 m – 6.1%

• > 1 m – 15.8%

• > 0.5 m – 7.4%

• > 0.1 m – 14.5%

• < 0.1 m – 24.4%

Page 27:

Top Packages Linked To By Packages

Package – Packages – Links (million)

1. DBpedia – 158 – 31.53

2. GeoNames Semantic Web – 38 – 9.35

3. [none] – 34 – 0

4. DBLP Computer Science Bibliography (RKBExplorer) – 27 – 1.34

5. Association for Computing Machinery (ACM) (RKBExplorer) – 26 – 1.49

6. ePrints3 Institutional Archive Collection (RKBExplorer) – 26 – 0.28

7. Freebase – 25 – 10.45

8. CiteSeer (Research Index) (RKBExplorer) – 24 – 0.80

9. School of Electronics and Computer Science, University of Southampton (RKBExplorer) – 24 – 0.04

10. ReSIST Project Wiki (RKBExplorer) – 24 – <0.01 [408]

Page 28:

Cultural Packages in the Cloud

Package – Triples (million)

• VIAF: The Virtual International Authority File – 200.0

• Europeana Linked Open Data – 185.0

• British National Bibliography (BNB) – 80.2

• Hungarian National Library (NSZL) catalog – 19.3

• Amsterdam Museum as Linked Open Data in the Europeana Data Model – 5.0

• Library of Congress Subject Headings – 4.2

• Swedish Open Cultural Heritage Other (Open) – 3.4

• Calames – 2.0

• RAMEAU subject headings (STITCH) – 1.6

• data.bnf.fr - Bibliothèque nationale de France – 1.4

• National Diet Library of Japan subject headings – 1.3

• Gemeenschappelijke Thesaurus Audiovisuele Archieven – 1.0

• Gemeinsame Normdatei (GND) – 0.6

• Archives Hub Linked Data – 0.4

• Thesaurus for Graphic Materials (t4gm.info) – 0.1

• Italian Museums (LinkedOpenData.it) – <0.1

• Thesaurus W for Local Archives – <0.1

• MARC Codes List Open Data – <0.1

Page 29:

Cultural Heritage – Licences Used

Open licences (number of packages):

• CC0 – 2

• Other (Public Domain) – 1

• Other (Open) – 1

• ODbL – 1

Not open licences (number of packages):

• [not given] – 9

• CC BY-SA – 3

• Other (non-commercial) – 1

Page 30:

Europeana in the LOD

3.5 million object descriptions = 185 m triples

Currently contains < 620,000 links to other packages
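These figures imply a rough density of triples and outgoing links per object description. The sketch below is back-of-envelope arithmetic on the numbers from the slide only; the link count is an upper bound, so the links-per-object value is too.

```python
object_descriptions = 3_500_000
triples = 185_000_000
outgoing_links_upper_bound = 620_000  # slide gives "< 620,000"

triples_per_object = triples / object_descriptions
links_per_object_upper_bound = outgoing_links_upper_bound / object_descriptions

print(f"About {triples_per_object:.1f} triples per object description")
print(f"Fewer than {links_per_object_upper_bound:.2f} links to other packages per object")
```

On these numbers, fewer than one object description in five carries a link to another package.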

Page 31:

Europeana examples: Amsterdam Museum

Page 32:

Europeana examples: Amsterdam Museum

Page 33:

Europeana examples

Hack4Europe Award "Most Innovative Application": Time Mash. Based on your current geographical location, it searches Europeana for historical views of the same place and interesting objects in the vicinity.

Page 34:

Conclusions

Open Data – Licensing?

• Must have one. Make a decision before publishing;

• What kind of licence can you give (CC usable?)?

• What kind of 3rd party use do you want to allow?

Linkable Data – Publishing?

• Use Persistent Identifiers;

• Select 'standard' data formats;

• Carefully choose what you are publishing

Linking Data – Which package(s) do you link to?

• Trusted source?

• Presence of PIDs and maintained resource?

Linked Culture Cloud – shared resource?

• Sub-set of the LOD Cloud / CKAN;

• Information relevant for cultural institutions

• Feed into general LOD Cloud
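The first two conclusions (an explicit licence, persistent identifiers and a standard format) can be combined in one small record. The following is a minimal sketch, not a project tool: it writes triples in N-Triples syntax, the identifier and title are hypothetical, `to_ntriples` is an illustrative helper, and treating every object that starts with http:// or https:// as a URI is a simplifying assumption. The Dublin Core property URIs and the CC0 URI are real.

```python
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
DCTERMS_LICENSE = "http://purl.org/dc/terms/license"
CC0 = "http://creativecommons.org/publicdomain/zero/1.0/"


def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples as N-Triples lines.

    Simplification: an object starting with http:// or https:// is written
    as a URI; anything else becomes a plain string literal.
    """
    lines = []
    for subject, predicate, obj in triples:
        if obj.startswith(("http://", "https://")):
            obj_term = f"<{obj}>"
        else:
            # Escape backslashes, double quotes and newlines in literals.
            escaped = (obj.replace("\\", "\\\\")
                          .replace('"', '\\"')
                          .replace("\n", "\\n"))
            obj_term = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {obj_term} .")
    return "\n".join(lines)


PID = "http://example.org/id/artwork/42"  # hypothetical persistent identifier
record = [
    (PID, DC_TITLE, "Example title"),
    (PID, DCTERMS_LICENSE, CC0),  # the licence travels with the data
]
print(to_ntriples(record))
```

Stating the licence as a triple on the record itself means that a third party who harvests the data also receives the terms of use.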

Page 35:

Thank you

Gordon McKenna – Regine Stein
