Europeana in a research context
Alastair Dunning, @alastairdunning
The European Library / Europeana Foundation
Mining Digital Repositories Conference
National Library of the Netherlands, April 2014
Oxyrhynchus Papyrus No. 20
Fragment of Homer's The Iliad, 2nd century Common Era
British Library, London
Image via Wikimedia Commons
Metadata record for Oxyrhynchus Papyrus No. 20 (numbered 742) from British Library, London
Metadata record for Oxyrhynchus Papyrus No. 20 from University of Oxford
Tool for transcribing Oxyrhynchus Papyrus from Ancient Lives project
Tool for measuring Oxyrhynchus Papyrus from Ancient Lives project
Discussion Forum Oxyrhynchus Papyri from Ancient Lives project
Images
Hi-res TIFF – British Library
JPEG – University of Oxford
Full-text
Multiple transcriptions – Ancient Lives Project
Metadata
Version 1 – British Library
Version 2 – University of Oxford
Version 3 – Ancient Lives
Unstructured Commentary
Ancient Lives Project
The richness of this information is created
by many parties, but it sits in different
places, took many projects to make
happen and still is not fully connected.
Europeana as a data brain, helping connect
disparate datasets
Rather than Europeana as end-user facing portal
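One way to picture the "data brain" idea is a small index that groups resources held by different institutions under one shared object identifier. The sketch below is illustrative only: the `Resource` class, the `link_resources` function and the identifier `oxyrhynchus-papyrus-20` are assumptions made for this example and are not part of any Europeana API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    object_id: str   # hypothetical shared identifier for the physical object
    kind: str        # e.g. "image", "full-text", "metadata", "commentary"
    provider: str    # institution or project holding the resource


def link_resources(resources):
    """Group resources by object, then by kind, listing the providers."""
    index = defaultdict(lambda: defaultdict(list))
    for r in resources:
        index[r.object_id][r.kind].append(r.provider)
    # Convert to plain dicts so the result is easy to print or serialise.
    return {obj: dict(kinds) for obj, kinds in index.items()}


resources = [
    Resource("oxyrhynchus-papyrus-20", "image", "British Library"),
    Resource("oxyrhynchus-papyrus-20", "image", "University of Oxford"),
    Resource("oxyrhynchus-papyrus-20", "full-text", "Ancient Lives"),
    Resource("oxyrhynchus-papyrus-20", "metadata", "British Library"),
    Resource("oxyrhynchus-papyrus-20", "metadata", "University of Oxford"),
    Resource("oxyrhynchus-papyrus-20", "metadata", "Ancient Lives"),
    Resource("oxyrhynchus-papyrus-20", "commentary", "Ancient Lives"),
]

linked = link_resources(resources)
```

Looking up `linked["oxyrhynchus-papyrus-20"]` then gives every known image, transcription, metadata version and commentary for the papyrus in one place, whichever party created it.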
2005 A letter to the European Commission from 6 Heads of State (from France, Poland, Germany, Italy, Spain and Hungary) suggests the creation of a European
digital library.
2007 The European Digital Library Network - EDLnet - begins to create a prototype, funded by i2010.
2008 Europeana's prototype is launched on 20 November.
2009 Europeana's collection reaches 5 million items.
2010 A European Parliament report, unanimously approved in February, asks for more content and funding for Europeana.
2012 Europeana releases all metadata under a CC0 waiver, making it freely available for re-use. Europeana’s collection reaches 25 million items.
2013 Europeana continues to further its position as a catalyst for innovation and digital enterprise in support of the Digital Agenda of Europe - one of the pillars of the
EU’s Europe 2020 strategy
How does Europeana get its content?
Through its aggregation structure, Europeana represents 2,300 organisations across Europe
From 150 Aggregators
• Promoting national aggregation structures
• More efficient than working with every individual content provider
• Helps to achieve international standardisation
End-user generated content
• Crowd-sourcing projects such as Europeana 1914-1918 and Europeana 1989
Who submits data to Europeana?
Domain Aggregators
• Audiovisual collections, e.g. EUScreen, European Film Gateway
• Libraries, e.g. The European Library
• Archives, e.g. APEX
• Thematic collections, e.g. Judaica Europeana, Europeana Fashion
National initiatives
• National Aggregators, e.g. Culture Grid, Culture.fr
• Regional Aggregators, e.g. Musées Lausannois
The evolution is to Europeana Cloud: an infrastructure for aggregators and data providers.
This would allow members of Europeana Cloud to:
1. Upload metadata
2. Define who can use that metadata and in what ways (download, annotate, delete)
3. Give third parties access via APIs
4. Share content too (this capability is also feasible)
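The first three capabilities can be sketched as a small in-memory model. This is a minimal illustration under stated assumptions, not the Europeana Cloud design: `MetadataRecord`, `CloudStore`, the party names and the rule that an owner can always act on its own record are all hypothetical.

```python
from dataclasses import dataclass, field

ACTIONS = {"download", "annotate", "delete"}  # the actions named in the list above


@dataclass
class MetadataRecord:
    record_id: str
    owner: str
    data: dict
    # Maps a party (user or organisation) to the set of actions it may perform.
    grants: dict = field(default_factory=dict)

    def grant(self, party, *actions):
        unknown = set(actions) - ACTIONS
        if unknown:
            raise ValueError(f"unknown actions: {sorted(unknown)}")
        self.grants.setdefault(party, set()).update(actions)

    def allows(self, party, action):
        # Assumption: the owner may always act on its own record.
        return party == self.owner or action in self.grants.get(party, set())


class CloudStore:
    """In-memory stand-in for a shared metadata store (hypothetical)."""

    def __init__(self):
        self._records = {}

    def upload(self, record):
        self._records[record.record_id] = record

    def download(self, party, record_id):
        record = self._records[record_id]
        if not record.allows(party, "download"):
            raise PermissionError(f"{party} may not download {record_id}")
        return dict(record.data)  # return a copy so callers cannot edit the store

    def delete(self, party, record_id):
        record = self._records[record_id]
        if not record.allows(party, "delete"):
            raise PermissionError(f"{party} may not delete {record_id}")
        del self._records[record_id]


store = CloudStore()
rec = MetadataRecord("rec-1", "provider-a", {"title": "Example title"})
rec.grant("research-tool", "download", "annotate")
store.upload(rec)
title = store.download("research-tool", "rec-1")["title"]
```

Here the third party `research-tool` can read and annotate the record but would receive a `PermissionError` if it tried to delete it, because the data provider did not grant that action.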
Development of Europeana as platform, not portal
Cloud infrastructure sits at the heart of this
“Portals are for visiting, platforms are for building on”
Europeana Labs as a platform for the
creative industries
Europeana Research as a platform for humanities,
social sciences
Europeana Research will not be a single discovery portal;
however, it will offer researchers access to APIs and
to downloadable ‘raw data’ stored in Europeana Cloud
Third parties can build their own specific tools using these APIs or
downloadable data
Europeana Research will give access to data that can only
be used in a non-commercial context
Europeana Research will have open APIs to allow
bidirectional access (read, write) to metadata in Europeana Cloud
(dependent on permissions)
Pilot Study 1:
Tool to search through Europeana (and other content) related to philosophy of logic http://greenlearningnetwork.com/axiom/
Pilot Study 2:
Musicologists’ tool to annotate early music manuscripts from disparate sources (Work in Progress)
Other Possibilities
Service and end-user tool to allow for transcription of multiple documents aggregated from multiple sources
Service to allow for extraction of geographic or other terms from aggregation of services
Aggregation of text documents for download for text mining
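As a rough illustration of the term-extraction idea, the sketch below counts place names across texts drawn from several sources by matching them against a small gazetteer. The gazetteer, the sample texts and the function names are invented for this example; a real service would more likely rely on a large authority file or a named-entity recogniser.

```python
import re
from collections import Counter

# Hypothetical gazetteer; a real service would use a much larger authority list.
GAZETTEER = {"london", "oxford", "the hague"}


def extract_places(text, gazetteer=GAZETTEER):
    """Count occurrences of gazetteer entries in text, case-insensitively."""
    counts = Counter()
    lowered = text.lower()
    for place in gazetteer:
        # \b stops "oxford" from matching inside a longer word.
        pattern = r"\b" + re.escape(place) + r"\b"
        hits = len(re.findall(pattern, lowered))
        if hits:
            counts[place] += hits
    return counts


def extract_from_sources(documents):
    """documents maps a source name to its text; returns combined counts."""
    total = Counter()
    for text in documents.values():
        total.update(extract_places(text))
    return total


documents = {
    "source-a": "A papyrus held in London was later studied in Oxford.",
    "source-b": "Scholars in Oxford compared it with a copy from The Hague.",
}
places = extract_from_sources(documents)
```

Because the counts are combined across sources, a tool built this way could show which places recur in an aggregated corpus regardless of which institution supplied each text.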
Text Mining Opportunities
Aggregation of corpora of primary sources, with harmonized licensing
Versioning corpora of primary sources
Enrichment of corpora via third-party tools
Brokerage of in-copyright material for non-commercial usage? (Primary and secondary sources)
Ability to upload algorithms / software?
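One possible way to version a corpus, so that a text-mining result can cite exactly the data it was run on, is to derive an identifier from the corpus contents. The sketch below uses a SHA-256 content hash; this technique and the `corpus_version` function are assumptions for illustration and are not something the slides specify.

```python
import hashlib


def corpus_version(documents):
    """Return a short identifier derived from the corpus contents.

    documents maps a document id to its text.
    """
    digest = hashlib.sha256()
    # Sort by id so the identifier does not depend on insertion order.
    for doc_id in sorted(documents):
        digest.update(doc_id.encode("utf-8"))
        digest.update(b"\0")  # separator between id and text
        digest.update(documents[doc_id].encode("utf-8"))
        digest.update(b"\0")  # separator between documents
    return digest.hexdigest()[:12]


v1 = corpus_version({"doc-1": "first transcription", "doc-2": "second text"})
v1_reordered = corpus_version({"doc-2": "second text", "doc-1": "first transcription"})
v2 = corpus_version({"doc-1": "corrected transcription", "doc-2": "second text"})
```

The same documents always yield the same identifier, while any corrected transcription produces a new one, which makes it clear when an enriched or amended corpus differs from the one used in an earlier analysis.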
Cons
Lack of maturity in research community in building APIs
Time taken for tool development
Quality and extent of underlying data still essential
Still needs engagement with research communities / tool builders
Pros
Europeana leverages its aggregation network to provide single access point to data
Tools can be built to help answer researchers' specific questions
Responsibility for sustainability and outreach is distributed
Works very well for time-limited projects
Europeana licensing framework provides standards for access to data
Thank you
Alastair Dunning, @alastairdunning