Crowdsourcing and Semantic Enrichments for European Cultural Heritage

AIT Austrian Institute of Technology Crowdsourcing and Semantic Enrichments for European Cultural Heritage Sergiu Gordea , Michela Vignoli and Roman Graf CAIRA@KI 2016 Klagenfurt am Wörthersee, 27.09


Page 1: Crowdsourcing and Semantic Enrichments for European Cultural Heritage

AIT Austrian Institute of Technology

Crowdsourcing and Semantic Enrichments for European Cultural Heritage

Sergiu Gordea, Michela Vignoli and Roman Graf

CAIRA@KI 2016 Klagenfurt am Wörthersee, 27.09

Page 2: Crowdsourcing and Semantic Enrichments for European Cultural Heritage

Agenda

• Europeana Digital Service Infrastructure
  • Search and browsing in multilingual Cultural Heritage repository
  • Semantic enrichments in Thematic Collections

• Crowdsourcing semantic enrichments in EU Sounds
  • Vocabulary/Thesauri alignment approach

• Experimental Results
  • Vocabulary alignment for music instruments
  • Using categorization information
  • Using text search

• Conclusions and future work

Europeanasounds.eu

Page 3: Crowdsourcing and Semantic Enrichments for European Cultural Heritage

Europeana Digital Service Infrastructure

The Platform for Europe’s Digital Cultural Heritage

Aggregates metadata:

• From all EU countries

• ~3,500 galleries, libraries, archives and museums

• More than 52M objects

• In about 50 languages

• Huge amount of references to places, agents, concepts, time

Source: [Manguinhas et al. 1]

Page 4: Crowdsourcing and Semantic Enrichments for European Cultural Heritage

Europeana Digital Service Infrastructure

Free text search: piano concerto (of Austrian composers?)

Page 5: Crowdsourcing and Semantic Enrichments for European Cultural Heritage

Europeana Digital Service Infrastructure

Free text search: Klavierkonzert (von österreichische Komponist?) [the same query in German: piano concerto (of Austrian composers?)]

Page 6: Crowdsourcing and Semantic Enrichments for European Cultural Heritage

Semantic search

Query: Piano concerto of Austrian composers

Music instrument: Piano

Music genre: Concerto

Agent: *

Role: Composer

Place: Austria
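The decomposition above can be read as a set of facets, where `*` means "any value". The following is a minimal sketch of that idea, not the Europeana implementation: the `SemanticQuery` class, the facet names, and the sample records are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class SemanticQuery:
    """Facets of a semantic query; None stands for '*' (any value)."""
    instrument: Optional[str] = None
    genre: Optional[str] = None
    agent: Optional[str] = None
    role: Optional[str] = None
    place: Optional[str] = None

    def matches(self, record: dict) -> bool:
        # A record matches when every constrained facet equals the record's value.
        for facet, wanted in asdict(self).items():
            if wanted is not None and record.get(facet) != wanted:
                return False
        return True


# "Piano concerto of Austrian composers": agent is left unconstrained ('*').
query = SemanticQuery(instrument="Piano", genre="Concerto",
                      role="Composer", place="Austria")

# Hypothetical enriched records; titles may be in any language because the
# facets hold language-independent concept labels.
records = [
    {"title": "Klavierkonzert", "instrument": "Piano", "genre": "Concerto",
     "agent": "Composer A", "role": "Composer", "place": "Austria"},
    {"title": "Violin sonata", "instrument": "Violin", "genre": "Sonata",
     "agent": "Composer B", "role": "Composer", "place": "Austria"},
]

hits = [r["title"] for r in records if query.matches(r)]
```

Because matching happens on enriched facets rather than on free text, the German-titled record is found by an English query, which is the behaviour the free text searches on the previous slides cannot guarantee.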


Page 7: Crowdsourcing and Semantic Enrichments for European Cultural Heritage

Semantic search

User Expectations

• Complete search
  • “All piano concertos of all/any Austrian composers”

• User input
  • in preferred (mother) language

• Records in all/any languages
  • Metadata language vs. content language
  • Spoken language vs. “technical languages” (e.g. music notations)

• All content types
  • Text
  • Image
  • Audio
  • Video


Page 8: Crowdsourcing and Semantic Enrichments for European Cultural Heritage

Semantic enrichments

Huge effort
• Automatic processing
• Domain expert knowledge
• User validation

Domain specific
• Thematic collections
• Multilingual vocabularies/thesauri

EuSounds
• Music instruments
• Music genres


Reference: [Manguinhas et al. 2]

Page 9: Crowdsourcing and Semantic Enrichments for European Cultural Heritage

Cultuurlink

Semi-automatic Vocabulary Alignment Tool
• SKOS format

http://cultuurlink.beeldengeluid.nl


Page 10: Crowdsourcing and Semantic Enrichments for European Cultural Heritage

Cultuurlink

Freely available

• as an online open service that any user can use

Users have the ability to design and experiment with different alignment strategies

• helps the task of discovering new alignments between two vocabularies

Manual control

• users can decide which alignments are correct and can assign a specific meaning (e.g. skos:exactMatch, skos:related, skos:broadMatch)
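The manual-control step can be pictured as filtering candidate alignments by a user decision and serialising the accepted ones with the chosen SKOS relation. This is a minimal sketch under stated assumptions: it does not use or reproduce the CultuurLink API, the `example.org` URIs and the `accepted` flag are hypothetical, and only the SKOS namespace and the three relation names from the slide are taken as given.

```python
# SKOS core namespace; the three relation names below are the ones on the slide.
SKOS = "http://www.w3.org/2004/02/skos/core#"
RELATIONS = {"exactMatch", "related", "broadMatch"}


def to_ntriple(source: str, relation: str, target: str) -> str:
    """Serialise one alignment as an N-Triples statement."""
    if relation not in RELATIONS:
        raise ValueError(f"unsupported relation: {relation}")
    return f"<{source}> <{SKOS}{relation}> <{target}> ."


# Hypothetical candidates, each with the reviewer's decision.
candidates = [
    {"source": "http://example.org/local/klavier",
     "target": "http://example.org/target/piano",
     "relation": "exactMatch", "accepted": True},
    {"source": "http://example.org/local/harfe",
     "target": "http://example.org/target/piano",
     "relation": "exactMatch", "accepted": False},  # rejected by the reviewer
    {"source": "http://example.org/local/fluegel",
     "target": "http://example.org/target/piano",
     "relation": "broadMatch", "accepted": True},   # target is the broader concept
]

# Only alignments confirmed by a user are kept.
accepted = [to_ntriple(c["source"], c["relation"], c["target"])
            for c in candidates if c["accepted"]]
```

Keeping the relation type alongside each decision preserves the specific meaning a reviewer assigned, rather than reducing every alignment to a plain yes/no link.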


Page 11: Crowdsourcing and Semantic Enrichments for European Cultural Heritage

Experimental Results

The British Library (BL) participated with 3 collections:

• a selection of Asian instruments (1,099 records) from the “Colin Huehns Asia Collection”

• a selection from the “Peter Cooke Uganda Collection” (1,312 records)

• and the “Keith Summers English Folk Music Collection” (1,326 records)

The Centre de Recherche en Ethnomusicologie (CREM)

• participated with a test collection of 36 records published in the CD “Musical Instruments of the World”

The Maison Méditerranéenne des Sciences de l'Homme (MMSH)

• participated with a collection of 25 records about folk music

The Netherlands Institute of Sound and Vision (NISV)


Reference: [Manguinhas et al. 1]

Page 12: Crowdsourcing and Semantic Enrichments for European Cultural Heritage

Experimental Results

Automatic enrichments using sample EU Sounds datasets (using categorization metadata)

Source: [Manguinhas et al. 1]

Page 13: Crowdsourcing and Semantic Enrichments for European Cultural Heritage

Experimental Results

Austrian National Library Dataset

• 1,396 records of letters and music scores of classical music composers

• References to music instruments are available in title and description only

• Music instrument names available in different languages (German, Italian/Latin, French)

[Figure: bar chart “Music instruments terms”. Labels: Music instruments, Instrument Tags, Instrument Family Tags. Vertical axis: 0 to 800. Data labels: 141, 39, 668, 674.]
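Since instrument references in this dataset appear only in titles and descriptions, tagging by text search amounts to looking up multilingual instrument names in those two fields. The sketch below illustrates one simple way to do that with whole-word matching; it is an assumption for illustration, not the method used in the experiments, and the label table, field names, and sample record are hypothetical.

```python
import re

# Hypothetical multilingual labels per instrument concept; a real run would
# load the labels from the target vocabulary instead.
LABELS = {
    "piano": ["piano", "klavier", "pianoforte"],
    "violin": ["violin", "violine", "violino", "violon"],
    "flute": ["flute", "flöte", "flauto", "flûte"],
}

WORD = re.compile(r"\w+")


def build_index(labels: dict) -> dict:
    """Map every case-folded label to its instrument concept."""
    return {name.casefold(): concept
            for concept, names in labels.items()
            for name in names}


def tag_record(record: dict, index: dict) -> list:
    """Return the sorted instrument concepts named in title or description."""
    text = f"{record.get('title', '')} {record.get('description', '')}"
    tokens = (token.casefold() for token in WORD.findall(text))
    return sorted({index[token] for token in tokens if token in index})


index = build_index(LABELS)

# Hypothetical record with a German title.
record = {"title": "Sonate für Violine und Klavier",
          "description": "Autograph score"}
tags = tag_record(record, index)
```

Whole-word matching keeps precision high but misses compounds such as "Klavierkonzert"; handling those would need decompounding or substring rules, which are left out of this sketch.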


Page 14: Crowdsourcing and Semantic Enrichments for European Cultural Heritage

Conclusions & Future Work

• Semantic enrichments for Europeana
  • Targeting Thematic Collections
  • Infrastructure to support generation and acquisition
  • Europeana Entity Collection

• Preliminary experiments
  • Small scale
  • High precision enrichments

• Future work
  • Validation through crowdsourcing
  • Scalability to the whole Europeana Sounds dataset (300,000+)
  • Music genres tagging


Page 15: Crowdsourcing and Semantic Enrichments for European Cultural Heritage

AIT Austrian Institute of Technology

your ingenious partner

Thank you!

Sergiu Gordea

Page 16: Crowdsourcing and Semantic Enrichments for European Cultural Heritage

References

[Manguinhas et al. 1] Hugo Manguinhas, Valentine Charles, Antoine Isaac, Tom Miles, Aude Lima, Ariane Néroulidis, Véronique Ginouvès, Dimitra Atsidis, Maarten Brinkerink, Michiel Hildebrand, Sergiu Gordea: Linking subject labels in Cultural Heritage Metadata to MIMO vocabulary using CultuurLink, NKOS 2016, Hannover

[Manguinhas et al. 2] Hugo Manguinhas, Sergiu Gordea, Antoine Isaac, Alessio Piccioli, Giulio Andreini, Francesca Di Donato, Remy Gardien, Maarten Brinkerink: Challenges on modeling annotations in the Europeana Sounds project, iAnnotate 2016, Berlin