CLARINO WP2 National Registry and Long-Term Archiving

CLARINO WP2 National Registry and Long-Term Archiving Freddy Wetjen and Oddrun Pauline Ohren National Library of Norway Bergen, 12. September 2013


Transcript of CLARINO WP2 National Registry and Long-Term Archiving

Page 1: CLARINO WP2 National Registry and Long-Term Archiving

CLARINO WP2 National Registry and Long-Term Archiving

Freddy Wetjen and Oddrun Pauline Ohren National Library of Norway

Bergen, 12. September 2013

Page 2: CLARINO WP2 National Registry and Long-Term Archiving

National Registry of metadata
• Goal
  – Joint metadata registry of resources in all CLARINO centres
    • Harvest data from all CLARINO centres
    • Exchange data with other national CLARIN centres
• Status – current situation
• On-going and planned activities

Page 3: CLARINO WP2 National Registry and Long-Term Archiving

National Registry of metadata: Status (1)
• Metadata registry version 1 is running
  – Search/browse, editing and management, but no harvesting facilities
  – Infrastructure:
    • META-SHARE infrastructure 3.0
      – http://metashare.nb.no/, proxied by the managing node http://metashare.tilde.com/
      – Metadata complying with META-SHARE metadata format 3.0
      – No harvesting facilities
  – Metadata content:
    • 71 resources
  – Usage:
    • 11.9.2013: 37 of the resources downloaded 1-17 times
      – Norwegian Wordnet (Bokmål) at the top
      – Topmost downloading locations: Norway, Germany, Greece, Sweden

Page 4: CLARINO WP2 National Registry and Long-Term Archiving
Page 5: CLARINO WP2 National Registry and Long-Term Archiving
Page 6: CLARINO WP2 National Registry and Long-Term Archiving
Page 7: CLARINO WP2 National Registry and Long-Term Archiving

National Registry of metadata: Status (2)
• Decision made: Migrate to CMDI (CLARIN platform)
  – Uncertain future for META-SHARE
    • 2 years guaranteed life span
  – Need for more adaptability and expressivity in metadata model
  – Increased involvement with the CLARIN community

Page 8: CLARINO WP2 National Registry and Long-Term Archiving

National Registry of metadata: Planned activities
• Build a basic CMDI infrastructure
  – Repository, editor, search service, PID scheme, harvesting
• Convert metadata from META-SHARE to CMDI
  – Use META-SHARE profile as specified in Component Registry
• Extend/adapt metadata model according to need
  – In collaboration with the other CLARINO centres
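The planned conversion from META-SHARE to CMDI could look roughly like the sketch below, which wraps a flat metadata record in a minimal CMDI 1.1 envelope (Header, Resources, Components). The field names, the component name `resourceInfo` and the profile ID are placeholders for illustration, not the actual identifiers of the META-SHARE profile in the Component Registry, and a real converter would need to follow that profile's full component structure.

```python
import xml.etree.ElementTree as ET

CMD_NS = "http://www.clarin.eu/cmd/"  # CMDI 1.1 namespace


def to_cmdi(record, profile_id, component_name):
    """Wrap a flat dict of metadata fields in a minimal CMDI 1.1 envelope."""
    ET.register_namespace("", CMD_NS)  # serialize with a default namespace

    def q(tag):
        return f"{{{CMD_NS}}}{tag}"

    root = ET.Element(q("CMD"), {"CMDVersion": "1.1"})
    header = ET.SubElement(root, q("Header"))
    ET.SubElement(header, q("MdProfile")).text = profile_id

    # The envelope expects these three (possibly empty) lists.
    resources = ET.SubElement(root, q("Resources"))
    for name in ("ResourceProxyList", "JournalFileProxyList", "ResourceRelationList"):
        ET.SubElement(resources, q(name))

    # Profile-specific payload: one child element per metadata field.
    components = ET.SubElement(root, q("Components"))
    payload = ET.SubElement(components, q(component_name))
    for key, value in record.items():
        ET.SubElement(payload, q(key)).text = value
    return root


# Hypothetical record and placeholder profile ID, for illustration only.
record = {"resourceName": "Norwegian Wordnet (Bokmål)", "languageId": "nb"}
doc = to_cmdi(record, "clarin.eu:cr1:p_0000000000000", "resourceInfo")
xml_text = ET.tostring(doc, encoding="unicode")
```

Keeping the envelope separate from the payload means the same function can be reused if the metadata model is later extended or adapted.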

Page 9: CLARINO WP2 National Registry and Long-Term Archiving

CMDI Metadata framework
[Diagram. Adaptation of Broeder, D. A Data Category Registry- and Component-based Metadata Framework. LREC 2010.]
Labels recovered from the diagram:
• Search Service; Joint Metadata Repository
• TextLab; EDD; Bergen Centre; LAP; Språkbanken; Other centre…
• Relation Registry; ISOcat Concept Registry; Other trusted concept registries; CLARIN Component Registry
• META-SHARE components, a.o.; <xxxx><yyyy><zz><xxxx>; «My profile»
• Component editor; Metadata editor
• Metadata modeler; Metadata creator; User
• Definitions of concepts used in metadata components
• Infrastructure provided by CLARIN centrally

Page 10: CLARINO WP2 National Registry and Long-Term Archiving

National Registry of metadata: Services
[Diagram. Labels recovered:]
• Repository; CMDI; «Our profiles»
• Metadata editor (Arbil..?); Metadata creator
• OAI-PMH harvesting
• Search services
• CLARIN common infrastructure: Weblicht, VLO, FCS?
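As an illustration of the OAI-PMH harvesting step, the sketch below builds `ListRecords` request URLs and extracts record identifiers and the resumption token from a response page. The base URL and the `cmdi` metadata prefix are assumptions (each endpoint advertises its own prefixes via `ListMetadataFormats`), and the sample response is hand-written rather than taken from any CLARINO centre.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="cmdi", resumption_token=None):
    """Build a ListRecords URL; a resumption token replaces all other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None) for one page."""
    root = ET.fromstring(xml_text)
    ids = [e.text for e in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)]
    token = (root.findtext(".//oai:resumptionToken", namespaces=OAI_NS) or "").strip()
    return ids, (token or None)


# Hand-written sample page (hypothetical identifiers and token).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

ids, token = parse_page(SAMPLE)
next_url = list_records_url("https://example.org/oai", resumption_token=token)
```

A harvester would loop, fetching `next_url` and parsing each page, until `parse_page` returns no token.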

Page 11: CLARINO WP2 National Registry and Long-Term Archiving

Long term archiving
[Diagram. Labels recovered:]
• Data Repository
• Metadata editor
• Resources
• Data Delivery client
• Processing and adaptation for long term storage (checksum, PID, metadata etc.)
• NB long term storage (preservation)
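The processing step before long-term storage could be sketched as follows: compute a checksum for each file and record it together with the PID in a small manifest. The choice of SHA-256, the manifest layout and the function names are assumptions for illustration; the slides do not specify how the National Library's preservation pipeline does this.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(pid, data_file, metadata_file):
    """Describe a deposit: its PID plus a checksum for the data and metadata files."""
    return {
        "pid": pid,  # persistent identifier assigned elsewhere (placeholder here)
        "files": [
            {"name": Path(p).name, "role": role, "sha256": sha256_of(p)}
            for p, role in ((data_file, "data"), (metadata_file, "metadata"))
        ],
    }
```

Storing the checksum alongside the PID lets the archive verify later that both data and metadata are unchanged.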

Page 12: CLARINO WP2 National Registry and Long-Term Archiving

Time perspective
• Metadata registry version 2: Early 2014
  – Basic CMDI infrastructure
    • Existing metadata converted from META-SHARE
    • OAI-PMH endpoint, but no harvesting from other centres
• Metadata registry version 3: Mid 2015
  – Extended/adapted metadata model
  – Harvesting from other CLARINO centres
• Long term archiving: Mid 2014, with both data and metadata

Page 13: CLARINO WP2 National Registry and Long-Term Archiving

CLARINO WP2 National Registry and Long-Term Archiving

Freddy Wetjen and Oddrun Pauline Ohren National Library of Norway

Bergen, 12. September 2013