Persistent Identifiers in EUDAT services


Transcript of Persistent Identifiers in EUDAT services

Page 1: Persistent Identifiers in EUDAT services

www.eudat.eu

EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065

Persistent Identifiers in EUDAT services

PIDs in EUDAT

This work is licensed under the Creative Commons CC-BY 4.0 licence. Attribution: EUDAT – www.eudat.eu

Page 2: Persistent Identifiers in EUDAT services

The EUDAT data domain handles registered data.

Each digital object should have a persistent identifier.

This persistent identifier is used for:
Replica identification
Identification of the repository of record (in the case of replication)
Querying of additional information
Checksum (time stamped)

Used as Actionable URLs

PIDs in EUDAT
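
The uses listed on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PidRecord` class, its field names, the example handle and the choice of SHA-256 are all hypothetical and are not taken from the EUDAT PID profile. The only external fact used is that a handle becomes an actionable URL when appended to the global Handle proxy address.

```python
import hashlib
from dataclasses import dataclass

# Assumption: the global Handle proxy is used as the resolver.
HANDLE_PROXY = "https://hdl.handle.net/"


@dataclass(frozen=True)
class PidRecord:
    """Hypothetical, simplified view of the information kept behind a PID."""
    handle: str              # "<prefix>/<suffix>"
    location: str            # where this copy of the object lives
    checksum: str            # hex digest of the object content
    checksum_timestamp: str  # when the checksum was recorded
    ror: str                 # PID pointing at the repository of record

    def actionable_url(self) -> str:
        """Return a URL that resolves the PID through the proxy."""
        return HANDLE_PROXY + self.handle

    def matches(self, content: bytes) -> bool:
        """Compare content with the stored checksum (SHA-256 assumed)."""
        return hashlib.sha256(content).hexdigest() == self.checksum


data = b"example payload"
record = PidRecord(
    handle="12345/abc-001",  # made-up handle, for illustration only
    location="https://repo.example.org/objects/abc-001",
    checksum=hashlib.sha256(data).hexdigest(),
    checksum_timestamp="2016-01-01T00:00:00Z",
    ror="12345/abc-001",
)
print(record.actionable_url())
print(record.matches(data))
```

A replica whose content no longer matches the recorded checksum would be detected by `matches` returning `False`.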

Page 3: Persistent Identifiers in EUDAT services

The EUDAT Service Suite

http://www.eudat.eu/services

Page 4: Persistent Identifiers in EUDAT services

The EUDAT Service Suite + PIDs

http://www.eudat.eu/services

B2DROP: supports living objects, no PIDs
B2SHARE: PIDs (collections), referable
B2SAFE: PIDs (files), long-term preservation
B2ACCESS: user access, no PIDs
B2STAGE: PIDs fetch data
B2FIND: PIDs refer to data
B2HANDLE: PID management

Page 5: Persistent Identifiers in EUDAT services

B2DROP

A secure and trusted data exchange service for researchers and scientists to keep their research data synchronized and up to date and to exchange it with other researchers.

Store and exchange data with colleagues and team members, including research data not finalized for publishing
Share data with fine-grained access controls
Synchronize multiple versions of data across different devices

Supports living objects, so no PIDs

Page 6: Persistent Identifiers in EUDAT services

B2SHARE

A user-friendly, reliable and trustworthy way for researchers, scientific communities and citizen scientists to store and share small-scale research data coming from diverse contexts.

Assigns PIDs to every data collection, to make them referable

Page 7: Persistent Identifiers in EUDAT services

B2SHARE: The process

Assigns PIDs to every data collection, to make it referable

Page 8: Persistent Identifiers in EUDAT services

B2SHARE

The persistent identifier for the resource

Page 9: Persistent Identifiers in EUDAT services

B2SAFE

A robust, safe and highly available service which allows community and departmental repositories to implement data management policies on research data across multiple administrative domains in a trustworthy manner.

PIDs at file level, for long-term preservation and for linking replicas to their originals

Page 10: Persistent Identifiers in EUDAT services

B2SAFE: What happens step by step?

[Figure: data flow in B2SAFE. Labels recovered from the slide: Community repository, Digital Object (DO); Community Centre (iRODS); Data ingestion; PID assignment: unique identifier (PID) to the DO, via own PID system or iRODS rules; Data replication, based on community policy; Data Center Store 1 (iRODS, PID); Data Center Store 2 (iRODS, PID).]

Page 11: Persistent Identifiers in EUDAT services

B2SAFE: Original DO and replicas

ROR: Repository of Record, the repository where the data was stored first.

PPID: Parent PID, the persistent identifier associated with the source object in a replication chain. If the chain has only two elements, the master copy and the first replica, then PPID = ROR.

PIDs as Linked list
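
Reading the PPID entries as a linked list, the chain from any replica back to the master copy could be walked as in the sketch below. The in-memory `records` dictionary, the made-up PIDs and the convention that the ROR entry holds the PID of the master copy are assumptions for illustration; in a real deployment each lookup would be a PID resolution.

```python
from typing import Dict, List, Optional

# Hypothetical PID store: PID -> {"PPID": parent PID or None, "ROR": master PID or None}
records: Dict[str, Dict[str, Optional[str]]] = {
    "11111/master": {"PPID": None, "ROR": None},
    "22222/replica-1": {"PPID": "11111/master", "ROR": "11111/master"},
    "33333/replica-2": {"PPID": "22222/replica-1", "ROR": "11111/master"},
}


def replication_chain(pid: str, store: Dict[str, Dict[str, Optional[str]]]) -> List[str]:
    """Follow PPID links from pid back to the object that has no parent."""
    chain = [pid]
    seen = {pid}
    parent = store[pid]["PPID"]
    while parent is not None:
        if parent in seen:
            # Guard against a corrupted chain that loops back on itself.
            raise ValueError("cycle detected in PPID chain at " + parent)
        chain.append(parent)
        seen.add(parent)
        parent = store[parent]["PPID"]
    return chain


chain = replication_chain("33333/replica-2", records)
print(" -> ".join(chain))
```

For the first replica the parent and the repository of record coincide, which is the PPID = ROR case described above; for later replicas the PPID points to the previous replica while the ROR stays fixed.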

Page 12: Persistent Identifiers in EUDAT services

B2STAGE

A reliable, efficient, light-weight and easy-to-use service to transfer research data sets between EUDAT storage resources and high-performance computing (HPC) workspaces.

PIDs, to fetch data

Transfer large data collections
In conjunction with B2SAFE, replicate community data sets, ingesting them onto EUDAT storage resources for long-term preservation
Ingest computation results into the EUDAT infrastructure

Page 13: Persistent Identifiers in EUDAT services

B2STAGE

An extension of the B2SAFE and B2FIND services, which allow users to store, preserve and find data

Data-staging script facilitates staging, ingestion and retrieval of persistent identifier (PID) information of transferred data

Page 14: Persistent Identifiers in EUDAT services

B2STAGE

Page 15: Persistent Identifiers in EUDAT services

B2STAGE

Page 16: Persistent Identifiers in EUDAT services

B2FIND

A discovery service: a simple, user-friendly metadata catalogue of research data collections stored in EUDAT data centres and other repositories.

PIDs, as source identifier

Find collections of scientific data quickly and easily, irrespective of their origin, discipline or community
Get quick overviews of available data
Browse through collections using standardised facets

Page 17: Persistent Identifiers in EUDAT services

B2FIND

Metadata are stored through EUDAT services (such as B2SHARE) and harvested from various research community repositories covering a wide range of research disciplines. The benefit for communities of publishing metadata in EUDAT is improved visibility and searchability of their research data in an interdisciplinary, pan-European scope.

Page 18: Persistent Identifiers in EUDAT services

PID Training

B2FIND – B2SHARE Community

The Source is an identifier, that is, a unique string that identifies the resource. It may link to the data resource itself or to a landing page that curates the data.

You may also find:
PID as an alternate identifier
DOI as an alternate identifier

B2FIND uses B2SHARE PIDs

Page 19: Persistent Identifiers in EUDAT services

PID Training

B2FIND – SDL Community

The SDL community supports PID and DOI as alternate identifiers.

B2FIND uses PID and DOI from SDL Community.
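
When a catalogue entry carries both kinds of identifier, they can be told apart by their prefix: a DOI is a Handle whose prefix starts with "10.". The sketch below is a minimal illustration of that rule; the function names and the example identifiers are hypothetical and are not part of B2FIND.

```python
def identifier_kind(identifier: str) -> str:
    """Classify a '<prefix>/<suffix>' identifier as DOI, Handle PID or unknown."""
    prefix, sep, suffix = identifier.partition("/")
    if not sep or not prefix or not suffix:
        return "unknown"
    return "DOI" if prefix.startswith("10.") else "Handle PID"


def resolver_url(identifier: str) -> str:
    """Build an actionable URL using the public DOI or Handle proxy."""
    kind = identifier_kind(identifier)
    if kind == "DOI":
        return "https://doi.org/" + identifier
    if kind == "Handle PID":
        return "https://hdl.handle.net/" + identifier
    raise ValueError("not a prefix/suffix identifier: " + identifier)


# Made-up identifiers, for illustration only.
for example in ("10.1234/abcd", "12345/xyz-42"):
    print(example, identifier_kind(example), resolver_url(example))
```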

Page 20: Persistent Identifiers in EUDAT services

B2HANDLE

EUDAT has adopted Handle-based persistent identifiers, built on a combined solution of the Handle System and the ePIC service. B2HANDLE is a central service for managing persistent identifiers at EUDAT.

PID management

Why handles?
Stable, globally unique IDs and stable cross-links
Technology agnostic
Simple integration

Page 21: Persistent Identifiers in EUDAT services

Benefits of B2HANDLE

Follows policies to register data and make it long-term referable and citable
Reliability through mutual PID mirroring; Handle Prefix Registrars from ePIC or other DONA MPAs are partners of EUDAT
Provides the abstraction layer between a globally unique persistent identifier and the physical location of data objects
Machine readable via an HTTP RESTful API
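
As an illustration of machine readability, the sketch below builds a request URL for the JSON interface of the global Handle proxy and extracts the location from a response. It is a sketch under assumptions: the handle and the sample response are made up, the helper names are hypothetical, and an EUDAT site may expose its own endpoint rather than the global proxy. No network call is made.

```python
import json
from typing import Optional
from urllib.parse import quote

# JSON interface of the global Handle proxy (assumed here as the endpoint).
API_ROOT = "https://hdl.handle.net/api/handles/"


def api_url(handle: str) -> str:
    """Return the REST URL for a handle, keeping the prefix/suffix slash."""
    return API_ROOT + quote(handle, safe="/")


def location_of(response: dict) -> Optional[str]:
    """Return the value of the first URL entry, or None if unresolved."""
    if response.get("responseCode") != 1:  # 1 signals success
        return None
    for value in response.get("values", []):
        if value.get("type") == "URL":
            return value["data"]["value"]
    return None


# Made-up response in the shape returned by the proxy.
sample = json.loads("""
{
  "responseCode": 1,
  "handle": "12345/abc-001",
  "values": [
    {"index": 1, "type": "URL",
     "data": {"format": "string",
              "value": "https://repo.example.org/objects/abc-001"}}
  ]
}
""")

print(api_url(sample["handle"]))
print(location_of(sample))
```

The abstraction layer is visible here: a client only ever stores the handle, and the physical location is looked up at resolution time.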

Page 22: Persistent Identifiers in EUDAT services

B2HANDLE – The Python library

b2handle: A Python library for interaction with EUDAT Handle services

setuptools-enabled Python package
Requires contact with one of the EUDAT Handle server sites

Technical documentation:

http://eudat-b2safe.github.io/B2HANDLE
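
A minimal sketch of how the library might be used for read access follows. It assumes the `EUDATHandleClient` class with `instantiate_for_read_access()` and `retrieve_handle_record()` as described in the technical documentation linked above; check that documentation before relying on these names. The `describe` helper, the `StubClient` class, the example handle and the choice of the URL and CHECKSUM entries are hypothetical and only there so the example runs offline.

```python
def make_read_client():
    """Create a read-only client; needs the b2handle package and network access."""
    # Imported lazily so the rest of this sketch runs without b2handle installed.
    from b2handle.handleclient import EUDATHandleClient
    return EUDATHandleClient.instantiate_for_read_access()


def describe(client, handle):
    """Return selected entries of a handle record, or None if it does not exist."""
    record = client.retrieve_handle_record(handle)
    if record is None:
        return None
    # Entry names assumed for illustration.
    return {"URL": record.get("URL"), "CHECKSUM": record.get("CHECKSUM")}


class StubClient:
    """Offline stand-in for the real client, for illustration only."""

    def __init__(self, records):
        self._records = records

    def retrieve_handle_record(self, handle):
        return self._records.get(handle)


stub = StubClient({
    "12345/abc-001": {  # made-up handle and values
        "URL": "https://repo.example.org/objects/abc-001",
        "CHECKSUM": "0123abcd",
    }
})
print(describe(stub, "12345/abc-001"))
print(describe(stub, "12345/missing"))
```

With the package installed, `describe(make_read_client(), handle)` would perform the same lookup against a Handle server instead of the stub.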

Page 23: Persistent Identifiers in EUDAT services

B2HANDLE – B2SAFE example

Offers integration of the ePIC API into iRODS via a Python script.

This comes out of the box with the B2SAFE service
Complexity of the structure hidden

The script takes credentials as input:
Supplied on the command line, or
Stored in a configuration file (iRODS or local file system)

The script supports the following actions:
Searching
Resolving
Creation
Modification

Page 24: Persistent Identifiers in EUDAT services

B2ACCESS

Simple and secure authorisation and authentication platform of EUDAT, which allows single sign-on on EUDAT’s public and internal services.

User Access, no PIDs

EUDAT users can authenticate themselves using a variety of credentials:
User's Home Organisation Identity Provider
Google account
B2ACCESS ID

Page 25: Persistent Identifiers in EUDAT services

The EUDAT Service Suite + PIDs

http://www.eudat.eu/services

B2DROP: supports living objects, no PIDs
B2SHARE: PIDs (collections), referable
B2SAFE: PIDs (files), long-term preservation
B2ACCESS: user access, no PIDs
B2STAGE: PIDs fetch data
B2FIND: PIDs refer to data
B2HANDLE: PID management

Page 26: Persistent Identifiers in EUDAT services

Thanks

Page 27: Persistent Identifiers in EUDAT services

www.eudat.eu

Authors Contributors

Themis Zamani, GRNET
Willem Elbers, CLARIN
Christine Staiger, SURFsara
Ellen Leenarts, DANS

Thank you