Implementing Digital Object Identifiers at the GESIS Data Archive for the Social Sciences Workshop...

Implementing Digital Object Identifiers at the GESIS Data Archive for the Social Sciences. Workshop “Persistent Identifiers for the Social Sciences”, Bonn, 1st and 2nd of February 2011. Wolfgang Zenk-Möltgen, GESIS – Leibniz Institute for the Social Sciences



Implementing Digital Object Identifiers at the GESIS Data Archive for the Social Sciences

Workshop “Persistent Identifiers for the Social Sciences” Bonn, 1st and 2nd of February 2011

Wolfgang Zenk-Möltgen, GESIS – Leibniz Institute for the Social Sciences


Overview

Initial Situation

Requirements for Persistent Identifiers

Implementation Decisions

Version Management

Allocation of DOI Names

Workflow Integration

Technical Developments

• DBK Extension to Manage Versions and DOIs

• DDI Export for Enhanced Publications Including URNs

Conclusions


Initial Situation

• The Data Archive provides research data for the social sciences
  - Currently 4480 studies, including cross-section, national or international comparative, and longitudinal studies
• The Data Archive works on different levels of data processing and data documentation
• Metadata on study level: the Data Catalogue (DBK)
  - Enables searching and browsing for users
  - Management system for internal and external information
• Bibliographic citation requirement for data users
  - http://www.gesis.org/en/services/data/retrieval-data-access/data-archive-service/citation-of-research-data/
  - Includes Study Identifier “ZAnnnn”
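The study identifier pattern “ZAnnnn” lends itself to a mechanical check. The following is a minimal sketch, assuming that each “n” stands for one decimal digit (the slide does not spell this out); the function names are hypothetical and not part of any GESIS tool.

```python
import re

# Assumption: "ZAnnnn" means the literal prefix "ZA" followed by four decimal digits.
STUDY_ID_PATTERN = re.compile(r"ZA[0-9]{4}")


def is_study_id(text: str) -> bool:
    """Return True if text has the form ZAnnnn (under the assumption above)."""
    return STUDY_ID_PATTERN.fullmatch(text) is not None


def format_study_id(study_number: int) -> str:
    """Build a ZAnnnn identifier from a numeric study number (hypothetical helper)."""
    if not 0 <= study_number <= 9999:
        raise ValueError("study number must fit into four digits")
    return f"ZA{study_number:04d}"
```

A check like this could run before a study number is written into a citation, so that malformed identifiers are caught early.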


Requirements for Persistent Identifiers

• Clear understanding of what is to be identified
  - E.g. survey projects, series of studies, data collection events, questionnaires or other instruments, datasets, variables
• Guarantee of unique names
  - Can I control that nobody else will use the same name for anything else, and can I avoid duplicates?
• Expectation of stability
  - Woody Allen: “Eternity is really long, especially near the end”
• Change management
  - Which processes must be triggered when change occurs?


Implementation Decisions

• We want to identify datasets
  - Not study series, studies, data collection events
  - Not variables or questions
  - Independent of specific format (SPSS, STATA, SAS)
• We joined the DataCite consortium
  - Using DOIs as PI system with existing infrastructure
  - Supporting citation of research data
  - Planning added value services
• We keep DDI as the main metadata standard
  - Including URN names for every DDI object
  - Aiming at interoperability
• We keep track of dataset versions
  - Documentation inside the dataset
  - Policy for version numbering


Version Management

• DBKEdit is used as management system at the Archive:
  - Introduction of mandatory version numbers
  - Format: N.N.N (in accordance with the DDI standard), starting with 1.0.0
  - Rules for increasing major, minor, and revision numbers
  - Version date, author, title and description are also documented
  - Errata of the current version are documented
  - Corrected errata are kept for documenting a new version
  - Version history and errata are published
• da|ra is the service provider for DOI allocation:
  - A DOI proposal is transmitted from DBKEdit
  - Together with the URL and some metadata
  - DOI allocation takes place at da|ra
  - URL and/or metadata changes have to be transmitted from DBKEdit again
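The N.N.N version format can be illustrated with a small value type. This is a sketch only: the slide names the three levels (major, minor, revision) and the starting value 1.0.0, but not the concrete rules, so resetting the lower-level numbers on a bump is an assumption, and the class is hypothetical rather than DBKEdit code.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Version:
    """Three-part version number in the N.N.N format, starting at 1.0.0."""
    major: int = 1
    minor: int = 0
    revision: int = 0

    @classmethod
    def parse(cls, text: str) -> "Version":
        """Parse a string such as '1.2.0' into a Version."""
        parts = text.split(".")
        if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
            raise ValueError(f"not an N.N.N version: {text!r}")
        return cls(*(int(p) for p in parts))

    def bump(self, level: str) -> "Version":
        """Return the next version at the given level."""
        # Assumption: raising a higher-level number resets the lower ones to 0.
        if level == "major":
            return Version(self.major + 1, 0, 0)
        if level == "minor":
            return Version(self.major, self.minor + 1, 0)
        if level == "revision":
            return Version(self.major, self.minor, self.revision + 1)
        raise ValueError(f"unknown level: {level!r}")

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.revision}"
```

Because the dataclass is ordered, versions compare field by field, which keeps a version history sortable without extra code.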


Allocation of DOI Names

doi:10.4232/1.0001

http://info1.gesis.org/dbksearch13/print.asp?no=0001&search=0001&search2=&db=D

Parts of the DOI name (diagram labels):
• doi: schema name (doi®)
• 10.4232: Prefix (GESIS)
• 1.0001: Suffix (DBK, DBK incremental no.)

Standard for DataCite: UTF-8, one prefix per institution, no semantics in DOI
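The structure of the example name doi:10.4232/1.0001 can be sketched in a few lines. The prefix 10.4232 and the example suffix come from the slide; how the suffix is composed (a source digit for the DBK, a dot, and a four-digit zero-padded incremental number) is an assumption read off that single example, and all function names are hypothetical.

```python
from typing import NamedTuple

GESIS_PREFIX = "10.4232"  # prefix taken from the slide's example DOI


class DoiName(NamedTuple):
    prefix: str
    suffix: str

    def __str__(self) -> str:
        return f"doi:{self.prefix}/{self.suffix}"


def parse_doi(text: str) -> DoiName:
    """Split 'doi:<prefix>/<suffix>' at the first slash."""
    if not text.startswith("doi:"):
        raise ValueError("expected the 'doi:' schema name")
    prefix, sep, suffix = text[len("doi:"):].partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError("expected '<prefix>/<suffix>'")
    return DoiName(prefix, suffix)


def propose_doi(incremental_no: int, source: str = "1") -> DoiName:
    """Compose a DOI proposal for a DBK entry (hypothetical helper)."""
    # Assumption: suffix = source digit, a dot, and a four-digit zero-padded number.
    return DoiName(GESIS_PREFIX, f"{source}.{incremental_no:04d}")
```

Keeping the suffix a plain running number is in line with the “no semantics in DOI” rule: nothing in the name needs to change when the title, URL, or format of the dataset changes.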


Workflow Integration

Abbreviations:
• DBK: Data Catalogue
• ZACAT: Online Study Catalogue
• DBKEdit: Data Catalogue Edit-Tool
• DSDM: Dataset Documentation Manager
• CBE: CodebookExplorer

Diagram elements: DSDM, DBKEdit, Report-Tool, Online Publication, Publication on CD-ROM, Long-term preservation, Versions and DOIs


Technical Developments

DBKEdit - Data Catalogue Edit-Tool

Documentation and management of metadata on study level of all archived studies

Information provided for each study/dataset (partly bilingual):
• Title & study number
• Primary investigators & fieldwork agencies
• Universe & sampling
• Abstracts
• Related publications
• Access categories
• Questionnaires, etc.
• Notes and comments for internal use

Extension to manage versions, errata, and DOIs.

Publication via
- ZACAT (Nesstar)
- CBE (CodebookExplorer)
- DBK (Data Catalogue)


DBKEdit – Management of Metadata on Study-Level


DBK – Research and Information Provided for the User


Technical Developments

DDI Export for Enhanced Publications including URNs

Contains:
• Study number
• SPSS variable names, labels, and URNs
• Variable groups
• Complete question, answer texts and notes for each variable
• SPSS values and value labels

    <l:VariableScheme id="Archive Study ID_VarSch">
      <l:Variable id="SPSS Variable Name"
                  urn="urn:ddi:de.gesis:VariableScheme.Archive Study ID_VarSch.1.0.0:Variable.SPSS Variable Name.1.0.0"> *
        <r:UserID type="SPSS Variable Name">SPSS Variable Name</r:UserID>
        <r:Label maxLength="255" type="label">SPSS Variable Label</r:Label>
        <l:QuestionReference>
          <r:ID>Archive Study ID_QSPSS Variable Name</r:ID>
        </l:QuestionReference>
      </l:Variable>
    </l:VariableScheme>

* Repeatable
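The URN in the export template follows a regular pattern: the agency de.gesis, then one “type.id.version” segment per nesting level. A minimal sketch of how such a URN could be assembled is given below; the helper names are hypothetical, the version default 1.0.0 is copied from the template, and the study ID and variable name in the usage note are made-up placeholders.

```python
from typing import Tuple


def ddi_urn(agency: str, *parts: Tuple[str, str, str]) -> str:
    """Join (object type, id, version) triples into a 'urn:ddi:<agency>:...' string."""
    if not parts:
        raise ValueError("at least one (type, id, version) triple is required")
    segments = [f"{kind}.{obj_id}.{version}" for kind, obj_id, version in parts]
    return ":".join(["urn", "ddi", agency, *segments])


def variable_urn(study_id: str, variable_name: str, version: str = "1.0.0") -> str:
    """URN of one variable inside the variable scheme of a study."""
    scheme_id = f"{study_id}_VarSch"  # id convention taken from the template
    return ddi_urn(
        "de.gesis",
        ("VariableScheme", scheme_id, version),
        ("Variable", variable_name, version),
    )
```

For a hypothetical study ID `ZA0001` and variable `V1`, `variable_urn("ZA0001", "V1")` yields `urn:ddi:de.gesis:VariableScheme.ZA0001_VarSch.1.0.0:Variable.V1.1.0.0`.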


Conclusions

Establishing DOIs for research datasets is easy … if you have a service provider like da|ra

Managing the metadata and keeping track of versions is possible … if you invest in documentation systems and establish a policy

Time will tell … if data and its identifiers will be for eternity

But remember: “The preservation of valuable data sets and their distribution on demand is of utmost importance for the progress of science.”

from “Data for eternity”, in: Nature Geoscience 3, 219 (2010), doi:10.1038/ngeo840


Thank you!
