
Digital Object Identifiers and Unique Authors Identifiers

to enable services for data quality assessment, provenance, access

http://www.digoiduna.eu/

Barbara Bazzanella, Paolo Bouquet, Martin Dow, Ruben Riestra

SMART 2010/0054
Contract N. 30-CE-0395470/00-32

Objectives of the DIGOIDUNA study

1. Supporting policy makers at European and member state level to understand the opportunities and challenges of adopting solutions for managing identifiers in the context of establishing scientific data e-infrastructures (SDIs)

2. Providing instruments that will support decision making on solutions that will have a long-lasting impact on scientific research and on the long term access, preservation and integration of valuable data and knowledge assets

The three key messages

1. Digital identifiers are at the root of extracting value from information resources within SDIs.

2. The key challenge for managing identifiers goes far beyond the technical level to embrace a much wider vision, where organizational, social and business strategies form an intertwined eco-system.

3. Action is needed to exploit the opportunities provided by a coordinated eco-system of identifiers in e-Science

e-Science infrastructures in the digital agenda for Europe

How to deliver the benefits of the digital era to e-Science?

• Building a pan-European and worldwide network of research infrastructures to increase the potential for innovation and the advancement of research

• Increasing the efficiency and effectiveness of European research and reinforcing the research community

Identifiers: creating value from data

Digital identifiers (DIs) are an essential building block for enabling effective and efficient technical solutions and for supporting the creation of value-added services like:

• Data and information Access, Search and Navigation
• Fast, large-scale and decentralized Data Sharing & Reuse
• Effective Linkage of data and information across repositories
• Fine-grained Access Control
• Data and information Quality assessment
• Reputation assessment & Citation indexes
• Impact and ROI assessment (reliable research outputs beyond the scope of published literature)
• Ownership management for data and scholarly content (citability)
• …

on top of scientific data and contents

From local-non digital to global-digital

But what if we need to deal with data & information created and managed across

• national
• organizational
• disciplinary
• cultural
• technological
• …

boundaries?

Digital Identifiers (DIs) are the keys for cost-effective data management in digital systems

The new fundamental challenge

Enabling a cross-boundary key to data

A multi-dimensional vision

The opportunities ahead

SWOT analysis

The current situation

Future e-Science scenarios

Actions needed to fill the gap

THE STARTING POINT

The current situation - I

FUNDAMENTAL AGREEMENT: managing identifiers is an essential component for data management in SDIs and a key to produce (more) value from data.

The current situation - II

FROM FRAGMENTATION TO CONVERGENCE: despite the current fragmentation of solutions, the main stakeholders are in fact converging toward a restricted number of systems and initiatives for managing persistent identifiers, on top of which value-added services are being built (for example, DataCite or CrossRef).

[See below: the PIs / Cool URIs debate]

DIs for Digital Objects

Source: elaboration from APARSEN questionnaire (2011)

DIs for Authors

Source: elaboration from APARSEN questionnaire (2011)

The current situation - III

There is a different level of maturity between more consolidated solutions for digital objects and the gradually emerging solutions for authors (e.g. ResearcherID, ScopusID, the ORCID initiative).

The current situation - IV

There is a clear indication that trusted institutions should support the definition of agreements between the relevant stakeholders and users, especially when there are potentially conflicting interests

Trust is a key requirement towards a solution (trusted authorities, sustainability & long-term preservation, data quality, explicit policies & governance models, etc.)

Linked Open Data: Cool URIs as Persistent Identifiers for e-Science?

Can the Web itself be taken as the platform for e-Science, and current practices around HTTP URIs as a way of managing persistent identifiers?

The Web of Data and the Cool URIs solution

PIs / Cool URIs: the debate

                            Persistent Identifiers   Cool URIs
Persistence                 yes                      to be proven
Authority                   yes                      no
Level of trust              high                     low
Effort for implementation   high                     low
Sustainability              to be proven             yes
Cross-linkage               weak                     yes
Metadata interoperability   low                      high
ID resolution               partially                yes
Content negotiation         no                       yes
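
The content negotiation row can be illustrated with a small sketch. It assumes a hypothetical Cool URI under example.org (not taken from the study): the same URI is requested twice with different Accept headers, and a server that supports content negotiation would be expected to answer with an HTML page for a browser and an RDF description for a machine client. The requests are only built here, not sent.

```python
from urllib.request import Request

# Hypothetical Cool URI, used for illustration only.
COOL_URI = "http://example.org/id/dataset/42"


def negotiation_request(uri: str, media_type: str) -> Request:
    """Build (but do not send) a GET request asking for one representation."""
    return Request(uri, headers={"Accept": media_type}, method="GET")


# Same identifier, two requested representations.
html_req = negotiation_request(COOL_URI, "text/html")
rdf_req = negotiation_request(COOL_URI, "application/rdf+xml")

for req in (html_req, rdf_req):
    print(req.get_method(), req.full_url, "Accept:", req.get_header("Accept"))
```

The point of the sketch is that the identifier stays the same while the representation varies, which is what the table credits to Cool URIs.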

The “Den Haag Manifesto”: towards interesting synergies

PIs and Cool URIs:

• owl:sameAs relations for linking PIs
• PIs as resource IDs in RDF triples

Persistent Object Identifiers Seminar, The Hague, The Netherlands, 14-15 June 2011
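
A minimal sketch of the two synergies, written as N-Triples strings in plain Python. The DOI, the Cool URI and the author URI are all hypothetical placeholders, and the use of the Dublin Core `creator` property is an assumption made for the example, not something the manifesto prescribes.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
DCT_CREATOR = "http://purl.org/dc/terms/creator"

# Hypothetical identifiers, for illustration only.
pi_uri = "https://doi.org/10.1234/example-dataset"     # persistent identifier (DOI form)
cool_uri = "http://example.org/id/dataset/42"          # Cool URI for the same dataset
author_uri = "http://example.org/id/person/jane-doe"   # identifier for an author


def triple(subject: str, predicate: str, obj: str) -> str:
    """Serialize one triple of URI references in N-Triples syntax."""
    return f"<{subject}> <{predicate}> <{obj}> ."


triples = [
    # Synergy 1: an owl:sameAs relation links the Cool URI to the PI.
    triple(cool_uri, OWL_SAME_AS, pi_uri),
    # Synergy 2: the PI itself is used as a resource ID in an RDF triple.
    triple(pi_uri, DCT_CREATOR, author_uri),
]

print("\n".join(triples))
```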

WHERE DO WE WANT TO GO?

Future-looking scenarios

Example I: speeding up data sharing & reuse

Genomic data

Dataset ID

Authorship and intellectual property

Citability before publication

Research collaboration and collective benefit

Source: interview with Jan Brase (Datacite.org)

Example II: building the network of e-Science

DI for Author, linked to:

• Scientific publications
• Biographical information
• Research collaborations & projects
• Papers on the same research topic
• Citation metrics
• Related institutions
• Related / subsidiary info on the web

Rich measures and impact assessment
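
A minimal sketch of how a single author identifier could act as the join key across these sources. The records, source names and the identifier scheme (`author:0001`) are all hypothetical and are not taken from the study.

```python
from collections import defaultdict

# Hypothetical records held by different systems; only the shared
# author identifier ties them together.
records = [
    {"source": "repository-a", "kind": "publication",
     "label": "Paper on genomic data", "author_id": "author:0001"},
    {"source": "funder-db", "kind": "project",
     "label": "Project X", "author_id": "author:0001"},
    {"source": "repository-b", "kind": "publication",
     "label": "Paper on identifiers", "author_id": "author:0002"},
    {"source": "university-hr", "kind": "institution",
     "label": "University Y", "author_id": "author:0001"},
]


def build_profiles(records):
    """Group record labels by author identifier, then by kind of record."""
    profiles = defaultdict(lambda: defaultdict(list))
    for record in records:
        profiles[record["author_id"]][record["kind"]].append(record["label"])
    # Convert the nested defaultdicts to plain dicts for predictable output.
    return {author: dict(kinds) for author, kinds in profiles.items()}


profiles = build_profiles(records)
print(profiles["author:0001"])
```

Without a shared identifier, the same grouping would have to rely on name matching, which is exactly where ambiguity between authors tends to creep in.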

DIAGNOSING THE GAP

SWOT analysis

A SWOT ANALYSIS

Strengths (S)

• Europe has already been investing in the creation of e-infrastructures
• The necessary know-how is available
• Success of pioneering efforts in restricted contexts
• Technical solutions are available and relatively mature

Opportunities (O)

• Consolidation of e-Science
• Demand for science-based innovation
• Expectations for “ROI” of public interventions

Weaknesses (W)

• Ultimate consensus and coordination among stakeholders
• Lack of clear business & sustainability models
• Scarce awareness beyond (and even within) direct players

Threats (T)

• Political priorities
• Institutional consensus / different agendas
• Resistance to change in scientific communities

Possible strategies to fill the gap

• Stakeholders agreement

• Interoperability
• Acceptance
• Trust

• Bottom-up agreement

• Time
• Failure?
• Risk of fragmentation

• A priori interoperability

• Costs and resources
• Locking out pre-existing solutions
• Low acceptance

• Reuse of available systems

• Lack of adoption
• Lack of backward compatibility
• Time?

Supporting coordination: the main actions

Define the DIs agenda

The EC should start initiatives aimed at defining a common agenda among key stakeholders towards the design and implementation of a governance model and an integrating infrastructure for managing DIs in SDIs, in which technological, economic, social and political factors are taken into account.

Define the DIs agenda

1. Defining the common objectives and organizing them into a list of workable temporal priorities.

2. Agreeing on a shared governance model, which defines devolved responsibilities amongst stakeholders and ensures long-term sustainability.

3. Sharing a conceptual framework in which the basic technical parameters and the fundamental services are introduced and described.

4. Planning interventions to promote awareness, dissemination and education activities aiming at expanding and reinforcing DI knowledge and skills.

Bootstrapping DIs in SDIs

The EC, Member States and other relevant stakeholders must take specific actions aiming at bootstrapping the implementation of the DI agenda in order to secure a critical mass of coordinated DI systems.

Bootstrapping DIs in SDIs

1. Reinforce, promote and secure EU-wide institutional coordination among vertical and regional clusters of e-infrastructure stakeholders (common policies on the governance of identifiers for digital objects and authors).

2. Funding bodies must provide initial support to seed initiatives that aim at implementing the coordination model defined in the agenda and at creating a critical mass of coordinated DI systems. This must be done in a flexible way that allows the reallocation of funds in the portfolio based on emerging needs and requirements.

3. Promote awareness and skills development to enable different stakeholders to participate effectively in DI initiatives and infrastructures.

4. Work towards systematic implementation of the technical and organisational factors that underpin trust in identifiers and their reliability as a key component of SDIs, and secure their operational management.

Mobilizing resources

Stakeholders at all levels should promote actions to mobilize technical, human and financial resources, aiming to trigger wider demand for the use and exploitation of e-Science results based on DIs.

Mobilizing resources

1. Funding agencies should design funding schemes that can attract new public and private investments and efforts in developing and adopting DI-based value-added services and solutions.

2. Stakeholders, and especially funding agencies, should foster interoperability based on consolidation of established DI systems (where possible) rather than on proliferation of ad hoc systems.

3. Actions should be taken to mobilize consolidated technical skills to implement effective digital identifier infrastructures (DIIs) within SDIs, and to adopt measures to assess their quality and their impact on the exploitation of e-Science results.

Sustainable solutions

Efforts should be invested to build suitable organisational mechanisms and business models to guarantee the long-term sustainability of DI solutions.

Sustainable solutions

1. Stakeholders need to develop business models where the costs of developing and sustaining identifier infrastructures, and the responsibility for ensuring the long-term sustainability of these infrastructures, are distributed among the beneficiaries.

2. The flexibility of funding sources should be enhanced, allowing the reallocation of funds in the portfolio to enable the rapid scaling of promising solutions that embed or promote the value (usage) of identifiers.

3. Funding bodies must support the development of collaborative models and actions to create synergies and exchange opportunities between the private/commercial sector and the scientific sector (DI-PPP).

THANK YOU!

THE DIGOIDUNA TEAM

Contact: [email protected]

Paolo Barbara Martin Ruben