The need and drive for high quality data publication

Iain Hrynaszkiewicz, Head of Data and HSS Publishing, Open Research, Nature Publishing Group & Palgrave Macmillan ([email protected], @iainh_z). COASP, Paris, September 2014.

Page 1: The need and drive  for high  quality data publication

THE NEED AND DRIVE FOR HIGH QUALITY DATA PUBLICATION

Iain Hrynaszkiewicz

Head of Data and HSS Publishing, Open Research

Nature Publishing Group & Palgrave Macmillan

[email protected]

@iainh_z

COASP, Paris, September 2014

Page 2: The need and drive  for high  quality data publication

Publishers and data/reproducibility

• Policies on access (to data, code, reagents etc)

• Supporting funder & community needs

• Format and amount of content

• Methodological details, supp info, data integration and links to repositories

• Licensing for reuse

• Incentives to share

• Data citations

• Data journals and articles

• Quality assurance through peer review

Page 3: The need and drive  for high  quality data publication

Data/reproducibility and NPG

Some important events

1996: Bermuda Principles – prepublication sharing

1998: Structural data accession codes (Nature & Science)

2002: MIAME-compliant microarray data deposition

2007: Removal of limitations on Methods sections online

2009: Ioannidis et al. Nat Gen 41, 2, 149 (2009)

Page 4: The need and drive  for high  quality data publication

2013

Page 5: The need and drive  for high  quality data publication

Data/reproducibility and NPG

Some important recent events

2013: Reproducibility checklist, source data from figures

2014: Endorsing the Joint Declaration of Data Citation Principles

2014: Launch of Scientific Data

Page 6: The need and drive  for high  quality data publication

Role of data journals/articles

• Credit

• Unpublished data

• Peer review focus

• Value of data vs. analysis

• Discoverability

• Reusability

• Narrative/context

• “Intelligently open data”

Page 7: The need and drive  for high  quality data publication

Data, data (journals) everywhere?

Page 8: The need and drive  for high  quality data publication

Scientific Data 2011 Market Research

Scope of survey

• How much data researchers produce, in what format and what they do with it

• Perceived availability of public repositories

• Perceptions of the Scientific Data concept

• Level/nature of data journal peer review

Respondent characteristics

• 387 respondents (329 active researchers)

• Physics (24%), Earth and environmental science (21%), Biology (20%), Chemistry (19%), Others (16%)

Page 9: The need and drive  for high  quality data publication

Scientific Data 2011 Market Research

Key survey data

• 60% share their data with their colleagues

• 50% look at other researchers’ datasets at least once a month

• 45% unaware of a repository for some of their data

• 90% reacted positively to the concept of Scientific Data

• 80% believed Scientific Data would increase data deposition rates

Page 10: The need and drive  for high  quality data publication

Scientific Data 2011 Market Research

Key survey data – what do researchers want from a data publication?

• 96% - increased visibility and discovery

• 95% - increased usability of their research data

• 93% - credit mechanism for deposit of data

• 80% - peer review of content/datasets

Page 11: The need and drive  for high  quality data publication

Get Credit for Sharing Your Data

Publications will be indexed and citable.

Open-access

Creative Commons licenses (CC-BY/CC-BY-NC) for the main Data Descriptor. Each publication is supported by CC0 metadata.

Focused on Data Reuse

All the information others need to reuse the data; no interpretative analysis or hypothesis testing.

Peer-reviewed

Rigorous peer-review focused on technical data quality and reuse value

Promoting Community Data Repositories

Not a new data repository; data stored in community data repositories

Page 12: The need and drive  for high  quality data publication

Scientific Data

Scope

An open access, peer-reviewed publication for descriptions of scientifically valuable datasets. Our primary article type, the Data Descriptor, is designed to make your data more discoverable, interpretable and reusable.

Editorial team

• Managing Editor (Andrew Hufton)

• Editorial Curator (Victoria Newman)

• Honorary Academic Editor (Susanna Sansone, Oxford)

• Advisory Panel and Editorial Board

Open access article processing charge

$1,000 USD / £650 GBP / €750 for each accepted article

Page 13: The need and drive  for high  quality data publication

The ‘Data Descriptor’ article

Detailed descriptions of the methods and technical analyses supporting the quality of the measurements. Does not contain tests of new scientific hypotheses.

Sections:

• Title

• Abstract

• Background & Summary

• Methods

• Technical Validation

• Data Records

• Usage Notes

• Figures & Tables

• References

• Data Citations
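
The section list on this slide can be treated as a checklist when preparing a draft. The sketch below is illustrative only: the function name `missing_sections` and the sample draft are assumptions made for this example, not a tool provided by Scientific Data. Only the section names come from the slide.

```python
# Section names as listed on the 'Data Descriptor' slide.
REQUIRED_SECTIONS = [
    "Title",
    "Abstract",
    "Background & Summary",
    "Methods",
    "Technical Validation",
    "Data Records",
    "Usage Notes",
    "Figures & Tables",
    "References",
    "Data Citations",
]


def missing_sections(draft_headings):
    """Return the required sections absent from a draft, in slide order.

    Hypothetical helper: matching ignores case and surrounding whitespace.
    """
    present = {heading.strip().lower() for heading in draft_headings}
    return [s for s in REQUIRED_SECTIONS if s.lower() not in present]


# Made-up draft outline, used purely for illustration.
draft = ["Title", "Abstract", "Methods", "Data Records", "References"]
print(missing_sections(draft))
```

For this made-up draft, the check reports Background & Summary, Technical Validation, Usage Notes, Figures & Tables and Data Citations as still to be written.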

Page 14: The need and drive  for high  quality data publication

Peer review at Scientific Data

Focuses on:

• Completeness (can others reproduce?)

• Consistency (were community standards followed?)

• Integrity (are data in the best repository?)

• Experimental rigour and technical quality (were the methods sound?)

Does not focus on:

• Perceived impact/importance

• Size/complexity of data

Page 15: The need and drive  for high  quality data publication

The ‘Data Descriptor’ article

Experimental metadata or structured component (in-house curated, machine-readable formats)

Article or narrative component (PDF and HTML)

Page 16: The need and drive  for high  quality data publication

Zehr et al. Scientific Data 1, Article number: 140019 doi:10.1038/sdata.2014.19
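
The citation on this slide carries a DOI, which is what makes a Data Descriptor citable and linkable. As a minimal sketch, assuming the public doi.org resolver, the DOI can be pulled out of the citation string and turned into a link. The helper name `doi_to_url` and the regular expression are illustrative choices for this example, not part of any publisher tooling.

```python
import re

# Citation text exactly as it appears on the slide.
CITATION = (
    "Zehr et al. Scientific Data 1, Article number: 140019 "
    "doi:10.1038/sdata.2014.19"
)


def doi_to_url(citation: str) -> str:
    """Extract the first 'doi:...' token and build a doi.org link.

    Hypothetical helper; the pattern covers the common '10.NNNN/suffix' form.
    """
    match = re.search(r"doi:\s*(10\.\d{4,9}/\S+)", citation)
    if match is None:
        raise ValueError("no DOI found in citation")
    return "https://doi.org/" + match.group(1)


print(doi_to_url(CITATION))
```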

Page 17: The need and drive  for high  quality data publication

Page 18: The need and drive  for high  quality data publication

Stem Cells

• Associated Nature Article

• Data at figshare & NCBI GEO

• Integrated figshare data viewer

Page 19: The need and drive  for high  quality data publication

Neuroscience

Code in GitHub

• New Dataset

• Data in OpenfMRI

• Source code in GitHub

• Big Data

Page 20: The need and drive  for high  quality data publication

The right licence

Data: depends on public repositories. Partner repositories figshare and Dryad both use the CC0 waiver.

Metadata: released under the CC0 waiver to maximize reuse and aid data miners

Data Descriptor article: licensed under one of two Creative Commons licenses, by author choice: CC-BY or CC-BY-NC.

Page 21: The need and drive  for high  quality data publication

For more information please contact

IAIN HRYNASZKIEWICZ

Head of Data and HSS Publishing, Open Research

M: +44 (0)7814 290576

T: +44 (0)207 0146753

E: [email protected]

Thank you