OpenAIRE-COAR conference 2014: Allowing research data to shine: providing tangible credit for data...
ALLOWING RESEARCH DATA TO SHINE: PROVIDING
TANGIBLE CREDIT FOR DATA SHARING
Joint OpenAIRE-COAR conference, 21-22nd May 2014
Varsha Khodiyar, Editorial Biocurator, F1000Research
@vkf1000
f1000research.com @f1000research
ABOUT FACULTY OF 1000
The Seer of Science Publishing. Science, 4 October 2013:
Vol. 342 no. 6154 pp. 66-67. DOI: 10.1126/science.342.6154.66
http://www.sciencemag.org/content/342/6154/66.full.pdf
@vkf1000 | @f1000research
WHAT IS F1000RESEARCH?
F1000Research is an open access journal for life scientists that
accepts all scientifically sound articles, ranging from single
findings, case reports, protocols, replications, and null or negative
results to more traditional articles.
Key features:
• Publication within a week
• Transparent, post-publication peer review
• All data included
• Accepts non-traditional article types
POST-PUBLICATION PEER REVIEW
Remove the publication delay.
Invited peer review (post-publication).
Transparent refereeing.
Inclusion of all data in all articles.
No restriction of access.
All article types published, including
data only papers.
WHY SHARE RESEARCH DATA?
WHY SHARING RESEARCH DATA IS IMPORTANT
Transparency and openness are cornerstones of the scientific method
“Not allowing reuse of data is scientific malpractice”
Royal Society. Science as an open enterprise. Final report, 2012
http://royalsociety.org/about-us/history/
SHARING DATA ALLOWS REPLICATION
“[W]e evaluated the replication of data analyses in 18
articles on microarray-based gene expression profiling
published in Nature Genetics in 2005–2006...We
reproduced two analyses in principle and six partially or
with some discrepancies; ten could not be reproduced.
The main reason for failure to reproduce was data
unavailability.”
Ioannidis JPA. et al. Repeatability of published microarray
gene expression analyses. Nature Genetics 41, 149–55
(2009)
SHARING DATA CORRELATES WITH HIGHER CITATIONS
“We conclude there is a direct effect of third-party data
reuse that persists for years beyond the time when
researchers have published most of the papers reusing
their own data...We further conclude that...a
substantial fraction of archived datasets are reused,
and that the intensity of dataset reuse has been steadily
increasing since 2003.”
Piwowar HA, Vision TJ. Data reuse and the open data citation
advantage. PeerJ (2013) doi: 10.7717/peerj.175
SHARING DATA ADDITIONALLY PROMOTES
• Diversity of analyses and opinion
• Increased return on investment in research
• Reduction of error and fraud
• Education of new researchers
• New research
- Meta-analyses to create new datasets
- Testing of new hypotheses
- New analysis methods
- Studies on data collection methods
PUSH TOWARDS SHARING RESEARCH DATA
http://www.biosharing.org/policies
TECHNICAL CONSIDERATIONS FOR SHARING
RESEARCH DATA
MAKING DATA ACCESSIBLE
‘Openly accessible’ – apply the principles of the Budapest Open Access
Initiative* (originally created for scholarly articles) to scholarly data
• Free to view/access
• Free to download
• Free to re-analyse (as an individual dataset or as part of a larger
meta-analysis)
• Free to modify
Community norms to be applied regarding acknowledgement and
citation of data.
* budapestopenaccessinitiative.org
DATA ACCESS AT F1000RESEARCH
• Free to view/access
• Free to download
- Available without a paywall
- Use open repositories to
host data
• Free to re-analyse and modify
- Use of CC0 licence
• Acknowledge data authors
- Ensure data is citable
MAKING DATA USABLE
• Present data in a usable format (i.e. not a supplemental PDF!)
• Specify how data was generated
• Provide quality assurance (overview of limitations; peer
review)
• Share data in non-proprietary formats
• Specify access to software required to view data
• Specify parameters in software used to analyze the data
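The points above can be illustrated with a minimal Python sketch, assuming a simple tabular dataset. It writes the data as CSV (a non-proprietary format) and stores a JSON file alongside it recording how the data was generated and which software parameters were used. The function name `write_open_dataset`, the `.metadata.json` naming convention, and the metadata keys are hypothetical choices for illustration, not an F1000Research requirement.

```python
import csv
import json
from pathlib import Path


def write_open_dataset(rows, fieldnames, csv_path, metadata):
    """Write rows to a CSV file plus a JSON sidecar describing the data.

    Returns the paths of the CSV file and the metadata file.
    """
    csv_path = Path(csv_path)

    # newline="" lets the csv module control line endings itself.
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    # Hypothetical naming convention: data.csv -> data.metadata.json
    meta_path = csv_path.with_name(csv_path.stem + ".metadata.json")
    with meta_path.open("w", encoding="utf-8") as handle:
        # The metadata dict is free-form here; it might record the
        # protocol, known limitations, and analysis software settings.
        json.dump(metadata, handle, indent=2)

    return csv_path, meta_path
```

A caller might pass a metadata dictionary with entries such as the collection protocol, known limitations, and the name, version and parameters of any analysis software, so that a reader can open the data and repeat the analysis without proprietary tools.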
DATA USABILITY AT F1000RESEARCH
• Preview large datasets prior to
downloading
• View data without leaving the
article
• Usage statistics provided
• Legends and DOIs for data
• We encourage authors to
- Use non-proprietary formats, e.g. CSV
over XLS/XLSX
- Include detailed methods to allow
replication
• Quality assurance with
transparent peer review
F1000RESEARCH: DATA REVIEW
Internal pre-publication checks:
• Storage (discipline-specific repository where possible)
• Format
• Layout and labelling
• Adequate data?
• Adequate protocol information?
Referees are asked to check:
• Methods were appropriate?
• Format/structure usable?
• Data limitations and sources of error included?
• Adequate information to enable potential replication?
• Does the data ‘look’ OK?
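Some of the format, layout and labelling checks listed above could in principle be partly automated for tabular data. The following Python sketch is an illustration only, assuming the dataset is a CSV file with a header row; the function `check_table_layout` and its rules are hypothetical and do not describe the checks F1000Research actually runs.

```python
import csv


def check_table_layout(csv_path):
    """Return a list of human-readable layout problems found in a CSV file."""
    problems = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))

    if not rows:
        return ["file is empty"]

    header = rows[0]

    # Labelling: every column needs a non-blank label.
    for index, label in enumerate(header, start=1):
        if not label.strip():
            problems.append(f"column {index} has no label")

    # Labelling: the non-blank labels must be unique.
    labels = [label.strip() for label in header if label.strip()]
    if len(set(labels)) != len(labels):
        problems.append("column labels are not unique")

    # Adequate data: at least one data row must follow the header.
    if len(rows) == 1:
        problems.append("no data rows")

    # Layout: every data row should have as many fields as the header.
    # Row numbers count the header as row 1.
    for row_number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {row_number} has {len(row)} fields, expected {len(header)}"
            )

    return problems
```

An empty result means none of these mechanical problems were found. Judgements such as whether the methods were appropriate or whether the data "look" OK still need a human referee.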
DISCOVERING DATA AT F1000RESEARCH
Strasser C, Kunze J, Abrams S, Cruse P (2014) DataUp: A tool to help researchers describe
and share tabular data [v1; ref status: awaiting peer review, http://f1000r.es/2n7]
F1000Research 2014, 3:6
HELPING RESEARCHERS TO OVERCOME CULTURAL
BARRIERS TO SHARING DATA
JOURNAL/PUBLISHER DATA SHARING POLICIES
• Reproducible research or data sharing statements in
published papers (Annals Internal Medicine, BMJ)
• Data sharing implied by submission (BioMed Central)
• Data sharing as a condition of publication (PLOS, NPG)
• As above AND data must be available to reviewers/editors
• Open data as a condition of submission (F1000Research)
• Papers rejected if data unavailable freely*
Policy strength increases from top to bottom
MAKING DATA SHARING EASY - IN-ARTICLE DATA MANIPULATION
Nichols MJ, Hartlein JM, Eicken MGA, Racette BA, Black KJ (2013) A fixed-dose randomized
controlled trial of olanzapine for psychosis in Parkinson disease [v1; ref status: indexed, http://f1000r.es/1au]
F1000Research 2013, 2:150
FORMAL CREDIT FOR DATA SHARING
Feeding into currently recognized scholarly outputs
Benefits:
• Appropriate credit for data producers with a citable publication
• Data accessible from repository regardless of journal subscription
• Data independently discoverable via bi-directional linking
• Data available in usable form
• Potential increase in ‘value’ of data, as increasing numbers of
studies are carried out
FORMAL CREDIT FOR DATA SHARING
Novel measures of scholarly outputs
• Encourage data peer review, and promote scholarly credit for peer
review
• Record basic metrics for datasets
• Promote data sharing as a formal research output
• Participate in cross-publisher initiatives to ease data sharing
• Participate in discussions about research data feeding in to the
tenure process
DATA PUBLISHING AND SHARING IS NOT A NEW IDEA
“I have begun to think that no one ought to publish biometric results, without lodging a well-arranged and
well-bound manuscript copy of his data in some place where it should be accessible, under reasonable
restrictions, to those who desire to verify his work.”
Galton F. Biometry. Biometrika 1 (1): 7-10 (1901)
doi: 10.1093/biomet/1.1.7
Email: [email protected]
Twitter: @vkf1000 / @f1000research
Leave a comment on our blog: http://blog.f1000research.com/