"Cherish old knowledge that you may acquire new" - The Analects of Confucius
http://datadryad.org, http://blog.datadryad.org, http://datadryad.org/wiki
[email protected]; Twitter: @datadryad
Todd Vision – DryadUK Sustainability workshop - 4/1/2011 - The British Library
Publishers, journals, researchers, funders
1. What is the value proposition?
2. What is the appropriate revenue model?
3. What is the role of funders?
Long tail of orphan data
[Figure: volume plotted against rank frequency of datatype, with two labeled regions: specialized repositories (e.g. GenBank, PDB) and orphan data. After B. Heidorn.]
Source: PARSE Insight survey report, http://www.parse-insight.eu/
Bumpus HC (1898) The Elimination of the Unfit as Illustrated by the Introduced Sparrow, Passer domesticus. A Fourth Contribution to the Study of Variation. pp. 209-226 in Biological Lectures from the Marine Biological Laboratory, Woods Hole, Mass.
Source: PARSE Insight survey report, http://www.parse-insight.eu/
Source: Publishing Research Consortium, http://publishingresearch.net
n=3824
Peer-to-peer sharing is problematic
• Wicherts et al. requested data from 141 articles in American Psychological Association journals.
• “6 months later, after … 400 emails, [sending] detailed descriptions of our study aims, approvals of our ethical committee, signed assurances not to share data with others, and even our full resumes…” only 27% of authors complied
Wicherts, J.M., Borsboom, D., Kats, J., & Molenaar, D. (2006). The poor availability of psychological research data for reanalysis. American Psychologist, 61, 726-728.
Benefits to data archiving
Modified from Beagrie et al. (2009) Keeping Research Data Safe 2
Direct: verification of published research; preserving accessibility to data; allowing reuse and repurposing of data; discoverability of data
Indirect (costs avoided): redundant data collection; inefficient legacy data curation; burden of sharing-upon-request; opportunity cost of science not done
Near term: protection against personnel turnover; availability for review and validation
Long term: secure long-term stewardship; increased impact per publication
Private: increased citations; new collaborations; new research opportunities; fulfilling funding mandates
Public: more efficient use of research dollars; public trust in science; educational opportunities; improved methodologies; more informed policy
Brussels Declaration on STM Publishing
“Raw research data should be made freely available to all researchers. Publishers encourage the public posting of the raw data outputs of research. Sets or sub-sets of data that are submitted with a paper to a journal should wherever possible be made freely accessible to other scholars”
Signed by 46 publishers and 13 trade organizations, incl. Elsevier, Nature Publ. Group, Springer, Oxford U Press, Wiley-Blackwell.
• The End
    To make data archiving and reuse a standard function of scholarly communication.
• The Means
    Enable low-burden, inexpensive data archiving in conjunction with article publication.
    Ensure individuals receive direct benefits from data sharing.
    Reduce unnecessary barriers to data reuse.
    Empower journals, societies & publishers in shared governance.
    Plan for long-term preservation at the outset.
Dryad vs. Supplementary Online Materials
Dryad SOM
Discoverable: indexed and exposed to both web and bibliographic search engines ✔ ✗
Identifiable: Data DOIs within articles serve as permanent, resolvable identifiers ✔ ✔/✗
Attributable: reuse of data leads to article citations ✔ ✔
Permanent: preservation planning, including format migration ✔ ?
Curated: quality control of data submissions and indexing metadata ✔ ✔/✗
Ease of deposit: streamlined deposit, allowance for large and complex datasets ✔/✗ ✔/✗
Formatted for reuse: support for non-PDF file formats ✔ ✔/✗
Updatable: new versions of data files can be added, metadata can be enhanced ✔ ✗
Support for embargoes: can delay release of data in accordance with journal policy ✔ ?
Free reuse: no paywall, clear terms of reuse ✔ ?
Economy of scale: cost efficiency from shared infrastructure ✔ ✔/✗
Responsive to needs of individual journals ✔ ✔
Core business: aligned with organizational mission ✔ ✔/✗
How well do we understand the value proposition?
• For researchers
    Dryad increases the impact of, and citations to, published research. It preserves and makes available others’ data to verify published results, to refine methodologies, and for other forms of reuse. It frees researchers from being responsible for data preservation and access.
• For journals
    Dryad frees journals from the responsibility and costs of maintaining supplemental data in perpetuity, and allows publishers to increase the value of their journals to their authors and readers.
• For funders
    Dryad provides a cost-effective mechanism to make research more accessible, and to leverage existing investments in order to enable new science.
Dryad as an organization
• International nonprofit, with multiple institutional hosts
• Governed by a Board of open size
    Each partner journal appoints one (voting) representative
    The full Board votes on all financial and governance matters
• Executive Committee
    Currently five members elected by the Board
    Responsible for repository policy, short-term strategic decisions
    Brings issues to full Board for discussion and vote
• Institutional oversight, advisory structure both TBD
• Next board meeting 7-9 July in Vancouver
    Transition from interim status
    Adopt initial governance model
    Adopt initial cost-recovery model
Timeline, 2007 to 2012
• NSF/ESA Data Sharing and NESCent Small Science workshops
• Beginning negotiation of Joint Data Archiving Policy
• Journals/societies join NESCent & others to fund Dryad through NSF
• NSF funding for Dryad begins (lasts through Aug 2012)
• Repository went online
• First consortium board meeting
• Debut of integrated data submission
• Announcement of Joint Data Archiving Policy
• JISC funding begins
• Discussions with potential charter partners
• JDAP (and NSF DMP mandate) takes effect
• Transitional funding campaign
• Approval of cost-recovery plan and governance structure
• Cost-recovery begins
• Transitional funding begins
Projecting Dryad’s operating costs
• Activity-based cost model, from KRDS
• Includes
    Management & administrative support
    Storage and server hardware (incl. permanent storage)
    Personnel for system maintenance
    Curation and preservation
    A small amount of outreach and user support
• Does not include
    Facilities and other institutional costs (e.g. human resources)
    Repository innovation (grants, foundation support)
    Special projects (grants, foundation support)
• More detail in Beagrie, Eakin-Richards and Vision, iPres 2010
17 integrated journals
Curation
Revenue return
• Costs are recovered upfront, in order to
    allow free dissemination
    assure preservation
• Fees predominantly paid by journals, which may be
    passed on to authors
    subsidized by societies
    rolled into publisher costs/revenue
• Fees should be
    attractive: cost-effective relative to SOM
    fair: to all different types of journals
• Model will surely evolve
    Under control of consortium of partner journals
                                                    A Full                        B Associate                      C Author pays
Joining fee (waived for charter members)            $1000                         $1000                            NA
Annual fee from journal                             Prospective, $25/article (a)  Retrospective, $100/article (b)  0
Author charge at deposit                            0                             0                                $200
Length of contract                                  3 or 5 yrs                    3 or 5 yrs                       n/a
Legacy data deposits free?                          Y                             N                                N
Can move between plans A & B?                       Y                             Y                                N/A
Representative on Consortium Board                  Y                             Y                                N
Can vote on board and serve on executive committee  Y                             N                                N
Coordinated data deposit                            Y                             Y                                Y/N
Data DOI in published article                       Y                             Y                                Y/N
Branding of journal content                         Y                             Y                                N
(a) all peer-reviewed articles in prior yr
(b) articles with data deposited to Dryad
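The per-article figures in the plan comparison can be turned into a quick cost comparison. The sketch below is only an illustration of that arithmetic, assuming that Plan A is charged on all peer-reviewed articles from the prior year and that Plans B and C are charged per article with data deposited. The function names and the example journal (200 articles, 60 with data) are hypothetical and do not come from the talk.

```python
# Fee figures are taken from the plan comparison; everything else here
# (function names, the example journal) is a hypothetical illustration.
PLAN_A_PER_ARTICLE = 25     # per peer-reviewed article published in the prior year
PLAN_B_PER_ARTICLE = 100    # per article with data deposited to Dryad
PLAN_C_AUTHOR_CHARGE = 200  # paid by the author at deposit


def annual_cost(plan: str, articles_published: int, articles_with_data: int) -> int:
    """Annual fees in dollars, excluding the one-time $1000 joining fee."""
    if plan == "A":
        return PLAN_A_PER_ARTICLE * articles_published
    if plan == "B":
        return PLAN_B_PER_ARTICLE * articles_with_data
    if plan == "C":
        return PLAN_C_AUTHOR_CHARGE * articles_with_data
    raise ValueError(f"unknown plan: {plan!r}")


def break_even_deposit_rate() -> float:
    """Share of articles depositing data at which Plans A and B cost the same."""
    return PLAN_A_PER_ARTICLE / PLAN_B_PER_ARTICLE


if __name__ == "__main__":
    published, with_data = 200, 60  # hypothetical journal
    for plan in "ABC":
        print(plan, annual_cost(plan, published, with_data))
    print("break-even deposit rate:", break_even_deposit_rate())
```

Under these assumptions, Plan A becomes the cheaper option for a journal once more than a quarter of its articles deposit data, which is one way to read the later question about whether the per-article differences are appropriate.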
Journal                              Society or publisher     Archiving requirement  Submission integration  Subscriber
American Naturalist                  ASN                      Y                      Y                       A
Evolution                            SSE                      Y                      Y                       A
Systematic Biology                   SSB                      Y                      Y                       A
Molecular Biology & Evolution        SMBE                     Y                      -                       A
Heredity                             The Genetics Soc.        Y                      In progress             A
Journal of Heredity                  AGA                      N                      Y                       A
Paleobiology / J. of Paleontology    Paleontology Society     Y                      In progress             A
Ecological Monographs                ESA                      Y                      In progress             A
Journal of Evolutionary Biology      ESEB                     Y                      Y                       A
Molecular Ecology / MER              Wiley-Blackwell          Y                      Y                       -
Biological J. Linnean Society        Linnean Soc. London      N                      Y                       -
Evolutionary Applications            Wiley-Blackwell          Y                      Y                       -
Integrative & Comparative Biology    SICB                     -                      In progress             -
BMC Ecology / BMC Evolution          Springer/BMC             Y?                     In progress             -
BMJ Open                             BMJ                      -                      In progress
PLoS Biology                         PLoS                     -                      In progress             -
Molecular Phylogenetics & Evolution  Elsevier                 -                      -                       -
Ecology Letters                      CNRS/W-B                 -                      -                       -
Journal of Ecology                   British Ecological Soc.  -                      -                       -
Issues with the subscription plan
• Are the differences in per-article costs appropriate?
    Plan B and Plan C are set based on incentives, not cost
    Should there be a Plan B at all?
    Should there be a greater safety buffer for Plan A?
• How to accommodate
    journals from developing countries?
    authors from non-partner journals who lack grant resources?
• Annual fee depends only on article volume
    Is this the most equitable arrangement?
Role for funders?
“this sort of open access archiving costs money and it is not clear who pays. Certainly research funding agencies seem very keen on the doing and not very keen on the paying.”
n=564
H. Piwowar (unpubl.)
Role for funders?
• Policy
    Strong archiving guidelines, with enforcement
    Endorsement of trusted repositories
• Funding
    Renewable infrastructure grants (supporting curation, maintenance, user support, business operations)
    Matching funds to repositories based on deposits or reuse
    Top-slicing to researchers
    Waiver funds for researchers
CC BY-NC-ND 2.0