Stephen Abrams Patricia Cruse John Kunze UC Curation Center California Digital Library
UC3
Standards and Best Practices for Datasets and
Other Supplemental Journal Article Materials
DataCite @ UC3
Stephen Abrams, Patricia Cruse, John Kunze
UC Curation Center, California Digital Library, University of California
CNI Spring 2010 Membership Meeting, Baltimore, April 12-13, 2010
DataCite @ UC3
The California Digital Library was founded by the University of California in 1997 to take advantage of emerging technologies transforming the way digital information is published and accessed
In collaboration with the UC libraries and other partners, the CDL has assembled one of the world’s largest digital research libraries and changed the ways that faculty, students, and researchers use information
– Collection development, licensing, mass digitization, and cataloging
– Digital special collections
– Discovery and delivery
– Publishing
– UC Curation Center (UC3)
DataCite @ UC3
UC3’s participation in DataCite is a continuation of our ongoing activities in digital curation
– Digital curation: the set of policies and practices focused on managing and adding value to a body of trusted digital content over time
[Figure: the information lifecycle and the scholarly lifecycle. Labels: Publish, Preserve, Access, Collect, Discover, Gather, Create, Share, Manage, Research, Teach, Learn]
The gap between possibility and practice
Journal articles
– Most articles held in multiple academic and national libraries
– Libraries ensure long-term storage and access
– Extensive mechanisms for publication and discovery
– Established funded mechanisms for archival management
– Citations form the basis of impact analysis
Data
– Few archives in widely visible facilities
– Difficult data management after project funding ceases
– Little opportunity for publication, informal discovery
– Ad hoc funding sources, if at all
– Not included in impact analysis
What we’d like to enable…
Precise identification of datasets at appropriate granularity
Bi-directional linking between traditional publications and the data underlying them
Domain-specific discovery to facilitate innovative reuse of data
Citation “credit” for data producers and publishers
Use metrics for data
CDL discovery services
ark:/a50600/rb2468097
doi:10.5060/rb2468097
http://n2t.net/a5060/rb2468097
CDL eScholarship publishing
Supplementary Data
Reichl, R., Waldinger, R., et al. (2006)
Table A: Survey of Attitudes …
Table B: Latinos in LA Basin … …
Licensed resources
Supplementary data
DataONE
Identity is a fundamental curation service
[Diagram: curation services in four layers, bracketed by the labels Curation and Preservation]
Value
– Annotation of content by consumers
– Notification of new content availability
– Transformation to create derivatives
Service
– Search of content and metadata
– Index to enable fast search
– Ingest of content for curation
Context
– Characterization to extract content properties
– Inventory of curated content and metadata
– Replication for safety
State
– Fixity to verify bit-level integrity
– Storage for long-term retention
– Identity for long-term reference
Easy Identifiers (EZID)
Tier 1 Anonymous request for persistent identifier
Tier 2 Tier 1, plus supply of a resolvable URL (cf. tinyurl)
Tier 3 Tier 2, but authenticated (enabling link checking and personalized services)
Tier 4 Tier 3, plus supply of metadata (enhanced discovery and resolution)
Tier 5 Tier 4, plus supply of the digital asset (for local or brokered hosting)
Tier 6 Tier 5, plus supply of the asset from the web (cf. Zotero)
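The tiers are cumulative: each one adds a single capability to the tier before it. A minimal Python sketch of that structure follows. The `Request` fields and the `tier_for` helper are hypothetical names chosen for illustration; they are not part of EZID.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical summary of what a client supplies with an identifier request."""
    url: bool = False             # Tier 2: a resolvable URL
    authenticated: bool = False   # Tier 3: the requester is authenticated
    metadata: bool = False        # Tier 4: descriptive metadata
    asset: bool = False           # Tier 5: the digital asset itself
    asset_from_web: bool = False  # Tier 6: the asset is supplied from the web


# Capabilities in tier order; each tier requires every capability before it.
CAPABILITIES = ["url", "authenticated", "metadata", "asset", "asset_from_web"]


def tier_for(request: Request) -> int:
    """Return the highest tier whose cumulative requirements the request meets."""
    tier = 1  # Tier 1 needs nothing beyond the request itself
    for name in CAPABILITIES:
        if not getattr(request, name):
            break  # a missing capability caps the tier, whatever else is supplied
        tier += 1
    return tier
```

Under this reading, a request that supplies a URL and metadata but is not authenticated stays at Tier 2, because Tier 4 builds on Tier 3.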
User-facing EZID interfaces
Two primary methods: mint and bind
id = mint (scheme, namespace)
bind (id, url)
bind (id, metadata)
Interface implementations
– HTML
– Email: mailto: [email protected]
– REST: POST /mint/scheme/namespace HTTP/1.1
[Form: Mint. Scheme: DOI. Namespace: UC3. Identifier:]
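The mint and bind methods can be illustrated with a small in-memory stand-in, written in Python. This is a sketch under stated assumptions, not the EZID implementation: the class name, the `resolve` helper, and the identifier string format are all hypothetical.

```python
import itertools


class IdentifierService:
    """In-memory stand-in for a mint/bind service (hypothetical, not EZID itself)."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._records = {}  # identifier -> {"url": ..., "metadata": {...}}

    def mint(self, scheme, namespace):
        """Create a new identifier within the given scheme and namespace."""
        # Assumed format: scheme:namespace/serial, e.g. doi:10.5060/000001
        identifier = f"{scheme}:{namespace}/{next(self._counter):06d}"
        self._records[identifier] = {"url": None, "metadata": {}}
        return identifier

    def bind(self, identifier, url=None, metadata=None):
        """Attach a URL and/or metadata to an identifier minted earlier."""
        if identifier not in self._records:
            raise KeyError(f"unknown identifier: {identifier}")
        record = self._records[identifier]
        if url is not None:
            record["url"] = url
        if metadata is not None:
            record["metadata"].update(metadata)
        return record

    def resolve(self, identifier):
        """Return the URL bound to the identifier, or None if none is bound."""
        return self._records[identifier]["url"]
```

A typical sequence would mint first and bind later, for example `id = service.mint("doi", "10.5060")` followed by `service.bind(id, url="http://example.org/data")`, so that an identifier can exist before its target or description is known.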
Repository for cited data
[Screenshot: Merritt Ingest Service form, dated Wednesday, December 9, 2009. Fields shown:]
– Submission package: File (Browse...), submitted as a File, Container, or Manifest (single object or batch of objects)
– Submission operation: Create a new object, or Update an existing object
– Package verification (optional): Checksum (MD5)
– Object profile: Profile
– Object identifier: Primary Identifier (leave blank to have identifier assigned automatically), Local Identifier
– Object description (optional): Creator, Title, Date
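The optional package verification step takes an MD5 checksum. A minimal Python sketch of such a check follows, assuming the checksum is supplied as a hexadecimal string. The function names are hypothetical and do not describe how the ingest service itself performs verification.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the package bytes as lowercase hexadecimal."""
    return hashlib.md5(data).hexdigest()


def verify_package(data: bytes, expected_checksum: str) -> bool:
    """Compare the computed digest with the checksum supplied by the submitter."""
    # Normalize the supplied value so case and stray whitespace do not matter.
    return md5_hex(data) == expected_checksum.strip().lower()
```

MD5 is adequate here for catching accidental corruption in transit; it is not suitable as protection against deliberate tampering.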
Pilot projects
UC ETDs http://www.escholarship.org/
Dryad http://datadryad.org/repo
UC Berkeley Water Resources Center http://www.lib.berkeley.edu/WRCA
UC Berkeley Jepson Herbarium http://ucjeps.berkeley.edu/jeps
Next steps
Work with DataCite partners to establish metadata and citation standards and best practices
Integrate support for DataCite DOIs into EZID
Promote data citation for research, teaching, and learning on UC campuses and by funded project partners
Increase the visibility of UC3 as a DataCite registration agency and the UC3 curation environment for data hosting
Summary
Digital resources that lack identification cannot be curated
Data should be seen (and supported) as a new kind of publication
Scholarly inquiry is facilitated by bi-directional linking between articles and the data on which they are based
DataCite plays a vital role in supporting data as a citable publication
UC3 is working with campus and external partners to provide effective data citation services
For more information
DataCite http://www.datacite.org/
UC Curation Center http://www.cdlib.org/services/uc3