Moving Repositories into the Digital Age:
SESAR, IGSN, & a vision for a Repository Portal and Hosted Collection Management
Kerstin Lehnert & Megan Carter Orlando IEDA | Lamont-Doherty Earth Observatory, Columbia University
Talk Outline
• Update on CI developments: SESAR, IGSN
• Motivations & Previous Work
• Proposed System Functionalities
• Leveraging Existing Components
• Discussion
SESAR: System for Earth Sample Registration
• Authenticated workspace with tools for users to submit & manage sample metadata (MySESAR)
• IGSN Allocating Agent: Register samples with IGSN
• Searchable catalog of sample metadata & supplementary documents submitted by users
• Register samples (batch or individual)
• View/edit metadata
• Create groups/collections
• Transfer ownership of metadata
• Generate labels (QR code)
• Role-based access
Update: New SESAR Features
• Architecture aligned with new IGSN syntax rules
  • Allows IGSN >9 digits
  • SESAR holds name space ‘IE’, users have sub-name spaces
• APIs for submission (incl. authentication), updates, & access of sample metadata
• Batch updating of sample metadata
• Role-based permissions
• Linked Data version of SESAR (GeoLink project: R. Arko, P. Ji)
• Links to cruise DOIs, publication & data DOIs, ORCIDs
• OAI-PMH provider for IGSN Central Catalog & Community Portals
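As an illustration of what harvesting from an OAI-PMH provider involves, the sketch below builds a standard OAI-PMH ListRecords request URL. The verb and parameter names (verb, metadataPrefix, from, set) come from the OAI-PMH protocol; the base URL is a placeholder, because the slides do not give the address of the SESAR endpoint, and the helper name is hypothetical.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the slides do not state the real SESAR OAI-PMH address.
EXAMPLE_BASE_URL = "https://example.org/oai"


def oai_list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Selective harvesting: only records changed on or after this date.
        params["from"] = from_date
    if set_spec:
        # Optional set, e.g. to harvest one community's records.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


print(oai_list_records_url(EXAMPLE_BASE_URL, from_date="2017-01-01"))
```

A catalog or community portal would call such a URL periodically and follow the resumption tokens returned by the provider; that paging step is omitted here.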
IEDA Data Browser
SESAR: Plans
• Modify metadata to distinguish samples in ‘trusted repositories’ from informal (uncurated) archiving to accommodate emerging open access policies of publishers & funders
• Allow users to add customized metadata fields to accommodate specific sample types & communities
• Improved search interface (Solr index, Elasticsearch)
• Linking to data (as data repositories start to use IGSN)
• PID-based linking to people (ORCID), cruises (DOI), funding awards (FundRef), repositories (re3data), etc.
IGSN
• Resolvable, globally unique PID (Persistent Identifier)
• Based on the Handle System (like DOI), resolvable at http://igsn.org/XXXZZZZZZ
• Governed by an international non-profit organization (IGSN e.V.)
• Membership in 5 continents
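The resolver pattern above (http://igsn.org/ followed by the IGSN) can be sketched in a few lines. The function name is hypothetical and the IGSN in the example is made up for illustration; only the resolver base address comes from the slide.

```python
from urllib.parse import quote

# Resolver address as given on the slide.
RESOLVER_BASE = "http://igsn.org/"


def igsn_resolver_url(igsn: str) -> str:
    """Return the resolver URL for an IGSN string (hypothetical helper)."""
    cleaned = igsn.strip()
    if not cleaned:
        raise ValueError("empty IGSN")
    # quote() guards against characters that are not URL-safe.
    return RESOLVER_BASE + quote(cleaned, safe="")


# "IEXXX0001" is an invented identifier, not a registered sample.
print(igsn_resolver_url("IEXXX0001"))
```

Because resolution goes through the Handle System, a link built this way should stay valid even if the landing page of the sample moves.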
Update about IGSN
>6.25 million IGSNs issued! (Status January 2017)
IGSNs issued per Allocating Agent (year started):
• IEDA: 4,106,273 (2005)
• Geoscience Australia: 2,100,273 (2015)
• CSIRO: 25,748 (2015)
• MARUM: 25,583 (2015)
• GFZ: 4,461 (2015)
[Figure: IGSN Prefixes Used at SESAR (proxy for active users), by year, 2005 to 2016; vertical axis 0 to 120; annotation: "Batch registration released"]
Linking Samples, Data, & Publications
Adoption: Agencies
• USGS
• State surveys
• Smithsonian
• GA
• BGS
• IFREMER
• GESEP
IGSN Adoption by publishers
“… AGU Publications also strongly encourages use of other identifiers in our journal papers. International Geo Sample Numbers (IGSNs) uniquely identify items, such as a rock sample, a piece of coral, or a vial of water taken from the natural environment, and provide important, consistent information about these samples. Registering samples and including the IGSN in papers helps secure provenance information but most importantly connects common samples across multiple studies in the literature. IGSNs also will help you keep track of your samples. These identifiers can be reserved before a field season or assigned afterward.”
Hanson, B. (2016), AGU opens its journals to author identifiers, Eos, 97, doi:10.1029/2016EO043183. Published on 7 January 2016.
IGSN in DataCite
IGSN & ORCID
“All, ORCID is officially introducing IGSN as a new identifier type in the API and search indexes. It will use the igsn.org prefix and should be live in a couple of weeks. I will check what this means concretely when it is live and see if I can come up with some examples that demonstrate the functionality. Cheers, m.”
(mail from Markus Stocker, Pangaea, 4/25/2017)
IGSN Architecture
• Separation of administrative and descriptive metadata (a lesson learned from setting up DataCite).
• Metadata in a common service can be:
  • Least common denominator
  • All-encompassing with lots of optional elements
• Communities of practice define their metadata as an extension of a core set of metadata.
http://igsn.github.io/
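The architecture described above (administrative metadata kept apart from descriptive metadata, with community fields extending a shared core) can be sketched as a small data model. This is only an illustration: all class and field names are assumptions and are not taken from the IGSN schemas.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class AdminMetadata:
    """Registration-level information (field names are hypothetical)."""
    igsn: str
    allocating_agent: str


@dataclass
class DescriptiveMetadata:
    """Shared core fields plus a community-defined extension."""
    core: Dict[str, Any]
    extension: Dict[str, Any] = field(default_factory=dict)

    def flattened(self) -> Dict[str, Any]:
        # An extension may add fields but must not redefine core ones.
        clash = set(self.core) & set(self.extension)
        if clash:
            raise ValueError(f"extension redefines core fields: {sorted(clash)}")
        return {**self.core, **self.extension}


@dataclass
class SampleRecord:
    admin: AdminMetadata
    description: DescriptiveMetadata


record = SampleRecord(
    admin=AdminMetadata(igsn="IEXXX0001", allocating_agent="SESAR"),  # made-up IGSN
    description=DescriptiveMetadata(
        core={"name": "Example core section", "material": "sediment"},
        extension={"core_depth_top_m": 1.5},  # hypothetical community field
    ),
)
print(record.description.flattened())
```

Keeping the extension in its own mapping means a central catalog can index only the core fields while a community portal reads both.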
IGSN Description Metadata
Sample Identification
Sampling Activity
Sample Curation
Related Resources
The schema gives a minimal set of descriptive metadata for a global sample catalogue.
Elements reference IGSN admin, ODM2, O&M, and DataCite.
http://schema.igsn.org/description/
A Repository Portal
• Federated network of repository catalogs that improves discovery, access, sharing, analysis, and curation of physical samples
• to promote transparency, reproducibility, and re-use in the era of open science.
• Support for digital collection management if needed
• Streamlines collection metadata management, exchange, & integration with other systems (e.g. IGSN, IMLGS)
• Facilitates discovery & access to samples for investigators
• Harmonizes sample request & access policies and procedures across repositories
• Supports consistent & automated generation of use statistics to demonstrate Return on Investment (ROI)
Requirements Gathering: DESC
• Supplement to IEDA in 2011 from OCI (DCL)
• Collaboration with OSU, LacCore, AZGS, RENCI
• Extensive survey of repositories
• Produced a report on status and requirements
Requirements Gathering: iSamples
• EarthCube iSamples Research Coordination Network (RCN)
• Stakeholder Alignment Survey (J. Cutcher-Gershenfeld)
• Use Cases Working Group (S. Ramdeen, A. Deere)
  • Collected user stories to articulate life cycle practices of samples for different users
• Workflow Applications Working Group (J. Bowring, A. Hangsterfer)
  • Identify barriers that inhibit adoption of leading practices, incl. IGSN assignment, sample documentation, and sample citation.
Requesting and receiving actual physical samples from a museum/repository is important: .81 (.19)
Requesting and receiving actual physical samples from a museum/repository is easy: .43 (.24)
Requirements Gathering: SESAR
• Ongoing feedback from the community (investigators, curators) regarding functionality, usability, performance
  • Includes EAR and PLR funded repositories, museums, agencies
• Collaboration with other Allocating Agents of the IGSN e.V.
  • International best practices and emerging software tools
  • LDEO is the Managing Office of the IGSN e.V.
• Participation in previous Curators Meetings
  • Extensive use cases provided by Anders Noren, Anthony Koppers, Kevin Johnson and others
Possible System Components
• Repository Portal
  • User interface and APIs for sample search across repositories
  • Authenticated workspace for users & repositories
    • Request samples from multiple repositories with a single transaction
    • View loans
    • Communicate with curators
  • Authenticated dashboard for NSF to view use statistics
• Repository Collection Management
  • Digital sample and collection management for curators
Repository Portal Functionality
• Rich metadata catalog, harvests catalogs from external systems
• UI to search for samples across repositories
  • Phase 1: simple search by location, lithology, age, expedition
  • Phase 2: advanced search on available data (via OpenCoreData?)
• “Shopping cart” approach for sample requests
  • Submit single request to multiple repositories
• Centralized user management (aligned with ORCID)
  • User account setup and login for transactions with all repositories
  • Dashboard for users to view pending, active, and past requests
  • Track communication with repositories
• Dashboard for NSF to view use statistics such as number of requests, samples shipped, users served, requests returned, etc.
• Additional functionality as requested and as funding allows
Repository Collection Management
• Hosted collection database to enter & edit sample metadata
  • Batch upload of new samples added to the collection
  • Batch upload of samples from sampling events in the lab
  • Customizable metadata (add fields, add vocabularies)
  • Set role-based permissions for metadata access & editing
  • Track metadata changes (who did it, when)
• Automated IGSN registration
• Integration with Repository Portal loan management
  • Alerts about new requests
  • View, approve, track request status
  • Send reminder to users for sample returns and reports
  • Automatically tracks loan statistics
• Other functionality as desired (data & image storage, label generation, etc.)
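Automated loan statistics could be derived directly from the request records that the loan management already keeps. The sketch below computes the kinds of figures mentioned for the dashboard (requests, samples shipped, users served, requests returned); the record layout and the status values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class LoanRequest:
    user: str
    samples_requested: int
    status: str  # assumed values: "pending", "shipped", "returned"


def use_statistics(requests: Iterable[LoanRequest]) -> Dict[str, int]:
    """Aggregate simple use statistics from loan requests (hypothetical layout)."""
    reqs = list(requests)
    # A returned loan was necessarily shipped first.
    shipped = [r for r in reqs if r.status in ("shipped", "returned")]
    return {
        "requests": len(reqs),
        "samples_shipped": sum(r.samples_requested for r in shipped),
        "users_served": len({r.user for r in shipped}),
        "requests_returned": sum(1 for r in reqs if r.status == "returned"),
    }


log = [
    LoanRequest("alice", 3, "returned"),
    LoanRequest("alice", 2, "shipped"),
    LoanRequest("bob", 5, "pending"),
]
print(use_statistics(log))
```

Computing the numbers from the same records that drive request tracking avoids separate manual bookkeeping, which is the point of the "automatically tracks" item.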
Leveraging Existing Components
(System: PLUS / MINUS)
• SESAR
  PLUS: Multi-user; role-based access; dashboards; APIs; integration with IGSN; operational environment (data facility)
  MINUS: Missing sample request & loan management; technology upgrade required
• Specify
  PLUS: Cloud-based, scalable, modular; broad adoption in BIO; open source; customizable
  MINUS: Designed for BIO; no core-specific functionality; single operator at KU
• Polar Rock Repository
  PLUS: Great functionality for collection management and sample requests; user friendly
  MINUS: Single user implementation; no core-specific functionality
• Curation DIS
  PLUS: Core-specific functionality; multi-institutional architecture
  MINUS: Not open source; old technology; single POF; only for cores
• CyVerse (former iPlant)
  PLUS: Scalable architecture
  MINUS: Complex (overkill?)
• TAMU ODP
  PLUS: To be explored
  MINUS: To be explored
Value Proposition
• For Repositories
  • Online loan management & user interactions
  • Automated use statistics
  • Seamless IGSN registration
  • APIs to support metadata harvest by IMLGS & other catalogs
  • Common user database aligned with ORCID
  • Linking of samples to cruises (R2R), data, publications, etc.
  • Integration with Open Core Data?
  • Enhanced communication & coordination across repositories
  • Long-term sample metadata preservation
Value Propositions
• For Users
  • One-stop-shop to search for and request samples across repositories
  • Single user account to view pending/active/past requests, and to communicate with repositories
• For NSF
  • Increased efficiency of repository operations over the long-term
  • Easy access to harmonized up-to-date use statistics via dashboard
  • Implementation of common sample access policies
Discussion
• Does this plan address your most pressing concerns?
• Is there anything that is missing?
• Are there other existing tools or efforts that you think we should look into/leverage?
• ….
Adoption
EGU 2016: "The IGSN Experience"
Posters at EGU General Assembly 2016
Poster session at AGU Fall Meeting 2016