Page 1: SESAR, IGSN, & a vision for a Repository Portal and Hosted ...osu-mgr.org/wp-content/.../Moving-Repositories-into...repositories’ from informal (uncurated) archiving to accommodate

Moving Repositories into the Digital Age:

SESAR, IGSN, & a vision for a Repository Portal and Hosted Collection Management

Kerstin Lehnert & Megan Carter Orlando IEDA | Lamont-Doherty Earth Observatory, Columbia University


Page 2

Talk Outline
• Update on CI developments: SESAR, IGSN
• Motivations & Previous Work
• Proposed System Functionalities
• Leveraging Existing Components
• Discussion

Page 3

SESAR: System for Earth Sample Registration
• Authenticated workspace with tools for users to submit & manage sample metadata (MySESAR)
• IGSN Allocating Agent: Register samples with IGSN
• Searchable catalog of sample metadata & supplementary documents submitted by users

• Register samples (batch or individual)
• View/edit metadata
• Create groups/collections
• Transfer ownership of metadata
• Generate labels (QR code)
• Role-based access

Page 4

Update: New SESAR Features
• Architecture aligned with new IGSN syntax rules
  • Allows IGSN >9 digits
  • SESAR holds name space ‘IE’, users have sub-name spaces
• APIs for submission (incl. authentication), updates, & access of sample metadata
• Batch updating of sample metadata
• Role-based permissions
• Linked Data version of SESAR (GeoLink project R. Arko, P. Ji)
• Links to cruise DOIs, publication & data DOIs, ORCIDs
• OAI-PMH provider for IGSN Central Catalog & Community Portals
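To make the OAI-PMH bullet concrete, here is a minimal sketch of how a harvester might build a `ListRecords` request. The verb and the `metadataPrefix`, `from`, and `set` arguments come from the OAI-PMH 2.0 protocol; the base URL `https://example.org/oai` is a placeholder and not the actual SESAR endpoint, which the slides do not give.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    base_url is a placeholder for whatever endpoint a provider exposes;
    oai_dc is the metadata format every OAI-PMH provider must support.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Selective harvesting: only records changed on or after this date.
        params["from"] = from_date
    if set_spec:
        # Optional set, e.g. to restrict the harvest to one collection.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, for illustration only.
url = oai_list_records_url("https://example.org/oai", from_date="2017-01-01")
print(url)
```

A central catalog or community portal would call such a URL periodically and follow the resumption tokens in the responses, which this sketch leaves out.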

Page 5

IEDA Data Browser

Page 6

SESAR: Plans
• Modify metadata to distinguish samples in ‘trusted repositories’ from informal (uncurated) archiving to accommodate emerging open access policies of publishers & funders
• Allow users to add customized metadata fields to accommodate specific sample types & communities
• Improved search interface (Solr index, Elasticsearch)
• Linking to data (as data repositories start to use IGSN)
• PID-based linking to people (ORCID), cruises (DOI), funding awards (FundRef), repositories (re3data), etc.

Page 7

IGSN
• Resolvable, globally unique PID (Persistent Identifier)
• Based on the Handle System (like DOI), resolvable at http://igsn.org/XXXZZZZZZ
• Governed by an international non-profit organization (IGSN e.V.)
• Membership in 5 continents
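As a small illustration of "resolvable at http://igsn.org/...", the sketch below turns an IGSN string into its resolver URL and checks whether it falls in the SESAR name space ‘IE’ mentioned on the SESAR features slide. The pattern (uppercase letters and digits only) is a simplifying assumption, not the official IGSN syntax rule, and `IEXXX0001` is a made-up placeholder, not a registered sample.

```python
import re

RESOLVER = "http://igsn.org/"
# Assumption: treat an IGSN as a non-empty run of uppercase letters and digits.
# The real syntax rules are defined by the IGSN e.V. and may differ.
_IGSN_PATTERN = re.compile(r"^[A-Z0-9]+$")


def igsn_to_url(igsn):
    """Return the resolver URL for an IGSN, normalising case and whitespace."""
    candidate = igsn.strip().upper()
    if not _IGSN_PATTERN.match(candidate):
        raise ValueError(f"not a plausible IGSN: {igsn!r}")
    return RESOLVER + candidate


def in_namespace(igsn, namespace="IE"):
    """True if the IGSN starts with the given name space prefix."""
    return igsn.strip().upper().startswith(namespace)


print(igsn_to_url(" iexxx0001 "))
print(in_namespace("IEXXX0001"))
```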

Page 8

Update about IGSN

>6.25 million IGSNs issued! (Status January 2017)

Allocating Agent (started): IGSNs issued
• IEDA (2005): 4,106,273
• Geoscience Australia (2015): 2,100,273
• CSIRO (2015): 25,748
• MARUM (2015): 25,583
• GFZ (2015): 4,461
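The headline figure can be checked against the per-agent numbers on this slide. The pairing of counts with Allocating Agents below follows the order in which they appear on the slide, which is an assumption about the original layout.

```python
# Counts as shown on the slide; the agent-to-count pairing is assumed
# from the left-to-right order of the slide.
igsn_counts = {
    "IEDA": 4_106_273,
    "Geoscience Australia": 2_100_273,
    "CSIRO": 25_748,
    "MARUM": 25_583,
    "GFZ": 4_461,
}

total = sum(igsn_counts.values())
print(f"{total:,}")  # 6,262,338, consistent with ">6.25 million"
```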

Page 9

[Figure: IGSN Prefixes Used at SESAR (proxy for active users), by year, 2005 to 2016; vertical axis 0 to 120; annotation: "Batch registration released"]

Page 10

Linking Samples, Data, & Publications

Page 11

Adoption: Agencies
• USGS
• State surveys
• Smithsonian
• GA
• BGS
• IFREMER
• GESEP

Page 12

IGSN Adoption by publishers

“… AGU Publications also strongly encourages use of other identifiers in our journal papers. International Geo Sample Numbers (IGSNs) uniquely identify items, such as a rock sample, a piece of coral, or a vial of water taken from the natural environment, and provide important, consistent information about these samples. Registering samples and including the IGSN in papers helps secure provenance information but most importantly connects common samples across multiple studies in the literature. IGSNs also will help you keep track of your samples. These identifiers can be reserved before a field season or assigned afterward.”

Hanson, B. (2016), AGU opens its journals to author identifiers, Eos, 97, doi:10.1029/2016EO043183. Published on 7 January 2016.

Page 13

IGSN in DataCite

Page 14

IGSN & ORCID

“All, ORCID is officially introducing IGSN as a new identifier type in the API and search indexes. It will use the igsn.org prefix and should be live in a couple of weeks. I will check what this means concretely when it is live and see if I can come up with some examples that demonstrate the functionality. Cheers, m.”
(mail from Markus Stocker, Pangaea, 4/25/2017)

Page 15

IGSN Architecture
• The separation of administrative and descriptive metadata (learning from setting up DataCite).
• Metadata in a common service can be:
  • Least common denominator
  • All-encompassing with lots of optional elements
• Communities of practice define their metadata as an extension of a core set of metadata.

http://igsn.github.io/

Page 16

IGSN Description Metadata
• Sample Identification
• Sampling Activity
• Sample Curation
• Related Resources

The schema gives a minimal set of descriptive metadata for a global sample catalogue.

Elements reference IGSN admin, ODM2, O&M, and DataCite.

http://schema.igsn.org/description/
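The idea of a minimal core that communities extend can be sketched as a record with four core sections plus a free-form extension area. The section keys mirror the four groups named on this slide; the class, the field names inside each section, and the "all four sections present" rule are hypothetical and do not reproduce the actual IGSN description schema.

```python
from dataclasses import dataclass, field

# The four groups named on the slide, used here as hypothetical section keys.
CORE_SECTIONS = (
    "sample_identification",
    "sampling_activity",
    "sample_curation",
    "related_resources",
)


@dataclass
class SampleDescription:
    core: dict                                      # shared by every community
    extensions: dict = field(default_factory=dict)  # keyed by community name

    def missing_core_sections(self):
        """List core sections absent from this record (empty list means complete)."""
        return [name for name in CORE_SECTIONS if name not in self.core]


record = SampleDescription(
    core={
        "sample_identification": {"igsn": "IEXXX0001"},  # placeholder IGSN
        "sampling_activity": {"method": "coring"},
        "sample_curation": {"repository": "Repo A"},
        "related_resources": [],
    },
    # A community adds its own fields without touching the core.
    extensions={"marine_cores": {"core_type": "piston"}},
)
print(record.missing_core_sections())
```

Keeping extensions in a separate area means a global catalogue can index the core sections alone, while a community portal can read both.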

Page 17

A Repository Portal
• Federated network of repository catalogs that improves discovery, access, sharing, analysis, and curation of physical samples
  • to promote transparency, reproducibility, and re-use in the era of open science.
• Support for digital collection management if needed
  • Streamlines collection metadata management, exchange, & integration with other systems (e.g. IGSN, IMLGS)
  • Facilitates discovery & access to samples for investigators
  • Harmonizes sample request & access policies and procedures across repositories
  • Supports consistent & automated generation of use statistics to demonstrate Return on Investment (ROI)

Page 18

Requirements Gathering: DESC
• Supplement to IEDA in 2011 from OCI (DCL)
• Collaboration with OSU, LacCore, AZGS, RENCI
• Extensive survey of repositories
• Produced a report on status and requirements

Page 19

Requirements Gathering: iSamples
• EarthCube iSamples Research Coordination Network (RCN)
• Stakeholder Alignment Survey (J. Cutcher-Gershenfeld)
• Use Cases Working Group (S. Ramdeen, A. Deere)
  • Collected user stories to articulate life cycle practices of samples for different users
• Workflow Applications Working Group (J. Bowring, A. Hangsterfer)
  • Identify barriers that inhibit adoption of leading practices, incl. IGSN assignment, sample documentation, and sample citation.

Requesting and receiving actual physical samples from a museum/repository is important: .81 (.19)
Requesting and receiving actual physical samples from a museum/repository is easy: .43 (.24)

Page 20

Requirements Gathering: SESAR
• Ongoing feedback from the community (investigators, curators) regarding functionality, usability, performance
  • Includes EAR and PLR funded repositories, museums, agencies
• Collaboration with other Allocating Agents of the IGSN e.V.
  • International best practices and emerging software tools
  • LDEO is the Managing Office of the IGSN e.V.
• Participation in previous Curators Meetings
  • Extensive use cases provided by Anders Noren, Anthony Koppers, Kevin Johnson and others

Page 21

Possible System Components
• Repository Portal
  • User interface and APIs for sample search across repositories
  • Authenticated workspace for users & repositories
    • Request samples from multiple repositories with a single transaction
    • View loans
    • Communicate with curators
  • Authenticated dashboard for NSF to view use statistics
• Repository Collection Management
  • Digital sample and collection management for curators

Page 22

Repository Portal Functionality
• Rich metadata catalog, harvests catalogs from external systems
• UI to search for samples across repositories
  • Phase 1: simple search by location, lithology, age, expedition
  • Phase 2: advanced search on available data (via OpenCoreData?)
• “Shopping cart” approach for sample requests
  • Submit single request to multiple repositories
• Centralized user management (aligned with ORCID)
  • User account setup and login for transactions with all repositories
  • Dashboard for users to view pending, active, and past requests
  • Track communication with repositories
• Dashboard for NSF to view use statistics such as number of requests, samples shipped, users served, requests returned, etc.
• Additional functionality as requested and as funding allows
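One way to picture the “shopping cart” approach is a single user request that the portal splits into one sub-request per repository. The sketch below is only an illustration of that idea; the cart structure, the repository names, and the placeholder IGSNs are all hypothetical.

```python
from collections import defaultdict


def split_cart_by_repository(cart):
    """Group the samples in one cart into per-repository sub-requests.

    cart is a list of dicts with hypothetical keys "igsn" and "repository".
    Returns {repository: [igsn, ...]}, preserving the order samples were added.
    """
    by_repository = defaultdict(list)
    for item in cart:
        by_repository[item["repository"]].append(item["igsn"])
    return dict(by_repository)


# Placeholder repositories and IGSNs, for illustration only.
cart = [
    {"igsn": "IEXXX0001", "repository": "Repo A"},
    {"igsn": "IEXXX0002", "repository": "Repo B"},
    {"igsn": "IEXXX0003", "repository": "Repo A"},
]
sub_requests = split_cart_by_repository(cart)
print(sub_requests)
```

Each sub-request could then be routed to the curators of that repository, while the user dashboard keeps showing the request as a single transaction.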

Page 23

Repository Collection Management
• Hosted collection database to enter & edit sample metadata
  • Batch upload of new samples added to the collection
  • Batch upload of samples from sampling events in the lab
  • Customizable metadata (add fields, add vocabularies)
  • Set role-based permissions for metadata access & editing
  • Track metadata changes (who did it, when)
• Automated IGSN registration
• Integration with Repository Portal loan management
  • Alerts about new requests
  • View, approve, track request status
  • Send reminder to users for sample returns and reports
  • Automatically tracks loan statistics
• Other functionality as desired (data & image storage, label generation, etc.)
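Two of the items above, role-based permissions for editing and tracking of metadata changes (who did it, when), can be combined in one small sketch. The role names, the record structure, and the history format are assumptions made for illustration; they do not describe an existing SESAR feature.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical roles that are allowed to edit sample metadata.
EDIT_ROLES = {"curator", "admin"}


@dataclass
class SampleRecord:
    igsn: str
    metadata: dict
    history: list = field(default_factory=list)  # one entry per change

    def update(self, user, role, key, value):
        """Change one metadata field, recording who changed it and when."""
        if role not in EDIT_ROLES:
            raise PermissionError(f"role {role!r} may not edit metadata")
        self.history.append({
            "user": user,
            "when": datetime.now(timezone.utc),
            "field": key,
            "old": self.metadata.get(key),
            "new": value,
        })
        self.metadata[key] = value


record = SampleRecord(igsn="IEXXX0001", metadata={"lithology": "basalt"})
record.update(user="curator_1", role="curator", key="lithology", value="andesite")
print(record.metadata["lithology"], len(record.history))
```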

Page 24

Leveraging Existing Components

• SESAR
  PLUS: Multi-user; role-based access; dashboards; APIs; integration with IGSN; operational environment (data facility)
  MINUS: Missing sample request & loan management; technology upgrade required
• Specify
  PLUS: Cloud-based, scalable, modular; broad adoption in BIO; open source; customizable
  MINUS: Designed for BIO; no core-specific functionality; single operator at KU
• Polar Rock Repository
  PLUS: Great functionality for collection management and sample requests; user friendly
  MINUS: Single user implementation; no core-specific functionality
• Curation DIS
  PLUS: Core-specific functionality; multi-institutional architecture
  MINUS: Not open source; old technology; single POF; only for cores
• CyVerse (former iPlant)
  PLUS: Scalable architecture
  MINUS: Complex (overkill?)
• TAMU ODP
  PLUS: To be explored
  MINUS: To be explored

Page 25

Value Proposition
• For Repositories
  • Online loan management & user interactions
  • Automated use statistics
  • Seamless IGSN registration
  • APIs to support metadata harvest by IMLGS & other catalogs
  • Common user database aligned with ORCID
  • Linking of samples to cruises (R2R), data, publications, etc.
  • Integration with Open Core Data?
  • Enhanced communication & coordination across repositories
  • Long-term sample metadata preservation

Page 26

Value Propositions
• For Users
  • One-stop-shop to search for and request samples across repositories
  • Single user account to view pending/active/past requests, and to communicate with repositories
• For NSF
  • Increased efficiency of repository operations over the long term
  • Easy access to harmonized up-to-date use statistics via dashboard
  • Implementation of common sample access policies

Page 27

Discussion
• Does this plan address your most pressing concerns?
• Is there anything that is missing?
• Are there other existing tools or efforts that you think we should look into/leverage?
• …

Page 28

Adoption

EGU 2016: "The IGSN Experience"

Posters at EGU General Assembly 2016

Poster session at AGU Fall Meeting 2016