Post on 10-May-2015
GBRDS: Global Biodiversity Resources Discovery System
European GBIF Nodes Meeting 2010, March 10th-12th, Alicante, Spain
Dag Endresen, Nordic Genetic Resource Center (NordGen)
What is the Global Biodiversity Information Facility?
• GBIF enables free and open access to biodiversity data online.
• GBIF is an international, government-initiated and government-funded initiative focused on making biodiversity data available to everyone, for scientific research, conservation and sustainable development.
• GBIF’s Data Portal provides this infrastructure.
GBIF site :: http://www.gbif.org/index.php?id=269
GBIF from prototype to full operation
Global informatics research infrastructure:
• Global participation, a global network of partners
• Enabling publishing of biodiversity data
• Promoting development of data exchange standards
• Building an informatics architecture
• Capacity building
• Catalysing development of analytical tools
• Data provider
• Data aggregator
GBRDS :: Linking resources
• Who? (Institutions, Collections, Networks, ...)
• What? (Data sets, Services, Persistent Identifiers...)
• Where? (Locations, Service access points, ...)
• When? (Temporal scope)
• How? (Formats, protocols, ...)
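The five questions above can be read as the fields of a single registry entry. The following is a minimal sketch only, assuming a hypothetical `ResourceDescription` record; the field names and the placeholder values are illustrative and are not taken from the GBRDS design.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ResourceDescription:
    """One registry entry organised by the five questions (hypothetical schema)."""
    who: str                                         # institution, collection or network
    what: str                                        # data set or service being described
    where: List[str] = field(default_factory=list)   # locations or service access points
    when: Optional[Tuple[int, int]] = None           # temporal scope as (first_year, last_year)
    how: List[str] = field(default_factory=list)     # formats and protocols

    def covers_year(self, year: int) -> bool:
        """Return True if the temporal scope is known and includes the year."""
        if self.when is None:
            return False
        first, last = self.when
        return first <= year <= last


# Placeholder values, for illustration only.
record = ResourceDescription(
    who="Example Institution",
    what="Example amphibian collection",
    where=["http://example.org/access-point"],
    when=(1900, 2000),
    how=["DiGIR", "DwC"],
)
print(record.covers_year(1980))
```

Keeping the temporal scope as a pair of years is only one possible choice; a real registry would more likely store full dates.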
As a distributed service
Available APIs (so far)
Question: Are we (the NODES) missing an important API here, or are these good to go?
Visit GBRDS at Google Code: http://code.google.com/p/gbif-registry/w/list
At the core, a Discovery System
[Diagram labels: Consumers; Data Publishers; Service Publishers; Others…; Discovery System; Registering; Discovering; Searching; Retrieving]
Slide by Vishwas Chavan, GBRDS workshop September 2009
GBIF GBRDS
http://gbrds.gbif.org Slide prepared by GBIF (Samy Gaiji)
USE CASES & mock-up examples
• The same primary biodiversity data can be analyzed differently for different uses
Discussion: Do we, the NODES, have other important use cases?
Discussion: Do we, the NODES, have other requirements for the GBRDS user interface?
What can you do with georeferenced biodiversity data?
• Predict effects of climate change
• Analyse and predict spread of pests and diseases of humans, crops, livestock, wildlife, etc.
• Predict best places to set up new protected areas
• Analyse invasive species and predict invasion pathways
• Provide policymaker-relevant data of all kinds
• Be a resource for biodiversity science communities
The same primary biodiversity data can be analyzed differently for different uses.
Use cases for primary biodiversity data
Modified from slide by Vishwas Chavan, GBRDS workshop September 2009
http://code.google.com/p/gbif-registry/wiki/UseCases
Moving towards… global integration
Threatened Spp.; Red List Spp.
Migratory Spp.
Invasives, crop wild relatives, medicinals, etc.
?
Slide by Vishwas Chavan, GBRDS workshop September 2009
Global Biodiversity Resources Discovery System
empowering discovery of biodiversity data
microhyla ornata western ghats 1980 Discover
About GBRDS Bug Report @ 2009, GBIF
1-5 of 351, <<Next>>
Database of Frogs of Southern India ... (more)
AmphibiaWeb ... (more)
Fauna of India Database ... (more)
Microhyla of the World ... (more)
779 resources in six categories discovered, with 428 accessible, 351 resource records
Names (82) Primary Biodiversity Data (200) Resources (351)
Multimedia (12) Maps (28) Literature (106)
Mock-up examples
Slide by Vishwas Chavan, GBRDS workshop September 2009
Database of Frogs of Southern India: South Asian Centre for Biodiversity Monitoring, Kathmandu, Nepal. CDROM, published November 2006, ISBN-001-898-0788, sacbm@sacbm.org. <<Click here for complete metadata record>>
AmphibiaWeb: http://www.amphibiaweb.org/ Access point live as on 20th June 2009. <<Click here for complete metadata record>>
Fauna of India Database: National Centre for Biodiversity Informatics, New Delhi, India. In-house database, accessible through bi-lateral arrangement, fid@ncbi.org.in. <<Click here for complete metadata record>>
Microhyla of the World: Smithsonian Institution, Washington DC, USA. 257 specimen records: 1700 – 2001: http://www.si.edu/amphibia/microhyla/ Access point live as on 12th Jan 2009. <<Click here for complete metadata record>>
Amphibian Collection of Raffles Museum: Raffles Museum, Singapore. 7004 specimens, non-digital: curator@raffles.sg. <<Click here for complete metadata record>>
Global Biodiversity Resources Discovery System
empowering discovery of biodiversity data
microhyla ornata western ghats 1980 Discover
About GBRDS Bug Report @ 2009, GBIF
779 resources in six categories discovered, with 428 accessible, 351 resource records
Resources : Names : Primary Biodiversity Data : Multimedia : Maps : Literature
Resources (351): 1-5 of 351 : Next : Previous : Last :
Digital, Offline
Digital, Free, Online
Digital, Restricted Access
Digital, Free, Online
Non-Digital
Slide by Vishwas Chavan, GBRDS workshop September 2009
Mock-up examples
Online demo (pre-alpha)
GBRDS? What’s that?
eBiosphere resolution recommendation: “Complete durable global registries of biodiversity informatics resources”
Slide by Samy Gaiji, GBRDS workshop September 2009
Is this the GBRDS?
The GBRDS is 1) a Registry of resources and services and 2) a set of discovery services interacting with existing infrastructure such as GBIF to facilitate the discovery of biodiversity information. The most important component, the Registry, would facilitate the inventory of information resources by creating a single annotated index of publishers, institutions, networks, collections (datasets), schema repository and services. The envisaged GBRDS is not conceived as simply a collection of centralized indexes, but rather as an integrated ‘Yellow Pages’ reference of all biodiversity information resources, reconciling all distributed resources and providing a meaningful way to discover them in a distributed manner.
Slide by Samy Gaiji, GBRDS workshop September 2009
Registry: The past…
Universal Description, Discovery and Integration (UDDI)
“…XML-based registry for businesses worldwide to list themselves on the Internet …”
UDDI                  GBIF
Businesses            Institutions
+ Services            + Collections
+ Service Bindings    + Endpoints (DiGIR etc.)
+ TModels             + Application Schemas (DwC etc.)
Modified from slide by Tim Robertson, GBRDS workshop September 2009
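The UDDI-to-GBIF mapping above suggests a nested data model. The sketch below is one possible reading of it, not the registry's actual schema: the class names, fields and the `endpoints_for_schema` helper are hypothetical, and the values are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Endpoint:
    """Access point for a collection (UDDI: service binding)."""
    url: str
    protocol: str   # e.g. "DiGIR"
    schema: str     # application schema, e.g. "DwC" (UDDI: tModel)


@dataclass
class Collection:
    """A data set published by an institution (UDDI: service)."""
    name: str
    endpoints: List[Endpoint] = field(default_factory=list)


@dataclass
class Institution:
    """The publishing organisation (UDDI: business)."""
    name: str
    collections: List[Collection] = field(default_factory=list)

    def endpoints_for_schema(self, schema: str) -> List[Endpoint]:
        """List every endpoint of this institution that serves the given schema."""
        return [e for c in self.collections for e in c.endpoints if e.schema == schema]


# Placeholder values, for illustration only.
institution = Institution(
    name="Example Institution",
    collections=[
        Collection("Example herbarium", [Endpoint("http://example.org/digir", "DiGIR", "DwC")]),
        Collection("Example image archive", [Endpoint("http://example.org/images", "HTTP", "other")]),
    ],
)
print([e.url for e in institution.endpoints_for_schema("DwC")])
```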
NB! Capitalise on resources (investments) to date; re-use previous solutions.
Technical specifications
• Re-use of registry for your own purposes, and thematic networks (model sub networks)
• Identification of duplicate datasets / records
• Scalable, up-time (24/7/365)
• How to build relations between resources (Perhaps Facebook style: “Institution X requests to be associated with you. Would you like to accept this association?”)
• Central curation, or distributed community curation?
• Allow the tagging of resources (human, machine)
• Discussion: Any additional specifications to add from the NODES?
Source: slides by Tim Robertson, GBRDS workshop September 2009
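The "Facebook style" association in the list above amounts to a request that stays pending until the other party answers. A minimal sketch of that handshake follows, assuming hypothetical `AssociationRequest` and `AssociationBook` classes; nothing here reflects an actual GBRDS API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Set, Tuple


class Status(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass
class AssociationRequest:
    requester: str
    target: str
    status: Status = Status.PENDING


class AssociationBook:
    """Keeps association requests; only accepted ones count as relations."""

    def __init__(self) -> None:
        self.requests: List[AssociationRequest] = []

    def request(self, requester: str, target: str) -> AssociationRequest:
        """Record a new request; it has no effect until the target answers."""
        req = AssociationRequest(requester, target)
        self.requests.append(req)
        return req

    def respond(self, req: AssociationRequest, accept: bool) -> None:
        """Let the target accept or reject a pending request exactly once."""
        if req.status is not Status.PENDING:
            raise ValueError("request has already been answered")
        req.status = Status.ACCEPTED if accept else Status.REJECTED

    def associations(self) -> Set[Tuple[str, str]]:
        """Return the (requester, target) pairs that were accepted."""
        return {(r.requester, r.target) for r in self.requests if r.status is Status.ACCEPTED}


book = AssociationBook()
pending = book.request("Institution X", "Network Y")
before = book.associations()   # empty: nothing accepted yet
book.respond(pending, accept=True)
after = book.associations()    # now contains the accepted pair
print(before, after)
```

Requiring the target's consent is one way to support distributed community curation without letting any party claim a relation unilaterally.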
GBRDS Planning: Some issues to consider
GBIF Network: The real scenario
Challenge #1:
Model the true nature of the network makeup.
A graph and not a tree
Multiple entity types
Institutions, networks, collections, GBIF Nodes
Many relationship types
Slide by Tim Robertson, GBRDS workshop September 2009
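To illustrate why a graph fits better than a tree, the sketch below stores typed entities and typed relationships, so one collection can belong to several networks at once. The `RegistryGraph` class, the entity names and the relation names are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class RegistryGraph:
    """Directed graph with typed nodes (entities) and typed edges (relationships)."""

    def __init__(self) -> None:
        self.entity_type: Dict[str, str] = {}
        self.edges: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def add_entity(self, name: str, entity_type: str) -> None:
        self.entity_type[name] = entity_type

    def relate(self, source: str, relation: str, target: str) -> None:
        """Add an edge; both ends must already be registered entities."""
        for name in (source, target):
            if name not in self.entity_type:
                raise KeyError(f"unknown entity: {name}")
        self.edges[source].append((relation, target))

    def related(self, source: str, relation: str) -> Set[str]:
        """Return all targets reachable from source via the given relation."""
        return {t for r, t in self.edges.get(source, []) if r == relation}


graph = RegistryGraph()
graph.add_entity("Collection A", "collection")
graph.add_entity("Institution X", "institution")
graph.add_entity("Network 1", "network")
graph.add_entity("Network 2", "network")
graph.add_entity("Node N", "gbif_node")

graph.relate("Collection A", "owned_by", "Institution X")
graph.relate("Collection A", "member_of", "Network 1")
graph.relate("Collection A", "member_of", "Network 2")   # a second parent: impossible in a tree
graph.relate("Institution X", "endorsed_by", "Node N")

print(sorted(graph.related("Collection A", "member_of")))
```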
Slide by Samy Gaiji, GBRDS workshop September 2009
GBRDS scalability/portability to other communities/locations will be critical!
Where metadata fits in ...
Modified from slide by Éamonn Ó Tuama, GBRDS workshop September 2009
NB! Remember import / compatibility with other metadata catalogues / systems
• The Persistent Identifier is a digital name tag
• Also called Globally Unique Identifier (GUID)
• Life Science Identifier (LSID) is one example
• Digital Object Identifier (DOI) is another example
• The Persistent Identifier concept introduces a straightforward approach to naming and identifying data resources stored in multiple, distributed data stores.
• Persistent Identifiers provide a naming standard to support interoperability.
GBRDS : Persistent Identifiers
Discussion: Are Persistent Identifiers an important task for the GBRDS? (From NODES view-point)
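As a small illustration of what a naming standard buys, an LSID has the fixed form `urn:lsid:<authority>:<namespace>:<object>[:<revision>]`, so any client can split it into its parts. The sketch below is a simplified parser under that assumption; the `parse_lsid` function and the example identifier are hypothetical, and it does not resolve the identifier.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text: str) -> LSID:
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = text.split(":")
    has_prefix = len(parts) >= 2 and parts[0].lower() == "urn" and parts[1].lower() == "lsid"
    if not has_prefix or len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    if not (authority and namespace and object_id):
        raise ValueError(f"LSID has an empty component: {text!r}")
    # The revision is optional; treat a missing or empty one as None.
    revision = (parts[5] or None) if len(parts) == 6 else None
    return LSID(authority, namespace, object_id, revision)


# Placeholder identifier, for illustration only.
parsed = parse_lsid("urn:lsid:example.org:specimens:12345")
print(parsed.authority, parsed.namespace, parsed.object_id, parsed.revision)
```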
LSID-GUID Task Group: summary
Effective identification of data objects is essential for linking the world’s biodiversity data. If GBIF is to enable the exchange of biodiversity data it must promote identifier adoption through:
- education, training, outreach
- leadership
- practical services
Source: slides by Éamonn Ó Tuama, GBRDS workshop September 2009
Recommendation 10: GBIF should provide services to support identifier resolution, redirection, metadata hosting, and caching.
Endpoint monitoring http://bioguid.info/status/ (Rod Page)
GBRDS : Provider monitoring
Modified from slide by Tim Robertson, GBRDS workshop September 2009
Endpoint monitoring http://bigdig.ecoforge.net/ (David Vieglais, Kansas University, 2006)
GBRDS : Provider monitoring
Date Last Updated: 2010-03-08 08:15:01+0000
Endpoint monitoring http://chm.grinfo.net/ (Bioversity, Dag Endresen, March 2006)
GBRDS : Provider monitoring
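The monitoring pages listed above all answer the same question: is each provider endpoint reachable right now? A minimal sketch of such a check is given below, using only the Python standard library. It is not how any of the listed services is implemented; `check_endpoints`, `http_probe`, `fake_probe` and the example URLs are hypothetical, and the demonstration uses the fake probe so that it runs without network access.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Iterable, List


@dataclass
class EndpointStatus:
    url: str
    is_up: bool
    checked_at: datetime


def http_probe(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Covers HTTP errors, connection failures, timeouts and malformed URLs.
        return False


def check_endpoints(urls: Iterable[str],
                    probe: Callable[[str], bool] = http_probe) -> List[EndpointStatus]:
    """Probe every endpoint once and record the result with a UTC timestamp."""
    return [EndpointStatus(url, probe(url), datetime.now(timezone.utc)) for url in urls]


def fake_probe(url: str) -> bool:
    """Stand-in for http_probe so the example needs no network access."""
    return url.endswith("/up")


report = check_endpoints(["http://example.org/up", "http://example.org/down"], probe=fake_probe)
for status in report:
    print(status.checked_at.isoformat(), "UP" if status.is_up else "DOWN", status.url)
```

Passing the probe as a parameter keeps the scheduling and reporting logic testable; a protocol-aware probe (for example one that sends a DiGIR metadata request) could be swapped in later.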
Long tail or Dark Data is economically and ecologically critical
Most existing and future data will be held by small data publishers
The early focus of GBIF was the low-hanging fruit: the LARGE datasets
Further expansion of the GBIF data network should strive to include the small datasets
Source: Curating the Dark Data in the Long tail of science by P. Bryan Heidorn (Google Tech Talk, August 28, 2008, http://www.youtube.com/watch?v=mgN74bR57i0)
REMEMBER : the small data providers
Source: slides by Vishwas Chavan, GBRDS workshop September 2009
Thanks for listening