Data exchange alternatives, GIGA TAG (2009)
Data exchange alternatives
Global Information on Germplasm Accessions (GIGA, ALIS)
2nd GIGA Technical Advisory Group Meeting
Dag Terje Filip Endresen, Nordic Genetic Resources Center, NordGen (Sweden)
Data exchange
Cartoon by Sasha Kopf (Creative Commons)
Data Exchange Format
• MCPD (1997)
  – Multi Crop Passport Descriptors
• Darwin Core (2001) **
  – New version up for revision at TDWG 2009
  – http://rs.tdwg.org/dwc/index.htm
• ABCD (2001)
  – Access to Biological Collections Data
  – http://wiki.tdwg.org/twiki/bin/view/ABCD
• GCP Passport (2005)
  – http://www.generationcp.org
• Ontology (including all above)
  – perhaps develop a new GIGA ontology
Data Provider Software
• BioMOBY (2001)
  – http://biomoby.org
• DiGIR (2002, not active)
  – http://digir.sourceforge.net
• BioCASE (2003, PyWrapper v2)
  – http://www.biocase.org
• EURISCO (2003, tab delimited text)
  – http://eurisco.ecpgr.org
• PyWrapper 3 (2006, not active)
  – http://trac.pywrapper.org
• TapirLink (2007)
  – http://wiki.tdwg.org/twiki/bin/view/TAPIR/TapirLink
• GBIF Provider Toolkit (2009) **
  – http://code.google.com/p/gbif-providertoolkit
Data Harvest Infrastructure
• GIGA Registry (UDDI)
  – New GIGA registry for germplasm dataset?
• ICIS and CropForge tools
  – http://cropwiki.irri.org/icis/
  – https://cropforge.org/
• GBIF data portal and registry **
  – http://data.gbif.org
  – gbrds.gbif.org (registry)
• GBIF Indexing Toolkit (2009)
  – http://code.google.com/p/gbif-indexingtoolkit/
Data Provider Software
EURISCO tab delimited upload
http://eurisco.ecpgr.org
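A tab-delimited passport file of this kind can be produced with a few lines of Python. The sketch below is illustrative only: the column names are MCPD descriptor names, but the choice of columns, the sample accession values, and the function name are assumptions made for the example, not the EURISCO upload specification.

```python
import csv
import io

# A small subset of MCPD descriptor names, chosen for illustration.
MCPD_COLUMNS = ["INSTCODE", "ACCENUMB", "GENUS", "SPECIES", "ORIGCTY"]


def write_passport_tsv(records, out):
    """Write passport records (dicts keyed by MCPD descriptor) as tab-delimited text."""
    writer = csv.DictWriter(
        out,
        fieldnames=MCPD_COLUMNS,
        delimiter="\t",
        lineterminator="\n",
        extrasaction="ignore",  # drop keys outside the chosen subset
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)


# Hypothetical accession records, invented for this example.
SAMPLE_RECORDS = [
    {"INSTCODE": "XXX001", "ACCENUMB": "ACC 1", "GENUS": "Hordeum",
     "SPECIES": "vulgare", "ORIGCTY": "SWE"},
    {"INSTCODE": "XXX001", "ACCENUMB": "ACC 2", "GENUS": "Pisum",
     "SPECIES": "sativum", "ORIGCTY": "NOR"},
]

if __name__ == "__main__":
    buffer = io.StringIO()
    write_passport_tsv(SAMPLE_RECORDS, buffer)
    print(buffer.getvalue(), end="")
```

A real upload would use the full descriptor list and whatever encoding and header conventions the receiving catalogue requires.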
BioMOBY
• The BioMOBY project was initiated in 2001 (in Saskatchewan, Canada).
• Two branches, web service and semantic (MOBY-S).
• MOBY ontology-aware registry for discovery of both data and services.
• Works well with TAPIR and BioCASE.
• GCP have selected BioMOBY as the main web service technology.
http://biomoby.org
BioCASE 2.5
• The BioCASE provider software is a product of the EU funded BioCASE project (2001-2004).
• Developed at BGBM in Berlin.
• Last updated in April 2008, with support for Python version 2.5.
• Data formats include: ABCD 2.06, Darwin Core, GCP_Passport, MCPD.
http://www.biocase.org
BioCASE 2.5
Configuration
• Add datasource (dsa)
• Database connection
• Database table structure
• Mapping of data model to standard schema
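The four configuration steps can be pictured with a small stand-in. This is not BioCASE's configuration format or API. It is a sketch that assumes an in-memory SQLite database, a hypothetical `accession` table, and a hand-written mapping from local column names to Darwin Core terms.

```python
import sqlite3

# Step 4 stand-in: map local column names to Darwin Core terms.
# The table and column names are hypothetical.
COLUMN_TO_TERM = {
    "acc_no": "catalogNumber",
    "inst": "institutionCode",
    "genus_name": "genus",
    "species_name": "specificEpithet",
    "origin": "country",
}


def open_datasource():
    """Steps 1-3 stand-in: a data source, its connection, and its table structure."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # allow access to columns by name
    conn.execute(
        "CREATE TABLE accession ("
        "acc_no TEXT, inst TEXT, genus_name TEXT, species_name TEXT, origin TEXT)"
    )
    conn.execute(
        "INSERT INTO accession VALUES (?, ?, ?, ?, ?)",
        ("ACC 1", "XXX001", "Hordeum", "vulgare", "Sweden"),
    )
    return conn


def mapped_records(conn):
    """Return each row re-keyed by the mapped standard terms."""
    columns = ", ".join(COLUMN_TO_TERM)
    rows = conn.execute(f"SELECT {columns} FROM accession")
    return [
        {COLUMN_TO_TERM[col]: row[col] for col in COLUMN_TO_TERM}
        for row in rows
    ]


if __name__ == "__main__":
    for record in mapped_records(open_datasource()):
        print(record)
```

The point of the mapping step is that the local database keeps its own column names, while consumers only ever see the terms of the shared schema.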
TAPIR
• TAPIR: TDWG Access Protocol for Information Retrieval.
• During the 2004 TDWG meeting in Christchurch, NZ, work started on a unified protocol, which was named TAPIR.
• TAPIR is based on the protocols of two data provider software packages, BioCASE and DiGIR.
PyWrapper3
Home: http://trac.pywrapper.org/
Primary developers: Markus Döring, Javier de la Torre
Source code: Python
14/07/2008 - Development stalled
We are sorry to inform you that development of the TAPIR branch of PyWrapper has been stalled. The latest 3.1 alpha version is not stable and not recommended for production! (Message from the home page)
PyWrapper is tested and verified to work fine with Windows, Mac OS X and Linux.
Web configuration tool
PyWrapper graphical web based configuration tool
TapirLink 0.6.1
Home: http://wiki.tdwg.org/twiki/bin/view/TAPIR/TapirLink
Primary developers: Renato De Giovanni, Dave Vieglais
Download: http://sourceforge.net/project/showfiles.php?group_id=38190
Source code: PHP
Test resource with client form: http://localhost/tapirlink/tapir_client.php
The XML Client form is very illustrative for understanding exactly how the wrapper software works!
GBIF IPT
Home: http://code.google.com/p/gbif-providertoolkit/
Primary developers: Markus Döring, Tim Robertson
Download: http://code.google.com/p/gbif-providertoolkit/downloads/list
Source code: Java
DEMO at http://atlas.nordgen.org/ipt/
GBIF IPT
• The GBIF IPT is an open source, Java (TM) based web application that connects and serves three types of biodiversity data: taxon primary occurrence data, taxon checklists and general resource metadata.
• The data registered in the IPT is connected to the GBIF distributed network and made available for public consultation and use.
• Designed to transfer large numbers of records, and to decentralize and speed up the indexing of biodiversity occurrence datasets.
• IPT also provides data publishers with a local tool for data quality assessment.
• The data publisher can easily monitor data access and use.
Web service interface
EXAMPLE TAPIR SERVICE SEARCH REQUEST
EXAMPLE TAPIR SERVICE SEARCH RESPONSE
EXAMPLE OF OAI-PMH SERVICE REQUEST
http://an.oa.org/OAI-script?verb=GetRecord&identifier=oai:arXiv.org:hep-th/9901001&metadataPrefix=oai_dc
OAI-PMH requests are submitted using either the HTTP GET or POST methods.
Request types:
• Identify
• ListMetadataFormats
• ListSets
• GetRecord
• ListIdentifiers
• ListRecords
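A request like the one above is just a base URL plus a `verb` argument and any verb-specific arguments. The sketch below builds such GET URLs with the Python standard library. The function name `build_oai_request` is an assumption for the example, and the base URL is the placeholder from the slide, not a live service.

```python
from urllib.parse import urlencode

# The six OAI-PMH request types (verbs).
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "GetRecord",
    "ListIdentifiers",
    "ListRecords",
}


def build_oai_request(base_url, verb, **arguments):
    """Return an HTTP GET URL for an OAI-PMH request."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    # urlencode percent-encodes reserved characters such as ':' and '/'.
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


if __name__ == "__main__":
    print(build_oai_request(
        "http://an.oa.org/OAI-script",
        "GetRecord",
        identifier="oai:arXiv.org:hep-th/9901001",
        metadataPrefix="oai_dc",
    ))
```

Because the identifier is percent-encoded, the generated URL looks different from the unencoded form shown on the slide, but it names the same record.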
EXAMPLE OF OAI-PMH SERVICE RESPONSE
OAI-PMH responses are returned as HTTP responses, with the Content-Type set to text/xml.
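On the harvesting side, a client would typically check the Content-Type and then read the XML body. The following is a minimal sketch with the standard library. The sample response is hand-written and trimmed for illustration (real responses carry more elements), and the helper names are assumptions.

```python
import xml.etree.ElementTree as ET

# XML namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Trimmed, hand-written response body for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2009-01-01T00:00:00Z</responseDate>
  <request verb="GetRecord">http://an.oa.org/OAI-script</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:arXiv.org:hep-th/9901001</identifier>
        <datestamp>1999-01-01</datestamp>
      </header>
    </record>
  </GetRecord>
</OAI-PMH>
"""


def is_oai_content_type(content_type):
    """True if a Content-Type header value names text/xml (parameters ignored)."""
    return content_type.split(";")[0].strip().lower() == "text/xml"


def record_identifiers(xml_text):
    """Return the header identifiers found in an OAI-PMH response body."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]


if __name__ == "__main__":
    print(is_oai_content_type("text/xml; charset=UTF-8"))
    print(record_identifiers(SAMPLE_RESPONSE))
```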
GBIF PGR Network 2
[http://data.gbif.org/datasets/network/2]
DARWIN CORE
• A new version of Darwin Core is up for public review.
  – http://rs.tdwg.org/dwc/terms/index.htm
  – TDWG 2009, Montpellier, November 9-13
• DRAFT Germplasm extension
  – http://code.google.com/p/darwincore/source/browse/#svn/trunk/xsd/profiles/germplasm
• RDF, LSID, ontology friendly
Outlook
• The compatibility of data standards between PGR and biodiversity collections made it possible to integrate the worldwide germplasm collections into the biodiversity community (GBIF, TDWG).
• Using GBIF technology (and contributing to its development), the PGR community can easily establish specific PGR networks without duplicating GBIF's work.
• Use of GBIF technology and integration of PGR collection data into GBIF allows PGR users to simultaneously search PGR collections and other biodiversity collections, and to get access to the data (and possibly the material) of relevant biodiversity collections.
• New data portals and tools for a specific crop, a regional thematic network, or a similar subset of the global biodiversity datasets can be established with rather little effort!
Adapted from a slide by Helmut Knüpffer (IPK Gatersleben)
Special thanks to
• Bioversity International http://www.bioversityinternational.org
• GBIF, Global Biodiversity Information Facility http://www.gbif.org
• BioCASE, The Biological Collection Access Service for Europe. http://www.biocase.org
• TDWG, Biodiversity Information Standards http://www.tdwg.org
Data portal example (2006)
http://wwwdev.ngb.se/portal/index.php?scope=demo
Data Harvest
GBIF GBRDS
http://gbrds.gbif.org
Fallacies of Distributed Computing
1. The network is reliable.
2. Latency is zero.
3. Bandwidth is infinite.
4. The network is secure.
5. Topology doesn't change.
6. There is one administrator.
7. Transport cost is zero.
8. The network is homogeneous.
This list of fallacies came about at Sun Microsystems around 1994.
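For a harvester, the first fallacy is the most immediate one: a data provider may time out or be offline. One common response is to retry with exponential backoff, sketched below. The function names, retry counts, and the simulated flaky provider are assumptions for illustration and do not come from any GBIF or TAPIR tool.

```python
import time


def fetch_with_retry(fetch, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, waiting longer after each failure.

    fetch is any zero-argument callable that raises OSError on network failure.
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # delay doubles after each failure


def make_flaky_provider(failures):
    """Hypothetical provider that fails a fixed number of times, then answers."""
    state = {"left": failures}

    def fetch():
        if state["left"] > 0:
            state["left"] -= 1
            raise ConnectionError("simulated network failure")
        return "<response/>"

    return fetch


if __name__ == "__main__":
    # sleep is stubbed out so the demonstration runs instantly.
    print(fetch_with_retry(make_flaky_provider(2), sleep=lambda seconds: None))
```

Passing `sleep` as a parameter keeps the waiting policy testable without real delays.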