SEEK EcoGrid: Integrate diverse data networks from ecology, biodiversity, and environmental...
SEEK EcoGrid
- Integrate diverse data networks from ecology, biodiversity, and environmental sciences
- Metacat, DiGIR, SRB, Xanthoria, ...
- EML is the core for data documentation
- Open programming interface
EcoGrid client interactions
Aims of EcoGrid
- Which, Where, How, Who ????
- Share Data and Information
- Relate Data from multiple projects/groups
- Crosswalks across data structures
- Develop Eco-related Finding Aids for Data
- Global User: Authenticate and Authorize
- Provide an infrastructure for “Archivable Collection-building” for SEEK scientists
- Facilitate the A&M layer and the SMS layer
Challenges of EcoGrid
- Data & User Diversity
  - 6000+ datasets & 1500+ scientists
  - themes, methods, units, structures
  - Small data sizes but high complexity: metadata
- Multiple Data Organizations
  - Biodiversity Surveys
  - Population data
  - GIS, Satellite Images, Weather Data, …
- Ontologies & Taxonomies
- Data Discovery: No single place to find
- Data Entropy: rapid decline of information on data
- Autonomy with Centralized access
- Leverage Computational Grid work
Existing services
- Metacat: syntactic and semantic metadata querying/inserting/updating/deleting, user registration/authentication, data replication, data/metadata versioning; supports any XML-based metadata
- Xanthoria: common-schema mediator (currently 8 sites); metadata query/insert/update/delete for any XML schema to underlying metadatabase (SQL, native XML)
Existing Systems
- DiGIR: querying arbitrary XML-describable resources (underlying data sources can be any type: RDB, XMLDB).
- ClimDB: integrating diverse-format climate data (using wrapping at the data source). Access through the web; common schema identified beforehand (tabular description).
- HyperLTER: summary ontology stored as metadata for images; image extraction, geographic subsetting, and band-level subsetting; integration with MODIS images, hyperspectral images, TM images, air photos, …
Existing Systems
- VegBank: 3 databases (co-occurrence records, a concept-driven species taxonomic database, community classification). Distributed VegBank, querying by plots. Query/insert/update/annotate across three diverse databases that are described using XML.
- SRB: access to distributed data; querying based on syntactic, semantic, and user-defined (arbitrary relational) metadata. Annotations for data. Operations on data. Extraction of metadata. Ingest, bulk ingest, delete, and update of data/metadata.
EcoGrid Services
- Query: Search metadata and data, return result sets with ID
- Read: Retrieve data objects by ID
- Authentication: Verify user identity
- Authorization: Record allowable interactions
- Write: Write data objects by ID
- Replication: Mirror objects for backup and efficiency
- Computation: Execute models and simulations from AMS on various nodes
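To make the service list concrete, here is a minimal sketch of how a single node might expose Query, Read, Write, Authorization, and Replication. Everything in it is an assumption made for illustration: the class name `InMemoryNode`, its method names, and the dictionary-backed storage are hypothetical and are not taken from the EcoGrid WSDL.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryNode:
    """Hypothetical EcoGrid node that keeps objects in a dict keyed by ID."""
    objects: dict = field(default_factory=dict)   # object ID -> content
    writers: set = field(default_factory=set)     # users allowed to write

    def query(self, text):
        """Query: return the sorted IDs of objects whose content mentions `text`."""
        needle = text.lower()
        return sorted(oid for oid, body in self.objects.items()
                      if needle in body.lower())

    def read(self, object_id):
        """Read: retrieve a data object by ID."""
        return self.objects[object_id]

    def write(self, user, object_id, body):
        """Write: store a data object by ID, if the user is authorized."""
        if user not in self.writers:
            raise PermissionError(f"{user} may not write {object_id}")
        self.objects[object_id] = body

    def replicate_to(self, other):
        """Replication: mirror every object onto another node."""
        other.objects.update(self.objects)
```

A client would call `query` first to obtain IDs and then `read` each ID, which mirrors the Query/Read split in the list above. Authentication and Computation are left out of this sketch.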
EcoGrid Search Interactions
Features:
- Well-defined interfaces (e.g., WSDL)
- Standardized messaging formats
- Automated discovery of implementing services
- Aggregation/Indexing across nodes for efficiency
- Support heterogeneous data objects via metadata descriptions
- Lightweight to implement for various systems like DiGIR and Metacat
[Diagram: Client, Registry, and six QueryService nodes. Steps: 1. Register; 2. Find Query Nodes; 3. Search (recursive)]
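The register / find / recursive-search sequence can be sketched as follows. This is an illustrative sketch only: the classes `Registry` and `QueryService`, the `client_search` helper, and the cycle guard are assumptions, since the slide does not specify how recursion terminates or how nodes are linked.

```python
class QueryService:
    """Hypothetical query node: answers from local records, then asks its children."""

    def __init__(self, name, records, children=()):
        self.name = name
        self.records = records            # {object_id: title}
        self.children = list(children)    # downstream QueryService nodes

    def search(self, term, seen=None):
        seen = set() if seen is None else seen
        if self.name in seen:             # guard against cycles and repeat visits
            return []
        seen.add(self.name)
        hits = [(self.name, oid)
                for oid, title in sorted(self.records.items())
                if term.lower() in title.lower()]
        for child in self.children:       # step 3: search recursively
            hits.extend(child.search(term, seen))
        return hits


class Registry:
    """Hypothetical registry that only remembers which nodes exist."""

    def __init__(self):
        self._nodes = []

    def register(self, node):             # step 1: Register
        self._nodes.append(node)

    def find_query_nodes(self):           # step 2: Find Query Nodes
        return list(self._nodes)


def client_search(registry, term):
    """Client side: look up nodes in the registry, then search each one once."""
    seen, hits = set(), []
    for node in registry.find_query_nodes():
        hits.extend(node.search(term, seen))
    return hits
```

Sharing the `seen` set across calls keeps a node that is both registered and reachable through another node from being searched twice.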
EcoGrid Index Interactions
[Diagram: Client, Registry, six QueryService nodes, and an IndexedQueryService. Steps: 1. Register; 2. Find Query Nodes; 3. Search (recursive); 4. Read (recursive); 5. Find Index Nodes; 6. Search]
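One way to read the index interaction is that an index node harvests metadata from query nodes ahead of time so a client search can be answered without fanning out to every node. The sketch below assumes a simple word-level inverted index over titles; the `IndexedQueryService` class body, the `harvest` method, and the tokenization are hypothetical and not described in the slides.

```python
from collections import defaultdict


class IndexedQueryService:
    """Hypothetical index node: harvests titles once, then answers locally."""

    def __init__(self):
        # word -> set of (node_name, object_id) pairs that contain the word
        self._index = defaultdict(set)

    def harvest(self, node_name, records):
        """Add every title word from one node's {object_id: title} records."""
        for object_id, title in records.items():
            for word in title.lower().split():
                self._index[word.strip(",.")].add((node_name, object_id))

    def search(self, word):
        """Return sorted (node_name, object_id) pairs for an exact word match."""
        return sorted(self._index.get(word.lower(), set()))
```

The client would then read each hit from the node named in the pair, so the index stores only pointers and not the data objects themselves.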
Authentication and Authorization
- KNB uses a simple LDAP system with referrals
  - Leverages existing DB (e.g., LTER personnel DB)
  - Not really scalable in terms of administration
- Grid Security Infrastructure (GSI)
  - Certificate-based authentication
  - Proxy certificates allow transfer of rights
  - Decentralized administration (i.e., multiple CAs)
- Can we easily transition to GSI?
Native Range prediction workflow
Slide from D. Pennington
[Workflow diagram. Inputs retrieved via EcoGrid Query: KNB abundance data (a1); DiGIR species presence & absence points (a2); SRB environmental layers (b). Intermediate and output products: integrated layers (native range) (c); training sample (d); test sample (d); GARP rule set (e); native range prediction map (f); model quality parameter (g). Other labels: Layer Integration, Sample Data, Calculation, Map, Validation, User, EcoGrid Archive, +A1, +A2, +A3.]
Implementation
- Short-term
  - Define common WSDL services
  - Simple service registry
  - Wrappers for Metacat, DiGIR, SRB, Xanthoria, etc.
- Medium-term
  - Use OGSI-compliant interfaces (add methods to current WSDL)
  - Grid Registry service
Timing
- April 4
- April 11 -- Design Diagrams
- April 18 -- WSDL, Registry instance operational, query + read, RSIDS schema and examples.
- April 25
- May 2
- May 9 Wrapper implementations + test client(s)
- May 16 (SEEK Technical WG meeting)
- May 23
- May 30 -- Hard deadline for implementation of Eco-GRID alpha 1
Query Messages
<egq:query queryId="test.1.1" system="test"
    xmlns:egq="ecogrid://ecoinformatics.org/ecogrid-query-1.0.0alpha1">
  <namespace prefix="eml" space="eml://ecoinformatics.org/eml-2.0.0"/>
  <title>Soils metadata query</title>
  <AND>
    <OR>
      <condition operator="LIKE" concept="eml:title">%soil%</condition>
      <condition operator="LIKE" concept="eml:title">%dirt%</condition>
    </OR>
    <OR>
      <condition operator="LIKE" concept="eml:surName">%Jones%</condition>
      <condition operator="LIKE" concept="eml:surName">%Vieglais%</condition>
    </OR>
  </AND>
</egq:query>
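The query message above nests `condition` elements inside `AND`/`OR` groups. As a rough sketch of how a wrapper might evaluate such a message against one metadata record, the code below parses the XML with Python's standard `xml.etree.ElementTree` and walks the tree. The helper names (`like`, `matches`, `evaluate`), the record layout (a dict from concept to list of values), and case-insensitive `LIKE` matching are all assumptions; the slides do not define these semantics.

```python
import re
import xml.etree.ElementTree as ET

QUERY = """<egq:query queryId="test.1.1" system="test"
    xmlns:egq="ecogrid://ecoinformatics.org/ecogrid-query-1.0.0alpha1">
  <namespace prefix="eml" space="eml://ecoinformatics.org/eml-2.0.0"/>
  <title>Soils metadata query</title>
  <AND>
    <OR>
      <condition operator="LIKE" concept="eml:title">%soil%</condition>
      <condition operator="LIKE" concept="eml:title">%dirt%</condition>
    </OR>
    <OR>
      <condition operator="LIKE" concept="eml:surName">%Jones%</condition>
      <condition operator="LIKE" concept="eml:surName">%Vieglais%</condition>
    </OR>
  </AND>
</egq:query>"""


def like(pattern, value):
    """SQL-style LIKE with % wildcards (assumed case-insensitive here)."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, value, re.IGNORECASE | re.DOTALL) is not None


def matches(node, record):
    """Evaluate an AND/OR/condition element against {concept: [values]}."""
    if node.tag == "AND":
        return all(matches(child, record) for child in node)
    if node.tag == "OR":
        return any(matches(child, record) for child in node)
    if node.tag == "condition":
        if node.get("operator") != "LIKE":
            raise ValueError(f"unsupported operator: {node.get('operator')}")
        values = record.get(node.get("concept"), [])
        return any(like(node.text, value) for value in values)
    raise ValueError(f"unexpected element: {node.tag}")


def evaluate(query_xml, record):
    """Find the top-level logic element of the query and evaluate it."""
    root = ET.fromstring(query_xml)
    logic = next(child for child in root
                 if child.tag in ("AND", "OR", "condition"))
    return matches(logic, record)
```

For example, a record with title "Soil data from West Valley, 1983" and creators Jones and Smith satisfies both `OR` groups, so the `AND` holds.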
Result responses
<rs:resultset resultsetId="foo.1.1" system="http://knb.ecoinformatics.org/knb/"
    xmlns:rs='ecogrid://ecoinformatics.org/ecogrid-resultset-1.0.0alpha1'>
  <resultsetMetadata>
    <sendTime>2003-05-02T16:45:50-09:00</sendTime>
    <recordCount>86</recordCount>
  </resultsetMetadata>
  <records startRecord="1" endRecord="1" xmlns:eml='eml://ecoinformatics.org/eml-2.0.0'>
    <record number="1" identifier="bar.1.23">
      <eml:eml packageId="bar.1.23">
        <title>Soil data from West Valley, 1983</title>
        <creator>
          <individualName><surName>Jones</surName></individualName>
        </creator>
        <creator>
          <individualName><surName>Smith</surName></individualName>
        </creator>
        <keywordSet>
          <keyword>aves</keyword>
          <keyword>ornithology</keyword>
          <keyword>biodiversity</keyword>
        </keywordSet>
      </eml:eml>
    </record>
  </records>
</rs:resultset>
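A client receiving this response needs the total record count plus the identifier and a few fields from each returned record. The following sketch shows one possible way to extract them with Python's standard `xml.etree.ElementTree`. The `summarize` function and its output layout are hypothetical, and the embedded message is an abbreviated copy of the response above (the keyword set is omitted).

```python
import xml.etree.ElementTree as ET

RESULT = """<rs:resultset resultsetId="foo.1.1" system="http://knb.ecoinformatics.org/knb/"
    xmlns:rs="ecogrid://ecoinformatics.org/ecogrid-resultset-1.0.0alpha1">
  <resultsetMetadata>
    <sendTime>2003-05-02T16:45:50-09:00</sendTime>
    <recordCount>86</recordCount>
  </resultsetMetadata>
  <records startRecord="1" endRecord="1" xmlns:eml="eml://ecoinformatics.org/eml-2.0.0">
    <record number="1" identifier="bar.1.23">
      <eml:eml packageId="bar.1.23">
        <title>Soil data from West Valley, 1983</title>
        <creator><individualName><surName>Jones</surName></individualName></creator>
        <creator><individualName><surName>Smith</surName></individualName></creator>
      </eml:eml>
    </record>
  </records>
</rs:resultset>"""

# Namespace URI of the eml prefix, in ElementTree's {uri}tag notation.
EML = "{eml://ecoinformatics.org/eml-2.0.0}"


def summarize(xml_text):
    """Return (total record count, list of per-record summaries)."""
    root = ET.fromstring(xml_text)
    total = int(root.findtext("resultsetMetadata/recordCount"))
    rows = []
    for record in root.iter("record"):
        eml = record.find(f"{EML}eml")
        rows.append({
            "id": record.get("identifier"),
            "title": eml.findtext("title"),
            "creators": [s.text for s in eml.iter("surName")],
        })
    return total, rows
```

Note that `recordCount` (86) is the size of the whole result set, while `startRecord` and `endRecord` indicate that only one record is included in this message, so a client would page through the rest.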