Collaborative Research, Development and Demonstration

Ecoinformatics International Technical Collaboration

Copenhagen, Denmark

March 23, 2009

Bruce Bargmeyer
Lawrence Berkeley National Laboratory and Berkeley Water Center
University of California, Berkeley
Tel: +1 510-495-2905
[email protected]


Collaborative Research, Development and Demonstration

SciScope
– Microsoft, LBNL, Berkeley Water Center, EPA, USGS, EEA?
– Involves UCB researchers, data cubes for water data

Accomplished
– Demo using terminology connected to metadata and data to access STORET and NWIS water data
– SciScope running on computers at LBNL

Current effort
– Extending to include some things that were “hardwired” in the demo
– Documenting code, creating SDK
– Install at new site to validate June milestone

To discuss
– Further work on terminology and linkage to metadata
– Extensions: Citizen Observatories, Citizen Science
– Extensions: Social computing
– Hosting and Governance
– Consortium of Universities for the Advancement of Hydrologic Science (CUAHSI)


SciScope

STORET has 758 sites in Texas, TCEQ has 8407.

STORET has 47,602 sites in Florida, NWIS has 27,906.

NWIS has 121,545 sites in Minnesota, STORET has 22,260.

TCEQ data from David Maidment

Source: Bora Beran, Microsoft Research

Citizen Observatories & Citizen Science: Quick Examples

– Weather Underground
– Community Collaborative Rain, Hail and Snow Network (CoCoRaHS)
– Microsoft WorldWide Telescope


Building Blocks

– Terminology and Ontology
– Metadata Registration
– Collaborative authoring and Wikis
– Information and Data Modeling


The Wiki Way

Collaborative authoring and Wikis
– The ability for any member of the community to contribute to the resource
– A history mechanism
– The ability to add links, tags, keywords and classifications
– A rapid, "organic" evolution cycle
– The ability to leave information un- (or under-) specified
– Federation – virtual "wiki space"
– Multilingual

Semantic Wiki
– The ability to add "meaning" to links, tags, keywords and classifications
– The ability to import and export more formalized knowledge, e.g., ontology descriptions
– The ability to represent the same information across the "formalization spectrum" (!)
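Adding "meaning" to links can be illustrated with a short Python sketch. It assumes the `[[property::target]]` annotation syntax used by Semantic MediaWiki; the page name, property name, and page text are invented for illustration, and the function name `extract_links` is hypothetical. Plain links are kept with no property, which reflects the ability to leave information under-specified.

```python
import re

# Matches [[target]] and [[property::target]] (Semantic MediaWiki style).
LINK = re.compile(r"\[\[(?:([^\[\]:|]+)::)?([^\[\]|]+)\]\]")

def extract_links(page, text):
    """Return (subject, property, target) triples; property is None for plain links."""
    triples = []
    for match in LINK.finditer(text):
        prop, target = match.group(1), match.group(2)
        triples.append((page, prop.strip() if prop else None, target.strip()))
    return triples

# Hypothetical page text, for illustration only.
text = "Mercury is a [[is a kind of::Chemical contamination]]. See also [[Lead]]."
for triple in extract_links("Mercury", text):
    print(triple)
```

The typed link yields a statement that can be exported to a more formal representation, while the untyped link remains an ordinary wiki link until someone chooses to annotate it.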


Ecoinformatics Challenge: Draw Together Concept Systems, Metadata & Data

Facilitate discovery, access, use and understanding

Data:

ID    Date        Temp    Hg
A     06-09-13    4.4     4
B     06-09-13    9.3     2
X     06-09-13    6.7     78

Metadata:

Name    Datatype    Definition                       Units
ID      text        Monitoring Station Identifier    not applicable
Date    date        Date                             yy-mm-dd
Temp    number      Temperature (to 0.1 degree C)    degrees Celsius
Hg      number      Mercury contamination            micrograms per liter

Concept system:

Contamination
    Biological
    Chemical
        lead
        cadmium
        mercury
    Radioactive
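How the three layers could work together can be sketched in Python. This is a minimal illustration, not the method used in any of the systems named in this talk: the data and metadata values are taken from the slide, the concept hierarchy is read off the slide's diagram as best it can be recovered, and the `concept` link from a column to a concept, along with all function names, is an assumption added for the sketch.

```python
# Data: the sample observations from the slide.
data = [
    {"ID": "A", "Date": "06-09-13", "Temp": 4.4, "Hg": 4},
    {"ID": "B", "Date": "06-09-13", "Temp": 9.3, "Hg": 2},
    {"ID": "X", "Date": "06-09-13", "Temp": 6.7, "Hg": 78},
]

# Metadata: column descriptions from the slide, plus a "concept" link
# (the link itself is an assumption added for this sketch).
metadata = {
    "ID": {"definition": "Monitoring Station Identifier", "units": None, "concept": None},
    "Date": {"definition": "Date", "units": "yy-mm-dd", "concept": None},
    "Temp": {"definition": "Temperature (to 0.1 degree C)",
             "units": "degrees Celsius", "concept": None},
    "Hg": {"definition": "Mercury contamination",
           "units": "micrograms per liter", "concept": "mercury"},
}

# Concept system: child -> parent ("is a kind of").
broader = {
    "Biological": "Contamination",
    "Chemical": "Contamination",
    "Radioactive": "Contamination",
    "lead": "Chemical",
    "cadmium": "Chemical",
    "mercury": "Chemical",
}

def is_kind_of(concept, ancestor):
    """Walk up the hierarchy to test whether concept falls under ancestor."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = broader.get(concept)
    return False

def columns_about(ancestor):
    """Find data columns whose linked concept falls under the given concept."""
    return [name for name, meta in metadata.items()
            if meta["concept"] and is_kind_of(meta["concept"], ancestor)]

def describe(ancestor):
    """Return (station, column, value, units) for every matching column."""
    return [(row["ID"], col, row[col], metadata[col]["units"])
            for col in columns_about(ancestor) for row in data]

for hit in describe("Chemical"):
    print(hit)
```

A query for "Chemical" finds the Hg column even though the word "chemical" appears nowhere in the data or its metadata; the concept system supplies that connection, and the metadata supplies the units needed to interpret the values.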

Ecoinformatics Research

Web 2.0 Semantic Technology - the tools for creating, disseminating and using terminological resources such as dictionaries, classification schemes and ontologies have now reached the point that it is possible to maintain a centralized terminology that can be used to describe data resources in a queryable and interoperable fashion.

Metadata Repositories - ISO 11179 Edition 3 provides a common model to record, manage and disseminate semantic annotation of information resources.  This provides a framework for acquiring, integrating and sharing catalog content.

Semantic Wiki - wiki technology has demonstrated the viability of community generated content.  Semantic wiki has demonstrated that community generated content can exist across a continuum of formality.  This provides the ability to collect information about catalog content in a format and level of formality that suits each user's needs and to gradually transform user input into a shared structure and semantics for distribution, integration and sharing.

Data Modeling and Model Driven Architecture - it is now possible to automatically upload and integrate the contents of most modern SQL catalogs, information models, schemas, spreadsheet templates, etc.  This technology makes it possible to create an inventory of physical data resources very quickly.
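As a small illustration of building an inventory from a SQL catalog, the sketch below reads table and column definitions out of a database's own catalog. It uses SQLite only because it ships with Python; other database systems expose their catalogs differently, and the schema and the function name `inventory` are hypothetical, loosely based on the sample data table shown earlier.

```python
import sqlite3

def inventory(conn):
    """Return {table: [(column, declared_type), ...]} read from the SQLite catalog."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    result = {}
    for table in tables:
        # Quote the identifier so unusual table names are handled safely.
        quoted = '"' + table.replace('"', '""') + '"'
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        cols = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
        result[table] = [(c[1], c[2]) for c in cols]
    return result

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observation (ID TEXT, Date TEXT, Temp REAL, Hg REAL)")
for table, columns in inventory(conn).items():
    print(table, columns)
```

The catalog gives names and datatypes only; definitions, units and links to concepts still have to be added by people, which is where the registry and wiki components come in.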


Issues for Strategic Discussion


Strategic Discussion: Major Results

– XMDR
– ISO/IEC Standard
– Prototype - Open Source Software
– EPA System of Registries
– In-house DOD
– National Cancer Institute caBIG
– caDSR is a key part of caCORE
– Now demonstrating for DOE Nuclear Non-Proliferation
– Model Integration


caGrid Data Description Infrastructure

Client and service APIs are object oriented, and operate over well-defined and curated data types

Objects are defined in UML and converted into ISO/IEC 11179 Administered Components, which are in turn registered in the Cancer Data Standards Repository (caDSR)

Object definitions draw from controlled terminology and vocabulary registered in the Enterprise Vocabulary Services (EVS), and their relationships are thus semantically described

XML serialization of objects adheres to XML schemas registered in the Global Model Exchange (GME)
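The step from a modeled object to registered, semantically described components can be sketched as follows. This is not the caGrid, caDSR or EVS API. It is a simplified illustration of the ISO/IEC 11179 idea that a data element pairs a data element concept (an object class plus a property) with a value domain. All class names, function names, concept codes and the example class are invented.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Concept:
    """A term from a controlled vocabulary (codes here are invented)."""
    code: str
    label: str

@dataclass(frozen=True)
class DataElementConcept:
    object_class: Concept  # the kind of thing being described
    prop: Concept          # the characteristic being recorded

@dataclass(frozen=True)
class ValueDomain:
    datatype: str
    unit: Optional[str] = None

@dataclass(frozen=True)
class DataElement:
    name: str
    concept: DataElementConcept
    domain: ValueDomain

def to_data_elements(class_name, attributes, vocabulary):
    """Turn one modeled class into data elements, one per attribute.

    Raises KeyError if a name has no entry in the vocabulary, so every
    element produced is tied to a controlled term.
    """
    object_class = vocabulary[class_name]
    elements = []
    for attr_name, datatype in attributes.items():
        dec = DataElementConcept(object_class, vocabulary[attr_name])
        elements.append(
            DataElement(f"{class_name}.{attr_name}", dec, ValueDomain(datatype)))
    return elements

# Hypothetical vocabulary and class model, for illustration only.
vocabulary = {
    "Specimen": Concept("C0001", "Specimen"),
    "collectionDate": Concept("C0002", "Collection Date"),
}
registry = {de.name: de for de in to_data_elements(
    "Specimen", {"collectionDate": "date"}, vocabulary)}
print(registry["Specimen.collectionDate"].concept.prop.label)
```

Refusing to register an attribute that has no vocabulary entry is one possible design choice; it keeps every registered element semantically described, at the cost of requiring the terminology to exist first.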

[Diagram: a Grid Client (Client API) uses a Grid Service (Service API). The Service Definition is given in WSDL and the Data Type Definitions in XSD. Object Definitions are registered in the Cancer Data Standards Repository and semantically described in the Enterprise Vocabulary Services. Objects serialize to XML, which validates against XSDs registered in the Global Model Exchange (GME).]

Source: National Cancer Institute, caBIG

Collaborative R&D Issues

Ecoinformatics Activities & Research

Two kinds of activities:
– Advances/activities as part of current operations with internal agency resources. Ecoinformatics result: primarily technology transfer by sharing ideas.
– Activities requiring additional resources (contracts, research grants, …). Ecoinformatics result: technology transfer of ideas, research results, and tools/infrastructure.


Expressed Intent – Coordinate R&D in Ecoinformatics

Share cost and benefits through coordination of US & EU (& Asia?) ecoinformatics R&D

Identify key advances needed at the core of ecoinformatics
– Semantics management, semantics services, semantic computing
– Terminology web services
– IT support for indicators, …

Demonstrate in ecoinformatics “Test Bed”

Develop an “architecture” of advanced ecoinformatics technologies?

Research, Development and Demonstration projects ranging from improvements in operations to strategic breakthroughs


How to Share Tasks & Results of Ecoinformatics R&D

Conduct the international R&D in separate projects that are funded separately (but aware of others)

Conduct the R&D with interlocking workpackages/tasks/deliverables. International R&D with integrated results


Coordinating International Ecoinformatics R&D

Who funds whom?
– US NSF and EU 7th FP do not fund internal government agency activities. R&D funds go to academe, Govt. labs, and private industry.
– Environmental agencies have operational and “R&D” funding used to fund outside R&D organizations.
– Government staff participate on their agency’s own dime. Contract/award staff participate under project funding.


Coordinating International Ecoinformatics R&D

How to coordinate?
– Government agencies can collectively coordinate priority areas of international ecoinformatics R&D. (EEA, EC DGs, EPA, USGS, NSF, …)
– Funding agencies can declare that international joint R&D efforts are encouraged. (But will all partners get funded?)
– Government agencies can combine funding. However, it is difficult for funds to cross oceans (or major political boundaries).


Coordinating International Ecoinformatics R&D

Getting down to brass tacks:
– How do R&D organizations submit international proposals with interlocking workpackages/deliverables?
– How do R&D organizations submit proposals in which all of the international participants get funded?

Possible approaches:
– R&D proposals can be funded with expectation of future international linkage.
– Funding agencies can establish international linkage of proposal review process. More difficult as more agencies are involved.
– Different proposal & funding processes can be utilized, outside of the usual “call” or “RFP” process. Some kind of incremental process?


Coordinating International Ecoinformatics R&D

What are the next steps in coordinating international Ecoinformatics R&D?

Probably multiple approaches. Discussion
