Mark Schildhauer Director of Computing, NCEAS
Opportunities for earth science data interoperability through coordinated semantic development, using a shared model for observations and measurements
Mark Schildhauer
Director of Computing, NCEAS
Logan Utah: CUAHSI Conference on Hydrologic Data and Information Systems
June 2011
SONet
Integrative Environmental Research
Analyses require a wide range of data
– Broad scales: geospatial, temporal, biological (micro-macro)
– Diverse topics: abiotic and biotic phenomena
• Predicting impact of invasive insect species on crop production
• Documenting effects of climate change on forest composition
• Large amounts of relevant data…
– E.g., over 25,000 data sets are available in the Knowledge Network for Biocomplexity repository (KNB, http://knb.ecoinformatics.org)
• But researchers struggle to…
– Discover relevant datasets for a study
– Combine these into an integrated product to analyze
for Discovery, Access, Interpretation, Re-use:
SHARED KNOWLEDGE MODELS
• Need consistency and rigor in terminology
• Standardized protocols, methods when possible
• Interoperability (syntax)
• Comparability (semantics)
• Minimally, need a “shared community vocabulary”
• For hydrologists--- WATERML?
• For broader, integrative environmental science--- ?
SHARED KNOWLEDGE MODELS
• Metadata and keywords are a good start, but not enough: ambiguous, idiosyncratic, hard to parse
• Controlled vocabularies: an improvement, but can do more with today’s technology
SHARED KNOWLEDGE MODELS
– Ontologies provide a “shared vocabulary”
• Common “external” definitions (namespaces)
• for explicating relationships among terms
• describing data schemas (observations)
• for machine-assisted discovery, reasoning, integration
– Standard technologies for creating and operating on ontologies:
– Syntaxes: RDF, SKOS, OWL
– FOSS applications and frameworks: Jena, Protégé
– Standard reasoners: Pellet, FaCT++, Racer
Another Opportunity: Observational data
Environmental and earth science data often consists of “observations”
• Data sets are often stored in tables (e.g., flat files, spreadsheets)
• Represent collections of associated measurements
• Highly heterogeneous (format, content, semantics)
• (Cell) values represent measurements
Examples of “raw” observational data
Several prospective observation models…
| Project | Domain | Observational data model |
| --- | --- | --- |
| VSTO | Atmospheric sciences | Ontologies for interoperability among different meteorological metadata standards and other atmospheric measurements |
| SERONTO | Socioecological research | Ontology for integrating socio-ecological data |
| OGC’s O&M | Geospatial | Observations and Measurements standard for enhancing sensor data interoperability |
| SEEK’s OBOE | Ecology | Extensible Observation Ontology for describing data as observations and measurements |
| PATO’s EQ | Phenotype/Evolution | Underlying model for describing phenotypic traits to link with genomic data |
Observational Data Models
• High degree of similarity across independently derived models
• Opportunity to enable enhanced data interoperability and uniform access
– Domain-neutral “foundational” template
– Abstracts away underlying format issues
– Domain ontologies “extend” core concepts, to formalize semantics of terms used to describe measurements
Observational Data Model
• Implemented as an OWL-DL ontology
– Provides basic concepts for describing observations
– Specific “extension points” for domain-specific terms
[Diagram: UML class diagram of the core model. Classes: Entity, Characteristic, Observation, Measurement, Protocol, Standard, Value. Attributes shown: precision : decimal, method : anyType. Association roles: Context, ObservedEntity.]
Observational Data Model
Observations are of entities (e.g., River, Water, Sample, …)
– An observation can have multiple measurements
– Each measurement is taken of the observed entity
Observational Data Model
A measurement consists of
– The characteristic measured (e.g., Ammonium concentration)
– The standard used (e.g., unit, coding scheme)
– The measurement protocol
– The measurement value
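The parts of a measurement listed on this slide can be sketched as a pair of Python dataclasses. This is only an illustrative sketch, not the OWL-DL ontology itself: the field names mirror the slide's concepts, and the sample values (a 0.12 mg/L ammonium reading, a protocol called "colorimetric assay") are invented for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Measurement:
    """One measured value of one characteristic (hypothetical field names)."""
    characteristic: str               # what was measured
    standard: str                     # unit or coding scheme
    value: float                      # the measured value
    protocol: Optional[str] = None    # how it was measured, if recorded
    precision: Optional[float] = None


@dataclass
class Observation:
    """An observation of one entity, carrying any number of measurements."""
    entity: str
    measurements: List[Measurement] = field(default_factory=list)


# Invented example values, for illustration only.
obs = Observation(entity="WaterSample")
obs.measurements.append(
    Measurement("AmmoniumConcentration", "mg/L", 0.12,
                protocol="colorimetric assay")
)
obs.measurements.append(Measurement("Temperature", "Celsius", 14.5))

for m in obs.measurements:
    print(f"{obs.entity}: {m.characteristic} = {m.value} {m.standard}")
```

Keeping the entity on the observation, rather than on each measurement, reflects the point that every measurement in an observation is taken of the same observed entity.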
Observational Data Model
Observations can have context
– E.g. geographic, temporal, or biotic/abiotic environment in which some measurement was taken
– Context is an observation too (entity + characteristic)
– Context is transitive
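Because context is transitive, an observation inherits the contexts of its contexts. A minimal sketch of that idea, assuming context links are stored as a plain dictionary; the observation identifiers and the function name `all_contexts` are hypothetical:

```python
from typing import Dict, List, Set

# Direct context links between observations (hypothetical identifiers):
# the water-sample observation has the river observation as context,
# which in turn has the watershed observation as context.
has_context: Dict[str, List[str]] = {
    "obs_water_sample": ["obs_river"],
    "obs_river": ["obs_watershed"],
    "obs_watershed": [],
}


def all_contexts(obs_id: str, links: Dict[str, List[str]]) -> Set[str]:
    """Return every direct and inherited context of an observation."""
    found: Set[str] = set()
    stack = list(links.get(obs_id, []))
    while stack:
        current = stack.pop()
        if current in found:
            continue  # already visited; also guards against cycles
        found.add(current)
        stack.extend(links.get(current, []))
    return found


print(sorted(all_contexts("obs_water_sample", has_context)))
```

In an OWL setting the same effect would normally come from declaring the context property transitive and letting a reasoner derive the inherited links.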
Similarities among Observational Data Models
OGC’s Observations and Measurements (O&M)
[Diagram: classes OM_Observation, FeatureOfInterest, ObservedProperty, ObservationContext, Result, OM_Process; relationships ofFeature, forProperty, carrierOfCharacteristic, relatedContextObservation, hasResult, usesProcedure.]
Similarities among Observational Data Models
SEEK/Semtools Extensible Observation Ontology (OBOE)
[Diagram: classes Observation, Entity, Measurement, Characteristic, Standard, Protocol, Precision, Context (other Observation); relationships ofEntity, hasMeasurement, hasContext, hasCharacteristic, ofCharacteristic, usesStandard, usesProtocol, hasPrecision, hasValue.]
Similarities among Observational Data Models
SERONTO basic classes
[Diagram: classes value_set, physical_thing, parameter_method, parameter, method, selection_description, scale, unit, value_nominal, value_float; relationships hasParameterMethod, hasInvestigationItem, hasSample, hasMethod, hasParameter, hasScale, hasUnit, hasValue.]
SHARED KNOWLEDGE MODELS
NSF INTEROP program: foster communication among domains to enable greater interoperability
• Scientific Observations Network, SONet
• Many earth and life science domains participating
• Advanced conceptual modeling
• Unifying abstraction of ‘observation’
• Semantic web & ontologies
• Domain scientists & knowledge engineers
Developing a core model (SONet project)
Identify the key observational models in the earth and environmental sciences
Are these various observational models easily reconciled and/or harmonized?
Are there special capabilities and features enabled by some observational approaches?
What services should be developed around these observational models?
Similarities among Observational Data Models
| OBOE | O&M |
| --- | --- |
| Entity | FeatureOfInterest |
| Characteristic | ObservedProperty |
| Measurement | OM_Observation |
| Protocol | OM_Process |
| Standard, Value, Precision | Result |
| Context | ObservationContext |
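The one-to-one correspondences in this comparison can be written down as a lookup table and used to relabel a record from OBOE terms to O&M terms. This is a rough sketch only: the function name `relabel` and the sample record are hypothetical, and terms without a simple one-to-one counterpart (Standard, Value, Precision, Result) are deliberately left out of the mapping. A plain rename also glosses over any structural differences between the two models.

```python
from typing import Dict

# One-to-one term correspondences taken from the comparison above.
OBOE_TO_OM: Dict[str, str] = {
    "Entity": "FeatureOfInterest",
    "Characteristic": "ObservedProperty",
    "Measurement": "OM_Observation",
    "Protocol": "OM_Process",
    "Context": "ObservationContext",
}


def relabel(record: Dict[str, str], mapping: Dict[str, str]) -> Dict[str, str]:
    """Rename the keys of a record, leaving unmapped keys unchanged."""
    return {mapping.get(key, key): value for key, value in record.items()}


# Hypothetical record described with OBOE terms.
oboe_record = {
    "Entity": "River",
    "Characteristic": "AmmoniumConcentration",
    "Protocol": "colorimetric assay",
    "Standard": "mg/L",
}

om_record = relabel(oboe_record, OBOE_TO_OM)
print(om_record)
```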
SONet/Semtools Semantic Approach
• Data -> metadata -> annotations -> ontologies
• Annotations link EML metadata elements to concepts in ontology through Observation Ontology
• EML metadata describe data and its structures
Linking data values to concepts through observations
• Link data (or metadata) through observational data model to terms from domain-specific ontologies
• Context can inter-relate values in a tuple
• Can provide clarification of semantics of data set as a whole, not just “independent” measurements
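The linking step can be illustrated with a small sketch that attaches an annotation to each column of a raw table and then reads every cell as a measurement. Everything here is assumed for illustration: the table contents, the column names, the ontology terms, and the `Annotation` and `annotate` names are hypothetical and are not taken from EML or the Semtools software.

```python
import csv
import io
from typing import Dict, List, NamedTuple


class Annotation(NamedTuple):
    """Links one data column to observation-model concepts (hypothetical)."""
    entity: str
    characteristic: str
    standard: str


# Invented raw table: each row describes one water sample.
RAW = """site,temp,nh4
A,14.5,0.12
B,15.1,0.30
"""

# Invented attribute mappings from column names to ontology terms.
ANNOTATIONS: Dict[str, Annotation] = {
    "site": Annotation("River", "Name", "SiteCode"),
    "temp": Annotation("WaterSample", "Temperature", "Celsius"),
    "nh4": Annotation("WaterSample", "AmmoniumConcentration", "mg/L"),
}


def annotate(raw: str, annotations: Dict[str, Annotation]) -> List[dict]:
    """Turn each cell into a measurement described by its column's annotation."""
    out = []
    for row_number, row in enumerate(csv.DictReader(io.StringIO(raw))):
        for column, cell in row.items():
            ann = annotations[column]
            out.append({
                "row": row_number,  # cells from the same row share one tuple
                "entity": ann.entity,
                "characteristic": ann.characteristic,
                "standard": ann.standard,
                "value": cell,
            })
    return out


measurements = annotate(RAW, ANNOTATIONS)
print(len(measurements))
print(measurements[2])
```

Recording the row number alongside each measurement is one simple way to keep track of which values belong to the same tuple, so that context can later relate them to each other.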
Semantic annotation
Marburg 2011
Attribute mappings
How to use observational data models…
linking observational data models to data…
Special Mojo of OWL Ontologies
• Class hierarchies
– Parent, sibling, child class relationships
• Object properties
– to relate instances between classes
• reflexive, symmetric, transitive
• Specify domain and ranges
• Datatype properties
– to relate instances to values
• Cardinality
• Polyhierarchies
Special Mojo of OWL Ontologies
• Reasoning offers axioms such as:
– Disjointness: e.g. can’t be both X and Y
• inSediment AND inWaterColumn
– Equivalence (classes) or Same_as (instances)
• Synonymy across namespaces
– Properties for mereology
• Composite of
• Contained in
• Connected to
– Reasoner can infer relationships, determine inconsistencies in assertions
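What a reasoner does with subclass and disjointness axioms can be imitated on a toy scale in plain Python. This sketch is not an OWL reasoner and does not use Pellet, FaCT++, or Racer. It only shows the two ideas named above: inferring class membership from a hierarchy, and flagging assertions that violate a disjointness axiom. The class `inPoreWater`, the parent class `Location`, and all function names are invented for the example.

```python
from typing import Dict, FrozenSet, List, Set

# Toy class hierarchy (child -> parents); names are for illustration only.
SUBCLASS_OF: Dict[str, List[str]] = {
    "inSediment": ["Location"],
    "inWaterColumn": ["Location"],
    "inPoreWater": ["inSediment"],
}

# Pairs of classes asserted to be disjoint.
DISJOINT: Set[FrozenSet[str]] = {frozenset({"inSediment", "inWaterColumn"})}


def ancestors(cls: str) -> Set[str]:
    """Return the class itself plus every superclass reachable from it."""
    seen: Set[str] = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, []))
    return seen


def inferred_types(asserted: Set[str]) -> Set[str]:
    """Infer all classes an instance belongs to from its asserted classes."""
    result: Set[str] = set()
    for cls in asserted:
        result |= ancestors(cls)
    return result


def is_consistent(asserted: Set[str]) -> bool:
    """False if the inferred classes include both members of a disjoint pair."""
    types = inferred_types(asserted)
    return not any(pair <= types for pair in DISJOINT)


print(sorted(inferred_types({"inPoreWater"})))
print(is_consistent({"inPoreWater"}))
print(is_consistent({"inPoreWater", "inWaterColumn"}))
```

The last call reports an inconsistency even though `inSediment` was never asserted directly: it is inferred from `inPoreWater`, and the inferred class is what collides with `inWaterColumn`.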
Special Mojo of Observations
• Enables faceted discovery along entity & characteristic hierarchies
• Economical use of concepts: don’t need to have a “red-colored eye”, or “red-colored wing”
– instead re-use concept of “red” with variety of entities
• Express whether observations taken from the same instance or not (tuple explication)
– E.g. Multiple chemical concentrations measured from single water sample
• Use of equivalence class (measurement types) to apply to realized measurements in data
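Faceted discovery along separate entity and characteristic hierarchies might look like the following sketch. The hierarchies, dataset names, and the `is_a` and `search` functions are all invented for illustration. The point is that "Red" is defined once and combined with different entities, rather than minting a separate concept for each pairing.

```python
from typing import Dict, List, Optional, Tuple

# Toy hierarchies (child -> parent); all terms are invented for illustration.
ENTITY_PARENT: Dict[str, str] = {"Eye": "BodyPart", "Wing": "BodyPart"}
CHARACTERISTIC_PARENT: Dict[str, str] = {"Red": "Color", "Blue": "Color"}

# Each entry pairs a dataset with the entity and characteristic it measures.
DATASETS: List[Tuple[str, str, str]] = [
    ("dataset_1", "Eye", "Red"),
    ("dataset_2", "Wing", "Red"),
    ("dataset_3", "Wing", "Blue"),
]


def is_a(term: str, facet: str, parent: Dict[str, str]) -> bool:
    """True if term equals facet or is a descendant of it."""
    current: Optional[str] = term
    while current is not None:
        if current == facet:
            return True
        current = parent.get(current)
    return False


def search(entity: Optional[str] = None,
           characteristic: Optional[str] = None) -> List[str]:
    """Return dataset names matching the requested facets (None = any)."""
    hits = []
    for name, ent, char in DATASETS:
        if entity is not None and not is_a(ent, entity, ENTITY_PARENT):
            continue
        if characteristic is not None and not is_a(
                char, characteristic, CHARACTERISTIC_PARENT):
            continue
        hits.append(name)
    return hits


print(search(characteristic="Red"))
print(search(entity="Wing", characteristic="Color"))
```

A query on a broad facet such as "Color" picks up datasets annotated with any narrower term, which is the benefit of searching along a hierarchy instead of matching keywords.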
Ontology Design Pattern
Ontology Design Pattern
ThesauForm: LaPorte, Huguenot & Garnier
TraitNet: Bunker, Ahrestani, Naeem
Acknowledgements
Mark Schildhauer*, Matthew B. Jones, Ben Leinfelder: NCEAS, Santa Barbara CA, USA
Luis Bermudez: Open Geospatial Consortium Inc., Wayland MA, USA
Shawn Bowers: Gonzaga University, Spokane WA, USA
Phillip C. Dibner: OGCii, Berkeley CA, USA
Corinna Gries: University of Wisconsin, Madison WI, USA
Deborah L. McGuinness: Rensselaer Polytechnic Institute, Troy NY, USA
Margaret O’Brien: UCSB, Santa Barbara CA, USA
Huiping Cao: New Mexico State University, Las Cruces NM, USA
Simon J.D. Cox: Earth Science & Resource Engrg, CSIRO, Bentley WA, AUS
Steve Kelling, Carl Lagoze: Cornell University, Ithaca NY, USA
Hilmar Lapp: NESCent, Durham NC, USA
Joshua Madin: Macquarie University, Sydney NSW, AUS
* presenter
This material is based upon work supported by the National Science Foundation under Grant Numbers 0743429, 0753144.
FIN
“How many fingers, Winston?”
Orwell, 1984