Semantic Interoperability for Data in Context IG

RDA Plenary 3: Friday 28th March 2014 (Day 3)

Gary Berg-Cross (SOCoP, RDA DFT WG co-chair), gbergcross@gmail.com

Page 2:

DFT Basic Digital & Data Concepts - data is inherently collective data

• Digital Data refers to a structured sequence of bits/bytes that represents information content. In many contexts digital data and data are used interchangeably implying both the bits and the content.

• Real-Time Data is data (or a data collection) that is produced on its own schedule, has a tight time relation to the processes that create it, and requires immediate action. Timeliness, such as real time, is an attribute of data.

• Dynamic Data is a type of data which is changing frequently and asynchronously.

• Note: "dynamic data" has also been used in the context of workflows: a workflow that is executed can be called a "dynamic data object", or the results from executing the workflow can be called a "dynamic data object".

• Referable Data is a type of data (digital or not) that is persistently stored and which is referred to by a persistent identifier. Digital data may be accessed by the identifier. Some data object references may access a service on the object (OAI-ORE).

• Citable Data is a type of referable data that has undergone quality assessment and can be cited in publications.
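
The definitions above nest: citable data is referable data with a quality assessment, and referable data is persistently stored data with a persistent identifier. A minimal Python sketch of that nesting follows. The class name `DataObject`, its field names, and the identifier string are assumptions made for illustration only; they are not DFT vocabulary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataObject:
    """Hypothetical record holding the attributes the definitions rely on."""
    content: bytes                     # the bits/bytes carrying the information content
    persistently_stored: bool = False  # has the data been persistently stored?
    pid: Optional[str] = None          # persistent identifier, if one was assigned
    quality_assessed: bool = False     # has the data undergone quality assessment?


def is_referable(d: DataObject) -> bool:
    # Referable: persistently stored AND referred to by a persistent identifier.
    return d.persistently_stored and d.pid is not None


def is_citable(d: DataObject) -> bool:
    # Citable: referable data that has also undergone quality assessment.
    return is_referable(d) and d.quality_assessed


# Illustrative objects; the identifier below is made up.
raw = DataObject(content=b"\x01\x02")
archived = DataObject(content=b"\x01\x02", persistently_stored=True, pid="hdl:example/123")
reviewed = DataObject(content=b"\x01\x02", persistently_stored=True,
                      pid="hdl:example/123", quality_assessed=True)

print(is_referable(raw), is_referable(archived), is_citable(archived), is_citable(reviewed))
```

Modelling the categories as predicates over attributes, rather than as a class hierarchy, keeps room for the point that timeliness and similar qualities are attributes of data.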

Page 3:

Background to this Semantic Interoperability

• Long-standing work on “data integration and sharing”.

• Semantics is FEATURED in the Application layer of OSI

• Intensive work in the AI & knowledge engineering areas.

• But to many the goal of semantic interoperability remains elusive.

• More recently the Semantic Web thrust pursued the goal of robust semantic interoperability & robust exchange of data.

• Fulfilling the SW vision needs deep knowledge and support for reasoning.

Page 4:

SI has a Socio-Tech Aspect

Who is doing what?

How to understand the problem?

What are the critical issues?

Is it a knowledge representation problem?

What is the role of ontology?

What is the role of tools?

What are the best methods?

Re-use and integration of data from heterogeneous sources within and across discipline boundaries has not been routinely achieved.

Special technologies that infer, relate, interpret, and classify the implicit meanings of digital content are not easily adapted to the topical research interest or enfolded in traditional architectures.

Use an agile approach, based on sets of competency questions?

Don’t try too hard to train a Domain Expert in Gold Standard formal semantics?

Since meaning is a cognitive agent phenomenon, semantic interoperability is the technical analogue to human communication and cooperation. That makes it intrinsically HARD.

Use metadata semantic annotation?

Page 5:

Graphic Overview of Semantic/Ontology Manifesto (EarthCube)

Knowledge Infrastructure Vision

Community Understanding of Semantic role and value

Guiding principles

1. Use Cases
2. Lightweight, opportunistic methods
3. Semantic interoperability with semantic heterogeneity
4. Bottom-up & top-down approaches
5. Domain - ontology engineer teams
6. Formalized bodies of knowledge across science domains
7. Broader “Reasoning” services

“Insertion”

Architecture & Workflow Between

Based on the work of (alphabetically) Gary Berg-Cross, Isabel Cruz, Mike Dean, Tim Finin, Mark Gahegan, Pascal Hitzler, Hook Hua, Krzysztof Janowicz, Naicong Li, Philip Murphy, Bryce Nordgren, Leo Obrst, Mark Schildhauer, Amit Sheth, Krishna Sinha, Anne Thessen, Nancy Wiegand, and Ilya Zaslavsky

Paper at http://stko.geog.ucsb.edu/gibda2012/gibda2012_submission_6.pdf

Page 6:

Lightweight Methods & Products

• Choose lightweight approaches to support application needs and reduce entry barriers

• Low-hanging fruit leverages initial vocabularies & existing conceptual models to ensure that a semantics-driven infrastructure is available for early use.

Simple parts/patterns & direct relations to data; triple-like parts

More relation types here, bottom up.

A useful set of ideas that supports a useful subset of (approximate) reasoning

Page 7:

GeoSpatial Data & Web Feature Service Standardizes Terms but Lacks Semantics

Terminologies created independently, based on different conceptual models, differ not only in terms/vocabulary but also in meanings.

Page 8:

Better Conceptualization of Properties - for Interoperability (CUAHSI)

Organize properties like size as a physical quality, since it inheres in a physical object.

Take qualities like physical, bulk, & measured properties (stream flow, level, pollutants, evapotranspiration, etc.) and make them useable concepts rather than level concepts.

• Currently CUAHSI has them at many levels

• E.g. 2291 Major, bulk properties 4

Figure labels: Water Body; Water Density; Unit; Grams/cm3

For connecting to Chem/BioChem ontologies there might be sub-categories of Physical for elements – optical, hardness, color

See Dumontier Lab ontologies to represent bio-scientific concepts and relations. http://dumontierlab.com/?page=ontologies

Figure (relation graph for ChesapeakeBay). Node labels: ChesapeakeBay, Area, AreaQuantity, Real Number, Sq Miles. Relation labels: IsA, HasFeature, hasQuantity, hasValue, hasUnit, hasConstituent, usesStandard, hasLayer …..
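
The labels from the Chesapeake Bay figure can be read as triple-like parts of the lightweight kind discussed earlier. The Python sketch below is one assumed reading: the figure's exact edges are not recoverable from the slide text, so the way nodes are wired together here, and the helper name `objects`, are illustrative guesses only.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

# Assumed arrangement of the figure's labels. The slide text does not
# preserve which node each relation connects.
triples: List[Triple] = [
    ("ChesapeakeBay", "isA", "WaterBody"),
    ("ChesapeakeBay", "hasFeature", "Area"),
    ("Area", "hasQuantity", "AreaQuantity"),
    ("AreaQuantity", "hasValue", "RealNumber"),
    ("AreaQuantity", "hasUnit", "SqMiles"),
]


def objects(subject: str, relation: str) -> List[str]:
    """Return every object linked to `subject` by `relation`."""
    return [o for s, r, o in triples if s == subject and r == relation]


print(objects("AreaQuantity", "hasUnit"))
print(objects("ChesapeakeBay", "hasFeature"))
```

Keeping the unit and the value as separate relations on the quantity is what lets a consumer of the data ask for the unit without parsing a free-text field.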

Page 9:

Incrementally Adding Better Semantic Relations/Properties

Data models & SKOS offer some relations, but they are limited. SKOS is more useful for terms than concepts.

Consider irreflexive, anti-symmetric & transitive constructs that capture common understanding.

Observation: Streams and lakes flow into rivers.

• Property “flows-into” is irreflexive
  • any one river cannot flow into itself as a loop

• “flows-into” is also anti-symmetric
  • if one river flows into the second, the second one can’t flow into the first.

• Transitive property for Regions: the subRegionOf property between regions is transitive

• <owl:TransitiveProperty rdf:ID="subRegionOf">
    <rdfs:domain rdf:resource="#Region"/>
    <rdfs:range rdf:resource="#Region"/>
  </owl:TransitiveProperty>

If Logan, Cache County and Utah are regions, Logan is a subRegion of Cache County, and Cache County is a subRegion of Utah, then Logan is also a subRegion of Utah.
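
The three property characteristics above can be checked mechanically on small relations. The Python sketch below is an illustration, not a reasoner: relations are plain sets of pairs, the stream and lake names are hypothetical, and the function names are assumptions. In OWL 2 the corresponding declarations would be owl:IrreflexiveProperty and owl:AsymmetricProperty; the "can't flow back" condition on the slide is what OWL 2 calls asymmetry.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]


def is_irreflexive(rel: Set[Pair]) -> bool:
    # No element is related to itself.
    return all(a != b for a, b in rel)


def is_asymmetric(rel: Set[Pair]) -> bool:
    # If (a, b) holds, (b, a) must not hold.
    return all((b, a) not in rel for a, b in rel)


def transitive_closure(rel: Set[Pair]) -> Set[Pair]:
    # Repeatedly add (a, d) whenever (a, b) and (b, d) are present.
    closure = set(rel)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c}
        if new <= closure:
            return closure
        closure |= new


# Hypothetical water bodies, used only to exercise the checks.
flows_into = {("Stream A", "River X"), ("Lake B", "River X")}

# The region example from the slide.
sub_region_of = {("Logan", "Cache County"), ("Cache County", "Utah")}

print(is_irreflexive(flows_into), is_asymmetric(flows_into))
print(("Logan", "Utah") in transitive_closure(sub_region_of))
```

The closure step is the inference a reasoner draws from the transitive declaration: the pair (Logan, Utah) is never asserted, yet it follows.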

Page 10:

Grafton Street Dublin in Context

Grafton Street (Irish: Sráid Grafton) is one of the two principal shopping streets in Dublin city centre.

Do we refer to it as a pedestrian mall or a shopping street?

Is it a road object but with motor traffic restrictions?

Or a public place?

Or a non-identifiable part of the city surface?

OpenStreetMap

All such references are usually outside a computer

Page 11:

What Grafton Street Is Depends on Its Setting: What Time We Are Talking About, AND What Features

Grafton Street

1814 AD or 2014?

Transport or commerce features?
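
One way to make the setting explicit is to key each description by its context rather than attaching a single type to the street. The Python sketch below assumes a context is just a (year, perspective) pair. The two 2014 entries reuse descriptions from the previous slide, and their assignment to a perspective is an assumption. The 1814 entries are deliberately left out because the slides do not say how the street should be described then.

```python
from typing import Dict, Optional, Tuple

Context = Tuple[int, str]  # (year, perspective); a deliberately simple stand-in

# Hypothetical mapping from context to a description of Grafton Street.
DESCRIPTIONS: Dict[Context, str] = {
    (2014, "transport"): "road object with motor traffic restrictions",
    (2014, "commerce"): "shopping street",
}


def describe(year: int, perspective: str) -> Optional[str]:
    """Return the description asserted for this context, or None if there is none."""
    return DESCRIPTIONS.get((year, perspective))


print(describe(2014, "transport"))
print(describe(2014, "commerce"))
print(describe(1814, "transport"))  # nothing asserted for this setting
```

Returning `None` for an unasserted context, instead of falling back to a default type, keeps the gap visible to whoever integrates the data.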

Page 12:

Semantics in Context: Connecting 3 Views for Geography/GIScience Knowledge

Philosophy, Psychology Perspectives …..

GeoReality

Name Reality

Understand Reality: Data evidence

Knowledge/GeoConcepts

This is different than regular land and water

Maybe there is more than 1 type of boundary

Models representing Geo-Knowledge

Model to express what you understand

Task: Regiment Language

Wetland … geo-entity … what boundary?

Flows Into isa Type of connected-to

Boundary segments = straight lines, so overall boundary is a polyline…

Page 13:

Example of (Powerful) Challenges – Semantic Mismatches, Inclusions & Alignments

Language level for expressing semantics

• Syntax and logical representation differences of the past should be handled by standardization & rule translations.

• Different expressivity (OWL vs. Common Logic) might be harder.

Ontology level (Grafton example)

• Different conceptualizations such as different class scope, hierarchy level differences, coverage or granularity.

• Scientists use different concepts & categories;

• What does it mean to say that concept P includes concept S?

• What does it mean to say that concepts P and S are semantically close?

• Scientific understanding often requires existing concepts to be revised or supplanted in the field

• Perspective: 4D vs. 3D, roads as straight lines or curves, time as interval or ratio…..

• Tacit assumptions (when messaging, an agent has in mind a number of “unspoken,” implicit consequences of that message): “You can’t drive on Grafton”….

Pragmatics of Intentions & goals (also Grafton example)

We have different goals so application & use are targeted. We need to adjust conceptualization to accommodate these.
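
The questions above about inclusion and semantic closeness have no single answer. One possible, purely extensional reading treats a concept as the set of things it applies to: P includes S when every instance of S is an instance of P, and closeness is the overlap between the two sets (Jaccard similarity). The Python sketch below assumes that reading; it is not the approach the slides prescribe, and the instance sets are made up for illustration.

```python
from typing import Set


def includes(p: Set[str], s: Set[str]) -> bool:
    # Extensional inclusion: every instance of S is also an instance of P.
    return s <= p


def closeness(p: Set[str], s: Set[str]) -> float:
    # Jaccard similarity: shared instances divided by all instances.
    union = p | s
    return len(p & s) / len(union) if union else 1.0


# Illustrative extensions only; not taken from any real dataset.
public_place = {"Grafton Street", "Henry Street", "St Stephen's Green"}
shopping_street = {"Grafton Street", "Henry Street"}

print(includes(public_place, shopping_street))
print(includes(shopping_street, public_place))
print(closeness(public_place, shopping_street))
```

An extensional measure says nothing about differences in perspective or tacit assumptions, which is part of why the slide treats these mismatches as hard.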

Page 14:

One View of Semantic Representation & Heterogeneity

A challenge of deep semantic interoperability is that:

• A global, one-size-fits-all (Gold Standard) representation for each distinct situation, such as Grafton St. represented by data, is not realistic,

• and its procrustean nature may not be desirable if it ignores real heterogeneity

• The judgment of some (cf. John Sowa) is that different representations might be optimal for different use cases

• Different levels of detail or granularity, along with different kinds of data entry options seem in practice suitable for different domains and settings.

• Since scientific research is diverse, and evolving, what approach to granular standards can be developed for use?

• Perhaps it is to use formal semantics to narrow the range of ambiguity for particular purposes.