Federated data stores using semantic web technology

Federated Data Stores using Semantic Web Technology Steve Ray Distinguished Research Fellow Carnegie Mellon University


Page 1: Federated data stores using semantic web technology

Federated Data Stores using

Semantic Web Technology

Steve Ray

Distinguished Research Fellow

Carnegie Mellon University

Page 2: Federated data stores using semantic web technology

Interoperability is all about DATA

Three Technology Trends

that could help*

1. Semantic Web technologies

2. Cloud

3. Natural Language Processing

I will focus on semantic web technologies

*Inspired by “Top Three Technologies to Tame the Big Data Beast,” Huffington Post, 11/22/2011

Steve Ray, Carnegie Mellon University

Page 3: Federated data stores using semantic web technology

Representation Trends

[Figure: chart of representation technologies, grouped as Legacy, Current Practice, and Exploratory. Labels shown: IBM Card Format, EDI, XML, XML Schema, Metadata, Metamodels, Meta-metamodels, Info Modeling, RDF/OWL, FOL, Web Services Protocols, BPML/BPEL, SOA, CBA, Semantic Mediation.]

(Slide adapted from Donald Hall, Logistics Enterprise Services Office, DLA)

Page 4: Federated data stores using semantic web technology

Why Consider RDF & OWL

Semantic Web Technology?

RDF = Resource Description Framework

OWL = Web Ontology Language

1. Simple representation

– Everything is a triple: <subject – predicate – object>

2. Self-describing models

– Schemas and data coexist in data stores

3. Easy to interrogate

– SPARQL queries (over schema and data)

4. Easy to validate

– Supports automated reasoning

5. Easy to interoperate

– Natively supports distributed data stores

Page 5: Federated data stores using semantic web technology

Simple Representation

Everything is stored as triples:

<subject predicate object>
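As a rough illustration of the triple idea, here is a minimal Python sketch. It assumes a plain set of 3-tuples can stand in for an RDF store and that bare strings can stand in for the IRIs a real store would use; the `add` helper is hypothetical and not part of any RDF library.

```python
# A toy triple store: a set of (subject, predicate, object) tuples.
# Plain strings stand in for the IRIs a real RDF store would use.
triples = set()


def add(subject, predicate, obj):
    """Record one statement as a triple (hypothetical helper)."""
    triples.add((subject, predicate, obj))


add("george", "is-a", "Human")
add("george", "marriedTo", "lisa")

# Every fact, whatever it describes, has the same three-part shape.
for s, p, o in sorted(triples):
    print(s, p, o)
```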

Page 6: Federated data stores using semantic web technology

Self-Describing Models

• The schema (model) and the data are stored in the same place

• Schema:

– Mammal subClassOf Animal

– Human subClassOf Mammal

• Data:

– george is-a Human

– george marriedTo lisa
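The sketch below suggests why keeping schema and data together is useful: one lookup can walk from a data statement into the schema. It is a simplified Python stand-in for an RDF store, and the helper names `objects` and `classes_of` are assumptions made for illustration.

```python
# Schema and data live in one set of triples (a stand-in for an RDF store).
store = {
    # Schema
    ("Mammal", "subClassOf", "Animal"),
    ("Human", "subClassOf", "Mammal"),
    # Data
    ("george", "is-a", "Human"),
    ("george", "marriedTo", "lisa"),
}


def objects(subject, predicate):
    """Return every object o such that (subject, predicate, o) is stored."""
    return {o for s, p, o in store if s == subject and p == predicate}


def classes_of(individual):
    """Collect the asserted classes of an individual plus all their superclasses."""
    found = set()
    pending = list(objects(individual, "is-a"))
    while pending:
        cls = pending.pop()
        if cls not in found:
            found.add(cls)
            pending.extend(objects(cls, "subClassOf"))
    return found


print(sorted(classes_of("george")))  # ['Animal', 'Human', 'Mammal']
```

Because the schema is just more triples, no separate schema lookup is needed to learn that george is also a Mammal and an Animal.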

Page 7: Federated data stores using semantic web technology

Easy to Interrogate

SPARQL†: a language to query an RDF database

(Just matches against patterns of triples)

SELECT ?x
WHERE {
  george marriedTo ?x .
}

Returns a table:

x

lisa

SELECT ?y
WHERE {
  ?y subClassOf Animal .
}

Returns a table:

y

Mammal

† SPARQL = SPARQL Protocol and RDF Query Language
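To make "matches against patterns of triples" concrete, here is a small Python sketch of single-pattern matching. It is not a SPARQL engine: the `match` function is a hypothetical helper that handles one triple pattern only, and it treats any term starting with `?` as a variable.

```python
# The same schema and data triples used on the earlier slides.
store = {
    ("Mammal", "subClassOf", "Animal"),
    ("Human", "subClassOf", "Mammal"),
    ("george", "is-a", "Human"),
    ("george", "marriedTo", "lisa"),
}


def match(pattern, triples):
    """Return one binding dict per triple that matches the pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    results = []
    for triple in sorted(triples):
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value both times.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


print(match(("george", "marriedTo", "?x"), store))  # [{'?x': 'lisa'}]
print(match(("?y", "subClassOf", "Animal"), store))  # [{'?y': 'Mammal'}]
```

Note that the second pattern returns only Mammal: plain matching finds direct statements, so Human is not reported as a subclass of Animal unless a reasoner adds that triple.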

Page 8: Federated data stores using semantic web technology

Easy to Validate

SPARQL can be used for reasoning, not just interrogating

If
  George sonOf Fred and
  Fred siblingOf Mary
Then
  George nephewOf Mary

In SPARQL:

CONSTRUCT {
  ?a nephewOf ?c .
}
WHERE {
  ?a sonOf ?b .
  ?b siblingOf ?c .
}
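The rule above can be approximated outside SPARQL as a join over two triple patterns. The Python sketch below assumes a set of tuples as the store; `construct_nephews` is a hypothetical helper that mirrors the CONSTRUCT/WHERE shape and is not an actual SPARQL evaluator.

```python
# Facts taken from the slide, stored as (subject, predicate, object) tuples.
facts = {
    ("George", "sonOf", "Fred"),
    ("Fred", "siblingOf", "Mary"),
}


def construct_nephews(triples):
    """Derive (a nephewOf c) wherever (a sonOf b) and (b siblingOf c) both hold."""
    return {
        (a, "nephewOf", c)
        for a, p1, b in triples
        if p1 == "sonOf"
        for b2, p2, c in triples
        if p2 == "siblingOf" and b2 == b
    }


inferred = construct_nephews(facts)
print(inferred)  # {('George', 'nephewOf', 'Mary')}

# Adding the inferred triples back to the store makes them queryable like any other fact.
facts |= inferred
```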

Page 9: Federated data stores using semantic web technology

Easy to Interoperate

• A single query can interact with more than one RDF database

– Linked Movie Database contains movies, actors

– DBpedia contains people and birthdates

• Find the birthdates of all Star Trek actors

– Answer does not exist in one source
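A rough Python sketch of the idea follows. Two in-memory sets stand in for the two remote stores, and every actor name, predicate name, and date in them is an invented placeholder, not real data from the Linked Movie Database or DBpedia. In SPARQL 1.1 this kind of cross-store query is written with the SERVICE keyword; the `star_trek_birthdates` function here only imitates the resulting join.

```python
# Hypothetical stand-ins for two independent RDF stores.
# All identifiers and dates below are invented placeholders.
movie_store = {
    ("actor_a", "actedIn", "Star Trek"),
    ("actor_b", "actedIn", "Star Trek"),
    ("actor_c", "actedIn", "Some Other Film"),
}
people_store = {
    ("actor_a", "birthDate", "1900-01-01"),
    ("actor_b", "birthDate", "1900-01-02"),
    ("actor_c", "birthDate", "1900-01-03"),
}


def star_trek_birthdates(movies, people):
    """Join the two stores on the shared actor identifier."""
    # Step 1: ask the movie store who acted in Star Trek.
    actors = {s for s, p, o in movies if p == "actedIn" and o == "Star Trek"}
    # Step 2: ask the people store for the birthdates of just those actors.
    return {s: o for s, p, o in people if p == "birthDate" and s in actors}


print(star_trek_birthdates(movie_store, people_store))
```

The join works only because both stores use the same identifier for each actor, which is the role shared IRIs play across real RDF data stores.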

Page 10: Federated data stores using semantic web technology

DBpedia is just one of many RDF data stores on the Web

We are not alone

Page 11: Federated data stores using semantic web technology

Implications

• OWL/RDF provides a representation that can natively support transformations from other modeling languages and native formats for product and process models

• The API is SPARQL

• Storage can be local or web-based

Page 12: Federated data stores using semantic web technology

Take-away

• Poor interoperability is expensive

• Interoperability solutions can be expensive

• Semantic technology can make interoperability solutions easier and cheaper to implement
