Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI

Amit Sheth and the team, Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing, Wright State University, Dayton, OH 45435

description

The talk "Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI" was given by Prof. Amit Sheth at the ICMSE-MGI Digital Data Workshop, held at the Kno.e.sis Center on November 13-14, 2013. The talk emphasized two important issues that materials scientists encounter when publishing data: provenance and access control. Workshop page: http://wiki.knoesis.org/index.php/ICMSE-MGI_Digital_Data_Workshop

Transcript of Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI

Page 1: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI

Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI

Amit Sheth and the team, Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing

Wright State University, Dayton, OH-45435

Page 2: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI


Kno.e.sis’ MGI-related projects

• Federated Semantic Services Platform for Material Sciences (funded via AFRL/Rome)

• Materials Database Knowledge Discovery and Data Mining (funded from AFRL/RX)

Faculty: A. Sheth (PI), K. Thirunarayan (coPI), R. Srinivasan (coPI), Clare Paul (expert)

Students: K. Gunaratna, M. Panahiazar, S. Lalithsena, V. Nguyen, N. Bryant, A. Shiveley, N. Jaykumar

Page 3: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI


Page 4: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI


Personal desktops

Lab notebooks

Databases

Single Access

Page 5: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI


Public-Private Data Sharing

• Enhance publicly available datasets while retaining intellectual property data for businesses

Private data and metadata (e.g., ongoing experimental processes, intellectual property data)

Selectively shared data and metadata (e.g., with ongoing collaborators, licensed data)

Public data and metadata (e.g., released products, material specifications)

Page 6: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI


Federated Architecture

[Diagram] Federal Endpoint: 1. User Authentication; 2. Federated Semantic Query Processor; 3. Semantics Mappings

[Diagram] Components: Research Lab A, Industry Lab B, Organization C

[Diagram] Each component holds Private, Shared, and Public data, and runs its own AC Processor and Semantic Query Processor

Page 7: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI

Principles of a Federation

• Each component controls access to its local data independently (local autonomy)

• A query is decomposed into multiple subqueries; each subquery is executed at one component

• Results from the subqueries are combined by the federated query processor, which controls global access
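The three principles above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system presented in the talk: the `Component` class, the `"public"`/`"shared"`/`"private"` visibility labels, the user names, and the sample records are all hypothetical, and the "decomposition" is the simplest possible one (the same filter is sent to every component).

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One member of the federation; it alone decides what to release."""
    name: str
    records: list                      # list of (visibility, payload) pairs
    collaborators: set = field(default_factory=set)

    def run_subquery(self, user, predicate):
        """Execute a subquery locally, applying this component's own rules."""
        allowed = {"public"}
        if user in self.collaborators:
            allowed.add("shared")      # "private" is never released here
        return [payload for visibility, payload in self.records
                if visibility in allowed and predicate(payload)]


def federated_query(components, user, predicate):
    """Send one subquery to each component and combine the partial results."""
    results = []
    for component in components:
        results.extend(component.run_subquery(user, predicate))
    return results


# Hypothetical sample data, for illustration only.
lab_a = Component("Research Lab A", [
    ("public", {"material": "Ti-6Al-4V", "property": "density"}),
    ("private", {"material": "Ti-6Al-4V", "property": "trial-17"}),
])
lab_b = Component("Industry Lab B", [
    ("shared", {"material": "Ti-6Al-4V", "property": "fatigue"}),
], collaborators={"alice"})


def is_ti64(record):
    return record["material"] == "Ti-6Al-4V"


public_view = federated_query([lab_a, lab_b], "guest", is_ti64)
alice_view = federated_query([lab_a, lab_b], "alice", is_ti64)
```

In this sketch the guest sees only Lab A's public record, while `alice`, a collaborator of Lab B, additionally sees Lab B's shared record. Lab A's private record is never returned to either user, because the filtering happens inside each component rather than at the combining step.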

Page 8: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI

Provenance Metadata

• Explains the origins of an artifact, such as
– How was it created?
– Who created it?
– When was it created?

• Example: for a given material X
– Which processes and properties were involved?
– What were the input and output values of those processes?
– Which research/engineering team performed the experiments?

Page 9: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI

Why Data Provenance?

• Verification
• Reproducibility
• Trust
• Testing
• Quality
• …

Page 10: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI

Product – Process – Product

Capturing provenance: Sufficient + Accurate => Reproduce the same output

[Diagram] Input → Processes → Output
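One way to read "sufficient + accurate" is that replaying the recorded processes on the recorded input should yield the recorded output. The sketch below illustrates that reading only; the record layout (`input`, `steps`, `output`) and the toy `scale` and `offset` processes are hypothetical stand-ins for real materials processes.

```python
def replay(provenance, processes):
    """Re-run the recorded chain of processes on the recorded input."""
    value = provenance["input"]
    for step in provenance["steps"]:
        value = processes[step["name"]](value, **step["params"])
    return value


def is_reproducible(provenance, processes):
    """True if replaying the record yields exactly the recorded output."""
    return replay(provenance, processes) == provenance["output"]


# Toy processes standing in for real experimental steps (hypothetical).
processes = {
    "scale": lambda x, factor: x * factor,
    "offset": lambda x, delta: x + delta,
}

record = {
    "input": 10,
    "steps": [
        {"name": "scale", "params": {"factor": 3}},
        {"name": "offset", "params": {"delta": -5}},
    ],
    "output": 25,
}

# The same record with the second step missing: provenance is insufficient.
incomplete = {**record, "steps": record["steps"][:1]}

complete_ok = is_reproducible(record, processes)
incomplete_ok = is_reproducible(incomplete, processes)
```

The complete record replays to the recorded output, while the record with a missing step does not, which is the sense in which insufficient provenance blocks reproduction.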

Page 11: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI

A Unified Provenance Framework

• Capturing domain-specific provenance
– in addition to the W3C PROV ontology

• Representing in standard RDF

• Query engine for processing provenance queries

• Operators for comparing artifacts’ provenance
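As a rough illustration of the last two points, provenance can be held as triples and compared with a set difference. This is a sketch under assumptions, not the framework's actual operators: `prov:wasGeneratedBy` and `prov:wasAssociatedWith` are terms from the W3C PROV ontology, while every `ex:` name, the `ex:annealTemperatureC` property, and the helper functions are hypothetical.

```python
def provenance_of(artifact, triples):
    """Collect triples about an artifact and about the activity that made it."""
    activities = {o for s, p, o in triples
                  if s == artifact and p == "prov:wasGeneratedBy"}
    return {(s, p, o) for s, p, o in triples
            if s == artifact or s in activities}


def compare_provenance(prov_a, prov_b):
    """Return the (predicate, object) pairs unique to each provenance."""
    def strip(prov):
        # Drop subjects and the generation link so two runs can be compared.
        return {(p, o) for _, p, o in prov if p != "prov:wasGeneratedBy"}

    a, b = strip(prov_a), strip(prov_b)
    return a - b, b - a


# Hypothetical provenance for two samples (values are placeholders).
triples = {
    ("ex:sampleA", "prov:wasGeneratedBy", "ex:runA"),
    ("ex:runA", "prov:wasAssociatedWith", "ex:team1"),
    ("ex:runA", "ex:annealTemperatureC", "700"),
    ("ex:sampleB", "prov:wasGeneratedBy", "ex:runB"),
    ("ex:runB", "prov:wasAssociatedWith", "ex:team1"),
    ("ex:runB", "ex:annealTemperatureC", "750"),
}

only_a, only_b = compare_provenance(
    provenance_of("ex:sampleA", triples),
    provenance_of("ex:sampleB", triples),
)
```

Here the two samples share the same team, so the comparison isolates the single domain-specific difference: the annealing temperature.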

Page 12: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI

Can we choose any part of our Semantic Web data to share with the public community, or with selected collaborators?

Page 13: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI

Semantic Web Data

[Diagram] Subject → Predicate → Object

A triple is in the format (Subject, Predicate, Object).

An RDF Dataset is a set of triples.
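The triple model can be made concrete with a minimal sketch in plain Python. It uses no RDF library; the `Triple` type, the `match` helper, and all `ex:` identifiers are hypothetical names chosen for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


def match(dataset, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return {t for t in dataset
            if (subject is None or t.subject == subject)
            and (predicate is None or t.predicate == predicate)
            and (obj is None or t.object == obj)}


# A dataset is simply a set of triples (identifiers are placeholders).
dataset = {
    Triple("ex:alloy1", "ex:hasProperty", "ex:density"),
    Triple("ex:alloy1", "ex:producedBy", "ex:casting"),
    Triple("ex:alloy2", "ex:producedBy", "ex:forging"),
}

about_alloy1 = match(dataset, subject="ex:alloy1")
produced = match(dataset, predicate="ex:producedBy")
```

Fixing one position of the pattern and leaving the others open is the basic building block that query languages for this data model generalize.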

Page 14: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI

Linked Data Story So Far?

Non-open data? Not there yet!

Page 15: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI

Can we choose any part of our Semantic Web data to share with the public community, or with selected collaborators?

Page 16: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI

Different levels of granularity

– Individual resources
• Example: a material product, a manufacturing process

– Individual triples
• Example: properties of a product or process

– Entire datasets

Enable flexible selection of any piece of data to be shared at any time
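The three levels of granularity can be sketched as one policy object that is consulted per triple. This is an assumption-laden illustration rather than the talk's mechanism: the `SharingPolicy` class, the group names, and the `ex:` identifiers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SharingPolicy:
    """Who may read what, at dataset, resource, or triple granularity."""
    dataset_readers: set = field(default_factory=set)     # read everything
    resource_readers: dict = field(default_factory=dict)  # subject -> groups
    triple_readers: dict = field(default_factory=dict)    # triple -> groups

    def can_read(self, group, triple):
        subject = triple[0]
        return (group in self.dataset_readers
                or group in self.resource_readers.get(subject, set())
                or group in self.triple_readers.get(triple, set()))

    def visible(self, group, dataset):
        """Return the part of the dataset that the group is allowed to see."""
        return {t for t in dataset if self.can_read(group, t)}


spec = ("ex:product1", "ex:hasSpec", "ex:spec1")
dataset = {
    spec,
    ("ex:product1", "ex:hasCost", "ex:cost1"),
    ("ex:process7", "ex:hasInput", "ex:product1"),
}

policy = SharingPolicy(dataset_readers={"owners"})
policy.resource_readers["ex:process7"] = {"collaborators"}   # whole resource
policy.triple_readers[spec] = {"public", "collaborators"}    # single triple

public_view = policy.visible("public", dataset)
collab_view = policy.visible("collaborators", dataset)
owner_view = policy.visible("owners", dataset)
```

Because grants are plain entries in the policy, a manager can add or revoke one at any time without touching the data itself.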

Page 17: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI

Can we choose any part of our Semantic Web data to share with the public community, or with selected collaborators?

Page 18: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI

Local Component A

Creating Resources

Granting Permissions

Inferring Permissions

AC Processes

Federal Endpoint

2. AC-embedded Query Execution

User X of either Public group or Collaborators

Manager Y of component A

1. Query Rewriting
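The two numbered steps in the diagram can be illustrated as follows. This is one possible reading, not the presented implementation: the functions `rewrite_query` and `execute`, the `acl` dictionary (standing in for permissions granted by Manager Y), and the `ex:` data are hypothetical.

```python
def rewrite_query(pattern, user_groups):
    """Step 1: embed the caller's groups into the query as a constraint."""
    return {"pattern": pattern, "allowed_groups": frozenset(user_groups)}


def execute(rewritten, dataset, acl):
    """Step 2: evaluate the pattern, keeping only triples the caller may read."""
    results = []
    for triple in sorted(dataset):
        # None in the pattern acts as a wildcard for that position.
        matches = all(want is None or want == got
                      for want, got in zip(rewritten["pattern"], triple))
        permitted = acl.get(triple, frozenset()) & rewritten["allowed_groups"]
        if matches and permitted:
            results.append(triple)
    return results


spec = ("ex:alloy1", "ex:hasSpec", "ex:spec1")
recipe = ("ex:alloy1", "ex:hasRecipe", "ex:recipe1")
dataset = {spec, recipe}

# Permissions granted by the component's manager (hypothetical values).
acl = {
    spec: frozenset({"public", "collaborators"}),
    recipe: frozenset({"collaborators"}),
}

pattern = ("ex:alloy1", None, None)
public_results = execute(rewrite_query(pattern, {"public"}), dataset, acl)
collab_results = execute(rewrite_query(pattern, {"collaborators"}), dataset, acl)
```

The user submits the same pattern in both cases; only the rewritten query differs, so the access check travels with the query instead of being applied as a separate pass over the results.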

Page 19: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI

Various Policies

• Role-based Access Control (RBAC)
• Mandatory Access Control (MAC)
• Attribute-based Access Control (ABAC)
• Discretionary Access Control (DAC)

1. Which policy? It depends on the organization’s needs!

2. Our AC mechanism can be extended to support any of these policies
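One way such extensibility could look is to treat every policy as a function from a request to an allow/deny decision, so the policy in force can be swapped per organization. The sketch below is a simplified assumption, not the Kno.e.sis mechanism: the request fields, the factory functions, and the sample values are hypothetical, and each policy is reduced to its core idea.

```python
def rbac(role_permissions):
    """Role-based: allow if the user's role has been granted the action."""
    return lambda req: req["action"] in role_permissions.get(req["role"], set())


def mac(levels):
    """Mandatory: allow if the user's clearance is at least the data's label."""
    rank = {label: i for i, label in enumerate(levels)}
    return lambda req: rank[req["clearance"]] >= rank[req["label"]]


def abac(rule):
    """Attribute-based: allow if a rule over the request attributes holds."""
    return lambda req: bool(rule(req))


def dac(owner_grants):
    """Discretionary: allow if the resource's owner granted the user access."""
    return lambda req: req["user"] in owner_grants.get(req["resource"], set())


# A hypothetical request carrying the facts each policy may consult.
request = {
    "user": "alice", "role": "collaborator", "action": "read",
    "resource": "ex:dataset1", "clearance": "shared", "label": "shared",
    "project": "p1", "resource_project": "p1",
}

policies = {
    "RBAC": rbac({"collaborator": {"read"}}),
    "MAC": mac(["public", "shared", "private"]),
    "ABAC": abac(lambda r: r["project"] == r["resource_project"]),
    "DAC": dac({"ex:dataset1": {"bob"}}),
}

decisions = {name: policy(request) for name, policy in policies.items()}
```

With these sample values the same request is allowed under the RBAC, MAC, and ABAC configurations and denied under DAC, because the owner granted access only to `bob`. Which outcome is right is an organizational decision, as the slide notes.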

Page 20: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI

Summary

• Semantic Federated Architecture enables us to
– Enhance open data access
– Protect confidential information
– Improve communication between collaborating teams
– Support the reproducibility of material products with confidence and trust
– Utilize the power of Semantic Web standards and technologies to do so more easily, effectively and flexibly

Page 21: Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI


Thank you, and please visit us at

http://knoesis.org/

Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing, Wright State University, Dayton, Ohio, USA
