INFSO-RI-508833

Enabling Grids for E-sciencE

www.eu-egee.org

OGSA-DAI: Data Access and Integration

Marek Ciglan

Institute of Informatics, Slovak Academy of Sciences

Grid Application Development, Bratislava, 10.03.05

Motivation

• Different users / applications store data in different formats

– Plain files

– XML databases

– Relational databases: PostgreSQL, Oracle, DB2, MySQL

• Difficult to work with a lot of different data formats

• Difficult to integrate data from heterogeneous resources

OGSA-DAI - Overview

• Allow different types of data models

– Files

– XML databases

– Relational databases

• Allow data to be accessed through uniform interfaces

• Provide an extensible framework for integrating data resources on the Grid

• Allow metadata to be obtained about the data and about the data resources in which it is stored

• Facilitate the integration of data from various sources to obtain the required information
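
To make the idea of a uniform interface concrete, here is a minimal Python sketch. It is not the OGSA-DAI API: the `DataResource` protocol, the three wrapper classes, and `collect` are hypothetical names, and an in-memory SQLite database stands in for a relational resource. It only illustrates how one call shape can hide three different data models.

```python
import sqlite3
import xml.etree.ElementTree as ET
from typing import Protocol


class DataResource(Protocol):
    """Hypothetical uniform interface: every resource answers a textual query."""

    def query(self, expression: str) -> list[str]: ...


class RelationalResource:
    def __init__(self, connection: sqlite3.Connection) -> None:
        self.connection = connection

    def query(self, expression: str) -> list[str]:
        # The expression is SQL; each row is rendered as a comma-joined string.
        rows = self.connection.execute(expression).fetchall()
        return [",".join(str(value) for value in row) for row in rows]


class XmlResource:
    def __init__(self, root: ET.Element) -> None:
        self.root = root

    def query(self, expression: str) -> list[str]:
        # The expression is an ElementTree path (a limited XPath subset).
        return [element.text or "" for element in self.root.findall(expression)]


class FileResource:
    def __init__(self, lines: list[str]) -> None:
        self.lines = lines

    def query(self, expression: str) -> list[str]:
        # The expression is a plain substring filter over the file's lines.
        return [line for line in self.lines if expression in line]


def collect(resources_and_queries) -> list[str]:
    """Run each query against its resource through the same method call."""
    results: list[str] = []
    for resource, expression in resources_and_queries:
        results.extend(resource.query(expression))
    return results
```

The caller in `collect` never branches on the resource type, which is the property the overview asks of a data access layer.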

Architecture

Data Resource Activities

• Relational Activities

– Run an SQL query statement

– Run an SQL update statement

– …

• XML Activities

– Run an XPath statement against an XML database

– Run an XUpdate statement against an XML database

– …

• File Activities

– Access a directory

– Read data from a file

– Manipulate files in a directory

– Write data into a file

Delivery Activities

• Retrieve data from a URL

• Deliver data to a URL

• Deliver data to a GridFTP server

• Retrieve data from a GridFTP server

• Deliver results to a stream

• …

Transformation Activities

• ZIP compress the results

• GNU-ZIP compress the results

• GNU-ZIP decompress the results

• Transform data using an XSLT

• Break a single block into multiple blocks based on a set of separator characters

• Aggregate multiple blocks into a single block
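
What some of these transformations do to a block of data can be sketched in a few lines of Python. This is an illustration using the standard library, not OGSA-DAI code; the function names are hypothetical, and treating a "block" as a byte string or text string is an assumption.

```python
import gzip
import re


def gzip_compress(block: bytes) -> bytes:
    """GNU-ZIP compress one block of results."""
    return gzip.compress(block)


def gzip_decompress(block: bytes) -> bytes:
    """Reverse of gzip_compress."""
    return gzip.decompress(block)


def split_block(block: str, separators: str) -> list[str]:
    """Break one block into several at any of the separator characters."""
    if not separators:
        return [block]
    # Build a character class from the separators; re.escape guards
    # characters such as ']' or '-' that are special inside a class.
    pattern = "[" + re.escape(separators) + "]"
    return [piece for piece in re.split(pattern, block) if piece]


def aggregate_blocks(blocks: list[str]) -> str:
    """Join several blocks back into a single block."""
    return "".join(blocks)
```

For example, `split_block("a,b;c", ",;")` yields three blocks, and `aggregate_blocks` joins them again without the separators.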

Data integration

[Figure: four data sources (MySQL, XML database, PostgreSQL, text file) feed an Oracle data warehouse, with OGSA-DAI placed between the sources and the warehouse.]

How to integrate all this heterogeneous data into a central data warehouse?

• The figure builds up four activity chains, of two kinds:

– Select data, write data into file, compress file, transfer zip file

– Read subset of file, transform (XSLT in one of the chains), compress file, transfer zip file
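
The "select data, write data into file, compress file, transfer zip file" chain from the figure can be sketched as follows. This is only an illustration under stated assumptions: an SQLite connection stands in for the source database, gzip stands in for the zip step, a local directory copy stands in for the transfer to the warehouse, and all function names are hypothetical.

```python
import csv
import gzip
import shutil
import sqlite3
import tempfile
from pathlib import Path


def select_data(connection: sqlite3.Connection, statement: str) -> list[tuple]:
    return connection.execute(statement).fetchall()


def write_to_file(rows: list[tuple], path: Path) -> Path:
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as handle:
        csv.writer(handle).writerows(rows)
    return path


def compress_file(path: Path) -> Path:
    compressed = path.with_name(path.name + ".gz")
    with path.open("rb") as source, gzip.open(compressed, "wb") as target:
        shutil.copyfileobj(source, target)
    return compressed


def transfer_file(path: Path, destination_dir: Path) -> Path:
    # Stand-in for a network transfer: copy into the "warehouse" directory.
    destination_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy(path, destination_dir))


def run_pipeline(connection, statement, work_dir: Path, warehouse_dir: Path) -> Path:
    """Chain the four steps; each step consumes the previous step's output."""
    rows = select_data(connection, statement)
    exported = write_to_file(rows, work_dir / "export.csv")
    compressed = compress_file(exported)
    return transfer_file(compressed, warehouse_dir)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db = sqlite3.connect(":memory:")
        db.execute("create table littleblackbook (id integer, name text)")
        db.execute("insert into littleblackbook values (1, 'Ada')")
        print(run_pipeline(db, "select * from littleblackbook",
                           Path(tmp) / "work", Path(tmp) / "warehouse"))
```

Each step takes the previous step's output as its input, which is the same shape as chaining activities by named outputs.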

Data integration

• How to perform data integration?

– Write a specialized Java application for data integration

– Use OGSA-DAI perform documents

• Perform Documents

– XML documents

– Describe activities to be performed

<sqlQueryStatement name="myQuery">
  <expression>
    select * from littleblackbook where id=10
  </expression>
  <webRowSetStream name="myQueryOutput"/>
</sqlQueryStatement>
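
An activity fragment like this one can also be generated rather than typed by hand. The Python sketch below uses only the standard `xml.etree.ElementTree` module and the element names shown on the slide; the helper names are hypothetical, and the fragment is not a complete perform document (the enclosing root element is not shown in the slides and is left out here).

```python
import xml.etree.ElementTree as ET


def build_sql_query_activity(name: str, sql: str, output_name: str) -> ET.Element:
    """Build an sqlQueryStatement element with one named output stream."""
    activity = ET.Element("sqlQueryStatement", {"name": name})
    expression = ET.SubElement(activity, "expression")
    expression.text = sql
    ET.SubElement(activity, "webRowSetStream", {"name": output_name})
    return activity


def to_xml(element: ET.Element) -> str:
    # Serialising through ElementTree escapes characters such as '<' in the SQL.
    return ET.tostring(element, encoding="unicode")
```

One practical benefit of generating the XML is that a condition such as `id<100` is escaped on output, so the document stays well-formed.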

Perform documents

• Integrating activities with perform documents

<sqlQueryStatement name="myQuery">
  <expression>
    select * from littleblackbook where id &lt; 100
  </expression>
  <webRowSetStream name="myQueryOutput"/>
</sqlQueryStatement>

<deliverToGDT name="deliverQueryResults">
  <fromLocal from="myQueryOutput"/>
  <toGDT streamId="otherServiceInput" mode="full">
    http://localhost:8080/ogsa/services/ogsadai/SomeDAIService
  </toGDT>
</deliverToGDT>
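
The two activities are connected by name: the query writes to `myQueryOutput` and the delivery activity reads `from="myQueryOutput"`. A small Python sketch can check that wiring. The `<perform>` wrapper is a placeholder added only so that the fragment parses as one document; it is not the real root element, and `unresolved_inputs` is a hypothetical helper, not part of OGSA-DAI.

```python
import xml.etree.ElementTree as ET

# The <perform> root is a stand-in so the two activities form one XML document.
PERFORM_FRAGMENT = """
<perform>
  <sqlQueryStatement name="myQuery">
    <expression>select * from littleblackbook where id &lt; 100</expression>
    <webRowSetStream name="myQueryOutput"/>
  </sqlQueryStatement>
  <deliverToGDT name="deliverQueryResults">
    <fromLocal from="myQueryOutput"/>
    <toGDT streamId="otherServiceInput" mode="full">http://localhost:8080/ogsa/services/ogsadai/SomeDAIService</toGDT>
  </deliverToGDT>
</perform>
"""


def unresolved_inputs(document: str) -> list[str]:
    """Return every 'from' reference that no output stream in the document produces."""
    root = ET.fromstring(document.strip())
    # Assumption: output streams are the webRowSetStream elements with a name.
    produced = {element.get("name") for element in root.iter("webRowSetStream")}
    consumed = [element.get("from") for element in root.iter()
                if element.get("from") is not None]
    return [name for name in consumed if name not in produced]
```

An empty result means every input refers to an output produced somewhere in the document; a misspelled stream name shows up in the returned list.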

Data Security

• Role mapping is the process of authorizing a client's request to access a data resource

• It is a two-step process:

– Check whether the client is allowed to access the data resource

– Determine the database user name and password (or role) to be used for this client

• A role map document contains the information required to undertake this process

Data Security

• Simple OGSA-DAI Role Map Documents

<DatabaseRoles>
  <Database name="jdbc:mysql://host:6502/otherData">
    <User dn="No Certificate Provided"
          userid="myUser" password="123"/>
    <User dn="/C=UK/O=eScience/OU=Aspatria/L=AeSC/CN=tom"
          userid="superUser" password="myPassword"/>
  </Database>
</DatabaseRoles>
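
The two-step role mapping can be sketched against this role map in Python. This is an illustration only: `map_role` is a hypothetical function, exact string matching on the `dn` attribute is an assumption, and it is not how OGSA-DAI itself implements the lookup.

```python
import xml.etree.ElementTree as ET
from typing import Optional

ROLE_MAP = """
<DatabaseRoles>
  <Database name="jdbc:mysql://host:6502/otherData">
    <User dn="No Certificate Provided" userid="myUser" password="123"/>
    <User dn="/C=UK/O=eScience/OU=Aspatria/L=AeSC/CN=tom"
          userid="superUser" password="myPassword"/>
  </Database>
</DatabaseRoles>
"""


def map_role(role_map_xml: str, database: str,
             client_dn: str) -> Optional[tuple[str, str]]:
    """Return (userid, password) for the client, or None if access is denied."""
    root = ET.fromstring(role_map_xml.strip())
    for db in root.iter("Database"):
        if db.get("name") != database:
            continue
        for user in db.iter("User"):
            # Step 1: is this client listed for the data resource?
            if user.get("dn") == client_dn:
                # Step 2: which database credentials apply to this client?
                return user.get("userid"), user.get("password")
    return None
```

A client whose distinguished name is not listed for the database gets `None`, which corresponds to failing the first step.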

The End

Thank you for your attention.