INFSO-RI-508833 Enabling Grids for E-sciencE Information System Valeria Ardizzone INFN Singapore,...

31
INFSO-RI-508833 Enabling Grids for E- sciencE www.eu-egee.org Information System Valeria Ardizzone INFN Singapore, 1st South East Asia Forum -- EGEE tutorial 9-10 January 2006

Transcript of INFSO-RI-508833 Enabling Grids for E-sciencE Information System Valeria Ardizzone INFN Singapore,...

INFSO-RI-508833

Enabling Grids for E-sciencE

www.eu-egee.org

Information SystemValeria Ardizzone

INFN

Singapore, 1st South East Asia Forum -- EGEE tutorial

9-10 January 2006

EGEE Tutorial, Singapore, 09.02.2006 2

Enabling Grids for E-sciencE

INFSO-RI-508833

lcg-infosites&

lcg-info

EGEE Tutorial, Singapore, 09.02.2006 3

Enabling Grids for E-sciencE

INFSO-RI-508833

How to discover resources ?

• Once an user is logged into an User Interface (s)he is ready to take advantage of the Grid Power for his/her own application.

• But what are the available resources to accomplish his/her tasks?

• The answer to this question comes through the interactions with the Information System (IS).

• The Information System (IS) provides information about the LCG-2 Grid resources and their status.

EGEE Tutorial, Singapore, 09.02.2006 4

Enabling Grids for E-sciencE

INFSO-RI-508833

How to discover resources (cont)

• The data published in the IS conforms to the GLUE (Grid Laboratory for a Uniform Environment) Schema. The GLUE Schema aims to define a common conceptual data model to be used for Grid resources.

• In LCG-2, the BDII (Berkeley DB Information Index), based on an updated version of the Monitoring and Discovery Service (MDS), was adopted as main provider of the Information Service.

• In gLite, R-GMA (Relational Grid Monitoring Architecture) is adopted as IS.

EGEE Tutorial, Singapore, 09.02.2006 5

Enabling Grids for E-sciencE

INFSO-RI-508833

Monitoring and Discovery Service

• Computing and storage resources at a site implement an entity called Information Provider, which generates the relevant information of the resource (e.g.: the used space in a SE).

• This information is published via an LDAP server by the Grid Resource Information Servers, or GRISes.

EGEE Tutorial, Singapore, 09.02.2006 6

Enabling Grids for E-sciencE

INFSO-RI-508833

• In each site an element called the Site Grid Index Information Server (GIIS) collects all the information of the different GRISes and publishes it.

• This BDII queries the GIISes and acts as a cache, storing information about the Grid status in its database.

Monitoring and Discovery Service

EGEE Tutorial, Singapore, 09.02.2006 7

Enabling Grids for E-sciencE

INFSO-RI-508833

• Querying the BDII a user or a service has all the available information about the status of the grid resources.

• Moreover in order to get more up-to-date information it is possible to querying directly the GIISes or GRISes.

Monitoring and Discovery Service

EGEE Tutorial, Singapore, 09.02.2006 8

Enabling Grids for E-sciencE

INFSO-RI-508833

How to query the IS?

• In order to query directly the IS elements two higher level tools are provided.

lcg-infosites

lcg-info

• These tools should be enough for most common user needs and will usually avoid the necessary of raw LDAP queries.

EGEE Tutorial, Singapore, 09.02.2006 9

Enabling Grids for E-sciencE

INFSO-RI-508833

lcg-infosites

• The lcg-infosites command can be used as an easy way to retrieve information on Grid resources for the most use cases.

USAGE: lcg-infosites --vo <vo name> options -v <verbose level> --is <BDII to query>

EGEE Tutorial, Singapore, 09.02.2006 10

Enabling Grids for E-sciencE

INFSO-RI-508833

lcg-infosites options

EGEE Tutorial, Singapore, 09.02.2006 15

Enabling Grids for E-sciencE

INFSO-RI-508833

lcg-info intro

• This command can be used to list either CEs or the SEs that satisfy a given set of conditions, and to print the values of a given set of attributes.

• The information is taken from the BDII specified by the LCG_GFAL_INFOSYS environment variable.

• The query syntax is like this: attr1 op1 valueN, ...

attrN opN valueN

where attrN is an attribute name op is =, >= or <=, and the cuts are ANDed. The cuts are comma-separated and spaces are

not allowed.

After the upgrading of the new

GLUE SCHEMA it’s not possible

use the operator ‘>’ and ‘<‘

After the upgrading of the new

GLUE SCHEMA it’s not possible

use the operator ‘>’ and ‘<‘

EGEE Tutorial, Singapore, 09.02.2006 16

Enabling Grids for E-sciencE

INFSO-RI-508833

USAGE

lcg-info --list-ce [--bdii bdii] [--vo vo] [--sed] [--query query] [--attrs list]

lcg-info --list-se [--bdii bdii] [--vo vo] [--sed] [--query query] [--attrs list]

lcg-info --list-attrs

lcg-info --help

lcg-info usage

EGEE Tutorial, Singapore, 09.02.2006 17

Enabling Grids for E-sciencE

INFSO-RI-508833

lcg-info options

EGEE Tutorial, Singapore, 09.02.2006 24

Enabling Grids for E-sciencE

INFSO-RI-508833

References

LCG-2 User Guide Manual Series https://edms.cern.ch/file/454439/LCG-2-

UserGuide.html

EGEE Tutorial, Singapore, 09.02.2006 25

Enabling Grids for E-sciencE

INFSO-RI-508833

SECOND PART

R-GMA

EGEE Tutorial, Singapore, 09.02.2006 26

Enabling Grids for E-sciencE

INFSO-RI-508833

Outline

Introduction to R-GMA and Grid Monitoring Architecture (GMA).

R-GMA within Testbed

R-GMA in depth:- Schema, Registry, Producer and Consumer- Query and Storage Types- R-GMA Browser

EGEE Tutorial, Singapore, 09.02.2006 27

Enabling Grids for E-sciencE

INFSO-RI-508833

Introduction to R-GMA

• Relational Grid Monitoring Architecture (R-GMA)– Developed as part of the EuropeanDataGrid Project (EDG)

– Now as part of the EGEE project.

– Based the Grid Monitoring Architecture (GMA) from the Global Grid Forum (GGF).

• Uses a relational data model.– Data are viewed as tables.

– Data structure defined by the columns.

– Each entry is a row (tuple).

– Queried using Structured Query Language (SQL).

EGEE Tutorial, Singapore, 09.02.2006 28

Enabling Grids for E-sciencE

INFSO-RI-508833

Grid Monitoring Architecture(GMA)

PRODUCER

CONSUMER

REGISTRY

Store location

Lookup location

Transfer Data

• The Producer stores its location (URL) in the Registry.

• The Consumer looks up producer URLs in the Registry.

• The Consumer contacts the Producer to get all the data or the Consumer can listen to the Producer for new data.

EGEE Tutorial, Singapore, 09.02.2006 29

Enabling Grids for E-sciencE

INFSO-RI-508833

R-GMA within Testbed

EGEE Tutorial, Singapore, 09.02.2006 30

Enabling Grids for E-sciencE

INFSO-RI-508833

R-GMA: Schema-Registry-Mediator

VIRTUAL DATABASE

TABLE 1, Colum defs

TABLE 2, Colum defs

TABLE 3, Colum defs

TABLE 4, Colum defs

SCHEMA

TABLE 1,Producer P1 details

TABLE 2,Producer P1 details

TABLE 2,Producer P2 details

TABLE 2,Producer P3 details

TABLE 3,Producer P2 details

TABLE 3,Producer P1 details

TABLE 3,Producer P3 details

REGISTRYMEDIATOR

R-GMA Server

MEDIATOR: a set of rules for deciding which data providers to contact for any given query.

REGISTRY: It holds the details of all producers that are publishing to tables in the virtual database and it also holds the details of “continuous” consumers.

SCHEMA : it holds the names and definitions of all of the tables in the virtual database, and their authorization rules.

EGEE Tutorial, Singapore, 09.02.2006 31

Enabling Grids for E-sciencE

INFSO-RI-508833

R-GMA: Producer-Consumer

VIRTUAL DATABASE

TABLE 1, Colum defs

TABLE 2, Colum defs

TABLE 3, Colum defs

TABLE 4, Colum defs

SCHEMA

TABLE 1,Producer P1 details

TABLE 2,Producer P1 details

TABLE 2,Producer P2 details

TABLE 2,Producer P3 details

TABLE 3,Producer P2 details

TABLE 3,Producer P1 details

TABLE 3,Producer P3 details

REGISTRY

MEDIATOR

P1

P2

P3

C1

C2

SQL “INSERT”

SQL “SELECT”

Producers: are the data providers for the virtual database. Writing data into the virtual database is known as publishing, and data is always published in complete rows, known as tuples. There are three types of producer: Primary, Secondary and On-demand.

Consumer: represents a single SQL SELECT query on the virtual database. The query is matched against the list of available producers in the Registry. The consumer service then selects the best set of producers to contact and sends the query directly to each of them, to obtain the answer tuples.

R-GMA Server

EGEE Tutorial, Singapore, 09.02.2006 32

Enabling Grids for E-sciencE

INFSO-RI-508833

Query and Storage Types

• Continuous: as soon as new data becomes available it is broadcast to all interested parties.

• Latest: correspond to intuitive idea of current information.

• History: return time sequenced data.TABLE 1,Producer P1 details

TABLE 2,Producer P1 details

TABLE 2,Producer P2 details

TABLE 2,Producer P3 details

TABLE 3,Producer P2 details

TABLE 3,Producer P1 details

TABLE 3,Producer P3 details

REGISTRY

P1

Latest-store

Continuous&History-store

P1

LATEST RETENTION PERIOD (LRP) and

HISTORY RETENTION PERIOD (RTP)

allow producers to periodically purge old tuples, and to give a precise meaning to the “current state”.

Tuple-store can be in Memory or Database

EGEE Tutorial, Singapore, 09.02.2006 33

Enabling Grids for E-sciencE

INFSO-RI-508833

Producer Types

• Primary Producer

• Secondary Producer

• On-Demand Producer

User Code

Producer API

Producer Service

Tuple Storage

CControl only

Queries

Tuples

SELECT * Tuples

P

User Code

Producer API

Producer Service CControl only

Queries

Tuples

TuplesQueries

User Code

User Code

Producer API

Producer Service

Tuple Storage

CControl and inserted tuples

Queries

Tuples

EGEE Tutorial, Singapore, 09.02.2006 34

Enabling Grids for E-sciencE

INFSO-RI-508833

Continuous

Producer Servlet

Registry

Store location

Lookup

locatio

n

Continuous

Store table description

Producer API

SQL “CREATE TABLE”

Result Set

TableName

Value 1 Value 2

TableName URL Predicate

SchemaTableName Column

TableName

Value 1 Value 2

Insert

TableName

UK RAL Alice

Consumer ServletConsumer API

SQL “SELECT”TableName

Value 1 Value 2TableName

Value 1 Value 2

Query

SQL “INSERT”

EGEE Tutorial, Singapore, 09.02.2006 35

Enabling Grids for E-sciencE

INFSO-RI-508833

History or Latest

Producer Servlet

Registry

Store location

Lookup

locatio

n

Query

Store table description

Producer API

SQL “CREATE TABLE”

Result Set

TableName

Value 1 Value 2

TableName URL Predicate

SchemaTableName Column

TableName

Value 1 Value 2

Insert

TableName

UK RAL Alice

Consumer ServletConsumer API

SQL “SELECT”TableName

Value 1 Value 2TableName

Value 1 Value 2

Query

SQL “INSERT”

EGEE Tutorial, Singapore, 09.02.2006 36

Enabling Grids for E-sciencE

INFSO-RI-508833

R-GMA APIs

• APIs exist in Java, C, C++, Python. – For clients (servlets contacted behind the scenes)

• They include methods for…– Creating consumers

– Creating primary and secondary producers

– Setting type of queries, type of produces, retention periods, time outs…

– Retrieving tuples, inserting data

– …

• You can create your own Producer or Consumer.

EGEE Tutorial, Singapore, 09.02.2006 38

Enabling Grids for E-sciencE

INFSO-RI-508833

https://rgmasrv.ct.infn.it:8443/R-GMA

EGEE Tutorial, Singapore, 09.02.2006 39

Enabling Grids for E-sciencE

INFSO-RI-508833

Security: Requirements

R-GMA

• Consumer users: who requests information.

• Producer users: who provides information.

• Site administrators: who runs R-GMA services.

• Virtual Organizations: who “owns” the schema and registry.

Consumer

Provider

Site Admin

VO

EGEE Tutorial, Singapore, 09.02.2006 40

Enabling Grids for E-sciencE

INFSO-RI-508833

Security: Solution

R-GMAConsumer

Provider

Site Admin

VO

• Mutual Autentication: guaranteeing who is at each end of an exchange of messages.

• Encryption: using an encrypted transport protocol (HTTPS).

• Authorization: implicit or explicit.

EGEE Tutorial, Singapore, 09.02.2006 41

Enabling Grids for E-sciencE

INFSO-RI-508833

More information

• R-GMA overview page.– http://www.r-gma.org/

• R-GMA documentation in EGEE– http://hepunx.rl.ac.uk/egee/jra1-uk/

• R-GMA in GILDA– https://rgmasrv.ct.infn.it:8443/R-GMA

EGEE Tutorial, Singapore, 09.02.2006 42

Enabling Grids for E-sciencE

INFSO-RI-508833

Questions…