
EGEE-III INFSO-RI-222667

Enabling Grids for E-sciencE

EGEE and gLite are registered trademarks

Migration to the GLUE 2.0 information schema in the LCG/EGEE/EGI production Grid

Stephen Burke (RAL), Laurence Field (CERN) and David Horat (CERN)

CHEP2010, Taipei


Overview

• Implementation plan
• The GLUE 2.0 schema
  – Structure
  – LDAP rendering
• Implementation and use in the LCG/EGEE/EGI Grid
  – BDII rollout
  – Service publication
  – Clients
  – Implementation timeline
• Outlook
• This presentation follows on from the CHEP09 talk "The impact and adoption of GLUE 2.0 in the LCG/EGEE production Grid"


Implementation plan

• Schema migration is a complex process:

1) Define the abstract schema

2) Define the LDAP rendering

3) Implement the schema in the BDII and roll out

4) Write and deploy information providers
   • You are here!

5) Update client tools to understand GLUE 2

6) (Retire GLUE 1)

• The schema interacts with everything, so the rollout must be a gradual process without breaking anything


GLUE 2.0 timeline

• January 2007: First working group meeting
• June 2008: Draft specification opened to public comment
• January 2009: Final specification ready
• March 2009: GLUE 2.0 becomes an official OGF standard
  – http://www.ogf.org/documents/GFD.147.pdf
• April 2009: Start work on implementation …


GLUE 2.0 key concepts

[Diagram: GLUE 2.0 key concepts. Entities: User Domain, Admin Domain, Service, End Point, Share, Manager, Resource, Activity, Access Policy, Mapping Policy. Relation labels: Negotiates Share with, Provides, Manages, Runs, Defined on, Contacts, Maps User to, Has.]


GLUE 2.0 computing schema

[Diagram: GLUE 2.0 computing schema. Entities: Computing Service, Computing End Point, Computing Share, Computing Manager, Execution Environment, Application Environment, Computing Activity. Relation labels: Manages, Runs, Defined on, Maps User to, Can use, Has.]


GLUE 2.0 storage schema

[Diagram: GLUE 2.0 storage schema. Entities: Storage Service, Storage End Point, Storage Share, Storage Share Capacity, Storage Manager, Data Store, Storage Capacity, Storage Access Protocol. Relation labels: Defined on, Maps User to, Offers, Manages, Has.]


LDAP rendering

• Needs some basic decisions about how to map the abstract schema to LDAP
• We designed the schema knowing that LDAP was a target technology, so many things have a natural implementation
  – GLUE entities mapped directly to LDAP object classes
  – Multivalued and optional attributes supported directly
    • Unlike e.g. a relational database
  – Not much support for data types, basically just string and integer
    • Type conformance must largely be checked externally
• Try to optimise to make the most likely queries efficient
• Defined in ~6 phone meetings in May/June 2009
• Generally follows GLUE 1 practice, but some changes
  – Case sensitivity
  – Some attributes are mandatory
  – Strings changed from 7-bit ASCII to UTF-8
  – The naming and usage of foreign keys are somewhat different
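Since the LDAP rendering carries little type information, conformance checks have to live outside the directory. The following is a minimal Python sketch of such an external check. The `EXPECTED_TYPES` table and the `conforms` function are hypothetical names introduced for illustration, not part of the specification or of any BDII tool; the table lists a single attribute and assumes it should hold an unsigned integer.

```python
# Hypothetical type table (an assumption for illustration): attributes not
# listed here are treated as free-form strings.
EXPECTED_TYPES = {"GLUE2ComputingShareRunningJobs": "uint"}


def conforms(attribute, value):
    """Return True if the string value fits the expected type of the attribute."""
    kind = EXPECTED_TYPES.get(attribute, "string")
    if kind == "uint":
        # LDAP returns text, so check that it is a plain ASCII digit string.
        return value.isascii() and value.isdigit()
    return True


print(conforms("GLUE2ComputingShareRunningJobs", "12"))
print(conforms("GLUE2ComputingShareRunningJobs", "twelve"))
```

A real validator would need the full type list from the schema documents; this only shows where such a check would sit.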


Object classes

• Object classes are the basic structural unit of an LDAP schema
  – Basically maps to a defined list of attributes
  – One LDAP object may include attributes from many object classes
• An LDAP schema has a global namespace, so useful to have a naming convention
  – Prefix schema entity names with “GLUE2”
    • e.g. GLUE2ComputingShare
  – No clashes with GLUE 1.x (prefix “Glue”)
• Natural mapping:
  – One object class per schema entity
  – Objects representing specialised entities inherit attributes from parent object classes
    objectclass: GLUE2Entity
    objectclass: GLUE2Share
    objectclass: GLUE2ComputingShare
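The inheritance mapping above can be sketched in Python as a walk up a parent table. This is an illustration only: the `PARENT` map is a hypothetical structure covering just the three entities shown on this slide, and `object_classes` is not a function from any GLUE or BDII library.

```python
# Hypothetical parent map, limited to the entities named on the slide.
PARENT = {"ComputingShare": "Share", "Share": "Entity", "Entity": None}


def object_classes(entity):
    """Return the object class chain for an entity, most general class first."""
    chain = []
    while entity is not None:
        chain.append("GLUE2" + entity)  # "GLUE2" prefix avoids GLUE 1.x clashes
        entity = PARENT[entity]
    return list(reversed(chain))


for name in object_classes("ComputingShare"):
    print("objectclass:", name)
```

Run as written, the loop prints the same three `objectclass` lines as the slide example.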


LDAP attributes

• Attribute naming
  – Follows the object class naming scheme
  – “GLUE2” + <entity name> + <attribute name>
    • GLUE2ComputingShareRunningJobs
  – Exception for unique ID
    • Could be confusing for all object IDs to have the same name
      • dn: GLUE2EntityID=x, GLUE2EntityID=y, GLUE2EntityID=z
    • Named for the first derived entity instead
    • GLUE2ShareID not GLUE2EntityID (or GLUE2ComputingShareID)
• Foreign key attributes
  – Representation of relations between entities
  – All schema relations have a corresponding key attribute
    • Even when the relation is also implied by the DN
    • Multiplicity (optional/mandatory, single/multivalued) follows the schema like other attributes
    • Inherited in the same way as other attributes
  – Only needed at one end of a relation
    • Decided case-by-case, but generally point logically “up” (child points to parent)
  – Named specifically for the relation they represent
    • Long but unambiguous
    • GLUE2ComputingShareComputingServiceForeignKey
  – Attribute value is the ID of the target entity
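The naming rules above are mechanical enough to express as small helpers. The Python sketch below reproduces the three example names from this slide; the function names are hypothetical and chosen for illustration, and the caller is assumed to know which entity is the "first derived" one for the ID rule.

```python
PREFIX = "GLUE2"


def attribute_name(entity, attribute):
    """Ordinary attribute: prefix + entity name + attribute name."""
    return f"{PREFIX}{entity}{attribute}"


def id_attribute(first_derived_entity):
    """Unique ID: named for the first derived entity, e.g. Share, not Entity."""
    return f"{PREFIX}{first_derived_entity}ID"


def foreign_key_name(source_entity, target_entity):
    """Foreign key: names both ends of the relation it represents."""
    return f"{PREFIX}{source_entity}{target_entity}ForeignKey"


print(attribute_name("ComputingShare", "RunningJobs"))
print(id_attribute("Share"))
print(foreign_key_name("ComputingShare", "ComputingService"))
```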


DN construction (LDAP tree)

• LDAP objects are arranged in a hierarchical tree
• Each object has a unique Distinguished Name (DN) representing its location in the tree
• The DN is constructed as a series of components, each of which gives the name and value of an identifying object attribute
  – For GLUE the natural attribute is the ID
• Attach objects in the tree according to a natural hierarchy
  – The GLUE schema relations are more complex than a tree can represent, so this is only indicative
  – subcomponent -> component -> Service -> AdminDomain -> Root
    GLUE2PolicyID=xxx, GLUE2ShareID=xxy, GLUE2ServiceID=xxz, GLUE2DomainID=zyx, GLUE2DomainID=xyz
  – Extension objects must be directly below the object they extend
  – Dummy grouping object to insert GLUE2GroupID anywhere in the DN
    • No semantics, just makes the tree easier to follow, e.g. in an LDAP browser
    • cf. mds-vo-name
  – Should navigate using foreign keys, not DNs
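As a rough illustration of the two points above, the Python sketch below assembles a DN from leaf-first components and, separately, follows a foreign key instead of parsing the DN. Both function names and the in-memory `entries` dictionary are hypothetical; escaping of special characters in DN values is deliberately omitted, so this is not a complete DN encoder.

```python
def build_dn(components):
    """Join (attribute, value) pairs, ordered leaf first and root last."""
    return ", ".join(f"{name}={value}" for name, value in components)


def follow_foreign_key(entries, entry_id, key_attribute):
    """Look up the entry whose ID is stored in the given foreign key attribute."""
    target_id = entries[entry_id][key_attribute]
    return entries[target_id]


dn = build_dn([
    ("GLUE2PolicyID", "xxx"),
    ("GLUE2ShareID", "xxy"),
    ("GLUE2ServiceID", "xxz"),
    ("GLUE2DomainID", "zyx"),
    ("GLUE2DomainID", "xyz"),
])
print(dn)

# Hypothetical in-memory copy of two entries, keyed by their IDs.
entries = {
    "xxz": {"GLUE2ServiceID": "xxz"},
    "xxy": {"GLUE2ShareID": "xxy",
            "GLUE2ComputingShareComputingServiceForeignKey": "xxz"},
}
service = follow_foreign_key(
    entries, "xxy", "GLUE2ComputingShareComputingServiceForeignKey")
print(service["GLUE2ServiceID"])
```

The second function never inspects the DN, so it keeps working if grouping objects are inserted into the tree.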


BDII implementation

• Merged LDAP schema, GLUE 1.3 + GLUE 2
• Single LDAP server, on port 2170 as usual
• Separate root DNs
  – o=glue vs o=grid
  – Should be no crosstalk other than data volume
• Resource BDII: GLUE2GroupID=resource, o=glue
• Site BDII: GLUE2DomainID=<site-name>, o=glue
• Top BDII: GLUE2DomainID=<site-name>, [GLUE2DomainID=<grid-name>,] GLUE2GroupID=grid, o=glue
• Roll out
  – Only implemented for SL5 (gLite 3.2)
  – Resource BDII in production since September 2009
  – Site BDII in production since August 2010
  – Top-level BDII in production in October 2010
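The base DNs listed above, together with port 2170, are enough to compose a query against a BDII. The Python sketch below builds the base DN for each BDII level and an argument list for the OpenLDAP `ldapsearch` client. The function names, the host `bdii.example.org` and the site name `EXAMPLE-SITE` are placeholders invented for illustration, and the commands are only constructed, not executed.

```python
def base_dn(level, site=None, grid=None):
    """Return the GLUE 2 base DN for a resource, site or top BDII."""
    if level == "resource":
        return "GLUE2GroupID=resource,o=glue"
    if level in ("site", "top") and not site:
        raise ValueError("a site name is required for site and top BDIIs")
    if level == "site":
        return f"GLUE2DomainID={site},o=glue"
    if level == "top":
        parts = [f"GLUE2DomainID={site}"]
        if grid:  # the grid-level domain is optional in the top BDII layout
            parts.append(f"GLUE2DomainID={grid}")
        parts += ["GLUE2GroupID=grid", "o=glue"]
        return ",".join(parts)
    raise ValueError(f"unknown BDII level: {level}")


def ldapsearch_args(host, base, object_class, port=2170):
    """Build an ldapsearch command line: anonymous simple bind, plain LDIF output."""
    return ["ldapsearch", "-x", "-LLL",
            "-H", f"ldap://{host}:{port}",
            "-b", base,
            f"(objectClass={object_class})"]


# Placeholder host and site name, not real endpoints.
print(" ".join(ldapsearch_args("bdii.example.org",
                               base_dn("site", site="EXAMPLE-SITE"),
                               "GLUE2ComputingShare")))
```

Querying under `o=grid` instead of `o=glue` on the same server would reach the GLUE 1.3 tree, which is what keeps the two schemas separate.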


Information providers

• Generic Service publisher exists
  – Drop-in replacement for the existing generic GlueService provider for GLUE 1.3
• Currently supports one Endpoint per Service (plus the corresponding AccessPolicy)
  – Easy to extend for multiple Endpoints (VOMS)
• Supports all relevant attributes
• Being rolled out as new versions of services are released
  – Already have CREAM, LB, bdii_site, bdii_top, VOBOX in production
  – In work for WMS, MyProxy, AMGA, VOMS, Frontier/squid (!), …
    • Easy to add publication for any service
  – FTS and LFC currently have their own providers for GLUE 1.3, so may need more work
• Computing and Storage providers will be more complex
  – CREAM will be incremental
    • Endpoint publisher already exists, ExecutionEnvironment (Cluster) in work
    • Need to re-use GLUE 1.3 code where possible
      • Particularly for the batch system interfaces
  – Storage publishers will be needed for each SRM type
    • Not yet started
    • Expected to have a first version in the first EMI release (April)
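To give a feel for what a Service publisher emits, here is a heavily simplified Python sketch that prints LDIF for one Service with one Endpoint under the resource BDII base DN. It is not the actual generic publisher: the function names are hypothetical, only the ID and foreign key attributes (derived from the naming rules in this talk) are included, and the mandatory attributes a schema-valid entry would need are left out. Values that require LDIF escaping are also not handled.

```python
def ldif_entry(dn, object_classes, attributes):
    """Format one LDIF entry from a DN, object classes and (name, value) pairs."""
    lines = [f"dn: {dn}"]
    lines += [f"objectClass: {oc}" for oc in object_classes]
    lines += [f"{name}: {value}" for name, value in attributes]
    return "\n".join(lines) + "\n"


def publish_service(service_id, endpoint_id,
                    base="GLUE2GroupID=resource,o=glue"):
    """Return LDIF for a Service and its single Endpoint (illustrative subset)."""
    service_dn = f"GLUE2ServiceID={service_id},{base}"
    endpoint_dn = f"GLUE2EndpointID={endpoint_id},{service_dn}"
    service = ldif_entry(service_dn,
                         ["GLUE2Entity", "GLUE2Service"],
                         [("GLUE2ServiceID", service_id)])
    endpoint = ldif_entry(endpoint_dn,
                          ["GLUE2Entity", "GLUE2Endpoint"],
                          [("GLUE2EndpointID", endpoint_id),
                           # The child points "up" to its parent Service.
                           ("GLUE2EndpointServiceForeignKey", service_id)])
    return service + "\n" + endpoint


# Placeholder IDs for illustration.
print(publish_service("svc1", "ep1"))
```

The Endpoint sits directly below its Service in the tree, and the foreign key repeats that relation explicitly so clients need not parse the DN.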


Clients

• All clients need to become GLUE2-aware
  – Must be backward-compatible
  – Can happen gradually
  – Mostly not yet started
• WMS: JDL
• Storage: lcg-utils/GFAL/FTS
• Service discovery: lcg-info(sites), glite-sd-query
  – First version of OGF/SAGA service discovery tool available
• Monitoring: gstat
  – Will be used to follow GLUE 2 deployment
• Accounting: resource accounting, not APEL
• User tools
• …
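One possible shape for a backward-compatible client is to look in the GLUE 2 tree first and fall back to GLUE 1.3 when nothing is published there yet. The Python sketch below shows that pattern under stated assumptions: `search` is a hypothetical callable supplied by the caller (taking a base DN and a filter, returning a list of entries), and preferring GLUE 2 over GLUE 1 is a design choice made for this example, not a policy stated in the talk.

```python
def find_services(search):
    """Return (schema, entries), preferring GLUE 2 and falling back to GLUE 1."""
    entries = search("o=glue", "(objectClass=GLUE2Service)")
    if entries:
        return "GLUE2", entries
    # Nothing published under o=glue yet: use the GLUE 1.3 tree instead.
    return "GLUE1", search("o=grid", "(objectClass=GlueService)")


def fake_search(base, ldap_filter):
    """Stand-in for a real LDAP query: this site only publishes GLUE 1.3."""
    return [] if base == "o=glue" else [{"dn": "placeholder"}]


schema, found = find_services(fake_search)
print(schema, len(found))
```

Passing the search function in keeps the fallback logic testable without a live BDII.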


Summary

• Define LDAP schema and deploy in BDIIs
  – 1.3 and 2.0 together in parallel
  – Now deployed in production
    • But sites are slow to upgrade!
• Write and deploy information providers to populate the new objects
  – Generic Service publisher available
    • Being rolled out progressively
  – ComputingService publication (for CREAM) coming incrementally
    • Full version by mid-2011?
  – StorageService more complex, many different providers
    • Timescale unclear
• Update clients to look at the new information
  – Workload management, data management, service discovery, monitoring, accounting, user, …
  – Upgrades should be backward-compatible
  – Aim for the end of 2011??
• Switch off GLUE 1 publishing
  – Only when everything has been upgraded
  – >2012???
• NB EGI/EMI will bring new middleware distributions into the fold
  – ARC is committed to implementing GLUE 2


Outlook

• The GLUE schema has developed over 9 years of practical use
• GLUE 2 is a major new version of the schema
  – Incorporates all our experience, and input from many other Grids
  – OGF backing makes this a worldwide Grid standard
    • EGI/EMI are committed to adopting it
  – Uniform structure for any service
    • CE, SE, WMS, VOMS, MyProxy, LFC, FTS, …
  – Much more expandable
    • All objects can be extended
  – Fixes many long-standing problems
    • StorageService designed for SRM 2!
    • ComputingService has a better separation of Grid endpoint, LRMS and queue/fairshare information
• GLUE 2.0 should cover all current use cases for WLCG/EGI
  – And allow things we can’t do at the moment
  – And be much more flexible for the cases we still haven’t anticipated
• Roll out has started, but the transition process will take several years


References

• OGF GLUE working group home page
  – http://forge.ogf.org/sf/projects/glue-wg
• GLUE 2.0 specification
  – http://www.ogf.org/documents/GFD.147.pdf
• LDAP rendering specification (draft)
  – http://forge.ogf.org/sf/go/doc15518?nav=1