
J Grid Computing, DOI 10.1007/s10723-009-9137-0

EDGeS: Bridging EGEE to BOINC and XtremWeb

Etienne Urbah · Peter Kacsuk · Zoltan Farkas · Gilles Fedak · Gabor Kecskemeti · Oleg Lodygensky · Attila Marosi · Zoltan Balaton · Gabriel Caillat · Gabor Gombas · Adam Kornafeld · Jozsef Kovacs · Haiwu He · Robert Lovas

Received: 7 November 2008 / Accepted: 27 August 2009
© Springer Science + Business Media B.V. 2009

Abstract Desktop Grids, such as XtremWeb and BOINC, and Service Grids, such as EGEE, are two different approaches for science communities to gather computing power from a large number of computing resources. Nevertheless, little work has been done to combine these two Grid technologies in order to establish a seamless and vast Grid resource pool. In this paper we present the EGEE Service Grid, the BOINC and XtremWeb Desktop Grids. Then, we present the EDGeS solution to bridge the EGEE Service Grid with the BOINC and XtremWeb Desktop Grids.

Keywords BOINC · Bridge · Desktop Grid · EDGeS · EGEE · Interface · Interoperation · OGF · Service Grid · XtremWeb · XWHEP

E. Urbah (B) · O. Lodygensky · G. Caillat
LAL, Université Paris-Sud, IN2P3/CNRS, Bld 200, 91898 Orsay, France
URL: http://www.lal.in2p3.fr/spip.php?article105, http://www.lal.in2p3.fr/spip.php?lang=en

P. Kacsuk · Z. Farkas · G. Kecskemeti · A. Marosi · Z. Balaton · G. Gombas · A. Kornafeld · J. Kovacs · R. Lovas
MTA SZTAKI, LPDS, Kende u. 13–17, 1111 Budapest, Hungary
URL: http://www.lpds.sztaki.hu

G. Fedak · H. He
INRIA Saclay, LRI, Bat 490, 91400 Orsay, France
URL: http://www.lri.fr/projet.associe.php?prj=15


Abbreviations

ACS Application Contents Service (OGF recommendation)
ARC Advanced Resource Connector (Grid middleware used by NorduGrid)
BES Basic Execution Services (OGF recommendation)
BIFI Institute for Biocomputation and Physics of Complex Systems (University of Zaragoza, Spain)
BOINC Berkeley Open Infrastructure for Network Computing (Grid middleware)
CA Certificate Authority
CE Computing Element (EGEE)
DEISA Distributed European Infrastructure for Supercomputing Applications
DG Desktop Grid = loose opportunistic Grid using idle resources
DMI Data Movement Interface (OGF recommendation)
DOM Document Object Model (World Wide Web Consortium)
EADM EDGeS Application Development Methodology
EDGeS Enabling Desktop Grids for e-Science
EGEE Enabling Grids for E-sciencE
FTP File Transfer Protocol
GEMLCA Grid Execution Management for Legacy Code Applications (University of Westminster)
gLite Grid middleware used by EGEE
GRAM Grid Resource Allocation Manager (Globus Toolkit)
INRIA Institut National de Recherche en Informatique et en Automatique (Saclay, France)
JDL Job Description Language (EGEE)
JSDL Job Submission Description Language (OGF recommendation)
LB Logging and Bookkeeping (EGEE)
LAL Laboratoire de l’Accélérateur Linéaire (Orsay, France)
LCG LHC Computing Grid
LPDS Laboratory of Parallel and Distributed Systems (MTA SZTAKI, Budapest, Hungary)
LRI Laboratoire de Recherche en Informatique (Orsay, France)
OGF Open Grid Forum
OGSA Open Grid Services Architecture (OGF)
OGSI Open Grid Services Infrastructure (OGF)
OMII Open Middleware Infrastructure Institute
OSG Open Science Grid (USA)
QoS Quality of Service
RUS Resource Usage Service (OGF recommendation)
SETI Search for Extra-Terrestrial Intelligence
SG Service Grid = Globally managed Grid of distributed, locally managed computing clusters
SRM Storage Resource Manager
SZDG SZTAKI Desktop Grid (Budapest, Hungary)
UI User Interface machine (EGEE)
UR Usage Record (OGF recommendation)
UoW University of Westminster (London, Great Britain)
VDT Virtual Data Toolkit (Grid middleware used by OSG)
VO Virtual Organisation
VOMS VO Management Service
WLCG Worldwide LHC Computing Grid (=LCG)
WMS Workload Management System (EGEE)
WSRF Web Services Resource Framework (OASIS)
WU Work Unit (BOINC)
XML eXtensible Markup Language (World Wide Web Consortium)

1 Introduction

Originally, the aim of Grid research was to enable sharing of computing resources, facilitate collaboration, and realise the vision that anyone can donate resources to the Grid and that anyone can claim resources dynamically according to their needs. This fourfold aim has, however, not yet been fully achieved. Currently we can observe two different approaches in the development of Grid systems. In this section, we first introduce the Service Grids (SG) approach, then the Desktop Grids (DG) approach, and finally the bridging of both types of Grids.

1.1 The Service Grids Approach

There is a growing interest among scientific communities to share their distributed computing and storage infrastructures to solve their grand-challenge problems and to further enhance their applications with extended parameter sets and greater complexity. Many scientific communities call such a shared distributed computing and storage infrastructure a Service Grid (SG). Researchers and developers in Service Grids first create a Grid service that can be accessed by a large number of users. A resource can become part of the Grid by installing a predefined software set, or middleware. However, the middleware is usually so complex that it often requires extensive expert effort to install and maintain. It is therefore natural that individuals do not normally offer their resources in this manner, and SGs are generally restricted to larger institutions, where professional system administrators take care of the hardware/middleware/software environment and ensure high availability of the Grid.

Even though the original aim of enabling anyone to join the Grid with one’s resources has not been fulfilled, the largest Grid in the world (EGEE) contains more than 100,000 processors. The security model used by SGs is based on mutual authentication between users and resources, which is realised by a public key infrastructure (PKI) using X.509 certificates. Anyone who obtains a valid certificate from a Certificate Authority (CA) can access those Grid resources that trust that CA. This is often simplified by Virtual Organization (VO) or community authorization services that centralize the management of trust relationships and access rights.

Interoperation between several SGs has already been achieved by the joint work of several organisations, notably the Open Grid Forum (OGF) [1], the Open Middleware Infrastructure Institute (OMII-Europe), and the Worldwide LHC Computing Grid (WLCG).

1.2 The Desktop Grids Approach

Desktop Grids (DG), literally Grids made of Desktop Computers, are very popular in the context of “Volunteer Computing” for large scale “Distributed Computing” projects like SETI@home [2] and Folding@home. They are very attractive, as “Internet Computing” platforms, for scientific projects seeking a huge amount of computational power for massive high throughput computing. DGs use the computing, network and storage resources of idle desktop PCs distributed over multiple LANs or the Internet. Today, this type of computing platform forms one of the largest distributed computing systems, and currently provides scientists with tens of TeraFLOPS from hundreds of thousands of hosts. In DG systems, such as BOINC [3] or XtremWeb [4], anyone can bring resources into the Grid. Installation and maintenance of the client side software is intuitive, requiring no special expertise, thus enabling a large number of donors to contribute to the pool of shared resources. On the downside, only a very limited user community (i.e., target applications) can effectively use DG resources for computation. For instance, the BOINC project features a limited number of applications, and the top 5 projects share more than 50% of the total compute power. Because users are Internet volunteers, there cannot be a security model based on trust between users. Because of user anonymity, security solutions for DGs rely on autonomous mechanisms such as sandboxing or result validation to prevent attacks from other users. As a consequence, DG systems are not yet ready to be integrated in a complex Grid infrastructure which requires high-level user rights management, authentication, authorization and rights delegation.

1.3 The Bridging of Both Types of Grids

Until now, these two kinds of Grid systems have been completely separate, and there has been no way to use their individual advantageous features in a unified environment. However, with the objective of supporting new scientific communities that need extremely large numbers of resources they cannot find in SGs, the solution could be to interconnect these two kinds of Grid systems into an integrated Service Grid–Desktop Grid (SG–DG) infrastructure.

Compared to application portals which allow job submission to different infrastructures, bridges permit seamless interoperation at low middleware layers, and prepare the ground for real interoperability. The issue of the (potentially long) time needed by a Grid to process a job is not a purely technical issue which could be solved inside the bridge itself, but an issue of QoS and SLA between the Grid user, the first (Service or Desktop) Grid to which the Grid user submits the job, and any subcontractor (Service or Desktop) Grid to which a Grid sends the job for execution. The owner of a resource in a Desktop Grid does not care whether a job comes from a Service Grid or from a Desktop Grid server; the owner cares only that the application is trusted and has some usefulness. Therefore an SG–DG bridge must use an Application Repository for storing the validated and hence trusted applications.

Interoperation between SGs and DGs has already been explored, notably by the Lattice project [5] at University of Maryland (USA), the SZTAKI Desktop Grid [6] (Budapest, Hungary), the Condor project (BOINC backfill) at University of Wisconsin (USA), the Superlink project at Technion (Haifa, Israel), and Clemson University (South Carolina, USA).

Our research is part of the work conducted by a new European FP7 infrastructure project: EDGeS (Enabling Desktop Grids for e-Science) [7], which aims to build technological bridges to facilitate interoperability between DG and SG.

In this paper, we describe the technical solution offered by the EDGeS project to bridge the EGEE service Grid with the BOINC and XtremWeb desktop Grids, and the relevant OGF recommendations for interoperability.

In the next section, we give a technical presentation of Service Grids and Desktop Grids. In Section 3, we give a presentation of the EDGeS project. In Section 4, we describe related research in bridging the two kinds of Grids. In Section 5, we describe the implementation of the bridges in EDGeS. In Section 6, we present the current operational status of EDGeS. In Section 7, we present the OGF standards relevant for future Grid interoperability.

2 Technical Presentation of Service Grids and Desktop Grids

Technically, service Grids and desktop Grids are very different. In this section we show their typical architecture and characteristics.

In order to ease the understanding of figures, we will use the following colours throughout this paper: Light blue for Users, Light yellow for EGEE, Pink for BOINC, Light pink for XtremWeb, Gold for EDGeS, Light green for services or components.

2.1 Technical Presentation of Service Grids

A Service Grid (SG) is a globally managed Grid of distributed, locally managed computing clusters, offering a guaranteed QoS (Quality of Service). Typically, institutions with their managed clusters can join SGs if they sign a certain SLA (Service Level Agreement) with the leadership of the SG. Since participants in an SG are most often institutions, an SG is also often called an “Institutional Computing Grid”.

Inside service Grids, computing and storage resources are managed by trained staff and are authenticated by X509 certificates. Users are also authenticated by X509 certificates or proxies. Users may belong to Virtual Organisations (VOs) and get X509 proxy extensions from a VOMS server, which can allow them to access data and submit jobs. In contrast, executables are not authenticated. So trust is primarily between sites and VOs.

The general principle of job processing is that users submit their jobs by delegating their X509 proxy to the SG broker (often called meta-scheduler), which pushes the jobs to resources that are both suitable and available. Figure 1 below shows the architecture and working mechanism of an EGEE-like SG where 7 basic Grid services are used:

1. The Information Service is completely distributed, provides Grid information to all other Grid services, and is not explicitly shown in the figure.

2. The User Interface is the machine where the SG client is deployed and from where users can submit their jobs into the SG.

3. The VOMS Server is the service storing user authorization information.

4. The Meta-scheduler (WMS) accepts the jobs submitted by the users, checks user job requirements and matches them with available Grid resources. Once it finds a matching Grid resource, it pushes the job to the selected Grid site. When the job is finished, it retrieves the Output Sandbox from the Grid site, and makes it available to the user who submitted the job.

5. The Computing Resources are used to process the user jobs, providing the necessary computing power.

6. The Storage Resources are used to provide storage capacity to user jobs.

7. Accounting, Logging and Bookkeeping collect and store job status information, logging and accounting information.
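
The matchmaking and push steps of the meta-scheduler (item 4 above) can be pictured with a minimal Python sketch. The `Resource` and `Job` classes, their fields, and the `match` and `push` functions are hypothetical names chosen for illustration; they are not part of the gLite WMS, whose matchmaking evaluates much richer requirement expressions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    """A published Grid resource (hypothetical, simplified view)."""
    site: str
    free_cpus: int
    memory_mb: int


@dataclass
class Job:
    """A user job with its requirements (hypothetical, simplified view)."""
    name: str
    cpus: int
    memory_mb: int


def match(job: Job, resources: List[Resource]) -> Optional[Resource]:
    """Return the first published resource that satisfies the job requirements."""
    for res in resources:
        if res.free_cpus >= job.cpus and res.memory_mb >= job.memory_mb:
            return res
    return None


def push(job: Job, resources: List[Resource]) -> Optional[str]:
    """Push model: the scheduler selects the site and reserves CPUs there."""
    target = match(job, resources)
    if target is None:
        return None  # no suitable resource: the job stays queued
    target.free_cpus -= job.cpus
    return target.site
```

The point of the sketch is the direction of control: the scheduler decides where the job runs, which presupposes that it can reach the resource and knows its state.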

Examples of such service Grid infrastructures are EGEE, NorduGrid, OSG, DEISA, TeraGrid, Naregi.

2.2 Technical Presentation of Desktop Grids

A Desktop Grid (DG) is a loose opportunistic Grid using idle resources, as shown in Fig. 2 below. Inside desktop Grids, computing and storage resources are typically owned by individual volunteer owners and not by institutes (therefore it is often called “Volunteer Computing”). Accordingly, these resources are neither managed nor authenticated. DG servers may be authenticated by an X509 certificate. DG users are authenticated by the DG servers but not by the computing and storage resources. Executables are validated, registered and deployed by managers of the DG servers, and DG users can create work units using these deployed applications by launching them with their own input data set. So, resource owners have to trust the DG servers. On the other hand, the DG servers do not fully trust the resource owners: For example, BOINC servers usually send each work unit redundantly to several computing resources, and then compare the results in order to assess their reliability [8].
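
The redundancy check described above can be illustrated by a small quorum vote. This is only a sketch under stated assumptions: the function name `validate` and the exact-equality comparison are illustrative choices, not the BOINC validator, which is application-specific and may compare results with tolerances.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def validate(results: Sequence[Hashable], quorum: int) -> Optional[Hashable]:
    """Return the agreed result if at least `quorum` replicas match, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None
```

With three replicas and a quorum of two, a single faulty or malicious resource cannot impose its result, at the price of spending roughly three times the computing power per work unit.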

Fig. 1 SG = Service Grid = institutional computing Grid. [Figure: actors Grid User, VOMS Admin, Site Admin and Grid Admin; services VOMS Server, Meta-scheduler (WMS), Accounting Logging & Bookkeeping, and Sites with Computing and Storage Resources; labelled flows include “Submits Job with VOMS proxy”, “Publishes available Resources”, “Pushes Job”, “Accesses Data with VOMS proxy”, “Sends back Output Sandbox”, “Gives Job Status” and “Gives Accounting and Auditing”. A VOMS proxy is an X509 proxy with VOMS extensions.]

Fig. 2 DG = Desktop Grid = loose opportunistic Grid using idle resources. [Figure: actors Grid User, Application Manager and Resource Owner (often volunteer); components Grid Server with Application Repository and Computing Resource (often Desktop Computer); labelled flows include “Submits input data for an application”, “Certifies Application”, “Requests Unit of Work”, “Sends Unit of Work”, “Sends back results” and “Accepts or Refuses an application on his resource”.] Currently, for BOINC, both roles of ‘Application Manager’ and ‘Grid User’ are fulfilled by ‘BOINC Project Owners’.

Most DGs do not use the push model, but the pull model for application execution: since computing resources are not under control and may not even be reachable, being protected behind firewalls, starving computing resources pull work units from the DG server. The challenge in this concept is the ratio of correct results returned by computing resources to the number of work units they have pulled.

DGs cannot run arbitrary applications, but only master/worker and parameter sweep applications, where the same application code is executed with a large set of different data. It is the task of the DG server to store the data sets and send out the needed data subset to the resources asking for a new work unit.
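
A minimal sketch of the pull model applied to a parameter sweep follows. The `DGServer` class and `worker_loop` function are hypothetical and run in a single process; a real DG server talks to remote clients over the network and adds scheduling, deadlines and validation.

```python
from collections import deque
from typing import Callable, Deque, Dict, Iterable, Optional, Tuple

WorkUnit = Tuple[int, float]  # (work unit id, input parameter)


class DGServer:
    """Stores the data set and hands out one work unit per request."""

    def __init__(self, parameters: Iterable[float]) -> None:
        self.pending: Deque[WorkUnit] = deque(enumerate(parameters))
        self.results: Dict[int, float] = {}

    def request_work(self) -> Optional[WorkUnit]:
        """Called by a worker; returns None when no work is left."""
        return self.pending.popleft() if self.pending else None

    def report(self, wu_id: int, result: float) -> None:
        self.results[wu_id] = result


def worker_loop(server: DGServer, app: Callable[[float], float]) -> int:
    """Pull model: the worker initiates every exchange with the server."""
    done = 0
    while (wu := server.request_work()) is not None:
        wu_id, param = wu
        server.report(wu_id, app(param))  # same code, different data
        done += 1
    return done
```

Because every exchange is started by the worker, only outgoing connections are needed on the worker side, which is why this model works behind firewalls.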

Examples of such desktop Grid systems are BOINC, XtremWeb, OurGrid, Xgrid.

3 Presentation of the EDGeS Project

The EDGeS project is a two-year European FP7 project that started on 01/01/2008. In this section, we present the goals of the project, the activities at work, and the way EDGeS handles interoperation and interoperability issues among SGs and DGs.

3.1 Goals of the EDGeS Project

From the technical point of view, the main goal of EDGeS is to develop bridges that enable the interoperation of SGs and DGs. Bridges in both the SG → DG and DG → SG directions should be created by EDGeS. However, it is not enough to create prototypes of these bridges; it is equally important to establish a production infrastructure where these bridges can be used between EGEE, BOINC and XtremWeb systems. The production infrastructure will first connect only these three Grids. But in the medium term, as shown in Fig. 3 below, the developed bridge technology (in gold at the centre) should enable the integration of many Service Grids (in yellow at the top) and Desktop Grids (in pink at the bottom), just as WLCG already promotes integration between EGEE, NorduGrid and OSG.

A further important goal of EDGeS is to port existing SG and DG applications to the integrated SG-DG infrastructure and to provide a seamless job execution mechanism among the interconnected SG and DG systems.

3.2 Interoperation and Interoperability

The EDGeS project currently focuses on practical interoperation, by building and deploying today ad-hoc bridges between EGEE, BOINC and XtremWeb. In particular, XtremWeb users must have an X509 certificate, be registered in a VO and submit their jobs with an X509 proxy. BOINC Project Owners must have an X509 certificate, be registered in the EDGeS VO of EGEE and store a medium-term X509 proxy in a MyProxy server of the EDGeS VO. All files must be transferred through the input and output sandboxes.

Fig. 3 Integration of service Grids and desktop Grids. [Figure: EDGeS at the centre; above it WLCG (CERN) and the Service Grid middleware gLite (EGEE), ARC (NorduGrid), VDT (OSG) and Unicore (DEISA); below it the Desktop Grid middleware Boinc (Berkeley), XtremWeb (INRIA/IN2P3) and Xgrid (Apple); links are marked Current or Future.]

In the future, the EDGeS project will work on interoperability using OGF standards, in order to bridge more Grids (see Fig. 3 above), and provide better support of Grid file access (ByteIO, SRM, GridFTP and DMI).

4 Related Research in Bridging Desktop Grids and Service Grids

There exist two main approaches to bridge SG and DG (see Fig. 4 below).

In this section, we present the principles of these two approaches and discuss them from architectural, functional and security perspectives.

4.1 The Superworker Approach

The superworker, proposed by the Lattice project [5] and the SZTAKI Desktop Grid [6], is the first solution. This enables the usage of several Grid or cluster resources to schedule DG tasks. The superworker is a bridge implemented as a daemon between the DG server and the SG resources.

From the DG server point of view, the Grid or cluster appears as one single resource with large computing capabilities. The superworker continuously fetches tasks or work units from the DG server, then wraps and submits the tasks to the local Grid or cluster resource manager accordingly. When computations are finished on the SG computing nodes, the superworker sends back the results to the DG server. Thus, the superworker by itself is a scheduler which needs to continuously scan the queues of the computing resources and watch for available resources to launch jobs.
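
One scheduling cycle of such a superworker might look like the following sketch. The function `superworker_cycle` and its three callables are assumptions made for illustration: `fetch` stands for the request to the DG server, `run_on_cluster` for submission to the local resource manager, and `report` for the upload of results.

```python
from typing import Callable, Optional


def superworker_cycle(
    fetch: Callable[[], Optional[str]],
    run_on_cluster: Callable[[str], str],
    report: Callable[[str, str], None],
    free_slots: int,
) -> int:
    """Fetch up to `free_slots` work units, run them, and report the results."""
    handled = 0
    for _ in range(free_slots):
        wu = fetch()
        if wu is None:  # the DG server has no more work
            break
        report(wu, run_on_cluster(wu))
        handled += 1
    return handled
```

Every work unit and every result passes through this single function, which makes the centralisation of the approach visible.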

Since the superworker is a centralized agent, this solution has several drawbacks:

1. the superworker can become a bottleneck when the number of computing resources increases,

2. it introduces a single point of failure in the system, which therefore has low fault tolerance,

3. the round trip for a work unit is increased because it has to be marshaled/unmarshaled by the superworker.

The superworker does not require modification of the infrastructure, but Grid security rules about traceability require that the actual job owner is the original job submitter. So, the original submitters must be authorized to submit jobs to the Grid, and have to delegate their credentials to the superworker.

Fig. 4 Bridging service Grid and desktop Grid, the superworker approach versus the gliding-in approach. [Figure: in the superworker approach, a Superworker agent pulls jobs from the Desktop Grid Server, submits them to a managed cluster of computing resources in the Service Grid and returns the results; in the gliding-in approach, Pilot Jobs submitted to the Service Grid pull jobs from the Desktop Grid Server and return the results. In both cases, local and remote desktop computing resources pull jobs and return results.]

4.2 The Gliding-in Approach

The Gliding-in approach, used to cluster resources spread across different Condor pools by means of the Global Computing system (XtremWeb), was first introduced in [9]. The main principle consists in wrapping the XtremWeb worker as a regular Condor task and in submitting this task to the Condor pool. Once the worker is executed on a Condor resource, the worker pulls jobs from the DG server, executes the XtremWeb task and returns the result to the XtremWeb server. As a consequence, the Condor resources communicate directly with the XtremWeb server. This is permitted by standard firewall settings on most sites of Service Grids, which usually block all incoming connections (except those explicitly allowed), and allow all outgoing connections.

Mechanisms similar to those described just above are now commonly employed in Grid Computing [10]. For example, Dirac uses a combination of push and pull mechanisms to execute jobs on several Grid clusters. The generic approach on the Grid is called a pilot job. Instead of submitting jobs directly to the Grid meta-scheduler, this system submits so-called pilot jobs. When executed, the pilot job fetches jobs from an external job scheduler.

The gliding-in or pilot job approach has several advantages. While simple, this mechanism efficiently balances the load between heterogeneous computing sites. It benefits from the fault tolerance provided by the DG server; if Grid nodes fail, then jobs are rescheduled to the next available resources. Finally, as the performance study of the Falkon system [11] shows, it gives better performance because series of jobs do not have to go through the meta-scheduler queue, which generally has long waiting times, and communication is direct between the worker running on the Computing Element (CE) and the DG server, without an intermediate agent such as the superworker. Grid security rules about multi-user Pilot Jobs require that the actual job owner be the original job submitter, not the pilot job owner. EGEE will progressively enforce this job owner switching, which can now be achieved using gLExec [12].

5 Implementation of the SG–DG Bridges in EDGeS

In this section, we present the bridges and the application repository developed and implemented by the EDGeS project. The technologies used for the bridges are based on previous developments performed inside the SZTAKI Desktop Grid, and inside XtremWeb. The XtremWeb implementation used is XWHEP developed by IN2P3. The technology used for the application repository is based on the GEMLCA repository.

5.1 Architecture of the 3G Bridge

The integration of the EGEE, BOINC and XtremWeb systems theoretically requires 4 different bridges:

• BOINC → EGEE,
• XtremWeb → EGEE,
• EGEE → BOINC,
• EGEE → XtremWeb.

Instead of developing four different solutions for the four bridges, we created a Generic Grid to Grid (3G) bridge that can be easily adapted to each of the required bridges. In fact, the 3G bridge is generic enough to be adapted not only to the three Grid systems we tackle in EDGeS, but also to other Grid systems like GT2, GT4, OurGrid, Xgrid, etc. So the introduction and usage of the 3G bridge represents a significant step towards a standardized solution for integrating SGs and DGs.

The architecture of the 3G Bridge and its application in EDGeS is shown in Fig. 5 below. The roles of the four core components of the 3G Bridge are as follows:

• The Job Database stores DG Work Units and SG jobs as generic descriptions. The Source Grid Handler Interface is used to place DG work units and SG jobs into the Job Database and to query their status.

• The Source Grid Handler Interface is implemented via MySQL and as such it accepts SQL queries, inserts and updates. Besides the task of placing generic descriptions into the Job Database, its other task is to get job status information from it. Jobs/WUs coming from various source Grids are received by specific handlers that transfer the incoming jobs/WUs to the Source Grid Handler Interface. So, in order to connect a Grid as a source Grid to the 3G Bridge, a Grid-specific handler should be written for this source Grid.

• The Queue Manager periodically reads jobs from the Job Database and transmits them to the Target Grid Plug-in Interface.

• The Target Grid Plug-in Interface makes it possible to connect various target Grids via their plug-ins. The Target Grid Plug-in Interface provides a generic set of interface functions that should be implemented by the target Grid plug-ins. In order to connect a Grid as a target Grid into the 3G Bridge, a Grid-specific plug-in should be written for this target Grid. Note that the Grid plug-in is also responsible for querying the status of job/WU execution in the target Grid and retrieving the output of submitted jobs/WUs.

Fig. 5 Architecture of the EDGeS 3G bridge. [Figure: on the source side, BOINC Handlers and the EGEE Handler (LCG-CE for EDGeS) pass work units and jobs (submitted by users with an X509 proxy) through the Job Handler Interface into the Job Database; the Queue Manager and Scheduler pass them through the Grid Handler Interface to the BOINC Plugins (DC-API), EGEE Plugins and XtremWeb Plugins, which target a BOINC Server, the gLite WMS and an XtremWeb Server.]
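
The interplay of the four core components can be sketched in a few lines of Python. Everything below is an illustrative assumption rather than the EDGeS code: the table layout, the status names, the `TargetGridPlugin` interface and the `EchoPlugin` toy target are invented, and an in-memory SQLite database stands in for the MySQL database named in the text.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Dict


class TargetGridPlugin(ABC):
    """Hypothetical plug-in interface: one subclass per target Grid."""

    @abstractmethod
    def submit(self, job_id: int, description: str) -> None: ...

    @abstractmethod
    def status(self, job_id: int) -> str: ...


class EchoPlugin(TargetGridPlugin):
    """Toy target Grid that finishes every job immediately."""

    def __init__(self) -> None:
        self.submitted: Dict[int, str] = {}

    def submit(self, job_id: int, description: str) -> None:
        self.submitted[job_id] = description

    def status(self, job_id: int) -> str:
        return "FINISHED" if job_id in self.submitted else "UNKNOWN"


def open_job_database() -> sqlite3.Connection:
    """Job Database holding generic job descriptions (SQLite stand-in)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, grid TEXT, "
        "description TEXT, status TEXT)"
    )
    return db


def add_job(db: sqlite3.Connection, grid: str, description: str) -> int:
    """Role of a source Grid handler: insert a generic job description."""
    cur = db.execute(
        "INSERT INTO jobs (grid, description, status) VALUES (?, ?, 'INIT')",
        (grid, description),
    )
    return cur.lastrowid


def queue_manager_pass(db: sqlite3.Connection,
                       plugins: Dict[str, TargetGridPlugin]) -> None:
    """One pass of the Queue Manager: submit new jobs, then refresh statuses."""
    new_jobs = db.execute(
        "SELECT id, grid, description FROM jobs WHERE status = 'INIT'"
    ).fetchall()
    for job_id, grid, desc in new_jobs:
        plugins[grid].submit(job_id, desc)
        db.execute("UPDATE jobs SET status = 'RUNNING' WHERE id = ?", (job_id,))
    running = db.execute(
        "SELECT id, grid FROM jobs WHERE status = 'RUNNING'"
    ).fetchall()
    for job_id, grid in running:
        db.execute("UPDATE jobs SET status = ? WHERE id = ?",
                   (plugins[grid].status(job_id), job_id))
```

In this sketch, adding a new target Grid only requires another `TargetGridPlugin` subclass, and adding a new source Grid only requires another caller of `add_job`, which is the decoupling the text attributes to the 3G Bridge.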

As Fig. 5 shows, the EDGeS 3G Bridge performs the following three bridging functionalities:

1. BOINC to EGEE bridge: Fig. 6 shows the BOINC handler and the EGEE plug-in.

2. EGEE to BOINC bridge: Fig. 8 shows the BOINC plug-in and the customized EGEE GRAM manager as EGEE handler of the 3G Bridge.

3. EGEE to XtremWeb bridge: Fig. 9 shows the XtremWeb plug-in and the modified EGEE GRAM manager as EGEE handler of the 3G Bridge. Note that the advantage of using the 3G Bridge is that the EGEE handler can be the same for both the EGEE to BOINC bridge and the EGEE to XtremWeb bridge.

5.2 Bridges BOINC → EGEE and XtremWeb → EGEE

The goal is to receive BOINC work units or XtremWeb jobs, and to have them executed inside the EGEE service Grid. The challenges are to securely authenticate the submitter with an X509 proxy acceptable to EGEE, and to wrap the incoming work unit or job as an EGEE job.

5.2.1 Bridge BOINC → EGEE

The main principle of the BOINC → EGEE bridge is to fetch work units from a BOINC server and to use the 3G bridge architecture presented in the previous section in order to translate these work units into EGEE jobs. Thus, in order to have a fully functional BOINC → EGEE bridge, two main additional components are needed: a BOINC handler and an EGEE Target Grid Plug-In.


The overview diagram of the complete system can be seen in Fig. 6 below.

The BOINC handler consists of two components shown on the left-hand side of the EDGeS 3G bridge:

• The BOINC jobwrapper client is a modified BOINC client which simulates a very powerful desktop PC, consisting of many CPU cores, and communicates with the BOINC server in order to fetch work units and report their results. For each fetched work unit, the BOINC jobwrapper client creates a work unit description file describing the main properties of the WU, like the executable name, location and name of input and output files, and the command line arguments. Then, instead of starting the work unit on each simulated processor, the BOINC jobwrapper client starts one 3G jobwrapper application on behalf of each simulated processor (e.g. if the modified BOINC client simulates a PC consisting of ten processors, then it will start ten 3G jobwrapper applications).

• The 3G jobwrapper application reads the work unit description created by the BOINC jobwrapper client in order to create a new BOINC work unit entry in the 3G Bridge database. From this point on, the 3G jobwrapper application periodically checks the status of the entry in the database. As soon as it finds that the work unit is terminated, the 3G jobwrapper application fetches available outputs from the 3G Bridge, and reports the status to the BOINC jobwrapper client, which then considers the work unit as finished, and sends the result to the BOINC server at the next scheduling request.
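The behaviour of the 3G jobwrapper application described above can be sketched in Python as follows. This is an illustration only: the function name, the dictionary standing in for the 3G Bridge database, the field names and the status strings are assumptions, and the sleep function is injectable so that the example runs without actually waiting.

```python
import time


def run_jobwrapper(wu_description, bridge_db, queue_name,
                   poll_interval=300, sleep=time.sleep):
    """Register a work unit in the bridge database, then poll until it ends.

    bridge_db is a plain dict standing in for the 3G Bridge database.
    Returns the final status and the list of output files.
    """
    entry_id = wu_description["name"]
    # Create the work unit entry from the description written by the
    # BOINC jobwrapper client.
    bridge_db[entry_id] = {
        "queue": queue_name,
        "status": "INIT",
        "outputs": [],
        **wu_description,
    }
    # Periodically check the status of the entry.
    while bridge_db[entry_id]["status"] not in ("FINISHED", "ERROR"):
        sleep(poll_interval)
    entry = bridge_db[entry_id]
    return entry["status"], entry["outputs"]


# Demonstration with a fake sleep that lets the "bridge" finish the work
# unit during the second polling period.
db = {}
polls = []


def fake_sleep(seconds):
    polls.append(seconds)
    if len(polls) == 2:
        db["wu_1"]["status"] = "FINISHED"
        db["wu_1"]["outputs"] = ["out.dat"]


status, outputs = run_jobwrapper(
    {"name": "wu_1", "executable": "app", "args": ["-n", "3"]},
    db, "egee-queue", sleep=fake_sleep)
```

The returned status and outputs are what the jobwrapper would report back to the BOINC jobwrapper client.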

The EGEE Target Grid Plug-In, shown on the right-hand side of the EDGeS 3G bridge, selects jobs from a queue in the 3G Bridge database and runs them on EGEE. The EGEE plug-in can have different instances depending on the EGEE VO or user credential to be used for running the jobs. Thus, each EGEE plug-in is associated with a (queue name, VO name, credential) triple, where queue name is the name of the queue to use in the 3G Bridge database, VO name is the EGEE VO to be used, and credential is MyProxy access information for connection to the given EGEE VO. A given 3G jobwrapper application uses a configured queue name to explicitly select the VO and credential used to run the jobs on EGEE.

In order to handle EGEE jobs, the EGEE plug-in uses an advanced feature of gLite: collection job submission. Using this feature, it is possible to send several jobs to the EGEE WMS in one submission request in order to lessen the load on the WMS and increase the number of jobs that can be handled. After the EGEE plug-in has sent the submission request for the collection, it retrieves one EGEE job identifier for each individual job within the collection.
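The (queue name, VO name, credential) association and the collection submission can be sketched together in Python. The sketch does not call gLite: `fake_wms_submit_collection` is a hypothetical stand-in for the WMS, and the host names, field names and status strings are invented for illustration.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class EgeePluginInstance:
    """One plug-in instance per (queue name, VO name, credential) triple."""
    queue_name: str
    vo_name: str
    myproxy: str      # MyProxy access information (hypothetical format)


_ids = count(1)


def fake_wms_submit_collection(vo_name, jobs):
    """Stand-in for a collection submission: one request, one id per job."""
    return [f"https://wms.example.org/{vo_name}/{next(_ids)}" for _ in jobs]


def submit_queue(plugin, bridge_db, submit_collection=fake_wms_submit_collection):
    """Submit all new jobs of the plug-in's queue as a single collection."""
    pending = [job for job in bridge_db
               if job["queue"] == plugin.queue_name and job["status"] == "INIT"]
    if not pending:
        return {}
    egee_ids = submit_collection(plugin.vo_name, pending)
    # Record the individual EGEE identifier of every job in the collection.
    for job, egee_id in zip(pending, egee_ids):
        job["status"] = "RUNNING"
        job["egee_id"] = egee_id
    return {job["name"]: job["egee_id"] for job in pending}


bridge_db = [
    {"name": "wu_1", "queue": "q_bio", "status": "INIT"},
    {"name": "wu_2", "queue": "q_bio", "status": "INIT"},
    {"name": "wu_3", "queue": "q_phys", "status": "INIT"},
]
plugin = EgeePluginInstance("q_bio", "vo.example.org", "myproxy.example.org:7512")
submitted = submit_queue(plugin, bridge_db)
```

Only the two jobs of queue `q_bio` are sent, in one request, and each receives its own identifier; the job of the other queue is left for the plug-in instance configured for that queue.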

Fig. 6 Overview of the BOINC → EGEE bridge. [Diagram. Components shown: BOINC Project Owner, BOINC Server (work unit submission), BOINC Handler (one for each (BOINC server, BOINC Project Owner, EGEE VO) triple) made of the BOINC jobwrapper client (simulating a large BOINC computing resource) and the 3G jobwrappers, EDGeS 3G bridge (Grid Handler Interface, Queue Manager & Job DB, Job Handler Interface), EGEE Plugin (one for each (BOINC Project Owner, EGEE VO) pair) with its configuration file (DN of X509 proxy), VOMS proxy retriever, MyProxy (medium term X509 proxy), VOMS Server (VOMS extensions), short term X509 proxy, and EGEE WMS.]


The BOINC → EGEE bridge uses configuration files for the following components:

• BOINC jobwrapper client: the configuration file contains the number of CPU cores to report, and the location of the 3G Bridge jobwrapper application to start instead of the work unit executables,

• 3G jobwrapper application: the configuration file contains the target queue name and the location of the configuration file needed for database access,

• EGEE plug-in: the EGEE plug-in is configured within the 3G Bridge configuration file, as the plug-ins are set up by the 3G Bridge on startup. For each EGEE plug-in instance, the following configuration variables have to be set: the plug-in type (EGEE for EGEE plug-ins), the VO name to use, the WMS URL, and MyProxy information for getting access to credentials (MyProxy hostname, port, DN of the X509 proxy of the BOINC project owner).

The security concepts of the two bridged Grids are different: on the one hand, BOINC does not require X509 certificates, but permits only the BOINC project owner to add new work units; on the other hand, EGEE provides access to every user with a registered X509 certificate. In order to gain access to EGEE resources and run the project’s work units on EGEE, the BOINC project owner has to send (only once) the DN of his X509 proxy to the 3G bridge administrator. Then, following standard EGEE rules, he stores his X509 proxy inside a MyProxy server which unconditionally trusts the EDGeS 3G bridge. As a consequence, the EDGeS 3G bridge must be operated with at least the same security level as a MyProxy server. The bridge components generate detailed logging, so that the bridge administrator can quickly identify the BOINC project and work unit responsible for any maliciously behaving EGEE job started by an EGEE plug-in.

5.2.2 Bridge XtremWeb → EGEE

The XtremWeb project had developed a prototype of a bridge to EGEE before the beginning of the EDGeS project. So, instead of creating a brand new XtremWeb handler inside the 3G bridge, we decided to upgrade the existing prototype into a production feature.

The general principle is an XtremWeb bridge performing job gliding [9, 10] using an XtremWeb worker job as a mono-user pilot job: when an XtremWeb worker job runs on an EGEE resource, it contacts the XtremWeb server, retrieves one user job, executes it, and sends back the result to the XtremWeb server.

Care must be taken that jobs sent to EGEE really belong to users having the right to access EGEE resources, and that the original submitter remains traceable. This is achieved using an X509 proxy of the original submitter.

XtremWeb features a security model allowing both anonymous users (for instance volunteers giving their computing resources) and authenticated users (for instance EGEE users submitting jobs) to coexist within the same infrastructure. This security model (private, group, public), which follows the standard POSIX model (user, group, other), is enforced by the XtremWeb servers, which manage access rights between users, applications, data and jobs.

The X509 proxy used to certify the XtremWeb user job for EGEE is a critical piece of data which must be readable only by its owner. So, the user job must be an XtremWeb private user job, and the corresponding mono-user pilot job submitted by the XtremWeb bridge must be an XtremWeb private worker job.
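The (private, group, public) model can be made concrete with a small Python sketch. It is a simplified reading of the model described above, not XtremWeb code: the class and function names are hypothetical, and only read access is modelled.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    PRIVATE = "private"   # analogous to POSIX "user"
    GROUP = "group"       # analogous to POSIX "group"
    PUBLIC = "public"     # analogous to POSIX "other"


@dataclass(frozen=True)
class User:
    name: str
    group: str


@dataclass(frozen=True)
class Resource:
    """A job, an application or a piece of data with an owner and a level."""
    owner: User
    access: Access


def may_read(user: User, resource: Resource) -> bool:
    """Decide whether `user` may read `resource` under the simplified model."""
    if resource.access is Access.PUBLIC:
        return True
    if resource.access is Access.GROUP:
        return user.group == resource.owner.group
    # PRIVATE: only the owner may read, which is what the X509 proxy requires.
    return user == resource.owner


alice = User("alice", "egee-users")
bob = User("bob", "egee-users")
volunteer = User("anonymous", "volunteers")

proxy_job = Resource(owner=alice, access=Access.PRIVATE)
shared_app = Resource(owner=alice, access=Access.GROUP)
```

With these definitions, a private user job carrying an X509 proxy is readable by its owner only, while a group-level application is also visible to other members of the owner's group.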

The processing sequence, shown in Fig. 7 below, contains the following nine steps:

1. An XtremWeb user creates an X509 proxy, and contacts an EGEE VOMS server to obtain VOMS extensions allowing him to access EGEE resources,

2. The XtremWeb user submits to the XtremWeb server his user job with the X509 proxy and VOMS extensions, which certify this user job,

3. The XtremWeb bridge periodically asks the XtremWeb server for user jobs certified by an X509 proxy. If such a job is pending, the XtremWeb server sends the bridge the user job certified with the X509 proxy,

4. The XtremWeb bridge submits to an EGEE WMS an XtremWeb worker job, which is a mono-user pilot job with this X509 proxy (job description in JDL),

5. The EGEE WMS pushes the pilot job to an EGEE CE, which executes it,

6. The mono-user pilot job, which is an XtremWeb worker job, requests only the original user job from the XtremWeb server, and stops if it receives none,

7. The XtremWeb server verifies that the requested user job is certified with an X509 proxy, and sends the user job and the X509 proxy to the pilot job,

8. The pilot job verifies that the received X509 proxy is the same as its own X509 proxy, and executes the user job inside a sandbox [13–15],

9. At the end of the user job, the pilot job sends the job results directly to the XtremWeb server, then stops.

Fig. 7 Bridge XtremWeb → EGEE. [Diagram. Actors shown: XtremWeb User, VOMS Server, XtremWeb Server, XtremWeb Bridge, gLite WMS, Computing Element, Mono-user Pilot Job and User Job. Arrow labels: X509 proxy with VOMS extensions; Submits User Job with X509 proxy; Requests User Jobs; Sends User Jobs with X509 proxy; Manages User Job status; Submits mono-user Pilot Job with X509 proxy; Gives Pilot Job Status; Pushes Pilot job; Requests only 1 User Job; Sends 1 User Job with same X509 proxy; Sends back results directly; Sends back Job Status and Results.]
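Steps 6 to 9, seen from the pilot job, can be sketched in Python. The sketch is illustrative only: the server class, the method names and the return codes are hypothetical, the proxies are plain byte strings compared through a SHA-256 digest (the paper does not say how the comparison is made), and the user job is an ordinary Python callable rather than a sandboxed process.

```python
import hashlib


def proxy_fingerprint(proxy_pem: bytes) -> str:
    """Digest used here to compare two proxies (an assumption of this sketch)."""
    return hashlib.sha256(proxy_pem).hexdigest()


class FakeXtremWebServer:
    """Minimal stand-in for the XtremWeb server."""

    def __init__(self, jobs):
        self.jobs = jobs          # job_id -> {"proxy": bytes, "command": callable}
        self.results = {}

    def request_job(self, job_id):
        # Step 7: only jobs carrying an X509 proxy are handed out.
        job = self.jobs.get(job_id)
        return job if job and job.get("proxy") else None

    def send_result(self, job_id, result):
        self.results[job_id] = result


def run_pilot(server, job_id, own_proxy: bytes) -> str:
    """Mono-user pilot job: fetch exactly one user job, check it, run it."""
    job = server.request_job(job_id)                       # step 6
    if job is None:
        return "NO_JOB"                                    # stop if none received
    if proxy_fingerprint(job["proxy"]) != proxy_fingerprint(own_proxy):
        return "PROXY_MISMATCH"                            # step 8 check failed
    result = job["command"]()      # a real pilot would run this in a sandbox
    server.send_result(job_id, result)                     # step 9
    return "DONE"


server = FakeXtremWebServer({
    "job-1": {"proxy": b"proxy-of-alice", "command": lambda: 6 * 7},
})
outcome_ok = run_pilot(server, "job-1", b"proxy-of-alice")
outcome_bad = run_pilot(server, "job-1", b"proxy-of-mallory")
outcome_none = run_pilot(server, "job-2", b"proxy-of-alice")
```

The pilot refuses to execute anything when the proxy it received differs from the one it was submitted with, which is the property that keeps the pilot job mono-user.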

This satisfies EGEE security requirements [12,16].

5.3 Bridges EGEE → BOINC and EGEE → XtremWeb

The goal is to receive EGEE jobs, and to make them execute as BOINC work units or XtremWeb jobs. The challenges are to present BOINC and XtremWeb resources as an LCG-CE (current version of the EGEE Computing Element) to EGEE, and to allow only jobs executing an application which has already been validated for execution inside desktop Grids.

5.3.1 GEMLCA Repository of Validated Applications

As already stated, owners of resources inside Desktop Grids are willing to donate their resources only to trusted applications. That requires an application repository for storing the validated and hence trusted applications.

When an EGEE WMS submits (on behalf of an EGEE user) a job containing an application not present inside the application repository, the 3G bridge does not submit this job to BOINC, but immediately notifies the EGEE WMS of the failure. Depending on the number of retries allowed by the job description, the EGEE WMS can resubmit the job to another EGEE Computing Element.

The repository for validated applications must fulfill the following five requirements:

1. Restrict content modification to the Repository administrators only, so that the EDGeS bridge can trust the contents of the repository.

2. Allow desktop Grid site administrators to publish when their site supports a given application, in order to permit the automation of the application registration process on the desktop Grid site.

3. Allow EGEE users and desktop Grid site administrators to query the available applications, so that the EGEE users can determine which applications can run on the desktop Grid infrastructure, and desktop Grid administrators can determine which applications they can register on their sites.

4. Allow EGEE users and desktop Grid site administrators to download repository contents: executables, shared libraries, input files etc. EGEE users need to download repository content when they want to submit to the WMS a job which can be forwarded to a desktop Grid, which requires a validated executable. Site administrators use the download facility when they get from the repository the application to be registered on their desktop Grid.

5. Allow the EGEE → Desktop Grid bridges to check the authenticity of an application received within a job, in order to determine whether the application can be forwarded to the desktop Grid infrastructure.

Prior to the EDGeS project, GEMLCA [17, 18] (Grid Execution Management for Legacy Code Applications), developed by the University of Westminster (UoW), already provided the following six repository functionalities; however, its concept focused on the execution of the applications in the repository:

1. Management of repository entries, which describe applications with their inputs and outputs, parameters and necessary execution environments.

2. Retrieve the list of validated applications with human readable description and id.

3. Search (using WSRF) the id, status (public, private, system) and the list of desktop Grid sites supporting each validated application.

4. Retrieve files with GridFTP from a non-hierarchical virtual filesystem. (Files related to a single validated application should reside in the same directory.)

5. Retrieve application data from XML files to GEMLCA’s proprietary format.

6. Determine application ownership based on the certificate subject of the user.

In order to comply with the requirements, we have extended GEMLCA with the following two additional repository functionalities:

1. A better searching facility now provides interfaces for searching applications with their version information too. This helps the desktop Grid site administrators to determine whether an application installed on their site is different from the one stored in the repository.

2. An interface to collect file hashes of the repository contents. This helps the EDGeS bridge to determine whether a submitted application is the same as the one found in the repository.
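The use of file hashes for checking a submitted application can be illustrated with a short Python sketch. The hash algorithm (SHA-256 here), the layout of the repository data and all names are assumptions of this sketch; the paper does not specify them.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Hex digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical view of what the hash-collection interface could return:
# application name -> {file name -> hash of the validated file}.
REPOSITORY_HASHES = {
    "dsp": {"dsp.bin": sha256_of(b"validated dsp binary")},
}


def is_validated(app_name: str, files: dict, repository=REPOSITORY_HASHES) -> bool:
    """Return True only if every submitted file matches the repository copy.

    `files` maps file names to their contents as received within the job.
    """
    expected = repository.get(app_name)
    if expected is None:
        return False                      # unknown application
    if set(files) != set(expected):
        return False                      # missing or extra files
    return all(sha256_of(content) == expected[name]
               for name, content in files.items())


genuine = is_validated("dsp", {"dsp.bin": b"validated dsp binary"})
tampered = is_validated("dsp", {"dsp.bin": b"modified dsp binary"})
unknown = is_validated("render", {"render.bin": b"anything"})
```

A job whose executable fails this check would be refused by the bridge instead of being forwarded to the desktop Grid.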

Finally, the University of Westminster offers a portlet for administering GEMLCA services. This portlet has been enhanced to also handle the EDGeS-enhanced GEMLCA versions, and helps the desktop Grid administrators to register their sites. It also provides a simple interface for the administrator of the EDGeS validated application repository to manage the repository contents.

5.3.2 Bridge EGEE → BOINC

The EGEE → BOINC bridge is based on the 3G Bridge. The main difference from the BOINC → EGEE bridge described in Section 5.2.1 is that the EGEE → BOINC bridge needs an EGEE handler for receiving EGEE jobs and a BOINC plug-in for sending jobs to the BOINC target Grid.

The EGEE → BOINC bridge must be a transparent solution for EGEE users, i.e. a traditional EGEE user with access to an EGEE User Interface should be able to submit jobs and query job progress information with traditional EGEE command-line tools. Moreover, a BOINC resource should be visible in the EGEE Information System like a traditional EGEE Computing Element. Thus, from EGEE’s point of view, the solution must fulfil the following two requirements:

1. proper information reporting towards theBDII server,

2. proper interaction with the WMS and theL&B servers.

Before the implementation, we examined the different CE solutions offered by EGEE: LCG-CE, GLITE-CE and CREAM-CE. LCG-CE is mature and widely used. GLITE-CE is a proof-of-concept solution, not recommended for production by EGEE. CREAM-CE is a new technology based on Web Services, working correctly when accessed directly, but not yet accessible through the EGEE WMS. As a consequence, we have decided to create a solution based on LCG-CE, as shown in Fig. 8. Later, when the CREAM interface becomes stable, we will offer both the LCG-CE and the CREAM-CE interfaces for the 3G bridge.

Fig. 8 The EGEE → BOINC bridge. [Diagram. Components shown: EGEE User, EGEE VOMS (X509 proxy with VOMS extensions), gLite WMS, EGEE LB, EGEE BDII, LCG-CE for EDGeS (Information Provider, GRAM JobManager for EDGeS), EDGeS Application Repository, EDGeS 3G bridge with BOINC plugin (DC-API), BOINC Server and BOINC Computing Resources. Arrow labels: Submits Job, Pushes job, Gets EXE, Checks EXE, Adds job, Watches job, Gets output, Send output, Logs events, Gets Infos, Reports resources and performance.]

For the processing of a job, a standard LCG-CE performs the following three steps:

1. It submits a wrapper script to a selected GRAM jobmanager on the CE,

2. Before running the wrapper script on a worker, it starts a helper script which periodically updates the proxy on the worker node in a background process and starts the wrapper script,

3. The wrapper script contains all the information about the job defined in variables, like the name of the executable, command line arguments, name and URL of input files, name and expected location of output files, etc. The wrapper script first fetches input files from their location (from the WMS), then runs the executable, and finally copies the output files to their specified location. During the process, the wrapper script sends status change events to the L&B server, and produces output in some WMS-specific files.

The EGEE → BOINC bridge has to perform almost the same steps as above, using a somewhat modified LCG-CE with a 3G bridge-specific jobmanager. However, there are some differences:

• the bridge must check whether the received executable is a validated application on the connected BOINC server. Using the EDGeS Application Repository described in the previous subsection, the bridge checks that the executable is registered as an EGEE application, that this application has a BOINC variant, and that it is registered on the connected BOINC project. The job is accepted if and only if all these statements are true,

• the helper script doesn’t have to be started, as the proxy doesn’t leave the CE, so there is no need to update proxies on Worker Nodes,

• there is no need to run the wrapper script, it only has to be parsed, so our GRAM jobmanager can send the job to the 3G Bridge and can interact with the L&B server instead of the wrapper script,

• moreover, the GRAM jobmanager periodically has to check the status of the job in the 3G Bridge database, and update the status of the job for EGEE.

Fig. 9 Bridge EGEE → XtremWeb. [Diagram. Same components and arrow labels as Fig. 8, with the XtremWeb plugin, XtremWeb Server and XtremWeb Computing Resources in place of the BOINC plugin (DC-API), BOINC Server and BOINC Computing Resources.]
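The acceptance rule applied by the bridge to an incoming EGEE job (all three statements must hold) can be written as a short Python sketch. The record layout and the names are hypothetical; a real check would query the EDGeS Application Repository.

```python
from dataclasses import dataclass, field


@dataclass
class RepoEntry:
    """Hypothetical summary of what the repository knows about an executable."""
    egee_application: bool                 # registered as an EGEE application
    boinc_variant: bool                    # a BOINC variant exists
    boinc_projects: set = field(default_factory=set)   # projects where it is registered


def accept_job(executable: str, repository: dict, connected_project: str) -> bool:
    """Accept the job if and only if all three statements are true."""
    entry = repository.get(executable)
    if entry is None:
        return False
    return (entry.egee_application
            and entry.boinc_variant
            and connected_project in entry.boinc_projects)


repository = {
    "autodock": RepoEntry(True, True, {"szdg"}),
    "blender": RepoEntry(True, False, set()),
}

accepted = accept_job("autodock", repository, "szdg")
wrong_project = accept_job("autodock", repository, "other-project")
no_variant = accept_job("blender", repository, "szdg")
not_registered = accept_job("unknown-app", repository, "szdg")
```

Only the first call yields an accepted job; in the three other cases the GRAM jobmanager would report a failure to the WMS rather than add the job to the 3G Bridge.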

In order to implement the EGEE → BOINC bridge, we have extended the 3G bridge with a Web Service interface. So, there is no need to place the 3G bridge and the BOINC server on the same machine as the EDGeS CE: the 3G bridge and the BOINC server are completely separated from the EDGeS CE, which uses the WS interface to communicate with the 3G bridge.

On the BOINC side, the DC-API [6] plug-in is used to create BOINC work units out of entries in the 3G Bridge Job Database, query their status and get the results of processed work units. Once work units are created in the BOINC database, they are processed sooner or later by attached BOINC clients. If desired, the BOINC server can perform any job redundancy and checking as usual.

The EGEE → BOINC bridge publishes information to the BDII according to GLUE 1.3; it contains an EGEE producer and a BOINC GIP plug-in. The BOINC plug-in is responsible for reporting performance information about the BOINC project. For this, BOINC statistics provided by the BOINC project are used.

5.3.3 Bridge EGEE → XtremWeb

The general principle of creating the EGEE → XtremWeb bridge is the same as in the case of the EGEE → BOINC bridge, since both solutions use the 3G bridge as the heart of the EGEE → DG bridges. The architecture of the EGEE → XtremWeb bridge, depicted in Fig. 9, clearly shows that the EGEE → 3G bridge part is the same as the one shown in Fig. 8 above for the EGEE → BOINC case. The only difference is the replacement, inside the 3G bridge, of the BOINC plug-in by the XtremWeb plug-in.

6 Current Operational Status of EDGeS

The EDGeS 3G bridge is not a pure research prototype: implementations are in real operation between EGEE and Desktop Grids, as shown in Fig. 10 below.

6.1 Operational DG → EGEE Infrastructure

The DG → EGEE bridges of the EDGeS system were prototyped in June 2008 and put into operation in September 2008.

The BOINC → EGEE bridge is currently in operation at SZTAKI in Budapest, Hungary. It connects the public BOINC-based SZTAKI desktop Grid (SZDG) in Budapest, Hungary, the public BOINC-based Ibercivis desktop Grid in Spain, the local UoW desktop Grid in London, Great Britain, and the local DG of Correlation Systems Ltd., Israel, to a dedicated external VO of EGEE, named desktopgrid.vo.edges-grid.eu.

Fig. 10 EDGeS operational EGEE ←→ DG infrastructure. [Diagram. EDGeS VO of EGEE: CNRS/IN2P3 CE (1,600 CPUs), SZTAKI CE (16 CPUs), CIEMAT CE (20 CPUs), with BDII, VOMS, MyProxy, WMS and LB services, used by EGEE users. EDGeS: BOINC → EGEE bridge, EGEE → BOINC bridge, EGEE → XtremWeb bridge and Application Repository. BOINC-based Desktop Grids, used by BOINC Project Owners: SZDG (public, 72,000 PCs), IberCivis (public, 14,000 PCs), AlmereGrid (public, 1,000 PCs), UoW (local, 1,881 PCs), Correlation Systems (local, 10 PCs). XtremWeb-based Desktop Grids, used by XtremWeb Users: AlmereGrid (public, 1,000 PCs), IN2P3 (public, 600 PCs), AlmereGrid (local, 10 PCs).]

The XtremWeb → EGEE bridge is now included in the standard distribution of the XtremWeb middleware. It is in operation at the local IN2P3 desktop Grid at Orsay, France, at the public INRIA desktop Grid in Saclay, France, and at the AlmereGrid desktop Grid in Almere, The Netherlands.

In the operational DG → EGEE infrastructure of EDGeS, the desktopgrid.vo.edges-grid.eu VO currently has access to three computing elements as EGEE resources:

• CNRS/IN2P3 CE (approx. 1,600 CPUs, non dedicated)

• SZTAKI CE (16 CPUs, dedicated)

• CIEMAT CE (20 CPUs, dedicated)

It also contains the basic EGEE core services like BDII, VOMS, MyProxy, WMS and LB.

6.2 Operational EGEE → DG Infrastructure

The EGEE → DG bridges of the EDGeS system were prototyped in December 2008 and put into operation for the EGEE users in March 2009.

In the operational EGEE → DG infrastructure of EDGeS, the desktopgrid.vo.edges-grid.eu VO currently has access to six desktop Grids:

• AlmereGrid: Public BOINC DG in Almere, The Netherlands (1,000 PCs)

• SZDG: Public BOINC DG in Budapest, Hungary (72,000 PCs)

• University of Westminster: Local BOINC DG in London, UK (1,881 PCs)

• IN2P3: Public XtremWeb DG in Orsay, France (600 PCs)

• AlmereGrid: Public XtremWeb DG in Almere, The Netherlands (1,000 PCs)

• AlmereGrid: Local XtremWeb DG in Almere, The Netherlands (10 PCs)

As soon as the BOINC-based Extremadura desktop Grid is available, it will also be connected to the EDGeS operational infrastructure.

Table 1 Applications already ported to EDGeS

Application | Organisation | Runs on desktop Grid / Runs on EGEE (EDGeS VO)
Video Stream Analysis in a Grid Environment (VISAGE) | Correlation Systems Ltd Israel | √ √
Digital Alias-free Signal Processing | University of Westminster | √ √
Protein Molecule Simulation using Autodock | University of Westminster | √ √
E-Marketplace Model Integrated with Logistics (EMMIL) | SZTAKI | √ √
Anti-cancer Drug Design (CancerGrid) | SZTAKI | √ √
Cellular Automata based Laser Dynamics (CALD) | University of Seville and University of Westminster | √ √
Signal and Image Processing using GT4 Tray | Forschungszentrum Karlsruhe | √
Analysis of Genotype Data (Plink) | Atos Origin | √
Distributed Audio Retrieval using TRIANA (DART) | Cardiff University | √ √
Fusion Plasma Application (ISDEP) | BIFI | √
3-D Video Rendering using Blender | University of Westminster | √
Profiling Hospitals in the UK based on Patient Readmission Statistics | University of Westminster | √ √


6.3 Applications Already Ported to EDGeS

Several applications have already been ported to the DG → EGEE infrastructure and have been using it since September 2008. These applications are shown in Table 1.

6.4 Load Limit and Performance

The EDGeS 3G bridge currently supports a load of several thousand jobs waiting to be completed, with a sustained transfer throughput of 7 jobs/s.

The latency introduced by the bridge components themselves has been measured to be 2 s, but the current conservative sleep time of 300 s of the jobwrapper introduces an average latency of 300/2 = 150 s.
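The figure of 150 s follows from the polling behaviour: if a job finishes at a moment uniformly distributed within a 300 s sleep period, the jobwrapper notices it on average half a period later. The Python sketch below checks this arithmetic; the function names are illustrative, and adding the 2 s component latency to the polling latency is an assumption of this sketch rather than a figure given in the measurements.

```python
def average_polling_latency(sleep_time_s: float,
                            component_latency_s: float = 0.0) -> float:
    """Expected delay before a finished job is noticed by a periodic poll."""
    return sleep_time_s / 2 + component_latency_s


def simulated_average_wait(sleep_time_s: float, samples: int = 10_000) -> float:
    """Average remaining wait for completion times spread evenly over one period."""
    step = sleep_time_s / samples
    # A job finishing at time t within the period waits (sleep_time_s - t).
    waits = [sleep_time_s - (i + 0.5) * step for i in range(samples)]
    return sum(waits) / samples


polling_only = average_polling_latency(300)          # polling latency alone
with_components = average_polling_latency(300, 2)    # plus bridge components
simulated = simulated_average_wait(300)
```

Since the average grows linearly with the sleep time, shortening the jobwrapper sleep time is the most direct way of reducing the end-to-end latency, at the cost of more frequent database queries.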

7 OGF Standards Relevant for Future Grid Interoperability

In order to bridge more Grids, and provide better support of Grid file access, the EDGeS project will work on interoperability using the following OGF recommendations, which are published as documents named GFD.

Figure 11 below presents the relationships between OGF standards related to computation.

Fig. 11 OGF standards related to computation

The following recommendations are currently in the process of implementation by the gLite middleware of EGEE:

• For information retrieval and publication: GLUE is an information model for Grid entities described using natural language and enriched with a graphical representation using UML Class Diagrams. Therefore, GLUE is the fundamental basis for publishing and retrieving information about Grid entities. When gLite migrates from GLUE 1.3 to GLUE 2.0, defined in document GFD.147 [19], the EDGeS bridge will also migrate, in order to publish useful information to the BDII, and to retrieve necessary information from it.

• For job management: BES (Basic Execution Service), defined in document GFD.108 [20], is a service to which clients can send requests to initiate, monitor, and manage computational activities. It defines an extensible state model for activities, and an extensible information model for a BES implementation and for the activities that it creates. The EDGeS bridge will use BES to receive job submissions, for example from the GridSphere Portal. The EDGeS bridge will also use BES to submit jobs to the BES-compliant CREAM CE once the latter is really in operation.

• For job description: JSDL (Job Submission Description Language), defined in document GFD.136 [21], is used to describe the requirements of computational jobs for submission to resources, particularly in Grid environments. When gLite migrates from JDL to JSDL, the EDGeS bridge will also migrate, in order to describe submitted jobs.

• For data management: GridFTP, defined in documents GFD.20 [22] and GFD.21 [23], adds Grid extensions to the File Transfer Protocol (FTP) specified in RFC 959. SRM, defined in document GFD.129 [24], is an open specification for Storage Resource Managers, which are Grid storage services providing interfaces to storage resources, as well as advanced functionality such as dynamic space allocation and file management on shared storage systems. EGEE is a main contributor to these two recommendations, and already implements them. So the EDGeS project will have to take these recommendations into account inside its “data management” activity.

The following OGF recommendations could also potentially be used:

• For accounting records: UR (Usage Record), defined in document GFD.98 [25], describes a common format, encompassing both job level accounting and aggregate accounting, with which to exchange basic accounting and usage data over a Grid instantiation. RUS (Resource Usage Service) [26] is intended to accommodate requirements on Grid resource usage auditing and accounting, as well as Grid economic models.

• For data transfers: ByteIO, defined in documents GFD.87 [27] and GFD.88 [28], describes efficient manipulation of, access to, and management of bulk data sources and sinks in the Grid.

• For file management: DMI (Data Movement Interface), defined in document GFD.134 [29], specifies the support of the instantiation and management of data transfers within and across Grid deployments.

• For the GEMLCA application repository: ACS (Application Contents Service), defined in document GFD.73 [30], is an OGSA service, which maintains Application Contents as an Application Archive. The ACS repository provides functions to retrieve the contents and their change histories. ACS also defines a standard format of an Application Archive for its management and exchange.

OGF recommendations for authentication and authorization are highly desirable, because they are transverse to all other Grid services, and their standardisation is a prerequisite for any interoperability. But the OGF AUTHZ working group, which is officially in charge of these subjects, currently has no published recommendation. The OGF PGI working group, in which EDGeS actively participates, has completed a survey of the existing, more or less compatible implementations, and is working to publish proposals.

8 Conclusions

EDGeS has so far put the integrated DG-EGEE infrastructure into operation in both the DG → EGEE and EGEE → DG directions, within the dedicated EDGeS desktop Grid VO.

For the DG → EGEE direction, it means that work units of the connected desktop Grids can be seamlessly executed in the EDGeS VO of EGEE. This helps to increase the available processing capacity for the connected local and public DGs. It is especially useful for small local DGs like the DG of Correlation Systems Ltd.

For the EGEE → DG direction, it means that EGEE jobs using validated applications can be seamlessly executed in the DGs connected to EDGeS. Intensive work is currently being performed to port the existing EGEE applications to the EDGeS infrastructure. A methodology for porting EGEE applications to EDGeS has already been developed, as well as the validation procedure for such applications.

The developed EGEE-DG infrastructure is designed to be easily adapted to other SG and DG systems. The 3G Bridge, which is the heart of the EDGeS infrastructure, is easily adaptable to other Grids: only the necessary source handlers and target plug-ins need to be written for the Grids to be connected to EDGeS. The solution is not specific to the EDGeS desktop Grid VO, but can also be used for different VOs of EGEE. Available standards are also considered for future adaptation of the EDGeS technology in order to solve the long-term interoperability issues of service Grids and desktop Grids.

Acknowledgements The EDGeS (Enabling Desktop Grids for e-Science) project receives Community funding from the European Commission within the Research Infrastructures initiative of FP7 (grant agreement Number 211727).

References

1. Riedel, M., et al.: Interoperation of world-wide production e-Science infrastructures. In: Concurrency and Computation: Practice and Experience (2008)

2. Anderson, D., Cobb, J., Korpela, E., Lebofsky, M., Werthimer, D.: Seti@home: an experiment in public-resource computing. Commun. ACM 45(11), 56–61 (2002)

3. Anderson, D.: BOINC: a system for public-resource computing and storage. In: Proceedings of the 5th IEEE/ACM International GRID Workshop, Pittsburgh, USA (2004)

4. Fedak, G., Germain, C., Neri, V., Cappello, F.: XtremWeb: a generic global computing platform. In: Proceedings of 1st IEEE International Symposium on Cluster Computing and the Grid CCGRID’2001, Special Session Global Computing on Personal Devices, pp. 582–587. IEEE/ACM, IEEE Press, Brisbane, Australia (2001)

5. Myers, D.S., Bazinet, A.L., Cummings, M.P.: Expanding the Reach of Grid Computing: Combining Globus- and BOINC-based Systems, Chapter Grids for Bioinformatics and Computational Biology. Wiley Book Series on Parallel and Distributed Computing (2008)

6. Balaton, Z., Gombas, G., Kacsuk, P., Kornafeld, A., Kovacs, J., Marosi, A.C., Vida, G., Podhorszki, N., Kiss, T.: SZTAKI Desktop Grid: a modular and scalable way of building large computing Grids. In: Proc. of the 21st International Parallel and Distributed Processing Symposium, Long Beach, California, USA, 26–30 March 2007

7. Cardenas-Montes, M., Emmen, A., Marosi, A.C., Araujo, F., Gombas, G., Terstyanszky, G., Fedak, G., Kelley, I., Taylor, I., Lodygensky, O., Kacsuk, P., Lovas, R., Kiss, T., Balaton, Z., Farkas, Z.: Edges: bridging desktop and service Grids. In: Proceedings of the 2nd Iberian Grid Infrastructure Conference, University of Porto, Portugal, 12–14 May 2008

8. Sarmenta, L.F.G.: Sabotage-tolerance mechanisms for volunteer computing systems. Future Gener. Comput. Syst. 18(4), 561–572 (2002)

9. Lodygensky, O., Fedak, G., Cappello, F., Neri, V., Livny, M., Thain, D.: XtremWeb & condor: sharing resources between internet connected condor pools. In: Proceedings of CCGRID’2003, Third International Workshop on Global and Peer-to-Peer Computing (GP2PC’03), pp. 382–389. IEEE/ACM, Tokyo, Japan (2003)

10. Thain, D., Livny, M.: Building reliable clients and services. In: The GRID2, pp. 285–318. Morgan Kaufmann (2004)

11. Raicu, I., Zhao, Y., Dumitrescu, C., Foster, I., Wilde, M.: Falkon: a fast and light-weight task execution framework. In: IEEE/ACM SuperComputing (2007)

12. Sfiligoi, I., Koeroo, O., Venekamp, G., Yocum, D., Groep, D., Petravick, D.: Addressing the Pilot security problem with gLExec. Technical Report FERMILAB-PUB-07-483-CD, Fermi National Laboratory (2007)

13. Alexandrov, A., Kmiec, P., Schauser, K.: Consh: a confined execution environment for internet computations. In: Proceedings of the Usenix Annual Technical Conference. http://www.usenix.org/events/usenix99/ (1999)

14. Acharya, A., Raje, M.: Mapbox: using parameterizedbehavior classes to confine applications. In: Techni-cal report TRCS99-25, University of California, SantaBarbara (1999)

15. Goldberg, I., Wagner, D., Thomas, R., Brewer, E.: Asecure environment for untrusted help application—confining the wily hacker. In: Proceedings of the 6thUsenix Security Symposium (1996)

16. Caillat, G., Lodygensky, O., Urbah, E., Fedak, G., He,H.: Towards a security model to bridge internet desk-top Grids and service Grids. In: Lecture Notes in Com-puter Science (2008)

17. Terstyanszky, G., Kiss, T., Kacsuk, P., Delaitre, T.,Kecskemeti, G., Winter, S.: Legacy code support forcommercial production Grids. In: Conf. Proc. of theUK E-Science All Hands Meeting, Nottingham, UK,ISBN 0-9553988-0-0, 18–21 September 2006

18. Delaitre, T., Kiss, T., Goyeneche, A., Terstyanszky,G., Winter, S., Kacsuk, P.: GEMLCA: running legacycode applications as Grid services. In: Journal ofGrid Computing, vol. 3. No. 1–2, pp. 75–90. SpringerScience + Business Media: 1570–7873 (2005)

19. Andreozzi, S., Burke, S., Ehm, F., Field, L.,Galang, G., Konya, B., Litmaath, M., Millar, P.,Navarro, J.P.: [GFD.147] GLUE Specification v. 2.0.http://www.ogf.org/documents/GFD.147.pdf (2009)

Page 20: EDGeS: Bridging EGEE to BOINC and XtremWebusers.lal.in2p3.fr/.../gc/EDGeS-Bridging-EGEE-to-BOINC-and-XtremWe… · OMII Open Middleware Infrastructure Institute OSG Open Science Grid

E. Urbah et al.

20. Foster, I., Grimshaw, A., Lane, P., Lee,W., Morgan, M., Newhouse, S., Pickles, S.,Pulsipher, D., Smith, C., Theimer, M.: [GFD.108]OGSA Basic Execution Service Version 1.0.http://www.ogf.org/documents/GFD.108.pdf (2007)

21. Anjomshoaa, A., Brisard, F., Drescher, M.,Fellows, D., Ly, A., McGough, S., Pulsipher, D.,Savva, A.: [GFD.136] Job Submission DescriptionLanguage (JSDL) specification, version 1.0. http://www.ogf.org/documents/GFD.136.pdf (2008)

22. Allcock, W., Bester, J., Bresnahan, J., Meder, S.,Plaszczak, P., Tuecke, S.: [GFD.20] GridFTP:protocol extensions to FTP for the Grid. http://www.ogf.org/documents/GFD.20.pdf (2003)

23. Mandrichenko, I.: GridFTP Protocol Improvements.[GFD.21] http://www.ogf.org/documents/GFD.21.pdf(2003)

24. Sim, A., Shoshani, A., Badino, P., Barring, O.,Baud, J.-P., Corso, E., De Witt, S., Donno, F.,Gu, J., Haddox-Schatz, M., Hess, B., Jensen, J.,Kowalski, A., Litmaath, M., Magnoni, L., Perelmutov,T., Petravick, D., Watson, C.: [GFD.129] The storage

resource manager interface specification version 2.2.http://www.ogf.org/documents/GFD.129.pdf (2008)

25. Mach, R., Lepro-Metz, R., Jackson, S.: [GFD.98]Usage record—format recommendation. http://www.ogf.org/documents/GFD.98.pdf (2007)

26. Chen, X., Khan, A., Ainsworth, J., Newhouse,S., MacLaren, J.: WSI Resource Usage Service(RUS) Core Specification (draft 19). http://forge.gridforum.org/sf/docman/do/downloadDocument/projects.rus -wg/docman.root.documents.version_1_0.draft_wsi_rus_19/doc14304/1 (2007)

27. Morgan, M.: [GFD.87] ByteIO specification 1.0.http://www.ogf.org/documents/GFD.87.pdf (2006)

28. Morgan, M.: [GFD.88] ByteIO OGSA WSRFBasic Profile Rendering 1.0. http://www.ogf.org/documents/GFD.88.pdf (2006)

29. Antonioletti, M., Drescher, M., Luniewski, A.,Newhouse, S., Madduri, R.: [GFD.134] OGSA-DMI Functional Specification 1.0. http://www.ogf.org/documents/GFD.134.pdf (2008)

30. Fukui, K.: [GFD.73] Application contents service specifi-cation 1.0. http://www.ogf.org/documents/GFD.73.pdf (2006)