Transcript of D.Spiga, L.Servoli, L.Faina INFN & University of Perugia

Page 1: D.Spiga,  L.Servoli,  L.Faina INFN & University of Perugia

CRAB Workflow:

CRAB: CMS Remote Analysis Builder
• A CMS-specific tool written in Python and developed within INFN to open the Grid to the masses!
• It aims to allow CMS users to access all the data produced and available, using Grid middleware.
• It should hide as much of the Grid complexity as possible from the CMS user.
• It has to be installed on the User Interface (UI), the user's access point to the Grid.

User tasks: users must develop their analysis code in an interactive environment with the program for CMS reconstruction (ORCA), and choose a dataset to analyze.

CRAB main functionalities:

Input data discovery: the Computing Elements (CEs) of the sites storing the data are found by querying the central (RefDB) and local (PubDBs) databases.
Packaging of user code: creation of a tgz archive with the user code, containing bin, lib and data.
Job creation: the main steps are:
- Wrapper (sh) of the ORCA executable or script creation: set up the running environment on the remote resources (WN); sanity check on the WN; access to local catalogs; output handling.
- Job Description Language (JDL) file creation: the site locations (CE names) are passed to the Resource Broker (RB) as requirements to drive the resource matchmaking (see the sketch after this list).
- Job splitting according to user requests.
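As an illustration of the JDL-creation step, the sketch below shows how a wrapper script and the list of CE names found by the data discovery could be turned into a JDL file. It is a minimal sketch, not the actual CRAB code: the attribute choices, file names and CE names are illustrative assumptions.

# Illustrative sketch only (not the real CRAB implementation): build a JDL file
# whose Requirements expression restricts matchmaking to the CEs hosting the data.
def write_jdl(executable, ce_names, jdl_path="crab_job.jdl"):
    # One clause per CE, combined with OR, using the GlueCEUniqueID published by the sites.
    requirements = " || ".join('other.GlueCEUniqueID == "%s"' % ce for ce in ce_names)
    jdl_lines = [
        'Executable = "{}";'.format(executable),
        'StdOutput = "job.out";',
        'StdError = "job.err";',
        'InputSandbox = {{"{}", "default.tgz"}};'.format(executable),
        'OutputSandbox = {"job.out", "job.err"};',
        'Requirements = {};'.format(requirements),
    ]
    with open(jdl_path, "w") as f:
        f.write("\n".join(jdl_lines) + "\n")
    return jdl_path

# Hypothetical usage: the ORCA wrapper from the previous step plus two made-up CE names.
write_jdl("orca_wrapper.sh",
          ["ce01.example.infn.it:2119/jobmanager-lcgpbs-cms",
           "ce02.example.cern.ch:2119/jobmanager-lcgpbs-cms"])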

Diagram: CRAB monitoring flow. At submission time, status-check time and output-retrieval time, CRAB sends data to a UDP server, which writes and updates a MySQL DB that feeds a dynamic web page.

CRAB Monitoring:

Technical Implementation:

At three different points of its workflow (submission, status check, output retrieval), CRAB sends UDP packets containing the monitoring information to a UDP server, which processes the data and fills a MySQL database. To satisfy the request for "real-time" monitoring, some of the information stored in the database is also shown on a web page which is automatically updated.
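A minimal sketch of this pattern in Python (the language CRAB itself is written in). The packet fields, table layout, host and port below are illustrative assumptions, not the real CRAB monitoring code.

import json
import socket

MON_HOST, MON_PORT = "crabmon.example.org", 9200   # hypothetical endpoint

def send_monitoring_packet(event, job_id, dataset, ce, exit_code=None):
    # Fire-and-forget UDP packet, sent at submission, status check or output retrieval.
    payload = json.dumps({"event": event, "job_id": job_id, "dataset": dataset,
                          "ce": ce, "exit_code": exit_code}).encode()
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(payload, (MON_HOST, MON_PORT))
    sock.close()

def run_udp_server():
    # Receive packets and fill a MySQL table (table name and columns are assumed).
    import MySQLdb
    db = MySQLdb.connect(host="localhost", user="crabmon",
                         passwd="secret", db="crabmon")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", MON_PORT))
    while True:
        data, _addr = sock.recvfrom(4096)
        info = json.loads(data.decode())
        cur = db.cursor()
        cur.execute("INSERT INTO crab_jobs (event, job_id, dataset, ce, exit_code) "
                    "VALUES (%s, %s, %s, %s, %s)",
                    (info["event"], info["job_id"], info["dataset"],
                     info["ce"], info["exit_code"]))
        db.commit()
        cur.close()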

To monitor and analyze the use of CRAB, a monitoring tool has been developed to collect data in order to:
-- show, in real time, information such as:
 * the rate of CRAB job submission;
 * dataset and Computing Element usage;
 * the provenance of jobs;
-- answer, with an off-line analysis (a query sketch follows below), questions like:
 * How efficient is the service?
 * How many users are using CRAB (and how)?
 * Which patterns of data access are emerging (which data are used and where)?
 * What are the failures/problems of the service?
 * How can user support be improved?
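For example, one off-line question (the per-CE success rate) could be answered with a simple query against the database filled above. The table and column names follow the assumed schema of the previous sketch, not the real one.

import MySQLdb  # assumed MySQL client; schema is the one sketched earlier

db = MySQLdb.connect(host="localhost", user="crabmon", passwd="secret", db="crabmon")
cur = db.cursor()
# Per-CE totals and the fraction of jobs reporting ORCA exit code 0.
cur.execute("""
    SELECT ce,
           COUNT(*)                      AS total_jobs,
           SUM(exit_code = 0) / COUNT(*) AS success_fraction
    FROM crab_jobs
    WHERE event = 'output_retrieval'
    GROUP BY ce
    ORDER BY total_jobs DESC
""")
for ce, total, success in cur.fetchall():
    print("%-50s %6d jobs  success = %.2f" % (ce, total, success or 0.0))
cur.close()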

The left and right histograms show the CE and dataset/owner usage. Each bar represents the total number of jobs and is divided into three categories:
- jobs that produce an ORCA exit code equal to 0 (green)
- jobs that produce an ORCA exit status different from 0 (yellow)
- jobs that could not run due to Grid problems (red)

On-line web page

The role of the web pages is to show automatically updated quantities. The main views concern how many jobs are submitted, where the jobs run, which input data are requested and which User Interfaces have been used. All data can be shown for different time intervals.

The number of jobs submitted each month.

Analysing the data collected by the monitoring tool, it is possible to understand in depth the behaviour of the system. The study of the time evolution of several quantities allows conclusions to be drawn on the use and the performance of the service.

From July 2005 to January 2006 about 400,000 CRAB jobs were submitted to the Grid. The above histogram shows the weekly submission rate for the LCG (dashed blue) and the OSG (green) Grid infrastructures.

Time integral of the number of different Computing Elements where CRAB jobs have run.

This plot shows the increase in the number of sites that are storing and making available data for CMS distributed analysis.

Time integral of the number of different User Interfaces that have used CRAB.

This plot shows the diffusion of the tool among users. A constant increase is evident.

1. CRAB project: http://cmsdoc.cern.ch/cms/ccs/wm/www/Crab
2. The CMS experiment: http://cmsdoc.cern.ch
3. LCG Project: http://lcg.web.cern.ch/LCG and "LCG Technical Design Report", CERN-TDR-01, CERN-LHCC-2005-024, June 2005
4. OSG Project: http://www.opensciencegrid.org
5. ORCA project: http://cmsdoc.cern.ch/orca
6. PubDB project: http://cmsdoc.cern.ch/swdev/viewcvs/viewcvs.cgi/OCTOPUS/PubDB
7. "Job Description Language HowTo", December 17, 2001, available at http://server11.infn.it/workload-grid/docs/DataGrid-01-TEN-0102-02-Document.pdf

References

We wish to thank Hassen Rihai and the CRAB team, who provided support during the development and deployment of the monitoring tool.

Acknowledgements

Weekly success rate for CRAB jobs. The quantity plotted is the ratio between the jobs which do not fail for infrastructure reasons (green & yellow) and the total number of jobs.

Weekly request rate for different datasets.

This plot gives an indication of how many datasets (currently about 390) are requested by the users.

http://cmsgridweb.pg.infn.it/crab/crabmon.php

Monitoring of job status: check the status of jobs on user demand.
Output retrieval and handling of user output: copy to the UI, to a generic Storage Element (SE), or to any host with a gsiftp server (e.g. CASTOR); see the sketch below.
Job resubmission: if a job suffers a Grid failure (Aborted or Cancelled status).
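A hedged sketch of how the output copy and the resubmission could be driven from the UI with the standard Globus and EDG command-line tools; hosts, paths and file names are made-up examples, not the actual CRAB logic.

import subprocess

def copy_output_to_se(local_file, se_host, se_path):
    # Copy a job output file to a Storage Element or any gsiftp-enabled host.
    subprocess.check_call(["globus-url-copy",
                           "file://" + local_file,
                           "gsiftp://%s%s" % (se_host, se_path)])

def resubmit_if_failed(grid_status, jdl_path):
    # Resubmit a job whose Grid status is Aborted or Cancelled.
    if grid_status in ("Aborted", "Cancelled"):
        subprocess.check_call(["edg-job-submit", jdl_path])

# Hypothetical usage:
# copy_output_to_se("/tmp/job.out", "castorgrid.example.cern.ch",
#                   "/castor/cern.ch/user/x/someuser/job.out")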

Off-line analysis

CRAB usage and job flow monitoring

Job submission to the Grid: via the Workload Management System (WMS) command (edg-job-submit).
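For illustration, the submission step could be driven from Python as below; the JDL and output file names are placeholders, and the -o option is assumed to store the returned Grid job identifiers in a file.

import subprocess

# Submit the JDL prepared by CRAB and keep the returned job identifiers.
subprocess.check_call(["edg-job-submit", "-o", "submitted_jobs.txt", "crab_job.jdl"])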