BaBar UK Grid Work
Tim Adye
Rutherford Appleton Laboratory
BaBar Collaboration Meeting
SLAC
11th December 2002
Talk Plan
• RAL Tier A
• BaBar Job Submission [Janusz Martyniak]
• Metadata [Alessandra Forti]
• BaBar UK Grid Facilities [Marc Kelly]
BaBar Batch CPU Use at RAL
[Chart: BaBar CPU hours per week at RAL (normalised to P450), plotted by week beginning; y-axis 0 to 120,000 hours/week; series: UK users, non-UK users.]
Full usage at full efficiency of the BaBar CPUs = 106,624 hours/week; 59,733 hours/week according to the MoU
BaBar Batch Users at RAL (running at least one non-trivial job each week)
[Chart: BaBar batch users per week at RAL, plotted by week beginning; y-axis 0 to 35 users/week; series: UK users, non-UK users.]
A total of 165 new BaBar users registered since December 2001
BaBar Job Submission
Janusz Martyniak
Imperial College
Submitter Requirements
• A job should be submitted only to a site which holds data required to run the job
• The job should run in the standard BaBar environment (srtpath, PARENT link etc)
• A user should be allowed to pass their own environment variables with the job
• Output should be sent back and/or stored on an SE and registered with the RC
Job Submitter Steps
• Data preparation (skimData)
• User TCL file expansion (dump)
• JDL file(s) creation based on the skimData-delivered index and user options
• Submission to the GRID
Data Preparation
• A modified version of the skimData program has been developed by Dave Smith and Alessandra Forti
• It returns data matching the given criteria and writes the result to an index file
• The index groups the files into buckets. Each bucket is defined by a list of sites which hold the data
Index Example

sites: BABAR-RAL,BABAR-IC
00001
00002
00003
sites: BABAR-MAN
00004

• The TCL files are named: indexname.number.tcl
• The example above would result in 4 JDL files and 4 jobs submitted
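To make the bucket structure concrete, here is a minimal Python sketch that parses an index of this form; the parse_index helper is hypothetical and assumes only the layout shown above:

def parse_index(path):
    """Return a list of (sites, file_numbers) buckets from an index file."""
    buckets = []
    for line in open(path):
        line = line.strip()
        if not line:
            continue
        if line.startswith('sites:'):
            # Start a new bucket: the comma-separated sites holding the data.
            sites = line[len('sites:'):].strip().split(',')
            buckets.append((sites, []))
        else:
            # A data-file number belonging to the current bucket.
            buckets[-1][1].append(line)
    return buckets

# For the example above, parse_index('index') returns:
#   [(['BABAR-RAL', 'BABAR-IC'], ['00001', '00002', '00003']),
#    (['BABAR-MAN'], ['00004'])]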
User TCL File Expansion
• Requires release 12.2.2 [Asoka De Silva]
• Reads the user TCL file (e.g. kangaFilterMicro), sourcing all files referenced by it
• The result is a single TCL file
• The data file list should not be sourced
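A rough Python sketch of the expansion idea; the real dump facility in release 12.2.2 resolves files through the BaBar release/PARENT search path, so the plain "source" handling and filenames below are simplifying assumptions:

import re

SOURCE_RE = re.compile(r'^\s*source\s+(\S+)')

def expand_tcl(path, out):
    # Inline every sourced file recursively so the result is a single,
    # self-contained TCL file.  (The real tool must also skip the data
    # file list, which should not be sourced at this stage.)
    for line in open(path):
        m = SOURCE_RE.match(line)
        if m:
            expand_tcl(m.group(1), out)
        else:
            out.write(line)

# expand_tcl('kangaFilterMicro.tcl', open('expanded.tcl', 'w'))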
JDL File Creation
• The JDL file creation process analyses the index and user supplied options and creates a set of JDL files to be submitted to the GRID
• The options include:
  • The index filename
  • User top TCL file (or expanded TCL file)
  • The executable
  • Environment variables to be passed to the GRID
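A hedged Python sketch of this step: one JDL file is written per bucket, alongside its indexname.number.tcl. The JDL attribute set and the Requirements expression used to pin a job to the sites holding its data are illustrative assumptions, not the submitter's actual output:

def write_jdl(indexname, number, executable, tcl, env, sites):
    # One JDL file per index bucket, named like the matching TCL file.
    jdlname = '%s.%05d.jdl' % (indexname, number)
    f = open(jdlname, 'w')
    f.write('Executable    = "%s";\n' % executable)
    f.write('StdOutput     = "stdout.log";\n')
    f.write('StdError      = "stderr.log";\n')
    f.write('InputSandbox  = {"%s", "%s"};\n' % (executable, tcl))
    f.write('OutputSandbox = {"stdout.log", "stderr.log"};\n')
    f.write('Environment   = {%s};\n'
            % ', '.join(['"%s=%s"' % kv for kv in env.items()]))
    # Restrict matchmaking to the sites holding this bucket's data
    # (mapping names like BABAR-RAL to CE identifiers is assumed here).
    f.write('Requirements  = %s;\n'
            % ' || '.join(['other.CEId == "%s"' % s for s in sites]))
    f.close()
    return jdlname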
Submission to GRID
• The index file defines a data file to be used by a job. The data filename is inserted into the expanded TCL file and sourced
• A wrapper around the user-defined executable is created and sent to the GRID
• The wrapper defines the environment on the GRID but relies on the existing BaBar setup
• The standard output and standard error are put into the output sandbox, as well as any additional user-created files (e.g. n-tuples)
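A minimal sketch of such a wrapper generator; the setup command is a placeholder, since the wrapper relies on whatever BaBar environment already exists at the site:

def make_wrapper(executable, tcl, env, path='wrapper.sh'):
    # Generate a shell script that exports the user's environment
    # variables, picks up the site's existing BaBar setup, and runs
    # the user executable on the expanded TCL file.
    f = open(path, 'w')
    f.write('#!/bin/sh\n')
    for name, value in env.items():
        f.write('%s="%s"; export %s\n' % (name, value, name))
    f.write('. $BFROOT/setup.sh   # placeholder for the site BaBar setup\n')
    f.write('%s %s\n' % (executable, tcl))
    f.close()
    return path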
To Do
• A manual is (almost) ready.
• The job submitter requires Linux 2.4 in order to run the TCL expansion. Since there is no official EDG UI under 2.4, the expansion and the actual submission are currently done in two steps, but it is planned to combine them later.
• In which form should the code (pure Python) be released?
  • rpm or Python distribution tools [or CVS]?
Conclusions
• The first ‘dumb’ version of the submitter exists
• The command-line scripts automatically create a set of JDL files based on the skimData-created index
• This ensures that jobs will be submitted only to a site which actually holds the required data
• The load balancing is done by the Resource Broker
• The environment variables may be transferred to the GRID, but they are the same for all JDL files
• The Python modules will be distributed as rpm or tar files and installed as third-party modules (Python Distutils), as sketched below
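A minimal Distutils setup script of the kind described; the package name and module list are invented for illustration:

from distutils.core import setup

setup(name='babar-submitter',
      version='0.1',
      description='BaBar grid job submitter',
      py_modules=['skimindex', 'tclexpand', 'jdlgen', 'gridsubmit'])

Running python setup.py sdist then produces a tar file, and python setup.py bdist_rpm an rpm, matching the two distribution options above.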
Metadata etc
Alessandra Forti
Manchester
RLS – Replica Location Service
• RLS is the EDG/Globus Replica Location Service
• It will replace the LDAP-based replica catalog
• A distributed system based on MySQL (a relational database)
• It consists of two levels of information:
  • LRC (Local Replica Catalog), holding local replica, i.e. PFN (Physical File Name), information
  • RLI (Replica Location Index), containing pointers to the different LRCs for each LFN (Logical File Name)
• The RLS doesn’t contain any metadata
• Can we use this with skimData?
  • instead of remote lookups at each potential site
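The two-level lookup can be pictured with plain Python dictionaries; this is a conceptual illustration only, not the RLS API, and the LFN/PFN strings are invented:

# Each LRC maps an LFN to the PFNs held at that site.
lrc_ral  = {'run1234.root': ['pfn://ral.ac.uk/store/run1234.root']}
lrc_slac = {'run1234.root': ['pfn://slac.stanford.edu/store/run1234.root']}

# The RLI maps an LFN to the LRCs that know about it.
rli = {'run1234.root': [lrc_ral, lrc_slac]}

def replicas(lfn):
    # Resolve an LFN to all of its PFNs: ask the RLI which LRCs to
    # consult, then collect the PFNs from each of them.
    pfns = []
    for lrc in rli.get(lfn, []):
        pfns = pfns + lrc.get(lfn, [])
    return pfns

If skimData consulted an RLI in this way, a single local query could replace the remote lookups at each potential site.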
Other Work
• Alessandra is maintaining the LDAP-based Replica Catalogue (RC)
  • Used by INFN, IN2P3 and SLAC for testing
  • Works in EDG 1.2
  • Will be upgraded today or tomorrow for EDG 1.3
• Can now submit jobs using Globus, gsiklog, and AFS
  • Works since yesterday!
• Also plan to install R-GMA in Manchester for testing outside the testbed
UK Grid Facilities
Marc Kelly
Bristol
Central Facilities
• The "Testbed" resoruce broker is being upgraded to 1.4 at Imperial College
• UK Testbench at RAL is also being upgraded to 1.4• RAL Tier A front-end will follow
UK Farms
• Still have a 1.2 Resource Broker for BaBar usage. This machine claims to have the following registered:
  • gf18.hep.man.ac.uk
  • gm04.hep.ph.ic.ac.uk
  • gw32.hep.ph.ic.ac.uk
  • bfa.hep.ph.ic.ac.uk
  • farm020.hep.phy.cam.ac.uk
  • ce.hep.phy.cam.ac.uk
  • bbr-gate01.slac.stanford.edu
  • bbr-gate02.slac.stanford.edu
  • bbr-gate03.slac.stanford.edu
• What about the other BaBar UK Farms?
UK Farms (cont)
• Many of the UK farms have upgraded or are planning to upgrade, but have been following a moving target
  • 1.3.3, then 1.3.4, now 1.4.0
• Need to decide which version we move to and stick to that.