LHCb File Transfer framework
N. Brook, Ph. Charpentier, A. Tsaregorodtsev
LCG Storage Management Workshop, 6 April 2005, CERN
Outline
• LHCb (advanced) usage of SRM
• SRM v2 requirements
• DIRAC Data Management tools
• File transfer framework
• Interfacing to FTS
LHCb (advanced) Usage of SRM
Stripping on LCG
• Jobs have several input files (between 40 and 80)
• Jobs are sent to the site where the data are placed
• Currently 3 sites are used: CNAF, CERN and PIC, based on CASTOR Mass Storage
• The SRM interface is used to access the MSS
Scale of stripping in Data Challenge

| Quantity | Physics stripping jobs | Trigger stripping jobs |
| --- | --- | --- |
| Number of events per job | 40,000 | 360,000 |
| Number of input files per job | 80 | 400 (files of 900 events) or 200 (1800 events) |
| Input data size per job | 80*0.3 = 24 GB | 400*0.18 = 72 GB |
| Number of output files per job | 2 (DST + event collection) | 1 |
| Output DST size | 600 MB | 500 MB |
| Event collection size | 1.2 MB | - |
| Total number of events | 60M | 90M |
| Total number of jobs | 1,500 | 250 |
| Total input data size | 36 TB | 18 TB |
| Total output data size | 0.9 TB | 125 GB |
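The totals quoted for the stripping follow from the per-job figures. As a quick cross-check, the sketch below recomputes the number of jobs and the data volumes. It assumes decimal units (1 TB = 1000 GB) and counts only the DST as output, neglecting the 1.2 MB event collection; the function name is illustrative only.

```python
# Cross-check of the Data Challenge stripping totals.
# Assumptions: decimal units (1 TB = 1000 GB); output counts the DST only.

def stripping_totals(total_events, events_per_job, input_gb_per_job, output_gb_per_job):
    """Return (number of jobs, total input in TB, total output in TB)."""
    n_jobs = total_events // events_per_job
    total_input_tb = n_jobs * input_gb_per_job / 1000
    total_output_tb = n_jobs * output_gb_per_job / 1000
    return n_jobs, total_input_tb, total_output_tb

# Physics stripping: 60M events, 40,000 per job, 80 files of 0.3 GB, 600 MB DST
physics = stripping_totals(60_000_000, 40_000, 80 * 0.3, 0.6)

# Trigger stripping: 90M events, 360,000 per job, 400 files of 0.18 GB, 500 MB DST
trigger = stripping_totals(90_000_000, 360_000, 400 * 0.18, 0.5)

print("physics:", physics)
print("trigger:", trigger)
```

The results reproduce the quoted 1,500 jobs, 36 TB in and 0.9 TB out for physics stripping, and 250 jobs, 18 TB in and 125 GB out for trigger stripping.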
Usage of SRM
• LHCb CLI tools
  • Stage request
  • File status
  • Advisory delete
• CLI tools are built on the GFAL library, aiming to avoid any SRM version dependencies
SRM (v1.0) Experience with CASTOR
• Inability to pin/unpin or mark a file for garbage collection; possible workarounds (redefined SRM “advisory delete” provided):
  • Throttle jobs: manpower intensive (not feasible)
  • New SRM stage request at each file check: used on LCG
  • Technology-specific commands: used on LXBATCH for debugging the workflow
  • SRM “advisory delete” re-defined
• SRM fails to deal with corrupted/missing files
  • If an error is returned to SRM, all subsequent files are also marked as failed (even if successful!); needs a new CASTOR implementation
SRM (v1.0) Experience with CASTOR
• No control over the stage pool: mixing of general users and production manager
  • Solved: LCG can now check the user and responsibility and assign the pool accordingly
• SRM request ID lifetime
  • Implemented to last 10 days; a problem was noted during a reboot of the SRM server, which lost the lifetime flat file
• Access rights
  • If one server creates files under one user account, they are not readable by the other servers if the mapping is to another user; problem solved
LHCb requirements for SRM
• Consider SRM v2.1
  • Ignore the artificial grouping of methods (basic, advanced 1…)
  • v1 is definitely not enough, v3 is not mature
• Definition: an SRM endpoint uniquely defines an SE
SRM namespace
• An SURL is the concatenation of 3 fields
  • An SE/SRM endpoint: SRM://mysrmserver.site.xx
  • A file prefix: e.g. /castor/cern.ch
    • Note that this is site-dependent information, and subject to change…
  • A filename: e.g. /lhcb/production/DC04/evttype1234/DST/01234_2134.dst
    • Need for a convention “à la Castor”: VO/username
• When replicating a file, there is no way to know the actual prefix (site dependent)
• Hence we request the possibility to use relative paths
  • SRM://mysrmserver.site.xx//lhcb/production/DC04/…
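To make the three-field structure concrete, here is a minimal sketch that builds an SURL from the example endpoint, prefix and filename given on this slide, together with the requested relative form in which the site-dependent prefix is left out. The helper names `make_surl` and `make_relative_surl` are hypothetical and not part of any SRM or DIRAC API.

```python
# Hypothetical helpers illustrating the SURL structure described on the slide.

def make_surl(endpoint: str, prefix: str, filename: str) -> str:
    """Concatenate SE/SRM endpoint, site-dependent prefix and filename."""
    return endpoint.rstrip("/") + "/" + prefix.strip("/") + "/" + filename.lstrip("/")

def make_relative_surl(endpoint: str, filename: str) -> str:
    """Requested relative form: endpoint, then '//' and the site-independent filename."""
    return endpoint.rstrip("/") + "//" + filename.lstrip("/")

# Example values taken from the slide
ENDPOINT = "SRM://mysrmserver.site.xx"
PREFIX = "/castor/cern.ch"
FILENAME = "/lhcb/production/DC04/evttype1234/DST/01234_2134.dst"

absolute = make_surl(ENDPOINT, PREFIX, FILENAME)
relative = make_relative_surl(ENDPOINT, FILENAME)
print(absolute)
print(relative)
```

With the relative form, a replication tool only needs the endpoint of the target SE and the filename; the prefix would be resolved by the site.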
High priority methods
• File types
  • Volatile and permanent are mandatory
• Space management (required for stripping jobs)
  • Space reservation, extension, deletion
• Directory management
  • All methods, except possibly mv (mind ls)
• Data “transfer” methods
  • All important; pinning is a must (i.e. lifetime)
  • srmCopy: cf. discussion with FTS
High priority methods (cont’d)
• Protocols
  • srmPrepareForGet should return a list of possible tURLs
  • User/IO system to select which one to use
    • e.g. at ROOT level (should accept SURL as PFN)
• File access control
  • Based on user and role
  • Propose the use of user directories
    • e.g. lhcb/user/a/atsareg
    • How to define a persistent “name”?
    • How to create the directory (no write access to the top directory lhcb/user…)?
• Authentication/authorisation
  • Should allow access from the Grid, direct or local
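The two proposals above (client-side choice among several tURLs, and per-user directories) can be sketched as follows. This is an illustration under assumptions: the function names, the example host `se.example.org` and the protocol preference order are invented here, and the first-letter directory rule is inferred from the single example lhcb/user/a/atsareg.

```python
from urllib.parse import urlparse

def select_turl(turls, preferred_protocols):
    """Return the first offered tURL whose scheme appears in the client's preference order."""
    by_scheme = {}
    for turl in turls:
        # Keep the first tURL offered for each protocol
        by_scheme.setdefault(urlparse(turl).scheme.lower(), turl)
    for protocol in preferred_protocols:
        if protocol in by_scheme:
            return by_scheme[protocol]
    raise LookupError("no tURL offered for any supported protocol")

def user_directory(vo, username):
    """Build a user directory such as lhcb/user/a/atsareg (assumed first-letter rule)."""
    return f"{vo}/user/{username[0]}/{username}"

# Hypothetical list of tURLs, as srmPrepareForGet is asked to return
offered = [
    "gsiftp://se.example.org/data/file.dst",
    "rfio://se.example.org/data/file.dst",
]
chosen = select_turl(offered, preferred_protocols=["rfio", "gsiftp"])
home = user_directory("lhcb", "atsareg")
print(chosen)
print(home)
```

Leaving the choice to the client lets an I/O layer such as ROOT pick a protocol it actually supports at the site where the job runs.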
DIRAC Data Management tools
File Catalogs
• DIRAC incorporated 2 different File Catalogs
  • Replica tables in the LHCb Bookkeeping Database
  • File Catalog borrowed from the AliEn project
[Figure: a DIRAC application or service uses a common FileCatalog client. AliEn FileCatalog Service: XML-RPC server, AliEn FC client, AliEn FC (MySQL), AliEn UI. BK FileCatalog Service: XML-RPC server, BK FC client, LHCb BK DB (ORACLE).]
• Both catalogs have identical client APIs
  • Can be used interchangeably
  • This was done for redundancy and for gaining experience
• Other catalogs will be interfaced in the same way
  • LFC: work in progress
  • AliEn upgraded
  • FiReMan
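The idea of identical client APIs over different catalog backends can be sketched as below. This is not the DIRAC code: the class and method names are hypothetical, the backends are in-memory stand-ins for the Bookkeeping replica tables and the AliEn File Catalog, and the SE names and PFNs are invented.

```python
from typing import Dict, List, Protocol

class FileCatalog(Protocol):
    """Common client API that every catalog backend is assumed to offer."""
    def add_replica(self, lfn: str, se: str, pfn: str) -> None: ...
    def get_replicas(self, lfn: str) -> Dict[str, str]: ...

class InMemoryCatalog:
    """Stand-in for a real backend; stores replicas as {lfn: {se: pfn}}."""
    def __init__(self, name: str) -> None:
        self.name = name
        self._replicas: Dict[str, Dict[str, str]] = {}

    def add_replica(self, lfn: str, se: str, pfn: str) -> None:
        self._replicas.setdefault(lfn, {})[se] = pfn

    def get_replicas(self, lfn: str) -> Dict[str, str]:
        return dict(self._replicas.get(lfn, {}))

class RedundantCatalogClient:
    """Registers in every catalog, reads from the first one that knows the file."""
    def __init__(self, catalogs: List[FileCatalog]) -> None:
        self.catalogs = catalogs

    def add_replica(self, lfn: str, se: str, pfn: str) -> None:
        for catalog in self.catalogs:
            catalog.add_replica(lfn, se, pfn)

    def get_replicas(self, lfn: str) -> Dict[str, str]:
        for catalog in self.catalogs:
            replicas = catalog.get_replicas(lfn)
            if replicas:
                return replicas
        return {}

bk = InMemoryCatalog("BK")
alien = InMemoryCatalog("AliEn")
client = RedundantCatalogClient([bk, alien])
client.add_replica("/lhcb/test/file.dst", "ExampleSE", "/storage/lhcb/test/file.dst")
print(client.get_replicas("/lhcb/test/file.dst"))
```

Because the backends share one interface, a further catalog such as LFC would only need a new class with the same two methods.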
Data management tools
• DIRAC Storage Element is a combination of a standard server and a description of its access in the Configuration Service
  • Similar to “Classic SE”
  • Pluggable transport modules: gridftp, bbftp, sftp, ftp, http, …
• SRM can be incorporated into the DIRAC framework with a similar interface
• DIRAC ReplicaManager interface (API and CLI)
  • get(), put(), replicate(), register(), etc.
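A Storage Element described by configuration plus pluggable transport modules might look roughly like the sketch below. All names here (`TRANSPORTS`, `transport`, `StorageElement`, the example host and paths) are assumptions for illustration; the plugins only return a description of the transfer instead of calling a real client, and none of this reproduces the actual DIRAC API.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Registry of transport plugins keyed by protocol name (hypothetical)
TRANSPORTS: Dict[str, Callable[[str, str], str]] = {}

def transport(protocol: str):
    """Decorator that registers a transport plugin for one protocol."""
    def register(func):
        TRANSPORTS[protocol] = func
        return func
    return register

@transport("gridftp")
def gridftp_put(local_path: str, remote: str) -> str:
    # A real plugin would invoke a GridFTP client; this only reports the plan
    return f"gridftp: {local_path} -> {remote}"

@transport("sftp")
def sftp_put(local_path: str, remote: str) -> str:
    return f"sftp: {local_path} -> {remote}"

@dataclass
class StorageElement:
    """Access description as it might be stored in a configuration service."""
    name: str
    host: str
    protocol: str
    base_path: str

    def put(self, local_path: str, lfn: str) -> str:
        remote = f"{self.host}:{self.base_path}{lfn}"
        return TRANSPORTS[self.protocol](local_path, remote)

se = StorageElement("ExampleSE", "se.example.org", "gridftp", "/data")
result = se.put("/tmp/out.dst", "/lhcb/test/out.dst")
print(result)
```

Switching an SE to another protocol is then a configuration change, and an SRM-based plugin could be registered in the same way.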
File Transfer framework
File Transfer framework
[Figure: architecture of the file transfer framework. Central: Data Manager, WMS, DMS, Job Monitoring, FileCatalog. Site: Job Agent, Transfer Agent, Request DB, Cache, Local SE, Remote SE. Layers: reliable operations, information services, resources. Labelled operations: submitJob, getJob, getJobStatus, setJobStatus, getFileInfo, registerReplica, transfer request, putRequest, getRequest, storeLocal, importData, exportData.]
File Transfer framework
• Reuses the WMS infrastructure
  • To deliver transfer requests
  • To monitor the request execution progress
• Reliable File Transfer
  • Transfers are mediated by on-site Transfer Agents
  • On-site Request DB shared with other “reliable operations”:
    • Bookkeeping registration
    • Application status updates
    • Job parameters/accounting registration
• This framework is OK for small transfers
  • Job outputs
• Need for efficient bulk transfer operations
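The reliable-transfer pattern described here, with requests parked in an on-site Request DB and retried by a Transfer Agent, can be sketched with a small SQLite table. The schema, the function names and the retry limit are assumptions made for this example, and the transfer itself is simulated.

```python
import sqlite3

def open_request_db(path=":memory:"):
    """Create the (hypothetical) request table if it does not exist yet."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS requests ("
        "id INTEGER PRIMARY KEY, lfn TEXT, target_se TEXT, "
        "status TEXT DEFAULT 'waiting', attempts INTEGER DEFAULT 0)"
    )
    return db

def put_request(db, lfn, target_se):
    """Store a transfer request so that it survives a failed attempt."""
    db.execute("INSERT INTO requests (lfn, target_se) VALUES (?, ?)", (lfn, target_se))
    db.commit()

def run_agent_cycle(db, transfer, max_attempts=3):
    """One pass of the Transfer Agent: try every waiting request once."""
    rows = db.execute(
        "SELECT id, lfn, target_se, attempts FROM requests WHERE status = 'waiting'"
    ).fetchall()
    for req_id, lfn, target_se, attempts in rows:
        try:
            transfer(lfn, target_se)
            status = "done"
        except OSError:
            # Keep the request for a later cycle unless the retry limit is reached
            status = "failed" if attempts + 1 >= max_attempts else "waiting"
        db.execute(
            "UPDATE requests SET status = ?, attempts = ? WHERE id = ?",
            (status, attempts + 1, req_id),
        )
    db.commit()

# Simulated transfer that fails on the first call only
calls = {"n": 0}
def flaky_transfer(lfn, target_se):
    calls["n"] += 1
    if calls["n"] == 1:
        raise OSError("simulated transient failure")

db = open_request_db()
put_request(db, "/lhcb/production/DC04/evttype1234/DST/01234_2134.dst", "RemoteSE")
run_agent_cycle(db, flaky_transfer)  # first attempt fails, request stays waiting
run_agent_cycle(db, flaky_transfer)  # second attempt succeeds
final = db.execute("SELECT status, attempts FROM requests").fetchall()
print(final)
```

Handling requests one file at a time like this suits job outputs; it also shows why a dedicated bulk service is wanted for large data distributions.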
File Transfer with FTS
[Figure: file transfer with FTS. LHCb components: Data Manager Interface, Request DB, Policy Agent, Monitoring Agent, FileCatalog Agent, Accounting Agent. FTS: Data movement Scheduler and Movement Clients, transferring between SE1 and SE2.]
File Transfer with SRM-copy
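[Figure: file transfer with SRM-copy. LHCb components: Data Manager Interface, Request DB, Policy Agent, Monitoring Agent, FileCatalog Agent, Accounting Agent, Transfer Agent. The Transfer Agent issues srmCopy and requestStatus for transfers between SE1 and SE2.]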
LHCb VO components are the same as with FTS
Service Challenge
• We would like to get access to the FTS system as early as possible
  • May?
  • Our own evaluation of stability
• Start with one central Request Store instance
  • Add more instances as necessary
• Do bulk transfers for stripping data distribution to Tier1 centers
  • September
  • T0-T1, T1-T1 transfers
  • Not part of LHCb Data Challenge