Computing Infrastructure Status
LHCb Computing Status
LHCb LHCC mini-review, February 2008
The LHCb Computing Model: a reminder
Simulation uses non-Tier1 CPU resources; the MC data are stored at Tier0/1 sites, with no permanent storage at the simulation sites.
Real data are processed at Tier0/1 sites (up to analysis).
The life of a real LHCb event
An event is written out to a RAW file by the Online System; files are 2 GB / 60,000 events / 30 seconds on average.
The RAW file is transferred to Tier0 (CERN Castor) and migrated to tape, with its checksum verified (it can then be deleted in the Online system).
The RAW file is transferred to one of the Tier1s and migrated to tape.
The RAW file is reconstructed at Tier0/1; it takes about 20 hours to create an rDST.
The rDST is stored locally with migration to tape. When enough (4?) rDSTs are ready, they are stripped at the local site.
Data streams are created; files are possibly merged into 2 GB files.
Merged streamed DSTs & ETCs are created and stored locally and at CERN, with tape migration.
The DST & ETC are distributed to the 5 other Tier1s.
Analysis takes place on these stripped DSTs at all Tier0/1 sites.
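As a compact summary of this chain, here is a minimal sketch in plain Python (not DIRAC code; the site labels and step descriptions simply restate the slide):

```python
# Minimal sketch (plain Python, not DIRAC code): the chain above as an
# ordered list of (where, what) steps, just to keep the whole flow in one place.

RAW_FILE_LIFECYCLE = [
    ("Online",         "write RAW file: 2 GB / 60,000 events / ~30 s"),
    ("Tier0 (Castor)", "transfer RAW, migrate to tape, verify checksum"),
    ("one Tier1",      "transfer RAW, migrate to tape"),
    ("Tier0/1",        "reconstruct (~20 h) into an rDST, stored locally with tape migration"),
    ("local site",     "strip when enough (4?) rDSTs are ready, create data streams"),
    ("local site",     "merge streamed DSTs into 2 GB files (DST & ETC)"),
    ("CERN + Tier1s",  "store merged DST & ETC with tape migration, distribute to the 5 other Tier1s"),
    ("all Tier0/1",    "run analysis on the stripped DSTs"),
]

if __name__ == "__main__":
    for step, (where, what) in enumerate(RAW_FILE_LIFECYCLE, start=1):
        print(f"{step}. [{where}] {what}")
```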
A few numbers
RAW file: 2 GB, 60,000 events, 30 s; 1,500 files per day, 3 TB, 70 MB/s, 90 Mevts/day.
Reconstruction: 20 h per file, i.e. 1,500 CPUs permanently busy (200 to 300 per site).
rDST file: 2 GB, 60,000 events.
Stripped DSTs: the Computing TDR uses very aggressive numbers for the selection, a factor 10 overall reduction (after HLT); the number of streams is still to be defined, assume 20 balanced streams.
One 2 GB DST of 20,000 events originates from 4 Mevts of RAW; one needs to merge 16 DSTs (each one corresponding to 240 kevts of RAW).
25 streamed DSTs per day and per stream (50 GB); for 120 days of run, only 6 TB for each stream (3,000 files).
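These figures are easy to cross-check; the short script below redoes the arithmetic from the numbers quoted on this slide (plain Python, illustration only; the 1,500 files/day figure already folds in the accelerator duty cycle, so it is taken as an input rather than derived):

```python
# Back-of-the-envelope check of the numbers quoted above.

GB = 1e9

raw_file_size        = 2 * GB     # bytes per RAW file
raw_events_per_file  = 60_000
raw_seconds_per_file = 30
raw_files_per_day    = 1_500      # taken from the slide (includes the duty cycle)

rate         = raw_file_size / raw_seconds_per_file        # ~67 MB/s, quoted as 70 MB/s
daily_volume = raw_files_per_day * raw_file_size            # 3 TB/day
daily_events = raw_files_per_day * raw_events_per_file      # 90 Mevts/day

# Reconstruction: 20 h per RAW file
cpus_needed = raw_files_per_day * 20 / 24                    # ~1,250, quoted as ~1,500 CPUs

# Stripping: factor 10 reduction after HLT, 20 balanced streams, 2 GB DSTs of 20,000 events
dst_events_per_file = 20_000
streams = 20
dsts_per_stream_per_day = daily_events / 10 / streams / dst_events_per_file   # ~22.5, quoted as 25
volume_per_stream_120d  = dsts_per_stream_per_day * 2 * GB * 120              # ~5.4 TB, quoted as 6 TB

print(f"Online rate:              {rate / 1e6:.0f} MB/s")
print(f"RAW volume per day:       {daily_volume / 1e12:.1f} TB")
print(f"RAW events per day:       {daily_events / 1e6:.0f} Mevts")
print(f"CPUs for reconstruction:  {cpus_needed:.0f}")
print(f"DSTs per stream per day:  {dsts_per_stream_per_day:.1f}")
print(f"Volume/stream, 120 days:  {volume_per_stream_120d / 1e12:.1f} TB")
```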
The LHCb Computing Grid (1)
The integrated system is called DIRAC. DIRAC is a community (LHCb) Grid solution based on Grid infrastructure services (storage and computing resources and services).
DIRAC provides a unified framework for all services and tools. DIRAC2 has been used since 2004 for production and since 2006 for analysis; DIRAC3 is being commissioned (a full re-engineering).
DIRAC provides high-level generic services.
Data Management:
Fully reliable file transfers: retried until successful, based on gLite FTS (File Transfer Service) using specific channels and network links (LHC Optical Private Network, OPN); FTS is used for all transfers except file upload from Worker Nodes (simple file upload commands).
A failover and recovery mechanism is implemented for all operations: actions to be performed later are registered (e.g. for when a site is available again); see the sketch after this list.
Registration of all copies of files (called "replicas"); for users, files only have a Logical File Name (LFN).
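To illustrate the failover idea described above, here is a minimal sketch, assuming toy stand-ins for the storage element, replica catalogue and request database; it is not DIRAC code, only an illustration of the control flow (try the destination, fall back to a failover storage element, and record the work to be replayed later):

```python
# Minimal sketch of the failover-and-recovery idea. NOT DIRAC code:
# StorageElement, ReplicaCatalog and RequestDB are toy stand-ins.

import logging

class TransferError(Exception):
    pass

class StorageElement:
    def __init__(self, name, available=True):
        self.name, self.available = name, available
    def put(self, local_path, lfn):
        if not self.available:
            raise TransferError(f"{self.name} is unavailable")
        return f"srm://{self.name}{lfn}"            # pretend physical file name

class ReplicaCatalog:
    def __init__(self):
        self.replicas = {}                          # LFN -> list of (SE name, PFN)
    def register(self, lfn, se, pfn):
        self.replicas.setdefault(lfn, []).append((se.name, pfn))

class RequestDB:
    """Actions to be replayed later, e.g. when a site is available again."""
    def __init__(self):
        self.pending = []
    def add(self, action, **kwargs):
        self.pending.append((action, kwargs))

def upload_with_failover(local_path, lfn, destination, failover, catalog, requests):
    try:
        pfn = destination.put(local_path, lfn)
        catalog.register(lfn, destination, pfn)
        return "uploaded"
    except TransferError as exc:
        logging.warning("upload of %s to %s failed: %s", lfn, destination.name, exc)
    # Park the file on the failover SE and record the work still to be done
    pfn = failover.put(local_path, lfn)
    catalog.register(lfn, failover, pfn)
    requests.add("replicate-then-remove", lfn=lfn,
                 source=failover.name, target=destination.name)
    return "failed over"

if __name__ == "__main__":
    catalog, requests = ReplicaCatalog(), RequestDB()
    tier1 = StorageElement("RAL", available=False)   # simulate an outage
    failover = StorageElement("CERN-FAILOVER")
    print(upload_with_failover("/tmp/raw_0001.raw", "/lhcb/data/raw_0001.raw",
                               tier1, failover, catalog, requests))
    print(requests.pending)
```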
The LHCb Computing Grid (2)
Workload Management: prepares data (pre-staging) if not available online; submits jobs to sites where the data are available, using the EGEE infrastructure (gLite WMS); monitors the progress of the jobs; uploads output data and log files; allows re-submission if a job fails.
DIRAC LHCb-specific services:
Data Management: registration of the provenance of files (Bookkeeping), used to retrieve the list of files matching a user's criteria; returns LFNs.
Production tools: definition of (complex) processing workflows, e.g. run 4 Gauss, 3 Boole and 1 Brunel "steps"; creation of jobs either manually (by the Production Manager) or automatically, using pre-defined criteria (e.g. when 4 rDSTs are ready, strip them using this workflow); see the sketch below.
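To give a flavour of these production-tool ideas, here is a minimal sketch, assuming a toy workflow class and an automatic trigger that groups inputs into jobs; it is not DIRAC production code, and the generic "Strip" step name is a placeholder (only Gauss, Boole and Brunel are named on the slide):

```python
# Minimal sketch of the production tools described above: a workflow is an
# ordered list of application "steps" (e.g. 4 Gauss, 3 Boole, 1 Brunel), and
# jobs are created automatically once a pre-defined criterion is met
# (e.g. 4 rDSTs are ready for stripping). NOT actual DIRAC code.

from dataclasses import dataclass, field

@dataclass
class Workflow:
    name: str
    steps: list             # e.g. [("Gauss", 4), ("Boole", 3), ("Brunel", 1)]

@dataclass
class AutoProduction:
    workflow: Workflow
    files_per_job: int       # criterion: group this many input files per job
    pending: list = field(default_factory=list)
    jobs: list = field(default_factory=list)

    def add_input(self, lfn):
        """Called whenever a new input file (e.g. an rDST) is registered."""
        self.pending.append(lfn)
        while len(self.pending) >= self.files_per_job:
            inputs = self.pending[:self.files_per_job]
            self.pending = self.pending[self.files_per_job:]
            self.jobs.append({"workflow": self.workflow.name, "inputs": inputs})

if __name__ == "__main__":
    mc = Workflow("MC simulation", [("Gauss", 4), ("Boole", 3), ("Brunel", 1)])
    print("example MC workflow steps:", mc.steps)

    stripping = AutoProduction(Workflow("Stripping", [("Strip", 1)]), files_per_job=4)
    for i in range(9):                       # nine rDSTs arrive, one by one
        stripping.add_input(f"/lhcb/data/rdst_{i:04d}.rdst")
    print(len(stripping.jobs), "stripping job(s) created,",
          len(stripping.pending), "rDST(s) still pending")
```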
DIRAC3 for first data taking
DIRAC3 is being commissioned: most components are ready, integrated and tested; basic functionality (equivalent to DIRAC2).
This week: full rehearsal week, all developers are at CERN. Goal: follow the progress of the challenge, fix problems ASAP.
DIRAC3 planning (as of 15 Nov):
30 Nov 2007: basic functionality
15 Dec 2007: Production Management, start tests
15 Jan 2008: full CCRC functionality, tests start
5 Feb 2008: start tests for CCRC phase 1
18 Feb 2008: run CCRC
31 Mar 2008: full functionality, ready for CCRC phase 2 tests
Current status: on time with the above schedule.
CCRC’08 for LHCb
Raw data upload: Online to Tier0 storage (CERN Castor), using the DIRAC transfer framework; exercise two transfer tools (Castor rfcp, GridFTP).
Raw data distribution to the Tier1s (reminder: CNAF, GridKa, IN2P3, NIKHEF, PIC, RAL), using the gLite File Transfer Service (FTS), based on the upcoming Storage Resource Manager (SRM) version 2.2 (just coming out); shares follow the resource pledges from the sites (see the sketch below).
Data reconstruction at Tier0+1: production of rDSTs, stored locally; data access also uses SRM v2 (various storage back-ends: Castor and dCache).
For May: stripping of the reconstructed data (initially foreseen in February, but de-scoped), distribution of the streamed DSTs to the Tier1s and, if possible, file merging.
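The "share according to resource pledges" rule can be illustrated with a small sketch; this is not DIRAC code, and the pledge fractions below are made-up placeholders, not the real February figures:

```python
# Minimal sketch of "share according to resource pledges from sites":
# each new RAW file goes to the Tier1 currently furthest below its pledged
# share. Illustration only; the fractions are placeholders.

def pick_destination(pledges, sent_so_far):
    """Return the Tier1 whose achieved fraction lags its pledge the most."""
    total_sent = sum(sent_so_far.values()) or 1
    def deficit(site):
        return pledges[site] - sent_so_far[site] / total_sent
    return max(pledges, key=deficit)

if __name__ == "__main__":
    # Placeholder shares (sum to 1); the real values come from the site pledges
    pledges = {"CNAF": 0.18, "GridKa": 0.18, "IN2P3": 0.22,
               "NIKHEF": 0.14, "PIC": 0.08, "RAL": 0.20}
    sent = {site: 0 for site in pledges}
    for _ in range(1_500):                    # one nominal day of RAW files
        sent[pick_destination(pledges, sent)] += 1
    print(sent)                               # counts end up close to the pledged fractions
```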
Tier1 resources for CCRC’08
Data sharing according to the Tier1 pledges as of February 15th (!!!).
The LHCb SRM v2.2 space token descriptions are (see the sketch below):
LHCb_RAW (T1D0)
LHCb_RDST (T1D0)
LHCb_M-DST (T1D1)
LHCb_DST (T0D1)
LHCb_FAILOVER (T0D1), used for temporary uploads in case the destination is unavailable
All data can be scrapped after the challenge; this is used to test SRM bulk removal.
Based on a 2-week run: 28,000 files (42 TB).
CCRC'08 in May: 4 weeks of continuous running, with established services and procedures.
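The space-token list above is essentially a mapping from data type to storage class; the following is a minimal sketch of that table as a lookup (not DIRAC or site configuration, just an illustration):

```python
# Sketch of the SRM v2.2 space-token table above as a lookup.
# TxDy is the usual WLCG notation: x tape copies, y disk copies.

SPACE_TOKENS = {
    "RAW":      ("LHCb_RAW",      "T1D0"),   # tape only
    "RDST":     ("LHCb_RDST",     "T1D0"),   # tape only
    "M-DST":    ("LHCb_M-DST",    "T1D1"),   # tape + disk (merged DSTs)
    "DST":      ("LHCb_DST",      "T0D1"),   # disk only
    "FAILOVER": ("LHCb_FAILOVER", "T0D1"),   # disk only, temporary parking
}

def token_for(data_type):
    """Return (space token, storage class) for a given LHCb data type."""
    return SPACE_TOKENS[data_type.upper()]

if __name__ == "__main__":
    for dtype in ("RAW", "DST"):
        print(dtype, "->", token_for(dtype))
```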
CCRC'08 first results
Tier0 to Tier1 transfers: 5,000 files (7.5 TB) transferred to the Tier1s using FTS + SRM v2 in 4 hours (about 500 MB/s aggregate)! All the Tier1 RAW spaces work; continuous transfer over the weekend.
Plans: adding automatic reconstruction now.
Pit to Tier0 transfers: sustained nominal rate (60 MB/s) for 6 days (red line in the rate plot); 125 MB/s for one day; tape migration followed (green line); the high peaks are Tier1 transfers.
Plans: half rate today (50% duty cycle), nominal rate afterwards (>10 days).
Questions? More news tomorrow at the LCG-LHCC meeting.