The LHCb Data Management System

Philippe Charpentier, CERN
On behalf of the LHCb Collaboration

CHEP 2012, New York City

LHCb Computing Model in a nutshell

- RAW files (3 GB) are transferred to Tier0
  - Verify migration and checksum
- Transfer to Tier1s
  - Each file replicated once (whole run at a single site)
- Reconstruction
  - At CERN and Tier1s (up to 300 HS06.hours)
    - If needed, Tier2s can also be used as “Tier1 co-processors”
- Stripping
  - On the Tier1 where the SDST is available
- MC simulation
  - Complex workflow: simulation, digitization, reconstruction, trigger, filtering
  - Running everywhere with low priority
    - Tier2s, Tier1s, CERN and unpledged resources (some non-Grid)
- User analysis
  - Running at Tier1s for data access, anywhere for MC studies
- Grid activities under control of LHCbDirac
  - LHCb extensions of the DIRAC framework (cf. Poster #272, S. Roiser, and Poster #145, F. Stagni)
  - Many presentations at CHEP (again this time, talks and posters)

Real data applications

- Reconstruction:
  - 50,000 events (3 GB), <CPU> = 20 HS06.s/evt
- Stripping:
  - 50,000 events (2.5 GB), <CPU> = 2 HS06.s/evt
  - 14 streams
- Merge:
  - 40,000 events (5 GB)
- Input files are downloaded to local disk

[Workflow diagram: RAW → Brunel (reconstruction) → SDST → DaVinci (stripping and streaming) → DST1 streams → Merge → merged DST1]

Data replication

- Replication performed using FTS
  - Active data placement
- Split disk (for analysis) and archive
  - No disk replica for RAW and SDST (read few times)
- Using SRM spaces (T1D0 and T0D1)

[Diagram: the same workflow annotated with replica placement labels: 1 Tier1 (scratch), CERN + 1 Tier1, 1 Tier1, CERN + 1 Tier1, CERN + 3 Tier1s]

Datasets

- Granularity at the file level
  - Data Management operations (replicate, remove replica, delete file)
  - Workload Management: input/output files of jobs
- LHCbDirac perspective
  - DMS and WMS use LFNs to reference files
  - The LFN namespace refers to the origin of the file
    - Constructed by the jobs (uses production and job number; see the sketch below)
    - Hierarchical namespace for convenience
    - Used to define file classes (tape-sets) for RAW, SDST, DST
    - GUID used for internal navigation between files (Gaudi)
- User perspective
  - A file is part of a dataset (consistent for physics analysis)
  - Dataset: specific conditions of data, processing version and processing level
    - Files in a dataset should be exclusive and consistent in quality and content

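As an illustration of the point above, here is a minimal sketch of how a hierarchical LFN might be assembled from production and job numbers. The directory layout, grouping rule and file naming are hypothetical, not the actual LHCbDirac convention.

```python
# Hypothetical sketch of building a hierarchical LFN from production
# and job numbers; the real LHCbDirac layout may differ.
def build_lfn(file_type: str, production: int, job: int) -> str:
    """Assemble a logical file name from production and job numbers."""
    subdir = f"{job // 10000:04d}"   # group jobs to keep directories small
    return (f"/lhcb/MC/2012/{file_type}/{production:08d}/{subdir}/"
            f"{production:08d}_{job:08d}_1.{file_type.lower()}")

print(build_lfn("DST", 24133, 812))
# /lhcb/MC/2012/DST/00024133/0000/00024133_00000812_1.dst
```
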
Replica Catalog (1)

- Logical namespace
  - Reflects somewhat the origin of the file (run number for RAW, production number for output files of jobs)
  - File type also explicit in the directory tree
- Storage Elements
  - Essential component in the DIRAC DMS
  - Logical SEs: several DIRAC SEs can physically use the same hardware SE (same instance, same SRM space)
  - Described in the DIRAC configuration
    - Protocol, endpoint, port, SAPath, Web Service URL
    - Allows autonomous construction of the SURL
    - SURL = srm:<endPoint>:<port><WSUrl><SAPath><LFN> (see the sketch below)
  - SRM spaces at Tier1s
    - Used to have as many SRM spaces as DIRAC SEs, now only 3:
      - LHCb-Tape (T1D0): custodial storage
      - LHCb-Disk (T0D1): fast disk access
      - LHCb-User (T0D1): fast disk access for user data

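The SURL recipe quoted above can be made concrete with a short sketch. The SE description below uses made-up endpoint and path values; only the formula itself comes from the slide.

```python
# Minimal sketch of autonomous SURL construction from a DIRAC SE
# description (SURL = srm:<endPoint>:<port><WSUrl><SAPath><LFN>);
# endpoint, port and paths are invented example values.
SE = {
    "Protocol": "srm",
    "Endpoint": "srm-lhcb.example.org",
    "Port": 8443,
    "WSUrl": "/srm/managerv2?SFN=",
    "SAPath": "/dpm/example.org/home/lhcb",
}

def surl(lfn: str, se: dict = SE) -> str:
    """Build the SURL for an LFN from the DIRAC SE description."""
    return (f"{se['Protocol']}://{se['Endpoint']}:{se['Port']}"
            f"{se['WSUrl']}{se['SAPath']}{lfn}")

print(surl("/lhcb/data/2012/RAW/run114753.raw"))
```
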
Replica Catalog (2)

- Currently using the LFC
  - Master write service at CERN
  - Replication to Tier1s using Oracle Streams
  - Read-only instances at CERN and Tier1s
    - Mostly for redundancy, no need for scaling
- LFC information:
  - Metadata of the file
  - Replicas
    - Use the “host name” field for the DIRAC SE name
    - Store the SURL of creation for convenience (not used)
      - Allows lcg-util commands to work
  - Quality flag
    - One-character comment used to temporarily set a replica as unavailable
- Testing scalability of the DIRAC File Catalog (Poster: A. Tsaregorodtsev)
  - Built-in storage usage capabilities (per directory)

Bookkeeping Catalog (1)

- User selection criteria
  - Origin of the data (real or MC, year of reference)
    - e.g. LHCb/Collision12
  - Conditions for data taking or simulation (energy, magnetic field, detector configuration…)
    - e.g. Beam4000GeV-VeloClosed-MagDown
  - Processing Pass: the level of processing (reconstruction, stripping…) including compatibility version
    - e.g. Reco13/Stripping19
  - Event Type: mostly useful for simulation, single value for real data
    - 8-digit numeric code (12345678, 90000000)
  - File Type: defines which type of output files the user wants for a given processing pass (e.g. which stream)
    - RAW, SDST, BHADRON.DST (for a streamed file)
- Bookkeeping search
  - Using a path: /<origin>/<conditions>/<processing pass>/<event type>/<file type> (see the sketch below)

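A minimal sketch of assembling such a bookkeeping path; the helper function is ours, while the example values are the ones quoted above.

```python
# Assemble a Bookkeeping query path from the selection criteria;
# the path structure is the one given on the slide.
def bk_path(origin: str, conditions: str, processing_pass: str,
            event_type: int, file_type: str) -> str:
    """/<origin>/<conditions>/<processing pass>/<event type>/<file type>"""
    return f"/{origin}/{conditions}/{processing_pass}/{event_type}/{file_type}"

print(bk_path("LHCb/Collision12", "Beam4000GeV-VeloClosed-MagDown",
              "Reco13/Stripping19", 90000000, "BHADRON.DST"))
# /LHCb/Collision12/Beam4000GeV-VeloClosed-MagDown/Reco13/Stripping19/90000000/BHADRON.DST
```
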
Bookkeeping Catalog (2)

- Much more than a dataset catalog!
- Full provenance of files and jobs
  - Files are input of processing steps (“jobs”) that produce files
  - All files ever created are recorded, and each processing step as well
    - Full information on the “job” (location, CPU, wall clock time…)
- BK relational database (a minimal schema sketch follows this list)
  - Two main tables: “files” and “jobs”
  - Jobs belong to a “production”
  - “Productions” belong to a “processing pass”, with a given “origin” and “condition”
  - Highly optimized search for files, as well as summaries
- Quality flags
  - Files are immutable, but can have a mutable quality flag
  - Files have a flag indicating whether they have a replica or not

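As a rough illustration of the two-table model, here is a minimal SQLite sketch; the real BK schema runs on Oracle and is certainly richer, and all column names are assumptions.

```python
# Minimal SQLite sketch of the "files"/"jobs" provenance model.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE jobs (
    job_id     INTEGER PRIMARY KEY,
    production INTEGER NOT NULL,          -- jobs belong to a production
    location   TEXT,                      -- where the job ran
    cpu_time   REAL,                      -- CPU time
    wall_time  REAL                       -- wall clock time
);
CREATE TABLE files (
    lfn         TEXT PRIMARY KEY,
    job_id      INTEGER REFERENCES jobs(job_id),  -- producing job
    quality     TEXT DEFAULT 'OK',        -- mutable quality flag
    has_replica INTEGER DEFAULT 1         -- replica flag
);
-- Provenance link: the files a job consumed as input
CREATE TABLE job_inputs (
    job_id INTEGER REFERENCES jobs(job_id),
    lfn    TEXT REFERENCES files(lfn)
);
""")
```
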
Bookkeeping browsing

- Allows saving datasets
  - Filter, selection
  - Plain list of files
  - Gaudi configuration file
- Can return only those files with a replica at a given location

Dataset-based transformations

- Same mechanism used for jobs (workload management tasks) and data management
- Input datasets based on a bookkeeping query
  - Tasks are data-driven, used by both the WMS and the DMS (a sketch follows below)

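A conceptual sketch of such a data-driven transformation is shown below; the function and parameter names are illustrative, not the LHCbDirac API.

```python
# A bookkeeping query defines the input dataset; each new file
# spawns a task (a transfer for the DMS, a job for the WMS).
def run_transformation(bk_query, seen, make_task):
    """Create one task per file newly matching the bookkeeping query."""
    for lfn in bk_query():        # e.g. files under a bookkeeping path
        if lfn not in seen:
            seen.add(lfn)
            make_task(lfn)        # transfer task (DMS) or job task (WMS)
```
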
Data management transformations

- Replication transformations
  - Use a policy implementing the Computing Model
  - Create transfer tasks depending on the original location and space availability
  - Replication using a tree (not all transfers from the initial source; see the sketch after this list)
    - Optimized w.r.t. recent channel bandwidth and SE locality
  - Transfers are performed using FTS whenever possible
    - If not possible, third-party gridftp transfer
      - If no FTS channel, or for user files (special credentials)
  - Replicas are automatically registered in the LFC
- Removal transformations
  - Used for retiring datasets or reducing the number of replicas
  - Used exceptionally to completely remove files (tests, bugs…)
  - Replica removal is protected against removing the last replica!
  - Transfers and removals use the Data Manager credentials

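The tree replication idea can be sketched as follows; the `bandwidth` callable and all names are assumptions, not DIRAC code.

```python
# Later transfers may use an already completed replica as their source,
# ranked by recent channel bandwidth, instead of the initial source.
def plan_transfers(targets, initial_source, bandwidth):
    """Return (source, destination) pairs forming a replication tree."""
    sources, plan = [initial_source], []
    for dst in targets:
        src = max(sources, key=lambda s: bandwidth(s, dst))
        plan.append((src, dst))
        sources.append(dst)       # a finished copy can serve as a source
    return plan
```
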
Staging: using files from tape

- If jobs use files that are not online (on disk):
  - Before submitting the job, stage the files from tape and pin them in the cache
- Stager agent
  - Also performs cache management
  - Throttles staging requests depending on the cache size and the amount of pinned data (see the sketch after this list)
  - Requires fine tuning (pinning and cache size)
    - Caching architecture is highly site-dependent
    - No publication of cache sizes (except Castor and StoRM)
- Jobs using staged files
  - First check that the file is still staged
    - If not, reschedule the job
  - Copy the file locally to the WN whenever possible
    - Space is released faster
    - More reliable access for very long jobs (reconstruction) or jobs using many files (merging)

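A toy sketch of the throttling idea referenced above; the pin-fraction threshold and all names are invented for illustration, since real tuning is site-dependent.

```python
# Submit new stage requests only while pinned data stays below a
# fraction of the cache size.
def throttle_staging(pending, pinned_bytes, cache_bytes, stage,
                     max_pin_fraction=0.7):
    """Stage files from `pending` until the pin budget is exhausted."""
    budget = max_pin_fraction * cache_bytes - pinned_bytes
    while pending and budget > 0:
        lfn, size = pending.pop(0)
        stage(lfn)                # request staging and pinning
        budget -= size
```
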
Data Management operational experience

- Transfer accounting
  - Per site, channel, user…
- Storage accounting
  - Per dataset, SE, user…
  - User quotas
    - Not strictly enforced…

Outlook

- Improvements on staging
  - Improve the tuning of cache settings
    - Depends on how caches are used by sites
  - Pinning/unpinning
    - Difficult if files are used by more than one job
- Popularity
  - Record dataset usage
    - Reported by jobs: number of files used in a given dataset
    - Account the number of files used per dataset per day/week
  - Assess dataset popularity
    - Relate usage to dataset size
  - Take decisions on the number of online replicas (see the sketch after this list)
    - Taking into account available space
    - Taking into account expected need in the coming weeks
  - First rely on the Data Manager receiving advice
    - Possibly move to automated dynamic management

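A hedged sketch of what a popularity-based replica decision might look like, normalizing weekly usage by dataset size as suggested above; the thresholds are invented.

```python
# Suggest a number of online replicas from normalized weekly usage.
def replica_advice(files_used_per_week, dataset_files, current_replicas):
    """Advise a replica count; the Data Manager would take the decision."""
    popularity = files_used_per_week / max(dataset_files, 1)
    if popularity > 1.0:          # dataset fully read more than weekly
        return current_replicas + 1
    if popularity < 0.05:         # barely touched
        return max(current_replicas - 1, 1)   # keep at least one replica
    return current_replicas
```
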
Conclusions

- LHCb uses two views for data management:
  - The dataset view, as seen by users and productions
  - The file view, as seen by Data Management tools (replication, removal)
- Datasets are handled by the LHCb Bookkeeping system (part of LHCbDirac)
- The file view is generic and handled by the DIRAC DMS
- The LHCb DMS gives full flexibility for managing data on the Grid
- In the future, LHCb expects to use popularity criteria to decide on the number of replicas for each dataset
  - Should give more flexibility to the Computing Model