An ODBMS approach to persistency in CMS
Lucia Silvestris
INFN Bari - CERN/EP
CHEP, 7 - 11 February 2000
Padova, Italy
Chep'00 Persistency in CMS L. Silvestris INFN Bari - CERN/EP 7 Feb 2000
CMS - Software Components
[Diagram components: Persistent Object Store Manager, Object Database Management System, CMS Detector (Muon, Tracker, Calo), Slow Control, Online Monitoring, Environment Data, Filter Unit / Event Filter, Objectivity Formatter, Quasi-online Reconstruction, Simulation (G3 and/or G4), Data Quality, Calibrations, Group Physics Analysis, User Analysis (on demand). Arrow labels: "store", "Store rec-Obj", "Request part of event", "Request asynchronous data".]
CARF Components
CARF Architecture: on-demand reconstruction (see V. Innocente's talk on CARF Architecture, session A)
Framework Main Services
  Define the events to be dispatched (events and geometry from simulations or test beams)
  Manage the “not yet removed” sequential components (coming from Geant3)
  Run-time dynamic loading is used to configure and build CARF applications
Framework Persistency Services (the subject of this talk)
Framework Ancillary Services
  User interface, error report, logging facilities, ...
  Timing facility, utility library
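The on-demand reconstruction idea can be illustrated with a minimal C++ sketch: an algorithm runs only when its result is first requested, and the result is then cached. The names LazyReconstructor, RecHit, hits() and runs() are hypothetical and are not taken from CARF.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical illustration: none of these names come from CARF itself.
struct RecHit { double position; };

// Wraps one reconstruction algorithm and runs it lazily.
class LazyReconstructor {
public:
    using Algorithm = std::function<std::vector<RecHit>()>;

    explicit LazyReconstructor(Algorithm algo) : algo_(std::move(algo)) {
        assert(static_cast<bool>(algo_));  // an empty algorithm cannot be run
    }

    // The first call runs the algorithm; later calls return the cached result.
    const std::vector<RecHit>& hits() {
        if (!done_) {
            cache_ = algo_();
            done_ = true;
            ++runs_;
        }
        return cache_;
    }

    // Number of times the algorithm actually ran (0 until someone asks).
    int runs() const { return runs_; }

private:
    Algorithm algo_;
    std::vector<RecHit> cache_;
    bool done_ = false;
    int runs_ = 0;
};
```

In this sketch, a client that never calls hits() never pays for the reconstruction, and repeated calls reuse the first result.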
CMS Persistency history
Prototype 1997-98
  Test-beam DAQ and analysis using Objectivity/DB in different CMS test-beam areas (H2, T9 and X5b).
  The system was successfully tested.
Production 1999
  Test-beam DAQ (from April ‘99)
  Monte Carlo (GEANT3) reconstruction with ORCA (from October ‘99)
    Persistent digis for Calorimeter, Muon and Trigger
    Physics generator information (vertices, tracks) persistent
    (see D. Stickland's talk on ORCA, session A)
Persistent Service for High Energy Physics Data
[Diagram labels: Event Collection, Collection Meta-Data, Event, Electrons, Tracks, Tracker Alignment, Ecal calibration, User Tag (N-tuple)]
Environmental data
  Detector and accelerator status
  Calibrations, alignments
Event-Collection Meta-Data (luminosity, selection criteria, ...)
...
Event Data, User Data
Does a user need a DBMS?
  Do I encode meta-data (run number, version id) in file names?
  How many files and logbooks should I consult to determine the luminosity corresponding to a histogram?
  How easily can I determine whether two events have been reconstructed with the same version of a program and using the same calibrations?
  How many lines of code should I write, and which fraction of the data should I read, to select all events with two μ's with p_T > 11.5 GeV and |η| < 2.7?
  The same at generator level?
If the answers scare you, you need a DBMS!
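As a rough indication of what the event-selection question involves once the data are available as objects, here is a minimal C++ sketch. It takes the particles to be muons and the cuts to be on transverse momentum and pseudorapidity (assumptions for this sketch). The types Muon and Event and both function names are hypothetical and operate on in-memory objects only; they say nothing about how much data a real store would have to read.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <iterator>
#include <vector>

// Hypothetical transient types, for illustration only.
struct Muon { double pt; double eta; };   // pt in GeV
struct Event { std::vector<Muon> muons; };

// True if the event has at least `minCount` muons passing both cuts.
bool passesDimuonCut(const Event& ev, double minPt = 11.5,
                     double maxAbsEta = 2.7, int minCount = 2) {
    assert(minCount >= 0);
    const auto n = std::count_if(
        ev.muons.begin(), ev.muons.end(), [=](const Muon& m) {
            return m.pt > minPt && std::fabs(m.eta) < maxAbsEta;
        });
    return n >= minCount;
}

// Keep only the events that pass the default selection.
std::vector<Event> selectDimuonEvents(const std::vector<Event>& events) {
    std::vector<Event> selected;
    std::copy_if(events.begin(), events.end(), std::back_inserter(selected),
                 [](const Event& ev) { return passesDimuonCut(ev); });
    return selected;
}
```

The selection itself is a few lines; the questions on the slide are about everything around it, namely locating the right events and reading as little as possible to evaluate the cuts.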
Can CMS do without a DBMS?
An experiment lasting 20 years cannot rely just on ASCII files and file systems for its production bookkeeping, “condition” database, etc.
Even today at LEP, the management of all real and simulated data sets (from raw data to n-tuples) is a major enterprise.
A DBMS is the modern answer to such a problem and, given the choice of OO technology for the CMS software, an ODBMS (or a DBMS with an OO interface) is the natural solution.
A “BLOB” Model
[Diagram labels: DataBase Objects (Event, RecEvent, RawEvent) and their Blobs]
Blob: a sequence of bytes. Decoding it is a “user” responsibility.
Why should Blobs not be stored in the DBMS?
CMS Raw Event
[Diagram labels: Raw Event, Raw Data, ReadOutUnit, Vector of Digi]
The ReadOutUnit object can identify a complete detector or a detector component.
The vector of Digi in the test beam contains the ADC or TDC values.
RawData are identified by the corresponding ReadOutUnit.
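The raw-event layout described on this slide can be mirrored by a small transient C++ sketch. The class shapes below are assumptions made for illustration (a string name for the read-out unit, a 32-bit word per Digi, a map keyed by read-out unit); the slide only states that raw data are identified by their ReadOutUnit and hold a vector of Digi.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical transient mirror of the raw-event layout.
using Digi = std::uint32_t;          // one ADC or TDC word (assumed width)

struct ReadOutUnit {
    std::string name;                // a whole detector or one component
    bool operator<(const ReadOutUnit& o) const { return name < o.name; }
};

struct RawData { std::vector<Digi> digis; };

class RawEvent {
public:
    // Append a digi to the raw data owned by the given read-out unit.
    void add(const ReadOutUnit& rou, Digi d) {
        assert(!rou.name.empty());   // every raw-data block needs an identity
        data_[rou].digis.push_back(d);
    }

    // Look up raw data by read-out unit; nullptr if the unit sent nothing.
    const RawData* find(const ReadOutUnit& rou) const {
        auto it = data_.find(rou);
        return it == data_.end() ? nullptr : &it->second;
    }

    std::size_t units() const { return data_.size(); }

private:
    std::map<ReadOutUnit, RawData> data_;
};
```

Keying the raw data by read-out unit lets a client ask for one detector component without touching the rest of the event.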
Persistent Object Management
The persistent object management is a major responsibility in the CMS Analysis and Reconstruction Framework (CARF)
CARF manages
  multi-threaded transactions
  creation of databases and containers
  meta data and event collections
  physical clustering of event objects
  persistent event structure and its relations with the transient one
Use of the database is transparent to detector developers
  users access persistent objects through C++ pointers
  CARF takes care of memory pinning
CMS Event Structure
[Diagram labels: Run, EventCollection, Event, Event Header, RawEvent, RecEvent; legend: Persistent, Transient]
In case of re-reconstruction the original structure is kept: Event objects are cloned and new collections are created.
The Run object contains event-collection conditions such as beam energy, particle type, magnetic field, etc.
The Event Header object contains the event number, the spill number and the event number within the spill.
CMS Reconstructed Objects
[Diagram labels: RecEvent, S-Track Reconstructor, S Track, TrackSecInfo, TrackConstituents, Vector of RHits; tiers “aod”, “esd”, “rec”]
Reconstructed Objects produced by a given “algorithm” are managed by a Reconstructor.
A Reconstructed Object (Track) is split into several independent persistent objects to allow their clustering according to their access patterns (physics analysis, reconstruction, detailed detector studies, etc.).
The top-level object acts as a proxy. Intermediate reconstructed objects (RHits) are cached by value in the final objects.
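The splitting of a track into a top-level proxy plus separately stored parts can be sketched in plain C++. This is only an illustration of the idea: the member names, the fields chosen for each part, and the use of std::shared_ptr in place of persistent database references are all assumptions, not the CMS classes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// All names and fields are hypothetical; only the splitting idea is from the slide.
struct RHit { double x, y, z; };

struct TrackConstituents { std::vector<RHit> hits; };  // RHits held by value
struct TrackSecInfo { double chi2; int ndof; };        // secondary fit information

// Top-level object: carries a compact summary and acts as a proxy for the
// detailed parts, which can be stored and clustered separately.
class Track {
public:
    Track(double pt, double eta,
          std::shared_ptr<const TrackConstituents> constituents,
          std::shared_ptr<const TrackSecInfo> secInfo)
        : pt_(pt), eta_(eta),
          constituents_(std::move(constituents)),
          secInfo_(std::move(secInfo)) {
        assert(constituents_ && secInfo_);
    }

    // Analysis-level access touches only the summary.
    double pt() const { return pt_; }
    double eta() const { return eta_; }

    // Detailed access is forwarded to the separately stored parts.
    std::size_t nHits() const { return constituents_->hits.size(); }
    double chi2PerDof() const { return secInfo_->chi2 / secInfo_->ndof; }

private:
    double pt_;
    double eta_;
    std::shared_ptr<const TrackConstituents> constituents_;
    std::shared_ptr<const TrackSecInfo> secInfo_;
};
```

With this layout a physics analysis that only calls pt() and eta() never dereferences the detailed parts, which is the access-pattern argument made on the slide.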
Test Beam Production in 1999
Detector performance studies have been the real “users” of the Test Beams project.
From April 99 to October 99 the test beam software was in production for the Tracker and the Muon, reading data from VME - FastBus modules and filling one federated database for each beam line (H2b, X5b, T9) and for each data-taking period.
  Some system databases
    Beam configuration: Read-Out Unit list
    LogBook: logbook information for each run
    ListRuns: run list
  Run Databases: event collections with the same data-taking conditions
The DAQ system + Objectivity formatter ran on Solaris.
More than 800 GB of data were stored in Objectivity/DB.
Ran without major problems.
Test Beam Production in 1999
[Diagram labels: Online (Prod FD, Prod Boot); Offline - cmsc01 (Clone FD, Prod FD, Prod Boot); databases RunDB (Run1, Run2, Run3, ..., RunN), LogDB, BConfDB]
Test Beam Data Analysis
Online (prompt data) monitoring:
  on the online machine
  fast feedback on detector performance
Offline analysis:
  locally on the data server or remotely using the AMS server
  During the August Tracker (X5b) test beam, up to 25 concurrent users were accessing data on the offline system without any observable degradation.
During 1999: Hbook histograms and ntuples
During 2000: move from Hbook histograms and ntuples to HTL and Tags (see I. Gaponenko's talk on IGUANA, session F)
[Diagram labels: TB Analysis Package, Persistent Data, HBook n-tuples, HTL]
Tracker Silicon Detector Performance Studies
Muon beams, 50 GeV
Non-irradiated silicon detector
APV6 chip, deconvolution mode
FED VME modules
  active area 62.5 mm x 61.5 mm
  thickness 300 μm
  high resistivity
  strip pitch 61 μm
  strip width 14 μm
  implanted strips 1024
Scl = 31.8, Ncl = 2.9
Scl/Ncl = 10.9
Muon Drift Tube Detector Performance Studies
DTBX Format
  bits (0:15): Drift Time (1.04 ns) [0..65535]
  bit (16): Signal Edge [1 = falling]
  bits (17:22): Cell Number [1..63]
  bits (23:25): Layer Number [1..4]
  bits (26:27): SuperLayer Number [1..3]
[Plot labels: Beam Profile, Cell Nb, Drift Time (ns), Layer, Cell]
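The DTBX bit layout listed on this slide lends itself to a small decoding sketch in C++. Two points are assumptions rather than facts from the slide: bit 0 is taken to be the least significant bit of a 32-bit word, and "(1.04 ns)" is read as the time per drift count. The names DtbxWord, packDtbx and kDriftTickNs are hypothetical.

```cpp
#include <cstdint>

// Assumed meaning of "(1.04 ns)": nanoseconds per drift-time count.
constexpr double kDriftTickNs = 1.04;

// Decoder for one DTBX word; bit 0 is assumed to be the least significant bit.
struct DtbxWord {
    std::uint32_t raw;

    constexpr std::uint32_t driftCounts() const { return raw & 0xFFFFu; }    // bits 0:15
    constexpr bool fallingEdge() const { return ((raw >> 16) & 0x1u) != 0; } // bit 16
    constexpr unsigned cell() const { return (raw >> 17) & 0x3Fu; }          // bits 17:22
    constexpr unsigned layer() const { return (raw >> 23) & 0x7u; }          // bits 23:25
    constexpr unsigned superLayer() const { return (raw >> 26) & 0x3u; }     // bits 26:27
    constexpr double driftTimeNs() const { return driftCounts() * kDriftTickNs; }
};

// Inverse operation, useful for building test words.
constexpr std::uint32_t packDtbx(std::uint32_t counts, bool falling,
                                 unsigned cell, unsigned layer,
                                 unsigned superLayer) {
    return (counts & 0xFFFFu)
         | (static_cast<std::uint32_t>(falling) << 16)
         | ((cell & 0x3Fu) << 17)
         | ((layer & 0x7u) << 23)
         | ((superLayer & 0x3u) << 26);
}
```

A word packed with packDtbx(1000, true, 12, 3, 2) decodes back to cell 12, layer 3, superlayer 2, with a drift time of about 1040 ns under the assumed tick.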
Muon Trigger (BTI) Test Beam Analysis
The Muon Test Beam analysis is fully integrated with the Muon and first-level trigger reconstruction.
For the Bunch and Track Identifier (BTI), a comparison between real data and simulation is performed (see C. Grandi's talk on CMS Muon Trigger, session B).
High Level Trigger Production with ORCA in 1999
[Diagram labels: MC Prod. (HEPEVT ntuples, Signal, MB, CMSIM, Zebra files with HITS); DB pop. (ORCA Digitization, Objectivity Database); User Analysis (ORCA ntuple production, ORCA user Analysis, Analysis ntuples, PAW)]
ORCA High Level Trigger 2000 production
The first ORCA production in October 99 was very successful (>700 GB in Objy/DB), but ORCA production 2000 must have much more functionality:
  All data will be in the database
  Every CMSIM run will have its objects in many database files
  A single Db file contains the concatenation of many CMSIM runs (Objectivity limit of 64 k files)
  Many layers of apparently autonomous federations, actually synchronized by enforcing a common schema and unique DbIDs
High Level Trigger Processing 2000
[Diagram labels: samples Minimum Bias, JetMet, Muon, User (FZ); stages G3 Hits and Tracks, ORCA Xings & Digis, ORCA RecObjs, ...]
Each box is an independent production running in “parallel”.
Selective Tracker Digitization
[Diagram labels: Trigger, Calorimetry, Muon and Tracker boxes arranged around a “Select” step]
ORCA 2000 Db Structure
One CMSIM job, oo-formatted into multiple Dbs. For example:
[Diagram labels: 1 CMSIM Job, FZ File, ooHit Dbs (MC Info, Calo/Muon Hits, Tracker Hits); sizes ~2 GB/file, ~300 kB/ev, ~200 kB/ev, ~100 kB/ev, few kB/ev]
Multiple sets of ooHits concatenated into a single Db file. For example:
[Diagram labels: Container #1; MC Info Run1, MC Info Run2, MC Info Run3, ...; concatenated MC Info from N runs]
Physical and logical Db structures diverge...
Conclusions
Persistent object management is a major responsibility of the CMS Analysis and Reconstruction Framework.
A DBMS is required to manage the large data set of CMS (including user data).
An ODBMS is the natural choice if OO is used in all software.
Once an ODBMS is used to manage the experiment data, it is very natural to use it to manage any kind of data related to detector studies and physics analysis.
Objectivity/DB has been evaluated in different prototypes which successfully stored and retrieved data (test-beam, simulated, reconstructed, statistical, i.e. histograms).
Since 1999 we have been in production using Objectivity/DB for both Test Beam and High Level Trigger studies.