Zeus Monday Meeting, DESY, 04.11.2002
The ZEUS Event Store (ZES)
How to make the most efficient data selection in ZEUS
Ulrich Fricke, DESY, Hamburg
Outline
Introduction
Technical information
Performance
How to use ZES
ZES for MC
Remarks
Summary
Introduction: ZEUS data taking
Detector waits for HERA luminosity
Events are selected according to FLT, SLT & TLT
Raw data file is written out
Raw file is processed by ZEPHYR (reconstruction)
MDST data file is written out
User code accesses information in MDST file
Events are selected depending on the analysis
Efficient data selection
Requirements for data selection
– efficiency (as much data as fast as possible)
– transparency (need to know which data is used)
– reliability (should work 24 hours a day, 7 days a week)
– reproducibility (same selection today and later)
ZEUS data selection (1)
Usually, we run EAZE jobs over the MDST data files and produce HBOOK ntuples or ROOT trees
Simple selection: process all events
– loop over all events in MDST files
– select events in user code (ZUANAL)
Clever selection: pre-select the data
– define pre-selection criteria (at ZEUS: 128 DST bits)
– calculate values of DST bits (metadata) during data processing
– information is stored in text files (ZEUS Event Directories, ZEDs)
– metadata is read first and only selected events are processed
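The pre-selection idea can be sketched in a few lines of Python. This is an illustration only: the in-memory `directory` list, the `DirectoryEntry` record, the `dst_mask` helper and the bit layout (DST bit n stored at position n-1 of one integer) are assumptions made for the example and do not describe the real ZED file format.

```python
from typing import Iterable, Iterator, NamedTuple


class DirectoryEntry(NamedTuple):
    """Hypothetical metadata record: one per event, much smaller than the event."""
    run: int
    event: int
    dst_bits: int  # 128 DST bits packed into one integer (assumed layout)


def dst_mask(*bits: int) -> int:
    """Build a bit mask from 1-based DST bit numbers (assumed numbering)."""
    mask = 0
    for bit in bits:
        if not 1 <= bit <= 128:
            raise ValueError(f"DST bit out of range: {bit}")
        mask |= 1 << (bit - 1)
    return mask


def preselect(directory: Iterable[DirectoryEntry],
              required: int) -> Iterator[DirectoryEntry]:
    """Yield only the entries in which all required DST bits are set."""
    for entry in directory:
        if entry.dst_bits & required == required:
            yield entry


# Metadata is read first; only the matching events would then be processed.
directory = [
    DirectoryEntry(27305, 1, dst_mask(11)),
    DirectoryEntry(27305, 2, dst_mask(11, 25)),
    DirectoryEntry(27305, 3, dst_mask(25)),
]
selected = [entry.event for entry in preselect(directory, dst_mask(11, 25))]
print(selected)
```

Only the small metadata records are scanned; the full events are touched only for entries that pass the mask.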
ZEUS data selection (2)
More efficient pre-selection: use a database
– use database to store metadata information
– make use of database features:
  data compression
  performance tuning
  indices
  automatic updating
  data available to other applications
This is what we do with ZES
ZEUS Event Store (ZES)
Database fully integrated into ZEUS offline system
– Based on Objectivity/DB version 7.0.
– Object-oriented database.
– Written in C++ (interfaced to ZEUS FORTRAN software).
– Input generated during data processing for each event.
– Contains a lot of information, such as
  All 128 DST bits
  All FLT, SLT & TLT trigger information
  Detector information (CTD, CAL, BPC, FPC, BAC, …)
  Electron finders (Sinistra, EM)
  Muon finders
  ...
ZEUS Event Store (ZES)
ZES is (like ZEDs) intended primarily for data pre-selection
– user defines selection criteria
– ZES database is searched for events that match the criteria
– only selected events are analyzed by EAZE
ZES is NOT in competition with ORANGE! Select events with ZES and analyze them with ORANGE.
Selection is done with an Objectivity/DB predicate string
– you can select on the values of all variables and bits
  (((Eeu_si>5)and(Zvtx>-50)and(Zvtx<50)and(Eminpz>35)and(Eminpz<65))and((DST13)and(BP112)))
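Long selection strings are easy to get wrong by hand, so it can help to assemble them from small pieces. The following Python sketch builds a string of the same form as the example above; the helper names `term`, `all_of` and `any_of` are invented for this illustration and are not part of ZES or Objectivity/DB.

```python
def term(expr: str) -> str:
    """Wrap one condition or bit name in parentheses; blanks are rejected."""
    if " " in expr:
        raise ValueError(f"blank in selection term: {expr!r}")
    return f"({expr})"


def all_of(*parts: str) -> str:
    """Combine already parenthesised parts with 'and'."""
    return "(" + "and".join(parts) + ")"


def any_of(*parts: str) -> str:
    """Combine already parenthesised parts with 'or'."""
    return "(" + "or".join(parts) + ")"


# Kinematic cuts and bit requirements are built separately, then combined.
cuts = all_of(term("Eeu_si>5"), term("Zvtx>-50"), term("Zvtx<50"),
              term("Eminpz>35"), term("Eminpz<65"))
bits = all_of(term("DST13"), term("BP112"))
predicate = all_of(cuts, bits)
print(predicate)
```

Because every helper adds exactly one matching pair of parentheses, the resulting string is balanced by construction.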
ZES database schema
ZES is based on Objectivity/DB version 7.0
Available for Linux, Irix & Solaris
Database has hierarchical structure
– 1 Federated Database (the whole ZES)
– Several Databases (each corresponds to a 200-450 MB file)
– A lot of Containers (blocks in databases)
– Even more Objects (information chunks in containers)
Currently 1228 databases (325 GB)
Including different sets of information for some runs (96, 96GR)
On RAID 5 disks at the SGI Origin 2000 fileserver (doener)
ZES database design(0)
ZES has no central server which does the selection
The only central process (lockserver) manages locks during database updates etc.
For a selection on events in a certain run, all event information is transmitted over the network to the client
Significant performance increase if only the information needed for the selection is transmitted
This influences the database design
Current Design (1)
[Diagram: the ZES Federated Database contains Database 1 and Database 2, which hold Runs 1 to 5.]
Current Design (2)
[Diagram: Run 1 (Events) contains MDST file objects (MDST file 1, ..., MDST file Z, ..., MDST file N) and Event objects (Events 1, 2, 100, 111, 120 and 999 shown).]
Design of MDST and Event object
MDST Object
– Number of dataflows in MDST file
– Name and offset of dataflows
– Same information as in ZEDs
Event Object
– RunNr, EventNr
– DST bits and trigger information
– 200-300 physics variables (integers and floats)
– Reference to MDST file
– Event offset in MDST file
Recent addition: MicroEvents
Most users select on DST bits and trigger information
Create a new entry in the database for each event, which contains only DST and trigger information:
– MicroEvent:
  RunNr, EventNr
  DST bits
  Trigger information
  Reference to MDST file
  Offset in MDST file
  Reference to full Event object
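The relation between the two object types can be pictured with a small Python sketch. It is only a model of the field lists given on the slides: the class layout, the field names, the `make_micro` and `locate` helpers and the sample values are assumptions for illustration, not the actual C++ schema.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class Event:
    """Full event object: bits plus a few hundred physics variables."""
    run_nr: int
    event_nr: int
    dst_bits: int
    trigger_bits: int
    physics: Dict[str, float]
    mdst_file: str      # stands in for the reference to the MDST file object
    mdst_offset: int


@dataclass
class MicroEvent:
    """Slim object: only what a DST or trigger bit selection needs."""
    run_nr: int
    event_nr: int
    dst_bits: int
    trigger_bits: int
    mdst_file: str
    mdst_offset: int
    full_event: Event   # reference back to the full Event object


def make_micro(event: Event) -> MicroEvent:
    """Copy the bit and location fields and keep a reference to the event."""
    return MicroEvent(event.run_nr, event.event_nr, event.dst_bits,
                      event.trigger_bits, event.mdst_file,
                      event.mdst_offset, event)


def locate(micro_events: Iterable[MicroEvent],
           required_bits: int) -> List[Tuple[str, int]]:
    """Return (MDST file, offset) for every MicroEvent passing the bit mask."""
    return [(m.mdst_file, m.mdst_offset) for m in micro_events
            if m.dst_bits & required_bits == required_bits]


events = [
    Event(27305, 1, 0b01, 0, {"Eeu_si": 12.5}, "mdst_file_1", 0),
    Event(27305, 2, 0b11, 0, {"Eeu_si": 3.0}, "mdst_file_1", 4096),
]
micro_events = [make_micro(e) for e in events]
locations = locate(micro_events, required_bits=0b10)
print(locations)
```

A bit selection needs only the slim objects and still yields the file and offset required to read the event; the physics variables are reached through `full_event` only when they are actually wanted.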
Current Design (3)
[Diagram: Run 1 (Events) contains MDST file objects (MDST file 1, ..., MDST file Z, ..., MDST file N) and Event objects (Events 1, 2, 100, 111, 120 and 999 shown). Run 1 (MicroEvents) contains the corresponding MicroEvent objects (MicroEvents 1, 2, 100, 111, 120 and 999).]
How do we update the database?
The ZES information is calculated by code that is called from, and contained in, the ZESPHYS library
During the processing of raw data files the code is called AFTER all reconstruction and calibration has been done.
For each raw data file one HBOOK ntuple with the ZES information is produced
The ntuples are used to load the information into a separate federated database which is not accessible to users
Once all checks have passed, the database is moved to the main ZES federated database and becomes available to all users
Performance(1)
With ZES one can select on DST bits, trigger information and physics variables
With ZEDs one can only select on DST bits
Relative comparison:
– ZED selection on DST bits: 100%
– ZES selection on DST bits: < 95% (Events)
– ZES selection on DST bits: < 12% (MicroEvents)
One can make a tighter selection than with ZEDs
Performance (2)
ZES selection time does NOT increase with the number of variables used in the selection
One can make much tighter selections with ZES
Fewer selected events mean less work for ORANGE/EAZE
You save even more CPU time!
How to use ZES(1)
With a command line tool
– select events with /zeus/bin/zesprint
– same selection as in EAZE
– easy to check selection efficiency
– can produce an ntuple with ZES information of all selected events
– zesprint -a 27305 -z 27305 -n 10 -v (Eeu_si>10) -b (DST11) -l EventNr,Eeu_si
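When such a command is typed in a Unix shell or generated by a script, the selection strings usually need quoting, because most shells interpret unquoted parentheses and `>` themselves. The Python sketch below only assembles and prints the command line; it does not run zesprint. The meaning attached to each flag (`-a`/`-z` as first and last run, `-n` as maximum number of events, `-v` as variable selection, `-b` as bit selection, `-l` as list of variables to print) is inferred from the example above and is an assumption, as is the helper name `zesprint_command`.

```python
import shlex
from typing import List, Sequence


def zesprint_command(first_run: int, last_run: int, max_events: int,
                     variables: str, bits: str,
                     columns: Sequence[str]) -> List[str]:
    """Assemble an argument list mirroring the example on the slide."""
    return [
        "/zeus/bin/zesprint",
        "-a", str(first_run),      # assumed: first run
        "-z", str(last_run),       # assumed: last run
        "-n", str(max_events),     # assumed: maximum number of events
        "-v", variables,           # assumed: selection on variables
        "-b", bits,                # assumed: selection on DST/trigger bits
        "-l", ",".join(columns),   # assumed: variables to list
    ]


argv = zesprint_command(27305, 27305, 10, "(Eeu_si>10)", "(DST11)",
                        ["EventNr", "Eeu_si"])
# shlex.join adds quotes only where the shell would otherwise misread a word.
command_line = shlex.join(argv)
print(command_line)
```

Passing `argv` as a list to a process launcher avoids the quoting question entirely; the quoted string is only needed when the command is pasted into a shell.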
How to use ZES (2)
In non-EAZE executables
– FORTRAN: call zesvar2(RunNr, EventNr)
– This will fill the ZES common block with the ZES information of the given run and event
– one needs to include zescommon.inc in the code
In an EAZE job
– the most important way to use ZES
– all ZES access is controlled by cards in the normal cards file (fort.7)
ZES in an EAZE job
[Diagram: the batch job requests data through the interface; the interface reads the setup file, sends a predicate query to the ZES database and opens the MDST file (eventin), from which the event information is read.]
How to use ZES (3)
Driver cards to turn on ZES for the EAZE job
– ZeusIO-INFI ZeusEventStore
– ZeusIO-IOPT DRIVER=OBJY
Run selection (include or exclude run ranges, default=all)
– ZeusIO-FirstRun 27305
– ZeusIO-LastRun 27311
– ZeusIO-IncludeRun 27305-27311,27350-27399
– ZeusIO-ExcludeRun 27306,27310-27311
Select a special list of runs (e.g. 96GR instead of 96)
– ZeusIO-Runlist /zeus/ZES/run-list.96GR.sorted
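To check in advance which runs a set of include and exclude cards would pick up, the range syntax shown above can be evaluated with a short Python sketch. The combination rule used here (a run is taken if it lies in an include range, or no include range is given, and it lies in no exclude range) is an assumption; the slides do not state how ZES resolves overlaps. The function names are invented for the example.

```python
from typing import List, Tuple

Range = Tuple[int, int]


def parse_runs(spec: str) -> List[Range]:
    """Parse a list such as '27305-27311,27350-27399' into (first, last) pairs."""
    ranges = []
    for item in spec.split(","):
        first, _, last = item.partition("-")
        low = int(first)
        high = int(last) if last else low   # a single run is a one-run range
        if high < low:
            raise ValueError(f"reversed run range: {item}")
        ranges.append((low, high))
    return ranges


def in_ranges(run: int, ranges: List[Range]) -> bool:
    """True if the run falls inside any of the ranges."""
    return any(low <= run <= high for low, high in ranges)


def run_selected(run: int, include: List[Range], exclude: List[Range]) -> bool:
    """Assumed rule: included (or no include list) and not excluded."""
    included = not include or in_ranges(run, include)
    return included and not in_ranges(run, exclude)


include = parse_runs("27305-27311,27350-27399")
exclude = parse_runs("27306,27310-27311")
selected = [run for run in range(27300, 27315)
            if run_selected(run, include, exclude)]
print(selected)
```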
How to use ZES (4)
DST and Trigger Selection (default none)
– ZeusIO-Bit ((DST11)or(DST25))and(T070)
Selection on variables (default none)
– ZeusIO-Variable ((Eeu_si>10.0)and(Ntrks>4))
Selection on EventNr (default (EventNr>0))
– ZeusIO-Event ((EventNr>10)or(EventNr<100))
Selection of Object to search (default Event)
– ZeusIO-EventType MicroEvent
Selection on Predefined Event Sample (default none)
– ZeusIO-Sample SampleName
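For analyses that generate their cards files from a script, the cards above can be written out programmatically. The Python sketch below uses only card names that appear on these slides; the function name `zes_cards` and the layout (one card per line, name and value separated by a single blank) are assumptions based on the examples shown.

```python
from typing import List, Optional


def zes_cards(first_run: int, last_run: int,
              bits: Optional[str] = None,
              variables: Optional[str] = None,
              event_type: Optional[str] = None) -> List[str]:
    """Return the ZES-related lines for a cards file as a list of strings."""
    cards = [
        "ZeusIO-INFI ZeusEventStore",   # driver cards that turn ZES on
        "ZeusIO-IOPT DRIVER=OBJY",
        f"ZeusIO-FirstRun {first_run}",
        f"ZeusIO-LastRun {last_run}",
    ]
    optional = [
        ("ZeusIO-Bit", bits),
        ("ZeusIO-Variable", variables),
        ("ZeusIO-EventType", event_type),
    ]
    for name, value in optional:
        if value is None:
            continue                    # card omitted: the default applies
        if " " in value:
            raise ValueError(f"blank in value of {name}: {value!r}")
        cards.append(f"{name} {value}")
    return cards


cards = zes_cards(27305, 27311,
                  bits="((DST11)or(DST25))and(T070)",
                  variables="((Eeu_si>10.0)and(Ntrks>4))",
                  event_type="MicroEvent")
print("\n".join(cards))
```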
How to use ZES (5)
Variable names are given on the ZES WWW page.
Blanks are not allowed in the definition of the selection criteria
– OK: ZeusIO-Event (EventNr>10)and(EventNr<100)
– Wrong: ZeusIO-Event (EventNr>10) and (EventNr<100)
You can use several lines in the cards file for a selection
– ZeusIO-Event ((EventNr>10)or
– ZeusIO-Event (EventNr<100))
Allowed arithmetic operators: +, -, /, %
Allowed relational operators: <, >, <=, >=, =, !=
Allowed logical operators: and, or, not
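The two formal rules on this slide (no blanks, selections may span several card lines) can be checked before a job is submitted. The Python sketch below assumes that the values of repeated cards with the same name are simply concatenated in file order, which is what the two-line example suggests but is not stated explicitly. The parenthesis check is an extra sanity test added here, and the function names are invented for the illustration.

```python
from typing import Iterable


def check_selection(selection: str) -> None:
    """Raise ValueError if the selection contains blanks or unbalanced parentheses."""
    if any(ch.isspace() for ch in selection):
        raise ValueError("blanks are not allowed in a selection")
    depth = 0
    for ch in selection:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("closing parenthesis without opening one")
    if depth != 0:
        raise ValueError("missing closing parenthesis")


def join_cards(lines: Iterable[str], name: str) -> str:
    """Concatenate the values of all cards called `name` and validate the result."""
    parts = []
    for line in lines:
        card, _, value = line.strip().partition(" ")
        if card == name:
            parts.append(value)
    selection = "".join(parts)   # assumed: continuation lines are concatenated
    check_selection(selection)
    return selection


cardsfile = [
    "ZeusIO-Event ((EventNr>10)or",
    "ZeusIO-Event (EventNr<100))",
]
selection = join_cards(cardsfile, "ZeusIO-Event")
print(selection)
```

A line such as `ZeusIO-Event (EventNr>10) and (EventNr<100)` is rejected by `join_cards` with a `ValueError`, matching the "Wrong" example above.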
ZES for MC events (1)
ZES was designed primarily for data
Users requested it also for MC events in order to
– do efficiency studies of the selection
– use the same cards for data and MC
– speed up the event selection in MC
It is not fully implemented yet, but the first 2 requirements are working
ZES information is calculated for each event before ORANGE and user code are executed
Takes about 0.3 to 3 sec/event
Only selected events are analyzed by ORANGE/user code
ZES for MC events (2)
Turn on the calculation of ZES information with the new card ZESPHY-RUNIT
If you need to analyze all events, including those not selected, add the card ZESPHY-USEALL
Include zescommon.inc in your code to access the calculated ZES information
The integer variable ZES_eventselection is set to 1 if the event is accepted, otherwise to 0
To save CPU time, turn off the calculation of blocks of ZES information with ZESBLK-nameofblock FALSE (default is TRUE)
ZES for MC events (3)
The ZES MC code is in development
A lot of changes are now in the new development release (compile with gmake all ZEUSRELEASE=new)
Check the ZES WWW page for updates and specific versions of needed libraries
Do NOT use the ZES driver cards for MC:
– Wrong: ZeusIO-INFI ZeusEventStore
– Wrong: ZeusIO-IOPT DRIVER=OBJY
Upcoming changes
Now: 2002 data are in ZES, up to run 42801 (DAQ runs excluded)
Soon: More efficient selection or rejection of single events and event ranges, mainly to quickly select 2002 events
Later: More functionality for MC events
Later: Design changes to further increase performance
Later: Extend ZES to RAW and ENV events
Getting Information And Help
The ZES WWW page contains a lot of information about
– getting started (including an EAZE example with ZES cards)
– selection strings
– description of the ZES variables
– a newslog with recent announcements
http://www-zeus.desy.de/ZEUS_ONLY/analysis/zes/
Pay attention to the information displayed by the jobclients (jobq, jobinfo etc.)
Send suggestions and questions to [email protected]. They are forwarded to the responsible person.
Or phone, pass by ...
Some last remarks
ZES is here to help YOU to make a more efficient data selection
If you find some information missing in ZES, discuss it with others (supervisor, physics coordinator etc.) and us!
You need to tell us what is needed AND help develop and/or test the code.
Especially important for new detectors like MVD, STT, Lumi spectrometer, polarimeter etc.
And for new analysis methods (Muon finders, dead material correction …)
Summary
ZES is our most efficient selection mechanism for data.
It is much faster than the traditional methods.
With limitations it can also be used for MC.
More improvements are on the way.
Remember:
– ZES is for you!
– Tell us what you need!
– Help us to make it real!