LHCb DAQ system LHCb SFC review Nov. 26 th 2004 Niko Neufeld, CERN.
LHCb DAQ system
LHCb SFC review, Nov. 26th 2004
Niko Neufeld, CERN
Niko Neufeld, CERN, PH
Acronyms
• MEP (Multi Event Packet): packet containing n event fragments from n consecutive triggers. n is called the:
• PF (Packing Factor): there are two independent PFs for L1 and HLT
• SFC (Subfarm Controller): the event-builder
• L1 (Level-1 Trigger): data stream at 1 MHz containing data from VeLo, Trigger Tracker and summary info from L0 trigger
• HLT (High Level Trigger): data stream at 40 kHz containing all data
• TFC (Timing and Fast Control): the LHCb variant of the LHC wide TTC system. Central unit is the
• RS (Readout Supervisor): centrally monitors buffers and, where their occupancy is deterministic, protects them by disabling the trigger. Throttle signals from modules in trouble are used where buffer occupancy cannot be predicted.
LHCb DAQ & Trigger
• LHCb has a three-level trigger system
– Level-0
  • 10 MHz “visible” interactions @ 2×10^32 cm^-2 s^-1
  • Multiplicity/Pile-Up: 7 MHz
  • ET(µ1, µ2, h, e, γ, π0): 1 MHz
– Level-1
  • Silicon Vertex Detector (VeLo): impact parameter
  • VELO+Trigger Tracker: momentum
  • VELO+L0-µ: Mµµ
  • accept: 40 kHz
– HLT
  • confirmation L1 (VELO+TT)+T: 20 kHz
  • VELO+TT+T: dp/p < 1%
  • 2-5 kHz to tape, out of which fully reconstructed for prompt analysis: 200 Hz
• LHCb DAQ is the hardware for the LHCb software triggers
LHCb Experiment
Data used for Level-1 Trigger
DAQ Features
• (Almost) all detectors use the same module (TELL1) to send data on up to 4 Gigabit Ethernet links
• Synchronous information (triggers) is distributed by the TFC/TTC system
• All data traffic uses Gigabit Ethernet
• Data is pushed (connectionless protocol for data transfer, like UDP)
• Two levels of software trigger and two data streams (L1 and HLT) on the same network
• L1 uses only part of the detector (VeLo, TT, L0 summary), HLT reads out all of the detector
• For both L1 and HLT, fragments from several consecutive triggers are packed together and sent as a Multi Event Packet (MEP)
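The packing of fragments into MEPs can be sketched as follows. This is only an illustration: the `Fragment` and `MultiEventPacket` types, the function names and the packing factor used in the usage note are hypothetical and do not describe the actual TELL1 data format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fragment:
    trigger_id: int  # number of the trigger this fragment belongs to
    payload: bytes   # detector data for that trigger (opaque here)


@dataclass
class MultiEventPacket:
    first_trigger_id: int      # trigger number of the first fragment
    fragments: List[Fragment]  # up to `packing_factor` consecutive fragments


def pack_meps(fragments: List[Fragment], packing_factor: int) -> List[MultiEventPacket]:
    """Group fragments of consecutive triggers into MEPs.

    The last MEP may hold fewer fragments if the input does not divide evenly.
    """
    if packing_factor < 1:
        raise ValueError("packing factor must be at least 1")
    meps = []
    for start in range(0, len(fragments), packing_factor):
        chunk = fragments[start:start + packing_factor]
        meps.append(MultiEventPacket(chunk[0].trigger_id, chunk))
    return meps


def packet_rate_hz(trigger_rate_hz: float, packing_factor: int) -> float:
    """Packet rate seen by the network for a given trigger rate and packing factor."""
    return trigger_rate_hz / packing_factor
```

With an assumed packing factor of 25, `packet_rate_hz(1_000_000, 25)` gives 40 kHz of packets for the 1 MHz L1 stream, which shows why packing reduces the per-packet overhead on the network and on the receiving SFC.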
DAQ features (2)
• L1 has a latency restriction (data are buffered for full HLT readout). The L1 decision from the farm node must reach the TFC system no later than 58 ms after the initial trigger
• Static load-balancing among sub-farms: destination assignment via TTC
• Dynamic load-balancing among nodes in sub-farms by the SFC: fewer sub-farms are better
• Central flow-control via TFC system: throttle
– Dedicated signals from Front-end boards (fast)
– Via the control system from the SFCs (slow)
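Dynamic load-balancing inside a sub-farm could look roughly like the sketch below. The slides do not say which policy the SFC uses, so the least-loaded policy and the class name `LeastLoadedBalancer` are assumptions made purely for illustration.

```python
class LeastLoadedBalancer:
    """Hypothetical SFC policy: send each event to the node with the fewest pending events."""

    def __init__(self, n_nodes: int):
        if n_nodes < 1:
            raise ValueError("a sub-farm needs at least one node")
        self.pending = [0] * n_nodes  # events handed out but not yet decided, per node

    def assign(self) -> int:
        """Pick the least-loaded node (lowest index wins ties) and record the new event."""
        node = min(range(len(self.pending)), key=self.pending.__getitem__)
        self.pending[node] += 1
        return node

    def complete(self, node: int) -> None:
        """Record that `node` has returned a decision for one event."""
        if self.pending[node] == 0:
            raise ValueError("node has no pending events")
        self.pending[node] -= 1
```

A policy of this kind only evens out load within one sub-farm; the assignment of events to sub-farms stays static, as stated above.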
DAQ Architecture
[Figure: DAQ architecture diagram. Labels: Front-end Electronics (FE), Multiplexing and Aggregation Layer (switches), Gb Ethernet, Readout Network, Level-1 Traffic (1000 kHz, 5.5 GB/s), HLT Traffic (40 kHz, 1.6 GB/s), Mixed Traffic (7.1 GB/s), TRM, Sorter, TFC System, L1-Decision, Storage System (~200 MB/s total, to TIER0), 94 SFCs each with a switch and CPUs, CPU Farm (~1800 CPUs).]
DAQ Architecture upgrade
[Figure: same diagram after the L1 upgrade. Labels: Level-1 Traffic (1000 kHz, 9.5 GB/s), total 11.1 GB/s, ~150 SFCs, ~???? CPUs.]
DAQ in numbers
• 276 detector boards (+ 1 Readout Supervisor + 1 L0 Decision Unit)
• Currently read out for Level-1: 135
• Estimated data rate for HLT and L1 combined: 7.1 GB/s
• Estimated data rate for HLT and L1 combined after upgrade of L1: ~11 GB/s
• Estimated required peak bandwidth to storage: ~200 MB/s
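As a rough cross-check of these numbers, the average load per SFC input link can be estimated as below. The sketch assumes that traffic is spread evenly over the SFCs, that each SFC receives data on one Gigabit Ethernet link (125 MB/s raw line rate, ignoring protocol overhead), and that 1 GB = 1000 MB. The SFC counts (94 and ~150) are taken from the architecture diagrams.

```python
GBE_LINE_RATE_MB_S = 125.0  # 1 Gbit/s expressed in MB/s, no protocol overhead


def per_link_mb_s(total_gb_s: float, n_links: int) -> float:
    """Average data rate per link if the total is shared evenly."""
    return total_gb_s * 1000.0 / n_links


def link_utilisation(total_gb_s: float, n_links: int) -> float:
    """Fraction of the raw Gigabit Ethernet line rate used on each link."""
    return per_link_mb_s(total_gb_s, n_links) / GBE_LINE_RATE_MB_S


if __name__ == "__main__":
    for label, total, links in [("baseline", 7.1, 94), ("L1 upgrade", 11.0, 150)]:
        print(f"{label}: {per_link_mb_s(total, links):.1f} MB/s per link, "
              f"{link_utilisation(total, links):.0%} of line rate")
```

Under these assumptions both configurations come out at roughly 75 MB/s per link, about 60 % of the raw line rate, which leaves headroom for fluctuations but not a factor of two.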
Dataflow in LHCb
[Figure: dataflow steps overlaid on the DAQ architecture diagram (Front-end Electronics, Gb Ethernet, Readout Network with Level-1, Mixed and HLT Traffic, 94 links at 7.1 GB/s, TRM, Sorter, TFC System, L1-Decision, Storage System, 94 SFCs each with a switch and CPUs, CPU Farm of ~1800 CPUs). Step labels: L0 Yes, L1 Trigger, L1 D, L1 Yes, HLT Yes, BΦKs. Timing labels: 58 ms, 56 ms, 3 ms, 0.5 ms.]
Performance Requirements on SFC
• Handle L1 and HLT MEP stream, in and out
– 2 x ~O(50 kHz / N_SFC) + 2 x ~O(10 kHz / N_SFC)
• Forward pre-sorted decisions to L1-Sorter: ~O(50 kHz / N_SFC)
• Forward accepted events to Storage: ~O(5 kHz / N_SFC)
• Control and monitoring traffic: ~O(1 Hz)
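To make the per-SFC scaling concrete, the order-of-magnitude rates above can be evaluated for a given number of SFCs. The function name and dictionary keys are illustrative only, and the results are estimates of the same rough quality as the O(...) figures they are derived from.

```python
def per_sfc_rates_hz(n_sfc: int) -> dict:
    """Order-of-magnitude message rates per SFC, from the requirements listed above."""
    if n_sfc < 1:
        raise ValueError("need at least one SFC")
    l1_mep = 50_000 / n_sfc   # L1 MEP rate, same in and out
    hlt_mep = 10_000 / n_sfc  # HLT MEP rate, same in and out
    return {
        "l1_mep_in": l1_mep,
        "l1_mep_out": l1_mep,
        "hlt_mep_in": hlt_mep,
        "hlt_mep_out": hlt_mep,
        "l1_decisions_to_sorter": 50_000 / n_sfc,
        "accepted_to_storage": 5_000 / n_sfc,
        "control_monitoring": 1.0,  # does not scale with the number of SFCs
    }


if __name__ == "__main__":
    for name, rate in per_sfc_rates_hz(94).items():
        print(f"{name}: {rate:.0f} Hz")
```

For the 94 SFCs shown in the architecture diagram this gives a few hundred hertz per stream and direction, so the SFC is dominated by packet handling rather than by control traffic.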
Technical Requirements
• Rack-mount 1 U, not deeper than 70 cm if at all possible
• Minimum 2 x 1000 MBit and 2 x 100 MBit copper network interfaces (1 data receive, 1 data send, 1 control, 1 storage)
• Network bootable, diskless operation possible
• Remotely manageable (IPMI)
Nice to have (and why)
• 4 x 1000 BaseT interfaces (+ min. 2 x 100 MBit for control and storage): allows running at more than 1 Gigabit (resources permitting)
• Have a price such that N(required SFCs) * Price(1 SFC) / 200000 < 1: no market survey / tender needed
• Other goodies if not too expensive: redundant, hot-pluggable power supplies, etc.
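The price condition above is simple arithmetic and can be written out as a small helper. The function names are illustrative, and the currency of the 200000 ceiling is not stated on the slide, so the values are treated as plain numbers.

```python
def below_tender_threshold(n_required_sfcs: int, unit_price: float,
                           ceiling: float = 200_000) -> bool:
    """True if N(required SFCs) * Price(1 SFC) / ceiling < 1."""
    return n_required_sfcs * unit_price / ceiling < 1


def max_unit_price(n_required_sfcs: int, ceiling: float = 200_000) -> float:
    """Unit price at which the total cost reaches the ceiling (the condition needs a price strictly below it)."""
    if n_required_sfcs < 1:
        raise ValueError("need at least one SFC")
    return ceiling / n_required_sfcs
```

For example, `max_unit_price(94)` is about 2128 per SFC, and a larger number of SFCs lowers the admissible unit price proportionally.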
Latency due to queuing
• 0.1 % of events have a timeout larger than the 30 ms cut-off
• Ptolemy simulation:
– Processing time distribution from number of clusters
– Assuming 9 processors and a shared L1 trigger rate of 9 kHz per sub-farm
– 10^6 L0 accepted events, one of 120 sub-farms
Beating the statistics of small numbers
• Sub-farm now with 18 nodes, sharing ~18 kHz of L1 trigger rate: one of 60 sub-farms. Total number of CPUs in the system is constant
• Only 0.05 % of events have a timeout larger than 30 ms: minimise the number of sub-farms
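The effect of pooling nodes can be illustrated with a toy queueing simulation. This is not the Ptolemy model from the slides: it assumes Poisson arrivals, exponentially distributed processing times and a single FIFO queue per sub-farm, and the mean processing time and cut-off in the usage note are made-up values chosen so that the tail is visible. It will therefore not reproduce the 0.1 % and 0.05 % figures, only the trend.

```python
import heapq
import random


def late_fraction(n_nodes: int, rate_hz: float, mean_service_s: float,
                  cutoff_s: float, n_events: int, seed: int = 1) -> float:
    """Fraction of events whose latency (queueing + processing) exceeds `cutoff_s`.

    Models one sub-farm as a FIFO queue served by `n_nodes` identical nodes.
    """
    rng = random.Random(seed)
    free_at = [0.0] * n_nodes  # time at which each node becomes idle
    heapq.heapify(free_at)
    now = 0.0
    late = 0
    for _ in range(n_events):
        now += rng.expovariate(rate_hz)               # next arrival time
        start = max(now, heapq.heappop(free_at))      # wait for the earliest free node
        finish = start + rng.expovariate(1.0 / mean_service_s)
        heapq.heappush(free_at, finish)
        if finish - now > cutoff_s:
            late += 1
    return late / n_events


if __name__ == "__main__":
    # Same load per node in both cases (assumed 0.9 ms mean processing time, 5 ms cut-off).
    small = late_fraction(9, 9_000, 0.9e-3, 5e-3, 100_000)
    large = late_fraction(18, 18_000, 0.9e-3, 5e-3, 100_000)
    print(f"9 nodes @ 9 kHz:   {small:.4f}")
    print(f"18 nodes @ 18 kHz: {large:.4f}")
```

With the same load per node, the larger sub-farm shows a smaller fraction of late events, because a temporary pile-up of slow events is absorbed by more nodes. This is the same qualitative argument as on the slide for minimising the number of sub-farms.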
Summary
• LHCb DAQ is completely based on commercial (mostly high-end commodity) components
• We have to handle two levels of software trigger, one with a latency restriction
• We use a connection-less push-protocol from top to bottom
• The SFC is one of three key components in the data-flow (the sources, the network, the SFCs)
• It has to do event-building, event-distribution and buffering, decision pre-sorting and forwarding, and forwarding of accepted events to storage