Event Building With Smart NICs
Jean-Pierre Dufey, Beat Jost, Niko Neufeld & Marianna Zuin
DAQ 2000, Lyon, October 20, 2000
Niko NEUFELD, CERN, EP
Recap: LHCb DAQ System
[Figure: LHCb DAQ architecture. Blocks: Front End Links, Front-End Multiplexers (FEM), Read-out units (RU), Read-out Network (RN), Sub-Farm Controllers (SFC), CPU farm (Trigger Level 2 & 3, Event Filter), Storage, Timing & Fast Control, Throttle, Control & Monitoring (LAN). Bandwidth labels: 6 GB/s (twice), 50 MB/s. Latency labels: variable latency, L2 ~10 ms, L3 ~200 ms]
Event Building Components
•Readout units (RU): multiplexing of front-end links, destination assignment
•Switching read-out network
•Sub-farm controllers (SFC): event building and event dispatching
Event Building Properties
•Static load balancing among the SFCs
– RUs send round robin to destinations: destination = f(event_number), f being the same for all RUs
•Pure push protocol
– congestion is handled via flow control and, more importantly, by throttling
•Distributes the event data flow of 6 GB/s from m sources to n destinations, each of which has to handle O(1 kB) fragments at 80 kHz
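A minimal sketch of the static destination assignment described above. The slide only states that destination = f(event_number) with the same f on every RU; the modulo choice for f and the names `destination` and `n_destinations` are assumptions made for illustration.

```python
def destination(event_number: int, n_destinations: int) -> int:
    """Hypothetical f(event_number): plain round robin over the SFCs."""
    return event_number % n_destinations


# Every RU evaluates the same function, so all fragments of one event
# converge on the same SFC without any communication between RUs.
n_sfc = 3
for ru in ("RU0", "RU1", "RU2", "RU3"):
    targets = [destination(ev, n_sfc) for ev in range(6)]
    print(ru, targets)
```

Because the mapping depends only on the event number, no RU needs to know what the others are doing, which is what makes a pure push protocol workable.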
Why Use Smart NICs?
Modern Smart NICs are powerful embedded computers
Off-load general purpose CPU
Take advantage of cheap CPU power on the NIC
Facilitate hardware design of the RU
(Yet) limited CPU power compared to commodity PC
No guarantee that high-end NIC development will continue in this direction (firmware/CPU vs. ASIC/FPGA)
Alteon Tigon 2
• Features
– Dual R4000-class processor running at 88 MHz
– Up to 2 MB memory
– GigE MAC + link-level interface
– PCI interface
• Development environment
– GNU C cross compiler with a few special features to support the hardware
– Source-level remote debugger
Test Setup
[Figure: test setup with two PC/Linux hosts. Labels per host: CPU, Mem, GbE NIC, PCI, NIC. Further label: CERN Network]
NIC 2 NIC Performance
[Figure: NIC 2 NIC throughput vs. frame size. x-axis: Frame size [bytes], logarithmic, 1 to 10000; y-axis: Throughput [Bytes/µs], 0 to 140; curves: Data, Fit, Extrapolation w/o min. frame size]
Fit: $y = \dfrac{x}{a + b \cdot \max(x, c)}$ with $a = 0.2\,\mu\mathrm{s}$, $b = \frac{1}{125}\,\mu\mathrm{s/B}$, $c = 64.0\ \mathrm{Bytes}$
Performance of Alteon NIC
•Can fill the wire at any given frame size (from 64 to 9000 bytes)
•Can send out frames at frequencies of up to 1.4 MHz
•For frames bigger than 512 bytes, more than 95% of the nominal bandwidth is available for data (practically 100% for jumbo frames larger than 8000 bytes)
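The figures above can be cross-checked against the fit parameters quoted on the NIC 2 NIC Performance slide (0.2 µs, 125 Bytes/µs, 64 Bytes). The sketch below assumes the fit has the form y = x / (a + b·max(x, c)); that reading of the formula, and all function names, are assumptions rather than something the slides spell out.

```python
A_US = 0.2                 # fixed per-frame overhead [microseconds]
B_US_PER_BYTE = 1 / 125    # wire time per byte [microseconds/byte]
C_BYTES = 64               # minimum frame size [bytes]
NOMINAL_BYTES_PER_US = 125 # Gigabit Ethernet, 1 Gbit/s = 125 bytes/us


def frame_time_us(frame_size: float) -> float:
    """Time one frame occupies the sender under the assumed fit."""
    return A_US + B_US_PER_BYTE * max(frame_size, C_BYTES)


def throughput_bytes_per_us(frame_size: float) -> float:
    return frame_size / frame_time_us(frame_size)


def frame_rate_mhz(frame_size: float) -> float:
    return 1.0 / frame_time_us(frame_size)


print(f"max frame rate at 64 B: {frame_rate_mhz(64):.2f} MHz")
for size in (512, 1500, 9000):
    share = throughput_bytes_per_us(size) / NOMINAL_BYTES_PER_US
    print(f"{size:5d} B: {share:.1%} of nominal bandwidth")
```

Under these assumptions the model gives roughly 1.4 MHz for minimum-size frames and just over 95% of nominal bandwidth at 512 bytes, in line with the bullets.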
Event Building Algorithm
•Assembles events out of fragments from a known number of sources
•Handles an adjustable number of events concurrently (limited only by buffer space)
•Implements “Implicit + Time-out Completion”
•Uses the “scatter/gather” capabilities of the NIC’s DMA engine to concatenate the fragments into the host’s memory
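To illustrate the scatter/gather idea in the last bullet, here is a host-side simulation, not NIC firmware. It assumes a descriptor is simply a (host_offset, length) pair and that fragments are laid out back to back; the names `build_gather_list` and `gather_into_host` are hypothetical.

```python
def build_gather_list(fragment_lengths, base_offset=0):
    """Return one (host_offset, length) descriptor per fragment, back to back."""
    descriptors = []
    offset = base_offset
    for length in fragment_lengths:
        descriptors.append((offset, length))
        offset += length
    return descriptors


def gather_into_host(fragments, host_buffer, base_offset=0):
    """Copy each fragment to its descriptor's offset; return the end offset."""
    lengths = [len(f) for f in fragments]
    end = base_offset + sum(lengths)
    if end > len(host_buffer):
        raise ValueError("event does not fit into the host buffer")
    for fragment, (offset, length) in zip(
        fragments, build_gather_list(lengths, base_offset)
    ):
        host_buffer[offset:offset + length] = fragment
    return end


host = bytearray(16)
end = gather_into_host([b"AAAA", b"BB", b"CCC"], host)
print(bytes(host[:end]))
```

The point of the descriptor list is that the fragments never need to be copied into one contiguous block on the NIC: each one is transferred straight to its final position in host memory.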
Algorithm
[Flowchart]
Start → Polling → New fragment → New event fragment?
– YES: Add new event descriptor → Check for missing fragments in previous events
– NO: Event still in the table?
   – YES: Collect the fragment → Decrement sources
   – NO: Fragment out of time
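The flowchart can be sketched on a host as follows. This is one possible reading, not the NIC firmware: it assumes event numbers only increase, that an event is complete once its source counter reaches zero, and that older events are checked for time-out whenever a new event descriptor is added. All class, method and field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EventDescriptor:
    event_number: int
    missing: int          # sources that have not delivered yet
    created_at: float     # arrival time of the first fragment
    fragments: list = field(default_factory=list)


class EventBuilder:
    def __init__(self, n_sources: int, timeout: float):
        self.n_sources = n_sources
        self.timeout = timeout
        self.table = {}           # event_number -> EventDescriptor
        self.highest_seen = -1
        self.complete = []        # fully assembled events
        self.timed_out = []       # events given up on, fragments missing
        self.late_fragments = 0   # "fragment out of time"

    def on_fragment(self, event_number: int, payload, now: float) -> None:
        if event_number > self.highest_seen:
            # New event fragment: add a descriptor, then check older events.
            self.highest_seen = event_number
            self.table[event_number] = EventDescriptor(
                event_number, self.n_sources, now
            )
            self._check_previous_events(now)
        elif event_number not in self.table:
            # Event no longer in the table: the fragment arrived too late.
            self.late_fragments += 1
            return
        desc = self.table[event_number]
        desc.fragments.append(payload)   # collect the fragment
        desc.missing -= 1                # decrement sources
        if desc.missing == 0:
            self.complete.append(self.table.pop(event_number))

    def _check_previous_events(self, now: float) -> None:
        expired = [n for n, d in self.table.items()
                   if now - d.created_at > self.timeout]
        for n in expired:
            self.timed_out.append(self.table.pop(n))


builder = EventBuilder(n_sources=2, timeout=10.0)
builder.on_fragment(1, "a1", now=0.0)
builder.on_fragment(1, "b1", now=1.0)    # event 1 complete
builder.on_fragment(2, "a2", now=2.0)
builder.on_fragment(3, "a3", now=20.0)   # event 2 times out here
builder.on_fragment(2, "b2", now=21.0)   # out of time
builder.on_fragment(3, "b3", now=22.0)   # event 3 complete
print([d.event_number for d in builder.complete],
      [d.event_number for d in builder.timed_out],
      builder.late_fragments)
```

Checking for time-outs only when a new event starts keeps the per-fragment path short, at the cost of leaving stale events in the table while no new events arrive.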
PC Test Implementation
[Figure: time per fragment, T/frg (microseconds), 0 to 2.5, vs. generated fragments, 500000 to 100500000. Labels: Simple time-out / Event # on top. Data labels: 2, 2, 1.4, 1.6, 1.64, 1.61]
400 MHz PIII, VC++ 5.0
Performance NIC 2 NIC
[Figure: events/second (logarithmic, 1000 to 100000) vs. number of sources (2 to 32)]
Average time per fragment: 11.65 µs
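A rough way to relate the average time per fragment to the event rate, assuming building one event costs exactly (number of sources) × (time per fragment) with no other overhead. That cost model and the function name are assumptions; the plotted data may deviate from it.

```python
AVG_TIME_PER_FRAGMENT_US = 11.65  # from the slide


def events_per_second(n_sources: int,
                      t_fragment_us: float = AVG_TIME_PER_FRAGMENT_US) -> float:
    """Event rate if each event needs n_sources fragments handled serially."""
    return 1e6 / (n_sources * t_fragment_us)


for n in (2, 8, 32):
    print(f"{n:2d} sources: {events_per_second(n):8.0f} events/s")
```

With these numbers the estimate spans roughly 43000 events/s at 2 sources down to about 2700 events/s at 32 sources, which falls inside the range of the plot's vertical axis.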
Summary
•Event building on a smart NIC at an incoming fragment rate of almost 100 kHz has been demonstrated
•Event building at Gigabit speed is possible for fragments bigger than ~1100 bytes
•Code optimization is ongoing (9 µs/fragment has already been achieved)
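The ~1100 byte threshold is consistent with simple arithmetic: if the NIC spends a fixed time on each fragment, the link is saturated once a fragment takes at least that long on the wire. The sketch below assumes a nominal 125 bytes/µs for Gigabit Ethernet and ignores framing overhead; the function names are hypothetical.

```python
WIRE_BYTES_PER_US = 125  # nominal Gigabit Ethernet rate


def min_fragment_for_wire_speed(t_fragment_us: float) -> float:
    """Smallest fragment whose wire time covers the processing time."""
    return WIRE_BYTES_PER_US * t_fragment_us


def max_fragment_rate_khz(t_fragment_us: float) -> float:
    return 1e3 / t_fragment_us


for t in (11.65, 9.0):
    print(f"{t:5.2f} us/fragment: "
          f"wire speed above {min_fragment_for_wire_speed(t):.0f} bytes, "
          f"up to {max_fragment_rate_khz(t):.0f} kHz")
```

At 9 µs per fragment this gives 1125 bytes, close to the ~1100 bytes quoted in the summary.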
Program of Work
•Evaluate impact of interrupt coalescence on SFC performance
•Study possibility of handling some amount of TCP/IP traffic on the outgoing link of the SFC (events to storage)
•“Real world” tests on a Gigabit Ethernet switching network
•Use measured parameters in a detailed simulation of the readout network