![Page 1: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/1.jpg)
January 25, 2018
Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell
DUNE FD DAQ: ATCA/RCE + FELIX Solution
![Page 2: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/2.jpg)
Felix + ATCA RCE Overview & Responsibilities
2
[Diagram labels: Front Ends; WIBs; ATCA RCE Cluster; FELIX Cluster; Optics Up Shaft; Backend Computing; Trigger Farm; Trigger primitives; Trigger decisions]
● WIBs: front end data passes through the WIB without concentration, electrical to optical conversion only
● ATCA RCE Cluster: provides filtering, feature extraction & SuperNova buffering (100s). Raw data received at the ATCA RCE RTM, passed directly to DPMs
● 10 Gbps links between RCE and FELIX (underground); buffer for trigger
● FELIX Cluster: Event Builder, Aggregator, L3 triggering
● FE+WIB → RCE: all raw data into RTM with some custom format (e.g. COLDATA); 8B/10B (probably) at 1.28 Gbps
● RCE → FELIX: all raw data out of the RTM in some custom format (GBT etc) of multiplexed data, ~10-12 Gbps links
● FELIX → Backend Computing: triggered raw data over ethernet on switched network
● Trigger Path: RCE-extracted primitives go either RCE → FELIX → trigger farm on a separate stream, or directly from RCE → trigger farm via ethernet (shown)
● Lossy Buffer (not shown): RCE-extracted waveforms/time slices → lossy buffer, either through FELIX or direct from RCE
![Page 3: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/3.jpg)
3
Numerology (just CE, RCE, FELIX)
● “Cold” Electronics: 64 channels/COLDATA, 2 COLDATA/FEMB, 4 FEMB/WIB, 20 FEMB/5 WIBs/APA
  ○ these are fixed, never ever will change
● RCE System: 4 DPMs/COB, 1* RCEs/DPM, up to 14 COBs/shelf, 1 COB/APA (target)
  ○ current-gen of DPM has 2 RCEs/DPM, see later slides
● FELIX System: 2 APAs/FELIX; 2 FELIX/PC (target)
● WIB → RCE Links: 16 links/WIB @ 1.28 Gbps, 80 links/COB/APA
  ○ assume passive WIB
● RCE → FELIX Links: (raw, uncompressed) 2 10-Gbps links/DPM, 8 links/APA, 16 links/FELIX, 32 links/PC
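The counts on this slide follow from a handful of multiplications. The sketch below only restates that arithmetic; the constant and function names are illustrative and not part of any DUNE software.

```python
# Per-unit counts quoted on the Numerology slide.
CH_PER_COLDATA = 64
COLDATA_PER_FEMB = 2
FEMB_PER_WIB = 4
WIB_PER_APA = 5
LINKS_PER_WIB = 16      # WIB -> RCE links at 1.28 Gbps
DPM_PER_COB = 4         # with 1 COB per APA (target)
UPLINKS_PER_DPM = 2     # 10 Gbps RCE -> FELIX links
APA_PER_FELIX = 2
FELIX_PER_PC = 2


def numerology():
    """Derive the per-APA, per-FELIX and per-PC totals from the unit counts."""
    femb_per_apa = FEMB_PER_WIB * WIB_PER_APA
    channels_per_apa = CH_PER_COLDATA * COLDATA_PER_FEMB * femb_per_apa
    uplinks_per_apa = UPLINKS_PER_DPM * DPM_PER_COB
    uplinks_per_felix = uplinks_per_apa * APA_PER_FELIX
    return {
        "femb_per_apa": femb_per_apa,
        "channels_per_apa": channels_per_apa,
        "channels_per_dpm": channels_per_apa // DPM_PER_COB,
        "wib_links_per_apa": LINKS_PER_WIB * WIB_PER_APA,
        "uplinks_per_apa": uplinks_per_apa,
        "uplinks_per_felix": uplinks_per_felix,
        "uplinks_per_pc": uplinks_per_felix * FELIX_PER_PC,
    }


totals = numerology()
for name, value in totals.items():
    print(f"{name}: {value}")
```

The derived 640 channels per DPM is the same target quoted on the data-flow and supernova-buffering slides.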
![Page 4: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/4.jpg)
4
RTM Block Diagram
[Block diagram labels: SNAP12 modules (repeated); SFP+; QSFP; WIB Connections: Support 80 links; FELIX Interface; Timing; DTM; DPMs]
Experience with high density fiber optic RTMs
Reflex Photonics SNAP12 transmitter/receiver supports 10.3125 Gbps per lane: http://reflexphotonics.com/embedded-transceivers/snap12/
![Page 5: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/5.jpg)
Data Flow In ATCA RCEs
5
[Diagram labels: Front Ends; Filtering (x3); Feature Extraction; SuperNova Pre-Buffer; Compression & Other Processing; SuperNova Post Buffer; Felix Uplink (GBT or PGP); Timing Interface; To Felix; Compression Proposed]
● Flexible architecture allows front ends to be allocated across RCEs as needed
  ○ Simply add more cards and move fibers
● Target is 640 channels per RCE (1x APA per COB) → 5 FEMBs/DPM
  ○ Numerology is important! 5 WIBs vs 4 DPMs/APA; multiplexing at WIB (2xFEMB links e.g.) reduces flexibility
![Page 6: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/6.jpg)
RCE To Felix Uplink
6
● Multiple optical links will be routed between the ATCA RCE platform and the Felix nodes
  ○ DWDM utilized to maximize uplink bandwidth and provide redundancy
● The ATCA RCE platform will utilize its powerful interconnects to provide a data routing capability
  ○ Flexible configuration of which data goes to each Felix board
    ■ Allows system to adapt to changing data needs (noise, etc)
    ■ Some channels can be used for raw data from a subset of the detector
    ■ SuperNova readout lanes (slow trickle, post trigger)
    ■ Data can be re-routed to different fibers in the case of a fiber break or Felix board failure!
  ○ Link count can be scaled to match system needs
    ■ Fewer fibers when RCEs do the majority of data processing and event selection
    ■ More fibers when computing cluster is needed for data processing
    ■ One or more fibers per DPM, one fiber per COB, or one fiber per crate
[Diagram labels: RCE Cluster; Felix (x6)]
![Page 7: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/7.jpg)
Example Data Routing
7
[Diagram labels: Detector Data (x7); Processing RCE (x7); ATCA RCE Interconnect; Outbound RCE (x3); Felix (x3)]
Note: Processing RCEs can also serve as outbound RCEs
![Page 8: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/8.jpg)
Example Data Routing: 2 Active & 1 Spare Felix
8
[Diagram labels: Detector Data (x7); Processing RCE (x7); ATCA RCE Interconnect; Outbound RCE (x3); Felix (x3)]
Note: Processing RCEs can also serve as outbound RCEs
![Page 9: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/9.jpg)
Example Data Routing: Felix Failure Or Fiber Break
9
[Diagram labels: Detector Data (x7); Processing RCE (x7); ATCA RCE Interconnect; Outbound RCE (x3); Felix (x3)]
Note: Processing RCEs can also serve as outbound RCEs
ATCA RCE cross connect routes data to redundant Felix board after fiber break or Felix board failure!
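The failover idea on this slide can be illustrated with a small routing-table sketch. Everything here is hypothetical: the node names, the dictionary layout and the `reroute` helper are assumptions made for illustration, not the RCE firmware's actual routing mechanism.

```python
def reroute(routes, failed, spares):
    """Return a new routing table with every route to `failed` moved to a spare.

    routes: dict mapping an outbound RCE name to the Felix node it feeds.
    failed: name of the Felix node (or fiber endpoint) that went down.
    spares: list of Felix nodes that are idle and can take over.
    """
    if failed not in routes.values():
        return dict(routes)  # nothing to do, return an unchanged copy
    candidates = [node for node in spares if node != failed]
    if not candidates:
        raise RuntimeError("no spare Felix node available")
    spare = candidates[0]
    return {src: (spare if dst == failed else dst) for src, dst in routes.items()}


# Two active Felix nodes and one spare, as in the "2 Active & 1 Spare" slide.
routes = {"outbound_rce_0": "felix_0", "outbound_rce_1": "felix_1"}
after = reroute(routes, failed="felix_1", spares=["felix_2"])
print(after)
```

The original table is left untouched, so a caller could keep it around and restore the old routes once the failed node or fiber is repaired.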
![Page 10: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/10.jpg)
Example Data Routing: Load Adjustment
10
[Diagram labels: Detector Data (x7); Processing RCE (x7); ATCA RCE Interconnect; Outbound RCE (x3); Felix (x3)]
Note: Processing RCEs can also serve as outbound RCEs
Redundant Felix boards can take on additional loads due to flexible data routing in ATCA RCE!
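One way to picture load adjustment is as an assignment of data streams to Felix nodes. The slides do not say how the assignment would be chosen; the sketch below assumes a simple greedy "heaviest stream to the least-loaded node" policy, and the stream names and load numbers are made up for illustration.

```python
import heapq


def balance(stream_loads, felix_nodes):
    """Greedily assign each stream to the currently least-loaded Felix node.

    stream_loads: dict mapping stream name to its load (arbitrary units).
    felix_nodes: list of Felix node names, including any redundant ones.
    """
    heap = [(0, name) for name in felix_nodes]
    heapq.heapify(heap)
    assignment = {}
    # Place the heaviest streams first; ties are broken by stream name.
    for stream, load in sorted(stream_loads.items(), key=lambda kv: (-kv[1], kv[0])):
        total, name = heapq.heappop(heap)
        assignment[stream] = name
        heapq.heappush(heap, (total + load, name))
    return assignment


loads = {"a": 6, "b": 5, "c": 4, "d": 3}
assignment = balance(loads, ["felix_0", "felix_1"])

per_node = {}
for stream, node in assignment.items():
    per_node[node] = per_node.get(node, 0) + loads[stream]
print(assignment, per_node)
```

Adding a redundant node to `felix_nodes` and calling `balance` again spreads the same streams over more nodes.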
![Page 11: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/11.jpg)
11
Data from on FELIX PCs
![Page 12: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/12.jpg)
Benefits Of Felix + ATCA RCE
12
● ATCA RCE provides a powerful front end processing platform for data processing in FPGAs
  ○ RCE data processing features included in upcoming slides
    ■ Filtering
    ■ Feature extraction
    ■ Neural Network Processing
  ○ RCE could provide SuperNova data buffering (minute easily)
  ○ Proven packaging, cooling, interconnects, high reliability (incl hot-swap redundant fan and power supplies)
    ■ Already used for other experiments (LSST, ATLAS CSC, ATLAS IBL, KOTO, etc), mature design, low risk
● ATCA RCE interconnect provides ability to re-route data to Felix nodes on demand
  ○ Adjust processor load to match the amount of processing needed in back end
  ○ Route data around failed uplink fibers
  ○ Route data to move from a failed Felix node (or host CPU) to a redundant element
● Felix provides a point to point path between the ATCA RCEs and the back end data processing
  ○ Better flow control model than Ethernet or TCP / UDP over long links
    ■ Both PGP and GBT provide proven flow control over long link distances
  ○ Transmitted frames stay in their native size instead of being chunked up into small network transfers (Ethernet MTU)
    ■ Felix has demonstrated high throughput with larger packet sizes
  ○ End to end data integrity
    ■ GBT and PGP both have data integrity checking on their transport protocols
    ■ Minimal error handling required in receiving nodes before data processing layer
    ■ Test pattern capability over GBT or PGP links
● Back end processing model follows classic Felix architecture
  ○ Receive data in Felix node with PCI-Express card
  ○ Back end data processing with CPUs and GPUs
![Page 13: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/13.jpg)
13
ATCA RCE Data Processing
![Page 14: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/14.jpg)
Readout Overview
14
![Page 15: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/15.jpg)
15
ATCA RCE Platform Clustering
• The RCE nodes are interconnected through Ethernet
• Each COB contains a low latency 10/40Gbps Ethernet switch
  - Cut through latency < 300ns
• The COB supports a full mesh 14-slot backplane
  - Each COB has a direct 10Gbps link to every other COB in a crate
  - Any RCE in an ATCA shelf has a maximum of two switches between it and every other RCE
  - 14 * 8 = 112 RCEs in a low latency cluster
• Reliable UDP protocol allows direct firmware to firmware data sharing
• Allows for low latency data sharing between nodes
  - APA combining and edge channel data sharing
  - Neural Network data sharing
[Diagram labels: COB (x4), each with DPM 0, DPM 1, DPM 2, DPM 3, Ethernet Switch and DTM; Off shelf link]
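The "maximum of two switches" claim follows from the topology described above: every RCE hangs off its own COB's switch, and the full-mesh backplane links every COB switch directly to every other. A minimal sketch that checks this exhaustively, assuming 8 RCEs per COB (as implied by the slide's 14 * 8) and a hypothetical `(slot, rce_index)` addressing scheme:

```python
from itertools import product

SLOTS_PER_SHELF = 14
RCES_PER_COB = 8  # assumption taken from the slide's "14 * 8 = 112"


def switches_between(a, b):
    """Count Ethernet switches on the path between two RCEs.

    a, b: (slot, rce_index) tuples. RCEs on the same COB share one switch;
    RCEs on different COBs cross their own switch and the peer COB's switch,
    because the full-mesh backplane links every pair of COBs directly.
    """
    if a == b:
        return 0
    return 1 if a[0] == b[0] else 2


rces = list(product(range(SLOTS_PER_SHELF), range(RCES_PER_COB)))
worst_case = max(switches_between(a, b) for a in rces for b in rces)
print(len(rces), "RCEs, worst case", worst_case, "switches")
```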
![Page 16: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/16.jpg)
16
Oxford Design: Revision C01
● ZYNQ: XCZU15EG-1FFVB1156E
● PL DDR4: 8 GB on DPM
● PS DDR4: 8 GB on DPM
● M.2 NVMe: 512 GB on DPM
  ○ Located above the DPM’s DDR ICs
● Dimensions: 85.09 mm x 110 mm
  ○ Increased by 1.27 mm for NVMe
[Board image labels: XCZU15EG-1FFVB1156E; DDR4 ICs; M.2 NVMe; SD Memory Card; JTAG]
![Page 17: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/17.jpg)
17
DPM Redesign for DUNE
● Oxford/SLAC Collaboration
● Optimized for large memory buffering on the DPM
● Only 24 GT channels on this FPGA
  ○ 20 of 24 GTs for the FEBs:
    ■ 80 links/COB @ 1.28 Gbps (8B/10B)
  ○ 2 of 24 GTs for the ETH SW:
    ■ two separate 10 GbE (10Gbps/lane, 64B/66B) to ETH SW
  ○ 2 of 24 GTs for the Felix:
    ■ 2 RX lanes and up to 22 TX lanes
● Able to support redundant Felix connections
    ■ 20 Gb/s @ 2 lanes (10Gbps/lane, 64B/66B)
[Board image labels: SuperNova Pre-Buffer; SuperNova Post-Buffer; Linux Kernel + SuperNova Pre-Buffer; Boot Memory]
Unused FEB TX lanes can be used to increase bandwidth to Felix
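The transceiver budget can be checked with a few lines of arithmetic. The sketch below restates the slide's allocation; the 4 DPMs per COB figure comes from the Numerology slide, and the dictionary keys are illustrative labels only.

```python
GT_TOTAL = 24
DPM_PER_COB = 4          # from the Numerology slide
FELIX_LANE_GBPS = 10     # per-lane rate quoted on the slide

# GT transceivers allocated per DPM, as listed on the slide.
gt_allocation = {"FEB links": 20, "Ethernet switch": 2, "Felix": 2}

gt_used = sum(gt_allocation.values())
feb_links_per_cob = gt_allocation["FEB links"] * DPM_PER_COB
felix_gbps_per_dpm = gt_allocation["Felix"] * FELIX_LANE_GBPS

print(f"GTs used: {gt_used} of {GT_TOTAL}")
print(f"FEB links per COB: {feb_links_per_cob}")
print(f"Felix bandwidth per DPM: {felix_gbps_per_dpm} Gb/s")
```

All 24 GTs are spoken for, which is why the slide points to the unused FEB TX lanes as the way to add Felix bandwidth.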
![Page 18: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/18.jpg)
Backup Slides
![Page 19: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/19.jpg)
19
ATCA Packaging for DUNE
● 1 APA = 2560 channels
● 1 APA per COB
  ○ 4 DPMs per COB
  ○ 640 channels per DPM
● 150 APA for the entire system = 150 COBs
● Total Rack space: 165U
  ○ 11x 14-slot ATCA crates
  ○ 15U per 14-slot ATCA Crate
    ■ http://www.asis-pro.com/maxum-atca-systems/14-Slot-14U-MaXum-460
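The crate count and rack space follow directly from the numbers above. A short sketch of that arithmetic (variable names are illustrative):

```python
import math

APAS = 150
CHANNELS_PER_APA = 2560
SLOTS_PER_CRATE = 14
RACK_UNITS_PER_CRATE = 15  # as budgeted on the slide

cobs = APAS                                  # 1 APA per COB
crates = math.ceil(cobs / SLOTS_PER_CRATE)   # last crate is only partly filled
rack_units = crates * RACK_UNITS_PER_CRATE
total_channels = APAS * CHANNELS_PER_APA

print(f"{crates} crates, {rack_units}U, {total_channels} channels")
```

Eleven 14-slot crates provide 154 slots, so four slots remain free after the 150 COBs are installed.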
![Page 20: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/20.jpg)
20
ATCA Power/Cooling Estimates for DUNE
● COB Max Power: 300W
  ○ ~100W for ETH SW
  ○ 36W for RTM (limited by 3A fuse)
  ○ 160W for digital processing
    ■ 40W per DPM
● Total Max Power: 45kW
● Cooling via forced air (integrated into the ATCA platform)
● Power and thermal monitoring via standard IPMI interface
● Example of an ATCA crate that supports 400W per slot
  ○ http://www.asis-pro.com/maxum-atca-systems/14-Slot-14U-MaXum-460
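The power figures can be cross-checked as follows. This is only the slide's arithmetic restated; the 150 COB count comes from the packaging slide and the names are illustrative.

```python
COBS = 150
DPM_PER_COB = 4
COB_MAX_W = 300
SLOT_LIMIT_W = 400  # example crate rating quoted on the slide

# Per-COB contributions in watts.
eth_switch_w = 100
rtm_w = 36
dpm_w = 40

component_sum_w = eth_switch_w + rtm_w + DPM_PER_COB * dpm_w
total_max_kw = COBS * COB_MAX_W / 1000

print(f"Component sum per COB: {component_sum_w} W (budget {COB_MAX_W} W)")
print(f"Total max power: {total_max_kw} kW")
```

The itemized components add up to slightly less than the 300 W budget, and the budget itself sits below the 400 W per slot that the example crate supports.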
![Page 21: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/21.jpg)
21
ATCA Costs for DUNE (Updated for quantity)
● RCE Cost Estimate:
  ○ Upgraded DPM + Flash: $2.5K
  ○ Upgrade COB: $4K
  ○ RTM: $2K
  ○ DTM: $1K
  ○ ~$17K/unit
● 14-slot ATCA crates, in quantity, 2019
  ○ ~$7K/unit
  ○ IPMI + shelf manager + 10GbE/40GbE backplane + fans + power supplies
● Total ATCA Hardware Cost: $2.7M
  ○ 11x ATCA crates
  ○ 150x RCE ATCA slots
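The unit and total costs can be reproduced from the line items, assuming 4 DPMs per COB as on the packaging slide. Names are illustrative and prices are the slide's estimates in US dollars.

```python
DPM_PER_COB = 4  # assumption taken from the packaging slide

dpm_cost = 2_500
cob_cost = 4_000
rtm_cost = 2_000
dtm_cost = 1_000
crate_cost = 7_000

slots = 150
crates = 11

unit_cost = DPM_PER_COB * dpm_cost + cob_cost + rtm_cost + dtm_cost
total_cost = slots * unit_cost + crates * crate_cost

print(f"Per-slot cost: ${unit_cost:,}")
print(f"Total hardware cost: ${total_cost:,}")
```

The sum comes to about $2.63M, which the slide quotes as $2.7M.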
![Page 22: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/22.jpg)
22
Packaging And Architecture Thoughts
![Page 23: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/23.jpg)
23
ATCA Components
Proven standard, built to be robust and reliable, also fully monitored
[Image labels: Air Intake Filter; Intake Fans; Exit Fans; Power supply DC or AC input; Shelf Manager]
● Telecom standard designed for “5 nines” uptime
● Almost all components can be replaced in the field
● Redundancy is available if desired
  ○ N + 1 redundancy for power supplies
  ○ Redundant shelf managers
● System is designed to handle one fan failure in each fan tray
  ○ Shelf manager generates alarm to request fan tray replacement
![Page 24: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/24.jpg)
ATCA Provides Management & Monitoring Features Required In Reliable & Maintainable DAQ Designs
24
[Diagram labels: Application Card; Shelf Manager; Shelf Managers; Ethernet; Console; Power Supplies; Fan Trays; EEPROM; IPMC; EPROMs]
● ATCA uses IPMI for management purposes
  ○ Intelligent Platform Management Interface
● Manages and monitors all shelf based components
  ○ Power supply status and power
  ○ Shelf inlet and exit temperatures
  ○ Fan speed control and monitoring
  ○ Application card control and monitoring
● Redundant EEPROMs contain all shelf information
  ○ Shelf serial number, location and ID
  ○ Shelf manager IP/MAC address
● Application card hosts IPMC
  ○ Intelligent Platform Management Controller
● IPMC hosts all application card information in local EEPROM
  ○ MAC addresses
  ○ Serial number, card type & revision
![Page 25: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/25.jpg)
Supernova Buffering In Two Stages (Update)
● Pre trigger buffer stores data in a ring buffer waiting for a supernova trigger
  ○ 640 channels per RCE (1x APA per COB)
  ○ 2 MHz ADC sampling rate
  ○ 12 bits per ADC sample
  ○ Raw Bandwidth: 15.36 Gbps (1.92 GB/s)
    ■ 640 x 2MHz x 12b
  ○ Each DPM has 16 GB RAM:
    ■ 9.6 TB DDR4 RAM for the whole system across 150x COBs
  ○ Total Memory for supernova “pre-buffering”: 15 GB
    ■ PL 8 GB + PS 7 GB (1GB for Kernel & OS)
  ○ Without compression: 7.8 seconds pre-trigger buffer
    ■ Assuming 12-bit packing to remove 4-bit overhead when packing into bytes
● Post trigger buffer stores data in flash based SSD before backend DAQ
  ○ Write sequence occurs once per supernova trigger: low write wear over experiment lifetime
  ○ Low bandwidth background readout post trigger: does not impact normal data taking
  ○ ~$180K for NVMe M.2 SSD buffering (150x COBs x 4 DPMs/COB x $300/NVMe)
  ○ 512GB/DPM = 266 second post-trigger buffer
  ○ Samsung NVMe SSD 960 PRO: sequential write up to 2.1GB/s
    ■ SSD write bandwidth matches well with 640 channels of uncompressed data
25
NOTE: No compression factor is applied in this slide (only raw bandwidths)
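The buffer depths on this slide can be reproduced from the raw bandwidth, and the "12-bit packing" remark can be made concrete. The sketch below assumes decimal units (1 GB = 10^9 bytes); the `pack12` byte layout is one possible packing chosen for illustration, not necessarily the one used in the RCE firmware.

```python
CHANNELS = 640
SAMPLE_RATE_HZ = 2_000_000
BITS_PER_SAMPLE = 12
GB = 10**9  # decimal gigabyte (assumption)

raw_bits_per_s = CHANNELS * SAMPLE_RATE_HZ * BITS_PER_SAMPLE
raw_bytes_per_s = raw_bits_per_s // 8

pre_buffer_s = 15 * GB / raw_bytes_per_s     # 15 GB of DDR4 per DPM
post_buffer_s = 512 * GB / raw_bytes_per_s   # 512 GB NVMe per DPM
system_ram_gb = 150 * 4 * 16                 # COBs x DPMs/COB x GB/DPM
ssd_cost = 150 * 4 * 300                     # COBs x DPMs/COB x $/NVMe


def pack12(samples):
    """Pack pairs of 12-bit samples into 3 bytes (hypothetical layout)."""
    if len(samples) % 2:
        raise ValueError("need an even number of samples")
    out = bytearray()
    for a, b in zip(samples[0::2], samples[1::2]):
        if not (0 <= a < 4096 and 0 <= b < 4096):
            raise ValueError("samples must fit in 12 bits")
        # byte 0: high 8 bits of a; byte 1: low 4 of a + high 4 of b; byte 2: low 8 of b
        out += bytes((a >> 4, ((a & 0xF) << 4) | (b >> 8), b & 0xFF))
    return bytes(out)


packed = pack12([0xABC, 0xDEF])
print(f"raw: {raw_bits_per_s / 1e9:.2f} Gbps = {raw_bytes_per_s / GB:.2f} GB/s")
print(f"pre-trigger: {pre_buffer_s:.1f} s, post-trigger: {post_buffer_s:.0f} s")
print(packed.hex())
```

Storing each 12-bit sample in a 16-bit word instead would raise the data rate by a third and shorten both buffers by a quarter, which is the overhead the packing assumption removes.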
![Page 26: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/26.jpg)
Zynq Ultrascale+ and M.2 SSD Performance
● Benchmarked read/write bandwidth into Samsung NVMe SSD 960 PRO with the ZYNQ PS PCIe root complex interface
● M.2 SSD mounted and formatted as EXT4 hard drive
● Running on ArchLinux
● Measuring ~1.6GB/s for reading/writing dummy data generated by the CPU
  ○ Limited by the Zynq’s PCIe GEN2 x 4 lane interface (Theoretical limit: 2.0 GB/s)
    ■ Not limited by M.2 SSD’s controller
● Because the input bandwidth is 1.92GB/s > 1.6 GB/s SSD write speed, we would be able to buffer for 37 seconds in DDR before 100% back pressure
● Need some amount of compression before the SSD to prevent bottlenecking at the SSD
● This is a very simple test with only one process
  ○ Need to do stress testing of other interfaces in parallel with the SSD to confirm rate is still 1.6GB/s
26
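The PCIe limit and the amount of compression needed can be worked out from standard PCIe Gen2 parameters (5 GT/s per lane with 8b/10b encoding). The sketch below ignores protocol overhead above the line code, which is why the measured rate falls below the theoretical figure; variable names are illustrative.

```python
# PCIe Gen2: 5 GT/s per lane, 8b/10b line code (8 payload bits per 10 transferred).
GT_PER_S_PER_LANE = 5_000_000_000
LANES = 4

pcie_bytes_per_s = GT_PER_S_PER_LANE * 8 // 10 * LANES // 8
pcie_gb_per_s = pcie_bytes_per_s / 1e9

input_gb_per_s = 1.92      # raw data rate for 640 channels
measured_gb_per_s = 1.6    # benchmarked SSD read/write rate

link_efficiency = measured_gb_per_s / pcie_gb_per_s
min_compression = input_gb_per_s / measured_gb_per_s
backlog_gb_per_s = input_gb_per_s - measured_gb_per_s

print(f"PCIe Gen2 x4 limit: {pcie_gb_per_s:.1f} GB/s")
print(f"Measured/limit: {link_efficiency:.0%}")
print(f"Minimum compression factor: {min_compression:.2f}x")
print(f"DDR backlog growth without compression: {backlog_gb_per_s:.2f} GB/s")
```

Under these numbers the theoretical limit works out to 2.0 GB/s (gigabytes per second), and a compression factor of at least 1.2 ahead of the SSD would bring the input rate down to the measured write rate.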
![Page 27: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/27.jpg)
Optional Compression
27
● Past development has shown firmware compression can be costly in FPGA resources
● If compression is done in firmware, a minimal LUT footprint would be required
● With the high performance Zynq Ultrascale+, real-time software compression does become a reality.

FPGA Resources for 640 channel per DPM compression

| Algorithm | kLUTs/DPM | kFFs/DPM | DSP48/DPM | RAM (Mb)/DPM |
|---|---|---|---|---|
| Arithmetic Probability Encoding | 292 (86%) | 120 (18%) | 75 (<1%) | 22.3 (38%) |
| Huffman | 143 (43%) | 60 (9%) | 75 (<1%) | 22.3 (38%) |
![Page 28: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution.](https://reader033.fdocuments.net/reader033/viewer/2022051810/6019655ffb168a018900f3ee/html5/thumbnails/28.jpg)
28
Waveform Extraction
● See slides from JJ Russell here: https://docs.google.com/presentation/d/1XufamuZOdFGkIlHZEw4N8nXMSUEbK9OlhQ9pcAGn4wk/edit?usp=sharing