DAQ Software
Gordon Watts, UW Seattle
December 8, 1999, Director's Review
• Introduction to the System
• Goals for Installation & Commissioning
• Software Tasks & Manpower
Run II DAQ
• Readout channels: ~800,000 in Run 2; <250 kB>/event
• Data rates: 60-80 crates; initial design capacity ~1000 Hz, 250 MB/s into the DAQ/L3 farm
• Integration & control with the online system
Goals
• Continuous operation: Version 0 (PC & Run 1 DAQ hardware) simulates the functionality of the Run 2 system and looks similar to the final system to both Level 3 and the outside user
• Integrate with hardware as it arrives, with only small perturbations
• Reliability and integration with online (monitoring, errors, etc.): we don't want calls at 4 am, so careful testing as we go along; test stand at Brown, where the Si test and other bootstrap operations happen
• System isn't fragile: if things aren't done in the exact order, deal with it and give understandable error messages
• All code kept in a code repository (VSS)
[Diagram: Run 2 DAQ architecture. Front-end crates send segment data over cables to VRC1-VRC8; eight primary Fiber Channel loops and front-end token readout loops carry events through SB1-SB4 and the ETG (event tag loop, tied to the Trigger Framework) to the L3 nodes (1 of 16 per loop), which forward them over Ethernet to the collector/routers.]
We have to write software.
L3 Software
[Diagram: component overview. The VRCs, SBs, and ETG feed the L3 farm nodes; the L3 Supervisor, L3 Monitor, and collector/router connect the system to the online system.]
L3 Software
• During running, the DAQ hardware is stand-alone: running components do not require software intervention on an event-by-event basis, except for monitoring. Software must deal with initialization and configuration only, except for the farm node.
• DAQ components require very little software: the VRC and SB are simple, similar control programs with almost no parameter settings, and the ETG is similar, with more sophisticated software to handle routing-table configuration. The farm node and supervisor are the only components that require significant programming effort; the monitor node to a lesser extent.
ETG Interface
[Diagram: the ETG node runs the ETG program, with control and disk, talking to the L3 Supervisor and L3 Monitor over DCOM. Embedded systems link it to the Trigger Framework: triggers come in, disables go back.]
Similar to the VRC (and SB); will reuse software.
Filter Process
• The physics filter executes in a separate process. This isolates the framework from crashes: the physics analysis code changes much more frequently than the framework once the run has started.
• Crash recovery saves the event, flags it, and ships it up to the online system for debugging.
• Raw event data is stored in shared memory; the framework and filter process coordinate through shared memory and mutexes (see the sketch below).
(Based on Run 1 experience.)
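A minimal sketch of this isolation scheme, assuming Win32 on the NT farm nodes: the framework owns a named shared-memory segment that a separately spawned filter process maps, so a filter crash cannot take the raw event down with it. The segment name, buffer layout, and sizes below are illustrative, not the real framework's.

```cpp
// Hypothetical sketch of the framework side of the shared event buffer.
#include <windows.h>
#include <cstdio>

struct EventBuffer {          // layout seen by both framework and filter
    long   flags;             // e.g. set by crash recovery to flag the event
    size_t size;              // raw event size in bytes
    char   data[250 * 1024];  // <250 kB>/event per the Run 2 design figures
};

int main() {
    // Create a named mapping; the filter process opens it by the same name.
    HANDLE h = CreateFileMappingA(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE,
                                  0, sizeof(EventBuffer), "L3EventBuffer");
    if (h == NULL) { std::fprintf(stderr, "CreateFileMapping failed\n"); return 1; }

    EventBuffer* ev = static_cast<EventBuffer*>(
        MapViewOfFile(h, FILE_MAP_ALL_ACCESS, 0, 0, sizeof(EventBuffer)));
    if (ev == NULL) { CloseHandle(h); return 1; }

    // The filter runs in its own process; if it dies mid-event, the raw data
    // is still intact here and can be flagged and shipped to the online system.
    ev->flags = 0;

    UnmapViewOfFile(ev);
    CloseHandle(h);
    return 0;
}
```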
Physics Filter Interface
• ScriptRunner: a framework that runs physics tools and filters to make the actual physics decision
• Cross-platform code (NT, Linux, IRIX, OSF??)
• Managed by the L3 Filters Group
[Diagram: the L3 filter process hosts ScriptRunner behind the L3 framework interface.]
L3 Supervisor
• Manages configuration of the DAQ/trigger farm (about 60 nodes)
• Command planning: the online system will send Level 3 simple commands; L3 must translate them into the specific commands to each node to achieve the online system's requests
• Supports multiple runs: partitioning of the L3 farm
• Node crash and recovery; generic error recovery, with minimal impact on running
Error Logging & Monitoring
• Error logging: the L3 Filters group will use the ZOOM ErrorLogger, and we have adopted a consistent set of standards for reporting errors. A plug-in module gets the errors off the Level 3 nodes; they are sent to the monitor process for local relay to the online system. Logfiles are written in a standard format; we are trying to agree with the online group to make this standard across all online components. (A sketch of the relay plug-in idea follows.)
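To make the plug-in idea concrete, here is a sketch of a destination module that relays error messages off a node to the monitor process. The interface, severity levels, and pipe-delimited format are hypothetical stand-ins (the real logger is the ZOOM ErrorLogger); stdout stands in for the network hop.

```cpp
// Hypothetical plug-in destination that forwards errors to the monitor process.
#include <string>
#include <iostream>

enum class Severity { Info, Warning, Error, Severe };

// The logger would call log() once per reported message.
struct ErrorDestination {
    virtual ~ErrorDestination() = default;
    virtual void log(Severity s, const std::string& module,
                     const std::string& text) = 0;
};

// Relay destination: format in an agreed standard form and forward.
struct MonitorRelay : ErrorDestination {
    void log(Severity s, const std::string& module,
             const std::string& text) override {
        std::cout << "L3NODE|" << static_cast<int>(s) << '|' << module
                  << '|' << text << '\n';   // illustrative standard format
    }
};

int main() {
    MonitorRelay relay;
    relay.log(Severity::Error, "VRC", "lost FCI sync on loop 3");
    return 0;
}
```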
• Monitoring: non-critical information (event counters, buffer occupancy, etc.). Variables are declared and mapped to shared memory; a slow repeater process copies the data to the monitor process.
DAQ/Trigger Integration
• Sits between the hardware and the online system.
• The interface is minimal: data output, control, and monitor information.
• The implications are not minimal.
[Diagram: trigger and readout. The detector and the L1/L2 TCC feed readout crates in the DAQ system; data cables carry events to the NT Level 3 nodes, which forward them over Ethernet to the collector/router, data logger, and RIP on the UNIX host, with disk and FCC downstream. COOR and the monitor on the UNIX host talk to the L3 Supervisor and L3 Monitor over Ethernet; DAQ and detector consoles hang off the online system.]
Software Integration
• Integration outside of Level 3 software: integration with offline (where we meet)
• Level 3 filter: must run the same offline and online (Doom/dspack)
• Control & monitor communication uses the ITC package, the online group's standard communications package
• Requires offline-like code releases built on the online system
NT Releases
• The build is controlled by SoftRelTools: hundreds of source files, so a build system is required, but it is UNIX-centric (offline) and maintaining two build systems is too much work.
• SRT2 NT integration is done; SRT2 is the build system. This set us back several months, as there was no assigned person.
• Li (an NIU MS student) is building NT releases now. Just starting, with a small DAQ-only release: DSPACK + friends, itc, thread_util, l3base. The next step is to build a 30+ package release: everything we had in the nt00.12.00 release.
NT Releases
• Progress is slow: the build system is still in flux!
• What does it affect? ScriptRunner + filters + tools + ITC; the 10% test right now; our ability to test the system now (dummy versions of the ScriptRunner interface).
• Regular NT trigger releases must occur by March 15, 2000: muon filtering in L3 is one of the commissioning milestones.
Scheduling
• Conflicting requirements: must be continuously available starting now, and must upgrade & integrate the final hardware as it arrives.
• Software is impacted: must upgrade in tandem, without disturbing the running system.
• Tactic: Version 0 of the software, upgraded adiabatically. The interface to internal components remains similar; the interface to the online system does not change. Test stand at Brown University for testing.
VRC Interface
[Diagram: the VRC node runs the VRC program, with control and disk, talking to the L3 Supervisor and L3 Monitor over DCOM. Embedded systems take VBD data cables in at 50 MB/s each; the FCI carries 100 MB/s in from the last SB and 100 MB/s out to the 1st SB.]
VRC Interface (V0)
[Diagram: the V0 VRC node runs the VRC program, with control and disk, talking to the L3 Supervisor and L3 Monitor over DCOM at 100 Mb/s. VBD data cables come in at 50 MB/s each into two VME/MPM modules, which feed the SB/ETG.]
November 1, 1999
• Read raw data from the FEC into the VRC
• Send raw data in offline format to the online system
• Control via COOR: held up by NT releases
[Status diagram (done / started / not started): COOR → L3 Supervisor over ITC and DCOM; detector/VRC → collector/router over ITC; autostart utility over DCOM; FEC readout about 50% done.]
February 15, 2000
• Multicrate readout
• Internal communication done via ACE (already implemented; a sketch follows)
[Status diagram (done / started / not started): COOR → L3 Supervisor (ITC, DCOM); detector/VRC and SB/ETG feeding the L3 farm node via ACE; collector/router (ITC); autostart utility (DCOM); completion estimates from 25% to 75%, with the FEC at 50%.]
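Since the internal VRC/SB/ETG/farm-node links are built on ACE, a minimal sketch of one point-to-point ACE connection may help; the host name, port, and command token are hypothetical, not the system's actual protocol.

```cpp
// Hypothetical control-path client using ACE's socket wrappers.
#include <ace/SOCK_Connector.h>
#include <ace/SOCK_Stream.h>
#include <ace/INET_Addr.h>

int main() {
    ACE_INET_Addr peer(9100, "l3node01");    // illustrative node and port
    ACE_SOCK_Connector connector;
    ACE_SOCK_Stream stream;

    if (connector.connect(stream, peer) == -1)
        return 1;                            // peer not up yet; caller would retry

    const char cmd[] = "CONFIGURE";          // illustrative command token
    stream.send_n(cmd, sizeof(cmd));         // send_n loops until all bytes are sent
    stream.close();
    return 0;
}
```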
March 15, 2000
• Muon filtering in Level 3
• The ScriptRunner interface must be up
• NT releases must be regular
[Status diagram (done / started / not started): same components as above; completion estimates from 20% to 50%.]
May 1, 2000
• Multistream readout
• Ability to partition the L3 farm
• Multiple simultaneous runs
• Route events by trigger bits
• ScriptRunner does output streams
[Status diagram (done / started / not started): same components as above; completion estimates from 10% to 45%.]
Test Stands
• Detector subsystems have individual setups. These let them test readout with the final configuration, and let us test our software early: high-speed running and stress tests for the DAQ software.
• Subsystems have some unique requirements (necessary for error-rate checking in the Si, for example), which means separate software development branches. We attempt to keep these as close as possible to the final L3 design to avoid support headaches.
Test Stands
• Three test stands are currently in operation.
• Brown test stand: tests hardware prototypes; primary software development.
• Silicon test stand: an L3 node directly reads out a front-end crate, helping us and the Si folks test readout, perform debugging, and make system improvements.
• CFT test stand: instrumented and ready to take data (missing one tracking board (VRBC) to control readout).
10% Test
• The Si test stand will evolve into a full-blown readout: the 10% test, a single-barrel readout. Requires a full L3 node.
• Tests out the silicon filter code: ScriptRunner, trigger tools, etc. NT releases must be up to speed for this.
• This is in progress as we speak; the ScriptRunner components are held up by NT releases.
People
• Joint effort: Brown University and the University of Washington, Seattle.
• People: Gennady Briskin (Brown), Dave Cutts (Brown), Sean Mattingly (Brown), Gordon Watts (UW), plus one post-doc from UW and students (>1).
Tasks
• VRC: simple; once done it will require few modifications. (1/4 FTE)
• SB: simple; once done it will require few modifications (very similar to the VRC). (1/4)
• ETG: complex initialization required; the hardware interface is not well understood yet, so it requires little work now. By the time the VRC work ramps down, this will ramp up. (1/2)
• Farm node: a large amount of work is left to do on communication with the supervisor and with ScriptRunner; will require continuous work as the system gains in complexity. (3/4)
Tasks
• L3 Supervisor: complex communication with COOR; started, but will require continuous upgrades as the system develops in complexity. (1/2)
• Monitoring: initial work done by undergraduates; it still has to be interfaced to the outside world, and no one is working on it at the moment. (1/4)
• NT releases: being offloaded to an NIU student; requires continuous work and interfacing with many different software developers. (1)
• L3 filter integration: done by hand now; will have to be made automatic, taking advantage of offline tools. (1/2)
Conclusions
• NT releases have been the biggest delay. Keeping up with the offline changes requires constant vigilance, so we are offloading this task to a dedicated person. The delay impacts the 10% test and the March 15 milestone.
• The group is the correct size to handle the task (continuous operation and integrating the new hardware with the software), as long as this group isn't also responsible for releases.
• Current weak points: monitoring, and integration with the online system (log files, error messages, etc.).
L3 Farm Node
[Diagram: each farm node has a DMA-capable VME-PCI bridge reading two MPMs in a VME crate, each at 48 MB/s, into shared-memory buffers managed by the L3 node framework. The framework comprises a node-VME I/O module, a control/monitoring/error module, an L3 filter interface module feeding the L3 filter processes, and a collector/router module with dedicated 100 Mbit/s Ethernet to the online collector/router.]
• A prototype of the framework is finished and runs in the Silicon test stand.
• A second version will be finished by Jan 1, with improved speed, better interaction between processes, a new interface, and better stability.
Details of the Filter Node
• MPM reader: gets a pointer to an event buffer, configures the MPMs for receiving a new event, waits until the complete event arrives in the MPM, loads the event data into a shared-memory buffer, and inserts the event pointer into the next queue.
• Event validation (validation queue): FEC presence validation and checksum validation.
• L3 filter input interface (pool queue, filter queue): process interface into the L3 filter process.
• L3 filter output interface (output events queue, output pool queue): process interface back out of the filter.
• Collector/router network interface: determines where each event should be sent and sends it to the collector/router node; data goes to the online host system.
• The L3 supervisor, monitor, and error interfaces talk through command/monitor/error shared memory; event data lives in event-buffer shared memory. (A sketch of the queue pipeline follows.)
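As a sketch of how events flow between these stages, here is a toy version of the queue pipeline under stated assumptions: the Event type, the additive checksum, and the plain std::queue stand in for the real shared-memory queues.

```cpp
// Toy pipeline: reader -> validation -> filter -> output queues.
#include <cstdint>
#include <cstddef>
#include <queue>

struct Event {
    std::uint32_t checksum;   // checksum carried with the raw data
    std::size_t   size;
    const char*   data;       // would point into an event-buffer shared-memory slot
};

// Illustrative check of the kind the validation stage performs.
static bool checksum_ok(const Event& ev) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < ev.size; ++i)
        sum += static_cast<unsigned char>(ev.data[i]);
    return sum == ev.checksum;
}

int main() {
    std::queue<Event> validation, filter, output;
    // The MPM reader stage would push complete events into `validation` here.
    while (!validation.empty()) {
        Event ev = validation.front(); validation.pop();
        if (checksum_ok(ev))
            filter.push(ev);  // hand off to the L3 filter input interface
        // a failing event would be flagged and shipped online for debugging
    }
    while (!filter.empty()) { output.push(filter.front()); filter.pop(); }
    return 0;
}
```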
L3 Supervisor Interface
• Receives and interprets COOR commands and turns them into internal state objects (a sketch follows).
• The next step is communication to the clients: VRC/ETG/SB/L3 node.
[Diagram: COOR in the online system sends commands and configuration requests to the supervisor's COOR command interface. The desired configuration goes through the resource allocator, which consults the current-configuration database; the command generator/sequencer then issues direct commands to the clients (L3 nodes, etc.).]
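A sketch of the command-planning step the slide describes, with all names hypothetical: a simple COOR-style request becomes an internal state object, which is then fanned out as specific per-node commands.

```cpp
// Hypothetical translation of a simple COOR command into per-client commands.
#include <vector>
#include <iostream>

struct RunConfig {                 // internal state object for one run
    int              run_number;
    std::vector<int> node_ids;     // farm nodes allotted to this run
};

// The resource allocator would consult the current-configuration DB;
// this stand-in just hands out the first n nodes.
static RunConfig allocate(int run, int n_nodes) {
    RunConfig cfg{run, {}};
    for (int i = 0; i < n_nodes; ++i) cfg.node_ids.push_back(i);
    return cfg;
}

int main() {
    // "start run 12345 on 4 nodes" arrives as one simple command from COOR...
    RunConfig cfg = allocate(12345, 4);
    // ...and the command generator/sequencer fans it out to each client.
    for (int node : cfg.node_ids)
        std::cout << "node " << node << ": CONFIGURE run=" << cfg.run_number << '\n';
    return 0;
}
```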
Auto Start System
• Designed to automatically start after a cold boot and bring a client to a known idle state (see the sketch below).
• Also manages software distribution.
[Diagram: the auto start service on each client machine gets its package list from the configuration database, installs packages from the package database, and runs them; it also accepts requests to change packages, get status, reboot, etc.]
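A sketch of the cold-boot sequence under stated assumptions; the package type, database stand-ins, and client name are all hypothetical.

```cpp
// Hypothetical cold-boot loop for the auto start service.
#include <string>
#include <vector>
#include <iostream>

struct Package { std::string name; bool installed; };

// Stand-ins for the configuration and package databases.
static std::vector<Package> get_package_list(const std::string& /*client*/) {
    return { {"vrc_control", true}, {"monitor_repeater", false} };
}
static void install(Package& p) { p.installed = true; }
static void start(const Package& p) { std::cout << "started " << p.name << '\n'; }

int main() {
    // After a cold boot, bring this node to its known idle state.
    std::vector<Package> pkgs = get_package_list("l3node01");
    for (Package& p : pkgs) {
        if (!p.installed) install(p);   // software-distribution step
        start(p);
    }
    return 0;  // the service would then wait for change/status/reboot requests
}
```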
Timeline
[Schedule chart, Jan 2000 through Feb 2001: ICD; FPS; Lum (L0); FT install/hookup; 1/2 VLPCs installed; 1st CFT crate operational; waveguide production; all VLPCs installed; SMT install/hookup; beam-ready; forward MDT & pixel planes install/survey (A&B, then C); CC cosmics; ECN cosmics; assemble/install/align EMC; end toroids installed; hookup ECS; install/checkout CAL BLS; install final CFT electronics; install tracking front-end; roll in; remove shield wall; DAQ available. Cosmic ray commissioning Phase I: central muon, DAQ, RECO, trigger, tracker front-end; Phase II: fiber tracker, preshowers, VLPCs, CFT, forward muon; Phase III: full cosmic ray run (add TRIG, SMT, CAL). 1st collaboration commissioning milestone: Feb 15, 2000. Run II begins: Mar 1, 2001.]
Silicon Test Display
[Screenshot: master GUI with monitor counters, a raw-data viewer, and CPU1/CPU2 panels.]
Monitoring
• Non-essential information, helpful for debugging. Two sources of information:
• Level 3 physics trigger info: accept rates, filter timing. The framework ships a binary block of data out to a concentrator (1 of 50), which combines it and re-presents it so others can read it without impacting the system.
• Framework items: event counters, node state.
Monitoring
• Framework items use a shared-memory scheme: framework processes write their items into shared memory, and a slow retransmitter process sends the data to the rest of the world over TCP/IP (ACE).
• Rest of world: requests particular items and an update frequency.
• Framework process: saves the name, type, and data of each monitor item. The data type is arbitrary; this is implemented with template classes (see the sketch below).
• NT-native now, soon ACE. Try to reuse online software.
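A sketch of what such a template monitor item could look like; the class name, registry, and item naming are hypothetical, and a static map stands in for the shared-memory segment so the example is self-contained.

```cpp
// Hypothetical template monitor item: named, typed, readable by name.
#include <string>
#include <map>
#include <iostream>

template <typename T>
class MonitorItem {
public:
    MonitorItem(const std::string& name, T initial)
        : name_(name), value_(initial) { registry()[name_] = this; }

    MonitorItem& operator=(T v) { value_ = v; return *this; }  // framework updates

    // The retransmitter side reads by name, without touching the framework.
    static T read(const std::string& name) { return registry().at(name)->value_; }

private:
    // In the real scheme this registry would live in shared memory.
    static std::map<std::string, MonitorItem*>& registry() {
        static std::map<std::string, MonitorItem*> r;
        return r;
    }
    std::string name_;
    T value_;
};

int main() {
    MonitorItem<long> events("node01/event_count", 0);  // declared & mapped
    events = 42;                                        // updated in the event loop
    std::cout << MonitorItem<long>::read("node01/event_count") << '\n';
    return 0;
}
```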