Amedeo Perazzo

Online Computing Resources [email protected]

November 12th, 2008 1

Computing Resources for DAQ & Online

Amedeo Perazzo

Photon Controls and Data Systems Online Manager

SLAC, November 12th 2008

LCLS FAC - Controls Breakout Session

Contents

Network organization

Services

Machines which provide services to user workstations, DAQ & control nodes

Users

Development, internet access

DAQ

Machines used for moving data from FEE to online data cache

Controls

Machines used for controlling all slow devices

Data Cache

Temporary storage for online data analysis and as buffer between DAQ and offline

Network Zones

Photon Controls & Data Systems (PCDS) network organized in 2 zones

Back-end: provides networking services to the PCDS enclave

Front-end: control and data acquisition traffic

Network organization driven by new DOE security rules

Back End Zone: provides the infrastructure for all service traffic

Divided into six subnets:

dmz: limited access from SLAC machines

usr: user development and Internet access nodes

service: NFS, DNS, NTP and AAA server nodes

mgmt: utility nodes (terminal servers, shelf managers, etc.)

cds: service subnet for Control & DAQ nodes

dss: service subnet for the Data Storage machines

Must allow Control & DAQ to remain operational, for a limited amount of time, when the connection to the SLAC domain is down
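The back-end layout can be written down as a small lookup table. This is only a minimal sketch: the subnet names and roles are taken from the slide, while the `BACK_END_SUBNETS` dictionary and the `subnets_for` helper are hypothetical names introduced for illustration and are not part of any PCDS tool.

```python
# Back-end zone subnets and their roles, as listed on the slide.
# The dictionary name and the helper below are illustrative assumptions.
BACK_END_SUBNETS = {
    "dmz": "limited access from SLAC machines",
    "usr": "user development and Internet access nodes",
    "service": "NFS, DNS, NTP and AAA server nodes",
    "mgmt": "utility nodes (terminal servers, shelf managers, etc.)",
    "cds": "service subnet for Control & DAQ nodes",
    "dss": "service subnet for the Data Storage machines",
}


def subnets_for(keyword):
    """Return the sorted subnet names whose role mentions the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return sorted(
        name for name, role in BACK_END_SUBNETS.items() if keyword in role.lower()
    )
```

For example, `subnets_for("DAQ")` selects the `cds` subnet, since that is the only role description mentioning DAQ.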

Back End Zone Diagram

[Diagram: SLAC Domain and Accelerator Domain; back-end subnets DMZ, User, Service (NFS, DNS, NTP, AAA), MGMT, CDS (Control & DAQ nodes) and DSS (data cache machines); Front End Zone; traffic types shown: science bulk data and service traffic]

Front End Zone

Front End Zone: provides the infrastructure for the control traffic and the data acquisition traffic

Divided into three networks:

daq: science data, partition management, run monitoring and telemetry traffic

DAQ operator consoles (L0), readout nodes (L1), processing nodes (L2) and data cache machines (L3)

epics: control traffic

control operator consoles (E0), IOCs (E1), EPICS archiver (E3) and channel access gateway

bld: low latency 120 Hz beam-line data traffic

DAQ network further subdivided into experiment-specific subnets:

daq_amo, daq_xpp, daq_cxi, etc.

EPICS network further subdivided into:

epics_amo, epics_xpp, epics_mcc, epics_xtod, etc.

Selected EPICS traffic may be exchanged between the different subnets via the channel access gateway
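The subnet names follow a simple prefix-plus-area pattern. A minimal sketch of that naming convention is given here; the area lists contain only the names shown on the slide (which ends both lists with "etc."), and the `subnet_names` helper is a hypothetical name used for illustration.

```python
# Areas named on the slide; both lists are incomplete in the source.
DAQ_AREAS = ["amo", "xpp", "cxi"]
EPICS_AREAS = ["amo", "xpp", "mcc", "xtod"]


def subnet_names(network, areas):
    """Build names such as 'daq_amo' from a network prefix and a list of areas."""
    return [f"{network}_{area}" for area in areas]
```

Calling `subnet_names("epics", EPICS_AREAS)` reproduces the four EPICS subnet names listed on the slide.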

Network Devices for Services

CISCO Catalyst 6509

720 Gbps switch fabric backplane

9 slots (currently: 1 supervisor, 1 x 48 1Gb RJ45, 1 x 24 SFPs)

Controls

CISCO 3560

48 1Gb RJ45 + 4 SFPs

Motorola MVME6100

MPC7457 PPC 1.2 GHz

DELL R200 1U server

2 cores, 1 PCIe, 1 PCI-X

ELMA VME chassis

Redundant PWS, shelf manager

4, 8 and 21 slots

Network Devices for DAQ

Cluster Interconnect Module (CIM) SLAC custom made ATCA switch

Based on two 24-port 10Gb Ethernet switch ASICs from Fulcrum

Up to 480 Gb/s total bandwidth

Fully managed layer-2, cut-through switch
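The 480 Gb/s figure is consistent with summing the port rates of both ASICs (2 x 24 x 10 Gb/s). The slide does not state how the number was derived, so the following is an assumed reading, with hypothetical constant and function names.

```python
# Values from the slide; the interpretation (a plain sum over all ports)
# is an assumption, not something the slide states.
ASICS = 2
PORTS_PER_ASIC = 24
GBPS_PER_PORT = 10


def aggregate_bandwidth_gbps(asics=ASICS, ports=PORTS_PER_ASIC, rate=GBPS_PER_PORT):
    """Sum of the nominal port rates across all switch ASICs, in Gb/s."""
    return asics * ports * rate
```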

ATCA chassis

5 slots, shelf manager

14 slots, shelf manager

DAQ Nodes

SLAC ATCA RCE boards

2 cores, 8 PGP lanes, 2x10Gb/s Ethernet fabric interface, 2x1Gb/s Ethernet base interface

ATCA blades

8 cores, 16GB, dual 10Gb/s Ethernet fabric interface, dual 1Gb/s Ethernet base interface

2 AMC slots

cPCI Concurrent PP512

4 cores, 4GB, 3x1Gb/s Ethernet interfaces

2 PMC/XMC slots

Servers

Services

NFS, DNS, LDAP, NTP, AAA, logging, etc

Supermicro 2U (8 cores, 32GB, 1TB SATA, redundant PWS)

Hot spare with automatic failover

EPICS

CAG, EPICS DB, EPICS archiver, etc

Supermicro 2U (8 cores, 32GB, 1TB SATA, 8 NICs, redundant PWS)

Hot spare with automatic failover

Data Cache

RAID Server

Supermicro 4U (8 cores, 32GB, 1TB SATA, 24TB SAS through hardware RAID SAS controller, redundant PWS)

In the beginning 2 servers are envisioned

When one server is writing, the other is reading
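The alternating write/read arrangement can be sketched as a pair of roles that swap. This is a minimal illustration under stated assumptions: the slide does not say when or how the roles change, and the `DataCachePair` class and the server names are hypothetical.

```python
class DataCachePair:
    """Two cache servers that alternate roles: one writes while the other reads.

    Illustrative only; the trigger for swapping roles is not given in the source.
    """

    def __init__(self, names=("server_a", "server_b")):
        self.names = tuple(names)
        self.writer_index = 0  # index of the server currently receiving DAQ data

    @property
    def writer(self):
        return self.names[self.writer_index]

    @property
    def reader(self):
        return self.names[1 - self.writer_index]

    def swap(self):
        """Exchange the writer and reader roles."""
        self.writer_index = 1 - self.writer_index
```

After each call to `swap()`, the server that was being filled by the DAQ becomes the one read for analysis or transfer, and vice versa.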

Users

Consoles

DELL OptiPlex with dual 20” monitor

Development

DELL Precision with 20” monitor

Public Networks

Not part of the PCDS enclave

Desktops

Part of SLAC network

Two CISCO 3750 in NEH telecom room

Dedicated redundant 1Gb/s link between telecom room and SCCS

Network devices maintained by SCCS

Wireless

Part of SLAC visitor network

3 access points in basement floor, 3 in sub-basement, 2 in FEE

Connected to switch in NEH telecom room

Maintained by SCCS

Status

Built prototype for NEH online computing in bldg 84 lab

Use same routing and firewall rules as for NEH

Created/tested set of tools which will be used in NEH for:

Maintaining service machines (NFS, NTP, DNS, AAA servers, etc.)

Synchronization with SCCS, backup, patching, logging

Maintaining utility nodes (shelf managers, terminal servers, etc.)

Managing connections inside and outside PCDS enclave

Test-stand setup allows small scale testing of feasibility of network plan

Will speed up commissioning in NEH

Manpower for online computing effort during 2008: 1.5 FTE

Shared among 2 PCDS people and 2 SCCS people

Similar effort expected for 2009

Expenses for 2008 in line with expectations

Enough resources allocated for 2009

Big-ticket item: L2 event building and processing farm

Computational Alignment

[Figure labels: k_in, k_out, q]

“The number currently used to obtain high-resolution structures of specimens prepared as 2D crystals, is estimated to require at least 10^17 floating-point operations” R. M. Glaeser, J. Struct. Biol. 128 (1999)

Experimental Data (ALS)

Difference of pyramid diffraction patterns 10° apart, Gösta Huldt, U. Uppsala

“Computational Alignment” requires large computational power that might only be available through offline analysis

Save first and analyze later?

To be investigated
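To get a feel for the scale of the 10^17 floating-point operations quoted above, the wall-clock time can be estimated for a given sustained processing rate. The rate used in the usage note is a purely hypothetical round number, not a figure from the presentation, and `hours_needed` is an illustrative helper name.

```python
# Operation count from the Glaeser quote on this slide.
REQUIRED_FLOP = 1e17


def hours_needed(sustained_flops):
    """Hours needed to perform REQUIRED_FLOP operations at a sustained rate (FLOP/s)."""
    if sustained_flops <= 0:
        raise ValueError("sustained_flops must be positive")
    return REQUIRED_FLOP / sustained_flops / 3600.0
```

At an assumed sustained rate of 1e12 FLOP/s, `hours_needed(1e12)` comes to roughly 28 hours, which illustrates why a save-first, analyze-later approach is being considered.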

Real-time Processing – Sorting in CXI

• Diffraction from a single molecule:

single LCLS pulse

noisy diffraction pattern of unknown orientation

• Combine 10^5 to 10^7 measurements into 3D dataset:

Classify/sort

Average

Alignment

Reconstruct by oversampling phase retrieval

Miao, Hodgson, Sayre, PNAS 98 (2001)

Gösta Huldt, Abraham Szöke, Janos Hajdu (J.Struct Biol, 2003 02-ERD-047)

The highest achievable resolution is limited by the ability to group patterns of similar orientation

Real-time?
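The classify/sort and average steps can be illustrated with a toy example. The sketch below is not the method of the cited authors: it swaps in a simple greedy grouping by normalized correlation, operates on flattened patterns of equal length, and uses a hypothetical `threshold` parameter and hypothetical function names. It only shows the idea of grouping similar patterns and averaging each group to reduce noise.

```python
import math


def normalized_correlation(a, b):
    """Pearson-style correlation between two flattened patterns of equal length."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    norm = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if norm == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / norm


def classify_and_average(patterns, threshold=0.9):
    """Greedily assign each pattern to the first class whose running average
    correlates with it at or above `threshold`; otherwise start a new class.
    Returns a list of (class_average, member_indices) tuples."""
    classes = []  # each entry: {"sum": element-wise sum, "members": indices}
    for index, pattern in enumerate(patterns):
        for cls in classes:
            count = len(cls["members"])
            average = [s / count for s in cls["sum"]]
            if normalized_correlation(pattern, average) >= threshold:
                cls["sum"] = [s + p for s, p in zip(cls["sum"], pattern)]
                cls["members"].append(index)
                break
        else:
            classes.append({"sum": list(pattern), "members": [index]})
    return [
        ([s / len(c["members"]) for s in c["sum"]], c["members"])
        for c in classes
    ]
```

With 10^5 to 10^7 patterns, the cost of comparing every pattern against every class average grows quickly, which is one way to see why the slide asks whether this can be done in real time.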