The SKA LOW correlator design challenges
CSIRO ASTRONOMY AND SPACE SCIENCE
John Bunton | CSP System Engineer
C4SKA, Auckland, 9-10 February, 2017


SKA1 Low antenna station (Australia)


Station beamforming part of Receptor Sub-Element (LFAA)

Presentation Notes
We’re planning to have 140 thousand radio detectors in Australia during phase 1 of the project and 1 million in phase two.

Low Stations
Low Frequency Aperture Array - LFAA
LFAA has 512 stations, maximum baseline less than 65 km
• Distributed between 1-16 subarrays

Each station has 256 dual-polarisation log-periodic antennas.
• Frequency band 50-350 MHz

Signal is sent using RF-over-Fibre to the digitiser/beamformer.

Output to CSP_Low
• 384 “coarse” channels per station (781 kHz each, 300 MHz total)
• Channels can have any of 8 different look directions.
• Stations can be in any of 16 subarrays
• Up to 128 look directions!
• 5.8 Tbps of data
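The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the channel count, channel width, look directions and subarray count come from the slide, while the sample format (8-bit real plus 8-bit imaginary) and the 32/27 oversampling factor are assumptions that are not stated on the slide.

```python
# Figures quoted on the slide.
N_STATIONS = 512
N_COARSE_CHANNELS = 384
COARSE_CHANNEL_HZ = 781.25e3       # 300 MHz / 384; the slide rounds to 781 kHz
LOOK_DIRECTIONS_PER_SUBARRAY = 8
MAX_SUBARRAYS = 16

# ASSUMPTIONS (not on the slide): dual polarisation samples carried as
# 8-bit real + 8-bit imaginary, and coarse channels oversampled by 32/27.
N_POL = 2
BITS_PER_COMPLEX_SAMPLE = 16
OVERSAMPLING = 32 / 27

total_bandwidth_hz = N_COARSE_CHANNELS * COARSE_CHANNEL_HZ
max_look_directions = LOOK_DIRECTIONS_PER_SUBARRAY * MAX_SUBARRAYS
data_rate_bps = (N_STATIONS * N_COARSE_CHANNELS * COARSE_CHANNEL_HZ
                 * OVERSAMPLING * N_POL * BITS_PER_COMPLEX_SAMPLE)

print(f"total bandwidth : {total_bandwidth_hz / 1e6:.0f} MHz")
print(f"look directions : {max_look_directions}")
print(f"data rate       : {data_rate_bps / 1e12:.2f} Tbps (under the assumptions above)")
```

Under these assumptions the computed rate lands close to the 5.8 Tbps quoted on the slide.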


Central Signal Processing

Central Signal Processing (CSP) is tasked to take the data from LFAA and produce
• visibility data and
• pulsar products.

CSP is divided into 4 sub-elements
• Correlator and beamformer
• Pulsar timing
• Pulsar search and
• Local monitor and control

The Correlator and beamformer (CBF) work package is done by CSIRO (Australia), ASTRON (Netherlands) and AUT (New Zealand)


Correlator – “standard” mode

Low correlator is full Stokes (all polarisation parameters)

Low: 524,800 correlations per frequency channel

Up to 65k channels across the band

Low: 34G correlations per dump, 0.25 s dumps, 11.0 Tbps output

Note: originally 0.9 s dumps

Compute load in the correlator (one correlation accumulate is equivalent to 8 Flop)

Low: 524,800 correlations at 0.3 GHz = 1.26 Petaflops

Other processing within CSP is similar in compute load
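The correlator numbers above follow from simple counting. The sketch below reproduces them, assuming that each polarisation of each station is treated as one correlator input, that autocorrelations are included in the pair count, and that "65k" means 65,536 channels. The bits-per-correlation figure is only what the quoted output rate implies, not a number from the slide.

```python
N_STATIONS = 512
N_POL = 2
N_CHANNELS = 65_536            # ASSUMPTION: "65k" read as 2**16
DUMP_TIME_S = 0.25
BANDWIDTH_HZ = 0.3e9
FLOP_PER_CORRELATION = 8       # from the slide
OUTPUT_RATE_BPS = 11.0e12      # from the slide


def correlations_per_channel(n_stations, n_pol=N_POL):
    """All pairs of inputs, including each input with itself."""
    n_inputs = n_stations * n_pol
    return n_inputs * (n_inputs + 1) // 2


per_channel = correlations_per_channel(N_STATIONS)
per_dump = per_channel * N_CHANNELS
compute_flops = per_channel * BANDWIDTH_HZ * FLOP_PER_CORRELATION
implied_bits_per_correlation = OUTPUT_RATE_BPS * DUMP_TIME_S / per_dump

print(f"correlations per channel : {per_channel:,}")
print(f"correlations per dump    : {per_dump / 1e9:.1f} G")
print(f"compute load             : {compute_flops / 1e15:.2f} PFlops")
print(f"implied bits/correlation : {implied_bits_per_correlation:.0f}")
```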


                           MWA  LOFAR  JVLA  ASKAP  ALMA  SKA1_LOW  SKA1_MID
Correlator TFLOPS            8     19   131    224   746      1258      2484
Input data rate Tbits/sec 0.08   0.34   3.1   12.4  10.4       5.8      29.9
Output data rate Gbit/sec    3     97     1     20    48     10995      2543


Frequency resolution

Frequency resolution for “standard” observing
• 4.6 kHz across 300 MHz
• 52k frequency channels

Or Zoom Mode - 4 bands
• Zoom band bandwidth
• each 4, 8, 16, 32, 64, 128 or 256 MHz
• Note: originally just 4, 8, 16, 32 MHz
• Zoom band centre frequency – anywhere in observing band.
• Band overlap allowed
• An LFAA frequency channel can be in any zoom band
• 16k output frequency channels per zoom band
• resolutions 0.23 to 14.5 kHz


Requirement churn

A recent Engineering Change Proposal allows an exchange of bandwidth for number of inputs.

Each of the 512 LFAA stations can be configured to sum subsets of the 256 antennas
• Example: over 150 MHz, sum two separate sets of 128 antennas
– Looks like two smaller stations (substations), each with 150 MHz bandwidth.
– Input to correlator is now 1024 stations; to keep the total number of correlations constant, bandwidth is reduced to 75 MHz.

Must design so it is possible to accommodate 1024 or 2048 substations

Other major recent changes
• Added zoom modes 64-256 MHz. Required a major design change.
• Decrease in integration time – major increase in output data rate

Design must be capable of adapting to requirement change.
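The bandwidth-for-inputs exchange described on this slide can be sketched numerically. Because the number of correlations grows roughly with the square of the number of inputs, doubling the stations means quartering the bandwidth. The function names are hypothetical, and the 2048-substation bandwidth is derived from that scaling rather than quoted on the slide.

```python
BASE_STATIONS = 512
BASE_BANDWIDTH_MHZ = 300.0


def correlations(n_stations, n_pol=2):
    """Pairs of inputs, autocorrelations included (an assumption)."""
    n_inputs = n_stations * n_pol
    return n_inputs * (n_inputs + 1) // 2


def bandwidth_for(n_stations):
    """Bandwidth that keeps the correlator load roughly constant (N^2 scaling)."""
    return BASE_BANDWIDTH_MHZ * (BASE_STATIONS / n_stations) ** 2


base_load = correlations(BASE_STATIONS) * BASE_BANDWIDTH_MHZ
for n in (512, 1024, 2048):
    bw = bandwidth_for(n)
    relative_load = correlations(n) * bw / base_load
    print(f"{n:5d} stations -> {bw:6.2f} MHz, relative load {relative_load:.4f}")
```

The relative load stays within about 0.1% of the 512-station case, which is why the slide can treat the total number of correlations as constant.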


Station based processing

FX correlator implemented – channelise data to final resolution before correlation
• 8 different frequency resolutions (226 Hz to 14.5 kHz) - 8 filterbanks ???
• Finest zoom mode: 4096 channel filterbank
– AH HA! Implement finest zoom and integrate in frequency for the rest
– Integration of 1, 2, 4, 8, 16, 24, 32 or 64 channels covers all resolutions
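The "one filterbank, then integrate in frequency" idea can be illustrated as follows. This is a minimal sketch: the function name and the toy input are hypothetical, and the values being summed stand for whatever per-channel quantity the design accumulates (for example correlation products). Only the 226 Hz fine resolution and the eight integration factors come from the slide.

```python
FINE_RESOLUTION_HZ = 226.0
INTEGRATION_FACTORS = (1, 2, 4, 8, 16, 24, 32, 64)


def integrate_in_frequency(fine_channels, factor):
    """Sum groups of `factor` adjacent fine channels into one output channel."""
    if len(fine_channels) % factor:
        raise ValueError("channel count must be a multiple of the factor")
    return [sum(fine_channels[i:i + factor])
            for i in range(0, len(fine_channels), factor)]


# One filterbank at the finest resolution yields all eight output resolutions.
resolutions_khz = [FINE_RESOLUTION_HZ * f / 1e3 for f in INTEGRATION_FACTORS]
for factor, res in zip(INTEGRATION_FACTORS, resolutions_khz):
    print(f"integrate {factor:2d} channels -> {res:6.3f} kHz")

# Toy example: six fine channels integrated in pairs.
coarse = integrate_in_frequency([1, 2, 3, 4, 5, 6], 2)
print(coarse)
```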

Relative delay of astronomical signal to stations varies with time
• Must be removed
• Implemented as sample delay correction (coarse) and phase slope across filterbank channels.
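The two-stage delay correction above can be sketched as a split of the total delay into a whole number of samples plus a residual, with the residual removed by a phase that varies linearly with frequency across the filterbank channels. The function names, the 1 µs sample period and the channel frequencies are hypothetical values chosen for illustration.

```python
import cmath
import math


def split_delay(delay_s, sample_period_s):
    """Return (whole samples to shift, residual delay in seconds)."""
    coarse_samples = round(delay_s / sample_period_s)
    residual_s = delay_s - coarse_samples * sample_period_s
    return coarse_samples, residual_s


def phase_slope(residual_s, channel_freqs_hz):
    """Per-channel phasors that undo a residual delay (linear phase vs frequency)."""
    return [cmath.exp(2j * math.pi * f * residual_s) for f in channel_freqs_hz]


# Hypothetical example: 2.3 us delay, 1 us sample period, four fine channels.
coarse_samples, residual_s = split_delay(2.3e-6, 1.0e-6)
channel_freqs_hz = [-339.0, -113.0, 113.0, 339.0]
corrections = phase_slope(residual_s, channel_freqs_hz)

# A signal delayed by residual_s carries phase exp(-2j*pi*f*residual_s) in each
# channel; multiplying by the correction brings every channel back to phase 0.
corrected = [cmath.exp(-2j * math.pi * f * residual_s) * c
             for f, c in zip(channel_freqs_hz, corrections)]
print(coarse_samples, residual_s, corrected)
```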

RFI flagging – input data has flags and internal flagging needed


Corner turning

Filterbanks and correlation engine cannot process all frequency channels simultaneously
• Must process part of the bandwidth at a time
• Filterbanks and correlator process a few of the 384 LFAA channels at a time

Store all frequency channels for a short-term integration time, then read out to the filterbanks all time data for a limited number of channels at a time

Input data – All frequencies for limited time (0.2 ms)

Output data – All time data for limited bandwidth (0.9 s, 0.25 s)
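A corner turn is in essence a buffered transpose: data arrive ordered by time with all channels present, and leave ordered by channel with all times present. The sketch below shows the reordering on a toy buffer. The function name and the in-memory list layout are hypothetical and say nothing about how the FPGA memory is actually addressed.

```python
def corner_turn(frames, channels_per_read):
    """Reorder frames[time][channel] into blocks of block[channel][time].

    Yields (first_channel, block), where each block holds every time sample
    for `channels_per_read` adjacent channels.
    """
    n_channels = len(frames[0])
    for start in range(0, n_channels, channels_per_read):
        stop = min(start + channels_per_read, n_channels)
        block = [[frame[c] for frame in frames] for c in range(start, stop)]
        yield start, block


# Toy buffer: 3 time steps x 4 channels, value = 10 * time + channel.
frames = [[10 * t + c for c in range(4)] for t in range(3)]
blocks = list(corner_turn(frames, channels_per_read=2))
for first_channel, block in blocks:
    print(first_channel, block)
```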


Gemini board

Proof of Concept (POC)

Single FPGA Xilinx Virtex Ultrascale+, water cooled, 4xHybrid Memory Cubes (2 link, 4G), 4x12 fibre optical at 25G, 4x

Four to be mounted in a 1U chassis
BUT 2 link, 4G HMC is now end-of-life

Redesign underway for Prototype

HMC high bandwidth memory replaced by integrated HBM (smaller but faster)
Add DDR4 for bulk memory


Design Evolution - POC design – Separate Subsystems

With Gemini POC, originally had a separate Correlator, Beamformer and Station Based processing, and 0.9 s integration time

Major corner turn for correlator in the Station based processing (144 Gbps per FPGA). But insufficient HMC for 0.9 s (1.3 TB double buffer, but had 43 FPGAs with 16 G each)

Must store and accumulate the full time integration in the correlator: 0.34 TB
But this uses most of the available HMC bandwidth
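The memory shortfall on this slide can be reproduced from figures quoted in the talk. The sketch assumes that "16 G each" means 16 GB per FPGA (four 4 GB HMCs) and that "65k channels" means 65,536. The bytes-per-correlation value is only what the 0.34 TB figure implies, not a number stated by the author.

```python
INPUT_RATE_BPS = 5.8e12        # LFAA output rate quoted earlier in the talk
INTEGRATION_S = 0.9
N_FPGAS = 43
HMC_BYTES_PER_FPGA = 16e9      # ASSUMPTION: "16 G each" read as 16 GB

# Double buffer: one copy being filled while the other is read out.
double_buffer_bytes = 2 * INPUT_RATE_BPS * INTEGRATION_S / 8
available_bytes = N_FPGAS * HMC_BYTES_PER_FPGA

# Correlator accumulation store quoted as 0.34 TB.
CORRELATIONS_PER_DUMP = 524_800 * 65_536   # ASSUMPTION: 65k = 65,536 channels
implied_bytes_per_correlation = 0.34e12 / CORRELATIONS_PER_DUMP

print(f"double buffer needed : {double_buffer_bytes / 1e12:.2f} TB")
print(f"HMC available        : {available_bytes / 1e12:.2f} TB")
print(f"implied bytes/corr   : {implied_bytes_per_correlation:.1f}")
```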


Prototype (Unified) Design

Change to HBM reduced the available memory by half. Major problems fitting the design in

Go to unified design - Station based processing and correlation in the same FPGA.
• Number of FPGAs that accept inputs from LFAA increased from 43 to 288
– Six times reduction in input bandwidth per FPGA
– Can now use DDR4 for corner turn
– AND no buffer size limitation, 0.9 s possible
– Correlator can output data to SDP for a frequency channel as soon as it is computed. Very little memory needed for correlator buffer

What looked like a disaster with the loss of a key component has led to a better and more robust design


Connecting the FPGA

The Unified design has 288 FPGAs

All FPGAs must be able to communicate with all others

Switch ??

But the heart of a switch is usually an FPGA

Number the FPGAs (X,Y,Z) (X,Y 1:6) (Z 1:8)
Arrange as a cube with these coordinates

Cross connect within rows and columns

Including self connection each FPGA has
6 in X, 6 in Y and 8 in Z connections
20 connections in total - 500 Gbps
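The cube interconnect can be modelled in a few lines. This sketch assumes the 6 x 6 x 8 arrangement (X and Y from 1 to 6, Z from 1 to 8) and 25 Gbps per link. The X-then-Y-then-Z hop order in `route` is a hypothetical choice made only to show that any FPGA reaches any other in at most three hops.

```python
from itertools import product

X_SIZE, Y_SIZE, Z_SIZE = 6, 6, 8
LINK_GBPS = 25   # ASSUMPTION: each connection is one 25G optical link


def all_nodes():
    return list(product(range(1, X_SIZE + 1),
                        range(1, Y_SIZE + 1),
                        range(1, Z_SIZE + 1)))


def links(node):
    """Nodes sharing a row along X, Y or Z with `node` (self connections included)."""
    x, y, z = node
    along_x = [(i, y, z) for i in range(1, X_SIZE + 1)]
    along_y = [(x, j, z) for j in range(1, Y_SIZE + 1)]
    along_z = [(x, y, k) for k in range(1, Z_SIZE + 1)]
    return along_x + along_y + along_z


def route(src, dst):
    """Hypothetical hop order: correct X, then Y, then Z."""
    path = [src]
    for step in ((dst[0], src[1], src[2]), (dst[0], dst[1], src[2]), dst):
        if step != path[-1]:
            path.append(step)
    return path


nodes = all_nodes()
print(len(nodes), "FPGAs,", len(links((1, 1, 1))), "links each,",
      len(links((1, 1, 1))) * LINK_GBPS, "Gbps per FPGA")
print(route((1, 1, 1), (3, 4, 3)))
```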


[Figure: FPGA Cube. Axes: X, 1 to 6; Y, 1 to 6; Z, 1 to 8. Labelled nodes: (1,1,1), (3,1,1), (3,4,1), (3,4,3).]


Data Flow

Input FPGAs have at most one LFAA input (2 stations full bandwidth)

ZXY connections to uniformly distribute data for processing

Allows uniform distribution of compute and output in Zoom mode

Beamformers must bring all frequency data together; use same XYZ


[Figure: data flow block diagram. Blocks: Ingest, Doppler Correction, Buffer In, Filterbank, Delay, RFI, Correlator, Output Buffer, PSS Beamformer, PST Beamformer, PST Buffer, Corr Emit, PSS Emit, PST Emit, VLBI Reformat. Groupings: Station Processing, Buffers, Array Processing. Interconnect labels: Z, XY, XYZ. External ports: LFAA inputs; SDP, PSS, PST and VLBI outputs.]


0.9 to 0.25 sec integration

With 0.9 s integration, output uses 72 (50% full)
• One in 4 FPGAs have output.
• Aggregate using Z connect. All 8 interconnected FPGAs send data to two output FPGAs, half to each

At 0.25 s (changed requirement) it is 144 at 100%. Use 180 at 80%
• Simply change to 5 out of 8 FPGAs having outputs. Each FPGA sends 1/5 of its data to an output FPGA.
• Design can accommodate 22 Tbps of output data to SDP without modifications.
• Small change to hardware (duplicate the 8 Z connections on each FPGA) and can do 44 Tbps
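The output counts on this slide follow from the fraction of the 288 FPGAs that carry an output. The sketch below reads "144 at 100%" as 144 fully loaded outputs' worth of data, which is an interpretation of the slide rather than something it states outright.

```python
from fractions import Fraction

N_FPGAS = 288


def output_fpgas(fraction_with_output):
    """Number of FPGAs that carry an output link."""
    return int(N_FPGAS * fraction_with_output)


outputs_at_0_9s = output_fpgas(Fraction(1, 4))    # one in 4 FPGAs
outputs_at_0_25s = output_fpgas(Fraction(5, 8))   # 5 out of 8 FPGAs

# ASSUMPTION: the 0.25 s load equals 144 fully used outputs.
FULL_OUTPUT_EQUIVALENTS = 144
utilisation = Fraction(FULL_OUTPUT_EQUIVALENTS, outputs_at_0_25s)

print(outputs_at_0_9s, outputs_at_0_25s, float(utilisation))
```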


Substations

Unified design eases usage of fast memory

Without substations 4 LFAA channels are processed in parallel
• Need for uniform distribution of load from 4 zoom bands.

With 2048 substations, process 1 LFAA channel at a time - 4X stations
• Same data rate to correlator, same size for input buffer to correlator for 2048 substations

Correlator processes 2048 stations in 16 passes
• Buffer for correlation products increases from 55 MB to 0.88 GB
– Progressive readout during processing could reduce this but is more complex
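The pass count and buffer growth quoted above are consistent with a simple square-law scaling. The sketch assumes the correlator handles 512 x 512 station blocks per pass, so the number of passes is the square of the station ratio, and that the product buffer grows in proportion. Both are assumptions that happen to reproduce the slide's numbers.

```python
BASE_STATIONS = 512
BASE_BUFFER_MB = 55


def passes(n_stations):
    """ASSUMPTION: one pass per 512 x 512 block of station pairs."""
    ratio = n_stations // BASE_STATIONS
    return ratio * ratio


def product_buffer_mb(n_stations):
    """Correlation product buffer, assumed proportional to the pass count."""
    return BASE_BUFFER_MB * passes(n_stations)


for n in (512, 1024, 2048):
    print(f"{n:5d} stations: {passes(n):2d} passes, "
          f"{product_buffer_mb(n) / 1000:.2f} GB buffer")
```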


Conclusion

CSIRO/ASTRON/AUT have designed a flexible FPGA based system for the LOW Correlator

It has sufficient spare resources, I/O and memory to accommodate recent requirement changes and still have spare capacity
• 20 of 48 internal optical connections per FPGA are currently used.
• Further expansion possible – not I/O limited

Zoom mode changes required a major redesign of data ordering but no change to the hardware

Changes to integration time and addition of substations were changes only to some subsystems


Revised Low Correlator and Beamformers

All processing modules identical.

Cross connects deliver part band to each correlator and beamformer.

Interconnects in reverse aggregate the data

e.g. 2 complete PSS beams per Filterbank/Correlator

Now 4 LFAA stations per GEMINI
Previous was 12

separate filterbank


[Figure: revised architecture diagram. Labels: Gemini modules with inputs From LFAA; Filterbanks, Cross connects, Correlator; PSS and PST Beamformers; outputs To SDP, PSS, PST; 1/8 BW per link; 1st group of 16 to 8th group of 16; 128 Gemini; 6 Gemini per group; 16 groups of 8 also an option; all internal links bi-directional.]


Gemini version II

One board to rule them all (functions that is)

One HMC retained for High Bandwidth External memory

Two DDR4 to be added for High Memory depth system

Up to 4 12-fibre 25G optics + QSFP, SFP

Change to card rack system. Each card a single Gemini II with all I/O
Water cooling, up to 200 W per card

One FPGA per LRU - Reduced (1/4) I/O per line replaceable unit (LRU)

Pluggable Optics, Power and Water at rear – easy replacement

All data connections optical


SKA1 Overview

SKA1-low stations include Station Beamformer

Central Signal Processing includes

Correlator and Pulsar systems


Thank you
CASS
John Bunton
SKA1 CSP System Engineer
t +61 2 9372 4420
e [email protected]
www.atnf.csiro.au/projects/askap

PO BOX 76 EPPING, 1710, AUSTRALIA


Central Signal Processing

For Mid and Low Central Signal Processing (CSP) consists of

Correlator between all pairs of elements (dish or station)

Tied Array Beamforming coherent sum of signals from all elements

Tied Array beams are processed by

Pulsar Search engine (limited bandwidth)

Pulsar Timing engine

and are used for VLBI

LMC - Monitor of performance and control of all functions (NRC Canada, )


SKA1 MID antennas (South Africa)

Presentation Notes
The mid-frequency SKA will be used to study the more recent history of the universe. Both arrays will work separately, with different but complementary science goals.

Mid Dishes

133 15m offset Gregorian Dishes + 64 MeerKAT dishes

Total of 197 dishes (Distributed between 1-16 subarrays)

maximum baseline, less than 150 km

Receivers for 5 bands

0.35 to 1.050 GHz full bandwidth 0.70 GHz at 8 bit resolution

0.95 to 1.76 GHz full bandwidth 0.81 GHz at 8 bit resolution

1.65 to 3.05 GHz not installed during construction

2.80 to 5.18 GHz not installed during construction

4.6 to 13.8 GHz 2 x 2.5GHz! at 4 bits resolution

16 subarrays


CSP Organisation at PDR 2014 (Correlator)

In December 2014 the Preliminary Design Review (PDR) was held
At that time there were three telescopes: Low, Mid and Survey.
A Physical Implementation Proposal (PIP) was submitted for each

Low, led by Oxford University, with three designs in a single PIP
Uniboard (ASTRON), PowerMX (NRC Canada), Redback (CSIRO)

Survey, led by AUT (NZ), considered many options in a single PIP
Redback (CSIRO), PowerMX (NRC Canada), Multicore processor, GPUs and ASIC

MID, led by NRC Canada, had three separate PIPs
PowerMX (Canada), Redback (CSIRO Australia) & SKARAB (S.A.)

Project management MDA Canada, Local Monitor Control NRC


Pulsars

The Pulsar teams are

Pulsar Timing led by Swinburne University

CPU/GPU based

Pulsar Search led by Manchester University
CPU/GPU based with FPGA acceleration/power saving

Pulsar search on limited bandwidth (120 MHz Low, 300 MHz Mid)

They process array beams (coherent, polarisation corrected sums of data from ~400 stations) generated by CBF


A shake up for CSP Correlator/Beamformer

One outcome of the review was that the SKA Office wanted just one design to proceed for each Telescope

THEN, as total cost was too high, Rebaselining occurred. Decisions
Stop the Survey Telescope

Delay work on PAFs (led by CSIRO) (Critical to Survey)

The politicians stepped in and decided which designs would proceed

NRC Canada continue to lead Mid (PowerMX)

CSIRO to lead Low with ASTRON, (Redback/Uniboard)+AUT

This resulted in the South African and UK teams leaving (taking most of the Systems Engineering with them)


Pulsar Timing Beamforming

Mid and Low form 16 tied array beams

Delay aligned, coherent summation of dual polarisation data

Must apply polarisation correction for each value summed

~3M Jones Matrices for Low

Time resolution of data 200 ns Mid, 2 µs Low

Basically no significant time ripples allowed, narrow impulse response

For Mid approach taken was fractional time delay filter on wideband signal to do delay correction and achieve narrow impulse response

For Low summation done on narrow band channels, phase only which is cheaper than fractional time delay – followed by synthesis.

New approach to avoid synthesis filterbank being investigated.
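The Low approach described above (phase-only alignment on narrow band channels, a polarisation correction per value, then a coherent sum) can be sketched for a single channel and time sample. All function names are hypothetical, the Jones matrices are 2x2 nested tuples, and the synthesis step that follows the summation is not shown.

```python
import cmath
import math


def apply_jones(jones, sample):
    """Multiply a dual-polarisation sample (x, y) by a 2x2 Jones matrix."""
    (a, b), (c, d) = jones
    x, y = sample
    return a * x + b * y, c * x + d * y


def tied_array_sample(samples, jones_matrices, delays_s, channel_freq_hz):
    """Coherent sum over stations for one narrow channel and one time sample."""
    beam_x = beam_y = 0j
    for sample, jones, delay_s in zip(samples, jones_matrices, delays_s):
        # Phase-only delay correction, valid while the channel is narrow.
        phase = cmath.exp(2j * math.pi * channel_freq_hz * delay_s)
        x, y = apply_jones(jones, sample)
        beam_x += phase * x
        beam_y += phase * y
    return beam_x, beam_y


# Toy check: four stations see the same signal with different delays.
signal = (1 + 0j, 0.5j)
freq_hz = 150e6
delays_s = [0.0, 1.1e-9, 2.7e-9, 4.2e-9]
identity = ((1, 0), (0, 1))
samples = [tuple(v * cmath.exp(-2j * math.pi * freq_hz * d) for v in signal)
           for d in delays_s]
beam = tied_array_sample(samples, [identity] * 4, delays_s, freq_hz)
print(beam)
```

With identity Jones matrices and the correct delays, the four stations add coherently to four times the input signal.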


Pulsar Timing

Pulsar signal is smeared due to dispersion (delay ∝ wavelength²)

Must remove dispersion.

Pulsar timing implements overlap-save convolution of the beamformed time series with the correction filter.

Time series ~1 minute, bandwidth 10 MHz, multi-million point FFTs in GPUs.

Very stringent timing requirements on data supplied

less than 10ns error over a 10 year period.
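Overlap-save convolution, mentioned above, filters a long time series block by block in the frequency domain and discards the samples of each block that are corrupted by circular wrap-around. The sketch below shows the technique on a toy scale: a short generic FIR filter stands in for the dispersion-correction filter, and a naive DFT stands in for the multi-million point GPU FFTs. All names are hypothetical.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def overlap_save(x, h, nfft):
    """Filter x with FIR h using overlap-save blocks of length nfft."""
    m = len(h)
    step = nfft - (m - 1)                 # valid outputs per block
    if step <= 0:
        raise ValueError("nfft must exceed len(h) - 1")
    h_spectrum = dft(list(h) + [0.0] * (nfft - m))
    padded = [0.0] * (m - 1) + list(x)    # history for the first block
    out = []
    for pos in range(0, len(x), step):
        block = padded[pos:pos + nfft]
        block += [0.0] * (nfft - len(block))
        spectrum = [a * b for a, b in zip(dft(block), h_spectrum)]
        y = dft(spectrum, inverse=True)
        out.extend(y[m - 1:])             # drop the wrapped-around samples
    return out[:len(x)]


def direct_convolve(x, h):
    """Reference linear convolution, truncated to len(x)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


x = [float(i % 5) for i in range(20)]
h = [1.0, -1.0, 0.5]
filtered = overlap_save(x, h, nfft=8)
reference = direct_convolve(x, h)
max_error = max(abs(a - b) for a, b in zip(filtered, reference))
print(max_error)
```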


Pulsar Search (PSS)

Mid to form 1500 power beams at a bandwidth of 300MHz

Low to form 500 power beams at a bandwidth of 120MHz

dishes/stations in compact area used, PSS beams to fill dish/station beam

Coherent summation of dish/station data, phase on narrow channels ~20kHz

Polarisation correction to dish/station data (beam centre) and for Low after beamforming. (~800,000 Jones Matrices for MID)
Search each beam for Pulsars in PSS engine (GPU/CPU 16 racks, Low)

500 dispersion measures

in each dispersion measure acceleration search to 300ms-1


Other Functions

Both Mid and Low require a transient buffer.

in Low allocated to LFAA, 256 GB per Station, 150 MHz of data, 2-bit precision

in Mid allocated to CSP 32GB per dish, 300MHz of data, 2-bit precision

Mid produces four VLBI beams.

VLBI possible Europe, America, Australia, Asia.

Not sufficient low-frequency telescopes for VLBI with Low.


Hardware

Pulsar Search (PSS) and Pulsar Timing (PST) use common hardware for Mid and Low. CPU/GPU based (FPGA acceleration for PSS).

PST two racks at each site

PSS 16 racks at Low, dissipating ~160 kW; Mid 59 racks @ ~470 kW
wider bandwidth, more beams but lower total delay to search
Each compute node processes 2 beams (TBC)

Correlator and Beamformers are FPGA based
Mid based on PowerMX

Low based on Perentie (development from Redback+Uniboard)


Perentie (CSIRO/ASTRON) for Low

July 2015: final confirmation that CSIRO would lead CSP for Low

Condition of leadership was to collaborate with ASTRON on design.

At that time ASTRON were completing their Uniboard II

CSIRO platform was Redback-3. Both multi-FPGA boards.

For SKA, CSIRO had proposed Redback-5, a board with a single FPGA
After a lengthy downselect process it was decided in November to proceed with the GEMINI board

Four of these to be mounted in a 1U chassis
