The Future of the Automobile - casper.astro.berkeley.edu

Jason Manley Internal presentation: Operation overview and drill-down October 2007


System overview

Achievements to date

iBOB “F Engine” in detail

BEE2 “X Engine” in detail

Backend System in detail

Future developments

Discussion

8, 16 and 32 antenna dual-pol designs

Bandwidth <200MHz using iBOBs and BEEs

Full Stokes

2048 frequency channels

Control, initialisation and monitoring using Python scripts

100Mbps Ethernet output with integration times >16 seconds

Capture UDP output packets using Python code into Miriad format

Web interface for near real-time visualisation

FX architecture

[Figure: F Engine 0, F Engine 1, ... F Engine N-1 connect through a 10GbE switch to X Engine 0, X Engine 1, ... X Engine N-1.]

[Figure: a BEE2 with four user FPGAs (two X engines each), a 10GbE switch, and four iBOBs (two F engines each).]

“Known good” – mirrors pocket correlator

Two F engines per iBOB

Dual polarization design

Currently uses a combination of the ASTRO and CASPER libraries

Major data flow components: ADC -> DDC -> Channelizer -> Equalization -> Reformat -> X Engine

Two X engines per BEE user FPGA

Uses CASPER library only

Major data flow components: F Engine -> Pktize -> 10GbE -> Buffer -> X Eng -> Accum

Clocks:

X engines each run off independent clock

Sampling synchronized at F engines, but clock not distributed to X engines

Synchronized using global 1pps signal at ADCs

Propagated to X engines using out-of-band signaling on XAUI links

Headers labeling 10GbE Ethernet packet data

System control: separate 100Mbps Ethernet network

F engines configured through out-of-band signals on XAUI links

Control packets: UDP to Python server on BEE2 control FPGAs

Python scripts for configuration

ADC

Analogue input: 600MHz, but up to 800MHz

Input samples: t7, t6, t5, t4, t3, t2, t1, t0

Output: (t4, t0), (t5, t1), (t6, t2), (t7, t3)

fout = fs/4 (normally 150MHz), 8 bits × 4

Signed fixed point: 8.7

Numeric range: -1 to 1

DDC

Extracts a frequency band from the input signal

Input: Data: signed fix 8.7; Path: 32 bits

Output: Data: 8 bits “I”, 8 bits “Q”; Path: 16 bits

Current setup: for fs = 600MHz, selects output band = 75 to 225 MHz

Decimation Filter
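The band edges quoted for the DDC can be checked with a little arithmetic. This is a minimal sketch, assuming the DDC mixes with a local oscillator at fs/4 and produces complex output at fs/4. Neither assumption is stated on the slide, but together they reproduce its 75 to 225 MHz figure. `ddc_band` is a hypothetical helper name.

```python
def ddc_band(fs_mhz, lo_fraction=0.25, decimation=4):
    """Return the (low, high) edges in MHz of the band kept by the DDC.

    Assumptions (not from the slides): the LO sits at fs * lo_fraction and
    the complex output rate is fs / decimation, so the kept band is centred
    on the LO and is one output-rate wide.
    """
    centre = fs_mhz * lo_fraction
    width = fs_mhz / decimation
    return centre - width / 2, centre + width / 2


low, high = ddc_band(600)
print(low, high)
```

Under the same assumptions, running the ADC at its 800MHz limit would move the band to 100 to 300 MHz.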

Channelizer

Current setup: 2048 channel, 4 tap PFB, Hamming window

Input: Data: 8 bits “I”, 8 bits “Q”; Path: 16 bits

PFB FIR: improves out-of-band rejection ratio. Data: signed fix 18.17, complex; Path: 36 bits

Downshift: prevents overflow in first stage of FFT. Data: signed fix 18.15, complex; Path: 36 bits

Non-detrimental: effective signal resolution from PFB is 8 bits.

FFT: runtime configurable downshifting through each stage

Output: Data: signed fix 18.17, complex; Path: 36 bits
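To make the "2048 channel, 4 tap PFB, hamming window" setup concrete, here is a minimal sketch of how such a prototype filter could be generated in software. It assumes a Hamming window multiplied by a sinc, which is a common choice but is not confirmed by the slides. The function names are hypothetical.

```python
import math


def pfb_coefficients(n_channels=2048, n_taps=4):
    """Windowed-sinc prototype filter for a polyphase filter bank.

    Assumption: the prototype is a Hamming window multiplied by a sinc whose
    first nulls lie one channel-length from the centre. The slides only state
    "2048 channel, 4 tap PFB, hamming window".
    """
    length = n_channels * n_taps
    centre = (length - 1) / 2.0
    coeffs = []
    for i in range(length):
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * i / (length - 1))
        x = (i - centre) / n_channels
        sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
        coeffs.append(window * sinc)
    return coeffs


def polyphase_rows(coeffs, n_channels):
    """Split the prototype into n_taps rows of n_channels coefficients."""
    return [coeffs[t * n_channels:(t + 1) * n_channels]
            for t in range(len(coeffs) // n_channels)]


coeffs = pfb_coefficients()
rows = polyphase_rows(coeffs, 2048)
print(len(coeffs), len(rows), len(rows[0]))
```

Each row holds the coefficients applied to one block of 2048 input samples. The four weighted blocks are summed before the FFT.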

Equalization

Input (four signals: Ax, Ay, Bx, By): Data: signed fix 18.17, complex; Path: 36 bits (x4)

Equalizer: multiplies each frequency by a 17.3 bit scale factor. Can be used to correct system frequency response irregularities at runtime.

After multiplication: Data: signed fix 35.20, complex

Selects 4 bits, with saturating rounding. Numeric range: -0.875 to +0.875

Output: Data: signed fix 4.3, complex; Total path: 32 bits

Diagram blocks: Equalizer, Decimation, BRAM lookup table
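The scale-then-requantize step can be illustrated in floating point. This is a minimal sketch, assuming a real-valued gain and round-half-up behaviour. The slide says only "saturating rounding" and gives the range -0.875 to +0.875. The function names are hypothetical.

```python
import math


def quantize_4_3(value):
    """Round to a signed 4.3 fixed-point value with symmetric saturation.

    4.3 means 4 bits in total with 3 fractional bits, so one step is 1/8.
    The clamp to +/-7 steps follows the slide's stated range of -0.875 to
    +0.875. Round-half-up is an assumption.
    """
    steps = math.floor(value * 8 + 0.5)
    steps = max(-7, min(7, steps))
    return steps / 8.0


def equalize(sample, gain):
    """Scale one complex channel sample by a real gain, then requantize."""
    scaled = sample * gain
    return complex(quantize_4_3(scaled.real), quantize_4_3(scaled.imag))


print(equalize(complex(0.01, -0.02), 30.0))
print(quantize_4_3(1.0), quantize_4_3(-1.0))
```

The symmetric clamp keeps the positive and negative extremes equal in magnitude, rather than using the asymmetric two's-complement minimum of -1.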

Reformat

Input: Data: signed fix 4.3, complex; Total path: 32 bits

[Figure: input stream ordered by channel (Ch 0, Ch 1, ... Ch N-1), each channel carrying Ax, Ay, Bx, By.]

Corner Turner

Data: signed fix 4.3, complex, dual pol; Total path: 32 bits

Divide by 2

Output to XAUI: Data: signed fix 4.3, complex, dual pol, four frequency chans; Total path: 64 bits

[Figure: output stream for Ch 0 then Ch 1, each with one word of Ax, Ay, Ax, Ay and one word of Bx, By, Bx, By; time labels t384, t256, t128, t0.]

Pktize

Diagram blocks: Header Generation, Processing Allocation; Sync control, System reset, Ant decode

Payload size: 32 x 64 bits + 64 bit hdr = 264 bytes (Jumbo packet: 1120 bytes)

Packet layout (each row is 64 bits wide):

                   MSb                            LSb
Hdr                MCNT                           ANT
Data (32 x 64b)    f0t3     f0t2     f0t1     f0t0
                   f0t7     f0t6     f0t5     f0t4
                   …        …        …        …
                   f0t127   f0t126   f0t125   f0t124
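The packet layout above (one 64-bit header followed by 32 data words of 64 bits) can be sketched with Python's `struct` module. The slide does not give the widths of the MCNT and ANT header fields, so the 48-bit/16-bit split and the big-endian byte order used here are assumptions for illustration only. The function names are hypothetical.

```python
import struct

WORDS_PER_PACKET = 32  # 32 x 64-bit data words, as stated on the slide


def pack_packet(mcnt, ant, words):
    """Build one packet: a 64-bit header followed by 32 64-bit data words.

    Assumption: the header holds MCNT in the upper 48 bits and ANT in the
    lower 16 bits, big-endian. The slide shows only "MCNT | ANT" from MSb to
    LSb and does not give the field widths.
    """
    if len(words) != WORDS_PER_PACKET:
        raise ValueError("expected %d data words" % WORDS_PER_PACKET)
    header = ((mcnt & 0xFFFFFFFFFFFF) << 16) | (ant & 0xFFFF)
    return struct.pack(">Q%dQ" % WORDS_PER_PACKET, header, *words)


def unpack_packet(packet):
    """Inverse of pack_packet: return (mcnt, ant, list_of_words)."""
    fields = struct.unpack(">Q%dQ" % WORDS_PER_PACKET, packet)
    header = fields[0]
    return header >> 16, header & 0xFFFF, list(fields[1:])


packet = pack_packet(mcnt=5, ant=3, words=list(range(32)))
print(len(packet) * 8, len(packet))
```

The packed length comes to 2112 bits, or 264 bytes, which agrees with the size quoted on the slide.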

F Engine packet stream

[Figure: FX architecture, with F Engine 0 ... F Engine N-1 connected through the 10GbE switch to X Engine 0 ... X Engine N-1.]

“N” antennas, “n” frequency channels

Time slot    Channels sent
t0           f0, f1, ... fN-1
t1           fN, fN+1, ... f2N-1
...          ...
t(n/N)-1     fn-N, fn-N+1, ... fn-1
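The grouping of channels into time slots can be written out as a short schedule generator. This is a sketch only. The rule that channel k goes to X engine (k mod N) is an assumption, since the slide shows which channels share a slot but not where each one is sent. `packet_schedule` is a hypothetical name.

```python
def packet_schedule(n_antennas, n_channels):
    """List, per time slot, the channels an F engine sends and their targets.

    Assumption: within a slot, channel k goes to X engine (k mod N), so the
    N channels of one slot fan out across all N X engines. The slide shows
    the channel grouping per slot but not the destination rule.
    """
    if n_channels % n_antennas:
        raise ValueError("n_channels must be a multiple of n_antennas")
    schedule = []
    for slot in range(n_channels // n_antennas):
        first = slot * n_antennas
        schedule.append([(ch, ch % n_antennas)
                         for ch in range(first, first + n_antennas)])
    return schedule


slots = packet_schedule(n_antennas=8, n_channels=2048)
print(len(slots), slots[0][:3], slots[-1][-1])
```

With 8 antennas and 2048 channels this gives n/N = 256 slots, the last one ending at channel f2047.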

10GbE

Diagram blocks: F Engine packet stream, 10GbE Transceiver, 10GbE Ethernet, Data Unpack, Loopback Mux

Total packet size: 32 x 64 bit words + 64 bit hdr = 2112 bits, or 264 bytes (Jumbo packet: 1120 bytes)

Circular buffer

[Figure: circular buffer of eight windows, numbered 0 to 7.]

1 Packet: 1 Freq or 128 words

Data inserted into position in buffer determined by MCNT in packet header.

Timeout if no packet received for 220 clocks.

Ship out a window when first packet of ½ buffer ahead received (i.e. ship 1 when first packet of 5 received)

Only accept packets with MCNT from ½ buffer size back to ¼ buffer size ahead (i.e. if already received up to packet 5, accept MCNTs for windows 2, 3, 4, 5 or 6). This prevents spurious locks.
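The accept and ship rules for the circular buffer can be sketched with modular arithmetic on window indices. This sketch assumes eight windows, as in the diagram, and treats both acceptance limits as exclusive. That choice is made only because it reproduces the slide's worked example (windows 2 to 6 accepted when the newest is 5). How MCNT maps to a window index is not shown, so the code works in window indices directly. All names are hypothetical.

```python
N_WINDOWS = 8  # the slide's diagram numbers the windows 0 to 7


def signed_offset(window, newest, n_windows=N_WINDOWS):
    """Offset of `window` from `newest` on the ring, in [-n/2, n/2)."""
    half = n_windows // 2
    return (window - newest + half) % n_windows - half


def accepts(window, newest, n_windows=N_WINDOWS):
    """True if a packet for `window` should be written into the buffer.

    Assumption: both limits are exclusive, so a window exactly 1/2 buffer
    back or 1/4 buffer ahead is rejected. This reproduces the slide's
    example: with newest = 5, windows 2..6 are accepted.
    """
    offset = signed_offset(window, newest, n_windows)
    return -(n_windows // 2) < offset < n_windows // 4


def window_to_ship(first_seen, n_windows=N_WINDOWS):
    """Window released when the first packet of `first_seen` arrives."""
    return (first_seen - n_windows // 2) % n_windows


print([w for w in range(N_WINDOWS) if accepts(w, newest=5)])
print(window_to_ship(5))
```

Rejecting packets far from the current position stops a single corrupt MCNT from dragging the buffer to a wrong place in the stream.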

X engine (streaming)

Streaming architecture assumes data valid on every clock.

Integration occurs

Each antenna input must thus be valid for integration_period clock cycles

Output must be filtered as duplicates occur

X engine (streaming)

Input: Ax, Ay (t0) …; Bx, By (t128) …. Data: 4.3 bits, dual pol, complex; Path: 16 bits

[Figure: multiplier stages separated by Z-128 delays, forming products ACBD…, ABBCCD…, AD…]

Accumulation for 128 clocks

Output: Data: 16.6 bits, complex, 4 terms; Path: 128 bits

Baseline table (read out direction is marked on the slide):

5    4    3    2    1
AA   X    X    X    X
BB   AB   X    X    X
CC   BC   AC   X    X
DD   CD   BD   AD   X
EE   DE   CE   BE   AE
FF   EF   DF   CF   BF
GG   FG   EG   DG   CG
HH   GH   FH   EH   DH
O    AH   AG   AF   AE
O    O    BH   BG   BF
O    O    O    CH   CG
O    O    O    O    DH

Output terms:

t128: AxAx, AyAy, AxAy, AyAx
t256: AxBx, AyBy, AxBy, AyBx
t257: BxBx, ByBy, BxBy, ByBx

Simplification! See detail on last slide
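The baseline table, and the earlier remark that the output must be filtered because duplicates occur, can be reproduced with a short enumeration. This is a sketch under the assumption that each of the N/2 + 1 stages pairs every antenna with the one a fixed number of places further along, wrapping round. That pattern matches the 8-antenna table but is not spelled out on the slide. `xengine_products` is a hypothetical name.

```python
def xengine_products(antennas):
    """Products emitted by each stage, and the unique baselines kept.

    Assumption: stage k pairs every antenna with the one k places further
    along (wrapping round), for k = 0 .. N/2. This matches the slide's
    8-antenna table, where the last column repeats AE, BF, CG and DH.
    """
    n = len(antennas)
    emitted = []
    for offset in range(n // 2 + 1):
        for i in range(n):
            pair = tuple(sorted((antennas[i], antennas[(i + offset) % n])))
            emitted.append(pair)
    unique = []
    seen = set()
    for pair in emitted:  # filter duplicates, keeping the first sighting
        if pair not in seen:
            seen.add(pair)
            unique.append(pair)
    return emitted, unique


emitted, unique = xengine_products("ABCDEFGH")
print(len(emitted), len(unique))
```

For 8 antennas the stages emit 40 products, of which 36 are unique. That equals N(N+1)/2, the number of baselines including autocorrelations.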

X engine: re-order

Data: 16.6 bits, complex, 4 terms; Path: 128 bits

Windowed buffering

Data throttling

[Figure: the 8-antenna baseline table from the streaming X engine slide (columns 5 4 3 2 1, rows AA to DH) feeding a windowed buffer.]

Windowed baselines fed out every second clock:

t0: AxAx, AyAy, AxAy, AyAx
t2: AxBx, AyBy, AxBy, AyBx
t4: BxBx, ByBy, BxBy, ByBx
…
t71: CxHx, CyHy, CxHy, CyHx

Accum

DRAM Reformat: increase number space to 32 bits

Input: Data: 16.6 bits, complex, 4 terms; Path: 128 bits

Output: Data: 32.6 bits, complex, 2 terms; Path: 128 bits

DRAM Accumulator: integration length run-time configurable

Shared BRAM: Data: 32.6 bits, complex; Path: 32 bits
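The accumulator's behaviour (sum a fixed number of successive vectors, then hand the result on) can be modelled in a few lines. This is a software stand-in for illustration, not the DRAM design itself. The class and parameter names are hypothetical, with `acc_len` playing the role of the run-time configurable integration length.

```python
class VectorAccumulator:
    """Sum successive vectors element-wise and emit the total every acc_len.

    A software stand-in for the DRAM accumulator. `acc_len` plays the role
    of the run-time configurable integration length.
    """

    def __init__(self, vector_len, acc_len):
        self.acc_len = acc_len
        self.count = 0
        self.total = [0j] * vector_len

    def add(self, vector):
        """Add one vector; return the integrated vector when complete."""
        if len(vector) != len(self.total):
            raise ValueError("vector length mismatch")
        self.total = [t + v for t, v in zip(self.total, vector)]
        self.count += 1
        if self.count < self.acc_len:
            return None
        result, self.total = self.total, [0j] * len(self.total)
        self.count = 0
        return result


acc = VectorAccumulator(vector_len=2, acc_len=3)
outputs = [acc.add([1 + 1j, 2]) for _ in range(3)]
print(outputs)
```

Only the third call returns a vector. The first two return `None` while the integration is still in progress.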

Listen Config Start tx Start rx Display

Software registers on User FPGAs addressable by BORPH on Control FPGA

UDP Listener on BEE2 Control FPGA processor

Automated Python scripts for writing these registers

Special command “Start TX” begins dumping Shared BRAM output on separate UDP port

Receiver collects UDP output packets, buffers and writes to disk

Multiple files generated for storage, display and debugging

Web interface for plotting output data (useful for debugging)

Confirmed working using simulated correlator output data, generated on BEE processor

Listen

Python script executed on BEEs.

Allows programming of any software register on the BEE by name

Includes special functions, which can start/stop programs or scripts on the BEE

Start or stop transmitting data

Globally program gains on all connected iBOBs

Config

Command-line parameterized

Sends packetized commands to listener on BEE

Arms iBOBs, sets FFT shifting schedule, sets iBOB EQ gains to defaults, sets accumulation length, sets antenna indices, IP addresses and ports.

Reads debug registers and snap blocks to confirm correct dataflow.

Attempts recovery through block reset and/or reprogram
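One of the values the configuration script sets is the FFT shifting schedule. The sketch below shows how such a schedule trades overflow protection against signal level. It assumes a bitmask with one bit per FFT stage, where a set bit means divide by 2 at that stage, and a 2048-point FFT with 11 radix-2 stages. The register layout is not given on the slides, so it is hypothetical, as is the function name.

```python
def fft_shift_gain(schedule, n_stages=11):
    """Overall scaling applied by a per-stage downshift schedule.

    Assumptions: the schedule is a bitmask with one bit per FFT stage, a set
    bit means "divide by 2 at this stage", and a 2048-point FFT has
    log2(2048) = 11 radix-2 stages. The register layout is hypothetical.
    """
    mask = (1 << n_stages) - 1
    shifts = bin(schedule & mask).count("1")
    return 1.0 / (1 << shifts)


print(fft_shift_gain(0b11111111111))  # shift at every stage
print(fft_shift_gain(0b00000000001))  # shift at one stage only
```

Shifting at every stage can never overflow but costs the most dynamic range. Shifting at fewer stages keeps more signal at the risk of overflow on strong inputs.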

Start tx

Receiving computer requests dump start

BEE2 control FPGA monitors shared BRAMs and reads out when full

Data is enclosed in a timestamped packet (determined using BORPH’s system time)

Header: 21 bytes: Time, X Engine number, 4B vector num, 4B flags, 4B payload length (historical)

Transmitted via UDP packet to pre-determined receiver
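The 21-byte output header can be sketched with Python's `struct` module. The slide gives the total size and the three 4-byte fields. The remaining 9 bytes are assumed here to be an 8-byte floating-point time plus a 1-byte X engine number, in big-endian order. That split, the byte order and the function names are guesses for illustration only.

```python
import struct

# Assumed layout (21 bytes, big-endian, no padding): 8-byte float time,
# 1-byte X engine number, then three 4-byte unsigned fields. The slide gives
# the total and the three 4-byte fields; the 8 + 1 split is a guess.
HEADER_FORMAT = ">dBIII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def build_packet(timestamp, xeng, vector_num, flags, payload):
    """Prefix a payload with the assumed 21-byte header."""
    header = struct.pack(HEADER_FORMAT, timestamp, xeng, vector_num, flags,
                         len(payload))
    return header + payload


def parse_packet(packet):
    """Split a packet into its header fields and payload bytes."""
    timestamp, xeng, vector_num, flags, length = struct.unpack(
        HEADER_FORMAT, packet[:HEADER_SIZE])
    payload = packet[HEADER_SIZE:HEADER_SIZE + length]
    return timestamp, xeng, vector_num, flags, payload


packet = build_packet(1193000000.5, 2, 7, 0, b"\x00\x01\x02\x03")
print(HEADER_SIZE, parse_packet(packet))
```

A receiver could compare the vector number of successive packets to detect out-of-order arrival.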

Start rx

Receives packets, decodes header and appends to buffer (so packets must be received in order)

If header out of order, dumps as invalid

Correct with new C code

Requires system parameters passed on command line when executed

Ability to read source from UDP packets, files or std-in pipe (untested)

Generates 4 files:

Miriad UV

Info file (n_chans, integration length, system gain etc.)

Numpy database (Python) of last integration for plotting

Numpy database of raw data (for debugging)

Display

cgi script

10Gbps output gives sub 1 second integration times

High speed, scalable, distributed data capture software

Walsh codes and phase switching

64 antenna design

Upgrade to 4096 channels

ROACH hardware:

<550MHz bandwidth

16 384 channels

128 antennas with no architectural changes

casper_n/cn_i…

Currently in revision 3.02 testing, using ASTRO lib

Revision 4 will use CASPER library

casper_n/cn_b…

Revision 3.08 testing <- DEBUG!

Revision 3.07d stable

Revision 4 will have 10GbE output