Readout Processing and Noise Elimination Firmware for the Fermilab Beam Loss Monitor System


Jinyuan Wu, C. Drennan, R. Thurman-Keup, Z. Shi, A. Baumbaugh and J. Lewis

Fermilab, April 2007

The Digitizer Card for the Fermilab Beam Loss Monitor System

• Beam loss input signals from ion chambers are integrated and digitized.

• Sliding sums are accumulated and compared with pre-loaded thresholds.

• Over-threshold conditions in several places cause a beam abort, based on pre-defined settings.

• Beam loss signals are filtered and “de-rippled” for display purposes.

• The sequence is controlled by the “Seq128” block.

[Figure: block diagram of the digitizer card. Blocks: Ion Chamber Input; ADC (21 µs/sample); RAM; Immediate, Fast, Slow and Very Slow Sliding Sums; A>B comparators against Thresholds I, F, S and V; Abort Logic; CIC Sums; De-ripple Process; Seq128.]

The Problem: 3-Phase 60 Hz AC

• Rectification noise from power supplies running on 3-phase 60 Hz AC is picked up by the input cables lying in the accelerator tunnel.

[Figure: the noise shown in two panels, Time Domain and Frequency Domain. The spectrum plots amplitude against frequency (Hz) from 0 to 3600 Hz in 360 Hz steps. ADC: 21 µs/sample.]

Filter Functions

Sliding Sum:

$$s[m] = \sum_{k=m-(K-1)}^{m} x[k]$$

Cascaded Integrator-Comb (CIC) Sum of 2nd Order:

$$y[n] = \sum_{k=n-(2K-1)}^{n} h[n-k]\, x[k]$$

• The CIC sum is a sliding sum of sliding sums.

• The frequency response of the CIC sum is a sinc²(x) function, which has 2nd-order zeros and better stop-band suppression than the sinc(x) response of the sliding sum.
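The two responses can be compared numerically. The following is a minimal sketch, not the firmware itself: it evaluates the gain of a K-sample sliding sum and of its square (the 2nd-order CIC sum). The function names and the sample rate `FS` are assumptions chosen for illustration; `FS` is picked so that the first zero falls exactly at 360 Hz for K = 124.

```python
import cmath
import math

K = 124            # sum length in samples (from the slides)
FS = 124 * 360.0   # hypothetical sample rate in Hz, chosen so that FS / K = 360 Hz


def sliding_sum_gain(f, fs=FS, k=K):
    """Gain of a k-sample sliding sum at frequency f, normalized to 1 at DC."""
    acc = sum(cmath.exp(-2j * math.pi * f * i / fs) for i in range(k))
    return abs(acc) / k


def cic2_gain(f, fs=FS, k=K):
    """Gain of a sliding sum of sliding sums: the square of the single-stage gain."""
    return sliding_sum_gain(f, fs, k) ** 2


if __name__ == "__main__":
    for f in (60, 180, 360, 540, 720):
        print(f, round(sliding_sum_gain(f), 4), round(cic2_gain(f), 4))
```

Under these assumptions both filters have zeros at 360 Hz and its multiples, and the CIC sum suppresses the lobes between the zeros more strongly. At 60 Hz and 180 Hz the sliding-sum gain is still roughly 0.95 and 0.64, so ripple at those frequencies passes through both filters largely intact.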

[Figure: sinc(x) and sinc²(x) plotted for x from 0 to 20, with a frequency axis marking the first zero at 360 Hz. Sum length: 124 samples at 21 µs/sample.]

Filtering Works, But Partially

• Noise above 360 Hz, the dominant portion, is filtered out by both filter functions.

• The CIC sum is a lot smoother than the sliding sum.

• But small signals are still buried under ripples of 60 and 180 Hz.

[Figure: traces of the Sliding Sum and the CIC Sum, with the small signals marked.]

Why Not Filter Further?

• Filtering is an averaging process over many periods. There is not much time after reset.

• The noise before and after the accelerator ramping has different amplitudes and shapes.

• A “De-Ripple” algorithm has been developed.

[Figure: noise waveform around the accelerator ramping.]

De-ripple Process (1.1): Waveform Extraction, Storage and Validation

[Figure: Waveform Buffer Page 0 and Waveform Buffer Page 1, each with its Waveform Mean.]

• The CIC sum is stored into the waveform buffer and accumulated for the waveform mean.





• When the data shows good periodicity, the waveform becomes valid.





• If the data is non-periodic, the waveform becomes invalid.

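De-ripple Process (2): Waveform Subtraction

The waveform mean (WM) is first subtracted from the stored waveform (WF), and the result is then subtracted from the CIC sum. Removing only the zero-mean ripple shape preserves the DC component in the final result.

[Figure: the de-rippled sum.]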
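The extraction, validation and subtraction steps can be sketched in a few lines. This is an illustration under stated assumptions, not the firmware: the function names, the period length `L` in samples and the tolerance `max_dy` are hypothetical, and the validity test used here (every sample within `max_dy` of the sample one period earlier) is only one reading of "good periodic property".

```python
import math


def extract_waveform(y, L, max_dy):
    """Store one period of the CIC sum y and check it against the previous period.

    Returns (waveform, waveform_mean, valid). Assumes len(y) >= 2 * L.
    """
    wf = y[L:2 * L]                      # stored waveform, one period long
    wm = sum(wf) / L                     # waveform mean
    # Assumed validity rule: the data repeats from one period to the next.
    valid = all(abs(y[n] - y[n - L]) <= max_dy for n in range(L, 2 * L))
    return wf, wm, valid


def de_ripple(y, wf, wm):
    """DR = y[n] - (WF - WM): remove the ripple shape but keep the DC level."""
    L = len(wf)
    return [yn - (wf[n % L] - wm) for n, yn in enumerate(y)]


if __name__ == "__main__":
    L = 12
    ripple = [100 + 10 * math.sin(2 * math.pi * n / L)
              + 3 * math.sin(2 * math.pi * 3 * n / L) for n in range(60)]
    # A small signal, much smaller than the ripple, added after the first two periods.
    y = [r + (2 if 30 <= n < 36 else 0) for n, r in enumerate(ripple)]
    wf, wm, valid = extract_waveform(y, L, max_dy=0.5)
    dr = de_ripple(y, wf, wm)
    print(valid, round(dr[40], 6), round(dr[32], 6))
```

In this synthetic example the ripple amplitude is several times larger than the signal, yet the de-rippled output sits at the DC level outside the signal and rises by the signal amplitude inside it.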

Results of De-ripple Process

• Those otherwise hard-to-see small signals now become visible.

• DC and very slow signals are also preserved.

Filter Implementation

Recursive != IIR

                               Finite Impulse Response (FIR)   Infinite Impulse Response (IIR)
Non-Recursive Implementation   Yes                             No
Recursive Implementation       Possible, resource friendly     Yes

Sliding Sum, recursive form:

$$s[n] = s[n-1] + x[n] - x[n-K]$$

The non-recursive implementation needs:

• 124 memory fetches,

• 124 additions and

• more ops for longer sum lengths.

The recursive implementation needs:

• 1 memory fetch,

• 2 add/sub operations,

• regardless of sum length.
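The operation counts above can be seen by writing both forms side by side. The sketch below is illustrative only; the function names are hypothetical, and samples before the start of the record are assumed to be zero.

```python
from collections import deque


def sliding_sum_direct(x, K):
    """Non-recursive form: re-add the newest K samples for every output."""
    return [sum(x[max(0, n - K + 1):n + 1]) for n in range(len(x))]


def sliding_sum_recursive(x, K):
    """Recursive form: s[n] = s[n-1] + x[n] - x[n-K]."""
    delay = deque([0] * K, maxlen=K)   # holds x[n-K] ... x[n-1]
    s = 0
    out = []
    for xn in x:
        s += xn - delay[0]             # one fetch of x[n-K], one add, one subtract
        delay.append(xn)               # the oldest sample drops out automatically
        out.append(s)
    return out


if __name__ == "__main__":
    data = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
    print(sliding_sum_direct(data, 4) == sliding_sum_recursive(data, 4))
```

The recursive form is still an FIR filter, because the output depends on only the last K inputs. The work per sample does not grow with K; only the delay buffer does.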

Recursive Implementation of CIC Sum

The non-recursive implementation (x[n] multiplied by the weights h1, h2, ..., h[K], ... and added up to form y[n]) needs:

• 248 memory fetches,

• 248 multiplications,

• 248 additions and

• more ops for longer sum lengths.

The CIC sum constructed as a sliding sum of sliding sums:

$$s[n] = s[n-1] + x[n] - x[n-K]$$

$$y[n] = y[n-1] + s[n] - s[n-K]$$

• 2 memory fetches,

• 0 multiplications,

• 4 add/sub ops for any sum length.

The re-formulated CIC sum uses the raw data buffer rather than a separate buffer:

$$u[n] = u[n-1] + x[n] - 2x[n-K] + x[n-2K]$$

$$y[n] = y[n-1] + u[n]$$
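The equivalence of the three forms can be checked with a short script. This is a sketch for illustration, with hypothetical function names; samples before the start of the record are assumed to be zero, and the direct form uses the triangular weights 1, 2, ..., K, ..., 2, 1 that result from summing K sliding sums.

```python
def _at(seq, i):
    """Return seq[i], treating samples before the start of the record as zero."""
    return seq[i] if i >= 0 else 0


def cic_cascade(x, K):
    """Sliding sum of sliding sums; needs a second buffer for s[n]."""
    s, y = [], []
    for n, xn in enumerate(x):
        s.append(_at(s, n - 1) + xn - _at(x, n - K))
        y.append(_at(y, n - 1) + s[n] - _at(s, n - K))
    return y


def cic_reformulated(x, K):
    """Same output, but only the raw data buffer is read (at n-K and n-2K)."""
    u, y = [], []
    for n, xn in enumerate(x):
        u.append(_at(u, n - 1) + xn - 2 * _at(x, n - K) + _at(x, n - 2 * K))
        y.append(_at(y, n - 1) + u[n])
    return y


def cic_direct(x, K):
    """Non-recursive weighted sum with triangular weights."""
    h = [min(j + 1, 2 * K - 1 - j) for j in range(2 * K - 1)]
    return [sum(h[j] * _at(x, n - j) for j in range(2 * K - 1))
            for n in range(len(x))]


if __name__ == "__main__":
    data = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9]
    print(cic_cascade(data, 3) == cic_reformulated(data, 3) == cic_direct(data, 3))
```

Lists are used here to keep the recurrences readable; a hardware implementation would keep only the running values and the delayed samples.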

Process Sequencing

[Figure: timing diagrams for channels CH0 to CH3, comparing flat and sequenced arrangements of the processing steps Sum1, Sum2, Sum3, Sum4, CIC1, CIC2, WF SUB and WF E,S,V.]

• A flat design is fast but uses a lot of logic elements.

• Sequencing the process saves a significant number of logic elements.

• A partially flat, partially sequenced design is sometimes a better arrangement in an FPGA.

BLM DC Process Sequencing

• The processes of calculating sliding sums and CIC sums are fully sequenced.

• The de-ripple processor is flat along the process path, but it operates sequentially for the 4 channels.

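[Figure: data path of the BLM digitizer card. The fully sequenced part computes Sliding Sums 1 to 4 and the CIC sum (u[n] and y[n], together with the copies u[n-L] and y[n-L] delayed by L samples). The partially flat part is the de-ripple stage, with waveform buffer pages WF (PG=0) and WF (PG=1), the page selector PG, the MaxDY comparison and a Decimation Counter.]

If |y[n]-y[n-L]|>MaxDY for entire period, then PG++.

DR = y[n] - (WF - WM)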

FPGA Process Sequencing Options

                                       Program Type            Program Length (CLK cycles)   Reprogram   Resource Usage
Finite State Machine (FSM)             Fixed Wired             10                            Hard        Small
Enclosed Loop Micro-Sequencer (ELMS)   Memory Stored Program   10-1000                       Easy        Small
Microprocessor (MP)                    Memory Stored Program   >1000                         Easy        Large

ELMS – Enclosed Loop Micro-Sequencer

[Figure: ELMS block diagram. Blocks: Program Counter (with Reset and CLK inputs); ROM, 128 x 36 bits, producing the Control Signals; Conditional Branch Logic (allows jump back as in microprocessors); Loop & Return Logic + Stack (special in ELMS: supports FOR loops at machine code level).]

PC   Control Signals    Operation
00   000000000000000
01   001000100011010          LD R1, #n
02   000010001000000          LD R2, #addr_a
03   000000000000100          LD R3, #addr_X
04   000000010001000          LD R7, #0
05   000000000100001    BckA1 LD R4, (R2)
06   000100000010000          INC R2
07   000001000100000          LD R5, (R3)
08   000100010000001          INC R3
09   001001000100000          MUL R6, R4, R5
0a   000000010001000    EndA1 ADD R7, R7, R6
0b   000010000010000          DEC R1
0c   000000100000100          BRNZ BckA1

• PC+ROM is a good sequencer in FPGA.

• Adding Conditional Branch Logic allows the program to loop back.

• Loop & Return Logic + Stack is a special feature in ELMS that supports FOR loops at machine code level.

ELMS – Detailed Block Diagram

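[Figure: detailed ELMS block diagram. Blocks and signals: ROM (128 x 36 bits) producing the User Control Signals; PC with +1 increment, Reset, JMP (desA) and CondJMP; Loop & Return Registers + Stack (128 words) holding CNT, endA and bckA, with Push and Pop; Compare logic generating LoopBack, DEC, RTN and LastPass; JMPIF and RTN inputs. An example at address 0x04 is labeled "RUNat04" with the fields cnt, EndA and BckA.]

LoopBack = DEC = (PC==endA) && (CNT!=0)

LastPass = (PC==endA) && (CNT==1)

      FOR BckA1 EndA1 #n
      LD R2, #addr_a
      LD R3, #addr_X
      LD R7, #0
BckA1 LD R4, (R2)
      INC R2
      LD R5, (R3)
      INC R3
      MUL R6, R4, R5
EndA1 ADD R7, R7, R6
      LD R8, R7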

The Stack supports nested loops, up to 128 layers.
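The loop logic can be illustrated with a small behavioral model. The sketch below is not the ELMS design: it follows the LoopBack and LastPass conditions given above, but the details are assumptions made for illustration, namely that FOR pushes the current loop registers and loads CNT with n-1, that LastPass pops the stack while the PC loops back for the final pass, and that nested loops do not share the same end address. The tuple encoding of the program is hypothetical.

```python
def run_elms(program, max_steps=10_000):
    """Return the sequence of ROM addresses visited by a simplified ELMS model."""
    pc = 0
    cnt, end_a, bck_a = 0, None, None        # loop registers
    stack = []                               # saved registers of outer loops
    trace = []
    while pc < len(program) and len(trace) < max_steps:
        op = program[pc]
        trace.append(pc)
        if op[0] == "FOR":
            _, bck, end, n = op
            if n > 1:                        # assumption: a single pass needs no loop-back
                stack.append((cnt, end_a, bck_a))
                cnt, end_a, bck_a = n - 1, end, bck
            pc += 1
            continue
        loop_back = (pc == end_a) and (cnt != 0)
        last_pass = (pc == end_a) and (cnt == 1)
        if loop_back:
            pc = bck_a                       # jump back without a branch instruction
            if last_pass:
                cnt, end_a, bck_a = stack.pop()   # restore the outer loop registers
            else:
                cnt -= 1
        else:
            pc += 1
    return trace


def dot_product_program(n):
    """The example listing, with BckA1 at address 4 and EndA1 at address 9."""
    return [
        ("FOR", 4, 9, n),
        ("OP", "LD R2, #addr_a"),
        ("OP", "LD R3, #addr_X"),
        ("OP", "LD R7, #0"),
        ("OP", "LD R4, (R2)"),               # BckA1
        ("OP", "INC R2"),
        ("OP", "LD R5, (R3)"),
        ("OP", "INC R3"),
        ("OP", "MUL R6, R4, R5"),
        ("OP", "ADD R7, R7, R6"),            # EndA1
        ("OP", "LD R8, R7"),
    ]


if __name__ == "__main__":
    trace = run_elms(dot_product_program(3))
    print(len(trace), trace.count(9))
```

In this model the loop body runs n times and no clock cycle is spent on decrement or branch instructions; the loop registers and the stack do that work alongside the program counter.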

Software: Using a Spreadsheet as the Compiler

What’s Good About ELMS: FOR Loops at Machine Code Level

• The looping sequence is known in this example before entering the loop.

• Regular microprocessors treat the sequence as unknown.

• ELMS supports FOR loops with pre-defined iterations at machine code level.

• Execution time is saved and micro-complexities (branch penalty, pipeline bubble, etc.) associated with conditional branches are avoided.

Example:

$$Y = \sum_{i=0}^{n} a_i X_i$$

Microprocessor (conditional branch):

      LD R1, #n
      LD R2, #addr_a
      LD R3, #addr_X
      LD R7, #0
BckA1 LD R4, (R2)
      INC R2
      LD R5, (R3)
      INC R3
      MUL R6, R4, R5
EndA1 ADD R7, R7, R6
      DEC R1
      BRNZ BckA1

The ELMS:

      FOR BckA1 EndA1 #n
      LD R2, #addr_a
      LD R3, #addr_X
      LD R7, #0
BckA1 LD R4, (R2)
      INC R2
      LD R5, (R3)
      INC R3
      MUL R6, R4, R5
EndA1 ADD R7, R7, R6

[Figure: execution-time comparison between the microprocessor and the ELMS; the conditional branch portion is labeled 25%.]
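A rough instruction count makes the saving concrete. The sketch below counts instructions in the two listings for this example: 4 setup instructions in both, a 6-instruction loop body, and 2 extra loop-control instructions (DEC and BRNZ) per pass on the microprocessor. It assumes one clock cycle per instruction and no branch penalty, which is a simplification; the function names are hypothetical.

```python
def microprocessor_cycles(n, setup=4, body=6, loop_overhead=2):
    """Setup, then n passes of the body plus DEC and BRNZ on every pass."""
    return setup + n * (body + loop_overhead)


def elms_cycles(n, setup=4, body=6):
    """Setup (including the FOR instruction), then n passes of the body only."""
    return setup + n * body


def saving(n):
    """Fraction of the microprocessor cycle count that the ELMS version avoids."""
    mp = microprocessor_cycles(n)
    return (mp - elms_cycles(n)) / mp


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, microprocessor_cycles(n), elms_cycles(n), round(saving(n), 4))
```

For large n the saving approaches 2/8 = 25% under these assumptions, which agrees with the 25% label in the figure if that label refers to the loop-control share. Any branch penalty on a real microprocessor would add to the difference.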

Conclusion

• The de-ripple algorithm is a useful alternative method for eliminating low-frequency periodic noise.

• The ELMS is a handy sequence controller in an FPGA that uses a small amount of resources.

The End

Thanks

What’s Good about ELMS: No ALU => Small Resource Usage

[Figure: three architectures. Princeton Architecture: Program Control and ALU sharing one Program/DATA Memory. Harvard Architecture: Program Control and ALU with separate Program Memory and DATA Memory. Fermilab Architecture (?): Sequencer (ELMS) with Program Memory, and a Data Processor with DATA Memory.]

• The Princeton Architecture is more suitable at the system level, while the Harvard Architecture is better suited to the micro-structure level.

• Regular microprocessors cannot run a looped program without an ALU.

• The ALU takes a large amount of resources, yet it may not be efficiently utilized for data processing tasks in an FPGA.

• The ELMS can run a nested-loop program without an ALU.

• Further separation of program and data is therefore possible.

• The ELMS is kept small.
