Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays...

43
Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden, 31 May – 9 June 2007 Javier Serrano, CERN AB-CO-HT

Transcript of Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays...

Page 1: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Introduction to Field Programmable Gate Arrays

Lecture 2/3

CERN Accelerator School on Digital Signal ProcessingSigtuna, Sweden, 31 May – 9 June 2007

Javier Serrano, CERN AB-CO-HT

Page 2: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Outline

Digital Signal Processing using FPGAsIntroduction. Why FPGAs for DSP?Fixed point and its subtleties.Doing arithmetic in hardware.Distributed Arithmetic (DA).COordinate Rotation DIgital Computer (CORDIC).

Page 3: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Outline

Digital Signal Processing using FPGAsIntroduction. Why FPGAs for DSP?Fixed point and its subtleties.Doing arithmetic in hardware.Distributed Arithmetic (DA).COordinate Rotation DIgital Computer (CORDIC).

Page 4: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Why FPGAs for DSP? (1)

Conventional DSP Device(Von Neumann architecture)

Data Out

RegData In

MAC unit....C0

Data Out

C1 C2 C255

FPGA

Reg0 Reg1 Reg2 Reg255Data In

All 256 MAC operations in 1 clock cycle256 Loops needed to process samples

Reason 1: FPGAs handle high computational workloads

Page 5: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

FPGAs are ideal for multi-channel DSP designs

LPF

Multi ChannelFilter

80MHzSamples

ch1

ch2

ch3

ch4

LPF

LPF

LPF

LPF

20MHzSamples

Many low sample rate channels can be multiplexed (e.g. TDM) and processed in the FPGA, at a high rate.Interpolation (using zeros) can also drive sample rates higher.

Page 6: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Why FPGAs for DSP? (2)

Q = (A x B) + (C x D) + (E x F) + (G x H)

can be implemented in parallel

××

×× +

+

+

+

+

+

A

BC

DE

FG

H

Q

Reason 2: Tremendous Flexibility

But is this the only way in the FPGA?

Page 7: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

××

×× +

+

+

+

+

+ ×+

+D Q

××

+

+

+

+D Q

Parallel Semi-Parallel Serial

Customize Architectures to Suit Your Ideal Algorithms

FPGAs allow Area (cost) / Performance tradeoffs

Optimized for?Speed Cost

Page 8: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

DDCDDC

A/D

A/D

D/A

D/A

MACs

ControlDDCDDC

DUCDUC

DUCDUC

MACs

Control

DSPProcs.

DUCDUC

DUCDUC

DDCDDC DDCDDC

SDRAMAFE

FPGA DSP Card

Hundreds of Termination Resistors

PowerPC

SDRAM

SSTL3TranslatorsQuad

TRx

QuadTRx

ASSP

FPGA NetworkCard

SDRAM

A/D

A/D

D/A

D/A

Control

Control

PL4

CORBA

PowerPC

MACs, DUCs, DDCs, Logic

PowerPC

PowerPCPowerPC

3.125 Gbps

ASSP

SDRAM

Reason 3: Integration simplifies PCBs

Why FPGAs for DSP? (3)

Page 9: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Outline

Digital Signal Processing using FPGAsIntroduction. Why FPGAs for DSP?Fixed point and its subtleties.Doing arithmetic in hardware.Distributed Arithmetic (DA).COordinate Rotation DIgital Computer (CORDIC).

Page 10: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Unsigned integers: positive values only

Page 11: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

2’s complement

Page 12: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Fixed point binary numbers

Example: 3 integer bits and 5 fractional bits

Page 13: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Fixed point truncation vs. rounding

Note that in 2’s complement, truncation is biased while rounding isn’t.

Page 14: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Outline

Digital Signal Processing using FPGAsIntroduction. Why FPGAs for DSP?Fixed point and its subtleties.Doing arithmetic in hardware.Distributed Arithmetic (DA).COordinate Rotation DIgital Computer (CORDIC).

Page 15: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

The Full Adder (FA)

Page 16: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Add/subtract circuit

S = A+B when Control=‘0’S = A-B when Control=‘1’

Page 17: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Saturation

You can’t let the data path become arbitrarily wide. Saturation involves overflow detection and a multiplexer. Useful in accumulators (like the one in the PI controller we use in the lab).

Page 18: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Multiplication: pencil & paper approach

Page 19: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

A 4-bit unsigned multiplier using Full Adders and AND gates

Of course, you can use embedded multipliers if your chip has them!

Page 20: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Constant coefficient multipliers using ROM

For “easy” coefficients, there are smarter ways. E.g. to multiply a number A by 31, left-shift A by 5 places then subtract A.

Page 21: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Division: pencil & paper

Uses add/subtract blocks presented earlier.MSB produced first: this will usually imply we have to wait for whole operation to finish before feeding result to another block.Longer combinational delays than in multiplication: an N by N division will always take longer than an N by N multiplication.

Page 22: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Pipelining the division array

Page 23: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Square root

Take a division array, cut it in half (diagonally) and you have square root. Square root is therefore faster than division!Although with less ripple through, this block suffers from the same problems as the division array.Alternative approach: first guess with a ROM, then use an iterative algorithm such as Newton-Raphson.

Page 24: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Outline

Digital Signal Processing using FPGAsIntroduction. Why FPGAs for DSP?Fixed point and its subtleties.Doing arithmetic in hardware.Distributed Arithmetic (DA).COordinate Rotation DIgital Computer (CORDIC).

Page 25: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Distributed Arithmetic (DA) 1/2

∑−

=

⋅=1

0][][

N

nnxncy

∑ ∑−

=

=

⎟⎠

⎞⎜⎝

⎛⋅⋅=

1

0

1

02][][

N

n

B

b

bb nxncy

Digital filtering is about sums of products:

Let’s assume:c[n] constant (prerequisite to use DA)x[n] input signal B bits wide

Then:xb[n] is bit number b of x[n] (either 0 or 1)

And after some rearrangement of terms: ∑ ∑

=

=

⎟⎠

⎞⎜⎝

⎛⋅⋅=

1

0

1

0][][2

B

b

N

nb

b nxncy

This can be implemented with an N-input LUT

Page 26: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Distributed Arithmetic (DA) 2/2

∑ ∑−

=

=

⎟⎠

⎞⎜⎝

⎛⋅⋅=

1

0

1

0][][2

B

b

N

nb

b nxncy

xB[0] …… x1[0] x0[0]

xB[1] …… x1[1] x0[1]

xB[N-1] …… x1[N-1] x0[N-1]

……

....

……

....

……

.... LUT

+ Register

2-1

y

Generates a result every B clock ticks. Replicating logic one can trade off speed vs. area, to the limit of getting one result per clock tick.

Page 27: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Outline

Digital Signal Processing using FPGAsIntroduction. Why FPGAs for DSP?Fixed point and its subtleties.Doing arithmetic in hardware.Distributed Arithmetic (DA).COordinate Rotation DIgital Computer (CORDIC).

Page 28: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

COrdinate Rotation DIgital Computer

Page 29: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Pseudo-rotations

Page 30: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Basic CORDIC iterations

Page 31: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Angle accumulator

Page 32: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

The scaling factor

Page 33: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Rotation Mode

Page 34: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Example: calculate sin and cos of 30º

Page 35: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Vectoring Mode

Vector magnitude

Page 36: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Circular coordinate system

Page 37: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Other coordinate systems

Page 38: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Generalized CORDIC equations

Page 39: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Summary of CORDIC functions

Page 40: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Precision and convergence

Page 41: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

FPGA implementation

Page 42: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Iterative bit-serial design

Page 43: Introduction to Field Programmable Gate Arrays · Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden,

Acknowledgements

Many thanks to Jeff Weintraub (Xilinx University Program) and Bob Stewart (University of Strathclyde) for many of these slides.