FPGA architecture and design technology
ECE 435 Embedded Systems, Spring 2019
FPGA architecture and design technology
Nikos Bellas
Computer and Communications Engineering Department, University of Thessaly
1CE435 - Embedded Systems
FPGA fabric
• A generic island-style FPGA fabric
• Configurable Logic Blocks (CLB) and Programmable Switch Matrices (PSM)
• Bitstream configures the functionality of each CLB and the interconnection between logic blocks
CLB (Configurable Logic Block): Xilinx case
The Xilinx Slice
• Xilinx slice features
– LUTs
– MUXF5, MUXF6, MUXF7, MUXF8 (only the F5 and F6 MUX are shown in this diagram)
– Carry Logic
– MULT_ANDs
– Sequential Elements
Detailed Structure
Slice Logic
• Look-Up Tables
– An N-input LUT can implement any combinational Boolean function of N inputs
– Coarser-grained than logic gates
– Less area efficient than fixed logic gates (e.g. 4-input AND gate)
– Very powerful concept for implementation of bit-level random digital logic.
– Typical values: N = 4, 5, or 6
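LUT
Example 2-input LUT
• Lookup table with inputs a, b and output out, shown for two configurations:
a b | out (configuration 0 0 0 1) | out (configuration 1 0 0 1)
0 0 | 0 | 1
0 1 | 0 | 0
1 0 | 0 | 0
1 1 | 1 | 1
• The configuration bits hold the truth table; the inputs a and b select which bit drives out.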
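The 2-input LUT above can be modeled in a few lines of Python. This is only an illustrative sketch, not vendor code: the helper name `make_lut` and the convention that the first input is the most significant index bit are assumptions made here for the example.

```python
def make_lut(config_bits):
    """Build a LUT from its configuration bits (hypothetical helper).

    config_bits[i] is the output for the input combination whose binary
    value is i, with the first input as the most significant bit.
    """
    n = (len(config_bits) - 1).bit_length()
    if len(config_bits) != 2 ** n:
        raise ValueError("an N-input LUT needs exactly 2**N configuration bits")

    def lut(*inputs):
        if len(inputs) != n:
            raise ValueError(f"expected {n} inputs")
        # The inputs form an address into the stored truth table.
        index = 0
        for bit in inputs:
            index = (index << 1) | (bit & 1)
        return config_bits[index]

    return lut


# The two configurations from the example table.
and_lut = make_lut([0, 0, 0, 1])   # out = a AND b
xnor_lut = make_lut([1, 0, 0, 1])  # out = a XNOR b
```

The same function body serves both gates; only the stored bits differ, which is what the bitstream changes in a real device.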
Example 4-input LUT
Slice Logic
• D Flip Flops
– A D flip-flop at the output of the LUT can be used to register the output data
– It can also hold state in FSM designs
– or serve as a pipeline stage, etc.
Slice Logic
• Carry Logic
– Carry logic is used to speed up carry-based computations.
– Additions (CLA, Ripple Carry Adders), parity functions, etc.
– Cin/Cout routing is separate from general purpose routing (fewer logic stages and faster)
Carry Chains
• Simple, fast, and complete arithmetic logic
– Dedicated XOR gate for single-level sum completion
– Uses dedicated routing resources
– All synthesis tools can infer carry logic
[Figure: a CLB with four slices (S0 to S3), each with CIN and COUT, organized into a first and a second carry chain; the chains continue to S0 and to the CIN of S2 of the next CLB.]
Dedicated carry chains speed up arithmetic operations
$S = A \oplus B \oplus C_{in}$
$C_{out} = AB + C_{in}(A \oplus B)$
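As a quick check of the sum and carry equations, the following Python sketch chains one-bit full adders into a ripple-carry adder, with the carry passed from bit to bit the way a carry chain passes COUT to the next CIN. The function names and the default 4-bit width are assumptions for illustration only.

```python
def full_adder(a, b, cin):
    """One-bit full adder using the sum and carry equations above."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(x, y, width=4):
    """Add two unsigned integers bit by bit, propagating the carry.

    Returns (sum truncated to `width` bits, final carry out).
    """
    carry = 0
    result = 0
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        s, carry = full_adder(a, b, carry)
        result |= s << i
    return result, carry
```

For example, `ripple_carry_add(5, 6)` yields `(11, 0)`, and `ripple_carry_add(15, 1)` yields `(0, 1)` because the 4-bit sum overflows into the carry out.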
Multiplexer Logic
[Figure: a CLB with four slices (S0 to S3); each slice is labeled with an F5 MUX and one of the F6, F7, or F8 MUXes.]
MUXF8 combines the two MUXF7 outputs (from the CLB above or below)
MUXF6 combines slices S2 and S3
MUXF7 combines the two MUXF6 outputs
MUXF6 combines slices S0 and S1
MUXF5 combines LUTs in each slice
Dedicated MUXes provided to connect slices and LUTs
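The reason a dedicated MUX can combine two LUTs is Shannon expansion: a 5-input function equals a 2:1 selection between its two 4-input cofactors. The sketch below assumes 4-input LUTs and uses hypothetical names (`lut_f`, `lut_g`, `split_for_muxf5`); it illustrates the principle rather than any specific slice wiring.

```python
from itertools import product


def truth_table(func, n):
    """List the outputs of an n-input function, first input as MSB."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=n)]


def lut_read(table, bits):
    """Read one stored bit, using the inputs as the table address."""
    index = 0
    for bit in bits:
        index = (index << 1) | bit
    return table[index]


def split_for_muxf5(func5):
    """Split a 5-input function into two 4-input LUT contents.

    lut_f holds the cofactor for sel = 0, lut_g the cofactor for sel = 1.
    """
    lut_f = truth_table(lambda a, b, c, d: func5(0, a, b, c, d), 4)
    lut_g = truth_table(lambda a, b, c, d: func5(1, a, b, c, d), 4)
    return lut_f, lut_g


def eval_with_muxf5(lut_f, lut_g, sel, a, b, c, d):
    """Both LUTs see the same four inputs; the MUX picks one output."""
    low = lut_read(lut_f, (a, b, c, d))
    high = lut_read(lut_g, (a, b, c, d))
    return high if sel else low


def parity5(sel, a, b, c, d):
    """Example 5-input function: odd parity."""
    return sel ^ a ^ b ^ c ^ d
```

Applying the same split again with a second select input gives a 6-input function from two such pairs, which is the role the wider MUXes play in the hierarchy listed above.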
Programmable wiring
• Organized into channels.
– Many wires per channel.
• Connections between wires made at programmable interconnection points.
• Must choose:
– Channels from source to destination.
– Wires within the channels.
• Routing area typically much larger than logic area
Programmable interconnect
• MOS switch controlled by configuration bit:
[Figure: a storage cell (D, Q) holding the configuration bit drives the MOS switch.]
Programmable wiring paths
Switchbox
[Figure: a switchbox at the intersection of four channels.]
Choosing a path
[Figure: candidate routing paths between two logic elements (LE).]
Interconnection architectures
• Segmented interconnects consist of shorter wires that connect to emulate longer wires
• Hierarchical interconnects assume that most connections are local
Routing problems
• Global routing:
– Which combination of channels?
• Local routing:
– Which wire in each channel?
• Routing metrics:
– Net length.
– Delay.
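To make the channel-selection problem concrete, here is a minimal sketch of one classic approach, breadth-first "maze" routing on a grid. It is an assumption chosen for illustration, not necessarily the method used in the lecture or in any FPGA tool. Each grid cell stands for a channel segment, `blocked` marks segments that are already full, and net length is measured in segments; all names are hypothetical.

```python
from collections import deque


def route_net(width, height, blocked, source, target):
    """Return a shortest list of grid cells from source to target, or None.

    Breadth-first search explores cells in order of distance, so the
    first time the target is reached the path has minimum net length.
    """
    previous = {source: None}
    frontier = deque([source])
    while frontier:
        cell = frontier.popleft()
        if cell == target:
            # Walk back through the predecessors to rebuild the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = previous[cell]
            return path[::-1]
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            inside = 0 <= nx < width and 0 <= ny < height
            if inside and nxt not in blocked and nxt not in previous:
                previous[nxt] = cell
                frontier.append(nxt)
    return None


path = route_net(4, 3, {(1, 0), (1, 1)}, (0, 0), (3, 0))
net_length = len(path) - 1  # number of segment-to-segment hops
```

This sketch optimizes only net length. A delay-driven router would weight each hop by an estimated delay instead of counting hops equally.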
I/O
• Fundamental selection: input, output, three-state?
• Additional features:
– Register.
– Voltage levels.
– Slew rate.
Configuration
• Must set control bits for:
– LE.
– Interconnect.
– I/O blocks.
• Usually configured off-line.
– Separate burn-in step (antifuse).
– At power-up (SRAM).
Configuration vs. programming
• FPGA configuration:
– Bits stay at the device they program.
– A configuration bit controls a switch or a logic bit.
• CPU programming:
– Instructions are fetched from a memory.
– Instructions select complex operations.
[Figure: a memory holding the instruction add r1, r2, which the CPU fetches into its instruction register (IR).]
Reconfiguration
• Some FPGAs are designed for fast configuration.
– A few clock cycles, not thousands of clock cycles.
• Allows hardware to be changed on-the-fly.
FPGA fabric architecture questions
• Given limited area budget:
– How many logic elements?
– How much interconnect?
– How many I/O blocks?
Logic element questions
• How many inputs?
• How many functions?
– All functions of n inputs or eliminate some combinations?
– What inputs go to what pieces of the function?
• Any specialized logic?
– Adder, etc.
• What register features?
Interconnect questions
• How many wires in each channel?
• Uniform distribution of wiring?
• How should wires be segmented?
• How rich is interconnect between channels?
• How long is the average wire?
• How much buffering do we add to wires?
I/O block questions
• How many pins?
– Maximum number of pins determined by package type.
• Are pins programmed individually or in groups?
• Can all pins perform all functions?
• How many logic families do we support?
The Design Cycle for FPGAs (I)
The Design Cycle for FPGAs (II)
Mapping
Placement
Routing
Modern FPGA architecture: Xilinx Virtex family
Columns of on-chip SRAMs, hard IP cores (PPC 405), and DSP slices (multiply-accumulate units)
DSP slices
A large number of hard multipliers enables DSP applications
Example Application: FIR filtering
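The slide gives only the title, so the following is a generic sketch of a direct-form FIR filter, y[n] = sum over k of h[k] * x[n - k], written to show why the filter fits multiply-accumulate hardware: each tap contributes one multiply and one add per output sample. The function and variable names are assumptions, and software loops stand in for what an FPGA would compute in parallel or in a pipelined chain.

```python
def fir_filter(samples, coefficients):
    """Direct-form FIR filter over a list of samples.

    `taps` models the delay line: taps[k] holds x[n - k], with zeros
    before the first sample arrives.
    """
    taps = [0] * len(coefficients)
    outputs = []
    for x in samples:
        # Shift the new sample into the delay line.
        taps = [x] + taps[:-1]
        # One multiply-accumulate per tap.
        acc = 0
        for c, t in zip(coefficients, taps):
            acc += c * t
        outputs.append(acc)
    return outputs
```

Feeding a unit impulse, as in `fir_filter([1, 0, 0, 0], [1, 2, 3])`, returns the coefficients followed by zeros, `[1, 2, 3, 0]`, which is a convenient sanity check for any FIR implementation.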
Architectural Evolution: Reconfigurable FPGAs
[Figure: device complexity and performance (vertical axis) versus time (horizontal axis: 1985, 1992, 2000, 2002, 2004, 2005). Stages in order of increasing complexity:]
• Glue Logic
– FPGA Fabric
• Block Logic
– FPGA Fabric
– Block RAM
• Platform Logic
– FPGA Fabric
– Block RAM
– Embedded Registers and Multipliers
– Clock Management
– Multi-standard Programmable IO
• System Logic
– FPGA Fabric
– Block RAM
– Embedded Registers and Multipliers
– Clock Management
– Multi-standard Programmable IO
– Embedded Microprocessor
– Multigigabit Transceivers
• Domain-optimized System Logic
– FPGA Fabric
– Block RAM
– Embedded Registers and Multipliers
– Clock Management
– Multi-standard Programmable IO
– Embedded Microprocessor
– Multigigabit Transceivers
– Embedded DSP-optimized Multipliers
– Embedded Ethernet MACs
• Programmable "System in a Package" (2005)