The Power of Communication: Energy-Efficient NoCs for FPGAs
![Page 1: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/1.jpg)
The Power of Communication: Energy-Efficient NoCs for FPGAs
Mohamed ABDELFATTAH, Vaughn BETZ
![Page 2: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/2.jpg)
Outline
1. Why NoCs on FPGAs?
2. Embedded NoCs
3. Power Analysis
![Page 3: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/3.jpg)
1. Why NoCs on FPGAs?
Motivation
Interconnect
Logic Blocks
Switch Blocks
Wires
![Page 4: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/4.jpg)
1. Why NoCs on FPGAs?
Motivation
Logic Blocks
Switch Blocks
Wires
Hard Blocks:
• Memory
• Multiplier
• Processor
![Page 5: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/5.jpg)
1. Why NoCs on FPGAs?
Motivation
Logic Blocks
Switch Blocks
Wires
Hard Blocks:
• Memory
• Multiplier
• Processor
Hard Interfaces: DDR/PCIe ..
Interconnect still the same
1600 MHz
200 MHz
800 MHz
![Page 6: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/6.jpg)
1. Why NoCs on FPGAs?
Motivation
DDR3 PHY and Controller
PCIe Controller
Gigabit Ethernet
Problems:
1. Bandwidth requirements for hard logic/interfaces
2. Timing closure
1600 MHz
200 MHz
800 MHz
![Page 7: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/7.jpg)
1. Why NoCs on FPGAs?
Motivation
DDR3 PHY and Controller
PCIe Controller
Gigabit Ethernet
Problems:
1. Bandwidth requirements for hard logic/interfaces
2. Timing closure
3. High interconnect utilization:
– Huge CAD problem
– Slow compilation
– Power/area utilization
4. Wire speed not scaling:
– Delay is interconnect-dominated
![Page 8: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/8.jpg)
Barcelona Los Angeles
Keep the “roads”, but add “freeways”.
Hard Blocks
Logic Cluster
Source: Google Earth
![Page 9: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/9.jpg)
1. Why NoCs on FPGAs?
DDR3 PHY and Controller
PCIe Controller
Gigabit Ethernet
Problems:
1. Bandwidth requirements for hard logic/interfaces
2. Timing closure
3. High interconnect utilization:
– Huge CAD problem
– Slow compilation
– Power/area utilization
4. Wire speed not scaling:
– Delay is interconnect-dominated
FPGA with NoC
NoC
Routers
Links
Router forwards data packet
Router moves data to local interconnect
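The two router actions named on this slide (forward a packet toward another router, or hand the data to the local interconnect once it has arrived) can be illustrated with a minimal sketch. It assumes a 2D mesh with dimension-order (XY) routing; the slides do not state the routing algorithm, and the names `next_hop` and `route` and the `(x, y)` addressing are hypothetical.

```python
def next_hop(current, dest):
    """Return the next router on an XY route, or None to eject locally.

    Assumption: routers sit on a 2D mesh and are addressed as (x, y).
    """
    cx, cy = current
    dx, dy = dest
    if cx != dx:
        # Travel along the X dimension first.
        return (cx + (1 if dx > cx else -1), cy)
    if cy != dy:
        # Then travel along the Y dimension.
        return (cx, cy + (1 if dy > cy else -1))
    # Arrived: the router moves the data to the local interconnect.
    return None


def route(src, dest):
    """List every router a packet visits from src to dest, inclusive."""
    path = [src]
    while (hop := next_hop(path[-1], dest)) is not None:
        path.append(hop)
    return path


print(route((0, 0), (2, 1)))
```

Each intermediate router only forwards the packet; only the last router in the path hands it to the fabric.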
![Page 10: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/10.jpg)
1. Why NoCs on FPGAs?
DDR3 PHY and Controller
PCIe Controller
Gigabit Ethernet
Problems:
1. Bandwidth requirements for hard logic/interfaces
2. Timing closure
3. High interconnect utilization:
– Huge CAD problem
– Slow compilation
– Power/area utilization
4. Wire speed not scaling:
– Delay is interconnect-dominated
5. Abstraction favors modularity:
– Parallel compilation
– Partial reconfiguration
– Multi-chip interconnect
FPGA with NoC
High bandwidth endpoints known
Pre-design NoC to requirements
NoC links are “re-usable”
NoC is heavily “pipelined”
NoC abstraction favors modularity
![Page 11: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/11.jpg)
1. Why NoCs on FPGAs?
DDR3 PHY and Controller
PCIe Controller
Gigabit Ethernet
Problems:
1. Bandwidth requirements for hard logic/interfaces
2. Timing closure
3. High interconnect utilization:
– Huge CAD problem
– Slow compilation
– Power/area utilization
4. Wire speed not scaling:
– Delay is interconnect-dominated
5. Abstraction favors modularity:
– Parallel compilation
– Partial reconfiguration
– Multi-chip interconnect
FPGA with NoC
Latency-tolerant communication
NoC abstraction favors modularity
NoCs can simplify FPGA design
Previous work: Compelling area efficiency and performance
Does the NoC abstraction come at a high power cost?
![Page 12: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/12.jpg)
Outline
1. Why NoCs on FPGAs?
2. Embedded NoCs
• Mixed NoCs
• Hard NoCs
3. Power Analysis
![Page 13: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/13.jpg)
2. Embedded NoCs
Embedded NoCs
FPGA
DDRx Interface
PCIe Interface
Router
Compute Module
Links (Hard or Soft)
Fabric Port
(Hard or Soft)
Soft Links + Soft Routers = “Soft” NoC
Soft Links + Hard Routers = “Mixed” NoC
Hard Links + Hard Routers = “Hard” NoC
![Page 14: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/14.jpg)
Methodology
Soft: FPGA CAD Tools
Hard: ASIC CAD Tools (Design Compiler)
Mixed
HSPICE
Area
Speed
Power
Power?
Toggle rates
Gate-level simulation
![Page 15: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/15.jpg)
2. Embedded NoCs
Mixed NoCs
Soft Links + Hard Routers = “Mixed” NoC
FPGA
Router
Router Logic
Programmable Interconnect
Logic blocks
Programmable “soft” interconnect
Baseline Router
Width   VCs   Ports   Buffer
32      2     5       10/VC
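As a rough illustration of what the baseline router parameters (Width 32, VCs 2, Ports 5, Buffer 10/VC) imply, the sketch below tallies input-buffer storage per router. It assumes the width is in bits and that "10/VC" means a depth of 10 width-sized words per virtual channel on every port; the slide does not spell out these units, and the class name `RouterConfig` is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouterConfig:
    width_bits: int = 32     # link width (assumed to be in bits)
    num_vcs: int = 2         # virtual channels per port
    num_ports: int = 5       # router ports
    buffer_depth: int = 10   # assumed: words buffered per VC

    def buffer_bits(self) -> int:
        """Total buffer storage if every port buffers every VC."""
        return (self.num_ports * self.num_vcs
                * self.buffer_depth * self.width_bits)


baseline = RouterConfig()
print(baseline.buffer_bits(), "bits of buffering per router")
```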
![Page 16: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/16.jpg)
![Page 17: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/17.jpg)
2. Embedded NoCs
Mixed NoCs
FPGA
Router
Router Logic
Programmable Interconnect
Special Feature: Configurable topology
Assumed a mesh
Can form any topology
![Page 18: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/18.jpg)
2. Embedded NoCs
Hard NoCs
Hard Links + Hard Routers = “Hard” NoC
FPGA
Router
Router Logic
Dedicated Interconnect
Logic blocks
Dedicated “hard” interconnect
Programmable “soft” interconnect
![Page 19: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/19.jpg)
![Page 20: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/20.jpg)
2. Embedded NoCs
Hard NoCs
Hard Links + Hard Routers = “Hard” NoC
FPGA
Router
Router Logic
Dedicated Interconnect
Special Feature: Low-V mode
1.1 V → 0.9 V
Save 33% Dynamic Power
~15% slower
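The 33% dynamic power saving quoted for dropping the hard NoC supply from 1.1 V to 0.9 V is consistent with the standard first-order CMOS model, in which dynamic power scales as C·V²·f. The sketch below checks that arithmetic, assuming switched capacitance and clock frequency stay constant; the function name is hypothetical and the model is a textbook approximation, not the authors' measurement flow.

```python
def dynamic_power_ratio(v_new: float, v_old: float, f_ratio: float = 1.0) -> float:
    """Ratio of new to old dynamic power under P ~ C * V^2 * f."""
    return (v_new / v_old) ** 2 * f_ratio


ratio = dynamic_power_ratio(0.9, 1.1)
saving_percent = (1 - ratio) * 100
print(f"{saving_percent:.0f}% dynamic power saved")
```

Passing `f_ratio=0.85` would additionally model running about 15% slower, which lowers power further but not the energy spent per transferred byte.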
![Page 21: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/21.jpg)
Outline
1. Why NoCs on FPGAs?
2. Embedded NoCs
3. Power Analysis
• Components Analysis
• System Analysis
![Page 22: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/22.jpg)
Soft, Mixed and Hard
               Soft          Mixed / Hard
Area           33% of FPGA   ~ 1.5% of FPGA
Speed          166 MHz       730 – 940 MHz
Bisection BW   ~ 10 GB/s     ~ 50 GB/s
Area Gap: 20X – 23X smaller
Speed Gap: 5X – 6X faster
Power Gap: 1X (Soft), 9X (Mixed), 11X (Hard), 15X (Hard, Low-V)
Average
64 – NoC
1. Power-aware design
2. NoC power budget
3. Comparison
Investigate BW and power together
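The bisection bandwidth figures (~10 GB/s soft, ~50 GB/s mixed/hard) can be roughly reproduced with a back-of-the-envelope calculation. This sketch assumes an 8×8 mesh of 64 routers with 32-bit links and counts links in both directions across the cut; the mesh dimensions are an assumption, and the function name is hypothetical.

```python
def mesh_bisection_gbytes_per_s(side: int, width_bits: int, freq_mhz: float) -> float:
    """Bisection bandwidth of a side x side mesh, in GB/s."""
    links_across_cut = 2 * side          # one link per row, both directions
    bytes_per_cycle = width_bits / 8
    return links_across_cut * bytes_per_cycle * freq_mhz * 1e6 / 1e9


for name, mhz in [("Soft", 166), ("Mixed/Hard, slowest", 730), ("Mixed/Hard, fastest", 940)]:
    print(f"{name}: {mesh_bisection_gbytes_per_s(8, 32, mhz):.1f} GB/s")
```

Under these assumptions the soft NoC lands near 10 GB/s and the embedded NoCs bracket 50 GB/s, in line with the values on the slide.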
![Page 23: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/23.jpg)
3. Power Analysis
Power-Aware NoC Design
Total BW = 250 GB/s
Most Efficient NoC?
Links Power
Routers Power
Wider Links, Fewer Routers
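One way to picture the "wider links, fewer routers" trade-off is to hold total bandwidth at 250 GB/s and ask how wide each link must be as the router count shrinks. The sketch assumes total bandwidth = routers × link width × clock frequency and a 1 GHz clock; both the formula and the clock value are illustrative assumptions rather than figures from the slides.

```python
import math


def required_width_bits(total_gbytes_per_s: float, num_routers: int,
                        freq_ghz: float = 1.0) -> int:
    """Smallest link width (bits) that meets the total bandwidth target."""
    bytes_per_cycle = total_gbytes_per_s / (num_routers * freq_ghz)
    return math.ceil(bytes_per_cycle * 8)


for n in (64, 32, 16):
    print(n, "routers ->", required_width_bits(250, n), "bit links")
```

Each such design point moves power between routers and links, which is why the slide plots the two components separately.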
![Page 24: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/24.jpg)
![Page 25: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/25.jpg)
![Page 26: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/26.jpg)
3. Power Analysis
NoC Power Budget
How much is used for system-level communication?
Typical FPGA Dynamic Power: 17.4 W
250 GB/s total bandwidth
Soft NoC: 123%
Mixed NoC
Hard NoC
Hard NoC (Low-V)
![Page 27: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/27.jpg)
![Page 28: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/28.jpg)
![Page 29: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/29.jpg)
3. Power Analysis
NoC Power Budget
Typical FPGA Dynamic Power: 17.4 W
250 GB/s total bandwidth
NoC                Share of typical FPGA dynamic power
Soft NoC           123%
Mixed NoC          15%
Hard NoC           11%
Hard NoC (Low-V)   7%
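To put the percentages in absolute terms, the sketch below converts each NoC's share of the 17.4 W typical FPGA dynamic power into watts, and then into energy per gigabyte moved at 250 GB/s. The input numbers are the ones on the slide; the function and variable names are hypothetical.

```python
TYPICAL_FPGA_DYNAMIC_W = 17.4
TOTAL_BW_GB_PER_S = 250.0

share_percent = {
    "Soft NoC": 123,
    "Mixed NoC": 15,
    "Hard NoC": 11,
    "Hard NoC (Low-V)": 7,
}


def noc_power_w(percent: float) -> float:
    """NoC power in watts, given its share of typical dynamic power."""
    return TYPICAL_FPGA_DYNAMIC_W * percent / 100


def energy_mj_per_gb(power_w: float) -> float:
    """Energy per gigabyte in mJ: W / (GB/s) = J/GB, times 1000."""
    return power_w / TOTAL_BW_GB_PER_S * 1000


for name, pct in share_percent.items():
    watts = noc_power_w(pct)
    print(f"{name}: {watts:.2f} W, {energy_mj_per_gb(watts):.1f} mJ/GB")
```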
![Page 30: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/30.jpg)
3. Power Analysis
Bandwidth in Perspective
DDR3 Module 1
PCIe Module 2
14.6 GB/s (×4)
17 GB/s (×4)
Full theoretical BW
Cross whole chip!
Aggregate Bandwidth: 126 GB/s
NoC Power Budget: 3.5%
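The 126 GB/s aggregate and the 3.5% budget can be checked with simple arithmetic. The sketch sums four streams at 14.6 GB/s and four at 17 GB/s, then scales the 7% budget quoted for the low-voltage hard NoC at 250 GB/s. Linear scaling of NoC power with bandwidth is an illustrative assumption here, as is the choice of the 7% figure as the starting point.

```python
streams_gb_per_s = [14.6] * 4 + [17.0] * 4

aggregate = sum(streams_gb_per_s)

# Assumption: power budget scales linearly with bandwidth from 7% at 250 GB/s.
budget_percent = 7.0 * aggregate / 250.0

print(f"{aggregate:.1f} GB/s aggregate, {budget_percent:.1f}% of dynamic power")
```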
![Page 31: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/31.jpg)
3. Power Analysis
FPGA Interconnect
Point-to-point Links (1 to 1)
Broadcast (1 to n)
Multiple Masters (n to 1): Mux + Arbiter
Multiple Masters, Multiple Slaves (n to n): Mux + Arbiter
Interconnect = Just wires
Interconnect = Wires + Logic
Interconnect = NoC
Compare “wires” interconnect to NoCs
![Page 32: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/32.jpg)
3. Power Analysis
NoC Power vs. FPGA Interconnect
Hard and Mixed NoCs very compelling
Length of 1 NoC Link
200 MHz
High Performance / Packet Switched
1% area overhead on Stratix V
Runs at 730-943 MHz
Power on par with simplest FPGA interconnect
![Page 33: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/33.jpg)
1. Why NoCs on FPGAs?
Big city needs freeways to handle traffic
2. Embedded NoCs: Mixed & Hard
Area: 20-23X
Speed: 5-6X
Power: 9-15X
3. Power Analysis
• Power-aware design of embedded NoCs
• Power Budget for 100 GB/s: 3-7%
• Point-to-point soft Links: 4.7 mJ/GB
• Embedded NoCs: 4.5 – 10.4 mJ/GB
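The energy-per-gigabyte figures translate directly into a power budget at a given bandwidth. The sketch multiplies energy per GB by 100 GB/s and compares the result against the 17.4 W typical FPGA dynamic power quoted earlier in the talk. The function name is hypothetical, and the slide does not give the exact inputs behind its 3-7% range.

```python
def budget_percent(mj_per_gb: float, gb_per_s: float = 100.0,
                   fpga_dynamic_w: float = 17.4) -> float:
    """Share of typical FPGA dynamic power spent moving gb_per_s of data."""
    watts = mj_per_gb * 1e-3 * gb_per_s   # (J/GB) * (GB/s) = W
    return 100 * watts / fpga_dynamic_w


for label, energy in [("Point-to-point soft links", 4.7),
                      ("Embedded NoC, lowest", 4.5),
                      ("Embedded NoC, highest", 10.4)]:
    print(f"{label}: {budget_percent(energy):.1f}%")
```

The results fall in roughly the same range as the 3-7% on the slide, and they show the most efficient embedded NoC costing about as much per gigabyte as plain point-to-point soft links.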
![Page 34: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/34.jpg)
eecg.utoronto.ca/~mohamed/noc_designer.html
![Page 35: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/35.jpg)
Thank You!
![Page 36: The Power of Communication: Energy-Efficient NoCs for FPGAs](https://reader037.fdocuments.net/reader037/viewer/2022110213/568143c1550346895db04ceb/html5/thumbnails/36.jpg)