Hardwired Networks on Chip in FPGAs to Unify ... -...

25
Hardwired Networks on Chip in FPGAs to Unify Functional and Conguration Interconnects Kees Goossens 1,2 Martijn Bennebroek 3 Jae Young Hur 2 Muhammad Aqeel Wahlah 2 1 Research, NXP Semiconductors 2 Computer Engineering, Delft University of Technology 3 Philips Research

Transcript of Hardwired Networks on Chip in FPGAs to Unify ... -...

Page 1: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

Hardwired Networks on Chip in FPGAsto Unify Functional and Configuration Interconnects

Kees Goossens 1,2 Martijn Bennebroek 3

Jae Young Hur 2 Muhammad Aqeel Wahlah 2

1 Research, NXP Semiconductors2 Computer Engineering, Delft University of Technology

3 Philips Research

Page 2: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

2

2008-04-08 NOCS

overview

conventional FPGA– separate hard configuration interconnect and soft NOC

proposed FPGA– single hard NOC for configuration, programming, data

hardware architectureboot procedureperformance : cost analysisconclusions

configuration: loading bitstreamprogramming: writing MMIO registers

Page 3: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

3

2008-04-08 NOCS

configuration interconnect

frame

...

CLB

frame

... ...

frame

CLB

frame

CLB

...

...

functional IOconfiguration IO

conventional FPGA

soft IP are configured inconfigurable logic blocks (CLB)CLBs are interconnected with switch boxes (shown as fat arrows)

dedicated configuration interconnect to accessframes (minimum unit of reconfiguration)frames and CLBs are orthogonal(22 frames cover 16 CLBs in Virtex4)hence minimum coherent unit of reconfiguration is 22 frames

softIP

Page 4: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

4

2008-04-08 NOCS

CFRunified configuration and functional interconnect...

...

proposed FPGA

configuration and function region (CFR)– cf. Virtex4’s 22 frames– but can be orthogonal and independent

as in current FPGAs– interconnected as in current FPGAs

(see red arrows): no “tiles”

NOC unifies & connects to 1. functional data ports of IP2. functional programming ports of IP (MMIO)3. configuration ports of soft IP (bitstreams)

CFR

Page 5: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

5

2008-04-08 NOCS

CFRunified configuration and functional interconnect...

...

proposed FPGA

configuration and function region (CFR)– cf. Virtex4’s 22 frames– but can be orthogonal and independent

as in current FPGAs– interconnected as in current FPGAs

(see red arrows): no “tiles”

NOC unifies & connects to 1. functional data ports of IP2. functional programming ports of IP (MMIO)3. configuration ports of soft IP (bitstreams)

1 & 2 for both soft & hard IPNOC is partially hard & soft

enables conversion of data bitstreams

softIP

functional IO

embedded memory,memory controller

decryption

encryption / CRC

hard IP

softIP

Page 6: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

6

2008-04-08 NOCS

NI kernel

CFRNI shell

unified configuration and functional interconnect

configurationIO

CFRNI shell

NIK

NI

NIK

NIK

NIK

...

functional IO

embedded memory,memory controller

decryption

encryption / CRCNIK

...N

IKB soft NOC

bootmoduleNI

proposed FPGA configuration data

functional &programming data

more details on– soft:hard balance– mixing of configuration:functional data

on one or more NI kernels– place of boot module– possibility of bridging to soft NOC

shown:– NI kernel to both config & fnal ports– NI kernel to conf ports only– bridge to soft NOC (= NI kernel)– hard NI kernel & shell for shared

memory IP (memories)– hard kernel to streaming IPs

(en/decryption & IO)

all data types

Page 7: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

7

2008-04-08 NOCS

NI kernelIP

resp2

req2

port

port

NI shell

resp2

req2

port

MM

IOslave

prog. port

req1

resp1port

port

req1

resp1

port

master

data port

FSM

FSM

QoS andpacketisation

FSM

conventional IP & network interface (NI)

IP has data andcontrol (MMIO) portsboth are connected to the NOC:unified data &IP programming interconnect

L1

L2

transactions;IP protocoldata width

peer-to-peerstreaming data;NOC link width

packets;NOC link width

Page 8: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

8

2008-04-08 NOCS

paths

credits

resp2

req2

port

port

req4

resp4

resp4

req4

L1

L2

resp4

req4

port

portresp2

req2

port

MM

IOslave

prog. port

req1

resp1port

port

req1

resp1

port

master

data port

MM

IOslave

prog. port

port

FSM

FSM

NI shell

QoS andpacketisation

FSM

TDMAtable

FSM

conventional NI

only NIs of NOC are programmedusing the NOC itselfunified IP & NOC programming

NI kernelIP NI shell

Page 9: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

9

2008-04-08 NOCS

hard NI kernelCFR(soft IP)

paths

credits

resp2

req2

port

port

soft NI shell

req4

resp4

resp4

req4

L1

L2

resp4

req4

port

portresp2

req2

port

MM

IOslave

prog. port

req1

resp1port

port

req1

resp1

port

master

data port

port

resp3

req3

slaveconf. port

MM

IOslave

prog. port

porthard NI shell

TDMAtable

NI in FPGA context

NI kernel is hardIP and its shell can besoft or hardhard shell for e.g.:processor, memory, IO

soft IP is configured bybitstream transportedvia the NOC

unified interconnect for– data– NI & NOC programming– configuration

FSM

FSM

QoS andpacketisation

FSM

FSM

Page 10: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

10

2008-04-08 NOCS

conventional

bootmodule

NOC

IP

1: configure moduleNOC & IP

2:program NOC

3:program IP

4:application

functional reset all

configurationprogrammingfunctional

configuration & programming

Page 11: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

11

2008-04-08 NOCS

conventional

new

bootmodule

NOC

IP

1: configure moduleNOC & IP

2:program NOC

3:program IP

4:application

functional reset all

configurationprogrammingfunctional

bootmodule

1:config. mod.

2:prog. NOC

3:conf. IP

4: prog. IP

NOC

IP

5: application

functional resetboot module& NOC

functionalreset IP

configuration & programming

Page 12: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

12

2008-04-08 NOCS

NI kernelbootmodule

TDMAtable

paths

credits

configuration & programming

req2

resp2

port

port

hard NI shell1. program the NOC

connectboot module &(external) memories containing bitstreams to configuration ports of soft IP (CFR)

req1

req

resp

port

masterport

resp1port

port

req3

resp4

port

port

L1

L2

req4

resp4

port

programming connection internal history FIFO

d

i

addressLUT

MM

IOslave

prog. portQoS and

packetisationFSM

bits

tream

conf

igur

atio

n IO

req

port

masterport

Page 13: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

13

2008-04-08 NOCS

NI kernelbootmodule

TDMAtable

paths

credits

configuration & programming

req2

resp2

port

port

hard NI shell

req

resp

port

masterport

req1

resp1port

port

req3

resp4

port

port

L1

L2

req4

resp4

port

programming connection

configuration connection

internal history FIFO

d

i

addressLUT

MM

IOslave

prog. port

bits

tream

conf

igur

atio

n IO

QoS andpacketisation

FSM

req

port

masterport

1. program the NOC2. configure the soft IP

send bitstreams over the NOC from (external) memories to CFRs

Page 14: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

14

2008-04-08 NOCS

NI kernelbootmodule

TDMAtable

paths

credits

configuration & programming

req2

resp2

port

port

hard NI shell1. program the NOC2. configure the soft IP3. program the soft IP

“normal boot” of SOCboot module acts as control processorreprogram NOC with connections fromboot module toIP MMIOprogrammingportsprogram IP

req

resp

port

masterport

req1

resp1port

port

req3

resp4

port

port

L1

L2

req4

resp4

port

programming connection

configuration connection

internal history FIFO

d

i

addressLUT

MM

IOslave

prog. portQoS and

packetisationFSM

bits

tream

conf

igur

atio

n IO

req

resp

port

masterport

Page 15: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

15

2008-04-08 NOCS

NI kernelbootmodule

TDMAtable

paths

credits

configuration & programming

req2

resp2

port

port

hard NI shell1. program the NOC2. configure the soft IP3. program the (soft) IP4. send data to (soft) IP

“functional mode”req

resp

port

masterport

req1

resp1port

port

req3

resp4

port

port

L1

L2

req4

resp4

port

programming connection

configuration connection data connection

internal history FIFO

d

i

addressLUT

MM

IOslave

prog. portQoS and

packetisationFSM

bits

tream

conf

igur

atio

n IO

req

port

masterport

Page 16: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

16

2008-04-08 NOCS

NI kernelbootmodule

TDMAtable

paths

credits

configuration & programming

req2

resp2

port

port

hard NI shell1. program the NOC2. configure the soft IP3. program the (soft) IP4. send data to (soft) IP5. use bitstreams as

functional data

(de)compressionde/encryptionsynthesisoptimisationetc.

req

resp

port

masterport

req1

resp1port

port

req3

resp4

port

port

L1

L2

req4

resp4

port

programming connection

configuration connection data connection

internal history FIFO

d

i

addressLUT

MM

IOslave

prog. portQoS and

packetisationFSM

bits

tream

conf

igur

atio

n IO

req

resp

port

masterport

Page 17: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

17

2008-04-08 NOCS

performance & cost

compare– dedicated configuration interconnect & soft NOC, with– unified hard NOC

on– area– functional speed– configuration footprint (bits)– configuration & programming time

Page 18: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

18

2008-04-08 NOCS

performance & cost

essentially, it all depends on– area soft:hard ≈ 35:1– speed soft:hard ≈ 3.5:1– configuration footprint of soft NOC (bitstream) :

programming footprint of hard NOC (MMIO registers) ≈ 214:1

resulting in– boot time soft:hard ≈ 1:200– functional performance:cost (bit/sec:area) soft:hard ≈ 1:147

Page 19: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

19

2008-04-08 NOCS

performance & cost

percentage cost of router + NI kernel against CFR size (# LUTs)

10% area cost for interconnect implies CFR of ≈ 80,000 LUTs (soft NOC)≈ 1,400 LUTs (hard NOC)

Page 20: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

20

2008-04-08 NOCS

performance & cost

# CFRs against their size (# LUTs)

# of CFRs of 1000 LUTs≈ 19 CFRs (soft NOC)≈ 144 CFRs (hard NOC)

Page 21: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

21

2008-04-08 NOCS

performance & cost

speed– 118MHz unoptimised soft NOC (90nm)– 500MHz optimised hard NOC (130nm)

performance:cost = bit/sec : area– 32 bit x 118 MHz / 8.10 mm^2 = 466 = 1 soft– 32 bit x 500 MHz / 0.23 mm^2 = 69,565 = 147 hard

Page 22: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

22

2008-04-08 NOCS

performance & cost

configuration speed– 1.9 Gb/s for dedicated configuration interconnect– 8 Gb/s for hard NOC

programming speed– 118 MHz soft NOC– 500 MHz hard NOC

configuration footprint for soft NOC – 1.8 Mb (8300 LUTs / router+NI)

programming footprint for hard NOC– 2100 bit per connection

thus– 1 ms per NI for soft NOC– 10.6 microsec per NI for hard NOC

Page 23: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

23

2008-04-08 NOCS

performance & cost

boot time

dedicated configuration + soft NOC– configure soft NOC– configure IP– program soft NOC– program IP– total

hard NOC– program soft NOC– configure IP– program IP– total

Page 24: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional

24

2008-04-08 NOCS

conclusions

nice idea but:– low-level implementation of kernel:shell interface would be nice

• investigate impact of mapping shell in LUTs– performance : cost analysis should be improved

(it is in favour of soft interconnects)

advantages of real-time NOC not explained

“type casting” of bitstreams control datahas many cool applications– run-time synthesis & modification of bitstreams

• en/decryption, CRC check, peep-hole / JIT compilation of bitstreams– run-time relocation of soft IP

• e.g. structural test engines,re-packing of soft IP on partial dynamic reconfiguration

Page 25: Hardwired Networks on Chip in FPGAs to Unify ... - Asyncasync.org.uk/async2008/async-nocs-slides/Tuesday/... · frame CLB frame CLB..... configuration IO functional IO conventional