Non-Minimal Routing Strategy for Application-Specific Networks-on-Chips Hiroki Matsutani Michihiro...

Non-Minimal Routing Strategy for Application-Specific Networks-on-Chips. Hiroki Matsutani (Keio Univ.), Michihiro Koibuchi (National Institute of Informatics), Yutaka Yamada (Toshiba RDC), Jouraku Akiya (Keio Univ.), Hideharu Amano (Keio Univ.)


Page 1: Non-Minimal Routing Strategy for Application-Specific Networks-on-Chips Hiroki Matsutani Michihiro Koibuchi Yutaka Yamada Jouraku Akiya Hideharu Amano.

Non-Minimal Routing Strategy for Application-Specific Networks-on-Chips

Hiroki Matsutani (Keio Univ.)
Michihiro Koibuchi (National Institute of Informatics)
Yutaka Yamada (Toshiba RDC)
Jouraku Akiya (Keio Univ.)
Hideharu Amano (Keio Univ.)

Page 2:

Network-on-Chip (NoC)

• Tile-based Multi-Core
  – Core: Execution
  – Router: Packet delivery
• RAW [Taylor, Micro2002]
  – 2D Mesh
• ACM [Furtek, FPL2004]
  – Tree
• aSoC [Liang, TVLSI2004]
  – 2D Mesh

[Figure: 3×3 grid of tiles numbered 0 to 8. Tile (RISC, RAM, I/O)]

Page 3:

Network-on-Chip (NoC)

• Tile-based Multi-Core
  – Core: Execution
  – Router: Packet delivery
• RAW [Taylor, Micro2002]
  – 2D Mesh
• ACM [Furtek, FPL2004]
  – Tree
• aSoC [Liang, TVLSI2004]
  – 2D Mesh

[Figure: one tile, labeled MIPS, Memory, and Router]

Page 4:

Network-on-Chip (NoC)

SoCs are growing! The NoC is one of the scalable on-chip interconnects.

○ Advantages
• Better wiring delay
  – Global wiring
  – Limited-length links
• Improved modularity
  – Standard network I/F

× Drawback
• Overhead

[Figure: 3×3 grid of tiles numbered 0 to 8. Tile (RISC, RAM, I/O)]

Page 5:

Stream Processing ~ Simulation ~

• MPEG, JPEG, Viterbi
  – System Level Design

Models, from high abstraction to detailed design:

• UTF Model (UnTimed Functional)
  – Module(a) passes Data to Module(b)
  – No clock for execution
• BCA Model (Bus Cycle Accurate)
  – Module(a) passes Data to Module(b) under a Clock
  – Communication is cycle accurate
• RTL Model

The application is divided into tasks based on simulation.

Page 6:

Stream Processing ~ Map, Route ~

[Figure: Task Flow Graph with tasks (1), (2), (2), (2), (3), (4), mapped onto the Physical Tiles of the NoC]

• Shared links
  – Link congestion degrades throughput
• Optimization (in general)
  – Mapping: minimum communication length
  – Routing: minimal paths

Strong access locality!!

Paths are too short for minimal routing to spread the congestion.

Page 7:

Existing Routing ~ Is a non-minimal path useful? ~

Common features of SAN & NoC
• Packet delivery
  – WH (wormhole) switching
• Deadlock freedom
  – Turn-Model, …

Feature of SAN
• Various applications, various traffic patterns
  – Non-minimal paths lead to an unstable state [Ho, HPCA2003]

Feature of NoC
• Fixed application, fixed traffic patterns
  – System level simulation

Communication is predictable, so load can be balanced with non-minimal paths.

Page 8:

Flee ~ Non-minimal routing strategy ~

• Stream processing in NoCs
  – Strong access locality!!
  – Paths are too short to spread congestion
• Partially non-minimal paths
• Path establishment based on traffic amount
  – Heavy-traffic communication: minimal path
  – Light-traffic communication: avoid congestion

Non-minimal paths are basically inefficient…

Increase the # of alternative paths by introducing non-minimal paths.

Page 9:

Flee ~ Traffic Pattern Analysis ~

Traffic Pattern
# time, src, dst, size
10000  (0)  (1)  32
10000  (0)  (2)   4
10000  (0)  (3)   4
10010  (1)  (2)  32
10010  (0)  (1)  32
10010  (0)  (2)   4
10010  (0)  (3)   4
10020  (2)  (3)  32
10020  (1)  (2)  32
10030  (2)  (3)   4

Traffic Analysis
1. For each src-dst pair,
   – Totalize the packet size
   E.g., src-dst pair (0,1): 32 + 32 = 64
2. Sort in descending order
   – In order of TotalSize

Analysis Record
# src, dst, TotalSize
(0)  (1)  8192
(1)  (2)  8192
(2)  (3)  8192
(0)  (2)  1024
(0)  (3)  1024

The src-dst pair with the largest TotalSize is on the first line (Heavy!).

Each src-dst pair gets a path in the order of the Analysis Record.
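The two analysis steps above can be sketched in a few lines of Python. This is only an illustration: the function name `analyze` and the tuple layout are assumptions, not taken from the slides. It runs on the ten-line trace excerpt shown on the slide, so its totals are smaller than the 8192 and 1024 values of the Analysis Record, which presumably covers the full trace.

```python
from collections import defaultdict

# Trace excerpt from the slide: (time, src, dst, size).
TRACE = [
    (10000, 0, 1, 32),
    (10000, 0, 2, 4),
    (10000, 0, 3, 4),
    (10010, 1, 2, 32),
    (10010, 0, 1, 32),
    (10010, 0, 2, 4),
    (10010, 0, 3, 4),
    (10020, 2, 3, 32),
    (10020, 1, 2, 32),
    (10030, 2, 3, 4),
]


def analyze(trace):
    """Return [(src, dst, total_size)], largest total_size first.

    Hypothetical helper: step 1 totalizes the packet size per src-dst
    pair, step 2 sorts the pairs in descending order of the total.
    """
    totals = defaultdict(int)
    for _time, src, dst, size in trace:
        totals[(src, dst)] += size
    # sorted() is stable, so pairs with equal totals keep first-seen order.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(src, dst, total) for (src, dst), total in ranked]


analysis_record = analyze(TRACE)
for src, dst, total in analysis_record:
    print(f"({src}) ({dst}) {total}")
```

On this excerpt the pairs (0,1) and (1,2) come first with a total of 64 each, matching the "32 + 32 = 64" example on the slide.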

Page 10:

Flee ~ Establishing Paths ~

Analysis Record (analysis result)
# src, dst, TotalSize
(0)  (1)  8192
(1)  (2)  8192
(2)  (3)  8192
(0)  (2)  1024
(0)  (3)  1024

[Figure: tiles (0) (1) (2) (3)]

Each link has a "Cost".

• In order of traffic amount:
  – Search for the lowest-cost path
  – Increase the cost of the links selected

Paths are assigned so as not to disturb previously established paths.

There will be several alternative paths …

A link with a high cost is a hotspot …
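The path-establishment loop above could be sketched as follows. Several details are assumptions made for illustration and are not stated on the slides: every link starts at cost 1, the lowest-cost path is found with Dijkstra's algorithm, a selected link's cost grows by the pair's TotalSize, and nodes are numbered row by row. The sketch also ignores deadlock freedom, which a real routing strategy has to guarantee. All function names are hypothetical.

```python
import heapq


def mesh_links(width, height):
    """Directed links of a width x height 2D mesh, each with initial cost 1."""
    cost = {}
    for y in range(height):
        for x in range(width):
            n = y * width + x  # assumed row-major node numbering
            if x + 1 < width:
                cost[(n, n + 1)] = 1
                cost[(n + 1, n)] = 1
            if y + 1 < height:
                cost[(n, n + width)] = 1
                cost[(n + width, n)] = 1
    return cost


def lowest_cost_path(cost, src, dst):
    """Dijkstra's algorithm over the directed links in `cost`."""
    neighbors = {}
    for u, v in cost:
        neighbors.setdefault(u, []).append(v)
    best = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > best.get(u, float("inf")):
            continue  # stale heap entry
        for v in neighbors.get(u, []):
            nd = d + cost[(u, v)]
            if nd < best.get(v, float("inf")):
                best[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def establish_paths(cost, analysis_record):
    """Assign one path per src-dst pair, heaviest pair first."""
    paths = {}
    for src, dst, total_size in analysis_record:
        path = lowest_cost_path(cost, src, dst)
        for link in zip(path, path[1:]):
            cost[link] += total_size  # assumed cost update rule
        paths[(src, dst)] = path
    return paths


# 3x2 mesh:  0 1 2
#            3 4 5
record = [(0, 1, 8192), (1, 2, 8192), (0, 2, 1024)]
paths = establish_paths(mesh_links(3, 2), record)
print(paths)
```

In this small example the two heavy pairs keep their one-hop minimal paths, while the light pair (0,2) is pushed onto the four-hop detour 0, 3, 4, 5, 2 because links 0 to 1 and 1 to 2 have become expensive.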

Page 11:

Simulation Environments

• Router Model
  – 4 ports for adjacent routers
  – 1 port for the core
• Network Topology
  – 4×4 Mesh
  – 4×4 Torus

[Figure: 16-node 2D mesh, nodes 0 to 15; each node is a Router with a Core attached]

Packet size             259 flits (2-flit header)
Switching method        Wormhole switching
# of virtual channels   Mesh: 1, Torus: 2
Simulation time         1,000,000 cycles
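To make the difference between the two topologies concrete, here is a minimal sketch of the minimal hop distance between two nodes of a 4×4 mesh and a 4×4 torus. The row-major node numbering and the function name `hop_distance` are assumptions; the slide does not say how its nodes 0 to 15 are laid out.

```python
def hop_distance(a, b, width=4, height=4, torus=False):
    """Minimal number of router-to-router hops between nodes a and b.

    Assumes row-major numbering: node id = y * width + x.
    """
    ax, ay = a % width, a // width
    bx, by = b % width, b // width
    dx, dy = abs(ax - bx), abs(ay - by)
    if torus:
        # Wrap-around links let a packet go the short way around a ring.
        dx = min(dx, width - dx)
        dy = min(dy, height - dy)
    return dx + dy


print(hop_distance(0, 15))              # corner to corner on the mesh
print(hop_distance(0, 15, torus=True))  # same pair on the torus
```

The wrap-around links shorten the corner-to-corner distance from 6 hops on the mesh to 2 on the torus.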

Page 12:

Applications for Evaluation

• App. Traces
  – Viterbi Decoder
  – JPEG Codec
  – IPsec
  – Uniform

Tile mapping example of JPEG Codec
(0)   Header Analysis
(1)   Huffman Decode
(2)   Inverse Quant.
(3)   I-DCT for Row
(4)
(5)   Yuv-rgb Convert
(6)   MCU Mapping
(7)   I-DCT for Col
(8)   Rgb-yuv Convert
(9)   MCU Sampling
(10)  I-DCT for Col
(11)  I-DCT for Row
(12)
(13)  Stream Gen.
(14)  Huffman Code
(15)  Quant.

(The figure marks each tile as belonging to the Decoder or to the Encoder.)

Page 13:

Results ~ Viterbi @ 2D Mesh ~

• Flee
  – Avg hop count: 2.52
• DOR (Dimension-Order Routing)
  – Avg hop count: 1.84

[Figure: latency vs. accepted traffic. X-axis: Accepted Traffic [flit/cycle/node]; Y-axis: Latency [cycle]]

14.2% improved

Communication in the Viterbi trace includes Fork and Join.
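For reference, the DOR baseline can be sketched as XY routing on a mesh: a packet first travels along X until its column matches the destination, then along Y. The row-major numbering, the function names, and the unweighted average below are assumptions for illustration; the slides do not say how their average hop counts are computed.

```python
def dor_path(src, dst, width=4):
    """XY dimension-order route on a 2D mesh (row-major node ids)."""
    x, y = src % width, src // width
    dst_x, dst_y = dst % width, dst // width
    path = [src]
    while x != dst_x:  # resolve the X dimension first
        x += 1 if dst_x > x else -1
        path.append(y * width + x)
    while y != dst_y:  # then the Y dimension
        y += 1 if dst_y > y else -1
        path.append(y * width + x)
    return path


def average_hops(paths):
    """Unweighted mean number of hops over a list of paths."""
    return sum(len(p) - 1 for p in paths) / len(paths)


routes = [dor_path(0, 5), dor_path(5, 0), dor_path(3, 12)]
print(routes)
print(average_hops(routes))
```

Because DOR always takes a minimal path, its average hop count is a lower bound for any routing on the same mesh; a strategy that adds detours for light traffic, as Flee does, ends up with a higher figure.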

Page 14:

Results ~ Viterbi @ 2D Torus ~

• Flee
  – Avg hop count: 1.87
• DOR (Dimension-Order Routing)
  – Avg hop count: 1.48

[Figure: latency vs. accepted traffic. X-axis: Accepted Traffic [flit/cycle/node]; Y-axis: Latency [cycle]]

22.2% improved

Communication in the Viterbi trace includes Fork and Join.

Flee improves throughput by 22.2% with non-minimal paths.

Page 15:

Results ~ JPEG @ 2D Mesh ~

• Flee
  – Avg hop count: 1.01
• DOR (Dimension-Order Routing)
  – Avg hop count: 1.00

[Figure: latency vs. accepted traffic. X-axis: Accepted Traffic [flit/cycle/node]; Y-axis: Latency [cycle]]

No difference

In the JPEG trace, data is processed sequentially. There is no fork and join pattern.

Communication is between neighbors, so non-minimal paths are not needed.

Page 16:

Results ~ Effect of Traffic Analysis ~

• Flee
  – Known data amount
• Flee (Incomplete)
  – Unknown data amount
  – All data transfer sizes are "1"

Viterbi @ 2D Mesh

[Figure: latency vs. accepted traffic. X-axis: Accepted Traffic [flit/cycle/node]; Y-axis: Latency [cycle]]

Incomplete Flee: not improved

Page 17:

Results ~ Effect of Traffic Analysis ~

• Flee
  – Known data amount
• Flee (Incomplete)
  – Unknown data amount
  – All data transfer sizes are "1"

Viterbi @ 2D Torus

[Figure: latency vs. accepted traffic. X-axis: Accepted Traffic [flit/cycle/node]; Y-axis: Latency [cycle]]

Incomplete Flee: partially improved

Communication size is the key factor in improving performance.

Page 18:

Summary ~ Non-minimal routing strategy ~

• Stream processing in NoCs
  – Strong access locality!!
  – Paths are too short to spread congestion
• Flee: non-minimal routing strategy
  – Heavy-traffic communication: minimal paths
  – Light-traffic communication: avoid congestion
• Improves throughput by 22.2%

Increase the # of alternative paths by introducing non-minimal paths.

Page 19:

Thank you for listening.