A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management...

25
© CEA. All rights reserved A Dynamic Stream Link for Efficient Data Flow Control in NoC Based Heterogeneous MPSoC Claude Helmstetter 1 , Sylvain Basset 1 , Romain Lemaire 1 , Fabien Clermidy 1 , Pascal Vivet 1 , Michel Langevin 2 , Chuck Pilkington 2 , Pierre Paulin 2 , Didier Fuin 3 1 CEA-Leti, Minatec Campus, Grenoble, France 2 STMicroelectronics, Ottawa, Canada 3 STMicroelectronics, Grenoble, France 18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 Jan. 22-25, 2013 Yokohama, Japan

Transcript of A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management...

Page 1: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved

A Dynamic Stream Link for Efficient Data Flow Control

in NoC Based Heterogeneous MPSoC

Claude Helmstetter1, Sylvain Basset1, Romain Lemaire1, Fabien Clermidy1,

Pascal Vivet1, Michel Langevin2, Chuck Pilkington2, Pierre Paulin2, Didier Fuin3

1CEA-Leti, Minatec Campus, Grenoble, France

2STMicroelectronics, Ottawa, Canada 3STMicroelectronics, Grenoble, France

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013

Jan. 22-25, 2013 – Yokohama, Japan

Page 2: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan

Heterogeneous Multicore SoCs

Embedded computing evolution

New application: 3D imaging, multi-antenna wireless

baseband processing

Common characteristics: Complex operations, high

data rate, real-time requirements

Efficiency requirement: Performance versus Power

consumption

New architectures for Systems-on-Chip

Complexity: Increasing number of computing cores

Heterogeneity: CPU, DSP, Reconfigurable HW cores

Parallelism: Dataflow programming

| 2

Page 3: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan

Stream-Based Communications

From bus-based architecture to Network-on-Chip

Bandwidth requirement due to increasing number of

computing cores

Well-defined communication interfaces required

Ease of integration, reuse and programming

Bus protocols not well-adapted for NoC communication

with dataflow programming

Address management issue

Introduction of stream-based NoC protocols

Local control by Network Interface (NI)

Packet-switched data transfers + Flow control (credits)

| 3

Page 4: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan

Dynamic Stream-Based Link Management

Configuration of stream-based communication links

How and when communications can be configured

How and when communication links are started/stopped

Synchronization requirements

How to synchronize communication links

How to synchronize communications and computation

Issue: dynamic communication parameters

Non-predictable, data-dependent control

Control latencies become critical (reaction time)

Novel NI architecture supporting dynamicity

SIF: Streaming InterFace

Objectives: fast reconfiguration and limited control workload

| 4

Page 5: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan

Outlines

Introduction

Streaming interface and stream link protocols

Hardware Implementation

Evaluation and simulation results

Conclusion

| 5

Page 6: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan

Outlines

Introduction

Streaming interface and stream link protocols

Stream link definition

Static stream link

Iteration-based dynamic stream link

Mode-based dynamic stream link

Hardware Implementation

Evaluation and simulation results

Conclusion

| 6

Page 7: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan

Stream Link Definition

| 7

Processing Units (PU)

Receive/send data using FIFO interfaces only (dataflow)

Optional additional ports for control (registers, IT wires, …)

Streaming Interfaces (SIF)

Output Communication Controller

(OCC)

Packetise data, add routing

information (packet header)

Input Communication Controller

(ICC)

Depacketise data, send credit

backward (flow control)

Page 8: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan | 8

Static Stream Link Protocol

SIF are configured with a predefined total number

of data to transfer: iteration size

Stream link closed locally using a simple counters

Iteration size must be predictable at configuration

time

Page 9: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan | 9

Iteration-Based Dynamic Link

The emitter PU decides to close the link (flush)

The OCC sends a packet flagged as last

Acknowledgement sent by the ICC

Required for the data/credit mechanism reset

No need to predict iteration size at start time

Require additional communications

SIF

Receiver PU

stream in

External

Controller

Emitter PU

stream out

ICC

SIF OCC

co

nfig

. (with

ou

t siz

e)

sta

rt

da

ta

do

ne

da

tad

ata

cre

dits

las

t

last

sta

rt

da

ta

do

ne

da

tad

ata

cre

dits

las

t

last

iterationiteration

running idle running idle

flus

h

flus

h

Page 10: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan | 10

Mode-Based Dynamic Link

Mode: Time between 2 datapath reconfigurations

Mode-Based Dynamic Link:

PU controlled at iteration level

Stream links stopped only on mode boundaries

Suppress close/reopen delay between iterations

Longer reconfiguration time between modes

sta

rt

co

nfig

. (with

ou

t siz

e)

da

ta

da

ta

do

ne

da

tad

ata

da

ta

da

ta

do

ne

cre

dits

cre

dits

sta

rt

sto

p

1 mode (incl. 2 iterations)

SIF

Receiver PU

stream in

External

Controller

Emitter PU

stream out

ICC

SIF OCC

running idle running idle

flush

flush

Page 11: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan

Outlines

Introduction

Streaming interface and stream link protocols

Hardware Implementation

Architecture overview

Cost overhead for dynamic link protocols

Evaluation and simulation results

Conclusion

| 11

Page 12: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan | 12

Architecture Overview

Architecture fully customizable at design time

Number of ICC/OCC, FIFO depth, datawidth, …

Selection of stream link protocols (possibly combined)

Control port

SIF

credits out

credits in Processing

Unit

IN_FIFO

IN_FIFO

Local controller

Register accessIT

flush

flush

ICC0

ICC1

stream in 0

stream in 1IP

OCC0

OCC1OUT_FIFO

OUT_FIFO stream out 0

stream out 1

OP

NoC

Page 13: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan

Silicon cost of dynamic link protocols

Technology: CMOS 32nm @ 600MHz

1 ICC, 1 OCC, 32-data FIFO depth, 64-bit data width

Relative costs:

Parts of the design shared when combining protocols

| 13

9,01 kG

6,78 kG

7,02 kG

7,36 kG

All 3 links together

Mode-based dynamic link only

Iteration-based dynamic link only

Static link only

5 5,5 6 6,5 7 7,5 8 8,5 9 9,5

Hardware cost

(kGates)

Mode-based Smallest design

Iteration-based Extra-cost due to last packets protocol

Static Overhead from additional counters

Page 14: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan

Outlines

Introduction

Streaming interface and stream link protocols

Hardware Implementation

Evaluation and simulation results

Case study and benchmark

Mode-based dynamic link

Iteration-based dynamic link

Conclusion

| 14

Page 15: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan | 15

Case Study: the P2012 SoC

Generic many-core SoC for high performance computing Extensible with hardware Processing Elements (PE)

Developed by STMicroelectronics and CEA/LETI

P2012 cluster architecture:

LIC PE #1

shared

memory

corecorecorenetwork

interface

memory-mapped interconnect

DMA

NoC

router

NoC

router

NoC

router

PE #2

SIF

PE #0

SIF

PU

SIF

SIF

OCC ICCOCC

PUPU

P2012

cluster

synchro.

periph.

Page 16: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan | 16

Benchmark Overview

Linear data-flow going through the DMA and n

Processing Elements (PE)

Many parameters: number of PEs (n), iteration

size, PU rate, latencies, FIFO depth, …

Alternative Stream to test path reconfigurations

LIC

shared

memory

memory-mapped interconnectcontroller

coreSIF

PE #3

SIF

PU

PE #4

SIF

PU

PE #n

SIF

PU

PE #2

SIF

PU …PU

PU

DMA

PE #0

SIF

PU

PE #5S

IFPU

alternative streammain stream

Page 17: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan | 17

Mode-based Dynamic Link

Execution trace of the benchmark with mode-based dynamic link

1st and 2nd iterations run perfectly fine

… but time lost before 3rd and 5th iterations

Problem: PE #0 is restarted too late (cf. )

1st iteration time lost between iterations

Processing Unit

(PU) states

Page 18: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan | 18

Optimizing the Controller

Basic controller: while (not_done()) {

wait until all PEs are ready to be configured;

configure and start all PEs;

}

Idea: do not wait for last PE before restarting the 1st

Optimized "asynchronous" controller: while (not_done()) {

For all PE {

if "this PE" is ready to be configured

configure and start "this PE";

}

wait an event from any PE;

}

NB: new embedded software but hardware unchanged As long as the processor running the controller is fast enough

Page 19: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan | 19

Mode-based Dynamic Link

Execution of the benchmark with optimized controller

No more pipeline holes between iterations

PE

nu

mb

er

Page 20: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan | 20

Cost of Path Reconfigurations

Mode-based dynamic link

Even partial reconfiguration

generates a pipeline hole

This hole can be larger if the

controller is not optimized

Iteration-based dynamic

link

No time cost for path

reconfiguration PE

nu

mb

er

PE

nu

mb

er

path reconfiguration

PU 0 PU 0

PU 0 PU 0 PU 0

PU 1 PU 1

PU 1 PU 1 PU 1

PU 1 PU 0

PU 0 PU 0

PU 0 PU 0 PU 0

PU 1 PU 1

PU 1 PU 1 PU 1

PU 1 PU 0

Page 21: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan | 21

Iteration-based vs Mode-based

Mode-based 10% faster than iteration-based dynamic link

For short iterations (figure generated with 60-token long iterations)

Explanation: FIFOs are not flushed between iterations

Page 22: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan

Outlines

Introduction

Streaming interface and stream link protocols

Hardware Implementation

Evaluation and simulation results

Conclusion

| 22

Page 23: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan | 23

Conclusion (1/2)

Validation of an innovative SIF architecture

Support non predictable communication parameters

1 static & 2 dynamic protocols available

« Best protocol » is application dependent

Application characteristics Recommended protocol

Predictable communication Static Stream Link

No or few path reconfiguration Mode-based Dynamic Link

Other cases Iteration-based Dynamic Link

Page 24: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved © CEA. All rights reserved

18th Asia and South Pacific Design Automation Conference, ASP-DAC 2013 – Jan. 22-25, 2013 – Yokohama, Japan

Conclusion (2/2)

External controller (embedded software) impact

on performance

Too centralized, coarse-grain: simpler code but

performance degradation

Asynchronous, optimized: easer parallelization, better

reactivity if fast enough

Known choices for real applications:

3GPP-LTE (signal processing), Magali SoC: static link

H264 (video processing), P2012 SoC: mainly mode-

based dynamic link

| 24

Page 25: A Dynamic Stream Link for Efficient Data Flow Control in ...Dynamic Stream-Based Link Management Configuration of stream-based communication links How and when communications can be

© CEA. All rights reserved

Thank you

| 25