Coarse-Grained Reconfigurable Devices

Dataflow Machines

• General Structure: ALU computing elements, programmable interconnections, I/O components.

• Most prominent coarse-grained systems: PACT XPP, NEC-DRP, PicoChip, Morphosys, [RaPiD], Chameleon

PACT XPP

• V. Baumgarte, G. Ehlers, F. May, A. Nueckel, M. Vorbach, and M. Weinhardt, “PACT XPP A self-reconfigurable data processing architecture,” J. Supercomput., vol. 26, no. 2, pp. 167–184, 2003.

• M. Petrov, T. Murgan, F. May, M. Vorbach, P. Zipf, and M. Glesner, “The XPP architecture and its co-simulation within the Simulink environment,” in Proceedings of International Conference on Field-Programmable Logic and Applications (FPL), ser. Lecture Notes in Computer Science (LNCS), vol. 3203. Antwerp, Belgium: Springer, Aug. 2004, pp. 761–770.

• http://www.pactxpp.com/

PACT XPP

• Aim: Efficiently compute streams of data provided from different sources (e.g. A/D converters) rather than executing single instructions (as in von Neumann computers).

• Characteristic: Computation should be done while data are streaming through the processing elements, so it is suitable to configure the PEs to adapt to the natural computation paradigm of a given application.

Coarse-Grained Architectures

PACT XPP: Architecture

• XPP (Extreme Processing Platform): a hierarchical structure consisting of PAEs

• PAEs:
− Coarse-grained PEs
− Adaptive
− Clustered in PACs

• PA = PAC + CM

• A hierarchical configuration tree

• Memory elements (beside the PAs)

• I/O elements (on each side of the chip)

[Figure: device layout with four PAs]

PACT XPP Architecture: CM

• CM (Configuration Manager): Powerful run-time reconfiguration:
− Configuration control is distributed over several CMs.
− PAEs can be configured rapidly in parallel while neighboring PAEs are processing data.
− Entire applications can be configured and run independently on different parts of the array.
− Reconfiguration can be triggered externally or internally (by special event signals originating within the array); the internal case makes the device self-reconfiguring.

• Local CM: One configuration manager (CM) attached to a local memory is responsible for writing configuration onto a PA. The CMs at a lower level are controlled by a CM at the next higher level.

• Root CM: Attached to an external configuration memory. Supervises the whole device configuration.
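
The configuration tree described on this slide can be pictured with a small Python model. This is only an illustrative sketch: the class names `PAE` and `ConfigManager`, the `configure` method, and the string "configuration words" are all hypothetical and do not come from the XPP tools. The model also walks the tree sequentially, whereas the hardware configures PAEs in parallel.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PAE:
    """A processing element that only remembers its current configuration."""
    name: str
    config: Optional[str] = None


@dataclass
class ConfigManager:
    """A CM node: writes to its own PAEs, then delegates to lower-level CMs."""
    name: str
    paes: List[PAE] = field(default_factory=list)
    children: List["ConfigManager"] = field(default_factory=list)
    # Stand-in for the CM's internal RAM used for configuration caching.
    cache: Dict[str, str] = field(default_factory=dict)

    def configure(self, plan: Dict[str, str]) -> List[str]:
        """Apply `plan` (PAE name -> configuration word) to this subtree.

        Returns the names of the PAEs that were written, in visiting order.
        PAEs that are not mentioned in the plan are left untouched.
        """
        written = []
        for pae in self.paes:
            if pae.name in plan:
                self.cache[pae.name] = plan[pae.name]
                pae.config = plan[pae.name]
                written.append(pae.name)
        for child in self.children:
            written.extend(child.configure(plan))
        return written


# Hypothetical device: a root CM supervising two local CMs.
cm_pa0 = ConfigManager("cm_pa0", paes=[PAE("alu0"), PAE("alu1")])
cm_pa1 = ConfigManager("cm_pa1", paes=[PAE("alu2")])
root_cm = ConfigManager("root_cm", children=[cm_pa0, cm_pa1])

written = root_cm.configure({"alu0": "ADD", "alu2": "MUL"})
```

In this sketch `alu1` is not part of the plan, so it keeps whatever it had before, which mirrors the point that part of the array can be reconfigured while neighboring PAEs continue working.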

XPP Architecture

• Scalability: Multiple devices can be cascaded in a multi-chip module; the root CMs of the individual devices then act like ordinary, subordinate CMs.

• CM: consists of a state machine + internal RAM for configuration caching

PACT XPP Architecture: PAE

1. ALU PAE has:

1. ALU: configured to perform basic operations:
− Common fixed-point arithmetic and logical operations
− Special three-input opcodes (e.g. multiply-add, sort, counters)
− Event generation (e.g. counting termination, overflow, …)

2. Back Register: provides routing channels for data and events from bottom to top

3. Forward Register: provides routing channels from top to bottom

PACT XPP Architecture: PAE

• Dataflow Registers: used at the object output for data buffering in case of a pipeline stall.

• Input Registers: can be pre-loaded by configuration data and always provide single-cycle stall.

PACT XPP Architecture: PAE

2. RAM PAE:

• Like the ALU PAE, but with a dual-port RAM instead of the ALU

• Useful for data storage (intermediate results)
− Can be used in FIFO or RAM mode

• Useful for LUT-based functions

• The RAM generates a data packet after an address has been received at the input.

• Writing to the RAM requires two data packets:
1. one for the address
2. one for the data to be written.
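
The packet behaviour of the RAM PAE can be sketched in a few lines of Python. This is a toy model under stated assumptions: packets are plain integers, and the names `RamPae`, `FifoPae`, `read`, `write`, `push` and `pop` are hypothetical, chosen only for illustration.

```python
from collections import deque


class RamPae:
    """Toy RAM mode: packets are plain integers, memory is a Python list."""

    def __init__(self, size, lut=None):
        # Pre-loading the memory models a LUT-based function;
        # when a LUT is given, its length determines the memory size.
        self.mem = list(lut) if lut is not None else [0] * size

    def read(self, address_packet):
        """One address packet in, one data packet out."""
        return self.mem[address_packet]

    def write(self, address_packet, data_packet):
        """A write consumes two packets: the address and the data."""
        self.mem[address_packet] = data_packet


class FifoPae:
    """Toy FIFO mode: packets leave in the order in which they arrived."""

    def __init__(self, depth):
        self.depth = depth
        self.queue = deque()

    def push(self, data_packet):
        if len(self.queue) >= self.depth:
            return False  # full: the producer would have to stall
        self.queue.append(data_packet)
        return True

    def pop(self):
        return self.queue.popleft() if self.queue else None


# LUT use: the memory holds x*x for x in 0..7, so an address packet
# carrying x yields a data packet carrying x squared.
square_lut = RamPae(size=8, lut=[x * x for x in range(8)])
looked_up = square_lut.read(5)

# Storage use: write an intermediate result, then read it back.
scratch = RamPae(size=4)
scratch.write(2, 123)
```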

PACT XPP Architecture: Communication

• PAE objects communicate via a packet-oriented network.

• Two types of packets:
− Data packets: uniform bit width for a device (specific to the device type, e.g. 32)
− Event packets: one or a few bits wide

• Self-synchronizing:
− An operation is performed as soon as all necessary data input packets are available.
− The results are forwarded as soon as they are available, provided the previous results have been consumed.
− Thus it is possible to map a DFG directly to ALU objects, and to pipeline input data streams through it.

• Event signals:
− Can trigger a self-reconfiguration
− Can control the merging of data streams
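
The self-synchronizing rule can be made concrete with a minimal Python sketch. It assumes one-packet registers on every input and output, and the names `AluObject`, `connect`, `step` and `run` are hypothetical; the real device does all of this in hardware, in parallel, not in a software loop.

```python
import operator


class AluObject:
    """One DFG node with a one-packet register per input and per output."""

    def __init__(self, op, n_inputs=2):
        self.op = op
        self.inputs = [None] * n_inputs   # None means "no packet waiting"
        self.output = None
        self.consumers = []               # list of (node, input_port)

    def connect(self, consumer, port):
        self.consumers.append((consumer, port))

    def step(self):
        """Try to forward the result, then try to fire. True on progress."""
        progress = False
        # Forward only if every consumer's input register is free,
        # i.e. the previous result has been consumed.
        if self.output is not None and self.consumers and all(
            node.inputs[port] is None for node, port in self.consumers
        ):
            for node, port in self.consumers:
                node.inputs[port] = self.output
            self.output = None
            progress = True
        # Fire only if all input packets are present and the output is free.
        if self.output is None and all(v is not None for v in self.inputs):
            self.output = self.op(*self.inputs)
            self.inputs = [None] * len(self.inputs)
            progress = True
        return progress


def run(nodes):
    """Step every node repeatedly until no packet can move any more."""
    while any([node.step() for node in nodes]):
        pass


# DFG for (a + b) * c, mapped directly onto two ALU objects.
add = AluObject(operator.add)
mul = AluObject(operator.mul)
add.connect(mul, 0)

add.inputs = [2, 3]   # a and b arrive
mul.inputs[1] = 4     # c arrives
run([add, mul])
result = mul.output
```

No global controller decides when `mul` runs: it fires once the packet from `add` and the packet for `c` are both present. A node whose output has not been consumed simply does not fire, which is the stall behaviour described above.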

PACT XPP: Routing

• Routing and Communication: Two independent networks:

1. for data transmission

2. for event transmission

PACT XPP: Routing

1. Horizontal Channel

• to connect PAEs within a row.

2. Vertical Channel

• to connect objects to a given horizontal bus.

3. Configuration Bus

[Figure labels: horizontal routing channels, vertical routing channels]

PACT XPP: Interface

Number and type of interfaces vary from device to device

• XPP42-A1: 6 internal interfaces consisting of:
− 4 identical general-purpose I/O on-chip interfaces (bottom left, upper left, upper right, and bottom right)
− One configuration manager (not shown in the picture)
− One JTAG (Joint Test Action Group, IEEE Standard 1149.1) boundary-scan interface for testing purposes

2.1 The PACT XPP - Interface

The I/O interfaces can operate independently of each other.

• Two operation modes: RAM mode and streaming mode

• RAM mode: Each port can access external static RAM (SRAM). Control signals for the SRAM transaction are available, so no additional logic is required.

2.1 The PACT XPP - Interface

• Streaming mode: For high-speed streaming of data to and from the device. Each I/O element provides two bidirectional ports for data streaming. Handshake signals are used to synchronize data packets with the external port.