ENG6530 Reconfigurable Computing Systems

Application Specific Instruction Processors: “ASIPs” or “Reconfigurable Processors”

Topics

• ASIPs: Definition
• Motivation
• How to customize ASIPs
• Tools for ASIPs
• Approaches
• Conclusions

References

1. “Engineering the Complex SOC: Fast, Flexible Design with Configurable Processors”, by Chris Rowen, 2004.
2. “Xtensa Architecture and Performance”, Tensilica Inc., Sep 2002.
3. “Configurable Processors: What, Why, How?”, Tensilica Inc., June 2007.

Microprocessors and ASICs

• For the ultimate in flexibility, programmers map the application onto a general-purpose microprocessor.
• For the ultimate in performance, logic designers map the application into a custom circuit.

[Figure: an application mapped by programmers onto a microprocessor and by logic designers onto an ASIC; an FPGA is also shown.]

Classic Options for Systems-on-Chip

[Figure annotated “Design Gap!”]

General Purpose Processors

A Case for Customization

• General-purpose processors: flexible, but they tend to force the application to be customized to the architecture!
• ASICs: high performance, but expensive; they customize the architecture to the application!

We need to find a technology that can customize the architecture to the application and at the same time be flexible and cheap!

Processor Specialization: Get the Best of Both Options

[Figure annotated “Gains!”]

Motivations: reduce size

• A Pentium 4 die can fit about 50 ARM9 processors at 0.13 µm, and 80 at 0.10 µm.
• At 0.13 µm and a 250 MHz clock, an ARM9 dissipates 0.1 W, so 50 ARM9s = 5 W.
• ARM9 at 0.13 µm = 3 mm²; Pentium 4 at 0.13 µm = 144 mm² (a 12 mm × 12 mm die).
• Cost, power, and size are important for embedded applications!
• Processing vs. dedicated hardware (ASIC)?
• System-on-a-Chip concept
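The area and power figures on this slide can be cross-checked with a little arithmetic. The 0.10 µm line below assumes ideal area scaling with the square of the feature size and an unchanged 144 mm² die budget; both are simplifying assumptions, not figures from the slide.

```latex
\begin{align*}
\frac{A_{\text{Pentium 4}}}{A_{\text{ARM9}}} &= \frac{144\ \text{mm}^2}{3\ \text{mm}^2} = 48 \approx 50 \\
P_{50\ \text{ARM9}} &= 50 \times 0.1\ \text{W} = 5\ \text{W} \\
A_{\text{ARM9},\,0.10\,\mu\text{m}} &\approx 3\ \text{mm}^2 \times \left(\frac{0.10}{0.13}\right)^{2} \approx 1.78\ \text{mm}^2,
\qquad \frac{144}{1.78} \approx 81 \approx 80
\end{align*}
```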

Programmable Processors

Past:
• Microprocessor
• Microcontroller
• DSP
• Graphics Processor

Now / Future:
• Network Processor
• Sensor Processor
• Crypto Processor
• Game Processor
• Wearable Processor
• Mobile Processor

A Case for Customization

General-purpose processors handle many applications fairly well, but…
• Each application has different requirements
• The instruction set is fixed!
• Data path width may not suit your application!
• Cache size/configuration may not be optimal
• Register file is either too small or …
• Functional units might be missing or …
• Internal busses are slow or too narrow …

Processor Customizations

• Specialized instructions: optimization, searching, classification, …
• Specialized functional units: MAC units, special comparators, sorting units
• Parameterized busses and datapaths: 8-bit, 16-bit, synchronous/asynchronous busses
• Parameterized register files
• Parameterized caches: cache size, replacement strategy, …

[Figure: processor (P) block diagram with register file, D/I caches, and functional units FU1, FU2, FU3.]

Application-specific instruction processors

• An ASIP is a stored-memory CPU whose architecture is tailored for a particular set of applications.
• The instruction set is tailored to specific applications or application domains.
• Customized functional units within the data path provide high performance.
• Programmability allows changes to the implementation, so the processor can be used in several different products.
• The application-specific architecture provides smaller silicon area, higher speed, and lower power consumption.

Recall: Different levels of coupling

[Figure: a workstation with CPU, memory caches, and I/O interface. Labels: FU, Coprocessor, Attached Processing Unit, Standalone Processing Unit; scale from Tightly Coupled to Loosely Coupled.]

Application Specific Instruction Processors

[Figure: FPGA, ASIC, and processor (P) each rated on design cost, time-to-market, flexibility, determinism, power, and performance.]

Embedded Applications Requirements

[Figure: FPGA, ASIC, processor (P), and ASCP rated on design cost, time-to-market, flexibility, determinism, power, and performance.]

ASCP: Application-Specific Customizable Embedded Processor
– Helps preserve the benefits of generality
– Alleviates the drawbacks of general-purpose processors

Performance vs. Flexibility

[Figure: flexibility plotted against performance for ASIC, GPP, DSP, RCS, and ASIPs.]

ASIPs: Advantages

Tailor the processor for specific applications by:
• Customizing the instruction set.
• Adding customized execution units that efficiently perform task-specific algorithms.
• Adding special registers sized to the natural data types of the tasks to be performed.

Instructions will often execute in one or two clock cycles, which keeps clock rates low and thus energy consumption low as well.

You can further customize the processor as your application evolves with time.

ASIP Design Methodology

[Figure: an application mapped onto a design-time configurable microprocessor.]

• Profile the application.
• Create custom hardware and instructions to accelerate critical application sections.
• Most of the application runs as execution of general-purpose instructions.
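One way to see why profiling comes first is Amdahl's law (not named on the slide; added here as a standard way to reason about it). If a fraction p of the run time is spent in the critical sections and custom instructions speed those sections up by a factor s, the overall speedup S is limited by the part that still runs as general-purpose instructions. The values p = 0.8 and s = 10 are illustrative assumptions.

```latex
S = \frac{1}{(1 - p) + \dfrac{p}{s}},
\qquad p = 0.8,\ s = 10 \;\Rightarrow\; S = \frac{1}{0.2 + 0.08} \approx 3.6
```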

ASIP based approach: Reconfigurable Instruction Set Processors

Compiler Structure

[Figure: compiler flow from C code through C parsing, optimizations, instruction identification, instruction selection, configuration scheduling, and code generation to assembly code. Other blocks: Hardware Estimator, Hardware Generation, Configuration bits.]

Instruction Set Extension

Idea: provide a way to augment the processor’s instruction set with operations needed by a particular application.

Determinants of CPU Performance

CPU time = Instruction_count x CPI x clock_cycle

                         Instruction_count   CPI   clock_cycle
Algorithm                        X            X
Programming language             X            X
Compiler                         X            X
ISA                              X            X         X
Processor organization                        X         X
Technology                                              X
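The CPU time equation on this slide is easy to evaluate directly. The sketch below is a minimal illustration; the function names and the sample numbers in the usage note are hypothetical and not taken from the slides.

```c
#include <stdint.h>

/* CPU time in seconds = instruction count x CPI x clock cycle time. */
double cpu_time_seconds(uint64_t instruction_count, double cpi,
                        double clock_cycle_s)
{
    return (double)instruction_count * cpi * clock_cycle_s;
}

/* Speedup of a customized design over a baseline (ratio of CPU times). */
double speedup(double baseline_time_s, double custom_time_s)
{
    return baseline_time_s / custom_time_s;
}
```

As a usage example under assumed numbers: a baseline running 10^9 instructions at CPI 1.5 with a 4 ns cycle takes 6 s. If custom instructions cut the count to 4 x 10^8 at CPI 1.2 on the same clock, the time drops to about 1.92 s, a speedup of roughly 3.1.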

Instruction Specialization

The instruction set determines the functions directly implemented in hardware and the operations that can be performed in parallel.

How to improve the instruction set?
• Operations that can frequently be scheduled concurrently should be coded in the same instruction.
• Operations that can often be chained should be coded in the same way:
  – Multiply-accumulation
  – Vector operations
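Multiply-accumulation is the classic example of chaining: the product feeds straight into the add. The sketch below models that in plain C. The helper names mac and dot_product are hypothetical; on a customized processor the body of mac would map to a single instruction rather than a separate multiply and add.

```c
#include <stddef.h>
#include <stdint.h>

/* Models one fused multiply-accumulate: acc + a * b as a single operation. */
static inline int32_t mac(int32_t acc, int16_t a, int16_t b)
{
    return acc + (int32_t)a * (int32_t)b;
}

/* Dot product written so that each loop iteration is exactly one MAC. */
int32_t dot_product(const int16_t *x, const int16_t *y, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc = mac(acc, x[i], y[i]);   /* multiply chained into the add */
    }
    return acc;
}
```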

Instruction Set Customization

• Computationally demanding parts of applications run on special hardware.
• New instructions use the special hardware.

[Figure: a dataflow graph of XOR, MPY, LD, SHR, MOV, and AND operations, with a CUSTOM instruction standing in for a cluster of them.]

Automatically Collapsing Clusters of Instructions into New Ones

• One ad-hoc complex operation replaces a long sequence of standard ones.
• If the ad-hoc functional unit completes the job faster: GAIN.

Function Unit and Data Path Specialization

To reduce power consumption and increase performance:
• Word length adaptation
• Implementation of application-specific HW functions:
  – String manipulation
  – String matching
  – Pixel operation
  – Multiplication-accumulation

Special consideration: clock frequency. It may be better to use a slower clock in embedded systems.

Customized Function Units

• Goal: support important computation subgraphs.
• Add specialized units within the data path of the processor.
• Exploits subgraph parallelism.
• Allows natural data propagation.

[Figure: a pipeline (Fetch, Issue, ALUs, WB) with a CCA in the data path; the CCA has inputs IN 1 and IN 2 and contains rows of functional units (FU).]

Interconnect Specialization

Specialization can be done with respect to:
• Interconnect of functional modules:
  – A reduced bus instead of the standard system bus to save cost or power consumption
  – Dedicated connections between registers (accumulator) and memories to increase parallelism
• Protocol used for the communication between components:
  – Synchronous
  – Asynchronous
  – Semi-synchronous

Optimizing Power in ASIPs

Configurable processors have a deep influence on low-power design in two ways:
• Compared to hardwired logic, software-based design allows for more sophisticated algorithms and control of operating modes. In many applications, the software can be much smarter than custom RTL about when to run and how fast.
• ASIPs pack the same work into far fewer cycles than GPPs, allowing the SOC to run at a lower clock frequency (How?)

Optimizing Power in ASIPs

$$E = \alpha \, C \, V^{2} \, n$$

• E is the energy use due to active switching in CMOS logic.
• C is the total capacitance of all the switched nodes in the circuit.
• V is the voltage.
• $\alpha$ (alpha) is the average fraction of circuit nodes switching between one and zero each cycle.
• n is the number of cycles required to execute the function.

Optimizing Power (insight)

• The impact of a good processor configuration is to sharply reduce ‘n’ while increasing ‘C’ only slightly relative to a baseline processor.
• ASIPs can be quite smart about activating execution units only when necessary. The processor generator can determine the combinations of logic blocks that must be active at each stage of the pipeline and create logic for fine-granularity clock gating, thereby reducing ‘alpha’.
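The trade-off described here can be sketched numerically from the switching-energy relation E = alpha x C x V^2 x n. The function and parameter names below are hypothetical, and the growth and shrink factors in the usage note are assumed values chosen only to illustrate the effect.

```c
/* Active switching energy: E = alpha * C * V^2 * n
 * (joules when C is in farads and V is in volts). */
double switching_energy(double alpha, double capacitance_f, double voltage_v,
                        double cycles)
{
    return alpha * capacitance_f * voltage_v * voltage_v * cycles;
}

/* Ratio E_custom / E_baseline at the same supply voltage.
 * c_growth:   factor by which C grows (e.g. 1.2 for +20%)
 * n_shrink:   factor by which n shrinks (e.g. 10 for 10x fewer cycles)
 * alpha_gain: factor by which clock gating lowers alpha (1.0 = unchanged) */
double energy_ratio(double c_growth, double n_shrink, double alpha_gain)
{
    return c_growth / (n_shrink * alpha_gain);
}
```

For instance, if the custom configuration adds 20% capacitance but needs 10x fewer cycles with alpha unchanged, energy_ratio(1.2, 10.0, 1.0) gives 0.12, meaning about 12% of the baseline energy.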

Tools?

Tensilica

Tensilica has two main product lines of 32-bit processor cores for SOC design (IP):
1. Diamond Standard processors (non-modifiable)
2. Xtensa processors (can be modified)

Tensilica also has several CAD tool flows to extend the instruction sets:
• TIE Language
• XPRES Compiler

1. Tensilica Diamond Processor

• A set of off-the-shelf synthesizable cores (fixed and not configurable), directly available from Tensilica and foundry partners, that range from area-efficient, low-power controllers to an audio processor, a high-performance DSP, and a video processor.
• Diamond Standard processors come with a comprehensive software tool set: compilers, assemblers, debuggers, …

2. Tensilica Xtensa Processor

Tensilica’s Xtensa processors are synthesizable processors that are configurable and extensible!

Xtensa Processors Architecture

The Xtensa Instruction Set Architecture (ISA) is a 32-bit RISC architecture featuring a compact instruction set optimized for embedded designs.

RISC?
• A small number of memory addressing modes
• Large uniform register files for computation operations
• Fixed-size instruction words
• Optimized pipelined architecture
• Simple and fixed instruction-field encoding
• Memory access via loads and stores of registers

Xtensa Processors Architecture

The architecture has:
• a 32-bit ALU;
• 16, 32 or 64 general-purpose physical registers;
• six special-purpose registers;
• Cache:
  – up to 32 KB and 1-, 2-, 3-, or 4-way set-associative cache?
  – Replacement policy?
  – Write back vs. write through?
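To make the cache questions concrete, the sketch below shows how cache size and associativity fix the number of sets and how an address picks a set. It is a generic set-associative model, not Xtensa-specific; the 32-byte line size used in the usage note is an assumption, and both function names are hypothetical.

```c
#include <stdint.h>

/* Number of sets = cache size / (line size * ways).
 * Returns 0 if the parameters are invalid or do not divide evenly. */
uint32_t cache_num_sets(uint32_t cache_bytes, uint32_t line_bytes,
                        uint32_t ways)
{
    if (line_bytes == 0 || ways == 0) return 0;
    uint32_t set_bytes = line_bytes * ways;
    if (cache_bytes % set_bytes != 0) return 0;
    return cache_bytes / set_bytes;
}

/* Set index for an address: drop the line-offset bits, then wrap by the
 * number of sets. The caller must pass non-zero line_bytes and num_sets. */
uint32_t cache_set_index(uint32_t addr, uint32_t line_bytes,
                         uint32_t num_sets)
{
    return (addr / line_bytes) % num_sets;
}
```

With an assumed 32-byte line, a 32 KB cache has 1024 sets when direct-mapped, 512 sets when 2-way, and 256 sets when 4-way. Higher associativity trades sets for ways, which is where the replacement policy starts to matter.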

Xtensa Processors Architecture

The architecture has:
• a 32-bit ALU;
• 16, 32 or 64 general-purpose physical registers;
• six special-purpose registers;
• 5- or 7-stage pipelines:
  – 5-stage: power usage 47 µW/MHz @ 350 MHz
  – 7-stage: power usage 57 µW/MHz @ 400 MHz
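Multiplying each per-MHz figure by its quoted clock gives a rough total power for the two pipeline options. This assumes the µW/MHz figure holds at the stated frequency and that power scales linearly with clock rate, which the slide does not state explicitly.

```latex
\begin{align*}
P_{5\text{-stage}} &\approx 47\ \mu\text{W/MHz} \times 350\ \text{MHz} = 16.45\ \text{mW} \\
P_{7\text{-stage}} &\approx 57\ \mu\text{W/MHz} \times 400\ \text{MHz} = 22.8\ \text{mW}
\end{align*}
```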

Tensilica Xtensa Architecture

Xtensa Processor Generator

• The designer can select from a broad selection of predefined standard RISC microprocessor options and can add instructions and register extensions to the tailored processor.
• Or the designer can use Tensilica's XPRES Compiler to automatically tailor the processor to optimize existing C/C++ code.
• The Xtensa Processor Generator then creates the complete processor solution set: a pre-verified processor hardware description in source RTL (Verilog or VHDL), plus supporting hardware implementation methodology scripts.
• This complete package includes software development tools, including commercial RTOS support, and comprehensive system modeling and co-verification support.

XPRES Compiler

Tensilica Instruction Extension (TIE)

• TIE is a Verilog-like language used to describe desired custom instructions.
• You can express the desired functionality in the Tensilica Instruction Extension (TIE) language.
• TIE helps you get orders-of-magnitude performance increases out of your processor design through:
1. Fusion,
2. SIMD (Single Instruction Multiple Data),
3. FLIX (Flexible Length Instruction Xtensions)

TIE Extensions

(I) Fusion

Effect of TIE Instructions

TIE Flow

Fusion Example

Exploiting Parallelism

Creating SIMD TIE Execution Units

FLIX Acceleration

Creating FLIX (VLIW) Acceleration

• An Xtensa processor can become a multi-issue VLIW processor.
• The Xtensa C/C++ compiler can aggressively extract instruction-level parallelism from the code and schedule multiple operations in a single VLIW instruction.
• By allowing two or three instructions to execute simultaneously, FLIX allows the processor to act as a 2- or 3-issue VLIW CPU, accelerating general-purpose code by 40-60%.

FLIX

Estimation (energy)

Example: MPEG Acceleration

• One of the most difficult parts of encoding MPEG-4 video streams is motion estimation, which searches adjacent video frames for similar pixel blocks as part of the MPEG-4 compression algorithm.
• The search algorithm’s inner loop contains a SAD (sum of absolute differences) computation consisting of:
  – Subtraction
  – Absolute value operation
  – Addition of the resulting value to previously computed values
• For a QCIF (quarter common intermediate format) image frame, a 15-Hz frame rate, and an exhaustive-search motion estimation scheme, SAD operations require slightly more than 641 million operations/sec.
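The SAD inner loop described on this slide can be written in a few lines of portable C. This is a plain scalar sketch for reference; the function name and the flat pixel-array layout are assumptions, and a real encoder would walk two-dimensional blocks with a stride.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Scalar sum of absolute differences over n pixels.
 * Each iteration performs the three component operations:
 * subtraction, absolute value, and accumulation. */
uint32_t sad_scalar(const uint8_t *cur, const uint8_t *ref, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        int diff = (int)cur[i] - (int)ref[i];   /* subtraction        */
        sum += (uint32_t)abs(diff);             /* absolute value+add */
    }
    return sum;
}
```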

MPEG Acceleration

Combining all three SAD component operations (subtraction, absolute value, addition) into one operation that executes in one clock cycle, and executing 16 single-pixel SAD operations in one SIMD SAD instruction during the same clock cycle, reduces the cycle count from 641 million instructions/sec to 14 million instructions/sec, a 98% reduction.
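The effect of fusing and widening the SAD operation can be modeled in software by counting how many SIMD instructions would be issued. The sketch below is only a behavioral model with hypothetical names: simd_sad16 stands in for the single-cycle 16-pixel instruction, and its inner loop represents work the hardware would do in parallel.

```c
#include <stddef.h>
#include <stdint.h>

#define SAD_LANES 16  /* pixels handled by one modeled SIMD SAD instruction */

/* Software model of one fused SIMD SAD instruction: 16 subtract,
 * absolute-value, and add steps, treated here as a single issue slot. */
static uint32_t simd_sad16(uint32_t acc, const uint8_t *cur,
                           const uint8_t *ref)
{
    for (int lane = 0; lane < SAD_LANES; lane++) {
        int diff = (int)cur[lane] - (int)ref[lane];
        acc += (uint32_t)(diff < 0 ? -diff : diff);
    }
    return acc;
}

/* SAD over n pixels (n should be a multiple of 16). *issued receives the
 * number of modeled SIMD instructions, to compare against 3 * n scalar ops. */
uint32_t sad_simd(const uint8_t *cur, const uint8_t *ref, size_t n,
                  size_t *issued)
{
    uint32_t acc = 0;
    *issued = 0;
    for (size_t i = 0; i + SAD_LANES <= n; i += SAD_LANES) {
        acc = simd_sad16(acc, cur + i, ref + i);
        (*issued)++;
    }
    return acc;
}
```

For 256 pixels the scalar version needs 3 x 256 = 768 operations while the model issues 16 instructions, a 48x reduction. Dividing 641 million by 48 gives about 13.4 million, in line with the quoted 14 million, and 1 - 14/641 is about 0.98.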

MPEG Acceleration

• The full MPEG-4 decoder adds approximately 100,000 gates to the base processor and implements a 2-way (coder and decoder) QCIF video codec that operates at 15 frames/sec.
• When instructions are added to accelerate all of these MPEG-4 decoding tasks, creating an MPEG-4 SIMD engine within the tailored processor, the results can be quite surprising.
• The resulting SIMD engine drops the number of cycles required to decode the MPEG-4 video clips from billions to millions and lowers the required processor operating frequency by roughly 30x, to around 10 MHz (power dissipation!!).

How Xtensa Compares

Reconfigurable Instruction Set Processors

Two roads to customization

• Augment GPPs with programmable logic:
  – Couple a standard processor (ARM, MIPS) with an FPGA fabric
  – Fixed processor instruction set
  – FPGA implements custom instructions
• Implement them in FPGAs:
  – Customize instructions at compile time or at run time

Reconfigurable Instruction Set Processors

• Duplicated instruction decode logic (2 symmetrical data channels)
• Duplicated commonly used function units (ALU and shifter)
• All other function units are shared (DSP operations, memory handler)
• A tightly coupled pipelined configurable gate array

Dynamic Instruction Set Extension (1)

Original code:

    for (i = 0; i < 16; i++) {
        temp = abs(v1[i] - v2[i]);
        out = out + temp;
    }

Optimized XiRisc code:

    for (i = 0; i < 16; i++) {
        pgaop(out, v1[i], v2[i]);
    }

[Figure: XiRisc datapath with register file, ALUs & multiplier, memory unit, and the PiCoGA; the PiCoGA operation is drawn as A-B and B-A blocks feeding a MUX and an accumulator.]
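To check that the two loops on this slide compute the same result, the custom operation can be modeled in ordinary C. The sketch below is a hypothetical software stand-in for pgaop, based on one reading of the figure (two subtractors, a MUX that picks the non-negative difference, and an accumulator). It passes the accumulator by pointer because plain C cannot update an argument in place.

```c
#include <stdint.h>

/* Hypothetical software model of the pgaop custom instruction: computes both
 * A-B and B-A, selects the non-negative one (the MUX), and adds it to the
 * accumulator. The real instruction runs on the PiCoGA; this only mirrors
 * its result. */
static inline void pgaop_model(int32_t *out, int32_t a, int32_t b)
{
    int32_t a_minus_b = a - b;
    int32_t b_minus_a = b - a;
    *out += (a_minus_b >= 0) ? a_minus_b : b_minus_a;   /* MUX + accumulate */
}

/* The optimized loop from the slide, using the model in place of pgaop. */
int32_t sad16_with_pgaop(const int32_t v1[16], const int32_t v2[16])
{
    int32_t out = 0;
    for (int i = 0; i < 16; i++) {
        pgaop_model(&out, v1[i], v2[i]);
    }
    return out;
}
```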

Summary

• Configurable and extensible (tailorable) processor cores are a combination of hardware and software IP that give system developers the ability to tailor processors for better performance in specific applications.
• The main difference between GPPs and ASIPs is specialization. It is important to note that specialization must not compromise flexibility!
• Advantages:
  – Faster, more power efficient, less silicon area
  – No other company will have your version of that task-specific processor.
  – No one will have the matching compiler and software tool chain.

Conclusion

• ASIPs are closely related to hardware/software co-design methodology, since a general-purpose processor (GPP) is involved along with hardware accelerators in the form of specialized functional units.
• Tensilica provides all the necessary tools to automatically create Application Specific Instruction Set Processors in minimum time.
• The designer can rely on the TIE language to manually extend the instruction set of the newly created processor.
• Another option is to rely on the Tensilica XPRES compiler to automatically create the processor and all the necessary software development tools, such as compilers, debuggers, …
• The designer can extend the capabilities of the processor by changing the cache, ports, queues, register files, functional units, …
• It is worth using the Tensilica tools to perform some type of design exploration for your application before you attempt to custom-build hardware accelerators.