Chameleon Chip


Transcript of Chameleon Chip

Page 1: Chameleon Chip

Seminar Report Chameleon Chip

INTRODUCTION

A reconfigurable processor is a microprocessor with erasable hardware that can rewire itself dynamically. This allows the chip to adapt effectively to the programming tasks demanded by the particular software it is interfacing with at any given time. Ideally, the reconfigurable processor can transform itself from a video chip to a central processing unit (CPU) to a graphics chip, for example, each optimized to allow applications to run at the highest possible speed.

Such chips in effect provide a "chip on demand." In practical terms, this ability can translate to immense flexibility in device functions. For example, a single device could serve as both a camera and a tape recorder (among numerous other possibilities): you would simply download the desired software, and the processor would reconfigure itself to optimize performance for that function. According to a recent Red Herring magazine article, that type of device versatility may be available by 2003.

A reconfigurable processor chip usually contains several parallel computational units known as functional blocks. These functional blocks are connected in all possible ways. When the chip is reconfigured, the connections inside the functional blocks and the connections between the functional blocks change.

That means that when a particular piece of software is loaded, the present hardware design is erased and a new hardware design is generated by making a particular number of connections active while leaving the others idle. This defines the optimum hardware configuration for that particular software. The key to the design is the small size of each processing element. The smallest segments of the chip can be defined with just 50 bits of software code, so the entire chip can be reprogrammed with just 50,000 bits of software description. It takes just 20 microseconds to reconfigure the entire processing array.
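The figures quoted above can be cross-checked with a little arithmetic. The sketch below is only illustrative: it assumes every smallest segment costs the same 50 bits (the text does not say so), and the function names are made up for this example.

```python
# Figures quoted in the text above.
BITS_PER_SEGMENT = 50      # bits needed to define the smallest segment
BITS_PER_CHIP = 50_000     # bits needed to describe the entire chip
RECONFIG_TIME_S = 20e-6    # time to reconfigure the entire processing array


def implied_segment_count(bits_per_chip: int, bits_per_segment: int) -> int:
    """Segments on the chip, ASSUMING each one costs the same number of bits."""
    return bits_per_chip // bits_per_segment


def implied_config_rate_bps(bits_per_chip: int, reconfig_time_s: float) -> float:
    """Average rate at which configuration bits must be delivered."""
    return bits_per_chip / reconfig_time_s


if __name__ == "__main__":
    segments = implied_segment_count(BITS_PER_CHIP, BITS_PER_SEGMENT)
    rate = implied_config_rate_bps(BITS_PER_CHIP, RECONFIG_TIME_S)
    print(f"implied segments: {segments}")
    print(f"implied configuration rate: {rate / 1e9:.2f} Gbit/s")
```

Under that assumption the chip would hold about a thousand smallest segments, and the configuration data would have to arrive at roughly 2.5 Gbit/s.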

Reconfigurable processors are currently available from Chameleon Systems, Billions of Operations (BOPS), and PACT (Parallel Array Computing Technology). Among these, only Chameleon provides a design environment that allows customers to convert their algorithms to hardware configurations by themselves.

www.seminarsonly.com


MULTIFUNCTION IMPLEMENTATION

In a conventional ASIC or FPGA, multiple algorithms are implemented as separate hardware modules. Four algorithms would divide the chip into four functional areas.

With reconfigurable technology, the four algorithms are loaded into the entire reconfigurable Fabric one at a time. First, the entire Fabric is dedicated to algorithm 1; during this processing time, algorithm 2 is loaded into the background plane. In a single clock cycle, the entire Fabric is swapped to algorithm 2; during this processing time, algorithm 3 is loaded into the background plane, and so on.

Because the entire reconfigurable Fabric is dedicated to just one algorithm at a time, the result is much higher performance, lower cost and lower power consumption.
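The load-in-background, swap-in-one-cycle sequence described above can be sketched as a small simulation. This is a toy model, not Chameleon's implementation: the `Fabric` class, its method names and the log format are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fabric:
    """Toy model of a fabric with an active and a background configuration plane."""
    active: Optional[str] = None       # configuration currently running
    background: Optional[str] = None   # configuration waiting to be swapped in

    def load_background(self, algorithm: str) -> None:
        # In the real chip this happens while the active plane keeps running.
        self.background = algorithm

    def swap(self) -> None:
        # Modelled as a single step, mirroring the one-clock-cycle swap.
        self.active, self.background = self.background, None


def run_schedule(algorithms: List[str]) -> List[str]:
    """Return a log of what runs and what loads during each processing phase."""
    log: List[str] = []
    if not algorithms:
        return log
    fabric = Fabric()
    fabric.load_background(algorithms[0])
    fabric.swap()
    for upcoming in algorithms[1:]:
        fabric.load_background(upcoming)
        log.append(f"run {fabric.active} | load {fabric.background}")
        fabric.swap()
    log.append(f"run {fabric.active} | load -")
    return log


if __name__ == "__main__":
    for line in run_schedule(["alg1", "alg2", "alg3", "alg4"]):
        print(line)
```

Each log line corresponds to one processing phase: the whole fabric runs one algorithm while the next one is staged in the background plane.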


THE GENERAL ARCHITECTURE OF A RECONFIGURABLE CHIP

The chip architecture depends on the given task. The design assumes that some pins serve as configuration inputs and others as data or control inputs and outputs. Internally, a new chip design must define the set of function blocks (FBs) used to construct the circuit, the rules for interconnecting them, and the way inputs and outputs are connected. It must also define the structure of the configuration memory and the mechanism for writing to it. The most important parts are the logic circuits that configure the function blocks according to the data in the configuration memory.

The structure of a reconfigurable chip is designed in a development tool. The various possible connections between functional blocks are encoded as bits known as configuration bits. The resulting configuration stream is downloaded into the configuration memory through the configuration inputs. Thus, a new reconfigurable machine is established.
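As a rough illustration of how connections can be encoded as configuration bits, the sketch below assigns one bit to every possible connection between functional blocks, in a fixed order. The one-bit-per-connection scheme, the block names and the function names are all assumptions for this example; the text does not describe the actual encoding.

```python
from itertools import combinations
from typing import Iterable, List, Tuple

Connection = Tuple[str, str]


def all_connections(blocks: List[str]) -> List[Connection]:
    """Every possible pairwise connection, in a fixed, reproducible order."""
    return list(combinations(blocks, 2))


def encode(blocks: List[str], active: Iterable[Connection]) -> List[int]:
    """Configuration stream: bit i is 1 if the i-th possible connection is active."""
    wanted = {frozenset(pair) for pair in active}
    return [1 if frozenset(pair) in wanted else 0 for pair in all_connections(blocks)]


def decode(blocks: List[str], bits: List[int]) -> List[Connection]:
    """What the configuration logic would do: turn stored bits back into connections."""
    return [pair for pair, bit in zip(all_connections(blocks), bits) if bit]


if __name__ == "__main__":
    blocks = ["FB0", "FB1", "FB2", "FB3"]            # hypothetical functional blocks
    stream = encode(blocks, [("FB0", "FB1"), ("FB3", "FB2")])
    configuration_memory = list(stream)              # "download" the stream
    print(stream)
    print(decode(blocks, configuration_memory))
```

Loading a different stream into the same memory yields a different set of active connections, which is the sense in which a new machine is established.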

ARCHITECTURE

The chip incorporates three core architectural technologies:

1) A complete 32-bit Embedded Processor System

It provides all of the basic building blocks for a complete system: a 32-bit ARC processor, a 32-bit PCI interface, and a 64-bit high-performance memory controller. These fully integrated and fully verified modules simplify design, debug and verification.

2) A high-performance 32-bit Reconfigurable Processing Fabric (RPF)

The RPF has 108 parallel computation units, providing tremendous computational power. This is where the "heavy lifting" (reconfiguration) is performed. The high-performance Roadrunner Bus links these system modules. This 128-bit, split-transaction bus provides 2 GByte/sec of on-chip bandwidth among the subsystems in the Embedded Processor System and the RPF.

3) Instantaneous reconfigurability

These core technologies combine to eliminate the performance/flexibility compromise, exploit platform-based design, and enable you to implement your own algorithms to differentiate your product.

[Figure: Architecture. Blocks: 32-bit PCI bus, 64-bit memory bus, PCI controller, ARC processor, memory controller, 128-bit Roadrunner bus, configuration subsystem, DMA subsystem, Reconfigurable Processing Fabric (RPF), 160-pin programmable I/O]

RECONFIGURABLE PROCESSING FABRIC (RPF)

The Reconfigurable Processing Fabric (RPF, or "Fabric") provides the algorithmic computation power of the Chameleon chip. It consists of 84 32-bit Data Path Units and 24 16x24-bit Multipliers. Operating at 125 MHz, they provide up to 3,000 million 16-bit multiply-accumulates per second and 24,000 million 16-bit operations per second.

The Fabric is divided into Slices, the basic unit of reconfiguration. The CS2112 includes four Slices, each of which can be independently reconfigured. Each Slice consists of three Tiles. A Tile is built from 32-bit Data Path Units, 16x24-bit Single-Cycle Multipliers, Local Store Memories, and Control Logic Units. The Dynamic Interconnect connects the modules within the Fabric.
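The headline numbers for the Fabric can be reproduced from the structure the report describes: four Slices of three Tiles, with seven DPUs and two multipliers per Tile (the per-Tile counts are given in the subsections that follow). The sketch below is a plausibility check only. Two things in it are assumptions rather than statements from the source: that each 32-bit DPU counts as two 16-bit operations per clock, and that the Roadrunner Bus runs at the same 125 MHz as the Fabric.

```python
SLICES = 4
TILES_PER_SLICE = 3
DPUS_PER_TILE = 7
MULTIPLIERS_PER_TILE = 2
FABRIC_CLOCK_MHZ = 125
BUS_WIDTH_BITS = 128


def unit_counts():
    """Return (DPUs, multipliers) for the whole Fabric."""
    tiles = SLICES * TILES_PER_SLICE
    return tiles * DPUS_PER_TILE, tiles * MULTIPLIERS_PER_TILE


def peak_mmacs() -> int:
    """Million multiply-accumulates per second: one per multiplier per clock."""
    _, multipliers = unit_counts()
    return multipliers * FABRIC_CLOCK_MHZ


def peak_mops_16bit() -> int:
    """Million 16-bit operations per second.

    ASSUMPTION: a 32-bit DPU counts as two 16-bit operations per clock and a
    multiplier counts as one. The source states only the total.
    """
    dpus, multipliers = unit_counts()
    return (2 * dpus + multipliers) * FABRIC_CLOCK_MHZ


def bus_bandwidth_gbytes(bus_clock_mhz: int = FABRIC_CLOCK_MHZ) -> float:
    """Peak bus bandwidth in GByte/s, ASSUMING one transfer per clock."""
    return (BUS_WIDTH_BITS // 8) * bus_clock_mhz / 1000


if __name__ == "__main__":
    dpus, multipliers = unit_counts()
    print(f"DPUs: {dpus}, multipliers: {multipliers}, total units: {dpus + multipliers}")
    print(f"peak MMACs/s: {peak_mmacs()}")
    print(f"peak 16-bit MOPS: {peak_mops_16bit()}")
    print(f"bus bandwidth: {bus_bandwidth_gbytes()} GByte/s")
```

Under these assumptions the sketch lands on the quoted 84 DPUs, 24 multipliers, 108 units, 3,000 million MACs per second, 24,000 million 16-bit operations per second, and 2 GByte/sec.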

32-bit Data Path Unit (DPU)

Each Tile includes seven Data Path Units. The DPU is a data processing module that directly supports all C and Verilog operations (Verilog is a hardware description language used to design and document electronic systems). Routing multiplexers select the operands. There are three routing classes:

a) Local routes connect 7 nearby DPUs with a delay of 1 clock cycle.

b) Intra-slice routes connect DPUs within a Slice with a delay of 1 clock cycle.

c) Inter-slice routes connect DPUs in different Slices with a delay of 2 clock cycles.
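The three routing classes can be summarized as a lookup from the positions of two DPUs to a delay in clock cycles. In this sketch, treating "nearby" DPUs as those in the same Tile is an assumption, and the `DpuLocation` type is invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DpuLocation:
    """Hypothetical address of a DPU inside the Fabric."""
    slice_index: int
    tile_index: int
    dpu_index: int


# Delays in clock cycles, as listed in the text.
ROUTE_DELAY_CYCLES = {"local": 1, "intra-slice": 1, "inter-slice": 2}


def route_class(src: DpuLocation, dst: DpuLocation) -> str:
    """Classify a route between two DPUs."""
    if src.slice_index != dst.slice_index:
        return "inter-slice"
    if src.tile_index == dst.tile_index:
        return "local"          # ASSUMPTION: "nearby" means same Tile
    return "intra-slice"


def route_delay(src: DpuLocation, dst: DpuLocation) -> int:
    return ROUTE_DELAY_CYCLES[route_class(src, dst)]


if __name__ == "__main__":
    a = DpuLocation(0, 0, 0)
    for b in (DpuLocation(0, 0, 6), DpuLocation(0, 2, 3), DpuLocation(3, 1, 2)):
        print(route_class(a, b), route_delay(a, b))
```

Only a route that crosses a Slice boundary costs the extra cycle, which is why placement within a Slice matters for latency-sensitive dataflow.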


The DPU includes a 32-bit real-time Barrel Shifter for shifting operations and two 32-bit AND/OR Mask operators.

At the heart of the DPU is the 32-bit Operator, which directly implements all C and Verilog operators. The Operator supports arithmetic, signed/unsigned shifting, and bit-field masking modes of operation.

[Figure: Data Path Unit. Blocks: two routing multiplexers, two register-and-mask stages, barrel shifter, operator (OP), two registers, instruction input]
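A behavioural sketch may help make the DPU description concrete: two operands are masked, one passes through the shifter, and the operator combines them into a 32-bit result. The stage order, the choice of a logical left shift, the small operator table and the function name `dpu_execute` are assumptions for illustration; the real instruction format is not given in the report.

```python
import operator

MASK32 = 0xFFFFFFFF

# Illustrative subset of operators; the real DPU supports all C/Verilog operators.
OPERATORS = {
    "add": operator.add,
    "sub": operator.sub,
    "and": operator.and_,
    "or": operator.or_,
    "xor": operator.xor,
}


def dpu_execute(a: int, b: int, op: str,
                and_mask_a: int = MASK32, and_mask_b: int = MASK32,
                shift_left: int = 0) -> int:
    """Model one DPU operation on 32-bit values.

    Stages (assumed order): AND-mask both operands, barrel-shift operand A,
    apply the operator, truncate the result to 32 bits.
    """
    if not 0 <= shift_left < 32:
        raise ValueError("shift amount must be in 0..31")
    a = a & and_mask_a & MASK32
    b = b & and_mask_b & MASK32
    a = (a << shift_left) & MASK32      # a barrel shifter shifts by any amount at once
    return OPERATORS[op](a, b) & MASK32


if __name__ == "__main__":
    # Keep the low byte of A, shift it left by 4, then add B.
    print(hex(dpu_execute(0x1234, 0x00FF, "add", and_mask_a=0x00FF, shift_left=4)))
```

In this model a complete DPU configuration is simply the tuple of masks, shift amount and operator, which is roughly what one stored Instruction would have to capture.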

16x24 Single-Cycle Multiplier

The Tile includes two 16x24-bit single-cycle multipliers. With a total of 24

multipliers, the CS2112 delivers 3,000 Million Multiply-Accumulates per Second.

Local Store Memory (LSM)


The Tile includes four 32-bit wide by 128 word deep Local Store Memories.

The LSM is accessed directly by the DMA Subsystem and the neighboring

DPUs/Multipliers.

Control Logic Unit (CLU)

The Control Logic Unit directly implements finite state machine sequencing and conditional operation. The CLU includes the Programmable Sum-of-Products (PSOP) and the Control State Memory (CSM). The CSM stores eight user-specified Instructions for each of the seven DPUs in the Tile, where each Instruction represents a complete DPU configuration. The PSOP implements conditional state sequences on a configurable context basis.
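One way to picture the CLU is as a small state machine: the current state selects one of the eight stored Instructions for each of the seven DPUs, and a sum-of-products over condition flags decides the next state. The sketch below follows that reading, which is an interpretation and not a documented design; the flag names, the instruction labels and the class layout are hypothetical.

```python
from typing import Dict, List, Tuple

NUM_INSTRUCTIONS = 8   # Instructions stored per DPU in the CSM
NUM_DPUS = 7           # DPUs per Tile

ProductTerm = Dict[str, bool]                 # flag name -> required value (AND)
Transition = Tuple[List[ProductTerm], int]    # (sum of product terms, next state)


def psop(terms: List[ProductTerm], flags: Dict[str, bool]) -> bool:
    """Sum-of-products: true if ANY term has ALL its flag requirements met."""
    return any(
        all(flags.get(name, False) == wanted for name, wanted in term.items())
        for term in terms
    )


class ControlLogicUnit:
    def __init__(self, csm: List[List[str]], transitions: Dict[int, List[Transition]]):
        assert len(csm) == NUM_INSTRUCTIONS
        assert all(len(row) == NUM_DPUS for row in csm)
        self.csm = csm                  # csm[state][dpu] -> instruction label
        self.transitions = transitions
        self.state = 0

    def step(self, flags: Dict[str, bool]) -> List[str]:
        """Advance one clock and return the Instruction issued to each DPU."""
        for terms, next_state in self.transitions.get(self.state, []):
            if psop(terms, flags):
                self.state = next_state
                break
        return self.csm[self.state]


def demo_clu() -> ControlLogicUnit:
    csm = [[f"I{s}/DPU{d}" for d in range(NUM_DPUS)] for s in range(NUM_INSTRUCTIONS)]
    transitions = {
        0: [([{"done": True}], 1)],
        1: [([{"done": True, "error": False}, {"reset": True}], 0)],
    }
    return ControlLogicUnit(csm, transitions)


if __name__ == "__main__":
    clu = demo_clu()
    for flags in ({"done": False}, {"done": True}, {"reset": True}):
        print(flags, "->", clu.step(flags)[0])
```

Because each state change swaps the Instruction of every DPU in the Tile at once, conditional sequencing of this kind changes the datapath behaviour without reloading a configuration.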

Dynamic Interconnect

The Fabric provides 100% routability. The Dynamic Interconnect also connects the Embedded Processor System with the RPF, and its routes can be changed on a clock-by-clock basis for flexible and optimal dataflow.

PROGRAMMABLE I/O

The RCP (Reconfigurable Communication Processor) includes banks of Programmable I/O (PIO) pins that provide high I/O bandwidth. Each bank of 40 PIO pins delivers 0.5 GBytes/sec of I/O bandwidth.

EMBEDDED PROCESSOR SYSTEM

The Embedded Processor System provides all of the basic building blocks for a complete system. These fully integrated and fully verified modules simplify design, debug and verification. This integrated system platform consists of:

a) 32-bit ARC Processor: delivers 120 MIPS at 125 MHz, with 64 general-purpose 32-bit registers and a 32-bit address space. It includes a 4 Kbyte instruction cache and a 4 Kbyte data memory.

b) 32-bit PCI Controller: the interface to the PCI bus.

c) 64-bit Memory Controller: the interface to memory.

d) DMA Subsystem: supports 16 DMA channels, transferring data between the modules in the Embedded Processor System and to/from the Local Store Memories.

Configuration Subsystem

The Configuration Subsystem includes the Configuration Controller and the

two Configuration Planes. The Configuration Controller is an optimized DMA

Controller, transferring configuration data from off-chip memory through the 64-bit

Memory Controller to the Background Configuration Plane.

This transfer can take place during full-speed operation of the Fabric, loading a new configuration while the prior configuration is running on the Fabric.

TECHNOLOGIES USED IN CHIP

1. eCONFIGURABLE™ TECHNOLOGY

eConfigurable™ Technology is used for instantaneous reconfiguration. It reconfigures the Fabric in one clock cycle and increases the number of voice/data/video channels per chip. As mentioned earlier, each Slice can be configured independently.

Loading the Background Plane from external memory requires just 3 µsec per Slice; this operation does not interfere with active processing on the Fabric. Swapping the Background Plane into the Active Plane requires just one clock cycle. With eConfigurable Technology, the four algorithms of the earlier example are loaded into the entire Reconfigurable Processing Fabric one at a time.
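The benefit of loading in the background can be estimated with the figures above: 3 µsec to load a Slice and one clock cycle (8 ns at the 125 MHz quoted for the Fabric) to swap. The sketch below compares a hypothetical chip that must stop to load each configuration with one that hides the load behind processing. The run times used in the demo are invented, and the model covers a single Slice only.

```python
from typing import List

LOAD_US_PER_SLICE = 3.0          # from the text
FABRIC_CLOCK_MHZ = 125.0         # from the RPF description
SWAP_US = 1.0 / FABRIC_CLOCK_MHZ # one clock cycle = 0.008 microseconds


def serial_time_us(run_times_us: List[float]) -> float:
    """Total time if the Slice must stop for every load (no background plane)."""
    return sum(LOAD_US_PER_SLICE + SWAP_US + run for run in run_times_us)


def overlapped_time_us(run_times_us: List[float]) -> float:
    """Total time if the next configuration loads while the current one runs."""
    if not run_times_us:
        return 0.0
    total = LOAD_US_PER_SLICE + SWAP_US      # the very first load cannot be hidden
    last_index = len(run_times_us) - 1
    for index, run in enumerate(run_times_us):
        if index == last_index:
            total += run                     # nothing left to load
        else:
            # The phase lasts until both the run and the next load are finished.
            total += max(run, LOAD_US_PER_SLICE) + SWAP_US
    return total


if __name__ == "__main__":
    runs = [10.0, 10.0, 10.0]    # hypothetical processing times in microseconds
    print(f"serial:     {serial_time_us(runs):.3f} us")
    print(f"overlapped: {overlapped_time_us(runs):.3f} us")
```

Whenever an algorithm runs for at least 3 µsec, the load of its successor is hidden entirely and only the single-cycle swap remains visible.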

2. C~SIDE Development Tools

Without the necessary software tools, no one but the inventors of a reconfigurable processor could port software to it. As a result, customers had to hand their algorithms over to the chip's developers.

With this software, Chameleon Systems gives customers the ability to do the programming themselves, thus keeping their algorithms secret.

The Chameleon Systems Integrated Development Environment (C~SIDE) is a complete toolkit for designing, debugging and verifying RCP designs. C~SIDE uses a combined C language and Verilog HDL flow to map algorithms into the chip's Reconfigurable Processing Fabric (RPF).

C~SIDE includes an optimized GNU C compiler for the ARC Processor, an optimized Verilog To Bits (V2B) synthesizer for the Reconfigurable Processing Fabric, an interactive floor planner, an instruction-set simulator, and a unified debug environment for the ARC core and the RPF.

3. eBIOS™

eBIOS provides an interface between the Embedded Processor System and the Fabric. It provides resource allocation, configuration management and DMA services. The eBIOS calls are generated automatically at compile time, but they can be edited for precise control of any function.


DESIGN PROCESS

The design process consists of converting a C/C++ program into a hardware configuration. One end of the design is a C/C++ program and the other end is the processing hardware, so a mapping is needed between them. C, however, is not a hardware description language (HDL), and an HDL is needed to specify a hardware configuration.

For that purpose Chameleon Systems uses the HDL Verilog. Once a hardware description in Verilog is obtained, it can be converted to configuration bits using the Verilog To Bits (V2B) synthesizer. The configuration bits are what actually specify the hardware configuration. What remains is a mapping between the C/C++ program and Verilog.

For that, Chameleon Systems provides an assembler. When an assembly-language-like description of the C/C++ program is given to this assembler, it generates Verilog descriptions. The C/C++ algorithm is thereby mapped to a hardware configuration.
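The first stage of this flow, from an assembly-like description to Verilog, can be illustrated with a toy translator. The three-operand syntax and the mnemonic table below are entirely hypothetical and are not Chameleon's design entry language; the output is an ordinary Verilog continuous assignment. The later V2B stage is not modelled.

```python
# Hypothetical mnemonic -> Verilog operator table (illustrative only).
VERILOG_OPERATORS = {"add": "+", "sub": "-", "and": "&", "or": "|", "xor": "^"}


def assembly_to_verilog(line: str) -> str:
    """Translate a toy 'op dst, src1, src2' line into a Verilog assign statement."""
    mnemonic, operands = line.strip().split(None, 1)
    dst, src1, src2 = (name.strip() for name in operands.split(","))
    return f"assign {dst} = {src1} {VERILOG_OPERATORS[mnemonic]} {src2};"


if __name__ == "__main__":
    program = ["add sum, a, b", "and masked, sum, mask"]
    for line in program:
        print(assembly_to_verilog(line))
```

A real tool would also have to declare signals, handle widths and schedule operations onto DPUs; the point here is only that each assembly-like statement corresponds to a small piece of hardware description.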

[Figure: Design process. C/C++ program -> (assembler) -> Verilog -> (V2B) -> configuration bits -> hardware]

COMPARISON WITH OTHER TECHNOLOGIES

Today's system architects have at their disposal an arsenal of highly integrated, high-performance semiconductor technologies, such as application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), digital signal processors (DSPs), and field-programmable gate arrays (FPGAs).

However, system architects continue to struggle with the requirement that communication systems deliver both performance and flexibility. Enter the reconfigurable processor, an entirely new category of semiconductor solution that serves as a system-level platform for a broad range of applications. The RCP fills the void between fast but inflexible ASICs and flexible but slow and costly DSPs and FPGAs. Table 1 compares the RCP with other technologies in terms of flexibility, cost, performance and time-to-market.

TABLE 1


ADVANTAGES

Early and fast design

With FPGAs, design cycle time and cost actually increase, because FPGAs are bit-oriented arrays that incur a large silicon overhead when used to process wide data streams.

DSP processing speed is typically limited by the internal bus that interconnects the multiple execution units. Converting a prototype to an ASIC for cost reduction, and then manufacturing the ASIC, is a lengthy and costly process. Prototyping with RCPs and their associated tools enables a fast, all-software design.

Reducing power.

RCPs achieve better speed/power characteristics than DSPs and FPGAs.

Reducing development cost.

RCPs substantially reduce the development cycles and costs normally associated with ASIC design.

Reducing manufacturing cost.

Measured by chip count or silicon area, the manufacturing cost advantage of an

RCP over DSP- or FPGA-based solutions with equivalent data

Increasing bandwidth.

Every feature of the RCP (more fundamental processing power, higher internal and external I/O speeds, closer interaction between the on-chip RISC processor and the reconfigurable data stream logic, and the algorithmic flexibility to adapt to predetermined conditions) enables the cost-effective development of higher system bandwidth.

DISADVANTAGES

Inertia might be the worst problem facing reconfigurable computing. Engineers are

slow to change, and they're comfortable designing things the old way, which offered

them a spectrum of programmable or hard-wired options.

Several startups in reconfigurable computing have chosen the next-generation

wireless market as the key battleground. Besides QuickSilver and Chameleon,

Morphics Technology in Campbell, California, is also targeting the wireless market.

They should expand from there.

Controlling the development time and costs in an RCP design requires a

comprehensive set of tools – a design environment with a graceful flow from

systems design to executable files that run the embedded microprocessor and

configure the fabric. Hardware and software debugging and verification tools are

also necessary. Ultimately, the complete RCP design process should merge

seamlessly with the equipment manufacturer’s other design tools.

At present, there is a "learning curve" for designers unfamiliar with reconfigurable logic, because the designer has to learn Chameleon's assembly-like design entry language. Research is under way to let designers enter their designs through tools such as Matlab or SPW, which will let users draw data-flow diagrams in lieu of writing code.


APPLICATIONS

1. Wireless Base stations

Reconfigurable technology mainly targets base stations and their unpredictable combination of voice and data traffic. Base-station infrastructure will have to be adaptive enough to accommodate those requirements. With a fixed processor, every channel must be able to support both simple voice calls and high-bandwidth data connections, which means that many voice calls do not use all the bandwidth assigned to them. With a reconfigurable processor, each channel can be allotted the exact amount of bandwidth it requires.
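The difference between the two allocation policies can be made concrete with a small calculation. In this sketch every fixed channel is sized for the most demanding connection, while the reconfigurable case allots each channel exactly what it needs. The traffic figures in the demo are invented purely for illustration.

```python
from typing import List


def fixed_channel_waste_kbps(demands_kbps: List[int]) -> int:
    """Bandwidth left unused when every channel is sized for the largest demand."""
    channel_kbps = max(demands_kbps)
    return sum(channel_kbps - demand for demand in demands_kbps)


def exact_allocation_kbps(demands_kbps: List[int]) -> int:
    """Bandwidth needed when each channel gets exactly what it requires."""
    return sum(demands_kbps)


if __name__ == "__main__":
    # Hypothetical mix: three voice calls and one high-bandwidth data connection.
    demands = [13, 13, 13, 384]
    provisioned = max(demands) * len(demands)
    print(f"fixed channels provision {provisioned} kbps, "
          f"of which {fixed_channel_waste_kbps(demands)} kbps is unused")
    print(f"exact allocation needs {exact_allocation_kbps(demands)} kbps")
```

The more uneven the mix of voice and data traffic, the larger the share of fixed-channel capacity that sits idle.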

2. Wireless Local Loop (WLL)

Reconfigurable technology is also widely applied in Wireless Local Loops because of its high processing power, bandwidth and reconfigurable nature.

3. High-Performance DSL (Digital Subscriber Line Technology)

DSL technology brings high bandwidth to home users. A telephone line usually consists of two wires, which can carry signals with a bandwidth on the order of megahertz. Voice telephony uses only the lowest part of that range, up to roughly 4,000 Hz. DSL technology puts the remaining bandwidth to effective use for data transmission. If the processors employed in telephone switching stations cannot handle that much bandwidth, DSL technology cannot be implemented efficiently and effectively. The first-generation Reconfigurable Communication Processor, the CS2112, provides very high bandwidth and can therefore be used effectively in local switching stations.


4. Software-Defined Radio (SDR)

The SDR concept is applied in cell phone technology. Cell phones use certain protocols to communicate with each other. If these protocols change, the phones can no longer communicate. If reconfigurable processors are used in cell phones, the processor can reconfigure itself to provide a new hardware design for the new protocol, so the phones can be used with new and future protocols.


CONCLUSION

One day, someone will make a chip that does everything for the ultimate

consumer device. The chip will be smart enough to be the brains of a cell phone that can

transmit or receive calls anywhere in the world.

If the reception is poor, the phone will automatically adjust so that the quality

improves. At the same time, the device will also serve as a handheld organizer and a

player for music, videos, or games.

Today, designing such a chip crosses too many architectural boundaries. Nobody has figured out a way to get a chip to meet all the criteria for the ultimate consumer device. But we might be getting closer. A new kind of chip now adapts to any programming task by effectively erasing its hardware design and regenerating a new hardware design that is perfectly suited to run the software at hand. These chips are referred to as reconfigurable processors.

These new chips are able to rewire themselves on the fly to create the exact

hardware needed to run a piece of software at the utmost speed. If these adaptable chips

can reach cost-performance parity with hard-wired chips, so will the gadgets of the

information age.


REFERENCES

1. J. R. Hauser and J. Wawrzynek, "Garp: A MIPS Processor with a Reconfigurable Coprocessor," Proceedings of the 5th IEEE Symposium on FPGA-Based Custom Computing Machines.

2. S. C. Goldstein, H. Schmit, M. Budiu, S. Cadambi, M. Moe, and R. R. Taylor, "PipeRench: A Reconfigurable Architecture and Compiler," Computer.

3. Z. Andales, Y. Mitsuyama, T. Onoye, and I. Shirakawa, "CHAMELEON: A dynamically reconfigurable hardware-based cryptosystem," in Proc. EUROMEDIA,

4. A. DeHon, "Reconfigurable Architectures for General-Purpose Computing," Massachusetts Institute of Technology, Cambridge, MA.

5. http://www.mitsubishi.com/ghp_japan/misty/misty1megafunc.htm

6. www.seminarsonly.com


ABSTRACT

Today, designing a chip crosses too many architectural boundaries. Nobody has figured out a way to get a chip to meet all the criteria for the ultimate consumer device. But we might be getting closer. A new kind of chip now adapts to any programming task by effectively erasing its hardware design and regenerating a new hardware design that is perfectly suited to run the software at hand. These chips are referred to as reconfigurable processors. They are able to rewire themselves on the fly to create the exact hardware needed to run a piece of software at the utmost speed. This new chip is called the Chameleon Chip.


CONTENTS

1. INTRODUCTION

2. MULTIFUNCTION IMPLEMENTATION

3. THE GENERAL ARCHITECTURE OF A RECONFIGURABLE CHIP

4. ARCHITECTURE

5. RECONFIGURABLE PROCESSING FABRIC

6. PROGRAMMABLE I/O

7. EMBEDDED PROCESSOR SYSTEM

8. TECHNOLOGIES USED IN CHIP

9. DESIGN PROCESS

10. COMPARISON WITH OTHER TECHNOLOGIES

11. ADVANTAGES

12. DISADVANTAGES

13. APPLICATIONS

14. CONCLUSION

15. REFERENCES
