Post on 02-Oct-2015
Lecture 13
Digital System Design
Programmable Processor 1
Programmable Processors
A programmable processor, also known as a general-purpose
processor, is a digital circuit whose particular processing task is
stored in a memory, rather than being built into the circuit
itself.
The representation of that processing task in the memory is
known as a program.
Examples include ARM, MIPS, PIC, Pentium, and PowerPC processors.
A benefit of the programmable processor is that its circuit can
be mass produced and then programmed to do almost anything.
Programmable processors have the drawback of computation
overhead because they have to be general.
Basic Architecture
A programmable processor consists of two main parts:
Datapath
Control Unit
Basic Datapath
We can view processing generally as three steps:
Loading data: reading the data on which we wish to work from some input locations.
Transforming data: performing some computations with that data that result in new data.
Storing the new data: writing the new data to some output locations.
A data memory holds all the data that a programmable processor can access, as input data or output data.
To process the data, a programmable processor needs to be able to load data from data memory into any one of several registers (the register file), to feed data from some subset of those registers through functional units that can perform all possible transformation operations (typically an ALU), and to store data from a register back into data memory.
The basic circuit shown is known as the programmable processor's datapath. The basic datapath can perform the following possible datapath operations in a given clock cycle:
Load operation
ALU operation
Store operation
Notice that the datapath cannot directly operate on data memory locations with the ALU in one clock cycle.
A datapath that requires all data to first pass through the register file before the data can be transformed by the ALU is known as a load-store architecture.
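The three datapath operations can be sketched in Python as follows. This is only an illustrative model: the memory size, the register count, the 16-bit data width, and the function names are assumptions, not details taken from the slides.

```python
# Illustrative model of a load-store datapath (all sizes are assumptions).
D = [0] * 256    # data memory
RF = [0] * 16    # register file

def load(ra, d):
    """Load operation: copy data memory location d into register ra."""
    RF[ra] = D[d]

def alu_add(ra, rb, rc):
    """ALU operation: the ALU reads and writes only the register file."""
    RF[ra] = (RF[rb] + RF[rc]) & 0xFFFF  # assumed 16-bit data width

def store(ra, d):
    """Store operation: copy register ra into data memory location d."""
    D[d] = RF[ra]

# Adding two memory words takes four operations, because the ALU
# cannot read data memory directly (load-store architecture).
D[0], D[1] = 7, 5
load(0, 0)
load(1, 1)
alu_add(2, 0, 1)
store(2, 9)
print(D[9])  # 12
```

Each function stands for one operation the datapath could carry out in a single clock cycle.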
These possible datapath operations are illustrated in the figure
Basic Control Unit
If we want to use the basic datapath to perform a simple processing task, such as adding the contents of particular memory locations, then we need to instruct the datapath to perform a distinct sequence of operations for that purpose.
We need to describe the sequence of operations that we want the datapath to execute. Each such description of a desired processor operation is known as an instruction.
A collection of instructions is known as a program.
Instruction memory:
The desired program is stored as words in another memory called the instruction memory.
How to represent these instructions?
The control unit reads each instruction from the instruction memory
and then executes that instruction on the datapath.
To execute our simple program, the control unit would begin by
performing the following tasks, known as stages, to carry out the
first instruction.
Fetch
Decode
Execute
Instruction register (IR): a local register in which the control unit stores the
fetched instruction.
The control unit needs to keep track of the location in instruction memory
from which to fetch the next instruction.
Program counter (PC): a counter that holds the instruction memory address of the
current instruction.
Three Stages of Processing One Instruction
The control unit requires a controller that can repeatedly
perform the fetch, decode and execute steps. (Note that the
controller appears inside the control unit.)
The basic parts of a control unit include:
Program counter
Instruction register
Controller
To summarize, the control unit processes each instruction in three
stages:
1. Fetching the instruction by loading the current instruction into
IR and incrementing the PC for the next fetch.
2. Decoding the instruction to determine its operation.
3. Executing the operation by setting the appropriate control lines
for the datapath, if applicable. If the operation is a datapath
operation, it may be one of three possible types:
a) Loading a data memory location into a register file location.
b) Transforming data using an ALU operation on register file locations and writing
results back to a register file location.
c) Storing a register file location into a data memory location.
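The three stages can be sketched as a small Python loop. This is a minimal sketch under stated assumptions: instructions are symbolic tuples rather than binary words, the memory sizes are arbitrary, and the loop stops when the PC runs past the end of the program, which is a convenience of the sketch and not something the slides describe.

```python
# Symbolic instructions (an assumption for readability; a real
# instruction memory holds binary words).
I = [
    ("load", 0, 2),     # RF[0] = D[2]
    ("load", 1, 3),     # RF[1] = D[3]
    ("add", 2, 0, 1),   # RF[2] = RF[0] + RF[1]
    ("store", 2, 5),    # D[5] = RF[2]
]

def run(I, D, RF):
    """Repeat fetch, decode, execute until the program runs out."""
    pc = 0                   # program counter
    stages = []              # log of (address, operation) pairs
    while pc < len(I):       # stopping rule is an assumption of this sketch
        ir = I[pc]           # fetch: IR = I[PC]
        pc += 1              # ... and increment PC for the next fetch
        op, *operands = ir   # decode: determine the operation
        if op == "load":     # execute: drive the datapath
            ra, d = operands
            RF[ra] = D[d]
        elif op == "add":
            ra, rb, rc = operands
            RF[ra] = RF[rb] + RF[rc]
        elif op == "store":
            ra, d = operands
            D[d] = RF[ra]
        else:
            raise ValueError(f"unknown operation: {op}")
        stages.append((pc - 1, op))
    return stages

D = [0] * 8
RF = [0] * 4
D[2], D[3] = 4, 6
log = run(I, D, RF)
print(D[5])  # 10
```

Here `pc` and `ir` play the roles of the program counter and instruction register, and the body of the `while` loop plays the role of the controller.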
A THREE-INSTRUCTION PROGRAMMABLE
PROCESSOR
A First Instruction Set with Three
Instructions
Instruction Set:
The way we represent instructions in the instruction memory, together with the list of allowable instructions, is known as the programmable processor's instruction set.
Let's assume that the processor uses 16-bit instructions, and that the instruction memory I is 16 bits wide.
An instruction set typically reserves a certain number of bits in the instruction to denote what operation to perform.
The remaining bits specify any additional information needed to perform the operation, such as the source or destination registers.
We define a simple three-instruction instruction set, with the most significant
(meaning leftmost) 4 bits identifying the appropriate operation and
the least significant 12 bits containing register file and data memory
addresses.
1. Load instruction: 0000 r3 r2 r1 r0 d7 d6 d5 d4 d3 d2 d1 d0
2. Store instruction: 0001 r3 r2 r1 r0 d7 d6 d5 d4 d3 d2 d1 d0
3. Add instruction: 0010 r3 r2 r1 r0 ra3 ra2 ra1 ra0 rb3 rb2 rb1 rb0
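As a sketch of how these bit fields pack into a 16-bit word, the following Python helpers build and split instruction words with shifts and masks. The function and constant names are hypothetical, chosen only for illustration.

```python
# Opcodes from the three-instruction set (top 4 bits of each word).
OP_LOAD, OP_STORE, OP_ADD = 0b0000, 0b0001, 0b0010

def encode_load(r, d):
    """0000 r3..r0 d7..d0"""
    return (OP_LOAD << 12) | (r << 8) | d

def encode_store(r, d):
    """0001 r3..r0 d7..d0"""
    return (OP_STORE << 12) | (r << 8) | d

def encode_add(r, ra, rb):
    """0010 r3..r0 ra3..ra0 rb3..rb0"""
    return (OP_ADD << 12) | (r << 8) | (ra << 4) | rb

def decode(word):
    """Split a 16-bit word into its opcode and the three 4-bit fields."""
    return (word >> 12) & 0xF, (word >> 8) & 0xF, (word >> 4) & 0xF, word & 0xF

word = encode_add(2, 0, 1)
print(f"{word:016b}")  # 0010001000000001
print(decode(word))    # (2, 2, 0, 1)
```

For load and store words, the last two 4-bit fields returned by `decode` together form the 8-bit data memory address.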
Using this instruction set, we can write a program that computes
D[9]=D[0]+D[1].
Notice that the first four bits of each instruction
are a binary code that indicates the instruction's
operation. Those bits are known as the
instruction's operation code, or opcode.
For example, 0000 means a move from data memory to the register file.
The remaining bits of the instruction represent operands, which indicate what data to
operate on.
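One possible machine-code program for D[9]=D[0]+D[1] is sketched below, with a small interpreter to check it. The words follow the load, store and add formats defined earlier; the choice of registers 0, 1 and 2 and the `execute` helper are assumptions made for illustration.

```python
# One possible encoding of D[9] = D[0] + D[1] (register choice is arbitrary).
program = [
    0b0000_0000_00000000,    # Load:  RF[0] = D[0]
    0b0000_0001_00000001,    # Load:  RF[1] = D[1]
    0b0010_0010_0000_0001,   # Add:   RF[2] = RF[0] + RF[1]
    0b0001_0010_00001001,    # Store: D[9] = RF[2]
]

def execute(program, D):
    """Run each word once, in order, against data memory D."""
    RF = [0] * 16
    for word in program:
        opcode = word >> 12        # first four bits: the opcode
        r = (word >> 8) & 0xF      # next four bits: a register address
        low = word & 0xFF          # last eight bits: the other operands
        if opcode == 0b0000:       # load
            RF[r] = D[low]
        elif opcode == 0b0001:     # store
            D[low] = RF[r]
        elif opcode == 0b0010:     # add: two 4-bit source registers
            RF[r] = (RF[low >> 4] + RF[low & 0xF]) & 0xFFFF
        else:
            raise ValueError(f"unknown opcode {opcode:04b}")
    return D

D = [0] * 256
D[0], D[1] = 3, 4
execute(program, D)
print(D[9])  # 7
```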
We can write a different program using the same
three-instruction instruction set. For example we
could write a program that computes
D[9]=D[5]+D[6]+D[7]
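A sketch of one such program follows, written as strings of 0s and 1s and checked with a small interpreter. The register allocation (accumulating the sum in register 0) and the `run_listing` helper are assumptions; other instruction sequences would work equally well.

```python
# One possible program for D[9] = D[5] + D[6] + D[7].
listing = [
    "0000 0000 00000101",    # RF[0] = D[5]
    "0000 0001 00000110",    # RF[1] = D[6]
    "0000 0010 00000111",    # RF[2] = D[7]
    "0010 0000 0000 0001",   # RF[0] = RF[0] + RF[1]
    "0010 0000 0000 0010",   # RF[0] = RF[0] + RF[2]
    "0001 0000 00001001",    # D[9] = RF[0]
]

def run_listing(listing, D):
    """Convert each bit string to a word and execute it in order."""
    RF = [0] * 16
    for text in listing:
        word = int(text.replace(" ", ""), 2)
        opcode, r, low = word >> 12, (word >> 8) & 0xF, word & 0xFF
        if opcode == 0b0000:       # load
            RF[r] = D[low]
        elif opcode == 0b0001:     # store
            D[low] = RF[r]
        elif opcode == 0b0010:     # add
            RF[r] = (RF[low >> 4] + RF[low & 0xF]) & 0xFFFF
        else:
            raise ValueError(f"unknown opcode in {text!r}")
    return D

D = [0] * 256
D[5], D[6], D[7] = 10, 20, 30
run_listing(listing, D)
print(D[9])  # 60
```

Because the add instruction takes only two source registers, summing three values needs two add instructions.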
Machine Code Vs Assembly Code
The instructions of a program exist in instruction memory as 0s
and 1s.
A program represented as 0s and 1s is known as machine code.
An assembler allows us to write instructions using mnemonics,
or symbols, that the assembler automatically translates to machine
code.
Thus an assembler may allow us to write instructions from our
three-instruction instruction set using the following mnemonics:
1. Load instruction MOV Ra, d
2. Store instruction MOV d , Ra
3. Add instruction ADD Ra, Rb, Rc
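A toy assembler for these three mnemonics might look like the sketch below. The parsing rules are assumptions made for illustration: registers are written as `R` followed by a decimal number, memory addresses as plain decimal numbers, and a `MOV` whose first operand is a register is treated as a load.

```python
def assemble_line(line):
    """Translate one mnemonic line into a 16-bit machine word."""
    mnemonic, rest = line.split(None, 1)
    ops = [o.strip() for o in rest.split(",")]

    def reg(name):
        return int(name[1:])                 # "R2" -> 2

    if mnemonic == "MOV" and ops[0].startswith("R"):
        # Load: MOV Ra, d
        return (0b0000 << 12) | (reg(ops[0]) << 8) | int(ops[1])
    if mnemonic == "MOV":
        # Store: MOV d, Ra
        return (0b0001 << 12) | (reg(ops[1]) << 8) | int(ops[0])
    if mnemonic == "ADD":
        # Add: ADD Ra, Rb, Rc
        ra, rb, rc = (reg(o) for o in ops)
        return (0b0010 << 12) | (ra << 8) | (rb << 4) | rc
    raise ValueError(f"unknown mnemonic: {mnemonic}")

source = ["MOV R0, 0", "MOV R1, 1", "ADD R2, R0, R1", "MOV 9, R2"]
for line in source:
    print(f"{line:16s} {assemble_line(line):016b}")
```

The four source lines shown assemble to a program that computes D[9]=D[0]+D[1].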
Control Unit and Datapath for the
Three-Instruction Processor
A SIX-INSTRUCTION PROGRAMMABLE
PROCESSOR
Extending the Instruction Set
Having only a three-instruction instruction set limits the
behavior of the programs that we can write. All we can do
with those instructions is add numbers. A real programmable
processor will support many more instructions, perhaps 100
or more.
Let's extend our programmable processor's instruction set
with a few more instructions.
1. Load-constant instruction: 0011 r3 r2 r1 r0 c7 c6 c5 c4 c3 c2 c1 c0
Specifies that the binary number represented by the bits c7 c6 c5 c4 c3 c2 c1 c0 should be loaded into the register specified by r3 r2 r1 r0. The mnemonic for this instruction is MOV Ra, #c.
2. Subtract instruction: 0100 r3 r2 r1 r0 ra3 ra2 ra1 ra0 rb3 rb2 rb1 rb0
Specifies that the contents of the register specified by rb3 rb2 rb1 rb0 should be subtracted from the contents of the register specified by ra3 ra2 ra1 ra0, with the result written to the register specified by r3 r2 r1 r0. The mnemonic for this instruction is SUB Ra, Rb, Rc.
3. Jump-if-zero instruction: 0101 r3 r2 r1 r0 o7 o6 o5 o4 o3 o2 o1 o0
Specifies that if the contents of the register specified by r3 r2 r1 r0 are 0, the PC should be loaded with the current value of the PC plus o7 o6 o5 o4 o3 o2 o1 o0, which is an 8-bit number in two's complement form. The mnemonic for this instruction is JMPZ Ra, offset, which specifies the operation PC=PC+offset if R[a] is 0.
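The extended instruction set can be exercised with a small simulator, sketched below under several assumptions: the jump-if-zero opcode is taken to be 0101 so that it does not collide with the add opcode 0010; the jump offset is applied relative to the address of the jump instruction itself; load-constant values are treated as small non-negative numbers; and the simulation stops when the PC runs past the end of the program. The helper names are hypothetical.

```python
def sign_extend8(x):
    """Interpret an 8-bit field as a two's complement number."""
    return x - 256 if x & 0x80 else x

# Hypothetical encoders for the instructions used in the example.
def ldc(r, c):      return (0b0011 << 12) | (r << 8) | (c & 0xFF)
def add(r, ra, rb): return (0b0010 << 12) | (r << 8) | (ra << 4) | rb
def sub(r, ra, rb): return (0b0100 << 12) | (r << 8) | (ra << 4) | rb
def jmpz(r, off):   return (0b0101 << 12) | (r << 8) | (off & 0xFF)
def st(r, d):       return (0b0001 << 12) | (r << 8) | d

def run(program, D, max_steps=1000):
    RF, pc = [0] * 16, 0
    for _ in range(max_steps):
        if pc >= len(program):        # stopping rule is an assumption
            break
        ir, addr = program[pc], pc    # fetch
        pc += 1
        op, r, low = ir >> 12, (ir >> 8) & 0xF, ir & 0xFF   # decode
        ra, rb = low >> 4, low & 0xF
        if op == 0b0000:   RF[r] = D[low]                    # load
        elif op == 0b0001: D[low] = RF[r]                    # store
        elif op == 0b0010: RF[r] = (RF[ra] + RF[rb]) & 0xFFFF
        elif op == 0b0011: RF[r] = low                       # load constant
        elif op == 0b0100: RF[r] = (RF[ra] - RF[rb]) & 0xFFFF
        elif op == 0b0101:                                   # jump if zero
            if RF[r] == 0:
                pc = addr + sign_extend8(low)
        else:
            raise ValueError(f"unknown opcode {op:04b}")
    return D

# Compute 3 * 5 by repeated addition and store the result in D[9].
program = [
    ldc(0, 3),       # 0: R0 = 3 (loop counter)
    ldc(1, 1),       # 1: R1 = 1
    ldc(2, 0),       # 2: R2 = 0 (always zero, gives an unconditional jump)
    ldc(3, 0),       # 3: R3 = 0 (running sum)
    ldc(4, 5),       # 4: R4 = 5
    jmpz(0, 4),      # 5: if R0 == 0, jump forward to address 9
    add(3, 3, 4),    # 6: R3 = R3 + R4
    sub(0, 0, 1),    # 7: R0 = R0 - R1
    jmpz(2, -3),     # 8: R2 is 0, so always jump back to address 5
    st(3, 9),        # 9: D[9] = R3
]
D = run(program, [0] * 256)
print(D[9])  # 15
```

The backward jump at address 8 shows why the offset is a two's complement number: the field 11111101 represents -3.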
Extending the Control Unit and
Datapath
Control Unit and Datapath for the Six-
Instruction Processor
Performance Extensions
One difference between our basic processor architecture and real processors is that many real processors are pipelined.
By inserting appropriate pipeline registers throughout the design and modifying the controller appropriately, we could pipeline the fetch, decode and execute stages, so that different instructions occupy different stages at the same time. This increases instruction throughput, although the latency of each individual instruction is not reduced.
Another extension involves having multiple ALUs in the datapath. The control unit may then perform multiple ALU operations in the datapath simultaneously.
The processor designs discussed are extremely simplistic and used for illustration purposes only. Yet, seeing even simplistic designs gives one an understanding of how a programmable processor works.
Pipelining:
One method of obtaining speed from digital circuits is through the use of pipelining.
Pipelining means breaking a large task into a sequence of stages such that the data moves through the stages the way parts move through a factory assembly line. All stages operate concurrently, resulting in better performance.
Consider a system with data inputs W,X,Y and Z that should repeatedly output the sum S=W+X+Y+Z.
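The effect of pipelining this sum can be sketched in Python. The sketch assumes a two-level adder tree (W+X and Y+Z in the first stage, their sum in the second), an adder delay of 2 ns, and negligible register delay; none of these figures come from the slides.

```python
ADDER_DELAY_NS = 2   # assumed delay of one adder

def pipelined_sums(samples):
    """Cycle-by-cycle model of a two-stage pipelined adder tree."""
    reg = None       # pipeline registers between stage 1 and stage 2
    out = []
    for sample in list(samples) + [None]:   # one extra cycle drains the pipe
        # Stage 2 adds the partial sums registered in the previous cycle.
        if reg is not None:
            out.append(reg[0] + reg[1])
        # Stage 1 works on the new inputs during the same cycle.
        if sample is None:
            reg = None
        else:
            w, x, y, z = sample
            reg = (w + x, y + z)
    return out

def total_time_ns(n, pipelined):
    """Time to produce n sums under the assumed adder delay."""
    if pipelined:
        # Clock period is one adder delay; one extra cycle fills the pipe.
        return (n + 1) * ADDER_DELAY_NS
    # Without pipeline registers the clock must cover two adders in series.
    return n * 2 * ADDER_DELAY_NS

print(pipelined_sums([(1, 2, 3, 4), (5, 6, 7, 8)]))  # [10, 26]
print(total_time_ns(100, pipelined=False))           # 400
print(total_time_ns(100, pipelined=True))            # 202
```

Under these assumptions each individual sum still takes two adder delays to emerge, but a new sum is completed every clock cycle, so a long stream of inputs finishes in roughly half the time.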
Summary
Programmable processors are widely used for implementing a system's desired functionality, due in part to their easy availability and short design time (namely, writing software).
The basic architecture of a programmable processor consists of a general-purpose datapath having a register file and an ALU; a control unit having a controller, PC and IR; and memories for storing the program and data.
The control unit fetches the next instruction from program memory, decodes the instruction, and then executes the instruction by configuring the datapath to carry out the instruction's specified operation.
We designed a simple three-instruction programmable processor, and saw how a program is represented as 0s and 1s (machine code) in the processor's program memory.
We went further to design a six-instruction programmable processor, and discussed how further extensions could be made to add more instructions and achieve a more reasonable processor architecture.
Modern commercial processors are based on the same principles:
Instructions are stored as machine code in memory.
Control units fetch, decode and execute the instructions.
Datapaths support the operation of the instructions using register files and ALUs.
Modern processors just do a much better job, using pipelining and many other techniques to obtain high clock frequencies and fast program execution.
Best way to learn DSD is to do it!