[4]Single cycle datapath.pdf
-
Upload
shivam-khandelwal -
Category
Documents
-
view
20 -
download
0
Transcript of [4]Single cycle datapath.pdf
We're ready to look at an implementation of the MIPS instruction set
Simplified to contain only arithmetic-logic instructions: add, sub, and, or, slt
memory-reference instructions: lw, sw
control-flow instructions: beq, j
Implementing MIPS
op rs rt offset
6 bits 5 bits 5 bits 16 bits
op rs rt rd functshamt
6 bits 5 bits 5 bits 5 bits 5 bits 6 bits
R-Format
I-Format
op address
6 bits 26 bits
J-Format
High-level abstract view of fetch/execute implementation use the program counter (PC) to read instruction address
fetch the instruction from memory and increment PC
use fields of the instruction to select registers to read
execute depending on the instruction
repeat…
Implementing MIPS: the Fetch/Execute Cycle
Registers
Register#
Data
Register#
Data
memory
Address
Data
Register#
PC Instruction ALU
Instruction
memory
Address
Overview: Processor Implementation Styles
Single Cycle
perform each instruction in 1 clock cycle
clock cycle must be long enough for slowest instruction; therefore,
disadvantage: only as fast as slowest instruction
Multi-Cycle
break fetch/execute cycle into multiple steps
perform 1 step in each clock cycle
advantage: each instruction uses only as many cycles as it needs
Pipelined
execute each instruction in multiple steps
perform 1 step / instruction in each clock cycle
process multiple instructions in parallel – assembly line
Two types of functional elements in the hardware:
elements that operate on data (called combinational elements)
elements that contain data (called state or sequential elements)
Functional Elements
Combinational Elements
Works as an input output function, e.g., ALU
Combinational logic reads input data from one register and writes output data to another, or same, register
read/write happens in a single cycle – combinational element cannot store data from one cycle to a future one
Combinational logic hardware units
Clock cycle
Stateelement
1Combinational logic
Stateelement
2
State
elementCombinational logic
State Elements
State elements contain data in internal storage, e.g., registersand memory
All state elements together define the state of the machine
Flipflops and latches are 1-bit state elements, equivalently, they are 1-bit memories
The output(s) of a flipflop or latch always depends on the bit value stored, i.e., its state, and can be called 1/0 or high/low or true/false
The input to a flipflop or latch can change its state depending on whether it is clocked or not…
Registers are implemented with arrays of D-flipflops
State Elements on the Datapath: Register File
Register file with two read ports andone write port
Clock
5 bits
5 bits
5 bits
32 bits
32 bits
32 bits
Control signal
Read registernumber 1 Read
data 1
Readdata 2
Read registernumber 2
Register fileWriteregister
Writedata W rite
Port implementation:
Read ports are implemented with a pair of multiplexors – 5 bit multiplexors for 32 registers
Write port is implemented usinga decoder – 5-to-32 decoder for32 registers. Clock is relevant to write as register state may change only at clock edge
Clock
Clock
State Elements on the Datapath: Register File
Mu
x
Register 0
Register 1
Register n – 1
Register n
M
ux
Read data 1
Read data 2
Read register
number 1
Read register
number 2
n-to-1decoder
Register 0
Register 1
Register n – 1
C
C
D
D
Register n
C
C
D
D
Register number
Write
Register data
0
1
n – 1
n
Verilog
All components that we have discussed – and shall discuss –can be fabricated using Verilog
Single-cycle Implementation of MIPS
Our first implementation of MIPS will use a single long clock cycle for every instruction
Every instruction begins on one up (or, down) clock edge and ends on the next up (or, down) clock edge
This approach is not practical as it is much slower than a multicycle implementation where different instruction classes can take different numbers of cycles in a single-cycle implementation every instruction must take
the same amount of time as the slowest instruction
in a multicycle implementation this problem is avoided by allowing quicker instructions to use fewer cycles
Even though the single-cycle approach is not practical it is simple and useful to understand first
Note : we shall implement jump at the very end
Datapath: Instruction Store/Fetch & PC Increment
Three elements used to store and fetch instructions andincrement the PC
Datapath
PC
Instructionmemory
Instructionaddress
Instruction
a. Instruction memory b. Program counter
Add Sum
c. Adder
PC
Instructionmemory
Readaddress
Instruction
4
Add
Animating the Datapath
Instruction <- MEM[PC]
PC <- PC + 4
RDMemory
ADDR
PC
Instruction
4
ADD
Datapath: R-Type Instruction
Two elements used to implementR-type instructions
Datapath
ALU control
RegWrite
RegistersWriteregister
Readdata 1
Readdata 2
Readregister 1
Readregister 2
Writedata
ALUresult
ALU
Data
Data
Register
numbers
a. Registers b. ALU
Zero5
5
5 3
InstructionRegisters
Writeregister
Readdata 1
Readdata 2
Readregister 1
Readregister 2
Writedata
ALUresult
ALU
Zero
RegWrite
ALU operation3
Animating the Datapath
add rd, rs, rt
R[rd] <- R[rs] + R[rt];
5 5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
op rs rt rd functshamt
Operation
ALU Zero
Instruction
3
Datapath: Load/Store Instruction
Two additional elements usedTo implement load/stores
Datapath
16 32Sign
extend
b. Sign-extension unit
MemRead
MemWrite
Datamemory
Writedata
Readdata
a. Data memory unit
Address
Instruction
16 32
RegistersWriteregister
Readdata 1
Readdata 2
Readregister 1
Readregister 2
Datamemory
Writedata
Readdata
Writedata
Signextend
ALUresult
Zero
ALU
Address
MemRead
MemWrite
RegWrite
ALU operation3
Animating the Datapath
lw rt, offset(rs)
R[rt] <- MEM[R[rs] + s_extend(offset)];
Animating the Datapath
sw rt, offset(rs)
MEM[R[rs] + sign_extend(offset)] <- R[rt]
Datapath: Branch Instruction
Datapath
No shift hardware required:simply connect wires from input to output, each shiftedleft 2 bits
16 32Sign
extend
ZeroALU
Sum
Shiftleft 2
To branchcontrol logic
Branch target
PC + 4 from instruction datapath
Instruction
Add
RegistersWriteregister
Readdata 1
Readdata 2
Readregister 1
Readregister 2
Writedata
RegWrite
ALU operation3
Animating the Datapath
beq rs, rt, offset
op rs rt offset/immediate
5 5
16
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
EXTND
16 32
Zero
ADD
<<2
PC +4 from instruction datapath
if (R[rs] == R[rt]) then
PC <- PC+4 + s_extend(offset<<2)
MIPS Datapath I: Single-CycleInput is either register (R-type) or sign-extendedlower half of instruction (load/store)
Combining the datapaths for R-type instructions and load/stores using two multiplexors
Data is either from ALU (R-type)or memory (load)
Animating the Datapath: R-type Instruction
add rd,rs,rt5 516
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
EXTND
16 32
Zero
RD
WD
MemRead
DataMemory
ADDR
MemWrite
5
Instruction
32
MU
X
MUXALUSrc
MemtoReg
Animating the Datapath: Load Instruction
lw rt,offset(rs)5 516
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
EXTND
16 32
Zero
RD
WD
MemRead
DataMemory
ADDR
MemWrite
5
Instruction
32
MU
X
MUXALUSrc
MemtoReg
Animating the Datapath: Store Instruction
sw rt,offset(rs)5 516
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
EXTND
16 32
Zero
RD
WD
MemRead
DataMemory
ADDR
MemWrite
5
Instruction
32
MU
X
MUXALUSrc
MemtoReg
MIPS Datapath II: Single-Cycle
Adding instruction fetch
Separate instruction memoryas instruction and data readoccur in the same clock cycle
Separate adder as ALU operations and PC increment occur in the same clock cycle
PC
Instructionmemory
Readaddress
Instruction
16 32
Registers
Writeregister
Writedata
Readdata 1
Readdata 2
Readregister 1
Readregister 2
Signextend
ALUresult
Zero
Datamemory
Address
Writedata
Readdata
Mux
4
Add
Mux
ALU
RegWrite
ALU operation3
MemRead
MemWrite
ALUSrcMemtoReg
MIPS Datapath III: Single-Cycle
PC
Instructionmemory
Readaddress
Instruction
16 32
Add ALUresult
Mux
Registers
Writeregister
Writedata
Readdata 1
Readdata 2
Readregister 1
Readregister 2
Shift
left 2
4
Mux
ALU operation3
RegWrite
MemRead
MemWrite
PCSrc
ALUSrc
MemtoReg
ALUresult
ZeroALU
Datamemory
Address
Writedata
Readdata M
ux
Sign
extend
Add
Adding branch capability and another multiplexor
Instruction address is eitherPC+4 or branch target address
Extra adder needed as bothadders operate in each cycle
New multiplexor
Important note: in a single-cycle implementation data cannot be stored during an instruction – it only moves through combinational logic
Datapath Executing add
add rd, rs, rt
5 516
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
EXTND
16 32
Zero
RD
WD
MemRead
DataMemory
ADDR
MemWrite
5
Instruction
32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
InstructionMemory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc
Datapath Executing lw
lw rt,offset(rs)
5 516
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
EXTND
16 32
Zero
RD
WD
MemRead
DataMemory
ADDR
MemWrite
5
Instruction
32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
InstructionMemory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc
Datapath Executing sw
sw rt,offset(rs)
5 516
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
EXTND
16 32
Zero
RD
WD
MemRead
DataMemory
ADDR
MemWrite
5
Instruction
32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
InstructionMemory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc
Datapath Executing beq
beq r1,r2,offset
5 516
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
EXTND
16 32
Zero
RD
WD
MemRead
DataMemory
ADDR
MemWrite
5
Instruction
32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
InstructionMemory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc