EECC550 - ShaabanEECC550 - Shaaban#1 Lec # 4 Winter 2012 12-11-2012
CPU Organization (Design)
ISA requirements determine the components and connections.
• Datapath Design (components & their connections needed by ISA instructions):
  – Capabilities & performance characteristics of principal Functional Units (FUs) needed by ISA instructions (e.g., registers, ALU, shifters, logic units, ...).
  – Ways in which these components are interconnected (bus connections, multiplexors, etc.).
  – How information flows between components.
• Control Unit Design (control/sequencing of operations of datapath components to realize ISA instructions):
  – Logic and means by which such information flow is controlled.
  – Control and coordination of FU operation to realize the targeted Instruction Set Architecture (can be implemented using either a finite state machine or a microprogram).
• Hardware description with a suitable language, possibly using Register Transfer Notation (RTN).
4th Edition Chapter 4.1-4.4 - 3rd Edition Chapter 5.1-5.4
Major CPU Design Steps (ISA Requirements → CPU Design)
1 Analyze instruction set to get datapath requirements:
  – Using independent RTN, write the micro-operations required for target ISA instructions.
    • This provides the required datapath components and how they are connected.
2 Select set of datapath components and establish clocking methodology (defines when storage or state elements, e.g. flip-flops, can be read and when they can be written, e.g. clock edge-triggered).
3 Assemble datapath meeting the requirements.
4 Identify and define the function of all control lines, points or signals needed by the datapath.
  – Analyze implementation of each instruction to determine the setting of control points that affect its operations.
5 Control unit design, based on micro-operation timing and control signals identified:
  – Combinational logic: For single cycle CPU (e.g. any instruction completed in one cycle, i.e. CPI = 1).
  – Hard-Wired: Finite-state machine implementation.
  – Microprogrammed.
CPU Design & Implementation Process (ISA Requirements → CPU Design)
• Top-down Design:
  – Specify component behavior from high-level requirements (ISA).
• Bottom-up Design:
  – Assemble components in target technology to establish critical timing (hardware delays, critical path timing).
• Iterative refinement:
  – Establish a partial solution, expand and improve.
[Figure: design hierarchy. The Instruction Set Architecture (ISA) provides requirements for the Processor; the Processor consists of Datapath and Control; these are built from Reg. File, Mux, ALU, Reg, Mem, Decoder and Sequencer; these in turn are built from Cells and Gates in the target VLSI implementation technology.]
Datapath Design Steps
• Write the micro-operation sequences required for a number of representative target ISA instructions using independent RTN.
• Independent RTN statements specify the required datapath components and how they are connected.
• From the above, create an initial datapath by determining possible destinations for each data source (i.e. registers, ALU).
  – This establishes connectivity requirements (data paths, or connections) for datapath components.
  – Whenever multiple sources are connected to a single input (or destination), a multiplexor of appropriate size is added.
• Find the worst-case propagation delay (critical path) in the datapath to determine the datapath clock cycle (CPU clock cycle, C).
• Complete the micro-operation sequences for all remaining instructions, adding datapath components + connections/multiplexors as needed.
MIPS Instruction Formats
R-Type:
  op       rs       rt       rd       shamt    funct
  [31:26]  [25:21]  [20:16]  [15:11]  [10:6]   [5:0]
  6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
I-Type (ALU, Load/Store, Branch):
  op       rs       rt       immediate (imm16), or address offset
  [31:26]  [25:21]  [20:16]  [15:0]
  6 bits   5 bits   5 bits   16 bits
J-Type (Jumps):
  op       target address
  [31:26]  [25:0]
  6 bits   26 bits
• op: Opcode, operation of the instruction.
• rs, rt, rd: The source and destination register specifiers.
• shamt: Shift amount.
• funct: Selects the variant of the operation in the “op” field.
• address / immediate: Address offset or immediate value.
• target address: Target address of the jump instruction.
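The bit positions listed for the three formats can be checked with a small field extractor. This is only an illustrative sketch: the function name decode_fields and the dictionary layout are assumptions made for this example, not part of the lecture.

```python
def decode_fields(word):
    """Split a 32-bit MIPS instruction word into the fields of all three formats.

    Which fields are meaningful depends on the format selected by op.
    """
    return {
        "op": (word >> 26) & 0x3F,      # [31:26], 6 bits
        "rs": (word >> 21) & 0x1F,      # [25:21], 5 bits
        "rt": (word >> 16) & 0x1F,      # [20:16], 5 bits
        "rd": (word >> 11) & 0x1F,      # [15:11], 5 bits (R-Type)
        "shamt": (word >> 6) & 0x1F,    # [10:6],  5 bits (R-Type)
        "funct": word & 0x3F,           # [5:0],   6 bits (R-Type)
        "imm16": word & 0xFFFF,         # [15:0],  16 bits (I-Type)
        "target": word & 0x3FFFFFF,     # [25:0],  26 bits (J-Type)
    }


# add $1,$2,$3 assembled by hand: op = 0, rs = 2, rt = 3, rd = 1, funct = 32
ADD_WORD = (0 << 26) | (2 << 21) | (3 << 16) | (1 << 11) | (0 << 6) | 32
fields = decode_fields(ADD_WORD)
```

All fields are extracted unconditionally because, in hardware, the same instruction bits are wired to every consumer and the control unit decides which ones are used.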
MIPS R-Type (ALU) Instruction Fields
R-Type (Register Type): all ALU instructions that use three registers. Register addressing used (Mode 1).
  OP       rs            rt            rd             shamt    funct
  [31:26]  [25:21]       [20:16]       [15:11]        [10:6]   [5:0]
  6 bits   5 bits        5 bits        5 bits         5 bits   6 bits
           1st operand   2nd operand   Destination             Function field
• op: Opcode, basic operation of the instruction.
  – For R-Type op = 0.
• rs: The first register source operand.
• rt: The second register source operand.
• rd: The register destination operand.
• shamt: Shift amount used in constant shift operations.
• funct: Function, selects the specific variant of operation in the op field.
  – Funct field value examples: add = 32, sub = 34, and = 36, or = 37, nor = 39.
rs, rt, rd are register specifier fields.
Examples (destination register in rd, operand registers in rs and rt):
  add $1,$2,$3
  sub $1,$2,$3
  and $1,$2,$3
  or $1,$2,$3
Independent RTN:
  Instruction Word ← Mem[PC]
  R[rd] ← R[rs] funct R[rt]
  PC ← PC + 4
MIPS ALU I-Type Instruction Fields
I-Type (Immediate Type): ALU instructions that use two registers and an immediate value; also loads/stores and conditional branches. Immediate addressing used (Mode 2).
  OP       rs            rt            immediate (imm16)
  [31:26]  [25:21]       [20:16]       [15:0]
  6 bits   5 bits        5 bits        16 bits
           1st operand   Destination   2nd operand
• op: Opcode, operation of the instruction.
• rs: The register source operand.
• rt: The result destination register.
• immediate (imm16): 16-bit immediate field, the constant second operand for the ALU instruction.
Examples (result register in rt, source operand register in rs, constant operand in immediate):
  add immediate: addi $1,$2,100 (OP = 8)
  and immediate: andi $1,$2,10 (OP = 12)
Independent RTN for addi:
  Instruction Word ← Mem[PC]
  R[rt] ← R[rs] + imm16
  PC ← PC + 4
MIPS Load/Store I-Type Instruction Fields
Base or displacement addressing used (Mode 3).
  OP       rs       rt           address (imm16, e.g. offset)
  [31:26]  [25:21]  [20:16]      [15:0]
  6 bits   5 bits   5 bits       16 bits
           Base     Src./Dest.   Signed address offset in bytes
• op: Opcode, operation of the instruction.
  – For load word op = 35, for store word op = 43.
• rs: The register containing memory base address.
• rt: For loads, the destination register. For stores, the source register of value to be stored.
• address (imm16): 16-bit immediate field holding the memory address offset in bytes; it is sign extended and added to the base register.
Examples:
  Store word: sw $3, 500($4) (source register in rt = $3, offset = 500, base register in rs = $4)
  Load word: lw $1, 32($2) (destination register in rt = $1, offset = 32, base register in rs = $2)
Independent RTN for lw:
  Instruction Word ← Mem[PC]
  R[rt] ← Mem[R[rs] + imm16]
  PC ← PC + 4
Independent RTN for sw:
  Instruction Word ← Mem[PC]
  Mem[R[rs] + imm16] ← R[rt]
  PC ← PC + 4
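The load and store micro-operations can be traced with a few lines of Python. This is a sketch under stated assumptions: the register file is modeled as a plain list, data memory as a dictionary keyed by byte address, and the helper names (sign_extend16, load_word, store_word) are made up for this example.

```python
def sign_extend16(imm16):
    """Interpret a 16-bit field as a signed two's complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def load_word(regs, mem, rs, rt, imm16):
    """R[rt] <- Mem[R[rs] + SignExt(imm16)]"""
    address = (regs[rs] + sign_extend16(imm16)) & 0xFFFFFFFF
    regs[rt] = mem.get(address, 0)  # unwritten locations read as 0 (assumption)


def store_word(regs, mem, rs, rt, imm16):
    """Mem[R[rs] + SignExt(imm16)] <- R[rt]"""
    address = (regs[rs] + sign_extend16(imm16)) & 0xFFFFFFFF
    mem[address] = regs[rt]


regs = [0] * 32
mem = {}
regs[4], regs[3] = 2000, 77                  # hypothetical register contents
store_word(regs, mem, rs=4, rt=3, imm16=500)  # sw $3, 500($4) writes address 2500
regs[2] = 2468
load_word(regs, mem, rs=2, rt=1, imm16=32)    # lw $1, 32($2) reads address 2500
```

Both instructions compute the same kind of effective address, which is why the datapath can share one ALU addition between them.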
MIPS Branch I-Type Instruction Fields
PC-relative addressing used (Mode 4).
  OP       rs       rt       address (imm16, e.g. offset)
  [31:26]  [25:21]  [20:16]  [15:0]
  6 bits   5 bits   5 bits   16 bits
                             Signed address offset in words
• op: Opcode, operation of the instruction.
• rs: The first register being compared.
• rt: The second register being compared.
• address (imm16): 16-bit immediate field holding the branch target offset in words (word = 4 bytes). It is sign extended, multiplied by 4, and added to PC + 4 to form the branch target address.
Examples (register in rs, register in rt, offset in bytes equal to instruction address field x 4):
  Branch on equal: beq $1,$2,100 (OP = 4)
  Branch on not equal: bne $1,$2,100 (OP = 5)
Independent RTN for beq:
  Instruction Word ← Mem[PC]
  R[rs] = R[rt] : PC ← PC + 4 + imm16 x 4
  R[rs] ≠ R[rt] : PC ← PC + 4
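The beq next-PC rule can be written out as a short function. This is a minimal sketch; the function names and the 32-bit wrap-around masking are assumptions made for this example.

```python
def sign_extend16(imm16):
    """Interpret a 16-bit field as a signed two's complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc_beq(pc, rs_value, rt_value, imm16):
    """Return the next PC for beq, following the independent RTN."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    if rs_value == rt_value:
        # Taken: word offset is sign extended and multiplied by 4 (bytes)
        return (pc_plus_4 + sign_extend16(imm16) * 4) & 0xFFFFFFFF
    return pc_plus_4  # Not taken: fall through to the next instruction


taken = next_pc_beq(0x1000, 5, 5, 100)      # 0x1004 + 400 bytes
not_taken = next_pc_beq(0x1000, 5, 6, 100)  # 0x1004
```

A negative offset works the same way: an imm16 of 0xFFFF (that is, -1 word) branches back to the beq instruction itself.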
MIPS J-Type Instruction Fields
J-Type (Jump Type): includes jump j and jump and link jal. Pseudodirect addressing used (Mode 5).
  OP       jump target (in words)
  [31:26]  [25:0]
  6 bits   26 bits
• op: Opcode, operation of the instruction.
  – Jump j op = 2
  – Jump and link jal op = 3
• jump target: jump memory address in words (word = 4 bytes).
Examples (jump memory address in bytes equal to instruction field jump target x 4):
  Jump: j 10000 (jump target = 2500)
  Jump and link: jal 10000
Effective 32-bit jump address: PC(31-28),jump_target,00
  PC(31-28): 4 bits, from PC + 4
  jump_target: 26 bits
  00: 2 bits
Independent RTN for j:
  Instruction Word ← Mem[PC]
  PC ← PC + 4
  PC ← PC(31-28),jump_target,00
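The pseudodirect address formation can be sketched as a bit concatenation. The function name jump_address and the example PC values are assumptions made for this illustration.

```python
def jump_address(pc, jump_target):
    """Form the 32-bit jump address PC(31-28), jump_target, 00.

    The upper 4 bits come from PC + 4; the 26-bit word address is
    shifted left by 2 to become a byte address.
    """
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    return (pc_plus_4 & 0xF0000000) | ((jump_target & 0x3FFFFFF) << 2)


# j 10000 is encoded with jump target = 10000 / 4 = 2500
low_region = jump_address(0x00400000, 2500)
high_region = jump_address(0x10000000, 2500)
```

Because only 28 address bits come from the instruction, a jump can only reach targets inside the 256 MB region selected by the upper 4 bits of PC + 4.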
A Subset of MIPS Instructions
ADD and SUB (R-Type, op = 0, funct: 32 = add, 34 = sub):
  add rd, rs, rt
  sub rd, rs, rt
  op       rs       rt       rd       shamt    funct
  [31:26]  [25:21]  [20:16]  [15:11]  [10:6]   [5:0]
  6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
OR Immediate (I-Type, op = 13):
  ori rt, rs, imm16
  op       rs       rt       immediate (imm16)
  [31:26]  [25:21]  [20:16]  [15:0]
  6 bits   5 bits   5 bits   16 bits
LOAD and STORE Word (I-Type, op: 35 = lw, 43 = sw; offset in bytes):
  lw rt, rs, imm16
  sw rt, rs, imm16
  op       rs       rt       immediate (imm16)
  [31:26]  [25:21]  [20:16]  [15:0]
  6 bits   5 bits   5 bits   16 bits
BRANCH (I-Type, op = 4; offset in words):
  beq rs, rt, imm16
  op       rs       rt       immediate (imm16)
  [31:26]  [25:21]  [20:16]  [15:0]
  6 bits   5 bits   5 bits   16 bits
Basic MIPS Instruction Processing Steps
• Instruction Fetch: Obtain instruction from program storage (instruction/program memory).
    Instruction ← Mem[PC]
• Instruction Decode: Determine instruction type (done by control unit, based on opcode) and obtain operands from registers.
• Execute: Compute result value or status.
• Result Store: Store result in register/memory if needed (usually called Write Back).
• Next Instruction: Update program counter to address of next instruction.
    PC ← PC + 4
Common steps for all instructions: Instruction ← Mem[PC] and PC ← PC + 4.
T = I x CPI x C
Overview of MIPS Instruction Micro-operations
• All instructions go through these common steps:
  – Send program counter to instruction memory and fetch the instruction (fetch): Instruction ← Mem[PC]
  – Update the program counter to point to next instruction: PC ← PC + 4
  – Read one or two registers, using instruction fields (decode).
    • Load reads one register only.
• Additional instruction execution actions (execution) depend on the instruction in question, but similarities exist:
  – All instruction classes (except J type) use the ALU after reading the registers:
    • Memory reference instructions use it for effective address calculation.
    • Arithmetic and logic instructions (R-Type) use it for the specified operation.
    • Branches use it for comparison.
• Additional execution steps where instruction classes differ:
  – Memory reference instructions: Access memory for a load or store.
  – Arithmetic and logic instructions: Write ALU result back in register.
  – Branch instructions: Possibly change next instruction address (update PC) based on comparison (i.e. if branch is taken).
A Single Cycle MIPS CPU Design
Design target: A single-cycle per instruction MIPS CPU design.
All micro-operations of an instruction are to be carried out in a single CPU clock cycle: Cycles Per Instruction = CPI = 1.
CPU Performance Equation: T = I x CPI x C
[Figure: abstract view of single cycle MIPS CPU showing major functional units (PC, instruction memory, registers, ALU, data memory, and two adders, one of which adds 4 to the PC) and the major 32-bit connections between them. 4th Edition Figure 4.1 page 302 - 3rd Edition Figure 5.1 page 287]
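As a quick worked instance of the performance equation T = I x CPI x C, the sketch below plugs in hypothetical numbers. The instruction count and the 2 ns cycle time are invented for illustration only; they are not measurements from the lecture.

```python
def cpu_time(instruction_count, cpi, cycle_time):
    """T = I x CPI x C, in the same time unit as cycle_time."""
    return instruction_count * cpi * cycle_time


# Hypothetical program: 1,000,000 instructions on a single cycle CPU
# (CPI = 1) with an assumed clock cycle of 2 ns.
t_ns = cpu_time(instruction_count=1_000_000, cpi=1, cycle_time=2)
```

With CPI fixed at 1, the only hardware lever left on T is the clock cycle C, which is why the critical path of the single cycle datapath matters so much.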
R-Type Example: Micro-Operation Sequence For ADD
add rd, rs, rt
  OP       rs       rt       rd       shamt    funct
  [31:26]  [25:21]  [20:16]  [15:11]  [10:6]   [5:0]
  6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
  OP = 0; funct: 32 = add, 34 = sub
Independent RTN:
  Instruction Word ← Mem[PC]    Fetch the instruction from program memory (common step)
  PC ← PC + 4                   Increment PC (common step)
  R[rd] ← R[rs] + R[rt]         Add register rs to register rt, result in register rd (i.e. funct = add = 32)
Initial Datapath Components
Three components needed by:
  Instruction Fetch: Instruction ← Mem[PC]
  Program Counter Update: PC ← PC + 4
Two state elements (memory) needed to store and access instructions:
1 Instruction memory:
  • Only read access (by user code). No read control signal needed.
2 Program counter (PC): 32-bit register.
  • Written at end of every clock cycle (edge-triggered): No write control signal.
Plus one combinational component:
3 32-bit Adder: To compute the next instruction address (PC + 4).
[Figure: a. Instruction memory (32-bit instruction address in, 32-bit instruction word out), b. Program counter (32-bit), c. Adder (two 32-bit inputs, 32-bit sum). 4th Edition Figure 4.5, page 308 - 3rd Edition Figure 5.5, page 293]
+ Basics of logic design/logic building blocks review in Appendix C in Book CD (3rd Edition Appendix B)
Building The Datapath
Instruction Fetch & PC Update: portion of the datapath used for fetching instructions and incrementing the program counter (PC).
  1 Instruction ← Mem[PC]
  2 PC ← PC + 4
PC write or update is edge triggered at the end of the cycle. Clock input to PC, memory not shown.
[Figure: the PC drives the read address of instruction memory (1) and an adder that adds 4 to produce the next PC (2); all paths are 32 bits wide. 4th Edition Figure 4.6 page 309 - 3rd Edition Figure 5.6 page 293]
More Datapath Components
ISA Register File:
• Contains all ISA registers.
• Two read ports and one write port.
• Registers are written by asserting the write control signal (RegWrite).
• Clocking Methodology: Writes are edge-triggered.
  • Thus can read and write to the same register in the same clock cycle.
Main 32-bit Arithmetic and Logic Unit (ALU):
• 4-bit ALU operation (function) input, e.g. add = 0010.
• Zero = Zero flag = 1 when ALU result equals zero.
[Figure: a. Registers (5-bit Read register 1, Read register 2 and Write register numbers; 32-bit Write Data input; 32-bit Read data 1 = R[rs] and Read data 2 = R[rt] outputs; RegWrite control), b. ALU (two 32-bit data inputs, 4-bit ALU operation, Zero and 32-bit ALU result outputs). 4th Edition Figure 4.7, page 310 - 3rd Edition Figure 5.7, page 295]
+ Basics of logic design/logic building blocks review in Appendix C in Book CD (3rd Edition Appendix B)
Register File Details
• Register File consists of 32 registers (32 32-bit registers):
  – Two 32-bit output busses: busA and busB
  – One 32-bit input bus: busW (write data)
• Register is selected by:
  – RA (5-bit number) selects the register to put on busA (data): busA = R[RA]
  – RB (5-bit number) selects the register to put on busB (data): busB = R[RB]
  – RW (5-bit number) selects the register to be written via busW (data) when Write Enable is 1.
    Write Enable: R[RW] ← busW
• Clock input (CLK):
  – The CLK input is a factor ONLY during write operations.
  – During read operation, it behaves as a combinational logic block:
    • RA or RB valid => busA or busB valid after “access time.”
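The read/write behavior described for the register file can be mimicked with a small behavioral model. This is a sketch, not the lecture's implementation: the class and method names are assumptions, reads are modeled as ordinary function calls (combinational), and the write happens only when clock_edge is called. Any special treatment of register 0 is deliberately not modeled.

```python
class RegisterFile:
    """Behavioral model: 32 32-bit registers, two read ports, one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        """Combinational read: busA = R[RA], busB = R[RB]."""
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, rw, bus_w, write_enable):
        """Edge-triggered write: R[RW] <- busW only when Write Enable is 1."""
        if write_enable:
            self.regs[rw] = bus_w & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_edge(rw=1, bus_w=7, write_enable=1)

# Same register read and written in one cycle: the read sees the old value,
# the new value becomes visible only after the clock edge.
old_a, _ = rf.read(1, 2)
rf.clock_edge(rw=1, bus_w=old_a + 1, write_enable=1)
new_a, _ = rf.read(1, 2)
```

Separating read from clock_edge is the design point: it reflects why an instruction such as add $1,$1,$1 can use a register as both source and destination in a single cycle.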
A Possible Register File Implementation
• Each register contains 32 edge triggered D flip-flops. Clock input to registers not shown in diagram.
• Write port: a 5-to-32 decoder on the 5-bit Write Register number (RW), together with Register Write Enable (RegWrite), selects which of Register 0 ... Register 31 is written with the 32-bit Register Write Data (Bus W). The write register number is rd or rt, depending on the instruction.
• 2 read ports: two 32-to-1 MUXes, each 32 bits wide, selected by the 5-bit Read Register 1 (RA = rs) and Read Register 2 (RB = rt) numbers, drive Register Read Data 1 (Bus A = R[rs]) and Register Read Data 2 (Bus B = R[rt]).
[Figure: block symbol of the register file (32 32-bit registers) with Clk, Write Enable, busW, busA, busB, RW, RA, RB, next to the decoder/register/MUX implementation described above.]
Also see Appendix C in Book CD (3rd Edition Appendix B) - The Basics of Logic Design
Idealized Memory
• Memory (idealized)
  – One input bus: Data In (32 bits).
  – One output bus: Data Out (32 bits).
• Memory word is selected by:
  – Address selects the word to put on Data Out bus.
  – Write Enable = 1: address selects the memory word to be written via the Data In bus.
• Clock input (CLK):
  – The CLK input is a factor ONLY during write operation.
  – During read operation, this memory behaves as a combinational logic block:
    • Address valid => Data Out valid after “access time.”
• Ideal Memory = Short access time (compared to other components in CPU datapath).
[Figure: memory block with Address, Data In, Data Out, Write Enable, Read Enable and Clk.]
Clocking Methodology Used: Edge Triggered Writes
• All storage element (e.g. flip-flops, registers, data memory) writes are triggered by the same clock edge.
• Cycle Time = CLK-to-Q + Longest Delay Path + Setup + Clock Skew
[Figure: clock timing diagram showing the Setup and Hold windows around each clock edge (signals are don't care outside them), the CLK-to-Q delay of the storage elements, and the critical path (longest delay path) between them. Here writes are triggered on the rising edge of the clock.]
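The cycle time relation can be evaluated directly. The sketch below uses invented delay values in picoseconds purely to show the arithmetic; none of these numbers come from the lecture.

```python
def cycle_time_ps(clk_to_q, longest_delay_path, setup, clock_skew):
    """Cycle Time = CLK-to-Q + Longest Delay Path + Setup + Clock Skew."""
    return clk_to_q + longest_delay_path + setup + clock_skew


# Hypothetical delays, all in picoseconds
example_cycle = cycle_time_ps(
    clk_to_q=100,             # storage element output valid after the edge
    longest_delay_path=1500,  # critical path through combinational logic
    setup=50,                 # setup time of the destination element
    clock_skew=50,            # allowance for clock arrival differences
)
max_clock_mhz = 1_000_000 / example_cycle  # 1 / cycle time, expressed in MHz
```

Note that the hold time does not appear in the sum: it constrains the shortest path, not the cycle time.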
Simplified Datapath For MIPS R-Type Instructions
Components and connections as specified by RTN statement: R[rd] ← R[rs] + R[rt] (i.e. funct = function = add)
[Figure: instruction fields from instruction memory, rs [25:21], rt [20:16] and rd [15:11], select the registers; the 32-bit values R[rs] and R[rt] feed the ALU (4-bit function input, e.g. add = 0010), and the 32-bit ALU result is written back to R[rd].]
Destination register R[rd] write or update is edge triggered at the end of the cycle. Clock input to register bank not shown.
More Detailed Datapath For R-Type Instructions With Control Points Identified
R[rd] ← R[rs] + R[rt] (i.e. funct = function = add)
[Figure: register file (32 32-bit registers) with 5-bit Rw = Rd, Ra = Rs, Rb = Rt, plus Clk and RegWr; the 32-bit busA = R[rs] and busB = R[rt] feed the ALU; the 4-bit ALUctr selects the function (add, subtract, ...); the 32-bit Result returns to the register file on busW.]
R-Type Register-Register Timing
[Figure: timing diagram for the R-Type datapath (register file, ALU, ALUctr, RegWr). Within one clock cycle each signal changes from its old value to its new value in this order:
  – PC: after Clk-to-Q.
  – Rs, Rt, Rd, Op, Func: after instruction memory access time.
  – ALUctr and RegWr: after delay through control logic.
  – busA, busB (R[rs], R[rt]): after register file access time.
  – busW: after ALU delay.
  – The register write occurs at the end of the cycle, when the PC is also updated to PC+4.]
All register writes occur on falling edge of clock (clocking methodology).
Logical Operations with Immediate Example: Micro-Operation Sequence For ORI
ori rt, rs, imm16
  op = 13  rs       rt       immediate (imm16)
  [31:26]  [25:21]  [20:16]  [15:0]
  6 bits   5 bits   5 bits   16 bits
  Instruction Word ← Mem[PC]           Fetch the instruction (common step)
  PC ← PC + 4                          Increment PC (common step)
  R[rt] ← R[rs] OR ZeroExt[imm16]      OR register rs with immediate field zero extended to 32 bits, result in register rt (done by main ALU)
ZeroExt[imm16] = 000 ..... 000, imm16 (upper 16 bits are zero).
Not in book version.
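The difference between the zero extension used by ori and the sign extension used by loads, stores and branches shows up only when bit 15 of imm16 is set. A small sketch, with helper names chosen for this example only:

```python
def zero_extend16(imm16):
    """ZeroExt[imm16]: upper 16 bits are always 0."""
    return imm16 & 0xFFFF


def sign_extend16_bits(imm16):
    """SignExt[imm16] as a 32-bit pattern: upper 16 bits copy bit 15."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if imm16 & 0x8000 else imm16


def ori(rs_value, imm16):
    """R[rt] <- R[rs] OR ZeroExt[imm16]"""
    return (rs_value | zero_extend16(imm16)) & 0xFFFFFFFF


result = ori(0x12340000, 0x8001)
```

If ori used sign extension instead, the example above would set the whole upper half of the result to ones, which is why the datapath needs an extender that can do both.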
Datapath For Logical Instructions With Immediate
R[rt] ← R[rs] OR ZeroExt[imm16]
[Figure: the R-Type datapath (register file with Clk, RegWr, busW, busA = R[rs], busB = R[rt]; ALU with ALUctr, function = OR) extended with:
  – a 2x1 MUX (width 5 bits), controlled by RegDst, that selects the write register number Rw from Rt or Rd;
  – a zero-extend unit (ZeroExt) that extends imm16 from 16 to 32 bits (upper bits 000 ..... 000);
  – a 2x1 MUX (width 32 bits), controlled by ALUSrc, that selects the second ALU operand from busB or the extended immediate.]
Load Operations Example: Micro-Operation Sequence For LW
lw rt, rs, imm16
  op = 35  rs       rt       immediate (imm16): signed address offset in bytes
  [31:26]  [25:21]  [20:16]  [15:0]
  6 bits   5 bits   5 bits   16 bits
  Instruction Word ← Mem[PC]               Fetch the instruction from instruction memory (common step)
  PC ← PC + 4                              Increment PC (common step)
  R[rt] ← Mem[R[rs] + SignExt[imm16]]      Immediate field sign extended to 32 bits and added to register rs to form the memory load effective address; the data memory word at the load effective address is written to register rt
Additional Datapath Components For Loads & Stores
a. Data memory unit:
  – Inputs: for address and write (store) data (32 bits each).
  – Output: for read (load) data (32 bits).
  – Control: MemRead, MemWrite.
  – Data memory write or update is edge triggered at the end of the cycle (clocking methodology).
b. Sign-extension unit (for SignExt[imm16]):
  – 16-bit input sign-extended into a 32-bit value at the output.
4th Edition Figure 4.8, page 311 - 3rd Edition Figure 5.8, page 296
Datapath For Loads (LW)
R[rt] ← Mem[R[rs] + SignExt[imm16]]
[Figure: datapath with the register file (Clk, RegWr, RegDst MUX selecting Rw from Rt or Rd), an extender (ExtOp) for imm16, the ALUSrc MUX, the ALU (ALUctr, function = add), data memory (Adr, Data In, WrEn = MemWr, MemRd, Clk), and a 32-bit 2x1 MUX (MemtoReg) that selects busW from the ALU result or the data memory output. For LW, busA carries the base address register R[rs]; the ALU adds the sign-extended offset to form the effective address into data memory, and the word read is written to R[rt].]
Store Operations Example: Micro-Operation Sequence For SW
sw rt, rs, imm16
  op = 43  rs       rt       immediate (imm16): signed address offset in bytes
  [31:26]  [25:21]  [20:16]  [15:0]
  6 bits   5 bits   5 bits   16 bits
  Instruction Word ← Mem[PC]               Fetch the instruction (common step)
  PC ← PC + 4                              Increment PC (common step)
  Mem[R[rs] + SignExt[imm16]] ← R[rt]      Immediate field sign extended to 32 bits and added to register rs to form the memory store effective address; register rt is written to data memory at the store effective address
Datapath For Stores (SW)
Mem[R[rs] + SignExt[imm16]] ← R[rt]
[Figure: same datapath as for loads (register file with Clk, RegWr and RegDst MUX; extender with ExtOp; ALUSrc MUX; ALU with ALUctr, function = add; data memory with Adr, Data In, WrEn = MemWr, MemRd, Clk; MemtoReg MUX). For SW, busA carries the base address register R[rs]; the ALU adds the sign-extended offset to form the effective address into data memory, and busB (R[rt]) is connected to the data memory Data In.]
Conditional Branch Example: Micro-Operation Sequence For BEQ
beq rs, rt, imm16
  op = 4   rs       rt       immediate: PC offset in words
  [31:26]  [25:21]  [20:16]  [15:0]
  6 bits   5 bits   5 bits   16 bits
  Instruction Word ← Mem[PC]                    Fetch the instruction (common step)
  PC ← PC + 4                                   Increment PC (common step)
  Zero ← R[rs] - R[rt]                          Calculate the branch condition R[rs] == R[rt] (i.e. if R[rs] - R[rt] = 0 then Zero = 1)
  Zero : PC ← PC + ( SignExt(imm16) x 4 )       Calculate the next instruction’s PC address (branch target)
“Zero” is the zero flag of the main ALU. In the last statement, Zero is the condition and the PC update is the action.
Datapath For Branch Instructions
• Main ALU evaluates the branch condition (ALU operation = subtract): Zero ← R[rs] - R[rt]
  – Zero flag = 1 if R[rs] - R[rt] = 0 (i.e. R[rs] = R[rt]); Zero goes to the branch control logic.
• New 32-bit adder (third ALU) computes the branch target:
  – Sum of incremented PC and sign-extended lower 16 bits of the instruction, shifted left 2: PC + 4 + ( SignExt(imm16) x 4 )
[Figure: instruction fields [25:21] rs and [20:16] rt read R[rs] and R[rt] from the registers into the main ALU; [15:0] imm16 passes through sign extend (16 to 32 bits) and shift left 2 into the branch target adder, together with PC + 4 from the instruction datapath. 4th Edition Figure 4.9, page 312 - 3rd Edition Figure 5.9, page 297]
More Detailed Datapath For Branch Operations
Zero ← R[rs] - R[rt]
[Figure: the register file outputs busA = R[rs] and busB = R[rt] feed the main ALU (subtract, shown as an "Equal?" comparison) to produce Zero. The PC (Clk) supplies the instruction address; one adder computes PC+4, and a new third ALU (adder) adds the PC extender output (imm16 sign extended and shifted left 2) to form the Branch Target. A new 2x1 32-bit MUX, controlled by Branch and Zero, selects the next PC value from PC+4 or the Branch Target.]
Combining The Datapaths For Memory Instructions and R-Type Instructions
Highlighted multiplexors and connections added to combine the datapaths of memory and R-Type instructions into one datapath.
[Figure: combined datapath. [25:21] rs and [20:16] rt read R[rs] and R[rt]; [15:0] imm16 is sign extended to SignExt(imm16); one multiplexor selects the second ALU operand (R[rt] or SignExt(imm16)) and another selects the register write data (ALU result or data memory read data); R[rt] is also the data memory write data. 4-bit ALU operation, 32-bit data paths; rt/rd MUX not shown. 4th Edition Figure 4.10 Page 314 - 3rd Edition Figure 5.10 Page 299]
This is book version: ORI not supported.
Instruction Fetch Datapath Added to ALU R-Type and Memory Instructions Datapath
[Figure: the PC drives the instruction memory read address and an adder producing PC + 4; the fetched instruction supplies rs and rt to the registers (RegWrite), which output R[rs] and R[rt]; the lower 16 bits are sign extended (16 to 32 bits); the ALUSrc MUX selects the second ALU operand; the ALU (4-bit ALU operation) produces Zero and the ALU result, which addresses data memory (MemRead, MemWrite; write data = R[rt]); the MemtoReg MUX selects the register write data. All data paths are 32 bits wide; rt/rd MUX not shown. Combination of Figure 4.10 (p. 314) and Figure 4.6 (p. 309) [3rd Edition Figure 5.10 (p. 299) and Figure 5.6 (p. 293)]]
This is book version: ORI not supported, no zero extend of immediate needed.
A Simple Datapath For The MIPS Architecture
Datapath of branches and a program counter multiplexor are added.
Resulting datapath can execute in a single cycle the basic MIPS instructions:
- load/store word
- ALU operations
- Branches
[Figure: the previous datapath plus the branch target adder and a program counter multiplexor, controlled by Branch and Zero, that selects PC + 4 or the Branch Target. rs and rt read R[rs] and R[rt]; 4-bit ALU operation; 32-bit data paths; rt/rd MUX not shown. 4th Edition Figure 4.11 page 315 - 3rd Edition Figure 5.11 page 300]
This is book version: ORI not supported, no zero extend of immediate needed.
Main ALU Control
• The main ALU has four control lines (or 3 bits, depending on the number of functions actually used; detailed design in 4th Edition Appendix C, 3rd Edition Appendix B) with the following functions:
  ALU Control Lines   ALU Function
  0000                AND
  0001                OR
  0010                add
  0110                subtract
  0111                Set-on-less-than
  1100                NOR (not used)
• For our current subset of MIPS instructions only the top five functions will be used (thus only three control lines will be used).
• For R-type instructions the ALU function depends on both the opcode and the 6-bit “funct” function field.
• For other instructions the ALU function depends on the opcode only.
• A local ALU control unit can be designed to accept 2-bit ALUop control lines (from main control unit) and the 6-bit function field and generate the correct 4-bit ALU control lines.
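The ALU function table can be turned into a small behavioral model that also produces the Zero flag. This is a sketch for illustration: the function names are assumptions, operands are treated as 32-bit patterns, and set-on-less-than is assumed to compare them as signed two's complement values.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit pattern as a signed two's complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def alu(control, a, b):
    """Return (result, zero) for the 4-bit ALU control encodings in the table."""
    if control == 0b0000:
        result = a & b                                   # AND
    elif control == 0b0001:
        result = a | b                                   # OR
    elif control == 0b0010:
        result = a + b                                   # add
    elif control == 0b0110:
        result = a - b                                   # subtract
    elif control == 0b0111:
        result = int(to_signed(a) < to_signed(b))        # set-on-less-than
    elif control == 0b1100:
        result = ~(a | b)                                # NOR
    else:
        raise ValueError("unused ALU control encoding")
    result &= MASK32
    return result, int(result == 0)


branch_equal = alu(0b0110, 5, 5)  # subtract equal operands: Zero = 1
```

The Zero output is what beq relies on: subtracting equal registers gives a result of 0 and therefore Zero = 1.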
Local ALU Decoding of “func” Field
  Instruction    ALUOp   Instruction        Funct     Desired            ALU Control
  Opcode                 Operation          Field     ALU Action         Lines
  LW             00      Load word          XXXXXX    add                0010
  SW             00      Store word         XXXXXX    add                0010
  Branch Equal   01      branch equal       XXXXXX    subtract           0110
  R-Type         10      add                100000    add                0010
  R-Type         10      subtract           100010    subtract           0110
  R-Type         10      AND                100100    and                0000
  R-Type         10      OR                 100101    or                 0001
  R-Type         10      set on less than   101010    set on less than   0111
ALUOp: 00 = add, 01 = subtract, 10 = R-Type.
[Figure: the Main Control takes the 6-bit op (opcode) and produces the 2-bit ALUop; the local ALU Control takes ALUop and the 6-bit func field and produces the 4-bit (or 3-bit) ALUctr for the ALU.]
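The decoding table can be expressed as a lookup keyed by ALUOp and, for R-Type instructions, by the function field. This is only a behavioral sketch; the constant and function names are assumptions, and real hardware would implement the same mapping as combinational logic.

```python
ALU_AND, ALU_OR, ALU_ADD, ALU_SUB, ALU_SLT = 0b0000, 0b0001, 0b0010, 0b0110, 0b0111

# Funct field -> ALU control lines, used only when ALUOp = 10 (R-Type)
FUNCT_TO_ALU = {
    0b100000: ALU_ADD,  # add
    0b100010: ALU_SUB,  # subtract
    0b100100: ALU_AND,  # AND
    0b100101: ALU_OR,   # OR
    0b101010: ALU_SLT,  # set on less than
}


def alu_control(alu_op, funct):
    """Map 2-bit ALUOp plus 6-bit funct to the 4-bit ALU control lines."""
    if alu_op == 0b00:      # LW, SW: funct is a don't care
        return ALU_ADD
    if alu_op == 0b01:      # Branch Equal: funct is a don't care
        return ALU_SUB
    if alu_op == 0b10:      # R-Type: decode the function field
        return FUNCT_TO_ALU[funct]
    raise ValueError("ALUOp encoding not used in this subset")


lw_ctl = alu_control(0b00, 0b111111)   # any funct value gives add
slt_ctl = alu_control(0b10, 0b101010)
```

Splitting the decoding into two levels keeps the main control small: it only needs to emit 2 ALUOp bits, and the funct field is examined locally.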
Local ALU Control Unit
[Figure: logic implementing the local ALU control. Inputs: ALUOp (2 lines from main control unit; 00 = add, 01 = subtract, 10 = R-Type) and the function field. Outputs: 3 ALU control lines (4th line = 0) selecting add, subtract, AND, OR or set-on-less-than. Page 302.]
More details found in Appendix D in Book CD – (3rd Edition Appendix C)
Single Cycle MIPS Datapath
Necessary multiplexors and control lines are identified here and local ALU control added.
[Figure: single cycle datapath. Instruction [31:0] from instruction memory supplies rs = Instruction [25:21], rt = Instruction [20:16], rd = Instruction [15:11], imm16 = Instruction [15:0] and the function field = Instruction [5:0] (to ALU control). Control lines: RegDst, RegWrite, ALUSrc, ALUOp (2 bits: 00 = add, 01 = subtract, 10 = R-Type), MemRead, MemWrite, MemtoReg, and PCSrc (formed from Branch and Zero), which selects PC + 4 or the Branch Target as the next PC. All data paths are 32 bits wide. 4th Edition Figure 4.15 page 320 - 3rd Edition Figure 5.15 page 305]
This is book version: ORI not supported, no zero extend of immediate needed.
Putting It All Together: A Single Cycle Datapath
[Figure: complete single cycle datapath (includes ORI, not in book version). Instruction memory (Adr) supplies Instruction<31:0>, from which Rs, Rt, Rd and Imm16 are taken. Next-PC logic: PC (Clk), an adder for PC+4, a PC extender (e.g. sign extend + shift left 2) with a second adder for the Branch Target, and a MUX controlled by PCSrc (from Branch and Zero). Register file (32 32-bit registers; Rw, Ra, Rb; Clk, RegWr; busW, busA = R[rs], busB = R[rt]) with the RegDst MUX selecting Rw from Rt or Rd. Extender (ExtOp) for imm16 (16 to 32 bits) and the ALUSrc MUX. Main ALU with Zero output, driven by the local ALU control from ALUop (2 bits: 00 = add, 01 = subtract, 10 = R-Type) and the function field. Data memory (Adr, Data In, WrEn = MemWr, MemRd, Clk) and the MemtoReg MUX selecting busW.]
[Figure: Control Unit and DATA PATH. Instruction memory (Adr) supplies Instruction<31:0>; the Op and Fun fields go to the control unit, while Rs, Rt, Rd, Imm16 and Jump_target go to the datapath. Control lines from the control unit to the datapath: RegDst, ALUSrc, ALUOp (2 bits), MemRead, MemWrite, MemtoReg, Branch, RegWrite.]
The Effect of The Control Signals
• RegDst:
  – Deasserted (=0): The register destination number for the write register comes from the rt field (instruction bits 20:16).
  – Asserted (=1): The register destination number for the write register comes from the rd field (instruction bits 15:11).
• RegWrite:
  – Deasserted (=0): None.
  – Asserted (=1): The register on the write register input is written with the value on the Write data input.
• ALUSrc:
  – Deasserted (=0): The second main ALU operand comes from the second register file output (Read data 2), R[rt].
  – Asserted (=1): The second main ALU operand is the sign-extended lower 16 bits of the instruction (imm16).
• Branch (BEQ):
  – Deasserted (=0): The PC is replaced by the output of the adder that computes PC + 4.
  – Asserted (=1): If Zero = 1, the PC is replaced by the output of the adder that computes the branch target.
• MemRead:
  – Deasserted (=0): None.
  – Asserted (=1): Data memory contents designated by the address input are put on the Read data output.
• MemWrite:
  – Deasserted (=0): None.
  – Asserted (=1): Data memory contents designated by the address input are replaced by the value on the Write data input.
• MemtoReg:
  – Deasserted (=0): The value fed to the register write data input comes from the main ALU.
  – Asserted (=1): The value fed to the register write data input comes from data memory.
Control Line Settings
Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
R-Format     1       0       0         1         0        0         0       1       0
lw           0       1       1         1         1        0         0       0       0
sw           X       1       X         0         0        1         0       0       0
beq          X       0       X         0         0        0         1       0       1
ALUOp (2 bits): 00 = add, 01 = subtract, 10 = R-Type
4th Edition Figure 4.18 page 323; 3rd Edition Figure 5.18 page 308
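The control line settings above can be read as a lookup from opcode to control lines. The following is a minimal sketch of that idea, not the lecture's implementation: the names CONTROL, OPCODES and main_control are hypothetical, and the opcode values are the standard MIPS encodings, which the slide itself does not list. Don't-care (X) entries are represented as None.

```python
# Control line settings from the table above, keyed by instruction class.
# None stands for a don't-care (X) entry. ALUOp holds the two bits ALUOp1,ALUOp0.
CONTROL = {
    "R-format": dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
                     MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),
    "lw": dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
               MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),
    "sw": dict(RegDst=None, ALUSrc=1, MemtoReg=None, RegWrite=0,
               MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),
    "beq": dict(RegDst=None, ALUSrc=0, MemtoReg=None, RegWrite=0,
                MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),
}

# Assumption: standard MIPS opcode values (not given on the slide).
OPCODES = {0x00: "R-format", 0x23: "lw", 0x2B: "sw", 0x04: "beq"}


def main_control(instruction: int) -> dict:
    """Return the control line settings for a 32-bit instruction word."""
    opcode = (instruction >> 26) & 0x3F  # Instruction[31:26]
    return CONTROL[OPCODES[opcode]]
```

For example, main_control(0x8D280004) (the encoding of lw $t0, 4($t1)) returns the lw row of the table.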
The Truth Table For The Main Control
[Figure: truth table of the main control; the inputs are the opcode bits.]
Similar to Figure 4.22 Page 327 (3rd Edition Figure 5.22 Page 312)
PLA Implementation of the Main Control
PLA = Programmable Logic Array - Appendix C (3rd Edition Appendix B)
Figure D.2.5 in Appendix D (3rd Edition Figure C.2.5 in Appendix C)
[Figure: PLA whose outputs are the control lines to the datapath.]
Single Cycle MIPS Datapath Control Unit Added
[Figure: textbook single cycle MIPS datapath with the control unit. Instruction [31-26] (opcode) feeds Control; Instruction [25-21] (rs) and Instruction [20-16] (rt) select the read registers; Instruction [20-16] (rt) or Instruction [15-11] (rd) selects the write register; Instruction [15-0] (imm16) feeds the sign extend unit (16 to 32 bits); Instruction [5-0] (function field) feeds ALU control. Adders compute PC + 4 and the branch target.]
Control outputs: RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, RegWrite.
ALUOp (2 bits): 00 = add, 01 = subtract, 10 = R-Type
In this book version, ORI is not supported, so no zero extend of the immediate is needed.
4th Edition Figure 4.21, page 326; 3rd Edition Figure 5.17, page 307
Adding Support For Jump:
Micro-Operation Sequence For Jump: J
j jump_target
Instruction format: OP (6 bits, [31:26]) | Jump_target (26 bits, [25:0])
Jump address (32 bits): PC(31-28) (4 bits, the 4 highest bits from PC + 4) | jump target (26 bits, jump address in words) | 00 (2 bits)
Instruction Word ← Mem[PC]        Fetch the instruction (common step)
PC ← PC + 4                       Increment PC (common step)
PC ← PC(31-28),jump_target,00     Update PC with jump address
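The jump address concatenation above can be written out with shifts and masks. This is a minimal sketch; the function name jump_address and the 32-bit masking are illustrative assumptions, not part of the slides.

```python
def jump_address(pc: int, instruction: int) -> int:
    """Form PC(31-28),jump_target,00 for a j instruction.

    pc is the address of the jump instruction itself; the upper 4 bits
    are taken from PC + 4, as stated on the slide.
    """
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF          # keep 32 bits
    jump_target = instruction & 0x03FFFFFF     # Instruction[25:0], in words
    return (pc_plus_4 & 0xF0000000) | (jump_target << 2)
```

Shifting the 26-bit target left by 2 converts the word address to a byte address, which is where the two low 0 bits come from.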
Datapath For Jump
[Figure: next instruction address logic extended for jump. Instruction(25-0) (jump_target, 26 bits) is shifted left 2 (28 bits) and combined with PC+4(31-28) (4 bits) to form the 32-bit jump address. One mux, controlled by PCSrc (derived from Branch and Zero), selects between PC+4 and the branch target, which is computed from PC+4 and the extended imm16 from Instruction(15-0) (e.g. sign extend + shift left 2). A second mux, controlled by JUMP, selects between that result and the jump address to give the next PC.]
Jump address = PC(31-28),jump_target,00 (4 bits, 26 bits, 2 bits)
Single Cycle MIPS Datapath Extended To Handle Jump with Control Unit Added
[Figure: the textbook single cycle MIPS datapath with control unit, extended for jump. Instruction [25-0] (26 bits) is shifted left 2 (28 bits) and combined with PC + 4 [31-28] to form Jump address [31-0]. An additional mux controlled by Jump selects between the jump address and the output of the PC + 4 / branch target mux. The remaining fields are used as before: opcode Instruction [31-26], rs Instruction [25-21], rt Instruction [20-16], rd Instruction [15-11], imm16 Instruction [15-0], function field Instruction [5-0].]
Control outputs: RegDst, Jump, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, RegWrite.
ALUOp (2 bits): 00 = add, 01 = subtract, 10 = R-Type
In this book version, ORI is not supported, so no zero extend of the immediate is needed.
4th Edition Figure 4.24 page 329; 3rd Edition Figure 5.24 page 314
Control Line Settings (with jump instruction, j added)
Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0  Jump
R-Format     1       0       0         1         0        0         0       1       0       0
lw           0       1       1         1         1        0         0       0       0       0
sw           X       1       X         0         0        1         0       0       0       0
beq          X       0       X         0         0        0         1       0       1       0
j            X       X       X         0         0        0         X       X       X       1
Figure 4.18 page 323 (3rd Edition Figure 5.18 page 308) modified to include j
Worst Case Timing (Load)
[Timing diagram: after the clock edge (Clk), each signal changes from its old value to its new value once the listed delay has elapsed.]
– PC: Clk-to-Q
– Rs, Rt, Rd, Op, Func: Instruction Memory Access Time
– RegWr, ALUctr, ExtOp, ALUSrc, MemtoReg: Delay through Control Logic
– busA: Register File Access Time
– busB: Delay through Extender & Mux
– Address: ALU Delay
– busW: Data Memory Access Time
– The register write occurs once busW holds its new value.
Instruction Timing Comparison
Arithmetic & Logical: PC, Inst Memory, Reg File, mux, ALU, mux, setup
Load (Critical Path): PC, Inst Memory, Reg File, mux, ALU, Data Mem, mux, setup
Store: PC, Inst Memory, Reg File, mux, ALU, Data Mem
Branch: PC, Inst Memory, Reg File, cmp, mux
Jump: PC, Inst Memory, mux
Simplified Single Cycle Datapath Timing
• Assuming the following datapath/control hardware component delays (obtained from the low-level target VLSI implementation technology of the components):
– Memory Units: 2 ns
– ALU and adders: 2 ns
– Register File: 1 ns
– Control Unit < 1 ns
• Ignoring Mux and clk-to-Q delays, critical path analysis (load):
Time         Stage on the critical path
0 to 2 ns    Instruction Memory (2 ns)
2 to 3 ns    Register Read (1 ns)
3 to 5 ns    Main ALU (2 ns)
5 to 7 ns    Data Memory (2 ns)
7 to 8 ns    Register Write (1 ns)
The PC + 4 ALU (2 ns), the Branch Target ALU (2 ns) and the Control Unit (< 1 ns) are not on the critical path.
Critical Path = 8 ns (LW)
ns = nanosecond = 10^-9 second
Performance of Single-Cycle (CPI=1) CPU
• Assuming the following datapath hardware component delays:
– Memory Units: 2 ns
– ALU and adders: 2 ns
– Register File: 1 ns
• The delays needed for each instruction type can be found:
Instruction Class  Instruction Memory  Register Read  ALU Operation  Data Memory  Register Write  Total Delay
ALU                2 ns                1 ns           2 ns           -            1 ns            6 ns
Load               2 ns                1 ns           2 ns           2 ns         1 ns            8 ns
Store              2 ns                1 ns           2 ns           2 ns         -               7 ns
Branch             2 ns                1 ns           2 ns           -            -               5 ns
Jump               2 ns                -              -              -            -               2 ns
• The clock cycle is determined by the instruction with the longest delay: the load in this case, which takes 8 ns. C = 8 ns, Clock rate = 1 / 8 ns = 125 MHz
• A program with I = 1,000,000 instructions executed takes:
Execution Time = T = I x CPI x C = 10^6 x 1 x 8x10^-9 = 0.008 s = 8 msec
Nanosecond, ns = 10^-9 second
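The delay table and T = I x CPI x C can be checked with a few lines of arithmetic. This is a minimal sketch using the component delays assumed on the slide; the names DELAY_NS, STAGES, total_delay_ns and execution_time_s are hypothetical.

```python
# Component delays in ns, as assumed on the slide.
DELAY_NS = {"inst_mem": 2, "reg_read": 1, "alu": 2, "data_mem": 2, "reg_write": 1}

# Components each instruction class uses, per the delay table.
STAGES = {
    "ALU": ["inst_mem", "reg_read", "alu", "reg_write"],
    "Load": ["inst_mem", "reg_read", "alu", "data_mem", "reg_write"],
    "Store": ["inst_mem", "reg_read", "alu", "data_mem"],
    "Branch": ["inst_mem", "reg_read", "alu"],
    "Jump": ["inst_mem"],
}


def total_delay_ns(instruction_class: str) -> int:
    """Sum the delays of the components used by one instruction class."""
    return sum(DELAY_NS[stage] for stage in STAGES[instruction_class])


# The slowest instruction class sets the clock cycle C.
cycle_ns = max(total_delay_ns(c) for c in STAGES)
clock_mhz = 1000 / cycle_ns


def execution_time_s(instructions: int, cpi: int = 1, c_ns: int = cycle_ns) -> float:
    """T = I x CPI x C, with C given in nanoseconds."""
    return instructions * cpi * c_ns * 1e-9
```

With these delays the load path gives C = 8 ns, so a program of 1,000,000 instructions at CPI = 1 takes 8 msec, matching the slide.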
Adding Support for jal to Single Cycle Datapath
• The MIPS jump and link instruction, jal, is used to support procedure calls by jumping to the jump address (similar to j) and saving the address of the following instruction, PC + 4 (i.e. the return address), in register $ra ($31):
jal Address
R[31] ← PC + 4
PC ← Jump Address
• jal uses the j instruction format:
op (6 bits) | Target address (26 bits)
• We wish to add jal to the single cycle datapath in Figure 4.24 page 329 (3rd Edition Figure 5.24 page 314). Add any necessary datapaths and control signals to the single cycle datapath and justify the need for the modifications, if any.
• Specify control line values for this instruction.
jump and link, jal support to Single Cycle Datapath
Instruction Word ← Mem[PC]
R[31] ← PC + 4
PC ← Jump Address
1. Expand the multiplexor controlled by RegDst to include the value 31 as a new input 2.
2. Expand the multiplexor controlled by MemtoReg to have PC + 4 as new input 2.
[Figure: the datapath of Figure 4.24 with the RegDst multiplexor extended with the constant 31 and the MemtoReg multiplexor extended with PC + 4; both select signals are 2 bits wide.]
Adding Control Lines Settings for jal (For Textbook Single Cycle Datapath including Jump)
jump and link, jal support to Single Cycle Datapath
Instruction Word ← Mem[PC]
R[31] ← PC + 4
PC ← Jump Address
Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0  Jump
R-format     01      0       00        1         0        0         0       1       0       0
lw           00      1       01        1         1        0         0       0       0       0
sw           xx      1       xx        0         0        1         0       0       0       0
beq          xx      0       xx        0         0        0         1       0       1       0
J            xx      x       xx        0         0        0         x       x       x       1
JAL          10      x       10        1         0        0         x       x       x       1
RegDst is now 2 bits. MemtoReg is now 2 bits.
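The two expanded multiplexors can be modeled as lookups on the 2-bit select values from the jal control table. This is a minimal sketch under that reading of the table (00, 01, 10 selecting inputs 0, 1, 2); the function name write_back and its parameter names are hypothetical.

```python
def write_back(reg_dst, mem_to_reg, rt, rd, alu_result, mem_data, pc_plus_4):
    """Return (destination register number, value to write).

    reg_dst and mem_to_reg are the 2-bit select signals:
    RegDst:   00 -> rt, 01 -> rd, 10 -> constant 31 (new input 2 for jal)
    MemtoReg: 00 -> ALU result, 01 -> data memory, 10 -> PC + 4 (new input 2)
    """
    dest = {0b00: rt, 0b01: rd, 0b10: 31}[reg_dst]
    value = {0b00: alu_result, 0b01: mem_data, 0b10: pc_plus_4}[mem_to_reg]
    return dest, value
```

For jal, both selects are 10, so register 31 receives PC + 4 regardless of the rt, rd, ALU and memory inputs.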
Adding Support for LWR to Single Cycle Datapath
• We wish to add a variant of lw (load word), let’s call it LWR (Load Word Register), to the single cycle datapath in Figure 4.24 page 329 (3rd Edition Figure 5.24 page 314).
LWR $rd, $rs, $rt
• The LWR instruction is similar to lw, but it sums two registers (specified by $rs, $rt) to obtain the effective load address and uses the R-Type format. The loaded word from memory is written to register rd.
• Add any necessary datapaths and control signals to the single cycle datapath and justify the need for the modifications, if any.
• Specify control line values for this instruction.
Adding Control Lines Settings for LWR (For Textbook Single Cycle Datapath including Jump)
LWR (R-format LW) support to Single Cycle Datapath
Instruction Word ← Mem[PC]
PC ← PC + 4
R[rd] ← Mem[ R[rs] + R[rt] ]
No new components or connections are needed for the datapath, just the proper control line settings:
Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0  Jump
R-format     1       0       0         1         0        0         0       1       0       0
lw           0       1       1         1         1        0         0       0       0       0
sw           x       1       x         0         0        1         0       0       0       0
beq          x       0       x         0         0        0         1       0       1       0
J            x       x       x         0         0        0         x       x       x       1
LWR          1       0       1         1         1        0         0       0       0       0
For LWR the write register is rd, the second ALU operand is R[rt], and the ALU operation is add.
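The LWR micro-operation R[rd] ← Mem[ R[rs] + R[rt] ] can be traced against its control line settings. This is a minimal sketch only: the function name execute_lwr, the list-based register file and the dictionary-based data memory are illustrative assumptions, not the lecture's model.

```python
def execute_lwr(regs, data_mem, rs, rt, rd):
    """Apply R[rd] <- Mem[R[rs] + R[rt]] to a register list in place.

    regs is a list of 32 register values; data_mem maps byte addresses
    to 32-bit words (a simplification of the data memory).
    """
    # ALUSrc = 0 selects R[rt] as the second operand; ALUOp = 00 selects add.
    address = (regs[rs] + regs[rt]) & 0xFFFFFFFF
    # MemRead = 1 reads the word; MemtoReg = 1 routes it to the write data input.
    word = data_mem[address]
    # RegDst = 1 selects rd as the write register; RegWrite = 1 enables the write.
    regs[rd] = word
    return regs
```

Each comment names the control line from the LWR row that enables that step, which is why no new datapath components are needed.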
Adding Support for jm to Single Cycle Datapath
• We wish to add a new instruction jm (jump memory) to the single cycle datapath in Figure 4.24 page 329 (3rd Edition Figure 5.24 page 314).
jm offset($rs)
• The jm instruction loads a word from the effective address (R[rs] + offset); this is similar to lw except that the loaded word is put in the PC instead of register $rt.
• jm uses the I-format with field rt not used:
OP (6 bits) | rs (5 bits) | rt (5 bits, not used) | address (imm16) (16 bits)
• Add any necessary datapaths and control signals to the single cycle datapath and justify the need for the modifications, if any.
• Specify control line values for this instruction.
Adding jump memory, jm support to Single Cycle Datapath
Instruction Word ← Mem[PC]
PC ← Mem[R[rs] + SignExt[imm16]]
1. Expand the multiplexor controlled by Jump to include the Read Data (data memory output) as new input 2. The Jump control signal is now 2 bits.
[Figure: the datapath of Figure 4.24 with the Jump multiplexor extended with the data memory Read Data output as input 2; the Jump select signal is 2 bits wide.]
Adding Control Lines Settings for jm (For Textbook Single Cycle Datapath including Jump)
Adding jm support to Single Cycle Datapath
PC ← Mem[R[rs] + SignExt[imm16]]
Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0  Jump
R-format     1       0       0         1         0        0         0       1       0       00
lw           0       1       1         1         1        0         0       0       0       00
sw           x       1       x         0         0        1         0       0       0       00
beq          x       0       x         0         0        0         1       0       1       00
J            x       x       x         0         0        0         x       x       x       01
Jm           x       1       x         0         1        0         x       0       0       10
Jump is now 2 bits. For jm the main ALU performs an add (ALUOp = 00).
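The next-PC selection with the 2-bit Jump signal can be sketched as two chained multiplexors. This is a minimal sketch under the encoding in the jm control table (00 = no jump, 01 = j, 10 = jm); the function name next_pc and its parameter names are hypothetical.

```python
def next_pc(jump, pc_plus_4, branch_target, jump_address, read_data,
            branch=0, zero=0):
    """Select the next PC value.

    The first mux picks the branch target only when Branch = 1 and Zero = 1.
    The second mux is controlled by the 2-bit Jump signal:
    00 -> first mux output, 01 -> jump address (j), 10 -> data memory
    Read Data output (new input 2 for jm).
    """
    branch_mux = branch_target if (branch and zero) else pc_plus_4
    return {0b00: branch_mux, 0b01: jump_address, 0b10: read_data}[jump]
```

With Jump = 10 the Branch line has no effect on the result, which is consistent with the x (don't care) entry for Branch in the Jm row.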
Drawbacks of Single Cycle Processor
1. Long cycle time:
– All instructions must take as much time as the slowest.
• Here, cycle time for load is longer than needed for all other instructions.
– Cycle time must be long enough for the load instruction: PC’s Clock-to-Q + Instruction Memory Access Time + Register File Access Time + ALU Delay (address calculation) + Data Memory Access Time + Register File Setup Time + Clock Skew
– Real memory is not as well-behaved as idealized memory.
• Cannot always complete data access in one (short) cycle.
2. Impossible to implement complex, variable-length instructions and complex addressing modes in a single cycle.
– e.g. indirect memory addressing: R[$1] ← Mem[ Mem[$2] ]
3. High and duplicate hardware resource requirements:
– Any hardware functional unit cannot be used more than once in a single cycle (e.g. ALUs).
4. Does not allow overlap of instruction processing (instruction pipelining, chapter 6).