EEM 486 : Computer Architecture Designing a Multicycle Processor

Page 1

EEM 486: Computer Architecture

Designing a Multicycle Processor

Page 2

The Big Picture

Designing a Multiple Clock Cycle Datapath

[Figure: the big picture of a computer. Labels: Processor (Control, Datapath), Memory, Input, Output.]

Page 3

Single-Cycle Processor

In our single-cycle processor, each instruction is realized by exactly one control command or microinstruction

[Figure: control of the single-cycle processor. Labels: Instruction, OPcode, Decode, Control Logic / Store (PLA, ROM), microinstruction, Control Points, Datapath, Conditions.]
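The idea that one opcode selects exactly one control word can be sketched as a lookup table. This is a minimal sketch, not the course's control design: the signal names follow the slides, but the table `CONTROL_STORE`, the function `decode`, and the signal values (assumed for a MIPS-like subset, with don't-care signals written as 0) are illustrative assumptions.

```python
# Hypothetical control store: one control word (microinstruction) per opcode.
# Values are assumed for a MIPS-like subset; don't-care signals are shown as 0.
CONTROL_STORE = {
    0x00: dict(RegDst=1, ALUSrc=0, ExtOp=0, MemRd=0, MemWr=0, RegWr=1, nPC_sel=0),  # R-type
    0x0D: dict(RegDst=0, ALUSrc=1, ExtOp=0, MemRd=0, MemWr=0, RegWr=1, nPC_sel=0),  # ori
    0x23: dict(RegDst=0, ALUSrc=1, ExtOp=1, MemRd=1, MemWr=0, RegWr=1, nPC_sel=0),  # lw
    0x2B: dict(RegDst=0, ALUSrc=1, ExtOp=1, MemRd=0, MemWr=1, RegWr=0, nPC_sel=0),  # sw
    0x04: dict(RegDst=0, ALUSrc=0, ExtOp=0, MemRd=0, MemWr=0, RegWr=0, nPC_sel=1),  # beq
}


def decode(instruction: int) -> dict:
    """Return the single control word selected by the 6-bit opcode field."""
    opcode = (instruction >> 26) & 0x3F
    return CONTROL_STORE[opcode]


lw_control = decode(0x8D6A0000)  # lw $t2, 0($t3)
sw_control = decode(0xAD6D0008)  # sw $t5, 8($t3)
print(lw_control)
print(sw_control)
```

Because the whole instruction completes in one cycle, a single table entry is enough; the multicycle design later replaces this with a sequence of control words per instruction.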

Page 4

Abstract View of Single-Cycle Processor

[Figure: abstract view of the single-cycle processor. Datapath labels: Next PC, PC, Instruction Fetch, Register Fetch, Ext, ALU, Mem Access, Data Mem, Reg. Wrt. Control labels: Main Control (op), ALU control (fun). Signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, Equal.]

Page 5

What’s Wrong with CPI=1 Processor?

Long Cycle Time
All instructions take as much time as the slowest
Real memory is not as nice as our idealized memory
◦ Cannot always get the job done in one (short) cycle

Units on the path of each instruction type:
Arithmetic & Logical: PC, Inst Memory, Reg File, mux, ALU, mux, setup
Load: PC, Inst Memory, Reg File, mux, ALU, Data Mem, mux, setup (Critical Path)
Store: PC, Inst Memory, Reg File, mux, ALU, Data Mem
Branch: PC, Inst Memory, Reg File, cmp, mux
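The point that every instruction pays for the slowest path can be sketched numerically. The unit delays in `DELAY_NS` below are hypothetical values chosen only for illustration (the slides give no timings), and muxes and the PC are ignored; only the structure of the calculation is the point.

```python
# Hypothetical unit delays in nanoseconds (illustrative only, not from the slides).
DELAY_NS = {"inst_mem": 2.0, "reg_file": 1.0, "alu": 2.0, "cmp": 1.0,
            "data_mem": 2.0, "setup": 0.5}

# Major units on each instruction type's path (PC and muxes ignored).
PATHS = {
    "Arithmetic & Logical": ["inst_mem", "reg_file", "alu", "setup"],
    "Load": ["inst_mem", "reg_file", "alu", "data_mem", "setup"],
    "Store": ["inst_mem", "reg_file", "alu", "data_mem"],
    "Branch": ["inst_mem", "reg_file", "cmp"],
}


def path_delay(units):
    """Sum the delays of the units an instruction passes through."""
    return sum(DELAY_NS[unit] for unit in units)


delays = {name: path_delay(units) for name, units in PATHS.items()}
critical = max(delays, key=delays.get)   # slowest instruction type
cycle_time_ns = delays[critical]         # a CPI=1 clock must cover this path

for name, delay in delays.items():
    print(f"{name}: {delay} ns (clock period is {cycle_time_ns} ns)")
```

With these assumed delays the load sets the clock period, so every other instruction type wastes the difference between its own path delay and the load's.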

Page 6

Memory Access Time

Physics => fast memories are small (large memories are slow)

Use a hierarchy of memories

[Figure: a storage array. Labels: address, address decoder, selected word line, storage cell, bit line, sense amps.]

[Figure: memory hierarchy. Processor and Cache: 1 time-period; proc. bus to L2 Cache: 2-3 time-periods; mem. bus to memory: 20 - 50 time-periods.]

Page 7

Multicycle Approach

Break up the instructions into steps:
◦ Let each step take one “smaller” clock cycle
  - Balance the amount of work to be done
  - Restrict each cycle to use only one major functional unit
    (Major functional units: Memory, Register File, and ALU)
◦ Let different instructions take different numbers of cycles

Use a functional unit more than once within execution of one instruction (Less hardware)
◦ A single memory unit for both instructions and data
◦ A single ALU, rather than an ALU and two adders

At the end of a cycle
◦ store values for use in later cycles
◦ introduce additional “internal” registers

Page 8

Partitioning the CPI=1 Datapath

Add registers between smallest steps

[Figure: the single-cycle datapath partitioned by registers. Datapath labels: Next PC, PC, Instruction Fetch, Operand Fetch, Exec, Mem Access, Data Mem, Reg. File. Signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, Equal.]

Resulting steps:
◦ Instruction fetch
◦ Decode and Operand fetch
◦ Execution
◦ Memory access
◦ Writeback

Page 9

Recall: Step-by-step Processor Design

Step 1: ISA => Logical Register Transfers
Step 2: Components of the Datapath
Step 3: RTL + Components => Datapath
Step 4: Datapath + Logical RTs => Physical RTs
Step 5: Physical RTs => Control

Page 10

Step 4 : R-type (add, sub, . . .)

inst    Logical Register Transfers
ADDU    R[rd] <– R[rs] + R[rt]; PC <– PC + 4

Step 1. Instruction Fetch: IR ← MEM[PC], PC ← PC + 4

Step 2. Instruction Decode and Register Fetch: A ← R[rs], B ← R[rt]

Step 3. Execution: ALUOut ← A op B

Step 4. Write-back: R[rd] ← ALUOut
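The four R-type steps above can be traced in software. This is a minimal sketch, assuming a dictionary as word-addressed memory and a 32-entry list as the register file; the function name `execute_r_type` and the local names standing in for IR, A, B, and ALUOut are illustrative, not part of the slides.

```python
import operator


def execute_r_type(pc, memory, regs, alu_op=operator.add):
    """Run the four R-type steps once and return the updated PC."""
    # Step 1: instruction fetch
    ir = memory[pc]
    pc = pc + 4
    # Step 2: instruction decode and register fetch
    rs, rt, rd = (ir >> 21) & 0x1F, (ir >> 16) & 0x1F, (ir >> 11) & 0x1F
    a, b = regs[rs], regs[rt]
    # Step 3: execution (result kept to 32 bits)
    alu_out = alu_op(a, b) & 0xFFFFFFFF
    # Step 4: write-back
    regs[rd] = alu_out
    return pc


regs = [0] * 32
regs[10], regs[11] = 7, 5            # $t2 = 7, $t3 = 5
memory = {0: 0x014B6821}             # addu $t5, $t2, $t3
new_pc = execute_r_type(0, memory, regs)
print(regs[13], new_pc)
```

Each commented step corresponds to one clock cycle in the multicycle design, with the value computed in one step held in an internal register for the next.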

Page 11

Step 4 : R-type (add, sub, . . .)

[Figure: multicycle datapath for R-type instructions. Components: PC; Memory (Address, MemData, Write data); Instruction register (Instruction [31-26], [25-21], [20-16], [15-0], with Instruction [15-11] feeding Rw); Registers (Rs, Rt, Rw, Write data, Read data 1, Read data 2); A; B; two ALU input muxes (inputs 0/1, with the constant 4); ALU; ALUOut. Control signals: MemRead, MemWrite, IRWrite, RegWrite, ALUSrcA, ALUSrcB, ALUctr, nPCWrite.]

Page 12

Step 4 : Logical immediate

inst    Logical Register Transfers
ORI     R[rt] <– R[rs] OR ZExt(Im16); PC <– PC + 4

Step 1. Instruction Fetch: IR ← MEM[PC], PC ← PC + 4

Step 2. Instruction Decode and Register Fetch: A ← R[rs]

Step 3. Execution: ALUOut ← A OR ZExt(Im16)

Step 4. Write-back: R[rt] ← ALUOut
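The ORI steps differ from R-type mainly in the second ALU operand, which is the zero-extended immediate. A minimal sketch, assuming the same dictionary memory and list register file as an illustration; `zero_extend16` and `execute_ori` are hypothetical helper names.

```python
def zero_extend16(imm16):
    """ZExt(Im16): the upper 16 bits of the 32-bit result are zero."""
    return imm16 & 0xFFFF


def execute_ori(pc, memory, regs):
    """Run the four ORI steps once and return the updated PC."""
    # Step 1: instruction fetch
    ir = memory[pc]
    pc = pc + 4
    # Step 2: instruction decode and register fetch
    rs, rt = (ir >> 21) & 0x1F, (ir >> 16) & 0x1F
    a = regs[rs]
    # Step 3: execution with the zero-extended immediate
    alu_out = a | zero_extend16(ir & 0xFFFF)
    # Step 4: write-back to rt (not rd)
    regs[rt] = alu_out
    return pc


regs = [0] * 32
regs[11] = 0x000000FF                # $t3
memory = {0: 0x356A8000}             # ori $t2, $t3, 0x8000
new_pc = execute_ori(0, memory, regs)
print(hex(regs[10]), new_pc)
```

The immediate 0x8000 has its top bit set, so the example shows that zero extension leaves the upper half of the result clear, unlike the sign extension used for loads and stores.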

Page 13

Step 4 : Logical immediate

[Figure: multicycle datapath extended for logical immediate instructions. Components as for R-type, plus: a RegDst mux choosing Rw from Instruction [20-16] or Inst [15-11]; a Zero extend unit (16 to 32 bits) on Instruction [15-0]; an ALUSrcB mux with inputs 0, 1, 2 (including the constant 4). Control signals: MemRead, MemWrite, IRWrite, RegWrite, RegDst, ALUSrcA, ALUSrcB, ALUctr, nPCWrite.]

Page 14

Step 4 : Load

inst    Logical Register Transfers
LW      R[rt] <– MEM[R[rs] + SExt(Im16)]; PC <– PC + 4

Step 1. Instruction Fetch: IR ← MEM[PC], PC ← PC + 4

Step 2. Instruction Decode and Register Fetch: A ← R[rs]

Step 3. Memory address computation: ALUOut ← A + SExt(Im16)

Step 4. Memory access: MDR ← Memory[ALUOut]

Step 5. Load completion: R[rt] ← MDR
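The five load steps above can be traced the same way. This sketch assumes one dictionary serves as the single memory for both instructions and data, which matches the multicycle approach; `sign_extend16` and `execute_lw` are hypothetical names.

```python
def sign_extend16(imm16):
    """SExt(Im16): interpret the low 16 bits as a two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def execute_lw(pc, memory, regs):
    """Run the five LW steps once and return the updated PC."""
    # Step 1: instruction fetch
    ir = memory[pc]
    pc = pc + 4
    # Step 2: instruction decode and register fetch
    rs, rt = (ir >> 21) & 0x1F, (ir >> 16) & 0x1F
    a = regs[rs]
    # Step 3: memory address computation
    alu_out = (a + sign_extend16(ir & 0xFFFF)) & 0xFFFFFFFF
    # Step 4: memory access into the memory data register
    mdr = memory[alu_out]
    # Step 5: load completion
    regs[rt] = mdr
    return pc


regs = [0] * 32
regs[11] = 0x104                         # $t3 holds the base address
memory = {0: 0x8D6AFFFC, 0x100: 42}      # lw $t2, -4($t3); data word at 0x100
new_pc = execute_lw(0, memory, regs)
print(regs[10], new_pc)
```

The negative offset exercises the sign extension: 0x104 plus -4 addresses the word at 0x100.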

Page 15

Step 4 : Load

[Figure: multicycle datapath extended for load. Components as before, plus: an IorD mux on the memory Address; an Extender (16 to 32 bits) controlled by ExtOp; the MDR register on MemData; a MemtoReg mux on the register Write data. Control signals: MemRead, MemWrite, IRWrite, RegWrite, RegDst, ALUSrcA, ALUSrcB, ALUctr, nPCWrite, IorD, MemtoReg, ExtOp.]

Page 16

Step 4 : Store

inst    Logical Register Transfers
SW      MEM[R[rs] + SExt(Im16)] <– R[rt]; PC <– PC + 4

Step 1. Instruction Fetch: IR ← MEM[PC], PC ← PC + 4

Step 2. Instruction Decode and Register Fetch: A ← R[rs], B ← R[rt]

Step 3. Memory address computation: ALUOut ← A + SExt(Im16)

Step 4. Memory access: Memory[ALUOut] ← B
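A store finishes one step earlier than a load because nothing is written back to the register file. A minimal sketch of the four steps, again assuming a dictionary as the single memory; `execute_sw` and `sign_extend16` are hypothetical names.

```python
def sign_extend16(imm16):
    """SExt(Im16): interpret the low 16 bits as a two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def execute_sw(pc, memory, regs):
    """Run the four SW steps once and return the updated PC."""
    # Step 1: instruction fetch
    ir = memory[pc]
    pc = pc + 4
    # Step 2: instruction decode and register fetch
    rs, rt = (ir >> 21) & 0x1F, (ir >> 16) & 0x1F
    a, b = regs[rs], regs[rt]
    # Step 3: memory address computation
    alu_out = (a + sign_extend16(ir & 0xFFFF)) & 0xFFFFFFFF
    # Step 4: memory access (write B to memory)
    memory[alu_out] = b
    return pc


regs = [0] * 32
regs[11], regs[13] = 0x100, 99       # $t3 = base address, $t5 = value to store
memory = {0: 0xAD6D0008}             # sw $t5, 8($t3)
new_pc = execute_sw(0, memory, regs)
print(memory[0x108], new_pc)
```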

Page 17

Step 4 : Store

[Figure: multicycle datapath for store. Same components as the load datapath: PC, IorD mux, Memory, Instruction register, RegDst mux, Registers, A, B, Extender (16 to 32 bits), ALUSrcA and ALUSrcB muxes, ALU, ALUOut, MDR, MemtoReg mux. Control signals: MemRead, MemWrite, IRWrite, RegWrite, RegDst, ALUSrcA, ALUSrcB, ALUctr, nPCWrite, IorD, MemtoReg, ExtOp.]

Page 18

Step 4 : Branch

inst    Logical Register Transfers
BEQ     if R[rs] == R[rt] then PC <– PC + 4 + (SExt(Im16) || 00)
        else PC <– PC + 4

Step 1. Instruction Fetch: IR ← MEM[PC], PC ← PC + 4

Step 2. Instruction Decode and Register Fetch: A ← R[rs], B ← R[rt], ALUOut ← PC + (SExt(Im16) || 00)

Step 3. Branch completion: If A = B, PC ← ALUOut
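The branch steps compute the target during decode, before the comparison result is known, so the third step only has to choose whether to overwrite the PC. A minimal sketch under the same assumptions as before (dictionary memory, list register file); `execute_beq` and `sign_extend16` are hypothetical names.

```python
def sign_extend16(imm16):
    """SExt(Im16): interpret the low 16 bits as a two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def execute_beq(pc, memory, regs):
    """Run the three BEQ steps once and return the updated PC."""
    # Step 1: instruction fetch (PC already points past the branch)
    ir = memory[pc]
    pc = pc + 4
    # Step 2: decode, register fetch, and branch target computation
    rs, rt = (ir >> 21) & 0x1F, (ir >> 16) & 0x1F
    a, b = regs[rs], regs[rt]
    alu_out = (pc + (sign_extend16(ir & 0xFFFF) << 2)) & 0xFFFFFFFF
    # Step 3: branch completion
    if a == b:
        pc = alu_out
    return pc


memory = {0: 0x114B0003}             # beq $t2, $t3, +3 words

regs = [0] * 32
regs[10], regs[11] = 5, 5
taken_pc = execute_beq(0, memory, regs)

regs[11] = 6
not_taken_pc = execute_beq(0, memory, regs)
print(taken_pc, not_taken_pc)
```

Appending the two zero bits (`|| 00`) is the same as shifting the sign-extended offset left by 2, which is what the shift in the sketch does.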

Page 19

Step 4 : Branch

[Figure: multicycle datapath extended for branch. Components as before, plus: a Shift left 2 unit after the Extender; an ALUSrcB mux with inputs 0, 1, 2, 3 (including the constant 4); a PCSource mux selecting the next PC; the ALU Zero output. Control signals: MemRead, MemWrite, IRWrite, RegWrite, RegDst, ALUSrcA, ALUSrcB, ALUctr, IorD, MemtoReg, ExtOp, PCSource, PCWrite, PCWriteCond.]

Page 20

Multicycle Processor

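[Figure: the complete multicycle processor. Datapath: PC, IorD mux, Memory (Address, MemData, Write data), Instruction register (Instruction [31-26], [25-21], [20-16], [15-0], Inst [15-11]), RegDst mux, Registers (Rs, Rt, Rw, Write data, Read data 1, Read data 2), A, B, Extender (16 to 32 bits), Shift left 2, ALUSrcA mux, ALUSrcB mux (inputs 0 to 3, including the constant 4), ALU with Zero output, ALUOut, MDR, MemtoReg mux, PCSource mux. Control: the Control unit takes Op [5-0]; ALU Control takes Instruction [5-0] and ALUOp. Control signals: PCWrite, PCWriteCond, PCSource, IorD, MemRead, MemWrite, IRWrite, MemtoReg, RegDst, RegWrite, ALUSrcA, ALUSrcB, ALUOp, ExtOp.]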

Page 21

Summary of Instruction Steps

Instruction fetch (all instructions)
◦ IR = Memory[PC]
◦ PC = PC + 4

Instruction decode/register fetch (all instructions)
◦ A = Reg[IR[25-21]]
◦ B = Reg[IR[20-16]]
◦ ALUOut = PC + (sign-extend(IR[15-0]) << 2)

Execution, address computation, branch/jump completion
◦ R-type: ALUOut = A op B
◦ Memory reference: ALUOut = A + sign-extend(IR[15-0])
◦ Branch: if (A == B) then PC = ALUOut
◦ Jump: PC = PC[31-28] || (IR[25-0] << 2)

Memory access or R-type completion
◦ R-type: Reg[IR[15-11]] = ALUOut
◦ Load: MDR = Memory[ALUOut]
◦ Store: Memory[ALUOut] = B

Memory read completion
◦ Load: Reg[IR[20-16]] = MDR
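The bit ranges used in the step summary (IR[25-21], IR[15-0], and so on) are plain shift-and-mask operations. A minimal sketch, where the helper names `bits`, `decode_fields`, and `jump_target` are illustrative assumptions rather than anything defined in the slides:

```python
def bits(word, hi, lo):
    """Return word[hi-lo] (inclusive bit range) as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_fields(ir):
    """Split a 32-bit instruction register value into its fields."""
    return {
        "op": bits(ir, 31, 26),
        "rs": bits(ir, 25, 21),
        "rt": bits(ir, 20, 16),
        "rd": bits(ir, 15, 11),
        "funct": bits(ir, 5, 0),
        "imm16": bits(ir, 15, 0),
        "target": bits(ir, 25, 0),
    }


def jump_target(pc, ir):
    """PC[31-28] || (IR[25-0] << 2), where pc is the already incremented PC."""
    return (pc & 0xF0000000) | (bits(ir, 25, 0) << 2)


fields = decode_fields(0x014B6821)             # addu $t5, $t2, $t3
target = jump_target(0x00400004, 0x08000004)   # j with a 26-bit target of 4
print(fields)
print(hex(target))
```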

Page 22

Performance Evaluation

What is the average CPI?
◦ State diagram gives CPI for each instruction type
◦ Workload gives frequency of each type

Type          CPIi for type   Frequency   CPIi x freqIi
Arith/Logic   4               40%         1.6
Load          5               30%         1.5
Store         4               10%         0.4
Branch        3               20%         0.6

Average CPI: 4.1
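The average CPI is the frequency-weighted sum of the per-type CPIs. The short sketch below reproduces the arithmetic from the table; the dictionary name `workload` is just an illustrative choice.

```python
# (CPI for the type, frequency in the workload), taken from the table above.
workload = {
    "Arith/Logic": (4, 0.40),
    "Load": (5, 0.30),
    "Store": (4, 0.10),
    "Branch": (3, 0.20),
}

# Average CPI = sum over types of CPI_i * freq_i
average_cpi = sum(cpi * freq for cpi, freq in workload.values())
print(f"Average CPI: {average_cpi:.1f}")
```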

Page 23

Simple Questions

How many cycles will it take to execute this code?

        lw  $t2, 0($t3)
        lw  $t3, 4($t3)
        beq $t2, $t3, Label    # assume not taken
        add $t5, $t2, $t3
        sw  $t5, 8($t3)
Label:  ...

◦ 21 cycles

What is going on during the 8th cycle of execution?
◦ Address calculation to put on ALUOut

In what cycle does the actual addition of $t2 and $t3 take place?
◦ 16th cycle
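The three answers above follow from the per-instruction cycle counts (load 5, store 4, R-type 4, branch 3). A minimal sketch that checks them; `CYCLES`, `program`, and `locate` are hypothetical names introduced for the illustration.

```python
# Cycles per instruction in the multicycle design, from the step summary.
CYCLES = {"lw": 5, "sw": 4, "add": 4, "beq": 3}

# The code sequence from the question, with the branch not taken.
program = ["lw", "lw", "beq", "add", "sw"]


def locate(cycle, program):
    """Map a 1-based cycle number to (instruction index, step within it)."""
    for index, name in enumerate(program):
        if cycle <= CYCLES[name]:
            return index, cycle
        cycle -= CYCLES[name]
    raise ValueError("cycle is past the end of the program")


total_cycles = sum(CYCLES[name] for name in program)
print(total_cycles)            # total cycles for the sequence
print(locate(8, program))      # second lw, step 3: address computation
print(locate(16, program))     # add, step 3: execution (the addition)
```

Cycle 8 falls in step 3 of the second `lw` (cycles 6 to 10), and the `add` occupies cycles 14 to 17, so its execution step is cycle 16.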

Page 24

Summary

Disadvantages of the Single Cycle Processor
◦ Long cycle time
◦ Cycle time is too long for all instructions except the Load

Multiple Cycle Processor:
◦ Divide the instructions into smaller steps
◦ Execute each step (instead of the entire instruction) in one cycle

Partition datapath into equal size chunks to minimize cycle time

Follow same 5-step method for designing “real” processor