Putting it All Together:A Single Cycle Datapath

35
CS 61C L5.2.1 CPU Design III (1) K. Meinz, Summer 2004 © UCB CS61C : Machine Structures Lecture 5.2.1 CPU Design III: Control 2004-07-21 Kurt Meinz inst.eecs.berkeley.edu/~cs61c

description

CS61C : Machine Structures Lecture 5.2.1 CPU Design III: Control 2004-07-21 Kurt Meinz inst.eecs.berkeley.edu/~cs61c. Inst Memory. Adr. Adder. Mux. Adder. Putting it All Together:A Single Cycle Datapath. Instruction. . . . . Rs. Rt. Rd. Imm16. - PowerPoint PPT Presentation

Transcript of Putting it All Together:A Single Cycle Datapath

Page 1: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (1) K. Meinz, Summer 2004 © UCB

CS61C : Machine Structures

Lecture 5.2.1

CPU Design III: Control

2004-07-21

Kurt Meinz

inst.eecs.berkeley.edu/~cs61c

Page 2: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (2) K. Meinz, Summer 2004 © UCB

Putting it All Together:A Single Cycle Datapath

imm

16

32

ALUctr

Clk

busW

RegWr

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

RdRegDst

Exten

der

Mu

x

3216imm16

ALUSrcExtOp

Mu

x

MemtoReg

Clk

Data InWrEn32 Adr

DataMemory

MemWrA

LU

Equal

Instruction<31:0>

0

1

0

1

01

<21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRtRs

=

Ad

der

Ad

der

PC

Clk

00Mu

x

1

nPC_sel

PC

Ext

Adr

InstMemory

Page 3: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (3) K. Meinz, Summer 2004 © UCB

Recall: Clocking Methodology

• All storage elements are clocked by the same clock edge

•Cycle Time = CLK-to-Q + Longest Delay Path + Setup + Clock Skew

Clk

Don’t Care

Setup Hold

.

.

.

.

.

.

.

.

.

.

.

.

Setup Hold

Page 4: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (4) K. Meinz, Summer 2004 © UCB

Clocking Methodology

• Storage elements clocked by same edge• Being physical devices, flip-flops (FF) and combinational logic have some delays •Gates: delay from input change to output change •Signals at FF D input must be stable before active clock edge to allow signal to travel within the FF, and we have the usual clock-to-Q delay

• “Critical path” (longest path through logic) determines length of clock period

Clk

.

.

.

.

.

.

.

.

.

.

.

.

Page 5: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (5) K. Meinz, Summer 2004 © UCB

An Abstract View of the Critical Path Critical Path (Load Operation) =

Delay clock through PC (FFs) + Instruction Memory’s Access Time + Register File’s Access Time + ALU to Perform a 32-bit Add + Data Memory Access Time + Stable Time for Register File Write

Clk

5

Rw Ra Rb32 32-bitRegisters

RdA

LU

Clk

Data In

DataAddress Ideal

DataMemory

Instruction

InstructionAddress

IdealInstruction

Memory

Clk

PC

5Rs

5Rt

16Imm

32

323232

A

B

Nex

t A

dd

ress

• This affects how much you can overclock your PC!

Page 6: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (6) K. Meinz, Summer 2004 © UCB

How to Design a Processor: step-by-step• 1. Analyze instruction set architecture (ISA)

=> datapath requirements•meaning of each instruction is given by the register transfers• datapath must include storage element for ISA registers• datapath must support each register transfer

• 2. Select set of datapath components and establish clocking methodology

• 3. Assemble datapath meeting requirements• 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.

• 5. Assemble the control logic (hard part!)

Page 7: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (7) K. Meinz, Summer 2004 © UCB

A Single Cycle Datapath• Rs, Rt, Rd, Imed16 connected to datapath

• We have everything except control signals

32

ALUctr

Clk

busW

RegWr

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

RdRegDst

Exten

der

Mu

x

Mux

3216imm16

ALUSrc

ExtOp

Mu

x

MemtoReg

Clk

Data InWrEn

32Adr

DataMemory

32

MemWrA

LU

InstructionFetch Unit

Clk

Zero

Instruction<31:0>

0

1

0

1

01<

21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

nPC_sel

Page 8: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (8) K. Meinz, Summer 2004 © UCB

Meaning of the Control Signals: IF• nPC_MUX_sel: 0 PC <– PC + 4

1 PC <– PC + 4 + {SignExt(Im16) , 00 }

Adr

InstMemory

Ad

der

Ad

der

PC

Clk

00Mu

x

1

nPC_MUX_sel

PC

Ext

imm

16

“n”=next

Page 9: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (9) K. Meinz, Summer 2004 © UCB

Meaning of the Control Signals• ExtOp: “zero”, “sign”

• ALUsrc: 0 regB; 1 immed

• ALUctr: “add”, “sub”, “or”

° MemWr: 1 write memory° MemtoReg: 0 ALU; 1 Mem° RegDst: 0 “rt”; 1 “rd”° RegWr: 1 write register

32

ALUctr

Clk

busW

RegWr

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

RdRegDst

Exten

der

Mu

x

3216imm16

ALUSrcExtOp

Mu

x

MemtoReg

Clk

Data InWrEn32 Adr

DataMemory

MemWr

AL

U

Equal

0

1

0

1

01

=

Page 10: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (10) K. Meinz, Summer 2004 © UCB

RTL: The Add Instruction

add rd, rs, rt•MEM[PC] Fetch the instruction

from memory

•R[rd] = R[rs] + R[rt] The actual operation

•PC = PC + 4 Calculate the next instruction’s address

op rs rt rd shamt funct

061116212631

6 bits 6 bits5 bits5 bits5 bits5 bits

Page 11: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (11) K. Meinz, Summer 2004 © UCB

Instruction Fetch Unit at the Beginning of Add• Fetch the instruction from Instruction memory: Instruction = MEM[PC]• same for all instructions

PC

Ext

Adr

InstMemory

Ad

der

Ad

der

PC

Clk

00Mu

x

1

nPC_MUX_sel

imm

16Instruction<31:0>

Page 12: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (12) K. Meinz, Summer 2004 © UCB

The Single Cycle Datapath during Add

32

ALUctr = Add

Clk

busW

RegWr = 1

32

32

busA

32

busB

55 5

Rw Ra Rb

32 32-bitRegisters

Rs

Rt

Rt

RdRegDst = 1

Exten

der

Mu

x

Mux

3216imm16

ALUSrc = 0

ExtOp = x

Mu

x

MemtoReg = 0

Clk

Data InWrEn

32

Adr

DataMemory

32

MemWr = 0A

LU

InstructionFetch Unit

Clk

Zero

Instruction<31:0>• R[rd] = R[rs] + R[rt]

0

1

0

1

01<

21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

op rs rt rd shamt funct

061116212631

nPC_sel= +4

Page 13: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (13) K. Meinz, Summer 2004 © UCB

Instruction Fetch Unit at the End of Add• PC = PC + 4

• This is the same for all instructions except: Branch and Jump

Adr

InstMemory

Ad

der

Ad

der

PC

Clk

00Mu

x

1

nPC_MUX_sel

imm

16Instruction<31:0>

0

1

Page 14: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (14) K. Meinz, Summer 2004 © UCB

Single Cycle Datapath during Or Immediate?

op rs rt immediate

016212631

32

ALUctr =

Clk

busW

RegWr =

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

RdRegDst =

Exten

der

Mu

x

Mux

3216imm16

ALUSrc =

ExtOp =

Mu

x

MemtoReg =

Clk

Data InWrEn

32Adr

DataMemory

32

MemWr = A

LU

InstructionFetch Unit

Clk

Zero

Instruction<31:0>

0

1

0

1

01<

21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

nPC_sel =

• R[rt] = R[rs] OR ZeroExt[Imm16]

Page 15: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (15) K. Meinz, Summer 2004 © UCB

32

ALUctr = Or

Clk

busW

RegWr = 1

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

RdRegDst = 0

Exten

der

Mu

x

Mux

3216imm16

ALUSrc = 1

ExtOp = 0

Mu

x

MemtoReg = 0

Clk

Data InWrEn

32Adr

DataMemory

32

MemWr = 0A

LU

InstructionFetch Unit

Clk

Zero

Instruction<31:0>

• R[rt] = R[rs] OR ZeroExt[Imm16]

0

1

0

1

01<

21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

op rs rt immediate

016212631

nPC_sel= +4

Single Cycle Datapath during Or Immediate

Page 16: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (16) K. Meinz, Summer 2004 © UCB

The Single Cycle Datapath during Load?

32

ALUctr =

Clk

busW

RegWr =

32

32

busA

32

busB

55 5

Rw Ra Rb

32 32-bitRegisters

Rs

Rt

Rt

RdRegDst =

Exten

der

Mu

x

Mux

3216imm16

ALUSrc =

ExtOp =

Mu

x

MemtoReg =

Clk

Data InWrEn

32

Adr

DataMemory

32

MemWr =A

LU

InstructionFetch Unit

Clk

Zero

Instruction<31:0>

0

1

0

1

01<

21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

• R[rt] = Data Memory {R[rs] + SignExt[imm16]}

op rs rt immediate

016212631

nPC_sel=

Page 17: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (17) K. Meinz, Summer 2004 © UCB

The Single Cycle Datapath during Load

32

ALUctr = Add

Clk

busW

RegWr = 1

32

32

busA

32

busB

55 5

Rw Ra Rb

32 32-bitRegisters

Rs

Rt

Rt

RdRegDst = 0

Exten

der

Mu

x

Mux

3216imm16

ALUSrc = 1

ExtOp = 1

Mu

x

MemtoReg = 1

Clk

Data InWrEn

32

Adr

DataMemory

32

MemWr = 0A

LU

InstructionFetch Unit

Clk

Zero

Instruction<31:0>

0

1

0

1

01<

21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

• R[rt] = Data Memory {R[rs] + SignExt[imm16]}

op rs rt immediate

016212631

nPC_sel= +4

Page 18: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (18) K. Meinz, Summer 2004 © UCB

The Single Cycle Datapath during Store?

op rs rt immediate

016212631

32

ALUctr =

Clk

busW

RegWr =

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

RdRegDst =

Exten

der

Mu

x

Mux

3216imm16

ALUSrc =

ExtOp =

Mu

x

MemtoReg =

Clk

Data InWrEn

32Adr

DataMemory

32

MemWr = A

LU

InstructionFetch Unit

Clk

Zero

Instruction<31:0>

0

1

0

1

01<

21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

nPC_sel =

• Data Memory {R[rs] + SignExt[imm16]} = R[rt]

Page 19: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (19) K. Meinz, Summer 2004 © UCB

The Single Cycle Datapath during Store

32

ALUctr = Add

Clk

busW

RegWr = 0

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

RdRegDst = x

Exten

der

Mu

x

Mux

3216imm16

ALUSrc = 1

ExtOp = 1

Mu

x

MemtoReg = x

Clk

Data InWrEn

32Adr

DataMemory

32

MemWr = 1A

LU

InstructionFetch Unit

Clk

Zero

Instruction<31:0>

0

1

0

1

01<

21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

• Data Memory {R[rs] + SignExt[imm16]} = R[rt]op rs rt immediate

016212631

nPC_sel= +4

Page 20: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (20) K. Meinz, Summer 2004 © UCB

The Single Cycle Datapath during Branch?

32

ALUctr =

Clk

busW

RegWr =

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

RdRegDst =

Exten

der

Mu

x

Mux

3216imm16

ALUSrc =

ExtOp =

Mu

x

MemtoReg = x

Clk

Data InWrEn

32Adr

DataMemory

32

MemWr =A

LU

InstructionFetch Unit

Clk

Zero

Instruction<31:0>

0

1

0

1

01<

21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

• if (R[rs] - R[rt] == 0) then Zero = 1 ; else Zero = 0op rs rt immediate

016212631

nPC_sel=

Page 21: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (21) K. Meinz, Summer 2004 © UCB

The Single Cycle Datapath during Branch

32

ALUctr =Sub

Clk

busW

RegWr = 0

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

RdRegDst = x

Exten

der

Mu

x

Mux

3216imm16

ALUSrc = 0

ExtOp = x

Mu

x

MemtoReg = x

Clk

Data InWrEn

32Adr

DataMemory

32

MemWr = 0A

LU

InstructionFetch Unit

Clk

Zero

Instruction<31:0>

0

1

0

1

01<

21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

• if (R[rs] - R[rt] == 0) then Zero = 1 ; else Zero = 0op rs rt immediate

016212631

nPC_sel= “Br”

Page 22: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (22) K. Meinz, Summer 2004 © UCB

Instruction Fetch Unit at the End of Branch• if (Zero == 1) then PC = PC + 4 + SignExt[imm16]*4 ;

else PC = PC + 4

op rs rt immediate

016212631

• What is encoding of nPC_sel?•Direct MUX select?

•Branch / not branch

• Let’s pick 2nd option

nPC_sel zero? MUX0 x 01 0 01 1 1

Adr

InstMemory

Ad

derA

dder

PC

Clk

00Mu

x

1

nPC_sel

imm

16

Instruction<31:0>

0

1

Zero

nPC_MUX_sel

Page 23: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (23) K. Meinz, Summer 2004 © UCB

How to Design a Processor: step-by-step• 1. Analyze instruction set architecture (ISA)

=> datapath requirements•meaning of each instruction is given by the register transfers• datapath must include storage element for ISA registers• datapath must support each register transfer

• 2. Select set of datapath components and establish clocking methodology

• 3. Assemble datapath meeting requirements• 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.

• 5. Assemble the control logic (hard part!)

Page 24: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (24) K. Meinz, Summer 2004 © UCB

Step 5: Given Datapath: RTL -> Control

ALUctrRegDst ALUSrcExtOp MemtoRegMemWr Zero

Instruction<31:0>

<21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

nPC_sel

Adr

InstMemory

DATA PATH

Control

Op

<0:5>

Fun

RegWr

<26:31>

Page 25: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (25) K. Meinz, Summer 2004 © UCB

A Summary of the Control Signals (1/2)

inst Register Transfer

ADD R[rd] <– R[rs] + R[rt]; PC <– PC + 4

ALUsrc = RegB, ALUctr = “add”, RegDst = rd, RegWr, nPC_sel = “+4”

SUB R[rd] <– R[rs] – R[rt]; PC <– PC + 4

ALUsrc = RegB, ALUctr = “sub”, RegDst = rd, RegWr, nPC_sel = “+4”

ORi R[rt] <– R[rs] + zero_ext(Imm16); PC <– PC + 4

ALUsrc = Im, Extop = “Z”, ALUctr = “or”, RegDst = rt, RegWr, nPC_sel =“+4”

LOAD R[rt] <– MEM[ R[rs] + sign_ext(Imm16)]; PC <– PC + 4

ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemtoReg, RegDst = rt, RegWr, nPC_sel = “+4”

STORE MEM[ R[rs] + sign_ext(Imm16)] <– R[rs]; PC <– PC + 4

ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemWr, nPC_sel = “+4”

BEQ if ( R[rs] == R[rt] ) then PC <– PC + sign_ext(Imm16)] || 00 else PC <– PC + 4

nPC_sel = “Br”, ALUctr = “sub”

Page 26: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (26) K. Meinz, Summer 2004 © UCB

A Summary of the Control Signals (2/2)

add sub ori lw sw beq jump

RegDst

ALUSrc

MemtoReg

RegWrite

MemWrite

nPCsel

Jump

ExtOp

ALUctr<2:0>

1

0

0

1

0

0

0

x

Add

1

0

0

1

0

0

0

x

Subtract

0

1

0

1

0

0

0

0

Or

0

1

1

1

0

0

0

1

Add

x

1

x

0

1

0

0

1

Add

x

0

x

0

0

1

0

x

Subtract

x

x

x

0

0

0

1

x

xxx

op target address

op rs rt rd shamt funct

061116212631

op rs rt immediate

R-type

I-type

J-type

add, sub

ori, lw, sw, beq

jump

func

op 00 0000 00 0000 00 1101 10 0011 10 1011 00 0100 00 0010Appendix A10 0000See 10 0010 We Don’t Care :-)

Page 27: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (27) K. Meinz, Summer 2004 © UCB

Review: Finite State Machine (FSM)

• States represent possible output values.

• Transitions represent changes between statesbased on inputs.

• Implement with CL andclocked registerfeedback.

Page 28: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (28) K. Meinz, Summer 2004 © UCB

Finite State Machines extremely useful!

• They define •How output signals respond to input signals and previous state.

•How we change states depending on input signals and previous state

• The output signals could be our familiar control signals•Some control signals may only depend on CL, not on state at all…

• We could implement very detailed CL blocks w/Programmable Logic Arrays

Page 29: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (29) K. Meinz, Summer 2004 © UCB

Taking advantage of sum-of-products

• Since sum-of-products is a convenient notation and way to think about design, offer hardware building blocks that match that notation

• One example isProgrammable Logic Arrays (PLAs)

• Designed so that can select (program) ands, ors, complements after you get the chip• Late in design process, fix errors, figure out what to do later, …

Page 30: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (30) K. Meinz, Summer 2004 © UCB

• • •

inputs

ANDarray

• • •

outputs

ORarrayproduct

terms

Programmable Logic Arrays• Pre-fabricated building block of many

AND/OR gates• “Programmed” or “Personalized" by making or

breaking connections among gates

• Programmable array block diagram for sum of products form

And Programming:• How many inputs?• How to combine inputs?• How many product terms?

Or Programming:• How to combine product terms?• How many outputs?

Page 31: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (31) K. Meinz, Summer 2004 © UCB

example:

F0 = A + B' C'F1 = A C' + A BF2 = B' C' + A BF3 = B' C + A

personality matrix1 = uncomplemented in term0 = complemented in term– = does not participate

1 = term connected to output0 = no connection to output

input side: 3 inputs

output side: 4 outputsProduct inputs outputsterm A B C F0 F1 F2 F3

AB 1 1 – 0 1 1 0B'C – 0 1 0 0 0 1AC' 1 – 0 0 1 0 0B'C' – 0 0 1 0 1 0A 1 – – 1 0 0 1 reuse of terms;

5 product terms

Enabling Concept• Shared product terms among outputs

Page 32: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (32) K. Meinz, Summer 2004 © UCB

Before Programming

• All possible connections available before "programming"

Page 33: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (33) K. Meinz, Summer 2004 © UCB

A B C

F1 F2 F3F0

AB

B'C

AC'

B'C'

A

After Programming• Unwanted connections are "blown"

• Fuse (normally connected, break unwanted ones)

• Anti-fuse (normally disconnected, make wanted connections)

Page 34: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (34) K. Meinz, Summer 2004 © UCB

notation for implementingF0 = A B + A' B'F1 = C D' + C' D

AB+A'B'CD'+C'D

AB

A'B'

CD'

C'D

A B C D

Alternate Representation

• Short-hand notation--don't have to draw all the wires• X Signifies a connection is present and

perpendicular signal is an input to gate

Page 35: Putting it All Together:A Single Cycle Datapath

CS 61C L5.2.1 CPU Design III (35) K. Meinz, Summer 2004 © UCB

°5 steps to design a processor• 1. Analyze instruction set => datapath requirements• 2. Select set of datapath components & establish clock

methodology• 3. Assemble datapath meeting the requirements• 4. Analyze implementation of each instruction to

determine setting of control points that effects the register transfer.• 5. Assemble the control logic

°Control is the hard part°MIPS makes that easier• Instructions same size• Source registers always in same place• Immediates same size, location• Operations always on registers/immediates

And in Conclusion… Single cycle control

Control

Datapath

Memory

ProcessorInput

Output