Pipelined Processor II (cont’d) CPSC 321

17
Pipelined Processor II (cont’d) CPSC 321 Andreas Klappenecker

description

Pipelined Processor II (cont’d) CPSC 321. Andreas Klappenecker. Pipelined Datapath. Pipeline separation registers, width varies. Control Lines. Instruction fetch: control signal to read instruction memory and to write PC are always asserted - nothing special here - PowerPoint PPT Presentation

Transcript of Pipelined Processor II (cont’d) CPSC 321

Page 1: Pipelined Processor II (cont’d) CPSC 321

Pipelined Processor II (cont’d)CPSC 321

Andreas Klappenecker

Page 2: Pipelined Processor II (cont’d) CPSC 321

Pipelined Datapath

Pipeline separation registers, width varies

Page 3: Pipelined Processor II (cont’d) CPSC 321

Control Lines

• Instruction fetch: • control signal to read instruction memory and to write

PC are always asserted - nothing special here

• Instruction decode/register file read:• same thing happens every clock cycle, so no optional

control lines to set

• Execution/address calculation• RegDst selects the result register, • ALUOp selects the ALU operation • ALUSrc selects Read data 2 or sign-extd. immediate

Page 4: Pipelined Processor II (cont’d) CPSC 321

Pipelined Datapath with Control Signals

Page 5: Pipelined Processor II (cont’d) CPSC 321

Control Lines

• Memory access• Branch set by branch equal• MemRead set by load instructions• MemWrite set by store instructions

• Write back• MemtoReg send ALU result or memory

value• RegWrite selects register

Page 6: Pipelined Processor II (cont’d) CPSC 321

Pipelined Datapath with Control Signals

Page 7: Pipelined Processor II (cont’d) CPSC 321

• Pass control signals along just like the data

Pipeline Control

Execution/Address Calculation stage control lines

Memory access stage control lines

Write-back stage control

lines

InstructionReg Dst

ALU Op1

ALU Op0

ALU Src Branch

Mem Read

Mem Write

Reg write

Mem to Reg

R-format 1 1 0 0 0 0 0 1 0lw 0 0 0 1 0 1 0 1 1sw X 0 0 1 0 0 1 0 Xbeq X 0 1 0 1 0 0 0 X

Control

EX

M

WB

M

WB

WB

IF/ID ID/EX EX/MEM MEM/WB

Instruction

Page 8: Pipelined Processor II (cont’d) CPSC 321

Datapath with Control

PC

Instructionmemory

Inst

ruct

ion

Add

Instruction[20– 16]

Me

mto

Re

g

ALUOp

Branch

RegDst

ALUSrc

4

16 32Instruction[15– 0]

0

0

Mux

0

1

Add Addresult

RegistersWriteregister

Writedata

Readdata 1

Readdata 2

Readregister 1

Readregister 2

Signextend

Mux

1

ALUresult

Zero

Writedata

Readdata

Mux

1

ALUcontrol

Shiftleft 2

Re

gWrit

e

MemRead

Control

ALU

Instruction[15– 11]

6

EX

M

WB

M

WB

WBIF/ID

PCSrc

ID/EX

EX/MEM

MEM/WB

Mux

0

1

Me

mW

rite

AddressData

memory

Address

Page 9: Pipelined Processor II (cont’d) CPSC 321

• Assume that the compiler has to guarantee that no hazards occur

• Where do we insert the “nops” ?

sub $2, $1, $3and $12, $2, $5or $13, $6, $2add $14, $2, $2sw $15, 100($2)

Data Hazards

Page 10: Pipelined Processor II (cont’d) CPSC 321

Data hazard: a dependency that “goes backward in time”

Dependencies

IM Reg

IM Reg

CC 1 CC 2 CC 3 CC 4 CC 5 CC 6

Time (in clock cycles)

sub $2, $1, $3

Programexecutionorder(in instructions)

and $12, $2, $5

IM Reg DM Reg

IM DM Reg

IM DM Reg

CC 7 CC 8 CC 9

10 10 10 10 10/– 20 – 20 – 20 – 20 – 20

or $13, $6, $2

add $14, $2, $2

sw $15, 100($2)

Value of register $2:

DM Reg

Reg

Reg

Reg

DM

Page 11: Pipelined Processor II (cont’d) CPSC 321

Forwarding

IM Reg

IM Reg

CC 1 CC 2 CC 3 CC 4 CC 5 CC 6

Time (in clock cycles)

sub $2, $1, $3

Programexecution order(in instructions)

and $12, $2, $5

IM Reg DM Reg

IM DM Reg

IM DM Reg

CC 7 CC 8 CC 9

10 10 10 10 10/– 20 – 20 – 20 – 20 – 20

or $13, $6, $2

add $14, $2, $2

sw $15, 100($2)

Value of register $2 :

DM Reg

Reg

Reg

Reg

X X X – 20 X X X X XValue of EX/MEM :X X X X – 20 X X X XValue of MEM/WB :

DM

Page 12: Pipelined Processor II (cont’d) CPSC 321

Forwarding

PCInstruction

memory

Registers

Mux

Mux

Control

ALU

EX

M

WB

M

WB

WB

ID/EX

EX/MEM

MEM/WB

Datamemory

Mux

Forwardingunit

IF/ID

Inst

ruct

ion

Mux

RdEX/MEM.RegisterRd

MEM/WB.RegisterRd

Rt

Rt

Rs

IF/ID.RegisterRd

IF/ID.RegisterRt

IF/ID.RegisterRt

IF/ID.RegisterRs

Page 13: Pipelined Processor II (cont’d) CPSC 321

• Load word can still cause a hazard: an instruction trying to read a register following a load instruction writing to the same register.

• Need a hazard detection unit to “stall” pipeline

Obstructions to Forwarding

Reg

IM

Reg

Reg

IM

CC 1 CC 2 CC 3 CC 4 CC 5 CC 6

Time (in clock cycles)

lw $2, 20($1)

Programexecutionorder(in instructions)

and $4, $2, $5

IM Reg DM Reg

IM DM Reg

IM DM Reg

CC 7 CC 8 CC 9

or $8, $2, $6

add $9, $4, $2

slt $1, $6, $7

DM Reg

Reg

Reg

DM

Page 14: Pipelined Processor II (cont’d) CPSC 321

Hazard Detection Unit

• Stall by letting an instruction that won’t write anything go forward

PCInstruction

memory

Registers

Mux

Mux

Mux

Control

ALU

EX

M

WB

M

WB

WB

ID/EX

EX/MEM

MEM/WB

Datamemory

Mux

Hazarddetection

unit

Forwardingunit

0

Mux

IF/ID

Inst

ruct

ion

ID/EX.MemRead

IF/I

DW

rite

PC

Wri

te

ID/EX.RegisterRt

IF/ID.RegisterRd

IF/ID.RegisterRt

IF/ID.RegisterRt

IF/ID.RegisterRs

RtRs

Rd

Rt EX/MEM.RegisterRd

MEM/WB.RegisterRd

Page 15: Pipelined Processor II (cont’d) CPSC 321

• When we decide to branch, other instructions are in

the pipeline!

• We are predicting “branch not taken”• need to add hardware for flushing instructions if we are wrong

Branch Hazards

Reg

Reg

CC 1

Time (in clock cycles)

40 beq $1, $3, 7

Programexecutionorder(in instructions)

IM Reg

IM DM

IM DM

IM DM

DM

DM Reg

Reg Reg

Reg

Reg

RegIM

44 and $12, $2, $5

48 or $13, $6, $2

52 add $14, $2, $2

72 lw $4, 50($7)

CC 2 CC 3 CC 4 CC 5 CC 6 CC 7 CC 8 CC 9

Reg

Page 16: Pipelined Processor II (cont’d) CPSC 321

Flushing Instructions

PCInstruction

memory

4

Registers

Mux

Mux

Mux

ALU

EX

M

WB

M

WB

WB

ID/EX

0

EX/MEM

MEM/WB

Datamemory

Mux

Hazarddetection

unit

Forwardingunit

IF.Flush

IF/ID

Signextend

Control

Mux

=

Shiftleft 2

Mux

Page 17: Pipelined Processor II (cont’d) CPSC 321

Dynamic Scheduling

• The hardware performs the “scheduling” • hardware tries to find instructions to execute

• out of order execution is possible

• speculative execution and dynamic branch prediction

• All modern processors are very complicated• DEC Alpha 21264: 9 stage pipeline, 6 instruction issue

• PowerPC and Pentium: branch history table

• Compiler technology important

• This class has given you the background you need to learn more - read Chapter 6!

• More material will be posted!