Pipelining III

Pipelining III Andreas Klappenecker CPSC321 Computer Architecture


Page 1: Pipelining III

Pipelining III

Andreas Klappenecker

CPSC321 Computer Architecture

Page 2: Pipelining III

Administrative Issues

Talk by Laszlo Kish
  Quantum Computing Seminar, Thursday 10:00am-11:00am, HRBB 302

Projects: Get started!!!

Page 3: Pipelining III

Pipelined Datapath

Pipeline separation registers, width varies

Page 4: Pipelining III

Control Lines

Instruction fetch:
  the control signals to read instruction memory and to write the PC are always asserted, so there is nothing special to control here

Instruction decode/register file read:
  the same thing happens every clock cycle, so there are no optional control lines to set

Execution/address calculation:
  RegDst selects the result register
  ALUOp selects the ALU operation
  ALUSrc selects either Read data 2 or the sign-extended immediate

Page 5: Pipelining III

Pipelined Datapath w/ Control Signals

Page 6: Pipelining III

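Control Lines

Memory access:
  Branch is set by branch equal (beq)
  MemRead is set by load instructions
  MemWrite is set by store instructions

Write back:
  MemtoReg selects whether the ALU result or the memory value is sent to the register file
  RegWrite enables writing the selected value into the register file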

Page 7: Pipelining III

Pipelined Datapath w/ Control Signals

Page 8: Pipelining III

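Pipeline Control

Pass control signals along just like the data.

Control lines by stage:
  Execution/Address Calculation stage control lines: RegDst, ALUOp1, ALUOp0, ALUSrc
  Memory access stage control lines: Branch, MemRead, MemWrite
  Write-back stage control lines: RegWrite, MemtoReg

Instruction  RegDst  ALUOp1  ALUOp0  ALUSrc  Branch  MemRead  MemWrite  RegWrite  MemtoReg
R-format     1       1       0       0       0       0        0         1         0
lw           0       0       0       1       0       1        0         1         1
sw           X       0       0       1       0       0        1         0         X
beq          X       0       1       0       1       0        0         0         X

[Figure: the Control unit produces the EX, M, and WB control fields from the instruction held in IF/ID. ID/EX carries EX, M, and WB; EX/MEM carries M and WB; MEM/WB carries WB.]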
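The idea of passing control signals along with the data can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the function names `decode` and `pass_along` are assumptions made for this sketch, while the bit values are copied from the control table on this slide (don't-care entries marked X are stored as `None`).

```python
# Control settings per instruction class, copied from the slide's table.
# None stands for a don't-care entry (X).
CONTROL = {
    "R-format": dict(RegDst=1, ALUOp1=1, ALUOp0=0, ALUSrc=0,
                     Branch=0, MemRead=0, MemWrite=0, RegWrite=1, MemtoReg=0),
    "lw":       dict(RegDst=0, ALUOp1=0, ALUOp0=0, ALUSrc=1,
                     Branch=0, MemRead=1, MemWrite=0, RegWrite=1, MemtoReg=1),
    "sw":       dict(RegDst=None, ALUOp1=0, ALUOp0=0, ALUSrc=1,
                     Branch=0, MemRead=0, MemWrite=1, RegWrite=0, MemtoReg=None),
    "beq":      dict(RegDst=None, ALUOp1=0, ALUOp0=1, ALUSrc=0,
                     Branch=1, MemRead=0, MemWrite=0, RegWrite=0, MemtoReg=None),
}

# Which control lines are consumed in which stage.
GROUPS = {
    "EX": ("RegDst", "ALUOp1", "ALUOp0", "ALUSrc"),
    "M":  ("Branch", "MemRead", "MemWrite"),
    "WB": ("RegWrite", "MemtoReg"),
}


def decode(kind):
    """Build the EX/M/WB control fields that are written into ID/EX."""
    word = CONTROL[kind]
    return {stage: {name: word[name] for name in names}
            for stage, names in GROUPS.items()}


def pass_along(kind):
    """Return the control fields held in ID/EX, EX/MEM and MEM/WB.

    Each pipeline register keeps only the groups that later stages need.
    """
    id_ex = decode(kind)
    ex_mem = {stage: id_ex[stage] for stage in ("M", "WB")}
    mem_wb = {"WB": ex_mem["WB"]}
    return id_ex, ex_mem, mem_wb


id_ex, ex_mem, mem_wb = pass_along("lw")
```

For a load, `mem_wb` ends up holding only the write-back group, which is all the last stage needs.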

Page 9: Pipelining III

Datapath with Control

[Figure: pipelined datapath with control. The Control unit decodes the instruction in the ID stage, and the EX, M, and WB control fields travel through the ID/EX, EX/MEM, and MEM/WB pipeline registers. Signals shown: RegDst, ALUOp, ALUSrc, Branch, MemRead, MemWrite, MemtoReg, RegWrite, PCSrc.]

Page 10: Pipelining III

Data Hazards

Assume that the compiler has to guarantee that no hazards occur.

Where do we insert the “nops”?

  sub $2, $1, $3
  and $12, $2, $5
  or  $13, $6, $2
  add $14, $2, $2
  sw  $15, 100($2)

Page 11: Pipelining III

Dependencies

Data hazard: a dependency that “goes backward in time”

[Figure: pipeline diagram over clock cycles CC 1 to CC 9, program execution order from top to bottom, each instruction drawn across its stages (IM, Reg, ALU, DM, Reg).]

  sub $2, $1, $3
  and $12, $2, $5
  or  $13, $6, $2
  add $14, $2, $2
  sw  $15, 100($2)

Value of register $2: 10 in CC 1 to CC 4, 10/-20 in CC 5, -20 in CC 6 to CC 9
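Which dependencies go backward in time can be checked with a small calculation. The sketch below assumes a five-stage pipeline in which instruction number i (counting from 0) is fetched in CC i+1, reads its registers in CC i+2, and writes its result in CC i+5, and it assumes the register file is written in the first half of a cycle and read in the second half (consistent with the 10/-20 entry for CC 5). The tuple layout and the function name are made up for this illustration.

```python
# (text, destination register, source registers) for the slide's sequence.
PROGRAM = [
    ("sub $2, $1, $3",   "$2",  ("$1", "$3")),
    ("and $12, $2, $5",  "$12", ("$2", "$5")),
    ("or $13, $6, $2",   "$13", ("$6", "$2")),
    ("add $14, $2, $2",  "$14", ("$2", "$2")),
    ("sw $15, 100($2)",  None,  ("$15", "$2")),
]


def backward_dependences(program):
    """List (producer, consumer, read_cc, write_cc) where the read is too early."""
    hazards = []
    for i, (producer, dest, _) in enumerate(program):
        if dest is None:
            continue                      # sw writes no register
        write_cc = i + 5                  # write-back stage of the producer
        for j in range(i + 1, len(program)):
            consumer, later_dest, sources = program[j]
            read_cc = j + 2               # register-read stage of the consumer
            if dest in sources and read_cc < write_cc:
                hazards.append((producer, consumer, read_cc, write_cc))
            if later_dest == dest:
                break                     # a newer value replaces this one
    return hazards


for producer, consumer, read_cc, write_cc in backward_dependences(PROGRAM):
    print(f"{consumer!r} reads in CC {read_cc}, "
          f"but {producer!r} writes in CC {write_cc}")
```

Under these assumptions only `and` and `or` read $2 too early; `add` reads it in the same cycle in which it is written, and `sw` reads it one cycle later.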

Page 12: Pipelining III

Resolution of Data Hazards

Solution:

  sub $2, $1, $3
  nop
  nop
  and $12, $2, $5
  or  $13, $6, $2
  add $14, $2, $2
  sw  $15, 100($2)

Problem: this slows us down!
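What the compiler would have to do can be sketched as a small nop-insertion pass. This is an illustration under stated assumptions, not a real compiler pass: it assumes no forwarding, a register file that is written before it is read within one cycle (so a consumer must issue at least three slots after its producer), and it only parses the instruction forms that appear on these slides. The names `parse`, `insert_nops`, and `MIN_DISTANCE` are hypothetical.

```python
import re

MIN_DISTANCE = 3  # assumed gap, in issue slots, between producer and consumer


def parse(instr):
    """Return (dest, sources) for the simple MIPS forms used on the slides."""
    op, _, rest = instr.partition(" ")
    regs = re.findall(r"\$\d+", rest)
    if not regs or op in ("sw", "beq"):
        return None, regs                 # these write no register
    return regs[0], regs[1:]              # first register is the destination


def insert_nops(program):
    """Pad the program with nops until every read is far enough from its write."""
    out = []
    last_write = {}                       # register -> slot of its latest writer
    for instr in program:
        dest, sources = parse(instr)
        earliest = max((last_write[r] + MIN_DISTANCE
                        for r in sources if r in last_write), default=0)
        while len(out) < earliest:
            out.append("nop")
        if dest is not None:
            last_write[dest] = len(out)
        out.append(instr)
    return out


program = [
    "sub $2, $1, $3",
    "and $12, $2, $5",
    "or $13, $6, $2",
    "add $14, $2, $2",
    "sw $15, 100($2)",
]
print("\n".join(insert_nops(program)))
```

For the slide's sequence the pass places two nops between `sub` and `and`; the remaining instructions are already far enough from `sub`.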

Page 13: Pipelining III

Forwarding

Do not wait until the result has been written. Use temporary results!
  Use register file forwarding to handle a read and a write to the same register
  ALU forwarding

Page 14: Pipelining III

Forwarding

[Figure: pipeline diagram over clock cycles CC 1 to CC 9, program execution order from top to bottom, with the result of sub forwarded from the pipeline registers to the following instructions.]

  sub $2, $1, $3
  and $12, $2, $5
  or  $13, $6, $2
  add $14, $2, $2
  sw  $15, 100($2)

Value of register $2: 10 in CC 1 to CC 4, 10/-20 in CC 5, -20 in CC 6 to CC 9
Value of EX/MEM: -20 in CC 4, X in all other cycles
Value of MEM/WB: -20 in CC 5, X in all other cycles

Page 15: Pipelining III

Forwarding

[Figure: pipelined datapath with a forwarding unit. Multiplexers in front of the ALU inputs are controlled by the forwarding unit, which receives the Rs and Rt register numbers together with EX/MEM.RegisterRd and MEM/WB.RegisterRd. The Rs, Rt, and Rd fields (IF/ID.RegisterRs, IF/ID.RegisterRt, IF/ID.RegisterRd) are carried through ID/EX.]
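The decision made by the forwarding unit can be sketched as a comparison of register numbers. The code below is a hedged illustration: the function name, the dictionary fields, and the select encoding (0b10 for the EX/MEM result, 0b01 for the MEM/WB result, 0b00 for the register file value) are assumptions made for this sketch, as is the rule that register $0 is never forwarded.

```python
def forward_select(source_reg, ex_mem, mem_wb):
    """Choose where one ALU operand comes from.

    source_reg: Rs or Rt number of the instruction now in the EX stage.
    ex_mem, mem_wb: dicts with 'RegWrite' and 'RegisterRd' for the two
    older instructions. The most recent result (EX/MEM) takes priority.
    """
    if (ex_mem["RegWrite"] and ex_mem["RegisterRd"] != 0
            and ex_mem["RegisterRd"] == source_reg):
        return 0b10                       # take the value held in EX/MEM
    if (mem_wb["RegWrite"] and mem_wb["RegisterRd"] != 0
            and mem_wb["RegisterRd"] == source_reg):
        return 0b01                       # take the value held in MEM/WB
    return 0b00                           # no hazard: use the register file value


# 'and $12, $2, $5' is in EX while 'sub $2, $1, $3' is in MEM.
sub_in_mem = {"RegWrite": 1, "RegisterRd": 2}
nothing = {"RegWrite": 0, "RegisterRd": 0}
select_rs = forward_select(2, sub_in_mem, nothing)
select_rt = forward_select(5, sub_in_mem, nothing)
```

One cycle later `or $13, $6, $2` is in EX, `and` is in MEM, and `sub` is in WB, so the same function picks the MEM/WB value for $2.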

Page 16: Pipelining III

Obstructions to Forwarding

Load word can still cause a hazard: an instruction trying to read a register following a load instruction writing to the same register.

Need a hazard detection unit to “stall” the pipeline.

[Figure: pipeline diagram over clock cycles CC 1 to CC 9, program execution order from top to bottom.]

  lw  $2, 20($1)
  and $4, $2, $5
  or  $8, $2, $6
  add $9, $4, $2
  slt $1, $6, $7

Page 17: Pipelining III

Stalling

We can stall the pipeline by keeping an instruction in the same stage.

[Figure: pipeline diagram over clock cycles CC 1 to CC 10, program execution order from top to bottom, with a bubble inserted after the load.]

  lw  $2, 20($1)
  and $4, $2, $5
  or  $8, $2, $6
  add $9, $4, $2
  slt $1, $6, $7
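The cost of the bubble shows up in the cycle count. As a rough sketch, assuming a five-stage pipeline that issues one instruction per cycle when it is not stalled (the function name is made up for this illustration):

```python
STAGES = 5  # IF, ID, EX, MEM, WB


def total_cycles(n_instructions, bubbles=0):
    """Cycles until the last instruction leaves the pipeline."""
    return STAGES + (n_instructions - 1) + bubbles


without_stall = total_cycles(5)            # five instructions, no bubble
with_stall = total_cycles(5, bubbles=1)    # one bubble after the load
```

The five-instruction sequence takes 9 cycles without the stall and 10 cycles with one bubble, which matches the diagrams running to CC 9 and CC 10.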

Page 18: Pipelining III

Hazard Detection Unit

Stall by letting an instruction that won’t write anything go forward.

[Figure: pipelined datapath with a hazard detection unit and a forwarding unit. The hazard detection unit receives ID/EX.MemRead, ID/EX.RegisterRt, IF/ID.RegisterRs, and IF/ID.RegisterRt. It drives PCWrite, IF/IDWrite, and a multiplexer that replaces the control signals entering ID/EX with 0. The forwarding unit receives Rs, Rt, EX/MEM.RegisterRd, and MEM/WB.RegisterRd.]
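The check performed by the hazard detection unit can be sketched from the signal names in the figure (ID/EX.MemRead, ID/EX.RegisterRt, IF/ID.RegisterRs, IF/ID.RegisterRt, PCWrite, IF/IDWrite). The function names and the `ZeroControl` key are assumptions for this illustration; the comparison itself is the usual load-use test and is a sketch, not a description of a particular implementation.

```python
def load_use_hazard(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """True if the instruction in ID needs the register a load in EX will write."""
    return bool(id_ex_mem_read) and id_ex_rt in (if_id_rs, if_id_rt)


def hazard_controls(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """Signals for one cycle: hold PC and IF/ID, and send a nop down the pipe."""
    stall = load_use_hazard(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt)
    return {
        "PCWrite": not stall,       # keep fetching the same instruction
        "IF/IDWrite": not stall,    # keep decoding the same instruction
        "ZeroControl": stall,       # the mux selects 0 for all control lines
    }


# 'lw $2, 20($1)' is in EX (Rt = 2); 'and $4, $2, $5' is in ID (Rs = 2, Rt = 5).
signals = hazard_controls(id_ex_mem_read=1, id_ex_rt=2, if_id_rs=2, if_id_rt=5)
```

Zeroing the control lines is what makes the stalled slot an instruction that won't write anything.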

Page 19: Pipelining III

Branch Hazards

When we decide to branch, other instructions are in the pipeline!

We are predicting “branch not taken”
  need to add hardware for flushing instructions if we are wrong

[Figure: pipeline diagram over clock cycles CC 1 to CC 9, program execution order from top to bottom, with instruction addresses.]

  40 beq $1, $3, 7
  44 and $12, $2, $5
  48 or  $13, $6, $2
  52 add $14, $2, $2
  72 lw  $4, 50($7)
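The addresses in this example follow from the MIPS branch target calculation: the 16-bit offset is sign-extended, shifted left by 2, and added to the address of the instruction after the branch. A minimal sketch, with function names chosen for this illustration:

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, offset_field):
    """Target of a taken beq located at address pc."""
    return pc + 4 + (sign_extend16(offset_field) << 2)


target = branch_target(40, 7)   # beq $1, $3, 7 at address 40
```

With the branch at address 40 and an offset of 7, the target is 40 + 4 + 28 = 72, the address of the `lw` in the diagram.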

Page 20: Pipelining III

Flushing

Move the branch decision from the 4th pipeline stage to the second
  only one instruction following the branch will be in the pipeline

IF.Flush turns the fetched instruction into a nop by zeroing the IF/ID pipeline register
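The benefit of an earlier branch decision can be estimated with a short calculation. In the sketch below, the number of wrong-path instructions is the decision stage minus one. The CPI formula assumes a base CPI of 1 and a penalty only for taken branches under the "branch not taken" prediction. The branch and taken fractions in the example are made-up numbers chosen purely for illustration.

```python
def wrong_path_instructions(decision_stage):
    """Instructions fetched after the branch before its outcome is known."""
    return decision_stage - 1


def effective_cpi(branch_fraction, taken_fraction, decision_stage):
    """Base CPI of 1 plus the flush penalty paid on taken branches."""
    penalty = wrong_path_instructions(decision_stage)
    return 1.0 + branch_fraction * taken_fraction * penalty


# Hypothetical workload: 20% branches, half of them taken.
cpi_stage4 = effective_cpi(0.2, 0.5, decision_stage=4)
cpi_stage2 = effective_cpi(0.2, 0.5, decision_stage=2)
```

With the decision in the 4th stage three instructions are flushed on a taken branch; with the decision in the 2nd stage only one is.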

Page 21: Pipelining III

Flushing Instructions

[Figure: pipelined datapath with the branch decision moved to the ID stage. It shows a register comparator (=), the sign-extend and shift-left-2 logic for the branch address, and the IF.Flush signal into the IF/ID register, together with the hazard detection unit and the forwarding unit.]

Page 22: Pipelining III

Improving Performance

Try to avoid stalls by reordering instructions

Add a “branch delay slot”
  the next instruction after a branch is always executed
  rely on the compiler to “fill” the slot with something useful

Superscalar: start more than one instruction in the same cycle

Page 23: Pipelining III

Dynamic Scheduling

The hardware performs the “scheduling”
  hardware tries to find instructions to execute
  out-of-order execution is possible
  speculative execution and dynamic branch prediction

All modern processors are very complicated
  DEC Alpha 21264: 9 stage pipeline, 6 instruction issue
  PowerPC and Pentium: branch history table
  Compiler technology important
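To give a feel for dynamic branch prediction, here is a minimal sketch of a one-bit branch history table. It is a generic textbook-style scheme and not a description of how the PowerPC or Pentium tables are organized; the class name, the table size, and the indexing by low-order word-address bits are all assumptions made for this illustration.

```python
class BranchHistoryTable:
    """One bit per entry: was the branch taken the last time it ran?"""

    def __init__(self, entries=16):
        self.entries = entries
        self.taken = [False] * entries    # start by predicting "not taken"

    def _index(self, pc):
        return (pc >> 2) % self.entries   # drop the byte offset, keep low bits

    def predict(self, pc):
        return self.taken[self._index(pc)]

    def update(self, pc, was_taken):
        self.taken[self._index(pc)] = was_taken


def count_mispredictions(table, pc, outcomes):
    """Run one branch through the table and count wrong predictions."""
    misses = 0
    for was_taken in outcomes:
        if table.predict(pc) != was_taken:
            misses += 1
        table.update(pc, was_taken)
    return misses


# A loop branch that is taken nine times and then falls through.
loop_outcomes = [True] * 9 + [False]
misses = count_mispredictions(BranchHistoryTable(), pc=40, outcomes=loop_outcomes)
```

For this loop the one-bit scheme mispredicts twice: once on the first iteration and once on the loop exit.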

This class has given you the background you need to learn more - read Chapter 6!

More material will be posted!