Lecture 16: Pipeline Controlsjtang/cs411.s20/lectures/L16... · 2020-03-26 · Lecture 16: Pipeline...
Transcript of Lecture 16: Pipeline Controlsjtang/cs411.s20/lectures/L16... · 2020-03-26 · Lecture 16: Pipeline...
Instruction Pipelining
• Goal is to start executing next instruction on every clock cycle
• Pipelining only works when there are no resource contentions
• If so, then either intentionally stall pipeline or duplicate those resources
�3
Fetch Decode Execute Memory Write Back
Fetch Decode Execute Memory Write Back
Fetch Decode Execute Memory Write Back
Fetch Decode Execute Memory Write Back
Fetch Decode Execute Memory Write Back
Time
Program Flow
Multicycle Datapath
�4
IF: Instruction Fetch ID: Instruction Decode / Register
File Read
EX: Execute / Address Calculation
MEM: Mem Access WB: Write Back
Register FileDataA Sel
B Sel
W Sel
ALU
ExtendData
Memory
addr readdata
write data
adder
PC
4
Instruction Memory
addr readdata
Pipelined Datapath
�5
IF: Instruction Fetch ID: Instruction Decode / Register
File Read
EX: Execute / Address Calculation
MEM: Mem Access WB: Write Back
Register FileDataA Sel
B Sel
W Sel
ALU
ExtendData
Memory
addr readdata
write data
adder
PC
4
Instruction Memory
addr readdata
IF /
ID
ID /
EX
EX /
MEM
MEM
/ W
B
Pipeline Registers
�6
IF: Instruction Fetch ID: Instruction Decode / Register
File Read
EX: Execute / Address Calculation
MEM: Mem Access WB: Write Back
Register FileDataA Sel
B Sel
W Sel
ALU
ExtendData
Memory
addr readdata
write data
adder
PC
4
Instruction Memory
addr readdata
PC
IR
PC
A
B
Ext
Out
B
M
Out
Pipeline Controls
�7
IF: Instruction Fetch ID: Instr Decode EX: Execute MEM: Mem Access WB: Write
Register FileDataA Sel
B Sel
W Sel
ALU
ExtendData
Memory
addr readdata
write data
adder
PC
4
Instruction Memory
addr readdata
PC
IR
PC
A
B
Ext
Out
B
Out
Decoder
EX Ctrls
MEM Ctrls
WB Ctrls
MEM CtrlsWB Ctrls
WB Ctrls
M
ID ControlsEX Controls MEM Controls WB Controls
Controls and Datapath
�8
A ← R[n]B ← R[m]
Imm* ← Extend(*)PC ← ALUOut
ALUOut ← A op B
R[d] ← ALUOut
ALUOut ← A + Imm9
MDR ← Mem[ALUOut]
R[d] ← MDR PC ← ALUOut
Mem[ALUOut] ← B
ALUOut ← PrevPC + Imm19
ALUOut ← A + 0
IR ← Mem[PC]ALUOut ← PC + 4
ID Controls
EX Controls
MEM Controls
WB Controls
IF Controls
Pipeline Control Generation
• In single and multicycle datapaths, a single instruction decoder takes instruction register and generates control signals
• In pipelined datapaths, either:
• Single instruction decoder decodes upfront, then writes controls into data stationary registers to be used on subsequent cycles, or
• Instruction register value itself is copied into data stationary registers, and then a local decoder generates signals for that portion of the datapath
�9
Pipelining Instructions
• Pipeline conflict (specifically, a structural hazard) when two instructions try to write to register file on same clock cycle
• Only one write port to register file
• Caused by uneven pipeline stages
�10
IF Ctrls R-TypeID Ctrls
R-TypeEX Ctrls
R-Type WB CtrlsR-Type
R-Type IF Ctrls R-TypeID Ctrls
R-TypeEX Ctrls
R-Type WB Ctrls
ClkCycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Cycle 9 Cycle 10
load IF Ctrls loadID Ctrls
loadEX Ctrls
loadMEM Ctrl
loadWB Ctrls
R-Type IF Ctrls R-TypeID Ctrls
R-TypeEX Ctrls
R-Type WB Ctrls
R-Type IF Ctrls R-TypeID Ctrls
R-TypeEX Ctrls
R-Type WB Ctrls
Solution 1: Insert Bubble into Pipeline
• Insert bubble into pipeline to prevent simultaneous writes
• Control logic can be complex
• No instruction started on Cycle 7
�11
IF Ctrls R-TypeID Ctrls
R-TypeEX Ctrls
R-Type WB CtrlsR-Type
R-Type IF Ctrls R-TypeID Ctrls
R-TypeEX Ctrls
R-Type WB Ctrls
ClkCycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Cycle 9 Cycle 10
load IF Ctrls loadID Ctrls
loadEX Ctrls
loadMEM Ctrl
loadWB Ctrls
R-Type IF Ctrls R-TypeID Ctrls
R-TypeEX Ctrls
R-Type WB Ctrls
R-Type IF Ctrls R-TypeID Ctrls
R-TypeEX Ctrls
R-Type WB Ctrls
R-Type IF Ctrls R-TypeID Ctrls
R-TypeEX Ctrls
R-Type WB Ctrls
R-Type IF Ctrls R-TypeID Ctrls
R-TypeEX Ctrls
bubble
bubble
bubble
bubble bubble bubble bubble
Solution 2: Delay Write by One Cycle
• Delay R-Type’s write by one cycle
• R-Type MEM Ctrl is a no-op; it generates no controls
• Now pipeline has same length for all instruction types
�12
IF Ctrls R-TypeID Ctrls
R-TypeEX Ctrls
R-Type WB CtrlsR-Type
R-Type IF Ctrls R-TypeID Ctrls
R-TypeEX Ctrls
R-Type WB Ctrls
ClkCycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Cycle 9 Cycle 10
load IF Ctrls loadID Ctrls
loadEX Ctrls
loadMEM Ctrl
loadWB Ctrls
R-Type IF Ctrls R-TypeID Ctrls
R-TypeEX Ctrls
R-Type WB Ctrls
R-Type IF Ctrls R-TypeID Ctrls
R-TypeEX Ctrls
R-Type WB Ctrls
R-Type IF Ctrls R-TypeID Ctrls
R-TypeEX Ctrls
R-Type WB Ctrls
R-Type IF Ctrls R-TypeID Ctrls
R-TypeEX Ctrls
R-TypeMEM Ctrl
R-TypeMEM Ctrl
R-TypeMEM Ctrl
R-TypeMEM Ctrl
R-TypeMEM Ctrl
R-TypeMEM Ctrl
Pipelined Operations
• IM = Instruction Memory, Reg = Register File, DM = Data Memory
• Shaded on right side = read, Shaded on left = write
�13
ldur X10, [X10, #40]
sub X11, X2, X3
add X12, X3, X4
stur X13, [X1, #48]
add X14, X5, X6
IM Reg DM RegALU
IM Reg DM RegALU
IM Reg DM RegALU
IM Reg DM RegALU
IM Reg DM RegALU
Time
Program Flow
Pipeline Controls
• PC is updated every cycle, so no need for PCWrite signal
• Pipeline registers (green boxes) also updated every cycle
�14
Register FileDataA Sel
B Sel
W Sel
ALU
ExtendData
Memory
addr readdata
write data
adder
PC
4
Instruction Memory
addr readdata
MemRead MemWrite
MemToReg
ALUSrcB
ALUSrcARegWrite ALUOp
Setting Pipeline Controls
• Instruction decoder calculates control signal values given instruction register
• Writes those values to pipeline registers, to be used by later cycles
�15
InstructionEX Ctrls MEM Ctrls WB Ctrls
ALUSrcA ALUSrcB ALUOp MemRead MemWrite MemToReg RegWrite
add A B add 0 0 Out 1
sub A B sub 0 0 Out 1
ldur A Ext (imm9) add 1 0 M 1
stur A Ext (imm9) add 0 1 X 0
Register FileDataA Sel
B Sel
W Sel
ALU
ExtendData
Memory
addr readdata
write data
adder
PC
4
Instruction Memory
addr readdata
PC
IR
PC
A
B
Ext
Out
B
Out
Decoder
EX Ctrls
MEM Ctrls
WB Ctrls
MEM CtrlsWB Ctrls
WB Ctrls
M
MemRead MemWrite
MemToReg
ALUSrcB
ALUSrcARegWrite ALUOp
IR
PC PC
A
BB
Out
Out
Ext
EXCtrlsMEMCtrls
MEMCtrlsWBCtrls
WBCtrls
WBCtrls
M00
Example of Pipeline: Cycle 0
• Initial state (Cycle 0): PC = 00
�16
00: ldur04: sub08: add0C: stur10: add
Register FileDataA Sel
B Sel
W Sel
ALU
ExtendData
Memory
addr readdata
write data
adder
PC
4
Instruction Memory
addr readdata
PC
IR
PC
A
B
Ext
Out
B
Out
Decoder
EX Ctrls
MEM Ctrls
WB Ctrls
MEM CtrlsWB Ctrls
WB Ctrls
M
MemRead MemWrite
MemToReg
ALUSrcB
ALUSrcARegWrite ALUOp
IR
PC PC
A
BB
Out
Out
Ext
EXCtrlsMEMCtrls
MEMCtrlsWBCtrls
WBCtrls
WBCtrls
M04
00
ldur
Example of Pipeline: After Cycle 1
• Fetch from 00h, then increment PC
�17
IF 00: ldur04: sub08: add0C: stur10: add
Register FileDataA Sel
B Sel
W Sel
ALU
ExtendData
Memory
addr readdata
write data
adder
PC
4
Instruction Memory
addr readdata
PC
IR
PC
A
B
Ext
Out
B
Out
Decoder
EX Ctrls
MEM Ctrls
WB Ctrls
MEM CtrlsWB Ctrls
WB Ctrls
M
MemRead MemWrite
MemToReg
ALUSrcB
ALUSrcARegWrite ALUOp
IR
PC PC
A
BB
Out
Out
Ext
EXCtrlsMEMCtrls
MEMCtrlsWBCtrls
WBCtrls
WBCtrls
M08
04
sub
00
X10
X
#40
A, Ext, add
1, 0
M, 1, X10
Example of Pipeline: After Cycle 2
• Fetch from 04h; decode ldur X10, [X10, #40] and generate controls
�18
ID 00: ldurIF 04: sub
08: add0C: stur10: add
Register FileDataA Sel
B Sel
W Sel
ALU
ExtendData
Memory
addr readdata
write data
adder
PC
4
Instruction Memory
addr readdata
PC
IR
PC
A
B
Ext
Out
B
Out
Decoder
EX Ctrls
MEM Ctrls
WB Ctrls
MEM CtrlsWB Ctrls
WB Ctrls
M
MemRead MemWrite
MemToReg
ALUSrcB
ALUSrcARegWrite ALUOp
IR
PC PC
A
BB
Out
Out
Ext
EXCtrlsMEMCtrls
MEMCtrlsWBCtrls
WBCtrls
WBCtrls
M0C
08
add
04
X2
X3
X
A, B, sub
0, 0
M, 1, X10
1, 0
A
Ext
add
X
sum
Out, 1, X11
Example of Pipeline: After Cycle 3
• Fetch from 08h; decode sub X11, X2, X3; execute ldur
�19
EX 00: ldurID 04: subIF 08: add
0C: stur10: add
Register FileDataA Sel
B Sel
W Sel
ALU
ExtendData
Memory
addr readdata
write data
adder
PC
4
Instruction Memory
addr readdata
PC
IR
PC
A
B
Ext
Out
B
Out
Decoder
EX Ctrls
MEM Ctrls
WB Ctrls
MEM CtrlsWB Ctrls
WB Ctrls
M
MemRead MemWrite
MemToReg
ALUSrcB
ALUSrcARegWrite ALUOp
IR
PC PC
A
BB
Out
Out
Ext
EXCtrlsMEMCtrls
MEMCtrlsWBCtrls
WBCtrls
WBCtrls
M10
0C
stur
08
X3
X4
X
A, B, add
0, 0
Out, 1, X11
0, 0
A
B
sub
X3
diff
Out, 1, X12
1 0
M, 1, X10
sum
D
Example of Pipeline: After Cycle 4
• Fetch 0Ah; decode add X12, X3, X4; execute sub; access ldur
�20
MEM 00: ldurEX 04: subID 08: addIF 0C: stur
10: add
Register FileDataA Sel
B Sel
W Sel
ALU
ExtendData
Memory
addr readdata
write data
adder
PC
4
Instruction Memory
addr readdata
PC
IR
PC
A
B
Ext
Out
B
Out
Decoder
EX Ctrls
MEM Ctrls
WB Ctrls
MEM CtrlsWB Ctrls
WB Ctrls
M
MemRead MemWrite
MemToReg
ALUSrcB
ALUSrcARegWrite ALUOp
14
10
add
0C
X1
X13
#48
A, Ext, add
0, 1
Out, 1, X12
0, 0
A
B
add
X4
sum
X, 0, X
0 0
Out, 1, X11
diff
1
X10
MX
Example of Pipeline: After Cycle 5
• Fetch 10h; decode stur X13, [X1, #48]; execute add; access sub (noop); write ldur
�21
WB 00: ldurMEM 04: subEX 08: addID 0C: sturIF 10: add
Register FileDataA Sel
B Sel
W Sel
ALU
ExtendData
Memory
addr readdata
write data
adder
PC
4
Instruction Memory
addr readdata
PC
IR
PC
A
B
Ext
Out
B
Out
Decoder
EX Ctrls
MEM Ctrls
WB Ctrls
MEM CtrlsWB Ctrls
WB Ctrls
MX
MemRead MemWrite
MemToReg
ALUSrcB
ALUSrcARegWrite ALUOp
IR
PC PC
A
BB
Out
Out
Ext
EXCtrlsMEMCtrls
MEMCtrlsWBCtrls
WBCtrls
WBCtrls
M18
14 10
X5
X6
X
A, B, add
0,0
X, 0, X
0, 1
A
Ext
add
X13
sum
Out, 1, X14
0 0
Out, 1, X12
diff
1
X11
OutX
Example of Pipeline: After Cycle 6
• Decode add X14, X5, X6; execute stur; access add (noop); write sub
�22
00: ldurWB 04: sub
MEM 08: addEX 0C: sturID 10: add
Register FileDataA Sel
B Sel
W Sel
ALU
ExtendData
Memory
addr readdata
write data
adder
PC
4
Instruction Memory
addr readdata
PC
IR
PC
A
B
Ext
Out
B
Out
Decoder
EX Ctrls
MEM Ctrls
WB Ctrls
MEM CtrlsWB Ctrls
WB Ctrls
MX
MemRead MemWrite
MemToReg
ALUSrcB
ALUSrcARegWrite ALUOp
IR
PC PC
A
BB
Out
Out
Ext
EXCtrlsMEMCtrls
MEMCtrlsWBCtrls
WBCtrls
WBCtrls
M1C
18 14
Out, 1, X14
0, 0
A
B
add
X6
sum
0 1
X, 0, X
sum
1
X12
OutX
Example of Pipeline: After Cycle 7
• Execute add; access stur; write add
�23
00: ldur04: sub
WB 08: addMEM 0C: sturEX 10: add
Register FileDataA Sel
B Sel
W Sel
ALU
ExtendData
Memory
addr readdata
write data
adder
PC
4
Instruction Memory
addr readdata
PC
IR
PC
A
B
Ext
Out
B
Out
Decoder
EX Ctrls
MEM Ctrls
WB Ctrls
MEM CtrlsWB Ctrls
WB Ctrls
MX
MemRead MemWrite
MemToReg
ALUSrcB
ALUSrcARegWrite ALUOp
IR
PC PC
A
BB
Out
Out
Ext
EXCtrlsMEMCtrls
MEMCtrlsWBCtrls
WBCtrls
WBCtrls
M20
1C 18
0 0
Out, 1, X14
sum
0
X
XX
Example of Pipeline: After Cycle 8
• Access add (noop); write stur (noop)
�24
00: ldur04: sub08: add
WB 0C: sturMEM 10: add
Register FileDataA Sel
B Sel
W Sel
ALU
ExtendData
Memory
addr readdata
write data
adder
PC
4
Instruction Memory
addr readdata
PC
IR
PC
A
B
Ext
Out
B
Out
Decoder
EX Ctrls
MEM Ctrls
WB Ctrls
MEM CtrlsWB Ctrls
WB Ctrls
MM
MemRead MemWrite
MemToReg
ALUSrcB
ALUSrcARegWrite ALUOp
IR
PC PC
A
BB
Out
Out
Ext
EXCtrlsMEMCtrls
MEMCtrlsWBCtrls
WBCtrls
WBCtrls
M24
20 1C1
X14
Out
Example of Pipeline: After Cycle 9
• Write add
�25
00: ldur04: sub08: add0C: stur
WB 10: add