Design of Digital Circuits
Lecture 14: Microprogramming
Prof. Onur Mutlu ETH Zurich Spring 2017 7 April 2017
Agenda for Today & Next Few Lectures
- Single-cycle Microarchitectures
- Multi-cycle and Microprogrammed Microarchitectures
- Pipelining
- Issues in Pipelining: Control & Data Dependence Handling, State Maintenance and Recovery, …
- Out-of-Order Execution
- Issues in OoO Execution: Load-Store Handling, …
Readings for This Week
- P&P, Chapter 4
  - Microarchitecture
- P&P, Revised Appendix C
  - Microarchitecture of the LC-3b
  - Appendix A (LC-3b ISA) will be useful in following this
- H&H, Chapter 7.4 (keep reading)
- Optional
  - Maurice Wilkes, “The Best Way to Design an Automatic Calculating Machine,” Manchester Univ. Computer Inaugural Conf., 1951.
Multi-Cycle Microarchitectures
Remember: Multi-Cycle Microarchitecture
AS = Architectural (programmer visible) state at the beginning of an instruction
Step 1: Process part of instruction in one clock cycle
Step 2: Process part of instruction in the next clock cycle
…
AS’ = Architectural (programmer visible) state at the end of a clock cycle
One Example Multi-Cycle Microarchitecture
Carnegie Mellon
Remember: Single-Cycle MIPS Processor
[Figure: single-cycle MIPS datapath and control unit. Blocks: PC, instruction memory, register file, sign extend, ALU, data memory. Control signals: RegDst, Branch, MemWrite, MemtoReg, ALUSrc, RegWrite, Jump, PCSrc, ALUControl2:0.]
Remember: Complete Multi-cycle Processor
[Figure: multi-cycle MIPS datapath and control unit. Blocks: PC, shared instruction/data memory, Instr and Data registers, register file, sign extend, A and B registers, ALU, ALUOut register. Control signals: IorD, MemWrite, IRWrite, RegDst, MemtoReg, RegWrite, ALUSrcA, ALUSrcB1:0, ALUControl2:0, Branch, PCWrite, PCSrc, PCEn.]
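Control Unit
[Figure: the control unit consists of a Main Controller (FSM) and an ALU Decoder. Inputs: Opcode5:0 and Funct5:0. The Main Controller drives the multiplexer selects and register enables (MemtoReg, RegDst, IorD, PCSrc, ALUSrcB1:0, ALUSrcA, IRWrite, MemWrite, PCWrite, Branch, RegWrite) and ALUOp1:0; the ALU Decoder produces ALUControl2:0.]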
Main Controller FSM: Fetch
- S0: Fetch (entered on Reset): IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
[Figure: multi-cycle datapath annotated with the control-signal values for the Fetch state.]
Main Controller FSM: Decode
- S0: Fetch (entered on Reset): IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
- S1: Decode
[Figure: multi-cycle datapath annotated with the control-signal values for the Decode state.]
Main Controller FSM: Address Calculation
- S0: Fetch (entered on Reset): IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
- S1: Decode
- S2: MemAdr (Op = LW or Op = SW): ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
[Figure: multi-cycle datapath annotated with the control-signal values for the MemAdr state.]
Main Controller FSM: lw
- S0: Fetch (entered on Reset): IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
- S1: Decode
- S2: MemAdr (Op = LW or Op = SW): ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
- S3: MemRead (Op = LW): IorD = 1
- S4: MemWriteback: RegDst = 0, MemtoReg = 1, RegWrite
Main Controller FSM: sw
- S0: Fetch (entered on Reset): IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
- S1: Decode
- S2: MemAdr (Op = LW or Op = SW): ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
- S3: MemRead (Op = LW): IorD = 1
- S4: MemWriteback: RegDst = 0, MemtoReg = 1, RegWrite
- S5: MemWrite (Op = SW): IorD = 1, MemWrite
Main Controller FSM: R-Type
- S0: Fetch (entered on Reset): IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
- S1: Decode
- S2: MemAdr (Op = LW or Op = SW): ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
- S3: MemRead (Op = LW): IorD = 1
- S4: MemWriteback: RegDst = 0, MemtoReg = 1, RegWrite
- S5: MemWrite (Op = SW): IorD = 1, MemWrite
- S6: Execute (Op = R-type): ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
- S7: ALUWriteback: RegDst = 1, MemtoReg = 0, RegWrite
Main Controller FSM: beq
- S0: Fetch (entered on Reset): IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
- S1: Decode: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
- S2: MemAdr (Op = LW or Op = SW): ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
- S3: MemRead (Op = LW): IorD = 1
- S4: MemWriteback: RegDst = 0, MemtoReg = 1, RegWrite
- S5: MemWrite (Op = SW): IorD = 1, MemWrite
- S6: Execute (Op = R-type): ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
- S7: ALUWriteback: RegDst = 1, MemtoReg = 0, RegWrite
- S8: Branch (Op = BEQ): ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 1, Branch
Complete Multi-cycle Controller FSM
- S0: Fetch (entered on Reset): IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
- S1: Decode: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
- S2: MemAdr (Op = LW or Op = SW): ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
- S3: MemRead (Op = LW): IorD = 1
- S4: MemWriteback: RegDst = 0, MemtoReg = 1, RegWrite
- S5: MemWrite (Op = SW): IorD = 1, MemWrite
- S6: Execute (Op = R-type): ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
- S7: ALUWriteback: RegDst = 1, MemtoReg = 0, RegWrite
- S8: Branch (Op = BEQ): ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 1, Branch
Main Controller FSM: addi
- S0: Fetch (entered on Reset): IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
- S1: Decode: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
- S2: MemAdr (Op = LW or Op = SW): ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
- S3: MemRead (Op = LW): IorD = 1
- S4: MemWriteback: RegDst = 0, MemtoReg = 1, RegWrite
- S5: MemWrite (Op = SW): IorD = 1, MemWrite
- S6: Execute (Op = R-type): ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
- S7: ALUWriteback: RegDst = 1, MemtoReg = 0, RegWrite
- S8: Branch (Op = BEQ): ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 1, Branch
- S9: ADDIExecute (Op = ADDI): ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
- S10: ADDIWriteback: RegDst = 0, MemtoReg = 0, RegWrite
Extended Functionality: j
[Figure: multi-cycle datapath extended for j. PCSrc becomes a 2-bit select (PCSrc1:0); the new PC input PCJump (bits 31:28 and 27:0) is formed using instruction bits 25:0 (jump) shifted left by 2.]
Control FSM: j
- S0: Fetch (entered on Reset): IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 00, IRWrite, PCWrite
- S1: Decode: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
- S2: MemAdr (Op = LW or Op = SW): ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
- S3: MemRead (Op = LW): IorD = 1
- S4: MemWriteback: RegDst = 0, MemtoReg = 1, RegWrite
- S5: MemWrite (Op = SW): IorD = 1, MemWrite
- S6: Execute (Op = R-type): ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
- S7: ALUWriteback: RegDst = 1, MemtoReg = 0, RegWrite
- S8: Branch (Op = BEQ): ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 01, Branch
- S9: ADDIExecute (Op = ADDI): ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
- S10: ADDIWriteback: RegDst = 0, MemtoReg = 0, RegWrite
- S11: Jump (Op = J): PCSrc = 10, PCWrite
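The controller FSM built up over the preceding slides can be sketched as a small next-state function. This is only an illustrative model, not the lecture's implementation: the opcode strings ("lw", "rtype", ...) and the function names are made up for the example, and the return to S0: Fetch from every final state is assumed from the usual form of this FSM rather than spelled out in the transcript.

```python
# Control outputs per state, copied from the FSM slides (strings for readability).
CONTROL = {
    "S0": "IorD=0 ALUSrcA=0 ALUSrcB=01 ALUOp=00 PCSrc=00 IRWrite PCWrite",
    "S1": "ALUSrcA=0 ALUSrcB=11 ALUOp=00",
    "S2": "ALUSrcA=1 ALUSrcB=10 ALUOp=00",
    "S3": "IorD=1",
    "S4": "RegDst=0 MemtoReg=1 RegWrite",
    "S5": "IorD=1 MemWrite",
    "S6": "ALUSrcA=1 ALUSrcB=00 ALUOp=10",
    "S7": "RegDst=1 MemtoReg=0 RegWrite",
    "S8": "ALUSrcA=1 ALUSrcB=00 ALUOp=01 PCSrc=01 Branch",
    "S9": "ALUSrcA=1 ALUSrcB=10 ALUOp=00",
    "S10": "RegDst=0 MemtoReg=0 RegWrite",
    "S11": "PCSrc=10 PCWrite",
}

# Dispatch out of S1: Decode, keyed by a symbolic opcode (hypothetical names).
DISPATCH = {"lw": "S2", "sw": "S2", "rtype": "S6",
            "beq": "S8", "addi": "S9", "j": "S11"}


def next_state(state, op):
    """Return the state that follows `state` for instruction type `op`."""
    if state == "S0":
        return "S1"
    if state == "S1":
        return DISPATCH[op]
    if state == "S2":
        return "S3" if op == "lw" else "S5"
    if state == "S3":
        return "S4"
    if state == "S6":
        return "S7"
    if state == "S9":
        return "S10"
    return "S0"  # assumption: S4, S5, S7, S8, S10, S11 go back to Fetch


def cycles(op):
    """Count the clock cycles (states visited) for one instruction."""
    state, count = "S0", 0
    while True:
        count += 1
        state = next_state(state, op)
        if state == "S0":
            return count


for op in ("lw", "sw", "rtype", "beq", "addi", "j"):
    print(op, cycles(op))
```

Under these assumptions the cycle counts come out as 5 for lw, 4 for sw, R-type and addi, and 3 for beq and j.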
Multi-cycle Performance: CPI
- Instructions take different numbers of cycles:
  - 3 cycles: beq, j
  - 4 cycles: R-type, sw, addi
  - 5 cycles: lw
- CPI is a weighted average, e.g., for the SPECINT2000 benchmark:
  - 25% loads
  - 10% stores
  - 11% branches
  - 2% jumps
  - 52% R-type
- Average CPI = (0.11 + 0.02)(3) + (0.52 + 0.10)(4) + (0.25)(5) = 4.12
Realistic?
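The weighted average is easy to check, and to redo for a different instruction mix, with a few lines of code. A minimal sketch; the dictionary keys and the function name are chosen only for this example.

```python
# Cycles per instruction class and the SPECINT2000 mix quoted on the slide.
CYCLES = {"load": 5, "store": 4, "branch": 3, "jump": 3, "rtype": 4}
MIX = {"load": 0.25, "store": 0.10, "branch": 0.11, "jump": 0.02, "rtype": 0.52}


def average_cpi(cycles, mix):
    """Weighted-average CPI: sum over classes of (fraction * cycles)."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("instruction mix fractions must sum to 1")
    return sum(mix[kind] * cycles[kind] for kind in mix)


print(f"Average CPI = {average_cpi(CYCLES, MIX):.2f}")
```

Changing the fractions in `MIX` shows how sensitive the average is to the share of 5-cycle loads.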
Multi-cycle Performance: Cycle Time
- Multi-cycle critical path:
  Tc = t_pcq + t_mux + max(t_ALU + t_mux, t_mem) + t_setup
[Figure: multi-cycle MIPS datapath and control unit, used to identify the critical path.]
Multi-cycle Performance Example

Element               Parameter   Delay (ps)
Register clock-to-Q   t_pcq_PC    30
Register setup        t_setup     20
Multiplexer           t_mux       25
ALU                   t_ALU       200
Memory read           t_mem       250
Register file read    t_RFread    150
Register file setup   t_RFsetup   20

Tc = t_pcq_PC + t_mux + max(t_ALU + t_mux, t_mem) + t_setup
   = [30 + 25 + 250 + 20] ps
   = 325 ps
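The cycle-time formula can be evaluated directly from the delay table, which also makes it easy to try other memory latencies. A small sketch; the dictionary and function names are illustrative only.

```python
# Element delays in picoseconds, taken from the table above.
DELAYS_PS = {"t_pcq_PC": 30, "t_setup": 20, "t_mux": 25,
             "t_ALU": 200, "t_mem": 250}


def multicycle_tc(d):
    """Tc = t_pcq_PC + t_mux + max(t_ALU + t_mux, t_mem) + t_setup."""
    return (d["t_pcq_PC"] + d["t_mux"]
            + max(d["t_ALU"] + d["t_mux"], d["t_mem"])
            + d["t_setup"])


print(multicycle_tc(DELAYS_PS), "ps")

# With a faster memory, the ALU path (t_ALU + t_mux) becomes the limiter.
faster_memory = dict(DELAYS_PS, t_mem=150)
print(multicycle_tc(faster_memory), "ps")
```

With the table's values, memory read (250 ps) dominates the ALU path (225 ps); once memory is faster than 225 ps the ALU path sets the cycle time instead.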
Multi-cycle Performance Example
- For a program with 100 billion instructions executing on a multi-cycle MIPS processor:
  - CPI = 4.12
  - Tc = 325 ps
- Execution Time = (# instructions) × CPI × Tc = (100 × 10^9)(4.12)(325 × 10^-12 s) = 133.9 seconds
- This is slower than the single-cycle processor (92.5 seconds). Why?
- Did we break the stages in a balanced manner?
- Overhead of register setup/hold paid many times
- How would the results change with different assumptions on memory latency and instruction mix?
Recall: Single-Cycle Performance Example
- Example: For a program with 100 billion instructions executing on a single-cycle MIPS processor:
  Execution Time = (# instructions) × CPI × Tc = (100 × 10^9)(1)(925 × 10^-12 s) = 92.5 seconds
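The two execution times can be put side by side with the same formula. A minimal sketch using the numbers quoted on the slides; the function name is illustrative.

```python
def execution_time_s(instructions, cpi, tc_ps):
    """Execution time in seconds = instruction count * CPI * cycle time."""
    return instructions * cpi * tc_ps * 1e-12


INSTRUCTIONS = 100e9  # 100 billion instructions

multi_cycle = execution_time_s(INSTRUCTIONS, cpi=4.12, tc_ps=325)
single_cycle = execution_time_s(INSTRUCTIONS, cpi=1, tc_ps=925)

print(f"multi-cycle:  {multi_cycle:.1f} s")
print(f"single-cycle: {single_cycle:.1f} s")
print(f"ratio:        {multi_cycle / single_cycle:.2f}x")
```

The multi-cycle design cuts the cycle time by less than 3x (925 ps to 325 ps) while raising CPI by more than 4x, which is why it ends up slower under these particular assumptions.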
Review: Single-Cycle MIPS Processor
[Figure: single-cycle MIPS datapath and control unit, including the jump path (PCJump).]
Review: Multi-Cycle MIPS Processor
[Figure: multi-cycle MIPS datapath and control unit, including the jump path (PCJump) and the 2-bit PCSrc select.]
Review: Multi-Cycle MIPS FSM
[Figure: the complete multi-cycle controller FSM, states S0: Fetch through S11: Jump, with the control-signal values of each state.]
What is the shortcoming of this design?
What does this design assume about memory?
What If Memory Takes > One Cycle?
- Stay in the same “memory access” state until memory returns the data
- “Memory Ready?” bit is an input to the control logic that determines the next state
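The idea of staying in the memory access state can be sketched as one extra input to the next-state function. This is an illustrative model only: it reuses the state names S3: MemRead and S4: MemWriteback from the MIPS FSM, and `mem_ready` is a made-up name for the “Memory Ready?” bit.

```python
def next_state_with_ready(state, mem_ready):
    """Next state for the memory-read state when memory latency is variable."""
    if state == "S3":  # memory read in progress
        return "S4" if mem_ready else "S3"  # loop in place until data arrives
    raise ValueError("only the memory-read state is modeled in this sketch")


def cycles_in_memread(ready_bits):
    """Count the cycles spent in S3 given the per-cycle Memory Ready bit."""
    state, count = "S3", 0
    for mem_ready in ready_bits:
        count += 1
        state = next_state_with_ready(state, mem_ready)
        if state != "S3":
            return count
    raise RuntimeError("memory never signaled ready")


print(cycles_in_memread([0, 0, 0, 0, 1]))  # memory ready in the fifth cycle
print(cycles_in_memread([1]))              # single-cycle memory
```

The same self-loop would be needed in every state that accesses memory (fetch, load, store), which is what the ready bit in the LC-3b microsequencer provides later in the lecture.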
Another Example: Microprogrammed Multi-Cycle Microarchitecture
How Do We Implement This?
- Maurice Wilkes, “The Best Way to Design an Automatic Calculating Machine,” Manchester Univ. Computer Inaugural Conf., 1951.
- An elegant implementation:
  - The concept of microcoded/microprogrammed machines
Recall: A Basic Multi-Cycle Microarchitecture
- Instruction processing cycle divided into “states”
  - A stage in the instruction processing cycle can take multiple states
- A multi-cycle microarchitecture sequences from state to state to process an instruction
  - The behavior of the machine in a state is completely determined by control signals in that state
- The behavior of the entire processor is specified fully by a finite state machine
- In a state (clock cycle), control signals control two things:
  - How the datapath should process the data
  - How to generate the control signals for the (next) clock cycle
Microprogrammed Control Terminology
- Control signals associated with the current state
  - Microinstruction
- Act of transitioning from one state to another
  - Determining the next state and the microinstruction for the next state
  - Microsequencing
- Control store stores control signals for every possible state
  - Store for microinstructions for the entire FSM
- Microsequencer determines which set of control signals will be used in the next clock cycle (i.e., next state)
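These three terms map onto a very small program structure: a table (control store) indexed by state, whose entries (microinstructions) hold the control signals, and a function (microsequencer) that picks the next entry. The sketch below is a made-up three-state toy machine, not the LC-3b; all signal names, opcodes, and state numbers are hypothetical.

```python
# Control store: one microinstruction per state (hypothetical toy machine).
CONTROL_STORE = {
    0: {"signals": ("MemRead", "LoadIR"), "next": 1},        # fetch
    1: {"signals": (), "dispatch": {"add": 2, "nop": 0}},    # decode
    2: {"signals": ("ALUAdd", "LoadReg"), "next": 0},        # execute add
}


def microsequencer(microinstruction, opcode):
    """Pick the address of the next microinstruction (i.e., the next state)."""
    if "dispatch" in microinstruction:      # multi-way branch on the opcode
        return microinstruction["dispatch"][opcode]
    return microinstruction["next"]         # unconditional next state


def run(opcodes):
    """Execute a list of opcodes; return the (state, signals) trace."""
    trace, state = [], 0
    for opcode in opcodes:
        while True:
            microinstruction = CONTROL_STORE[state]
            trace.append((state, microinstruction["signals"]))
            state = microsequencer(microinstruction, opcode)
            if state == 0:                  # back at fetch: instruction done
                break
    return trace


for state, signals in run(["add", "nop"]):
    print(state, signals)
```

Adding an instruction to this toy machine means adding rows to the table rather than redesigning logic, which is the main appeal of microprogrammed control.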
[Figure C.4: The control structure of a microprogrammed implementation, overall block diagram. Inputs to the microsequencer: R, BEN, IR[15:11], and 9 bits (J, COND, IRD) of the current microinstruction; output: a 6-bit control store address. Control store: 2^6 x 35; of the 35 microinstruction bits, 26 go to the datapath.]
(Excerpt from P&P Appendix C:) … on the LC-3b instruction being executed during the current instruction cycle. This state carries out the DECODE phase of the instruction cycle. If the IRD control signal in the microinstruction corresponding to state 32 is 1, the output MUX of the microsequencer (Figure C.5) will take its source from the six bits formed by 00 concatenated with the four opcode bits IR[15:12]. Since IR[15:12] specifies the opcode of the current LC-3b instruction being processed, the next address of the control store will be one of 16 addresses, corresponding to the 14 opcodes plus the two unused opcodes, IR[15:12] = 1010 and 1011. That is, each of the 16 next states is the first state to be carried out after the instruction has been decoded in state 32. For example, if the instruction being processed is ADD, the address of the next state is state 1, whose microinstruction is stored at location 000001. Recall that IR[15:12] for ADD is 0001.
If, somehow, the instruction inadvertently contained IR[15:12] = 1010 or 1011, the …
Simple Design of the Control Structure
Example Control Structure
What Happens In A Clock Cycle?
- The control signals (microinstruction) for the current state control two things:
  - Processing in the data path
  - Generation of control signals (microinstruction) for the next cycle
  - See Supplemental Figure 1 (next-next slide)
- Datapath and microsequencer operate concurrently
- Question: why not generate control signals for the current cycle in the current cycle?
  - This could lengthen the clock cycle
  - Why could it lengthen the clock cycle?
  - See Supplemental Figure 2
Example uProgrammed Control & Datapath
Read Appendix C (on the website)
A Clock Cycle
A Bad Clock Cycle!
A Simple LC-3b Control and Datapath
Read Appendix C (on the website)
What Determines Next-State Control Signals?
- What is happening in the current clock cycle
  - See the 9 control signals coming from “Control” block
  - What are these for?
- The instruction that is being executed
  - IR[15:11] coming from the Data Path
- Whether the condition of a branch is met, if the instruction being processed is a branch
  - BEN bit coming from the datapath
- Whether the memory operation is completing in the current cycle, if one is in progress
  - R bit coming from memory
A Simple LC-3b Control and Datapath
The State Machine for Multi-Cycle Processing
- The behavior of the LC-3b uarch is completely determined by
  - the 35 control signals and
  - additional 7 bits that go into the control logic from the datapath
- 35 control signals completely describe the state of the control structure
- We can completely describe the behavior of the LC-3b as a state machine, i.e., a directed graph of
  - Nodes (one corresponding to each state)
  - Arcs (showing flow from each state to the next state(s))
An LC-3b State Machine
- Patt and Patel, Appendix C, Figure C.2
- Each state must be uniquely specified
  - Done by means of state variables
- 31 distinct states in this LC-3b state machine
  - Encoded with 6 state variables
- Examples
  - State 18, 19 correspond to the beginning of the instruction processing cycle
  - Fetch phase: state 18, 19 → state 33 → state 35
  - Decode phase: state 32
[Figure C.2: A state machine for the LC-3b. Fetch: state 18, 19 (MAR ← PC, PC ← PC + 2) → state 33 (MDR ← M, repeated until R) → state 35 (IR ← MDR) → state 32 (BEN ← IR[11] & N + IR[10] & Z + IR[9] & P, then a branch on IR[15:12]). Each instruction (ADD, AND, XOR, TRAP, SHF, LEA, LDB, LDW, STW, STB, JSR, JMP, BR, RTI) then runs its own sequence of states and returns to the fetch state (18, 19).
Notes: B+off6 : Base + SEXT[offset6]; PC+off9 : PC + SEXT[offset9]; *OP2 may be SR2 or SEXT[imm5]; ** [15:8] or [7:0] depending on MAR[0].]
This FSM Implements the LC-3b ISA
- P&P Appendix A (revised):
  - https://www.ethz.ch/content/dam/ethz/special-interest/infk/inst-infsec/system-security-group-dam/education/Digitaltechnik_17/lecture/pp-appendixa.pdf
LC-3b State Machine: Some Questions
- How many cycles does the fastest instruction take?
- How many cycles does the slowest instruction take?
- Why does the BR take as long as it takes in the FSM?
- What determines the clock cycle time?
LC-3b Datapath
- Patt and Patel, Appendix C, Figure C.3
- Single-bus datapath design
  - At any point only one value can be “gated” on the bus (i.e., can be driving the bus)
  - Advantage: Low hardware cost: one bus
  - Disadvantage: Reduced concurrency: if an instruction needs the bus twice for two different things, these need to happen in different states
- Control signals (26 of them) determine what happens in the datapath in one clock cycle
  - Patt and Patel, Appendix C, Table C.1
[Figure C.3: The LC-3b data path. A single 16-bit bus connects the PC (with PCMUX and a +2 incrementer), the register file (SR1, SR2, DR), the ALU (ALUK), the shifter (SHF), the address adder (ADDR1MUX, ADDR2MUX, LSHF1, SEXT units), MARMUX, the IR, the condition codes (N, Z, P), and the memory interface (MAR, MDR, memory, and the I/O registers KBDR, KBSR, DDR, DSR). Gate signals: GatePC, GateMDR, GateALU, GateMARMUX, GateSHF. Load signals: LD.PC, LD.IR, LD.REG, LD.CC, LD.MAR, LD.MDR.]
(Excerpt from P&P Appendix C:) … provide you with the additional flexibility of more states, so we have selected a control store consisting of 2^6 locations.
[Figure C.6: Additional logic required to provide control signals. (a) DRMUX selects DR from IR[11:9] or 111; (b) SR1MUX selects SR1 from IR[11:9] or IR[8:6]; (c) logic that generates BEN from IR[11:9] and N, Z, P.]
(Excerpt from P&P Appendix C:) … LC-3b to operate correctly with a memory that takes multiple clock cycles to read or store a value.
Suppose it takes memory five cycles to read a value. That is, once MAR contains the address to be read and the microinstruction asserts READ, it will take five cycles before the contents of the specified location in memory are available to be loaded into MDR. (Note that the microinstruction asserts READ by means of three control signals: MIO.EN/YES, R.W/RD, and DATA.SIZE/WORD; see Figure C.3.)
Recall our discussion in Section C.2 of the function of state 33, which accesses an instruction from memory during the fetch phase of each instruction cycle. For the LC-3b to operate correctly, state 33 must execute five times before moving on to state 35. That is, until MDR contains valid data from the memory location specified by the contents of MAR, we want state 33 to continue to re-execute. After five clock cycles, the memory has completed the “read,” resulting in valid data in MDR, so the processor can move on to state 35. What if the microarchitecture did not wait for the memory to complete the read operation before moving on to state 35? Since the contents of MDR would still be garbage, the microarchitecture would put garbage into IR in state 35.
The ready signal (R) enables the memory read to execute correctly. Since the memory knows it needs five clock cycles to complete the read, it asserts a ready signal (R) throughout the fifth clock cycle. Figure C.2 shows that the next state is 33 (i.e., 100001) if the memory read will not complete in the current clock cycle and state 35 (i.e., 100011) if it will. As we have seen, it is the job of the microsequencer (Figure C.5) to produce the next state address.
Remember the MIPS datapath
LC-3b Datapath: Some Questions
- How does instruction fetch happen in this datapath according to the state machine?
- What is the difference between gating and loading?
  - Gating: Enable/disable an input to be connected to the bus
    - Combinational: during a clock cycle
  - Loading: Enable/disable an input to be written to a register
    - Sequential: e.g., at a clock edge (assume at the end of cycle)
- Is this the smallest hardware you can design?
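The gating/loading distinction can be made concrete with a toy model of a single shared bus. This is only a sketch under simplifying assumptions: registers are plain dictionary entries, `gates` and `loads` are hypothetical parameters standing in for the Gate* and LD.* control signals, and only the bus path is modeled (for example, the PC ← PC + 2 update that happens in the same state is left out).

```python
def clock_cycle(registers, gates, loads):
    """Simulate one clock cycle on a single-bus datapath.

    gates: names of registers whose value is gated onto the bus this cycle
           (combinational, at most one may drive the bus).
    loads: names of registers that latch the bus value at the end of the cycle
           (sequential).
    Returns the new register contents; the input dictionary is not modified.
    """
    if len(gates) > 1:
        raise ValueError("only one value can be gated onto the bus at a time")
    bus = registers[gates[0]] if gates else None

    updated = dict(registers)
    for name in loads:
        if bus is None:
            raise ValueError("nothing is driving the bus")
        updated[name] = bus  # takes effect at the clock edge ending the cycle
    return updated


regs = {"PC": 0x3000, "MAR": 0x0000, "MDR": 0x0000}

# MAR <- PC: gate the PC onto the bus and load MAR from the bus.
after = clock_cycle(regs, gates=["PC"], loads=["MAR"])
print(hex(after["MAR"]))
```

Because only one source can drive the bus per cycle, an instruction that needs to move two different values over the bus has to do so in two different states, which is the reduced concurrency noted for the single-bus design.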
LC-3b Microprogrammed Control Structure
- Patt and Patel, Appendix C, Figure C.4
- Three components:
  - Microinstruction, control store, microsequencer
- Microinstruction: control signals that control the datapath (26 of them) and help determine the next state (9 of them)
- Each microinstruction is stored in a unique location in the control store (a special memory structure)
- Unique location: address of the state corresponding to the microinstruction
  - Remember each state corresponds to one microinstruction
- Microsequencer determines the address of the next microinstruction (i.e., next state)
[Figure C.4: The control structure of a microprogrammed implementation, overall block diagram.]
Simple Design of the Control Structure
[Figure C.5: The microsequencer of the LC-3b base machine. The 6-bit address of the next state is selected by IRD: either 0,0,IR[15:12] or J[5:0], where low-order J bits are modified by the Branch (BEN), Ready (R), and Addr. Mode (IR[11]) conditions decoded from COND1 and COND0.]
(Excerpt from P&P Appendix C:) … unused opcodes, the microarchitecture would execute a sequence of microinstructions, starting at state 10 or state 11, depending on which illegal opcode was being decoded. In both cases, the sequence of microinstructions would respond to the fact that an instruction with an illegal opcode had been fetched.
Several signals necessary to control the data path and the microsequencer are not among those listed in Tables C.1 and C.2. They are DR, SR1, BEN, and R. Figure C.6 shows the additional logic needed to generate DR, SR1, and BEN.
The remaining signal, R, is a signal generated by the memory in order to allow the …
[Figure C.7: Specification of the control store. A table with one row per control store address, 000000 (State 0) through 111111 (State 63), and one column per microinstruction field: IRD, Cond, J, LD.MAR, LD.MDR, LD.IR, LD.BEN, LD.REG, LD.CC, LD.PC, GatePC, GateMDR, GateALU, GateMARMUX, GateSHF, PCMUX, DRMUX, SR1MUX, ADDR1MUX, ADDR2MUX, MARMUX, ALUK, MIO.EN, R.W, DATA.SIZE, LSHF1.]
LC-3b Microsequencer
- Patt and Patel, Appendix C, Figure C.5
- The purpose of the microsequencer is to determine the address of the next microinstruction (i.e., next state)
  - Next state could be conditional or unconditional
- Next state address depends on 9 control signals (plus 7 data signals)
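The next-state computation can be sketched as a pure function of the 9 control bits (J, COND, IRD) and the data bits (BEN, R, IR). Treat the details as assumptions to check against Figure C.5 and Table C.2: the COND encoding used here (01 = memory ready, 10 = branch, 11 = addressing mode) and the choice of which J bit each condition is ORed into are recalled from the textbook, not stated in these slides. Only the state 33 → 33/35 behavior and the decode dispatch to 00 concatenated with IR[15:12] are confirmed by the excerpts.

```python
def next_state_address(j, cond, ird, ben, r, ir):
    """Sketch of the LC-3b microsequencer (encodings are assumptions).

    j: 6-bit J field, cond: 2-bit COND field, ird: IRD bit,
    ben, r: BEN and R bits (0 or 1), ir: 16-bit instruction register.
    """
    if ird:
        return (ir >> 12) & 0xF             # 0,0,IR[15:12]: 16-way dispatch
    ir11 = (ir >> 11) & 1
    branch = 1 if (cond == 0b10 and ben) else 0      # assumed: ORed into J[2]
    ready = 1 if (cond == 0b01 and r) else 0         # assumed: ORed into J[1]
    addr_mode = 1 if (cond == 0b11 and ir11) else 0  # assumed: ORed into J[0]
    return j | (branch << 2) | (ready << 1) | addr_mode


def cycles_in_state_33(memory_latency):
    """Re-execute state 33 until memory asserts R in its final cycle."""
    state, cycle = 33, 0
    while state == 33:
        cycle += 1
        r = 1 if cycle == memory_latency else 0
        state = next_state_address(j=0b100001, cond=0b01, ird=0,
                                   ben=0, r=r, ir=0)
    return cycle, state


print(cycles_in_state_33(5))                             # five-cycle memory
print(next_state_address(0, 0, 1, 0, 0, 0x1000))         # decode of ADD (0001)
print(next_state_address(0, 0, 1, 0, 0, 0x6000))         # decode of LDW (0110)
```

With J = 100001 in state 33, a memory that asserts R in its fifth cycle keeps the machine in state 33 for five cycles and then moves it to state 35, matching the textbook excerpt; decode sends ADD to state 1 and LDW to state 6.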
[Figure C.5: The microsequencer of the LC-3b base machine.]
The Microsequencer: Some Questions
- When is the IRD signal asserted?
- What happens if an illegal instruction is decoded?
- What are condition (COND) bits for?
- How is variable latency memory handled?
- How do you do the state encoding?
  - Minimize number of state variables (~ control store size)
  - Start with the 16-way branch
  - Then determine constraint tables and states dependent on COND
An Exercise in Microprogramming
Handouts
- 7 pages of Microprogrammed LC-3b design
- https://www.ethz.ch/content/dam/ethz/special-interest/infk/inst-infsec/system-security-group-dam/education/Digitaltechnik_17/lecture/lc3b-figures.pdf
A Simple LC-3b Control and Datapath
[Figure C.2: A state machine for the LC-3b.]
[Figure C.3: The LC-3b data path.]
A Simple Datapath Can Become Very Powerful
[Figure C.5: The microsequencer of the LC-3b base machine. Inputs: J[5:0], COND1, COND0, BEN (Branch), R (Ready), IR[11] (Addr. Mode), IRD, and 0,0,IR[15:12]; output: the 6-bit address of the next state.]
unused opcodes, the microarchitecture would execute a sequence of microinstructions, starting at state 10 or state 11, depending on which illegal opcode was being decoded. In both cases, the sequence of microinstructions would respond to the fact that an instruction with an illegal opcode had been fetched.
Several signals necessary to control the data path and the microsequencer are not among those listed in Tables C.1 and C.2. They are DR, SR1, BEN, and R. Figure C.6 shows the additional logic needed to generate DR, SR1, and BEN.
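The logic for DR, SR1, and BEN can be sketched in a few lines of Python. This is only an illustrative model, not the appendix's circuit: the function names are hypothetical, the BEN expression is taken from state 32 of Figure C.2, and the select encodings (0 picks IR[11:9], 1 picks the other input) are an assumption.

```python
def ben(ir, n, z, p):
    """BEN <- IR[11] & N + IR[10] & Z + IR[9] & P (state 32 of Figure C.2)."""
    return bool(((ir >> 11) & 1 and n) or
                ((ir >> 10) & 1 and z) or
                ((ir >> 9) & 1 and p))


def dr(ir, drmux):
    """DRMUX: assumed 0 selects IR[11:9], 1 selects 111 (R7)."""
    return 0b111 if drmux else (ir >> 9) & 0b111


def sr1(ir, sr1mux):
    """SR1MUX: assumed 0 selects IR[11:9], 1 selects IR[8:6]."""
    return (ir >> 6) & 0b111 if sr1mux else (ir >> 9) & 0b111


# Hypothetical encodings used only for this example:
BRZ = 0x0405   # BR with only the z bit (IR[10]) set
LDW = 0x6283   # LDW R1, R2, #3

print(ben(BRZ, n=False, z=True, p=False))   # branch enabled when Z is set
print(dr(LDW, 0), sr1(LDW, 1))              # destination R1, base register R2
```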
The remaining signal, R, is a signal generated by the memory in order to allow the
State 18 (010010), State 33 (100001), State 35 (100011), State 32 (100000), State 6 (000110), State 25 (011001), State 27 (011011)
State Machine for LDW
Microsequencer
Fill in the microinstructions for the 7 states for LDW
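As a warm-up for the exercise, the register transfers of the seven LDW states can be walked through in software. The sketch below is a rough model under stated assumptions: `run_ldw` and `sext` are hypothetical names, memory is a dictionary from even byte addresses to 16-bit words, and every memory access completes in one visit to its state (the ready signal R is ignored here).

```python
def sext(value, bits):
    """Sign-extend the low `bits` bits of value to a Python int."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def run_ldw(pc, regs, mem):
    """Walk states 18, 33, 35, 32, 6, 25, 27 for one LDW instruction."""
    visited = []
    mar = pc                      # State 18: MAR <- PC, PC <- PC + 2
    pc = (pc + 2) & 0xFFFF
    visited.append(18)
    mdr = mem[mar]                # State 33: MDR <- M[MAR]
    visited.append(33)
    ir = mdr                      # State 35: IR <- MDR
    visited.append(35)
    assert ir >> 12 == 0b0110     # State 32: decode, IR[15:12] must be LDW
    visited.append(32)
    base = regs[(ir >> 6) & 0b111]
    mar = (base + (sext(ir & 0x3F, 6) << 1)) & 0xFFFF   # State 6: MAR <- B + LSHF(off6, 1)
    visited.append(6)
    mdr = mem[mar]                # State 25: MDR <- M[MAR]
    visited.append(25)
    regs[(ir >> 9) & 0b111] = mdr  # State 27: DR <- MDR, set CC
    cc = "N" if mdr & 0x8000 else ("Z" if mdr == 0 else "P")
    visited.append(27)
    return pc, cc, visited


regs = [0] * 8
regs[2] = 0x3000
mem = {0x4000: 0x6283, 0x3006: 0x1234}   # LDW R1, R2, #3 and the word it loads
pc, cc, visited = run_ldw(0x4000, regs, mem)
print(hex(regs[1]), cc, visited)
```

The visited list follows the state order on the slide; the control signals that make each transfer happen are what the exercise asks you to fill in.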
[Figure C.6: Additional logic required to provide control signals. (a) DRMUX selects DR from IR[11:9] or 111. (b) SR1MUX selects SR1 from IR[11:9] or IR[8:6]. (c) Logic producing BEN from IR[11:9] and N, Z, P.]
LC-3b to operate correctly with a memory that takes multiple clock cycles to read or store a value.
Suppose it takes memory five cycles to read a value. That is, once MAR contains the address to be read and the microinstruction asserts READ, it will take five cycles before the contents of the specified location in memory are available to be loaded into MDR. (Note that the microinstruction asserts READ by means of three control signals: MIO.EN/YES, R.W/RD, and DATA.SIZE/WORD; see Figure C.3.)
Recall our discussion in Section C.2 of the function of state 33, which accesses an instruction from memory during the fetch phase of each instruction cycle. For the LC-3b to operate correctly, state 33 must execute five times before moving on to state 35. That is, until MDR contains valid data from the memory location specified by the contents of MAR, we want state 33 to continue to re-execute. After five clock cycles, the memory has completed the “read,” resulting in valid data in MDR, so the processor can move on to state 35. What if the microarchitecture did not wait for the memory to complete the read operation before moving on to state 35? Since the contents of MDR would still be garbage, the microarchitecture would put garbage into IR in state 35.
The ready signal (R) enables the memory read to execute correctly. Since the memory knows it needs five clock cycles to complete the read, it asserts a ready signal (R) throughout the fifth clock cycle. Figure C.2 shows that the next state is 33 (i.e., 100001) if the memory read will not complete in the current clock cycle and state 35 (i.e., 100011) if it will. As we have seen, it is the job of the microsequencer (Figure C.5) to produce the next state address.
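The wait loop in state 33 can be modeled with a small next-state function. This is a sketch under assumptions that are not spelled out in this section: the COND encodings (01 = memory ready, 10 = branch, 11 = addressing mode) and the J bits each condition sets (R sets J[1], BEN sets J[2], IR[11] sets J[0]) are inferred from the state numbers in Figure C.2, the IRD path is left out, and the function names are hypothetical.

```python
def next_state(j, cond, ben=False, r=False, ir11=False):
    """Next control-store address when IRD = 0 (assumed COND encodings)."""
    branch = cond == 0b10 and ben        # would set J[2]
    ready = cond == 0b01 and r           # would set J[1]
    addr_mode = cond == 0b11 and ir11    # would set J[0]
    return j | (int(branch) << 2) | (int(ready) << 1) | int(addr_mode)


def wait_in_state_33(mem_latency=5):
    """Repeat state 33 until the memory asserts R in its final cycle."""
    state, cycle, visited = 33, 0, []
    while state == 33:
        cycle += 1
        visited.append(state)
        r = cycle == mem_latency         # R is asserted only in the last cycle
        state = next_state(j=0b100001, cond=0b01, r=r)
    return visited, state


visited, after = wait_in_state_33(5)
print(visited, after)   # state 33 repeated five times, then state 35
```

With J = 100001 and R = 0 the microsequencer stays at 33; once R = 1, the ready condition turns J[1] on and the address becomes 100011, i.e., state 35.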
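[Figure C.4: The control structure of a microprogrammed implementation, overall block diagram. The microsequencer takes R, BEN, IR[15:11], and the 9 bits (J, COND, IRD) of the current microinstruction, and supplies a 6-bit address to the 2^6 x 35 control store. Of the 35 bits in each microinstruction, 26 control the data path and 9 return to the microsequencer.]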
on the LC-3b instruction being executed during the current instruction cycle. This state carries out the DECODE phase of the instruction cycle. If the IRD control signal in the microinstruction corresponding to state 32 is 1, the output MUX of the microsequencer (Figure C.5) will take its source from the six bits formed by 00 concatenated with the four opcode bits IR[15:12]. Since IR[15:12] specifies the opcode of the current LC-3b instruction being processed, the next address of the control store will be one of 16 addresses, corresponding to the 14 opcodes plus the two unused opcodes, IR[15:12] = 1010 and 1011. That is, each of the 16 next states is the first state to be carried out after the instruction has been decoded in state 32. For example, if the instruction being processed is ADD, the address of the next state is state 1, whose microinstruction is stored at location 000001. Recall that IR[15:12] for ADD is 0001.
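The decode step amounts to a single shift. As a small illustration (the function name and the mnemonic table are assumptions added for this example; the mnemonics follow the state labels in Figure C.2):

```python
# Mnemonic for each value of IR[15:12]; None marks the two unused opcodes.
OPCODES = ["BR", "ADD", "LDB", "STB", "JSR", "AND", "LDW", "STW",
           "RTI", "XOR", None, None, "JMP", "SHF", "LEA", "TRAP"]


def first_state_after_decode(ir):
    """Address chosen when IRD = 1: 00 concatenated with IR[15:12]."""
    return (ir >> 12) & 0xF


for ir in (0x1000, 0x6283, 0xA000, 0xB000):
    state = first_state_after_decode(ir)
    print(f"IR={ir:#06x} -> state {state} ({state:06b}), opcode {OPCODES[state]}")
```

Because the two leading address bits are always 00, decode can only reach states 0 through 15, which is why the first state of every instruction lives in that range of the control store.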
If, somehow, the instruction inadvertently contained IR[15:12] = 1010 or 1011, the
Simple Design of the Control Structure
unused opcodes, the microarchitecture would execute a sequence of microinstructions, starting at state 10 or state 11, depending on which illegal opcode was being decoded. In both cases, the sequence of microinstructions would respond to the fact that an instruction with an illegal opcode had been fetched.
Several signals necessary to control the data path and the microsequencer are not among those listed in Tables C.1 and C.2. They are DR, SR1, BEN, and R. Figure C.6 shows the additional logic needed to generate DR, SR1, and BEN.
The remaining signal, R, is a signal generated by the memory in order to allow the
[Figure C.7: Specification of the control store. One row per state, from 000000 (State 0) through 111111 (State 63). Columns: IRD, Cond, J, LD.MAR, LD.MDR, LD.IR, LD.BEN, LD.REG, LD.CC, LD.PC, GatePC, GateMDR, GateALU, GateMARMUX, GateSHF, PCMUX, DRMUX, SR1MUX, ADDR1MUX, ADDR2MUX, MARMUX, ALUK, MIO.EN, R.W, DATA.SIZE, LSHF1.]
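A quick consistency check on the control store layout: the column names come from Figure C.7, while the per-field bit widths below are an assumption (they are defined in Table C.1, which is not reproduced here). Under that assumption the fields add up to the 35-bit microinstruction of Figure C.4, with 9 bits for the microsequencer and 26 for the data path.

```python
# Assumed bit width of each control-store field (names from Figure C.7).
FIELD_WIDTHS = {
    "IRD": 1, "COND": 2, "J": 6,
    "LD.MAR": 1, "LD.MDR": 1, "LD.IR": 1, "LD.BEN": 1,
    "LD.REG": 1, "LD.CC": 1, "LD.PC": 1,
    "GatePC": 1, "GateMDR": 1, "GateALU": 1, "GateMARMUX": 1, "GateSHF": 1,
    "PCMUX": 2, "DRMUX": 1, "SR1MUX": 1, "ADDR1MUX": 1, "ADDR2MUX": 2,
    "MARMUX": 1, "ALUK": 2,
    "MIO.EN": 1, "R.W": 1, "DATA.SIZE": 1, "LSHF1": 1,
}
SEQUENCER_FIELDS = ("IRD", "COND", "J")

sequencer_bits = sum(FIELD_WIDTHS[name] for name in SEQUENCER_FIELDS)
total_bits = sum(FIELD_WIDTHS.values())
datapath_bits = total_bits - sequencer_bits
locations = 2 ** FIELD_WIDTHS["J"]   # one control-store row per 6-bit address

print(sequencer_bits, datapath_bits, total_bits, locations)
```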
End of the Exercise in Microprogramming
76