CBP 2005 Comp 3070 Computer Architecture 1
Last Time …
We learned to program MIPS: all instructions the same length
And a bit about Intel’s x86: instructions of variable length

Today: the consequences of …
Intel (CISC) and MIPS (RISC)

Laundry Model
[Figure: Washer, Drier, Store, Basket, Wardrobe]

Process Steps
A. Wash then Dry
[Figure: washer and drier timeline (idle / running), 9.00 to 11.00]
1. Load the washer at 9.00
2. Done at 10, load the drier
3. Drier Done at 11

Sequential Process
3 loads take 6 hours
[Figure: timeline from 9.00 to 15.00]
1. Load washer at 9.00
2. Done at 10, load drier
3. Drier done at 11
4. Reload washer at 11
5. Done at 12, load drier
6. Drier done at 13
7. Reload washer at 13
8. Done at 14, load drier
9. Done at 15

Overlapping Process
3 loads take 4 hours
[Figure: timeline from 9.00 to 15.00]
1. Load washer at 9.00
2. Done at 10, load drier, reload washer
3. Both done at 11. Reload drier, reload washer
4. Both done at 12. Reload drier
5. Drier done at 13
From 10.00 till 11.00 both washer and drier are running concurrently
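
The two schedules above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming (as the slides do) that washing and drying each take one hour; the function names are made up for illustration.

```python
WASH_HOURS = 1  # from the slides: a wash takes one hour
DRY_HOURS = 1   # from the slides: a dry takes one hour


def sequential_hours(loads):
    """Each load is washed and dried before the next load is started."""
    return loads * (WASH_HOURS + DRY_HOURS)


def overlapped_hours(loads):
    """The washer takes the next load while the drier handles the previous one.

    Only valid when both steps take the same time, as on the slides.
    """
    if loads == 0:
        return 0
    return WASH_HOURS + loads * DRY_HOURS


for loads in (1, 3, 10):
    print(loads, sequential_hours(loads), overlapped_hours(loads))
```

For 3 loads this gives 6 hours sequentially and 4 hours overlapped, matching the two slides.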

Washing Pipeline Filling
[Figure: timeline from 9.00 to 18.00]
5 loads in 9 hours
5 Cycles!!!
1. Get washing
2. Wash
3. Dry
4. Store
5. Put away
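
The "5 loads in 9 hours" figure can be reproduced by drawing the pipeline as a chart, one row per load and one column per hour. A rough sketch, assuming every step takes one hour; the short stage names and function names are illustrative only.

```python
STAGES = ["Get", "Wash", "Dry", "Store", "Put"]  # the five steps listed above


def pipeline_hours(loads, stages=STAGES):
    """Hours until the last load leaves the last stage (one hour per stage)."""
    return len(stages) + loads - 1


def pipeline_chart(loads, stages=STAGES):
    """One row per load; load k enters the first stage in hour k."""
    hours = pipeline_hours(loads, stages)
    rows = []
    for load in range(loads):
        cells = ["."] * hours
        for s, name in enumerate(stages):
            cells[load + s] = name
        rows.append(cells)
    return rows


for row in pipeline_chart(5):
    print(" ".join(f"{cell:<5}" for cell in row))
print("total hours:", pipeline_hours(5))
```

Each row is shifted one hour to the right of the row above it, which is the staircase shape of the slide's diagram.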

Washing Pipeline Full
[Figure: timeline from 9.00 to 17.00]
Pipe Full gives 1 load per hour

Pipelining: Comments
[Figure: timeline from 9.00 to 18.00]
• Potential speedup = number of stages
• Time to ‘fill’ and ‘drain’ reduces speedup
• Rate limited by slowest step
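
The three comments above can be put into numbers. A minimal sketch, assuming a simple model in which every stage is clocked at the pace of the slowest stage; the function names and the example stage times are assumptions, not taken from the slides.

```python
def unpipelined_time(items, stage_times):
    """Every item passes through all stages before the next item starts."""
    return items * sum(stage_times)


def pipelined_time(items, stage_times):
    """All stages advance together, so the slowest stage sets the cycle time."""
    cycle = max(stage_times)
    return (len(stage_times) + items - 1) * cycle


def speedup(items, stage_times):
    return unpipelined_time(items, stage_times) / pipelined_time(items, stage_times)


balanced = [1, 1, 1, 1, 1]
unbalanced = [1, 1, 2, 1, 1]  # hypothetical: one step takes twice as long

print(speedup(5, balanced))       # fill and drain dominate for few items
print(speedup(1000, balanced))    # approaches the number of stages
print(speedup(1000, unbalanced))  # limited by the slowest step
```

With few items the fill and drain time keeps the speedup well below 5; with many items it approaches 5 for balanced stages, and stays below 3 when one stage takes twice as long.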

Can we Pipeline SAM?
[Figure: SAM datapath showing Code Memory, Instruction reg, registers r0, r1, r2, ALU, mar, mdr and Data Memory (locations 0, 1, … 7), with labels X, Y, W]
Stages: 1. Fetch  2. Dec/Reg  3. ALU  4. Mem  5. RW

Pipelined Sam4
[Figure: pipelined SAM datapath along a time axis, showing Code Memory, registers r0, r1, r2, Data Memory (locations 0, 1, … 7), labels X, Y, W and a Buffer]
Stages: 1. Fetch  2. Dec/Reg  3. ALU  4. Mem  5. RW

5 Stages in Pipeline
Let’s take the instruction add r3,r1,r2 and show which stage is needed for each part of the instruction.
Stages: 1. Fetch  2. Dec/Reg  3. ALU  4. Mem  5. RW
[Figure: one pipeline row (Mem, Reg, ALU, Mem, Reg) along a time axis for add r3,r1,r2, annotated with r1,r2 (register read), add (ALU) and r3 (register write)]

Two Instructions
Two instructions into the pipeline
ld r3,[r0+2]
add r4,r1,r2
[Figure: two pipeline rows along a time axis; ld r3,[r0+2] is annotated with ld, r0, 2, Mem and r3; add r4,r1,r2 is annotated with add, r1,r2, ALU and r4]

Structural Hazard
[Figure: five overlapping pipeline rows (Mem, Reg, ALU, Mem, Reg), including st r0,[5] and add r4,r1,r2; the Write (store) of one row lines up with the Read (fetch) of another]
Here we are being asked to read from memory and write to it simultaneously. Impossible!
Solution: use separate code and data memories
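
The clash between a store and a fetch can be found mechanically. The sketch below assumes one instruction is fetched per cycle and that only ld and st use memory in stage 4; the nop fillers in the example program and the function name are hypothetical, added only to place the add three slots behind the store.

```python
MEM_STAGE_OFFSET = 3  # stage 4 (Mem) happens 3 cycles after stage 1 (Fetch)


def memory_conflicts(program, unified_memory=True):
    """Return (cycle, instruction in Mem, instruction being fetched) clashes.

    Cycles are counted from 0; instruction i is fetched in cycle i.
    """
    if not unified_memory:
        return []  # separate code and data memories never clash
    conflicts = []
    for i, instr in enumerate(program):
        op = instr.split()[0]
        mem_cycle = i + MEM_STAGE_OFFSET
        # The instruction fetched in cycle mem_cycle has index mem_cycle.
        if op in ("ld", "st") and mem_cycle < len(program):
            conflicts.append((mem_cycle, instr, program[mem_cycle]))
    return conflicts


program = ["st r0,[5]", "nop", "nop", "add r4,r1,r2"]
print(memory_conflicts(program))
print(memory_conflicts(program, unified_memory=False))
```

With a single memory the store's write and the fetch of the add land in the same cycle; with separate code and data memories the list of clashes is empty.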

Hazardous Washing
[Figure: timeline from 9.00 to 18.00]
Washing basket contains both clean and dirty washing!

Code and Data Memories
[Figure: five overlapping pipeline rows (Mem, Reg, ALU, Mem, Reg)]

Data Hazard
add r3,r1,r2
add r4,r1,r3
[Figure: two pipeline rows along a time axis; the first row is annotated "r3 set here", the second row "but need r3 here, EARLIER!"]

Data Hazard
[Figure: five overlapping pipeline rows (Mem, Reg, ALU, Mem, Reg)]
add r3,r1,r2
add r4,r1,r3
Need value of r3 for second instruction before the first is complete.
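
A data hazard of this kind can be spotted by comparing the destination register of one instruction with the source registers of the instructions that follow it closely. A minimal sketch that only understands the three-register add form used on these slides; the `window` of 2 is an assumption (no forwarding, register file readable in the cycle it is written), and the function names are illustrative.

```python
def parse_add(instr):
    """Split 'add rd,rs,rt' into (destination, [sources])."""
    op, operands = instr.split()
    if op != "add":
        raise ValueError("this sketch only handles add")
    dest, *sources = operands.split(",")
    return dest, sources


def raw_hazards(program, window=2):
    """List (producer index, consumer index, register) pairs that are too close."""
    hazards = []
    for i, producer in enumerate(program):
        dest, _ = parse_add(producer)
        for j in range(i + 1, min(i + 1 + window, len(program))):
            _, sources = parse_add(program[j])
            if dest in sources:
                hazards.append((i, j, dest))
    return hazards


print(raw_hazards(["add r3,r1,r2", "add r4,r1,r3"]))
```

For the two instructions on the slide this reports that instruction 1 reads r3 too soon after instruction 0 writes it.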

Pipeline Stalls
add r3,r1,r2
add r4,r1,r3
[Figure: overlapping pipeline rows (Mem, Reg, ALU, Mem, Reg); the row for the second instruction reads Mem, Stall, Stall, Reg, ALU, Mem, Reg]
Resolve the hazard: insert a delay (‘stall’ cycles) into the second instruction.
But this needs extra electronics on the chip. Complex and costly.
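
The number of stall cycles follows from the stage numbers. A small sketch, assuming the register file can be read in the same cycle in which it is written (an assumption that agrees with the two stalls drawn on the slide); the names are illustrative.

```python
REG_READ_STAGE = 2   # 2. Dec/Reg
REG_WRITE_STAGE = 5  # 5. RW


def stalls_needed(distance):
    """Stall cycles for a consumer issued `distance` instructions after its producer.

    The producer is fetched in cycle 1, so its stage s happens in cycle s.
    """
    write_cycle = REG_WRITE_STAGE
    read_cycle = (1 + distance) + (REG_READ_STAGE - 1)
    return max(0, write_cycle - read_cycle)


for distance in (1, 2, 3):
    print(distance, stalls_needed(distance))
```

A directly following instruction (distance 1) needs 2 stalls; with one unrelated instruction in between it needs 1, and with two in between it needs none.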

Forwarding
[Figure: five overlapping pipeline rows (Mem, Reg, ALU, Mem, Reg)]
add r3,r1,r2
add r4,r1,r3
Need value of r3 for second instruction before the first is complete.
So build in extra circuits to get the data as soon as it is available from the ALU
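
The effect of forwarding on the add/add pair can be estimated the same way as a stall count. This sketch assumes the ALU result can be fed straight into the ALU input in the following cycle, and it only models ALU-to-ALU dependences (loads are not covered); the names are illustrative.

```python
ALU_STAGE = 3  # 3. ALU


def stalls_with_forwarding(distance):
    """Stall cycles when the ALU result is forwarded to a later ALU operation.

    The producer is fetched in cycle 1, so its ALU stage is cycle 3 and the
    forwarded value can be used from cycle 4 onwards.
    """
    ready_cycle = ALU_STAGE + 1
    use_cycle = (1 + distance) + (ALU_STAGE - 1)  # consumer's ALU stage
    return max(0, ready_cycle - use_cycle)


print(stalls_with_forwarding(1))
```

Under these assumptions the second add needs no stall cycles at all, compared with the two stalls required when it has to wait for the register write.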

Compiler resolves Hazard
add r3,r1,r2
nop
nop
add r4,r1,r3
The compiler can detect a possible hazard and insert 2 nops (‘no ops’).
[Figure: overlapping pipeline rows (Mem, Reg, ALU, Mem, Reg), with two nop rows between the two add instructions]
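
A compiler pass of this kind could look roughly like the sketch below. It is not how any particular compiler does it: it assumes instructions of the form `op rd,rs,rt`, a required gap of 2 slots between a register write and a read of that register, and made-up function names.

```python
def registers(instr):
    """Return (destination, [sources]) for 'op rd,rs,rt'; nop has neither."""
    if instr == "nop":
        return None, []
    _, operands = instr.split()
    dest, *sources = operands.split(",")
    return dest, sources


def insert_nops(program, gap=2):
    """Pad with nops so no register is read within `gap` slots of its write."""
    out = []
    for instr in program:
        _, sources = registers(instr)
        needed = 0
        # Look back over the last `gap` instructions already emitted.
        for back, earlier in enumerate(reversed(out[-gap:]), start=1):
            dest, _ = registers(earlier)
            if dest is not None and dest in sources:
                needed = max(needed, gap - back + 1)
        out.extend(["nop"] * needed)
        out.append(instr)
    return out


for line in insert_nops(["add r3,r1,r2", "add r4,r1,r3"]):
    print(line)
```

For the pair on the slide the pass emits the first add, two nops, then the second add.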

Example
instruction    op code   regs     alu   mem   reg write
ld r1,[7]      ld        -        -     [7]   r1
ld r2,[8]      ld        -        -     [8]   r2
add r3,r1,r2   add       r1, r2   -     -     r3

Exercise
instruction    op code   regs     alu   mem   reg write
ld r1,[7]
add r1,r2,r0
add r3,r1,r2

Control (branch) Hazard
4 beq r1,r2,20
8 add r3,r1,r2
12 st r3,[64]
16 st r2,[68]
20 ld r1,[124]
Program may branch to 20
If r1 = r2, branch to 20. The test r1 = r2 is done by the ALU. Result known only in stage 4.
[Figure: pipeline rows for the five instructions; the beq row is marked "Test done", the add and the two st rows are marked "Must FLUSH these", and the ld row is marked "Run from here now"]
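
Which instructions have to be flushed follows from the stage in which the branch outcome becomes known. A minimal sketch, assuming one instruction is fetched per cycle and addresses step by 4 as in the listing above; the function name is illustrative.

```python
BRANCH_RESOLVE_STAGE = 4  # from the slide: result known only in stage 4
INSTR_BYTES = 4           # addresses in the listing step by 4


def flushed_addresses(branch_addr):
    """Addresses fetched behind a branch before its outcome is known.

    If the branch turns out to be taken, these are the instructions to flush.
    """
    in_flight = BRANCH_RESOLVE_STAGE - 1  # one new fetch per cycle
    return [branch_addr + INSTR_BYTES * k for k in range(1, in_flight + 1)]


print(flushed_addresses(4))
```

For the beq at address 4 this gives addresses 8, 12 and 16, that is the add and the two st instructions; execution then continues from the ld at 20.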

Branch Hazard Resolution
• Let’s just assume the branch will NOT be taken
• So the following instruction needs to be executed
• And this is already in the pipeline
• So we make no changes to our CPU hardware design
We will be wrong 50% of the time, at a guess. Then we have to flush the pipeline.
• The above assumption is a crude form of Branch Prediction
• Could keep a branch prediction table storing the results of previous branches. Use this to make a statistics-based decision
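
One common way to build such a table is a 2-bit saturating counter per branch address. The slides do not say which scheme is meant, so the sketch below is only one possible choice; the class name, the starting counter value and the example branch history are assumptions.

```python
from collections import defaultdict


class BranchPredictor:
    """Table of 2-bit counters: 0 and 1 predict not taken, 2 and 3 predict taken."""

    def __init__(self):
        self.counters = defaultdict(lambda: 1)  # start at "weakly not taken"

    def predict(self, addr):
        return self.counters[addr] >= 2

    def update(self, addr, taken):
        count = self.counters[addr]
        self.counters[addr] = min(3, count + 1) if taken else max(0, count - 1)


def accuracy(outcomes, addr=4):
    """Fraction of correct predictions for one branch with the given history."""
    predictor = BranchPredictor()
    correct = 0
    for taken in outcomes:
        correct += predictor.predict(addr) == taken
        predictor.update(addr, taken)
    return correct / len(outcomes)


loop_branch = [True] * 9 + [False]  # hypothetical loop: taken 9 times, then exits
print(accuracy(loop_branch))
```

For this history the table is right 8 times out of 10, whereas always assuming "not taken" would be right only once.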