Post on 18-Jan-2018

CBP 2005 Comp 3070 Computer Architecture 1

Last Time …

We learned to program MIPS: all instructions the same length

And a bit about Intel’s x86: instructions of variable length

Today the consequences of …

Intel (CISC) MIPS (RISC)

Laundry Model

Washer, Drier, Store, Basket, Wardrobe

Process Steps

A. Wash then Dry

[Timing diagram: washer and drier each shown as idle or running, against time from 9.00 to 11.00]

1. Load the washer at 9.00

2. Done at 10, load the drier

3. Drier done at 11

Sequential Process

3 loads take 6 hours

[Timing diagram: time axis from 9.00 to 15.00]

1. Load washer at 9.00
2. Done at 10, load drier
3. Drier done at 11
4. Reload washer at 11
5. Done at 12, load drier
6. Drier done at 13
7. Reload washer at 13
8. Done at 14, load drier
9. Done at 15

Overlapping Process

3 loads take 4 hours

[Timing diagram: time axis from 9.00 to 15.00]

1. Load washer at 9.00
2. Done at 10, load drier, reload washer
3. Both done at 11. Reload drier, reload washer
4. Both done at 12. Reload drier
5. Drier done at 13

From 10.00 till 11.00 both washer and drier are running concurrently
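The two totals (6 hours and 4 hours) can be checked with a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, and it assumes every stage takes the same time (one hour each for wash and dry, as in the slides).

```python
def sequential_hours(loads, stage_hours):
    """Each load finishes every stage before the next load starts."""
    return loads * sum(stage_hours)


def overlapped_hours(loads, stage_hours):
    """Stages overlap. Assumes all stages take the same time."""
    stages = len(stage_hours)
    return (stages + loads - 1) * stage_hours[0]


stages = [1, 1]  # wash, dry: one hour each
print(sequential_hours(3, stages))  # 6
print(overlapped_hours(3, stages))  # 4
```

The overlapped figure is "stages + loads - 1" hours because the first load needs every stage, and each later load finishes one hour after the one before it.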

Washing Pipeline Filling

[Timing diagram: time axis from 9.00 to 18.00]

5 loads in 9 hours

5 Cycles !!!
1. Get washing
2. Wash
3. Dry
4. Store
5. Put away

Washing Pipeline Full

[Timing diagram: time axis from 9.00 to 17.00]

Pipe full gives 1 load per hour

Pipelining : Comments

[Timing diagram: time axis from 9.00 to 18.00]

• Potential speedup = number of stages
• Time to ‘fill’ and ‘drain’ reduces speedup
• Rate limited by slowest step
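The three comments above can be put into numbers. The sketch below is illustrative only: `pipeline_speedup` is a made-up name, and it assumes the pipeline moves in lock-step, so every step lasts as long as the slowest stage.

```python
def pipeline_speedup(loads, stage_times):
    """Speedup of a pipeline over doing each load start to finish.

    Assumption: all stages advance together, so one step of the
    pipeline takes as long as the slowest stage.
    """
    stages = len(stage_times)
    step = max(stage_times)                   # rate limited by slowest step
    unpipelined = loads * sum(stage_times)
    pipelined = (stages + loads - 1) * step   # includes fill and drain
    return unpipelined / pipelined


# Five equal stages: speedup creeps towards 5 as fill/drain matter less.
for loads in (5, 50, 5000):
    print(loads, round(pipeline_speedup(loads, [1, 1, 1, 1, 1]), 2))

# One slow stage (2 hours) holds the whole pipeline back.
print(round(pipeline_speedup(5000, [1, 2, 1, 1, 1]), 2))
```

For the 5 loads in 9 hours shown earlier, the speedup is 25/9, a little under 3. It only approaches the number of stages when many loads pass through a full pipe.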

Can we Pipeline SAM ?

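[Figure: SAM datapath. Labelled parts: Code Memory, Instruction reg, registers r0, r1, r2, ALU, lines X, Y, W, mar, mdr, and Data Memory (locations 0, 1, 7 shown). The datapath is marked off into five stages: 1. Fetch, 2. Dec/Reg, 3. ALU, 4. Mem, 5. RW]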

Pipelined Sam4

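[Figure: pipelined SAM datapath. Labelled parts: Code Memory, registers r0, r1, r2, lines X, Y, W, Data Memory (locations 0, 1, 7 shown) and Buffer. Stages: 1. Fetch, 2. Dec/Reg, 3. ALU, 4. Mem, 5. RW, with time running left to right]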

5 Stages in Pipeline

Let’s take the instruction add r3,r1,r2 and show which stage is needed for each part of the instruction.

Stage          1.Fetch   2.Dec/Reg   3.ALU   4.Mem   5.RW
Unit           Mem       Reg         ALU     Mem     Reg
add r3,r1,r2   fetch     r1,r2       add     -       r3

[time runs left to right]

Two Instructions

Two instructions into the pipeline

Cycle          1       2       3       4       5       6
ld r3,[r0+2]   fetch   r0      r0+2    Mem     r3
add r4,r1,r2           fetch   r1,r2   add     -       r4

[time runs left to right]
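A chart like the one for these two instructions can be generated for any instruction sequence. The sketch below uses the five stage names from the slides. The function name is illustrative, and it assumes one instruction enters the pipeline per cycle and nothing ever stalls.

```python
STAGES = ["Fetch", "Dec/Reg", "ALU", "Mem", "RW"]


def pipeline_chart(instructions):
    """Return one row per instruction: the stage it occupies in each cycle.

    Assumes one instruction enters per cycle and there are no stalls.
    """
    total_cycles = len(instructions) + len(STAGES) - 1
    rows = []
    for i, _ in enumerate(instructions):
        row = ["-"] * total_cycles
        for s, stage in enumerate(STAGES):
            row[i + s] = stage        # instruction i is one cycle behind i-1
        rows.append(row)
    return rows


program = ["ld r3,[r0+2]", "add r4,r1,r2"]
for text, row in zip(program, pipeline_chart(program)):
    print(f"{text:14}", " ".join(f"{cell:8}" for cell in row))
```

Each row is shifted one cycle to the right of the row above, which is the staircase pattern in the pipeline diagrams.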

Structural Hazard

[Pipeline diagram: five overlapping instructions, each passing through Mem, Reg, ALU, Mem, Reg. Labelled instructions: add r4,r1,r2 and st r0,[5]. One Mem box is marked Write (store) and another, in the same cycle, Read (fetch)]

Here we are being asked to read from memory and write to it simultaneously. Impossible!

Solution – Use separate code and data memories
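The clash can be found mechanically. In the sketch below the function name and the two filler instructions in the middle of the program are hypothetical, as is the ordering (store first, add r4,r1,r2 fourth). It assumes instruction i (counting from 0) is fetched in cycle i + 1 and reaches the Mem stage in cycle i + 4.

```python
def memory_conflicts(instructions, single_memory=True):
    """List (cycle, instruction being fetched, instruction in Mem) clashes.

    Assumptions: instruction i (0-based) is fetched in cycle i + 1 and is
    in the Mem stage in cycle i + 4; only ld and st use the Mem stage.
    """
    if not single_memory:
        return []              # separate code and data memories never clash
    conflicts = []
    for i, instr in enumerate(instructions):
        if instr.split()[0] not in ("ld", "st"):
            continue
        j = i + 3              # the instruction fetched while i is in Mem
        if j < len(instructions):
            conflicts.append((i + 4, instructions[j], instr))
    return conflicts


# Hypothetical ordering: the two middle instructions are filler.
program = ["st r0,[5]", "add r1,r2,r3", "add r2,r3,r1", "add r4,r1,r2"]
print(memory_conflicts(program))
print(memory_conflicts(program, single_memory=False))
```

With one shared memory the store and the fourth fetch collide in cycle 4. With separate code and data memories the list is empty.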

Hazardous Washing

[Timing diagram: time axis from 9.00 to 18.00]

Washing basket contains both clean and dirty washing!

Code and Data Memories

[Pipeline diagram: five overlapping instructions, each passing through Mem, Reg, ALU, Mem, Reg]

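Data Hazard

Cycle          1       2       3       4       5       6
add r3,r1,r2   fetch   r1,r2   add     -       r3
add r4,r1,r3           fetch   r1,r3   add     -       r4

r3 set here: cycle 5 (first instruction, RW)
but need r3 here, EARLIER!: cycle 3 (second instruction, Dec/Reg)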

Data Hazard

[Pipeline diagram: five overlapping instructions, each passing through Mem, Reg, ALU, Mem, Reg. Labelled instructions: add r3,r1,r2 and add r4,r1,r3]

Need value of r3 for second instruction before the first is complete.
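This kind of hazard can be spotted by comparing the register each instruction writes with the registers read by the instructions just behind it. The sketch below is illustrative only: the function names are invented, it handles only add, and it assumes a register written in stage 5 can be read by another instruction's stage 2 in the same cycle, so only the next two instructions are at risk.

```python
import re


def parse(instr):
    """Split 'add r3,r1,r2' into (dest, sources). Only add is handled."""
    op, operands = instr.split(None, 1)
    if op != "add":
        raise ValueError("this sketch only understands add")
    regs = re.findall(r"r\d+", operands)
    return regs[0], regs[1:]


def data_hazards(program, window=2):
    """Find (writer, reader, register) triples that are too close together.

    Assumption: only the `window` instructions after a writer can read
    the register before it has been written back.
    """
    found = []
    for i, writer in enumerate(program):
        dest, _ = parse(writer)
        for reader in program[i + 1 : i + 1 + window]:
            if dest in parse(reader)[1]:
                found.append((writer, reader, dest))
    return found


print(data_hazards(["add r3,r1,r2", "add r4,r1,r3"]))
# [('add r3,r1,r2', 'add r4,r1,r3', 'r3')]
```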

Pipeline Stalls

[Pipeline diagram: add r3,r1,r2 passes through Mem, Reg, ALU, Mem, Reg; add r4,r1,r3 passes through Mem, Stall, Stall, Reg, ALU, Mem, Reg]

Resolve Hazard – Insert delay into second instruction stream. ‘Stall’ Cycles.

But this needs extra electronics on the chip. Complex and Costly.

Forwarding

[Pipeline diagram: five overlapping instructions, each passing through Mem, Reg, ALU, Mem, Reg. Labelled instructions: add r3,r1,r2 and add r4,r1,r3]

Need value of r3 for second instruction before the first is complete.

So build in extra circuits to get the data as soon as it is available from the ALU
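A rough cycle count shows what the extra circuits buy. This sketch rests on assumptions not in the slides: the function names are invented, the writer is an add (so its result exists at the end of stage 3), and without forwarding the reader's Dec/Reg stage may share a cycle with the writer's RW stage but not come earlier.

```python
STAGES = 5


def stalls_needed(distance, forwarding):
    """Stall cycles for a reader `distance` instructions after an add.

    Assumptions: without forwarding, the reader's stage 2 must not come
    before the writer's stage 5; with forwarding, the ALU result is handed
    straight to the next instruction's ALU stage, so no stall is needed.
    """
    if distance < 1:
        raise ValueError("distance must be at least 1")
    if forwarding:
        return 0
    return max(0, 3 - distance)


def total_cycles(program_length, stalls):
    """Cycles to drain the pipeline: fill time plus one per instruction."""
    return program_length + STAGES - 1 + stalls


for forwarding in (False, True):
    stalls = stalls_needed(1, forwarding)
    print(forwarding, stalls, total_cycles(2, stalls))
```

For the pair add r3,r1,r2 / add r4,r1,r3 this gives 2 stalls and 8 cycles without forwarding, and 0 stalls and 6 cycles with it.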

Compiler resolves Hazard

Compiler can detect possible hazard and insert 2 nops (‘no ops’)

add r3,r1,r2
nop
nop
add r4,r1,r3

[Pipeline diagram: the instructions above overlapping, each passing through Mem, Reg, ALU, Mem, Reg]
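The compiler's job here can be sketched as a single pass over the instruction list. The function name is invented, and the code assumes every instruction has the form "op rd,rs,rt", where the first register is written and the rest are read. The gap of 2 comes from the slide.

```python
import re


def insert_nops(program, gap=2):
    """Pad with nops so that no instruction reads a register written by
    one of the `gap` instructions immediately before it.

    Assumption: in every instruction the first register named is the one
    written, and any others are read.
    """
    out = []
    for instr in program:
        sources = re.findall(r"r\d+", instr)[1:]
        needed = 0
        # Look back over the last `gap` emitted instructions (nops included).
        for back, earlier in enumerate(reversed(out[-gap:]), start=1):
            written = re.findall(r"r\d+", earlier)[:1]
            if written and written[0] in sources:
                needed = max(needed, gap - back + 1)
        out.extend(["nop"] * needed)
        out.append(instr)
    return out


print(insert_nops(["add r3,r1,r2", "add r4,r1,r3"]))
# ['add r3,r1,r2', 'nop', 'nop', 'add r4,r1,r3']
```

If an unrelated instruction already sits between the two adds, only one nop is inserted. A real compiler would prefer to move useful instructions into the gap rather than pad it.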

Example

instruction    op code   regs     alu   mem   reg write
ld r1,[7]      ld                       [7]   r1
ld r2,[8]      ld                       [8]   r2
add r3,r1,r2   add       r1, r2               r3

Exercise

instruction    op code   regs   alu   mem   reg write
ld r1,[7]
add r1,r2,r0
add r3,r1,r2

Control (branch) Hazard

 4 beq r1,r2,20
 8 add r3,r1,r2
12 st r3,[64]
16 st r2,[68]
20 ld r1,[124]

Program may branch to 20

If r1 = r2 branch to 20. Test if r1 = r2 done by ALU. Result known only in stage 4.

[Pipeline diagram: beq r1,r2 (Test, done) is followed in the pipeline by add r1,r2 r3, st r3 [64] and st r2 [68], marked "Must FLUSH these"; ld [124] r1 is marked "Run from here now"]
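The flush can be traced in code using the five-instruction program from this slide. This is a sketch, not a model of any real CPU: the function name is made up, and it assumes one fetch per cycle, addresses that step by 4, and a branch outcome that becomes known once beq reaches stage 4.

```python
def run_fetch(program, start, branch_taken, resolve_stage=4):
    """Trace fetches around one beq; returns a list of (address, fate).

    Assumptions: addresses step by 4, one fetch per cycle, and the branch
    outcome is known once beq reaches `resolve_stage`.
    """
    target = int(program[start].rsplit(",", 1)[1])
    trace = [(start, "executed")]
    # Instructions fetched behind the branch before its outcome is known.
    in_flight = [start + 4 * k for k in range(1, resolve_stage)]
    fate = "flushed" if branch_taken else "executed"
    trace += [(addr, fate) for addr in in_flight if addr in program]
    # Taken: restart at the target. Not taken: carry straight on.
    pc = target if branch_taken else in_flight[-1] + 4
    while pc in program:
        trace.append((pc, "executed"))
        pc += 4
    return trace


program = {4: "beq r1,r2,20", 8: "add r3,r1,r2", 12: "st r3,[64]",
           16: "st r2,[68]", 20: "ld r1,[124]"}
print(run_fetch(program, 4, branch_taken=True))
# [(4, 'executed'), (8, 'flushed'), (12, 'flushed'), (16, 'flushed'), (20, 'executed')]
```

When the branch is taken, the three instructions at 8, 12 and 16 have already entered the pipeline and their work is thrown away.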

Branch Hazard Resolution

• Let’s just assume the branch will NOT be taken
• So the following instruction needs to be executed
• And this is already in the pipeline
• So we make no changes to our CPU hardware design

We will be wrong 50% of the time, at a guess. Then we have to flush the pipeline

• The above assumption is a crude form of Branch Prediction
• Could keep a branch prediction table storing the results of previous branches. Use this to make a statistics based decision
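One possible shape for such a table is sketched below. The slides do not say which scheme is meant, so this sketch assumes a 2-bit saturating counter per branch address, a common textbook choice. The class and function names, and the loop example, are hypothetical.

```python
class BranchTable:
    """Branch prediction table indexed by the branch's address.

    Assumption: each entry is a 2-bit saturating counter (0..3);
    values 2 and 3 predict "taken".
    """

    def __init__(self):
        self.counters = {}

    def predict(self, address):
        # Unseen branches start at 1 ("weakly not taken"), which matches
        # the assume-not-taken default.
        return self.counters.get(address, 1) >= 2

    def update(self, address, taken):
        c = self.counters.get(address, 1)
        self.counters[address] = min(3, c + 1) if taken else max(0, c - 1)


def mispredictions(table, address, outcomes):
    """Count wrong guesses over a sequence of actual branch outcomes."""
    wrong = 0
    for taken in outcomes:
        if table.predict(address) != taken:
            wrong += 1
        table.update(address, taken)
    return wrong


# Hypothetical loop branch at address 4: taken 9 times, then falls through.
loop_branch = [True] * 9 + [False]
print(mispredictions(BranchTable(), 4, loop_branch))  # 2
```

On this pattern, always assuming "not taken" would be wrong 9 times out of 10. The table is wrong only on the first pass and on the loop exit.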