Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation...

19
Hardware Based Speculation

Transcript of Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation...

Page 1: Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation •Extending Tomasulo’salgorithm To support speculation -must separate the bypassing

Hardware Based Speculation

Page 2: Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation •Extending Tomasulo’salgorithm To support speculation -must separate the bypassing

2

Hardware-Based Speculation

•Branch prediction reduces the stalls attributable to branches

• For a processor executing multiple instructions• Just predicting branch is not enough• Multiple issue processor may execute a branch every clock

cycle

•Exploiting parallelism requires that we overcome the limitation of control dependence

Page 3: Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation •Extending Tomasulo’salgorithm To support speculation -must separate the bypassing

3

Hardware-Based Speculation

•Greater ILP: Overcome control dependence by hardware speculating on outcome of branches and executing program as if guesses were correct•If prediction is wrong it needs a hardware to handle it

• extension over branch prediction with dynamic scheduling• Speculation fetch, issue, and execute instructions as if branch

predictions were always correct • Dynamic scheduling only fetches and issues such instructions

•Essentially a data flow execution model: Operations execute as soon as their operands are available

Page 4: Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation •Extending Tomasulo’salgorithm To support speculation -must separate the bypassing

4

Hardware-Based Speculation

•3 components of HW-based speculation:

• Dynamic branch prediction to choose which instructions

to execute

• Speculation to allow execution of instructions before

control dependences are resolved

• Dynamic scheduling to deal with scheduling of different

combinations of basic blocks

Page 5: Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation •Extending Tomasulo’salgorithm To support speculation -must separate the bypassing

5

Hardware-Based Speculation

•Extending Tomasulo’s algorithm

To support speculation -must separate the bypassing of

results among instruction (speculative instruction) from the actual

completing of an instruction

•out-of-order execution but in-order commit

•the register file is not updated until instruction commits

Page 6: Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation •Extending Tomasulo’salgorithm To support speculation -must separate the bypassing

6

Hardware-Based Speculation

Store

• Store still takes 2 steps. 2nd step performed by instn commit

•Instruction commit unit-if instruction is executed completely then it

is updated in memory or register

• every instruction has a place in the ROB until it commits

Page 7: Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation •Extending Tomasulo’salgorithm To support speculation -must separate the bypassing

7

Hardware-Based Speculation in Tomasulo

•The key idea • allow instructions to execute out of order • force instructions to commit in order• prevent any irrevocable action (such as updating state or

taking an exception) until an instruction commits.

•Hence: • Must separate execution from allowing instruction to finish or

“commit”• instructions may finish execution considerably before they

are ready to commit.

•This additional step called instruction commit

Page 8: Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation •Extending Tomasulo’salgorithm To support speculation -must separate the bypassing

8

Hardware-Based Speculation in Tomasulo

•When an instruction is no longer speculative, allow it to update the register file or memory

Reorder buffer (ROB)

•Requires additional set of buffers to hold results of instructions that have finished execution but have not committed

•used to pass results among instructions that may be speculated

Page 9: Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation •Extending Tomasulo’salgorithm To support speculation -must separate the bypassing

9

Reorder Buffer

•In Tomasulo’s algorithm, once an instruction writes its result, any subsequently issued instructions will find result in the register file

•With speculation, the register file is not updated until the instruction commits

• (we know definitively that the instruction should execute)

•Thus, the ROB supplies operands in interval between completion of instruction execution and instruction commit

• ROB is a source of operands for instructions, just as reservation stations (RS) provide operands in Tomasulo’s algorithm

• ROB extends architectured registers like RS

Page 10: Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation •Extending Tomasulo’salgorithm To support speculation -must separate the bypassing

10

Reorder Buffer Structure (Four fields)

•instruction type field• Indicates whether the instruction is a branch (and has no destination

result), a store (which has a memory address destination), or a register operation (ALU operation or load, which has register destinations).

•destination field • supplies the register number (for loads and ALU operations) or the

memory address (for stores) where the instruction result should be written.

•value field • hold the value of the instruction result until the instruction commits.

•ready field • indicates that the instruction has completed execution, and the value is

ready.

Page 11: Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation •Extending Tomasulo’salgorithm To support speculation -must separate the bypassing

11

Reorder Buffer Operation

•Holds instructions in FIFO order, exactly as issued•When instructions complete, results placed into ROB

• Supplies operands to other instruction between execution complete & commit => more registers like RS

• Tag results with ROB buffer number instead of reservation station

•Instructions commit =>values at head of ROB placed in registers•As a result, easy to undo speculated instructions on mispredicted branches or on exceptions

ReorderBufferFP

OpQueue

FP Adder FP Adder

Res Stations Res Stations

FP Regs

Commit path

Page 12: Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation •Extending Tomasulo’salgorithm To support speculation -must separate the bypassing

12

Where is the store queue?

Page 13: Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation •Extending Tomasulo’salgorithm To support speculation -must separate the bypassing

13

4 Steps of Speculative Tomasulo

1- Issue —get instruction from instruction queueIf reservation station and reorder buffer slot free, issue instr & send operands (if available: either from ROB or FP registers) to reservation station & send reorder buffer no. allocated for result to reservation station (tag the result when it is placed on CDB)

Page 14: Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation •Extending Tomasulo’salgorithm To support speculation -must separate the bypassing

14

4 Steps of Speculative Tomasulo

2. Execution —operate on operands (EX)When both operands ready then execute; if not ready, watch CDB for result; when both in reservation station, execute; checks RAW

Loads still require 2 step process, instructions may take multiple cycles,, stores is only effective address calculation (need Rs),

Page 15: Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation •Extending Tomasulo’salgorithm To support speculation -must separate the bypassing

15

4 Steps of Speculative Tomasulo

4. Commit a) when an instruction reaches the head of the ROB and its result is present in the buffer;

• update the register with the result and remove the instruction from

the ROB.

b) Committing a store is similar except that memory is

updated rather than a result register.

c) If a branch with incorrect prediction reaches the head ROB• it indicates that the speculation was wrong.

• ROB is flushed and execution is restarted at the correct successor

of the branch.

d) If the branch was correctly predicted, the branch is finished.

•Once an instruction commits, its entry in the ROB is

reclaimed. If the ROB fills, we simply stop issuing instructions

until an entry is made free.

Page 16: Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation •Extending Tomasulo’salgorithm To support speculation -must separate the bypassing

Tomasulo With Reorder buffer:

ToMemory

FP adders FP multipliers

Reservation Stations

FP OpQueue

ROB7

ROB6

ROB5

ROB4

ROB3

ROB2

ROB1F0 LD F0,16(R2) N

Done?

DestDest

Oldest

Newest

from Memory

1 10+R2Dest

Reorder Buffer

Registers

LD F0,16(R2)

ADDD F10,F4,F0

DIVD F2,F10,F6

Dest. Value Instruction

Page 17: Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation •Extending Tomasulo’salgorithm To support speculation -must separate the bypassing

17

2 ADDD R(F4),ROB1

Tomasulo With Reorder buffer:

ToMemory

FP adders FP multipliers

Reservation Stations

FP OpQueue

ROB7

ROB6

ROB5

ROB4

ROB3

ROB2

ROB1

F10

F0

ADDD F10,F4,F0

LD F0,16(R2)

N

N

Done?

DestDest

Oldest

Newest

from Memory

1 10+R2Dest

Reorder Buffer

Registers

LD F0,16(R2)

ADDD F10,F4,F0

DIVD F2,F10,F6

Dest. Value Instruction

Page 18: Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation •Extending Tomasulo’salgorithm To support speculation -must separate the bypassing

18

3 DIVD ROB2,R(F6)2 ADDD R(F4),ROB1

Tomasulo With Reorder buffer:

ToMemory

FP adders FP multipliers

Reservation Stations

FP OpQueue

ROB7

ROB6

ROB5

ROB4

ROB3

ROB2

ROB1

F2

F10

F0

DIVD F2,F10,F6

ADDD F10,F4,F0

LD F0,16(R2)

N

N

N

Done?

DestDest

Oldest

Newest

from Memory

1 10+R2Dest

Reorder Buffer

Registers

LD F0,16(R2)

ADDD F10,F4,F0

DIVD F2,F10,F6

Dest. Value Instruction

Page 19: Hardware Based Speculation - rmd.ac.in Materials/7/ACA/unit2.… · Hardware-Based Speculation •Extending Tomasulo’salgorithm To support speculation -must separate the bypassing

19

Avoiding Memory Hazards

• WAW and WAR hazards through memory are eliminated with speculation

• RAW hazards through memory are avoided by two restrictions: