Lec Feb09 2009

Anshul Kumar, CSE IITD. CSL718: Superscalar Processors. Dynamic Scheduling and Speculative Execution. 9th Feb, 2009

Transcript of Lec Feb09 2009

Page 1: Lec Feb09 2009

Anshul Kumar, CSE IITD

CSL718 : Superscalar Processors

Dynamic Scheduling and Speculative Execution

9th Feb, 2009

Page 2: Lec Feb09 2009

Handling Control Dependence

• Simple pipeline
  – Branch prediction reduces stalls due to control dependence
• Wide issue processor
  – Mere branch prediction is not sufficient
  – Instructions in the predicted path need to be fetched and EXECUTED (speculative execution)

Page 3: Lec Feb09 2009

What is required for speculation?

• Branch prediction to choose which instructions to execute

• Execution of instructions before control dependences are resolved

• Ability to undo the effects of an incorrectly speculated sequence

• Preservation of correct behaviour under exceptions

Page 4: Lec Feb09 2009

Types of speculation

• Hardware based speculation
  – done with dynamic branch prediction and dynamic scheduling
  – used in Superscalar processors
• Compiler based speculation
  – done with static branch prediction and static scheduling
  – used in VLIW processors

Page 5: Lec Feb09 2009

Extending Tomasulo’s scheme for speculative execution

• Introduce re-order buffer (ROB)
• Add another stage – “commit”

Normal execution
• Issue
• Execute
• Write result

Speculative execution
• Issue
• Execute
• Write result
• Commit

Page 6: Lec Feb09 2009

Extending Tomasulo’s scheme for speculative execution – contd.

• Write results into ROB in the “write result” stage
• Write results into register file or memory in the “commit” stage
• Dependent instructions can read operands from ROB
• A speculative instruction commits only if the prediction is determined to be correct
• Instructions may complete execution out-of-order, but they commit in-order
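
The last point can be illustrated with a small Python sketch. It is only a toy model, not the hardware design from the lecture: the names `rob`, `issue`, `write_result` and `commit` are hypothetical, and each ROB entry keeps just an instruction name and a ready flag.

```python
from collections import deque

rob = deque()        # re-order buffer: head on the left, tail on the right
committed = []       # instructions retired so far, in commit order

def issue(name):
    """Allocate a ROB entry at the tail, in program order."""
    entry = {"inst": name, "rdy": False}
    rob.append(entry)
    return entry

def write_result(entry):
    """Mark an entry as finished; this may happen in any order."""
    entry["rdy"] = True

def commit():
    """Retire entries from the head only while the head has its result."""
    while rob and rob[0]["rdy"]:
        committed.append(rob.popleft()["inst"])

i1, i2, i3 = issue("I1"), issue("I2"), issue("I3")

write_result(i3)
commit()                      # I3 is done, but I1 still blocks the head
after_i3 = list(committed)

write_result(i1)
commit()                      # I1 retires; I2 is still pending
after_i1 = list(committed)

write_result(i2)
commit()                      # I2 and then I3 retire, in program order
```

Although I3 finishes first, nothing is retired until I1 reaches the ready state at the head of the buffer.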

Page 7: Lec Feb09 2009

Recall Tomasulo’s scheme ......

Page 8: Lec Feb09 2009

Issue

• Get next instruction from instruction queue
• Check if there is a matching RS which is empty
  – no: structural hazard, instruction stalls
  – yes: issue the instruction to that RS
• For each operand, check if it is available in RF
  – yes: put the operand in the RS
  – no: keep track of FU that will produce it

Page 9: Lec Feb09 2009

Execute

• If one or more operands not available, wait and monitor CDB

• When an operand becomes available, it is placed in RS

• When all operands are available, start execution

• Choice may need to be made if multiple instructions become ready at the same time

Page 10: Lec Feb09 2009

Write result

• When result is available
  – write it on CDB and
  – from there into RF and relevant RSs

• Mark RS as available

Page 11: Lec Feb09 2009

More formal description ......

Page 12: Lec Feb09 2009

RS and RF fields

RS: op | busy | Qj | Vj | Qk | Vk
RF: val | Qi

Page 13: Lec Feb09 2009

Issue

• Get instruction <op, rd, rs, rt> from instruction queue
• Wait until ∃ r | RS[r].busy = no and RF[rd].Qi = φ
• if (RF[rs].Qi ≠ φ) {RS[r].Qj ← RF[rs].Qi}
  else {RS[r].Vj ← RF[rs].val; RS[r].Qj ← φ}
• similarly for rt
• RS[r].op ← op; RS[r].busy ← yes; RF[rd].Qi ← r
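
The issue rules above can be tried out with a short Python sketch. It is a minimal model under stated assumptions: `PHI` (standing for φ), the dictionary layout of `RS` and `RF`, the helper `issue`, and the register names and values are all illustrative choices, not part of the lecture.

```python
PHI = None  # stands for φ: no pending producer

def issue(inst, RS, RF):
    """Try to issue <op, rd, rs, rt>; return the RS index used, or None on a stall."""
    op, rd, rs, rt = inst
    free = [r for r, e in enumerate(RS) if not e["busy"]]
    # Wait condition: some RS is free and rd has no pending writer.
    if not free or RF[rd]["Qi"] is not PHI:
        return None
    r = free[0]
    for src, Q, V in ((rs, "Qj", "Vj"), (rt, "Qk", "Vk")):
        if RF[src]["Qi"] is not PHI:
            RS[r][Q] = RF[src]["Qi"]      # operand pending: remember producer
        else:
            RS[r][V] = RF[src]["val"]     # operand ready: copy its value
            RS[r][Q] = PHI
    RS[r]["op"], RS[r]["busy"] = op, True
    RF[rd]["Qi"] = r                      # rd will be produced by station r
    return r

RS = [dict(op=None, busy=False, Qj=PHI, Vj=None, Qk=PHI, Vk=None)
      for _ in range(2)]
RF = {name: {"val": val, "Qi": PHI}
      for name, val in (("R1", 10), ("R2", 20), ("R3", 0), ("R4", 0))}

a = issue(("add", "R3", "R1", "R2"), RS, RF)   # both operands are in RF
b = issue(("mul", "R4", "R3", "R1"), RS, RF)   # R3 is pending on station a
c = issue(("sub", "R1", "R1", "R2"), RS, RF)   # no free RS left: stall
```

The second instruction records station `a` in its `Qj` field instead of a value, which is how the data dependence on R3 is tracked.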

Page 14: Lec Feb09 2009

Execute

• Wait until RS[r].Qj = φ and RS[r].Qk = φ
• Compute result: operation is RS[r].op, operands are RS[r].Vj and RS[r].Vk

Page 15: Lec Feb09 2009

Write result

• Wait until execution complete at r and CDB available
• ∀ x if (RF[x].Qi = r) {RF[x].val ← result; RF[x].Qi ← φ}
• ∀ x if (RS[x].Qj = r) {RS[x].Vj ← result; RS[x].Qj ← φ}
• similarly for Qk / Vk
• RS[r].busy ← no
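
As a rough illustration of the CDB broadcast described above, the following Python sketch applies the same updates to a tiny hand-built state. The data layout, the name `write_result` and the sample values are assumptions made for the example only.

```python
PHI = None  # stands for φ: no pending producer

def write_result(r, result, RS, RF):
    """Broadcast the result of station r to every waiting register and RS."""
    for reg in RF.values():
        if reg["Qi"] == r:
            reg["val"], reg["Qi"] = result, PHI
    for e in RS:
        if e["Qj"] == r:
            e["Vj"], e["Qj"] = result, PHI
        if e["Qk"] == r:
            e["Vk"], e["Qk"] = result, PHI
    RS[r]["busy"] = False                 # station r is free again

# Station 0 computes R3 = 10 + 20; station 1 waits for R3 as its first operand.
RS = [dict(op="add", busy=True, Qj=PHI, Vj=10, Qk=PHI, Vk=20),
      dict(op="mul", busy=True, Qj=0, Vj=None, Qk=PHI, Vk=10)]
RF = {"R3": {"val": 0, "Qi": 0}, "R4": {"val": 0, "Qi": 1}}

write_result(0, 30, RS, RF)
```

After the call, R3 holds 30 in the register file, station 1 has both operands, and R4 is still waiting on station 1.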

Page 16: Lec Feb09 2009

Tomasulo’s scheme plus ROB......

Page 17: Lec Feb09 2009

Issue

• Get next instruction from instruction queue
• Check if there is a matching RS which is empty and an empty slot in ROB
  – no: structural hazard, instruction stalls
  – yes: issue the instruction to that RS and mark the ROB slot, also put ROB slot number in RS
• For each operand, check if it is available in RF or ROB
  – yes: put the operand in the RS
  – no: keep track of FU that will produce it

Page 18: Lec Feb09 2009

Execute (no change)

• If one or more operands not available, wait and monitor CDB

• When an operand becomes available, it is placed in RS

• When all operands are available, start execution

• Choice may need to be made if multiple instructions become ready at the same time

Page 19: Lec Feb09 2009

Write result

• When result is available
  – write it on CDB with ROB tag and
  – from there into ROB (instead of RF) and relevant RSs

• Mark RS as available

Page 20: Lec Feb09 2009

Commit (non-branch instruction)

• Wait until instruction reaches head of ROB
• Update RF
• Remove instruction from ROB

Page 21: Lec Feb09 2009

Commit (branch instruction)

• Wait until instruction reaches head of ROB
• If branch is mispredicted,
  – flush ROB
  – restart execution at correct successor of the branch instruction
• else
  – remove instruction from ROB

Page 22: Lec Feb09 2009

More formal description ......

Page 23: Lec Feb09 2009

RS fields

op | busy | Qi | Qj | Vj | Qk | Vk

Page 24: Lec Feb09 2009

RF fields

val | Qi | busy

Page 25: Lec Feb09 2009

ROB fields

inst | busy | rdy | val | dst

Page 26: Lec Feb09 2009

Issue

• Get instruction <op, rd, rs, rt> from instruction queue
• Wait until ∃ r | RS[r].busy = no and ROB[b].busy = no, where b = ROB tail (this replaces the earlier condition RF[rd].Qi = φ)
• if (RF[rs].busy) {h ← RF[rs].Qi;
    if (ROB[h].rdy) {RS[r].Vj ← ROB[h].val; RS[r].Qj ← φ}
    else {RS[r].Qj ← h}
  } else {RS[r].Vj ← RF[rs].val; RS[r].Qj ← φ}
  (the test RF[rs].busy replaces the earlier test RF[rs].Qi ≠ φ)
• similarly for rt
• RS[r].op ← op; RS[r].busy ← yes; RS[r].Qi ← b
• RF[rd].Qi ← b (instead of r); RF[rd].busy ← yes; ROB[b].busy ← yes
• ROB[b].inst ← op; ROB[b].dst ← rd; ROB[b].rdy ← no
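
A possible Python rendering of this issue step is sketched below. Several details are assumptions made only for the example: the ROB is modelled as a circular list with the tail index kept in `state["tail"]`, `PHI` stands for φ, and the register names and values are arbitrary.

```python
PHI = None  # stands for φ: no pending producer

def issue(inst, RS, RF, ROB, state):
    """Issue <op, rd, rs, rt> using a ROB; return the RS index, or None on a stall."""
    op, rd, rs, rt = inst
    b = state["tail"]
    free = [r for r, e in enumerate(RS) if not e["busy"]]
    if not free or ROB[b]["busy"]:
        return None                           # structural hazard: stall
    r = free[0]
    for src, Q, V in ((rs, "Qj", "Vj"), (rt, "Qk", "Vk")):
        if RF[src]["busy"]:
            h = RF[src]["Qi"]                 # ROB entry that will produce src
            if ROB[h]["rdy"]:
                RS[r][V], RS[r][Q] = ROB[h]["val"], PHI   # read value from ROB
            else:
                RS[r][Q] = h                  # wait for ROB tag h on the CDB
        else:
            RS[r][V], RS[r][Q] = RF[src]["val"], PHI      # read value from RF
    RS[r].update(op=op, busy=True, Qi=b)
    RF[rd].update(Qi=b, busy=True)
    ROB[b].update(inst=op, dst=rd, rdy=False, busy=True)
    state["tail"] = (b + 1) % len(ROB)        # assumption: circular buffer
    return r

RS = [dict(op=None, busy=False, Qi=PHI, Qj=PHI, Vj=None, Qk=PHI, Vk=None)
      for _ in range(3)]
ROB = [dict(inst=None, busy=False, rdy=False, val=None, dst=None)
       for _ in range(4)]
RF = {name: {"val": val, "Qi": PHI, "busy": False}
      for name, val in (("R1", 10), ("R2", 20), ("R3", 0), ("R4", 0), ("R5", 0))}
state = {"tail": 0}

issue(("add", "R3", "R1", "R2"), RS, RF, ROB, state)   # gets ROB entry 0
issue(("mul", "R4", "R3", "R1"), RS, RF, ROB, state)   # R3 not ready: Qj = 0
ROB[0].update(rdy=True, val=30)                        # add has written its result
issue(("sub", "R5", "R3", "R2"), RS, RF, ROB, state)   # R3 read from ROB[0]
```

The third instruction picks up R3 directly from the ROB, even though the value has not yet been committed to the register file.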

Page 27: Lec Feb09 2009

Execute (no change)

• Wait until RS[r].Qj = φ and RS[r].Qk = φ
• Compute result: operation is RS[r].op, operands are RS[r].Vj and RS[r].Vk

Page 28: Lec Feb09 2009

Write result

• Wait until execution complete at r and CDB available
• b ← RS[r].Qi
• The RF update of the basic scheme, ∀ x if (RF[x].Qi = r) {RF[x].val ← result; RF[x].Qi ← φ}, no longer happens here: RF is written in the commit stage
• ∀ x if (RS[x].Qj = b) {RS[x].Vj ← result; RS[x].Qj ← φ} (the tag is b instead of r)
• similarly for Qk / Vk
• RS[r].busy ← no
• ROB[b].rdy ← yes; ROB[b].val ← result
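
The write-result step with a ROB can be sketched in Python as follows. This is a minimal model under assumptions: the dictionary layout, the name `write_result` and the sample state are illustrative, and the result is tagged with ROB entry 2 on purpose so that it cannot be confused with the RS index 0.

```python
PHI = None  # stands for φ: no pending producer

def write_result(r, result, RS, ROB):
    """Broadcast the result of station r, tagged with its ROB entry number."""
    b = RS[r]["Qi"]                       # ROB tag carried on the CDB
    for e in RS:
        if e["Qj"] == b:
            e["Vj"], e["Qj"] = result, PHI
        if e["Qk"] == b:
            e["Vk"], e["Qk"] = result, PHI
    RS[r]["busy"] = False
    ROB[b]["rdy"], ROB[b]["val"] = True, result   # RF is not touched here

RS = [dict(op="add", busy=True, Qi=2, Qj=PHI, Vj=10, Qk=PHI, Vk=20),
      dict(op="mul", busy=True, Qi=3, Qj=2, Vj=None, Qk=PHI, Vk=10)]
ROB = [dict(inst=None, busy=False, rdy=False, val=None, dst=None)
       for _ in range(4)]
ROB[2].update(inst="add", busy=True, dst="R3")
ROB[3].update(inst="mul", busy=True, dst="R4")

write_result(0, 30, RS, ROB)
```

The value now sits in ROB entry 2 and in the waiting reservation station; the register file is updated only when the entry commits.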

Page 29: Lec Feb09 2009

Commit (non-branch instruction)

• Wait until instruction reaches head of ROB (entry h) and ROB[h].rdy = yes
• d ← ROB[h].dst
• RF[d].val ← ROB[h].val
• ROB[h].busy ← no
• if (RF[d].Qi = h) {RF[d].busy ← no}
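
The role of the final test, RF[d].Qi = h, is easiest to see with two in-flight writers of the same register. The Python sketch below assumes a circular ROB with the head index kept in `state["head"]`; this bookkeeping and the sample values are illustrative additions, not part of the lecture.

```python
def commit(RF, ROB, state):
    """Commit the instruction at the ROB head if its result is ready."""
    h = state["head"]
    if not (ROB[h]["busy"] and ROB[h]["rdy"]):
        return False                      # nothing to commit yet
    d = ROB[h]["dst"]
    RF[d]["val"] = ROB[h]["val"]
    ROB[h]["busy"] = False
    if RF[d]["Qi"] == h:
        RF[d]["busy"] = False             # no later in-flight writer of d
    state["head"] = (h + 1) % len(ROB)    # assumption: circular buffer
    return True

# Two instructions write R3; the younger one (entry 1) is not ready yet.
ROB = [dict(inst="add", busy=True, rdy=True, val=30, dst="R3"),
       dict(inst="sub", busy=True, rdy=False, val=None, dst="R3")]
RF = {"R3": {"val": 0, "Qi": 1, "busy": True}}
state = {"head": 0}

first = commit(RF, ROB, state)            # add commits; R3 stays busy
snapshot = (RF["R3"]["val"], RF["R3"]["busy"])
second = commit(RF, ROB, state)           # sub is not ready: no commit
ROB[1].update(rdy=True, val=40)
third = commit(RF, ROB, state)            # sub commits; R3 is no longer busy
```

After the first commit R3 holds 30 but remains marked busy, because its latest producer is ROB entry 1, not the entry that just committed.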

Page 30: Lec Feb09 2009

Commit (branch instruction)

• Wait until instruction reaches head of ROB (entry h) and ROB[h].rdy = yes
• If branch is mispredicted,
  – clear ROB, RF[ ].Qi
  – fetch branch dest
• else
  – ROB[h].busy ← no
  – if (RF[d].Qi = h) {RF[d].busy ← no}
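
A hedged Python sketch of the branch commit follows. It makes several assumptions beyond the rules above: the head, tail and fetch address are kept in a `state` dictionary, the flush also resets the RF busy flags (since the issue step tests them), a branch is given no destination register (`dst` is `None`), and reservation stations are not modelled.

```python
PHI = None  # stands for φ: no pending producer

def commit_branch(RF, ROB, state, mispredicted, correct_pc):
    """Commit the branch at the ROB head; flush everything on a misprediction."""
    h = state["head"]
    if not (ROB[h]["busy"] and ROB[h]["rdy"]):
        return False
    if mispredicted:
        for e in ROB:                     # clear the whole ROB
            e.update(busy=False, rdy=False)
        for reg in RF.values():           # clear RF[ ].Qi (and busy: assumption)
            reg.update(Qi=PHI, busy=False)
        state["head"] = state["tail"] = 0
        state["pc"] = correct_pc          # fetch from the correct successor
    else:
        ROB[h]["busy"] = False
        d = ROB[h]["dst"]
        if d is not None and RF[d]["Qi"] == h:
            RF[d]["busy"] = False
        state["head"] = (h + 1) % len(ROB)
    return True

def make_state():
    """Build a branch at the head followed by one speculative add to R3."""
    rob = [dict(inst="beq", busy=True, rdy=True, val=None, dst=None),
           dict(inst="add", busy=True, rdy=False, val=None, dst="R3"),
           dict(inst=None, busy=False, rdy=False, val=None, dst=None)]
    rf = {"R3": {"val": 5, "Qi": 1, "busy": True}}
    return rf, rob, {"head": 0, "tail": 2, "pc": 0x10}

RF_bad, ROB_bad, st_bad = make_state()
commit_branch(RF_bad, ROB_bad, st_bad, mispredicted=True, correct_pc=0x40)

RF_ok, ROB_ok, st_ok = make_state()
commit_branch(RF_ok, ROB_ok, st_ok, mispredicted=False, correct_pc=0x40)
```

In the mispredicted case the speculative add disappears without ever having changed R3, which is the "undo" ability listed among the requirements for speculation.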