Lec Feb05 2009

Anshul Kumar, CSE IITD. CSL718 : Superscalar Processors. Renaming and Reordering. 5th Feb, 2009

Transcript of Lec Feb05 2009

Page 1: Lec Feb05 2009

Anshul Kumar, CSE IITD

CSL718 : Superscalar Processors

Renaming and Reordering

5th Feb, 2009

Page 2: Lec Feb05 2009

Why Renaming and Reordering?

• Register Renaming
  – Removes false dependencies (WAR and WAW)
• Reordering Buffer (ROB)
  – Ensures sequential consistency of interrupts (precise vs imprecise interrupts)
  – Facilitates speculative execution

Page 3: Lec Feb05 2009

RAW, WAR and WAW (in Static Pipeline)

[Figure: overlapped pipeline diagrams with stages IF, D, RF, EX, WB, marking the RAW, WAR and WAW cases; one instruction shows additional EX cycles.]
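The three dependence types named on this slide can be illustrated with a small sketch. The `Instr` record and the `dependences` function are hypothetical names introduced here for illustration only; the sketch assumes each instruction writes at most one register.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class Instr:
    dest: Optional[str]          # register written, or None if nothing is written
    srcs: Tuple[str, ...] = ()   # registers read


def dependences(first: Instr, second: Instr) -> Set[str]:
    """Dependences of `second` on `first`, where `first` is earlier in program order."""
    found = set()
    if first.dest is not None and first.dest in second.srcs:
        found.add("RAW")   # second reads what first writes (true dependence)
    if second.dest is not None and second.dest in first.srcs:
        found.add("WAR")   # second overwrites a register first still has to read
    if first.dest is not None and first.dest == second.dest:
        found.add("WAW")   # both write the same register
    return found
```

For example, `dependences(Instr("R5", ("R1",)), Instr("R6", ("R5",)))` reports only a RAW dependence. WAR and WAW come from reuse of a register name rather than from data flow, which is why renaming can remove them.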

Page 4: Lec Feb05 2009

RAW, WAR and WAW (in Superscalar)

[Figure: three instructions (write, read, write) passing through stages IF, IS, DP, EX, WB, with RAW, WAR and WAW marked; scoreboard bit annotations b ← 1, b ← 0, b ← 1.]

scoreboard bit set by write, cleared by read

what happens when there are multiple reads for a write?

Page 5: Lec Feb05 2009

Implementation using scoreboard bit

[Figure: pipeline diagrams (stages IF, IS, DP, EX, WB) for write and read instructions, with RAW, WAR and WAW marked and scoreboard bit updates b ← 0 and b ← 1 annotated.]

in order issue, scoreboard bit set by write, cleared at issue time

issue only if there are no pending reads

Page 6: Lec Feb05 2009

CDC 6600 like Implementation

[Figure: three instructions (write, read, write) passing through stages IF, IS, DP, EX, WB, with RAW, WAR and WAW marked; scoreboard annotations b ← FU1, b ← φ, b ← FU2.]

Page 7: Lec Feb05 2009

IBM 360 like Implementation

[Figure: three instructions (write, read, write) passing through stages IF, IS, DP, EX, WB, with RAW, WAR and WAW marked; annotations b ← FU1, b ← φ, b ← FU2.]

Page 8: Lec Feb05 2009

Use of Renaming

[Figure: three instructions (write, read, write) passing through stages IF, IS, DP, EX, WB, with RAW, WAR and WAW marked.]

Page 9: Lec Feb05 2009

Register renaming

Before renaming:
  write R5
  read R5    (RAW)
  write R5   (WAR)
  read R5    (RAW)

After renaming:
  write R5
  read R5    (RAW)
  write R8
  read R8    (RAW)
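The effect shown on this slide can be sketched in a few lines. This is a minimal illustration, not the lecture's mechanism: the function `rename`, its `(dest, srcs)` instruction format and the register counts are assumptions made here. Unlike the slide, which keeps R5 for the first write, this sketch gives every write a fresh physical register, and it never returns registers to the free list.

```python
def rename(program, num_arch=8, num_phys=16):
    """Rename a list of (dest, srcs) instructions.

    dest is an architectural register number or None; srcs is a tuple of
    architectural register numbers. Returns the same list with physical
    register numbers substituted.
    """
    mapping = {r: r for r in range(num_arch)}   # architectural -> physical
    free = list(range(num_arch, num_phys))      # unused physical registers
    renamed = []
    for dest, srcs in program:
        # Sources use the current mapping, so RAW dependences are preserved.
        new_srcs = tuple(mapping[s] for s in srcs)
        new_dest = None
        if dest is not None:
            # A fresh destination removes WAR and WAW on the old name.
            # pop raises IndexError when no physical register is free;
            # real hardware would stall instead.
            new_dest = free.pop(0)
            mapping[dest] = new_dest
        renamed.append((new_dest, new_srcs))
    return renamed
```

With the default sizes, the slide's sequence (write R5, read R5, write R5, read R5) becomes write P8, read P8, write P9, read P9: the second write no longer has to wait for the first read.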

Page 10: Lec Feb05 2009

Who does renaming?

• Compiler
  – Done statically
  – Limited by registers visible to compiler
• Hardware
  – Done dynamically
  – Limited by registers available to hardware

Page 11: Lec Feb05 2009

Types of renaming buffers

• Separate renaming register file and architectural register file
• Combined renaming and architectural register file
• Renaming combined with reordering
• Renaming combined with reservation stations and reordering

Page 12: Lec Feb05 2009

How does renaming work? (in the context of a combined register file)

[Figure: register address from instruction → mapping → physical register file (larger than the architectural register file).]

Page 13: Lec Feb05 2009

Types of mapping

Indexed
• Inexpensive
• Two steps required
  – Look up index
  – Read value

Associative
• Expensive
• Single step associative access

Page 14: Lec Feb05 2009

Renaming with indexed access

[Figure: the register number selects a row of the mapping table (fields: entry valid, index); the index selects an entry of the physical register file (fields: value, value valid).]
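A minimal sketch of the two-step indexed lookup follows, using the field names from the figure (entry valid, index, value, value valid). The class `IndexedRename`, its method names and the return conventions are assumptions made for illustration; in particular, what a reader does when no mapping entry is valid depends on the organisation and is left to the caller here.

```python
class IndexedRename:
    def __init__(self, num_arch, num_phys):
        # Mapping table: one row per architectural register.
        self.entry_valid = [False] * num_arch
        self.index = [0] * num_arch
        # Physical register file.
        self.value = [0] * num_phys
        self.value_valid = [False] * num_phys
        self.free = list(range(num_phys))

    def allocate(self, reg):
        """Rename the destination `reg` to a free physical register."""
        p = self.free.pop(0)          # IndexError if none free (a stall in hardware)
        self.value_valid[p] = False   # result not produced yet
        self.entry_valid[reg] = True
        self.index[reg] = p
        return p

    def write_back(self, p, v):
        """A functional unit delivers the result for physical register p."""
        self.value[p] = v
        self.value_valid[p] = True

    def read(self, reg):
        if not self.entry_valid[reg]:
            return None                    # no mapping for this register
        p = self.index[reg]                # step 1: look up index
        if not self.value_valid[p]:
            return ("wait", p)             # value pending; wait on tag p
        return ("value", self.value[p])    # step 2: read value
```

The mapping table is small and directly addressed, which is what makes this scheme inexpensive, at the cost of the second access.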

Page 15: Lec Feb05 2009

Renaming with associative access

[Figure: the register number is matched against the reg num field of every entry of the associative physical register file (fields: entry valid, reg num, value, value valid, latest).]
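The associative variant can be sketched with the fields from the figure. The names `Entry` and `AssociativeRename` are hypothetical, and the loop over entries stands in for what hardware would do as a parallel compare. The role assumed here for the latest bit is to pick the most recent of several entries that carry the same register number.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    entry_valid: bool = False
    reg_num: int = -1
    value: int = 0
    value_valid: bool = False
    latest: bool = False


class AssociativeRename:
    def __init__(self, size):
        self.entries = [Entry() for _ in range(size)]

    def allocate(self, reg):
        """Create a new entry for destination `reg` and mark it as the latest."""
        for e in self.entries:
            if e.entry_valid and e.reg_num == reg:
                e.latest = False           # older versions are no longer the latest
        for i, e in enumerate(self.entries):
            if not e.entry_valid:
                self.entries[i] = Entry(True, reg, 0, False, True)
                return i
        raise RuntimeError("no free entry")  # hardware would stall

    def write_back(self, i, v):
        self.entries[i].value = v
        self.entries[i].value_valid = True

    def read(self, reg):
        """Single-step lookup: match on register number among latest entries."""
        for i, e in enumerate(self.entries):
            if e.entry_valid and e.latest and e.reg_num == reg:
                return ("value", e.value) if e.value_valid else ("wait", i)
        return None                        # no entry holds this register
```

Freeing entries at commit is omitted; only allocation, write-back and lookup are shown.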

Page 16: Lec Feb05 2009

Handling interrupts

[Figure: status of instruction execution at the time of interrupt, in program order: completed, under execution, not started. Annotation: these can “commit”.]

Page 17: Lec Feb05 2009

Speculative execution

[Figure: a predicted branch followed by speculative execution.]

Don’t commit till correctness of prediction is determined.

Page 18: Lec Feb05 2009

Reordering

[Figure: the reorder buffer drawn as a queue; instructions enter at one end and commit/retire at the other; each entry is marked with its state.]

Legend: i: issued, x: in execution, f: finished
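The queue behaviour in this figure can be sketched as follows, using the state letters from the legend (i, x, f). The class `ReorderBuffer`, the dictionary layout of an entry and the method names are assumptions made for illustration.

```python
from collections import deque


class ReorderBuffer:
    def __init__(self):
        self.entries = deque()   # oldest instruction at the left

    def enter(self, tag):
        """Instructions enter in program order, in state 'i' (issued)."""
        entry = {"tag": tag, "state": "i"}
        self.entries.append(entry)
        return entry

    def commit(self):
        """Retire finished instructions from the head, in program order only."""
        retired = []
        while self.entries and self.entries[0]["state"] == "f":
            retired.append(self.entries.popleft()["tag"])
        return retired
```

An instruction that finishes early stays in the buffer until every older instruction has finished. For instance, if A, B and C enter in that order and B finishes first, `commit()` retires nothing until A also reaches state f.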

Page 19: Lec Feb05 2009

Using ROB with RF

[Figure: two block diagrams. First: Register File, with paths "to reservation stations/FUs" and "from FUs". Second: Register File and ROB, with paths "to reservation stations/FUs" and "from FUs".]

Page 20: Lec Feb05 2009

Future file and history file

[Figure: two block diagrams. First: Register File, ROB and Future File, with paths "to reservation stations/FUs" and "from FUs", annotated "use in case of interrupts". Second: History File and Future File, with paths "to reservation stations/FUs" and "from FUs", annotated "displaced values" and "update in case of interrupts".]
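One common reading of the history-file arrangement is that each register write saves the value it displaces, so that the working registers can be rolled back when an interrupt occurs. The sketch below follows that reading and is not taken from the slide: the class `HistoryFile`, the sequence numbers and the method names are assumptions, and it further assumes that `write` is called in program order.

```python
class HistoryFile:
    def __init__(self, num_regs):
        self.regs = [0] * num_regs   # working register values
        self.history = []            # (seq, reg, displaced value), oldest first

    def write(self, seq, reg, value):
        """Instruction number `seq` writes `reg`; keep the displaced value."""
        self.history.append((seq, reg, self.regs[reg]))
        self.regs[reg] = value

    def commit_up_to(self, seq):
        """Instructions up to `seq` can no longer be undone; drop their history."""
        self.history = [h for h in self.history if h[0] > seq]

    def rollback(self, seq):
        """Undo, newest first, every write made by instructions after `seq`."""
        while self.history and self.history[-1][0] > seq:
            _, reg, old = self.history.pop()
            self.regs[reg] = old
```

Undoing in reverse order matters when a register has been written more than once, since only the oldest displaced value is the one to end up with.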

Page 21: Lec Feb05 2009

Combining renaming and reordering

• Use physical register file as ROB as well
• Maintain status about committed and uncommitted values

Page 22: Lec Feb05 2009

How much to speculate?

• Handle exceptions in speculated instructions?
  – handle only low cost exception events such as first level cache miss
  – wait if expensive exceptional event occurs such as second level cache miss or TLB miss
• Speculating through multiple branches
  – needed when branches are frequent or clustered
  – even handling multiple branches in a cycle may be required