Constructive Computer Architecture Tutorial 5 Epoch & Branch...

25
Constructive Computer Architecture Tutorial 5 Epoch & Branch Predictor Sizhuo Zhang 6.175 TA October 30, 2015 T05-1 http://csg.csail.mit.edu/6.175

Transcript of Constructive Computer Architecture Tutorial 5 Epoch & Branch...

Page 1: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

Constructive Computer ArchitectureTutorial 5Epoch & Branch Predictor

Sizhuo Zhang6.175 TA

October 30, 2015 T05-1http://csg.csail.mit.edu/6.175

Page 2: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

1-bit Distributed Epochs

Decode redirects I1 (ieEp=idEp=0)

October 30, 2015 T05-2http://csg.csail.mit.edu/6.175

Executed2eDecodef2dFetchPC

eEp=0feEp=0fdEp=0 dEp=0

...

I1ieEp=0

Delay: 0 cycle

Delay: 100 cycles

Page 3: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

1-bit Distributed Epochs

Decode redirects I1 (ieEp=idEp=0)

October 30, 2015 T05-3http://csg.csail.mit.edu/6.175

Executed2eDecodef2dFetchPC

eEp=0feEp=0fdEp=0 dEp=1

...I1

deEp=0I1 redirect, ieEp=0

Delay: 0 cycle

Delay: 100 cycles

Page 4: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

1-bit Distributed Epochs

Decode redirects I1 (ieEp=idEp=0)Execute redirects I1

October 30, 2015 T05-4http://csg.csail.mit.edu/6.175

Executed2eDecodef2dFetchPC

eEp=0feEp=0 Delay: 0 cycle

fdEp=0 dEp=1Delay: 100 cycles

...I1

deEp=0I1 redirect, ieEp=0

Page 5: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

1-bit Distributed Epochs

Decode redirects I1 (ieEp=idEp=0)Execute redirects I1

October 30, 2015 T05-5http://csg.csail.mit.edu/6.175

Executed2eDecodef2dFetchPC

eEp=1feEp=1 Delay: 0 cycle

fdEp=0 dEp=1Delay: 100 cycles

...I1

deEp=0I1 redirect, ieEp=0

Page 6: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

1-bit Distributed Epochs

Decode redirects I1 (ieEp=idEp=0)Execute redirects I1Correct-path I2 (ieEp=1,idEp=0) issues

October 30, 2015 T05-6http://csg.csail.mit.edu/6.175

Executed2eDecodef2dFetchPC

eEp=1feEp=1 Delay: 0 cycle

fdEp=0 dEp=1Delay: 100 cycles

...I2

deEp=0I1 redirect, ieEp=0

Page 7: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

1-bit Distributed Epochs

Decode redirects I1 (ieEp=idEp=0)Execute redirects I1Correct-path I2 (ieEp=1,idEp=0) issues

October 30, 2015 T05-7http://csg.csail.mit.edu/6.175

Executed2eDecodef2dFetchPC

eEp=1feEp=1 Delay: 0 cycle

fdEp=0 dEp=0Delay: 100 cycles

...I2

I1 redirect, ieEp=0deEp=1

Page 8: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

1-bit Distributed Epochs

Decode redirects I1 (ieEp=idEp=0)Execute redirects I1Correct-path I2 (ieEp=1,idEp=0) issuesExecute redirects I2

October 30, 2015 T05-8http://csg.csail.mit.edu/6.175

Executed2eDecodef2dFetchPC

eEp=1feEp=1 Delay: 0 cycle

fdEp=0 dEp=0Delay: 100 cycles

...I2

I1 redirect, ieEp=0deEp=1

Page 9: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

1-bit Distributed Epochs

Decode redirects I1 (ieEp=idEp=0)Execute redirects I1Correct-path I2 (ieEp=1,idEp=0) issuesExecute redirects I2

October 30, 2015 T05-9http://csg.csail.mit.edu/6.175

Executed2eDecodef2dFetchPC

eEp=0feEp=0 Delay: 0 cycle

fdEp=0 dEp=0Delay: 100 cycles

...I2

I1 redirect, ieEp=0deEp=1

Page 10: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

1-bit Distributed Epochs

Decode redirects I1 (ieEp=idEp=0)Execute redirects I1Correct-path I2 (ieEp=1,idEp=0) issuesExecute redirects I2I1 redirect arrives at Fetch (ieEp == feEp) change PC to a wrong value

October 30, 2015 T05-10http://csg.csail.mit.edu/6.175

Executed2eDecodef2dFetchPC

eEp=0feEp=0 Delay: 0 cycle

fdEp=0 dEp=0Delay: 100 cycles

...I2

I1 redirect, ieEp=0deEp=1

Page 11: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

Unbounded Global Epochs

Both Decode and Execute can redirect the PC Execute redirect should never be overruled

Global epoch for each redirecting stage eEpoch: incremented when redirect from Execute takes effect dEpoch: incremented when redirect from Decode takes effect Initially set all epochs to 0

October 30, 2015 T05-11http://csg.csail.mit.edu/6.175

Executed2eDecodef2dFetchPC

miss pred?

miss pred?

redirect PC

redirect PC

eEpdEp

...

Redirect

Page 12: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

Execute stage

October 30, 2015 T05-12http://csg.csail.mit.edu/6.175

Executed2eDecodef2dFetchPC

miss pred?

miss pred?

{pc, newpc}

{pc, newpc}

eEpdEp

...

{…, ieEp, idEp} {pc, ppc, ieEp}

Is ieEp == eEp? yes

no

Wrong path instruction;poison it

Current instruction is OK; check the ppc prediction by execution, signal a redirect message (by writing EHR) on misprediction

Redirect

Page 13: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

Decode Stage

October 30, 2015 T05-13http://csg.csail.mit.edu/6.175

Executed2eDecodef2dFetchPC

miss pred?

miss pred?

{pc, newpc}

{pc, newpc}

eEpdEp

...

{pc, ppc, ieEp, idEp} {..., ieEp}

Is ieEp == eEp && idEp == dEp ? yes no

Wrong path instruction; drop it

Current instruction is OK; check the ppc prediction via BHT, signal a redirect message (by writing EHR) on misprediction

Redirect

Page 14: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

Fetch stage: Redirect PC, Change Epoch

October 30, 2015 T05-14http://csg.csail.mit.edu/6.175

Executed2eDecodef2dFetchPC

miss pred?

miss pred?

{pc, newpc}

eEpdEp

...

Redirect

If there is redirect message from Execute Set PC to correct value, increment eEp

Else if there is redirect message from Decode Set PC to correct value, increment dEp

We can always request IMem to fetch new instruction New instruction will be tagged with current eEpoch and dEpoch PC redirection overrides next-PC prediction When PC redirects, new instruction must be killed at Decode stage

Page 15: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

Observation

At cycle , instructions between Fetch and Execute: 1… 1 at Fetch, at Execute, at Decode

Invariant: 1 . 2 . ⋯ . 1 1 . 2 . ⋯ . 1

Proved by induction on At cycle , both Decode and Execute redirect . ⟹ 1 . 2 . ⋯ . . ⟹ 1 . 2 . ⋯ . ← 1, invariants still hold at cycle 1

October 30, 2015 T05-15http://csg.csail.mit.edu/6.175

eEpochdEpoch ... Execute ...Fetch Decode...

1

Page 16: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

1-bit Global Epochs

The max difference of values of one type of epoch is 1 Only need 1 bit to encode each epoch Increment epoch flip epoch

October 30, 2015 T05-16http://csg.csail.mit.edu/6.175

Page 17: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

4-stage Pipeline: Code Sketch

October 30, 2015 T05-17http://csg.csail.mit.edu/6.175

module mkProc(Proc);// stage 1: Fetch// stage 2: Decode & read register// stage 3: Execute & access memory// stage 4: Commit (write register file)Ehr#(2, Addr) pc <- mkEhr(?);IMemory iMem <- mkIMemory;...Fifo#(2, Fetch2Decode) f2d <- mkCFFifo;Fifo#(2, Decode2Execute) d2e <- mkCFFifo;Fifo#(2, Execute2Commit) e2c <- mkCFFifo;Reg#(Bool) eEpoch <- mkReg(False);Reg#(Bool) dEpoch <- mkReg(False);Ehr#(2, Maybe#(ExeRedirect)) exeRedirect <- mkEhr(Invalid);Ehr#(2, Maybe#(DecRedirect)) decRedirect <- mkEhr(Invalid); BTB#(16) btb <- mkBTB;BHT#(1024) bht <- mkBHT;

Page 18: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

Execute Stage

October 30, 2015 T05-18http://csg.csail.mit.edu/6.175

rule doExecute;d2e.deq; let x = d2e.first;if(x.ieEp == eEpoch) begin

let eInst = exec(...);if(eInst.mispredict)

exeRedirect[0] <= Valid ({pc: x.pc, newpc: eInst.addr});

...end else e2c.enq(Exec2Commit{poisoned: True, ...});

// wrong path instruction, poison itendrule

BHT training will be in lab

Page 19: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

Decode Stage

October 30, 2015 T05-19http://csg.csail.mit.edu/6.175

rule doDecode;f2d.deq; let x = f2d.first;if(x.ieEp == eEpoch && x.idEp == dEpoch) begin

let stall = ...; // check scoreboard for stallif(!stall) begin

if(x.iType == Br) beginlet bht_ppc = ...; // BHT predictionif(bht_ppc != x.ppc)

decRedirect[0] <= Valid ({pc: x.pc, newpc: bht_ppc});

end...d2e.enq(Decode2Execute{ieEp: x.ieEp, ...});

endend // else: wrong path instruction, killed

endrule

Page 20: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

Fetch Stage

October 30, 2015 T05-20http://csg.csail.mit.edu/6.175

rule doFetch;let inst = iMem.req(pc[0]);let ppc = btb.predPc(pc[0]);pc[0] <= ppc;f2d.enq(Fetch2Decode{pc: pc[0], ppc: ppc, inst: inst,

ieEp: eEpoch, idEp: dEpoch});endrule

rule cononicalizeRedirect;if(exeRedirect[1] matches tagged Valid .r) begin

pc[1] <= r.newpc; btb.update(r.pc, r.newpc);eEpoch <= !eEpoch;

end else if(decRedirect[1] matches tagged Valid .r) beginpc[1] <= r.newpc; btb.update(r.pc, r.newpc);dEpoch <= !dEpoch;

end exeRedirect[1] <= Invalid; decRedirect[1] <= Invalid;

endrule

Page 21: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

Advanced Branch Predictor

BHT cannot predict very accurately Need more sophisticated schemesGlobal branch history taken/not token for previous branchesLocal branch history taken/not token for previous occurrences of

the same branchTournament branch predictor Use both global and local history Alpha 21264

October 30, 2015 T05-21http://csg.csail.mit.edu/6.175

Page 22: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

Tournament Predictor

10-bit PC: index 1024 x 10-bit local history table10-bit local history Index 1024 x 3-bit BHT: prediction 1

12-bit global history Index 4096 x 2-bit BHT: prediction 2 Index 4096 x 2-bit BHT: select between predictions 1, 2

October 30, 2015 T05-22http://csg.csail.mit.edu/6.175

“The Alpha 21264 Microprocessor Architecture”

Page 23: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

TAGE Branch PredictorsTAGE (Tagged Geometric history length)

October 30, 2015 T05-23http://csg.csail.mit.edu/6.175

“A case for partially taggedgeometric history length branch prediction”

Page 24: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

Other Predictors

Perceptron-based predictors Classification problem in machine

learningChampionship Branch Predictor Papers Slides simulation codes

October 30, 2015 T05-24http://csg.csail.mit.edu/6.175

Page 25: Constructive Computer Architecture Tutorial 5 Epoch & Branch …csg.csail.mit.edu/6.175/tutorials/T05-Epoch.pdf · Microsoft PowerPoint - T05-Epoch.pptx Author: Sizhuo Created Date:

Return Address Stack

Use a stack to store return address from function call Push when function is called Pop when returning from a functionInstruction to return from function call JALR: rd = x0, rs1 = x1 (ra)Instruction to initiate function call JAL: rd = x1 JALR: rd = x1

October 30, 2015 T05-25http://csg.csail.mit.edu/6.175