Tor Aamodt (aamodt@eecg.utoronto…), Andreas Moshovos, Paul Chow

Electrical and Computer Engineering

University of Toronto

Canada

The Predictability of Computations that Produce Unpredictable Outcomes

Outcome-Based Prediction

History of outcomes leading up to branch “X”:

TNTTNTT ... NTN ... TNTTNTT

Why this works:

• Locality in the outcome stream.

• Next time we encounter X after “TNTTNT”, we can predict “T”.

[Figure labels: History, Outcome of Branch X]
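A minimal sketch of the idea on this slide, assuming a table keyed by the branch and its last six outcomes. The class name `HistoryPredictor`, the history length, and the "NNN" filler standing in for the "..." in the outcome stream are illustrative assumptions, not details from the talk.

```python
from collections import defaultdict


class HistoryPredictor:
    """Toy outcome-based predictor: remembers which outcome most recently
    followed each history of the last `history_len` outcomes of a branch."""

    def __init__(self, history_len=6):
        self.history_len = history_len
        self.history = defaultdict(str)  # branch -> recent outcomes, e.g. "TNTTNT"
        self.table = {}                  # (branch, history) -> outcome that followed

    def predict(self, branch):
        # Returns "T", "N", or None if this history has not been seen before.
        return self.table.get((branch, self.history[branch]))

    def update(self, branch, outcome):
        hist = self.history[branch]
        self.table[(branch, hist)] = outcome
        self.history[branch] = (hist + outcome)[-self.history_len:]


predictor = HistoryPredictor(history_len=6)
for outcome in "TNTTNTT":          # history "TNTTNT" is followed by "T"
    predictor.update("X", outcome)
for outcome in "NNN" + "TNTTNT":   # assumed filler, then the same history again
    predictor.update("X", outcome)
prediction = predictor.predict("X")  # "T": the outcome seen after "TNTTNT" last time
```

The predictor only helps when the same history tends to be followed by the same outcome, which is the outcome locality the slide refers to.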

Problem

• Unpredictable branches are THE problem.

• They exhibit no outcome locality.

Operation-Based Prediction

• Find locality in the computations that produce the outcome

[Figure: example slice with operations bne, slt, ld, add]

This Work

• First work that looks at the fundamental program behaviour that would facilitate operation-based prediction.

• Related work:
– Characterization of slices
– Prefetching loads / pre-execution of branches

Ideally...

• Slice (i.e., slice trace) will always be the same.

• Slice will contain very few operations while spanning a large portion of the original program.

• Easy (fast) to pre-compute.

Terminology

• Lead: earliest instruction in slice

• Target: branch we want to precompute

[Figure: example slice with operations bne, slt, ld, add]

What Should a Slice be?

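• Committed instructions: window of 32, 64, 128, or 256

• Ignore control flow: retain the side effect of JAL on $r31

• Memory dependence: follow resolved load-store dependence (denoted M)

• Restrict # instructions: R = max 1/4, U = “no restriction”

Configurations are named from these choices, e.g. 32UM is a 32-instruction window, unrestricted, following memory dependences.

[Figure: instruction window between FETCH and COMMIT; axis label: older]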
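The slice definition above can be sketched as a backward walk over the committed-instruction window. This is a hedged illustration, not the authors' tool: `Inst`, `extract_slice`, and the (pc, op) trace encoding are hypothetical, and reading "R = max 1/4" as "at most a quarter of the window size, target included" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Inst:
    pc: int
    op: str
    dest: Optional[str] = None   # register written, if any (JAL would write "r31")
    srcs: Tuple[str, ...] = ()   # registers read
    addr: Optional[int] = None   # resolved address for ld/st


def extract_slice(committed, window=32, follow_memory=True, restricted=True):
    """Walk backward from the last committed instruction (the target branch)
    and collect the in-window instructions it depends on."""
    trace = committed[-window:]
    target = trace[-1]
    # ASSUMPTION: "R" caps the slice at a quarter of the window size.
    limit = window // 4 if restricted else len(trace)
    live_regs = set(target.srcs)
    live_addrs = set()
    picked = [target]
    for inst in reversed(trace[:-1]):
        if len(picked) >= limit:
            break
        feeds_reg = inst.dest is not None and inst.dest in live_regs
        feeds_mem = inst.op == "st" and inst.addr in live_addrs
        if not (feeds_reg or feeds_mem):
            continue  # unrelated work and intervening branches are skipped
        picked.append(inst)
        if feeds_reg:
            live_regs.discard(inst.dest)
        if feeds_mem:
            live_addrs.discard(inst.addr)
        live_regs.update(inst.srcs)
        if follow_memory and inst.op == "ld" and inst.addr is not None:
            live_addrs.add(inst.addr)  # an older store to this address joins the slice
    picked.reverse()  # oldest first, so the first entry is the lead
    return tuple((i.pc, i.op) for i in picked)


trace = [
    Inst(0x100, "add", dest="r1", srcs=("r2", "r3")),
    Inst(0x104, "st", srcs=("r9", "r8"), addr=0x2000),
    Inst(0x108, "ld", dest="r4", srcs=("r1",), addr=0x1000),
    Inst(0x10C, "sub", dest="r7", srcs=("r7", "r6")),
    Inst(0x110, "slt", dest="r5", srcs=("r4", "r0")),
    Inst(0x114, "bne", srcs=("r5", "r0")),
]
slice_trace = extract_slice(trace, window=32)
```

In this example the unrelated `st` and `sub` are dropped, leaving the add, ld, slt, bne chain with the `add` as lead and the `bne` as target.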

Methodology

• 12 programs from SPEC2000

• Baseline outcome-prediction hardware
– 64K gshare + 64K bimodal with 64K selector
– 64-entry RAS

• sim-outorder (SimpleScalar 3.0):
– 8-way, 128-entry RUU, 64-entry fetch buffer
– 64K dual L1, 256K unified L2
– 64-entry LSQ
– Perfect memory disambiguation

Measuring Slice Locality

• locality(1) = Probability same slice was seen last time. High value of locality(1) indicates that last-operation based slice prediction would work well.

• locality(N) = Probability same slice seen in last N unique slices.

Measuring Slice Locality

• Save the FOUR unique, most recent slice traces per static branch (only on misprediction).

• Each time a mispredicted branch is encountered check whether the slice trace was the most recent, 2nd most recent, etc...
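The bookkeeping described in these two bullets might look like the sketch below. The class name `SliceLocality` and the choice to count a branch's very first misprediction in the denominator are assumptions made for illustration.

```python
from collections import defaultdict


class SliceLocality:
    """Keeps the most recent unique slice traces per static branch and counts
    at which recency position each newly observed trace is found."""

    def __init__(self, depth=4):
        self.depth = depth
        self.recent = defaultdict(list)               # branch pc -> traces, most recent first
        self.hits = defaultdict(lambda: [0] * depth)  # branch pc -> hits per recency position
        self.seen = defaultdict(int)                  # branch pc -> mispredictions observed

    def record_misprediction(self, branch_pc, slice_trace):
        recent = self.recent[branch_pc]
        self.seen[branch_pc] += 1
        if slice_trace in recent:
            self.hits[branch_pc][recent.index(slice_trace)] += 1
            recent.remove(slice_trace)
        recent.insert(0, slice_trace)  # move (or add) to the front
        del recent[self.depth:]        # keep only `depth` unique traces

    def locality(self, branch_pc, n):
        """Fraction of mispredictions whose trace was among the last n unique traces."""
        if self.seen[branch_pc] == 0:
            return 0.0
        return sum(self.hits[branch_pc][:n]) / self.seen[branch_pc]


tracker = SliceLocality(depth=4)
for name in ["A", "A", "B", "A", "C", "A"]:  # stand-ins for real slice traces
    tracker.record_misprediction(0x114, name)
locality_1 = tracker.locality(0x114, 1)  # 1 of 6 matched the most recent trace
locality_2 = tracker.locality(0x114, 2)  # 3 of 6 matched one of the last two
```

Here locality(1) counts only matches against the most recent trace, while locality(N) also credits matches against older saved traces.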

Measuring Slice Locality

• All results are weighted averages.

• Result for each static branch weighted proportionally to the number of times the operation-based predictor mispredicted it.

• Characteristics of branches that cause most mispredictions emphasized.
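As a small worked example of this weighting, the sketch below averages per-branch locality with misprediction counts as weights. The function name `weighted_locality` and the numbers are made up for illustration.

```python
def weighted_locality(per_branch):
    """per_branch: list of (locality, misprediction_count) pairs, one per static branch."""
    total = sum(count for _, count in per_branch)
    if total == 0:
        return 0.0
    return sum(loc * count for loc, count in per_branch) / total


# One branch with many mispredictions and high locality,
# one with few mispredictions and low locality.
branches = [(0.9, 900), (0.2, 100)]
average = weighted_locality(branches)   # close to 0.83
unweighted = sum(loc for loc, _ in branches) / len(branches)  # close to 0.55
```

The weighted figure sits near the locality of the frequently mispredicted branch, which is the emphasis the last bullet describes.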

Unrestricted Slices: 32UM

[Chart: locality (y-axis, 0 to 1; higher is better); benchmarks shown: gcc, equake, ammp, bzip]

Saving ONE slice captures most of the locality.

Restricted vs. Unrestricted

[Chart: locality (y-axis, 0 to 1; higher is better) for 32RM and 32UM; benchmarks shown: gcc, equake, ammp, bzip]

Most slices have few instructions.

Effect of Memory Dependence

[Chart: locality (y-axis, 0 to 1; higher is better) for 64RM and 64R; benchmarks shown: gcc, equake, ammp, bzip]

Tracking dependence does not affect locality much.

Window Size

[Chart: locality (y-axis, 0 to 1; higher is better) for 256RM, 128RM, 64RM, and 32RM; benchmarks shown: gcc, equake, ammp, bzip]

Locality is good even for large windows.

Effect of Selection Context: 128RM

[Chart: locality (y-axis, 0 to 1; higher is better) for "On Mispredict" and "Always"; benchmarks shown: gcc, equake, ammp, bzip]

Focusing on mispredictions improves locality.

Idealized Predictor

[Figure label: Lead PC]

• Spawn and execute instantaneously when the lead operation is encountered.

• Store up to 4 slice traces per lead operation.
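One way to picture the storage behind this idealized predictor is a table indexed by lead PC that holds at most four slice traces per entry. The sketch below is an assumption-laden illustration: `SliceTable`, the (pc, op) trace encoding, and the "drop the least recently learned trace" replacement policy are not stated in the talk.

```python
from collections import defaultdict, deque


class SliceTable:
    """Maps a lead PC to the slice traces most recently learned for it."""

    def __init__(self, traces_per_lead=4):
        # A bounded deque drops the oldest trace (right end) when a new one
        # is pushed on the left of a full entry.
        self.table = defaultdict(lambda: deque(maxlen=traces_per_lead))

    def learn(self, slice_trace):
        lead_pc = slice_trace[0][0]  # first (pc, op) pair is the lead
        entries = self.table[lead_pc]
        if slice_trace in entries:
            entries.remove(slice_trace)
        entries.appendleft(slice_trace)

    def spawn(self, fetched_pc):
        """Slice traces to pre-execute when `fetched_pc` is fetched, newest first."""
        return list(self.table.get(fetched_pc, ()))


table = SliceTable(traces_per_lead=4)
traces = [((0x100, "add"), (0x110 + 4 * i, "bne")) for i in range(5)]
for t in traces:
    table.learn(t)
spawned = table.spawn(0x100)  # the four most recently learned traces
```

Fetching a PC with no stored traces spawns nothing, so most fetched instructions cost only a lookup in this model.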

Idealized Predictor

• Match operations & register dependencies as instructions are fetched.

• After matching there is usually only one prediction per target, if any (>80% of the time). Otherwise:
– Tie-breaker #1: longest lead-target distance.
– Tie-breaker #2: most recently detected slice.
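The two tie-breakers can be expressed as a single ordered comparison. In this sketch the `Candidate` fields and the `choose_prediction` helper are hypothetical names, and "most recently detected" is modelled with an assumed integer timestamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    predicted_taken: bool
    lead_target_distance: int  # dynamic instructions between lead and target
    detected_at: int           # larger means the slice was detected more recently


def choose_prediction(candidates):
    """Return the winning candidate, or None when no slice matched."""
    if not candidates:
        return None
    # Tuple comparison applies tie-breaker #1 first, then tie-breaker #2.
    return max(candidates, key=lambda c: (c.lead_target_distance, c.detected_at))


matches = [
    Candidate(predicted_taken=True, lead_target_distance=12, detected_at=5),
    Candidate(predicted_taken=False, lead_target_distance=30, detected_at=2),
    Candidate(predicted_taken=True, lead_target_distance=30, detected_at=7),
]
winner = choose_prediction(matches)
```

Two candidates tie on distance 30 here, so the more recently detected one wins and the branch is predicted taken.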

Correcting Mispredictions

[Chart: y-axis 0 to 1; categories wR and wW; configurations 128RM, 64RM, 32RM; benchmarks shown: gcc, equake, ammp, bzip]

High coverage of mispredicted branches.

Interaction with Outcome-Based Predictor

[Chart: y-axis 0 to 0.9; categories rR and rW; configurations 128RM, 64RM, 32RM; benchmarks shown: gcc, equake, ammp, bzip]

Very little destructive interference.

Summary

• Slice locality for mispredicted branches
– average of 70% for restricted slices on a 64-entry window following load-store dependencies (12 SPEC2000 benchmarks).

• Accuracy of idealized predictor
– 74% of mispredicted branches eliminated.

Conclusion

• First work that looks at the fundamental program behaviour, slice-locality, that would facilitate predicting slice traces to pre-execute outcomes.

• SPEC2000 benchmarks show very high slice-locality for mispredicted branches.