Post on 31-Dec-2015
TOR AAMODT
(aamodt@eecg.utoronto.ca)
Andreas Moshovos
Paul Chow
Electrical and Computer Engineering
University of Toronto
Canada
The Predictability of Computations that Produce Unpredictable Outcomes
Aamodt, Moshovos, Chow, University of Toronto
Outcome-Based Prediction
History of Outcomes leading up to Branch “X”:
TNTTNTT ...NTN... TNTTNTT
Why this works:
Locality in the outcome stream
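Next time we encounter X after the history “TNTTNT”, we can predict “T”.
[Figure labels: “History” marks the outcomes leading up to X; “Outcome of Branch X” marks the outcome that follows them]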
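The idea on this slide can be sketched in a few lines of Python. This is a toy illustration, not the hardware evaluated in the talk: the class name `HistoryPredictor`, the per-branch history string, the default prediction of "T", and the history length of 6 are all assumptions made for the example.

```python
from collections import defaultdict


class HistoryPredictor:
    """Toy outcome-based predictor (hypothetical): for each branch, remember
    which outcome followed a given history of recent outcomes last time."""

    def __init__(self, history_length=6):
        self.history_length = history_length
        self.history = defaultdict(str)  # branch -> string of recent outcomes
        self.table = {}                  # (branch, history) -> last outcome seen

    def predict(self, branch):
        # Assumed default: predict "T" for a (branch, history) pair never seen.
        return self.table.get((branch, self.history[branch]), "T")

    def update(self, branch, outcome):
        # Record what followed the current history, then shift the history.
        self.table[(branch, self.history[branch])] = outcome
        self.history[branch] = (self.history[branch] + outcome)[-self.history_length:]


def accuracy(predictor, branch, outcomes):
    """Fraction of outcomes predicted correctly, training as we go."""
    correct = 0
    for outcome in outcomes:
        if predictor.predict(branch) == outcome:
            correct += 1
        predictor.update(branch, outcome)
    return correct / len(outcomes)
```

Calling `accuracy(HistoryPredictor(), "X", "TNTTNTT" * 50)` exercises the repeating pattern from the slide; after a short warm-up every history has been seen before, which is the locality the slide refers to.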
Problem
• Unpredictable branches are THE problem.
• They show no outcome locality, so outcome-based prediction does not help.
Operation-Based Prediction
• Find locality in the computations that produce the outcome
[Figure: example slice consisting of the operations bne, slt, ld, and add]
This Work
• First work that looks at the fundamental program behaviour that would facilitate operation-based prediction.
• Related work…
  – Characterization of slices
  – Prefetching loads / pre-execution of branches
Ideally...
• The slice (i.e., the slice trace) will always be the same.
• The slice will contain very few operations while spanning a large portion of the original program.
• The slice will be easy (fast) to pre-compute.
Terminology
• Lead: earliest instruction in the slice
• Target: branch we want to precompute
[Figure: example slice consisting of the operations bne, slt, ld, and add]
What Should a Slice be?
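• Committed instructions: window of 32, 64, 128, or 256
• Ignore control flow, but retain the side effect of JAL on $r31
• Memory dependence: follow resolved load-store dependences (M)
• Restrict number of instructions: R = max 1/4, U = “no restriction”
• Configuration labels combine these choices, e.g. 32UM = 32-instruction window, unrestricted, following memory dependences
[Figure: instruction window between FETCH and COMMIT, with an arrow labelled “older”]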
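The slice definition above can be sketched as a backward walk over the committed-instruction window. This is only an illustration under stated assumptions: the `Instr` record and its fields are hypothetical, and `max_len` is treated as an absolute cap on slice length (counting the target), since the slide says only "R = max 1/4".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Instr:
    """Hypothetical record of one committed instruction."""
    pc: int
    op: str
    dests: tuple = ()                  # registers written (JAL would list "r31")
    srcs: tuple = ()                   # registers read
    load_addr: Optional[int] = None    # resolved address if this is a load
    store_addr: Optional[int] = None   # resolved address if this is a store


def extract_slice(window, follow_memory=True, max_len=None):
    """Return the slice feeding the last instruction of `window` (the target),
    oldest first, so the lead comes first and the target last.

    `window` holds committed instructions, oldest first. Control flow is
    ignored: branches write no registers, so they are never picked up."""
    target = window[-1]
    live_regs = set(target.srcs)   # values still waiting for a producer
    live_addrs = set()             # load addresses waiting for a store
    picked = [target]
    for instr in reversed(window[:-1]):
        if max_len is not None and len(picked) >= max_len:
            break                  # restricted slice: stop at the cap
        feeds_reg = any(d in live_regs for d in instr.dests)
        feeds_mem = follow_memory and instr.store_addr in live_addrs
        if not (feeds_reg or feeds_mem):
            continue
        picked.append(instr)
        live_regs.difference_update(instr.dests)
        if feeds_mem:
            live_addrs.discard(instr.store_addr)
        live_regs.update(instr.srcs)
        if follow_memory and instr.load_addr is not None:
            live_addrs.add(instr.load_addr)
    picked.reverse()
    return picked
```

With `follow_memory=False` the walk models the configurations without the M suffix; a tuple such as `tuple(i.op for i in extract_slice(window))` could then serve as a simple stand-in for a slice trace.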
Methodology
• 12 programs from SPEC2000
• Baseline outcome prediction hardware
  – 64K gshare + 64K bimodal w/ 64K selector
  – 64-entry RAS
• sim-outorder (SimpleScalar 3.0):
  – 8-way, 128-entry RUU, 64-entry fetch buffer
  – 64K dual L1, 256K unified L2
  – 64-entry LSQ
  – Perfect memory disambiguation
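For readers unfamiliar with the baseline, a combined gshare/bimodal predictor with a selector can be sketched as below. This is a generic textbook-style sketch, not the SimpleScalar implementation: it assumes "64K" means 64K entries per table, 2-bit saturating counters, a 16-bit global history, and a selector indexed by PC. The class and helper names are hypothetical, and the 64-entry RAS is left out.

```python
def bump(counter, up):
    """Move a 2-bit saturating counter (0..3) up or down by one."""
    return min(counter + 1, 3) if up else max(counter - 1, 0)


class TournamentPredictor:
    """Hypothetical gshare + bimodal predictor with a per-PC selector."""

    def __init__(self, entries=64 * 1024):
        self.mask = entries - 1          # entries is assumed a power of two
        self.gshare = [1] * entries      # counters start weakly not-taken
        self.bimodal = [1] * entries
        self.selector = [1] * entries    # >= 2 means "trust gshare"
        self.ghr = 0                     # global history of recent outcomes

    def _indices(self, pc):
        word = pc >> 2                   # assumes 4-byte aligned instructions
        return (word ^ self.ghr) & self.mask, word & self.mask

    def predict(self, pc):
        g, b = self._indices(pc)
        gshare_taken = self.gshare[g] >= 2
        bimodal_taken = self.bimodal[b] >= 2
        return gshare_taken if self.selector[b] >= 2 else bimodal_taken

    def update(self, pc, taken):
        g, b = self._indices(pc)
        gshare_ok = (self.gshare[g] >= 2) == taken
        bimodal_ok = (self.bimodal[b] >= 2) == taken
        if gshare_ok != bimodal_ok:      # train selector only on disagreement
            self.selector[b] = bump(self.selector[b], gshare_ok)
        self.gshare[g] = bump(self.gshare[g], taken)
        self.bimodal[b] = bump(self.bimodal[b], taken)
        self.ghr = ((self.ghr << 1) | int(taken)) & self.mask
```

The selector lets the bimodal table handle strongly biased branches while gshare picks up branches whose outcome correlates with recent global history.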
Measuring Slice Locality
• locality(1) = probability that the same slice was seen the last time. A high value of locality(1) indicates that last-operation-based slice prediction would work well.
• locality(N) = probability that the same slice is among the last N unique slices seen.
Measuring Slice Locality
• Save the FOUR most recent unique slice traces per static branch (only on misprediction).
• Each time a mispredicted branch is encountered, check whether its slice trace matches the most recent, 2nd most recent, etc.
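The bookkeeping described above can be sketched as follows. The function name, the event format (a stream of `(branch_pc, slice_trace)` pairs for mispredicted branches), and the choice to pool all events into one ratio are assumptions made for this illustration.

```python
from collections import defaultdict


def measure_locality(events, depth=4):
    """Return [locality(1), ..., locality(depth)] for a stream of
    (branch_pc, slice_trace) pairs, one pair per mispredicted branch.

    Slice traces must be comparable with ==, e.g. tuples or strings."""
    recent = defaultdict(list)   # branch_pc -> unique traces, newest first
    hits = [0] * depth           # hits[k]: trace matched the (k+1)-th most recent
    total = 0
    for branch_pc, trace in events:
        saved = recent[branch_pc]
        total += 1
        if trace in saved:
            hits[saved.index(trace)] += 1
            saved.remove(trace)  # it will be re-inserted as the most recent
        saved.insert(0, trace)
        del saved[depth:]        # keep only the `depth` most recent unique traces
    # locality(n) counts a match at any of the n most recent positions.
    result, running = [], 0
    for count in hits:
        running += count
        result.append(running / total if total else 0.0)
    return result
```

The first encounter of any trace counts as a miss, so short event streams understate locality.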
Measuring Slice Locality
• All results are weighted averages.
• The result for each static branch is weighted in proportion to the number of times the operation-based predictor mispredicted it.
• This emphasizes the characteristics of the branches that cause the most mispredictions.
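Written as a formula, the weighting described above would look like the following. The symbols are assumptions introduced here, not notation from the talk: $b$ ranges over static branches, $m_b$ is the number of mispredictions of branch $b$, and $L_b(N)$ is locality(N) measured for that branch alone.

```latex
\[
  \overline{L}(N) \;=\; \frac{\sum_{b} m_b \, L_b(N)}{\sum_{b} m_b}
\]
```

As a made-up example, a branch mispredicted 900 times with locality 0.9 and a branch mispredicted 100 times with locality 0.1 give $(900 \cdot 0.9 + 100 \cdot 0.1)/1000 = 0.82$, much closer to the frequently mispredicted branch than the unweighted mean of 0.5.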
Unrestricted Slices: 32UM
[Figure: bar chart for gcc, equake, ammp, and bzip; y-axis: Locality, 0 to 1, annotated “Better”]
Saving ONE slice captures most of the locality.
Restricted vs. Unrestricted
[Figure: bar chart comparing 32RM and 32UM for gcc, equake, ammp, and bzip; y-axis: Locality, 0 to 1, annotated “Better”]
Most slices have few instructions.
Effect of Memory Dependence
[Figure: bar chart comparing 64RM and 64R for gcc, equake, ammp, and bzip; y-axis: Locality, 0 to 1, annotated “Better”]
Tracking Dependence Does Not Affect Locality Much.
Window Size
[Figure: bar chart comparing 256RM, 128RM, 64RM, and 32RM for gcc, equake, ammp, and bzip; y-axis: Locality, 0 to 1, annotated “Better”]
Locality good even for large windows.
Effect of Selection Context: 128RM
[Figure: bar chart comparing “On Mispredict” and “Always” for gcc, equake, ammp, and bzip; y-axis: Locality, 0 to 1, annotated “Better”]
Focusing on Mispredictions Improves Locality.
Idealized Predictor
[Figure label: Lead PC]
• Spawn the slice and execute it instantaneously when the lead operation is encountered.
• Store up to 4 slice traces per lead operation.
Idealized Predictor
• Match operations and register dependencies as instructions are fetched.
• After matching, there is usually only one prediction per target, if any (>80% of the time). When there are several:
  – Tie-breaker #1: longest lead-target distance.
  – Tie-breaker #2: most recently detected slice.
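The two tie-breakers amount to a lexicographic comparison, which can be sketched as below. The `Candidate` record, its field names, and the use of an integer timestamp for "most recently detected" are hypothetical choices for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical prediction produced by one matched slice."""
    predicted_taken: bool
    lead_target_distance: int   # instructions between lead and target
    detected_at: int            # larger means the slice was detected more recently


def select_prediction(candidates):
    """Pick one candidate for a target, or None if no slice matched.

    Tie-breaker #1: longest lead-target distance.
    Tie-breaker #2: most recently detected slice."""
    if not candidates:
        return None
    return max(candidates, key=lambda c: (c.lead_target_distance, c.detected_at))
```

Comparing tuples applies the second criterion only when the first is equal, which matches the ordering of the two tie-breakers.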
Correcting Mispredictions
[Figure: bar chart with categories wR and wW for gcc, equake, ammp, and bzip; series 128RM, 64RM, and 32RM; y-axis 0 to 1]
High Coverage of Mispredicted Branches
Interaction with Outcome-Based Predictor
[Figure: bar chart with categories rR and rW for gcc, equake, ammp, and bzip; series 128RM, 64RM, and 32RM; y-axis 0 to 0.9]
Very Little Destructive Interference
Summary
• Slice-locality for mispredicted branches
  – average of 70% for restricted slices on a 64-entry window following load-store dependencies (12 SPEC2000 benchmarks).
• Accuracy of idealized predictor
  – 74% of mispredicted branches eliminated
Conclusion
• First work that looks at the fundamental program behaviour, slice-locality, that would facilitate predicting slice traces to pre-execute outcomes.
• SPEC2000 benchmarks show very high slice-locality for mispredicted branches.