H-Pattern : A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

22
H-Pattern: A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation Samir Otiv Second Year Undergraduate Kaushik Garikipati Second Year Undergraduate Milan Patnaik MTech Dr. V Kamakoti Professor Indian Institute of Technology Madras Department of Computer Science &

description

H-Pattern : A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation. Indian Institute of Technology Madras Department of Computer Science & Engineering. Approach. Conditional branch instructions often follow patterns which periodically repeat. - PowerPoint PPT Presentation

Transcript of H-Pattern : A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

Page 1: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

H-Pattern: A Hybrid Pattern Based Dynamic

BranchPredictor with Performance Based

Adaptation

Samir OtivSecond Year Undergraduate

Kaushik GarikipatiSecond Year Undergraduate

Milan PatnaikMTech

Dr. V KamakotiProfessor

Indian Institute of Technology MadrasDepartment of Computer Science & Engineering

Page 2: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

Approach

Conditional branch instructions often follow patterns which periodically repeat.

If a branch instruction is found to follow a certain repeating pattern, a predictor must have the ability to accurately predict its outcome for as long as the pattern persists.

Predicting ALL patterns with periods of ANY length: Impossible, given a fixed storage budget.

Page 3: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

Approach

STRATEGY: Restrict ourselves to capturing patterns with a period only up to a certain predetermined length

Objective: Creating a predictor that captures patterns with periods of lengths of up to n-bits.

Challenges:1. Using minimum space2. The patterns followed can change – must dynamically relearn

Page 4: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

Solution

For every branch: Store local history of 2n bits

If a branch instruction follows a pattern of execution with a period p, where p is at most equal to n, then the most recent set of n bits must be identical to the set of n bits that occurred p executions prior.

outcome(hi) = outcome(hi+p) (where hi = ith most recent execution)

To predict, all we do is compare the most recent n bits to successively older History Patterns (substrings of n bits of the local history), and stop at the first match. The bit, just after this matching substring, is our prediction for the next execution.(The picture on the next slide should clarify)

Page 5: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

Here, with n=8, we store a local history of 16 bits.

The branch instruction follows a repeating pattern –(110)-, which has a period of 3.

The bit string h0 to h7 (Current Pattern) matches precisely with the bit string h3 to h11 (Matched Current Pattern).

The prediction returned is the bit just after the matched current pattern – h2.

Illustration

Page 6: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

H-Pattern: nBPAT + AltPred

nBPAT: n-Bit Pattern PredictorAltPred: Any other alternate branch predictor

When no pattern is detected (i.e. no pattern match occurs), AltPred is used.

When a pattern is detected, the better performing predictor is used.

Page 7: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

The nBPAT Predictor

Every entry of the predictor is comprised of:• A 2n-bit shift register for local history• A saturating counter to keep track of the better performing predictor

(as described in ‘Combining Branch Predictors’ by Scott McFarling)

Storage:Various configurations possible – tagged/tagless/direct mapped/associative

Page 8: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

The nBPAT Algorithm

To Predict:

1. Match the current pattern (h0 to hn-1) with successively older history patterns

2. If the first match is found at hi, then hi-1 is the predicted outcome. If the most significant bit of the saturating selection counter is 1, then return hi-1.

3. If there is no match, or if the most significant bit is 0, use AltPred

To Update:1. If AltPred mispredicted and

nBPAT correctly predicted, increment the saturating selection counter.

2. If AltPred correctly predicted and nBPAT mispredicted, decrement the saturating selection counter.

3. If nBPAT was not ready, don’t change the saturating counter

4. Update the local history by inserting the outcome of the branch into the local history shift register

Page 9: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

Combinations of H-Pattern

H-Pattern: Various configuration decisionsAltPred Component: Several possible options, for instance:• Gshare• TAGE• ISL-TAGE

nBPAT Storage Structure:• Tagged/Tagless• Associative/Direct Mapped

Page 10: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

H-Pattern with Gshare

Configuration:• Tagless, direct-mapped table used for nBPAT – indexed by few of the

least significant bits of the PC• 50% of the storage budget assigned to nBPAT

Outcome:Distinct improvement in accuracy observed, as will be shown soon.

Page 11: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

H-Pattern with Gshare

4KB 32KB Unlimited0

1

2

3

4

5

6

7

8

6.7

5.2

4.6

6.4

4.7

3.8

Mispredictions per Kilo Instructions – CBP 2014 Framework

Gshare H-Pattern with Gshare

Page 12: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

H-Pattern with TAGE/ISL-TAGE

Minimal portion of storage allocated to nBPAT

The storage structure must facilitate maximum accuracy by nBPAT for very small storage spaces.

Proportion of the storage budget allocated to nBPAT was different for different budgets

Improvement in accuracy was lesser than that achieved with Gshare

Page 13: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

H-Pattern with TAGE/ISL-TAGE

CONFIGURATION: nBPAT STORAGEPartially tagged, 2-way set-associative.

Selection Counter: 4-bits

Useful Counter:Included in every entry. Serves as a measure of the effectiveness of an

entry in the table.Decremented if:1. No pattern match found2. Misprediction by nBPAT & correct prediction by AltPredIncremented if misprediction by AltPred and correct prediction by nBPAT.All useful counters are reset periodically using a global reset counter.

This correctly captures the notion of an entry in the table being effective or ineffective, and aids in the entry replacement policy.

Page 14: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

H-Pattern with TAGE/ISL-TAGE

UPDATE ALGORITHM:1. If the TAGE predictor MISPREDICTED and there is no tag match in

nBPAT 2-way associative table, and, either of the 2 potential entry locations have Useful = 0, then, make Tag = [BranchTag] and Useful = [Maximum].

2. If the entry ALREADY exists in the nBPAT 2-way associative table, then,

1. If nBPAT was not ready, OR, nBPAT mispredicted and TAGE correctly predicted, decrease useful.

2. If nBPAT correctly predicted and TAGE mispredicted, increase useful3. Update the nBPAT entry as described earlier in the nBPAT algorithm4. Update the TAGE/ISL-TAGE predictor

Page 15: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

Reference TAGE Configurations

The optimized configuration for an 8-table TAGE predictor, as specified in the paper “A case for (partially) Tagged Geometric history length branch prediction”, by André Seznec and Pierre Michaud, was used.• 4KB: History Lengths = 5 to 127• 32KB: History Lengths = 5 to 450

Whereas for the unlimited case, 18 tagged tables were used.History Lengths = 3 to 2000

Page 16: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

H-Pattern with TAGE Configurations

• 4KB:Tag length was reduced by 1 in every alternate table starting from T2.4-BPAT predictor used with 7-bit tagged entries & 3-bit useful counters.

• 32KB:Table T6 of TAGE has been halved in size. 8-BPAT predictor used with 8-bit tagged entries & 4-bit useful counter.

• Unlimited:8-BPAT predictor used with 16-bit tagged width.

Page 17: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

H-Pattern with TAGE

4KB 32KB Unlimited0

0.5

1

1.5

2

2.5

3

3.5

43.735

2.678

2.177

3.712

2.644

2.164

TAGE H-Pattern with TAGEMispredictions per Kilo Instructions – CBP 2014 Framework

Page 18: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

Reference ISL-TAGE Configurations

• 4KB:Configuration was same as the 8-component predictor specified in the paper “A case for (partially) Tagged Geometric history length branch prediction”, by André Seznec and Pierre Michaud, with space freed from the base bimodal predictor by having only 2K prediction entries and 1K hysteresis entries to accommodate statistical corrector and loop predictor. History lengths = 5 to 126.

• 32KB:Configuration (including history lengths) was identical to the one specified in

the paper “A 64KBit ISL-TAGE branch predictor ”, by André Seznec, with all storage tables halved.

• Unlimited:18 tagged tables were used.History Lengths = 3 to 2000

Page 19: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

H-Pattern with ISL-TAGE Configurations

• 4KBFrom the reference 4KB ISL-TAGE, freed one tag bit from every

alternate table starting from T2.4-BPAT predictor used with 7-bit tagged entries & 3-bit useful

counters.

• 32KBFrom the reference 32KB ISL-TAGE, halved the last shared table and reduced the size of statistical corrector and loop predictor.4-BPAT predictor used with 6-bit tagged entries & 3-bit useful

counters.

• UnlimitedIn combination with the reference Unlimited ISL-TAGE predictor, an 8-BPAT predictor was used with 16-bit tagged entries & 4-bit useful counters.

Page 20: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

H-Pattern with ISL-TAGE

4KB 32KB Unlimited0

0.5

1

1.5

2

2.5

3

3.5

43.706

2.549

2.076

3.691

2.542

2.058

ISL-TAGE H-Pattern with ISL-TAGE

Mispredictions per Kilo Instructions – CBP 2014 Framework

Page 21: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

Further Statistics: Success rates

H-Pattern with Component Unlimited 32KB 4KBGshare nBPAT 99.30% 99.10% 98.60%

AltPred 94.50% 93.40% 91.30%TAGE nBPAT 99.50% 99.59% 98.98%

AltPred 97.93% 97.15% 96.86%ISL-TAGE nBPAT 99.62% 99.56% 98.94%

AltPred 98.09% 97.37% 96.93%

𝑆𝑢𝑐𝑐𝑒𝑠𝑠𝑟𝑎𝑡𝑒=𝑁𝑜 .𝑜𝑓 𝑠𝑢𝑐𝑐𝑒𝑠𝑠𝑓𝑢𝑙𝑝𝑟𝑒𝑑𝑖𝑐𝑡𝑖𝑜𝑛𝑠𝑁𝑜 .𝑜𝑓 𝑎𝑡𝑡𝑒𝑚𝑝𝑡𝑒𝑑𝑝𝑟𝑒𝑑𝑖𝑐𝑡𝑖𝑜𝑛𝑠

Page 22: H-Pattern :  A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation

Thank You