MadCache: A PC-aware Cache Insertion Policy
Andrew Nere, Mitch Hayenga, and Mikko Lipasti PHARM Research Group
University of Wisconsin – Madison
June 20, 2010
Executive Summary
• Problem: Changing hardware and workloads encourage investigation of cache replacement/insertion policy designs
• Proposal: MadCache uses PC history to choose the cache insertion policy
  – Last-level cache granularity
  – Individual PC granularity
• Performance improvements over LRU
  – 2.5% IPC improvement (single-threaded)
  – 4.5% weighted speedup and 6% throughput improvement (multithreaded)
Motivation
• Importance of investigating cache insertion policies
  – Direct effect on performance
  – LRU has dominated hardware designs for many years
  – Changing workloads, levels of caches
• Shared last-level cache
  – Cache behavior now depends on multiple running applications
  – One streaming thread can ruin the cache for everyone
Previous Work
• Dynamic insertion policies
  – DIP – Qureshi et al. – ISCA ’07
    • Dueling sets select best of multiple policies
    • Bimodal Insertion Policy (BIP) offers thrash protection
  – TADIP – Jaleel et al. – PACT ’08
    • Awareness of other threads’ workloads
• Utilizing Program Counter information
  – PCs exhibit a useful amount of predictable behavior
  – Dead-block prediction and prefetching – ISCA ’01
  – PC-based load miss prediction – MICRO ’95
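As a rough illustration of the thrash protection attributed to BIP above, the sketch below models one cache set in which new lines usually enter at the LRU position and only occasionally at MRU. The function name `bip_insert`, the list-based recency stack, and the default `epsilon` of 1/32 are assumptions made for illustration; none of them come from these slides.

```python
import random

def bip_insert(recency_stack, new_line, epsilon=1 / 32, rng=random):
    """Insert new_line into a full set; recency_stack is ordered MRU first.

    Returns the evicted line. epsilon is the (assumed) chance of MRU insertion.
    """
    victim = recency_stack.pop()           # evict the line in the LRU position
    if rng.random() < epsilon:
        recency_stack.insert(0, new_line)  # rare case: insert at MRU
    else:
        recency_stack.append(new_line)     # common case: insert at LRU
    return victim

# A streaming scan only churns the LRU slot, so the rest of the set survives.
ways = ["a", "b", "c", "d"]
for addr in range(100):
    bip_insert(ways, f"stream{addr}", epsilon=0.0)
```

With `epsilon=1.0` the same function degenerates to ordinary MRU insertion, which is the behavior a streaming thread exploits to flush an LRU-managed set.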
MadCache Proposal
• Problem: With changing hardware and workloads, caches are subject to suboptimal insertion policies
• Solution: Use PC information to create a better policy
  – Adaptive default cache insertion policy
  – Track PCs to determine the policy at a finer grain than DIP
  – Filter out streaming PCs
Introducing MadCache!
MadCache Design
• Tracker Sets
  – Sample behavior of the cache
  – Enter the PCs into PC-Predictor Table
  – Determine default policy of cache
    • Uses set dueling – Qureshi et al. – ISCA ’07
    • LRU and Bypassing Bimodal Insertion Policy (BBIP)
• Follower Sets
  – Majority of the last-level cache
  – Typically follow the default policy
  – Can override default cache policy (PC-Predictor Table)
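The way tracker sets pick a default policy can be sketched with a conventional set-dueling selector: one saturating counter that misses in the two tracker groups push in opposite directions. This is a minimal sketch under assumptions, not the authors' implementation; the class name `SetDuel`, the 10-bit width, the initial value, and the convention that a set MSB means BBIP are all hypothetical.

```python
class SetDuel:
    """Saturating policy-selection counter shared by the follower sets."""

    def __init__(self, bits=10):          # width is an assumption
        self.max_value = (1 << bits) - 1
        self.msb = 1 << (bits - 1)
        self.psel = self.msb - 1          # start just below the threshold

    def miss_in_tracker(self, policy):
        """Record a miss in a tracker set dedicated to `policy`."""
        if policy == "LRU":
            # LRU trackers are missing: evidence in favor of BBIP.
            self.psel = min(self.psel + 1, self.max_value)
        elif policy == "BBIP":
            # BBIP trackers are missing: evidence in favor of LRU.
            self.psel = max(self.psel - 1, 0)
        else:
            raise ValueError(f"unknown policy: {policy}")

    def default_policy(self):
        """Policy the follower sets use unless a PC overrides it."""
        return "BBIP" if self.psel & self.msb else "LRU"
```

Follower sets would consult `default_policy()` on each fill and then apply any per-PC override from the PC-Predictor Table.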
Tracker and Follower Sets
[Figure: Last Level Cache divided into BBIP Tracker Sets, LRU Tracker Sets, and Follower Sets; tracker-set lines are annotated with a Reuse Bit and an Index to PC-Predictor]
• Tracker Sets overhead
  – 1 bit to indicate if line was accessed again
  – 10/11 bits to index PC-Predictor table
MadCache Design
• PC-Predictor Table
  – Store PCs that have accessed Tracker Sets
  – Track behavior history using counter
    • Decrement if an address is used many times in the LLC
    • Increment if line is evicted and was never reused
  – Per-PC default policy override
    • LRU (default) plus BBIP override
    • BBIP (default) plus LRU override
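The counter behavior listed above can be sketched as a small saturating counter per PC. This is an illustrative sketch only: the class name `PredictorEntry`, the starting value, the choice to decrement on every observed reuse, and the reading of a set MSB as "this PC looks streaming" are assumptions. The 6-bit width follows the PC-Predictor Table slide.

```python
from dataclasses import dataclass

COUNTER_BITS = 6                       # width from the PC-Predictor Table slide
COUNTER_MAX = (1 << COUNTER_BITS) - 1
COUNTER_MSB = 1 << (COUNTER_BITS - 1)

@dataclass
class PredictorEntry:
    counter: int = COUNTER_MSB - 1     # assumed start, just below the MSB threshold

    def on_reuse(self):
        """A tracker-set line brought in by this PC was accessed again."""
        self.counter = max(self.counter - 1, 0)

    def on_eviction(self, reused):
        """A tracker-set line brought in by this PC was evicted."""
        if not reused:
            self.counter = min(self.counter + 1, COUNTER_MAX)

    def looks_streaming(self):
        """Assumed interpretation: MSB set means lines are rarely reused."""
        return bool(self.counter & COUNTER_MSB)
```

The `reused` argument corresponds to the per-line reuse bit kept in the tracker sets.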
PC-Predictor Table

| Entry field | Width |
|---|---|
| Policy + PC | 1 + 64 bits |
| Counter | 6 bits |
| # Entries | 9 bits |

[Figure: PC-Predictor Table lookup; labels: PC (miss), Default Policy, PC-Predictor Table, Counter (MSB), Hit?, 0/1 select]
• In parallel with a cache miss, the PC and the current policy index the PC-Predictor
• If hit in table, follow the PC’s override policy
• If miss in table, follow global default policy
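The lookup rules above reduce to a short decision function. The sketch below is a simplification under assumptions: the table is modeled as a set of `(policy, pc)` tags, and a hit is taken to mean "use the opposite of the current default", following the LRU-plus-BBIP-override and BBIP-plus-LRU-override pairing from the design slide. Any gating by the entry's counter, which the figure hints at, is left out. The names `insertion_policy` and `predictor_tags` are hypothetical.

```python
OVERRIDE = {"LRU": "BBIP", "BBIP": "LRU"}

def insertion_policy(pc, default_policy, predictor_tags):
    """Pick the insertion policy for a line missed on by instruction `pc`.

    predictor_tags is a set of (policy, pc) pairs standing in for the
    PC-Predictor Table's "Policy + PC" tags.
    """
    if (default_policy, pc) in predictor_tags:
        return OVERRIDE[default_policy]   # table hit: follow the PC's override
    return default_policy                 # table miss: follow the global default

# Hypothetical contents: one PC recorded while LRU was the default policy.
tags = {("LRU", 0x400A10)}
```

Because the current policy is part of the tag, an entry recorded while LRU was the default does not fire once the default flips to BBIP.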
Multi-Threaded MadCache
• Thread-aware MadCache
  – Similar structures as single-threaded MadCache
  – Track based on current policy of other threads
• Multithreaded MadCache extensions
  – Separate tracker sets for each thread
    • Each thread still tracks LRU and BBIP
  – PC-Predictor table
    • Extended number of entries
    • Indexed by thread-ID, policy, and PC
  – Set dueling PER THREAD
Multi-threaded MadCache

| Entry field | Width |
|---|---|
| TID + <P0,P1,P2,P3> + PC | 2 + 4 + 64 bits |
| Counter | 6 bits |
| # Entries | 9 bits |

[Figure: multi-threaded PC-Predictor Table lookup; labels: Counter (MSB), Hit?, 0/1 select, Default Policy (10 bits; TID-0, TID-1, TID-2, TID-3), PC-Predictor Table]
[Figure: Last Level Cache divided into TID-0 BBIP Tracker Sets, TID-0 LRU Tracker Sets, Other Tracker Sets, and Follower Sets]
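The multi-threaded tag layout listed above (2 + 4 + 64 bits) can be sketched as plain bit packing. The field order, the encoding of each thread's policy as one bit (0 for LRU, 1 for BBIP), and the name `pack_key` are assumptions for illustration; the slides give only the field widths.

```python
TID_BITS, POLICY_BITS, PC_BITS = 2, 4, 64   # widths from the slide

def pack_key(tid, policies, pc):
    """Pack thread ID, per-thread policy bits <P0..P3>, and PC into one tag.

    policies is a sequence of four bits, one per thread (assumed encoding:
    0 = LRU, 1 = BBIP). The field order is an assumption.
    """
    if not 0 <= tid < (1 << TID_BITS):
        raise ValueError("tid out of range")
    if len(policies) != POLICY_BITS:
        raise ValueError("expected one policy bit per thread")
    policy_vec = 0
    for thread, bit in enumerate(policies):
        policy_vec |= (bit & 1) << thread    # P0 lands in the lowest bit
    pc &= (1 << PC_BITS) - 1                 # keep only the 64-bit PC
    return (tid << (POLICY_BITS + PC_BITS)) | (policy_vec << PC_BITS) | pc
```

Including every thread's current policy in the tag means the same PC can carry different predictions depending on what the other threads are doing, which matches the "track based on current policy of other threads" bullet.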
MadCache – Example Application
• Deep Packet Inspection¹
  – Large match tables (1MB+) commonly used for DFA/XFA regular expression matching
  – Incoming byte stream from packets causes different table traversals
    • Table exhibits reuse between packets
    • Packets mostly streaming (backtracking is implementation dependent)
¹ Evaluating GPUs for Network Packet Signature Matching – ISPASS ’09
MadCache – Example Application
– Packets mostly streaming
– Frequently accessed Match Table contents held in L1/L2
  • Less frequently accessed elements in LLC/memory
[Figure: Match Table with a Current Processing Element stepping through each of several Packets]
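To make the two access streams in this example concrete, the sketch below runs a toy table-driven DFA over packet bytes. The packet load touches each byte once, while the table load revisits the same rows across packets, so the two loads come from different PCs with very different reuse. The three-state table that matches the substring "ab" and the names `build_ab_table` and `scan` are hypothetical; real match tables are far larger.

```python
def build_ab_table():
    """Toy match table: state 2 is reached once b"ab" has been seen."""
    a, b = ord("a"), ord("b")
    table = [[0] * 256 for _ in range(3)]
    table[0][a] = 1          # saw 'a'
    table[1][a] = 1          # still waiting for 'b'
    table[1][b] = 2          # saw "ab": accept
    table[2] = [2] * 256     # accepting state is absorbing
    return table

def scan(table, packet, accept=2):
    """Walk the table over one packet and report whether it matched."""
    state = 0
    for byte in packet:               # load 1: packet data, touched once
        state = table[state][byte]    # load 2: match table, reused across packets
    return state == accept

match_table = build_ab_table()
results = [scan(match_table, p) for p in (b"xxabyy", b"ba", b"aab")]
```

A PC-aware policy can treat fills from the first load as bypass candidates and keep fills from the second, which is the distinction the following slide draws against DIP.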
MadCache – Example Application
• DIP
  – Would favor BIP policy due to packet data streaming
  – LLC holds a mixture of Match Table and useless packet data
• MadCache
  – Would identify PCs associated with Match Table as useful
  – LLC populated almost entirely by Match Table
[Figure: contents of DIP LLC vs. MadCache LLC, split into Packet Data and Table Data]
Experimentation

| Parameter | Configuration |
|---|---|
| Processor | 8-stage, 4-wide pipeline |
| Instruction window size | 128 entries |
| Branch predictor | Perfect |
| L1 inst. cache | 32KB, 64B line size, 4-way SA, LRU, 1-cycle hit |
| L1 data cache | 32KB, 64B line size, 8-way SA, LRU, 1-cycle hit |
| L2 cache | 32KB, 64B line size, 8-way SA, LRU, 10-cycle hit |
| L3 cache (1 thread) | 1MB, 64B line size, 30-cycle hit |
| L3 cache (4 threads) | 4MB, 64B line size, 30-cycle hit |
| Main memory | 200 cycles |

– 15 benchmarks from SPEC CPU2006
– 15 workload mixes for multithreaded experiments
– 200 million cycle simulations
Results – Single-threaded
IPC normalized to LRU
– 2.5% improvement across benchmarks tested
– Slight improvement over DIP
[Figure: bar chart of IPC normalized to LRU (y-axis 0.88 to 1.1) for RAND, DIP, and MAD; recovered x-axis labels: astar, gcc, hmmer, geomean]
Results – Multithreaded
Throughput normalized to LRU
– 6% improvement across mixes tested
– DIP performs similarly to LRU
[Figure: bar chart of throughput normalized to LRU (y-axis 0.95 to 1.2) for DIP and MAD]
Results
Weighted speedup normalized to LRU
– 4.5% improvement across benchmarks tested
– DIP performs similarly to LRU
[Figure: bar chart of weighted speedup normalized to LRU (y-axis 0.96 to 1.14) for DIP and MAD]
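The slides report three metrics without defining them. The sketch below uses the definitions common in the cache-partitioning literature: throughput as the sum of per-thread IPCs, weighted speedup as the sum of each thread's shared-mode IPC over its stand-alone IPC, and a geometric mean across benchmarks. Whether the authors used exactly these definitions is an assumption, and all function names here are hypothetical.

```python
import math

def throughput(ipcs):
    """Sum of per-thread IPCs in a workload mix."""
    return sum(ipcs)

def weighted_speedup(shared_ipcs, alone_ipcs):
    """Sum over threads of IPC when sharing the LLC over IPC when running alone."""
    return sum(shared / alone for shared, alone in zip(shared_ipcs, alone_ipcs))

def normalized(policy_value, lru_value):
    """Express a metric relative to the LRU baseline (1.0 means no change)."""
    return policy_value / lru_value

def geomean(values):
    """Geometric mean, as used for the per-benchmark summary bar."""
    return math.prod(values) ** (1.0 / len(values))
```

Under these definitions, a reported 6% throughput improvement corresponds to a `normalized` throughput of about 1.06.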
Future Work
• MadderCache?
  – Optimize size of structures
    • PC-Predictor Table size
    • Replace CAM with Hashed PC & Tag
  – Detailed analysis of benchmarks with MadCache
  – Extend PC Predictions
    • Don’t take into account sharers
Conclusions
• Cache behavior still evolving
  – Changing cache levels, sharing, workloads
• MadCache insertion policy uses PC information
  – PCs exhibit a useful amount of predictable behavior
• MadCache performance
  – 2.5% IPC improvement for single-threaded
  – 4.5% weighted speedup, 6% throughput improvement for 4 threads
  – Sized to competition bit budget
• Preliminary investigations show little impact with reduction in structures
Questions?