Computer Architecture CSE 3322
Lecture 20
Web Site: crystal.uta.edu/~jpatters/cse3322
Phase II Project due Monday Dec 1
Problems: 7.20, 7.22, 7.27, 7.28 Due Nov 17
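DECStation 3100

| Program | Block Size (words) | Instruction Miss Rate | Data Miss Rate | Effective Miss Rate |
|---|---|---|---|---|
| gcc | 1 | 6.1% | 2.1% | 5.4% |
| gcc | 4 | 2.0% | 1.7% | 1.9% |
| spice | 1 | 1.2% | 1.3% | 1.2% |
| spice | 4 | 0.3% | 0.6% | 0.4% |

Write misses are included for the 4-word block, but not for the 1-word block.
Remember: the Miss Penalty goes UP!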
Average Memory Access Time = Hit Time + Miss Rate * Miss Penalty
[Figure: two plots against Block Size. Miss Penalty vs. Block Size, labeled Access Time and Transfer Time. Miss Rate vs. Block Size for a Constant Size Cache, labeled Fewer Blocks.]
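To make the trade-off concrete, here is a minimal Python sketch of the formula above. The function name and both sets of numbers are illustrative assumptions, not measurements from the DECStation 3100.

```python
def average_memory_access_time(hit_time, miss_rate, miss_penalty):
    """AMAT = Hit Time + Miss Rate * Miss Penalty (times in clock cycles)."""
    return hit_time + miss_rate * miss_penalty


# Hypothetical values: the larger block lowers the miss rate,
# but it also raises the miss penalty.
small_block = average_memory_access_time(hit_time=1, miss_rate=0.05, miss_penalty=12)
large_block = average_memory_access_time(hit_time=1, miss_rate=0.02, miss_penalty=45)

print(f"small block AMAT: {small_block:.2f} cycles")
print(f"large block AMAT: {large_block:.2f} cycles")
```

With these assumed numbers the larger block ends up slower on average, even though it misses less often, because the penalty grows faster than the miss rate falls.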
Reducing the Miss Penalty
Reduce the time to read the multiple words from Main Memory to the cache block.
Don’t wait for the complete block to be transferred: “Early Restart”
• Access and transfer each word sequentially.
• As soon as the requested word is in the cache, restart the processor to access the cache, and finish the block transfer while the cache is available.
• Variation: “Requested Word First”
• Disadvantage: Complex Control
• Likely to access the cache block before the transfer is complete
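As a rough illustration of what Early Restart and Requested Word First can save, the sketch below counts the cycles until the processor can restart. The timing constants are assumptions (1 cycle to send the address, 10 cycles per DRAM access, 1 cycle per word transferred, one word at a time), and the function names are hypothetical.

```python
ADDRESS_CYCLES = 1    # assumed: send the address
ACCESS_CYCLES = 10    # assumed: one DRAM access
TRANSFER_CYCLES = 1   # assumed: send one word of data


def restart_cycle(requested_offset, block_words=4, requested_word_first=False):
    """Cycles until the requested word is in the cache and the processor restarts."""
    if not 0 <= requested_offset < block_words:
        raise ValueError("requested_offset must lie inside the block")
    per_word = ACCESS_CYCLES + TRANSFER_CYCLES
    if requested_word_first:
        words_before_restart = 1                     # requested word is fetched first
    else:
        words_before_restart = requested_offset + 1  # words 0..offset arrive in order
    return ADDRESS_CYCLES + words_before_restart * per_word


def full_block_cycles(block_words=4):
    """Cycles to wait for the whole block when there is no early restart."""
    return ADDRESS_CYCLES + block_words * (ACCESS_CYCLES + TRANSFER_CYCLES)


for offset in range(4):
    print(offset, restart_cycle(offset), restart_cycle(offset, requested_word_first=True))
```

Under this simple model, Early Restart helps most when the requested word is near the start of the block, while Requested Word First gives the same restart time for every offset. The model ignores the extra control complexity noted above.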
Reducing the Miss Penalty
Reduce the time to read the multiple words from Main Memory to the cache block.
Assume Memory Access times:
• 1 clock cycle to send the address
• 10 clock cycles to access DRAM
• 1 clock cycle to send a word of data
For sequential transfer of 4 data words:
Miss Penalty = 1 + 4 * (10 + 1) = 45 clock cycles
What if we could read a block of words simultaneously from the Main Memory?
[Figure: Cache Entry with Valid, Tag, Word3, Word2, Word1, Word0; four 32-bit paths connect Word3 to Word0 with Main Memory.]
Miss Penalty = 1 + 10 + 1 = 12 clock cycles
Miss Penalty for Sequential = 45 clock cycles
What about 4 banks of Memory? “Interleaved Memory”
[Figure: the Cache connected to Bank 3, Bank 2, Bank 1, Bank 0, all receiving the Address.]
Banks are accessed in parallel. Words are transferred serially.
Miss Penalty = 1 + 10 + 4 * 1 = 16 clock cycles
Miss Penalty for Parallel = 12 clock cycles
Miss Penalty for Sequential = 45 clock cycles
Average Memory Access Time = Hit Time + Miss Rate * Miss Penalty
[Figure: Average Access Time vs. Block Size, annotated with: Increase Cache size, Increase Block size, Main Memory Organization.]
CPU Performance with Cache Memory
For a program (assuming no penalty for a Hit):
CPU time = CPU execution time + CPU Hold time
CPU Hold time = Memory Stall Clock Cycles * Clock Cycle time
Memory Stall Clock Cycles = Read Stall Cycles + Write Stall Cycles
Read Stall Cycles = (Reads / Program) * Read Miss Rate * Read Miss Penalty
CPU Performance with Cache Memory
Write Stall Cycles = (Writes / Program) * Write Miss Rate * Write Miss Penalty + Write Buffer Stalls
Write Buffer Stalls should be << Write Miss Stalls
So, approximately,
Write Stall Cycles = (Writes / Program) * Write Miss Rate * Write Miss Penalty
CPU Performance with Cache Memory
Memory Stall Clock Cycles = Read Stall Cycles + Write Stall Cycles
= (Reads / Program) * Read Miss Rate * Read Miss Penalty + (Writes / Program) * Write Miss Rate * Write Miss Penalty
The Miss Penalties are approximately the same (fetch the block). So, combining the Reads and Writes together into a weighted Miss Rate:
Memory Stall Cycles = (Memory Accesses / Program) * Miss Rate * Miss Penalty
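The following Python sketch checks that the combined form agrees with the separate read and write terms when both miss penalties are equal. The function names and the access counts and miss rates used at the bottom are made up for illustration.

```python
def memory_stall_cycles(reads, read_miss_rate, read_miss_penalty,
                        writes, write_miss_rate, write_miss_penalty):
    """Read stall cycles plus write stall cycles (write buffer stalls ignored)."""
    return (reads * read_miss_rate * read_miss_penalty
            + writes * write_miss_rate * write_miss_penalty)


def combined_stall_cycles(reads, read_miss_rate, writes, write_miss_rate, miss_penalty):
    """Same quantity using one weighted miss rate and one miss penalty."""
    accesses = reads + writes
    weighted_miss_rate = (reads * read_miss_rate + writes * write_miss_rate) / accesses
    return accesses * weighted_miss_rate * miss_penalty


# Hypothetical program: 1000 reads, 500 writes, 20-cycle penalty for both.
separate = memory_stall_cycles(1000, 0.02, 20, 500, 0.04, 20)
combined = combined_stall_cycles(1000, 0.02, 500, 0.04, 20)
print(separate, combined)
```

The weighted miss rate is just total misses divided by total accesses, so the two forms only differ when the read and write penalties differ.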
CPU Performance with Cache Memory
For a program (assuming no penalty for a Hit):
CPU time = CPU execution time + CPU Hold time
CPU Hold time = Memory Stall Clock Cycles * Clock Cycle time
CPU time = CPU execution time + (Memory Accesses / Program) * Miss Rate * Miss Penalty * Clock Cycle time
Dividing both sides by Instructions / Program and Clock Cycle time:
Effective CPI = Execution CPI + (Memory Accesses / Instruction) * Miss Rate * Miss Penalty
CPU Performance with Cache Memory
Effective CPI = Execution CPI + (Memory Accesses / Instruction) * Miss Rate * Miss Penalty
Consider the DECStation 3100 with 4-word blocks running spice:
• CPI = 1.2 without misses
• Instruction Miss Rate = 0.3%
• Data Miss Rate = 0.6%
• For spice, frequency of loads and stores = 9%
1.) Sequential Memory: Miss penalty = 65 clock cycles
2.) 4 Bank Interleaved: Miss penalty = 20 clock cycles
Eff CPI = 1.2 + (1 * .003 + .09 * .006) * Miss Penalty
= 1.2 + .00354 * Miss Penalty
1.) Eff CPI = 1.2 + .00354 * 65 = 1.2 + 0.2301 = 1.43
2.) Eff CPI = 1.2 + .00354 * 20 = 1.2 + 0.071 = 1.271
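The arithmetic for the two memory organizations can be checked with a small Python sketch. The function and parameter names are hypothetical; the numbers are the spice figures from the slide.

```python
def effective_cpi(execution_cpi, instruction_miss_rate, data_miss_rate,
                  data_accesses_per_instruction, miss_penalty):
    """Execution CPI plus the memory stall cycles per instruction."""
    # Every instruction is fetched once; only loads and stores access data.
    misses_per_instruction = (1 * instruction_miss_rate
                              + data_accesses_per_instruction * data_miss_rate)
    return execution_cpi + misses_per_instruction * miss_penalty


sequential = effective_cpi(1.2, 0.003, 0.006, 0.09, miss_penalty=65)
interleaved = effective_cpi(1.2, 0.003, 0.006, 0.09, miss_penalty=20)

print(f"sequential memory:  {sequential:.3f}")
print(f"4-bank interleaved: {interleaved:.3f}")
```

The instruction fetch contributes one memory access per instruction, which is why the instruction miss rate is weighted by 1 and the data miss rate by 0.09.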
CPU Performance with Cache Memory
Consider the DECStation 3100 with 4-word blocks running spice:
• CPI = 1.2 without misses
• Instruction Miss Rate = 0.3%
• Data Miss Rate = 0.6%
• For spice, frequency of loads and stores = 9%
• 4 Bank Interleaved: Miss penalty = 20 clock cycles
• Eff CPI = 1.271
What if we get a new processor and cache that run at twice the clock frequency, but keep the same main memory speed?
The memory takes the same time as before, which is now twice as many of the shorter clock cycles:
Miss penalty = 40 clock cycles
Eff CPI = 1.2 + .00354 * 40 = 1.2 + 0.1416 = 1.342
Performance Fast clock / Performance Slow clock = (1.271 * 2 * clock cycle time) / (1.342 * clock cycle time) = 1.89
Here "clock cycle time" is the cycle time of the fast clock; the slow clock's cycle is twice as long. Doubling the clock rate gives a speedup of only 1.89.
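A short Python sketch of the same comparison, generalized to any clock factor. The names are hypothetical, and it assumes, as the slide does, that main memory time stays fixed so the miss penalty in cycles scales with the clock rate.

```python
# Misses per instruction for spice with 4-word blocks (from the slide).
MISSES_PER_INSTRUCTION = 1 * 0.003 + 0.09 * 0.006


def cpi(miss_penalty_cycles, execution_cpi=1.2):
    """Effective CPI for a given miss penalty in clock cycles."""
    return execution_cpi + MISSES_PER_INSTRUCTION * miss_penalty_cycles


def speedup(clock_factor, base_miss_penalty=20):
    """Performance ratio of a clock_factor-times-faster processor to the original."""
    # Times are per instruction, measured in fast-clock cycle times.
    slow_time = cpi(base_miss_penalty) * clock_factor
    fast_time = cpi(base_miss_penalty * clock_factor)
    return slow_time / fast_time


for factor in (1, 2, 4):
    print(factor, round(speedup(factor), 2))
```

The gap between the clock factor and the achieved speedup widens as the clock gets faster, because the fixed memory time costs more and more cycles.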
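[Figure: direct-mapped cache with 4K Entries and 4-word blocks. The 32-bit Address is split into Tag (bits 31 to 16, 16 bits), Index (bits 15 to 4, 12 bits), Block Offset (bits 3 to 2) and Byte Offset (bits 1 to 0). Each entry holds v (valid), a 16-bit Tag, and Word3, Word2, Word1, Word0 (32 bits each). A 16-bit comparator (=) produces Hit, and a Mux controlled by the 2-bit Block Offset selects the 32-bit Data word.]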
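The address fields in the figure can be extracted with shifts and masks. This is a minimal Python sketch using the field widths shown (16-bit tag, 12-bit index, 2-bit block offset, 2-bit byte offset); the function name is hypothetical.

```python
def split_address(address):
    """Split a 32-bit byte address into (tag, index, block_offset, byte_offset)."""
    byte_offset = address & 0x3            # bits 1..0
    block_offset = (address >> 2) & 0x3    # bits 3..2, selects one of 4 words
    index = (address >> 4) & 0xFFF         # bits 15..4, selects one of 4K entries
    tag = (address >> 16) & 0xFFFF         # bits 31..16, compared for a hit
    return tag, index, block_offset, byte_offset


tag, index, block_offset, byte_offset = split_address(0x12345678)
print(hex(tag), hex(index), block_offset, byte_offset)
```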
Consider a Direct Mapped Cache with 4-word blocks and a size of 8 blocks (32 words).
Reference Sequence (Word Address): 6, 7, 8, 9, 80, 6, 7, 8, 9, 81
For each reference, find the Block Address, the Cache Address, and whether it is a Hit or a Miss.
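Block Address = Word Addr / 4 (integer division)

| Block Address | Word Addresses | Cache Address |
|---|---|---|
| 0 | 3 2 1 0 | 0 |
| 1 | 7 6 5 4 | 1 |
| 2 | 11 10 9 8 | 2 |
| 3 | 15 14 13 12 | 3 |
| ... | ... | ... |
| 7 | 31 30 29 28 | 7 |
| 8 | 35 34 33 32 | 0 |
| ... | ... | ... |
| 15 | 63 62 61 60 | 7 |
| X | 4X+3 4X+2 4X+1 4X | X Modulo 8 |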
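The mapping in the table can be written as two one-line functions. This is a minimal Python sketch with hypothetical names, using word addresses and the sizes from this example (4-word blocks, 8 cache blocks).

```python
def block_address(word_address, words_per_block=4):
    """Block Address = Word Addr / 4, discarding the remainder."""
    return word_address // words_per_block


def cache_address(word_address, words_per_block=4, num_blocks=8):
    """Cache Address = Block Address modulo the number of cache blocks."""
    return block_address(word_address, words_per_block) % num_blocks


for word in (0, 7, 31, 35, 63):
    print(word, block_address(word), cache_address(word))
```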
Consider a Direct Mapped Cache with 4-word blocks and a size of 8 blocks (32 words).

| Reference Sequence (Word Address) | Block Address | Cache Address | Hit or Miss |
|---|---|---|---|
| 6 | 1 | 1 | Miss |
| 7 | 1 | 1 | Hit |
| 8 | 2 | 2 | Miss |
| 9 | 2 | 2 | Hit |
| 80 | 20 | 4 | Miss |
| 6 | 1 | 1 | Hit |
| 7 | 1 | 1 | Hit |
| 8 | 2 | 2 | Hit |
| 9 | 2 | 2 | Hit |
| 81 | 20 | 4 | Hit |

Cache Address = (Word Addr / 4) modulo 8
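The hit and miss column can be reproduced with a tiny simulator. This is a sketch under simplifying assumptions: it works on word addresses, starts with an empty cache, and stores the whole block address in each slot instead of a separate tag and valid bit. The function name is hypothetical.

```python
def simulate_direct_mapped(word_addresses, words_per_block=4, num_blocks=8):
    """Return (word, block address, cache address, 'Hit' or 'Miss') per reference."""
    cache = [None] * num_blocks  # block address held in each slot; None means invalid
    results = []
    for word in word_addresses:
        block = word // words_per_block
        slot = block % num_blocks
        hit = cache[slot] == block
        if not hit:
            cache[slot] = block  # fetch the block, replacing whatever was there
        results.append((word, block, slot, "Hit" if hit else "Miss"))
    return results


for row in simulate_direct_mapped([6, 7, 8, 9, 80, 6, 7, 8, 9, 81]):
    print(*row)
```

Block 20 maps to cache address 4, so it never disturbs blocks 1 and 2. A block that maps to the same cache address as one already in use would evict it.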
Consider a Direct Mapped Cache with 4-word blocks and a size of 8 blocks (32 words).

| Reference Sequence (Word Address) | Block Address | Cache Address | Hit or Miss |
|---|---|---|---|
| 6 | 1 | 1 | Miss |
| 7 | 1 | 1 | Hit |
| 8 | 2 | 2 | Miss |
| 9 | 2 | 2 | Hit |
| 68 | 17 | 1 | Miss |
| 6 | 1 | 1 | Miss |
| 7 | 1 | 1 | Hit |
| 8 | 2 | 2 | Hit |
| 9 | 2 | 2 | Hit |
| 69 | 17 | 1 | Miss |

Cache Address = (Word Addr / 4) modulo 8
How about putting a block in any unused block of the eight blocks?
[Figure: cache entry with Tag, Word3, Word2, Word1, Word0]
How can you find it?
Expand the Tag to the block address and compare.
Fully Associative Memory – Addressed by its contents
Address: Block Address (28 bits) | Block Offset | Byte Offset
• For a practical Hit time, the Tag of every block must be compared with the Block Address in parallel
• Only feasible for a small number of blocks
[Figure: fully associative cache. The 28-bit Block Address (BlkAddr) from the Address is compared (=) in parallel with the Tag of each entry (Tag, Data). The comparison results are combined (+) into Hit and control a Mux that outputs Data. The Block Offset selects the Word. Valid bit not shown.]
Hardware is not feasible for a large Cache.
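In software, the lookup can be sketched as a search over all entries. This Python sketch is only an illustration: the loop stands in for the comparators that hardware runs in parallel, addresses are word addresses, and the entry layout `(valid, tag, words)` and the function name are assumptions.

```python
def fully_associative_lookup(entries, word_address, words_per_block=4):
    """Return (hit, word) for a cache given as a list of (valid, tag, words)."""
    block_addr = word_address // words_per_block   # the tag is the full block address
    block_offset = word_address % words_per_block  # selects the word inside the block
    for valid, tag, words in entries:              # hardware compares all tags at once
        if valid and tag == block_addr:
            return True, words[block_offset]
    return False, None


# Hypothetical contents: block 17 and block 1 are valid, the third entry is not.
cache = [
    (True, 17, ["w68", "w69", "w70", "w71"]),
    (True, 1, ["w4", "w5", "w6", "w7"]),
    (False, 2, ["w8", "w9", "w10", "w11"]),
]
print(fully_associative_lookup(cache, 6))
print(fully_associative_lookup(cache, 8))
```

A block can sit in any entry, so the search cost grows with the number of entries. In hardware that cost shows up as one comparator per block.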
Make sets of Blocks Associative
Two-way set associative
[Figure: a table with rows 0, 1, ... 2^k - 1 selected by the Index; each row holds Tag0, Data0, Tag1, Data1. Valid bit not shown.]
Address: Tag | Index | Block Offset | Byte Offset
• Addressed by Index
• Compare the two Tags in parallel for a Hit
Block replacement strategies
For each Index there are 2, 4, ... n options for replacement
Strategies
1. LRU – Least Recently Used
• Replace the block that has been unused for the longest time
2. Random
• Select the block to be replaced randomly
• Implementation
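The slides leave the Implementation bullet unfilled. As one possible illustration only, the two strategies can be sketched in Python as victim-selection functions for a single set; the names and the timestamp representation are assumptions, and real hardware typically uses a few status bits per set instead.

```python
import random


def choose_victim_lru(last_used):
    """last_used[i] is the time entry i was last accessed; evict the oldest."""
    return min(range(len(last_used)), key=lambda i: last_used[i])


def choose_victim_random(num_entries, rng=random):
    """Pick any entry of the set with equal probability."""
    return rng.randrange(num_entries)


# Hypothetical 4-way set: entry 1 was touched longest ago.
print(choose_victim_lru([5, 2, 9, 7]))
print(choose_victim_random(4, random.Random(0)))
```

LRU needs bookkeeping on every access, while Random needs none. That difference matters more as the number of entries per set grows.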
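Consider a Two Way Associative Cache with 4-word blocks and a size of 8 blocks (32 words), so there are 4 sets, each with two entries (Entry 0 and Entry 1).

| Reference Sequence (Word Address) | Block Address | Cache Address (Set) | Hit or Miss |
|---|---|---|---|
| 6 | 1 | 1 | Miss |
| 7 | 1 | 1 | Hit |
| 8 | 2 | 2 | Miss |
| 9 | 2 | 2 | Hit |
| 68 | 17 | 1 | Miss |
| 6 | 1 | 1 | Hit |
| 7 | 1 | 1 | Hit |
| 8 | 2 | 2 | Hit |
| 9 | 2 | 2 | Hit |
| 69 | 17 | 1 | Hit |

Cache Address = (Word Addr / 4) modulo 4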
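The same sequence can be run through a small set-associative simulator. This Python sketch assumes LRU replacement, word addresses, and an initially empty cache; the function name is hypothetical, and each set is modeled as a list ordered from least to most recently used.

```python
def simulate_set_associative(word_addresses, words_per_block=4, num_blocks=8, ways=2):
    """Return (word, block address, set index, 'Hit' or 'Miss') per reference."""
    num_sets = num_blocks // ways
    sets = [[] for _ in range(num_sets)]  # block addresses, least recently used first
    results = []
    for word in word_addresses:
        block = word // words_per_block
        index = block % num_sets
        entries = sets[index]
        hit = block in entries
        if hit:
            entries.remove(block)   # re-appended below as most recently used
        elif len(entries) == ways:
            entries.pop(0)          # set is full: evict the least recently used block
        entries.append(block)
        results.append((word, block, index, "Hit" if hit else "Miss"))
    return results


for row in simulate_set_associative([6, 7, 8, 9, 68, 6, 7, 8, 9, 69]):
    print(*row)
```

Blocks 1 and 17 both map to set 1, but with two entries per set they can coexist, so the second pass over the sequence hits every time. Calling the function with `ways=1` gives a direct-mapped cache of the same size, where blocks 1 and 17 keep evicting each other.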