Module23Handout
CS220: Introduction to Computer Organization
2011-12, 1st Semester
Memory - IV
Amey Karkare
Department of CSE, IIT Kanpur
karkare, CSE, IITK CS220, Memory 1/12
Cache Jargon
Block: A set of contiguous address locations of some size.
A cache block is also called cache line.
Replacement policy
Rules for creating space for a new block in cache.
LRU, FIFO, LFU, random . . .
Cache Hit/Cache Miss
Hit: The requested address is present in the cache.
Miss: The requested address is not present in the cache.
Associativity
Number of cache locations (tag entries) in which a given memory block may be found.
Fully associative, k-way associative
Write Policy
Write through, Write back
Cache Mapping Algorithms: Example
Main Memory
16-bit address.
4K blocks of 16 bytes each.
Total 64K bytes.
Cache
128 blocks of 16 bytes each.
Total 2048 (2K) bytes.
Direct Mapping
Block B of main memory is mapped to block B % 128 of the cache.
Blocks 0, 128, 256, . . . are mapped to cache location 0.
Blocks 1, 129, 257, . . . are mapped to cache location 1, and so on.
Total 32 memory blocks (4096/128) mapped per cache location.
Contention may occur even if the cache is not full.
Trivial replacement algorithm: replace the existing block.
16-bit memory address divided into three parts.
Field:  Tag | Location # | Offset
Bits:    5  |     7      |   4
Lower 4 bits of address are the offset within the block (2^4 = 16).
Middle 7 bits of address point to a particular location in the cache (2^7 = 128).
Upper 5 bits of address are compared with the tag (2^5 = 32).
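The 5/7/4 split above can be tried out with a few shifts and masks. The following is only a minimal sketch: the function name split_direct and the constant names are made up for illustration and are not part of the course material.

```python
OFFSET_BITS = 4    # 16-byte blocks
LOCATION_BITS = 7  # 128 cache locations
# The remaining 16 - 7 - 4 = 5 upper bits form the tag.

def split_direct(addr):
    """Split a 16-bit address into (tag, location, offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    location = (addr >> OFFSET_BITS) & ((1 << LOCATION_BITS) - 1)
    tag = addr >> (OFFSET_BITS + LOCATION_BITS)
    return tag, location, offset

# Byte 5 of memory block 129: block 129 = 1 * 128 + 1,
# so it lands in cache location 1 with tag 1.
print(split_direct(129 * 16 + 5))   # (1, 1, 5)
```

Note that the tag and location fields together are just the memory block number, so location equals B % 128 and tag equals B // 128.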
Direct Mapping: Hardware Implementation
Fully Associative Mapping
A memory block can be placed into any cache block location.
Space in the cache can be utilized efficiently.
A new block replaces an old block only when the cache is full.
All the tags are searched in parallel for the desired block.
Very high hardware cost for search.
16-bit memory address divided into two parts.
Field:  Tag | Offset
Bits:   12  |   4
Lower 4 bits of address are the offset within the block (2^4 = 16).
Upper 12 bits of address are compared with the tag (2^12 = 4096 = 4K).
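A fully associative lookup can be sketched as a comparison of the 12-bit tag against every stored tag. This is an illustrative model only: the names split_fully_associative and lookup are hypothetical, the cache is represented as a plain list of 128 tags (None for an empty location), and the loop stands in for what the hardware does with parallel comparators.

```python
def split_fully_associative(addr):
    """Split a 16-bit address into (tag, offset): upper 12 bits, lower 4 bits."""
    return addr >> 4, addr & 0xF

def lookup(tags, addr):
    """Return the cache location holding addr, or None on a miss.

    Hardware compares all tags at once; this loop models that sequentially.
    """
    tag, _ = split_fully_associative(addr)
    for location, stored in enumerate(tags):
        if stored == tag:
            return location
    return None

tags = [None] * 128      # empty 128-block cache
tags[37] = 129           # memory block 129 placed in an arbitrary location
print(lookup(tags, 129 * 16 + 5))   # 37 (hit)
print(lookup(tags, 0))              # None (miss)
```

With no location field, the 12-bit tag is simply the memory block number, which is why any of the 4K blocks can sit in any cache location.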
Set Associative Mapping
A combination of direct and fully associative mapping.
Blocks in cache are grouped into sets.
k-way associative: Each set contains k blocks.
k = 1: direct mapping.
k = # of cache blocks: fully associative.
A block in main memory can be placed in any block of a specific set.
Contention is reduced w.r.t. direct mapping.
Hardware cost is reduced w.r.t. fully associative mapping.
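The effect of k on contention can be seen on a small access trace. The sketch below rests on assumptions that are not stated on the slide: a 128-block cache, memory block B mapped to set B % (128 // k) in the same spirit as the direct mapping rule, and eviction of the oldest resident block when a set is full. The function name count_misses is hypothetical.

```python
CACHE_BLOCKS = 128

def count_misses(block_trace, k):
    """Count misses for a k-way set associative cache of 128 blocks.

    Assumes set = block % (128 // k); a full set evicts its oldest block.
    """
    num_sets = CACHE_BLOCKS // k
    sets = [[] for _ in range(num_sets)]
    misses = 0
    for block in block_trace:
        resident = sets[block % num_sets]
        if block in resident:
            continue                 # hit
        misses += 1
        if len(resident) == k:
            resident.pop(0)          # set is full: evict the oldest block
        resident.append(block)
    return misses

# Memory blocks 0 and 128 compete for the same location when k = 1.
trace = [0, 128, 0, 128, 0, 128]
print(count_misses(trace, 1))     # 6: every access misses
print(count_misses(trace, 2))     # 2: both blocks fit in one set
print(count_misses(trace, 128))   # 2: fully associative
```

On this trace the 2-way cache already does as well as the fully associative one, while needing only two tag comparisons per access.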
Set Associative Mapping
16-bit memory address divided into three parts.
Example: 2-way set associative.
Total 128/2 = 64 sets.
Total 64 memory blocks (4096/64) mapped per set.
Field:  Tag | Set # | Offset
Bits:    6  |   6   |   4
Lower 4 bits of address are the offset within the block (2^4 = 16).
Middle 6 bits of address specify the set (2^6 = 64).
Upper 6 bits of address are compared with the tag (2^6 = 64).
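The field widths for any k follow from the same arithmetic. The sketch below assumes that the number of sets and the block size are powers of two; field_widths and split_address are illustrative names, and the defaults are the 128-block, 16-byte, 16-bit example used in these slides.

```python
def field_widths(k, cache_blocks=128, block_bytes=16, addr_bits=16):
    """Return (tag_bits, set_bits, offset_bits) for a k-way cache.

    Assumes cache_blocks // k and block_bytes are powers of two.
    """
    offset_bits = (block_bytes - 1).bit_length()
    num_sets = cache_blocks // k
    set_bits = (num_sets - 1).bit_length()
    tag_bits = addr_bits - set_bits - offset_bits
    return tag_bits, set_bits, offset_bits

def split_address(addr, k):
    """Split a 16-bit address into (tag, set number, offset) for a k-way cache."""
    _, set_bits, offset_bits = field_widths(k)
    offset = addr & ((1 << offset_bits) - 1)
    set_number = (addr >> offset_bits) & ((1 << set_bits) - 1)
    tag = addr >> (offset_bits + set_bits)
    return tag, set_number, offset

print(field_widths(1))     # (5, 7, 4)   direct mapping
print(field_widths(2))     # (6, 6, 4)   2-way set associative
print(field_widths(128))   # (12, 0, 4)  fully associative
print(split_address(129 * 16 + 5, 2))   # (2, 1, 5)
```

Each doubling of k moves one bit from the set field to the tag field, so the three mappings in this module are points on one scale.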
Set Associative Mapping: Hardware Implementation
Replacement Policies
The cache controller needs to implement policies about:
When do we replace? (easy to answer!)
How do we replace?
Replacement: at the time of conflict.
All k blocks in the target set are full.
Which of the k blocks to replace?
FIFO, LRU, random.
May require extra information in the tag.
Schemes are easier for small k (1 or 2).
Extra Information in TAG
For the FIFO scheme, we need counters.
When a block is replaced, the counters of all other blocks in the same set are incremented.
For the block brought in, the counter is set to 0.
For the LRU scheme, we need reference registers.
When a block is referred to, a 1 is inserted in the ref bits of the referred block and a 0 in the ref bits of all other blocks in the set.
In case of FIFO, the replacement candidate is the block with the highest counter value.
In case of LRU, the replacement candidate is the block whose ref bits have the most zeros at the end, i.e. the block that has gone longest without being referred to.
For k = 2, we need just one bit of history/reference information per block.
For k = 1 (direct mapping), we need no extra information.
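The bookkeeping for one set can be sketched as follows. This is a hedged model, not the hardware design from the lecture: counters and reference registers are Python lists, new ref bits are appended at the end of each register, registers are assumed to be 8 bits wide, and ties are broken by the lowest way index. All function names are hypothetical.

```python
def fifo_victim(counters):
    """FIFO: pick the way with the highest counter (oldest block)."""
    return counters.index(max(counters))

def fifo_replace(counters, victim):
    """New block gets counter 0; all other counters in the set are incremented."""
    for way in range(len(counters)):
        counters[way] = 0 if way == victim else counters[way] + 1

def lru_touch(ref_regs, way, width=8):
    """Insert a 1 for the referred way and a 0 for every other way."""
    for i, reg in enumerate(ref_regs):
        reg.append(1 if i == way else 0)
        del reg[:-width]             # keep only the newest `width` bits

def trailing_zeros(reg):
    """Number of zeros at the end of a reference register."""
    count = 0
    for bit in reversed(reg):
        if bit:
            break
        count += 1
    return count

def lru_victim(ref_regs):
    """LRU: pick the way whose register has the most zeros at the end."""
    zeros = [trailing_zeros(reg) for reg in ref_regs]
    return zeros.index(max(zeros))

# 4-way set, blocks loaded into ways 0..3 in that order.
counters = [0, 0, 0, 0]
for way in range(4):
    fifo_replace(counters, way)
print(counters, fifo_victim(counters))   # [3, 2, 1, 0] 0

# Ways referred to in the order 0, 1, 2, 3, 1, 0: way 2 is least recently used.
ref_regs = [[] for _ in range(4)]
for way in [0, 1, 2, 3, 1, 0]:
    lru_touch(ref_regs, way)
print(lru_victim(ref_regs))              # 2
```

A finite register width means that two blocks left untouched for more than `width` references become indistinguishable, which is one reason these schemes get harder as k grows.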
Other Issues
What do we do if the cache block is not in sync with
memory?
Cache block is more current.
Need to store this information.
Dirty (D) bit per block.
Set when a write occurs in the block.
Reset when a new block is brought in.
If the dirty bit is set, write the block back to memory before reading another block into its place.
Valid (V) bit
Whether the contents of a cache block are valid or not.
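How the dirty and valid bits interact can be illustrated with a small write-back model. The sketch below assumes a direct-mapped cache with the 128-block, 16-byte geometry of the earlier example and main memory modelled as a bytearray; the class names Line and WriteBackCache and the writebacks counter are invented for illustration.

```python
class Line:
    def __init__(self):
        self.valid = False   # V bit: contents are meaningful
        self.dirty = False   # D bit: contents are newer than memory
        self.tag = None
        self.data = None

class WriteBackCache:
    def __init__(self, memory, num_lines=128, block_bytes=16):
        self.memory = memory
        self.num_lines = num_lines
        self.block_bytes = block_bytes
        self.lines = [Line() for _ in range(num_lines)]
        self.writebacks = 0

    def _fetch(self, addr):
        """Return the line holding addr, bringing the block in on a miss."""
        block = addr // self.block_bytes
        index = block % self.num_lines
        tag = block // self.num_lines
        line = self.lines[index]
        if not (line.valid and line.tag == tag):
            if line.valid and line.dirty:
                # Dirty bit set: write the old block back before replacing it.
                start = (line.tag * self.num_lines + index) * self.block_bytes
                self.memory[start:start + self.block_bytes] = line.data
                self.writebacks += 1
            start = block * self.block_bytes
            line.data = bytearray(self.memory[start:start + self.block_bytes])
            line.tag = tag
            line.valid = True
            line.dirty = False   # reset when a new block is brought in
        return line

    def read(self, addr):
        return self._fetch(addr).data[addr % self.block_bytes]

    def write(self, addr, value):
        line = self._fetch(addr)
        line.data[addr % self.block_bytes] = value
        line.dirty = True        # set when a write occurs in the block

memory = bytearray(64 * 1024)
cache = WriteBackCache(memory)
cache.write(0, 42)
print(memory[0])          # 0: memory is stale, the cache block is more current
cache.read(128 * 16)      # block 128 maps to the same line and evicts block 0
print(memory[0])          # 42: the dirty block was written back
```

A clean block is simply overwritten on replacement, so write-backs happen only for blocks that were actually modified.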