Module #7 – Memory Management
• Lecture #2
  • Cache memory
• Readings: Silberschatz, 9.1-9.8
Exploiting Locality of Reference
• Exploit locality of reference by keeping a subset of the instructions and data values in high-speed storage (with a mechanism to change that subset when necessary).
• The processor checks high-speed storage first; if the item is not found, it is copied from slower storage.
Exploiting Locality of Reference
• Some systems have two or three levels of cache.
• Level 1 cache (usually split between the instruction cache and the data cache) is smallest and fastest.
• Level 2 cache is larger and slower (Level 3 cache is even larger and slower).
Cache and RAM Configuration
• RAM is much larger than cache, so cache can only hold a subset of RAM.
• RAM is viewed as a sequence of fixed-size blocks (N bytes).
• Each cache slot (line) can hold one block (N bytes).
• Each cache slot has associated control bits (valid, tag, etc.).
• The unit of transfer between RAM and cache is one block.
[Figure: RAM and cache; each cache line holds control bits and a data block (4 bytes)]
Read (Load) Operation
Read hit – if desired item is already present in cache, simply copy item from cache to CPU.
• Control info sent to cache – hit
• Desired item copied from cache to CPU
[Figure: CPU, Cache, RAM]
Read (Load) Operation
Read miss – if desired item is not already present in cache, copy block containing item from RAM to cache.
• Control info sent to cache – miss
• Address sent to RAM
• Block containing desired item copied into cache
• Desired item copied from cache to CPU
[Figure: CPU, Cache, RAM]
Write (Store) Operation
Write hit – if desired item is already present in cache, simply copy item from CPU to cache.
• Control info sent to cache – hit
• Desired item copied from CPU to cache
[Figure: CPU, Cache, RAM]
Write (Store) Operation
Write miss – if desired item is not already present in cache, copy block containing item from RAM to cache.
• Control info sent to cache – miss
• Address sent to RAM
• Block containing desired item copied into cache
• Desired item copied from CPU to cache
[Figure: CPU, Cache, RAM]
Write Policies
• After a write operation, the contents of the block in cache and in RAM differ, so a strategy is needed to bring RAM up to date.
• Write through: whenever a cache block is changed, the block is written (copied) to RAM.
• Write back: the cache block is only written (copied) to RAM when the cache line is evicted (replaced).
  • multiple store instructions can occur before the block has to be written to RAM
  • a modified bit is used to indicate that the block has been changed (and must be written to RAM)
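The difference between the two policies can be seen by counting RAM block writes. The following is a minimal sketch, not taken from the slides: it assumes a run of store hits to a single cached block followed by one eviction, and the function name `count_ram_writes` and the policy strings are made up for illustration.

```python
def count_ram_writes(num_stores, policy):
    """Count RAM block writes for store hits to one cached block, then its eviction."""
    if policy not in ("write-through", "write-back"):
        raise ValueError("unknown policy: " + policy)

    modified = False   # the modified (dirty) bit of the cache line
    writes = 0
    for _ in range(num_stores):
        if policy == "write-through":
            writes += 1        # every store also copies the block to RAM
        else:
            modified = True    # write back: only remember that the block changed

    # Eviction: a write-back cache copies the block to RAM only if it was modified.
    if modified:
        writes += 1
    return writes


through = count_ram_writes(10, "write-through")
back = count_ram_writes(10, "write-back")
print(through, back)
```

Under these assumptions, write through costs one RAM write per store, while write back costs a single RAM write at eviction no matter how many stores hit the block.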
Cache Organizations
Several different cache organizations have been developed:
• Direct mapped
• Fully associative
• Set associative
Direct mapped and fully associative are two ends of the spectrum – set associative is in between.
Direct Mapped
Mapping function: I = J mod M
I = cache line number
J = main memory block number
M = number of lines in the cache
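As a quick illustration of the mapping function, here is a small sketch. The function name `direct_mapped_line` and the choice of M = 4 are assumptions made for the example, not values from the slides.

```python
def direct_mapped_line(block_number, num_lines):
    """Return the cache line I that holds main memory block J (I = J mod M)."""
    return block_number % num_lines


# With M = 4 lines, blocks 0, 4, 8, ... all compete for line 0.
lines = [direct_mapped_line(j, 4) for j in range(10)]
print(lines)  # [0, 1, 2, 3, 0, 1, 2, 3, 0, 1]
```

Because each block has exactly one possible line, two blocks whose numbers differ by a multiple of M can never be in the cache at the same time.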
Example #1
• Block size: 4 bytes
• RAM: 16 MB (24-bit addresses)
• RAM is viewed as 2^22 blocks of 4 bytes each (16 MB / 4 bytes)
• Cache: 64 KB for data blocks
• Cache is organized as 2^14 lines, where each line holds 4 bytes (64 KB / 4 bytes)
• Control bits associated with each cache line
Example (2)
Address (24 bits) viewed as three fields:
• Offset: 2 bits to identify byte within block
• Line: 14 bits to identify cache line
• Tag: 8 bits (remaining bits)
Tag       Line       Offset
8 bits    14 bits    2 bits
Example (3)
Address: 16339C (hex)
in binary:
000101100011001110011100
Tag: 00010110 (hex 16)
Line: 00110011100111 (hex 0CE7)
Offset: 00 (byte 0 within the block)
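The field extraction above can be reproduced with shifts and masks. This is a sketch under the Example #1 layout (8-bit tag, 14-bit line, 2-bit offset); the names `split_address`, `OFFSET_BITS`, and `LINE_BITS` are illustrative.

```python
OFFSET_BITS = 2    # 4-byte blocks
LINE_BITS = 14     # 2^14 cache lines


def split_address(addr):
    """Split a 24-bit address into (tag, line, offset) for the Example #1 cache."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    line = (addr >> OFFSET_BITS) & ((1 << LINE_BITS) - 1)
    tag = addr >> (OFFSET_BITS + LINE_BITS)
    return tag, line, offset


tag, line, offset = split_address(0x16339C)
print(f"tag={tag:02X} line={line:04X} offset={offset}")  # tag=16 line=0CE7 offset=0
```

Note that `addr >> OFFSET_BITS` is the main memory block number J, so masking it to 14 bits is the same as computing J mod 2^14.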
Example (4)
Cache line    Addresses of RAM blocks
0             000000, 010000, …, FF0000
1             000004, 010004, …, FF0004
2             000008, 010008, …, FF0008
...
2^14 - 1      00FFFC, 01FFFC, …, FFFFFC
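The rows of this table can be generated rather than listed by hand. The sketch below assumes the Example #1 parameters (4-byte blocks, 2^14 lines, 8-bit tags); `block_addresses` is a hypothetical helper name.

```python
BLOCK_SIZE = 4        # bytes per block
NUM_LINES = 2 ** 14   # lines in the cache
NUM_TAGS = 2 ** 8     # distinct tag values (8-bit tag)


def block_addresses(line):
    """Starting addresses of every RAM block that maps to the given cache line."""
    return [(tag * NUM_LINES + line) * BLOCK_SIZE for tag in range(NUM_TAGS)]


for line in (0, 1, 2, NUM_LINES - 1):
    addrs = block_addresses(line)
    print(line, f"{addrs[0]:06X}", f"{addrs[1]:06X}", "...", f"{addrs[-1]:06X}")
```

Each line is shared by 256 RAM blocks, one per tag value, which is why the tag must be stored alongside the data.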
Determining Hit or Miss
When the cache controller checks a particular cache line, it needs to determine if the desired item is already in the cache or not (hit or miss).
• Check the Valid bit.
• Compare the tag from the address and the tag from the cache line.
• If the entry is valid and the tags match, the item is present in the cache.
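The hit test reduces to two checks. The following is a minimal sketch; the `CacheLine` class and `is_hit` function are illustrative names, and the data block itself is omitted.

```python
from dataclasses import dataclass


@dataclass
class CacheLine:
    valid: bool = False   # Valid bit: is this entry in use?
    tag: int = 0          # tag of the block currently stored in the line


def is_hit(line, address_tag):
    """Hit only if the entry is valid AND its tag matches the address tag."""
    return line.valid and line.tag == address_tag


print(is_hit(CacheLine(valid=True, tag=0x16), 0x16))   # True
print(is_hit(CacheLine(valid=False, tag=0x16), 0x16))  # False
```

The valid check matters because a line that has never been filled may hold a leftover tag that matches by coincidence.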
Example #2
• Address (32 bits) viewed as three fields:
  • Byte offset: 8 bits to identify byte within block
  • Line: 4 bits to identify cache line
  • Tag: 20 bits (remaining bits)
• Example: FFF7C408
  11111111111101111100010000001000
Example (2)
• How many lines in the cache?
  2^4 = 16 lines
• How many bytes in one block?
  2^8 = 256 bytes
• How many control bits in one line?
  V + M + Tag = 1 + 1 + 20 = 22 bits
• How many total bits in one line?
  control + data = 22 + 2048 = 2070 bits
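These answers follow directly from the field widths, so the arithmetic can be checked with a few lines. This is a sketch using the Example #2 widths; the variable names are illustrative.

```python
ADDRESS_BITS = 32
OFFSET_BITS = 8    # byte offset field
LINE_BITS = 4      # line field

num_lines = 2 ** LINE_BITS                              # lines in the cache
block_bytes = 2 ** OFFSET_BITS                          # bytes in one block
tag_bits = ADDRESS_BITS - LINE_BITS - OFFSET_BITS       # remaining bits
control_bits = 1 + 1 + tag_bits                         # V + M + Tag
total_bits = control_bits + 8 * block_bytes             # control + data

print(num_lines, block_bytes, control_bits, total_bits)  # 16 256 22 2070
```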
Example (3)
        V  M  Tag             V  M  Tag
        -  -  -----           -  -  -----
[0]:    1  0  FF641   [8]:    0  0  0004A
[1]:    1  0  00014   [9]:    1  0  00028
[2]:    1  0  0003A   [A]:    1  0  00028
[3]:    0  1  FF593   [B]:    1  1  FFF7C
[4]:    1  1  FFF7C   [C]:    0  1  00EA1
[5]:    1  0  00014   [D]:    1  0  00028
[6]:    0  0  00014   [E]:    1  1  0003A
[7]:    1  0  00014   [F]:    1  1  0003A
Example (4)
• Index – line number (not stored)
• Valid bit (V) – initially 0, set to 1 when that entry in the cache is in use
• Modified bit (M) – set to 1 when at least one byte in the block has been modified by a "write" operation (sometimes called the dirty bit)
• Tag bits – compared to tag bits from address
• Block – 256 bytes (not shown)
Example (5)
• Consider the cache entry at index 4:
    [4]: 1 1 FFF7C
  • What are the addresses of the first and last bytes in that cache entry?
    first byte: FFF7C400
    last byte: FFF7C4FF
  • Have the contents of that cache block been modified?
    Yes, M = 1
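The first and last addresses come from concatenating the tag, the index, and an offset of all zeros or all ones. A small sketch under the Example #2 layout (20-bit tag, 4-bit line, 8-bit offset); `block_range` is an illustrative name.

```python
OFFSET_BITS = 8   # 256-byte blocks
LINE_BITS = 4     # 16 cache lines


def block_range(tag, index):
    """Return (first, last) byte addresses of the block held in a cache entry."""
    first = (tag << (LINE_BITS + OFFSET_BITS)) | (index << OFFSET_BITS)
    last = first | ((1 << OFFSET_BITS) - 1)   # offset field set to all ones
    return first, last


first, last = block_range(0xFFF7C, 0x4)
print(f"{first:08X} {last:08X}")  # FFF7C400 FFF7C4FF
```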
Example (6)
• Consider a request to read from address 00028A14
  Line in address is A, so check cache line at index A:
    [A]: 1 0 00028
  Hit: V = 1 and tag in cache line matches tag in address
  Transfer 4 bytes (14, 15, 16, 17) from cache block to CPU
Example (7)
• Consider a request to read from address 0007260C
  Line in address is 6, so check cache line at index 6:
    [6]: 0 0 00014
  Miss: V = 0
  Transfer 256 bytes from RAM to cache
  Set V bit to 1
  Set M bit to 0
  Set tag to 00072
  Transfer 4 bytes (0C, 0D, 0E, 0F) from cache block to CPU
Example (8)
• Consider a request to write to address 0003AED8
  Line in address is E, so check cache line at index E:
    [E]: 1 1 0003A
  Hit: V = 1 and tag in cache line matches tag in address
  Transfer 4 bytes from CPU to cache block (D8, D9, DA, DB)
  Set M bit to 1
• Note that some of the 256 bytes in the cache block are no longer the same as the corresponding bytes in RAM (copy block to RAM later)
Example (9)
• Consider a request to write to address 0003A344
  Line in address is 3, so check cache line at index 3:
    [3]: 0 1 FF593
  Miss: V = 0
  Transfer 256 bytes from RAM to cache
  Set V bit to 1
  Set M bit to 0
  Set tag to 0003A
  Transfer 4 bytes (44, 45, 46, 47) from CPU to cache block
  Set M bit to 1
Example (10)
• Consider a request to read from address 002C5934
  Line in address is 9, so check cache line at index 9:
    [9]: 1 0 00028
  Miss: V = 1, but tags don't match
  Transfer 256 bytes from RAM to cache
  Set V bit to 1
  Set M bit to 0
  Set tag to 002C5
  Transfer 4 bytes (34, 35, 36, 37) from cache block to CPU
Example (11)
• Consider a request to read from address 002D1F98
  Line in address is F, so check cache line at index F:
    [F]: 1 1 0003A
  Miss: V = 1, but tags don't match
  Transfer 256 bytes from cache to RAM (write back)
  Transfer 256 bytes from RAM to cache
  Set V bit to 1
  Set M bit to 0
  Set tag to 002D1
  Transfer 4 bytes (98, 99, 9A, 9B) from cache block to CPU
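The six requests from Examples (6) through (11) can be replayed against the cache table from Example (3). The sketch below tracks only the control bits, not the data. It assumes a write-back cache that allocates a line on a write miss and writes a victim block back only when that entry is both valid and modified, which matches the steps listed in the examples. The `access` function and the outcome strings are illustrative.

```python
# Control bits from Example (3): (V, M, tag) for indexes 0x0 through 0xF.
TABLE = [
    (1, 0, 0xFF641), (1, 0, 0x00014), (1, 0, 0x0003A), (0, 1, 0xFF593),
    (1, 1, 0xFFF7C), (1, 0, 0x00014), (0, 0, 0x00014), (1, 0, 0x00014),
    (0, 0, 0x0004A), (1, 0, 0x00028), (1, 0, 0x00028), (1, 1, 0xFFF7C),
    (0, 1, 0x00EA1), (1, 0, 0x00028), (1, 1, 0x0003A), (1, 1, 0x0003A),
]
cache = [{"V": v, "M": m, "tag": t} for v, m, t in TABLE]


def access(addr, write=False):
    """Perform one read or write; return (index, outcome) and update control bits."""
    tag = addr >> 12             # upper 20 bits
    index = (addr >> 8) & 0xF    # 4-bit line field
    entry = cache[index]
    if entry["V"] and entry["tag"] == tag:
        outcome = "hit"
    else:
        # Assumption: only a valid, modified victim is copied back to RAM.
        outcome = "miss + write back" if entry["V"] and entry["M"] else "miss"
        entry.update(V=1, M=0, tag=tag)   # block loaded from RAM
    if write:
        entry["M"] = 1                    # block now differs from RAM
    return index, outcome


requests = [
    (0x00028A14, False),  # Example (6): read
    (0x0007260C, False),  # Example (7): read
    (0x0003AED8, True),   # Example (8): write
    (0x0003A344, True),   # Example (9): write
    (0x002C5934, False),  # Example (10): read
    (0x002D1F98, False),  # Example (11): read
]
results = [access(addr, write) for addr, write in requests]
for (addr, write), (index, outcome) in zip(requests, results):
    kind = "write" if write else "read"
    print(f"{kind:5} {addr:08X} -> line {index:X}: {outcome}")
```

Since each request touches a different cache line, replaying them in sequence gives the same outcome as evaluating each one against the original table.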
Block Size
• To exploit spatial locality, a cache slot must hold more than one item (one word).
• Block size is always a power of 2 (use the least significant N bits of the address to identify a specific byte within a block of 2^N bytes).
• Typical block sizes are 32 to 256 bytes, with 64 and 128 bytes as the most common.
Block Size
[Figure: miss rate vs. block size for one benchmark]
Example: 64-byte blocks
• Addresses are 32 bits
• Cache characteristics:
  • direct mapped
  • write through
  • 256 slots
  • 16 words (64 bytes) per block
• Address subdivided into 3 fields:
  • 18 + 8 + 6 (tag: 18 bits, slot: 8 bits for 256 slots, byte offset: 6 bits for 64 bytes)
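The 18 + 8 + 6 subdivision for this cache can be sketched the same way as the earlier examples. The address 12345678 below is an arbitrary value chosen for illustration, not one from the slides, and the names `split_address`, `SLOT_BITS`, and `OFFSET_BITS` are assumptions.

```python
OFFSET_BITS = 6   # 64-byte blocks: 2^6 bytes
SLOT_BITS = 8     # 256 slots: 2^8


def split_address(addr):
    """Split a 32-bit address into (tag, slot, offset) fields of 18, 8, and 6 bits."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    slot = (addr >> OFFSET_BITS) & ((1 << SLOT_BITS) - 1)
    tag = addr >> (OFFSET_BITS + SLOT_BITS)
    return tag, slot, offset


tag, slot, offset = split_address(0x12345678)
print(f"tag={tag:05X} slot={slot:02X} offset={offset:02X}")

# Reassembling the three fields gives back the original address.
rebuilt = (tag << (SLOT_BITS + OFFSET_BITS)) | (slot << OFFSET_BITS) | offset
print(f"{rebuilt:08X}")
```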