CS152 – Computer Architecture and Engineering Lecture 15 ...cs152/fa04/lecnotes/lec8-2.pdf · ¥...
CS 152 L15 Virtual Memory () UC Regents Fall 2004 © UCB
2004-10-21 Dave Patterson
(www.cs.berkeley.edu/~patterson)
John Lazzaro (www.cs.berkeley.edu/~lazzaro)
www-inst.eecs.berkeley.edu/~cs152/
CS152 – Computer Architecture and Engineering
Lecture 15 – Virtual Memory
Last Time: How to Design a Cache
[Figure: cache block diagram. Tag and block arrays with Addr, Din, and Dout ports toward the CPU and toward lower-level memory, sequenced by a state-machine controller.]
Most design errors come from incorrect specification of state machine behavior!
Common bugs: stalls, block replacement, write buffer
Today’s Lecture - Virtual Memory
Virtual address spaces
Page table layout
TLB design options
DRAM technology
DRAM Technology
Why DRAM over SRAM? Density!
Volatile Memory Comparison
SRAM Cell
• Larger cell → lower density, higher cost/bit
• No refresh required
• Simple read → faster access
• Standard IC process → natural for integration with logic
DRAM Cell
• Smaller cell → higher density, lower cost/bit
• Needs periodic refresh, and refresh after read
• Complex read → longer access time
• Special IC process → difficult to integrate with logic circuits
The primary difference between different memory types is the bit cell.
[Figure: SRAM and DRAM bit cells, with word lines and bit lines labeled.]
SRAM Cell: Large
• 6 transistors, nFET and pFET
• 3 interface wires
• Vdd and Gnd
DRAM Cell: Small
• 1 transistor + capacitor, nFET only
• 2 interface wires
• no Vdd
[Embedded slides from CS152 Spring 2004 (Kubiatowicz), 4/12/04, Lec19.21 to Lec19.24]
1-Transistor Memory Cell (DRAM)
° Write:
• 1. Drive bit line
• 2. Select row
° Read:
• 1. Precharge bit line to Vdd/2
• 2. Select row
• 3. Cell and bit line share charges
- Very small voltage changes on the bit line
• 4. Sense (fancy sense amp)
- Can detect changes of ~1 million electrons
• 5. Write: restore the value
° Refresh:
• 1. Just do a dummy read to every cell.
[Figure: 1-T cell with row select line and bit line.]
DRAM Capacitors: more capacitance in a small area
° Trench capacitors:
• Logic ABOVE capacitor
• Gain in surface area of capacitor
• Better Scaling properties
• Better Planarization
° Stacked capacitors:
• Logic BELOW capacitor
• Gain in surface area of capacitor
• 2-dim cross-section quite small
Classical DRAM Organization (square)
[Figure: RAM cell array with a row decoder driven by the row address, word (row) select lines, bit (data) lines, and column selector & I/O circuits driven by the column address. Each intersection represents a 1-T DRAM cell.]
° Row and Column Address together:
• Select 1 bit at a time
DRAM logical organization (4 Mbit)
[Figure: 2,048 x 2,048 memory array; 11 address lines A0…A10; column decoder; sense amps & I/O; data pins D and Q; word line and storage cell.]
° Square root of bits per RAS/CAS
Density advantage: 3X to 10X, depends on metric
DRAM: Reading, Writing, Refresh
Writing DRAM:
• Drive data on bit line
• Select row
Reading DRAM:
• Select row
• Sense bit line (~1 million electrons)
• Write value back
Refresh: a dummy read
• Capacitor holds state for 60 ms, then must do “refresh”
To learn more: DRAM Circuit Design: A Tutorial, Brent Keeth and R. Jacob Baker, Wiley-IEEE Press, November 2000, ISBN 0-7803-6014-1.
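A back-of-the-envelope sketch of what the 60 ms figure implies, assuming the 2,048 x 2,048 array from the 4 Mbit organization and that each dummy read refreshes one whole row. The function name and the even spacing of refreshes are illustrative assumptions, not something the slides specify.

```python
def refresh_interval_us(rows: int, retention_ms: float) -> float:
    """Average time between row refreshes, in microseconds.

    Assumes every row must be refreshed once per retention period and that
    the refreshes are spread evenly across that period.
    """
    return retention_ms * 1000.0 / rows


# 2,048 rows, each needing one refresh every 60 ms.
interval = refresh_interval_us(rows=2048, retention_ms=60.0)
print(f"one row refresh roughly every {interval:.1f} us")
```

Under these assumptions the controller has to fit in one row refresh about every 29 microseconds, which is time the array cannot spend serving reads and writes.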
Synchronous DRAM (SDRAM) Interface
[Excerpt from the Micron 128Mb: x4, x8, x16 SDRAM data sheet, Rev. E, Pub. 1/02, ©2001 Micron Technology, Inc.]
Operating Mode
The normal operating mode is selected by setting M7 and M8 to zero; the other combinations of values for M7 and M8 are reserved for future use and/or test modes. The programmed burst length applies to both READ and WRITE bursts.
Test modes and reserved states should not be used because unknown operation or incompatibility with future versions may result.
Write Burst Mode
When M9 = 0, the burst length programmed via M0-M2 applies to both READ and WRITE bursts; when M9 = 1, the programmed burst length applies to READ bursts, but write accesses are single-location (nonburst) accesses.
CAS Latency
The CAS latency is the delay, in clock cycles, between the registration of a READ command and the availability of the first piece of output data. The latency can be set to two or three clocks.
If a READ command is registered at clock edge n, and the latency is m clocks, the data will be available by clock edge n + m. The DQs will start driving as a result of the clock edge one cycle earlier (n + m - 1), and provided that the relevant access times are met, the data will be valid by clock edge n + m. For example, assuming that the clock cycle time is such that all relevant access times are met, if a READ command is registered at T0 and the latency is programmed to two clocks, the DQs will start driving after T1 and the data will be valid by T2, as shown in Figure 2. Table 2 below indicates the operating frequencies at which each CAS latency setting can be used.
Reserved states should not be used as unknown operation or incompatibility with future versions may result.
Figure 2: CAS Latency
[Timing diagrams: CLK, COMMAND (READ followed by NOPs), and DQ over clock edges T0 to T4. With CAS latency = 2, DOUT is valid by T2; with CAS latency = 3, DOUT is valid by T3. Marked intervals: tLZ, tAC, tOH. Shading marks DON'T CARE and UNDEFINED regions.]
Table 2: CAS Latency, allowable operating frequency (MHz)
SPEED   CAS LATENCY = 2   CAS LATENCY = 3
-7E     ≤ 133             ≤ 143
-75     ≤ 100             ≤ 133
-8E     ≤ 100             ≤ 125
Note! This example is best-case! For a random access, DRAM takes many more than 2 cycles!
A clocked bus protocol (ex: 100 MHz)
Cache controller puts commands on bus
Data comes out several cycles later.
From Micron 128 Mb SDRAM data sheet (on “resources” web page)
(CAS = Column Address Strobe)
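The CAS latency rule from the data sheet excerpt (READ registered at edge n, data valid by edge n + m) and the limits in Table 2 can be sketched as follows. The function names and the dictionary layout are made up for illustration; only the numbers come from the excerpt.

```python
# Table 2 limits: speed grade -> {CAS latency: maximum clock in MHz}
MAX_CLOCK_MHZ = {
    "-7E": {2: 133, 3: 143},
    "-75": {2: 100, 3: 133},
    "-8E": {2: 100, 3: 125},
}


def data_valid_edge(read_edge: int, cas_latency: int) -> int:
    """Clock edge by which the first data word is valid (n + m)."""
    if cas_latency not in (2, 3):
        raise ValueError("the excerpt allows a CAS latency of 2 or 3 only")
    return read_edge + cas_latency


def cas_delay_ns(cas_latency: int, clock_mhz: float) -> float:
    """CAS latency expressed in nanoseconds at a given bus clock."""
    return cas_latency * 1000.0 / clock_mhz


def latency_allowed(speed_grade: str, cas_latency: int, clock_mhz: float) -> bool:
    """True if Table 2 permits this CAS latency at this clock frequency."""
    return clock_mhz <= MAX_CLOCK_MHZ[speed_grade][cas_latency]


print(data_valid_edge(0, 2))             # READ at T0, latency 2
print(cas_delay_ns(2, 100.0))            # delay in ns at the 100 MHz example clock
print(latency_allowed("-75", 2, 133.0))  # is latency 2 legal at 133 MHz on a -75 part?
```

This covers only the best-case column access; a random access adds the row activation steps, which the sketch does not model.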
Administrivia - Lab 3, HW 3, Lab 4
Homework 3 due 10/26 (Tuesday), 283 Soda, in CS 152 box at 5 PM
Lab 3 final demo on 10/22 (Friday)
Lab 3 report due: Monday 10/25 11:59 PM
Lab 4 to be posted on 10/22 (Friday) (if all goes well).
Virtual Addressing
The Limits of Physical Addressing
[Figure: CPU connected directly to memory by address lines A0-A31 and data lines D0-D31; the addresses are the “physical addresses” of memory locations. Caption: Where we are in CS 152 ...]
All programs share one address space: the physical address space
No way to prevent a program from accessing any machine resource
Machine language programs must be aware of the machine organization
Solution: Add a Layer of Indirection
[Figure: CPU issues “virtual addresses” on A0-A31; address translation hardware converts them to “physical addresses” on A0-A31 for memory; data moves on D0-D31.]
User programs run in a standardized virtual address space
Address translation hardware, managed by the operating system (OS), maps virtual addresses to physical memory
Hardware supports “modern” OS features: Protection, Translation, Sharing
MIPS R4000: Address Space Model
[Figure: two 32-bit virtual address spaces, Process A (ASID = 12) and Process B (ASID = 13). In each, addresses 0 to 2^31 - 1 form a 2 GB user region; the region from 2^31 to 2^32 - 1 may only be accessed by kernel/supervisor and is labeled Address Error for user accesses.]
ASID = Address Space Identifier
Process A and B have independent address spaces
When Process A writes its address 9, it writes to a different physical memory location than Process B’s address 9
To let Process A and B share memory, OS maps parts of ASID 12 and ASID 13 to the same physical memory locations.
All address spaces “translated” to standard map
Still works (slowly!) if a process accesses more virtual memory than the machine has physical memory
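A toy sketch of the ASID idea, assuming 4 KB pages and a flat dictionary keyed by (ASID, virtual page number). The frame numbers and the dictionary itself are hypothetical; real hardware uses the TLB and page tables described later in the lecture.

```python
PAGE_SIZE = 4096  # assumed page size for this illustration

# Hypothetical mappings: (ASID, virtual page number) -> physical frame number
page_map = {
    (12, 0): 7,   # Process A, virtual page 0
    (13, 0): 3,   # Process B, virtual page 0 lands in a different frame
    (12, 5): 42,  # shared memory: A's page 5 and B's page 8
    (13, 8): 42,  # both map to physical frame 42
}


def translate(asid: int, vaddr: int) -> int:
    """Map a virtual address in the given address space to a physical address."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    return page_map[(asid, vpn)] * PAGE_SIZE + offset


# The same virtual address 9 resolves differently in each address space.
print(translate(12, 9), translate(13, 9))
# Different virtual addresses in A and B can reach the same shared location.
print(translate(12, 5 * PAGE_SIZE + 16) == translate(13, 8 * PAGE_SIZE + 16))
```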
MIPS R4000: Who’s Running on the CPU?
System Control Registers
[Excerpt from MIPS R4000 Microprocessor User's Manual, Chapter 4, p. 80]
4.3 System Control Coprocessor
The System Control Coprocessor (CP0) is implemented as an integral part of the CPU, and supports memory management, address translation, exception handling, and other privileged operations. CP0 contains the registers shown in Figure 4-7 plus a 48-entry TLB. The sections that follow describe how the processor uses the memory management-related registers†.
Each CP0 register has a unique number that identifies it; this number is referred to as the register number. For instance, the Page Mask register is register number 5.
† For a description of CP0 data dependencies and hazards, please see Appendix F.
Figure 4-7 CP0 Registers and the TLB
[Figure: the TLB (entries 0 to 47, each drawn as bits 127 to 0; for the “safe” entries see Random Register, contents of TLB Wired) and the CP0 registers, listed here as name (register number). The figure marks some registers as used with the memory management system and others as used with exception processing (see Chapter 5 for details).]
Index (0), Random (1), EntryLo0 (2), EntryLo1 (3), Context (4), PageMask (5), Wired (6), BadVAddr (8), Count (9), EntryHi (10), Compare (11), Status (12), Cause (13), EPC (14), PRId (15), Config (16), LLAddr (17), WatchLo (18), WatchHi (19), XContext (20), ECC (26), CacheErr (27), TagLo (28), TagHi (29), ErrorEPC (30)
Status (12): Indicates user, supervisor, or kernel mode
EntryHi (10): 8-bit ASID field codes virtual address space ID.
User cannot write supervisor/kernel bits. Supervisor cannot write kernel bit.
User cannot change address translation configuration
MIPS Address Translation: How does it work?
[Figure: CPU issues “virtual addresses” on A0-A31; the Translation Look-Aside Buffer (TLB) converts them to “physical addresses” on A0-A31 for memory; data moves on D0-D31.]
Translation Look-Aside Buffer (TLB): A small fully-associative cache of mappings from virtual to physical addresses
TLB also contains ASID and kernel/supervisor bits for virtual address
Fast common case: Virtual address is in TLB, process has permission to read/write it.
What is the table of mappings that it caches?
Page tables encode virtual address spaces
A machine usually supports pages of a few sizes (MIPS R4000):
[Excerpt from MIPS R4000 Microprocessor User's Manual, Memory Management, p. 87]
EntryLo0 (2), and EntryLo1 (3) Registers
The EntryLo register consists of two registers that have identical formats:
• EntryLo0 is used for even virtual pages.
• EntryLo1 is used for odd virtual pages.
The EntryLo0 and EntryLo1 registers are read/write registers. They hold the physical page frame number (PFN) of the TLB entry for even and odd pages, respectively, when performing TLB read and write operations. Figure 4-10 shows the format of these registers.
PageMask Register (5)
The PageMask register is a read/write register used for reading from or writing to the TLB; it holds a comparison mask that sets the variable page size for each TLB entry, as shown in Table 4-9.
TLB read and write operations use this register as either a source or a destination; when virtual addresses are presented for translation into physical address, the corresponding bits in the TLB identify which virtual address bits among bits 24:13 are used in the comparison. When the Mask field is not one of the values shown in Table 4-9, the operation of the TLB is undefined.
Table 4-9 Mask Field Values for Page Sizes
Page Size     Bit 24 23 22 21 20 19 18 17 16 15 14 13
4 Kbytes           0  0  0  0  0  0  0  0  0  0  0  0
16 Kbytes          0  0  0  0  0  0  0  0  0  0  1  1
64 Kbytes          0  0  0  0  0  0  0  0  1  1  1  1
256 Kbytes         0  0  0  0  0  0  1  1  1  1  1  1
1 Mbyte            0  0  0  0  1  1  1  1  1  1  1  1
4 Mbytes           0  0  1  1  1  1  1  1  1  1  1  1
16 Mbytes          1  1  1  1  1  1  1  1  1  1  1  1
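The pattern in Table 4-9 can be checked with a short sketch. It assumes the 12 mask bits (24:13) are read as one integer; the helper name and the lookup set are made up for illustration. Each additional pair of 1 bits quadruples the page size, so the size works out to (mask + 1) x 4 Kbytes.

```python
# The seven legal Mask field values from Table 4-9, as 12-bit integers.
LEGAL_MASKS = {
    0b000000000000,
    0b000000000011,
    0b000000001111,
    0b000000111111,
    0b000011111111,
    0b001111111111,
    0b111111111111,
}


def page_size_bytes(mask: int) -> int:
    """Page size selected by a PageMask Mask field value."""
    if mask not in LEGAL_MASKS:
        # The manual says TLB operation is undefined for any other value.
        raise ValueError(f"mask {mask:#014b} is not listed in Table 4-9")
    return (mask + 1) * 4096


for mask in sorted(LEGAL_MASKS):
    print(f"{mask:012b} -> {page_size_bytes(mask) // 1024} Kbytes")
```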
[Figure: a virtual address indexes the page table (one per ASID); page table entries point to frames in the physical memory space.]
A virtual address space is divided into blocks of memory called pages
A page table is indexed by a virtual address
A valid page table entry codes physical memory “frame” address for the page
OS manages the page table for each ASID
[Embedded slides from CS152 Spring 2004 (Kubiatowicz), 4/19/04, starting at Lec21.21]
Recall: Levels of the Memory Hierarchy
Level         Capacity      Access Time    Cost
Registers     100s Bytes    <10s ns
Cache         K Bytes       10-100 ns      $.01-.001/bit
Main Memory   M Bytes       100 ns-1 us    $.01-.001
Disk          G Bytes       ms             10^-3 - 10^-4 cents
Tape          infinite      sec-min        10^-6
Staging transfer units between adjacent levels (upper levels are faster, lower levels are larger):
Registers and Cache: instructions and operands, 1-8 bytes, managed by program/compiler
Cache and Memory: blocks, 8-128 bytes, managed by cache controller
Memory and Disk: pages, 512-4K bytes, managed by OS
Disk and Tape: files, Mbytes, managed by user/operator
What is virtual memory?
° Virtual memory => treat memory as a cache for the disk
° Terminology: blocks in this cache are called “Pages”
• Typical size of a page: 1K to 8K
° Page table maps virtual page numbers to physical frames
• “PTE” = Page Table Entry
[Figure: a virtual address (virtual page no. plus 10-bit offset) is translated to a physical address (physical page no. plus 10-bit offset). The Page Table Base Reg and the virtual page number index into the page table, which is located in physical memory; each entry holds a valid bit (V), access rights, and the physical address (PA). The virtual address space maps onto the physical address space.]
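The lookup in the figure can be sketched as below, assuming the 10-bit offset drawn on the slide. The PTE fields, the exception names, and the use of a dictionary for the page table are illustrative assumptions; a single `writable` flag stands in for the access-rights field.

```python
from dataclasses import dataclass

OFFSET_BITS = 10  # 10-bit offset as drawn in the figure (1 KB pages)


@dataclass
class PTE:
    valid: bool     # V bit
    writable: bool  # stand-in for the "access rights" field
    frame: int      # physical page number


class PageFault(Exception):
    """Raised when the PTE is missing or has V = 0."""


class ProtectionFault(Exception):
    """Raised when the access rights forbid the access."""


def translate(page_table: dict, vaddr: int, write: bool = False) -> int:
    """Translate a virtual address using a single-level page table."""
    vpn = vaddr >> OFFSET_BITS                 # index into the page table
    offset = vaddr & ((1 << OFFSET_BITS) - 1)  # passed through unchanged
    pte = page_table.get(vpn)
    if pte is None or not pte.valid:
        raise PageFault(vpn)
    if write and not pte.writable:
        raise ProtectionFault(vpn)
    return (pte.frame << OFFSET_BITS) | offset


table = {2: PTE(valid=True, writable=False, frame=5)}
print(hex(translate(table, (2 << OFFSET_BITS) | 0x3F)))
```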
Three Advantages of Virtual Memory
° Translation:
• Program can be given consistent view of memory, even though physical memory is scrambled
• Makes multithreading reasonable (now used a lot!)
• Only the most important part of program (“Working Set”) must be in physical memory.
• Contiguous structures (like stacks) use only as much physical memory as necessary yet still grow later.
° Protection:
• Different threads (or processes) protected from each other.
• Different pages can be given special behavior (Read Only, Invisible to user programs, etc).
• Kernel data protected from User programs
• Very important for protection from malicious programs => Far more “viruses” under Microsoft Windows
° Sharing:
• Can map same physical page to multiple users (“Shared memory”)
Issues in Virtual Memory System Design
[Figure: hierarchy of reg, cache, mem, and disk; pages move between disk and memory frames.]
What is the size of information blocks that are transferred from secondary to main storage (M)? ⇒ page size (contrast with physical block size on disk, i.e. sector size)
Which region of M is to hold the new block? ⇒ placement policy
How do we find a page when we look for it? ⇒ block identification
When a block of information is brought into M and M is full, some region of M must be released to make room for the new block ⇒ replacement policy
What do we do on a write? ⇒ write policy
Missing item fetched from secondary memory only on the occurrence of a fault ⇒ demand load policy
The TLB caches page table entries
[Figure: a virtual address (page, off) is looked up in the page table for the ASID, or in the TLB, which caches page table entries; the matching entry supplies the physical frame address, giving a physical address (frame, off).]
V=0 pages either reside on disk or have not yet been allocated. OS handles V=0 “Page fault”
Physical and virtual pages must be the same size!
MIPS handles TLB misses in software (random replacement). Other machines use hardware.
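A minimal sketch of a software-managed TLB with random replacement, in the spirit of the MIPS approach mentioned above. The 48-entry default comes from the manual excerpt earlier in the lecture; the class, its methods, and the dictionary page table are hypothetical, and the real R4000 details (paired even/odd pages, wired entries) are left out.

```python
import random


class TLB:
    """Fully associative TLB keyed by (ASID, virtual page number)."""

    def __init__(self, entries: int = 48, rng: random.Random = None):
        self.capacity = entries
        self.slots = []  # each slot holds (asid, vpn, frame)
        self.rng = rng or random.Random()

    def lookup(self, asid: int, vpn: int):
        """Return the cached frame, or None on a TLB miss."""
        for slot_asid, slot_vpn, frame in self.slots:
            if slot_asid == asid and slot_vpn == vpn:
                return frame
        return None

    def refill(self, asid: int, vpn: int, frame: int) -> None:
        """Install a translation, evicting a random slot when full."""
        entry = (asid, vpn, frame)
        if len(self.slots) < self.capacity:
            self.slots.append(entry)
        else:
            self.slots[self.rng.randrange(self.capacity)] = entry


def translate(tlb: TLB, page_table: dict, asid: int, vpn: int) -> int:
    """Fast path hits in the TLB; the miss path models the software handler."""
    frame = tlb.lookup(asid, vpn)
    if frame is None:
        frame = page_table[(asid, vpn)]  # handler walks the page table
        tlb.refill(asid, vpn, frame)
    return frame


small_tlb = TLB(entries=2, rng=random.Random(0))
pages = {(12, vpn): 100 + vpn for vpn in range(4)}
frames = [translate(small_tlb, pages, 12, vpn) for vpn in range(4)]
print(frames, len(small_tlb.slots))
```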
Page tables may not fit in memory!
A table for 4KB pages for a 32-bit address space has 1M entries
Each process needs its own address space!
Two-level Page Tables
32 bit virtual address: P1 index (bits 31-22), P2 index (bits 21-12), Page Offset (bits 11-0)
Top-level table wired in main memory
Subset of 1024 second-level tables in main memory; rest are on disk or unallocated
How big is the translation (page) table?
Virtual address = Virtual Page Number + Page Offset
° Simplest way to implement “fully associative” lookup policy is with large lookup table.
° Each entry in table is some number of bytes, say 4
° With 4K pages, 32-bit address space, need: 2^32/4K = 2^20 = 1 Meg entries x 4 bytes = 4MB
° With 4K pages, 64-bit address space, need: 2^64/4K = 2^52 entries = BIG!
° Can’t keep whole page table in memory!
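The arithmetic on this slide can be reproduced with a few lines. The function name is made up for illustration, and the 4-byte entry size is the slide's own "say 4" assumption.

```python
def flat_page_table(addr_bits: int, page_bytes: int, pte_bytes: int = 4):
    """Return (number of entries, table size in bytes) for a flat page table."""
    entries = 2 ** addr_bits // page_bytes
    return entries, entries * pte_bytes


entries_32, bytes_32 = flat_page_table(32, 4096)
entries_64, bytes_64 = flat_page_table(64, 4096)
print(entries_32, bytes_32 // 2 ** 20, "MB")
print(entries_64 == 2 ** 52)
```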
Large Address Spaces
Two-level Page Tables
32-bit address: P1 index (10 bits), P2 index (10 bits), page offset (12 bits)
[Figure: the P1 index selects a 4-byte entry in the top-level table; that entry points to a second-level table of 1K PTEs, where the P2 index selects a 4-byte PTE; the PTE points to a 4KB page.]
° 2 GB virtual address space
° 4 MB of PTE2
– paged, holes
° 4 KB of PTE1
What about a 48-64 bit address space?
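A sketch of the two-level walk using the 10/10/12 split from the slide. Nested dictionaries stand in for the tables, so a missing second-level table models a "hole" or a paged-out table; the function names and error messages are illustrative assumptions.

```python
OFFSET_BITS = 12  # 4 KB pages
INDEX_BITS = 10   # 1K entries per table


def split(vaddr: int):
    """Split a 32-bit virtual address into (P1 index, P2 index, offset)."""
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    p2 = (vaddr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    p1 = (vaddr >> (OFFSET_BITS + INDEX_BITS)) & ((1 << INDEX_BITS) - 1)
    return p1, p2, offset


def walk(top_level: dict, vaddr: int) -> int:
    """Walk both levels and return the physical address."""
    p1, p2, offset = split(vaddr)
    second_level = top_level.get(p1)
    if second_level is None:
        raise LookupError("second-level table is on disk or unallocated")
    frame = second_level.get(p2)
    if frame is None:
        raise LookupError("page fault: no valid PTE")
    return (frame << OFFSET_BITS) | offset


tables = {1: {3: 0x55}}  # only one second-level table is resident
print(split(0x00403ABC))
print(hex(walk(tables, 0x00403ABC)))
```

Only the second-level tables that are actually in use need to exist, which is where the space saving over a flat table comes from.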
Inverted Page Tables
[Figure: the virtual page number is hashed to index a table of (V. Page, P. Frame) pairs; the stored virtual page is compared (=) with the requested one to confirm the match.]
IBM System 38 (AS400) implements 64-bit addresses.
48 bits translated
start of object contains a 12-bit tag
=> TLBs or virtually addressed caches are critical
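A toy inverted page table, with one slot per physical frame, to make the hash-then-compare lookup concrete. Linear probing on collisions and the use of Python's built-in `hash` are assumptions chosen for brevity; the slide shows only the hash and the comparison, and real designs typically chain colliding entries.

```python
class InvertedPageTable:
    """One slot per physical frame; the slot index is the frame number."""

    def __init__(self, num_frames: int):
        self.slots = [None] * num_frames  # each slot stores a virtual page number

    def _probe(self, vpn: int):
        """Yield candidate frames, starting at the hashed position."""
        start = hash(vpn) % len(self.slots)
        for step in range(len(self.slots)):
            yield (start + step) % len(self.slots)

    def insert(self, vpn: int) -> int:
        """Place a virtual page in a free frame and return the frame number."""
        for frame in self._probe(vpn):
            if self.slots[frame] is None or self.slots[frame] == vpn:
                self.slots[frame] = vpn
                return frame
        raise MemoryError("no free physical frame")

    def lookup(self, vpn: int):
        """Return the frame holding vpn, or None if it is not resident."""
        for frame in self._probe(vpn):
            if self.slots[frame] is None:
                return None
            if self.slots[frame] == vpn:  # the "=" comparison in the figure
                return frame
        return None


ipt = InvertedPageTable(num_frames=4)
print(ipt.insert(5), ipt.insert(9))  # both hash to the same slot
print(ipt.lookup(9), ipt.lookup(13))
```

The table size tracks physical memory rather than the virtual address space, which is why the approach suits very large address spaces.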
Virtual Address and a Cache: Step backward???
[Figure: CPU sends VA to translation, which sends PA to the cache; on a hit the cache returns data, on a miss the access goes to main memory.]
° Virtual memory seems to be really slow:
• Must access memory on load/store, even on cache hits!
• Worse, if translation not completely in memory, may need to go to disk before hitting in cache!
° Solution: Caching! (surprise!)
• Keep track of most common translations and place them in a “Translation Lookaside Buffer” (TLB)
VM and Disk: Page replacement policy
[Figure: the set of all pages in memory arranged as a circular list, each with used and dirty bits from the page table; a tail pointer and a head pointer sweep the list, and freed pages go to the free list.]
Dirty bit: page written.
Used bit: set to 1 on any reference
Tail pointer: Clear the used bit in the page table
Head pointer: Place pages on free list if used bit is still clear. Schedule pages with dirty bit set to be written to disk.
Architect’s role: support setting dirty and used bits
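The two-pointer sweep can be sketched as follows. The class name, the fixed gap between the pointers, and the list-based free list are assumptions for illustration; the slide gives only the roles of the two pointers and the two bits.

```python
from dataclasses import dataclass


@dataclass
class Page:
    number: int
    used: bool = False   # set by hardware on any reference
    dirty: bool = False  # set by hardware on any write


class ClockSweeper:
    """Tail pointer clears used bits; head pointer follows `gap` pages behind."""

    def __init__(self, pages, gap: int):
        self.pages = pages
        self.gap = gap
        self.tail = 0
        self.free_list = []  # page numbers available for reuse
        self.writeback = []  # dirty pages scheduled to be written to disk

    def step(self) -> None:
        n = len(self.pages)
        # Tail pointer: clear the used bit.
        self.pages[self.tail % n].used = False
        # Head pointer: visit the page the tail pointer cleared `gap` steps ago.
        head = self.tail - self.gap
        if head >= 0:
            page = self.pages[head % n]
            if not page.used:  # not referenced since it was cleared
                self.free_list.append(page.number)
                if page.dirty:
                    self.writeback.append(page.number)
        self.tail += 1


pages = [Page(i) for i in range(4)]
pages[1].dirty = True
sweeper = ClockSweeper(pages, gap=2)
sweeper.step()
sweeper.step()
pages[0].used = True  # page 0 is referenced again before the head pointer arrives
sweeper.step()
sweeper.step()
print(sweeper.free_list, sweeper.writeback)
```

Page 0 survives because it was referenced between the two pointers' visits; page 1 is reclaimed and, being dirty, also queued for write-back.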
TLB Design Concepts
MIPS R4000 TLB: A closer look ...
[Figure: CPU issues “virtual addresses” on A0-A31; the Translation Look-Aside Buffer (TLB) converts them to “physical addresses” on A0-A31 for the memory system; data moves on D0-D31.]
[Excerpt from MIPS R4000 Microprocessor User's Manual, Memory Management, p. 65]
32-bit Mode Address Translation
Figure 4-2 shows the virtual-to-physical-address translation of a 32-bit mode address.
• The top portion of Figure 4-2 shows a virtual address with a 12-bit, or 4-Kbyte, page size, labelled Offset. The remaining 20 bits of the address represent the VPN, and index the 1M-entry page table.
• The bottom portion of Figure 4-2 shows a virtual address with a 24-bit, or 16-Mbyte, page size, labelled Offset. The remaining 8 bits of the address represent the VPN, and index the 256-entry page table.
Figure 4-2 32-bit Mode Virtual Address Translation
[Figure: two virtual address formats, each with an 8-bit ASID in bits 39-32. Top: virtual address with 1M (2^20) 4-Kbyte pages, 20-bit VPN in bits 31-12 and 12-bit Offset in bits 11-0. Bottom: virtual address with 256 (2^8) 16-Mbyte pages, 8-bit VPN in bits 31-24 and 24-bit Offset in bits 23-0. In both, the TLB performs the virtual-to-physical translation of the VPN into a PFN, and the Offset is passed unchanged to physical memory, giving a 36-bit physical address (bits 35-0).]
Bits 31, 30 and 29 of the virtual address select user, supervisor, or kernel address spaces.
ASID is checked against the CP0 ASID.
Physical space larger than virtual space!
Can TLB and caching be overlapped?
[Figure: the page offset of the virtual address supplies the cache Index and Byte Select, so the cache tags and data are read while the TLB translates the virtual page number; the physical page number from the TLB is compared (=) with the stored cache tag, gated by the valid bit, to produce Hit and Data out.]
This works, but ...
Q. What is the downside?
A. Inflexibility. VPN size locked to cache tag size.
UC Regents Fall 2004 © UCBCS 152 L15 Virtual Memory ()
Can we cache virtual addresses?“Physical Addresses”
CPU Main Memory
A0-A31 A0-A31
D0-D31 D0-D31
Only use TLB on a cache miss !
TranslationLook-Aside
Buffer(TLB)
Virtual Physical
“Virtual Addresses”
A. Synonym problem. If two address spaces share a physical frame, data may be in cache twice.
Maintaining consistency is a nightmare.
CacheVirtual
D0-D31
Downside: a subtle, fatal problem. What is it?
22
Conclusions
VM: Uniform memory models, protection, sharing.
Synchronous DRAM: flexible bus protocol for array access
Operating systems manage the page table and (often) the TLB
A TLB acts as a fast cache for recent address translations.