1
Chapter 10 Virtual Memory
2
Outline
• Background
• Demand Paging
• Process Creation
• Page Replacement
• Allocation of Frames
• Thrashing
• OS Examples
3
Background
• Memory management in Chapter 9
– Place the entire logical address space into the physical address space
– Exception: overlays and dynamic loading, managed by the programmer
• Virtual memory – allow the execution of processes that may not be completely in memory
– Only part of the program needs to be in memory for execution
– Separates the user logical address space (memory) from the physical address space (memory)
– The logical address space can be much larger than the physical address space
– Pages must be allowed to be swapped in and out
4
Does the entire process need to be in memory for execution…?
• Code to handle unusual error conditions
• Arrays, lists, and tables are often allocated more memory than they actually need
• Certain options and features of a program may be used rarely
• Even when the entire program is needed, it may not all be needed at the same time
5
Benefits of executing a program partially in memory…
• A program can be larger than the physical memory
– Programmers no longer need to worry about the amount of physical memory available
• Increased level of multiprogramming
– Increases CPU utilization and throughput, without increasing the response time or turnaround time (really?)
• Less I/O is needed to load or swap each user program into memory
6
A large VM when only a small physical memory is available
[Figure: pages in physical memory are accessed directly; a reference to a page not in physical memory causes a page fault; if physical memory is full, a frame must be swapped out (page replacement).]
7
Virtual Memory
• Implementation
– Demand paging
– Demand segmentation
• Segment-replacement algorithms are more complex because segments have variable sizes
8
10.2 Demand Paging
9
Basic Concepts
• Similar to paging + swapping: processes reside on secondary memory until they execute
• Swapping: move the entire process in/out of memory (swapper)
• Demand paging: never swap a page into memory unless it will be needed (lazy swapper, or pager)
– The pager guesses which pages will be used before the process is swapped out again, and brings only those necessary pages into memory
• Demand paging benefits
– Less I/O needed
– Less memory needed
– Faster response
– More processes
• Page is needed ⇒ reference to it
– Invalid reference ⇒ abort
– Not in memory ⇒ bring into memory
10
Transfer of a paged memory to contiguous disk space
11
Hardware Support for Demand Paging
• With each page-table entry a valid–invalid bit is associated (1 ⇒ legal and in memory; 0 ⇒ not in memory or illegal)
• Initially, the valid–invalid bit is set to 0 on all entries
• Marking a page invalid has no effect if the process never attempts to access that page
• During address translation, if the valid–invalid bit in the page-table entry is 0 ⇒ page-fault trap
– Illegal reference?
– Legal but not in memory?
12
Page table when some pages are not in main memory…
[Figure: page table with valid–invalid bits; an access to an invalid entry is an illegal access.]
The OS puts the process in the backing store when it starts executing.
13
Page-Fault Trap
• The first reference to a page that has not yet been brought into memory traps to the OS ⇒ page fault
• The OS looks at an internal table (kept with the PCB) to decide:
– Invalid reference ⇒ abort
– Just not in memory ⇒ page it in
• Get a free frame from the free-frame list
– If there is no free frame ⇒ page replacement
• Swap the page into the frame (schedule a disk operation)
• Reset the tables (internal and page tables), set valid bit = 1
• Restart the instruction (needs architecture support)
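The steps above can be sketched in code. This is a minimal, hypothetical sketch — `handle_page_fault`, `FREE_FRAMES`, and the dict-based tables are invented for illustration; a real handler runs in the kernel and schedules an actual disk read:

```python
FREE_FRAMES = list(range(8))          # hypothetical free-frame list

def handle_page_fault(page_table, page, backing_store):
    """Sketch of the page-fault steps (simplified: assumes a free frame exists)."""
    entry = page_table.get(page)
    if entry is None:                  # invalid reference -> abort
        raise MemoryError(f"segmentation fault: page {page}")
    if not FREE_FRAMES:                # no free frame -> page replacement
        raise NotImplementedError("no free frame: run page replacement")
    frame = FREE_FRAMES.pop()          # 1. get a free frame
    data = backing_store[page]         # 2. "schedule" the disk read (simulated)
    entry["frame"] = frame             # 3. reset tables,
    entry["valid"] = 1                 #    set valid bit = 1
    return data                        # 4. the instruction can now be restarted

# hypothetical process state: page 5 is legal but not yet in memory
page_table = {5: {"frame": None, "valid": 0}}
backing_store = {5: b"page-5-data"}
data = handle_page_fault(page_table, 5, backing_store)
```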
14
Steps in handling a page fault
15
Restart any instruction after a page fault…
• ADD A, B, C
– Fetch and decode the instruction (ADD)
– Fetch A
– Fetch B
– Add A and B
– Store the sum in C
The key is how to keep the original state for restarting ⇒ needs architecture support
16
What happens if there is no free frame?
• Page replacement – find some page in memory that is not really in use and swap it out
– Which algorithm?
– Performance: we want an algorithm that results in the minimum number of page faults
• The same page may be brought into memory several times: page in ⇒ page out ⇒ page in ⇒ … ⇒ page out
17
More about Demand Paging…
• Pure demand paging
– Never bring a page into memory until it is required
– Start executing a process with no pages in memory
• Locality of reference
– Results in reasonable performance from demand paging
• Hardware support: the same as for paging and swapping
– Page table
– Secondary memory (swap space or backing store) to hold the pages not in main memory; must be fast
18
Performance of Demand Paging
• Page-fault rate p, 0 ≤ p ≤ 1.0
– If p = 0 ⇒ no page faults
– If p = 1 ⇒ every reference is a fault
• Effective access time (EAT)
EAT = (1 – p) × memory access time
+ p × (page-fault trap service time
+ [swap page out]
+ read page in
+ restart overhead)
19
Performance of Demand Paging (Example)
• Memory access time = 100 nanoseconds
• Average page-fault service time = 25 milliseconds
• EAT = (1 – p) × 100 + p × 25,000,000 ns = 100 + 24,999,900 × p
• p = 0.001 ⇒ EAT ≈ 25 microseconds (a 250× slowdown)
• For a 10-percent slowdown (EAT = 110 ns) ⇒ p < 0.0000004
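The numbers above can be checked with a few lines (a sketch; `eat_ns` is an invented helper name):

```python
def eat_ns(p, mem_ns=100, fault_ns=25_000_000):
    """Effective access time in nanoseconds for page-fault probability p."""
    return (1 - p) * mem_ns + p * fault_ns

print(eat_ns(0.001))             # ~25,100 ns, i.e. about 25 microseconds
print(eat_ns(10 / 24_999_900))   # 110 ns: the 10-percent-slowdown threshold
```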
20
10.3 Process Creation
21
Process Creation
• Virtual memory allows other benefits during process creation:– Copy-on-Write– Memory-Mapped Files
22
Copy-on-Write
• Copy-on-Write (COW) allows both parent and child processes to initially share the same pages in memory– Their page tables point to the same frames
• If either process modifies a shared page, only then is the page copied (Copy-on-Write).
• COW allows more efficient process creation as only modified pages are copied.
• Free pages are allocated from a pool of zeroed-out pages.
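The COW mechanism can be illustrated with a toy page table. Everything here — the `AddressSpace` class, frames represented as Python lists — is an invented stand-in for real MMU hardware and kernel bookkeeping:

```python
class AddressSpace:
    """Toy page table: maps page number -> frame (a mutable list)."""
    def __init__(self, frames, cow=None):
        self.frames = frames           # page -> frame
        self.cow = set(cow or ())      # pages currently mapped copy-on-write

    def fork(self):
        """Child shares every frame with the parent; both mark the pages COW."""
        shared = set(self.frames)
        self.cow |= shared
        return AddressSpace(dict(self.frames), shared)

    def write(self, page, offset, value):
        if page in self.cow:           # first write to a shared page:
            self.frames[page] = list(self.frames[page])  # copy the frame
            self.cow.discard(page)     # this mapping is private from now on
        self.frames[page][offset] = value

    def read(self, page, offset):
        return self.frames[page][offset]

parent = AddressSpace({0: [0, 0, 0, 0]})
child = parent.fork()       # parent.frames[0] is child.frames[0]: no copy yet
child.write(0, 0, 7)        # first write triggers the copy
```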
23
Memory-Mapped Files
• Memory-mapped file I/O allows file I/O to be treated as routine memory access by mapping a disk block to a page in memory.
• A file is initially read using demand paging. A page-sized portion of the file is read from the file system into a physical page.
• Subsequent reads/writes to/from the file are treated as ordinary memory accesses.
• Simplifies file access by treating file I/O through memory rather than through read() and write() system calls.
• Also allows several processes to map the same file allowing the pages in memory to be shared.
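Python's standard `mmap` module exposes exactly this facility. The sketch below maps a scratch file (created here just for the demo) and shows that reads and writes are ordinary memory (slice) accesses:

```python
import mmap
import os
import tempfile

# create a small scratch file to map
fd, path = tempfile.mkstemp()
os.write(fd, b"hello, memory-mapped world")
os.close(fd)

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as m:   # length 0 maps the whole file
        assert m[:5] == b"hello"          # a read is an ordinary memory access
        m[:5] = b"HELLO"                  # a write goes to the mapped page

with open(path, "rb") as f:               # the change reached the file
    data = f.read()
os.unlink(path)
```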
24
Memory Mapped Files
25
10.4 Page Replacement
26
Introduction
• Prevent over-allocation of memory by modifying the page-fault service routine to include page replacement
• Use the modify (dirty) bit to reduce the overhead of page transfers – only modified pages are written back to disk
• Page replacement completes the separation between logical memory and physical memory – a large virtual memory can be provided on a smaller physical memory
27
Need for Page Replacement
28
Basic Page Replacement
• Find the location of the desired page on the disk
• Find a free frame
– If there is a free frame, use it
– If there is no free frame, use a page-replacement algorithm to select a victim frame
– Write the victim to the disk; change the page and frame tables accordingly
• Read the desired page into the newly freed frame
• Update the page and frame tables
• Restart the user process
29
Page Replacement
30
Two Major Problems to Implement Demand Paging
• Frame-allocation algorithm
– How many frames to allocate to each process
• Page-replacement algorithm
– Select the frames that are to be replaced
– Want the lowest page-fault rate
– Evaluate an algorithm by running it on a particular string of memory references (a reference string) and computing the number of page faults on that string
• Reference string: the string of memory references
– Example: 0100, 0432, 0101, 0612, 0102, 0103, 0104, 0101, 0611, 0102, 0103, 0104, 0101, 0610, 0102, 0103, 0104, 0101, 0609, 0102, 0105
– With page size = 100, this reduces to: 1, 4, 1, 6, 1, 6, 1, 6, 1, 6, 1
– Reference string used in the following examples: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
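The reduction from raw addresses to a reference string is integer division by the page size, collapsing immediately repeated references to the same page (a sketch; `pages` is an invented helper):

```python
def pages(addresses, page_size=100):
    """Reduce raw addresses to a reference string of page numbers,
    collapsing immediately repeated references to the same page."""
    out = []
    for addr in addresses:
        p = addr // page_size
        if not out or out[-1] != p:
            out.append(p)
    return out

addrs = [100, 432, 101, 612, 102, 103, 104, 101, 611, 102, 103, 104,
         101, 610, 102, 103, 104, 101, 609, 102, 105]
print(pages(addrs))   # the slide's reduced reference string
```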
31
Ideal graph of page faults vs. the number of frames…
32
First-In-First-Out (FIFO) Algorithm
33
FIFO Page Replacement
34
Belady’s Anomaly
The page-fault rate may increase as the number of allocated frames increases
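A short simulation makes the anomaly concrete: on the reference string 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 above, FIFO incurs 9 faults with 3 frames but 10 faults with 4 frames (the `fifo_faults` helper is a sketch):

```python
from collections import deque

def fifo_faults(refs, n_frames):
    """Count page faults for FIFO replacement with n_frames frames."""
    frames, queue, faults = set(), deque(), 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == n_frames:
                frames.discard(queue.popleft())   # evict the oldest page
            frames.add(page)
            queue.append(page)
    return faults

ref = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(ref, 3))   # 9 faults
print(fifo_faults(ref, 4))   # 10 faults: more frames, more faults
```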
35
Optimal Algorithm
• Replace the page that will not be used for the longest period of time
• How do you know this? It requires future knowledge
• Used for comparison studies (like SJF)
36
Optimal Algorithm
37
Least Recently Used (LRU) Algorithm
• Replace the page that has not been used for the longest period of time
– Associate with each page the time of that page’s last use
38
LRU Algorithm
39
LRU Implementation
• Counter (or clock)
– Every page-table entry has a time-of-use field; every time the page is referenced through this entry, copy the clock into the field
– When a page needs to be replaced, look at the counters to select the victim
– Requires a search of the page table to find the LRU page, and a write to memory for every memory access
– Overflow of the clock must be considered
40
LRU Implementation(Cont.)
• Stack implementation – keep a stack of page numbers in a doubly linked list
– When a page is referenced, move it to the top (requires changing up to 6 pointers)
– No search is needed for replacement
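The doubly linked stack maps naturally onto Python's `OrderedDict`, which keeps entries in recency order with O(1) moves; this sketch counts LRU faults on the chapter's reference string (`lru_faults` is an invented helper):

```python
from collections import OrderedDict

def lru_faults(refs, n_frames):
    """Count page faults for LRU replacement with n_frames frames."""
    stack = OrderedDict()   # least recently used at the front, MRU at the end
    faults = 0
    for page in refs:
        if page in stack:
            stack.move_to_end(page)        # "move it to the top"
        else:
            faults += 1
            if len(stack) == n_frames:
                stack.popitem(last=False)  # evict the least recently used
            stack[page] = True
    return faults

ref = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(lru_faults(ref, 3))   # 10 faults
print(lru_faults(ref, 4))   # 8 faults: no Belady's anomaly for LRU
```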
41
Stack Implementation of LRU
42
LRU Approximation Algorithms
• Reference bit
– Associate a bit with each page, initially = 0
– When the page is referenced, the bit is set to 1
– Replace a page whose bit is 0 (if one exists); we do not know the order, however
• Additional-reference-bits
– Record the reference bits at regular intervals, e.g., keep an 8-bit history byte per page
– At each interval, shift the history right and copy the reference bit into the high-order bit
– Replace the page with the lowest number in its history bits (e.g., 01110111 is replaced before 11000100)
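The per-interval aging step is a one-line shift (`age` is an invented name for the update; victim selection is then just the minimum history value):

```python
def age(history, ref_bit, bits=8):
    """Shift the history right and record the reference bit in the high-order position."""
    return (history >> 1) | (ref_bit << (bits - 1))

# one interval in which the page was referenced:
print(bin(age(0b11000100, 1)))        # 0b11100010
# the page with the smaller history value is the LRU-approximate victim:
print(bin(min(0b11000100, 0b01110111)))   # 0b1110111
```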
43
LRU Approximation Algorithms (Cont.)
• Second chance (clock)
– The basic algorithm is FIFO, plus a reference bit
– If the page to be replaced (in FIFO order) has reference bit = 1, then:
• Set the reference bit to 0
• Leave the page in memory (give it a second chance)
• Move to the next page (in FIFO order), subject to the same rules
44
The Clock Policy (Second Chance)
• The set of frames that are candidates for replacement is treated as a circular buffer
• When a page is replaced, a pointer is set to the next frame in the buffer
• A use bit for each frame is set to 1 whenever
– A page is first loaded into the frame
– The corresponding page is referenced
• When it is time to replace a page, the first frame encountered with use bit = 0 is replaced
– During the search for a replacement, each use bit that is set to 1 is changed to 0
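The circular-buffer policy can be sketched as follows (`clock_faults` and its bookkeeping are invented for illustration; the use bit is set on load and on reference, exactly as described above):

```python
def clock_faults(refs, n_frames):
    """Count page faults for the clock (second-chance) policy."""
    frames = [None] * n_frames   # circular buffer of resident pages
    use = [0] * n_frames         # use bit per frame
    where = {}                   # page -> frame index
    hand, faults = 0, 0
    for page in refs:
        if page in where:
            use[where[page]] = 1           # reference sets the use bit
            continue
        faults += 1
        # sweep: clear use bits until a frame with use == 0 (or empty) is found
        while frames[hand] is not None and use[hand] == 1:
            use[hand] = 0
            hand = (hand + 1) % n_frames
        if frames[hand] is not None:
            del where[frames[hand]]        # evict the victim
        frames[hand] = page
        use[hand] = 1                      # use bit set on first load
        where[page] = hand
        hand = (hand + 1) % n_frames       # pointer moves past the new page
    return faults

ref = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(clock_faults(ref, 3))
```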
47
Second-Chance Page-Replacement Algorithm
48
Comparison of Clock with FIFO and LRU
• An asterisk indicates that the corresponding use bit is set to 1
• Clock protects frequently referenced pages by setting the use bit to 1 at each reference
49
Comparison of Clock with FIFO and LRU
• Numerical experiments tend to show that the performance of Clock is close to that of LRU
• Experiments have been performed with a fixed number of frames allocated to each process, with only pages local to the faulting process considered for replacement
– When few (6 to 8) frames are allocated per process, there is almost a factor of 2 in page faults between LRU and FIFO
– This factor shrinks toward 1 when several (more than 12) frames are allocated (but then more main memory is needed to support the same level of multiprogramming)
50
51
LRU Approximation Algorithms (Cont.)
• Enhanced second chance: use the reference bit and the modify bit as an ordered pair; replace the first page encountered in the lowest nonempty class
– 1: (0,0) neither recently used nor modified
– 2: (0,1) not recently used but modified
– 3: (1,0) recently used but not modified
– 4: (1,1) recently used and modified
52
Counting-Based Algorithms
• Keep a counter of the number of references that have been made to each page
– LFU algorithm: replace the page with the smallest count
– MFU algorithm: based on the argument that the page with the smallest count was probably just brought in and has yet to be used
53
10.5 Allocation of Frames
54
Minimum Number of Frames
• Each process needs a minimum number of pages, defined by the computer (instruction-set) architecture
• Example: IBM 370 needs 6 pages to handle the SS MOVE instruction
– The instruction is 6 bytes and might span 2 pages
– 2 pages to handle from
– 2 pages to handle to
• Two major allocation schemes: fixed allocation and priority allocation
55
Fixed Allocation
• Equal allocation – e.g., with 100 frames and 5 processes, give each process 20 frames
• Proportional allocation – allocate according to the size of the process
– s_i = size of process p_i
– S = Σ s_i
– m = total number of frames
– a_i = allocation for p_i = (s_i / S) × m
• Example: m = 64, s_1 = 10, s_2 = 127
– a_1 = (10 / 137) × 64 ≈ 5
– a_2 = (127 / 137) × 64 ≈ 59
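The proportional-allocation formula in code, reproducing the a_1 ≈ 5, a_2 ≈ 59 example (a sketch with simple rounding; a real allocator must also respect each process's architectural minimum):

```python
def proportional_allocation(sizes, m):
    """a_i = (s_i / S) * m, rounded to whole frames."""
    S = sum(sizes)
    return [round(s / S * m) for s in sizes]

print(proportional_allocation([10, 127], 64))   # the slide's example
```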
56
Priority Allocation
• Use a proportional allocation scheme with respect to priorities rather than size
• If process P_i generates a page fault:
– Select for replacement one of its own frames, or
– Select for replacement a frame from a process with a lower priority number
57
Global vs. Local Allocation
• Global replacement – process selects a replacement frame from the set of all frames; one process can take a frame from another.
• Local replacement – each process selects from only its own set of allocated frames.
58
10.6 Thrashing
59
Thrashing
• If a process does not have “enough” pages, the page-fault rate is very high. This leads to:
– Low CPU utilization
– The operating system thinks it needs to increase the degree of multiprogramming
– Another process is added to the system
• Think of an N-iteration loop in which each iteration needs 4 pages while only 3 frames are allocated: the references cycle 1 2 3 4, 1 2 3 4, … and every reference faults
• Thrashing ⇒ a process is busy swapping pages in and out
60
Thrashing Diagram
61
Locality Model
• To prevent thrashing, we must provide a process with as many frames as it needs
• Why does paging work? The locality model
– A locality is a set of pages that are actively used together
– In the previous loop example, the locality consists of 4 pages
– A process migrates from one locality to another
– If we can keep the current localities of all processes in memory ⇒ no thrashing
• Why does thrashing occur? Σ (sizes of localities) > total memory size
62
63
64
Working-Set Model
• Look at how many frames a process is actually using
• Δ ≡ working-set window ≡ a fixed number of page references (example: 10,000 instructions)
• WSS_i (working-set size of process P_i) = total number of pages referenced in the most recent Δ
– If Δ is too small, it will not encompass the entire locality
– If Δ is too large, it will encompass several localities
– If Δ = ∞, it will encompass the entire program
• D = Σ WSS_i ≡ total demand for frames
• If D > m ⇒ thrashing
– Policy: if D > m, suspend one of the processes
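The working set at time t is just the set of pages in the trailing window of Δ references (a sketch; the reference string below is hypothetical):

```python
def wss(refs, t, delta):
    """Working set: pages referenced in the most recent delta references ending at time t."""
    return set(refs[max(0, t - delta + 1): t + 1])

refs = [2, 6, 1, 5, 7, 7, 7, 7, 5, 1]   # hypothetical reference string
print(wss(refs, 9, 4))    # small window: one locality
print(wss(refs, 9, 10))   # large window: spans the whole string
# D = sum of len(wss(...)) over all processes; if D > m, suspend a process
```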
65
Working-Set Model Illustration
66
Keeping Track of Working Set
• Approximate with an interval timer + a reference bit
• Example: Δ = 10,000
– The timer interrupts every 5,000 time units
– Keep 2 history bits in memory for each page
– Whenever the timer interrupts, copy the reference bits into the history bits and reset all reference bits to 0
– If one of the history bits = 1 ⇒ the page is in the working set
• Improvement: 10 history bits and an interrupt every 1,000 time units
67
Page-Fault Frequency Scheme
Use the page-fault frequency to establish an “acceptable” page-fault rate: if the actual rate is too low, the process loses a frame; if the actual rate is too high, the process gains a frame.
68
Process Suspension When Thrashing
• Lowest priority process• Faulting process
– this process does not have its working set in main memory so it will be blocked anyway
• Last process activated– this process is least likely to have its working set resident
69
Process Suspension
• Process with smallest resident set– this process requires the least future effort to reload
• Largest process– obtains the most free frames
• Process with the largest remaining execution window
70
10.8 Other Considerations
71
Pre-paging and Page Size
• Pre-paging
– Attempt to prevent the high level of initial paging
– Keep the working set when a process is suspended, and bring it back in when the process resumes
– Worthwhile only if the cost of pre-paging is less than the cost of servicing the page faults it avoids
• Page-size selection (larger pages are preferred now…)
– Internal fragmentation ⇒ small pages
– Page-table size ⇒ large pages
– I/O overhead ⇒ large pages
– Total I/O ⇒ small pages
– Number of page faults ⇒ large pages
72
TLB Reach
• TLB reach – the amount of memory accessible from the TLB
– TLB reach = (TLB size) × (page size)
• Ideally, the working set of each process is stored in the TLB; otherwise there is a high degree of page faults
• Increasing TLB reach:
– Increase the TLB size
– Increase the page size (may increase fragmentation, since not all applications require a large page size)
– Provide multiple page sizes: applications that require larger page sizes can use them without an increase in fragmentation (needs an OS-managed TLB)
73
Program Structure
• Program structure ⇒ how to improve locality
– Array A[1024, 1024] of integer; each row is stored in one page; one frame available
– Program 1:
for j := 1 to 1024 do
  for i := 1 to 1024 do
    A[i,j] := 0;
⇒ 1024 × 1024 page faults
– Program 2:
for i := 1 to 1024 do
  for j := 1 to 1024 do
    A[i,j] := 0;
⇒ 1024 page faults
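The two loop orders can be compared with a tiny simulation: one row per page and a single frame, so a fault occurs exactly when the row index changes between consecutive accesses (all names here are invented for illustration):

```python
def faults(order, n=1024):
    """Count page faults touching A[n][n] with one row per page, one frame."""
    current, count = None, 0
    for i, j in order(n):
        row = i                  # page number = row index
        if row != current:       # touching a new row faults the single frame
            count += 1
            current = row
    return count

def rowwise(n):   # Program 2: for i ... for j ... A[i,j]
    return ((i, j) for i in range(n) for j in range(n))

def colwise(n):   # Program 1: for j ... for i ... A[i,j]
    return ((i, j) for j in range(n) for i in range(n))

print(faults(rowwise))   # 1024 faults
print(faults(colwise))   # 1,048,576 faults
```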
74
Inverted Page Table
• The IPT no longer contains complete information about the logical address space of a process, yet that information is required to service a page fault
• An external page table (one for each process) must be kept
– It looks like the traditional per-process page table, containing information on where each virtual page is located
– It is referenced only when a page fault occurs ⇒ it does not need to be fast
– It may itself be paged in and out of memory as necessary
75
I/O Interlocks
• What happens if a page waiting for I/O is paged out, and another page is put into the frame that page was using?
• Solution 1: never execute I/O to user memory
– I/O takes place only between system memory and the I/O device
– Extra copies are needed to transfer data between system memory and user memory
• Solution 2: allow pages to be locked into memory
– A locked page cannot be selected for eviction by the page-replacement algorithm
76
Reason Why Frames Used for I/O Must be in Memory
77
Locking A Page to Prevent Replacement
• Kernel memory is usually locked in memory (for performance)
• Prevent replacing a newly brought-in page until it has been used at least once
78
Real-Time Processing
• VM provides the best overall utilization of a computer
• VM increases overall system throughput, but individual processes may suffer from page faults and replacements
• VM is the antithesis of real-time computing
– VM can introduce unexpectedly long delays in the execution of a process while pages are brought into memory
– Real-time systems almost never have virtual memory
• Solaris 2 supports both real-time and time-sharing processes
– A process can tell the OS which pages are important to it (a “hint” on page use, so that important pages are not paged out)
– Privileged users can lock pages in memory