ECE 371 Microprocessors
Chapter 7: Demand-Paged Virtual Memory Management
Herbert G. Mayer, PSU
Status 10/10/2015
For use at CCUT Fall 2015

Syllabus

Introduction
High-Level Steps of Paged VMM
VMM Mapping Steps, Two Levels
History of Paging
Goals and Methods of Paging
An Unrealistic Paging Scheme
Realistic Paging Scheme, 32-bit Architecture
Typical PD and PT Entries
Page Replacement Using FIFO
Appendix
Bibliography


Introduction

With the advent of 64-bit architectures, paged Virtual Memory Management (VMM) experienced a revival in the 1990s

Originally, the force behind VMM systems (paged or segmented or both on Multics [4]) was the need for more addressable, logical memory than was typically available in physical memory

Scarcity of physical memory was then caused by the high cost of memory. The early technology of core memories carried a high price tag due to the tedious manual labor involved

The high cost per byte of memory is gone, but the days of insufficient physical memory have returned, with larger data sets due to 64-bit addresses


Introduction

Paged VMM is based on the idea that some memory areas can be relocated out to disk, while other, re-used areas can be moved back from disk into memory on demand

Disk space abounds, while main memory is more constrained. This relocation in and out, called swapping, can be handled transparently, thus imposing no additional burden on the application programmer

The system must detect situations in which an address references an object that is on disk, and must therefore perform the hidden swap-in automatically and transparently, at the expense of additional time


Introduction

Paged VMM trades speed for address range. The loss in speed is caused by the logical-to-physical mapping and by slow disk accesses: slow disk vs. faster memory access

A typical disk access can be thousands to millions of times more expensive, in number of cycles, than a memory access

However, if virtual memory mapping allows large programs to run, albeit slowly, that previously could not execute due to their high memory demands, then the trade-off is worth the loss in speed

The real trade-off is enabling memory-hungry programs to run slowly vs. not executing at all


Steps of Two-Level VMM Mapping

On 32-Bit Architecture


Paged VMM (figure from Wikipedia [6])


High-Level Steps of Paged VMM

Some instruction references a logical address la

VMM determines whether la maps onto a resident page

If yes, the memory access completes. In a system with a data cache, such an access is fast; see also the special-purpose TLB cache for VMM

If the memory access is a store operation (AKA write), the fact of data modification must be recorded

If the access does not address a resident page: then the missing page is found on disk and made available to memory via swap-in, or else the missing page is created for the first time ever, generally with initialization for system pages and no initialization for user pages; yet even for user pages a frame must be found

Making a new page available requires finding memory space of page-frame size, aligned on a page boundary


High-Level Steps of Paged VMM

If such space can be allocated from unused memory (usually during initial program execution), a page frame is now reserved from that available memory

If no page frame is available, a currently resident page must be swapped out and the freed space reused for the new page

Preferred is swapping out a page of the current process; if needed, a page from another process will be removed

Note the preference for locating unmodified pages, i.e. pages with no writes: no swap-out needed!

Should the page to be replaced be dirty, it must first be written to disk; otherwise a copy already exists on disk and the costly swap-out operation can be skipped


VMM Mapping Steps, Two Levels

Some instruction references a logical address (la)

If the processor finds the datum in the L1 data cache, the operation is complete

Else the work for VMM starts

In our paged VMM discussions we ignore data caches, but do include a special-purpose cache known as the TLB

The processor finds the start address of the Page Directory (PD)

The logical address is partitioned into three bit fields: Page Directory Index, Page Table (PT) Index, and user-page Page Offset

The PD finds the PT, the PT finds the user Page Frame (actual page address)

Typical entries are 32 bits long, i.e. 4 bytes on a byte-addressable architecture

The entry in the PD is found by adding the PD Index, left-shifted by 2, to the start address of the PD


VMM Mapping Steps, Two Levels

A PD entry yields a PT address, stored without trailing zeros

The PT Index, left-shifted by 2, is then added to the previously found PT address; this yields a Page Address

Add the Page Offset to the previously found Page Address; this yields the byte address

Along the way there may have been 3 page faults, with swap-outs and swap-ins

During book-keeping (e.g. finding clean pages, locating the LRU page, etc.) many more memory accesses can result
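The two-level lookup above can be sketched as follows; this is a minimal illustration, not a real MMU. Plain dicts stand in for the base-plus-shifted-index arithmetic, and all names and the sample mapping are hypothetical:

```python
# Two-level address translation sketch: 10-bit PD index, 10-bit PT index,
# 12-bit page offset, as on the 32-bit scheme discussed in this chapter.
def translate(la, page_directory, page_tables):
    pd_index = (la >> 22) & 0x3FF        # top 10 bits of the logical address
    pt_index = (la >> 12) & 0x3FF        # middle 10 bits
    offset   = la & 0xFFF                # low 12 bits: byte within the page
    # In hardware, each entry sits at base + (index << 2), since entries
    # are 4 bytes; the dict lookups below stand in for that arithmetic.
    pt_base   = page_directory[pd_index]        # PD entry -> page table
    page_base = page_tables[pt_base][pt_index]  # PT entry -> user page frame
    return page_base + offset                   # final physical byte address

# Toy mapping: logical 0x00403ABC has PD index 1, PT index 3, offset 0xABC
pd  = {1: "PT_A"}
pts = {"PT_A": {3: 0x70000000}}
assert translate(0x00403ABC, pd, pts) == 0x70000ABC
```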


History of Paging

Developed in the 1960s at the University of Manchester for the Atlas computer

Used commercially in the KDF-9 computer [5] of English Electric Co.

In fact, the KDF-9 was one of the major architectural milestones in computer history, aside from von Neumann's and Atanasoff's earlier machines

The KDF9 incorporated the first cache and VMM, had a hardware display for stack frame addresses, etc.

The HW display was necessary and helpful, since the assumed programming language was Algol-60 with nested scopes, different from today's C++


History of Paging

In the late 1960s to early 1980s, memory was highly expensive; processors were expensive and getting faster

Programs grew larger

Insufficient memories were common and became one aspect of the growing Software Crisis from the late 1970s onward

16-bit minicomputers and 32-bit mainframes became common; also 18-bit address architectures (CDC and Cyber) with 60-bit words were used


History of Paging

Paging grew increasingly popular: fast execution was traded for a larger address range

By the mid 1980s, memories became cheaper and faster

By the late 1980s, memories had become cheaper still; yet the address range remained 32-bit, and large physical memories became possible and available

Supercomputers were designed in the 1980s whose OS provided no virtual memory at all

E.g. Cray systems were built and marketed without VMM

The Intel Hypercube NX® operating system had no virtual memory management


History of Paging

Just when VMM was falling into disfavor, the addressing limitation of 32 bits started constraining programs

In the 1990s, 64-bit architectures became commonplace rather than an exception

Intermediate steps between evolving architectural generations:

The Harris 3-byte system with 24-bit addresses, not making the jump to 32 bits

The Pentium Pro® with 36 bits in extended addressing mode, not at all making the jump to 64-bit addresses

Early Itanium® family processors had 44 physical address bits, not quite growing to 64 bits


History of Paging

By the 1990s, 64-bit addresses were common, and 64-bit integer arithmetic was generally performed in hardware, no longer via slow library extensions

Pentium Pro has 4 kB pages by default, 4 MB pages if page size extension (pse) bit is set in normal addressing mode

However, if the physical address extension (pae) bit and pse bit are set, the default page size changes from 4 kB to 2 MB
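Each of these page sizes implies a different page-offset width: the offset needs log2(page size) bits. A quick sanity check for the three sizes named above:

```python
# Offset width in bits implied by a power-of-two page size: log2(page_size).
def offset_bits(page_size):
    return page_size.bit_length() - 1   # exact log2 for powers of two

assert offset_bits(4 * 1024) == 12          # 4 kB default pages
assert offset_bits(4 * 1024 * 1024) == 22   # 4 MB pages with pse
assert offset_bits(2 * 1024 * 1024) == 21   # 2 MB pages with pae + pse
```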


Goals, Methods of Paging

Goal: to make the full logical (virtual) address space available, even if a smaller physical memory is installed

Perform the mapping transparently. Thus, if at a later time the target receives a larger physical memory, the same program will run unchanged, except faster

Note that overlays were managed explicitly by the programmer

Map logical onto physical addresses, even at a cost in performance; a special-purpose cache (TLB) helps

Implement the mapping in a way that the overhead is small in relation to the total program execution


Goals, Methods of Paging

A necessary requirement is that the working set fit into the available page frames; else thrashing happens

Moreover, this is accomplished by caching page directory and page table entries

Or by placing the complete page directory into a special cache


Unrealistic Paging Scheme 32-Bit Arch.

On a 32-bit architecture: byte-addressable, 4-byte words, 4 kB page size

Single-level paging mechanism; later we'll cover multi-level mapping

With a 4 kB page, the rightmost 12 bits of each page address are all 0, i.e. implied

The Page Table thus has 2^20 entries, each entry typically 4 bytes, due to the number of remaining address bits: 32 - 12 = 20

A page offset of 12 bits identifies each byte within a 4 kB page


Unrealistic Paging Scheme 32-Bit Arch.

[Figure: Logical Address = | Page Address (20) | Page Offset (12) |]


Unrealistic Paging Scheme 32-Bit Arch.

With each Page Table entry consuming 4 bytes, this results in a page table of 4 MB

This is a miserable paging scheme, since the scarce resource, physical memory, already carries a 4 MB overhead

Note that Page Tables should be resident as long as some of their entries point to resident user pages

Thus the overhead may exceed the totally available resource, which is memory!

Worse: most entries are empty; their associated pages do not exist yet; the pointers are null!
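The 4 MB figure follows directly from the numbers above, as this small check confirms:

```python
# Single-level page table overhead on a 32-bit machine:
# 2**20 entries of 4 bytes each occupy 4 MB, one full table per process.
PAGE_SIZE    = 4 * 1024     # 4 kB pages
ENTRY_SIZE   = 4            # 4-byte page table entries
ADDRESS_BITS = 32

offset_bits = 12                              # log2(PAGE_SIZE)
entries = 2 ** (ADDRESS_BITS - offset_bits)   # 2**20 = 1,048,576 entries
table_bytes = entries * ENTRY_SIZE            # the full table: 4 MB

assert entries == 1 << 20
assert table_bytes == 4 * 1024 * 1024
```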


Unrealistic Paging Scheme 32-Bit Arch.

[Figure: a Logical Address (Page Address (20) | Page Offset (12)) indexes the 4 MB Page Table, whose entries point to 4 kB User Pages]


Unrealistic Paging Scheme 32-Bit Arch.

Problem: the one data structure that should be resident is too large, consuming much or most of the physical memory that is so scarce, or was so scarce in the 1970s, when VMM was productized

So: break the table overhead into smaller units!

Disadvantage of the additional mapping level: more memory accesses

Advantage: avoiding the large 4 MB Page Table

We'll forget this paging scheme on 32-bit architectures

On contemporary 64-bit architectures, multi-MB page sizes are common again


Realistic Paging Scheme 32-Bit Arch.

Next try: assume a 32-bit architecture, byte-addressable, 4-byte words, 4 kB page size

This time use two-level paging mechanism, consisting of Page Directory and Page Tables in addition to user pages

With a 4 kB page size, again the rightmost 12 bits of any page address are all 0, i.e. implied

All user pages, page tables, and the page directory can fit into identically sized page frames!


Realistic Paging Scheme 32-Bit Arch.

Mechanism to determine a physical address:

Start with a HW register pdbr that points to the Page Directory

Or else implement the Page Directory as a special-purpose cache

Or else place the PD into an a-priori known memory location

But there must be a way to locate the PD, ideally without a memory access


Realistic Paging Scheme 32-Bit Arch.

Intel x86 pdbr, AKA CR3; note that the Intel term for a logical address is linear address


Realistic Paging Scheme 32-Bit Arch.

Design principle: have user pages, Page Tables, and the Page Directory look similar; for example, all should consume one integral page frame each

Every logical address is broken into three parts:

1. two indices, generally 10 bits each on a 32-bit architecture with 4 kB page frames

2. one offset, generally 12 bits, to locate any byte within a 4 kB page

3. the total adds up to 32 bits

The Page Directory Index is indeed an index; to find the actual entry in the PD, first shift it left by 2 (i.e. multiply by 4)

Then add this number to the start address of the PD; thus a PD entry can be found


Realistic Paging Scheme 32-Bit Arch

Similarly, the Page Table Index is an index; to find the entry in a page table, left-shift it by two (<< 2) and add the result to the start address of the PT found in the previous step; thus an entry in the PT is found

The PT entry holds the address of the user page, yet the rightmost 12 bits are implied, all 0, and hence need not be stored in the page table

The rightmost 12 bits of the logical address define the page offset within the found user page

Since pages are 4 kB in size, 12 bits suffice to identify any byte in a page

Given that the address of the user page is found, add the offset to that address; the final byte address (physical address) is thus identified


Realistic Paging Scheme 32-Bit Arch

[Figure: pdbr points to the Page Directory; the Logical Address splits into PD Index | PT Index | Page Offset; Page Directory entries point to Page Tables, Page Table entries point to User Pages]


Realistic Paging Scheme 32-Bit Arch

Disadvantage: multiple memory accesses, up to 3 in total; can be worse if the page-replacement algorithm also has to search through memory

In fact, it can become far worse, since any of these 3 accesses could cause a page fault, resulting in 3 swap-ins, with disk accesses and many memory references; it is a good design principle never to swap out the page directory, and rarely, if ever, the page tables!

Some of these 3 accesses may also cause a swap-out, if a page frame has to be freed, making matters even worse

The performance loss could be tremendous, i.e. several decimal orders of magnitude slower than a single memory access


Realistic Paging Scheme 32-Bit Arch

Thus some of these VMM-related data structures should be cached

In all higher-performance architectures, e.g. the Intel Pentium Pro®, a Translation Look-Aside Buffer (TLB) serves as a special-purpose cache for PD and PT entries

It is also possible to cache the complete PD, since it is contained in size (4 kB)

Conclusion: VMM via paging works only if locality is good; put another way, paged VMM works well only if the page frames can hold the working set; else there will be thrashing


Typical PD and PT Entries

A fundamental assumption (design requirement) is that all PD and PT entries be 4 bytes, i.e. 32 bits, long

Entries in PD and PT need 20 bits for the address; the lower 12 bits are implied zeros: no need to store them!

The left-over 12 bits (beyond the 20) in any PD or PT entry can be used for additional, crucial information, for example:

P-Bit, AKA present bit, indicating: is the referenced page present (AKA resident)?

Modified bit, AKA dirty bit: has the page experienced a store?

R/W/E bits: can the referenced page be read, written, executed, or all?

User/Supervisor bit: OS dependent; is the page reserved for privileged code?
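The overlay of a 20-bit address with 12 flag bits can be sketched as follows; the bit positions are loosely modeled on the x86 page-table entry, and the helper names are ours, not from the slides:

```python
# Hypothetical 32-bit page-table entry: bits 31..12 hold the frame address,
# the low 12 bits hold flags (positions loosely follow x86).
PRESENT  = 1 << 0   # P-bit: page is resident
WRITABLE = 1 << 1   # R/W bit
USER     = 1 << 2   # User/Supervisor bit
ACCESSED = 1 << 5
DIRTY    = 1 << 6   # set on any store to the page

def make_entry(frame_addr, flags):
    # The frame address is 4 kB-aligned, so its low 12 bits are zero
    # and can be overlaid with the flag bits.
    assert frame_addr & 0xFFF == 0, "frame address must be page-aligned"
    return frame_addr | flags

def frame_of(entry):
    return entry & ~0xFFF      # mask off the 12 flag bits

entry = make_entry(0x1234 << 12, PRESENT | WRITABLE | DIRTY)
assert entry & PRESENT and entry & DIRTY
assert frame_of(entry) == 0x1234000
```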


Typical PD and PT Entries

Typically the P-Bit is positioned for quick, easy access: the rightmost bit in the 4-byte word

Some information is unique to the PD or the PT

For example, a user page may be shared (global bit set in the PT), but a PT may never be shared between multiple processes; hence no global bit is needed in the PD

On systems with varying page sizes, page-size information must also be recorded

See the examples below:


Typical PD and PT Entries

Page Directory Entry (fields):

Page Table Address (20 bits)

Unused bits

Accessed Bit, User/Supervisor, Read/Write, Present Bit


Typical PD and PT Entries

If the Operating System exercises a policy of never swapping out Page Tables, then the respective PT entries in the PD do not need to track whether a PT was modified (i.e. no Dirty bit)

Hence there may be reasons that the data structures for PT and PD entries exhibit minor differences

But in general, the formats and structures of PT and PD entries are very similar, and the sizes of user pages, PTs, and the PD are identical


Typical PD and PT Entries

Page Table Entry (fields):

User Page Address (20 bits)

Dirty Bit, Accessed Bit, User/Supervisor, Read/Write, Present Bit


Page Replacement Using FIFO

When a page needs to be swapped in or created for the first time, it is placed into an empty page frame

How does the VMM find an empty page frame?

If no free frame is available, an existing page must be removed from its page frame. This page is referred to as the victim page

The victim page may even be pirated from another process's set of page frames; but this is an OS decision; the user generally has no such control!

If the replaced page, the victim page, was modified, it must be swapped out before its frame can be overwritten, to preserve the recent changes

The VMM gives priority to locating a victim page that is not dirty


Page Replacement Using FIFO

Otherwise, if the page is unmodified, it can be overwritten without swap-out, as an exact copy already exists on mass storage; yet it must be marked as "not present" in its PT entry

The policy used to identify a victim page is called the replacement policy; the algorithm is named the replacement algorithm

Typical replacement policies are FIFO, random, and LRU; similar to replacing cache lines in a data cache

When storing the age, it is sufficient to record relative ages, not necessarily the exact time of creation or last reference! Again, as in a data cache line


Page Replacement Using FIFO

For example, LRU information may be recorded implicitly by linking user pages in a linked list, removing the victim at the head and adding a new page at the tail

This MAY cause many memory references, making such a method of repeated lookups in memory impractical!
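The linked-list idea can be sketched in a few lines using Python's OrderedDict, which preserves insertion order: the least-recently-used page sits at the head, the most recently used at the tail. The class and method names here are illustrative, not from the slides:

```python
# LRU page replacement via an ordered structure: evict at the head,
# re-insert referenced pages at the tail.
from collections import OrderedDict

class LRUFrames:
    def __init__(self, nframes):
        self.nframes = nframes
        self.resident = OrderedDict()   # page number -> (contents, ignored here)

    def touch(self, page):
        """Reference a page; return True if the reference page-faults."""
        if page in self.resident:
            self.resident.move_to_end(page)    # most recently used -> tail
            return False
        if len(self.resident) == self.nframes:
            self.resident.popitem(last=False)  # evict LRU victim at head
        self.resident[page] = None
        return True

frames = LRUFrames(3)
faults = sum(frames.touch(p) for p in [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5])
```

Note that every reference touches the structure, which is exactly the cost the slide warns about when the list lives in memory.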


FIFO Sample, Track Page Faults

In the two examples below, the numbers refer to 5 distinct user pages that shall be accessed. We use the following reference string of these 5 user pages, stolen with pride from [7]:

1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5

We'll track the number of page faults and hits caused by an assumed FIFO replacement algorithm

Strict FIFO age tracking requires an access to page table entries for each page reference! See later some significant, even crude, yet effective simplifications in Unix!


Page Replacement Using FIFO

Sample 1: In this first example, we use 3 physical page frames for placing 5 logical pages.

Track the number of page faults, given 3 frames!

Clearly, when a page is referenced for the first time, it does not yet exist, so the memory access causes a page fault automatically

Successive snapshots of which user page is placed into which page frame:

Page frame 0:  1  4  5  5
Page frame 1:  2  1  1  3
Page frame 2:  3  2  2  4


Page Replacement Using FIFO

We observe a total of 9 page faults for the 12 page references in Sample 1

Would it not be desirable to have a smaller number of page faults?

With fewer faults, performance will improve, since less swap-in and swap-out activity would be required

Generally, adding computing resources improves SW performance


Page Replacement Using FIFO

Sample 2: Run the same page-reference example with 4 page frames now:

1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5

Successive snapshots of which user page is placed into which page frame:

Page frame 0:  1  1  5  4
Page frame 1:  2  2  1  5
Page frame 2:  3  3  2  2
Page frame 3:  4  4  3  3


Page Replacement Using FIFO

A strange thing happened: Now the total number of page faults has

increased to 10, despite having had more resources: We now have one additional page frame!

Phenomenon is known as the infamous Belady Anomaly
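Both samples can be reproduced with a few lines of simulation; a minimal sketch, not the bookkeeping a real VMM would perform:

```python
# FIFO page-replacement simulation: count faults for a reference string.
from collections import deque

def fifo_faults(refs, nframes):
    frames = deque()            # oldest resident page at the left
    faults = 0
    for page in refs:
        if page not in frames:  # page fault: page is not resident
            faults += 1
            if len(frames) == nframes:
                frames.popleft()        # evict the oldest page (FIFO)
            frames.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
assert fifo_faults(refs, 3) == 9    # Sample 1: 9 faults with 3 frames
assert fifo_faults(refs, 4) == 10   # Sample 2: 10 faults with 4 frames
```

The two assertions exhibit the Belady Anomaly directly: adding a frame increases the fault count for this reference string.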


Appendix: Some Definitions


Definitions

Alignment

Attribute of some memory address A, stating that A must be evenly divisible by some power of two

For example, word-aligned on a 4-byte, 32-bit architecture means an address is divisible by 4, or the rightmost 2 address bits are both 0
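The word-aligned case can be checked with one masking expression, as this small sketch shows:

```python
# Word alignment on a 32-bit, byte-addressable machine: an address is
# 4-byte aligned exactly when its two lowest bits are zero.
def is_word_aligned(addr, word_size=4):
    return addr & (word_size - 1) == 0   # same as addr % word_size == 0

assert is_word_aligned(0x1000)
assert not is_word_aligned(0x1002)
```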


Definitions

Demand Paging

Policy that allocates a page frame in physical memory only if an address on that page is actually referenced (demanded) in the executing program


Definitions

Dirty Bit

Single-bit data structure that tells whether the associated page was written after its last swap-in (or creation)


Definitions

Global Page

A page that is used in more than one program

Typically found in a multi-programming environment with shared pages


Definitions

Logical Address

Address as defined by the architecture; the address as seen by the programmer or compiler

Synonyms: virtual address, and on Intel architecture: linear address

Antonym: physical address


Definitions

Overlay

Before the advent of VMM, the programmer had to manually relocate information out, to free memory for the next data

This reuse of the same data area was called: to overlay memory


Definitions

Page

A portion of logical addressing space of a particular size and alignment

The start address of a page is an integral multiple of the page size; thus it also is page-size aligned

A logical page is placed into a physical page frame

Antonym: Segment


Definitions

Page Directory

A page holding a list of addresses of Page Tables

Typically this directory consumes an integral number of pages as well, ideally exactly one page

In addition to page table addresses, each entry also contains information about presence, access rights, written-to or not, global, etc., similar to Page Table entries


Definitions

Page Directory Base Register (pdbr)

HW resource (typically a machine register) that holds the address of the Page Directory page

Due to alignment constraints, it may be shorter than 32 bits on a 32-bit architecture

Bits are saved due to the defined alignment constraint

See Intel x86 CR3, AKA pdbr: https://en.wikipedia.org/wiki/Control_register


Definitions

Page Fault

A logical address references a page that is not resident

Consequently, space must be found for the referenced page, and that page must either be created or else swapped in if it has existed before

The page will be placed into a page frame


Page Frame
A portion of physical memory that is aligned and fixed in size, able to hold one page; it does not always hold a page of info.
It starts at a boundary evenly divisible by the page size.
Total physical memory should be an integral multiple of the size of page frames; else we see fragmentation.
Logical memory is an integral multiple of the page size by definition.
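The alignment and divisibility rules above are cheap to check in code. This sketch assumes 4 KB pages, as on 32-bit x86; with a power-of-two page size the divisibility test reduces to a bit mask:

```c
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumed 4 KB pages, as on 32-bit x86 */

/* A frame base address is valid iff it is evenly divisible by the
 * page size; for a power-of-two page size, mask the low bits. */
static int is_frame_aligned(uint32_t addr) {
    return (addr & (PAGE_SIZE - 1)) == 0;
}

/* Number of whole frames that fit in 'bytes' of physical memory;
 * any remainder is unusable, i.e. the fragmentation noted above. */
static uint32_t frame_count(uint32_t bytes) {
    return bytes / PAGE_SIZE;
}
```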


Page Table Base Register (ptbr)
Rare! A resource (typically a machine register) that holds the address of the Page Table.
Used in a single-level paging scheme; in a dual-level scheme the pdbr is used instead.


Page Table
A list of addresses of user Pages. Typically each page table consumes an integral number of pages, e.g. 1.
In addition to page addresses, each entry also contains information about presence, access rights, dirty, global, etc., similar to Page Directory entries.


Physical Memory
Main memory actually available physically on a processor. Antonym or related term: logical memory.
Historically, small physical memories were a driving force behind the development of paged VMM.
Since the advent of 64-bit computing, the driving force for VMM is instead the impracticality of making all addressable memory physically available.


Present Bit
Single-bit HW data structure that tells whether the associated page is present or not.
If not present, the page may be swapped out onto disk, or perhaps must be created for the first time.
Synonym: resident.


Resident (adj.)
Attribute of a page (or of any memory object) referenced by executing code: if the object is physically in memory, it is resident.
Else, if not in memory, it is non-resident.


Swap-In
Transfer of a page of information from secondary storage to primary storage, fitting into a page frame in memory.
A physical move from disk to physical memory.
A necessary requirement is that a page frame exist.
Antonym: Swap-Out.


Swap-Out
Transfer of a page of information from primary to secondary storage; from physical memory to disk.
Antonym: Swap-In.
Caused by the need to evict a page. Eviction is generally caused by a process exceeding its limited budget of page frames and asking for a new frame.


Thrashing
Excessive amount of swapping. When this happens, performance is severely degraded.
This is an indicator of the working set being too small.
The cause can be a memory-greedy process or an inappropriate grant of page frames defined by OS fiat!


Translation Look-Aside Buffer
Special-purpose cache for storing recently used Page Directory and Page Table entries.
Physically very small, yet effective. Lovingly referred to as the tlb.
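A toy sketch can convey the idea: a tiny direct-mapped cache from virtual page number to physical frame number. The sizes and fields are invented for illustration; real TLBs are usually set-associative and are flushed when the pdbr (CR3) changes:

```c
#include <stdint.h>

/* Toy direct-mapped TLB caching virtual-page -> physical-frame
 * translations; sizes and field names are illustrative only. */
#define TLB_ENTRIES 16

typedef struct { uint32_t vpn; uint32_t pfn; int valid; } tlb_entry_t;
static tlb_entry_t tlb[TLB_ENTRIES];

/* Return 1 and the cached frame number on a hit, 0 on a miss
 * (in which case the page tables must be walked in memory). */
static int tlb_lookup(uint32_t vpn, uint32_t *pfn) {
    tlb_entry_t *e = &tlb[vpn % TLB_ENTRIES];
    if (e->valid && e->vpn == vpn) { *pfn = e->pfn; return 1; }
    return 0;
}

/* Cache a translation after a successful page-table walk. */
static void tlb_fill(uint32_t vpn, uint32_t pfn) {
    tlb_entry_t *e = &tlb[vpn % TLB_ENTRIES];
    e->vpn = vpn; e->pfn = pfn; e->valid = 1;
}
```

The payoff is that a hit avoids the two extra memory reads of a dual-level page-table walk, which is why so small a structure is so effective.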


Virtual Contiguity
Memory management policy that separates physical from logical memory.
In particular, virtual contiguity creates the impression that two logical addresses n and n+1 are adjacent to one another in logical memory, while in reality they may be an arbitrary number of physical locations apart from one another.
Usually, they are an integral number (positive or negative) of page-size bytes apart, plus 1.
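A small sketch makes this concrete: with a hypothetical 4-entry page table whose frame numbers are arbitrary, virtual addresses 4095 and 4096 are adjacent in the logical view yet land in distant physical frames:

```c
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Hypothetical mapping: virtual pages 0..3 live in scattered
 * physical frames (frame numbers chosen arbitrarily). */
static const uint32_t frame_of_vpage[4] = { 7, 2, 9, 0 };

/* Translate a virtual address in pages 0..3 to its physical address. */
static uint32_t translate(uint32_t va) {
    uint32_t vpn = va / PAGE_SIZE, off = va % PAGE_SIZE;
    return frame_of_vpage[vpn] * PAGE_SIZE + off;
}
```

Here translate(4095) falls at the end of frame 7 while translate(4096) starts frame 2, so the two physically distant bytes are exactly the "n and n+1" pair of the definition.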


Virtual Memory
Memory management policy that separates physical from logical memory.
In particular, virtual memory can create the impression that a larger amount of memory is addressable than is really available on the target.
It can also create the impression that two logical addresses A and A+1 are adjacent to one another, AKA contiguous.


Working Set
The Working Set of page frames is the number of allocated physical frames needed to guarantee that a program executes without thrashing.
The working set is unique to any piece of software; moreover, it can vary with the input data for a particular execution of that piece of SW.
The OS is responsible for tracking the number of page faults and, if needed, for increasing the working set.
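The OS policy just described can be sketched as a simple feedback rule. The threshold and growth step below are invented for illustration; a real OS tunes these per workload:

```c
/* Hedged sketch of working-set adjustment: the OS samples a process's
 * page-fault count per interval and grows its frame budget (working
 * set) when the fault rate signals thrashing. Constants are invented. */
#define HIGH_FAULT_RATE 10u  /* faults per interval implying thrashing */
#define GROW_STEP        4u  /* extra frames granted when thrashing    */

static unsigned adjust_working_set(unsigned frames,
                                   unsigned faults_in_interval) {
    if (faults_in_interval > HIGH_FAULT_RATE)
        return frames + GROW_STEP; /* enlarge working set            */
    return frames;                 /* fault rate acceptable: keep it */
}
```

This is the spirit of Denning's working-set model cited in the bibliography: allocate enough frames that the fault rate stays tolerable.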


Bibliography

1. Denning, Peter: 1968, "The Working Set Model for Program Behavior", ACM Symposium on Operating Systems Principles, Vol. 11, Number 5, May 1968, pp. 323-333
2. Organick, Elliott I. (1972). The Multics System: An Examination of Its Structure. MIT Press
3. Alex Nichol, Virtual Memory in Windows XP: http://www.aumha.org/win5/a/xpvm.php
4. Multics history: http://www.multicians.org/history.html
5. KDF-9 development history: http://archive.computerhistory.org/resources/text/English_Electric/EnglishElectric.KDF9.1961.102641284.pdf
6. Wiki paged VMM: http://www.cs.nott.ac.uk/~gxk/courses/g53ops/Memory%20Management/MM10-paging.html
7. Silberschatz, Abraham and Peter Baer Galvin: 1998, "Operating Systems Concepts", Addison Wesley, 5th edition
8. Intel x86 control registers, including one with pdbr function, known as CR3: https://en.wikipedia.org/wiki/Control_register