
  • Chapter 4: Memory Management

    Part 1: Mechanisms for Managing Memory

    CS 1550, cs.pitt.edu (slides originally modified by Ethan L. Miller and Scott A. Brandt)

  • Memory management

    - Basic memory management
    - Swapping
    - Virtual memory
    - Page replacement algorithms
    - Modeling page replacement algorithms
    - Design issues for paging systems
    - Implementation issues
    - Segmentation

  • In an ideal world…

    - The ideal world has memory that is:
      - Very large
      - Very fast
      - Non-volatile (doesn't go away when power is turned off)
    - The real world has memory that is:
      - Very large
      - Very fast
      - Affordable!
      => Pick any two…
    - Memory management goal: make the real world look as much like the ideal world as possible

  • Memory hierarchy

    - What is the memory hierarchy?
      - Different levels of memory
      - Some are small & fast; others are large & slow
    - What levels are usually included?
      - Cache: small amount of fast, expensive memory
        - L1 (level 1) cache: usually on the CPU chip
        - L2 & L3 cache: off-chip, made of SRAM
      - Main memory: medium-speed, medium-price memory (DRAM)
      - Disk: many gigabytes of slow, cheap, non-volatile storage
    - The memory manager handles the memory hierarchy

  • Basic memory management

    - Components include:
      - Operating system (perhaps with device drivers)
      - A single process
    - Goal: lay these out in memory
      - Memory protection may not be an issue (only one program)
      - Flexibility may still be useful (allow OS changes, etc.)
    - No swapping or paging

    [Figure: three single-process layouts: (a) OS in RAM at address 0 with the user program above it; (b) user program in RAM with the OS in ROM at the top of memory (0xFFFF); (c) OS in RAM at the bottom, user program in the middle, device drivers in ROM at the top.]

  • Fixed partitions: multiple programs

    - Fixed memory partitions
      - Divide memory into fixed spaces
      - Assign a process to a space when it's free
    - Mechanisms
      - Separate input queue for each partition
      - Single input queue: better ability to optimize CPU usage

    [Figure: memory divided at 0, 100K, 500K, 600K, 700K, and 900K into an OS region and Partitions 1–4; shown once with a separate input queue per partition and once with a single shared input queue.]

  • How many programs is enough?

    - Several memory partitions (fixed or variable size)
    - Lots of processes wanting to use the CPU
    - Tradeoff:
      - More processes utilize the CPU better
      - Fewer processes use less memory (cheaper!)
    - How many processes do we need to keep the CPU fully utilized?
      - This will help determine how much memory we need
      - Is this still relevant with memory costing less than $1/GB?

  • Modeling multiprogramming

    - More I/O wait means less processor utilization (modeled in the sketch after the figure below)
      - At 20% I/O wait, 3–4 processes fully utilize the CPU
      - At 80% I/O wait, even 10 processes aren't enough
      - This means that the OS should have more processes if they're I/O bound
    - More processes => memory management & protection become more important!

    [Figure: CPU utilization (0–1.0) vs. degree of multiprogramming (0–10), with curves for 80%, 50%, and 20% I/O wait.]
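    These curves follow the standard independent-wait model: if each of n processes spends a fraction p of its time waiting on I/O, the CPU idles only when all n wait at once, so utilization = 1 - p^n. A minimal C sketch (not from the slides) that reproduces the three curves:

        #include <math.h>
        #include <stdio.h>

        /* Utilization = 1 - p^n: the CPU is idle only when all n
         * processes are waiting on I/O simultaneously. */
        int main(void) {
            double waits[] = {0.80, 0.50, 0.20};
            for (int w = 0; w < 3; w++)
                for (int n = 1; n <= 10; n++)
                    printf("I/O wait %.0f%%, %2d processes: utilization %.2f\n",
                           waits[w] * 100, n, 1.0 - pow(waits[w], n));
            return 0;
        }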

  • Multiprogrammed system performance

    - Arrival and work requirements of 4 jobs
    - CPU utilization for 1–4 jobs with 80% I/O wait
    - Sequence of events as jobs arrive and finish
      - Numbers show the amount of CPU time jobs get in each interval
      - More processes => better utilization, less time per process

    Job   Arrival time   CPU needed
    1     10:00          4
    2     10:10          3
    3     10:15          2
    4     10:20          2

    # of jobs      1      2      3      4
    CPU idle       0.80   0.64   0.51   0.41
    CPU busy       0.20   0.36   0.49   0.59
    CPU/process    0.20   0.18   0.16   0.15

    [Figure: timeline of jobs 1–4 sharing the CPU, with events at t = 0, 10, 15, 20, 22, 27.6, 28.2, and 31.7 minutes.]

  • Memory and multiprogramming

    - Memory needs two things for multiprogramming:
      - Relocation
      - Protection
    - The OS cannot be certain where a program will be loaded in memory
      - Variables and procedures can't use absolute locations in memory
      - There are several ways to guarantee this
    - The OS must keep processes' memory separate
      - Protect a process's memory from being read or modified by other processes
      - Protect a process from modifying its own memory in undesirable ways (such as writing to program code)

  • Base and limit registers

    - Special CPU registers: base & limit
      - Access to the registers is limited to system mode
      - Base: start of the process's memory partition
      - Limit: length of the process's memory partition
    - Address generation (sketched in code after the figure below)
      - Physical address: location in actual memory
      - Logical address: location from the process's point of view
      - Physical address = base + logical address
      - Logical address larger than limit => error

    [Figure: the process partition sits in physical memory between the OS (at 0) and the top of memory (0xFFFF), with Base = 0x9000 and Limit = 0x2000. Logical address 0x1204 maps to physical address 0x1204 + 0x9000 = 0xA204.]
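    A minimal sketch of this check-and-add translation in C, using the register values from the figure (the fault path is just a placeholder):

        #include <stdint.h>
        #include <stdio.h>

        /* Register values from the figure; a real MMU holds these
         * in privileged hardware registers. */
        static uint32_t base = 0x9000, limit = 0x2000;

        /* Returns 0 and fills *physical, or -1 on a limit fault. */
        int translate(uint32_t logical, uint32_t *physical) {
            if (logical >= limit)
                return -1;                  /* outside the partition: error */
            *physical = base + logical;     /* relocate by partition start */
            return 0;
        }

        int main(void) {
            uint32_t phys;
            if (translate(0x1204, &phys) == 0)
                printf("logical 0x1204 -> physical 0x%x\n", phys);  /* 0xa204 */
            return 0;
        }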

  • Swapping

    - Memory allocation changes as:
      - Processes come into memory
      - Processes leave memory (swapped to disk, or complete execution)
    - Gray regions are unused memory

    [Figure: seven snapshots of memory as processes A, B, C, and D are swapped in and out above the OS, leaving gray holes of unused memory.]

  • Swapping: leaving room to grow

    - Need to allow for programs to grow
      - Allocate more memory for data
      - Larger stack
    - Handled by allocating more space than is necessary at the start
      - Inefficient: wastes memory that's not currently in use
      - What if the process requests too much memory?

    [Figure: processes A and B each laid out as code, data, and stack, with free space left between the data and stack as room for the process to grow.]

  • Tracking memory usage: bitmaps

    - Keep track of free / allocated memory regions with a bitmap (a sketch follows the figure below)
      - One bit in the map corresponds to a fixed-size region of memory
      - The bitmap is a constant size for a given amount of memory, regardless of how much is allocated at a particular time
    - Chunk size determines efficiency
      - At 1 bit per 4 KB chunk, we need just 256 bits (32 bytes) per MB of memory
      - For smaller chunks, we need more memory for the bitmap
      - It can be difficult to find large contiguous free areas in the bitmap

    [Figure: 32 memory chunks holding regions A–D with free gaps between them; the corresponding bitmap is 11111100 00111000 01111111 11111000 (1 = allocated).]
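    A small C sketch of allocate and free on such a bitmap, assuming 32 chunks tracked in a single 32-bit word (1 = allocated), as in the figure:

        #include <stdint.h>

        static uint32_t bitmap;   /* bit i covers chunk i; 1 = allocated */

        /* Find n contiguous free chunks; return the first chunk index,
         * or -1 if no hole is big enough. */
        int bitmap_alloc(int n) {
            for (int start = 0; start + n <= 32; start++) {
                int free_run = 1;
                for (int i = start; i < start + n; i++)
                    if (bitmap & (1u << i)) { free_run = 0; break; }
                if (free_run) {
                    for (int i = start; i < start + n; i++)
                        bitmap |= 1u << i;          /* mark allocated */
                    return start;
                }
            }
            return -1;
        }

        void bitmap_free(int start, int n) {
            for (int i = start; i < start + n; i++)
                bitmap &= ~(1u << i);               /* mark free again */
        }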

  • Tracking memory usage: linked lists

    - Keep track of free / allocated memory regions with a linked list (a node sketch follows the figure below)
      - Each entry in the list corresponds to a contiguous region of memory
      - An entry can indicate either allocated or free (and, optionally, the owning process)
      - May have separate lists for free and allocated areas
    - Efficient if chunks are large
      - Fixed-size representation for each region
      - More regions => more space needed for free lists

    [Figure: the same 32 chunks represented as a list of (owner, start, length) entries: (A,0,6) → (free,6,4) → (B,10,3) → (free,13,4) → (C,17,9) → (D,26,3) → (free,29,3).]
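    A node for such a list might look like this C sketch (field names are illustrative):

        /* One contiguous run of chunks, free or owned by a process. */
        struct region {
            int owner;            /* process id, or -1 if free */
            int start;            /* first chunk number */
            int length;           /* number of chunks */
            struct region *next;  /* next region, in address order */
        };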

  • Allocating memory

    - Search through the region list to find a large enough space
    - Suppose there are several choices: which one to use? (first fit is sketched after the figure below)
      - First fit: the first suitable hole on the list
      - Next fit: the first suitable hole after the previously allocated hole
      - Best fit: the smallest hole that is larger than the desired region (wastes the least space?)
      - Worst fit: the largest available hole (leaves the largest fragment)
    - Option: maintain separate queues for different-size holes

    [Figure: a free list of (start, length) holes: (6,5), (19,14), (52,25), (102,30), (135,16), (202,10), (302,20), (350,30), (411,19), (510,3). Allocating 20 blocks first fit leaves a 5-block fragment; then 12 blocks next fit leaves 18; then 13 blocks best fit leaves 1; then 15 blocks worst fit leaves 15.]
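    A minimal first-fit sketch in C over an array of holes; the other policies differ only in which suitable hole they pick:

        struct hole { int start, length; };

        /* Return the start address of an allocation of `need` blocks,
         * or -1 if no hole is large enough. */
        int first_fit(struct hole *holes, int nholes, int need) {
            for (int i = 0; i < nholes; i++)
                if (holes[i].length >= need) {
                    int at = holes[i].start;
                    holes[i].start  += need;   /* shrink hole from the front */
                    holes[i].length -= need;   /* (a 0-length hole could be unlinked) */
                    return at;
                }
            return -1;
        }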

  • Freeing memory

    - Allocation structures must be updated when memory is freed
      - Easy with bitmaps: just clear the appropriate bits in the bitmap
      - Linked lists: modify adjacent elements as needed (sketched after the figure below)
        - Merge adjacent free regions into a single region
        - May involve merging two regions with the just-freed area

    [Figure: the four cases when freeing region X, depending on its neighbors: A|X|B → A|free|B; A|X|free → A followed by one larger hole; free|X|B → one larger hole followed by B; free|X|free → a single merged hole.]
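    A sketch of the list-based case in C, coalescing the freed node with free neighbors (the region struct is repeated from the linked-list sketch so this stands alone):

        #include <stdlib.h>

        struct region { int owner, start, length; struct region *next; };

        /* Free region r; prev is its predecessor in address order
         * (or NULL). owner == -1 marks a free region. */
        void region_free(struct region *prev, struct region *r) {
            r->owner = -1;
            if (r->next && r->next->owner == -1) {  /* merge with next hole */
                struct region *n = r->next;
                r->length += n->length;
                r->next = n->next;
                free(n);
            }
            if (prev && prev->owner == -1) {        /* merge with previous hole */
                prev->length += r->length;
                prev->next = r->next;
                free(r);
            }
        }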

  • Limitations of swapping

    - Problems with swapping
      - A process must fit into physical memory (impossible to run larger processes)
      - Memory becomes fragmented
        - External fragmentation: lots of small free areas
        - Compaction is needed to reassemble larger free areas
      - Processes are either in memory or on disk: half and half doesn't do any good
    - Overlays solved the first problem
      - Bring in pieces of the process over time (typically data)
      - Still doesn't solve the problems of fragmentation or partially resident processes

  • Virtual memory

    - Basic idea: allow the OS to hand out more memory than exists on the system
      - Keep recently used stuff in physical memory
      - Move less recently used stuff to disk
      - Keep all of this hidden from processes
    - Processes still see an address space from 0 to a maximum address
      - Movement of information to and from disk is handled by the OS without the process's help
    - Virtual memory (VM) is especially helpful in multiprogrammed systems
      - The CPU schedules process B while process A waits for its memory to be retrieved from disk

  • Virtual and physical addresses

    - Programs use virtual addresses
      - Addresses are local to the process
      - Hardware translates each virtual address to a physical address
    - Translation is done by the Memory Management Unit (MMU)
      - Usually on the same chip as the CPU
      - Only physical addresses leave the CPU/MMU chip
    - Physical memory is indexed by physical addresses

    [Figure: the CPU chip contains both the CPU and the MMU; virtual addresses pass from the CPU to the MMU, and physical addresses travel on the bus to memory and the disk controller.]

  • Paging and page tables

    - Virtual addresses are mapped to physical addresses
      - The unit of mapping is called a page
      - All addresses in the same virtual page are in the same physical page
      - A page table entry (PTE) contains the translation for a single page
    - The table translates a virtual page number to a physical page number
      - Not all virtual memory has a physical page
      - Not every physical page need be used
    - Example: 64 KB virtual memory, 32 KB physical memory

    [Figure: a 64 KB virtual address space (sixteen 4 KB pages) mapped onto 32 KB of physical memory (eight 4 KB frames); some virtual pages map to frames (e.g., 0–4K → frame 7, 4–8K → frame 4) while others are unmapped (–).]

  • What's in a page table entry?

    - Each entry in the page table contains (see the sketch after the figure below):
      - Valid bit: set if this logical page number has a corresponding physical frame in memory
        - If not valid, the rest of the PTE is irrelevant
      - Page frame number: the page's location in physical memory
      - Referenced bit: set if data on the page has been accessed
      - Dirty (modified) bit: set if data on the page has been modified
      - Protection information

    [Figure: PTE layout: page frame number | V (valid) | R (referenced) | D (dirty) | protection bits.]
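    As a sketch, the fields could be packed into a 32-bit entry like this (the widths and ordering are assumptions; real architectures lay the bits out differently):

        #include <stdint.h>

        typedef struct {
            uint32_t valid      : 1;   /* page has a physical frame */
            uint32_t referenced : 1;   /* page has been accessed */
            uint32_t dirty      : 1;   /* page has been modified */
            uint32_t protection : 3;   /* e.g., read / write / execute */
            uint32_t frame      : 26;  /* physical page frame number */
        } pte_t;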

  • Mapping logical => physical address

    - Split the address from the CPU into two pieces:
      - Page number (p): an index into the page table; the page table contains the base address of the page in physical memory
      - Page offset (d): added to that base address to get the actual physical memory address
    - Page size = 2^d bytes
    - Example: with 4 KB (4096-byte) pages and 32-bit logical addresses, 2^d = 4096 gives d = 12 offset bits, leaving 32 - 12 = 20 bits for the page number (the split is sketched below)
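    For that example the split is just a shift and a mask; a small C sketch:

        #include <stdint.h>
        #include <stdio.h>

        /* 4 KB pages: high 20 bits = page number p, low 12 bits = offset d. */
        #define PAGE_SHIFT 12
        #define PAGE_SIZE  (1u << PAGE_SHIFT)                 /* 4096 */
        #define PAGE_NUM(a) ((uint32_t)(a) >> PAGE_SHIFT)     /* p */
        #define PAGE_OFF(a) ((uint32_t)(a) & (PAGE_SIZE - 1)) /* d */

        int main(void) {
            uint32_t addr = 0x12345678;
            printf("p = 0x%x, d = 0x%x\n", PAGE_NUM(addr), PAGE_OFF(addr));
            /* prints p = 0x12345, d = 0x678 */
            return 0;
        }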

  • Address translation architecture

    [Figure: the CPU emits a logical address split into page number p and offset d; p indexes the page table to fetch frame number f; the physical address (f, d) then selects the location in physical memory.]

  • Memory & paging structures

    [Figure: two processes, P0 (pages 0–4) and P1 (pages 0–1), each with its own page table mapping into one physical memory; e.g., P0's table maps its pages to frames 0, 8, 2, 9, and 4, P1's to frames 3 and 6, and the remaining frames stay on the free list.]

  • Two-level page tables

    [Figure: a 1st-level page table whose entries point to 2nd-level page tables; the 2nd-level entries hold the frame numbers of pages in main memory.]

    - Problem: page tables can be too large
      - 2^32 bytes in 4 KB pages needs 1 million PTEs
    - Solution: use multi-level page tables
      - The "page size" covered by a first-level entry is large (megabytes)
      - A PTE marked invalid in the first-level page table needs no 2nd-level page table
    - The 1st-level page table has pointers to 2nd-level page tables
    - The 2nd-level page tables have the actual physical page numbers in them

  • More on two-level page tables

    - Tradeoffs between 1st- and 2nd-level page table sizes
      - The total number of bits indexing the 1st and 2nd level tables is constant for a given page size and logical address length
      - The tradeoff is between the number of bits indexing the 1st and the number indexing the 2nd level tables
        - More bits in the 1st level: finer granularity at the 2nd level
        - Fewer bits in the 1st level: maybe less wasted space?
    - All addresses in the tables are physical addresses
    - Protection bits are kept in the 2nd-level table

  • Two-level paging: example

    - System characteristics:
      - 8 KB pages
      - 32-bit logical address divided into a 13-bit page offset and a 19-bit page number
    - The page number is itself divided into a 10-bit 1st-level index (p1) and a 9-bit 2nd-level index (p2)
    - The logical address looks like (the full lookup is sketched below): | p1 = 10 bits | p2 = 9 bits | offset = 13 bits |
      - p1 is an index into the 1st-level page table
      - p2 is an index into the 2nd-level page table pointed to by p1
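    A C sketch of the lookup for this 10/9/13 split (the table representation and the fault convention are assumptions for illustration):

        #include <stdint.h>

        #define P1(a)  (((a) >> 22) & 0x3FF)   /* top 10 bits */
        #define P2(a)  (((a) >> 13) & 0x1FF)   /* next 9 bits */
        #define OFF(a) ((a) & 0x1FFF)          /* low 13 bits */

        /* pgd: 1024-entry 1st-level table of pointers to 512-entry
         * 2nd-level tables of frame numbers (NULL/0 = invalid). */
        uint32_t translate(uint32_t **pgd, uint32_t vaddr) {
            uint32_t *pt = pgd[P1(vaddr)];
            if (!pt) return 0;                  /* fault: no 2nd-level table */
            uint32_t frame = pt[P2(vaddr)];
            if (!frame) return 0;               /* fault: page not mapped */
            return (frame << 13) | OFF(vaddr);  /* frame number + offset */
        }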

  • 2-level address translation example

    [Figure: the page table base register locates the 1st-level table; p1 selects an entry pointing to a 2nd-level table; p2 selects a 19-bit frame number there, which is concatenated with the 13-bit offset to form the physical address into main memory.]

  • Implementing page tables in hardware

    - The page table resides in main (physical) memory
    - The CPU uses special registers for paging
      - Page table base register (PTBR) points to the page table
      - Page table length register (PTLR) contains the length of the page table: it restricts the maximum legal logical address
    - Translating an address requires two memory accesses
      - The first access reads the page table entry (PTE)
      - The second access reads the data / instruction from memory
    - Reducing the number of memory accesses:
      - Can't avoid the second access (we need the value from memory)
      - Eliminate the first access by keeping a hardware cache (called a translation lookaside buffer, or TLB) of recently used page table entries

  • Translation Lookaside Buffer (TLB)

    [Figure: an example TLB holding a handful of (logical page #, physical frame #) pairs, with some entries unused.]

    - Search the TLB for the desired logical page number (modeled in the sketch below)
      - Entries are searched in parallel
      - Uses standard cache techniques
    - If the desired logical page number is found, get the frame number from the TLB
    - If the desired logical page number isn't found:
      - Get the frame number from the page table in memory
      - Replace an entry in the TLB with the logical & physical page numbers from this reference
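    A software model of the lookup in C (real hardware compares all entries at once; the loop only models that behavior):

        #include <stdint.h>

        #define TLB_SIZE 64

        struct tlbe { uint32_t vpage, frame; int valid; } tlb[TLB_SIZE];

        /* Return 1 and fill *frame on a hit, 0 on a miss
         * (the caller then walks the page table). */
        int tlb_lookup(uint32_t vpage, uint32_t *frame) {
            for (int i = 0; i < TLB_SIZE; i++)
                if (tlb[i].valid && tlb[i].vpage == vpage) {
                    *frame = tlb[i].frame;   /* TLB hit */
                    return 1;
                }
            return 0;                        /* TLB miss */
        }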

  • Handling TLB misses

    - If the PTE isn't found in the TLB, the lookup must be done in the page table
    - The lookup can be done in hardware or software
    - Hardware TLB replacement
      - CPU hardware does the page table lookup
      - Can be faster than software
      - Less flexible than software, and more complex hardware
    - Software TLB replacement
      - The OS gets a TLB exception
      - The exception handler does the page table lookup & places the result into the TLB
      - The program continues after returning from the exception
      - A larger TLB (lower miss rate) can make this feasible

  • How long do memory accesses take?

    - Assume the following times:
      - TLB lookup time = a (often zero: overlapped in the CPU)
      - Memory access time = m
    - Hit ratio (h) is the fraction of the time that a logical page number is found in the TLB
      - A larger TLB usually means a higher h
      - TLB structure can affect h as well
    - Effective access time (an average) is calculated as:
      - EAT = (m + a)h + (m + m + a)(1 - h)
      - EAT = a + (2 - h)m
    - Interpretation (a worked example follows):
      - A reference always requires a TLB lookup and one memory access
      - A TLB miss also requires an additional memory reference
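    A quick worked example with assumed numbers: if m = 100 ns, a = 0, and h = 0.98, then EAT = 0 + (2 - 0.98) × 100 ns = 102 ns, only 2% slower than a bare memory access; at h = 0.80 it rises to 120 ns.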

  • Inverted page table

    - Reduce page table size further: keep one entry for each frame in memory
    - Each PTE contains:
      - The virtual address pointing to this frame
      - Information about the process that owns this page
    - Search the page table by (sketched below):
      - Hashing the virtual page number and process ID
      - Starting at the entry corresponding to the hash result
      - Searching until either the entry is found or a limit is reached
    - The page frame number is the index of the matching PTE
    - Improve performance by using more advanced hashing algorithms
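    A minimal C sketch of that search, with a made-up hash and linear probing (a real design would cap the probe count rather than scan the whole table):

        #include <stdint.h>

        #define NFRAMES 4096

        struct ipte { uint32_t pid, vpage; int used; } table[NFRAMES];

        /* Return the frame number holding (pid, vpage), or -1 (fault). */
        int ipt_lookup(uint32_t pid, uint32_t vpage) {
            uint32_t h = (pid * 31 + vpage) % NFRAMES;   /* toy hash */
            for (int probe = 0; probe < NFRAMES; probe++) {
                uint32_t i = (h + probe) % NFRAMES;      /* linear probing */
                if (table[i].used && table[i].pid == pid
                                  && table[i].vpage == vpage)
                    return (int)i;                       /* index = frame # */
            }
            return -1;
        }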

  • Inverted page table architecture

    [Figure: the virtual address (process ID, page number p = 19 bits, offset = 13 bits) is hashed to search the inverted page table for a matching (pid, p) entry; the index k of the matching entry becomes the page frame number, and (k, offset) forms the physical address into main memory.]