Avishai Wool lecture 8 - 1 Introduction to Systems Programming Lecture 8 Paging Design Input-Output.
Avishai Wool lecture 8 - 1
Introduction to Systems Programming Lecture 8
Paging Design
Input-Output
Steps in Handling a Page Fault
Virtual → Physical mapping
• CPU accesses virtual address 100000
• MMU looks in page table to find physical address
  – Page table is in memory too, so each access needs an extra memory access
• Unreasonable overhead!
TLB: Translation Lookaside Buffer
• Idea: Keep the most frequently used parts of the page table in a cache, inside the MMU chip.
• TLB holds a small number of page table entries: Usually 8 – 64
• TLB hit rate very high because, e.g., instructions fetched sequentially.
A TLB to speed up paging
Example:
• Code loops through pages 19, 20, 21
• Uses data array in pages 129, 130, 140
• Stack variables in pages 860, 861
Valid TLB Entries
• TLB miss:
  – Do regular page lookup
  – Evict a TLB entry and store the new TLB entry
  – Miniature paging system, done in hardware
• When OS does context switch to a new process, all TLB entries become invalid:
  – Early instructions of new process will cause TLB misses.
TLB placement/eviction
• Done by hardware
• Placement rule:
  – TLBIndex = VirtualAddr modulo TLBSize (VirtualAddr here is the virtual page number)
  – TLBSize is always 2^k, so TLBIndex = k least-significant bits
  – Keep “tag” (rest of bits) to fully identify virtual addr
• Virtual address can be only in one TLB index
• No explicit “eviction”: simply overwrite what is in TLB[TLBIndex]
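The placement rule above can be sketched in a few lines. This is a minimal software model, not how any particular MMU is built: the size `TLB_SIZE = 8`, the class name `DirectMappedTLB`, and the frame numbers are all assumptions made for illustration.

```python
TLB_SIZE = 8                      # assumed size; must be a power of two (2**k)
K = TLB_SIZE.bit_length() - 1     # k = number of index bits


class DirectMappedTLB:
    def __init__(self):
        # Each slot holds (tag, frame), or None when the entry is invalid.
        self.slots = [None] * TLB_SIZE

    def lookup(self, vpage):
        index = vpage & (TLB_SIZE - 1)   # k least-significant bits
        tag = vpage >> K                 # rest of the bits
        entry = self.slots[index]
        if entry is not None and entry[0] == tag:
            return entry[1]              # TLB hit: physical frame number
        return None                      # TLB miss

    def insert(self, vpage, frame):
        index = vpage & (TLB_SIZE - 1)
        # No explicit eviction: simply overwrite whatever is in the slot.
        self.slots[index] = (vpage >> K, frame)

    def flush(self):
        # On a context switch all entries become invalid.
        self.slots = [None] * TLB_SIZE


tlb = DirectMappedTLB()
tlb.insert(19, 50)
tlb.insert(27, 45)   # 27 and 19 share index 3, so page 19 is overwritten
```

After the second insert, looking up page 27 hits and looking up page 19 misses, because both pages map to the same index and only the tag tells them apart.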
TLB + Page table lookup
Flowchart: Virtual address → In TLB?
• Yes: physical address
• No: In page table?
  – Yes: update TLB, physical address
  – No: page fault, copy from disk to memory
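The lookup order on this slide can be sketched as a small function. It is a simplified model under stated assumptions: the TLB and page table are plain dictionaries with no size limit, `load_from_disk` is a hypothetical callback that returns the frame the page was loaded into, and the access is assumed to be retried immediately after a page fault.

```python
def translate(vpage, tlb, page_table, load_from_disk):
    """Return (frame, outcome) for a virtual page number."""
    if vpage in tlb:                       # In TLB? Yes
        return tlb[vpage], "tlb hit"
    if vpage in page_table:                # In page table? Yes; update TLB
        tlb[vpage] = page_table[vpage]
        return tlb[vpage], "page hit"
    frame = load_from_disk(vpage)          # Page fault: copy from disk to memory
    page_table[vpage] = frame
    tlb[vpage] = frame
    return frame, "page fault"


demo_tlb, demo_table = {}, {20: 7}
first = translate(20, demo_tlb, demo_table, lambda page: 99)   # found in page table
second = translate(20, demo_tlb, demo_table, lambda page: 99)  # now found in TLB
third = translate(21, demo_tlb, demo_table, lambda page: 99)   # not mapped anywhere
```

The second access to page 20 is a TLB hit only because the first access updated the TLB.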
TLB – cont.
• If address is in TLB → page is in physical memory
  – OS invalidates TLB entry when evicting a page
  – So page fault not possible if we have a TLB hit
• “page fault rate” is computed only on TLB misses
Example: Average memory access time
• TLB lookup: 4ns
• Phys mem access: 10ns
• Disk access: 10ms
• TLB miss rate: 1%
• Page fault rate: 0.1% (of TLB misses)
• Assume page table is in memory.
TLB hit: p=0.99, time=4ns+10ns
TLB miss, page hit: p=0.01*0.999, time=4ns+10ns+10ns
TLB miss, page fault: p=0.01*0.001, time=4ns+10ns+10ms+10ns
Average memory access: 114.1ns (1.141*10^-7 s)
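The average can be checked by weighting each case by its probability. The variable names below are chosen for this sketch; the numbers are the ones given on the slide.

```python
TLB_NS = 4
MEM_NS = 10
DISK_NS = 10_000_000      # 10 ms expressed in nanoseconds
TLB_MISS = 0.01
PAGE_FAULT = 0.001        # fraction of TLB misses that are page faults

cases = [
    # (probability, access time in ns)
    (1 - TLB_MISS, TLB_NS + MEM_NS),                              # TLB hit
    (TLB_MISS * (1 - PAGE_FAULT), TLB_NS + MEM_NS + MEM_NS),      # TLB miss, page hit
    (TLB_MISS * PAGE_FAULT, TLB_NS + MEM_NS + DISK_NS + MEM_NS),  # TLB miss, page fault
]

average_ns = sum(p * t for p, t in cases)
print(f"{average_ns:.1f} ns")   # 13.86 + 0.23976 + 100.00024
```

Almost all of the average (about 100 of the 114.1 ns) comes from the page-fault case, even though it occurs only once per 100,000 accesses.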
Design issues in Paging
Local versus Global Allocation Policies: Physical Memory
a) Original configuration – ‘A’ causes page fault
b) Local page replacement
c) Global page replacement
Local or Global?
• Local → number of frames per process is fixed
  – If working set grows → thrashing
  – If working set shrinks → waste
• Global usually better
• Some algorithms can only be local (working set, WSClock).
How many frames to give a process?
• Fixed number
• Proportional to its size (before load)
• Zero, let it issue page faults for all its pages.
  – This is called pure demand paging.
• Monitor page-fault-frequency (PFF), give more pages if PFF high.
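A PFF monitor can be sketched as a function that is called once per measurement interval. The thresholds `low` and `high` and the one-frame step are hypothetical values picked for illustration. Taking a frame away when PFF is low is an assumption borrowed from the usual description of the PFF algorithm; the slide itself only says to give more pages when PFF is high.

```python
def adjust_frames(frames, faults, references, low=0.01, high=0.05):
    """Return the new frame allocation for one process.

    faults / references is the page-fault frequency measured over
    the last interval.
    """
    pff = faults / references
    if pff > high:
        return frames + 1        # faulting too often: give more frames
    if pff < low and frames > 1:
        return frames - 1        # assumed: reclaim a frame when PFF is low
    return frames                # PFF in the acceptable band: no change
```

For example, a process with 10 frames and 10 faults in 100 references (PFF 0.10) would be raised to 11 frames.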
Page fault rate as a function of the number of page frames assigned
Load Control
• Despite good designs, system may still thrash
• When PFF algorithm indicates
  – some processes need more memory
  – but no processes need less
• Solution: Reduce number of processes competing for memory
  – swap one or more to disk, divide up frames they held
  – reconsider degree of multiprogramming
Cleaning Policy
• Need for a background process, paging daemon
  – periodically inspects state of memory
• When too few frames are free
  – selects pages to evict using a replacement algorithm
• It can use the same circular list (clock) as the regular page replacement algorithm, but with a different pointer
Windows XP Page Replacement
• Processes are assigned working set minimum and working set maximum
• Working set minimum is the minimum number of page frames the process is guaranteed to have in memory
• A process may be assigned page frames up to its working set maximum
• When the amount of free memory in the system falls below a threshold, automatic working set trimming is performed to restore the amount of free memory
• Working set trimming removes frames from processes that have more than their working set minimum
Devices, Controllers, and I/O Architectures
I/O Device Types
• Block Devices
  – block size of 512-32768 bytes
  – block can be read/written individually
  – typical: disks / floppy / CD
• Character Devices
  – delivers / accepts a sequential stream of characters
  – non-addressable
  – typical: keyboard, mouse, printer, network
• Other: Monitor, Clock
Typical Data Rates
Device Controllers
• I/O devices have components:
  – mechanical component
  – electronic component
• The electronic component is the device controller
  – may be able to handle multiple devices
• Controller's tasks
  – convert serial bit stream to block of bytes
  – perform error correction as necessary
  – make available to main memory
Communicating with Controllers
• Controllers have registers to deliver data, accept data, etc.
• Option 1: special I/O commands, I/O ports
    in r0, 4
• “4” is not memory address 4, it is I/O port 4
• Option 2: I/O registers mapped to memory addresses
Memory-Mapped Registers
• Controller connected to the bus
• Has a physical “memory address” like B0000000
• When this address appears on the bus, the controller responds (read/write to its I/O register)
• RAM configured to ignore controller’s address
Possible I/O Register Mappings
• Separate I/O and memory space (IBM 360)
• Memory-mapped I/O (PDP-11)
• Hybrid (Pentium, 640K-1M are for I/O)
Advantages of Memory Mapped I/O
• No special instructions, can be written in C.
• Protection by not putting I/O memory in user virtual address space.
• All machine instructions can access I/O:
    LOOP:  test *b0000004    // check if port_4 is 0
           beq READY
           branch LOOP
    READY: ...
Disadvantages of Memory Mapped I/O
• Memory and I/O controllers have to be on the same bus:
  – modern architectures have separate memory bus!
  – Pentium has 3 buses: memory, PCI, ISA
Bus Architectures
(a) A single-bus architecture
(b) A dual-bus memory architecture
Memory Mapped with Separate Bus
• I/O Controllers do not see memory bus.
• Option 1: send all addresses to the memory bus first. No response → try the I/O bus
• Option 2: Snooping device between buses
  – speed difference is a problem
• Option 3 (Pentium): filter addresses in PCI bridge
Structure of a large Pentium system
Principles of I/O Software
Goals of I/O Software
• Device independence
  – programs can access any I/O device without specifying device in advance (floppy, hard drive, or CD-ROM)
• Uniform naming
  – name of a file or device is a string or an integer
  – not depending on which machine
• Error handling
  – handle as close to the hardware as possible
Goals of I/O Software (2)
• Synchronous vs. asynchronous transfers
  – blocked transfers vs. interrupt-driven
• Buffering
  – data coming off a device cannot always be stored directly in its final destination
• Sharable vs. dedicated devices
  – disks are sharable
  – tape drives would not be
How is I/O Programmed
• Programmed I/O
• Interrupt-driven I/O
• DMA (Direct Memory Access)
Programmed I/O
Steps in printing a string
Polling
Busy-waiting until device can accept another character
Example assumes memory-mapped registers
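The busy-wait loop can be illustrated with a simulated device. The code from the original slide is not in this transcript, so everything here is an assumption made for the sketch: the class `SimulatedPrinter`, its `ready`/`write` methods standing in for reads of a status register and writes to a data register, and the figure of 3 polls per character.

```python
class SimulatedPrinter:
    """Stand-in for a printer controller with a status and a data register."""

    def __init__(self, busy_polls=3):
        self.busy_polls = busy_polls   # polls needed before the device is ready again
        self._remaining = 0
        self.output = []

    def ready(self):
        # Models reading the status register.
        if self._remaining > 0:
            self._remaining -= 1
            return False
        return True

    def write(self, ch):
        # Models writing the data register; the device is busy afterwards.
        self.output.append(ch)
        self._remaining = self.busy_polls


def print_string(printer, text):
    """Programmed I/O: the CPU polls the device before every character."""
    wasted_polls = 0
    for ch in text:
        while not printer.ready():     # busy-waiting
            wasted_polls += 1
        printer.write(ch)
    return wasted_polls


printer = SimulatedPrinter()
wasted = print_string(printer, "ABCD")
```

The count returned by `print_string` is CPU work that produced nothing, which is the cost of programmed I/O on a slow device.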
Properties of Programmed I/O
• Simple to program
• Ties up CPU, especially if device is slow
Interrupts Revisited
Interrupt-Driven I/O
Code executed when print system call is made
Interrupt service procedure
Properties of Interrupt-Driven I/O
• Interrupt every character or word.
• Interrupt handling takes time.
• Makes sense for slow devices (keyboard, mouse)
• For fast device: use dedicated DMA controller
  – usually for disk and network.
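The split between the system-call code and the interrupt service procedure can be sketched as a small simulation. The original slide code is not in this transcript, so the class `InterruptDrivenPrint` and its method names are hypothetical, and the device is modeled by a loop that raises one interrupt each time it is ready for another character.

```python
class InterruptDrivenPrint:
    def __init__(self, text):
        self.buffer = list(text)     # string copied into a kernel buffer
        self.index = 0
        self.printed = []
        self.interrupts = 0
        self.caller_blocked = False

    def _send_next(self):
        # Models writing one character to the device's data register.
        self.printed.append(self.buffer[self.index])
        self.index += 1

    def start(self):
        """Code executed when the print system call is made."""
        if self.buffer:
            self.caller_blocked = True   # caller sleeps; CPU can run other work
            self._send_next()

    def on_interrupt(self):
        """Interrupt service procedure: the device is ready for more."""
        self.interrupts += 1
        if self.index == len(self.buffer):
            self.caller_blocked = False  # everything printed: unblock the caller
        else:
            self._send_next()


job = InterruptDrivenPrint("ABCD")
job.start()
while job.caller_blocked:
    job.on_interrupt()               # simulated device interrupt
```

Four characters cost four interrupts here, which is why this scheme suits slow devices and becomes expensive for fast ones.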
Direct Memory Access (DMA)
• DMA controller has access to bus.
• Registers:
  – memory address to write/read from
  – byte count
  – I/O port or mapped-memory address to use
  – direction (read from / write to device)
  – transfer unit (byte or word)
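The register set above can be modeled as a small record plus a transfer loop. This is a sketch with assumed names (`DMARegisters`, `run_dma`); memory and the device are simulated with `bytearray` objects, and the I/O port field is stored but not used by the simulation.

```python
from dataclasses import dataclass


@dataclass
class DMARegisters:
    memory_address: int   # where in memory to read from / write to
    byte_count: int       # bytes still to transfer
    device_port: int      # I/O port or mapped address (unused in this simulation)
    to_device: bool       # direction: True = memory to device
    unit_bytes: int = 1   # transfer unit: 1 = byte, larger = word


def run_dma(regs, memory, device):
    """Simulate a whole transfer; return the number of interrupts raised."""
    while regs.byte_count > 0:
        n = min(regs.unit_bytes, regs.byte_count)
        end = regs.memory_address + n
        if regs.to_device:
            device.extend(memory[regs.memory_address:end])
        else:
            memory[regs.memory_address:end] = device[:n]
            del device[:n]
        regs.memory_address = end     # controller advances its own registers
        regs.byte_count -= n
    return 1                          # one interrupt, when the count reaches zero


ram = bytearray(b"....ABCD")
dma_printer = bytearray()
regs = DMARegisters(memory_address=4, byte_count=4, device_port=4, to_device=True)
dma_interrupts = run_dma(regs, ram, dma_printer)
```

The CPU only fills in the registers and handles a single completion interrupt, instead of one interrupt per character.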
Operation of a DMA transfer
I/O Using DMA
code executed when the print system call is made
interrupt service procedure
DMA with Virtual Memory
• Most DMA controllers use physical addresses
• What if memory of buffer is paged out during DMA transfer?
• Force the page to stay in memory until the transfer completes (“pinning”)
Burst or Cycle-stealing
• If the DMA controller grabs the bus for one word at a time, it competes with the CPU for bus access. This is called “cycle-stealing”.
• In “burst” mode the DMA controller acquires the bus (exclusively), issues several transfers, and releases.
  – More efficient
  – May block CPU and other devices
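One way to see why burst mode is more efficient is to count how often the controller has to acquire the bus. The function below is a back-of-the-envelope model, not a description of real bus arbitration; the name `bus_acquisitions` and the burst length of 16 words are assumptions.

```python
def bus_acquisitions(words, burst_length=1):
    """Number of times the DMA controller must acquire the bus.

    burst_length=1 models cycle-stealing (one word per acquisition).
    """
    return -(-words // burst_length)   # ceiling division


stealing = bus_acquisitions(1024)                  # one acquisition per word
burst = bus_acquisitions(1024, burst_length=16)    # one acquisition per 16 words
```

Fewer acquisitions means less arbitration overhead, but each burst holds the bus longer, which is when the CPU and other devices may be blocked.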
Concepts for review
• TLB
• Local/Global page replacement
• Demand paging
• Page-fault-frequency monitor
• I/O device controller
• in/out commands
• Memory-mapped registers
• PCI Bridge
• Programmed I/O (Polling)
• Interrupt-driven I/O
• I/O using DMA
• Page pinning
• DMA cycle-stealing
• DMA burst mode