VM and IO Topics in Linux
![Page 1: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/1.jpg)
VM and I/O Topics in Linux
Page Replacement, Swap and I/O
Jiannan Ouyang
Ph.D. Student
Computer Science Department
University of Pittsburgh
05/05/2011
![Page 2: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/2.jpg)
Outline
• Overview of Linux Memory Management
• Page Reclamation
• Swap & I/O
Jiannan Ouyang, CS PhD@PITT 2
![Page 3: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/3.jpg)
Describing Physical Memory
• Node: a NUMA memory region
• Zone: a range of memory of one type within a node
• struct page: describes one physical page frame
![Page 4: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/4.jpg)
Physical Page Allocation
Binary Buddy Allocator:
• If no block of the desired size is available, a larger block is split in half, and the two halves become buddies of each other. One half is used for the allocation and the other stays free. Blocks are halved repeatedly until a block of the desired size is available.
• When a block is later freed, its buddy is examined, and the two are coalesced if the buddy is also free.
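The splitting and coalescing rules rest on one piece of arithmetic: a block and its buddy differ in exactly one bit of their page index. The following is a minimal sketch of that arithmetic, assuming a block is identified by the index of its first page and by its order (a block of order k spans 2^k pages). The function names are illustrative and are not kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the buddy of the block that starts at page_idx and spans
 * 2^order pages. page_idx must be aligned to the block size. */
size_t buddy_index(size_t page_idx, unsigned order)
{
    assert((page_idx & (((size_t)1 << order) - 1)) == 0);
    return page_idx ^ ((size_t)1 << order);
}

/* Start index of the order+1 block formed when a block and its buddy
 * are coalesced: the lower of the two start indices. */
size_t merged_index(size_t page_idx, unsigned order)
{
    return page_idx & ~((size_t)1 << order);
}

/* Number of halvings needed to carve a block of want_order out of a
 * free block of have_order. */
unsigned splits_needed(unsigned have_order, unsigned want_order)
{
    assert(have_order >= want_order);
    return have_order - want_order;
}
```

For example, the order-2 block starting at page 8 has its buddy at page 12; if both are free they coalesce into the order-3 block starting at page 8.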
![Page 5: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/5.jpg)
Page Table Management
• Three Level Mapping
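A three-level mapping splits a linear address into three table indices plus a page offset. The sketch below assumes one concrete split, the 2/9/9/12-bit layout of 32-bit x86 with PAE; the slide does not say which layout it has in mind, so the bit widths are an assumption. The names echo kernel naming, but the code is a standalone illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 2-bit PGD index, 9-bit PMD index, 9-bit PTE index,
 * 12-bit page offset (32-bit x86 with PAE). */
#define PAGE_SHIFT   12
#define PMD_SHIFT    21
#define PGDIR_SHIFT  30
#define PTRS_PER_PMD 512u
#define PTRS_PER_PTE 512u

unsigned pgd_index(uint32_t addr)   { return addr >> PGDIR_SHIFT; }
unsigned pmd_index(uint32_t addr)   { return (addr >> PMD_SHIFT) & (PTRS_PER_PMD - 1); }
unsigned pte_index(uint32_t addr)   { return (addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1); }
unsigned page_offset(uint32_t addr) { return addr & ((1u << PAGE_SHIFT) - 1); }

/* Inverse operation: rebuild the linear address from its four fields. */
uint32_t join_address(unsigned pgd, unsigned pmd, unsigned pte, unsigned off)
{
    assert(pgd < 4 && pmd < PTRS_PER_PMD && pte < PTRS_PER_PTE);
    assert(off < (1u << PAGE_SHIFT));
    return ((uint32_t)pgd << PGDIR_SHIFT) | ((uint32_t)pmd << PMD_SHIFT) |
           ((uint32_t)pte << PAGE_SHIFT) | off;
}
```

Under this assumed layout, the address 0xC0101234 decomposes into PGD index 3, PMD index 0, PTE index 257, and offset 0x234.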
![Page 6: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/6.jpg)
Kernel Memory Mapping
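[Figure: the 4-GB virtual address space (from 0x00000000) next to physical memory (0x00000000 to 0x3FFFFFFF, 1 GB). Kernel virtual addresses start at 0xC0000000, and 896 MB of that range maps onto the first 896 MB of physical memory. The figure also labels display memory and device memory.]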
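The numbers in this figure (0xC0000000 and 896 MB) describe a linear mapping, so translating between a kernel virtual address and a physical address in that range is a single addition or subtraction. The sketch below assumes the default 3 GB / 1 GB split of 32-bit x86. The constant and function names are illustrative stand-ins, not the kernel's own macros.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET 0xC0000000u           /* assumed start of kernel virtual space */
#define LOWMEM_SIZE (896u * 1024 * 1024)  /* size of the directly mapped region */

/* Kernel virtual address -> physical address, valid only inside the
 * directly mapped 896-MB window. */
uint32_t virt_to_phys(uint32_t vaddr)
{
    assert(vaddr >= PAGE_OFFSET && vaddr - PAGE_OFFSET < LOWMEM_SIZE);
    return vaddr - PAGE_OFFSET;
}

/* Physical address -> kernel virtual address, valid only for the first
 * 896 MB of physical memory. */
uint32_t phys_to_virt(uint32_t paddr)
{
    assert(paddr < LOWMEM_SIZE);
    return paddr + PAGE_OFFSET;
}
```

Physical memory above the 896-MB window has no permanent kernel virtual address under this scheme, which is why the asserts reject it.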
![Page 7: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/7.jpg)
User Memory Mapping
[Figure: a process's virtual memory is split into 3 GB of user space (text, data, stack) and kernel space; mappings connect the user text, data, and stack regions to physical memory.]
![Page 8: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/8.jpg)
User Memory Mapping
[Figure: two virtual address spaces, each with user space (text, data, stack) and kernel space, mapped onto the same physical memory.]
![Page 9: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/9.jpg)
Outline
• Overview of Linux Memory Management
• Page Reclamation
• Swap & I/O
![Page 10: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/10.jpg)
Memory Customers
[Figure: the memory customers (kernel code & data, user code & data, slab cache, page cache, icache & dcache) request pages from the buddy system, and the buddy system reclaims pages from them.]
• All memory except “User Code & Data” is used by the kernel.
• “User Code & Data” is managed in user space (e.g. with malloc/free); the kernel can only swap user pages out.
![Page 11: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/11.jpg)
Slab Cache
• A cache of commonly used objects, kept in an initialized state and available for use by the kernel.
• Saves the time of repeatedly allocating, initializing, and freeing the same kind of object.
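The idea can be sketched in user space as an object cache that keeps freed objects on a list in their constructed state, so the next allocation skips both the allocator and the constructor. This is a toy model under stated assumptions: the object type, the constructor, and every name below are hypothetical, and the real slab allocator additionally groups objects into page-sized slabs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object type held by the cache. */
struct obj {
    int         refcount;
    char        name[16];
    struct obj *next_free;   /* link used only while the object is cached */
};

struct obj_cache {
    struct obj   *free_list;    /* constructed objects ready for reuse */
    unsigned long constructed;  /* how many times the constructor ran */
};

/* Constructor: puts an object into its initialized state. */
static void obj_ctor(struct obj *o)
{
    o->refcount = 0;
    memset(o->name, 0, sizeof o->name);
}

struct obj *cache_alloc(struct obj_cache *c)
{
    struct obj *o = c->free_list;
    if (o != NULL) {              /* fast path: reuse, no construction */
        c->free_list = o->next_free;
        return o;
    }
    o = malloc(sizeof *o);        /* slow path: allocate and construct */
    if (o != NULL) {
        obj_ctor(o);
        c->constructed++;
    }
    return o;
}

/* The caller must hand the object back in its constructed state. */
void cache_free(struct obj_cache *c, struct obj *o)
{
    assert(o != NULL);
    o->next_free = c->free_list;
    c->free_list = o;
}
```

Allocating, freeing, and allocating again returns the same object and runs the constructor only once, which is the saving the slide describes.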
![Page 12: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/12.jpg)
Disk related caches
• Dcache (metadata): dentry objects representing filesystem pathnames.
• Icache (metadata): inode objects representing disk inodes.
• Page Cache (data): data pages from disk; the main disk cache used by the kernel.
![Page 13: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/13.jpg)
Memory Customers Review
[Figure: the memory customers diagram again: kernel code & data, user code & data, slab cache, page cache, and icache & dcache request pages from the buddy system, which reclaims pages from them.]
Next: when the kernel starts reclaiming pages, which pages it reclaims, and the replacement policy.
![Page 14: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/14.jpg)
Reclamation: When?
Zone Watermarks
• Pages Low: when free pages in the zone drop to this level, kswapd is woken up by the buddy allocator to start freeing pages. By default the value is twice the value of pages min.
• Pages Min: when free pages drop to this level, the allocator does the kswapd work synchronously, sometimes referred to as the direct-reclaim path.
• Pages High: once free pages rise to this level, kswapd goes back to sleep. By default the value is three times the value of pages min.
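The three watermarks can be read as a small decision procedure. The sketch below derives the low and high marks from pages min using the default ratios stated on the slide and then classifies the number of free pages. The type and function names are hypothetical, and the exact boundary comparisons (<= versus <) are an assumption.

```c
#include <assert.h>

enum reclaim_action { NO_ACTION, WAKE_KSWAPD, DIRECT_RECLAIM };

struct watermarks { unsigned long min, low, high; };

/* Default ratios from the slide: low = 2 * min, high = 3 * min. */
struct watermarks default_watermarks(unsigned long pages_min)
{
    struct watermarks w = { pages_min, 2 * pages_min, 3 * pages_min };
    return w;
}

/* What the allocator does when it sees free_pages free pages in a zone. */
enum reclaim_action on_allocation(const struct watermarks *w,
                                  unsigned long free_pages)
{
    assert(w->min <= w->low && w->low <= w->high);
    if (free_pages <= w->min)
        return DIRECT_RECLAIM;   /* allocator reclaims synchronously */
    if (free_pages <= w->low)
        return WAKE_KSWAPD;      /* background reclaim starts */
    return NO_ACTION;
}

/* kswapd keeps freeing pages until the high watermark is reached. */
int kswapd_may_sleep(const struct watermarks *w, unsigned long free_pages)
{
    return free_pages >= w->high;
}
```

With pages min set to 128, the low and high marks are 256 and 384: a zone with 200 free pages wakes kswapd, and one with 100 free pages forces direct reclaim.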
![Page 15: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/15.jpg)
![Page 16: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/16.jpg)
Reclamation: Which?
![Page 17: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/17.jpg)
Reclamation: Which? (Cont.)
• Mapped & Anonymous Pages
– Mapped: backed by a file
– Anonymous: belongs to an anonymous memory region of a process
• Shared & Non-shared Pages
– A shared page must be unmapped from all page table entries that reference it at once; this is done with reverse mapping, an important improvement in the Linux 2.6 kernel
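Reverse mapping can be pictured as each page frame keeping back-pointers to the page table entries that map it, so that all of them can be cleared in one pass. The sketch below is a deliberately simplified model: the fixed-size array, the present-bit constant, and the function names are all assumptions, and the kernel's actual data structures are more elaborate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT 0x1u   /* assumed position of the present bit */
#define MAX_MAPPERS 8      /* arbitrary limit for this sketch */

/* Simplified page descriptor with back-pointers to its PTEs. */
struct page {
    uint32_t *ptes[MAX_MAPPERS];
    size_t    mapcount;
};

/* Record that *pte maps this page. Returns 0 on success, -1 if full. */
int rmap_add(struct page *pg, uint32_t *pte)
{
    assert(pte != NULL);
    if (pg->mapcount == MAX_MAPPERS)
        return -1;
    pg->ptes[pg->mapcount++] = pte;
    return 0;
}

/* Clear the present bit in every PTE that maps the page and forget the
 * back-pointers. Returns the number of PTEs that were unmapped. */
size_t rmap_unmap_all(struct page *pg)
{
    size_t n = pg->mapcount;
    for (size_t i = 0; i < n; i++)
        *pg->ptes[i] &= ~PTE_PRESENT;
    pg->mapcount = 0;
    return n;
}
```

Without the back-pointers, finding every mapping of a shared page would mean scanning the page tables of every process that might map it.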
![Page 18: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/18.jpg)
Reclamation: Which? (Cont.)
shrink_caches() reclaims from the following until the target number of pages is met:
1. Slab cache (kmem_cache_reap)
2. User pages & page cache (refill & shrink_cache)
3. dcache and icache
![Page 19: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/19.jpg)
Replacement Policy
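[Figure: pages move between an active list and an inactive list. Each page carries an (active, ref) pair with values {11, 10, 01, 00}. An access advances the pair and can set active=1; on the active list a page with ref=1 has ref cleared, while a page with ref=0 gets active=0; pages are reclaimed from the inactive list.]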
![Page 20: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/20.jpg)
Moving pages across the list
mark_page_accessed():
    on each access, increment the (active, ref) counter;
    if active becomes 1, move the page from the inactive list to the active list;
refill_inactive_zone():
    if (ref == 1) { ref = 0; move the page to the head of the active list; }
    else { move the page from the active list to the inactive list; }
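The two routines amount to a small state machine over the (active, ref) pair. The following toy model treats the pair as a saturating two-bit counter (00, 01, 10, 11), as the slide describes. It tracks only the two flags and leaves out the lists themselves; the struct and function names are hypothetical and merely modeled on the routines named on the slide.

```c
#include <assert.h>

/* Toy page descriptor: only the two flags from the slide. */
struct page {
    unsigned active : 1;   /* 1 = on the active list, 0 = on the inactive list */
    unsigned ref    : 1;   /* referenced since the flag was last cleared */
};

/* Access path: advance (active, ref) through 00 -> 01 -> 10 -> 11.
 * The 01 -> 10 step is the move from the inactive to the active list. */
void mark_accessed(struct page *p)
{
    if (!p->ref) {
        p->ref = 1;
    } else if (!p->active) {
        p->active = 1;
        p->ref = 0;
    }
    /* (1, 1) is the top state and is left unchanged. */
}

/* Scan path for a page on the active list. Returns 1 if the page stays
 * active (it gets a second chance), 0 if it is moved to the inactive list. */
int scan_active(struct page *p)
{
    assert(p->active);
    if (p->ref) {
        p->ref = 0;
        return 1;
    }
    p->active = 0;
    return 0;
}
```

In this model a page needs two accesses to be promoted to the active list and two scans without an intervening access to be demoted again.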
![Page 21: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/21.jpg)
Outline
• Overview of Linux Memory Management
• Page Reclamation
• Swap & I/O
![Page 22: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/22.jpg)
Swap
• Swap lets the kernel reclaim all the page frames obtained by a process, not only those that have an image on disk:
– Anonymous pages (user stack or heap)
– Dirty pages that belong to a private memory mapping of a process
– IPC shared pages
![Page 23: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/23.jpg)
Swap (Cont.)
• Set up “swap areas” on disk.
• Allocate and free “page slots” in swap areas.
• Provide functions both to “swap out” pages from RAM into a swap area and to “swap in” pages from a swap area into RAM.
• Mark page table entries to keep track of the positions of data in the swap areas.
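The last point can be illustrated by packing a swap area number and a page slot index into a page table entry whose present bit is clear. The bit layout below (bit 0 present, 7 bits of area number, 24 bits of slot index) is an assumption chosen for the sketch, and the function names are hypothetical. The sketch also assumes that slot 0 of an area is never handed out, so a swapped-out entry can never be confused with an empty, all-zero one.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT    0x1u
#define SWP_AREA_SHIFT 1     /* assumed: area number in bits 1..7 */
#define SWP_AREA_BITS  7
#define SWP_SLOT_SHIFT 8     /* assumed: page slot index in bits 8..31 */
#define SWP_SLOT_BITS  24

/* Build the PTE value that records where a swapped-out page lives. */
uint32_t swp_pte(unsigned area, uint32_t slot)
{
    assert(area < (1u << SWP_AREA_BITS));
    assert(slot != 0 && slot < (1u << SWP_SLOT_BITS));
    return (slot << SWP_SLOT_SHIFT) | ((uint32_t)area << SWP_AREA_SHIFT);
}

/* A non-zero PTE with the present bit clear refers to a swap slot. */
int pte_is_swapped(uint32_t pte)
{
    return pte != 0 && !(pte & PTE_PRESENT);
}

unsigned swp_area(uint32_t pte)
{
    return (pte >> SWP_AREA_SHIFT) & ((1u << SWP_AREA_BITS) - 1);
}

uint32_t swp_slot(uint32_t pte)
{
    return pte >> SWP_SLOT_SHIFT;
}
```

On a page fault, an entry for which `pte_is_swapped` holds tells the swap-in path which area and slot to read the page from.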
![Page 24: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/24.jpg)
Example
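$ free -m    (before running the loop)
                    total   used   free   shared   buffers   cached
Mem:                 2013   1811    201        0       157      872
-/+ buffers/cache:           782   1231
Swap:                 397      0    397

#include <stdlib.h>
#include <string.h>
#define N (1 << 20)  /* bytes per allocation; assumed, the slide does not give N */
int main(void) {
    while (1) {
        char *p = malloc(N);
        if (p) memset(p, 0, N);  /* touching the pages triggers demand paging */
    }
}

$ free -m    (while the loop runs; (+)/(-) mark the change from the first output)
                    total      used     free   shared   buffers   cached
Mem:                 2013   1956(+)    56(-)        0      4(-)   109(-)
-/+ buffers/cache:          1842(+)   170(-)
Swap:                 397         8      389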
![Page 25: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/25.jpg)
Linux I/O Architecture
• How can a layer be bypassed?
• The default file I/O APIs, such as fwrite(), are buffered
• File system: (dir, name, offset) -> LBA
• Device file: not a regular file
![Page 26: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/26.jpg)
I/O Bypassing
• Disk Cache
– O_DIRECT
• File System
– Device file
• I/O Scheduler
– To be solved
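As a sketch of the first bypass, the helper below reads from a file opened with O_DIRECT so that the data does not go through the page cache. O_DIRECT is Linux-specific and expects the buffer and the transfer length to be suitably aligned. The 4096-byte alignment is an assumption that covers common block sizes, and the helper name is hypothetical.

```c
#define _GNU_SOURCE            /* must precede the includes to expose O_DIRECT */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define DIO_ALIGN 4096         /* assumed alignment for buffer and length */

/* Read up to len bytes from the start of path, bypassing the page cache.
 * len must be a multiple of DIO_ALIGN. On success, returns the number of
 * bytes read and stores in *out a buffer that the caller must free().
 * Returns -1 on failure and leaves *out untouched. */
ssize_t read_direct(const char *path, size_t len, void **out)
{
    assert(len > 0 && len % DIO_ALIGN == 0);

    void *buf = NULL;
    if (posix_memalign(&buf, DIO_ALIGN, len) != 0)
        return -1;

    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    ssize_t n = read(fd, buf, len);
    close(fd);
    if (n < 0) {
        free(buf);
        return -1;
    }

    *out = buf;
    return n;
}
```

Some file systems reject O_DIRECT, in which case the open fails and the helper returns -1. Passing the path of a block device file instead of a regular file would correspond to the second bypass on the slide, since the read then no longer goes through a file system.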
![Page 27: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/27.jpg)
Thanks Q&A
![Page 28: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/28.jpg)
Reference
• Understanding the Linux Kernel, 3rd edition
• Understanding the Linux Virtual Memory Manager
![Page 29: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/29.jpg)
BACKUP SLIDES
![Page 30: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/30.jpg)
Page Table Management
• Three Level Mapping
![Page 31: VM and IO Topics in Linux](https://reader033.fdocuments.in/reader033/viewer/2022051016/5590cb8c1a28ab3e538b479d/html5/thumbnails/31.jpg)
Page Table Management (Cont.)
[Figure: the MMU translates a linear address into a physical address, starting from the PGD address.]