
Where'd all my memory go?

Joshua Miller, SCALE 12x, 22 FEB 2014

The Incomplete Story

Computers have memory, which they use to run applications.

Cruel Reality

swap

caches

buffers

shared

virtual

resident

more...

Topics

Memory basics: paging, swapping, caches, buffers

Overcommit

Filesystem cache

Kernel caches and buffers

Shared memory

top is awesome

top - 15:57:33 up 131 days, 8:02, 3 users, load average: 0.00, 0.00, 0.00
Tasks: 129 total, 1 running, 128 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.2%us, 0.3%sy, 0.3%ni, 99.0%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3858692k total, 3149296k used, 709396k free, 261556k buffers
Swap: 0k total, 0k used, 0k free, 1081832k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8131 root 30 10 243m 50m 3748 S 0.0 1.3 0:51.97 chef-client
8153 root 30 10 238m 19m 7840 S 0.0 0.5 1:35.48 sssd_be
8154 root 30 10 208m 15m 14m S 0.0 0.4 0:08.03 sssd_nss
7767 root 30 10 50704 8748 1328 S 1.0 0.2 1559:39 munin-asyncd
7511 root 30 10 140m 7344 580 S 0.0 0.2 13:06.29 munin-node
3379 root 20 0 192m 4116 652 S 0.0 0.1 48:20.28 snmpd
7026 root 20 0 113m 3992 3032 S 0.0 0.1 0:00.02 sshd


Physical memory used and free

Swap used and free


Per-process breakdown of virtual, resident, and shared memory

Percentage of RES/total memory


Kernel buffers and caches (no association with swap, despite being on the same row)

/proc/meminfo

[jmiller@meminfo]$ cat /proc/meminfo
MemTotal: 3858692 kB
MemFree: 3445624 kB
Buffers: 19092 kB
Cached: 128288 kB
SwapCached: 0 kB
...


Many useful values which we'll refer to throughout the presentation

Overcommit

top - 14:57:44 up 137 days, 7:02, 6 users, load average: 0.03, 0.02, 0.00
Tasks: 141 total, 1 running, 140 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.2%sy, 0.0%ni, 99.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3858692k total, 3075728k used, 782964k free, 283648k buffers
Swap: 0k total, 0k used, 0k free, 1073320k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
22385 jmiller 20 0 18.6g 572 308 S 0.0 0.0 0:00.00 bloat


4G of physical memory and no swap, so how can bloat have 18.6g virtual?


Virtual memory is not physical memory plus swap

A process can request huge amounts of memory, but it isn't mapped to real memory until actually referenced



Linux filesystem caching

Free memory is used to cache filesystem contents.

Over time systems can appear to be out of memory because all of the free memory is used for cache.


About 25% of this system's memory is in the page cache


Linux filesystem caching

Additions and removals from the cache are transparent to applications

Tunable through /proc/sys/vm/swappiness

Can be dropped - echo 1 > /proc/sys/vm/drop_caches

Under memory pressure, memory is freed automatically* (*usually)

Where'd my memory go?

top - 16:40:53 up 137 days, 8:45, 5 users, load average: 0.88, 0.82, 0.46
Tasks: 138 total, 1 running, 137 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 0.0%sy, 0.0%ni, 100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.0%us, 0.0%sy, 0.0%ni, 100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3858692k total, 1549480k used, 2309212k free, 25804k buffers
Swap: 0k total, 0k used, 0k free, 344280k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
28285 root 30 10 238m 17m 6128 S 0.0 0.5 1:39.42 sssd_be
7767 root 30 10 50704 8732 1312 S 0.0 0.2 1659:37 munin-asyncd
7511 root 30 10 140m 7344 580 S 0.0 0.2 13:56.68 munin-node
3379 root 20 0 192m 4116 652 S 0.0 0.1 50:31.44 snmpd




1.5G used - 345MB cache - 25MB buffer - 106MB total process RSS = ~1GB mystery

What is consuming a GB of memory?

kernel slab cache

The kernel uses free memory for its own caches.

Some include:

dentries (directory cache)

inodes

buffers

kernel slab cache

[jmiller@mem-mystery ~]$ slabtop -o -s c
 Active / Total Objects (% used)    : 2461101 / 2468646 (99.7%)
 Active / Total Slabs (% used)      : 259584 / 259586 (100.0%)
 Active / Total Caches (% used)     : 104 / 187 (55.6%)
 Active / Total Size (% used)       : 835570.40K / 836494.74K (99.9%)
 Minimum / Average / Maximum Object : 0.02K / 0.34K / 4096.00K

OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
624114 624112 99% 1.02K 208038 3 832152K nfs_inode_cache
631680 631656 99% 0.19K 31584 20 126336K dentry
649826 649744 99% 0.06K 11014 59 44056K size-64
494816 494803 99% 0.03K 4418 112 17672K size-32
186 186 100% 32.12K 186 1 11904K kmem_cache
4206 4193 99% 0.58K 701 6 2804K inode_cache
6707 6163 91% 0.20K 353 19 1412K vm_area_struct
2296 2290 99% 0.55K 328 7 1312K radix_tree_node


1057MB of kernel slab cache


Answer: the kernel slab cache, 1057MB

Additions and removals from the cache are transparent to applications

Tunable through /proc/sys/vm/vfs_cache_pressure

Under memory pressure, memory is freed automatically* (*usually)


kernel slab cache
network buffers example

[jmiller@mem-mystery2 ~]$ slabtop -s c -o
 Active / Total Objects (% used)    : 2953761 / 2971022 (99.4%)
 Active / Total Slabs (% used)      : 413496 / 413496 (100.0%)
 Active / Total Caches (% used)     : 106 / 188 (56.4%)
 Active / Total Size (% used)       : 1633033.85K / 1635633.87K (99.8%)
 Minimum / Average / Maximum Object : 0.02K / 0.55K / 4096.00K

OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
1270200 1270170 99% 1.00K 317550 4 1270200K size-1024
1269480 1269406 99% 0.25K 84632 15 338528K skbuff_head_cache
325857 325746 99% 0.06K 5523 59 22092K size-64


~1.5G used, this time for in-use network buffers (SO_RCVBUF)

Unreclaimable slab

[jmiller@mem-mystery2 ~]$ grep -A 2 ^Slab /proc/meminfo
Slab: 1663820 kB
SReclaimable: 9900 kB
SUnreclaim: 1653920 kB


Some slab objects can't be reclaimed, and memory pressure won't automatically free the resources

Nitpick Accounting

[jmiller@postgres ~]$ ./memory_explain.sh
"free" buffers (MB) : 277
"free" caches (MB) : 4650
"slabtop" memory (MB) : 109.699
"ps" resident process memory (MB) : 366.508

"free" used memory (MB) : 5291
buffers+caches+slab+rss (MB) : 5403.207
difference (MB) : -112.207

Now we can account for all memory utilization.


But sometimes we're using more memory than we're using?!

And a cache complication...

top - 12:37:01 up 66 days, 23:38, 3 users, load average: 0.08, 0.02, 0.01
Tasks: 188 total, 1 running, 187 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.3%us, 0.6%sy, 0.0%ni, 98.9%id, 0.1%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 7673860k total, 6895008k used, 778852k free, 300388k buffers
Swap: 0k total, 0k used, 0k free, 6179780k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2189 postgres 20 0 5313m 2.8g 2.8g S 0.0 38.5 7:09.20 postgres


~7G used, ~6G cached, so how can postgres have 2.8G resident?

Shared memory

Pages that multiple processes can access

Resident, shared, and in the page cache

Not subject to cache flush

shmget()

mmap()

Shared memory
shmget() example

Shared memory
shmget()

top - 21:08:20 up 147 days, 13:12, 9 users, load average: 0.03, 0.04, 0.00
Tasks: 150 total, 1 running, 149 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.3%us, 1.5%sy, 0.4%ni, 96.7%id, 1.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3858692k total, 1114512k used, 2744180k free, 412k buffers
Swap: 0k total, 0k used, 0k free, 931652k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
20599 jmiller 20 0 884m 881m 881m S 0.0 23.4 0:06.52 share


Shared memory is in the page cache!

Shared memory
shmget()

top - 21:21:29 up 147 days, 13:25, 9 users, load average: 0.34, 0.18, 0.06
Tasks: 151 total, 1 running, 150 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.6%sy, 0.4%ni, 98.9%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3858692k total, 1099756k used, 2758936k free, 844k buffers
Swap: 0k total, 0k used, 0k free, 914408k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
22058 jmiller 20 0 884m 881m 881m S 0.0 23.4 0:05.00 share
22059 jmiller 20 0 884m 881m 881m S 0.0 23.4 0:03.35 share
22060 jmiller 20 0 884m 881m 881m S 0.0 23.4 0:03.40 share

3x processes, but same resource utilization - about 1GB


From /proc/meminfo:
Mapped: 912156 kB
Shmem: 902068 kB

Shared memory
mmap() example

Shared memory
mmap()

top - 21:46:04 up 147 days, 13:50, 10 users, load average: 0.24, 0.21, 0.11
Tasks: 152 total, 1 running, 151 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.3%us, 1.6%sy, 0.2%ni, 94.9%id, 3.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3858692k total, 1648992k used, 2209700k free, 3048k buffers
Swap: 0k total, 0k used, 0k free, 1385724k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
24569 jmiller 20 0 2674m 1.3g 1.3g S 0.0 35.4 0:03.04 mapped

From /proc/meminfo:
Mapped: 1380664 kB
Shmem: 212 kB

Shared memory
mmap()

top - 21:48:06 up 147 days, 13:52, 10 users, load average: 0.21, 0.18, 0.10
Tasks: 154 total, 1 running, 153 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.2%us, 0.7%sy, 0.2%ni, 98.8%id, 0.0%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 3858692k total, 1659936k used, 2198756k free, 3248k buffers
Swap: 0k total, 0k used, 0k free, 1385732k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
24592 jmiller 20 0 2674m 1.3g 1.3g S 0.0 35.4 0:01.26 mapped
24586 jmiller 20 0 2674m 1.3g 1.3g S 0.0 35.4 0:01.28 mapped
24599 jmiller 20 0 2674m 1.3g 1.3g S 0.0 35.4 0:01.29 mapped

From /proc/meminfo:
Mapped: 1380664 kB
Shmem: 212 kB


Not counted as shared, but mapped


105%! (three processes at 35.4 %MEM each sum past 100%)

A subtle difference between shmget() and mmap()...

Locked shared memory

Memory from shmget() must be explicitly released by a shmctl(shmid, IPC_RMID, NULL) call

Process termination doesn't free the memory

Not the case for mmap()

Locked shared memory
shmget()

top - 11:36:35 up 151 days, 3:41, 3 users, load average: 0.09, 0.10, 0.03
Tasks: 129 total, 1 running, 128 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.4%sy, 0.4%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3858692k total, 1142248k used, 2716444k free, 3248k buffers
Swap: 0k total, 0k used, 0k free, 934360k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
24376 root 30 10 253m 60m 3724 S 0.0 1.6 0:35.84 chef-client
24399 root 30 10 208m 15m 14m S 0.0 0.4 0:03.22 sssd_nss
7767 root 30 10 50704 8736 1312 S 1.0 0.2 1886:38 munin-asyncd

~900M of cache


'echo 3 > /proc/sys/vm/drop_caches' has no impact on the cached value, so it's not filesystem caching


Processes consuming way less than ~900M


From /proc/meminfo:
Mapped: 27796 kB
Shmem: 902044 kB


Un-attached shared memory segment(s)


Observable through 'ipcs -a'

Accounting for shared memory is difficult

top reports memory that can be shared but might not be

ps doesn't account for shared

pmap splits mapped vs shared, reports allocated vs used

mmap'd files are shared, until modified at which point they're private


Linux filesystem cache


What's inside? Do you need it?

/etc/motd?

detritus?

Important app data?


Linux filesystem cache

We know shared memory is in the page cache, which we can largely understand through proc

From /proc/meminfo:
Cached: 367924 kB
...
Mapped: 31752 kB
Shmem: 196 kB



But what about the rest of what's in the cache?



Linux filesystem cache

Bad news: we can't just ask "What's in the cache?"

Good news: we can ask "Is this file in the cache?"


linux-ftools
https://code.google.com/p/linux-ftools/

[jmiller@cache ~]$ linux-fincore /tmp/big
filename  size       cached_pages  cached_size  cached_perc
--------  ----       ------------  -----------  -----------
/tmp/big  4,194,304  0             0            0.00
---
total cached size: 0



Zero % cached



[jmiller@cache ~]$ dd if=/tmp/big of=/dev/null bs=1k count=50

Read ~5%



[jmiller@cache ~]$ linux-fincore /tmp/big
filename  size       cached_pages  cached_size  cached_perc
--------  ----       ------------  -----------  -----------
/tmp/big  4,194,304  60            245,760      5.86
---
total cached size: 245,760

~5% cached


system tap cache hits
https://sourceware.org/systemtap/wiki/WSCacheHitRate

[jmiller@stap ~]$ sudo stap /tmp/cachehit.stap

Cache Reads (KB)  Disk Reads (KB)  Miss Rate  Hit Rate
508236            24056            4.51%      95.48%
0                 43600            100.00%    0.00%
0                 59512            100.00%    0.00%
686012            30624            4.27%      95.72%
468788            0                0.00%      100.00%
17000             63256            78.81%     21.18%
0                 67232            100.00%    0.00%
0                 19992            100.00%    0.00%



Track reads against VFS, reads against disk, then infer cache hits



But you have to account for LVM, device mapper, remote disk devices (NFS, iSCSI), ...


Easy mode - drop_caches

echo 1 | sudo tee /proc/sys/vm/drop_caches

frees clean cache pages immediately

frequently accessed files should be re-cached quickly

performance impact while caches repopulated

Filesystem cache contents

No ability to easily see full contents of cache

mincore() - but have to check every file

Hard - system tap / dtrace inference

Easy - drop_caches and observe impact

Memory: The Big Picture

Physical memory

Swap

Virtual memory


Physical Memory

Free

Used

Private application memory

Kernel caches (SLAB)

Buffer cache (block IO)

Page cache

Shared memory

Filesystem cache

Thanks!


Send feedback to me:
joshuamiller01 on gmail