Memory info under Linux
I have been herding Linuxes since 1997. One of the first things I learned was monitoring the systems. Back then it was MRTG, then Munin, maybe other tools at different jobs I had, and now it's Prometheus + Grafana. The most basic things you can ask about your system are four: CPU, memory, disk and network usage. Three of them are quite easy to monitor; the CPU is a little bit more complicated because of the few¹ ways the CPU can be used.
But memory...
You will always find articles saying that memory on Linux is hard. People say that because of the read-only memory pages
containing code that are shared among several processes (for instance, two processes running the same binary; or all
processes dynamically linked against libc, which is most of them; or two processes mmap()'ing the same parts of the same file).
But that's not the only reason. The kernel maintains, as counted today, 56 different values related to memory, which are
exposed in the virtual text file /proc/meminfo:
MemTotal: 32718364 kB
MemFree: 3134544 kB
MemAvailable: 19557500 kB
Buffers: 5760 kB
Cached: 19520212 kB
SwapCached: 7185472 kB
Active: 7179364 kB
Inactive: 20880636 kB
Active(anon): 5054052 kB
Inactive(anon): 6573036 kB
Active(file): 2125312 kB
Inactive(file): 14307600 kB
Unevictable: 365728 kB
Mlocked: 25216 kB
SwapTotal: 51216380 kB
SwapFree: 30697060 kB
Zswap: 0 kB
Zswapped: 0 kB
Dirty: 58600 kB
Writeback: 64 kB
AnonPages: 8586988 kB
Mapped: 10866048 kB
Shmem: 3308476 kB
KReclaimable: 459200 kB
Slab: 649052 kB
SReclaimable: 459200 kB
SUnreclaim: 189852 kB
KernelStack: 23216 kB
PageTables: 174396 kB
SecPageTables: 3532 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 67575560 kB
Committed_AS: 49285284 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 114008 kB
VmallocChunk: 0 kB
Percpu: 9696 kB
HardwareCorrupted: 0 kB
AnonHugePages: 518144 kB
ShmemHugePages: 83968 kB
ShmemPmdMapped: 0 kB
FileHugePages: 61440 kB
FilePmdMapped: 0 kB
Unaccepted: 0 kB
Balloon: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
DirectMap4k: 270560 kB
DirectMap2M: 6918144 kB
DirectMap1G: 26214400 kB
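By the way, the file format is trivial to parse if you want to graph these values yourself. Here's a minimal Python sketch of my own (the function name and sample string are mine, not from any library); on a real Linux box you would feed it open("/proc/meminfo").read() instead:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style output into a {field: int} dict.

    Values are in kB, except the HugePages_* fields, which are plain
    counts of pages.
    """
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if key and fields:
            info[key.strip()] = int(fields[0])
    return info

# Small sample so the sketch runs anywhere.
sample = """MemTotal:       32718364 kB
MemFree:         3134544 kB
HugePages_Total:       0"""

info = parse_meminfo(sample)
print(info["MemFree"])  # 3134544
```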
These 56 values slice and dice memory accounting in several ways, so let's try to figure them all out.
Active: 7179364 kB
Inactive: 20880636 kB
Active(anon): 5054052 kB
Inactive(anon): 6573036 kB
Active(file): 2125312 kB
Inactive(file): 14307600 kB
Unevictable: 365728 kB
Mlocked: 25216 kB
Dirty: 58600 kB
Writeback: 64 kB
AnonPages: 8586988 kB
Mapped: 10866048 kB
Shmem: 3308476 kB
KReclaimable: 459200 kB
SReclaimable: 459200 kB
SUnreclaim: 189852 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 67575560 kB
Committed_AS: 49285284 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 114008 kB
VmallocChunk: 0 kB
Percpu: 9696 kB
HardwareCorrupted: 0 kB
AnonHugePages: 518144 kB
ShmemHugePages: 83968 kB
ShmemPmdMapped: 0 kB
FileHugePages: 61440 kB
FilePmdMapped: 0 kB
Unaccepted: 0 kB
Balloon: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
DirectMap4k: 270560 kB
DirectMap2M: 6918144 kB
DirectMap1G: 26214400 kB
Let's start with what is not physical memory. The Linux kernel manages memory in pages. The pages can be anywhere in physical memory... or not: under memory pressure, pages can be evicted and stored in swap. So let's look there first.
SwapTotal: 51216380 kB
SwapFree: 30697060 kB
Implied is the amount of swap used. There's an extra parameter related to swap:
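The used swap is just the difference; with the values above:

```python
swap_total = 51216380  # kB, SwapTotal from the dump above
swap_free = 30697060   # kB, SwapFree
swap_used = swap_total - swap_free
print(swap_used)  # 20519320 kB, a bit over 19.5 GiB
```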
SwapCached: 7185472 kB
This is our first informative value. It counts the amount of swapped-out pages that have been brought back to physical memory, but that still have a copy in swap. What is not clear from the docs is whether that space is counted as part of the free swap space or not, but my money is on the free side. This is probably an optimization: pages that are in memory but also in the swap cache can be the first to be evicted, because they're already written out. Of course, this only works for pages that have not been modified since they were brought back to RAM; those that have must also be removed from the swap cache.
There are another two values related to swap, this time to compressed swap:
Zswap: 0 kB
Zswapped: 0 kB
I don't use it, but it's there. It feels weird that this time it's size + used, not the usual size + free, but it makes sense: since you can't know beforehand how well pages will compress, you can't know how much you can cram into whatever actual space is left. In any case, the kernel must know internally how much actual free space there is; it just doesn't show it.
Enough about swap. For the physical memory, the most basic values are:
MemTotal: 32718364 kB
MemFree: 3134544 kB
How much physical memory the system has, and how much is actually free, not allocated to anything. The difference between the two is how much memory is used. Many values in the kernel are like that: a total size and a free space, with the amount used being the difference of the two. If you think about it, the most important one of that triangle is the free one; when it gets low we get into trouble.
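With the numbers above, that difference works out to:

```python
mem_total = 32718364  # kB, MemTotal
mem_free = 3134544    # kB, MemFree
mem_used = mem_total - mem_free
print(mem_used)                              # 29583820 kB
print(round(100 * mem_free / mem_total, 1))  # 9.6 (% free)
```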
The used memory is split between several subsystems, which we can roughly describe as: memory used for kernel structures needed to keep the system running, caches, applications, and another category I'll talk about later. The kernel structures include:
KernelStack: 23216 kB
PageTables: 174396 kB
SecPageTables: 3532 kB
Buffers: 5760 kB
Cached: 19520212 kB
Slab: 649052 kB
The buffers are I/O buffers for devices, mostly storage. The cache is the file system cache. The slab is a cache of frequently used kernel objects. All these pages are another optimization, and they can be reclaimed at any time if needed, after they have been flushed to the relevant devices. That leads us to another value:
MemAvailable: 19557500 kB
I don't understand why, but this is an estimate of how much memory is available for "starting new applications" without swapping. Why new applications and not current applications just asking for more memory? Unclear.
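For what it's worth, the kernel computes this estimate in si_mem_available(). As far as I can tell (this is my reading of kernel internals, not something the docs spell out), it's roughly the free pages above the low watermark, plus the page cache and the reclaimable slab, each discounted because not all of it can be reclaimed cheaply. A rough sketch of that idea, not the exact kernel formula (the watermark value here is invented):

```python
def approx_mem_available(mem_free, page_cache, sreclaimable, wmark_low):
    """Rough sketch of the kernel's MemAvailable estimate.

    All values in kB. The real si_mem_available() uses per-zone
    watermarks and Active(file) + Inactive(file) for the page cache;
    this only shows the shape of the computation.
    """
    available = mem_free - wmark_low
    # The kernel assumes at most half of the page cache and of the
    # reclaimable slab can be given up without a big performance hit.
    available += page_cache - min(page_cache // 2, wmark_low)
    available += sreclaimable - min(sreclaimable // 2, wmark_low)
    return max(available, 0)

# Values from the dump above, with an invented 64 MiB low watermark.
# The result lands in the same ballpark as the reported MemAvailable.
print(approx_mem_available(3134544, 19520212, 459200, 65536))
```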
If we do the math with the reclaimable ones we just mentioned we get:
In [1]: 19520212 + 649052 + 5760 - 19557500
Out[1]: 617524
An extra ~600 MiB I can't account for. Furthermore, I said buffers, cache and slab are reclaimable, but that's not 100% true:
On the other hand, the most detailed memory monitoring graph I have seen is Munin's:
So my quest this time was to get the same graph on Grafana.
Luckily Munin is open source, and we can see how it does it:
https://github.com/munin-monitoring/munin/blob/master/plugins/node.d.linux/memory#L252
-
¹ right now 7: system, irq, softirq, steal, iowait, user and nice.