VmSize = physical memory + swap? - linux

I have a little question regarding VmSize. According to the documentation, it is supposed to reflect the application's memory usage.
However, on my system:
VmSize = physical memory + swap
VmHWM seems much closer to what the application would actually be using.
[root@sun ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         12012       9223       2788          0        613       1175
-/+ buffers/cache:       7434       4577
Swap:         3967          0       3967
[root@sun ~]# cat /proc/8268/status
Name: mysqld
State: S (sleeping)
Tgid: 8268
Pid: 8268
PPid: 1
TracerPid: 0
Uid: 89 89 89 89
Gid: 89 89 89 89
FDSize: 512
Groups: 89
VmPeak: 15878128 kB
VmSize: 15878128 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 7036312 kB
VmRSS: 7036312 kB
VmData: 15839272 kB
VmStk: 136 kB
VmExe: 10744 kB
VmLib: 6356 kB
VmPTE: 16208 kB
VmSwap: 0 kB
Threads: 265
SigQ: 0/96048
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000087007
SigIgn: 0000000000001000
SigCgt: 00000001800066e9
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000001fffffffff
Seccomp: 0
Cpus_allowed: fff
Cpus_allowed_list: 0-11
Mems_allowed: 00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 2567
nonvoluntary_ctxt_switches: 77
Any idea why?
I am trying to get the memory usage of this particular application, but this result doesn't really make sense.
Thanks.

VmSize is the address space the process has in use: the number of available addresses. These addresses do not have to have any physical memory attached to them (the attached physical memory is the RSS figure).
You can verify this by allocating a chunk of memory with p = malloc(4 * 1024 * 1024); and not doing anything with *p: VmSize will increase by 1024 (4 KiB) pages, but RSS will stay (about) the same. Your program gains more addressable memory, but since it does not actually address it, no physical memory needs to be attached.
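A minimal sketch of that experiment (the 64 MiB size and the print_vm() helper are illustrative choices, not anything from the original post): it prints VmSize and VmRSS before the allocation, after the untouched malloc(), and again after every page has been written to.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static void print_vm(const char *label)
{
    char line[256];
    FILE *f = fopen("/proc/self/status", "r");
    if (!f)
        return;
    printf("--- %s\n", label);
    while (fgets(line, sizeof line, f))
        if (strncmp(line, "VmSize:", 7) == 0 || strncmp(line, "VmRSS:", 6) == 0)
            fputs(line, stdout);
    fclose(f);
}

int main(void)
{
    size_t len = 64 * 1024 * 1024;            /* 64 MiB */
    print_vm("before malloc");
    char *p = malloc(len);                    /* allocated but never touched */
    print_vm("after malloc: VmSize grows, VmRSS barely moves");
    memset(p, 0, len);                        /* fault every page in */
    print_vm("after touching the pages: VmRSS catches up");
    free(p);
    return 0;
}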

VmSize is the sum of all mapped memory (/proc/pid/maps)
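As a quick cross-check of this answer, a small sketch (mine, not from the answer) that sums the address ranges listed in /proc/self/maps; the total should come out (almost) identical to the VmSize reported in /proc/self/status:

#include <stdio.h>

int main(void)
{
    unsigned long start, end, total = 0;
    char line[512];
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f)
        return 1;
    while (fgets(line, sizeof line, f))
        if (sscanf(line, "%lx-%lx", &start, &end) == 2)
            total += end - start;             /* size of this mapping */
    fclose(f);
    printf("sum of mappings: %lu kB (compare with VmSize in /proc/self/status)\n",
           total / 1024);
    return 0;
}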

Related

cgroup and process memory usage mismatch

I have a memory cgroup with a single process in it.
The rss figure for that cgroup (in memory.stat) is much bigger than the RSS of the process itself (from /proc/[pid]/status).
The only process pid in cgroup:
$ cat /sys/fs/cgroup/memory/karim/master/cgroup.procs
3744924
The memory limit in cgroup:
$ cat /sys/fs/cgroup/memory/karim/master/memory.limit_in_bytes
7340032000
rss of the cgroup is 990 MB:
$ cat /sys/fs/cgroup/memory/karim/master/memory.stat
cache 5990449152
rss 990224384
rss_huge 0
shmem 0
mapped_file 13516800
dirty 1081344
writeback 270336
pgpgin 4195191
pgpgout 2490628
pgfault 5264589
pgmajfault 0
inactive_anon 0
active_anon 990240768
inactive_file 5862830080
active_file 127021056
unevictable 0
hierarchical_memory_limit 7340032000
total_cache 5990449152
total_rss 990224384
total_rss_huge 0
total_shmem 0
total_mapped_file 13516800
total_dirty 1081344
total_writeback 270336
total_pgpgin 4195191
total_pgpgout 2490628
total_pgfault 5264589
total_pgmajfault 0
total_inactive_anon 0
total_active_anon 990240768
total_inactive_file 5862830080
total_active_file 127021056
total_unevictable 0
rss of the process is 165 MB:
$ cat /proc/3744924/status
Name: [main] /h
Umask: 0002
State: S (sleeping)
Tgid: 3744924
Ngid: 0
Pid: 3744924
PPid: 3744912
TracerPid: 0
Uid: 1000 1000 1000 1000
Gid: 1001 1001 1001 1001
FDSize: 256
Groups: 1000 1001
NStgid: 3744924
NSpid: 3744924
NSpgid: 3744912
NSsid: 45028
VmPeak: 2149068 kB
VmSize: 2088876 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 245352 kB
VmRSS: 198964 kB
RssAnon: 165248 kB
RssFile: 33660 kB
RssShmem: 56 kB
VmData: 575400 kB
VmStk: 132 kB
VmExe: 3048 kB
VmLib: 19180 kB
VmPTE: 1152 kB
VmSwap: 0 kB
HugetlbPages: 0 kB
CoreDumping: 0
THP_enabled: 1
Threads: 17
SigQ: 0/241014
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000001001000
SigCgt: 0000000180000002
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Speculation_Store_Bypass: thread vulnerable
Cpus_allowed: fff
Cpus_allowed_list: 0-11
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 94902
nonvoluntary_ctxt_switches: 1903
Why is there such a big difference?
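For anyone reproducing the comparison, a minimal sketch that prints the relevant figures side by side; the paths are hard-coded from the output above and would need adjusting. Note that memory.stat reports bytes while /proc/<pid>/status reports kB.

#include <stdio.h>
#include <string.h>

/* Hard-coded paths taken from the question; adjust for your own cgroup/pid. */
#define CG_STAT  "/sys/fs/cgroup/memory/karim/master/memory.stat"
#define PSTATUS  "/proc/3744924/status"

static void grep_key(const char *path, const char *key)
{
    char line[256];
    FILE *f = fopen(path, "r");
    if (!f) {
        perror(path);
        return;
    }
    while (fgets(line, sizeof line, f))
        if (strncmp(line, key, strlen(key)) == 0)
            printf("%s: %s", path, line);
    fclose(f);
}

int main(void)
{
    grep_key(CG_STAT, "rss ");      /* cgroup-level anonymous memory, in bytes */
    grep_key(PSTATUS, "VmRSS:");    /* per-process resident set, in kB */
    grep_key(PSTATUS, "RssAnon:");  /* per-process anonymous portion, in kB */
    return 0;
}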

oom invoked even though enough memory available

I am using Android P, with kernel msm-4.14.
The OOM killer has been invoked and has been killing processes ever since boot-up, even though memory is abundant.
I have 8 GB of RAM and 1 GB of swap, and the swap is not even in use.
[ 59.901334] Killing 'ndroid.keychain' (2011), adj 906,
             to free 87268kB on behalf of 'Binder:883_2' (938)
             Free CMA is 246200kB
             Total reserve is 242332kB
             Total free pages is 5100764kB
             Total file cache is 978224kB
[ 59.903948] Killing 'Jit thread pool' (2016), adj 906,
             to free 88676kB on behalf of 'ActivityManager' (960)
             Free CMA is 246200kB
             Total reserve is 242332kB
             Total free pages is 5100764kB
             Total file cache is 978224kB
[ 60.007328] oom_reaper: reaped process 2011 (ndroid.keychain), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
$ free
              total        used        free      shared     buffers
Mem:     7842377728  3144630272  4697747456     2084864    14852096
-/+ buffers/cache:   3129778176  4712599552
Swap:    1073737728           0  1073737728
$ meminfo
MemTotal: 7658572 kB
MemFree: 4589120 kB
MemAvailable: 5800580 kB
Buffers: 14416 kB
Cached: 1415944 kB
SwapCached: 0 kB
Active: 630232 kB
Inactive: 1299508 kB
Active(anon): 501820 kB
Inactive(anon): 1876 kB
Active(file): 128412 kB
Inactive(file): 1297632 kB
Unevictable: 2888 kB
Mlocked: 2888 kB
SwapTotal: 1048572 kB
SwapFree: 1048572 kB
Dirty: 92 kB
Writeback: 0 kB
AnonPages: 502148 kB
Mapped: 745728 kB
Shmem: 2036 kB
Slab: 520776 kB
SReclaimable: 130688 kB
SUnreclaim: 390088 kB
KernelStack: 28336 kB
PageTables: 40972 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 4877856 kB
Committed_AS: 95986336 kB
VmallocTotal: 263061440 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
CmaTotal: 303104 kB
CmaFree: 246200 kB
I don't understand this situation.
Why does this happen? Is there a way to avoid it?
Answering myself:
This is not an out-of-memory condition. The oom_reaper is simply doing its work in response to a SIGKILL; memory is plentiful.
Thanks.

Child process does not generate a core file ONLY for SIGBUS errors and becomes a zombie process

My child process is trying to access a PCI address space. This works fine most of the time.
But sometimes the child process goes into the zombie state, and the dmesg log shows the following bus error.
[ 501.134156] Caused by (from MCSR=10008): Bus - Read Data Bus Error
[ 501.134169] Oops: Machine check, sig: 7 [#1]
There is no core file generated in this case.
[Linux:/]$ ps -axl | grep tes1
4 0 6805 32495 20 0 0 0 exit Zl ? 0:05 [test1] <defunct>
[Linux:/]$
A core file is generated when the child process hits a SIGSEGV, so I assume this has nothing to do with permission or ulimit settings.
Can someone help me understand why no core is generated in this case?
Child Process:
--------------
[Linux:/]$ cat /proc/6805/status
Name: test1
State: Z (zombie)
Tgid: 6805
Pid: 6805
PPid: 32495
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 0
Groups:
Threads: 2
SigQ: 18/13007
SigPnd: 0000000002000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000001006
SigCgt: 0000000182000200
CapInh: 0000000000000000
CapPrm: 0000001fffffffff
CapEff: 0000001fffffffff
CapBnd: 0000001fffffffff
Seccomp: 0
Cpus_allowed: 3
Cpus_allowed_list: 0-1
voluntary_ctxt_switches: 8998
nonvoluntary_ctxt_switches: 857
Stack:
-------
[Linux:/]$ cat /proc/6805/stack
[<00000000>] (nil)
[<c0008640>] __switch_to+0xc0/0x160
[<c004b4f4>] do_exit+0x5d4/0xa70
[<c000c694>] die+0x224/0x310
[<c000ce44>] machine_check_exception+0x124/0x1e0
[<c00123bc>] ret_from_mcheck_exc+0x0/0x14c
[Linux:/]$
Parent Process:
---------------
[Linux:/]$ cat /proc/32495/status
Name: test
State: S (sleeping)
Tgid: 32495
Pid: 32495
PPid: 21911
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 256
Groups:
VmPeak: 4820 kB
VmSize: 4820 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 2548 kB
VmRSS: 2548 kB
VmData: 1284 kB
VmStk: 132 kB
VmExe: 900 kB
VmLib: 1976 kB
VmPTE: 24 kB
VmSwap: 0 kB
Threads: 1
SigQ: 19/13007
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000010000
SigIgn: 0000000000001006
SigCgt: 0000000043816ef9
CapInh: 0000000000000000
CapPrm: 0000001fffffffff
CapEff: 0000001fffffffff
CapBnd: 0000001fffffffff
Seccomp: 0
Cpus_allowed: 3
Cpus_allowed_list: 0-1
voluntary_ctxt_switches: 274
nonvoluntary_ctxt_switches: 145
[Linux:/]$
I understand that the mmapped PCI hardware is not responding, so it is appropriate that only the kernel deals with the error.
The error is not propagated to user level because it is not a software fault; consequently, no core dump (kernel or user space) is produced.
The machine check exception handler in the kernel reports what the hardware failure was and which address/data is relevant (depending on the cause); this needs to be investigated further from the hardware side.
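For completeness, the termination can still be observed from the parent. A minimal harness (abort() merely stands in for the failing PCI access, which cannot be reproduced here): waitpid() reaps the zombie and reports both the terminating signal and whether a core file was produced.

#define _DEFAULT_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid == 0)
        abort();                  /* stand-in for the failing PCI access */

    int status;
    waitpid(pid, &status, 0);     /* reaps the child, clearing the zombie */
    if (WIFSIGNALED(status))
        printf("child killed by signal %d, core dumped: %s\n",
               WTERMSIG(status), WCOREDUMP(status) ? "yes" : "no");
    return 0;
}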

SIGBUS while doing memcpy from an mmap'ed buffer which is in RAM as identified by mincore

I am mmapping a block as:
mapAddr = mmap((void*) 0, curMapSize, PROT_NONE, MAP_LOCKED|MAP_SHARED, fd, curMapOffset);
if this does not fail (mapAddr != MAP_FAILED) I query mincore as:
err = mincore((char*) mapAddr, pageSize, &mincoreRet);
to find out whether it is in RAM. In case it is in RAM (err == 0 && mincoreRet & 0x01) I mmap it again for reading as:
copyAddr = mmap((void*) 0, curMapSize, PROT_READ, MAP_LOCKED|MAP_SHARED, fd, curMapOffset);
and then I try to copy it out to my buffer as:
memcpy(data, copyAddr, pageSize);
Everything works fine, except that once in a while the last memcpy gets a SIGBUS. When I check /proc/<pid>/smaps at the time of the failure, I notice that both the Rss and Locked fields are 0, as listed below:
7f4a4c118000-7f4a4c119000 r--s 00326000 00:17 6 <file name>
Size: 4 kB
Rss: 0 kB
Pss: 0 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 0 kB
Anonymous: 0 kB
AnonHugePages: 0 kB
Swap: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
Any thoughts? This is happening on Ubuntu 12.04 with kernel version 3.5.0-36.
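For reference, a self-contained version of the sequence described in the question (names such as curMapSize and curMapOffset are kept from the post; fd is assumed to be an open file descriptor with page-aligned offsets). Note that mmap(2) documents MAP_LOCKED as best-effort: the call does not fail if the pages cannot be locked, so the residency check and the copy are not atomic.

#define _DEFAULT_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

int copy_page(int fd, off_t curMapOffset, size_t curMapSize, void *data)
{
    size_t pageSize = (size_t) sysconf(_SC_PAGESIZE);
    unsigned char mincoreRet = 0;

    /* First mapping, used only to probe residency. */
    void *mapAddr = mmap(NULL, curMapSize, PROT_NONE,
                         MAP_LOCKED | MAP_SHARED, fd, curMapOffset);
    if (mapAddr == MAP_FAILED)
        return -1;

    int err = mincore((char *) mapAddr, pageSize, &mincoreRet);
    if (err == 0 && (mincoreRet & 0x01)) {
        /* Second mapping, readable, used for the actual copy. */
        void *copyAddr = mmap(NULL, curMapSize, PROT_READ,
                              MAP_LOCKED | MAP_SHARED, fd, curMapOffset);
        if (copyAddr != MAP_FAILED) {
            /* Because MAP_LOCKED is best-effort, the page may still be
               evicted between the mincore() check and this read. */
            memcpy(data, copyAddr, pageSize);
            munmap(copyAddr, curMapSize);
        }
    }
    munmap(mapAddr, curMapSize);
    return 0;
}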

What does pss mean in /proc/pid/smaps

I was confused about the Pss column in /proc/pid/smaps, so I wrote a program to test it:
#include <cstdio>
#include <unistd.h>

void sa();

int main(int argc, char *argv[])
{
    sa();
    sleep(1000);   // keep the process alive so smaps can be inspected
}

void sa()
{
    char *pi = new char[1024 * 1024 * 10];   // 10 MB
    // Write to the first 4 MB: these pages become private dirty.
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 1024 * 1024; ++j) {
            *pi = 'o';
            pi++;
        }
    }
    // Read the next 6 MB: these pages are only referenced, never written.
    int cnt = 0;
    for (int i = 0; i < 6; ++i) {
        for (int j = 0; j < 1024 * 1024; ++j) {
            cnt += *pi;
            pi++;
        }
    }
    printf("%d", cnt);
}
$ cat /proc/`pidof testprogram`/smaps
08838000-0885b000 rw-p 00000000 00:00 0 [heap]
Size: 140 kB
Rss: 12 kB
Pss: 12 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 12 kB
Referenced: 12 kB
Swap: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
b6dcd000-b77d0000 rw-p 00000000 00:00 0
Size: 10252 kB
Rss: 10252 kB
Pss: 4108 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4108 kB
Referenced: 4108 kB
Swap: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Here I found Pss equal to Private_Dirty, but I wonder why.
BTW: Is there any detailed documentation for smaps?
Quoting from lwn.net
The "proportional set size" (PSS) of a process is the count of pages
it has in memory, where each page is divided by the number of
processes sharing it. So if a process has 1000 pages all to itself,
and 1000 shared with one other process, its PSS will be 1500
From Linux Kernel Documentation,
The /proc/PID/smaps is an extension based on maps, showing the memory
consumption for each of the process's mappings. For each of mappings there
is a series of lines such as the following:
08048000-080bc000 r-xp 00000000 03:02 13130 /bin/bash
Size: 1084 kB
Rss: 892 kB
Pss: 374 kB
Shared_Clean: 892 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 892 kB
Anonymous: 0 kB
Swap: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 374 kB
The first of these lines shows the same information as is displayed
for the mapping in /proc/PID/maps. The remaining lines show the size
of the mapping (size), the amount of the mapping that is currently
resident in RAM (RSS), the process' proportional share of this mapping
(PSS), the number of clean and dirty private pages in the mapping.
Note that even a page which is part of a MAP_SHARED mapping, but has
only a single pte mapped, i.e. is currently used by only one process,
is accounted as private and not as shared. "Referenced" indicates the
amount of memory currently marked as referenced or accessed.
"Anonymous" shows the amount of memory that does not belong to any
file. Even a mapping associated with a file may contain anonymous
pages: when MAP_PRIVATE and a page is modified, the file page is
replaced by a private anonymous copy. "Swap" shows how much
would-be-anonymous memory is also used, but out on swap.
This question on the Unix & Linux Stack Exchange covers much the same topic. See Mat's excellent answer there, which should clear up any remaining doubts.
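To see the "divided by the number of sharers" rule in action, a small hypothetical demo: it dirties a 10 MB block (the same size as in the question) and then forks, so the copy-on-write pages become shared between two processes; while both are sleeping, the Pss of that block in /proc/<pid>/smaps should be roughly half of its Rss.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    size_t len = 10 * 1024 * 1024;
    char *p = malloc(len);
    memset(p, 'o', len);              /* fault every page in as private dirty */

    pid_t child = fork();             /* pages are now shared copy-on-write */
    if (child == 0) {
        sleep(60);                    /* child keeps its references alive */
        _exit(0);
    }
    printf("inspect /proc/%d/smaps within 60 seconds\n", (int) getpid());
    sleep(60);                        /* Pss of the block should be ~Rss/2 */
    return 0;
}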
