How to measure the stack size of a process? - linux

How do I find the stack size of a process?
/proc/5848/status gives me VmStk, but this doesn't change.
No matter how many while loops or how much recursion I do in my test program, this value hardly changes.
When I looked at /proc/<pid>/status, all of the processes show 136 kB, and I have no idea where that value comes from.
Thanks,

There really is no such thing as the "stack size of a process" on Linux. Processes have a starting stack, but as you see, they rarely allocate much from the standard stack. Instead, processes just allocate generic memory from the operating system and use it as a stack. So there's no way for the OS to know -- that detail is only visible from inside the process.
A typical modern Linux system imposes a default stack size limit of 8 MB. Yet processes routinely allocate much larger objects on their stack. That's because the application is using a stack that is purely application-managed, not a stack as far as the OS is concerned.
This is always true for multi-threaded processes. For single-threaded processes, it's possible they are actually just using very, very little stack.
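If you want a rough idea of how much stack a given process is actually touching, one thing to try (a sketch, reusing the PID 5848 from the question) is to look at the [stack] mapping in its smaps file:
# the Rss: line a few lines below the '[stack]' entry shows the resident stack usage
grep -A 20 '\[stack\]' /proc/5848/smaps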

Maybe you just want to get the address map of some process. For process 1234, read the /proc/1234/maps pseudo-file sequentially. For your own process, read /proc/self/maps.
Try
cat /proc/self/maps
to get a feeling of it (the above command displays the address map of the cat process executing it).
Read proc(5) man page for details.
You might also be interested by process limits, e.g. getrlimit(2) and related syscalls.
I am not sure that "stack size" has a precise meaning, notably for multi-threaded processes.
Maybe you are interested in mmap(2)-ed segments with MAP_GROWSDOWN.
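For example (just a sketch; 1234 is a placeholder PID), you can spot the stack mapping of your own shell and query the stack resource limit of another process with:
grep stack /proc/self/maps      # the main thread's stack shows up as the '[stack]' mapping
prlimit --stack --pid 1234      # stack-size limit of another process (util-linux prlimit)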

The stack size (the StkSize column) can be obtained with the pidstat command. Install it with apt install sysstat, then run:
pidstat -p 11577 -l -s
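If you want to watch it change over time, pidstat also takes an interval and a count (a quick sketch, same PID as above); StkSize is the stack reserved and StkRef the stack actually referenced:
pidstat -s -p 11577 2 5    # report stack utilization every 2 seconds, 5 times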

Related

Getting Details (Instruction Size, Read/write access, memory, core load) of Processes in Linux

I have been trying to develop an application that would model a system using graph theory (see [1]). Graph theory, basically, can be used to model runnables in order to figure out their partitions (grouped runnables) and to map them to cores.
In order to achieve this, we need a lot of information. Since I don't know in detail how the Linux OS (Raspbian in particular for us) schedules everything, and I'm interested in finding out how our algorithm will improve core utilization, I thought I could obtain the information about processes and try to model them myself.
For that purpose, we need:
Instruction size, i.e. how many instructions the CPU runs to complete the task (very important)
Memory needed for the process, physical memory and virtual memory
Core load for debugging the processes.
Read/write accesses: which process is it communicating with, is it a read or a write access, what kind of interface is it, and what are the instruction size and memory needed to read and/or write.
I think I can extract some of this information by using the 'top' command in Linux. It gives the core load, memory usage, and virtual and physical memory. I also think I should mention that I'm intending to use 'taskset' to place processes on cores and observe them (see [2]).
Now, the first question I have is: how do I effectively obtain the instruction sizes, r/w accesses, and the other things listed above?
The second question is: is there any possible way to see the runnables of a process, i.e. the simple functions it runs, along with their information and r/w accesses with each other? This question is simply about finding a way to model a process itself, rather than the interactions of processes.
Any help is greatly appreciated, as it will help our open-source multi-core platform research.
Thank you very much in advance.
[1] http://math.tut.fi/~ruohonen/GT_English.pdf
[2] To place a process on a core, I use:
pid=$(pgrep -u root -f "$process_name" -n)
sudo taskset -pc $core $pid &&
echo "Process $process_name with PID=$pid has been placed on core $core"

Linux memory usage history

I had a problem in which my server began failing some of its normal processes and checks because the server's memory was completely full and taken.
I looked in the logging history and found that what it killed were some Java processes.
I used the "top" command to see what processes were taking up the most memory right now(after the issue was fixed) and it was a Java process. So in essence, I can tell what processes are taking up the most memory right now.
What I want to know is if there is a way to see what processes were taking up the most memory at the time when the failures started happening? Perhaps Linux keeps track or a log of the memory usage at particular times? I really have no idea but it would be great if I could see that kind of detail.
@Andy has answered your question. However, I'd like to add that for future reference you should use a monitoring tool; something like these will show you what happened during a crash, since you obviously cannot monitor all your servers all the time. Hope it helps.
Are you saying the kernel OOM killer went off? What does the log in dmesg say? Note that you can constrain a JVM to use a fixed heap size, which means it will fail affirmatively when full instead of letting the kernel kill something else. But the general answer to your question is no: there's no way to reliably run anything at the time of an OOM failure, because the system is out of memory! At best, you can use a separate process to poll the process table and log process sizes to catch memory leak conditions, etc...
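As a rough sketch of that polling idea (the log path and 60-second interval are arbitrary), something like this left running in the background would at least leave a trail to look at after the next incident:
# log the ten biggest memory consumers every 60 seconds
while true; do
    date >> /var/log/memhog.log
    ps aux --sort=-rss | head -n 10 >> /var/log/memhog.log
    sleep 60
done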
There is no history of memory usage in Linux by default, but you can achieve it with a simple command-line tool like sar.
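For example, assuming the sysstat package is installed and its data collector is enabled (the paths below are the Debian/Ubuntu ones and may differ on other distributions):
apt install sysstat                 # the collector must also be enabled in /etc/default/sysstat
sar -r                              # memory utilization samples recorded today
sar -r -f /var/log/sysstat/sa05     # hypothetical: samples for the 5th of the month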
Regarding your problem with memory:
If it was OOM-killer that did some mess on machine, then you have one great option to ensure it won't happen again (of course after reducing JVM heap size).
By default, the Linux kernel overcommits, i.e. it allows processes to allocate more memory than is really available. This, in some cases, can lead to the OOM killer killing the most memory-consumptive process if there is no memory left for kernel tasks.
This behavior is controlled by the vm.overcommit_memory sysctl parameter.
So, you can try setting vm.overcommit_memory = 2 in sysctl.conf and then running sysctl -p.
This will forbid overcommitting and make the possibility of the OOM killer doing nasty things very low. Also, you can think about adding a little bit of swap space (if you don't have it already) and setting vm.swappiness to some really low value (like 5, for example; the default value is 60), so in a normal workflow your application won't go into swap, but if you're really short on memory, it will start using it temporarily and you will be able to see it with free.
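A minimal sketch of those settings (run as root; whether you keep them in /etc/sysctl.conf or a file under /etc/sysctl.d is up to you):
echo 'vm.overcommit_memory = 2' >> /etc/sysctl.conf
echo 'vm.swappiness = 5' >> /etc/sysctl.conf
sysctl -p                                    # apply without a reboot
sysctl vm.overcommit_memory vm.swappiness    # verify the new values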
WARNING: this can lead to processes receiving a "Cannot allocate memory" error if your server is overloaded on memory. In this case:
Try to restrict memory usage by applications
Move part of them to another machine

What is the Linux Stack?

I recently ran into a bug with the "linux stack" and the "linux stack size". I came across a blog directing me to try
ulimit -a
to see what the limit for my box was, and it was set to 8192 kB, which seems to be the default.
What is the "linux stack"? How does it work, what does it store, what does it do?
The short answer is:
When programs on your Linux box run, they add and remove data from the stack on a regular basis as the programs function. The stack size refers to how much space is allocated in memory for the stack. If you increase the stack size, the program can make deeper chains of function calls: each time a function is called, data can be added to the stack (stacked on top of the last routine's data).
Unless the program is very complex, or designed for a special purpose, a stack size of 8192 kB is normally fine. Some programs, like graphics processing programs, require you to increase the size of the stack to function, as they may store a lot of data on the stack.
Feel free to increase the stack size for those applications; it's not a problem. To do so, use
ulimit -s size-in-kilobytes
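For example (the 64 MB value is just an illustration):
ulimit -s              # show the current soft limit in kilobytes (typically 8192)
ulimit -s 65536        # raise it to 64 MB for programs started from this shell
ulimit -s unlimited    # or remove the limit entirely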
BTW, What is a StackOverflowError?

Why the process is getting killed at 4GB?

I have written a program which works on a huge set of data. My CPU and OS (Ubuntu) are both 64-bit and I have 4 GB of RAM. Using "top" (the %MEM field), I saw that the process's memory consumption went up to around 87%, i.e. 3.4+ GB, and then it got killed.
I then checked how much memory a process can access using "ulimit -m", which comes out to be "unlimited".
Now, since both the OS and CPU are 64-bit and there also exists a swap partition, the OS should have used virtual memory, i.e. [ >3.4 GB + y GB from swap space ] in total, and only if the process required more memory should it have been killed.
So, I have the following questions:
How much physical memory can a process theoretically access on a 64-bit machine? My answer is 2^48 bytes.
If less than 2^48 bytes of physical memory exists, then the OS should use virtual memory, correct?
If the answer to the above question is YES, then the OS should have used swap space as well; why did it kill the process without even using it? I don't think we have to use specific system calls while coding our program to make this happen.
Please suggest.
It's not only the data size that could be the reason. For example, do ulimit -a and check the max stack size. Have you got a kill reason? Set 'ulimit -c 20000' to get a core file; it shows you the reason when you examine it with gdb.
Check with file and ldd that your executable is indeed 64 bits.
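A quick sketch of that (myprogram is a placeholder; the name of the core file depends on /proc/sys/kernel/core_pattern):
ulimit -c unlimited    # allow core files of any size in this shell
./myprogram            # reproduce the crash
gdb ./myprogram core   # 'bt' inside gdb shows where and why it died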
Check also the resource limits. From inside the process, you could use getrlimit system call (and setrlimit to change them, when possible). From a bash shell, try ulimit -a. From a zsh shell try limit.
Check also that your process indeed eats the memory you believe it does consume. If its pid is 1234 you could try pmap 1234. From inside the process you could read the /proc/self/maps or /proc/1234/maps (which you can read from a terminal). There is also the /proc/self/smaps or /proc/1234/smaps and /proc/self/status or /proc/1234/status and other files inside your /proc/self/ ...
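As a rough illustration (assuming your kernel exposes smaps), you can total the resident memory of process 1234 from its mappings like this:
# sum all Rss: lines of the smaps file; the values are in kB
awk '/^Rss:/ { sum += $2 } END { print sum " kB" }' /proc/1234/smaps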
Check with free that you got the memory (and the swap space) you believe you have. You can add some temporary swap space with swapon /tmp/someswapfile (and use mkswap to initialize it).
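For instance (run as root; the 1 GB size is arbitrary and the path matches the example above):
dd if=/dev/zero of=/tmp/someswapfile bs=1M count=1024   # create a 1 GB file
chmod 600 /tmp/someswapfile
mkswap /tmp/someswapfile                                # initialize it as swap
swapon /tmp/someswapfile                                # enable it
free -m                                                 # the swap total should now be larger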
I was routinely able, a few months (and a couple of years) ago, to run a 7 GB process (a huge cc1 compilation), under GNU/Linux/Debian/Sid/AMD64, on a machine with 8 GB RAM.
And you could try with a tiny test program, which e.g. allocates with malloc several memory chunks of e.g. 32 MB each. Don't forget to write some bytes inside (at least at each megabyte).
Standard C++ containers like std::map or std::vector are rumored to consume more memory than we usually think.
Buy more RAM if needed. It is quite cheap these days.
Everything has to fit within what can be addressed, including your graphics adaptors, OS kernel, BIOS, etc., and the amount that can be addressed can't be extended by swap either.
Also worth noting that the process itself needs to be 64-bit as well. And some operating systems may become unstable, and therefore kill the process, if you use excessive RAM with it.

After "OOM Killer", is there a "Resurrector"?

I understand that on Linux there is a kernel functionality referred to as the "OOM Killer". When the OOM (out-of-memory) condition subsides, is there such a thing as a "Process Resurrector"?
I understand this functionality would be difficult to implement for all sorts of reasons, but is there something that gets close to it?
Edit: Example: the "Resurrector" would have a block of memory guaranteed to it for storing a limited set of process information (e.g. command-line, environment etc.) (i.e. not a whole process code & data !). Once the OOM condition is cleared, the "Resurrector" could go through the list and "resurrect" some processes.
From what I gather up to now, there doesn't seem to be functionality akin to what I am asking.
No. Once a process is killed by the OOM Killer, it's dead. You can restart it (resources permitting), and if it's something that's managed by the system (via inittab, perhaps), it might get restarted that way.
Edit: As a thought experiment, think about what a resurrection of a process would mean. Even if you could store the entire process state, you wouldn't want to because the process killed might be the REASON for the out-of-memory condition.
So the best you could possibly do would be to store its startup state (command line, etc.). But that's no good either, because again, that may be WHY the system ran out of memory in the first place!
Furthermore, if you resurrected a process in this way, there's no telling what could go wrong. What if the process controls hardware? What if the process shouldn't be run more than once? What if it was connected to a tty that isn't there anymore (because the sshd was one of the processes killed)?
There's an ENORMOUS amount of context around a process that the system can't possibly be aware of. The ONLY sensible thing is the thing that the kernel does: kill the sucker and go on.
I suppose you can imagine a hibernate-the-process-to-disk strategy, but given that we're out of memory (including swap), that means either pre-reserving some disk space or deciding to allocate disk space to this on the fly. Either strategy may not be capable of dealing with the size of the process in question.
In short: No, you don't get to come back from the OOM killer. It's a killer, you just have to deal with it.
Of course there is not. Otherwise, where could a killed process be stored if there's no more memory to store it? :-)
The thing is that the OOM killer only comes into play when all available memory is exhausted, both RAM and on-disk swap. If a "process resurrector" could "resurrect" a process after the condition subsides, it would have to be capable of storing the process somewhere at the moment "the killer" starts. But since the killer only starts when there's no memory available, that is impossible.
Of course you may say "save to disk", but well, swap memory is a disk. If you want to limit the memory consumption of your process, use the ulimit functionality and track your memory usage manually via the ps program or the /proc filesystem. The "OOM killer" is a panic measure and should not be expected to be nice to processes.
Example of what you can do with ulimit (and, perhaps, without, but I can't experiment with OOM killing on my system atm)
#!/bin/bash
# save a value from the environment so it can be restored after the first run
save_something=$ENV_VARIABLE
# run the memory hog in a subshell with its virtual memory capped at ~1 GB (value in kB),
# so it fails with an allocation error instead of triggering the OOM killer
( ulimit -Sv 1000000;
  perl -e 'print "Taking all RAM!!!\n"; while (1) { $a[$i++] = $i; }'
)
echo "killed, resetting"
# restore the saved environment and run it again under the same limit
( ulimit -Sv 1000000;
  export ENV_VARIABLE="$save_something"
  perl -e 'print "Taking all RAM!!!\n"; while (1) { $a[$i++] = $i; }'
)
