Status of a Linux process while it is dumping core

Let's say I have a process which will generate a huge core file if it crashes somehow (e.g. mysql). I want to know what the status of the process is while it is dumping core. Is it the same as before, or does it change to zombie?
My real-life problem is this:
I have a monitor that checks the status of a process. Once it realizes the process has crashed (by monitoring its status), it will do something. I want to make sure the monitor does that something only after the core dumping has finished. That's why I want to know the process status while the core is being dumped.

If your monitor is starting the processes with fork, it should be able to get SIGCHLD signals and then call waitpid(2). AFAIK waitpid will tell you when the core dumping has finished (and won't return successfully before that).
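For illustration, here is a minimal sketch of such a monitor (not your actual monitor; the monitored program is simply whatever you pass on the command line). It relies on the fact stated above: waitpid() does not return for a signal-killed child before any core dump is finished.

/* Minimal sketch: fork a child, exec the monitored program, block in waitpid(2). */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s program [args...]\n", argv[0]);
        return EXIT_FAILURE;
    }
    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return EXIT_FAILURE; }
    if (pid == 0) {                       /* child: run the monitored program */
        execvp(argv[1], &argv[1]);
        perror("execvp");
        _exit(127);
    }
    int status;                           /* parent: wait here, or in a SIGCHLD handler */
    if (waitpid(pid, &status, 0) < 0) { perror("waitpid"); return EXIT_FAILURE; }
    if (WIFSIGNALED(status)) {
        const char *core = "";
#ifdef WCOREDUMP                          /* non-POSIX, but available on Linux */
        if (WCOREDUMP(status))
            core = " (core dumped)";
#endif
        printf("child killed by signal %d%s\n", WTERMSIG(status), core);
        /* the core file, if any, is complete at this point, so act here */
    }
    return EXIT_SUCCESS;
}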
Read also core(5)
Perhaps using inotify(7) facilities on the directory containing the core dump might help.
And systemd might be relevant too (I don't know the details)
BTW, while core dumping, I believe that the process status (as reported through proc(5) in the 3rd field of /proc/$PID/stat) is
D Waiting in uninterruptible disk sleep
So if you are concerned about long core dump times, you could, for example, loop every half-second to fopen, then fscanf, then fclose that /proc/$PID/stat pseudo-file until the status is no longer D.
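A rough sketch of that polling loop in C (the PID is assumed to be known already, and error handling is kept minimal):

/* Poll the 3rd field of /proc/<pid>/stat every half second until the state
   is no longer 'D' (uninterruptible sleep).  Assumes the executable name
   (the 2nd field) contains no spaces. */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the state character, or '\0' once /proc/<pid>/stat has vanished. */
static char proc_state(pid_t pid)
{
    char path[64], state = '\0';
    snprintf(path, sizeof path, "/proc/%d/stat", (int)pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return '\0';                      /* process is gone */
    /* format: pid (comm) state ...  -- see proc(5) */
    if (fscanf(f, "%*d %*s %c", &state) != 1)
        state = '\0';
    fclose(f);
    return state;
}

void wait_while_dumping(pid_t pid)
{
    while (proc_state(pid) == 'D')
        usleep(500 * 1000);               /* half a second */
}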
Lastly, core dumps are usually quick these days (on Linux with a good file system like Ext4 or BTRFS), unless you run on a supercomputer with a terabyte of RAM, because I believe that (if you have sufficient RAM) the core dump file stays in the page cache. Core dumps lasting half an hour were common in the previous century on the supercomputers (Cray) of that time.
Of course you could also stat(2) the core file.
See also http://www.linuxatemyram.com/

What is a memory image in *nix systems?

In the book Advanced Programming in the Unix Environment, 3rd Edition, Chapter 10 -- Signals, page 315, when talking about the actions taken by processes that receive a signal, the author says:
When the default action is labeled "terminate+core", it means that a memory image of the process is left in the file named core of the current working directory of the process.
What is a memory image? When is it created, what does it contain, and what is it used for?
A memory image is simply a copy of the process's virtual memory, saved in a file. It's used when debugging the program, as you can examine the values of the program's variables and determine which functions were being called at the time of the failure.
As the documentation you quoted says, this file is created when the process is terminated due to a signal that has the "terminate+core" default action.
A memory image is often called a core image. See core(5) and the core dump wikipage.
Roughly speaking, a core image describes the process's virtual address space (and its content) at the time of the crash (including the call stack of each active thread and the writable data segments for global data and heaps, but usually excluding the text or code segments, which are read-only and already present in the executable ELF file or in shared libraries). It also contains the register state (for each thread).
The name core is understandable only to old guys like me (who have seen computers built in the 1960s and 1970s, such as the IBM/360, the PDP-10, and the early PDP-11, the last of which was used for developing the primordial Unix), since long ago (from the 1950s to the 1970s) random access memory was made of magnetic core memory.
If you have compiled all your source code with debug information (e.g. using gcc -g -Wall), you can do some post-mortem debugging (after your program crashed and dumped a core file!) using gdb as
gdb yourprogram core
and the first gdb command you'll try is probably bt to get the backtrace.
Don't forget to enable core dumps with the setrlimit(2) syscall, generally done in your shell with e.g. ulimit -c unlimited
Several signals can dump core; see signal(7). A common cause is a segmentation violation, like when you dereference a NULL or bad pointer, which raises a SIGSEGV signal that (often) dumps a core file in the current directory.
See also gcore(1).
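If you just want a toy program to experiment with (purely illustrative, not related to any program discussed here), something like the following raises SIGSEGV and, with core dumps enabled, leaves a core file you can open with gdb yourprogram core and inspect with bt:

/* crash.c -- deliberately dereference a NULL pointer to get SIGSEGV.
   Build with debug info:  gcc -g -Wall crash.c -o crash */
#include <stdio.h>

int main(void)
{
    int *p = NULL;
    printf("about to crash...\n");
    *p = 42;            /* SIGSEGV here; with ulimit -c unlimited a core is dumped */
    return 0;
}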

Shared library (.so) file on the hard drive changes when the user process is killed

This is a continuation of my previous question:
hash of libgmp.so changes automatically
I have developed a library and linked it to my process. It is required that my library have the same hash (from installation time) every time I link it with my process. My process always checks the library's hash before doing anything else. My process is a daemon and it gets started and stopped with a script in the initrd. I always kill my process with the "kill -9 myproc" command, which sends SIGKILL to the process and forcefully terminates it.
But sometimes my shared library file's hash changes when I stop and restart my process. It happens at random times, and recently it has been happening more frequently, due to which my process does not start because of the hash comparison check that I coded into it.
I have taken dumps of both shared libraries, i.e. the one right after installation and the changed version, using "objdump -d libmy.so". Here is a 'diff' screenshot of both dumps (yellow is the original file and red is the changed version).
I don't know much about ELF file contents, but it looks like the original file only had offsets, while the changed file has full addresses for instructions and functions. As a result, the changed version of the library is 2 KB bigger than the original.
Why is this happening? Does it have anything to do with the SIGKILL signal, which forces the process to shut down? If not, what can be the reason?
Any help would be appreciated.
Why is this happening?
Most likely because you are on a RedHat, Fedora or CentOS system and prelink is enabled (by default it prelinks all shared libraries on the system to new random addresses every two weeks). When you stop and restart your daemon, it gets a new version of the library IF prelink has run since the last time you started the daemon.
See these instructions on how to disable prelink.
Alternatively, modify your checksumming process to only pay attention to interesting sections, such as .text, .data, and .rodata, and ignore the rest.
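As a rough illustration of that idea, here is a sketch using libelf that hashes only the .text, .rodata and .data sections of a shared object. The program name, the choice of FNV-1a as the hash, and the build command are just assumptions for the example, not your existing tooling; you would feed the section bytes into whatever checksum you actually compare against.

/* hash_sections.c -- sketch: hash only selected ELF sections with libelf.
   Build (assuming libelf headers/libs are installed):
       gcc -Wall hash_sections.c -lelf -o hash_sections */
#include <fcntl.h>
#include <gelf.h>
#include <libelf.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static uint64_t fnv1a(uint64_t h, const unsigned char *p, size_t n)
{
    while (n--) { h ^= *p++; h *= 1099511628211ULL; }
    return h;
}

int main(int argc, char **argv)
{
    if (argc != 2) { fprintf(stderr, "usage: %s lib.so\n", argv[0]); return 1; }
    if (elf_version(EV_CURRENT) == EV_NONE) { fprintf(stderr, "libelf init failed\n"); return 1; }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }
    Elf *e = elf_begin(fd, ELF_C_READ, NULL);
    if (!e) { fprintf(stderr, "elf_begin: %s\n", elf_errmsg(-1)); return 1; }

    size_t shstrndx;
    if (elf_getshdrstrndx(e, &shstrndx) != 0) { fprintf(stderr, "no section names\n"); return 1; }

    uint64_t h = 1469598103934665603ULL;   /* FNV offset basis */
    Elf_Scn *scn = NULL;
    while ((scn = elf_nextscn(e, scn)) != NULL) {
        GElf_Shdr shdr;
        if (gelf_getshdr(scn, &shdr) != &shdr)
            continue;
        const char *name = elf_strptr(e, shstrndx, shdr.sh_name);
        if (!name)
            continue;
        /* only the sections whose content we actually care about */
        if (strcmp(name, ".text") && strcmp(name, ".rodata") && strcmp(name, ".data"))
            continue;
        Elf_Data *data = elf_getdata(scn, NULL);
        if (data && data->d_buf)
            h = fnv1a(h, data->d_buf, data->d_size);
    }
    printf("%016llx\n", (unsigned long long)h);
    elf_end(e);
    close(fd);
    return 0;
}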

Core dump is created, but not written to a file?

I'm trying to get a core dump of a proprietary application running on an embedded linux system, for which I wrote some plugins.
What I did was:
ulimit -c unlimited
echo "/tmp/cores/core.%e.%p.%h.%t" > /proc/sys/kernel/core_pattern
kill -3 <PID>
However, no core dump is created. '/tmp/cores' exists and is writable for everyone, and the disk has enough space available. When I try the same thing with sleep 100 & as an example process and then kill it, the core dump is created.
I tried the example for the pipe syntax from the core manpage, which writes some parameters and the size of the core dump into a file called core.info. This file IS created, and the size is greater than 0. So if the core dump is created, why isn't it written to /tmp/cores? To be sure, I also searched for core* on the file system - it's not there. dmesg doesn't show any errors (but it does if I pipe the core dump to an invalid program).
Some more info: the system is probably based on Debian, but I'm not quite sure. GDB is not available, nor are many other tools - there is only BusyBox for basic stuff.
The process I'm trying to debug is automatically restarted soon after being killed.
So, I guess one solution would be to modify the example program in order to write the dump to a file instead of just counting bytes. But why doesn't it work just normally if there obviously is some data?
If your proprietary application calls setrlimit(2) with RLIMIT_CORE set to 0, or if it is setuid, no core dump happens. See core(5). Perhaps use strace(1) to find out. And you could install gdb (perhaps by [cross-] compiling it). See also gcore(1).
Also, check (and maybe set) the limit in the invoking shell. With bash(1), use the ulimit builtin. Otherwise, cat /proc/self/limits should display the limits. If you don't have bash, you could code a small wrapper in C calling setrlimit then execve ...
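A minimal sketch of such a wrapper (the paths and the program it execs are placeholders for whatever you normally start):

/* corewrap.c -- sketch: raise the soft RLIMIT_CORE to the hard limit, then
   exec the real program.
   Usage:  ./corewrap /path/to/your/app arg1 arg2 ... */
#include <stdio.h>
#include <sys/resource.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s program [args...]\n", argv[0]);
        return 1;
    }
    struct rlimit rl;
    if (getrlimit(RLIMIT_CORE, &rl) == 0) {
        rl.rlim_cur = rl.rlim_max;        /* an unprivileged process may raise
                                             the soft limit up to the hard limit */
        if (setrlimit(RLIMIT_CORE, &rl) != 0)
            perror("setrlimit(RLIMIT_CORE)");
    }
    execv(argv[1], &argv[1]);             /* execv is a thin wrapper over execve */
    perror("execv");
    return 127;
}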

Generating core dumps

From time to time my Go program crashes.
I tried a few things in order to get core dumps generated for this program:
defining ulimit on the system: I tried both ulimit -c unlimited and ulimit -c 10000, just in case. After launching my panicking program, I get no core dump.
I also added recover() support in my program and added code to log to syslog in case of panic, but I get nothing in syslog.
I am running out of ideas right now.
I must have overlooked something, but I cannot find what; any help would be appreciated.
Thanks! :)
Note that a core dump is generated by the OS when a condition from a certain set is met. These conditions are pretty low-level — like trying to access unmapped memory or trying to execute an opcode the CPU does not know etc. Under a POSIX operating system such as Linux when a process does one of these things, an appropriate signal is sent to it, and some of them, if not handled by the process, have a default action of generating a core dump, which is done by the OS if not prohibited by setting a certain limit.
Now observe that this machinery treats a process on the lowest possible level (machine code), but the binaries a Go compiler produces are higher-level than those a C compiler (or assembler) produces, and this means certain errors in a process produced by a Go compiler are handled by the Go runtime rather than the OS. For instance, a typical NULL pointer dereference in a process produced by a C compiler usually results in the process being sent the SIGSEGV signal, which then typically results in an attempt to dump the process's core and terminate it. In contrast, when this happens in a process compiled by a Go compiler, the Go runtime kicks in and panics, producing a nice stack trace for debugging purposes.
With these facts in mind, I would try to do this:
Wrap your program in a shell script which first relaxes the limit for core dumps (but see below) and then runs your program with its standard error stream redirected to a file (or piped to the logger binary etc).
The limits a user can tweak have a hierarchy: there are soft and hard limits — see this and this for an explanation. So check that your system does not have 0 set as the hard limit for the core dump size, as this would explain why your attempt to raise the limit has no effect (a small sketch for inspecting both limits follows this list).
At least on my Debian systems, when a program dies due to SIGSEGV, this fact is logged by the kernel and is visible in the syslog log files, so try grepping them for hints.
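For instance, a tiny C program like the following prints both RLIMIT_CORE values (the language is just for illustration; in bash, ulimit -c shows the soft value and ulimit -H -c the hard one):

/* limits.c -- sketch: print the soft and hard RLIMIT_CORE limits, so you can
   see whether the hard limit is 0 (which would block any ulimit -c change). */
#include <stdio.h>
#include <sys/resource.h>

static void show(const char *label, rlim_t v)
{
    if (v == RLIM_INFINITY)
        printf("%s: unlimited\n", label);
    else
        printf("%s: %llu bytes\n", label, (unsigned long long)v);
}

int main(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_CORE, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    show("soft core limit", rl.rlim_cur);
    show("hard core limit", rl.rlim_max);
    return 0;
}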
First, please make sure all errors are handled.
For core dumps, you can refer to generate a core dump in linux.
You can use supervisor to restart the program when it crashes.

Finding out memory footprint size

I would like to be able to restart a service when it is using too much memory (this is related to a bug in a third-party library).
I have used this to limit the amount of memory that can be requested:
resource.setrlimit(resource.RLIMIT_AS, (128*1024*1024, 128*1024*1024))
But the third-party library gets stuck in a busy loop of failing memory allocations and re-requesting memory. So I want to be able to poll, from a thread, the current memory size of the process.
The language I'm using is Python, but a solution in any programming language can be translated into Python code, provided it's viable and sensible on Linux.
Monit is a service you can run to monitor external processes. All you need to do is dump your pid to a file for monit to read. People often use it to monitor their web servers. One of the tests monit can do is for total memory usage. You can set a value, and if your process uses too much memory it will be restarted. Here's an example monit config:
check process yourProgram
    with pidfile "/var/run/YOUR.pid"
    start program = "/path/to/PROG.py"
    stop program = "/script/to/kill/prog/kill_script.sh"
    restart if totalmem is greater than 60.0 MB
This is the code that I came up with. It seems to work properly and avoids too much string parsing. The variable names I unpack come from the proc(5) man page, and this is probably a better way of extracting the OS information than string-parsing /proc/self/status.
import os
import signal
import time

def get_vsize():
    # /proc/self/stat is a single line of space-separated fields; see proc(5).
    parts = open('/proc/self/stat').read().split()
    (pid, comm, state, ppid, pgrp, session, tty, tpgid, flags, minflt, cminflt,
     majflt, cmajflt, utime, stime, cutime, cstime, counter, priority, timeout,
     itrealvalue, starttime, vsize, rss, rlim, startcode, endcode, startstack,
     kstkesp, kstkeip, signal, blocked, sigignore, sigcatch, wchan,
     ) = parts[:35]
    # vsize is the virtual memory size in bytes (23rd field).
    return int(vsize)

def memory_watcher():
    while True:
        time.sleep(120)
        if get_vsize() > 120*1024*1024:
            # Terminate the whole process group once we exceed 120 MiB of virtual memory.
            os.kill(0, signal.SIGTERM)
You can read the current memory usage using the /proc filesystem. The path is /proc/[pid]/status. In that status virtual file you can see the current VmRSS (resident memory) value.
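For illustration, a small sketch in C (easy to translate to Python line by line) that pulls VmRSS out of /proc/self/status:

/* Sketch: read the VmRSS line (resident set size, in kB) from /proc/self/status. */
#include <stdio.h>
#include <string.h>

/* Returns resident memory in kB, or -1 on error. */
long get_vmrss_kb(void)
{
    FILE *f = fopen("/proc/self/status", "r");
    if (!f)
        return -1;
    char line[256];
    long kb = -1;
    while (fgets(line, sizeof line, f)) {
        if (strncmp(line, "VmRSS:", 6) == 0) {
            sscanf(line + 6, "%ld", &kb);   /* the value is followed by " kB" */
            break;
        }
    }
    fclose(f);
    return kb;
}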
