What happens if a process crashes while flushing a mapped file? - linux

I'm using boost::interprocess::managed_mapped_file to do IPC under Linux. In short, one process writes objects into a file (via the construct method) for another process to read (via the find method). But what if the writing process crashes mid-write? Will Boost handle this automatically, or do I have to add a mechanism to detect such a failure?

If the process crashes, the result is undefined - nothing can know how much I/O it managed to complete. That said, the OS generally performs I/O in fixed units, at least one block (512 bytes) or one page (4 KB).

Related

Can I relaunch a process while it is generating a core dump?

I have a monitor script that checks a specified process; if it crashes, the script relaunches it without waiting for the core dump to finish writing. Does this cause problems? Will it affect the core dump file or the relaunched process?
Yes, you can. A process is a different thing than a program. Just as you can have several instances of the ls command running in parallel under Unix, nothing stops you from relaunching the same program (as a different, new process) while the old one is saving its core file. The only difference from a normal process writing a file is that a process writing a core does it in kernel mode. Nothing else.
The core dump is performed by the killed process executing in kernel mode, as a final task before dying. For the purposes of process state, the process is in the exiting state and nothing can affect it until the core dump is finished (it can only be interrupted by a write error on the dump file, or perhaps this is an interruptible state).
The only problem you can have is that the next instance you launch, as it tries to write to the same core file name, will have to wait for the first one to finish (I think the inode is locked on a per-write basis only, not for the whole file), so you can end up with a bunch of processes dying and writing to the same core file. That's not the case if each core goes to a new, different file (the old file is unlinked before creating the new one), but that depends on the implementation. A possible exploit would be a DoS attack that generates cores at a high pace, queuing up many processes in uninterruptible state on core-file writes. But I think this is difficult to achieve... most probably you'd just get high load from many processes writing different core files that are erased right afterwards (by the unlink made by the next core-generating task).
A core(5) dump is very bad, and you should fix its root cause. It is generally the result of some unexpected and unhandled signal(7) (perhaps some memory corruption giving a SIGSEGV, etc...; read also about undefined behavior and be very scared of UB).
if it crashes, the script will relaunch it without waiting for the core dump writing to complete.
So your approach is flawed, except as a temporary measure. BTW, in many cases, the virtual address space of the faulty process is small enough for the core to be dumped in a small fraction of a second. In some cases, the dumping of the core might take many minutes (think of a big HPC process dealing with hundreds of gigabytes of data on a supercomputer).
It is rumored that, in the previous century, some huge core files took half an hour to be dumped on Cray supercomputers.
You really should fix your program to avoid dumping core.
We don't know anything about your buggy program that dumps core. But if it has some persistent state (e.g. in a database or a file) which you care about, your approach is very wrong: the crash might happen in the very code that produces that state, and if you then restart the same program, it could reuse that faulty state.
Does this incur bad things?
Yes in general. Perhaps not in your specific case (but we don't know what your program is doing).
So you'd better understand why that core dump is happening. In general, you would compile your program with all warnings and debug info (so gcc -Wall -Wextra -g with GCC) and use gdb to analyze the core dump post-mortem (see this).
You really should not write programs which dump core (even if that happens to all of us; it is a serious bug that should be fixed ASAP). And you should not accept core dumps as acceptable behavior of your programs.
Core dumps are there to help the developer fix a serious problem. Read also about the Unix philosophy. It is socially unacceptable to consider a core dump "normal": it is by definition abnormal program behavior.
(There are several ways to avoid core dumps, but that makes for a different question; and you would need to explain what kind of programs you are writing and monitoring, and why and how they dump core.)

Signal handler (segv) unable to complete before device crashes

I have installed a signal handler (say, crashHandler()) which has a bit of file-output functionality. It is a Linux thread which registers for SIGSEGV with crashHandler(). File writing is required, as it stores the stack trace to persistent storage.
It works most of the time. But in one specific scenario, crashHandler() executes only partly (I can see its logs) and then the device reboots. Can someone help me with a way to deal with this?
The first question to ask here is why the device rebooted. Normally having an ordinary application crash won't cause a kernel-level or hardware-level reboot. Most likely, you're either hitting a watchdog timer before the crash handler completes (in which case you should extend the watchdog timeout - do NOT reset the timer from within the crash handler though, as then you're risking problems in the crash handler itself preventing a reboot), or this is pid 1 and it's crashing within the SIGSEGV handler, causing a kernel panic due to pid 1 (init) dying.
If it's the latter, you need to be more careful with what you do in that crash handler. Remember, you just crashed. You know memory is corrupt, but you don't know how it's corrupt. It may be corrupt in ways that affect the crash handler itself - e.g. if you corrupt the heap metadata, you may be unable to allocate memory without crashing for real this time. You should keep what you do in that handler to a bare minimum - in particular, avoid calling any library functions that are not documented as being async-signal-safe and avoid using any complex (pointer-containing) data structures or dynamically allocated memory. For the highest level of safety, limit yourself to just fork() and exec()ing another process that will use debugger APIs (ptrace() and /proc/$PID/mem) to perform memory dumps or whatever else you might need.

Can I coredump a process that is blocking on disk activity (preferably without killing it)?

I want to dump the core of a running process that is, according to /proc/<pid>/status, currently blocking on disk activity. Actually, it is busy doing work on the GPU (should be 4 hours of work, but it has taken significantly longer now). I would like to know how much of the process's work has been done, so it'd be good to be able to dump the process's memory. However, as far as I know, "blocking on disk activity" means that it's not possible to interrupt the process in any way, and coredumping a process e.g. using gdb requires interrupting and temporarily stopping the process in order to attach via ptrace, right?
I know that I could just read /proc/<pid>/{maps,mem} as root to get the (maybe inconsistent) memory state, but I don't know any way to get hold of the process's userspace CPU register values... they stay the same while the process is inside the kernel, right?
You can probably run gcore on your program. It's basically a wrapper around GDB that attaches, uses the gcore command, and detaches again.
This might interrupt your IO (as if it received a signal, which it will), but your program can likely restart it if written correctly (and this may occur in any case, due to default handling).

Linux Kernel Procfs multiple read/writes

How does the Linux kernel handle multiple reads/writes to procfs? For instance, if two processes write to procfs at once, is one process queued (i.e. a kernel trap actually blocks one of the processes), or is there a kernel thread running for each core?
The concern is: if you have a buffer used within a function (static, in the global space), do you have to protect it, or will the code run sequentially?
It depends on each procfs file's implementation. No one can give you a definite answer, because each driver can implement its own procfs folder and files (and you didn't specify any particular files). Quick browsing in http://lxr.free-electrons.com/source/fs/proc/ shows that some files do use locks.
Either way, you can't use a global buffer without protection, because a context switch can always occur - if not in the kernel, then it can catch your reader thread right after it finishes the read syscall and before it has started to process the read data.

Transferring data between process calls

I have a Linux process that is being called numerous times, and I need to make this process as fast as possible.
The problem is that I must maintain a state between calls (load data from previous call and store it for the next one), without running another process / daemon.
Can you suggest fast ways to do so? I know I can use files for I/O, and would like to avoid that, for obvious performance reasons. Should (can?) I create a named pipe to read/write from, and thereby avoid real disk I/O?
Pipes aren't appropriate for this. Use POSIX shared memory or a POSIX message queue if you are absolutely sure files are too slow - which you should test first.
In the shared memory case your program creates the segment with shm_open() if it doesn't exist or opens it if it does. You mmap() the memory and make whatever changes and exit. You only shm_unlink() when you know your program won't be called anymore and no longer needs the shared memory.
With message queues, just set up the queue. Your program reads the queue, makes whatever changes, writes the queue, and exits. mq_unlink() when you no longer need the queue.
Both methods have kernel persistence, so you lose the shared memory and the queue on a reboot.
It sounds like you have a process that is continuously executed by something.
Why not create a factory that spawns the worker threads?
The factory could provide the workers with any information needed.
... I can use files for I/O, and would like to avoid it, for obvious performance reasons.
I wonder what those reasons are...
Linux caches files in kernel memory in the page cache. Writes go to the page cache first; in other words, a write() syscall only copies the data from user space to the page cache (it is a bit more complicated when the system is under memory pressure). Some time later, pdflush writes the data to disk asynchronously.
A file read() first checks the page cache to see if the data is already available in memory, to avoid a disk read. This means that if one program writes data to files and another program reads it, the two programs are effectively communicating via kernel memory for as long as the page cache keeps those files.
If you want to avoid disk writes entirely, that is, the state does not need to be persisted across OS reboots, those files can be put in /dev/shm or in /tmp, which are normally the mount points of in-memory filesystems.
