GDB on multiple instances - Linux

I have multiple instances of a particular process running on my system. At some point during execution, some of the internal data structures get overwritten with invalid data. This happens in random instances at random intervals. Is there a way to debug this other than by setting memory access breakpoints? Also, is it possible to set a memory access breakpoint on all these processes simultaneously without starting a separate instance of gdb for each process? The processes run on an x86_64 Linux system with a 2.6 kernel.

If you haven't already done so, I would recommend using valgrind (http://valgrind.org). It can detect many types of memory bugs, including memory overruns and underruns, memory leaks, and double frees.

Also, is it possible to set a memory access breakpoint on all these processes simultaneously without starting a separate instance of gdb for each process?
I don't think gdb can set breakpoints for all the processes in one go. As far as I know, you have to attach to each process separately and set the breakpoints there.

For memory errors, valgrind is much more useful than GDB.
Assuming the instances you are talking about are forked or spawned from a single parent, you don't need separate instances of valgrind.
Just use valgrind --trace-children=yes
See http://man7.org/linux/man-pages/man1/valgrind.1.html
As to your question on GDB, an instance can debug one process at a time only.

You can only debug one process per gdb session. If your program forks, gdb follows the parent process unless follow-fork-mode is set otherwise.
See: http://www.delorie.com/gnu/docs/gdb/gdb_26.html
If you have memory problems, you can also run valgrind in combination with gdb, or use another memory debugging library such as efence (Electric Fence). Efence replaces some library calls, e.g. malloc/free, with its own functions and uses the MMU to catch invalid memory accesses: it places an inaccessible page directly after (or before) each allocated memory block. Valgrind gets a similar effect in software by instrumenting the program and keeping red zones around each heap block. If your application touches this guard memory, efence or valgrind stops execution. In connection with gdb, you will be pointed to the source line that accessed the forbidden memory area.
Having multiple processes requires multiple instances of gdb, which in practice is no real problem.

Related

linux: munmap shared memory in one single call

If a process calls mmap(...,MAP_ANONYMOUS | MAP_SHARED,...) and forks N children, is it possible for any one of these processes (parent or descendants) to munmap() the memory for all processes in one go, thus releasing the physical memory, or does each of these processes have to munmap() individually?
(I know the memory will be unmapped on process exit, but the children won't exit yet).
Alternatively, is there a way to munmap memory from another process? I'm thinking of a call something like munmap(pid,...).
Or is there a way to achieve what I am looking for using non-anonymous mappings and performing an operation on the related file descriptor (e.g closing the file)?
My processes are performance sensitive, and I would like to avoid performing lots of IPC when it becomes known that the shared memory will no longer be used by anyone.
No, there is no way to unmap memory in one go.
If you don't need mapped memory in child processes at all, you may mark mappings with madvise(MADV_DONTFORK) before forking.
In an emergency, you can invoke syscalls inside another process by using gdb:
1. Find the PID of the target process.
2. List its mapped memory with cat /proc/<PID>/maps.
3. Attach to the process with gdb: gdb -p <PID> (this suspends execution of the target process).
4. From gdb, run call munmap(0x<address>, 0x<size>) for each region you need to unmap.
5. Exit gdb (execution of the process resumes).
Obviously, if the process later tries to access the unmapped memory, it will receive SIGSEGV, so you must be 100% sure of what you are doing.

How to avoid shared memory leaks

I'm using shared memory between two processes on SUSE Linux, and I'm wondering how I can avoid shared memory leaks if one or both processes crash. Does a leak occur in this case? If yes, how can I avoid it?
You could allocate space for two counters in the shared memory region: one for each process. Every few seconds, each process increments its counter, and checks that the other counter has been incremented as well. That makes it easy for these two processes, or an external watchdog, to tear down the shared memory if somebody crashes or exits.
If the subprocess is a simple fork() from the parent process, then mmap() with MAP_SHARED should work.
If the subprocess does an exec() to start a different executable, you can often pass file descriptors from shm_open() or a similar non-portable system call (see Is there anything like shm_open() without filename?). On many operating systems, including Linux, you can shm_unlink() the name right after shm_open() so the object doesn't leak memory when your processes die, and use fcntl() to clear the close-on-exec flag on the shm file descriptor so that your child process can inherit it across exec. This is not well defined in the POSIX standard, but it appears to be very portable in practice.
If you need to use a filename instead of just a file descriptor number to pass the shared memory object to an unrelated process, then you have to figure out some way to shm_unlink() the file yourself when it's no longer needed; see John Zwinck's answer for one method.

Can I coredump a process that is blocking on disk activity (preferably without killing it)?

I want to dump the core of a running process that is, according to /proc/<pid>/status, currently blocking on disk activity. Actually, it is busy doing work on the GPU (should be 4 hours of work, but it has taken significantly longer now). I would like to know how much of the process's work has been done, so it'd be good to be able to dump the process's memory. However, as far as I know, "blocking on disk activity" means that it's not possible to interrupt the process in any way, and coredumping a process e.g. using gdb requires interrupting and temporarily stopping the process in order to attach via ptrace, right?
I know that I could just read /proc/<pid>/{maps,mem} as root to get the (maybe inconsistent) memory state, but I don't know any way to get hold of the process's userspace CPU register values... they stay the same while the process is inside the kernel, right?
You can probably run gcore on your program. It's basically a wrapper around GDB that attaches, uses the gcore command, and detaches again.
This might interrupt your IO (as if it received a signal, which it will), but your program can likely restart it if written correctly (and this may occur in any case, due to default handling).

Program stalls during long runs

Fixed:
Well this seems a bit silly. Turns out top was not displaying correctly and programs actually continue to run. Perhaps the CPU time became too large to display? Either way, the program seems to be working fine and this whole question was moot.
Thanks (and sorry for the silly question).
Original Q:
I am running a simulation on a computer running Ubuntu server 10.04.3. Short runs (<24 hours) run fine, but long runs eventually stall. By stall, I mean that the program no longer gets any CPU time, but it still holds all information in memory. In order to run these simulations, I SSH and nohup the program and pipe any output to a file.
Miscellaneous information:
The system is definitely not running out of RAM. The program does not need to read or write to the hard drive until completion; the computation is done completely in memory. The program is not killed, as it still has a PID after it stalls. I am using OpenMP, but have increased the max number of processes and the max time is unlimited. I am finding the largest eigenvalues of a matrix using the ARPACK Fortran library.
Any thoughts on what is causing this behavior or how to resume my currently stalled program?
Thanks
I assume this is an OpenMP program from your tags, though you never actually state this. Is ARPACK threadsafe?
It sounds like you are hitting a deadlock (more common in MPI programs than OpenMP, but it's definitely possible). The first thing to do is to compile with debugging flags on, then the next time you find this problem, attach with a debugger and find out what the various threads are doing. For gdb, for instance, some instructions for switching between threads are shown here.
Next time your program "stalls", attach GDB to it and do thread apply all where.
If all your threads are blocked waiting for some mutex, you have a deadlock.
If they are waiting for something else (e.g. read), then you need to figure out what prevents the operation from completing.
Generally on UNIX you don't need to rebuild with debug flags on to get a meaningful stack trace. You wouldn't get file/line numbers, but they may not be necessary to diagnose the problem.
One way of understanding what a running program (that is, a process) is doing is to attach a debugger to it with gdb program pid (which works well only when the program has been compiled with debugging information, using -g), or to run strace on it with strace -p pid. The strace command is a utility (technically, a specialized debugger built on the ptrace system call interface) that shows you all the system calls made by a program or a process.
There is also a variant called ltrace, which intercepts calls to functions in dynamic libraries.
To get a feeling of it, try for instance strace ls
Of course, strace won't help you much if the running program is not doing any system calls.
Regards.
Basile Starynkevitch

Faster forking of large processes on Linux?

What's the fastest, best way on modern Linux of achieving the same effect as a fork-execve combo from a large process?
My problem is that the forking process is ~500 MByte big, and a simple benchmarking test achieves only about 50 forks/s from the process (cf. ~1600 forks/s from a minimally sized process), which is too slow for the intended application.
Some googling turns up vfork as having been invented as the solution to this problem... but also warnings not to use it. Modern Linux seems to have acquired the related clone and posix_spawn calls; are these likely to help? What's the modern replacement for vfork?
I'm using 64bit Debian Lenny on an i7 (the project could move to Squeeze if posix_spawn would help).
On Linux, you can use posix_spawn(3) with the POSIX_SPAWN_USEVFORK flag to avoid the overhead of copying page tables when forking from a large process.
See Minimizing Memory Usage for Creating Application Subprocesses for a good summary of posix_spawn(3), its advantages and some examples.
To take advantage of vfork(2), make sure you #define _GNU_SOURCE before #include <spawn.h> and then simply call posix_spawnattr_setflags(&attr, POSIX_SPAWN_USEVFORK).
I can confirm that this works on Debian Lenny, and provides a massive speed-up when forking from a large process.
benchmarking the various spawns over 1000 runs at 100M RSS
                            user     system      total        real
fspawn (fork/exec):     0.100000  15.460000  40.570000 ( 41.366389)
pspawn (posix_spawn):   0.010000   0.010000   0.540000 (  0.970577)
Outcome: I was going to go down the early-spawned helper subprocess route as suggested by other answers here, but then I came across this re using huge page support to improve fork performance.
Having tried it myself using libhugetlbfs to simply make all my app's mallocs allocate huge pages, I'm now getting around 2400 forks/s regardless of the process size (over the range I'm interested in anyway). Amazing.
Did you actually measure how much time forks take? Quoting the page you linked,
Linux never had this problem; because Linux used copy-on-write semantics internally, Linux only copies pages when they changed (actually, there are still some tables that have to be copied; in most circumstances their overhead is not significant)
So the number of forks per second doesn't really show how big the overhead will be. You should measure the time consumed by forks, and (as general advice) only by the forks you actually perform, rather than benchmarking maximum performance.
But if you really find that forking a large process is slow, you can spawn a small ancillary process early, connect the master process to its input with a pipe, and send it the commands to exec. The small process will fork and exec these commands.
posix_spawn()
This function, as far as I understand, is implemented via fork/exec on desktop systems. However, on embedded systems (particularly those without an MMU), processes are spawned via a syscall whose interface is posix_spawn or a similar function. Quoting the informative section of the POSIX standard describing posix_spawn:
Swapping is generally too slow for a realtime environment.
Dynamic address translation is not available everywhere that POSIX might be useful.
Processes are too useful to simply option out of POSIX whenever it must run without address translation or other MMU services.
Thus, POSIX needs process creation and file execution primitives that can be efficiently implemented without address translation or other MMU services.
I don't think you will benefit from this function on a desktop system if your goal is to minimize time consumption.
If you know the number of subprocesses ahead of time, it might be reasonable to pre-fork your application on startup and then distribute the execv information via a pipe. Alternatively, if there is some sort of "lull" in your program, it might be reasonable to fork a subprocess or two ahead of time for quick turnaround later. Neither of these options directly solves the problem, but if either approach suits your app, it might allow you to side-step the issue.
I've come across this blog post: http://blog.famzah.net/2009/11/20/a-much-faster-popen-and-system-implementation-for-linux/
Excerpt:
The system call clone() comes to the rescue. Using clone() we create a child process which has the following features:
The child runs in the same memory space as the parent. This means that no memory structures are copied when the child process is created. As a result of this, any change to any non-stack variable made by the child is visible by the parent process. This is similar to threads, and therefore completely different from fork(), and also very dangerous – we don’t want the child to mess up the parent.
The child starts from an entry function which is being called right after the child was created. This is like threads, and unlike fork().
The child has a separate stack space which is similar to threads and fork(), but entirely different to vfork().
The most important: This thread-like child process can call exec().
In a nutshell, by calling clone in the following way, we create a child process which is very similar to a thread but still can call exec():
pid = clone(fn, stack_aligned, CLONE_VM | SIGCHLD, arg);
However I think it may still be subject to the setuid problem:
http://ewontfix.com/7/ "setuid and vfork"
Now we get to the worst of it. Threads and vfork allow you to get in a situation where two processes are both sharing memory space and running at the same time. Now, what happens if another thread in the parent calls setuid (or any other privilege-affecting function)? You end up with two processes with different privilege levels running in a shared address space. And this is A Bad Thing.
Consider for example a multi-threaded server daemon, running initially as root, that’s using posix_spawn, implemented naively with vfork, to run an external command. It doesn’t care if this command runs as root or with low privileges, since it’s a fixed command line with fixed environment and can’t do anything harmful. (As a stupid example, let’s say it’s running date as an external command because the programmer couldn’t figure out how to use strftime.)
Since it doesn’t care, it calls setuid in another thread without any synchronization against running the external program, with the intent to drop down to a normal user and execute user-provided code (perhaps a script or dlopen-obtained module) as that user. Unfortunately, it just gave that user permission to mmap new code over top of the running posix_spawn code, or to change the strings posix_spawn is passing to exec in the child. Whoops.

Resources