I have a node server running. Once in a while, the main thread hangs and goes to 100% CPU usage. The thread is totally hung and not processing any further events whatsoever.
Unfortunately, because of this, even attaching the node debugger is not useful since the thread is hung somewhere (I ran node's debugger and attached to the stalled process but [for example] 'pause' or 'bt' does not return).
How can I figure out where it is hanging? Is it possible to have node keep track of the current closure stack so that I can get access to it retrospectively when the bug occurs again?
One low-level method of checking is to use a utility like strace. You can use it like: strace -p <node pid>. This will only show syscalls, however, so if your program is stuck in some kind of infinite loop that makes no syscalls (such as I/O), you won't see any output.
You might also try using llnode to attach to the live process; it gives you a more Node-friendly view of the process than plain gdb does.
As far as seeing what handles/requests are active in the node process, there are a couple of "private" (underscore-prefixed) methods available if you are feeling adventurous: process._getActiveHandles() and process._getActiveRequests(). You might use those functions in conjunction with a module like blocked which helps detect when the event loop is executing slower than whatever threshold you want.
Related
To verify the behavior of a piece of third-party, binary-only software I'd like to use, I'm implementing a kernel module whose objective is to keep track of each child this software creates and terminates.
The target binary is produced by Golang, and it is heavily multithreaded.
The kernel module I wrote installs hooks on the kernel functions _do_fork() and do_exit() to keep track of each process/thread this binary creates and terminates.
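For illustration, here is a minimal sketch of one way such a do_exit() hook can be installed, using a kprobe (this may differ from the mechanism my module actually uses; the handler body here just logs the exiting task):

#include <linux/module.h>
#include <linux/kprobes.h>
#include <linux/sched.h>

/* The pre-handler fires at the entry of do_exit(), before any callee runs. */
static int do_exit_pre(struct kprobe *kp, struct pt_regs *regs)
{
    pr_info("do_exit: pid=%d tgid=%d comm=%s\n",
            current->pid, current->tgid, current->comm);
    return 0;
}

static struct kprobe exit_probe = {
    .symbol_name = "do_exit",
    .pre_handler = do_exit_pre,
};

static int __init tracker_init(void)
{
    return register_kprobe(&exit_probe); /* 0 on success */
}

static void __exit tracker_exit(void)
{
    unregister_kprobe(&exit_probe);
}

module_init(tracker_init);
module_exit(tracker_exit);
MODULE_LICENSE("GPL");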
The LKM works, more or less.
During some conditions, however, I have a scenario I'm not able to explain.
It seems like a process/thread could terminate without passing through do_exit().
The evidence I collected with printk() shows the process creation but gives no indication of the process termination.
I'm aware that printk() can be slow, and I'm also aware that messages can be lost in such situations.
To prevent message loss due to a slow console (this particular setup uses a serial tty at 115200), I tried to implement a quicker console, and collected the messages using netconsole.
The described setup seems to confirm that a process can terminate without passing through the do_exit() function.
But because I wasn't sure my messages couldn't be lost inside the printk() infrastructure, I repeated the same test replacing printk() with trace_printk(), which should be a leaner alternative to printk().
Still the same result: occasionally I see processes that never pass through do_exit(), and when I verify whether the PID is currently running, I find that it is not.
Also note that I placed my hook at the very first instruction of do_exit(), to ensure the flow cannot terminate inside a called function before reaching the hook.
My question is then the following:
Can a Linux process terminate without its flow passing through the do_exit() function?
If so, can someone give me a hint of what this scenario can be?
After a long debug session, I'm finally able to answer my own question.
That's not all; I'm also able to explain why I saw the strange behavior I described in my scenario.
Let's start from the beginning: while monitoring a heavily multithreaded application, I observed rare cases where a PID suddenly stopped existing without its flow ever passing through the Linux kernel's do_exit() function.
Hence my original question:
Can a Linux process terminate without passing through the do_exit() function?
As far as my current knowledge goes, which I would by now consider reasonably extensive, a Linux process cannot end its execution without passing through the do_exit() function.
But this answer is in contrast with my observations, and the problem leading me to this question is still there.
Someone here suggested that the strange behavior I observed was due to my observations being somehow wrong, implying that my method, and hence my conclusions, were inaccurate.
My observations were correct: the process I watched did not pass through do_exit(), and yet it terminated.
To explain this phenomenon, I want to put another question on the table, one that I think other searchers may find useful:
Can two processes share the same PID?
If you'd asked me this a month ago, I'd surely have answered: "definitely not, two processes cannot share the same PID."
Linux is more complex, though.
There's a situation in which, in a Linux system, two different processes can share the same PID!
https://elixir.bootlin.com/linux/v4.19.20/source/fs/exec.c#L1141
Surprisingly, this behavior does not harm anyone; when this happens, one of these two processes is a zombie.
Update (correcting an error):
The circumstances of this duplicate PID are more intricate than those described previously. When a multithreaded process calls execve(), the kernel must flush the previous execution context: flush_old_exec() calls de_thread(), which eliminates every thread of the process except the one performing the execve(). If the calling thread is not the thread group leader, its PID is then changed to that of the leader, while the old leader, by then a zombie, keeps using that very same PID until release_task() is called for it. During that window, two tasks share one PID.
End of update.
That is what I was watching: the PID I was monitoring did not pass through do_exit() because, when the corresponding thread terminated, it no longer had the PID it had when it started; it had its leader's.
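A small userspace experiment makes this PID takeover visible (a sketch, compiled with -pthread; the /bin/sh payload is just a convenient way to print the PID of the post-exec process):

#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

/* A non-leader thread calls execve(): after de_thread(), the new program
 * runs with the leader's PID, and this thread's original TID vanishes. */
static void *thread_fn(void *arg)
{
    (void)arg;
    fprintf(stderr, "thread tid=%ld calling exec\n", (long)syscall(SYS_gettid));
    execl("/bin/sh", "sh", "-c", "echo the new program has pid $$", (char *)NULL);
    return NULL; /* only reached if exec fails */
}

int main(void)
{
    pthread_t t;
    fprintf(stderr, "leader pid=%d\n", getpid());
    pthread_create(&t, NULL, thread_fn, NULL);
    pause(); /* never returns: de_thread() kills the leader during the exec */
    return 0;
}

The shell prints the same PID that main() reported, not the TID the execing thread started with.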
For people who know the Linux kernel's mechanics very well, this is nothing to be surprised about; the behavior is intended and hasn't changed since 2.6.17.
The current 5.10.3 still behaves this way.
Hoping this is useful to other searchers, I'd also like to add that it answers the following:
Question: Can a Linux process/thread terminate without passing through do_exit()? Answer: No, do_exit() is the only way a process can end its execution, whether the termination is intentional or not.
Question: Can two processes share the same PID? Answer: Normally, no; there are rare cases in which two schedulable entities have the same PID.
Question: Does the Linux kernel have scenarios where a process changes its PID? Answer: Yes, there is at least one scenario in which a process changes its PID.
Can a Linux process terminate without its flow passing through the do_exit() function?
Probably not, but you should study the source code of the Linux kernel to be sure. Ask on KernelNewbies. Kernel threads and udev- or systemd-related things (or perhaps modprobe or the older hotplug) are probable exceptions. When your /sbin/init of PID 1 terminates (which should not happen), strange things happen.
The LKM works, more or less.
What does that mean? How can a kernel module half-work?
And in real life it does sometimes happen that your Linux kernel panics or crashes (and that could happen with your LKM, if it has not been peer-reviewed by the Linux kernel community). In such a case, there is no longer any notion of processes, since they are an abstraction provided by a living Linux kernel.
See also dmesg(1), strace(1), proc(5), syscalls(2), ptrace(2), clone(2), fork(2), execve(2), waitpid(2), elf(5), credentials(7), pthreads(7)
Look also inside the source code of your libc, e.g. GNU libc or musl-libc
Of course, see Linux From Scratch and Advanced Linux Programming
And verifying if the PID is currently running,
This can be done in user land with /proc/, or using kill(2) with a 0 signal (and maybe also pidfd_send_signal(2)...)
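A minimal sketch of the kill(2)-with-signal-0 probe (the function and program names are mine):

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Signal 0 performs the existence and permission checks but sends nothing. */
static int pid_is_alive(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;              /* the process (or zombie) exists */
    return errno == EPERM;     /* exists, but we may not signal it */
}

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s pid\n", argv[0]);
        return 2;
    }
    pid_t pid = (pid_t)atoi(argv[1]);
    printf("pid %d is %s\n", (int)pid, pid_is_alive(pid) ? "alive" : "gone");
    return 0;
}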
PS. I still don't understand why you need to write a kernel module or change the kernel code. My intuition would be to avoid doing that when possible.
The other day, while testing on a Linux server, we observed that under some conditions one process could die and then start again. After checking the code, we found it was caused by an infinite loop.
This aroused my curiosity: how did the process die and then get started again? Is it the OS that detects the abnormal process and restarts it? If yes, how does that work?
Let's assume you won't be able to fix your code... And let's ignore all crazy options like attaching gdb via a script or so.
You can either check CPU usage (most accidental infinite loops that I've written used 100% of CPU for hours :) ), or (the more likely option) use strace to check what the software is doing right now and implement your own signature tracing (if the same 20 calls repeat over and over, assume an infinite loop or so).
For example:
#!/bin/bash
# strace prints to stderr, so redirect it before piping
strace -p "$(cat your_app.pid)" 2>&1 | ./your_signature_evaluator
# Or, with a literal PID:
strace -p 12345 2>&1 | ./your_signature_evaluator
As for automatic system recognition... It seems normal that a program crashes after calling things in a loop uncontrollably (for example malloc() until you deplete memory, or opening files...), but I've never seen the system (kernel) restart the app (correct me in a comment if I'm wrong). I think you either:
have conditions (signal handling, whatever) inside the program that help it recover
are running a watchdog (check every 20 seconds that <pid> is running, and if not, start a new instance)
are running a distribution whose service/program configuration restarts the program if it stops
But I really doubt that Linux would be so nice to your application on its own.
If it could, the person who wrote that kernel would have solved the halting problem.
PS: Vytor - web servers run in an infinite loop and do not use 100% CPU.
Fixed:
Well this seems a bit silly. Turns out top was not displaying correctly and programs actually continue to run. Perhaps the CPU time became too large to display? Either way, the program seems to be working fine and this whole question was moot.
Thanks (and sorry for the silly question).
Original Q:
I am running a simulation on a computer running Ubuntu Server 10.04.3. Short runs (<24 hours) run fine, but long runs eventually stall. By stalling, I mean that the program no longer gets any CPU time but still holds all its information in memory. To run these simulations, I SSH in, nohup the program, and pipe any output to a file.
Miscellaneous information:
The system is definitely not running out of RAM.
The program does not need to read or write to the hard drive until completion; the computation is done completely in memory.
The program is not killed, as it still has a PID after it stalls.
I am using OpenMP, but have increased the max number of processes, and the max time is unlimited.
I am finding the largest eigenvalues of a matrix using the ARPACK Fortran library.
Any thoughts on what is causing this behavior or how to resume my currently stalled program?
Thanks
I assume from your tags that this is an OpenMP program, though you never actually state this. Is ARPACK thread-safe?
It sounds like you are hitting a deadlock (more common in MPI programs than OpenMP, but it's definitely possible). The first thing to do is to compile with debugging flags on, then the next time you find this problem, attach with a debugger and find out what the various threads are doing. For gdb, for instance, some instructions for switching between threads are shown here.
Next time your program "stalls", attach GDB to it and do thread apply all where.
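For a process that is already running, something like gdb -p <pid> -batch -ex "thread apply all bt" attaches, prints a backtrace of every thread, and detaches, without needing an interactive session.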
If all your threads are blocked waiting for some mutex, you have a deadlock.
If they are waiting for something else (e.g. read), then you need to figure out what prevents the operation from completing.
Generally on UNIX you don't need to rebuild with debug flags on to get a meaningful stack trace. You wouldn't get file/line numbers, but they may not be necessary to diagnose the problem.
A possible way of understanding what a running program (that is, a process) is doing is to attach a debugger to it with gdb program <pid> (which works well only when the program has been compiled with debugging enabled, i.e. with -g), or to use strace on it with strace -p <pid>. The strace command is a utility (technically, a specialized debugger built above the ptrace system call interface) which shows you all the system calls made by a program or a process.
There is also a variant, called ltrace that intercepts the call to functions in dynamic libraries.
To get a feel for it, try for instance strace ls.
Of course, strace won't help you much if the running program is not doing any system calls.
Regards.
Basile Starynkevitch
I wrote a program that needs to run continuously, but since I'm a bad programmer it crashes every so often. Is there a way to have another program watch it and restart it when it crashes?
Not to be specious, but if you're a bad programmer, what's to say your watching program won't fail too ;) And you should get better so that you don't have this issue in the first place. That said, you will probably have need of the following answer eventually.
However, if getting better isn't possible, just run a cron job at regular intervals looking for the name of your program in the output of 'ps'. (That kind of answer you can also get from superuser.com.)
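For example, a crontab entry along these lines (program name and path are placeholders; cron's one-minute granularity means up to a minute of downtime):

* * * * * pgrep -x your_app >/dev/null || /path/to/your_app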
No need for third-party programs.
All of this can be accomplished with the Linux inittab.
See the inittab man page.
Look for "respawn".
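On a classic SysV init system, a line along these lines in /etc/inittab (the id, runlevels, and path are placeholders) makes init restart the program whenever it exits:

ap:2345:respawn:/usr/local/bin/your_app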
You can use supervisord
http://supervisord.org/
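A minimal program section for its configuration might look like this (paths are placeholders):

[program:your_app]
command=/path/to/your_app
autorestart=true

After editing the configuration, supervisorctl reread followed by supervisorctl update picks up the new program.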
Since Stackoverflow is a programming site, let me give you an overview of how such a watcher would be implemented.
The first thing to know is that your watcher will have to start the watched program itself. You do this with fork and exec.
What you can then do is wait for the program to exit. You can use one of the wait system calls (i.e. wait, waitpid, or wait4) depending on your specific needs. You can also catch SIGCHLD so that you are informed asynchronously when your child exits (you will then still need to call wait to get its status).
Now that you have the status, you can tell if the process died due to a signal with the macro WIFSIGNALED. If that macro returns true, then your program crashed and needs to be restarted.
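Putting those pieces together, a minimal watcher might look like the sketch below (./your_app is a placeholder; error handling kept to a minimum):

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    for (;;) {
        pid_t pid = fork();                  /* start the watched program */
        if (pid < 0) {
            perror("fork");
            return 1;
        }
        if (pid == 0) {
            execl("./your_app", "your_app", (char *)NULL);
            perror("execl");                 /* only reached if exec fails */
            _exit(127);
        }
        int status;
        if (waitpid(pid, &status, 0) < 0) {  /* block until the child exits */
            perror("waitpid");
            return 1;
        }
        if (WIFSIGNALED(status)) {           /* died on a signal: a crash */
            fprintf(stderr, "child killed by signal %d, restarting\n",
                    WTERMSIG(status));
            sleep(1);                        /* avoid a tight restart loop */
            continue;
        }
        fprintf(stderr, "child exited normally, stopping\n");
        return 0;
    }
}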
It still won't run continuously if you have another task monitoring it... there will still be a short amount of downtime while it restarts.
Additionally, if you are acting as a network (or local) server process, you'll lose any state about requests in progress; I hope that is OK (of course, your clients may have built-in timeouts and retries).
Finally, if your process crashed while it was in the middle of storing any persistent data, I hope it has a mechanism of coping with half-written files, etc.
However, if you intend it to be robust, all of these things should be true anyway, so you can use something like supervisord safely.
I use Monit to watch over my programs and services.
I am writing a GUI oriented debugger which targets Linux primarily, but I plan ports to other OSes in the future. Because the GUI must stay interactive at all times, I have a few threads handling different things.
Primarily I have a "debug event" thread which simply loops waiting for waitpid to return and delivers the received events to the other threads. I do this because waitpid does not have a timeout, which makes it very hard to integrate it with other event loops and keep things responsive (waitpid can hang indefinitely!).
This strategy has worked wonderfully for the Linux builds so far. Lately I've been trying to make my debugger thread aware (as in the threads in the debugged application, not the debugger itself).
So I set the ptrace options to follow clone events and look for a status whose upper 16 bits are PTRACE_EVENT_CLONE. Then I use PTRACE_GETEVENTMSG to get the TID of the new thread. This all works nicely in my small test-harness applications. But for some reason, it fails when I put that code in my actual debugger (I get a "No such process" error code).
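For context, the relevant part of my event loop is shaped roughly like this (a sketch with simplified names, assuming PTRACE_O_TRACECLONE has already been set on the tracee with PTRACE_SETOPTIONS):

#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

void debug_event_loop(void)
{
    int status;
    pid_t tid;

    /* __WALL: wait for all children, whether or not they are clones */
    while ((tid = waitpid(-1, &status, __WALL)) > 0) {
        if (WIFSTOPPED(status) &&
            (status >> 8) == (SIGTRAP | (PTRACE_EVENT_CLONE << 8))) {
            unsigned long new_tid;
            if (ptrace(PTRACE_GETEVENTMSG, tid, 0, &new_tid) == -1) {
                perror("PTRACE_GETEVENTMSG"); /* the ESRCH lands here */
                continue;
            }
            printf("thread %d cloned new thread %lu\n", (int)tid, new_tid);
        }
        ptrace(PTRACE_CONT, tid, 0, 0);       /* resume the stopped thread */
    }
}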
The one thing that occurred to me is that Windows has a rule that only the thread which attached to an application can listen for debug events. Does Linux's ptrace have a similar limitation? If so, why does my code work for other debug events?
EDIT:
It seems that, at the very least, waitpid supports waiting from a different thread; the man page says:
Before Linux 2.4, a thread was just a special case of a process, and as a consequence one thread could not wait on the children of another thread, even when the latter belongs to the same thread group. However, POSIX prescribes such functionality, and since Linux 2.4 a thread can, and by default will, wait on children of other threads in the same thread group.
So at most this is a ptrace limitation.
I had the same issue (plus many others!) while implementing the Linux-specific part of the Maxine VM debugger. You are correct in your guess that only one thread in the debugger can use ptrace to control the debuggee. We accomplish this by making all calls to ptrace on a dedicated thread. You may find it useful to look at the LinuxTask.java, linuxTask.h and linuxTask.c files in the Maxine sources available at kenai.com/projects/maxine/sources/maxine/show
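A sketch of that pattern in C (names are mine; one request in flight at a time): the dedicated thread below is the only one that ever calls ptrace, including the initial PTRACE_ATTACH, and every other thread funnels its requests through ptrace_call():

#include <pthread.h>
#include <sys/ptrace.h>
#include <sys/types.h>

struct ptrace_req {
    enum __ptrace_request op;   /* glibc's request type */
    pid_t pid;
    void *addr, *data;
    long result;
    int done;
};

static struct ptrace_req *slot;  /* single-slot request queue */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t submitted = PTHREAD_COND_INITIALIZER;
static pthread_cond_t finished = PTHREAD_COND_INITIALIZER;

/* Body of the dedicated ptrace thread: executes every request itself. */
void *ptrace_thread(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&lock);
    for (;;) {
        while (!slot)
            pthread_cond_wait(&submitted, &lock);
        slot->result = ptrace(slot->op, slot->pid, slot->addr, slot->data);
        slot->done = 1;
        slot = NULL;
        pthread_cond_broadcast(&finished);
    }
    return NULL;
}

/* Called from any other thread (GUI, event dispatch, ...). */
long ptrace_call(enum __ptrace_request op, pid_t pid, void *addr, void *data)
{
    struct ptrace_req req = { op, pid, addr, data, 0, 0 };
    pthread_mutex_lock(&lock);
    while (slot)                      /* wait for the slot to free up */
        pthread_cond_wait(&finished, &lock);
    slot = &req;
    pthread_cond_signal(&submitted);
    while (!req.done)                 /* wait for our own completion */
        pthread_cond_wait(&finished, &lock);
    pthread_mutex_unlock(&lock);
    return req.result;
}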
As far as I can tell, this is not allowed. A task cannot use ptrace on a task to which it has not attached. Also, a task can be traced by at most one other task, so you can't simply attach once in each thread. I think this is because when one task attaches to another, the tracing task becomes the parent of the traced task, and each task can only have one parent.
It seems like multi-thread tracing ought to be allowed because the threads are part of the same process, but implementation-wise, there isn't actually much distinction between threads and processes in the Linux kernel. A thread is just a task that happens to share most of its resources with another task.
If you're interested, you can browse the source code for ptrace in the kernel. Specifically look at ptrace_check_attach, which is called by sys_ptrace for most requests. It returns -ESRCH (sounds like the error code you're getting) if the target task's parent is not the current task.