I know that, given enough context, one could hope to recover constructively from a segfault condition.
But is the effort worth it? If so, in what situation(s)?
You can't really hope to recover from a segfault. You can detect that it happened, and dump out relevant application-specific state if possible, but you can't safely continue the process. This is because (amongst other reasons):
The thread which failed cannot be continued, so your only options are longjmp or terminating the thread. Neither is safe in most cases.
Either way, you may leave a mutex / lock in a locked state which causes other threads to wait forever
Even if that doesn't happen, you may leak resources
Even if you don't do either of those things, the thread which segfaulted may have left the internal state of the application inconsistent when it failed. An inconsistent internal state can cause data errors or further bad behaviour later on, creating more problems than simply quitting would.
So in general, there is no point in trapping it and doing anything EXCEPT terminating the process in a fairly abrupt fashion. There's no point in attempting to write (important) data back to disc, or continue to do other useful work. There is some point in dumping out state to logs - which many applications do - and then quitting.
A possibly useful thing to do might be to exec() your own process, or have a watchdog process which restarts it in the case of a crash. (NB: exec does not always have well defined behaviour if your process has >1 thread)
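To make the watchdog idea concrete, here is a minimal sketch in C; run_application() is a hypothetical stand-in for the real program logic:

/* Minimal watchdog: fork the real work into a child and restart it
 * whenever it dies on a signal such as SIGSEGV. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern void run_application(void);   /* hypothetical: your real main loop */

int main(void)
{
    for (;;) {
        pid_t pid = fork();
        if (pid == 0) {              /* child: do the real work */
            run_application();
            _exit(EXIT_SUCCESS);
        }
        if (pid < 0) {
            perror("fork");
            return EXIT_FAILURE;
        }
        int status;
        if (waitpid(pid, &status, 0) < 0) {
            perror("waitpid");
            return EXIT_FAILURE;
        }
        if (WIFEXITED(status))       /* clean exit: stop restarting */
            return WEXITSTATUS(status);
        fprintf(stderr, "child died on signal %d, restarting\n",
                WTERMSIG(status));
        sleep(1);                    /* avoid a tight crash/restart loop */
    }
}

Because the crashed state is discarded entirely and the process starts fresh, this sidesteps all of the inconsistent-state problems listed above.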
A number of reasons:
To provide more application specific information to debug a crash. For instance, I crashed at stage 3 processing file 'x'.
To probe whether certain memory regions are accessible. This was mostly to satisfy an API for an embedded system. We would try to write to the memory region and catch the segfault that told us that the memory was read-only.
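A sketch of that probing technique is below. It leans on Linux behaviour: siglongjmp out of a SIGSEGV handler is tolerated there but is not portable in general, and this version is not thread-safe.

#include <setjmp.h>
#include <signal.h>

static sigjmp_buf probe_env;

static void probe_handler(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);        /* escape back to the probe point */
}

/* Returns 1 if *addr can be written, 0 if the write faults. */
static int is_writable(volatile char *addr)
{
    struct sigaction sa = {0}, old;
    sa.sa_handler = probe_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old);

    int writable = 0;
    if (sigsetjmp(probe_env, 1) == 0) {
        *addr = *addr;               /* the probing write */
        writable = 1;
    }
    sigaction(SIGSEGV, &old, NULL);  /* restore the previous handler */
    return writable;
}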
The segfault usually originates with a fault from the MMU, which the operating system uses to swap in pages of memory when necessary. If the OS has no page to supply for that address, it forwards the fault to the application as a signal.
A segmentation fault is really an access to memory that you do not have permission to access (either because it's not mapped, you lack permissions, it's an invalid virtual address, etc.).
Depending on the underlying reason, you may want to trap and handle the segmentation fault. For instance, if your program is passed an invalid virtual address, it may log that segfault and then do some damage control.
A segfault does not necessarily mean that the program heap is corrupted. Reading an invalid address ( eg. null pointer ) may result in a segfault, but that does not mean that the heap is corrupted. Also, an application can have multiple heaps depending on the C runtime.
There are very advanced techniques that one might implement by catching a segmentation fault, if you know the segmentation fault isn't an error. For example, you can protect pages so that you can't read from them, and then trap the SIGSEGV to perform "magical" behavior before the read completes. (See Tomasz Węgrzanowski's "Segfaulting own programs for fun and profit" for an example of what you might do, but usually the overhead is pretty high, so it's not worth doing.)
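A bare-bones sketch of that trick follows. It is Linux-specific in practice (it calls mprotect() from inside the SIGSEGV handler, which is not formally async-signal-safe) and assumes 4 KiB pages:

#define _DEFAULT_SOURCE
#include <signal.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define PAGE 4096                    /* assumption: 4 KiB pages */

static char *guarded_page;

static void on_fault(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    /* Only handle faults on our own page; anything else is a real crash. */
    if ((char *)info->si_addr < guarded_page ||
        (char *)info->si_addr >= guarded_page + PAGE)
        _exit(1);
    /* The "magic": grant access, then the faulting instruction retries. */
    mprotect(guarded_page, PAGE, PROT_READ | PROT_WRITE);
}

int main(void)
{
    struct sigaction sa = {0};
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, NULL);

    guarded_page = mmap(NULL, PAGE, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (guarded_page == MAP_FAILED)
        return 1;
    guarded_page[0] = 42;            /* faults; handler unprotects; retries */
    printf("first byte after the trap: %d\n", guarded_page[0]);
    return 0;
}

Returning from the handler re-executes the faulting instruction, which now succeeds - exactly the kind of high-overhead magic described above.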
A similar principle applies to trapping an illegal instruction exception (usually in the kernel) to emulate an instruction that's not implemented on your processor.
To log a crash stack trace, for example.
No. I think it is a waste of time - a seg fault indicates there is something wrong in your code, and you will be better advised to find this by examining a core dump and/or your source code. The one time I tried trapping a seg fault led me off into a hall of mirrors which I could have avoided by simply thinking about the source code. Never again.
Related
I am trying to debug a mysterious SIGSEGV based on a core file. Usually when the root cause is a bad memory access, I see the signal handler somewhere in the stack trace and it's fairly trivial to find the piece of code that triggered the segmentation fault. This time I do not see any signal handler called, and all the call stacks are pointing to relatively innocuous pieces of code or waiting for events.
I strongly suspect, due to some recent changes, the root cause is some sort of 'out of memory' issue during runtime. None of the call stacks are pointing to an obvious allocation culprit. There is also no useful message in dmesg or syslog.
Is there any way to definitively tell that the root cause was some sort of out-of-memory issue, and if so, exactly what allocation was responsible?
If I receive ENOBUFS or ENOMEM during a call to read(2), is it possible that the kernel may free up resources and a future call will succeed? Or, do I treat the error as fatal, and begin a teardown process?
I'm a bit at a loss to see what possible use may come from retrying.
If you got back ENOMEM on a read, it means the kernel is in serious trouble. Yes, it is possible that retrying might work, but it is also possible it will not. If it will not, how long is appropriate to wait before retrying? If you retry immediately, what's to prevent you from adding yet another process running a 100% CPU-bound loop?
Personally, if I got such an error from a read for which I know how to handle errors, I'd handle the error as usual. If it is a situation where I positively need the read to succeed, then I'd fail the program. If this program is mission critical, you will need to run it inside a watchdog that restarts it anyway.
On that note, please bear in mind that if the kernel returned ENOMEM, there is a non-negligible probability that the OOM killer will send SIGKILL to someone. Experience has shown that someone will likely be your process. That is just one more reason to just exit, and handle that exit with a watchdog monitoring the process (bear in mind, however, that the watchdog might also get a SIGKILL if the OOM killer was triggered).
The situation with ENOBUFS isn't much different. The "how long to delay" and infinite-loop considerations are still there. The OOM killer is less likely under those circumstances, but relying on the watchdog is still the correct path, IMHO.
The core issue here is that there are no specific cases in which read(2) should return any of those errors. If a condition arises that results in those errors, it is just as legitimate for the driver to return EIO.
As such, and unless OP knows of a specific use case his code is built to handle, these errors really should be handled the same way.
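In code, that uniform treatment might look like this sketch: retry only on EINTR; everything else, ENOMEM and ENOBUFS included, is fatal and the watchdog takes over.

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

ssize_t read_or_die(int fd, void *buf, size_t len)
{
    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0)
            return n;                /* success (possibly a short read) */
        if (errno == EINTR)
            continue;                /* interrupted by a signal: retry */
        /* ENOMEM, ENOBUFS, EIO, ...: same treatment - report and exit. */
        perror("read");
        exit(EXIT_FAILURE);          /* the watchdog is expected to restart us */
    }
}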
One last note regarding the OOM killer. People sometimes think of it as something that will save them from hanging the entire system. That is not really the case. The OOM killer kills a more or less arbitrary process. It is true that the more pages a process has, the more likely it is to be the one killed. I strongly suggest not relying on that fact, however.
I have seen cases where physical memory was exhausted, and the OOM killer killed a process that used very little memory, taking some time to get to the main culprit. I've seen cases where the memory exhaustion was in the kernel address space, and the user-space processes being killed were completely random.
As I've said above, OOM killer might kill your watchdog process, leaving your main hogger running. Do not rely on it to fix your code path.
I have installed a handler (say, crashHandler()) which has a bit of file output functionality. It is a Linux thread which registers crashHandler() for SIGSEGV. File writing is required, as it stores the stack trace to persistent storage.
It works most of the time. But in a specific scenario, crashHandler() executes only partway (I can see logs) and then the device reboots. Can someone help me with a way to deal with this?
The first question to ask here is why the device rebooted. Normally having an ordinary application crash won't cause a kernel-level or hardware-level reboot. Most likely, you're either hitting a watchdog timer before the crash handler completes (in which case you should extend the watchdog timeout - do NOT reset the timer from within the crash handler though, as then you're risking problems in the crash handler itself preventing a reboot), or this is pid 1 and it's crashing within the SIGSEGV handler, causing a kernel panic due to pid 1 (init) dying.
If it's the latter, you need to be more careful with what you do in that crash handler. Remember, you just crashed. You know memory is corrupt, but you don't know how it's corrupt. It may be corrupt in ways that affect the crash handler itself - e.g. if you corrupt the heap metadata, you may be unable to allocate memory without crashing for real this time. You should keep what you do in that handler to a bare minimum - in particular, avoid calling any library functions that are not documented as being async-signal-safe and avoid using any complex (pointer-containing) data structures or dynamically allocated memory. For the highest level of safety, limit yourself to just fork() and exec()ing another process that will use debugger APIs (ptrace() and /proc/$PID/mem) to perform memory dumps or whatever else you might need.
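A sketch of that restricted style of handler is below. "crash-helper" is a hypothetical external program that would use ptrace() or /proc/$PID/mem to do the actual dumping; the handler itself sticks to async-signal-safe calls (write(), fork(), waitpid()) as far as possible:

#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static void crash_handler(int sig)
{
    static const char msg[] = "fatal signal, spawning dump helper\n";
    write(STDERR_FILENO, msg, sizeof msg - 1);   /* write() is safe */

    pid_t child = fork();
    if (child == 0) {
        /* Hypothetical helper that attaches to the parent and dumps it;
         * it can find the crashed process via getppid(). */
        execlp("crash-helper", "crash-helper", (char *)NULL);
        _exit(127);
    }
    if (child > 0)
        waitpid(child, NULL, 0);     /* let the helper finish its dump */

    signal(sig, SIG_DFL);            /* restore default action... */
    raise(sig);                      /* ...and re-raise to die properly */
}

int main(void)
{
    signal(SIGSEGV, crash_handler);
    *(volatile int *)0 = 0;          /* deliberate crash for demonstration */
    return 0;
}

Note that execlp() is not formally on the async-signal-safe list (it consults PATH); a stricter version would use execve() with a fixed path.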
I'm trying to create a memcpy-like function that will fail gracefully (i.e. return an error instead of segfaulting) when given an address in memory that is part of an unallocated page. I think the right approach is to install a SIGSEGV signal handler, and do something in the handler to make the memcpy function stop copying.
But I'm not sure what happens in the case my program is multithreaded:
Is it possible for the signal handler to execute in another thread?
What happens if a segfault isn't related to any memcpy operation?
How does one handle two threads executing memcpy concurrently?
Am I missing something else? Am I looking for something that's impossible to implement?
Trust me, you do not want to go down that road. It's a can of worms for many reasons. Correct signal handling is already hard in single-threaded environments, let alone in multithreaded code.
First of all, returning from a signal handler that was caused by an exception condition is undefined behavior - it works in Linux, but it's still undefined behavior nevertheless, and it will give you problems sooner or later.
From man 2 sigaction:
The behaviour of a process is undefined after it returns normally from
a signal-catching function for a SIGBUS, SIGFPE, SIGILL or SIGSEGV
signal that was not generated by kill(), sigqueue() or raise().
(Note: this does not appear on the Linux manpage; but it's in SUSv2)
This is also specified in POSIX. While it works in Linux, it's not good practice.
Below are the specific answers to your questions:
Is it possible for the signal handler to execute in another thread?
Yes, it is. A signal is delivered to any thread that is not blocking it (but is delivered only to one, of course), although in Linux and many other UNIX variants, exception-related signals (SIGILL, SIGFPE, SIGBUS and SIGSEGV) are usually delivered to the thread that caused the exception. This is not required though, so for maximum portability you shouldn't rely on it.
You can use pthread_sigmask(2) to block signals in every thread but one; that way you make sure that every signal is always delivered to the same thread. This makes it easy to have a single thread dedicated to signal handling, which in turn allows you to do synchronous signal handling, because the thread may use sigwait(2) (note that multithreaded code should use sigwait(2) rather than sigsuspend(2)) until a signal is delivered and then handle it synchronously. This is a very common pattern.
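A minimal sketch of that pattern, shown with asynchronous signals (SIGINT/SIGTERM), which is where it applies most cleanly:

#include <pthread.h>
#include <signal.h>
#include <stdio.h>

static void *signal_thread(void *arg)
{
    sigset_t *set = arg;
    for (;;) {
        int sig;
        if (sigwait(set, &sig) == 0)          /* blocks until delivery */
            printf("got signal %d, handling it synchronously\n", sig);
    }
    return NULL;
}

int main(void)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    sigaddset(&set, SIGTERM);
    /* Block before creating any threads; new threads inherit the mask. */
    pthread_sigmask(SIG_BLOCK, &set, NULL);

    pthread_t tid;
    pthread_create(&tid, NULL, signal_thread, &set);
    /* ... create worker threads and do the real work here ... */
    pthread_join(tid, NULL);
    return 0;
}

Because the signal is consumed synchronously in an ordinary thread context, the usual async-signal-safety restrictions on what the "handler" may call no longer apply.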
What happens if a segfault isn't related to any memcpy operation?
Good question. The signal is delivered, and there is no (trivial) way to portably differentiate a genuine segfault from a segfault in memcpy(3).
If you have one thread taking care of every signal, like I mentioned above, you could use sigwaitinfo(2), and then examine the si_addr field of siginfo_t once sigwaitinfo(2) returned. The si_addr field is the memory location that caused the fault, so you could compare that to the memory addresses passed to memcpy(3).
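A sketch of that idea on Linux, assuming SIGSEGV has been blocked in every thread as described above:

#include <signal.h>
#include <stdio.h>

static void wait_and_report_fault(void)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGSEGV);

    siginfo_t info;
    if (sigwaitinfo(&set, &info) == SIGSEGV)
        printf("fault at address %p\n", info.si_addr);
    /* Compare info.si_addr against the ranges handed to memcpy(3). */
}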
But some platforms, most notably Mac OS, do not implement sigwaitinfo(2) or its cousin sigtimedwait(2).
So there's no way to do it portably.
How does one handle two threads executing memcpy concurrently?
I don't really understand this question - what's so special about multithreaded memcpy(3)? It is the caller's responsibility to make sure the regions of memory being read from and written to are not accessed concurrently; memcpy(3) itself keeps no internal state, and passing it overlapping buffers has never been valid in any case.
Am I missing something else? Am I looking for something that's impossible to implement?
If you're concerned with portability, I would say it's pretty much impossible. Even if you just focus on Linux, it will be hard. If this was something easy to do, by this time someone would have probably done it already.
I think you're better off building your own allocator and force user code to rely on it. Then you can store state and manage allocated memory, and easily tell if the buffers passed are valid or not.
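A rough sketch of that allocator idea (no locking and a linear lookup; a real version would need both):

#include <stdlib.h>
#include <string.h>

struct region { void *base; size_t len; struct region *next; };
static struct region *regions;       /* list of live allocations */

void *tracked_alloc(size_t len)
{
    struct region *r = malloc(sizeof *r);
    if (!r) return NULL;
    r->base = malloc(len);
    if (!r->base) { free(r); return NULL; }
    r->len = len;
    r->next = regions;
    regions = r;
    return r->base;
}

/* Returns nonzero iff [p, p+len) lies inside one tracked allocation. */
static int valid_range(const void *p, size_t len)
{
    const char *q = p;
    for (struct region *r = regions; r; r = r->next)
        if (q >= (char *)r->base && q + len <= (char *)r->base + r->len)
            return 1;
    return 0;
}

/* memcpy that fails gracefully: -1 on an invalid range, 0 on success. */
int safe_memcpy(void *dst, const void *src, size_t n)
{
    if (!valid_range(dst, n) || !valid_range(src, n))
        return -1;
    memcpy(dst, src, n);
    return 0;
}

No signal handling at all: validity is decided from bookkeeping, so the questions about handler/thread interaction simply disappear.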
Can a page fault occur in an interrupt handler/atomic context?
It can, but it would be a disaster. :-)
(This is an oldish question. The existing answers contain correct facts, but are quite thin. I will attempt to answer it in a more substantial way.)
The answer to this question depends upon whether the code is in the kernel (supervisor mode), or in user mode. The reason is that the rules for memory access in these regions are usually different. Here is a brief sequence of events to illustrate the problem (assuming kernel memory could be paged out):
While a user program is executing, an interrupt occurs (e.g. key press / disk event).
CPU transitions to supervisor mode and begins executing the handler in the kernel.
The interrupt handler begins to save the CPU state (so that the user process can be correctly resumed later), but in doing so it touches some of its storage which had previously been paged out.
This triggers a page fault exception.
In order to process the page fault exception, the kernel must now save the CPU state of the code that experienced the page miss.
It may actually be able to do this if it has a preallocated pool of memory that will never be paged out, but such a pool would inevitably be limited in size.
So you see, the safest (and simplest) solution is for the kernel to ensure that memory owned by the kernel is not pageable at all. For this reason, page faults should not really occur within the kernel. They can occur, but as @adobriyan notes, that usually indicates a much bigger error than a simple need to page in some memory. (I believe this is the case in Linux. Check your specific OS to be sure whether kernel memory is non-pageable. OS architectures do differ.)
So in summary, kernel memory is usually not pageable, and since interrupts are usually handled within the kernel, page faults should not in general occur while servicing interrupts. Higher-priority interrupts can still interrupt lower ones. It is just that all their resources are kept in physical memory.
The question about atomic contexts is less clear. If by that you mean atomic operations supported by the hardware, then no interrupt occurs within a partial completion of the operation. If you are instead referring to something like a critical section, then remember that critical sections only emulate atomicity. From the perspective of the hardware there is nothing special about such code except for the entry and exit code, which may use true hardware atomic operations. The code in between is normal code, and subject to being interrupted.
I hope this provides a useful response to this question, as I also wondered about this issue for a while.
Yes.
The code for the handler or critical region could span the boundary between two pages. If the second page is not available, then a page fault is necessary to bring it in.
Not sure why nobody has used the term "double fault":
http://en.wikipedia.org/wiki/Double_fault
But that is the term used in the Intel manual:
http://software.intel.com/en-us/articles/introduction-to-pc-architecture/
or here:
ftp://download.intel.com/design/processor/manuals/253668.pdf (look at section 6-38).
There is something called a triple fault too, which, as the name indicates, can also happen when the CPU is trying to service the double fault error.
I think the answer is YES.
I just checked the page fault handler code in kernel 4.15 for x86_64 platform.
Take the following as a hint. no_context is the classic 'kernel oops'.
static noinline void
no_context(struct pt_regs *regs, unsigned long error_code,
           unsigned long address, int signal, int si_code)
{
        /* Are we prepared to handle this kernel fault? */
        if (fixup_exception(regs, X86_TRAP_PF)) {
                /*
                 * Any interrupt that takes a fault gets the fixup. This makes
                 * the below recursive fault logic only apply to a faults from
                 * task context.
                 */
                if (in_interrupt())
                        return;
                /* ... rest of the function elided ... */