Interrupting open() with SIGALRM - linux

We have a legacy embedded system which uses SDL to read images and fonts from an NFS share.
If there's a network problem, TTF_OpenFont() and IMG_Load() hang essentially forever. A test application reveals that open() behaves in the same way.
It occurred to us that a quick fix would be to call alarm() before the calls which open files on the NFS share. The man pages weren't entirely clear whether open() would fail with EINTR when interrupted by SIGALRM, so we put together a test app to verify this approach. We set up a signal handler with sigaction::sa_flags set to zero to ensure that SA_RESTART was not set.
The signal handler was called, but open() was not interrupted. (We observed the same behaviour with SIGINT and SIGTERM.)
I suppose the system treats open() as a "fast" operation even on "slow" infrastructure such as NFS.
Is there any way to change this behaviour and allow open() to be interrupted by a signal?

The man pages weren't entirely clear whether open() would fail with
EINTR when interrupted by SIGALRM, so we put together a test app to
verify this approach.
open(2) is a slow syscall (slow syscalls are those that can sleep forever, and can be awaken when, and if, a signal is caught in the meantime) only for some file types. In general, opens that block the caller until some condition occurs are usually interruptible. Known examples include opening a FIFO (named pipe), or (back in the old days) opening a physical terminal device (it sleeps until the modem is dialed).
NFS-mounted filesystems probably don't cause open(2) to sleep in an interruptible state. After all, you are most likely opening a regular file, and in that case open(2) will not be interruptable.
Is there any way to change this behaviour and allow open() to be
interrupted by a signal?
I don't think so, not without doing some (non-trivial) changes to the kernel.
I would explore the possibility of using setjmp(3) / longjmp(3) (see the manpage if you're not familiar; it's basically non-local gotos). You can initialize the environment buffer before calling open(2), and issue a longjmp(3) in the signal handler. Here's an example:
#include <stdio.h>
#include <stdlib.h>
#include <setjmp.h>
#include <unistd.h>
#include <signal.h>
static jmp_buf jmp_env;
void sighandler(int signo) {
longjmp(jmp_env, 1);
}
int main(void) {
struct sigaction sigact;
sigact.sa_handler = sighandler;
sigact.sa_flags = 0;
sigemptyset(&sigact.sa_mask);
if (sigaction(SIGALRM, &sigact, NULL) < 0) {
perror("sigaction(2) error");
exit(EXIT_FAILURE);
}
if (setjmp(jmp_env) == 0) {
/* First time through
* This is where we would open the file
*/
alarm(5);
/* Simulate a blocked open() */
while (1)
; /* Intentionally left blank */
/* If open(2) is successful here, don't forget to unset
* the alarm
*/
alarm(0);
} else {
/* SIGALRM caught, open(2) canceled */
printf("open(2) timed out\n");
}
return 0;
}
It works by saving the context environment with the help of setjmp(3) before calling open(2). setjmp(3) returns 0 the first time through, and returns whatever value was passed to longjmp(3) otherwise.
Please be aware that this solution is not perfect. Here are some points to keep in mind:
There is a window of time between the call to alarm(2) and the call to open(2) (simulated here with while (1) { ... }) where the process may be preempted for a long time, so there is a chance the alarm expires before we actually attempt to open the file. Sure, with a large timeout such as 2 or 3 seconds this will most likely not happen, but it's still a race condition.
Similarly, there is a window of time between successfully opening the file and canceling the alarm where, again, the process may be preempted for a long time and the alarm may expire before we get the chance to cancel it. This is slightly worse because we have already opened the file so we will "leak" the file descriptor. Again, in practice, with a large timeout this will likely never happen, but it's a race condition nevertheless.
If the code catches other signals, there may be another signal handler in the midst of execution when SIGALRM is caught. Using longjmp(3) inside the signal handler will destroy the execution context of these other signal handlers, and depending on what they were doing, very nasty things may happen (inconsistent state if the signal handlers were manipulating other data structures in the program, etc.). It's as if it started executing, and suddenly crashed somewhere in the middle. You can fix it by: a) carefully setting up all signal handlers such that SIGALRM is blocked before they are invoked (this ensures that the SIGALRM handler does not begin execution until other handlers are done) and b) blocking these other signals before catching SIGALRM. Both actions can be accomplished by setting the sa_mask field of struct sigaction with the necessary mask (the operating system atomically sets the process's signal mask to that value before beginning execution of the handler and unsets it before returning from the handler). OTOH, if the rest of the code doesn't catch signals, then this is not a problem.
sleep(3) may be implemented with alarm(2), and alarm(2) and setitimer(2) share the same timer; if other portions in the code make use of any of these functions, they will interfere and the result will be a huge mess.
Just make sure you weigh in these disadvantages before blindly using this approach. The use of setjmp(3) / longjmp(3) is usually discouraged and makes programs considerably harder to read, understand and maintain. It's not elegant, but right now I don't think you have a choice, unless you're willing to do some core refactoring in the project.
If you do end up using setjmp(3), then at the very least document these limitations.

Maybe there is a strategy of using a separate thread to do the open so the main thread is not held up longer than desired.

Related

Close call does not release underlying resources for the device

A bit of context: Linux 3.10.40, Multi-threads application, main thread waiting for user input (keyboard), other threads waiting (epoll_wait()) for events. No specific priority for either application or child threads, no bounding to a specific core.
I have a problem when I try to close the device /dev/ttyGS from my application in user space. Close return 0 and file descriptor is indeed removed from the process fd list but the underlying tty port is not released (that because the gs_close() callback is not called).
It "only" happens when I test the following scenario: unloading my driver whereas the /dev/ttyGS is still opened.
However, if I close /dev/ttyGS during the "normal" application exit path, i.e. do the tear down sequence (including the close(fd) call) and exit the application, then unload the driver (in the shell) I am not facing this issue.
From my (main thread) application:
// during application initialization
fd = open("/dev/ttyGS0", O_NONBLOCK | O_NOCTTY)
fd1 = epoll_create(....);
epoll_ctl(fd1, EPOLL_CTL_ADD, fd, &evt);
fd2 = epoll_create(....)
....
// then during application life
system("rmmod mydriver");
mydriver_exit
// some code ....
eventfd_signal
// some code ....
wait_event_interruptible
// Then from my event thread of my application
exit epoll_wait(fd2)
// some code ....
epoll_ctl(fd1, EPOLL_CTL_DEL, fd, NULL);
close(fd)
// .... some code within the kernel fs subsystem
fput(filp);
if (atomic_long_dec_and_test(&file->f_count)) {
// some code ....
if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) {
if (!task_work_add(task, &file->f_u.fu_rcuhead, true))
return;
// some code ....
schedule_work(&delayed_fput_work);
spin_unlock_irqrestore(&delayed_fput_lock, flags);
}
// return from syscall
// some code ....
write(some_sysfs_attribute)
// some code ....
wake_up_interruptible
// return from syscall
// some code ....
go_back_to_epoll_wait(fd2)
// etc...
Is that correct to call close from a child thread whereas the open was performed in another (the main) thread of my application? I guess so...
The problem I have here is that file->f_count is greater than 1, so the if branch is not taken and therefore the work, which eventually will triggered tty_release() and thus gs_write callback, is not scheduled.
I grepped the f_count increment location in the fs subsystem and and from the result I get, apart from open, there were in the locking subpart (i.e. /fs/lockd).
So I was wondering whether it could be some lock (involved by the close() call) that has a grasp on the file (increasing the reference count) during the close which could prevent the work from being scheduled (and thus the callback).
From what I know file descriptors are shared between all the thread of a process, and looking in /proc/<my_app_pid>/fd and /proc/<my_app_child_pid>/fd I indeed see the same fds.
Still if I am not mistaken I think the fd table is shared between all the threads (within the same process), which I guess might/should? involve some kind of lock which might explain the problem.
The thing is that I don't really know fs subsystem (neither architecture nor source code). I try to read the source but although some parts of it are understandable, others are less (or rather more tricky especially without a good overview). I am struggling a bit to identity what could have grasp on the reference count.
Any idea of what the problem could be?

How does SIGSTOP work in Linux kernel?

I am wondering how SIGSTOP works inside the Linux Kernel. How is it handled? And how the kernel stops running when it is handled?
I am familiar with the kernel code base. So, if you can reference kernel functions that will be fine, and in fact that is what I want. I am not looking for high level description from a user's perspective.
I have already bugged the get_signal_to_deliver() with printk() statements (it is compiling right now). But I would like someone to explain things in better details.
It's been a while since I touched the kernel, but I'll try to give as much detail as possible. I had to look up some of this stuff in various other places, so some details might be a little messy, but I think this gives a good idea of what happens under the hood.
When a signal is raised, the TIF_SIGPENDING flag is set in the process descriptor structure. Before returning to user mode, the kernel tests this flag with test_thread_flag(TIF_SIGPENDING), which will return true (because a signal is pending).
The exact details of where this happens seem to be architecture dependent, but you can see an example for um:
void interrupt_end(void)
{
struct pt_regs *regs = &current->thread.regs;
if (need_resched())
schedule();
if (test_thread_flag(TIF_SIGPENDING))
do_signal(regs);
if (test_and_clear_thread_flag(TIF_NOTIFY_RESUME))
tracehook_notify_resume(regs);
}
Anyway, it ends up calling arch_do_signal(), which is also architecture dependent and is defined in the corresponding signal.c file (see the example for x86):
void arch_do_signal(struct pt_regs *regs)
{
struct ksignal ksig;
if (get_signal(&ksig)) {
/* Whee! Actually deliver the signal. */
handle_signal(&ksig, regs);
return;
}
/* Did we come from a system call? */
if (syscall_get_nr(current, regs) >= 0) {
/* Restart the system call - no handlers present */
switch (syscall_get_error(current, regs)) {
case -ERESTARTNOHAND:
case -ERESTARTSYS:
case -ERESTARTNOINTR:
regs->ax = regs->orig_ax;
regs->ip -= 2;
break;
case -ERESTART_RESTARTBLOCK:
regs->ax = get_nr_restart_syscall(regs);
regs->ip -= 2;
break;
}
}
/*
* If there's no signal to deliver, we just put the saved sigmask
* back.
*/
restore_saved_sigmask();
}
As you can see, arch_do_signal() calls get_signal(), which is also in signal.c.
The bulk of the work happens inside get_signal(), it's a huge function, but eventually it seems to process the special case of SIGSTOP here:
if (sig_kernel_stop(signr)) {
/*
* The default action is to stop all threads in
* the thread group. The job control signals
* do nothing in an orphaned pgrp, but SIGSTOP
* always works. Note that siglock needs to be
* dropped during the call to is_orphaned_pgrp()
* because of lock ordering with tasklist_lock.
* This allows an intervening SIGCONT to be posted.
* We need to check for that and bail out if necessary.
*/
if (signr != SIGSTOP) {
spin_unlock_irq(&sighand->siglock);
/* signals can be posted during this window */
if (is_current_pgrp_orphaned())
goto relock;
spin_lock_irq(&sighand->siglock);
}
if (likely(do_signal_stop(ksig->info.si_signo))) {
/* It released the siglock. */
goto relock;
}
/*
* We didn't actually stop, due to a race
* with SIGCONT or something like that.
*/
continue;
}
See the full function here.
do_signal_stop() does the necessary processing to handle SIGSTOP, you can also find it in signal.c. It sets the task state to TASK_STOPPED with set_special_state(TASK_STOPPED), a macro that is defined in include/sched.h that updates the current process descriptor status. (see the relevant line in signal.c). Further down, it calls freezable_schedule() which in turn calls schedule(). schedule() calls __schedule() (also in the same file) in a loop until an eligible task is found. __schedule() attempts to find the next task to schedule (next in the code), and the current task is prev. The state of prev is checked, and because it was changed to TASK_STOPPED, deactivate_task() is called, which moves the task from the run queue to the sleep queue:
} else {
...
deactivate_task(rq, prev, DEQUEUE_SLEEP | DEQUEUE_NOCLOCK);
...
}
deactivate_task() (also in the same file) removes the process from the runqueue by decrementing the on_rq field of the task_struct to 0 and calling dequeue_task(), which moves the process to the new (waiting) queue.
Then, schedule() checks the number of runnable processes and selects the next task to enter the CPU according to the scheduling policies in effect (I think this is a little bit out of scope by now).
At the end of the day, SIGSTOP moves a process from the runnable queue to a waiting queue until that process receives SIGCONT.
Nearly every time there is an interrupt, the kernel suspends some process from running and switches to running the interrupt handler (the only exception being when there is no process running). Likewise, the kernel will suspend processes that run too long without giving up the CPU (and technically that's the same thing: it just originates from the timer interrupt or possibly an IPI). Ordinarily in these cases, the kernel then puts the suspended process back on the run queue and when the scheduling algorithm decides the time is right, it is resumed.
In the case of SIGSTOP, the same basic thing happens: the affected processes are suspended due to the reception of the stop signal. They just don't get put back on the run queue until SIGCONT is sent. Nothing extraordinary here: SIGSTOP is just instructing the kernel to make a process non-runnable until further notice.
[One note: you seemed to imply that the kernel stops running with SIGSTOP. That is of course not the case. Only the SIGSTOPped processes stop running.]

interrupt switch (PIC)

#define SW1 RB5
int IOFlag = 2; //while in out
void SW(){
if(!RB5)
__delay_ms(50);
while(!RB5);
__delay_ms(50);
IOFlag++;
}
void main(){
SW();
while(IOFlag % 2 != 0){
SW();
//some routines..
}
}
I used pic16f73, RB5 input use for switch.
When some of the routine is running, switch is not operating properly.
It is possible if you use the interrupt. However I don't know how to use it properly.
You need to understand the difference between polling and interrupts.
With polling (what you appear to be doing), you periodically check the state of some "thing" and act on it.
With interrupts, the "thing" happening causes your main thread of execution to be suspended, and an interrupt service routine (ISR) run.
Polling has the disadvantage of potentially long latency, the time between the thing happening and you finding out about it. In fact, you can even lose events if the thing is a momentary switch for example - you switch it on then off then, when the code checks for it, it's off.
Now you can still use polling if you wish, provided you understand these implications. Sometimes the easiest solution is to poll more often.
For example, if one of your //some routines.. jobs is a long running loop, you can poll from within there:
for (int i = 0; i < numThings; i++) {
doSomethingQuickWitn (thing[i]);
SW(); // Poll here as well
}
// Rather than here.
However, for _minimal latency, using interrupts is usually better and is reasonably simple once you wrap your head around the concept.
Your ISR (which will run on the given event, interrupting the main thread of execution) simply has to store the fact that the event has happened and communicate that to your main thread somehow.
For situations where you don't care how many times the event has happened, a flag will do the job. Your ISR simply sets the flag and your main thread of execution checks it periodically to see if it's been set, then clears it (with interrupts disabled so as to avoid race conditions). That would be something like (pseudo-code):
global val switchHit = false;
main:
interrupt (7, intFn) // call intfn() on interrupt 7
while true:
disableInts() // disallow interrupts for a short while
if switchHit:
handleSwitch() // switch was hit, do something (quickly)
switchHit = false // mark as not hit
enableInts() // and re-allow interrupts
doLotsOfOtherStuff()
intfn:
switchHit = true // notify main
Note that I'm not worry about race conditions within the ISR, interrupts are generally disabled automatically there.
More complicated information transfer may involve a count rather than a flag, or even a message queue of some sort, flowing from the ISR to the main thread of execution.

How do I "disengage" from `accept` on a blocking socket when signalled from another thread?

I am in the same situation as this guy, but I don't quite understand the answer.
The problem:
Thread 1 calls accept on a socket, which is blocking.
Thread 2 calls close on this socket.
Thread 1 continues blocking. I want it to return from accept.
The solution:
what you should do is send a signal to the thread which is blocked in
accept. This will give it EINTR and it can cleanly disengage - and
then close the socket. Don't close it from a thread other than the one
using it.
I don't get what to do here -- when the signal is received in Thread 1, accept is already blocking, and will continue to block after the signal handler has finished.
What does the answer really mean I should do?
If the Thread 1 signal handler can do something which will cause accept to return immediately, why can't Thread 2 do the same without signals?
Is there another way to do this without signals? I don't want to increase the caveats on the library.
Instead of blocking in accept(), block in select(), poll(), or one of the similar calls that allows you to wait for activity on multiple file descriptors and use the "self-pipe trick". All of the file descriptors passed to select() should be in non-blocking mode. One of the file descriptors should be the server socket that you use with accept(); if that one becomes readable then you should go ahead and call accept() and it will not block. In addition to that one, create a pipe(), set it to non-blocking, and check for the read side becoming readable. Instead of calling close() on the server socket in the other thread, send a byte of data to the first thread on the write end of the pipe. The actual byte value doesn't matter; the purpose is simply to wake up the first thread. When select() indicates that the pipe is readable, read() and ignore the data from the pipe, close() the server socket, and stop waiting for new connections.
The accept() call will return with error code EINTR if a signal is caught before a connection is accepted. So check the return value and error code then close the socket accordingly.
If you wish to avoid the signal mechanism altogether, use select() to determine if there are any incoming connections ready to be accepted before calling accept(). The select() call can be made with a timeout so that you can recover and respond to abort conditions.
I usually call select() with a timeout of 1000 to 3000 milliseconds from a while loop that checks for an exit/abort condition. If select() returns with a ready descriptor I call accept() otherwise I either loop around and block again on select() or exit if requested.
Call shutdown() from Thread 2. accept will return with "invalid argument".
This seems to work but the documentation doesn't really explain its operation across threads -- it just seems to work -- so if someone can clarify this, I'll accept that as an answer.
Just close the listening socket, and handle the resulting error or exception from accept().
I believe signals can be used without increasing "the caveats on the library". Consider the following:
#include <pthread.h>
#include <signal.h>
#include <stddef.h>
static pthread_t thread;
static volatile sig_atomic_t sigCount;
/**
* Executes a concurrent task. Called by `pthread_create()`..
*/
static void* startTask(void* arg)
{
for (;;) {
// calls to `select()`, `accept()`, `read()`, etc.
}
return NULL;
}
/**
* Starts concurrent task. Doesn't return until the task completes.
*/
void start()
{
(void)pthread_create(&thread, NULL, startTask, NULL);
(void)pthread_join(thread);
}
static void noop(const int sig)
{
sigCount++;
}
/**
* Stops concurrent task. Causes `start()` to return.
*/
void stop()
{
struct sigaction oldAction;
struct sigaction newAction;
(void)sigemptyset(&newAction.sa_mask);
newAction.sa_flags = 0;
newAction.sa_handler = noop;
(void)sigaction(SIGTERM, &newAction, &oldAction);
(void)pthread_kill(thread, SIGTERM); // system calls return with EINTR
(void)sigaction(SIGTERM, &oldAction, NULL); // restores previous handling
if (sigCount > 1) // externally-generated SIGTERM was received
oldAction.sa_handler(SIGTERM); // call previous handler
sigCount = 0;
}
This has the following advantages:
It doesn't require anything special in the task code other than normal EINTR handling; consequently, it makes reasoning about resource leakage easier than using pthread_cancel(), pthread_cleanup_push(), pthread_cleanup_pop(), and pthread_setcancelstate().
It doesn't require any additional resources (e.g. a pipe).
It can be enhanced to support multiple concurrent tasks.
It's fairly boilerplate.
It might even compile. :-)

How to implement a thread safe timer on linux?

As we know, doing things in signal handlers is really bad, because they run in an interrupt-like context. It's quite possible that various locks (including the malloc() heap lock!) are held when the signal handler is called.
So I want to implement a thread safe timer without using signal mechanism.
How can I do?
Sorry, actually, I'm not expecting answers about thread-safe, but answers about implementing a timer on Unix or Linux which is thread-safe.
Use usleep(3) or sleep(3) in your thread. This will block the thread until the timeout expires.
If you need to wait on I/O and have a timer expire before any I/O is ready, use select(2), poll(2) or epoll(7) with a timeout.
If you still need to use a signal handler, create a pipe with pipe(2), do a blocking read on the read side in your thread, or use select/poll/epoll to wait for it to be ready, and write a byte to the write end of your pipe in the signal handler with write(2). It doesn't matter what you write to the pipe - the idea is to just get your thread to wake up. If you want to multiplex signals on the one pipe, write the signal number or some other ID to the pipe.
You should probably use something like pthreads, the POSIX threads library. It provides not only threads themselves but also basic synchronization primitives like mutexes (locks), conditions, semaphores. Here's a tutorial I found that seems to be decent:
http://www.yolinux.com/TUTORIALS/LinuxTutorialPosixThreads.html
For what it's worth, if you're totally unfamiliar with multithreaded programming, it might be a little easier to learn it in Java or Python, if you know either of those, than in C.
I think the usual way around the problems you describe is to make the signal handlers do only a minimal amount of work. E.g. setting some timer_expired flag. Then you have some thread that regularly checks whether the flag has been set, and does the actual work.
If you don't want to use signals I suppose you'd have to make a thread sleep or busy-wait for the specified time.
Use a Posix interval timer, and have it notify via a signal. Inside the signal handler function almost none of C's functions, like printf() can be used, as they aren't re-entrant.
Use a single global flag, declared static volatile for your signal handler to manipulate. The handler should literally have this one line of code, and NOTHING else; This flag should impact the flow control elsewhere in the 1 & Only thread in the program.
static volatile bool g_zig_instead_of_zag_flg = false;
...
void signal_handler_fnc()
g_zig_instead_of_zag_flg = true;
return
int main() {
if(false == g_zig_instead_of_zag) {
do_zag();
} else {
do_zig();
g_zig_instead_of_zag = false;
return 0;
}
Michael Kerrisk's The Linux Programming Interface has examples of both methods, and a few more, but the examples come with a lot of his own private functions you have to get working, and the examples carefully avoid many of the gotchas they should explore, so not great.
Using the Poxix interval timer that notifies via a thread makes everything a lot worse, and AFAICT, that notification method is pretty much useless. I only say pretty much because I am allowing that there may be SOME case where doing nothing in the main() thread, and everything in the handler thread is useful, but I sure can't think of any such case.

Resources