Close call does not release underlying resources for the device - multithreading

A bit of context: Linux 3.10.40, multithreaded application, main thread waiting for user input (keyboard), other threads waiting in epoll_wait() for events. No specific priority for either the application or its child threads, and no binding to a specific core.
I have a problem when I try to close the device /dev/ttyGS from my application in user space. close() returns 0 and the file descriptor is indeed removed from the process fd list, but the underlying tty port is not released (because the gs_close() callback is not called).
It "only" happens when I test the following scenario: unloading my driver while /dev/ttyGS is still open.
However, if I close /dev/ttyGS during the "normal" application exit path, i.e. run the teardown sequence (including the close(fd) call), exit the application, and then unload the driver (in the shell), I do not face this issue.
From my (main thread) application:
// during application initialization
fd = open("/dev/ttyGS0", O_NONBLOCK | O_NOCTTY)
fd1 = epoll_create(....);
epoll_ctl(fd1, EPOLL_CTL_ADD, fd, &evt);
fd2 = epoll_create(....)
....
// then during application life
system("rmmod mydriver");
mydriver_exit
// some code ....
eventfd_signal
// some code ....
wait_event_interruptible
// Then from my event thread of my application
exit epoll_wait(fd2)
// some code ....
epoll_ctl(fd1, EPOLL_CTL_DEL, fd, NULL);
close(fd)
// .... some code within the kernel fs subsystem
fput(filp);
if (atomic_long_dec_and_test(&file->f_count)) {
// some code ....
if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) {
if (!task_work_add(task, &file->f_u.fu_rcuhead, true))
return;
// some code ....
schedule_work(&delayed_fput_work);
spin_unlock_irqrestore(&delayed_fput_lock, flags);
}
// return from syscall
// some code ....
write(some_sysfs_attribute)
// some code ....
wake_up_interruptible
// return from syscall
// some code ....
go_back_to_epoll_wait(fd2)
// etc...
Is it correct to call close() from a child thread when the open() was performed in another (the main) thread of my application? I guess so...
The problem I have here is that file->f_count is greater than 1, so atomic_long_dec_and_test() returns false, the if branch is not taken, and therefore the work that would eventually trigger tty_release(), and thus the gs_close() callback, is not scheduled.
I grepped for the places where f_count is incremented in the fs subsystem and, from the results I got, apart from open, they were in the locking code (i.e. /fs/lockd).
So I was wondering whether some lock (taken by the close() call) could hold a reference on the file (increasing the reference count) during the close, which would prevent the work from being scheduled (and thus the callback from being called).
From what I know, file descriptors are shared between all the threads of a process, and looking in /proc/<my_app_pid>/fd and /proc/<my_app_child_pid>/fd I indeed see the same fds.
So, if I am not mistaken, the fd table is shared between all the threads of the process, which I guess might (or should?) involve some kind of lock, and that might explain the problem.
The thing is that I don't really know the fs subsystem (neither its architecture nor its source code). I try to read the source, but although some parts of it are understandable, others are less so (or rather trickier, especially without a good overview). I am struggling a bit to identify what could be holding the reference count.
Any idea what the problem could be?
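On the narrower question of closing from a different thread than the one that opened: the descriptor table is per process, so a close() in one thread is visible to all the others. The sketch below is only an illustration of that point, not a diagnosis of the f_count problem; the function name close_from_other_thread and the use of /dev/null are assumptions made for the example. Keep in mind that close() only drops one reference to the open file; descriptors duplicated with dup() or inherited across fork() hold references of their own, and the driver's release callback runs only when the last one goes away.

```c
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

/* Thread body: close the descriptor that another thread opened. */
static void *closer(void *arg)
{
    int fd = *(int *)arg;
    close(fd);
    return NULL;
}

/*
 * Opens /dev/null in the calling thread, closes it from a second thread,
 * then checks from the calling thread that the descriptor is gone.
 * Returns 1 if the close was visible process-wide, 0 if not, -1 on error.
 */
int close_from_other_thread(void)
{
    int fd = open("/dev/null", O_RDONLY);
    if (fd < 0)
        return -1;

    pthread_t tid;
    if (pthread_create(&tid, NULL, closer, &fd) != 0) {
        close(fd);
        return -1;
    }
    pthread_join(tid, NULL);

    /* fcntl() on a descriptor that is no longer open fails with EBADF. */
    if (fcntl(fd, F_GETFD) == -1 && errno == EBADF)
        return 1;
    return 0;
}
```

Build with -pthread. The check runs while no other thread is opening files, so the descriptor number cannot have been reused in the meantime.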

Related

What guarantees are there on inter-thread visibility of filesystem changes on Linux/OS X?

Background: I have an application that contains several threads (typically 2-4). At certain points in the running of that application, the threads need to simultaneously open separate log files. The files sit in a common directory. When opening the files, I just need to ensure that they have distinct names, named with a one-up counter. This application runs primarily on Linux and OS X.
Example: If I have 4 threads start logging at once, I would expect to see the following files appear:
log_file-<date>-<time>-1.log
log_file-<date>-<time>-2.log
log_file-<date>-<time>-3.log
log_file-<date>-<time>-4.log
Right now, I'm doing the above using a routine like the below (pseudocode), which is executed in each thread:
lock mutex;
int i = 1;
while (true)
{
    filename = "log_file-<date>-<time>-" + to_string(i) + ".log";
    if (filename does not exist)
    {
        create file called filename;
        break;
    }
    i += 1;
}
unlock mutex;
This works, most of the time. However, I'm seeing the occasional case where I have more than one thread choose the same file name. After introducing some logging statements to this process, I see a case that boils down to the following:
Thread #1 chooses a filename, say, log-file-1.log.
Thread #1 touches the file (by opening it for creation and immediately closing it).
Thread #1 verifies that the file exists on disk (via stat() or similar system call).
Thread #1 releases the mutex.
Thread #2 acquires the mutex.
Thread #2 checks to see if log-file-1.log exists (via stat() or similar system call).
Much to my surprise, thread #2 does not see the presence of the file created by thread #1, causing it to select the same filename (which breaks some stuff down the line in my application).
Question: Are there any guarantees as to the visibility of filesystem operations across threads? Are there filesystem analogs to the memory barriers that are needed in order to ensure proper visibility of memory accesses across threads?
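Independently of what visibility guarantees apply, the separate "check, then create" steps can be folded into one atomic step by opening with O_CREAT | O_EXCL, which fails with EEXIST if the name is already taken. The following is a minimal sketch under that assumption; the helper name create_unique_log and the "<prefix>-<n>.log" naming are made up for the example and would need adapting to the real date/time prefix.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: creates the first free "<prefix>-<n>.log" and
 * returns an open descriptor, writing the chosen name into name_out.
 * Returns -1 on error (errno set by open(), or ENAMETOOLONG).
 */
int create_unique_log(const char *prefix, char *name_out, size_t name_len)
{
    for (int i = 1; ; i++) {
        int n = snprintf(name_out, name_len, "%s-%d.log", prefix, i);
        if (n < 0 || (size_t)n >= name_len) {
            errno = ENAMETOOLONG;
            return -1;
        }
        /* O_EXCL makes "does it exist?" and "create it" a single step. */
        int fd = open(name_out, O_WRONLY | O_CREAT | O_EXCL, 0644);
        if (fd >= 0)
            return fd;
        if (errno != EEXIST)
            return -1;      /* real error, e.g. EACCES or ENOENT */
        /* Name already taken: try the next counter value. */
    }
}
```

Because the existence check happens inside open() itself, two threads (or two processes) cannot both succeed on the same name, and the descriptor that comes back can be used for logging directly instead of touching the file and reopening it later.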

Interrupting open() with SIGALRM

We have a legacy embedded system which uses SDL to read images and fonts from an NFS share.
If there's a network problem, TTF_OpenFont() and IMG_Load() hang essentially forever. A test application reveals that open() behaves in the same way.
It occurred to us that a quick fix would be to call alarm() before the calls which open files on the NFS share. The man pages weren't entirely clear whether open() would fail with EINTR when interrupted by SIGALRM, so we put together a test app to verify this approach. We set up a signal handler with sigaction::sa_flags set to zero to ensure that SA_RESTART was not set.
The signal handler was called, but open() was not interrupted. (We observed the same behaviour with SIGINT and SIGTERM.)
I suppose the system treats open() as a "fast" operation even on "slow" infrastructure such as NFS.
Is there any way to change this behaviour and allow open() to be interrupted by a signal?
"The man pages weren't entirely clear whether open() would fail with EINTR when interrupted by SIGALRM, so we put together a test app to verify this approach."
open(2) is a slow syscall (slow syscalls are those that can sleep forever, and can be awakened when, and if, a signal is caught in the meantime) only for some file types. In general, opens that block the caller until some condition occurs are usually interruptible. Known examples include opening a FIFO (named pipe), or (back in the old days) opening a physical terminal device (it sleeps until the modem is dialed).
NFS-mounted filesystems probably don't cause open(2) to sleep in an interruptible state. After all, you are most likely opening a regular file, and in that case open(2) will not be interruptible.
"Is there any way to change this behaviour and allow open() to be interrupted by a signal?"
I don't think so, not without doing some (non-trivial) changes to the kernel.
I would explore the possibility of using setjmp(3) / longjmp(3) (see the manpage if you're not familiar; it's basically non-local gotos). You can initialize the environment buffer before calling open(2), and issue a longjmp(3) in the signal handler. Here's an example:
#include <stdio.h>
#include <stdlib.h>
#include <setjmp.h>
#include <unistd.h>
#include <signal.h>
static jmp_buf jmp_env;
void sighandler(int signo) {
    longjmp(jmp_env, 1);
}
int main(void) {
    struct sigaction sigact;
    sigact.sa_handler = sighandler;
    sigact.sa_flags = 0;
    sigemptyset(&sigact.sa_mask);
    if (sigaction(SIGALRM, &sigact, NULL) < 0) {
        perror("sigaction(2) error");
        exit(EXIT_FAILURE);
    }
    if (setjmp(jmp_env) == 0) {
        /* First time through
         * This is where we would open the file
         */
        alarm(5);
        /* Simulate a blocked open() */
        while (1)
            ; /* Intentionally left blank */
        /* If open(2) is successful here, don't forget to unset
         * the alarm
         */
        alarm(0);
    } else {
        /* SIGALRM caught, open(2) canceled */
        printf("open(2) timed out\n");
    }
    return 0;
}
It works by saving the context environment with the help of setjmp(3) before calling open(2). setjmp(3) returns 0 the first time through, and returns whatever value was passed to longjmp(3) otherwise.
Please be aware that this solution is not perfect. Here are some points to keep in mind:
There is a window of time between the call to alarm(2) and the call to open(2) (simulated here with while (1) { ... }) where the process may be preempted for a long time, so there is a chance the alarm expires before we actually attempt to open the file. Sure, with a large timeout such as 2 or 3 seconds this will most likely not happen, but it's still a race condition.
Similarly, there is a window of time between successfully opening the file and canceling the alarm where, again, the process may be preempted for a long time and the alarm may expire before we get the chance to cancel it. This is slightly worse because we have already opened the file so we will "leak" the file descriptor. Again, in practice, with a large timeout this will likely never happen, but it's a race condition nevertheless.
If the code catches other signals, there may be another signal handler in the midst of execution when SIGALRM is caught. Using longjmp(3) inside the signal handler will destroy the execution context of these other signal handlers, and depending on what they were doing, very nasty things may happen (inconsistent state if the signal handlers were manipulating other data structures in the program, etc.). It's as if it started executing, and suddenly crashed somewhere in the middle. You can fix it by: a) carefully setting up all signal handlers such that SIGALRM is blocked before they are invoked (this ensures that the SIGALRM handler does not begin execution until other handlers are done) and b) blocking these other signals before catching SIGALRM. Both actions can be accomplished by setting the sa_mask field of struct sigaction with the necessary mask (the operating system atomically sets the process's signal mask to that value before beginning execution of the handler and unsets it before returning from the handler). OTOH, if the rest of the code doesn't catch signals, then this is not a problem.
sleep(3) may be implemented with alarm(2), and alarm(2) and setitimer(2) share the same timer; if other portions in the code make use of any of these functions, they will interfere and the result will be a huge mess.
Just make sure you weigh these disadvantages before blindly using this approach. The use of setjmp(3) / longjmp(3) is usually discouraged and makes programs considerably harder to read, understand and maintain. It's not elegant, but right now I don't think you have a choice, unless you're willing to do some core refactoring in the project.
If you do end up using setjmp(3), then at the very least document these limitations.
Maybe there is a strategy of using a separate thread to do the open so the main thread is not held up longer than desired.

How to close thread winapi

What is the right way to close a thread in WinAPI? The threads don't use common resources.
I am creating threads with CreateThread, but I don't know how to close them correctly, because some people suggest using TerminateThread and others ExitThread. What is the correct way to close a thread?
Also, where should I call the closing function: in WM_CLOSE or WM_DESTROY?
Thanks in advance.
The "nicest" way to close a thread in Windows is by "telling" the thread to shut down via some thread-safe signaling mechanism, then simply letting it reach its demise on its own, potentially waiting for it to do so via one of the WaitForXXXX functions if completion detection is needed (which is frequently the case). Something like:
Main thread:
// some global event all threads can reach
ghStopEvent = CreateEvent(NULL, TRUE, FALSE, NULL);
// create the child thread
hThread = CreateThread(NULL, 0, ThreadProc, NULL, 0, NULL);
//
// ... continue other work.
//
// tell thread to stop
SetEvent(ghStopEvent);
// now wait for thread to signal termination
WaitForSingleObject(hThread, INFINITE);
// important. close handles when no longer needed
CloseHandle(hThread);
CloseHandle(ghStopEvent);
Child thread:
DWORD WINAPI ThreadProc(LPVOID pv)
{
    // do threaded work
    while (WaitForSingleObject(ghStopEvent, 1) == WAIT_TIMEOUT)
    {
        // do thread busy work
    }
    return 0;
}
Obviously things can get a lot more complicated once you start putting it in practice. If by "common" resources you mean something like the ghStopEvent in the prior example, it becomes considerably more difficult. Terminating a child thread via TerminateThread is strongly discouraged because there is no logical cleanup performed at all. The warnings specified in the TerminateThread documentation are self-explanatory, and should be heeded. With great power comes....
Finally, you are not required to have the thread call ExitThread explicitly, and though you can do so, I strongly advise against it in C++ programs. It is called for you once the thread procedure returns from ThreadProc. I prefer the model above simply because it is dead-easy to implement and supports full RAII of C++ object cleanup, which neither ExitThread nor TerminateThread provides. For example, from the ExitThread documentation:
"...in C++ code, the thread is exited before any destructors can be called or any other automatic cleanup can be performed. Therefore, in C++ code, you should return from your thread function."
Anyway, start simple. Get a handle on things with super-simple examples, then work your way up from there. There are a ton of multi-threaded examples on the web. Learn from the good ones and challenge yourself to identify the bad ones.
Best of luck.
So you need to figure out what sort of behaviour you need to have.
Following is a simple description of the methods taken from documentation:
"TerminateThread is a dangerous function that should only be used in the most extreme cases. You should call TerminateThread only if you know exactly what the target thread is doing, and you control all of the code that the target thread could possibly be running at the time of the termination. For example, TerminateThread can result in the following problems:
If the target thread owns a critical section, the critical section will not be released.
If the target thread is allocating memory from the heap, the heap lock will not be released.
If the target thread is executing certain kernel32 calls when it is terminated, the kernel32 state for the thread's process could be inconsistent.
If the target thread is manipulating the global state of a shared DLL, the state of the DLL could be destroyed, affecting other users of the DLL."
So if you need your thread to terminate at any cost, call this method.
ExitThread is more graceful. By calling ExitThread, you're telling Windows that you're done with the calling thread, so the rest of the thread's code isn't going to run. It's a bit like calling exit(0).
"ExitThread is the preferred method of exiting a thread. When this function is called (either explicitly or by returning from a thread procedure), the current thread's stack is deallocated, all pending I/O initiated by the thread is canceled, and the thread terminates. If the thread is the last thread in the process when this function is called, the thread's process is also terminated."

When to call sem_unlink()?

I'm a little confused by the Linux API sem_unlink(), mainly when or why to call it. I've used semaphores in Windows for many years. In Windows, once you close the last handle of a named semaphore, the system removes the underlying kernel object. But it appears that in Linux you, the developer, need to remove the kernel object by calling sem_unlink(). If you don't, the kernel object persists in the /dev/shm folder.
The problem I'm running into, if process A calls sem_unlink() while process B has the semaphore locked, it immediately destroys the semaphore and now process B is no longer "protected" by the semaphore when/if process C comes along. What's more, the man page is confusing at best:
"The semaphore name is removed immediately. The semaphore is destroyed once all other processes that have the semaphore open close it."
How can it destroy the object immediately if it has to wait for other processes to close the semaphore?
Clearly I don't understand the proper use of semaphore objects on Linux. Thanks for any help. Below is some sample code I'm using to test this.
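#include <fcntl.h>      /* O_CREAT */
#include <sys/stat.h>   /* S_IRUSR, S_IWUSR */
#include <semaphore.h>
int main(void)
{
    /* Open (or create) the named semaphore with an initial value of 1 */
    sem_t *pSemaphore = sem_open("/MyName", O_CREAT, S_IRUSR | S_IWUSR, 1);
    if (pSemaphore != SEM_FAILED)
    {
        if (sem_wait(pSemaphore) == 0)
        {
            // Perform "protected" operations here
            sem_post(pSemaphore);
        }
        sem_close(pSemaphore);
        /* Removes the name; other processes' open handles stay valid */
        sem_unlink("/MyName");
    }
    return 0;
}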
Response to your questions:
In contrast to the Windows semaphore behavior you describe, POSIX named semaphores are kernel persistent, meaning that the semaphore retains its value even if no process has it open (i.e. the semaphore's reference count is 0).
If process A calls sem_unlink() while process B has the semaphore locked, the semaphore's reference count is not 0, so the semaphore is not destroyed; only its name is removed.
A summary of sem_close vs sem_unlink may help overall understanding:
sem_close: closes a semaphore; this is also done when a process exits. The semaphore still remains in the system.
sem_unlink: removes the name immediately; the semaphore itself is removed from the system only when the reference count reaches 0 (that is, after all processes that have it open have called sem_close or exited).
References:
Book - UNIX Network Programming, Volume 2: Interprocess Communications by W. Richard Stevens, ch. 10
The sem_unlink() function removes the semaphore identified by name and marks
the semaphore to be destroyed once all processes cease using it (this may mean
immediately, if all processes that had the semaphore open have already closed it).
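The "name removed immediately, object destroyed later" distinction can be observed inside a single process. The sketch below is a hedged illustration only; the function name unlinked_sem_still_works and the semaphore name passed to it are invented for the example. It takes the semaphore, unlinks the name, and then checks two things: that the name can no longer be opened, and that the handle that was already open still works.

```c
#include <errno.h>
#include <fcntl.h>
#include <semaphore.h>
#include <sys/stat.h>

/*
 * Returns 1 if, after sem_unlink(), the name is gone but the already-open
 * handle is still usable; 0 if not; -1 on setup failure.
 * "name" is an arbitrary caller-chosen semaphore name such as "/demo".
 */
int unlinked_sem_still_works(const char *name)
{
    sem_unlink(name);                       /* start clean; failure is fine */
    sem_t *held = sem_open(name, O_CREAT | O_EXCL, S_IRUSR | S_IWUSR, 1);
    if (held == SEM_FAILED)
        return -1;

    if (sem_wait(held) != 0) {              /* plays "process B": lock taken */
        sem_close(held);
        sem_unlink(name);
        return -1;
    }
    if (sem_unlink(name) != 0) {            /* plays "process A": name removed */
        sem_close(held);
        return -1;
    }

    /* The name is gone: opening it without O_CREAT fails with ENOENT. */
    sem_t *again = sem_open(name, 0);
    int name_gone = (again == SEM_FAILED && errno == ENOENT);
    if (again != SEM_FAILED)
        sem_close(again);

    /* The existing handle still refers to a live semaphore. */
    int val = -1;
    int handle_ok = (sem_post(held) == 0 &&
                     sem_getvalue(held, &val) == 0 && val == 1);

    sem_close(held);                        /* last reference: object destroyed */
    return name_gone && handle_ok;
}
```

This also suggests why a later process C appears unprotected: once the name has been unlinked, a sem_open() with O_CREAT under the same name creates a brand-new semaphore, unrelated to the one process B still holds. A common arrangement is therefore to unlink only at a point where no further process is expected to open the name.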

Simultaneous Read/Write on a file by two threads (Mutex aren't helping)

I want to use one thread to get fields of packets by using the tshark utility (via the system() call), whose output is then redirected to a file. This same file needs to be read by another thread simultaneously, so that it can make runtime decisions based on the fields observed in the file.
The problem I am currently having is that even though the first thread is writing to the file, the second thread is unable to read it (it reads NULL from the file). I am not sure why it's behaving this way. I thought it might be due to simultaneous access to the same file. I thought of using mutex locks, but that would block the reading thread, since the first thread will only end when the program terminates.
Any ideas on how to go about it?
If you are using that file for interprocess communication, you could use named pipes or message queues instead. They are much easier to use and don't require explicit synchronization, because one thread writes and the other one reads when data is available.
Edit: For inter-thread communication you can simply use shared variables and a condition variable to signal when some data has been produced (a producer-consumer pattern). Something like:
// thread 1
while(1)
{
    // read packet
    // write packet to global variable
    // signal thread 2
    // wait for confirmation of reading
}
// thread 2
while(1)
{
    // wait for signal from thread 1
    // read from global variable
    // signal thread 1 to continue
}
The signal parts can be implemented with condition variables: pthread_cond_t.
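As a minimal sketch of that handshake, here is a single-slot mailbox built from one mutex and two pthread_cond_t objects. The names (mailbox_put, mailbox_get, run_demo) and the use of a plain int in place of a packet field are assumptions for illustration; real code would carry whatever tshark field is being passed.

```c
#include <pthread.h>

/* Single-slot mailbox shared by the two threads (names are illustrative). */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t slot_full = PTHREAD_COND_INITIALIZER;
static pthread_cond_t slot_empty = PTHREAD_COND_INITIALIZER;
static int slot;        /* the "global variable" holding one value */
static int has_data;    /* 1 while slot holds a value not yet read */

/* Thread 1 side: publish a value once the previous one has been read. */
void mailbox_put(int value)
{
    pthread_mutex_lock(&lock);
    while (has_data)                        /* wait for confirmation of reading */
        pthread_cond_wait(&slot_empty, &lock);
    slot = value;
    has_data = 1;
    pthread_cond_signal(&slot_full);        /* signal thread 2 */
    pthread_mutex_unlock(&lock);
}

/* Thread 2 side: wait for a value, take it, and let thread 1 continue. */
int mailbox_get(void)
{
    pthread_mutex_lock(&lock);
    while (!has_data)                       /* wait for signal from thread 1 */
        pthread_cond_wait(&slot_full, &lock);
    int value = slot;
    has_data = 0;
    pthread_cond_signal(&slot_empty);       /* signal thread 1 to continue */
    pthread_mutex_unlock(&lock);
    return value;
}

/* Demo producer: sends the values 1..n. */
static void *producer(void *arg)
{
    int n = *(int *)arg;
    for (int i = 1; i <= n; i++)
        mailbox_put(i);
    return NULL;
}

/* Starts a producer thread, consumes n values here, returns their sum. */
int run_demo(int n)
{
    pthread_t tid;
    if (pthread_create(&tid, NULL, producer, &n) != 0)
        return -1;
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += mailbox_get();
    pthread_join(tid, NULL);
    return sum;
}
```

The while loops around pthread_cond_wait() matter: a condition variable can wake spuriously, so each side re-checks has_data under the mutex before proceeding.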

Resources