I wanted to use inotify for monitoring some files in my C program.
I am wondering whether it is safe to have one thread reading from the inotify descriptor (the one returned by inotify_init()), blocking until some event happens, while another thread adds new files to the watch list with inotify_add_watch() during that wait.
Do I need to synchronize those actions, or is it safe to do such a thing?
I don't have the exact answer, but I do know from experience that you can't even open files in another thread without triggering the read() in the thread where you are using inotify. I recall reading that you need to use inotify_init1() along with the IN_CLOEXEC flag to allow file I/O in other threads. I'm not sure whether that means you can actually use inotify in more than one thread simultaneously, though.
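For reference, here is a minimal sketch of the pattern the question describes, with one thread blocked in read() on the inotify descriptor while the main thread calls inotify_add_watch() on the same descriptor; the watched path and event mask are just placeholders:

```c
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/inotify.h>

static int ifd;   /* shared inotify descriptor */

/* Thread A: blocks in read() waiting for inotify events. */
static void *watcher(void *arg)
{
    (void)arg;
    char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));
    ssize_t len;
    while ((len = read(ifd, buf, sizeof buf)) > 0) {
        for (char *p = buf; p < buf + len; ) {
            struct inotify_event *ev = (struct inotify_event *)p;
            printf("wd=%d mask=0x%x\n", ev->wd, ev->mask);
            p += sizeof *ev + ev->len;
        }
    }
    return NULL;
}

int main(void)
{
    ifd = inotify_init();
    pthread_t t;
    pthread_create(&t, NULL, watcher, NULL);
    /* Thread B (here: main) adds a watch while A is blocked in read(). */
    inotify_add_watch(ifd, "/tmp", IN_CREATE | IN_DELETE);
    pthread_join(t, NULL);
    return 0;
}
```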
I have this problem: I need to understand whether a Linux thread is running, or has stopped because of a crash rather than a normal exit. The reason is to try to restart the thread without resetting/restarting the whole system.
pthread_join() does not seem like a good option because I have several threads to monitor and the function returns for one specific thread; it doesn't work "in parallel". At the moment I have a keep-alive signal from each thread to the main one, but I'm looking for some system call or thread attribute to understand the state.
Any suggestions?
Thread "crashes"
How to detect if a linux thread is crashed
The only way that a pthreads thread can terminate abnormally while other threads in the process continue to run is via thread cancellation,* which is not well described as a "crash". In particular, if a signal is received whose effect is abnormal termination then the whole process terminates, not just the thread that handled the signal. Other kinds of errors do not cause threads to terminate.
On the other hand, if by "crash" you mean normal termination in response to the thread detecting an error condition, then you have no limitation on what the thread can do prior to terminating to communicate about its state. For example,
it could update a shared object that tracks information about your threads
it could write to a pipe designated for the purpose
it could raise a signal
If you like, you can use pthread_cleanup_push() to register thread cleanup handlers to help with that.
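As a hedged sketch of that idea, the handler below reports a thread's exit by writing its ID to a pipe that the main thread can poll; the status_pipe name and the integer message format are illustrative choices, not a prescribed design:

```c
#include <pthread.h>
#include <unistd.h>

static int status_pipe[2];   /* created with pipe() at startup */

/* Cleanup handler: runs on pthread_exit() or cancellation. */
static void report_exit(void *arg)
{
    int id = *(int *)arg;
    write(status_pipe[1], &id, sizeof id);   /* tell main which thread ended */
}

static void *worker(void *arg)
{
    int id = *(int *)arg;
    pthread_cleanup_push(report_exit, &id);
    /* ... do work; on an error path call pthread_exit(), which runs the
     * handler (a plain return between push/pop is not allowed) ... */
    pthread_cleanup_pop(1);   /* 1 => also run the handler on the normal path */
    return NULL;
}
```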
On the third hand, if you're asking about detecting live threads that are failing to make progress -- because they are deadlocked, for example -- then your best bet is probably to implement some form of heartbeat monitor. That would involve each thread you want to monitor periodically updating a shared object that tracks the time of each thread's last update. If a thread goes too long between beats then you can guess that it may be stalled. This requires you to instrument all the threads you want to monitor.
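A minimal sketch of such a heartbeat monitor, assuming a fixed number of worker threads and illustrative names (last_beat, watchdog); each worker calls heartbeat() periodically, and the watchdog flags slots that go stale:

```c
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>
#include <unistd.h>

#define NTHREADS 4

/* Each monitored thread stamps its slot; the watchdog scans for stale stamps. */
static _Atomic time_t last_beat[NTHREADS];

static void heartbeat(int slot)      /* called periodically by worker `slot` */
{
    last_beat[slot] = time(NULL);
}

static void *watchdog(void *arg)
{
    (void)arg;
    const time_t limit = 5;          /* seconds without a beat => stalled */
    for (;;) {
        time_t now = time(NULL);
        for (int i = 0; i < NTHREADS; i++) {
            if (now - last_beat[i] > limit) {
                /* thread i missed its heartbeat: log it, restart it, etc. */
            }
        }
        sleep(1);
    }
    return NULL;
}
```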
Thread cancellation
You should not use thread cancellation. But if you did, and if you include termination because of cancellation in your definition of "crash", then you still have all the options above available to you, but you must engage them by registering one or more cleanup handlers.
GNU-specific options
The main issues with using pthread_join() to check thread state are
it doesn't work for daemon threads, and
pthread_join() blocks until the specified thread terminates.
For daemon threads, you need one of the approaches already discussed, but for ordinary threads on GNU/Linux, Glibc provides non-standard pthread_tryjoin_np(), which performs a non-blocking attempt to join a thread, and also pthread_timedjoin_np(), which performs a join attempt with a timeout. If you are willing to rely on Glibc-specific functions then one of these might serve your purpose.
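For example, a non-blocking "has it terminated?" check built on pthread_tryjoin_np() might look like this sketch (error handling kept minimal):

```c
#define _GNU_SOURCE          /* for pthread_tryjoin_np() */
#include <pthread.h>
#include <errno.h>

/* Returns 1 if the thread has terminated (and joins it), 0 if still running. */
static int thread_done(pthread_t t)
{
    void *res;
    int rc = pthread_tryjoin_np(t, &res);
    if (rc == 0)
        return 1;            /* joined: thread has terminated */
    return rc == EBUSY ? 0   /* still running */
                       : -1; /* some other error (e.g. invalid thread) */
}
```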
Linux-specific options
The Linux kernel makes per-process thread status information available via the /proc filesystem. See How to check the state of Linux threads?, for example. Do be aware, however, that the details vary a bit from one kernel version to another. And if you're planning to do this a lot, then also be aware that even though /proc is a virtual filesystem (so no physical disk is involved), you still access it via slow-ish I/O interfaces.
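For illustration, here is a sketch that reads the state letter (R, S, D, Z, T, ...) from /proc/<pid>/task/<tid>/stat; note that tid here is the kernel thread ID (as returned by gettid()), not a pthread_t, and the parsing assumes the usual stat layout:

```c
#include <stdio.h>
#include <sys/types.h>

/* Returns the state letter for a thread, or '?' on error.  The third field
 * of /proc/<pid>/task/<tid>/stat is the state; the second field (comm) can
 * contain spaces, so skip past the closing parenthesis first. */
static char thread_state(pid_t pid, pid_t tid)
{
    char path[64], state = '?';
    snprintf(path, sizeof path, "/proc/%d/task/%d/stat", (int)pid, (int)tid);
    FILE *f = fopen(path, "r");
    if (f) {
        if (fscanf(f, "%*d (%*[^)]) %c", &state) != 1)
            state = '?';
        fclose(f);
    }
    return state;
}
```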
Any of the other alternatives is probably better than reading files in /proc. I mention it only for completeness.
Overall
I'm looking for some system call or thread attribute to understand the state
The pthreads API does not provide a "have you terminated?" function or any other such state-inquiry function, unless you count pthread_join(). If you want that then you need to roll your own, which you can do by means of some of the facilities already discussed.
*Do not use thread cancellation.
On Windows there is the WaitForMultipleObjects API which, if one event is registered in many threads, wakes only one thread when the event occurs. I now have to port an application that uses this in its thread pool, and I am looking for the best practice to do this on Linux.
I am aware of epoll, which can wait for fds (which I can create with pipe()), but waiting on one FD in multiple threads may wake every thread on an event when only one is needed.
What would be the best practice to implement this behaviour on Linux? I really don't want to split up each event into as many FDs as there are worker threads, as this may hit the FD limit on some systems, given that I have many events (which would all be split up).
What I thought about is creating one master thread that delegates work to an available worker (or queues the task if all workers are busy), but that would mean one additional context switch (and thus giving up computation time), as the master wakes up and then wakes up a worker. I would do this only if there is no other way to implement this cleanly. Unfortunately I cannot get rid of the current architecture, so I need to work around it.
Is there any API that would be applicable for this kind of problem?
epoll is the correct solution, although you could consider using eventfd() file descriptors rather than pipe() file descriptors for the event signalling. See this text from the epoll(7) man page:
If multiple threads (or processes, if child processes have inherited the epoll file descriptor across fork(2)) are blocked in epoll_wait(2) waiting on the same epoll file descriptor and a file descriptor in the interest list that is marked for edge-triggered (EPOLLET) notification becomes ready, just one of the threads (or processes) is awoken from epoll_wait(2). This provides a useful optimization for avoiding "thundering herd" wake-ups in some scenarios.
So to get this single-wakeup behaviour, you have to be calling epoll_wait() in each thread on the same epoll descriptor, and you have to have registered your event-notifying file descriptors in the epoll set as edge-triggered.
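A minimal sketch of that arrangement, assuming one eventfd per event source (the function names are illustrative): every worker blocks in epoll_wait() on the same epoll descriptor, and the edge-triggered registration gives the single-wakeup behaviour quoted above.

```c
#include <stdint.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>

/* Create an epoll set containing one edge-triggered eventfd. */
int make_event_set(int *efd_out)
{
    int epfd = epoll_create1(0);
    int efd = eventfd(0, EFD_NONBLOCK);
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET, .data.fd = efd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev);
    *efd_out = efd;
    return epfd;
}

/* Worker loop: with EPOLLET, only one waiter is woken per signal. */
void *worker(void *arg)
{
    int epfd = *(int *)arg;
    struct epoll_event ev;
    while (epoll_wait(epfd, &ev, 1, -1) > 0) {
        uint64_t n;
        read(ev.data.fd, &n, sizeof n);   /* drain the eventfd counter */
        /* ... handle the event ... */
    }
    return NULL;
}

/* Signaller: wake exactly one waiting worker. */
void signal_event(int efd)
{
    uint64_t one = 1;
    write(efd, &one, sizeof one);
}
```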
I want to maintain a cache that mirrors a particular directory, so I add a watch whose events are monitored by thread A and then tell thread B to scan that directory and put the filenames into my cache. I have separate threads because I want the application to still be responsive to incoming inotify events during the scan. Otherwise, I could lose events because I wasn't reading them and the inotify queue filled up during the scan.
It is entirely possible that a delete or move_from event for a file will be processed before the file has been added to my cache by the directory scan. In that case a naive implementation would end up with a cache entry referring to a file that doesn't exist. What's the right way to deal with this particular race condition?
The way I'd have done it is to keep two permanent threads: a single utility thread and a single inotify thread that does nothing but read from the inotify file descriptor. These threads communicate via a blocking queue.
When the inotify thread detects an event, it can be one of two event types:
An event indicating that the entire cache for the observed directory must be destroyed and re-created: queue overflow or unmount.
An event that can be handled by changing a single entry in the cache (most other inotify events).
Upon detection, the event is immediately queued to the utility thread (see the sketch below).
When the utility thread receives an event of the 1st type, it recreates the entire cache from scratch by reading the full directory contents into the cache. The same happens when there is no cache yet and an event of the 2nd type arrives. In all other cases a full readdir() is avoided, and the cache is simply modified according to the event.
The race condition described in your question can happen only if multiple threads are allowed to modify the cache. The described approach avoids it by ensuring that the only thread allowed to modify the cache is the utility thread.
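Here is a sketch of the inotify thread's loop under those assumptions; the blocking queue (queue_put(), assumed to copy the message) and the message struct are hypothetical helpers, and the variable-length name field is omitted for brevity:

```c
#include <string.h>
#include <unistd.h>
#include <sys/inotify.h>

enum msg_type { MSG_REBUILD, MSG_MODIFY };
struct msg { enum msg_type type; struct inotify_event ev; };
extern void queue_put(const struct msg *m);  /* assumed: copies *m into a blocking queue */

/* Inotify thread: classify each event and hand it to the utility thread. */
static void *inotify_loop(void *arg)
{
    int ifd = *(int *)arg;
    char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));
    ssize_t len;
    while ((len = read(ifd, buf, sizeof buf)) > 0) {
        for (char *p = buf; p < buf + len; ) {
            struct inotify_event *ev = (struct inotify_event *)p;
            struct msg m;
            m.type = (ev->mask & (IN_Q_OVERFLOW | IN_UNMOUNT))
                     ? MSG_REBUILD   /* type 1: rebuild the whole cache */
                     : MSG_MODIFY;   /* type 2: patch a single entry */
            memcpy(&m.ev, ev, sizeof *ev);   /* header only; name[] omitted */
            queue_put(&m);
            p += sizeof *ev + ev->len;
        }
    }
    return NULL;
}
```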
If you want to allow other threads to modify the cache (for example, because you don't know whether inotify is supported by the filesystem), you can use a simpler and more robust approach: do not track individual directory-modification events, and have the utility thread perform a full readdir() on every arriving event. In the worst case there will be too many readdir() calls, but reading directory contents is in itself so cheap that I wouldn't worry about it.
If reading the full directory contents is not cheap (for example, because the directory may be very, very big), then you shouldn't store all of it in memory to begin with. Such a scenario would work better with a small partial cache that can be quickly refreshed by using telldir()/seekdir() and fstat() to track the small number of files currently visible to the user.
I have a multi-threaded system in which a main thread has to wait in blocking state for one of the following 4 events to happen:
inter-process semaphore (sem_wait())
pthread condition (pthread_cond_wait())
recv() from socket
timeout expiring
Ideally I'd like a mechanism to unblock the main thread when any of the above occurs, something like a ppoll() with a suitable timeout parameter. Non-blocking calls and polling are out of the picture due to the impact on CPU usage, while having separate threads blocking on different events is not ideal due to the increased latency (a thread unblocking from one of the events would then have to wake up the main one).
The code will be almost exclusively compiled under Linux with gcc toolchain, if that helps, but some portability would be good, if at all possible.
Thanks in advance for any suggestion
The mechanisms for waiting on multiple types of objects on Unix-like systems are not that great. In general, the idea is to, wherever possible, use file descriptors for IPC rather than multiple different IPC mechanisms.
From your comment, it sounds like you can edit or change the condition variable, but not the code that signals the semaphore. So what I'd recommend is something like the following.
Change the condition variable to either a pipe (for more portability) or an eventfd(2) object (Linux-specific). The notifying thread writes to the pipe whenever it wants to signal the main thread. This will allow you to select(2) or poll(2) or whatever in the main thread on both that pipe and the socket.
Because you're stuck with the semaphore, I think the best option would be to create another thread, whose sole purpose is to wait for the semaphore using sem_wait(), and then write to another pipe or eventfd(2) object when it is notified by whatever process is doing sem_post(). In the main thread, just add this other file descriptor to your select(2) set.
So you'll have three descriptors: one for the socket, one taking the place of the condition variable, and one which is written to when the semaphore is incremented. You can then wait on all three using your favorite I/O multiplexing method, and include directly whatever timeout you'd like.
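Putting that together, here is a minimal sketch under those assumptions; the_sem, sem_proxy, and cond_efd are illustrative names, and the eventfd replacing the condition variable is assumed to be created elsewhere with eventfd(0, 0):

```c
#include <poll.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

static sem_t *the_sem;   /* the inter-process semaphore you can't change */
static int sem_efd;      /* eventfd mirroring semaphore posts */

/* Helper thread: turn each sem_post() into an eventfd tick. */
static void *sem_proxy(void *arg)
{
    (void)arg;
    uint64_t one = 1;
    for (;;) {
        sem_wait(the_sem);
        write(sem_efd, &one, sizeof one);
    }
    return NULL;
}

/* Main thread: one blocking call covers socket, "condition", semaphore,
 * and timeout.  cond_efd is the eventfd that replaced the condition variable. */
void main_wait(int sock_fd, int cond_efd, int timeout_ms)
{
    struct pollfd fds[3] = {
        { .fd = sock_fd,  .events = POLLIN },
        { .fd = cond_efd, .events = POLLIN },
        { .fd = sem_efd,  .events = POLLIN },
    };
    if (poll(fds, 3, timeout_ms) == 0)
        return;              /* timeout expired */
    /* otherwise inspect fds[i].revents and drain whichever fd fired */
}
```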
I'm using pthreads on Linux, and one of my threads periodically calls the write function on a device file descriptor. If the write call takes a while to finish, will my thread be suspended so other threads can run? I didn't set any of the scheduling features of pthreads, so my question is about default thread behavior.
So long as nothing else is trying to write to the same resource, the other threads should run while the writing thread waits for its write to complete.
If a write() call blocks, only the calling thread is suspended. This is documented in the POSIX spec for write():
If there is enough space for all the data requested to be written immediately, the implementation should do so. Otherwise, the calling thread may block; that is, pause until enough space is available for writing.
Note that it says calling thread, not calling process.
See whether blocking behavior is explicitly defined here:
http://www.akkadia.org/drepper/nptl-design.pdf
In principle, YES, other threads can run.
But be aware that some filesystems have locking mechanisms which permit only one concurrent IO operation on a single file. So if another thread does another IO on the same file (even via a different file descriptor) it MAY be blocked for some of the duration of the write() system call.
There are also other in-kernel locks for other facilities. Most of them will not block other running threads unless they're doing closely related activities, however.
If your device file descriptor is a shared resource, you have to take care of locking. But once it's thread-safe, calls to such a shared resource are serialized, so if one thread is writing, the rest are blocked. If locking is not implemented, the data may be garbled.
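As a minimal sketch of that serialization (the wrapper name and mutex are illustrative), a mutex around write() makes concurrent writers take turns; only the thread holding the lock blocks in write(), while the others block on the mutex:

```c
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t dev_lock = PTHREAD_MUTEX_INITIALIZER;

/* Serialize writes to a shared device descriptor so that concurrent
 * writers cannot interleave their data. */
ssize_t locked_write(int fd, const void *buf, size_t len)
{
    pthread_mutex_lock(&dev_lock);
    ssize_t n = write(fd, buf, len);
    pthread_mutex_unlock(&dev_lock);
    return n;
}
```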