pthread_join() hangs after pthread_cancel() - multithreading

I have written an application that creates a thread, which runs in a while loop.
After spawning the thread, I registered fork handlers with pthread_atfork().
[When fork() is called, the prepare handler cancels the thread, and after the child process has been spawned the same thread is started again.]
The problem with this logic is that when the same code is compiled for a different target, pthread_join() hangs.
The steps are:
Spawn a thread that runs in a while loop (cancel state is enabled, cancel type is asynchronous, and cancellation points have also been considered).
Register fork handlers with pthread_atfork().
When fork() is executed, the prepare handler cancels the running thread with pthread_cancel() and then calls pthread_join() to confirm that the thread has terminated.
This is where the problem occurs: pthread_join() does not return.
This behavior is observed in only one particular target environment.
I have some questions:
Is pthread_cancel() safe to call?
With cancel state enabled and cancel type asynchronous, does pthread_cancel() cancel the thread immediately?
Or is there an alternative way to stop the thread?
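For readers who want something concrete, the setup described in the question might be sketched in C roughly as below. This is an illustration under assumptions, not the asker's code: all function and variable names are hypothetical, and the sketch uses deferred cancellation (the POSIX default) rather than asynchronous cancellation, so the cancel request is acted on only when the thread reaches a cancellation point.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

static pthread_t loop_tid;   /* hypothetical: the background thread */

/* Loop forever; with deferred cancellation a cancel request takes effect
 * only at cancellation points such as nanosleep() or pthread_testcancel(). */
static void *loop_thread(void *arg)
{
    (void)arg;
    pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL);
    pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, NULL);
    for (;;) {
        struct timespec nap = { 0, 1000000L };   /* 1 ms */
        nanosleep(&nap, NULL);                   /* cancellation point */
        pthread_testcancel();                    /* explicit cancellation point */
    }
}

void start_loop_thread(void)
{
    int rc = pthread_create(&loop_tid, NULL, loop_thread, NULL);
    assert(rc == 0);
}

/* Request cancellation, then wait until the thread has really terminated.
 * Returns the exit value reported by pthread_join(). */
void *stop_loop_thread(void)
{
    void *result = NULL;
    int rc = pthread_cancel(loop_tid);
    assert(rc == 0);
    rc = pthread_join(loop_tid, &result);
    assert(rc == 0);
    return result;
}

static void before_fork(void)
{
    (void)stop_loop_thread();
}

/* Stop the thread before fork(); restart it in both parent and child. */
int install_fork_handlers(void)
{
    return pthread_atfork(before_fork, start_loop_thread, start_loop_thread);
}
```

When the thread was cancelled, pthread_join() reports PTHREAD_CANCELED as its exit value, which gives the prepare handler a way to confirm what happened.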

Related

Waiting std::condition_variable while forking and forked child process is unable to resume it

I am trying to understand forking with multithreading. So what happens in the scenario below?
Application thread has spawned a thread - polling thread
Application thread runs fork
pthread_atfork handler's pre_fork stops the polling thread using a std::condition_variable. It also waits on a different condition variable to resume the polling
pthread_atfork handler's post_fork in child does cv.notify_one for the waiting poll thread and stops the poll thread
pthread_atfork handler's post_fork in parent does cv.notify_one for the waiting poll thread and resumes the poll thread
But what happens is, post_fork in child enters an infinite loop where it keeps on waiting. This also doesn't seem to notify the poll thread cv at all.
Why is this happening?
> I am trying to understand forking with multithreading.
The #1 thing to understand about combining forking with multithreading is don't do it. The combination is highly problematic other than in a handful of special cases.
> So what happens in the scenario below?
> Application thread has spawned a thread - polling thread
> Application thread runs fork
> pthread_atfork handler's pre_fork stops the polling thread using a std::condition_variable. It also waits on a different condition variable to resume the polling
That makes no sense. A condition variable does not have the power to preemptively make any thread stop. And if the polling thread eventually did stop by blocking on the CV, then what role would a different CV have to play in starting it again?
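To make that point concrete: a thread pauses only when it checks for a pause request itself and then chooses to block. A minimal C sketch of such cooperative pausing follows, using a POSIX mutex and condition variable in place of std::condition_variable. All names are hypothetical, and a single condition variable serves both for acknowledging the pause and for resuming.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* All names here are hypothetical. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  state_cond = PTHREAD_COND_INITIALIZER;
static bool pause_requested, parked, quit_requested;
static unsigned long polls;            /* completed poll iterations */
static pthread_t poller_tid;

static void *poller(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&state_lock);
    while (!quit_requested) {
        if (pause_requested) {
            parked = true;                      /* acknowledge the request */
            pthread_cond_broadcast(&state_cond);
            while (pause_requested && !quit_requested)
                pthread_cond_wait(&state_cond, &state_lock);
            parked = false;
            continue;
        }
        polls++;                                /* stand-in for one poll */
        pthread_mutex_unlock(&state_lock);
        struct timespec nap = { 0, 1000000L };  /* 1 ms between polls */
        nanosleep(&nap, NULL);
        pthread_mutex_lock(&state_lock);
    }
    pthread_mutex_unlock(&state_lock);
    return NULL;
}

void start_poller(void)
{
    int rc = pthread_create(&poller_tid, NULL, poller, NULL);
    assert(rc == 0);
}

/* Returns only after the poller has parked itself. */
void pause_poller(void)
{
    pthread_mutex_lock(&state_lock);
    pause_requested = true;
    while (!parked)
        pthread_cond_wait(&state_cond, &state_lock);
    pthread_mutex_unlock(&state_lock);
}

void resume_poller(void)
{
    pthread_mutex_lock(&state_lock);
    pause_requested = false;
    pthread_cond_broadcast(&state_cond);
    pthread_mutex_unlock(&state_lock);
}

unsigned long poll_count(void)
{
    pthread_mutex_lock(&state_lock);
    unsigned long n = polls;
    pthread_mutex_unlock(&state_lock);
    return n;
}

void stop_poller(void)
{
    pthread_mutex_lock(&state_lock);
    quit_requested = true;
    pthread_cond_broadcast(&state_cond);
    pthread_mutex_unlock(&state_lock);
    pthread_join(poller_tid, NULL);
}
```

The pause takes effect only at the top of the poller's loop, so pause_poller() may block for up to one poll iteration before it returns.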
> pthread_atfork handler's post_fork in child does cv.notify_one for the waiting poll thread and stops the poll thread
I suppose you meant to say that a post_fork handler registered via pthread_atfork performs a cv.notify_one in the child to resume the poll thread.
Either way, it is impossible for the child to do anything with the polling thread because it doesn't have one. The child process has only one thread -- a copy of the one that called fork(). This is one of the main reasons why forking and multithreading don't mix.
> pthread_atfork handler's post_fork in parent does cv.notify_one for the waiting poll thread and resumes the poll thread
This seems questionable in light of the overall questionable behavior you are attributing to the CVs, but there is nothing inherently wrong with the concept.
> But what happens is, post_fork in child enters an infinite loop where it keeps on waiting.
Something is missing here. Are you doing a timed wait? Is the wait failing? These are the only ways I can think of that the child could be both looping and waiting. There is initially no other thread in the child process to wake the single one that resulted from the fork, so that thread cannot return successfully from a wait on any CV, unless spuriously. There is no one to signal it.
> This also doesn't seem to notify the poll thread cv at all.
Do you mean the one in the child that doesn't exist? Or the one in the parent that probably isn't waiting on the CV you think it's waiting on?
Most of the above is moot. There is absolutely no reason to think that your program is exercising one of the special exceptions, so refer to #1: don't combine forking with multithreading. Choose one.
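The claim that the child has no polling thread can be checked with a small experiment. The C sketch below uses hypothetical names and assumes a POSIX system: a polling thread increments a counter, the process forks, and the child reports through its exit status whether its copy of the counter is still advancing.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static atomic_ulong ticks;         /* incremented only by the polling thread */
static atomic_bool  stop_polling;

static void *poll_loop(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop_polling)) {
        atomic_fetch_add(&ticks, 1);
        struct timespec nap = { 0, 1000000L };   /* 1 ms */
        nanosleep(&nap, NULL);
    }
    return NULL;
}

/* Returns 1 if the counter kept advancing in the child (a poller runs
 * there), 0 if it stood still, and -1 if fork() or the child failed. */
int child_has_poller(void)
{
    pthread_t tid;
    struct timespec nap = { 0, 50000000L };      /* 50 ms */
    int status = 0;

    atomic_store(&stop_polling, false);
    int rc = pthread_create(&tid, NULL, poll_loop, NULL);
    assert(rc == 0);
    while (atomic_load(&ticks) == 0)             /* wait until the poller runs */
        nanosleep(&nap, NULL);

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: only the thread that called fork() exists here. */
        unsigned long before = atomic_load(&ticks);
        nanosleep(&nap, NULL);
        _exit(atomic_load(&ticks) != before);
    }
    if (pid > 0)
        waitpid(pid, &status, 0);

    atomic_store(&stop_polling, true);           /* stop the parent's poller */
    pthread_join(tid, NULL);
    if (pid < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The child restricts itself to async-signal-safe calls (nanosleep and _exit) plus atomic loads, which is what POSIX permits after fork() in a multithreaded process.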

How can one implement pthread_detach on Linux?

pthread_detach marks a thread so that when it terminates, its resources are automatically released without requiring the parent thread to call pthread_join. How can it do this? From the perspective of Linux in particular, there are two resources in particular I am curious about:
As an implementation detail, I would expect that if a wait system call is not performed on the terminated thread, then the thread would become a zombie. I assume that the pthread library's solution to this problem does not involve SIGCHLD, because (I think) it still works regardless of what action the program has specified to occur when SIGCHLD is received.
Threads are created using the clone system call. The caller must allocate memory to serve as the child thread's stack area before calling clone. Elsewhere on Stack Overflow, it was recommended that the caller use mmap to allocate the stack for the child. How can the stack be unmapped after the thread exits?
It seems to me that pthread_detach must somehow provide solutions to both of these problems, otherwise, a program that spawns and detaches many threads would eventually lose the ability to continue spawning new threads, even though the detached threads may have terminated already.
The pthreads library (on Linux, NPTL) provides a wrapper around lower-level primitives such as clone(2). When a thread is created with pthread_create, the library allocates the stack and stores that information, plus any other metadata, in a per-thread structure before calling clone. The function passed to clone is a wrapper function, which calls the user-provided start function. When the user-provided start function returns, cleanup happens. Finally, an internal function called __exit_thread is called to make a system call to exit the thread.
When such a thread is detached, it still returns from the user-provided start function and runs the cleanup code as before, except that the stack and metadata are also freed at this point, since nobody is waiting for the thread to complete. For a joinable thread, this would be handled by pthread_join.
If a thread is killed or exits without having run, then the cleanup is handled by the next pthread_create call, which will call any cleanup handlers yet to be run.
The reason a SIGCHLD is not sent to the parent nor is wait(2) required is because the CLONE_THREAD flag to clone(2) is used. The manual page says the following about this flag:
A new thread created with CLONE_THREAD has the same parent process as the process that made the clone call (i.e., like CLONE_PARENT), so that calls to getppid(2) return the same value for all of the threads in a thread group. When a CLONE_THREAD thread terminates, the thread that created it is not sent a SIGCHLD (or other termination) signal; nor can the status of such a thread be obtained using wait(2). (The thread is said to be detached.)
As you noted, this is required for the expected POSIX semantics to occur.
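From the caller's side, the effect can be sketched as follows. This is a hedged C illustration with hypothetical names, not a look inside NPTL: it starts a batch of threads in the detached state, never joins them, and counts how many ran.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

static atomic_int finished;   /* number of detached threads that have run */

static void *short_task(void *arg)
{
    (void)arg;
    atomic_fetch_add(&finished, 1);
    return NULL;              /* no pthread_join() will ever collect this */
}

/* Start n detached threads one after another and return how many ran. */
int run_detached(int n)
{
    pthread_attr_t attr;
    struct timespec nap = { 0, 1000000L };   /* 1 ms */
    int started = 0;

    atomic_store(&finished, 0);
    int rc = pthread_attr_init(&attr);
    assert(rc == 0);
    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    assert(rc == 0);

    for (int i = 0; i < n; i++) {
        pthread_t tid;
        if (pthread_create(&tid, &attr, short_task, NULL) == 0)
            started++;
    }
    pthread_attr_destroy(&attr);

    /* Joining is not possible for detached threads; poll the counter. */
    while (atomic_load(&finished) < started)
        nanosleep(&nap, NULL);
    return atomic_load(&finished);
}
```

Calling pthread_detach() on an already running thread has the same effect as creating it with PTHREAD_CREATE_DETACHED.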

Do I need to check for my threads exiting?

I have an embedded application, running as a single process on Linux.
I use sigaction() to catch problems, such as segmentation fault, etc.
The process has a few threads, all of which, like the app, should run forever.
My question is whether (and how) I should detect if one of the threads dies.
Would a seg fault in a thread be caught by the application’s sigaction() handler?
I was thinking of using pthread_cleanup_push/pop, but this page says “If any thread within a process calls exit, _Exit, or _exit, then the entire process terminates”, so I wonder if a thread dying would be caught at the process level …
You are not required to check whether a child thread has completed.
If you need to do something after a child thread finishes its processing, you can call pthread_join() from the main thread; it waits until the child thread completes, and you can do the rest afterwards. If you call pthread_exit() in the main thread, the main thread terminates once it is done, leaving the spawned threads to continue executing. The process ends only after all of its threads have completed.
If you want to check the status of the spawned threads, you can use a flag to detect whether each one is still running. See this question for more details:
How do you query a pthread to see if it is still running?
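A flag of that kind might look like the C sketch below. The names are hypothetical, and it combines the flag with the pthread_cleanup_push/pop pair mentioned in the question so that the flag is cleared on normal return, on pthread_exit() and on cancellation. It does not help with a crash such as a segmentation fault, which is handled at the process level.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool monitored_running;   /* hypothetical status flag */

static void mark_stopped(void *arg)
{
    (void)arg;
    atomic_store(&monitored_running, false);
}

static void *monitored_main(void *arg)
{
    (void)arg;
    /* The handler runs on pthread_exit() or cancellation, and also on the
     * normal path because pthread_cleanup_pop() is called with 1. */
    pthread_cleanup_push(mark_stopped, NULL);
    /* ... the thread's real work would go here ... */
    pthread_cleanup_pop(1);
    return NULL;
}

/* The flag is set before the thread exists, so a monitor never sees
 * "not running" for a thread that simply has not been scheduled yet. */
int start_monitored(pthread_t *tid)
{
    assert(tid != NULL);
    atomic_store(&monitored_running, true);
    int rc = pthread_create(tid, NULL, monitored_main, NULL);
    if (rc != 0)
        atomic_store(&monitored_running, false);
    return rc;
}

bool monitored_is_running(void)
{
    return atomic_load(&monitored_running);
}
```

A supervising thread could poll monitored_is_running() periodically and restart or report the worker when it returns false.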

how to cancel (not terminate) boost thread?

I like C#'s CancellationTokenSource, which allows me to terminate a Task as shown in this article.
What would be the equivalent way to cancel a boost::thread? I don't want to "kill" or "terminate" the thread. Instead, I want to "request" the task to finish. Then I need to wait until the task is finished.
You can use the boost thread interruption
A running thread can be interrupted by invoking the interrupt() member
function of the corresponding boost::thread object. When the
interrupted thread next executes one of the specified interruption
points (or if it is currently blocked whilst executing one) with
interruption enabled, then a boost::thread_interrupted exception will
be thrown in the interrupted thread. If not caught, this will cause
the execution of the interrupted thread to terminate. As with any
other exception, the stack will be unwound, and destructors for
objects of automatic storage duration will be executed.

Thread deletion design

I have a multithreaded program, designed as follows:
Suppose one thread is the main thread and the others are slave threads. The main thread keeps track of all slave thread IDs. In one scenario (a graceful shutdown of the application), I want to delete the slave threads from the main thread.
The slave threads may be executing at that point, i.e. either sleeping or performing some action that I cannot stop. So I want to delete the threads from the main thread using the thread IDs I stored internally.
Additional info:
While deleting, I should not wait for a thread's current action to complete, because it may take a long time: the thread reads from a database and acts on the result. Even in the case of a graceful shutdown, I should not wait for the action to complete.
If I force-delete a thread, how can resource leaks occur?
Is the above design OK, or is there any flaw or any way to improve it?
Thanks!
It's not okay. It's bad practice to forcefully kill a thread from another thread, because you are very likely to end up with resource leaks. The best way is to use an event or signal to tell the child threads to stop, and then wait until they exit gracefully.
The overall flow of the program would look like this:
The parent thread creates an event (say hEventParent). It then creates the child threads and passes hEventParent as a parameter. The parent thread keeps the hThread of each child thread.
The child threads do their work but periodically wait for hEventParent.
When the program needs to exit, the parent thread sets hEventParent. It then waits for the hThread handles (WaitForMultipleObjects also accepts hThread).
Each child thread is notified, executes its clean-up routine and exits.
When all the threads have exited, the parent can exit.
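The flow above is described with Windows handles. On a POSIX system the same idea could be sketched in C as below, with the event replaced by a mutex, a condition variable and a flag. This is a swapped-in equivalent rather than the Windows API, and all names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* A manual-reset "event" built from a mutex and a condition variable. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            set;
} event_t;

static event_t stop_event = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false
};

/* Wait up to timeout_ms for the event; returns true once it has been set. */
static bool event_wait(event_t *ev, long timeout_ms)
{
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }
    pthread_mutex_lock(&ev->lock);
    while (!ev->set &&
           pthread_cond_timedwait(&ev->cond, &ev->lock, &deadline) != ETIMEDOUT)
        ;   /* woken early or spuriously: re-check the flag */
    bool set = ev->set;
    pthread_mutex_unlock(&ev->lock);
    return set;
}

static void event_set(event_t *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->set = true;
    pthread_cond_broadcast(&ev->cond);
    pthread_mutex_unlock(&ev->lock);
}

static void *child_main(void *arg)
{
    int *cleaned_up = arg;
    while (!event_wait(&stop_event, 10)) {
        /* ... one slice of work between checks ... */
    }
    *cleaned_up = 1;          /* stand-in for the clean-up routine */
    return NULL;
}

/* Parent side: start n children (n <= 16), set the event, join them all.
 * Returns how many children ran their clean-up. */
int run_and_shutdown(int n)
{
    pthread_t tids[16];
    int cleaned[16] = { 0 };
    int total = 0;

    assert(n >= 0 && n <= 16);
    for (int i = 0; i < n; i++) {
        int rc = pthread_create(&tids[i], NULL, child_main, &cleaned[i]);
        assert(rc == 0);
    }
    event_set(&stop_event);
    for (int i = 0; i < n; i++) {
        pthread_join(tids[i], NULL);
        total += cleaned[i];
    }
    return total;
}
```

Here pthread_join() plays the role of waiting on the hThread handles, and the 10 ms timeout bounds how long a child can take to notice the request.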
The most common approach consists in the main thread sending a termination signal to all the threads, then waiting for the threads to end.
Typically the worker threads will have a loop, inside of which the work is done. You can add a boolean variable that indicates if the thread needs to end. For example:
terminate = false;          // set to true by the main thread to request shutdown
while (!terminate) {
    // work here
}
If you want your worker threads to go to sleep when they have no work, then it gets a bit more complicated. In this case you could make the threads wait on semaphores. Each semaphore will be signaled when there is work to do, and that will awaken the thread. You will also signal the semaphore when the request to terminate is issued. Example worker thread:
terminate = false;          // set to true by the main thread to request shutdown
while (!terminate) {
    // work here
    wait(semaphore);        // go to sleep until there is work or a shutdown request
}
When the main thread wants to exit it will set terminate to true for all the threads and then signal the thread semaphores to awaken the threads and give them a chance to see the termination request. After that it will join all the threads, and only after all the threads are finished it will exit.
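Note that in C or C++ the terminate flag must be safe to access from several threads at once: declare it as an atomic (atomic_bool in C11, std::atomic<bool> in C++11) or protect it with a mutex. Declaring it volatile is not sufficient, because volatile provides neither atomicity nor ordering between threads.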
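The semaphore-based worker described in this answer might look as follows in C. This is a sketch under assumptions: it uses POSIX unnamed semaphores and C11 atomics, a plain counter stands in for a real work queue, and all names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    sem_t       wake;        /* posted for new work and for shutdown */
    atomic_bool terminate;   /* set by the main thread to request shutdown */
    atomic_int  pending;     /* counter standing in for a work queue */
    int         done;        /* items processed; read only after the join */
    pthread_t   tid;
} worker_t;

static void *worker_main(void *arg)
{
    worker_t *w = arg;
    while (!atomic_load(&w->terminate)) {
        while (atomic_load(&w->pending) > 0) {       /* work here */
            atomic_fetch_sub(&w->pending, 1);
            w->done++;
        }
        while (sem_wait(&w->wake) == -1 && errno == EINTR)
            ;                                        /* sleep; retry if interrupted */
    }
    return NULL;
}

int worker_start(worker_t *w)
{
    assert(w != NULL);
    atomic_init(&w->terminate, false);
    atomic_init(&w->pending, 0);
    w->done = 0;
    if (sem_init(&w->wake, 0, 0) != 0)
        return -1;
    return pthread_create(&w->tid, NULL, worker_main, w);
}

/* Hand one unit of work to the worker and wake it up. */
void worker_submit(worker_t *w)
{
    atomic_fetch_add(&w->pending, 1);
    sem_post(&w->wake);
}

/* Request termination, wake the worker so it sees the request, then join. */
int worker_shutdown(worker_t *w)
{
    atomic_store(&w->terminate, true);
    sem_post(&w->wake);
    int rc = pthread_join(w->tid, NULL);
    sem_destroy(&w->wake);
    return rc;
}
```

Work that is still pending when the termination request arrives may be left unprocessed in this sketch; whether to drain it first is a design decision for the application.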

Resources