What is the difference between cancelling and exiting a pthread? - linux

The terms "cancelling a pthread" and "exiting a pthread" look confusing.
Can someone clearly explain the difference between the two?
P.S.: Please don't just point me to the man pages; I have already read them :-)
Added:
1) How are the thread data structures handled, and the cleanup
different in both these cases?
2) When there are signals pending for a thread, how is the
pending-signal mask handled in both these cases?

Referring to the 1st question in the OP's addendum, verbatim from man pthread_cancel:
When a cancellation requested is acted on, the following steps occur for thread (in this order):
1) Cancellation clean-up handlers are popped (in the reverse of the order in which they were pushed) and called. (See pthread_cleanup_push(3).)
2) Thread-specific data destructors are called, in an unspecified order. (See pthread_key_create(3).)
3) The thread is terminated. (See pthread_exit(3).)
The above steps happen asynchronously with respect to the pthread_cancel() call; the return status of pthread_cancel() merely informs the caller whether the cancellation request was successfully queued.
After a canceled thread has terminated, a join with that thread using pthread_join(3) obtains PTHREAD_CANCELED as the thread's exit status. (Joining with a thread is the only way to know that cancellation has completed.)
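The only differences I see are the exit point and the exit status. A cancelled thread terminates at whichever cancellation point the thread function reaches next, and a join obtains PTHREAD_CANCELED. An exiting thread terminates at pthread_exit(), at a return, or at the end of the thread function, and a join obtains the value the thread supplied.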
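The two termination paths can be compared side by side. The following is a minimal sketch, assuming Linux with glibc and deferred cancellation (the default); the names cleanup_calls, count_cleanup, worker and run_worker are made up for this illustration. The same clean-up handler runs in both cases; only the exit point and the status seen by pthread_join() differ.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical counter; only read after pthread_join() has returned. */
static int cleanup_calls;

static void count_cleanup(void *arg)
{
    (void)arg;
    cleanup_calls++;
}

static void *worker(void *arg)
{
    int exit_voluntarily = *(int *)arg;

    pthread_cleanup_push(count_cleanup, NULL);
    if (exit_voluntarily)
        pthread_exit((void *)(intptr_t)42); /* exit point chosen by the thread */
    for (;;)
        sleep(1);                           /* sleep() is a cancellation point */
    pthread_cleanup_pop(0);                 /* never reached; pairs with the push */
    return NULL;
}

/* Starts one worker, optionally cancels it, and returns its exit status. */
void *run_worker(int exit_voluntarily)
{
    pthread_t tid;
    void *status = NULL;

    if (pthread_create(&tid, NULL, worker, &exit_voluntarily) != 0)
        return NULL;
    if (!exit_voluntarily)
        pthread_cancel(tid);        /* only queues the request */
    pthread_join(tid, &status);     /* the only way to know it has finished */
    return status;
}
```

Calling run_worker(1) should yield the value passed to pthread_exit(), while run_worker(0) should yield PTHREAD_CANCELED; in both cases the handler is invoked once. Build with -pthread.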
Update (referring to the 2nd question):
I'd say that if a signal was queued for a thread and is still pending when cancellation has finished, the signal is lost. I'm not sure, but I could imagine that signal processing continues for as long as the thread lives, that is, also "during" the cancellation.
All I could find regarding this is from man pthread_exit:
BUGS
Currently, there are limitations in the kernel implementation logic for wait(2) ing on a stopped thread group with a dead thread group leader. This can manifest in problems such as a locked terminal if a stop signal is sent to a foreground process whose thread group leader has already called pthread_exit(3).
All quotes come from the Debian (non-free) package manpages-posix-dev (2.16-1). (The source package is here.)

Related

processes only terminate, when threads are terminated?

Processes should only terminate themselves, when all their threads are
terminated!
It's a question in our mock exam, and we aren't sure whether the statement is true or false.
Thanks a lot
First, I need to point out that this exam question contains an incorrect presumption. A running process always has at least one thread. The initial thread, the thread that first calls main or equivalent, isn't special; it's just like every other thread created by pthread_create or equivalent. Once all of the threads within a process have exited, the process can't do anything anymore — there's no way for it to execute even a single additional CPU instruction. In practice, the operating system will terminate the process at that point.
Second, as was pointed out in the comments on the question, the use of "should" makes your exam question ambiguous. It could be read as either "Processes only terminate when all of their threads are terminated" — as a description of how the system works. Or it could be read as "You, the programmer, should write code that ensures that your processes only terminate when all of their threads are terminated" — as a prescription for writing correct code.
If you are specifically talking about POSIX threads ("pthreads"), the answer to the descriptive question is that it depends on how each thread terminates. If all threads terminate by calling pthread_exit or by being cancelled, the process will survive until the last thread terminates, no matter which order they exit in. On the other hand, if any thread calls exit or _exit, or receives a fatal signal, that will immediately terminate the entire process, no matter how many threads are still active. (I am not 100% sure about this, but I think it doesn't matter whether any threads have been detached.)
There's an additional complication, which is that returning from a function passed to pthread_create is equivalent to calling pthread_exit for that thread, but returning from main is equivalent to calling exit. That makes the initial thread a little bit special: unless you specifically end main by calling pthread_exit, the entire process will be terminated when the initial thread exits. But technically this is not a property of the thread itself, but of the code running in that thread.
I do not know the answer to the descriptive question for threads libraries other than POSIX; in particular I don't know the answer for either Windows native threads, or for the threads library added to ISO C in its 2011 revision.
The answer to the prescriptive question is yes with exceptions. You, a programmer, should write programs that, under normal conditions, take care to end their process only when all of their threads have finished their work. (With POSIX threads, this translates to making sure that main does not return until all the other threads have been joined.) However, sometimes you have a few threads that run an infinite loop, without holding any locks or anything, and there's no good way to tell them to exit when everything else is done; as long as exiting the process out from under them won't damage any persistent state, go ahead and exit the process out from under them. (This is the intended use case for detached threads.) Also, it's OK, and often the best choice, to terminate the entire process abruptly if you encounter some kind of unrecoverable error. Those are the only exceptions I can think of off the top of my head.
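For the POSIX case, "do not let main return until the other threads have been joined" can be sketched as follows. This is only an illustration: NWORKERS, square_in_place and sum_of_squares are hypothetical names, and a real main would call such a function and return only after it has come back.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

#define NWORKERS 4  /* illustrative worker count */

/* Each worker squares the int it was handed. */
static void *square_in_place(void *arg)
{
    int *value = arg;
    *value = *value * *value;
    return NULL;
}

/* Starts the workers and joins every one of them before returning,
 * so the caller never exits the process out from under a worker. */
int sum_of_squares(void)
{
    pthread_t tid[NWORKERS];
    int values[NWORKERS];
    int started = 0;
    int sum = 0;

    for (int i = 0; i < NWORKERS; i++) {
        values[i] = i + 1;
        if (pthread_create(&tid[i], NULL, square_in_place, &values[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tid[i], NULL); /* results are safe to read after the join */
        sum += values[i];
    }
    return sum;
}
```

If main returned right after the creation loop instead, the implicit exit would terminate the whole process regardless of how far the workers had got.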

Is pthread_join() a critical function?

According to POSIX, a Thread ID can be reused if the original bearer thread finished. Therefore, would one need to use a mutex or semaphore when calling pthread_join()? Because, it could happen that the target thread, which one wants to join, already terminated and another thread with the same thread ID was created, before calling pthread_join() in the original thread. This would make the original thread believe that the target thread has not finished, although this is not the case.
I think you'll find this works much the same way as processes in UNIX. A joinable thread is not considered truly finished until something has actually joined it.
This is similar to the UNIX processes in that, even though they've technically exited, enough status information (including the PID, which cannot be re-used yet) hangs around until another process does a wait on it. Only after that point does the PID become available for re-use. This kind of process is called a zombie, since it's dead but not dead.
This is supported by the pthread_join documentation which states:
Failure to join with a thread that is joinable (i.e., one that is not detached), produces a "zombie thread". Avoid doing this, since each zombie thread consumes some system resources, and when enough zombie threads have accumulated, it will no longer be possible to create new threads (or processes).
and pthread_create, which states:
Only when a terminated joinable thread has been joined are the last of its resources released back to the system.
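A small sketch of that behaviour, assuming Linux with glibc: the thread below has very probably terminated by the time it is joined, yet the join still succeeds and delivers its exit status, because the thread's ID and status are kept until the join. The names returned_flag, return_seven and join_after_termination are invented for the example, and the short sleep is only a heuristic to let the thread finish.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

static atomic_int returned_flag;    /* set just before the thread returns */

static void *return_seven(void *arg)
{
    (void)arg;
    atomic_store(&returned_flag, 1);
    return (void *)(intptr_t)7;
}

/* Joins a thread well after it has (most likely) terminated. */
int join_after_termination(void)
{
    pthread_t tid;
    void *status = NULL;
    struct timespec pause = { 0, 10 * 1000 * 1000 };    /* 10 ms */

    atomic_store(&returned_flag, 0);
    if (pthread_create(&tid, NULL, return_seven, NULL) != 0)
        return -1;
    while (!atomic_load(&returned_flag))
        sched_yield();
    nanosleep(&pause, NULL);        /* heuristic: give the thread time to finish */

    /* Until this join, tid still refers to the unjoined thread. */
    if (pthread_join(tid, &status) != 0)
        return -1;
    return (int)(intptr_t)status;
}
```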

will setting the cancelability state to PTHREAD_CANCEL_DISABLE queue up the cancellation requests?

I am trying to understand POSIX threads. In the man page of pthread_cancel(), it is mentioned that "thread’s cancelability state, determined by pthread_setcancelstate(), can be enabled or disabled. If a thread has disabled cancellation, then a cancellation request remains queued until the thread enables cancellation."
But when I was reading about thread cancellation points on http://www.makelinux.net/alp/029, it is mentioned that if a thread is made uncancellable (cancellation disabled), the cancellation requests are quietly ignored.
Can anyone please tell me whether cancellation requests are queued or ignored when the cancellation state is set to DISABLED?
POSIX threads control thread cancellation through a combination of two binary variables: the cancellation STATE and the cancellation TYPE. The associated functions are pthread_setcancelstate() and pthread_setcanceltype(), respectively.
When the STATE is set to disabled, a cancellation request is not acted on. It is not thrown out, though: it is held pending (or, as you correctly wrote, "queued") until the STATE is set back to enabled. Once the state is enabled again, the cancellation proceeds according to the cancellation type. If you have code that must run to completion before a thread is cancelled (e.g. memory de-allocation), you can set the cancellation state to disabled before entering that code and re-enable cancellation when leaving it.
The second question is how and when the thread is really stopped (cancelled). The cancellation type answers this. If the type is set to asynchronous (not recommended), the cancellation may occur at any time, typically immediately. If the type is the default, deferred, the cancellation occurs at the next "cancellation point", a POSIX function that checks for a pending cancellation request and terminates the thread.
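The "queued, not ignored" behaviour can be observed with a short experiment. This is a sketch under the assumption of Linux with glibc and the default deferred type; the flag names and the functions guarded_worker and cancel_waits_for_enable are hypothetical. The worker disables cancellation, is cancelled while disabled, finishes its protected section, and is only terminated once it re-enables cancellation and reaches a cancellation point.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

static atomic_int cancel_sent;        /* set after pthread_cancel() has returned */
static atomic_int finished_critical;  /* protected section ran to completion */
static atomic_int ran_past_enable;    /* should stay 0: the thread dies first */

static void *guarded_worker(void *arg)
{
    int old;
    (void)arg;

    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old);
    while (!atomic_load(&cancel_sent))  /* request arrives while disabled */
        sched_yield();
    atomic_store(&finished_critical, 1);
    pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &old);

    pthread_testcancel();               /* pending request is acted on here */
    atomic_store(&ran_past_enable, 1);
    return NULL;
}

/* Returns 1 if the request stayed pending and took effect after re-enabling. */
int cancel_waits_for_enable(void)
{
    pthread_t tid;
    void *status = NULL;

    atomic_store(&cancel_sent, 0);
    atomic_store(&finished_critical, 0);
    atomic_store(&ran_past_enable, 0);
    if (pthread_create(&tid, NULL, guarded_worker, NULL) != 0)
        return -1;
    pthread_cancel(tid);
    atomic_store(&cancel_sent, 1);
    pthread_join(tid, &status);

    return status == PTHREAD_CANCELED
        && atomic_load(&finished_critical) == 1
        && atomic_load(&ran_past_enable) == 0;
}
```

If the request were really discarded while the state was disabled, the worker would run past pthread_testcancel() and the join would not report PTHREAD_CANCELED.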

Interrupt while placing process on the waiting queue

Suppose there is a process that is trying to enter the critical region, but since the region is occupied by some other process, the current process has to wait. Now suppose that, while the process is being added to the waiting queue of the semaphore, an interrupt arrives (e.g. the battery has run out). What will happen to that process and to the waiting queue?
I think that, since the battery has run out, this interrupt will have the highest priority, so the context of the code that was placing the process on the waiting queue would be saved and the interrupt service routine for this interrupt would be executed.
Execution would then return to the code that was placing the process on the queue.
Please give some hints/suggestions for this question.
This is very hardware / OS dependent, however a few thoughts:
As has been mentioned in the comments, a ‘battery finished’ interrupt may be considered as a special case, simply because the machine may turn off without taking any action, in which case the processes + queue will disappear. In general however, assuming a non-fatal interrupt and an OS that suspends / resumes correctly, I think it’s unlikely there will be any noticeable impact to the execution of either process.
In a multi-core setup, the process may not be immediately suspended. The interrupt could be handled by a different core and neither of the processes you’ve mentioned would be any the wiser.
In a pre-emptive multitasking OS there's also no guarantee that the process adding to the queue would be resumed immediately after the interrupt, the scheduler could decide to activate the process currently in the critical section or another process entirely. What would happen when the process adding itself to the semaphore wait queue resumed would depend on how far through adding it was, how the queue has been implemented and what state the semaphore was in. It may be that it never gets on to the wait queue because it detects that the other process has already woken up and left the critical section, or it may be that it completes adding itself to the queue and suspends as if nothing had happened…
In a single core/processor machine with a cooperative multitasking OS, I think the scenario you’ve described in your question is quite likely, with the executing process being suspended to handle the interrupt and then resumed afterwards until it finished adding itself to the queue and yielded.
It depends on the implementation, but conceptually the same operating process should be performing both the addition of the process to the wait queue and the management of the interrupts, so your process being moved to wait would instead be treated as interrupted from the wait queue.
For Java, see the API for Thread.interrupt()
Interrupts this thread.
Unless the current thread is interrupting itself, which is always permitted, the checkAccess method of this thread is invoked, which may cause a SecurityException to be thrown.
If this thread is blocked in an invocation of the wait(), wait(long), or wait(long, int) methods of the Object class, or of the join(), join(long), join(long, int), sleep(long), or sleep(long, int), methods of this class, then its interrupt status will be cleared and it will receive an InterruptedException.
If this thread is blocked in an I/O operation upon an interruptible channel then the channel will be closed, the thread's interrupt status will be set, and the thread will receive a ClosedByInterruptException.
If this thread is blocked in a Selector then the thread's interrupt status will be set and it will return immediately from the selection operation, possibly with a non-zero value, just as if the selector's wakeup method were invoked.
If none of the previous conditions hold then this thread's interrupt status will be set.
Interrupting a thread that is not alive need not have any effect.

some information on timer_helper_thread() of librt.so.1

Can anybody give some information on the timer_helper_thread() function of librt.so.1?
I am using the POSIX timer_create() function in my application for timer functionality, and I am using SIGEV_THREAD for notification. When the timeout happens, I can see in gdb that two threads are created. One is the thread whose start function I have specified, and the other is a thread whose start function is timer_helper_thread() of librt.so.1. Of these two, timer_helper_thread() does not exit even after my thread exits. Can anybody tell me when timer_helper_thread() will exit, and give some information on it?
Short answer: don't worry about it; it's an implementation detail and will clean up after itself when your program exits. But if you're curious...
From glibc's timer_create(2) man page:
SIGEV_THREAD:
Upon timer expiration, invoke sigev_notify_function as if it were the start function of a new thread. (Among the implementation possibilities here are that each timer notification could result in the creation of a new thread, or that a single thread is created to receive all notifications.)
And also:
The functionality for SIGEV_THREAD is implemented within glibc, rather than the kernel.
So glibc (i.e. librt.so) assumes that the kernel cannot create a thread in response to a timer event -- that all it supports is sending a signal. So someone needs to receive that signal and create the handler thread. If you wanted to muck with the details of receiving the signal yourself, you wouldn't have used SIGEV_THREAD, so glibc doesn't bother you and instead creates its own thread just for handling timer events.
This timer helper thread lasts from the first time you call timer_create() until your program ends. Unless you're doing something unusual, you don't need to worry about it; it will clean up after itself when your program exits. The only thing it does is wait for a timer to expire, so it's not using up any extra processing power. Furthermore, it looks like there will only ever be the one helper thread, no matter how many timers you create.
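For reference, a minimal SIGEV_THREAD timer looks roughly like this. It is a sketch, assuming Linux with glibc; fired, on_expire and fire_once are names invented for the example, and delay_ms must be greater than zero because an all-zero it_value disarms the timer. Stopping in gdb while it runs should show the helper thread described above in addition to the notification thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>

static atomic_int fired;    /* number of notifications seen so far */

/* Runs as if it were the start function of a new thread. */
static void on_expire(union sigval sv)
{
    (void)sv;
    atomic_fetch_add(&fired, 1);
}

/* Arms a one-shot timer and returns how often the callback ran (-1 on error). */
int fire_once(long delay_ms)
{
    struct sigevent sev;
    struct itimerspec its;
    struct timespec tick = { 0, 10 * 1000 * 1000 };    /* 10 ms poll interval */
    timer_t id;

    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_THREAD;
    sev.sigev_notify_function = on_expire;
    atomic_store(&fired, 0);
    if (timer_create(CLOCK_MONOTONIC, &sev, &id) != 0)
        return -1;

    memset(&its, 0, sizeof its);    /* zero it_interval means one-shot */
    its.it_value.tv_sec = delay_ms / 1000;
    its.it_value.tv_nsec = (delay_ms % 1000) * 1000000L;
    if (timer_settime(id, 0, &its, NULL) != 0) {
        timer_delete(id);
        return -1;
    }

    /* Wait up to about two seconds for the notification thread to run. */
    for (int i = 0; i < 200 && atomic_load(&fired) == 0; i++)
        nanosleep(&tick, NULL);

    timer_delete(id);
    return atomic_load(&fired);
}
```

Build with -pthread, and with -lrt on older glibc versions where the timer functions still live in librt.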
@jander: Your comment is interesting here: "This timer helper thread lasts from the first time you call timer_create() until your program ends."
A thread is created every time a timer expires. Is this the same as the timer_helper_thread() you mention?
I have a similar post where I observe a separate thread that is created just by calling timer_create(). Would this be the timer_helper_thread()?
Ref: New thread on invocation of timer_create()
