When is POSIX thread cancellation not immediate? - linux

POSIX specifies two thread cancellation types, PTHREAD_CANCEL_ASYNCHRONOUS and PTHREAD_CANCEL_DEFERRED (set by pthread_setcanceltype(3)), which determine when pthread_cancel(3) takes effect. By my reading, the POSIX manual pages do not say much about these, but the Linux manual page says the following about PTHREAD_CANCEL_ASYNCHRONOUS:
The thread can be canceled at any time. (Typically, it will be canceled immediately upon receiving a cancellation request, but the system doesn't guarantee this.)
I am curious about the meaning of "the system doesn't guarantee this". I can easily imagine this happening on multicore/multi-CPU systems (before a context switch). But what about single-core systems:
Could we have a thread not cancelled immediately when cancellation is requested, cancellation is enabled (pthread_setcancelstate(3)), and the cancel type is set to PTHREAD_CANCEL_ASYNCHRONOUS?
If yes, under what conditions could this happen?
I am mainly curious about Linux (LinuxThreads / NPTL), but also more generally about POSIX standard compliant way of viewing this cancellation business.
Update/Clarification: The real practical concern here is the use of resources that are destroyed immediately after calling pthread_cancel(), where the targeted thread has cancellation enabled and set to type PTHREAD_CANCEL_ASYNCHRONOUS. So the point really is: is there even a tiny possibility that the cancelled thread continues running normally after a context switch (even for a very short time)?
Thanks to Damon's answer, the question is reduced to one about signal delivery and handling in relation to the next context switch.
Update-2: I answered my own question to point out that this is the wrong concern and that the underlying program design should be addressed at a fundamentally different conceptual level. I hope this "wrong" question is useful to others wondering about the mysteries of asynchronous cancellation.

The meaning is just what it says: It's not guaranteed to happen instantly. The reason for this is that a certain "liberty" for implementation details is needed and accounted for in the standard.
For example, under Linux/NPTL, cancellation is implemented by sending signal number 32. The thread is cancelled when the signal is received, which usually happens at the next kernel-to-user switch, at the next interrupt, or at the end of the time slice (which may by accident be immediate, but usually is not). A signal is never received while the thread isn't running, however. So the real catch here is that signals are not necessarily received immediately.
If you think about it, it isn't even possible to do it much differently. Since you can pthread_cleanup_push some handlers which the operating system must execute (it cannot just blast the thread out of existence!), the thread must necessarily run in order to be cancelled. There is no guarantee that any particular thread (including the one you want to cancel) is running at the exact time you cancel a thread, so there can be no guarantee that it is cancelled immediately.
Except of course, hypothetically, if the OS was implemented in such a way as to block the calling thread and schedule the to-be-cancelled thread so it executes its handlers, and only unblock pthread_cancel afterwards. But since pthread_cancel isn't specified as blocking, this would be an utterly nasty surprise. It would also be somewhat unacceptable because it would interfere with execution time limits and scheduler fairness.
So, either your cancel state is "disable", in which case nothing happens (the request stays pending until cancellation is re-enabled). Or it is "enable" and the cancel type is "deferred", in which case the thread is cancelled when it calls a function that is listed as a cancellation point in pthreads(7).
Or, it is "asynchronous", then as stated above, the OS will do "something" to cancel the thread as soon as it deems appropriate -- not at a precise, well-defined time, but "soon". In the case of Linux, by sending a signal.
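To tie this back to the practical concern in the question (resources destroyed right after pthread_cancel), here is a minimal sketch in C, assuming a purely computational worker. It only illustrates that pthread_cancel requests cancellation while pthread_join is what tells you the thread has really terminated. The names spin, on_cancel and cancel_and_join are made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

static atomic_int cleanup_ran;

/* Cleanup handler: runs in the cancelled thread before it terminates. */
static void on_cancel(void *arg) {
    (void)arg;
    atomic_store(&cleanup_ran, 1);
}

/* Purely computational worker: no library calls once async mode is on. */
static void *spin(void *arg) {
    volatile unsigned long *counter = arg;
    int old_type;

    /* Push the handler while still in the default deferred mode. */
    pthread_cleanup_push(on_cancel, NULL);
    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &old_type);
    for (;;)
        ++*counter;          /* touches memory owned by the canceller */
    pthread_cleanup_pop(0);  /* never reached; must pair with the push */
    return NULL;
}

/* Returns 0 if the worker was cancelled and its cleanup handler ran. */
int cancel_and_join(void) {
    unsigned long *counter = calloc(1, sizeof *counter);
    pthread_t worker;
    void *result = NULL;
    int rc;

    if (counter == NULL)
        return -1;
    atomic_store(&cleanup_ran, 0);
    rc = pthread_create(&worker, NULL, spin, counter);
    assert(rc == 0);

    pthread_cancel(worker);        /* only requests cancellation */
    pthread_join(worker, &result); /* returns once the thread is gone */
    free(counter);                 /* safe only after the join */

    return (result == PTHREAD_CANCELED && atomic_load(&cleanup_ran)) ? 0 : -1;
}
```

Freeing counter between pthread_cancel and pthread_join would be exactly the bug the question worries about, since the worker may still run for a while in that window.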

If you need to wonder when the asynchronous cancellation happens, you are doing something terribly wrong.
Following Standards: You are cutting the ground from under your own feet by deliberately creating or allowing code to exist whose correctness depends on assumptions about the platform (single core, particular implementation, whatever). It is almost always better, if possible, to follow the standards (and to document clearly when it is not possible). The name PTHREAD_CANCEL_ASYNCHRONOUS itself suggests the meaning asynchronous, which is different from immediate or even almost immediate. The original poster specifically states single core, but why should you allow code to exist that will break in non-deterministic ways when it is run on truly parallel machines (multiple cores or CPUs), where it is practically impossible to guarantee immediacy? (That would require stopping the other cores from running, waiting for a context switch, or some other terrible hack which your OS/CPU is not going to provide just to support your unconventional wishes.)
Asynchronous thread cancellation mode is not meant for guaranteed immediate cancellation of a thread. Hence it is a terribly confusing hack to use it in this way even if it would work.
Async-Safeness: If you are concerned about the mechanism of asynchronous cancellation, it raises the suspicion that the threads in question (because of a lack of independence) are maybe not purely computational or not written in an async-cancel-safe manner.
POSIX specifies only three functions as async-cancel-safe: pthread_cancel(3), pthread_setcancelstate(3), and pthread_setcanceltype(3); see IEEE Std 1003.1, 2013 Edition, 2.9.5. This cancellation mode is only suitable for purely computational tasks that do not call library functions (other than purely computational ones); such code would not provide cancellation points if the threads were set to run in the default deferred cancellation mode. Hence the rationale for defining such a mode.
It is possible to write async-cancel-safe code by disabling cancellation during critical sections. But library writers (including POSIX library implementors) in general should not care about async-cancel safety, for reasons of following general convention, avoiding complexity, and even avoiding performance overhead. Because library writers should not care, you should never expect async-cancel safety unless it is explicitly stated.
If your code is not async-cancel-safe (because, for example, it calls other libraries, including POSIX/standard C libraries, without temporarily disabling cancellation or changing the cancellation mode) and asynchronous cancellation occurs, you might leak resources (memory, etc.), leave behind inconsistent state and locked mutexes that deadlock other threads, and summon many other problems, both imaginable and unimaginable. (If you are writing in C++, it seems you will have other issues to deal with due to POSIX thread cancellation's close association with exception handling.)
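As a rough illustration of "disabling cancellation during critical sections", here is a small C sketch. The helper names deposit and current_balance and the shared balance variable are hypothetical; the only assumption taken from above is that pthread_setcancelstate is async-cancel-safe while pthread_mutex_lock is not.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t balance_lock = PTHREAD_MUTEX_INITIALIZER;
static long balance;

/* Hypothetical helper: update shared state with cancellation disabled. */
int deposit(long amount) {
    int old_state, ignored, rc;

    /* Safe to call even when the cancel type is asynchronous. */
    rc = pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);
    if (rc != 0)
        return rc;

    /* Not async-cancel-safe, so it runs only while cancellation is off. */
    pthread_mutex_lock(&balance_lock);
    balance += amount;
    pthread_mutex_unlock(&balance_lock);

    /* Restore the caller's state; a request that arrived meanwhile
       stayed pending and can take effect from here on. */
    return pthread_setcancelstate(old_state, &ignored);
}

/* Hypothetical helper: read the shared state under the same lock. */
long current_balance(void) {
    long value;
    int old_state, ignored;
    int rc = pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);

    assert(rc == 0);
    pthread_mutex_lock(&balance_lock);
    value = balance;
    pthread_mutex_unlock(&balance_lock);
    pthread_setcancelstate(old_state, &ignored);
    return value;
}
```

The thread can then never be cancelled while it holds the mutex, which removes the "locked mutex left behind" failure for this one function only; every other library call in the thread needs the same treatment.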


What's wrong with using TThread.Resume? [duplicate]

Long ago, when I started working with threads in Delphi, I was making threads start themselves by calling TThread.Resume at the end of their constructor, and still do, like so:
constructor TMyThread.Create(const ASomeParam: String);
begin
  inherited Create(True); // create suspended
  FSomeParam := ASomeParam;
  //Initialize some stuff here...
  Resume; // start the thread at the end of the constructor
end;
Since then, Resume has been deprecated in favor of Start. However, Start can only be called from outside the thread, and cannot be called from within the constructor.
I have continued to design my threads using Resume as shown above, although I know it's been deprecated - only because I do not want to have to call Start from outside the thread. I find it a bit messy to have to call:
FMyThread := TMyThread.Create(SomeParamValue);
FMyThread.Start;
Question: What's the reason why this change was made? I mean, what is so wrong about using Resume that they want us to use Start instead?
EDIT: After Sedat's answer, I guess this really depends on when, within the constructor, the thread actually begins executing.
The short and pithy answer is because the authors of the TThread class didn't trust developers to read or to understand the documentation. :)
Suspending and resuming a thread is a legitimate operation for only a very limited number of use cases. In fact, that limited number is essentially "one": debuggers.
The reason it is considered undesirable (to say the least) is that problems can arise if a thread is suspended while (for example) it owns a lock on some other synchronization object such as a mutex or semaphore.
These synchronization objects are specifically designed to ensure the safe operation of a thread with respect to other threads accessing shared resources, so interrupting and interfering with these mechanisms is likely to lead to problems.
A debugger needs a facility to directly suspend a thread irrespective of these mechanisms for surprisingly similar reasons.
Consider for example that a breakpoint involves an implicit (or you might even say explicit) "suspend" operation on a thread. If a debugger halts a thread when it reaches a break-point then it must also suspend all other threads in the process precisely because they will otherwise race ahead doing work that could interfere with many of the low level tasks that the debugger might be asked to then do.
The Strong Arm of the Debugger
A debugger cannot "inject" nice, polite synchronization objects and mechanisms to request that these other threads suspend themselves in a co-ordinated fashion with some other thread that has been unceremoniously halted (by a breakpoint). The debugger has no choice but to strong-arm the threads, and this is precisely what the Suspend/Resume APIs are for.
They are for situations where you need to stop a thread "Right now. Whatever you are doing I don't care, just stop!". And later, to then say "OK, you can carry on now with whatever it was you were doing before, whatever it was.".
Well Behaved Threads Behave Well Toward Each Other
It should be patently obvious that this is not how a well-behaved thread interacts with other threads in normal operation (if it wishes to maintain a state of "normal" operation and not create all sorts of problems). In those normal cases a thread very much does and should care what those other threads are doing and ensure that it doesn't interfere, using appropriate synchronization techniques to co-ordinate with those other threads.
In those cases, the legitimate use case for Resuming a thread is similarly reduced to just one, single mode. Which is, that you have created and initialised a thread that you do not wish to run immediately but to start execution at some later point in time under the control of some other thread.
But once that new thread has been started, subsequent synchronization with other threads must be achieved using those proper synchronization techniques, not the brute force of suspending it.
Start vs Suspend/Resume
Hence it was decided that Suspend/Resume had no real place on a general purpose thread class (people implementing debuggers could still call the Windows APIs directly) and instead a more appropriate "Start" mechanism was provided.
Hopefully it should be apparent that even though this Start mechanism employs the exact same API that the deprecated Resume method previously employed, the purpose is quite different.

Why can a signal handler not run during system calls?

In Linux a process that is waiting for IO can be in either of the states TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE.
The latter, for example, is the case when the process waits until the read from a regular file completes. If this read takes very long or forever (for whatever reason), it leads to the problem that the process cannot receive any signals during that time.
The former is the case if, for example, one waits for a read from a socket (which may take an unbounded amount of time). If such a read call gets interrupted, however, it leads to the problem that the programmer/user has to handle the errno == EINTR condition correctly, which they might forget.
My question is: Wouldn't it be possible to allow all system calls to be interrupted by signals and, moreover, in such a way that after the signal handler ran, the original system call simply continues its business? Wouldn't this solve both problems? If it is not possible, why?
The two halves of the question are related but I may have digested too much spiked eggnog to determine if there is so much a one-to-one as opposed to a one-to-several relationship here.
On the kernel side I think a TASK_KILLABLE state was added a couple of years back. The consequence of that was that while the process was blocked in a system call (for whatever reason), if the kernel encountered a signal that was going to kill the process anyway, it would let nature take its course. That reduces the chances of a process being permanently stuck in an uninterruptible state. What progress has been made in changing individual system calls to avail themselves of that option I do not know.
On the user side, SA_RESTART and its convenience function cousin siginterrupt go some distance to alleviating the application programmer pain. But the fact that these are specified per signal and aren't honored by all system calls is a pretty good hint that there are reasons why a blanket interruption scheme like you propose can't be implemented, or at least easily implemented. Just thinking about the possible interactions of the matrices of signals, individual system calls, calls that have historically expected and backwardly supported behavior (and possibly multiple historical semantics based on their system bloodline!), is a bit dizzying. But maybe that is the eggnog.
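For reference, the two user-side approaches mentioned here look roughly like this in C. This is only a sketch: read_retry and install_restarting_handler are hypothetical helper names, and as noted above SA_RESTART is not honored by every system call, so the retry loop is still needed in general.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: read() that retries when a signal interrupts it. */
ssize_t read_retry(int fd, void *buf, size_t len) {
    ssize_t n;

    assert(buf != NULL || len == 0);
    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR); /* interrupted before any data */
    return n;
}

/* Hypothetical helper: install a handler with SA_RESTART, so that
   restartable system calls resume after the handler returns. */
int install_restarting_handler(int signo, void (*handler)(int)) {
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(signo, &sa, NULL);
}
```

The retry loop is per call site and SA_RESTART is per signal, which is part of why neither amounts to the blanket scheme the question proposes.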

When should a thread generally yield?

In most languages/frameworks, there exists a way for a thread to yield control to other threads. However, I can't really think of a time when yielding from a thread was the correct solution to a given problem. When, in general, should one use Thread.yield(), sleep(0), etc?
One use case could be testing concurrent programs: trying to find interleavings that reveal flaws in your synchronization patterns. For instance in Java:
A useful trick for increasing the number of interleavings, and
therefore more effectively exploring the state space of your programs,
is to use Thread.yield to encourage more context switches during
operations that access shared state. (The effectiveness of this
technique is platform-specific, since the JVM is free to treat
Thread.yield as a no-op [JLS 17.9]; using a short but nonzero sleep
would be slower but more reliable.) -- JCIP
Also interesting from the Java point of view is that their semantics are not defined:
The semantics of Thread.yield (and Thread.sleep(0)) are undefined
[JLS 17.9]; the JVM is free to implement them as no-ops or treat them
as scheduling hints. In particular, they are not required to have the
semantics of sleep(0) on Unix systems (put the current thread at the end
of the run queue for that priority, yielding to other threads of the
same priority), though some JVMs implement yield in this way. -- JCIP
This makes them, of course, rather unreliable. This is very Java specific, however; in general I believe the following is true:
Both are low-level mechanisms which can be used to influence the scheduling order. If they are used to achieve a certain functionality, then that functionality depends on the probabilistic behavior of the OS scheduler, which seems a rather bad idea. This should be managed by higher-level synchronization constructs instead.
For testing purposes, or for forcing the program into a certain state, they seem a handy tool.
When, in general, should one use Thread.yield(), sleep(0), etc?
It depends on the VM and thread model we are talking about. For me the answer is rarely, if ever.
Traditionally some thread models were non-preemptive and others are (or were) not mature hence the need for Thread.yield().
I feel that Thread.yield() is like using register in C. We used to rely on it to improve the performance of our programs because in many cases the programmer was better at this than the compiler. But modern compilers are much smarter and in much fewer cases these days can the programmer actually improve the performance of a program with the use of register and Thread.yield().
Let your OS scheduler decide for you.
So never yield, and never sleep(0), until you hit a case where sleep(0) is absolutely necessary, and then document it here.
Also, context switches are costly, so I don't think a lot of people want more of them.
I know this is old, but you didn't get any good answers here.
In general yielding is a way to be polite to other threads/processes and give them a chance to run on the same CPU with minimal delay to the yielding thread.
Not all yielding is equal either. On Windows, SwitchToThread() only releases the CPU if another thread of equal or greater priority was scheduled to run on the same CPU, which means it very possibly will simply resume the calling thread, while Sleep(0) has looser scheduler semantics. On Linux, sched_yield() is similar to SwitchToThread(), while nanosleep() with a 0 timespec seemingly marks the thread as unready for whatever period the timer slack is set to (inferred from profiling and substantiated here). Behavior on macOS is seemingly similar to Linux, but with much less timer slack; I haven't looked into it that much, though.
Yielding was way more useful in the days when uniprocessor systems were abundant, because it really helped keep the system moving. It's still pretty valid on Windows, for example, where by default Sleep(1) is actually predictably at least a 15.6ms delay (note that this is nearly an entire frame at 60fps if you're making a game or media player or something), although MsgWaitForMultipleObjectsEx should be preferred in general UI applications. Windows 10 added a new type of high resolution waitable timer with microsecond granularity that should probably be preferred over other methods, so hopefully that kind of yielding won't be so necessary anymore either.
In the context of N:1 and N:M cooperative threading models (not common at the OS level anymore, but still employed at the application-level through libraries providing Fibers and Coroutines often enough) yielding is still also definitely useful to keep things moving.
Unfortunately it's also abused pretty often, for example yielding in a busy loop rather than waiting on a synchronization primitive because the appropriate primitive isn't obvious or because the developer is overly optimistic about how long their threads will wait for / overly pessimistic about the scheduler. But in practice on most modern multitasking OSes unless the system is extremely busy, threads waiting on a synchronization primitive will get run almost instantly when the primitive is triggered/released/whatever.
You should try to avoid yielding, especially as an alternative to using a proper synchronization method. When you do need to yield, a zero sleep or waiting on a high resolution time source is probably better than a normal yield (I call the former a "long yield" as opposed to a "short yield"), but unless you're using the system interface, the implementation of sleep in your programming language/framework of choice might "optimize" sleep(0) into a short yield or even a no-op for you, sadly.
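To make "waiting on a synchronization primitive instead of yielding in a busy loop" concrete, here is a minimal pthreads sketch in C. The names producer, wait_for_payload and the value 42 are invented for the example; it is one possible shape of the pattern and not a measurement of anything.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t ready_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cv = PTHREAD_COND_INITIALIZER;
static bool ready;
static int payload;

/* Publishes a value and wakes the waiter. */
static void *producer(void *arg) {
    (void)arg;
    pthread_mutex_lock(&ready_lock);
    payload = 42;
    ready = true;
    pthread_cond_signal(&ready_cv);
    pthread_mutex_unlock(&ready_lock);
    return NULL;
}

/* Blocks until the producer has published a value, without spinning.
   The busy-loop version would be: while (!ready) sched_yield();       */
int wait_for_payload(void) {
    pthread_t t;
    int value;
    int rc = pthread_create(&t, NULL, producer, NULL);

    assert(rc == 0);
    pthread_mutex_lock(&ready_lock);
    while (!ready)  /* the loop guards against spurious wakeups */
        pthread_cond_wait(&ready_cv, &ready_lock);
    value = payload;
    pthread_mutex_unlock(&ready_lock);

    pthread_join(t, NULL);
    return value;
}
```

The waiting thread consumes no CPU while blocked and is made runnable by the signal, so no guess about scheduler behavior is involved.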

C# How to maximize chance that particular section of code will have no context switch?

I have a time-critical piece of code in my app. I made the thread which runs it Highest priority - that's the most I could do.
Are there any suggestions on how to make the part of the code that runs in this thread be interrupted as few times as possible (fewer context switches)?
The code is not complicated. I replaced all the method calls with inlined code and I don't use anything high level (like LINQ). Most of the operations are arithmetic. There is only one comparison of strings (I am thinking of ways to get rid of it). Half of the maths is with ints and half with doubles.
The code is x86 .NET 4 C#. It runs on a single Xeon X3450 under Windows Server 2008 R2. Single app server.
(Unfortunately the data is coming from a 3rd-party API which doesn't support x64 (hate it!))
I'd appreciate grown-up discussion with experienced developers.
P.S. The server has no paging file, so hard page faults won't happen either (no unwanted IO operations).
The only thing you need to worry about in terms of context switches is blocking your thread. So there should be no problem with using LINQ (that is, LINQ-to-objects; obviously LINQ-to-SQL or whatever would involve blocking!). Any sort of arithmetic, calling methods and so on will also not block the thread and so has no impact on context switches.
The other thing that affects context switching is, as you noted, priority. But not just thread priority, also your process's priority. You can use SetPriorityClass to increase your process's priority to ABOVE_NORMAL_PRIORITY_CLASS (I wouldn't bother putting it higher than that) and then set your thread's priority to Above Normal as well.
However, in general, priorities are really only useful when it's a matter of timing (that is, making sure your process responds to external input (network, user input, disk I/O) as fast as possible). It will actually have very little impact on your thread's actual throughput, unless you have other processes that are also CPU-bound running at the same time. But if that's the case, then fiddling with priorities is not going to be a viable long-term solution anyway. This is because you'll find that by setting one of the processes to a higher priority, it'll completely starve the other processes and they'll never run.
So anyway, I would carefully consider things before adjusting thread and process priorities. And, as always, test, test, test!
If you make that unmanaged WINAPI code instead, the SetThreadPriority function also supports a THREAD_PRIORITY_TIME_CRITICAL (higher than THREAD_PRIORITY_HIGHEST).
It's also worth boosting the priority of the process in which the thread is running (actual priority depends on a combination of thread and process priority).
You should also avoid making I/O calls on the thread (which could block). Taking it to a perhaps-ridiculous extreme you could also avoid making I/O calls on other threads (which could temporarily boost the priority of those threads).

How to define threadsafe?

Threadsafe is a term that is thrown around documentation, however there is seldom an explanation of what it means, especially in a language that is understandable to someone learning threading for the first time.
So how do you explain Threadsafe code to someone new to threading?
My ideas for options at the moment are:
A list of what makes code thread safe vs. thread unsafe
The book definition
A useful metaphor
Multithreading leads to non-deterministic execution - You don't know exactly when a certain piece of parallel code is run.
Given that, this wonderful multithreading tutorial defines thread safety like this:
Thread-safe code is code which has no indeterminacy in the face of any multithreading scenario. Thread-safety is achieved primarily with locking, and by reducing the possibilities for interaction between threads.
This means no matter how the threads are run in particular, the behaviour is always well-defined (and therefore free from race conditions).
Eric Lippert says:
When I'm asked "is this code thread safe?" I always have to push back and ask "what are the exact threading scenarios you are concerned about?" and "exactly what is correct behaviour of the object in every one of those scenarios?".
It is unhelpful to say that code is "thread safe" without somehow communicating what undesirable behaviors the utilized thread safety mechanisms do and do not prevent.
A good place to start is to have a read of the POSIX paper on thread safety.
Edit: Just the first few paragraphs give you a quick overview of thread safety and re-entrant code.
I may be wrong, but one of the criteria for being thread safe is to use local variables only. Using global variables can have undefined results if the same function is called from different threads.
A thread safe function / object (hereafter referred to as an object) is an object which is designed to support multiple concurrent calls. This can be achieved by serialization of the parallel requests or some sort of support for intertwined calls.
Essentially, if the object safely supports concurrent requests (from multiple threads), it is thread safe. If it is not thread safe, multiple concurrent calls could corrupt its state.
Consider a log book in a hotel. If a person is writing in the book and another person comes along and starts to concurrently write his message, the end result will be a mix of both messages. This can also be demonstrated by several threads writing to an output stream.
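The "serialization of the parallel requests" idea can be sketched in C with a mutex-protected counter. The names add_many and run_two_writers and the iteration count are arbitrary choices for the example; the point is only that the lock makes each increment behave like one guest finishing their entry before the next one starts.

```c
#include <assert.h>
#include <pthread.h>

#define INCREMENTS 100000

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;

/* Each increment is a read-modify-write; the mutex serializes them. */
static void *add_many(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs two concurrent writers and returns the final count. */
long run_two_writers(void) {
    pthread_t a, b;
    int rc;

    counter = 0;
    rc = pthread_create(&a, NULL, add_many, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, add_many, NULL);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter; /* both writers are done, so no lock is needed */
}
```

With the lock in place the result is always 2 * INCREMENTS. Without it, the two read-modify-write sequences can interleave and updates can be lost, which is the mixed-up log book entry.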
I would say to understand thread safe, start with understanding difference between thread safe function and reentrant function.
Please check The difference between thread-safety and re-entrancy for details.
Thread-safe code is code that won't fail because the same data was changed in two places at once. Thread safe is a smaller concept than concurrency-safe, because it presumes that it was in fact two threads of the same program, rather than (say) hardware or the OS modifying the data.
A particularly valuable aspect of the term is that it lies on a spectrum of concurrent behavior, where thread safe is the strongest, interrupt safe is a weaker constraint than thread safe, and reentrant even weaker.
In the case of thread safe, this means that the code in question conforms to a consistent API and makes use of resources such that other code in a different thread (such as another, concurrent instance of itself) will not cause an inconsistency, so long as it also conforms to the same use pattern. The use pattern MUST be specified for any reasonable expectation of thread safety to be had.
The interrupt safe constraint doesn't normally appear in modern userland code, because the operating system does a pretty good job of hiding this, however, in kernel mode this is pretty important. This means that the code will complete successfully, even if an interrupt is triggered during its execution.
The last one, reentrant, is almost guaranteed with all modern languages, in and out of userland, and it just means that a section of code may be entered more than once, even if execution has not yet proceeded out of the code section from earlier entries. This can happen in the case of recursive function calls, for instance. It's very easy to violate the language-provided reentrancy by accessing a shared global state variable, which makes the code non-reentrant.
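A small C sketch of how shared state breaks reentrancy, using two made-up functions: label_static keeps its result in a static buffer, while label_r writes into storage supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Not reentrant: every call shares (and overwrites) one static buffer. */
const char *label_static(int id) {
    static char buf[32];

    snprintf(buf, sizeof buf, "item-%d", id);
    return buf;
}

/* Reentrant: the caller owns the storage, so calls cannot interfere. */
char *label_r(int id, char *buf, size_t len) {
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "item-%d", id);
    return buf;
}
```

Calling label_static(1) and then label_static(2) returns the same pointer twice, so the first result now reads "item-2". With label_r and two separate buffers, both results survive. The same difference decides whether the function can be made thread safe without a lock.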
