What are the POSIX cancellation points? - multithreading

What are the POSIX cancellation points? I'm looking for a definitive list of POSIX cancellation points.
I'm asking because I have a book that says accept() and select() are cancellation points, but I've seen sites on the internet claim that they are not.
Also, if Linux cancellation points are different than POSIX cancellation points I want a list of them too.

The POSIX 1003.1-2003 standard gives a list in the System Interfaces section, then General Information, then Threads (direct link courtesy of A. Rex).
(Added: POSIX 1003.1-2008 is now available on the web (all 3872 pages of it, in PDF and HTML). You have to register (free). I got to it from the Open Group Bookstore.)
Cancellation Points
Cancellation points shall occur when a thread is executing the following functions:
accept()
aio_suspend()
clock_nanosleep()
close()
connect()
creat()
fcntl() (When the cmd argument is F_SETLKW)
fdatasync()
fsync()
getmsg()
getpmsg()
lockf()
mq_receive()
mq_send()
mq_timedreceive()
mq_timedsend()
msgrcv()
msgsnd()
msync()
nanosleep()
open()
pause()
poll()
pread()
pselect()
pthread_cond_timedwait()
pthread_cond_wait()
pthread_join()
pthread_testcancel()
putmsg()
putpmsg()
pwrite()
read()
readv()
recv()
recvfrom()
recvmsg()
select()
sem_timedwait()
sem_wait()
send()
sendmsg()
sendto()
sigpause()
sigsuspend()
sigtimedwait()
sigwait()
sigwaitinfo()
sleep()
system()
tcdrain()
usleep()
wait()
waitid()
waitpid()
write()
writev()
A cancellation point may also occur when a thread is executing the following functions:
access()
asctime()
asctime_r()
catclose()
catgets()
catopen()
closedir()
closelog()
ctermid()
ctime()
ctime_r()
dbm_close()
dbm_delete()
dbm_fetch()
dbm_nextkey()
dbm_open()
dbm_store()
dlclose()
dlopen()
endgrent()
endhostent()
endnetent()
endprotoent()
endpwent()
endservent()
endutxent()
fclose()
fcntl() (For any value of the cmd argument. [Presumably except F_SETLKW, which is listed above.])
fflush()
fgetc()
fgetpos()
fgets()
fgetwc()
fgetws()
fmtmsg()
fopen()
fpathconf()
fprintf()
fputc()
fputs()
fputwc()
fputws()
fread()
freopen()
fscanf()
fseek()
fseeko()
fsetpos()
fstat()
ftell()
ftello()
ftw()
fwprintf()
fwrite()
fwscanf()
getaddrinfo()
getc()
getc_unlocked()
getchar()
getchar_unlocked()
getcwd()
getdate()
getgrent()
getgrgid()
getgrgid_r()
getgrnam()
getgrnam_r()
gethostbyaddr()
gethostbyname()
gethostent()
gethostid()
gethostname()
getlogin()
getlogin_r()
getnameinfo()
getnetbyaddr()
getnetbyname()
getnetent()
getopt() (if opterr is non-zero.)
getprotobyname()
getprotobynumber()
getprotoent()
getpwent()
getpwnam()
getpwnam_r()
getpwuid()
getpwuid_r()
gets()
getservbyname()
getservbyport()
getservent()
getutxent()
getutxid()
getutxline()
getwc()
getwchar()
getwd()
glob()
iconv_close()
iconv_open()
ioctl()
link()
localtime()
localtime_r()
lseek()
lstat()
mkstemp()
mktime()
nftw()
opendir()
openlog()
pathconf()
pclose()
perror()
popen()
posix_fadvise()
posix_fallocate()
posix_madvise()
posix_openpt()
posix_spawn()
posix_spawnp()
posix_trace_clear()
posix_trace_close()
posix_trace_create()
posix_trace_create_withlog()
posix_trace_eventtypelist_getnext_id()
posix_trace_eventtypelist_rewind()
posix_trace_flush()
posix_trace_get_attr()
posix_trace_get_filter()
posix_trace_get_status()
posix_trace_getnext_event()
posix_trace_open()
posix_trace_rewind()
posix_trace_set_filter()
posix_trace_shutdown()
posix_trace_timedgetnext_event()
posix_typed_mem_open()
printf()
pthread_rwlock_rdlock()
pthread_rwlock_timedrdlock()
pthread_rwlock_timedwrlock()
pthread_rwlock_wrlock()
putc()
putc_unlocked()
putchar()
putchar_unlocked()
puts()
pututxline()
putwc()
putwchar()
readdir()
readdir_r()
remove()
rename()
rewind()
rewinddir()
scanf()
seekdir()
semop()
setgrent()
sethostent()
setnetent()
setprotoent()
setpwent()
setservent()
setutxent()
stat()
strerror()
strerror_r()
strftime()
symlink()
sync()
syslog()
tmpfile()
tmpnam()
ttyname()
ttyname_r()
tzset()
ungetc()
ungetwc()
unlink()
vfprintf()
vfwprintf()
vprintf()
vwprintf()
wcsftime()
wordexp()
wprintf()
wscanf()
An implementation shall not introduce cancellation points into any other functions specified in this volume of IEEE Std 1003.1-2001.
The side effects of acting upon a cancellation request while suspended during a call of a function are the same as the side effects that may be seen in a single-threaded program when a call to a function is interrupted by a signal and the given function returns [EINTR]. Any such side effects occur before any cancellation cleanup handlers are called.
Whenever a thread has cancelability enabled and a cancellation request has been made with that thread as the target, and the thread then calls any function that is a cancellation point (such as pthread_testcancel() or read()), the cancellation request shall be acted upon before the function returns. If a thread has cancelability enabled and a cancellation request is made with the thread as a target while the thread is suspended at a cancellation point, the thread shall be awakened and the cancellation request shall be acted upon. However, if the thread is suspended at a cancellation point and the event for which it is waiting occurs before the cancellation request is acted upon, it is unspecified whether the cancellation request is acted upon or whether the cancellation request remains pending and the thread resumes normal execution.
Ugh! I can't get the table to work very well; it looked OK in preview and nothing like a table afterwards. Look at the URL for the information!
There are a lot of possible cancellation points.
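To make the wake-up rule quoted above concrete, here is a minimal sketch, assuming a POSIX threads implementation such as NPTL. A thread blocks in read() on an empty pipe (read() is in the "shall occur" list), another thread requests cancellation, and pthread_join() reports PTHREAD_CANCELED. The names on_cancel, blocked_reader and cancel_blocked_read are made up for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

static int cleanup_ran = 0;

/* Cancellation cleanup handler: runs when the thread is cancelled. */
static void on_cancel(void *arg) {
    (void)arg;
    cleanup_ran = 1;
}

static void *blocked_reader(void *arg) {
    int fd = *(int *)arg;
    char byte;
    ssize_t n;

    pthread_cleanup_push(on_cancel, NULL);
    /* Nothing is ever written to the pipe, so this read() blocks.
       read() is a required cancellation point, so a pending or
       arriving cancellation request is acted upon here. */
    n = read(fd, &byte, 1);
    (void)n;
    pthread_cleanup_pop(0);
    return NULL;
}

/* Returns 1 if the thread was cancelled and its cleanup handler ran,
   0 if it was not, and -1 on a setup error. */
int cancel_blocked_read(void) {
    int fds[2];
    pthread_t tid;
    void *result = NULL;

    if (pipe(fds) != 0)
        return -1;
    if (pthread_create(&tid, NULL, blocked_reader, &fds[0]) != 0)
        return -1;

    pthread_cancel(tid);          /* request cancellation */
    pthread_join(tid, &result);   /* wait until the request is acted upon */

    close(fds[0]);
    close(fds[1]);
    return result == PTHREAD_CANCELED && cleanup_ran;
}
```

Whether the request arrives before the thread reaches read() or while it is already blocked there, the outcome should be the same: the thread never returns from read() normally.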

See the pthread_cancel man page for a quick summary and further information.

Additional info: since kernel 2.6, Linux has used the NPTL thread library, which is POSIX compliant, so cancellation points should be as above for recent Linux implementations.
http://www.ddj.com/linux-open-source/184406204

Related

Will Go's scheduler yield control from one goroutine to another for CPU-intensive work?

The accepted answer at golang methods that will yield goroutines explains that Go's scheduler will yield control from one goroutine to another when a syscall is encountered. I understand that this means if you have multiple goroutines running, and one begins to wait for something like an HTTP response, the scheduler can use this as a hint to yield control from that goroutine to another.
But what about situations where there are no syscalls involved? What if, for example, you had as many goroutines running as logical CPU cores/threads available, and each were in the middle of a CPU-intensive calculation that involved no syscalls. In theory, this would saturate the CPU's ability to do work. Would the Go scheduler still be able to detect an opportunity to yield control from one of these goroutines to another, that perhaps wouldn't take as long to run, and then return control back to one of these goroutines performing the long CPU-intensive calculation?
There are few if any promises here.
The Go 1.14 release notes say this in the Runtime section:
Goroutines are now asynchronously preemptible. As a result, loops without function calls no longer potentially deadlock the scheduler or significantly delay garbage collection. This is supported on all platforms except windows/arm, darwin/arm, js/wasm, and plan9/*.
A consequence of the implementation of preemption is that on Unix systems, including Linux and macOS systems, programs built with Go 1.14 will receive more signals than programs built with earlier releases. This means that programs that use packages like syscall or golang.org/x/sys/unix will see more slow system calls fail with EINTR errors. ...
I quoted part of the third paragraph here because this gives us a big clue as to how this asynchronous preemption works: the runtime system has the OS deliver some OS signal (SIGALRM, SIGVTALRM, etc.) on some sort of schedule (real or virtual time). This allows the Go runtime to implement the same kind of schedulers that real OSes implement with real (hardware) or virtual (virtualized hardware) timers. As with OS schedulers, it's up to the runtime to decide what to do with the clock ticks: perhaps just run the GC code, for instance.
We also see a list of platforms that don't do it. So we probably should not assume it will happen at all.
Fortunately, the runtime source is actually available: we can go look to see what does happen, should any given platform implement it. This shows that in runtime/signal_unix.go:
// We use SIGURG because it meets all of these criteria, is extremely
// unlikely to be used by an application for its "real" meaning (both
// because out-of-band data is basically unused and because SIGURG
// doesn't report which socket has the condition, making it pretty
// useless), and even if it is, the application has to be ready for
// spurious SIGURG. SIGIO wouldn't be a bad choice either, but is more
// likely to be used for real.
const sigPreempt = _SIGURG
and:
// doSigPreempt handles a preemption signal on gp.
func doSigPreempt(gp *g, ctxt *sigctxt) {
	// Check if this G wants to be preempted and is safe to
	// preempt.
	if wantAsyncPreempt(gp) && isAsyncSafePoint(gp, ctxt.sigpc(), ctxt.sigsp(), ctxt.siglr()) {
		// Inject a call to asyncPreempt.
		ctxt.pushCall(funcPC(asyncPreempt))
	}
	// Acknowledge the preemption.
	atomic.Xadd(&gp.m.preemptGen, 1)
	atomic.Store(&gp.m.signalPending, 0)
}
The actual asyncPreempt function is in assembly, but it just does some assembly-only trickery to save user registers, and then calls asyncPreempt2 which is in runtime/preempt.go:
//go:nosplit
func asyncPreempt2() {
	gp := getg()
	gp.asyncSafePoint = true
	if gp.preemptStop {
		mcall(preemptPark)
	} else {
		mcall(gopreempt_m)
	}
	gp.asyncSafePoint = false
}
Compare this to runtime/proc.go's Gosched function (documented as the way to voluntarily yield):
//go:nosplit
// Gosched yields the processor, allowing other goroutines to run. It does not
// suspend the current goroutine, so execution resumes automatically.
func Gosched() {
	checkTimeouts()
	mcall(gosched_m)
}
We see the main differences include some "async safe point" stuff and that we arrange for an M-stack-call to gopreempt_m instead of gosched_m. So, apart from the safety check stuff and a different trace call (not shown here) the involuntary preemption is almost exactly the same as voluntary preemption.
To find this, we had to dig rather deep into the (Go 1.14, in this case) implementation. One might not want to depend too much on this.
A little bit more on this to complete @torek's answer.
Goroutines are interruptible when there is a syscall, but also when a goroutine is waiting on a lock or a channel, or is sleeping.
As @torek said, since 1.14 goroutines can also be preempted even when they do none of the above. The scheduler can mark any goroutine as preemptible after it has run for more than 10 ms.
More reading here: https://medium.com/a-journey-with-go/go-goroutine-and-preemption-d6bc2aa2f4b7

Interrupting open() with SIGALRM

We have a legacy embedded system which uses SDL to read images and fonts from an NFS share.
If there's a network problem, TTF_OpenFont() and IMG_Load() hang essentially forever. A test application reveals that open() behaves in the same way.
It occurred to us that a quick fix would be to call alarm() before the calls which open files on the NFS share. The man pages weren't entirely clear whether open() would fail with EINTR when interrupted by SIGALRM, so we put together a test app to verify this approach. We set up a signal handler with sigaction::sa_flags set to zero to ensure that SA_RESTART was not set.
The signal handler was called, but open() was not interrupted. (We observed the same behaviour with SIGINT and SIGTERM.)
I suppose the system treats open() as a "fast" operation even on "slow" infrastructure such as NFS.
Is there any way to change this behaviour and allow open() to be interrupted by a signal?
The man pages weren't entirely clear whether open() would fail with
EINTR when interrupted by SIGALRM, so we put together a test app to
verify this approach.
open(2) is a slow syscall (slow syscalls are those that can sleep forever, and can be awakened when, and if, a signal is caught in the meantime) only for some file types. In general, opens that block the caller until some condition occurs are usually interruptible. Known examples include opening a FIFO (named pipe), or (back in the old days) opening a physical terminal device (it sleeps until the modem is dialed).
NFS-mounted filesystems probably don't cause open(2) to sleep in an interruptible state. After all, you are most likely opening a regular file, and in that case open(2) will not be interruptible.
Is there any way to change this behaviour and allow open() to be
interrupted by a signal?
I don't think so, not without doing some (non-trivial) changes to the kernel.
I would explore the possibility of using setjmp(3) / longjmp(3) (see the manpage if you're not familiar; it's basically non-local gotos). You can initialize the environment buffer before calling open(2), and issue a longjmp(3) in the signal handler. Here's an example:
#include <stdio.h>
#include <stdlib.h>
#include <setjmp.h>
#include <unistd.h>
#include <signal.h>

static jmp_buf jmp_env;

void sighandler(int signo) {
    longjmp(jmp_env, 1);
}

int main(void) {
    struct sigaction sigact;

    sigact.sa_handler = sighandler;
    sigact.sa_flags = 0;
    sigemptyset(&sigact.sa_mask);
    if (sigaction(SIGALRM, &sigact, NULL) < 0) {
        perror("sigaction(2) error");
        exit(EXIT_FAILURE);
    }

    if (setjmp(jmp_env) == 0) {
        /* First time through
         * This is where we would open the file
         */
        alarm(5);
        /* Simulate a blocked open() */
        while (1)
            ; /* Intentionally left blank */
        /* If open(2) is successful here, don't forget to unset
         * the alarm
         */
        alarm(0);
    } else {
        /* SIGALRM caught, open(2) canceled */
        printf("open(2) timed out\n");
    }
    return 0;
}
It works by saving the context environment with the help of setjmp(3) before calling open(2). setjmp(3) returns 0 the first time through, and returns whatever value was passed to longjmp(3) otherwise.
Please be aware that this solution is not perfect. Here are some points to keep in mind:
There is a window of time between the call to alarm(2) and the call to open(2) (simulated here with while (1) { ... }) where the process may be preempted for a long time, so there is a chance the alarm expires before we actually attempt to open the file. Sure, with a large timeout such as 2 or 3 seconds this will most likely not happen, but it's still a race condition.
Similarly, there is a window of time between successfully opening the file and canceling the alarm where, again, the process may be preempted for a long time and the alarm may expire before we get the chance to cancel it. This is slightly worse because we have already opened the file so we will "leak" the file descriptor. Again, in practice, with a large timeout this will likely never happen, but it's a race condition nevertheless.
If the code catches other signals, there may be another signal handler in the midst of execution when SIGALRM is caught. Using longjmp(3) inside the signal handler will destroy the execution context of these other signal handlers, and depending on what they were doing, very nasty things may happen (inconsistent state if the signal handlers were manipulating other data structures in the program, etc.). It's as if it started executing, and suddenly crashed somewhere in the middle. You can fix it by: a) carefully setting up all signal handlers such that SIGALRM is blocked before they are invoked (this ensures that the SIGALRM handler does not begin execution until other handlers are done) and b) blocking these other signals before catching SIGALRM. Both actions can be accomplished by setting the sa_mask field of struct sigaction with the necessary mask (the operating system atomically sets the process's signal mask to that value before beginning execution of the handler and unsets it before returning from the handler). OTOH, if the rest of the code doesn't catch signals, then this is not a problem.
sleep(3) may be implemented with alarm(2), and alarm(2) and setitimer(2) share the same timer; if other portions in the code make use of any of these functions, they will interfere and the result will be a huge mess.
Just make sure you weigh these disadvantages before blindly using this approach. The use of setjmp(3) / longjmp(3) is usually discouraged and makes programs considerably harder to read, understand and maintain. It's not elegant, but right now I don't think you have a choice, unless you're willing to do some core refactoring in the project.
If you do end up using setjmp(3), then at the very least document these limitations.
Maybe there is a strategy of using a separate thread to do the open so the main thread is not held up longer than desired.
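A minimal sketch of that separate-thread strategy, assuming POSIX threads are available on the target. A detached worker calls open() while the caller waits on a condition variable with a deadline. The struct open_job and the functions job_free, open_worker and open_with_timeout are hypothetical names invented here, not part of SDL or any library.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Shared between caller and worker; whoever finishes last frees it. */
struct open_job {
    pthread_mutex_t lock;
    pthread_cond_t done_cv;
    char *path;
    int fd, err;     /* result of open() and its errno */
    int done;        /* set by the worker once open() has returned */
    int abandoned;   /* set by the caller when it stops waiting */
};

static void job_free(struct open_job *job) {
    pthread_cond_destroy(&job->done_cv);
    pthread_mutex_destroy(&job->lock);
    free(job->path);
    free(job);
}

static void *open_worker(void *arg) {
    struct open_job *job = arg;
    int fd = open(job->path, O_RDONLY);   /* may hang on a dead NFS server */
    int err = errno;
    int abandoned;

    pthread_mutex_lock(&job->lock);
    job->fd = fd;
    job->err = err;
    job->done = 1;
    abandoned = job->abandoned;
    pthread_cond_signal(&job->done_cv);
    pthread_mutex_unlock(&job->lock);

    if (abandoned) {          /* the caller gave up: clean up here */
        if (fd >= 0)
            close(fd);
        job_free(job);
    }
    return NULL;
}

/* Returns a descriptor, or -1 with errno set (ETIMEDOUT if the wait expired). */
int open_with_timeout(const char *path, int timeout_sec) {
    struct open_job *job = calloc(1, sizeof *job);
    pthread_t tid;
    struct timespec deadline;
    int rc, fd, err;

    assert(timeout_sec >= 0);
    if (job == NULL || (job->path = strdup(path)) == NULL) {
        free(job);
        errno = ENOMEM;
        return -1;
    }
    pthread_mutex_init(&job->lock, NULL);
    pthread_cond_init(&job->done_cv, NULL);
    if ((rc = pthread_create(&tid, NULL, open_worker, job)) != 0) {
        job_free(job);
        errno = rc;
        return -1;
    }
    pthread_detach(tid);

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_sec;

    pthread_mutex_lock(&job->lock);
    while (!job->done && rc != ETIMEDOUT)
        rc = pthread_cond_timedwait(&job->done_cv, &job->lock, &deadline);
    if (!job->done) {
        job->abandoned = 1;   /* the worker now owns the job */
        pthread_mutex_unlock(&job->lock);
        errno = ETIMEDOUT;
        return -1;
    }
    fd = job->fd;
    err = job->err;
    pthread_mutex_unlock(&job->lock);
    job_free(job);
    if (fd < 0)
        errno = err;
    return fd;
}
```

This does not interrupt the hung open(); the worker thread stays blocked until the kernel gives up or the server returns. It only keeps the calling thread from being held up, so repeated timeouts against a dead share would accumulate stuck threads.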

How to close thread winapi

What is the right way to close a thread in WinAPI? The threads don't use common resources.
I am creating threads with CreateThread, but I don't know how to close them correctly, because some people suggest using TerminateThread and others ExitThread. What is the correct way to close a thread?
Also, where should I call the closing function: in WM_CLOSE or WM_DESTROY?
Thanks in advance.
The "nicest" way to close a thread in Windows is by "telling" the thread to shut down via some thread-safe signaling mechanism, then simply letting it reach its demise on its own, potentially waiting for it to do so via one of the WaitForXXXX functions if completion detection is needed (which is frequently the case). Something like:
Main thread:
// some global event all threads can reach
ghStopEvent = CreateEvent(NULL, TRUE, FALSE, NULL);
// create the child thread
hThread = CreateThread(NULL, 0, ThreadProc, NULL, 0, NULL);
//
// ... continue other work.
//
// tell thread to stop
SetEvent(ghStopEvent);
// now wait for thread to signal termination
WaitForSingleObject(hThread, INFINITE);
// important. close handles when no longer needed
CloseHandle(hThread);
CloseHandle(ghStopEvent);
Child thread:
DWORD WINAPI ThreadProc(LPVOID pv)
{
    // do threaded work
    while (WaitForSingleObject(ghStopEvent, 1) == WAIT_TIMEOUT)
    {
        // do thread busy work
    }
    return 0;
}
Obviously things can get a lot more complicated once you start putting it into practice. If by "common" resources you mean something like the ghStopEvent in the prior example, it becomes considerably more difficult. Terminating a child thread via TerminateThread is strongly discouraged because there is no logical cleanup performed at all. The warnings specified in the TerminateThread documentation are self-explanatory, and should be heeded. With great power comes....
Finally, you are not required to call ExitThread explicitly from the thread itself, and though you can do so, I strongly advise against it in C++ programs. It is called for you once the thread procedure returns from ThreadProc. I prefer the model above simply because it is dead-easy to implement and supports full RAII of C++ object cleanup, which neither ExitThread nor TerminateThread provides. For example, from the ExitThread documentation:
...in C++ code, the thread is exited before any destructors can be called
or any other automatic cleanup can be performed. Therefore, in C++
code, you should return from your thread function.
Anyway, start simple. Get a handle on things with super-simple examples, then work your way up from there. There are a ton of multi-threaded examples on the web. Learn from the good ones and challenge yourself to identify the bad ones.
Best of luck.
So you need to figure out what sort of behaviour you need to have.
Following is a simple description of the methods taken from documentation:
"TerminateThread is a dangerous function that should only be used in the most extreme cases. You should call TerminateThread only if you know exactly what the target thread is doing, and you control all of the code that the target thread could possibly be running at the time of the termination. For example, TerminateThread can result in the following problems:
If the target thread owns a critical section, the critical section will not be released.
If the target thread is allocating memory from the heap, the heap lock will not be released.
If the target thread is executing certain kernel32 calls when it is terminated, the kernel32 state for the thread's process could be inconsistent.
If the target thread is manipulating the global state of a shared DLL, the state of the DLL could be destroyed, affecting other users of the DLL."
So if you need your thread to terminate at any cost, call this method.
ExitThread is more graceful. By calling ExitThread, you're telling Windows you're done with the calling thread, so the rest of its code isn't going to get called. It's a bit like calling exit(0).
"ExitThread is the preferred method of exiting a thread. When this function is called (either explicitly or by returning from a thread procedure), the current thread's stack is deallocated, all pending I/O initiated by the thread is canceled, and the thread terminates. If the thread is the last thread in the process when this function is called, the thread's process is also terminated."

Is there an async-signal-safe way of reading a directory listing on Linux?

SUSv4 does not list opendir, readdir, closedir, etc. in its list of async-signal-safe functions.
Is there a safe way to read a directory listing from a signal handler?
e.g. is it possible to 'open' the directory and somehow slurp out the raw directory listing? If so what kind of data structure is returned by 'read'?
Or maybe on Linux there are certain system calls that are async-signal-safe even though SUSv4 / POSIX does not require it that could be used?
If you know in advance which directory you need to read, you could call opendir() outside the signal handler (opendir() calls malloc(), so you can't run it from within the handler) and keep the DIR* in a static variable somewhere. When your signal handler runs, you should be able to get away with calling readdir_r() on that handle as long as you can guarantee that only that one signal handler would use the DIR* handle at any moment. There is a lock field in the DIR that is taken by readdir() and readdir_r(), so if, say, you used the DIR* from two signal handlers, or you registered the same handler to handle multiple signals, you may end up with a deadlock due to the lock never being released by the interrupted handler.
A similar approach appears to also work to read a directory from a child process after calling fork() but before calling execve().

Can I prevent a Linux user space pthread yielding in critical code?

I am working on a user-space app for an embedded Linux project using the 2.6.24.3 kernel.
My app passes data between two file nodes by creating 2 pthreads that each sleep until an asynchronous IO operation completes, at which point the thread wakes and runs a completion handler.
The completion handlers need to keep track of how many transfers are pending and maintain a handful of linked lists that one thread will add to and the other will remove.
// sleep here until events arrive or time out expires
for(;;) {
    no_of_events = io_getevents(ctx, 1, num_events, events, &timeout);
    // Process each aio event that has completed or thrown an error
    for (i=0; i<no_of_events; i++) {
        // Get pointer to completion handler
        io_complete = (io_callback_t) events[i].data;
        // Get pointer to data object
        iocb = (struct iocb *) events[i].obj;
        // Call completion handler and pass it the data object
        io_complete(ctx, iocb, events[i].res, events[i].res2);
    }
}
My question is this...
Is there a simple way I can prevent the currently active thread from yielding whilst it runs the completion handler rather than going down the mutex/spin lock route?
Or failing that can Linux be configured to prevent yielding a pthread when a mutex/spin lock is held?
You can use the sched_setscheduler() system call to temporarily set the thread's scheduling policy to SCHED_FIFO, then set it back again. From the sched_setscheduler() man page:
A SCHED_FIFO process runs until either
it is blocked by an I/O request, it is
preempted by a higher priority
process, or it calls sched_yield(2).
(In this context, "process" actually means "thread").
However, this is quite a suspicious requirement. What is the problem you are hoping to solve? If you are just trying to protect your linked list of completion handlers from concurrent access, then an ordinary mutex is the way to go. Have the completion thread lock the mutex, remove the list item, unlock the mutex, then call the completion handler.
I think you'll want to use mutexes/locks to prevent race conditions here. Mutexes are by no means voodoo magic and can even make your code simpler than using arbitrary system-specific features, which you'd potentially need to port across systems. I don't know if the latter is an issue for you, though.
I believe you are trying to outsmart the Linux scheduler here, for the wrong reasons.
The correct solution is to use a mutex to prevent completion handlers from running in parallel. Let the scheduler do its job.
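A minimal sketch of the mutex approach the answers recommend, assuming a single shared list and a pending-transfer counter. One thread adds nodes, the other removes them, and both touch the shared state only while holding the lock. The names push, pop, producer, consumer, run_transfers and TRANSFERS are invented for the illustration and are not taken from the asker's code.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdlib.h>

struct node {
    struct node *next;
    int value;
};

enum { TRANSFERS = 10000 };

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct node *head = NULL;   /* protected by list_lock */
static int pending = 0;            /* protected by list_lock */

static void push(int value) {
    struct node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->value = value;
    pthread_mutex_lock(&list_lock);
    n->next = head;
    head = n;
    pending++;
    pthread_mutex_unlock(&list_lock);
}

/* Unlinks and returns one node, or NULL if the list is empty. */
static struct node *pop(void) {
    struct node *n;
    pthread_mutex_lock(&list_lock);
    n = head;
    if (n != NULL) {
        head = n->next;
        pending--;
    }
    pthread_mutex_unlock(&list_lock);
    return n;
}

static void *producer(void *arg) {
    (void)arg;
    for (int i = 0; i < TRANSFERS; i++)
        push(i);
    return NULL;
}

static void *consumer(void *arg) {
    int removed = 0;
    (void)arg;
    while (removed < TRANSFERS) {
        struct node *n = pop();
        if (n != NULL) {
            free(n);          /* the real handler work happens outside the lock */
            removed++;
        } else {
            sched_yield();    /* list empty: let the producer run */
        }
    }
    return NULL;
}

/* Runs both threads to completion and returns the final pending count. */
int run_transfers(void) {
    pthread_t prod, cons;
    if (pthread_create(&prod, NULL, producer, NULL) != 0)
        return -1;
    if (pthread_create(&cons, NULL, consumer, NULL) != 0)
        return -1;
    pthread_join(prod, NULL);
    pthread_join(cons, NULL);
    return pending;
}
```

The lock is held only for the few pointer updates, so it does not matter when the scheduler preempts either thread. A condition variable would avoid the polling in consumer; it is left out here to keep the sketch short.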

Resources