How are threads/processes parked and woken in Linux, prior to futex? - linux

Before the futex system calls existed in Linux, what underlying system calls were used by threading libraries like pthreads to block/sleep a thread and to subsequently wake those threads from userland?
For example, if a thread tries to acquire a mutex, the userland implementation will block the thread (perhaps after a short spinning interval), but I can't find the syscalls that are used for this (other than futex which are a relatively recent creation).

Before futex and current implementation of pthreads for Linux, the NPTL (require kernel 2.6 and newer), there were two other threading libraries with POSIX Thread API for Linux: linuxthreads and NGPT (which was based on Gnu Pth. LinuxThreads was the only widely used libpthread for years (and it can still be used in some strange & unmaintained micro-libc to work on 2.4; other micro-libc variants may have own builtin implementation of pthread-like API on top of futex+clone). And Gnu Pth is not thread library, it is single process thread with user-level "thread" switching.
You should know that there are several Threading Models when we check does the kernel knows about some or all of user threads (how many CPU cores can be used with adding threads to the program; what is the cost of having the thread / how many threads may be started). Models are named as M:N where M is userspace thread number and N is thread number schedulable by OS kernel:
"1:1" ''kernel-level threading'' - every userspace thread is schedulable by OS kernel. This is implemented in Linuxthreads, NPTL and many modern OS.
"N:1" ''user-level threading'' - userspace threads are planned by the userspace, they all are invisible to the kernel, it only schedules one process (and it may use only 1 CPU core). Gnu Pth (GNU Portable Threads) is example of it, and there are many other implementations for some computer architectures.
"M:N" ''hybrid threading'' - there are some entities visible and schedulable by OS kernel, but there may be more user-space threads in them. And sometimes user-space threads will migrate between kernel-visible threads.
With 1:1 model there are many classic sleep mechanisms/APIs in Unix like select/poll and signals and other variants of IPC APIs. As I remember, Linuxthreads used separate processes for every thread (with fully shared memory) and there was special manager "thread" (process) to emulate some POSIX thread features. Wikipedia says that SIGUSR1/SIGUSR2 were used in Linuxthreads for some internal communication between threads, same says IBM "The synchronization of primitives is achieved by means of signals. For example, threads block until awoken by signals.". Check also the project FAQ http://pauillac.inria.fr/~xleroy/linuxthreads/faq.html#H.4 "With LinuxThreads, I can no longer use the signals SIGUSR1 and SIGUSR2 in my programs! Why?"
LinuxThreads needs two signals for its internal operation. One is used to suspend and restart threads blocked on mutex, condition or semaphore operations. The other is used for thread cancellation.
On ``old'' kernels (2.0 and early 2.1 kernels), there are only 32 signals available and the kernel reserves all of them but two: SIGUSR1 and SIGUSR2. So, LinuxThreads has no choice but use those two signals.
With "N:1" model thread may call some blocking syscall and block everything (some libraries may convert some blocking syscalls into async, or use some SIGALRM or SIGVTALRM magic); or it may call some (very) special internal threading function which will do user-space thread switching by rewriting machine state register (like switch_to in linux kernel, save IP/SP and other regs, restore IP/SP and regs of other thread). So, kernel does not wake any user thread directly from userland, it just schedules whole process; and user space scheduler implement thread synchronization logic (or just calls sched_yield or select when there is no threads to work).
With M:N model things are very complicated... Don't know much about NGPT... There is one paragraph about NGPT in POSIX Threads and the Linux Kernel, Dave McCracken, OLS2002,330 page 5
There is a new pthread library under development called NGPT. This library is based on the GNU Pth library, which is an M:1 library. NGPT extends Pth by using multiple Linux tasks, thus creating an M:N library. It attempts to preserve Pth’s pthread compatibility while also using multiple Linux tasks for concurrency, but this effort is hampered by the underlying differences in the Linux threading model. The NGPT library at present uses non-blocking wrappers around blocking system calls to avoid
blocking in the kernel.
Some papers and posts: POSIX Threads and the Linux Kernel, Dave McCracken, OLS2002,330, LWN post about NPTL 0.1
The futex system call is used extensively in all synchronization
primitives and other places which need some kind of
synchronization. The futex mechanism is generic enough to support
the standard POSIX synchronization mechanisms with very little
effort. ... Futexes also allow the implementation of inter-process
synchronization primitives, a sorely missed feature in the old
LinuxThreads implementation (Hi jbj!).
NPTL design pdf:
5.5 Synchronization Primitives
The implementation of the synchronization primitives such as mutexes, read-write
locks, conditional variables, semaphores, and barriers requires some form of kernel
support. Busy waiting is not an option since threads can have different priorities (beside wasting CPU cycles). The same argument rules out the exclusive use of sched yield. Signals were the only viable solution for the old implementation. Threads would block in the kernel until woken by a signal. This method has severe drawbacks in terms of speed and reliability caused by spurious wakeups and derogation of the quality of the signal handling in the application.
Fortunately some new functionality was added to the kernel to implement all kinds
of synchronization primitives: futexes [Futex]. The underlying principle is simple but
powerful enough to be adaptable to all kinds of uses. Callers can block in the kernel
and be woken either explicitly, as a result of an interrupt, or after a timeout.

Futex stands for "fast userspace mutex." It's simply an abstraction over mutexes which is considered faster and more convenient than traditional mutex mechanisms because it implements the wait system for you. Before and after futex(), threads were put to sleep and awoken via a change in their process state. The process states are:
Running state
Sleeping state
Un-interruptible sleeping state (i.e. blocking for a syscall like read() or write()
Defunct/zombie state
When a thread is suspended, it is put into (interruptible) 'sleep' state. Later, it can be woken via the wake_up() function, which operates on its task structure within the kernel. As far as I can tell, wake_up is a kernel function, not a syscall. The kernel doesn't need a syscall to wake or sleep a task; it (or a process) simply changes the task structure to reflect the state of the process. When the Linux scheduler next deals with that process, it treats it according to its state (again, the states are listed above).
Short story: futex() implements a wait system for you. Without it, you need a data structure that's accessible from the main thread and from the sleeping thread in order to wake up a sleeping thread. All of this is done with userland code. The only thing you might need from the kernel is a mutex--the specifics of which do include locking mechanisms and mutex datastructures, but don't inherently wake or sleep the thread. The syscalls you're looking for don't exist. Essentially, most of what you're talking about can be achieved from userspace, without a syscall, by manually keeping track of data conditions that determine whether and when to sleep or wake a thread.

Related

sleep() within a user-space thread implementation

Why is it okay to use sleep() with a kernel thread implementation, but not within a user-space thread implementation? Is it because sleep has to be in system call?
To begin, the user thread/kernel threat distinction is the creation of bad textbooks on operating systems.
Kernel threads are threads.
"User threads" are simulations of thread using run-time libraries.
The behavior of simulated threads is entirely system dependent. Some dreadful operating systems textbooks say that blocking calls made in simulated threads block the entire process. That may be true in some implementations but it is not true in all.

How is preemptive scheduling implemented for user-level threads in Linux?

With user-level threads there are N user-level threads running on top of a single kernel thread. This is in contrast to pthreads where only one user thread runs on a kernel thread.
The N user-level threads are preemptively scheduled on the single kernel thread. But what are the details of how that is done.
I heard something that suggested that the threading library sets things up so that a signal is sent by the kernel and that is the mechanism to yank execution from an individual user-level thread to a signal handler that can then do the preemptive scheduling.
But what are the details of how state such as registers and thread structs are saved and/or mutated to make this all work? Is there maybe a very simple of user-level threads that is useful for learning the details?
To get the details right, use the source! But this is what I remember from when I read it...
There are two ways user-level threads can be scheduled: voluntarily and preemptively.
Voluntary scheduling: threads must call a function periodically to pass the use of the CPU to another thread. This function is called yield() or schedule() or something like that.
Preemptive scheduling: the library forcefully removes the CPU from one thread and passes it to another. This is usually done with timer signals, such as SIGALARM (see man ualarm for the details).
About how to do the real switch, if your OS is friendly and provides the necessary functions, that is easy. In Linux you have the makecontext() / swapcontext() functions that make swapping from one task to another easy. Again, see the man pages for details.
Unfortunately, these functions are removed from POSIX, so other UNIX may not have them. If that's the case, there are other tricks that can be done. The most popular was the one calling sigaltstack() to set up an alternate stack for managing the signals, then kill() itself to get to the alternate stack, and longjmp() from the signal function to the actual user-mode-thread you want to run. Clever, uh?
As a side note, in Windows user-mode threads are called fibers and are fully supported also (see the docs of CreateFiber()).
The last resort is using assembler, that can be made to work almost everywhere, but it is totally system specific. The steps to create a UMT would be:
Allocate a stack.
Allocate and initialize a UMT context: a struct to hold the value of the relevant CPU registers.
And to switch from one UMT to another:
Save the current context.
Switch the stack.
Restore the next context in the CPU and jump to the next instruction.
These steps are relatively easy to do in assembler, but quite impossible in plain C without support from any of the tricks cited above.

How safe is pthread robust mutex?

I m thinking to use Posix robust mutexes to protect shared resource among different processes (on Linux). However there are some doubts about safety in difference scenarios. I have the following questions:
Are robust mutexes implemented in the kernel or in user code?
If latter, what would happen if a process happens to crash while in a call to pthread_mutex_lock or pthread_mutex_unlock and while a shared pthread_mutex datastructure is getting updated?
I understand that if a process locked the mutex and dies, a thread in another process will be awaken and return EOWNERDEAD. However, what would happen if the process dies (in unlikely case) exactly when the pthread_mutex datastructure (in shared memory) is being updated? Will the mutex get corrupted in that case? What would happen to another process that is mapped to the same shared memory if it were to call a pthread_mutex function?
Can the mutex still be recovered in this case?
This question applies to any pthread object with PTHREAD_PROCESS_SHARED attribute. Is it safe to call functions like pthread_mutex_lock, pthread_mutex_unlock, pthread_cond_signal, etc. concurrently on the same object from different processes? Are they thread-safe across different processes?
From the man-page for pthreads:
Over time, two threading implementations have been provided by the
GNU C library on Linux:
LinuxThreads
This is the original Pthreads implementation. Since glibc
2.4, this implementation is no longer supported.
NPTL (Native POSIX Threads Library)
This is the modern Pthreads implementation. By comparison
with LinuxThreads, NPTL provides closer conformance to the
requirements of the POSIX.1 specification and better
performance when creating large numbers of threads. NPTL is
available since glibc 2.3.2, and requires features that are
present in the Linux 2.6 kernel.
Both of these are so-called 1:1 implementations, meaning that each
thread maps to a kernel scheduling entity. Both threading
implementations employ the Linux clone(2) system call. In NPTL,
thread synchronization primitives (mutexes, thread joining, and so
on) are implemented using the Linux futex(2) system call.
And from man futex(7):
In its bare form, a futex is an aligned integer which is touched only
by atomic assembler instructions. Processes can share this integer
using mmap(2), via shared memory segments or because they share
memory space, in which case the application is commonly called
multithreaded.
An additional remark found here:
(In case you’re wondering how they work in shared memory: Futexes are keyed upon their physical address)
Summarizing, Linux decided to implement pthreads on top of their "native" futex primitive, which indeed lives in the user process address space. For shared synchronization primitives, this would be shared memory and the other processes will still be able to see it, after one process dies.
What happens in case of process termination? Ingo Molnar wrote an article called Robust Futexes about just that. The relevant quote:
Robust Futexes
There is one race possible though: since adding to and removing from the
list is done after the futex is acquired by glibc, there is a few
instructions window for the thread (or process) to die there, leaving
the futex hung. To protect against this possibility, userspace (glibc)
also maintains a simple per-thread 'list_op_pending' field, to allow the
kernel to clean up if the thread dies after acquiring the lock, but just
before it could have added itself to the list. Glibc sets this
list_op_pending field before it tries to acquire the futex, and clears
it after the list-add (or list-remove) has finished
Summary
Where this leaves you for other platforms, is open-ended. Suffice it to say that the Linux implementation, at least, has taken great care to meet our common-sense expectation of robustness.
Seeing that other operating systems usually resort to Kernel-based synchronization primitives in the first place, it makes sense to me to assume their implementations would be even more naturally robust.
Following the documentation from here: http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutexattr_getrobust.html, it does read that in a fully POSIX compliant OS, shared mutex with the robust flag will behave in the way you'd expect.
The problem obviously is that not all OS are fully POSIX compliant. Not even those claiming to be. Process shared mutexes and in particular robust ones are among those finer points that are often not part of an OS's implementation of POSIX.

Difference between user-level and kernel-supported threads?

I've been looking through a few notes based on this topic, and although I have an understanding of threads in general, I'm not really to sure about the differences between user-level and kernel-level threads.
I know that processes are basically made up of multiple threads or a single thread, but are these thread of the two prior mentioned types?
From what I understand, kernel-supported threads have access to the kernel for system calls and other uses not available to user-level threads.
So, are user-level threads simply threads created by the programmer when then utilise kernel-supported threads to perform operations that couldn't be normally performed due to its state?
Edit: The question was a little confusing, so I'm answering it two different ways.
OS-level threads vs Green Threads
For clarity, I usually say "OS-level threads" or "native threads" instead of "Kernel-level threads" (which I confused with "kernel threads" in my original answer below.) OS-level threads are created and managed by the OS. Most languages have support for them. (C, recent Java, etc) They are extremely hard to use because you are 100% responsible for preventing problems. In some languages, even the native data structures (such as Hashes or Dictionaries) will break without extra locking code.
The opposite of an OS-thread is a green thread that is managed by your language. These threads are given various names depending on the language (coroutines in C, goroutines in Go, fibers in Ruby, etc). These threads only exist inside your language and not in your OS. Because the language chooses context switches (i.e. at the end of a statement), it prevents TONS of subtle race conditions (such as seeing a partially-copied structure, or needing to lock most data structures). The programmer sees "blocking" calls (i.e. data = file.read() ), but the language translates it into async calls to the OS. The language then allows other green threads to run while waiting for the result.
Green threads are much simpler for the programmer, but their performance varies: If you have a LOT of threads, green threads can be better for both CPU and RAM. On the other hand, most green thread languages can't take advantage of multiple cores. (You can't even buy a single-core computer or phone anymore!). And a bad library can halt the entire language by doing a blocking OS call.
The best of both worlds is to have one OS thread per CPU, and many green threads that are magically moved around onto OS threads. Languages like Go and Erlang can do this.
system calls and other uses not available to user-level threads
This is only half true. Yes, you can easily cause problems if you call the OS yourself (i.e. do something that's blocking.) But the language usually has replacements, so you don't even notice. These replacements do call the kernel, just slightly differently than you think.
Kernel threads vs User Threads
Edit: This is my original answer, but it is about User space threads vs Kernel-only threads, which (in hindsight) probably wasn't the question.
User threads and Kernel threads are exactly the same. (You can see by looking in /proc/ and see that the kernel threads are there too.)
A User thread is one that executes user-space code. But it can call into kernel space at any time. It's still considered a "User" thread, even though it's executing kernel code at elevated security levels.
A Kernel thread is one that only runs kernel code and isn't associated with a user-space process. These are like "UNIX daemons", except they are kernel-only daemons. So you could say that the kernel is a multi-threaded program. For example, there is a kernel thread for swap. This forces all swap issues to get "serialized" into a single stream.
If a user thread needs something, it will call into the kernel, which marks that thread as sleeping. Later, the swap thread finds the data, so it marks the user thread as runnable. Later still, the "user thread" returns from the kernel back to userland as if nothing happened.
In fact, all threads start off in kernel space, because the clone() operation happens in kernel space. (And there's lots of kernel accounting to do before you can 'return' to a new process in user space.)
Before we go into comparison, let us first understand what a thread is. Threads are lightweight processes within the domain of independent processes. They are required because processes are heavy, consume a lot of resources and more importantly,
two separate processes cannot share a memory space.
Let's say you open a text editor. It's an independent process executing in the memory with a separate addressable location. You'll need many resources within this process, such as insert graphics, spell-checks etc. It's not feasible to create separate processes for each of these functionalities and maintain them independently in memory. To avoid this,
multiple threads can be created within a single process, which can
share a common memory space, existing independently within a process.
Now, coming back to your questions, one at a time.
I'm not really to sure about the differences between user-level and kernel-level threads.
Threads are broadly classified as user level threads and kernel level threads based on their domain of execution. There are also cases when one or many user thread maps to one or many kernel threads.
- User Level Threads
User level threads are mostly at the application level where an application creates these threads to sustain its execution in the main memory. Unless required, these thread work in isolation with kernel threads.
These are easier to create since they do not have to refer many registers and context switching is much faster than a kernel level thread.
User level thread, mostly can cause changes at the application level and the kernel level thread continues to execute at its own pace.
- Kernel Level Threads
These threads are mostly independent of the ongoing processes and are executed by the operating system.
These threads are required by the Operating System for tasks like memory management, process management etc.
Since these threads maintain, execute and report the processes required by the operating system; kernel level threads are more expensive to create and manage and context switching of these threads are slow.
Most of the kernel level threads can not be preempted by the user level threads.
MS DOS written for Intel 8088 didn't have dual mode of operation. Thus, a user level process had the ability to corrupt the entire operating system.
- User Level Threads mapped over Kernel Threads
This is perhaps the most interesting part. Many user level threads map over to kernel level thread, which in-turn communicate with the kernel.
Some of the prominent mappings are:
One to One
When one user level thread maps to only one kernel thread.
advantages: each user thread maps to one kernel thread. Even if one of the user thread issues a blocking system call, the other processes remain unaffected.
disadvantages: every user thread requires one kernel thread to interact and kernel threads are expensive to create and manage.
Many to One
When many user threads map to one kernel thread.
advantages: multiple kernel threads are not required since similar user threads can be mapped to one kernel thread.
disadvantage: even if one of the user thread issues a blocking system call, all the other user threads mapped to that kernel thread are blocked.
Also, a good level of concurrency cannot be achieved since the kernel will process only one kernel thread at a time.
Many to Many
When many user threads map to equal or lesser number of kernel threads. The programmer decides how many user threads will map to how many kernel threads. Some of the user threads might map to just one kernel thread.
advantages: a great level of concurrency is achieved. Programmer can decide some potentially dangerous threads which might issue a blocking system call and place them with the one-to-one mapping.
disadvantage: the number of kernel threads, if not decided cautiously can slow down the system.
The other part of your question:
kernel-supported threads have access to the kernel for system calls
and other uses not available to user-level threads.
So, are user-level threads simply threads created by the programmer
when then utilise kernel-supported threads to perform operations that
couldn't be normally performed due to its state?
Partially correct. Almost all the kernel thread have access to system calls and other critical interrupts since kernel threads are responsible for executing the processes of the OS. User thread will not have access to some of these critical features. e.g. a text editor can never shoot a thread which has the ability to change the physical address of the process. But if needed, a user thread can map to kernel thread and issue some of the system calls which it couldn't do as an independent entity. The kernel thread would then map this system call to the kernel and would execute actions, if deemed fit.
Quote from here :
Kernel-Level Threads
To make concurrency cheaper, the execution aspect of process is separated out into threads. As such, the OS now manages threads and processes. All thread operations are implemented in the kernel and the OS schedules all threads in the system. OS managed threads are called kernel-level threads or light weight processes.
NT: Threads
Solaris: Lightweight processes(LWP).
In this method, the kernel knows about and manages the threads. No runtime system is needed in this case. Instead of thread table in each process, the kernel has a thread table that keeps track of all threads in the system. In addition, the kernel also maintains the traditional process table to keep track of processes. Operating Systems kernel provides system call to create and manage threads.
Advantages:
Because kernel has full knowledge of all threads, Scheduler may decide to give more time to a process having large number of threads than process having small number of threads.
Kernel-level threads are especially good for applications that frequently block.
Disadvantages:
The kernel-level threads are slow and inefficient. For instance, threads operations are hundreds of times slower than that of user-level threads.
Since kernel must manage and schedule threads as well as processes. It require a full thread control block (TCB) for each thread to maintain information about threads. As a result there is significant overhead and increased in kernel complexity.
User-Level Threads
Kernel-Level threads make concurrency much cheaper than process because, much less state to allocate and initialize. However, for fine-grained concurrency, kernel-level threads still suffer from too much overhead. Thread operations still require system calls. Ideally, we require thread operations to be as fast as a procedure call. Kernel-Level threads have to be general to support the needs of all programmers, languages, runtimes, etc. For such fine grained concurrency we need still "cheaper" threads.
To make threads cheap and fast, they need to be implemented at user level. User-Level threads are managed entirely by the run-time system (user-level library).The kernel knows nothing about user-level threads and manages them as if they were single-threaded processes.User-Level threads are small and fast, each thread is represented by a PC,register,stack, and small thread control block. Creating a new thread, switiching between threads, and synchronizing threads are done via procedure call. i.e no kernel involvement. User-Level threads are hundred times faster than Kernel-Level threads.
Advantages:
The most obvious advantage of this technique is that a user-level threads package can be implemented on an Operating System that does not support threads.
User-level threads does not require modification to operating systems.
Simple Representation: Each thread is represented simply by a PC, registers, stack and a small control block, all stored in the user process address space.
Simple Management: This simply means that creating a thread, switching between threads and synchronization between threads can all be done without intervention of the kernel.
Fast and Efficient: Thread switching is not much more expensive than a procedure call.
Disadvantages:
User-Level threads are not a perfect solution as with everything else, they are a trade off. Since, User-Level threads are invisible to the OS they are not well integrated with the OS. As a result, Os can make poor decisions like scheduling a process with idle threads, blocking a process whose thread initiated an I/O even though the process has other threads that can run and unscheduling a process with a thread holding a lock. Solving this requires communication between between kernel and user-level thread manager.
There is a lack of coordination between threads and operating system kernel. Therefore, process as whole gets one time slice irrespect of whether process has one thread or 1000 threads within. It is up to each thread to relinquish control to other threads.
User-level threads requires non-blocking systems call i.e., a multithreaded kernel. Otherwise, entire process will blocked in the kernel, even if there are runable threads left in the processes. For example, if one thread causes a page fault, the process blocks.
User Threads
The library provides support for thread creation, scheduling and management with no support from the kernel.
The kernel unaware of user-level threads creation and scheduling are done in user space without kernel intervention.
User-level threads are generally fast to create and manage they have drawbacks however.
If the kernel is single-threaded, then any user-level thread performing a blocking system call will cause the entire process to block, even if other threads are available to run within the application.
User-thread libraries include POSIX Pthreads, Mach C-threads,
and Solaris 2 UI-threads.
Kernel threads
The kernel performs thread creation, scheduling, and management in kernel space.
kernel threads are generally slower to create and manage than are user threads.
the kernel is managing the threads, if a thread performs a blocking system call.
A multiprocessor environment, the kernel can schedule threads on different processors.
5.including Windows NT, Windows 2000, Solaris 2, BeOS, and Tru64 UNIX (formerlyDigital UN1X)-support kernel threads.
Some development environments or languages will add there own threads like feature, that is written to take advantage of some knowledge of the environment, for example a GUI environment could implement some thread functionality which switch between user threads on each event loop.
A game library could have some thread like behaviour for characters. Sometimes the user thread like behaviour can be implemented in a different way, for example I work with cocoa a lot, and it has a timer mechanism which executes your code every x number of seconds, use fraction of a seconds and it like a thread. Ruby has a yield feature which is like cooperative threads. The advantage of user threads is they can switch at more predictable times. With kernel thread every time a thread starts up again, it needs to load any data it was working on, this can take time, with user threads you can switch when you have finished working on some data, so it doesn't need to be reloaded.
I haven't come across user threads that look the same as kernel threads, only thread like mechanisms like the timer, though I have read about them in older text books so I wonder if they were something that was more popular in the past but with the rise of true multithreaded OS's (modern Windows and Mac OS X) and more powerful hardware I wonder if they have gone out of favour.

Non-preemptive Pthreads?

Is there a way to use pthreads without a scheduler, so context switch occurs only if a thread explicitly yields, or is blocked on a mutex/cond? If not, is there a way to minimize the scheduling overhead, so that forced context switches will occur as rarely as possible?
The question refers to the Linux gcc/g++ implementation of POSIX threads.
You can use Pth (a.k.a. GNU Portable Threads), a non-preemptive thread library. Configuring it with --enable-pthread will create a plug-in replacement for pthreads. I just built and tested this on my Mac and it works fine for a simple pthreads program.
From the README:
Pth is a very portable POSIX/ANSI-C based library for Unix platforms
which provides non-preemptive priority-based scheduling for multiple
threads of execution (aka `multithreading') inside event-driven
applications. All threads run in the same address space of the server
application, but each thread has its own individual program-counter,
run-time stack, signal mask and errno variable.
The thread scheduling itself is done in a cooperative way, i.e., the
threads are managed by a priority- and event-based non-preemptive
scheduler. The intention is, that this way one can achieve better
portability and run-time performance than with preemptive scheduling.
The event facility allows threads to wait until various types of
events occur, including pending I/O on filedescriptors, asynchronous
signals, elapsed timers, pending I/O on message ports, thread and
process termination, and even customized callback functions.
Additionally Pth provides an optional emulation API for POSIX.1c
threads (`Pthreads') which can be used for backward compatibility to
existing multithreaded applications.
If you have a process running in normal user land, context switches will naturally happen as part of the system operation - there is always another process that needs the CPU time. Preemptive context switches between your threads are quite well optimized by the OS already and are bound to be necessary sometimes.
If you really happen to have problems with excessive context switching, you are best off tweaking the Linux scheduler first, which is off-topic here. pthread_setschedprio and pthread_setschedparam can set some hints, but are limited to setting priorities, and the interpretation of these priorities is implementation-defined, i.e. up to the Linux scheduler.

Resources