Are Minix system calls in C atomic in nature? - semaphore

I have got a project where I have to implement semaphores in Minix.
In order to implement them I need to make the semaphore wait and signal functions atomic.
My idea is that
Assuming that system calls are atomic in Minix ,I will create new system calls that implement the wait and signal functions.
That's why I want to know that Are Minix "system calls" in C atomic in nature?

Related

How are threads/processes parked and woken in Linux, prior to futex?

Before the futex system calls existed in Linux, what underlying system calls were used by threading libraries like pthreads to block/sleep a thread and to subsequently wake those threads from userland?
For example, if a thread tries to acquire a mutex, the userland implementation will block the thread (perhaps after a short spinning interval), but I can't find the syscalls that are used for this (other than futex which are a relatively recent creation).
Before futex and current implementation of pthreads for Linux, the NPTL (require kernel 2.6 and newer), there were two other threading libraries with POSIX Thread API for Linux: linuxthreads and NGPT (which was based on Gnu Pth. LinuxThreads was the only widely used libpthread for years (and it can still be used in some strange & unmaintained micro-libc to work on 2.4; other micro-libc variants may have own builtin implementation of pthread-like API on top of futex+clone). And Gnu Pth is not thread library, it is single process thread with user-level "thread" switching.
You should know that there are several Threading Models when we check does the kernel knows about some or all of user threads (how many CPU cores can be used with adding threads to the program; what is the cost of having the thread / how many threads may be started). Models are named as M:N where M is userspace thread number and N is thread number schedulable by OS kernel:
"1:1" ''kernel-level threading'' - every userspace thread is schedulable by OS kernel. This is implemented in Linuxthreads, NPTL and many modern OS.
"N:1" ''user-level threading'' - userspace threads are planned by the userspace, they all are invisible to the kernel, it only schedules one process (and it may use only 1 CPU core). Gnu Pth (GNU Portable Threads) is example of it, and there are many other implementations for some computer architectures.
"M:N" ''hybrid threading'' - there are some entities visible and schedulable by OS kernel, but there may be more user-space threads in them. And sometimes user-space threads will migrate between kernel-visible threads.
With 1:1 model there are many classic sleep mechanisms/APIs in Unix like select/poll and signals and other variants of IPC APIs. As I remember, Linuxthreads used separate processes for every thread (with fully shared memory) and there was special manager "thread" (process) to emulate some POSIX thread features. Wikipedia says that SIGUSR1/SIGUSR2 were used in Linuxthreads for some internal communication between threads, same says IBM "The synchronization of primitives is achieved by means of signals. For example, threads block until awoken by signals.". Check also the project FAQ http://pauillac.inria.fr/~xleroy/linuxthreads/faq.html#H.4 "With LinuxThreads, I can no longer use the signals SIGUSR1 and SIGUSR2 in my programs! Why?"
LinuxThreads needs two signals for its internal operation. One is used to suspend and restart threads blocked on mutex, condition or semaphore operations. The other is used for thread cancellation.
On ``old'' kernels (2.0 and early 2.1 kernels), there are only 32 signals available and the kernel reserves all of them but two: SIGUSR1 and SIGUSR2. So, LinuxThreads has no choice but use those two signals.
With "N:1" model thread may call some blocking syscall and block everything (some libraries may convert some blocking syscalls into async, or use some SIGALRM or SIGVTALRM magic); or it may call some (very) special internal threading function which will do user-space thread switching by rewriting machine state register (like switch_to in linux kernel, save IP/SP and other regs, restore IP/SP and regs of other thread). So, kernel does not wake any user thread directly from userland, it just schedules whole process; and user space scheduler implement thread synchronization logic (or just calls sched_yield or select when there is no threads to work).
With M:N model things are very complicated... Don't know much about NGPT... There is one paragraph about NGPT in POSIX Threads and the Linux Kernel, Dave McCracken, OLS2002,330 page 5
There is a new pthread library under development called NGPT. This library is based on the GNU Pth library, which is an M:1 library. NGPT extends Pth by using multiple Linux tasks, thus creating an M:N library. It attempts to preserve Pth’s pthread compatibility while also using multiple Linux tasks for concurrency, but this effort is hampered by the underlying differences in the Linux threading model. The NGPT library at present uses non-blocking wrappers around blocking system calls to avoid
blocking in the kernel.
Some papers and posts: POSIX Threads and the Linux Kernel, Dave McCracken, OLS2002,330, LWN post about NPTL 0.1
The futex system call is used extensively in all synchronization
primitives and other places which need some kind of
synchronization. The futex mechanism is generic enough to support
the standard POSIX synchronization mechanisms with very little
effort. ... Futexes also allow the implementation of inter-process
synchronization primitives, a sorely missed feature in the old
LinuxThreads implementation (Hi jbj!).
NPTL design pdf:
5.5 Synchronization Primitives
The implementation of the synchronization primitives such as mutexes, read-write
locks, conditional variables, semaphores, and barriers requires some form of kernel
support. Busy waiting is not an option since threads can have different priorities (beside wasting CPU cycles). The same argument rules out the exclusive use of sched yield. Signals were the only viable solution for the old implementation. Threads would block in the kernel until woken by a signal. This method has severe drawbacks in terms of speed and reliability caused by spurious wakeups and derogation of the quality of the signal handling in the application.
Fortunately some new functionality was added to the kernel to implement all kinds
of synchronization primitives: futexes [Futex]. The underlying principle is simple but
powerful enough to be adaptable to all kinds of uses. Callers can block in the kernel
and be woken either explicitly, as a result of an interrupt, or after a timeout.
Futex stands for "fast userspace mutex." It's simply an abstraction over mutexes which is considered faster and more convenient than traditional mutex mechanisms because it implements the wait system for you. Before and after futex(), threads were put to sleep and awoken via a change in their process state. The process states are:
Running state
Sleeping state
Un-interruptible sleeping state (i.e. blocking for a syscall like read() or write()
Defunct/zombie state
When a thread is suspended, it is put into (interruptible) 'sleep' state. Later, it can be woken via the wake_up() function, which operates on its task structure within the kernel. As far as I can tell, wake_up is a kernel function, not a syscall. The kernel doesn't need a syscall to wake or sleep a task; it (or a process) simply changes the task structure to reflect the state of the process. When the Linux scheduler next deals with that process, it treats it according to its state (again, the states are listed above).
Short story: futex() implements a wait system for you. Without it, you need a data structure that's accessible from the main thread and from the sleeping thread in order to wake up a sleeping thread. All of this is done with userland code. The only thing you might need from the kernel is a mutex--the specifics of which do include locking mechanisms and mutex datastructures, but don't inherently wake or sleep the thread. The syscalls you're looking for don't exist. Essentially, most of what you're talking about can be achieved from userspace, without a syscall, by manually keeping track of data conditions that determine whether and when to sleep or wake a thread.

the most devastating argument against user-level thread

I am reading sections about user space thread from the book "Modern Operating System". It states that:
Another, and probably the most devastating argument against user-level threads, is that programmers generally want threads precisely in applications where the threads block often, as, for example, in a multithreaded Web server. These threads are constantly making system calls. Once a trap has occurred to the kernel to carry out the system call, it is hardly any more work for the kernel to switch threads if the old one has blocked, and having the kernel do this eliminates the need for constantly making select system calls that check to see if read system calls are safe. For applications that are essentially entirely CPU bound and rarely block, what is the point of having threads at all? No one would seriously propose computing the first n prime numbers or playing chess using threads because there is nothing to be gained by doing it that way.
I am particularly confused about the bold text.
1.Since these are user space threads, how can the kernel do a "switch threads"?
2. "having the kernel do this" , what does "this" here mean?
I thought behaviors are like:
1. "select" call is made, and find following system call is a blocking one.
2. Then the user space thread scheduler makes a thread switching and execute anohter thread.
For some reason, colleges insist on using operating systems textbooks that are confusing and at times nonsensical.
First, what is being described here is ENTIRELY system specific. On SOME operating systems, a synchronous system call will block all threads. This is not true in ALL operating systems.
Second, user threads are the poor man's way of doing them. In ye olde days user threads came into being because there were no operating system support. There are some that promote user threads as being more "efficient" than kernel threads (in theory a library can switch threads faster than the kernel) but this is total BS in practice. User threads are completely obsolete and systems that force developers to use them for threading are OBSOLETE. Even systems older systems like VMS have kernel threads.
In a modern OS course, "user threads" should be a sidebar or historical footnote.
In essence, your book is trying to make a debate where none exists. It's like post WWII U.S. Army assessments comparing the Sherman Tank to the Panther. They talk about things like the Sherman having move comfortable seats to try to make the two sound comparable when, in reality, the Sherman was obsolete and not even in the same class at the Panther.
1.Since these are user space threads, how can the kernel do a "switch threads"? 2. "having the kernel do this" , what does "this" here mean?
What they appear to be suggesting is that the thread will block the process when it makes a system call. When the occurs, the operating system will make a context switch. In this case the operating system is making a "thread switch" to another process anyway. The [correct] conclusion they are trying to lead you to then is that this switch take away the user threads have in alleged reduced overhead.
I thought behaviors are like: 1. "select" call is made, and find following system call is a blocking one. 2. Then the user space thread scheduler makes a thread switching and execute anohter thread.
Let me take the case of a user thread implementation that is not totally blocked by blocking system calls.
The library sets a timer for thread switching.
The thread start or resumes executing.
The thread makes a blocking system service (e.g, select).
The operating system switches the process out as part of the system service processing.
The timer goes off.
The process becomes current again and the OS invokes the timer handler in the library.
The library schedules another thread to execute.
The problem you face is that a blocking system service is usually going to have as part of its processing code to trigger a context switch. Because the system does know no about threads (otherwise it would be using kernel threads), a thread calling such a blocking service is going to pass through the code.
Even though the process may have threads that are executable, the operating system has no way to cause them to be executed because it has know knowledge of them because they are managed by a library in the process.

Where does the wait queue for threads lies in POSIX pthread mutex lock and unlock?

I was going through concurrency section from REMZI and while going through mutex section, and I got confused about this:
To avoid busy waiting, mutex implementations employ park() / unpark() mechanism (on Sun OS) which puts a waiting thread in a queue with its thread ID. Later on during pthread_mutex_unlock() it removes one thread from the queue so that it can be picked by the scheduler. Similarly, an implementation of Futex (mutex implementation on Linux) uses the same mechanism.
It is still unclear to me where the queue lies. Is it in the address space of the running process or somewhere inside the kernel?
Another doubt I had is regarding condition variables. Do pthread_cond_wait() and pthread_cond_signal() use normal signals and wait methods, or do they use some variant of it?
Doubt 1: But, it is still unclear to me where actually does the queue lies. Is it in the address space of the running process or somewhere inside kernel.
Every mutex has an associated data structure maintained in the kernel address space, in Linux it is futex. That data structure has an associated wait queue where threads from different processes can queue up and wait to be woken up, see futex_wait kernel function.
Doubt 2: Another doubt I had is regarding condition variables, does pthread_cond_wait() and pthread_cond_signal() use normal signal and wait methods OR they use some variant of it.
Modern Linux does not use signals for condition variable signaling. See NPTL: The New Implementation of Threads for Linux for more details:
The addition of the Fast Userspace Locking (futex) into the kernel enabled a complete reimplementation of mutexes and other synchronization mechanisms without resorting to interthread signaling. The futex, in turn, was made possible by the introduction of preemptive scheduling to the kernel.

How is preemptive scheduling implemented for user-level threads in Linux?

With user-level threads there are N user-level threads running on top of a single kernel thread. This is in contrast to pthreads where only one user thread runs on a kernel thread.
The N user-level threads are preemptively scheduled on the single kernel thread. But what are the details of how that is done.
I heard something that suggested that the threading library sets things up so that a signal is sent by the kernel and that is the mechanism to yank execution from an individual user-level thread to a signal handler that can then do the preemptive scheduling.
But what are the details of how state such as registers and thread structs are saved and/or mutated to make this all work? Is there maybe a very simple of user-level threads that is useful for learning the details?
To get the details right, use the source! But this is what I remember from when I read it...
There are two ways user-level threads can be scheduled: voluntarily and preemptively.
Voluntary scheduling: threads must call a function periodically to pass the use of the CPU to another thread. This function is called yield() or schedule() or something like that.
Preemptive scheduling: the library forcefully removes the CPU from one thread and passes it to another. This is usually done with timer signals, such as SIGALARM (see man ualarm for the details).
About how to do the real switch, if your OS is friendly and provides the necessary functions, that is easy. In Linux you have the makecontext() / swapcontext() functions that make swapping from one task to another easy. Again, see the man pages for details.
Unfortunately, these functions are removed from POSIX, so other UNIX may not have them. If that's the case, there are other tricks that can be done. The most popular was the one calling sigaltstack() to set up an alternate stack for managing the signals, then kill() itself to get to the alternate stack, and longjmp() from the signal function to the actual user-mode-thread you want to run. Clever, uh?
As a side note, in Windows user-mode threads are called fibers and are fully supported also (see the docs of CreateFiber()).
The last resort is using assembler, that can be made to work almost everywhere, but it is totally system specific. The steps to create a UMT would be:
Allocate a stack.
Allocate and initialize a UMT context: a struct to hold the value of the relevant CPU registers.
And to switch from one UMT to another:
Save the current context.
Switch the stack.
Restore the next context in the CPU and jump to the next instruction.
These steps are relatively easy to do in assembler, but quite impossible in plain C without support from any of the tricks cited above.

What interprocess locking calls should I monitor?

I'm monitoring a process with strace/ltrace in the hope to find and intercept a call that checks, and potentially activates some kind of globally shared lock.
While I've dealt with and read about several forms of interprocess locking on Linux before, I'm drawing a blank on what to calls to look for.
Currently my only suspect is futex() which comes up very early on in the process' execution.
Update0
There is some confusion about what I'm after. I'm monitoring an existing process for calls to persistent interprocess memory or equivalent. I'd like to know what system and library calls to look for. I have no intention call these myself, so naturally futex() will come up, I'm sure many libraries will implement their locking calls in terms of this, etc.
Update1
I'd like a list of function names or a link to documentation, that I should monitor at the ltrace and strace levels (and specifying which). Any other good advice about how to track and locate the global lock in mind would be great.
If you can start monitored process in valgrind, then there are two projects:
http://code.google.com/p/data-race-test/wiki/ThreadSanitizer
and Helgrind
http://valgrind.org/docs/manual/hg-manual.html
Helgrind is aware of all the pthread
abstractions and tracks their effects
as accurately as it can. On x86 and
amd64 platforms, it understands and
partially handles implicit locking
arising from the use of the LOCK
instruction prefix.
So, this tools can detect even atomic memory accesses. And they will check pthread usage
flock is another good one
There are many system calls can be used for locking: flock, fcntl, and even create.
When you are using pthreads/sem_* locks they may be executed in user space so you'll never
see them in strace as futex is called only for pending operations. Like when you actually
need to wait.
Some operations can be done in user space only - like spinlocks - you'll never see them
unless they do some waits for timer - backoff so you may see only stuff like nanosleep when one lock waits for other.
So there is no "generic" way to trace them.
on systems with glibc ~ >= 2.5 (glibc + nptl) you can use process shared
semaphores (last parameter to sem_init), more precisely, posix unnamed semaphores
posix mutexes (with PTHREAD_PROCESS_SHARED to pthread_mutexattr_setpshared)
posix named semaphores (got from sem_open/sem_unlink)
system v (sysv) semaphores: semget, semop
On older systems with glibc 2.2, 2.3 with linuxthreads or on embedded systems with uClibc you can use ONLY system v (sysv) semaphores for iterprocess communication.
upd1: any IPC and socker must be checked.

Resources