How does mutex or semaphore wake up processes? - multithreading

I read that mutexes and semaphores maintain a list of waiting processes and wake them up when the current thread completes the critical section. How do mutexes and semaphores do that? Doesn't they interfere with the process scheduler decisions?

The waiting and waking up is typically done in cooperation with the scheduler. A mutex implementation that forces a specific one of the waiting threads to wake is typically considered to be a poor implementation.
Instead, the mutex or semaphore will notify the scheduler that a thread is waiting, and thus to take it off the "ready to run" list. Then, when the mutex is unlocked or the semaphore signalled, the implementation will either
ask the scheduler to wake one of the waiting threads at the scheduler's discretion, or
notify the scheduler that all the waiting threads are ready to run, and then have logic on the waiting threads so that all but the first one to be woken by the scheduler go back to sleep again.
The former is the preferred implementation choice, but is not always available. The second is often dubbed a "thundering herd" approach: if there are 1000 threads waiting then all 1000 are woken (a big thundering herd of threads), only for 999 to go back to sleep. This is wasteful of CPU resources, and implementations will avoid it where possible.

This depends largely on the implementation. When speaking of these things, we usually refer to threads, which may have operating system support, or may be implemented entirely in the language, and thus do not even interact with the O/S for thread support.
They can also refer to processes, however, but these implementations usually require complex message passing patterns, in addition to some O/S support for process management.

Yes, the simple explanation is that any waiting process (whether semaphore or I/O) is taken off the scheduler's run queue until its wait condition is satisfied.

Related

Can it be assumed that `pthread_cond_signal` will wake the signaled thread atomically with regards to the mutex bond to the condition variable?

Quoting POSIX:
The pthread_cond_broadcast() or pthread_cond_signal() functions may be called by a thread whether or not it currently owns the mutex that threads calling pthread_cond_wait() or pthread_cond_timedwait() have associated with the condition variable during their waits; however, if predictable scheduling behavior is required, then that mutex shall be locked by the thread calling pthread_cond_broadcast() or pthread_cond_signal().
"If predictable scheduling behavior is required". This could/would hint that locking the mutex bound to the condition variable right before calling pthread_cond_signal() should guarantee that the signaled thread will be woken up before any other thread manages to lock this mutex. Is this correct?
We will se if any PThreads guru has a more comprehensive answer, but as far as I can see, at least in the Linux manpage, you do not get fully predictable behavior. What you do get is a guarantee that if two threads wait on the same condition variable, the higher-prio thread gets to go first (at least, that should be true on Linux if one thread is SCHED_OTHER and the other is real-time SCHED_FIFO). That holds if you lock mutex before signalling (with reservation for errors after a quick read of the manpage).
See
https://linux.die.net/man/3/pthread_cond_signal
No, there is no guarantee the signalled thread will be waken up. Worse, if in the signalling thread you have sequence:
while(run_again) {
pthread_mutex_lock(&mutex);
/* prepare data */
pthread_mutex_unlock(&mutex);
pthread_cond_broadcast(&cond);
}
there is reasonable chance control would never be passed to other threads waiting on mutex because of logic in the scheduler. Some examples to play with you can find in this answer.
No.
The best reference I have found regarding the predictability is this one:
https://austin-group-l.opengroup.narkive.com/lKcmfoRI/predictable-scheduling-behavior-in-pthread-cond-broadcast
Basically, people want to guard against the possibility that threads do not get a fair chance to run. Apparently, it is not a problem for most producer-consumer scenarios, and it does not apply to pthread_cond_broadcast as well. I would say, it is useful only in limited cases.
Cppreference.com actually considers unlocking after notifying may be a pessimization:
https://en.cppreference.com/w/cpp/thread/condition_variable/notify_all

What guarantee thread with spin lock on multiprocessor run on a different processor

I know spin lock only works on multiprocessor. But if two threads try to acquire the same resource and one is put on spinlock, what prevents the other one not running on the same processor? If it happens the one with spin lock will prevent the one holding the resources to exceed. In this case it becomes a deadlock. How does OS prevent it happen?
Some background facts first:
spin-locks (and locks generally) are not limited to multiprocessor systems. They work fine on single processor or even single-threaded application can use them without any harm.
spin-locks are not only provided by OS, they have pure user-space implementation as well. For example, tbb provides tbb::spin_mutex.
By default, nothing prevents a thread from running on any available CPU (regardless of the locks they use).
There are reentrant/recursive type of locks. It means that if a thread acquired it once, and tries to acquire it once again without releasing, it will succeed, not deadlock as usual locks. But it does not mean that the same applies to different threads just because they are scheduled to the same CPU. With any type of lock, if one software thread locked a mutex, other threads have to wait.
It is possible for one thread to acquire the lock and be preempted (i.e. interrupted by OS timer) before it releases the lock. Another thread can be scheduled to the same CPU and it might want to acquire the same lock. In case of pure spin-locks, this thread will uselessly spin until it exceeds its time-slice allowed by OS and will be preempted. Finally, the first thread will get a chance to run and release its lock so another thread will be able to acquire it.
As you can see, it is not quite efficient to spent the time on the hopeless waiting. Thus, more sophisticated implementations, after a number of attempts to acquire the spinlock, call OS for help in order to voluntary give away its time-slice to other threads which possibly can unlock the current one.

Mutex lock: what does "blocking" mean?

I've been reading up on multithreading and shared resources access and one of the many (for me) new concepts is the mutex lock. What I can't seem to find out is what is actually happening to the thread that finds a "critical section" is locked. It says in many places that the thread gets "blocked", but what does that mean? Is it suspended, and will it resume when the lock is lifted? Or will it try again in the next iteration of the "run loop"?
The reason I ask, is because I want to have system supplied events (mouse, keyboard, etc.), which (apparantly) are delivered on the main thread, to be handled in a very specific part in the run loop of my secondary thread. So whatever event is delivered, I queue in my own datastructure. Obviously, the datastructure needs a mutex lock because it's being modified by both threads. The missing puzzle-piece is: what happens when an event gets delivered in a function on the main thread, I want to queue it, but the queue is locked? Will the main thread be suspended, or will it just jump over the locked section and go out of scope (losing the event)?
Blocked means execution gets stuck there; generally, the thread is put to sleep by the system and yields the processor to another thread. When a thread is blocked trying to acquire a mutex, execution resumes when the mutex is released, though the thread might block again if another thread grabs the mutex before it can.
There is generally a try-lock operation that grab the mutex if possible, and if not, will return an error. But you are eventually going to have to move the current event into that queue. Also, if you delay moving the events to the thread where they are handled, the application will become unresponsive regardless.
A queue is actually one case where you can get away with not using a mutex. For example, Mac OS X (and possibly also iOS) provides the OSAtomicEnqueue() and OSAtomicDequeue() functions (see man atomic or <libkern/OSAtomic.h>) that exploit processor-specific atomic operations to avoid using a lock.
But, why not just process the events on the main thread as part of the main run loop?
The simplest way to think of it is that the blocked thread is put in a wait ("sleeping") state until the mutex is released by the thread holding it. At that point the operating system will "wake up" one of the threads waiting on the mutex and let it acquire it and continue. It's as if the OS simply puts the blocked thread on a shelf until it has the thing it needs to continue. Until the OS takes the thread off the shelf, it's not doing anything. The exact implementation -- which thread gets to go next, whether they all get woken up or they're queued -- will depend on your OS and what language/framework you are using.
Too late to answer but I may facilitate the understanding. I am talking more from implementation perspective rather than theoretical texts.
The word "blocking" is kind of technical homonym. People may use it for sleeping or mere waiting. The term has to be understood in context of usage.
Blocking means Waiting - Assume on an SMP system a thread B wants to acquire a spinlock held by some other thread A. One of the mechanisms is to disable preemption and keep spinning on the processor unless B gets it. Another mechanism probably, an efficient one, is to allow other threads to use processor, in case B does not gets it in easy attempts. Therefore we schedule out thread B (as preemption is enabled) and give processor to some other thread C. In this case thread B just waits in the scheduler's queue and comes back with its turn. Understand that B is not sleeping just waiting rather passively instead of busy-wait and burning processor cycles. On BSD and Solaris systems there are data-structures like turnstiles to implement this situation.
Blocking means Sleeping - If the thread B had instead made system call like read() waiting data from network socket, it cannot proceed until it gets it. Therefore, some texts casually use term blocking as "... blocked for I/O" or "... in blocking system call". Actually, thread B is rather sleeping. There are specific data-structures known as sleep queues - much like luxury waiting rooms on air-ports :-). The thread will be woken up when OS detects availability of data, much like an attendant of the waiting room.
Blocking means just that. It is blocked. It will not proceed until able. You don't say which language you're using, but most languages/libraries have lock objects where you can "attempt" to take the lock and then carry on and do something different depending on whether you succeeded or not.
But in, for example, Java synchronized blocks, your thread will stall until it is able to acquire the monitor (mutex, lock). The java.util.concurrent.locks.Lock interface describes lock objects which have more flexibility in terms of lock acquisition.

prevent linux thread from being interrupted by scheduler

How do you tell the thread scheduler in linux to not interrupt your thread for any reason? I am programming in user mode. Does simply locking a mutex acomplish this? I want to prevent other threads in my process from being scheduled when a certain function is executing. They would block and I would be wasting cpu cycles with context switches. I want any thread executing the function to be able to finish executing without interruption even if the threads' timeslice is exceeded.
How do you tell the thread scheduler in linux to not interrupt your thread for any reason?
Can't really be done, you need a real time system for that. The closes thing you'll get with linux is to
set the scheduling policy to a realtime scheduler, e.g. SCHED_FIFO, and also set the PTHREAD_EXPLICIT_SCHED attribute. See e.g. here , even now though, e.g. irq handlers and other other stuff will interrupt your thread and run.
However, if you only care about the threads in your own process not being able to do anything, then yes, having them block on a mutex your running thread holds is sufficient.
The hard part is to coordinate all the other threads to grab that mutex whenever your thread needs to do its thing.
You should architect your sw so you're not dependent on the scheduler doing the "right" thing from your app's point of view. The scheduler is complicated. It will do what it thinks is best.
Context switches are cheap. You say
I would be wasting cpu cycles with context switches.
but you should not look at it that way. Use the multi-threaded machinery of mutexes and blocked / waiting processes. The machinery is there for you to use...
You can't. If you could what would prevent your thread from never releasing the request and starving other threads.
The best you can do is set your threads priority so that the scheduler will prefer it over lower priority threads.
Why not simply let the competing threads block, then the scheduler will have nothing left to schedule but your living thread? Why complicate the design second guessing the scheduler?
Look into real time scheduling under Linux. I've never done it, but if you indeed do NEED this this is as close as you can get in user application code.
What you seem to be scared of isn't really that big of a deal though. You can't stop the kernel from interrupting your programs for real interrupts or of a higher priority task wants to run, but with regular scheduling the kernel does uses it's own computed priority value which pretty much handles most of what you are worried about. If thread A is holding resource X exclusively (X could be a lock) and thread B is waiting on resource X to become available then A's effective priority will be at least as high as B's priority. It also takes into account if a process is using up lots of cpu or if it is spending lots of time sleeping to compute the priority. Of course, the nice value goes in there too.

what is kernel thread dispatching?

Can someone give me an easy to understand definition of kernel thread dispatching or just thread dispatching if there's no difference between the two?
From what I understand it's just doing a context switch while the currently active thread waits on a lock from another thread, so the CPU goes and does something else while this thread is in blocking mode.
I might however have misunderstood.
It's basically the process by which the operating system determines which of the many active threads is sent (dispatched) to the CPU for processing at any given point.
Each operating system has its own implementation, but the basic concept is to keep a sorted list of threads by priority, and dispatch them as needed to the CPU. Time slicing is added to allow multiple programs to run concurrently, etc.

Resources