Linux pthread mutex and kernel scheduler - linux

A friend and I disagree on how synchronization is handled at the userspace level (in the pthread library).
a. I think that during a pthread_mutex_lock, the thread actively waits. That is, the Linux scheduler wakes this thread and lets it execute its code, which should look like:
while (mutex_resource->locked);
Then another thread is scheduled, which potentially frees the locked field, etc.
So this means that the scheduler waits for the thread to use up its time slice before switching to the next one, no matter what the thread is doing.
b. My friend thinks that the waiting thread somehow tells the kernel "Hey, I'm asleep, don't wait for me at all".
In this case, the kernel would schedule the next thread right away, without waiting for the current thread to use up its time slice, knowing that this thread is sleeping.
From what I see in the pthread code, it seems there is a loop handling the lock. But maybe I missed something.
In embedded systems, it could make sense to prevent the kernel from waiting. So he may be right (but I hope he is not :D).
Thanks!

a. I think that during a pthread_mutex_lock, the thread actively waits.
Yes, glibc's NPTL pthread_mutex_lock has an active wait (spinning),
BUT the spinning is used only for a very short amount of time and only for some types of mutexes. After that, pthread_mutex_lock goes to sleep by calling the Linux futex syscall with the FUTEX_WAIT operation.
Only mutexes of type PTHREAD_MUTEX_ADAPTIVE_NP spin; the default, PTHREAD_MUTEX_TIMED_NP (normal mutex), does not spin. (Check MAX_ADAPTIVE_COUNT in the __pthread_mutex_lock sources.)
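For illustration, here is a minimal sketch of how one might request the adaptive (briefly spinning) type explicitly. The helper name init_adaptive_mutex is made up for this example; PTHREAD_MUTEX_ADAPTIVE_NP is a glibc extension, so this is not portable.

```c
#define _GNU_SOURCE   /* PTHREAD_MUTEX_ADAPTIVE_NP is a glibc extension */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: initialise *m as an adaptive mutex.
 * Returns 0 on success or a pthread error number. */
int init_adaptive_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(m != NULL);
    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ADAPTIVE_NP);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);   /* the mutex keeps its own copy */
    return rc;
}
```

After that, pthread_mutex_lock and pthread_mutex_unlock are used on the mutex exactly as with the default type.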
If you want unbounded spinning (active waiting), use the pthread_spin_lock function with locks of type pthread_spinlock_t.
I'll consider the rest of your question as if you are using pthread_spin_lock:
Then, another thread is scheduled which potentially free the locked field, etc. So this means that the scheduler waits for the thread to complete its schedule time before switching to the next one, no matter what the thread is doing.
Yes, if there is contention for CPU cores, your actively spinning thread may keep another thread from executing, even if that other thread is the one that will unlock the mutex (spinlock) your thread needs.
But if there is no contention (no thread oversubscription), and the threads are scheduled on different cores (by coincidence, or by manually setting CPU affinity with sched_setaffinity or pthread_setaffinity_np), spinning will let you proceed faster than using the OS-based futex.
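As a rough sketch of the spinlock variant discussed above (the names spin_worker and run_spin_demo, and the thread and iteration counts, are invented for the example): several threads increment a shared counter, and a waiter burns CPU in pthread_spin_lock instead of sleeping.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define SPIN_THREADS 4
#define SPIN_ITERS   100000

static pthread_spinlock_t spin;
static long spin_counter;            /* protected by spin */

static void *spin_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < SPIN_ITERS; i++) {
        pthread_spin_lock(&spin);    /* busy-waits while the lock is held */
        spin_counter++;
        pthread_spin_unlock(&spin);
    }
    return NULL;
}

/* Runs the workers and returns the final counter value. */
long run_spin_demo(void)
{
    pthread_t t[SPIN_THREADS];
    int rc = pthread_spin_init(&spin, PTHREAD_PROCESS_PRIVATE);

    assert(rc == 0);
    (void)rc;
    spin_counter = 0;
    for (int i = 0; i < SPIN_THREADS; i++) {
        rc = pthread_create(&t[i], NULL, spin_worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < SPIN_THREADS; i++)
        pthread_join(t[i], NULL);
    pthread_spin_destroy(&spin);
    return spin_counter;
}
```

With more runnable threads than cores, this is exactly the case where a spinning waiter can use up its time slice while the lock holder is not running.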
b. My friend thinks that the waiting thread somehow tells the kernel "Hey, I'm asleep, don't wait for me at all". In this case, the kernel would schedule the next thread right away, without waiting for the current thread to complete...
Yes, he is right.
futex is the modern way to tell the OS that this thread is waiting for some value in memory to change (for some mutex to open); in the current implementation futex also puts the thread to sleep. There is no need to wake it up to spin if the kernel knows when to wake it. How does it know? The lock owner, when calling pthread_mutex_unlock, checks whether any other threads are sleeping on this mutex. If there are, the lock owner calls futex with FUTEX_WAKE, telling the OS to wake a thread registered as a sleeper on this mutex.
There is no need to spin if the thread registers itself as a waiter with the OS.
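To make the wait/wake handshake concrete, here is a simplified sketch of a futex-based lock, following the well-known three-state design from Ulrich Drepper's "Futexes Are Tricky". It is not glibc's actual code (no spinning, no error handling), it is Linux-only, and all names (sketch_mutex, sketch_lock, sketch_unlock, run_futex_demo) are invented for the example.

```c
#define _GNU_SOURCE              /* for syscall() */
#include <assert.h>
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* state: 0 = unlocked, 1 = locked, 2 = locked and someone may be asleep */
typedef struct { atomic_int state; } sketch_mutex;

static long futex(atomic_int *addr, int op, int val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

void sketch_lock(sketch_mutex *m)
{
    int seen = 0;

    /* Fast path: uncontended, no system call at all. */
    if (atomic_compare_exchange_strong(&m->state, &seen, 1))
        return;
    if (seen != 2)
        seen = atomic_exchange(&m->state, 2);
    while (seen != 0) {
        /* Sleep in the kernel, but only if state is still 2. */
        futex(&m->state, FUTEX_WAIT_PRIVATE, 2);
        seen = atomic_exchange(&m->state, 2);
    }
}

void sketch_unlock(sketch_mutex *m)
{
    /* Old value 2 means there may be sleepers: wake one of them. */
    if (atomic_fetch_sub(&m->state, 1) != 1) {
        atomic_store(&m->state, 0);
        futex(&m->state, FUTEX_WAKE_PRIVATE, 1);
    }
}

static sketch_mutex demo_lock;       /* zero-initialised: unlocked */
static long demo_counter;            /* protected by demo_lock */

static void *demo_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        sketch_lock(&demo_lock);
        demo_counter++;
        sketch_unlock(&demo_lock);
    }
    return NULL;
}

/* Runs four workers and returns the final counter value. */
long run_futex_demo(void)
{
    pthread_t t[4];

    demo_counter = 0;
    for (int i = 0; i < 4; i++) {
        int rc = pthread_create(&t[i], NULL, demo_worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return demo_counter;
}
```

Note that the uncontended path never enters the kernel; the futex call happens only when a thread actually has to wait or when the owner sees that someone may be sleeping.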

Some debugging with gdb of this test program:
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
pthread_mutex_t x = PTHREAD_MUTEX_INITIALIZER;
void* thr_func(void *arg)
{
    /* blocks here: main already holds x and never unlocks it */
    pthread_mutex_lock(&x);
    return NULL;
}
int main(int argc, char **argv)
{
    pthread_t thr;
    pthread_mutex_lock(&x);
    pthread_create(&thr, NULL, thr_func, NULL);
    /* never returns; the hang is intentional so the thread can be inspected */
    pthread_join(thr, NULL);
    return 0;
}
shows that a call to pthread_mutex_lock on a locked mutex results in a futex system call with the op parameter set to FUTEX_WAIT (http://man7.org/linux/man-pages/man2/futex.2.html)
And this is the description of FUTEX_WAIT:
FUTEX_WAIT
This operation atomically verifies that the futex address
uaddr still contains the value val, and sleeps awaiting FUTEX_WAKE on
this futex address. If the timeout argument is
non-NULL, its contents describe the maximum duration of the wait,
which is infinite otherwise. The arguments uaddr2 and val3 are
ignored.
So from this description I can say that if a mutex is locked then a thread will sleep and not actively wait. And it will sleep until futex with op equal to FUTEX_WAKE is called.

Related

Does C++11 locks make a call to the kernel [duplicate]

I have the following situation:
Two C++11 threads are working on a calculation and they are synchronized through a std::mutex.
Thread A locks the mutex until the data is ready for the operation Thread B executes. When the mutex is unlocked Thread B starts to work.
Thread B tries to lock the mutex and is blocked until it is unlocked by Thread A.
void ThreadA (std::mutex* mtx, char* data)
{
    mtx->lock();
    //do something useful with data
    mtx->unlock();
}
void ThreadB (std::mutex* mtx, char* data)
{
    mtx->lock(); //wait until Thread A is ready
    //do something useful with data
    //.....
}
Assume it is guaranteed that Thread A locks the mutex first.
Now I am wondering whether the mtx->lock() in Thread B waits actively or passively. So is Thread B polling the mutex state and wasting processor time, or is it woken passively by the scheduler when the mutex is unlocked?
In the different C++ references it is only mentioned that the thread is blocked, but not in which way.
Could it be, however, that the std::mutex implementation depends heavily on the platform and OS used?
It's highly implementation-defined, even for the same compiler and OS.
For example, in VC++ in Visual Studio 2010, std::mutex was implemented with a Win32 CRITICAL_SECTION. EnterCriticalSection(CRITICAL_SECTION*) has a nice feature: first it tries to take the CRITICAL_SECTION by polling the lock again and again. After a specified number of iterations, it makes a kernel call that puts the thread to sleep, to be woken again when the lock is released, and the whole deal starts again.
In this case, the mechanism polls the lock again and again before going to sleep, and then control switches to the kernel.
Visual Studio 2012 came with a different implementation. std::mutex was implemented with a Win32 mutex. A Win32 mutex shifts control to the kernel immediately; there is no active polling done by the lock.
You can read about the implementation switch in the answer: std::mutex performance compared to win32 CRITICAL_SECTION
So, it is unspecified how the mutex acquires the lock. It is best not to rely on such behaviour.
P.S. Do not lock the mutex manually; use std::lock_guard instead. Also, you might want to use condition_variable for more refined control of your synchronization.
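For comparison, here is a rough sketch of the condition-variable approach written against plain pthreads (the C analogue of std::condition_variable). The struct and function names are invented for the example. Thread B sleeps in pthread_cond_wait until thread A has published the data, and it works regardless of which thread happens to run first.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t mtx;
    pthread_cond_t  cond;
    bool ready;              /* protected by mtx */
    int  data;               /* protected by mtx */
    int  seen;               /* what thread B read */
} handoff;

static void *thread_a(void *arg)
{
    handoff *h = arg;

    pthread_mutex_lock(&h->mtx);
    h->data = 42;            /* prepare the data */
    h->ready = true;
    pthread_mutex_unlock(&h->mtx);
    pthread_cond_signal(&h->cond);
    return NULL;
}

static void *thread_b(void *arg)
{
    handoff *h = arg;

    pthread_mutex_lock(&h->mtx);
    while (!h->ready)        /* the loop guards against spurious wakeups */
        pthread_cond_wait(&h->cond, &h->mtx);
    h->seen = h->data;
    pthread_mutex_unlock(&h->mtx);
    return NULL;
}

/* Starts B before A on purpose; returns the value B observed. */
int run_handoff(void)
{
    handoff h;
    pthread_t a, b;
    int rc;

    pthread_mutex_init(&h.mtx, NULL);
    pthread_cond_init(&h.cond, NULL);
    h.ready = false;
    h.data = 0;
    h.seen = -1;
    rc = pthread_create(&b, NULL, thread_b, &h);
    assert(rc == 0);
    rc = pthread_create(&a, NULL, thread_a, &h);
    assert(rc == 0);
    (void)rc;
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    pthread_cond_destroy(&h.cond);
    pthread_mutex_destroy(&h.mtx);
    return h.seen;
}
```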

Lock Holder Preemption

Could you have the following scenario in concurrent programs?
Suppose a thread acquires a lock to execute a critical section. Then, before the critical section is executed, the processor preempts the thread. The new thread that comes in for execution needs the lock from the old thread (the one that was preempted). So the current thread can't proceed (it hangs until it gets preempted). Is there a mechanism in operating systems to keep threads from being preempted until the lock is released?
It is possible for a thread holding a mutex to be preempted while executing a critical section. If the thread that the OS switches to tries to acquire that mutex and finds that it is already locked, then that thread should be context switched out immediately. The thread scheduler should be smart enough to not switch back to that thread until it has switched back to the thread holding the mutex and the mutex is released.
If you are writing kernel code then yes, there are mechanisms for preventing a thread from being preempted.
For standard code there is no such thing. Some operations are atomic and are guaranteed atomic by the compiler and kernel, but right after those operations the thread may be preempted, and it can remain preempted for an undetermined amount of time (unless the system is a real-time system).

What is a safe and easy way to exchange data from a threaded ISR? (Raspberry Pi)

I'm trying to develop a C/C++ userspace application on the Raspberry Pi which processes data coming from an SPI device. I'm using the WiringPi Library (function wiringPiISR) which registers a function (the real interrupt handler) that will be called from a pthreaded interrupt handler on an IRQ event.
I heard that STL containers aren't thread safe, but is it enough to have a mutex lock while executing my callback function and of course a lock in the main thread while accessing the buffer/container there?
My "real interrupt handler" which is registered through wiringPiISR looks like this
std::deque<uint8_t> buffer;
static void callback(uint8_t byte); // declared before first use
static void irq_handler()
{
    uint8_t data;
    while (digitalRead(IRQ_PIN)==0)
    {
        data = spi_txrx(CMD_READBYTE);
        pthread_mutex_lock(&mutex1);
        callback(data);
        pthread_mutex_unlock(&mutex1);
    }
}
static void callback(uint8_t byte)
{
    buffer.push_back(byte);
}
Or is there an easier way to achieve the data exchange between a threaded ISR and main thread?
Is that a real ISR?
Anyway, mutexes are not a good fit for ISRs, because they lead to priority inversion.
Let's look at normal mutex usage with two threads:
1. Thread A runs and takes the mutex.
2. For some reason, thread A is preempted, and thread B executes.
3. Thread B tries to take the mutex, but can't.
4. Thread B is put to sleep, allowing another thread to run, for instance thread C or thread A.
...
5. At some point, thread A will be rescheduled, will resume its operation, and will release the mutex.
6. When thread B is scheduled again, it takes the mutex.
Now the scenario is very different when it comes to ISRs. An ISR won't be put to sleep in favor of a lower-priority thread, so the thread owning the mutex will not run while you are in the ISR, and you will never get past step 3.
So the real question is, "When running an IRQ handler, is it possible for other code to run?" Otherwise you are in deadlock!
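One lock-free alternative that is often used for this kind of handoff is a single-producer/single-consumer ring buffer. The sketch below uses C11 atomics and assumes exactly one writer (the handler thread) and exactly one reader (the main thread); the names byte_ring, ring_init, ring_push and ring_pop are invented for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 256u
static_assert((RING_SIZE & (RING_SIZE - 1u)) == 0, "RING_SIZE must be a power of two");

typedef struct {
    uint8_t     slots[RING_SIZE];
    atomic_uint head;        /* next slot to write; only the producer stores it */
    atomic_uint tail;        /* next slot to read; only the consumer stores it */
} byte_ring;

void ring_init(byte_ring *r)
{
    atomic_init(&r->head, 0u);
    atomic_init(&r->tail, 0u);
}

/* Producer side (handler thread). Returns false if the ring is full. */
bool ring_push(byte_ring *r, uint8_t byte)
{
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SIZE)
        return false;
    r->slots[head % RING_SIZE] = byte;
    /* release: the byte is visible before the new head is */
    atomic_store_explicit(&r->head, head + 1u, memory_order_release);
    return true;
}

/* Consumer side (main thread). Returns false if the ring is empty. */
bool ring_pop(byte_ring *r, uint8_t *out)
{
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return false;
    *out = r->slots[tail % RING_SIZE];
    atomic_store_explicit(&r->tail, tail + 1u, memory_order_release);
    return true;
}
```

Neither side ever blocks, so the handler cannot get stuck behind a preempted lock holder; the cost is that a full ring drops data unless the caller handles the false return.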

synchronising threads with mutexes

In Qt, I have a method which contains a mutex lock and unlock. The problem is that when the mutex is unlocked, it sometimes takes a long time before the other thread gets the lock. In other words, it seems the same thread can get the lock back (the method is called in a loop) even though another thread is waiting for it. What can I do about this? One thread is a QThread and the other thread is the main thread.
You can have the thread that just unlocked the mutex relinquish the processor. On POSIX, you do that by calling sched_yield() (pthread_yield() is a non-standard equivalent on some systems) and on Windows by calling Sleep(0).
That said, there is no guarantee that the thread waiting on the lock will be scheduled before your thread wakes up again.
It shouldn't be possible to release a lock and then get it back if some other thread is already waiting on it.
Check that you are actually releasing the lock when you think you are. Check that the waiting thread actually waits (and does not spin in a loop with trylock tests and sleeps; I actually did that once and was very puzzled at first :)).
Or if the waiting thread really never gets time to even reach the locking code, try QThread::yieldCurrentThread(). This will stop the current thread and give the scheduler a chance to give execution to somebody else. It might cause unnecessary switching, depending on the tightness of your loop.
If you want to make sure that one thread has priority over the other ones, an option is to use a QReadWriteLock. It's adapted to a typical scenario where n threads are going to read a value in an infinite loop, with only one thread updating it. I think it's the scenario you described.
QReadWriteLock offers two ways to lock: lockForRead() and lockForWrite(). The threads depending on the value will use the former (lockForRead()), while the thread updating the value (typically via the GUI) will use the latter (lockForWrite()) and will have top priority. You won't need to sleep or yield or whatever.
Example code
Let's say you have a QReadWriteLock lock; somewhere.
"Reader" thread
forever {
    lock.lockForRead();
    if (condition) {
        do_stuff();
    }
    lock.unlock();
}
"Writer" thread
// external input (eg. user) changes the thread
lock.lockForWrite(); // will block as soon as the reader lock ends
update_condition();
lock.unlock();

How do I suspend another thread (not the current one)?

I'm trying to implement a simulation of a microcontroller. This simulation is not meant to do a clock cycle precise representation of one specific microcontroller but check the general correctness of the code.
I thought of having a "main thread" executing normal code and a second thread executing ISR code. Whenever an ISR needs to be run, the ISR thread suspends the "main thread".
Of course, I want to have a feature to block interrupts.
I thought of solving this with a mutex that the ISR thread holds whenever it executes ISR code while the main thread holds it as long as "interrupts are blocked".
A POR (power on reset) can then be implemented by not only suspending but killing the main thread (and starting a new one executing the POR function).
The windows API provides the necessary functions.
But it seems to be impossible to do the above with posix threads (on linux).
I don't want to change the actual hardware independent microcontroller code. So inserting anything to check for pending interrupts is not an option.
Receiving interrupts at non well behaved points is desirable, as this also happens on microcontrollers (unless you block interrupts).
Is there a way to suspend another thread on linux? (Debuggers must use that option somehow, I think.)
Please, don't tell me this is a bad idea. I know that is true in most circumstances. But the main code does not use standard libs or lock/mutexes/semaphores.
SIGSTOP does not work - it always stops the entire process.
Instead you can use some other signals, say SIGUSR1 for suspending and SIGUSR2 for resuming:
// at process start call init_pthread_suspending to install the handlers
// to suspend a thread use pthread_kill(thread_id, SUSPEND_SIG)
// to resume a thread use pthread_kill(thread_id, RESUME_SIG)
#include <signal.h>
#define RESUME_SIG SIGUSR2
#define SUSPEND_SIG SIGUSR1
static sigset_t wait_mask;
static __thread volatile sig_atomic_t suspended; // per-thread flag
void resume_handler(int sig)
{
    (void)sig;
    suspended = 0;
}
void suspend_handler(int sig)
{
    (void)sig;
    if (suspended) return;
    suspended = 1;
    // sleep until RESUME_SIG clears the flag; all other signals stay blocked
    do sigsuspend(&wait_mask); while (suspended);
}
void init_pthread_suspending(void)
{
    struct sigaction sa;
    sigfillset(&wait_mask);
    sigdelset(&wait_mask, SUSPEND_SIG); // was missing its semicolon
    sigdelset(&wait_mask, RESUME_SIG);
    sigfillset(&sa.sa_mask);
    sa.sa_flags = 0;
    sa.sa_handler = resume_handler;
    sigaction(RESUME_SIG, &sa, NULL);
    sa.sa_handler = suspend_handler;
    sigaction(SUSPEND_SIG, &sa, NULL);
}
I am very annoyed by replies like "you should not suspend another thread, that is bad".
Guys why do you assume others are idiots and don't know what they are doing? Imagine that others, too, have heard about deadlocking and still, in full consciousness, want to suspend other threads.
If you don't have a real answer to their question why do you waste your and the readers' time.
And yes, IMO pthreads is a very short-sighted API, a disgrace for POSIX.
The HotSpot Java VM uses SIGUSR2 to implement suspend/resume for Java threads on Linux.
A procedure based on a signal handler for SIGUSR2 might be:
Providing a signal handler for SIGUSR2 allows a thread to request a lock
(which has already been acquired by the signal sending thread).
This suspends the thread.
As soon as the suspending thread releases the lock, the signal handler can
(and will?) get the lock. The signal handler releases the lock immediately and
leaves the signal handler.
This resumes the thread.
It will probably be necessary to introduce a control variable to make sure that the main thread is in the signal handler before starting the actual processing of the ISR.
(The details depend on whether the signal handler is called synchronously or asynchronously.)
I don't know, if this is exactly how it is done in the Java VM, but I think the above procedure does what I need.
Somehow I think sending the other thread SIGSTOP works.
However, you are far better off writing some thread communication involving semaphores, mutexes and global variables.
You see, if you suspend the other thread while it is in malloc() and then you call malloc() yourself, you get a deadlock.
Did I mention that lots of C standard library functions, let alone other libraries you use, will call malloc() behind your back?
EDIT:
Hmmm, no standard library code. Maybe use setjmp()/longjmp() from a signal handler to simulate the POR, and a signal handler to simulate the interrupt.
TO THOSE WHO KEEP DOWNVOTING THIS: The answer was accepted for the contents after EDIT, which is a specific scenario that cannot be used in any other scenario.
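A rough, single-threaded sketch of the "jump out of a signal handler to simulate a reset" idea follows. It uses sigsetjmp/siglongjmp, the signal-mask-aware variants of setjmp/longjmp. The choice of SIGUSR1 and the names run_until_reset and demo_firmware are assumptions made for the example; in the simulator described in the question the signal would come from the other thread (e.g. via pthread_kill) rather than from raise(). Jumping out of a handler like this is only safe if the interrupted code was not inside a non-async-signal-safe function, which fits the "no standard library code" constraint.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

static sigjmp_buf por_entry;         /* where a "power-on reset" lands */

static void por_handler(int sig)
{
    (void)sig;
    siglongjmp(por_entry, 1);        /* abandon whatever was running */
}

/* Stand-in for the microcontroller code: it gets "reset" mid-run. */
static void demo_firmware(void)
{
    raise(SIGUSR1);
    for (;;) { }                     /* never reached: the handler jumps away */
}

/* Runs firmware() until the first reset; returns the number of resets seen,
 * or -1 if the handler could not be installed. */
int run_until_reset(void (*firmware)(void))
{
    struct sigaction sa = {0};
    volatile int resets = 0;

    assert(firmware != NULL);
    sa.sa_handler = por_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    /* savemask = 1: the jump also restores the signal mask, so the
     * reset signal is unblocked again after leaving the handler. */
    if (sigsetjmp(por_entry, 1) != 0)
        resets++;
    if (resets == 0)
        firmware();
    return resets;
}
```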
Solaris has the thr_suspend(3C) call that would do what you want. Is switching to Solaris a possibility?
Other than that, you're probably going to have to do some gymnastics with mutexes and/or semaphores. The problem is that you'll only suspend when you check the mutex, which will probably be at a well-behaved point. Depending on what you're actually trying to accomplish, this might not be desirable.
It makes more sense to have the main thread execute the ISRs, because that's how the real controller works (presumably). Just have it check after each emulated instruction whether an interrupt is pending and interrupts are currently enabled; if so, emulate a call to the ISR.
The second thread is still used, but it just listens for the conditions which cause an interrupt and marks the relevant interrupt as pending (for the other thread to pick up later).
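A minimal sketch of that scheme, assuming a single interrupt source: the listener thread only sets a flag, and the main thread checks it after every emulated instruction. All names (raise_irq, cpu_step, emulate_nop, isr_runs) are invented for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static atomic_bool irq_pending;      /* set by the listener thread */
static bool irq_enabled = true;      /* only touched by the main (CPU) thread */
static int  isr_runs;                /* how often the emulated ISR has run */

/* Called from the listener thread when the interrupt condition occurs. */
void raise_irq(void)
{
    atomic_store(&irq_pending, true);
}

static void isr(void)
{
    isr_runs++;
}

/* Hypothetical emulated instruction that does nothing. */
static void emulate_nop(void)
{
}

/* Main thread: execute one instruction, then service a pending interrupt.
 * While interrupts are disabled the pending flag is simply left set. */
void cpu_step(void (*instruction)(void))
{
    assert(instruction != NULL);
    instruction();
    if (irq_enabled && atomic_exchange(&irq_pending, false))
        isr();
}
```

Blocking interrupts is then just irq_enabled = false in the main thread; an interrupt raised in the meantime runs on the first step after they are enabled again.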

Resources