Synchronize multiple pthreads in a - multithreading

I'm discovering the pthread library (in C) and I'm having trouble understanding a few things.
First of all, I understand what a mutex is and how it works, and I also understand the concept of the condition variable (cond), but I can't manage to use it properly (I don't really get how to combine the mutex and the cond).
This is, in pseudo-code, what I want to do :
thread :
    loop :
        // do something
    end loop
end thread
So there are n threads, and each thread runs the same function. I want the body of the loop to be executed in parallel by all the threads, BUT every thread must be in the same iteration of the loop. That is, I don't care in what order the instructions inside the loop are executed between threads, but before any thread starts iteration 2, all the other threads must have finished iteration 1 (and so on).
So my question is : how do you do that ? Not particularly in a specific example, but theoretically.
EDIT
I managed to do it. I don't know if it's the proper way, but it's working:
global nbOfThreads
global nbOfIterations
thread :
    lock(mutex0)
    unlock(mutex0)
    loop :
        // Do something
        lock(mutex1)
        nbOfIterations++
        if (nbOfIterations == nbOfThreads) :
            nbOfIterations = 0
            broadcast(cond)
            unlock(mutex1)
            continue
        end if
        wait(cond, mutex1)
        unlock(mutex1)
    end loop
end thread
main (n) :
    nbOfThreads = n
    nbOfIterations = 0
    lock(mutex0)
    do nbOfThreads times : create(thread)
    unlock(mutex0)
end main
I obviously tried to understand myself, but there are some things I don't understand :
The main one: WHY does a cond need to be paired with a mutex?
In some examples I saw something like this :
// thread A :
while (!condition)
    wait(&cond)
// thread B :
if (condition)
    signal(&cond)
Well, I really don't get the point of this while loop. I thought wait paused the thread until the condition is true (until the other thread sends the signal). I mean, I would get it if it were an if instead of a while.
Thank you

WHY does a cond need.... because the (!condition) you reference almost certainly depends upon some bits of the object not changing while you reference them. Correspondingly, modifying the state of the object should be done in such a way as to appear atomic to any observer; thus a mutex. While you could rely on too-clever-by-half hackery like atomic types, there is also the problem of ‘what if it was modified just after you checked it’ -- a race condition. Thus the idiomatic lock(); while (!cond) { wait(); }.
The point of the while... The signal+wait is not a handoff of control; after the signal, any number of things could happen to the object before a particular thread returns from wait. Even though the condition might have been in the correct state, by the time thread A examines it, it may no longer be. At the point of exiting the while loop, thread A knows: The condition is in the state I desire, and I have exclusive access to the object.
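To make the two points above concrete, here is a minimal sketch of the lockstep loop from the question built on a mutex and a condition variable. The names (wait_for_all, generation, finished, run_lockstep) and the thread and iteration counts are assumptions for illustration, not anything from the original post. The predicate is "the generation counter has changed", and it is re-checked in a while loop under the mutex, so a spurious or stale wake-up cannot release a thread early.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_THREADS 4
#define NUM_ITERATIONS 3

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t all_arrived = PTHREAD_COND_INITIALIZER;
static int arrived;                   /* threads waiting in the current round */
static int generation;                /* bumped once per completed round */
static int finished[NUM_ITERATIONS];  /* threads that finished iteration i */
static int violations;                /* times a thread got ahead of the others */

/* Block until all NUM_THREADS threads have called this function. */
static void wait_for_all(void)
{
    pthread_mutex_lock(&mtx);
    int my_generation = generation;
    if (++arrived == NUM_THREADS) {
        arrived = 0;
        generation++;                 /* the state change the waiters look for */
        pthread_cond_broadcast(&all_arrived);
    } else {
        /* while, not if: waking up alone does not prove the round is over */
        while (my_generation == generation)
            pthread_cond_wait(&all_arrived, &mtx);
    }
    pthread_mutex_unlock(&mtx);
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < NUM_ITERATIONS; i++) {
        /* "do something" for iteration i, then record that it is done */
        pthread_mutex_lock(&mtx);
        finished[i]++;
        pthread_mutex_unlock(&mtx);

        wait_for_all();

        /* After the rendezvous, every thread must have finished iteration i. */
        pthread_mutex_lock(&mtx);
        if (finished[i] != NUM_THREADS)
            violations++;
        pthread_mutex_unlock(&mtx);
    }
    return NULL;
}

/* Runs the demo once; returns how often the lockstep rule was broken. */
int run_lockstep(void)
{
    pthread_t threads[NUM_THREADS];
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
    return violations;
}
```

Calling run_lockstep() from main should return 0. The generation counter is what lets the same condition variable be reused for every iteration: a thread that wakes up late still sees that its round has ended.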

Condition variables can have spurious wake-ups. The condition might not actually be true when the wait function returns.
Depending on your task, a different synchronization primitive, such as a barrier (see pthread_barrier_init) or a semaphore (sem_init) might be easier to use.
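As a rough illustration of the barrier suggestion, the sketch below keeps a few threads in the same loop iteration with a single pthread_barrier_t. It assumes a platform that provides POSIX barriers (Linux with glibc does; some systems do not), and the names progress, mismatches and run_barrier_lockstep are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

#define NUM_THREADS 4
#define NUM_ITERATIONS 5

static pthread_barrier_t barrier;
static int progress[NUM_THREADS];    /* last iteration each thread completed */
static int mismatches[NUM_THREADS];  /* per-thread count of out-of-step sightings */
static int ids[NUM_THREADS];

static void *worker(void *arg)
{
    int id = *(int *)arg;
    for (int i = 0; i < NUM_ITERATIONS; i++) {
        progress[id] = i + 1;             /* "do something" for iteration i */

        pthread_barrier_wait(&barrier);   /* everyone has finished iteration i */

        for (int k = 0; k < NUM_THREADS; k++)
            if (progress[k] != i + 1)
                mismatches[id]++;

        /* Second wait: nobody starts iteration i + 1 while others still read. */
        pthread_barrier_wait(&barrier);
    }
    return NULL;
}

/* Returns the total number of out-of-step observations (expected: 0). */
int run_barrier_lockstep(void)
{
    pthread_t threads[NUM_THREADS];
    int rc = pthread_barrier_init(&barrier, NULL, NUM_THREADS);
    assert(rc == 0);
    for (int i = 0; i < NUM_THREADS; i++) {
        ids[i] = i;
        mismatches[i] = 0;
        rc = pthread_create(&threads[i], NULL, worker, &ids[i]);
        assert(rc == 0);
    }
    int total = 0;
    for (int i = 0; i < NUM_THREADS; i++) {
        pthread_join(threads[i], NULL);
        total += mismatches[i];
    }
    pthread_barrier_destroy(&barrier);
    return total;
}
```

The second pthread_barrier_wait is only needed here because the threads inspect each other's progress; a loop that just needs "everyone finished iteration i" can get by with one wait per iteration.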

Related

multiple threads but only one allowed to use method

So basically the situation I am in is I have a bunch of threads each doing different calculations throughout the week. At the end of the week, every thread calls function X() and then starts calculating for the next week and repeats this cycle.
However, only one thread is allowed to actually do the operations in method X() and only when all threads have reached method X(). Furthermore, none of the threads can continue on their way until the one thread that got to use method X() is finished.
So I'm having difficulty implementing this. I feel like I need to use a condition variable but I'm still shaky with threads and whatnot.
Barriers are a useful synchronization method here.
In pthreads, you can use two barriers, each initialized to require however many threads are running. The first synchronizes threads after they've finished calculating, and the second after one of them has called X(). Conveniently, pthread_barrier_wait will elect one and only one of your N waiting threads to actually call X():
void *my_thread(void *whatever) { // XXX error checking omitted
    while (1) {
        int rc;
        do_intense_calculations();
        // Wait for all calculations to finish
        rc = pthread_barrier_wait(&calc_barrier);
        // Am I nominated to run X() ?
        if (rc == PTHREAD_BARRIER_SERIAL_THREAD) X();
        // Wait for everyone, including whoever is doing X()
        rc = pthread_barrier_wait(&x_barrier);
    }
    return NULL; // not reached, the loop never exits
}
Java's CyclicBarrier with a Runnable argument would let you do the same thing with but one barrier. (The Runnable is run after all parties arrive but before any are released.)
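For completeness, here is one way the two-barrier scheme could be wired up end to end, as a sketch rather than a definitive implementation. The thread count, the number of weeks, the x_calls counter and the run_weeks helper are assumptions added for the example; X() merely counts how often it was called so the "exactly one thread per week" property can be checked. It assumes a platform with POSIX barriers such as Linux.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

#define NUM_THREADS 3
#define NUM_WEEKS 4

static pthread_barrier_t calc_barrier;  /* all calculations for the week are done */
static pthread_barrier_t x_barrier;     /* X() for the week is done */
static int x_calls;                     /* only touched by the elected thread */

static void X(void)
{
    x_calls++;  /* safe without a lock: one thread per week, between barriers */
}

static void *my_thread(void *arg)
{
    (void)arg;
    for (int week = 0; week < NUM_WEEKS; week++) {
        /* the week's calculations would go here */
        int rc = pthread_barrier_wait(&calc_barrier);
        if (rc == PTHREAD_BARRIER_SERIAL_THREAD)
            X();                         /* exactly one thread is elected */
        pthread_barrier_wait(&x_barrier);  /* wait for whoever ran X() */
    }
    return NULL;
}

/* Returns how many times X() ran in total (expected: NUM_WEEKS). */
int run_weeks(void)
{
    pthread_t threads[NUM_THREADS];
    x_calls = 0;
    int rc = pthread_barrier_init(&calc_barrier, NULL, NUM_THREADS);
    assert(rc == 0);
    rc = pthread_barrier_init(&x_barrier, NULL, NUM_THREADS);
    assert(rc == 0);
    for (int i = 0; i < NUM_THREADS; i++) {
        rc = pthread_create(&threads[i], NULL, my_thread, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
    pthread_barrier_destroy(&calc_barrier);
    pthread_barrier_destroy(&x_barrier);
    return x_calls;
}
```

Both barriers are initialized with the same count because every thread, including the elected one, waits on both of them each week.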

When is a condition variable needed, isn't a mutex enough?

I'm sure a mutex isn't enough, which is why the concept of condition variables exists; but it beats me, and I can't convince myself with a concrete scenario in which a condition variable is essential.
Differences between Conditional variables, Mutexes and Locks question's accepted answer says that a condition variable is a
lock with a "signaling" mechanism. It is used when threads need to
wait for a resource to become available. A thread can "wait" on a CV
and then the resource producer can "signal" the variable, in which
case the threads who wait for the CV get notified and can continue
execution
Where I get confused is that a thread can wait on a mutex too, and when it gets signalled, it simply means the variable is now available. Why would I need a condition variable?
P.S.: Also, a mutex is required to guard the condition variable anyway, which makes it even harder for me to see the condition variable's purpose.
Even though you can use them in the way you describe, mutexes weren't designed for use as a notification/synchronization mechanism. They are meant to provide mutually exclusive access to a shared resource. Using mutexes to signal a condition is awkward and I suppose would look something like this (where Thread1 is signaled by Thread2):
Thread1:
while(1) {
    lock(mutex); // Blocks waiting for notification from Thread2
    ... // do work after notification is received
    unlock(mutex); // Tells Thread2 we are done
}
Thread2:
while(1) {
    ... // do the work that precedes notification
    unlock(mutex); // unblocks Thread1
    lock(mutex); // lock the mutex so Thread1 will block again
}
There are several problems with this:
Thread2 cannot continue to "do the work that precedes notification" until Thread1 has finished with "work after notification". With this design, Thread2 is not even necessary, that is, why not move "work that precedes" and "work after notification" into the same thread since only one can run at a given time!
If Thread2 is not able to preempt Thread1, Thread1 will immediately re-lock the mutex when it repeats the while(1) loop and Thread1 will go about doing the "work after notification" even though there was no notification. This means you must somehow guarantee that Thread2 will lock the mutex before Thread1 does. How do you do that? Maybe force a schedule event by sleeping or by some other OS-specific means but even this is not guaranteed to work depending on timing, your OS, and the scheduling algorithm.
These two problems aren't minor, in fact, they are both major design flaws and latent bugs. The origin of both of these problems is the requirement that a mutex is locked and unlocked within the same thread. So how do you avoid the above problems? Use condition variables!
BTW, if your synchronization needs are really simple, you could use a plain old semaphore which avoids the additional complexity of condition variables.
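A minimal sketch of the "plain old semaphore" option for the simple one-shot notification case, assuming POSIX unnamed semaphores are available (sem_init works on Linux; some platforms only support named semaphores). The payload variable and the run_notification helper are hypothetical names for the example.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t ready;   /* counts pending notifications, starts at 0 */
static int payload;   /* data produced before the notification */

/* Thread1: blocks until Thread2 posts, then does the "work after". */
static void *thread1(void *arg)
{
    int *result = arg;
    while (sem_wait(&ready) == -1 && errno == EINTR)
        continue;                 /* retry if interrupted by a signal */
    *result = payload * 2;
    return NULL;
}

/* Thread2: does the "work that precedes notification", then notifies. */
static void *thread2(void *arg)
{
    (void)arg;
    payload = 21;
    sem_post(&ready);             /* Thread2 is free to carry on immediately */
    return NULL;
}

/* Returns the value computed by thread1 after it was notified. */
int run_notification(void)
{
    pthread_t t1, t2;
    int result = 0;
    int rc = sem_init(&ready, 0, 0);
    assert(rc == 0);
    rc = pthread_create(&t1, NULL, thread1, &result);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, thread2, NULL);
    assert(rc == 0);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    sem_destroy(&ready);
    return result;
}
```

Unlike the mutex-based attempt, the post is remembered even if it happens before Thread1 starts waiting, and the poster does not need to "own" anything.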
A mutex is for exclusive access to shared resources, while a condition variable is about waiting for a condition to become true. They are two different types of kernel resource. Some people might think they can implement a condition variable by themselves with a mutex; a common pattern is "flag + mutex":
lock(mutex)
while (!flag) {
    sleep(100);
}
unlock(mutex)
do_something_on_flag_set();
but it doesn't work: because you never release the mutex during the wait, no one else can set the flag in a thread-safe way. This is why we need kernel support for condition variables, so that while you're waiting on a condition variable, the associated mutex is not held by your thread until it's signaled.
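For comparison, the same flag wait written with a condition variable might look like the sketch below. pthread_cond_wait atomically releases the mutex while the thread sleeps and re-acquires it before returning, which is exactly what the sleep-based version cannot do. The names flag_changed, set_flag and run_flag_wait are assumptions for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t flag_changed = PTHREAD_COND_INITIALIZER;
static bool flag;

/* Waits for the flag without holding the mutex while asleep. */
static void *waiter(void *arg)
{
    int *done = arg;
    pthread_mutex_lock(&mutex);
    while (!flag)
        pthread_cond_wait(&flag_changed, &mutex);  /* unlocks while waiting */
    pthread_mutex_unlock(&mutex);
    *done = 1;                                     /* do_something_on_flag_set() */
    return NULL;
}

/* Can take the mutex because the waiter gave it up inside the wait. */
static void set_flag(void)
{
    pthread_mutex_lock(&mutex);
    flag = true;
    pthread_cond_signal(&flag_changed);
    pthread_mutex_unlock(&mutex);
}

/* Returns 1 once the waiter has observed the flag. */
int run_flag_wait(void)
{
    pthread_t t;
    int done = 0;
    pthread_mutex_lock(&mutex);
    flag = false;
    pthread_mutex_unlock(&mutex);
    int rc = pthread_create(&t, NULL, waiter, &done);
    assert(rc == 0);
    set_flag();
    pthread_join(t, NULL);
    return done;
}
```

It does not matter whether set_flag() runs before or after the waiter reaches the loop: the flag is checked under the mutex, so the notification cannot be lost.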
I was thinking about this too and the most important information which I think was missing everywhere is that mutex can be owned (or changed) by only one thread at a time. So if you have one producer and more consumers, the producer would have to wait on mutex to produce. With cond. variable it can produce at any time.
You need condition variables, to be used with a mutex (each cond.var. belongs to a mutex) to signal changing states (conditions) from one thread to another one. The idea is that a thread can wait till some condition becomes true. Such conditions are program specific (i.e. "queue is empty", "matrix is big", "some resource is almost exhausted", "some computation step has finished" etc). A mutex might have several related condition variables. And you need condition variables because such conditions may not always be expressed as simply as "a mutex is locked" (so you need to broadcast changes in conditions to other threads).
Read some good posix thread tutorials, e.g. this tutorial or that or that one. Better yet, read a good pthread book. See this question.
Also read Advanced Unix Programming and Advanced Linux Programming
P.S. Parallelism and threads are difficult concepts to grasp. Take time to read and experiment and read again.
The conditional var and the mutex pair can be replaced by a binary semaphore and mutex pair. The sequence of operations of a consumer thread when using the conditional var + mutex is:
Lock the mutex
Wait on the conditional var
Process
Unlock the mutex
The producer thread sequence of operations is
Lock the mutex
Signal the conditional var
Unlock the mutex
The corresponding consumer thread sequence when using the sema+mutex pair is
Wait on the binary sema
Lock the mutex
Check for the expected condition
If the condition is true, process.
Unlock the mutex
If the condition check in the step 3 was false, go back to the step 1.
The sequence for the producer thread is:
Lock the mutex
Post the binary sema
Unlock the mutex
As you can see the unconditional processing in the step 3 when using the conditional var is replaced by the conditional processing in the step 3 and step 4 when using the binary sema.
The reason is that when using sema+mutex, in a race condition, another consumer thread may sneak in between step 1 and step 2 and process/nullify the condition. This won't happen when using the conditional var. When using the conditional var, the condition is guaranteed to be true after step 2.
The binary semaphore can be replaced with the regular counting semaphore. This may result in the step 6 to step 1 loop a few more times.
Slowjelj is right, but to shed some light on the problem, look at the python code below. We have a buffer, a producer, and a consumer. And think if you could rewrite it just with mutexes.
import threading, time, random
cv = threading.Condition()
buffer = []
MAX = 3
def put(value):
    cv.acquire()
    while len(buffer) == MAX:
        cv.wait()
    buffer.append(value)
    print("added value ", value, "length =", len(buffer))
    cv.notify()
    cv.release()
def get():
    cv.acquire()
    while len(buffer) == 0:
        cv.wait()
    value = buffer.pop()
    print("removed value ", value, "length =", len(buffer))
    cv.notify()
    cv.release()
def producer():
    while True:
        put(0)  # it doesn't matter what the value is in our example
        time.sleep(random.random()/10)
def consumer():
    while True:
        get()
        time.sleep(random.random()/10)
if __name__ == '__main__':
    cs = threading.Thread(target=consumer)
    pd = threading.Thread(target=producer)
    cs.start()
    pd.start()
    cs.join()
    pd.join()
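Since the rest of this thread is about pthreads, here is a rough C counterpart of the Python buffer above. It is a sketch under a few assumptions: it uses two condition variables (not_full and not_empty) where the Python version shares one, it runs for a fixed number of items instead of forever, and the names run_bounded_buffer and ITEMS are invented for the example.

```c
#include <assert.h>
#include <pthread.h>

#define CAPACITY 3
#define ITEMS 100

static int buffer[CAPACITY];
static int count;  /* number of values currently stored */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t not_full = PTHREAD_COND_INITIALIZER;
static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;

static void put(int value)
{
    pthread_mutex_lock(&lock);
    while (count == CAPACITY)              /* wait for free space */
        pthread_cond_wait(&not_full, &lock);
    buffer[count++] = value;
    pthread_cond_signal(&not_empty);
    pthread_mutex_unlock(&lock);
}

static int get(void)
{
    pthread_mutex_lock(&lock);
    while (count == 0)                     /* wait for data */
        pthread_cond_wait(&not_empty, &lock);
    int value = buffer[--count];           /* pops the newest value, like list.pop() */
    pthread_cond_signal(&not_full);
    pthread_mutex_unlock(&lock);
    return value;
}

static void *producer(void *arg)
{
    (void)arg;
    for (int i = 1; i <= ITEMS; i++)
        put(i);
    return NULL;
}

static void *consumer(void *arg)
{
    int *sum = arg;
    for (int i = 0; i < ITEMS; i++)
        *sum += get();
    return NULL;
}

/* Returns the sum of everything the consumer received. */
int run_bounded_buffer(void)
{
    pthread_t pd, cs;
    int sum = 0;
    int rc = pthread_create(&cs, NULL, consumer, &sum);
    assert(rc == 0);
    rc = pthread_create(&pd, NULL, producer, NULL);
    assert(rc == 0);
    pthread_join(pd, NULL);
    pthread_join(cs, NULL);
    return sum;
}
```

With ITEMS set to 100 the consumer should receive the values 1 to 100 in some order, so the sum is 5050. A mutex alone could protect the buffer, but it gives neither thread a way to sleep until "not full" or "not empty" becomes true.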
I think it is implementation defined.
The mutex is enough or not depends on whether you regard the mutex as a mechanism for critical sections or something more.
As mentioned in http://en.cppreference.com/w/cpp/thread/mutex/unlock,
The mutex must be locked by the current thread of execution, otherwise, the behavior is undefined.
which means a thread could only unlock a mutex which was locked/owned by itself in C++.
But in other programming languages, you might be able to share a mutex between processes.
So distinguishing the two concepts may be just performance considerations, a complex ownership identification or inter-process sharing are not worthy for simple applications.
For example, you may fix @slowjelj's case with an additional mutex (it might be an incorrect fix):
Thread1:
lock(mutex0);
while(1) {
    lock(mutex0); // Blocks waiting for notification from Thread2
    ... // do work after notification is received
    unlock(mutex1); // Tells Thread2 we are done
}
Thread2:
while(1) {
    lock(mutex1); // lock the mutex so Thread1 will block again
    ... // do the work that precedes notification
    unlock(mutex0); // unblocks Thread1
}
But your program will complain that you have triggered an assertion left by the compiler (e.g. "unlock of unowned mutex" in Visual Studio 2015).

Simple POSIX threads question

I have this POSIX thread:
void subthread(void)
{
    while(!quit_thread) {
        // do something
        ...
        // don't waste cpu cycles
        if(!quit_thread) usleep(500);
    }
    // free resources
    ...
    // tell main thread we're done
    quit_thread = FALSE;
}
Now I want to terminate subthread() from my main thread. I've tried the following:
quit_thread = TRUE;
// wait until subthread() has cleaned its resources
while(quit_thread);
But it does not work! The while() loop never exits, although my subthread clearly sets quit_thread to FALSE after having freed its resources!
If I modify my shutdown code like this:
quit_thread = TRUE;
// wait until subthread() has cleaned its resources
while(quit_thread) usleep(10);
Then everything is working fine! Could someone explain to me why the first solution does not work and why the version with usleep(10) suddenly works? I know that this is not a pretty solution. I could use semaphores/signals for this but I'd like to learn something about multithreading, so I'd like to know why my first solution doesn't work.
Thanks!
Without a memory fence, there is no guarantee that values written in one thread will appear in another. Most of the pthread primitives introduce a barrier, as do several system calls such as usleep. Using a mutex around both the read and write introduces a barrier, and more generally prevents multi-byte values being visible in partially written state.
You also need to separate the idea of asking a thread to stop executing from reporting that it has stopped; you appear to be using the same variable for both.
What's most likely to be happening is that your compiler is not aware that quit_thread can be changed by another thread (because C doesn't know about threads, at least at the time this question was asked). Because of that, it's optimising the while loop to an infinite loop.
In other words, it looks at this code:
quit_thread = TRUE;
while(quit_thread);
and thinks to itself, "Hah, nothing in that loop can ever change quit_thread to FALSE, so the coder obviously just meant to write while (TRUE);".
When you add the call to usleep, the compiler has another think about it and assumes that the function call may change the global, so it plays it safe and doesn't optimise it.
Normally you would mark the variable as volatile to stop the compiler from optimising it but, in this case, you should use the facilities provided by pthreads and join to the thread after setting the flag to true (and don't have the sub-thread reset it, do that in the main thread after the join if it's necessary). The reason for that is that a join is likely to be more efficient than a continuous loop waiting for a variable change since the thread doing the join will most likely not be executed until the join needs to be done.
In your spinning solution, the joining thread will most likely continue to run and suck up CPU grunt.
In other words, do something like:
Main thread              Child thread
-------------------      -------------------
fStop = false
start Child              Initialise
Do some other stuff      while not fStop:
fStop = true                 Do what you have to do
                         Finish up and exit
join to Child
Do yet more stuff
And, as an aside, you should technically protect shared variables with mutexes but this is one of the few cases where it's okay, one-way communication where half-changed values of a variable don't matter (false/not-false).
The reason you normally mutex-protect a variable is to stop one thread seeing it in a half-changed state. Let's say you have a two-byte integer for a count of some objects, and it's set to 0x00ff (255).
Let's further say that thread A tries to increment that count but it's not an atomic operation. It changes the top byte to 0x01 but, before it gets a chance to change the bottom byte to 0x00, thread B swoops in and reads it as 0x01ff.
Now that's not going to be very good if thread B wants to do something with the last element counted by that value. It should be looking at 0x0100 but will instead try to look at 0x01ff, the effect of which will be wrong, if not catastrophic.
If the count variable were protected by a mutex, thread B wouldn't be looking at it until thread A had finished updating it, hence no problem would occur.
The reason that doesn't matter with one-way booleans is because any half state will also be considered as true or false so, if thread A was halfway between turning 0x0000 into 0x0001 (just the top byte), thread B would still see that as 0x0000 (false) and keep going (until thread A finishes its update next time around).
And if thread A was turning the boolean into 0xffff, the half state of 0xff00 would still be considered true by thread B so it would do its thing before thread A had finished updating the boolean.
Neither of those two possibilities is bad simply because, in both, thread A is in the process of changing the boolean and it will finish eventually. Whether thread B detects it a tiny bit earlier or a tiny bit later doesn't really matter.
The while(quit_thread); is using the value quit_thread was set to on the line before it. Calling a function (usleep) induces the compiler to reload the value on each test.
In any case, this is the wrong way to wait for a thread to complete. Use pthread_join instead.
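A minimal sketch of the join-based shutdown recommended in the answers above. It assumes a C11 compiler with <stdatomic.h> and uses an atomic_bool for the stop request rather than a plain or volatile variable; that choice, along with the names stop_requested, short_sleep and run_shutdown, is an assumption of this example and not something the original posts prescribe. The flag only carries the request; completion is reported by pthread_join.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

static atomic_bool stop_requested;  /* main -> child: "please stop" */

static void short_sleep(long microseconds)
{
    struct timespec ts = { 0, microseconds * 1000L };
    nanosleep(&ts, NULL);
}

static void *subthread(void *arg)
{
    long *iterations = arg;
    while (!atomic_load(&stop_requested)) {
        (*iterations)++;     /* do something */
        short_sleep(500);    /* don't waste cpu cycles */
    }
    /* free resources here; no need to report back through the flag */
    return NULL;
}

/* Returns the result of pthread_join (0 on a clean shutdown). */
int run_shutdown(void)
{
    pthread_t child;
    long iterations = 0;
    atomic_store(&stop_requested, false);
    int rc = pthread_create(&child, NULL, subthread, &iterations);
    assert(rc == 0);
    short_sleep(5000);                    /* main thread does other stuff */
    atomic_store(&stop_requested, true);  /* ask the child to stop */
    rc = pthread_join(child, NULL);       /* sleeps until the child is done */
    /* after the join it is safe to read iterations from this thread */
    return rc;
}
```

The main thread never spins: it blocks inside pthread_join until the child has cleaned up and returned.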
You're "learning" multithreading the wrong way. The right way is to learn to use mutexes and condition variables; any other solution will fail under some circumstances.

Synchronization among 2 threads in linux pthreads

On Linux, how can I synchronize two threads (using pthreads)?
I would like a thread, under some conditions, to block itself and later be resumed by another thread. In Java, there are the wait() and notify() functions. I am looking for something similar in pthreads:
I have read this, but it only has mutex, which is kind of like Java's synchronized keyword. That is not what I am looking for.
https://computing.llnl.gov/tutorials/pthreads/#Mutexes
Thank you.
You need a mutex, a condition variable and a helper variable.
In thread 1:
pthread_mutex_lock(&mtx);
// We wait for helper to change (which is the true indication we are
// ready) and use a condition variable so we can do this efficiently.
while (helper == 0)
{
    pthread_cond_wait(&cv, &mtx);
}
pthread_mutex_unlock(&mtx);
In thread 2:
pthread_mutex_lock(&mtx);
helper = 1;
pthread_cond_signal(&cv);
pthread_mutex_unlock(&mtx);
The reason you need a helper variable is because condition variables can suffer from spurious wakeup. It's the combination of a helper variable and a condition variable that gives you exact semantics and efficient waiting.
You can also look at spin locks. Try man/google on pthread_spin_init and pthread_spin_lock as a starting point.
Depending on your application's specifics, they might be more appropriate than a mutex.
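For reference, a small sketch of the spin lock calls mentioned above, protecting a shared counter. It assumes a platform that implements the optional POSIX spin lock API (Linux with glibc does); the counter, the loop counts and run_spin_counter are illustrative names only. Note that a spin lock only provides mutual exclusion: it does not let a thread block until another thread resumes it, so it is not a replacement for the condition variable in the answer above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

#define NUM_THREADS 4
#define INCREMENTS 100000

static pthread_spinlock_t spin;
static long counter;  /* protected by spin */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_spin_lock(&spin);    /* busy-waits instead of sleeping */
        counter++;
        pthread_spin_unlock(&spin);
    }
    return NULL;
}

/* Returns the final counter value. */
long run_spin_counter(void)
{
    pthread_t threads[NUM_THREADS];
    counter = 0;
    int rc = pthread_spin_init(&spin, PTHREAD_PROCESS_PRIVATE);
    assert(rc == 0);
    for (int i = 0; i < NUM_THREADS; i++) {
        rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
    pthread_spin_destroy(&spin);
    return counter;
}
```

Spin locks tend to pay off only when the critical section is a handful of instructions, as it is here; for anything longer a mutex is usually the safer default.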

Is it ok to have multiple threads writing the same values to the same variables?

I understand about race conditions and how with multiple threads accessing the same variable, updates made by one can be ignored and overwritten by others, but what if each thread is writing the same value (not different values) to the same variable; can even this cause problems? Could this code:
GlobalVar.property = 11;
(assuming that property will never be assigned anything other than 11), cause problems if multiple threads execute it at the same time?
The problem comes when you read that state back, and do something about it. Writing is a red herring - it is true that as long as this is a single word most environments guarantee the write will be atomic, but that doesn't mean that a larger piece of code that includes this fragment is thread-safe. Firstly, presumably your global variable contained a different value to begin with - otherwise if you know it's always the same, why is it a variable? Second, presumably you eventually read this value back again?
The issue is that presumably, you are writing to this bit of shared state for a reason - to signal that something has occurred? This is where it falls down: when you have no locking constructs, there is no implied order of memory accesses at all. It's hard to point to what's wrong here because your example doesn't actually contain the use of the variable, so here's a trivialish example in neutral C-like syntax:
int x = 0, y = 0;
//thread A does:
x = 1;
y = 2;
if (y == 2)
    print(x);
//thread B does, at the same time:
if (y == 2)
    print(x);
Thread A will always print 1, but it's completely valid for thread B to print 0. The order of operations in thread A is only required to be observable from code executing in thread A - thread B is allowed to see any combination of the state. The writes to x and y may not actually happen in order.
This can happen even on single-processor systems, where most people do not expect this kind of reordering - your compiler may reorder it for you. On SMP even if the compiler doesn't reorder things, the memory writes may be reordered between the caches of the separate processors.
If that doesn't seem to answer it for you, include more detail of your example in the question. Without the use of the variable it's impossible to definitively say whether such a usage is safe or not.
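One way to rule out the "thread B prints 0" outcome in the x/y example, sketched with C11 atomics (this assumes a compiler that ships <stdatomic.h>; a mutex around both accesses would work as well). The write to y is a release store and the read is an acquire load, so once thread B sees y == 2 it is also guaranteed to see x == 1. The function names and run_publish are made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int x;         /* ordinary data, published through y */
static atomic_int y;  /* the "ready" signal */

static void *thread_a(void *arg)
{
    (void)arg;
    x = 1;
    /* release: everything written before this store is visible to
       any thread that reads y == 2 with an acquire load */
    atomic_store_explicit(&y, 2, memory_order_release);
    return NULL;
}

static void *thread_b(void *arg)
{
    int *seen = arg;
    /* acquire: spin until the signal arrives (fine for a short demo) */
    while (atomic_load_explicit(&y, memory_order_acquire) != 2)
        continue;
    *seen = x;        /* guaranteed to read 1 */
    return NULL;
}

/* Returns the value of x that thread B observed. */
int run_publish(void)
{
    pthread_t a, b;
    int seen = -1;
    x = 0;
    atomic_store(&y, 0);
    int rc = pthread_create(&b, NULL, thread_b, &seen);
    assert(rc == 0);
    rc = pthread_create(&a, NULL, thread_a, NULL);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return seen;
}
```

Without the release/acquire pair (for example with two plain int variables), both the compiler and the hardware are free to make the write to y visible before the write to x.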
It depends on the work actually done by that statement. There can still be some cases where Something Bad happens - for example, if a C++ class has overloaded the = operator, and does anything nontrivial within that statement.
I have accidentally written code that did something like this with POD types (builtin primitive types), and it worked fine -- however, it's definitely not good practice, and I'm not confident that it's dependable.
Why not just lock the memory around this variable when you use it? In fact, if you somehow "know" this is the only write statement that can occur at some point in your code, why not just use the value 11 directly, instead of writing it to a shared variable?
(edit: I guess it's better to use a constant name instead of the magic number 11 directly in the code, btw.)
If you're using this to figure out when at least one thread has reached this statement, you could use a semaphore that starts at 1, and is decremented by the first thread that hits it.
I would expect the result to be undetermined, as in it would vary from compiler to compiler, language to language, OS to OS, and so on. So no, it is not safe.
Why would you want to do this, though? Adding a line to obtain a mutex lock is only one or two lines of code (in most languages), and would remove any possibility of a problem. If this is going to be too expensive, then you need to find an alternate way of solving the problem.
In general, this is not considered a safe thing to do unless your system provides atomic operations (operations that are guaranteed to complete indivisibly, so no other thread can observe them half-done).
The reason is that while the "C" statement looks simple, often there are a number of underlying assembly operations taking place.
Depending on your OS, there are a few things you could do:
Take a mutual exclusion semaphore (mutex) to protect access
In some OSes, you can temporarily disable preemption, which guarantees your thread will not be swapped out.
Some OS provide a writer or reader semaphore which is more performant than a plain old mutex.
Here's my take on the question.
You have two or more threads running that write to a variable, like a status flag or something, where you only want to know if one or more of them was true. Then in another part of the code (after the threads complete) you want to check and see if at least one thread set that status, for example:
bool flag = false
threadContainer tc
threadInputs inputs
check(input)
{
    ...do stuff to input
    if(success)
        flag = true
}
// start multiple threads
foreach(i in inputs)
    t = startthread(check, i)
    tc.add(t) // Keep track of all the threads started
foreach(t in tc)
    t.join( ) // Wait until each thread is done
if(flag)
    print "One of the threads was successful"
else
    print "None of the threads were successful"
I believe the above code would be OK, assuming you're fine with not knowing which thread set the status to true, and you can wait for all the multi-threaded stuff to finish before reading that flag. I could be wrong though.
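A hedged C rendering of the pseudo-code above. Two choices here are assumptions of this sketch and not part of the original answer: the flag is a C11 atomic_bool (concurrent plain writes are formally a data race in C11, even when every thread writes the same value), and "success" is arbitrarily defined as "the input is even" just to have something to test. The flag is only read after every thread has been joined.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_THREADS 16

static atomic_bool flag;  /* set by any thread that succeeds */

static void *check(void *arg)
{
    const int *input = arg;
    /* ...do stuff to input; "success" is hypothetical: the input is even */
    if (*input % 2 == 0)
        atomic_store(&flag, true);  /* every writer stores the same value */
    return NULL;
}

/* Returns 1 if at least one thread reported success, 0 otherwise. */
int run_checks(const int *inputs, int n)
{
    pthread_t threads[MAX_THREADS];
    if (n > MAX_THREADS)
        n = MAX_THREADS;
    atomic_store(&flag, false);
    for (int i = 0; i < n; i++) {
        int rc = pthread_create(&threads[i], NULL, check, (void *)&inputs[i]);
        assert(rc == 0);
    }
    for (int i = 0; i < n; i++)
        pthread_join(threads[i], NULL);  /* wait until each thread is done */
    return atomic_load(&flag) ? 1 : 0;
}
```

For example, run_checks on the inputs {1, 3, 4} should return 1, and on {1, 3, 5} it should return 0. The joins are what make the final read reliable: everything a thread wrote is visible to the thread that joined it.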
If the operation is atomic, you should be able to get by just fine. But I wouldn't do that in practice. It is better just to acquire a lock on the object and write the value.
Assuming that property will never be assigned anything other than 11, I don't see a reason for the assignment in the first place. Just make it a constant.
Assignment only makes sense when you intend to change the value, unless the act of assignment itself has other side effects, the way volatile writes have memory visibility side effects in Java. And if you change state shared between multiple threads, then you need to synchronize or otherwise "handle" the problem of concurrency.
When you assign a value, without proper synchronization, to some state shared between multiple threads, there are no guarantees about when the other threads will see that change. And no visibility guarantees means that it is possible that the other threads will never see the assignment.
Compilers, JITs, CPU caches: they're all trying to make your code run as fast as possible, and if you don't make any explicit requirements for memory visibility, then they will take advantage of that. If not on your machine, then on somebody else's.

Resources