Difference between mutex/spinlock/semaphore in this use case? - multithreading

I'm not sure if my answers to the following exercise is correct:
Suppose the program is run with two threads. Further suppose that the
following sequence of events occurs:
Time Thread 0 Thread 1
0 pthread_mutex_lock(&mut0) pthread_mutex_lock(&mut1)
1 pthread_mutex_lock(&mut1) pthread_mutex_lock(&mut0)
b. Would this be a problem if the program used busy-waiting (with two flag
variables) instead of mutexes?
c. Would this be a problem if the program used semaphores instead of mutexes?
Are the outcomes for this not the exact same in that they'd result in a deadlock?
For b) if we use a spinlock, the outcome would look something like this, right? (Let flag0 = flag1 = 1)
Time Thread 0 Thread 1
0 flag0--; flag1--;
1 while(flag1 == 0); while(flag2 == 0);
For c) I'm assuming they're referring to a binary semaphore, which is effectively the same thing as a mutex so it'd also result in deadlock.
Is there a mistake in my reasoning? My answers seem suspiciously simple but I don't see how spinlock/semaphores would change anything

Related

Synchronize multiple pthreads in a

I'm discovering the pthread library (in C) and I'm having some trouble understanding well a few things.
First of all, I understand what a mutex is, I understand how it works, ok, I also understand the concept of the cond, but I can't manage to use it properly (I don't really get how to combine the mutex and the cond)
This is, in pseudo-code, what I want to do :
thread :
loop :
// do something
end loop
end thread
So there is n threads, but each thread uses the same function. I want the inside of the loop to be executed in parallel by all the threads BUT each thread must be in the same iteration of the loop, meaning I don't care in what order the instructions inside the loop are executed between threads, but to start iteration 2 of a thread, all the other threads must have finished iteration 1 (etc).
So my question is : how do you do that ? Not particularly in a specific example, but theoretically.
EDIT
I manage to do it, I don't know if it's the proper way, but it's working :
global nbOfThreads
global nbOfIterations
thread :
lock(mutex0)
unlock(mutex0)
loop :
// Do something
lock(mutex1)
nbOfIterations++
if (nbOfIterations == nbOfThread) :
nbOfIterations = 0
broadcast(cond)
unlock(mutex1)
continue
end if
wait(cond, mutex1)
unlock(mutex1)
end loop
end thread
main (n) :
nbOfThreads = n
nbOfIterations = 0
lock(mutex0)
do nbOfThreads times : create(thread)
unlock(mutex0)
end main
I obviously tried to understand myself, but there are some things I don't understand :
The main one : WHY does a cond need to be pair with a mutex
In some examples I saw something like this :
// thread A :
while (!condition)
wait(&cond)
// thread B :
if (condition)
signal(&cond)
well I really don't get the point of this while loop, I thought wait put the thread in pause until the condition is true (until the other thread send the signal). I mean I would get it if it was an if instead of a while.
Thank you
WHY does a cond need.... because the (!condition) you reference almost certainly depends upon some bits of the object not changing while you reference them. Correspondingly, modifying the state of the object should be done in such a way as to appear atomic to any observer; thus a mutex. While you could rely on too-clever-by-half hackery like atomic types, there is also the problem of ‘what if it was modified just after you checked it’ -- a race condition. Thus the idiomatic lock(); while (!cond) { wait(); }.
The point of the while... The signal+wait is not a handoff of control; after the signal, any number of things could happen to the object before a particular thread returns from wait. Even though the condition might have been in the correct state, by the time thread A examines it, it may no longer be. At the point of exiting the while loop, thread A knows: The condition is in the state I desire, and I have exclusive access to the object.
Condition variables can have spurious wake-ups. The condition might not actually be true when the wait function returns.
Depending on your task, a different synchronization primitive, such as a barrier (see pthread_barrier_init) or a semaphore (sem_init) might be easier to use.

Understanding an exam answer on Semaphores

This question came up in a practice exam. I do not really understand how they got to the answers that they said are correct.
I was hoping for some help understanding the question and how to answer it?
Thank you!
I think it all comes to this:
wait (P): If the value of semaphore variable is not negative, decrements it by 1. If the semaphore variable is now negative, the process executing wait is blocked (i.e., added to the semaphore's queue) until the value is greater or equal to 1. Otherwise, the process continues execution, having used a unit of the resource.
Source and More information
For the correct answers:
1) Since the semaphore is initially equal 0, thread 3 will block on P(S) waiting for another thread to do V(S). This will only happen on thread 1, and after A is completed. So no matter how long statement A takes to execute, thread 3 will wait until the instruction V(S) is executed. So A will always be executed before F.
2) The same concept applies with B and G. Before G you have to execute P(T) and this will wait for an instruction V(T). This only happens after B has been executed.
3) Since A is executed before F as demonstrated in (1), and F is always executed before G, A is always executed before G.
As for the incorrect answers:
a) A is executed before E? Maybe, but not always. Because thread 1 and thread 2 have to wait for a semaphore, thread 2 could execute B, V(T) and E faster than A, so in this case the sentence (a) is false.
b) B is executed before F? Maybe, but not always. Why? To execute F, thread 3 only depends on thread 1 (the semaphore S) so C and A can execute very fast and then go to F while B can be still executing because it is very slow.
d) C is executed before D? Maybe, but not always. Again, C might take a long time to execute and thread 1, since it has not to wait for any semaphore, could execute all its instructions very fast, before C is completed.

Reusable Barrier Algorithm

I'm looking into the Reusable Barrier algorithm from the book "The Little Book Of Semaphores" (archived here).
The puzzle is on page 31 (Basic Synchronization Patterns/Reusable Barrier), and I have come up with a 'solution' (or not) which differs from the solution from the book (a two-phase barrier).
This is my 'code' for each thread:
# n = 4; threads running
# semaphore = n max., initialized to 0
# mutex, unowned.
start:
mutex.wait()
counter = counter + 1
if counter = n:
semaphore.signal(4) # add 4 at once
counter = 0
mutex.release()
semaphore.wait()
# critical section
semaphore.release()
goto start
This does seem to work, I've even inserted different sleep timers into different sections of the threads, and they still wait for all the threads to come before continuing each and every loop. Am I missing something? Is there a condition that this will fail?
I've implemented this using the Windows library Semaphore and Mutex functions.
Update:
Thank you to starblue for the answer. Turns out that if for whatever reason a thread is slow between mutex.release() and semaphore.wait() any of the threads that arrive to semaphore.wait() after a full loop will be able to go through again, since there will be one of the N unused signals left.
And having put a Sleep command for thread number 3, I got this result where one can see that thread 3 missed a turn the first time, with thread 1 having done 2 turns, and then catching up on the second turn (which was in fact its 1st turn).
Thanks again to everyone for the input.
One thread could run several times through the barrier while some other thread doesn't run at all.

pthreads: If I increment a global from two different threads, can there be sync issues?

Suppose I have two threads A and B that are both incrementing a ~global~ variable "count". Each thread runs a for loop like this one:
for(int i=0; i<1000; i++)
count++; //alternatively, count = count + 1;
i.e. each thread increments count 1000 times, and let's say count starts at 0. Can there be sync issues in this case? Or will count correctly equal 2000 when the execution is finished? I guess since the statement "count = count + 1" may break down into TWO assembly instructions, there is potential for the other thread to be swapped in between these two instructions? Not sure. What do you think?
Yes there can be sync issues in this case. You need to either protect the count variable with a mutex, or use a (usually platform specific) atomic operation.
Example using pthread mutexes
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
for(int i=0; i<1000; i++) {
pthread_mutex_lock(&mutex);
count++;
pthread_mutex_unlock(&mutex);
}
Using atomic ops
There is a prior discussion of platform specific atomic ops here:
UNIX Portable Atomic Operations
If you only need to support GCC, this approach is straightforward. If you're supporting other compilers, you'll probably have to make some per-platform decisions.
Count clearly needs to be protected with a mutex or other synchronization mechanism.
At a fundamental level, the count++ statment breaks down to:
load count into register
increment register
store count from register
A context switch could occur before/after any of those steps, leading to situations like:
Thread 1: load count into register A (value = 0)
Thread 2: load count into register B (value = 0)
Thread 1: increment register A (value = 1)
Thread 1: store count from register A (value = 1)
Thread 2: increment register B (value = 1)
Thread 2: store count from register B (value = 1)
As you can see, both threads completed one iteration of the loop, but the net result is that count was only incremented once.
You probably would also want to make count volatile to force loads & stores to go to memory, since a good optimizer would likely keep count in a register unless otherwise told.
Also, I would suggest that if this is all the work that's going to be done in your threads, performance will dramatically drop from all the mutex locking/unlocking required to keep it consistent. Threads should have much bigger work units to perform.
Yes, there can be sync problems.
As an example of the possible issues, there is no guarantee that an increment itself is an atomic operation.
In other words, if one thread reads the value for increment then gets swapped out, the other thread could come in and change it, then the first thread will write back the wrong value:
+-----+
| 0 | Value stored in memory (0).
+-----+
| 0 | Thread 1 reads value into register (r1 = 0).
+-----+
| 0 | Thread 2 reads value into register (r2 = 0).
+-----+
| 1 | Thread 2 increments r2 and writes back.
+-----+
| 1 | Thread 1 increments r1 and writes back.
+-----+
So you can see that, even though both threads have tried to increment the value, it's only increased by one.
This is just one of the possible problems. It may also be that the write itself is not atomic and one thread may update only part of the value before being swapped out.
If you have atomic operations that are guaranteed to work in your implementation, you can use them. Otherwise, use mutexes, That's what pthreads provides for synchronisation (and guarantees will work) so is the safest approach.
I guess since the statement "count = count + 1" may break down into TWO assembly instructions, there is potential for the other thread to be swapped in between these two instructions? Not sure. What do you think?
Don't think like this. You're writing C code and pthreads code. You don't have to ever think about assembly code to know how your code will behave.
The pthreads standard does not define the behavior when one thread accesses an object while another thread is, or might be, modifying it. So unless you're writing platform-specific code, you should assume this code can do anything -- even crash.
The obvious pthreads fix is to use mutexes. If your platform has atomic operations, you can use those.
I strongly urge you not to delve into detailed discussions about how it might fail or what the assembly code might look like. Regardless of what you might or might not think compilers or CPUs might do, the behavior of the code is undefined. And it's too easy to convince yourself you've covered every way you can think of that it might fail and then you miss one and it fails.

Is it ok to have multiple threads writing the same values to the same variables?

I understand about race conditions and how with multiple threads accessing the same variable, updates made by one can be ignored and overwritten by others, but what if each thread is writing the same value (not different values) to the same variable; can even this cause problems? Could this code:
GlobalVar.property = 11;
(assuming that property will never be assigned anything other than 11), cause problems if multiple threads execute it at the same time?
The problem comes when you read that state back, and do something about it. Writing is a red herring - it is true that as long as this is a single word most environments guarantee the write will be atomic, but that doesn't mean that a larger piece of code that includes this fragment is thread-safe. Firstly, presumably your global variable contained a different value to begin with - otherwise if you know it's always the same, why is it a variable? Second, presumably you eventually read this value back again?
The issue is that presumably, you are writing to this bit of shared state for a reason - to signal that something has occurred? This is where it falls down: when you have no locking constructs, there is no implied order of memory accesses at all. It's hard to point to what's wrong here because your example doesn't actually contain the use of the variable, so here's a trivialish example in neutral C-like syntax:
int x = 0, y = 0;
//thread A does:
x = 1;
y = 2;
if (y == 2)
print(x);
//thread B does, at the same time:
if (y == 2)
print(x);
Thread A will always print 1, but it's completely valid for thread B to print 0. The order of operations in thread A is only required to be observable from code executing in thread A - thread B is allowed to see any combination of the state. The writes to x and y may not actually happen in order.
This can happen even on single-processor systems, where most people do not expect this kind of reordering - your compiler may reorder it for you. On SMP even if the compiler doesn't reorder things, the memory writes may be reordered between the caches of the separate processors.
If that doesn't seem to answer it for you, include more detail of your example in the question. Without the use of the variable it's impossible to definitively say whether such a usage is safe or not.
It depends on the work actually done by that statement. There can still be some cases where Something Bad happens - for example, if a C++ class has overloaded the = operator, and does anything nontrivial within that statement.
I have accidentally written code that did something like this with POD types (builtin primitive types), and it worked fine -- however, it's definitely not good practice, and I'm not confident that it's dependable.
Why not just lock the memory around this variable when you use it? In fact, if you somehow "know" this is the only write statement that can occur at some point in your code, why not just use the value 11 directly, instead of writing it to a shared variable?
(edit: I guess it's better to use a constant name instead of the magic number 11 directly in the code, btw.)
If you're using this to figure out when at least one thread has reached this statement, you could use a semaphore that starts at 1, and is decremented by the first thread that hits it.
I would expect the result to be undetermined. As in it would vary from compiler to complier, langauge to language and OS to OS etc. So no, it is not safe
WHy would you want to do this though - adding in a line to obtain a mutex lock is only one or two lines of code (in most languages), and would remove any possibility of problem. If this is going to be two expensive then you need to find an alternate way of solving the problem
In General, this is not considered a safe thing to do unless your system provides for atomic operation (operations that are guaranteed to be executed in a single cycle).
The reason is that while the "C" statement looks simple, often there are a number of underlying assembly operations taking place.
Depending on your OS, there are a few things you could do:
Take a mutual exclusion semaphore (mutex) to protect access
in some OS, you can temporarily disable preemption, which guarantees your thread will not swap out.
Some OS provide a writer or reader semaphore which is more performant than a plain old mutex.
Here's my take on the question.
You have two or more threads running that write to a variable...like a status flag or something, where you only want to know if one or more of them was true. Then in another part of the code (after the threads complete) you want to check and see if at least on thread set that status... for example
bool flag = false
threadContainer tc
threadInputs inputs
check(input)
{
...do stuff to input
if(success)
flag = true
}
start multiple threads
foreach(i in inputs)
t = startthread(check, i)
tc.add(t) // Keep track of all the threads started
foreach(t in tc)
t.join( ) // Wait until each thread is done
if(flag)
print "One of the threads were successful"
else
print "None of the threads were successful"
I believe the above code would be OK, assuming you're fine with not knowing which thread set the status to true, and you can wait for all the multi-threaded stuff to finish before reading that flag. I could be wrong though.
If the operation is atomic, you should be able to get by just fine. But I wouldn't do that in practice. It is better just to acquire a lock on the object and write the value.
Assuming that property will never be assigned anything other than 11, then I don't see a reason for assigment in the first place. Just make it a constant then.
Assigment only makes sense when you intend to change the value unless the act of assigment itself has other side effects - like volatile writes have memory visibility side-effects in Java. And if you change state shared between multiple threads, then you need to synchronize or otherwise "handle" the problem of concurrency.
When you assign a value, without proper synchronization, to some state shared between multiple threads, then there's no guarantees for when the other threads will see that change. And no visibility guarantees means that it it possible that the other threads will never see the assignt.
Compilers, JITs, CPU caches. They're all trying to make your code run as fast as possible, and if you don't make any explicit requirements for memory visibility, then they will take advantage of that. If not on your machine, then somebody elses.

Resources