Check My Understanding of semaphores, please!
I understand the idea behind counting semaphores and binary semaphores. However, the difference between a spinlock and a semaphore implemented with signal() and wait() kind of blends together for me.
For example a spinlock has basically two values (a binary true/false for locked or unlocked). Therefore a spinlock is basically a binary semaphore, correct?
Any process attempting to enter the critical section while another process is inside will be unable to while it's locked, and will spin and continually check the lock status until it is unlocked and then is able to enter and lock it.
A semaphore using signal() and wait() functions essentially adds to or subtracts from a variable of some kind. There is a constraint regarding the critical section: it will only be opened when the variable has a certain value. An example implementation for a consumer process would be wait(full), then when it's full it executes, and at the end it calls signal(empty). A producer process, in turn, may wait(empty), execute when empty is true, and then call signal(full) when it finishes.
What is the difference between wait() and a spinlock that is essentially 'waiting' in a loop?
Unlike semaphores, spinlocks may be used in code that cannot sleep, such as interrupt handlers.
http://www.makelinux.net/ldd3/chp-5-sect-5.shtml
http://www.linuxjournal.com/article/5833
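To make the contrast between spinning and sleeping concrete, here is a minimal sketch in Python (chosen only for illustration; the thread itself is about kernel and POSIX primitives). The SpinLock class and count_with helper are hypothetical names, and a non-blocking Lock.acquire stands in for an atomic test-and-set. The spinlock waiter keeps polling, while threading.Semaphore(1).acquire() puts the waiter to sleep until a release.

```python
import threading


class SpinLock:
    """Toy spinlock: a waiter keeps polling instead of going to sleep."""

    def __init__(self):
        # A non-blocking acquire on a Lock stands in for an atomic test-and-set.
        self._flag = threading.Lock()

    def acquire(self):
        # Busy-wait: retry the test-and-set until it succeeds.
        while not self._flag.acquire(blocking=False):
            pass

    def release(self):
        self._flag.release()


def count_with(lock, n_threads=4, n_iters=500):
    """Increment a shared counter n_threads * n_iters times under `lock`."""
    total = 0

    def worker():
        nonlocal total
        for _ in range(n_iters):
            lock.acquire()      # spins (SpinLock) or sleeps (Semaphore)
            try:
                total += 1      # critical section
            finally:
                lock.release()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total
```

Both count_with(SpinLock()) and count_with(threading.Semaphore(1)) protect the counter equally well; the difference is only in what a waiter does while it cannot enter, which is why the spinning variant is the one usable where sleeping is not allowed.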
When threads are blocked in sem_wait on a semaphore and I call sem_destroy from another thread, the waiting threads are not woken up.
In the case of a mutex, pthread_mutex_destroy returned EBUSY when there were waiting threads.
However, sem_destroy returned 0 and errno was also 0.
I want sem_destroy to block further access to the destroyed semaphore and to wake up the waiting threads.
This is possible with semaphore handles on Windows.
Please advise. Thank you.
POSIX says this about sem_destroy:
The effect of destroying a semaphore upon which other threads are currently blocked is undefined.
It specifically doesn't say that other threads are woken up. In fact, if sem_t contains a pointer to memory, what it almost certainly does do is free the memory, meaning you then have a use-after-free security problem. (Whether that is the case depends on your libc.)
The general approach of allocation for mutexes and semaphores is that they should be either allocated and freed with their relevant data structure, or they should be allocated before the relevant code needs them and then freed after the entire code is done with them. In C, you cannot safely deallocate data structures (e.g., with sem_destroy) that are in use.
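If you want to wake up all users of the semaphore, you must increment it until all users have awoken. You can call sem_getvalue to check whether anyone is waiting on the semaphore and then call sem_post to increment it. Be aware that POSIX allows sem_getvalue to report blocked waiters either as a negative value or simply as 0, so this check is not reliable on every platform. Only once all waiters have woken can you safely destroy it. Note that this can have a race condition, depending on your code.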
However, note that you must be careful that the other code does not continue to use the semaphore after it's destroyed, such as by trying to re-acquire it in a loop. If you are careful to structure your code properly, then you can have confidence that this won't happen.
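The "wake everyone, then retire the semaphore" idea can be sketched as follows. This is an illustration in Python, which has no sem_destroy; the RetirableSemaphore class, its own waiter count and its closed flag are all hypothetical additions, not part of POSIX or of Python's threading module. It tracks waiters itself instead of relying on sem_getvalue.

```python
import threading


class RetirableSemaphore:
    """Semaphore wrapper with a cooperative shutdown (hypothetical helper)."""

    def __init__(self, value=0):
        self._sem = threading.Semaphore(value)
        self._guard = threading.Lock()   # protects _waiters and _closed
        self._waiters = 0
        self._closed = False

    def wait(self):
        """Return True if a unit was acquired, False if the semaphore is retired."""
        with self._guard:
            if self._closed:
                return False             # refuse new users after retirement
            self._waiters += 1
        self._sem.acquire()              # may block
        with self._guard:
            self._waiters -= 1
            return not self._closed

    def post(self):
        self._sem.release()

    def waiting(self):
        with self._guard:
            return self._waiters

    def retire(self):
        """Mark the semaphore closed, then post once per counted waiter."""
        with self._guard:
            self._closed = True
            pending = self._waiters
        for _ in range(pending):
            self._sem.release()
```

After retire() returns, the caller would still join the waiting threads before dropping the object, which corresponds to calling sem_destroy only once nobody can touch the semaphore any more.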
What do signal() and wait() do in a semaphore algorithm?
I know that one of them does S++ and the other S-- but I am not sure which one does which. I have checked out the signal algorithm and it seems to show that signal brings the counter down to 0.
Condition variables are the ones with signal and wait.
A condition variable is used when you want a thread to wait until a certain condition is met.
while(!canProceed) { cond.wait(); }
When another thread wants to unblock these blocked threads, it simply calls signal (to unblock one) or broadcast (to unblock all).
canProceed = true
cond.broadcast()
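A runnable version of the two fragments above might look like this. It is a sketch in Python, where threading.Condition bundles the mutex with the wait queue and notify_all plays the role of broadcast; the function name broadcast_demo is made up for the example.

```python
import threading


def broadcast_demo(n_waiters=3):
    """Start n_waiters blocked threads, release them all, return who woke up."""
    cond = threading.Condition()     # lock + wait queue in one object
    can_proceed = False
    woken = []

    def waiter(i):
        with cond:
            # Always re-check the flag: a wakeup alone proves nothing.
            while not can_proceed:
                cond.wait()          # releases the lock while sleeping
            woken.append(i)

    threads = [threading.Thread(target=waiter, args=(i,)) for i in range(n_waiters)]
    for t in threads:
        t.start()

    with cond:
        can_proceed = True           # change the state under the lock
        cond.notify_all()            # broadcast: wake every waiter

    for t in threads:
        t.join()
    return sorted(woken)
```

Because the waiters test can_proceed before sleeping, the demo also works if the broadcast happens before some waiter has reached cond.wait().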
Semaphores are simply generalizations of a mutex. While a mutex allows for one thread inside a given critical section, semaphores allow for N threads inside.
Threads initially wait to enter a critical section; after they're done they post (at least using the pthreads API).
semaphore.wait();
do_stuff();
semaphore.post();
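As a rough illustration of "N threads inside", the following Python sketch measures how many threads are in the critical section at once. The helper name max_concurrency and the short sleep standing in for do_stuff() are assumptions for the example.

```python
import threading
import time


def max_concurrency(limit, n_threads):
    """Run n_threads workers behind Semaphore(limit); return the peak occupancy."""
    sem = threading.Semaphore(limit)
    guard = threading.Lock()         # protects the two counters below
    inside = 0
    peak = 0

    def worker():
        nonlocal inside, peak
        sem.acquire()                # wait: blocks once `limit` threads are inside
        try:
            with guard:
                inside += 1
                peak = max(peak, inside)
            time.sleep(0.01)         # placeholder for do_stuff()
            with guard:
                inside -= 1
        finally:
            sem.release()            # post: lets one waiting thread in

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak
```

With limit set to 1 the semaphore behaves like a mutex; with a larger limit the peak never exceeds that limit.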
I have two questions about mutexes:
1. When a mutex variable equals 1 and we do a signal() operation on it, what is expected to happen?
2. When the mutex equals 0 and we do a wait(), the thread is blocked and the mutex stays 0, correct? After a while, another thread does a signal() operation and the blocked thread is released. What will be the value of the mutex now? 0 or 1?
So conceptually mutex has 2 states: locked and unlocked. If it is represented by 0 or 1 is not important here.
If you unlock (i.e. signal) a mutex it changes its state from locked to unlocked. Further unlocking doesn't change its state and it actually doesn't do anything.
If a mutex is unlocked and you call wait then the call does not block: the mutex becomes locked and the thread continues its execution.
When a mutex is locked and you call wait then the thread is blocked. When another thread calls unlock, the blocked thread is released and takes the mutex itself, so the mutex is unlocked only for that instant and is then locked again by the woken thread.
The most important thing is that unlock and lock operations are atomic in the sense that parallel calls cannot overlap each other to produce corrupted result (formally: parallel calls to lock/unlock are always equivalent to some serialized call history). Otherwise the whole concept of mutex would be simply stupid. :)
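The state transitions can be checked with a small Python sketch, using threading.Lock as a stand-in for the abstract mutex (lock for wait, release for signal). The function name mutex_states is made up. One Python-specific difference worth noting: releasing an already unlocked threading.Lock raises RuntimeError instead of silently doing nothing.

```python
import threading


def mutex_states():
    """Walk a mutex through unlocked -> locked -> unlocked and record each step."""
    m = threading.Lock()
    states = []
    states.append(m.locked())                 # starts unlocked
    m.acquire()                               # wait on an unlocked mutex: no blocking
    states.append(m.locked())                 # now locked
    states.append(m.acquire(blocking=False))  # a second lock attempt cannot succeed
    m.release()                               # signal: back to unlocked
    states.append(m.locked())
    return states
```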
After reading the comments (and the original unedited question) it is clear that there are enough people out there who believe binary semaphores to be interchangeable with mutexes. If we speak in practical terms (that is, pthread mutexes and System V semaphores) they are very different. I will try to outline the most important differences below.
Conceptual ownership. Mutexes are owned by their locker; semaphores are not owned by anybody. This leads to two distinctions. The very important one is that mutexes can (should) only be unlocked by the owner (the locking thread), while semaphores can be unlocked by any thread (see below for permissions). The less important one is that mutexes can be made re-entrant, that is, they can be locked multiple times by the owner thread, while semaphores can't behave that way. I say it is less important because re-entrant mutexes almost always indicate a design flaw.
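Both ownership points can be observed in a short Python sketch. Python is used only as an illustration here, and the mapping is an assumption: threading.RLock plays the role of an owner-checked, re-entrant mutex, and threading.Semaphore the role of the ownerless semaphore (Python's plain threading.Lock does not track ownership at all). The two helper names are hypothetical.

```python
import threading


def release_from_other_thread(primitive):
    """Acquire here, try to release from a second thread, report the outcome."""
    primitive.acquire()
    outcome = []

    def other():
        try:
            primitive.release()
            outcome.append("released")
        except RuntimeError:
            # RLock refuses a release by a thread that does not own it.
            outcome.append("refused")

    t = threading.Thread(target=other)
    t.start()
    t.join()
    return outcome[0]


def can_relock(primitive):
    """Acquire once, then try a second non-blocking acquire from the same thread."""
    primitive.acquire()
    return primitive.acquire(blocking=False)
```

The semaphore can be released by a thread that never acquired it, while the owner-checked lock cannot; conversely the owner can take the re-entrant lock a second time, whereas a binary semaphore simply runs out of units.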
Semaphores are objects which are more or less independent of the user. They can be created, used and destroyed by completely unrelated processes or threads, which do not even have to know anything about each other (or execute at the same time). For example, one process might create a semaphore and then die, other processes can use it, and a third process can remove it. Semaphores have permissions associated with them (not unlike file permissions), while mutexes have no such thing: anybody who has access to a mutex can technically do anything with it.
Semaphores are process-shared. That is, they can be used by multiple processes without extra effort. Default mutexes are single-process only; if the same mutex is to be used by multiple processes, it has to be created in a special mode.
I would like to know how many threads are waiting on a lock so I would be able to destroy it safely.
The problem is that I can't destroy the lock when someone holds it or someone is waiting on it.
My program can make sure that no new requests are made to acquire the lock, but how can I know when all the threads that waited on it are done with it?
I thought about a condition variable, but I suspect it will create problems.
dlv, could you add some code snippet to your description.
You should probably be using condition variables.
Each thread will block in pthread_cond_wait() until the other thread signals it to wake up. This will not cause a deadlock. It can easily be extended to many threads, by allocating one int, pthread_cond_t and pthread_mutex_t per thread.
pthread_cond_wait() blocks the calling thread until the specified condition is signalled. This routine should be called while mutex is locked, and it will automatically release the mutex while it waits. After signal is received and thread is awakened, mutex will be automatically locked for use by the thread. The programmer is then responsible for unlocking mutex when the thread is finished with it.
The pthread_cond_signal() routine is used to signal (or wake up) another thread which is waiting on the condition variable. It should be called after mutex is locked, and must unlock mutex in order for pthread_cond_wait() routine to complete.
The pthread_cond_broadcast() routine should be used instead of pthread_cond_signal() if more than one thread is in a blocking wait state.
It is a logical error to call pthread_cond_signal() before calling pthread_cond_wait().
Proper locking and unlocking of the associated mutex variable is essential when using these routines. For example:
Failing to lock the mutex before calling pthread_cond_wait() may cause it NOT to block.
Failing to unlock the mutex after calling pthread_cond_signal() may not allow a matching pthread_cond_wait() routine to complete (it will remain blocked).
If threads that can use the mutex still exist or might be created in the future then don't delete it.
You do know and are tracking what threads are created, right?
If, for some reason, you cannot keep track of the threads using a resource, your only way out is to leak the resource. It can never be safely deleted because you never know when you are done using it.
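In other words, the safe point for destroying the lock is simply "after every thread that could touch it has been joined". A minimal Python sketch of that ordering follows (the function name run_and_retire is made up; in pthreads terms the joins correspond to pthread_join and the final step to pthread_mutex_destroy).

```python
import threading


def run_and_retire(n_threads=4):
    """Use a lock from several tracked threads, then retire it after joining them."""
    lock = threading.Lock()
    done = []

    def worker(i):
        with lock:               # every user of the lock is a tracked thread
            done.append(i)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                 # once all joins return, nobody holds or awaits the lock

    # Only here is it safe to destroy the lock; in Python it is simply
    # dropped when the function returns.
    return sorted(done)
```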
Say you had a counter that counted the threads using a mutex. That counter would need its own mutex. Then how do you decide when to delete that one?
That way of thinking is the road that leads to hell. You could do what you want with condition variables, but the result would be an extremely weak design.
Assuming you managed to create such a monster, it would basically allow you to kill "safely" any other thread regardless of its internal state. Except for a quick and dirty panic exit (in case of some internal software error), this is the worst possible way of solving synchronization issues.
A design relying on such tricks would have to create implicit synchronizations between tasks to make sure the terminations occur in the proper order. A lot of software is designed that way, and most of it allows mediocre programmers to make a living by maintaining the pile of crap they created in the first place.
Task termination should be an issue solved at global design level, not by a toolbox of wonky objects that allow you to twist synchronization any odd way.
When to use a semaphore and when to use a conditional variable?
Locks are used for mutual exclusion. When you want to ensure that a piece of code is atomic, put a lock around it. You could theoretically use a binary semaphore to do this, but that's a special case.
Semaphores and condition variables build on top of the mutual exclusion provided by locks and are used for providing synchronized access to shared resources. They can be used for similar purposes.
A condition variable is generally used to avoid busy waiting (looping repeatedly while checking a condition) while waiting for a resource to become available. For instance, if you have a thread (or multiple threads) that can't continue onward until a queue is empty, the busy waiting approach would be to just do something like:
//pseudocode
while(!queue.empty())
{
    sleep(1);
}
The problem with this is that you're wasting processor time by having this thread repeatedly check the condition. Why not instead have a synchronization variable that can be signaled to tell the thread that the resource is available?
//pseudocode
syncVar.lock.acquire();
while(!queue.empty())
{
    syncVar.wait();
}
//do stuff with queue
syncVar.lock.release();
Presumably, you'll have a thread somewhere else that is pulling things out of the queue. When the queue is empty, it can call syncVar.signal() to wake up a random thread that is sitting asleep on syncVar.wait() (or there's usually also a signalAll() or broadcast() method to wake up all the threads that are waiting).
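Here is one way the "wait until the queue is empty" pattern could look as runnable code, sketched in Python with threading.Condition playing the role of syncVar. The DrainableQueue class and its method names are hypothetical.

```python
import threading
from collections import deque


class DrainableQueue:
    """Queue whose users can sleep until it has been emptied."""

    def __init__(self, items):
        self._items = deque(items)
        self._cond = threading.Condition()   # lock + wait queue (the "syncVar")

    def pop(self):
        with self._cond:
            item = self._items.popleft()
            if not self._items:
                # The queue just became empty: wake everyone waiting for that.
                self._cond.notify_all()
            return item

    def wait_until_empty(self):
        with self._cond:
            # Same shape as the pseudocode: loop on the condition, sleep inside.
            while self._items:
                self._cond.wait()            # releases the lock while asleep
```

The waiting thread consumes no processor time between wakeups, unlike the sleep(1) polling loop, and it reacts as soon as the last item is popped.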
I generally use synchronization variables like this when I have one or more threads waiting on a single particular condition (e.g. for the queue to be empty).
Semaphores can be used similarly, but I think they're better used when you have a shared resource that can be available and unavailable based on some integer number of available things. Semaphores are good for producer/consumer situations where producers are allocating resources and consumers are consuming them.
Think about if you had a soda vending machine. There's only one soda machine and it's a shared resource. You have one thread that's a vendor (producer) who is responsible for keeping the machine stocked and N threads that are buyers (consumers) who want to get sodas out of the machine. The number of sodas in the machine is the integer value that will drive our semaphore.
Every buyer (consumer) thread that comes to the soda machine calls the semaphore down() method to take a soda. This will grab a soda from the machine and decrement the count of available sodas by 1. If there are sodas available, the code will just keep running past the down() statement without a problem. If no sodas are available, the thread will sleep here waiting to be notified of when soda is made available again (when there are more sodas in the machine).
The vendor (producer) thread would essentially be waiting for the soda machine to be empty. The vendor gets notified when the last soda is taken from the machine (and one or more consumers are potentially waiting to get sodas out). The vendor would restock the soda machine with the semaphore up() method, the available number of sodas would be incremented each time and thereby the waiting consumer threads would get notified that more soda is available.
The wait() and signal() methods of a synchronization variable tend to be hidden within the down() and up() operations of the semaphore.
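The buyer side of the soda machine can be sketched in a few lines of Python, where Semaphore.acquire is down() and Semaphore.release is up(). The function name soda_machine and its parameters are invented for the example, and the vendor here simply restocks once rather than waiting to be notified.

```python
import threading


def soda_machine(n_buyers=5, restock=5):
    """Buyers block on an empty machine until the vendor restocks it."""
    sodas = threading.Semaphore(0)       # the machine starts empty
    bought = []

    def buyer(i):
        sodas.acquire()                  # down(): sleep until a soda is available
        bought.append(i)                 # took one soda

    buyers = [threading.Thread(target=buyer, args=(i,)) for i in range(n_buyers)]
    for t in buyers:
        t.start()

    for _ in range(restock):
        sodas.release()                  # up(): one more soda, wakes one buyer

    for t in buyers:
        t.join()
    return sorted(bought)
```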
Certainly there's overlap between the two choices. There are many scenarios where a semaphore or a condition variable (or set of condition variables) could both serve your purposes. Both semaphores and condition variables are associated with a lock object that they use to maintain mutual exclusion, but then they provide extra functionality on top of the lock for synchronizing thread execution. It's mostly up to you to figure out which one makes the most sense for your situation.
That's not necessarily the most technical description, but that's how it makes sense in my head.
Let's reveal what's under the hood.
A condition variable is essentially a wait queue that supports blocking-wait and wakeup operations, i.e. you can put a thread into the wait queue and set its state to BLOCK, and take a thread out of it and set its state to READY.
Note that to use a condition variable, two other elements are needed:
a condition (typically implemented by checking a flag or a counter)
a mutex that protects the condition
The protocol then becomes,
acquire mutex
check condition
block and release mutex if condition is true, else release mutex
A semaphore is essentially a counter + a mutex + a wait queue, and it can be used as is, without external dependencies. You can use it either as a mutex or as a condition variable.
Therefore, a semaphore can be treated as a more sophisticated structure than a condition variable, while the latter is more lightweight and flexible.
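The "counter + mutex + wait queue" decomposition can be made explicit by building a semaphore out of exactly those parts. The sketch below uses Python's threading.Condition (which pairs a lock with a wait queue); the class name CountingSemaphore is hypothetical, and real implementations are considerably more careful about performance and fairness.

```python
import threading


class CountingSemaphore:
    """Semaphore assembled from a counter, a mutex and a wait queue."""

    def __init__(self, value=0):
        self._value = value                                  # the counter
        self._cond = threading.Condition(threading.Lock())   # mutex + wait queue

    def down(self):
        """wait / P: take one unit, sleeping while none are available."""
        with self._cond:
            while self._value == 0:
                self._cond.wait()
            self._value -= 1

    def up(self):
        """signal / V: return one unit and wake one sleeper, if any."""
        with self._cond:
            self._value += 1
            self._cond.notify()

    def value(self):
        with self._cond:
            return self._value
```

Note which operation moves the counter in which direction: down (wait) decrements and may block, up (signal) increments and never blocks.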
Semaphores can be used to implement exclusive access to variables, however they are meant to be used for synchronization. Mutexes, on the other hand, have a semantics which is strictly related to mutual exclusion: only the process which locked the resource is allowed to unlock it.
Unfortunately you cannot implement synchronization with mutexes, that's why we have condition variables. Also notice that with condition variables you can unlock all the waiting threads in the same instant by using the broadcast unlocking. This cannot be done with semaphores.
Semaphores and condition variables are very similar and are used mostly for the same purposes. However, there are minor differences that could make one preferable. For example, to implement barrier synchronization you would not be able to use a semaphore. But a condition variable is ideal.
Barrier synchronization is when you want all of your threads to wait until everyone has arrived at a certain point in the thread function. This can be implemented with a static variable that is initially set to the total number of threads and is decremented by each thread when it reaches the barrier. This means we want each thread to sleep until the last one arrives. A semaphore would do the exact opposite! With a semaphore, each thread would keep running and the last thread (which will set the semaphore value to 0) will go to sleep.
A condition variable, on the other hand, is ideal. When each thread gets to the barrier we check whether our static counter is zero. If not, we put the thread to sleep with the condition variable's wait function. When the last thread arrives at the barrier, the counter is decremented to zero and this last thread calls the condition variable's broadcast function, which wakes up all the other threads!
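The barrier described above can be sketched in Python as follows. SimpleBarrier is a hypothetical one-shot class written for illustration (Python's standard library already provides threading.Barrier for real use); notify_all is the broadcast that wakes every sleeping thread.

```python
import threading


class SimpleBarrier:
    """One-shot barrier built from a counter and a condition variable."""

    def __init__(self, parties):
        self._remaining = parties            # threads that have not arrived yet
        self._cond = threading.Condition()

    def arrive(self):
        with self._cond:
            self._remaining -= 1
            if self._remaining == 0:
                self._cond.notify_all()      # last arrival wakes everyone
            else:
                while self._remaining > 0:   # earlier arrivals sleep here
                    self._cond.wait()


def barrier_demo(n_threads=4):
    """Return the order of 'before'/'after' events across all threads."""
    barrier = SimpleBarrier(n_threads)
    events = []

    def worker():
        events.append("before")
        barrier.arrive()
        events.append("after")

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events
```

No thread can record "after" until every thread has recorded "before", which is exactly the barrier property.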
I file condition variables under monitor synchronization. I've generally seen semaphores and monitors as two different synchronization styles. There are differences between the two in terms of how much state data is inherently kept and how you want to model code - but there really isn't any problem that can be solved by one but not the other.
I tend to code towards monitor form; in most languages I work in that comes down to mutexes, condition variables, and some backing state variables. But semaphores would do the job too.
A semaphore needs to know the count up front for initialization. There is no such requirement for condition variables.
The mutex and condition variables are derived from the semaphore.
For mutex, the semaphore uses two states: 0, 1
For condition variables the semaphore uses counter.
They are like syntactic sugar
conditionalVar + mutex == semaphore