Semaphore vs. Monitors - what's the difference? - multithreading

What are the major differences between a Monitor and a Semaphore?

A Monitor is an object designed to be accessed from multiple threads. The member functions or methods of a monitor object will enforce mutual exclusion, so only one thread may be performing any action on the object at a given time. If one thread is currently executing a member function of the object then any other thread that tries to call a member function of that object will have to wait until the first has finished.
A Semaphore is a lower-level object. You might well use a semaphore to implement a monitor. A semaphore essentially is just a counter. When the counter is positive, if a thread tries to acquire the semaphore then it is allowed, and the counter is decremented. When a thread is done then it releases the semaphore, and increments the counter.
If the counter is already zero when a thread tries to acquire the semaphore then it has to wait until another thread releases the semaphore. If multiple threads are waiting when a thread releases a semaphore then one of them gets it. The thread that releases a semaphore need not be the same thread that acquired it.
A monitor is like a public toilet. Only one person can enter at a time. They lock the door to prevent anyone else coming in, do their stuff, and then unlock it when they leave.
A semaphore is like a bike hire place. They have a certain number of bikes. If you try and hire a bike and they have one free then you can take it, otherwise you must wait. When someone returns their bike then someone else can take it. If you have a bike then you can give it to someone else to return --- the bike hire place doesn't care who returns it, as long as they get their bike back.

Following explanation actually explains how wait() and signal() of monitor differ from P and V of semaphore.
The wait() and signal() operations on condition variables in a monitor are similar to P and V operations on counting semaphores.
A wait statement can block a process's execution, while a signal statement can cause another process to be unblocked. However, there are some differences between them. When a process executes a P operation, it does not necessarily block that process because the counting semaphore may be greater than zero. In contrast, when a wait statement is executed, it always blocks the process. When a task executes a V operation on a semaphore, it either unblocks a task waiting on that semaphore or increments the semaphore counter if there is no task to unlock. On the other hand, if a process executes a signal statement when there is no other process to unblock, there is no effect on the condition variable. Another difference between semaphores and monitors is that users awaken by a V operation can resume execution without delay. Contrarily, users awaken by a signal operation are restarted only when the monitor is unlocked. In addition, a monitor solution is more structured than the one with semaphores because the data and procedures are encapsulated in a single module and that the mutual exclusion is provided automatically by the implementation.
Link: here for further reading. Hope it helps.

Semaphore allows multiple threads (up to a set number) to access a shared object. Monitors allow mutually exclusive access to a shared object.
Monitor
Semaphore

One Line Answer:
Monitor: controls only ONE thread at a time can execute in the monitor. (need to acquire lock to execute the single thread)
Semaphore: a lock that protects a shared resource. (need to acquire the lock to access resource)

A semaphore is a signaling mechanism used to coordinate between threads. Example: One thread is downloading files from the internet and another thread is analyzing the files. This is a classic producer/consumer scenario. The producer calls signal() on the semaphore when a file is downloaded. The consumer calls wait() on the same semaphore in order to be blocked until the signal indicates a file is ready. If the semaphore is already signaled when the consumer calls wait, the call does not block. Multiple threads can wait on a semaphore, but each signal will only unblock a single thread.
A counting semaphore keeps track of the number of signals. E.g. if the producer signals three times in a row, wait() can be called three times without blocking. A binary semaphore does not count but just have the "waiting" and "signalled" states.
A mutex (mutual exclusion lock) is a lock which is owned by a single thread. Only the thread which have acquired the lock can realease it again. Other threads which try to acquire the lock will be blocked until the current owner thread releases it. A mutex lock does not in itself lock anything - it is really just a flag. But code can check for ownership of a mutex lock to ensure that only one thread at a time can access some object or resource.
A monitor is a higher-level construct which uses an underlying mutex lock to ensure thread-safe access to some object. Unfortunately the word "monitor" is used in a few different meanings depending on context and platform and context, but in Java for example, a monitor is a mutex lock which is implicitly associated with an object, and which can be invoked with the synchronized keyword. The synchronized keyword can be applied to a class, method or block and ensures only one thread can execute the code at a time.

Semaphore :
Using a counter or flag to control access some shared resources in a concurrent system, implies use of Semaphore.
Example:
A counter to allow only 50 Passengers to acquire the 50 seats (Shared resource) of any Theatre/Bus/Train/Fun ride/Classroom. And to allow a new Passenger only if someone vacates a seat.
A binary flag indicating the free/occupied status of any Bathroom.
Traffic lights are good example of flags. They control flow by regulating passage of vehicles on Roads (Shared resource)
Flags only reveal the current state of Resource, no count or any other information on the waiting or running objects on the resource.
Monitor :
A Monitor synchronizes access to an Object by communicating with threads interested in the object, asking them to acquire access or wait for some condition to become true.
Example:
A Father may acts as a monitor for her daughter, allowing her to date only one guy at a time.
A school teacher using baton to allow only one child to speak in the class.
Lastly a technical one, transactions (via threads) on an Account object synchronized to maintain integrity.

When a semaphore is used to guard a critical region, there is no direct relationship between the semaphore and the data being protected. This is part of the reason why semaphores may be dispersed around the code, and why it is easy to forget to call wait or notify, in which case the result will be, respectively, to violate mutual exclusion or to lock the resource permanently.
In contrast, niehter of these bad things can happen with a monitor. A monitor is tired directly to the data (it encapsulates the data) and, because the monitor operations are atomic actions, it is impossible to write code that can access the data without calling the entry protocol. The exit protocol is called automatically when the monitor operation is completed.
A monitor has a built-in mechanism for condition synchronisation in the form of condition variable before proceeding. If the condition is not satisfied, the process has to wait until it is notified of a change in the condition. When a process is waiting for condition synchronisation, the monitor implementation takes care of the mutual exclusion issue, and allows another process to gain access to the monitor.
Taken from The Open University M362 Unit 3 "Interacting process" course material.

Related

Kernel Programming - Mutexes

So I'm trying to use mutex_init(), mutex_lock(), mutex_unlock() for thread synchronization.
I am currently trying to schedule threads in a round robin fashion(but more than 1 thread could be running at a time) and I set the current state of a thread to TASK_INTERRUPTIBLE, followed by waking up another thread whose PID, I have in a list.
I need to iterate over this list for my logic.
As I understand it, I need to lock this list as I access its elements, or another thread might miss a new entry while I'm making changes to it. Also, as one mutex has locked a resource, no other mutex can unlock it, until the original mutex releases it.
But, I'm still not sure if I'm locking it correctly. (I release the lock before I call schedule(), and re-lock after that)
I declare a mutex locally within a thread and lock the list. After my current thread locks
mutex_lock(&lock);
and I iterate over the list, till I find something(or ends if it doesn't find anything), then unlocks.
mutex_unlock(&lock);
I assume locking while I iterate is legal. I have never seen examples of this though.
Also, is it normal for the process to have a state of (TASK_UNINTERRUPTIBLE) while it holds a mutex lock?
EDIT : I am adding some more information based on the answer below.
It is possible my program may be run on a virtual machine with a single core. Therefore, I do not want to risk infinite polling using spin_lock().
I am trying to maintain scheduling between threads that have a certain id. For example if there are 4 threads. 2 in set 'A' and 2 in set 'B'. I allow only 1 thread to run in each set. But I switch between threads in a given set. However, a thread in set 'A' should not switch to any thread in set 'B'
(I know the kernel scheduler wont be perfect, so an approximate switching will do).
My Reasoning for TASK_STATE's:
1) Initial thread that gets created is running.
2) If another thread in the same set is running (and this one hasn't executed for a given time). Set other thread to TASK_INTERRUPTIPLE, while calling schedule(); Note: There can be more than 2 threads in each set, but let's keep it simple by considering only 2 for now.
3) If it has executed for enough time, set this task to TASK_INTERRUPTIPLE, set the other task in the same set to TASK_RUNNING, while calling schedule();
All this logic happens while I am accessing certain data structures which are locked by a (now) Global Mutex. I unlock the mutex just before I call schedule(), and instantly re-lock afterward. After my logic part is done, I completely unlock the mutex.
Is there anything fundamentally wrong with the approach?
As I understand it, I need to lock this list as I access its elements
Yes, that is true. But if you use a mutex, you're going to be really sad because a call to lock/unlock is a call to the scheduler. Therefore, calling it from inside the scheduler should result in deadlock. What you need to do depends on if your processor is multi-core or (the mythical) single-core. (Is this a virtual system?) On a single-core processor you can disable interrupts. On a multi-core processor, disabling interrupts is not sufficient (it only disables interrupts for that one core, and another core may still be interrupted). The simplest thing to do on a multi-core is to use a spinlock. Unlike the mutex, both of these locking mechanisms can be unlocked from different threads.
I set the current state of a thread to TASK_INTERRUPTIBLE
Is the thread being taken off the CPU? If so, it's not running, so I suspect that TASK_INTERRUPTIBLE is the wrong state. It would be helpful if you could list the possible states for me or if you could describe what the state is supposed to indicate. Because to me "TASK_INTERRUPTIBLE" sounds like a running task.
I declare a mutex locally within a thread and lock the list
Local mutexes are a red flag! The resource you are locking should be protected by a mutex with the same scope. If the list is global, it should have a global mutex to protect it. Threads that want to use the list must first acquire its mutex. Of course, as I already talked about, you probably want to use a different kind of locking to protect the list of ready-to-run processes.
I assume locking while I iterate is legal
It is perfectly legal (assuming of course that your mutual exclusion scheme is bug-free). In fact, it's required. If another thread were allowed to, for example, remove a node from the list while you were reading it, you could end up dereferencing a deleted node.
Also, is it normal for the process to have a state of TASK_UNINTERRUPTIBLE while it holds a mutex lock?
No, not while it holds the lock if the process is currently running on a CPU. A mutex is available to user code. If holding a mutex made the process uninterruptible, that would mean that a process could hijack the system by simply locking a mutex and never releasing it. Now, you will find that the lock and unlock functions need to be uninterruptible on a single-core processor. However, it doesn't make sense to set the state for the process because it's actually the scheduler that must not be interrupted.

semaphore and mutex locking concept

I read one of the differences between semaphore and mutex is in case of mutex the process/thread (which ever is having the lock) can only release the lock. But in the case of the semaphore any other process can release the semaphore. My doubt arises when a process that does not have the semaphore with it can release the semaphore. What is the use of having a semaphore?
Let's say I have two processes A and B. Assume process A is having a semaphore with it and executing some critical task. Now let us say process B sends a signal to release the semaphore. In this scenario, will process A release the semaphore even if it is executing some critical task?
You are making half-sense. It is not about ownership. Partner-release in semaphores (and mutexes) is usable, for instance, in my favorite interview question of thread ping-pong. As a matter of fact, I have specifically tried to partner-release a mutex on 3 implementations available to me at a time (Linux/Solaris/AIX) and partner-release did work for mutexes as expected - i.e. mutex was successsfully released and threads blocking on it resumed execution. However, this is, of course, prohibited by Posix.
I think you might be confused on the whole set of differences between a semaphore and a mutex. A mutex provides mutual exclusion. A semaphore counts until it reaches a level where it starts excluding. A semaphore that counted to one would give similar semantics to a mutex though.
A good example would be a television set. Only so many people can watch the same television set, so protecting it with a semaphore would make sense. Anyone can stop watching the television. The remote control for the television can only be operated by one person at a time though, so you could protect it with a mutex.
Some reading...
https://en.wikipedia.org/wiki/Mutual_exclusion
https://en.wikipedia.org/wiki/Semaphore_%28programming%29
"Let's say I have two processes A and B. Assume process A is having a semaphore with it and executing some critical task. Now let us say process B sends a signal to release the semaphore. In this scenario, will process A release the semaphore even if it is executing some critical task?"
One key point to note here is the role of OS kernel. Process B can't send a signal to Process A 'to release the semaphore'. What it can do is request the kernel to give it access to the resource. Process A had requested the kernel and the kernel granted it access to the resource.
Now process A, after it finishes its job, will let the kernel know that it is done with the resource and then kernel grants access to B.
"My doubt arises when a process that does not have the semaphore with it can release the semaphore. What is the use of having a semaphore?"
The key difference between a mutex and a semaphore is, a semaphore serializes access to multiple instances of a resource. Mutex does the same when there is one instance of the resource.
A count is maintained by kernel in case of semaphore and mutex is a special case where the count is 1.
Consider the processes as customers waiting in line at a bank.
The use of semaphore is analogous to the case where there are multiple tellers serving the customers. Usage of mutex is analogous to the case where there is just one teller.
Say there are processes A, B and C that need concurrent access to a resource (lock, file or a data structure in memory, etc.). Further suppose there are 2 instances of the resource. So at most two processes can be granted access at a time.
Process A requests access to an instance of the resource following the required semantics. This request to the kernel involves data structures to identify the resource and maximum number of instances as 2. kernel creates the semaphore with a count of 2, grants A access to the resource and decrements the count to 1, because now only one other process can get access.
Now process B requests access to the resource by following the same semantics. Kernel grants it access and decrements the count to 0.
Now process C requests access, but kernel keeps it in waiting state, because count is 0 and no more than 2 processes can get concurrent access.
Process A is done with the resource and lets kernel know. Kernel notices this and grants access to process C that has been waiting.
In case of mutex, kernel grants access to the resource only one process at a time.
A normal binary semaphore is basically used for synchronization. However, the mutex is for exclusive access to a resource. A mutex is a special variant of semaphore that allows only one locker at a time and with more stringency on ownership than a normal semaphore such as the mutex should be released only by the thread that acquired it. Also, please note that in case of pthreads, fast mutex may not check for this error related to ownership, whereas the error checking mutex shall return error.
For the query related to 2 process A and B, the Process A shall intimate via kernel that it is done with its critical work so that the resource can be made available for waiting processes like B.
You could find some related information in this link too :
When should we use mutex and when should we use semaphore
There is no such thing as "having" a semaphore. Semaphores don't have ownership like mutexes do. The code you describe would simply be buggy. Mutexes won't work if your code is buggy either.
Consider the most classic example of a semaphore -- allowing one train at a time on a section of track. You could implement this with a mutex if the train is a thread. The train would lock the track mutex before going on the track and unlock it after leaving the track.
But what if the train itself is multi-threaded? Which thread should own the track?
And what if the signalling devices are the threads, not the train? Here, the signalling device that detects the train entering the track has to lock the track while the signalling device that detects the train leaving the track has to unlock it.
Mutexes are suitable for cases where there is something that is owned by a particular thread for a short period of time. That thread can "own" the mutex. Semaphores are useful for cases where there is no thread to own anything or nothing for the thread to own.

Advantages of using condition variables over mutex

I was wondering what is the performance benefit of using condition variables over mutex locks in pthreads.
What I found is : "Without condition variables, the programmer would need to have threads continually polling (possibly in a critical section), to check if the condition is met. This can be very resource consuming since the thread would be continuously busy in this activity. A condition variable is a way to achieve the same goal without polling." (https://computing.llnl.gov/tutorials/pthreads)
But it also seems that mutex calls are blocking (unlike spin-locks). Hence if a thread (T1) fails to get a lock because some other thread (T2) has the lock, T1 is put to sleep by the OS, and is woken up only when T2 releases the lock and the OS gives T1 the lock. The thread T1 does not really poll to get the lock. From this description, it seems that there is no performance benefit of using condition variables. In either case, there is no polling involved. The OS anyway provides the benefit that the condition-variable paradigm can provide.
Can you please explain what actually happens.
A condition variable allows a thread to be signaled when something of interest to that thread occurs.
By itself, a mutex doesn't do this.
If you just need mutual exclusion, then condition variables don't do anything for you. However, if you need to know when something happens, then condition variables can help.
For example, if you have a queue of items to work on, you'll have a mutex to ensure the queue's internals are consistent when accessed by the various producer and consumer threads. However, when the queue is empty, how will a consumer thread know when something is in there for it to work on? Without something like a condition variable it would need to poll the queue, taking and releasing the mutex on each poll (otherwise a producer thread could never put something on the queue).
Using a condition variable lets the consumer find that when the queue is empty it can just wait on the condition variable indicating that the queue has had something put into it. No polling - that thread does nothing until a producer puts something in the queue, then signals the condition that the queue has a new item.
You're looking for too much overlap in two separate but related things: a mutex and a condition variable.
A common implementation approach for a mutex is to use a flag and a queue. The flag indicates whether the mutex is held by anyone (a single-count semaphore would work too), and the queue tracks which threads are in line waiting to acquire the mutex exclusively.
A condition variable is then implemented as another queue bolted onto that mutex. Threads that got in line to wait to acquire the mutex can—usually once they have acquired it—volunteer to get out of the front of the line and get into the condition queue instead. At this point, you have two separate sets of waiters:
Those waiting to acquire the mutex exclusively
Those waiting for the condition variable to be signaled
When a thread holding the mutex exclusively signals the condition variable, for which we'll assume for now that it's a singular signal (unleashing no more than one waiting thread) and not a broadcast (unleashing all the waiting threads), the first thread in the condition variable queue gets shunted back over into the front (usually) of the mutex queue. Once the thread currently holding the mutex—usually the thread that signaled the condition variable—relinquishes the mutex, the next thread in the mutex queue can acquire it. That next thread in line will have been the one that was at the head of the condition variable queue.
There are many complicated details that come into play, but this sketch should give you a feel for the structures and operations in play.
If you are looking for performance, then start reading about "non blocking / non locking" thread synchronization algorithms. They are based upon atomic operations, which gcc is kind enough to provide. Lookup gcc atomic operations. Our tests showed we could increment a global value with multiple threads using atomic operation magnitudes faster than locking with a mutex. Here is some sample code that shows how to add items to and from a linked list from multiple threads at the same time without locking.
For sleeping and waking threads, signals are much faster than conditions. You use pthread_kill to send the signal, and sigwait to sleep the thread. We tested this too with the same kind of performance benefits. Here is some example code.

Mutex lock: what does "blocking" mean?

I've been reading up on multithreading and shared resources access and one of the many (for me) new concepts is the mutex lock. What I can't seem to find out is what is actually happening to the thread that finds a "critical section" is locked. It says in many places that the thread gets "blocked", but what does that mean? Is it suspended, and will it resume when the lock is lifted? Or will it try again in the next iteration of the "run loop"?
The reason I ask, is because I want to have system supplied events (mouse, keyboard, etc.), which (apparantly) are delivered on the main thread, to be handled in a very specific part in the run loop of my secondary thread. So whatever event is delivered, I queue in my own datastructure. Obviously, the datastructure needs a mutex lock because it's being modified by both threads. The missing puzzle-piece is: what happens when an event gets delivered in a function on the main thread, I want to queue it, but the queue is locked? Will the main thread be suspended, or will it just jump over the locked section and go out of scope (losing the event)?
Blocked means execution gets stuck there; generally, the thread is put to sleep by the system and yields the processor to another thread. When a thread is blocked trying to acquire a mutex, execution resumes when the mutex is released, though the thread might block again if another thread grabs the mutex before it can.
There is generally a try-lock operation that grab the mutex if possible, and if not, will return an error. But you are eventually going to have to move the current event into that queue. Also, if you delay moving the events to the thread where they are handled, the application will become unresponsive regardless.
A queue is actually one case where you can get away with not using a mutex. For example, Mac OS X (and possibly also iOS) provides the OSAtomicEnqueue() and OSAtomicDequeue() functions (see man atomic or <libkern/OSAtomic.h>) that exploit processor-specific atomic operations to avoid using a lock.
But, why not just process the events on the main thread as part of the main run loop?
The simplest way to think of it is that the blocked thread is put in a wait ("sleeping") state until the mutex is released by the thread holding it. At that point the operating system will "wake up" one of the threads waiting on the mutex and let it acquire it and continue. It's as if the OS simply puts the blocked thread on a shelf until it has the thing it needs to continue. Until the OS takes the thread off the shelf, it's not doing anything. The exact implementation -- which thread gets to go next, whether they all get woken up or they're queued -- will depend on your OS and what language/framework you are using.
Too late to answer but I may facilitate the understanding. I am talking more from implementation perspective rather than theoretical texts.
The word "blocking" is kind of technical homonym. People may use it for sleeping or mere waiting. The term has to be understood in context of usage.
Blocking means Waiting - Assume on an SMP system a thread B wants to acquire a spinlock held by some other thread A. One of the mechanisms is to disable preemption and keep spinning on the processor unless B gets it. Another mechanism probably, an efficient one, is to allow other threads to use processor, in case B does not gets it in easy attempts. Therefore we schedule out thread B (as preemption is enabled) and give processor to some other thread C. In this case thread B just waits in the scheduler's queue and comes back with its turn. Understand that B is not sleeping just waiting rather passively instead of busy-wait and burning processor cycles. On BSD and Solaris systems there are data-structures like turnstiles to implement this situation.
Blocking means Sleeping - If the thread B had instead made system call like read() waiting data from network socket, it cannot proceed until it gets it. Therefore, some texts casually use term blocking as "... blocked for I/O" or "... in blocking system call". Actually, thread B is rather sleeping. There are specific data-structures known as sleep queues - much like luxury waiting rooms on air-ports :-). The thread will be woken up when OS detects availability of data, much like an attendant of the waiting room.
Blocking means just that. It is blocked. It will not proceed until able. You don't say which language you're using, but most languages/libraries have lock objects where you can "attempt" to take the lock and then carry on and do something different depending on whether you succeeded or not.
But in, for example, Java synchronized blocks, your thread will stall until it is able to acquire the monitor (mutex, lock). The java.util.concurrent.locks.Lock interface describes lock objects which have more flexibility in terms of lock acquisition.

Conditional Variable vs Semaphore

When to use a semaphore and when to use a conditional variable?
Locks are used for mutual exclusion. When you want to ensure that a piece of code is atomic, put a lock around it. You could theoretically use a binary semaphore to do this, but that's a special case.
Semaphores and condition variables build on top of the mutual exclusion provide by locks and are used for providing synchronized access to shared resources. They can be used for similar purposes.
A condition variable is generally used to avoid busy waiting (looping repeatedly while checking a condition) while waiting for a resource to become available. For instance, if you have a thread (or multiple threads) that can't continue onward until a queue is empty, the busy waiting approach would be to just doing something like:
//pseudocode
while(!queue.empty())
{
sleep(1);
}
The problem with this is that you're wasting processor time by having this thread repeatedly check the condition. Why not instead have a synchronization variable that can be signaled to tell the thread that the resource is available?
//pseudocode
syncVar.lock.acquire();
while(!queue.empty())
{
syncVar.wait();
}
//do stuff with queue
syncVar.lock.release();
Presumably, you'll have a thread somewhere else that is pulling things out of the queue. When the queue is empty, it can call syncVar.signal() to wake up a random thread that is sitting asleep on syncVar.wait() (or there's usually also a signalAll() or broadcast() method to wake up all the threads that are waiting).
I generally use synchronization variables like this when I have one or more threads waiting on a single particular condition (e.g. for the queue to be empty).
Semaphores can be used similarly, but I think they're better used when you have a shared resource that can be available and unavailable based on some integer number of available things. Semaphores are good for producer/consumer situations where producers are allocating resources and consumers are consuming them.
Think about if you had a soda vending machine. There's only one soda machine and it's a shared resource. You have one thread that's a vendor (producer) who is responsible for keeping the machine stocked and N threads that are buyers (consumers) who want to get sodas out of the machine. The number of sodas in the machine is the integer value that will drive our semaphore.
Every buyer (consumer) thread that comes to the soda machine calls the semaphore down() method to take a soda. This will grab a soda from the machine and decrement the count of available sodas by 1. If there are sodas available, the code will just keep running past the down() statement without a problem. If no sodas are available, the thread will sleep here waiting to be notified of when soda is made available again (when there are more sodas in the machine).
The vendor (producer) thread would essentially be waiting for the soda machine to be empty. The vendor gets notified when the last soda is taken from the machine (and one or more consumers are potentially waiting to get sodas out). The vendor would restock the soda machine with the semaphore up() method, the available number of sodas would be incremented each time and thereby the waiting consumer threads would get notified that more soda is available.
The wait() and signal() methods of a synchronization variable tend to be hidden within the down() and up() operations of the semaphore.
Certainly there's overlap between the two choices. There are many scenarios where a semaphore or a condition variable (or set of condition variables) could both serve your purposes. Both semaphores and condition variables are associated with a lock object that they use to maintain mutual exclusion, but then they provide extra functionality on top of the lock for synchronizing thread execution. It's mostly up to you to figure out which one makes the most sense for your situation.
That's not necessarily the most technical description, but that's how it makes sense in my head.
Let's reveal what's under the hood.
Conditional variable is essentially a wait-queue, that supports blocking-wait and wakeup operations, i.e. you can put a thread into the wait-queue and set its state to BLOCK, and get a thread out from it and set its state to READY.
Note that to use a conditional variable, two other elements are needed:
a condition (typically implemented by checking a flag or a counter)
a mutex that protects the condition
The protocol then becomes,
acquire mutex
check condition
block and release mutex if condition is true, else release mutex
Semaphore is essentially a counter + a mutex + a wait queue. And it can be used as it is without external dependencies. You can use it either as a mutex or as a conditional variable.
Therefore, semaphore can be treated as a more sophisticated structure than conditional variable, while the latter is more lightweight and flexible.
Semaphores can be used to implement exclusive access to variables, however they are meant to be used for synchronization. Mutexes, on the other hand, have a semantics which is strictly related to mutual exclusion: only the process which locked the resource is allowed to unlock it.
Unfortunately you cannot implement synchronization with mutexes, that's why we have condition variables. Also notice that with condition variables you can unlock all the waiting threads in the same instant by using the broadcast unlocking. This cannot be done with semaphores.
semaphore and condition variables are very similar and are used mostly for the same purposes. However, there are minor differences that could make one preferable. For example, to implement barrier synchronization you would not be able to use a semaphore.But a condition variable is ideal.
Barrier synchronization is when you want all of your threads to wait until everyone has arrived at a certain part in the thread function. this can be implemented by having a static variable which is initially the value of total threads decremented by each thread when it reaches that barrier. this would mean we want each thread to sleep until the last one arrives.A semaphore would do the exact opposite! with a semaphore, each thread would keep running and the last thread (which will set semaphore value to 0) will go to sleep.
a condition variable on the other hand, is ideal. when each thread gets to the barrier we check if our static counter is zero. if not, we set the thread to sleep with the condition variable wait function. when the last thread arrives at the barrier, the counter value will be decremented to zero and this last thread will call the condition variable signal function which will wake up all the other threads!
I file condition variables under monitor synchronization. I've generally seen semaphores and monitors as two different synchronization styles. There are differences between the two in terms of how much state data is inherently kept and how you want to model code - but there really isn't any problem that can be solved by one but not the other.
I tend to code towards monitor form; in most languages I work in that comes down to mutexes, condition variables, and some backing state variables. But semaphores would do the job too.
semaphore need to know the count upfront for initialization. There is no such requirement for condition variables.
The the mutex and conditional variables are inherited from semaphore.
For mutex, the semaphore uses two states: 0, 1
For condition variables the semaphore uses counter.
They are like syntactic sugar
conditionalVar + mutex == semaphore

Resources