Using semaphore with counter - semaphore

I am trying to understand how to use a semaphore. As an exercise, I took the classical readers/writers problem
with a cyclic memory buffer. I would like to discuss only the writers. If I initialize the semaphore with a count greater than 1,
I see that my writers can write to the same memory location. So what is the point of a counting semaphore if it does
not guarantee synchronized access to a shared resource? It seems I would need a separate semaphore for each memory cell.

Your use case is the special situation in which the semaphore is initialized to 1 and behaves like a mutex. Initializing it to 2 would be an error, because it would no longer be a correct lock.
Nevertheless, semaphores are used in many other situations. For example, say you want to make sure that no more than 5 threads are running at a time.
You would set the semaphore to 5; each time you spawn a thread you do a down() on it, and each time a thread finishes, it does an up().
Trying to spawn a 6th thread would leave you 'stuck' in down() until a thread eventually finishes and performs an up() that unblocks you.
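The thread-limiting idea above can be sketched in Python roughly as follows. This is only an illustration: the names `run_limited` and `MAX_WORKERS` and the `time.sleep` stand-in for real work are assumptions, not part of the answer. The spawner does the down() before starting each thread, and each thread does the up() when it finishes.

```python
import threading
import time

MAX_WORKERS = 5  # assumed limit, matching the "5 threads" in the text


def run_limited(total_jobs, limit=MAX_WORKERS):
    """Run total_jobs threads, letting at most `limit` of them run at once.

    Returns the highest number of jobs observed running simultaneously.
    """
    slots = threading.Semaphore(limit)   # counter starts at `limit`
    stats_lock = threading.Lock()        # protects the two counters below
    running = 0
    peak = 0

    def job():
        nonlocal running, peak
        try:
            with stats_lock:
                running += 1
                peak = max(peak, running)
            time.sleep(0.01)             # stand-in for real work
            with stats_lock:
                running -= 1
        finally:
            slots.release()              # "up": frees a slot for the spawner

    threads = []
    for _ in range(total_jobs):
        slots.acquire()                  # "down": blocks while `limit` jobs run
        t = threading.Thread(target=job)
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    return peak
```

Note that the semaphore only bounds how many threads run; it says nothing about which memory cell each of them touches, which is why it cannot replace a lock in the writers exercise.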

Semaphore
Semaphores are a way to share a resource among multiple threads. In the Readers-writers problem, it is a way to guarantee consistency of the data, by preventing updates while it is being read, and preventing reads while it is being written to. It allows only one writer, and multiple concurrent readers.
Talking about semaphores is only useful if there are both readers and writers; in the case of an exclusive lock, where only one thread at a time can 'own' the lock (have access to the resource), the lock is usually called a mutex (short for mutual exclusion).
Implementation
I implemented semaphores the other way around (due to CPU specifics): positive indicates how many readers there are, and a negative number indicates that there is one writer.
Initially the semaphore is 0, indicating no writer, and no readers.
Read Lock
Any time a reader wants to read, the semaphore must be 0 or positive, to support concurrent reads. If this is so, it is incremented. Positive numbers then, indicate that there are readers.
A reader would do a LOCK_READ, which succeeds, unless the semaphore is negative, indicating it is in the process of being written to and thus inconsistent. If this happens, the thread doing the read lock is suspended until the semaphore becomes 0 (or higher).
Write Lock
Any time a writer wants to write, the semaphore must be 0, because if it is positive, the readers may get partially updated (corrupt) data, and if it is negative, it is already locked for writing by another thread. If the semaphore indicates that the resource is not being accessed (0), the semaphore is decremented.
Unlocking
The unlocking is the reverse of the locking, except that there is no need to suspend the thread to unlock a resource. A read lock is lifted by decrementing the semaphore, and a write lock is lifted by incrementing the semaphore.
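A minimal sketch of the signed-counter scheme described above, assuming Python's `threading.Condition` as the suspend/resume mechanism (the answer does not say how threads are suspended). The class and method names are made up for illustration.

```python
import threading


class SignedCounterRWLock:
    """Readers-writer lock driven by one signed counter.

    state > 0: that many readers; state == 0: free; state == -1: one writer.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self.state = 0

    def lock_read(self):
        with self._cond:
            while self.state < 0:        # a writer holds the lock: wait
                self._cond.wait()
            self.state += 1

    def unlock_read(self):
        with self._cond:
            self.state -= 1
            if self.state == 0:
                self._cond.notify_all()  # a waiting writer may proceed

    def lock_write(self):
        with self._cond:
            while self.state != 0:       # readers or another writer active
                self._cond.wait()
            self.state = -1

    def unlock_write(self):
        with self._cond:
            self.state = 0
            self._cond.notify_all()      # wake waiting readers and writers
```

As written, a steady stream of readers can keep the counter positive and starve writers; the description above has the same property.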

Related

Why doesn't POSIX provide a robust IPC semaphore (regarding process crash safety)?

According to this link, "How do I recover a semaphore when the process that decremented it to zero crashes?", there seems to be no robust inter-process semaphore, and the author finally chose a file lock, which is guaranteed to be released properly by the system or kernel.
But I also found the robust mutex provided by pthreads: https://man7.org/linux/man-pages/man3/pthread_mutexattr_setrobust.3.html. Why is there nothing like a robust semaphore?
And an extra question: what robust alternatives do we have for IPC synchronization? A file lock seems to be the best one. I think providing such a mechanism should not be that difficult at the system or kernel level, since file locks are already implemented there. So why are no other approaches provided?
When you use a mutex, it can be acquired by at most one thread at a time. Therefore, once the mutex has been acquired, the owner can write its process ID or thread ID (depending on the system) into the mutex, and future users can detect whether the owner is still alive or not.
However, a semaphore is ultimately a counter. It is possible that different threads may increment or decrement the counter. There isn't intrinsically one resource that is being shared; there could instead be multiple resources.
For example, if we're trying to limit ourselves to a certain number of outgoing connections (say, 8), then we could create a semaphore with that value and allow threads to acquire it (wait) to make a connection, and then increment it (post) when they're done. If we never want to make more than 8 connections at once, the semaphore will never block; we'll have acquired it successfully each time, even though there's no exclusion.
In such a situation, there isn't going to be space inside the semaphore to store every process's thread ID. Using memory allocation is tricky because that code needs to be synchronized independently, and even if that could be solved, it means that a semaphore value would have at least O(N) performance when acquiring the semaphore. I work on a production system that uses hundreds of threads, so you can imagine the performance problems if we had such a semaphore design.
There are other solutions which you can use when you need robustness, such as file locking or a robust mutex in a shared memory segment, but none of them have the same properties as a semaphore. Therefore, any discussion of what primitives should be used instead depends on the particular needs of the situation (which should probably be placed in a new question).
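For the file-locking alternative, here is a small sketch using Python's `fcntl.flock` (Unix only). The helper name `try_lock` and the idea of using an arbitrary lock file path are assumptions for illustration. The robustness comes from the kernel dropping the lock when the file is closed, which also happens when the owning process dies.

```python
import fcntl


def try_lock(path):
    """Try to take an exclusive advisory lock on `path` without blocking.

    Returns the open file object on success (keep it open to hold the
    lock), or None if another open file description already holds it.
    """
    f = open(path, "a")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        f.close()                # somebody else holds the lock
        return None
    return f                     # closing f (or process exit) releases it
```

This gives mutual exclusion only, so it replaces a semaphore used as a lock, not one used as a counter.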

Java Thread Live Lock

I have an interesting problem related to Java thread live lock. Here it goes.
There are four global locks - L1,L2,L3,L4
There are four threads - T1, T2, T3, T4
T1 requires locks L1,L2,L3
T2 requires locks L2
T3 requires locks L3,L4
T4 requires locks L1,L2
So, the pattern of the problem is this: any of the threads can run and acquire the locks in any order. If a thread detects that a lock it needs is not available, it releases all the other locks it had previously acquired and waits for a fixed time before retrying. The cycle repeats, giving rise to a livelock condition.
So, to solve this problem, I have two solutions in mind
1) Let each thread wait for a random period of time before retrying.
OR,
2) Let each thread acquire all the locks in a particular order (even if a thread does not require all the locks)
I am not convinced that these are the only two options available to me. Please advise.
Have all the threads enter a single mutex-protected state-machine whenever they require and release their set of locks. The threads should expose methods that return the set of locks they require to continue and also to signal/wait for a private semaphore signal. The SM should contain a bool for each lock and a 'Waiting' queue/array/vector/list/whatever container to store waiting threads.
If a thread enters the SM mutex to get locks and can immediately get its lock set, it can reset its bool set, exit the mutex and continue on.
If a thread enters the SM mutex and cannot immediately get its lock set, it should add itself to 'Waiting', exit the mutex and wait on its private semaphore.
If a thread enters the SM mutex to release its locks, it sets the lock bools to 'return' its locks and iterates 'Waiting' in an attempt to find a thread that can now run with the set of locks available. If it finds one, it resets the bools appropriately, removes the thread it found from 'Waiting' and signals the 'found' thread semaphore. It then exits the mutex.
You can twiddle with the algorithm that you use to match up the available set lock bools with waiting threads as you wish. Maybe you should release the thread that requires the largest set of matches, or perhaps you would like to 'rotate' the 'Waiting' container elements to reduce starvation. Up to you.
A solution like this requires no polling (with its performance-sapping CPU use and latency) and no continual acquire/release of multiple locks.
It's much easier to develop such a scheme with an OO design. The methods/member functions to signal/wait the semaphore and return the set of locks needed can usually be stuffed somewhere in the thread class inheritance chain.
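A rough Python sketch of the state machine described in this answer, under some assumptions: the class name `LockSetManager` is invented, lock sets are passed in as arguments rather than returned by thread methods, and on release it wakes every waiter that can now run instead of just the first one found.

```python
import threading


class LockSetManager:
    """Grants whole sets of named locks atomically (no partial holds)."""

    def __init__(self, names):
        self._mutex = threading.Lock()            # the state-machine mutex
        self._free = {n: True for n in names}     # one bool per lock
        self._waiting = []                        # (needed, private semaphore)

    def _available(self, needed):
        return all(self._free[n] for n in needed)

    def _take(self, needed):
        for n in needed:
            self._free[n] = False

    def acquire(self, needed):
        with self._mutex:
            if self._available(needed):
                self._take(needed)
                return
            wake = threading.Semaphore(0)         # private to this waiter
            self._waiting.append((needed, wake))
        wake.acquire()                            # locks are ours on wake-up

    def release(self, held):
        with self._mutex:
            for n in held:
                self._free[n] = True
            # Hand freed locks to each waiter that can now run (FIFO scan).
            still_waiting = []
            for needed, wake in self._waiting:
                if self._available(needed):
                    self._take(needed)
                    wake.release()
                else:
                    still_waiting.append((needed, wake))
            self._waiting = still_waiting
```

The FIFO scan is the simplest matching policy; as the answer says, it could be swapped for one that favours large lock sets or rotates the queue to reduce starvation.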
Unless there is a good reason (performance-wise) not to do so, I would unify all the locks into one lock object.
This is similar to the solution 2 you suggested, only simpler in my opinion.
And by the way, not only is this solution simpler and less bug-prone, its performance might also be better than that of the solution 1 you suggested.
Personally, I have never heard of Option 1, but I am by no means an expert on multithreading. After thinking about it, it sounds like it will work fine.
However, the standard way to deal with threads and resource locking is somewhat related to Option 2. To prevent deadlocks, resources need to always be acquired in the same order. For example, if every thread that needs both L1 and L2 always takes L1 before L2, no cycle of threads waiting on each other can form.
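The fixed-order rule can be sketched like this in Python. The `LOCKS` table and the helper names are hypothetical; sorting by lock name is just one way to define the global order.

```python
import threading

# Hypothetical global locks, named after the question's L1..L4.
LOCKS = {name: threading.Lock() for name in ("L1", "L2", "L3", "L4")}


def acquire_in_order(names):
    """Acquire the named locks in one fixed global order (sorted by name)."""
    ordered = sorted(set(names))
    for name in ordered:
        LOCKS[name].acquire()
    return ordered


def release_all(ordered):
    """Release in reverse order of acquisition."""
    for name in reversed(ordered):
        LOCKS[name].release()


def worker(names, log):
    held = acquire_in_order(names)
    try:
        log.append(tuple(held))      # stand-in for the protected work
    finally:
        release_all(held)
```

Each thread takes only the locks it needs, so T2 (which needs just L2) never has to wait for L1 or L3.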
Go with 2a): let each thread acquire all of the locks that it needs (NOT all of the locks) in a particular order; if a thread encounters a lock that isn't available, it releases all of its locks and retries.
As long as threads acquire their locks in the same order you can't have deadlock; however, you can still have starvation (a thread might run into a situation where it keeps releasing all of its locks without making forward progress). To ensure that progress is made you can assign priorities to threads (0 = lowest priority, MAX_INT = highest priority) - increase a thread's priority when it has to release its locks, and reduce it to 0 when it acquires all of its locks. Put your waiting threads in a queue, and don't start a lower-priority thread if it needs the same resources as a higher-priority thread - this way you guarantee that the higher-priority threads will eventually acquire all of their locks. Don't implement this thread queue unless you're actually having problems with thread starvation, though, because it's probably less efficient than just letting all of your threads run at once.
You can also simplify things by implementing omer schleifer's condense-all-locks-to-one solution; however, unless threads other than the four you've mentioned are contending for these resources (in which case you'll still need to lock the resources from the external threads), you can more efficiently implement this by removing all locks and putting your threads in a circular queue (so your threads just keep running in the same order).

Semaphore vs. Monitors - what's the difference?

What are the major differences between a Monitor and a Semaphore?
A Monitor is an object designed to be accessed from multiple threads. The member functions or methods of a monitor object will enforce mutual exclusion, so only one thread may be performing any action on the object at a given time. If one thread is currently executing a member function of the object then any other thread that tries to call a member function of that object will have to wait until the first has finished.
A Semaphore is a lower-level object. You might well use a semaphore to implement a monitor. A semaphore essentially is just a counter. When the counter is positive, if a thread tries to acquire the semaphore then it is allowed, and the counter is decremented. When a thread is done then it releases the semaphore, and increments the counter.
If the counter is already zero when a thread tries to acquire the semaphore then it has to wait until another thread releases the semaphore. If multiple threads are waiting when a thread releases a semaphore then one of them gets it. The thread that releases a semaphore need not be the same thread that acquired it.
A monitor is like a public toilet. Only one person can enter at a time. They lock the door to prevent anyone else coming in, do their stuff, and then unlock it when they leave.
A semaphore is like a bike hire place. They have a certain number of bikes. If you try and hire a bike and they have one free then you can take it, otherwise you must wait. When someone returns their bike then someone else can take it. If you have a bike then you can give it to someone else to return --- the bike hire place doesn't care who returns it, as long as they get their bike back.
The following explanation describes how wait() and signal() of a monitor differ from P and V of a semaphore.
The wait() and signal() operations on condition variables in a monitor are similar to P and V operations on counting semaphores.
A wait statement can block a process's execution, while a signal statement can cause another process to be unblocked. However, there are some differences between them. When a process executes a P operation, it does not necessarily block, because the counting semaphore may be greater than zero. In contrast, when a wait statement is executed, it always blocks the process. When a task executes a V operation on a semaphore, it either unblocks a task waiting on that semaphore or, if there is no task to unblock, increments the semaphore counter. On the other hand, if a process executes a signal statement when there is no other process to unblock, there is no effect on the condition variable. Another difference is that a process awakened by a V operation can resume execution without delay, whereas a process awakened by a signal operation is restarted only when the monitor is unlocked. In addition, a monitor solution is more structured than one with semaphores, because the data and procedures are encapsulated in a single module and mutual exclusion is provided automatically by the implementation.
Link: here for further reading. Hope it helps.
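The "signal with nobody waiting" difference can be seen with Python's `threading` primitives, used here as stand-ins for P/V and wait/signal. The function names and the 0.2 second timeouts are arbitrary choices for the demonstration.

```python
import threading


def semaphore_remembers_signal():
    """V before P: the count records the signal, so P does not block."""
    sem = threading.Semaphore(0)
    sem.release()                          # V with no waiter: count becomes 1
    return sem.acquire(timeout=0.2)        # P succeeds immediately


def condition_forgets_signal():
    """signal before wait: nothing is recorded, so wait times out."""
    cond = threading.Condition()
    with cond:
        cond.notify()                      # no waiter: no effect
    with cond:
        return cond.wait(timeout=0.2)      # returns False after the timeout
```

This is why code using condition variables pairs each wait with a predicate on shared state, while a semaphore carries its own state in the counter.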
Semaphore allows multiple threads (up to a set number) to access a shared object. Monitors allow mutually exclusive access to a shared object.
One-line answer:
Monitor: ensures that only ONE thread at a time can execute inside the monitor (a thread must acquire the monitor's lock to run there).
Semaphore: a lock that protects a shared resource (a thread must acquire the lock to access the resource).
A semaphore is a signaling mechanism used to coordinate between threads. Example: One thread is downloading files from the internet and another thread is analyzing the files. This is a classic producer/consumer scenario. The producer calls signal() on the semaphore when a file is downloaded. The consumer calls wait() on the same semaphore in order to be blocked until the signal indicates a file is ready. If the semaphore is already signaled when the consumer calls wait, the call does not block. Multiple threads can wait on a semaphore, but each signal will only unblock a single thread.
A counting semaphore keeps track of the number of signals. E.g. if the producer signals three times in a row, wait() can be called three times without blocking. A binary semaphore does not count but just has the "waiting" and "signalled" states.
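The download/analyze example might look like this in Python. It is only a sketch: `download_and_analyze` is a made-up name, appending to a deque stands in for the download, and upper-casing the name stands in for the analysis.

```python
import threading
from collections import deque


def download_and_analyze(names):
    """One producer and one consumer coordinated by a counting semaphore."""
    ready = threading.Semaphore(0)     # counts downloaded, unanalyzed files
    files = deque()                    # deque append/popleft are thread-safe
    analyzed = []

    def downloader():
        for name in names:
            files.append(name)         # stand-in for a real download
            ready.release()            # signal(): one more file is ready

    def analyzer():
        for _ in names:
            ready.acquire()            # wait(): blocks until a file is ready
            analyzed.append(files.popleft().upper())  # stand-in for analysis

    threads = [threading.Thread(target=analyzer),
               threading.Thread(target=downloader)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return analyzed
```

If the downloader gets ahead, the semaphore count simply grows, and the analyzer's next few wait() calls return without blocking.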
A mutex (mutual exclusion lock) is a lock which is owned by a single thread. Only the thread which has acquired the lock can release it again. Other threads which try to acquire the lock will be blocked until the current owner thread releases it. A mutex lock does not in itself lock anything - it is really just a flag. But code can check for ownership of a mutex lock to ensure that only one thread at a time can access some object or resource.
A monitor is a higher-level construct which uses an underlying mutex lock to ensure thread-safe access to some object. Unfortunately the word "monitor" is used with a few different meanings depending on context and platform, but in Java, for example, a monitor is a mutex lock which is implicitly associated with an object, and which can be invoked with the synchronized keyword. The synchronized keyword can be applied to a method or a block and ensures only one thread can execute the code at a time.
Semaphore :
Using a counter or flag to control access to some shared resource in a concurrent system implies the use of a semaphore.
Example:
A counter to allow only 50 Passengers to acquire the 50 seats (Shared resource) of any Theatre/Bus/Train/Fun ride/Classroom. And to allow a new Passenger only if someone vacates a seat.
A binary flag indicating the free/occupied status of any Bathroom.
Traffic lights are good example of flags. They control flow by regulating passage of vehicles on Roads (Shared resource)
Flags only reveal the current state of Resource, no count or any other information on the waiting or running objects on the resource.
Monitor :
A Monitor synchronizes access to an Object by communicating with threads interested in the object, asking them to acquire access or wait for some condition to become true.
Example:
A father may act as a monitor for his daughter, allowing her to date only one guy at a time.
A school teacher using a baton to allow only one child to speak in the class.
Lastly a technical one, transactions (via threads) on an Account object synchronized to maintain integrity.
When a semaphore is used to guard a critical region, there is no direct relationship between the semaphore and the data being protected. This is part of the reason why semaphores may be dispersed around the code, and why it is easy to forget to call wait or notify, in which case the result will be, respectively, to violate mutual exclusion or to lock the resource permanently.
In contrast, neither of these bad things can happen with a monitor. A monitor is tied directly to the data (it encapsulates the data) and, because the monitor operations are atomic actions, it is impossible to write code that can access the data without calling the entry protocol. The exit protocol is called automatically when the monitor operation is completed.
A monitor has a built-in mechanism for condition synchronisation in the form of condition variables, which a process checks before proceeding. If the condition is not satisfied, the process has to wait until it is notified of a change in the condition. When a process is waiting for condition synchronisation, the monitor implementation takes care of the mutual exclusion issue, and allows another process to gain access to the monitor.
Taken from The Open University M362 Unit 3 "Interacting process" course material.
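As an illustration of the monitor idea (data, lock and condition variable bundled in one object), here is a hedged Python sketch of the Account example mentioned earlier. Python has no monitor keyword, so the `with self._monitor:` blocks play the role of the entry and exit protocols; the class layout and the `timeout` parameter are assumptions.

```python
import threading


class Account:
    """Monitor-style account: the lock and the data live in one object."""

    def __init__(self, balance=0):
        self._monitor = threading.Condition()   # lock + condition variable
        self._balance = balance

    def deposit(self, amount):
        with self._monitor:                     # entry protocol
            self._balance += amount
            self._monitor.notify_all()          # the condition may now hold
        # leaving the with-block is the exit protocol

    def withdraw(self, amount, timeout=None):
        """Wait until funds suffice; return False if `timeout` expires."""
        with self._monitor:
            ok = self._monitor.wait_for(
                lambda: self._balance >= amount, timeout=timeout)
            if ok:
                self._balance -= amount
            return ok

    def balance(self):
        with self._monitor:
            return self._balance
```

While a withdrawal is waiting, `wait_for` releases the lock, so a deposit from another thread can enter the monitor, which is the behaviour the quoted text describes.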

When should we use mutex and when should we use semaphore

When should we use mutex and when should we use semaphore ?
Here is how I remember when to use what -
Semaphore:
Use a semaphore when you (thread) want to sleep till some other thread tells you to wake up. Semaphore 'down' happens in one thread (producer) and semaphore 'up' (for same semaphore) happens in another thread (consumer)
e.g.: In producer-consumer problem, producer wants to sleep till at least one buffer slot is empty - only the consumer thread can tell when a buffer slot is empty.
Mutex:
Use a mutex when you (thread) want to execute code that should not be executed by any other thread at the same time. Mutex 'down' happens in one thread and mutex 'up' must happen in the same thread later on.
e.g.: If you are deleting a node from a global linked list, you do not want another thread to muck around with pointers while you are deleting the node. When you acquire a mutex and are busy deleting a node, if another thread tries to acquire the same mutex, it will be put to sleep till you release the mutex.
Spinlock:
Use a spinlock when you really want to use a mutex but your thread is not allowed to sleep.
e.g.: An interrupt handler within OS kernel must never sleep. If it does the system will freeze / crash. If you need to insert a node to globally shared linked list from the interrupt handler, acquire a spinlock - insert node - release spinlock.
A mutex is a mutual exclusion object, similar to a semaphore but that only allows one locker at a time and whose ownership restrictions may be more stringent than a semaphore.
It can be thought of as equivalent to a normal counting semaphore (with a count of one) and the requirement that it can only be released by the same thread that locked it (a).
A semaphore, on the other hand, has an arbitrary count and can be locked by that many lockers concurrently. And it may not have a requirement that it be released by the same thread that claimed it (but, if not, you have to carefully track who currently has responsibility for it, much like allocated memory).
So, if you have a number of instances of a resource (say three tape drives), you could use a semaphore with a count of 3. Note that this doesn't tell you which of those tape drives you have, just that you have a certain number.
Also with semaphores, it's possible for a single locker to lock multiple instances of a resource, such as for a tape-to-tape copy. If you have one resource (say a memory location that you don't want to corrupt), a mutex is more suitable.
Equivalent operations are:
Counting semaphore            Mutual exclusion semaphore
--------------------------    --------------------------
Claim/decrease (P)            Lock
Release/increase (V)          Unlock
Aside: in case you've ever wondered at the bizarre letters (P and V) used for claiming and releasing semaphores, it's because the inventor was Dutch. In that language:
Probeer te verlagen: means to try to lower;
Verhogen: means to increase.
(a) ... or it can be thought of as something totally distinct from a semaphore, which may be safer given their almost-always-different uses.
It is very important to understand that a mutex is not a semaphore with count 1!
This is the reason there are things like binary semaphores (which are really semaphores with count 1).
The difference between a Mutex and a Binary-Semaphore is the principle of ownership:
A mutex is acquired by a task and therefore must also be released by the same task.
This makes it possible to fix several problems with binary semaphores (Accidental release, recursive deadlock, and priority inversion).
Caveat: I wrote "makes it possible"; whether and how these problems are fixed is up to the OS implementation.
Because the mutex has to be released by the same task it is not very good for the synchronization of tasks. But if combined with condition variables you get very powerful building blocks for building all kinds of IPC primitives.
So my recommendation is: if you have cleanly implemented mutexes and condition variables (like with POSIX pthreads), use these.
Use semaphores only if they fit exactly the problem you are trying to solve; don't try to build other primitives (e.g. rw-locks) out of semaphores, use mutexes and condition variables for these.
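The ownership principle can be demonstrated in Python, with one caveat: Python's plain `threading.Lock` does not track an owner, so `threading.RLock` (which does) is used here as the stand-in for an owned mutex. The helper name is invented.

```python
import threading


def release_from_other_thread(primitive):
    """Acquire in this thread, then try to release from another thread.

    Returns None if the release worked, or the exception it raised.
    """
    primitive.acquire()
    outcome = []

    def intruder():
        try:
            primitive.release()
            outcome.append(None)
        except RuntimeError as exc:      # RLock refuses a non-owner release
            outcome.append(exc)

    t = threading.Thread(target=intruder)
    t.start()
    t.join()
    return outcome[0]
```

Calling it with `threading.Semaphore(1)` returns None (any thread may release a semaphore), while calling it with `threading.RLock()` returns a RuntimeError.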
There is a lot of misunderstanding between mutexes and semaphores. The best explanation I found so far is in this 3-Part article:
Mutex vs. Semaphores – Part 1: Semaphores
Mutex vs. Semaphores – Part 2: The Mutex
Mutex vs. Semaphores – Part 3 (final part): Mutual Exclusion Problems
While @opaxdiablo's answer is totally correct, I would like to point out that the usage scenarios of the two are quite different. The mutex is used to protect parts of code from running concurrently; semaphores are used for one thread to signal another thread to run.
/* Task 1 */
pthread_mutex_lock(mutex_thing);
// Safely use shared resource
pthread_mutex_unlock(mutex_thing);
/* Task 2 */
pthread_mutex_lock(mutex_thing);
// Safely use shared resource
pthread_mutex_unlock(mutex_thing); // unlock mutex
The semaphore scenario is different:
/* Task 1 - Producer */
sema_post(&sem); // Send the signal
/* Task 2 - Consumer */
sema_wait(&sem); // Wait for signal
See http://www.netrino.com/node/202 for further explanations
See "The Toilet Example" - http://pheatt.emporia.edu/courses/2010/cs557f10/hand07/Mutex%20vs_%20Semaphore.htm:
Mutex:
Is a key to a toilet. One person can have the key - occupy the toilet - at a time. When finished, the person gives (frees) the key to the next person in the queue.
Officially: "Mutexes are typically used to serialise access to a section of re-entrant code that cannot be executed concurrently by more than one thread. A mutex object only allows one thread into a controlled section, forcing other threads which attempt to gain access to that section to wait until the first thread has exited from that section."
Ref: Symbian Developer Library
(A mutex is really a semaphore with value 1.)
Semaphore:
Is the number of free identical toilet keys. For example, say we have four toilets with identical locks and keys. The semaphore count - the count of keys - is set to 4 at the beginning (all four toilets are free), then the count value is decremented as people come in. If all toilets are full, i.e. there are no free keys left, the semaphore count is 0. Now, when e.g. one person leaves the toilet, the semaphore is increased to 1 (one free key), and the key is given to the next person in the queue.
Officially: "A semaphore restricts the number of simultaneous users of a shared resource up to a maximum number. Threads can request access to the resource (decrementing the semaphore), and can signal that they have finished using the resource (incrementing the semaphore)."
Ref: Symbian Developer Library
Mutex is to protect the shared resource.
Semaphore is to dispatch the threads.
Mutex:
Imagine that there are some tickets to sell. We can simulate a case where many people buy the tickets at the same time: each person is a thread to buy tickets. Obviously we need to use the mutex to protect the tickets because it is the shared resource.
Semaphore:
Imagine that we need to do a calculation as below:
c = a + b;
Also, we need a function geta() to calculate a, a function getb() to calculate b and a function getc() to do the calculation c = a + b.
Obviously, we can't do the c = a + b unless geta() and getb() have been finished.
If the three functions are three threads, we need to dispatch the three threads.
int a, b, c;
/* the semaphore is assumed to start at 0 */
void geta()
{
    a = calculatea();
    semaphore_increase();   /* a is ready */
}
void getb()
{
    b = calculateb();
    semaphore_increase();   /* b is ready */
}
void getc()
{
    semaphore_decrease();   /* wait for the first result */
    semaphore_decrease();   /* wait for the second result */
    c = a + b;
}
t1 = thread_create(geta);
t2 = thread_create(getb);
t3 = thread_create(getc);
thread_join(t3);
With the help of the semaphore, the code above can make sure that t3 won't do its job until t1 and t2 have done their jobs.
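For anyone who wants to run the idea, here is one way the geta/getb/getc pseudocode could be written in Python. It is a sketch: `compute_sum` and the two callables passed to it are placeholders for calculatea() and calculateb(), and the semaphore starts at 0.

```python
import threading


def compute_sum(calculate_a, calculate_b):
    """Compute c = a + b with three threads ordered by one semaphore."""
    done = threading.Semaphore(0)      # starts at 0: nothing is ready yet
    values = {}

    def get_a():
        values["a"] = calculate_a()
        done.release()                 # semaphore_increase()

    def get_b():
        values["b"] = calculate_b()
        done.release()                 # semaphore_increase()

    def get_c():
        done.acquire()                 # semaphore_decrease(), once per input
        done.acquire()
        values["c"] = values["a"] + values["b"]

    # get_c is started first on purpose: it still waits for both inputs.
    threads = [threading.Thread(target=f) for f in (get_c, get_a, get_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return values["c"]
```

For example, `compute_sum(lambda: 2, lambda: 3)` returns 5 regardless of which thread the scheduler happens to run first.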
In a word, a semaphore is for making threads execute in a logical order, whereas a mutex is for protecting a shared resource.
So they are NOT the same thing even if some people always say that mutex is a special semaphore with the initial value 1. You can say like this too but please notice that they are used in different cases. Don't replace one by the other even if you can do that.
Trying not to sound zany, but can't help myself.
Your question should be what is the difference between mutex and semaphores ?
And to be more precise question should be, 'what is the relationship between mutex and semaphores ?'
(I would have added that question but I'm hundred % sure some overzealous moderator would close it as duplicate without understanding difference between difference and relationship.)
In object terminology we can observe that :
observation.1 Semaphore contains mutex
observation.2 Mutex is not semaphore and semaphore is not mutex.
There are some semaphores that will act as if they are mutex, called binary semaphores, but they are freaking NOT mutex.
There is a special ingredient called Signalling (posix uses condition_variable for that name), required to make a Semaphore out of mutex.
Think of it as a notification-source. If two or more threads are subscribed to same notification-source, then it is possible to send them message to either ONE or to ALL, to wakeup.
There could be one or more counters associated with semaphores, which are guarded by a mutex. In the simplest scenario for a semaphore, there is a single counter which can be either 0 or 1.
This is where confusion pours in like monsoon rain.
A semaphore with a counter that can be 0 or 1 is NOT mutex.
Mutex has two states (0,1) and one ownership(task).
Semaphore has a mutex, some counters and a condition variable.
Now, use your imagination, and every combination of usage of counter and when to signal can make one kind-of-Semaphore.
Single counter with value 0 or 1 and signaling when value goes to 1 AND then unlocks one of the guy waiting on the signal == Binary semaphore
Single counter with value 0 to N and signaling when value goes to less than N, and locks/waits when values is N == Counting semaphore
Single counter with value 0 to N and signaling when value goes to N, and locks/waits when values is less than N == Barrier semaphore (well if they dont call it, then they should.)
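To make the "mutex + counter + condition variable" recipe concrete, here is a small Python sketch of a counting semaphore built from exactly those parts. Note one assumption: the counter here holds the number of free permits (so waiters block at 0), which is the inverse of the 0-to-N "users" convention in the list above. The class name and the `timeout` parameter are invented.

```python
import threading


class SimpleSemaphore:
    """Counting semaphore assembled from a mutex, a counter and a condvar."""

    def __init__(self, value=1):
        self._cond = threading.Condition(threading.Lock())  # mutex + condvar
        self._value = value                                  # the counter

    def wait(self, timeout=None):
        """P / down: take a permit; return False if `timeout` expires."""
        with self._cond:
            ok = self._cond.wait_for(lambda: self._value > 0, timeout=timeout)
            if ok:
                self._value -= 1
            return ok

    def signal(self):
        """V / up: return a permit and wake one waiter."""
        with self._cond:
            self._value += 1
            self._cond.notify()
```

Starting it with value 1 gives the binary case; nothing in it records which thread took a permit, which is the point about it not being a mutex.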
Now to your question, when to use what. (OR rather correct question version.3 when to use mutex and when to use binary-semaphore, since there is no comparison to non-binary-semaphore.)
Use mutex when
1. you want a customized behavior that is not provided by a binary semaphore, such as a spin-lock, fast-lock or recursive lock.
You can usually customize mutexes with attributes, but customizing semaphore is nothing but writing new semaphore.
2. you want lightweight OR faster primitive
Use semaphores, when what you want is exactly provided by it.
If you don't understand what is being provided by your implementation of binary-semaphore, then IMHO, use mutex.
And lastly read a book rather than relying just on SO.
I think the question should be the difference between mutex and binary semaphore.
Mutex = It is an ownership lock mechanism; only the thread that acquired the lock can release the lock.
Binary semaphore = It is more of a signalling mechanism; any other higher-priority thread can, if it wants, signal and take the lock.
All the above answers are of good quality, but this one's just to memorize. The name mutex is derived from "mutually exclusive", hence you are motivated to think of a mutex lock as mutual exclusion between two, as in only one at a time: if I possess it, you can have it only after I release it. On the other hand, no such case exists for a semaphore, which is just like a traffic signal (which the word semaphore also means).
As was pointed out, a semaphore with a count of one is the same thing as a 'binary' semaphore which is the same thing as a mutex.
The main things I've seen semaphores with a count greater than one used for is producer/consumer situations in which you have a queue of a certain fixed size.
You have two semaphores then. The first semaphore is initially set to the number of free slots in the queue (its capacity) and the second semaphore is set to 0. The producer does a P operation on the first semaphore, adds to the queue, and does a V operation on the second. The consumer does a P operation on the second semaphore, removes from the queue, and then does a V operation on the first.
In this way the producer is blocked whenever it fills the queue, and the consumer is blocked whenever the queue is empty.
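A possible Python rendering of the two-semaphore queue, as a sketch: the class name `BoundedQueue` is invented, and the extra `Lock` around the deque is an addition so the example also stays correct with several producers or consumers.

```python
import threading
from collections import deque


class BoundedQueue:
    """Fixed-size queue coordinated by two counting semaphores."""

    def __init__(self, capacity):
        self._free = threading.Semaphore(capacity)  # first: free slots
        self._used = threading.Semaphore(0)         # second: filled slots
        self._items = deque()
        self._lock = threading.Lock()               # protects the deque itself

    def put(self, item):
        self._free.acquire()        # P(first): blocks while the queue is full
        with self._lock:
            self._items.append(item)
        self._used.release()        # V(second)

    def get(self):
        self._used.acquire()        # P(second): blocks while the queue is empty
        with self._lock:
            item = self._items.popleft()
        self._free.release()        # V(first)
        return item
```

Here each semaphore is decremented by one thread and incremented by the other, which is the signalling use that a mutex cannot provide.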
A mutex is a special case of a semaphore. A semaphore allows several threads to go into the critical section. When creating a semaphore you define how many threads are allowed in the critical section. Of course your code must be able to handle several accesses to this critical section.
I find the answer of @Peer Stritzinger to be the correct one.
I wanted to add to his answer the following quote from the book Programming with POSIX Threads by David R Butenhof. On page 52 of chapter 3 the author writes (emphasis mine):
You cannot lock a mutex when the calling thread already has that mutex locked. The result of attempting to do so may be an error return (EDEADLK), or it may be a self-deadlock, where the unfortunate thread waits forever. You cannot unlock a mutex that is unlocked, or that is locked by another thread. Locked mutexes are owned by the thread that locks them. If you need an "unowned" lock, use a semaphore. (Section 6.6.6 discusses semaphores.)
With this in mind, the following piece of code illustrates the danger of using a semaphore of size 1 as a replacement for a mutex.
sem = Semaphore(1)
counter = 0 // shared variable
----
Thread 1
for (i in 1..100):
    sem.lock()
    ++counter
    sem.unlock()
----
Thread 2
for (i in 1..100):
    sem.lock()
    ++counter
    sem.unlock()
----
Thread 3
sem.unlock()
thread.sleep(1.sec)
sem.lock()
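With only threads 1 and 2, the final value of counter would always be 200. However, if the semaphore reference is leaked by mistake to another thread that calls unlock (as thread 3 does), the count rises to 2, threads 1 and 2 can be inside the critical section at the same time, and you no longer get mutual exclusion.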
With a mutex, this behaviour would be impossible by definition.
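To make the ownership point concrete, here is a small Python sketch. It rests on one assumption that should be stated loudly: Python's plain `threading.Lock` is not owned by a thread, so `threading.RLock` (which does track its owner) is used here as a stand-in for the owned mutex that Butenhof describes. The function names are invented for the example.

```python
import threading

def stray_release_on_semaphore():
    """A thread that never acquired the semaphore releases it anyway."""
    sem = threading.Semaphore(1)
    intruder = threading.Thread(target=sem.release)  # like thread 3 above
    intruder.start()
    intruder.join()
    # The count is now 2, so two non-blocking acquires succeed
    # and the "lock" admits two holders at once.
    return (sem.acquire(blocking=False),
            sem.acquire(blocking=False),
            sem.acquire(blocking=False))

def stray_release_on_owned_lock():
    """The same mistake against an owned lock is rejected."""
    lock = threading.RLock()      # stand-in for an owned mutex (assumption)
    lock.acquire()                # owned by the calling thread
    errors = []

    def intruder():
        try:
            lock.release()        # not the owner
        except RuntimeError as exc:
            errors.append(exc)

    t = threading.Thread(target=intruder)
    t.start()
    t.join()
    lock.release()                # the real owner can still release
    return len(errors)

if __name__ == "__main__":
    print(stray_release_on_semaphore())
    print(stray_release_on_owned_lock())
```

The semaphore silently accepts the extra release, while the owned lock raises an error in the thread that does not hold it.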
Binary semaphores and mutexes are different. From the OS perspective, a binary semaphore and a counting semaphore are implemented in the same way; a binary semaphore simply takes only the values 0 and 1.
Mutex -> Can be used for one purpose only: mutual exclusion for a critical section of code.
Semaphore -> Can be used to solve a variety of problems. A binary semaphore can be used for signalling and can also solve the mutual exclusion problem. When initialized to 0, it solves the signalling problem, and when initialized to 1, it solves the mutual exclusion problem.
When several instances of a resource need to be synchronized, we can use a counting semaphore.
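As a rough illustration of the signalling use (semaphore initialized to 0), the Python sketch below makes one thread wait until another has finished its step. The names `run_signalling`, `ready`, `first` and `second` are hypothetical and exist only for this example.

```python
import threading

def run_signalling():
    """Thread 'second' may only proceed after thread 'first' has signalled."""
    ready = threading.Semaphore(0)   # 0 means: nothing has been signalled yet
    order = []

    def first():
        order.append("A")            # the step that must happen first
        ready.release()              # signal: A is done

    def second():
        ready.acquire()              # blocks until A has been signalled
        order.append("B")

    t2 = threading.Thread(target=second)
    t1 = threading.Thread(target=first)
    t2.start()                       # started first on purpose; it still waits
    t1.start()
    t1.join()
    t2.join()
    return order

if __name__ == "__main__":
    print(run_signalling())
```

Note that no thread "owns" the semaphore here: one thread releases and a different thread acquires, which is exactly what a mutex does not allow.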
In my blog, I have discussed these topics in detail.
https://designpatterns-oo-cplusplus.blogspot.com/2015/07/synchronization-primitives-mutex-and.html

Conditional Variable vs Semaphore

When to use a semaphore and when to use a conditional variable?
Locks are used for mutual exclusion. When you want to ensure that a piece of code is atomic, put a lock around it. You could theoretically use a binary semaphore to do this, but that's a special case.
Semaphores and condition variables build on top of the mutual exclusion provided by locks and are used for providing synchronized access to shared resources. They can be used for similar purposes.
A condition variable is generally used to avoid busy waiting (looping repeatedly while checking a condition) while waiting for a resource to become available. For instance, if you have a thread (or multiple threads) that can't continue onward until a queue is empty, the busy waiting approach would be to do something like:
//pseudocode
while(!queue.empty())
{
    sleep(1);
}
The problem with this is that you're wasting processor time by having this thread repeatedly check the condition. Why not instead have a synchronization variable that can be signaled to tell the thread that the resource is available?
//pseudocode
syncVar.lock.acquire();
while(!queue.empty())
{
    syncVar.wait();
}
//do stuff with queue
syncVar.lock.release();
Presumably, you'll have a thread somewhere else that is pulling things out of the queue. When the queue is empty, it can call syncVar.signal() to wake up one of the threads sitting asleep on syncVar.wait() (which one is generally unspecified), or there's usually also a signalAll() or broadcast() method to wake up all the threads that are waiting.
I generally use synchronization variables like this when I have one or more threads waiting on a single particular condition (e.g. for the queue to be empty).
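For what it's worth, the wait-until-empty pattern maps onto Python's `threading.Condition` roughly like this. It is a sketch under assumptions: `run_wait_for_empty`, `waiter` and `drainer` are names made up here, and `notify_all()` plays the role of the broadcast() mentioned above.

```python
import threading
from collections import deque

def run_wait_for_empty(n_items=5):
    """One thread drains a queue; another sleeps until the queue is empty."""
    queue = deque(range(n_items))
    cond = threading.Condition()      # bundles a lock with a wait queue
    observed = []

    def waiter():
        with cond:                    # syncVar.lock.acquire()
            while queue:              # re-check after every wake-up
                cond.wait()           # releases the lock while asleep
            observed.append(len(queue))  # "do stuff" once the queue is empty

    def drainer():
        for _ in range(n_items):
            with cond:
                queue.popleft()
                if not queue:
                    cond.notify_all() # wake everyone waiting for "empty"

    threads = [threading.Thread(target=waiter),
               threading.Thread(target=drainer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return observed

if __name__ == "__main__":
    print(run_wait_for_empty())
```

Because the waiter checks the condition under the lock before sleeping, it does not matter whether the drainer finishes before or after the waiter gets going.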
Semaphores can be used similarly, but I think they're better used when you have a shared resource that can be available and unavailable based on some integer number of available things. Semaphores are good for producer/consumer situations where producers are allocating resources and consumers are consuming them.
Think about if you had a soda vending machine. There's only one soda machine and it's a shared resource. You have one thread that's a vendor (producer) who is responsible for keeping the machine stocked and N threads that are buyers (consumers) who want to get sodas out of the machine. The number of sodas in the machine is the integer value that will drive our semaphore.
Every buyer (consumer) thread that comes to the soda machine calls the semaphore down() method to take a soda. This will grab a soda from the machine and decrement the count of available sodas by 1. If there are sodas available, the code will just keep running past the down() statement without a problem. If no sodas are available, the thread will sleep here waiting to be notified of when soda is made available again (when there are more sodas in the machine).
The vendor (producer) thread would essentially be waiting for the soda machine to be empty. The vendor gets notified when the last soda is taken from the machine (and one or more consumers are potentially waiting to get sodas out). The vendor would restock the soda machine with the semaphore up() method, the available number of sodas would be incremented each time and thereby the waiting consumer threads would get notified that more soda is available.
The wait() and signal() methods of a synchronization variable tend to be hidden within the down() and up() operations of the semaphore.
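A stripped-down version of the soda machine might look like the following in Python. It is a simplification, not the full story above: the vendor here just restocks a fixed number of sodas instead of waiting for the machine to run empty, and all names (`run_soda_machine`, `buyer`, `vendor`) are hypothetical.

```python
import threading

def run_soda_machine(n_buyers=4, restock=4):
    """Buyers block on an empty machine until the vendor restocks it."""
    sodas = threading.Semaphore(0)    # machine starts empty
    bought = []
    bought_lock = threading.Lock()    # protects the 'bought' list

    def buyer(name):
        sodas.acquire()               # down(): sleeps while no soda is available
        with bought_lock:
            bought.append(name)

    def vendor():
        for _ in range(restock):
            sodas.release()           # up(): one more soda, wakes one waiter

    buyers = [threading.Thread(target=buyer, args=(i,))
              for i in range(n_buyers)]
    for t in buyers:
        t.start()
    stocker = threading.Thread(target=vendor)
    stocker.start()
    for t in buyers + [stocker]:
        t.join()
    return sorted(bought)

if __name__ == "__main__":
    print(run_soda_machine())
```

With as many sodas restocked as there are buyers, every buyer eventually gets through; with fewer, the remaining buyers would stay asleep in `acquire()`.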
Certainly there's overlap between the two choices. There are many scenarios where a semaphore or a condition variable (or set of condition variables) could both serve your purposes. Both semaphores and condition variables are associated with a lock object that they use to maintain mutual exclusion, but then they provide extra functionality on top of the lock for synchronizing thread execution. It's mostly up to you to figure out which one makes the most sense for your situation.
That's not necessarily the most technical description, but that's how it makes sense in my head.
Let's reveal what's under the hood.
A condition variable is essentially a wait queue that supports blocking-wait and wake-up operations, i.e. you can put a thread into the wait queue and set its state to BLOCKED, and take a thread out of it and set its state to READY.
Note that to use a condition variable, two other elements are needed:
a condition (typically implemented by checking a flag or a counter)
a mutex that protects the condition
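The protocol then becomes:
acquire the mutex
check the condition
if the thread has to wait, block and release the mutex in one atomic step, then re-acquire the mutex and re-check the condition on wake-up; otherwise carry on and release the mutex when done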
A semaphore is essentially a counter + a mutex + a wait queue, and it can be used as is, without external dependencies. You can use it either as a mutex or in place of a condition variable.
Therefore, a semaphore can be treated as a more sophisticated structure than a condition variable, while the latter is more lightweight and flexible.
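The "counter + mutex + wait queue" description can be sketched in Python as below. This is a teaching sketch, not how any particular OS or library implements semaphores: `SketchSemaphore`, `down`, `up` and `run_limited` are names invented here, and `threading.Condition` is used because it already packages a mutex together with a wait queue.

```python
import threading
import time

class SketchSemaphore:
    """Counting semaphore built from a counter, a mutex and a wait queue."""

    def __init__(self, value=1):
        self._count = value                                 # the counter
        self._cond = threading.Condition(threading.Lock())  # mutex + wait queue

    def down(self):
        with self._cond:
            while self._count == 0:   # nothing available: sleep in the queue
                self._cond.wait()
            self._count -= 1

    def up(self):
        with self._cond:
            self._count += 1
            self._cond.notify()       # wake one sleeper, if there is one

def run_limited(workers=5, limit=2):
    """Let at most `limit` workers into the guarded section at a time."""
    sem = SketchSemaphore(limit)
    state = {"inside": 0, "peak": 0, "done": 0}
    state_lock = threading.Lock()

    def work():
        sem.down()
        with state_lock:
            state["inside"] += 1
            state["peak"] = max(state["peak"], state["inside"])
        time.sleep(0.01)              # pretend to use the resource
        with state_lock:
            state["inside"] -= 1
            state["done"] += 1
        sem.up()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"], state["done"]

if __name__ == "__main__":
    print(run_limited())
```

In real code you would simply use `threading.Semaphore`; the point of the sketch is only to show the three ingredients working together.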
Semaphores can be used to implement exclusive access to variables, however they are meant to be used for synchronization. Mutexes, on the other hand, have semantics that are strictly tied to mutual exclusion: only the thread or process that locked the resource is allowed to unlock it.
Unfortunately you cannot implement synchronization with mutexes alone, which is why we have condition variables. Also notice that with condition variables you can wake all the waiting threads at once by using the broadcast operation. A semaphore has no single equivalent operation; you would have to post it once per waiting thread.
Semaphores and condition variables are very similar and are used mostly for the same purposes. However, there are differences that can make one preferable. For example, barrier synchronization does not map naturally onto a single counting semaphore, while a condition variable is ideal.
Barrier synchronization is when you want all of your threads to wait until everyone has arrived at a certain point in the thread function. This can be implemented with a static variable that initially holds the total number of threads and is decremented by each thread when it reaches the barrier. We want each thread to sleep until the last one arrives. A semaphore initialized to that count would do the exact opposite: each thread would decrement it and keep running, and only a thread arriving after the value has reached 0 would go to sleep.
A condition variable, on the other hand, is ideal. When each thread gets to the barrier, it decrements the static counter and checks whether it is zero. If not, the thread goes to sleep with the condition variable's wait function. When the last thread arrives at the barrier, the counter is decremented to zero and this last thread calls the condition variable's broadcast (signal-all) function, which wakes up all the other threads.
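The counter-plus-condition-variable barrier described above could be sketched in Python like this. It is a single-use barrier written for illustration only; `SketchBarrier` and `run_barrier` are hypothetical names, and the standard library already ships `threading.Barrier` for real use.

```python
import threading

class SketchBarrier:
    """Single-use barrier: a counter guarded by a condition variable."""

    def __init__(self, parties):
        self._remaining = parties          # threads still to arrive
        self._cond = threading.Condition()

    def wait(self):
        with self._cond:
            self._remaining -= 1
            if self._remaining == 0:
                self._cond.notify_all()    # last arrival wakes everyone
            else:
                while self._remaining > 0: # guard against spurious wake-ups
                    self._cond.wait()

def run_barrier(n_threads=4):
    """Record when each thread arrives at and passes the barrier."""
    barrier = SketchBarrier(n_threads)
    log = []
    log_lock = threading.Lock()

    def worker():
        with log_lock:
            log.append("arrive")
        barrier.wait()                     # nobody passes until all arrived
        with log_lock:
            log.append("pass")

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log

if __name__ == "__main__":
    print(run_barrier())
```

Every "arrive" entry ends up ahead of every "pass" entry in the log, since no thread gets past `wait()` before the counter reaches zero.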
I file condition variables under monitor synchronization. I've generally seen semaphores and monitors as two different synchronization styles. There are differences between the two in terms of how much state data is inherently kept and how you want to model code - but there really isn't any problem that can be solved by one but not the other.
I tend to code towards monitor form; in most languages I work in that comes down to mutexes, condition variables, and some backing state variables. But semaphores would do the job too.
A semaphore needs to know its count up front, at initialization. There is no such requirement for condition variables.
The mutex and the condition variable can both be derived from the semaphore.
For a mutex, the semaphore uses two states: 0 and 1.
For condition variables, the semaphore uses a counter.
They are like syntactic sugar:
conditionalVar + mutex == semaphore

Resources