Can a Boolean flag be used instead of a semaphore?

A semaphore signals whether a resource is free or in use. Can we not replace it with a Boolean flag? How is a semaphore different from a flag?

Semaphores count; one increments and decrements them — they tell you how many of a resource is available and allow you to wait for one. A Boolean does not count.
Thread-safe use of a Boolean would require some other synchronisation mechanism. The main risk is that code like this:
if (!flag) {
    flag = true;
    ...
}
results in two threads simultaneously checking flag and proceeding before either has set it.
A fairly common assembly instruction is atomic test and set (or clear), which does the two things as a single atomic step. That's often used for basic synchronisation.
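As a rough illustration of the test-and-set idea, here is a minimal C++ sketch built on std::atomic_flag. The SpinLock class and the count_with_spinlock helper are hypothetical names invented for this example; a real program would normally prefer std::mutex.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical minimal spinlock built on atomic test-and-set.
class SpinLock {
public:
    void lock() {
        // test_and_set() sets the flag and returns its previous value in one
        // atomic step; we own the lock only if the previous value was clear.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical demo: several threads increment a plain int under the lock.
int count_with_spinlock(int threads, int iterations) {
    SpinLock lock;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;  // protected by the spinlock
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Because the check and the set happen as one atomic step, two threads cannot both observe the flag as clear, which is exactly what the plain if(!flag) version cannot guarantee.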

Related

Why does std::condition_variable::wait need a mutex?

TL;DR
Why does std::condition_variable::wait need a mutex as one of its arguments?
Answer 1
You may look at the documentation and quote that:
wait... Atomically releases lock
But that's not a real reason. That just validates my question even more: why does it need it in the first place?
Answer 2
The predicate most likely queries the state of a shared resource, so it must be guarded by a lock.
OK. fair.
Two questions here
Is it always true that the predicate queries the state of a shared resource? I assume yes. It doesn't make sense to me to implement it otherwise.
What if I do not pass any predicate (it is optional)?
Using predicate - lock makes sense
int i = 0;
void waits()
{
    std::unique_lock<std::mutex> lk(cv_m);
    cv.wait(lk, []{return i == 1;});
    std::cout << i;
}
Not Using predicate - why can't we lock after the wait?
int i = 0;
void waits()
{
    cv.wait(lk);
    std::unique_lock<std::mutex> lk(cv_m);
    std::cout << i;
}
Notes
I know that there are no harmful implications to this practice. I just don't know how to explain to myself why it was designed this way.
Question
If predicate is optional and is not passed to wait, why do we need the lock?

When using a condition variable to wait for a condition, a thread performs the following sequence of steps:
1. It determines that the condition is not currently true.
2. It starts waiting for some other thread to make the condition true. This is the wait call.
For example, the condition might be that a queue has elements in it, and a thread might see that the queue is empty and wait for another thread to put things in the queue.
If another thread were to intercede between these two steps, it could make the condition true and notify on the condition variable before the first thread actually starts waiting. In this case, the waiting thread would not receive the notification, and it might never stop waiting.
The purpose of requiring the lock to be held is to prevent other threads from interceding like this. Additionally, the lock must be unlocked to allow other threads to do whatever we're waiting for, but it can't happen before the wait call because of the notify-before-wait problem, and it can't happen after the wait call because we can't do anything while we're waiting. It has to be part of the wait call, so wait has to know about the lock.
Now, you might look at the notify_* methods and notice that those methods don't require the lock to be held, so there's nothing actually stopping another thread from notifying between steps 1 and 2. However, a thread calling notify_* is supposed to hold the lock while performing whatever action it does to make the condition true, which is usually enough protection.
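The queue example can be sketched as follows. This is only an illustration: the names queue_mutex, queue_cv, work_queue, push_item, pop_item and transfer_one are all made up for this sketch. The waiting side checks the condition and calls wait while holding the lock, and the notifying side changes the queue while holding the same lock.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

std::mutex queue_mutex;             // protects work_queue
std::condition_variable queue_cv;   // signalled when work_queue gains an item
std::queue<int> work_queue;

void push_item(int value) {
    {
        // Make the condition true while holding the lock, so the change
        // cannot slip in between the consumer's check and its wait.
        std::lock_guard<std::mutex> guard(queue_mutex);
        work_queue.push(value);
    }
    queue_cv.notify_one();
}

int pop_item() {
    std::unique_lock<std::mutex> lock(queue_mutex);
    // Step 1: check the condition. Step 2: wait. The lock is held from the
    // check until wait() atomically releases it and blocks.
    while (work_queue.empty()) {
        queue_cv.wait(lock);
    }
    int value = work_queue.front();
    work_queue.pop();
    return value;
}

// Hypothetical demo: hand one value from a producer to a consumer thread.
int transfer_one(int value) {
    int received = 0;
    std::thread consumer([&] { received = pop_item(); });
    std::thread producer([&] { push_item(value); });
    producer.join();
    consumer.join();
    return received;
}
```

Whichever thread runs first, the consumer either sees the item on its first check or is already registered as a waiter when the notification arrives.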

TL;DR
If predicate is optional and is not passed to wait, why do we need the lock?
condition_variable is designed to wait for a certain condition to come true, not to wait just for a notification. So to "catch" the "moment" when the condition becomes true you need to check the condition and wait for the notification. And to avoid a race condition you need those two to be a single atomic operation.
Purpose Of condition_variable:
Enable a program to implement this: do some action when a condition C holds.
Intended Protocol:
Condition producer changes state of the world from !C to C.
Condition consumer waits for C to happen and takes the action while/after condition C holds.
Simplification:
For simplicity (to limit the number of cases to think about) let's assume that C never switches back to !C. Let's also forget about spurious wakeups. Even with these assumptions we'll see that the lock is necessary.
Naive Approach:
Let's have two threads with an essential code summarized like this:
void producer() {
    _condition = true;
    _condition_variable.notify_all();
}
void consumer() {
    if (!_condition) {
        _condition_variable.wait();
    }
    action();
}
The Problem:
The problem here is a race condition. A problematic interleaving of the threads is the following:
The consumer reads the condition, finds it false and decides to wait.
A thread scheduler interrupts the consumer and resumes the producer.
The producer updates the condition to true and invokes notify_all().
The consumer is resumed.
The consumer actually calls wait(), but is never notified and woken up (a liveness hazard).
So without locking the consumer may miss the event of the condition becoming true.
Solution:
Disclaimer: this code still does not handle spurious wakeups and possibility of condition becoming false again.
void producer() {
    {
        std::unique_lock<std::mutex> l(_mutex);
        _condition = true;
    }
    _condition_variable.notify_all();
}
void consumer() {
    {
        std::unique_lock<std::mutex> l(_mutex);
        if (!_condition) {
            _condition_variable.wait(l);
        }
    }
    action();
}
Here we check condition, release lock and start waiting as a single atomic operation, preventing the race condition mentioned before.
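The disclaimer above notes that this solution does not handle spurious wakeups. One common way to cover that case is the predicate overload of wait, which re-checks the condition every time the thread wakes. The sketch below is an assumption-laden illustration, not the answerer's code: the Event class and run_event_demo are hypothetical names.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical one-shot event: set() makes the condition true, wait() blocks
// until it is true, re-checking after every wakeup.
class Event {
public:
    void set() {
        {
            std::lock_guard<std::mutex> l(mutex_);
            condition_ = true;
        }
        cv_.notify_all();
    }

    void wait() {
        std::unique_lock<std::mutex> l(mutex_);
        // Equivalent to: while (!condition_) cv_.wait(l);
        // A spurious wakeup simply re-evaluates the predicate and waits again.
        cv_.wait(l, [this] { return condition_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool condition_ = false;
};

// Hypothetical demo: the consumer reads payload only after the event is set.
int run_event_demo(int value) {
    Event ready;
    int payload = 0;
    int seen = -1;
    std::thread consumer([&] {
        ready.wait();
        seen = payload;  // ordered after the producer's write via the mutex
    });
    std::thread producer([&] {
        payload = value;
        ready.set();
    });
    producer.join();
    consumer.join();
    return seen;
}
```

If the producer runs first, the predicate is already true and wait returns without blocking, so the missed-notification interleaving cannot occur.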
See Also
Why Lock condition await must hold the lock

You need a std::unique_lock when using std::condition_variable for the same reason you need a std::FILE* when using std::fwrite and for the same reason a BasicLockable is necessary when using std::unique_lock itself.
The feature std::fwrite gives you, the entire reason it exists, is to write to files. So you have to give it a file. The feature std::unique_lock provides you is RAII locking and unlocking of a mutex (or another BasicLockable, like std::shared_mutex, etc.) so you have to give it something to lock and unlock.
The feature std::condition_variable provides, the entire reason it exists, is atomically unlocking a lock and waiting (and, on completing the wait, locking it again). So you have to give it something to lock.
Why someone would want that is a separate question that has been discussed already. For example:
When is a condition variable needed, isn't a mutex enough?
Conditional Variable vs Semaphore
Advantages of using condition variables over mutex
And so on.
As has been explained, the pred parameter is optional, but having some sort of predicate and testing it isn't. In other words, not having a predicate makes no more sense than having a condition variable without a lock.
The reason you have a lock is because you have shared state you need to protect from simultaneous access. Some function of that shared state is the predicate.
If you don't have a predicate and you don't have a lock you really don't need a condition variable just like if you don't have a file you really don't need fwrite.
A final point is that the second code snippet you wrote is very broken. Obviously it won't compile as you define the lock after you try to pass it as an argument to condition_variable::wait(). You probably meant something like:
std::mutex mtx_cv;
std::condition_variable cv;
...
{
    std::unique_lock<std::mutex> lk(mtx_cv);
    cv.wait(lk);
    lk.lock(); // throws std::system_error with an error code of std::errc::resource_deadlock_would_occur
}
The reason this is wrong is very simple. condition_variable::wait's effects are (from [thread.condition.condvar]):
Effects:
— Atomically calls lock.unlock() and blocks on *this.
— When unblocked, calls lock.lock() (possibly blocking on the lock), then returns.
— The function will unblock when signaled by a call to notify_one() or a call to notify_all(), or spuriously
After the return from wait() the lock is locked, and unique_lock::lock() throws an exception if it has already locked the mutex it wraps ([thread.lock.unique.locking]).
Again, why someone would want to couple waiting and locking the way std::condition_variable does is a separate question, but given that it does, you cannot, by definition, lock a std::condition_variable's std::unique_lock after std::condition_variable::wait has returned.
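The point that the lock is already held when wait returns can be checked with unique_lock::owns_lock(). The following is a small sketch under that assumption; lock_is_held_after_wait and the ready flag are hypothetical names used only here.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical check: report whether the unique_lock owns the mutex
// immediately after condition_variable::wait returns.
bool lock_is_held_after_wait() {
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;
    bool held = false;

    std::thread waiter([&] {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [&] { return ready; });
        // wait() re-acquired the mutex before returning, so calling
        // lk.lock() here would be an error; owns_lock() reports true.
        held = lk.owns_lock();
    });

    {
        std::lock_guard<std::mutex> guard(m);
        ready = true;
    }
    cv.notify_one();
    waiter.join();
    return held;
}
```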

It's not stated in the documentation (and could be implemented differently) but conceptually you can imagine the condition variable has another mutex to both protect its own data but also coordinate the condition, waiting and notification with modification of the consumer code data (e.g. queue.size()) affecting the test.
So when you call wait(...) the following (logically) happens.
Precondition: The consumer code holds the lock (CCL) controlling the consumer condition data (CCD).
The condition is checked, if true, execution in the consumer code continues still holding the lock.
If false, it first acquires its own lock (CVL), adds the current thread to the collection of waiting threads, releases the consumer lock (CCL), puts the thread to sleep and releases its own lock (CVL).
That final step is tricky because it needs to sleep the thread and release the CVL at the same time or in that order or in a way that threads notified just before going to wait are able to (somehow) not go to wait.
The step of acquiring the CVL before releasing the CCL is key. Any parallel thread trying to update the CCD and notify will be blocked either by the CCL or the CVL. If the CCL were released before acquiring the CVL, a parallel thread could acquire the CCL, change the data and then notify before the to-be-waiting thread is added to the waiters.
A parallel thread acquires the CCL, modifies the data to make the condition true (or at least worth testing) and then notifies. Notification acquires the CVL and identifies a blocked thread (or threads), if any, to unwait. The unwaited threads then seek to acquire the CCL and may block there, but won't leave wait and re-perform the test until they've acquired it.
Notification must acquire the CVL to make sure threads that have found the test false have been added to the waiters.
It's OK (possibly preferable for performance) to notify without holding the CCL because the hand-off between the CCL and CVL in the wait code is ensuring the ordering.
It may be preferable because notifying while holding the CCL may mean all the unwaited threads just unwait to block (on the CCL) while the thread modifying the data is still holding the lock.
Notice that even if the CCD is atomic you must modify it while holding the CCL; otherwise the "lock CVL, unlock CCL" step won't ensure the total ordering required to make sure notifications aren't sent while threads are in the process of going to wait.
The standard only talks about atomicity of operations, and another implementation may have a way of blocking notification until the "add to waiters" step has completed following a failed test. The C++ Standard is careful not to dictate an implementation.
In all that, to answer some of the specific questions.
Must the state be shared? Sort of. There could be an external condition like a file being in a directory and the wait is timed to re-try after a time-period. You can decide for yourself whether you consider the file system or even just the wall-clock to be shared state.
Must there be any state? Not necessarily. A thread can wait on notification.
That could be tricky to coordinate because there has to be enough sequencing to stop the other thread notifying out of turn. The most common solution is to have some boolean flag set by the notifying thread so the notified thread knows if it missed the notification. The normal use of void wait(std::unique_lock<std::mutex>& lk) is when the predicate is checked outside:
std::unique_lock<std::mutex> ulk(ccd_mutex);
while (!condition) {
    cv.wait(ulk);
}
Where the notifying thread uses:
{
    std::lock_guard<std::mutex> guard(ccd_mutex);
    condition = true;
}
cv.notify_one();

The reason is that the waiting thread often already holds m_mutex:
#include <mutex>
#include <condition_variable>
void CMyClass::MyFunc()
{
    std::unique_lock<std::mutex> guard(m_mutex);
    // do something (on the protected resource)
    m_condition.wait(guard, [this]() { return !m_bSpuriousWake; });
    // do something else (on the protected resource)
    guard.unlock();
    // do something else than else
}
and a thread should never go to sleep while holding m_mutex. One doesn't want to lock everybody out while sleeping. So, atomically, guard is unlocked and the thread goes to sleep. Once it is woken up by the other thread (m_condition.notify_one(), let's say), guard is locked again, and then the thread continues.

Because otherwise there is a race between the waiting thread noticing the change of the shared state and the wait() call.
Assume we have a shared state of type std::atomic<int> state_; there's still a fair chance for the waiting thread to miss a notification:
   T1 (waiting)                                  | T2 (notification)
-------------------------------------------------+---------------------------
1) for (int i = state_; i != 0; i = state_) {    |
2)                                               | state_ = 0;
3)                                               | cv.notify();
4)     cv.wait();                                |
5) }                                             |
6) // go on with the satisfied condition...      |
Note that the wait() call in T1 starts only after the notification has already been sent, so T1 never sees the latest value of state_ and may keep waiting forever.

Use a Monitor like a Semaphore?

When using monitors for most concurrency problems, you can just put the critical section inside a monitor method and then invoke the method. However, there are some multiplexing problems wherein up to n threads can run their critical sections simultaneously. So we can say that it's useful to know how to use a monitor like the following:
monitor.enter();
runCriticalSection();
monitor.exit();
What can we use inside the monitors so we can go about doing this?
Side question: Are there standard resources tackling this? Most of what I read involve only putting the critical section inside the monitor. For semaphores there is "The Little Book of Semaphores".

As far as I understand your question, any solution must satisfy this:
When fewer than n threads are in the critical section, a thread calling monitor.enter() should not block—i.e. the only thing preventing it from progressing should be the whims of the scheduler.
At most n threads are in the critical section at any point in time; implying that
When thread n+1 calls monitor.enter(), it must block until a thread calls monitor.exit().
As far as I can tell, your requirements are equivalent to this:
The "monitor" is a semaphore with an initial value of n.
monitor.enter() is semaphore.prolaag() (aka P, decrement or wait)
monitor.exit() is semaphore.verhoog() (aka V, increment or signal)
So here it is, a semaphore implemented from a monitor:
monitor Semaphore(n):
    int capacity = n
    method enter:
        while capacity == 0: wait()
        capacity -= 1
    method exit:
        capacity += 1
        signal()
Use it like this:
shared state:
    monitor = Semaphore(n)
each thread:
    monitor.enter()
    runCriticalSection()
    monitor.exit()
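For readers who want something compilable, the monitor pseudocode above could be translated to C++ roughly as follows, with a std::mutex and std::condition_variable playing the role of the monitor. This is one possible sketch, not a canonical implementation; the peak_concurrency helper and its 5 ms sleep are made up purely to observe how many threads are inside at once.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Monitor-style semaphore: the mutex is the monitor lock, the condition
// variable stands in for the monitor's wait()/signal().
class Semaphore {
public:
    explicit Semaphore(int n) : capacity_(n) {}

    void enter() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return capacity_ > 0; });  // while capacity == 0: wait()
        --capacity_;
    }

    void exit() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++capacity_;
        }
        cv_.notify_one();  // signal()
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int capacity_;
};

// Hypothetical demo: returns the largest number of threads that were ever
// inside the critical section simultaneously.
int peak_concurrency(int limit, int threads) {
    Semaphore monitor(limit);
    std::atomic<int> inside{0};
    std::atomic<int> peak{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            monitor.enter();
            int now = ++inside;
            int seen = peak.load();
            // Record a new maximum if this thread observed one.
            while (now > seen && !peak.compare_exchange_weak(seen, now)) {
            }
            std::this_thread::sleep_for(std::chrono::milliseconds(5));
            --inside;
            monitor.exit();
        });
    }
    for (auto& th : pool) th.join();
    return peak.load();
}
```

With a limit of n, the observed peak should never exceed n, which matches the requirement that thread n+1 blocks in enter() until some thread calls exit().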
The other path
I guess that you might want some kind of syntactic wrapper, let's call it Multimonitor, so you can write something like this:
Multimonitor(n):
    method critical_section_a:
        <statements>
    method critical_section_b:
        <statements>
And your run-time environment would ensure that at most n threads are active inside any of the monitor methods (in your case you just wanted one method). I know of no such feature in any programming language or runtime environment.
Perhaps in Python you can create a Multimonitor class containing all the book-keeping variables, then subclass from it and put decorators on all the methods; a solution involving a metaclass might be able to do the decorating for the user.
The third option
If you implement monitors using semaphores, you're often using a semaphore as a mutex around monitor entry and resume points. I think you could initialize such a semaphore with a value larger than one and thereby produce such a Multimonitor, complete with wait() and signal() on condition variables. But: it would do more than what you need in your stated question, and if you use semaphores, why not just use them in the basic and straightforward way?

Why is threading dangerous?

I've always been told to put locks around variables that multiple threads will access. I've always assumed that this was because you want to make sure that the value you are working with doesn't change before you write it back,
i.e.
mutex.lock()
int a = sharedVar
a = someComplexOperation(a)
sharedVar = a
mutex.unlock()
And that makes sense that you would lock that. But in other cases I don't understand why I can't get away with not using Mutexes.
Thread A:
sharedVar = someFunction()
Thread B:
localVar = sharedVar
What could possibly go wrong in this instance? Especially if I don't care that Thread B reads any particular value that Thread A assigns.

It depends a lot on the type of sharedVar, the language you're using, any framework, and the platform. In many cases, it's possible that assigning a single value to sharedVar may take more than one instruction, in which case you may read a "half-set" copy of the value.
Even when that's not the case, and the assignment is atomic, you may not see the latest value without a memory barrier in place.
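In C++, one way to rule out both problems for a single value is std::atomic, which guarantees that loads and stores are indivisible and lets you request ordering explicitly. The sketch below is illustrative only: shared_var, the two bit patterns and reader_sees_only_whole_values are assumptions made for this example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Shared 64-bit value; std::atomic guarantees untorn loads and stores.
std::atomic<std::uint64_t> shared_var{0};

// Hypothetical check: a writer flips between two bit patterns while the
// reader verifies that it never observes a mixture of the two.
bool reader_sees_only_whole_values(int iterations) {
    const std::uint64_t all_zero = 0x0000000000000000ULL;
    const std::uint64_t all_one = 0xFFFFFFFFFFFFFFFFULL;
    std::atomic<bool> stop{false};

    std::thread writer([&] {
        while (!stop.load(std::memory_order_relaxed)) {
            // Release stores make earlier writes visible to acquire loads.
            shared_var.store(all_zero, std::memory_order_release);
            shared_var.store(all_one, std::memory_order_release);
        }
    });

    bool ok = true;
    for (int i = 0; i < iterations; ++i) {
        std::uint64_t local = shared_var.load(std::memory_order_acquire);
        if (local != all_zero && local != all_one) {
            ok = false;  // would indicate a "half-set" value
        }
    }

    stop.store(true, std::memory_order_relaxed);
    writer.join();
    return ok;
}
```

With a plain std::uint64_t instead of std::atomic, the same program would contain a data race, and on some platforms the reader could see half of one pattern and half of the other.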

MSDN Magazine has a good explanation of different problems you may encounter in multithreaded code:
Forgotten Synchronization
Incorrect Granularity
Read and Write Tearing
Lock-Free Reordering
Lock Convoys
Two-Step Dance
Priority Inversion
The code in your question is particularly vulnerable to Read/Write Tearing. But your code, having neither locks nor memory barriers, is also subject to Lock-Free Reordering (which may include speculative writes in which thread B reads a value that thread A never stored) in which side-effects become visible to a second thread in a different order from how they appeared in your source code.
It goes on to describe some known design patterns which avoid these problems:
Immutability
Purity
Isolation
The article is available here

The main problem is that the assignment operator (operator= in C++) is not always guaranteed to be atomic (not even for primitive, built-in types). In plain English, that means an assignment can take more than one machine instruction to complete. If the thread gets interrupted in the middle of that, the current value of the variable might be corrupted.
Let me build off of your example:
Let's say sharedVar is some object with operator= defined like this:
object& operator=(const object& other) {
    ready = false;
    doStuff(other);
    if (other.value == true) {
        value = true;
        doOtherStuff();
    } else {
        value = false;
    }
    ready = true;
    return *this;
}
If thread A from your example is interrupted in the middle of this function, ready will still be false when thread B starts to run. This could mean that the object is only partially copied over, or is in some intermediate, invalid state when thread B attempts to copy it into a local variable.
For a particularly nasty example of this, think of a data structure with a removed node being deleted, then interrupted before it could be set to NULL.
(For some more information regarding structures that don't need a lock (aka, are atomic), here is another question that talks a bit more about that.)

This could go wrong, because threads can be suspended and resumed by the thread scheduler, so you can't be sure about the order these instructions are executed. It might just as well be in this order:
Thread B:
localVar = sharedVar
Thread A:
sharedVar = someFunction()
In which case localVar will be null or 0 (or some completely unexpected value in an unsafe language), probably not what you intended.
Mutexes actually won't fix this particular issue, by the way. The example you supply does not lend itself well to parallelization.

Synchronization among 2 threads in linux pthreads

On Linux, how can I synchronize between 2 threads (using pthreads)?
I would like a thread, under some conditions, to block itself and later be resumed by another thread. In Java, there are the wait() and notify() functions. I am looking for the equivalent in pthreads:
I have read this, but it only has mutex, which is kind of like Java's synchronized keyword. That is not what I am looking for.
https://computing.llnl.gov/tutorials/pthreads/#Mutexes
Thank you.

You need a mutex, a condition variable and a helper variable.
In thread 1:
pthread_mutex_lock(&mtx);
// We wait for helper to change (which is the true indication we are
// ready) and use a condition variable so we can do this efficiently.
while (helper == 0)
{
    pthread_cond_wait(&cv, &mtx);
}
pthread_mutex_unlock(&mtx);
In thread 2:
pthread_mutex_lock(&mtx);
helper = 1;
pthread_cond_signal(&cv);
pthread_mutex_unlock(&mtx);
The reason you need a helper variable is because condition variables can suffer from spurious wakeup. It's the combination of a helper variable and a condition variable that gives you exact semantics and efficient waiting.

You can also look at spin locks. Try the man pages (or a search) for pthread_spin_init and pthread_spin_lock as a starting point.
Depending on the specifics of your application, they might be more appropriate than a mutex.

What is a semaphore?

A semaphore is a programming concept that is frequently used to solve multi-threading problems. My question to the community:
What is a semaphore and how do you use it?

Think of semaphores as bouncers at a nightclub. There are a dedicated number of people that are allowed in the club at once. If the club is full no one is allowed to enter, but as soon as one person leaves another person might enter.
It's simply a way to limit the number of consumers for a specific resource. For example, to limit the number of simultaneous calls to a database in an application.
Here is a very pedagogic example in C# :-)
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;
namespace TheNightclub
{
    public class Program
    {
        public static Semaphore Bouncer { get; set; }
        public static void Main(string[] args)
        {
            // Create the semaphore with 3 slots, where 3 are available.
            Bouncer = new Semaphore(3, 3);
            // Open the nightclub.
            OpenNightclub();
        }
        public static void OpenNightclub()
        {
            for (int i = 1; i <= 50; i++)
            {
                // Let each guest enter on their own thread.
                Thread thread = new Thread(new ParameterizedThreadStart(Guest));
                thread.Start(i);
            }
        }
        public static void Guest(object args)
        {
            // Wait to enter the nightclub (a semaphore to be released).
            Console.WriteLine("Guest {0} is waiting to enter the nightclub.", args);
            Bouncer.WaitOne();
            // Do some dancing.
            Console.WriteLine("Guest {0} is doing some dancing.", args);
            Thread.Sleep(500);
            // Let one guest out (release one semaphore).
            Console.WriteLine("Guest {0} is leaving the nightclub.", args);
            Bouncer.Release(1);
        }
    }
}

The article Mutexes and Semaphores Demystified by Michael Barr is a great short introduction into what makes mutexes and semaphores different, and when they should and should not be used. I've excerpted several key paragraphs here.
The key point is that mutexes should be used to protect shared resources, while semaphores should be used for signaling. You should generally not use semaphores to protect shared resources, nor mutexes for signaling. There are issues, for instance, with the bouncer analogy in terms of using semaphores to protect shared resources - you can use them that way, but it may cause hard to diagnose bugs.
While mutexes and semaphores have some similarities in their implementation, they should always be used differently.
The most common (but nonetheless incorrect) answer to the question posed at the top is that mutexes and semaphores are very similar, with the only significant difference being that semaphores can count higher than one. Nearly all engineers seem to properly understand that a mutex is a binary flag used to protect a shared resource by ensuring mutual exclusion inside critical sections of code. But when asked to expand on how to use a "counting semaphore," most engineers—varying only in their degree of confidence—express some flavor of the textbook opinion that these are used to protect several equivalent resources.
...
At this point an interesting analogy is made using the idea of bathroom keys as protecting shared resources - the bathroom. If a shop has a single bathroom, then a single key will be sufficient to protect that resource and prevent multiple people from using it simultaneously.
If there are multiple bathrooms, one might be tempted to key them alike and make multiple keys - this is similar to a semaphore being mis-used. Once you have a key you don't actually know which bathroom is available, and if you go down this path you're probably going to end up using mutexes to provide that information and make sure you don't take a bathroom that's already occupied.
A semaphore is the wrong tool to protect several of the essentially same resource, but this is how many people think of it and use it. The bouncer analogy is distinctly different - there aren't several of the same type of resource, instead there is one resource which can accept multiple simultaneous users. I suppose a semaphore can be used in such situations, but rarely are there real-world situations where the analogy actually holds - it's more often that there are several of the same type, but still individual resources, like the bathrooms, which cannot be used this way.
...
The correct use of a semaphore is for signaling from one task to another. A mutex is meant to be taken and released, always in that order, by each task that uses the shared resource it protects. By contrast, tasks that use semaphores either signal or wait—not both. For example, Task 1 may contain code to post (i.e., signal or increment) a particular semaphore when the "power" button is pressed and Task 2, which wakes the display, pends on that same semaphore. In this scenario, one task is the producer of the event signal; the other the consumer.
...
Here an important point is made that mutexes interfere with real time operating systems in a bad way, causing priority inversion where a less important task may be executed before a more important task because of resource sharing. In short, this happens when a lower priority task uses a mutex to grab a resource, A, then tries to grab B, but is paused because B is unavailable. While it's waiting, a higher priority task comes along and needs A, but it's already tied up, and by a process that isn't even running because it's waiting for B. There are many ways to resolve this, but it most often is fixed by altering the mutex and task manager. The mutex is much more complex in these cases than a binary semaphore, and using a semaphore in such an instance will cause priority inversions because the task manager is unaware of the priority inversion and cannot act to correct it.
...
The cause of the widespread modern confusion between mutexes and semaphores is historical, as it dates all the way back to the 1974 invention of the Semaphore (capital "S", in this article) by Dijkstra. Prior to that date, none of the interrupt-safe task synchronization and signaling mechanisms known to computer scientists was efficiently scalable for use by more than two tasks. Dijkstra's revolutionary, safe-and-scalable Semaphore was applied in both critical section protection and signaling. And thus the confusion began.
However, after the appearance of the priority-based preemptive RTOS (e.g., VRTX, ca. 1980), publication of academic papers establishing RMA and the problems caused by priority inversion, and a paper on priority inheritance protocols in 1990, it became apparent to operating system developers that mutexes must be more than just semaphores with a binary counter.
Mutex: resource sharing
Semaphore: signaling
Don't use one for the other without careful consideration of the side effects.

Mutex: exclusive-member access to a resource
Semaphore: n-member access to a resource
That is, a mutex can be used to synchronize access to a counter, file, database, etc.
A semaphore can do the same thing but supports a fixed number of simultaneous callers. For example, I can wrap my database calls in a semaphore(3) so that my multithreaded app will hit the database with at most 3 simultaneous connections. Any further attempts will block until one of the three slots opens up. Semaphores make things like naive throttling really, really easy.

Consider a taxi that can accommodate a total of 3 (rear) + 2 (front) persons, including the driver. A semaphore allows only 5 persons inside the car at a time.
A mutex allows only 1 person on a single seat of the car.
Therefore, a mutex allows exclusive access to a resource (like an OS thread), while a semaphore allows up to n users to access a resource at a time.

@Craig:
A semaphore is a way to lock a resource so that it is guaranteed that while a piece of code is executed, only this piece of code has access to that resource. This keeps two threads from concurrently accessing a resource, which can cause problems.
This is not restricted to only one thread. A semaphore can be configured to allow a fixed number of threads to access a resource.

Semaphore can also be used as a ... semaphore.
For example, suppose you have multiple processes enqueuing data to a queue, and only one task consuming data from the queue. If you don't want your consuming task to constantly poll the queue for available data, you can use a semaphore.
Here the semaphore is not used as an exclusion mechanism, but as a signaling mechanism.
The consuming task waits on the semaphore.
The producing tasks post on the semaphore.
This way the consuming task runs when, and only when, there is data to be dequeued.

There are two essential concepts to building concurrent programs - synchronization and mutual exclusion. We will see how these two types of locks (semaphores are more generally a kind of locking mechanism) help us achieve synchronization and mutual exclusion.
A semaphore is a programming construct that helps us achieve concurrency, by implementing both synchronization and mutual exclusion. Semaphores are of two types, Binary and Counting.
A semaphore has two parts : a counter, and a list of tasks waiting to access a particular resource. A semaphore performs two operations : wait (P) [this is like acquiring a lock], and release (V)[ similar to releasing a lock] - these are the only two operations that one can perform on a semaphore. In a binary semaphore, the counter logically goes between 0 and 1. You can think of it as being similar to a lock with two values : open/closed. A counting semaphore has multiple values for count.
What is important to understand is that the semaphore counter keeps track of the number of tasks that do not have to block, i.e., they can make progress. Tasks block, and add themselves to the semaphore's list only when the counter is zero. Therefore, a task gets added to the list in the P() routine if it cannot progress, and "freed" using the V() routine.
Now, it is fairly obvious to see how binary semaphores can be used to solve synchronization and mutual exclusion - they are essentially locks.
ex. Synchronization:
thread A {
    semaphore &s; // locks/semaphores are passed by reference! Think about why this is so.
    A(semaphore &s): s(s) {} // constructor
    foo() {
        ...
        s.P();
        // some block of code B2
        ...
    }
}
thread B {
    semaphore &s;
    B(semaphore &s): s(s) {} // constructor
    foo() {
        ...
        ...
        // some block of code B1
        s.V();
        ...
    }
}
main() {
    semaphore s(0); // we start the semaphore at 0 (closed)
    A a(s);
    B b(s);
}
In the above example, B2 can only execute after B1 has finished execution. Let's say thread A executes first: it gets to s.P() and waits, since the counter is 0 (closed). Thread B comes along, finishes B1, and then frees thread A, which then completes B2. So we achieve synchronization.
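The B1-before-B2 ordering can be tried out in real C++ with a small semaphore built from a mutex and a condition variable. This is a sketch under assumptions: the Semaphore class here is a hypothetical stand-in for the pseudocode's semaphore type, and run_in_order is an invented helper that records the order in which the two blocks ran.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical minimal semaphore with the P/V operations used in the text.
class Semaphore {
public:
    explicit Semaphore(int initial) : count_(initial) {}

    void P() {  // wait: block while the counter is zero, then decrement
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }

    void V() {  // release: increment and wake one waiter
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++count_;
        }
        cv_.notify_one();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int count_;
};

// Hypothetical demo: thread A may only run B2 after thread B has run B1.
std::string run_in_order() {
    Semaphore s(0);  // start closed, as in the pseudocode
    std::string log;
    std::thread a([&] {
        s.P();        // blocks until B has signalled
        log += "B2";  // block of code B2
    });
    std::thread b([&] {
        log += "B1";  // block of code B1
        s.V();        // frees thread A
    });
    a.join();
    b.join();
    return log;
}
```

Regardless of which thread the scheduler starts first, the log should read B1 followed by B2, because A cannot pass P() until B has called V().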
Now let's look at mutual exclusion with a binary semaphore:
thread mutual_ex{
    semaphore &s;
    mutual_ex(semaphore &s): s(s){} //constructor
    foo(){
        ...
        s.P();
        //critical section
        s.V();
        ...
        ...
        s.P();
        //critical section
        s.V();
        ...
    }
}
main(){
    semaphore s(1); // we start the semaphore at 1 (open)
    mutual_ex m1(s);
    mutual_ex m2(s);
}
The mutual exclusion is quite simple as well: m1 and m2 cannot enter the critical section at the same time. Each thread uses the same semaphore to provide mutual exclusion for both of its critical sections. Now, is it possible to have greater concurrency? That depends on the critical sections. (Think about how else one could use semaphores to achieve mutual exclusion. Hint: do I necessarily need to use only one semaphore?)
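Here is a hedged Java sketch of the mutual-exclusion case, using java.util.concurrent.Semaphore with one permit. MutexDemo and its shared counter are hypothetical names chosen for the example.

```java
import java.util.concurrent.Semaphore;

class MutexDemo {
    static int counter = 0; // shared state, only touched inside the critical section

    static int run(int threads, int incrementsPerThread) throws InterruptedException {
        counter = 0;
        Semaphore s = new Semaphore(1); // binary semaphore, starts open
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int k = 0; k < incrementsPerThread; k++) {
                    s.acquireUninterruptibly(); // P()
                    try {
                        counter++;              // critical section
                    } finally {
                        s.release();            // V()
                    }
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return counter;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(2, 100_000)); // prints 200000
    }
}
```

Without the acquire/release pair around counter++, increments from the two threads could be lost.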
Counting semaphore: A semaphore with more than one value. Let's look at what this is implying - a lock with more than one value?? So open, closed, and ...hmm. Of what use is a multi-stage-lock in mutual exclusion or synchronization?
Let's take the easier of the two:
Synchronization using a counting semaphore: Let's say you have 3 tasks - #1 and 2 you want executed after 3. How would you design your synchronization?
thread t1{
    ...
    s.P();
    //block of code B1
}
thread t2{
    ...
    s.P();
    //block of code B2
}
thread t3{
    ...
    //block of code B3
    s.V();
    s.V();
}
So if your semaphore starts off closed, you ensure that t1 and t2 block and get added to the semaphore's list. Then along comes the all-important t3, finishes its business and frees t1 and t2. What order are they freed in? That depends on the implementation of the semaphore's list. It could be FIFO, it could be based on some particular priority, etc. (Note: think about how you would arrange your P's and V's if you wanted t1 and t2 to be executed in some particular order, and if you weren't aware of the implementation of the semaphore.)
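A rough Java version of this, again with java.util.concurrent.Semaphore: t3 releases two permits at once with release(2), which is equivalent to two V's. FanOutDemo and the order list are invented for the example. Which of B1 and B2 runs first is left open here; Java's Semaphore also has a two-argument constructor, Semaphore(int permits, boolean fair), whose fair mode hands out permits in FIFO order of the acquire calls.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Semaphore;

class FanOutDemo {
    // Returns the order in which the three blocks executed.
    static List<String> run() throws InterruptedException {
        Semaphore s = new Semaphore(0); // starts closed
        List<String> order = new CopyOnWriteArrayList<>();

        Thread t1 = new Thread(() -> {
            s.acquireUninterruptibly(); // P()
            order.add("B1");
        });
        Thread t2 = new Thread(() -> {
            s.acquireUninterruptibly(); // P()
            order.add("B2");
        });
        Thread t3 = new Thread(() -> {
            order.add("B3");
            s.release(2);               // two V's: frees both t1 and t2
        });

        t1.start();
        t2.start();
        t3.start();
        t1.join();
        t2.join();
        t3.join();
        return order;
    }

    public static void main(String[] args) throws InterruptedException {
        // B3 is always first; the order of B1 and B2 is not guaranteed.
        System.out.println(run());
    }
}
```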
(Find out : What happens if the number of V's is greater than the number of P's?)
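One way to find out, at least for Java's java.util.concurrent.Semaphore (other implementations may cap the counter or treat this as an error): release a permit that was never acquired and look at the count. ExtraReleaseDemo is a hypothetical name.

```java
import java.util.concurrent.Semaphore;

class ExtraReleaseDemo {
    static int permitsAfterExtraRelease() {
        Semaphore s = new Semaphore(1); // one permit to start with
        s.release();                    // a V() with no matching P()
        return s.availablePermits();
    }

    public static void main(String[] args) {
        System.out.println(permitsAfterExtraRelease()); // prints 2
    }
}
```

So in Java the initial value is not an upper bound: a stray V silently turns a binary semaphore into one that admits two threads.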
Mutual Exclusion Using counting semaphores: I'd like you to construct your own pseudocode for this (makes you understand things better!) - but the fundamental concept is this : a counting semaphore of counter = N allows N tasks to enter the critical section freely. What this means is you have N tasks (or threads, if you like) enter the critical section, but the N+1th task gets blocked (goes on our favorite blocked-task list), and only is let through when somebody V's the semaphore at least once. So the semaphore counter, instead of swinging between 0 and 1, now goes between 0 and N, allowing N tasks to freely enter and exit, blocking nobody!
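If you want to check your pseudocode against something runnable, here is one possible Java sketch. It records the largest number of tasks that were ever inside the critical section at once. CountingDemo, maxInside and the 20 ms sleep are assumptions made for the demo.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class CountingDemo {
    // Runs `tasks` threads through a section guarded by `permits` permits
    // and returns the highest number of threads seen inside at once.
    static int maxInside(int permits, int tasks) throws InterruptedException {
        Semaphore s = new Semaphore(permits);
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger max = new AtomicInteger();
        Thread[] ts = new Thread[tasks];
        for (int i = 0; i < tasks; i++) {
            ts[i] = new Thread(() -> {
                s.acquireUninterruptibly();      // P(): the N+1th caller blocks here
                try {
                    int now = inside.incrementAndGet();
                    max.accumulateAndGet(now, Math::max);
                    Thread.sleep(20);            // pretend to use the resource
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inside.decrementAndGet();
                    s.release();                 // V(): lets one blocked task in
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            t.join();
        }
        return max.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Never more than 3, however many tasks are started.
        System.out.println(maxInside(3, 10));
    }
}
```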
Now gosh, why would you need such a stupid thing? Isn't the whole point of mutual exclusion to not let more than one guy access a resource?? (Hint Hint...You don't always only have one drive in your computer, do you...?)
To think about : Is mutual exclusion achieved by having a counting semaphore alone? What if you have 10 instances of a resource, and 10 threads come in (through the counting semaphore) and try to use the first instance?
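One possible answer, sketched in Java: the counting semaphore only limits how many threads get in, so choosing which instance each thread gets needs its own mutual exclusion. The ResourcePool class below is hypothetical; it uses a Semaphore for the count and synchronized methods for the bookkeeping, and hands out instance indices rather than real resources.

```java
import java.util.concurrent.Semaphore;

class ResourcePool {
    private final Semaphore available; // counts free instances
    private final boolean[] inUse;     // which instance is taken

    ResourcePool(int size) {
        available = new Semaphore(size);
        inUse = new boolean[size];
    }

    // Blocks while every instance is taken, then returns a free index.
    int acquire() throws InterruptedException {
        available.acquire();
        return claimFree();
    }

    // Gives an instance back; ignores indices that were not checked out.
    void release(int index) {
        if (markFree(index)) {
            available.release();
        }
    }

    // The semaphore admits up to `size` threads at once, so picking an
    // instance still has to be mutually exclusive.
    private synchronized int claimFree() {
        for (int i = 0; i < inUse.length; i++) {
            if (!inUse[i]) {
                inUse[i] = true;
                return i;
            }
        }
        throw new IllegalStateException("holding a permit but no instance is free");
    }

    private synchronized boolean markFree(int index) {
        if (!inUse[index]) {
            return false;
        }
        inUse[index] = false;
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        ResourcePool pool = new ResourcePool(2);
        int a = pool.acquire();
        int b = pool.acquire();
        System.out.println(a + " " + b); // prints 0 1
        pool.release(a);
        pool.release(b);
    }
}
```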

A semaphore controls access to a common resource in a multithreaded environment. In the example below, a semaphore with 4 permits guards a long-running task that is submitted 10 times to a pool of 7 threads.
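import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

ExecutorService executor = Executors.newFixedThreadPool(7);
Semaphore semaphore = new Semaphore(4); // at most 4 tasks hold a permit at once
Runnable longRunningTask = () -> {
    boolean permit = false;
    try {
        // wait up to 1 second for a permit
        permit = semaphore.tryAcquire(1, TimeUnit.SECONDS);
        if (permit) {
            System.out.println("Semaphore acquired");
            // hold the permit for 5 seconds, longer than the 1-second timeout
            Thread.sleep(5000);
        } else {
            System.out.println("Could not acquire semaphore");
        }
    } catch (InterruptedException e) {
        throw new IllegalStateException(e);
    } finally {
        // only give back a permit that was actually acquired
        if (permit) {
            semaphore.release();
        }
    }
};
// execute tasks
for (int j = 0; j < 10; j++) {
    executor.submit(longRunningTask);
}
executor.shutdown();
Output (first lines)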
Semaphore acquired
Semaphore acquired
Semaphore acquired
Semaphore acquired
Could not acquire semaphore
Could not acquire semaphore
Could not acquire semaphore
Sample code from the article

A semaphore is an object containing a natural number (i.e. an integer greater than or equal to zero) on which two modifying operations are defined. One operation, V, adds 1 to the natural number. The other operation, P, decreases the natural number by 1. Both operations are atomic (i.e. no other operation can be executed at the same time as a V or a P).
Because the natural number 0 cannot be decreased, calling P on a semaphore containing a 0 will block the execution of the calling process(/thread) until some moment at which the number is no longer 0 and P can be successfully (and atomically) executed.
As mentioned in other answers, semaphores can be used to restrict access to a certain resource to a maximum (but variable) number of processes.

A hardware or software flag. In multitasking systems, a semaphore is a variable with a value that indicates the status of a common resource. A process needing the resource checks the semaphore to determine the resource's status and then decides how to proceed.

Semaphores act like thread limiters.
Example: Suppose you have a pool of 100 threads and you want to perform some DB operation. If all 100 threads access the DB at the same time, there may be locking issues in the DB, so we can use a semaphore that allows only a limited number of threads in at a time. The example below allows only one thread at a time. When a thread calls the acquire() method, it gets access; after it calls the release() method, it gives up the access so that the next thread can get it.
package practice;
import java.util.concurrent.Semaphore;

public class SemaphoreExample {
    public static void main(String[] args) {
        Semaphore s = new Semaphore(1); // one permit: one thread at a time
        SemaphoreTask s1 = new SemaphoreTask(s);
        SemaphoreTask s2 = new SemaphoreTask(s);
        SemaphoreTask s3 = new SemaphoreTask(s);
        SemaphoreTask s4 = new SemaphoreTask(s);
        SemaphoreTask s5 = new SemaphoreTask(s);
        s1.start();
        s2.start();
        s3.start();
        s4.start();
        s5.start();
    }
}

class SemaphoreTask extends Thread {
    Semaphore s;

    public SemaphoreTask(Semaphore s) {
        this.s = s;
    }

    @Override
    public void run() {
        try {
            s.acquire(); // blocks until the single permit is free
            try {
                Thread.sleep(1000);
                System.out.println(Thread.currentThread().getName() + " Going to perform some operation");
            } finally {
                s.release(); // always give the permit back, even if interrupted while sleeping
            }
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

So imagine everyone is trying to go to the bathroom and there's only a certain number of keys to the bathroom. If there are not enough keys left, the next person needs to wait. So think of a semaphore as representing the set of keys available for the bathrooms (the system resources) that different processes (bathroom goers) can request access to.
Now imagine two processes trying to go to the bathroom at the same time. That's not a good situation and semaphores are used to prevent this. Unfortunately, the semaphore is a voluntary mechanism and processes (our bathroom goers) can ignore it (i.e. even if there are keys, someone can still just kick the door open).
There are also differences between binary/mutex & counting semaphores.
Check out the lecture notes at http://www.cs.columbia.edu/~jae/4118/lect/L05-ipc.html.

This is an old question but one of the most interesting uses of semaphore is a read/write lock and it has not been explicitly mentioned.
The r/w lock works in a simple fashion: a reader consumes one permit and a writer consumes all permits.
This is a trivial implementation of a r/w lock, but every read has to modify the semaphore's state (twice, in fact: once to acquire and once to release), which can become a bottleneck. It is still significantly better than a mutex or lock.
Another downside is that writers can be starved rather easily unless the semaphore is a fair one or the writers acquire their permits in multiple requests; in the latter case they need an explicit mutex between themselves.
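A minimal sketch of that scheme on top of java.util.concurrent.Semaphore follows. SemaphoreRwLock and its method names are hypothetical, maxReaders is an assumed upper bound on concurrent readers, and the semaphore is created in fair mode as one way to address the starvation problem mentioned above.

```java
import java.util.concurrent.Semaphore;

class SemaphoreRwLock {
    private final int maxReaders;
    private final Semaphore permits;

    SemaphoreRwLock(int maxReaders) {
        this.maxReaders = maxReaders;
        // fair = true: blocking acquires are served in FIFO order, so a
        // queued writer is not overtaken by readers that arrive later
        this.permits = new Semaphore(maxReaders, true);
    }

    void lockRead() throws InterruptedException {
        permits.acquire();            // a reader takes one permit
    }

    void unlockRead() {
        permits.release();
    }

    void lockWrite() throws InterruptedException {
        permits.acquire(maxReaders);  // a writer takes all permits in one request
    }

    void unlockWrite() {
        permits.release(maxReaders);
    }

    // Non-blocking variants, handy for experiments and tests.
    boolean tryLockRead() {
        return permits.tryAcquire();
    }

    boolean tryLockWrite() {
        return permits.tryAcquire(maxReaders);
    }

    public static void main(String[] args) throws InterruptedException {
        SemaphoreRwLock lock = new SemaphoreRwLock(8);
        lock.lockRead();
        System.out.println(lock.tryLockWrite()); // prints false: a reader is active
        lock.unlockRead();
        System.out.println(lock.tryLockWrite()); // prints true: no readers left
        lock.unlockWrite();
    }
}
```

Because the writer asks for all permits in a single acquire(maxReaders) call, no extra mutex between writers is needed here.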

Conceptually, a mutex is just a boolean while a semaphore is a counter.
Both are used to lock part of code so it's not accessed by too many threads.
Example
lock.set()
a += 1
lock.unset()
Now if lock were a mutex, it would always be either locked or unlocked (a boolean under the surface), regardless of how many threads try to access the protected snippet of code. While it is locked, any other thread just waits until it is unlocked/unset by the previous thread.
Now imagine that lock is instead, under the hood, a counter with a predefined MAX value (say 2 for our example). If 2 threads try to access the resource, the lock's value is increased to 2. If a 3rd thread then tries to access it, it simply waits for the counter to go below 2, and so on.
If lock, as a semaphore, had a max of 1, it would act exactly like a mutex.

A semaphore (in its binary form) is a way to lock a resource so that, while a piece of code is executing, only that piece of code has access to the resource. This keeps two threads from concurrently accessing a resource, which can cause problems.
