Is this reader-writer lock implementation correct?

Is this reader-writer lock implementation correct? - multithreading

wondering if the following implementation of reader/writer problem correct.
We're using only one mutex and a count variable to indicate the num of readers.
read api:
void read() {
mutex.lock();
count ++;
mutex.unlock();
// Do read
mutex.lock();
count --;
mutex.unlock();
}
write api:
void write() {
while(1) {
mutex.lock();
if(count == 0) {
// Do write
mutex.unlock();
return;
}
mutex.unlock();
}
}
Looks like in the code:
Only one lock is used so there is no deadlock problem;
Writer can only write when count == 0 so there is no race conditions.
As for a read/write problem prior to reader, is there any problem for the above code? Looks like all the standard implementation uses two locks(eg. https://en.wikipedia.org/wiki/Readers%E2%80%93writers_problem#First_readers-writers_problem). If the above implementation seems correct, why are we using two locks in wiki? Thank you!

It's correct, but it will perform atrociously. Imagine if while a reader is trying to do work there are two waiting writers. Those two waiting writers will constantly acquire and release the mutex, saturating the CPU resources while the reader is trying to finish its work so that the system as a whole can make forward progress.
The nightmare scenario would be where the reader shares a physical core with one of the waiting writers. Yikes.
Correct, yes. Useful and sensible, definitely not!
One reason to use two locks is to prevent two writers from competing. A more common solution, at least in my experience, is to use a lock with a condition variable to release waiting writers or alternate phases.

Related

boost shared_mutex destructor

I have a multithreaded app that has to read some data often, and occasionally that data is updated. I have problems with writing by using unique_lock and problems with reading by using upgrade_lock
There is examples of my problems:
void unlock(){
test.stream = 0;
test.mtx.unlock();
}
void lock_mtx(int i){
boost::unique_lock<boost::shared_mutex> lock(test.mtx);
test.stream = i;
boost::this_thread::sleep_for(boost::chrono::milliseconds(10000));
unlock();
boost::this_thread::sleep_for(boost::chrono::milliseconds(1000));
}
When I destruct lock , mutex is already unlocked by this thread, and sometimes it is locked by another thread, but destructor make it free again. After destruction of lock (in the first thread) third thread take mutex and I have two writers at the one moment
void lock_mtx(int i){
boost::upgrade_lock<boost::shared_mutex> lock(test.mtx);
read_from_locked();
boost::this_thread::sleep_for(boost::chrono::milliseconds(5000));
boost::upgrade_to_unique_lock<boost::shared_mutex> uniqueLock(lock);
write_to_locked();
boost::this_thread::sleep_for(boost::chrono::milliseconds(10000));
}
The second problem, is that when some thread takes upgrade_lock, other threads can't read shared objects
Both of problems occur in MS VisualStudio 2013 and Windows8 x64

void lock_mtx(int i)
{
{
boost::unique_lock<boost::shared_mutex> lock(test.mtx);
test.stream = i;
boost::this_thread::sleep_for(boost::chrono::milliseconds(10000));
}
boost::this_thread::sleep_for(boost::chrono::milliseconds(1000));
}
The purpose of unique_lock and lock_guard is to automatically unlock the mutex when it goes out of scope. This pattern is known as RAII.
Rule of thumb: Never manually call lock()/unlock() on your (Basic|Shared)Lockable objects. It's an antipattern because
It's extremely hard to get right (think of exception safety)
It usually indicates a code smell (locks being held across different method calls). If you even need this, consider making the RAII lock guard (lock_guard or unique_lock) a member of the containing class, or return the unique_lock so the caller has the option to explicitly adopt the lock, or to just let it be automatically released by the guard.

java - avoid unnessary thread wake-ups

I have a set of 12 threads executing work (Runnable) in parallel. In essence, each thread does the following:
Runnable r;
while (true) {
synchronized (work) {
while (work.isEmpty()) {
work.wait();
}
r = work.removeFirst();
}
r.execute();
}
Work is added as following:
Runnable r = ...;
synchronized (work) {
work.add(r);
work.notify();
}
When new work is available, it is added to the list and the lock is notified. If there is a thread waiting, it is woken up, so it can execute this work.
Here lies the problem. When a thread is woken up, it is very likely that another thread will execute this work. This happens when the latter thread is done with its previous work and re-enters the while(true)-loop. The smaller/shorter the work actions, the more likely this will happen.
This means I am waking up a thread for nothing. As I need high throughput, I believe this behavior will lower the performance.
How would you solve this? In theory, I need a mechanism which allows me to cancel a pending thread wake-up notification. Of course, this is not possible in Java.
I thought about introducing a work list for each thread. Instead of pushing the work into one single list, the work is spread over the 12 work lists. But I believe this will introduce other problems. For example, one thread might have a lot of work pending, while another thread might have no work pending. In essence, I believe that a solution which assigns work to a particular thread in advance might become very complex and and is sub-optimal.
Thanks!

What you are doing is a thread pooling. Take a look at pre java-5 concurrency framework, PooledExecutor class there:
http://gee.cs.oswego.edu/dl/classes/EDU/oswego/cs/dl/util/concurrent/intro.html

In addition to my previous answer - another solution. This question makes me curious.
Here, I added a check with volatile boolean.
It does not completely avoid the situation of uselessly wakening up a thread but helps to avoid it. Actually, I do not see how this could be completely avoided without additional restrictions like "we know that after 100ms a job will most likely be done".
volatile boolean free = false;
while (true) {
synchronized (work) {
free = false; // new rev.2
while (work.isEmpty()) {
work.wait();
}
r = work.removeFirst();
}
r.execute();
free = true; // new
}
--
synchronized (work) {
work.add(r);
if (!free) { // new
work.notify();
} // new
free = false; // new rev.2
}

Multi-threaded application, parrarel reading with write possibility

For example, I have multi-threaded application which can be presented as:
Data bigData;
void thread1()
{
workOn(bigData);
}
void thread2()
{
workOn(bigData);
}
void thread3()
{
workOn(bigData);
}
There are few threads that are working on data. I could leave it as it is, but the problem is that sometimes (very seldom) data are modified by thread4.
void thread4()
{
sometimesModifyData(bigData);
}
Critical sections could be added there, but it would make no sense to multi-threading, because only one thread could work on data at the same time.
What is the best method to make it sense multi-threading while making it thread safe?
I am thinking about kind of state (sempahore?), that would prevent reading and writing at the same time but would allow parallel reading.

This is called a readers–writer lock. You could implement what is called a mutex to make sure no one reads when write is going on and no one writes when reads are going on. One way to solve the problem would be to have flags. If the writer is got something to modify, then switch on a lock. Upon which NO MORE readers will get to read and after all the current readers have finished, the writer will get to do its job and then again the readers read.

How does a read-write mutex/lock work?

Let's say I'm programming in a threading framework that does not have multiple-reader/single-writer mutexes. Can I implement their functionality with the following:
Create two mutexes: a recursive (lock counting) one for readers and a binary one for the writer.
Write:
acquire lock on binary mutex
wait until recursive mutex has lock count zero
actual write
release lock on binary mutex
Read:
acquire lock on binary mutex (so I know the writer is not active)
increment count of recursive mutex
release lock on binary mutex
actual read
decrement count of recursive mutex
This is not homework. I have no formal training in concurrent programming, and am trying to grasp the issues. If someone can point out a flaw, spell out the invariants or provide a better algorithm, I'd be very pleased. A good reference, either online or on dead trees, would also be appreciated.

The following is taken directly from The Art of Multiprocessor Programming which is a good book to learn about this stuff. There's actually 2 implementations presented: a simple version and a fair version. I'll go ahead and reproduce the fair version.
One of the requirements for this implementation is that you have a condition variable primitive. I'll try to figure out a way to remove it but that might take me a little while. Until then, this should still be better than nothing. Note that it's also possible to implement this primitive using only locks.
public class FifoReadWriteLock {
int readAcquires = 0, readReleases = 0;
boolean writer = false;
ReentrantLock lock;
Condition condition = lock.newCondition(); // This is the condition variable.
void readLock () {
lock.lock();
try {
while(writer)
condition.await();
readAcquires++;
}
finally {
lock.unlock();
}
}
void readUnlock () {
lock.lock();
try {
readReleases++;
if (readAcquires == readReleases)
condition.signalAll();
}
finally {
lock.unlock();
}
}
void writeLock () {
lock.lock();
try {
while (writer)
condition.await();
writer = true;
while (readAcquires != readReleases)
condition.await();
}
finally {
lock.unlock();
}
}
void writeUnlock() {
writer = false;
condition.signalAll();
}
}
First off, I simplified the code a little but the algorithm remains the same. There also happens to be an error in the book for this algorithm which is corrected in the errata. If you plan on reading the book, keep the errata close by or you'll end up being very confused (like me a few minutes ago when I was trying to re-understand the algorithm). Note that on the bright side, this is a good thing since it keeps you on your toes and that's a requirement when you're dealing with concurrency.
Next, while this may be a Java implementation, only use it as pseudo code. When doing the actual implementation you'll have to be carefull about the memory model of the language or you'll definitely end up with a headache. As an example, I think that the readAcquires and readReleases and writer variable all have to be declared as volatile in Java or the compiler is free to optimize them out of the loops. This is because in a strictly sequential programs there's no point in continuously looping on a variable that is never changed inside the loop. Note that my Java is a little rusty so I might be wrong. There's also another issue with integer overflow of the readReleases and readAcquires variables which is ignored in the algorithm.
One last note before I explain the algorithm. The condition variable is initialized using the lock. That means that when a thread calls condition.await(), it gives up its ownership of the lock. Once it's woken up by a call to condition.signalAll() the thread will resume once it has reacquired the lock.
Finally, here's how and why it works. The readReleases and readAcquires variables keep track of the number threads that have acquired and released the read lock. When these are equal, no thread has the read lock. The writer variable indicates that a thread is trying to acquire the write lock or it already has it.
The read lock part of the algorithm is fairly simple. When trying to lock, it first checks to see if a writer is holding the lock or is trying to acquire it. If so, it waits until the writer is done and then claims the lock for the readers by incrementing the readAcquires variable. When unlocking, a thread increases the readReleases variable and if there's no more readers, it notifies any writers that may be waiting.
The write lock part of the algorithm isn't much more complicated. To lock, a thread must first check whether any other writer is active. If they are, it has to wait until the other writer is done. It then indicates that it wants the lock by setting writer to true (note that it doesn't hold it yet). It then waits until there's no more readers before continuing. To unlock, it simply sets the variable writer to false and notifies any other threads that might be waiting.
This algorithm is fair because the readers can't block a writer indefinitely. Once a writer indicates that it wants to acquire the lock, no more readers can acquire the lock. After that the writer simply waits for the last remaining readers to finish up before continuing. Note that there's still the possibility of a writer indefinitely blocking another writer. That's a fairly rare case but the algorithm could be improved to take that into account.
So I re-read your question and realised that I partly (badly) answered it with the algorithm presented below. So here's my second attempt.
The algorithm, you described is fairly similar to the simple version presented in the book I mentionned. The only problem is that A) it's not fair and B) I'm not sure how you would implement wait until recursive mutex has lock count zero. For A), see above and for B), the book uses a single int to keep track of the readers and a condition variable to do the signalling.

You may want to prevent write starvation, to accomplish this you can either give preference to writes or make mutex fair.
ReadWriteLock Java's interface documentation says Writer preference is common,
ReentrantReadWriteLock class documentation says
This class does not impose a reader or writer preference ordering for lock access. However, it does support an optional fairness policy.
Note R..'s comment
Rather than locking and unlocking the binary mutex for reading, you
can just check the binary mutex state after incrementing the count on
the recursive mutex, and wait (spin/yield/futex_wait/whatever) if it's
locked until it becomes unlocked
Recommended reading:
Programming with POSIX Threads
Perl's RWLock
Java's ReadWriteLock documentation.

What is a semaphore?

A semaphore is a programming concept that is frequently used to solve multi-threading problems. My question to the community:
What is a semaphore and how do you use it?

Think of semaphores as bouncers at a nightclub. There are a dedicated number of people that are allowed in the club at once. If the club is full no one is allowed to enter, but as soon as one person leaves another person might enter.
It's simply a way to limit the number of consumers for a specific resource. For example, to limit the number of simultaneous calls to a database in an application.
Here is a very pedagogic example in C# :-)
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;
namespace TheNightclub
{
public class Program
{
public static Semaphore Bouncer { get; set; }
public static void Main(string[] args)
{
// Create the semaphore with 3 slots, where 3 are available.
Bouncer = new Semaphore(3, 3);
// Open the nightclub.
OpenNightclub();
}
public static void OpenNightclub()
{
for (int i = 1; i <= 50; i++)
{
// Let each guest enter on an own thread.
Thread thread = new Thread(new ParameterizedThreadStart(Guest));
thread.Start(i);
}
}
public static void Guest(object args)
{
// Wait to enter the nightclub (a semaphore to be released).
Console.WriteLine("Guest {0} is waiting to entering nightclub.", args);
Bouncer.WaitOne();
// Do some dancing.
Console.WriteLine("Guest {0} is doing some dancing.", args);
Thread.Sleep(500);
// Let one guest out (release one semaphore).
Console.WriteLine("Guest {0} is leaving the nightclub.", args);
Bouncer.Release(1);
}
}
}

The article Mutexes and Semaphores Demystified by Michael Barr is a great short introduction into what makes mutexes and semaphores different, and when they should and should not be used. I've excerpted several key paragraphs here.
The key point is that mutexes should be used to protect shared resources, while semaphores should be used for signaling. You should generally not use semaphores to protect shared resources, nor mutexes for signaling. There are issues, for instance, with the bouncer analogy in terms of using semaphores to protect shared resources - you can use them that way, but it may cause hard to diagnose bugs.
While mutexes and semaphores have some similarities in their implementation, they should always be used differently.
The most common (but nonetheless incorrect) answer to the question posed at the top is that mutexes and semaphores are very similar, with the only significant difference being that semaphores can count higher than one. Nearly all engineers seem to properly understand that a mutex is a binary flag used to protect a shared resource by ensuring mutual exclusion inside critical sections of code. But when asked to expand on how to use a "counting semaphore," most engineers—varying only in their degree of confidence—express some flavor of the textbook opinion that these are used to protect several equivalent resources.
...
At this point an interesting analogy is made using the idea of bathroom keys as protecting shared resources - the bathroom. If a shop has a single bathroom, then a single key will be sufficient to protect that resource and prevent multiple people from using it simultaneously.
If there are multiple bathrooms, one might be tempted to key them alike and make multiple keys - this is similar to a semaphore being mis-used. Once you have a key you don't actually know which bathroom is available, and if you go down this path you're probably going to end up using mutexes to provide that information and make sure you don't take a bathroom that's already occupied.
A semaphore is the wrong tool to protect several of the essentially same resource, but this is how many people think of it and use it. The bouncer analogy is distinctly different - there aren't several of the same type of resource, instead there is one resource which can accept multiple simultaneous users. I suppose a semaphore can be used in such situations, but rarely are there real-world situations where the analogy actually holds - it's more often that there are several of the same type, but still individual resources, like the bathrooms, which cannot be used this way.
...
The correct use of a semaphore is for signaling from one task to another. A mutex is meant to be taken and released, always in that order, by each task that uses the shared resource it protects. By contrast, tasks that use semaphores either signal or wait—not both. For example, Task 1 may contain code to post (i.e., signal or increment) a particular semaphore when the "power" button is pressed and Task 2, which wakes the display, pends on that same semaphore. In this scenario, one task is the producer of the event signal; the other the consumer.
...
Here an important point is made that mutexes interfere with real time operating systems in a bad way, causing priority inversion where a less important task may be executed before a more important task because of resource sharing. In short, this happens when a lower priority task uses a mutex to grab a resource, A, then tries to grab B, but is paused because B is unavailable. While it's waiting, a higher priority task comes along and needs A, but it's already tied up, and by a process that isn't even running because it's waiting for B. There are many ways to resolve this, but it most often is fixed by altering the mutex and task manager. The mutex is much more complex in these cases than a binary semaphore, and using a semaphore in such an instance will cause priority inversions because the task manager is unaware of the priority inversion and cannot act to correct it.
...
The cause of the widespread modern confusion between mutexes and semaphores is historical, as it dates all the way back to the 1974 invention of the Semaphore (capital "S", in this article) by Djikstra. Prior to that date, none of the interrupt-safe task synchronization and signaling mechanisms known to computer scientists was efficiently scalable for use by more than two tasks. Dijkstra's revolutionary, safe-and-scalable Semaphore was applied in both critical section protection and signaling. And thus the confusion began.
However, it later became obvious to operating system developers, after the appearance of the priority-based preemptive RTOS (e.g., VRTX, ca. 1980), publication of academic papers establishing RMA and the problems caused by priority inversion, and a paper on priority inheritance protocols in 1990, 3 it became apparent that mutexes must be more than just semaphores with a binary counter.
Mutex: resource sharing
Semaphore: signaling
Don't use one for the other without careful consideration of the side effects.

Mutex: exclusive-member access to a resource
Semaphore: n-member access to a resource
That is, a mutex can be used to syncronize access to a counter, file, database, etc.
A sempahore can do the same thing but supports a fixed number of simultaneous callers. For example, I can wrap my database calls in a semaphore(3) so that my multithreaded app will hit the database with at most 3 simultaneous connections. All attempts will block until one of the three slots opens up. They make things like doing naive throttling really, really easy.

Consider, a taxi that can accommodate a total of 3(rear)+2(front) persons including the driver. So, a semaphore allows only 5 persons inside a car at a time.
And a mutex allows only 1 person on a single seat of the car.
Therefore, Mutex is to allow exclusive access for a resource (like an OS thread) while a Semaphore is to allow access for n number of resources at a time.

#Craig:
A semaphore is a way to lock a
resource so that it is guaranteed that
while a piece of code is executed,
only this piece of code has access to
that resource. This keeps two threads
from concurrently accesing a resource,
which can cause problems.
This is not restricted to only one thread. A semaphore can be configured to allow a fixed number of threads to access a resource.

Semaphore can also be used as a ... semaphore.
For example if you have multiple process enqueuing data to a queue, and only one task consuming data from the queue. If you don't want your consuming task to constantly poll the queue for available data, you can use semaphore.
Here the semaphore is not used as an exclusion mechanism, but as a signaling mechanism.
The consuming task is waiting on the semaphore
The producing task are posting on the semaphore.
This way the consuming task is running when and only when there is data to be dequeued

There are two essential concepts to building concurrent programs - synchronization and mutual exclusion. We will see how these two types of locks (semaphores are more generally a kind of locking mechanism) help us achieve synchronization and mutual exclusion.
A semaphore is a programming construct that helps us achieve concurrency, by implementing both synchronization and mutual exclusion. Semaphores are of two types, Binary and Counting.
A semaphore has two parts : a counter, and a list of tasks waiting to access a particular resource. A semaphore performs two operations : wait (P) [this is like acquiring a lock], and release (V)[ similar to releasing a lock] - these are the only two operations that one can perform on a semaphore. In a binary semaphore, the counter logically goes between 0 and 1. You can think of it as being similar to a lock with two values : open/closed. A counting semaphore has multiple values for count.
What is important to understand is that the semaphore counter keeps track of the number of tasks that do not have to block, i.e., they can make progress. Tasks block, and add themselves to the semaphore's list only when the counter is zero. Therefore, a task gets added to the list in the P() routine if it cannot progress, and "freed" using the V() routine.
Now, it is fairly obvious to see how binary semaphores can be used to solve synchronization and mutual exclusion - they are essentially locks.
ex. Synchronization:
thread A{
semaphore &s; //locks/semaphores are passed by reference! think about why this is so.
A(semaphore &s): s(s){} //constructor
foo(){
...
s.P();
;// some block of code B2
...
}
//thread B{
semaphore &s;
B(semaphore &s): s(s){} //constructor
foo(){
...
...
// some block of code B1
s.V();
..
}
main(){
semaphore s(0); // we start the semaphore at 0 (closed)
A a(s);
B b(s);
}
In the above example, B2 can only execute after B1 has finished execution. Let's say thread A comes executes first - gets to sem.P(), and waits, since the counter is 0 (closed). Thread B comes along, finishes B1, and then frees thread A - which then completes B2. So we achieve synchronization.
Now let's look at mutual exclusion with a binary semaphore:
thread mutual_ex{
semaphore &s;
mutual_ex(semaphore &s): s(s){} //constructor
foo(){
...
s.P();
//critical section
s.V();
...
...
s.P();
//critical section
s.V();
...
}
main(){
semaphore s(1);
mutual_ex m1(s);
mutual_ex m2(s);
}
The mutual exclusion is quite simple as well - m1 and m2 cannot enter the critical section at the same time. So each thread is using the same semaphore to provide mutual exclusion for its two critical sections. Now, is it possible to have greater concurrency? Depends on the critical sections. (Think about how else one could use semaphores to achieve mutual exclusion.. hint hint : do i necessarily only need to use one semaphore?)
Counting semaphore: A semaphore with more than one value. Let's look at what this is implying - a lock with more than one value?? So open, closed, and ...hmm. Of what use is a multi-stage-lock in mutual exclusion or synchronization?
Let's take the easier of the two:
Synchronization using a counting semaphore: Let's say you have 3 tasks - #1 and 2 you want executed after 3. How would you design your synchronization?
thread t1{
...
s.P();
//block of code B1
thread t2{
...
s.P();
//block of code B2
thread t3{
...
//block of code B3
s.V();
s.V();
}
So if your semaphore starts off closed, you ensure that t1 and t2 block, get added to the semaphore's list. Then along comes all important t3, finishes its business and frees t1 and t2. What order are they freed in? Depends on the implementation of the semaphore's list. Could be FIFO, could be based some particular priority,etc. (Note : think about how you would arrange your P's and V;s if you wanted t1 and t2 to be executed in some particular order, and if you weren't aware of the implementation of the semaphore)
(Find out : What happens if the number of V's is greater than the number of P's?)
Mutual Exclusion Using counting semaphores: I'd like you to construct your own pseudocode for this (makes you understand things better!) - but the fundamental concept is this : a counting semaphore of counter = N allows N tasks to enter the critical section freely. What this means is you have N tasks (or threads, if you like) enter the critical section, but the N+1th task gets blocked (goes on our favorite blocked-task list), and only is let through when somebody V's the semaphore at least once. So the semaphore counter, instead of swinging between 0 and 1, now goes between 0 and N, allowing N tasks to freely enter and exit, blocking nobody!
Now gosh, why would you need such a stupid thing? Isn't the whole point of mutual exclusion to not let more than one guy access a resource?? (Hint Hint...You don't always only have one drive in your computer, do you...?)
To think about : Is mutual exclusion achieved by having a counting semaphore alone? What if you have 10 instances of a resource, and 10 threads come in (through the counting semaphore) and try to use the first instance?

I've created the visualization which should help to understand the idea. Semaphore controls access to a common resource in a multithreading environment.
ExecutorService executor = Executors.newFixedThreadPool(7);
Semaphore semaphore = new Semaphore(4);
Runnable longRunningTask = () -> {
boolean permit = false;
try {
permit = semaphore.tryAcquire(1, TimeUnit.SECONDS);
if (permit) {
System.out.println("Semaphore acquired");
Thread.sleep(5);
} else {
System.out.println("Could not acquire semaphore");
}
} catch (InterruptedException e) {
throw new IllegalStateException(e);
} finally {
if (permit) {
semaphore.release();
}
}
};
// execute tasks
for (int j = 0; j < 10; j++) {
executor.submit(longRunningTask);
}
executor.shutdown();
Output
Semaphore acquired
Semaphore acquired
Semaphore acquired
Semaphore acquired
Could not acquire semaphore
Could not acquire semaphore
Could not acquire semaphore
Sample code from the article

A semaphore is an object containing a natural number (i.e. a integer greater or equal to zero) on which two modifying operations are defined. One operation, V, adds 1 to the natural. The other operation, P, decreases the natural number by 1. Both activities are atomic (i.e. no other operation can be executed at the same time as a V or a P).
Because the natural number 0 cannot be decreased, calling P on a semaphore containing a 0 will block the execution of the calling process(/thread) until some moment at which the number is no longer 0 and P can be successfully (and atomically) executed.
As mentioned in other answers, semaphores can be used to restrict access to a certain resource to a maximum (but variable) number of processes.

A hardware or software flag. In multi tasking systems , a semaphore is as variable with a value that indicates the status of a common resource.A process needing the resource checks the semaphore to determine the resources status and then decides how to proceed.

Semaphores are act like thread limiters.
Example: If you have a pool of 100 threads and you want to perform some DB operation. If 100 threads access the DB at a given time, then there may be locking issue in DB so we can use semaphore which allow only limited thread at a time.Below Example allow only one thread at a time. When a thread call the acquire() method, it will then get the access and after calling the release() method, it will release the acccess so that next thread will get the access.
package practice;
import java.util.concurrent.Semaphore;
public class SemaphoreExample {
public static void main(String[] args) {
Semaphore s = new Semaphore(1);
semaphoreTask s1 = new semaphoreTask(s);
semaphoreTask s2 = new semaphoreTask(s);
semaphoreTask s3 = new semaphoreTask(s);
semaphoreTask s4 = new semaphoreTask(s);
semaphoreTask s5 = new semaphoreTask(s);
s1.start();
s2.start();
s3.start();
s4.start();
s5.start();
}
}
class semaphoreTask extends Thread {
Semaphore s;
public semaphoreTask(Semaphore s) {
this.s = s;
}
#Override
public void run() {
try {
s.acquire();
Thread.sleep(1000);
System.out.println(Thread.currentThread().getName()+" Going to perform some operation");
s.release();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}

So imagine everyone is trying to go to the bathroom and there's only a certain number of keys to the bathroom. Now if there's not enough keys left, that person needs to wait. So think of semaphore as representing those set of keys available for bathrooms (the system resources) that different processes (bathroom goers) can request access to.
Now imagine two processes trying to go to the bathroom at the same time. That's not a good situation and semaphores are used to prevent this. Unfortunately, the semaphore is a voluntary mechanism and processes (our bathroom goers) can ignore it (i.e. even if there are keys, someone can still just kick the door open).
There are also differences between binary/mutex & counting semaphores.
Check out the lecture notes at http://www.cs.columbia.edu/~jae/4118/lect/L05-ipc.html.

This is an old question but one of the most interesting uses of semaphore is a read/write lock and it has not been explicitly mentioned.
The r/w locks works in simple fashion: consume one permit for a reader and all permits for writers.
Indeed, a trivial implementation of a r/w lock but requires metadata modification on read (actually twice) that can become a bottle neck, still significantly better than a mutex or lock.
Another downside is that writers can be started rather easily as well unless the semaphore is a fair one or the writes acquire permits in multiple requests, in such case they need an explicit mutex between themselves.
Further read:

Mutex is just a boolean while semaphore is a counter.
Both are used to lock part of code so it's not accessed by too many threads.
Example
lock.set()
a += 1
lock.unset()
Now if lock was a mutex, it means that it will always be locked or unlocked (a boolean under the surface) regardless how many threads try access the protected snippet of code. While locked, any other thread would just wait until it's unlocked/unset by the previous thread.
Now imagine if instead lock was under the hood a counter with a predefined MAX value (say 2 for our example). Then if 2 threads try to access the resource, then lock would get its value increased to 2. If a 3rd thread then tried to access it, it would simply wait for the counter to go below 2 and so on.
If lock as a semaphore had a max of 1, then it would be acting exactly as a mutex.

A semaphore is a way to lock a resource so that it is guaranteed that while a piece of code is executed, only this piece of code has access to that resource. This keeps two threads from concurrently accesing a resource, which can cause problems.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string