What does threadsafe mean? - multithreading
Recently I tried to access a textbox from a thread (other than the UI thread) and an exception was thrown. It said something about the "code not being thread safe" and so I ended up writing a delegate (a sample from MSDN helped) and calling it instead.
But even so I didn't quite understand why all the extra code was necessary.
Update:
Will I run into any serious problems if I set Control.CheckForIllegalCrossThreadCalls = false (i.e. disable the check)?
Eric Lippert has a nice blog post entitled What is this thing you call "thread safe"? about the definition of thread safety as found on Wikipedia.
Three important points extracted from the links:
“A piece of code is thread-safe if it functions correctly during
simultaneous execution by multiple threads.”
“In particular, it must satisfy the need for multiple threads to
access the same shared data, …”
“…and the need for a shared piece of data to be accessed by only one
thread at any given time.”
Definitely worth a read!
In the simplest of terms, thread safe means that it is safe to be accessed from multiple threads. When you are using multiple threads in a program and they are each attempting to access a common data structure or location in memory, several bad things can happen. So, you add some extra code to prevent those bad things. For example, if two people were writing the same document at the same time, the second person to save will overwrite the work of the first person. To make it thread safe, you have to force person 2 to wait for person 1 to complete their task before allowing person 2 to edit the document.
Wikipedia has an article on Thread Safety.
This definitions page (you have to skip an ad - sorry) defines it thus:
In computer programming, thread-safe describes a program portion or routine that can be called from multiple programming threads without unwanted interaction between the threads.
A thread is an execution path of a program. A single-threaded program has only one thread, so this problem doesn't arise. Virtually all GUI programs have multiple execution paths, and hence threads - there are at least two: one for processing the display of the GUI and handling user input, and at least one other for actually performing the operations of the program.
This is done so that the UI stays responsive while the program is working, by offloading any long-running process to non-UI threads. These threads may be created once and exist for the lifetime of the program, or just get created when needed and destroyed when they've finished.
As these threads will often need to perform common actions - disk i/o, outputting results to the screen etc. - these parts of the code will need to be written in such a way that they can handle being called from multiple threads, often at the same time. This will involve things like:
Working on copies of data
Adding locks around the critical code
Opening files in the appropriate mode - so if reading, don't open the file for write as well.
Coping with not having access to resources because they're locked by other threads/processes.
Simply, thread-safe means that a method or class instance can be used by multiple threads at the same time without any problems occurring.
Consider the following method:
private int myInt = 0;

public int AddOne()
{
    int tmp = myInt;
    tmp = tmp + 1;
    myInt = tmp;
    return tmp;
}
Now thread A and thread B both would like to execute AddOne(). But A starts first and reads the value of myInt (0) into tmp. Now, for some reason, the scheduler decides to halt thread A and defer execution to thread B. Thread B now also reads the value of myInt (still 0) into its own variable tmp. Thread B finishes the entire method, so in the end myInt = 1. And 1 is returned. Now it's thread A's turn again. Thread A continues, and adds 1 to tmp (tmp was 0 for thread A), and then saves this value in myInt. myInt is again 1.
So in this case the method AddOne() was called two times, but because the method was not implemented in a thread-safe way the value of myInt is not 2, as expected, but 1 because the second thread read the variable myInt before the first thread finished updating it.
Creating thread-safe methods is very hard in non-trivial cases, and there are quite a few techniques. In Java you can mark a method as synchronized, which means that only one thread can execute that method at a given time; the other threads wait in line. This makes a method thread-safe, but if there is a lot of work to be done in the method, this wastes a lot of time. Another technique is to 'mark only a small part of a method as synchronized' by creating a lock or semaphore and locking this small part (usually called the critical section). There are even some methods that are implemented as lock-free yet thread-safe, which means that they are built in such a way that multiple threads can race through them at the same time without ever causing problems; this can be the case when a method only executes one atomic call. Atomic calls are calls that can't be interrupted and can only be done by one thread at a time.
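To make that concrete, here is a minimal sketch of the AddOne example made thread-safe with Java's synchronized keyword (the Counter class name is ours, not from the original answer):

class Counter {
    private int myInt = 0;

    // synchronized: only one thread can run this method at a time,
    // so the read-increment-write sequence becomes atomic.
    public synchronized int addOne() {
        int tmp = myInt;
        tmp = tmp + 1;
        myInt = tmp;
        return tmp;
    }
}

Now if two threads each call addOne() once, the second caller blocks until the first has finished, so myInt reliably ends up at 2.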
A real-world example for the layman:
Let's suppose you have a bank account with internet and mobile banking, and your account has only $10.
You start a balance transfer to another account using mobile banking, and in the meantime you do some online shopping using the same bank account.
If this bank account is not thread safe, the bank allows you to perform both transactions at the same time - each one sees the full $10 - and then the bank will go bankrupt.
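A minimal sketch of the fix in Java (the Account class and its numbers are invented for this illustration): make the check and the withdrawal atomic, so two transactions cannot both spend the same balance.

class Account {
    private int balance = 10;

    // synchronized makes the check-then-act atomic: two concurrent
    // withdrawals cannot both see balance >= amount and both succeed.
    public synchronized boolean withdraw(int amount) {
        if (balance < amount) {
            return false; // declined: insufficient funds
        }
        balance -= amount;
        return true;
    }
}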
Thread safe means that an object's state doesn't become inconsistent when multiple threads try to access the object simultaneously.
You can get more explanation from the book "Java Concurrency in Practice":
A class is thread‐safe if it behaves correctly when accessed from multiple threads, regardless of the scheduling or interleaving of the execution of those threads by the runtime environment, and with no additional synchronization or other coordination on the part of the calling code.
A module is thread-safe if it guarantees that it can maintain its invariants in the face of multi-threaded and concurrent use.
Here, a module can be a data structure, class, object, method/procedure or function. Basically, a scoped piece of code and related data.
The guarantee can potentially be limited to certain environments, such as a specific CPU architecture, but must hold for those environments. If there is no explicit delimitation of environments, then it is usually taken to imply that it holds for all environments in which the code can be compiled and executed.
Thread-unsafe modules may function correctly under multi-threaded and concurrent use, but this is often more down to luck and coincidence than careful design. Even if a module does not break for you, it may break when moved to other environments.
Multi-threading bugs are often hard to debug. Some of them only happen occasionally, while others manifest aggressively - this too, can be environment specific. They can manifest as subtly wrong results, or deadlocks. They can mess up data-structures in unpredictable ways, and cause other seemingly impossible bugs to appear in other remote parts of the code. It can be very application specific, so it is hard to give a general description.
Thread safety: a thread-safe program protects its data from memory consistency errors. In a highly multi-threaded program, a thread-safe program does not cause any side effects from multiple read/write operations by multiple threads on the same objects. Different threads can share and modify object data without consistency errors.
You can achieve thread safety by using the advanced concurrency API. This documentation page provides good programming constructs to achieve thread safety.
Lock Objects support locking idioms that simplify many concurrent applications.
Executors define a high-level API for launching and managing threads. Executor implementations provided by java.util.concurrent provide thread pool management suitable for large-scale applications.
Concurrent Collections make it easier to manage large collections of data, and can greatly reduce the need for synchronization.
Atomic Variables have features that minimize synchronization and help avoid memory consistency errors.
ThreadLocalRandom (in JDK 7) provides efficient generation of pseudorandom numbers from multiple threads.
Refer to java.util.concurrent and java.util.concurrent.atomic packages too for other programming constructs.
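As a small sketch combining two of these constructs (an executor plus an atomic variable; the class name and the counts are invented for this example):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounterDemo {
    public static void main(String[] args) throws InterruptedException {
        AtomicInteger counter = new AtomicInteger(0);            // atomic variable: no lock needed
        ExecutorService pool = Executors.newFixedThreadPool(4);  // executor manages the threads

        for (int i = 0; i < 4; i++) {
            pool.submit(() -> {
                for (int j = 0; j < 1_000_000; j++) {
                    counter.incrementAndGet();                   // atomic read-modify-write
                }
            });
        }

        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        System.out.println(counter.get());                       // always 4000000
    }
}

Replacing the AtomicInteger with a plain int field (and counter++ in the loop) would make the final count unpredictable, which is exactly the memory consistency problem the atomic classes avoid.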
Producing thread-safe code is all about managing access to shared mutable state. When mutable state is published or shared between threads, it needs to be synchronized to avoid bugs like race conditions and memory consistency errors.
I recently wrote a blog about thread safety. You can read it for more information.
You are clearly working in a WinForms environment. WinForms controls exhibit thread affinity, which means that the thread on which they are created is the only thread that can be used to access and update them. That is why you will find examples on MSDN and elsewhere demonstrating how to marshal the call back onto the main thread.
Normal WinForms practice is to have a single thread that is dedicated to all your UI work.
I find the concept of reentrancy (http://en.wikipedia.org/wiki/Reentrancy_%28computing%29) to be what I usually think of as unsafe threading: a method that has, and relies on, a side effect such as a global variable.
For example, I have seen code that formatted floating point numbers to strings; if two of these run in different threads, the global value of decimalSeparator can be permanently changed to '.':
//built-in global set to a locale-specific value (here a comma)
decimalSeparator = ','

function FormatDot(value : real):
    //save the current decimal character
    temp = decimalSeparator
    //set the global value to a dot
    decimalSeparator = '.'
    //if another thread calls FormatDot here, it saves '.' as its temp
    //and later restores '.' permanently
    //format() uses decimalSeparator behind the scenes
    result = format(value)
    //put the original value back
    decimalSeparator = temp
    return result
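A reentrant alternative avoids the shared global entirely. For example, in Java each call can build its own formatter (a sketch; the formatDot method and the pattern are invented for this example):

import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;

class DotFormatter {
    // No shared state is mutated: each call builds its own symbols and
    // formatter, so concurrent calls cannot interfere with each other.
    static String formatDot(double value) {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols();
        symbols.setDecimalSeparator('.');
        return new DecimalFormat("0.0###", symbols).format(value);
    }
}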
To understand thread safety, read the sections below:
4.3.1. Example: Vehicle Tracker Using Delegation
As a more substantial example of delegation, let's construct a version of the vehicle tracker that delegates to a thread-safe class. We store the locations in a Map, so we start with a thread-safe Map implementation, ConcurrentHashMap. We also store the location using an immutable Point class instead of MutablePoint, shown in Listing 4.6.
Listing 4.6. Immutable Point class used by DelegatingVehicleTracker.
class Point {
    public final int x, y;

    public Point() {
        this.x = 0;
        this.y = 0;
    }

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}
Point is thread-safe because it is immutable. Immutable values can be freely shared and published, so we no longer need to copy the locations when returning them.
DelegatingVehicleTracker in Listing 4.7 does not use any explicit synchronization; all access to state is managed by ConcurrentHashMap, and all the keys and values of the Map are immutable.
Listing 4.7. Delegating Thread Safety to a ConcurrentHashMap.
public class DelegatingVehicleTracker {
    private final ConcurrentMap<String, Point> locations;
    private final Map<String, Point> unmodifiableMap;

    public DelegatingVehicleTracker(Map<String, Point> points) {
        this.locations = new ConcurrentHashMap<String, Point>(points);
        this.unmodifiableMap = Collections.unmodifiableMap(locations);
    }

    public Map<String, Point> getLocations() {
        return this.unmodifiableMap; // User cannot update point(x,y) as Point is immutable
    }

    public Point getLocation(String id) {
        return locations.get(id);
    }

    public void setLocation(String id, int x, int y) {
        if (locations.replace(id, new Point(x, y)) == null) {
            throw new IllegalArgumentException("invalid vehicle name: " + id);
        }
    }
}
If we had used the original MutablePoint class instead of Point, we would be breaking encapsulation by letting getLocations publish a reference to mutable state that is not thread-safe. Notice that we've changed the behavior of the vehicle tracker class slightly; while the monitor version returned a snapshot of the locations, the delegating version returns an unmodifiable but “live” view of the vehicle locations. This means that if thread A calls getLocations and thread B later modifies the location of some of the points, those changes are reflected in the Map returned to thread A.
4.3.2. Independent State Variables
We can also delegate thread safety to more than one underlying state variable as long as those underlying state variables are independent, meaning that the composite class does not impose any invariants involving the multiple state variables.
VisualComponent in Listing 4.9 is a graphical component that allows clients to register listeners for mouse and keystroke events. It maintains a list of registered listeners of each type, so that when an event occurs the appropriate listeners can be invoked. But there is no relationship between the set of mouse listeners and key listeners; the two are independent, and therefore VisualComponent can delegate its thread safety obligations to two underlying thread-safe lists.
Listing 4.9. Delegating Thread Safety to Multiple Underlying State Variables.
public class VisualComponent {
    private final List<KeyListener> keyListeners
        = new CopyOnWriteArrayList<KeyListener>();
    private final List<MouseListener> mouseListeners
        = new CopyOnWriteArrayList<MouseListener>();

    public void addKeyListener(KeyListener listener) {
        keyListeners.add(listener);
    }

    public void addMouseListener(MouseListener listener) {
        mouseListeners.add(listener);
    }

    public void removeKeyListener(KeyListener listener) {
        keyListeners.remove(listener);
    }

    public void removeMouseListener(MouseListener listener) {
        mouseListeners.remove(listener);
    }
}
VisualComponent uses a CopyOnWriteArrayList to store each listener list; this is a thread-safe List implementation particularly suited for managing listener lists (see Section 5.2.3). Each List is thread-safe, and because there are no constraints coupling the state of one to the state of the other, VisualComponent can delegate its thread safety responsibilities to the underlying mouseListeners and keyListeners objects.
4.3.3. When Delegation Fails
Most composite classes are not as simple as VisualComponent: they have invariants that relate their component state variables. NumberRange in Listing 4.10 uses two AtomicIntegers to manage its state, but imposes an additional constraint—that the first number be less than or equal to the second.
Listing 4.10. Number Range Class that does Not Sufficiently Protect Its Invariants. Don't do this.
public class NumberRange {
    // INVARIANT: lower <= upper
    private final AtomicInteger lower = new AtomicInteger(0);
    private final AtomicInteger upper = new AtomicInteger(0);

    public void setLower(int i) {
        // Warning - unsafe check-then-act
        if (i > upper.get()) {
            throw new IllegalArgumentException(
                "Can't set lower to " + i + " > upper");
        }
        lower.set(i);
    }

    public void setUpper(int i) {
        // Warning - unsafe check-then-act
        if (i < lower.get()) {
            throw new IllegalArgumentException(
                "Can't set upper to " + i + " < lower");
        }
        upper.set(i);
    }

    public boolean isInRange(int i) {
        return (i >= lower.get() && i <= upper.get());
    }
}
NumberRange is not thread-safe; it does not preserve the invariant that constrains lower and upper. The setLower and setUpper methods attempt to respect this invariant, but do so poorly. Both setLower and setUpper are check-then-act sequences, but they do not use sufficient locking to make them atomic. If the number range holds (0, 10), and one thread calls setLower(5) while another thread calls setUpper(4), with some unlucky timing both will pass the checks in the setters and both modifications will be applied. The result is that the range now holds (5, 4)—an invalid state. So while the underlying AtomicIntegers are thread-safe, the composite class is not. Because the underlying state variables lower and upper are not independent, NumberRange cannot simply delegate thread safety to its thread-safe state variables.
NumberRange could be made thread-safe by using locking to maintain its invariants, such as guarding lower and upper with a common lock. It must also avoid publishing lower and upper to prevent clients from subverting its invariants.
If a class has compound actions, as NumberRange does, delegation alone is again not a suitable approach for thread safety. In these cases, the class must provide its own locking to ensure that compound actions are atomic, unless the entire compound action can also be delegated to the underlying state variables.
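A minimal sketch of that locking fix (not JCiP's own listing; the class name is ours): guard both variables with the object's intrinsic lock so each check-then-act sequence is atomic.

public class SafeNumberRange {
    // INVARIANT: lower <= upper, guarded by "this"
    private int lower = 0;
    private int upper = 0;

    public synchronized void setLower(int i) {
        // check-then-act is now atomic: no other thread can move upper
        // between the check and the assignment
        if (i > upper)
            throw new IllegalArgumentException("can't set lower to " + i + " > upper");
        lower = i;
    }

    public synchronized void setUpper(int i) {
        if (i < lower)
            throw new IllegalArgumentException("can't set upper to " + i + " < lower");
        upper = i;
    }

    public synchronized boolean isInRange(int i) {
        return i >= lower && i <= upper;
    }
}

With every access funneled through one lock, the fields no longer need to be AtomicIntegers.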
If a class is composed of multiple independent thread-safe state variables and has no operations that have any invalid state transitions, then it can delegate thread safety to the underlying state variables.
Related
Appropriate use of synchronizing or locking a segment of code
Given a singleton class Employee with two methods, int getSalary() and void updateSalary(int increment): do I need to synchronize or lock both these functions, or use an atomic salary variable? If yes, then the question is that in this way we would have to synchronize all the functions that we define in a multithreaded environment. So why not just make synchronized a standard, as today no real-world application would be single threaded?
With a singleton we always have to be very careful, because a singleton object, being a single instance, is naturally shared between threads. Making functions synchronized is one way, but it is not an efficient way. We need to think about other aspects of concurrency, like immutability and the atomic classes:

class Employee {
    // singleton instantiation omitted
    private final AtomicInteger sal = new AtomicInteger(0);

    int getSalary() {
        return sal.get();
    }

    void updateSalary(int increment) {
        sal.addAndGet(increment);
    }
}

This solves it; we do not need to synchronize every method of the singleton class. We do not have to mark every function of every class as synchronized, but we always have to be careful when a function modifies or reads state that could be concurrently accessed - in such cases, start thinking about synchronization. With singleton classes, we always have to be careful.
<Spring Batch> Why does making ItemReader thread-safe lead us to losing restartability?
I have a multi-threaded batch job reading from a DB and I am concerned about different threads re-reading records, as ItemReader is not thread safe in Spring Batch. I went through the Spring Batch FAQ section, which states:

You can synchronize the read() method (e.g. by wrapping it in a delegator that does the synchronization). Remember that you will lose restartability, so best practice is to mark the step as not restartable and to be safe (and efficient) you can also set saveState=false on the reader.

I want to know why I will lose restartability in this case. What has restartability got to do with synchronizing my read operations? It can always try again, right? Also, will this piece of code be enough for synchronizing the reader?

public class SynchronizedItemReader<T> implements ItemReader<T> {
    private final ItemReader<T> delegate;

    public SynchronizedItemReader(ItemReader<T> delegate) {
        this.delegate = delegate;
    }

    public synchronized T read() throws Exception {
        return delegate.read();
    }
}
When using an ItemReader with multiple threads, the lack of restartability is not about the read itself. It's about saving the state of the reader, which occurs in the update method. The issue is that there needs to be coordination between the calls to read() - the method providing the data - and update() - the method persisting the state. When you use multiple threads, the internal state of the reader (and therefore the update() call) may or may not reflect the work that has been done.

Take for example the FlatFileItemReader using a chunk size of 5 and running on multiple threads. You could have thread 1 having read 5 items (time to update), yet thread 2 could have read an additional 3. That means the call to update would save that 8 items have been read. If the chunk on thread 2 fails, the state would be incorrect and the restart would miss the three items that were already read.

This is not to say that it is impossible to write a thread-safe ItemReader. However, as your example above illustrates, if the delegate is a stateful ItemReader (one that implements ItemStream as well), the state will not be persisted correctly with calls to update (in fact, your example above doesn't even take the ItemStream aspect of stateful readers into account).
If you want to make your job restartable with parallel processing of items, you can save each item that the reader reads, plus the state of that item, yourself.
Interview Question on .NET Threading
Could you describe two methods of synchronizing multi-threaded write access performed on a class member? Could anyone help me understand what this is asking and what the right answer is?
When you change data in C#, something that looks like a single operation may be compiled into several instructions. Take the following class:

public class Number
{
    private int a = 0;

    public void Add(int b)
    {
        a += b;
    }
}

When you build it, you get the following IL code:

IL_0000: nop
IL_0001: ldarg.0
IL_0002: dup
// Pushes the value of the private variable 'a' onto the stack
IL_0003: ldfld int32 Simple.Number::a
// Pushes the value of the argument 'b' onto the stack
IL_0008: ldarg.1
// Adds the top two values of the stack together
IL_0009: add
// Sets 'a' to the value on top of the stack
IL_000a: stfld int32 Simple.Number::a
IL_000f: ret

Now, say you have a Number object and two threads call its Add method like this:

number.Add(2); // Thread 1
number.Add(3); // Thread 2

If you want the result to be 5 (0 + 2 + 3), there's a problem. You don't know when these threads will execute their instructions. Both threads could execute IL_0003 (pushing zero onto the stack) before either executes IL_000a (actually changing the member variable) and you get this:

a = 0 + 2; // Thread 1
a = 0 + 3; // Thread 2

The last thread to finish 'wins' and at the end of the process, a is 2 or 3 instead of 5. So you have to make sure that one complete set of instructions finishes before the other set. To do that, you can:

1) Lock access to the class member while it's being written, using one of the many .NET synchronization primitives (like lock, Mutex, ReaderWriterLockSlim, etc.) so that only one thread can work on it at a time.

2) Push write operations into a queue and process that queue with a single thread. As Thorarin points out, you still have to synchronize access to the queue if it isn't thread-safe, but it's worth it for complex write operations.

There are other techniques. Some (like Interlocked) are limited to particular data types, and there are even more (like the ones discussed in Non-blocking synchronization and Part 4 of Joseph Albahari's Threading in C#), though they are more complex: approach them with caution.
In multithreaded applications, there are many situations where simultaneous access to the same data can cause problems. In such cases, synchronization is required to guarantee that only one thread has access at any one time.

I imagine they mean using the lock statement (or SyncLock in VB.NET) vs. using a Monitor. You might want to read this page for examples and an understanding of the concept. However, if you have no experience with multithreaded application design, it will likely become quickly apparent, should your new employer put you to the test. It's a fairly complicated subject, with many possible pitfalls such as deadlock. There is a decent MSDN page on the subject as well.

There may be other options, depending on the type of member variable and how it is to be changed. Incrementing an integer, for example, can be done with the Interlocked.Increment method.

As an exercise and demonstration of the problem, try writing an application that starts 5 simultaneous threads, incrementing a shared counter a million times per thread. The intended end result of the counter would be 5 million, but that is (probably) not what you will end up with :)

Edit: made a quick implementation myself (download). Sample output:

Unsynchronized counter demo:
expected counter = 5000000
actual counter = 4901600
Time taken (ms) = 67

Synchronized counter demo:
expected counter = 5000000
actual counter = 5000000
Time taken (ms) = 287
There are a couple of ways, several of which are mentioned previously:

ReaderWriterLockSlim is my preferred method. This gives you a database type of locking, and allows for upgrading (although the syntax for that is incorrect in the MSDN last time I looked and is very non-obvious).

lock statements. You treat a read like a write and just prevent access to the variable.

Interlocked operations. These perform an operation on a value type in an atomic step. This can be used for lock-free threading (really wouldn't recommend this).

Mutexes and Semaphores (haven't used these).

Monitor statements (this is essentially how the lock keyword works).

While I don't mean to denigrate other answers, I would not trust anything that does not use one of these techniques. My apologies if I have forgotten any.
Is this a safe version of double-checked locking?
Slightly modified version of the canonical broken double-checked locking from Wikipedia:

class Foo {
    private Helper helper = null;

    public Helper getHelper() {
        if (helper == null) {
            synchronized (this) {
                if (helper == null) {
                    // Create new Helper instance and store reference on
                    // stack so other threads can't see it.
                    Helper myHelper = new Helper();
                    // Atomically publish this instance.
                    atomicSet(helper, myHelper);
                }
            }
        }
        return helper;
    }
}

Does simply making the publishing of the newly created Helper instance atomic make this double-checked locking idiom safe, assuming that the underlying atomic ops library works properly? I realize that in Java, one could just use volatile, but even though the example is in pseudo-Java, this is supposed to be a language-agnostic question.

See also: Double checked locking Article
It entirely depends on the exact memory model of your platform/language. My rule of thumb: just don't do it. Lock-free (or reduced lock, in this case) programming is hard and shouldn't be attempted unless you're a threading ninja. You should only even contemplate it when you've got profiling proof that you really need it, and in that case you get the absolute best and most recent book on threading for that particular platform and see if it can help you.
I don't think you can answer the question in a language-agnostic fashion without getting away from code completely. It all depends on how synchronized and atomicSet work in your pseudocode.
The answer is language dependent - it comes down to the guarantees provided by atomicSet(). If the construction of myHelper can be reordered to after the atomicSet(), then it doesn't matter how the variable is assigned to the shared state. i.e.

// Create new Helper instance and store reference on
// stack so other threads can't see it.
Helper myHelper = new Helper(); // ALLOCATE MEMORY HERE BUT DON'T INITIALISE

// Atomically publish this instance.
atomicSet(helper, myHelper); // ATOMICALLY POINT UNINITIALISED MEMORY FROM helper

// other thread gets run at this time and tries to use helper object

// AT THE PROGRAM'S LEISURE, INITIALISE the Helper object.

If this is allowed by the language, then the double checking will not work.
Using volatile would not prevent multiple instantiations - however, using synchronized will prevent multiple instances being created. However, with your code it is possible that helper is returned before it has been set up (thread 'A' instantiates it, but before it is set up, thread 'B' comes along, sees that helper is non-null, and so returns it straight away). To fix that problem, remove the first if (helper == null).
Most likely it is broken, because the problem of a partially constructed object is not addressed.
To all the people worried about a partially constructed object:

As far as I understand, the problem of partially constructed objects is only a problem within constructors. In other words, within a constructor, if an object references itself (including its subclass) or its members, then there are possible issues with partial construction. Otherwise, when a constructor returns, the class is fully constructed.

I think you are confusing partial construction with the different problem of how the compiler optimizes the writes. The compiler can choose to A) allocate the memory for the new Helper object, B) write the address to myHelper (the local stack variable), and then C) invoke any constructor initialization. Anytime after point B and before point C, accessing myHelper would be a problem. It is this compiler optimization of the writes, not partial construction, that the cited papers are concerned with. In the original single-check lock solution, optimized writes can allow multiple threads to see the member variable between points B and C. This implementation avoids the write optimization issue by using a local stack variable.

The main scope of the cited papers is to describe the various problems with the double-check lock solution. However, unless the atomicSet method is also synchronizing against the Foo class, this solution is not a double-check lock solution. It is using multiple locks.

I would say this all comes down to the implementation of the atomic assignment function. The function needs to be truly atomic, it needs to guarantee that processor-local memory caches are synchronized, and it needs to do all this at a lower cost than simply always synchronizing the getHelper method. Based on the cited paper, in Java, it is unlikely to meet all these requirements.

Also, something that should be very clear from the paper is that Java's memory model changes frequently. It adapts as better understanding of caching, garbage collection, etc. evolves, as well as adapting to changes in the underlying real processor architecture that the VM runs on. As a rule of thumb, if you optimize your Java code in a way that depends on the underlying implementation, as opposed to the API, you run the risk of having broken code in the next release of the JVM. (Although, sometimes you will have no choice.)

dsimcha: If your atomicSet method is real, then I would try sending your question to Doug Lea (along with your atomicSet implementation). I have a feeling he's the kind of guy that would answer. I'm guessing that for Java he will tell you that it's cheaper to always synchronize and to look to optimize somewhere else.
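For reference, since the question itself notes that in Java one could just use volatile: under the Java 5 memory model, the classic idiom is safe when helper is volatile, because the volatile write publishes the fully constructed object. A standard sketch:

class Foo {
    private volatile Helper helper = null;

    public Helper getHelper() {
        Helper result = helper;             // one volatile read on the fast path
        if (result == null) {
            synchronized (this) {
                result = helper;
                if (result == null) {
                    // the volatile write happens-before any later read,
                    // so no thread can see a partially initialized Helper
                    helper = result = new Helper();
                }
            }
        }
        return result;
    }
}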
What is a semaphore?
A semaphore is a programming concept that is frequently used to solve multi-threading problems. My question to the community: What is a semaphore and how do you use it?
Think of semaphores as bouncers at a nightclub. There are a dedicated number of people that are allowed in the club at once. If the club is full, no one is allowed to enter, but as soon as one person leaves another person might enter.

It's simply a way to limit the number of consumers for a specific resource. For example, to limit the number of simultaneous calls to a database in an application.

Here is a very pedagogic example in C# :-)

using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;

namespace TheNightclub
{
    public class Program
    {
        public static Semaphore Bouncer { get; set; }

        public static void Main(string[] args)
        {
            // Create the semaphore with 3 slots, where 3 are available.
            Bouncer = new Semaphore(3, 3);

            // Open the nightclub.
            OpenNightclub();
        }

        public static void OpenNightclub()
        {
            for (int i = 1; i <= 50; i++)
            {
                // Let each guest enter on its own thread.
                Thread thread = new Thread(new ParameterizedThreadStart(Guest));
                thread.Start(i);
            }
        }

        public static void Guest(object args)
        {
            // Wait to enter the nightclub (a semaphore to be released).
            Console.WriteLine("Guest {0} is waiting to enter the nightclub.", args);
            Bouncer.WaitOne();

            // Do some dancing.
            Console.WriteLine("Guest {0} is doing some dancing.", args);
            Thread.Sleep(500);

            // Let one guest out (release one semaphore).
            Console.WriteLine("Guest {0} is leaving the nightclub.", args);
            Bouncer.Release(1);
        }
    }
}
The article Mutexes and Semaphores Demystified by Michael Barr is a great short introduction into what makes mutexes and semaphores different, and when they should and should not be used. I've excerpted several key paragraphs here.

The key point is that mutexes should be used to protect shared resources, while semaphores should be used for signaling. You should generally not use semaphores to protect shared resources, nor mutexes for signaling. There are issues, for instance, with the bouncer analogy in terms of using semaphores to protect shared resources - you can use them that way, but it may cause hard-to-diagnose bugs.

While mutexes and semaphores have some similarities in their implementation, they should always be used differently.

The most common (but nonetheless incorrect) answer to the question posed at the top is that mutexes and semaphores are very similar, with the only significant difference being that semaphores can count higher than one. Nearly all engineers seem to properly understand that a mutex is a binary flag used to protect a shared resource by ensuring mutual exclusion inside critical sections of code. But when asked to expand on how to use a "counting semaphore," most engineers - varying only in their degree of confidence - express some flavor of the textbook opinion that these are used to protect several equivalent resources.

...

At this point an interesting analogy is made using the idea of bathroom keys as protecting shared resources - the bathroom. If a shop has a single bathroom, then a single key will be sufficient to protect that resource and prevent multiple people from using it simultaneously. If there are multiple bathrooms, one might be tempted to key them alike and make multiple keys - this is similar to a semaphore being misused. Once you have a key you don't actually know which bathroom is available, and if you go down this path you're probably going to end up using mutexes to provide that information and make sure you don't take a bathroom that's already occupied.

A semaphore is the wrong tool to protect several of the essentially same resource, but this is how many people think of it and use it. The bouncer analogy is distinctly different - there aren't several of the same type of resource, instead there is one resource which can accept multiple simultaneous users. I suppose a semaphore can be used in such situations, but rarely are there real-world situations where the analogy actually holds - it's more often that there are several of the same type, but still individual resources, like the bathrooms, which cannot be used this way.

...

The correct use of a semaphore is for signaling from one task to another. A mutex is meant to be taken and released, always in that order, by each task that uses the shared resource it protects. By contrast, tasks that use semaphores either signal or wait - not both. For example, Task 1 may contain code to post (i.e., signal or increment) a particular semaphore when the "power" button is pressed and Task 2, which wakes the display, pends on that same semaphore. In this scenario, one task is the producer of the event signal; the other the consumer.

...

Here an important point is made that mutexes interfere with real-time operating systems in a bad way, causing priority inversion, where a less important task may be executed before a more important task because of resource sharing. In short, this happens when a lower priority task uses a mutex to grab a resource, A, then tries to grab B, but is paused because B is unavailable.
While it's waiting, a higher priority task comes along and needs A, but it's already tied up, and by a process that isn't even running because it's waiting for B. There are many ways to resolve this, but it most often is fixed by altering the mutex and task manager. The mutex is much more complex in these cases than a binary semaphore, and using a semaphore in such an instance will cause priority inversions, because the task manager is unaware of the priority inversion and cannot act to correct it.

...

The cause of the widespread modern confusion between mutexes and semaphores is historical, as it dates all the way back to the 1974 invention of the Semaphore (capital "S", in this article) by Dijkstra. Prior to that date, none of the interrupt-safe task synchronization and signaling mechanisms known to computer scientists was efficiently scalable for use by more than two tasks. Dijkstra's revolutionary, safe-and-scalable Semaphore was applied in both critical section protection and signaling. And thus the confusion began.

However, it later became obvious to operating system developers - after the appearance of the priority-based preemptive RTOS (e.g., VRTX, ca. 1980), the publication of academic papers establishing RMA and the problems caused by priority inversion, and a paper on priority inheritance protocols in 1990 - that mutexes must be more than just semaphores with a binary counter.

Mutex: resource sharing.
Semaphore: signaling.

Don't use one for the other without careful consideration of the side effects.
Mutex: exclusive-member access to a resource.
Semaphore: n-member access to a resource.
That is, a mutex can be used to synchronize access to a counter, file, database, etc. A semaphore can do the same thing but supports a fixed number of simultaneous callers. For example, I can wrap my database calls in a semaphore(3) so that my multithreaded app will hit the database with at most 3 simultaneous connections. All attempts will block until one of the three slots opens up. They make things like naive throttling really, really easy.
Consider a taxi that can accommodate a total of 3 (rear) + 2 (front) persons, including the driver. A semaphore allows only 5 persons inside the car at a time, while a mutex allows only 1 person on a single seat of the car. Therefore, a mutex allows exclusive access to a resource (like a seat), while a semaphore allows up to n users to access a resource at a time.
@Craig: A semaphore is a way to lock a resource so that it is guaranteed that while a piece of code is executed, only this piece of code has access to that resource. This keeps two threads from concurrently accessing a resource, which can cause problems. This is not restricted to only one thread: a semaphore can be configured to allow a fixed number of threads to access a resource.
A semaphore can also be used as a ... semaphore. For example, if you have multiple processes enqueuing data to a queue, and only one task consuming data from the queue, and you don't want your consuming task to constantly poll the queue for available data, you can use a semaphore. Here the semaphore is not used as an exclusion mechanism, but as a signaling mechanism: the consuming task waits on the semaphore, and the producing tasks post to it. This way the consuming task runs when, and only when, there is data to be dequeued.
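A small Java sketch of that signaling pattern (the SignalingQueue class is invented for this example; the semaphore starts at 0 and carries one permit per enqueued item):

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;

class SignalingQueue {
    private final Queue<String> queue = new ConcurrentLinkedQueue<>();
    private final Semaphore items = new Semaphore(0); // starts at 0: nothing to consume yet

    public void produce(String item) {
        queue.add(item);
        items.release();      // signal: one more item is available
    }

    public String consume() throws InterruptedException {
        items.acquire();      // block (no polling) until at least one item exists
        return queue.poll();
    }
}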
There are two essential concepts to building concurrent programs: synchronization and mutual exclusion. We will see how these two types of locks (semaphores are more generally a kind of locking mechanism) help us achieve synchronization and mutual exclusion.

A semaphore is a programming construct that helps us achieve concurrency by implementing both synchronization and mutual exclusion. Semaphores are of two types, binary and counting.

A semaphore has two parts: a counter, and a list of tasks waiting to access a particular resource. A semaphore performs two operations: wait (P) [this is like acquiring a lock], and release (V) [similar to releasing a lock] - these are the only two operations that one can perform on a semaphore.

In a binary semaphore, the counter logically goes between 0 and 1. You can think of it as being similar to a lock with two values: open/closed. A counting semaphore has multiple values for the count.

What is important to understand is that the semaphore counter keeps track of the number of tasks that do not have to block, i.e., they can make progress. Tasks block, and add themselves to the semaphore's list, only when the counter is zero. Therefore, a task gets added to the list in the P() routine if it cannot progress, and "freed" using the V() routine.

Now, it is fairly obvious to see how binary semaphores can be used to solve synchronization and mutual exclusion - they are essentially locks.

Example, synchronization:

thread A {
    semaphore &s; // locks/semaphores are passed by reference! think about why this is so.
    A(semaphore &s) : s(s) {} // constructor
    foo() {
        ...
        s.P();
        ... // some block of code B2
    }
}

thread B {
    semaphore &s;
    B(semaphore &s) : s(s) {} // constructor
    foo() {
        ...
        ... // some block of code B1
        s.V();
        ...
    }
}

main() {
    semaphore s(0); // we start the semaphore at 0 (closed)
    A a(s);
    B b(s);
}

In the above example, B2 can only execute after B1 has finished execution. Let's say thread A executes first - it gets to s.P() and waits, since the counter is 0 (closed). Thread B comes along, finishes B1, and then frees thread A - which then completes B2. So we achieve synchronization.

Now let's look at mutual exclusion with a binary semaphore:

thread mutual_ex {
    semaphore &s;
    mutual_ex(semaphore &s) : s(s) {} // constructor
    foo() {
        ...
        s.P();
        // critical section
        s.V();
        ...
        ...
        s.P();
        // critical section
        s.V();
        ...
    }
}

main() {
    semaphore s(1);
    mutual_ex m1(s);
    mutual_ex m2(s);
}

The mutual exclusion is quite simple as well: m1 and m2 cannot enter the critical section at the same time. So each thread is using the same semaphore to provide mutual exclusion for its two critical sections. Now, is it possible to have greater concurrency? Depends on the critical sections. (Think about how else one could use semaphores to achieve mutual exclusion... hint hint: do I necessarily only need to use one semaphore?)

Counting semaphore: a semaphore with more than one value. Let's look at what this is implying - a lock with more than one value?? So open, closed, and ...hmm. Of what use is a multi-stage lock in mutual exclusion or synchronization?

Let's take the easier of the two. Synchronization using a counting semaphore: let's say you have 3 tasks, and you want #1 and #2 executed after #3. How would you design your synchronization?

thread t1 {
    ...
    s.P();
    // block of code B1
}

thread t2 {
    ...
    s.P();
    // block of code B2
}

thread t3 {
    ...
    // block of code B3
    s.V();
    s.V();
}

So if your semaphore starts off closed, you ensure that t1 and t2 block and get added to the semaphore's list.
Then along comes the all-important t3, which finishes its business and frees t1 and t2. What order are they freed in? Depends on the implementation of the semaphore's list: could be FIFO, could be based on some particular priority, etc. (Note: think about how you would arrange your P's and V's if you wanted t1 and t2 to be executed in some particular order, and if you weren't aware of the implementation of the semaphore.)

(Find out: what happens if the number of V's is greater than the number of P's?)

Mutual exclusion using counting semaphores: I'd like you to construct your own pseudocode for this (makes you understand things better!) - but the fundamental concept is this: a counting semaphore with counter = N allows N tasks to enter the critical section freely. What this means is you have N tasks (or threads, if you like) enter the critical section, but the (N+1)th task gets blocked (goes on our favorite blocked-task list), and is only let through when somebody V's the semaphore at least once. So the semaphore counter, instead of swinging between 0 and 1, now goes between 0 and N, allowing N tasks to freely enter and exit, blocking nobody!

Now gosh, why would you need such a stupid thing? Isn't the whole point of mutual exclusion to not let more than one guy access a resource?? (Hint hint... you don't always only have one drive in your computer, do you...?)

To think about: is mutual exclusion achieved by having a counting semaphore alone? What if you have 10 instances of a resource, and 10 threads come in (through the counting semaphore) and try to use the first instance?
I've created a visualization which should help to understand the idea. A semaphore controls access to a common resource in a multithreading environment.

ExecutorService executor = Executors.newFixedThreadPool(7);
Semaphore semaphore = new Semaphore(4);

Runnable longRunningTask = () -> {
    boolean permit = false;
    try {
        permit = semaphore.tryAcquire(1, TimeUnit.SECONDS);
        if (permit) {
            System.out.println("Semaphore acquired");
            Thread.sleep(5);
        } else {
            System.out.println("Could not acquire semaphore");
        }
    } catch (InterruptedException e) {
        throw new IllegalStateException(e);
    } finally {
        if (permit) {
            semaphore.release();
        }
    }
};

// execute tasks
for (int j = 0; j < 10; j++) {
    executor.submit(longRunningTask);
}
executor.shutdown();

Output:

Semaphore acquired
Semaphore acquired
Semaphore acquired
Semaphore acquired
Could not acquire semaphore
Could not acquire semaphore
Could not acquire semaphore

Sample code from the article.
A semaphore is an object containing a natural number (i.e. an integer greater than or equal to zero) on which two modifying operations are defined. One operation, V, adds 1 to the number. The other operation, P, decreases it by 1. Both operations are atomic (i.e. no other operation can be executed at the same time as a V or a P).

Because the natural number 0 cannot be decreased, calling P on a semaphore containing a 0 will block the execution of the calling process (or thread) until some moment at which the number is no longer 0 and P can be successfully (and atomically) executed.

As mentioned in other answers, semaphores can be used to restrict access to a certain resource to a maximum (but variable) number of processes.
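To make P and V concrete, here is a minimal educational sketch of a counting semaphore built on Java's intrinsic locks (the class name is ours; real code would use java.util.concurrent.Semaphore):

class SimpleSemaphore {
    private int count;

    SimpleSemaphore(int initial) {
        this.count = initial;
    }

    // P: block until the counter is positive, then decrement atomically.
    synchronized void P() throws InterruptedException {
        while (count == 0) {
            wait();
        }
        count--;
    }

    // V: increment the counter and wake one waiting thread.
    synchronized void V() {
        count++;
        notify();
    }
}

Both operations run under the same monitor, which is what makes them atomic; the while loop (rather than an if) also guards against spurious wakeups.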
A hardware or software flag. In multitasking systems, a semaphore is a variable with a value that indicates the status of a common resource. A process needing the resource checks the semaphore to determine the resource's status and then decides how to proceed.
Semaphores act like thread limiters. Example: if you have a pool of 100 threads and you want to perform some DB operation, and 100 threads access the DB at a given time, there may be locking issues in the DB. So we can use a semaphore, which allows only a limited number of threads at a time. The example below allows only one thread at a time. When a thread calls the acquire() method it gets access, and after calling the release() method it gives up the access so that the next thread can get it.

package practice;

import java.util.concurrent.Semaphore;

public class SemaphoreExample {
    public static void main(String[] args) {
        Semaphore s = new Semaphore(1);
        SemaphoreTask s1 = new SemaphoreTask(s);
        SemaphoreTask s2 = new SemaphoreTask(s);
        SemaphoreTask s3 = new SemaphoreTask(s);
        SemaphoreTask s4 = new SemaphoreTask(s);
        SemaphoreTask s5 = new SemaphoreTask(s);
        s1.start();
        s2.start();
        s3.start();
        s4.start();
        s5.start();
    }
}

class SemaphoreTask extends Thread {
    Semaphore s;

    public SemaphoreTask(Semaphore s) {
        this.s = s;
    }

    @Override
    public void run() {
        try {
            s.acquire();
            Thread.sleep(1000);
            System.out.println(Thread.currentThread().getName() + " going to perform some operation");
            s.release();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
So imagine everyone is trying to go to the bathroom and there's only a certain number of keys to the bathroom. If there aren't enough keys left, that person needs to wait. So think of a semaphore as representing that set of keys available for bathrooms (the system resources) that different processes (bathroom goers) can request access to.

Now imagine two processes trying to go to the bathroom at the same time. That's not a good situation, and semaphores are used to prevent this. Unfortunately, the semaphore is a voluntary mechanism, and processes (our bathroom goers) can ignore it (i.e. even if there are keys, someone can still just kick the door open).

There are also differences between binary/mutex & counting semaphores. Check out the lecture notes at http://www.cs.columbia.edu/~jae/4118/lect/L05-ipc.html.
This is an old question, but one of the most interesting uses of a semaphore is a read/write lock, and it has not been explicitly mentioned.

Read/write locks work in a simple fashion: consume one permit for a reader and all permits for writers. Indeed, this gives a trivial implementation of a r/w lock, but it requires metadata modification on read (actually twice), which can become a bottleneck - still significantly better than a mutex or plain lock.

Another downside is that writers can be starved rather easily, unless the semaphore is a fair one or the writer acquires its permits in multiple requests; in the latter case, writers need an explicit mutex between themselves.
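A Java sketch of that technique (the class name and permit count are invented for this example): each reader consumes one permit, and a writer takes all of them at once.

import java.util.concurrent.Semaphore;

class SemaphoreReadWriteLock {
    private static final int MAX_READERS = 10;
    // fair mode, so a waiting writer is not starved by new readers
    private final Semaphore permits = new Semaphore(MAX_READERS, true);

    public void readLock() throws InterruptedException {
        permits.acquire(1);           // one permit per reader
    }

    public void readUnlock() {
        permits.release(1);
    }

    public void writeLock() throws InterruptedException {
        permits.acquire(MAX_READERS); // writer takes all permits: excludes everyone
    }

    public void writeUnlock() {
        permits.release(MAX_READERS);
    }
}

Constructing the Semaphore as fair is what keeps a writer, which needs all the permits at once, from waiting forever behind a steady stream of new readers.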
A mutex is just a boolean, while a semaphore is a counter. Both are used to lock part of code so it's not accessed by too many threads. Example:

lock.set()
a += 1
lock.unset()

Now if lock was a mutex, it means that it will always be locked or unlocked (a boolean under the surface) regardless of how many threads try to access the protected snippet of code. While locked, any other thread would just wait until it's unlocked/unset by the previous thread.

Now imagine if instead lock was, under the hood, a counter with a predefined MAX value (say 2 for our example). Then if 2 threads try to access the resource, lock would get its value increased to 2. If a 3rd thread then tried to access it, it would simply wait for the counter to go below 2, and so on.

If lock as a semaphore had a max of 1, then it would be acting exactly like a mutex.