What is a deadlock? - multithreading

When writing multi-threaded applications, one of the most common problems you will run into is deadlock.
My questions to the community are:
What is a deadlock?
How do you detect them?
Do you handle them?
And finally, how do you prevent them from occurring?

A lock conflict occurs when multiple processes try to access the same resource at the same time.
One process loses out and must wait for the other to finish.
A deadlock occurs when the waiting process is itself holding a resource that the first process needs before it can finish.
So, an example:
Resources A and B are used by process X and process Y.
X starts to use A.
X and Y both try to start using B.
Y 'wins' and gets B first.
Now Y needs to use A.
A is locked by X, which is waiting for Y.
The best way to avoid deadlocks is to avoid having processes cross over in this way. Reduce the need to lock anything as much as you can.
In databases avoid making lots of changes to different tables in a single transaction, avoid triggers and switch to optimistic/dirty/nolock reads as much as possible.

Let me give a real-world (not actually real) example of a deadlock situation, taken from crime movies. Imagine a criminal holds a hostage, while a cop holds a hostage who is a friend of the criminal. The criminal is not going to release his hostage unless the cop releases his friend, and the cop is not going to release the criminal's friend unless the criminal releases the hostage. This is an endless stand-off with no trust, because each side insists that the other take the first step.
Criminal & Cop Scene
So, simply: when two threads need two different resources and each of them holds the lock on the resource that the other needs, it is a deadlock.
Another high-level explanation of deadlock: broken hearts
You are dating a girl and, one day after an argument, both of you are heartbroken and waiting for an I-am-sorry-and-I-missed-you call. Each side wants to communicate if and only if it first receives that call from the other. Because neither of you will initiate and both wait passively for the other to start, you end up in a deadlock situation.

Deadlocks will only occur when you have two or more locks that can be acquired at the same time and they are grabbed in a different order.
Ways to avoid deadlocks are:
avoid having locks (if possible),
avoid having more than one lock,
always take the locks in the same order, as sketched below.
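As a minimal Java sketch of that last rule (the lock names lockA and lockB are made up for illustration): both methods take the locks in the same order, so a circular wait cannot form between them.
class SameOrder {
    private final Object lockA = new Object();
    private final Object lockB = new Object();

    void task1() {
        synchronized (lockA) {        // always lockA first...
            synchronized (lockB) {    // ...then lockB
                // use both resources
            }
        }
    }

    void task2() {
        synchronized (lockA) {        // same order as task1, never lockB first
            synchronized (lockB) {
                // use both resources
            }
        }
    }
}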

To define deadlock, first I would define process.
Process: As we know, a process is nothing but a program in execution.
Resource: To execute, a process needs resources. Resource categories may include memory, printers, CPUs, open files, tape drives, CD-ROMs, etc.
Deadlock: Deadlock is a situation or condition where two or more processes are each holding some resources and trying to acquire more, and they cannot release the resources they hold until they finish their execution.
Deadlock condition or situation
In the above diagram there are two processes, P1 and P2, and two resources, R1 and R2.
Resource R1 is allocated to process P1, and resource R2 is allocated to process P2.
To complete its execution, process P1 needs resource R2, so P1 requests R2, but R2 is already allocated to P2.
In the same way, process P2 needs R1 to complete its execution, but R1 is already allocated to P1.
Neither process can release its resource until it completes its execution, so both are waiting for the other's resource, and they will wait forever. This is a DEADLOCK condition.
In order for deadlock to occur, four conditions must be true.
Mutual exclusion - Each resource is either currently allocated to exactly one process or it is available. (Two processes cannot simultaneously control the same resource or be in their critical section.)
Hold and wait - Processes currently holding resources can request new resources.
No preemption - Once a process holds a resource, it cannot be taken away by another process or the kernel.
Circular wait - Each process is waiting to obtain a resource which is held by another process.
All of these conditions are satisfied in the diagram above.

A deadlock happens when a thread is waiting for something that never occurs.
Typically, it happens when a thread is waiting on a mutex or semaphore that was never released by the previous owner.
It also frequently happens when you have a situation involving two threads and two locks like this:
Thread 1:              Thread 2:
Lock1->Lock();         Lock2->Lock();
WaitForLock2();        WaitForLock1();   <-- Oops!
You generally detect them because things that you expect to happen never do, or the application hangs entirely.

You can take a look at this wonderful article, under the section Deadlock. It is in C# but the idea is the same for other platforms. I quote here for easy reading:
A deadlock happens when two threads each wait for a resource held by
the other, so neither can proceed. The easiest way to illustrate this
is with two locks:
object locker1 = new object();
object locker2 = new object();

new Thread (() => {
    lock (locker1)
    {
        Thread.Sleep (1000);
        lock (locker2);   // Deadlock
    }
}).Start();

lock (locker2)
{
    Thread.Sleep (1000);
    lock (locker1);       // Deadlock
}

Deadlock is a common problem in multiprocessing/multiprogramming in an OS.
Say there are two processes, P1 and P2, and two globally shareable resources, R1 and R2, and in the critical section both resources need to be accessed.
Initially, the OS assigns R1 to process P1 and R2 to process P2.
As both processes run concurrently they may start executing their code, but the PROBLEM arises when a process hits its critical section.
Process P1 will wait for process P2 to release R2, and vice versa...
So they will wait forever (DEADLOCK CONDITION).
A small ANALOGY...
Your Mother(OS),
You(P1),
Your brother(P2),
Apple(R1),
Knife(R2),
critical section(cutting apple with knife).
In the beginning, your mother gives the apple to you and the knife to your brother.
Both of you are happy and playing (executing your code).
At some point, one of you wants to cut the apple (critical section).
You don't want to give the apple to your brother.
Your brother doesn't want to give the knife to you.
So both of you are going to wait for a long very long time :)

Deadlock occurs when two threads acquire locks in a way that prevents either of them from progressing. The best way to avoid them is careful development. Many embedded systems protect against hangs by using a watchdog timer (a timer which resets the system if it hangs for a certain period of time).
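As a rough illustration of the watchdog idea, here is a hypothetical Java sketch (not a real embedded API): the worker must call pet() regularly, and a monitor thread fires if it stops.
class Watchdog {
    private volatile long lastPet = System.currentTimeMillis();

    // The worker calls this periodically to prove it is still making progress.
    void pet() { lastPet = System.currentTimeMillis(); }

    void start(long timeoutMillis) {
        Thread t = new Thread(() -> {
            while (true) {
                try { Thread.sleep(timeoutMillis / 2); }
                catch (InterruptedException e) { return; }
                if (System.currentTimeMillis() - lastPet > timeoutMillis) {
                    System.err.println("Watchdog: system hung, resetting");
                    System.exit(1);  // an embedded system would reset the hardware here
                }
            }
        });
        t.setDaemon(true);
        t.start();
    }
}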

A deadlock occurs when there is a circular chain of threads or processes which each hold a locked resource and are trying to lock a resource held by the next element in the chain. For example, two threads that hold respectively lock A and lock B, and are both trying to acquire the other lock.

Lock-based concurrency control
Using locking for controlling access to shared resources is prone to deadlocks, and the transaction scheduler alone cannot prevent their occurrences.
For instance, relational database systems use various locks to guarantee transaction ACID properties.
No matter what relational database system you are using, locks will always be acquired when modifying (e.g., UPDATE or DELETE) a certain table record (without locking a row that was modified by a currently running transaction, Atomicity would be compromised).
What is a deadlock
A deadlock happens when two concurrent transactions cannot make progress because each one waits for the other to release a lock, as illustrated in the following diagram.
Because both transactions are in the lock acquisition phase, neither one releases a lock prior to acquiring the next one.
Recovering from a deadlock situation
If you're using a Concurrency Control algorithm that relies on locks, then there is always the risk of running into a deadlock situation. Deadlocks can occur in any concurrency environment, not just in a database system.
For instance, a multithreading program can deadlock if two or more threads are waiting on locks that were previously acquired so that no thread can make any progress. If this happens in a Java application, the JVM cannot just force a Thread to stop its execution and release its locks.
Even if the Thread class exposes a stop method, that method has been deprecated since Java 1.1 because it can cause objects to be left in an inconsistent state after a thread is stopped. Instead, Java defines an interrupt method, which acts only as a hint: a thread that gets interrupted can simply ignore the interruption and continue its execution.
For this reason, a Java application cannot recover from a deadlock situation, and it is the responsibility of the application developer to order the lock acquisition requests in such a way that deadlocks can never occur.
However, a database system cannot enforce a given lock acquisition order since it's impossible to foresee what other locks a certain transaction will want to acquire further. Preserving the lock order becomes the responsibility of the data access layer, and the database can only assist in recovering from a deadlock situation.
The database engine runs a separate process that scans the current conflict graph for lock-wait cycles (which are caused by deadlocks).
When a cycle is detected, the database engine picks one transaction and aborts it, causing its locks to be released, so that the other transaction can make progress.
Unlike the JVM, a database transaction is designed as an atomic unit of work. Hence, a rollback leaves the database in a consistent state.
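A hedged sketch of what the application side of that recovery might look like over JDBC: retry the unit of work when the database aborts it as a deadlock victim. transferFunds and MAX_RETRIES are hypothetical names, and the error check is simplified to the standard SQLState "40001" for serialization failures (real engines may also use vendor-specific codes, e.g. PostgreSQL's 40P01 for deadlocks).
import java.sql.Connection;
import java.sql.SQLException;

class DeadlockRetry {
    static final int MAX_RETRIES = 3;            // hypothetical retry policy

    static void runWithRetry(Connection con) throws SQLException {
        for (int attempt = 1; ; attempt++) {
            try {
                con.setAutoCommit(false);
                transferFunds(con);              // hypothetical unit of work
                con.commit();
                return;
            } catch (SQLException e) {
                con.rollback();
                // give up unless this was a deadlock/serialization abort
                if (!"40001".equals(e.getSQLState()) || attempt == MAX_RETRIES) {
                    throw e;
                }
            }
        }
    }

    static void transferFunds(Connection con) throws SQLException {
        // ... UPDATEs that may be chosen as a deadlock victim ...
    }
}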

A classic and very simple program for understanding a deadlock situation:
public class Lazy {
    private static boolean initialized = false;

    static {
        Thread t = new Thread(new Runnable() {
            public void run() {
                initialized = true;
            }
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        System.out.println(initialized);
    }
}
When the main thread invokes Lazy.main, it checks whether the class Lazy has been initialized and begins to initialize it. As part of that initialization, the main thread sets initialized to false, creates and starts a background thread whose run method sets initialized to true, and waits for the background thread to complete. But assigning the static field initialized also requires the class to be initialized, and this time the class is currently being initialized by another thread. Under these circumstances, the current thread, which is the background thread, waits on the Class object until initialization is complete. Unfortunately, the thread that is doing the initialization, the main thread, is waiting for the background thread to complete. Because the two threads are now waiting for each other, the program is DEADLOCKED.

A deadlock is a state of a system in which no single process/thread is capable of executing an action. As mentioned by others, a deadlock is typically the result of a situation where each process/thread wishes to acquire a lock to a resource that is already locked by another (or even the same) process/thread.
There are various methods to find them and avoid them. One is thinking very hard and/or trying lots of things. However, dealing with parallelism is notoriously difficult and most (if not all) people will not be able to completely avoid problems.
Some more formal methods can be useful if you are serious about dealing with these kinds of issues. The most practical method that I'm aware of is the process-theoretic approach. Here you model your system in some process language (e.g. CCS, CSP, ACP, mCRL2, LOTOS) and use the available tools to (model-)check for deadlocks (and perhaps some other properties as well). Examples of toolsets are FDR, mCRL2, CADP and Uppaal. Some brave souls might even prove their systems deadlock-free by using purely symbolic methods (theorem proving; look for Owicki-Gries).
However, these formal methods typically do require some effort (e.g. learning the basics of process theory). But I guess that's simply a consequence of the fact that these problems are hard.

Deadlock is a situation that occurs when processes request more resources than are available. When the number of available resources is less than what the processes have requested, the processes go into a waiting state. Sometimes the waiting can never be resolved, and this situation, in which there is no way out of the shortage of resources, is known as deadlock.
Deadlock is a major problem for us, and it occurs only in a multitasking operating system. It cannot occur in a single-tasking operating system, because all the resources are available exclusively to the one task that is currently running.

Some of the explanations above are nice. Hope this may also be useful:
https://ora-data.blogspot.in/2017/04/deadlock-in-oracle.html
In a database, a deadlock occurs when a session (e.g. ora) wants a resource held by another session (e.g. data), but that session (data) also wants a resource held by the first session (ora). There can be more than two sessions involved as well, but the idea is the same.
Deadlocks prevent some transactions from continuing to work.
For example:
Suppose ORA-DATA holds lock A and requests lock B,
and SKU holds lock B and requests lock A.

Deadlock occurs when one thread is waiting for another thread to finish, and vice versa.
How to avoid it?
- Avoid nested locks
- Avoid unnecessary locks
- Use Thread.join() (preferably with a timeout, so a stuck thread cannot block you forever)
How do you detect it?
Run this command in cmd:
jcmd $PID Thread.print
Reference: GeeksforGeeks
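If you want to detect deadlocks from inside the JVM rather than from the command line, the JDK's ThreadMXBean can do it programmatically; a minimal sketch:
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class DeadlockDetector {
    public static void main(String[] args) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long[] ids = bean.findDeadlockedThreads();   // null when no deadlock exists
        if (ids != null) {
            for (ThreadInfo info : bean.getThreadInfo(ids)) {
                System.out.println("Deadlocked: " + info.getThreadName());
            }
        }
    }
}
In a real application you would typically run this check from a periodic monitoring thread rather than from main.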

Deadlocks do not just occur with locks, although locks are the most frequent cause. In C++, you can create deadlock with two threads and no locks by just having each thread call join() on the std::thread object for the other.
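The same no-explicit-locks deadlock is easy to reproduce in Java; in this sketch each thread join()s the other. (The CountDownLatch only ensures both threads are alive before they join, since join() on a not-yet-started thread returns immediately.)
import java.util.concurrent.CountDownLatch;

public class JoinDeadlock {
    static Thread t1, t2;
    static final CountDownLatch bothStarted = new CountDownLatch(2);

    public static void main(String[] args) {
        t1 = new Thread(() -> {
            bothStarted.countDown();
            try { bothStarted.await(); t2.join(); } catch (InterruptedException ignored) {}
        });
        t2 = new Thread(() -> {
            bothStarted.countDown();
            try { bothStarted.await(); t1.join(); } catch (InterruptedException ignored) {}
        });
        t1.start();
        t2.start();   // both threads now join() each other: deadlock, no locks involved
    }
}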

Let's say one thread wants to transfer money from account A to account B while another thread wants to transfer money from account B to account A, simultaneously. This can cause a deadlock.
The first thread takes a lock on account A and then has to take a lock on account B, but it cannot, because the second thread has already locked account B. Similarly, the second thread cannot take a lock on account A because it is locked by the first thread. So the transaction remains incomplete and our system loses two threads. To prevent this, we can add a rule that locks the records in a sorted order: each thread looks at the account names or IDs and locks them in sorted order, A => B (see the sketch below). There is still a race for the first lock, and whoever wins runs its code while the other thread takes over afterwards. This is the solution for this specific case, but deadlocks can occur for many reasons, so each case may need a different solution.
The OS has a deadlock detection mechanism that runs at a certain interval of time, and when it detects a deadlock, it starts a recovery approach. More on deadlock detection.
In the example we lost two threads, but if more deadlocks occur, they can bring down the whole system.
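A minimal Java sketch of the sorted-order fix described above (Account, transfer, and the field names are made up for illustration; it assumes distinct account ids): whichever direction the transfer goes, the account with the smaller id is always locked first, so the circular wait cannot form.
class Account {
    final long id;
    long balance;
    Account(long id, long balance) { this.id = id; this.balance = balance; }
}

class Bank {
    static void transfer(Account from, Account to, long amount) {
        Account first  = from.id < to.id ? from : to;   // global order: by id
        Account second = from.id < to.id ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance   += amount;
            }
        }
    }
}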

The best solution to deadlock is not to fall into deadlock:
System resources and critical sections should be used in a balanced way.
Give priority to key variables.
If more than one lock object will be used, give them a sequence number and always acquire them in that order.
Deadlock occurs as a result of incorrect use of synchronization objects.
For example, deadlock in C#:
public class Deadlock
{
    static object o1 = new Object();
    static object o2 = new Object();

    private static void y1()
    {
        lock (o1)               // thread 1 takes o1 first...
        {
            Console.WriteLine("1");
            lock (o2)           // ...then waits for o2
            {
                Console.WriteLine("2");
            }
        }
    }

    private static void y2()
    {
        lock (o2)               // thread 2 takes o2 first...
        {
            Console.WriteLine("3");
            lock (o1)           // ...then waits for o1
            {
                Console.WriteLine("4");
            }
        }
    }

    public static void Main(string[] args)
    {
        Thread thread1 = new Thread(y1);
        Thread thread2 = new Thread(y2);
        thread1.Start();
        thread2.Start();
    }
}

A mutex is in essence a lock, providing protected access to shared resources. Under Linux, the thread mutex data type is pthread_mutex_t. Before use, initialize it.
To access a shared resource, you have to lock the mutex. If the mutex is already locked, the call will block the thread until the mutex is unlocked. Upon completion of your visit to the shared resource, you have to unlock the mutex.
Overall, there are a few unwritten basic principles (sketched in Java after the list):
Obtain the lock before using the shared resource.
Hold the lock for as short a time as possible.
Release the lock if the thread returns an error.
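The same three principles carry over directly to Java; a minimal sketch with java.util.concurrent.locks.ReentrantLock, where the try/finally guarantees the third rule (the lock is released even on error paths):
import java.util.concurrent.locks.ReentrantLock;

class SharedCounter {
    private final ReentrantLock mutex = new ReentrantLock();
    private int value;

    void increment() {
        mutex.lock();        // obtain the lock before touching shared state
        try {
            value++;         // keep the critical section as short as possible
        } finally {
            mutex.unlock();  // released even if the guarded code throws
        }
    }
}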

Related

Java Thread Live Lock

I have an interesting problem related to Java thread live lock. Here it goes.
There are four global locks - L1,L2,L3,L4
There are four threads - T1, T2, T3, T4
T1 requires locks L1,L2,L3
T2 requires lock L2
T3 requires locks L3, L4
T4 requires locks L1, L2
So, the pattern of the problem is: any of the threads can run and acquire its locks in any order. If any thread detects that a lock it needs is not available, it releases all the locks it has previously acquired and waits for a fixed time before retrying. The cycle repeats, giving rise to a livelock condition.
So, to solve this problem, I have two solutions in mind
1) Let each thread wait for a random period of time before retrying.
OR,
2) Let each thread acquire all the locks in a particular order (even if a thread does not require all the locks)
I am not convinced that these are the only two options available to me. Please advise.
Have all the threads enter a single mutex-protected state-machine whenever they require and release their set of locks. The threads should expose methods that return the set of locks they require to continue and also to signal/wait for a private semaphore signal. The SM should contain a bool for each lock and a 'Waiting' queue/array/vector/list/whatever container to store waiting threads.
If a thread enters the SM mutex to get locks and can immediately get its lock set, it can reset its bool set, exit the mutex and continue on.
If a thread enters the SM mutex and cannot immediately get its lock set, it should add itself to 'Waiting', exit the mutex and wait on its private semaphore.
If a thread enters the SM mutex to release its locks, it sets the lock bools to 'return' its locks and iterates 'Waiting' in an attempt to find a thread that can now run with the set of locks available. If it finds one, it resets the bools appropriately, removes the thread it found from 'Waiting' and signals the 'found' thread semaphore. It then exits the mutex.
You can twiddle with the algorithm that you use to match up the available set lock bools with waiting threads as you wish. Maybe you should release the thread that requires the largest set of matches, or perhaps you would like to 'rotate' the 'Waiting' container elements to reduce starvation. Up to you.
A solution like this requires no polling (with its performance-sapping CPU use and latency) and no continual acquire/release of multiple locks.
It's much easier to develop such a scheme with an OO design. The methods/member functions to signal/wait the semaphore and return the set of locks needed can usually be stuffed somewhere in the thread class inheritance chain.
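A condensed Java sketch of that state machine, simplified (my assumption, not the answer's exact design) to use a single monitor with wait/notifyAll instead of per-thread private semaphores:
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class LockManager {
    private final Set<String> held = new HashSet<>();

    // Block until every lock in 'wanted' is free, then take the whole set atomically.
    synchronized void acquireAll(Set<String> wanted) throws InterruptedException {
        while (!Collections.disjoint(held, wanted)) {
            wait();               // some lock in the set is taken: park until a release
        }
        held.addAll(wanted);      // got the entire set in one step
    }

    // Return a lock set and wake all waiters so they can re-check availability.
    synchronized void releaseAll(Set<String> owned) {
        held.removeAll(owned);
        notifyAll();
    }
}
Because a thread takes all of its locks in one atomic step or none at all, hold-and-wait is eliminated, so neither deadlock nor the original livelock can occur; starvation is still possible, as the answer notes.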
Unless there is a good reason (performance-wise) not to do so, I would unify all locks into one lock object. This is similar to solution 2 you suggested, only simpler in my opinion. And by the way, not only is this solution simpler and less bug-prone, its performance might also be better than solution 1 you suggested.
Personally, I have never heard of Option 1, but I am by no means an expert on multithreading. After thinking about it, it sounds like it will work fine.
However, the standard way to deal with threads and resource locking is somewhat related to Option 2. To prevent deadlocks, resources need to always be acquired in the same order. For example, if you always lock the resources in the same order, you won't have any issues.
Go with 2a) Let each thread acquire all of the locks that it needs (NOT all of the locks) in a particular order; if a thread encounters a lock that isn't available then it releases all of its locks
As long as threads acquire their locks in the same order you can't have deadlock; however, you can still have starvation (a thread might run into a situation where it keeps releasing all of its locks without making forward progress). To ensure that progress is made you can assign priorities to threads (0 = lowest priority, MAX_INT = highest priority) - increase a thread's priority when it has to release its locks, and reduce it to 0 when it acquires all of its locks. Put your waiting threads in a queue, and don't start a lower-priority thread if it needs the same resources as a higher-priority thread - this way you guarantee that the higher-priority threads will eventually acquire all of their locks. Don't implement this thread queue unless you're actually having problems with thread starvation, though, because it's probably less efficient than just letting all of your threads run at once.
You can also simplify things by implementing omer schleifer's condense-all-locks-to-one solution; however, unless threads other than the four you've mentioned are contending for these resources (in which case you'll still need to lock the resources from the external threads), you can more efficiently implement this by removing all locks and putting your threads in a circular queue (so your threads just keep running in the same order).

Semaphore vs. Monitors - what's the difference?

What are the major differences between a Monitor and a Semaphore?
A Monitor is an object designed to be accessed from multiple threads. The member functions or methods of a monitor object will enforce mutual exclusion, so only one thread may be performing any action on the object at a given time. If one thread is currently executing a member function of the object then any other thread that tries to call a member function of that object will have to wait until the first has finished.
A Semaphore is a lower-level object. You might well use a semaphore to implement a monitor. A semaphore essentially is just a counter. When the counter is positive, if a thread tries to acquire the semaphore then it is allowed, and the counter is decremented. When a thread is done then it releases the semaphore, and increments the counter.
If the counter is already zero when a thread tries to acquire the semaphore then it has to wait until another thread releases the semaphore. If multiple threads are waiting when a thread releases a semaphore then one of them gets it. The thread that releases a semaphore need not be the same thread that acquired it.
A monitor is like a public toilet. Only one person can enter at a time. They lock the door to prevent anyone else coming in, do their stuff, and then unlock it when they leave.
A semaphore is like a bike hire place. They have a certain number of bikes. If you try and hire a bike and they have one free then you can take it, otherwise you must wait. When someone returns their bike then someone else can take it. If you have a bike then you can give it to someone else to return --- the bike hire place doesn't care who returns it, as long as they get their bike back.
The following explanation shows how the wait() and signal() operations of a monitor differ from the P and V operations of a semaphore.
The wait() and signal() operations on condition variables in a monitor are similar to P and V operations on counting semaphores.
A wait statement can block a process's execution, while a signal statement can cause another process to be unblocked. However, there are some differences between them. When a process executes a P operation, it does not necessarily block, because the counting semaphore may be greater than zero. In contrast, a wait statement always blocks the process. When a task executes a V operation on a semaphore, it either unblocks a task waiting on that semaphore or increments the semaphore counter if there is no task to unblock. On the other hand, if a process executes a signal statement when there is no other process to unblock, there is no effect on the condition variable. Another difference between semaphores and monitors is that tasks awakened by a V operation can resume execution without delay. Contrarily, tasks awakened by a signal operation are restarted only when the monitor is unlocked. In addition, a monitor solution is more structured than one with semaphores, because the data and procedures are encapsulated in a single module and mutual exclusion is provided automatically by the implementation.
Link: here for further reading. Hope it helps.
Semaphore allows multiple threads (up to a set number) to access a shared object. Monitors allow mutually exclusive access to a shared object.
One Line Answer:
Monitor: ensures that only ONE thread at a time can execute inside the monitor (a thread needs to acquire the lock to execute).
Semaphore: a lock that protects a shared resource (a thread needs to acquire the lock to access the resource).
A semaphore is a signaling mechanism used to coordinate between threads. Example: One thread is downloading files from the internet and another thread is analyzing the files. This is a classic producer/consumer scenario. The producer calls signal() on the semaphore when a file is downloaded. The consumer calls wait() on the same semaphore in order to be blocked until the signal indicates a file is ready. If the semaphore is already signaled when the consumer calls wait, the call does not block. Multiple threads can wait on a semaphore, but each signal will only unblock a single thread.
A counting semaphore keeps track of the number of signals. E.g. if the producer signals three times in a row, wait() can be called three times without blocking. A binary semaphore does not count but just has the "waiting" and "signalled" states.
A mutex (mutual exclusion lock) is a lock which is owned by a single thread. Only the thread which has acquired the lock can release it again. Other threads which try to acquire the lock will be blocked until the current owner thread releases it. A mutex lock does not in itself lock anything - it is really just a flag. But code can check for ownership of a mutex lock to ensure that only one thread at a time can access some object or resource.
A monitor is a higher-level construct which uses an underlying mutex lock to ensure thread-safe access to some object. Unfortunately the word "monitor" is used in a few different meanings depending on platform and context, but in Java, for example, a monitor is a mutex lock which is implicitly associated with an object and which can be invoked with the synchronized keyword. The synchronized keyword can be applied to a class, method or block and ensures only one thread can execute the code at a time.
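A minimal sketch of the download/analyze scenario from the answer above, using Java's java.util.concurrent.Semaphore (the file names and thread bodies are stand-ins):
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;

public class ProducerConsumer {
    static final Semaphore filesReady = new Semaphore(0);  // counting semaphore
    static final Queue<String> files = new ConcurrentLinkedQueue<>();

    public static void main(String[] args) {
        new Thread(() -> {                  // producer: "downloads" files
            for (int i = 0; i < 3; i++) {
                files.add("file" + i);
                filesReady.release();       // signal(): one more file is ready
            }
        }).start();

        new Thread(() -> {                  // consumer: analyzes files
            try {
                for (int i = 0; i < 3; i++) {
                    filesReady.acquire();   // wait(): blocks until a signal arrives
                    System.out.println("analyzing " + files.poll());
                }
            } catch (InterruptedException ignored) {}
        }).start();
    }
}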
Semaphore:
Using a counter or flag to control access to shared resources in a concurrent system implies use of a semaphore.
Example:
A counter to allow only 50 passengers to acquire the 50 seats (shared resource) of any theatre/bus/train/fun ride/classroom, and to allow a new passenger in only when someone vacates a seat.
A binary flag indicating the free/occupied status of any bathroom.
Traffic lights are a good example of flags. They control flow by regulating the passage of vehicles on roads (shared resource).
Flags only reveal the current state of the resource; they keep no count or other information about the objects waiting on or running in the resource.
Monitor :
A Monitor synchronizes access to an Object by communicating with threads interested in the object, asking them to acquire access or wait for some condition to become true.
Example:
A father may act as a monitor for his daughter, allowing her to date only one guy at a time.
A school teacher using a baton to allow only one child to speak in the class.
Lastly, a technical one: transactions (via threads) on an Account object synchronized to maintain integrity.
When a semaphore is used to guard a critical region, there is no direct relationship between the semaphore and the data being protected. This is part of the reason why semaphores may be dispersed around the code, and why it is easy to forget to call wait or notify, in which case the result will be, respectively, to violate mutual exclusion or to lock the resource permanently.
In contrast, neither of these bad things can happen with a monitor. A monitor is tied directly to the data (it encapsulates the data) and, because the monitor operations are atomic actions, it is impossible to write code that can access the data without calling the entry protocol. The exit protocol is called automatically when the monitor operation is completed.
A monitor has a built-in mechanism for condition synchronisation in the form of condition variables: a process may need to wait for a condition to become true before proceeding. If the condition is not satisfied, the process has to wait until it is notified of a change in the condition. While a process is waiting for condition synchronisation, the monitor implementation takes care of the mutual exclusion issue and allows another process to gain access to the monitor.
Taken from The Open University M362 Unit 3 "Interacting process" course material.

Is Deadlock recovery possible in MultiThread programming?

A process has some 10 threads and all 10 threads have entered a DEADLOCK state (assume all are waiting on a mutex variable).
How can you free the process (threads) from the DEADLOCK state?
Is there any way to kill a lower-priority thread? (In the multi-process case we can kill a lower-priority process when all processes are in a deadlock state.)
Can we attach the deadlocked process to a debugger and assign a proper value to the mutex variable? (Assume all the threads are waiting on a mutex variable MUT whose value is 0; can we assign MUT the value 1 through the debugger?)
If every thread in the app is waiting on every other, and none are set to time out, you're rather screwed. You might be able to run the app in a debugger or something, but locks are generally acquired for a reason, and manually forcing a mutex to be owned by a thread that didn't legitimately acquire it can cause big problems: the thread that previously owned it is still going to try to release it, and the results can be unpredictable if the mutex is unexpectedly yanked away. It could cause an unexpected exception, or cause the mutex to be unlocked while still in use. Anyway, it defeats the whole purpose of mutexes, so you're just covering up a much bigger problem.
There are two common solutions:
Instead of having threads wait forever, set a timeout. This is slightly harder to do in languages like Java that embed mutexes into the language via synchronized or lock blocks, but it's almost always possible. If you time out waiting on the lock, release all the locks/mutexes you held and try again later.
Better, but potentially much more complex, is to figure out why everything's fighting for the resource and remove that contention. If you must lock, lock consistently. But if there are 10 threads blocking on a single mutex, that could be a clue either that your operations are badly chunked (i.e., your threads are doing too much or too little at once before trying to acquire a lock), or that there's unnecessary locking going on. Don't lock unless you have to. Some synchronization could be obviated by using collections and algorithms specifically designed to be "lock-free" while still offering thread-safety.
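A minimal Java sketch of the timeout idea with ReentrantLock.tryLock: if the second lock can't be had within the deadline, the thread backs out completely so it can retry later. (The class, method name, and 100 ms deadline are arbitrary assumptions.)
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class TimedLocking {
    // Try to take both locks; on failure, back out so the caller can retry later.
    static boolean doBoth(ReentrantLock lock1, ReentrantLock lock2)
            throws InterruptedException {
        if (!lock1.tryLock(100, TimeUnit.MILLISECONDS)) return false;
        try {
            if (!lock2.tryLock(100, TimeUnit.MILLISECONDS)) return false;
            try {
                // ... work with both resources ...
                return true;
            } finally {
                lock2.unlock();
            }
        } finally {
            lock1.unlock();
        }
    }
}
A caller would loop, perhaps with a randomized sleep between attempts, until doBoth returns true.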
Adding another answer because I don't agree with the solutions proposed by cHao earlier - the analysis is fine.
First, why I disagree with the two solutions offered:
Reduce contention
Contention doesn't lead to deadlocks. It just causes poor performance. Deadlock means no performance whatsoever. Therefore, reducing contention does not solve deadlocks.
timeout on mutex.
A mutex protects a resource, and a thread locks the mutex because it needs the resource. With a timeout, you won't be able to acquire the resource, and your thread fails. Does it solve the deadlock problem? Only if the failing thread releases another resource that was blocking the other threads.
But in that case, there's a much better solution. Mutexes should have a partial ordering. If there is at least one thread that can lock both mutex A and mutex B, you should decide whether A or B is acquired first, and then stick with that. This must be a transitive order: if you lock A before B, and B before C, then obviously you must lock A before C.
This is a perfect solution to deadlocks. Look back at the timeout example: it only works if the thread that times out waiting on A then releases its lock on B, to release another thread that was waiting on B. In the most simple case, that other thread was itself directly locking A. Thus, the mutexes A and B are not properly ordered. You should have consistently locked either A or B first.
The timeout case could also be the result of a cyclic order problem; one thread locks A then B, another B then C, and a third C then A, with the deadlock happening when each thread owns one lock. The solution again is the same; order the locks.
Alternatively said, mutex lock orders can be described by a directed graph. If a thread locks A before B, there's an arc from A to B. Deadlocks appear if the directed graph is cyclic, and then the arcs of that cycle are the deadlocked threads.
This theory can be a bit complex, but there are some simple insights to be found. For instance, from the graph theory, we know that trees are acyclic graphs. Hence, neither "leaf mutexes" (those that are always locked last) nor "root mutexes" (those that are always locked first) can cause deadlocks. Leaf mutexes are excluded because no thread ever blocks holding them, and root mutexes are excluded because the thread that holds them will be able to lock all subsequent mutexes in due time.

Is a lock (threading) atomic?

This may sound like a stupid question, but if one locks a resource in a multi-threaded app, then the operation that happens on the resource, is that done atomically?
I.e.: can the processor be interrupted, or can a context switch occur, while that resource has a lock on it? If it can, then nothing else can access this resource until the locking thread is scheduled back in to finish its work. Sounds like an expensive operation.
The processor can very definitely still switch to another thread, yes. Indeed, in most modern computers there can be multiple threads running simultaneously anyway. The locking just makes sure that no other thread can acquire the same lock, so you can make sure that an operation on that resource is atomic in terms of that resource. Code using other resources can operate completely independently.
You should usually lock for short operations wherever possible. You can also choose the granularity of locks... for example, if you have two independent variables in a shared object, you could use two separate locks to protect access to those variables. That will potentially provide better concurrency - but at the same time, more locks means more complexity and more potential for deadlock. There's always a balancing act when it comes to concurrency.
You're exactly right. That's one reason why it's so important to lock for short period of time. However, this isn't as bad as it sounds because no other thread that's waiting on the lock will get scheduled until the thread holding the lock releases it.
Yes, a context switch can definitely occur.
This is exactly why, when accessing a shared resource, it is important to lock it from the other threads as well. When thread A holds the lock, thread B cannot enter the code protected by that lock.
For example if two threads run the following code:
1. lock(l);
2. -- change shared resource S here --
3. unlock(l);
A context switch can occur after step 1, but the other thread cannot hold the lock at that time, and therefore, cannot change the shared resource. If access to the shared resource on one of the threads is done without a lock - bad things can happen!
Regarding the wastefulness: yes, it is a wasteful method. This is why there are methods that try to avoid locks altogether. These methods are called lock-free, and some of them are based on strong atomic primitives such as CAS (Compare-And-Swap).
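For instance, a lock-free counter can be built directly on CAS using the JDK's AtomicInteger (a minimal sketch):
import java.util.concurrent.atomic.AtomicInteger;

class LockFreeCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    int increment() {
        while (true) {
            int current = value.get();
            int next = current + 1;
            if (value.compareAndSet(current, next)) {   // CAS: succeeds only if no
                return next;                            // other thread changed it meanwhile
            }
        }
    }
}
Instead of blocking, a thread whose CAS fails simply retries with the freshly read value, so no context switch or lock is ever needed.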
No, it's not really expensive. There are typically only two possibilities:
1) The system has other things it can do: In this case, the system is still doing useful work with all available cores.
2) The system doesn't have anything else to do: In this case, the thread that holds the lock will be scheduled. A sane system won't leave a core unused while there's a ready-to-run thread that's not scheduled.
So, how can it be expensive? If there's nothing else for the system to do that doesn't require acquiring that lock (or not enough other things to occupy all cores) and the thread holding the lock is not ready-to-run. So that's the case you have to avoid, and the context switch or pre-empt issue doesn't matter (since the thread would be ready-to-run).

What are multi-threading DOs and DONTs? [closed]

I am applying my newfound knowledge of threading everywhere and getting lots of surprises.
Example:
I used threads to add numbers in an array. And the outcome was different every time. The problem was that all of my threads were updating the same variable and were not synchronized.
What are some known thread issues?
What care should be taken while using
threads?
What are good multithreading resources?
Please provide examples.
sidenote:(I renamed my program thread_add.java to thread_random_number_generator.java:-)
In a multithreading environment you have to take care of synchronization so two threads don't clobber the state by simultaneously performing modifications. Otherwise you can have race conditions in your code (for an example see the infamous Therac-25 accident). You also have to schedule the threads to perform various tasks. You then have to make sure that your synchronization and scheduling doesn't cause a deadlock where multiple threads will wait for each other indefinitely.
Synchronization
Something as simple as increasing a counter requires synchronization:
counter += 1;
Assume this sequence of events:
counter is initialized to 0
thread A retrieves counter from memory to cpu (0)
context switch
thread B retrieves counter from memory to cpu (0)
thread B increases counter on cpu
thread B writes back counter from cpu to memory (1)
context switch
thread A increases counter on cpu
thread A writes back counter from cpu to memory (1)
At this point the counter is 1, but both threads did try to increase it. Access to the counter has to be synchronized by some kind of locking mechanism:
lock (myLock) {
counter += 1;
}
Only one thread is allowed to execute the code inside the locked block. Two threads executing this code might result in this sequence of events:
counter is initialized to 0
thread A acquires myLock
context switch
thread B tries to acquire myLock but has to wait
context switch
thread A retrieves counter from memory to cpu (0)
thread A increases counter on cpu
thread A writes back counter from cpu to memory (1)
thread A releases myLock
context switch
thread B acquires myLock
thread B retrieves counter from memory to cpu (1)
thread B increases counter on cpu
thread B writes back counter from cpu to memory (2)
thread B releases myLock
At this point counter is 2.
Scheduling
Scheduling is another form of synchronization, and you have to use thread synchronization mechanisms like events, semaphores, message passing etc. to start and stop threads. Here is a simplified example in C#:
AutoResetEvent taskEvent = new AutoResetEvent(false);
Task task;

// Called by the main thread.
public void StartTask(Task task) {
    this.task = task;
    // Signal the worker thread to perform the task.
    this.taskEvent.Set();
    // Return and let the task execute on another thread.
}

// Called by the worker thread.
void ThreadProc() {
    while (true) {
        // Wait for the event to become signaled.
        this.taskEvent.WaitOne();
        // Perform the task.
    }
}
You will notice that access to this.task probably isn't synchronized correctly, that the worker thread isn't able to return results back to the main thread, and that there is no way to signal the worker thread to terminate. All this can be corrected in a more elaborate example.
Deadlock
A common example of deadlock is when you have two locks and you are not careful how you acquire them. At one point you acquire lock1 before lock2:
public void f() {
    lock (lock1) {
        lock (lock2) {
            // Do something
        }
    }
}

At another point you acquire lock2 before lock1:

public void g() {
    lock (lock2) {
        lock (lock1) {
            // Do something else
        }
    }
}
Let's see how this might deadlock:
thread A calls f
thread A acquires lock1
context switch
thread B calls g
thread B acquires lock2
thread B tries to acquire lock1 but has to wait
context switch
thread A tries to acquire lock2 but has to wait
context switch
At this point thread A and B are waiting for each other and are deadlocked.
There are two kinds of people that do not use multi threading.
1) Those that do not understand the concept and have no clue how to program it.
2) Those that completely understand the concept and know how difficult it is to get it right.
I'd make a very blatant statement:
DON'T use shared memory.
DO use message passing.
As a general advice, try to limit the amount of shared state and prefer more event-driven architectures.
I can't give you examples besides pointing you at Google. Search for threading basics, thread synchronisation and you'll get more hits than you know.
The basic problem with threading is that threads don't know about each other - so they will happily tread on each other's toes, like two people trying to get through one door: sometimes they will pass through one after the other, but sometimes they will both try to get through at the same time and will get stuck. This is difficult to reproduce, difficult to debug, and sometimes causes problems. If you have threads and see "random" failures, this is probably the problem.
So care needs to be taken with shared resources. If you and your friend want a coffee, but there's only one spoon, you cannot both use it at the same time; one of you will have to wait for the other. The technique used to 'synchronise' this access to the shared spoon is locking. You make sure you get a lock on the shared resource before you use it, and let go of it afterwards. If someone else has the lock, you wait until they release it.
The next problem comes with those locks. Sometimes a program is complex enough that you get a lock, do something else, then access another resource and try to get a lock for it - but some other thread holds that second resource, so you sit and wait... and if that second thread is waiting for the lock you hold on the first resource, it's going to sit and wait too. And your app just sits there. This is called deadlock: two threads both waiting for each other.
Those two are the vast majority of thread issues. The answer is generally to lock for as short a time as possible, and only hold one lock at a time.
I notice you are writing in java and that nobody else mentioned books so Java Concurrency In Practice should be your multi-threaded bible.
-- What are some known thread issues? --
Race conditions.
Deadlocks.
Livelocks.
Thread starvation.
-- What care should be taken while using threads? --
Using multi-threading on a single-processor machine to process multiple tasks where each task takes approximately the same time isn't always very effective. For example, you might decide to spawn ten threads within your program in order to process ten separate tasks. If each task takes approximately 1 minute to process, and you use ten threads to do this processing, you won't have access to any of the task results for the whole 10 minutes. If instead you processed the same tasks using just a single thread, you would see the first result in 1 minute, the next result 1 minute later, and so on. If you can make use of each result without having to rely on all of the results being ready simultaneously, the single thread might be the better way of implementing the program.
If you launch a large number of threads within a process, the overhead of thread housekeeping and context switching can become significant. The processor will spend considerable time in switching between threads, and many of the threads won’t be able to make progress. In addition, a single process with a large number of threads means that threads in other processes will be scheduled less frequently and won’t receive a reasonable share of processor time.
If multiple threads have to share many of the same resources, you’re unlikely to see performance benefits from multi-threading your application. Many developers see multi-threading as some sort of magic wand that gives automatic performance benefits. Unfortunately multi-threading isn’t the magic wand that it’s sometimes perceived to be. If you’re using multi-threading for performance reasons, you should measure your application’s performance very closely in several different situations, rather than just relying on some non-existent magic.
Coordinating thread access to common data can be a big performance killer. Achieving good performance with multiple threads isn’t easy when using a coarse locking plan, because this leads to low concurrency and threads waiting for access. Alternatively, a fine-grained locking strategy increases the complexity and can also slow down performance unless you perform some sophisticated tuning.
Using multiple threads to exploit a machine with multiple processors sounds like a good idea in theory, but in practice you need to be careful. To gain any significant performance benefits, you might need to get to grips with thread balancing.
-- Please provide examples. --
For example, imagine an application that receives incoming price information from the network, aggregates and sorts that information, and then displays the results on the screen for the end user.
With a dual-core machine, it makes sense to split the task into, say, three threads. The first thread deals with storing the incoming price information, the second thread processes the prices, and the final thread handles the display of the results.
After implementing this solution, suppose you find that the price processing is by far the longest stage, so you decide to rewrite that thread’s code to improve its performance by a factor of three. Unfortunately, this performance benefit in a single thread may not be reflected across your whole application. This is because the other two threads may not be able to keep pace with the improved thread. If the user interface thread is unable to keep up with the faster flow of processed information, the other threads now have to wait around for the new bottleneck in the system.
And yes, this example comes directly from my own experience :-)
DONT use global variables
DONT use many locks (at best none at all - though practically impossible)
DONT try to be a hero, implementing sophisticated difficult MT protocols
DO use simple paradigms, i.e. share the processing of an array across n slices of the same size - where n should be equal to the number of processors
DO test your code on different machines (using one, two, many processors)
DO use atomic operations (such as InterlockedIncrement() and the like)
YAGNI
The most important thing to remember is: do you really need multithreading?
I agree with pretty much all the answers so far.
A good coding strategy is to minimise or eliminate the amount of data that is shared between threads as much as humanly possible. You can do this by:
Using thread-static variables (although don't go overboard on this, it will eat more memory per thread, depending on your O/S).
Packaging up all state used by each thread into a class, then guaranteeing that each thread gets exactly one state class instance to itself. Think of this as "roll your own thread-static", but with more control over the process.
Marshalling data by value between threads instead of sharing the same data. Either make your data transfer classes immutable, or guarantee that all cross-thread calls are synchronous, or both.
Try not to have multiple threads competing for the exact same I/O "resource", whether it's a disk file, a database table, a web service call, or whatever. This will cause contention as multiple threads fight over the same resource.
Here's an extremely contrived OTT example. In a real app you would cap the number of threads to reduce scheduling overhead:
All UI - one thread.
Background calcs - one thread.
Logging errors to a disk file - one thread.
Calling a web service - one thread per unique physical host.
Querying the database - one thread per independent group of tables that need updating.
Rather than guessing how to do divvy up the tasks, profile your app and isolate those bits that are (a) very slow, and (b) could be done asynchronously. Those are good candidates for a separate thread.
And here's what you should avoid:
Calcs, database hits, service calls, etc - all in one thread, but spun up multiple times "to improve performance".
Don't start new threads unless you really need to. Starting threads is not cheap and for short running tasks starting the thread may actually take more time than executing the task itself. If you're on .NET take a look at the built in thread pool, which is useful in a lot of (but not all) cases. By reusing the threads the cost of starting threads is reduced.
EDIT: A few notes on creating threads vs. using thread pool (.NET specific)
Generally try to use the thread pool. Exceptions:
Long-running CPU-bound tasks and blocking tasks are not ideal to run on the thread pool, because they will force the pool to create additional threads.
All thread pool threads are background threads, so if you need your thread to be foreground, you have to start it yourself.
If you need a thread with different priority.
If your thread needs more (or less) than the standard 1 MB stack space.
If you need to be able to control the life time of the thread.
If you need different behavior for creating threads than that offered by the thread pool (e.g. the pool will throttle creating of new threads, which may or may not be what you want).
There are probably more exceptions and I am not claiming that this is the definitive answer. It is just what I could think of atm.
I am applying my new found knowledge of threading everywhere
[Emphasis added]
DO remember that a little knowledge is dangerous. Knowing the threading API of your platform is the easy bit. Knowing why and when you need to use synchronisation is the hard part. Reading up on "deadlocks", "race-conditions", "priority inversion" will start you in understanding why.
The details of when to use synchronisation are both simple (shared data needs synchronisation) and complex (atomic data types used in the right way don't need synchronisation, which data is really shared): a lifetime of learning and very solution specific.
An important thing to take care of (with multiple cores and CPUs) is cache coherency.
I am surprised that no one has pointed out Herb Sutter's Effective Concurrency columns yet. In my opinion, this is a must read if you want to go anywhere near threads.
a) Always make only 1 thread responsible for a resource's lifetime. That way thread A won't delete a resource thread B needs - if B has ownership of the resource
b) Expect the unexpected
DO think about how you will test your code and set aside plenty of time for this. Unit tests become more complicated. You may not be able to manually test your code - at least not reliably.
DO think about thread lifetime and how threads will exit. Don't kill threads. Provide a mechanism so that they exit gracefully.
DO add some kind of debug logging to your code - so that you can see that your threads are behaving correctly both in development and in production when things break down.
DO use a good library for handling threading rather than rolling your own solution (if you can). E.g. java.util.concurrency
DON'T assume a shared resource is thread safe.
DON'T DO IT. E.g. use an application container that can take care of threading issues for you. Use messaging.
In .Net one thing that surprised me when I started trying to get into multi-threading is that you cannot straightforwardly update the UI controls from any thread other than the thread that the UI controls were created on.
There is a way around this, which is to use the Control.Invoke method to update the control on the other thread, but it is not 100% obvious the first time around!
Don't be fooled into thinking you understand the difficulties of concurrency until you've wrestled with it in a real project.
All the examples of deadlocks, livelocks, synchronization, etc, seem simple, and they are. But they will mislead you, because the "difficulty" in implementing concurrency that everyone is talking about is when it is used in a real project, where you don't control everything.
While your initial differences in sums of numbers are, as several respondents have pointed out, likely to be the result of a lack of synchronisation, if you get deeper into the topic, be aware that, in general, you will not be able to reproduce exactly the numeric results you get on a serial program with those from a parallel version of the same program. Floating-point arithmetic is not strictly associative or distributive; heck, it's not even closed.
And I'd beg to differ with what, I think, is the majority opinion here. If you are writing multi-threaded programs for a desktop with one or more multi-core CPUs, then you are working on a shared-memory computer and should tackle shared-memory programming. Java has all the features to do this.
Without knowing a lot more about the type of problem you are tackling, I'd hesitate to write that 'you should do this' or 'you should not do that'.
