What is the difference between a dead lock and a race around condition in programming terms?
Think of a race condition using the traditional example. Say you and a friend each have an ATM card for the same bank account. Now suppose the account has $100 in it. Consider what happens when you attempt to withdraw $10 and your friend attempts to withdraw $50 at exactly the same time.
Think about what has to happen. The ATM must take your input, read what is currently in your account, and then write back the modified amount. Note that, in programming terms, even a single assignment statement is a multi-step process.
So, label both of your transactions T1 (you withdraw $10), and T2 (your friend withdraws $50). Now, the numbers below, to the left, represent time steps.
T1 (you, -$10)            T2 (your friend, -$50)
------------------------  ------------------------
1. Read Acct ($100)
                          2. Read Acct ($100)
3. Write New Amt ($90)
                          4. Write New Amt ($50)
5. End
                          6. End
After both transactions complete using this timeline, which is possible if you don't use any sort of locking mechanism, the account has $50 in it. That is $10 more than it should have (your transaction is lost forever, but you still have the money).
This is called a race condition. What you want is for the transactions to be serializable: no matter how you interleave the individual instruction executions, the end result is identical to some serial schedule of the same transactions (that is, as if you ran them one after the other with no interleaving). The solution, again, is to introduce locking.
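To make that concrete, here is a minimal sketch of the locking fix in C++ (the names balance, withdraw, and the use of std::mutex are illustrative assumptions, not part of the original ATM system):

#include <mutex>
#include <thread>

int balance = 100;
std::mutex balance_lock;  // serializes access to the account

void withdraw(int amount) {
    // Holding the lock across the whole read-modify-write makes the two
    // transactions serializable: one must fully finish before the other starts.
    std::lock_guard<std::mutex> guard(balance_lock);
    balance = balance - amount;
}

int main() {
    std::thread t1(withdraw, 10);  // you
    std::thread t2(withdraw, 50);  // your friend
    t1.join();
    t2.join();
    // balance is now guaranteed to be 40, in either execution order.
}

Correct locking makes the schedule serializable; incorrect locking, however, can lead to deadlock.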
Deadlock occurs when there is a conflict over shared resources. It's sort of like a Catch-22.
T1                          T2
--------------------------  --------------------------
1. Lock(x)
                            2. Lock(y)
3. Write x=1
                            4. Write y=19
5. Lock(y)
6. Write y=x+1
                            7. Lock(x)
                            8. Write x=y+2
9. Unlock(x)
                            10. Unlock(x)
11. Unlock(y)
                            12. Unlock(y)
You can see that a deadlock occurs at time 7: T2 tries to acquire the lock on x, but T1 already holds the lock on x while waiting on the lock for y, which T2 holds.
This is bad. You can turn this diagram into a dependency graph and you will see that there is a cycle. The problem here is that x and y are resources that may be modified together.
One way to prevent this sort of deadlock with multiple lock objects (resources) is to introduce an ordering. In the previous example, T1 locked x and then y, but T2 locked y and then x. If both transactions adhered to an ordering rule that says "x shall always be locked before y", this problem would not occur. (You can rework the previous example with this rule in mind and see that no deadlock occurs.)
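As a sketch of that ordering rule in C++ (assuming std::mutex stands in for the database locks; the function names are invented):

#include <mutex>

int x = 0, y = 0;
std::mutex lock_x, lock_y;

// Both transactions follow the rule "x shall always be locked before y",
// so the circular wait from the timeline above can never form.
void t1() {
    std::lock_guard<std::mutex> gx(lock_x);
    std::lock_guard<std::mutex> gy(lock_y);
    x = 1;
    y = x + 1;
}

void t2() {
    std::lock_guard<std::mutex> gx(lock_x);  // x first, even though t2 writes y first
    std::lock_guard<std::mutex> gy(lock_y);
    y = 19;
    x = y + 2;
}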
These are trivial examples; really, I've just used the examples you may have already seen if you have taken any kind of undergrad course on this. In reality, solving deadlock problems can be much harder than this, because you tend to have more than a couple of resources and a couple of transactions interacting.
As always, use Wikipedia as a starting point for CS concepts:
http://en.wikipedia.org/wiki/Deadlock
http://en.wikipedia.org/wiki/Race_condition
A deadlock is when two (or more) threads are blocking each other. Usually this has something to do with threads trying to acquire shared resources. For example, suppose threads T1 and T2 both need to acquire resources A and B in order to do their work. If T1 acquires resource A and then T2 acquires resource B, T1 could be waiting for resource B while T2 is waiting for resource A. In this case, both threads will wait indefinitely for the resource held by the other thread. These threads are said to be deadlocked.
Race conditions occur when two threads interact in a negative (buggy) way that depends on the exact order in which their instructions are executed. If one thread sets a global variable, for example, then a second thread reads and modifies that global variable, and then the first thread reads the variable again, the first thread may experience a bug because the variable has changed unexpectedly.
Deadlock:
This happens when two or more threads wait on each other to release a resource, for an infinite amount of time.
Here the threads are in a blocked state and not executing.
Race/Race Condition:
This happens when two or more threads run in parallel but end up producing a result that is wrong, i.e., not equivalent to the result of performing all the operations in some sequential order.
Here all the threads run and execute their operations.
In coding we need to avoid both race and deadlock conditions.
I assume you mean "race conditions" and not "race around conditions" (I've heard that term...)
Basically, a deadlock is a condition where thread A is waiting for resource X while holding a lock on resource Y, and thread B is waiting for resource Y while holding a lock on resource X. The threads block, waiting for each other to release their locks.
The solution to this problem is (usually) to ensure that you take locks on all resources in the same order in all threads. For example, if you always lock resource X before resource Y then my example can never result in a deadlock.
A race condition is something where you're relying on a particular sequence of events happening in a certain order, but that can be messed up if another thread is running at the same time. For example, to insert a new node into a linked list, you need to modify the list head, usually something like so:
newNode->next = listHead;
listHead = newNode;
But if two threads do that at the same time, then you might have a situation where they run like so:
Thread A                     Thread B
---------------------------  ---------------------------
newNode1->next = listHead
                             newNode2->next = listHead
                             listHead = newNode2
listHead = newNode1
If this were to happen, then Thread B's modification of the list will be lost because Thread A would have overwritten it. It can be even worse, depending on the exact situation, but that's the basics of it.
The solution to this problem is usually to ensure that you include the proper locking mechanisms (for example, take out a lock any time you want to modify the linked list so that only one thread is modifying it at a time).
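A hedged sketch of that locking in C++ (the Node type and the insertFront name are made up for illustration):

#include <mutex>

struct Node {
    int value;
    Node* next;
};

Node* listHead = nullptr;
std::mutex listLock;  // guards every modification of the list

void insertFront(Node* newNode) {
    std::lock_guard<std::mutex> guard(listLock);
    newNode->next = listHead;  // both steps happen under the lock,
    listHead = newNode;        // so no other thread can interleave here
}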
With respect to programming languages: if shared resources are accessed by multiple threads without locking, that is a race condition. If, on the other hand, you do lock the resources but the sequence of access to them is not defined properly, threads may end up waiting forever for resources held by one another; that is a case of deadlock.
Related
I have been confused by this question:
I have this C++ function:
void withdraw(int x) {
    balance = balance - x;
}
balance is a global integer variable, which equals 100 at the start.
We run the above function on two different threads: thread A and thread B. Thread A runs withdraw(50) and thread B runs withdraw(30).
Assuming we don't protect balance, what is the final result of balance after running those threads in following sequences?
A1->A2->A3->B1->B2->B3
B1->B2->B3->A1->A2->A3
A1->A2->B1->B2->B3->A3
B1->B2->A1->A2->A3->B3
Explanation:
A1 means the OS executes the first line of the withdraw function in thread A, A2 means the OS executes the second line of withdraw in thread A, B3 means the OS executes the third line of withdraw in thread B, and so on.
The sequence is how the OS schedules threads A and B, presumably.
My answer is
20
20
50 (before the context switch, the OS saves balance; after the context switch, the OS restores balance to 50)
70 (Similar to above)
But my friend disagrees; he said that balance is a global variable, so it is not saved on the stack and is therefore not affected by context switching. He claimed that all 4 sequences result in 20.
So who is right? I can't find fault in his logic.
(We assume we have one processor that can only execute one thread at a time)
Consider this line:
balance = balance - x;
Thread A reads balance. It is 100. Now, thread A subtracts 50 and ... oops
Thread B reads balance. It is 100. Now, thread B subtracts 30 and updates the variable, which is now 70.
...thread A continues and updates the variable, which is now 50. You've just lost the work of Thread B.
Threads don't execute "lines of code" -- they execute machine instructions. It does not matter if a global variable is affected by context switching. What matters is when the variable is read, and when it is written, by each thread, because the value is "taken off the shelf" and modified, then "put back". Once the first thread has read the global variable and is working with the value "somewhere in space", the second thread must not read the global variable until the first thread has written the updated value.
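In other words, the single C++ line behaves roughly like this three-step expansion, which is presumably what the question's A1/A2/A3 refer to (a sketch; the actual instructions depend on compiler and architecture):

int balance = 100;   // the global from the question

void withdraw(int x) {
    int temp;        // stands in for a CPU register, private to each thread
    temp = balance;  // A1: take the value "off the shelf"
    temp = temp - x; // A2: modify it "somewhere in space"
    balance = temp;  // A3: put it back
}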
Unless the threading standard you are using specifies the behavior, there's no way to know. Most typical threading standards don't, so typically there's no way to know.
Your answer sounds like nonsense, though. The OS has no idea what balance is, nor any way to do anything to it around a context switch. Also, threads can run at the same time without context switches.
Your friend's answer also sounds like nonsense. How does he know that it won't be cached in a register by the compiler and thus some of the modifications will stomp on previous ones?
But the point is, both of you are just guessing about what might happen. If you want to answer this usefully, you have to talk about what is guaranteed to happen.
Clearly homework, but saved by doing actual work before asking.
First, forget about context switching. Context switching is totally irrelevant to the problem. Assume that you have multiple processors, each executing one thread, and each progressing at an unknown speed, stopping and starting at unpredictable times. Even better, assume that this stopping and starting is controlled by an enemy, who will try to break your program.
And because context switching is irrelevant, the OS will not save or restore anything. It won't touch the variable balance. Only your two threads will.
Your friend is absolutely, totally wrong. It's quite the opposite. Since balance is a global variable, both threads can read and write it. But the problem is not only that they might read and write it in unknown order, as you examined; it is worse. They could access it at the same time, and if one thread modifies data while another reads it, you have a race condition and anything at all could happen. Not only could you get any result, your program could also crash.
If balance were a local variable saved on the stack, then both threads would each have their own variable, and nothing bad would happen.
Simple and short answer for C++: unsynchronized access to a shared variable is undefined behavior, so anything can happen. The value can e.g. be 100, 70, 50, 20, 42 or -458995. The program could crash or not. And in theory it's even allowed to order pizza.
The actual machine code that is executed is usually far away from what your program looks like, and in the case of undefined behavior you are no longer guaranteed that the actual behavior has anything to do with the C++ code you have written.
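If you want a defined result in C++11 or later, the access has to be synchronized; one minimal sketch uses std::atomic (an assumed fix for illustration, not something the question prescribes):

#include <atomic>

std::atomic<int> balance{100};

void withdraw(int x) {
    balance.fetch_sub(x);  // an atomic read-modify-write: no data race, no UB
}

// With withdraw(50) and withdraw(30) run from two threads, the final
// value is guaranteed to be 20, whatever the interleaving.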
Start with x = 0. Note there are no memory barriers in any of the code below.
volatile int x = 0;
Thread 1:
while (x == 0) {}
print "Saw non-zer0"
while (x != 0) {}
print "Saw zero again!"
Thread 2:
x = 1
Is it ever possible to see the second message, "Saw zero again!", on any (real) CPU? What about on x86_64?
Similarly, in this code:
volatile int x = 0;
Thread 1:
while (x == 0) {}
x = 2
Thread 2:
x = 1
Is the final value of x guaranteed to be 2? Or could the CPU caches update main memory in some arbitrary order, so that x = 1 gets into one CPU's cache where thread 1 can see it, thread 1 then gets moved to a different CPU where it writes x = 2 into that CPU's cache, and x = 2 gets written back to main memory before x = 1?
Yes, it's entirely possible. The compiler could, for example, have just written x to memory but still have the value in a register. One while loop could check memory while the other checks the register.
It doesn't happen due to CPU caches because cache coherency hardware logic makes the caches invisible on all CPUs you are likely to actually use.
Theoretically, the write race you talk about could happen due to posted write buffering and read prefetching. Miraculous tricks were used to make this impossible on x86 CPUs to avoid breaking legacy code. But you shouldn't expect future processors to do this.
Leaving aside for a second tricks done by the compiler (even ones allowed by language standards), I believe you're asking how the micro-architecture could behave in such scenario. Keep in mind that the code would most likely expand into a busy wait loop of cmp [x] + jz or something similar, which hides a load inside it. This means that [x] is likely to live in the cache of the core running thread 1.
At some point, thread 2 would come and perform the store. If it resides on a different core, the line would first be invalidated completely from the first core. If these are 2 threads running on the same physical core - the store would immediately affect all chronologically younger loads.
Now, the most likely thing to happen on a modern out-of-order machine is that all the loads in the pipeline at this point belong to different iterations of the same first loop (since a branch predictor facing so many repetitive "taken" resolutions is likely to assume the branch will keep being taken until proven wrong). So the first load to encounter the new value modified by the other thread will cause the matching branch to simply flush the entire pipe of all younger operations, without the second loop ever having a chance to execute.
However, it's possible that for some reason you did get to the second loop (say the predictor issues a not-taken prediction just at the right moment, when the loop condition check sees the new value). In that case, the question boils down to this scenario:
Time -->
----------------------------------------------------------------
thread 1:
    cmp [x],0      execute
    je ...         execute (not taken)
    ...
    cmp [x],0      execute
    jne ...        execute (not taken)
    Can_We_Get_Here:
    ...
thread 2:
    store [x],1    execute
In other words, given that most modern CPUs may execute instructions out of order, can a younger load be evaluated before an older load to the same address, allowing the store (from another thread) to change the value in between so that it is observed inconsistently by the two loads?
My guess is that the above timeline is quite possible given the nature of out-of-order execution engines today, as they simply arbitrate and perform whatever operation is ready. However, on most x86 implementations there are safeguards to protect against such a scenario, since the memory ordering rules strictly say:
8.2.3.2 Neither Loads Nor Stores Are Reordered with Like Operations
Such mechanisms may detect this scenario and flush the machine to prevent the stale/wrong values from becoming visible. So the answer is: no, it should not be possible, unless of course the software or the compiler changes the nature of the code to prevent the hardware from noticing the relation. Then again, memory ordering rules are sometimes flaky, and I'm not sure all x86 manufacturers adhere to the exact same wording, but this is a pretty fundamental example of consistency, so I'd be very surprised if one of them missed it.
The answer seems to be: "this is exactly the job of CPU cache coherency." x86 processors implement the MESI protocol, which guarantees that a load can't see the new value and then, later, the old one.
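Note that in portable C++ the safe way to write this test is with std::atomic rather than volatile; a sketch (sequentially consistent by default; the function names are invented):

#include <atomic>

std::atomic<int> x{0};  // instead of "volatile int x = 0;"

void thread1() {
    while (x.load() == 0) {}  // each load is a real, ordered memory operation
    // "Saw non-zero"
    while (x.load() != 0) {}  // spins forever if nothing ever writes 0 back
    // "Saw zero again!" is unreachable when the only write is x = 1
}

void thread2() {
    x.store(1);
}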
Problem
Summary: apply a function F in parallel to each element of an array, where F is NOT thread-safe.
I have a set of elements E to process, let's say a queue of them.
I want to process all these elements in parallel using the same function f(E).
Now, ideally I could use a map-based parallel pattern, but the problem has the following constraints:
Each element contains a pair of two objects (E = (A, B)).
Two elements may share an object (E1 = (A1, B1); E2 = (A1, B2)).
The function f cannot process two elements that share an object, so E1 and E2 cannot be processed in parallel.
What is the right way of doing this?
My thoughts are as follows:
1. Trivial thought: keep a set of active As and Bs, and start processing an element only when no other thread is already using its A OR its B.
So, when you give an element to a thread, you add its A and B to the active set.
Pick the first element; if its objects are not in the active set, spawn a new thread, otherwise push it to the back of the queue of elements.
Do this till the queue is empty.
Will this cause a deadlock? Ideally, when the processing of an element is over, its objects become available again, right?
2. The other thought is to make a graph of these connected objects.
Each node represents an object (A / B). Each element is an edge connecting an A and a B; then somehow process the data such that we know the elements never overlap.
Questions
How can we achieve this best?
Is there a standard pattern to do this ?
Is there a problem with these approaches?
Not necessary, but if you could point out which TBB methods to use, that would be great.
The "best" approach depends on a lot of factors here:
How many elements "E" do you have and how much work is needed for f(E). --> Check if it's really worth it to work the elements in parallel (if you need a lot of locking and don't have much work to do, you'll probably slow down the process by working in parallel)
Is there any possibility of changing the design so that f(E) becomes thread-safe?
How many elements "A" and "B" are there? Is there any logic to which elements "E" share specific versions of A and B? --> If you can sort the elements E into separate lists where each A and B only appears in a single list, then you can process these lists parallel without any further locking.
If there are many different A's and B's and you don't share too many of them, you may want to take a trivial approach where you just lock each "A" and "B" when entering and wait until you get the lock.
Whenever you do "lock and wait" with multiple locks it's very important that you always take the locks in the same order (e.g. always A first and B second) because otherwise you may run into deadlocks. This locking order needs to be observed everywhere (a single place in the whole application that uses a different order can cause a deadlock)
Edit: Also, if you do "try lock" you need to ensure that the order is always the same. Otherwise you can cause a livelock:
thread 1 locks A
thread 2 locks B
thread 1 tries to lock B and fails
thread 2 tries to lock A and fails
thread 1 releases lock A
thread 2 releases lock B
Goto 1 and repeat...
The chances that this actually repeats endlessly are relatively slim, but it should be avoided anyway.
Edit 2: In principle I guess I'd just split E(Ax, Bx) into different lists based on Ax (e.g. one list for all E's that share the same A), then process these lists in parallel with locking of "B" (there you can still "TryLock" and continue if the required B is already in use).
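A rough sketch of that "split by A, lock B" idea in plain C++ with std::thread (all type and function names here are invented for illustration; TBB's parallel constructs could replace the manual threads):

#include <map>
#include <mutex>
#include <thread>
#include <vector>

struct A {};                       // stand-ins for the question's two object types
struct B {};
struct Element { A* a; B* b; };

void f(Element&) { /* the non-thread-safe work */ }

// One mutex per distinct B; the map is filled before any worker starts,
// so workers only ever read it.
std::map<B*, std::mutex> b_locks;

// Each bucket holds all elements sharing one A, so A is never contended;
// B is protected by its own mutex.
void worker(std::vector<Element>& bucket) {
    for (Element& e : bucket) {
        std::lock_guard<std::mutex> guard(b_locks.at(e.b));
        f(e);
    }
}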
I have the following code:
pthread_mutex_t lock_row[M], lock_culm[M];

FUNCTION SIGNATURE (..., int i, int j, ...) {
    pthread_mutex_lock(&lock_row[i]);
    pthread_mutex_lock(&lock_culm[j]);
    ...CRITICAL CODE...
    pthread_mutex_unlock(&lock_row[j]);
    pthread_mutex_unlock(&lock_row[i]);
}
Can I get a deadlock between the first lock and the second? Say we have a context switch after the first lock, and another thread tries to lock something. I don't really get this; I would like to understand it a little better.
Besides the probable typo where you unlock the same mutex twice, this example will never deadlock. Context switches between the two lock calls pose no threat to the mechanism involved here. Think of it as gaining a higher level of allowance. With each lock gained, this process or thread is allowed to do more. Each lock is a gate which might hold the process up until no other lock-holder prevents entering the higher level. Whatever happens between the two lockings does not matter, as long as it does not change that level of allowance.
pthread_mutex_lock(&lock_row[i]);
pthread_mutex_lock(&lock_culm[j]);
This is fine as long as all of your code takes these locks in this order - the lock_row lock first, then the lock_culm lock second. If another part of the code takes these same locks in the opposite order, then it can deadlock.
For this reason it is usual in complex programs to define the locking order - a global ordering of all the locks in the program, defining the order in which they should be taken.
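In C++17 you can also sidestep the manual ordering entirely: std::scoped_lock takes several mutexes at once and applies a deadlock-avoidance algorithm internally. A sketch adapted to the row/column locks above (assuming std::mutex in place of the pthread mutexes, and an illustrative M):

#include <mutex>

constexpr int M = 16;                 // illustrative size
std::mutex lock_row[M], lock_culm[M];

void update(int i, int j) {
    // Acquires both locks as if atomically; no thread can end up holding
    // one of them while waiting forever on the other.
    std::scoped_lock guard(lock_row[i], lock_culm[j]);
    // ...critical code...
}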
I understand race conditions and how, with multiple threads accessing the same variable, updates made by one can be ignored and overwritten by others. But what if each thread is writing the same value (not different values) to the same variable; can even this cause problems? Could this code:
GlobalVar.property = 11;
(assuming that property will never be assigned anything other than 11), cause problems if multiple threads execute it at the same time?
The problem comes when you read that state back and do something about it. Writing is a red herring: it is true that as long as this is a single word, most environments guarantee the write will be atomic, but that doesn't mean that a larger piece of code that includes this fragment is thread-safe. First, presumably your global variable contained a different value to begin with; otherwise, if you know it's always the same, why is it a variable? Second, presumably you eventually read this value back again?
The issue is that presumably, you are writing to this bit of shared state for a reason - to signal that something has occurred? This is where it falls down: when you have no locking constructs, there is no implied order of memory accesses at all. It's hard to point to what's wrong here because your example doesn't actually contain the use of the variable, so here's a trivialish example in neutral C-like syntax:
int x = 0, y = 0;

// thread A does:
x = 1;
y = 2;
if (y == 2)
    print(x);

// thread B does, at the same time:
if (y == 2)
    print(x);
Thread A will always print 1, but it's completely valid for thread B to print 0. The order of operations in thread A is only required to be observable from code executing in thread A - thread B is allowed to see any combination of the state. The writes to x and y may not actually happen in order.
This can happen even on single-processor systems, where most people do not expect this kind of reordering - your compiler may reorder it for you. On SMP even if the compiler doesn't reorder things, the memory writes may be reordered between the caches of the separate processors.
If that doesn't seem to answer it for you, include more detail of your example in the question. Without the use of the variable it's impossible to definitively say whether such a usage is safe or not.
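For what it's worth, here's one way to make thread B's view well-defined in C++, using a release store paired with an acquire load (a sketch of one possible fix, not the only one):

#include <atomic>
#include <cstdio>

int x = 0;
std::atomic<int> y{0};

void threadA() {
    x = 1;                                   // ordinary write
    y.store(2, std::memory_order_release);   // "publish": the write to x may not sink below this
}

void threadB() {
    if (y.load(std::memory_order_acquire) == 2)  // if this sees 2...
        std::printf("%d\n", x);                  // ...this is guaranteed to print 1
}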
It depends on the work actually done by that statement. There can still be some cases where Something Bad happens - for example, if a C++ class has overloaded the = operator, and does anything nontrivial within that statement.
I have accidentally written code that did something like this with POD types (builtin primitive types), and it worked fine -- however, it's definitely not good practice, and I'm not confident that it's dependable.
Why not just lock the memory around this variable when you use it? In fact, if you somehow "know" this is the only write statement that can occur at some point in your code, why not just use the value 11 directly, instead of writing it to a shared variable?
(edit: I guess it's better to use a constant name instead of the magic number 11 directly in the code, btw.)
If you're using this to figure out when at least one thread has reached this statement, you could use a semaphore that starts at 1, and is decremented by the first thread that hits it.
I would expect the result to be undetermined. It would vary from compiler to compiler, language to language, OS to OS, etc. So no, it is not safe.
Why would you want to do this, though? Adding a line to obtain a mutex lock is only one or two lines of code (in most languages) and would remove any possibility of a problem. If this is going to be too expensive, then you need to find an alternate way of solving the problem.
In general, this is not considered a safe thing to do unless your system provides atomic operations (operations guaranteed to execute as a single, indivisible step).
The reason is that while the "C" statement looks simple, often there are a number of underlying assembly operations taking place.
Depending on your OS, there are a few things you could do:
Take a mutual exclusion semaphore (mutex) to protect access.
In some OSes, you can temporarily disable preemption, which guarantees your thread will not be swapped out.
Some OSes provide reader/writer semaphores, which are more performant than a plain old mutex.
Here's my take on the question.
You have two or more threads running that write to a variable, like a status flag, where you only want to know whether one or more of them set it to true. Then in another part of the code (after the threads complete) you want to check whether at least one thread set that status. For example:
bool flag = false
threadContainer tc
threadInputs inputs

check(input)
{
    ...do stuff to input
    if(success)
        flag = true
}

// start multiple threads
foreach(i in inputs)
    t = startthread(check, i)
    tc.add(t)    // Keep track of all the threads started

foreach(t in tc)
    t.join( )    // Wait until each thread is done

if(flag)
    print "One of the threads were successful"
else
    print "None of the threads were successful"
I believe the above code would be OK, assuming you're fine with not knowing which thread set the status to true, and you can wait for all the multi-threaded stuff to finish before reading that flag. I could be wrong though.
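In C++ the same pattern would need std::atomic<bool>, because even same-value plain writes from several threads are formally a data race. A sketch (check's "work" is a placeholder for the real per-input processing):

#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

std::atomic<bool> flag{false};

void check(int input) {
    bool success = (input % 2 == 0);  // placeholder for "...do stuff to input"
    if (success)
        flag.store(true);             // several threads may store true; that's fine
}

int main() {
    std::vector<std::thread> tc;
    for (int i : {1, 2, 3, 4})
        tc.emplace_back(check, i);
    for (std::thread& t : tc)
        t.join();                     // all writers are done before the read below
    std::puts(flag.load() ? "One of the threads was successful"
                          : "None of the threads were successful");
}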
If the operation is atomic, you should be able to get by just fine. But I wouldn't do that in practice. It is better just to acquire a lock on the object and write the value.
Assuming that property will never be assigned anything other than 11, I don't see a reason for the assignment in the first place. Just make it a constant.
Assignment only makes sense when you intend to change the value, unless the act of assignment itself has other side effects, the way volatile writes have memory-visibility side effects in Java. And if you change state shared between multiple threads, then you need to synchronize or otherwise "handle" the problem of concurrency.
When you assign a value, without proper synchronization, to some state shared between multiple threads, there are no guarantees about when the other threads will see that change. And no visibility guarantees means it is possible that the other threads will never see the assignment.
Compilers, JITs, and CPU caches are all trying to make your code run as fast as possible, and if you don't state any explicit requirements for memory visibility, they will take advantage of that. If not on your machine, then on somebody else's.