Design pattern for asynchronous while loop

Design pattern for asynchronous while loop - multithreading

I have a function that boils down to:
while(doWork)
{
config = generateConfigurationForTesting();
result = executeWork(config);
doWork = isDone(result);
}
How can I rewrite this for efficient asynchronous execution, assuming all functions are thread safe, independent of previous iterations, and probably require more iterations than the maximum number of allowable threads ?
The problem here is we don't know how many iterations are required in advance so we can't make a dispatch_group or use dispatch_apply.
This is my first attempt, but it looks a bit ugly to me because of arbitrarily chosen values and sleeping;
int thread_count = 0;
bool doWork = true;
int max_threads = 20; // arbitrarily chosen number
dispatch_queue_t queue =
dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
while(doWork)
{
if(thread_count < max_threads)
{
dispatch_async(queue, ^{ Config myconfig = generateConfigurationForTesting();
Result myresult = executeWork();
dispatch_async(queue, checkResult(myresult)); });
thread_count++;
}
else
usleep(100); // don't consume too much CPU
}
void checkResult(Result value)
{
if(value == good) doWork = false;
thread_count--;
}

Based on your description, it looks like generateConfigurationForTesting is some kind of randomization technique or otherwise a generator which can make a near-infinite number of configuration (hence your comment that you don't know ahead of time how many iterations you will need). With that as an assumption, you are basically stuck with the model that you've created, since your executor needs to be limited by some reasonable assumptions about the queue and you don't want to over-generate, as that would just extend the length of the run after you have succeeded in finding value ==good measurements.
I would suggest you consider using a queue (or OSAtomicIncrement* and OSAtomicDecrement*) to protect access to thread_count and doWork. As it stands, the thread_count increment and decrement will happen in two different queues (main_queue for the main thread and the default queue for the background task) and thus could simultaneously increment and decrement the thread count. This could lead to an undercount (which would cause more threads to be created than you expect) or an overcount (which would cause you to never complete your task).
Another option to making this look a little nicer would be to have checkResult add new elements into the queue if value!=good. This way, you load up the initial elements of the queue using dispatch_apply( 20, queue, ^{ ... }) and you don't need the thread_count at all. The first 20 will be added using dispatch_apply (or an amount that dispatch_apply feels is appropriate for your configuration) and then each time checkResult is called you can either set doWork=false or add another operation to queue.

dispatch_apply() works for this, just pass ncpu as the number of iterations (apply never uses more than ncpu worker threads) and keep each instance of your worker block running for as long as there is more work to do (i.e. loop back to generateConfigurationForTesting() unless !doWork).

Related

using atomic c++11 to implement a thread safe down counter to zero

I'm new to atomic techniques and try to implement a safe thread version for the follow code:
// say m_cnt is unsigned
void Counter::dec_counter()
{
if(0==m_cnt)
return;
--m_cnt;
if(0 == m_cnt)
{
// Do seomthing
}
}
Every thread that calls dec_counter must decrement it by one and "Do something" should be done only one time - at when the counter is decremented to 0.
After fighting with it, I did the follow code that does it well (I think), but I wonder if this is the way to do it, or is there a better way. Thanks.
// m_cnt is std::atomic<unsigned>
void Counter::dec_counter()
{
// loop until decrement done
unsigned uiExpectedValue;
unsigned uiNewValue;
do
{
uiExpectedValue = m_cnt.load();
// if other thread already decremented it to 0, then do nothing.
if (0 == uiExpectedValue)
return;
uiNewValue = uiExpectedValue - 1;
// at the short time from doing
// uiExpectedValue = m_cnt.load();
// it is possible that another thread had decremented m_cnt, and it won't be equal here to uiExpectedValue,
// thus the loop, to be sure we do a decrement
} while (!m_cnt.compare_exchange_weak(uiExpectedValue, uiNewValue));
// if we are here, that means we did decrement . so if it was to 0, then do something
if (0 == uiNewValue)
{
// do something
}
}

The thing with atomic is that only that one statement is atomic.
If you write
std::atomic<int> i {20}
...
if (!--i)
...
Then just 1 thread will enter the if.
However, if you split up the change and the test, then other threads can get into the gap, and you may get strange results:
std::atomic<int> i {20}
...
--i;
// other thread(s) can modify i just here
if (!i)
...
Of course you can split the condition test for the decrement by using a local variable:
std::atomic<int> i {20}
...
int j=--i;
// other thread(s) can modify i just here
if (!j)
...
All the simple math operations are generally efficiently supported for small atomics in c++
For more complex types and expressions, you need to use the read/modify/write member methods.
These allow you to read the current value, calculate the new value, and then call compare_exchange_strong or compare_exchange_weak say "if the value has not changed, then store my new value, otherwise give me the new current value" a a single atomic operation. You can stick this in a loop and keep recalculating the new value until you are lucky enough that your thread is the only writer. If there are not too many threads trying too often to change the value this is reasonably efficient as well.

Is it possible to block on wait on a semaphore without using the data when it's available?

The software I'm working on is a data analyzer with a sliding window. I have 2 threads, one producer and one consumer, that use a circular buffer.
The consumer must process data only if the first element in the buffer is old enough, therefore there are at least X elements in the buffer. But after the processing, only X/4 data can be deleted, because of the moving window.
My solution below works quite well, except that I have a trade-off between being fast (busy form of waiting in the check), or being efficient (sleep for some time). The problem is that the sleep time varies according to load, thread scheduling and elaboration complexity, so I can potentially slow down the performances.
Is there a way to poll a semaphore to check if there are at least X elements, blocking the thread otherwise, but acquiring only X/4 after the processing has been done? The tryAcquire option does not work because when it wakes the thread consumes all the data, and not one half.
I've thought about copyng the elements in a second buffer, but actually there are 7 circular buffers of big data, therefore I'd like to avoid data duplication, or even data moving.
//common structs
QSemaphore written;
QSemaphore free;
int writtenIndex = 0;
int readIndex = 0;
myCircularBuffer buf;
bool scan = true;
//producer
void produceData(data d)
{
while ( free.tryAcquire(1, 1000) == false && scan == true)
{
//avoid deadlock!
//once per second give up waiting and check if closing
}
if (scan == false) return;
buf.at(writtenIndex) = d;
writtenIndex = (writtenIndex+1) % bufferSize;
written.release();
}
//consumer
void consumeData()
{
while(1)
{
//here goes the problem: usleep (slow), sched_yield (B.F.O.W.) or what?
if (buf.at(writtenIndex).age - buf.at(readIndex).age < X)
{
//usleep(100); ? how much time?
//sched_yield(); ?
//tryAcquire not an option!
continue;
}
processTheData();
written.acquire(X/4);
readIndex = (readIndex + X/4) % bufferSize;
free.release(X/4);
}

lock-free bounded MPMC ringbuffer failure

I've been banging my head against (my attempt) at a lock-free multiple producer multiple consumer ring buffer. The basis of the idea is to use the innate overflow of unsigned char and unsigned short types, fix the element buffer to either of those types, and then you have a free loop back to beginning of the ring buffer.
The problem is - my solution doesn't work for multiple producers (it does though work for N consumers, and also single producer single consumer).
#include <atomic>
template<typename Element, typename Index = unsigned char> struct RingBuffer
{
std::atomic<Index> readIndex;
std::atomic<Index> writeIndex;
std::atomic<Index> scratchIndex;
Element elements[1 << (sizeof(Index) * 8)];
RingBuffer() :
readIndex(0),
writeIndex(0),
scratchIndex(0)
{
;
}
bool push(const Element & element)
{
while(true)
{
const Index currentReadIndex = readIndex.load();
Index currentWriteIndex = writeIndex.load();
const Index nextWriteIndex = currentWriteIndex + 1;
if(nextWriteIndex == currentReadIndex)
{
return false;
}
if(scratchIndex.compare_exchange_strong(
currentWriteIndex, nextWriteIndex))
{
elements[currentWriteIndex] = element;
writeIndex = nextWriteIndex;
return true;
}
}
}
bool pop(Element & element)
{
Index currentReadIndex = readIndex.load();
while(true)
{
const Index currentWriteIndex = writeIndex.load();
const Index nextReadIndex = currentReadIndex + 1;
if(currentReadIndex == currentWriteIndex)
{
return false;
}
element = elements[currentReadIndex];
if(readIndex.compare_exchange_strong(
currentReadIndex, nextReadIndex))
{
return true;
}
}
}
};
The main idea for writing was to use a temporary index 'scratchIndex' that acts a pseudo-lock to allow only one producer at any one time to copy-construct into the elements buffer, before updating the writeIndex and allowing any other producer to make progress. Before I am called heathen for implying my approach is 'lock-free' I realise that this approach isn't exactly lock-free, but in practice (if it would work!) it is significantly faster than having a normal mutex!
I am aware of a (more complex) MPMC ringbuffer solution here http://www.1024cores.net/home/lock-free-algorithms/queues/bounded-mpmc-queue, but I am really experimenting with my idea to then compare against that approach and find out where each excels (or indeed whether my approach just flat out fails!).
Things I have tried;
Using compare_exchange_weak
Using more precise std::memory_order's that match the behaviour I want
Adding cacheline pads between the various indices I have
Making elements std::atomic instead of just Element array
I am sure that this boils down to a fundamental segfault in my head as to how to use atomic accesses to get round using mutex's, and I would be entirely grateful to whoever can point out which neurons are drastically misfiring in my head! :)

This is a form of the A-B-A problem. A successful producer looks something like this:
load currentReadIndex
load currentWriteIndex
cmpxchg store scratchIndex = nextWriteIndex
store element
store writeIndex = nextWriteIndex
If a producer stalls for some reason between steps 2 and 3 for long enough, it is possible for the other producers to produce an entire queue's worth of data and wrap back around to the exact same index so that the compare-exchange in step 3 succeeds (because scratchIndex happens to be equal to currentWriteIndex again).
By itself, that isn't a problem. The stalled producer is perfectly within its rights to increment scratchIndex to lock the queue—even if a magical ABA-detecting cmpxchg rejected the store, the producer would simply try again, reload exactly the same currentWriteIndex, and proceed normally.
The actual problem is the nextWriteIndex == currentReadIndex check between steps 2 and 3. The queue is logically empty if currentReadIndex == currentWriteIndex, so this check exists to make sure that no producer gets so far ahead that it overwrites elements that no consumer has popped yet. It appears to be safe to do this check once at the top, because all the consumers should be "trapped" between the observed currentReadIndex and the observed currentWriteIndex.
Except that another producer can come along and bump up the writeIndex, which frees the consumer from its trap. If a producer stalls between steps 2 and 3, when it wakes up the stored value of readIndex could be absolutely anything.
Here's an example, starting with an empty queue, that shows the problem happening:
Producer A runs steps 1 and 2. Both loaded indices are 0. The queue is empty.
Producer B interrupts and produces an element.
Consumer pops an element. Both indices are 1.
Producer B produces 255 more elements. The write index wraps around to 0, the read index is still 1.
Producer A awakens from its slumber. It had previously loaded both read and write indices as 0 (empty queue!), so it attempts step 3. Because the other producer coincidentally paused on index 0, the compare-exchange succeeds, and the store progresses. At completion the producer lets writeIndex = 1, and now both stored indices are 1, and the queue is logically empty. A full queue's worth of elements will now be completely ignored.
(I should mention that the only reason I can get away with talking about "stalling" and "waking up" is that all the atomics used are sequentially consistent, so I can pretend that we're in a single-threaded environment.)
Note that the way that you are using scratchIndex to guard concurrent writes is essentially a lock; whoever successfully completes the cmpxchg gets total write access to the queue until it releases the lock. The simplest way to fix this failure is to just replace scratchIndex with a spinlock—it won't suffer from A-B-A and it's what's actually happening.

bool push(const Element & element)
{
while(true)
{
const Index currentReadIndex = readIndex.load();
Index currentWriteIndex = writeIndex.load();
const Index nextWriteIndex = currentWriteIndex + 1;
if(nextWriteIndex == currentReadIndex)
{
return false;
}
if(scratchIndex.compare_exchange_strong(
currentWriteIndex, nextWriteIndex))
{
elements[currentWriteIndex] = element;
// Problem here!
writeIndex = nextWriteIndex;
return true;
}
}
}
I've marked the problematic spot. Multiple threads can get to the writeIndex = nextWriteIndex at the same time. The data will be written in any order, although each write will be atomic.
This is a problem because you're trying to update two values using the same atomic condition, which is generally not possible. Assuming the rest of your method is fine, one way around this would be to combine both scratchIndex and writeIndex into a single value of double-size. For example, treating two uint32_t values as a single uint64_t value and operating atomically on that.

Multithreading: classical Producer Consumer algorithm

Something I don't get about the classical algorithm for the Producer-Consumer problem (from Wikipedia:)
semaphore mutex = 1
semaphore fillCount = 0
semaphore emptyCount = BUFFER_SIZE
procedure producer() {
while (true) {
item = produceItem()
down(emptyCount)
down(mutex)
putItemIntoBuffer(item)
up(mutex)
up(fillCount)
}
up(fillCount) //the consumer may not finish before the producer.
}
procedure consumer() {
while (true) {
down(fillCount)
down(mutex)
item = removeItemFromBuffer()
up(mutex)
up(emptyCount)
consumeItem(item)
}
}
I note that both producers and consumers lock 'mutex' prior to messing with the buffer, and unlock it thereafter. If that is the case, i.e. only a single thread is accessing the buffer at any given moment, I don't really see how the above algo differs from a simple solution that entails only putting a guarding mutex over the buffer:
semaphore mutex = 1
procedure producer() {
while (true) {
item = produceItem()
flag = true
while (flag) {
down(mutex)
if (bufferNotFull()) {
putItemIntoBuffer(item)
flag = false
}
up(mutex)
}
}
}
procedure consumer() {
while (true) {
flag = true
while (flag) {
down(mutex)
if (bufferNotEmpty()) {
item = removeItemFromBuffer()
flag = false
}
up(mutex)
}
consumeItem(item)
}
}
Only thing I can think of that necessitates using the 'fillCount' and 'emptyCount' semaphores is scheduling.
Maybe the first algo is for making sure that in a state where 5 consumers are waiting on an empty buffer (zero 'fillCount'), it is assured that when a new producer comes along, it will go past its "down(emptyCount)" statement quickly and get the 'mutex' quickly.
(whereas in the other solution the consumers will needlessly get the 'mutex' only to relinquish it until the new producer gets it and inserts an item).
Am I right? Am I missing something?

If there are no messages in the buffer, the consumer will down the mutex, check the buffer, find that it's empty, up the mutex, loop back around and immediately repeat the process. In simple terms, consumers and producers are stuck in busy loops that chew up 100% of a CPU core. This is not just a theoretical problem, either. You may well find that your computer's fan starts to spin every time you run your program.

The hole concept of the producer/consumer pattern is that the shared resource (buffer) is only accessed if certain criteria are met. And to not utilize unnecessary CPU cycles in order to make sure they are met.
Consumer:
Wait if there is nothing to consume (=> empty buffer).
Producer:
Wait if there is not enough space in the buffer.
And for both its very important to note:
Wait != Spin wait

There's more than just the locking of the buffer going on.
The fillCount semaphore in the first program is to pause the consumer(s) when there's nothing left to consume. Without it, you're constantly polling the buffer to see if there's anything to get, and that's quite wasteful.
Likewise, the emptyCount semaphore pauses the producer when the buffer is full.

There is a starving issue if more than one consumer is involved. In the latter example consumer checks for an item in a loop. But by the time he get's to try again some other consumer may "steal" his item. And next time again, and again. That is not nice.

What is a race condition?

When writing multithreaded applications, one of the most common problems experienced is race conditions.
My questions to the community are:
What is the race condition?
How do you detect them?
How do you handle them?
Finally, how do you prevent them from occurring?

A race condition occurs when two or more threads can access shared data and they try to change it at the same time. Because the thread scheduling algorithm can swap between threads at any time, you don't know the order in which the threads will attempt to access the shared data. Therefore, the result of the change in data is dependent on the thread scheduling algorithm, i.e. both threads are "racing" to access/change the data.
Problems often occur when one thread does a "check-then-act" (e.g. "check" if the value is X, then "act" to do something that depends on the value being X) and another thread does something to the value in between the "check" and the "act". E.g:
if (x == 5) // The "Check"
{
y = x * 2; // The "Act"
// If another thread changed x in between "if (x == 5)" and "y = x * 2" above,
// y will not be equal to 10.
}
The point being, y could be 10, or it could be anything, depending on whether another thread changed x in between the check and act. You have no real way of knowing.
In order to prevent race conditions from occurring, you would typically put a lock around the shared data to ensure only one thread can access the data at a time. This would mean something like this:
// Obtain lock for x
if (x == 5)
{
y = x * 2; // Now, nothing can change x until the lock is released.
// Therefore y = 10
}
// release lock for x

A "race condition" exists when multithreaded (or otherwise parallel) code that would access a shared resource could do so in such a way as to cause unexpected results.
Take this example:
for ( int i = 0; i < 10000000; i++ )
{
x = x + 1;
}
If you had 5 threads executing this code at once, the value of x WOULD NOT end up being 50,000,000. It would in fact vary with each run.
This is because, in order for each thread to increment the value of x, they have to do the following: (simplified, obviously)
Retrieve the value of x
Add 1 to this value
Store this value to x
Any thread can be at any step in this process at any time, and they can step on each other when a shared resource is involved. The state of x can be changed by another thread during the time between x is being read and when it is written back.
Let's say a thread retrieves the value of x, but hasn't stored it yet. Another thread can also retrieve the same value of x (because no thread has changed it yet) and then they would both be storing the same value (x+1) back in x!
Example:
Thread 1: reads x, value is 7
Thread 1: add 1 to x, value is now 8
Thread 2: reads x, value is 7
Thread 1: stores 8 in x
Thread 2: adds 1 to x, value is now 8
Thread 2: stores 8 in x
Race conditions can be avoided by employing some sort of locking mechanism before the code that accesses the shared resource:
for ( int i = 0; i < 10000000; i++ )
{
//lock x
x = x + 1;
//unlock x
}
Here, the answer comes out as 50,000,000 every time.
For more on locking, search for: mutex, semaphore, critical section, shared resource.

What is a Race Condition?
You are planning to go to a movie at 5 pm. You inquire about the availability of the tickets at 4 pm. The representative says that they are available. You relax and reach the ticket window 5 minutes before the show. I'm sure you can guess what happens: it's a full house. The problem here was in the duration between the check and the action. You inquired at 4 and acted at 5. In the meantime, someone else grabbed the tickets. That's a race condition - specifically a "check-then-act" scenario of race conditions.
How do you detect them?
Religious code review, multi-threaded unit tests. There is no shortcut. There are few Eclipse plugin emerging on this, but nothing stable yet.
How do you handle and prevent them?
The best thing would be to create side-effect free and stateless functions, use immutables as much as possible. But that is not always possible. So using java.util.concurrent.atomic, concurrent data structures, proper synchronization, and actor based concurrency will help.
The best resource for concurrency is JCIP. You can also get some more details on above explanation here.

There is an important technical difference between race conditions and data races. Most answers seem to make the assumption that these terms are equivalent, but they are not.
A data race occurs when 2 instructions access the same memory location, at least one of these accesses is a write and there is no happens before ordering among these accesses. Now what constitutes a happens before ordering is subject to a lot of debate, but in general ulock-lock pairs on the same lock variable and wait-signal pairs on the same condition variable induce a happens-before order.
A race condition is a semantic error. It is a flaw that occurs in the timing or the ordering of events that leads to erroneous program behavior.
Many race conditions can be (and in fact are) caused by data races, but this is not necessary. As a matter of fact, data races and race conditions are neither the necessary, nor the sufficient condition for one another. This blog post also explains the difference very well, with a simple bank transaction example. Here is another simple example that explains the difference.
Now that we nailed down the terminology, let us try to answer the original question.
Given that race conditions are semantic bugs, there is no general way of detecting them. This is because there is no way of having an automated oracle that can distinguish correct vs. incorrect program behavior in the general case. Race detection is an undecidable problem.
On the other hand, data races have a precise definition that does not necessarily relate to correctness, and therefore one can detect them. There are many flavors of data race detectors (static/dynamic data race detection, lockset-based data race detection, happens-before based data race detection, hybrid data race detection). A state of the art dynamic data race detector is ThreadSanitizer which works very well in practice.
Handling data races in general requires some programming discipline to induce happens-before edges between accesses to shared data (either during development, or once they are detected using the above mentioned tools). this can be done through locks, condition variables, semaphores, etc. However, one can also employ different programming paradigms like message passing (instead of shared memory) that avoid data races by construction.

A sort-of-canonical definition is "when two threads access the same location in memory at the same time, and at least one of the accesses is a write." In the situation the "reader" thread may get the old value or the new value, depending on which thread "wins the race." This is not always a bug—in fact, some really hairy low-level algorithms do this on purpose—but it should generally be avoided. #Steve Gury give's a good example of when it might be a problem.

A race condition is a situation on concurrent programming where two concurrent threads or processes compete for a resource and the resulting final state depends on who gets the resource first.

A race condition is a kind of bug, that happens only with certain temporal conditions.
Example:
Imagine you have two threads, A and B.
In Thread A:
if( object.a != 0 )
object.avg = total / object.a
In Thread B:
object.a = 0
If thread A is preempted just after having check that object.a is not null, B will do a = 0, and when thread A will gain the processor, it will do a "divide by zero".
This bug only happen when thread A is preempted just after the if statement, it's very rare, but it can happen.

Many answers in this discussion explains what a race condition is. I try to provide an explaination why this term is called race condition in software industry.
Why is it called race condition?
Race condition is not only related with software but also related with hardware too. Actually the term was initially coined by the hardware industry.
According to wikipedia:
The term originates with the idea of two signals racing each other to
influence the output first.
Race condition in a logic circuit:
Software industry took this term without modification, which makes it a little bit difficult to understand.
You need to do some replacement to map it to the software world:
"two signals" ==> "two threads"/"two processes"
"influence the output" ==> "influence some shared state"
So race condition in software industry means "two threads"/"two processes" racing each other to "influence some shared state", and the final result of the shared state will depend on some subtle timing difference, which could be caused by some specific thread/process launching order, thread/process scheduling, etc.

Race conditions occur in multi-threaded applications or multi-process systems. A race condition, at its most basic, is anything that makes the assumption that two things not in the same thread or process will happen in a particular order, without taking steps to ensure that they do. This happens commonly when two threads are passing messages by setting and checking member variables of a class both can access. There's almost always a race condition when one thread calls sleep to give another thread time to finish a task (unless that sleep is in a loop, with some checking mechanism).
Tools for preventing race conditions are dependent on the language and OS, but some comon ones are mutexes, critical sections, and signals. Mutexes are good when you want to make sure you're the only one doing something. Signals are good when you want to make sure someone else has finished doing something. Minimizing shared resources can also help prevent unexpected behaviors
Detecting race conditions can be difficult, but there are a couple signs. Code which relies heavily on sleeps is prone to race conditions, so first check for calls to sleep in the affected code. Adding particularly long sleeps can also be used for debugging to try and force a particular order of events. This can be useful for reproducing the behavior, seeing if you can make it disappear by changing the timing of things, and for testing solutions put in place. The sleeps should be removed after debugging.
The signature sign that one has a race condition though, is if there's an issue that only occurs intermittently on some machines. Common bugs would be crashes and deadlocks. With logging, you should be able to find the affected area and work back from there.

Microsoft actually have published a really detailed article on this matter of race conditions and deadlocks. The most summarized abstract from it would be the title paragraph:
A race condition occurs when two threads access a shared variable at
the same time. The first thread reads the variable, and the second
thread reads the same value from the variable. Then the first thread
and second thread perform their operations on the value, and they race
to see which thread can write the value last to the shared variable.
The value of the thread that writes its value last is preserved,
because the thread is writing over the value that the previous thread
wrote.

What is a race condition?
The situation when the process is critically dependent on the sequence or timing of other events.
For example,
Processor A and processor B both needs identical resource for their execution.
How do you detect them?
There are tools to detect race condition automatically:
Lockset-Based Race Checker
Happens-Before Race Detection
Hybrid Race Detection
How do you handle them?
Race condition can be handled by Mutex or Semaphores. They act as a lock allows a process to acquire a resource based on certain requirements to prevent race condition.
How do you prevent them from occurring?
There are various ways to prevent race condition, such as Critical Section Avoidance.
No two processes simultaneously inside their critical regions. (Mutual Exclusion)
No assumptions are made about speeds or the number of CPUs.
No process running outside its critical region which blocks other processes.
No process has to wait forever to enter its critical region. (A waits for B resources, B waits for C resources, C waits for A resources)

You can prevent race condition, if you use "Atomic" classes. The reason is just the thread don't separate operation get and set, example is below:
AtomicInteger ai = new AtomicInteger(2);
ai.getAndAdd(5);
As a result, you will have 7 in link "ai".
Although you did two actions, but the both operation confirm the same thread and no one other thread will interfere to this, that means no race conditions!

I made a video that explains this.
Essentially it is when you have a state with is shared across multiple threads and before the first execution on a given state is completed, another execution starts and the new thread’s initial state for a given operation is wrong because the previous execution has not completed.
Because the initial state of the second execution is wrong, the resulting computation is also wrong. Because eventually the second execution will update the final state with the wrong result.
You can view it here.
https://youtu.be/RWRicNoWKOY

Here is the classical Bank Account Balance example which will help newbies to understand Threads in Java easily w.r.t. race conditions:
public class BankAccount {
/**
* #param args
*/
int accountNumber;
double accountBalance;
public synchronized boolean Deposit(double amount){
double newAccountBalance=0;
if(amount<=0){
return false;
}
else {
newAccountBalance = accountBalance+amount;
accountBalance=newAccountBalance;
return true;
}
}
public synchronized boolean Withdraw(double amount){
double newAccountBalance=0;
if(amount>accountBalance){
return false;
}
else{
newAccountBalance = accountBalance-amount;
accountBalance=newAccountBalance;
return true;
}
}
public static void main(String[] args) {
// TODO Auto-generated method stub
BankAccount b = new BankAccount();
b.accountBalance=2000;
System.out.println(b.Withdraw(3000));
}

Try this basic example for better understanding of race condition:
public class ThreadRaceCondition {
/**
* #param args
* #throws InterruptedException
*/
public static void main(String[] args) throws InterruptedException {
Account myAccount = new Account(22222222);
// Expected deposit: 250
for (int i = 0; i < 50; i++) {
Transaction t = new Transaction(myAccount,
Transaction.TransactionType.DEPOSIT, 5.00);
t.start();
}
// Expected withdrawal: 50
for (int i = 0; i < 50; i++) {
Transaction t = new Transaction(myAccount,
Transaction.TransactionType.WITHDRAW, 1.00);
t.start();
}
// Temporary sleep to ensure all threads are completed. Don't use in
// realworld :-)
Thread.sleep(1000);
// Expected account balance is 200
System.out.println("Final Account Balance: "
+ myAccount.getAccountBalance());
}
}
class Transaction extends Thread {
public static enum TransactionType {
DEPOSIT(1), WITHDRAW(2);
private int value;
private TransactionType(int value) {
this.value = value;
}
public int getValue() {
return value;
}
};
private TransactionType transactionType;
private Account account;
private double amount;
/*
* If transactionType == 1, deposit else if transactionType == 2 withdraw
*/
public Transaction(Account account, TransactionType transactionType,
double amount) {
this.transactionType = transactionType;
this.account = account;
this.amount = amount;
}
public void run() {
switch (this.transactionType) {
case DEPOSIT:
deposit();
printBalance();
break;
case WITHDRAW:
withdraw();
printBalance();
break;
default:
System.out.println("NOT A VALID TRANSACTION");
}
;
}
public void deposit() {
this.account.deposit(this.amount);
}
public void withdraw() {
this.account.withdraw(amount);
}
public void printBalance() {
System.out.println(Thread.currentThread().getName()
+ " : TransactionType: " + this.transactionType + ", Amount: "
+ this.amount);
System.out.println("Account Balance: "
+ this.account.getAccountBalance());
}
}
class Account {
private int accountNumber;
private double accountBalance;
public int getAccountNumber() {
return accountNumber;
}
public double getAccountBalance() {
return accountBalance;
}
public Account(int accountNumber) {
this.accountNumber = accountNumber;
}
// If this method is not synchronized, you will see race condition on
// Remove syncronized keyword to see race condition
public synchronized boolean deposit(double amount) {
if (amount < 0) {
return false;
} else {
accountBalance = accountBalance + amount;
return true;
}
}
// If this method is not synchronized, you will see race condition on
// Remove syncronized keyword to see race condition
public synchronized boolean withdraw(double amount) {
if (amount > accountBalance) {
return false;
} else {
accountBalance = accountBalance - amount;
return true;
}
}
}

You don't always want to discard a race condition. If you have a flag which can be read and written by multiple threads, and this flag is set to 'done' by one thread so that other thread stop processing when flag is set to 'done', you don't want that "race condition" to be eliminated. In fact, this one can be referred to as a benign race condition.
However, using a tool for detection of race condition, it will be spotted as a harmful race condition.
More details on race condition here, http://msdn.microsoft.com/en-us/magazine/cc546569.aspx.

Consider an operation which has to display the count as soon as the count gets incremented. ie., as soon as CounterThread increments the value DisplayThread needs to display the recently updated value.
int i = 0;
Output
CounterThread -> i = 1
DisplayThread -> i = 1
CounterThread -> i = 2
CounterThread -> i = 3
CounterThread -> i = 4
DisplayThread -> i = 4
Here CounterThread gets the lock frequently and updates the value before DisplayThread displays it. Here exists a Race condition. Race Condition can be solved by using Synchronzation

A race condition is an undesirable situation that occurs when two or more process can access and change the shared data at the same time.It occurred because there were conflicting accesses to a resource . Critical section problem may cause race condition. To solve critical condition among the process we have take out only one process at a time which execute the critical section.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string