C++11 non-blocking producer/consumer - multithreading

I have a C++11 application with a high-priority thread that's producing data, and a low-priority thread that's consuming it (in my case, writing it to disk). I'd like to make sure the high-priority producer thread is never blocked, i.e. it uses only lock-free algorithms.
With a lock-free queue, I can push data to the queue from the producer thread, and poll it from the consumer thread, thus meeting my goals above. I'd like to modify my program so that the consumer thread blocks when inactive instead of polling.
It seems like the C++11 condition variable might be useful to block the consumer thread. Can anyone show me an example of how to use it, while avoiding the possibility that the consumer sleeps with data still in the queue? More specifically, I want to make sure that the consumer is always woken up some finite time after the producer pushes the last item into the queue. It's also important that the producer remains non-blocking.

It seems like the C++11 condition variable might be useful to block the consumer thread. Can anyone show me an example of how to use it, while avoiding the possibility that the consumer sleeps with data still in the queue?
To use a condition variable you need a mutex and a condition. In your case the condition will be "there is data available in the queue". Since the producer will be using lock-free updates to produce work, the consumer has to use the same form of synchronisation to consume the work, so the mutex will not actually be used for synchronisation and is only needed by the consumer thread because there's no other way to wait on a condition variable.
// these variables are members or otherwise shared between threads
std::mutex m_mutex;
std::condition_variable m_cv;
lockfree_queue m_data;
// ...

// in producer thread:
while (true)
{
    // add work to queue
    m_data.push(x);
    m_cv.notify_one();
}

// in consumer thread:
while (true)
{
    std::unique_lock<std::mutex> lock(m_mutex);
    m_cv.wait(lock, [&]{ return !m_data.empty(); });
    // remove data from queue and process it
    auto x = m_data.pop();
}
The condition variable will only block in the wait call if the queue is empty before the wait. The condition variable might wake up spuriously, or because it was notified by the producer, but in either case it will only return from the wait call (rather than sleeping again) if the queue is non-empty. That's guaranteed by using the condition_variable::wait overload that takes a predicate, because the condition variable always re-checks the predicate for you.
One caveat: because the producer never holds the mutex, a push and notify_one can slip between the consumer's predicate check and its actual block inside wait, in which case that notification is lost and the consumer sleeps even though data is in the queue. If the producer must remain strictly non-blocking, a common safety net is to have the consumer use a timed wait (wait_for) so it re-checks the queue periodically.
Since the mutex is only used by the consumer thread it could in fact be local to that thread (as long as you only have one consumer; with more than one, they all need to share the same mutex to wait on the same condvar).

One solution I found to this in the past was using Windows events (http://msdn.microsoft.com/en-us/library/windows/desktop/ms682396(v=vs.85).aspx). In this case the event remains signaled until it wakes up a waiting thread, and if no threads are waiting it remains signaled. So the producer simply needs to signal the event after pushing data to the queue. Then we are guaranteed that the consumer will wake up some finite time after this.
I wasn't able to find a way to implement this using the standard library though (at least not without blocking the producer thread).

I think semaphores could be used to solve this problem safely:
// in producer thread:
while (true)
{
    m_data.push(x);
    m_semaphore.release();
}

// in consumer thread:
while (true)
{
    m_semaphore.wait();
    auto x = m_data.pop();
}
Unfortunately C++11 does not include a semaphore (std::counting_semaphore was only added in C++20). I have also not been able to confirm that releasing a semaphore is a non-blocking operation. Implementations built on a mutex and condition variable, such as those in "C++0x has no semaphores? How to synchronize threads?", will certainly not allow for a non-blocking producer thread.


Why does std::condition_variable wait() require a std::unique_lock arg?

My thread does not need to be locked. std::unique_lock locks the mutex on construction. I am simply using cond_var.wait() as a way to avoid busy waiting. I have essentially circumvented the auto-locking by putting the unique_lock inside a tiny scope, so the lock is released as soon as that scope ends. Additionally, there is only a single consumer thread, if that's relevant.
{
    std::unique_lock<std::mutex> dispatch_ul(dispatch_mtx);
    pq_cond.wait(dispatch_ul);
}
Is there a better option that avoids the unnecessary auto-locking of unique_lock? I'm looking for a mutexless way to simply signal the thread. I am aware of std::condition_variable_any, but that also requires a mutex of sorts, which is again unnecessary in my case.
You need a lock to prevent this common newbie mistake:
Producer thread produces something,
Producer thread calls some_condition.notify_all(),
Producer thread goes idle for a while,
meanwhile:
Consumer thread calls some_condition.wait(...)
Consumer thread waits,...
And waits,...
And waits.
A condition variable is not a flag. It does not remember that it was notified. If the producer calls notify_one() or notify_all() before the consumer has entered the wait() call, then the notification is "lost."
In order to prevent lost notifications, there must be some shared data that tells the consumer whether or not it needs to wait, and there must be a lock to protect the shared data.
The producer should:
Lock the lock,
update the shared data,
notify the condition variable,
release the lock
The consumer must then:
Lock the lock,
Check the shared data to see if it needs wait,
Wait if needed,
consume whatever,
release the lock.
The consumer needs to pass the lock in to the wait(...) call so that wait(...) can temporarily unlock it, and then re-lock it before returning. If wait(...) did not unlock the lock, then the producer would never be able to reach the notify() call.

What is a Spinning Thread?

I have stumbled upon the term spinning, referring to a thread while reading this (ROS)
What is the general concept behind spinning a thread?
My intuition would say that a spinning thread is a thread that keeps executing in a multithreading process with a certain frequency, somewhat related to the concept of polling (i.e. keep checking some condition with a certain frequency) but I am not sure at all about it.
Could you give some explanation? The more general the better.
There are a couple of separate concepts here.
In terms of ROS (the link you reference), ros::spin() runs the ROS callback invoker, so that pending events are delivered to your program callbacks via a thread belonging to your program. This sort of call typically does not return; it will wait for new events to be ready, and invoke appropriate callbacks when they occur.
But you also refer to "spinning a thread."
This is a separate topic. It generally relates to a low level programming pattern whereby a thread will repeatedly check for some condition being met without being suspended.
A common way to wait for some condition to be met is to wait on a condition variable. In this example, the thread will be suspended by the kernel until some other thread calls notify on the condition variable. Upon the notify, the kernel will resume the thread, and the condition will evaluate to true, allowing the thread to continue.
std::mutex m;
std::condition_variable cv;
bool ready = false;
std::unique_lock<std::mutex> lk(m);
cv.wait(lk, []{ return ready; }); /* thread suspended */
Alternatively, a spinning approach repeatedly checks some condition without going to sleep. (Caution: this results in high CPU usage, and there are subtle caveats to implementing it correctly.)
Here is an example of a simple spinlock (although note that spinning threads can be used for other purposes than spinlocks). In the below code, notice that the while loop repeatedly calls test_and_set ... which is just an attempt to set the flag to true; that's the spin part.
// spin until true
std::atomic_flag lock = ATOMIC_FLAG_INIT;
while (lock.test_and_set(std::memory_order_acquire)); // acquire lock
/* got the flag .. do work */
lock.clear(std::memory_order_release); // release lock
Spinning is like a while loop without sleeping: your task consumes CPU constantly until the condition is satisfied.

c++11 thread sleep/wakeup without lock?

I use a lock-free queue between two threads. One produces data, another consumes it. What I want is that, when the queue is empty, the consumer thread yields the CPU until the producer pushes data to the queue. I can't just call sleep(), since I don't think there is a way to wake a sleeping thread. What I found is std::condition_variable, but it needs a mutex: the producer thread would need to hold the lock and then notify the consumer for every push. Is there a better and lighter way to achieve my goal?

Reuse of threads pthread

I have a program which constantly gets some work to be done (something like a server) and a few threads. What is the right way to reuse threads from the pthread library? Or am I forced to create a new thread every time? I want to reuse at least the pthread_t structures. I am thinking of something like this:
int main() {
    pthread_t threads[some value];
    while (1) {
        get work;
        find a free thread;
        pthread_create(free thread, do work);
        pthread_join(done threads);
    }
}
But I don't know how to properly free a thread or how to check if it is free.
Just code the thread to do whatever work needs to be done. Don't keep creating and joining threads. The simplest way is to use a thread pool -- a collection of threads and a thread-safe queue of jobs. Each thread in the pool takes a job from the queue, does that job, and then waits for another job.
With POSIX threads, usually a mutex is used to protect the queue and a condition variable to allow threads to wait for work. You may want a boolean variable to track whether the program is shutting down.
In pseudo-code, each thread does this:
Acquire the mutex.
Check if the program is shutting down, if so, release the mutex and terminate.
Check if the queue is empty. If so, block on the condition variable and go to step 2.
Take the top job from the queue.
Release the mutex.
Do the job.
Go to step 1.
To ask a thread to do a job, do this:
Allocate a new work object.
Fill it in with the work to be done (which can be a pointer to a function and a parameter for that function).
Acquire the mutex.
Add the job to the queue.
Signal the condition variable.
Release the mutex.
To shut down:
Acquire the mutex.
Set the shutting down boolean to true.
Broadcast the condition variable.
Release the mutex.
Join all threads.

Advantages of using condition variables over mutex

I was wondering what is the performance benefit of using condition variables over mutex locks in pthreads.
What I found is : "Without condition variables, the programmer would need to have threads continually polling (possibly in a critical section), to check if the condition is met. This can be very resource consuming since the thread would be continuously busy in this activity. A condition variable is a way to achieve the same goal without polling." (https://computing.llnl.gov/tutorials/pthreads)
But it also seems that mutex calls are blocking (unlike spin-locks). Hence if a thread (T1) fails to get a lock because some other thread (T2) has the lock, T1 is put to sleep by the OS, and is woken up only when T2 releases the lock and the OS gives T1 the lock. The thread T1 does not really poll to get the lock. From this description, it seems that there is no performance benefit of using condition variables. In either case, there is no polling involved. The OS anyway provides the benefit that the condition-variable paradigm can provide.
Can you please explain what actually happens.
A condition variable allows a thread to be signaled when something of interest to that thread occurs.
By itself, a mutex doesn't do this.
If you just need mutual exclusion, then condition variables don't do anything for you. However, if you need to know when something happens, then condition variables can help.
For example, if you have a queue of items to work on, you'll have a mutex to ensure the queue's internals are consistent when accessed by the various producer and consumer threads. However, when the queue is empty, how will a consumer thread know when something is in there for it to work on? Without something like a condition variable it would need to poll the queue, taking and releasing the mutex on each poll (otherwise a producer thread could never put something on the queue).
Using a condition variable lets the consumer, on finding the queue empty, simply wait on the condition variable until the queue has had something put into it. No polling: that thread does nothing until a producer puts something in the queue and then signals the condition that the queue has a new item.
You're looking for too much overlap in two separate but related things: a mutex and a condition variable.
A common implementation approach for a mutex is to use a flag and a queue. The flag indicates whether the mutex is held by anyone (a single-count semaphore would work too), and the queue tracks which threads are in line waiting to acquire the mutex exclusively.
A condition variable is then implemented as another queue bolted onto that mutex. Threads that got in line to wait to acquire the mutex can—usually once they have acquired it—volunteer to get out of the front of the line and get into the condition queue instead. At this point, you have two separate sets of waiters:
Those waiting to acquire the mutex exclusively
Those waiting for the condition variable to be signaled
When a thread holding the mutex exclusively signals the condition variable, for which we'll assume for now that it's a singular signal (unleashing no more than one waiting thread) and not a broadcast (unleashing all the waiting threads), the first thread in the condition variable queue gets shunted back over into the front (usually) of the mutex queue. Once the thread currently holding the mutex—usually the thread that signaled the condition variable—relinquishes the mutex, the next thread in the mutex queue can acquire it. That next thread in line will have been the one that was at the head of the condition variable queue.
There are many complicated details that come into play, but this sketch should give you a feel for the structures and operations in play.
If you are looking for performance, then start reading about "non-blocking / non-locking" thread synchronization algorithms. They are based upon atomic operations, which gcc is kind enough to provide; look up gcc atomic operations. Our tests showed we could increment a global value with multiple threads using atomic operations orders of magnitude faster than locking with a mutex. The same techniques allow items to be added to and removed from a linked list by multiple threads at the same time without locking.
For sleeping and waking threads, signals are much faster than condition variables. You use pthread_kill to send the signal, and sigwait to make the thread sleep. We tested this too, with the same kind of performance benefits.
