C++11 - Managing worker threads - multithreading

I am new to threading in C++11 and I am wondering how to manage worker threads (using the standard library) to perform some task and then die off. I have a pool of threads vector<thread *> thread_pool that maintains a list of active threads.
Let's say I launch a new thread and add it to the pool using thread_pool.push_back(new thread(worker_task)), where worker_task is defined as follows:
void worker_task()
{
this_thread::sleep_for(chrono::milliseconds(1000));
cout << "Hello, world!\n";
}
Once the worker thread has terminated, what is the best way to reliably remove the thread from the pool? The main thread needs to run continuously and cannot block on a join call. I am more confused about the general structure of the code than the intricacies of synchronization.
Edit: It looks like I misused the concept of a pool in my code. All I meant was that I have a list of threads that are currently running.

You can use std::thread::detach to "separate the thread of execution from the thread object, allowing execution to continue independently. Any allocated resources will be freed once the thread exits."
If each thread should make its state visible, you can move this functionality into the thread function.
std::mutex mutex;
using strings = std::list<std::string>;
strings info;
strings::iterator insert(std::string value) {
std::unique_lock<std::mutex> lock{mutex};
return info.insert(info.end(), std::move(value));
}
void erase(strings::iterator p) {
std::unique_lock<std::mutex> lock{mutex};
info.erase(p);
}
template <typename F>
void async(F f) {
std::thread{[f] {
auto p = insert("...");
try {
f();
} catch (...) {
erase(p);
throw;
}
erase(p);
}}.detach();
}
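If the main thread needs to know when a worker has finished, without ever blocking on join, a different technique from the detach approach above is to launch each task with std::async and poll the returned std::future with a zero timeout. The following is a minimal sketch under that assumption; reap_finished and reap_demo are hypothetical names, not part of the question or the answer.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <future>
#include <thread>
#include <vector>

// Removes every task whose result is already available; never blocks.
// Returns how many finished tasks were removed.
std::size_t reap_finished(std::vector<std::future<void>>& running) {
    const std::size_t before = running.size();
    running.erase(
        std::remove_if(running.begin(), running.end(),
                       [](const std::future<void>& f) {
                           // A zero timeout turns wait_for into a poll.
                           return f.wait_for(std::chrono::seconds(0)) ==
                                  std::future_status::ready;
                       }),
        running.end());
    return before - running.size();
}

// Usage sketch: start three short tasks and reap them from a polling loop.
std::size_t reap_demo() {
    std::vector<std::future<void>> running;
    for (int i = 0; i < 3; ++i) {
        running.push_back(std::async(std::launch::async, [] {
            std::this_thread::sleep_for(std::chrono::milliseconds(10));
        }));
    }
    std::size_t reaped = 0;
    while (!running.empty()) {
        reaped += reap_finished(running);  // a real main loop would do other work here
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    assert(reaped == 3);
    return reaped;
}
```

Dropping a ready future also discards any exception the task threw; calling get() on it before erasing would surface that exception in the main thread instead.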

Related

C++ thread: how to send message to other long-live thread?

I have a server listening to some port, and I create several detached threads.
Not only will the server itself run forever, but the detached threads will run forever as well.
//pseudocode
void t1_func()
{
for(;;)
{
if(notified from server)
dosomething();
}
}
thread t1(t1_func);
thread t2(...);
for(;;)
{
// read from accepted socket
string msg = socket.read_some(...);
//notify thread 1 and thread 2;
}
Since I am new to multithreading, I don't know how to implement such a notification in the server, or how to check for it in the detached threads.
Any helpful tips will be appreciated.
The easiest way to do this is with std::condition_variable.
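A thread that calls wait on a std::condition_variable blocks until another thread calls either notify_one or notify_all on it, and only then wakes up (spurious wakeups are also possible, so production code should re-check a condition after waking).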
Here is your t1_func implemented using condition variables:
std::condition_variable t1_cond;
void t1_func()
{
//wait requires a std::unique_lock
std::mutex mtx;
std::unique_lock<std::mutex> lock{ mtx };
while(true)
{
t1_cond.wait(lock);
doSomething();
}
}
The wait method takes a std::unique_lock but the lock doesn't have to be shared to notify the thread. When you want to wake up the worker thread from the main thread you would call notify_one or notify_all like this:
t1_cond.notify_one();
If you want to have the thread wake up after a certain amount of time you could use wait_for instead of wait.
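A notification that arrives while the worker is busy in doSomething() is lost, and wait can also return spuriously, so in practice the condition variable is usually paired with shared state that both threads access under the same mutex. The sketch below shows one way this could look for the server/worker case; Mailbox, post, close, worker_loop and mailbox_demo are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical mailbox shared by the server thread and one worker.
struct Mailbox {
    std::mutex mtx;
    std::condition_variable cond;
    std::queue<std::string> messages;
    bool closed = false;

    // Called by the server for each message read from the socket.
    void post(std::string msg) {
        {
            std::lock_guard<std::mutex> lock(mtx);
            assert(!closed);  // posting after close() is a bug in this sketch
            messages.push(std::move(msg));
        }
        cond.notify_one();
    }

    // Asks the worker to exit once the queue has been drained.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mtx);
            closed = true;
        }
        cond.notify_all();
    }
};

// Sleeps until a message arrives or the mailbox is closed.
// Returns the number of messages handled.
int worker_loop(Mailbox& box) {
    int handled = 0;
    std::unique_lock<std::mutex> lock(box.mtx);
    for (;;) {
        // The predicate covers both spurious wakeups and early notifications.
        box.cond.wait(lock, [&box] { return box.closed || !box.messages.empty(); });
        if (box.messages.empty()) {
            return handled;  // closed and nothing left to process
        }
        std::string msg = std::move(box.messages.front());
        box.messages.pop();
        lock.unlock();  // do the actual work without holding the mutex
        std::cout << "handling: " << msg << '\n';
        ++handled;
        lock.lock();
    }
}

// Usage sketch: the "server" posts three messages, then shuts the worker down.
int mailbox_demo() {
    Mailbox box;
    int handled = 0;
    std::thread worker([&] { handled = worker_loop(box); });
    box.post("a");
    box.post("b");
    box.post("c");
    box.close();
    worker.join();
    assert(handled == 3);
    return handled;
}
```

With several workers, each could own its own Mailbox, or all could share one and let notify_one pick whichever worker is idle.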

How to send signal/data from a worker thread to main thread?

I'll preface this by saying that I'm delving into multithreading for the first time. Despite a lot of reading on concurrency and synchronization, I'm not readily seeing a solution for the requirements I've been given.
Using C++11 and Boost, I'm trying to figure out how to send data from a worker thread to a main thread. The worker thread is spawned at the start of the application and continuously monitors a lock free queue. Objects populate this queue at various intervals. This part is working.
Once the data is available, it needs to be processed by the main thread since another signal will be sent to the rest of the application which cannot be on a worker thread. This is what I'm having trouble with.
If I have to block the main thread through a mutex or a condition variable until the worker thread is done, how will that improve responsiveness? I might as well just stay with a single thread so I have access to the data. I must be missing something here.
I have posted a couple questions, thinking that Boost::Asio was the way to go. There is an example of how signals and data can be sent between threads, but as the responses indicate, things get quickly overly-complicated and it's not working perfectly:
How to connect signal to boost::asio::io_service when posting work on different thread?
Boost::Asio with Main/Workers threads - Can I start event loop before posting work?
After speaking with some colleagues, it was suggested that two queues be used -- one input, one output. This would be in shared space and the output queue would be populated by the worker thread. The worker thread is always going but there would need to be a Timer, probably at the application level, that would force the main thread to examine the output queue to see if there were any pending tasks.
Any ideas on where I should direct my attention? Are there any techniques or strategies that might work for what I'm trying to do? I'll be looking at Timers next.
Thanks.
Edit: This is production code for a plugin system that post-processes simulation results. We are using C++11 first wherever possible, followed by Boost. We are using Boost's lockfree::queue. The application is doing what we want on a single thread but now we are trying to optimize where we see that there are performance issues (in this case, a calculation happening through another library). The main thread has a lot of responsibilities, including database access, which is why I want to limit what the worker thread actually does.
Update: I have already been successful in using std::thread to launch a worker thread that examines a Boost lockfree queue and processes the tasks placed in it. It's step 5 in @Pressacco's response that I'm having trouble with. Are there any examples of returning a value to the main thread when a worker thread is finished and informing the main thread, rather than simply waiting for the worker to finish?
If your objective is to develop the solution from scratch (using native threads, queues, etc.):
1. Create a thread-safe queue (Mutex/CriticalSection around add/remove).
2. Create a counting semaphore that is associated with the queue.
3. Have one or more worker threads wait on the counting semaphore (i.e. the thread will block).
   - The semaphore is more efficient than having the thread constantly poll the queue.
4. As messages/jobs are added to the queue, increment the semaphore.
   - A thread will wake up.
   - The thread should remove one message.
5. If a result needs to be returned...
   - Set up another: Queue+Semaphore+WorkerThreads.
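The steps above can be sketched in portable C++11 roughly as follows. C++11 has no standard semaphore, so this sketch assumes a std::condition_variable in that role; BlockingQueue, worker and queue_demo are hypothetical names, and the negative "shutdown" job is an invented sentinel.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>

// A mutex-protected deque plus a condition variable that stands in for
// the counting semaphore.
template <typename T>
class BlockingQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push_back(std::move(value));
        }
        ready_.notify_one();  // the "increment the semaphore" step
    }

    // Blocks until an item is available, then removes exactly one.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty(); });
        assert(!items_.empty());
        T value = std::move(items_.front());
        items_.pop_front();
        return value;
    }

    // Non-blocking variant for a thread that must stay responsive.
    bool try_pop(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<T> items_;
};

// A second queue carries results back to the caller.
void worker(BlockingQueue<int>& jobs, BlockingQueue<int>& results) {
    for (;;) {
        int job = jobs.pop();
        if (job < 0) return;    // invented shutdown sentinel
        results.push(job * 2);  // placeholder for the real calculation
    }
}

// Usage sketch: queue three jobs, then drain the results without blocking.
int queue_demo() {
    BlockingQueue<int> jobs;
    BlockingQueue<int> results;
    std::thread t([&] { worker(jobs, results); });
    for (int i = 1; i <= 3; ++i) jobs.push(i);
    jobs.push(-1);
    t.join();
    int sum = 0;
    int value = 0;
    while (results.try_pop(value)) sum += value;
    assert(sum == 12);  // 2 + 4 + 6
    return sum;
}
```

In a real application the main thread would not join before draining; it would call try_pop from its event loop or a timer, so it never blocks on the worker.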
ADDITIONAL NOTES
If you decide to implement a thread safe queue from scratch, take a look at:
Synchronization between threads using Critical Section
With that said, I would take another look at BOOST. I haven't used the library, but from what I hear it will most likely contain some relevant data structures (e.g. a thread safe queue).
My favorite quote from the MSDN:
"When you use multithreading of any sort, you potentially expose
yourself to very serious and complex bugs"
SIDEBAR
Since you are looking at concurrent programming for the first time, you may wish to consider:
Is your objective to build production-worthy code, or is this simply a learning exercise?
production? consider using existing proven libraries
learning? consider writing the code from scratch
Consider using a thread pool with an asynchronous callback instead of native threads.
more threads != better
Are threads really needed?
Follow the KISS principle.
The feedback above led me in the right direction for what I needed. The solution was definitely simpler than having to use signals/slots or Boost::Asio as I had previously attempted. I have two lock-free queues, one for input (on a worker thread) and one for output (on the main thread, populated by the worker thread). I use a timer to schedule when the output queue is processed. The code is below; perhaps it is of use to somebody:
//Task.h
#include <iostream>
#include <thread>
class Task
{
public:
Task(bool shutdown = false) : _shutdown(shutdown) {};
virtual ~Task() {};
bool IsShutdownRequest() { return _shutdown; }
virtual int Execute() = 0;
private:
bool _shutdown;
};
class ShutdownTask : public Task
{
public:
ShutdownTask() : Task(true) {}
virtual int Execute() { return -1; }
};
class TimeSeriesTask : public Task
{
public:
TimeSeriesTask(int value) : _value(value) {};
virtual int Execute()
{
std::cout << "Calculating on thread " << std::this_thread::get_id() << std::endl;
return _value * 2;
}
private:
int _value;
};
// Main.cpp : Defines the entry point for the console application.
#include "stdafx.h"
#include "afxwin.h"
#include <boost/lockfree/spsc_queue.hpp>
#include "Task.h"
static UINT_PTR ProcessDataCheckTimerID = 0;
static const int ProcessDataCheckPeriodInMilliseconds = 100;
class Manager
{
public:
Manager()
{
//Worker Thread with application lifetime that processes a lock free queue
_workerThread = std::thread(&Manager::ProcessInputData, this);
};
virtual ~Manager()
{
_workerThread.join();
};
void QueueData(int x)
{
if (x > 0)
{
_inputQueue.push(std::make_shared<TimeSeriesTask>(x));
}
else
{
_inputQueue.push(std::make_shared<ShutdownTask>());
}
}
void ProcessOutputData()
{
//process output data on the Main Thread
_outputQueue.consume_one([&](int value)
{
if (value < 0)
{
PostQuitMessage(0); // the argument is the application exit code, not a message ID
}
else
{
int result = value - 1;
std::cout << "Final result is " << result << " on thread " << std::this_thread::get_id() << std::endl;
}
});
}
private:
void ProcessInputData()
{
bool shutdown = false;
//Worker Thread processes input data indefinitely
do
{
_inputQueue.consume_one([&](std::shared_ptr<Task> task)
{
std::cout << "Getting element from input queue on thread " << std::this_thread::get_id() << std::endl;
if (task->IsShutdownRequest()) { shutdown = true; }
int result = task->Execute();
_outputQueue.push(result);
});
} while (shutdown == false);
}
std::thread _workerThread;
boost::lockfree::spsc_queue<std::shared_ptr<Task>, boost::lockfree::capacity<1024>> _inputQueue;
boost::lockfree::spsc_queue<int, boost::lockfree::capacity<1024>> _outputQueue;
};
std::shared_ptr<Manager> g_pMgr;
//timer to force Main Thread to process Manager's output queue
void CALLBACK TimerCallback(HWND hWnd, UINT nMsg, UINT_PTR nIDEvent, DWORD dwTime)
{
if (nIDEvent == ProcessDataCheckTimerID)
{
KillTimer(NULL, ProcessDataCheckTimerID);
ProcessDataCheckTimerID = 0;
//call function to process data
g_pMgr->ProcessOutputData();
//reset timer
ProcessDataCheckTimerID = SetTimer(NULL, ProcessDataCheckTimerID, ProcessDataCheckPeriodInMilliseconds, (TIMERPROC)&TimerCallback);
}
}
int main()
{
std::cout << "Main thread is " << std::this_thread::get_id() << std::endl;
g_pMgr = std::make_shared<Manager>();
ProcessDataCheckTimerID = SetTimer(NULL, ProcessDataCheckTimerID, ProcessDataCheckPeriodInMilliseconds, (TIMERPROC)&TimerCallback);
//queue up some dummy data
for (int i = 1; i <= 10; i++)
{
g_pMgr->QueueData(i);
}
//queue a shutdown request
g_pMgr->QueueData(-1);
//fake the application's message loop
MSG msg;
bool shutdown = false;
while (shutdown == false)
{
if (GetMessage(&msg, NULL, 0, 0))
{
TranslateMessage(&msg);
DispatchMessage(&msg);
}
else
{
shutdown = true;
}
}
return 0;
}

Keeping threads alive even if the main thread has terminated

I am not sure if my question is correct, but I have the following example, where the main thread creates two additional threads.
Since I am not calling join at the end of main, main continues executing while the two created threads run in parallel. But because main terminates before they finish, I get the following output:
terminate called without an active exception
Aborted (core dumped)
Here's the code:
#include <iostream> // std::cout
#include <thread> // std::thread
#include <chrono>
void foo()
{
std::chrono::milliseconds dura( 2000 );
std::this_thread::sleep_for( dura );
std::cout << "Waited for 2Sec\n";
}
void bar(int x)
{
std::chrono::milliseconds dura( 4000 );
std::this_thread::sleep_for( dura );
std::cout << "Waited for 4Sec\n";
}
int main()
{
std::thread first (foo);
std::thread second (bar,0);
return 0;
}
So my question is how to keep these two threads working even if the main thread terminated?
I am asking because my main program has an event handler, and for each event I create a corresponding thread. The problem is that after the handler creates a new thread, the handler continues executing until it returns, at which point the thread object is destroyed, which also destroys the newly created thread. So how can I keep the thread alive in this case?
Also, if I use join, execution becomes serialized again.
void ho_commit_indication_handler(message &msg, const boost::system::error_code &ec)
{
.....
}
void event_handler(message &msg, const boost::system::error_code &ec)
{
if (ec)
{
log_(0, __FUNCTION__, " error: ", ec.message());
return;
}
switch (msg.mid())
{
case n2n_ho_commit:
{
boost::thread thrd(&ho_commit_indication_handler, boost::ref(msg), boost::ref(ec));
}
break;
}
}
Thanks a lot.
Simply letting main return while the threads are still running is a bad idea: destroying a std::thread object that is still joinable calls std::terminate. You should definitely join the threads:
int main()
{
std::thread first (foo);
std::thread second (bar, 0);
first.join();
second.join();
}
An alternative is to detach the threads. However, you still need to ensure that the main thread outlives them (e.g. by using a mutex and a condition_variable).
This excerpt from the C++11 standard is relevant here:
15.5.1 The std::terminate() function [except.terminate]
1 In some situations exception handling must be abandoned for less subtle error
handling techniques. [ Note: These situations are:
[...]
-- when the destructor or the copy assignment operator is invoked on an
object of type std::thread that refers to a joinable thread
Hence, you have to call either join or detach on threads before scope exit.
Concerning your edit: You have to store the threads in a list (or similar) and wait for every one of them before main is done. A better idea would be to use a thread pool (because this limits the total number of threads created).
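Storing the threads and waiting for every one of them could look roughly like the sketch below. ThreadGroup, spawn, join_all and thread_group_demo are hypothetical names; the idea is only that a single owner joins every thread before it goes away, so no std::thread is ever destroyed while joinable.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>
#include <vector>

// Owns every thread it starts and joins them all before destruction.
class ThreadGroup {
public:
    ThreadGroup() = default;
    ThreadGroup(const ThreadGroup&) = delete;
    ThreadGroup& operator=(const ThreadGroup&) = delete;

    ~ThreadGroup() { join_all(); }

    // Starts f on a new thread; the caller returns immediately.
    template <typename F>
    void spawn(F f) {
        threads_.emplace_back(std::move(f));
    }

    // Blocks until every spawned thread has finished.
    void join_all() {
        for (std::thread& t : threads_) {
            if (t.joinable()) t.join();
        }
        threads_.clear();
    }

private:
    std::vector<std::thread> threads_;
};

// Usage sketch: the event handler would call spawn(); main keeps the
// group alive and joins at shutdown.
int thread_group_demo() {
    std::atomic<int> done{0};
    {
        ThreadGroup group;
        for (int i = 0; i < 4; ++i) {
            group.spawn([&done] { ++done; });
        }
    }  // the destructor joins all four workers here
    assert(done.load() == 4);
    return done.load();
}
```

Because spawn returns immediately, the event handler is not serialized; only the final join_all waits. This sketch is not thread-safe itself, so spawn should be called from one thread only.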

Locking C++11 std::unique_lock causes deadlock exception

I'm trying to use a C++11 std::condition_variable, but when I try to lock the unique_lock associated with it from a second thread I get an exception "Resource deadlock avoided". The thread that created it can lock and unlock it, but not the second thread, even though I'm pretty sure the unique_lock shouldn't be locked already at the point the second thread tries to lock it.
FWIW I'm using gcc 4.8.1 in Linux with -std=gnu++11.
I've written a wrapper class around the condition_variable, unique_lock and mutex, so nothing else in my code has direct access to them. Note the use of std::defer_lock, I already fell in to that trap :-).
class Cond {
private:
std::condition_variable cCond;
std::mutex cMutex;
std::unique_lock<std::mutex> cULock;
public:
Cond() : cULock(cMutex, std::defer_lock)
{}
void wait()
{
std::ostringstream id;
id << std::this_thread::get_id();
H_LOG_D("Cond %p waiting in thread %s", this, id.str().c_str());
cCond.wait(cULock);
H_LOG_D("Cond %p woke up in thread %s", this, id.str().c_str());
}
// Returns false on timeout
bool waitTimeout(unsigned int ms)
{
std::ostringstream id;
id << std::this_thread::get_id();
H_LOG_D("Cond %p waiting (timed) in thread %s", this, id.str().c_str());
bool result = cCond.wait_for(cULock, std::chrono::milliseconds(ms))
== std::cv_status::no_timeout;
H_LOG_D("Cond %p woke up in thread %s", this, id.str().c_str());
return result;
}
void notify()
{
cCond.notify_one();
}
void notifyAll()
{
cCond.notify_all();
}
void lock()
{
std::ostringstream id;
id << std::this_thread::get_id();
H_LOG_D("Locking Cond %p in thread %s", this, id.str().c_str());
cULock.lock();
}
void release()
{
std::ostringstream id;
id << std::this_thread::get_id();
H_LOG_D("Releasing Cond %p in thread %s", this, id.str().c_str());
cULock.unlock();
}
};
My main thread creates a RenderContext, which has a thread associated with it. From the main thread's point of view, it uses the Cond to signal the rendering thread to perform an action and can also wait on the Cond for the rendering thread to complete that action. The rendering thread waits on the Cond for the main thread to send rendering requests, and uses the same Cond to tell the main thread it's completed an action if necessary. The error I'm getting occurs when the rendering thread tries to lock the Cond to check/wait for render requests, at which point it shouldn't be locked at all (because the main thread is waiting on it), let alone by the same thread. Here's the output:
DEBUG: Created window
DEBUG: OpenGL 3.0 Mesa 9.1.4, GLSL 1.30
DEBUG: setScreen locking from thread 140564696819520
DEBUG: Locking Cond 0x13ec1e0 in thread 140564696819520
DEBUG: Releasing Cond 0x13ec1e0 in thread 140564696819520
DEBUG: Entering GLFW main loop
DEBUG: requestRender locking from thread 140564696819520
DEBUG: Locking Cond 0x13ec1e0 in thread 140564696819520
DEBUG: requestRender waiting
DEBUG: Cond 0x13ec1e0 waiting in thread 140564696819520
DEBUG: Running thread 'RenderThread' with id 140564575180544
DEBUG: render thread::run locking from thread 140564575180544
DEBUG: Locking Cond 0x13ec1e0 in thread 140564575180544
terminate called after throwing an instance of 'std::system_error'
what(): Resource deadlock avoided
To be honest I don't really understand what a unique_lock is for and why condition_variable needs one instead of using a mutex directly, so that's probably the cause of the problem. I can't find a good explanation of it online.
Foreword: An important thing to understand with condition variables is that they can be subject to random, spurious wake ups. In other words, a CV can exit from wait() without anyone having called notify_*() first. Unfortunately there is no way to distinguish such a spurious wake up from a legitimate one, so the only solution is to have an additional resource (at the very least a boolean) so that you can tell whether the wake up condition is actually met.
This additional resource should be guarded by a mutex too, usually the very same you use as a companion for the CV.
The typical usage of a CV/mutex pair is as follows:
std::mutex mutex;
std::condition_variable cv;
Resource resource;
void produce() {
// note how the lock only protects the resource, not the notify() call
// in practice this makes little difference, you just get to release the
// lock a bit earlier which slightly improves concurrency
{
std::lock_guard<std::mutex> lock(mutex); // use the lightweight lock_guard
make_ready(resource);
}
// the point is: notify_*() don't require a locked mutex
cv.notify_one(); // or notify_all()
}
void consume() {
std::unique_lock<std::mutex> lock(mutex);
while (!is_ready(resource))
cv.wait(lock);
// note how the lock still protects the resource, in order to exclude other threads
use(resource);
}
Compared to your code, notice how several threads can call produce()/consume() simultaneously without worrying about a shared unique_lock: the only shared things are mutex/cv/resource and each thread gets its own unique_lock that forces the thread to wait its turn if the mutex is already locked by something else.
As you can see, the resource can't really be separated from the CV/mutex pair, which is why I said in a comment that your wrapper class wasn't really fitting IMHO, since it indeed tries to separate them.
The usual approach is not to make a wrapper for the CV/mutex pair as you tried to, but for the whole CV/mutex/resource trio. Eg. a thread-safe message queue where the consumer threads will wait on the CV until the queue has messages ready to be consumed.
If you really want to wrap just the CV/mutex pair, you should get rid of your lock()/release() methods which are unsafe (from a RAII point of view) and replace them with a single lock() method returning a unique_lock:
std::unique_lock<std::mutex> lock() {
return std::unique_lock<std::mutex>(cMutex);
}
This way you can use your Cond wrapper class in rather the same way as what I showed above:
Cond cond;
Resource resource;
void produce() {
{
auto lock = cond.lock();
make_ready(resource);
}
cond.notify(); // or notifyAll()
}
void consume() {
auto lock = cond.lock();
while (!is_ready(resource))
cond.wait(lock);
use(resource);
}
But honestly I'm not sure it's worth the trouble: what if you want to use a recursive_mutex instead of a plain mutex? Well, you'd have to make a template out of your class so that you can choose the mutex type (or write a second class altogether, yay for code duplication). And anyway you don't gain much since you still have to write pretty much the same code in order to manage the resource. A wrapper class only for the CV/mutex pair is too thin a wrapper to be really useful IMHO. But as usual, YMMV.
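For completeness, here is a minimal sketch of what the reworked wrapper might look like if you do go that route. It assumes wait() is changed to accept the caller's lock plus a predicate, which is not how the original Cond::wait() was declared; cond_demo is a hypothetical usage function.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Owns only the mutex and the CV. Each caller gets its own
// std::unique_lock from lock(), so no lock object is shared.
class Cond {
public:
    std::unique_lock<std::mutex> lock() {
        return std::unique_lock<std::mutex>(mutex_);
    }

    // Assumed signature: the caller passes the lock it obtained from lock().
    template <typename Pred>
    void wait(std::unique_lock<std::mutex>& held, Pred pred) {
        cond_.wait(held, pred);
    }

    // Returns false if the timeout expired with the predicate still false.
    template <typename Pred>
    bool waitTimeout(std::unique_lock<std::mutex>& held, unsigned int ms, Pred pred) {
        return cond_.wait_for(held, std::chrono::milliseconds(ms), pred);
    }

    void notify() { cond_.notify_one(); }
    void notifyAll() { cond_.notify_all(); }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
};

// Usage sketch: one consumer waits for a flag set by the producer.
int cond_demo() {
    Cond cond;
    bool ready = false;  // the "resource", guarded by cond's mutex
    int seen = 0;

    std::thread consumer([&] {
        auto held = cond.lock();
        cond.wait(held, [&] { return ready; });
        seen = 42;
    });

    {
        auto held = cond.lock();
        ready = true;
    }
    cond.notify();
    consumer.join();
    assert(seen == 42);
    return seen;
}
```

The "Resource deadlock avoided" error from the question cannot occur here, because no two threads ever call lock() on the same std::unique_lock object.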

C++11 When To Use A Memory Fence?

I'm writing some threaded C++11 code, and I'm not totally sure on when I need to use a memory fence or something. So here is basically what I'm doing:
class Worker
{
std::string arg1;
int arg2;
int arg3;
std::thread thread;
public:
Worker( std::string arg1, int arg2, int arg3 )
{
this->arg1 = arg1;
this->arg2 = arg2;
this->arg3 = arg3;
}
void DoWork()
{
this->thread = std::thread( &Worker::Work, this );
}
private:
void Work()
{
// Do stuff with args
}
};
int main()
{
Worker worker( "some data", 1, 2 );
worker.DoWork();
// Wait for it to finish
return 0;
}
I was wondering, what steps do I need to take to make sure that the args are safe to access in the Work() function which runs on another thread. Is it enough that it's written in the constructor, and then the thread is created in a separate function? Or do I need a memory fence, and how do I make a memory fence to make sure all 3 args are written by the main thread, and then read by the Worker thread?
Thanks for any help!
The C++11 standard section 30.3.1.2 thread constructors [thread.thread.constr] p5 describes the constructor template <class F, class... Args> explicit thread(F&& f, Args&&... args):
Synchronization: the completion of the invocation of the constructor synchronizes with the beginning of the invocation of the copy of f.
So everything in the current thread happens before the thread function is called. You don't need to do anything special to ensure that the assignments to the Worker members are complete and will be visible to the new thread.
In general, you should never have to use a memory fence when writing multithreaded C++11: synchronization is built into mutexes/atomics and they handle any necessary fences for you. (Caveat: you are on your own if you use relaxed atomics.)
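The two situations can be illustrated with a short sketch. The first function relies only on the guarantees of thread construction and join(); the second hands data to a thread that is already running, using a default (sequentially consistent) std::atomic flag. The simplified Worker and the function names construct_and_join_demo and publish_demo are assumptions made for this example.

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <thread>
#include <utility>

// Case 1: members written before the std::thread is constructed are visible
// in the new thread, and the thread's writes are visible after join().
struct Worker {
    std::string arg1;
    int arg2;
    int result = 0;
    std::thread thread;

    Worker(std::string a1, int a2) : arg1(std::move(a1)), arg2(a2) {}
    void DoWork() { thread = std::thread(&Worker::Work, this); }
    void Wait() { thread.join(); }

private:
    void Work() { result = static_cast<int>(arg1.size()) + arg2; }
};

int construct_and_join_demo() {
    Worker w("some data", 1);
    w.DoWork();
    w.Wait();                // join() makes the worker's writes visible here
    assert(w.result == 10);  // "some data" has 9 characters, plus arg2 == 1
    return w.result;
}

// Case 2: data published while the consumer is already running needs some
// synchronization; an atomic flag with the default ordering is enough.
int publish_demo() {
    int payload = 0;  // plain, non-atomic data
    std::atomic<bool> ready{false};
    int observed = -1;

    std::thread consumer([&] {
        while (!ready.load()) {
            std::this_thread::yield();
        }
        observed = payload;  // the write to payload happened before the store
    });

    payload = 7;
    ready.store(true);
    consumer.join();
    assert(observed == 7);
    return observed;
}
```

Neither function uses an explicit fence; the ordering comes from the thread constructor, join(), and the atomic store/load pair.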

Resources