Why doesn't this Boost thread creation compile? - multithreading

I wrote some multithreaded code using the Boost thread library. I initialize two threads in the constructor, using the placeholder _1 as the argument required by the member function fillSample(int num). But this doesn't compile in Visual Studio 2010. The code follows:
#include<boost/thread.hpp>
#include<boost/thread/condition.hpp>
#include<boost/bind/placeholders.hpp>
#define SAMPLING_FREQ 250
#define MAX_NUM_SAMPLES 5*60*SAMPLING_FREQ
#define BUFFER_SIZE 8
class ECG
{
private:
int sample[BUFFER_SIZE];
int sampleIdx;
int readIdx, writeIdx;
boost::thread m_ThreadWrite;
boost::thread m_ThreadRead;
boost::mutex m_Mutex;
boost::condition bufferNotFull, bufferNotEmpty;
public:
ECG();
void fillSample(int num); //get sample from the data stream
void processSample(); //process ECG sample, return the last processed
};
ECG::ECG() : sampleIdx(0), readIdx(0), writeIdx(0)
{
m_ThreadWrite=boost::thread((boost::bind(&ECG::fillSample, this, _1)));
m_ThreadRead=boost::thread((boost::bind(&ECG::processSample, this)));
}
void ECG::fillSample(int num)
{
boost::mutex::scoped_lock lock(m_Mutex);
while( (writeIdx-readIdx)%BUFFER_SIZE == BUFFER_SIZE-1 )
{
bufferNotFull.wait(lock);
}
sample[writeIdx] = num;
writeIdx = (writeIdx+1) % BUFFER_SIZE;
bufferNotEmpty.notify_one();
}
void ECG::processSample()
{
boost::mutex::scoped_lock lock(m_Mutex);
while( readIdx == writeIdx )
{
bufferNotEmpty.wait(lock);
}
sample[readIdx] *= 2;
readIdx = (readIdx+1) % BUFFER_SIZE;
++sampleIdx;
bufferNotFull.notify_one();
}
I already included the placeholders.hpp header file but it still doesn't compile. If I replace the _1 with 0, then it will work. But this will initialize the thread function with 0, which is not what I want. Any ideas on how to make this work?

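Move the creation to the initialization list:
m_ThreadWrite(boost::bind(&ECG::fillSample, this, _1)), ...
A thread object is not copyable, and your compiler doesn't support its move constructor. Note also that a thread calls its function object with no arguments, so nothing can ever supply a value for _1; bind the actual value you want fillSample to receive.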
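As a rough illustration of starting a worker from the member initializer list with a concrete argument, here is a minimal sketch. It makes several assumptions: it uses std::thread and std::bind as stand-ins for boost::thread and boost::bind, and the Sampler class, its result() method and the firstSample parameter are hypothetical names that do not appear in the question.

```cpp
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical, trimmed-down stand-in for the ECG class in the question.
class Sampler
{
public:
    // The worker is started from the member initializer list with a concrete
    // argument. A thread calls its function object with no arguments, so
    // there is nothing that could fill in a placeholder such as _1.
    explicit Sampler(int firstSample)
        : m_value(0),
          m_thread(std::bind(&Sampler::fillSample, this, firstSample))
    {
    }

    // Join before the members the worker uses are destroyed.
    ~Sampler()
    {
        if (m_thread.joinable())
            m_thread.join();
    }

    // Waits for the worker to finish and returns what it stored.
    int result()
    {
        if (m_thread.joinable())
            m_thread.join();
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_value;
    }

private:
    void fillSample(int num)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_value = num;
    }

    // Members are constructed in declaration order, so the mutex and the data
    // come first and the thread that uses them comes last.
    std::mutex m_mutex;
    int m_value;
    std::thread m_thread;
};
```

Constructing Sampler s(42) and then calling s.result() should yield 42. The declaration order matters: in the question's class the threads are declared before the mutex, so a thread started in the initializer list could run before the mutex exists.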

Related

'PTX JIT compilation failed' from cuModuleLoadData

Below is the code:
#define FILENAME "kernel.code"
#define kernel_name "hello_world"
#define THREADS 4
std::vector<char> load_file()
{
std::ifstream file(FILENAME, std::ios::binary | std::ios::ate);
std::streamsize fsize = file.tellg();
file.seekg(0, std::ios::beg);
std::vector<char> buffer(fsize);
if (!file.read(buffer.data(), fsize)) {
failed("could not open code object '%s'\n", FILENAME);
}
return buffer;
}
struct joinable_thread : std::thread
{
template <class... Xs>
joinable_thread(Xs&&... xs) : std::thread(std::forward<Xs>(xs)...) // NOLINT
{
}
joinable_thread& operator=(joinable_thread&& other) = default;
joinable_thread(joinable_thread&& other) = default;
~joinable_thread()
{
if(this->joinable())
this->join();
}
};
void run(const std::vector<char>& buffer) {
CUdevice device;
CUDACHECK(cuDeviceGet(&device, 0));
CUcontext context;
CUDACHECK(cuCtxCreate(&context, 0, device));
CUmodule Module;
CUDACHECK(cuModuleLoadData(&Module, &buffer[0]));
...
}
void run_multi_threads(uint32_t n) {
{
auto buffer = load_file();
std::vector<joinable_thread> threads;
for (uint32_t i = 0; i < n; i++) {
threads.emplace_back(std::thread{[&, i, buffer] {
run(buffer);
}});
}
}
}
int main() {
CUDACHECK(cuInit(0));
run_multi_threads(THREADS);
}
And the code kernel.cu used for ptx is as follows:
#include "cuda_runtime.h"
extern "C" __global__ void hello_world(float* a, float* b) {
int tx = threadIdx.x;
b[tx] = a[tx];
}
I am generating the PTX this way:
nvcc --ptx kernel.cu -o kernel.code
I'm using a machine with a GeForce GTX TITAN X.
I get the "PTX JIT compilation failed" error from cuModuleLoadData only when I use this with multiple threads. If I remove the multithreading part and run normally, the error doesn't occur.
Can anyone please tell me what is going wrong and how to fix it?
As mentioned in the comments, I was able to get it to work by moving the load_file() call to main, so that the buffer read from the file stays valid, and then passing only the buffer to all the threads.
In the original code, the buffer is destroyed once execution leaves the '{...}' scope, so by the time a thread starts it may read an invalid buffer.
If you put the buffer in main, it is not destroyed or freed until the program exits.
So yes, the error occurs because you pass an invalid buffer (which may already have been freed) to the CUDA code.
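The ownership arrangement described above could be sketched roughly as follows. This is only an illustration under assumptions: run() here is a stand-in that merely reads the buffer (the real code would call cuModuleLoadData at that point), and the changed signature of run_multi_threads is hypothetical.

```cpp
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Stand-in for run() from the question: it only reads the buffer. In the real
// program this is where cuModuleLoadData(&Module, buffer.data()) would go.
std::size_t run(const std::vector<char>& buffer)
{
    return std::accumulate(buffer.begin(), buffer.end(), std::size_t{0},
        [](std::size_t sum, char c) { return sum + static_cast<unsigned char>(c); });
}

// The caller owns the buffer. Every worker borrows it by const reference and
// is joined before this function returns, so no worker can outlive the data.
std::vector<std::size_t> run_multi_threads(const std::vector<char>& buffer, unsigned n)
{
    std::vector<std::size_t> results(n);
    std::vector<std::thread> threads;
    threads.reserve(n);
    for (unsigned i = 0; i < n; ++i) {
        threads.emplace_back([&buffer, &results, i] { results[i] = run(buffer); });
    }
    for (std::thread& t : threads) {
        t.join();
    }
    return results;
}
```

In main, the buffer would be loaded once (for example with load_file()) and passed to run_multi_threads, so it stays alive for the whole call.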

Overridden virtual function not called from thread

I am writing a base class to manage threads. The idea is to allow the thread function to be overridden in child class while the base class manages thread life cycle. I ran into a strange behavior which I don't understand - it seems that the virtual function mechanism does not work when the call is made from a thread. To illustrate my problem, I reduced my code to the following:
#include <iostream>
#include <thread>
using namespace std;
struct B
{
thread t;
void thread_func_non_virt()
{
thread_func();
}
virtual void thread_func()
{
cout << "B::thread_func\n";
}
B(): t(thread(&B::thread_func_non_virt, this)) { }
void join() { t.join(); }
};
struct C : B
{
virtual void thread_func() override
{
cout << "C::thread_func\n";
}
};
int main()
{
C c; // output is "B::thread_func" but "C::thread_func" is expected
c.join();
c.thread_func_non_virt(); // output "C::thread_func" as expected
}
I tried with both Visual Studio 2017 and g++ 5.4 (Ubuntu 16) and found the behavior is consistent. Can someone point out where I went wrong?
== UPDATE ==
Based on Igor's answer, I moved the thread creation out of the constructor into a separate method, called that method after construction, and got the desired behavior.
Your program exhibits undefined behavior. There's a race on *this between thread_func and C's (implicitly defined) constructor: the thread is started while B's constructor is still running, so it may make the virtual call before the object has become a C.
#include <iostream>
#include <thread>
using namespace std;
struct B
{
thread t;
void thread_func_non_virt()
{
thread_func();
}
virtual void thread_func()
{
cout << "B::thread_func\n";
}
B(B*ptr): t(thread(&B::thread_func_non_virt, ptr))
{
}
void join() { t.join(); }
};
struct C:public B
{
C():B(this){}
virtual void thread_func() override
{
cout << "C::thread_func\n";
}
};
int main()
{
C c; // "C::thread_func" is expected
c.join();
c.thread_func_non_virt(); // output "C::thread_func" as expected
}
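The fix described in the question's update (start the thread from a separate method, called only after construction has finished) might look roughly like the sketch below. The start(), record() and last_call() members are hypothetical additions; record() replaces the cout calls so the outcome can be inspected. Note that passing this from C's constructor, as in the variant above, still appears to start the thread before C's constructor has completed, so this sketch avoids starting anything during construction.

```cpp
#include <mutex>
#include <string>
#include <thread>

struct B
{
    // Safety net only: a derived class should join in its own destructor.
    virtual ~B() { join(); }

    // Called after the most-derived constructor has finished, so the virtual
    // call made by the worker dispatches to the final overrider.
    void start() { t = std::thread(&B::thread_func_non_virt, this); }
    void join() { if (t.joinable()) t.join(); }

    // Hypothetical accessor so the result can be inspected after join().
    std::string last_call()
    {
        std::lock_guard<std::mutex> lock(m);
        return name;
    }

protected:
    void record(const std::string& s)
    {
        std::lock_guard<std::mutex> lock(m);
        name = s;
    }
    virtual void thread_func() { record("B::thread_func"); }

private:
    void thread_func_non_virt() { thread_func(); }

    std::mutex m;
    std::string name;
    std::thread t;
};

struct C : B
{
    // Join here, while the C part of the object is still alive.
    ~C() override { join(); }

protected:
    void thread_func() override { record("C::thread_func"); }
};
```

With this arrangement, C c; c.start(); c.join(); leaves c.last_call() equal to "C::thread_func". The cost is a two-step initialization that callers must remember.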

How to prematurely kill std::async threads before they are finished *without* using a std::atomic_bool?

I have a function that takes a callback and uses it to do work on 10 separate threads. However, it is often the case that not all of the work is needed. For example, if the desired result is obtained on the third thread, all work on the remaining live threads should stop.
This answer here suggests that it is not possible unless you have the callback functions take an additional std::atomic_bool argument, that signals whether the function should terminate prematurely.
This solution does not work for me. The workers are spun up inside a base class, and the whole point of this base class is to abstract away details of multithreading. How can I do this? I am anticipating that I will have to ditch std::async for something more involved.
#include <iostream>
#include <future>
#include <vector>
class ABC{
public:
std::vector<std::future<int> > m_results;
ABC() {};
~ABC(){};
virtual int callback(int a) = 0;
void doStuffWithCallBack();
};
void ABC::doStuffWithCallBack(){
// start working
for(int i = 0; i < 10; ++i)
m_results.push_back(std::async(&ABC::callback, this, i));
// analyze results and cancel all threads when you get the 1
for(int j = 0; j < 10; ++j){
double foo = m_results[j].get();
if ( foo == 1){
break; // but threads continue running
}
}
std::cout << m_results[9].get() << " <- this shouldn't have ever been computed\n";
}
class Derived : public ABC {
public:
Derived() : ABC() {};
~Derived() {};
int callback(int a){
std::cout << a << "!\n";
if (a == 3)
return 1;
else
return 0;
};
};
int main(int argc, char **argv)
{
Derived myObj;
myObj.doStuffWithCallBack();
return 0;
}
I'll just say that this should probably not be a part of a 'normal' program, since it could leak resources and/or leave your program in an unstable state, but in the interest of science...
If you have control of the thread loop, and you don't mind using platform features, you could inject an exception into the thread. With POSIX you can use signals for this; on Windows you would have to use SetThreadContext(). Though the exception will generally unwind the stack and call destructors, your thread may be in a system call or other 'non-exception-safe place' when the exception occurs.
Disclaimer: I only have Linux at the moment, so I did not test the Windows code.
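#if defined(_WIN32)
# define ITS_WINDOWS
#else
# define ITS_POSIX
#endif
#include <chrono>
#include <cstdio>
#include <string>
#include <thread>
#if defined(ITS_WINDOWS)
#include <windows.h>
#elif defined(ITS_POSIX)
#include <pthread.h>
#include <signal.h>
#endif
// Thrown into the worker thread to unwind its stack.
void throw_exception()
{
throw std::string();
}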
void init_exceptions()
{
volatile int i = 0;
if (i)
throw_exception();
}
bool abort_thread(std::thread &t)
{
#if defined(ITS_WINDOWS)
bool bSuccess = false;
HANDLE h = t.native_handle();
if (INVALID_HANDLE_VALUE == h)
return false;
if (INFINITE == SuspendThread(h))
return false;
CONTEXT ctx;
ctx.ContextFlags = CONTEXT_CONTROL;
if (GetThreadContext(h, &ctx))
{
#if defined( _WIN64 )
ctx.Rip = (DWORD64)(DWORD_PTR)throw_exception; // Rip is 64 bits wide; a DWORD cast would truncate the address
#else
ctx.Eip = (DWORD)(DWORD_PTR)throw_exception;
#endif
bSuccess = SetThreadContext(h, &ctx) ? true : false;
}
ResumeThread(h);
return bSuccess;
#elif defined(ITS_POSIX)
return 0 == pthread_kill(t.native_handle(), SIGUSR2); // report whether the signal was sent
#endif
return false;
}
#if defined(ITS_POSIX)
void worker_thread_sig(int sig)
{
if(SIGUSR2 == sig)
throw std::string();
}
#endif
void init_threads()
{
#if defined(ITS_POSIX)
struct sigaction sa;
sigemptyset(&sa.sa_mask);
sa.sa_flags = 0;
sa.sa_handler = worker_thread_sig;
sigaction(SIGUSR2, &sa, 0);
#endif
}
class tracker
{
public:
tracker() { printf("tracker()\n"); }
~tracker() { printf("~tracker()\n"); }
};
int main(int argc, char *argv[])
{
init_threads();
printf("main: starting thread...\n");
std::thread t([]()
{
try
{
tracker a;
init_exceptions();
printf("thread: started...\n");
std::this_thread::sleep_for(std::chrono::minutes(1000));
printf("thread: stopping...\n");
}
catch(std::string s)
{
printf("thread: exception caught...\n");
}
});
printf("main: sleeping...\n");
std::this_thread::sleep_for(std::chrono::seconds(2));
printf("main: aborting...\n");
abort_thread(t);
printf("main: joining...\n");
t.join();
printf("main: exiting...\n");
return 0;
}
Output:
main: starting thread...
main: sleeping...
tracker()
thread: started...
main: aborting...
main: joining...
~tracker()
thread: exception caught...
main: exiting...
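As a more conventional alternative to injecting exceptions, the stop flag can live inside the base class so that callbacks keep their original signature. The following is only a sketch under assumptions: stopRequested() and the n parameter are hypothetical additions, the flag is still a std::atomic<bool> (merely hidden from the callback's argument list), and a callback that never polls it will simply run to completion.

```cpp
#include <atomic>
#include <future>
#include <vector>

class ABC
{
public:
    virtual ~ABC() = default;

    // Runs callback(0..n-1) on separate threads and returns the index of the
    // first task (in index order) whose result is 1, or -1 if there is none.
    int doStuffWithCallBack(int n)
    {
        m_stop.store(false);
        std::vector<std::future<int>> results;
        for (int i = 0; i < n; ++i) {
            results.push_back(std::async(std::launch::async, [this, i] {
                // Tasks that begin after a stop request skip the work.
                return m_stop.load() ? 0 : callback(i);
            }));
        }
        int found = -1;
        for (int j = 0; j < n; ++j) {
            // get() is still called on every future so that all tasks have
            // finished before this function returns.
            if (results[j].get() == 1 && found < 0) {
                found = j;
                m_stop.store(true); // ask the remaining tasks to wind down
            }
        }
        return found;
    }

protected:
    virtual int callback(int a) = 0;

    // Long-running callbacks poll this and return early when it is true.
    bool stopRequested() const { return m_stop.load(); }

private:
    std::atomic<bool> m_stop{false};
};

class Derived : public ABC
{
protected:
    int callback(int a) override
    {
        for (int step = 0; step < 1000; ++step) {
            if (stopRequested())
                return 0; // give up early; the result is no longer needed
        }
        return a == 3 ? 1 : 0;
    }
};
```

Here Derived().doStuffWithCallBack(10) would return 3. The derived class only needs to know that stopRequested() exists; the threading itself stays in the base class.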

C++ Qt: Redirect cout from a thread to emit a signal

In a single thread, I have this beautiful class that redirects all cout output to a QTextEdit
#include <iostream>
#include <streambuf>
#include <string>
#include <QScrollBar>
#include "QTextEdit"
#include "QDateTime"
class ThreadLogStream : public std::basic_streambuf<char>, QObject
{
Q_OBJECT
public:
ThreadLogStream(std::ostream &stream) : m_stream(stream)
{
m_old_buf = stream.rdbuf();
stream.rdbuf(this);
}
~ThreadLogStream()
{
// output anything that is left
if (!m_string.empty())
{
log_window->append(m_string.c_str());
}
m_stream.rdbuf(m_old_buf);
}
protected:
virtual int_type overflow(int_type v)
{
if (v == '\n')
{
log_window->append(m_string.c_str());
m_string.erase(m_string.begin(), m_string.end());
}
else
m_string += v;
return v;
}
virtual std::streamsize xsputn(const char *p, std::streamsize n)
{
m_string.append(p, p + n);
long pos = 0;
while (pos != static_cast<long>(std::string::npos))
{
pos = m_string.find('\n');
if (pos != static_cast<long>(std::string::npos))
{
std::string tmp(m_string.begin(), m_string.begin() + pos);
log_window->append(tmp.c_str());
m_string.erase(m_string.begin(), m_string.begin() + pos + 1);
}
}
return n;
}
private:
std::ostream &m_stream;
std::streambuf *m_old_buf;
std::string m_string;
QTextEdit* log_window;
};
However, this doesn't work as soon as any other thread (QThread) writes to cout. The QTextEdit must only be touched from the main (GUI) thread, so data has to be transferred from the worker thread to the main thread through signals and slots.
I would like to modify this class to emit a signal rather than write to the QTextEdit directly. This requires the class to inherit from QObject and contain the Q_OBJECT macro. I tried inheriting from QObject in addition to std::basic_streambuf<char> and added the Q_OBJECT macro in the body, but it didn't compile.
Could you please help me to achieve this? What should I do to get this class to emit signals that I can connect to and that are thread safe?
For those who need the full "working" answer, here it is. I just copied it because @GraemeRock asked for it.
#ifndef ThreadLogStream_H
#define ThreadLogStream_H
#include <iostream>
#include <streambuf>
#include <string>
#include <QScrollBar>
#include "QTextEdit"
#include "QDateTime"
class ThreadLogStream : public QObject, public std::basic_streambuf<char>
{
Q_OBJECT
public:
ThreadLogStream(std::ostream &stream) : m_stream(stream)
{
m_old_buf = stream.rdbuf();
stream.rdbuf(this);
}
~ThreadLogStream()
{
// output anything that is left
if (!m_string.empty())
{
emit sendLogString(QString::fromStdString(m_string));
}
m_stream.rdbuf(m_old_buf);
}
protected:
virtual int_type overflow(int_type v)
{
if (v == '\n')
{
emit sendLogString(QString::fromStdString(m_string));
m_string.erase(m_string.begin(), m_string.end());
}
else
m_string += v;
return v;
}
virtual std::streamsize xsputn(const char *p, std::streamsize n)
{
m_string.append(p, p + n);
long pos = 0;
while (pos != static_cast<long>(std::string::npos))
{
pos = static_cast<long>(m_string.find('\n'));
if (pos != static_cast<long>(std::string::npos))
{
std::string tmp(m_string.begin(), m_string.begin() + pos);
emit sendLogString(QString::fromStdString(tmp));
m_string.erase(m_string.begin(), m_string.begin() + pos + 1);
}
}
return n;
}
private:
std::ostream &m_stream;
std::streambuf *m_old_buf;
std::string m_string;
signals:
void sendLogString(const QString& str);
};
#endif // ThreadLogStream_H
The derivation needs to happen QObject-first:
class LogStream : public QObject, std::basic_streambuf<char> {
Q_OBJECT
...
};
...
If the goal was to minimally modify your code, there's a simpler way. You don't need to inherit QObject to emit signals iff you know exactly what slots the signals are going to. All you need to do is to invoke the slot in a thread safe way:
QMetaObject::invokeMethod(log_window, "append", Qt::QueuedConnection,
Q_ARG(QString, tmp.c_str()));
To speed things up, you can cache the method so that it doesn't have to be looked up every time:
class LogStream ... {
QPointer<QTextEdit> m_logWindow;
QMetaMethod m_append;
LogStream::LogStream(...) :
m_logWindow(...),
m_append(m_logWindow->metaObject()->method(
m_logWindow->metaObject()->indexOfSlot("append(QString)") )) {
...
}
};
You can then invoke it more efficiently:
m_append.invoke(m_logWindow, Qt::QueuedConnection, Q_ARG(QString, tmp.c_str()));
Finally, whenever you hold pointers to objects whose lifetimes are not under your control, it's helpful to use QPointer. A QPointer resets itself to 0 when the pointed-to QObject is destroyed, so it never dangles; you only need to check it for null before using it.
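Stripped of Qt, the redirection mechanism used by the classes above (replace the stream's buffer, collect characters, hand over each complete line) can be sketched in plain C++. This is an illustration under assumptions: LineForwardingBuf is a hypothetical name, and the std::function callback stands in for emitting sendLogString or invoking the slot.

```cpp
#include <functional>
#include <ostream>
#include <streambuf>
#include <string>
#include <utility>

// Collects characters written to a stream and hands each complete line to a
// callback. The callback is a stand-in for "emit sendLogString(...)".
class LineForwardingBuf : public std::streambuf
{
public:
    using Callback = std::function<void(const std::string&)>;

    LineForwardingBuf(std::ostream& stream, Callback cb)
        : m_stream(stream), m_old_buf(nullptr), m_callback(std::move(cb))
    {
        // Install this buffer and remember the one it replaces.
        m_old_buf = m_stream.rdbuf(this);
    }

    ~LineForwardingBuf() override
    {
        if (!m_line.empty())
            m_callback(m_line);    // forward a trailing partial line
        m_stream.rdbuf(m_old_buf); // restore the original buffer
    }

protected:
    // Called for every single character because no put area is set up.
    int_type overflow(int_type v) override
    {
        if (traits_type::eq_int_type(v, traits_type::eof()))
            return traits_type::not_eof(v);
        const char c = traits_type::to_char_type(v);
        if (c == '\n') {
            m_callback(m_line);
            m_line.clear();
        } else {
            m_line += c;
        }
        return v;
    }

    // Called for block writes; reuses the per-character logic.
    std::streamsize xsputn(const char* p, std::streamsize n) override
    {
        for (std::streamsize i = 0; i < n; ++i)
            overflow(traits_type::to_int_type(p[i]));
        return n;
    }

private:
    std::ostream& m_stream;
    std::streambuf* m_old_buf;
    Callback m_callback;
    std::string m_line;
};
```

While a LineForwardingBuf constructed on std::cout is alive, each line written to cout reaches the callback instead of the console. The class does no locking of its own, so concurrent writers would still need synchronization around m_line.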

Implementing boost::barrier in C++11

I've been trying to rid a project of every Boost reference and switch to pure C++11.
At one point, worker threads are created which wait for a barrier to give the 'go' command, do the work (spread across the N threads) and synchronize when all of them finish. The basic idea is that the main loop gives the go order (boost::barrier .wait()) and waits for the result with the same function.
In a different project I had implemented a custom-made Barrier based on the Boost version, and everything worked perfectly. The implementation is as follows:
Barrier.h:
class Barrier {
public:
Barrier(unsigned int n);
void Wait(void);
private:
std::mutex counterMutex;
std::mutex waitMutex;
unsigned int expectedN;
unsigned int currentN;
};
Barrier.cpp
Barrier::Barrier(unsigned int n) {
expectedN = n;
currentN = expectedN;
}
void Barrier::Wait(void) {
counterMutex.lock();
// If we're the first thread, we want an extra lock at our disposal
if (currentN == expectedN) {
waitMutex.lock();
}
// Decrease thread counter
--currentN;
if (currentN == 0) {
currentN = expectedN;
waitMutex.unlock();
currentN = expectedN;
counterMutex.unlock();
} else {
counterMutex.unlock();
waitMutex.lock();
waitMutex.unlock();
}
}
This code has been used on iOS and with Android's NDK without any problems, but in a Visual Studio 2013 project it fails because only the thread that locked a mutex may unlock it (assertion: unlock of unowned mutex).
Is there a non-spinning (blocking, such as this one) version of a barrier that works with C++11? I've only been able to find barriers that use busy-waiting, which is something I would like to avoid (unless there is really no alternative).
class Barrier {
public:
explicit Barrier(std::size_t iCount) :
mThreshold(iCount),
mCount(iCount),
mGeneration(0) {
}
void Wait() {
std::unique_lock<std::mutex> lLock{mMutex};
auto lGen = mGeneration;
if (!--mCount) {
mGeneration++;
mCount = mThreshold;
mCond.notify_all();
} else {
mCond.wait(lLock, [this, lGen] { return lGen != mGeneration; });
}
}
private:
std::mutex mMutex;
std::condition_variable mCond;
std::size_t mThreshold;
std::size_t mCount;
std::size_t mGeneration;
};
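To show how the generation counter lets the same barrier object be reused round after round, here is a small usage sketch. The Barrier class is repeated (lightly condensed) so that the block compiles on its own; run_rounds is a hypothetical driver written for this illustration.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Same generation-counting barrier as above, repeated so that this block
// compiles on its own.
class Barrier {
public:
    explicit Barrier(std::size_t count)
        : mThreshold(count), mCount(count), mGeneration(0) {}

    void Wait() {
        std::unique_lock<std::mutex> lock{mMutex};
        const auto gen = mGeneration;
        if (--mCount == 0) {
            ++mGeneration;       // release everyone waiting on this round
            mCount = mThreshold; // re-arm for the next round
            mCond.notify_all();
        } else {
            mCond.wait(lock, [this, gen] { return gen != mGeneration; });
        }
    }

private:
    std::mutex mMutex;
    std::condition_variable mCond;
    std::size_t mThreshold;
    std::size_t mCount;
    std::size_t mGeneration;
};

// Hypothetical driver: returns true if, in every round, all workers had
// arrived by the time any worker got past the barrier.
bool run_rounds(std::size_t workers, std::size_t rounds) {
    Barrier barrier(workers);
    std::atomic<std::size_t> arrived{0};
    std::atomic<bool> ok{true};
    std::vector<std::thread> threads;
    for (std::size_t w = 0; w < workers; ++w) {
        threads.emplace_back([&] {
            for (std::size_t r = 0; r < rounds; ++r) {
                ++arrived;      // "work" for this round
                barrier.Wait(); // everyone has arrived
                if (arrived.load() != (r + 1) * workers)
                    ok = false;
                barrier.Wait(); // everyone has checked
            }
        });
    }
    for (std::thread& t : threads)
        t.join();
    return ok.load();
}
```

A call such as run_rounds(4, 200) should return true. Each waiter compares against the generation it saw on entry, so a thread that races ahead into the next Wait() cannot be confused with the round that is still being released.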
Use a std::condition_variable instead of a std::mutex to block all threads until the last one reaches the barrier.
class Barrier
{
private:
std::mutex _mutex;
std::condition_variable _cv;
std::size_t _count;
public:
explicit Barrier(std::size_t count) : _count(count) { }
void Wait()
{
std::unique_lock<std::mutex> lock(_mutex);
if (--_count == 0) {
_cv.notify_all();
} else {
_cv.wait(lock, [this] { return _count == 0; });
}
}
};
Here's my version of the accepted answer above with auto-reset behavior for repeated use; this is achieved by counting up and down alternately.
/**
* @brief Represents a CPU thread barrier
* @note The barrier automatically resets after all threads are synced
*/
class Barrier
{
private:
std::mutex m_mutex;
std::condition_variable m_cv;
size_t m_count;
const size_t m_initial;
enum State : unsigned char {
Up, Down
};
State m_state;
public:
explicit Barrier(std::size_t count) : m_count{ count }, m_initial{ count }, m_state{ State::Down } { }
/// Blocks until all N threads reach here
void Sync()
{
std::unique_lock<std::mutex> lock{ m_mutex };
if (m_state == State::Down)
{
// Counting down the number of syncing threads
if (--m_count == 0) {
m_state = State::Up;
m_cv.notify_all();
}
else {
m_cv.wait(lock, [this] { return m_state == State::Up; });
}
}
else // (m_state == State::Up)
{
// Counting back up for Auto reset
if (++m_count == m_initial) {
m_state = State::Down;
m_cv.notify_all();
}
else {
m_cv.wait(lock, [this] { return m_state == State::Down; });
}
}
}
};
It seems that none of the answers above work when two barrier calls are placed too close together.
Example: each thread runs a while loop that looks like this:
while (true)
{
threadBarrier->Synch();
// do heavy computation
threadBarrier->Synch();
// small external calculations like timing, loop count, etc, ...
}
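And here is the solution using the standard library:
class ThreadBarrier
{
public:
int m_threadCount = 0;
int m_currentThreadCount = 0;
unsigned long m_generation = 0; // incremented each time the barrier releases
std::mutex m_mutex;
std::condition_variable m_cv;
public:
inline ThreadBarrier(int threadCount)
{
m_threadCount = threadCount;
};
public:
inline void Synch()
{
// Hold the lock from counting through waiting so that a wakeup cannot be lost
std::unique_lock<std::mutex> lk(m_mutex);
const unsigned long generation = m_generation;
m_currentThreadCount = (m_currentThreadCount + 1) % m_threadCount;
if (m_currentThreadCount != 0)
{
// The predicate also guards against spurious wakeups
m_cv.wait(lk, [this, generation] { return generation != m_generation; });
}
else
{
// Last thread to arrive: start a new generation and release the others
++m_generation;
m_cv.notify_all();
}
};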
};
And the solution for Windows:
class ThreadBarrier
{
public:
SYNCHRONIZATION_BARRIER m_barrier;
public:
inline ThreadBarrier(int threadCount)
{
InitializeSynchronizationBarrier(
&m_barrier,
threadCount,
8000);
};
public:
inline void Synch()
{
EnterSynchronizationBarrier(
&m_barrier,
0);
};
};
