I have an application that runs multiple threads which are sometimes cancelled. These threads may call into another object that internally accesses a resource (a socket). To prevent the resource from being accessed simultaneously, there is a critical section to impose some order on the execution.
Now, when cancelling a thread, it sometimes happens that the thread is right inside the code protected by the critical section. The critical section is locked using an object, and I was hoping that upon cancellation of the thread this object would be destructed and consequently release the lock. However, this does not seem to be the case, so after the thread is destroyed the resource object remains permanently locked.
Changing the resource object is probably not an option (it is delivered by a 3rd party), plus it makes sense to prevent simultaneous access to a resource that cannot be used in parallel.
I have experimented with preventing the thread from being cancelled using pthread_setcancelstate while the section is locked/unlocked, however this feels a bit dirty and would not be a final solution for other situations (e.g. acquired mutexes, etc.).
I know that the preferred solution would be to not use pthread_cancel, but instead set a flag in the thread so that it cancels itself when it is ready (in a clean way). However, as I want to cancel the thread as soon as possible, I was wondering (also out of academic interest) whether there are other options to do that.
Thread cancellation without help from the application (the mentioned flag) is a bad idea. Just google it.
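For what it's worth, the flag-based approach the question already mentions is easy to sketch. Below is a minimal, illustrative example (the names, the sleep-based "work" and the timing are made up): the worker polls an atomic flag at safe points and exits on its own, so any locks it holds are released on the normal path.
// Cooperative-cancellation sketch; "worker" and "stop_requested" are illustrative names.
#include <pthread.h>
#include <unistd.h>
#include <atomic>
#include <cstdio>

static std::atomic<bool> stop_requested{false};

static void* worker(void*)
{
    while (!stop_requested.load()) {
        // lock the resource, do one bounded chunk of work, unlock
        usleep(10 * 1000);
    }
    // clean exit: locks are released and destructors run normally
    return nullptr;
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, worker, nullptr);
    sleep(1);
    stop_requested = true;   // the "cancel" request
    pthread_join(tid, nullptr);
    std::puts("worker exited cleanly");
}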
Actually, cancellation is so hard that it has been omitted from the latest C++0x draft. You can search http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2497.html and you won't find any mention of cancellation at all. Here's the definition of the proposed thread class (you won't find cancel there):
class thread
{
public:
// types:
class id;
typedef implementation-defined native_handle_type; // See [thread.native]
// construct/copy/destroy:
thread();
template <class F> explicit thread(F f);
template <class F, class ...Args> thread(F&& f, Args&&... args);
~thread();
thread(const thread&) = delete;
thread(thread&&);
thread& operator=(const thread&) = delete;
thread& operator=(thread&&);
// members:
void swap(thread&&);
bool joinable() const;
void join();
void detach();
id get_id() const;
native_handle_type native_handle(); // See [thread.native]
// static members:
static unsigned hardware_concurrency();
};
You could use pthread_cleanup_push() to push a cancellation cleanup handler onto the thread's cancellation cleanup stack. This handler would be responsible for unlocking the critical section.
Once you leave the critical section you should call pthread_cleanup_pop(0) to remove it.
i.e.
CRITICAL_SECTION g_section;

void clean_crit_sec( void * )
{
    LeaveCriticalSection( &g_section );
}

void *thrfunc( void * )
{
    EnterCriticalSection( &g_section );
    pthread_cleanup_push( clean_crit_sec, NULL );
    // Do something that may be cancellable
    LeaveCriticalSection( &g_section );
    pthread_cleanup_pop( 0 );
    return NULL;
}
This would still leave a small race condition where the critical section has been unlocked but the cleanup handler could still be executed if the thread was cancelled between the Leave... call and the cleanup_pop.
You could instead call pthread_cleanup_pop with 1, which executes your cleanup code for you, so you do not leave the critical section yourself, i.e.
CRITICAL_SECTION g_section;

void clean_crit_sec( void * )
{
    LeaveCriticalSection( &g_section );
}

void *thrfunc( void * )
{
    EnterCriticalSection( &g_section );
    pthread_cleanup_push( clean_crit_sec, NULL );
    // Do something that may be cancellable
    pthread_cleanup_pop( 1 ); // this will pop the handler and execute it.
    return NULL;
}
The idea of aborting threads without using a well-defined control method (i.e., flags) is just so evil that you simply shouldn't do it.
If you have third party code that leaves you no option but to do this, I might go as far as to suggest abstracting the horrible code inside a separate process, and then interacting with that process instead, separating each such component nicely.
Now, such a design would be even worse on Windows, because Windows is not good at running multiple processes, but it is not such a bad idea on Linux.
Of course, having a sensible design for your threaded modules would be even better...
(Personally, I prefer not using threads at all, and always using processes, or non-blocking designs)
If the lock which controls the critical section is not exposed to you directly, there is not much you can do. When you cancel a thread, all the cleanup handlers for the thread are executed in the normal reverse order, but of course these handlers can only release mutexes which you have access to. So you really can't do much more than disable cancellation during your visit to the 3rd party component.
I think your best solution is to use both a flag and the pthread_cancel functionality. When you are entering the 3rd party component, disable cancel processing (PTHREAD_CANCEL_DISABLE); when you get back out of it, re-enable it. After re-enabling it, check the flag:
/* In the thread which you want to be able to be cancelled: */
int oldstate;
pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate);
... call 3rd party component ...
pthread_setcancelstate(oldstate, NULL);
if (cancelled_flag) pthread_exit(PTHREAD_CANCELED);

/* In the thread cancelling the other one. Note the order of operations
   to avoid a race condition: */
cancelled_flag = true;
pthread_cancel(thread_id);
A common design pattern is to have a "manager" object that maintains a set of "managed" objects. In C++11 and later, the Manager likely keeps shared_ptrs to the Managed objects. If the Managed objects need a reference back to the Manager, they wisely do so by storing a weak_ptr<Manager>. The Manager can establish this relationship itself by constructing each Managed object directly (through a factory function, for example), and passing its own shared_ptr to the Managed object. The Manager can obtain its own shared_ptr by using shared_from_this(). None of these choices are required, but they are common and reasonable.
Now consider a Manager that maintains its Managed objects in a separate thread. A user of the Manager-Managed system may ask the Manager to create Managed objects, then run() the Manager so that it maintains those objects in the background until stop() is called. Still seems perfectly reasonable, right?
But now consider the Manager's destructor. It would be a nasty error to allow its background thread to continue past destruction. So we call stop() from the destructor.
Yet this raises a serious issue. Because the Manager is owned by shared_ptrs, its destructor will be called precisely when no shared_ptr references it. At that point, all weak_ptrs to the Manager will be expired(). Therefore all of the Managed objects' manager pointers will be invalid. And since the Managed objects are being "worked" (their member functions called) in a separate thread, they may suddenly find themselves with a null manager. If they assume their manager is non-null, the result is an error of one (severe) kind or another.
I see three potential solutions to the problem.
1. Add explicit checks for non-null manager everywhere it's used in Managed object code. Yet depending on the complexity of Managed objects, these checks are likely to be fiddly and error-prone.
2. Ensure that stop() is called prior to the manager being destroyed. Yet this violates the semantics of shared_ptr. There is no single owner of Manager: it is shared. So no single object knows when it will die or when it should stop updating. Moreover, it's simply bad form to leave Manager's destructor without a call to stop(): RAII implies that the Manager must deal with its own thread.
3. Make the Manager detect its own imminent death and stop calling Managed objects when it's dying. This has the benefit of centralizing the burden: the Manager should be able to detect its own death in a few places (for example, loops over all Managed objects) and refuse to deal with Managed objects in those places. Since the Managed objects won't be called, they won't attempt to use their expired weak_ptr<Manager>s and therefore won't fail (or need to check them constantly).
Is there a standard, correct way of dealing with this problem? Is the problem as I've framed it in violation of some well-understood principle for design, use of shared_ptr/weak_ptr, or use of threads?
The following code illustrates the problem.
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <memory>
#include <thread>
#include <vector>
using namespace std;
class Manager;
class Managed {
public:
explicit Managed( shared_ptr< Manager > manager )
: m_manager( manager )
{}
void doStuff() {
// Fails because Manager::work() may be continuing to traverse Managed objects in a separate thread
// while m_manager is in its destructor (and therefore dead).
assert( m_manager.expired() == false );
// ...
}
private:
weak_ptr< Manager > m_manager;
};
class Manager : public enable_shared_from_this< Manager > {
public:
~Manager() {
stop(); // Problematic: all weak_ptrs to me are now expired(), yet work() continues a moment.
}
shared_ptr< Managed > create() {
assert( !m_thread.joinable() ); // Mustn't be running, to avoid concurrency issues.
auto managed = make_shared< Managed >( shared_from_this() );
m_managed.push_back( managed );
return managed;
}
void run() {
m_continue = true;
m_thread = thread{ bind( &Manager::work, this ) };
}
void stop() {
m_continue = false;
if( m_thread.joinable() ) {
m_thread.join();
}
}
private:
vector< shared_ptr< Managed >> m_managed;
thread m_thread;
atomic_bool m_continue{ true };
void work() {
while( m_continue ) {
for( const auto& managed : m_managed ) {
managed->doStuff();
}
}
}
};
int main() {
// Create the manager and a bunch of managed objects.
auto manager = make_shared< Manager >();
for( size_t i = 0; i < 10000; ++i ) {
manager->create();
}
// Run for a while.
manager->run();
this_thread::sleep_for( chrono::seconds{ 1 } );
manager.reset(); // Calls manager->stop() indirectly.
return 0;
}
Shared pointers do not solve every resource problem. They solve one particular problem that happens to be easy to fix; in my experience, blindly using shared pointers causes more problems than it fixes.
Shared pointers are about distributing the right to extend the lifetime of some object to an unbounded set of clients. Clients who hold weak pointers are people who want to be able to passively know when the object has gone away. This means they must always check, and once it goes away they have no rights to get it back.
If you have singular ownership, then use a unique pointer not a shared pointer.
If you guarantee the workers do not outlive the manager, give them a raw pointer not a weak pointer.
If you want to clean up before you destroy the object, give the unique pointer a custom deleter that does cleanup before delete.
Then, by the time delete runs, your workers should all be gone. Assert that, and abort if the assertion fails.
If you want non-unique-pointer semantics, wrap your actual manager up as a unique pImpl within a wrapper type that offers pseudo-value semantics, and either use a custom deleter or just have the wrapper's destructor call the pre-destruction cleanup code.
shared_from_this is no longer involved.
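To make the suggestion concrete, here is a minimal sketch of the custom-deleter variant, reusing the Manager/Managed names from the question; the surrounding structure is simplified and assumed rather than taken from any real codebase.
#include <atomic>
#include <memory>
#include <thread>
#include <vector>

class Manager;

// Managed objects hold a raw pointer: the deleter below guarantees the worker
// thread is stopped before the Manager (and the Managed objects) are deleted.
class Managed {
public:
    explicit Managed( Manager* manager ) : m_manager( manager ) {}
    void doStuff() { /* m_manager is valid for the whole lifetime of the thread */ }
private:
    Manager* m_manager;
};

class Manager {
public:
    Managed* create() {
        m_managed.push_back( std::unique_ptr<Managed>( new Managed( this ) ) );
        return m_managed.back().get();
    }
    void run() {
        m_continue = true;
        m_thread = std::thread( &Manager::work, this );
    }
    void stop() {
        m_continue = false;
        if( m_thread.joinable() ) m_thread.join();
    }
private:
    void work() {
        while( m_continue )
            for( const auto& managed : m_managed )
                managed->doStuff();
    }
    std::vector<std::unique_ptr<Managed>> m_managed;
    std::thread m_thread;
    std::atomic_bool m_continue{ false };
};

// Custom deleter: run the pre-destruction cleanup (stop the thread) before delete.
struct ManagerDeleter {
    void operator()( Manager* m ) const {
        m->stop();
        delete m;
    }
};

int main() {
    std::unique_ptr<Manager, ManagerDeleter> manager( new Manager() );
    manager->create();
    manager->run();
    // When manager goes out of scope the deleter calls stop() first,
    // so work() never touches a half-destroyed Manager.
}
Ownership is now singular, there is no weak_ptr to check, and the expiry race disappears because the cleanup runs strictly before destruction begins.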
I am creating a music library app using Qt (C++). It involves a method that does the following jobs in the given sequence:
1. List N audio files by recursively traversing a directory.
2. Read each and every file to collect ID3 tags.
3. Extract artwork images from the files.
4. Save the ID3 tags in the database.
The above set of tasks is extremely resource intensive. For N ~ 1000, it takes around a minute and a half to complete, and during the execution of this sequence the GUI freezes up and doesn't respond well, as I currently use no other threads.
I have seen a few examples of Qt threading, and they more or less show how to do things in parallel as expected, but in those examples achieving parallelism or concurrency is a requirement, as they don't have any other option. In the case of my app, it's a choice whether I use multiple threads or not. The goal is to make sure the GUI stays responsive and interactive during the execution of the resource-intensive task. I would really appreciate any expert advice, ideally with a code template or example in Qt, on performing the resource-intensive task in a different thread.
Code in the main thread:
QStringList files;
QString status;
createLibrary(files, status); //To be done in a different thread
if(status == "complete"){
//do something
}
Thanks a lot for your time!
You could use the QtConcurrent module.
Use QtConcurrent::map() to iterate over the list of files and call a method in a separate thread:
QFuture<void> result = QtConcurrent::map(files, createLibrary);
QFutureWatcher will send a signal when the processing is done:
QFutureWatcher<void> watcher;
connect(&watcher, SIGNAL(finished()),
this, SLOT(processingFinished()));
// Start the computation.
QFuture<void> result = QtConcurrent::map(files, createLibrary);
watcher.setFuture(result);
BTW, because of the many bad files in the wild, the music player Amarok decided to put the ID3 tag scanner in a separate process. See here for more information.
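If the whole createLibrary(files, status) call from the question has to run as one unit rather than per file, QtConcurrent::run is arguably a closer fit than map(). A rough sketch, assuming Qt 5 (where QtConcurrent::run accepts a lambda) and a hypothetical MyWindow class that owns a QFutureWatcher<QString> member called m_watcher:
// Sketch only: MyWindow, m_watcher and scanFinished() are assumed names;
// createLibrary(QStringList&, QString&) is the function from the question.
#include <QtConcurrent>
#include <QFutureWatcher>

void MyWindow::startScan(const QStringList &files)
{
    connect(&m_watcher, SIGNAL(finished()),
            this, SLOT(scanFinished()));

    // Run the whole job in a pool thread; use the status string as the result.
    QFuture<QString> future = QtConcurrent::run([files]() {
        QStringList copy = files;   // local copy in case createLibrary modifies the list
        QString status;
        createLibrary(copy, status);
        return status;
    });
    m_watcher.setFuture(future);
}

void MyWindow::scanFinished()
{
    if (m_watcher.result() == "complete") {
        // safe to touch the GUI here: this slot runs in the GUI thread
    }
}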
My best advice would be to create a subclass of QThread. Pass this subclass a pointer to the directories and give it a pointer to a valid (non-null) view that you want to update, in the following way:
header.h
class SearchAndUpdate : public QThread
{
Q_OBJECT
public:
SearchAndUpdate(QStringList *files, QWidget *widget);
//The QWidget can be replaced with a Layout or a MainWindow or whatever portion
//of your GUI that is updated by the thread. It's not a real awesome move to
//update your GUI from a background thread, so connect to the QThread::finished()
//signal to perform your updates. I just put it in because it can be done.
~SearchAndUpdate();
QMutex mutex;
QStringList *f;
QWidget *w;
bool running;
private:
virtual void run();
};
Then in your implementation for that thread do this:
thread.cpp
SearchAndUpdate::SearchAndUpdate(QStringList *files, QWidget *widget){
    this->f=files;
    this->w=widget;
}
void SearchAndUpdate::run(){
this->running=true;
mutex.lock();
//here is where you do all the work
//create a massive QStringList iterator
//whatever you need to complete your 4 steps.
//you can even try to update your QWidget *w pointer
//although some window managers will yell at you
mutex.unlock();
this->running=false;
this->deleteLater();
}
Then in your GUI thread maintain the valid pointers QStringList *files and SearchAndUpdate *search, then do something like this:
files = new QStringList();
files->append("path/to/file1");
...
files->append("path/to/fileN");
search = new SearchAndUpdate(files,this->ui->qwidgetToUpdate);
connect(search,SIGNAL(finished()),this,SLOT(threadFinished()));
search->start();
...
void threadFinished(){
//update the GUI here and no one will be mad
}
(Pseudo-)Code
Here is a non-compilable code-sketch of the concepts I am having trouble with:
struct Data {};
struct A {};
struct B {};
struct C {};
/* and many many more...*/
template<typename T>
class Listener {
public:
Listener(MyObject* worker):worker(worker)
{ /* do some magic to register with RTI DDS */ };
public:
// This function is used as a callback from RTI DDS, i.e. it will be
// called from other threads when new Data is available
void callBackFunction(Data d)
{
T t = extractFromData(d);
// Option 1: direct function call
// works somewhat, but shows "QObject::startTimer: timers cannot be started
// from another thread" at the console...
worker->doSomeWorkWithData(t); //
// Option 2: Use invokeMethod:
// seems to fail, as the macro expands including '"T"' and that type isn't
// registered with the QMetaType system...
// QMetaObject::invokeMethod(worker,"doSomeGraphicsWork",Qt::AutoConnection,
// Q_ARG(T, t)
// );
// Option 3: use signals slots
// fails as I can't make Listener, a template class, a QObject...
// emit workNeedsToBeDone(t);
}
private:
MyObject* worker;
T extractFromData(Data d){ return T(d);};
};
class MyObject : public QObject {
Q_OBJECT
public Q_SLOTS:
void doSomeWorkWithData(A a); // This one affects some QGraphicsItems.
void doSomeWorkWithData(B b){};
void doSomeWorkWithData(C c){};
public:
MyObject():QObject(nullptr){};
void init()
{
// listeners are not created in the constructor, but they should have the
// same thread affinity as the MyObject instance that creates them...
// (which in this example--and in my actual code--would be the main GUI
// thread...)
new Listener<A>(this);
new Listener<B>(this);
new Listener<C>(this);
};
};
int main(int argc, char** argv)
{
QApplication app(argc, argv);
/* plenty of stuff to set up RTI DDS and other things... */
auto myObject = new MyObject();
/* stuff resulting in the need to separate "construction" and "initialization" */
myObject->init();
return app.exec();
};
Some more details from the actual code:
The Listener in the example is an RTI DataReaderListener; the callback function is onDataAvailable().
What I would like to accomplish
I am trying to write a little distributed program that uses RTI's Connext DDS for communication and Qt5 for the GUI stuff--however, I don't believe those details do matter much as the problem, as far as I understood it, boils down to the following:
I have a QObject-derived object myObject whose thread affinity might or might not be with the main GUI thread (but for simplicity, let's assume that is the case.)
I want that object to react to events which happen in another, non-Qt 3rd-party library (in my example code above, represented by the functions doSomeWorkWithData()).
What I understand so far as to why this is problematic
Disclaimer: As usual, there is always more than one new thing one learns when starting a new project. For me, the new things here are/were RTI's Connext and (apparently) my first time where I myself have to deal with threads.
From reading about threading in Qt (1,2,3,4, and 5 ) it seems to me that
QObjects in general are not thread safe, i.e. I have to be a little careful about things
Using the right way of "communicating" with QObjects should allow me to avoid having to deal with mutexes etc myself, i.e. somebody else (Qt?) can take care of serializing access for me.
As a result, I can't simply have (random) calls to MyClass::doSomeWorkWithData(); I need to serialize them. One, presumably easy, way to do so is to post an event to the event queue myObject lives in, which, when time is available, will trigger the execution of the desired method (MyClass::doSomeWorkWithData() in my case).
What I have tried to make things work
I have confirmed that myObject, when instantiated similarly as in the sample code above, is affiliated with the main GUI thread, i.e. myObject.thread() == QApplication::instance()->thread().
With that given, I have tried three options so far:
Option 1: Directly calling the function
This approach is based upon the fact that
- myObject lives in the GUI thread
- all the created listeners are also affiliated with the GUI thread, as they are created by myObject and inherit its thread that way
This actually results in doSomeWorkWithData() being executed. However, some of those functions manipulate QGraphicsItems, and whenever that is the case I get error messages reading: "QObject::startTimer: timers cannot be started from another thread".
Option 2: Posting an event via QMetaObject::invokeMethod()
Trying to circumvent this problem by properly posting an event for myObject, I tried to mark MyObject::doSomeWorkWithData() with Q_INVOKABLE, but I failed at invoking the method, as I need to pass arguments with Q_ARG. I properly registered and declared my custom types (represented by struct A, etc. in the example), but I failed at the fact that Q_ARG expands to include a literal of the type of the argument, which in the templated case didn't work ("T" isn't a registered or declared type).
Option 3: Trying to use conventional signals and slots
This approach essentially failed directly at the fact that the QMeta system doesn't work with templates, i.e. it seems to me that there simply can't be any templated QObjects.
What I would like help with
After spending about a week on attempting to fix this, reading up on threads (and uncovering some other issues in my code), I would really like to get this done right.
As such, I would really appreciate if :
somebody could show me a generic way of how a QObject's member function can be called via a callback function from another 3rd-party library (or anything else for that matter) from a different, non QThread-controlled, thread.
somebody could explain to me why Option 1 works if I simply don't create a GUI, i.e. do all the same work, just without a QGraphicsScene visualizing it (and the project's app being a QCoreApplication instead of a QApplication, with all the graphics-related work #defined out).
Any, and I mean absolutely any, straw I could grasp on is truly appreciated.
Update
Based on the accepted answer, I altered my code to deal with callbacks from other threads: I introduced a thread check at the beginning of my doSomeWorkWithData() functions:
void doSomeWorkWithData(A a)
{
if( QThread::currentThread() != this->thread() )
{
QMetaObject::invokeMethod( this,"doSomeWorkWithData"
,Qt::QueuedConnection
,Q_ARG(A, a) );
return;
}
/* The actual work this function does would be below here... */
};
Some related thoughts:
I contemplated introducing a QMutexLocker before the if statement, but decided against it: the only part of the function that is potentially used in parallel (anything above the return; in the if statement) is, as far as I understand, thread safe.
Setting the connection type manually to Qt::QueuedConnection: technically, if I understand the documentation correctly, Qt should do the right thing and the default, Qt::AutoConnection, should end up becoming a Qt::QueuedConnection. But since that would always be the case when that statement is reached, I decided to put it in explicitly to remind myself why it is there.
Putting the queuing code directly in the function and not hiding it in an interim function: I could have opted to put the call to invokeMethod in another interim function, say queueDoSomeWorkWithData(), which would be called by the callback in the listener and would then use invokeMethod with a Qt::AutoConnection on doSomeWorkWithData(). I decided against this as there seems to be no way for me to auto-generate this interim function via templates (templates and the meta system were part of the original problem), so "the user" of my code (i.e. the person who implements doSomeWorkWithData(XYZ xyz)) would have to hand-type the interim function as well (as that is how the templated type names are correctly resolved). Including the check in the actual function seems to me to save typing an extra function header, keeps the MyClass interface a little cleaner, and better reminds readers of doSomeWorkWithData() that there might be a threading issue lurking in the dark.
It is ok to call a public function on a subclass of QObject from another thread if you know for certain that the individual function will perform only thread-safe actions.
One nice thing about Qt is that it will handle foreign threads just as well as it handles QThreads. So, one option is to create a threadSafeDoSomeWorkWithData function for each doSomeWorkWithData that does nothing but QMetaObject::invokeMethod() the non-thread-safe one.
public:
void threadSafeDoSomeWorkWithData(A a) {
QMetaObject::invokeMethod(this, "doSomeWorkWithData", Q_ARG(A,a));
}
Q_INVOKABLE void doSomeWorkWithData(A a);
Alternatively, Sergey Tachenov suggests an interesting way of doing more or less the same thing in his answer here. He combines the two functions I suggested into one.
void Obj2::ping() {
if (QThread::currentThread() != this->thread()) {
// not sure how efficient it is
QMetaObject::invokeMethod(this, "ping", Qt::QueuedConnection);
return;
}
// thread unsafe code goes here
}
As to why you see normal behaviour when not creating a GUI: perhaps you're not doing anything else that is unsafe, aside from manipulating GUI objects. Or perhaps that is the only place where your thread-safety problems are obvious.
I have a weird problem regarding the use of threads inside a Firebreath plugin (in this case a FB plugin, but could happen anywhere); I will try to explain:
1) My plugin creates a thread (static), and it receives a pointer to "this" every time the plugin gets added to a page.
2) So now I have a thread with a pointer to the plugin, and I can call its methods.
3) Very nice so far, BUT suppose that I have a button (coded in HTML) which, when pressed, will REMOVE the current plugin, put another one in its place and launch another thread.
I have described my scenario; now for the problem. When a plugin gets added it launches a thread; inside the thread there is a pointer to "this". The first time, it gets fired... while the thread is executing I press the HTML button (so the current plugin is destroyed) and a new one is put in place. The thread from the 1st plugin ends and now returns... but it returns to the 2nd instance of the plugin.
The plugin is an image viewer: the first plugin looks for a picture, it gets removed and a new one is placed, BUT the image from the 1st plugin ends up in the 2nd one. I don't know where to start looking; apparently the pointer holds the address of the plugin (e.g. 12345), the plugin gets removed and instantiated again at the same memory address (12345).
Is there some way to avoid that behavior?
This is the code I have so far:
myPlugin.h
static unsigned __stdcall Thread(void *data); // must be static to be passed to _beginthreadex
unsigned ThreadId;
HANDLE hThread;
myPlugin.cpp
unsigned __stdcall myPlugin::Thread(void *data)
{
    myPlugin* plugin = (myPlugin*) data; // a local variable cannot be named "this"
    plugin->getImage("http://host.com/image.jpg");
    _endthreadex(0); //EDIT: added this missing line to end the thread
    return 0;
}

void myPlugin::onPluginReady(std::string imageUrl)
{
    hThread = (HANDLE)_beginthreadex(NULL, 0, myPlugin::Thread, (void*) this, 0, &ThreadId);
}

void myPlugin::getImage(std::string imageUrl)
{
    //get an image using CURL... //no problem here
}
You need to stop and join the thread in the shutdown() function of your Plugin class; that will be called before things are actually unloaded and that will help avoid the problem.
I would also recommend using boost::thread, since FireBreath already compiles it all in, and that will help simplify some of this; you can hold a weak_ptr in your thread to the plugin class rather than passing in a void*. Of course, either way you'll need to stop and join the thread during the plugin shutdown (and the thread needs to stop quickly or the browser will get cranky about it taking so long).
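A rough sketch of the weak_ptr variant described above, assuming the plugin object is managed by a boost::shared_ptr and can hand one out via boost::enable_shared_from_this; the class layout, the worker function and the URL are illustrative only, not the FireBreath API (apart from shutdown() being the place to stop and join):
// Sketch only: myPlugin's layout and imageWorker are assumed names for illustration.
#include <boost/enable_shared_from_this.hpp>
#include <boost/make_shared.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/thread.hpp>
#include <boost/weak_ptr.hpp>
#include <iostream>
#include <string>

class myPlugin : public boost::enable_shared_from_this<myPlugin> {
public:
    void getImage(const std::string &url) { std::cout << "fetching " << url << std::endl; }
    void onPluginReady(const std::string &imageUrl);
    void shutdown();
private:
    boost::thread m_thread;
};

// The worker holds a weak_ptr instead of a raw "this" pointer.
static void imageWorker(boost::weak_ptr<myPlugin> weakPlugin, std::string imageUrl)
{
    // Re-acquire a strong reference only while the plugin is actually needed.
    if (boost::shared_ptr<myPlugin> plugin = weakPlugin.lock()) {
        plugin->getImage(imageUrl);
    }
    // If lock() fails, this plugin instance is already gone; just return.
}

void myPlugin::onPluginReady(const std::string &imageUrl)
{
    m_thread = boost::thread(&imageWorker,
                             boost::weak_ptr<myPlugin>(shared_from_this()),
                             imageUrl);
}

void myPlugin::shutdown()
{
    // FireBreath calls shutdown() before tearing the plugin down:
    // stop and join the worker here (and keep it short-lived, or the browser complains).
    if (m_thread.joinable()) {
        m_thread.join();
    }
}

int main()
{
    boost::shared_ptr<myPlugin> plugin = boost::make_shared<myPlugin>();
    plugin->onPluginReady("http://host.com/image.jpg");
    plugin->shutdown();
}
Because the worker only upgrades the weak_ptr to a strong reference for as long as it actually needs the plugin, a destroyed plugin (even one recycled at the same memory address) can no longer be reached through a stale raw pointer.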
So I'm trying to use the TPL features in .NET 4.0 and have some code like this (don't laugh):
/// <summary>Fetches a thread along with its posts. Increments the thread viewed counter.</summary>
public Thread ViewThread(int threadId)
{
// Get the thread along with the posts
Thread thread = this.Context.Threads.Include(t => t.Posts)
.FirstOrDefault(t => t.ThreadID == threadId);
// Increment viewed counter
thread.NumViews++;
Task.Factory.StartNew(() =>
{
try {
this.Context.SaveChanges();
}
catch (Exception ex) {
this.Logger.Error("Error viewing thread " + thread.Title, ex);
}
this.Logger.DebugFormat(@"Thread ""{0}"" viewed and incremented.", thread.Title);
});
return thread;
}
So my immediate concerns with the lambda are this.Context (my Entity Framework data context member), this.Logger (logger member) and thread (used in the logger call). Normally, in the QueueUserWorkItem() days, I would think these would need to be passed into the delegate as part of a state object. Are closures going to bail me out of needing to do that?
Another issue is that the type that this routine is in implements IDisposable and thus is in a using statement. So if I do something like...
using (var bl = new ThreadBL()) {
t = bl.ViewThread(threadId);
}
... am I going to create a race between a dispose() call and the TPL getting around to invoking my lambda?
Currently I'm seeing the context save the data back to my database but no logging - no exceptions either. This could be a configuration thing on my part but something about this code feels odd. I don't want to have unhandled exceptions in other threads. Any input is welcome!
As for your question on closures, yes, this is exactly what closures are about. You don't worry about passing state; instead, it is captured for you from any outer context and copied onto a compiler-supplied class, which is also where the closure method will be defined. The compiler does a lot of magic here to make your life simple. If you want to understand more, I highly recommend picking up Jon Skeet's C# in Depth. The chapter on closures is actually available here.
As for your specific implementation, it will not work reliably, mainly because of the exact problem you mentioned: the Task will be scheduled at the end of ViewThread, but potentially not execute before your ThreadBL instance is disposed of.