QtConcurrent threading is slow!! What am I doing wrong? - multithreading

Why is my QtConcurrent::run() call just as slow as calling the member function directly through the object?
(Ex: QtConcurrent::run(&db, &DBConnect::loadPhoneNumbers) is just as slow as calling db.loadPhoneNumbers())
Read below for further explanation.
I've been trying to create a thread via QtConcurrent::run to help speed up data being sent to a SQL database table. I am taking a member variable which is a QMap and iterating through it to send each key+value to the database.
Member function for the QtConcurrent::run() call:
void DBConnect::loadPhoneNumbers()
{
    // m_phoneNumbers is a private QMap member variable in DBConnect
    qDebug() << "\t[!] Items to send: " << m_phoneNumbers.size();
    QSqlQuery query;
    qDebug() << "\t[!] Using loadPhoneNumbers thread: " << QThread::currentThread();
    qDebug() << "\t[!] Ideal Num of Threads: " << QThread::idealThreadCount();
    bool isLoaded = false;
    QMap<QString, QString>::const_iterator tmp = m_phoneNumbers.constBegin();
    while (tmp != m_phoneNumbers.constEnd())
    {
        isLoaded = query.exec(QString("INSERT INTO " + m_mtable + " VALUES('%1','%2')").arg(tmp.key()).arg(tmp.value()));
        if (isLoaded == false)
        {
            qDebug() << "\r\r[X] ERROR: Couldn't load number " << tmp.key() << " into table " << m_mtable;
            qDebug() << query.lastError().text();
        }
        ++tmp;
    }
}
main.cpp section that calls the thread
DBConnect db("QODBC", myINI.getSQLServer(), C_DBASE, myINI.getMTable(), myINI.getBTable());
db.startConnect();
// ...more code here
qDebug() << "\n[*] Using main thread: " << QThread::currentThread() << endl;
// ...two QtConcurrent::run() threads started and finished here (not shown)
qDebug() << "\n[*] Sending numbers to Database...";
QFuture<void> dbFuture = QtConcurrent::run(&db, &DBConnect::loadPhoneNumbers);
dbFuture.waitForFinished();
My understanding of the situation
From my understanding, this task should run in a pool of threads separate from the main thread. That is not what I am seeing (note there are 2 other QtConcurrent::run() calls before this one, all left to finish before continuing to the database call).
Now I thought about using QtConcurrent::map() / mapped() but couldn't get either to work properly with a QMap. (Couldn't find any examples to help out with that either, but that is beside the matter... just an FYI in case someone asks why I didn't use them.)
I have been doing some "debug" work to find out what's happening, and in my tests I use QThread::currentThread() to find which thread I am currently making a call from. This is what is happening for the various threads in my program. (All QtConcurrent::run() calls are made in main.cpp, FYI... not sure if that makes a difference.)
Check what is main thread: on QThread(0x5d2cd0)
Run thread 1: on QThread(0x5dd238, name = "Thread (pooled)")
Run thread 2: on QThread(0x5d2cd0)
Run thread 3 (loadPhoneNumbers function): on QThread(0x5d2cd0)
As seen above, other than the first QtConcurrent::run() call, everything else is on the main thread (o.O)
Questions:
From my understanding, all my tasks (all QtConcurrent::run calls) should be on their own threads (only the first one is). Is that true, or am I missing something?
Second, is my loadPhoneNumbers() member function thread-safe? (Since I am not altering anything, from what I can see.)
Biggest question:
Why is my loadPhoneNumbers() QtConcurrent::run call just as slow as calling the member function directly? (Ex: db.loadPhoneNumbers() is just as slow as the QtConcurrent::run() version.)
Any help is much appreciated!

Threads don't magically speed things up; they just let you continue doing other work while the task runs in the background. When you call waitForFinished(), your main thread won't continue until the loadPhoneNumbers task is finished, essentially negating that advantage. Depending on the implementation, that may be why your currentThread() shows the same thread as main: the wait is already happening.
Probably more significant in terms of speed would be to build a single query that inserts all the values in the list, rather than a separate query for each value.

According to QtSql documentation:
A connection can only be used from within the thread that created it.
Moving connections between threads or creating queries from a
different thread is not supported.
It works anyway because ODBC itself supports multithreaded access to a single ODBC handle. But since you are only using one connection, all queries are probably serialized by ODBC as if there was only a single thread (see for example what Oracle's ODBC driver does).
waitForFinished() calls a private function stealRunnable() that, as its name implies, takes a not-yet-started task from the QFuture queue and runs it in the current thread.

Related

When the main exits where does the console output go?

#include <iostream>
#include <thread>
using namespace std;

void func()
{
    for (int i = 0; i < 10000; i++)
        cout << "Print" << endl;
}

int main()
{
    thread t(func);
    t.detach();
    cout << "Exit" << endl;
    return 0;
}
In the above code, when main exits, where does the "Print" text go, since there is no longer an output stream? Is there some dummy stream that absorbs data which has no use?
When main exits it calls exit, which terminates all threads, detached or not. This is because exit terminates the entire process.
The C++ runtime runs main as exit(main(argc, argv)), so that returning from main causes exit to be called.
You can terminate your main thread, if you wish, by calling pthread_exit. In this case the main thread will not return from main and will not call exit. The application will keep running until some other thread calls exit or all threads terminate (or the application crashes). This is how it works on Linux; I am not sure about Windows.
The std::cout object and the other standard streams are available at least until exit is called. These streams are initialized using the Schwarz Counter idiom, which makes sure they get initialized before their first use and destroyed after the last user is gone. In other words, if you have a global object with a constructor and destructor, which gets initialized before main is entered and destroyed after (when exit is called), the standard streams are still available in that global object's destructor. Basically, there is a reference counter associated with each standard stream; each translation unit (object file) increments this reference counter on startup and decrements it on termination.
ISO/IEC 14882:2011(E) says:
27.4 Standard iostream objects
27.4.1.2 The objects [the standard streams] are constructed and the associations are established at some time prior to or during the first time an object of class ios_base::Init is constructed, and in any case before the body of main begins execution†. The objects are not destroyed during program execution. The results of including <iostream> in a translation unit shall be as if <iostream> defined an instance of ios_base::Init with static storage duration. Similarly, the entire program shall behave as if there were at least one instance of ios_base::Init with static storage duration.
† Constructors and destructors for static objects can access these objects to read input from stdin or write output to stdout or stderr.

Limit number of concurrent thread in a thread pool

In my code I have a loop; inside this loop I send several requests to a remote web service. The WS provider said: "The web service can host at most n threads", so I need to cap my code since I can't send n+1 threads.
If I have to send m threads, I would like the first n threads to be executed immediately, and as soon as one of them completes, a new thread (one of the remaining m-n) to be started, and so on, until all m threads have executed.
I have thought of a thread pool with the maximum thread count explicitly set to n. Is this enough?
For this I would avoid using multiple threads at all; instead, wrap the entire loop up so it can run on a single thread. However, if you do want to launch multiple threads using a thread pool, then I would use the Semaphore class to enforce the required thread limit; here's how...
A semaphore is like a mean night club bouncer: it has been given a club capacity and is not allowed to exceed it. Once the club is full, no one else can enter, and a queue builds up outside. Then, as one person leaves, another can enter (analogy thanks to J. Albahari).
A Semaphore with a value of one is equivalent to a Mutex or Lock except that the Semaphore has no owner so that it is thread ignorant. Any thread can call Release on a Semaphore whereas with a Mutex/Lock only the thread that obtained the Mutex/Lock can release it.
Now, for your case we are able to use Semaphores to limit concurrency and prevent too many threads from executing a particular piece of code at once. In the following example five threads try to enter a night club that only allows entry to three...
class BadAssClub
{
    static SemaphoreSlim sem = new SemaphoreSlim(3);

    static void Main()
    {
        for (int i = 1; i <= 5; i++)
            new Thread(Enter).Start(i);
    }

    // Enforce only three threads running this method at once.
    static void Enter(object id)
    {
        int i = (int)id;
        try
        {
            Console.WriteLine(i + " wants to enter.");
            sem.Wait();
            Console.WriteLine(i + " is in!");
            Thread.Sleep(1000 * i);
            Console.WriteLine(i + " is leaving...");
        }
        finally
        {
            sem.Release();
        }
    }
}
I hope this helps.
Edit. You can also use the ThreadPool.SetMaxThreads method. It restricts the number of threads allowed to run in the thread pool, but it does this 'globally' for the thread pool itself. This means that if your application runs SQL queries or other library methods on the pool, new threads will not be spun up because of this limit. If that is not a concern for you, SetMaxThreads is fine; if you want to limit a particular method, however, it is safer to use semaphores.

Long-running / blocking operations in boost asio handlers

Current Situation
I implemented a TCP server using boost.asio which currently uses a single io_service object on which I call the run method from a single thread.
So far the server was able to answer the requests of the clients immediately, since it had all necessary information in the memory (no long-running operations in the receive handler were necessary).
Problem
Now requirements have changed and I need to get some information out of a database (with ODBC) - which is basically a long-running blocking operation - in order to create the response for the clients.
I see several approaches, but I don't know which one is best (and there are probably even more approaches):
First Approach
I could keep the long running operations in the handlers, and simply call io_service.run() from multiple threads. I guess I would use as many threads as I have CPU cores available?
While this approach would be easy to implement, I don't think it would give the best performance, because the limited number of threads would be idle most of the time (database access is more of an I/O-bound operation than a compute-bound one).
Second Approach
In section 6 of this document it says:
Use threads for long running tasks
A variant of the single-threaded design, this design still uses a single io_service::run() thread for implementing protocol logic. Long running or blocking tasks are passed to a background thread and, once completed, the result is posted back to the io_service::run() thread.
This sounds promising, but I don't know how to implement that. Can anyone provide some code snippet / example for this approach?
Third Approach
Boris Schäling explains in section 7.5 of his boost introduction how to extend boost.asio with custom services.
This looks like a lot of work. Does this approach have any benefits compared to the other approaches?
The approaches are not explicitly mutually exclusive. I often see a combination of the first and second:
One or more thread are processing network I/O in one io_service.
Long running or blocking tasks are posted into a different io_service. This io_service functions as a thread pool that will not interfere with the threads handling network I/O. Alternatively, one could spawn a detached thread every time a long running or blocking task is needed; however, the overhead of thread creation/destruction may have a noticeable impact.
This answer provides a thread pool implementation. Additionally, here is a basic example that tries to emphasize the interaction between the two io_services.
#include <iostream>
#include <sstream>
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/chrono.hpp>
#include <boost/optional.hpp>
#include <boost/thread.hpp>

/// @brief The background service functions as a thread pool where
///        long-standing blocking operations may occur without affecting
///        the network event loop.
boost::asio::io_service background_service;

/// @brief The main io_service handles network operations.
boost::asio::io_service io_service;
boost::optional<boost::asio::io_service::work> work;

/// @brief ODBC blocking operation.
///
/// @param data Data to use for the query.
/// @param handler Handler to invoke upon completion of the operation.
template <typename Handler>
void query_odbc(unsigned int data, Handler handler)
{
    std::cout << "in background service, start querying odbc\n";
    std::cout.flush();
    // Mimic busy work.
    boost::this_thread::sleep_for(boost::chrono::seconds(5));
    std::cout << "in background service, posting odbc result to main service\n";
    std::cout.flush();
    io_service.post(boost::bind(handler, data * 2));
}

/// @brief Functions as a continuation of handle_read, invoked with the
///        result from ODBC.
void handle_read_odbc(unsigned int result)
{
    std::stringstream stream;
    stream << "in main service, got " << result << " from odbc.\n";
    std::cout << stream.str();
    std::cout.flush();
    // Allow the io_service to stop in this example.
    work = boost::none;
}

/// @brief Mocked-up read handler that posts work into the background
///        service.
void handle_read(const boost::system::error_code& error,
                 std::size_t bytes_transferred)
{
    std::cout << "in main service, need to query odbc" << std::endl;
    typedef void (*handler_type)(unsigned int);
    background_service.post(boost::bind(&query_odbc<handler_type>,
        21,                 // data
        &handle_read_odbc)  // handler
    );
    // Keep the io_service event loop running in this example.
    work = boost::in_place(boost::ref(io_service));
}

/// @brief Loop to show concurrency.
void print_loop(unsigned int iteration)
{
    if (!iteration) return;
    std::cout << "  in main service, doing work.\n";
    std::cout.flush();
    boost::this_thread::sleep_for(boost::chrono::seconds(1));
    io_service.post(boost::bind(&print_loop, --iteration));
}

int main()
{
    boost::optional<boost::asio::io_service::work> background_work(
        boost::in_place(boost::ref(background_service)));

    // Dedicate 3 threads to performing long-standing blocking operations.
    boost::thread_group background_threads;
    for (std::size_t i = 0; i < 3; ++i)
        background_threads.create_thread(
            boost::bind(&boost::asio::io_service::run, &background_service));

    // Post a mocked-up 'handle read' handler into the main io_service.
    io_service.post(boost::bind(&handle_read,
        make_error_code(boost::system::errc::success), 0));

    // Post a mock-up loop into the io_service to show concurrency.
    io_service.post(boost::bind(&print_loop, 5));

    // Run the main io_service.
    io_service.run();

    // Clean up the background service.
    background_work = boost::none;
    background_threads.join_all();
}
And the output:
in main service, need to query odbc
in main service, doing work.
in background service, start querying odbc
in main service, doing work.
in main service, doing work.
in main service, doing work.
in main service, doing work.
in background service, posting odbc result to main service
in main service, got 42 from odbc.
Note that the single thread processing the main io_service posts work into the background_service, and then continues to process its event loop while the background_service blocks. Once the background_service gets a result, it posts a handler into the main io_service.
We have similar long-running tasks in our server (a legacy protocol with storages), so our server runs 200 threads to avoid blocking the service (yes, 200 threads are running io_service::run). It's not a great design, but it works well for now.
The only problem we had was with asio::strand, which uses so-called "implementations" that get locked while a handler is being called. We solved this by increasing the number of strand buckets and "detaching" tasks via io_service::post without the strand wrap.
Some tasks may run for seconds or even minutes, and this works without issues at the moment.

TBB ThreadingBuildingBlocks strange behaviour

My question: why does my program freeze if I use "read only" const_accessors?
It seems to be locking up. From the API description it seems to be fine to have one accessor and multiple const_accessors (one writer, many readers). Maybe somebody can tell me a different story.
The goal I am trying to achieve is to use this concurrent hash map and make it available to 10-200 threads so that they can look up and add/delete information. If you have a better solution than the one I'm currently using, you are also welcome to post alternatives.
tbb::size_t hashInitSize = 1200000;
concurrent_hash_map<long int, char*> hashmap(hashInitSize);
cout << hashmap.bucket_count() << std::endl;

long int l = 200;
long int c = 201;
concurrent_hash_map<long int, char*>::accessor o;
concurrent_hash_map<long int, char*>::const_accessor t;
concurrent_hash_map<long int, char*>::const_accessor h;

cout << "Trying to find 200 " << hashmap.find(t, 200) << std::endl;
hashmap.insert(o, l);
o->second = "testother";
Page 43 of the TBB Community Tutorial Guide describes the concept of accessors.
From the TBB reference manual:
An accessor acts as a smart pointer to a pair in a concurrent_hash_map. It holds an implicit lock on a pair until the instance is destroyed or method release is called on the accessor.
Accessors acquire a lock when they are used. Multiple accessors can exist at the same time. However, if a program uses multiple accessors concurrently and does not release the locks, it is likely to deadlock.
To avoid deadlock, release the lock on a hash map entry once you are done accessing it.

What is the reason for QProcess error status 5?

I have multiple threads running the following QProcess. Randomly they fail with error state 5. The Qt docs do not give any more details. Does anyone have a clue where that error could come from? Thank you very much.
extCmd = new QProcess(this);
QString cmd = "/usr/bin/php";
QStringList argStr;
argStr << "/bin/sleep" << "10"; // changed to an always-working command
extCmd->start(cmd, argStr);
bool suc = extCmd->waitForFinished(-1);
if (!suc) {
    qDebug() << "finishing failed error="
             << extCmd->error()
             << extCmd->errorString();
}
Gives me the output:
finishing failed error= 5 "Unknown error"
Tangential to your problem is the fact that you should not be starting a thread for each process. A QProcess emits a finished(int code, QProcess::ExitStatus status) signal when it's done. It will also emit started() and error() upon successful and unsuccessful startup, respectively. Connect all three signals to slots in a QObject, then start the process and deal with the results in the slots. You won't need any extra threads.
If you get a started() signal, then you can be sure that the process's file name was correct, and the process was started. Whatever exit code you get from finished(int) is then indicative of what the process did, perhaps in response to potentially invalid arguments you might have passed to it. If you get a error() signal, the process has failed to start because you gave a wrong filename to QProcess::start(), or you don't have correct permissions.
You should not be writing synchronous code where things happen asynchronously. Synchronous code is code that blocks waiting for a particular thing to happen, like the call to waitForFinished above. I wish there were a Qt configuration flag that disabled all those leftover synchronous blocking APIs, just like there's a flag to disable/enable Qt 3 support APIs. The mere availability of those blocking APIs promotes horrible hacks like the code above; they should be disabled by default, IMHO. Just as there should be a check against moving a QThread (or a class derived from it) to another thread. Their use is also a sign of bad design in every example of publicly available code I could find, and I did a rather thorough search to convince myself I wasn't crazy or something.
The only reasonable use I recall for a waitXxx method in Qt is waiting for a QThread to finish. Even then, this should only be called from within ~QThread, so as to prevent the QThread from being destroyed while the thread is still running.
