The multithreaded Qt desktop GUI application I need to develop must read from the serial port on one thread and display the data it reads on another thread.
I am thinking of taking the following design approach.
Subclass a QObject to create a worker. Instantiate this worker object and a QThread. Move the worker to the new thread. Send data to the worker object over queued signal-slot connections.
Is this a good design approach? Or is there a better one?
Let's assume I have a Node.js server program with one API endpoint that does some processing on a video file sent via an HTTP request.
const saveVideoFile = (req, res) => {
  processAndSaveVideoFile(); // can run for a minimum of 10 minutes
  res.send({ status: "video is being processed" });
};
I decided to make use of a worker thread to do this processing, as my machine has 3 cores (core1, core2, core3) and hyperthreading is not enabled.
Assume that my Node.js program is running on core1. When I fire up a single worker thread, will that worker thread run on core2/core3 or on core1?
I read that a worker thread is not the same as a child process. A child process forks a new process, which allows it to be scheduled on whichever core is free (core2 or core3).
I also read that a worker thread shares memory with the main thread. Let's assume I create 2 worker threads (wt1, wt2). Will my Node.js program, wt1, and wt2 all run on the same core, i.e. core1?
Also, in Node.js we have the event loop (main thread) and other threads doing background operations, i.e. I/O. Is it correct to assume that all of these use the resources of a single core (core1)? If that is the case, is creating and using additional worker threads overkill on a Node.js server?
Below is an excerpt from this blog
We can run things in parallel in Node.js. However, we need not to create threads. The operating system and the virtual machine collectively run the I/O in parallel and the JS code then runs in a single thread when it is time to send the data back to the JavaScript code.
I keep reading this same information about Node.js in many articles and video presentations. But what I do not understand is this:
The operating system and the virtual machine collectively run the I/O in parallel
How can the operating system run the I/O requests from the Node.js program in parallel without using any child processes or threads spawned from Node.js? And if those I/O requests are running in parallel, does that mean all 3 cores (core1, core2, core3) will be utilized?
There is a lot of content about Node.js out there, but it doesn't clear up the doubts in my questions above. If you have an idea of how these things actually work, please share the details.
A worker thread in node.js is an actual OS thread running in a different instance of V8. As such, it's entirely up to the operating system to decide how to allocate it among the available CPU cores. If there are cores with available time, it will generally not run on the same core as the main node.js thread while that thread is busy, because the OS spreads busy threads across the various cores.
But again, this is entirely up to the OS and is not something that node.js controls, and the exact strategy for which cores are used will vary by OS. In all modern operating systems, though, the design goal is that available cores are used for the threads that are currently executing. If there are more threads active at once than there are cores, the threads will be time-sliced and all the cores will be active.
Also, in Node.js we have the event loop (main thread) and other threads doing background operations, i.e. I/O. Is it correct to assume that all of these use the resources of a single core (core1)? If that is the case, is creating and using additional worker threads overkill on a Node.js server?
No, it is not correct to assume those threads all use the same core.
A worker thread in node.js has its own event loop. For the most part, it does not share memory. In fact, if you want to share memory, you have to very specifically allocate shared memory (a SharedArrayBuffer) and pass that to the worker thread.
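For illustration, here is a minimal sketch of what that looks like (the file names and the counter logic are just placeholders):

// main.js - memory is only shared because we explicitly pass a SharedArrayBuffer
const { Worker } = require('worker_threads');

const shared = new SharedArrayBuffer(4);   // explicitly allocated shared memory
const counter = new Int32Array(shared);

const worker = new Worker('./worker.js', { workerData: shared });
worker.on('exit', () => {
  // The main thread sees the worker's writes because the underlying buffer is shared.
  console.log('counter after worker exit:', Atomics.load(counter, 0));
});

// worker.js
const { workerData } = require('worker_threads');
const view = new Int32Array(workerData);
for (let i = 0; i < 1000; i++) Atomics.add(view, 0, 1);

Ordinary objects passed in workerData are structured-cloned (copied) instead, which is why sharing memory has to be this explicit.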
Is it overkill? Well, it depends upon what you're doing. There are very useful things to do with workerThreads and there are things that they would not be necessary for.
The operating system and the virtual machine collectively run the I/O in parallel
I/O in node.js is either asynchronous at the OS level (such as networking) or run in separate threads (such as disk I/O). That means it runs separately from the main thread in node.js that runs your Javascript and can run in parallel with it, synchronizing only at the completion of an event. "Parallel" in this case means that both make progress at the same time. If there are multiple cores, then they can truly be running at exactly the same time. If there were only one core, then the OS would timeslice between the various threads and they would both make progress (in an interleaved fashion that seems parallel, but really they are taking turns).
How can the operating system run the I/O requests from the Node.js program in parallel without using any child processes or threads spawned from Node.js? And if those I/O requests are running in parallel, does that mean all 3 cores (core1, core2, core3) will be utilized?
The OS has its own threads for managing things like a network interface or a disk interface. The job of those threads is to interface with the hardware and bring data to an appropriate application or take data from the application and send it to the hardware. These are OS-level threads that exist independently of node.js. Yes, other cores can be used by those OS-level threads. It is important to realize that many operations such as networking are inherently non-blocking. Thus, if you're waiting for some data to arrive on a network interface, you don't need to have a thread doing something the whole time.
I want to add that it appears from your questions that you've combined questions about several different things. Mentioned in your questions are:
Worker Threads
Internal node.js threads
Operating system threads
These are all different things.
A worker thread is a new thread you can start to run specific pieces of Javascript in another thread, so you can have more than one Javascript thread running at the same time. In node.js, this is done by creating a whole new instance of V8, setting up a whole new global environment and module environment, and using almost entirely separate memory.
Internal node.js threads are used by node.js as part of implementing its event loop and its standard library. Specifically, disk I/O and some crypto operations are run in internal native threads and they communicate with your Javascript via events/callbacks through the event loop.
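As a small illustration (the file name here is made up), both calls below are handed off to those internal threads and only their callbacks run on your Javascript thread:

const fs = require('fs');
const crypto = require('crypto');

// Disk I/O runs on an internal (libuv) thread; the callback later runs on the JS thread.
fs.readFile('./some-large-file.bin', (err, data) => {
  console.log('file read finished');
});

// Some crypto work is also offloaded to the internal thread pool.
crypto.pbkdf2('secret', 'salt', 100000, 64, 'sha512', (err, key) => {
  console.log('hashing finished');
});

console.log('the single JS thread was never blocked');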
Operating system threads are threads that the OS uses to implement its own system APIs. Since the OS is responsible for lots of things, these threads can have many different uses. Depending upon native implementations, they may be used to facilitate things like disk I/O or networking I/O. These threads are the responsibility of the OS to create and use and are not directly controlled by node.js.
Some additional questions asked in comments:
What is the difference between the workerThread and childProcess concepts in nodejs? Is a childProcess just a workerThread without shared memory?
A child process can be any type of program - it does not have to be a node.js program. A worker thread is node.js code.
A worker thread can share memory if shared memory (such as a SharedArrayBuffer) is specifically allocated and passed to the worker thread, and if access to it is carefully managed for concurrency issues.
Copying memory back and forth between a worker thread and the main thread is more efficient than doing so with a child process.
If main program exits, worker threads will exit. If main program exits, child process can be configured to exit or to continue.
If worker thread calls process.exit(), the main thread will exit too. If child program exits, it cannot cause main program to exit without main program's cooperation.
How is nodejs able to magically interact with OS-level threads without nodejs itself creating any threads? I need additional details on this; your explanation is the common one present in most places, including the blog I shared.
nodejs just calls an OS API. It's the OS API that manages communicating with its own threads (if threads are needed for that specific OS API). How it does that communication internally is implementation dependent and will vary by OS. It will even vary by OS which OS APIs use threads and which don't.
I'm a little confused about multithreading and asynchronous code in JS. What is the difference between a cluster, a stream, a child process, and a worker thread?
The first thing to remember about multithreading in Node.js is that, apart from the worker_threads module, there is no concept of threading in user-space, and as such you cannot write ordinary code that makes use of threads. A node program otherwise runs as a single-threaded program (in user-space).
Since a node program is a single thread, and runs as a single process, it uses only a single CPU. Most modern processors have multiple CPUs, and in order to make use of all of these CPUs and provide better throughput, you can start the same node program as a cluster.
The cluster module of node allows you to start a node program, and the first instance launched is the master instance. The master allows you to spawn new workers as separate processes (not threads) using the cluster.fork() method. The actual work that is to be done by the node program is done by the workers. The example in the node docs demonstrates this perfectly.
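Condensed, that pattern looks roughly like this (the port number is arbitrary):

const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  // Master: fork one worker process (not thread) per CPU core.
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }
  cluster.on('exit', (worker) => console.log(`worker ${worker.process.pid} died`));
} else {
  // Each worker is a separate process; they all share the same listening port.
  http.createServer((req, res) => res.end(`handled by pid ${process.pid}`))
      .listen(8000);
}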
A child process is a process that is spawned from the current process, with an IPC channel established between them so they can communicate with each other. The master and workers I described under cluster are an example of child processes. The child_process module in node allows you to spawn custom child processes as you require.
Streams are not related to multithreading or multiple processes at all. Streams are just a way to handle large amounts of data without loading all of it into working memory at the same time. For example, suppose you want to read a 10GB log file and your server only has 4GB of memory. Trying to load the file using fs.readFile will crash your process. Instead you use fs.createReadStream and process the file in smaller chunks that fit into memory.
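A rough sketch of that (the file path is made up):

const fs = require('fs');

const stream = fs.createReadStream('./huge.log', { encoding: 'utf8' });
let lines = 0;

stream.on('data', (chunk) => {
  // Only the current chunk is held in memory, never the whole file.
  lines += chunk.split('\n').length - 1;
});
stream.on('end', () => console.log('total lines:', lines));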
Hope this explains. For further details you really should read the node docs.
This is a little vague, so I'm just going to give an overview.
Streams are really just data streams like in any other language, similar to iostreams in C++, where you get user input or other types of data. They're usually wrapped by another class, so you often don't know you're using a stream. You won't deal with these directly unless you're building a new stream type.
Child processes, worker threads, and clusters are all ways of utilizing multi-core processing in Node applications.
Worker threads are basic multithreading the Node way, with each thread having a way to communicate with the parent, and shared memory possible between threads. You point a worker at a script and pass it data, and you can listen for a message or exit event when the thread is done processing.
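For example, a minimal sketch (the worker script name and the summing logic are just placeholders):

// main.js
const { Worker } = require('worker_threads');

const worker = new Worker('./sum-worker.js', { workerData: [1, 2, 3, 4] });
worker.on('message', (result) => console.log('result from worker:', result));
worker.on('error', (err) => console.error(err));

// sum-worker.js
const { parentPort, workerData } = require('worker_threads');
parentPort.postMessage(workerData.reduce((a, b) => a + b, 0));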
Clusters are more for network sharing. Often used behind a master listener port, a master app will listen for connections, then assign them in a round-robin manner to each cluster worker (a separate process). They share the server port(s) across multiple processors to even out the load.
Child processes are a way to create a new process, similar to popen. These can be asynchronous or synchronous (non-blocking or blocking the Node event loop), and can send to and receive from the parent process via stdout/stderr and stdin, respectively. The parent can register listeners on each child process for updates. You can run an external command, a file, or another node module as a child process. They generally do not share memory.
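For example (the spawned command is arbitrary):

const { spawn } = require('child_process');

const child = spawn('ls', ['-l']);   // can be any external program, not just node
child.stdout.on('data', (data) => console.log(`stdout: ${data}`));
child.stderr.on('data', (data) => console.error(`stderr: ${data}`));
child.on('close', (code) => console.log(`child exited with code ${code}`));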
I'd suggest reading the documentation yourself and coming back with any specific questions you have. You won't get much with vague questions like this; it makes it seem like you didn't do your own part of the work beforehand.
Documentation:
Streams
Worker Threads
Clusters
Child Processes
I understand how Node.js works with a single thread. Mostly it uses asynchronous methods/modules in order to keep the main runtime thread as free as possible.
However, some of the asynchronous modules internally use threads to do their job. An example of this is reading a file or other CPU-intensive tasks. This is done in the background and is abstracted away from the Node developer.
My question is: how does Socket.IO work internally? Does it use threads like the above examples? Does it use a separate thread per connection? If so, does that mean we would have 1000 threads if we have 1000 connected clients?
Node does not use the thread pool (or separate threads) for sockets; instead it uses whatever platform-specific mechanism is available for polling sockets for data (e.g. epoll on Linux, kqueue on OS X (IIRC), I/O completion ports on Windows, etc.) on the main thread.
Socket.io works on the event loop like most node applications. No tricky thread business AFAIK. You can check out the source yourself here: https://github.com/Automattic/socket.io
We have a DLL that provides an API for a USB device we make that can appear as a USB CDC com port. We actually use a custom driver on windows for best performance along with async i/o, but we have also used serial port async file i/o in the past with reasonable success as well.
Latency is very important in this API when it is communicating with our device, so we have structured our library so that when applications make API calls to execute commands on the device, those commands turn directly into writes on the API caller's thread so that there is no waiting for a context switch. The library also maintains a listening thread which is always waiting using wait objects on an async read for new responses. These responses get parsed and inserted into thread-safe queues for the API user to read at their convenience.
So basically, we do most of our writing in the API caller's thread, and all of our reading in a listening thread. I have tried porting a version of our code over to using QSerialPort instead of native serial file i/o for Windows and OSX, but I am running into an error whenever I try to write() from the caller's thread (the QSerialPort is created in the listening thread):
QObject: Cannot create children for a parent that is in a different thread.
which seems to be due to the creation of another QObject-based WriteOverlappedCompletionNotifier for the notifiers pool used by QSerialPortPrivate::startAsyncWrite().
Is the current 5.2 version of QSerialPort limited to only doing reads and writes on the same thread? This seems very unfortunate as the underlying operating systems do not have any such thread limitations for serial port file i/o. As far as I can tell, the issue mainly has to do with the fact that all of QSerialPort's notifier classes are based on QObject.
Does anyone have a good work around to this? I might try building my own QSerialPort that uses notifiers not based on QObject to see how far that gets me. The only real advantage QObject seems to be giving here is in the destruction of the notifiers when the port closes.
Minimal Impact Solution
You're free to inspect the QSerialPort and QIODevice code and see what would need to change to make the write method(s) thread-safe for access from one thread only. The notifiers don't need to be children of the QSerialPort at all; they could be added to a list of pointers that's cleaned up upon destruction.
My guess is that perhaps no other changes are necessary to the mainline code, and only mutex protection is needed for access to error state, but you'd need to confirm that. This would have lowest impact on your code.
If you care about release integrity, you should be compiling Qt yourself anyway, and you should be having it as a part of your own source code repository, too. So none of this should be any problem at all.
On the Performance
"those commands turn directly into writes on the API caller's thread so that there is no waiting for a context switch" Modern machines are multicore and multiple threads can certainly run in parallel without any context switching. The underlying issue is, though: why bother? If you need hard-realtime guarantees, you need a hard-realtime system. Otherwise, nothing in your system should care about such minuscule latency. If you're doing this only to make the GUI feel responsive, there's really no point to such overcomplication.
A Comms Thread Approach
What I do, with plenty of success and excellent performance, is to have the communications protocol and the communications port in the same, dedicated thread, and the users in either the GUI thread or yet other thread(s). The communications port is generally a QIODevice, like QTcpSocket, QSerialPort, QLocalSocket, etc. Since the communications protocol object is "just" a QObject, it can also live, with the port, in the GUI thread for demonstration purposes - it's designed fully asynchronously anyway, and doesn't block for anything but the most trivial of computations.
The communications protocol is queuing multiple requests for execution. Even on a single-core machine, once the GUI thread is done submitting all of the requests, the further execution is all in the communications thread.
The QSerialPort implementation uses asynchronous OS APIs. There's little to no benefit to further processing those async replies on separate threads. Those operations have very low overhead and you will not gain anything measurable in your latency by trying to do so. Remember: this is not your code, but merely code that pushes bytes between buffers. Yes, the context switch overhead may be there on heavily loaded or single-core systems, but unless you can measure the difference between its presence and absence, you're fighting imaginary problems.
It is possible to use any QObject from multiple threads, of course, as long as you serialize the access to it via the event queue mutex. This is done for you whenever you use the QMetaObject::invokeMethod or signal-slot connections.
So, add a trivial wrapper around QSerialPort that exposes the write as a thread-safe method. Internally, it should use a signal-slot connection. You can call this thread-safe write from any thread. The overhead in such a call is a mutex lock and 2+n malloc/free calls, where n is the non-zero number of arguments.
In your wrapper, you can also process the readyRead signal, and emit a signal with received data. That signal can be processed by a QObject living in another thread.
Overall, if you do the measurements correctly, and if your port thread's implementation is correct, you should find no benefit whatsoever to all this complication.
If your communications protocol does heavy data processing, this should be factored out. It could go into a separate QObject that can then run on its own thread. Or, it can be simply done using dedicated functors that are executed by QtConcurrent::run.
What if you use QSerialPort to open and configure the serial port, and QSocketNotifier to monitor for read activity (and other QSocketNotifier instances for write completion and error handling, if necessary)?
QSerialPort::handle should give you the file descriptor you need. On Windows, if that function returns a Windows HANDLE, you can use _open_osfhandle to get a file descriptor.
As a follow up, shortly after this discussion I did implement my own thread-safe serial port code for POSIX systems using select() and the like and it is working well on multiple threads in conjunction with Qt and non-Qt applications alike. Basically, I have abandoned using QtSerialPort at all.
More specifically, my application is a network application, a kind of hub to which different endpoints connect and communicate. We need a graphical user interface to monitor the behavior of the participants in the hub, and so on.
Provided, of course, that appropriate inter-thread communication is used (for example, when updating the UI thread from another thread), does it matter whether the GUI thread is the main thread or not?
Up until now, my GUI thread was a separate thread launched from my main thread. However, a colleague told me that this was wrong.
Does anyone have lessons learned or best practices that you could share with me on this subject?
Many thanks
Maat
What do you mean by "the main thread"?
If you mean "the thread which calls main method", it doesn't matter.
If you mean "the thread which does important work for the application", it should definitely not be the same as GUI thread (which should never run any long-running methods or wait for anything except GUI events).