Qt: How to cancel ALL processing of a QtConcurrent::map()?

I am using Qt5 on the Windows 7 platform.
In my current application I am using QtConcurrent to process all items in a container.
When I decide to exit the application, I call QFuture::cancel().
As per the documentation (http://doc.qt.io/qt-5/qfuture.html#cancel), not all asynchronous computations are canceled. Presumably only the ones that have not yet started are canceled, while the running/ongoing computations are allowed to continue?
If the above assumption is correct and QFuture::cancel() is not sufficient, what should I do to also stop the running (ongoing) processing and exit the application gracefully?

You cannot pre-emptively and gracefully terminate asynchronous operations. The operations themselves must support co-operative cancellation.
In other words, it's impossible for any framework, library, or operating system to provide this functionality for you. You must handle cancelling operations that have already started yourself.
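A minimal sketch of the co-operative pattern, written here in TypeScript for illustration (with QtConcurrent::map you would typically check a shared atomic flag, e.g. a std::atomic&lt;bool&gt; or QAtomicInt, inside the mapped functor and return early; all names below are illustrative, not Qt API):

```typescript
// Co-operative cancellation: the work itself polls a flag between items and
// stops early. The AbortController stands in for whatever shared flag the
// worker code checks (illustrative sketch).
async function processAll(items: number[], signal: AbortSignal): Promise<void> {
  for (const item of items) {
    if (signal.aborted) {
      console.log("cancel requested, stopping early");
      return;
    }
    await new Promise((r) => setTimeout(r, 100)); // simulate one unit of work
    console.log(`processed item ${item}`);
  }
}

const controller = new AbortController();
void processAll([1, 2, 3, 4, 5], controller.signal);
setTimeout(() => controller.abort(), 250); // e.g. the user chose to exit
```

The key point is that cancellation happens at the boundaries the work itself chooses to check, which is exactly why no framework can impose it from outside.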

Related

Is there a Node.js API that allows storing the currently running Node process and resuming it later?

Googling for it results in many "how to persist data in a node app" articles, but I'm looking for a way to store the program counter, memory state, event loop, call stack, etc. in persistent storage and resume it later.
Benefits: if you see the runtime (a server, container, serverless function) is about to terminate, then instead of using business logic to pause and resume (custom work), use the same way operating systems handle multiple processes/threads: store everything, then resume it later from different infrastructure (but with identical specs).
I'm sure there is something like this, but I simply can't find the right search term.
P.S. This might be an OS feature I'm looking for and not Node specific, but if it can be done from within Node's API (e.g. V8 internals) I can basically get an unlimited / long-running lambda ;) (which is a bad idea, but I want to know if it's possible).
(V8 developer here.)
V8 definitely doesn't support this.
What V8 does support is taking a heap snapshot, and deserializing that on renewed process startup (and I believe Node is making use of this functionality). That's quite different from freezing an entire running process though.
I'm not sure what you mean by "the same way operating systems handle multiple processes / threads". Operating systems don't usually let you snapshot a process and transfer it to a different machine.
On the same machine, you could literally just let the OS do it: pause the process (e.g. press Ctrl+Z if you started it at a Linux command line, or use equivalent Task Manager functionality if your OS provides it, or similar), and resume it later. If the process itself doesn't fire any repeated tasks/timers, then that's almost equivalent to simply doing nothing: a process that executes no work won't get scheduled by the kernel anyway; a server that isn't serving any requests can just sit around waiting.
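As a sketch of that OS-level pause/resume, driven from Node itself (POSIX signals only, so this won't work on Windows; the child script is illustrative):

```typescript
// Letting the OS freeze/thaw a process: SIGSTOP stops scheduling it,
// SIGCONT resumes it exactly where it left off.
import { spawn } from "node:child_process";

const child = spawn(process.execPath, [
  "-e",
  "setInterval(() => console.log('tick'), 500)",
]);
child.stdout!.pipe(process.stdout);

setTimeout(() => child.kill("SIGSTOP"), 2000); // ticks stop: process is frozen
setTimeout(() => child.kill("SIGCONT"), 4000); // ticks resume
setTimeout(() => child.kill("SIGTERM"), 6000); // clean up the example
```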
If you actually need to transfer a running process to another machine, your best bet may be a VM which you can snapshot, transfer, resume.

How do worker threads work in Node.js?

Nodejs can not have a built-in thread API like java and .net do. If threads are added, the nature of the language itself will change. It's not possible to add threads as a new set of available classes or functions.
Node.js 10.x added worker threads as an experimental feature, and they have been stable since 12.x. I have gone through a few blogs but did not understand much, maybe due to lack of knowledge. How are they different from traditional threads?
Worker threads in Javascript are somewhat analogous to WebWorkers in the browser. They do not share direct access to any variables with the main thread or with each other and the only way they communicate with the main thread is via messaging. This messaging is synchronized through the event loop. This avoids all the classic race conditions that multiple threads have trying to access the same variables because two separate threads can't access the same variables in node.js. Each thread has its own set of variables and the only way to influence another thread's variables is to send it a message and ask it to modify its own variables. Since that message is synchronized through that thread's event queue, there's no risk of classic race conditions in accessing variables.
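A minimal worker_threads sketch of that message-based model (a sketch, assuming the file is compiled to plain CommonJS JavaScript first, since new Worker(__filename) re-executes the compiled file):

```typescript
// Each thread owns its variables; the only cross-thread influence is a message.
import { Worker, isMainThread, parentPort } from "node:worker_threads";

if (isMainThread) {
  const worker = new Worker(__filename); // start this same file as a worker
  worker.on("message", (total: number) => {
    console.log("worker's counter is now:", total);
    void worker.terminate(); // done with the demo
  });
  worker.postMessage(42); // ask the worker to update *its own* counter
} else {
  let counter = 0; // private to the worker; the main thread cannot read or write it
  parentPort!.on("message", (n: number) => {
    counter += n; // the worker modifies its own variable on request
    parentPort!.postMessage(counter);
  });
}
```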
Java threads, on the other hand, are similar to C++ or native threads in that they share access to the same variables and the threads are freely timesliced so right in the middle of functionA running in threadA, execution could be interrupted and functionB running in threadB could run. Since both can freely access the same variables, there are all sorts of race conditions possible unless one manually uses thread synchronization tools (such as mutexes) to coordinate and protect all access to shared variables. This type of programming is often the source of very hard to find and next-to-impossible to reliably reproduce concurrency bugs. While powerful and useful for some system-level things or more real-time-ish code, it's very easy for anyone but a very senior and experienced developer to make costly concurrency mistakes. And, it's very hard to devise a test that will tell you if it's really stable under all types of load or not.
node.js attempts to avoid the classic concurrency bugs by separating the threads into their own variable spaces and forcing all communication between them to be synchronized via the event queue. This means that threadA/functionA is never arbitrarily interrupted so that some other code in your process can change shared variables behind its back.
node.js also has a backstop: it can run a child_process written in any language, which can use native threads if needed, and you can hook native code and real system-level threads right into node.js using the add-on SDK (which communicates with node.js Javascript through the SDK interface). In fact, a number of node.js built-in libraries do exactly this to surface functionality that requires that level of access to the node.js environment. For example, the implementation of file access uses a pool of native threads to carry out file operations.
So, with all that said, there are still some types of race conditions that can occur and this has to do with access to outside resources. For example if two threads or processes are both trying to do their own thing and write to the same file, they can clearly conflict with each other and create problems.
So, using Workers in node.js still has to be aware of concurrency issues when accessing outside resources. node.js protects the local variable environment for each Worker, but can't do anything about contention among outside resources. In that regard, node.js Workers have the same issues as Java threads and the programmer has to code for that (exclusive file access, file locks, separate files for each Worker, using a database to manage the concurrency for storage, etc...).
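A small illustration of that last point: Node isolates each writer's variables, but nothing stops two uncoordinated writers from interleaving in a shared file (the file name is illustrative):

```typescript
// Two uncoordinated writers appending to the same file: the variables are
// safe, the file is not - line order across A and B is unpredictable.
import { appendFile } from "node:fs/promises";

async function writer(id: string): Promise<void> {
  for (let i = 0; i < 100; i++) {
    await appendFile("shared.log", `${id}:${i}\n`); // no lock around the file
  }
}

Promise.all([writer("A"), writer("B")]).then(() => console.log("done"));
```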
This comes under the Node.js architecture. Whenever a request reaches Node, it is passed to the event queue and then to the event loop. The event loop checks whether the request is blocking or non-blocking I/O (blocking I/O: operations which take time to complete, e.g. fetching data from somewhere else). The event loop passes blocking I/O to the thread pool, which is a collection of worker threads. The blocking operation gets attached to one of the worker threads, which performs the operation (e.g. fetching data from a database); after completion the result is sent back to the event loop and later on to execution.
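You can observe that thread pool from JavaScript. In the sketch below, CPU-heavy crypto work is dispatched to libuv's pool, whose default size is 4, so four hashes tend to finish together and the fifth only after a slot frees up (iteration counts are illustrative):

```typescript
// pbkdf2 runs on libuv's thread pool (default size 4), not on the JS thread.
import { pbkdf2 } from "node:crypto";

const start = Date.now();
for (let i = 1; i <= 5; i++) {
  pbkdf2("secret", "salt", 500_000, 64, "sha512", (err) => {
    if (err) throw err;
    console.log(`hash ${i} finished after ${Date.now() - start} ms`);
  });
}
```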

Node.js does not work in only a single thread by default?

I have a question. Node.js uses libuv in its core to manage its event loop, and by default it works with 4 threads and a process queue with a limit of 1024 processes.
So why do most programmers say it's single-threaded?
By default, node.js only uses ONE thread to run your Javascript. Thus your Javascript runs as single threaded. No two pieces of your Javascript are ever running at the same time. This is a critical design element in Javascript and is why it does not generally have concurrency problems with access to shared variables.
The event driven system works by doing this:
1. Fetch an event from the event queue.
2. Run the Javascript callback associated with the event.
3. Run that Javascript until it returns control back to the system.
4. Fetch the next event from the event queue and go back to step 2.
5. If there is no event in the event queue, go to sleep until an event is added to the queue, then go to step 1.
In this way, you can see that a given piece of Javascript runs until it returns control back to the system and then, and only then, can another piece of Javascript run. That's where the notion of "single threaded" comes from. One piece of Javascript running at a time. It vastly simplifies concurrency issues and, when combined with the non-blocking I/O model, it makes a very efficient system, even when lots of operations are "in flight" (though only one is actually running at a time).
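A quick way to see that run-to-completion behavior (a minimal sketch; the timer cannot fire while the loop holds the one JavaScript thread):

```typescript
// "Single threaded": the 0 ms timer still waits a full second, because its
// callback can only run once the current JavaScript returns to the event loop.
setTimeout(() => console.log("timer fired"), 0);

const start = Date.now();
while (Date.now() - start < 1000) {
  // busy-wait; no other JavaScript in this process runs during this second
}
console.log("loop finished first"); // always printed before "timer fired"
```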
Yes, node.js has some threads inside of libuv that are used for things like implementing file system access. But those are only for native code inside the library and do NOT make your Javascript multi-threaded in any way.
Now, recent versions of node.js do have Worker Threads, which allow you to actually run multiple threads of Javascript, but each thread is a very separate environment and you must communicate with other threads via messages, without direct sharing of variables. This is relatively new, arriving in nodejs version 10.5 (though it's similar in concept to WebWorkers in the browser). These Worker Threads are not used at all unless you specifically engage them with custom programming designed to take advantage of them and live within their specific rules of operation.

Thread inside Application vs. Server process

I have a site which sometimes takes particularly long to process a request (and that's not a defect). 99% of the time it's pretty quick because it almost doesn't do any processing.
I want to show a message that says "Loading" when the site takes long to process the request. My site uses mod_wsgi and Apache. The way I see it, I would respond with "Loading" before completing the processing, and right before that do one of two things:
- spawn a (daemon) thread to take care of the processing, or
- communicate through a socket with another process and tell it to take care of the processing (most likely send a request to http://localhost:8080/do_processing).
What are the pros and cons of one approach vs the other?
Using a separate process is better. It does not have to be hard at all, as suggested in another answer, since you can use an existing system for doing exactly that, such as Celery (http://celeryproject.org/). Relying on in-process threads is not necessarily a good idea unless you are going to implement an internal job queueing system of your own to prevent blowing out the number of threads. Also, in a multiprocess server configuration you can't be guaranteed that a request comes back to the same process, so it is not easy to get the status of a running operation. Finally, the web server processes could get killed off, and thus your background task could be killed before it finishes. You would need a mechanism for holding state which can survive such an event if that mattered. It is far easier to use something like Celery.
The process route requires quite a bit of system overhead. Creating a separate process is relatively expensive and slow. However, if your worker process crashes it doesn't affect your main governing process (you will receive the exit status code and have an opportunity to respawn a new worker process). You will also need some sort of inter-process communication layer (a socket, pipe, shared memory, etc.), which adds complexity to your project.
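In Node terms the process route looks roughly like this (a sketch; the inline child script and values are illustrative): you get crash isolation, but you pay for process startup plus an IPC channel:

```typescript
// The separate-process route: an IPC channel plus crash isolation.
import { spawn } from "node:child_process";

const childSrc = `
  process.on("message", (n) => {
    if (n === 13) throw new Error("boom"); // a crash here cannot take down the parent
    process.send(n * 2);
  });
`;
const child = spawn(process.execPath, ["-e", childSrc], {
  stdio: ["inherit", "inherit", "inherit", "ipc"], // the "ipc" slot enables send()
});
child.on("message", (reply) => {
  console.log("child replied:", reply);
  child.disconnect(); // close the IPC channel so both processes can exit
});
child.on("exit", (code) => console.log("child exited with code", code));
child.send(21);
```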
Threads are lightweight and cheap. All you need to do is manage concurrent access to shared resources. So it really depends on the task you have in mind. Threads will more likely be the appropriate way to implement your task.

Is there a use case for non-blocking receive when I have threads?

I know non-blocking receive is not used much in message passing, but some intuition still tells me it is needed. Take, for example, GUI event-driven applications: you need some way to wait for a message in a non-blocking way so your program can execute some computations. One way to solve this is to have a special thread with a message queue. Is there some use case where you would really need non-blocking receive even if you have threads?
Threads work differently than non-blocking asynchronous operations, although you can usually achieve the same effect with threads that perform synchronous operations. In the end, however, it boils down to how to do things most efficiently.
Threads are limited resources and should be used to process long-running, active operations. If you have something that is not really actively doing things but needs to wait idly for some time for a result (think of some I/O operation over the network, like calling web services or database servers), then it is better to use the provided asynchronous alternative instead of wasting threads unnecessarily by putting the synchronous call on another thread.
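In Node this is the default style. The sketch below keeps a hundred requests in flight while parking zero threads (the URL is illustrative; the global fetch is available from Node 18 on):

```typescript
// One event-loop thread multiplexes all of these idle network waits.
const urls = Array.from({ length: 100 }, (_, i) => `https://example.com/item/${i}`);

Promise.all(urls.map((u) => fetch(u).then((r) => r.status)))
  .then((statuses) => console.log(statuses.length, "responses, no extra threads parked"))
  .catch((err) => console.error("a request failed:", err));
```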
You can have a good read on this issue here for more understanding.
One thread per connection is often not a good idea (wasted memory; not all OSes handle huge thread counts well; etc.).
How do you interrupt the blocking receive call? On Linux, for example (and probably on some other POSIX OSes), pthreads + signals = disaster. With a non-blocking receive you can multiplex your wait across the receiving socket and some kind of IPC socket used to communicate between your threads. This also maps to the Windows world relatively easily.
If you need to replace your regular socket with something more complex (e.g. OpenSSL), relying on the blocking behavior can get you in trouble. OpenSSL, for example, can deadlock on a blocking socket, because the SSL protocol has send/receive inversion scenarios where a receive cannot proceed before some sending is done.
My experience has been: "when in doubt, use non-blocking sockets".
With blocking IO, it's challenging on many platforms to get your application to do a best-effort, orderly shutdown in the face of slow, hung, or disconnected clients/services.
With non-blocking IO, you can kill the in-flight operation as soon as the system call returns, which is immediately. If your code is written with premature termination in mind - which is comparatively simple with non-blocking IO - this can allow you to clean up your saved state gracefully.
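As a sketch of that shutdown pattern in Node (the port and deadline are illustrative): stop accepting new connections, drain if possible, and destroy whatever is still in flight once the deadline passes:

```typescript
// Best-effort orderly shutdown: drain if possible, cut off stragglers.
import { createServer, type Socket } from "node:net";

const live = new Set<Socket>();
const server = createServer((socket) => {
  live.add(socket);
  socket.on("close", () => live.delete(socket));
  socket.pipe(socket); // trivial echo service
});
server.listen(9000);

process.on("SIGINT", () => {
  server.close(() => console.log("drained cleanly")); // no new connections
  // anything still connected after 2 s gets terminated so we can exit
  setTimeout(() => live.forEach((s) => s.destroy()), 2000).unref();
});
```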
I can't think of any, but sometimes the non-blocking APIs are designed in a way that makes them easier/more intuitive to use than an explicitly multi-threaded implementation.
Here is a real situation I faced recently. I used to have a script that would run every hour, managed by crontab, but sometimes users would log in to the machine and run the script manually. This had some problems; for example, concurrent execution by crontab and a user could cause conflicts, and sometimes users would log in as root (I know, bad practice, not under my control) and run the script with the wrong permissions. So we decided to have the routine run as a daemon, with proper permissions, and the command users were used to running would now just trigger the daemon.
So, this user-executed command would basically do two things: trigger the daemon and wait for it to finish the task. But it also needed a timeout, and it had to keep dumping the daemon's logs to the user while waiting.
If I understand the situation you proposed, I had the case you want: I needed to keep listening to the daemon while still interacting with the user independently. The solution was an asynchronous read.
Lucky for me, I didn't think about using threads. I probably would have thought so if I were coding in Java, but this was Python code.
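The shape of that solution in Node/TypeScript terms would be something like this (the socket path and timeout are illustrative):

```typescript
// One thread, no blocking receive: stream the daemon's log output to the
// user while a timeout runs concurrently.
import { connect } from "node:net";

const daemon = connect("/tmp/jobd.sock"); // hypothetical daemon socket
daemon.on("data", (chunk) => process.stdout.write(chunk)); // dump logs live
daemon.on("end", () => console.log("daemon finished the task"));

const timeout = setTimeout(() => {
  console.error("timed out waiting for the daemon");
  daemon.destroy();
}, 60_000);
daemon.on("close", () => clearTimeout(timeout));
```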
My point is that when we consider threads and messaging as equally capable, the real trade-off is between writing a scheduler to plan the non-blocking receive operations and writing synchronization code for threads with shared state (locks etc.). Both can sometimes be easy and sometimes hard. So a use case would be when there are many asynchronous messages to be received and much data to be operated on based on those messages: this would be quite easy in one thread using non-blocking receive, but would require a lot of synchronization with many threads and shared state. I am also thinking about a real-life example; I will probably include it later.