I am building a Nest.js app and want to make sure it scales. As far as I can tell, there are basically two ways to scale: clusters or worker threads. I already asked about the scalability of Node.js apps here: "https://stackoverflow.com/questions/64340922/docker-vs-cluster-with-node-js", and learned that instead of using clusters it is better to wrap the app in Docker and run several copies of it. I have done that with my Nest.js app, but the question remains: how can we use worker threads in Nest.js to handle CPU-intensive tasks? Should we just import worker_threads and use it, or does Nest.js have special syntax for working with worker threads?
Queues
Queues are a powerful design pattern that help you deal with common application scaling and performance challenges.
Break up monolithic tasks that may otherwise block the Node.js event loop. For example, if a user request requires CPU intensive work like audio transcoding, you can delegate this task to other processes, freeing up user-facing processes to remain responsive.
Go to official document...
NestJS is a NodeJS framework, with Dependency Injection as a main draw to it. With that said, class based architecture is usually the best approach. Now, there's absolutely no reason you can't just import the module (similar to how you would in Node/Typescript) and use the library. There's no special syntax necessary, unless you decide to create a Dynamic Module, which isn't likely necessary.
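As a rough illustration of "just import the module and use it", here is a minimal sketch of a plain class that offloads a CPU-bound loop to a worker thread. The names SumService and sumUpTo are hypothetical, and the worker code is passed as a string with eval: true only so the sketch fits in one file. In a Nest app you would presumably decorate the class with @Injectable() and register it as a provider, which is an assumption not shown here.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical CPU-bound job that runs inside the worker thread.
// Inlined as source text only to keep the sketch in a single file;
// a real app would normally point Worker at a separate file.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  let sum = 0;
  for (let i = 1; i <= workerData.limit; i++) sum += i;
  parentPort.postMessage(sum);
`;

// Plain class with no framework-specific syntax.
class SumService {
  sumUpTo(limit) {
    return new Promise((resolve, reject) => {
      const worker = new Worker(workerSource, { eval: true, workerData: { limit } });
      worker.once('message', resolve);
      worker.once('error', reject);
      worker.once('exit', (code) => {
        if (code !== 0) reject(new Error(`Worker stopped with exit code ${code}`));
      });
    });
  }
}

new SumService().sumUpTo(1000).then((sum) => console.log('sum:', sum));
```

The caller only sees a method that returns a Promise, so a controller or another service can await it without knowing a thread is involved.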
If you want to handle CPU-intensive tasks, I've written a blog post about this because I had the same problem.
Basically, in the JS file associated with the worker thread, you can use a Nest standalone application, which is a wrapper around the IoC container.
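// workerThread.js
// Import paths are assumptions; adjust './app.module' to your project layout.
const { NestFactory } = require('@nestjs/core');
const { AppModule } = require('./app.module');

async function run() {
  // Standalone application context: the IoC container without an HTTP listener
  const app = await NestFactory.createApplicationContext(AppModule);
  // application logic...
}
run();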
Nodejs can not have a built-in thread API like java and .net do. If threads are added, the nature of the language itself will change. It's not possible to add threads as a new set of available classes or functions.
Node.js 10.x added worker threads as an experimental feature, and they have been stable since 12.x. I have gone through a few blog posts but did not understand much, maybe due to a lack of background knowledge. How are worker threads different from ordinary threads?
Worker threads in Javascript are somewhat analogous to Web Workers in the browser. They do not share direct access to any variables with the main thread or with each other (unless you explicitly opt in to shared memory such as a SharedArrayBuffer), and the normal way they communicate with the main thread is via messaging. This messaging is synchronized through the event loop. This avoids all the classic race conditions that multiple threads have trying to access the same variables because two separate threads can't access the same ordinary variables in node.js. Each thread has its own set of variables and the only way to influence another thread's variables is to send it a message and ask it to modify its own variables. Since that message is synchronized through that thread's event queue, there's no risk of classic race conditions in accessing variables.
Java threads, on the other hand, are similar to C++ or native threads in that they share access to the same variables and the threads are freely timesliced so right in the middle of functionA running in threadA, execution could be interrupted and functionB running in threadB could run. Since both can freely access the same variables, there are all sorts of race conditions possible unless one manually uses thread synchronization tools (such as mutexes) to coordinate and protect all access to shared variables. This type of programming is often the source of very hard to find and next-to-impossible to reliably reproduce concurrency bugs. While powerful and useful for some system-level things or more real-time-ish code, it's very easy for anyone but a very senior and experienced developer to make costly concurrency mistakes. And, it's very hard to devise a test that will tell you if it's really stable under all types of load or not.
node.js attempts to avoid the classic concurrency bugs by separating the threads into their own variable space and forcing all communication between them to be synchronized via the event queue. This means that threadA/functionA is never arbitrarily interrupted and some other code in your process changes some shared variables it was accessing while it wasn't looking.
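To make the separate variable spaces concrete, here is a minimal sketch. The variable name counter and the 'increment' message are made up for illustration, and the worker source is inlined with eval: true only to keep everything in one file. Both threads declare their own counter; the main thread can only change the worker's copy by sending it a message.

```javascript
const { Worker } = require('worker_threads');

let counter = 0; // lives in the main thread only

const workerSource = `
  const { parentPort } = require('worker_threads');
  let counter = 100; // the worker's own variable, unrelated to the main thread's
  parentPort.on('message', (msg) => {
    if (msg === 'increment') {
      counter += 1;
      parentPort.postMessage(counter);
    }
  });
`;

// Sends `times` increment requests and collects the worker's replies.
function askWorkerToIncrement(times) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true });
    const seen = [];
    worker.on('error', reject);
    worker.on('message', (value) => {
      seen.push(value);
      if (seen.length === times) {
        worker.terminate().then(() => resolve(seen));
      }
    });
    for (let i = 0; i < times; i++) worker.postMessage('increment');
  });
}

askWorkerToIncrement(3).then((seen) => {
  console.log('worker counter values:', seen);
  console.log('main counter is still:', counter);
});
```

Each message is handled to completion in the worker's own event loop before the next one is processed, which is why the increments cannot interleave.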
node.js also has a backstop that it can run a child_process that can be written in any language and can use native threads if needed or one can actually hook native code and real system level threads right into node.js using the add-on SDK (and it communicates with node.js Javascript through the SDK interface). And, in fact, a number of node.js built-in libraries do exactly this to surface functionality that requires that level of access to the nodejs environment. For example, the implementation of file access uses a pool of native threads to carry out file operations.
So, with all that said, there are still some types of race conditions that can occur and this has to do with access to outside resources. For example if two threads or processes are both trying to do their own thing and write to the same file, they can clearly conflict with each other and create problems.
So, using Workers in node.js still has to be aware of concurrency issues when accessing outside resources. node.js protects the local variable environment for each Worker, but can't do anything about contention among outside resources. In that regard, node.js Workers have the same issues as Java threads and the programmer has to code for that (exclusive file access, file locks, separate files for each Worker, using a database to manage the concurrency for storage, etc...).
This is part of the Node.js architecture. Whenever a request reaches Node, it is placed on the event queue and then picked up by the event loop. The event loop checks whether the operation is blocking or non-blocking I/O (blocking I/O meaning operations that take time to complete, e.g. fetching data from somewhere else). The event loop hands blocking operations to the thread pool, which is a collection of internal worker threads. The operation is assigned to one of those threads, which carries it out (e.g. fetching data from a database); when it completes, the result is sent back to the event loop and the callback is then executed. Note that these pool threads are internal to Node (libuv) and are not the same thing as the worker_threads module.
Pure curiosity, I'm just wondering if there is any case where a webworker would manage to execute a separate thread if only one thread is available in the CPU, maybe with some virtualization, using the GPU?
Thanks!
There seem to be two premises behind your question: firstly, that web workers use threads; and secondly that multiple threads require multiple cores. But neither is really true.
On the first: there’s no actual requirement that web workers be implemented with threads. User agents are free to use processes, threads or any “equivalent construct” [see the web worker specification]. They could use multitasking within a single thread if they wanted to. Web worker scripts run concurrently with, but not necessarily in parallel to, the page's JavaScript.
On the second: it’s quite possible for multiple threads to run on a single CPU. It works a lot like concurrent async functions do in single threaded JavaScript.
So yes, in answer to your question: web workers do run properly on a single core client. You will lose some of the performance benefits but the code will still behave as it would in a multi core system.
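The comparison with concurrent async functions can be sketched like this. It is only an analogy: OS threads on a single core are switched by the scheduler, whereas here each task gives up control voluntarily at an await. The function names are made up for illustration.

```javascript
// Each task records a step, then yields to the event loop so the other
// task gets a turn. Everything runs on one thread.
async function task(name, steps, log) {
  for (let i = 1; i <= steps; i++) {
    log.push(`${name}${i}`);
    await new Promise((resolve) => setImmediate(resolve));
  }
}

async function runInterleaved() {
  const log = [];
  await Promise.all([task('A', 3, log), task('B', 3, log)]);
  return log;
}

runInterleaved().then((log) => console.log(log.join(' ')));
// prints "A1 B1 A2 B2 A3 B3"
```

Neither task finishes before the other starts, yet nothing runs in parallel, which is the sense in which concurrency does not require multiple cores.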
Is there any way to create threads for running multiple methods at a time?
That way, if any method fails in between all the other threads should be killed.
New answer:
While node.js didn't use to have the ability to use threading natively, the ability has since been added. See https://nodejs.org/api/worker_threads.html for details.
Old answer:
Every node.js process is single threaded by design. Therefore, to get multiple threads, you have to have multiple processes. (As some other posters have pointed out, there are also libraries you can link to that will give you the ability to work with threads in Node, but no such capability exists without those libraries. See the answer by Shawn Vincent referencing https://github.com/audreyt/node-webworker-threads.)
You can start child processes from your main process as shown here in the node.js documentation: http://nodejs.org/api/child_process.html. The examples on this page are pretty good and pretty straightforward.
Your parent process can then watch for the close event on any process it started and then could force close the other processes you started to achieve the type of one fail all stop strategy you are talking about.
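A minimal sketch of that "one fails, all stop" strategy, assuming each method can be run as its own Node process. The jobs here are tiny inline scripts passed with node -e purely for illustration, and runAllOrStop is a made-up helper name.

```javascript
const { spawn } = require('child_process');

// Hypothetical jobs: each is a tiny inline Node script. The second one fails.
const jobs = [
  'setTimeout(() => {}, 10000)',            // long-running
  'setTimeout(() => process.exit(1), 100)', // fails quickly
  'setTimeout(() => {}, 10000)',            // long-running
];

function runAllOrStop(scripts) {
  return new Promise((resolve) => {
    const children = scripts.map((src) => spawn(process.execPath, ['-e', src]));
    let remaining = children.length;
    let failed = false;
    children.forEach((child) => {
      child.on('close', (code) => {
        if (code !== 0 && !failed) {
          failed = true;
          // Stop every sibling that is still running.
          children.forEach((other) => {
            if (other !== child) other.kill();
          });
        }
        remaining -= 1;
        if (remaining === 0) resolve({ failed });
      });
    });
  });
}

runAllOrStop(jobs).then(({ failed }) => console.log('stopped early:', failed));
```

The promise resolves only after every child has closed, so the parent knows nothing is left running when it continues.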
Also see: Node.js on multi-core machines
There is also at least one library for doing native threading from within Node.js: node-webworker-threads
https://github.com/audreyt/node-webworker-threads
This basically implements the Web Worker browser API for node.js.
Update 2:
Since Node.js 12 LTS Worker threads are stable.
Update 1:
From Node v11.7.0 onward, you do not have to use --experimental-worker flag.
Release note: https://nodejs.org/en/blog/release/v11.7.0/
From Node 10.5 there is now multi threading support, but it is experimental. Hope this will become stable soon.
Check out the following resources:
PR thread.
Official Documentation
Blog article: Threads in Node 10.5.0: a practical intro
You can get multi-threading using Napa.js.
https://github.com/Microsoft/napajs
"Napa.js is a multi-threaded JavaScript runtime built on V8, which was originally designed to develop highly iterative services with non-compromised performance in Bing. As it evolves, we find it useful to complement Node.js in CPU-bound tasks, with the capability of executing JavaScript in multiple V8 isolates and communicating between them. Napa.js is exposed as a Node.js module, while it can also be embedded in a host process without Node.js dependency."
I needed real multithreading in Node.js and what worked for me was the threads package. It spawns another process with its own Node.js message loop, so they don't block each other. The setup is easy and the documentation gets you up and running fast. Your main program and the workers can communicate in both directions, and worker "threads" can be killed if needed.
Since multithreading and Node.js is a complicated and widely discussed topic it was quite difficult to find a package that works for my specific requirement. For the record these did not work for me:
tiny-worker allowed spawning workers, but they seemed to share the same message loop (but it might be I did something wrong - threads had more documentation giving me confidence it really used multiple processes, so I kept going until it worked)
webworker-threads didn't allow require-ing modules in workers which I needed
And for those asking why I needed real multi-threading: For an application involving the Raspberry Pi and interrupts. One thread is handling those interrupts and another takes care of storing the data (and more).
The nodejs 10.5.0 release has announced multithreading in Node.js. The feature is still experimental. There is a new worker_threads module available now.
You can start using worker threads if you run Node.js v10.5.0 or higher, but this is an experimental API. It is not available by default: you need to enable it by using --experimental-worker when invoking Node.js.
Here is an example with ES6 and worker_threads enabled, tested on version 12.3.1
//package.json
"scripts": {
"start": "node --experimental-modules --experimental-worker index.mjs"
},
Now, you need to import Worker from worker_threads.
Note: You need to give your JS files the '.mjs' extension for ES6 module support.
//index.mjs
import { Worker } from 'worker_threads';
const spawnWorker = workerData => {
return new Promise((resolve, reject) => {
const worker = new Worker('./workerService.mjs', { workerData });
worker.on('message', resolve);
worker.on('error', reject);
worker.on('exit', code => code !== 0 && reject(new Error(`Worker stopped with exit code ${code}`)));
})
}
const spawnWorkers = () => {
for (let t = 1; t <= 5; t++)
spawnWorker('Hello').then(data => console.log(data));
}
spawnWorkers();
Finally, we create a workerService.mjs
//workerService.mjs
import { workerData, parentPort, threadId } from 'worker_threads';
// You can do any cpu intensive tasks here, in a synchronous way
// without blocking the "main thread"
parentPort.postMessage(`${workerData} from worker ${threadId}`);
Output:
npm run start
Hello from worker 4
Hello from worker 3
Hello from worker 1
Hello from worker 2
Hello from worker 5
If you're using Rx, it's rather simple to plug in rxjs-cluster to split work into parallel execution.
(disclaimer: I'm the author)
https://www.npmjs.com/package/rxjs-cluster
There is also now https://github.com/xk/node-threads-a-gogo, though I'm not sure about project status.
NodeJS now includes threads (as an experimental feature at time of answering).
You might be looking for Promise.race (native I/O racing solution, not threads)
Assuming you (or others searching this question) want to race threads to avoid failure and avoid the cost of I/O operations, this is a simple and native way to accomplish it (which does not use threads). Node is designed to be single threaded (look up the event loop), so avoid using threads if possible. If my assumption is correct, I recommend you use Promise.race with setTimeout (example in link). With this strategy, you would race a list of promises which each try some I/O operation and reject the promise if there is an error (otherwise timeout). The Promise.race statement continues after the first resolution/rejection, which seems to be what you want. Hope this helps someone!
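A minimal sketch of the Promise.race plus setTimeout idea described above. fakeIo is a made-up stand-in for a real I/O call, and firstOrTimeout and timeout are hypothetical helper names.

```javascript
// Rejects after `ms` milliseconds; used as the "timeout" contestant.
function timeout(ms) {
  let timer;
  const promise = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  return { promise, cancel: () => clearTimeout(timer) };
}

// Hypothetical stand-in for an I/O operation that takes `ms` milliseconds.
function fakeIo(label, ms) {
  return new Promise((resolve) => setTimeout(() => resolve(label), ms));
}

// Settles with whichever operation settles first, or rejects on timeout.
async function firstOrTimeout(operations, ms) {
  const t = timeout(ms);
  try {
    return await Promise.race([...operations, t.promise]);
  } finally {
    t.cancel(); // do not keep the process alive for an unused timer
  }
}

firstOrTimeout([fakeIo('slow', 200), fakeIo('fast', 20)], 1000)
  .then((winner) => console.log('winner:', winner));
```

Keep in mind that Promise.race does not cancel the losing operations; they keep running in the background and their results are simply ignored.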
Node.js doesn't use threading. According to its inventor that's a key feature. At the time of its invention, threads were slow, problematic, and difficult. Node.js was created as the result of an investigation into an efficient single-core alternative. Most Node.js enthusiasts still cite ye olde argument as if threads haven't been improved over the past 50 years.
As you know, Node.js is used to run JavaScript. The JavaScript language has also developed over the years. It now has ways of using multiple cores - i.e. what Threads do. So, via advancements in JavaScript, you can do some multi-core multi-tasking in your applications. user158 points out that Node.js is playing with it a bit. I don't know anything about that. But why wait for Node.js to approve of what JavaScript has to offer.
Google for JavaScript multi-threading instead of Node.js multi-threading. You'll find out about Web Workers, Promises, and other things.
You can't run multiple threads natively in node.js, however if you really need to offload some code from the main thread, you can always use child_process fork() to run some code in another process with its own PID and memory.
If Node.js is multithreaded (see this article), and threads are managed by the OS, which can run them on the same core or on another core of a multicore CPU (see this question), then Node.js should automatically utilize a multicore CPU.
So why should I use cluster.fork to create separate Node processes in order to utilize multiple cores, as shown in this example in the Node docs?
I know that multiple processes have the advantage that when one process goes down there is still another process to respond to requests, unlike with threads. What I need to know is whether multiple cores can be utilized just by spawning a process per core, or whether that is an OS task that I can't control.
It depends.
Work that happens asynchronously and by Node itself, such as IO operations, is multithreaded. Your JavaScript application runs in a single thread.
In my opinion, the only time you need to fire off multiple processes is if the vast majority of your work is done in straight JavaScript. Node was designed around the fact that this is rarely the case, and is built for applications that primarily block on disk and network.
So, if you have a typical Node application where your JavaScript isn't the bulk of the work, then firing off multiple processes will not help you utilize multiple CPUs/cores.
However, if you have a special application where you do lots of work in your main loop, then multiple processes may be for you.
The easiest way to know is to monitor CPU utilization while your application runs. You will have to decide on a per-application basis what is best.
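One rough way to check this from inside the process is to compare CPU time with wall-clock time using process.cpuUsage(). This is only a sketch: the two workloads are made up, and an external tool such as top would give you the same kind of information.

```javascript
// Fraction of one core used by this process while `fn` runs
// (CPU time divided by wall-clock time).
async function cpuShareOf(fn) {
  const startCpu = process.cpuUsage();
  const startWall = process.hrtime.bigint();
  await fn();
  const cpu = process.cpuUsage(startCpu); // microseconds since startCpu
  const wallMicros = Number(process.hrtime.bigint() - startWall) / 1000;
  return (cpu.user + cpu.system) / wallMicros;
}

// Hypothetical workloads: one burns CPU in JavaScript, one just waits.
const busy = () => {
  const end = Date.now() + 200;
  while (Date.now() < end);
};
const idle = () => new Promise((resolve) => setTimeout(resolve, 200));

async function compare() {
  const busyShare = await cpuShareOf(busy);
  const idleShare = await cpuShareOf(idle);
  return { busyShare, idleShare };
}

const shares = compare();
shares.then(({ busyShare, idleShare }) => {
  console.log('JS-bound share of one core:', busyShare.toFixed(2));
  console.log('waiting share of one core:', idleShare.toFixed(2));
});
```

If your real workload looks like the first number (close to a full core spent in your own JavaScript), extra processes or worker threads may help; if it looks like the second, they probably will not.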
Node is not multi-threaded from the developer's point of view. Threads are used in a very different way than they are used by, for example, Apache's worker MPM.
I believe this answer will clear things up.