How to create threads in nodejs - multithreading

Is there any way to create threads for running multiple methods at a time?
That way, if any method fails in between all the other threads should be killed.

New answer:
Node.js did not originally support threads natively, but that ability has since been added. See https://nodejs.org/api/worker_threads.html for details.
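For the "if one fails, kill the rest" part of the question, here is a minimal sketch using worker_threads. The task objects (id, delayMs, shouldFail), the inlined worker source and the helper name runAllOrNothing are all made up for illustration; real workers would normally live in their own file.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical task: wait `delayMs`, then either report success or throw.
// Inlined with `eval: true` only to keep the sketch in a single file.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  setTimeout(() => {
    if (workerData.shouldFail) throw new Error('task ' + workerData.id + ' failed');
    parentPort.postMessage('task ' + workerData.id + ' done');
  }, workerData.delayMs);
`;

function runAllOrNothing(tasks) {
  const workers = [];
  const stopAll = () => Promise.all(workers.map(w => w.terminate()));
  const runs = tasks.map(workerData => new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData });
    workers.push(worker);
    worker.once('message', resolve);
    worker.once('error', reject); // an uncaught exception in the worker lands here
  }));
  // Promise.all rejects as soon as one worker fails; terminate the others, then rethrow.
  return Promise.all(runs).catch(err => stopAll().then(() => { throw err; }));
}

const outcome = runAllOrNothing([
  { id: 1, delayMs: 5000, shouldFail: false },
  { id: 2, delayMs: 50, shouldFail: true },
  { id: 3, delayMs: 5000, shouldFail: false },
]).then(
  results => ({ ok: true, results }),
  err => ({ ok: false, message: err.message })
);
outcome.then(o => console.log(o));
```

Workers 1 and 3 would run for five seconds, but they are terminated as soon as worker 2 throws, so the process exits almost immediately.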
Old answer:
Every node.js process is single-threaded by design, so to get multiple threads you have to run multiple processes. (As some other posters have pointed out, there are also libraries you can link to that let you work with threads in Node, but no such capability exists without those libraries. See the answer by Shawn Vincent referencing https://github.com/audreyt/node-webworker-threads.)
You can start child processes from your main process as shown in the node.js documentation: http://nodejs.org/api/child_process.html. The examples on that page are good and straightforward.
Your parent process can then watch for the close event on every process it started and force-close the others when one of them fails, which gives you the one-fails-all-stop strategy you are talking about.
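A rough sketch of that parent/child arrangement, assuming three made-up jobs that are passed to separate Node.js processes as inline scripts via -e (in practice they would be script files). The helper name runUntilFirstFailure is hypothetical.

```javascript
const { spawn } = require('child_process');

// Hypothetical jobs: each string is run by its own Node.js process.
const jobs = [
  'setTimeout(() => {}, 10000)',            // long-running job
  'setTimeout(() => process.exit(1), 100)', // fails after 100 ms
  'setTimeout(() => {}, 10000)',            // long-running job
];

function runUntilFirstFailure(scripts) {
  return new Promise(resolve => {
    const exitCodes = new Array(scripts.length).fill(undefined);
    let remaining = scripts.length;
    const children = scripts.map((script, i) => {
      const child = spawn(process.execPath, ['-e', script], { stdio: 'ignore' });
      child.on('close', (code, signal) => {
        exitCodes[i] = signal || code;
        // A non-zero exit code (not a kill signal) means this job failed on its own:
        // stop all sibling processes.
        if (code !== 0 && signal === null) {
          children.forEach(other => { if (other !== child) other.kill(); });
        }
        if (--remaining === 0) resolve(exitCodes);
      });
      return child;
    });
  });
}

const spawned = runUntilFirstFailure(jobs);
spawned.then(codes => console.log(codes));
```

The promise resolves once every child has closed; killed siblings report the signal that stopped them instead of an exit code.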
Also see: Node.js on multi-core machines

There is also at least one library for doing native threading from within Node.js: node-webworker-threads
https://github.com/audreyt/node-webworker-threads
This basically implements the Web Worker browser API for node.js.

Update 2:
Since Node.js 12 LTS, worker threads are stable.
Update 1:
From Node v11.7.0 onward, you do not have to use the --experimental-worker flag.
Release note: https://nodejs.org/en/blog/release/v11.7.0/
From Node 10.5 there is now multi threading support, but it is experimental. Hope this will become stable soon.
Check out the following resources:
PR thread
Official documentation
Blog article: Threads in Node 10.5.0: a practical intro

You can get multi-threading using Napa.js.
https://github.com/Microsoft/napajs
"Napa.js is a multi-threaded JavaScript runtime built on V8, which was originally designed to develop highly iterative services with non-compromised performance in Bing. As it evolves, we find it useful to complement Node.js in CPU-bound tasks, with the capability of executing JavaScript in multiple V8 isolates and communicating between them. Napa.js is exposed as a Node.js module, while it can also be embedded in a host process without Node.js dependency."

I needed real multithreading in Node.js and what worked for me was the threads package. It spawns another process with its own Node.js message loop, so they don't block each other. The setup is easy and the documentation gets you up and running fast. Your main program and the workers can communicate in both directions, and worker "threads" can be killed if needed.
Since multithreading in Node.js is a complicated and widely discussed topic, it was quite difficult to find a package that worked for my specific requirement. For the record, these did not work for me:
tiny-worker allowed spawning workers, but they seemed to share the same message loop (but it might be I did something wrong - threads had more documentation giving me confidence it really used multiple processes, so I kept going until it worked)
webworker-threads didn't allow require-ing modules in workers which I needed
And for those asking why I needed real multi-threading: For an application involving the Raspberry Pi and interrupts. One thread is handling those interrupts and another takes care of storing the data (and more).

The nodejs 10.5.0 release has announced multithreading in Node.js. The feature is still experimental. There is a new worker_threads module available now.
You can start using worker threads if you run Node.js v10.5.0 or higher, but this is an experimental API. It is not available by default: you need to enable it by passing --experimental-worker when invoking Node.js.
Here is an example with ES6 and worker_threads enabled, tested on version 12.3.1.
//package.json
"scripts": {
  "start": "node --experimental-modules --experimental-worker index.mjs"
},
Now, you need to import Worker from worker_threads.
Note: You need to give your js files the '.mjs' extension for ES6 support.
//index.mjs
import { Worker } from 'worker_threads';
const spawnWorker = workerData => {
  return new Promise((resolve, reject) => {
    const worker = new Worker('./workerService.mjs', { workerData });
    worker.on('message', resolve);
    worker.on('error', reject);
    worker.on('exit', code => code !== 0 && reject(new Error(`Worker stopped with exit code ${code}`)));
  })
}
const spawnWorkers = () => {
  for (let t = 1; t <= 5; t++)
    spawnWorker('Hello').then(data => console.log(data));
}
spawnWorkers();
Finally, we create a workerService.mjs
//workerService.mjs
import { workerData, parentPort, threadId } from 'worker_threads';
// You can do any cpu intensive tasks here, in a synchronous way
// without blocking the "main thread"
parentPort.postMessage(`${workerData} from worker ${threadId}`);
Output:
npm run start
Hello from worker 4
Hello from worker 3
Hello from worker 1
Hello from worker 2
Hello from worker 5

If you're using Rx, it's rather simple to plug in rxjs-cluster to split work into parallel execution.
(disclaimer: I'm the author)
https://www.npmjs.com/package/rxjs-cluster

There is also now https://github.com/xk/node-threads-a-gogo, though I'm not sure about project status.

NodeJS now includes threads (as an experimental feature at time of answering).

You might be looking for Promise.race (native I/O racing solution, not threads)
Assuming you (or others searching this question) want to race threads to avoid failure and avoid the cost of I/O operations, this is a simple and native way to accomplish it (which does not use threads). Node is designed to be single threaded (look up the event loop), so avoid using threads if possible. If my assumption is correct, I recommend you use Promise.race with setTimeout (example in link). With this strategy, you would race a list of promises which each try some I/O operation and reject the promise if there is an error (otherwise timeout). The Promise.race statement continues after the first resolution/rejection, which seems to be what you want. Hope this helps someone!
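A minimal sketch of that Promise.race plus setTimeout strategy. The fakeIo helper is a hypothetical stand-in for real I/O, and the names timeout and raceWithTimeout are made up for this example.

```javascript
// Rejects after `ms` milliseconds; this is the "timeout" contestant in the race.
function timeout(ms) {
  let timer;
  const promise = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return { promise, cancel: () => clearTimeout(timer) };
}

// Hypothetical stand-in for an I/O operation that settles after `ms` milliseconds.
function fakeIo(name, ms, fail = false) {
  return new Promise((resolve, reject) =>
    setTimeout(() => (fail ? reject(new Error(`${name} failed`)) : resolve(name)), ms));
}

// Settles with whichever contestant resolves or rejects first, then clears the timer.
function raceWithTimeout(operations, ms) {
  const t = timeout(ms);
  return Promise.race([...operations, t.promise]).finally(t.cancel);
}

const fast = raceWithTimeout([fakeIo('a', 30), fakeIo('b', 300)], 1000);
const slow = raceWithTimeout([fakeIo('c', 300)], 30).catch(err => err.message);

fast.then(winner => console.log('winner:', winner));
slow.then(message => console.log('second race:', message));
```

Note that Promise.race only stops waiting; it does not cancel the losing operations, so they keep running in the background unless you stop them yourself.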

Node.js doesn't use threading. According to its inventor that's a key feature. At the time of its invention, threads were slow, problematic, and difficult. Node.js was created as the result of an investigation into an efficient single-core alternative. Most Node.js enthusiasts still cite ye olde argument as if threads haven't been improved over the past 50 years.
As you know, Node.js is used to run JavaScript. The JavaScript language has also developed over the years. It now has ways of using multiple cores, i.e. what threads do. So, via advancements in JavaScript, you can do some multi-core multi-tasking in your applications. user158 points out that Node.js is playing with it a bit. I don't know anything about that. But why wait for Node.js to approve of what JavaScript has to offer?
Google for JavaScript multi-threading instead of Node.js multi-threading. You'll find out about Web Workers, Promises, and other things.

Without the worker_threads module you can't run multiple threads natively in node.js; however, if you really need to offload some code from the main thread, you can always use child_process fork() to run it in another process with its own PID and memory.
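A small sketch of the fork() approach. To keep it in one file, the child script is written to a temporary directory first; the file name sum-child.js, the message shape ({ upTo }, { sum, pid }) and the helper sumInChild are assumptions made for this example.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical child script; in a real project this would be an ordinary module file.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'fork-demo-'));
const childFile = path.join(dir, 'sum-child.js');
fs.writeFileSync(childFile, `
  process.on('message', ({ upTo }) => {
    let sum = 0;
    for (let i = 1; i <= upTo; i++) sum += i; // CPU-bound work, off the parent's thread
    process.send({ sum, pid: process.pid });
  });
`);

function sumInChild(upTo) {
  return new Promise((resolve, reject) => {
    const child = fork(childFile); // separate process with its own PID and memory
    child.once('message', result => { child.kill(); resolve(result); });
    child.once('error', reject);
    child.send({ upTo });          // sent over the IPC channel that fork() sets up
  });
}

const forked = sumInChild(1000000);
forked
  .then(({ sum, pid }) => {
    console.log(`child ${pid} computed ${sum}; parent pid is ${process.pid}`);
  })
  .finally(() => { fs.unlinkSync(childFile); fs.rmdirSync(dir); }); // remove temp files
```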

Related

The right way to use worker threads in Nest.js app

I was creating a Nest.js app and wanted to make sure it scales. As I found out, there are basically two ways to scale: cluster or worker threads. I already asked about the scalability of node.js apps here: "https://stackoverflow.com/questions/64340922/docker-vs-cluster-with-node-js", and found out that instead of clusters it is best to wrap the app with Docker and run clones of it. I have done that with my Nest.js app, but the question is: how can we use worker threads in Nest.js to handle CPU-intensive tasks? Should we just import worker threads and use them, or is there special syntax in Nest.js for using worker threads?
Queues
Queues are a powerful design pattern that help you deal with common application scaling and performance challenges.
Break up monolithic tasks that may otherwise block the Node.js event loop. For example, if a user request requires CPU intensive work like audio transcoding, you can delegate this task to other processes, freeing up user-facing processes to remain responsive.
Go to official document...
NestJS is a NodeJS framework, with Dependency Injection as a main draw to it. With that said, class based architecture is usually the best approach. Now, there's absolutely no reason you can't just import the module (similar to how you would in Node/Typescript) and use the library. There's no special syntax necessary, unless you decide to create a Dynamic Module, which isn't likely necessary.
If you want to handle CPU-intensive tasks, I've written a blog post about this because I had the same problem.
Basically in the js file associated with the worker thread, you can use Nest standalone application which is a wrapper around IoC Container.
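// workerThread.js
const { NestFactory } = require('@nestjs/core');
// Assumed location of your compiled root module; adjust the path to your project.
const { AppModule } = require('./app.module');
async function run() {
  // Standalone application context: the IoC container without an HTTP server.
  const app = await NestFactory.createApplicationContext(AppModule);
  // application logic...
}
run();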

Is NodeJS really single-threaded or multithreaded?

There are many questions about this on Stack Overflow and other websites. Some say NodeJS is single-threaded and some say it is multithreaded, and each side has its own reasoning. But if an interviewer asks the same question, what should I say? I am confused.
The main event loop in NodeJS is single-threaded, but most of the I/O work runs on separate threads.
You can make it multi-threaded by creating child processes.
There is an npm module, napajs, that provides a multi-threaded JavaScript runtime.
However, the 10.5.0 release has announced multithreading in Node.js. The feature is still experimental and likely to undergo extensive changes, but it does show the direction in which NodeJS is heading.
So stay tuned!!
NodeJS runs JavaScript in a single threaded environment. You get to use a single thread (barring the worker_threads module, and spawning processes).
Under the hood, NodeJS uses libuv, which uses OS threads for the async I/O you get in the form of the event loop.
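One way to see this split between the single JavaScript thread and the background threads is the sketch below. It assumes crypto.pbkdf2 as the workload, since its asynchronous form is one of the APIs the Node.js documentation lists as running on libuv's thread pool; the iteration count and the helper names are arbitrary.

```javascript
const crypto = require('crypto');

// The async form of pbkdf2 does its hashing on a libuv pool thread, not on the JS thread.
function hashInThreadPool() {
  return new Promise((resolve, reject) => {
    crypto.pbkdf2('secret', 'salt', 200000, 64, 'sha512', (err, key) =>
      (err ? reject(err) : resolve(key.toString('hex'))));
  });
}

async function demo() {
  let ticks = 0;
  // This timer can only fire if the main thread is free while the hashes are computed.
  const ticker = setInterval(() => { ticks++; }, 1);
  const keys = await Promise.all([1, 2, 3, 4].map(hashInThreadPool));
  clearInterval(ticker);
  return { ticks, hashes: keys.length };
}

const poolDemo = demo();
poolDemo.then(({ ticks, hashes }) =>
  console.log(`${hashes} hashes done; the event loop ticked ${ticks} times meanwhile`));
```

If you swap in crypto.pbkdf2Sync, the work moves onto the JavaScript thread and the timer cannot fire until all hashes are finished.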

What exactly are the implications of the fact that nodejs is single threaded?

The NodeJS website says the following. Emphasis is mine.
Node.js is a platform built on Chrome's JavaScript runtime for easily building fast, scalable network applications. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient, perfect for data-intensive real-time applications that run across distributed devices.
Even though I love NodeJS, I don't see why it is better for scalable applications than existing technologies such as Python, Java or even PHP.
As I understand it, the JavaScript runtime always runs as a single thread on the CPU. The I/O, however, probably uses underlying kernel methods, which might rely on thread pools provided by the kernel.
So the real questions that need to be answered are:
Because all JS code will run in a single thread, is NodeJS unsuitable for applications where there is less I/O and lots of computation?
If I am writing a web application using nodejs and there are 100 open connections each performing a pure computation requiring 100ms, will at least one of them take 10s to finish?
If your machine has 10 cores but you are running just one NodeJS instance, are your other 9 CPUs sitting ducks?
I would appreciate it if you also posted how other technologies perform vis-à-vis NodeJS in these cases.
I haven't done a ton of node, but I have some opinions on this. Please correct if I am mistaken, SO.
Because all JS code will run in a single thread NodeJS is unsuitable for applications where there is less IO and lots of computation?
Yeah. Single threaded means if you are crunching lots of data hard in your JS code, you are blocking everything else. And that sucks. But this isn't typical for most web applications.
If I am writing a web application using nodejs and there are 100 open connections each performing a pure computation requiring 100ms, at least one of them will take 10s to finish?
Yep. 10 seconds of CPU time.
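The arithmetic can be checked with a scaled-down sketch: 10 simulated requests of 20 ms each instead of 100 requests of 100 ms. The busy-wait function compute is a made-up stand-in for pure computation, and the loop is a simplification of a server running one handler after another on its single thread.

```javascript
// Busy-wait to simulate `ms` milliseconds of pure computation on the main thread.
function compute(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}

// All requests "arrive" at once; the single thread can only work on one at a time.
function simulate(requests, msEach) {
  const start = Date.now();
  const finishedAt = [];
  for (let i = 0; i < requests; i++) {
    compute(msEach);
    finishedAt.push(Date.now() - start); // when this request could be answered
  }
  return finishedAt;
}

const finishedAt = simulate(10, 20);
console.log(finishedAt); // completion times in ms, growing by about 20 per request
```

The last request finishes after roughly requests × msEach milliseconds, so with 100 requests of 100 ms the last one waits about 10 seconds.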
If your machine has 10 cores but if you are running just one nodeJS instance your other 9 CPUs are sitting ducks?
That I'm not sure about. The V8 engine might have some optimizations in it that take advantage of multiple cores, transparent to the programmer. But I doubt it.
The thing is, most of the time a web application isn't calculating. If your app is engineered well, a single request can be responded to very quickly. And if you have to fetch things to do that (db, files, remote services) you shouldn't have to wait for that fetch to return before processing the next request.
So you may have many requests in various stages of completion at the same time, depending on when their I/O callbacks happen. Even though only one request is running JS code at a time, that code should do what it needs to do very quickly, exit the run loop, and await the next event callback.
If your JS can't run quickly, then this model does pose a problem. As you note, things will get hung as the CPU churns. So don't build a node web application that does lots of intense calculation on the fly.
However, you can refactor things to be asynchronous instead. Maybe you have a standalone node script that can do the calculation for you, with a callback when it's done. Your web application can then boot up that script as a child process, tell it to do stuff, and provide a callback to run when it's done. You now have sort of faked threads, in a roundabout way.
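A sketch of that idea using child_process.execFile, which takes a callback that runs when the child exits. The prime-counting script and the helper name calculateInBackground are invented for the example; the script is inlined with -e here but would normally be a separate file.

```javascript
const { execFile } = require('child_process');

// Hypothetical standalone calculation: count the primes below 10000 and print the count.
const script = `
  let count = 0;
  for (let n = 2; n < 10000; n++) {
    let prime = true;
    for (let d = 2; d * d <= n; d++) if (n % d === 0) { prime = false; break; }
    if (prime) count++;
  }
  console.log(count);
`;

// Runs the script in a separate Node.js process; the web process stays free meanwhile.
function calculateInBackground(callback) {
  execFile(process.execPath, ['-e', script], (err, stdout) => {
    if (err) return callback(err);
    callback(null, Number(stdout.trim()));
  });
}

calculateInBackground((err, count) => {
  if (err) throw err;
  console.log(`primes below 10000: ${count}`);
});
```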
In virtually all web application technologies, you do not want to be doing complex and intense calculation on the fly. Even with proper threading, it's a losing battle. Instead you have to strategize. Do the calculations in the background, or at regular intervals on a cron job, outside of the main web application process itself.
The things you point out are flaws in theory, but in practice it really only becomes an issue if you aren't doing it right.
Node.js is single threaded. This means anything that would block the main thread needs to be done outside the main thread.
In practice this just means using callbacks for heavy computations the same way you use callbacks for I/O.
For instance, here's the API for node bcrypt:
var bcrypt = require('bcrypt');
bcrypt.genSalt(10, function(err, salt) {
  bcrypt.hash("B4c0/\/", salt, function(err, hash) {
    // Store hash in your password DB.
  });
});
Which Mozilla Persona uses in production. See their code here.

What is different about the way NodeJS handles requests as opposed to a setup like Rails / Passenger?

My understanding is that Node is an 'event'-driven as opposed to a sequentially driven server application. As I've come to understand it, this means that with event-driven software the user is in command: he can create an event at any time and the server is in a state such that it can respond, whereas with sequential software (like a DOS prompt) the application tells the user when it's 'ok' to respond, and may at any given time be unavailable (due to some other process).
Further, my understanding is that applications like Node and EventMachine use a reactor of sorts.. they wait for an 'event' to occur, and using a callback they delegate the task to some other worker. Ok.. so then, what about Rails & Passenger?
Rails might use a server like NGINX with Passenger to spawn new processes when 'events' are received by the system. Is this not conceptually the same idea? If it is, is it just the processing overhead that is really separating the two where Passenger would need to potentially spawn a new rails instance while, node is already waiting to handle the request?
Node.js is an event-driven, non-blocking runtime (not a programming language in its own right). The key is the non-blocking part. Node doesn't spawn other processes to handle requests. It runs in one thread (this is for starters... you can actually spawn more now through some modules, I think, but that's another talk).
Anyway, this is different from other typical server setups, where you receive a request and the thread is locked until it has an answer. If you assign it to another thread, that thread is still locked...
In Node you never lock. You receive a request and the thread continues to receive requests. When a request has been processed, the callback is called.
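A tiny sketch of what "the thread continues to receive requests" looks like in code. The handler and the 50 ms timer standing in for a database call are made up; no real server is involved.

```javascript
const events = [];

// Hypothetical request handler: the "I/O" is a 50 ms timer standing in for a database call.
function handleRequest(id) {
  events.push(`start ${id}`);
  return new Promise(resolve => {
    setTimeout(() => {
      events.push(`done ${id}`); // the callback runs once the simulated I/O completes
      resolve(id);
    }, 50);
  });
}

// Three requests arrive back to back; none of them blocks the thread while waiting.
const handled = Promise.all([1, 2, 3].map(handleRequest)).then(() => events);
handled.then(log => console.log(log.join(', ')));
```

All three "start" entries are logged before any "done" entry, because the single thread moves on to the next request instead of waiting for the first one's I/O.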
Hope I made myself understood and used the right terms ;)
Anyway, if you want, this video is nice: http://www.youtube.com/watch?v=jo_B4LTHi3I
The non-blocking/evented I/O as jribeiro described is one part of the answer. Ruby applications tend to be written using blocking I/O, and using processes and threads for concurrency.
However, non-blocking and evented I/O are not inherent to Node. You can achieve the same thing in Ruby by using EventMachine, and an in-process evented server like Thin. If you program directly against EventMachine and Thin, then it is architecturally almost the same as Node. That being said, the Ruby ecosystem does not have as many event-friendly libraries and documentation as Node does, so it takes a bit of skill to achieve the same thing in Ruby.
Conversely, the way Phusion Passenger manages processes - i.e. by spawning multiple processes and load balancing requests between them, and supervising processes - is not unique to Ruby. In fact, Phusion Passenger introduced Node.js support recently. The Node.js support was open sourced today.
