Loop inside worker thread is not working in nodejs - node.js

I have the following code.
const { Worker, isMainThread, workerData } = require("worker_threads");
if (isMainThread) {
  let worker = new Worker(__filename, {
    workerData: "thread-1",
  });
  worker.on("message", (data) => {
    console.log("Got message from worker : ", data);
  });
  worker.on("error", (err) => {
    console.log("error", err);
  });
  worker.on("exit", (code) => {
    if (code !== 0)
      console.log(new Error(`Worker stopped with exit code ${code}`));
  });
} else {
  while (true) console.log(workerData);
}
The above code prints thread-1 only once. Ideally, it should print continuously.
When I replace
while (true) console.log(workerData);
// with
for (let i = 0; i < 5; i++) console.log(workerData);
Then thread-1 is logged 5 times.
Also, when I put the while loop after the for loop, thread-1 is logged only once.
for (let i = 0; i < 5; i++) console.log(workerData);
while (true) console.log(workerData);
The above code prints thread-1 only once.

So, node.js in general does not work well if you don't give the event loop some cycles to process events.
And, doing an infinite loop like this:
while (true) console.log(workerData);
never gives the event loop any cycles. In your example, this shows up in the worker thread because console.log() in a worker thread does not write to the console directly: it sends the output to the main thread as a message, and the main thread has to receive that message in order to actually log it. To manage concurrency with console.log(), Node.js routes all logging through the main thread via inter-thread messaging, and that messaging has to be processed promptly for logging from a worker thread to show up promptly. If either the worker thread or the main thread is stuck in an infinite loop and is not processing event loop messages, you will not necessarily get all the console.log() output you expect, particularly from the worker thread.
The moral of the story is that things only work as expected when you regularly allow the event loop to process messages. This is true not only for the main thread but also for worker threads. An infinite loop anywhere will cause some things not to work as expected. For worker threads, this includes anything that tries to communicate with the main thread.
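One way to keep a worker logging indefinitely is to yield to the event loop between iterations. The sketch below assumes a hypothetical helper, logWithYield (not part of Node.js), that schedules each iteration with setImmediate instead of looping synchronously. It is meant as a possible replacement for the while (true) loop in the worker branch of the question's code, not as the only way to do it.

```javascript
const { isMainThread, workerData } = require("worker_threads");

// Hypothetical helper (not a Node.js API): logs `label` up to `count` times,
// returning to the event loop after every iteration so that pending
// messages (including the worker's console output) can be processed.
function logWithYield(label, count = Infinity, log = console.log) {
  return new Promise((resolve) => {
    let done = 0;
    const step = () => {
      if (done >= count) return resolve(done);
      log(label);
      done++;
      setImmediate(step); // schedule the next iteration, then yield
    };
    step();
  });
}

// In the worker branch, this would stand in for
// `while (true) console.log(workerData);`
if (!isMainThread) {
  logWithYield(workerData);
}
```

Because every iteration goes through setImmediate, this is slower than a tight loop, but the worker's event loop gets a turn after each log call.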

Related

How to trap signals in long-running synchronous loop using Node.js

I have a simple scenario of:
const signals = { INT: false };
process.on("SIGINT", s => {
  console.log('trapped sigint:', s);
  signals.INT = true;
});
let i = 0;
while(true){
  if(signals.INT){
    console.log('got the SIGINT'); break;
  }
  console.log('process.pid:',process.pid, 'next val:', i++);
}
console.log('exited loop, done');
process.exit(0);
The problem is that once we are in the synchronous loop, we can't seem to handle any I/O events, even signals. Is there any way around this?
The only solution I can think of, while maintaining a synchronous loop, is to actually go into the guts of node and look for i/o events ourselves from the synchronous loop, but I severely doubt that this is exposed to users of the runtime.
Obviously, one way around this is to create an async loop, like so:
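const signals = { INT: false };
process.on("SIGINT", s => {
  console.log('trapped sigint:', s);
  signals.INT = true;
});
let i = 0;
(async () => {
  while(true){
    if(signals.INT){
      console.log('got the SIGINT'); break;
    }
    console.log('process.pid:',process.pid, 'next val:', i++);
    // awaiting a plain value only yields to the microtask queue, so wait
    // for a macrotask to give the signal handler a chance to run
    await new Promise(resolve => setImmediate(resolve));
  }
  // exit only after the loop has ended
  console.log('exited loop, done');
  process.exit(0);
})()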
That makes the loop a lot slower but allows us to capture I/O events. Is there any middle ground? I would love to be able to keep a synchronous loop but still listen for I/O events somehow. I thought signals might be able to solve it, but I understand why even signals are not an exception in Node.js / JavaScript.
Why don't you create an async loop outside of your main loop that sets a variable shared by both loops?
Then, when the main loop runs its next iteration, it sees the updated value.
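A possible middle ground is to keep the loop body synchronous but run it in chunks, yielding to the event loop once per chunk rather than on every iteration. The sketch below is one way to do that; runInChunks, its parameters, and the chunk size are all hypothetical names and values chosen for illustration.

```javascript
// Hypothetical helper (not a Node.js API): calls work(i) for i = 0, 1, 2, ...
// in synchronous chunks and yields to the event loop once per chunk.
// Resolves with the number of iterations completed when shouldStop() is true.
function runInChunks(work, shouldStop, chunkSize = 10000) {
  return new Promise((resolve) => {
    let i = 0;
    const runChunk = () => {
      for (let n = 0; n < chunkSize; n++) {
        if (shouldStop()) return resolve(i);
        work(i++);
      }
      setImmediate(runChunk); // let signals, timers and I/O be handled
    };
    runChunk();
  });
}

// Example wiring for the SIGINT case (left commented out so that the
// snippet does not start an endless loop by itself):
//
//   const signals = { INT: false };
//   process.on("SIGINT", () => { signals.INT = true; });
//   runInChunks(
//     (i) => { /* one unit of synchronous work */ },
//     () => signals.INT
//   ).then((count) => {
//     console.log("exited loop after", count, "iterations");
//     process.exit(0);
//   });
```

A larger chunkSize keeps throughput close to that of a plain loop, while a smaller one makes the process react to signals sooner.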

Node.js for loop, event loop, asynchronous resolution

Recently I came across code that makes me wonder about how Node.js engine works. Specifically in regards to event loop, for loop, and asynchronous processing. For example:
const x = 100000000; // or some stupendously large number
// declared before the loop: a const cannot be read before its declaration
const callback = () => {
  console.log("done one");
}
for (let i = 0; i < x; i++) {
  asyncCall(callback);
}
Let's say asyncCall takes anywhere from 1ms to 100ms. Is there a possibility that console.log("done one") will be called before the for loop finishes?
To try to find my own answer, I've read this article: https://blog.sessionstack.com/how-javascript-works-event-loop-and-the-rise-of-async-programming-5-ways-to-better-coding-with-2f077c4438b5. But I'm not sure if there is a case where the call stack will be empty in the middle of the for loop so that the event loop puts the callback in between asyncCall calls?
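A small experiment can help answer this. The sketch below assumes asyncCall is implemented with setTimeout (a stand-in; the question does not say how it works) and records the order of events. Because the loop never returns to the event loop, no callback can run until the synchronous code has finished, even when its timer has already expired.

```javascript
// Stand-in for the question's asyncCall (an assumption): completes after ~1 ms.
function asyncCall(callback) {
  setTimeout(callback, 1);
}

const events = [];

for (let i = 0; i < 1000; i++) {
  asyncCall(() => events.push("callback"));
}

// Busy-wait so that the timers are certainly due by now.
const start = Date.now();
while (Date.now() - start < 20) {
  // still synchronous code: the event loop cannot run any callback here
}

events.push("loop finished");
// Position of the marker = number of callbacks that ran before it.
const callbacksBeforeLoopEnd = events.indexOf("loop finished");

setTimeout(() => {
  console.log("callbacks that ran before the loop ended:", callbacksBeforeLoopEnd);
  console.log("callbacks that ran afterwards:", events.length - 1);
}, 50);
```

The call stack is never empty in the middle of a for loop, so the event loop has no opportunity to slip a callback in between two asyncCall calls.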

Which Event Loop Phase Executes Ordinary JavaScript Code

I am new to Node.js and a little confused about the event loop. As far as I know from https://github.com/nodejs/node/blob/master/doc/topics/event-loop-timers-and-nexttick.md, the event loop phases only process setTimeout, setInterval, setImmediate, process.nextTick, promises and some I/O callbacks.
My question is, if I have the following code:
for (var i = 0; i < 100000000; i++)
  ;
in which phase will the above code be executed?
Regular JavaScript code, like the for loop in your example, is executed before the queues are cleared. The first thing node will do is run your code, and will only call callbacks, timeout results, I/O results, and so on after your code finishes.
As an example, you could try this code:
const fs = require('fs');
fs.open('filename', 'r', () => {
  console.log('File opened.');
});
for (var i = 0; i < 100000000; i++);
console.log('Loop complete.');
No matter how big or small your loop variable, 'Loop complete' will always appear before 'File opened'. This is because with only one thread, node can't run the callback you've supplied to the fs.open function until the loop code has finished.
Remember that there isn't a "main" thread that node keeps going back to. Most long-running node programs will run through the code in main.js pretty quickly, and subsequent code is all going to come from callbacks. The purpose of the initial execution is to define how and when those callbacks happen.
In the node event loop doc (https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick), the following code is given as an example:
const fs = require('fs');
function someAsyncOperation(callback) {
  // Assume this takes 95ms to complete
  fs.readFile('/path/to/file', callback);
}
const timeoutScheduled = Date.now();
setTimeout(() => {
  const delay = Date.now() - timeoutScheduled;
  console.log(`${delay}ms have passed since I was scheduled`);
}, 100);
// do someAsyncOperation which takes 95 ms to complete
someAsyncOperation(() => {
  const startCallback = Date.now();
  // 10ms loop
  while (Date.now() - startCallback < 10) {
    // do nothing
  }
});
The event loop keeps cycling through its phases. When fs.readFile() finishes, the poll queue is empty, so its callback is added and executed immediately. That callback runs a blocking 10ms loop before the timer can fire, which is why the output is:
105ms have passed since I was scheduled
instead of the 100ms you might expect.
Most of your code will live in callbacks, so it will be executed in the poll phase. Code that does not, like in your example, is executed before the event loop enters any phase, and it blocks the event loop while it runs.
The exception is callbacks scheduled by setImmediate, which run in the check phase, before the loop returns to the poll phase in its next iteration.
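The ordering of the poll, check, and timers phases can be observed directly. In this minimal sketch, phaseOrder is a hypothetical helper (not a Node.js API): from inside an I/O callback it schedules both a zero-delay timer and a setImmediate callback, and it records which one runs first.

```javascript
const fs = require("fs");

// Hypothetical helper: resolves with the order in which the callbacks ran.
function phaseOrder() {
  return new Promise((resolve, reject) => {
    const order = [];
    // fs.stat is only used to get into an I/O callback (poll phase).
    fs.stat(process.cwd(), (err) => {
      if (err) return reject(err);
      setTimeout(() => {
        order.push("timeout");
        resolve(order);
      }, 0);
      setImmediate(() => order.push("immediate"));
    });
  });
}

phaseOrder().then((order) => console.log(order.join(" -> ")));
```

When both are scheduled from within an I/O callback, the setImmediate callback runs first, because the check phase comes directly after the poll phase and the timers phase is only reached on the next iteration of the loop.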

How to terminate child process on node.js?

I have a few Node child processes that depend on a master. Every process is a program with some asynchronous logic, and I have to terminate each process when its work is done. But the process does not terminate by itself, because there are still some listeners on it. Example:
const cluster = require('cluster');
const numCPUs = require('os').cpus().length;
if (cluster.isMaster) {
  for (var i = 0; i < numCPUs; i++) {
    let worker = cluster.fork();
    worker.send(i);
  }
} else {
  process.once('message', msg => {
    // here some logic
    // and after this is done, the process has to terminate
    console.log(msg);
  })
}
But the process keeps running, even though I am using "once". I tried removing all of the process listeners, but it still runs. How can I terminate it?
Use a module like terminate, described as "a minimalist yet reliable (tested) way to Terminate a Node.js Process (and all Child Processes) based on the Process ID":
var terminate = require('terminate');
terminate(process.pid, function(err, done){
  if(err) { // you will get an error if you did not supply a valid process.pid
    console.log("Oopsy: " + err); // handle errors in your preferred way.
  }
  else {
    console.log(done); // do what you do best!
  }
});
or
We can start child processes with the {detached: true} option so that they are not attached to the main process but are placed in a new process group. Then, calling process.kill(-pid) in the main process kills all processes in the group whose process group ID is pid. In my case, I only have one process in this group.
var spawn = require('child_process').spawn;
var child = spawn('my-command', {detached: true});
process.kill(-child.pid);
For cluster worker processes, you can call process.disconnect() to disconnect the IPC channel with the master process. Having the IPC channel connected will keep the worker process alive.
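As a rough sketch of that last suggestion, the worker side might look like the following. releaseIpc is a hypothetical helper name, and the code assumes it runs in a process started with cluster.fork(), where process.connected and process.disconnect() are available.

```javascript
const cluster = require("cluster");

// Hypothetical helper: closes the IPC channel if one is open, so that the
// channel no longer keeps the process alive. Returns true if it disconnected.
function releaseIpc(proc = process) {
  if (proc.connected) {
    proc.disconnect();
    return true;
  }
  return false;
}

if (cluster.isWorker) {
  process.once("message", (msg) => {
    // ... the worker's own logic would go here ...
    console.log(msg);
    // Once the work is done, drop the channel; with nothing else pending,
    // the worker can then exit on its own.
    releaseIpc();
  });
}
```

Compared with killing the process by ID, this lets the worker finish any remaining asynchronous work before it exits.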

child_process spawn Race condition possibility in nodejs

I'm starting to learn and use node and I like it but I'm not really sure how certain features work. Maybe you can help me resolve one such issue:
I want to spawn local scripts and programs from my Node server in response to REST commands. Looking at the fs library, I saw the example below of how to spawn a child process and add some pipes/event handlers to it.
var spawn = require('child_process').spawn,
    ps = spawn('ps', ['ax']),
    grep = spawn('grep', ['ssh']);
ps.stdout.on('data', function (data) {
  grep.stdin.write(data);
});
ps.stderr.on('data', function (data) {
  console.log('ps stderr: ' + data);
});
ps.on('close', function (code) {
  if (code !== 0) {
    console.log('ps process exited with code ' + code);
  }
  grep.stdin.end();
});
grep.stdout.on('data', function (data) {
  console.log('' + data);
});
grep.stderr.on('data', function (data) {
  console.log('grep stderr: ' + data);
});
grep.on('close', function (code) {
  if (code !== 0) {
    console.log('grep process exited with code ' + code);
  }
});
What's weird to me is that I don't understand how I can be guaranteed that the event handler code will be registered before the program starts to run. It's not like there's a 'resume' function that you run to start up the child. Isn't this a race condition? Granted, the window would be minuscule and would almost never be hit because it's such a short snippet of code afterward, but still, if it is one, I'd rather not code it this way, out of good habit.
So:
1) if it's not a race condition why?
2) if it is a race condition how could I write it the right way?
Thanks for your time!
Given the slight conflict and ambiguity in the accepted answer's comments, the sample and output below tell me two things:
1) The child process (referring to the Node object returned by spawn) emits no events even though the real underlying process is live and executing.
2) The pipes for the IPC are set up before the child process is executed.
Both are obvious. The conflict concerns the interpretation of the OP's question:
Actually 'yes', this is the epitome of a data race condition if one needs to consider the real child process's side effects. But 'no', there is no data race as far as the IPC pipe plumbing is concerned. The data is written to a buffer and retrieved as a (bigger) blob once the current context completes (as already well described), allowing the event loop to continue.
The first data event seen below delivers not 1 but 5 chunks written to stdout by the child process while we were blocking, so nothing is lost.
sample:
const { spawn } = require('child_process');
let t = () => (new Date()).toTimeString().split(' ')[0]
let p = new Promise(function (resolve, reject) {
  console.log(`[${t()}|info] spawning`);
  let cp = spawn('bash', ['-c', 'for x in `seq 1 1 10`; do printf "$x\n"; sleep 1; done']);
  let resolved = false;
  if (cp === undefined)
    reject();
  cp.on('error', (err) => {
    console.log(`error: ${err}`);
    reject(err);
  });
  cp.stdout.on('data', (data) => {
    if (!resolved) {
      console.log(`[${t()}|info] spawn succeeded`);
      resolved = true;
      resolve();
    }
    process.stdout.write(`[${t()}|data] ${data}`);
  });
  let ts = parseInt(Date.now() / 1000);
  while (parseInt(Date.now() / 1000) - ts < 5) {
    // waste some cycles in the current context
    ts--; ts++;
  }
  console.log(`[${t()}|info] synchronous time wasted`);
});
Promise.resolve(p);
output:
[18:54:18|info] spawning
[18:54:23|info] synchronous time wasted
[18:54:23|info] spawn succeeded
[18:54:23|data] 1
2
3
4
5
[18:54:23|data] 6
[18:54:24|data] 7
[18:54:25|data] 8
[18:54:26|data] 9
[18:54:27|data] 10
It is not a race condition. Node.js is single-threaded and handles events on a first-come, first-served basis. New events are put at the end of the event queue. Node executes your code synchronously, and part of that code sets up event emitters. When these emitters emit events, the events go to the end of the queue and are not handled until Node finishes executing whatever piece of code it's currently working on, which happens to be the same code that registers the listener. Therefore, the listener will always be registered before the event is handled.
