Completely clear spawn - node.js

Is it necessary to null the spawn after pause and kill?
let child = spawn(cmd_str);
child.on('exit', code => {
  child.stdin.pause();
  child.kill();
  child = null;
});
I don't want my module to have a chance to take extra resources from the system after doing its job.

No, it is not necessary.
JavaScript has a garbage collector that reclaims the memory of variables that are no longer used.
One thing you could do instead to be sure the subprocess has been killed with success is to listen to the close or error event, as shown in the example from the NodeJS doc.
You could also track the process by its PID to make sure it is not alive anymore.
Note that I don't fully understand why you need to kill your process on exit: by the time the exit event fires, the subprocess has already terminated. You should maybe focus on a clean exit of the subprocess program instead.
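As a rough sketch of the two suggestions above (waiting for close and checking the PID), something like the following could work. The helper names runToCompletion and isAlive are made up for illustration, and the child here is just a second Node.js process standing in for the real command.

```javascript
const { spawn } = require('child_process');

// Resolves once the child has exited AND its stdio streams are closed.
function runToCompletion(command, args) {
  return new Promise((resolve, reject) => {
    // stdio is inherited so there are no unread pipes left behind.
    const child = spawn(command, args, { stdio: 'inherit' });
    child.on('error', reject); // e.g. the command could not be started
    child.on('close', (code, signal) => {
      resolve({ pid: child.pid, code, signal });
    });
  });
}

// process.kill(pid, 0) sends no signal; it only tests whether the PID exists.
function isAlive(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return err.code === 'EPERM';
  }
}

runToCompletion(process.execPath, ['-e', 'console.log("done")'])
  .then(({ pid, code }) => {
    console.log(`child ${pid} closed with code ${code}; alive: ${isAlive(pid)}`);
  });
```

Keep in mind that the operating system may reuse PIDs, so a PID check long after the child has closed is only a hint.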

Related

Node.js - process.exit() vs childProcess.kill()

I have a node application that runs long running tasks so whenever a task runs a child process is forked to run the task. The code creates a fork for the task to be run and sends a message to the child process to start.
Originally, when the task was complete, I was sending a message back to the parent process and the parent process would call .kill() on the child process. I noticed in my activity monitor that the node processes weren't being removed. All the child processes were hanging around. So, instead of sending a message to the parent and calling .kill(), I called process.exit() in the child process code once the task was complete.
The second approach seems to work fine and I see the node processes being removed from the activity monitor but I'm wondering if there is a downside to this approach that I don't know about. Is one method better than the other? What's the difference between the 2 methods?
My code looks like this for the messaging approach.
//Parent process
const { fork } = require('child_process');
const forked = fork('./app/jobs/onlineConcurrency.js');
forked.send({
  clientId: clientData.clientId,
  schoolYear: schoolYear
});
forked.on("message", (msg) => {
  console.log("message", msg);
  forked.kill();
});
//child Process
process.on('message', (data) => {
  console.log("Message received");
  onlineConcurrencyJob(data.clientId, data.schoolYear, function() {
    console.log("Killing process");
    process.send("done");
  });
})
The code looks like this for the child process when just exiting
//child Process
process.on('message', (data) => {
  console.log("Message received");
  onlineConcurrencyJob(data.clientId, data.schoolYear, function() {
    console.log("Killing process");
    process.exit();
  });
})
kill sends a signal to the child process. Without an argument, it sends a SIGTERM (where TERM is short for "termination"), which typically, as the name suggests, terminates the process.
However, sending a signal like that is a forced method of stopping a process. If the process is performing tasks like writing to a file, and it receives a termination signal, it might cause file corruption because the process doesn't get a chance to write all data to the file, and close it (there are mitigations for this, like installing a signal handler that can be used to "catch" signals and ignore them, or finish all tasks before exiting, but this requires explicit code to be added to the child process).
Whereas with process.exit(), the process exits itself. And typically, it does so at a point where it knows that there are no more pending tasks, so it can exit cleanly. This is generally speaking the best way to stop a (child) process.
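The signal-handler mitigation mentioned above could look roughly like this in the child. This is only a sketch: installGracefulShutdown and flushPendingWork are hypothetical names, and the exit function is passed in as a parameter purely so the behaviour can be exercised without ending the process.

```javascript
// Child process: finish pending work before exiting on SIGTERM.
// `flushPendingWork` is a hypothetical async cleanup function supplied
// by the caller (close files, finish writes, and so on).
function installGracefulShutdown(flushPendingWork, exit = process.exit) {
  let shuttingDown = false;
  process.on('SIGTERM', async () => {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;
    try {
      await flushPendingWork();
      exit(0);
    } catch (err) {
      console.error('cleanup failed:', err);
      exit(1);
    }
  });
}
```

With a handler like this in place, a plain kill() from the parent becomes a request to stop rather than an abrupt termination, at least on platforms that deliver SIGTERM to the child.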
As for why the processes aren't being removed, I'm not sure. It could be that the parent process isn't cleaning up the resources for the child processes, but I would expect that to happen automatically (I don't even think you can perform so-called "child reaping" explicitly in Node.js).
Calling process.exit(0) is the best mechanism, though there are cases where you might want to call .kill() from the parent (e.g., a distributed search where one node returning means all nodes can stop).
.kill() is probably failing because of how the child handles the signal it receives. SIGTERM is already the default, so try .kill('SIGKILL'), which the child cannot catch or ignore.
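One way to combine the two signals is to ask politely first and escalate only if the child does not go away. The following is a sketch under that assumption; terminate and the graceMs parameter are illustrative names, not part of the child_process API.

```javascript
const { spawn } = require('child_process');

// Send SIGTERM, then SIGKILL if the child is still alive after `graceMs`.
function terminate(child, graceMs = 2000) {
  return new Promise((resolve) => {
    // Already gone: nothing to signal.
    if (child.exitCode !== null || child.signalCode !== null) {
      resolve({ code: child.exitCode, signal: child.signalCode });
      return;
    }
    const timer = setTimeout(() => child.kill('SIGKILL'), graceMs);
    child.once('exit', (code, signal) => {
      clearTimeout(timer);
      resolve({ code, signal });
    });
    child.kill(); // SIGTERM by default
  });
}

// Example: a child that would otherwise idle forever.
const idle = spawn(process.execPath, ['-e', 'setInterval(() => {}, 1000)']);
terminate(idle, 500).then(({ code, signal }) => {
  console.log(`child stopped: code=${code} signal=${signal}`);
});
```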
Also note that subprocesses which aren't killed when the parent process exits become orphans and are re-parented (on Unix, typically to the init process). See here for more info and a proposed workaround: https://github.com/nodejs/node/issues/13538
In summary, this is default Unix behavior, and the workaround is to process.on("exit", () => child.kill())
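Extended to several children, that workaround might be sketched as follows. spawnTracked and the children set are hypothetical helpers introduced only for this example.

```javascript
const { spawn } = require('child_process');

// Every child that is still running.
const children = new Set();

// Hypothetical wrapper around spawn() that remembers the child until it exits.
function spawnTracked(command, args, options) {
  const child = spawn(command, args, options);
  children.add(child);
  child.on('exit', () => children.delete(child));
  return child;
}

// 'exit' listeners must be synchronous; kill() only sends the signal
// and does not wait for the child to terminate.
process.on('exit', () => {
  for (const child of children) child.kill();
});
```

Note that the exit event does not fire when the parent itself is terminated by a signal it has no handler for, so this covers normal exits and process.exit() only.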

Control-C not caught in Node.js

I thought if I run this and then hit control-C, the program should exit after displaying "Exiting...". It does not.
Of course, I want to do a lot more than console.log in the real application.
process.on('SIGINT', function() {
  console.log("Exiting...");
  process.exit();
});
while(1);
It does catch but does not exit. I have to kill the process separately.
Node version 8.x LTS
EDIT:
The edit is to make one of my comments below clear. As is made clear in the accepted answer, my signal handler was replacing the default one but it was NEVER getting executed. The fact that Ctrl-C was not killing the process gave me the impression that the signal handler was actually executing. It had merely replaced the built-in handler. THE ANSWER IS TRULY INFORMATIVE - PACKED WITH INFO IN A FEW WORDS.
while(1) is hanging onto the process. Change it to:
setInterval(() => {}, 1000);
And it behaves as you would like.
I presume you used while(1) as a placeholder for a running program, but it's not an accurate representation. A normal node app would not hold the process synchronously like that.
It's probably worth noting that when you execute process.on('SIGINT', ... you are pre-empting node's normal SIGINT handler, which would have exited on ctrl-C even if while(1) was holding the process. By adding your own handler, your code will run when node gets to it, which would be after the current synchronous event cycle, which in this case never happens.
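If the real program does have a long computation, one option is to split it into chunks so the event loop gets a turn between them. The sketch below assumes the work can be divided that way; crunch and its counting loop are placeholders, not anything from the original question.

```javascript
// Set by the SIGINT handler; checked between chunks of work.
let interrupted = false;
process.on('SIGINT', () => {
  interrupted = true;
});

// Hypothetical stand-in for a long computation, split into chunks so the
// event loop (and therefore the signal handler) can run in between.
function crunch(total, chunkSize, onDone) {
  let done = 0;
  function step() {
    if (interrupted) return onDone(done, true);
    const end = Math.min(done + chunkSize, total);
    while (done < end) done++; // placeholder for real work
    if (done < total) setImmediate(step);
    else onDone(done, false);
  }
  step();
}

crunch(1e6, 1e4, (count, wasInterrupted) => {
  console.log(wasInterrupted ? 'Exiting...' : `finished ${count} steps`);
});
```

Because each chunk is scheduled with setImmediate, a Ctrl-C pressed during the run is noticed at the next chunk boundary instead of being ignored indefinitely.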

NodeJS child processes are terminated on SIGINT

I'm creating a Node.js application that spawns quite a few child processes. They are started by both spawn and exec (depending on the library implementation). Examples include GraphicsMagick (gm) for image manipulation and Tesseract (node-tesseract) for OCR. Now I would like to shut down my application gracefully, so I created a shutdown hook:
function exitHandler() {
  killer.waitForShutdown().then(function(){
    logger.logInfo("Exited successfully.");
    process.exit();
  }).catch(function(err) {
    logger.logError(err, "Error during server shutdown.");
    process.exit();
  });
}
process.on('exit', exitHandler);
process.on('SIGINT', exitHandler);
process.on('SIGTERM', exitHandler);
Exit handling itself works fine: it waits as expected and so on, but there is a catch. All "native" (gm, tesseract, ...) processes that are running at that time are also killed. The exception messages consist only of "Command failed" followed by the command that failed, e.g.
"Command failed: /bin/sh -c tesseract tempfiles/W1KwFSdz7MKdJQQnUifQFKdfTRDvBF4VkdJgEvxZGITng7JZWcyPYw6imrw8JFVv/ocr_0.png /tmp/node-tesseract-49073e55-0ef6-482d-8e73-1d70161ce91a -l eng -psm 3\nTesseract Open Source OCR Engine v3.03 with Leptonica"
So, at least for me, they do not say anything useful. I'm also queuing process execution, so the PC doesn't get overloaded by 50 processes at one time. When running processes are killed by SIGINT, new processes that were queued are started just fine and finish successfully. I only have a problem with those few that are running when SIGINT is received. This behavior is the same on Linux (Debian 8) and Windows (W10). From what I read here, people usually have the opposite problem (how to kill child processes). I tried to find out whether stdin somehow gets piped into the child processes, but I couldn't find anything. So is this how it's supposed to work? Is there any trick to prevent this behavior?
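This happens because the detached option defaults to false. A child that is not detached stays in the parent's process group (or, on Windows, attached to the same console), so when you press Ctrl-C the signal is delivered to the child processes as well, regardless of whether you set up an event listener in the parent.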
To stop this happening, you need to change your spawn calls to use the third argument in order to specify detached; for example:
spawn('ls', ['-l'], { detached: true })
From the Node documentation:
On Windows, setting options.detached to true makes it possible for the
child process to continue running after the parent exits. The child
will have its own console window. Once enabled for a child process, it
cannot be disabled.
On non-Windows platforms, if options.detached is set to true, the
child process will be made the leader of a new process group and
session. Note that child processes may continue running after the
parent exits regardless of whether they are detached or not. See
setsid(2) for more information.
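A small sketch of how the detached option could be wrapped is shown below. spawnShielded is a hypothetical helper name, and the child is a second Node.js process used only as a stand-in for gm or tesseract.

```javascript
const { spawn } = require('child_process');

// Hypothetical wrapper: start a command detached so that a Ctrl-C aimed
// at the parent is not delivered to the child as well.
function spawnShielded(command, args) {
  return spawn(command, args, {
    detached: true,
    // stdin is not needed here; stdout/stderr stay piped to the parent.
    stdio: ['ignore', 'pipe', 'pipe'],
  });
}

const child = spawnShielded(process.execPath, ['-e', 'console.log("still running")']);
child.stdout.on('data', (chunk) => process.stdout.write(chunk));
child.on('close', (code) => console.log(`child closed with code ${code}`));
```

The trade-off is that detached children no longer stop on their own when the user interrupts the parent, so the shutdown hook has to wait for them or kill them explicitly.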

Killing CPU bound child process

I have a child process spawned using child_process.fork and would like to terminate it. The problem is that the child process does some lengthy CPU bound calculation and I don't have control over it. That is, the CPU bound code fragment cannot be restructured to make use of process.nextTick or polling.
A very simplified example:
parent.js
var cp = require('child_process');
var child = cp.fork('child.js');
child.js
...
while(true){} // lengthy computation which I cannot modify
...
Is it possible to terminate it? Preferably in a way that allows catching the exit event in the child in order to do some cleanups?
Sending SIGTERM/SIGKILL/etc using child.kill() doesn't seem to work on Windows. I assume even if it works on other OSes it wouldn't kill the process anyway due to child not being able to process events while doing the computation.
I've done this the messy way by using the PID of the process and killing it at the OS level.
Not sure how to do it in windows, but in Linux/mac I've done:
var cp = require('child_process'),
badJob = cp.fork('badFile.js');
cp.execSync('kill -9 ' + badJob.pid);
Signal 9 (SIGKILL) is handled by the kernel and cannot be caught or ignored by the process, so the state of the process is irrelevant.
Edit: On Windows you can use taskkill instead of kill, for example:
cp.execSync('taskkill /F /PID ' + badJob.pid);
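The two variants can be folded into one helper. This is a sketch rather than a tested cross-platform recipe: forceKill is a made-up name, and on POSIX it uses process.kill instead of shelling out to the kill command.

```javascript
const { spawn, execFileSync } = require('child_process');

// Forcefully stop a process by PID, whatever it is currently doing.
function forceKill(pid) {
  if (process.platform === 'win32') {
    // /F forces termination, /T also ends the child's own children.
    execFileSync('taskkill', ['/F', '/T', '/PID', String(pid)]);
  } else {
    // SIGKILL cannot be caught, so a busy loop in the child does not matter.
    process.kill(pid, 'SIGKILL');
  }
}

// Stand-in for the CPU-bound job that cannot be modified.
const badJob = spawn(process.execPath, ['-e', 'while (true) {}']);
badJob.on('exit', (code, signal) => {
  console.log(`badJob exited: code=${code} signal=${signal}`);
});
forceKill(badJob.pid);
```

A forced kill gives the child no opportunity to run cleanup code, so any cleanup has to happen in the parent's exit listener for that child.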

Reading stdout of child process unbuffered

I'm trying to read the output of a Python script launched by Node.js as it arrives. However, I only get access to the data once the process has finished.
var proc, args;
args = [
  './bin/build_map.py',
  '--min_lon',
  opts.sw.lng,
  '--max_lon',
  opts.ne.lng,
  '--min_lat',
  opts.sw.lat,
  '--max_lat',
  opts.ne.lat,
  '--city',
  opts.city
];
proc = spawn('python', args);
proc.stdout.on('data', function (buf) {
  console.log(buf.toString());
  socket.emit('map-creation-response', buf.toString());
});
If I launch the process with { stdio : 'inherit' } I can see the output as it happens directly in the console. But doing something like process.stdout.on('data', ...) will not work.
How do I make sure I can read the output from the child process as it arrives and direct it somewhere else?
The process doing the buffering is Python: it detects that its standard output has been redirected to a pipe rather than a terminal, and buffers its output accordingly. You can easily tell Python not to do this buffering: just run "python -u" instead of "python". It should be as easy as that.
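Applied to the spawn call from the question, that could look like the sketch below. The helper names are invented for this example; setting the PYTHONUNBUFFERED environment variable is an alternative to the -u flag and is included here only as a belt-and-braces assumption.

```javascript
const { spawn } = require('child_process');

// Build the argument list for an unbuffered Python run.
// Values such as coordinates may be numbers, so they are converted to strings.
function unbufferedPythonArgs(script, scriptArgs = []) {
  return ['-u', script, ...scriptArgs.map(String)];
}

// Hypothetical wrapper around spawn(); assumes `python` is on the PATH.
function spawnPythonUnbuffered(script, scriptArgs = []) {
  return spawn('python', unbufferedPythonArgs(script, scriptArgs), {
    env: { ...process.env, PYTHONUNBUFFERED: '1' },
  });
}

// Usage (not executed here):
//   const proc = spawnPythonUnbuffered('./bin/build_map.py', ['--city', opts.city]);
//   proc.stdout.on('data', (buf) => console.log(buf.toString()));
```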
When a process is spawned by child_process.spawn(), the streams connected to the child process's standard output and standard error are actually unbuffered on the Nodejs side. To illustrate this, consider the following program:
const spawn = require('child_process').spawn;
var proc = spawn('bash', [
  '-c',
  'for i in $(seq 1 80); do echo -n .; sleep 1; done'
]);
proc.stdout
  .on('data', function (b) {
    process.stdout.write(b);
  })
  .on('close', function () {
    process.stdout.write("\n");
  });
This program runs bash and has it emit . characters every second for 80 seconds, while consuming this child process's standard output via data events. You should notice that the dots are emitted by the Node program every second, helping to confirm that buffering does not occur on the Nodejs side.
Also, as explained in the Nodejs documentation on child_process:
By default, pipes for stdin, stdout and stderr are established between
the parent Node.js process and the spawned child. It is possible to
stream data through these pipes in a non-blocking way. Note, however,
that some programs use line-buffered I/O internally. While that does
not affect Node.js, it can mean that data sent to the child process
may not be immediately consumed.
You may want to confirm that your Python program does not buffer its output. If you feel you're emitting data from your Python program as separate distinct writes to standard output, consider running sys.stdout.flush() following each write to suggest that Python should actually write data instead of trying to buffer it.
Update: In this commit that passage from the Nodejs documentation was removed for the following reason:
doc: remove confusing note about child process stdio
It’s not obvious what the paragraph is supposed to say. In particular,
whether and what kind of buffering mechanism a process uses for its
stdio streams does not affect that, in general, no guarantees can be
made about when it consumes data that was sent to it.
This suggests that there could be buffering at play before the Nodejs process receives data. Either way, care should be taken to ensure that processes within your control upstream of Nodejs are not buffering their output.
