When I run my node script in Sublime 3 (as a build system ... Ctrl-B), if I add a listener to the stdin's data event, the process stays running until killed. This makes sense, since there's potentially still work to do.
process.stdin.on('data', (d)=> {
// ... do some work with `d`
});
However, I expected that if I removed the listener to that data event, my process would naturally exit. But it doesn't!
// This program never exits naturally.
function processData(d) {
// ... do some work with `d`, then...
process.stdin.removeListener('data', processData);
}
process.stdin.on('data', processData);
Even if you remove the event handler immediately after adding it, the process still sticks around...
function processData() {}
process.stdin.on('data', processData);
process.stdin.removeListener('data', processData);
In this exact case, I could use the once() function instead of on(), but that doesn't clear this up for me. What am I missing? Why does the stdin stream prevent the process from exiting, given it has no listeners of any kind?
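One likely explanation, offered as an assumption rather than a confirmed diagnosis of the Sublime setup: attaching a data listener switches process.stdin into flowing mode, and removing the listener does not switch it back, so the stream keeps reading and keeps the event loop alive. A minimal sketch of the "remove immediately" case with an explicit pause() added, which should let the process exit:

```javascript
function processData() {}

// Adding a 'data' listener puts stdin into flowing mode.
process.stdin.on('data', processData);

// Removing the listener does not pause the stream again...
process.stdin.removeListener('data', processData);

// ...so pause it explicitly to stop reading from stdin.
process.stdin.pause();

console.log(`stdin paused: ${process.stdin.isPaused()}`);
```

If the stream is needed again later, process.stdin.resume() (or attaching a new data listener) starts it reading again.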
Related
In the documentation for Node's Child Process spawn() function, and in examples I've seen elsewhere, the pattern is to call the spawn() function, and then to set up a bunch of handlers on the returned ChildProcess object. For instance, here is the first example of spawn() given on that documentation page:
const { spawn } = require('child_process');
const ls = spawn('ls', ['-lh', '/usr']);
ls.stdout.on('data', (data) => {
console.log(`stdout: ${data}`);
});
ls.stderr.on('data', (data) => {
console.error(`stderr: ${data}`);
});
ls.on('close', (code) => {
console.log(`child process exited with code ${code}`);
});
The spawn() function itself is called on the second line. My understanding is that spawn() starts a child process asynchronously. From the documentation:
The child_process.spawn() method spawns a new process using the given
command, with command line arguments in args.
However, the following lines of the script above go on to set up various handlers for the process, so it's assuming that the process hasn't actually started (and potentially finished) between the time spawn() is called on line 2 and the other stuff happens on the subsequent lines. I know JavaScript/Node is single threaded. However, the operating system is not single threaded, and naively one would read that spawn() call to be telling the operating system to spawn the process right now (at which point, with unfortunate timing, the OS could suspend the parent Node process and run/complete the child process before the next line of the Node code is executed).
But it must be that the process doesn't actually get spawned until the current JavaScript function completes (or more generally the current JavaScript event handler that called the current function completes), right?
That seems like a pretty important thing to say. Why doesn't it say that in the Child Process documentation page? Is there some overriding Node principle that makes it unnecessary to say that explicitly?
The spawning of the new process starts immediately (it's handed over to the OS to actually fire up the process and get it going). Starting the new process with .spawn() is asynchronous and non-blocking. So, it will initiate the operation with the OS and immediately return. You might think that that's why it's OK to set up event handlers after it returns (because the process hasn't yet finished starting). Well, yes and no. It likely hasn't yet finished starting the new process, but that isn't the main reason why it's OK.
It's OK, because node.js runs all its events through a single threaded event queue. Thus no events from the newly spawned process can be processed until after your code finishes executing and returns control back to the system. Only then can it process the next event in the event queue and trigger one of the events you are registering handlers for.
Or, said another way, none of the events from the other process are pre-emptive. They won't/can't interrupt your existing Javascript code. So, since you're still running your Javascript code, those events can't get run yet. Instead, they sit in the event queue until your Javascript code finishes and then the interpreter can go get the next event from the event queue and run the callback associated with it. Likewise, that callback runs until it returns back to the interpreter and then the interpreter can get the next event and run its callback and so on...
That's why node.js is called an event-driven system.
As such, it's perfectly fine to do this type of structure:
const { spawn } = require('child_process');
const ls = spawn('ls', ['-lh', '/usr']);
ls.stdout.on('data', (data) => {
console.log(`stdout: ${data}`);
});
ls.stderr.on('data', (data) => {
console.error(`stderr: ${data}`);
});
ls.on('close', (code) => {
console.log(`child process exited with code ${code}`);
});
None of those data or close events can execute their callbacks until after your code is done and returns control back to the system. So, it's perfectly safe to set up those event handlers like you are. Even if the newly spawned process was running and generating events right away, those events will just sit in the event queue until your Javascript finishes what it is doing (which includes setting up your event handlers).
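To make that concrete, here is a small sketch (not from the original answer) that deliberately blocks the main thread between spawn() and the handler setup. It assumes a second Node process as the child, started via process.execPath, purely so the example is self-contained. Even though the child has almost certainly finished by the time the handlers are attached, they still receive its output, because everything happens in the same tick:

```javascript
const { spawn } = require('child_process');

// Spawn a short-lived child: a second Node process that prints one line.
const child = spawn(process.execPath, ['-e', 'console.log("hello from child")']);

// Keep the main thread busy for about 200 ms. The child very likely finishes
// during this loop, but none of its events can be delivered yet.
const start = Date.now();
while (Date.now() - start < 200) { /* busy wait */ }

// Handlers are attached late, but still within the same tick as spawn().
let output = '';
child.stdout.on('data', (chunk) => { output += chunk; });
child.on('close', (code) => {
  console.log(`close: code=${code}, output=${JSON.stringify(output.trim())}`);
});
```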
Now, if you delayed setting up the event handlers until some future tick of the event loop (as shown below) with something like a setTimeout(), then you could miss some events:
const { spawn } = require('child_process');
const ls = spawn('ls', ['-lh', '/usr']);
setTimeout(() => {
ls.stdout.on('data', (data) => {
console.log(`stdout: ${data}`);
});
ls.stderr.on('data', (data) => {
console.error(`stderr: ${data}`);
});
ls.on('close', (code) => {
console.log(`child process exited with code ${code}`);
});
}, 10);
Here you are not setting up the event handlers immediately as part of the same tick of the event loop, but after a short delay. Therefore some events could get processed from the event loop before you install your event handlers and you could miss some of these events. Obviously, you would never do it this way (on purpose), but I just wanted to show that code running on the same tick of the event loop does not have a problem, but code running on some future tick of the event loop could have a problem missing events.
This is to follow up on jfriend00's answer, to explain what it helped me understand, in case it helps someone else. I knew about the event-driven nature of JavaScript/Node. What jfriend00's explanation made clear to me is the idea that an event can happen and Node can be aware that it happened, but it doesn't actually decide which handlers to tell about that event until the next tick. For instance, if the spawn() call fails outright (e.g., command does not exist), Node obviously knows that immediately. My thought was that it would then immediately queue the appropriate handlers to run on the next tick. But what I now understand is that it puts the "raw event" (i.e., the fact that the spawn failed, with whatever details about that) in its queue, and then on the next tick it determines and calls the appropriate handlers. And the same is true for other events like receiving output from the process, etc. The event is saved but the appropriate handlers for the event are only determined when the next tick runs, so handlers assigned on the previous tick, after spawn(), will get called.
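A short sketch of the failed-spawn case described above. The command name is hypothetical, chosen only so that the spawn fails; the error handler is attached after spawn() has already returned, yet it should still be called, because the event is delivered on a later tick:

```javascript
const { spawn } = require('child_process');

// Hypothetical command name, chosen so that the spawn fails.
const child = spawn('definitely-not-a-real-command-xyz', []);

let sawError = false;

// Attached after spawn() has already failed, but within the same tick.
child.on('error', (err) => {
  sawError = true;
  console.log(`spawn failed: ${err.code}`);
});

// Still false here: the handler cannot run until this code returns.
console.log(`handler already run? ${sawError}`);
```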
So I have to write some NodeJS code that does the following: whenever a post request is made, I attempt to execute some program; if the program is already executing (because of a previous request), I ignore the request. If not, I execute the program. I'm using NodeJS child_process.exec to accomplish this; however, there's no way for me to know when exec(program) terminates; I thought of using execSync, but this simply blocks any requests until the program is done executing, instead of ignoring them completely. Here is the code I have right now:
const { execFile } = require('child_process');

function fun() {
  execFile('C:\\Windows\\System32\\notepad.exe', ['package.json']);
}
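execFile returns a ChildProcess, which is an EventEmitter, so you can listen for events emitted while the child runs, including the exit event, which tells you the process has ended.
// Assumes `ignoreNextRequest` is declared in an outer scope, e.g. `let ignoreNextRequest = false;`
ignoreNextRequest = true;
const child = execFile('C:\\Windows\\System32\\notepad.exe', ['package.json']);
child.once('exit', (code, signal) => {
  // Your code to handle the end of the process here.
  ignoreNextRequest = false;
});
// If the process could not be started, 'exit' may never fire, so reset the flag here too.
child.once('error', () => { ignoreNextRequest = false; });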
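The same idea can also be sketched with execFile's completion callback, which is invoked both when the program ends and when it fails to start. The names runIfIdle and running are hypothetical, and the child here is a second Node process (process.execPath) rather than Notepad, only so that the example runs anywhere:

```javascript
const { execFile } = require('child_process');

let running = false; // true while the program is executing

// Hypothetical helper: start the program unless it is already running.
// Returns true if the program was started, false if the request was ignored.
function runIfIdle(file, args, onDone) {
  if (running) return false;
  running = true;
  execFile(file, args, (error, stdout, stderr) => {
    running = false; // reset on success and on failure alike
    if (onDone) onDone(error, stdout, stderr);
  });
  return true;
}

// Two back-to-back "requests": the second arrives while the first still runs.
const first = runIfIdle(process.execPath, ['-e', 'setTimeout(() => {}, 100)']);
const second = runIfIdle(process.execPath, ['-e', '']);
console.log(first, second); // true false
```

In a request handler, the return value tells you whether to respond with "started" or "already running".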
I'm using node to wrap an executable and I'm using the spawn event emitter. See the docs here. There are multiple events to subscribe to.
child = spawn("path/to/exe", args)
child.on('close', exitNormally )
child.on('exit', exitNormally )
child.on('error', exitAbnormally )
child.on('disconnect', exitAbnormally )
Should I be subscribing to all of them, or is subscribing to close and error enough? I have a callback that I have to execute regardless of whether the outcome is a success or not. The docs for the events are here, but they don't seem to say explicitly what I'm asking, and I want to confirm that my thinking is correct and that I don't miss any exits.
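The exit event is emitted whenever a process that was started ends, so for normal termination it is enough. The one case it does not cover is a process that could not be spawned at all: then error is emitted, and the documentation notes that exit may or may not fire afterwards. If your callback must run regardless of the outcome, listen for error as well and guard against calling it twice. disconnect is only relevant when the child was started with an IPC channel.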
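Since Node's documentation says that exit may or may not fire after an error, one way to run a callback exactly once is to listen for both close and error and guard against double invocation. The helper name onceFinished is hypothetical, and the child is a second Node process used only to keep the sketch self-contained. It also assumes that any error means the child is done, which is not strictly true for errors raised by a failed kill() or send():

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: call `callback` exactly once when the child is done,
// whether it ran to completion or could not be started at all.
function onceFinished(child, callback) {
  let called = false;
  const finish = (err, code, signal) => {
    if (called) return; // ignore any later event
    called = true;
    callback(err, code, signal);
  };
  child.once('error', (err) => finish(err, null, null));
  child.once('close', (code, signal) => finish(null, code, signal));
}

const child = spawn(process.execPath, ['-e', 'process.exit(2)']);
onceFinished(child, (err, code, signal) => {
  console.log(err ? `failed: ${err.message}` : `finished: code=${code}, signal=${signal}`);
});
```

close is used rather than exit here because it fires only after the child's stdio streams have closed, so no output is still pending when the callback runs.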
I have the following node.js code:
var testProcess = spawn(item.testCommand, [], {
cwd: process.cwd(),
stdio: ['ignore', process.stdout, process.stderr]
});
testProcess.on('close', function(data) {
console.log('test');
});
waitpid(testProcess.pid);
testProcess.kill();
However, the close handler never gets called.
The end result I am looking for is that I spawn a process and the script waits for that child process to finish (which waitpid() is doing correctly). I want the stdout/stderr of the child process to be displayed on the screen (which the stdio config is doing correctly). I also want to run some code when the child process closes, which I was going to do in the close event (I also tried exit), but it does not fire.
Why is the event not firing?
http://nodejs.org/api/process.html
Note that just because the name of this function is process.kill, it is really just a signal sender, like the kill system call. The signal sent may do something other than kill the target process.
You can specify the signal when calling kill().
Looking at waitpid() I found out that it returns an object with the exitCode. I changed my code so that I just perform certain actions based on what the value of the exitCode is.
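As an alternative to the waitpid module (a different technique from the one used in the question), Node's built-in child_process.spawnSync() blocks until the child has finished and returns its exit status directly. In this sketch a Node one-liner stands in for item.testCommand, which is an assumption made only so the example is runnable:

```javascript
const { spawnSync } = require('child_process');

// Stand-in for item.testCommand: a Node one-liner that exits with code 3.
const script = 'console.log("running tests"); process.exit(3)';

const result = spawnSync(process.execPath, ['-e', script], {
  cwd: process.cwd(),
  // Ignore stdin; send the child's stdout/stderr straight to this process's.
  stdio: ['ignore', 'inherit', 'inherit'],
});

if (result.error) {
  // The command could not be started (or timed out).
  console.error(`could not run the command: ${result.error.message}`);
} else {
  console.log(`child exited with code ${result.status}`);
}
```

Whatever would have gone into the close handler can then simply follow the spawnSync() call, branching on result.status.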
Consider:
node -e "setTimeout(function() {console.log('abc'); }, 2000);"
This will actually wait for the timeout to fire before the program exits.
I am basically wondering if this means that node is intended to wait for all timeouts to complete before quitting.
Here is my situation. My client has a node.js server he's gonna run from Windows with a Shortcut icon. If the node app encounters an exceptional condition, it will typically instantly exit, not leaving enough time to see in the console what the error was, and this is bad.
My approach is to wrap the entire program with a try catch, so now it looks like this: try { (function () { ... })(); } catch (e) { console.log("EXCEPTION CAUGHT:", e); }, but of course this will also cause the program to immediately exit.
So at this point I want to leave about 10 seconds for the user to take a peek or screenshot of the exception before it quits.
I figure I should just use blocking sleep() through the npm module, but I discovered in testing that setting a timeout also seems to work. (i.e. why bother with a module if something builtin works?) I guess the significance of this isn't big, but I'm just curious about whether it is specified somewhere that node will actually wait for all timeouts to complete before quitting, so that I can feel safe doing this.
In general, node will wait for all timeouts to fire before quitting normally. Calling process.exit() will exit before the timeouts.
The details are part of libuv, but the documentation makes a vague comment about it:
http://nodejs.org/api/all.html#all_ref
you can call ref() to explicitly request the timer hold the program open
Putting all of the facts together, setTimeout by default is designed to hold the event loop open (so if that's the only thing pending, the program will wait). You can programmatically disable or re-enable the behavior.
Late answer, but a definite yes - Nodejs will wait around for setTimeout to finish - see this documentation. Coincidentally, there is also a way to not wait around for setTimeout, and that is by calling unref on the object returned from setTimeout or setInterval.
To summarize: if you want Nodejs to wait until the timeout has been called, there's nothing you need to do. If you want Nodejs to not wait for a particular timeout, call unref on it.
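A small sketch of the difference, using two timers with arbitrary delays. The referenced timeout keeps the process alive until it fires; the unreferenced interval runs in the meantime but does not, on its own, stop the process from exiting:

```javascript
// Referenced (the default): keeps the process alive until it fires.
const keepAlive = setTimeout(() => console.log('timeout fired'), 200);

// Unreferenced: runs while the process is alive, but does not keep it alive.
const background = setInterval(() => console.log('tick'), 50);
background.unref();

console.log(keepAlive.hasRef(), background.hasRef()); // true false
```

Calling background.ref() later would make the interval hold the process open again.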
If node didn't wait for all setTimeout or setInterval calls to complete, you wouldn't be able to use them in simple scripts.
Once you give node something to wait for, such as a setTimeout or an async I/O call, the event loop keeps running until no pending work remains or the process is explicitly told to exit.
Rather than wrap everything in a try/catch you can bind an event listener to process just as the example in the docs:
process.on('uncaughtException', function(err) {
console.log('Caught exception: ' + err);
});
setTimeout(function() {
console.log('This will still run.');
}, 500);
// Intentionally cause an exception, but don't catch it.
nonexistentFunc();
console.log('This will not run.');
In the uncaughtException event, you can then add a setTimeout to exit after 10 seconds:
process.on('uncaughtException', function(err) {
console.log('Caught exception: ' + err);
setTimeout(function(){ process.exit(1); }, 10000);
});
If this exception is something you can recover from, you may want to look at domains (http://nodejs.org/api/domain.html), though note that the domain module has since been deprecated.
edit:
There may actually be another issue at hand: your client application doesn't do enough (or any?) logging. You can use log4js-node to write to a temp file or some application-specific location.
Easy solution:
Make a batch (.bat) file that starts nodejs.
Make a shortcut to it.
Why this is best: this way your client runs nodejs in a command-line window, and even if the nodejs program exits, the command-line window stays open.
Making the bat file:
Make a text file.
Put this line in it: START cmd.exe /k "node abc.js"
Save it.
Rename it to abc.bat.
Make a shortcut to it, or whatever suits you.
Opening it will open a command-line window and run the nodejs file.
Using setTimeout for this is a bad idea.
The odd ones out are when you call process.exit() or there's an uncaught exception, as pointed out by Jim Schubert. Other than that, node will wait for the timeout to complete.
Node does remember timers, but only if it can keep track of them. At least that is my experience.
If you call setTimeout inside an arrow or anonymous function, I would recommend keeping track of your timers in an array, like:
const schedule = () => {
  // Keep the handle so the timer can be cancelled later with clearTimeout().
  timers.push(setTimeout(doThisLater, 2000));
};
and make sure let timers = []; is declared somewhere that outlives the function, e.g. at module scope.