Exit Process When all Readline on('line') Callbacks Complete - node.js

I have a Node v10.14.1 program that reads a CSV file line-by-line using the readline Interface.
My .on('line') handler is an async callback that performs some operations which read/write from a db, so I use async/await to deal with the promises.
A short version of the program's code block of interest would look something like:
const readline = require('readline');
const filesystem = require('fs');

const reader = readline.createInterface({
  input: filesystem.createReadStream(pathToSomeCSV)
});

reader.on('line', async (line) => {
  await doSomeDBStuff();
});
If I leave the above the way it is, the process does not exit. However, if I add
reader.on('close', () => {process.exit()});
then the process exits prior to all of the on('line') callbacks finishing and their promises resolving.
My question is: is there a way to say "Upon all lines being read AND all on('line') callbacks being completed with their promises resolved, then exit the process (I assume with process.exit())"?
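One common pattern addresses this directly: collect the promise returned for each line and wait for all of them when the reader closes. A minimal sketch, assuming doSomeDBStuff returns a promise (as in the question):

const readline = require('readline');
const filesystem = require('fs');

const reader = readline.createInterface({
  input: filesystem.createReadStream(pathToSomeCSV)
});

// Collect one promise per line instead of awaiting inside the handler.
const pending = [];
reader.on('line', (line) => {
  pending.push(doSomeDBStuff());
});

reader.on('close', async () => {
  // 'close' fires once the input is exhausted; the handlers' promises
  // may still be in flight, so wait for all of them before exiting.
  await Promise.all(pending);
  process.exit(0);
});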

Investigation
I get the feeling the docs are leaving some non-obvious details out. I was unable to get this official example working correctly (which is what your question appears to be based on). That implementation would kill my application prematurely. Or, if I removed the 'close' listener, the terminal would just hang forever on exit. I tried overriding process.on('exit') to no avail. I also tried the prompt-sync package, but it consistently corrupted my terminal.
Solution
I found a lovely answer here which offers a good solution.
Create the function:
const fs = require('fs');

const prompt = msg => {
  // Synchronously write the prompt text to stdout (fd 1).
  fs.writeSync(1, String(msg));
  let s = '', buf = Buffer.alloc(1);
  // Read stdin (fd 0) one byte at a time until LF (10) or CR (13).
  while (buf[0] - 10 && buf[0] - 13)
    s += buf, fs.readSync(0, buf, 0, 1, 0);
  // Drop the NUL byte contributed by the initial empty buffer.
  return s.slice(1);
};
Use it:
const result = prompt('Input something: ');
console.log('Your input was: ' + result);
No terminal corruption, the application does not die prematurely, and it does not hang on exit, either.
This solution is not perfect however - it intentionally blocks the main thread while waiting for user input, meaning you cannot run other functions in the background while waiting for user input. In my mind user input should be thread-blocking in most cases anyway, so this solution works very well for me personally.
Edit: see an improved version for Linux here.

Related

Nodejs exec child process stdout not getting all the chunks

I'm trying to send messages from my child process to my main process, but some chunks are not being sent, possibly because the file is too big.
main process:
let response = '';
let error = '';
await new Promise(resolve => {
  const p = exec(command);
  p.stdout.on('data', data => {
    // this gets triggered many times because the html string is big and gets split up
    response += data;
  });
  p.stderr.on('data', data => {
    error += data;
  });
  p.on('exit', resolve);
});
console.log(response);
child process:
// only fetch 1 page, then quit
const bigHtmlString = await fetchHtmlString(url)
process.stdout.write(bigHtmlString)
I know the child process works because when I run it directly, I can see the end of the file in the console. But when I run the main process, I cannot see the end of the file. It's quite big, so I'm not sure exactly which chunks are missing.
Edit: there's also a new, unknown problem. When I add a wait at the end of my child process, it doesn't wait; it closes. So I'm guessing it crashes somehow? I'm not seeing any error, even with p.on('error', console.log).
example:
const bigHtmlString = await fetchHtmlString(url)
process.stdout.write(bigHtmlString)
// this never gets executed, the process closes. The wait works if I launch the child process directly
await new Promise(resolve => setTimeout(resolve, 1000000))
process.stdout.write(...) returns true or false depending on whether the string was written out immediately or buffered. If it returns false, you can listen for the 'drain' event to know when it has finished.
Something like this:
const bigHtmlString = await fetchHtmlString(url);
const wrote = process.stdout.write(bigHtmlString);
if (!wrote) {
  // this effectively means "wait for this
  // event to fire", but it doesn't block everything
  process.stdout.once('drain', doSomethingHere);
}
My suggestion from the comments resolved the issue so I'm posting it as an answer.
I would suggest using spawn instead of exec. The latter buffers the output and flushes it when the process has ended (or the buffer is full), while spawn streams the output, which is better for huge output like in your case.
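A minimal sketch of the spawn approach, under the assumption that the child is a node script (the command and file name are placeholders, not from the question):

const { spawn } = require('child_process');

let response = '';
let error = '';
await new Promise((resolve, reject) => {
  // spawn streams stdout/stderr as the child produces them,
  // instead of buffering the whole output the way exec does.
  const p = spawn('node', ['child.js']);
  p.stdout.on('data', data => { response += data; });
  p.stderr.on('data', data => { error += data; });
  p.on('error', reject);
  // 'close' fires only after the stdio streams have ended,
  // so no trailing chunks get lost.
  p.on('close', resolve);
});
console.log(response);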

Node.js halts when console window is scrolled

If you run the following script in Node.js under Windows (at least 8)
const init = +new Date;
setInterval(() => {
  console.log(+new Date - init);
}, 1000);
and drag the thumb of a scroll bar of console window, the output of the script looks similar to
1001
2003 // long drag here
12368 // its result
13370
14372
Looks like Node.js' event loop halts during the scroll. The same thing should happen to asynchronous actions inside of the http package. Thus leaving a visible terminal window is dangerous for a running server.
How do I change the code to avoid such behavior?
NodeJS is not halted while scrolling or selecting text. Only the functions that send data to stdout are halted.
In your server, you can send log data to a file instead, and that way your server will not halt.
For example, see this code:
const init = +new Date;
let str = '';
setInterval(() => {
  const x = (+new Date - init).toString();
  str += x + '\n';
}, 1000);
setTimeout(function () {
  console.log(str);
}, 5000);
I selected text during the first 5 seconds, and this was the result:
C:\me>node a
1002
2002
3002
4003
You can see that there is no 'pause'.
As you can see, the setInterval loop wasn't halted, because there is no console.log inside it.
Now, when you use an output file for logging, you can watch the log live using tail -f. This will show you each new line in the output file.
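A minimal sketch of that approach (the log file name is a placeholder):

const fs = require('fs');

// Append log lines to a file instead of writing to the terminal;
// follow the output live with: tail -f app.log
const log = fs.createWriteStream('app.log', { flags: 'a' });

const init = +new Date;
setInterval(() => {
  log.write((+new Date - init) + '\n');
}, 1000);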
Your console is actually pausing when you scroll or click in the console, as it's entering select mode. Have a look at the title bar while it's paused; it will likely say "Select".
To prevent this behavior, edit the properties of the command prompt, and unselect "quick edit mode".
There are two pieces of information in the node documentation that may give some clues for the reason of that behaviour:
an excerpt from Console:
Warning: The global console object's methods are neither consistently synchronous like the browser APIs they resemble, nor are they consistently asynchronous like all other Node.js streams. See the note on process I/O for more information.
an excerpt from A note on process I/O:
Warning: Synchronous writes block the event loop until the write has completed. This can be near instantaneous in the case of output to a file, but under high system load, pipes that are not being read at the receiving end, or with slow terminals or file systems, it is possible for the event loop to be blocked often enough and long enough to have severe negative performance impacts.
And it seems that a partial solution can be built using the method you already proposed:
const fs = require('fs');
const init = +new Date;
setInterval(() => {
  fs.write(1, String(+new Date - init) + '\n', null, 'utf8', () => {});
}, 1000);
It still blocks the UI if you start a selection, but it doesn't stop processing:
2296
3300 // long pause here when selection was started
4313 // all those lines printed at the same time after selection was aborted
5315
6316
7326
8331
9336
10346
11356
12366
13372
If you'd like to make your console.log and console.error always asynchronous on all platforms, you can do this by using fs.write to fd 1 (stdout) or fd 2 (stderr).
const fs = require('fs');

// Note: these simplified overrides ignore the optional
// encoding and callback arguments that stream.write() accepts.

// used by console.log
process.stdout.write = function write (str) {
  fs.write(1, str, () => {});
};

// used by console.error
process.stderr.write = function write (str) {
  fs.write(2, str, () => {});
};
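With these overrides in place, console.log and console.error go through the asynchronous fs.write path, for example:

// Writes via the async override above instead of the default
// (possibly synchronous) TTY write. Per the fs docs, repeated
// fs.write() calls on the same fd without waiting for the
// callback are unsafe, so strict output ordering isn't guaranteed.
console.log('logged asynchronously');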

What do fibers/future actually do?

What does the line of code below do?
Npm.require('fibers/future');
I looked online for examples and I came across a few like this:
Future = Npm.require('fibers/future');
var accessToken = new Future();
What will the accessToken variable be in this case?
Question is a bit old but my 2 cents:
As Molda said in the comment, Future's main purpose is to make async things work synchronously.
A future instance comes with 3 methods:
future.wait() basically tells your thread to pause until told to resume.
future.return(value) is the first way to tell a waiting future it can resume; it is also very useful since it hands back a value that wait() then returns, hence lines like const ret = future.wait(), where ret becomes your returned value once resumed.
future.throw(error) is quite explicit too: it makes your blocking line throw the given error.
Making things synchronous in JavaScript might sound a bit disturbing, but it is sometimes useful. In Meteor, it's quite handy when you are chaining async calls in a Meteor.method and you want the result to be returned to the client. You could also use Promises, which are now fully supported by Meteor too; I've used both and they work, so it's up to your liking.
A quick example:
const Future = Npm.require('fibers/future');

Meteor.methods({
  foo: function () {
    const future = new Future();
    someAsyncCall(foo, function bar(error, result) {
      if (error) future.throw(error);
      else future.return(result);
    });
    // Execution is paused until the callback arrives
    const ret = future.wait(); // Wait on future, not Future
    return ret;
  }
});
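For comparison, a sketch of the same method written with a Promise instead of a Future (someAsyncCall is the hypothetical async function from the example above; Meteor methods can return a Promise directly):

Meteor.methods({
  foo: function () {
    // Wrap the callback-style call in a Promise; Meteor will
    // resolve it before sending the result to the client.
    return new Promise((resolve, reject) => {
      someAsyncCall(foo, (error, result) => {
        if (error) reject(error);
        else resolve(result);
      });
    });
  }
});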

How to forcibly keep a Node.js process from terminating?

TL;DR
What is the best way to forcibly keep a Node.js process running, i.e., keep its event loop from running empty and hence keeping the process from terminating? The best solution I could come up with was this:
const SOME_HUGE_INTERVAL = 1 << 30;
setInterval(() => {}, SOME_HUGE_INTERVAL);
Which will keep an interval running without causing too much disturbance if you keep the interval period long enough.
Is there a better way to do it?
Long version of the question
I have a Node.js script using Edge.js to register a callback function so that it can be called from inside a DLL in .NET. This function will be called 1 time per second, sending a simple sequence number that should be printed to the console.
The Edge.js part is fine, everything is working. My only problem is that my Node.js process executes its script and after that it runs out of events to process. With its event loop empty, it just terminates, ignoring the fact that it should've kept running to be able to receive callbacks from the DLL.
My Node.js script:
var edge = require('edge');

var foo = edge.func({
  assemblyFile: 'cs.dll',
  typeName: 'cs.MyClass',
  methodName: 'Foo'
});

// The callback function that will be called from C# code:
function callback(sequence) {
  console.info('Sequence:', sequence);
}

// Register for a callback:
foo({ callback: callback }, true);

// My hack to keep the process alive:
setInterval(function () {}, 60000);
My C# code (the DLL):
public class MyClass
{
    Func<object, Task<object>> Callback;

    void Bar()
    {
        int sequence = 1;
        while (true)
        {
            Callback(sequence++);
            Thread.Sleep(1000);
        }
    }

    public async Task<object> Foo(dynamic input)
    {
        // Receives the callback function that will be used:
        Callback = (Func<object, Task<object>>)input.callback;

        // Starts a new thread that will call back periodically:
        (new Thread(Bar)).Start();

        return new object { };
    }
}
The only solution I could come up with was to register a timer with a long interval to call an empty function just to keep the scheduler busy and avoid getting the event loop empty so that the process keeps running forever.
Is there any way to do this better than I did? I.e., keep the process running without having to use this kind of "hack"?
The simplest, least intrusive solution
I honestly think my approach is the least intrusive one:
setInterval(() => {}, 1 << 30);
This will set a harmless interval that will fire approximately once every 12 days, effectively doing nothing, but keeping the process running.
Originally, my solution used Number.POSITIVE_INFINITY as the period, so the timer would actually never fire, but this behavior was recently changed by the API and now it doesn't accept anything greater than 2147483647 (i.e., 2 ** 31 - 1). See docs here and here.
Comments on other solutions
For reference, here are the other two answers given so far:
Joe's (deleted since then, but perfectly valid):
require('net').createServer().listen();
Will create a "bogus listener", as he called it. A minor downside is that we'd allocate a port just for that.
Jacob's:
process.stdin.resume();
Or the equivalent:
process.stdin.on("data", () => {});
Puts stdin into "old" mode, a deprecated feature that is still present in Node.js for compatibility with scripts written prior to Node.js v0.10 (reference).
I'd advise against it. Not only is it deprecated, it also unnecessarily messes with stdin.
Use "old" Streams mode to listen for a standard input that will never come:
// Start reading from stdin so we don't exit.
process.stdin.resume();
Here is an IIFE based on the accepted answer:
(function keepProcessRunning() {
  setTimeout(keepProcessRunning, 1 << 30);
})();
and here is a conditional exit:
let flag = true;
(function keepProcessRunning() {
  setTimeout(() => flag && keepProcessRunning(), 1000);
})();
You could use a setTimeout(function () {}, 2147483647); command to keep your script alive without overload. Note that delays larger than 2147483647 ms (the maximum, as mentioned above) are set to 1 ms, so the timer would fire almost immediately if you exceeded that.
Spin up a nice REPL; node does the same anyway when it isn't given a script to run:
import("repl").then(repl =>
  repl.start({ prompt: "\x1b[31m" + process.versions.node + ": \x1b[0m" }));
I'll throw another hack into the mix. Here's how to do it with a Promise:
new Promise(_ => null);
Throw that at the bottom of your .js file. Be aware, though, that a pending promise is not an event-loop handle, so on current Node.js versions this does not by itself keep the process alive; test it on your target version before relying on it.

Better way to make node not exit?

In a node program I'm reading from a file stream with fs.createReadStream. But when I pause the stream, the program exits. I thought the program would keep running since the file is still open, just not being read.
Currently to get it to not exit I'm setting an interval that does nothing.
setInterval(function() {}, 10000000);
When I'm ready to let the program exit, I clear it. But is there a better way?
Example Code where node will exit:
var fs = require('fs');
var rs = fs.createReadStream('file.js');
rs.pause();
Node will exit when there is no more queued work. Calling pause on a ReadableStream simply pauses the data event. At that point, there are no more events being emitted and no outstanding work requests, so Node will exit. The setInterval works since it counts as queued work.
Generally this is not a problem since you will probably be doing something after you pause that stream. Once you resume the stream, there will be a bunch of queued I/O and your code will execute before Node exits.
Let me give you an example. Here is a script that exits without printing anything:
var fs = require('fs');
var rs = fs.createReadStream('file.js');
rs.pause();
rs.on('data', function (data) {
  console.log(data); // never gets executed
});
The stream is paused, there is no outstanding work, and my callback never runs.
However, this script does actually print output:
var fs = require('fs');
var rs = fs.createReadStream('file.js');
rs.pause();
rs.on('data', function (data) {
  console.log(data); // prints stuff
});
rs.resume(); // queues I/O
In conclusion, as long as you are eventually calling resume later, you should be fine.
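For instance, a common pattern is pausing while some asynchronous work runs and resuming once it finishes. A minimal sketch (handleChunk is a hypothetical function returning a promise):

var fs = require('fs');
var rs = fs.createReadStream('file.js');

rs.on('data', function (data) {
  rs.pause(); // stop the flow while this chunk is processed
  handleChunk(data).then(function () {
    rs.resume(); // the queued I/O keeps the process alive again
  });
});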
Short version, based on the answers above:
require('fs').createReadStream('file.js').pause();
