If you run the following script in Node.js on Windows (at least Windows 8)
const init = +new Date;
setInterval(() => {
  console.log(+new Date - init);
}, 1000);
and drag the thumb of the console window's scroll bar, the output of the script looks similar to
1001
2003 // long drag here
12368 // its result
13370
14372
It looks like Node.js' event loop halts during the scroll. The same thing should happen to asynchronous actions inside the http package, so leaving a visible terminal window open is dangerous for a running server.
How do I change the code to avoid such behavior?
Node.js is not halted while you scroll or select text. Only the functions that send data to stdout are halted.
In your server, you can send log data to a file instead, and that way your server will not halt.
For example, see this code:
const init = +new Date;
var str = '';
setInterval(() => {
  var x = (+new Date - init).toString();
  str += x + '\n';
}, 1000);
setTimeout(function () {
  console.log(str);
}, 5000);
I selected text during the first 5 seconds, and this was the result:
C:\me>node a
1002
2002
3002
4003
You can see that there is no 'pause'.
As you can see, the setInterval loop wasn't halted, because there is no console.log inside it.
Now, when you use an output file for logging, you can view the live log using tail -f, which shows each new line as it is appended to the output file.
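For example, a minimal sketch of file-based logging (the file name server.log is just an assumption here):
const fs = require('fs');

// Append-only log stream; writes go to the file, so the console's
// select mode cannot block the event loop.
const log = fs.createWriteStream('server.log', { flags: 'a' });

const init = +new Date;
setInterval(() => {
  log.write((+new Date - init) + '\n');
}, 1000);
Then, from another terminal, tail -f server.log shows each new line as it arrives.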
Your console is actually pausing when you scroll or click in the console window, because it enters select mode. Have a look at the title bar while it's paused: it will likely say "Select".
To prevent this behavior, edit the properties of the command prompt and untick "QuickEdit Mode".
There are two pieces of information in the Node.js documentation that may give some clues about the reason for this behaviour:
an excerpt from Console:
Warning: The global console object's methods are neither consistently synchronous like the browser APIs they resemble, nor are they consistently asynchronous like all other Node.js streams. See the note on process I/O for more information.
an excerpt from A note on process I/O:
Warning: Synchronous writes block the event loop until the write has completed. This can be near instantaneous in the case of output to a file, but under high system load, pipes that are not being read at the receiving end, or with slow terminals or file systems, it is possible for the event loop to be blocked often enough and long enough to have severe negative performance impacts.
And it seems that a partial solution can be built using the method you already proposed:
const fs = require('fs');
const init = +new Date;
setInterval(() => {
  fs.write(1, String(+new Date - init) + '\n', null, 'utf8', () => {});
}, 1000);
It still blocks the UI if you start a selection, but it doesn't stop processing:
2296
3300 // long pause here when selection was started
4313 // all those lines printed at the same time after selection was aborted
5315
6316
7326
8331
9336
10346
11356
12366
13372
If you'd like to make your console.log and console.error always asynchronous on all platforms, you can do this by using fs.write to fd 1 (stdout) or fd 2 (stderr).
const fs = require('fs')
const util = require('util')

// used by console.log
process.stdout.write = function write (str) {
  fs.write(1, str, () => {})
}

// used by console.error
process.stderr.write = function write (str) {
  fs.write(2, str, () => {})
}
Related
I have a Node v10.14.1 program that reads a CSV file line-by-line using the readline Interface.
My .on('line') callback is an async function that performs some operations which read from and write to a db, so I use async/await to deal with the promises.
A short version of the program's code block of interest would look something like:
const readline = require('readline');
const filesystem = require('fs');
const reader = readline.createInterface({
  input: filesystem.createReadStream(pathToSomeCSV)
});
reader.on('line', async (line) => {
  await doSomeDBStuff();
});
If I leave the above the way it is, the process does not exit. However, if I
reader.on('close', () => {process.exit()});
then the process exits prior to all of the on('line') callbacks finishing and their promises resolving.
My question is: is there a way to say "Upon all lines being read AND all on('line') callbacks being completed with their promises resolved, then exit the process (I assume with process.exit())"?
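One possible approach (a sketch, not taken from the answer below): keep the promise returned by each line's work and wait for all of them in the 'close' handler before exiting. doSomeDBStuff and pathToSomeCSV are the placeholders from the question.
const readline = require('readline');
const filesystem = require('fs');

const reader = readline.createInterface({
  input: filesystem.createReadStream(pathToSomeCSV)
});

// Collect one promise per line instead of awaiting inside the handler.
const pending = [];
reader.on('line', (line) => {
  pending.push(doSomeDBStuff());
});

// 'close' fires once every line has been emitted; wait for the DB work too.
reader.on('close', async () => {
  await Promise.all(pending);
  process.exit();
});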
Investigation
I get the feeling the docs are leaving some non-obvious details out. I was unable to get this official example working correctly (which is what your question appears to be based on). That implementation would kill my application prematurely. Or, if I removed the 'close' listener, the terminal would just hang forever on exit. I tried overriding process.on('exit') to no avail. I also tried the prompt-sync package, but it consistently corrupted my terminal.
Solution
I found a lovely answer here which offers a good solution.
Create the function:
const fs = require('fs');

const prompt = msg => {
  fs.writeSync(1, String(msg));            // print the prompt to stdout
  let s = '', buf = Buffer.alloc(1);
  // Read one byte at a time from stdin (fd 0) until a newline (10)
  // or carriage return (13) arrives.
  while (buf[0] - 10 && buf[0] - 13)
    s += buf, fs.readSync(0, buf, 0, 1, 0);
  return s.slice(1);                        // drop the initial NUL byte
};
Use it:
const result = prompt('Input something: ');
console.log('Your input was: ' + result);
No terminal corruption, the application does not die prematurely, and it does not hang on exit, either.
This solution is not perfect, however: it intentionally blocks the main thread while waiting for user input, meaning you cannot run other functions in the background during that wait. In my mind, user input should be thread-blocking in most cases anyway, so this solution works very well for me personally.
Edit: see an improved version for Linux here.
Assuming a Readable stream in Node.js with a relatively slow 'data' (on('data', ...)) event handler tied to it, is it possible for the 'end' event to fire before the last 'data' handler(s) have finished, and if so, will it prematurely terminate those handlers? Or will all 'data' events get dispatched and run?
In my case, I am working with large files and want to commit to a DB every data chunk. I am worried that I may lose the last record or two (or more) if End is fired before the last DB calls in the handler actually complete.
The 'end' event fires after the last 'data' event, but it may happen before the last 'data' handler has finished. It is also possible that before one 'data' handler has finished, the next one is started. Depending on what your code does, a later 'data' handler may therefore finish before an earlier one, which can cause errors and problems in your code.
Example of how to cause problems (for your own tests):
var fs = require('fs');
var rr = fs.createReadStream('somebigfile.jpg');
var i = 0;

rr.on('data', function(chunk) {
  i++;
  var s = i;
  console.log('readable:' + s);
  setTimeout(function() {
    console.log('timeout:' + s);
  }, 50 - i * 10);
});

rr.on('end', function() {
  console.log('end');
});
It prints to your console when each 'data' event handler starts, and again some milliseconds later when it finishes. The finishing lines may appear in a different order.
Solution:
Readable streams have two modes, 'flowing mode' and 'paused mode'. When you add a 'data' event handler, you automatically switch the stream to flowing mode.
From the documentation:
When in flowing mode, data is read from the underlying system and
provided to your program as fast as possible
In this mode, events will not wait for your slow actions to finish. What you need is 'paused mode'.
From the documentation:
In paused mode, you must explicitly call stream.read() to get chunks
of data out. Streams start out in paused mode.
In other words: you ask for a chunk of data, you get it, you work with it, and when you are ready, you ask for a new chunk. In this mode you control when you get your data.
How to change to 'paused mode':
It is the default mode for this stream, but when you register a 'data' event handler, the stream switches to 'flowing mode'. Therefore, do not use readstream.on('data', ...).
Instead, use readstream.on('readable', function(){...}); when it fires, the stream is ready to give you a chunk of data. To get that chunk, use var chunk = readstream.read();
Example from docs:
var fs = require('fs');
var rr = fs.createReadStream('foo.txt');
rr.on('readable', function() {
  console.log('readable:', rr.read());
});
rr.on('end', function() {
  console.log('end');
});
Please read the documentation for more details, because there are more situations in which a stream is automatically switched to 'flowing mode'.
Working with slow handlers in flowing mode:
If you want or need to work in 'flowing mode', there is also a solution: you can pause and resume the stream. When you get a chunk from the readstream ('data'), pause the stream, and when you have finished your work, resume it.
Example from documentation:
var readable = getReadableStreamSomehow();
readable.on('data', function(chunk) {
  console.log('got %d bytes of data', chunk.length);
  readable.pause();
  console.log('there will be no more data for 1 second');
  setTimeout(function() {
    console.log('now data will start flowing again');
    readable.resume();
  }, 1000);
});
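Applied to the database scenario from the question above, a sketch might look like this (saveChunkToDB is a hypothetical stand-in for your actual asynchronous DB call):
var fs = require('fs');
var rr = fs.createReadStream('somebigfile.jpg');

rr.on('data', function(chunk) {
  rr.pause();                           // stop the flow while we work
  saveChunkToDB(chunk, function(err) {  // hypothetical async DB write
    // error checking goes here
    rr.resume();                        // ask for the next chunk only when done
  });
});

rr.on('end', function() {
  // Because each handler pauses until its DB write has completed,
  // 'end' only fires after the last write has finished.
  console.log('end');
});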
I have a Firebase connection in Node.js that pushes data to a URL while the connection is persistent; when it closes, I want to remove that data (think: I push "hey I'm here", and when I leave, the text disappears).
I made a "runnable" that shows an example of it:
http://web-f6176e84-c073-416f-93af-62a9a9fbfabd.runnable.com
Basically, hit "Ctrl + C" and it prints out "Trying to remove reference" but never actually deletes the data. (The documentation says that remove() is equivalent to set(null), i.e. it sets the data to null, and since it's null, the entire element should be gone.)
However, it's not removing it; I never see the data "disappear". (I'm using a temp Firebase URL; you should be able to duplicate this with any URL you can access if this URL stops existing.)
This is the code I'm using:
var FB_URL = 'https://cuhiqgro1t3.firebaseio-demo.com/test_code';
var Firebase = require('firebase');
var myRootRef = new Firebase(FB_URL);
console.log("created Firebase URL");

process.stdin.resume(); //so the program will not close instantly

function delete_fb_entries() {
  return function() {
    console.log("Trying to remove reference");
    myRootRef.remove();
    process.exit();
  };
}

//do something when app is closing
process.on('exit', delete_fb_entries());
//catches ctrl+c event
process.on('SIGINT', delete_fb_entries());
//catches uncaught exceptions
process.on('uncaughtException', delete_fb_entries());
EDIT: Additional information as to the "why": I push my local IP address out to my Firebase URL because I'm lazy and it's easier to just have a webpage set up that I can always access to show the URLs of particular devices (and I know using the router's tables would be easier). I also have other purposes for this: if I happen to be inside my network, I can just select a particular device from my webpage and access the data I need. Either way, it works, but I just can't get it to remove itself correctly; this used to work at one point in time, I believe, so I can only assume the API has changed or something.
EDIT 2: OK, I removed process.exit() as suggested, and the runnable seemed to delete the data in question. I tried it on my local data (after some cleaning up and commenting out), and it removed the data; however, when I hit Ctrl + C it no longer exits the program... so yay.
I need to figure out if "process.exit()" is necessary or unnecessary at this point.
EDIT 3: OK, so I do need to use process.exit() (as far as I can tell; Ctrl + C no longer exits the program, I have to Ctrl + Z and reboot). I tried adding it right after the remove() call, but then I realized that removing a Firebase element is not a synchronous operation, so my next attempt was to use the onComplete handler of the remove function (i.e. remove(onComplete)) and put process.exit() inside the onComplete function.
So finally it looks like this, and it seems to be working with my application:
var FB_URL = 'https://cuhiqgro1t3.firebaseio-demo.com/test_code';
var Firebase = require('firebase');
var myRootRef = new Firebase(FB_URL);
console.log("created Firebase URL");

function onComplete() {
  process.exit();
}

process.stdin.resume(); //so the program will not close instantly

function delete_fb_entries() {
  return function() {
    console.log("Trying to remove reference");
    myRootRef.remove(onComplete);
  };
}

//do something when app is closing
process.on('exit', delete_fb_entries());
//catches ctrl+c event
process.on('SIGINT', delete_fb_entries());
//catches uncaught exceptions
process.on('uncaughtException', delete_fb_entries());
EDIT 4: In response to comments below, I tried modifying a simple program to the following:
function delete_fb_entries() {
  return function() {
    console.log("I should quit soon");
  };
}

process.stdin.resume(); //so the program will not close instantly

//catches ctrl+c event
process.on('SIGINT', delete_fb_entries());
My program never exited. I don't understand why Node would not close in this case; adding a process.exit() after the console.log causes Node.js to quit. This is not an async function, so why is it not exiting in this case? (Is this a bug, or a misunderstanding on my part of how this works?)
You cannot perform asynchronous operations in a process's exit event handler, only synchronous operations, since the process exits once all exit event handlers have been executed. Also note that once you install your own 'SIGINT' listener, Node.js no longer exits on Ctrl+C by default; your handler has to call process.exit() itself, which is why the program in EDIT 4 never exits.
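A sketch of the resulting pattern, reusing the reference name from the question (myRootRef), could look like this:
// Asynchronous cleanup belongs in the signal handler, and that handler must
// end the process itself, because installing a 'SIGINT' listener removes the
// default "exit on Ctrl+C" behaviour.
process.on('SIGINT', function () {
  console.log("Trying to remove reference");
  myRootRef.remove(function () {
    process.exit();   // exit only after the asynchronous removal has completed
  });
});

// The 'exit' handler runs after the event loop has stopped, so only
// synchronous work (e.g. a final log line) is possible here.
process.on('exit', function () {
  console.log("Exiting now");
});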
TL;DR
What is the best way to forcibly keep a Node.js process running, i.e., keep its event loop from running empty and hence keeping the process from terminating? The best solution I could come up with was this:
const SOME_HUGE_INTERVAL = 1 << 30;
setInterval(() => {}, SOME_HUGE_INTERVAL);
Which will keep an interval running without causing too much disturbance if you keep the interval period long enough.
Is there a better way to do it?
Long version of the question
I have a Node.js script using Edge.js to register a callback function so that it can be called from inside a DLL in .NET. This function will be called 1 time per second, sending a simple sequence number that should be printed to the console.
The Edge.js part is fine, everything is working. My only problem is that my Node.js process executes its script and after that it runs out of events to process. With its event loop empty, it just terminates, ignoring the fact that it should've kept running to be able to receive callbacks from the DLL.
My Node.js script:
var edge = require('edge');

var foo = edge.func({
  assemblyFile: 'cs.dll',
  typeName: 'cs.MyClass',
  methodName: 'Foo'
});

// The callback function that will be called from C# code:
function callback(sequence) {
  console.info('Sequence:', sequence);
}

// Register for a callback:
foo({ callback: callback }, true);

// My hack to keep the process alive:
setInterval(function() {}, 60000);
My C# code (the DLL):
public class MyClass
{
    Func<object, Task<object>> Callback;

    void Bar()
    {
        int sequence = 1;
        while (true)
        {
            Callback(sequence++);
            Thread.Sleep(1000);
        }
    }

    public async Task<object> Foo(dynamic input)
    {
        // Receives the callback function that will be used:
        Callback = (Func<object, Task<object>>)input.callback;

        // Starts a new thread that will call back periodically:
        (new Thread(Bar)).Start();

        return new object { };
    }
}
The only solution I could come up with was to register a timer with a long interval to call an empty function just to keep the scheduler busy and avoid getting the event loop empty so that the process keeps running forever.
Is there any way to do this better than I did? I.e., keep the process running without having to use this kind of "hack"?
The simplest, least intrusive solution
I honestly think my approach is the least intrusive one:
setInterval(() => {}, 1 << 30);
This will set a harmless interval that will fire approximately once every 12 days, effectively doing nothing, but keeping the process running.
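(For reference: 1 << 30 milliseconds is 1,073,741,824 ms, which works out to roughly 12.4 days per firing.)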
Originally, my solution used Number.POSITIVE_INFINITY as the period, so the timer would actually never fire, but this behavior was recently changed by the API and now it doesn't accept anything greater than 2147483647 (i.e., 2 ** 31 - 1). See docs here and here.
Comments on other solutions
For reference, here are the other two answers given so far:
Joe's (deleted since then, but perfectly valid):
require('net').createServer().listen();
Will create a "bogus listener", as he called it. A minor downside is that we'd allocate a port just for that.
Jacob's:
process.stdin.resume();
Or the equivalent:
process.stdin.on("data", () => {});
Puts stdin into "old" mode, a deprecated feature that is still present in Node.js for compatibility with scripts written prior to Node.js v0.10 (reference).
I'd advise against it. Not only is it deprecated, it also unnecessarily messes with stdin.
Use "old" Streams mode to listen for a standard input that will never come:
// Start reading from stdin so we don't exit.
process.stdin.resume();
Here is an IIFE based on the accepted answer:
(function keepProcessRunning() {
  setTimeout(keepProcessRunning, 1 << 30);
})();
and here is a conditional exit:
let flag = true;
(function keepProcessRunning() {
  setTimeout(() => flag && keepProcessRunning(), 1000);
})();
You could use a setTimeout(function() {""},1000000000000000000); command to keep your script alive without overload.
Spin up a nice REPL; node does much the same itself when it isn't given a script to run:
import("repl").then(repl=>
repl.start({prompt:"\x1b[31m"+process.versions.node+": \x1b[0m"}));
I'll throw another hack into the mix. Here's how to do it with Promise:
new Promise(_ => null);
Throw that at the bottom of your .js file and it should run forever.
I have a file with a lot of entries (10+ million), each representing a partial document that is being saved to a mongo database (based on some criteria, non-trivial).
To avoid overloading the database (which is doing other operations at the same time), I wish to read in chunks of X lines, wait for them to finish, read the next X lines, etc.
Is there any way to use any of the fs callback mechanisms to also "halt" progress at a certain point, without blocking the entire program? From what I can tell, they will all run from start to finish with no way of stopping them, unless you stop reading the file entirely.
The issue is that, because of the file size, memory also becomes a problem: because of the time the updates take, a LOT of the data will be held in memory, exceeding the 1 GB limit and causing the program to crash. Secondly, as I said, I don't want to queue 1 million updates and completely stress the MongoDB database.
Any and all suggestions welcome.
UPDATE: Final solution using line-reader (available via npm) below, in pseudo-code.
var lineReader = require('line-reader');
var filename = <wherever you get it from>;

lineReader(filename, function(line, last, cb) {
  //
  // Do work here, line contains the line data
  // last is true if it's the last line in the file
  //
  function checkProcessed(callback) {
    if (doneProcessing()) { // Implement doneProcessing to check whether whatever you are doing is done
      callback();
    }
    else {
      setTimeout(function() { checkProcessed(callback); }, 100); // Adjust timeout according to the expected time to process one line
    }
  }
  checkProcessed(cb);
});
This is implemented to make sure doneProcessing() returns true before attempting to work on more lines - this means you can effectively throttle whatever you are doing.
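For illustration, one way doneProcessing() could be backed (a hypothetical sketch, since the pseudo-code above leaves it unimplemented) is a simple in-flight counter around the DB calls:
var inFlight = 0;
var MAX_IN_FLIGHT = 100; // how many updates we allow to be pending at once

function startUpdate(line) {
  inFlight++;
  saveLineToMongo(line, function(err) { // hypothetical async DB call
    // error checking goes here
    inFlight--;
  });
}

function doneProcessing() {
  // "Done enough" to accept more work once the backlog is small again.
  return inFlight < MAX_IN_FLIGHT;
}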
I don't use MongoDB and I'm not an expert in using Lazy, but I think something like below might work or give you some ideas. (note that I have not tested this code)
var fs = require('fs'),
    lazy = require('lazy');

var readStream = fs.createReadStream('yourfile.txt');

var file = lazy(readStream)
  .lines                  // ask to read stream line by line
  .take(100)              // and read 100 lines at a time.
  .join(function(oneHundredLines) {
    readStream.pause();   // pause reading the stream
    writeToMongoDB(oneHundredLines, function(err) {
      // error checking goes here
      // resume the stream 1 second after MongoDB finishes saving.
      setTimeout(readStream.resume, 1000);
    });
  });