Can I write a real async callback in Nodejs? - node.js

This is a normal example to read a file:
var fs = require('fs');
fs.readFile('./gparted-live-0.18.0-2-i486.iso', function (err, data) {
console.log(data.length);
});
console.log('All done.');
the code above outputs:
All done.
187695104
whereas this is my own version of a callback, I hope it could be async like the file reading code above, but it is not:
var f = function(cb) {
cb();
};
f(function() {
var i = 0;
// Do some very long job.
while(++i < (1<<30)) {}
console.log('Cb comes back.')
});
console.log('All done.');
the code above outputs:
Cb comes back.
All done.
So far it's clear that in the file-reading version, All done. is always printed before the file is read. In my home-brewed version, however, All done. is not printed until the very long job has finished.
So what on earth is the magic that makes fs.readFile's callback an async call back while mine is not?

var f = function(cb) {
cb();
};
Is not async because it invokes cb immediately.
I think you want
var f = function(cb) {
setImmediate(function(){ cb(); });
};
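A minimal sketch of the difference, recording the order of events instead of running a long loop. The names callNow, callLater and order are illustrative only and are not part of any API.

```javascript
// Record the order in which things happen.
const order = [];

// Synchronous version: cb runs before callNow returns.
function callNow(cb) {
  cb();
}

// Deferred version: cb runs on a later turn of the event loop.
function callLater(cb) {
  setImmediate(cb);
}

callNow(() => order.push('sync cb'));
callLater(() => order.push('deferred cb'));
order.push('All done.');

// Immediates run in the order they were created, so this prints last.
setImmediate(() => console.log(order.join(' -> ')));
// sync cb -> All done. -> deferred cb
```

Deferring changes when the callback runs, not how long it occupies the thread once it starts.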

In your example the while loop occupies the event loop, so the call to console.log('All done.') cannot run until the loop finishes. Your callback is invoked synchronously inside f, so everything after the f(...) call has to wait for it to return. Once the thread is free again, the remaining statements run in sequence.
In Mastering Node.js by Sandro Pasquali - Chapter 2, he discusses deferred execution and the event-loop in order to avoid the issue of the event-loop taking hold and blocking execution. I recommend reading that chapter in order to better understand this non-intuitive way of working in Node.js.
From Mastering Node.js...
Node processes JavaScript instructions using a single thread. Within
your JavaScript program no two operations will ever execute at exactly
the same moment, as might happen in a multithreaded environment.
Understanding this fact is essential to understanding how a Node
program, or process, is designed and runs.
The use of setImmediate() can remedy this issue.

You can use setImmediate() to defer the execution of code until the next cycle of the event loop, which I think accomplishes what you want:
var f = function(cb) {
cb();
};
f(function() {
setImmediate(function() {
var i = 0;
// Do some very long job.
while(++i < (1<<30)) {}
console.log('Cb comes back.')
});
});
console.log('All done.');
The documentation for setImmediate explains the difference between process.nextTick and setImmediate thusly:
Immediates are queued in the order created, and are popped off the queue once per loop iteration. This is different from process.nextTick which will execute process.maxTickDepth queued callbacks per iteration. setImmediate will yield to the event loop after firing a queued callback to make sure I/O is not being starved. While order is preserved for execution, other I/O events may fire between any two scheduled immediate callbacks.
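As a rough illustration of that ordering, the sketch below queues one callback of each kind and records when each one runs. The order array is just a helper for the demo.

```javascript
const order = [];

// Queued first, but runs last: immediates wait for the check phase.
setImmediate(() => order.push('setImmediate'));

// Runs as soon as the current synchronous code has finished.
process.nextTick(() => order.push('nextTick'));

order.push('synchronous code');

// A second immediate runs after the first, so the list is complete here.
setImmediate(() => console.log(order.join(' -> ')));
// synchronous code -> nextTick -> setImmediate
```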
Edit: Updated answer based on @generalhenry's comment.

Related

Which Event Loop Phase Executes Ordinary JavaScript Code

I am new to node.js and a little confused about the event loop. As far as I know from https://github.com/nodejs/node/blob/master/doc/topics/event-loop-timers-and-nexttick.md, the event-loop phases only process setTimeout, setInterval, setImmediate, process.nextTick, promises and some I/O callbacks.
My question is, if i have following code:
for (var i = 0; i < 100000000; i++)
;
in which phase will the above code be executed?
Regular JavaScript code, like the for loop in your example, is executed before the queues are cleared. The first thing node will do is run your code, and will only call callbacks, timeout results, I/O results, and so on after your code finishes.
As an example, you could try this code:
fs.open('filename', 'r', () => {
console.log('File opened.');
});
for (var i = 0; i < 100000000; i++);
console.log('Loop complete.');
No matter how big or small your loop variable, 'Loop complete' will always appear before 'File opened'. This is because with only one thread, node can't run the callback you've supplied to the fs.open function until the loop code has finished.
Remember that there isn't a "main" thread that node keeps going back to. Most long-running node programs will run through the code in main.js pretty quickly, and subsequent code is all going to come from callbacks. The purpose of the initial execution is to define how and when those callbacks happen.
In the node event loop doc (https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick), the following code is given as an example:
const fs = require('fs');
function someAsyncOperation(callback) {
// Assume this takes 95ms to complete
fs.readFile('/path/to/file', callback);
}
const timeoutScheduled = Date.now();
setTimeout(() => {
const delay = Date.now() - timeoutScheduled;
console.log(`${delay}ms have passed since I was scheduled`);
}, 100);
// do someAsyncOperation which takes 95 ms to complete
someAsyncOperation(() => {
const startCallback = Date.now();
// 10ms loop
while (Date.now() - startCallback < 10) {
// do nothing
}
});
The event loop keeps cycling through its phases. When fs.readFile() finishes, the poll queue is empty, so its callback is added and executed immediately. The callback contains a blocking 10ms loop that runs before the timer callback can execute. That is why the output will be:
105ms have passed since I was scheduled instead of the 100ms you might expect.
Most of your code will live in callbacks so will be executed in the poll phase. If not, like in your example, it will be executed before entering any phases as it will block the event loop.
The caveat is callbacks scheduled by setImmediate: they run in the check phase, before the poll phase resumes in the next loop iteration.
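A small sketch of that caveat, assuming fs.stat('.') as a stand-in for any I/O call. The function name ioThenSchedule is made up for the demo. Inside an I/O callback, the immediate runs in the check phase of the same loop iteration, while the 0 ms timer has to wait for the timers phase of the next one.

```javascript
const fs = require('fs');

// Schedule a timer and an immediate from inside an I/O callback
// and report the order in which they actually ran.
function ioThenSchedule() {
  return new Promise((resolve) => {
    const order = [];
    fs.stat('.', () => {
      setTimeout(() => {
        order.push('timeout');
        resolve(order);
      }, 0);
      setImmediate(() => order.push('immediate'));
    });
  });
}

ioThenSchedule().then((order) => console.log(order.join(' -> ')));
// immediate -> timeout
```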

Understanding Node.js event loop. process.nextTick() never invoked. Why?

I am experimenting with the event loop. First I begin with this straightforward code to read and print the contents of a file:
var fs = require('fs');
var PATH = "./.gitignore";
fs.readFile(PATH,"utf-8",function(err,text){
console.log("----read: "+text);
});
Then I place it into an infinite loop. In this case, the readFile function is never executed. If I am not mistaken it's because Node's single thread is busy iterating without letting I/O calls be executed.
while(true){
var fs = require('fs');
var PATH = "./.gitignore";
fs.readFile(PATH,"utf-8",function(err,text){
console.log("----read: "+text);
});
}
So, I would like to do something so that I/O calls are assigned process time intertwined with the loop. I tried with process.nextTick() but it doesn't work:
while(true){
process.nextTick(function(){
fs.readFile(PATH,"utf-8",function(err,text){
console.log("----read: "+text)
});
});
}
Why isn't it working and how could I make it?
Because your while loop is still running. It's just infinitely adding things to do in the next tick. If you let it go, your node process will crash as it runs out of memory.
When you work with async code, your normal loops and control structures tend to trip you up. The reason is that they execute synchronously in one step of the event loop. Until something happens that yields control to the event loop again, nothing 'nextTick' will happen.
Think of it like this: you are in pass B of the event loop when your code runs. When you call
process.nextTick(function foo() { doStuff(); })
you are adding the foo to the list of 'things to do before you start pass C of the event loop.' Every time you call nextTick, you add one more thing to the list, but none of them will run until the synchronous code is done.
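To see this without hanging the process, here is a sketch that uses a bounded loop in place of while(true). The order array is only there to record what happened.

```javascript
const order = [];

// A bounded loop stands in for while(true) so the script can finish.
for (let i = 0; i < 3; i++) {
  process.nextTick(() => order.push('tick ' + i));
  order.push('queued ' + i);
}
order.push('loop finished');

// By the time an immediate runs, all nextTick callbacks have run.
setImmediate(() => console.log(order.join(', ')));
// queued 0, queued 1, queued 2, loop finished, tick 0, tick 1, tick 2
```

With while(true) the 'loop finished' line is never reached, so none of the queued callbacks ever get a turn.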
What you need to do instead is create 'do the next thing' links in your callbacks. Think linked-lists.
// var files = your list of files;
function do_read(count) {
var next = count+1;
fs.readFile(files[count], "utf-8", function(err,text) {
console.log("----read: " + text);
if (next < files.length) {
// this doesn't run until the previous readFile completes.
process.nextTick(function() { do_read(next) });
}
});
}
// kick off the first one:
do_read(0);
(obviously this is a contrived example, but you get the idea)
This causes each 'next file' to be added to the 'nextTick' to-do queue only after the previous one has been fully processed.
TL;DR: Most of the time, you don't want to start it doing the next thing until the previous thing is completed
Hope that helps!
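On Node.js versions that ship fs.promises and async/await, the same "start the next read only after the previous one finished" idea can be sketched as below. This is an alternative to the callback chain, not a replacement for the original answer. The temp files and the name readInOrder are made up for the demo.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Read files strictly one after another: each read starts only
// after the previous one has completed.
async function readInOrder(files) {
  const contents = [];
  for (const file of files) {
    contents.push(await fs.promises.readFile(file, 'utf-8'));
  }
  return contents;
}

// Demo with two throwaway files in the temp directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'read-demo-'));
const files = ['a.txt', 'b.txt'].map((name) => path.join(dir, name));
files.forEach((file, i) => fs.writeFileSync(file, 'file ' + i));

readInOrder(files).then((contents) => {
  contents.forEach((text) => console.log('----read: ' + text));
});
```

Each await hands control back to the event loop, so other callbacks can run between reads.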

How to forcibly keep a Node.js process from terminating?

TL;DR
What is the best way to forcibly keep a Node.js process running, i.e., keep its event loop from running empty and hence keeping the process from terminating? The best solution I could come up with was this:
const SOME_HUGE_INTERVAL = 1 << 30;
setInterval(() => {}, SOME_HUGE_INTERVAL);
Which will keep an interval running without causing too much disturbance if you keep the interval period long enough.
Is there a better way to do it?
Long version of the question
I have a Node.js script using Edge.js to register a callback function so that it can be called from inside a DLL in .NET. This function will be called 1 time per second, sending a simple sequence number that should be printed to the console.
The Edge.js part is fine, everything is working. My only problem is that my Node.js process executes its script and after that it runs out of events to process. With its event loop empty, it just terminates, ignoring the fact that it should've kept running to be able to receive callbacks from the DLL.
My Node.js script:
var
edge = require('edge');
var foo = edge.func({
assemblyFile: 'cs.dll',
typeName: 'cs.MyClass',
methodName: 'Foo'
});
// The callback function that will be called from C# code:
function callback(sequence) {
console.info('Sequence:', sequence);
}
// Register for a callback:
foo({ callback: callback }, true);
// My hack to keep the process alive:
setInterval(function() {}, 60000);
My C# code (the DLL):
public class MyClass
{
Func<object, Task<object>> Callback;
void Bar()
{
int sequence = 1;
while (true)
{
Callback(sequence++);
Thread.Sleep(1000);
}
}
public async Task<object> Foo(dynamic input)
{
// Receives the callback function that will be used:
Callback = (Func<object, Task<object>>)input.callback;
// Starts a new thread that will call back periodically:
(new Thread(Bar)).Start();
return new object { };
}
}
The only solution I could come up with was to register a timer with a long interval to call an empty function just to keep the scheduler busy and avoid getting the event loop empty so that the process keeps running forever.
Is there any way to do this better than I did? I.e., keep the process running without having to use this kind of "hack"?
The simplest, least intrusive solution
I honestly think my approach is the least intrusive one:
setInterval(() => {}, 1 << 30);
This will set a harmless interval that will fire approximately once every 12 days, effectively doing nothing, but keeping the process running.
Originally, my solution used Number.POSITIVE_INFINITY as the period, so the timer would actually never fire, but this behavior was recently changed by the API and now it doesn't accept anything greater than 2147483647 (i.e., 2 ** 31 - 1). See docs here and here.
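If the process should be able to exit cleanly later, one option is to keep a reference to the timer and clear it. A minimal sketch; keepAlive and release are hypothetical names.

```javascript
// Largest delay the timers API accepts, in milliseconds (2 ** 31 - 1).
const MAX_DELAY = 2147483647;

// Start a do-nothing interval and return a function that cancels it.
function keepAlive() {
  const timer = setInterval(() => {}, MAX_DELAY);
  return function release() {
    clearInterval(timer);
  };
}

const release = keepAlive();
// ... later, once the external callbacks are no longer needed:
release();
```

After release() is called, the event loop has nothing left to wait for and the process can terminate normally.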
Comments on other solutions
For reference, here are the other two answers given so far:
Joe's (deleted since then, but perfectly valid):
require('net').createServer().listen();
Will create a "bogus listener", as he called it. A minor downside is that we'd allocate a port just for that.
Jacob's:
process.stdin.resume();
Or the equivalent:
process.stdin.on("data", () => {});
Puts stdin into "old" mode, a deprecated feature that is still present in Node.js for compatibility with scripts written prior to Node.js v0.10 (reference).
I'd advise against it. Not only is it deprecated, it also unnecessarily messes with stdin.
Use "old" Streams mode to listen for a standard input that will never come:
// Start reading from stdin so we don't exit.
process.stdin.resume();
Here is an IIFE based on the accepted answer:
(function keepProcessRunning() {
setTimeout(keepProcessRunning, 1 << 30);
})();
and here is conditional exit:
let flag = true;
(function keepProcessRunning() {
setTimeout(() => flag && keepProcessRunning(), 1000);
})();
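You could use setTimeout(function() {}, 2147483647); to keep your script alive without overload. Node.js sets any delay larger than 2147483647 ms (about 24.8 days) to 1 ms, so a bigger number such as 1000000000000000000 would fire almost immediately instead of keeping the process alive.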
spin up a nice repl, node would do the same if it didn't receive an exit code anyway:
import("repl").then(repl=>
repl.start({prompt:"\x1b[31m"+process.versions.node+": \x1b[0m"}));
I'll throw another hack into the mix. Here's how to do it with Promise:
new Promise(_ => null);
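Throw that at the bottom of your .js file. Note, however, that a pending promise by itself does not keep the Node.js event loop alive, so this only appears to work when something else (a timer, a socket, or another active handle) is already holding the process open.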

How to execute an async task with socket.io and node.js?

When I receive an "on" event on the server side, I want to start a task in parallel so it does not block the current event loop thread. Is it possible to do so? How?
I don't want to block the server side loop and I want to be able to send back a message to the client once the task is done, something such as:
client.on('execute-parallel-task', function(msg) {
setTimeout(function() {
// do something that takes a while
client.emit('finished-that-task');
},0);
// this block should return asap, not waiting for the previous call
});
I am not sure if setTimeout will do the job.
It depends what the takes a while is. If it takes a while asynchronously (you can tell because you'll have to register a callback or complete handler), and takes a while because it's blocked on something like IO, rather than CPU bound, it'll inherently be parallel.
If however, its something synchronous or CPU bound, whilst you can use setTimeout, setImmediate etc. to send back a message immediately, once the handler for setTimeout or setImmediate executes, your single thread of execution will be stuck handling that; you're not really fixing the problem, merely deferring it.
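The "merely deferring it" point can be seen in a rough sketch like the following. A busy-wait stands in for CPU-bound work, and a 10 ms timer registered at the start of the deferred job cannot fire until the job returns. The helper names busyWait and measureTimerLateness are made up.

```javascript
// Busy-wait for roughly `ms` milliseconds, standing in for CPU-bound work.
function busyWait(ms) {
  const start = Date.now();
  while (Date.now() - start < ms) {}
}

// Defer a heavy job, then measure how long a 10 ms timer really takes.
function measureTimerLateness(busyMs) {
  return new Promise((resolve) => {
    setImmediate(() => {
      const start = Date.now();
      setTimeout(() => resolve(Date.now() - start), 10);
      busyWait(busyMs); // nothing else can run until this returns
    });
  });
}

measureTimerLateness(50).then((elapsed) => {
  console.log('10 ms timer fired after ' + elapsed + ' ms');
});
```

The reported time is at least the busy-wait duration, because the timer callback has to wait for the deferred job to give the thread back.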
To exhibit true parallel behaviour, you'll need to launch a child process. You can use the message passing functionality to notify your worker what work to do, and to notify the parent process once the work is complete.
var cp = require('child_process');
var child = cp.fork(__dirname + '/my-child-worker.js');
// Listen for messages coming back from the worker process.
child.on('message', function(m) {
if (m === "done") {
// Whey!
}
});
child.send(/* Job id, or something */);
Then in my-child-worker.js;
process.on('message', function (m) {
switch (m) {
case 'get-x':
// blah
break;
// other jobs
}
process.send('done');
});
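For a single-file variant, the same message-passing idea can be sketched with the built-in worker_threads module. Note that this is a different mechanism from the child_process.fork approach above: it runs the job on a thread rather than in a separate process. It assumes a Node.js version where worker_threads is available. The function runInWorker and the summing job are invented for the example.

```javascript
const { Worker } = require('worker_threads');

// Worker code kept inline (eval: true) so the sketch fits in one file.
const workerSource = `
  const { parentPort } = require('worker_threads');
  parentPort.on('message', (limit) => {
    let sum = 0;
    for (let i = 0; i < limit; i++) sum += i; // CPU-bound work
    parentPort.postMessage(sum);
  });
`;

// Send one job to a fresh worker and resolve with its reply.
function runInWorker(limit) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true });
    worker.once('message', (result) => {
      resolve(result);
      worker.terminate(); // let the process exit once the job is done
    });
    worker.once('error', reject);
    worker.postMessage(limit);
  });
}

runInWorker(1e7).then((sum) => console.log('sum from worker:', sum));
console.log('main thread is free while the worker computes');
```

In a socket.io handler, the then callback would be the place to emit the "finished" message back to the client.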
You do not need the setTimeout.
Your function(msg) will be called once the execute parallel task finishes.
If you are designing a task to run in an async manner, you can look at something like the async lib for node.js
Async Node JS Link

Nodejs asynchronous confusion

I can't seem to grasp how to maintain async control flow with NodeJs. All of the nesting makes the code very hard to read in my opinion. I'm a novice, so I'm probably missing the big picture.
What is wrong with simply coding something like this...
function first() {
var object = {
aProperty: 'stuff',
anArray: ['html', 'html']
};
second(object);
}
function second(object) {
for (var i = 0; i < object.anArray.length; i++) {
third(object.anArray[i]);
};
}
function third(html) {
// Parse html
}
first();
The "big picture" is that any I/O is non-blocking and is performed asynchronously in your JavaScript; so if you do any database lookups, read data from a socket (e.g. in an HTTP server), read or write files to the disk, etc., you have to use asynchronous code. This is necessary as the event loop is a single thread, and if I/O wasn't non-blocking, your program would pause while performing it.
You can structure your code such that there is less nesting; for example:
var fs = require('fs');
var mysql = require('some_mysql_library');
fs.readFile('/my/file.txt', 'utf8', processFile);
function processFile(err, data) {
mysql.query("INSERT INTO tbl SET txt = '" + data + "'", doneWithSql);
}
function doneWithSql(err, results) {
if(err) {
console.log("There was a problem with your query");
} else {
console.log("The query was successful.");
}
}
There are also flow control libraries like async (my personal choice) to help avoid lots of nested callbacks.
You may be interested in this screencast I created on the subject.
As @BrandonTilley said, I/O is asynchronous, so you need callbacks in Node.js to handle them. This is why Node.js can do so much with just a single thread (it's not actually doing more in a single thread, but rather than having the thread wait around for the data, it just starts processing the next task and when the I/O comes back, then it'll jump back to that task with the callback function you gave it).
But, nested callbacks can be taken care of with a good library like the venerable async or my new little library: queue-flow. They handle the callback issues and let you keep your code un-nested and looking very similar to blocking, synchronous code. :)
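Besides those libraries, newer Node.js versions offer another way to keep callback code flat: wrap the callback API with util.promisify and use async/await. A minimal sketch; countLines and the temp file name are hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { promisify } = require('util');

// Wrap the callback-style API once; readFile now returns a promise.
const readFile = promisify(fs.readFile);

// The steps read top to bottom, and errors from any step land in one catch.
async function countLines(file) {
  try {
    const text = await readFile(file, 'utf8');
    return { file, lines: text.split('\n').length };
  } catch (err) {
    return { file, error: err.code };
  }
}

// Demo with a throwaway file in the temp directory.
const demoFile = path.join(os.tmpdir(), 'count-lines-demo.txt');
fs.writeFileSync(demoFile, 'one\ntwo\nthree');
countLines(demoFile).then((result) => console.log(result.lines)); // 3
```

The I/O is still non-blocking. Only the shape of the code changes.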
