What do fibers/future actually do? - node.js

What does the line of code below do?
Npm.require('fibers/future');
I looked online for examples and I came across a few like this:
Future = Npm.require('fibers/future');
var accessToken = new Future();
What will accessToken variable be in this case?

Question is a bit old but my 2 cents:
As Molda said in the comment, Future's main purpose is to make async things work synchronously.
A future instance has three main methods:
future.wait() pauses the current fiber until the future is resolved.
future.return(value) is the first way to tell a waiting future that it can resume. The value you pass becomes the return value of wait(), hence lines like const ret = future.wait(), where ret holds the returned value once execution resumes.
future.throw(error) is the second way: it makes the blocked wait() call throw the given error.
Making things synchronous in JavaScript might sound a bit disturbing, but it is sometimes useful. In Meteor, it helps when you chain async calls in a Meteor.method and want the result returned to the client. You could also use Promises, which are now fully supported by Meteor too. I've used both and both work; it's up to your liking.
A quick example:
const Future = Npm.require('fibers/future');
Meteor.methods({
  foo: function(arg) {
    const future = new Future();
    someAsyncCall(arg, function bar(error, result) {
      // Resolve the future exactly once: throw on error, otherwise return
      if (error) {
        future.throw(error);
        return;
      }
      future.return(result);
    });
    // Execution is paused until callback arrives
    const ret = future.wait(); // Wait on future not Future
    return ret;
  }
});
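For comparison, the Promise route mentioned above could look roughly like the sketch below. It runs in plain Node.js; someAsyncCall is a hypothetical stand-in with an error-first callback, and the assumption is that your Meteor version accepts an async function as a method body, in which case the body of foo would sit inside Meteor.methods.

```javascript
const { promisify } = require('util');

// Hypothetical stand-in for the callback-style call used in the example above.
function someAsyncCall(input, callback) {
  setTimeout(() => {
    if (input === 'bad') callback(new Error('boom'));
    else callback(null, input.toUpperCase());
  }, 10);
}

// promisify turns an error-first callback API into one that returns a Promise.
const someAsyncCallP = promisify(someAsyncCall);

async function foo(input) {
  // await suspends only this function. A rejection is rethrown here,
  // which plays the role of future.throw(error).
  const ret = await someAsyncCallP(input);
  return ret;
}

foo('hello').then(console.log); // prints: HELLO
```

The caller gets a Promise back instead of a blocked fiber, so nothing needs to pause the thread.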

Related

Dispatching up to max parallel REST calls in node.js / how does await work in node

I'm using node.js, have a graph of dependent REST calls and am trying to dispatch them in parallel. It's part of a testing/load testing script.
My graph, has "connected components", and each component is directed and acyclic. I toposort each component, so I end up with a graph that looks like this
Component1 = [Call1, Call2...., Callm] (Call2 possibly dependent on call1 etc)
Component2 = [Call1, Call2... Calln]
...
Componentp
The number of components, and calls in each component m, n and p are dynamic
I want to round robin over the components, and each of its calls, dispatching up to "x" calls concurrently.
Whilst I understand a little about Promises, async/await and Node's event loop I'm NOT an expert.
PSEUDO CODE ONLY
maxParallel = x
runningCallCount = 0
while (components.some(calls => calls.some(call => noResponseYet(call)))) {
if (runningCallCount < maxParallel) {
runningCallCount++
var result = await axios(call)
runningCallCount--
}
}
This doesn't work - I never dispatch the calls.
Remove the await and i fall through to the runningCallCount-- straight away.
Other approaches I've tried and comments
Wrapping every call in an async function, and using Promise.all on a chunk of x at a time - a chunking style of approach. This may work, but it doesn't achieve the result of always trying to have x parallel calls going.
Used RxJs - tried merge on all components with a max number of parallelism - but this parallelises the components, not the calls within the components, and I couldn't work out how to make it work the way I wanted based on the poor doco. I'd used the .NET version before so this was a bit disappointing.
I haven't yet tried recursion
Can anyone chime in with an idea as to how to do this ?
How does await work in node ? I've seen it explained like generator functions and yield statements (https://medium.com/siliconwat/how-javascript-async-await-works-3cab4b7d21da)
Can anyone add detail - how is the event loop checked when code strikes an await call ? Again I'm guessing either the entire stack unrolls, or a call to run the event loop is somehow inserted by the await call.
I'm not interested in using a load testing package, or other load testing tools - I just want to understand the best way to do this, but also understand what's going on in node and await.
I'll update this if I understand this or find a solution, but help is appreciated.
I would think something like this would work to achieve always having n parallel calls going.
const delay = time => new Promise(r=>setTimeout(r,time));
let maxJobs = 4;
let jobQueue = [
{time:1000},{time:3000},{time:1000},{time:2000},
{time:1000},{time:1000},{time:2000},{time:1000},
{time:1000},{time:5000},{time:1000},{time:1000},
{time:1000},{time:7000},{time:1000},{time:1000}
];
jobQueue.forEach((e,i)=>e.id=i);
const jobProcessor = async function(){
while(jobQueue.length>0){
let job = jobQueue.pop();
console.log('Starting id',job.id);
await delay(job.time);
console.log('Finished id',job.id);
}
return;
};
(async ()=>{
console.log("Starting",new Date());
await Promise.all([...Array(maxJobs).keys()].map(e=>jobProcessor()))
console.log("Finished",new Date());
})();
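The same worker-pool idea can be written generically. The sketch below is only an illustration: runWithLimit, the peak counter and the fake delay tasks are names made up here, not part of the question or of any library. It works because await suspends only the worker that reaches it and hands control back to the event loop, so the other workers keep dispatching.

```javascript
// Runs task functions with at most `limit` in flight at once.
// Resolves to the results (in task order) plus the peak concurrency observed.
async function runWithLimit(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0;     // index of the next task to dispatch
  let running = 0;  // tasks currently in flight
  let peak = 0;     // highest value `running` ever reached

  async function worker() {
    while (next < tasks.length) {
      const i = next++; // safe: there is no await between the check and the increment
      running++;
      peak = Math.max(peak, running);
      try {
        results[i] = await tasks[i](); // only this worker is suspended here
      } finally {
        running--;
      }
    }
  }

  const workers = Array.from({ length: Math.min(limit, tasks.length) }, () => worker());
  await Promise.all(workers);
  return { results, peak };
}

// Fake "REST call": resolves to `value` after `ms` milliseconds.
const delay = (ms, value) => new Promise(resolve => setTimeout(resolve, ms, value));

// Usage: eight fake calls, at most three in flight.
const demo = runWithLimit(
  [30, 10, 20, 10, 30, 10, 20, 10].map((ms, i) => () => delay(ms, i)),
  3
);
demo.then(({ results, peak }) => console.log(results, 'peak concurrency:', peak));
```

As soon as one call finishes, its worker picks up the next task, so the pool stays at the limit until the queue runs dry, unlike the chunked Promise.all approach.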

Can I write a real async callback in Nodejs?

This is a normal example to read a file:
var fs = require('fs');
fs.readFile('./gparted-live-0.18.0-2-i486.iso', function (err, data) {
console.log(data.length);
});
console.log('All done.');
the code above outputs:
All done.
187695104
whereas this is my own version of a callback, I hope it could be async like the file reading code above, but it is not:
var f = function(cb) {
cb();
};
f(function() {
var i = 0;
// Do some very long job.
while(++i < (1<<30)) {}
console.log('Cb comes back.')
});
console.log('All done.');
the code above outputs:
Cb comes back.
All done.
Up till now, it's clear that in the file-reading version, All done. is always printed before the file is read. However, in my home-brewed version, All done. always waits until the very long job is done.
So what on earth is the magic that makes fs.readFile's callback an async call back while mine is not?
var f = function(cb) {
cb();
};
Is not async because it invokes cb immediately.
I think you want
var f = function(cb) {
setImmediate(function(){ cb(); });
};
In your example the while loop runs synchronously inside the callback and occupies the single JavaScript thread, so the call to console.log('All done.') cannot run until the loop has finished. Once the thread is unblocked, the subsequent function calls run in sequence.
In Mastering Node.js by Sandro Pasquali - Chapter 2, he discusses deferred execution and the event-loop in order to avoid the issue of the event-loop taking hold and blocking execution. I recommend reading that chapter in order to better understand this non-intuitive way of working in Node.js.
From Mastering Node.js...
Node processes JavaScript instructions using a single thread. Within
your JavaScript program no two operations will ever execute at exactly
the same moment, as might happen in a multithreaded environment.
Understanding this fact is essential to understanding how a Node
program, or process, is designed and runs.
The use of setImmediate() can remedy this issue.
You can use setImmediate() to defer the execution of code until the next cycle of the event loop, which I think accomplishes what you want:
var f = function(cb) {
cb();
};
f(function() {
setImmediate(function() {
var i = 0;
// Do some very long job.
while(++i < (1<<30)) {}
console.log('Cb comes back.')
});
});
console.log('All done.');
The documentation for setImmediate explains the difference between process.nextTick and setImmediate thusly:
Immediates are queued in the order created, and are popped off the queue once per loop iteration. This is different from process.nextTick which will execute process.maxTickDepth queued callbacks per iteration. setImmediate will yield to the event loop after firing a queued callback to make sure I/O is not being starved. While order is preserved for execution, other I/O events may fire between any two scheduled immediate callbacks.
Edit: Updated answer based on @generalhenry's comment.
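To see the ordering for yourself, here is a small sketch with the long job left out; fSync and fDeferred are names made up for this illustration. Keep in mind that setImmediate only changes when the callback runs: a long loop inside it still blocks the thread once it starts.

```javascript
const order = [];

// Synchronous wrapper: the callback runs before fSync returns.
function fSync(cb) { cb(); }

// Deferred wrapper: the callback runs on a later turn of the event loop.
function fDeferred(cb) { setImmediate(cb); }

fSync(() => order.push('sync callback'));
fDeferred(() => order.push('deferred callback'));
order.push('all done');

// Immediates fire in the order they were created, so by the time this
// second one runs, the deferred callback above has already executed.
const finished = new Promise(resolve => setImmediate(() => resolve(order)));
finished.then(o => console.log(o.join(' -> ')));
// prints: sync callback -> all done -> deferred callback
```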

Understanding try and catch in node.js

I'm new to coding. Trying to understand why try...catch isn't supposed to work in node.js. I've created an example, but contrary to expectations, try...catch seems to be working. Where am I going wrong in my understanding ? Please help.
function callback(error) { console.log(error); }
function A() {
var errorForCallback;
var y = parseInt("hardnut");
if (!y) {
throw new Error("boycott parsley");
errorForCallback = "boycott parsley for callback";
}
setTimeout(callback(errorForCallback),1000);
}
try {
A();
}
catch (e) {
console.log(e.message);
}
// Output: boycott parsley
// Synchronous behaviour, try...catch works
-----------Example re-framed to reflect my understanding after reading answer below----------
function callback(error) { console.log(error); }
function A() {
var errorForCallback;
setTimeout(function(){
var y = parseInt("hardnut");
if (!y) {
// throw new Error("boycott parsley");
errorForCallback = "boycott parsley for callback";
}
callback(errorForCallback);
}, 1000);
}
try {
A();
}
catch (e) {
console.log(e.message);
}
// Output: boycott parsley for callback
// Asynchronous behaviour
// And if "throw new Error" is uncommented,
// then node.js stops
The try-catch approach is something that works perfectly with synchronous code. Not all the programming that you do in Node.js is asynchronous and so in those pieces of synchronous code that you write you can perfectly use a try-catch approach. Asynchronous code, on the other hand, does not work that way.
For instance, if you had two function executions like this
var x = fooSync();
var y = barSync();
You would expect three things: first, that barSync() is executed only after fooSync() has finished; second, that x contains whatever fooSync() returned before barSync() is executed; and third, that if fooSync() throws an exception, barSync() is never executed.
If you would use a try-catch around fooSync() you could guarantee that if fooSync() fails you can catch that exception.
Now, the conditions completely change if you would have a code like this:
var x = fooAsync();
var y = barSync();
Now imagine that when fooAsync() is invoked in this scenario, it is not actually executed. It's just scheduled for execution later on. It is as if node would have a todo list, and at this moment it is too busy running your current module, and when it finds this function invocation, instead of running it, it simply adds it to the end of its todo list.
So now you cannot guarantee that fooAsync() will run before barSync(); as a matter of fact, it probably won't. You also don't control the context in which fooAsync() is executed.
After scheduling fooAsync(), Node immediately moves on to executing barSync(). So what can fooAsync() return? At this point nothing, because it has not run yet, so x above is probably undefined. Putting try-catch around this piece of code would be pointless, because the function will not be executed in the context of this code. It will be executed later, when Node.js checks whether there are pending tasks in its todo list, in the context of another routine that constantly checks this list. That routine, running on Node's single thread of execution, is called the event loop.
If fooAsync() fails, it fails in the context of the thread running the event loop, so your try-catch statement will not catch it; by that point, the module above has probably finished executing.
That is why in asynchronous programming you can neither get a return value nor expect try-catch to work: your code is evaluated somewhere else, in a context different from the one where you think you invoked it. It is as if you had done something like this instead:
scheduleForExecutionLaterWhenYouHaveTime(foo);
var y = barSync();
And that's the reason why asynchronous programming requires other techniques to determine what happened to your code when it finally runs. Typically this is notified through a callback. You define a callback function which is called back with the details of what failed (if anything) or what your function produced and then you can react to that.
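A minimal sketch of that difference, runnable in plain Node.js. The fooAsync here is a hypothetical implementation written for this illustration: it fails later, on a timer, and reports the failure through an error-first callback.

```javascript
// Hypothetical async function: the failure happens on a later turn of the
// event loop and is reported through the callback, not thrown to the caller.
function fooAsync(callback) {
  setTimeout(() => {
    callback(new Error('boycott parsley'), null);
  }, 10);
}

const events = [];

const done = new Promise(resolve => {
  try {
    fooAsync((err, result) => {
      // Runs long after the try block below has exited.
      events.push(err ? 'callback got: ' + err.message : 'callback got result');
      resolve(events);
    });
    events.push('try block finished without error');
  } catch (e) {
    // Never reached: nothing throws while the try block is on the stack.
    events.push('caught: ' + e.message);
  }
});

done.then(e => console.log(e));
// prints: [ 'try block finished without error', 'callback got: boycott parsley' ]
```

The try block completes before the failure exists, so the callback is the only place where the error can be observed.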

Convert asynchronous/callback method to blocking/synchronous method

Is it possible to convert an asynchronous/callback based method in node to a blocking/synchronous method?
I'm curious, more from a theoretical POV, than a "I have a problem to solve" POV.
I see how callback methods can be converted to values, via Q and the like, but calling Q.done() doesn't block execution.
The node-sync module can help you do that. But please be careful, this is not the Node.js way.
To turn asynchronous functions into synchronous ones in a 'multi-threaded environment', we need to set up a loop that checks for the result, which causes the blocking. In single-threaded Node.js itself, the callback cannot fire while such a loop is spinning.
Here’s the example code in JS:
function somethingSync(args){
var ret; //the result-holding variable
//doing something async here...
somethingAsync(args,function(result){
ret = result;
});
while(ret === undefined){} //wait for the result until it's available,cause the blocking
return ret;
}
OR
synchronize.js also helps.
While I would not recommend it, this can easily be done using some sort of busy wait. For instance:
var flag = false;
asyncFunction( function () { //This is a callback
flag = true;
})
while (!flag) {}
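The while loop keeps spinning until the callback has executed, thus blocking execution. Note that this only terminates if the callback runs synchronously or on another thread: in Node's single-threaded event loop, a callback scheduled asynchronously cannot run while the loop is spinning.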
As you can imagine this would make your code very messy, so if you are going to do this (which I wouldn't recommend) you should make some sort of helper function to wrap your async function; similar to Underscore.js's Function functions, such as throttle. You can see exactly how these work by looking at the annotated source.
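One caveat is worth checking before relying on a busy wait in Node.js: JavaScript runs on a single thread, so a callback scheduled asynchronously cannot run while a loop is spinning. The sketch below bounds the loop to about 50 ms so it terminates; the variable names are made up for this illustration.

```javascript
let flag = false;

// Schedule the "callback" for the next turn of the event loop.
setImmediate(() => { flag = true; });

// Bounded busy wait: spin for about 50 ms instead of forever.
const start = Date.now();
while (!flag && Date.now() - start < 50) {}

// Still false: the callback had no chance to run while the loop held the thread.
const flagAfterLoop = flag;
console.log('flag right after the loop:', flagAfterLoop);

// A second immediate fires after the first one, so by then the flag is set.
const flagLater = new Promise(resolve => setImmediate(() => resolve(flag)));
flagLater.then(f => console.log('flag once the loop has yielded:', f));
```

Without the time bound, the loop would never exit, which is why modules such as node-sync rely on fibers rather than on a plain while loop.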

Nodejs asynchronous confusion

I can't seem to grasp how to maintain async control flow with NodeJs. All of the nesting makes the code very hard to read in my opinion. I'm a novice, so I'm probably missing the big picture.
What is wrong with simply coding something like this...
function first() {
  var object = {
    aProperty: 'stuff',
    anArray: ['html', 'html']
  };
  second(object);
}
function second(object) {
  for (var i = 0; i < object.anArray.length; i++) {
    third(object.anArray[i]);
  }
}
function third(html) {
  // Parse html
}
first();
The "big picture" is that any I/O is non-blocking and is performed asynchronously in your JavaScript; so if you do any database lookups, read data from a socket (e.g. in an HTTP server), read or write files to the disk, etc., you have to use asynchronous code. This is necessary as the event loop is a single thread, and if I/O wasn't non-blocking, your program would pause while performing it.
You can structure your code such that there is less nesting; for example:
var fs = require('fs');
var mysql = require('some_mysql_library');
fs.readFile('/my/file.txt', 'utf8', processFile);
function processFile(err, data) {
  if (err) {
    console.log("There was a problem reading the file");
    return;
  }
  // Assumes the library supports "?" placeholders, which avoids building SQL by concatenation
  mysql.query("INSERT INTO tbl SET txt = ?", [data], doneWithSql);
}
function doneWithSql(err, results) {
  if (err) {
    console.log("There was a problem with your query");
  } else {
    console.log("The query was successful.");
  }
}
There are also flow control libraries like async (my personal choice) to help avoid lots of nested callbacks.
You may be interested in this screencast I created on the subject.
As @BrandonTilley said, I/O is asynchronous, so you need callbacks in Node.js to handle it. This is why Node.js can do so much with just a single thread (it's not actually doing more in a single thread, but rather than having the thread wait around for the data, it just starts processing the next task and when the I/O comes back, then it'll jump back to that task with the callback function you gave it).
But, nested callbacks can be taken care of with a good library like the venerable async or my new little library: queue-flow. They handle the callback issues and let you keep your code un-nested and looking very similar to blocking, synchronous code. :)
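On current Node.js versions, a similar un-nested shape is also available without a library, using promises and async/await. The sketch below assumes a Node.js version that ships fs.promises; saveText is a hypothetical stand-in for a database call, and the temp-file name is made up for the demo.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical stand-in for a database call that returns a promise.
async function saveText(text) {
  return { stored: text.length };
}

async function importFile(file) {
  try {
    const data = await fs.promises.readFile(file, 'utf8'); // non-blocking read
    const result = await saveText(data);                   // next step, no nesting
    return 'stored ' + result.stored + ' characters';
  } catch (err) {
    // Errors from either step arrive here.
    return 'there was a problem: ' + err.message;
  }
}

// Usage with a throwaway file in the temp directory.
const file = path.join(os.tmpdir(), 'flat-flow-demo.txt');
fs.writeFileSync(file, 'hello');
const outcome = importFile(file);
outcome.then(console.log); // prints: stored 5 characters
```

The I/O is still asynchronous; only the syntax reads top to bottom.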

Resources