Async confused about parallel function - node.js

I am trying to understand and use the async library in Node. What I really don't understand is how the async.parallel function works. The name "parallel" suggests something like multithreading. Consider the following sample:
async.parallel([
    function(callback){
        setTimeout(function(){
            console.log('1');
            callback(null, 'one');
        }, 200);
    },
    function(callback){
        setTimeout(function(){
            console.log('2');
            callback(null, 'two');
        }, 100);
    }
],
// optional callback
function(err, results){
    if(err){
        console.log('Error');
    } else {
        console.log(results);
    }
    // the results array will equal ['one','two'] even though
    // the second function had a shorter timeout.
});
I got the result
[ 'one', 'two' ]
Does async.parallel execute on multiple threads? If not, what does the name "parallel" express?

The placement of results within the array is guaranteed by the placement of the functions within the array you pass to parallel. Notice that the functions you pass to parallel all have to take the same arguments. Picture these functions being placed in a wrapper and then called like this:
someArrayOfFunctions[index](arguments);
Using this method, async guarantees that, regardless of when the functions within parallel finish, the results are placed in the array in the expected order, because each callback populates the results by index. The specific implementation does not matter. The fact is that no matter when a function finishes, its result ends up placed according to where the function was in the array, not according to timing.
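To make the index idea concrete, here is a minimal sketch of how such a helper could be written. It is not the async library's actual source; miniParallel is a hypothetical name, and the sketch assumes every task calls its callback exactly once.

```javascript
// Minimal sketch of index-based result collection (hypothetical helper,
// not the async library's actual implementation).
function miniParallel(tasks, done) {
  var results = new Array(tasks.length);
  var remaining = tasks.length;
  var finished = false;

  if (remaining === 0) {
    return done(null, results);
  }

  tasks.forEach(function (task, index) {
    // Each task gets its own callback that remembers its index.
    task(function (err, value) {
      if (finished) return;        // ignore callbacks that arrive after an error
      if (err) {
        finished = true;
        return done(err);
      }
      results[index] = value;      // the slot is fixed by position, not by timing
      remaining -= 1;
      if (remaining === 0) {
        finished = true;
        done(null, results);
      }
    });
  });
}

// Usage: the second task finishes first, yet its result lands in slot 1.
miniParallel([
  function (cb) { setTimeout(function () { cb(null, 'one'); }, 200); },
  function (cb) { setTimeout(function () { cb(null, 'two'); }, 100); }
], function (err, results) {
  console.log(results); // [ 'one', 'two' ]
});
```

All tasks are started immediately, one after another, on the same thread; only the waiting overlaps.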
As for your second question, whether async.parallel is truly parallel: it is not. Consider the following example:
var async = require('async');
async.parallel([
    function(callback){
        while(true); // blocking!
    },
    function(callback){
        console.log('Not Blocked'); // Never gets called
    }
]);
The second function will never get called. What async.parallel provides is still helpful, for exactly the reason your example confused you. The problem with asynchronous code is that we sometimes have a set of callbacks for operations that really do run in parallel in Node (disk I/O, network I/O, etc.) that all need to complete, but that complete at unpredictable times. Say, for example, we have configuration data to collect from multiple sources, and no sync methods are supplied. We don't want to run these serially, because that slows things down considerably, but we still want to collect the results in order. This is the prime use case for async.parallel. But no, async.parallel cannot make synchronous code execute asynchronously, as the blocking example shows.
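True parallelism in Node happens below your JavaScript, in the runtime's I/O layer (libuv and the operating system); V8 runs your JavaScript on a single thread. To run JavaScript itself in parallel you need separate processes (child_process, cluster), worker threads (the worker_threads module in newer Node versions), or a native extension.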

Related

Values getting overridden in callbacks inside a callback in setInterval method (Node.js)

I have a situation where I have an array of employee data and I need to process something in parallel for every employee. To do this, I broke the work into four methods; each method takes a callback and calls the next one. I am using
async.eachSeries
to start the process for each element of the employee array.
In the last method I have to use setInterval to repeat the same task if the required response is not received. The interval keeps running until the task has been repeated 5 times (it is cleared after the 5th attempt if the desired value has still not been received).
Now, the problem is that the data I am processing inside setInterval is getting overridden by the values of the last employee.
So I am not able to keep track of the processing for all the elements of the employee array, and the processing details for the last employee are getting mixed up.
The four methods I use for the task save data to Redis and MongoDB and call outside APIs that return their responses in callbacks.
Can anyone suggest a better way of doing this? I also feel that the problem is happening because I am not invoking any callback from the setInterval method. But since that method is itself asynchronous, I am unsure how to handle the situation.
EmployeeArray
async.eachSeries() is used to process EmployeeArray.
For each element I have four callback methods.
async.eachSeries() {
    Callback1(){
        Callback2(){
            Callback3(){
                Callback4(){
                    setInterval(no callback inside this from my side)
                }
            }
        }
    }
}
As far as I know, async.each processes the items in parallel (async.eachSeries processes them one at a time). You can also use async.waterfall to make your code cleaner. Try something like this:
async.each(openFiles, function(file, callback1) {
    async.waterfall([
        function(callback) {
            callback(null, 'one', 'two');
        },
        function(arg1, arg2, callback) {
            // arg1 now equals 'one' and arg2 now equals 'two'
            callback(null, 'three');
        },
        function(arg1, callback) {
            // arg1 now equals 'three'
            callback(null, 'done');
        }
    ], function (err, result) {
        callback1(err);
    });
}, function(err){
    // if you get here without an error, your data has been processed
});

Node.js Callbacks: Call a grand_callback when all callbacks are returned

I have a task to look up the meanings of certain words in a database, for which I am making asynchronous calls to the database; every request looks up, say, n terms.
The issue is that I want to call another callback, say grand_callback. Its goal is to aggregate the data from all the other callbacks and then run the next piece of code once all the data has been aggregated.
Is there a way I can implement this?
Some details:
terms = [........] # 1000 terms
grand_callback = () ->
  # called with aggregated data.
getbucket_data = (bucket, callback) ->
  # some treatment over terms
  callback null, data
some_func = (term) ->
  bucket.push term
  if bucket.length is 15
    getbucket_data bucket, (err, data) ->
      # i need to aggregate this data
_.map terms, some_func
You can, of course, just keep track of which callbacks are done manually, but this can be a pain and pretty error prone if you have to do it a lot. Perhaps you could benefit from one of these solutions:
1. Use an async library
I personally like using async by caolan. It has a bunch of functions that can be useful for managing asynchronous operations; the one you're looking for is probably parallel (docs):
parallel(tasks, [callback])
Run the tasks array of functions in parallel, without waiting until the previous
function has completed. If any of the functions pass an error to its
callback, the main callback is immediately called with the value of the error.
Once the tasks have completed, the results are passed to the final callback as an
array.
So you would do something like:
async.parallel([
    asyncFunctionOne,
    asyncFunctionTwo,
    asyncFunctionThree
], function(error, results) {
    // ...
});
2. Use promises
Promises are a nice abstraction on top of asynchronous operations. A promise represents a value that either exists now or will exist in the future; you don't have to care which. You can wait for multiple promises to complete by creating a new promise that contains them all. One of the most popular promise libraries for Node is Q by kriskowal.
var Q = require('q');
// creating a promise manually
var deferred1 = Q.defer();
var promise1 = deferred1.promise;
asyncFunction1(function(err, value) {
    if (err) deferred1.reject(err);
    else deferred1.resolve(value);
});
// wrapping a Node-style function to create promises
var promise2 = Q.nfcall(asyncFunction2, arg1, arg2);
// Create a promise that waits for all other promises
var grandPromise = Q.all([promise1, promise2]);
// spread() passes the array of results as separate arguments
grandPromise.spread(function(promise1result, promise2result) {
    // ...
});
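Applied to the question's buckets, the aggregation could be sketched by hand as follows. This is only an illustration under stated assumptions: getBucketData is a stand-in for the question's getbucket_data (here it just returns each term's length), and toBuckets and lookupAll are hypothetical helper names. Unlike the question's snippet, the sketch also processes the final, partially filled bucket.

```javascript
// Hypothetical stand-in for the question's getbucket_data: looks up one bucket.
function getBucketData(bucket, callback) {
  // Pretend lookup that returns each term's length; replace with a real query.
  setTimeout(function () {
    callback(null, bucket.map(function (term) { return term.length; }));
  }, 0);
}

// Split a list into consecutive buckets of at most `size` items.
function toBuckets(items, size) {
  var buckets = [];
  for (var i = 0; i < items.length; i += size) {
    buckets.push(items.slice(i, i + size));
  }
  return buckets;
}

// Look up every bucket, then call grandCallback once with all the data,
// in the same order as the original terms.
function lookupAll(terms, size, lookup, grandCallback) {
  var buckets = toBuckets(terms, size);
  var perBucket = new Array(buckets.length);
  var remaining = buckets.length;
  var failed = false;

  if (remaining === 0) return grandCallback(null, []);

  buckets.forEach(function (bucket, index) {
    lookup(bucket, function (err, data) {
      if (failed) return;
      if (err) { failed = true; return grandCallback(err); }
      perBucket[index] = data;
      remaining -= 1;
      if (remaining === 0) {
        // Flatten the per-bucket arrays into one aggregated array.
        grandCallback(null, [].concat.apply([], perBucket));
      }
    });
  });
}

lookupAll(['a', 'bb', 'ccc'], 2, getBucketData, function (err, all) {
  console.log(err, all); // null [ 1, 2, 3 ]
});
```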

Block function whilst waiting for response

I've got a Node.js app I'm building (using Sails, but I guess that's irrelevant).
In my action, I have a number of requests to other services, data sources, etc. that I need to load. However, because of the heavy dependency on callbacks, my code is still executing long after the action has returned the HTML.
I must be missing something silly (or not quite getting the whole async thing), but how on earth do I stop my action from finishing until I have all my data ready to render the view?!
Cheers
I'd recommend getting very intimate with the async library.
The docs at the link above are pretty good, but it basically boils down to a bunch of very handy calls like:
async.parallel([
    function(){ ... },
    function(){ ... }
], callback);
async.series([
    function(){ ... },
    function(){ ... }
]);
Node is inherently async; you need to learn to love it.
It's hard to tell exactly what the problem is, but here is a guess. Assuming you have only one external call, your code should look like this:
exports.myController = function(req, res) {
    longExternalCallOne(someparams, function(result) {
        // you must render your view inside the callback
        res.render('someview', {data: result});
    });
    // do not render here as you don't have the result yet.
}
If you have two or more external calls, your code will look like this:
exports.myController = function(req, res) {
    longExternalCallOne(someparams, function(result1) {
        longExternalCallTwo(someparams, function(result2) {
            // you must render your view inside the innermost callback
            // some combination of result1 and result2 (the keys are placeholders)
            var data = { first: result1, second: result2 };
            res.render('someview', {data: data });
        });
        // do not render here since you don't have result2 yet
    });
    // do not render here either, as you have neither result1 nor result2 yet.
}
As you can see, once you have more than one long-running async call, things start to get tricky. The code above is just for illustration purposes. If your second call depends on the result of the first, then you need something like it, but if longExternalCallOne and longExternalCallTwo are independent of each other, you should use a library like async to help parallelize the requests: https://github.com/caolan/async
You cannot stop your code. All you can do is check in all callbacks if everything is completed. If yes, go on with your code. If no, wait for the next callback and check again.
You should not stop your code, but rather render your view in your resource's callback, so that you wait for the resource to arrive before rendering. That's the common pattern in Node.js.
If you have to wait for several callbacks to be called, you can check manually each time one is called whether the others have been called too (with a simple boolean, for example), and call your render function if they have. Or you can use async or other libraries that make the task easier. Promises (with the bluebird library) could be an option too.
I am guessing here, since there is no code example, but you might be running into something like this:
// let's say you have a function, you pass it an argument and callback
// (res is assumed to be in scope here)
function myFunction(arg, callback) {
    // now you do something asynchronous with the argument
    doSomethingAsyncWithArg(arg, function() {
        // now you've got your arg formatted or whatever, render result
        res.render('someView', {arg: arg});
        // now do the callback
        callback();
        // but you also have stuff here!
        doSomethingElse();
    });
}
So, after you render, your code keeps running. How do you prevent it? Return from there:
return callback();
Now your inner function stops executing as soon as it has called callback.
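The difference can be seen in a small self-contained sketch. The two functions and the log array are hypothetical; log stands in for side effects such as res.render.

```javascript
// Hypothetical handlers: `log` records what happened, in order.
function withoutReturn(log, callback) {
  log.push('render');
  callback();
  log.push('extra work'); // still runs after the callback
}

function withReturn(log, callback) {
  log.push('render');
  return callback();      // nothing placed after this line would run
}

var a = [];
withoutReturn(a, function () { a.push('callback'); });
console.log(a); // [ 'render', 'callback', 'extra work' ]

var b = [];
withReturn(b, function () { b.push('callback'); });
console.log(b); // [ 'render', 'callback' ]
```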

Is the following node.js code blocking or non-blocking?

I have some Node.js code running on a server and would like to know whether it is blocking. It is similar to this:
function addUserIfNoneExists(name, callback) {
    userAccounts.findOne({name:name}, function(err, obj) {
        if (obj) {
            callback('user exists');
        } else {
            // Add the user 'name' to DB and run the callback when done.
            // This is non-blocking to here.
            var user = addUser(name, callback);
            // Do something heavy, doesn't matter when this completes.
            // Is this part blocking?
            doSomeHeavyWork(user);
        }
    });
};
Once addUser completes the doSomeHeavyWork function is run and eventually places something back into the database. It does not matter how long this function takes, but it should not block other events on the server.
With that, is it possible to test if node.js code ends up blocking or not?
Generally, if it reaches out to another service, like a database or a webservice, then it is non-blocking and you'll need to have some sort of callback. However, any function will block until something (even if nothing) is returned...
If the doSomeHeavyWork function is non-blocking, then it's likely that whatever library you're using will allow for some sort of callback. So you could write the function to accept a callback like so:
var doSomeHeavyWork = function(user, callback) {
    // Whatever the non-blocking call is, it likely takes a callback that
    // receives an error (in case something bad happened) and possibly a
    // "whatever", which is the value you're hoping to get.
    callTheNonBlockingStuff(function(error, whatever) {
        if (error) {
            console.log('There was an error!!!!');
            console.log(error);
            return callback(error, null); // Call callback with the error and stop here
        }
        callback(null, whatever); // Call callback with the object you're hoping to get back.
    });
    // This line will most likely run before the callback gets called, which is
    // what makes this a non-blocking (asynchronous) function, and why you need the callback.
    return;
};
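You should avoid long-running synchronous blocks anywhere in your Node.js code, that is, code that makes no system or I/O calls and whose computation takes a long time (in computer terms), e.g. iterating over big arrays. Instead, move this type of code to a separate worker, or divide it into smaller synchronous pieces using process.nextTick(). Note that in current Node versions process.nextTick() callbacks run before the event loop moves on to pending I/O, so setImmediate() is the call that actually yields between pieces. You can find an explanation of process.nextTick() here, but read all the comments too.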
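A minimal sketch of the "smaller synchronous pieces" idea, assuming a positive chunk size. It uses setImmediate so that pending I/O callbacks can run between slices. sumInChunks is a hypothetical name, and its schedule parameter exists only so the scheduler can be swapped out for illustration or testing.

```javascript
// Sum a large array in slices so other events can run between slices.
// `schedule` is a hypothetical parameter; it defaults to setImmediate.
function sumInChunks(numbers, chunkSize, done, schedule) {
  schedule = schedule || setImmediate;
  var total = 0;
  var index = 0;

  function processChunk() {
    var end = Math.min(index + chunkSize, numbers.length);
    for (; index < end; index++) {
      total += numbers[index];
    }
    if (index < numbers.length) {
      schedule(processChunk);   // yield to the event loop, then continue
    } else {
      done(total);
    }
  }

  processChunk();
}

var big = [];
for (var n = 1; n <= 100000; n++) big.push(n);

sumInChunks(big, 1000, function (total) {
  console.log(total); // 5000050000
});
```

Smaller chunks keep the server more responsive at the cost of more scheduling overhead; the right size depends on how expensive each item is.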

What's the right way to find out when a series of callbacks (fired from a loop) have all executed?

I'm new to Node.js and am curious what the prescribed methodology is for running a loop on a process (repeatedly) where at the end of the execution some next step is to take place, but ONLY after all the iterations' callbacks have fired.
Specifically I'm making SQL calls and I need to close the sql connection after making a bunch of inserts and updates, but since they're all asynchronous, I have no way of knowing when all of them have in fact completed, so that I can call end() on the session.
Obviously this is a problem that extends far beyond this particular example, so, I'm not looking for the specific solution regarding sql, but more the general practice, which so far, I'm kind of stumped by.
What I'm doing now is setting a global counter to the length of the loop object and decrementing it in each callback to see when it reaches zero, but that feels REALLY kludgy, and I'm hoping there's a more elegant (and JavaScript-centric) way to achieve this monitoring.
TIA
There are a bunch of flow-control libraries available that apply patterns to help with this kind of thing. My favorite is async. If you wanted to run a bunch of SQL queries one after another in order, for instance, you might use series:
async.series([
    function(cb) { sql.exec("SOME SQL", cb) },
    function(cb) { sql.exec("SOME MORE SQL", cb) },
    function(cb) { sql.exec("SOME OTHER SQL", cb) }
], function(err, results) {
    // Here, one of two things is true:
    // (1) one of the async functions passed in an error to its callback
    //     so async immediately calls this callback with a non-null "err" value
    // (2) all of the async code is done, and "results" is
    //     an array of each of the results passed to the callbacks
});
I wrote my own queue library to do this (I'll publish it one of these days). Basically, you push queries onto a queue (an array), execute each one as it's removed, and have a callback run when the array is empty.
It doesn't take much to do it.
*edit. I've added this example code. It isn't what I've used before and I haven't tried it in practice, but it should give you a starting point. There's a lot more you can do with the pattern.
One thing to note: queueing effectively makes your actions sequential; they happen one after another. I wrote my MySQL queue script so I could execute queries on multiple tables asynchronously but on any one table in sync, so that inserts and selects happened in the order they were requested.
var queue = function() {
    this.queue = [];
    /**
     * Allows you to pass a callback to run, which is executed at the end.
     * This example uses a pattern where errors are returned from the
     * functions added to the queue and then are passed to the callback
     * for handling.
     */
    this.run = function(callback){
        var errors = [];
        // shift() removes each function as it runs, so the loop terminates;
        // delete would leave the array length unchanged.
        while (this.queue.length > 0) {
            var fn = this.queue.shift();
            errors[errors.length] = fn();
        }
        if (callback) callback(errors);
    }
    this.addToQueue = function(callback){
        this.queue[this.queue.length] = callback;
    }
}
Usage:
var q = new queue();
q.addToQueue(function(){
    setTimeout(function(){console.log('1');}, 100);
});
q.addToQueue(function(){
    setTimeout(function(){console.log('2');}, 50);
});
q.run();
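If the queued work is itself asynchronous, calling the functions in order is not enough, because each function returns before its timer or query finishes. Here is a minimal sketch of a queue that waits, assuming each task accepts a next callback and calls it when done. SerialQueue, add, and next are hypothetical names, not part of the answer's unpublished library.

```javascript
// Sketch of a serial queue: each task receives a `next` callback, and the
// following task starts only after `next` has been called.
function SerialQueue() {
  this.tasks = [];
}

SerialQueue.prototype.add = function (task) {
  this.tasks.push(task);
};

SerialQueue.prototype.run = function (done) {
  var tasks = this.tasks;
  var errors = [];

  function runNext() {
    var task = tasks.shift();
    if (!task) {
      return done(errors);       // queue is empty: report collected errors
    }
    task(function (err) {
      if (err) errors.push(err);
      runNext();
    });
  }

  runNext();
};

var serial = new SerialQueue();
serial.add(function (next) {
  setTimeout(function () { console.log('1'); next(); }, 100);
});
serial.add(function (next) {
  setTimeout(function () { console.log('2'); next(); }, 50);
});
serial.run(function (errors) {
  // Runs only after both tasks have called next(); '1' is logged before '2'.
  console.log('done, errors:', errors.length);
});
```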

Resources