Why does async waterfall differentiate between multiple callbacks in the function array? - async.js

async.waterfall([
function(cb) {
console.log('Inner 1');
cb(null, '1st');
cb(null, '1st-Again')
console.log('After 1');
},
function(val, cb) {
console.log('Inner 2 |' + val);
cb(null, '2nd');
cb(null, '2nd-Again');
console.log('After 2');
}
], function(err, results) {
console.log('final cb |' + results);
});
The output for the above piece of code is
Inner 1
After 1
Inner 2 |1st
After 2
Inner 2 |1st-Again
After 2
final cb |2nd
I understand the basic working of waterfall, where results get passed to the next function in the array. Could someone kindly explain why I don't see a final cb |2nd-Again printout? I would be grateful if you could point me in the right direction. (I also looked at the source code for waterfall, but could not really make sense of it apart from how one task automatically calls the next.) Thanks for the help!
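One likely explanation, offered as a sketch rather than a reading of the library source: the output above is consistent with a waterfall that wraps only the final callback in a call-once guard and leaves the per-task callbacks unguarded. That would let the second task run twice (once for each cb call made by the first task), while every call to the final callback after the first one is silently dropped. The names once, seen and finalCb below are hypothetical and are not async's internals. Behaviour may differ between async versions, and newer releases may report a repeated task callback as an error instead.

```javascript
// Hypothetical call-once guard: forwards the first call, ignores the rest.
function once(fn) {
  let called = false;
  return function (...args) {
    if (called) return; // later calls are dropped silently
    called = true;
    return fn.apply(this, args);
  };
}

const seen = [];
const finalCb = once(function (err, result) {
  seen.push(result);
  console.log('final cb |' + result);
});

finalCb(null, '2nd');       // runs the wrapped function
finalCb(null, '2nd-Again'); // ignored by the guard
```

Under that assumption, '2nd-Again' does reach the guard, but the guard has already fired, so nothing is printed for it.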

Related

node.js svn update/commit synchronously?

I'm using svn-spawn library to update/commit files to svn. Problem is my app calls svn up/commit in a loop, and because of the async nature of the call, svn-up is called from the next iteration of the loop before the previous svn-up can finish.
How to handle this issue? Is there any way to prevent the next call from happening until the previous one is complete?
Figured out a way to do it using async module.
async.series can be used to execute async tasks in a serial fashion.
This is how I did it.
function commitFile(arg, callback) {
svnClient.getStatus(filePath, function(err, data) {
//...
svnClient.commit(['Commit msg', filePath], callback);
//...
});
}
var toCommit = [];
for (var i = 0, len = requests.length; i < len; i++) {
//Adding files to commit, async.apply enables adding arguments to the anonymous function
toCommit.push(async.apply(function(arg, cb) {
commitFile(arg, cb);
}, 'arg1'));
}
async.series(toCommit,function (err, result) {
console.log('Final callback');
if(err) {
console.log('error', err);
} else {
console.log('result of this run: ' + result);
}
});
async.series needs an array of functions, each of which must call a callback once it is done. It uses that callback to determine that the current function is done executing, and only then picks the next function to execute.
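The mechanism described above can be sketched in a few lines of plain JavaScript. This is a simplified illustration and not async's real source: runSeries, makeTask and order are hypothetical names, and setTimeout stands in for the svn calls.

```javascript
// Hypothetical mini series runner: starts task N+1 only after task N calls back.
function runSeries(tasks, done) {
  const results = [];

  function runTask(index) {
    if (index === tasks.length) return done(null, results);
    tasks[index](function (err, result) {
      if (err) return done(err, results); // stop at the first error
      results.push(result);
      runTask(index + 1); // the callback is what triggers the next task
    });
  }

  runTask(0);
}

// Hypothetical async task: records when it starts and ends.
const order = [];
function makeTask(name, delayMs) {
  return function (callback) {
    order.push('start ' + name);
    setTimeout(function () {
      order.push('end ' + name);
      callback(null, name);
    }, delayMs);
  };
}

runSeries([makeTask('a', 20), makeTask('b', 5)], function (err, results) {
  console.log(order.join(', ')); // start a, end a, start b, end b
  console.log(results);          // [ 'a', 'b' ]
});
```

Even though task b is faster, it does not start until task a has called its callback, which is the property the svn loop needs.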

async.<fn>Limit stops after first iteration loop

This question is related to an answer to my previous question. There, #robertklep recommended that I use mapLimit() instead of .map() because .map() can't handle a large series of data, and with that solution everything worked fine. But now I have restructured my code, and none of the .<fn>Limit() functions run after the first loop iteration. Am I missing something here?
var proccesBook = function(file, cb) {
testFile(file, function (epub) {
if (epub) {
getEpuData(file, function (data) {
insertBookInDB(data)
})
}else{
cb(file)
}
})
}
async.mapLimit(full_files_path, 10, proccesBook, function(err){
if(err){
console.log('Corrupted file', err);
} else {
console.log('Processing complete');
};
})
// ---> only runs for the first 10 series data
Your primary issue is that you don't call cb in the success branch of proccesBook. Your control flow must guarantee that the callback is called exactly once for each worker function invocation.
Other asides:
You don't seem to need the results, so eachLimit is fine
Only need mapLimit if you need the results of each worker
You need to follow the standard error-first convention when calling the callback. Don't do cb(file), as that will be interpreted as an error and abort the remaining processing.
var proccesBook = function(file, cb) {
testFile(file, function (epub) {
if (epub) {
getEpuData(file, function (data) {
insertBookInDB(data)
cb() // This is what you were missing
})
}else{
cb()
}
})
}
async.eachLimit(full_files_path, 10, proccesBook, function(err){
if(err){
console.log('Corrupted file', err);
} else {
console.log('Processing complete');
};
})
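To see why a missing callback stalls the run after the first batch, here is a rough sketch of a limited-concurrency iterator. It is an illustration and not async's real implementation: eachLimitSketch and the variables around it are hypothetical names. The point is that a finished callback is the only thing that frees a slot, so workers that never call back keep all slots occupied forever.

```javascript
// Hypothetical limited-concurrency iterator (simplified, for illustration only).
function eachLimitSketch(items, limit, worker, done) {
  let nextIndex = 0;
  let running = 0;
  let finished = false;

  function launch() {
    while (!finished && running < limit && nextIndex < items.length) {
      const item = items[nextIndex++];
      running++;
      worker(item, function (err) {
        running--; // this callback is what frees the slot
        if (finished) return;
        if (err) { finished = true; return done(err); }
        if (nextIndex === items.length && running === 0) {
          finished = true;
          return done(null);
        }
        launch(); // fill the freed slot with the next item
      });
    }
  }

  if (items.length === 0) return done(null);
  launch();
}

// A worker that never calls back: only the first `limit` items ever start.
const started = [];
let stalledDone = false;
eachLimitSketch([1, 2, 3, 4, 5], 2, function (item, cb) {
  started.push(item);
}, function () { stalledDone = true; });
console.log(started, stalledDone); // [ 1, 2 ] false

// A worker that always calls back: every item is processed.
const processed = [];
let finishedOk = false;
eachLimitSketch([1, 2, 3, 4, 5], 2, function (item, cb) {
  processed.push(item);
  cb();
}, function (err) { finishedOk = !err; });
console.log(processed.length, finishedOk); // 5 true
```

With a limit of 10, the same stall would show up as exactly 10 items being started, which matches the symptom in the question.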

Nodejs assign var once mongodb query finished (callback misunderstanding needing a simple example)

I'm calling a mongodb query and need to assign the result to a value.
function oneref(db, coll, cb) {
var pipeline = [{ "$sample": { size: 1 } }]
var ohoh
coll.aggregate(pipeline, function (err,oneref) {
ohoh=oneref[0].code
})
console.log('hoho?: ' + ohoh)
cb(null, db, coll)
},
I understand I have an issue understanding callback, but even checking all hello world examples, I'm struggling.
How can I write it in the simplest way so that I only assign the var hoho once the query has finished?
Thanks a lot in advance.
You're getting an undefined value for the variable hoho because the aggregate() cursor method is asynchronous and can finish at any time. In your case, it finishes after console.log() has already run, so the value is still undefined when you access it.
Assign the variable inside the aggregate() callback and pass it as an argument to cb, i.e.
function oneref(db, coll, cb) {
var pipeline = [{ "$sample": { size: 1 } }];
coll.aggregate(pipeline, function (err, oneref) {
if (err) return cb(err);
var hoho = oneref[0].code
console.log('hoho?: ' + hoho);
cb(null, hoho);
});
};
Calling the oneref function:
oneref(db, collection, function(err, result){
console.log(result); // logs the oneref[0].code
});
Refer to this Question for a better understanding on how callback functions work.
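The ordering problem can be reproduced without a database. In the sketch below, fakeAggregate is a hypothetical stand-in for the driver call (it just answers after a short timeout), and the events array records what runs when. Nothing here is MongoDB's API.

```javascript
// Hypothetical stand-in for an asynchronous driver call such as aggregate().
function fakeAggregate(pipeline, callback) {
  setTimeout(function () {
    callback(null, [{ code: 'ABC' }]);
  }, 10);
}

const events = [];

function oneref(cb) {
  events.push('query started');
  fakeAggregate([{ $sample: { size: 1 } }], function (err, docs) {
    if (err) return cb(err);
    cb(null, docs[0].code); // the value only exists from this point on
  });
  events.push('line after the query call'); // runs before the callback fires
}

oneref(function (err, code) {
  events.push('callback received ' + code);
  console.log(events);
});
```

The line placed after the query call is recorded before the callback, which is why a variable assigned inside the callback is still undefined there.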

Async node.js data flow confusion

Thanks for your help... I'm struggling big time with how to handle this properly. I'm using async now, having given up on my ability to write the callbacks properly. I have a snippet where I take a set of random numbers (eachrecord) and pass them through to a mongoose call, trying to create a data set from the multiple queries.
My issue is that no matter what I've done for 4 hours, the "newarray" variable is always empty.
Thank you for your help -
async.forEach(arLimit, function(eachrecord, callback){
newarray = new Array;
var query = UGC_DB_Model.find({}).skip(eachrecord).limit(-1);
query.execFind(function (err, data) {
if (err)
console.log(err);
else {
newarray.push(data);
}
});
callback(null, newarray);
}, function(err, result) {
if (err) return next(err);
console.log("(it's empty): " + result);
});
There are several issues with your code:
async.forEach isn't meant to 'generate' results, that's what async.map is for;
you need to call the callback only when execFind is done, and not immediately after calling it;
your newarray is probably not necessary;
So try this instead:
async.map(arLimit, function(eachrecord, callback){
var query = UGC_DB_Model.find({}).skip(eachrecord).limit(-1);
query.execFind(function (err, data) {
if (err)
callback(err); // pass error along
else {
callback(null, [ data ]);
// although I think you mean this (because 'data' is probably an array already)
// callback(null, data);
}
});
}, function(err, result) {
if (err) return next(err);
console.log("results: " + result);
});
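For intuition about why map can hand back a complete result array, here is a minimal sketch of the idea. It is not async's actual code: mapSketch and slowDouble are hypothetical names. Each worker result is stored at its input index, and the final callback fires only after every worker has called back.

```javascript
// Hypothetical minimal parallel map (simplified, for illustration only).
function mapSketch(items, worker, done) {
  const results = new Array(items.length);
  let pending = items.length;
  let failed = false;
  if (pending === 0) return done(null, results);

  items.forEach(function (item, i) {
    worker(item, function (err, value) {
      if (failed) return;
      if (err) { failed = true; return done(err); }
      results[i] = value; // stored by index, so order matches the input
      if (--pending === 0) done(null, results);
    });
  });
}

// Hypothetical async worker: the later item finishes sooner.
function slowDouble(n, cb) {
  setTimeout(function () { cb(null, n * 2); }, 30 - n * 10);
}

mapSketch([1, 2], slowDouble, function (err, doubled) {
  console.log(doubled); // [ 2, 4 ]
});
```

Calling the callback before the query has finished, as in the question, would decrement the pending count with no data yet available, which is how the empty result came about.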

NodeJS + async: which control flow option to chose?

As you know, async.parallel, used with code such as:
async.parallel([
function (callback) {
callback(err, objects);
},
function (callback) {
callback(err, status);
},
function (callback) {
callback(err, status);
},
], function (err, results) {
//smth with results[N] array...
});
runs all the tasks in parallel. However, I need the callback result of the first function (objects, to be exact) to be available in the 2nd and 3rd functions. In other words: first step, the 1st function; second step, the 2nd and 3rd functions in parallel, using the result of the 1st one. async.waterfall seems to be a bad idea because:
In waterfall, functions can't run in parallel
I can't get access to every result in the chain, only to the last one.
Any ideas? Thanks!
You need both waterfall and parallel.
function thing1(callback) {...callback(null, thing1Result);}
function thing2A(thing1Result, callback) {...}
function thing2B(thing1Result, callback) {...}
function thing2(thing1Result, callback) {
async.parallel([
async.apply(thing2A, thing1Result),
async.apply(thing2B, thing1Result)
], callback);
}
async.waterfall([thing1, thing2], function (error) {
//all done
});
There's no need to use async. With async you are basically black-boxing your app. Because I don't like the magic for easy tasks, vanilla js:
var f1 = function (cb){
...
cb (null, "result from f1"); //null error
};
var f2 = function (resultFromF1, cb){
...
cb (null); //null error
};
var f3 = function (resultFromF1, cb){
...
cb (null); //null error
};
var main = function (cb){
f1 (function (error, resultFromF1){
if (error) return cb ([error]);
var errors = [];
var remaining = 2;
var finish = function (error){
if (error) errors.push (error);
if (!--remaining){
//f2 and f3 have finished
cb (errors.length ? errors : null);
}
};
f2 (resultFromF1, finish);
f3 (resultFromF1, finish);
});
};
main (function (errors){
if (errors) return handleError (errors); //errors is an array
...
});
