Nested async.eachSeries in waterfall executes in wrong order - node.js

In the second function of the async waterfall, the eachSeries callback (urlCallback) in my code executes after the waterfall callback (waterfallCallback), for reasons I cannot suss out.
async.waterfall([
function(callback) {
request(website, function (error, response, html) {
if (!error && response.statusCode == 200) {
pageUrls = getPageUrls(html)
callback(null, pageUrls)
}
})
},
function (pageUrls, waterfallCallback) {
async.eachSeries(pageUrls, function (url, urlCallback) {
console.log('SET ' + url)
request(url, function (err, response, body) {
var $ = cheerio.load(body)
$('#div').children().each(function(){
console.log($(this).children("a").attr("href"));
itemUrl = $(this).children("a").attr("href")
itemUrls.push(itemUrl)
})
urlCallback(null,itemUrls)
})
},
waterfallCallback(null, itemUrls))
}
],
function(err, results) {
console.log("results: " + results)
})
AFAIK, async.eachSeries takes three arguments (array, functionToBeExecutedOnEachItem, callback) and executes them in that order. Somehow that is not what happens here.

The arguments to async.eachSeries must be functions: either a reference such as waterfallCallback, or a definition such as function(err, result){}.
waterfallCallback(null, itemUrls) is neither. It calls the function immediately, before eachSeries has even started, and passes its return value as the final callback.
Changing it to simply waterfallCallback should do the trick.
Update: Also, .eachSeries does not collect values into an array; its final callback is just function(err). Check out .mapSeries instead, which passes the resulting array to the final callback, function(err, finalArray). (Be aware that each value handed back by the .map iteratee becomes one element of that array, so if you return an array, you'll get a nested structure like [ [], [], [] ].)
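To make the difference between passing a function and calling it concrete, here is a minimal sketch. The eachSeries below is a hypothetical, simplified stand-in written only so the snippet runs without dependencies; it is not the async library's implementation, and runWith, iteratee and the log labels are illustrative names.

```javascript
// Hypothetical stand-in for async.eachSeries (simplified, for illustration only).
function eachSeries(items, iteratee, done) {
  done = done || function () {}; // like the real library, tolerate a missing callback
  let i = 0;
  function next(err) {
    if (err || i === items.length) return done(err || null);
    iteratee(items[i++], next);
  }
  next();
}

function runWith(passCallbackCorrectly) {
  const log = [];
  const finalCallback = function (err) { log.push('final'); };
  const iteratee = function (item, cb) { log.push(item); cb(); };

  if (passCallbackCorrectly) {
    // Pass the function itself: it is called after the last item.
    eachSeries(['a', 'b'], iteratee, finalCallback);
  } else {
    // Call the function: it runs right now, and its return value
    // (undefined) is what eachSeries receives as the final callback.
    eachSeries(['a', 'b'], iteratee, finalCallback(null));
  }
  return log;
}

const correctOrder = runWith(true);  // ['a', 'b', 'final']
const wrongOrder = runWith(false);   // ['final', 'a', 'b']
console.log(correctOrder, wrongOrder);
```

The second call reproduces the symptom from the question: the "final" step is logged before any item has been processed.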

Related

JSON variable returns Undefined

Sorry for the inconvenience, I am new to Node. I am trying to store JSON in the usersData variable so that I can use it later in other functions. If I log the variable with console.log inside the "if", it prints the result, but when I log it outside the request callback, it prints 'undefined'. I have declared usersData as a global variable, as shown below. Thank you.
var usersData;
function getAllUsers(){
request({url, json: true}, function (error, response, body) {
if (!error && response.statusCode == 200) {
usersData = body
//console.log(usersData) //Here returns a value
}
});
console.log(usersData) //here returns undefined
}
request is an asynchronous method, so if you want to use its result later in other functions, handle it inside the callback passed as the second parameter, e.g.:
var usersData;
var handleUserData = function() {};
function getAllUsers(){
request({url, json: true}, function (error, response, body) {
if (!error && response.statusCode == 200) {
usersData = body
//console.log(usersData) //Here returns a value
// use results in another function
handleUserData(body);
}
});
}
Or use a Promise:
function getAllUsers() {
return new Promise(function(resolve, reject) {
request({url, json: true}, function (error, response, body) {
if (!error && response.statusCode == 200) {
usersData = body
//console.log(usersData) //Here returns a value
resolve(body);
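} else {
// error is null when the request succeeded but the status code was not 200
reject(error || new Error('Unexpected status code: ' + response.statusCode));
}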
});
});
}
// handle `usersData`
getAllUsers().then(body => {
handleUserData(body);
});
Here are some things you need to know.
request is an asynchronous function: the call returns immediately and the work completes in the background without blocking the main thread. The callback runs only after the request completes, which is why body has already been assigned to usersData when it is printed inside the callback.
Outside the callback, the console.log statement runs before the request has completed, so usersData has not been assigned yet and prints undefined.
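The ordering described above can be seen in a small sketch. fakeRequest is a hypothetical stand-in for request() that simply calls back on a later tick with made-up data, so the snippet runs without any network access or packages.

```javascript
// fakeRequest is a hypothetical stand-in for request(): it invokes the
// callback on a later tick instead of performing real HTTP.
function fakeRequest(callback) {
  setTimeout(function () {
    callback(null, { users: ['ann', 'bob'] });
  }, 0);
}

let usersData;
const order = [];

function getAllUsers(onDone) {
  fakeRequest(function (error, body) {
    if (error) return;
    usersData = body;
    order.push('inside callback');
    if (onDone) onDone(body); // hand the result to whoever needs it
  });
  // This line runs first: the callback above has not fired yet.
  order.push('after fakeRequest call');
}

getAllUsers(function (body) {
  console.log('users:', body.users);
});

// Still undefined at this point, exactly as in the question.
const valueRightAfterCall = usersData;
console.log('right after the call:', valueRightAfterCall);
```

Anything that depends on the data belongs in the function passed as onDone (or in a .then() if you go the Promise route), not on the line after the call.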

Function with async request in Node js

I have a loop, which iterates over array and in every iteration I have to do a http request, like this:
var httpsRequest = require('request')
var getData = function(id) {
var result;
httpsRequest({
url: 'https://link/'+id,
}, (error, resp, body) => {
if(resp.statusCode == 200) {
result = JSON.parse(body);
}
});
//here I would like to wait for a result
}
var data = [];
for(row in rows) {
data.push(getData(row.ID))
}
resp.send(JSON.stringify(data)) //I send data back to the client
I cannot do the rest of the for loop in the callback; I have to wait for the result returned from getData before moving to the next iteration.
How to handle this?
PS I know I could use a callback function, but what if the program sends the response (last line above) after the last iteration but before the last getData call has finished?
Regards
As stated in the answer by Johannes, the use of promises is a good idea. Since you're using request I'd like to propose an alternative method by using request-promise which is a promisified version of 'request' using bluebird.
In this case each request returns a promise, and by using .map() you can create an array of promises that you can wait on with Promise.all(). When all promises are resolved, the response can be sent! This also differs from the use of .reduce(), which starts the next request only once the previous one is done. By using an array of promises, you can start all the requests at the same time.
var httpsRequest = require('request-promise')
var getData = function(id) {
// json: true makes request-promise parse the response body as JSON.
// By default a non-2xx response rejects the promise, so it ends up in the .catch() below.
return httpsRequest({
url: 'https://link/' + id,
json: true
});
}
var promises = rows.map(function(row){
return getData(row.ID)
});
Promise.all(promises)
.then(function(results){
//All requests are done!
//The variable results will be an array of all the results in the same order as they were requested
resp.send(JSON.stringify(results));
})
.catch(function(error){
//Handle the error from a rejected 'getData' promise
});
If you need to wait for each iteration to be done before starting another one, you can use Promises and reduce. If you only want to wait for all requests to be finished, it's better to use map + Promise.all as explained in Daniel B's answer.
// I assume rows is an array, since you wrote that you iterate over one.
const results = [];
rows.reduce((previous, row) => {
return previous.then(() => getData(row.ID).then(result => results.push(result)) // do whatever you want with the result
);
}, Promise.resolve())
.then(() => resp.send(JSON.stringify(results)));
const getData = (id) => {
return new Promise((resolve, reject)=> {
httpsRequest({
url: 'https://link/'+id,
}, (error, resp, body) => {
if(error) return reject(error);
if(resp.statusCode == 200) {
return resolve(JSON.parse(body));
}
return resolve(); // if you want to pass non 200 through. You may want to do sth different here
});
});
};
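To compare the two approaches side by side, here is a minimal sketch that runs without any HTTP library. This getData is a hypothetical stand-in that resolves with a fake record after a short delay, and the startLog parameter exists only to show when each "request" begins; neither is part of the original code.

```javascript
// Hypothetical stand-in for the HTTP call: records when it starts,
// then resolves with a fake record a few milliseconds later.
function getData(id, startLog) {
  startLog.push(id);
  return new Promise(function (resolve) {
    setTimeout(function () { resolve({ id: id }); }, 5);
  });
}

const rows = [{ ID: 1 }, { ID: 2 }, { ID: 3 }];

// map + Promise.all: every request is started right away.
const parallelStarts = [];
const allAtOnce = Promise.all(rows.map(function (row) {
  return getData(row.ID, parallelStarts);
}));
const startedImmediately = parallelStarts.length; // all three have started

// reduce: each request starts only after the previous one resolved.
const sequentialStarts = [];
const oneByOne = rows.reduce(function (previous, row) {
  return previous.then(function (acc) {
    return getData(row.ID, sequentialStarts).then(function (result) {
      return acc.concat(result);
    });
  });
}, Promise.resolve([]));
const sequentialStartedSoFar = sequentialStarts.length; // none yet

Promise.all([allAtOnce, oneByOne]).then(function (both) {
  console.log('parallel:', both[0]);
  console.log('sequential:', both[1]);
});
```

Both variants produce the results in the same order as rows; the difference is only in when the requests are started, which matters if the server should not be hit with many requests at once.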

mongoose exec call with async.parallel

I have code like this in my Express controller:
function (req, res) {
var queries = [];
data.forEach(function (item) {
var query = myModel.findOneAndUpdate({remoteId: item.id}, item, {upsert: true}).exec;
queries.push(query);
});
async.parallel(queries, function (err, docs) {
res.json(docs);
});
});
If the data array has 3 items, then I get an array of 3 null values.
Each function passed to async.parallel accepts a callback argument that must be called to complete its execution properly. mongoose.Query.exec does the same, but I receive an array of null values as a result.
If I wrap my exec call like so
var query = function (cb) {
tournamentsModel.findOneAndUpdate({remoteId: item.id}, item, {upsert: true}).exec(function (err, model) {
cb(err, model);
});
};
queries.push(query);
everything is OK and I receive 3 docs from Mongo as a result.
Why should I explicitly call the callback passed by async.parallel, when the exec method does the same?
When you directly pass the exec function of your query to async.parallel as a function to execute, you're losing the this context of that function call that contains your query to run.
To use this approach, you would need to call bind to return a new function that will call exec with the right this context; so something like this:
// assumes mongoose has been required in this module
var query = mongoose.Query.prototype.exec.bind(
myModel.findOneAndUpdate({remoteId: item.id}, item, {upsert: true})
);
queries.push(query);
It's probably cleaner to call exec yourself, but pass in the async callback to it:
var query = function(cb) {
myModel.findOneAndUpdate({remoteId: item.id}, item, {upsert: true}).exec(cb);
}
queries.push(query);
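The lost this context is plain JavaScript behaviour and can be reproduced without Mongoose. In the sketch below, FakeQuery is a hypothetical class standing in for a query object; note that it fails with a TypeError when detached, whereas the real exec in the question produced null results, so only the cause is the same, not the symptom.

```javascript
// FakeQuery is a hypothetical stand-in for a Mongoose query object.
class FakeQuery {
  constructor(filter) {
    this.filter = filter;
  }
  exec(cb) {
    // Relies on `this` pointing at the query instance.
    cb(null, 'ran query for ' + this.filter);
  }
}

const q = new FakeQuery('remoteId=1');

// 1. Detached method: `this` is undefined inside exec (class code is strict).
const detached = q.exec;
let detachedError = null;
try {
  detached(function () {});
} catch (e) {
  detachedError = e;
}

// 2. bind() restores the context.
const bound = q.exec.bind(q);
let boundResult;
bound(function (err, result) { boundResult = result; });

// 3. Wrapper function: exec is called as a method, so `this` is correct.
const wrapped = function (cb) { q.exec(cb); };
let wrappedResult;
wrapped(function (err, result) { wrappedResult = result; });

console.log(detachedError instanceof TypeError, boundResult, wrappedResult);
```

Variants 2 and 3 correspond to the two fixes above; the wrapper is usually easier to read.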

async.parallel() - last function not being called

For some reason the 'yyyyyyyyy' string is never printed when I use async.parallel() as per below. Why is this? I thought that the last function would be called once the other two have been called.
var async = require('async');
async.parallel([
function() {
console.log('xxxxxxxxxxx');
},
function() {
console.log('ccccccccccc');
}
], function(err, results){
console.log('yyyyyyyyy');
});
Every function passed in the first parameter to async.parallel should take a callback that it calls when it's done, so async knows that it has completed:
var async = require('async');
async.parallel([
function(callback) {
console.log('xxxxxxxxxxx');
callback();
},
function(callback) {
console.log('ccccccccccc');
callback();
}
], function(err, results){
console.log('yyyyyyyyy');
});
If an error happens in one of the functions, it should call the callback with
callback(err);
so that async knows an error has occurred and it'll immediately call the last function.
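Why the final function depends on those callbacks becomes clearer with a stripped-down model of the mechanism. The parallel function below is a hypothetical, simplified stand-in (it is not the async library's code, and its error handling is reduced to the bare minimum): it counts how many tasks have reported back and only then calls the final function.

```javascript
// Hypothetical, simplified stand-in for async.parallel (illustration only).
function parallel(tasks, done) {
  const results = new Array(tasks.length);
  let pending = tasks.length;
  let finished = false;
  if (pending === 0) return done(null, results);
  tasks.forEach(function (task, index) {
    task(function (err, value) {
      if (finished) return;            // ignore callbacks after completion
      if (err) { finished = true; return done(err); }
      results[index] = value;
      pending -= 1;
      if (pending === 0) { finished = true; done(null, results); }
    });
  });
}

// Tasks that never call their callback: the final function never runs.
let calledWithoutCallbacks = false;
parallel([function () {}, function () {}], function () {
  calledWithoutCallbacks = true;
});

// Tasks that call their callback: the final function gets the results.
let finalResults = null;
parallel([
  function (callback) { callback(null, 'x'); },
  function (callback) { callback(null, 'c'); }
], function (err, results) {
  finalResults = results;
});

// A task that reports an error: the final function is called with it at once.
let firstError = null;
parallel([
  function (callback) { callback(new Error('boom')); },
  function (callback) { callback(null, 'c'); }
], function (err) {
  firstError = err;
});

console.log(calledWithoutCallbacks, finalResults, firstError && firstError.message);
```

In the first call the counter never reaches zero, which is exactly why 'yyyyyyyyy' was never printed in the question.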

Node.js - Using the async lib - async.foreach with object

I am using the node async lib - https://github.com/caolan/async#forEach and would like to iterate through an object and print out its index key. Once complete, I would like to execute a callback.
Here is what I have so far, but 'iterating done' is never printed:
async.forEach(Object.keys(dataObj), function (err, callback){
console.log('*****');
}, function() {
console.log('iterating done');
});
Why does the final function not get called?
How can I print the object index key?
The final function does not get called because async.forEach requires that you call the callback function for every element.
Use something like this:
async.forEach(Object.keys(dataObj), function (item, callback){
console.log(item); // print the key
// tell async that that particular element of the iterator is done
callback();
}, function(err) {
console.log('iterating done');
});
async.each is a useful function provided by the async library. It takes three arguments:
1. collection/array
2. iteratee
3. callback
The collection is the array (or other collection) to iterate over, the iteratee is the function applied to each item, and the callback is optional.
If you provide the callback, it is called once every iteratee has finished, or as soon as one of them reports an error. It receives only that error, not a result.
Applies the function iteratee to each item in coll, in parallel. The iteratee is called with an item from the list, and a callback for when it has finished. If the iteratee passes an error to its callback, the main callback (for the each function) is immediately called with the error.
Note, that since this function applies iteratee to each item in parallel, there is no guarantee that the iteratee functions will complete in order.
Example:
var updateEventCredit = function ( userId, amount ,callback) {
async.each(userId, function(id, next) {
var incentiveData = new domain.incentive({
user_id:userId,
userName: id.userName,
amount: id.totalJeeneePrice,
description: id.description,
schemeType:id.schemeType
});
incentiveData.save(function (err, result) {
if (err) {
next(err);
} else {
domain.Events.findOneAndUpdate({
user_id: id.ids
}, {
$inc: {
eventsCredit: id.totalJeeneePrice
}
},{new:true}, function (err, result) {
if (err) {
Logger.info("Update status", err)
next(err);
} else {
Logger.info("Update status", result)
sendContributionNotification(id.ids,id.totalJeeneePrice);
next(null,null);
}
});
}
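});
}, function (err) {
// final callback of async.each: runs once after all items, or on the first error
callback(err);
});
};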
