Node.js async recursive function with callback - node.js

Task: I need to recursively walk through a json object and make certain changes to the keys. I will be handling objects with varying depths, and varying sizes. When the function hits a key whose value is an object, it is called again on that object.
Problem 1.: Doing this as a synchronous function, I noticed that large json objects were returning incomplete. Using the the async library, async.forEach solves the problem of handling long tasks and returning only when finished, but...
Problem 2.: It seems that the async function loses concurrency (?) when it is called recursively (pointed out in the code snippet).
To test this idea, I removed the recursive function call and it worked (but without recursion), and I got a callback. When I add the function call back in, I get TypeError: results is not a function .
This leads me to think that async with callback and recursive functions don't mix in node. Is there a way to achieve both?
Possible Fix: I could run a separate function to count all the keys and use a counter as my control in a simple for loop, rather than letting forEach handle the control. That seems a bit inefficient, no?
Here's the code:
function fixJsonKeys(obj, results) {
async.forEach(Object.keys(obj), function(key, callback) {
if (typeof obj[key] == 'object') {
// do stuff to json key, then call the function
// on the nested object
fixJsonKeys(obj[key]);
callback(); // <-- how does this work with recursion??
}
else {
// do stuff to json key
callback();
}
}, function(err) {
if (err) return next(err);
// obj keys fixed, now return completed object
results(obj);
});
}
EDIT: hard to format in comments, so:
#Laksh and #Suhail: I tried your suggestions and same outcomes. If I remove the callback from the if condition, the async.forEach still looks to confirm that it has handled the top level keys (I think).
For example, say I have 3 top level keys, and one of them has a nested object:
[key1] (no callback, do recursion)
--[subKey1]
--[subKey2]
[key2] (callback)
[key3] (callback)
async.forEach is still looking for a callback on action taken for key1. Do I have this correct?

Since you are using recursive functions here, you don't need to call callback() inside the if condition.
This is because your recursive call will return to the existing callback() in if once the entire recursive stack finished for that particular obj[key].
Recursive function return only when the base condition is true, in your case when if condition fails so it will automatically call callback() from else block.

Related

Why can't I deep copping a DICT out of a function in Node.js?

I'm using request to call an API which gives me movies data in an object called body, but when I try to pass it to an array and console log it, the terminal shows me an empty array.
let copy = [];
const request = require('request');
request('https://www.omdbapi.com/?t=JOKER&apikey=b04f2804', { json: true }, (err, res, body) => {
if (err) { return console.log(err);}
copy.push(JSON.parse(JSON.stringify(body)))
});
console.log(copy);
First off, the request() library is deprecated and it is not recommended that you write new code with it. There is a list of alternatives (that all support promises) here.
Then second, the request() library is non-blocking and asynchronous. That means that when you call request(), it starts the operation and then immediately returns. The code after the call to request() then continues to execute BEFORE request() calls its callback. So, you're trying to examine the array before its value has been set. This is a classic issue in asynchronous programming. There are a number of ways to handle this. If you stayed with the request() library, then you must either use the result INSIDE the callback itself or you can call some other function from within that callback and pass it the array.
With one of the listed alternatives that all support promises (my favorite is the got() library), you can then use await which is often a preferred programming style for asynchronous operations:
const got = require('got');
async function someFunction() {
let result = await got('https://www.omdbapi.com/?t=JOKER&apikey=b04f2804').json();
// you can use the result here
console.log(result);
// Or you can return it and it will become the resolved value
// of the promise this async function returns
return result;
}
someFunction().then(result => {
// can use result here
}).catch(err => {
console.log(err);
});
Note that when you are writing asynchronously retrieved data to a higher scoped variable as you are when you do copy.push(), that is typically a warning that you're doing something wrong. This is because NONE of the code in the higher scope will know when the data is available in that variable. There are rare cases when you might do that (like for implementing a cache), but these have to be situations where the data is being stored only for future reference, not for immediate use. 99.9% of the time when we see that construct, it's a wrong way to program with asynchronously retrieved data.

For loop in redis with nodejs asynchronous requests

I've got a problem with redis and nodejs. I have to loop through a list of phone numbers, and check if this number is present in my redis database. Here is my code :
function getContactList(contacts, callback) {
var contactList = {};
for(var i = 0; i < contacts.length; i++) {
var phoneNumber = contacts[i];
if(utils.isValidNumber(phoneNumber)) {
db.client().get(phoneNumber).then(function(reply) {
console.log("before");
contactList[phoneNumber] = reply;
});
}
}
console.log("after");
callback(contactList);
};
The "after" console log appears before the "before" console log, and the callback always return an empty contactList. This is because requests to redis are asynchronous if I understood well. But the thing is I don't know how to make it works.
How can I do ?
You have two main issues.
Your phoneNumber variable will not be what you want it to be. That can be fixed by changing to a .forEach() or .map() iteration of your array because that will create a local function scope for the current variable.
You have create a way to know when all the async operations are done. There are lots of duplicate questions/answers that show how to do that. You probably want to use Promise.all().
I'd suggest this solution that leverages the promises you already have:
function getContactList(contacts) {
var contactList = {};
return Promise.all(contacts.filter(utils.isValidNumber).map(function(phoneNumber) {
return db.client().get(phoneNumber).then(function(reply) {
// build custom object
constactList[phoneNumber] = reply;
});
})).then(function() {
// make contactList be the resolve value
return contactList;
});
}
getContactList.then(function(contactList) {
// use the contactList here
}, funtion(err) {
// process errors here
});
Here's how this works:
Call contacts.filter(utils.isValidNumber) to filter the array to only valid numbers.
Call .map() to iterate through that filtered array
return db.client().get(phoneNumber) from the .map() callback to create an array of promises.
After getting the data for the phone number, add that data to your custom contactList object (this is essentially a side effect of the .map() loop.
Use Promise.all() on the returned array of promises to know when they are all done.
Make the contactList object we built up be the resolve value of the returned promise.
Then, to call it just use the returned promise with .then() to get the final result. No need to add a callback argument when you already have a promise that you can just return.
The simplest solution may be to use MGET with a list of phone numbers and put the callback in the 'then' section.
You could also put the promises in an array and use Promise.all().
At some point you might want your function to return a promise rather than with callback, just to stay consistent.
Consider refactoring your NodeJS code to use Promises.
Bluebird is an excellent choice: http://bluebirdjs.com/docs/working-with-callbacks.html
you put async code into a for loop (sync operations). So, each iteration of the for loop is not waiting for the db.client(...) function to end.
Take a look at this stackoverflow answer, it explains how to make async loops :
Here

Async.until function parameters

async.parallel([
function(callback) { //This is the first task, and callback is its callback task
db.save('xxx', 'a', function(err) {
//Now we have saved to the DB, so let's tell async that this task is done
callback();
});
},
function(callback) { //This is the second task, and callback is its callback task
db.save('xxx', 'b', callback); //Since we don't do anything interesting in db.save()'s callback, we might as well just pass in the task callback
}
], function(err) { //This is the final callback
console.log('Both a and b are saved now');
});
Hey I found this code online when I trying to understand async in node.js and my question I have with the code above is in the array of functions in async.parallel, what is the callback parameter that is passed in each function? Where does that callback parameter come from and what is the purpose of that variable? I am sorry if this is a dumb question, but I can't grasp a solid understanding of this callback's...
The contract that async.parallel() defines is that you pass it an array of functions. It will then call each of those functions and pass those function a single argument that is a function reference (I think of it as a completion callback). For async.parallel() to work properly, your function that you pass in the array has to call that function reference when your async operation is done. When it calls that function reference, it can pass an error argument and a result argument. If the error argument is truthy, then it will stop further processing and feed that error back to the calling code.
The function that async passes a reference to is internal to the async library and when you call it, the async library does some housekeeping to keep track of how many asynchronous operations are still going, whether they're all done now, whether further processing should stop due to error, etc...
You can essentially think of it like the async library completion function. When your asynchronous operation is done, you have to call that completion function so that the async library knows that your operation is now done. That's how it is able to manage and keep track of your multiple async operations. If you don't call that completion callback, then the async library never knows when your asynchronous operation is done.
The callback parameter is a function created by async. When you call it in each of the array functions, it signals async that the function completed. In that way, you can control when your function is done - when you call callback.
When callback is called from all the array functions, the final function is called.

node.js for loop execution in a synchronous manner

I have to implement a program in node.js which looks like the following code snippet. It has an array though which I have to traverse and match the values with database table entries. I need to wait till the loop ends and send the result back to the calling function:
var arr=[];
arr=[one,two,three,four,five];
for(int j=0;j<arr.length;j++) {
var str="/^"+arr[j]+"/";
// consider collection to be a variable to point to a database table
collection.find({value:str}).toArray(function getResult(err, result) {
//do something incase a mathc is found in the database...
});
}
However, as the str="/^"+arr[j]+"/"; (which is actually a regex to be passed to find function of MongoDB in order to find partial match) executes asynchronously before the find function, I am unable to traverse through the array and get required output.
Also, I am having hard time traversing through array and send the result back to calling function as I do not have any idea when will the loop finish executing.
Try using async each. This will let you iterate over an array and execute asynchronous functions. Async is a great library that has solutions and helpers for many common asynchronous patterns and problems.
https://github.com/caolan/async#each
Something like this:
var arr=[];
arr=[one,two,three,four,five];
asych.each(arr, function (item, callback) {
var str="/^"+item+"/";
// consider collection to be a variable to point to a database table
collection.find({value:str}).toArray(function getResult(err, result) {
if (err) { return callback(err); }
// do something incase a mathc is found in the database...
// whatever logic you want to do on result should go here, then execute callback
// to indicate that this iteration is complete
callback(null);
});
} function (error) {
// At this point, the each loop is done and you can continue processing here
// Be sure to check for errors!
})

Is the following node.js code blocking or non-blocking?

I have the node.js code running on a server and would like to know if it is blocking or not. It is kind of similar to this:
function addUserIfNoneExists(name, callback) {
userAccounts.findOne({name:name}, function(err, obj) {
if (obj) {
callback('user exists');
} else {
// Add the user 'name' to DB and run the callback when done.
// This is non-blocking to here.
user = addUser(name, callback)
// Do something heavy, doesn't matter when this completes.
// Is this part blocking?
doSomeHeavyWork(user);
}
});
};
Once addUser completes the doSomeHeavyWork function is run and eventually places something back into the database. It does not matter how long this function takes, but it should not block other events on the server.
With that, is it possible to test if node.js code ends up blocking or not?
Generally, if it reaches out to another service, like a database or a webservice, then it is non-blocking and you'll need to have some sort of callback. However, any function will block until something (even if nothing) is returned...
If the doSomeHeavyWork function is non-blocking, then it's likely that whatever library you're using will allow for some sort of callback. So you could write the function to accept a callback like so:
var doSomHeavyWork = function(user, callback) {
callTheNonBlockingStuff(function(error, whatever) { // Whatever that is it likely takes a callback which returns an error (in case something bad happened) and possible a "whatever" which is what you're looking to get or something.
if (error) {
console.log('There was an error!!!!');
console.log(error);
callback(error, null); //Call callback with error
}
callback(null, whatever); //Call callback with object you're hoping to get back.
});
return; //This line will most likely run before the callback gets called which makes it a non-blocking (asynchronous) function. Which is why you need the callback.
};
You should avoid in any part of your Node.js code synchronous blocks which don't call system or I/O operations and which computation takes long time (in computer meaning), e.g iterating over big arrays. Instead move this type of code to the separate worker or divide it to smaller synchronous pieces using process.nextTick(). You can find explanation for process.nextTick() here but read all comments too.

Resources