Asynchronous Database Queries with PostgreSQL in Node not working - node.js

Using Node.js and the node-postgres module to communicate with a database, I'm attempting to write a function that accepts an array of queries and callbacks and executes them all asynchronously using the same database connection. The function accepts a two-dimensional array and calling it looks like this:
perform_queries_async([
['SELECT COUNT(id) as count FROM ideas', function(result) {
console.log("FUNCTION 1");
}],
["INSERT INTO ideas (name) VALUES ('test')", function(result) {
console.log("FUNCTION 2");
}]
]);
And the function iterates over the array, creating a query for each sub-array, like so:
function perform_queries_async(queries) {
var client = new pg.Client(process.env.DATABASE_URL);
for(var i=0; i<queries.length; i++) {
var q = queries[i];
client.query(q[0], function(err, result) {
if(err) {
console.log(err);
} else {
q[1](result);
}
});
}
client.on('drain', function() {
console.log("drained");
client.end();
});
client.connect();
}
When I ran the above code, I expected to see output like this:
FUNCTION 1
FUNCTION 2
drained
However, the output bizarrely appears like so:
FUNCTION 2
drained
FUNCTION 2
Not only is the second function getting called for both requests, it also seems as though the drain code is getting called before the client's queue of queries is finished running...yet the second query still runs perfectly fine even though the client.end() code ostensibly killed the client once the event is called.
I've been tearing my hair out about this for hours. I tried hardcoding in my sample array (thus removing the for loop), and my code worked as expected, which leads me to believe that there is some problem with my loop that I'm not seeing.
Any ideas on why this might be happening would be greatly appreciated.

The simplest way to properly capture the value of the q variable in a closure in modern JavaScript is to use forEach:
queries.forEach(function(q) {
client.query(q[0], function(err, result) {
if(err) {
console.log(err);
} else {
q[1](result);
}
});
});
If you don't capture the value, your code reflects the last value that q had, as the callback function executed later, in the context of the containing function.
forEach, by using a callback function isolates and captures the value of q so it can be properly evaluated by the inner callback.

A victim of the famous Javascript closure/loop gotcha. See my (and other) answers here:
I am trying to open 10 websocket connections with nodejs, but somehow my loop doesnt work
Basically, at the time your callback is executed, q is set to the last element of the input array. The way around it is to dynamically generate the closure.

It will be good to execute this using async module . It will help you to reuse the code also . and will make the code more readable . I just love the auto function provided by async module
Ref: https://github.com/caolan/async

Related

Node.js Firestore forEach collection query cannot populate associative array

In this simplified example, associative array A cannot be populated in a Node.js Firestore query---it's as if there is a scoping issue:
var A = {};
A["name"] = "nissa";
firestore.collection("magic: the gathering")
.get()
.then(function(query) {
query.forEach(function(document) {
A[document.id] = document.id;
console.log(A);
});
})
.catch(function(error) {
});
console.log(A);
Console output:
{ name: 'nissa' } < last console.log()
{ name: 'nissa', formats: 'formats' } < first console.log() (in forEach loop)
{ name: 'nissa', formats: 'formats', releases: 'releases' } < second console.log() (in forEach loop)
Grateful for any assistance, please request for further detail if needed.
Data is loaded from Firestore asynchronously, and while that is happening, your main code continues to run.
It's easiest to see what that means by placing a few logging statements:
console.log("Starting to load data");
firestore.collection("magic: the gathering")
.get()
.then(function(query) {
console.log("Got data");
});
console.log("After starting to load data");
When you run this code, it prints:
Starting to load data
After starting to load data
Got data
This is probably not the order that you expected the logging to be in. But it is actually working as intended, and explains the output you see. By the time your last console.log(A); runs, the data hasn't been loaded yet, so A is empty.
The solution is simple, but typically takes some time to get used to: all code that needs the data from the database must be inside the callback, or be called from there.
So something like this:
var A = {};
A["name"] = "nissa";
firestore.collection("magic: the gathering")
.get()
.then(function(query) {
query.forEach(function(document) {
A[document.id] = document.id;
});
console.log(A);
})
Also see:
Array of JSON object is not empty but cannot iterate with foreach, show zero length
NodeJS, Firestore get field
Unable to add Google markers inside a loop, a more complex problem, calling multiple asynchronous API
scope issue in javascript between two Functions, which also shows using the more modern async and await keywords, instead of then()
How to get data from firestore DB in outside of onSnapshot, which uses an onSnapshot listener instead of get()

Using Mongodb variables out of its functions

So I'm making a web application and I'm trying to send variables to an EJS file but when they are sent out of the mongo functions they come out as undefined because it's a different scope for some reason. It's hard to explain so let me try to show you.
router.get("/", function(req, res){
var bookCount;
var userCount;
Books.count({}, function(err, stats){
if(err){
console.log("Books count failed to load.");
}else{
bookCount = stats;
}
});
User.count({}, function(err, count){
if(err){
console.log("User count failed to load.")
}else{
userCount = count;
console.log(userCount);
}
});
console.log(userCount);
//Get All books from DB
Books.find({}, function(err, allbooks){
if(err){
console.log("Problem getting all books");
}else{
res.render("index", {allbooks: allbooks, bookCount: bookCount, userCount: userCount});
}
});
});
So in the User.Count and Books.count I'm finding the number of documents in a collection which works and the number is stored inside of the variables declared at the very top.
After assigning the numbers like userCount i did console.log(userCount) which outputs the correct number which is 3, If was to do console.log(userCount) out of the User.count function it would return undefined, which is a reference to the declaration at the very top.
What is really weird is that Book.Find() has the correct userCount even though its a totally different function. The whole goal im trying to accomplish is doing res.render("index", {userCount: userCount}); outside of the Books.find(). I can do it but of course for some reason it passes undefined instead of 3. I hope this made a shred of sense.
I seem to have found a solution. but if anyone knows a different way I would love to know. So basically all you need to do is move the User.Count function outside of the router.get() function. Not completely sure about the logic of that but it works...
This is a classic asynchronous-operation problem: Your methods (Books.count, Books.find, User.count) are called immediately, but the callback functions you pass to them are not. userCount is undefined in your log because console.log is called before the assignment in the callback function is made. Your code is similar to:
var userCount;
setTimeout(function() {
userCount = 3;
}, 1000);
console.log(userCount); // undefined
User.count takes time to execute before calling back with the result, just like setTimeout takes the specified time to execute before calling its callback. The problem is JS doesn't pause and wait for the timeout to complete before moving on and calling console.log below it, it calls setTimeout, calls console.log immediately after, then the callback function is called one second later.
To render a complete view, you need to be sure you have all of the data before you call res.render. To do so you need to wait for all of the methods to call back before calling res.render. But wait, I just told you that JS doesn't pause and wait, so how can this be accomplished? Promise is the answer. Multiple promises, actually.
It looks like you are using Mongoose models. Mongoose has been written so that if you don't pass a callback function to your methods, they return a promise.
Books.count({}) // returns a promise
JS promises have a method then which takes a callback function that is called when the promise has been resolved with the value of the asynchronous method call.
Books.count({}) // takes some time
.then(function(bookCount) { // called when Books.count is done
// use the bookCount here
})
The problem is, you want to wait for multiple operations to complete, and multiple promises, before continuing. Luckily JS has a utility just for this purpose:
Promise.all( // wait for all of these operations to finish before calling the callback
Books.count({}),
User.count({}),
Books.find({})
)
.then(function(array) { // all done!
// the results are in an array
bookCount = array[0];
userC0unt = array[1];
allBooks = array[2];
})

How do I make a large but unknown number of REST http calls in nodejs?

I have an orientdb database. I want to use nodejs with RESTfull calls to create a large number of records. I need to get the #rid of each for some later processing.
My psuedo code is:
for each record
write.to.db(record)
when the async of write.to.db() finishes
process based on #rid
carryon()
I have landed in serious callback hell from this. The version that was closest used a tail recursion in the .then function to write the next record to the db. However, I couldn't carry on with the rest of the processing.
A final constraint is that I am behind a corporate proxy and cannot use any other packages without going through the network administrator, so using the native nodejs packages is essential.
Any suggestions?
With a completion callback, the general design pattern for this type of problem makes use of a local function for doing each write:
var records = ....; // array of records to write
var index = 0;
function writeNext(r) {
write.to.db(r, function(err) {
if (err) {
// error handling
} else {
++index;
if (index < records.length) {
writeOne(records[index]);
}
}
});
}
writeNext(records[0]);
The key here is that you can't use synchronous iterators like .forEach() because they won't iterate one at a time and wait for completion. Instead, you do your own iteration.
If your write function returns a promise, you can use the .reduce() pattern that is common for iterating an array.
var records = ...; // some array of records to write
records.reduce(function(p, r) {
return p.then(function() {
return write.to.db(r);
});
}, Promsise.resolve()).then(function() {
// all done here
}, function(err) {
// error here
});
This solution chains promises together, waiting for each one to resolve before executing the next save.
It's kinda hard to tell which function would be best for your scenario w/o more detail, but I almost always use asyncjs for this kind of thing.
From what you say, one way to do it would be with async.map:
var recordsToCreate = [...];
function functionThatCallsTheApi(record, cb){
// do the api call, then call cb(null, rid)
}
async.map(recordsToCreate, functionThatCallsTheApi, function(err, results){
// here, err will be if anything failed in any function
// results will be an array of the rids
});
You can also check out other ones to enable throttling, which is probablya good idea.

How to process a big array applying a async function for each element in nodejs?

I am working with zombie.js to scrape one site, I must to use the callback style to connect to each url. The point is that I have got an urls array and I need to process each urls using an async function. This is my first approach:
Array urls = {http..., http...};
function process_url(index)
{
if(index == urls.length)
return;
async_function(url,
function() {
...
//parse the url
...
// Process the next url
process_url(index++);
}
);
}
process_url(0)
Without use someone third party nodejs library to use the asyn funtion as sync function or to wait for the function (wait.for, synchornized, mocha), this is the way that I though to resolve this problem, I don't know what would happen if the array is too big. Is the function released from the memory when the next function is called? or all the functions are in memory until the end?
Any ideas?
Your scheme will work. I call it "manually sequencing async operations".
A general purpose version of what you're doing would look like this:
function processItem(data, callback) {
// do your async function here
// for example, let's suppose it was an http request using the request module
request(data, callback);
}
function processArray(array, fn) {
var index = 0;
function next() {
if (index < array.length) {
fn(array[index++], function(err, result) {
// process error here
if (err) return;
// process result here
next();
});
}
}
next();
}
processArray(arr, processItem);
As to your specific questions:
I don't know what would happen if the array is too big. Is the
function released from the memory when the next function is called? or
all the functions are in memory until the end?
Memory in Javascript is released when it is not longer referenced by any running code and when the garbage collector gets time to run. Since you are running a series of asynchronous operations here, it is likely that the garbage collector gets a chance to run regularly while waiting for the http response from the async operation so memory could get cleaned up then. Functions are just another type of object in Javascript and they get garbage collected just like anything else. When they are no longer reference by running code, they are eligible for garbage collection.
In your specific code, because you are re-calling process_url() only in an async callback, there is no stack build-up (as in normal recursion). The prior instance of process_url() has already completed BEFORE the async callback is called and BEFORE you call the next iteration of process_url().
In general, management and coordination of multiple async operations is much, much easier using promises which are built into the current versions of node.js and are part of the ES6 ECMAScript standard. No external libraries are required to use promises in current versions of node.js.
For a list of a number of different techniques for sequencing your asynchronous operations on your array, both using promises and not using promises, see:
How to synchronize a sequence of promises?.
The first step in using promises is the "promisify" your async function so that it returns a promise instead of takes a callback.
function async_function_promise(url) {
return new Promise(function(resolve, reject) {
async_function(url, function(err, result) {
if (err) {
reject(err);
} else {
resolve(result);
}
});
});
}
Now, you have a version of your function that returns promises.
If you want your async operations to proceed one at a time so the next one doesn't start until the previous one has completed, then a usual design pattern for that is to use .reduce() like this:
function process_urls(array) {
return array.reduce(function(p, url) {
return p.then(function(priorResult) {
return async_function_promise(url);
});
}, Promise.resolve());
}
Then, you can call it like this:
var myArray = ["url1", "url2", ...];
process_urls(myArray).then(function(finalResult) {
// all of them are done here
}, function(err) {
// error here
});
There are also Promise libraries that have some helpful features that make this type of coding simpler. I, myself, use the Bluebird promise library. Here's how your code would look using Bluebird:
var Promise = require('bluebird');
var async_function_promise = Promise.promisify(async_function);
function process_urls(array) {
return Promise.map(array, async_function_promise, {concurrency: 1});
}
process_urls(myArray).then(function(allResults) {
// all of them are done here and allResults is an array of the results
}, function(err) {
// error here
});
Note, you can change the concurrency value to whatever you want here. For example, you would probably get faster end-to-end performance if you increased it to something between 2 and 5 (depends upon the server implementation on how this is best optimized).

node.js for loop execution in a synchronous manner

I have to implement a program in node.js which looks like the following code snippet. It has an array though which I have to traverse and match the values with database table entries. I need to wait till the loop ends and send the result back to the calling function:
var arr=[];
arr=[one,two,three,four,five];
for(int j=0;j<arr.length;j++) {
var str="/^"+arr[j]+"/";
// consider collection to be a variable to point to a database table
collection.find({value:str}).toArray(function getResult(err, result) {
//do something incase a mathc is found in the database...
});
}
However, as the str="/^"+arr[j]+"/"; (which is actually a regex to be passed to find function of MongoDB in order to find partial match) executes asynchronously before the find function, I am unable to traverse through the array and get required output.
Also, I am having hard time traversing through array and send the result back to calling function as I do not have any idea when will the loop finish executing.
Try using async each. This will let you iterate over an array and execute asynchronous functions. Async is a great library that has solutions and helpers for many common asynchronous patterns and problems.
https://github.com/caolan/async#each
Something like this:
var arr=[];
arr=[one,two,three,four,five];
asych.each(arr, function (item, callback) {
var str="/^"+item+"/";
// consider collection to be a variable to point to a database table
collection.find({value:str}).toArray(function getResult(err, result) {
if (err) { return callback(err); }
// do something incase a mathc is found in the database...
// whatever logic you want to do on result should go here, then execute callback
// to indicate that this iteration is complete
callback(null);
});
} function (error) {
// At this point, the each loop is done and you can continue processing here
// Be sure to check for errors!
})

Resources