Node.js & Mongoose: Async Function Logic - node.js

I'm writing an API for a project, and we've recently shifted our technology stack to Node.js and MongoDB. However, I haven't been able to settle some aspects related to Node and Mongo.
I started learning Node with the well-known Node Beginner tutorial, which strongly emphasizes following non-blocking logic. If I understood correctly, that means not waiting for a function to finish, but moving on and later "magically" getting the results of the function you moved past.
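If I understand it, the "move on and come back later" idea looks roughly like this minimal sketch (setTimeout is just a stand-in for any slow operation, such as a database query):

```javascript
// Non-blocking in a nutshell: start a slow task, move on immediately,
// and receive the result later via a callback.
function slowTask(callback) {
  // setTimeout stands in for I/O such as a database query
  setTimeout(function () {
    callback('the result');
  }, 50);
}

slowTask(function (result) {
  console.log('later:', result); // runs second
});
console.log('moved on already');  // runs first
```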
But one thing confuses me: if non-blocking is the essence of Node, should I follow it when querying a database? I have to check and return the result of the connection as either success or error. The code I have will explain better (by the way, I'm using Mongoose as my MongoDB ODM):
db.on('error', function(err){
  if (err) {
    console.log("There is an error");
    response.write("Terrible Error!");
    response.end();
  }
});
I had written the success logic after the db.on() error handler, but on second thought I think it is better to write it inside function(err), since if an error occurs it will directly cancel the operation and end the response. But is that against the non-blocking logic of Node.js?

Is the essence of your question where to place callback code? The recommended approach is the pattern described in the Mongoose docs: wrap any document logic inside callbacks to avoid blocking operations.
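A sketch of that pattern (handleRequest, the model name, and the response object here are hypothetical, not your actual code): both the error path and the success path live inside the query callback, so nothing blocks while the query runs.

```javascript
function handleRequest(db, response) {
  db.model('User').find({}, function (err, docs) {
    if (err) {
      // error path: report, end the response, and bail out early
      response.write("Terrible Error!");
      response.end();
      return;
    }
    // success path: this runs only once the query has finished
    response.write(JSON.stringify(docs));
    response.end();
  });
}
```

Putting the success logic after db.on() would run it before the query completes; keeping it in the callback is what the non-blocking style expects.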


Getting data out of a MongoDB call [duplicate]

This question already has answers here:
Why is my variable unaltered after I modify it inside of a function? - Asynchronous code reference
(7 answers)
Closed 5 years ago.
I am unable to retrieve data from my calls to MongoDB.
The calls work, as I can display results to the console, but when I try to write/copy those results to an external array so they are usable to my program outside the call, I get nothing.
EVERY example that I have seen does all its work within the connection loop. I cannot find any examples where results are copied to an array (either global or passed in), the connection ends, and the program continues processing the external array.
Most of the sample code out there is either too simplistic (i.e. just console.log within the connection loop) or way too complex, with examples of how to build Express API routes. I don't need this, as I am doing old-fashioned serial batch processing.
I understand that Mongo is built to be asynchronous but I should still be able to work with it.
MongoClient.connect('mongodb://localhost:27017/Lessons', function (err, db) {
  assert.equal(err, null);
  console.log("Connected to the 'Lessons' database");
  var collection = db.collection('students');
  collection.find().limit(10).toArray(function (err, docs) {
    // console.log(docs);
    array = docs.slice(0); // cloning the results into the external array
    console.log(array);
    db.close();
  });
});
console.log('database is closed');
console.log(array);
It looks like I'm trying to log the data before the callback has finished. But how do I synchronize the timing?
If somebody could explain this to me I'd be really grateful as I've been staring at this stuff for days and am really feeling stupid.
From the code you have shared, it looks like you want the array to be displayed by the console.log at the end. That will not work with your current setup: the two console.log calls at the end will run before the query to your database is complete.
You should grab your results with a callback function. If you're not sure what those are, you will need to learn them, as Mongo and Node use them everywhere. Basically, JavaScript is designed to run really fast, and it won't wait for one thing to finish before going to the next line of code.
This tutorial helped me a lot when I was first learning: https://zellwk.com/blog/crud-express-mongodb/
Could you let us know what environment you are running this mongo request in? It would give more context, because right now I'm not sure how you are using mongo.
Thanks for the quick response.
The environment is Windows 7 with an instance of mongod running in the background, so I'm connecting to localhost. I'm using a db that I created, but you can use any collection to run the sample code.
I did say I thought it was a timing thing. Your observation that "the 2 console.log's at the end will run before the query to your database is complete" really clarified the problem for me.
I replaced the code at the end, after the connect() with the following code:
function waitasecond(){
  console.log(array);
}
setTimeout(waitasecond, 2000);
And the array is fully populated. This suggests that what I was trying to do, at least the way I wanted to do it, is not possible. I think I have two choices.
Sequential processing (as I originally conceived) - I would have to put in some kind of time delay to let the db call finish before commencing.
Create a function with all the code for the processing that needs to be done, and call it from inside the database callback when the database returns the data.
The first option is a bit smelly. I wouldn't want to see that in production, so I guess I'll take the second option.
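That second option can be sketched without a real database (setImmediate stands in for the toArray query; processStudents is a placeholder for the batch processing): the processing function runs only inside the callback, once the data actually exists.

```javascript
// Option 2: move the batch processing into its own function and call it
// from inside the query callback.
function processStudents(students) {
  // the rest of the batch run continues from here, not after connect()
  return students.map(function (s) { return s.name; });
}

// stand-in for collection.find().limit(10).toArray(callback)
function fakeToArray(callback) {
  setImmediate(function () {
    callback(null, [{ name: 'Ann' }, { name: 'Bob' }]);
  });
}

fakeToArray(function (err, docs) {
  if (err) throw err;
  console.log(processStudents(docs)); // docs is guaranteed populated here
});
```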
Thanks for the recommended link. I took a quick look, and the problem, for me, is that it describes a very common pattern that relies on Express listening for router calls to respond. The processing I'm doing has nothing to do with router calls.
Ah, for the good old days of synchronous I/O.

node.js async.js nextTick vs setImmediate

I have a large node.js application that heavily uses the async.js module.
I have a lot of code like this:
async.series([
  function(callback){
    sql.update(query, callback);
  },
  function(callback){
    if (something){
      sql.update(query2, callback);
    } else {
      callback(null);
    }
  }
]);
The big problem is the synchronous callback in the else branch. I read a while back that you should not do that with async.js, as it could cause unexpected results, but I'm not sure what the best alternative is. I read that I should use process.nextTick in some places, but now I'm reading that we should not use that and should use setImmediate instead.
Can someone point me in the right direction? Thanks!
I FINALLY found the reference I remember reading a while back.
It is on this page:
http://caolanmcmahon.com/posts/nodejs_style_and_structure/
He discusses the general case in rule #3 and in more direct correlation to my issue in response to the first comment on the page.
Even if it's not the cause of my random error, I would like to use the module according to the author's intentions which was the point of the question.
In this article, he mentions using process.nextTick, but I really think Node is trying to move away from that in this case, and I found a response in an async.js issue thread from a month ago that says:
"You need to make sure your functions are running asynchronously - if you don't know whether it'll be asynchronous or not you can force it to be async using setImmediate"
So, that's the answer I'm going with: I'm going to wrap any synchronous callbacks in setImmediate.
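The wrap itself is one line; this minimal sketch (maybeUpdate stands in for the conditional task above, with no real SQL involved) shows the deferred callback now always firing on a later turn of the event loop:

```javascript
// Force the "no work to do" branch to complete asynchronously,
// matching the branch that really does I/O.
function maybeUpdate(something, callback) {
  if (something) {
    // imagine sql.update(query2, callback) happening here
    setImmediate(function () { callback(null, 'updated'); });
  } else {
    // was: callback(null) -- synchronous; wrap it instead:
    setImmediate(function () { callback(null, 'skipped'); });
  }
}

maybeUpdate(false, function (err, result) {
  console.log('callback fired:', result); // always logs second
});
console.log('logs first, on every code path');
```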

Reasonable handling scheme for Node.JS async errors in unusual places

In Java, I am used to try..catch, with finally to cleanup unused resources.
In Node.JS, I don't have that ability.
Odd errors can occur: for example, the database could shut down at any moment, or any single table or file could be missing, etc.
With nested calls like db.query(..., function(err, results){ ... }, it becomes tedious to call if (err) { send500(res); return; } every time, especially if I also have to clean up resources; for example, db.end() would definitely be appropriate.
How can one write code that makes async catch and finally blocks both be included?
I am already aware of the ability to restart the process, but I would like to use that as a last-resort only.
A full answer to this is pretty in depth, but it's a combination of:
consistently handling the error positional argument in callback functions. Doubling down here should be your first course of action.
You will see #izs refer to this as "boilerplate" because you need a lot of this whether you are doing callbacks or promises or flow control libraries. There is no great way to totally avoid this in node due to the async nature. However, you can minimize it by using things like helper functions, connect middleware, etc. For example, I have a helper callback function I use whenever I make a DB query and intend to send the results back as JSON for an API response. That function knows how to handle errors, not found, and how to send the response, so that reduces my boilerplate substantially.
use process.on('uncaughtException') as per #izs's blog post
use try/catch for the occasional synchronous API that throws exceptions. Rare but some libraries do this.
consider using domains. Domains will get you closer to the java paradigm but so far I don't see that much talk about them which leads me to expect they are not widely adopted yet in the node community.
consider using cluster. While not directly related, it generally goes hand in hand with this type of production robustness.
some libraries have top-level error events. For example, if you are using mongoose to talk to mongodb and the connection suddenly dies, the connection object will emit an error event.
Here's an example. The use case is a REST/JSON API backed by a database.
// shared error handling for all your REST GET requests
function normalREST(res, error, result) {
  if (error) {
    log.error("DB query failed", error);
    res.status(500).send(error);
    return;
  }
  if (!result) {
    res.status(404).send();
    return;
  }
  res.send(result); // handles arrays or objects OK
}

// here's a route handler for /users/:id
function getUser(req, res) {
  db.User.findById(req.params.id, normalREST.bind(null, res));
}
And I think my takeaway is that overall in JavaScript itself, error handling is basically woefully inadequate. In the browser, you refresh the page and get on with your life. In node, it's worse because you're trying to write a robust and long-lived server process. There is a completely epic issue comment on github that goes into great detail how things are just fundamentally broken. I wouldn't get your hopes up of ever having JavaScript code you can point at and say "Look, Ma, state-of-the-art error handling". That said, in practice if you follow the points I listed above, empirically you can write programs that are robust enough for production.
See also The 4 Keys to 100% Uptime with node.js.

node.js's "bad" packages?

I am getting my hands on Node.js and its valuable NPM service. I tried installing this package and, reading the documentation, it says that to generate a short id, this code is needed:
shortId.generate();
that means that to use the ID, I would need something like this:
var id = shortId.generate();
res.end(id);
I hope I am not making a mistake here, but I thought the correct way to do things asynchronously was to use callbacks, and do something like:
shortId.generate(function(val){
  res.end(val);
});
Can anyone please help me clarify this? Thanks in advance.
Yes, the code in your example is synchronous. Node.js draws its strength from asynchronous code, but not absolutely everything is asynchronous.
Mostly, asynchronous code is useful for blocking I/O.
As you can see from that module's source code, it does not perform any I/O at all while generating the id.
Callbacks in Node are used when I/O takes place, so the program does not wait for the operation to complete; instead it provides a function to be called when the I/O finishes.
The shortId.generate function is synchronous, so it doesn't provide a callback for the result.
This makes sense in this case, because unique-ID generation isn't a heavy operation. If it were, you could adjust the code to use a callback.
Callbacks are definitely common though! For example, say your web application wants to save an object to the server. You can be non-blocking here by passing a callback to the save function, so you can return a response before the object has been written to disk/cache.
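The contrast can be sketched with stand-ins (generateId and saveObject below are hypothetical, not the real shortid API): a cheap, pure computation returns its value directly, while the I/O-bound save takes a callback.

```javascript
// pure computation, no I/O: a direct return value is the right design
function generateId() {
  return Math.random().toString(36).slice(2, 9);
}

// pretend this writes to disk/cache; the callback fires when I/O is done
function saveObject(obj, callback) {
  setImmediate(function () {
    callback(null, { saved: true, id: obj.id });
  });
}

var id = generateId();          // synchronous: just use the return value
saveObject({ id: id }, function (err, result) {
  console.log('saved', result.id); // asynchronous: use the callback
});
```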
I recommend reading art of node for some great examples of blocking vs. non-blocking. :)

Asynchronous GraphicsMagick For Node

I am using GraphicsMagick for Node. I basically crop photos and retrieve the EXIF data of photos uploaded by the user. I don't want to block the flow of requests while waiting for these tasks to complete, so I need to use asynchronous functions, and I think I should be able to, since these are I/O operations that Node.js makes async itself.
But as far as I can see, all the functions in GraphicsMagick for Node are synchronous. So I am not sure how to achieve what I am looking for.
One idea that comes to mind is to write a function with a callback, do the GraphicsMagick processing inside it, and use process.nextTick() to achieve asynchronous flow. But I am not totally sure that is fine. Also, are there any asynchronous functions for GraphicsMagick?
Please help; example code showing how to call GraphicsMagick asynchronously would be much appreciated.
UPDATE:
The accepted answer from @Saransh Mohapatra is actually wrong. A little investigation revealed that the methods that perform operations on images do not actually do anything; they only append arguments to a list that is used later, when you call write or any buffer-related method that gets/writes the actual image buffer.
Here are the details, using blur as an example:
We call blur: https://github.com/aheckmann/gm/blob/master/lib/args.js#L780
It calls this.out, which will call: https://github.com/aheckmann/gm/blob/master/lib/command.js#L49
Which has a method made for it when it was constructed: https://github.com/aheckmann/gm/blob/master/lib/command.js#L34
All it does is a.push(arguments[i]); and then concatenates it to the full argument list (the other arguments).
That's it.
Then, when write is called:
https://github.com/aheckmann/gm/blob/master/lib/command.js#L62
It gets the list of arguments via self.args(): https://github.com/aheckmann/gm/blob/master/lib/command.js#L78
Which just filters out some reserved fields: https://github.com/aheckmann/gm/blob/master/lib/command.js#L274
Those arguments are then joined in _spawn, which is called from write: https://github.com/aheckmann/gm/blob/master/lib/command.js#L187
That's it.
So, based on this, any method that performs operations on an image but does not save or persist its buffer does not need to be async, as it does no real work at all.
So that means you do not need to worry about them.
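That accumulate-then-spawn behaviour can be sketched with a toy class (FakeGm is purely illustrative, not the gm API): each "operation" only records CLI arguments, and nothing runs until write.

```javascript
function FakeGm(source) {
  this._args = [source];
}
FakeGm.prototype.blur = function (radius) {
  this._args.push('-blur', String(radius)); // just a.push(arguments[i])
  return this; // chainable, like the real module
};
FakeGm.prototype.write = function (dest, callback) {
  var cmd = this._args.concat(dest).join(' ');
  // the real module would spawn a GraphicsMagick child process here
  setImmediate(function () { callback(null, cmd); });
};

new FakeGm('1.jpg').blur(5).write('1b.jpg', function (err, cmd) {
  console.log('would spawn:', cmd);
});
```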
OLD:
The best approach for any heavy processing is to use a separate process.
You can create another small Node.js process that has some means of communicating with the main process (ZeroMQ is a good choice here).
This separate process has to be notified about the file (path) and what to do with it; you can easily send that data via ZeroMQ from the main process, which makes such decisions.
This approach lets the main (web?) node processes work independently, and makes it possible to scale out to separate hardware/instances in the future.
It is very good practice as well (unix-like separation of application logic).
And here is how to promisify gm:
var Promise = require('bluebird');
var gm = require('gm').subClass({imageMagick: true});
Promise.promisifyAll(gm.prototype);

gm('1.jpg')
  .resize(240, 240)
  .noProfile()
  .writeAsync('1b.jpg')
  .then(function () {
    console.log('done');
  })
  .catch(function (err) {
    console.log(err);
  });
https://github.com/aheckmann/gm/issues/320
Sorry, my observation was mistaken: the GraphicsMagick module's functions look synchronous, but they are not. They spawn a child process each time a manipulation is done, and this has been confirmed here.
So, for anyone else looking at this problem: GraphicsMagick's functions are asynchronous, and you don't have to do anything extra on your part. It's a very good module and worth checking out.
Thanks.
