Getting data out of a MongoDB call [duplicate] - node.js

This question already has answers here:
Why is my variable unaltered after I modify it inside of a function? - Asynchronous code reference
(7 answers)
Closed 5 years ago.
I am unable to retrieve data from my calls to MongoDB.
The calls work as I can display results to the console but when I try to write/copy those results to an external array so they are usable to my program, outside the call, I get nothing.
EVERY example that I have seen does all it's work within the connection loop. I cannot find any examples where results are copied to an array (either global or passed in), the connection ends, and the program continues processing the external array.
Most of the sample code out there is either too simplistic (ie. just console.log within the connection loop) or is way too complex, with examples of how to make express api routes. I don't need this as I am doing old fashioned serial batch processing.
I understand that Mongo is built to be asynchronous but I should still be able to work with it.
MongoClient.connect('mongodb://localhost:27017/Lessons', function (err, db) {
assert.equal(err, null);
console.log("Connectied to the 'Lessons' database");
var collection = db.collection('students');
collection.find().limit(10).toArray(function(err, docs) {
// console.log(docs);
array = docs.slice(0); //Cloning array
console.log(array);
db.close();
});
});
console.log('database is closed');
console.log(array);
It looks like I'm trying to log the data before the loop has finished. But how to synchronize the timing?
If somebody could explain this to me I'd be really grateful as I've been staring at this stuff for days and am really feeling stupid.

From the code you have shared, do you want the array to display in the console.log at the end? This will not work with your current setup as the 2 console.log's at the end will run before the query to your database is complete.
You should grab your results with a call back function. If your not sure what those are, you will need to learn as mongo / node use them everywhere. Basically javascript is designed to run really fast, and won't wait for something to finish before going to the next line of code.
This tutorial helped me a lot when I was first learning: https://zellwk.com/blog/crud-express-mongodb/
Could you let us know what environment you are running this mongo request? It would give more context because right now I'm not sure how you are using mongo.

thanks for the quick response.
Environment is Windows7 with an instance of mongod running in the background so I'm connecting to localhost. I'm using a db that I created but you can use any collection to run the sample code.
I did say I thought it was a timing thing. Your observation "the 2 console.log's at the end will run before the query to your database is complete" really clarified the problem for me.
I replaced the code at the end, after the connect() with the following code:
function waitasecond(){
console.log(array);
}
setTimeout(waitasecond, 2000);
And the array is fully populated. This suggests that what I was trying to do, at least the way I wanted to do, is not possible. I think I have two choices.
Sequential processing (as I originally concieved) - I would have to put some kind of time delay to let the db call finish before commencing.
Create a function with all the code for the processing that needs to be done and call it from inside the database callback when the database returns the data.
The first options is a bit smelly. I wouldn't want to see that in production so I guess I'll take the second option.
Thanks for the recommeded link. I did a quick look and the problem, for me, is that this is describing a very common pattern that relies on express listening for router calls to respond. The processing I'm doing has nothing to do with router calls.
Ah for the good old days of syncronous io.

Related

How is Node.js asynchronous AND single-threaded at the same time?

The question title basically says it all, but to rephrase it:What handles the asynchronous function execution, if the main (and only) thread is occupied with running down the main code block?
So far I have only found that the async code gets executed elsewhere or outside the main thread, but what does this mean specifically?
EDIT: The proposed Node.js Event loop question's answers may also address this topic, but I was looking for a less complex, more specific answer, rather then an explanation of Node.js concept. Also, it does not show up in search for anything similar to "node asynchronous single-threaded".
EDIT, #Mr_Thorynque: Running a query to get data from database and log it to console. Nothing gets logged, because Node, being async, does not wait for query to finish and for data to populate. (this is just an example as requested, NOT a part of my question)
var = data;
mysql.query(`SELECT *some rows from database*`, function (err, rows, fields) {
rows.forEach(function(row){
data += *gather the requested data*
});
});
console.log(data);
What it really comes down to is that the node process (which is single threaded) hands off the work to "something else". This could be the OS's I/O process, or a networked resource or whatever. By handing it off, it frees its thread to keep working on the next in-memory thing. It uses file handles to keep track of any pending work and in the event loop marries the two back together and fire the callback when the work is done.
Note that this is also why you can block processing in your own code if poorly designed. If your code runs complex tasks and doesn't hand off the work, you'll block the single thread.
This is about as simple an answer as I can make, I think that the details are well explained in the links in your comments.

Synchronous Blocks in Node.js

I want to know if it's possible to have synchronous blocks in a Node.js application. I'm a total newbie to Node, but I couldn't find any good answers for the behavior I'm looking for specifically.
My actual application will be a little more involved, but essentially what I want to do is support a GET request and a POST request. The POST request will append an item to an array stored on the server and then sort the array. The GET request will return the array. Obviously, with multiple GETs and POSTs happening simultaneously, we need a correctness guarantee. Here's a really simple example of what the code would theoretically look like:
var arr = [];
app.get(/*URL 1*/, function (req, res) {
res.json(arr);
});
app.post(/*URL 2*/, function (req, res) {
var itemToAdd = req.body.item;
arr.push(itemToAdd);
arr.sort();
res.status(200);
});
What I'm worried about is a GET request returning the array before it is sorted but after the item is appended or, even worse, returning the array as it's being sorted. In a language like Java I would probably just use a ReadWriteLock. From what I've read about Node's asynchrony, it doesn't appear that arr will be accessed in a way that preserves this behavior, but I'd love to be proven wrong.
If it's not possible to modify the code I currently have to support this behavior, are there any other alternatives or workarounds to get the application to do what I want it to do?
What I'm worried about is a GET request returning the array before it is sorted but after the item is appended or, even worse, returning the array as it's being sorted.
In the case of your code here, you don't have to worry about that (although read on because you may want to worry about other things!). Node.js is single-threaded so each function will run in its entirety before returning control to the Event Loop. Because all your array manipulation is synchronous, your app will wait until the array manipulation is done before answering a GET request.
One thing to watch out for then, of course, is if (for example) the .sort() takes a long time. If it does, your server will block while that is going on. And this is why people tend to favor asynchronous operations instead. But if your array is guaranteed to be small and/or this is an app with a limited number of users (say, it's an intranet app for a small company), then what you're doing may work just fine.
To get a good understanding of the whole single-threaded + event loop thing, I recommend Philip Roberts's talk on the event loop.

Is NodeJS is suitable for websites - (ie. Port Blocking)?

I have gone through many painful months with is issue and I am now ready to let this go to the bin of "what-a-great-lie-for-websites-nodejs-is"!!!
All NodeJS tutorials discuss how to create a website. When done, it works. For one person at a time though. All requests sent to the port will be blocked by the first come-first-serve situation. Why? Because most requests sent to the nodejs server will have to get parsed, data requested from the database, data calculated and parsed, response prepared and sent back to the ajax call. (this is a mere simple website example).
Same applies for authentication - a request is made, data is parsed, authentication takes place, session is created and sent back to the requester.
No matter how you sugar coat this - All requests are done this way. Yes you can employ async functionality which will shorten the time spent on some portions, yes you can try promises, yes you can employ clustering, yes you can employ forking/spawning, etc... The result is always the same at all times: port gets blocked.
I tried responding with a reference so that we can use sockets to pass the data back and matching it with the reference - that also blocked it.
The point of the matter is this: when you ask for help, everyone wants all sort of code examples, but never go to the task of helping with an actual answer that works. The whole wide world!!!!! Which leads me to the answer that nodeJS is not suitable for websites.
I read many requests regarding this and all have been met with: "you must code properly"..! Really? Is there no NodeJS skilled and experienced person who can lend an answer on this one????!!!!!
Most of the NodeJS coders come from the PHP side - All websites using PHP never have to utilise any workaround whatsoever in order to display a web page and it never blocks 2 people at the same time. Thanks to the web server.
So how come NodeJS community cannot come to some sort of an asnwer on this one other than: "code carefully"???!!!
They want examples - each example is met with: "well that is a blocking code, do it another way", or "code carefully"!!! Come one people!!!
Think of the following scenario: One User, One page with 4 lists of records. Theoretically all lists should be filled up with records independently. What is happening because of how data is requested, prepared and responded, each list in reality is waiting for the next one to finish. That is on one session alone.
What about 2 people, 2 sessions at the same time?
So my question is this: is NodeJS suitable for a website and if it is, can anyone show and prove this with a short code??? If you can't prove it, then the answer is: "NodeJS is not suitable for websites!"
Here is an example based on the simplest tutorial and it is still blocking:
var express = require('express'),
fs = require("fs");
var app = express();
app.get('/heavyload', function (req, res) {
var file = '/media/sudoworx/www/sudo-sails/test.zip';
res.send('Heavy Load');
res.end();
fs.readFile(file, 'utf8', function (err,fileContent) {
});
});
app.get('/lightload', function (req, res) {
var file = '/media/sudoworx/www/sudo-sails/test.zip';
res.send('Light Load');
res.end();
});
app.listen(1337, function () {
console.log('Listening!')
});
Now, if you go to "/heavyload" it will immediately respond because that is the first thing sent to the browser, and then nodejs proceeds reading a heavy file (a large file). If you now go to the second call "/lightload" at the same time, you will see that it is waiting for the loading of the file to finish from the first call before it proceeds with the browser output. This is the simplest example of how nodejs simply fails in handling what otherwise would be simple in php and similar script languages.
Like mentioned before, I tried as many as 20 different ways to do this in my career of nodejs programmer. I totally love nodejs, but I cannot get past this obstacle... This is not a complaint - it is a call for help because I am at my end road with nodejs and I don't know what to do.
I thank you kindly.
So here is what I found out. I will answer it with an example of a blocking code:
for (var k = 0; k < 15000; k++){
console.log('Something Output');
}
res.status(200).json({test:'Heavy Load'});
This will block because it has to do the for loop for a long time and then after it finished it will send the output.
Now if you do the same code like this it won't block:
function doStuff(){
for (var k = 0; k < 15000; k++){
console.log('Something Output');
}
}
doStuff();
res.status(200).json({test:'Heavy Load'});
Why? Because the functions are run asynchronously...! So how will I then send the resulting response to the requesting client? Currently I am doing it as follows:
Run the doStuff function
Send a unique call reference which is then received by the ajax call on the client side.
Put the callback function of the client side into a waiting object.
Listen on a socket
When the doStuff function is completed, it should issue a socket message with the resulting response together with the unique reference
When the socket on the client side gets the message with the unique reference and the resulting response, it will then match it with the waiting callback function and run it.
Done! A bit of a workaround (as mentioned before), but it's working! It does require a socket to be listening. And that is my solution to this port-blocking situation in NodeJS.
Is there some other way? I am hoping someone answers with another way, because I am still feeling like this is some workaround. Eish! ;)
is NodeJS suitable for a website
Definitely yes.
can anyone show and prove this with a short code
No.
All websites using PHP never have to utilise any workaround whatsoever
in order to display a web page and it never blocks 2 people at the
same time.
Node.js doesn't require any workarounds as well, it simply works without blocking.
To more broadly respond to your question/complaint:
A single node.js machine can easily serve as a web-server for a website and handle multiple sessions (millions actually) without any need for workarounds.
If you're coming from a PHP web-server, maybe instead of trying to migrate an existing code to a new Node website, first play with online simple website example of Node.js + express, if that works well, you can start adding code that require long-running processes like reading from DBs or reading/writing to files and verify that visitors aren't being blocked (they shouldn't be blocked).
See this tutorial on how to get started.
UPDATE FOR EXAMPLE CODE
To fix the supplied example code, you should convert your fs.readFile call to fs.createReadStream. readFile is less recommended for large files, I don't think readFile literally blocked anything, but the need to allocate and move large amounts of bytes may choke the server, createReadStream uses chunks instead which is much easier on the CPU and RAM:
rstream = fs.createReadStream(file);
var length = 0;
rstream.on('data', function (chunk) {
length += chunk.length;
// Do something with the chunk
});
rstream.on('end', function () { // done
console.log('file read! length = ' + length);
});
After switching your code to createReadStream I'm able to serve continues calls to heavyload / lightload in ~3ms each
THE BETTER ANSWER I SHOULD HAVE WRITTEN
Node.js is a single process architecture, but it has multi-process capabilities, using the cluster module, it allows you to write master/workers code that distributes the load across multiple workers on multiple processes.
You can also use pm2 to do the clustering for you, it has a built-in load balancer to distribute to work, and also allows for scaling up/down without downtime, see this.
In your case, while one process is reading/writing a large file, other processes can accept incoming requests and handle them.

NodeJS callbacks: tracking db calls so I can terminate the process

I got a bill from Heroku this month, much to my surprise. It was only a few dollars, luckily, but I didn't think my usage had been that high. I checked the bill and it said I'd used about 1000 hours last month. I was briefly confused, since my app just runs for a few seconds every hour to send some emails, but then I realized that the process just wasn't terminating.
After commenting out swaths of my code, I've determined that the process doesn't exit because my mongoose database connection is still open. But I've got several nested callbacks to the database and then to mailgun to send these emails, and sometimes the mailgun callback has its own mailgun callback. How do I keep track of these and ensure that the database is closed at the end?
I asked my JS ninja friend, and he said to use semaphores. This sounded daunting but was actually incredibly easy.
npm install semaphore --save
Package page here. Then, for each of my database calls, I did this:
sem.take(function () {
Object.find({key: value}, function () {
sem.leave(); // (I don't need the database anymore)
// tons of other code
});
});
Then I made sure that all of that code runs before this:
sem.take(function () {
sem.leave();
db.close();
});
I think I probably could use a deeper understanding of what's going on, but this is working for now.

Node.js & Mongoose: Async Function Logic

I'm writing an API for a project and recently we've shifted our technology stack to Node.js and MongoDB. However I could not settle some of the aspects related to Node and Mongo.
I started to code Node by checking the infamous Node Beginner tutorial, where it is highly mentioned to follow the non-blocking logic. That is if I understood correctly not waiting for a function to finish, but move on and later "magically" get the results of that function you've moved on.
But there is one thing that confused me which if the non-blocking is the essence of Node, should I follow it when I'm querying a database, because I have to assure and return the result of the connection as either success or the error. The code I have will explain better for the tl;dr 's; (by the way I'm using Mongoose as mongoDB ODM.
db.on('error', function(err){
if(err)
console.log("There is an error");
response.write("Terrible Error!");
response.end();
});
I've written what to do when the db connection succeed after the 'db.on()' error code, however after a second thought I am think it is better to write in 'function(err)' since an error occurs it will directly cancel the operation and end the response. But is it against the non-blocking logic of Node.js?
Is the essence of your question where to place code for callbacks? The recommended pattern is to use the sort of pattern described in the docs. This wraps any document logic within callbacks to avoid blocking operations.

Resources