How to know when finished - node.js

I'm pretty new to node.js, so I'm wondering how to know when all elements are processed in, let's say:
["one", "two", "three"].forEach(function(item){
processItem(item, function(result){
console.log(result);
});
});
...now if I want to do something that can only be done when all items are processed, how would I do that?

You can use the async module. Simple example:
async.map(['one','two','three'], processItem, function(err, results){
  // results[0] -> processItem('one');
  // results[1] -> processItem('two');
  // results[2] -> processItem('three');
});
The callback function of async.map will be called when all items are processed. However, you should be careful in processItem: it should look something like this:
function processItem(item, callback){
  // database call or something:
  db.call(myquery, function(){
    callback(); // call when the async event is complete!
  });
}
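Putting the two together, here is a minimal runnable sketch; the setTimeout is my stand-in for a real database call and not part of the original answer:

var async = require('async');

// setTimeout stands in for a real async operation (db call, HTTP request, ...)
function processItem(item, callback) {
  setTimeout(function () {
    callback(null, item.toUpperCase()); // node convention: callback(err, result)
  }, 100);
}

async.map(['one', 'two', 'three'], processItem, function (err, results) {
  if (err) return console.error(err);
  console.log(results); // [ 'ONE', 'TWO', 'THREE' ] - everything is done here
});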

forEach itself is blocking (it runs synchronously); see this post:
JavaScript, Node.js: is Array.forEach asynchronous?
so as long as processItem is synchronous, code that should run when all items are done processing can simply be placed inline after the loop:
["one", "two", "three"].forEach(function(item){
processItem(item, function(result){
console.log(result);
});
});
console.log('finished');
If there is a high IO-bound load for each item to be processed, then take a look at the module Mustafa recommends. There is also a pattern referenced in the post linked above; a sketch of it follows.
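A minimal sketch of that completion-counter pattern, assuming processItem is asynchronous and invokes its callback exactly once per item:

var items = ["one", "two", "three"];
var remaining = items.length;

items.forEach(function (item) {
  processItem(item, function (result) {
    console.log(result);
    if (--remaining === 0) {
      console.log('finished'); // every callback has fired by now
    }
  });
});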

While the other answers are correct, since Node.js now supports ES6, in my opinion using the built-in Promise library will be more stable and tidy.
You don't even need to require anything; Ecma took the Promises/A+ proposal and implemented it natively in JavaScript.
Promise.all(["one", "two","three"].map(processItem))
.then(function (results) {
// here we got the results in the same order of array
} .catch(function (err) {
// do something with error if your function throws
}
As JavaScript is a fairly problematic language when it comes to debugging (dynamic typing, asynchronous flow), sticking with promises instead of callbacks will save you time in the end.

Related

Using MongoDB variables outside of its functions

So I'm making a web application and I'm trying to send variables to an EJS file, but when they are sent outside of the Mongo functions they come out as undefined, because it's a different scope for some reason. It's hard to explain, so let me try to show you.
router.get("/", function(req, res){
var bookCount;
var userCount;
Books.count({}, function(err, stats){
if(err){
console.log("Books count failed to load.");
}else{
bookCount = stats;
}
});
User.count({}, function(err, count){
if(err){
console.log("User count failed to load.")
}else{
userCount = count;
console.log(userCount);
}
});
console.log(userCount);
//Get All books from DB
Books.find({}, function(err, allbooks){
if(err){
console.log("Problem getting all books");
}else{
res.render("index", {allbooks: allbooks, bookCount: bookCount, userCount: userCount});
}
});
});
So in User.count and Books.count I'm finding the number of documents in a collection, which works, and the number is stored inside the variables declared at the very top.
After assigning the numbers, like userCount, I did console.log(userCount), which outputs the correct number, which is 3. If I was to do console.log(userCount) outside of the User.count function it would return undefined, which is a reference to the declaration at the very top.
What is really weird is that Books.find() has the correct userCount even though it's a totally different function. The whole goal I'm trying to accomplish is doing res.render("index", {userCount: userCount}); outside of Books.find(). I can do it, but of course for some reason it passes undefined instead of 3. I hope this made a shred of sense.
I seem to have found a solution, but if anyone knows a different way I would love to know. So basically all you need to do is move the User.count function outside of the router.get() function. Not completely sure about the logic of that, but it works...
This is a classic asynchronous-operation problem: Your methods (Books.count, Books.find, User.count) are called immediately, but the callback functions you pass to them are not. userCount is undefined in your log because console.log is called before the assignment in the callback function is made. Your code is similar to:
var userCount;
setTimeout(function() {
  userCount = 3;
}, 1000);
console.log(userCount); // undefined
User.count takes time to execute before calling back with the result, just like setTimeout takes the specified time before calling its callback. The problem is that JS doesn't pause and wait for the timeout to complete before moving on to the console.log below it: it calls setTimeout, calls console.log immediately after, and then the callback function is called one second later.
To render a complete view, you need to be sure you have all of the data before you call res.render. To do so you need to wait for all of the methods to call back before calling res.render. But wait, I just told you that JS doesn't pause and wait, so how can this be accomplished? Promise is the answer. Multiple promises, actually.
It looks like you are using Mongoose models. Mongoose has been written so that if you don't pass a callback function to your methods, they return a promise.
Books.count({}) // returns a promise
JS promises have a method then which takes a callback function that is called when the promise has been resolved with the value of the asynchronous method call.
Books.count({}) // takes some time
  .then(function(bookCount) { // called when Books.count is done
    // use the bookCount here
  })
The problem is, you want to wait for multiple operations to complete, and multiple promises, before continuing. Luckily JS has a utility just for this purpose:
Promise.all([ // wait for all of these operations to finish before calling the callback
  Books.count({}),
  User.count({}),
  Books.find({})
])
.then(function(array) { // all done!
  // the results are in an array
  bookCount = array[0];
  userCount = array[1];
  allBooks = array[2];
})
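Put together, the route could look like this sketch; it assumes your Mongoose version returns promises (or thenables) from these methods when no callback is passed, as described above:

router.get("/", function(req, res){
  Promise.all([
    Books.count({}),
    User.count({}),
    Books.find({})
  ])
  .then(function(results){
    // render only once all three results are available
    res.render("index", {
      allbooks: results[2],
      bookCount: results[0],
      userCount: results[1]
    });
  })
  .catch(function(err){
    // any one of the three queries failing lands here
    console.log("Failed to load index data:", err);
    res.status(500).send("Something went wrong");
  });
});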

How to return promise to the router callback in NodeJS/ExpressJS

I am new to nodejs/expressjs and mongodb. I am trying to create an API that exposes data to my mobile app that I am trying to build using Ionic framework.
I have a route setup like this
router.get('/api/jobs', (req, res) => {
  JobModel.getAllJobsAsync().then((jobs) => res.json(jobs)); // IS THIS THE CORRECT WAY?
});
I have a function in my model that reads data from Mongodb. I am using the Bluebird promise library to convert my model functions to return promises.
const JobModel = Promise.promisifyAll(require('../models/Job'));
My function in the model
static getAllJobs(cb) {
  MongoClient.connectAsync(utils.getConnectionString()).then((db) => {
    const jobs = db.collection('jobs');
    jobs.find().toArray((err, jobs) => {
      if(err) {
        return cb(err);
      }
      return cb(null, jobs);
    });
  });
}
The promisifyAll(myModule) converts this function to return a promise.
What I am not sure about is:
If this is the correct approach for returning data to the route callback function from my model?
Is this efficient?
Is using promisifyAll slow? It loops through all functions in the module and creates a copy of each function, suffixed with Async, that now returns a promise. When does it actually run? This is a more generic question related to Node's require statements; see the next point.
When do all require statements run? When I start the Node.js server? Or when I make a call to the API?
Your basic structure is more-or-less correct, although your use of Promise.promisifyAll seems awkward to me. The basic issue for me (and it's not really a problem - your code looks like it will work) is that you're mixing and matching promise-based and callback-based asynchronous code. Which, as I said, should still work, but I would prefer to stick to one as much as possible.
If your model class is your code (and not some library written by someone else), you could easily rewrite it to use promises directly, instead of writing it for callbacks and then using Promise.promisifyAll to wrap it.
Here's how I would approach the getAllJobs method:
static getAllJobs() {
  // connect to the Mongo server
  return MongoClient.connectAsync(utils.getConnectionString())
    // ...then do something with the collection
    .then((db) => {
      // get the collection of jobs
      const jobs = db.collection('jobs');
      // I'm not that familiar with Mongo - I'm going to assume that
      // the call to `jobs.find().toArray()` is asynchronous and only
      // available in the "callback flavored" form.
      // returning a new Promise here (in the `then` block) allows you
      // to add the results of the asynchronous call to the chain of
      // `then` handlers. The promise will be resolved (or rejected)
      // when the results of the `jobs.find().toArray()` method are
      // known
      return new Promise((resolve, reject) => {
        jobs.find().toArray((err, jobs) => {
          if(err) {
            return reject(err); // don't fall through to resolve
          }
          resolve(jobs);
        });
      });
    });
}
This version of getAllJobs returns a promise which you can chain then and catch handlers to. For example:
JobModel.getAllJobs()
  .then((jobs) => {
    // this is the object passed into the `resolve` call in the callback
    // above. Do something interesting with it, like
    res.json(jobs);
  })
  .catch((err) => {
    // this is the error passed into the call to `reject` above
  });
Admittedly, this is very similar to the code you have above. The only difference is that I dispensed with the use of Promise.promisifyAll - if you're writing the code yourself & you want to use promises, then do it yourself.
One important note: it's a good idea to include a catch handler. If you don't, your error will be swallowed up and disappear, and you'll be left wondering why your code is not working. Even if you don't think you'll need it, just write a catch handler that dumps it to console.log. You'll be glad you did!
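As an aside: depending on your MongoDB driver version, the new Promise wrapper may be unnecessary, since recent versions of the official driver return a promise from toArray() when you omit the callback. A sketch under that assumption (verify against your driver's docs first):

static getAllJobs() {
  return MongoClient.connectAsync(utils.getConnectionString())
    // if the driver supports it, toArray() with no callback returns a promise
    .then((db) => db.collection('jobs').find().toArray());
}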

How to process a big array applying an async function to each element in nodejs?

I am working with zombie.js to scrape one site, and I must use the callback style to connect to each URL. The point is that I have an array of URLs and I need to process each one using an async function. This is my first approach:
var urls = ["http...", "http..."];

function process_url(index)
{
  if(index == urls.length)
    return;
  async_function(urls[index],
    function() {
      ...
      //parse the url
      ...
      // Process the next url
      process_url(index + 1);
    }
  );
}
process_url(0);
Without using some third-party Node.js library to treat the async function as a sync function or to wait for it (wait.for, synchronize, mocha), this is the way I thought of to solve this problem, but I don't know what would happen if the array is too big. Is each function released from memory when the next one is called, or do all the functions stay in memory until the end?
Any ideas?
Your scheme will work. I call it "manually sequencing async operations".
A general purpose version of what you're doing would look like this:
function processItem(data, callback) {
  // do your async function here
  // for example, let's suppose it was an http request using the request module
  request(data, callback);
}

function processArray(array, fn) {
  var index = 0;
  function next() {
    if (index < array.length) {
      fn(array[index++], function(err, result) {
        // process error here
        if (err) return;
        // process result here
        next();
      });
    }
  }
  next();
}

processArray(arr, processItem);
As to your specific questions:
I don't know what would happen if the array is too big. Is each function released from memory when the next one is called, or do all the functions stay in memory until the end?
Memory in Javascript is released when it is no longer referenced by any running code and when the garbage collector gets time to run. Since you are running a series of asynchronous operations here, it is likely that the garbage collector gets a chance to run regularly while waiting for the http response from the async operation, so memory can get cleaned up then. Functions are just another type of object in Javascript and they get garbage collected just like anything else. When they are no longer referenced by running code, they are eligible for garbage collection.
In your specific code, because you are re-calling process_url() only in an async callback, there is no stack build-up (as in normal recursion). The prior instance of process_url() has already completed BEFORE the async callback is called and BEFORE you call the next iteration of process_url().
In general, management and coordination of multiple async operations is much, much easier using promises which are built into the current versions of node.js and are part of the ES6 ECMAScript standard. No external libraries are required to use promises in current versions of node.js.
For a list of a number of different techniques for sequencing your asynchronous operations on your array, both using promises and not using promises, see:
How to synchronize a sequence of promises?.
The first step in using promises is to "promisify" your async function so that it returns a promise instead of taking a callback.
function async_function_promise(url) {
  return new Promise(function(resolve, reject) {
    async_function(url, function(err, result) {
      if (err) {
        reject(err);
      } else {
        resolve(result);
      }
    });
  });
}
Now, you have a version of your function that returns promises.
If you want your async operations to proceed one at a time so the next one doesn't start until the previous one has completed, then a usual design pattern for that is to use .reduce() like this:
function process_urls(array) {
  return array.reduce(function(p, url) {
    return p.then(function(priorResult) {
      return async_function_promise(url);
    });
  }, Promise.resolve());
}
Then, you can call it like this:
var myArray = ["url1", "url2", ...];
process_urls(myArray).then(function(finalResult) {
  // all of them are done here
}, function(err) {
  // error here
});
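Note that process_urls above resolves only with the result of the last URL. If you need every result, a common variant of the same pattern (my sketch, not part of the original answer) accumulates them in order:

function process_urls_collect(array) {
  var results = [];
  return array.reduce(function(p, url) {
    return p.then(function() {
      return async_function_promise(url).then(function(result) {
        results.push(result); // results stay in array order
      });
    });
  }, Promise.resolve()).then(function() {
    return results; // resolve with all the results
  });
}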
There are also Promise libraries that have some helpful features that make this type of coding simpler. I, myself, use the Bluebird promise library. Here's how your code would look using Bluebird:
var Promise = require('bluebird');
var async_function_promise = Promise.promisify(async_function);

function process_urls(array) {
  return Promise.map(array, async_function_promise, {concurrency: 1});
}

process_urls(myArray).then(function(allResults) {
  // all of them are done here and allResults is an array of the results
}, function(err) {
  // error here
});
Note, you can change the concurrency value to whatever you want here. For example, you would probably get faster end-to-end performance if you increased it to something between 2 and 5 (depends upon the server implementation on how this is best optimized).

Is it safe to read and then write a file asynchronously in node.js?

I have a method that reads and writes a log file. This method is called on every request by all users, and it writes the request path to the file. The questions are:
Is it safe to read and then write a file in async mode, considering concurrency?
If yes, will the code below work correctly under concurrency?
If not, how should I do it?
Please disregard exceptions and performance questions; this is didactic code.
var fs = require('fs');
var logFile = '/tmp/logs/log.js';

app.get("/", function(req, res){
  // note: naming this variable "log" would shadow the log() function below
  var entry = {path: req.path, date: new Date().getTime()};
  log(entry);
});

function log(data){
  fs.exists(logFile, function(exists){
    if(exists){
      // read into "fileContent" so it doesn't shadow the "data" parameter
      fs.readFile(logFile, function (err, fileContent) {
        if (err){
          throw err;
        }
        var logData = JSON.parse(fileContent.toString());
        logData.push(data);
        writeLog(logData);
      });
    }else{
      writeLog([data]);
    }
  });
}

function writeLog(base){
  fs.writeFile(logFile, JSON.stringify(base, null, '\t'), function(err) {
    if(err)
      throw err;
  });
}
I strongly suggest that you don't just "log asynchronously" because you want the log to be ordered based on the order things happened in your app, and there is no guarantee this will happen that way if you don't synchronize it somehow.
You can, for instance, use a promise chain to synchronize it:
var _queue = Promise.resolve();
function log(message){
  _queue = _queue.then(function(){ // chain to the queue
    return new Promise(function(resolve){
      fs.appendFile("/tmp/logs/log.txt", new Date() + message + "\n", function(err){
        if(err) console.log(err); // don't die on log exceptions
        resolve(); // always signal the queue it can progress, even after an error
      });
    });
  });
}
You can now call log and it will queue messages and write them some time asynchronously for you. It will never take more than a single file descriptor or exhaust the server either.
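For example, the route handler from the question then reduces to this sketch (reusing the log function above):

app.get("/", function(req, res){
  log(" request to " + req.path); // queued and appended in arrival order
  res.end();
});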
Consider using a logging solution instead of rolling your own logger btw.
In your example you're already using the asynchronous versions of those functions. If you're concerned about the order of your operations, then you should use the synchronous versions of those functions:
readFileSync
writeFileSync
Also note that JSON.parse() is a synchronous operation. You can give it an "asynchronous" interface using the async module, by wrapping it with async.asyncify(JSON.parse).
As noted by #BenjaminGruenbaum, async.asyncify(); doesn't actually make the operation of JSON.parse(); truly asynchronous but it does provide a more "async" style for the control flow of the operations.

How to do a parallel "Query" in Mongoose

I am pretty new to Mongoose so please bear with me.
Is there a way to perform two queries in "parallel", or at least query two documents and return their results together? The callback notation is tripping me up a little with the synchronization.
In pseudo code this is what I am looking for:
function someWork(callback) {
  var task1 = service.doQueryAndReturnTask();
  var task2 = service.doQueryAndReturnTask();
  waitAll(task1, task2);
  callback(task1, task2);
}
I know this is not the solution, due to the need to have a callback on doQueryAndReturnTask, but I need a pattern that works and preferably doesn't chain the callbacks.
It's not about Mongoose. Node.js is an asynchronous platform, so it allows you to execute any number of async tasks (e.g. querying a database) at the same time.
What you need is some lib to handle asynchronous control flow, like async.js or when.js:
var when = require('when');

var someWork = function(callback) {
  when.all([
    collection1.find(query1).exec(),
    collection2.find(query2).exec()
  ]).spread(callback)
    .otherwise(function(err) {
      // something went wrong
    });
};
when.js is a module to handle promises. So, if you don't need promises, you may use async.js instead:
var async = require('async');

var someWork = function(callback) {
  async.parallel([
    function(cb) { collection1.find(query1, cb) },
    function(cb) { collection2.find(query2, cb) }
  ], function(err, res) {
    if (!err) return callback.apply(null, res);
    // something went wrong
  });
};
Update: Promises are an alternative way to handle asynchronous control flow by wrapping asynchronous functions with promises.
Usually, to get the results of some asynchronous function you pass it a callback which will be executed somewhere in the future.
When you're using promises, instead of passing a callback you immediately get a promise of the results of the execution, which will be resolved somewhere in the future.
So, promises allow you to work with asynchronous functions in a synchronous way, using promises instead of the real data. Promises also allow you to wait for the results at any point of the execution.
In my example I'm executing two queries getting two promises for their results. Then I'm telling node to wait until both promises are fulfilled passing their results to the callback function afterwards.
You can read promises/A+ specification here. You may also look at when.js api docs.
Nowadays, this could be achieved using Promise.all:
Promise.all([
  collection1.find({foo: 'bar'}),
  collection2.find({fooey: 'bazzy'})
]).then(([fooResults, fooeyResults]) => {
  console.log('results: ', fooResults, fooeyResults);
}).catch((err) => {
  console.log('Error: ', err);
});
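Inside an async function the same idea reads even more directly; a sketch, assuming a Node version with async/await support:

async function someWork() {
  // both queries start immediately and run in parallel;
  // await pauses this function until both have settled
  const [fooResults, fooeyResults] = await Promise.all([
    collection1.find({foo: 'bar'}),
    collection2.find({fooey: 'bazzy'})
  ]);
  return [fooResults, fooeyResults];
}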
