Node.js: detect when all events of a request have finished - node.js

Sorry if this question is simple, but I have been using Node.js for only a few days.
Basically I receive a JSON with some entries. I loop over these entries and launch an HTTP request for each of them. Something like this:
for (var i in entries) {
    // Lots of stuff
    http.get(options, function(res) {
        // Parse the response and detect whether it was successful
    });
}
How can I detect when all requests are done? I need this in order to call response.end().
Also, I will need to report whether each entry succeeded or not. Should I use a global variable to save the result of each entry?

You can e.g. use caolan's "async" library:
async.map(entries, function(entry, cb) {
    http.get(options, function(res) {
        // call cb here, first argument is the error or null, second one is the result
    });
}, function(err, res) {
    // this gets called when all requests are complete
    // res is an array with the results
});
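For example, a filled-in sketch of that same async.map call. It assumes each entry carries its own request options, treats a non-2xx status as a failure for that entry rather than an error for the whole batch, and uses the response object from the question:
var async = require('async');
var http = require('http');

async.map(entries, function(entry, cb) {
    http.get(entry.options, function(res) {
        res.resume(); // drain the response so the socket is freed
        // report per-entry success without aborting the whole map
        cb(null, { entry: entry, ok: res.statusCode === 200 });
    }).on('error', function(err) {
        cb(null, { entry: entry, ok: false, error: err.message });
    });
}, function(err, results) {
    // all requests are complete; results has one element per entry
    response.end(JSON.stringify(results));
});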

There are many different libraries for that. I prefer the q and qq futures libraries to async, as async leads to forests of callbacks in complex scenarios. Yet another library is Step.

Related

Why can't I deep copy a dict out of a function in Node.js?

I'm using request to call an API which gives me movie data in an object called body, but when I try to push it into an array and console.log it, the terminal shows me an empty array.
let copy = [];
const request = require('request');
request('https://www.omdbapi.com/?t=JOKER&apikey=b04f2804', { json: true }, (err, res, body) => {
    if (err) { return console.log(err); }
    copy.push(JSON.parse(JSON.stringify(body)));
});
console.log(copy);
First off, the request() library is deprecated and it is not recommended that you write new code with it. There is a list of alternatives (that all support promises) here.
Then second, the request() library is non-blocking and asynchronous. That means that when you call request(), it starts the operation and then immediately returns. The code after the call to request() then continues to execute BEFORE request() calls its callback. So, you're trying to examine the array before its value has been set. This is a classic issue in asynchronous programming. There are a number of ways to handle this. If you stayed with the request() library, then you must either use the result INSIDE the callback itself or you can call some other function from within that callback and pass it the array.
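For example, a minimal sketch of the "use the result inside the callback" option, still with the (deprecated) request() library and the URL from the question; processMovie is just a hypothetical consumer of the result:
const request = require('request');

request('https://www.omdbapi.com/?t=JOKER&apikey=b04f2804', { json: true }, (err, res, body) => {
    if (err) { return console.log(err); }
    // the data is only guaranteed to exist here, inside the callback
    console.log(body);
    processMovie(body);
});

function processMovie(movie) {
    // hypothetical consumer of the result
    console.log(movie.Title);
}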
With one of the listed alternatives that all support promises (my favorite is the got() library), you can then use await which is often a preferred programming style for asynchronous operations:
const got = require('got');

async function someFunction() {
    let result = await got('https://www.omdbapi.com/?t=JOKER&apikey=b04f2804').json();
    // you can use the result here
    console.log(result);
    // Or you can return it and it will become the resolved value
    // of the promise this async function returns
    return result;
}

someFunction().then(result => {
    // can use result here
}).catch(err => {
    console.log(err);
});
Note that when you are writing asynchronously retrieved data to a higher scoped variable as you are when you do copy.push(), that is typically a warning that you're doing something wrong. This is because NONE of the code in the higher scope will know when the data is available in that variable. There are rare cases when you might do that (like for implementing a cache), but these have to be situations where the data is being stored only for future reference, not for immediate use. 99.9% of the time when we see that construct, it's a wrong way to program with asynchronously retrieved data.

Make chained functions wait for each other to execute

How do I make a chained function wait for the function before it to execute properly?
I have this excerpt from my module:
var ParentFunction = function() {
    this.userAgent = "SomeAgent";
    return this;
};

ParentFunction.prototype.login = function() {
    var _this = this;
    request.post(
        url, {
            "headers": {
                "User-Agent": _this.userAgent
            }
        }, function(err, response, body) {
            return _this;
        });
};

ParentFunction.prototype.user = function(username) {
    this.username = username;
    return this;
};

ParentFunction.prototype.exec = function(callback) {
    var _this = this;
    request.post(
        anotherURL, {
            "headers": {
                "User-Agent": _this.userAgent
            }
        }, function(err, response, body) {
            callback(body);
        });
};

module.exports = ParentFunction;
And this is from within my server:
var pF = require("./myModule.js"),
    parentFunction = new pF();

parentFunction.login().user("Mobilpadde").exec(function(data) {
    res.json(data);
});
The problem is that the user function won't wait for login to finish (meaning it executes before login returns _this). So how do I make it wait?
You can't make Javascript "wait" before executing the next call in the chain. The whole chain will execute immediately in sequential order without waiting for any async operations to complete.
The only way I can think of to make this structure work is to create a queue of things waiting to execute and then somehow monitor the things in that queue that are async so you know when to execute the next thing in the queue. This requires making each method follow some sort of standard convention so you know whether the method is async, when its async operation is done, and what to do if there's an error in the chain.
jQuery does something like this for jQuery animations (it implements a queue for all chained animations and calls each animation in turn when the previous animation is done). But, its implementation is simpler than what you have here because it works only with jQuery animations (or manually queued functions that follow the proper convention), not with other jQuery methods. You are proposing a mix of three different kinds of methods, two of which are not even async.
So, to make a long story shorter, what you are asking for could likely be done if you make all methods follow a set of conventions, but it's not easy. Since you appear to have only one async operation here, I'm wondering if you could do something like this instead. You don't show what the .exec() method does, but if all it does is call some other function at the end of the chain, then you'd only have one async method in the chain, so you could just let it take a callback and do this:
parentFunction.user("Mobilpadde").login(function(data) {
    res.json(data);
});
I was working on a queued means of doing this, but it is not something I can write and test in less than an hour and you'd have to offer some ideas for what you want to do with errors that occur anywhere in the chain. Non-async errors could just throw an exception, but async errors or even non-async errors that occur after an async operation completes can't just throw because there's no good way to catch them. So, error handling becomes complex and you really shouldn't embark on a design path that doesn't anticipate appropriate error handling.
The more I think about this, the more I think you either want to get a library designed to handle the sequencing of async operations (such as the async module) and use that for your queueing of operations or you want to just give up on the chaining and use promises to support sequencing of your operations. As I was thinking about how to do error handling in the task queue with async operations, it occurred to me that all these problems have already been dealt with in promises (propagation of errors through reject and catching of exceptions in async handlers that are turned into promise rejections, etc...). Doing error handling right with async operations is difficult and is one huge reason to build off of promises for sequencing async operations.
So, now thinking about solving this using promises, what you could do is make each async method return a promise (you can promisify the entire request module with one call in many promise libraries, such as Bluebird). Then, once .login() and .exec() return promises, you can do this:
var Promise = require('bluebird');
var request = Promise.promisifyAll(require('request'));

ParentFunction.prototype.login = function() {
    return request.postAsync(
        url, {
            "headers": {
                "User-Agent": this.userAgent
            }
        });
};

ParentFunction.prototype.exec = function() {
    return request.postAsync(
        anotherURL, {
            "headers": {
                "User-Agent": this.userAgent
            }
        }).spread(function(response, body) {
            return body;
        });
};

parentFunction.login().then(function() {
    parentFunction.user("Mobilpadde");
    return parentFunction.exec().then(function(data) {
        res.json(data);
    });
}).catch(function(err) {
    // handle errors here
});
This isn't chaining, but it gets you going in minutes rather than with something that would probably take quite a while to write (with robust error handling).
Try this and see if it works:
ParentFunction.prototype.login = function(callback) {
    var _this = this;
    request.post(
        url, {
            "headers": {
                "User-Agent": _this.userAgent
            }
        }, function(err, response, body) {
            return callback(_this);
        });
};
On the server side:
parentFunction.login(function(loggedin) {
    loggedin.user("Mobilpadde").exec(function(data) {
        res.json(data);
    });
});

How do you structure sequential AWS service calls within lambda given all the calls are asynchronous?

I'm coming from a Java background, so I'm a bit of a newbie on the JavaScript conventions needed for Lambda.
I've got a Lambda function which is meant to do several AWS tasks in a particular order, depending on the result of the previous task.
Given that each task reports its results asynchronously, I'm wondering what the right way is to make sure they all happen in the right sequence, and that the results of one operation are available to the invocation of the next function.
It seems like I have to invoke each function in the callback of the prior one, but it seems like that will lead to some kind of deep nesting, and I'm wondering if that is the proper way to do this.
For example, one of these functions requires a DynamoDB getItem, followed by a call to SNS to get an endpoint, followed by an SNS call to send a message, followed by a DynamoDB write.
What's the right way to do that in Lambda JavaScript, accounting for all that asynchronicity?
I like the answer from @jonathanbaraldi, but I think it would be better if you manage control flow with Promises. The Q library has some convenience functions like nbind which help convert Node-style callback APIs like the aws-sdk into promises.
So in this example I'll send an email, and then as soon as the email response comes back I'll send a second email. This is essentially what was asked: calling multiple services in sequence. I'm using the then method of promises to manage that in a vertically readable way, and catch to handle errors. I think it reads much better than simply nesting callback functions.
var Q = require('q');
var AWS = require('aws-sdk');
AWS.config.credentials = { "accessKeyId": "AAAA", "secretAccessKey": "BBBB" };
AWS.config.region = 'us-east-1';

// Use a promised version of sendEmail
var ses = new AWS.SES({apiVersion: '2010-12-01'});
var sendEmail = Q.nbind(ses.sendEmail, ses);

exports.handler = function(event, context) {
    console.log(event.nome);
    console.log(event.email);
    console.log(event.mensagem);

    var nome = event.nome;
    var email = event.email;
    var mensagem = event.mensagem;

    var to = ['email@company.com.br'];
    var from = 'site@company.com.br';

    // Send email
    mensagem = "" + nome + "||" + email + "||" + mensagem + "";
    console.log(mensagem);

    var params = {
        Source: from,
        Destination: { ToAddresses: to },
        Message: {
            Subject: {
                Data: 'Form contact our Site'
            },
            Body: {
                Text: {
                    Data: mensagem
                }
            }
        }
    };

    // Here is the white-meat of the program right here.
    sendEmail(params)
        .then(sendAnotherEmail)
        .then(success)
        .catch(logErrors);

    function sendAnotherEmail(data) {
        console.log("FIRST EMAIL SENT=" + data);
        // send a second one.
        return sendEmail(params);
    }

    function logErrors(err) {
        console.log("ERROR=" + err, err.stack);
        context.done();
    }

    function success(data) {
        console.log("SECOND EMAIL SENT=" + data);
        context.done();
    }
};
Short answer:
Use async/await, and call the AWS service (SNS, for example) with the .promise() extension to tell aws-sdk to use the promisified version of that service function instead of the callback-based version.
Since you want to execute them in a specific order, you can use async/await, assuming that the parent function you are calling them from is itself async.
For example:
let snsResult = await sns.publish({
    Message: snsPayload,
    MessageStructure: 'json',
    TargetArn: endPointArn
}, async function (err, data) {
    if (err) {
        console.log("SNS Push Failed:");
        console.log(err.stack);
        return;
    }
    console.log('SNS push succeeded: ' + data);
    return data;
}).promise();
The important part is the .promise() on the end there. Full docs on using aws-sdk in an async / promise based manner can be found here: https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/using-promises.html
In order to run another aws-sdk task you would similarly add await and the .promise() extension to that function (assuming that is available).
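For example, a hedged sketch of two aws-sdk (v2) calls run in sequence this way; the table, key, and topic values here are made up for illustration:
const AWS = require('aws-sdk');
const dynamo = new AWS.DynamoDB.DocumentClient();
const sns = new AWS.SNS();

exports.handler = async (event) => {
    // step 1: read the record (hypothetical table and key)
    const item = await dynamo.get({
        TableName: 'MyTable',
        Key: { id: event.id }
    }).promise();

    // step 2: only runs after the read has finished
    const publishResult = await sns.publish({
        TopicArn: process.env.TOPIC_ARN, // hypothetical topic
        Message: JSON.stringify(item.Item)
    }).promise();

    return publishResult;
};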
For anyone who runs into this thread and is actually looking to simply push promises to an array and wait for that WHOLE array to finish (without regard to which promise executes first) I ended up with something like this:
let snsPromises = []; // declare array to hold promises

let snsResult = await sns.publish({
    Message: snsPayload,
    MessageStructure: 'json',
    TargetArn: endPointArn
}, async function (err, data) {
    if (err) {
        console.log("Search Push Failed:");
        console.log(err.stack);
        return;
    }
    console.log('Search push succeeded: ' + data);
    return data;
}).promise();

snsPromises.push(snsResult);
await Promise.all(snsPromises);
Hope that helps someone that randomly stumbles on this via google like I did!
I don't know Lambda but you should look into the node async library as a way to sequence asynchronous functions.
async has made my life a lot easier and my code much more orderly without the deep nesting issue you mentioned in your question.
Typical async code might look like:
async.waterfall([
    function doTheFirstThing(callback) {
        db.somecollection.find({}).toArray(callback);
    },
    function useResult(dbFindResult, callback) {
        // do some other stuff (could be sync or async)
        // etc etc etc
        callback(null);
    }
], function (err) {
    // this last function runs anytime any callback has an error, or if no error
    // then when the last function in the array above invokes callback.
    if (err) { sendForTheCodeDoctor(); }
});
Have a look at the async docs at the link above. There are many useful functions for serial, parallel, waterfall, and more. async is actively maintained and seems very reliable.
Good luck!
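As a hedged sketch, the asker's DynamoDB -> SNS -> SNS -> DynamoDB sequence could be mapped onto async.waterfall roughly like this (the table, ARN, and attribute names are made up for illustration):
var async = require('async');
var AWS = require('aws-sdk');
var dynamo = new AWS.DynamoDB.DocumentClient();
var sns = new AWS.SNS();

async.waterfall([
    function getItem(callback) {
        dynamo.get({ TableName: 'MyTable', Key: { id: 'abc' } }, callback);
    },
    function getEndpoint(itemResult, callback) {
        sns.createPlatformEndpoint({
            PlatformApplicationArn: 'arn:aws:sns:...', // hypothetical platform application
            Token: itemResult.Item.deviceToken
        }, callback);
    },
    function sendMessage(endpointResult, callback) {
        sns.publish({ TargetArn: endpointResult.EndpointArn, Message: 'hello' }, callback);
    },
    function writeBack(publishResult, callback) {
        dynamo.put({ TableName: 'MyTable', Item: { id: 'abc', messageId: publishResult.MessageId } }, callback);
    }
], function (err) {
    // any error in any step lands here; otherwise all four steps ran in order
    if (err) { console.log(err); }
});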
A very specific solution that comes to mind is cascading Lambda calls. For example, you could write:
A Lambda function gets something from DynamoDB, then invokes…
…a Lambda function that calls SNS to get an endpoint, then invokes…
…a Lambda function that sends a message through SNS, then invokes…
…a Lambda function that writes to DynamoDB
All of those functions take the output from the previous function as input. This is of course very fine-grained, and you might decide to group certain calls. Doing it this way avoids callback hell in your JS code at least.
(As a side note, I'm not sure how well DynamoDB integrates with Lambda. AWS might emit change events for records that can then be processed through Lambda.)
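A hedged sketch of one link in such a cascade using aws-sdk (v2); the function name and payload shape are made up for illustration:
const AWS = require('aws-sdk');
const lambda = new AWS.Lambda();

// called at the end of one Lambda to hand its output to the next one in the chain
function invokeNext(previousOutput, callback) {
    lambda.invoke({
        FunctionName: 'getSnsEndpoint',          // hypothetical next function
        InvocationType: 'Event',                 // asynchronous, fire-and-forget invocation
        Payload: JSON.stringify(previousOutput)  // this function's output becomes the next one's input
    }, callback);
}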
Just saw this old thread. Note that newer versions of JS improve this. Take a look at the ES2017 async/await syntax, which streamlines an async nested callback mess into clean, sync-like code.
There are now transpilers and polyfills that can give you this functionality today.
As a last FYI, AWS Lambda now supports .NET Core, which provides this clean async syntax out of the box.
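As a hedged sketch, the two-email flow from the answer above would look roughly like this with async/await and aws-sdk's .promise() (ses and params as defined there):
exports.handler = async function(event, context) {
    const first = await ses.sendEmail(params).promise();
    console.log("FIRST EMAIL SENT=" + first.MessageId);

    const second = await ses.sendEmail(params).promise();
    console.log("SECOND EMAIL SENT=" + second.MessageId);
};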
I would like to offer the following solution, which simply creates a nested function structure.
// start with the last action
var next = function() { context.succeed(); };
// for every new function, pass it the old one
next = (function(param1, param2, next) {
    return function() { serviceCall(param1, param2, next); };
})("x", "y", next);
What this does is copy all of the variables for the function call you want to make, then nest them inside the previous call. You'll want to schedule your events backwards. This is really just the same as making a pyramid of callbacks, but it works when you don't know the structure or quantity of function calls ahead of time. You have to wrap the function in a closure so that the correct value is copied over.
In this way I am able to sequence AWS service calls so that they go 1-2-3 and end with closing the context. Presumably you could also structure it as a stack instead of this pseudo-recursion.
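For illustration, a hedged, runnable sketch of scheduling three steps backwards with this pattern; the steps are stand-in async functions and context is stubbed out:
var context = { succeed: function() { console.log('done'); } };

// stand-in for a real service call with a (name, param, next) shape
function step(name, param, next) {
    setTimeout(function() { console.log(name, param); next(); }, 100);
}

// start with the last action and wrap each earlier call around it
var next = function() { context.succeed(); };
next = (function(p, n) { return function() { step('sendMessage', p, n); }; })('hello', next);
next = (function(p, n) { return function() { step('getEndpoint', p, n); }; })('topicX', next);
next = (function(p, n) { return function() { step('getItem', p, n); }; })('id-123', next);

next(); // getItem -> getEndpoint -> sendMessage -> context.succeed()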
I found this article, which seems to have the answer in native JavaScript:
Five patterns to help you tame asynchronous JavaScript.
By default, JavaScript is asynchronous. So you don't have to use those libraries; you can, but there are simpler ways to solve this. In this code, I send the email with the data that comes from the event, and if you want more steps, you just need to add more functions inside functions.
What is important is where your context.done(); goes, because it ends your Lambda function. You need to put it at the end of the last callback.
var AWS = require('aws-sdk');
AWS.config.credentials = { "accessKeyId": "AAAA", "secretAccessKey": "BBBB" };
AWS.config.region = 'us-east-1';

var ses = new AWS.SES({apiVersion: '2010-12-01'});

exports.handler = function(event, context) {
    console.log(event.nome);
    console.log(event.email);
    console.log(event.mensagem);

    var nome = event.nome;
    var email = event.email;
    var mensagem = event.mensagem;

    var to = ['email@company.com.br'];
    var from = 'site@company.com.br';

    // Send email
    mensagem = "" + nome + "||" + email + "||" + mensagem + "";
    console.log(mensagem);

    ses.sendEmail({
        Source: from,
        Destination: { ToAddresses: to },
        Message: {
            Subject: {
                Data: 'Form contact our Site'
            },
            Body: {
                Text: {
                    Data: mensagem
                }
            }
        }
    },
    function(err, data) {
        if (err) {
            console.log("ERROR=" + err, err.stack);
            context.done();
        } else {
            console.log("EMAIL SENT=" + data);
            context.done();
        }
    });
};

wait for async to complete before return

Mongoose (mongoosejs) async code:
userSchema.static('alreadyExists', function(name) {
    var isPresent;
    this.count({alias: name}, function(err, count) {
        isPresent = !!count;
    });
    console.log('Value of flag ' + isPresent);
    return isPresent;
});
I know isPresent is returned before the this.count async call invokes its callback, so its value is undefined. But how do I wait for the callback to change the value of isPresent and then safely return it?
What effect does
(function(){ asynccalls(); asynccall(); })();
have on the async flow?
What happens with var foo = asynccall() or (function(){})()?
Will the above two make return wait?
Can process.nextTick() help?
I know there are a lot of questions like these, but none of them explain the problem of returning before async completion.
There is no way to do that. You need to change the signature of your function to take a callback rather than returning a value.
Making IO async is one of the main motivations of Node.js, and waiting for an async call to complete defeats the purpose.
If you give me more context on what you are trying to achieve, I can give you pointers on how to implement it with callbacks.
Edit: You need something like the following:
userSchema.static('alreadyExists', function (name, callback) {
    this.count({alias: name}, function (err, count) {
        callback(err, err ? null : !!count);
        console.log('Value of flag ' + !!count);
    });
});
Then, you can use it like:
User.alreadyExists('username', function (err, exists) {
    if (err) {
        // Handle error
        return;
    }
    if (exists) {
        // Pick another username.
    } else {
        // Continue with this username.
    }
});
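On newer Mongoose versions, where queries return promise-like objects, the same static can be written so callers can chain or await it. A hedged sketch (countDocuments assumed available, i.e. Mongoose 5+):
userSchema.static('alreadyExists', function (name) {
    return this.countDocuments({ alias: name }).then(function (count) {
        return !!count;
    });
});

// usage
User.alreadyExists('username')
    .then(function (exists) { /* exists is true or false */ })
    .catch(function (err) { console.log(err); });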
Had the same problem. I wanted my mocha tests to run very fast (as they originally did), but at the same time to have an anti-DoS layer present and operational in my app. Running those tests just as they originally worked was crazy fast, and the ddos module I'm using started to respond with a Too Many Requests error, making the tests fail. I didn't want to disable it just for test purposes (I actually wanted the automated tests to verify the Too Many Requests cases as well).
I had one place used by all the tests that prepared a client for HTTPS requests (filled with proper headers, authenticated, with cookies, etc.). It looked like this, more or less:
var agent = thiz.getAgent();
thiz.log('preReq for user ' + thiz.username);
thiz.log('preReq for ' + req.url + ' for agent ' + agent.mochaname);
if (thiz.headers) {
    Object.keys(thiz.headers).map(function(header) {
        thiz.log('preReq header ' + header);
        req.set(header, thiz.headers[header]);
    });
}
agent.attachCookies(req);
So I wanted to inject a sleep there that would kick in every 5 times this client was asked by a test to perform a request, so the entire suite would run quickly and every fifth request would wait long enough for the ddos module to consider my request unpunishable by the Too Many Requests error.
I searched most of the entries here about async and other libs or practices. All of them required going for callbacks, which meant I would have to rewrite a couple of hundred test cases.
Finally I gave up on any elegant solution and fell back to the one that worked for me, which was adding a for loop that tries to check the status of a non-existent file. It makes the operation take long enough that I could calibrate it to last around 6500 ms.
for (var i = 0; i < 200000; ++i) {
    try {
        fs.statSync('/path' + i);
    } catch (err) {
    }
}
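The same blocking effect can be had with a busy-wait calibrated in milliseconds rather than iteration count. A minimal sketch (it still blocks the event loop, exactly like the loop above):
function busyWait(durationMs) {
    var end = Date.now() + durationMs;
    while (Date.now() < end) {
        // spin; nothing else can run on this thread in the meantime
    }
}

busyWait(6500);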

Design pattern for many asynchronous tasks in node

I'm learning Node and writing an API. One of my API calls takes a parameter called Tags, which contains comma-delimited tags, each of which I want to save to disk (I'm using MongoDB + Mongoose). Typically when I save to the DB in my API I pass a callback and carry on after the save inside of that callback, but here I have a variable number of objects to save to disk, and I don't know the cleanest way to save all of these tags and then save the object which references them afterward. Can anyone suggest a clean async pattern to use? Thanks!
async is a good Node library for these tasks.
It can run multiple async calls in parallel or in series and trigger a single callback after that:
async.parallel([
    function() { ... },
    function() { ... }
], callback);

async.series([
    function() { ... },
    function() { ... }
]);
This is a common code pattern I often use when I don't want additional dependencies:
var tags = ['tag1', 'tag2', 'tag3'];
var wait = tags.length;

tags.forEach(function (tag) {
    doAsyncJob(tag, done);
});

function done() {
    if (--wait === 0) allDone();
}
This code will run doAsyncJob(tag, callback) in parallel for each item of the array, and call allDone when every job has completed. If you need to process the data sequentially (one item after another), here's another pattern:
(function oneIteration() {
    var item = tags.shift();
    if (item) {
        doAsyncJob(item, oneIteration);
    } else {
        allDone();
    }
})();
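For the tags use case specifically, a hedged sketch with promise-returning Mongoose calls; the Tag and Post models and field names are hypothetical:
const mongoose = require('mongoose');
const Tag = mongoose.model('Tag');   // assumed to be registered elsewhere
const Post = mongoose.model('Post'); // the object that references the tags

async function saveWithTags(postData, tagsParam) {
    const names = tagsParam.split(',').map(s => s.trim());

    // save every tag; Promise.all resolves once all of them are on disk
    const tagDocs = await Promise.all(names.map(name => Tag.create({ name })));

    // then save the object that references them
    const post = new Post(Object.assign({}, postData, { tags: tagDocs.map(t => t._id) }));
    return post.save();
}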
