I'm currently trying to practice using an API, starting with the Twitter API.
I'm using the Twit package to connect to Twitter's API, but when I try to do a GET request I get
Promise { pending }
I have tried using async/await, but I'm not sure what I'm doing wrong here.
Here is my code:
const Twit = require('twit');
const twitterAPI = require('../secrets');
// Twit uses OAuth to establish a connection with Twitter
let T = new Twit({
consumer_key: twitterAPI.apiKey,
consumer_secret: twitterAPI.apiSecretKey,
access_token: twitterAPI.accessToken,
access_token_secret: twitterAPI.accessTokenSecret
})
const getUsersTweets = async (userName) => {
let params = { screen_name: userName, count: 1 }
const userTweets = await T.get('search/tweets', params, await function (err, data, response) {
if (err) {
return 'There was an Error', err.stack
}
return data
})
return userTweets
}
console.log(getUsersTweets('Rainbow6Game'));
Problem
The biggest flawed assumption in the sample code is that T.get is expected to eventually resolve with some data.
const userTweets = await T.get('search/tweets', params, await function (err, data, response) {
if (err) {
return 'There was an Error', err.stack
}
return data // 'data' returned from here is not necessarily going to be received in 'userTweets' variable
})
The callback function provided as the last argument to T.get should not be preceded by 'await'.
The 'data' returned from the callback function is not necessarily going to end up in the 'userTweets' variable. That depends entirely on how T.get is implemented and cannot be controlled from your code.
Reason
The thing to note here is that async/await works well with promise-returning functions that eventually resolve with the expected data; however, that is not guaranteed here.
Relying on the result of the asynchronous T.get call will probably not work, because it immediately returns a Promise { pending } object and may resolve with no data. In the best case everything in your function works, but getUsersTweets will still return 'undefined'.
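To see why the console prints Promise { pending }: every async function returns a promise, and the resolved value is only available once that promise settles. A standalone illustration (unrelated to Twit itself):
const fn = async () => {
  const value = await Promise.resolve(42); // stands in for any real async work
  return value;
};
console.log(fn());                // Promise { <pending> } — the work hasn't finished yet
fn().then(v => console.log(v));   // 42 — available only once the promise resolves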
Solution
The best solution is to make sure that your getUsersTweets function returns a promise which eventually resolves with the correct data. The following changes are suggested:
const getUsersTweets = (userName) => {
return new Promise ((resolve, reject) => {
let params = { screen_name: userName, count: 1 }
T.get('search/tweets', params, function (err, data, response) {
if (err) {
reject(err);
}
resolve(data);
})
})
}
The above function is now guaranteed to return expected data and can be used in the following way:
const printTweets = async (userName) => {
  const tweets = await getUsersTweets(userName);
  console.log(tweets);
}
printTweets('Rainbow6Game');
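As a side note, recent versions of Twit also expose a promise interface when no callback is passed (the promise resolves with an object containing data and resp), so, assuming such a version is installed, the wrapper could arguably be reduced to something like:
const getUsersTweets = async (userName) => {
  const params = { screen_name: userName, count: 1 };
  // With no callback, T.get returns a promise that resolves with { data, resp }
  const { data } = await T.get('search/tweets', params);
  return data;
};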
From what I can see in your code, getUsersTweets is an async function, so it will return a promise. I'm assuming you will use this value in another function, so you will need to call it inside an async function and use await; otherwise you will always get a pending promise.
const logTweets = async (username) => {
try {
const userTweets = await getUsersTweets(username);
// Do something with the tweets
console.log(userTweets);
} catch (err) {
// catch any error
console.log(err);
}
}
If logging is all you want, and you have wrapped the call in a function that console.logs the result, you can call that function directly:
logTweets('someUsername');
I'm trying to send several messages (from an AWS SQS Lambda, if that matters), but it never waits for the promises.
function getEndpoint(settings){
return new Promise(function(resolve, reject) {
// [...] more stuff here
  })
}
Which is then called in a loop:
exports.handler = async (event) => {
var messages = [];
event.Records.forEach(function(messageId, body) {
//options object created from some stuff
messages.push(getEndpoint(options).then(function(response){
console.log("anything at all"); //NEVER LOGGED
}));
});
await Promise.all(messages);
};
But the await seems to be skipped entirely. I'm not sure how I'm getting "Process exited before completing request" with an explicit await. I have similar async/await promise setups in other scripts that work, but I cannot spot what I've done wrong with this one.
You forgot to return something to lambda:
exports.handler = async (event) => {
var messages = [];
event.Records.forEach(function(messageId, body) {
//options object created from some stuff
messages.push(getEndpoint(options));
});
await Promise.all(messages);
return 'OK'
};
this should also work:
exports.handler = (event) => { // async is not mandatory here
var messages = [];
event.Records.forEach(function(messageId, body) {
//options object created from some stuff
messages.push(getEndpoint(options));
});
return Promise.all(messages); // returning a promise
};
and you could use map:
exports.handler = (event) => { // async is not mandatory here
const messages = event.Records.map(function(messageId, body) {
//options object created from some stuff
return getEndpoint(options)
});
return Promise.all(messages); // returning a promise
};
To understand why this happens, you have to dive a bit into Lambda's implementation: it essentially waits for the function stack to be cleared, and since you did NOT return anything at all, the stack was empty right after all the work had been queued. Adding a simple return after the await call keeps the stack from emptying early, which means Lambda will wait for it to finish.
If you ran this on standard Node, your function would also return before the promises finished, BUT your Node process would NOT exit until the event loop was drained. This is where Lambda diverges from stock Node.
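To make that concrete, here is a minimal sketch that keeps the logging from the question but still gives Lambda a promise to wait on; buildOptions is a hypothetical stand-in for the "options object created from some stuff":
exports.handler = async (event) => {
  const messages = event.Records.map((record) => {
    const options = buildOptions(record); // hypothetical helper, not from the original code
    return getEndpoint(options).then((response) => {
      console.log('anything at all'); // now logged before the handler resolves
      return response;
    });
  });
  return Promise.all(messages); // Lambda waits for this promise to settle
};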
I am new to AWS Lambda and there is one thing I find very confusing.
So far, I found following options how to return from a function in Node.js:
1.
exports.handler = (event, context) => {
context.succeed('ok');
}
2.
exports.handler = (event, context) => {
context.done(null, 'ok');
}
3.
exports.handler = (event, context, callback) => {
callback(null, 'ok');
}
4.
exports.handler = async event => {
return "ok";
}
How are these different? Any functionality or performance distinctions?
Can anyone explain how to terminate a function in the right way?
You're probably using Node.js 8.10, which is so far the latest Node.js version supported by AWS Lambda; otherwise the last (4.) snippet wouldn't work at all (due to a syntax error).
In Node.js 8.10 all of the variants listed above are valid; most of them remain only for compatibility with earlier runtime versions.
The first two (1. and 2.) are the oldest ones, and it's not recommended to use them anymore. The done(err?, res?) function is equivalent to the later-added callback(err?, res?), which was frequently used before Node.js 8.10, and you can still find a lot of code examples using it, even in the official documentation. It's a typical callback and can be used in asynchronous processing:
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

exports.handler = (event, context, callback) => {
var params = {
Bucket: "examplebucket",
Key: "HappyFace.jpg"
};
s3.getObject(params, function(err, data) {
if (err) return callback(err);
callback(null, data);
});
}
Nevertheless, this function has all the drawbacks of using callbacks in general (Callback Hell).
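As a quick illustration of that nesting (a hypothetical two-step read, reusing the s3 client and callback-style handler from the snippet above), each additional step pushes the code one level deeper:
exports.handler = (event, context, callback) => {
  s3.getObject({ Bucket: "examplebucket", Key: "first.jpg" }, function (err, first) {
    if (err) return callback(err);
    // the second read can only start inside the first callback...
    s3.getObject({ Bucket: "examplebucket", Key: "second.jpg" }, function (err, second) {
      if (err) return callback(err);
      callback(null, { first: first.Body, second: second.Body });
    });
  });
};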
As of Node.js 8.10 you can use promises together with the async/await syntactic sugar, which makes asynchronous code look synchronous. The AWS JavaScript SDK added a promise() function to almost all of the functions that previously used callbacks. Now you can write:
exports.handler = async event => {
var params = {
Bucket: "examplebucket",
Key: "HappyFace.jpg"
};
var data = await s3.getObject(params).promise();
// here you process the already resolved data...
return data;
// or you can omit `await` here whatsoever:
// return s3.getObject(params).promise();
}
This produces shorter, more elegant code that is easier for humans to read.
Of course, in the end you can choose whichever you like; just compare your example snippets...
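For completeness, here is a rough sketch (mirroring the snippets above) of how errors propagate in the async style; throwing plays the role that callback(err) plays in the callback style:
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

exports.handler = async (event) => {
  const params = { Bucket: "examplebucket", Key: "HappyFace.jpg" };
  try {
    return await s3.getObject(params).promise(); // resolves the invocation, like callback(null, data)
  } catch (err) {
    console.error(err);
    throw err;                                   // fails the invocation, like callback(err)
  }
};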
I am trying to run a small snippet of Lambda code where I am pushing data to S3 using Firehose. Here is my snippet:
const AWS = require( 'aws-sdk' );
var FIREhose = new AWS.Firehose();
exports.handler = async (event,context,callback) => {
// TODO implement
const response = {
statusCode:200,
Name:event.Name,
Value:event.Value
};
const params = {
DeliveryStreamName: 'kinesis-firehose',
Record: { Data: new Buffer(JSON.stringify(response)) }
};
FIREhose.putRecord(params, (err, data) => {
if (err) console.log(err, err.stack); // an error occurred
else console.log(data);
});
};
Here are my events
{
"Name": "Mike",
"Value": "66"
}
When I run this Lambda, all I get as a response is null. Since I am not passing any callback, Lambda runs the implicit callback by default and returns null. I see that no data is pushed to the S3 bucket.
But when I add a callback(null,"success") line at the end, like this:
FIREhose.putRecord(params, (err, data) => {
if (err) console.log(err, err.stack); // an error occurred
else console.log(data);
});
callback(null,"success")
};
I see that the data is pushed to S3. Why is that?
Do async functions always need a callback with some text appended to it?
Any help is appreciated.
Thanks
The problem here is that you're mixing your node.js lambda patterns.
Either you use an asynchronous function and return or throw:
exports.handler = async (event,context,callback) => {
// code goes here.
await FIREhose.putRecord(params).promise();
return null; // or whatever result.
};
Or you use the callback approach:
exports.handler = (event,context,callback) => {
// code goes here.
FIREhose.putRecord(params)
.promise()
.then((data) => {
// do stuff with data.
// n.b. you could have used the cb instead of a promise here too.
callback(null, null); // or whatever result.
});
};
(There's a third way using context, but that's a very legacy approach.)
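For reference only, that legacy context style would look roughly like this (FIREhose and params as in the question); it is not recommended for new code:
exports.handler = (event, context) => {
  // params as defined in the question
  FIREhose.putRecord(params).promise()
    .then((data) => context.succeed(data)) // legacy equivalent of callback(null, data)
    .catch((err) => context.fail(err));    // legacy equivalent of callback(err)
};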
This is all due to how lambda works and detects when there's been a response.
In your first example (no callback), lambda is expecting your handler to return a promise that it has to wait to resolve/reject, which, in turn, will be the response. However, you're not returning a promise (undefined) and so there's nothing to wait for and it immediately returns- quite probably before the putRecord call has completed.
When you used callback, though, you explicitly told Lambda that you're using the "old" way. And the interesting thing about the callback approach is that, by default, it waits for Node's event loop to be empty. Which means that .putRecord will probably complete.
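That default is controlled by a documented context property, callbackWaitsForEmptyEventLoop; a hedged sketch of flipping it off in the callback style (the invocation then ends as soon as callback() fires rather than waiting for the event loop to drain):
exports.handler = (event, context, callback) => {
  // Don't wait for an empty event loop before returning the response
  context.callbackWaitsForEmptyEventLoop = false;

  FIREhose.putRecord(params)
    .promise()
    .then((data) => callback(null, data))
    .catch((err) => callback(err));
};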
I'm trying to update a tool that was created a while ago and uses Node.js (I am not a JS developer, so I'm trying to piece the code together), and I am getting stuck at the last hurdle.
The new functionality takes in a Swagger .json definition, compares the endpoints against the matching API Gateway on AWS (using the 'aws-sdk' SDK for JS), and then updates the Gateway accordingly.
The code runs fine on a small definition file (about 15 endpoints) but as soon as I give it a bigger one, I start getting tons of TooManyRequestsException errors.
I understand that this is because my calls to the API Gateway service are made too quickly and a delay/pause is needed. This is where I am stuck.
I have tried adding:
a delay() to each promise being returned
running a setTimeout() in each promise
adding a delay to the Promise.all and Promise.mapSeries
Currently my code loops through each endpoint within the definition and then pushes each resulting promise onto a promises array:
promises.push(getMethodResponse(resourceMethod, value, apiName, resourcePath));
Once the loop is finished I run this:
return Promise.all(promises)
.catch((err) => {
winston.error(err);
})
I have tried the same with mapSeries (no luck).
It looks like the functions within getMethodResponse are run immediately, so no matter what type of delay I add, they all still just execute. My suspicion is that I need to make getMethodResponse return a function and then use mapSeries, but I can't get this to work either.
Code I tried:
Wrapped the getMethodResponse in this:
return function(value){}
Then added this after the loop (and within the loop - no difference):
Promise.mapSeries(function (promises) {
return 'a'();
}).then(function (results) {
console.log('result', results);
});
Also tried many other suggestions:
Here
Here
Any suggestions please?
EDIT
As requested, here is some additional code to try to pinpoint the issue.
The code currently working with a small set of endpoints (within the Swagger file):
module.exports = (apiName, externalUrl) => {
return getSwaggerFromHttp(externalUrl)
.then((swagger) => {
let paths = swagger.paths;
let resourcePath = '';
let resourceMethod = '';
let promises = [];
_.each(paths, function (value, key) {
resourcePath = key;
_.each(value, function (value, key) {
resourceMethod = key;
let statusList = [];
_.each(value.responses, function (value, key) {
if (key >= 200 && key <= 204) {
statusList.push(key)
}
});
_.each(statusList, function (value, key) { //Only for 200-201 range
//Working with small set
promises.push(getMethodResponse(resourceMethod, value, apiName, resourcePath))
});
});
});
//Working with small set
return Promise.all(promises)
.catch((err) => {
winston.error(err);
})
})
.catch((err) => {
winston.error(err);
});
};
I have since tried adding this in place of the return Promise.all():
Promise.map(promises, function() {
// Promise.map awaits for returned promises as well.
console.log('X');
},{concurrency: 5})
.then(function() {
return console.log("y");
});
The result is something like this (it's the same for each endpoint, and there are many):
Error: TooManyRequestsException: Too Many Requests
X
Error: TooManyRequestsException: Too Many Requests
X
Error: TooManyRequestsException: Too Many Requests
The AWS SDK is called 3 times within each promise; the calls involved (initiated from the getMethodResponse() function) are:
apigateway.getRestApisAsync()
return apigateway.getResourcesAsync(resourceParams)
apigateway.getMethodAsync(params, function (err, data) {}
The AWS SDK documentation states that this is typical behaviour when too many consecutive calls are made (too fast). I've had a similar issue in the past, which was resolved simply by adding a .delay(500) to the code being called;
Something like:
return apigateway.updateModelAsync(updateModelParams)
.tap(() => logger.verbose(`Updated model ${updatedModel.name}`))
.tap(() => bar.tick())
.delay(500)
EDIT #2
In the name of thoroughness, here is my entire .js file.
'use strict';
const AWS = require('aws-sdk');
let apigateway, lambda;
const Promise = require('bluebird');
const R = require('ramda');
const logger = require('../logger');
const config = require('../config/default');
const helpers = require('../library/helpers');
const winston = require('winston');
const request = require('request');
const _ = require('lodash');
const region = 'ap-southeast-2';
const methodLib = require('../aws/methods');
const emitter = require('../library/emitter');
emitter.on('updateRegion', (region) => {
region = region;
AWS.config.update({ region: region });
apigateway = new AWS.APIGateway({ apiVersion: '2015-07-09' });
Promise.promisifyAll(apigateway);
});
function getSwaggerFromHttp(externalUrl) {
return new Promise((resolve, reject) => {
request.get({
url: externalUrl,
header: {
"content-type": "application/json"
}
}, (err, res, body) => {
if (err) {
winston.error(err);
reject(err);
}
let result = JSON.parse(body);
resolve(result);
})
});
}
/*
Deletes a method response
*/
function deleteMethodResponse(httpMethod, resourceId, restApiId, statusCode, resourcePath) {
let methodResponseParams = {
httpMethod: httpMethod,
resourceId: resourceId,
restApiId: restApiId,
statusCode: statusCode
};
return apigateway.deleteMethodResponseAsync(methodResponseParams)
.delay(1200)
.tap(() => logger.verbose(`Method response ${statusCode} deleted for path: ${resourcePath}`))
.error((e) => {
return console.log(`Error deleting Method Response ${httpMethod} not found on resource path: ${resourcePath} (resourceId: ${resourceId})`); // an error occurred
logger.error('Error: ' + e.stack)
});
}
/*
Deletes an integration response
*/
function deleteIntegrationResponse(httpMethod, resourceId, restApiId, statusCode, resourcePath) {
let methodResponseParams = {
httpMethod: httpMethod,
resourceId: resourceId,
restApiId: restApiId,
statusCode: statusCode
};
return apigateway.deleteIntegrationResponseAsync(methodResponseParams)
.delay(1200)
.tap(() => logger.verbose(`Integration response ${statusCode} deleted for path ${resourcePath}`))
.error((e) => {
return console.log(`Error deleting Integration Response ${httpMethod} not found on resource path: ${resourcePath} (resourceId: ${resourceId})`); // an error occurred
logger.error('Error: ' + e.stack)
});
}
/*
Get Resource
*/
function getMethodResponse(httpMethod, statusCode, apiName, resourcePath) {
let params = {
httpMethod: httpMethod.toUpperCase(),
resourceId: '',
restApiId: ''
}
return getResourceDetails(apiName, resourcePath)
.error((e) => {
logger.unimportant('Error: ' + e.stack)
})
.then((result) => {
//Only run the comparrison of models if the resourceId (from the url passed in) is found within the AWS Gateway
if (result) {
params.resourceId = result.resourceId
params.restApiId = result.apiId
var awsMethodResponses = [];
try {
apigateway.getMethodAsync(params, function (err, data) {
if (err) {
if (err.statusCode == 404) {
return console.log(`Method ${params.httpMethod} not found on resource path: ${resourcePath} (resourceId: ${params.resourceId})`); // an error occurred
}
console.log(err, err.stack); // an error occurred
}
else {
if (data) {
_.each(data.methodResponses, function (value, key) {
if (key >= 200 && key <= 204) {
awsMethodResponses.push(key)
}
});
awsMethodResponses = _.pull(awsMethodResponses, statusCode); //List of items not found within the Gateway - to be removed.
_.each(awsMethodResponses, function (value, key) {
if (data.methodResponses[value].responseModels) {
var existingModel = data.methodResponses[value].responseModels['application/json']; //Check if there is currently a model attached to the resource / method about to be deleted
methodLib.updateResponseAssociation(params.httpMethod, params.resourceId, params.restApiId, statusCode, existingModel); //Associate this model to the same resource / method, under the new response status
}
deleteMethodResponse(params.httpMethod, params.resourceId, params.restApiId, value, resourcePath)
.delay(1200)
.done();
deleteIntegrationResponse(params.httpMethod, params.resourceId, params.restApiId, value, resourcePath)
.delay(1200)
.done();
})
}
}
})
.catch(err => {
console.log(`Error: ${err}`);
});
}
catch (e) {
console.log(`getMethodAsync failed, Error: ${e}`);
}
}
})
};
function getResourceDetails(apiName, resourcePath) {
let resourceExpr = new RegExp(resourcePath + '$', 'i');
let result = {
apiId: '',
resourceId: '',
path: ''
}
return helpers.apiByName(apiName, AWS.config.region)
.delay(1200)
.then(apiId => {
result.apiId = apiId;
let resourceParams = {
restApiId: apiId,
limit: config.awsGetResourceLimit,
};
return apigateway.getResourcesAsync(resourceParams)
})
.then(R.prop('items'))
.filter(R.pipe(R.prop('path'), R.test(resourceExpr)))
.tap(helpers.handleNotFound('resource'))
.then(R.head)
.then([R.prop('path'), R.prop('id')])
.then(returnedObj => {
if (returnedObj.id) {
result.path = returnedObj.path;
result.resourceId = returnedObj.id;
logger.unimportant(`ApiId: ${result.apiId} | ResourceId: ${result.resourceId} | Path: ${result.path}`);
return result;
}
})
.catch(err => {
console.log(`Error: ${err} on API: ${apiName} Resource: ${resourcePath}`);
});
};
function delay(t) {
return new Promise(function(resolve) {
setTimeout(resolve, t)
});
}
module.exports = (apiName, externalUrl) => {
return getSwaggerFromHttp(externalUrl)
.then((swagger) => {
let paths = swagger.paths;
let resourcePath = '';
let resourceMethod = '';
let promises = [];
_.each(paths, function (value, key) {
resourcePath = key;
_.each(value, function (value, key) {
resourceMethod = key;
let statusList = [];
_.each(value.responses, function (value, key) {
if (key >= 200 && key <= 204) {
statusList.push(key)
}
});
_.each(statusList, function (value, key) { //Only for 200-201 range
promises.push(getMethodResponse(resourceMethod, value, apiName, resourcePath))
});
});
});
//Working with small set
return Promise.all(promises)
.catch((err) => {
winston.error(err);
})
})
.catch((err) => {
winston.error(err);
});
};
You apparently have a misunderstanding about what Promise.all() and Promise.map() do.
All Promise.all() does is keep track of a whole array of promises to tell you when the async operations they represent are all done (or one returns an error). When you pass it an array of promises (as you are doing), ALL those async operations have already been started in parallel. So, if you're trying to limit how many async operations are in flight at the same time, it's already too late at that point. So, Promise.all() by itself won't help you control how many are running at once in any way.
I've also noticed since, that it seems this line promises.push(getMethodResponse(resourceMethod, value, apiName, resourcePath)) is actually executing promises and not simply adding them to the array. Seems like the last Promise.all() doesn't actually do much.
Yep, when you execute promises.push(getMethodResponse()), you are calling getMethodResponse() immediately right then. That starts the async operation immediately. That function then returns a promise and Promise.all() will monitor that promise (along with all the other ones you put in the array) to tell you when they are all done. That's all Promise.all() does. It monitors operations you've already started. To keep the max number of requests in flight at the same time below some threshold, you have to NOT START the async operations all at once like you are doing. Promise.all() does not do that for you.
For Bluebird's Promise.map() to help you at all, you have to pass it an array of DATA, not promises. When you pass it an array of promises that represent async operations that you've already started, it can do no more than Promise.all() can do. But, if you pass it an array of data and a callback function that can then initiate an async operation for each element of data in the array, THEN it can help you when you use the concurrency option.
Your code is pretty complex so I will illustrate with a simple web scraper that wants to read a large list of URLs, but for memory considerations, only process 20 at a time.
const rp = require('request-promise');
let urls = [...]; // large array of URLs to process
Promise.map(urls, function(url) {
return rp(url).then(function(data) {
// process scraped data here
return someValue;
});
}, {concurrency: 20}).then(function(results) {
// process array of results here
}).catch(function(err) {
// error here
});
In this example, hopefully you can see that an array of data items are being passed into Promise.map() (not an array of promises). This, then allows Promise.map() to manage how/when the array is processed and, in this case, it will use the concurrency: 20 setting to make sure that no more than 20 requests are in flight at the same time.
Your effort to use Promise.map() was passing an array of promises, which does not help you since the promises represent async operations that have already been started:
Promise.map(promises, function() {
...
});
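Applied to the code in the question, that means collecting plain descriptors inside the _.each loops instead of already-started promises, and only then handing them to Promise.map(). A rough sketch, reusing the question's variable names (the concurrency of 5 is an arbitrary starting point, not a recommendation):
let workItems = []; // plain data, not started promises

// Inside the existing _.each loops, instead of promises.push(getMethodResponse(...)):
workItems.push({ method: resourceMethod, status: value, apiName: apiName, path: resourcePath });

// After the loops, let Promise.map start the calls, at most 5 at a time:
return Promise.map(workItems, function (item) {
  return getMethodResponse(item.method, item.status, item.apiName, item.path);
}, { concurrency: 5 });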
Then, in addition, you really need to figure out exactly what causes the TooManyRequestsException error, either by reading the documentation for the target API or by doing a whole bunch of testing, because a variety of things can cause it. Without knowing exactly what you need to control, it takes a lot of wild guesses to figure out what might work. The most common things an API might detect are:
Simultaneous requests from the same account or source.
Requests per unit of time from the same account or source (such as request per second).
The concurrency option in Promise.map() will easily help you with the first option, but not necessarily with the second, since you can limit yourself to a low number of simultaneous requests and still exceed a requests-per-second limit. The second needs some actual time control. Inserting delay() statements will sometimes work, but even that is not a very direct way of managing it, and it leads either to inconsistent control (something that works sometimes but not other times) or to sub-optimal control (limiting yourself to something far below what you could actually use).
To manage to a request per second limit, you need some actual time control with a rate limiting library or actual rate limiting logic in your own code.
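If the limit turns out to be requests per second, one simple (if blunt) sketch is to run the calls strictly one at a time with a minimum gap between starts; wait() and runThrottled() here are illustrative helpers, not library functions, and this assumes a Node version with async/await support:
function wait(ms) {
  return new Promise(function (resolve) { setTimeout(resolve, ms); });
}

// Run one call at a time, with at least `gapMs` between the start of each call.
async function runThrottled(items, worker, gapMs) {
  const results = [];
  for (const item of items) {
    results.push(await worker(item));
    await wait(gapMs);
  }
  return results;
}

// e.g. runThrottled(workItems, item => getMethodResponse(item.method, item.status, item.apiName, item.path), 250)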
Here's an example of a scheme for limiting the number of requests per second you are making: How to Manage Requests to Stay Below Rate Limiting.