Best Practices in Serverless Framework - node.js

I am new beginner in serverless framwork.
When study Best Practices in Serverless.
here
I have a question about "Initialize external services outside of your Lambda code".
How to implement it?
For example: Below code in handler.js
const getOneUser = (event, callback) => {
let response = null;
// validate parameters
if (event.accountid && process.env.SERVERLESS_SURVEYTABLE) {
let docClient = new aws.DynamoDB.DocumentClient();
let params = {
TableName: process.env.SERVERLESS_USERTABLE,
Key: {
accountid: event.accountid,
}
};
docClient.get(params, function(err, data) {
if (err) {
// console.error("Unable to get an item with the request: ", JSON.stringify(params), " along with error: ", JSON.stringify(err));
return callback(getDynamoDBError(err), null);
} else {
if (data.Item) { // got response
// compose response
response = {
accountid: data.Item.accountid,
username: data.Item.username,
email: data.Item.email,
role: data.Item.role,
};
return callback(null, response);
} else {
// console.error("Unable to get an item with the request: ", JSON.stringify(params));
return callback(new Error("404 Not Found: Unable to get an item with the request: " + JSON.stringify(params)), null);
}
}
});
}
// incomplete parameters
else {
return callback(new Error("400 Bad Request: Missing parameters: " + JSON.stringify(event)), null);
}
};
The question is that how to initial DynamoDB outside of my Lambda code.
Update 2:
Is below code optimized?
Handler.js
let survey = require('./survey');
module.exports.handler = (event, context, callback) => {
return survey.getOneSurvey({
accountid: event.accountid,
surveyid: event.surveyid
}, callback);
};
survey.js
let docClient = new aws.DynamoDB.DocumentClient();
module.exports = (() => {
const getOneSurvey = (event, callback) {....
docClient.get(params, function(err, data)...
....
};
return{
getOneSurvey : getOneSurvey,
}
})();

Here's the quote in question:
Initialize external services outside of your Lambda code
When using services (like DynamoDB) make sure to initialize outside of your lambda code. Ex: module initializer (for Node), or to a static constructor (for Java). If you initiate a connection to DDB inside the Lambda function, that code will run on every invoke.
In other words, in the same file, but outside of -- before -- the actual handler code.
let docClient = new aws.DynamoDB...
...
const getOneUser = (event, callback) => {
....
docClient.get(params, ...
When the container starts, the code outside the handler runs. When subsequent function invocations reuse the same container, you save resources and time by not instantiating the external services again. Containers are often reused, but each container only handles one concurrent request at a time, and how often they are reused and for how long is outside your control... Unless you update the function, in which case any existing containers will no longer be reused, because they'd have the old version of the function.
Your code will work as written, but isn't optimized.
The caveat with this approach that arises in current generation Node.js Lambda functions (Node 4.x/6.x) is that some objects -- notably, those that create literal persistent connections to external services -- will prevent the event loop from becoming empty (a common example is a mysql database connection, which is holding a live TCP connection to the server; by contrast, a DynamoDB "connection" is actually connectionless, since it's transport protocol is HTTPS). In this case you need to either take a different approach or allow lambda to not wait for an empty event loop before freezing the container, by setting context.callbackWaitsForEmptyEventLoop to false before calling the callback... but only do this if needed and only if you fully understand what it means. Setting it by default because some guy on the Internet said it was a good idea will potentially bring you mysterious bugs, later.

Related

Why Does my AWS lambda function randomly fail when using private elasticache network calls as well as external API calls?

I am trying to write a caching function that returns cached elasticcache data or makes an api call to retrieve that data. However, the lambda function seems to be very unrealiable and timing out often.
It seems that the issue is having redis calls as well as public api calls causes the issue. I can confirm that I have setup aws correctly with a subnet with an internet gateway and a private subnet with a nat gateway. The function works, but lonly 10 % of the time.The remaining times exceution is stopped right before making the API call.
I have also noticed that the api calls fail after creating the redis client. If I make the external api call prior to making the redis check it seems the function is a lot more reliable and doesn't time out.
Not sure what to do. Is it best practice to seperate these 2 tasks or am I doing something wrong?
let data = null;
module.exports.handler = async (event) => {
//context.callbackWaitsForEmptyEventLoop = false;
let client;
try {
client = new Redis(
6379,
"redis://---.---.ng.0001.use1.cache.amazonaws.com"
);
client.get(event.token, async (err, result) => {
if (err) {
console.error(err);
} else {
data = result;
await client.quit();
}
});
if (data && new Date().getTime() / 1000 - eval(data).timestamp < 30) {
res.send(`({
"address": "${token}",
"price": "${eval(data).price}",
"timestamp": "${eval(data).timestamp}"
})`);
} else {
getPrice(event); //fetch api data
}
```
There a lot of misunderstand in your code. I'll try to guide you to fix it and understand how to do that correctly.
You are mixing asynchronous and synchronous code in your function.
You should use JSON.parse instead of eval to parse the data because eval allows arbitrary code to be executed in your function
You're using the res.send to return response to the client instead of callback. Remember the usage of res.send is only in express and you're using a lambda and to return the result to client you need to use callback function
To help you in this task, I completely rewrite your code solving these misundersand.
const Redis = require('ioredis');
module.exports.handler = async (event, context, callback) => {
// prefer to use lambda env instead of put directly in the code
const client = new Redis(
"REDIS_PORT_ENV",
"REDIS_HOST_ENV"
);
const data = await client.get(event.token);
client.quit();
const parsedData = JSON.parse(data);
if (parsedDate && new Date().getTime() / 1000 - parsedData.timestamp < 30) {
callback(null, {
address: event.token,
price: parsedData.price,
timestamp: parsedData.timestamp
});
} else {
const dataFromApi = await getPrice(event);
callback(null, dataFromApi);
}
};
There another usage with lambdas that return an object instead of pass a object inside callback, but I think you get the idea and understood your mistakes.
Follow the docs about correctly usage of lambda:
https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/using-lambda-functions.html
To undestand more about async and sync in javascript:
https://www.freecodecamp.org/news/synchronous-vs-asynchronous-in-javascript/
JSON.parse x eval: JSON.parse vs. eval()

Problem in reading event stream in AWS Lambda. Nodejs code working locally as desired but not in AWS Lambda

Here's the workflow:
Get a https link --> write to filesystem --> read from filesystem --> Get the sha256 hash.
It works all good on my local machine running node 10.15.3 But when i initiate a lambda function on AWS, the output is null. Some problem may lie with the readable stream. Here's the code. You can run it directly on your local machine. It will output a sha256 hash as required. If you wish to run on AWS Lambda, Comment/Uncomment as marked.
//Reference: https://stackoverflow.com/questions/11944932/how-to-download-a-file-with-node-js-without-using-third-party-libraries
var https = require('https');
var fs = require('fs');
var crypto = require('crypto')
const url = "https://upload.wikimedia.org/wikipedia/commons/a/a8/TEIDE.JPG"
const dest = "/tmp/doc";
let hexData;
async function writeit(){
var file = fs.createWriteStream(dest);
return new Promise((resolve, reject) => {
var responseSent = false;
https.get(url, response => {
response.pipe(file);
file.on('finish', () =>{
file.close(() => {
if(responseSent) return;
responseSent = true;
resolve();
});
});
}).on('error', err => {
if(responseSent) return;
responseSent = true;
reject(err);
});
});
}
const readit = async () => {
await writeit();
var readandhex = fs.createReadStream(dest).pipe(crypto.createHash('sha256').setEncoding('hex'))
try {
readandhex.on('finish', function () { //MAY BE PROBLEM IS HERE.
console.log(this.read())
fs.unlink(dest, () => {});
})
}
catch (err) {
console.log(err);
return err;
}
}
const handler = async() =>{ //Comment this line to run the code on AWS Lambda
//exports.handler = async (event) => { //UNComment this line to run the code on AWS Lambda
try {
hexData = readit();
}
catch (err) {
console.log(err);
return err;
}
return hexData;
};
handler() //Comment this line to run the code on AWS Lambda
There can be multiple things that you need check.
Since, the URL you are accessing is a public one, make sure either your lambda is outside VPC or your VPC has NAT Gateway attached with internet access.
/tmp is valid temp directory for lambda, but you may need to create doc folder inside /tmp before using it.
You can check cloud-watch logs for more information on what's going if enabled.
I've seen this difference in behaviour between local and lambda before.
All async functions return promises. Async functions must be awaited. Calling an async function without awaiting it means execution continues to the next line(s), and potentially out of the calling function.
So your code:
exports.handler = async (event) => {
try {
hexData = readit();
}
catch (err) {
console.log(err);
return err;
}
return hexData;
};
readit() is defined as const readit = async () => { ... }. But your handler does not await it. Therefore hexData = readit(); assigns an unresolved promise to hexData, returns it, and the handler exits and the Lambda "completes" without the code of readit() having been executed.
The simple fix then is to await the async function: hexData = await readit();. The reason why it works locally in node is because the node process will wait for promises to resolve before exiting, even though the handler function has already returned. But since Lambda "returns" as soon as the handler returns, unresolved promises remain unresolved. (As an aside, there is no need for the writeit function to be marked async, because it doesn't await anything, and already returns a promise.)
That being said, I don't know promises well, and I barely know anything about events. So there are others things which raise warning flags for me but I'm not sure about them, maybe they're perfectly fine, but I'll raise it here just in case:
file.on('finish' and readandhex.on('finish'. These are both events, and I believe are non-blocking, so why would the handler and therefore lambda wait around for them?
In the first case, it's within a promise and resolve() is called from within the event function, so that may be fine (as I said, I don't know much about these 2 subjects so am not sure) - the important thing is that the code must block at that point until the promise is resolved. If the code can continue execution (i.e. return from writeit()) until the finish event is raised, then it won't work.
The second case is almost certainly going to be a problem because it's just saying that if x event is raised, then do y. There's no promise being awaited, so nothing to block the code, so it will happily continue to the end of the readit() function and then the handler and lambda. Again this is based on the assumption that events are non blocking (in the sense of, a declaration that you want to execute some code on some event, does not wait at that point for that event to be raised).

AWS Lambda Container destroy event

When to release connections and cleanup resources in lambda. In normal Node JS application we do use the hook
process.on('exit', (code) => {
console.log(`About to exit with code: ${code}`);
});
However this doesn't work on AWS Lambda. Resulting the Mysql connection in sleep mode. We don't have enough resource for such active connections. None of the AWS documentation specify a way to achieve this.
How to receive stop event of AWS Lambda container ?
The short answer is that there is no such event to know when the container is stopped.
UPDATE: I've not used this library, but https://www.npmjs.com/package/serverless-mysql appears to try to solve just this problem.
PREVIOUS LONG ANSWER:
The medium answer: after speaking with someone at AWS about this I now believe you should scope database connections at the module level so they are reused as long as the container exists. When your container is destroyed the connection will be destroyed at that point.
Original answer:
This question touches on some rather complicated issues to consider with AWS Lambda functions, mainly because of the possibility of considering connection pooling or long-lived connections to your database. To start with, Lambda functions in Node.js execute as a single, exported Node.js function with this signature (as you probably know):
exports.handler = (event, context, callback) => {
// TODO implement
callback(null, 'Hello from Lambda');
};
The cleanest and simplest way to handle database connections is to create and destroy them with every single function call. In this scenario, a database connection would be created at the beginning of your function and destroyed before your final callback is called. Something like this:
const mysql = require('mysql');
exports.handler = (event, context, callback) => {
let connection = mysql.createConnection({
host : 'localhost',
user : 'me',
password : 'secret',
database : 'my_db'
});
connection.connect();
connection.query('SELECT 1 + 1 AS solution', (error, results, fields) => {
if (error) {
connection.end();
callback(error);
}
else {
let retval = results[0].solution;
connection.end();
console.log('The solution is: ', retval);
callback(null, retval);
}
});
};
NOTE: I haven't tested that code. I'm just providing an example for discussion.
I've also seen conversations like this one discussing the possibility of placing your connection outside the main function body:
const mysql = require('mysql');
let connection = mysql.createConnection({
host : 'localhost',
user : 'me',
password : 'secret',
database : 'my_db'
});
connection.connect();
exports.handler = (event, context, callback) => {
// NOTE: should check if the connection is open first here
connection.query('SELECT 1 + 1 AS solution', (error, results, fields) => {
if (error) {
callback(error);
}
else {
let retval = results[0].solution;
console.log('The solution is: ', retval);
callback(null, retval);
}
});
};
The theory here is this: because AWS Lambda will try to reuse an existing container after the first call to your function, the next function call will already have a database connection open. The example above should probably check the existence of an open connection before using it, but you get the idea.
The problem of course is that this leaves your connection open indefinitely. I'm not a fan of this approach but depending on your specific circumstances this might work. You could also introduce a connection pool into that scenario. But regardless, you have no event in this case to cleanly destroy a connection or pool. The container process hosting your function would itself be killed. So you'd have to rely on your database killing the connection from it's end at some point.
I could be wrong about some of those details, but I believe at a high level that is what you're looking at. Hope that helps!

Why AWS Lambda execution time is long using pg-promise

I started using AWS Lambda to perform a very simple task which is executing an SQL query to retrieve records from an RDS postgres database and create SQS message base on the result.
Because Amazon is only providing aws-sdk module (using node 4.3 engine) by default and we need to execute this SQL query, we have to create a custom deployment package which includes pg-promise. Here is the code I'm using:
console.info('Loading the modules...');
var aws = require('aws-sdk');
var sqs = new aws.SQS();
var config = {
db: {
username: '[DB_USERNAME]',
password: '[DB_PASSWORD]',
host: '[DB_HOST]',
port: '[DB_PORT]',
database: '[DB_NAME]'
}
};
var pgp = require('pg-promise')({});
var cn = `postgres://${config.db.username}:${config.db.password}#${config.db.host}:${config.db.port}/${config.db.database}`;
if (!db) {
console.info('Connecting to the database...');
var db = pgp(cn);
} else {
console.info('Re-use database connection...');
}
console.log('loading the lambda function...');
exports.handler = function(event, context, callback) {
var now = new Date();
console.log('Current time: ' + now.toISOString());
// Select auction that need to updated
var query = [
'SELECT *',
'FROM "users"',
'WHERE "users"."registrationDate"<=${now}',
'AND "users"."status"=1',
].join(' ');
console.info('Executing SQL query: ' + query);
db.many(query, { status: 2, now: now.toISOString() }).then(function(data) {
var ids = [];
data.forEach(function(auction) {
ids.push(auction.id);
});
if (ids.length == 0) {
callback(null, 'No user to update');
} else {
var sqsMessage = {
MessageBody: JSON.stringify({ action: 'USERS_UPDATE', data: ids}), /* required */
QueueUrl: '[SQS_USER_QUEUE]', /* required */
};
console.log('Sending SQS Message...', sqsMessage);
sqs.sendMessage(sqsMessage, function(err, sqsResponse) {
console.info('SQS message sent!');
if (err) {
callback(err);
} else {
callback(null, ids.length + ' users were affected. SQS Message created:' + sqsResponse.MessageId);
}
});
}
}).catch(function(error) {
callback(error);
});
};
When testing my lambda function, if you look at the WatchLogs, the function itself took around 500ms to run but it says that it actually took 30502.48 ms (cf. screenshots).
So I'm guessing it's taking 30 seconds to unzip my 318KB package and start executing it? That for me is just a joke or am I missing something? I tried to upload the zip and also upload my package to S3 to check if it was faster but I still have the same latency.
I noticed that the Python version can natively perform SQL request without any custom packaging...
All our applications are written in node so I don't really want to move away from it, however I have a hard time to understand why Amazon is not providing basic npm modules for database interactions.
Any comments or help are welcome. At this point I'm not sure Lambda would be benefic for us if it takes 30 seconds to run a script that is triggered every minute...
Anyone facing the same problem?
UPDATE: This is how you need to close the connection as soon as you don't need it anymore (thanks again to Vitaly for his help):
exports.handler = function(event, context, callback) {
[...]
db.many(query, { status: 2, now: now.toISOString() }).then(function(data) {
pgp.end(); // <-- This is important to close the connection directly after the request
[...]
The execution time should be measured based on the length of operations being executed, as opposed to how long it takes for the application to exit.
There are many libraries out there that make use of a connection pool in one form or another. Those typically terminate after a configurable period of inactivity.
In case of pg-promise, which in turn uses node-postgres, such period of inactivity is determined by parameter poolIdleTimeout, which defaults to 30 seconds. With pg-promise you can access it via pgp.pg.defaults.poolIdleTimeout.
If you want your process to exit after the last query has been executed, you need to shut down the connection pool, by calling pgp.end(). See chapter Library de-initialization for details.
It is also shown in most of the code examples, as those need to exit right after finishing.

How do you structure sequential AWS service calls within lambda given all the calls are asynchronous?

I'm coming from a java background so a bit of a newbie on Javascript conventions needed for Lambda.
I've got a lambda function which is meant to do several AWS tasks in a particular order, depending on the result of the previous task.
Given that each task reports its results asynchronously, I'm wondering if the right way make sure they all happen in the right sequence, and the results of one operation are available to the invocation of the next function.
It seems like I have to invoike each function in the callback of the prior function, but seems like that will some kind of deep nesting and wondering if that is the proper way to do this.
For example on of these functions requires a DynamoDB getItem, following by a call to SNS to get an endpoint, followed by a SNS call to send a message, followed by a DynamoDB write.
What's the right way to do that in lambda javascript, accounting for all that asynchronicity?
I like the answer from #jonathanbaraldi but I think it would be better if you manage control flow with Promises. The Q library has some convenience functions like nbind which help convert node style callback API's like the aws-sdk into promises.
So in this example I'll send an email, and then as soon as the email response comes back I'll send a second email. This is essentially what was asked, calling multiple services in sequence. I'm using the then method of promises to manage that in a vertically readable way. Also using catch to handle errors. I think it reads much better just simply nesting callback functions.
var Q = require('q');
var AWS = require('aws-sdk');
AWS.config.credentials = { "accessKeyId": "AAAA","secretAccessKey": "BBBB"};
AWS.config.region = 'us-east-1';
// Use a promised version of sendEmail
var ses = new AWS.SES({apiVersion: '2010-12-01'});
var sendEmail = Q.nbind(ses.sendEmail, ses);
exports.handler = function(event, context) {
console.log(event.nome);
console.log(event.email);
console.log(event.mensagem);
var nome = event.nome;
var email = event.email;
var mensagem = event.mensagem;
var to = ['email#company.com.br'];
var from = 'site#company.com.br';
// Send email
mensagem = ""+nome+"||"+email+"||"+mensagem+"";
console.log(mensagem);
var params = {
Source: from,
Destination: { ToAddresses: to },
Message: {
Subject: {
Data: 'Form contact our Site'
},
Body: {
Text: {
Data: mensagem,
}
}
};
// Here is the white-meat of the program right here.
sendEmail(params)
.then(sendAnotherEmail)
.then(success)
.catch(logErrors);
function sendAnotherEmail(data) {
console.log("FIRST EMAIL SENT="+data);
// send a second one.
return sendEmail(params);
}
function logErrors(err) {
console.log("ERROR="+err, err.stack);
context.done();
}
function success(data) {
console.log("SECOND EMAIL SENT="+data);
context.done();
}
}
Short answer:
Use Async / Await — and Call the AWS service (SNS for example) with a .promise() extension to tell aws-sdk to use the promise-ified version of that service function instead of the call back based version.
Since you want to execute them in a specific order you can use Async / Await assuming that the parent function you are calling them from is itself async.
For example:
let snsResult = await sns.publish({
Message: snsPayload,
MessageStructure: 'json',
TargetArn: endPointArn
}, async function (err, data) {
if (err) {
console.log("SNS Push Failed:");
console.log(err.stack);
return;
}
console.log('SNS push suceeded: ' + data);
return data;
}).promise();
The important part is the .promise() on the end there. Full docs on using aws-sdk in an async / promise based manner can be found here: https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/using-promises.html
In order to run another aws-sdk task you would similarly add await and the .promise() extension to that function (assuming that is available).
For anyone who runs into this thread and is actually looking to simply push promises to an array and wait for that WHOLE array to finish (without regard to which promise executes first) I ended up with something like this:
let snsPromises = [] // declare array to hold promises
let snsResult = await sns.publish({
Message: snsPayload,
MessageStructure: 'json',
TargetArn: endPointArn
}, async function (err, data) {
if (err) {
console.log("Search Push Failed:");
console.log(err.stack);
return;
}
console.log('Search push suceeded: ' + data);
return data;
}).promise();
snsPromises.push(snsResult)
await Promise.all(snsPromises)
Hope that helps someone that randomly stumbles on this via google like I did!
I don't know Lambda but you should look into the node async library as a way to sequence asynchronous functions.
async has made my life a lot easier and my code much more orderly without the deep nesting issue you mentioned in your question.
Typical async code might look like:
async.waterfall([
function doTheFirstThing(callback) {
db.somecollection.find({}).toArray(callback);
},
function useresult(dbFindResult, callback) {
do some other stuff (could be synch or async)
etc etc etc
callback(null);
],
function (err) {
//this last function runs anytime any callback has an error, or if no error
// then when the last function in the array above invokes callback.
if (err) { sendForTheCodeDoctor(); }
});
Have a look at the async doco at the link above. There are many useful functions for serial, parallel, waterfall, and many more. Async is actively maintained and seems very reliable.
good luck!
A very specific solution that comes to mind is cascading Lambda calls. For example, you could write:
A Lambda function gets something from DynamoDB, then invokes…
…a Lambda function that calls SNS to get an endpoint, then invokes…
…a Lambda function that sends a message through SNS, then invokes…
…a Lambda function that writes to DynamoDB
All of those functions take the output from the previous function as input. This is of course very fine-grained, and you might decide to group certain calls. Doing it this way avoids callback hell in your JS code at least.
(As a side note, I'm not sure how well DynamoDB integrates with Lambda. AWS might emit change events for records that can then be processed through Lambda.)
Just saw this old thread. Note that future versions of JS will improve that. Take a look at the ES2017 async/await syntax that streamlines an async nested callback mess into a clean sync like code.
Now there are some polyfills that can provide you this functionality based on ES2016 syntax.
As a last FYI - AWS Lambda now supports .Net Core which provides this clean async syntax out of the box.
I would like to offer the following solution, which simply creates a nested function structure.
// start with the last action
var next = function() { context.succeed(); };
// for every new function, pass it the old one
next = (function(param1, param2, next) {
return function() { serviceCall(param1, param2, next); };
})("x", "y", next);
What this does is to copy all of the variables for the function call you want to make, then nests them inside the previous call. You'll want to schedule your events backwards. This is really just the same as making a pyramid of callbacks, but works when you don't know ahead of time the structure or quantity of function calls. You have to wrap the function in a closure so that the correct value is copied over.
In this way I am able to sequence AWS service calls such that they go 1-2-3 and end with closing the context. Presumably you could also structure it as a stack instead of this pseudo-recursion.
I found this article which seems to have the answer in native javascript.
Five patterns to help you tame asynchronis javascript.
By default Javascript is asynchronous.
So, everything that you have to do, it's not to use those libraries, you can, but there's simple ways to solve this. In this code, I sent the email, with the data that comes from the event, but if you want, you just need to add more functions inside functions.
What is important is the place where your context.done(); is going to be, he is going to end your Lambda function. You need to put him in the end of the last function.
var AWS = require('aws-sdk');
AWS.config.credentials = { "accessKeyId": "AAAA","secretAccessKey": "BBBB"};
AWS.config.region = 'us-east-1';
var ses = new AWS.SES({apiVersion: '2010-12-01'});
exports.handler = function(event, context) {
console.log(event.nome);
console.log(event.email);
console.log(event.mensagem);
nome = event.nome;
email = event.email;
mensagem = event.mensagem;
var to = ['email#company.com.br'];
var from = 'site#company.com.br';
// Send email
mensagem = ""+nome+"||"+email+"||"+mensagem+"";
console.log(mensagem);
ses.sendEmail( {
Source: from,
Destination: { ToAddresses: to },
Message: {
Subject: {
Data: 'Form contact our Site'
},
Body: {
Text: {
Data: mensagem,
}
}
}
},
function(err, data) {
if (err) {
console.log("ERROR="+err, err.stack);
context.done();
} else {
console.log("EMAIL SENT="+data);
context.done();
}
});
}

Resources