I have a node cloud function on GCP, which is working relatively fine, i even have a global logger and logs just to track the workflow, but now i'm having some strange issues with some calls to the cloud function.
async function updateCompletedRunInDB(runId, runObject, shouldChargeOrganization) {
log.info(runId + " - starting to update models");
const now = Date.now();
let run;
let chargeableMins;
let outputFileEntry, logFileEntry;
const runQuery = {
_id: runId,
};
// we load the run from the run model
run = await procLoadRun(runId);
and fucntion procLoadRun is something like this...
async function procLoadRun(runId) {
try {
run = await loadRun(runId);
} catch (err) {
log.error(err);
log.error("couldn't be loaded");
}
return run;
}
where loadRun is a query to mongodb that returns a specific run.
now on the gcp console i have the following (normal case scenario)
but sometimes i'm getting this one
Looks like my code is being skipped since procLoadRun is not being called, any ideas or suggestion would be appreciated.
my guess is something related to gcp? the cloud function is on nodejs10 with 1Gib memory
thanks in advance.
Related
I am trying to write a caching function that returns cached elasticcache data or makes an api call to retrieve that data. However, the lambda function seems to be very unrealiable and timing out often.
It seems that the issue is having redis calls as well as public api calls causes the issue. I can confirm that I have setup aws correctly with a subnet with an internet gateway and a private subnet with a nat gateway. The function works, but lonly 10 % of the time.The remaining times exceution is stopped right before making the API call.
I have also noticed that the api calls fail after creating the redis client. If I make the external api call prior to making the redis check it seems the function is a lot more reliable and doesn't time out.
Not sure what to do. Is it best practice to seperate these 2 tasks or am I doing something wrong?
let data = null;
module.exports.handler = async (event) => {
//context.callbackWaitsForEmptyEventLoop = false;
let client;
try {
client = new Redis(
6379,
"redis://---.---.ng.0001.use1.cache.amazonaws.com"
);
client.get(event.token, async (err, result) => {
if (err) {
console.error(err);
} else {
data = result;
await client.quit();
}
});
if (data && new Date().getTime() / 1000 - eval(data).timestamp < 30) {
res.send(`({
"address": "${token}",
"price": "${eval(data).price}",
"timestamp": "${eval(data).timestamp}"
})`);
} else {
getPrice(event); //fetch api data
}
```
There a lot of misunderstand in your code. I'll try to guide you to fix it and understand how to do that correctly.
You are mixing asynchronous and synchronous code in your function.
You should use JSON.parse instead of eval to parse the data because eval allows arbitrary code to be executed in your function
You're using the res.send to return response to the client instead of callback. Remember the usage of res.send is only in express and you're using a lambda and to return the result to client you need to use callback function
To help you in this task, I completely rewrite your code solving these misundersand.
const Redis = require('ioredis');
module.exports.handler = async (event, context, callback) => {
// prefer to use lambda env instead of put directly in the code
const client = new Redis(
"REDIS_PORT_ENV",
"REDIS_HOST_ENV"
);
const data = await client.get(event.token);
client.quit();
const parsedData = JSON.parse(data);
if (parsedDate && new Date().getTime() / 1000 - parsedData.timestamp < 30) {
callback(null, {
address: event.token,
price: parsedData.price,
timestamp: parsedData.timestamp
});
} else {
const dataFromApi = await getPrice(event);
callback(null, dataFromApi);
}
};
There another usage with lambdas that return an object instead of pass a object inside callback, but I think you get the idea and understood your mistakes.
Follow the docs about correctly usage of lambda:
https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/using-lambda-functions.html
To undestand more about async and sync in javascript:
https://www.freecodecamp.org/news/synchronous-vs-asynchronous-in-javascript/
JSON.parse x eval: JSON.parse vs. eval()
I am using gremlin#3.3.5 for my Node.js 8.10 application with AWS Lambdas. The process works all fine for a single invocation. Here is my very sample code.
const gremlin = require('gremlin');
const DriverRemoteConnection = gremlin.driver.DriverRemoteConnection;
const Graph = gremlin.structure.Graph;
exports.handler = (event, context, callback) => {
dc = new DriverRemoteConnection('wss://your-neptune-endpoint:8182/gremlin');
const graph = new Graph();
const g = graph.traversal().withRemote(dc);
try {
const result = await g.V().limit(1).count().next();
dc.close();
callback(null, { result: result });
} catch (exception) {
callback('Error');
throw error;
}
}
When I run this process for single invocation, it appears to work all fine, but soon as I try to run a batch process of operations (something like 100,000 requests / hr), I am experiencing in CloudWatch log metrics that my connections are not closed successfully. I have tried a number of implementation of this, like callbackWaitForEventLoopEmpty, but that seizes the lambda. When I remove callback (or return similarly), this process works fine with batch operations too. But I do want to return data from this lambda with information that is passed to my step function to trigger another lambda based on that information.
After doing some research, I have found out the problem was with how gremlin package was handling the event of closing a connection didn't favor serverless architecture. When triggered driver.close(). When driver is instantiated, it creates instance of client, which inside itself creates instance of connection, which creates instance of websocket using ws library. Now ws.close() event gracefully closes all the events, which doesn't wait for event to be called before my callback is called and that event remains open and leaks. So after explicitly calling dc._client._connection.ws.terminate() on connection instance and then dc.close() closes connection immediately.
g.V().limit(1).count().next() is asynchronous.
Try this:
exports.handler = async (event) => {
try {
dc = new DriverRemoteConnection('wss://your-neptune-endpoint:8182/gremlin');
const graph = new Graph();
const g = graph.traversal().withRemote(dc);
const result = await g.V().limit(1).count().next();
dc.close();
return result;
} catch (error) {
throw error;
}
}
Since your Lambda runtime is Node.js 8.10 you don't need to use callback.
I started using AWS Lambda to perform a very simple task which is executing an SQL query to retrieve records from an RDS postgres database and create SQS message base on the result.
Because Amazon is only providing aws-sdk module (using node 4.3 engine) by default and we need to execute this SQL query, we have to create a custom deployment package which includes pg-promise. Here is the code I'm using:
console.info('Loading the modules...');
var aws = require('aws-sdk');
var sqs = new aws.SQS();
var config = {
db: {
username: '[DB_USERNAME]',
password: '[DB_PASSWORD]',
host: '[DB_HOST]',
port: '[DB_PORT]',
database: '[DB_NAME]'
}
};
var pgp = require('pg-promise')({});
var cn = `postgres://${config.db.username}:${config.db.password}#${config.db.host}:${config.db.port}/${config.db.database}`;
if (!db) {
console.info('Connecting to the database...');
var db = pgp(cn);
} else {
console.info('Re-use database connection...');
}
console.log('loading the lambda function...');
exports.handler = function(event, context, callback) {
var now = new Date();
console.log('Current time: ' + now.toISOString());
// Select auction that need to updated
var query = [
'SELECT *',
'FROM "users"',
'WHERE "users"."registrationDate"<=${now}',
'AND "users"."status"=1',
].join(' ');
console.info('Executing SQL query: ' + query);
db.many(query, { status: 2, now: now.toISOString() }).then(function(data) {
var ids = [];
data.forEach(function(auction) {
ids.push(auction.id);
});
if (ids.length == 0) {
callback(null, 'No user to update');
} else {
var sqsMessage = {
MessageBody: JSON.stringify({ action: 'USERS_UPDATE', data: ids}), /* required */
QueueUrl: '[SQS_USER_QUEUE]', /* required */
};
console.log('Sending SQS Message...', sqsMessage);
sqs.sendMessage(sqsMessage, function(err, sqsResponse) {
console.info('SQS message sent!');
if (err) {
callback(err);
} else {
callback(null, ids.length + ' users were affected. SQS Message created:' + sqsResponse.MessageId);
}
});
}
}).catch(function(error) {
callback(error);
});
};
When testing my lambda function, if you look at the WatchLogs, the function itself took around 500ms to run but it says that it actually took 30502.48 ms (cf. screenshots).
So I'm guessing it's taking 30 seconds to unzip my 318KB package and start executing it? That for me is just a joke or am I missing something? I tried to upload the zip and also upload my package to S3 to check if it was faster but I still have the same latency.
I noticed that the Python version can natively perform SQL request without any custom packaging...
All our applications are written in node so I don't really want to move away from it, however I have a hard time to understand why Amazon is not providing basic npm modules for database interactions.
Any comments or help are welcome. At this point I'm not sure Lambda would be benefic for us if it takes 30 seconds to run a script that is triggered every minute...
Anyone facing the same problem?
UPDATE: This is how you need to close the connection as soon as you don't need it anymore (thanks again to Vitaly for his help):
exports.handler = function(event, context, callback) {
[...]
db.many(query, { status: 2, now: now.toISOString() }).then(function(data) {
pgp.end(); // <-- This is important to close the connection directly after the request
[...]
The execution time should be measured based on the length of operations being executed, as opposed to how long it takes for the application to exit.
There are many libraries out there that make use of a connection pool in one form or another. Those typically terminate after a configurable period of inactivity.
In case of pg-promise, which in turn uses node-postgres, such period of inactivity is determined by parameter poolIdleTimeout, which defaults to 30 seconds. With pg-promise you can access it via pgp.pg.defaults.poolIdleTimeout.
If you want your process to exit after the last query has been executed, you need to shut down the connection pool, by calling pgp.end(). See chapter Library de-initialization for details.
It is also shown in most of the code examples, as those need to exit right after finishing.
I'm coming from a java background so a bit of a newbie on Javascript conventions needed for Lambda.
I've got a lambda function which is meant to do several AWS tasks in a particular order, depending on the result of the previous task.
Given that each task reports its results asynchronously, I'm wondering if the right way make sure they all happen in the right sequence, and the results of one operation are available to the invocation of the next function.
It seems like I have to invoike each function in the callback of the prior function, but seems like that will some kind of deep nesting and wondering if that is the proper way to do this.
For example on of these functions requires a DynamoDB getItem, following by a call to SNS to get an endpoint, followed by a SNS call to send a message, followed by a DynamoDB write.
What's the right way to do that in lambda javascript, accounting for all that asynchronicity?
I like the answer from #jonathanbaraldi but I think it would be better if you manage control flow with Promises. The Q library has some convenience functions like nbind which help convert node style callback API's like the aws-sdk into promises.
So in this example I'll send an email, and then as soon as the email response comes back I'll send a second email. This is essentially what was asked, calling multiple services in sequence. I'm using the then method of promises to manage that in a vertically readable way. Also using catch to handle errors. I think it reads much better just simply nesting callback functions.
var Q = require('q');
var AWS = require('aws-sdk');
AWS.config.credentials = { "accessKeyId": "AAAA","secretAccessKey": "BBBB"};
AWS.config.region = 'us-east-1';
// Use a promised version of sendEmail
var ses = new AWS.SES({apiVersion: '2010-12-01'});
var sendEmail = Q.nbind(ses.sendEmail, ses);
exports.handler = function(event, context) {
console.log(event.nome);
console.log(event.email);
console.log(event.mensagem);
var nome = event.nome;
var email = event.email;
var mensagem = event.mensagem;
var to = ['email#company.com.br'];
var from = 'site#company.com.br';
// Send email
mensagem = ""+nome+"||"+email+"||"+mensagem+"";
console.log(mensagem);
var params = {
Source: from,
Destination: { ToAddresses: to },
Message: {
Subject: {
Data: 'Form contact our Site'
},
Body: {
Text: {
Data: mensagem,
}
}
};
// Here is the white-meat of the program right here.
sendEmail(params)
.then(sendAnotherEmail)
.then(success)
.catch(logErrors);
function sendAnotherEmail(data) {
console.log("FIRST EMAIL SENT="+data);
// send a second one.
return sendEmail(params);
}
function logErrors(err) {
console.log("ERROR="+err, err.stack);
context.done();
}
function success(data) {
console.log("SECOND EMAIL SENT="+data);
context.done();
}
}
Short answer:
Use Async / Await — and Call the AWS service (SNS for example) with a .promise() extension to tell aws-sdk to use the promise-ified version of that service function instead of the call back based version.
Since you want to execute them in a specific order you can use Async / Await assuming that the parent function you are calling them from is itself async.
For example:
let snsResult = await sns.publish({
Message: snsPayload,
MessageStructure: 'json',
TargetArn: endPointArn
}, async function (err, data) {
if (err) {
console.log("SNS Push Failed:");
console.log(err.stack);
return;
}
console.log('SNS push suceeded: ' + data);
return data;
}).promise();
The important part is the .promise() on the end there. Full docs on using aws-sdk in an async / promise based manner can be found here: https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/using-promises.html
In order to run another aws-sdk task you would similarly add await and the .promise() extension to that function (assuming that is available).
For anyone who runs into this thread and is actually looking to simply push promises to an array and wait for that WHOLE array to finish (without regard to which promise executes first) I ended up with something like this:
let snsPromises = [] // declare array to hold promises
let snsResult = await sns.publish({
Message: snsPayload,
MessageStructure: 'json',
TargetArn: endPointArn
}, async function (err, data) {
if (err) {
console.log("Search Push Failed:");
console.log(err.stack);
return;
}
console.log('Search push suceeded: ' + data);
return data;
}).promise();
snsPromises.push(snsResult)
await Promise.all(snsPromises)
Hope that helps someone that randomly stumbles on this via google like I did!
I don't know Lambda but you should look into the node async library as a way to sequence asynchronous functions.
async has made my life a lot easier and my code much more orderly without the deep nesting issue you mentioned in your question.
Typical async code might look like:
async.waterfall([
function doTheFirstThing(callback) {
db.somecollection.find({}).toArray(callback);
},
function useresult(dbFindResult, callback) {
do some other stuff (could be synch or async)
etc etc etc
callback(null);
],
function (err) {
//this last function runs anytime any callback has an error, or if no error
// then when the last function in the array above invokes callback.
if (err) { sendForTheCodeDoctor(); }
});
Have a look at the async doco at the link above. There are many useful functions for serial, parallel, waterfall, and many more. Async is actively maintained and seems very reliable.
good luck!
A very specific solution that comes to mind is cascading Lambda calls. For example, you could write:
A Lambda function gets something from DynamoDB, then invokes…
…a Lambda function that calls SNS to get an endpoint, then invokes…
…a Lambda function that sends a message through SNS, then invokes…
…a Lambda function that writes to DynamoDB
All of those functions take the output from the previous function as input. This is of course very fine-grained, and you might decide to group certain calls. Doing it this way avoids callback hell in your JS code at least.
(As a side note, I'm not sure how well DynamoDB integrates with Lambda. AWS might emit change events for records that can then be processed through Lambda.)
Just saw this old thread. Note that future versions of JS will improve that. Take a look at the ES2017 async/await syntax that streamlines an async nested callback mess into a clean sync like code.
Now there are some polyfills that can provide you this functionality based on ES2016 syntax.
As a last FYI - AWS Lambda now supports .Net Core which provides this clean async syntax out of the box.
I would like to offer the following solution, which simply creates a nested function structure.
// start with the last action
var next = function() { context.succeed(); };
// for every new function, pass it the old one
next = (function(param1, param2, next) {
return function() { serviceCall(param1, param2, next); };
})("x", "y", next);
What this does is to copy all of the variables for the function call you want to make, then nests them inside the previous call. You'll want to schedule your events backwards. This is really just the same as making a pyramid of callbacks, but works when you don't know ahead of time the structure or quantity of function calls. You have to wrap the function in a closure so that the correct value is copied over.
In this way I am able to sequence AWS service calls such that they go 1-2-3 and end with closing the context. Presumably you could also structure it as a stack instead of this pseudo-recursion.
I found this article which seems to have the answer in native javascript.
Five patterns to help you tame asynchronis javascript.
By default Javascript is asynchronous.
So, everything that you have to do, it's not to use those libraries, you can, but there's simple ways to solve this. In this code, I sent the email, with the data that comes from the event, but if you want, you just need to add more functions inside functions.
What is important is the place where your context.done(); is going to be, he is going to end your Lambda function. You need to put him in the end of the last function.
var AWS = require('aws-sdk');
AWS.config.credentials = { "accessKeyId": "AAAA","secretAccessKey": "BBBB"};
AWS.config.region = 'us-east-1';
var ses = new AWS.SES({apiVersion: '2010-12-01'});
exports.handler = function(event, context) {
console.log(event.nome);
console.log(event.email);
console.log(event.mensagem);
nome = event.nome;
email = event.email;
mensagem = event.mensagem;
var to = ['email#company.com.br'];
var from = 'site#company.com.br';
// Send email
mensagem = ""+nome+"||"+email+"||"+mensagem+"";
console.log(mensagem);
ses.sendEmail( {
Source: from,
Destination: { ToAddresses: to },
Message: {
Subject: {
Data: 'Form contact our Site'
},
Body: {
Text: {
Data: mensagem,
}
}
}
},
function(err, data) {
if (err) {
console.log("ERROR="+err, err.stack);
context.done();
} else {
console.log("EMAIL SENT="+data);
context.done();
}
});
}
I want to use NodeJS to read 60k records from a MySQL database and write them to a ArangoDB database. I will later use ArangoDB's aggregation features etc. to process my dataset.
Coming from PHP, where a script usually runs synchronous, and because I believe it makes sense here, my initial (naive) try was to make my NodeJS script run sync too. However, it doesn't work as expected:
I print to console, call a function via .sync() to connect to ArangoDB server and print all existing databases, then print to console again. But everything below the sync call to my ArangoDB function is completely ignored (does not print to console again, nor does it seem to execute anything else here).
What am I overlooking? Does .done() in the function called via .sync() cause trouble?
var mysql = require('node-mysql');
var arango = require('arangojs');
//var sync = require('node-sync'); // Wrong one!
var sync = require('sync');
function test_arango_query() {
var db = arango.Connection("http://localhost:8529");
db.database.list().done(function(res) {
console.log("Databases: %j", res);
});
return "something?";
}
sync(function() {
console.log("sync?");
var result = test_arango_query.sync();
console.log("done."); // DOES NOT PRINT, NEVER EXECUTED?!
return result;
}, function(err, result) {
if (err) console.error(err);
console.log(result);
});
Your function test_arango_query doesn't use a callback. sync only works with functions that use a callback. It needs to know when the data is ready to return it from .sync(), if your function never calls the callback, then sync can't ever return a result.
Update your function to call a callback function when you want it to return:
function test_arango_query(callback) {
var db = arango.Connection("http://localhost:8529");
db.database.list().done(function(res) {
console.log("Databases: %j", res);
callback('something');
});
}