Experiencing Neptune Gremlin connection problems when calling the AWS Lambda handler's callback - Node.js

I am using gremlin#3.3.5 in my Node.js 8.10 application with AWS Lambda. The process works fine for a single invocation. Here is my sample code.
const gremlin = require('gremlin');
const DriverRemoteConnection = gremlin.driver.DriverRemoteConnection;
const Graph = gremlin.structure.Graph;

exports.handler = (event, context, callback) => {
  const dc = new DriverRemoteConnection('wss://your-neptune-endpoint:8182/gremlin');
  const graph = new Graph();
  const g = graph.traversal().withRemote(dc);
  try {
    const result = await g.V().limit(1).count().next();
    dc.close();
    callback(null, { result: result });
  } catch (exception) {
    callback('Error');
    throw exception;
  }
}
When I run this for a single invocation, it appears to work fine, but as soon as I run a batch of operations (something like 100,000 requests/hr), the CloudWatch log metrics show that my connections are not being closed successfully. I have tried a number of variations, like setting callbackWaitsForEmptyEventLoop, but that freezes the Lambda. When I remove the callback (or return, similarly), the process works fine for batch operations too. But I do want to return data from this Lambda, since that information is passed to my Step Function to trigger another Lambda.

After doing some research, I found that the problem was with how the gremlin package handles closing a connection, which doesn't favor a serverless architecture. When driver.close() is triggered: the driver, when instantiated, creates an instance of a client, which itself creates an instance of a connection, which in turn creates an instance of a websocket using the ws library. Now ws.close() performs a graceful close, which does not complete before my callback is called, so the websocket remains open and leaks. After explicitly calling dc._client._connection.ws.terminate() on the connection instance and then dc.close(), the connection closes immediately.
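For reference, a minimal sketch of that workaround (note that _client and _connection are private internals of gremlin@3.3.x, so this may break in later releases):

// Hard-close the underlying websocket, then let the driver clean up.
// dc is the DriverRemoteConnection instance from the handler above.
function forceClose(dc) {
  if (dc._client && dc._client._connection && dc._client._connection.ws) {
    // ws.terminate() closes the socket immediately instead of waiting
    // for the graceful close handshake that leaks past the callback
    dc._client._connection.ws.terminate();
  }
  return dc.close();
}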

g.V().limit(1).count().next() is asynchronous.
Try this:
exports.handler = async (event) => {
  try {
    const dc = new DriverRemoteConnection('wss://your-neptune-endpoint:8182/gremlin');
    const graph = new Graph();
    const g = graph.traversal().withRemote(dc);
    const result = await g.V().limit(1).count().next();
    dc.close();
    return result;
  } catch (error) {
    throw error;
  }
}
Since your Lambda runtime is Node.js 8.10, you don't need to use callback.

Related

Why does my AWS Lambda function randomly fail when using private ElastiCache network calls as well as external API calls?

I am trying to write a caching function that returns cached ElastiCache data or makes an API call to retrieve that data. However, the Lambda function seems to be very unreliable and times out often.
It seems that having Redis calls as well as public API calls causes the issue. I can confirm that I have set up AWS correctly with a subnet with an internet gateway and a private subnet with a NAT gateway. The function works, but only 10% of the time. The remaining times, execution stops right before making the API call.
I have also noticed that the API calls fail after creating the Redis client. If I make the external API call prior to the Redis check, the function is a lot more reliable and doesn't time out.
Not sure what to do. Is it best practice to separate these 2 tasks, or am I doing something wrong?
let data = null;

module.exports.handler = async (event) => {
  //context.callbackWaitsForEmptyEventLoop = false;
  let client;
  try {
    client = new Redis(
      6379,
      "redis://---.---.ng.0001.use1.cache.amazonaws.com"
    );
    client.get(event.token, async (err, result) => {
      if (err) {
        console.error(err);
      } else {
        data = result;
        await client.quit();
      }
    });
    if (data && new Date().getTime() / 1000 - eval(data).timestamp < 30) {
      res.send(`({
        "address": "${token}",
        "price": "${eval(data).price}",
        "timestamp": "${eval(data).timestamp}"
      })`);
    } else {
      getPrice(event); //fetch api data
    }
  } catch (err) {
    console.error(err);
  }
};
There are a lot of misunderstandings in your code. I'll try to guide you to fix them and understand how to do this correctly.
You are mixing asynchronous and synchronous code in your function.
You should use JSON.parse instead of eval to parse the data, because eval allows arbitrary code to be executed in your function.
You're using res.send to return the response to the client instead of callback. Remember that res.send only exists in Express; you're in a Lambda, and to return the result to the client you need to use the callback function.
To help you with this task, I completely rewrote your code fixing these misunderstandings.
const Redis = require('ioredis');

module.exports.handler = async (event, context, callback) => {
  // prefer to use lambda env vars instead of putting values directly in the code
  const client = new Redis(
    "REDIS_PORT_ENV",
    "REDIS_HOST_ENV"
  );
  const data = await client.get(event.token);
  client.quit();
  const parsedData = JSON.parse(data);
  if (parsedData && new Date().getTime() / 1000 - parsedData.timestamp < 30) {
    callback(null, {
      address: event.token,
      price: parsedData.price,
      timestamp: parsedData.timestamp
    });
  } else {
    const dataFromApi = await getPrice(event);
    callback(null, dataFromApi);
  }
};
There is another style where the lambda returns an object instead of passing an object to callback, but I think you get the idea and understand your mistakes.
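For example, a minimal sketch of the same handler in the return-an-object style (same assumptions as above: an ioredis client and the getPrice helper from the question):

module.exports.handler = async (event) => {
  const client = new Redis("REDIS_PORT_ENV", "REDIS_HOST_ENV");
  const data = await client.get(event.token);
  await client.quit();
  const parsedData = JSON.parse(data);
  // Returning the object resolves the handler's promise; no callback needed.
  if (parsedData && new Date().getTime() / 1000 - parsedData.timestamp < 30) {
    return {
      address: event.token,
      price: parsedData.price,
      timestamp: parsedData.timestamp
    };
  }
  return getPrice(event);
};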
Follow the docs on the correct usage of Lambda:
https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/using-lambda-functions.html
To understand more about async and sync in JavaScript:
https://www.freecodecamp.org/news/synchronous-vs-asynchronous-in-javascript/
To compare JSON.parse and eval: JSON.parse vs. eval()

Problem reading an event stream in AWS Lambda. Node.js code works locally as desired but not in AWS Lambda

Here's the workflow:
Get a https link --> write to filesystem --> read from filesystem --> Get the sha256 hash.
It all works fine on my local machine running Node 10.15.3, but when I invoke the Lambda function on AWS, the output is null. The problem may lie with the readable stream. Here's the code. You can run it directly on your local machine and it will output a sha256 hash as required. If you wish to run it on AWS Lambda, comment/uncomment as marked.
//Reference: https://stackoverflow.com/questions/11944932/how-to-download-a-file-with-node-js-without-using-third-party-libraries
var https = require('https');
var fs = require('fs');
var crypto = require('crypto');

const url = "https://upload.wikimedia.org/wikipedia/commons/a/a8/TEIDE.JPG";
const dest = "/tmp/doc";
let hexData;

async function writeit() {
  var file = fs.createWriteStream(dest);
  return new Promise((resolve, reject) => {
    var responseSent = false;
    https.get(url, response => {
      response.pipe(file);
      file.on('finish', () => {
        file.close(() => {
          if (responseSent) return;
          responseSent = true;
          resolve();
        });
      });
    }).on('error', err => {
      if (responseSent) return;
      responseSent = true;
      reject(err);
    });
  });
}

const readit = async () => {
  await writeit();
  var readandhex = fs.createReadStream(dest).pipe(crypto.createHash('sha256').setEncoding('hex'));
  try {
    readandhex.on('finish', function () { //MAYBE THE PROBLEM IS HERE.
      console.log(this.read());
      fs.unlink(dest, () => {});
    });
  }
  catch (err) {
    console.log(err);
    return err;
  }
};

const handler = async () => { //Comment this line to run the code on AWS Lambda
//exports.handler = async (event) => { //Uncomment this line to run the code on AWS Lambda
  try {
    hexData = readit();
  }
  catch (err) {
    console.log(err);
    return err;
  }
  return hexData;
};

handler(); //Comment this line to run the code on AWS Lambda
There can be multiple things that you need to check.
Since the URL you are accessing is a public one, make sure either your Lambda is outside a VPC or your VPC has a NAT gateway attached with internet access.
/tmp is a valid temp directory for Lambda, but you may need to create the doc folder inside /tmp before using it.
You can check CloudWatch logs for more information on what's going on, if logging is enabled.
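If the path turns out to be the issue, a minimal sketch of creating that folder up front (assuming you move the download target inside a /tmp/doc directory, e.g. the hypothetical /tmp/doc/TEIDE.JPG below):

const fs = require('fs');
// Create /tmp/doc before writing into it; recursive: true is a no-op
// if the directory already exists.
fs.mkdirSync('/tmp/doc', { recursive: true });
const dest = '/tmp/doc/TEIDE.JPG'; // hypothetical nested destination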
I've seen this difference in behaviour between local and lambda before.
All async functions return promises. Async functions must be awaited. Calling an async function without awaiting it means execution continues to the next line(s), and potentially out of the calling function.
So your code:
exports.handler = async (event) => {
try {
hexData = readit();
}
catch (err) {
console.log(err);
return err;
}
return hexData;
};
readit() is defined as const readit = async () => { ... }. But your handler does not await it. Therefore hexData = readit(); assigns an unresolved promise to hexData, returns it, and the handler exits and the Lambda "completes" without the code of readit() having been executed.
The simple fix then is to await the async function: hexData = await readit();. The reason why it works locally in node is because the node process will wait for promises to resolve before exiting, even though the handler function has already returned. But since Lambda "returns" as soon as the handler returns, unresolved promises remain unresolved. (As an aside, there is no need for the writeit function to be marked async, because it doesn't await anything, and already returns a promise.)
That being said, I don't know promises well, and I barely know anything about events. So there are others things which raise warning flags for me but I'm not sure about them, maybe they're perfectly fine, but I'll raise it here just in case:
file.on('finish') and readandhex.on('finish'). These are both event handlers, and I believe they are non-blocking, so why would the handler, and therefore the Lambda, wait around for them?
In the first case, it's within a promise and resolve() is called from within the event handler, so that may be fine (as I said, I don't know much about these 2 subjects so am not sure) - the important thing is that the code must block at that point until the promise is resolved. If the code can continue execution (i.e. return from writeit()) until the finish event is raised, then it won't work.
The second case is almost certainly going to be a problem because it's just saying that if x event is raised, then do y. There's no promise being awaited, so nothing to block the code, so it will happily continue to the end of the readit() function and then the handler and Lambda. Again this is based on the assumption that events are non-blocking (in the sense that a declaration that you want to execute some code on some event does not wait at that point for that event to be raised).
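Putting that together, a sketch of readit() with the finish event wrapped in an awaited promise (my assumption of the fix, following the reasoning above, not code from the question):

const readit = async () => {
  await writeit();
  return new Promise((resolve, reject) => {
    const readandhex = fs.createReadStream(dest)
      .pipe(crypto.createHash('sha256').setEncoding('hex'));
    // Resolve with the hex digest only once the hash stream has finished,
    // so the handler (via await readit()) blocks until the work is done.
    readandhex.on('finish', function () {
      const hash = this.read();
      fs.unlink(dest, () => resolve(hash));
    });
    readandhex.on('error', reject);
  });
};

// In the handler:
// hexData = await readit();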

Trying to run a Cloud Function with LRO

Background
I am working on creating an autonomous Google AutoML end-to-end system. I created a cloud function that receives a cloud pub/sub message when training starts. The cloud function uses the operation ID to get the operation status of the training. If the training of the model is complete (operation metadata = true), the function will send the model ID to a deployment function and send a pub/sub message with the modelID of the model that predictions will be served from. I found a solution on SO in this post: How to programmatically get model id from google-cloud-automl with node.js client library
Problem
The issue I am coming across is the cloud function timeout of 10 minutes. I wrote this question on reddit about potential solutions. https://www.reddit.com/r/googlecloud/comments/jqr213/cloud_function_to_compute_engine/ The Compute Engine solution seems impractical for a system mainly written in a cloud function environment. While trying to implement the cron job solution, I thought of the retry feature for cloud functions. It keeps the same event and will retry the function for up to a week. The documentation for retry is https://cloud.google.com/functions/docs/bestpractices/retries How could I include a cancel of the function so it keeps retrying until the status becomes true and then completes the deployment and pub/sub message? My thought is to include the ending of the system in the if/else statement; I am just struggling to find documentation of this and whether it would actually work.
Code
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

exports.helloPubSub = (event, context) => {
  //Imports the Google Cloud AutoML library
  const message = event.data
    ? Buffer.from(event.data, 'base64').toString()
    : 'Hello, World';
  const model = message;
  console.log(model);
  const modelpath = message.replace('"', '');
  const modelID = modelpath.replace('"', '');
  const message1 = model.replace('projects/170974376642/locations/us-central1/operations/', '');
  const message2 = message1.replace('"', '');
  const message3 = message2.replace('"', '');
  console.log(`Operation ID is: ${message3}`);
  getOperationStatus(message3, modelID);
}
// [START automl_vision_classification_deploy_model_node_count]
async function getOperationStatus(opId, message) {
  console.log('Starting operation status');
  const opped = opId;
  const data = message;
  const projectId = '170974376642';
  const location = 'us-central1';
  const operationId = opId;
  // Construct request
  const request = {
    name: `${message}`,
  };
  console.log('Made it to the response');
  const [response] = await client.operationsClient.getOperation(request);
  console.log(`Name: ${response.name}`);
  console.log(`Operation details:`);
  var apple = JSON.stringify(response);
  console.log(apple);
  console.log('Loop until the model is ready to deploy');
  if (apple.includes('True')) {
    const appleF = apple.replace((/projects\/[a-zA-Z0-9-]*\/locations\/[a-zA-Z0-9-]*\/models\//, ''));
    deployModelWithNodeCount(appleF);
    pubSub(appleF);
  } else {
    getOperationStatus(opped, data);
  }
}
const {PubSub} = require('@google-cloud/pubsub'); // needed for pubSubClient below
const pubSubClient = new PubSub();

async function pubSub(id) {
  const topicName = 'modelID';
  const data = JSON.stringify({foo: `${id}`});
  async function publishMessage() {
    // Publishes the message as a string, e.g. "Hello, world!" or JSON.stringify(someObject)
    const dataBuffer = Buffer.from(data);
    try {
      const messageId = await pubSubClient.topic(topicName).publish(dataBuffer);
      console.log(`Message ${messageId} published.`);
    } catch (error) {
      console.error(`Received error while publishing: ${error.message}`);
      process.exitCode = 1;
    }
  }
  publishMessage();
  // [END pubsub_publish_with_error_handler]
  // [END pubsub_quickstart_publisher]
  process.on('unhandledRejection', err => {
    console.error(err.message);
    process.exitCode = 1;
  });
}
async function deployModelWithNodeCount(message) {
  const projectId = 'ireda1';
  const location = 'us-central1';
  const modelId = message;
  // Construct request
  const request = {
    name: client.modelPath(projectId, location, modelId),
    imageClassificationModelDeploymentMetadata: {
      nodeCount: 1,
    },
  };
  const [operation] = await client.deployModel(request);
  // Wait for operation to complete.
  const [response] = await operation.promise();
  console.log(`Model deployment finished. ${response}`);
}
// [END automl_vision_classification_deploy_model_node_count]
There are several improvements that you can consider for your code. First of all, it is important to understand that Cloud Functions are short-lived: 9 minutes is the maximum your function will be active. Cloud Functions are not meant for background operations; if you are looking for a solution that can be executed in the background and requires minimal infrastructure, I would recommend having a look at Cloud Run.
Now let's have a look at some parts of the code and how they can be improved with a different architecture, maintaining Cloud Functions and PubSub as the backbone.
Waiting on model deployment
The code you use is:
if (apple.includes('True')) {
  const appleF = apple.replace((/projects\/[a-zA-Z0-9-]*\/locations\/[a-zA-Z0-9-]*\/models\//, ''));
  deployModelWithNodeCount(appleF);
  pubSub(appleF);
} else {
  getOperationStatus(opped, data);
}
First of all, I would strongly suggest not using recursion here, because a) this can be handled via a simple loop, and b) you are bombarding the service without any timeout or back-off policy. The latter might result in your service crashing or the endpoint starting to reject your requests.
To improve your code, you can at least set a timeout between attempts, like this (note the wrapping arrow function; without it, getOperationStatus would be invoked immediately):
setTimeout(() => getOperationStatus(opped, data), 1000)
For readability, I would also suggest just using a loop in the future, since you are using async patterns anyway:
let status = await getOperationStatus(opped, data);
while (!status) {
  await new Promise(t => setTimeout(t, 1000));
  status = await getOperationStatus(opped, data);
}
In this case, you need to separate it into two functions: 1) getOperationStatus, which actually just returns the status, and 2) waitForDeployment, which polls for the status, compares it with the expected result, and decides to either a) wait and retry or b) abandon and return. A sketch of that split follows.
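As an illustration, a minimal sketch of the split (the done check is my assumption based on the operation's done field; getOperation is the same call the question already uses):

// 1) Just fetch and return the status.
async function getOperationStatus(operationName) {
  const [response] = await client.operationsClient.getOperation({ name: operationName });
  return response.done === true;
}

// 2) Poll with a fixed delay until the operation reports done.
async function waitForDeployment(operationName) {
  let done = await getOperationStatus(operationName);
  while (!done) {
    await new Promise(resolve => setTimeout(resolve, 1000));
    done = await getOperationStatus(operationName);
  }
}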
This might make your code better, but it does not solve the fundamental problem of the system design. To understand this, let's have a look at splitting responsibility and structuring the system differently. As a side note, the guide here is not meant for a Cloud Function application.
A few explanations:
The Activation Function initializes the entire process: it calls Vision AutoML to start the deployment, gets only the ID of the operation, and pushes it to the queue.
Cloud Scheduler pushes a trigger to PubSub (alternatively, it can also call the function as an endpoint) every X minutes/seconds saying that it is time to check on the progress.
The Polling Function, once triggered, asks for the next ID to check and queries Cloud AutoML; if the operation has finished, it acknowledges the message and writes the results, otherwise it exits. You need to be careful with the configuration of acknowledgments here. Useful information is here.
Polling of the status
The minor thing I have noticed is how you are polling the status. Why don't you just query this URL: GET https://automl.googleapis.com/v1/projects/project-id/locations/us-central1/operations/operation-id and check its done status (check here for details)?
Conclusion: Cloud Functions are short-lived and should handle only one operation at a time, with no waiting. If you need a simple loop that waits for results, use Cloud Run.

Lambda function only putting one data point into InfluxDB

I have a Lambda function that is designed to take a message from an SQS queue and then input a value called perf_value, which is just an integer. The CloudWatch logs show it firing each time and logging Done, as seen in the .then() block of my write point. Even though it fires each time, I am still only seeing a single data point in InfluxDB Cloud. I can't figure out why it inputs a single value and then nothing after that. I don't see a backlog in SQS, and there are no error messages in CloudWatch either. I'm guessing it is a code issue or InfluxDB Cloud setup, though I used defaults, which you would expect to work for multiple data points.
'use strict';
const {InfluxDB, Point, HttpError} = require('@influxdata/influxdb-client')

const InfluxURL = 'https://us-west-2-1.aws.cloud2.influxdata.com'
const token = '<my token>=='
const org = '<my org>'
const bucket = '<bucket name>'
const writeApi = new InfluxDB({url: InfluxURL, token}).getWriteApi(org, bucket, 'ms')

module.exports.perf = function (event, context, callback) {
  context.callbackWaitsForEmptyEventLoop = false;
  let input = JSON.parse(event.Records[0].body);
  console.log(input)
  const point = new Point('elapsedTime')
    .tag(input.monitorID, 'monitorID')
    .floatField('elapsedTime', input.perf_value)
    // .timestamp(input.time)

  writeApi.writePoint(point)
  writeApi
    .close()
    .then(() => {
      console.log('Done')
    })
    .catch(e => {
      console.error(e)
      if (e instanceof HttpError && e.statusCode === 401) {
        console.log('Unauthorized request')
      }
      console.log('\nFinished ERROR')
    })
  return true
};
EDIT:
Still have been unable to resolve the issue. I can get one data point to go into InfluxDB and then nothing will show up after that.
@Joshk132 -
I believe the problem is here:
writeApi
  .close() // <-- here
  .then(() => {
    console.log('Done')
  })
You are closing the API client object after the first write, so you are only able to write once. You can use flush() instead if you want to force sending the point immediately.
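For example, a minimal sketch of the write path using flush() (untested against the original setup; flush() sends any buffered points but keeps the shared writeApi usable on the next invocation):

writeApi.writePoint(point)
writeApi
  .flush() // send buffered points now, keep the client open for reuse
  .then(() => {
    console.log('Done')
    callback(null, true)
  })
  .catch(e => {
    console.error(e)
    callback(e)
  })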

"connection terminated unexpectedly" error with Node, Postgres on AWS Lambda

I have a number of Node functions running on AWS Lambda. These functions have been using the Node 8 runtime, but AWS sent out an end-of-life notice saying that functions should be upgraded to the latest LTS. With that, I upgraded one of my functions to use Node 12. After being in production for a bit, I'm starting to see a ton of connection terminated unexpectedly errors when querying the database.
Here are the errors that I'm seeing:
The connection terminated unexpectedly error
And Error [ERR_STREAM_DESTROYED]: Cannot call write after a stream was destroyed - this seems to happen on the first or second invocation after seeing the connection terminated unexpectedly error.
I'm using Knex.js for querying the database. I was running older versions of knex and node-postgres and recently upgraded to see if that would resolve the issue, but no luck. Here are the versions of knex and node-postgres that I'm currently running:
"knex": "^0.20.8"
"pg": "^7.17.1"
The only change I've made to this particular function is the upgrade to Node 12. I've also tried Node 10, but the same issue persists. Unfortunately, AWS won't let me downgrade to Node 8 to verify that it is indeed an issue. None of my other functions running on Node 8 are experiencing this issue.
I've researched knex, node-postgres and tarn.js (the Knex connection pooling library) to see if any related issues or solutions popped up, but so far, I haven't had any luck.
UPDATE:
Example of a handler. Note that this is happening on many different Lambdas, all running Node 12.
require('../../helpers/knex')

const { Rollbar } = require('@scoutforpets/utils')
const { Email } = require('@scoutforpets/notifications')
const { transaction: tx } = require('objection')

const Invoice = require('../../models/invoice')

// configure rollbar for error logging
const rollbar = Rollbar.configureRollbar(process.env.ROLLBAR_TOKEN)

/**
 *
 * @param {*} event
 */
async function handler (event) {
  const { invoice } = event
  const { id: invoiceId } = invoice
  try {
    return tx(Invoice, async Invoice => {
      // send the receipt
      await Email.Customer.paymentReceipt(invoiceId, true)
      // convert JSON to model
      const i = Invoice.fromJson(invoice)
      // mark the invoice as having been sent
      await i.markAsSent()
    })
  } catch (err) {
    return err
  }
}

module.exports.handler = rollbar.lambdaHandler(handler)
Starting with Node.js 10, AWS Lambda makes the handler async, so you have to adapt your code.
Docs : https://docs.aws.amazon.com/lambda/latest/dg/nodejs-prog-model-handler.html
The runtime passes three arguments to the handler method. The first
argument is the event object, which contains information from the
invoker. The invoker passes this information as a JSON-formatted
string when it calls Invoke, and the runtime converts it to an object.
When an AWS service invokes your function, the event structure varies
by service.
The second argument is the context object, which contains information
about the invocation, function, and execution environment. In the
preceding example, the function gets the name of the log stream from
the context object and returns it to the invoker.
The third argument, callback, is a function that you can call in
non-async functions to send a response. The callback function takes
two arguments: an Error and a response. When you call it, Lambda waits
for the event loop to be empty and then returns the response or error
to the invoker. The response object must be compatible with
JSON.stringify.
For async functions, you return a response, error, or promise to the
runtime instead of using callback.
exports.handler = async function(event, context, callback) {
  console.log("EVENT: \n" + JSON.stringify(event, null, 2))
  return context.logStreamName
}
Thx!
I think you need to set the right connection pooling config.
See the docs here: https://github.com/marcogrcr/sequelize/blob/patch-1/docs/manual/other-topics/aws-lambda.md
const { Sequelize } = require("sequelize");

let sequelize = null;

async function loadSequelize() {
  const sequelize = new Sequelize(/* (...) */, {
    // (...)
    pool: {
      /*
       * Lambda functions process one request at a time but your code may issue multiple queries
       * concurrently. Be wary that `sequelize` has methods that issue 2 queries concurrently
       * (e.g. `Model.findAndCountAll()`). Using a value higher than 1 allows concurrent queries to
       * be executed in parallel rather than serialized. Careful with executing too many queries in
       * parallel per Lambda function execution since that can bring down your database with an
       * excessive number of connections.
       *
       * Ideally you want to choose a `max` number where this holds true:
       * max * EXPECTED_MAX_CONCURRENT_LAMBDA_INVOCATIONS < MAX_ALLOWED_DATABASE_CONNECTIONS * 0.8
       */
      max: 2,
      /*
       * Set this value to 0 so connection pool eviction logic eventually cleans up all connections
       * in the event of a Lambda function timeout.
       */
      min: 0,
      /*
       * Set this value to 0 so connections are eligible for cleanup immediately after they're
       * returned to the pool.
       */
      idle: 0,
      // Choose a small enough value that fails fast if a connection takes too long to be established.
      acquire: 3000,
      /*
       * Ensures the connection pool attempts to be cleaned up automatically on the next Lambda
       * function invocation, if the previous invocation timed out.
       */
      evict: CURRENT_LAMBDA_FUNCTION_TIMEOUT
    }
  });

  // or `sequelize.sync()`
  await sequelize.authenticate();

  return sequelize;
}

module.exports.handler = async function (event, callback) {
  // re-use the sequelize instance across invocations to improve performance
  if (!sequelize) {
    sequelize = await loadSequelize();
  } else {
    // restart connection pool to ensure connections are not re-used across invocations
    sequelize.connectionManager.initPools();

    // restore `getConnection()` if it has been overwritten by `close()`
    if (sequelize.connectionManager.hasOwnProperty("getConnection")) {
      delete sequelize.connectionManager.getConnection;
    }
  }

  try {
    return await doSomethingWithSequelize(sequelize);
  } finally {
    // close any opened connections during the invocation
    // this will wait for any in-progress queries to finish before closing the connections
    await sequelize.connectionManager.close();
  }
};
It's actually for Sequelize, not Knex, but I'm sure under the hood they work the same way.
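For what it's worth, a rough Knex equivalent of that pool config might look like the sketch below; the pool option names are tarn.js options as I understand them, so double-check them against the knex/tarn docs before relying on this:

const knex = require('knex')({
  client: 'pg',
  connection: process.env.DATABASE_URL, // hypothetical; substitute your own config
  pool: {
    min: 0,                     // let the pool drain completely between invocations
    max: 2,                     // keep concurrent connections per container low
    idleTimeoutMillis: 0,       // connections are eligible for cleanup immediately
    acquireTimeoutMillis: 3000  // fail fast if a connection can't be established
  }
});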
I had this problem too; in my case it was because I tried to connect to the db in production.
So, I added ssl to the Pool, like this:
const pool = new Pool({
  connectionString: connectionString,
  ssl: { rejectUnauthorized: false },
});
Hope it helps you too...
