I'm using two Lambda functions on the Node.js 4.3 runtime. I run the first one and it calls the second synchronously (sync is the intent). The problem is that the second one times out (at 60 seconds) even though it actually reaches a successful finish after only 22 seconds.
Here's the flow between the two Lambda functions:
Lambda function A is no longer producing CloudWatch logs, but the real problem (I think) is function B, which times out for no reason.
Here's some CloudWatch logs to illustrate this:
The code at the end of function B, which includes the "Success" log statement seen in the picture above, is included below:
Originally I only had the callback(null, 'successful ...') line and not the nodejs 0.10.x way where you called succeed() off of context. In desperation I added both but the result is the same.
Anyone have an idea what's going on? Any way in which I can debug this?
In case the invocation logic between A and B makes a difference in the state that B starts in, here's the invocation:
As Michael - sqlbot said, the issue seems to be that as long as there is an open connection, the event loop is not empty, so calling the callback doesn't terminate the function. I had the same problem with an open Redis connection; the solution, as stated, is context.callbackWaitsForEmptyEventLoop = false;
At least for Redis connections, it also helps to quit the connection to Redis so that Lambda can finish the job properly.
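As a rough sketch of the "quit the connection first" approach: the client below is a hypothetical stand-in rather than a real Redis library (createClient, get and quit are assumptions made for illustration), and an interval timer plays the role of the open socket that keeps the event loop busy.

```javascript
// Hypothetical stand-in for a Redis client. The interval keeps the event loop
// non-empty in the same way an open socket would.
function createClient() {
  const keepAlive = setInterval(() => {}, 1000);
  return {
    connected: true,
    get(key, cb) {
      setImmediate(() => cb(null, 'value-for-' + key));
    },
    quit(cb) {
      clearInterval(keepAlive); // nothing is left on the event loop after this
      this.connected = false;
      setImmediate(() => cb(null, 'OK'));
    },
  };
}

const handler = (event, context, callback) => {
  const client = createClient();
  client.get(event.key, (err, value) => {
    // Close the connection first, then report the result, so the callback
    // is not held up waiting for the event loop to drain.
    client.quit(() => callback(err, value));
  });
};
exports.handler = handler;
```

The trade-off is that every invocation pays for a fresh connection; setting context.callbackWaitsForEmptyEventLoop = false avoids that cost but leaves the connection open between invocations.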
We've got an occasional weird problem in our AWS lambda function where sendMessage() only gets completed in the /next/ function invocation. Our code looks like:
await sqsClient.sendMessage(params).promise()
As you can see, we're awaiting on the promise... but occasionally it looks like the promise doesn't get completed until the next lambda invocation comes into the same warm container.
Any theories on what might cause that?
So after hours of debugging, I eventually tracked it down to a line in our codebase that called context.done() but then let the code continue to execute.
The next time our code calls await ..., I assume that node's async handling kicks in, realises that it is marked as done, so then just abruptly ends the lambda execution... even though the promise hasn't yet completed.
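A minimal sketch of one way to avoid that situation, assuming an async handler: signal completion by returning, and return at the point where the work is really finished. The sendMessage function here is a hypothetical stand-in for sqsClient.sendMessage(params).promise(), and the event shape is made up for illustration.

```javascript
// Hypothetical stand-in for sqsClient.sendMessage(params).promise().
const sent = [];
function sendMessage(params) {
  return new Promise((resolve) => setImmediate(() => {
    sent.push(params);
    resolve({ MessageId: String(sent.length) });
  }));
}

const handler = async (event) => {
  if (!event.body) {
    // Signal completion by returning, and stop here. Calling context.done()
    // and then falling through would leave the code below still running.
    return { statusCode: 400 };
  }
  const { MessageId } = await sendMessage({ MessageBody: event.body });
  return { statusCode: 200, body: MessageId };
};
exports.handler = handler;
```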
I was following the guide here for setting up a presignup trigger.
However, when I used callback(null, event) my lambda function would never actually return and I would end up getting an error
{ code: 'UnexpectedLambdaException',
name: 'UnexpectedLambdaException',
message: 'arn:aws:lambda:us-east-2:642684845958:function:proj-dev-confirm-1OP5DB3KK5WTA failed with error Socket timeout while invoking Lambda function.' }
I found a similar link here that says to use context.done().
After switching it works perfectly fine.
What's the difference?
exports.confirm = (event, context, callback) => {
    event.response.autoConfirmUser = true;
    context.done(null, event);
    //callback(null, event); does not work
}
Back in the original Lambda runtime environment for Node.js 0.10, Lambda provided helper functions in the context object: context.done(err, res), context.succeed(res), and context.fail(err).
This was formerly documented, but has been removed.
Using the Earlier Node.js Runtime v0.10.42 is an archived copy of a page that no longer exists in the Lambda documentation, that explains how these methods were used.
When the Node.js 4.3 runtime for Lambda was launched, these remained for backwards compatibility (and remain available but undocumented), and callback(err, res) was introduced.
Here's the nature of your problem, and why the two solutions you found actually seem to solve it.
Context.succeed, context.done, and context.fail however, are more than just bookkeeping – they cause the request to return after the current task completes and freeze the process immediately, even if other tasks remain in the Node.js event loop. Generally that’s not what you want if those tasks represent incomplete callbacks.
https://aws.amazon.com/blogs/compute/node-js-4-3-2-runtime-now-available-on-lambda/
So with callback, Lambda functions now behave in a more paradigmatically correct way, but this is a problem if you intend for certain objects to remain on the event loop during the freeze that occurs between invocations. Unlike the old (deprecated) done, fail, and succeed methods, using the callback doesn't suspend things immediately. Instead, it waits for the event loop to be empty.
context.callbackWaitsForEmptyEventLoop -- default true -- was introduced so that you can set it to false for those cases where you want the Lambda function to return immediately after you call the callback, regardless of what's happening in the event loop. The default is true because false can mask bugs in your function and can cause very erratic/unexpected behavior if you fail to consider the implications of container reuse -- so you shouldn't set this to false unless and until you understand why it is needed.
A common reason false is needed would be a database connection made by your function. If you create a database connection object in a global variable, it will have an open socket, and potentially other things like timers, sitting on the event loop. This prevents the callback from causing Lambda to return a response, until these operations are also finished or the invocation timeout timer fires.
Identify why you need to set this to false, and if it's a valid reason, then it is correct to use it.
Otherwise, your code may have a bug that you need to understand and fix, such as leaving requests in flight or other work unfinished, when calling the callback.
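To make the database case concrete, here is a minimal sketch under stated assumptions: connect() and its query/close methods are hypothetical stand-ins for a real driver, and an interval timer stands in for the open socket. Only context.callbackWaitsForEmptyEventLoop is the actual Lambda setting.

```javascript
// Module scope: this survives across invocations in a reused container.
let connection = null;

// Hypothetical connect(); a real driver would open a socket here.
function connect() {
  const socketStandIn = setInterval(() => {}, 1000);
  return {
    query(sql, cb) {
      setImmediate(() => cb(null, [{ sql }]));
    },
    close() {
      clearInterval(socketStandIn);
    },
  };
}

const handler = (event, context, callback) => {
  // Respond as soon as callback is called, even though the connection
  // (and its socket) is deliberately left open for the next invocation.
  context.callbackWaitsForEmptyEventLoop = false;

  if (!connection) connection = connect();
  connection.query('SELECT 1', callback);
};
exports.handler = handler;
```

Here the open socket is the identified, valid reason for setting the flag to false; without the flag, the callback would wait until the invocation timeout because the connection never leaves the event loop.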
So, how do we parse the Cognito error? At first, it seemed pretty unusual, but now it's clear that it is not.
When a function runs past its configured timeout, Lambda throws an error saying that the task timed out after the configured number of seconds. You should find this to be what happens when you test your function in the Lambda console.
Unfortunately, Cognito appears to have taken an internal design shortcut when invoking a Lambda function. Instead of waiting for Lambda to time out the invocation (which could tie up resources inside Cognito), or imposing its own explicit timer on the maximum duration Cognito will wait for a Lambda response, it relies on a lower-layer socket timer to constrain this wait... thus an "unexpected" error is thrown while invoking the function.
Further complicating interpreting the error message, there are missing quotes in the error, where the lower layer exception is interpolated.
To me, the problem would be much more clear if the error read like this:
'arn:aws:lambda:...' failed with error 'Socket timeout' while invoking Lambda function
This format would more clearly indicate that while Cognito was invoking the function, it threw an internal Socket timeout error (as opposed to Lambda encountering an unexpected internal error, which was my original -- and incorrect -- assumption).
It's quite reasonable for Cognito to impose some kind of response time limit on the Lambda function, but I don't see this documented. I suspect a short timeout on your Lambda function itself (making it fail more promptly) would cause Cognito to throw a somewhat more useful error, but in my mind, Cognito should have been designed to include logic to make this an expected, defined error, rather than categorizing it as "unexpected."
As an update: the Node.js 10.x runtime supports an async handler function that uses return and throw statements to return success or error responses, respectively. Additionally, if your function performs asynchronous tasks, you can return a Promise and use resolve or reject to return a success or error, respectively. Either approach simplifies things by not requiring context or callback to signal completion to the invoker, so your Lambda function could look something like this:
exports.handler = async (event) => {
    // perform tasking...
    const data = doStuffWith(event)

    // if you encounter an error situation (here: no data, as an example), throw
    if (!data) {
        throw new Error('tell invoker you encountered an error')
    }

    // finished tasking with no errors
    return { data }
}
Of course, you can still use context, but it's not required to signal completion.
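For the Promise-returning variant mentioned above, a minimal sketch might look like the following. The lookup function is a hypothetical callback-style task invented for illustration; the point is only that resolve and reject take the place of callback(null, res) and callback(err).

```javascript
// Hypothetical callback-style asynchronous task.
function lookup(id, cb) {
  setImmediate(() => (id ? cb(null, { id }) : cb(new Error('missing id'))));
}

// The handler returns a Promise: resolve signals success, reject signals an error.
const handler = (event) =>
  new Promise((resolve, reject) => {
    lookup(event.id, (err, data) => (err ? reject(err) : resolve({ data })));
  });
exports.handler = handler;
```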
I am using Node.js on AWS Lambda.
As far as I know, each Lambda invocation is handled in an independent, parallel process.
However, the following example shows a different result than I expected.
// test.js
const now = new Date();
module.exports = () => {
    console.log(now);
};
// handler.js
const test = require('./test');
module.exports.hello = async (event, context) => {
    test();
    return {
        statusCode: 200,
        body: null
    };
};
RESULT:
hello handler log
I expected each invocation to run independently, so the value logged by console.log(now) should always be the time of that invocation.
However, in the actual log, the value of now always stays at the time of the very first execution rather than the time of each invocation.
After 5 minutes, the logged value was still the same.
The value did change after 12 hours, but after that the same problem appeared again.
This result makes us seriously reconsider how to manage DB connections.
There are two assumptions, one for each case of Lambda's recycling:
If Lambda recycles modules the way it does test.js,
it is better to use a connection pool
an ORM such as Sequelize, which requires initialization, is also a reasonable choice
If not,
it is better to use simple connections and plain queries so that connections are released quickly
How can we get maximum performance out of Lambda?
How can we interpret the test results above?
AWS Lambda creates and reuses the containers, so you need to understand the impact of this practice on the programming model.
The first time a function executes, a new container will be created to execute it.
Let’s say your function finishes, and some time passes, then you call it again. Lambda may create a new container all over again. However, if you haven’t changed the Lambda function code and not too much time has gone by, Lambda may reuse the previous container. This offers performance advantages: Lambda gets to skip the nodejs language initialization, and you get to skip initialization in your code (so you can reuse DB connections, for example); files that you wrote to /tmp last time around will still be there if the container gets reused; anything you initialized globally outside of the Lambda function handler persists.
For more see Understanding Container Reuse in AWS Lambda.
The behavior that you have described is a result of AWS optimizations. It looks like your Lambda is very fast, so it is more efficient for AWS to use only one unit of execution (process/container/instance). Try simulating a long-running process, and you should see that the timestamps differ in that case.
My route calls a function startJob() that immediately returns a response "Job started." It also calls a function runJob(), which continues with the job flow (async).
Obviously, I can't just do const jobResults = await startJob(), since that function just returns "Job started." The actual results of the job are not available until all the subsequent functions complete.
So I need a way to check the results of the test after all the functions complete, meaning I need to know when all the functions complete.
The last function in the flow logResults() is not going to work to spy on, because I need it to call through (including an async call). Jasmine doesn't allow me to call it through and then call another function that checks the results.
My idea was to add a function at the end of logResults() called finishJob and spy on that. finishJob() wouldn't do anything, so I wouldn't need to call it through.
Pseudo-code:
spyOn(finishJob).and.callFake( [function that checks the results] )
However, it seems like bad practice to add a function to the code purely for testing reasons.
Other possible solutions would be using some kind of node watchers, a state change machine, or just doing a timeout (also bad practice? but quick implementation...)
I found a solution I'm happy with. I decided to start the testing at a different point in the flow.
Instead of calling await startJob() to start the test, I called await runJob(), which kicks off the asynchronous process and gets the results.
That way, I could know when the process finishes (using await keyword obvs) and check the results at that time.
I am trying to implement a couple of state handler functions in my JavaScript code, in order to perform 2 distinct actions in each state. This is similar to the State design pattern in Java (https://sourcemaking.com/design_patterns/state).
Conceptually, my program needs to remain connected to an Elasticsearch instance (or any other server, for that matter), and then parse and POST some incoming data to it. If there is no connection available to Elasticsearch, my program would keep trying to connect endlessly with some retry period.
In a nutshell,
When not connected, keep trying to connect
When connected, start POSTing the data
The main run loop calls itself recursively:
function run(ctx) {
    logger.info("run: running...");
    // initially starts with disconnected state...
    return ctx.curState.run(ctx)
        .then(function(result) {
            if (result) ctx.curState = connectedSt;
            // else it remains in old state.
            return run(ctx);
        });
}
This is not a truly recursive function in the sense of each invocation calling itself in a tight loop. But I suspect it ends up with many promises in the chain, and in the long run it will consume more and more memory and eventually hang.
Is my assumption / understanding right? Or is it OK to write this kinda code?
If not, should I consider calling setImmediate / process.nextTick etc?
Or should I consider using TCO (Tail Call Optimization)? Of course, I am yet to fully understand this concept.
Yes, by returning a new promise (the result of the recursive call to run()), you effectively chain in another promise.
Neither setImmediate() nor process.nextTick() are going to solve this directly.
When you call run() again, simply don't return it and you should be fine.
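A minimal sketch of that suggestion, with assumptions labeled: the two states below are hypothetical stubs that resolve asynchronously like a network call would, and ctx.stopped, ctx.maxPosts and ctx.onStop are invented so that the sketch can terminate. The relevant change is that run(ctx) is called inside then() without being returned.

```javascript
// Hypothetical states; each run() resolves asynchronously, like a real network call.
const connectedSt = {
  run(ctx) {
    return new Promise((resolve) => setImmediate(() => {
      ctx.posted += 1;
      if (ctx.posted >= ctx.maxPosts) ctx.stopped = true;
      resolve(false); // stay in the connected state
    }));
  },
};

const disconnectedSt = {
  run(ctx) {
    return new Promise((resolve) => setImmediate(() => {
      ctx.attempts += 1;
      resolve(ctx.attempts >= 3); // pretend the third attempt connects
    }));
  },
};

function run(ctx) {
  if (ctx.stopped) {
    if (ctx.onStop) ctx.onStop();
    return;
  }
  ctx.curState.run(ctx)
    .then((result) => {
      if (result) ctx.curState = connectedSt;
      // Start the next step WITHOUT returning its promise, so each step's
      // promise settles on its own instead of waiting on all later steps.
      run(ctx);
    })
    .catch((err) => console.error('run: state failed', err));
}
```

Because nothing is returned, errors no longer propagate to the original caller, so each step needs its own catch, as above.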