I wrote a simple Mongo test that tries to access a MongoDB server in a VPC.
Every run fails with: "errorMessage": "*** Task timed out after 3.00 seconds"
I have written more handlers in the Lambda just to check it:
There is no problem connecting to the VPC. Another handler (same file) that connects to another server runs well.
There is no problem with other modules. I have added another module (make-random-string) and it runs every time.
I get no error messages and no exceptions from Mongo. It just times out every time.
Increasing memory to 1024 MB and the execution time to 15 s didn't help; the results are the same.
The Mongo driver does not require any C++ builds unless you use Kerberos, which I don't.
A test file mimicking the Lambda runs fine.
The sample code is here: http://pastebin.com/R2e3jwwa (the DB information is removed).
Thanks.
As weird as it may sound, we finally solved the problem just by changing callback(null, response) to context.done(null, response). This nonsense took us more time than we would have liked to spend on it.
You can find more info about the issue here: https://github.com/serverless/serverless/issues/1036
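A likely explanation for why this helps: with the callback style, the Node.js runtime by default waits for the event loop to be empty before it returns the callback's result, and an open MongoDB connection keeps the loop busy until the function times out. A minimal sketch of a commonly used alternative to the legacy context.done, assuming that this is what is happening here; runQuery is a hypothetical stand-in for the real Mongo call:

```javascript
// Hypothetical stand-in for the real MongoDB query; resolves with a fake document.
function runQuery() {
  return Promise.resolve({ ok: true });
}

// Callback-style handler. In a real deployment, export it (e.g. exports.handler = handler).
function handler(event, context, callback) {
  // Documented Lambda context property: when false, the response is sent as soon as
  // the callback runs, even if open sockets (such as a DB connection) keep the
  // event loop busy.
  context.callbackWaitsForEmptyEventLoop = false;

  runQuery().then(
    (doc) => callback(null, { statusCode: 200, body: JSON.stringify(doc) }),
    (err) => callback(err)
  );
}
```

The flag only changes when the response is returned. The connection itself stays open and can be reused by a later invocation of the same container.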
I had the same issue. The solution was to move the database connection object outside the handler method and cache/reuse it.
Here I added more details about it:
https://stackoverflow.com/a/67530789/10664035
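A minimal sketch of that caching pattern, not the exact code from the linked answer. connectToDatabase is a hypothetical placeholder for the driver's real connect call, and the counter exists only to show that the connection is created once:

```javascript
// Hypothetical connector; a real function would call the MongoDB driver here.
let connectCalls = 0;
function connectToDatabase() {
  connectCalls += 1;
  return Promise.resolve({ id: connectCalls });
}

// Declared outside the handler, so it survives across invocations of a warm container.
let cachedDbPromise = null;

function getDb() {
  if (!cachedDbPromise) {
    cachedDbPromise = connectToDatabase().catch((err) => {
      cachedDbPromise = null; // allow a retry on the next invocation
      throw err;
    });
  }
  return cachedDbPromise;
}

async function handler(event, context) {
  const db = await getDb(); // connects on the first call, reuses afterwards
  return { connectionId: db.id };
}
```

Caching the promise rather than the resolved object means that concurrent calls during start-up share a single connection attempt.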
I am working on AWS Lambda functions (Node.js) that connect to a MongoDB server running on an EC2 instance.
The Lambda function is placed in VPC-1 and the MongoDB server (EC2 instance) is in VPC-2.
We have set up VPC peering between VPC-1 and VPC-2.
The Lambda function intermittently throws a timeout error: it works about 50% of the time and times out the other 50%.
Note: The MongoDB server runs on an EC2 instance that was set up specially for the development of this project. It does not get any additional traffic.
Also, another component of this project, developed in Node.js and running on another EC2 instance, can communicate with the MongoDB server without any timeout issues.
Could someone help me in understanding the possible cause of the timeout issues?
Thanks in advance.
I hope the article below solves your problem:
To fix: increase the timeout and/or memory setting on the configuration page of your Lambda function.
For Node.js async-related issues, please refer to the link below:
AWS Lambda: Task timed out
Lambda timeouts can best be described as
The amount of time that Lambda allows a function to run before stopping it. The default is 3 seconds. The maximum allowed value is 900 seconds.
Within the console you can increase this timeout to a larger value.
When you click on the Lambda function there is a Monitoring tab. From there you should be able to see the execution time of your Lambda function. You might find that it's always close to the limit.
I'd recommend setting the timeout a bit higher than you anticipate it needs, then reviewing these metrics. Once you have a baseline, adjust the timeout value again.
I have a serious problem in production causing the application to become unresponsive and output the following error:
Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?
My running hypothesis is that some operations are holding onto long-running Knex transactions, enough of them to reach the pool size.
Is there a way to query the KnexJS API for how many pool connections are in use at any one time? Unfortunately since KnexJS occupies the max pool settings from the config, it can be hard to know how many are actually in use. From the postgres end, it seems like KnexJS is idling on all of its connections when they are not in use.
Is there a good way to instrument Knex transaction and transacting with some kind of middleware or hook? Another useful thing is to log the callstack of any transaction (or any longer than, say, 7 seconds). One challenge is I have calls to Knex transaction and transacting throughout my project. Maybe it's a long shot.
Any advice is greatly appreciated.
System Information
KnexJS version: 0.12.6 (we will update in the next month)
Database + version: Postgres 9.6
OS: Heroku Linux (Ubuntu?)
The easiest way to see what's happening at the connection pool level is to run Knex with the DEBUG=knex:* environment variable set, which prints quite a lot of debug info about what's happening inside Knex. Those logs show, for example, when connections are fetched from the pool and returned to it, as well as every query that is run.
There are a couple of global events that you can use to hook into every query, but there is none for hooking into transactions. Here is a related question where I have written some example code showing how to measure transaction durations with query hooks: Tracking DB querying time - Bookshelf/knex. It probably leaks some memory, so it's not a very production-ready solution, but for debugging purposes it might be helpful.
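For the "log the call stack of any transaction longer than, say, 7 seconds" part of the question, a different approach is to wrap the calls yourself. This is a rough sketch that does not use any Knex hook; trackLongRunning and its parameters are hypothetical names introduced for illustration, and it works with any promise-returning function:

```javascript
// Wraps a promise-returning function and warns, with the caller's stack,
// if it is still running after thresholdMs.
function trackLongRunning(label, thresholdMs, fn, log = console.warn) {
  // Capture the stack now, while it still points at the calling code.
  const startedAt = new Error(`${label} started here`).stack;

  const timer = setTimeout(() => {
    log(`${label} still running after ${thresholdMs} ms\n${startedAt}`);
  }, thresholdMs);

  // Promise.resolve().then(fn) also turns a synchronous throw into a rejection.
  return Promise.resolve()
    .then(fn)
    .finally(() => clearTimeout(timer));
}
```

Wrapping a call would then look like trackLongRunning('createOrder', 7000, () => knex.transaction(work)), where work is your existing transaction handler. Since the calls are spread throughout the project, each call site has to be wrapped separately.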
I'm currently building a web API using AWS Lambda with the Serverless Framework.
Each of my Lambda functions connects to Redis (ElastiCache) and an RDB (Aurora, RDS) or DynamoDB to retrieve or write data.
All my Lambda functions are running in my VPC.
Everything works fine, except that when a Lambda function is executed for the first time, or a while after its last execution, it takes quite a long time (1-3 seconds) to run. Sometimes it even responds with a gateway timeout error (after around 30 seconds), even though my Lambda functions are configured with a 60-second timeout.
As stated here, I assume the 1-3 seconds are spent initializing a new container. However, I wonder if there is a way to reduce this time, because 1-3 seconds or a gateway timeout is not really ideal for production use.
You've got two issues:
The 1-3 second delay. This is expected and well-documented when using Lambda. As @Nick mentioned in the comments, the only way to prevent your container from going to sleep is to use it. You can use Lambda Scheduled Events to execute your function as often as every minute using the rate expression rate(1 minute). If you add some parameters to your function to help you distinguish between a real request and one of these ping requests, you can return immediately on the ping requests, and then you've worked around your problem. It will cost you more, but we're probably talking pennies per month, if anything. Lambda has a generous free tier.
The 30 second delay is unusual. I would definitely check your CloudWatch logs. If you see logs from when your function is working normally but no logs from when you see the 30 second timeout, then I would assume the problem is with API Gateway and not with Lambda. If you do see logs, then maybe they can help you troubleshoot. Another place to check is the AWS Status Page. I've sometimes seen Lambda functions time out and respond intermittently, and pulled my hair out only to realize that there was a problem on Amazon's end that they were working on.
Here's a blog post with additional information on Lambda Container Reuse that, while a little old, still has some good information.
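The ping workaround from the first point could be sketched roughly as follows. The warmup field is a hypothetical marker, assuming the scheduled rule is configured to send {"warmup": true} as its input; any field that real requests never contain would do:

```javascript
// Hypothetical marker: the scheduled rule is assumed to send {"warmup": true}.
function isWarmupPing(event) {
  return Boolean(event && event.warmup === true);
}

async function handler(event) {
  if (isWarmupPing(event)) {
    // Return before touching Redis or the database so the ping stays cheap.
    return { warmed: true };
  }

  // Placeholder for the real request handling.
  return { statusCode: 200, body: 'real response' };
}
```

A ping of this kind only keeps the container that receives it warm, so requests that arrive concurrently can still hit a cold start.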
I am currently working on a Node.js API deployed on AWS with Elastic Beanstalk.
The API accepts a URL with query parameters, saves the parameters in a DB (in my case AWS RDS), and redirects to a new URL without waiting for the DB response.
The main priority by far for the api is the redirection speed and the ability to handle a lot of requests. The aim of this question is to get your suggestions on how to do that.
I ran the api through a service called blitz.io to see what load it could handle and this is the report I got from them: https://www.dropbox.com/s/15wsa8ksj3lz99e/Blitz.pdf?dl=0
The instance and the database are running on t2.micro and db.t2.micro respectively.
The API can handle the load if no write is performed on the DB, but crashes above a certain load when it writes to the DB, even without waiting for the DB responses (the report I shared is for this case).
I checked the logs and found the following error in /var/log/nginx/error.log:
*1254 socket() failed (24: Too many open files) while connecting to upstream
I am not familiar with how nginx works but I imagine that every db connection is seen as an open file. Hence, the error implies that we reach the limit for open files before being able to close the connections. Is that a correct interpretation? Why am I getting the error?
I increased the limit in the way suggested here: https://forums.aws.amazon.com/thread.jspa?messageID=613983#613983 but it did not solve the problem.
At this point I am not sure what to do. Can I close the connections before getting a response from the db? Is it a hardware limitation? The writes to the db are tiny.
Thank you in advance for your help! :)
If you only modified ulimit, it might not be enough. You should also look at fs.file-max, the system-wide limit on file descriptors:
sysctl -w fs.file-max=100000
as explained here:
http://www.cyberciti.biz/faq/linux-increase-the-maximum-number-of-open-files/
Is there an option or a setting somewhere to control the timeout for an aws ec2 wait command?
Or the number of attempts or waiting period between attempts?
I want to be able to run aws ec2 wait instance-terminated for some instances that I quickly spin up to perform a few tasks and then terminate. It times out on some longer-running tasks with "Waiter InstanceTerminated failed: Max attempts exceeded".
I can't seem to find any info anywhere. I've grepped the cli source code, but my knowledge of Python is too limited for me to understand what's going on. I see there might be something in this test using maxAttempts and delay, but can't figure out how to leverage that from the cli.
So far my suboptimal solution is to sleep first, then start the wait.
There is no timeout option in the AWS CLI, but you can use the native timeout command from coreutils to do what you want.
timeout 10 aws ec2 wait instance-terminated
will abort if the command does not return within 10 seconds. On a timeout, it exits with status 124; otherwise it returns the exit status of the command.
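If the wait is driven from a Node.js script rather than the shell, another option is to poll yourself, which puts the number of attempts and the delay under your control. This is only a sketch: waitUntil is a hypothetical helper, the check function you pass in would have to query the instance state itself, and the defaults are illustrative.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls check() until it returns true, at most maxAttempts times,
// sleeping delayMs between attempts. Resolves with the attempt number.
async function waitUntil(check, { maxAttempts = 40, delayMs = 15000 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    if (await check()) return attempt;
    if (attempt < maxAttempts) await sleep(delayMs); // no sleep after the last try
  }
  throw new Error(`Max attempts exceeded (${maxAttempts})`);
}
```

The total wait is bounded by roughly maxAttempts × delayMs plus the time the checks themselves take.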
There's an open GitHub issue about adding configurable parameters: https://github.com/aws/aws-cli/issues/1295
You can also find some environment variables that you can define here:
https://docs.aws.amazon.com/cli/latest/topic/config-vars.html
One of them is AWS_MAX_ATTEMPTS (number of total requests).
But for my use case (restoring a DynamoDB table from a snapshot) it does not seem to work.