Is AWS Lambda using Elastic IP? - node.js

First my question: are AWS Lambda "instances" using EIP?
My background:
I'm using lambda as solution to reduce my application load of certain task(download youtube videos).
In the past I was having problems trying to do this very thing in my ec2 instances, in which I used them with EIP, which always returned limit exceed message, and prompted human captcha verification. I solved this at that time by using the instances without EIP and worked like a charm.
Now using lambda for certain videos it throws me Error: Code 150: The uploader has not made this video available in your country. and I double checked that the video was not blocked for US, and it wasn't. So I decided to go back and test with an instance with EIP, and that was it, the same message that was been returned in my lambda function.
It seems to be a change from youtube, because around 3-4 months ago the error when using EIP was limit exceed, but now it turned to country blocked issue. So it's like lambda uses EIP or alike service which youtube doesn't seems to like.
PS: I'm running my lambda function with nodejs and download the videos with ytdl-core btw.
PS2: I asked this very question in aws forums but no luck so far in a week or so. So I decided to try asking here.
Thanks in advance

AWS Lambda is not the same an as EC2 instance. It runs on containers within the AWS infrastructure. Traffic would "appear" to be coming from certain IP addresses, but there is no way to configure which IP address is used.
It is possible that the range of "IP addresses from which Lambda appears to come" is not correctly updated in the geo-database used by the video service, and it thinks they are located in a different location.
Bottom line: There is nothing you can configure.

Related

AWS EC2 boots via scheduled Lambda, how to alert of errors?

My EC2 instance boots daily for 5 minutes before shutting down.
On bootup, a NodeJS script is executed. Usually this script will complete long before the 5 minutes are up, but I'd like to be notified (SMS/email) whenever it doesn't.
What is the correct approach? I can try to send a notification within my NodeJS code after 5 minutes if execution wasn't finished, but Lambda could shut down the instance before this occurs.
I'm quite new to AWS so I apologize if this is rather basic, I haven't had luck on Google with this issue.
Can you check if whatever Node script is doing when EC2 instance is up could be replicated with one or more lambda functions.
Think about serverless and microservices architecture. Theoretically any workflow which need servers could be achived via AWS Lambda functions and various triggers. In you case I can think of the following:
SES to send out email messages
API gateway to expose your Lambda function for trigger
Cloud watch events to trigger lambda function like a cronjob.
I would be surprise to learn if Serverless won't work here. Please do share the case so that I can brainstorm more and share a solution.

Scan files in AWS S3 bucket for virus using lambda

We've a requirement to scan the files uploaded by the user and check if it has virus and then tag it as infected. I checked few blogs and other stackoverflow answers and got to know that we can use calmscan for the same.
However, I'm confused on what should be the path for virus scan in clamscan config. Also, is there tutorial that I can refer to. Our application backend is in Node.js.
I'm open to other libraries/services as well
Hard to say without further info (i.e the architecture your code runs on, etc).
I would say the easiest possible way to achieve what you want is to hook up a trigger on every PUT event on your S3 Bucket. I have never used any virus scan tool, but I believe that all of them run as a daemon within a server, so you could subscribe an SQS Queue to your S3 Bucket event and have a server (which could be an EC2 instance or an ECS task) with a virus scan tool installed poll the SQS queue for new messages.
Once the message is processed and a vulnerability is detected, you could simply invoke the putObjectTagging API on the malicious object.
We have been doing something similar, but in our case, its before the file storing in S3. Which is OK, I think, solution would still works for you.
We have one EC2 instance where we have installed the clamav. Then written a web-service that accepts Multi-part file and take that file content and internally invokes ClamAv command for scanning that file. In response that service returns whether the file is Infected or not.
Your solution, could be,
Create a web-service as mentioned above and host it on EC2(lets call it, virus scan service).
On Lambda function, call the virus scan service by passing the content.
Based on the Virus Scan service response, tag your S3 file appropriately.
If your open for paid service too, then in above the steps, #1 won't be applicable, replace the just the call the Virus-Scan service of Symantec or other such providers etc.
I hope it helps.
You can check this solution by AWS, it will give you an idea of a similar architecture: https://aws.amazon.com/blogs/developer/virus-scan-s3-buckets-with-a-serverless-clamav-based-cdk-construct/

Lambda starts timing out randomly when communicating with DynamoDB

I have a Node.js Lambda code base that talks to tiny dataset in DynamoDB (less than 400 byte each). Every now and then the function will time out over 5 minutes whilst doing a get() request to DynamoDB (via dynamoDbdAWS.DynamoDB.DocumentClient();).
The problem is it's completely random as to when this issue will occur but when it works it take ~2 second from a cold start, so taking over 5 minutes to run makes no sense and at random points.
It's a dev environment so only myself is using this, and I'm doing maybe 10 requests a day
context.callbackWaitsForEmptyEventLoop = false; has been set
Memory allocation never exceeds 45MB (128MB set)
I'm testing directly in Lambda
The code is deployed via Serverless
When testing, using Serverless, locally it works whilst the Lambda fails
I've inherited this project but have a good understanding of the architecture around it and it's fairly simple but I've not done much work with Lambda before.
Any ideas what I should look for or any known issues will be a massive help.
It sounds like one (or more) of the VPC subnets the Lambda function is configured to run in doesn't have a route to a NAT Gateway (or an AWS PrivateLink configuration). So whenever that subnet is used by the Lambda function it is unable to access the AWS API.
If the Lambda function doesn't actually need to access any resources in the VPC then it is much better to not configure it to use the VPC at all.

Connecting Lambda to both RDS and S3

I have a Node.js task that converts values from my database to MP3 files, then uploads them to s3 storage. The code works beautifully when executing it on my laptop. I decided to migrate it to Lambda so I can run it automatically every couple hours. I made a few minor modifications, and again, it works great. But here's the catch: it's only working when my RDS instance is set to allow connections from ANY IP. Obviously, that's an unacceptable security risk.
I put my database and Lambda code in the same VPC and security group, but even so, my code wouldn't connect to S3. Then, I added an endpoint for S3, and it looked like everything was working per my console logs. However, the file in S3 storage is empty (0 bytes).
What do I need to change? I've heard that I might need to configure my VPC to have internet access, but I'm not sure if that's what I need to do. And honestly, those tutorials seem confusing to me.
Can someone point me in the right direction?
It is a known problem (known by users, not really acknowledged by AWS that I've seen) The lambda vps docs say:
http://docs.aws.amazon.com/lambda/latest/dg/vpc.html
"When a Lambda function is configured to run within a VPC, it incurs
an additional ENI start-up penalty. This means address resolution may
be delayed when trying to connect to network resources."
And
"If your Lambda function accesses a VPC, you must make sure that your
VPC has sufficient ENI capacity to support the scale requirements of
your Lambda function. "
Source: https://forums.aws.amazon.com/thread.jspa?messageID=767285
This means it has serious drawbacks that make it unworkable:
speed penalty
have to manually setup scaling
have to pay for NAT gateway 0.059 per hour (https://aws.amazon.com/vpc/pricing/)

How to optimize AWS Lambda?

I'm currently building web API using AWS Lambda with Serverless Framework.
In my lambda functions, each of them connects to Redis (elasticache) and RDB (Aurora, RDS) or DynamoDB to retrieve data or write new data.
And all my lambda functions are running in my VPC.
Everything works fine except that when a lambda function is first executed or executed a while after last execution, it takes quite a long time (1-3 seconds) to execute the lambda function, or sometimes it even respond with a gateway timeout error (around 30 seconds), even though my lambda functions are configured to 60 seconds timeout.
As stated in here, I assume 1-3 seconds is for initializing a new container. However, I wonder if there is a way to reduce this time, because 1-3 seconds or gateway timeout is not really an ideal for production use.
You've go two issues:
The 1-3 second delay. This is expected and well-documented when using Lambda. As #Nick mentioned in the comments, the only way to prevent your container from going to sleep is using it. You can use Lambda Scheduled Events to execute your function as often as every minute using a rate expression rate(1 minute). If you add some parameters to your function to help you distinguish between a real request and one of these ping requests you can immediately return on the ping requests and then you've worked around your problem. It will cost you more, but we're probably talking pennies per month if anything. Lambda has a generous free tier.
The 30 second delay is unusual. I would definitely check your CloudWatch logs. If you see logs from when your function is working normally but no logs from when you see the 30 second timeout then I would assume the problem is with API Gateway and not with Lambda. If you do see logs then maybe they can help you troubleshoot. Another place to check is the AWS Status Page. I've seen sometimes where Lambda functions timeout and respond intermittently and I pull my hair out only to realize that there's a problem on Amazon's end and they're working on it.
Here's a blog post with additional information on Lambda Container Reuse that, while a little old, still has some good information.

Resources