How to Load a dependency from EFS into lambda - node.js

I have recently created an EFS instance for my lambda in order to host my project dependencies since they exceed the 250MB hard cap. I Managed to get my File system and EC2 up and running with the appropriate permission. I also configured my lamda to use the EFS. Now the only part i am confused about :
How to i import these dependencies from EFS into my lamda code.
Do i use require() with absolute path to the module?
Only found tutorials to do it in Python

As Ervin Said in the comments, Using Docker was the way to go about this

Related

Running CI/CD pipelines in AWS Lambda

As a project, I am trying to create a CI/CD pipeline running inside an AWS Lambda application.
The problem I am facing is that AWS Lambda is missing some tools (for example xargs) that certain applications (for example Gradle) require to run properly:
/tmp/repo/gradlew: line 234: xargs: command not found
Or even more interestingly:
install: apt-get: command not found
How can I install the required tools to build the applications from within an AWS Lambda container?
How can I utilize layers to speed up those containers?
Aka, I assume I need to register that certain cli tools are present in mounted layers.
On windows, I would do this by (ab)using the PATH environment variable, but what is the recommended way to do this in Linux?
And how can I tell tools to look for their dependencies in those layers? to avoid errors like:
ld.gold: error: cannot find -lcurl
The best option as far as I can tell is to create a Docker image containing all the software that you require and provide this to the AWS Lambda service.
There is extensive documentation how to run Docker containers in AWS Lambda:
https://docs.aws.amazon.com/lambda/latest/dg/images-create.html
Personal note: While I like the idea of a challenge or proof-of-concept I'd recommend using one of the many CI/CD services out there instead of building one on your own. I can not think of any upside of this. AWS itself offers CI/CD solutions like AWS CodePipeline etc.
You might want to have a look at the following documentation:
https://aws.amazon.com/getting-started/hands-on/set-up-ci-cd-pipeline/

Storing node module on S3 bucket for AWS Lambda

I have developed a nodejs based function/program and want to run it on AWS Lambda. The problem is that the size is greater than 50MB and AWS Lambda supports direct function code to be under 50MB.
Mainly on my code the node module are of 43MB and the actual code is around 7MB. So is there any way I can separate my node module from code, May be if we can store the node modules in S3 bucket and then access it on AWS Lambda? Any suggestions would be helpful. Thanks
P.S: Due to some dependencies issues I cant run this function as a Docker image on Lambda.
If you do not want or cannot use Docker packaging, you can zip up your node_modules into an S3 bucket.
Your handler (or the module containing your handler), can then download the zip archive and extract files to /tmp. Then, you require() your modules from there.
The above description make not be 100% accurate as there are many ways of doing it. But that's the general idea.
This is one deployment method that zappa, a tool for deploying Python/Django apps to AWS Lambda, has supported long before docker containers were allowed in Lambda.
https://github.com/Miserlou/Zappa/pull/548
You may use lambda layers which is a perfect fit for your use case. Sometime ago, we need to use facebook sdk for one of our project and we created a lambda layer for the facebook sdk(32 mb) and then the deployment package became only 4 KB.
It is stated as
Using layers can make it faster to deploy applications with the AWS Serverless Application Model (AWS SAM) or the Serverless framework. By moving runtime dependencies from your function code to a layer, this can help reduce the overall size of the archive uploaded during a deployment.
Single Lambda function can use up to five layers. The maximum size of the total unzipped function and all layers is 250 MB which is far beyond your limits.

Is there a way to run a Deep Learning model locally with the data on AWS S3?

I am trying to implement a Neutral Network using Tensorflow with the dataset categorized into different folders (Each folders represent each class). I would like to know if there's a way to use the data from S3 and run the Deep Learning model in the local machine.
I have all the files on S3 but am unable to bring it to the local machine.
P.S I'm using Python version 3.5
As of now, no deep learning framework supports fetching data from s3 and train, maybe because of s3 pricing.
However you can mount S3 on your local system
S3-Fuse - https://github.com/s3fs-fuse/s3fs-fuse
S3Fs - https://fs-s3fs.readthedocs.io/en/latest/
Please not, for every read / write you will be billed according to aws s3 pricing, https://aws.amazon.com/s3/pricing/
Tensorflow supports this (but I think not in the nightly builds), see documentation.
Assuming you have configured the credentials as described (e.g. $HOME/.aws/credentials or with environment variables), you have to use URLs with s3 as protocol like
s3://mybucket/some/path/words.tsv
If you read or write files in your own code, be sure not to use any python IO but Tensorflow's tf.io.gfile.GFile. Similar, to list directories use e.g. tf.io.gfile.walk or tf.io.gfile.listdir
From the environment variables in the documentation, we only set AWS_REGION, but in addition the following ones are useful to control logging and avoid timeouts:
export AWS_LOG_LEVEL=3
export S3_REQUEST_TIMEOUT_MSEC=600000
Still, reading training data from s3 is usually only a good idea if you run your training on AWS. For running locally, it is usually better to copy the data to your local drive, e.g. with AWS CLI's sync command.

Can aws-sdk be a development dependency when developing lambdas with NodeJS?

I am quite new in trying to develop lambdas with NodeJs, so this question might sound silly.
One of the limitations of lambdas is the size of the function / dependencies (250 MB) and I was wondering if aws-sdk (which has >45 MB)can be treated as a dev-dependency since it occupies 1/5 of the total size of a lambda.
I understand that this is required during development, but is it not the case that this already exists in the lambda container once deployed to AWS?
Any suggestion would help as all the articles that I browsed seem to install it as a prod dependency.
Absolutely, the aws-sdk is available by default as an NPM dependency inside of the lambda containers so if you leave it as a development dependency your code will still work inside of lambda.
Here you can see which lambda containers contain which version of the AWS SDK. So in case you really need a specific version or one that's not yet loaded onto the lambda containers, you can manually include your own.

How is it possible to debug an AWS Lambda function from remote?

We are taking over a whole application from another company, and they have built the whole pipeline for deploying, but we still don't have access to it. What we know, that there's a lambda function is running triggered by certain SNS messages, and all the code is in Node.js, and the development is in VS Code. We also have issues debugging it locally, but it's a bigger problem, that we need to debug it remotely.
Since I am new in AWS services, I'd really appreciate if somebody could help me in this.
Does it necessary to open a port? How is it possible to connect to a lambda? Do we need serverless to setup? Many unresolved questions.
I don't think there is way you can debug a lambda function remotely. Your best bet is to download the code on local machine, setup the env variables as you have set up on your lambda function and take it from there.
Remember at the end of the day lambda is just a container which is running the code for you. AWS doesn't allow any ssh or connection with those container. In your case you should be able to debug it on local till you have the same env variables. There are other things as well which are lambda specific but considering it is a running code which you have got so you should be able to find out the issue.
Hope it makes sense.
Thundra (https://www.thundra.io/aws-lambda-debugger) has live/remote debugging support for AWS Lambda through its native IDE plugins (VSCode and IntelliJ IDEA).
The way AWS have you 'remote' debug is to execute the lambda locally through Docker as it proxies the requests to the cloud for you, using AWS Toolkit. You have a lambda running on your local computer via docker that can access resources on the cloud, such as databases, api's etc. You can step through debug them using editors like vscode.
I use SAM with a template.yaml . This way, I can pass event data to the handler, reference dependency layers (shared code libraries) and have a deployment manifest to create a Cloudformation stack (release instance with history and resource management).
Debugging can be a bit slow as it compiles, deploys to Docker and invokes, but allows step through debugging and variable inspection.
https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-using-debugging.html
While far from ideal, any console-printing actions would likely get logged to CloudWatch, which you could then access to go through printed data.
For local debugging, there are many Github projects with Dockerfiles which which you can build a docker container locally, just like AWS does when your Lambda is invoked.

Resources