Storing node modules in an S3 bucket for AWS Lambda - node.js

I have developed a Node.js-based function and want to run it on AWS Lambda. The problem is that the package size is greater than 50 MB, and AWS Lambda requires directly uploaded function code to be under 50 MB.
In my case the node modules account for about 43 MB and the actual code is around 7 MB. So is there any way I can separate my node modules from the code? Maybe we could store the node modules in an S3 bucket and then access them from AWS Lambda? Any suggestions would be helpful. Thanks.
P.S.: Due to some dependency issues I can't run this function as a Docker image on Lambda.

If you do not want to, or cannot, use Docker packaging, you can zip up your node_modules into an S3 bucket.
Your handler (or the module containing your handler) can then download the zip archive and extract the files to /tmp. Then you require() your modules from there.
The above description may not be 100% accurate, as there are many ways of doing it, but that's the general idea.
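Very roughly, a minimal sketch of that idea in Node.js. The bucket name, key, and module name below are placeholders, and it assumes a small zip library such as adm-zip is bundled with the (much smaller) handler code, alongside the aws-sdk that the Node.js Lambda runtime already provides:
const fs = require('fs');
const path = require('path');
const AWS = require('aws-sdk');      // provided by the Lambda runtime
const AdmZip = require('adm-zip');   // tiny, bundled with the function code

const s3 = new AWS.S3();
const DEPS_DIR = '/tmp/deps';

let depsReady = null;

function ensureDependencies() {
  if (!depsReady) {
    depsReady = s3
      .getObject({ Bucket: 'my-artifacts-bucket', Key: 'node_modules.zip' }) // placeholder names
      .promise()
      .then((obj) => {
        // Extract the archive to /tmp and let require() resolve modules from there.
        new AdmZip(obj.Body).extractAllTo(DEPS_DIR, true);
        module.paths.push(path.join(DEPS_DIR, 'node_modules'));
      });
  }
  return depsReady;
}

exports.handler = async (event) => {
  await ensureDependencies();                  // download + unzip only on cold start
  const heavyLib = require('some-heavy-lib');  // placeholder module name
  return heavyLib.process(event);
};
The download and extraction only happen on a cold start; warm invocations reuse whatever is already sitting in /tmp.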
This is one deployment method that zappa, a tool for deploying Python/Django apps to AWS Lambda, supported since long before Docker containers were allowed in Lambda.
https://github.com/Miserlou/Zappa/pull/548

You may use Lambda layers, which are a perfect fit for your use case. Some time ago we needed the Facebook SDK for one of our projects; we created a Lambda layer for it (32 MB), and the deployment package itself shrank to only 4 KB.
The documentation states:
Using layers can make it faster to deploy applications with the AWS Serverless Application Model (AWS SAM) or the Serverless framework. By moving runtime dependencies from your function code to a layer, this can help reduce the overall size of the archive uploaded during a deployment.
A single Lambda function can use up to five layers. The maximum total unzipped size of the function and all layers is 250 MB, which is well above your current size.
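For a Node.js layer, the dependencies must sit under a nodejs/ folder inside the layer zip so they end up at /opt/nodejs/node_modules and are picked up by require(). A rough sketch with the AWS CLI, using placeholder names (the layer ARN comes back from the publish call):
zip -r layer.zip nodejs/node_modules
aws lambda publish-layer-version --layer-name my-node-deps --zip-file fileb://layer.zip --compatible-runtimes nodejs18.x
aws lambda update-function-configuration --function-name my-function --layers arn:aws:lambda:us-east-1:123456789012:layer:my-node-deps:1
After that, the function code requires the modules by name as usual; nothing changes except that they no longer ship inside your deployment zip.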

Related

How to Load a dependency from EFS into lambda

I have recently created an EFS instance for my Lambda in order to host my project dependencies, since they exceed the 250 MB hard cap. I managed to get my file system and EC2 instance up and running with the appropriate permissions, and I also configured my Lambda to use the EFS. Now the only part I am confused about:
How do I import these dependencies from EFS into my Lambda code?
Do I use require() with an absolute path to the module?
I have only found tutorials that do this in Python.
As Ervin said in the comments, using Docker was the way to go about this.
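For reference, if you do stay on EFS rather than Docker: yes, require() with an absolute path under the mount point works (Lambda EFS mounts live under /mnt/...). A sketch, assuming a mount path of /mnt/efs with dependencies installed into /mnt/efs/node_modules, and placeholder module names:
const path = require('path');

// Option 1: require a module by absolute path on the EFS mount.
const lodash = require('/mnt/efs/node_modules/lodash');

// Option 2: add the EFS node_modules folder to this module's lookup paths,
// then require modules by name as usual.
module.paths.push(path.join('/mnt/efs', 'node_modules'));
const moment = require('moment');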

AWS Lambda with Node - saving files into Lambda's file system

I need to save files I get from S3 into a Lambda's file system, and I wanted to know if I can do that simply using fs.writeFileSync?
Or do I still have to use the context function as described here:
How to Write and Read files to Lambda-AWS with Node.js
(I tried to find newer examples, but could not.)
What is the recommended method?
Please advise.
Yes, you can use the typical fs functions to read/write from local disk, but be aware that writing is limited to the /tmp directory and the default max diskspace available to your Lambda function in that location is 512 MB. Also note that files written there may persist to the next (warm) Lambda invocation.
If you want to simply download an object from S3 to the local disk (assuming it will fit in the available diskspace) then you can combine AWS SDK methods and Node.js streaming to stream the content to disk.
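A minimal sketch of that streaming download, assuming the AWS SDK v2 preinstalled in the Node.js runtimes and placeholder bucket/key/file names:
const fs = require('fs');
const AWS = require('aws-sdk');

const s3 = new AWS.S3();

// Stream an S3 object straight to a file under /tmp.
function downloadToTmp(bucket, key, localPath) {
  return new Promise((resolve, reject) => {
    s3.getObject({ Bucket: bucket, Key: key })
      .createReadStream()
      .on('error', reject)
      .pipe(fs.createWriteStream(localPath))
      .on('error', reject)
      .on('close', () => resolve(localPath));
  });
}

exports.handler = async () => {
  const file = await downloadToTmp('my-bucket', 'reports/input.csv', '/tmp/input.csv');
  const contents = fs.readFileSync(file, 'utf8'); // ordinary fs calls work under /tmp
  return { bytes: contents.length };
};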
Also, it's worth noting that, depending on your app, you may be able to process the entire S3 object in RAM via streaming, without any need to actually persist to disk. This is helpful if your object size is over 512MB.
Update: as of March 2022, you can now configure Lambda functions with more ephemeral storage in /tmp, up to 10 GB. You get 512 MB included with your Lambda function invocation and are charged for the additional configured storage above 512 MB.
If you need to persist very large files, consider using Elastic File System.
Lambda does not provide persistent access to a local file system; it is meant to be an ephemeral environment. You do get access to the /tmp folder, but only up to 512 MB by default. If you want storage to go along with your function, you will need to use AWS S3 or AWS EFS.
Here's an article from AWS explaining this.
Here are the docs on adding storage to Lambda.

Is there a way to run a Deep Learning model locally with the data on AWS S3?

I am trying to implement a neural network using TensorFlow, with the dataset organized into different folders (each folder represents a class). I would like to know if there's a way to use the data from S3 and run the deep learning model on my local machine.
I have all the files on S3 but am unable to bring them to the local machine.
P.S I'm using Python version 3.5
As of now, no deep learning framework supports fetching data from S3 and training on it directly, maybe because of S3 pricing.
However, you can mount S3 on your local system:
S3-Fuse - https://github.com/s3fs-fuse/s3fs-fuse
S3Fs - https://fs-s3fs.readthedocs.io/en/latest/
Please note, for every read/write you will be billed according to AWS S3 pricing: https://aws.amazon.com/s3/pricing/
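A minimal s3fs-fuse mount looks roughly like this (assuming credentials stored in ~/.passwd-s3fs and a local mount point you create yourself; the bucket name is a placeholder):
mkdir -p /mnt/mybucket
s3fs mybucket /mnt/mybucket -o passwd_file=${HOME}/.passwd-s3fs
# ... read training data from folders under /mnt/mybucket ...
umount /mnt/mybucket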
TensorFlow supports this (though I think not in the nightly builds); see the documentation.
Assuming you have configured the credentials as described (e.g. $HOME/.aws/credentials or with environment variables), you have to use URLs with s3 as protocol like
s3://mybucket/some/path/words.tsv
If you read or write files in your own code, be sure not to use plain Python I/O but TensorFlow's tf.io.gfile.GFile. Similarly, to list directories use e.g. tf.io.gfile.walk or tf.io.gfile.listdir.
Of the environment variables in the documentation, we only set AWS_REGION, but in addition the following ones are useful to control logging and avoid timeouts:
export AWS_LOG_LEVEL=3
export S3_REQUEST_TIMEOUT_MSEC=600000
Still, reading training data from S3 is usually only a good idea if you run your training on AWS. For running locally, it is usually better to copy the data to your local drive, e.g. with the AWS CLI's sync command.
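For example, with the AWS CLI (bucket and local paths are placeholders):
aws s3 sync s3://mybucket/training-data ./training-data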

Can aws-sdk be a development dependency when developing lambdas with NodeJS?

I am quite new to developing Lambdas with Node.js, so this question might sound silly.
One of the limitations of Lambdas is the size of the function plus dependencies (250 MB), and I was wondering if aws-sdk (which is >45 MB) can be treated as a dev dependency, since it occupies about 1/5 of the total size allowed for a Lambda.
I understand that this is required during development, but is it not the case that this already exists in the lambda container once deployed to AWS?
Any suggestion would help as all the articles that I browsed seem to install it as a prod dependency.
Absolutely; the aws-sdk is available by default as an npm dependency inside the Lambda containers, so if you leave it as a development dependency your code will still work inside Lambda.
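In practice that means installing it with npm install --save-dev aws-sdk, so it lands in devDependencies and is excluded from a production-only packaging step, while the function code keeps requiring it as usual:
// The deployment package does not need to contain aws-sdk; Lambda's Node.js
// runtime provides its own copy, which this require() resolves to at runtime.
const AWS = require('aws-sdk');
const s3 = new AWS.S3();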
Here you can see which Lambda runtimes ship which version of the AWS SDK. So in case you really need a specific version, or one that's not yet loaded onto the Lambda containers, you can manually include your own.

Create pkg file in lambda

I'm quite new to AWS S3 and Lambda. Using Node.js 4.3 with Lambda, is it possible to pull in multiple files from an S3 bucket and compile them into a single OS X flat package (.pkg)?
If you can find a library to do it within Lambda's hard timeout (300 seconds at the time this was asked; now up to 900 seconds), sure.
