AWS - resize BMP on upload - node.js

TASK
I am trying to write an AWS Lambda function that, whenever a bitmap file is uploaded to my S3 bucket, reads that bitmap, resizes it to a preset size, and writes it back to the same bucket it was read from.
SCENARIO
My Ruby web app PUTs a bitmap file of about 8 MB and roughly 1920x1080 pixels to my AWS bucket.
Upon being uploaded, the image should be read by my Lambda function, resized to 350 x 350, and rewritten to the bucket under the same file name and key.
PROBLEM
I have no experience with Node.js, so I cannot properly write this function myself. Can anyone advise me on the steps to complete this task, or point me to a similar function that outputs a resized BMP file?

Image resizing is one of the reference use cases for Lambda. You can use the Serverless Image Resizer, which is a really robust solution, or an older version of it here.
There are literally dozens of open-source image manipulation projects you can find on GitHub. A very simple standalone Lambda that supports BMPs out of the box can be found here.
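For a sense of the moving parts, here is a minimal sketch of the read-resize-write-back flow. It is shown in Python with Pillow and boto3 purely for illustration; the linked projects do the equivalent in Node.js with an image library such as jimp, and the 350 x 350 target follows the question.

import io
import boto3
from PIL import Image  # Pillow reads and writes BMP out of the box

s3 = boto3.client("s3")
TARGET_SIZE = (350, 350)

def handler(event, context):
    # One record per object created by the S3 trigger
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        obj = s3.get_object(Bucket=bucket, Key=key)
        img = Image.open(io.BytesIO(obj["Body"].read()))

        out = io.BytesIO()
        img.resize(TARGET_SIZE).save(out, format="BMP")

        # Caution: writing back to the same key fires the trigger again unless
        # you filter events (prefix/suffix) or check the image size first.
        s3.put_object(Bucket=bucket, Key=key, Body=out.getvalue(),
                      ContentType="image/bmp")

Whatever library you end up with, keep the re-trigger caveat in mind: overwriting the object at the same key will invoke the function again unless the event filter or the code itself guards against it.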

Related

S3 bucket to bucket copy performance

I'm trying to copy some files from one bucket to another (same region) and getting a speed of around 315 MB/s. However, I'm running this in Lambda, which has a 15-minute timeout limit, so bigger files run into the timeout.
Below is the code snippet I'm using (in Python). Is there any other way I can speed it up? Any inputs are welcome.
s3_client = boto3.client(
    's3',
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key,
    aws_session_token=session_token,
    config=Config(signature_version='s3v4')
)
s3_client.copy(
    bucket_pair["input"],
    bucket_pair["output"]["Bucket"],
    bucket_pair["output"]["Key"]
)
I saw many posts about passing chunksize and similar options, but I don't see them in ALLOWED_COPY_ARGS. Thanks.
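For reference, those tuning knobs are not ExtraArgs at all (which is why they don't appear in ALLOWED_COPY_ARGS): boto3's managed copy takes them through a TransferConfig passed as the Config argument. A rough sketch, with purely illustrative values:

from boto3.s3.transfer import TransferConfig

transfer_config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,  # switch to multipart copy above 64 MB
    multipart_chunksize=64 * 1024 * 1024,  # 64 MB parts
    max_concurrency=20,                    # copy parts in parallel
)

s3_client.copy(
    bucket_pair["input"],
    bucket_pair["output"]["Bucket"],
    bucket_pair["output"]["Key"],
    Config=transfer_config,
)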
You can use a Step Functions state machine to iterate over all objects and copy them. To increase throughput you can use a Map state:
https://docs.aws.amazon.com/step-functions/latest/dg/amazon-states-language-map-state.html
If you don't want to use Step Functions, you can use one producer Lambda to write all object keys into an SQS queue and a consumer Lambda that copies them to the respective target (a rough sketch of this pattern is below).
A different option would be to use S3 object replication
https://docs.aws.amazon.com/AmazonS3/latest/userguide/replication.html
But I'm not sure whether that fits your use case.
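Here is a rough sketch of the SQS fan-out idea in Python/boto3. The queue URL and bucket names are hypothetical placeholders, and the consumer is assumed to be wired to the queue as an SQS event source.

import json
import boto3

s3 = boto3.client("s3")
sqs = boto3.client("sqs")

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/copy-queue"
SRC_BUCKET = "my-input-bucket"
DST_BUCKET = "my-output-bucket"

# Producer Lambda: list the source bucket and enqueue one message per object.
def producer_handler(event, context):
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=SRC_BUCKET):
        for obj in page.get("Contents", []):
            sqs.send_message(
                QueueUrl=QUEUE_URL,
                MessageBody=json.dumps({"Key": obj["Key"]}),
            )

# Consumer Lambda (SQS trigger): copy one object per message, so each
# invocation stays well under the 15-minute limit.
def consumer_handler(event, context):
    for record in event["Records"]:
        key = json.loads(record["body"])["Key"]
        s3.copy({"Bucket": SRC_BUCKET, "Key": key}, DST_BUCKET, key)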

AWS MediaConvert creates new job for each file?

I am working on AWS MediaConvert and trying to create a Node.js API that converts .mp4 files to .wav format.
The API is working correctly; however, it creates a new job for each individual .mp4 file.
Is it possible to have one MediaConvert job and use that for every file in the input_bucket instead of creating a new job for every file?
I have tried going through the AWS MediaConvert documentation and various online articles, but I have not been able to find an answer to my question.
I have tried to implement my API in the following steps:
1. Create an object of class AWS.MediaConvert()
2. Create a job template using MediaConvert.createJobTemplate
3. Create a job using MediaConvert.createJob
There is generally a 1 : 1 relationship between inputs and jobs in MediaConvert.
A MediaConvert job reads an input video from S3 (or HTTP server) and converts the video to output groups that in turn can have multiple outputs. A single media convert job can create multiple versions of the input video in different codecs and packages.
The exception to this is when you want to join more than one input file into a single asset (input stitching).
In this case you can have up to 150 inputs in your job. AWS Elemental MediaConvert subsequently creates outputs by concatenating the inputs in the order that you specify them in the job.
Your question does, however, suggest that input stitching is not what you are looking to achieve. Rather, you are looking to transcode multiple inputs from the source bucket.
If so, you would need to create a job for each input.
Job Templates (as well as Output Presets) work to speed up your job setup by providing groups of recommended transcoding settings. Job templates apply to an entire transcoding job whereas output presets apply to a single output of a transcoding job.
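In practice, the per-file job creation is a small call once a template carries the shared settings. A rough sketch in Python/boto3 (the account-specific endpoint URL, role ARN, template name, and S3 URI below are hypothetical placeholders):

import boto3

# MediaConvert needs the account-specific endpoint (see describe_endpoints).
mediaconvert = boto3.client(
    "mediaconvert",
    endpoint_url="https://abcd1234.mediaconvert.us-east-1.amazonaws.com",
)

ROLE_ARN = "arn:aws:iam::123456789012:role/MediaConvertRole"
TEMPLATE_NAME = "mp4-to-wav-template"

def submit_job(input_uri):
    # The template holds the shared output settings; only the input changes per job.
    return mediaconvert.create_job(
        Role=ROLE_ARN,
        JobTemplate=TEMPLATE_NAME,
        Settings={
            "Inputs": [{
                "FileInput": input_uri,
                "AudioSelectors": {"Audio Selector 1": {"DefaultSelection": "DEFAULT"}},
            }]
        },
    )

submit_job("s3://my-input-bucket/videos/example.mp4")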
References:
Step 1: Specify your input files : https://docs.aws.amazon.com/mediaconvert/latest/ug/specify-input-settings.html
Assembling multiple inputs and input clips with AWS Elemental MediaConvert : https://docs.aws.amazon.com/mediaconvert/latest/ug/assembling-multiple-inputs-and-input-clips.html
Working with AWS Elemental MediaConvert job templates : https://docs.aws.amazon.com/mediaconvert/latest/ug/working-with-job-templates.html
Working with AWS Elemental MediaConvert output presets : https://docs.aws.amazon.com/mediaconvert/latest/ug/working-with-presets.html

AWS Lambda: how to give ffmpeg large files?

Scenario:
Using AWS Lambda (Node.js), I want to process large files from S3 ( > 1GB).
The /tmp fs limit of 512MB means that I can't copy the S3 input there.
I can certainly increase the Lambda memory space, in order to read in the files.
Do I pass the memory buffer to ffmpeg? (node.js, how?)
Or....should I just make an EFS mount point and use that as the transcoding scratchpad?
You can just use the HTTP(s) protocol as input for ffmpeg.
Lambda has a max 10 GB memory limit, and data transfer speed from S3 was around 300 MB per second the last time I tested. So if your videos are at most 1 GB and you are not doing memory-intensive transformations, this approach should work fine:
ffmpeg -i "https://public-qk.s3.ap-southeast-1.amazonaws.com/sample.mp4" -ss 00:00:10 -vframes 1 -f image2 "image%03d.jpg"
ffmpeg works on files, so an alternative might be to set up a Unix pipe and then read that pipe with ffmpeg, constantly feeding it with the S3 stream.
But maybe you'd want to consider running this as an ECS task instead; you wouldn't have the time constraint, nor the same storage constraint. Cold start on Fargate would be 1-2 minutes though, which may not be acceptable.
Lambda now supports up to 10 GB of ephemeral storage:
https://aws.amazon.com/blogs/aws/aws-lambda-now-supports-up-to-10-gb-ephemeral-storage/
You can update it with the CLI:
$ aws lambda update-function-configuration --function-name PDFGenerator --ephemeral-storage '{"Size": 10240}'

How to force exceptions and errors to print in AWS Lambda when something is forcing the function to run twice?

I have an AWS Lambda function (written in Python 3.7) that is triggered when a specific JSON file is uploaded from a server to an S3 bucket. Currently I have the Lambda trigger set on a PUT request with the specific suffix of the file.
The issue is that the Lambda function runs twice every time the JSON file is uploaded once to the S3 bucket. I confirmed via CloudWatch that the additional runs are roughly 10 seconds to 1 minute apart and each run has a unique request ID.
To troubleshoot, I confirmed that the JSON input is coming from one bucket and the outputs are being written to a completely separate bucket. I silenced all warnings coming from pandas and do not see any errors from the code popping up in CloudWatch. I have also changed the retry attempts from 2 to 0.
The function also reports the following metrics when it runs, with a timeout set at 40 seconds and memory size set to 1920 MB, so there should be enough time and memory for the function:
Duration: 1216.03 ms Billed Duration: 1300 ms Memory Size: 1920 MB Max Memory Used: 164 MB
I am at a loss as to what I am doing wrong.
How can I force AWS Lambda to display the issues or errors it is encountering, in my Python code or wherever else the issue is occurring, that are causing the function to run multiple times?
The issue was that my code was throwing an error, but for some reason CloudWatch was not showing it.
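Since S3 invokes the function asynchronously and failed asynchronous invocations are retried, an unhandled exception is typically what produces the extra runs. One way to make sure the error always shows up in CloudWatch is to catch, log, and re-raise at the top of the handler. A minimal sketch, where process() stands in for the real work:

import logging
import traceback

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    try:
        return process(event)  # placeholder for the actual processing
    except Exception:
        # Log the full traceback explicitly so it always lands in CloudWatch,
        # then re-raise so the invocation is still marked as failed.
        logger.error("Unhandled exception:\n%s", traceback.format_exc())
        raise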

Image files after upload are larger than they should be

I am uploading some files to Amazon S3 (through the aws-sdk library in Node.js). When it comes to an image file, it looks like it is much bigger on S3 than the body.length printed in Node.js.
E.g. I've got a file with a body.length of 7050103, but the S3 browser shows that it is:
Size: 8,38 MB (8789522 bytes)
I know that there is some metadata here, but what metadata could take more than 1 MB?
What is the source of such a big difference? Is there a way to find out what size it would be on S3 before sending the file?
I uploaded the file via the S3 console and in that case there was actually no difference in size. I found out that the problem was in using the lwip library for rotating the image. I had a bug: I rotated even if the angle was 0, so I was rotating by 0 degrees. After such a rotation the image was bigger. I think the re-compression to JPEG happens at a different quality or something.
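The fix, in other words, is to skip the rotate/re-encode path entirely when the angle is zero. A rough illustration of that guard, shown with Pillow rather than the Node.js lwip library the original code used:

from PIL import Image

def rotate_if_needed(path, angle):
    if angle % 360 == 0:
        return  # leave the original bytes untouched: no re-encode, no size change
    img = Image.open(path)
    # Re-encoding happens only when a rotation is actually applied.
    img.rotate(angle, expand=True).save(path, quality=90)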
