How to read a file from form data in an AWS Lambda function - Node.js

I am trying to send a PDF or text file to an AWS Lambda function (Node.js) and process the file inside that function. I know the best practice is often to use a trigger from an S3 bucket, but I'd like to send this Lambda function a file from form data, extract information from the file, and then return the extracted info.
I have been able to Base64-encode the file and send it to AWS Lambda via JSON, but when I try to decode the file (especially a PDF) in the Lambda function, it is often corrupted or empty.
Image files seem to work well with this type of encoding, but I've been unsuccessful with PDFs. Any help is greatly appreciated. Rather than encoding in Base64, is there a way I can obtain the file from form data? My code is below:
export const handler = async (event) => {
    console.log("event", event);
    var converted = atob(event.body); // RATHER, HOW WOULD I READ A FILE FROM FORM DATA?
    const response = {
        "statusCode": 200,
        "headers": {
            "Content-Type": "text/html", // "application/pdf", // "multipart/form-data",
            "Access-Control-Allow-Origin": "*",
            "Access-Control-Allow-Headers": "Content-Type,X-Amz-Date,Authorization,X-Api-Key,X-Amz-Security-Token"
        },
        "body": event,
        "isBase64Encoded": true
    };
    return response;
};
Thanks so much.

I assume that you are using an API Gateway to trigger the Lambda function. In that case, you have to enable multipart/form-data in API Gateway as described in the documentation. The gist of the documentation is as follows:
1. In the Settings pane, choose Add Binary Media Type in the Binary Media Types section. Type a required media type, for example image/png, in the input text field.
2. Add Content-Type and Accept to the request headers for your proxy method.
3. Add those same headers to the integration request headers.
4. Deploy the API.
PS: If you are using Lambda proxy integration ({proxy+}), just steps 1 and 4 are enough.
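Once multipart/form-data is registered as a binary media type, the Lambda receives the whole multipart body Base64-encoded in event.body, and you can feed it to a streaming parser. Here is a minimal sketch, assuming the npm package busboy (v1.x) is bundled with the function; the handler shape and response fields are illustrative, not part of the original answer:
const busboy = require('busboy');
exports.handler = (event) =>
    new Promise((resolve, reject) => {
        // API Gateway delivers binary media types Base64-encoded
        const raw = Buffer.from(event.body, event.isBase64Encoded ? 'base64' : 'utf8');
        const contentType = event.headers['content-type'] || event.headers['Content-Type'];
        const bb = busboy({ headers: { 'content-type': contentType } });
        const files = [];
        bb.on('file', (name, stream, info) => {
            const chunks = [];
            stream.on('data', (chunk) => chunks.push(chunk));
            stream.on('end', () => files.push({ name, filename: info.filename, data: Buffer.concat(chunks) }));
        });
        bb.on('error', reject);
        bb.on('close', () => resolve({
            statusCode: 200,
            headers: { 'Access-Control-Allow-Origin': '*' },
            body: JSON.stringify(files.map((f) => ({ filename: f.filename, bytes: f.data.length }))),
        }));
        bb.end(raw);
    });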

Related

How to determine the MIME type of a file uploaded through API Gateway and consumed with Lambda

I have a Lambda function that receives a PDF file uploaded to my API Gateway endpoint. When the event is received, I need to validate the file's MIME type. I'm currently using file-type (I've tried others as well) with no success. Each time fromBuffer runs, it returns undefined. I'm able to take that file and upload it to S3 afterward (which is successful). What am I missing here; why doesn't the buffer return the correct file type?
const FileType = require('file-type')
exports.handler = async event => {
    const buffer = Buffer.from(event.body, 'base64'); // I've configured multipart/form-data as binary
    const type = await FileType.fromBuffer(buffer)
    console.log(type); // undefined
    // other code to move file to s3
}
Using file-type@16.5.1 to support CommonJS.
I'm using Postman for the POST request.
I'm using Node.js 14.x.
I have tried using buffer.toString() to see if it's just the output of the file, but it's totally fine and loads into S3 properly. I've also tried magic-number, mmmagic, and other MIME type detectors with no success.
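One plausible explanation (my assumption; the question is left open above) is that event.body contains the entire multipart envelope, so fromBuffer inspects the boundary line instead of the PDF's magic bytes. A sketch of sniffing the extracted file part instead, using the parse-multipart package that appears later on this page:
const FileType = require('file-type'); // file-type@16.5.1, CommonJS build
const multipart = require('parse-multipart');
exports.handler = async event => {
    const bodyBuffer = Buffer.from(event.body, 'base64');
    const boundary = multipart.getBoundary(event.headers['content-type']);
    const parts = multipart.Parse(bodyBuffer, boundary);
    // parts[0].data holds the raw file bytes, which file-type can inspect
    const type = await FileType.fromBuffer(parts[0].data);
    console.log(type); // e.g. { ext: 'pdf', mime: 'application/pdf' }
};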

Firebase bucket / axios URL in request body decoding issue

An interesting issue when using Firebase buckets and axios in a JS environment.
When I upload a file into a bucket and send the file link returned by Firebase to the server in a request body, the link is auto-decoded on the server.
1. Upload a file to the bucket from the web. Firebase returns a link: https://firebasestorage.googleapis.com/v0/b/[BUCKET_NAME]/o/[POINTER]%2Fimages%2F[FILE_NAME] (note the URL-encoded %2F that Firebase uses around 'images').
2. Save this to the DB via a Cloud Function call using axios.post(), with headers: {'Content-Type': 'application/x-www-form-urlencoded'} due to Cloud Function limitations here. The URL is nested in a JSON object as a string.
3. When this request is picked up in the Cloud Function, the URL in the object has been automatically URL-decoded, resulting in: https://firebasestorage.googleapis.com/v0/b/[BUCKET_NAME]/o/[POINTER]/images/[FILE_NAME] (note the plain / around 'images').
Problem: Firebase doesn't return the file when %2F is replaced with / in the URL; it only returns the error:
Invalid HTTP method/URL pair.
I understand that my only option here is to prevent this string from being URL-decoded during the client-server axios call. Since I am using the headers mentioned above, I'm not sure how this can be achieved.
Side question: why does Firebase enforce the URL encoding this strictly instead of returning the file regardless of how the path is represented (encoded or not)?
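One workaround sketch (my illustration, not from the thread; the endpoint URL is a placeholder) is to encode the link once more before sending, so the server's automatic decode restores the original %2F-encoded form rather than destroying it:
const axios = require('axios');
async function saveLink(link) {
    // the x-www-form-urlencoded body is decoded once on the server, so
    // pre-encoding the value leaves the %2F separators intact after that decode
    const body = new URLSearchParams({ url: encodeURIComponent(link) }).toString();
    await axios.post('https://example-cloud-function/saveLink', body, {
        headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    });
}
// Server side: decodeURIComponent(req.body.url) yields the link with %2F intact.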

How to use an AWS S3 image URL in a Node.js Lambda?

I am trying to use an AWS S3 image in a Lambda (Node.js) function, but it throws the error 'no such file or directory', even though I have made the image public and granted all permissions.
const fs = require('fs');
exports.handler = function (event, context) {
    var img = fs.readFileSync('https://s3-us-west-2.amazonaws.com/php-7/pic_6.png');
    res.writeHead(200, { 'Content-Type': 'image/png' });
    res.end(img, 'binary');
};
fs is the Node.js file system core module. It is for reading and writing files on the local machine, which is why it gives you that error.
There are multiple things wrong with your code.
1. fs is a core module used for file operations and can't be used to access S3.
2. You seem to be using Express.js code in your example. In Lambda, there is no built-in res defined (unless you define it yourself) that you can use to send a response. You need to use the methods on context or the newer callback mechanism. The context methods are used on the older Lambda Node version (0.10.42); you should be using the newer Node versions (4.3.2 or 6.10), which return a response using the callback parameter.
It seems like you are also using API Gateway, so assuming that, here are a few suggestions. If the client needs access to the S3 object, these are some of your options:
1. Read the image from S3 using the AWS SDK and return the image using the appropriate binary media type. AWS recently added support for binary data in API Gateway; see this link.
2. Send the public S3 URL to the client in your JSON response, and consider whether the S3 objects need to be public.
3. Use the S3 SDK to generate pre-signed URLs that are valid for a configured duration and return them to the client.
I like the pre-signed URL approach; I think you should check that out, along with the AWS Lambda documentation. A sketch of that approach follows below.
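A minimal sketch of the pre-signed URL option, assuming the AWS SDK v2 that ships with the Lambda runtime; the bucket and key names are placeholders:
const AWS = require('aws-sdk');
const s3 = new AWS.S3();
exports.handler = (event, context, callback) => {
    // generate a time-limited URL the client can fetch directly
    const url = s3.getSignedUrl('getObject', {
        Bucket: 'your-bucket-name', // placeholder
        Key: 'pic_6.png',
        Expires: 300, // seconds the URL stays valid
    });
    callback(null, {
        statusCode: 200,
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ url }),
    });
};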
To get a file from S3, you need to use the path that S3 gives you. The base path is https://s3.amazonaws.com/{your-bucket-name}/{your-file-name}.
In your code, you must replace the offending line with:
var img = fs.readFileSync('https://s3.amazonaws.com/{your-bucket-name}/pic_6.png');
If you don't have a bucket, you should create one and grant the permissions.
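Note, though, that fs.readFileSync cannot read HTTP URLs at all. A sketch of fetching the object through the AWS SDK v2 instead (my addition; bucket and key are placeholders):
const AWS = require('aws-sdk');
const s3 = new AWS.S3();
exports.handler = (event, context, callback) => {
    s3.getObject({ Bucket: 'your-bucket-name', Key: 'pic_6.png' }, (err, data) => {
        if (err) return callback(err);
        callback(null, {
            statusCode: 200,
            headers: { 'Content-Type': 'image/png' },
            body: data.Body.toString('base64'),
            isBase64Encoded: true, // API Gateway decodes this back to binary
        });
    });
};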

How to parse multipart form data in an Azure Function App with an HTTP trigger? (Node.js)

I want to write a Node.js HTTP endpoint using Azure Functions.
This endpoint will be a POST endpoint which takes files and uploads them to blob storage.
However, Node.js multipart form data parsers all come in the form of httpserver or Express.js middleware.
Are there any available tools that can parse the multipart form data after it has all been received from the Function Application's wrapper?
Thanks!
To answer the original question:
However, NodeJS multipart form data parsers are all in the form of httpserver or expressJS middleware. Is there any available tools that can parse the multipart form data after it has all been received from the Function Application's wrapper?
Even two years after you asked this question, the state of multipart form data parsers is not great. As you noticed, the majority of them assume a req object which is a stream, and tutorials/demos show how to parse multipart/form-data with Express or httpServer.
However, there is a parse-multipart npm package which can process req.body from an Azure Function and return an array of objects, with code similar to the following:
const multipart = require("parse-multipart");
module.exports = function (context, request) {
    context.log('JavaScript HTTP trigger function processed a request.');
    // wrap the raw request body in a Buffer
    const bodyBuffer = Buffer.from(request.body);
    const boundary = multipart.getBoundary(request.headers['content-type']);
    // parse the body into its parts
    const parts = multipart.Parse(bodyBuffer, boundary);
    context.res = { body: { name: parts[0].filename, type: parts[0].type, data: parts[0].data.length } };
    context.done();
};
(original source: https://www.builtwithcloud.com/multipart-form-data-processing-via-httptrigger-using-nodejs-azure-functions/)
One area where I noticed parse-multipart can struggle is parsing forms with text fields. A slightly improved version which handles them better is called multipart-formdata:
require('multipart-formdata').parse(req.body, boundary)
// returns [{field, name, data, filename, type}, ...] where data is a buffer you can use to save files
Azure Functions wraps the HTTP server object in Node.js and exposes simple req and context objects with several functionalities; refer to https://learn.microsoft.com/en-us/azure/azure-functions/functions-reference-node#exporting-a-function for details.
Mostly, Azure Functions is designed for triggers and webhook requests; refer to https://learn.microsoft.com/en-us/azure/azure-functions/functions-compare-logic-apps-ms-flow-webjobs for a detailed comparison.
Meanwhile, you can try the answer to Image upload to server in node.js without using express to parse the request body into file content, then upload it to Azure Storage using the Azure Storage SDK for Node.js. You can install custom node modules via the KUDU console; refer to https://learn.microsoft.com/en-us/azure/azure-functions/functions-reference-node#node-version--package-management for more info.
I also suggest trying an Azure API App in Node.js to approach your requirement; as it is an Express.js-based project, handling uploaded files will be much easier.
If you have any further concerns, please feel free to let me know.
I have had good experiences with the new azure-function-multipart package. Example code might look like this:
const {
    default: parseMultipartFormData,
} = require("@anzp/azure-function-multipart");
module.exports = async function (context, req) {
    const { fields, files } = await parseMultipartFormData(req);
    console.log("fields", fields);
    console.log("files", files);
};
See docs for more details.
You can try to use this adapter for Functions and Express; it may allow you to successfully use the multipart middleware you want: https://github.com/yvele/azure-function-express
As a less desirable option, you can parse the body yourself. All the multipart data will be available in req.body and will look something like this:
------WebKitFormBoundarymQMaH4AksAbC8HRW
Content-Disposition: form-data; name="key"
value
------WebKitFormBoundarymQMaH4AksAbC8HRW
Content-Disposition: form-data; name=""
------WebKitFormBoundarymQMaH4AksAbC8HRW--
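A naive sketch of such manual parsing (my own illustration, handling text fields only; binary file parts would be corrupted by string splitting, so use a real parser for files):
const parseSimpleForm = (rawBody, boundary) =>
    rawBody
        .split('--' + boundary) // parts are delimited by --boundary
        .filter((part) => part.includes('form-data')) // drop the preamble and closing marker
        .map((part) => {
            // headers and value are separated by a blank line (CRLF CRLF)
            const [head, ...rest] = part.split('\r\n\r\n');
            const name = (head.match(/name="([^"]*)"/) || [])[1];
            return { name, value: rest.join('\r\n\r\n').replace(/\r\n$/, '') };
        });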
I do think it's a good idea to support httpserver / express better in order to enable this extensibility.

Gzipped response in AWS Lambda -> API Gateway

I can't seem to get a gzipped response from Lambda through the API Gateway.
I'm gzipping my response in Lambda and setting the "Content-Encoding" header in API Gateway.
I'm not sure which part is the problem.
Here's the final return from Lambda to API Gateway:
zlib.gzip(myJsonString, function (err, buffer) {
    if (err) { return handleError(err, context) }
    return context.succeed(buffer.toString('binary'));
});
I've tried just passing the buffer, Base64-encoding it, etc.
Making a GET request from Chrome:
If I remove the Content-Encoding header from the gateway, I get the binary/Base64/buffer array as a string response in the browser.
If I set the header, the GET request fails entirely with no response, but testing in the AWS console returns the payload with quotes around it.
I don't know what's going on here, but if Amazon actually wants people to use this thing, we need to be able to compress our responses. It seems like it should just be a checkbox in API Gateway, so I could simply return a JSON string from Lambda and have it zipped up automatically.
As of Nov 17, 2016 - Binary Data Now Supported by API Gateway.
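Building on that announcement, a minimal sketch of what it enables (my assumption: a Lambda proxy integration with */* registered as a binary media type, so API Gateway decodes the Base64 body back to binary):
const zlib = require('zlib');
exports.handler = async () => {
    const gzipped = zlib.gzipSync(JSON.stringify({ hello: 'world' }));
    return {
        statusCode: 200,
        headers: {
            'Content-Type': 'application/json',
            'Content-Encoding': 'gzip', // tells the browser to decompress
        },
        body: gzipped.toString('base64'),
        isBase64Encoded: true, // tells API Gateway the body is binary
    };
};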
Let me know if you figured that out!
