use fastify to download big files as multipart - node.js

I have code that downloads big objects from S3 using multipart download. I am now working on a microservice that will hide all the S3 operations and, in the future, give me the flexibility to switch to any other object store. I have created a Node.js service using Fastify; how can I add support for multipart download using Fastify?

You should rewrite the parts logic of the AWS S3 service on both the server and the client side.
GetObject accepts a Range header that limits the download to that piece of the file.
So the client needs to know how many pieces compose a file, usually via the ListParts API. Then it can call GetObject with the range parameter:

    var params = {
      Bucket: "examplebucket",
      Key: "SampleFile.txt",
      Range: "bytes=0-9"
    };
    s3.getObject(params, function (err, data) {
      // data.Body contains only the requested byte range
    });

So your Fastify server should proxy at least those two services, letting the client download many pieces of the file simultaneously and then merge them.
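Below is a minimal sketch (not part of the original answer) of what the GetObject side of that proxy could look like in Fastify, forwarding the client's Range header to S3 and streaming the part back. The route path, bucket name and the Fastify v4-style listen options are assumptions.

    // Sketch only: a Fastify route that proxies ranged S3 downloads.
    const fastify = require("fastify")();
    const AWS = require("aws-sdk");
    const s3 = new AWS.S3();

    fastify.get("/objects/:key", (request, reply) => {
      const params = {
        Bucket: "examplebucket",         // assumed bucket name
        Key: request.params.key,
        Range: request.headers.range     // e.g. "bytes=0-9", chosen by the client
      };
      const stream = s3.getObject(params).createReadStream();
      // 206 Partial Content when a range was requested, 200 otherwise
      reply
        .code(request.headers.range ? 206 : 200)
        .type("application/octet-stream")
        .send(stream);
    });

    fastify.listen({ port: 3000 });

The client can then issue several requests with different Range values in parallel and concatenate the pieces in order.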

Related

how to read a file from form-Data in an AWS Lambda function

I am trying to send a PDF or text file to an AWS Lambda function (Node.js). I'd like to be able to process this file in the Lambda function. I know the best practice is often to use a trigger from an S3 bucket; however, I'd like to be able to send this Lambda function a file as form data, extract information from the file and then return the extracted info.
I have been able to encode the file as base64 and send it to AWS Lambda in a JSON payload, but when I try to decode the file (especially a PDF) in the Lambda function it is often corrupted or empty.
Image files seem to work well for this type of encoding, but I have been unsuccessful with PDFs. Any help greatly appreciated. Rather than encoding in base64, is there a way I can obtain the file from form data? My code is below:
    export const handler = async (event) => {
      console.log("event", event);
      var converted = atob(event.body); // RATHER HOW WOULD I READ A FILE FROM FORMDATA
      const response = {
        "statusCode": 200,
        "headers": {
          "Content-Type": "text/html", // "application/pdf", // "multipart/form-data",
          "Access-Control-Allow-Origin": "*",
          "Access-Control-Allow-Headers": "Content-Type,X-Amz-Date,Authorization,X-Api-Key,X-Amz-Security-Token"
        },
        "body": event,
        "isBase64Encoded": true
      };
      return response;
    };
thanks so much
I assume that you are using API Gateway to trigger the Lambda function. In that case, you have to enable multipart/form-data in API Gateway as mentioned in the documentation. The gist of the documentation is as follows:
1. In the Settings pane, choose Add Binary Media Type in the Binary Media Types section. Type the required media type, for example image/png, in the input text field.
2. Add Content-Type and Accept to the request headers for your proxy method.
3. Add those same headers to the integration request headers.
4. Deploy the API.
PS: If you are using Lambda proxy integration ({proxy+}), just steps 1 and 4 are enough.
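As a hedged sketch of the handler side (not part of the original answer), once binary support is enabled you can decode the base64 body with Buffer.from instead of atob and hand it to a multipart parser such as the busboy package (an assumption, not mentioned above):

    const Busboy = require("busboy");

    exports.handler = async (event) => {
      // API Gateway delivers the multipart body base64-encoded; Buffer.from
      // avoids the corruption atob causes on binary data such as PDFs.
      const body = Buffer.from(event.body, "base64");
      const contentType =
        event.headers["content-type"] || event.headers["Content-Type"];

      const fileBuffer = await new Promise((resolve, reject) => {
        const bb = Busboy({ headers: { "content-type": contentType } });
        const chunks = [];
        bb.on("file", (name, stream) => stream.on("data", (c) => chunks.push(c)));
        bb.on("close", () => resolve(Buffer.concat(chunks)));
        bb.on("error", reject);
        bb.end(body);
      });

      return { statusCode: 200, body: `received ${fileBuffer.length} bytes` };
    };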

Piping a file straight to the client using Node.js and Amazon S3

So I want to pipe a file straight to the client; the way I am currently doing it is to write the file to disk, then send that file to the client.
router.get("/download/:name", async (req, res) => {
const s3 = new aws.S3();
const dir = "uploads/" + req.params.name + ".apkg"
let file = fs.createWriteStream(dir);
await s3.getObject({
Bucket: <bucket-name>,
Key: req.params.name + ".apkg"
}).createReadStream().pipe(file);
await res.download(dir);
});
I just found out that res.download() only serves local files. Is there a way to do it directly from AWS S3 to the client, i.e. pipe files straight to the user? Thanks in advance.
As described in this SO thread:
You can simply pipe the read stream into the response instead of piping it to the file; just make sure to supply the correct Content-Type and to set it as an attachment, so the browser will know how to handle the response properly.
    res.attachment(req.params.name);
    s3.getObject({
      Bucket: <bucket-name>,
      Key: req.params.name + ".apkg"
    }).createReadStream().pipe(res);
One more pattern for this is to create a signed URL directly to the S3 object and then let the client download straight from S3, instead of streaming it through your Node web server. This reduces the load on your web server.
You will need to use the getSignedUrl method from the AWS S3 SDK for JS.
Then, once you have the URL, just return it to your client so they can download the file themselves.
You should take into account that once you give the client a signed URL with download permission for, say, 5 minutes, they will only be able to download that file during those 5 minutes. You should also take into account that they can pass that URL to anyone else for download during those 5 minutes, so it depends on how secure you need this to be.
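A minimal sketch of that signed-URL route, using the AWS SDK v2 getSignedUrl call inside the question's router; the route path, the 5-minute expiry and the BUCKET_NAME env var are assumptions:

    const aws = require("aws-sdk");
    const s3 = new aws.S3();

    router.get("/download-url/:name", (req, res) => {
      const url = s3.getSignedUrl("getObject", {
        Bucket: process.env.BUCKET_NAME,   // bucket name assumed to come from an env var
        Key: req.params.name + ".apkg",
        Expires: 60 * 5                    // seconds the link stays valid
      });
      res.json({ url });                   // the client then downloads straight from S3
    });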
S3 can be used to serve content directly, so I would do the following (a sketch of steps 1 and 3 follows this list):
1. Add CORS headers to your Node response. This will enable the browser to download from another origin, i.e. S3.
2. Enable S3 static website hosting on your bucket.
3. Script the redirect from Node to S3; this you can achieve in JS.
4. Use a signed URL as suggested in the other post if you need to protect the S3 content.
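A rough sketch of steps 1 and 3 in Express; the BUCKET_WEBSITE_URL env var below is a hypothetical placeholder for the bucket's s3-website endpoint:

    const express = require("express");
    const app = express();

    app.get("/download/:name", (req, res) => {
      // Step 1: CORS header so the browser will follow the cross-origin download
      res.set("Access-Control-Allow-Origin", "*");
      // Step 3: redirect the client to the bucket's website endpoint
      // (step 2, static website hosting, must be enabled on the bucket).
      res.redirect(
        process.env.BUCKET_WEBSITE_URL + "/" +
          encodeURIComponent(req.params.name) + ".apkg"
      );
    });

    app.listen(3000);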

How to use aws s3 image url in node js lambda?

I am trying to use an AWS S3 image in a Lambda Node.js function, but it throws the error 'no such file or directory', even though I have made the image public and granted all permissions.
    fs = require('fs');
    exports.handler = function (event, context) {
      var img = fs.readFileSync('https://s3-us-west-2.amazonaws.com/php-7/pic_6.png');
      res.writeHead(200, { 'Content-Type': 'image/png' });
      res.end(img, 'binary');
    };
fs is the Node.js file system core module. It is for reading and writing files on the local machine, which is why it gives you that error.
There are multiple things wrong with your code:
fs is a core module used for file operations and can't be used to access S3.
You seem to be using Express.js code in your example. In Lambda, there is no built-in res defined (unless you define it yourself) that you can use to send a response.
You need to use the methods on context or the newer callback mechanism. The context methods are used on the older Lambda Node version (0.10.42). You should be using a newer Node version (4.3.2 or 6.10), which returns the response via the callback parameter.
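For reference, a minimal illustration (not from the original answer) of the callback-style handler on those newer runtimes:

    exports.handler = (event, context, callback) => {
      // first argument is the error (null on success), second is the response
      callback(null, { statusCode: 200, body: "ok" });
    };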
It seems like you are also using API Gateway, so assuming that, I'll give a few suggestions. If the client needs access to the S3 object, these are some of your options:
Read the image from S3 using the AWS SDK and return the image with the appropriate binary media type. AWS recently added support for binary data in API Gateway; see this link. OR
Send the public S3 URL to the client in your JSON response. Consider whether the S3 objects need to be public. OR
Use the S3 SDK to generate pre-signed URLs, valid for a configured duration, and return them to the client.
I like the pre-signed URL approach; I think you should check that out. You might also want to check the AWS Lambda documentation.
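A hedged sketch of the first option above (reading the image from S3 and returning it as binary through API Gateway), assuming the AWS SDK v2 and that image/png is registered as a binary media type; the bucket and key are taken from the URL in the question:

    const AWS = require("aws-sdk");
    const s3 = new AWS.S3();

    exports.handler = async (event) => {
      // Read the object from S3 instead of the local filesystem
      const data = await s3
        .getObject({ Bucket: "php-7", Key: "pic_6.png" })
        .promise();

      return {
        statusCode: 200,
        headers: { "Content-Type": "image/png" },
        body: data.Body.toString("base64"),
        isBase64Encoded: true // API Gateway converts this back to raw bytes
      };
    };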
To get a file from S3, you need to use the path that S3 gives you. The base path is https://s3.amazonaws.com/{your-bucket-name}/{your-file-name}.
In your code, you must replace that line with:
    var img = fs.readFileSync('https://s3.amazonaws.com/{your-bucket-name}/pic_6.png');
If you don't have a bucket, you should create one and give it the permissions.

How to transfer base64 image from client to server or download binary / base64 from s3 bucket?

In my app, I'm sending photos directly from the client to S3, using something similar to this suggested Heroku recommendation: https://devcenter.heroku.com/articles/s3-upload-node
The main benefit is that it saves server cost (I'm assuming, since chunks aren't being sent to the server using something such as multipart form data).
However, I wish to be able to share these images to twitter also, which states this requirement:
Ensure the POST is a multipart/form-data request. Either upload the raw binary (media parameter) of the file, or its base64-encoded contents (media_data parameter). Use raw binary when possible, because base64 encoding results in larger file sizes
I've tried sending the base64 needed for the client-side S3 upload back to the server, but depending on the photo size I often get an error that it's too big to send back.
TLDR
Do I need to send my photos using multiparty / multipart form data to my server, so I can have the needed base64 / binary to share a photo to Twitter, or can I keep sending photos from my client to S3?
Then, somehow, efficiently obtain the needed base64 / binary on the server (possibly using the request module), so I can then send the image to Twitter?
One fairly easy way to do this without changing your client code a whole lot would be to use S3 events. S3 events can trigger a lambda function in AWS that can post the image to twitter. You can use any library inside the lambda function to do efficient posting to twitter. Not sure if you want to use Lambda or stick to Heroku.
If you are uploading documents directly from the client to S3, you are exposing your AWS secret/private keys to the client. A more secure way would be to upload the images to Node and have Node in turn upload them to S3. A recommended way to upload images to the Node server is multipart/form-data with the Multer middleware, as sketched below.
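A minimal sketch of such a Multer route; the field name "photo" and the in-memory storage choice are assumptions:

    const express = require("express");
    const multer = require("multer");

    const app = express();
    const upload = multer({ storage: multer.memoryStorage() });

    app.post("/upload", upload.single("photo"), (req, res) => {
      // req.file.buffer holds the raw binary, ready to forward to S3
      // (s3.upload) or to Twitter's media upload endpoint.
      res.json({ size: req.file.size, mimetype: req.file.mimetype });
    });

    app.listen(3000);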
Regardless of the upload method, you can use the following code to serve images to Twitter. This code uses the aws-sdk module.
    var AWS = require('aws-sdk');
    var s3 = new AWS.S3();
    var filename = req.query.filename;
    var params = {
      Bucket: <bucketname>,
      Key: <image path>
    };
    var extension = filename.split('.').pop().toLowerCase();
    if (extension == "jpg" || extension == "jpeg") {
      res.setHeader('Content-Type', 'image/jpeg');
    } else if (extension == "png") {
      res.setHeader('Content-Type', 'image/png');
    }
    // Stream the object from S3 straight into the response
    s3.getObject(params).createReadStream().pipe(res);
This method can scale with ease like any other Express app.

Serve static files from google-cloud-storage through express middleware

I have an Express app hosted on Google App Engine which uses the express static middleware. I'd like to store the static files on Google Cloud Storage, and to be able to switch from the regular filesystem to Google Cloud Storage without too much modification.
I was thinking of writing a middleware:
using the Google Cloud client library for Node.js, something like Express caching Image stream from Google Cloud Storage;
or acting as a proxy (mapping pathnames to raw Google Cloud Storage URLs).
Is there an easier/cleaner way to do that?
I made this work using http-proxy-middleware. Essentially, since GCS files can be accessed over HTTP, this is all we need.
Ideally, the files could be served directly out of GCS itself, by making the bucket public and using a URL like https://storage.googleapis.com/<bucket-name>/file. But my requirement was that the files needed to be served from the same domain as my app, while not being part of the app itself (they are generated separately). So I had to implement it as a proxy.
    import proxy from 'http-proxy-middleware';
    ...
    app.use('/public', proxy({
      target: `https://storage.googleapis.com/${process.env.GOOGLE_CLOUD_PROJECT}.appspot.com`,
      changeOrigin: true,
    }));
Note that the bucket based on the project ID is automatically created by GAE, but it needs to be given public access. This can be done by:
    gsutil defacl set public-read gs://${GOOGLE_CLOUD_PROJECT}.appspot.com
After setting up the proxy, all requests to https://example.com/public/* will be served from the bucket <GOOGLE_CLOUD_PROJECT>.appspot.com/public/*.
I needed the same thing and came across the example used in google-cloud-node (see also the example in another question):
Pipe the read stream of the file contents to your response.
Set the file name and content type.
    // Add headers to describe the file
    let headers = {
      'Content-disposition': 'attachment; filename="' + 'giraffe.jpg' + '"',
      'Content-Type': 'image/jpeg'
    };
    // Streams are supported for reading files.
    let remoteReadStream = bucket.file('giraffe.jpg').createReadStream();
    // Set the response code & headers and pipe the content to the response
    res.status(200).set(headers);
    remoteReadStream.pipe(res);
You could configure a GCS bucket to host a static website and then use an existing express middleware to proxy requests to that bucket.
You might also be able to use an S3 Express middleware like s3-proxy. By following the 'simple' migration steps to move an S3 client application to Google Cloud Storage, you should be able to derive the necessary config parameters for the middleware. The key step will be generating 'access' and 'secret' developer keys.
