I am developing a web application using Node.js and an Amazon S3 bucket to store files. What I am doing now is that when I upload a video file (mp4) to the S3 bucket, a Lambda function generates the thumbnail photo of the video file. For fetching the thumbnail, I am using this package - https://www.npmjs.com/package/ffmpeg. I tested the package locally on my laptop and it works.
Here is the code I tested on my laptop:
var ffmpeg = require('ffmpeg');

module.exports.createVideoThumbnail = function (req, res) {
    try {
        var process = new ffmpeg('public/lalaland.mp4');
        process.then(function (video) {
            video.fnExtractFrameToJPG('public', {
                frame_rate: 1,
                number: 5,
                file_name: 'my_frame_%t_%s'
            }, function (error, files) {
                if (!error)
                    console.log('Frames: ' + files);
                else
                    console.log(error);
            });
        }, function (err) {
            console.log('Error: ' + err);
        });
    } catch (e) {
        console.log(e.code);
        console.log(e.msg);
    }
    res.json({ status: true, message: "Video thumbnail created." });
};
The above code works well and gave me the thumbnail photos of the video file (mp4). Now I am trying to use that code in an AWS Lambda function. The issue is that the above code uses a video file path as the parameter to fetch the thumbnails. In the Lambda function, I can only fetch the Base64-encoded content of the file. I can get the key (S3 path) of the file, but I cannot use it as the parameter (file path) to fetch the thumbnails, as my S3 bucket does not allow public access.
So what I tried was to save the Base64-encoded video file locally inside the Lambda function project itself and then pass that file path as the parameter for fetching the thumbnails. But the issue was that the AWS Lambda function file system appeared to be read-only, so I could not write any file to it. What I am trying to do right now is to retrieve the thumbnails directly from the Base64-encoded video file. How can I do it?
It looks like you are using the wrong file location.
/tmp/* is your writable location for temporary files, and it is limited to 512 MB.
Check out this tutorial, which does the same thing you are trying to do:
https://concrete5.co.jp/blog/creating-video-thumbnails-aws-lambda-your-s3-bucket
Lambda Docs:
https://docs.aws.amazon.com/lambda/latest/dg/limits.html
Ephemeral disk capacity ("/tmp" space) 512 MB
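If it helps, here is a rough sketch of how that could look: download the uploaded video from S3 into /tmp, then run the same ffmpeg package against that local path. It assumes the aws-sdk v2 client, an S3 event trigger, and that an ffmpeg binary is available to the Lambda runtime (e.g. via a layer); file names below are placeholders.

// Minimal sketch: download the video from S3 into /tmp, then extract frames there.
// Assumes aws-sdk v2, an S3 event trigger, and an ffmpeg binary available in Lambda.
const AWS = require('aws-sdk');
const fs = require('fs');
const ffmpeg = require('ffmpeg');
const s3 = new AWS.S3();

exports.handler = async (event) => {
    const bucket = event.Records[0].s3.bucket.name;
    const key = decodeURIComponent(event.Records[0].s3.object.key.replace(/\+/g, ' '));
    const localPath = '/tmp/input.mp4'; // /tmp is the only writable path in Lambda

    const obj = await s3.getObject({ Bucket: bucket, Key: key }).promise();
    fs.writeFileSync(localPath, obj.Body);

    const video = await new ffmpeg(localPath);
    const files = await new Promise((resolve, reject) => {
        video.fnExtractFrameToJPG('/tmp', { number: 5, file_name: 'thumb_%t_%s' },
            (err, frames) => (err ? reject(err) : resolve(frames)));
    });
    // Upload the generated /tmp/thumb_*.jpg files back to S3 from here.
    return { thumbnails: files };
};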
Hope it helps.
Related
I have a requirement to:
Download a PDF file from AWS S3 storage. (Key1)
Do some modifications.
Upload the modified PDF file back to S3 storage. (Key2)
The uploaded file is a new file (Key2), not overwriting the existing file (Key1).
Library used for modifying PDFs: pdf-lib
All the operations (downloading, modifying, and uploading the PDF) are done in AWS Lambda. The runtime is Node.js 14.x.
The objects in the S3 bucket can only be accessed through a CDN, as public access is blocked.
I'm able to download the file, make the modifications, and upload it to S3. But when I open the file using the CDN URL for the object, it shows encoded text (garbage), not the PDF preview of the file.
Downloading the PDF file from S3:
const params = {
    Bucket: bucket_name,
    Key: key
};
// GET FILE AND RETURN PROMISE.
return new Promise((resolve, reject) => {
    s3.getObject(params, (err, data) => {
        if (err) {
            return reject(err);
        }
        try {
            const obj = data.Body; // <<-- getting Uint8Array
            resolve(obj);
        } catch (e) {
            reject(e);
        }
    });
});
Modifying the PDF file:
const modificationFunction = async (opts) => {
    const { fileData } = opts; // <<---- Uint8Array data from the snippet above.
    const pdfDoc = await PDFDocument.load(fileData);
    // Do some modification, like drawing lines.
    const modifiedPDFData = await pdfDoc.saveAsBase64({ dataUri: true });
    return modifiedPDFData; // <<--- Base64 data of the modifications.
};
Uploading the PDF file:
const params = {
    Bucket: bucket_name,
    Key: key,
    Body: data, // <<--- Base64 data of the modification from the snippet above
};
try {
    await s3.upload(params).promise();
    console.log('File uploaded:', `s3://${bucket_name}/${key}`);
} catch (err) {
    console.error('Upload failed:', err);
}
The content of the PDF when viewed via the CDN URL is attached; it is encoded/garbage content.
The same PDF, when downloaded manually from the S3 bucket to my laptop, shows the contents properly like a normal PDF file.
I referenced many online resources / Stack Overflow threads:
link1
link2 (using the AWS SDK in JavaScript).
I tried both the save() and saveAsBase64() methods of the pdf-lib Node.js library.
I also tried saving the modified file locally, uploading that file manually to AWS S3, and accessing it through the CDN. I was able to view the PDF properly that way, so there seems to be some issue with how the file is uploaded to S3.
The issue was not with the PDF download, modification, or upload operations. The CDN had a caching policy, due to which the initially generated garbage-content files kept being served on subsequent requests. After clearing the cache and trying again, the files were viewable properly via the CDN URL.
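Separately from the caching issue, a common pattern is to upload the raw bytes from pdf-lib's save() together with an explicit ContentType, so the CDN and browsers treat the object as a PDF rather than text. A hedged sketch, reusing the pdfDoc, bucket_name, and key names from the snippets above:

// Sketch only: upload the modified PDF as binary data with an explicit ContentType.
const pdfBytes = await pdfDoc.save();        // Uint8Array from pdf-lib
await s3.upload({
    Bucket: bucket_name,
    Key: key,
    Body: Buffer.from(pdfBytes),             // raw bytes, not a Base64 data URI
    ContentType: 'application/pdf',          // lets the CDN/browser render it inline
}).promise();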
I'm using the Node.js aws-sdk package to download files from S3 storage. When I download a JPEG image and save it as a local file, I can't view it. Is this the right way to download a JPEG image?
public async downloadFile(fileName: string, targetPath: string): Promise<void> {
    try {
        const awsObject = await this.s3
            .getObject({
                Bucket: BUCKET,
                Key: fileName,
            })
            .promise();
        fs.writeFileSync(targetPath, awsObject.Body.toString());
    } catch (error) {
        throw new Error(`Failed to download file from aws storage with error ${error}`);
    }
}
This is how I call it:
await awsSdk.downloadFile('fileInS3.jpeg', `test.jpeg`);
When I try to open the saved file I receive an error that says
The file “test.jpeg” could not be opened. It may be damaged or use a file format that Preview doesn’t recognize.
Update
Solved by replacing
fs.writeFileSync(targetPath, awsObject.Body.toString());
With
fs.writeFileSync(targetPath, awsObject.Body as Buffer);
This looks like a problem:
awsObject.Body.toString()
If you're writing an image, converting it to a string is going to break it.
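To illustrate why (a standalone snippet, not from the original answer): bytes that are not valid UTF-8 get replaced when a Buffer is converted to a string, so the round trip does not reproduce the original data.

// Demonstration: bytes that are not valid UTF-8 get replaced during
// Buffer -> string -> Buffer conversion, corrupting binary files like JPEGs.
const original = Buffer.from([0xff, 0xd8, 0xff, 0xe0]);   // JPEG magic bytes
const roundTripped = Buffer.from(original.toString());    // default 'utf8' encoding
console.log(original.length, roundTripped.length);        // 4 vs 12 (replacement characters)
console.log(original.equals(roundTripped));               // false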
I was practicing with this tutorial:
https://www.youtube.com/watch?v=NZElg91l_ms&t=1234s
It works like a charm for me, but the thing is I am storing product images in the bucket, and if I upload, say, 4 images, they all get uploaded.
But when I display them, I get an access denied error, as I am displaying a list and the repeated requests are maybe being detected as spam.
This is how I am trying to fetch them in my React app:
//rest of the data is from the MySQL database (product name, price)
//100+ products
{ products.map((row) => (
    <div key={row.imgurl}>
        <div className="product-hero">
            <img src={`http://localhost:3909/images/${row.imgurl}`} alt={row.productName} />
        </div>
        <div className="text-center">{row.productName}</div>
    </div>
)) }
As it fetches 100+ products from the DB and 100 images from AWS, it fails.
Sorry for such a detailed question, but in short: how can I fetch all product images from my bucket?
Note: I am aware that I can get only one image per call, so how can I get all images one by one in my scenario?
//download code in my app.js
const express = require('express')
const { uploadFile, getFileStream } = require('./s3')
const app = express()

app.get('/images/:key', (req, res) => {
    console.log(req.params)
    const key = req.params.key
    const readStream = getFileStream(key)
    readStream.pipe(res)
})
//s3 file
// uploads a file to s3
function uploadFile(file) {
    const fileStream = fs.createReadStream(file.path)
    const uploadParams = {
        Bucket: bucketName,
        Body: fileStream,
        Key: file.filename
    }
    return s3.upload(uploadParams).promise()
}
exports.uploadFile = uploadFile

// downloads a file from s3
function getFileStream(fileKey) {
    const downloadParams = {
        Key: fileKey,
        Bucket: bucketName
    }
    return s3.getObject(downloadParams).createReadStream()
}
exports.getFileStream = getFileStream
It appears that your code is sending image requests to your back-end, which retrieves the objects from Amazon S3 and then serves the images in response to the request.
A much better method would be to have the URLs in the HTML page point directly to the images stored in Amazon S3. This would be highly scalable and will reduce the load on your web server.
This would require the images to be public so that the user's web browser can retrieve the images. The easiest way to do this would be to add a Bucket Policy that grants GetObject access to all users.
Alternatively, if you do not wish to make the bucket public, you can instead generate Amazon S3 pre-signed URLs, which are time-limited URLs that provide temporary access to a private object. Your back-end can generate the pre-signed URL with a couple of lines of code, and the user's web browser will then be able to retrieve private objects from S3 for display on the page.
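For example, a minimal sketch with the aws-sdk v2 client (reusing the app and bucketName names from the snippets above; the route name is a placeholder): the back-end returns a pre-signed URL instead of streaming the object, and the React app uses that URL directly as the img src.

// Sketch: return a time-limited pre-signed URL for a private S3 object (aws-sdk v2).
const AWS = require('aws-sdk')
const s3 = new AWS.S3()

app.get('/image-url/:key', (req, res) => {
    const url = s3.getSignedUrl('getObject', {
        Bucket: bucketName,          // same bucket as in the s3 file above
        Key: req.params.key,
        Expires: 60 * 5              // URL is valid for 5 minutes
    })
    res.json({ url })                // the browser then loads <img src={url}> directly
})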
I did similar S3 image handling while implementing my blog's image upload functionality, but I did not use getFileStream() to upload my image.
Because nothing should be done until the image file is fully processed, I used fs.readFile(path, callback) instead to read the data.
My way produces Buffer data, but AWS S3 is smart enough to treat this as an image. (I only added a suffix to my filename; I don't know how to apply image headers...)
This is my part of code for reference:
fs.readFile(imgPath, (err, data) => {
    if (err) { throw err }
    // Once the file is read, upload it to AWS S3
    const objectParams = {
        Bucket: 'yuyuichiu-personal',
        Key: req.file.filename,
        Body: data
    }
    S3.putObject(objectParams, (err, data) => {
        // store the image link and read the image with that link
    })
})
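Regarding the image headers mentioned above: one option (a hedged sketch, not part of the original answer) is to set ContentType in the putObject parameters so the object is served with the right MIME type.

// Sketch: same upload as above, with an explicit ContentType for the image.
const objectParams = {
    Bucket: 'yuyuichiu-personal',
    Key: req.file.filename,
    Body: data,
    ContentType: req.file.mimetype   // e.g. 'image/jpeg' when the upload comes through multer
}
S3.putObject(objectParams, (err, data) => {
    // handle the result as before
})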
I've been taking a crack at uploading files to S3 via Node.js, but with a specific path where they have to be stored.
return s3fsImpl.writeFile(file_name.originalFilename, stream).then(function () {
    fs.unlink(file_name.path, function (err) {
        if (err) {
            console.error(err);
        } else { /** success **/ }
    });
});
I'm not sure how to give a path like /project_name/file_name.
I have been following this tutorial
In this scenario you are using a stream as the target. When you created that stream, you should have specified the path at that point.
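In other words (a hedged sketch based on the snippet above, with project_name as a placeholder prefix), the folder is simply part of the key you pass to writeFile:

// Sketch: include the "folder" in the key passed to writeFile; S3 has no real
// directories, so the prefix in the key is what creates the path.
return s3fsImpl.writeFile('project_name/' + file_name.originalFilename, stream).then(function () {
    fs.unlink(file_name.path, function (err) {
        if (err) {
            console.error(err);
        }
    });
});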
I am trying to find a solution to stream a file to Amazon S3 using a Node.js server, with these requirements:
Don't store a temp file on the server or the complete file in memory; buffering up to some limit (not the complete file) can be used for uploading.
No restriction on the uploaded file size.
Don't freeze the server until the complete file is uploaded, because with a heavy file upload, other requests' waiting time would unexpectedly increase.
I don't want to use direct file upload from the browser, because the S3 credentials would need to be shared in that case. Another reason to upload the file from the Node.js server is that some authentication may also need to be applied before uploading the file.
I tried to achieve this using node-multiparty, but it was not working as expected. You can see my solution and the issue at https://github.com/andrewrk/node-multiparty/issues/49. It works fine for small files but fails for a file of size 15MB.
Any solution or alternative ?
You can now use streaming with the official Amazon SDK for Node.js; see the section "Uploading a File to an Amazon S3 Bucket" or their example on GitHub.
What's even more awesome, you can finally do so without knowing the file size in advance. Simply pass the stream as the Body:
var fs = require('fs');
var zlib = require('zlib');
var body = fs.createReadStream('bigfile').pipe(zlib.createGzip());
var s3obj = new AWS.S3({params: {Bucket: 'myBucket', Key: 'myKey'}});
s3obj.upload({ Body: body })
    .on('httpUploadProgress', function (evt) { console.log(evt); })
    .send(function (err, data) { console.log(err, data); });
For your information, the v3 SDK was published with a dedicated module to handle that use case: https://www.npmjs.com/package/@aws-sdk/lib-storage
Took me a while to find it.
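A minimal sketch of that module's usage (assuming the v3 packages are installed; the region, bucket, key, and file names are placeholders):

// Sketch: streaming upload with @aws-sdk/lib-storage (SDK v3); the stream's
// total size does not need to be known in advance.
const { S3Client } = require('@aws-sdk/client-s3');
const { Upload } = require('@aws-sdk/lib-storage');
const fs = require('fs');

const upload = new Upload({
    client: new S3Client({ region: 'us-east-1' }),   // placeholder region
    params: {
        Bucket: 'myBucket',
        Key: 'myKey',
        Body: fs.createReadStream('bigfile'),
    },
});

upload.on('httpUploadProgress', (progress) => console.log(progress));
upload.done().then((result) => console.log(result), (err) => console.error(err));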
Give https://www.npmjs.org/package/streaming-s3 a try.
I used it for uploading several big files in parallel (>500Mb), and it worked very well.
It is very configurable and also allows you to track uploading statistics.
You do not need to know the total size of the object, and nothing is written to disk.
If it helps anyone I was able to stream from the client to s3 successfully (without memory or disk storage):
https://gist.github.com/mattlockyer/532291b6194f6d9ca40cb82564db9d2a
The server endpoint assumes req is a stream object. I sent a File object from the client, which modern browsers can send as binary data, and added the file info in the headers.
const fileUploadStream = (req, res) => {
    // get "body" args from the header
    const { id, fn } = JSON.parse(req.get('body'));
    const Key = id + '/' + fn; // upload to s3 folder "id" with filename === fn
    const params = {
        Key,
        Bucket: bucketName, // set somewhere
        Body: req, // req is a stream
    };
    s3.upload(params, (err, data) => {
        if (err) {
            res.send('Error Uploading Data: ' + JSON.stringify(err) + '\n' + JSON.stringify(err.stack));
        } else {
            res.send(Key);
        }
    });
};
Yes, putting the file info in the headers breaks convention, but if you look at the gist it's much cleaner than anything else I found using streaming libraries, multer, busboy, etc.
+1 for pragmatism, and thanks to @SalehenRahman for his help.
I'm using the s3-upload-stream module in a working project here.
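A rough sketch of how that module is typically wired up (hedged; it follows the s3-upload-stream README pattern with aws-sdk v2, and the bucket, key, and source file names are placeholders):

// Sketch: pipe an incoming stream into S3 via s3-upload-stream (multipart under the hood).
const AWS = require('aws-sdk');
const fs = require('fs');
const s3Stream = require('s3-upload-stream')(new AWS.S3());

const upload = s3Stream.upload({
    Bucket: 'myBucket',        // placeholder bucket
    Key: 'myKey',              // placeholder key
});

upload.on('error', (err) => console.error(err));
upload.on('part', (details) => console.log(details));     // per-part progress
upload.on('uploaded', (details) => console.log(details)); // final result

fs.createReadStream('bigfile').pipe(upload);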
There are also some good examples from @raynos in his http-framework repository.
Alternatively, you can look at https://github.com/minio/minio-js. It has a minimal set of abstracted APIs implementing the most commonly used S3 calls.
Here is an example of a streaming upload.
$ npm install minio
$ cat >> put-object.js << EOF
var Minio = require('minio')
var fs = require('fs')

// find out your s3 end point here:
// http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region
var s3Client = new Minio({
    url: 'https://<your-s3-endpoint>',
    accessKey: 'YOUR-ACCESSKEYID',
    secretKey: 'YOUR-SECRETACCESSKEY'
})

var localFile = 'your_localfile.zip'
var fileStream = fs.createReadStream(localFile)
fs.stat(localFile, function (e, stat) {
    if (e) {
        return console.log(e)
    }
    s3Client.putObject('mybucket', 'hello/remote_file.zip', 'application/octet-stream', stat.size, fileStream, function (e) {
        return console.log(e) // should be null
    })
})
EOF
putObject() here is a fully managed single function call; for file sizes over 5MB it automatically does a multipart upload internally. You can resume a failed upload as well, and it will start from where it left off by verifying the previously uploaded parts.
Additionally, this library is isomorphic and can be used in browsers as well.