AWS Lambda: converting an inbound PDF to JPGs - node.js

Currently I am doing a simple copy with a Lambda function in node.js, where I copy an incoming PDF file to another bucket.
What I would like to do is copy that PDF and also create a JPG of each page. I currently have a back-end process doing this with ImageMagick, but I would like to move it into my Lambda function, maybe using gm?
Here is my current code.
var params = {
  CopySource: srcBucket + '/' + srcKey,
  Bucket: destinationbucket,
  Key: 'outfile.pdf'
};
s3.copyObject(params, function(err, data) {
  if (err) console.log(err, err.stack); // an error occurred
  else console.log(data);               // successful response
  context.succeed('exit');
});

ImageMagick is available to Node.js Lambda functions. From the documentation:
"If you author your Lambda function code in Node.js, the following libraries are available in the AWS Lambda execution environment so you don't need to include them: ImageMagick: Installed with default settings. For versioning information, see imagemagick nodejs wrapper and ImageMagick native binary (search for 'ImageMagick')."
So you should be able to move your current solution to Lambda fairly easily.
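For example, here is a minimal sketch of what the conversion could look like inside the handler, using the bundled gm module in ImageMagick mode. The bucket names, the surrounding getObject/putObject wiring, and the page-selection syntax (the '[n]' suffix picks page n of the PDF) are assumptions for illustration, not tested code:
var aws = require('aws-sdk');
var gm = require('gm').subClass({ imageMagick: true }); // use the ImageMagick binaries bundled with Lambda
var s3 = new aws.S3();

// Convert one page of a PDF buffer (pageIndex is zero-based) to a JPG and upload it.
function convertPage(pdfBuffer, pageIndex, destBucket, destKeyPrefix, callback) {
  gm(pdfBuffer, 'source.pdf[' + pageIndex + ']') // '[n]' selects page n of the PDF
    .density(150, 150)                           // render resolution, adjust as needed
    .toBuffer('JPG', function (err, jpgBuffer) {
      if (err) return callback(err);
      s3.putObject({
        Bucket: destBucket,
        Key: destKeyPrefix + '-page-' + pageIndex + '.jpg',
        Body: jpgBuffer,
        ContentType: 'image/jpeg'
      }, callback);
    });
}
You would call this once per page after fetching the PDF with s3.getObject; determining the page count (for example with gm's identify) is left out of this sketch.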

Related

Lambda stream encoded string

I stored the audio file in an ArrayBuffer and encoded it with base64. I need to send the data from the Lambda to a React client. For larger audio files, I'm hitting the Lambda payload size limit.
Is there any way to stream the data in chunks from the Lambda to the client?
function readFile(filepath, callback) {
  fs.readFile(filepath, (err, data) => {
    // data is a Buffer object (backed by a Uint8Array)
    if (err) console.log(err);
    callback(data);
  });
}

readFile(`${outputFile}`, function (data) {
  try {
    let base64enc = base64.encode(data);
    responseBody.message = base64enc;
    status = statusCodes.OK;
    return sendResponse(status, responseBody); // sending response
  } catch (err) {
    console.log("error " + err);
  }
});
No, there isn't. Each Lambda invocation can return only one payload, and invocations are independent of each other. I suggest two possible solutions.
1) Request a specific chunk. You can call the Lambda function multiple times, with each call requesting a specific chunk of the data, and merge the chunks back into one file on the frontend.
2) Use S3. Media files are usually handled through S3. Assuming you already have the audio file on S3, you generate and return a pre-signed GET URL in the Lambda and use that URL to fetch the object on the frontend (you can refer to the code in "Presigned URL generation code times-out as Lambda, works locally" or other sources). You can also upload audio files by getting a pre-signed PUT URL in the Lambda and using it on the frontend.
I would suggest the second solution (sketched below) because it is a more standard way of dealing with media files.
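As a rough sketch of the second option (the bucket name and the shape of the event are placeholders, and the expiry is an arbitrary choice), the Lambda could return a pre-signed GET URL like this:
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

exports.handler = async (event) => {
  // Generate a time-limited URL the client can use to download the audio directly from S3.
  const url = s3.getSignedUrl('getObject', {
    Bucket: 'my-audio-bucket',   // placeholder bucket name
    Key: event.audioKey,         // assumed to be sent by the client
    Expires: 300                 // URL is valid for 5 minutes
  });
  return {
    statusCode: 200,
    body: JSON.stringify({ url })
  };
};
The client then simply fetches the returned URL, so the audio bytes never pass through the Lambda payload.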

download image on aws lambda nodejs

I need to download an image in my AWS Lambda function and use it later.
I have tried the http.get() method, but it requires a local file system to place the image in, which I guess is not available in a Lambda function.
I have also tried the request.get method, which is also not returning the correct response.
Currently my function looks like:
function download_image(image_url) {
  return new Promise(resolve => {
    request.get(image_url, function (error, response, body) {
      if (!error && response.statusCode == 200) {
        // let data = "data:" + response.headers["content-type"] + ";base64," + new Buffer(body).toString('base64');
        resolve("Downloaded");
      } else {
        resolve("Failed Downloaded");
      }
    });
  });
}
I am open to any way to store the image on S3, or to store it in DynamoDB in some format.
Any help will be appreciated.
I'm fairly sure you can use http.get in a Lambda.
Conceptually, you'd make the request, save the response to a byte array or Buffer, then write that to S3. S3 makes more sense for files, and retrieving them is much easier than with DynamoDB; with DynamoDB you also pay for writes and reads.
Saving an image stored on s3 using node.js?
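A minimal sketch of that idea (fetch the image into a Buffer with request, then putObject to S3); the bucket and key are placeholders and error handling is kept to the basics:
const request = require('request');
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

function downloadToS3(imageUrl, bucket, key) {
  return new Promise((resolve, reject) => {
    // encoding: null makes request return the body as a raw Buffer instead of a string
    request.get({ url: imageUrl, encoding: null }, (error, response, body) => {
      if (error || response.statusCode !== 200) {
        return reject(error || new Error('Download failed: ' + response.statusCode));
      }
      s3.putObject({
        Bucket: bucket,                                  // placeholder bucket
        Key: key,                                        // placeholder key
        Body: body,                                      // the image as a Buffer
        ContentType: response.headers['content-type']
      }, (err) => (err ? reject(err) : resolve(key)));
    });
  });
}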

S3 file upload stream using node js

I am trying to find a way to stream a file to Amazon S3 from a node js server, with these requirements:
Don't store a temp file on the server or in memory. Buffering up to some limit is fine, but not the complete file.
No restriction on the uploaded file size.
Don't block the server until the file upload completes, because during a heavy file upload other requests' waiting time will increase unexpectedly.
I don't want to upload directly from the browser, because the S3 credentials would need to be shared in that case. Another reason to upload from the node js server is that some authentication may also need to be applied before uploading the file.
I tried to achieve this using node-multiparty, but it was not working as expected. You can see my solution and the issue at https://github.com/andrewrk/node-multiparty/issues/49. It works fine for small files but fails for a file of 15 MB.
Any solution or alternative?
You can now use streaming with the official Amazon SDK for Node.js; see the section "Uploading a File to an Amazon S3 Bucket" in the documentation or their example on GitHub.
What's even more awesome, you can finally do so without knowing the file size in advance. Simply pass the stream as the Body:
var AWS = require('aws-sdk');
var fs = require('fs');
var zlib = require('zlib');

var body = fs.createReadStream('bigfile').pipe(zlib.createGzip());
var s3obj = new AWS.S3({params: {Bucket: 'myBucket', Key: 'myKey'}});
s3obj.upload({Body: body})
  .on('httpUploadProgress', function(evt) { console.log(evt); })
  .send(function(err, data) { console.log(err, data); });
For your information, the v3 SDK was published with a dedicated module to handle this use case: https://www.npmjs.com/package/@aws-sdk/lib-storage
Took me a while to find it.
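For reference, a minimal sketch of the same streaming upload with the v3 @aws-sdk/lib-storage module (bucket and key are placeholders):
const { S3Client } = require('@aws-sdk/client-s3');
const { Upload } = require('@aws-sdk/lib-storage');
const fs = require('fs');
const zlib = require('zlib');

async function uploadStream() {
  const upload = new Upload({
    client: new S3Client({}),
    params: {
      Bucket: 'myBucket',                                          // placeholder
      Key: 'myKey',                                                // placeholder
      Body: fs.createReadStream('bigfile').pipe(zlib.createGzip()) // stream of unknown length
    }
  });
  // Progress events are emitted as parts are uploaded.
  upload.on('httpUploadProgress', (progress) => console.log(progress));
  await upload.done();
}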
Give https://www.npmjs.org/package/streaming-s3 a try.
I used it for uploading several big files in parallel (>500 MB), and it worked very well.
It is very configurable and also lets you track upload statistics.
You don't need to know the total size of the object, and nothing is written to disk.
If it helps anyone, I was able to stream from the client to S3 successfully (without memory or disk storage):
https://gist.github.com/mattlockyer/532291b6194f6d9ca40cb82564db9d2a
The server endpoint assumes req is a stream object; I sent a File object from the client, which modern browsers can send as binary data, and added the file info in the headers.
const fileUploadStream = (req, res) => {
  // get "body" args from header
  const { id, fn } = JSON.parse(req.get('body'));
  const Key = id + '/' + fn; // upload to s3 folder "id" with filename === fn
  const params = {
    Key,
    Bucket: bucketName, // set somewhere
    Body: req,          // req is a stream
  };
  s3.upload(params, (err, data) => {
    if (err) {
      res.send('Error Uploading Data: ' + JSON.stringify(err) + '\n' + JSON.stringify(err.stack));
    } else {
      res.send(Key);
    }
  });
};
Yes, putting the file info in the headers breaks convention, but if you look at the gist it's much cleaner than anything else I found using streaming libraries, multer, busboy, etc.
+1 for pragmatism, and thanks to @SalehenRahman for his help.
I'm using the s3-upload-stream module in a working project here.
There are also some good examples from @raynos in his http-framework repository.
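For a rough idea of what that looks like, here is a sketch based on the s3-upload-stream README (the bucket, key, and source file are placeholders):
var AWS = require('aws-sdk');
var fs = require('fs');
var s3Stream = require('s3-upload-stream')(new AWS.S3());

// The module returns a writable stream that performs a multipart upload under the hood.
var upload = s3Stream.upload({ Bucket: 'myBucket', Key: 'myKey' });

upload.on('error', function (error) { console.log(error); });
upload.on('uploaded', function (details) { console.log(details); });

fs.createReadStream('bigfile').pipe(upload);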
Alternatively you can look at https://github.com/minio/minio-js. It has a minimal set of abstracted APIs implementing the most commonly used S3 calls.
Here is an example of a streaming upload.
$ npm install minio
$ cat >> put-object.js << EOF
var Minio = require('minio')
var fs = require('fs')

// find out your s3 end point here:
// http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region

var s3Client = new Minio({
  url: 'https://<your-s3-endpoint>',
  accessKey: 'YOUR-ACCESSKEYID',
  secretKey: 'YOUR-SECRETACCESSKEY'
})

var file = 'your_localfile.zip'
var fileStream = fs.createReadStream(file)
fs.stat(file, function(e, stat) {
  if (e) {
    return console.log(e)
  }
  s3Client.putObject('mybucket', 'hello/remote_file.zip', 'application/octet-stream', stat.size, fileStream, function(e) {
    return console.log(e) // should be null
  })
})
EOF
putObject() here is a fully managed single function call; for file sizes over 5 MB it automatically does a multipart upload internally. You can also resume a failed upload, and it will start from where it left off by verifying the previously uploaded parts.
Additionally, this library is isomorphic and can be used in browsers as well.

Advice: flatiron, formidable and aws s3

I'm new to server-side programming with node.js. I'm sticking together a tiny webapp with it right now and doing the usual startup learning. The following piece of code WORKS. But I would love to know whether it's more or less a right way to do a simple file upload from a form and throw it into AWS S3:
app.router.post('/form', { stream: true }, function () {
  var req = this.req,
      res = this.res,
      form = new formidable.IncomingForm();
  form
    .parse(req, function(err, fields, files) {
      console.log('Parsed file upload' + err);
      if (err) {
        res.end('error: Upload failed: ' + err);
      } else {
        var img = fs.readFileSync(files.image.path);
        var data = {
          Bucket: 'le-bucket',
          Key: files.image.name,
          Body: img
        };
        s3.client.putObject(data, function() {
          console.log("Successfully uploaded data to myBucket/myKey");
        });
        res.end('success: Uploaded file(s)');
      }
    });
});
Note: I had to turn buffering off in union / flatiron.plugins.http.
What I would like to learn is when to stream-load a file and when to sync-load it. It will be a really tiny webapp with little traffic.
If it's more or less good, then please consider this a token of working code, which I would also throw into a gist. It's not that easy to find documentation and working examples of this kind of stuff. I like flatiron a lot, but its small-module approach leads to docs and examples splattered all over the net, let alone tutorials.
You should use a different module than formidable because, as far as I know, formidable does not have an S3 storage option, so you must save the files on your server before uploading them.
I would recommend you use multiparty.
Use this example in order to upload directly to S3 without saving the file locally on your server.
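A rough sketch of that approach, streaming each multiparty part straight into s3.upload from the official aws-sdk (not the s3.client wrapper in the question); the bucket name is a placeholder and this assumes a single-file form:
var multiparty = require('multiparty');
var AWS = require('aws-sdk');
var s3 = new AWS.S3();

app.router.post('/form', { stream: true }, function () {
  var req = this.req,
      res = this.res,
      form = new multiparty.Form();

  // 'part' fires for each uploaded file; the part itself is a readable stream.
  form.on('part', function (part) {
    if (!part.filename) { part.resume(); return; } // skip non-file fields
    s3.upload({
      Bucket: 'le-bucket',   // placeholder bucket
      Key: part.filename,
      Body: part             // pipe the part directly to S3, no temp file on disk
    }, function (err) {
      if (err) return res.end('error: Upload failed: ' + err);
      res.end('success: Uploaded file(s)');
    });
  });

  form.on('error', function (err) {
    res.end('error: ' + err);
  });

  form.parse(req);
});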

Turn on Server-side encryption and Get Object Version in Amazon S3 with knox and nodejs

So far I've been able to successfully use node.js, express, and knox to add/update/delete/retrieve objects in Amazon S3. Trying to move things to the next level, I'm trying to figure out how to use knox (if it's possible) to do two things:
1) Set the object to use server-side encryption when adding/updating the object.
2) Get a particular version of an object, or get a list of versions of the object.
I know this is an old question, but it is possible to upload a file with knox using server-side encryption by specifying a header:
client.putFile('test.txt', '/test.txt', { "x-amz-server-side-encryption": "AES256" }, function(err, res) {
  // Do something here
});
Andy (who wrote AwsSum) here.
Using AwsSum, when you put an object, just set the 'ServerSideEncryption' to the value you want (currently S3 only supports 'AES256'). Easy! :)
e.g.
var body = ...; // a buffer, a string, a stream
var options = {
  BucketName : 'chilts',
  ObjectName : 'my-object.ext',
  ContentLength : Buffer.byteLength(body),
  Body : body,
  ServerSideEncryption : 'AES256'
};
s3.PutObject(options, function(err, data) {
  console.log("\nputting an object to pie-18 - expecting success");
  console.log(err, 'Error');
  console.log(data, 'Data');
});
