What "streams and pipe-capable" means in pkgcloud in NodeJS - node.js

My issue is getting image uploads to Amazon working.
I was looking for a solution that doesn't save the file on the server first and then upload it to Amazon.
Googling, I found pkgcloud, and its README.md says:
Special attention has been paid so that methods are streams and pipe-capable.
Can someone explain what that means and whether it is what I am looking for?

Yup, that means you've found the right kind of S3 library.
What it means is that this library exposes "streams". Here is the API that defines a stream: http://nodejs.org/api/stream.html
Using node's stream interface, you can pipe any readable stream (in this case the POST's body) to any writable stream (in this case the S3 upload).
Here is an example of how to pipe a file upload directly to another kind of library that supports streams: How to handle POSTed files in Express.js without doing a disk write
EDIT: Here is an example
var pkgcloud = require('pkgcloud'),
    fs = require('fs');

var s3client = pkgcloud.storage.createClient({ /* ... */ });

app.post('/upload', function(req, res) {
  var s3upload = s3client.upload({
    container: 'a-container',
    remote: 'remote-file-name.txt'
  });

  // pipe the image data directly to S3
  req.pipe(s3upload);
});
EDIT: To finish answering the questions that came up in the chat:
When req ends, pipe() will automatically call s3upload.end() thanks to stream magic. If the OP wants to do anything else when req ends, he can do so easily: req.on('end', function () { res.send("done!"); })
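If you want to respond only after the upload itself has finished, a variant like this should work (a sketch; it assumes the pkgcloud upload stream emits 'error' and 'success' events, which is what its storage docs describe):
app.post('/upload', function(req, res) {
  var s3upload = s3client.upload({
    container: 'a-container',
    remote: 'remote-file-name.txt'
  });

  // assumed pkgcloud events: 'error' on failure, 'success' when the object is stored
  s3upload.on('error', function(err) {
    res.status(500).send('upload failed');
  });
  s3upload.on('success', function(file) {
    res.send('done!');
  });

  req.pipe(s3upload);
});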

Related

What's the best way to upload larger files to S3 with the Node.js aws-sdk? MultipartUpload vs ManagedUpload vs getSignedURL, etc.

I'm trying to look over the ways AWS offers to upload files to S3. When I looked into their docs, it confused the hell out of me. Looking through various resources I learned a bit more about options like s3.upload vs s3.putObject, and also realised there are payload size limits in API Gateway when using a Lambda function to upload a file.
Particularly for uploading large files, like 1-100 GB, AWS suggests multiple methods to upload to S3. Among them are createMultipartUpload, ManagedUpload, getSignedURL and tons of others.
So my question is:
What is the best and easiest way to upload large files to S3, where I can also cancel the upload process? The multipart upload seems too tedious.
There's no single Best Way to upload a file to S3.
It depends on what you want, especially on the size of the objects you want to upload.
putObject - Ideal for objects which are under 20MB
Presigned URL - Allows you to bypass API Gateway and PUT an object of under 5GB to the S3 bucket (see the sketch after this list)
Multipart Upload - Allows you to upload files in chunks, which means you can resume your upload even if the connection drops temporarily. The maximum file size you can upload via this method is 5TB.
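For the presigned URL option, a minimal sketch with the aws-sdk v2 (bucket and key names are placeholders): the server only signs the URL, and the client PUTs the file body straight to S3.
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

// 'my-bucket' and 'uploads/large-file.dat' are placeholder values
const url = s3.getSignedUrl('putObject', {
  Bucket: 'my-bucket',
  Key: 'uploads/large-file.dat',
  Expires: 300 // seconds the URL stays valid
});

// Hand `url` to the client, which then does an HTTP PUT of the file body to it.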
Use streams to upload to S3; that way the Node.js server doesn't take up too many resources.
const AWS = require('aws-sdk');
const fs = require('fs');
const stream = require('stream');

const S3 = new AWS.S3();

function upload(S3) {
  // A PassThrough stream lets us hand S3.upload a Body before the data exists yet
  const pass = new stream.PassThrough();
  const params = {
    Bucket: BUCKET,
    Key: KEY,
    Body: pass
  };
  S3.upload(params, function (error, data) {
    if (error) {
      console.error(error);
    } else {
      console.info(data);
    }
  });
  return pass;
}

const readStream = fs.createReadStream('/path/to/your/file');
readStream.pipe(upload(S3));
This example streams a local file, but the stream can come from a request as well.
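For example, in an Express-style handler (a sketch; upload() is the helper defined above), the incoming request body can be piped in the same way:
app.post('/upload', (req, res) => {
  // req is itself a readable stream, so it pipes just like the file stream
  req.pipe(upload(S3));
  // the S3.upload callback inside upload() fires when the upload completes
  req.on('end', () => res.send('received'));
});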
If you want to listen to the upload progress, you can use ManagedUpload:
const manager = S3.upload(params);
manager.on('httpUploadProgress', (progress) => {
  console.log('progress', progress);
  // { loaded: 6472, total: 345486, part: 3, key: 'large-file.dat' }
});
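The ManagedUpload object is also what gives you cancellation, which the question asks about: it exposes an abort() method. A rough sketch (the timer only stands in for whatever your real cancel trigger is):
const manager = S3.upload(params);

// Cancel the in-flight upload; a timeout is used here only as an example trigger
setTimeout(() => manager.abort(), 10000);

manager.promise()
  .then((data) => console.info('upload finished', data))
  .catch((err) => console.error('upload failed or was aborted', err));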

Download file from s3 without writing it to the file system in nodejs

I have a Node.js server running with Hapi.
One of the server's jobs is to send files to a service API (the API only accepts streams; when I send a buffer it returns an error) when the user asks.
All the files are stored in S3.
When I download them using promise(), I get a buffer in the body.
And I can get a PassThrough if I'm using createReadStream().
My problem is that when I try to convert the buffer to a stream and send it, the API rejects it, and the same happens when I use the createReadStream() result,
but when I use fs to save the file and then fs to read it, the API accepts the stream and it works.
So I need help: how can I get the same result without saving and reading the file?
edit:
Here is my code. I know it's the wrong way, but it works; I need a better way that will work.
static async downloadFile(Bucket, Key) {
  const result = await s3Client
    .getObject({
      Bucket,
      Key
    })
    .promise();
  fs.writeFileSync(`${Path.basename(Key)}`, result.Body);
  const file = await fs.createReadStream(`${Path.basename(Key)}`);
  return file;
}
If I understand it correctly, you want to get the object from the S3 bucket and stream it to your HTTP response as a stream.
Getting the data into buffers and then figuring out how to convert it to a stream can be complicated and has its limitations. If you really want to leverage the power of streams, don't try to convert the data to a buffer and load the entire object into memory; instead, create a request that streams the returned data directly into a Node.js Stream object by calling the createReadStream method on the request.
Calling createReadStream returns the raw HTTP stream managed by the request. The raw data stream can then be piped into any Node.js Stream object.
This technique is useful for service calls that return raw data in their payload, such as calling getObject on an Amazon S3 service object to stream data directly into a file, as shown in this example.
// I imagine you have something similar.
server.get('/image', (req, res) => {
  let s3 = new AWS.S3({ apiVersion: '2006-03-01' });
  let params = { Bucket: 'myBucket', Key: 'myImageFile.jpg' };
  let readStream = s3.getObject(params).createReadStream();

  // When the stream is done being read, end the response
  readStream.on('close', () => {
    res.end();
  });
  readStream.pipe(res);
});
When you stream data from a request using createReadStream, only the raw HTTP data is returned. The SDK does not post-process the data; this raw HTTP data can be returned directly.
Note:
Because Node.js is unable to rewind most streams, if the request initially succeeds, then retry logic is disabled for the rest of the response. In the event of a socket failure while streaming, the SDK won't attempt to retry or send more data to the stream. Your application logic needs to identify such streaming failures and handle them.
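A minimal way to handle such a failure (a sketch, reusing the hypothetical route above) is to attach an 'error' listener to the read stream before piping it into the response:
server.get('/image', (req, res) => {
  let s3 = new AWS.S3({ apiVersion: '2006-03-01' });
  let params = { Bucket: 'myBucket', Key: 'myImageFile.jpg' };
  let readStream = s3.getObject(params).createReadStream();

  // The SDK will not retry a failed stream, so handle the error yourself
  readStream.on('error', (err) => {
    console.error('S3 stream failed:', err);
    if (!res.headersSent) {
      res.statusCode = 500;
    }
    res.end();
  });

  readStream.pipe(res);
});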
Edits:
After the edits on the original question, I can see that S3 sends a PassThrough stream object, which is different from a FileStream in Node.js. So, to get around the problem, use memory instead (if your files are not very big and/or you have enough memory).
Use the package memfs; it will replace the native fs in your app:
https://www.npmjs.com/package/memfs
Install the package with npm install memfs and require it as follows:
const {fs} = require('memfs');
and your code will look like:
static async downloadFile(Bucket, Key) {
  const result = await s3
    .getObject({
      Bucket,
      Key
    })
    .promise();
  fs.writeFileSync(`/${Key}`, result.Body);
  const file = await fs.createReadStream(`/${Key}`);
  return file;
}
Note that the only change I have made to your function is the path: ${Path.basename(Key)} becomes /${Key}, because you no longer need to know the path on your original filesystem; we are storing the files in memory. I have tested this and the solution works.

Streaming upload from NodeJS to Dropbox

Our system needs to use our internal security checks when interacting with Dropbox, so we cannot use the client-side SDK for Dropbox.
We would rather upload to our own endpoint, apply security checks, and then stream the incoming request to Dropbox.
I am coming up short here, as there was an older NodeJS Dropbox SDK which supported pipes, but the new SDK does not.
Old SDK:
https://www.npmjs.com/package/dropbox-node
We want to take the incoming upload request and forward it to Dropbox as it comes in (and thus prevent the upload from taking twice as long, as it would if we first uploaded the entire thing to our server and then uploaded it to Dropbox).
Is there any way to solve this?
My Dropbox NPM module (dropbox-v2-api) supports streaming. It's based on the HTTP API, so you can take advantage of streams. Example? I see it this way:
const contentStream = fs.createReadStream('file.txt');
const securityChecks = ... // your security checks
const uploadStream = dropbox({
  resource: 'files/upload',
  parameters: { path: '/target/file/path' }
}, (err, result, response) => {
  // upload finished
});

contentStream
  .pipe(securityChecks)
  .pipe(uploadStream);
Full stream support example here.
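The securityChecks placeholder is left elided above; one possible shape for it (purely a sketch, assuming your checks can run on the raw bytes) is a Transform stream that inspects each chunk as it flows through:
const { Transform } = require('stream');

// Hypothetical check: inspect each chunk and either forward it or fail the pipe
const securityChecks = new Transform({
  transform(chunk, encoding, callback) {
    // ... run your checks on `chunk` here (size limits, magic bytes, etc.) ...
    callback(null, chunk); // forward the data unchanged
  }
});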

Streaming files directly to Client from Amazon S3 (Node.js)

I am using sails.js and am trying to stream files from the Amazon s3 server directly to the client.
To connect to S3, I use the s3 module: https://www.npmjs.org/package/s3
This module provides capabilities like client.downloadFile(params) and client.downloadBuffer(s3Params).
My current code looks like the following:
var view = client.downloadBuffer(params);
view.on('error', function(err) {
  cb({ success: 0, message: 'Could not open file.' }, null);
});
view.on('end', function(buffer) {
  cb(null, buffer);
});
I catch this buffer in a controller using:
User.showImage(params, function (err, buffer) {
  // this is where I can get the buffer
});
Is it possible to stream this data to the client as an image file? (Using buffer.pipe(res) doesn't work, of course.) Is there something similar that completely avoids saving the file to server disk first?
The other option, client.downloadFile(params), requires a local path (i.e. a server path in our case).
The GitHub issue contains the "official" answer to this question: https://github.com/andrewrk/node-s3-client/issues/53
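For reference (this is not necessarily what the linked issue recommends), another way to avoid both a buffer and a disk write is to drop down to the aws-sdk itself and pipe getObject's read stream straight to the response, as in the earlier answer above:
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

// Hypothetical controller action: stream the object directly to the client
function showImage(req, res) {
  const params = { Bucket: 'myBucket', Key: 'myImageFile.jpg' };
  const readStream = s3.getObject(params).createReadStream();

  readStream.on('error', (err) => {
    res.statusCode = 500;
    res.end('Could not open file.');
  });

  readStream.pipe(res);
}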

Store WebM file in Redis (NodeJS)

I'm searching for a solution to store a WebM file in Redis.
Let me explain the situation:
The NodeJS server receives a WebM file from a client and saves it into the server file system.
Then it has to save this file in Redis, because I don't want to manage both Redis and the file system. This way I can delete the video with just a Redis command.
I'm thinking of reading the file with fs.readFile() and then saving it into a Buffer, but I don't know which encoding to use, and I don't know how to give the WebM video back to a client when it makes a request.
Is this a good way to proceed? Any suggestions?
PS: I use formidable to upload the file.
EDIT: I found a way to proceed, but there's another problem:
var file = fs.readFileSync("./video.webm");
client.set("video1", file, function() {
  client.get("video1", function(err, data) {
    var buffer = new Buffer(data, 'binary');
    // file ≠ buffer
  });
});
Is this an encoding problem, like Unicode/UTF-8/ASCII?
Maybe Node and Redis use different encodings?
Solution found!
The problem arises when you create the client object.
Usually this is what is done:
var client = redis.createClient();
and the return_buffers param will be set to false.
Doing it this way instead:
var client = redis.createClient(6379, '127.0.0.1', {
  return_buffers: true,
  auth_pass: null
});
everything goes right! ;)
This is the issue page that helped me.
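To answer the other half of the original question, giving the WebM back to a client: with return_buffers: true the value from get is a Buffer, so a response handler could look roughly like this (a sketch, assuming an Express-style route):
app.get('/video1', function(req, res) {
  client.get('video1', function(err, data) {
    if (err || !data) {
      res.statusCode = 404;
      return res.end();
    }
    // `data` is a Buffer because the client was created with return_buffers: true
    res.writeHead(200, {
      'Content-Type': 'video/webm',
      'Content-Length': data.length
    });
    res.end(data);
  });
});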
I don't know much about NodeJS and WebM files.
Redis stores strings as arrays of 8-bit C chars, so it is binary friendly. Check the JS code and configuration to ensure your Redis client sends/receives data as a byte array and not as a UTF-8 string; there is probably a bad conversion of the data in JS.
