Node.js stream upload directly to Google Cloud Storage - node.js

I have a Node.js app running on a Google Compute VM instance that receives file uploads directly from POST requests (not via the browser) and streams the incoming data to Google Cloud Storage (GCS).
I'm using Restify because I don't need the extra functionality of Express and because it makes it easy to stream the incoming data.
I create a random filename for the file, take the incoming req and toss it to a neat little Node wrapper for GCS (found here: https://github.com/bsphere/node-gcs) which makes a PUT request to GCS. The documentation for GCS using PUT can be found here: https://developers.google.com/storage/docs/reference-methods#putobject ... it says Content-Length is not necessary if using chunked transfer encoding.
Good news: the file is being created inside the appropriate GCS storage "bucket"!
Bad News:
I haven't figured out how to get the incoming file's extension from Restify (notice I'm setting '.jpg' and the content-type manually).
The file is experiencing slight corruption (almost certainly due to something I'm doing wrong with the PUT request). If I download the POSTed file from Google, OS X tells me it's damaged ... BUT, if I use Photoshop, it opens and looks just fine.
Update / Solution
As pointed out by vkurchatkin, I needed to parse the request object instead of just piping the whole thing to GCS. After trying out the lighter busboy module, I decided it was just a lot easier to use multiparty. For dynamically setting the Content-Type, I simply used Mimer (https://github.com/heldr/mimer), referencing the file extension of the incoming file. It's important to note that since we're piping the part object, the part.headers must be cleared out. Otherwise, unintended info, specifically content-type, will be passed along and can/will conflict with the content-type we're trying to set explicitly.
Here's the applicable, modified code:
var restify = require('restify'),
    server = restify.createServer(),
    GAPI = require('node-gcs').gapitoken,
    GCS = require('node-gcs'),
    multiparty = require('multiparty'),
    Mimer = require('mimer');

server.post('/upload', function(req, res) {
  var form = new multiparty.Form();

  form.on('part', function(part) {
    var fileType = '.' + part.filename.split('.').pop().toLowerCase();
    var fileName = Math.random().toString(36).slice(2) + fileType;

    // clear out the part's headers to prevent conflicting data being passed to GCS
    part.headers = null;

    var gapi = new GAPI({
      iss: '-- your --@developer.gserviceaccount.com',
      scope: 'https://www.googleapis.com/auth/devstorage.full_control',
      keyFile: './key.pem'
    },
    function(err) {
      if (err) { console.log('google cloud authorization error: ' + err); }

      var headers = {
        'Content-Type': Mimer(fileType),
        'Transfer-Encoding': 'Chunked',
        'x-goog-acl': 'public-read'
      };

      var gcs = new GCS(gapi);

      // myBucket is the name of the target GCS bucket (defined elsewhere)
      gcs.putStream(part, myBucket, '/' + fileName, headers, function(gerr, gres) {
        console.log('file should be there!');
      });
    });
  });

  // kick off parsing of the incoming request (without this the 'part' event never fires)
  form.parse(req);
});

You can't use the raw req stream, since it yields the whole request body, which is multipart. You need to parse the request with something like multiparty, which gives you a readable stream and all the metadata you need.
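For reference, here's a minimal sketch of that idea (the temp path is just a placeholder): each part that multiparty emits is itself a readable stream and carries the metadata (filename, headers) that the raw req stream doesn't expose.

var restify = require('restify');
var multiparty = require('multiparty');
var fs = require('fs');

var server = restify.createServer();
server.post('/upload', function(req, res) {
  var form = new multiparty.Form();
  form.on('part', function(part) {
    // each part is a readable stream with its own metadata
    console.log('filename:', part.filename, 'content-type:', part.headers['content-type']);
    part.pipe(fs.createWriteStream('/tmp/' + part.filename)); // or pipe it to GCS instead
  });
  form.on('close', function() { res.send(200); });
  form.parse(req);
});
server.listen(8080);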

Related

"Error: MultipartParser.end(): stream ended unexpectedly" error when uploading a file

I'm uploading a file from a Buffer using form-data. On the server side I'm using formidable to parse the file data. I keep getting errors like this on the server:
Error: MultipartParser.end(): stream ended unexpectedly: state = START
or
Error: MultipartParser.end(): stream ended unexpectedly: state = PART_DATA
I'm not sure if this is a problem with form-data or formidable. I found a lot of suggested solutions (mostly involving not setting the Content-Type header manually), but none of them resolved the issue for me. I eventually figured something out, so I'm posting it here as an answer.
I encountered this issue while developing a Strapi upload provider. Strapi provides information about a file that needs to be uploaded to a service. The file contents are provided as a Buffer (for some reason). Here's what my code looked like when I was getting the error (modified slightly):
const form = new FormData()
form.append('file', Readable.from(file.buffer))
form.append('name', file.name)
form.append('hash', file.hash)
form.append('mime', file.mime)
form.on('error', () => abortController.abort())
return fetch(url, {
  method: 'post',
  body: form,
  signal: abortController.signal,
})
Again, I'm not sure if this is a problem with form-data or formidable, but if I provide a filename and knownLength to form-data, the issue goes away. This is what my final code looks like (modified slightly):
const fileStream = Readable.from(file.buffer)
const fileSize = Buffer.byteLength(file.buffer)
const abortController = new AbortController()
const form = new FormData()
form.append('file', fileStream, {filename: file.name, knownLength: fileSize})
form.append('name', file.name)
form.append('hash', file.hash)
form.append('mime', file.mime)
form.on('error', () => abortController.abort())
return fetch(url, {
  method: 'post',
  body: form,
  signal: abortController.signal,
})
I've tried logging the form.on('error') result and I get nothing (it's not aborting).
I've tried just setting filename and I get the same error.
I've tried just setting knownLength. The file uploads but it's empty (at least, formidable thinks it is). It must need the filename in order to parse the file correctly?
This is likely an issue with form-data not reading the input stream properly or not writing to the output stream properly (I did notice looking at the raw form data on the server that the file data was truncated) or with formidable not reading the file data properly. There's something about setting the filename and knownLength that bypasses the issue.
UPDATE: This may have been partially fixed in a newer version of form-data. I updated the package and no longer need to set the knownLength. I still need to set filename though. Without it, the server thinks the file is empty.
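Another option worth noting (a sketch only, not tested against Strapi): form-data can work out the part's length itself if you append the Buffer directly instead of wrapping it in a Readable, so only the filename needs to be supplied.

const FormData = require('form-data')
const fetch = require('node-fetch') // or whatever fetch implementation you use

async function uploadFile(url, file) { // hypothetical helper; `file` is the provider's file object
  const form = new FormData()
  // appending the Buffer directly lets form-data compute the part's length;
  // filename is still needed so the server-side parser treats it as a file
  form.append('file', file.buffer, { filename: file.name })
  form.append('name', file.name)
  form.append('hash', file.hash)
  form.append('mime', file.mime)
  return fetch(url, { method: 'post', body: form })
}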

How to download a multipart wav file from cloudant database and save locally using Node JS and REST API?

I am stuck retrieving a multipart attachment from Cloudant using the Node JS API, so I used the REST API to download the wav file from the Cloudant database. But it's not downloading the wav file from the https URL. When I enter the https URL directly in the browser, it prompts me to save the file locally, so the URL is correct.
Here is the code for REST API:
var request1 = require('request');
var filestream = fs.createWriteStream("input.wav");
var authenticationHeader = "Basic " + new Buffer(user + ":" + pass).toString("base64");
request1( { url : "example.com/data/1533979044129/female";, headers : { "Authorization" : authenticationHeader } },
  function (error, httpResponse, body) {
    const statusCode = httpResponse.statusCode;
    httpResponse.pipe(filestream);
    httpResponse.on('end', function () {
      console.log("file complete");
      filestream.close();
    });
});
The file size of input.wav is 0. It's not downloading the file. Please help.
Your callback has an error argument, which you are completely ignoring. Do something with this error, like print it out, so the problem can tell you what you're doing wrong. I definitely see at least one problem in your source, and the error from request should tell you what it is.
Edit: On second thought, the above code shouldn't even execute. You should share code that you tested yourself; there are typos in there.
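For reference, a corrected sketch of the download (URL and credentials are placeholders, as in the question): pipe the request stream itself to the file instead of the already-consumed response inside the callback, and log any error instead of ignoring it.

var fs = require('fs');
var request = require('request');

var user = '...', pass = '...'; // Cloudant credentials, as in the question
var authenticationHeader = "Basic " + new Buffer(user + ":" + pass).toString("base64");
var filestream = fs.createWriteStream("input.wav");

request({ url: "https://example.com/data/1533979044129/female", headers: { "Authorization": authenticationHeader } })
  .on('error', function(err) { console.log(err); }) // don't swallow errors
  .pipe(filestream)
  .on('finish', function() {
    console.log("file complete");
  });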

Having problems handling UIImageJPEGRepresentation or UIImagePNGRepresentation in Node.js

TL;DR: How do you write UIImageRepresentation data into the actual file format in a Node.js server? (or any place outside of Swift at that)
So I'm in a bit of a predicament here...
I wanted to send a UIImageJPEGRepresentation (or any form of data encoded imagery) through Alamofire, over to a node server, to be saved there. I used Busboy to handle MultipartFormData...
busboy.on('field', function(fieldname, val, fieldnameTruncated, valTruncated, encoding, mimetype) {
  var datamail = './storage/' + fieldname;
  var stream = fs.createWriteStream(datamail)
  console.log('Field [' + fieldname + ']: value: ' + util.inspect(val));
  console.log('Storing letter in ' + datamail);
  stream.write(val);
});
and save it through a write stream. I originally wanted to read a UIImage I had put in, but I wasn't sure how the server would respond to an object like that, so I went and used UIImageJPEGRepresentation. It read the UIImageJPEGRepresentation object right...
Field [test.png]: value:
'�PNG\r\n\u001a\n\u0000\u0000\u0000\rIHDR\u0000\u0000\u0000d\u0000\u0000\u0000d\b\u0002\u0000\u0000\u0000...' (the rest of the logged value is a long run of raw PNG bytes rendered as an escaped string)
and was successfully able to save it into a .jpeg (SO won't let me add a picture because my rep isn't high enough, so I linked it instead), only to find that it was a corrupted image that I couldn't use. My app extracts frames out of the camera on the fly and converts them to UIImage(JPEG/PNG)Representation (see gist). The next best option seemed to be a direct upload (since it worked in Postman), but Alamofire only supports raw Data objects or URL-encoded payloads, and I don't think UIImageJPEGRepresentations can be sent directly. I really just want to know how to handle these objects.
Thanks in advance.
Fixed this one quicker than I thought I would. I placed a logger in my server code to check for the MIME type:
busboy.on('field', function(fieldname, val, fieldnameTruncated, valTruncated, encoding, mimetype) {
  // var datamail = /*path.join('.', 'storage', fieldname);*/ './storage/' + fieldname;
  // var stream = fs.createWriteStream(datamail)
  console.log('MIMETYPE: ' + mimetype);
});
...and discovered that it was text/plain. I went into Alamofire and changed the params,
from this:
Alamofire.upload(
    multipartFormData: { multipartFormData in
        multipartFormData.append(compresso!, withName: "test.png")
    },
to this:
Alamofire.upload(
    multipartFormData: { multipartFormData in
        // notice the MIME change
        multipartFormData.append(compresso!, withName: "test.png", fileName: "file.jpg", mimeType: "image/png")
    },
And it worked! It was able to safely go through processing and do its thing. They should really change this in the Alamofire examples, as they use a PNG representation and send it without the extra params (see "Uploading Data to a Server"). It's stuff like that that could potentially keep a dev up at night...
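For completeness, on the server side the same part now arrives through busboy's 'file' event (with the busboy version used above) as a stream you can pipe straight to disk; the storage path here is just a placeholder:

var fs = require('fs');

busboy.on('file', function(fieldname, file, filename, encoding, mimetype) {
  console.log('File [' + fieldname + ']: ' + filename + ' (' + mimetype + ')');
  file.pipe(fs.createWriteStream('./storage/' + filename)); // stream the upload straight to disk
});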

change content-disposition on piped response

I have the following controller that gets a file from a service and pipes the response to the browser.
function (req, res) {
  request.get(serviceUrl).pipe(res);
}
I'd like to change the content-disposition (from attachment to inline) so the browser opens the file instead of downloading it directly.
I already tried this, but it is not working:
function (req, res) {
  res.set('content-disposition','inline');
  request.get(serviceUrl).pipe(res);
}
The versions I'm using are:
NodeJS: 0.12.x
Express: 4.x
To do this you can use an intermediate pass-through stream between the request and the response; that way the headers from the upstream request won't be passed along to the response:
var through2 = require('through2'); // or whatever you like better

function (req, res) {
  var passThrough = through2(); // this stream is necessary to put correct response headers
  res.set('content-disposition','inline');
  request.get(serviceUrl).pipe(passThrough).pipe(res);
}
But be careful, as this will ignore all upstream headers, and you will probably need to specify 'Content-Type', etc. yourself.
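If you only need to forward a couple of headers, one way (a sketch, reusing serviceUrl from above) is to copy them from the upstream response when it arrives and keep piping the body through the pass-through stream:

var through2 = require('through2');
var request = require('request');

function download(req, res) { // hypothetical route handler
  var passThrough = through2();
  res.set('content-disposition', 'inline');
  request.get(serviceUrl)
    .on('response', function(upstream) {
      // forward only the headers you actually want
      res.set('content-type', upstream.headers['content-type']);
    })
    .pipe(passThrough)
    .pipe(res);
}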

S3 file upload stream using node js

I am trying to find a solution to stream a file to Amazon S3 from a Node.js server, with these requirements:
Don't store a temp file on the server or in memory. Buffering up to some limit (but never the complete file) is acceptable for uploading.
No restriction on uploaded file size.
Don't freeze the server until the file upload completes, because a heavy file upload would otherwise unexpectedly increase the waiting time for other requests.
I don't want to upload the file directly from the browser, because the S3 credentials would need to be shared in that case. Another reason to upload from the Node.js server is that some authentication may also need to be applied before uploading the file.
I tried to achieve this using node-multiparty, but it was not working as expected. You can see my solution and the issue at https://github.com/andrewrk/node-multiparty/issues/49. It works fine for small files but fails for a file of size 15MB.
Any solution or alternative ?
You can now use streaming with the official Amazon SDK for nodejs in the section "Uploading a File to an Amazon S3 Bucket" or see their example on GitHub.
What's even more awesome, you can finally do so without knowing the file size in advance. Simply pass the stream as the Body:
var AWS = require('aws-sdk');
var fs = require('fs');
var zlib = require('zlib');

var body = fs.createReadStream('bigfile').pipe(zlib.createGzip());
var s3obj = new AWS.S3({params: {Bucket: 'myBucket', Key: 'myKey'}});
s3obj.upload({Body: body})
  .on('httpUploadProgress', function(evt) { console.log(evt); })
  .send(function(err, data) { console.log(err, data); });
For your information, the v3 SDK was published with a dedicated module to handle that use case: https://www.npmjs.com/package/@aws-sdk/lib-storage
Took me a while to find it.
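A minimal sketch of that approach (bucket, key and region are placeholders): the Upload helper accepts a stream of unknown length as the Body and handles the multipart upload for you.

const { S3Client } = require('@aws-sdk/client-s3');
const { Upload } = require('@aws-sdk/lib-storage');
const fs = require('fs');
const zlib = require('zlib');

const upload = new Upload({
  client: new S3Client({ region: 'us-east-1' }), // region is a placeholder
  params: {
    Bucket: 'myBucket',
    Key: 'myKey',
    Body: fs.createReadStream('bigfile').pipe(zlib.createGzip()), // stream of unknown length
  },
});
upload.on('httpUploadProgress', (progress) => console.log(progress));
upload.done().then((data) => console.log(data), (err) => console.log(err));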
Give https://www.npmjs.org/package/streaming-s3 a try.
I used it for uploading several big files in parallel (>500MB), and it worked very well.
It's very configurable and also allows you to track upload statistics.
You don't need to know the total size of the object, and nothing is written to disk.
If it helps anyone, I was able to stream from the client to S3 successfully (without memory or disk storage):
https://gist.github.com/mattlockyer/532291b6194f6d9ca40cb82564db9d2a
The server endpoint assumes req is a stream object; I sent a File object from the client, which modern browsers can send as binary data, with the file info set in the headers.
const fileUploadStream = (req, res) => {
  // get "body" args from header
  const { id, fn } = JSON.parse(req.get('body'));
  const Key = id + '/' + fn; // upload to s3 folder "id" with filename === fn
  const params = {
    Key,
    Bucket: bucketName, // set somewhere
    Body: req, // req is a stream
  };
  s3.upload(params, (err, data) => {
    if (err) {
      res.send('Error Uploading Data: ' + JSON.stringify(err) + '\n' + JSON.stringify(err.stack));
    } else {
      res.send(Key);
    }
  });
};
Yes, putting the file info in the headers breaks convention, but if you look at the gist it's much cleaner than anything else I found using streaming libraries or multer, busboy, etc...
+1 for pragmatism, and thanks to @SalehenRahman for his help.
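For context, the client side of that approach looks roughly like this in a modern browser (the endpoint and the id are assumptions made to match the gist's header format):

// send the File itself as the raw request body, with the metadata the
// endpoint reads from req.get('body') packed into a header
const input = document.querySelector('input[type="file"]');
const file = input.files[0];
fetch('/upload', {
  method: 'POST',
  headers: { body: JSON.stringify({ id: 'some-id', fn: file.name }) },
  body: file, // browsers send File/Blob bodies as binary data
});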
I'm using the s3-upload-stream module in a working project here.
There are also some good examples from @raynos in his http-framework repository.
Alternatively you can look at https://github.com/minio/minio-js. It has a minimal set of abstracted APIs implementing the most commonly used S3 calls.
Here is an example of streaming upload.
$ npm install minio
$ cat >> put-object.js << EOF
var Minio = require('minio')
var fs = require('fs')

// find out your s3 end point here:
// http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region
var s3Client = new Minio({
  url: 'https://<your-s3-endpoint>',
  accessKey: 'YOUR-ACCESSKEYID',
  secretKey: 'YOUR-SECRETACCESSKEY'
})

var file = 'your_localfile.zip'
var fileStream = fs.createReadStream(file)
fs.stat(file, function(e, stat) {
  if (e) {
    return console.log(e)
  }
  s3Client.putObject('mybucket', 'hello/remote_file.zip', 'application/octet-stream', stat.size, fileStream, function(e) {
    return console.log(e) // should be null
  })
})
EOF
putObject() here is a fully managed single function call; for file sizes over 5MB it automatically does a multipart upload internally. You can resume a failed upload as well, and it will start from where it left off by verifying the previously uploaded parts.
Additionally, this library is isomorphic and can be used in browsers as well.
