How to upload a file in Node.js with the http module?

I have this code so far, but cannot get the Buffer binary.
var http = require('http');

var myServer = http.createServer(function(request, response) {
    var data = '';

    request.on('data', function (chunk) {
        data += chunk;
    });

    request.on('end', function() {
        if (request.headers['content-type'] == 'image/jpg') {
            var binary = Buffer.concat(data);
            // some file handling would go here if binary were OK
            response.write(binary.size);
            response.writeHead(201);
            response.end();
        }
    });
});
But I get this error: TypeError: Usage: Buffer.concat(list, [length])

You're doing three bad things:
1. Using the Buffer API wrong (hence the error message).
2. Concatenating binary data as strings.
3. Buffering all the data in memory.
Mukesh has dealt with #1, so I'll cover the deeper problems.
First, you're receiving binary Buffer chunks, converting them to strings with the default (utf8) encoding, and then concatenating them. This will corrupt your data: not only are there byte sequences that aren't valid UTF-8, but a valid multi-byte sequence that gets cut in half by a chunk boundary will be lost too.
Instead, you should keep the data as binary data throughout. Maintain an array of Buffers, push each chunk onto it, then concatenate them all at the end, as sketched below.
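A minimal sketch of that pattern, adapted to the question's server (the port and response text are assumptions):

var http = require('http');

http.createServer(function (request, response) {
    var chunks = [];

    request.on('data', function (chunk) {
        chunks.push(chunk); // each chunk stays a Buffer; no string conversion
    });

    request.on('end', function () {
        var binary = Buffer.concat(chunks); // one concatenation at the very end
        response.writeHead(201, { 'Content-Type': 'text/plain' });
        response.end('Received ' + binary.length + ' bytes');
    });
}).listen(8080);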
This leads to problem #3. You are buffering the whole upload into memory and then writing it to a file, instead of streaming it directly to a (temporary) file. This puts a lot of load on your application, both using up memory and spending time allocating it all. You should just pipe the request to a file output stream, then inspect it on disk.
If you are only accepting very small files you may get away with keeping them in memory, but you need to protect yourself from clients sending too much data (and indeed lying about how much they're going to send).
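A sketch of that streaming approach with a crude size guard; the 10MB limit, temp path, and port are assumptions:

var http = require('http');
var fs = require('fs');

var MAX_BYTES = 10 * 1024 * 1024; // assumed upload limit

http.createServer(function (request, response) {
    var received = 0;
    var aborted = false;
    var tmpPath = '/tmp/upload-' + Date.now() + '.jpg'; // assumed temp location
    var out = fs.createWriteStream(tmpPath);

    request.on('data', function (chunk) {
        received += chunk.length;
        if (received > MAX_BYTES && !aborted) {
            aborted = true;
            request.unpipe(out);                 // stop feeding the file
            out.destroy();
            fs.unlink(tmpPath, function () {});  // discard the partial file
            response.writeHead(413);             // Payload Too Large
            response.end();
        }
    });

    request.pipe(out);

    out.on('finish', function () {
        if (!aborted) {
            response.writeHead(201);
            response.end('Stored ' + received + ' bytes at ' + tmpPath);
        }
    });
}).listen(8080);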

Related

Write stream into buffer object

I have a stream that is being read from an audio source and I'm trying to store it into a Buffer. From the documentation that I've read, you are able to pipe the stream into one using fs.createWriteStream(buffer) instead of a file path.
I'm doing this currently as:
const outputBuffer = Buffer.alloc(150000)
const stream = fs.createWriteStream(outputBuffer)
but when I run it, it throws an error saying "Path must be a string without null bytes" for the file system call.
If I'm misunderstanding the docs or missing something obvious please let me know!
The first parameter to fs.createWriteStream() is the path of the file to write to. That is why you receive that particular error.
There is no way to read from a stream directly into an existing Buffer. There was a Node EP (enhancement proposal) to support this, but it more or less died off because there are some potential gotchas with it.
For now you will need to either copy the bytes manually or, if you don't want Node to allocate extra Buffers, manually call fs.open(), fs.read() (this is the method that lets you pass in your own Buffer instance, along with an offset), and fs.close().
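A sketch of that lower-level approach, assuming the audio has first been written to a file on disk (the file name and buffer size are assumptions):

const fs = require('fs');

const outputBuffer = Buffer.alloc(150000);

fs.open('audio.raw', 'r', (err, fd) => {
    if (err) throw err;
    // fs.read(fd, buffer, offset, length, position, callback)
    fs.read(fd, outputBuffer, 0, outputBuffer.length, 0, (err, bytesRead) => {
        if (err) throw err;
        console.log('Read ' + bytesRead + ' bytes into the existing Buffer');
        fs.close(fd, () => {});
    });
});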

Slow Buffer.concat

When I read a 16MB file in pieces of 64KB and call Buffer.concat on each piece, the concatenation proves to be incredibly slow, taking a whole 4s to go through the lot.
Is there a better way to concatenate a buffer in Node.js?
Node.js version used: 7.10.0, under Windows 10 (both are 64-bit).
This question is asked while researching the following issue: https://github.com/brianc/node-postgres/issues/1286, which affects a large audience.
The PostgreSQL driver reads large bytea columns in chunks of 64Kb, and then concatenates them. We found out that calling Buffer.concat is the culprit behind a huge loss of performance in such examples.
Rather than concatenating every time (which creates a new buffer each time), just keep an array of all of your buffers and concat at the end.
Buffer.concat() can take a whole list of buffers. Then it's done in one operation. https://nodejs.org/api/buffer.html#buffer_class_method_buffer_concat_list_totallength
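For the 16MB file read in 64KB pieces, the pattern would look roughly like this (a sketch; the file name is an assumption):

const fs = require('fs');

const chunks = [];
let total = 0;

const stream = fs.createReadStream('big-file.bin', { highWaterMark: 64 * 1024 });

stream.on('data', (chunk) => {
    chunks.push(chunk);    // O(1) per chunk, no copying yet
    total += chunk.length;
});

stream.on('end', () => {
    const result = Buffer.concat(chunks, total); // a single copy at the very end
    console.log('Assembled ' + result.length + ' bytes');
});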
If you read from a file and know the size of that file, then you can pre-allocate the final buffer. Then each time you get a chunk of data, you can simply write it to that large 16Mb buffer.
// use the "unsafe" version to avoid clearing 16Mb for nothing
let buf = Buffer.allocUnsafe(file_size)
let pos = 0
file.on('data', (chunk) => {
buf.fill(chunk, pos, pos + chunk.length)
pos += chunk.length
})
if(pos != file_size) throw new Error('Ooops! something went wrong.')
The main difference from @Brad's code sample is that you're going to use 16MB + the size of one chunk (roughly) instead of 32MB + the size of one chunk.
Also, each chunk has a header, various pointers, etc., so you may well end up using 33MB or even 34MB... that's a lot more RAM. The amount of RAM copied is otherwise the same. That being said, it could be that Node starts reading the next chunk while you copy, which could make it transparent. When done in one large chunk in the 'end' event, you're going to have to wait for the concat() to complete while doing nothing else in parallel.
If you are receiving an HTTP POST and reading it, remember that you get a Content-Length header, so you also have the length in that case and can pre-allocate the entire buffer before reading the data.
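A sketch of that pre-allocation idea in an HTTP handler; the port and status handling are assumptions, and it relies on the client sending a correct Content-Length:

const http = require('http');

http.createServer((req, res) => {
    const size = parseInt(req.headers['content-length'], 10) || 0;
    if (!size) {
        res.writeHead(411); // Length Required
        return res.end();
    }

    const buf = Buffer.allocUnsafe(size); // pre-allocate once, up front
    let pos = 0;

    req.on('data', (chunk) => {
        chunk.copy(buf, pos);  // copy straight into the pre-allocated buffer
        pos += chunk.length;
    });

    req.on('end', () => {
        if (pos !== size) {
            res.writeHead(400);
            return res.end('Body did not match Content-Length');
        }
        res.writeHead(200);
        res.end('Received ' + pos + ' bytes');
    });
}).listen(8080);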

Is the Node.js Buffer asynchronous or synchronous?

I don't see a callback in the Buffer documentation at http://nodejs.org/api/buffer.html#buffer_buffer. Am I safe to assume that Buffer is synchronous? I'm trying to convert a binary file to a base64 encoded string.
What I'm ultimately trying to do is take a PNG file and store its base64 encoded string in MongoDB. I read somewhere that I should take the PNG file, use Buffer to convert to base64, then pass this base64 output to Mongo.
My code looks something like this:
fs.readFile(filepath, function(err, data) {
    var fileBuffer = new Buffer(data).toString('base64');
    // do Mongo save here with the fileBuffer ...
});
I'm a bit fearful that Buffer is synchronous, and thus would be blocking other requests while this base64 encoding takes place. If so, is there a better way of converting a binary file to a base64 encoded one for storage in Mongo?
It is synchronous. You could make it asynchronous by slicing your Buffer and converting a small amount at a time and calling process.nextTick() in between, or by running it in a child process - but I wouldn't recommend either of those approaches.
Instead, I would recommend not storing images in your db. Store them on disk, or perhaps in a file storage service such as Amazon S3, and then store just the file path or URL in your database. A rough sketch of that follows.
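This is a sketch only; the upload directory, collection name, and the exact MongoDB driver call are assumptions:

const fs = require('fs');
const path = require('path');

function saveImage(srcPath, db, callback) {
    // assumed permanent storage location
    const destPath = path.join('/var/uploads', path.basename(srcPath));

    // stream the image into place; no base64 round-trip, no big in-memory copy
    fs.createReadStream(srcPath)
        .pipe(fs.createWriteStream(destPath))
        .on('finish', () => {
            // store only the path (or a URL), not the image itself
            db.collection('images').insertOne({ path: destPath }, callback);
        })
        .on('error', callback);
}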

Buffer or string or array for adding chunks with json

I'm downloading varying sizes of json data from a provider. The sizes can vary from a couple of hundred bytes to tens of MB.
I got into trouble with a string (i.e. stringVar += chunk). I'm not sure, but I suspect my crashes have to do with quite large strings (15 MB).
In the end I need the json data. My temporary solution is to use a string up to 1MB and then "flush" it to a buffer. I didn't want to use a buffer from the start, as it would have to be grown (i.e. copied to a larger buffer) quite often when downloads are small.
Which solution is the best for concatenating downloaded chunks and then parsing to json?
1.
var dataAsAString = '';
..
dataAsAString += chunk;
..
JSON.parse(dataAsAString);
2.
var dataAsAnArray = [];
..
dataAsAnArray.push(chunk);
..
concatenate
JSON.parse..
3.
var buffer = new Buffer(initialSize)
..
buffer.write(chunk)
..
copy buffer to larger buffer when needed
..
JSON.parse(buffer.toString());
I don't know why you are appending the chunk in a cumulative manner.
If you could store the necessary metadata for the entire duration of all data processing, then you could use a loop and just process each chunk. The chunk data should be declared in the loop; then after every iteration the chunk variable goes out of scope and the memory used wouldn't grow continuously.
while ((chunk = receiveChunkedData()) != null) {
    JSON.parse(chunk);
}
I have now moved to streams instead of accumulating buffers. Streams are really awesome.
If someone comes here looking for a way to accumulate buffer chunks quickly, I thought I'd share my find.
Substack has a module for keeping all the chunks separate without reallocating memory, and then treating them as one contiguous buffer when you need to.
https://github.com/substack/node-buffers
I think node-stream-buffer can solve your problem.
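For completeness, a minimal sketch of option 2 from the question applied to an HTTP(S) download (the URL is an assumption):

const https = require('https');

https.get('https://example.com/data.json', (res) => {
    const chunks = [];

    res.on('data', (chunk) => chunks.push(chunk)); // keep raw Buffers

    res.on('end', () => {
        const body = Buffer.concat(chunks).toString('utf8'); // one concat, one decode
        const json = JSON.parse(body);
        console.log('Parsed JSON with ' + Object.keys(json).length + ' top-level keys');
    });
}).on('error', (err) => console.error(err));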

Saving a base 64 string to a file via createWriteStream

I have an image coming into my Node.js application via email (through cloud service provider Mandrill). The image comes in as a base64 encoded string, email.content in the example below. I'm currently writing the image to a buffer, and then a file like this:
//create buffer and write to file
var dataBuffer = new Buffer(email.content, 'base64');
var writeStream = fs.createWriteStream(tmpFileName);

writeStream.once('open', function(fd) {
    console.log('Our stream is open, lets write to it');
    writeStream.write(dataBuffer);
    writeStream.end();
}); //writeStream.once('open')

writeStream.on('close', function() {
    fileStats = fs.statSync(tmpFileName);
    // ...
});
This works fine and is all well and good, but am I essentially doubling the memory requirements for this section of code, since I have my image in memory (as the original string), and then create a buffer of that same string before writing the file? I'm going to be dealing with a lot of inbound images so doubling my memory requirements is a concern.
I tried several ways to write email.content directly to the stream, but it always produced an invalid file. I'm a rank amateur with modern coding, so you're welcome to tell me this concern is completely unfounded as long as you tell me why so some light will dawn on marble head.
Thanks!
Since you already have the entire file in memory, there's no point in creating a write stream. Just use fs.writeFile
fs.writeFile(tmpFileName, email.content, 'base64', callback)
@Jonathan's answer is a better way to shorten the code you already have, so definitely do that.
I will expand on your question about memory though. The fact is that Node will not write anything to a file without converting it to a Buffer first, so given what you have told us about email.content, there is nothing more you can do.
If you are really worried about this, though, then you would need some way to process the value of email.content as it comes in from wherever you are getting it, as a stream. Then, as the data is being streamed into the server, you immediately write it to a file, thus not taking up any more RAM than needed. A rough sketch of that idea follows.
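This assumes the base64 text arrives as the raw request body rather than wrapped in JSON; the Base64Decode transform below is ad hoc, not a library API:

const { Transform } = require('stream');

class Base64Decode extends Transform {
    constructor() {
        super();
        this.extra = ''; // carries partial 4-character base64 groups between chunks
    }
    _transform(chunk, encoding, done) {
        // strip line breaks in case the base64 is line-wrapped
        const data = (this.extra + chunk.toString('ascii')).replace(/[\r\n]/g, '');
        const usable = data.length - (data.length % 4); // decode whole groups only
        this.extra = data.slice(usable);
        this.push(Buffer.from(data.slice(0, usable), 'base64'));
        done();
    }
    _flush(done) {
        if (this.extra) this.push(Buffer.from(this.extra, 'base64'));
        done();
    }
}

// e.g. inside the request handler:
// request.pipe(new Base64Decode()).pipe(fs.createWriteStream(tmpFileName));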
If you elaborate more, I can try to fill in more info.
