Node.js fs.readFile vs new Buffer binary

I have a situation where I receive a base64 encoded image, decode it, and then want to use it in some analysis activity.
I can use Buffer to go from base64 to binary, but I seem to be unable to use that output as expected (as an image).
My current solution is to convert to binary, write it to a file, then read that file again. The fs output can be used as an image, but this approach seems inefficient and adds extra steps; I would expect the Buffer output to also be a usable image since it holds the same data.
My question is: how does the fs.readFile output differ from the Buffer output? And is there a way I can use the Buffer output as I would the fs output?
Buffer from a base64 string:
var bin = new Buffer(base64String, 'base64').toString('binary');
Read a file
var bin = fs.readFileSync('image.jpg');
Many thanks
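A note on the difference (an editorial sketch, not part of the original thread): fs.readFileSync returns a Buffer, while the first snippet converts its Buffer into a 'binary' string with .toString('binary'). Keeping the Buffer, rather than converting it to a string, should make both outputs interchangeable for image tooling ('image.jpg' is a placeholder path and base64String stands in for the received data):
var fs = require('fs');
var base64String = fs.readFileSync('image.jpg').toString('base64'); // stand-in for the received data
var fromBase64 = Buffer.from(base64String, 'base64'); // a Buffer; no .toString('binary')
var fromFile = fs.readFileSync('image.jpg');           // also a Buffer
console.log(fromBase64.equals(fromFile));              // -> true: same bytes, same usability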

Related

Sharp unable to read file buffer

I'm using express-fileupload for reading files from the API. Now I want to process the image in the request body using Sharp.
I don't want to first save the file at the server and process it using fs.readFileSync.
I tried passing req.files.image.data which is supposed to be a buffer.
const image = await sharp(Buffer.from(req.files.image.data))
  .resize(500, 500)
  .jpeg({ quality: 10 })
  .toBuffer()
  .then((outputBuffer) =>
    ({ data: outputBuffer, mimetype: 'image/jpeg' }))
  .catch(err => {
    console.log(err);
    return null;
  });
But it is throwing this error: [Error: VipsJpeg: Premature end of input file]
When I tried converting the image buffer data into a string as suggested in this post, converting it back into a buffer using Buffer.from, and then passing it, it throws this error: [Error: Input buffer contains unsupported image format]
Edit: There was a limit on the image size of 5 MB, which is why images larger than that were not getting completely captured in the buffer, hence this error.
app.use(fileUpload({
  // Raise the upload size limit so larger images are captured completely.
  limits: { fileSize: 50 * 1024 * 1024 },
}));
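A quick sanity check (a sketch, not from the original post; the route path and field name are hypothetical) that would have surfaced the truncation described in the edit is to log how many bytes actually arrived before handing them to Sharp:
app.post('/upload', (req, res) => {
  const data = req.files.image.data;
  // A truncated upload shows fewer bytes here than the original file size on disk.
  console.log(Buffer.isBuffer(data), data.length);
  res.sendStatus(204);
});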
Some questions arise when I see your code. Let me try to get closer to a possible solution with your provided input by asking some questions and providing hints:
Are you sure your input file is not corrupted, i.e. that it actually meets the requirements of the JPEG spec? If you can be sure that the format is correct, try the next steps.
If req.files.image.data is really a buffer, why do you try to generate a buffer again by using Buffer.from(req.files.image.data)? You want to create a buffer from a buffer?
By the way, in line 4 you again try to conduct a conversion to a buffer with .toBuffer(), from an already existing buffer. In this case, I speak from personal experience: sharp would throw an error if trying to create a buffer from an already existing buffer.
You mention req.files.image.data is supposed to be a buffer. It sounds like you may not be 100% sure. I suggest checking whether you really have a buffer by using
const isItReallyBuffer = Buffer.isBuffer(req.files.image.data)
After that you can simply print it to the console: console.log(isItReallyBuffer); // true or false
In the end, it shouldn't make a big difference whether a string, a buffer or certain kinds of arrays are provided to sharp as input. As per the documentation, Sharp is very flexible when it comes to input data and accepts Buffer, Uint8Array, Uint8ClampedArray, Int8Array, Uint16Array, Int16Array, Uint32Array, Int32Array, Float32Array, Float64Array and, as mentioned, a string.
Maybe check the type of your provided input req.files.image.data one more time. Is it really a non-corrupted image file? Does it really comply with the input options accepted by Sharp listed above?
If yes, rather try const image = await sharp(req.files.image.data)...
If you already use a buffer, remove .toBuffer()
AMENDMENT
fs.readFileSync is often used to process image files. In this case, speaking from personal experience of many days of working with image files in node.js, I would reconsider using fs and instead prefer the Sharp package for reading and writing image files. I don't use fs anymore. fs concatenates image chunks, which in turn increases the likelihood of memory hogs.
You can simply open a PNG image on your desktop with WordPad or Notepad and search for IDAT chunks. Then process the same image with the fs package and you will see the difference; all of a sudden you likely have only one very large IDAT chunk.
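For illustration, a minimal sketch of reading and writing an image with Sharp instead of fs, along the lines the amendment suggests ('input.png' and 'output.jpg' are placeholder paths):
const sharp = require('sharp');

// Read the image via Sharp and get its raw bytes as a Buffer.
sharp('input.png')
  .toBuffer()
  .then((buf) => {
    console.log('Read %d bytes via sharp', buf.length);
    // Re-encode and write it back out without touching fs directly.
    return sharp(buf).jpeg({ quality: 80 }).toFile('output.jpg');
  })
  .catch(console.error);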

Converting a nodejs buffer to string and back to buffer gives a different result in some cases

I created a .docx file.
Now, I do this:
// read the file to a buffer
const data = await fs.promises.readFile('<pathToMy.docx>')
// Converts the buffer to a string using 'utf8' but we could use any encoding
const stringContent = data.toString()
// Converts the string back to a buffer using the same encoding
const newData = Buffer.from(stringContent)
// We expect the values to be equal...
console.log(data.equals(newData)) // -> false
I don't understand in what step of the process the bytes are being changed...
I already spent sooo much time trying to figure this out, without any result... If someone can help me understand what part I'm missing out, it would be really awesome!
A .docx file is not a UTF-8 string (it's a binary ZIP file), so when you read it into a Buffer object and then call .toString() on it, you're assuming the buffer already contains UTF-8-encoded text that you now want to move into a Javascript string. That's not what you have. Your binary data will likely contain byte sequences that are invalid in UTF-8, and those will be discarded or coerced into valid UTF-8, causing an irreversible change.
What Buffer.toString() does is take a Buffer that is ALREADY encoded in UTF-8 and puts it into a Javascript string. See this comment in the doc,
If encoding is 'utf8' and a byte sequence in the input is not valid UTF-8, then each invalid byte is replaced with the replacement character U+FFFD.
So, the code you show in your question is wrongly assuming that Buffer.toString() takes binary data and reversibly encodes it as a UTF8 string. That is not what it does and that's why it doesn't do what you are expecting.
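A tiny demonstration of that replacement behaviour (not from the original answer, just an illustration):
// 0xff and 0xfe are not valid UTF-8, so each becomes U+FFFD when decoded.
const buf = Buffer.from([0xff, 0xfe, 0x00, 0x41]);
const str = buf.toString('utf8');
console.log(Buffer.from(str)); // <Buffer ef bf bd ef bf bd 00 41> - different bytes, the change is irreversible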
Your question doesn't describe what you're actually trying to accomplish. If you want to do something useful with the .docx file, you probably need to actually parse it from its binary ZIP file form into the actual components of the file in their appropriate format.
Now that you explain you're trying to store it in localStorage, then you need to encode the binary into a string format. One such popular option is Base64 though it isn't super efficient (size wise), but it is better than many others. See Binary Data in JSON String. Something better than Base64 for prior discussion on this topic. Ignore the notes about compression in that other answer because your data is already ZIP compressed.
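A minimal sketch of the base64 round trip, which, unlike the utf8 conversion in the question, is lossless ('document.docx' is a placeholder path):
const fs = require('fs');

const data = fs.readFileSync('document.docx');         // binary Buffer
const stringContent = data.toString('base64');         // safe string form, e.g. for localStorage
const newData = Buffer.from(stringContent, 'base64');  // decode back to the original bytes
console.log(data.equals(newData));                     // -> true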

Binary data from mongodb gets corrupted

When I upload a photo it is converted to base64, and when I send it to MongoDB using Mongoose it is saved as Binary. But when I fetch the same picture back from the database it is returned as a Buffer array. After converting it to base64 I get a base64 string, but one completely different from the original base64. The new base64 cannot be rendered in the browser because it has been corrupted.
[Screenshots in the original post showed the initial base64 string, the Buffer array returned from the database, and the corrupted base64 produced by converting from the buffer array using Buffer.from(avatar).toString('base64').]
Please note that I appended to it "data:image/png;base64," before rendering in the browser and it still did not render.
Please can someone tell me what I am doing wrong?
The best solution is to convert it to a PNG or JPG file, save the image to a folder, and store only the file path.
Here is how I solved it.
I converted from binary to utf8 instead of to base64.
There is a huge difference between
Buffer.from(binary_data, 'binary').toString('utf8')
and
Buffer.from(binary_data, 'binary').toString('base64')
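A likely explanation, sketched here under the assumption that the base64 string itself was what got stored as Binary: reading it back gives a Buffer containing the UTF-8 bytes of that string, so toString('utf8') recovers it, while toString('base64') encodes it a second time and produces a different, seemingly corrupted string.
// Hypothetical value standing in for the original upload.
const original = 'data:image/png;base64,iVBORw0KGgo';
const stored = Buffer.from(original);                 // what ends up in MongoDB as Binary

console.log(stored.toString('utf8') === original);    // true  - recovers the original string
console.log(stored.toString('base64') === original);  // false - base64 of the base64, looks corrupted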

NodeJS - Stream a large ASCII file from S3 with Hex Characters (NUL)

I am trying to read (via streaming) a large file in a Lambda function. My goal is to read just the first few lines and look for some information. The input file in S3 seems to have hex characters (NUL), and the following code stops reading the line when it hits the NUL character and goes to the next line. I would like to know how I can read the whole line and replace/remove the NUL character before I look for the information in the line. Here is the code that does not work as expected:
var readline = require('line-reader');
var readStream = s3.getObject({Bucket: S3Bucket, Key: fileName}).createReadStream();
readline.eachLine(readStream, {separator: '\n', encoding: 'utf8'}, function(line) {
  console.log('Line ', line);
});
As mentioned by Brad, it's hard to help as this is more an issue with your line-reader lib.
I can, however, offer an alternate solution that doesn't require the lib.
I would use GetObject as you are, but I would also specify a value for the range parameter, then work my way through the file in chunks and stop reading chunks once I am satisfied.
If the chunk I read doesn't contain a \n, read another chunk, and keep going until I get a \n. Then read from the start of my buffered data up to the \n, set the new starting position based on the position of the \n, and read a new chunk from that position if you want to read more data. A sketch of this approach follows the link below.
Check out the range parameter in the api:
http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#getObject-property
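A minimal sketch of that chunked approach (assuming aws-sdk v2; the bucket, key and chunk size are placeholders), which also strips the NUL characters before searching for the newline:
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

async function readFirstLine(bucket, key, chunkSize = 1024) {
  let start = 0;
  let buffered = '';
  while (true) {
    const res = await s3.getObject({
      Bucket: bucket,
      Key: key,
      Range: `bytes=${start}-${start + chunkSize - 1}`, // read one chunk at a time
    }).promise();
    // Remove NUL characters so they cannot cut the line short.
    buffered += res.Body.toString('utf8').replace(/\0/g, '');
    const nl = buffered.indexOf('\n');
    if (nl !== -1) return buffered.slice(0, nl);        // found a full line
    if (res.Body.length < chunkSize) return buffered;   // reached the end of the object
    start += chunkSize;
  }
}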

Pass in buffer when library expects a file path node

I have a library that expects a filepath in order to load the data. However, I have the contents of the file in the form of a buffer instead. How do I make the buffer pretend to be a filepath?
So you're saying you have a Buffer that's holding the route to a file. You want to convert the Buffer into a string. If that's what you're trying to do, then here you go:
var buff = new Buffer('/path/to/wonder/land.js');
buff.toString('utf8');
Hope that helps.
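If the situation is the other way around (the library insists on a path but you only hold the file's contents in a Buffer), one common workaround, sketched here with hypothetical names, is to write the Buffer to a temporary file and pass the library that path:
const fs = require('fs');
const os = require('os');
const path = require('path');

// 'buffer' stands in for the in-memory file contents; 'loadFromPath' for the library call.
function withTempFile(buffer, loadFromPath) {
  const tmpPath = path.join(os.tmpdir(), `upload-${Date.now()}.bin`);
  fs.writeFileSync(tmpPath, buffer);
  try {
    return loadFromPath(tmpPath);
  } finally {
    fs.unlinkSync(tmpPath); // clean up the temporary file
  }
}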
