nodejs Buffer toString changed original buffer - node.js

in node.js
const buffer = Buffer.from('000000a6', 'hex');
console.log(buffer); // <Buffer 00 00 00 a6>
const bufferString = buffer.toString();
const newBuffer = Buffer.from(bufferString);
console.log(newBuffer); // <Buffer 00 00 00 ef bf bd>
Why is the new buffer different from the original after I convert the buffer to a string and then convert that string back to a buffer?
I tried toString('hex'), toString('binary'), and other encodings such as ascii. All of them produced a buffer different from the original.
buffer.toString(encoding) defaults to utf8, and Buffer.from(string, encoding) also defaults to utf8, yet the result is still different.
How can I convert a buffer to a string and then convert it back to a buffer that is exactly the same as the original?
PS: This question comes from when I want to send request body as a buffer. I just send to the server, but the server gets .
PPS: The server is not under my control, so I can't have it use Buffer.from(string, 'hex') to parse a request body produced by buffer.toString('hex').

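There is no need to convert to a string with const bufferString = buffer.toString(); you can pass the buffer itself to Buffer.from(), which copies it:
const buffer = Buffer.from('000000a6', 'hex');
console.log(buffer); // <Buffer 00 00 00 a6>
const bufferString = buffer; // still a Buffer, not a string
const newBuffer = Buffer.from(bufferString);
console.log(newBuffer); // <Buffer 00 00 00 a6>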

In other words, you're converting a buffer to a string and back to a buffer using the same encoding both times, and the buffer isn't the same? Here's a function to test whether an input buffer stays the same for a given encoding. This should tell you which encodings are likely to be problematic.
const f = (buf, enc) => Buffer.from(buf.toString(enc), enc).equals(buf);
f(Buffer.from("hello world", "utf16le"), "utf16le"); // ==> returns true
f(Buffer.from("hello world", "binary"), "binary"); // ==> returns true
f(Buffer.from("000000a6", "hex"), "utf8"); // ==> returns false
UTF-8 is the default encoding if you don't specify one, so the byte sequence in your original question is very likely invalid UTF-8, or it needs to be escaped in some other way to map it correctly to a UTF-8 string.
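As a minimal sketch of the round trip discussed above: the helper name roundTrips is made up for illustration, and the list of encodings is only an assumption about what the receiving side might accept. It suggests that hex, base64 and latin1 map every byte to a string and back without loss, while utf8 replaces the invalid byte 0xa6 with the replacement character U+FFFD (bytes ef bf bd).

```javascript
// Hypothetical helper: does buf survive toString(enc) followed by Buffer.from(str, enc)?
function roundTrips(buf, enc) {
  return Buffer.from(buf.toString(enc), enc).equals(buf);
}

const original = Buffer.from('000000a6', 'hex');

// Check a few encodings against the buffer from the question.
const results = {};
for (const enc of ['hex', 'base64', 'latin1', 'utf8']) {
  results[enc] = roundTrips(original, enc);
}
console.log(results); // { hex: true, base64: true, latin1: true, utf8: false }

// What the utf8 round trip actually produces.
const lossy = Buffer.from(original.toString('utf8'), 'utf8');
console.log(lossy.toString('hex')); // 000000efbfbd
```

Whether any of these is usable here depends on what the server does with the body, which the question does not say.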

Related

IORedis: how to publish ArrayBuffer

I'm trying to publish an ArrayBuffer to an IORedis stream.
I do so as follows:
const ab = new ArrayBuffer(1); // ArrayBuffer of length = 1 byte
const dv = new DataView(ab);
dv.setInt8(0, 7); // Write the number 7 in the buffer
const buffer = Buffer.from(ab); // Convert to Buffer since that's what `publish` expects
redisPublisher.publish('buffer-test', buffer);
It's a toy example, in practice I'll want to encode complex stuff in the ArrayBuffer, not just a number. Anyway, then I try to read with
redisSubscriber.on('message', async (channel, data) => {
  logger.info(`Redis message: channel: ${channel}, data: ${data}, ${typeof data}`);
  // ... do something with it
});
The problem is that data is empty, and its type is considered as string. As per the documentation I tried redisSubscriber.on('messageBuffer', ... instead, but it behaves exactly the same, so much so that I'm failing to understand the difference between the two.
Also confusing is that if I encode a Buffer, e.g.
const buffer = Buffer.from("I'm a string!", 'utf-8');
redisPublisher.publish('buffer-test', buffer);
Upon reception, data will again be a string, decoded from the Buffer, which in that toy case is OK but generally is not for me. I'd like to send a Buffer in, containing more complex data than just a string (an ArrayBuffer in my case), and get a Buffer out that I can parse as I need, rather than have it automatically read as a string.
Any help is welcome!
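The following does not touch Redis at all. It is a small sketch of two things that may be contributing to the confusion, under the assumption that the handler does at some point receive a Buffer. Interpolating a Buffer into a template literal calls its toString(), so the single byte 0x07 turns into an invisible control character, and a Buffer can be read back through a DataView without ever going through a string. The helper names describePayload and readInt8 are hypothetical.

```javascript
// Build the same one-byte payload as in the question.
const ab = new ArrayBuffer(1);
new DataView(ab).setInt8(0, 7);
const payload = Buffer.from(ab); // a view over ab, no copy is made

// Template literals stringify the Buffer as UTF-8. Byte 0x07 is the BEL control
// character, which most terminals render as nothing.
const interpolated = `${payload}`;

// Hypothetical helper: describe a payload without decoding it as text.
function describePayload(data) {
  return Buffer.isBuffer(data)
    ? `Buffer(${data.length}) ${data.toString('hex')}`
    : `${typeof data} ${JSON.stringify(data)}`;
}

// Hypothetical helper: read the first byte back through a DataView over the same memory.
function readInt8(buf) {
  return new DataView(buf.buffer, buf.byteOffset, buf.byteLength).getInt8(0);
}

console.log(describePayload(payload));      // Buffer(1) 07
console.log(describePayload(interpolated)); // string "\u0007"
console.log(readInt8(payload));             // 7
```

Logging with a helper like this inside the handler would at least show whether data is a Buffer or a string before anything coerces it.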

Nodejs asymmetrical buffer <-> string conversion

In nodejs I had naively expected the following to always output true:
let buff = Buffer.allocUnsafe(20); // Essentially random contents
let str = buff.toString('utf8');
let decode = Buffer.from(str, 'utf8');
console.log(0 === buff.compare(decode));
Given a Buffer buff, how can I detect ahead of time whether buff will be exactly equal to Buffer.from(buff.toString('utf8'), 'utf8')?
You should probably be fine just testing that the input buffer contains valid UTF-8 data:
try {
  new TextDecoder('utf-8', { fatal: true }).decode(buff);
  console.log(true);
} catch {
  console.log(false);
}
But I wouldn't swear that Node is 100% consistent in how it handles invalid UTF-8 data when converting from string to buffer. If you want to be safe, you'll have to stick to buffer comparison. You could make the encoding/decoding a little more efficient by using transcode, which does not require creating a temporary string.
import { transcode } from 'buffer';
let buff = Buffer.allocUnsafe(20);
let decode = transcode(buff, 'utf8', 'utf8');
console.log(0 === buff.compare(decode));
If you're interested how TextDecoder determines if a buffer represents a valid utf8 string, the rigorous definition of this procedure can be found here.
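To make the two checks above easier to compare, here is a minimal sketch that wraps each in a function. The names isValidUtf8 and roundTripsAsUtf8 are invented for this example, and the sample inputs are arbitrary; it is not a proof that the two checks agree on every possible input.

```javascript
// Hypothetical helper: strict validation, throws internally on any invalid sequence.
function isValidUtf8(buf) {
  try {
    new TextDecoder('utf-8', { fatal: true }).decode(buf);
    return true;
  } catch {
    return false;
  }
}

// Hypothetical helper: the direct comparison, decode then re-encode and compare bytes.
function roundTripsAsUtf8(buf) {
  return Buffer.from(buf.toString('utf8'), 'utf8').equals(buf);
}

const samples = [
  Buffer.from('héllo wörld', 'utf8'),    // valid multi-byte UTF-8
  Buffer.from([0x00, 0x00, 0x00, 0xa6]), // lone continuation byte
  Buffer.from([0xe2, 0x82]),             // truncated three-byte sequence
];

const report = samples.map((buf) => ({
  hex: buf.toString('hex'),
  valid: isValidUtf8(buf),
  roundTrips: roundTripsAsUtf8(buf),
}));
console.log(report);
```

On these samples both functions give the same answer, which is consistent with the suggestion that validity is a good predictor of a lossless round trip.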

node.js: Sending and receiving bytes to a TCP socket

This should be easy to figure out but I am getting so frustrated, and I can't seem to find documentation for this rather simple case.
I want to send bytes (not strings) over a TCP connection and process a response. Here's what I've got, but it throws a type exception on the use of a Buffer type. When I use a string type instead, it sends the bytes 0xc3 0xbe 0x74 0x01 instead of 0xfe 0x74 0x01 (from tcpdump). God knows why.
If I should be using the pipe interface instead, then great, but I can't seem to find how to do so for TCP streams and not files.
const net = require ('net')
const pumpIP = '192.168.1.208'
const pumpPort = 2101
const pumpStr = '\xfe\x74\x01'
const pumpBuffer = Buffer.from(0xfe, 0x74, 0x01)
var pump = new net.Socket()
pump.connect(pumpPort, pumpIP, function() {
  pump.write(pumpBuffer) // <-- this throws a type error
  // pump.write(pumpStr) // <-- this sends 0xc3 0xbe 0x74 0x01 instead
})
pump.on('data', function(data) {
  // code to handle data
  pump.destroy()
})
For your Buffer.from(), you need to use an array. Try this:
const pumpBuffer = Buffer.from([0xFE, 0x74, 0x01]);
https://nodejs.org/api/buffer.html#buffer_class_method_buffer_from_array
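As for why the string version sends 0xc3 0xbe: strings are encoded as UTF-8 by default, and the character U+00FE takes two bytes in UTF-8. The sketch below only builds buffers and prints them, without opening a socket, so the variable names other than pumpStr are illustrative.

```javascript
const pumpStr = '\xfe\x74\x01';

// Default string encoding is UTF-8: U+00FE becomes the two bytes c3 be.
const asUtf8 = Buffer.from(pumpStr);

// latin1 maps each code point in the range 0x00-0xff to exactly one byte.
const asLatin1 = Buffer.from(pumpStr, 'latin1');

// An array of byte values, as suggested in the answer above.
const fromArray = Buffer.from([0xfe, 0x74, 0x01]);

console.log(asUtf8.toString('hex'));    // c3be7401
console.log(asLatin1.toString('hex'));  // fe7401
console.log(fromArray.toString('hex')); // fe7401
```

Passing the array-based buffer to pump.write() avoids the encoding question entirely, which is probably the simplest option for raw bytes.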

Convert a buffer to string and then convert string back to buffer in javascript

I am using ZLIB in NODEJS to compress a string. On compressing the string I get a BUFFER. I want to send that buffer as a PUT request, but the PUT request rejects the BUFFER as it needs only STRING. I am not able to convert BUFFER to STRING and then on the receiving end I cannot decompress that string, so I can get the original data. I am not sure how I can convert the buffer to string and then convert that string to buffer and then decompress the buffer to get the original string.
let zlib = require('zlib');
// compressing 'str' and getting the result converted to string
let compressedString = zlib.deflateSync(JSON.stringify(str)).toString();
//decompressing the compressedString
let decompressedString = zlib.inflateSync(compressedString);
The last line fails with an error saying the input is invalid.
I also tried converting 'compressedString' to a buffer and then decompressing it, but that does not help either.
//converting string to buffer
let bufferedString = Buffer.from(compressedString, 'utf8');
//decompressing the buffer
//decompressedBufferString = zlib.inflateSync(bufferedString);
This code also gives the exception as the input is not valid.
I would read the zlib documentation, but the usage is fairly straightforward.
var Buffer = require('buffer').Buffer;
var zlib = require('zlib');
// create the buffer first and pass it to zlib
let input = Buffer.from(str);
// compress the buffer; despite its name, compressedString holds a Buffer
let compressedString = zlib.deflateSync(input);
// To inflate you do the same thing, but pass the
// compressed buffer to inflateSync() and chain the toString()
let decompressedString = zlib.inflateSync(compressedString).toString();
There are a number of ways to handle streams but this is what you are trying to achieve with the code provided.
Try sending the buffer as a latin1 string, not a utf8 string. For instance, if your buffer is in the mybuf variable:
mybuf.toString('latin1');
Send the resulting string to your API. Then in your frontend code you can do something like this, supposing your response is in the response variable:
const byteNumbers = new Uint8Array(response.length);
for (let i = 0; i < response.length; i++) {
  byteNumbers[i] = response[i].charCodeAt(0);
}
const blob: Blob = new Blob([byteNumbers], {type: 'application/gzip'});
In my experience the transferred size is only slightly larger this way than when sending the buffer, and unlike utf8 you can get your original binary data back. I still don't know how to do it with utf8 encoding; according to this SO answer it doesn't seem possible.
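Another option, sketched here as an assumption rather than something the answers above propose, is to carry the compressed bytes as base64 text. Base64 uses only ASCII characters, so it survives transports that insist on strings, at the cost of roughly a third more size. The sample object in str is made up for the example.

```javascript
const zlib = require('zlib');

// Made-up sample data standing in for the 'str' variable from the question.
const str = { greeting: 'hello', values: [1, 2, 3] };
const json = JSON.stringify(str);

// Compress, then encode the resulting bytes as base64 text for the PUT body.
const compressedText = zlib.deflateSync(json).toString('base64');

// Receiving side: decode base64 back to bytes, then inflate.
const restoredJson = zlib
  .inflateSync(Buffer.from(compressedText, 'base64'))
  .toString('utf8');

console.log(restoredJson === json); // true
```

The same pattern works with 'latin1' in place of 'base64', as long as both ends use the same encoding name for toString() and Buffer.from().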

Node.js Buffer from string not correct

To create a utf-8 buffer from a string in javascript on the web you do this:
var message = JSON.stringify('ping');
var buf = new TextEncoder().encode(message).buffer;
console.log('buf:', buf);
console.log('buf.buffer.byteLength:', buf.byteLength);
This logs:
buf: ArrayBuffer { byteLength: 6 }
buf.buffer.byteLength: 6
However in Node.js if I do this:
var nbuf = Buffer.from(message, 'utf8');
console.log('nbuf:', nbuf);
console.log('nbuf.buffer:', nbuf.buffer);
console.log('nbuf.buffer.byteLength:', nbuf.buffer.byteLength);
it logs this:
nbuf: <Buffer 22 70 69 6e 67 22>
nbuf.buffer: ArrayBuffer { byteLength: 8192 }
nbuf.buffer.byteLength: 8192
The byteLength is way too high. Am I doing something wrong here?
Thanks
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer
ArrayBuffer.prototype.byteLength Read only
The size, in bytes, of the array. This is established when the array is constructed and cannot be changed. Read only.
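The byteLength you are reading belongs to the underlying ArrayBuffer, not to your data. For small buffers, Node.js allocates from a shared internal pool (Buffer.poolSize, 8192 bytes by default), so nbuf.buffer is the whole pool and nbuf is only a 6-byte view into it, starting at nbuf.byteOffset. Use nbuf.length (or nbuf.byteLength) for the size of the data itself.
To get the number of bytes a string will occupy once encoded, I suggest using Buffer.byteLength(string[, encoding])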
Documentation: https://nodejs.org/api/buffer.html#buffer_class_method_buffer_bytelength_string_encoding
For example,
var message = JSON.stringify('ping');
console.log('byteLength: ', Buffer.byteLength(message));
correctly gives
byteLength: 6
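If a standalone ArrayBuffer containing only the message bytes is what you are after, a minimal sketch is to slice the underlying ArrayBuffer using the Buffer's own offset and length. The variable name exact is just for illustration.

```javascript
const message = JSON.stringify('ping');
const nbuf = Buffer.from(message, 'utf8');

// Size of the data itself, independent of any shared pool.
console.log(nbuf.length, nbuf.byteLength); // 6 6

// Copy only this Buffer's bytes out of the (possibly shared) ArrayBuffer.
const exact = nbuf.buffer.slice(nbuf.byteOffset, nbuf.byteOffset + nbuf.byteLength);
console.log(exact.byteLength); // 6

// The copy holds the same bytes as the original message.
console.log(Buffer.from(exact).toString('utf8')); // "ping"
```

ArrayBuffer.prototype.slice() makes a copy, so exact stays valid even if the pool memory is later reused.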

Resources