What's the node.js paradigm for socket stream conversation? - node.js

I'm trying to implement a socket protocol and it is unclear to me how to proceed. I have the socket as a Stream object, and I am able to write() data to it to send on the socket, and I know that the "readable" or "data" events can be used to receive data. But this does not work well when the protocol involves a conversation in which one host is supposed to send a piece of data, wait for a response, and then send data again after the response.
In a block paradigm it would look like this:
send some data
wait for specific data reply
massage data and send it back
send additional data
As far as I can tell, node's Stream object does not have a read function that will asynchronously return with the number of bytes requested. Otherwise, each wait could just put the remaining functionality in its own callback.
What is the node.js paradigm for this type of communication?

Technically there is a Readable.read() but its not recommended (maybe you can't be sure of the size or it blocks, not sure.) You can keep track of state and on each data event add to a Buffer that you keep processing incrementally. You can use readUInt32LE etc. on Buffer to read specific pieces of binary data if you need to do that (or you can convert to string if its textual data). https://github.com/runvnc/metastream/blob/master/index.js
If you want to write it in your 'block paradigm', you could basically make some things a promise or async function and then
let specialReplyRes = null;
waitForSpecialReply = f => new Promise( res => specialReplyRes = res);
stream.on('data', (buff) => {
if (buff.toString().indexOf('special')>=0) specialReplyRes(buff.toString());
});
// ...
async function proto() {
stream.write(data);
let reply = await waitForSpecialReply();
const message = massage(reply);
stream.write(message);
}
Where your waitForSpecialReply promise is stored and resolved after a certain message is received through your parsing.

Related

Why while loop is needed for reading a non-flowing mode stream in Node.js?

In the node.js documentation, I came across the following code
const readable = getReadableStreamSomehow();
// 'readable' may be triggered multiple times as data is buffered in
readable.on('readable', () => {
let chunk;
console.log('Stream is readable (new data received in buffer)');
// Use a loop to make sure we read all currently available data
while (null !== (chunk = readable.read())) {
console.log(`Read ${chunk.length} bytes of data...`);
}
});
// 'end' will be triggered once when there is no more data available
readable.on('end', () => {
console.log('Reached end of stream.');
});
Here is the comment from the node.js documentation concerning the usage of the while loop, saying it's needed to make sure all data is read
// Use a loop to make sure we read all currently available data
while (null !== (chunk = readable.read())) {
I couldn't understand why it is needed and tried to replace while with just if statement, and the process terminated after the very first read. Why?
From the node.js documentation
The readable.read() method should only be called on Readable streams operating in paused mode. In flowing mode, readable.read() is called automatically until the internal buffer is fully drained.
Be careful that this method is only meant for stream that has been paused.
And even further, if you understand what a stream is, you'll understand that you need to process chunks of data.
Each call to readable.read() returns a chunk of data, or null. The chunks are not concatenated. A while loop is necessary to consume all data currently in the buffer.
So i hope you understand that if you are not looping through your readable stream and only executing 1 read, you won't get your full data.
Ref: https://nodejs.org/api/stream.html

What are the roles of _read and read in Node JS streams?

I'm really just looking for clarification on how these work. IMO the documentation on streams is somewhat lacking, and there actually aren't a lot of resources out their that comprehensively explain explain how they're are meant to work and be extended.
My question can be broken down into two parts
One, What is the role of the _read function within the stream module? When I run this code it endlessly prints out "hello world" until null is pushed onto the stream buffer. This seems to indicate that _read is called in some kind of loop that waits for a null in the buffer, but I can't find documentation anywhere that states this in explicit terms.
var Readable = require('stream').Readable
var rs = Readable()
rs._read = function () {
rs.push("hello world")
rs.push(null)
};
rs.on("data", function(data){
console.log("some data", data)
})
Two, what does read actually do? My understanding is that read consumes data from the read stream buffer, and fires the data event. Is that all that's going on here?
read() is something that a consumer of the readStream calls if they want to specifically read some bytes from the stream (when the stream is not flowing).
_read() is an internal method that is part of the internal implementation of the read stream. The internals of the stream call this method (it is NOT to be called from the outside) when the stream is flowing and the stream wants to get more data from the source. When called the _read() method pushes data with .push(data) or if it has no more data, then it does a .push(null).
You can see an explanation and example here in this article.
_read(size) {
if (this.data.length) {
const chunk = this.data.slice(0, size);
this.data = this.data.slice(size, this.data.length);
this.push(chunk);
} else {
this.push(null); // 'end', no more data
}
}
If you were implementing a read stream to some custom source of data, then you would implement the _read() method to fetch up to the size amount of data from your source and .push() that data into the stream.

Buffering a Float32Array to a client

This should be obvious, but for some reason I am not getting any result. I have already spent way too much time just trying different ways to get this working without results.
TLDR: A shorter way to explain this question could be: I know how to stream a sound from a file. How to stream a buffer containing sound that was synthesized on the server instead?
This works:
client:
var stream = ss.createStream();
ss(socket).emit('get-file', stream, data.bufferSource);
var parts = [];
stream.on('data', function(chunk){
parts.push(chunk);
});
stream.on('end', function () {
var blob=new Blob(parts,{type:"audio"});
if(cb){
cb(blob);
}
});
server (in the 'socket-connected' callback of socket.io)
var ss = require('socket.io-stream');
// ....
ss(socket).on('get-file', (stream:any, filename:any)=>{
console.log("get-file",filename);
fs.createReadStream(filename).pipe(stream);
});
Now, the problem:
I want to alter this audio buffer and send the modified audio instead of just the file. I converted the ReadStream into an Float32Array, and did some processes sample by sample. Now I want to send that modified Float32Array to the client.
In my view, I just need to replaces the fs.createReadStream(filename) with(new Readable()).push(modifiedSoundBuffer). However, I get a TypeError: Invalid non-string/buffer chunk. Interestingly, if I convert this modifiedSodunBuffer into a Uint8Array, it doesn't yell at me, and the client gets a large array, which looks good; only that all the array values are 0. I guess that it's flooring all the values?
ss(socket).on('get-buffer', (stream:any, filename:any)=>{
let readable=(new Readable()).push(modifiedFloat32Array);
readable.pipe(stream);
});
I am trying to use streams for two reasons: sound buffers are large, and to allow concurrent processing in the future
if you will convert object Float32Array to buffer before sending like this Readable()).push(Buffer.from(modifiedSoundBuffer)) ?

request: how to pipe file and keep the response object

I use npm request. I want to download and write a file to the filesystem and after that-use the returned response object for further processing.
I know that I can pipe the response object directly but afterwards I have no response object more for further processing
var result = await request(builtReq).pipe(fs.createWriteStream(myFilePath));
So my implementation at the moment looks like this:
var result = await request(builtReq);
But I am not able to pipe the result object because the streamable state is false.
So I need a way to keep the return of the request and to write the file to the filesystem. Can I reset the stream state or somehow keep the response obj and write the file at the same time?
I tried to implement the write file manually via. fs.writeFile() after I received the response obj but I had problems with file encodings because I can receive anything and I ended up with broken files.
Has somebody an idea how to solve that?
Do you want the response object (this contains the status code and headers), or do you need the response body in-memory?
If you only want the response object, and still want the body written to a file, you can grab the response like so:
var response;
await request(builtReq).on('response', r => {
console.log('Response received: ' + r);
response = r;
}).pipe(fs.createWriteStream(myFilePath));
console.log('Finished saving file, response is still ' + response);
If you need to process the actual body, you have a couple options:
Keep your existing code as-is, then just read the file off the disk right after it finishes (since you are using await, that would be the next line) and process it in memory.
Pipe the input stream to multiple places -- one to the file write stream, and the other to an in-memory buffer (using a standard on('data') handler). Once the stream finishes, the file is saved and you can process the in-memory buffer. See related question node.js piping the same readable stream for several different examples.
remember promises can only be resolved once.
so to solve your problem:
var promise = request(builtReq)
// now do what ever number of operations you need
result.pipe(fs.createWriteStream(myFilePath));
promise.then(r=>console.log('i will still get data here :)');
promise.then(r=>console.log('and here !');
promise.then(r=>console.log('and here !!');
var result = await promise; // and here to !!
a promise once resolve, the .then will always get the result.

node - send large JSON over net socket

The problem is that sending large serialized JSON (over 16,000 characters) over a net socket gets split into chunks. Each chunk fires the data event on the receiving end. So simply running JSON.parse() on the incoming data may fail with SyntaxError: Unexpected end of input.
The work around I've managed to come up with so far is to append a null character ('\u0000') to the end of the serialized JSON, and check for that on the receiving end. Here is an example:
var partialData = '';
client.on( 'data', function( data ) {
data = data.toString();
if ( data.charCodeAt( data.length - 1 ) !== 0 ) {
partialData += data;
// if data is incomplete then no need to proceed
return;
} else {
// append all but the null character to the existing partial data
partialData += data.substr( 0, data.length - 1 );
}
// pass parsed data to some function for processing
workWithData( JSON.parse( partialData ));
// reset partialData for next data transfer
partialData = '';
});
One of the failures of this model is if the receiver is connected to multiple sockets, and each socket is sending large JSON files.
The reason I'm doing this is because I need to pass data between two processes running on the same box, and I prefer not to use a port. Hence using a net socket. So there would be two questions: First, is there a better way to quickly pass large JSON data between two Node.js processes? Second, if this is the best way then how can I better handle the case where the serialized JSON is being split into chunks when sent?
You can use try...catch every time to see if it is a valid json. Not very good performance though.
You can calculate size of your json on sending side and send it before JSON.
You can append a boundary string that's unlikely be in JSON. Your \u0000 - yes, it seems to be a legit way. But most popular choice is newline.
You can use external libraries like dnode which should already do something I mentioned before. I'd recommend trying that. Really.
One of the failures of this model is if the receiver is connected to multiple sockets, and each socket is sending large JSON files.
Use different buffers for every socket. No problem here.
It is possible to identify each socket individually and build buffers for each one. I add an id to each socket when I receive a connection and then when I receive data I add that data to a buffer.
net.createServer( function(socket) {
// There are many ways to assign an id, this is just an example.
socket.id = Math.random() * 1000;
socket.on('data', function(data) {
// 'this' refers to the socket calling this callback.
buffers[this.id] += data;
});
});
Each time you can check if you have received that "key" delimiter that will tell you that a buffer is ready to be used.

Resources