Altering an Array from within Node JS net Socket - node.js

I am trying to update an array from within a server program, in order to record client data. However, whilst the position array is updating okay, the board array (which obtains data from the position array) never recognises these changes. Thus the output (socket.write) never changes. I feel I must be missing something obvious. This is a basic implementation of what I'm trying to do. Thank you in advance.
const net = require('net');
const position = [" ", " ", " "];
const board = [position[0], "-", position[1], "-", position[2]];

const server = net.createServer(socket => {
  socket.on('data', data => {
    socket.write('Enter a value between 1 & 3 (inclusive)');
    const value = data.toString('utf-8');
    position[value - 1] = 'X';
    board.forEach(item => {
      socket.write(item);
    });
  });
  socket.on('end', () => {
    console.log("Session ended");
  });
});
server.listen(5000);

The two variables - position and board - are entirely independent (board does not hold references to position), so your modification to position is never carried over to board. This is because, in JavaScript, strings are primitive values: assigning a string copies its value at that moment, so board captured the contents of position once, when it was created.
If you need to derive a value from another for sending, it's best to write a function that transforms your input to a desired output format, like so:
function toBoard(position) {
  return [position[0], "-", position[1], "-", position[2]];
}
and then
toBoard(position).forEach(item => {
  socket.write(item);
});
Note, however, that your socket-related code has a serious bug: it treats data events as messages that come in individually. This is called message or datagram semantics, where a peer sends 1 message and the other peer receives the same message in its entirety. This is different from stream semantics, where a sequence of bytes is sent, and the same bytes come out on the other side, in the same order, but not necessarily sliced the same way.
With TCP, if the client does:
socket.write('1');
socket.write('2');
socket.write('3');
The server may receive any of:
one data event with '123'
two data events with '1' and '23'
two data events with '12' and '3'
three separate data events with '1', '2' and '3'
You should look into protocols that preserve message boundaries, instead of using raw TCP. Here are some protocols you could use instead:
UDP
WebSocket
ZeroMQ
If you decide to implement this on raw TCP yourself (which is error-prone), you'll need some logic to receive the stream progressively from the network, buffer and split the chunks accordingly into logical messages that you can process. A simple example of a built-in tool which does this is readline - a module that converts a stream-oriented input (such as a TCP socket, or a process' standard input) into discrete line events.
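For illustration, here is a minimal sketch of that readline approach applied to the server above, assuming the client terminates each value with a newline (the names mirror the question's code, but the newline framing is an assumption):
const net = require('net');
const readline = require('readline');

const server = net.createServer(socket => {
  // readline re-slices the incoming byte stream into discrete lines,
  // so each 'line' event carries exactly one newline-terminated message.
  const rl = readline.createInterface({ input: socket });
  rl.on('line', line => {
    const value = Number(line);       // one complete client message
    position[value - 1] = 'X';
    toBoard(position).forEach(item => socket.write(item));
  });
});
server.listen(5000);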

Related

How to properly implement Node.js communication over Unix Domain sockets?

I'm debugging my implementation of an IPC between multithreaded node.js instances.
As datagram sockets are not supported natively, I use the default stream protocol, with simple application-level packaging.
When two threads communicate, the server side is always receiving, the client side is always sending.
// writing to the client transmitter
// const transmitter = net.createConnection(SOCKETFILE);
const outgoing_buffer = [];
let writeable = true;
const write = (transfer) => {
  if (transfer) outgoing_buffer.push(transfer);
  if (outgoing_buffer.length === 0) return;
  if (!writeable) return;
  const current = outgoing_buffer.shift();
  writeable = false;
  transmitter.write(current, "utf8", () => {
    writeable = true;
    write();
  });
};

// const server = net.createServer();
// server.listen(SOCKETFILE);
// server.on("connection", (reciever) => { ...
// reciever.on("data", (data) => { ...
// ... the read function is called with the data
let incoming_buffer = "";
const read = (data) => {
  incoming_buffer += data.toString();
  while (true) {
    const decoded = decode(incoming_buffer);
    if (!decoded) return;
    incoming_buffer = incoming_buffer.substring(decoded.length);
    // ... digest decoded string
  }
};
My stream is encoded into transfer packages and decoded back, with the data JSON-stringified in both directions.
What happens is that from time to time - seemingly more frequently under higher CPU load - the incoming_buffer picks up some random characters, displayed as ��� when logged.
Even if this happens only once in 10000 transfers, it is a problem. I need a reliable setup: even at maximum CPU load, the stream should contain no unexpected characters and should never get corrupted.
What could potentially cause this?
What would be the proper way to implement this?
Okay, I found it. The Node documentation gives a hint.
readable.setEncoding(encoding)
It must be used instead of converting each chunk with incoming_buffer += data.toString();
The readable.setEncoding() method sets the character encoding for data read from the Readable stream.

By default, no encoding is assigned and stream data will be returned as Buffer objects. Setting an encoding causes the stream data to be returned as strings of the specified encoding rather than as Buffer objects. For instance, calling readable.setEncoding('utf8') will cause the output data to be interpreted as UTF-8 data, and passed as strings. Calling readable.setEncoding('hex') will cause the data to be encoded in hexadecimal string format.

The Readable stream will properly handle multi-byte characters delivered through the stream that would otherwise become improperly decoded if simply pulled from the stream as Buffer objects.
So it depended on the number of multi-byte characters in the stress test rather than on CPU load.
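For reference, a minimal sketch of the fix on the receiving side, assuming the same SOCKETFILE server and read() function as in the question:
const net = require('net');

const server = net.createServer((reciever) => {
  // Decode at the stream level: a multi-byte UTF-8 character can no longer
  // be split across two chunks and turned into replacement characters.
  reciever.setEncoding("utf8");
  reciever.on("data", (data) => {
    read(data);   // 'data' is already a correctly decoded string here
  });
});
server.listen(SOCKETFILE);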

Nodejs PassThrough Stream

I want to transmit an fs.Readstream over a net.Socket (TCP) stream. For this I use a .pipe.
When the fs.Readstream is finished, I don't want to end the net.Socket stream. That's why I use
readStream.pipe(socket, {
end: false
})
Unfortunately I don't get 'close', 'finish' or 'end' on the other side. This prevents me from closing my fs.Writestream on the opposite side. However, the net.Socket connection remains, which I also need because I would like to receive an ID as a response.
Since I don't get a 'close' or 'finish' on the opposite side, I unfortunately can't end the fs.Writestream and therefore can't send a response with a corresponding ID.
Is there a way to manually send a 'close' or 'finish' event via the net.socket without closing it?
When I emit them myself (see the code below), only my own local listeners react.
Can anyone tell me what I am doing wrong?
var socket; // net.Socket, TCP connection
var readStream = fs.createReadStream('test.txt');

socket.on('connect', () => {
  readStream.pipe(socket, {
    end: false
  });
  readStream.on('close', () => {
    socket.emit('close');
    socket.emit('finish');
  });
  // waiting for answer
  socket.on('data', (c) => {
    console.log('got my answer: ' + c.toString());
  });
});
Well, there's not really much you can do with a single stream except provide some way for the other side to know, programmatically, that the stream has ended.
When the socket sends its end it actually flushes the buffer and then closes the TCP connection, which on the other side is translated into an 'end' event after the last byte is delivered. In order to re-use the connection you can consider these two options:
One: Use HTTP keep-alive
As you can imagine, you're not the first person to have faced this problem. It's actually a common thing, and some protocols like HTTP already have you covered. This will introduce a minor overhead, but only when starting and ending the streams - which in your case may be more acceptable than the other options.
Instead of using raw TCP streams you can just as easily use HTTP connections and send your data over HTTP requests; an HTTP POST request would be just fine, and your code wouldn't look any different except for ditching that {end: false}. The request needs to have its headers sent, so it'd be constructed like this:
const socket = http.request({
  method: 'POST',
  hostname: 'wherever.org',
  port: 9087,
  path: '/somewhere/there',
  headers: {
    'connection': 'keep-alive',
    'transfer-encoding': 'chunked'
  }
}, (res) => {
  // here you can call the code to push more streams, since the connection stays open
});
readStream.pipe(socket); // so our socket (i.e. this request) will end, but the underlying channel will stay open.
You actually don't need to wait for the socket to connect - you can pipe the stream directly, as in the example above - but do check how this behaves if your connection fails. Waiting for the connect event will also work, since the HTTP request class implements all TCP connection events and methods (although there may be slight differences in signatures).
More reading:
Wikipedia article on HTTP keep-alive - a good explanation of how this works
Node.js http.Agent options - you can control how many connections you have and, more importantly, set the default keep-alive behavior (a short sketch follows the warning below).
Oh and a bit of warning - TCP keep-alive is a different thing, so don't get confused there.
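As a rough sketch of what the http.Agent docs describe - the host, port and path are the hypothetical ones from the example above, and readStream is the file stream from the question - a keep-alive agent could look like this:
const http = require('http');

// HTTP keep-alive (not TCP keep-alive): the agent keeps the underlying
// TCP connection open and reuses it for subsequent requests.
const agent = new http.Agent({ keepAlive: true, maxSockets: 1 });

const req = http.request({
  agent,
  method: 'POST',
  hostname: 'wherever.org',
  port: 9087,
  path: '/somewhere/there',
  headers: { 'transfer-encoding': 'chunked' }
}, (res) => {
  // this request ends here, but the socket stays open for the next one
});
readStream.pipe(req);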
Two: Use a "magic" end packet
In this case you'd send a simple end marker, for instance \x00 (a NUL character), at the end of each stream. The major drawback is that you need to process the stream to make sure a NUL character can't appear anywhere else in the data - and that introduces overhead in the data processing (so more CPU usage).
To do this, you need to push the data through a transform stream before sending it to the socket - the example below works on strings only, so adapt it to your needs.
const { Transform } = require('stream');

const zeroEncoder = new Transform({
  encoding: 'utf-8',
  // escape any literal NUL bytes in the data, then pass the chunk on
  transform(chunk, enc, cb) { cb(null, chunk.toString().replace(/\x00/g, '\\x00')); },
  // when the source ends, append the single NUL end marker
  flush(cb) { cb(null, '\x00'); }
});
// ... wherever you do the writing:
readStream
  .pipe(zeroEncoder)
  .on('unpipe', () => console.log('this will be your end marker to send in another stream'))
  .pipe(socket, {end: false});
Then on the other side:
tcpStream.on('data', (chunk) => {
  if (chunk.toString().endsWith('\x00')) {
    output.end(decodeZeros(chunk));
    // and rotate output
  } else {
    output.write(decodeZeros(chunk));
  }
});
As you can see, this is way more complicated, and it's also just an example - you could simplify it a bit by using JSON, a 7-bit transfer encoding, or some other scheme, but every variant needs some trickery and, most importantly, has to read through the whole stream and use much more memory for it - so I don't really recommend this approach. If you do go this way though:
Make sure you encode/decode the data correctly
Consider whether you can find a byte that won't appear in your data
The above may work with strings, but will behave badly with Buffers
Finally, there's no error control or flow control - so at the very least pause/resume logic is needed (a sketch follows this list).
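On that last point, a minimal sketch of manual pause/resume - only needed if you write chunks yourself instead of relying on pipe, which already handles backpressure:
readStream.on('data', chunk => {
  // write() returns false once the socket's internal buffer is full
  if (!socket.write(chunk)) {
    readStream.pause();                              // stop producing
    socket.once('drain', () => readStream.resume()); // resume once flushed
  }
});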
I hope this is helpful.

Best way to query all documents from a mongodb collection in a reactive way w/out flooding RAM

I want to query all the documents in a collection in a reactive way. The collection.find() method of the mongodb nodejs driver returns a cursor that fires events for each document found in the collection. So I made this:
const giant_query = (db) => {
  var req = db.collection('mycollection').find({});
  return Rx.Observable.merge(
    Rx.Observable.fromEvent(req, 'data'),
    Rx.Observable.fromEvent(req, 'end'),
    Rx.Observable.fromEvent(req, 'close'),
    Rx.Observable.fromEvent(req, 'readable'));
}
It will do what I want: fire for each document, so I can treat them in a reactive way, like this:
Rx.Observable.of('').flatMap(giant_query).do(some_function).subscribe()
I could query the documents in batches of ten, but then I'd have to keep track of an index for each time the observable stream fires, and I'd have to build an observable loop, which I don't know is possible or the right way to do it.
The problem with this cursor is that I don't think it works in batches. It'll probably fire all the events in a short period of time, flooding my RAM. Even if I buffer some events using Observable's buffer, the events and their data (the documents) will sit in RAM waiting to be processed.
What's the best way to deal with this in a reactive way?
I'm not an expert on mongodb, but based on the examples I've seen, this is a pattern I would try.
I've omitted the events other than data, since throttling that one seems to be the main concern.
var cursor = db.collection('mycollection').find({});
const cursorNext = new Rx.BehaviorSubject('next'); // signal first batch then wait
const nextBatch = () => {
  if (cursor.hasNext()) {
    cursorNext.next('next');
  }
};

cursorNext
  .switchMap(() =>                               // wait for cursorNext to signal
    Rx.Observable.fromPromise(cursor.next())     // get a single doc
      .repeat()                                  // get another
      .takeWhile(() => cursor.hasNext())         // stop taking if out of data
      .take(batchSize)                           // until full batch
      .toArray()                                 // combine into a single emit
  )
  .map(docsBatch => {
    // do something with the batch
    // return docsBatch or modified docsBatch
  })
  ... // other operators?
  .subscribe(x => {
    ...
    nextBatch();
  });
I'm trying to put together a test of this Rx flow without mongodb; in the meantime, this might give you some ideas.
You might also want to check my solution that doesn't use RxJS:
Mongoose Cursor: http bulk request from collection

What's the node.js paradigm for socket stream conversation?

I'm trying to implement a socket protocol and it is unclear to me how to proceed. I have the socket as a Stream object, and I am able to write() data to it to send on the socket, and I know that the "readable" or "data" events can be used to receive data. But this does not work well when the protocol involves a conversation in which one host is supposed to send a piece of data, wait for a response, and then send data again after the response.
In a block paradigm it would look like this:
send some data
wait for specific data reply
massage data and send it back
send additional data
As far as I can tell, node's Stream object does not have a read function that will asynchronously return with the number of bytes requested. Otherwise, each wait could just put the remaining functionality in its own callback.
What is the node.js paradigm for this type of communication?
Technically there is a Readable.read(), but it's not recommended (maybe because you can't be sure of the size, or because it blocks - I'm not sure). You can keep track of state and, on each data event, append to a Buffer that you keep processing incrementally. You can use readUInt32LE etc. on the Buffer to read specific pieces of binary data if you need to do that (or convert to a string if it's textual data). https://github.com/runvnc/metastream/blob/master/index.js
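As an illustration of that incremental-buffer idea - not the asker's protocol, just an assumed framing where every message is prefixed with a 4-byte little-endian length, and handleMessage is a hypothetical handler:
let pending = Buffer.alloc(0);

stream.on('data', chunk => {
  pending = Buffer.concat([pending, chunk]);
  // Each frame: 4-byte little-endian length, then that many payload bytes.
  while (pending.length >= 4) {
    const len = pending.readUInt32LE(0);
    if (pending.length < 4 + len) break;   // frame not fully received yet
    const payload = pending.slice(4, 4 + len);
    pending = pending.slice(4 + len);
    handleMessage(payload);                // process one complete frame
  }
});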
If you want to write it in your 'block paradigm', you could basically make some things a promise or async function and then
let specialReplyRes = null;
const waitForSpecialReply = () => new Promise(res => specialReplyRes = res);

stream.on('data', (buff) => {
  if (buff.toString().indexOf('special') >= 0) specialReplyRes(buff.toString());
});

// ...

async function proto() {
  stream.write(data);
  let reply = await waitForSpecialReply();
  const message = massage(reply);
  stream.write(message);
}
Here the waitForSpecialReply promise resolver is stored and called once your parsing detects the expected message.

node - send large JSON over net socket

The problem is that sending large serialized JSON (over 16,000 characters) over a net socket gets split into chunks. Each chunk fires the data event on the receiving end. So simply running JSON.parse() on the incoming data may fail with SyntaxError: Unexpected end of input.
The work around I've managed to come up with so far is to append a null character ('\u0000') to the end of the serialized JSON, and check for that on the receiving end. Here is an example:
var partialData = '';
client.on( 'data', function( data ) {
  data = data.toString();
  if ( data.charCodeAt( data.length - 1 ) !== 0 ) {
    partialData += data;
    // if data is incomplete then no need to proceed
    return;
  } else {
    // append all but the null character to the existing partial data
    partialData += data.substr( 0, data.length - 1 );
  }
  // pass parsed data to some function for processing
  workWithData( JSON.parse( partialData ));
  // reset partialData for next data transfer
  partialData = '';
});
One of the failures of this model is if the receiver is connected to multiple sockets, and each socket is sending large JSON files.
The reason I'm doing this is because I need to pass data between two processes running on the same box, and I prefer not to use a port. Hence using a net socket. So there would be two questions: First, is there a better way to quickly pass large JSON data between two Node.js processes? Second, if this is the best way then how can I better handle the case where the serialized JSON is being split into chunks when sent?
You can use try...catch on every chunk to see if you have valid JSON yet. Not very good performance, though.
You can calculate the size of your JSON on the sending side and send it as a length prefix before the JSON itself.
You can append a boundary string that's unlikely to appear in the JSON. Your \u0000 - yes, it's a legitimate choice. But the most popular choice is a newline (see the sketch after this list).
You can use external libraries like dnode, which should already do one of the things mentioned above for you. I'd recommend trying that. Really.
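A minimal sketch of the newline-delimited variant: JSON.stringify escapes newlines inside strings, so a bare \n can only appear between messages. The names reuse client and workWithData from the question.
// sending side
socket.write(JSON.stringify(payload) + '\n');

// receiving side
let buffered = '';
client.on('data', chunk => {
  buffered += chunk.toString();
  const lines = buffered.split('\n');
  buffered = lines.pop();                  // keep the trailing partial message
  for (const line of lines) {
    if (line.length > 0) workWithData(JSON.parse(line));
  }
});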
One of the failures of this model is if the receiver is connected to multiple sockets, and each socket is sending large JSON files.
Use different buffers for every socket. No problem here.
It is possible to identify each socket individually and build buffers for each one. I add an id to each socket when I receive a connection and then when I receive data I add that data to a buffer.
const net = require('net');
const buffers = {};

net.createServer(function(socket) {
  // There are many ways to assign an id, this is just an example.
  socket.id = Math.random() * 1000;
  buffers[socket.id] = '';
  socket.on('data', function(data) {
    // 'this' refers to the socket calling this callback.
    buffers[this.id] += data;
  });
});
Each time data arrives you can check whether you have received that "key" delimiter, which tells you that a buffer is ready to be used.
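Putting the two pieces together - per-socket buffers plus the '\u0000' delimiter from the question - a sketch could look like this (the id scheme and socket path are made up for illustration, and workWithData comes from the question):
const net = require('net');
const buffers = {};
let nextId = 0;

net.createServer(socket => {
  socket.id = nextId++;
  buffers[socket.id] = '';
  socket.on('data', data => {
    buffers[socket.id] += data.toString();
    // complete messages end with '\u0000'; keep anything after the last one
    const parts = buffers[socket.id].split('\u0000');
    buffers[socket.id] = parts.pop();
    parts.forEach(part => {
      if (part) workWithData(JSON.parse(part));
    });
  });
  socket.on('end', () => delete buffers[socket.id]);
}).listen('/tmp/big-json.sock');   // hypothetical Unix socket path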
