SocketIO limit requests on progress watching - node.js

I'm using SocketIO for a small app, where users would receive updates whenever a change occurs. However, i'd like to implement it to have real time progress visualization in tasks that are done in server side.
However, if the task progress changes too fastly, this would result in tons of event emissions and i think this could decrease the app performance. Is there a way to limit event emits to a maximum of N per second (Emitting only the last one, with the last percent of the progress) ?

Yes, that can be done. It requires you to hold events for a short time to see if there are more events of the same kind coming and thus combine all of them into one. I will code up an example in a few minutes here.
Here's a general idea for how you could do this:
function emitMessageLast(socket, msg, data) {
const queueTime = 500; // wait for up to 500ms of idle time before sending latest data
const longestWaitTime = 2000; // wait no more than 2 seconds if data is being continuously sent
function stopTimer() {
if (socket._timer) {
clearTimeout(socket._timer);
socket._timer = null;
}
}
function sendNow() {
socket._lastMsg = msg;
socket._lastTime = Date.now();
return socket.emit(msg, data);
}
// if this is the first time we're sending this message
// or it's been awhile since we last sent data
// just send the new data immediately
if (socket._lastMsg !== msg || !socket._lastTime || Date.now() - socket._lastTime > longestWaitTime) {
stopTimer();
return sendNow();
}
// at this point, we know we're sending the same message as has recently been sent
socket._lastMsg = msg;
socket._lastData = data;
stopTimer();
// set a timer so that if no more data has arrived before the timer fires,
// we sent the last data we saved
socket._timer = setTimeout(() => {
socket._timer = null;
sendNow();
}, queueTime);
}
The general idea for this code is as follows:
When you get called with a message to send and no message of the same type has been recently sent, then send this one immediately and record the time it was sent.
When you get called with a message to send and it's been more than longestWaitTime since you last sent a message, then send this one immediately. This means if the server is continuously sending data, the server will wait for up to longestWaitTime before sending the latest value of the data.
When the server is sending data sporadically, it will wait up to queueTime (waiting to see if there's more data coming) before sending the last piece of data. It is essentially buffering the last message until no more messages have been send in the last queueTime and then a timer will fire off that last message.
I've configured the defaults here so that it will delay sending data to the client for up to 500ms (while waiting to see if the server is about to send more data so it can avoid sending all the intermediate values of the data) figuring that if the client updates its status every 500ms, that is plenty often. And, if the server is continuously sending updates, then the server will skip up to 2000ms of updates to send just the one last update. Again, you can set these numbers however you see appropriate.

Related

socket.io how to send multiple messages sequentially?

I'm using socket.io like this
Client:
socket.on('response', function(i){
console.log(i);
});
socket.emit('request', whateverdata);
Server:
socket.on('request', function(whateverdata){
for (i=0; i<10000; i++){
console.log(i);
socket.emit('response', i);
}
console.log("done!");
});
I need output like this when putting the two terminals side by side:
Server Client
0 0
1 1
. (etc) .
. .
9998 9998
9999 9999
done!
But instead I am getting this:
Server Client
0
1
. (etc)
.
9998
9999
done!
0
1
.
. (etc)
9998
9999
Why?
Shouldn't Socket.IO / Node emit the message immediately, not wait for the loop to complete before emitting any of them?
Notes:
The for loop is very long and computationally slow.
This question is referring to the socket.io library, not websockets in general.
Due to latency, waiting for confirmation from the client before sending each response is not possible
The order that the messages are received is not important, only that they are received as quickly as possible
The server emits them all in a loop and it takes a small bit of time for them to get to the client and get processed by the client in another process. This should not be surprising.
It is also possible that the single-threaded nature of Javascript in node.js prevents the emits from actually getting sent until your Javascript loop finishes. That would take detailed examination of socket.io code to know for sure if that is an issue. As I said before if you want to 1,1 then 2,2 then 3,3 instead of 1,2,3 sent, then 1,2,3 received you have to write code to force that.
If you want the client to receive the first before the server sends the 2nd, then you have to make the client send a response to the first and have the server not send the 2nd until it receives the response from the first. This is all async networking. You don't control the order of events in different processes unless you write specific code to force a particular sequence.
Also, how do you have client and server in the same console anyway? Unless you are writing out precise timestamps, you wouldn't be able to tell exactly what event came before the other in two separate processes.
One thing you could try is to send 10, then do a setTimeout(fn, 1) to send the next 10 and so on. That would give JS a chance to breathe and perhaps process some other events that are waiting for you to finish to allow the packets to get sent.
There's another networking issue too. By default TCP tries to batch up your sends (at the lowest TCP level). Each time you send, it sets a short timer and doesn't actually send until that timer fires. If more data arrives before the timer fires, it just adds that data to the "pending" packet and sets the timer again. This is referred to as the Nagle's algorithm. You can disable this "feature" on a per-socket basis with socket.setNoDelay(). You have to call that on the actual TCP socket.
I am seeing some discussion that Nagle's algorithm may already be turned off for socket.io (by default). Not sure yet.
In stepping through the process of socket.io's .emit(), there are some cases where the socket is marked as not yet writable. In those cases, the packets are added to a buffer and will be processed "later" on some future tick of the event loop. I cannot see exactly what puts the socket temporarily in this state, but I've definitely seen it happen in the debugger. When it's that way, a tight loop of .emit() will just buffer and won't send until you let other events in the event loop process. This is why doing setTimeout(fn, 0) every so often to keep sending will then let the prior packets process. There's some other event that needs to get processed before socket.io makes the socket writable again.
The issue occurs in the flush() method in engine.io (the transport layer for socket.io). Here's the code for .flush():
Socket.prototype.flush = function () {
if ('closed' !== this.readyState &&
this.transport.writable &&
this.writeBuffer.length) {
debug('flushing buffer to transport');
this.emit('flush', this.writeBuffer);
this.server.emit('flush', this, this.writeBuffer);
var wbuf = this.writeBuffer;
this.writeBuffer = [];
if (!this.transport.supportsFraming) {
this.sentCallbackFn.push(this.packetsFn);
} else {
this.sentCallbackFn.push.apply(this.sentCallbackFn, this.packetsFn);
}
this.packetsFn = [];
this.transport.send(wbuf);
this.emit('drain');
this.server.emit('drain', this);
}
};
What happens sometimes is that this.transport.writable is false. And, when that happens, it does not send the data yet. It will be sent on some future tick of the event loop.
From what I can tell, it looks like the issue may be here in the WebSocket code:
WebSocket.prototype.send = function (packets) {
var self = this;
for (var i = 0; i < packets.length; i++) {
var packet = packets[i];
parser.encodePacket(packet, self.supportsBinary, send);
}
function send (data) {
debug('writing "%s"', data);
// always creates a new object since ws modifies it
var opts = {};
if (packet.options) {
opts.compress = packet.options.compress;
}
if (self.perMessageDeflate) {
var len = 'string' === typeof data ? Buffer.byteLength(data) : data.length;
if (len < self.perMessageDeflate.threshold) {
opts.compress = false;
}
}
self.writable = false;
self.socket.send(data, opts, onEnd);
}
function onEnd (err) {
if (err) return self.onError('write error', err.stack);
self.writable = true;
self.emit('drain');
}
};
Where you can see that the .writable property is set to false when some data is sent until it gets confirmation that the data has been written. So, when rapidly sending data in a loop, it may not be letting the event come through that signals that the data has been successfully sent. When you do a setTimeout() to let some things in the event loop get processed that confirmation event comes through and the .writable property gets set to true again so data can again be sent immediately.
To be honest, socket.io is built of so many abstract layers across dozens of modules that it's very difficult code to debug or analyze on GitHub so it's hard to be sure of the exact explanation. I did definitely see the .writable flag as false in the debugger which did cause a delay so this seems like a plausible explanation to me. I hope this helps.

how to avoid countdown timer reset on page refresh , node js / socket io

server.js
io.on('connection', function (socket) {
var addedUser = false;
socket.on('setTimer', function(data) {
timer.setEndTime(data.time);
socket.broadcast.emit('currentEndTime', {time: timer.getEndTime() });
});
});
client.js
$(function() {
var timer = new Timer(),
socket = io.connect('http://localhost:3000');
socket.on('currentEndTime', function (data) {
//this is the full date time in ms.
timer.setEndTimeFromServer(data.time);
});
set = setInterval(function(){
$('.time').trigger('click');
clearInterval(set);
},100);
$('.time').on('click', function(e) {
e.stopPropagation();
var time = $(this).text() * 1000;
timer.setEndTime(time);
timer.timeRemaining();
socket.emit('setTimer', { time: time });
});
});
Hi there, i am trying to integrate countdown timer for node js/socket io application. Timer works fine, but how do i avoid timer reset on new socket connection/page refresh. Thank You
This is because of two things:
Your socket goes away when you hit the refresh button and a new one is created and thus your connection to the server (and hence the timer value) goes away when you hit the refresh button.
When you load the page first, you don't get the current value of the timer and print it on your page to start everything.
You can have each individual timer stored in a key value storage. You can use the IP Address as key and timer value as value or something like that. This way, you can retrieve the current value of the timer for the user when user connects to the server and continue counting down.
If you use the socket ID as key, it will do you no good except to pointlessly populate a key value storage and spike up your memory usage. Use something more persistent, such as IP Address, username, e-mail or something of that sort as key and current value of timer as value.
Also, your current solution is flawed because it faces the "refresh lag". When you hit the refresh button, you stop counting for the amount of time that is spent on re-rendering of your web page. If it's nothing more than a handful of milliseconds, you're fine; but as long as it goes to a few hundreds of milliseconds, this will make actual difference. Do the count down at the server side. Just notify the client at the set intervals. This will increase the server workload, but at least you're not bound to having bad timers.
If you're going to have a single countdown, like "x days, x hours, x minutes, x seconds remaining to Superbowl or some other big event", you're much better off without socket.io. Just send the timestamp to the big event to the client and do the remainder on the client side.

Amazon SQS with aws-sdk receiveMessage Stall

I'm using the aws-sdk node module with the (as far as I can tell) approved way to poll for messages.
Which basically sums up to:
sqs.receiveMessage({
QueueUrl: queueUrl,
MaxNumberOfMessages: 10,
WaitTimeSeconds: 20
}, function(err, data) {
if (err) {
logger.fatal('Error on Message Recieve');
logger.fatal(err);
} else {
// all good
if (undefined === data.Messages) {
logger.info('No Messages Object');
} else if (data.Messages.length > 0) {
logger.info('Messages Count: ' + data.Messages.length);
var delete_batch = new Array();
for (var x=0;x<data.Messages.length;x++) {
// process
receiveMessage(data.Messages[x]);
// flag to delete
var pck = new Array();
pck['Id'] = data.Messages[x].MessageId;
pck['ReceiptHandle'] = data.Messages[x].ReceiptHandle;
delete_batch.push(pck);
}
if (delete_batch.length > 0) {
logger.info('Calling Delete');
sqs.deleteMessageBatch({
Entries: delete_batch,
QueueUrl: queueUrl
}, function(err, data) {
if (err) {
logger.fatal('Failed to delete messages');
logger.fatal(err);
} else {
logger.debug('Deleted recieved ok');
}
});
}
} else {
logger.info('No Messages Count');
}
}
});
receiveMessage is my "do stuff with collected messages if I have enough collected messages" function
Occasionally, my script is stalling because I don't get a response for Amazon at all, say for example there are no messages in the queue to consume and instead of hitting the WaitTimeSeconds and sending a "no messages object", the callback isn't called.
(I'm writing this up to Amazon Weirdness)
What I'm asking is whats the best way to detect and deal with this, as I have some code in place to stop concurrent calls to receiveMessage.
The suggested answer here: Nodejs sqs queue processor also has code that prevents concurrent message request queries (granted it's only fetching one message a time)
I do have the whole thing wrapped in
var running = false;
runMonitorJob = setInterval(function() {
if (running) {
} else {
running = true;
// call SQS.receive
}
}, 500);
(With a running = false after the delete loop (not in it's callback))
My solution would be
watchdogTimeout = setTimeout(function() {
running = false;
}, 30000);
But surely this would leave a pile of floating sqs.receive's lurking about and thus much memory over time?
(This job runs all the time, and I left it running on Friday, it stalled Saturday morning and hung till I manually restarted the job this morning)
Edit: I have seen cases where it hangs for ~5 minutes and then suddenly gets messages BUT with a wait time of 20 seconds it should throw a "no messages" after 20 seconds. So a WatchDog of ~10 minutes might be more practical (depending on the rest of ones business logic)
Edit: Yes Long Polling is already configured Queue Side.
Edit: This is under (latest) v2.3.9 of aws-sdk and NodeJS v4.4.4
I've been chasing this (or a similar) issue for a few days now and here's what I've noticed:
The receiveMessage call does eventually return although only after 120 seconds
Concurrent calls to receiveMessage are serialised by the AWS.SDK library so making multiple calls in parallel have no effect.
The receiveMessage callback does not error - in fact after the 120 seconds have passed, it may contain messages.
What can be done about this? This sort of thing can happen for a number of reasons and some/many of these things can't necessarily be fixed. The answer is to run multiple services each calling receiveMessage and processing the messages as they come - SQS supports this. At any time, one of these services may hit this 120 second lag but the other services should be able to continue on as normal.
My particular problem is that I have some critical singleton services that can't afford 120 seconds of down time. For this I will look into either 1) use HTTP instead of SQS to push messages into my service or 2) spawn slave processes around each of the singletons to fetch the messages from SQS and push them into the service.
I also ran into this issue, but not when calling receiveMessage but sendMessage. I also saw hangups of exactly 120 seconds. I also saw it with a few other services, like Firehose.
That lead me to this line in the AWS SDK:
SQS Constructor
httpOptions:
timeout [Integer] — Sets the socket to timeout after timeout milliseconds of inactivity on the socket. Defaults to two minutes (120000).
to implement a fix, I override the timeout for my SQS client that performs the sendMessage to timeout after 10 seconds, and another with 25 seconds for receiving (where I long poll for 20 seconds):
var sendClient = new AWS.SQS({httpOptions:{timeout:10*1000}});
var receiveClient = new AWS.SQS({httpOptions:{timeout:25*1000}});
I've had this out in production for a week now and I've noticed that all of my SQS stalling issues have been eliminated.

Socket.io loses data on server side

Yellow,
so, I'm making a multiplayer online game on node (for funzies) and I'm stuck on a problem for over a week now. Perhaps the solution is simple, but I'm oblivious to it.
Long story short:
Data gets sent from client to server, this emit happens every
16.66ms.
Server receives them correctly and we collect all the data (lots of
fireballs in this case). We save them in player.skills_to_execute
array.
Every 5 seconds, we copy the data to seperate array (player_information), because
we are gona clean the current one, so it can keep collecting new
data, and then we send all the collected data back to the client.
Problem is definitely on server side. Sometimes this works, and sometimes it doesn't.
player_information is the array that I'm sending back to front, but before I send it, I do check with console.log on server if it does actually contain the data, and it does! But somehow that data gets deleted/overwritten right before sending it and it sends empty array (cause I check on frontend and I receive empty).
Code is fairly more complex, but I've minimized it here so it's easier to understand it.
This code stays on client side, and works as it should:
// front.js
socket.on("update-player-information", function(player_data_from_server){
console.log( player_data_from_server.skills_to_execute );
});
socket.emit("update-player-information", {
skills_to_execute: "fireball"
});
This code stays on server side, and works as it should:
// server.js
socket.on("update-player-information", function(data){
// only update if there are actually skills received
// we dont want every request here to overwrite actual array with empty [] // data.skills_to_execute = this will usually be 1 to few skills that are in need to be executed on a single client cycle
// naturally, we receive multiple requests in these 5 seconds,
// so we save them all in player object, where it has an array for this
if ( data.skills_to_execute.length > 0 ) {
player.skills_to_execute.push( data.skills_to_execute );
}
});
Now this is the code, where shit hits the fan.
// server.js
// Update player information
setInterval(function(){
// for every cycle, reset the bulk data that we are gona send, just to be safe
var player_information = [];
// collect the data from player
player_information.push(
{
skills_to_execute: player.skills_to_execute
}
);
// we reset the collected actions here, cause they are now gona be sent to front.js
// and we want to keep collecting new skills_to_execute that come in
player.skills_to_execute = [];
socket.emit("update-player-information", player_information);
}, 5000);
Perhaps anybody has any ideas?
Copy the array by value instead of by reference.
Try this:
player_information.push(
{
skills_to_execute: player.skills_to_execute.slice()
}
);
Read more about copying arrays in JavaScript by value or by reference here: Copying array by value in JavaScript

nodejs http response.write: is it possible out-of-memory?

If i have following code to send data repeatedly to client every 10ms:
setInterval(function() {
res.write(somedata);
}, 10ms);
What would happen if the client is very slow to receive the data?
Will server get out-of-memory error?
Edit:
actually the connection is kept alive, sever send jpeg data endlessly (HTTP multipart/x-mixed-replace header + body + header + body.....)
Because node.js response.write is asynchronous,
so some users guess it may store data in internal buffer and wait until low layer tells it can send,
so the internal buffer will grow, am i right?
If i am right, then how to resolve this?
the problem is node.js does not notify me when data is send for a single write call.
In other word, i can not tell user this way is theoretically no risk of "out of memory" and how to fix it.
Update:
By the keyword "drain" event given by user568109, i studied the source of node.js, and got conclusion:
it really will cause "out-of-memory" error. I should check return value of response.write(...)===false and then handle "drain" event of the response.
http.js:
OutgoingMessage.prototype._buffer = function(data, encoding) {
this.output.push(data); //-------------No check here, will cause "out-of-memory"
this.outputEncodings.push(encoding);
return false;
};
OutgoingMessage.prototype._writeRaw = function(data, encoding) { //this will be called by resonse.write
if (data.length === 0) {
return true;
}
if (this.connection &&
this.connection._httpMessage === this &&
this.connection.writable &&
!this.connection.destroyed) {
// There might be pending data in the this.output buffer.
while (this.output.length) {
if (!this.connection.writable) { //when not ready to send
this._buffer(data, encoding); //----------> save data into internal buffer
return false;
}
var c = this.output.shift();
var e = this.outputEncodings.shift();
this.connection.write(c, e);
}
// Directly write to socket.
return this.connection.write(data, encoding);
} else if (this.connection && this.connection.destroyed) {
// The socket was destroyed. If we're still trying to write to it,
// then we haven't gotten the 'close' event yet.
return false;
} else {
// buffer, as long as we're not destroyed.
this._buffer(data, encoding);
return false;
}
};
Some gotchas:
If sending over http it is not be a good idea. The browser may consider the request as timeout if it is not finished within specified amount of time. Server too will close connection which is idle for too long. If client cannot keep up, the timeout is almost certain.
setInterval for 10ms is also subject to some restrictions. It doesn't mean it will repeat after every 10ms, 10ms is the minimum it will wait before repeating. It will be slower than what you set the interval.
Let's say you chance to overload the response with data, then at some point the server will end connection and respond by 413 Request Entity Too Large depending on what the limit is set.
Node.js has single threaded architecture with a max memory limitation of around 1.7 GB. If you set your above server limits to too high and have many incoming connections you will get process out of memory error.
So with appropriate limits it will either give timeout or be request too large. (And there are no other errors in your program.)
Update
You need to use drain event. The http response is a writable stream. It has its own internal buffer. When the buffer is emptied the drain event is triggered. You should learn more about streams as you would go in deeper. This will help you not just in http. You can find several resources about streams on web.

Resources