Multiple MQTT Node-RED nodes not behaving as expected - node.js

I have multiple MQTT nodes, each configured with a different topic. I want to process the values from several topics and draw conclusions from them (basically stream analytics).
My expectation:
I know that JavaScript is single-threaded, so I expected that when data for one topic is received it is processed to completion, and only then is the next topic's data received, and so on.
Reality:
It behaves as if it were multi-threaded.
Test case:
Flow: MQTT ---> Process for a second ---> Output
Sleep function code (not really sleeping, more like processing):
var start = new Date().getTime();
for (var i = 0; i < 1e7; i++)
{
  if ((new Date().getTime() - start) > 1000)
  {
    break;
  }
}
return msg;
Now I publish the values from 1 to 100 continuously using a for loop.
My expectation:
1, 2, 3....100 will be displayed one after another with a 1 second gap, so it should take approximately 100 seconds to display the values from 1 to 100.
Reality:
It first sleeps for 100 seconds, and then all values from 1 to 100 are displayed at once. So what is happening here?
Flow json:
[{"id":"e9a53835.09af38","type":"tab","label":"Flow 1","disabled":false,"info":""},{"id":"5ffb1b40.1405b4","type":"debug","z":"e9a53835.09af38","name":"","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"false","x":438,"y":216,"wires":[]},{"id":"a8406277.a78ee","type":"mqtt in","z":"e9a53835.09af38","name":"Test MQTT Queue","topic":"1","qos":"2","broker":"b4c58fab.26844","x":146,"y":120,"wires":[["629e90bb.996ad"]]},{"id":"629e90bb.996ad","type":"function","z":"e9a53835.09af38","name":"Sleep 1 seconds","func":"var start = new Date().getTime();\nfor (var i = 0; i < 1e7; i++)\n{\n if ((new Date().getTime() - start) > 1000)\n {\n break;\n }\n}\nreturn msg;","outputs":1,"noerr":0,"x":298,"y":168,"wires":[["5ffb1b40.1405b4"]]},{"id":"b4c58fab.26844","type":"mqtt-broker","z":"","name":"","broker":"127.0.0.1","port":"1883","clientid":"","usetls":false,"compatmode":true,"keepalive":"60","cleansession":true,"willTopic":"","willQos":"2","willRetain":"false","willPayload":"","birthTopic":"","birthQos":"2","birthRetain":"false","birthPayload":""}]
C# publisher function:
// Retain: false, QoS = 2 on both publisher and client.
for (int i = 1; i <= 10; i++)
{
    // The topic is the string "1", matching the "topic" of the mqtt in node in the flow above.
    client.Publish("1", Encoding.UTF8.GetBytes(i.ToString()), MqttMsgBase.QOS_LEVEL_EXACTLY_ONCE, false);
}

The first message arrives and is passed to the Function node, which then does a busy-wait loop, not allowing the node.js event loop to process any other work.
During that time, the remaining 99 messages arrive in the underlying MQTT client and internal events are queued up to process them.
The first message then finally makes it to the Debug node. The Debug node passes the message to the websocket asynchronously - which means that piece of work is wrapped in an event and put at the end of the node.js event queue - behind the 99 messages.
The same then happens for the next 99 messages: each is processed synchronously with no opportunity for the node.js event loop to make progress, and each adds another event to the end of the queue to have its message passed to Debug.
Once the last of the messages has been processed, the node.js event loop reaches the events that send the debug messages over the websocket, and all 100 messages appear in the Debug sidebar.
The key here is that blocking synchronously is a bad thing to do in the node.js world. If you want to delay a message, use a Delay node, which does so using timers - thereby allowing node.js to continue processing other work in the background.
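The ordering described above can be reproduced with a toy model. This is only a sketch, not node.js internals or Node-RED code: `runToyEventLoop` and its queue are hypothetical names that mimic a single FIFO event queue in which every handler runs to completion, and it assumes all messages have already arrived by the time the first handler finishes.

```javascript
// Toy model of a single-threaded FIFO event queue (NOT the real node.js loop).
function runToyEventLoop(messageCount) {
  const queue = []; // pending events, processed strictly in arrival order
  const log = [];   // the order in which things actually happen

  // The messages arrive while the first one is still busy-waiting,
  // so their "process" events are already sitting in the queue.
  for (let n = 1; n <= messageCount; n++) {
    queue.push(() => {
      log.push('process ' + n);                 // the blocking Function node
      queue.push(() => log.push('debug ' + n)); // async hand-off to the sidebar
    });
  }

  while (queue.length > 0) {
    queue.shift()(); // run exactly one event to completion
  }
  return log;
}

console.log(runToyEventLoop(3));
```

With three messages the log shows all three "process" entries before the first "debug" entry, which is the same shape as 100 seconds of silence followed by 100 debug messages at once.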

Related

NodeJS - Print stack trace when stuck/frozen

Is it possible to print the stack trace of a nodejs app when it is becoming very slow or froze to get information about performance spikes?
This would be incredibly helpful in instances where the reproduction for the issue is unknown.
In Java this saved hundreds of hours and was straight forward:
spawn a new "watchdog" thread
send a heartbeat every 50ms from the main thread to the watchdog
if the "watchdog" doesn't receive a heartbeat for +200ms, log the main threads stacktrace
Is something like this possible with nodejs?
FYI: the nodejs diagnostics report doesn't contain any JavaScript stack trace when initiated from a sig kill event.
You are looking for a way to check whether the event loop is blocked or slow. There is an npm package, https://www.npmjs.com/package/blocked-at, that detects slow synchronous execution and reports where it started.
Usage:
const blocked = require('blocked-at');
blocked((time, stack) => {
  console.log(`Blocked for ${time}ms, operation started here:`, stack)
});
You can also implement a check yourself from scratch, like this:
var interval = 500;   // how often to sample the event loop, in ms
var blockDelta = 500; // report delays longer than this, in ms
var timer = setInterval(function() {
  var last = process.hrtime();
  setImmediate(function() {
    // process.hrtime(last) returns [seconds, nanoseconds]; convert to ms
    var delta = process.hrtime(last);
    var deltaMs = delta[0] * 1e3 + delta[1] / 1e6;
    if (deltaMs > blockDelta) {
      console.log("node.eventloop_blocked", deltaMs);
    }
  });
}, interval);
The idea is: if a scheduled callback doesn't fire within the expected time, the event loop was blocked by some operation.
This snippet checks whether the event loop is blocked for more than 500 ms. It isn't perfect; I suggest using blocked-at for more robust detection.
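The "timer fires later than expected" idea can also be sketched by measuring how far a repeating timer drifts from its schedule. The names `timerLagMs` and `watchEventLoop` are hypothetical, and the sketch assumes `Date.now()` is accurate enough for this purpose (it is not a monotonic clock). Unlike blocked-at, it only reports that a stall happened, not where it started.

```javascript
// How late did a timer fire, given when it was scheduled and its interval?
function timerLagMs(scheduledAtMs, firedAtMs, intervalMs) {
  return Math.max(0, firedAtMs - scheduledAtMs - intervalMs);
}

// Calls onBlocked(lagMs) whenever the interval timer fires more than
// thresholdMs later than it should have.
function watchEventLoop(intervalMs, thresholdMs, onBlocked) {
  let scheduledAt = Date.now();
  const timer = setInterval(() => {
    const now = Date.now();
    const lag = timerLagMs(scheduledAt, now, intervalMs);
    if (lag > thresholdMs) {
      onBlocked(lag);
    }
    scheduledAt = now; // the next tick is measured from this one
  }, intervalMs);
  timer.unref(); // monitoring alone should not keep the process alive
  return timer;
}
```

A possible use is `watchEventLoop(50, 200, lag => console.warn('event loop stalled for ~' + lag + 'ms'))`, which loosely mirrors the 50 ms heartbeat and 200 ms threshold from the question.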

SocketIO limit requests on progress watching

I'm using SocketIO for a small app where users receive updates whenever a change occurs. I'd also like to use it for real-time progress visualization of tasks that run on the server side.
However, if the task progress changes too quickly, this would result in tons of event emissions, and I think this could decrease the app's performance. Is there a way to limit event emits to a maximum of N per second (emitting only the last one, with the latest progress percentage)?
Yes, that can be done. It requires you to hold events for a short time to see if there are more events of the same kind coming and thus combine all of them into one.
Here's a general idea for how you could do this:
function emitMessageLast(socket, msg, data) {
  const queueTime = 500; // wait for up to 500ms of idle time before sending latest data
  const longestWaitTime = 2000; // wait no more than 2 seconds if data is being continuously sent
  function stopTimer() {
    if (socket._timer) {
      clearTimeout(socket._timer);
      socket._timer = null;
    }
  }
  function sendNow() {
    socket._lastMsg = msg;
    socket._lastTime = Date.now();
    return socket.emit(msg, data);
  }
  // if this is the first time we're sending this message
  // or it's been a while since we last sent data
  // just send the new data immediately
  if (socket._lastMsg !== msg || !socket._lastTime || Date.now() - socket._lastTime > longestWaitTime) {
    stopTimer();
    return sendNow();
  }
  // at this point, we know we're sending the same message as has recently been sent
  socket._lastMsg = msg;
  socket._lastData = data;
  stopTimer();
  // set a timer so that if no more data has arrived before the timer fires,
  // we send the data from this (the most recent) call
  socket._timer = setTimeout(() => {
    socket._timer = null;
    sendNow();
  }, queueTime);
}
The general idea for this code is as follows:
When you get called with a message to send and no message of the same type has been recently sent, then send this one immediately and record the time it was sent.
When you get called with a message to send and it's been more than longestWaitTime since you last sent a message, then send this one immediately. This means if the server is continuously sending data, the server will wait for up to longestWaitTime before sending the latest value of the data.
When the server is sending data sporadically, it will wait up to queueTime (waiting to see if there's more data coming) before sending the last piece of data. It is essentially buffering the last message until no more messages have been sent in the last queueTime, and then a timer fires off that last message.
I've configured the defaults here so that it will delay sending data to the client for up to 500ms (while waiting to see if the server is about to send more data so it can avoid sending all the intermediate values of the data) figuring that if the client updates its status every 500ms, that is plenty often. And, if the server is continuously sending updates, then the server will skip up to 2000ms of updates to send just the one last update. Again, you can set these numbers however you see appropriate.
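If a strict "at most one emit per interval, latest value wins" rule is preferred, a plain throttle is a possible alternative. Note that this is a different technique from the idle-timer approach above, and it is only a sketch: `createThrottledEmitter` is a hypothetical helper, the `now` and `schedule` parameters exist only so the timing can be replaced in tests, and it assumes a single event name per emitter.

```javascript
// Wraps an emit function so it is called at most once per minIntervalMs.
// Values arriving in between overwrite each other; only the latest is sent.
function createThrottledEmitter(emit, minIntervalMs, now = Date.now, schedule = setTimeout) {
  let lastSentAt = -Infinity;
  let pending = null;      // latest { event, data } waiting to go out
  let timerActive = false;

  function send(event, data) {
    lastSentAt = now();
    emit(event, data);
  }

  return function throttledEmit(event, data) {
    const elapsed = now() - lastSentAt;
    if (!timerActive && elapsed >= minIntervalMs) {
      send(event, data); // quiet period: send right away
      return;
    }
    pending = { event, data }; // too soon: remember only the newest value
    if (!timerActive) {
      timerActive = true;
      schedule(() => {
        timerActive = false;
        const latest = pending;
        pending = null;
        if (latest) send(latest.event, latest.data);
      }, minIntervalMs - elapsed);
    }
  };
}
```

With Socket.IO this could be used as `const emitProgress = createThrottledEmitter((e, d) => socket.emit(e, d), 1000 / N)` for a limit of N emits per second, where `socket` and `N` come from your own code.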

About checkpoint strategy in event hub processor

I use the Event Hubs EventProcessorHost to receive and process events from Event Hubs. For better performance, I call checkpoint every 3 minutes instead of every time events are received:
public async Task ProcessEventsAsync(PartitionContext context, IEnumerable<EventData> messages)
{
    foreach (var eventData in messages)
    {
        // do something
    }
    if (checkpointStopWatch.Elapsed > TimeSpan.FromMinutes(3))
    {
        await context.CheckpointAsync();
    }
}
But the problem is that some events might never be checkpointed if no new events are sent to the event hub, because ProcessEventsAsync won't be invoked when there are no new messages.
Any suggestions for making sure all processed events are checkpointed, while still checkpointing only every several minutes?
Update: Per Sreeram's suggestion, I updated the code as below:
public async Task ProcessEventsAsync(PartitionContext context, IEnumerable<EventData> messages)
{
    foreach (var eventData in messages)
    {
        // do something
    }
    this.lastProcessedEventsCount += messages.Count();
    if (this.checkpointStopWatch.Elapsed > TimeSpan.FromMinutes(3))
    {
        this.checkpointStopWatch.Restart();
        // only checkpoint if something was processed since the last checkpoint
        if (this.lastProcessedEventsCount > 0)
        {
            await context.CheckpointAsync();
            this.lastProcessedEventsCount = 0;
        }
    }
}
Good case to cover!
You could lose event checkpoints (and, as a result, see events replayed) in the following 2 cases:
When you have a sparse data flow (for example, a batch of messages every 5 minutes with a checkpoint interval of 3 minutes) and the EventProcessorHost instance closes for some reason, you could see 2 minutes of EventData being re-processed. To handle that case,
keep track of the lastProcessedEvent after completing IEventProcessor.onEvents/IEventProcessor.ProcessEventsAsync, and checkpoint when you get notified on close - IEventProcessor.onClose/IEventProcessor.CloseAsync.
There might also be a case where no more events arrive on a specific EventHubs partition. In this case, you would never see the last event being checkpointed with your checkpointing strategy. However, this is uncommon when you have a continuous flow of EventData and you are not sending to a specific EventHubs partition (EventHubClient.send(EventData_Without_PartitionKey)). If you think you could run into this situation, use the:
EventProcessorOptions.setInvokeProcessorAfterReceiveTimeout(true); // in java or
EventProcessorOptions.InvokeProcessorAfterReceiveTimeout = true; // in C#
flag to wake up processEventsAsync every so often. Then keep track of LastProcessedEventData and LastCheckpointedEventData, and decide whether to checkpoint when no events are received, based on the EventData.SequenceNumber property of those events.
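The bookkeeping suggested above is independent of the SDK language, so here is a minimal sketch of just the decision logic, written in JavaScript rather than C# or Java. It does not use the EventProcessorHost API: `createCheckpointTracker`, `recordProcessed`, `shouldCheckpoint` and `recordCheckpoint` are hypothetical helpers, and the sequence numbers stand in for EventData.SequenceNumber.

```javascript
// Tracks what has been processed vs. checkpointed for one partition.
function createCheckpointTracker(intervalMs) {
  return {
    intervalMs,
    lastProcessedSeq: null,    // sequence number of the last processed event
    lastCheckpointedSeq: null, // sequence number covered by the last checkpoint
    lastCheckpointAtMs: 0
  };
}

function recordProcessed(tracker, sequenceNumber) {
  tracker.lastProcessedSeq = sequenceNumber;
}

// True only when the interval has elapsed AND something new was processed.
// Intended to be evaluated on every invocation, including the empty
// "receive timeout" invocations.
function shouldCheckpoint(tracker, nowMs) {
  const hasNewWork = tracker.lastProcessedSeq !== null &&
    tracker.lastProcessedSeq !== tracker.lastCheckpointedSeq;
  const intervalElapsed = nowMs - tracker.lastCheckpointAtMs >= tracker.intervalMs;
  return hasNewWork && intervalElapsed;
}

function recordCheckpoint(tracker, nowMs) {
  tracker.lastCheckpointedSeq = tracker.lastProcessedSeq;
  tracker.lastCheckpointAtMs = nowMs;
}
```

The same comparison could be run once more when the processor is closed, ignoring the interval, so the final processed event is not left without a checkpoint.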

socket.io how to send multiple messages sequentially?

I'm using socket.io like this
Client:
socket.on('response', function(i){
  console.log(i);
});
socket.emit('request', whateverdata);
Server:
socket.on('request', function(whateverdata){
  for (var i = 0; i < 10000; i++){
    console.log(i);
    socket.emit('response', i);
  }
  console.log("done!");
});
I need output like this when putting the two terminals side by side:
Server      Client
0           0
1           1
. (etc)     .
.           .
9998        9998
9999        9999
done!
But instead I am getting this:
Server      Client
0
1
. (etc)
.
9998
9999
done!
            0
            1
            .
            . (etc)
            9998
            9999
Why?
Shouldn't Socket.IO / Node emit the message immediately, not wait for the loop to complete before emitting any of them?
Notes:
The for loop is very long and computationally slow.
This question is referring to the socket.io library, not websockets in general.
Due to latency, waiting for confirmation from the client before sending each response is not possible
The order that the messages are received is not important, only that they are received as quickly as possible
The server emits them all in a loop and it takes a small bit of time for them to get to the client and get processed by the client in another process. This should not be surprising.
It is also possible that the single-threaded nature of Javascript in node.js prevents the emits from actually getting sent until your Javascript loop finishes. It would take a detailed examination of the socket.io code to know for sure whether that is an issue. If you want 1,1 then 2,2 then 3,3 instead of 1,2,3 sent followed by 1,2,3 received, you have to write code to force that.
If you want the client to receive the first before the server sends the 2nd, then you have to make the client send a response to the first and have the server not send the 2nd until it receives the response from the first. This is all async networking. You don't control the order of events in different processes unless you write specific code to force a particular sequence.
Also, how do you have client and server in the same console anyway? Unless you are writing out precise timestamps, you wouldn't be able to tell exactly what event came before the other in two separate processes.
One thing you could try is to send 10, then do a setTimeout(fn, 1) to send the next 10 and so on. That would give JS a chance to breathe and perhaps process some other events that are waiting for you to finish to allow the packets to get sent.
There's another networking issue too. By default TCP tries to batch up your sends (at the lowest TCP level). Each time you send, it sets a short timer and doesn't actually send until that timer fires. If more data arrives before the timer fires, it just adds that data to the "pending" packet and sets the timer again. This is referred to as Nagle's algorithm. You can disable this "feature" on a per-socket basis with socket.setNoDelay(). You have to call that on the actual TCP socket.
I am seeing some discussion that Nagle's algorithm may already be turned off for socket.io (by default). Not sure yet.
In stepping through the process of socket.io's .emit(), there are some cases where the socket is marked as not yet writable. In those cases, the packets are added to a buffer and will be processed "later" on some future tick of the event loop. I cannot see exactly what puts the socket temporarily in this state, but I've definitely seen it happen in the debugger. When it's that way, a tight loop of .emit() will just buffer and won't send until you let other events in the event loop process. This is why doing setTimeout(fn, 0) every so often to keep sending will then let the prior packets process. There's some other event that needs to get processed before socket.io makes the socket writable again.
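The "send a few, then yield with setTimeout" suggestion could look roughly like this. It is a sketch under stated assumptions: `emitInBatches` is a hypothetical helper, `socket` is anything with an `emit(event, value)` method, the values are already available in an array, and the `schedule` parameter (defaulting to a 1 ms setTimeout, as suggested above) exists so the yielding can be swapped out.

```javascript
// Emits items in batches, giving the event loop a turn between batches
// so that buffered packets get a chance to be written.
function emitInBatches(socket, event, items, batchSize,
                       schedule = (fn) => setTimeout(fn, 1)) {
  let index = 0;
  function sendBatch() {
    const end = Math.min(index + batchSize, items.length);
    for (; index < end; index++) {
      socket.emit(event, items[index]);
    }
    if (index < items.length) {
      schedule(sendBatch); // yield, then continue with the next batch
    }
  }
  sendBatch();
}
```

For the loop in the question this would be called as `emitInBatches(socket, 'response', values, 10)`, where `values` is the hypothetical array of results. If each value is expensive to compute, the computation would have to move inside the batch function as well, otherwise the loop still blocks before anything is sent.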
The issue occurs in the flush() method in engine.io (the transport layer for socket.io). Here's the code for .flush():
Socket.prototype.flush = function () {
  if ('closed' !== this.readyState &&
      this.transport.writable &&
      this.writeBuffer.length) {
    debug('flushing buffer to transport');
    this.emit('flush', this.writeBuffer);
    this.server.emit('flush', this, this.writeBuffer);
    var wbuf = this.writeBuffer;
    this.writeBuffer = [];
    if (!this.transport.supportsFraming) {
      this.sentCallbackFn.push(this.packetsFn);
    } else {
      this.sentCallbackFn.push.apply(this.sentCallbackFn, this.packetsFn);
    }
    this.packetsFn = [];
    this.transport.send(wbuf);
    this.emit('drain');
    this.server.emit('drain', this);
  }
};
What happens sometimes is that this.transport.writable is false. And, when that happens, it does not send the data yet. It will be sent on some future tick of the event loop.
From what I can tell, it looks like the issue may be here in the WebSocket code:
WebSocket.prototype.send = function (packets) {
  var self = this;
  for (var i = 0; i < packets.length; i++) {
    var packet = packets[i];
    parser.encodePacket(packet, self.supportsBinary, send);
  }
  function send (data) {
    debug('writing "%s"', data);
    // always creates a new object since ws modifies it
    var opts = {};
    if (packet.options) {
      opts.compress = packet.options.compress;
    }
    if (self.perMessageDeflate) {
      var len = 'string' === typeof data ? Buffer.byteLength(data) : data.length;
      if (len < self.perMessageDeflate.threshold) {
        opts.compress = false;
      }
    }
    self.writable = false;
    self.socket.send(data, opts, onEnd);
  }
  function onEnd (err) {
    if (err) return self.onError('write error', err.stack);
    self.writable = true;
    self.emit('drain');
  }
};
Here you can see that the .writable property is set to false when some data is sent, and it stays false until confirmation arrives that the data has been written. So, when rapidly sending data in a loop, the event that signals a successful write may never get a chance to come through. When you do a setTimeout() to let some things in the event loop get processed, that confirmation event comes through and the .writable property gets set to true again, so data can again be sent immediately.
To be honest, socket.io is built of so many abstract layers across dozens of modules that it's very difficult code to debug or analyze on GitHub so it's hard to be sure of the exact explanation. I did definitely see the .writable flag as false in the debugger which did cause a delay so this seems like a plausible explanation to me. I hope this helps.
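The suspected mechanism can be illustrated with a toy model. This is not engine.io code: `ToyTransport`, `ToySocket` and `demo` are hypothetical, the write confirmation is delivered by hand instead of by the event loop, and `flush()` is called manually where the real library reacts to a 'drain' event.

```javascript
class ToyTransport {
  constructor() {
    this.writable = true;
    this.wire = [];        // packets that actually left the process
    this.pendingAcks = []; // write confirmations waiting for an event-loop turn
  }
  send(packets) {
    this.writable = false; // as in the WebSocket send() quoted above
    this.wire.push(...packets);
    this.pendingAcks.push(() => { this.writable = true; });
  }
  // Stand-in for the event loop delivering the "write finished" callbacks.
  deliverAcks() {
    while (this.pendingAcks.length > 0) this.pendingAcks.shift()();
  }
}

class ToySocket {
  constructor(transport) {
    this.transport = transport;
    this.writeBuffer = [];
  }
  emit(packet) {
    this.writeBuffer.push(packet);
    this.flush();
  }
  flush() {
    if (this.transport.writable && this.writeBuffer.length > 0) {
      const wbuf = this.writeBuffer;
      this.writeBuffer = [];
      this.transport.send(wbuf);
    }
  }
}

function demo() {
  const transport = new ToyTransport();
  const socket = new ToySocket(transport);
  for (let i = 0; i < 5; i++) socket.emit(i); // tight loop, no event-loop turn
  const sentDuringLoop = transport.wire.slice();
  transport.deliverAcks();                    // the loop yields, the ack arrives
  socket.flush();
  return { sentDuringLoop, sentAfterYield: transport.wire.slice() };
}

console.log(demo());
```

In this model only the first packet leaves during the tight loop; the other four sit in the write buffer until the confirmation has been delivered, which matches the behaviour seen in the debugger.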

Amazon SQS with aws-sdk receiveMessage Stall

I'm using the aws-sdk node module with the (as far as I can tell) approved way to poll for messages.
Which basically sums up to:
sqs.receiveMessage({
  QueueUrl: queueUrl,
  MaxNumberOfMessages: 10,
  WaitTimeSeconds: 20
}, function(err, data) {
  if (err) {
    logger.fatal('Error on Message Receive');
    logger.fatal(err);
  } else {
    // all good
    if (undefined === data.Messages) {
      logger.info('No Messages Object');
    } else if (data.Messages.length > 0) {
      logger.info('Messages Count: ' + data.Messages.length);
      var delete_batch = [];
      for (var x = 0; x < data.Messages.length; x++) {
        // process
        receiveMessage(data.Messages[x]);
        // flag to delete (each entry is a plain object, not an array)
        delete_batch.push({
          Id: data.Messages[x].MessageId,
          ReceiptHandle: data.Messages[x].ReceiptHandle
        });
      }
      if (delete_batch.length > 0) {
        logger.info('Calling Delete');
        sqs.deleteMessageBatch({
          Entries: delete_batch,
          QueueUrl: queueUrl
        }, function(err, data) {
          if (err) {
            logger.fatal('Failed to delete messages');
            logger.fatal(err);
          } else {
            logger.debug('Deleted received ok');
          }
        });
      }
    } else {
      logger.info('No Messages Count');
    }
  }
});
receiveMessage is my "do stuff with collected messages if I have enough collected messages" function.
Occasionally, my script stalls because I don't get a response from Amazon at all. For example, when there are no messages in the queue to consume, instead of hitting the WaitTimeSeconds and returning a "no messages object", the callback is never called.
(I'm chalking this up to Amazon weirdness.)
What I'm asking is: what's the best way to detect and deal with this, given that I have some code in place to stop concurrent calls to receiveMessage?
The suggested answer here: Nodejs sqs queue processor also has code that prevents concurrent message request queries (granted it's only fetching one message a time)
I do have the whole thing wrapped in
var running = false;
runMonitorJob = setInterval(function() {
  if (!running) {
    running = true;
    // call SQS.receive
  }
}, 500);
(With a running = false after the delete loop (not in its callback))
My solution would be
watchdogTimeout = setTimeout(function() {
  running = false;
}, 30000);
But surely this would leave a pile of floating sqs.receive's lurking about and thus much memory over time?
(This job runs all the time, and I left it running on Friday, it stalled Saturday morning and hung till I manually restarted the job this morning)
Edit: I have seen cases where it hangs for ~5 minutes and then suddenly gets messages, but with a wait time of 20 seconds it should return "no messages" after 20 seconds. So a watchdog of ~10 minutes might be more practical (depending on the rest of one's business logic).
Edit: Yes Long Polling is already configured Queue Side.
Edit: This is under (latest) v2.3.9 of aws-sdk and NodeJS v4.4.4
I've been chasing this (or a similar) issue for a few days now and here's what I've noticed:
The receiveMessage call does eventually return although only after 120 seconds
Concurrent calls to receiveMessage are serialised by the AWS SDK library, so making multiple calls in parallel has no effect.
The receiveMessage callback does not error - in fact after the 120 seconds have passed, it may contain messages.
What can be done about this? This sort of thing can happen for a number of reasons and some/many of these things can't necessarily be fixed. The answer is to run multiple services each calling receiveMessage and processing the messages as they come - SQS supports this. At any time, one of these services may hit this 120 second lag but the other services should be able to continue on as normal.
My particular problem is that I have some critical singleton services that can't afford 120 seconds of down time. For this I will look into either 1) use HTTP instead of SQS to push messages into my service or 2) spawn slave processes around each of the singletons to fetch the messages from SQS and push them into the service.
I also ran into this issue, but not when calling receiveMessage but sendMessage. I also saw hangups of exactly 120 seconds. I also saw it with a few other services, like Firehose.
That led me to this line in the AWS SDK documentation:
SQS Constructor
httpOptions:
timeout [Integer] — Sets the socket to timeout after timeout milliseconds of inactivity on the socket. Defaults to two minutes (120000).
To implement a fix, I override the timeout for the SQS client that performs sendMessage so it times out after 10 seconds, and use another client with a 25 second timeout for receiving (where I long poll for 20 seconds):
var sendClient = new AWS.SQS({httpOptions:{timeout:10*1000}});
var receiveClient = new AWS.SQS({httpOptions:{timeout:25*1000}});
I've had this out in production for a week now and I've noticed that all of my SQS stalling issues have been eliminated.
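For the watchdog idea raised in the question, one way to avoid leaving the `running` flag stuck is to wrap the call so that the callback fires exactly once, either with the real response or with a timeout error. This is a sketch, not part of aws-sdk: `callWithWatchdog` is a hypothetical helper and the `timers` parameter exists only so the timing can be replaced in tests. It does not abort the underlying HTTP request; the `httpOptions.timeout` setting above is what bounds that.

```javascript
// Runs operation(cb) and guarantees that callback is invoked exactly once:
// with the operation's result, or with a timeout error after timeoutMs.
function callWithWatchdog(operation, timeoutMs, callback, timers = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle)
}) {
  let settled = false;
  const handle = timers.set(() => {
    if (settled) return;
    settled = true;
    callback(new Error('watchdog timeout after ' + timeoutMs + 'ms'));
  }, timeoutMs);

  operation((err, data) => {
    if (settled) return; // late response after the watchdog fired: ignore it
    settled = true;
    timers.clear(handle);
    callback(err, data);
  });
}
```

A possible use, with `params` holding the receive parameters from the question, is `callWithWatchdog(cb => sqs.receiveMessage(params, cb), 30000, handler)`, where `handler` resets `running` in both the success and the timeout case.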
