Is there a way to overload the Node.js event loop using websocket - node.js

I'm having issues with Node.js and the "ws" implementation of websocket (https://www.npmjs.com/package/ws). After a surge (plenty of messages in a short window of time), I have data suggesting that I've "missed" a message.
I've contacted the owner of the emitter server and he assures me that all messages have been sent on his side.
I've logged every message received on my side (at the start of the on('message', () => {}) handler), and I can't find the message, so my assumption is that it never even reached this point.
So I'm wondering:
Messages are received and handled in FIFO order. While the current message is being handled, new ones are queued in the Node event loop to be processed immediately afterwards. Correct? Can that event loop queue grow "too big" and drop new incoming messages? If so, does it drop them quietly, or does the program crash loudly (in other words, how can I tell whether a message has been dropped this way)?
Does the 'ws' module have any known limit on the number of messages received? Does it have an internal way of dropping messages?
Is there a better alternative than the 'ws' module ?
Are there any other ways to explain a "missed" message?
Thanks a lot for your insights,

I use ws in nodejs to handle large message flows from many clients simultaneously in production, and I have never had it lose messages. Each server handles several thousand messages each second from hundreds of different client connections. The way my system works, if ws dropped messages or changed their order, my users would complain loudly.
That makes me guess you are not hitting any limitation of ws.
Early in my programming work I had the not-so-bright idea of putting incoming messages in queue objects in my nodejs code and processing them "later." That led to a hideously confusing message flow through my server. It sometimes looked like I had lost ws messages. I was happy to delete all that code, and dispatch every message completely within its message event handler.
Websocket connections sometimes close abnormally. Because network. You can catch those situations with error and close event handlers. It can take a while for the sender of a message, or the receiver, to detect that a network fault of some kind disrupted its connection. That can lead to disagreement about message count between sender and receiver. It's worth investigating.
I adorn ws's connection objects with message counts ("adorn" -- add an application-specific property to an object) and put those message counts into the log when a connection closes.
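A minimal sketch of that counting approach, under some assumptions: the helper name trackMessages and the stats property are hypothetical, and a plain EventEmitter stands in for a real connection so the example runs without a network. It relies only on the 'message', 'error' and 'close' events mentioned above.

```javascript
const { EventEmitter } = require('events');

// Attach per-connection counters and log them when the connection closes.
// `socket` is anything that emits 'message', 'error' and 'close'.
function trackMessages(socket, log = console.log) {
  socket.stats = { received: 0, errors: 0 }; // application-specific property

  socket.on('message', () => {
    socket.stats.received += 1;
  });
  socket.on('error', (err) => {
    socket.stats.errors += 1;
    log(`socket error: ${err.message}`);
  });
  socket.on('close', (code) => {
    log(`closed code=${code} received=${socket.stats.received} errors=${socket.stats.errors}`);
  });
  return socket;
}

// Simulated connection (an assumption made so the sketch is self-contained).
const lines = [];
const fakeSocket = trackMessages(new EventEmitter(), (line) => lines.push(line));
fakeSocket.emit('message', 'a');
fakeSocket.emit('message', 'b');
fakeSocket.emit('close', 1006); // 1006 is the websocket "abnormal closure" code
console.log(lines[0]);
```

Comparing the logged count with the sender's count for the same connection shows whether the two sides actually disagree, and by how much.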

Related

Azure Servicebus competing receivers are picking up locked messages

We have a Topic with Subscriptions with a default LockDuration of 1min, and multiple SubscriptionClients listening to each subscription. For our test purposes, there are 3 clients listening to a single subscription.
SubscriptionClients are created as:
Client = new SubscriptionClient(endPoint, topicName, subscriptionName);
We put one message on the Topic, which is filtered into the Subscription.
We would expect one of the SubscriptionClients to pick up the message, and the other two clients cannot because it is locked.
What is actually happening is that all three clients simultaneously pick up the same message, with different DeliveryCounts, all within the 1-minute lock duration.
Is there something wrong with the way we're creating the SubscriptionClient such that the lock is shared between them rather than being exclusive?
There are possibly two things that could be wrong. Neither is likely to be the broker; both point to the client-side code.
MaxLockDuration is too short and while one client is still working on the message, the other client(s) receive that same message. You should be able to confirm this by looking at the duration of the message processing. If it exceeds the lock duration set on the subscription (the LockDuration mentioned in the question), that's it.
You're using a message handler with automatic lock renewal and that one is failing to extend the lock. In that case, you would have a message handler error callback raised with the details.
Either way, you could log the errors and share the logs if possible to help with pinpointing what the issue is.

RabbitMQ subscriber sending message back onto RabbitMQ queue?

I would appreciate your thoughts on this.
I have a node app which subscribes to a RabbitMQ queue. When it receives a message, it checks it for something and then saves it to a database.
However, if the message is missing some information or some other criteria is not yet met, I would like the subscriber to publish the message back onto the RabbitMQ queue.
I understand logically this is just connecting to the queue and publishing the message, but is it really this simple or is this a bad practice or potentially dangerous?
Thanks for your help.
As I pointed out in the comments, when you create the consumer on the queue, set autoAck = false to enable manual message acknowledgement. A message is then not deleted from the queue until the broker receives its acknowledgement.
When the received message meets the requirements, send an ack for it and the message will be deleted from the queue. Otherwise no ack is sent and the message stays in the queue.
As you mentioned in the comments, validation may take 5 minutes; just send the ack from the callback of the validation function.
In your question, you describe two criteria for when a message may not be processed:
if the message is missing some information or
some other criteria is not yet met
The first of these appears to be an issue with the message, and it doesn't seem that it makes much sense to re-queue a message that has a problem. The appropriate action is to log an error and drop the message (or invoke whatever error-handling logic your application contains).
The second of these is rather vague, but for the purposes of this answer, we will assume that the problem is not with the message but with some other component in the system (e.g. perhaps a network connection issue). In this case, the consuming application can send a Nack (negative acknowledgement), which can optionally requeue the message.
Keep in mind that in the second case, it will be necessary to shut down the consumer until the error condition has resolved, or the message will be redelivered and erroneously processed ad infinitum until the system is back up, thus wasting resources on an unprocessable message.
Why use a nack instead of simply re-publishing?
This will set the "redelivered" flag on the message so that you know it was delivered once already. There are other options as well for handling bad messages.
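The decision described in this answer could be sketched roughly as follows. Everything here is illustrative: the message fields (tag, payload), the decide and settle helpers, and the stub channel are assumptions, not part of any library. The stub's ack(message) and nack(message, allUpTo, requeue) are modelled on the shape of the calls a Node AMQP client such as amqplib exposes, so check your client's documentation before relying on them.

```javascript
// Decide what to do with a delivery: 'ack', 'drop' or 'requeue'.
function decide(message, dependenciesReady) {
  if (message.payload === undefined) return 'drop'; // bad message: log and discard
  if (!dependenciesReady) return 'requeue';         // system not ready: nack with requeue
  return 'ack';                                     // processed successfully
}

// Apply the decision to a channel-like object.
function settle(channel, message, decision) {
  if (decision === 'ack') {
    channel.ack(message);
  } else {
    // allUpTo = false: only this message; requeue only when asked to.
    channel.nack(message, false, decision === 'requeue');
  }
}

// Stub channel that records calls (an assumption, so the sketch runs without a broker).
const calls = [];
const stubChannel = {
  ack: (m) => calls.push(['ack', m.tag]),
  nack: (m, allUpTo, requeue) => calls.push(['nack', m.tag, requeue]),
};

settle(stubChannel, { tag: 1, payload: 'ok' }, decide({ tag: 1, payload: 'ok' }, true));
settle(stubChannel, { tag: 2 }, decide({ tag: 2 }, true));
settle(stubChannel, { tag: 3, payload: 'ok' }, decide({ tag: 3, payload: 'ok' }, false));
console.log(calls);
```

Keeping the decision separate from the channel calls makes the requeue rule easy to test without a running broker.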

Distributed pub/sub with single consumer per message type

I have no clue if it's better to ask this here, or over on Programmers.SE, so if I have this wrong, please migrate.
First, a bit about what I'm trying to implement. I have a node.js application that takes messages from one source (a socket.io client), and then does processing on the message, which might result in zero or more messages back out, either to the sender, or other clients within that group.
For the processing, I would like to essentially just shove the message into a queue, then it works its way through various message processors that might kick off their own items, and eventually, the bit running socket.io is informed "Hey, send this message back"
As a concrete example, say a user signs into the service. That sign-in message is then placed in the queue, where the authorization processor gets it, does its thing, then places a message back in the queue saying the client's been authorized. This goes back to the socket.io socket that is connected to the client, along with other clients that might be interested. It can also go to other subsystems that might want to do more processing on authorization (looking up user info, sending more info to the client based on their data, etc).
If I wanted strong coupling, this would be easy, but I tried that before, and it just goes to a mess of spaghetti code that's very fragile, and I would like to avoid that. Another wrench in the setup is this should be cluster-able, which is where the real problem comes in. There might be more than one, say, authorization processor running. But the authorization message should be processed only once.
So, in short, I'm looking for a pattern/technique that will allow me to, essentially, have multiple "groups" of subscribers for a message, and the message will be processed only once per group.
I thought about maybe having each instance of a processor generate a unique name that would be used as a list in Redis. This name would then be registered with some sort of dispatch handler, and placed into a set for that group of subscribers. Then when a message arrives, the dispatch pulls a random member out of that set, and places the message into that member's list. While it seems like this would work, it seems somewhat over-complicated and fragile.
The core problem is I've never designed a system like this, so I'm not even sure the proper terms to use or look up. If anyone can point me in the right direction for this, I would be most appreciative.
I think what you're describing is similar to the https://www.getbridge.com/ service. I tried it but ended up writing my own based on zeromq. It allows you to register services (req -> <- rec) and channels, which are pub/sub workers.
As for the design, I used client -> broker -> services & channels, all plug and play using auto discovery. Services register their schema with the brokers, which open a TCP connection so that brokers on other servers can communicate with that broker group's services. Internal services and clients then connect via unix sockets or IPC channels, whichever is preferred.
I ended up wrapping around the redis publish/subscribe functions a bit to do this. Each type of message processor gets a "group name", and there can be multiple instances of the processor within that group (so multiple instances of the program can run for clustering).
When publishing a message, I generate an incremental ID, then store the message in a string key with that ID, then publish the message ID.
On the receiving end, the first thing the subscriber does is attempt to add the message ID it just got from the publisher into a set of received messages for that group with sadd. If sadd returns 0, the message has already been grabbed by another instance, and it just returns. If it returns 1, the full message is pulled out of the string key and sent to the listener.
Of course, this relies on redis being single threaded, which I imagine will continue to be the case.
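A rough sketch of that claim-by-sadd scheme, with loud assumptions: FakeRedis is an in-memory stand-in written for this example (not a Redis client), the key names message:<id> and received:<group> are hypothetical, and delivery to subscribers is simulated with a plain array instead of real PUBLISH/SUBSCRIBE.

```javascript
// In-memory stand-in for the few Redis commands the scheme needs.
class FakeRedis {
  constructor() {
    this.strings = new Map();
    this.sets = new Map();
    this.counter = 0;
  }
  incr() { return ++this.counter; }
  set(key, value) { this.strings.set(key, value); }
  get(key) { return this.strings.get(key); }
  // Like SADD with one member: 1 if the member was added, 0 if already present.
  sadd(key, member) {
    if (!this.sets.has(key)) this.sets.set(key, new Set());
    const set = this.sets.get(key);
    if (set.has(member)) return 0;
    set.add(member);
    return 1;
  }
}

const redis = new FakeRedis();
const subscribers = []; // every instance of every group

function subscribe(group, instance, listener) {
  subscribers.push((id) => {
    // Only the first instance in the group to add the ID gets the message.
    if (redis.sadd(`received:${group}`, id) === 0) return;
    listener(instance, redis.get(`message:${id}`));
  });
}

function publish(message) {
  const id = redis.incr();             // incremental message ID
  redis.set(`message:${id}`, message); // store the body under that ID
  subscribers.forEach((deliver) => deliver(id)); // stands in for PUBLISH of the ID
  return id;
}

const handled = [];
const record = (who, msg) => handled.push(`${who}:${msg}`);
subscribe('auth', 'auth-1', record);
subscribe('auth', 'auth-2', record);
subscribe('audit', 'audit-1', record);
publish('login');
console.log(handled); // one handler per group ran
```

In this simulation the first registered instance always wins; against a real server, whichever instance's sadd arrives first would claim the message.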
What you might be looking for is an AMQP implementation, where you can bind queues to custom exchanges and implement a pub-sub model.
RabbitMQ is a popular AMQP implementation with lots of client libraries, including one for node.js.

How to design a scalable rpc call listener?

I have to listen for RPC calls, stack them somewhere, process them, and answer. The thing is that they are not run as soon as they come in. The response is an ACK for each RPC call received.
The problem is that I want to design it so that many listening servers can write to the same stack of calls, piling them up as they come.
My objective is to listen to as many calls as possible. How should I achieve this?
My main technology is Perl and node.js, but I would use any open source software for this task.
It sounds like any kind of job queue will do what you need it to; I'm personally a big fan of using Redis for this kind of thing. Since Redis lists maintain insertion order, you can simply LPUSH your RPC call info onto one end of the list from any number of web servers listening to the RPC calls, and somewhere else (in another process/on another machine, I assume) RPOP (or BRPOP) them off the other end and process them.
Since Node.js uses fully asynchronous IO, assuming you're not doing a lot of processing in your RPC listeners (that is, you're only listening for requests, sending an ACK, and pushing onto Redis), my guess is that Node would be exceedingly efficient at this.
An aside on using Redis for a queue: if you want to ensure that, in the event of a catastrophic failure, jobs are not lost, you'll need to implement a little more logic; from the RPOPLPUSH documentation:
Pattern: Reliable queue
Redis is often used as a messaging server to implement processing of background jobs or other kinds of messaging tasks. A simple form of queue is often obtained pushing values into a list in the producer side, and waiting for this values in the consumer side using RPOP (using polling), or BRPOP if the client is better served by a blocking operation.
However in this context the obtained queue is not reliable as messages can be lost, for example in the case there is a network problem or if the consumer crashes just after the message is received but it is still to process.
RPOPLPUSH (or BRPOPLPUSH for the blocking variant) offers a way to avoid this problem: the consumer fetches the message and at the same time pushes it into a processing list. It will use the LREM command in order to remove the message from the processing list once the message has been processed.
An additional client may monitor the processing list for items that remain there for too much time, and will push those timed out items into the queue again if needed.
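The reliable-queue pattern quoted above can be sketched in memory as follows. This is an illustration under stated assumptions, not a Redis client: lpush, rpoplpush and lrem are local functions that mimic the corresponding commands on plain arrays (index 0 is the head of the list), and the list names queue and processing are hypothetical.

```javascript
// In-memory stand-ins for Redis lists; index 0 is the head (left) of each list.
const lists = { queue: [], processing: [] };

function lpush(key, value) {
  lists[key].unshift(value);
}

// Pop from the tail of `src` and push onto the head of `dst` in one step.
function rpoplpush(src, dst) {
  if (lists[src].length === 0) return null;
  const value = lists[src].pop();
  lists[dst].unshift(value);
  return value;
}

// Like LREM key 1 value: remove the first occurrence, return how many were removed.
function lrem(key, value) {
  const i = lists[key].indexOf(value);
  if (i === -1) return 0;
  lists[key].splice(i, 1);
  return 1;
}

function consume(handler) {
  const job = rpoplpush('queue', 'processing');
  if (job === null) return false;
  handler(job);            // if this throws, the job stays in 'processing'
  lrem('processing', job); // only reached after successful processing
  return true;
}

lpush('queue', 'call-1');
lpush('queue', 'call-2');

const done = [];
consume((job) => done.push(job));
try {
  consume(() => { throw new Error('worker crashed'); });
} catch (err) {
  // Simulated crash: 'call-2' was fetched but never completed.
}
console.log(done, lists.processing);
```

After the simulated crash, call-2 is still in the processing list, which is where a monitoring client would find it and push it back onto the queue.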

How to avoid flooding a message queue?

I'm working on an application that is divided into a thin client and a server part, communicating over TCP. We frequently let the server make asynchronous calls (notifications) to the client to report state changes. This keeps the server from losing too much time waiting for an acknowledgement from the client. More importantly, it avoids deadlocks.
Such deadlocks can happen as follows. Suppose the server sent the state-changed notification synchronously (please note that this is a somewhat constructed example). When the client handles the notification, the client needs to synchronously ask the server for information. However, the server cannot respond, because it is still waiting for an answer to its own notification.
Now, this deadlock is avoided by sending the notification asynchronously, but this introduces another problem. When asynchronous calls are made more rapidly than they can be processed, the call queue keeps growing. If this situation is maintained long enough, the call queue will get totally full (flooded with messages). My question is: what can be done when that happens?
My problem can be summarized as follows. Do I really have to choose between sending notifications without blocking at the risk of flooding the message queue, or blocking when sending notifications at the risk of introducing a deadlock? Is there some trick to avoid flooding the message queue?
Note: To repeat, the server does not stall when sending notifications. They are sent asynchronously.
Note: In my example I used two communicating processes, but the same problem exists with two communicating threads.
If the server is sending informational messages to the client, which you yourself say are asynchronous, it should not have to wait for a reply from the client. If they are not informational, in other words they require an answer, I would say a server should never send such messages to a client, and their presence indicates a poor design.
If you have a constant congestion problem, there is little you can do other than gracefully fail and notify the client that no new messages can be posted; then it is up to the client to maintain a backlog of messages to be posted.
Introducing a priority queue and using message expiration/filtering could allow you to free up space in the queue, but that really just postpones the problem. If possible, you could also aggregate messages or ignore duplicate messages, but again the problem does not seem to be the queue itself. (Not to mention that the more complex queue logic could eat up valuable resources that would be better used actually processing messages.)
Depending on what the server side does, you could introduce result hashing for long computations, offload some types of messages to a dedicated device, check if the server waits unreasonably long for I/O operations, and a myriad of other techniques. Profile if possible, or at least try to find out which message(s) cause congestion.
Oh, and the business solution: Compare cost of estimated development time to the cost of better hardware and conclude that you should just buy a more powerful server (or an additional one).
Depending on how important these messages are you might want to look into Message Expiration, or perhaps a Message Filter, though it sounds like your architecture may be incorrect.
I would rather fix the logic in the server side. The message queue should not stall waiting for the answer. Rather have a state machine which can also receive those info queries while it is waiting for the answer from the client.
Of course you can still flood your message queue, but with TCP you can handle it pretty easily.
The best way, I believe, would be to add another state to your client. This I borrowed from the SMPP protocol specs.
Add a congestion state to the client: it continually checks the queue length (assuming this is possible), and once a certain threshold is reached, say 1000 unprocessed messages, it sends the server a message indicating that it is congested. The server must then cease all messaging until it receives a notification that the client is no longer congested.
Alternatively, on the server side, if there is a certain number of pending replies, the server could simply stop sending messages until the client has replied to a certain number of them.
These thresholds can be dynamically calculated or fixed, depending...
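A minimal sketch of the client-side congestion state, with assumptions stated up front: the CongestionQueue class, the notify callback and the signal names 'congested' and 'ready' are hypothetical, and the separate low-water mark is an addition of this sketch (the answer only mentions a single threshold) so that the client does not flip between states on every message.

```javascript
// Receiver-side queue that signals congestion at a high-water mark
// and signals recovery once it has drained to a low-water mark.
class CongestionQueue {
  constructor({ high, low, notify }) {
    this.items = [];
    this.high = high;     // queue length at which to report congestion
    this.low = low;       // queue length at which to report recovery
    this.notify = notify; // called with 'congested' or 'ready'
    this.congested = false;
  }

  push(item) {
    this.items.push(item);
    if (!this.congested && this.items.length >= this.high) {
      this.congested = true;
      this.notify('congested'); // ask the sender to stop
    }
  }

  shift() {
    const item = this.items.shift();
    if (this.congested && this.items.length <= this.low) {
      this.congested = false;
      this.notify('ready'); // tell the sender it may resume
    }
    return item;
  }
}

const signals = [];
const inbox = new CongestionQueue({ high: 3, low: 1, notify: (s) => signals.push(s) });
['a', 'b', 'c'].forEach((m) => inbox.push(m)); // third push reaches the high-water mark
inbox.shift(); // 2 left, still congested
inbox.shift(); // 1 left, reaches the low-water mark
console.log(signals);
```

In a real system notify would send a message to the server over the existing TCP connection, and the server would hold its notifications until the 'ready' signal arrives.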
