Google PubSub listener freezing - node.js

I've got an issue with the Google PubSub Node.js listener freezing when using pull.
I use the following flow:
1. The PubSub client is created,
2. The subscription is fetched from the specified topic,
3. Listeners are attached to the "message" and "error" events.
At first it pulls around ~500 messages and acknowledges them, but after that it just hangs, with > 1000 messages in the queue. I've tried periodically re-initializing the listeners (removeListener/on), but then it only fetches a few messages. After restarting the app, it pulls ~500 again and the same thing happens.

Try checking your flow control setup; it might be limiting the number of messages and the rate at which your subscriber receives them. Additionally, if you do not ack or nack the messages that you've received, they will count toward the total number of outstanding messages. Once the maxMessages limit is reached, the subscriber will not receive any more messages until the outstanding messages are either acked or nacked (or expire, in which case they are eventually redelivered after the maxExtension period).
For more information: https://cloud.google.com/pubsub/docs/pull#subscriber-flow-control-nodejs
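As a rough sketch (assuming the @google-cloud/pubsub client; the subscription name and limits below are placeholders), the flow-control settings are passed when creating the subscription object:

```js
// Minimal subscriber sketch with explicit flow control.
const { PubSub } = require('@google-cloud/pubsub');

const pubsub = new PubSub();
const subscription = pubsub.subscription('my-subscription', {
  flowControl: {
    maxMessages: 500, // stop pulling once 500 messages are outstanding (un-acked)
  },
});

subscription.on('message', (message) => {
  // ... process message.data ...
  message.ack(); // acking frees a flow-control slot so more messages can be pulled
});

subscription.on('error', (err) => {
  console.error('subscription error:', err);
});
```

If messages are received but never acked, they stay outstanding, which matches the "pulls ~500 then hangs" symptom described above.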

Related

How to throttle my cron worker from pushing messages to RabbitMQ?

Context:
We have a microservice which consumes (subscribes to) messages from 50+ RabbitMQ queues.
Messages are produced for these queues in two places:
1. For short-delay business logic (like sending emails or notifying another service), the application sends the message directly to the exchange (which in turn routes it to the queue).
2. For long/delayed-execution business logic, we have a messages table with entries for messages that have to be executed after some time. A cron worker runs every 10 minutes, scans the messages table, and pushes those messages to RabbitMQ.
Scenario:
Let's say the messages table has 10,000 messages which will be queued in the next cron run.
9:00 AM - The cron worker runs and queues 10,000 messages to the RabbitMQ queue.
We have subscribers listening to the queue that start consuming the messages, but due to some issue in the system or a slow third-party response, each message takes 1 minute to complete.
9:10 AM - The cron worker runs again, sees that 9,000+ messages are still incomplete and past their scheduled time, so it pushes 9,000+ duplicate messages to the queue.
Note: the subscribers that consume the messages are idempotent, so duplicate processing is not an issue.
Design idea I had in mind (but not the best logic):
I can have 4 statuses (RequiresQueuing, Queued, Completed, Failed).
Whenever a message is inserted, I can set the status to RequiresQueuing.
When the cron worker picks up a message and successfully pushes it to the queue, I can set it to Queued.
When a subscriber completes it, it marks the status as Completed / Failed.
There is an issue with the above logic: let's say RabbitMQ goes down, or in some cases we have to purge the queue for maintenance.
Now the messages marked as Queued are in the wrong state, because they have to be identified again and their status changed manually.
Another example:
Let's say I have a RabbitMQ queue named events.
This events queue has 5 subscribers; each subscriber gets 1 message from the queue and posts the event via a REST API to another microservice (event-aggregator). Each API call usually takes 50ms.
Use case:
Due to high load, the number of events produced becomes 3x.
The microservice (event-aggregator) which accepts the events has also become slow in processing; its response time increased from 50ms to 1 minute.
The cron worker follows the design mentioned above and queues messages every run. Now the queue is growing too large, but I also cannot increase the number of subscribers because the dependent microservice (event-aggregator) is lagging.
So the question is: if I keep sending messages to the events queue, it just bloats the queue.
https://www.rabbitmq.com/memory.html - While reading this page, I found out that RabbitMQ will block publishing connections once memory use reaches the high watermark fraction (default is 40%). Of course this can be changed, but that requires manual intervention.
So if the queue length increases, it affects RabbitMQ's memory; that is the reason I thought of throttling at the producer level.
Questions
How can I throttle my cron worker to skip a particular run, or inspect the queue to see that it is already heavily loaded so it doesn't push the messages?
How can I handle the use cases described above? Is there a design which solves my problem? Has anyone faced the same issue?
Thanks in advance.
Answer
See the comments on the accepted answer for throttling using queueCount.
You can combine QoS (Quality of Service) and manual ACKs to get around this problem.
Your exact scenario is documented in https://www.rabbitmq.com/tutorials/tutorial-two-python.html. This example is for Python; you can refer to the other language examples as well.
Let's say you have 1 publisher and 5 worker scripts, all reading from the same queue, and each worker script takes 1 minute to process a message. You can set QoS at the channel level. If you set it to 1, each worker script will be allocated only 1 message at a time, so 5 messages are processed concurrently. No new messages will be delivered until one of the 5 worker scripts sends a MANUAL ACK.
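As a sketch in Node.js (assuming the amqplib package; the queue name is a placeholder), the same QoS-plus-manual-ack pattern looks like this:

```js
// Worker sketch: prefetch(1) + manual ack.
const amqp = require('amqplib');

async function startWorker() {
  const conn = await amqp.connect('amqp://localhost');
  const channel = await conn.createChannel();
  await channel.assertQueue('events', { durable: true });

  // QoS: the broker delivers at most one un-acked message to this consumer.
  channel.prefetch(1);

  channel.consume('events', async (msg) => {
    try {
      // ... the slow, ~1-minute processing goes here ...
      channel.ack(msg); // manual ack frees the prefetch slot for the next message
    } catch (err) {
      channel.nack(msg, false, true); // requeue on failure
    }
  }, { noAck: false });
}

startWorker().catch(console.error);
```

With 5 such workers and prefetch(1), at most 5 messages are in flight at any time, regardless of how many the cron worker has queued.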
If you want to increase the throughput of message processing, you can increase the worker node count.
The idea of updating the table based on message status is not a good option; avoiding DB polling is the main reason systems use queues, and it would cause a scaling issue. At some point you would have to update the table, and you would bottleneck on locking and isolation levels.
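For the producer-side throttling using queueCount mentioned above, one possible sketch (assuming amqplib; the queue name and threshold are placeholders) is to check the backlog before each cron run:

```js
// Cron-worker sketch: skip the run when the queue backlog is too large.
const amqp = require('amqplib');

const MAX_BACKLOG = 5000; // hypothetical threshold

async function cronRun(rows) {
  const conn = await amqp.connect('amqp://localhost');
  const channel = await conn.createChannel();

  // checkQueue returns { queue, messageCount, consumerCount };
  // messageCount is the number of messages ready for delivery.
  const { messageCount } = await channel.checkQueue('events');
  if (messageCount > MAX_BACKLOG) {
    console.log(`backlog ${messageCount} > ${MAX_BACKLOG}, skipping this run`);
    await conn.close();
    return;
  }

  for (const row of rows) {
    channel.sendToQueue('events', Buffer.from(JSON.stringify(row)), { persistent: true });
  }
  await conn.close();
}
```

Messages skipped this way would simply stay in the messages table (status RequiresQueuing) and be picked up by a later run.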

Azure Service Bus: Duplicate messages are processing in message queue

I am working with Azure Service Bus. My Service Bus queue is processing each message 3 times. The lock duration on the message is 5 minutes. Every message is processed within 2 minutes at most, but I don't know why the queue keeps picking up the same message and sending it for processing, and the duplicate messages are picked up only after 5 minutes.
How can I resolve this?
With Azure Service Bus, messages are re-processed when a message is not actioned by the receiving party. An action would be completing, deferring, or dead-lettering. If you do none of those, the message will be re-delivered once the LockDuration on the broker side expires. An additional situation where a message is re-delivered without waiting for the LockDuration to expire is abandoning the message; then the message is picked up right away by the next request for new messages.
You should share your code to provide enough context. Messages can be received manually using MessageReceiver.ReceiveAsync() or via the user-callback API. For the first option you have to action messages yourself (complete them, for example). For the second option, there's a configuration API where you can opt out of auto-completion, in which case you are required to manually complete the message passed into the user callback.
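For illustration, a Node.js sketch (assuming the @azure/service-bus v7 package rather than the .NET MessageReceiver API mentioned above; the queue name is a placeholder) of opting out of auto-completion and settling manually:

```js
// Receiver sketch: manual settlement so the lock is released explicitly.
const { ServiceBusClient } = require('@azure/service-bus');

const client = new ServiceBusClient(process.env.SERVICEBUS_CONNECTION_STRING);
const receiver = client.createReceiver('my-queue');

receiver.subscribe(
  {
    async processMessage(message) {
      // ... do the ~2 minutes of work here ...
      await receiver.completeMessage(message); // settle before the lock expires
    },
    async processError(args) {
      console.error('receive error:', args.error);
    },
  },
  { autoCompleteMessages: false } // we complete (or abandon) messages ourselves
);
```

If completeMessage is never reached (for example, because the handler throws first), the lock expires after 5 minutes and the message is re-delivered, which matches the behavior described in the question.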

Hidden messages in Azure storage queue

Sometimes there are messages in our Azure queues that are not picked up by our Azure Functions and are also not visible from Storage Explorer.
These messages are created without any visibility delay.
Is there any way to know what those messages contain, and why they are not processed by our Azure Functions?
In the image you can see that we have a message in the queue, but it is not visible in the list, and it has been there for hours.
The Azure Queue API currently has no way to check invisible messages.
There are several situations in which a message will become invisible:
The message was added with a VisibilityTimeout in the Put Message request. The message will be invisible until this initial timeout expires.
The message has been retrieved (dequeued). Whenever a message is retrieved it will be invisible for the duration of the VisibilityTimeout specified by the Get Messages request, or 30 seconds by default.
The message has expired. Messages expire after 7 days by default, or after the MessageTTL specified in the Put Message request. Note: after a while these messages are automatically deleted, but until then they remain in the queue as invisible messages.
Use cases
Initial VisibilityTimeout
Messages are created with an initial VisibilityTimeout so that the message can be created now, but processed later (after the timeout expires), for whatever reason the creator has for wanting to delay this processing.
VisibilityTimeout on retrieving
The intended process for handling queue messages is:
1. The application dequeues one or more messages, optionally specifying the next VisibilityTimeout. This timeout should be longer than the time it takes to process the message(s).
2. The application processes the message(s).
3. The application deletes the messages. When processing fails, the message(s) are not deleted.
Messages for which processing failed become visible again as soon as their VisibilityTimeout expires, so they can be retried. To prevent endless retries, step 2 should start by checking the DequeueCount of the message: if it is bigger than the desired retry count, the message should be deleted instead of processed. It is good practice to copy such messages to a dead-letter / poison queue (for example, a queue with the original queue name plus a -poison suffix), as in the sketch below.
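A sketch of this dequeue/process/delete loop with poison-queue handling (assuming the @azure/storage-queue package; queue names and the retry limit are placeholders):

```js
// Consumer sketch: delete only on success, park poison messages.
const { QueueClient } = require('@azure/storage-queue');

const conn = process.env.AZURE_STORAGE_CONNECTION_STRING;
const queue = new QueueClient(conn, 'work');
const poison = new QueueClient(conn, 'work-poison');
const MAX_DEQUEUE = 5; // hypothetical retry limit

async function drainOnce() {
  // Received messages stay invisible for visibilityTimeout seconds.
  const { receivedMessageItems } = await queue.receiveMessages({
    numberOfMessages: 16,
    visibilityTimeout: 300, // must exceed the expected processing time
  });

  for (const msg of receivedMessageItems) {
    if (msg.dequeueCount > MAX_DEQUEUE) {
      await poison.sendMessage(msg.messageText); // copy to the poison queue
      await queue.deleteMessage(msg.messageId, msg.popReceipt);
      continue;
    }
    // ... process msg.messageText; on failure, skip the delete below ...
    await queue.deleteMessage(msg.messageId, msg.popReceipt);
  }
}

drainOnce().catch(console.error);
```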
MessageTTL
By default messages have a time-to-live of 7 days. If the application cannot keep up with the number of messages being added, a backlog can build up. Adjusting the TTL determines what happens to such a backlog.
Alternatively the application could crash, so that the backlog builds up until the application is started again.
It seems that the message has expired. The following steps could reproduce the issue, so you can test it:
1. Add a message with a short TTL.
2. After the message has expired, it is no longer visible or retrievable, but it still counts toward the queue length until the automatic cleanup removes it.
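A repro sketch (assuming @azure/storage-queue; the queue name is a placeholder):

```js
// Expired messages disappear from peeks but may still count toward queue length.
const { QueueClient } = require('@azure/storage-queue');

async function repro() {
  const queue = new QueueClient(process.env.AZURE_STORAGE_CONNECTION_STRING, 'ttl-test');
  await queue.createIfNotExists();

  await queue.sendMessage('short-lived', { messageTimeToLive: 5 }); // 5-second TTL
  await new Promise((resolve) => setTimeout(resolve, 10000));       // wait past the TTL

  const peeked = await queue.peekMessages();
  const props = await queue.getProperties();
  console.log(peeked.peekedMessageItems.length); // 0: the expired message is invisible
  console.log(props.approximateMessagesCount);   // may still be 1 until cleanup runs
}

repro().catch(console.error);
```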

How to put a message at the end of a RabbitMQ queue

I'm working on a worker that processes messages from RabbitMQ.
However, I am unsure how to accomplish this:
if I receive a message and an error occurs during processing, how can I put the message at the end of the queue?
I've tried using nack and reject, but the message is always requeued in the first position, and the other messages stay frozen!
I don't understand why the message has to be put back in the first position. I've tried to "play" with other options like requeue or allUpTo, but none of them seem to work.
Thank you in advance!
The documentation says:
Messages can be returned to the queue using AMQP methods that feature a requeue parameter (basic.recover, basic.reject and basic.nack), or due to a channel closing while holding unacknowledged messages. Any of these scenarios caused messages to be requeued at the back of the queue for RabbitMQ releases earlier than 2.7.0. From RabbitMQ release 2.7.0, messages are always held in the queue in publication order, even in the presence of requeueing or channel closure.
With release 2.7.0 and later it is still possible for individual consumers to observe messages out of order if the queue has multiple subscribers. This is due to the actions of other subscribers who may requeue messages. From the perspective of the queue the messages are always held in publication order.
Remember to ack your successful messages, otherwise they will not be removed from the queue.
If you need more control over your rejected messages, you should take a look at dead letter exchanges.
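As an illustration (assuming amqplib; all exchange and queue names are placeholders), a dead letter exchange is declared and attached to the work queue so that rejected messages are preserved instead of discarded:

```js
// Topology sketch: rejected messages from 'work' are routed to a DLX.
const amqp = require('amqplib');

async function setup() {
  const conn = await amqp.connect('amqp://localhost');
  const channel = await conn.createChannel();

  await channel.assertExchange('work.dlx', 'fanout', { durable: true });
  await channel.assertQueue('work.rejected', { durable: true });
  await channel.bindQueue('work.rejected', 'work.dlx', '');

  // Messages rejected (without requeue) from this queue go to work.dlx.
  await channel.assertQueue('work', {
    durable: true,
    arguments: { 'x-dead-letter-exchange': 'work.dlx' },
  });
  return { conn, channel };
}

// In a consumer, dead-letter a failed message by rejecting without requeue:
// channel.nack(msg, false, false);
```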
nack and reject either discard the message or requeue it at its original position.
For your requirement, the following could be suitable:
1. Once the consumer receives the message, just before starting to process it, send an ack() back to the RabbitMQ server.
2. Then process the message. If any error occurs during processing, publish the same message to the same queue (see the sketch below); this puts the message at the back of the queue.
3. On successful processing, do nothing; the ack() has already been sent to the RabbitMQ server. Just take the next message and process it.
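A sketch of this ack-first pattern (assuming amqplib; the queue name is a placeholder). Note the trade-off: because the message is acked before processing, it is lost if the worker crashes mid-processing.

```js
// Worker sketch: ack immediately, republish to the back of the queue on error.
const amqp = require('amqplib');

async function startWorker() {
  const conn = await amqp.connect('amqp://localhost');
  const channel = await conn.createChannel();
  await channel.assertQueue('jobs', { durable: true });

  channel.consume('jobs', async (msg) => {
    channel.ack(msg); // step 1: ack before processing
    try {
      // step 2: ... process msg.content ...
    } catch (err) {
      // step 2 (error path): republish the same payload; as a fresh publish,
      // it lands at the back of the queue
      channel.sendToQueue('jobs', msg.content, msg.properties);
    }
  }, { noAck: false });
}

startWorker().catch(console.error);
```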

Azure Service Bus delayed (scheduled) message on Topic is lost when used with Duplicate Detection

We are facing a problem with Service Bus.
We have a topic with two subscriptions.
We have enabled Duplicate Detection on those, with a 1-minute window (we tried 2 seconds first). We are using Duplicate Detection to avoid multiple messages being processed in a short interval (to maintain the interval between messages).
We are using message scheduling (ScheduledEnqueueTimeUtc) to make the message reappear after 5 minutes with the same message ID (each time, a new message is created with a schedule and the old message is completed).
The workflow is as follows (problem):
1. The first time, a message is published (without scheduling).
2. This message is immediately consumed by the message pump, and a new message with the same details and a schedule time of 5 minutes (UTC) is sent to the topic, expecting it to appear after 5 minutes.
3. The message does not appear in the subscription.
When debugging, this issue doesn't come up.
When we send the first message with at least a 30-second delay (scheduled), it works fine.
If we recreate the topic and subscriptions with Duplicate Detection turned off, we are able to get the message using the above workflow.
Since we have no clue about what is happening to the published message, we need help identifying the root cause of the issue.
This is expected behavior of ASB.
When a message is scheduled, it is actually enqueued on the broker with a delayed appearance. ASB de-duplicates messages on the server side upon arrival, using the message ID for de-duplication.
In your case, if you delay dispatch of the second message until the original message has been processed, there is nothing to de-dup against and the second message is enqueued. If you don't delay, the broker sees an ID identical to the previously sent message that has not been completed or dead-lettered yet, and the new message is de-duped.
A possible way to go about this is to not reuse the same transport message ID (the ID used for the BrokeredMessage). If you need to associate messages with each other, you can use Properties for that.
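For illustration, a sketch using the @azure/service-bus v7 API (the question uses the older BrokeredMessage API; the topic name and the logicalId property are placeholders): schedule the follow-up with a fresh messageId and keep the logical association in application properties, outside duplicate detection's reach:

```js
// Scheduling sketch: fresh transport ID per send, stable ID in properties.
const { ServiceBusClient } = require('@azure/service-bus');
const { randomUUID } = require('crypto');

const client = new ServiceBusClient(process.env.SERVICEBUS_CONNECTION_STRING);
const sender = client.createSender('my-topic');

async function scheduleFollowUp(original) {
  await sender.scheduleMessages(
    {
      body: original.body,
      messageId: randomUUID(), // unique per send, so de-dup never drops it
      applicationProperties: {
        // hypothetical property carrying the application-level identity
        logicalId: original.applicationProperties.logicalId,
      },
    },
    new Date(Date.now() + 5 * 60 * 1000) // appear after 5 minutes
  );
}
```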
