Python Celery task.delay() when broker Queue is full - python-3.x

I am using Celery 4 with RabbitMQ as the broker. The queue has a length limit (size == 200). My main code looks like this:
for i in range(200):
    tasks.delay(i)
This works as long as the number of tasks does not exceed the queue size. If I call something like this:
for i in range(2000):
    tasks.delay(i)
with a size limit of 200, the queue fills up and the rest of the tasks are skipped.
Can anyone please explain how to handle this situation? I need to wait until the queue has free space and then insert the next task.
Thanks

This is RabbitMQ behavior. From the RabbitMQ docs:
"Messages will be dropped or dead-lettered from the front of the queue to make room for new messages once the limit is reached."
You can either handle this in RabbitMQ by changing the queue configuration, or you can use multiple queues.
Another option is Celery's apply_async, which lets you set options such as retry, eta, or retry_policy. Note that delay() is just a shortcut for apply_async().
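One way to read "wait until the queue is free" is to throttle the producer: check the queue depth before each publish and sleep while it is at the limit. The sketch below is broker-agnostic and uses only the standard library. Both `submit` and `get_depth` are hypothetical callables that do not appear in the original post: with Celery, `submit` could be `tasks.delay`, while `get_depth` would need some way of reading the current queue length from the broker, which is assumed here rather than shown.

```python
import time
from typing import Callable, Iterable


def submit_when_room(
    items: Iterable,
    submit: Callable[[object], None],   # hypothetical: e.g. tasks.delay
    get_depth: Callable[[], int],       # hypothetical: returns current queue length
    max_depth: int = 200,
    poll_interval: float = 0.5,
) -> int:
    """Submit each item, blocking while the queue is at max_depth.

    Returns the number of items submitted.
    """
    sent = 0
    for item in items:
        # Wait for the consumers to drain the queue below the limit.
        while get_depth() >= max_depth:
            time.sleep(poll_interval)
        submit(item)
        sent += 1
    return sent
```

Note that check-then-publish is only safe with a single producer; with several producers, two of them can see free space at the same moment and overshoot the limit.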

Related

RateLimit and ibmmq module Node.js

I'm using ibmmq module https://github.com/ibm-messaging/mq-mqi-nodejs.
I am trying to build an application that gets one message from a queue every 500 ms.
There is an option getLoopPollTimeMs, but it only takes effect when the queue is empty and messages arrive later.
I've tried to use limiter https://www.npmjs.com/package/limiter:
mq.Get(openQueue.ref as mq.MQObject, mqmd, gmo, await this.getCB.bind(this))
async getCB(...) {
    const remainingMessages = await this.limiter.removeTokens(1);
    ...
}
So the application reads a message from the queue and processes it.
At the same time, because the callback is asynchronous, it reads all the other messages, which then wait for the limiter before being processed.
But I need it to read the next message only after the previous one has been processed.
I've tried GetSync, but the limiter behaves incorrectly, and because the call is synchronous the other processes in the application stop working.
How can I get only one message from the queue at a time? Is the only way to call mq.GetDone(hObj) every time in getCB and then reconnect to the queue with mq.Get in setInterval? Any advice?
Update: the approach with mq.GetDone(hObj) doesn't work. The application reads one message, processes it, then reads the second message from the queue and crashes with this error:
terminate called after throwing an instance of 'Napi::Error'
what(): GetDone: MQCC = MQCC_FAILED [2] MQRC = MQRC_HOBJ_ERROR [2019]
Aborted
The queue is closed, but getCB is still working.
As per the comments, it's possible to use tuning parameters; see https://github.com/ibm-messaging/mq-mqi-nodejs and line 196-202 of https://github.com/ibm-messaging/mq-mqi-nodejs/blob/148b70db036c80f442adb34769d5d239a6f05b65/lib/mqi.js#L575
Again as per the comments, you could use a combination of
mq.setTuningParameters({getLoopDelayTimeMs: 2000, maxConsecutiveGets: 1})
for a throttle limit of 1 message in 2 seconds.

Spring Integration - Router, task-executor and smart LB

I have a queue channel and a chain with a poller and task-executor "listening" on that channel, doing some processing in parallel. What I would like to do is configure it so that I can route particular messages based on some logic or property, to make sure that a particular message 'type' is always processed by a particular thread from the task-executor.
Example: messages where PAYLOAD_PROPERTY & 1 == 0 always go to thread 1, and those where PAYLOAD_PROPERTY & 1 == 1 go to thread 2. (Please note that this is just an example for 2 threads. I could easily use a router here, but I can imagine similar logic, such as a modulo operation, for 10 threads as well.) In other words, thread 1 and thread 2 must never process the same 'type' of message concurrently. So the purpose is not just load balancing; it is to stick with the same thread based on some logic.
My initial thought was to somehow use a channel dispatcher (it can have load-balancer-ref and task-executor), but I am not sure this would work, as I have a chain with a poller that does the further processing I need.
Can you advise on the best component setup for a workflow like the one above?
There's nothing like that in a "standard" task executor.
It's probably easier to remove the queue channel and have a router (subscribed to a direct channel) route to 10 separate executor channels, each configured with a single-threaded executor.
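The idea behind that answer (one single-threaded executor per "lane", with a router choosing the lane) can be illustrated outside Spring. The following is a minimal Python sketch of the concept, not Spring Integration configuration; `StickyDispatcher`, `handle`, and the modulo routing rule are names and choices made up for this illustration.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, List


class StickyDispatcher:
    """Route each message to a fixed single-thread executor chosen by key."""

    def __init__(self, lanes: int) -> None:
        # One worker per executor, so messages in a lane never run concurrently.
        self._executors: List[ThreadPoolExecutor] = [
            ThreadPoolExecutor(max_workers=1, thread_name_prefix=f"lane-{i}")
            for i in range(lanes)
        ]

    def submit(self, key: int, handler: Callable[[object], object], payload: object) -> Future:
        # The same key modulo the lane count always lands on the same thread.
        lane = key % len(self._executors)
        return self._executors[lane].submit(handler, payload)

    def shutdown(self) -> None:
        for executor in self._executors:
            executor.shutdown(wait=True)


def handle(payload: object) -> str:
    """Example handler: report which worker thread processed the payload."""
    return threading.current_thread().name
```

Because each lane has exactly one worker, messages with the same routing key are processed sequentially and in submission order, while different lanes still run in parallel.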

Azure queues getMessages method in sdk not working as expected

I have created a queue in Azure Queue and enqueued two items in it. Using the Node.js SDK, I create a timer that executes every 5 seconds and calls:
azure.createQueueService("precondevqueues", "<key>").getMessages(queueName, {numOfMessages : 1, visibilityTimeout: 1 }, callback)
I expect the same one of the two messages in the queue to show up every 5 seconds, but that does not seem to be the case. The output of this call alternates between the two messages.
This should not happen, since visibilityTimeout is set to 1 and hence, after 1 second, the message dequeued in the first call should be visible again before the next getMessages call is made.
As noted in the documentation, FIFO ordering is not guaranteed. Messages may be fetched in FIFO order most of the time, but Azure can return them in whatever order suits its implementation:
"Messages are generally added to the end of the queue and retrieved from the front of the queue, although first in, first out (FIFO) behavior is not guaranteed."
Aha, my mistake! I read the getMessages documentation again very carefully and realized that getMessages dequeues the message but retains an invisible copy outside the queue. If the message processor does not delete the message before the visibility timeout expires, the copy is re-enqueued and therefore goes to the end of the queue.

RabbitMQ: how to limit consuming rate

I need to limit the rate of consuming messages from a RabbitMQ queue.
I have found many suggestions, but most of them recommend the prefetch option, which doesn't do what I need. Even with prefetch set to 1, the rate is about 6000 messages/sec. This is too many for the consumer.
I need to limit the rate to, for example, about 70 to 200 messages per second. This means consuming one message every 5-14 ms, with no simultaneous messages.
I'm using Node.js with the amqp.node library.
Implementing a token bucket might help:
https://en.wikipedia.org/wiki/Token_bucket
You can write a producer that publishes to the "token bucket queue" at a fixed rate with a TTL on each message (perhaps expiring after a second), or just set a maximum queue length equal to your rate per second. Consumers that receive a "normal queue" message must also receive a "token bucket queue" message before processing it, which effectively rate limits the application.
Node.js + amqplib example:
// bucket.ratePerSecond is assumed to be defined elsewhere (target messages per second)
var queueName = 'my_token_bucket';
rabbitChannel.assertQueue(queueName, {durable: true, messageTtl: 1000, maxLength: bucket.ratePerSecond});
writeToken();
function writeToken() {
    // Buffer.from replaces the deprecated new Buffer(string)
    rabbitChannel.sendToQueue(queueName, Buffer.from(new Date().toISOString()), {persistent: true});
    setTimeout(writeToken, 1000 / bucket.ratePerSecond);
}
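For readers unfamiliar with the token bucket itself, here is a minimal in-process sketch in Python. It is a swapped-in variant, not the queue-based design described above: tokens are tracked in memory rather than in a RabbitMQ queue, so it only limits a single consumer process. The class name, the `clock` parameter, and the method name are all assumptions made for this illustration.

```python
import time
from typing import Callable


class TokenBucket:
    """In-memory token bucket: refills at a fixed rate up to a capacity."""

    def __init__(
        self,
        rate_per_second: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,  # injectable for testing
    ) -> None:
        self.rate = rate_per_second
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity
        self._last = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the bucket is empty."""
        now = self._clock()
        # Refill in proportion to the elapsed time, capped at capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False
```

A consumer would call `try_acquire()` before handling each message and wait briefly when it returns False.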
I've already found a solution.
I use the nanotimer module from npm to calculate delays.
I calculate delay = 1 / [messages_per_second], converted to nanoseconds.
I consume messages with prefetch = 1.
I calculate the actual delay as delay - [message_processing_time].
I then wait for that actual delay before sending the ack for the message.
It works perfectly. Thanks to all.
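The arithmetic in those steps can be written out as a small function. This is a sketch in Python using seconds rather than nanoseconds; the function name is made up, and clamping at zero (for messages that take longer than their time slot) is an assumption the original answer does not spell out.

```python
def ack_delay(messages_per_second: float, processing_time: float) -> float:
    """Seconds to wait before acking so the overall rate stays at the target.

    processing_time is how long the handler already spent on this message.
    """
    if messages_per_second <= 0:
        raise ValueError("messages_per_second must be positive")
    slot = 1.0 / messages_per_second  # time budget per message
    # Assumption: never return a negative delay when processing overran the slot.
    return max(0.0, slot - processing_time)
```

For the rates in the question, the slot is 1/200 s = 5 ms at the upper limit and 1/70 s, roughly 14 ms, at the lower limit.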
See 'Fair Dispatch' in RabbitMQ Documentation.
For example in a situation with two workers, when all odd messages are heavy and even messages are light, one worker will be constantly busy and the other one will do hardly any work. Well, RabbitMQ doesn't know anything about that and will still dispatch messages evenly.
This happens because RabbitMQ just dispatches a message when the message enters the queue. It doesn't look at the number of unacknowledged messages for a consumer. It just blindly dispatches every n-th message to the n-th consumer.
In order to defeat that we can use the prefetch method with the value of 1. This tells RabbitMQ not to give more than one message to a worker at a time. Or, in other words, don't dispatch a new message to a worker until it has processed and acknowledged the previous one. Instead, it will dispatch it to the next worker that is not still busy.
I don't think RabbitMQ can provide this feature out of the box.
If you have only one consumer, the whole thing is pretty easy: you just let it sleep between consuming messages.
If you have multiple consumers, I would recommend using some "shared memory" to keep track of the rate. For example, you might have 10 consumers consuming messages. To keep a rate of 70-200 messages per second across all of them, each consumer makes a call to Redis to see whether it is eligible to process a message. If so, it updates Redis to show the other consumers that one message is currently in process.
If you have no control over the consumer, implement option 1 or 2 and publish the messages back to RabbitMQ. This way the original consumer will consume messages at the desired pace.
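The shared-counter idea for multiple consumers might look like the sketch below. It is written in Python with an in-memory dict standing in for the shared store, so it only demonstrates the logic inside one process; with Redis, the check-and-increment would have to be done atomically on the server, which is not shown. The class name and the fixed one-second window are assumptions for this illustration.

```python
import threading
import time
from typing import Callable, Dict


class SharedWindowLimiter:
    """Fixed-window limiter; the dict stands in for a shared store such as Redis."""

    def __init__(
        self,
        limit_per_second: int,
        clock: Callable[[], float] = time.time,  # injectable for testing
    ) -> None:
        self.limit = limit_per_second
        self._clock = clock
        self._counts: Dict[int, int] = {}
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        """Return True if the caller may process one more message this second."""
        window = int(self._clock())  # one counter per wall-clock second
        with self._lock:
            count = self._counts.get(window, 0)
            if count >= self.limit:
                return False
            # Drop counters from earlier windows, then record this message.
            self._counts = {window: count + 1}
            return True
```

Each consumer would call `try_acquire()` before processing a message and back off briefly when it returns False.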
This is how I fixed mine, using just setTimeout.
I delay each ack by 200 ms, which means about 5 messages are consumed per second. In my case the handler does an update if the record already exists.
channel.consume(transactionQueueName, async (data) => {
    let dataNew = JSON.parse(data.content);
    const processedTransaction = await seperateATransaction(dataNew);
    // Delay the ack to avoid duplicate entries. Important: don't remove the setTimeout.
    setTimeout(function () {
        channel.ack(data);
    }, 200);
});
Done

Manually publish messages to dead-letter queue?

Why would someone want to do that? I have to unit-test the exception handling mechanism in our application.
I presumed that the dead-letter queue is literally an Azure Service Bus queue that I could publish messages to using QueueClient:
string dlQ = @"sb://**.servicebus.windows.net/**/Subscriptions/DefaultSubscription/$DeadLetterQueue";
string connectionString = CloudConfigurationManager.GetSetting("Microsoft.ServiceBus.ConnectionString");
NamespaceManager _namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);
QueueDescription qd = _namespaceManager.GetQueue(dataPromotionDLQ);
var queueClient = QueueClient.CreateFromConnectionString(connectionString, "DefaultSubscription/$DeadLetterQueue");
BrokeredMessage brokeredMessage = new BrokeredMessage("Message to PublishToDLQ");
try
{
    queueClient.Send(brokeredMessage);
}
catch (Exception)
{
}
But I get MessagingEntityNotFoundException. What could be wrong?
You would never want to publish directly to a dead letter queue. It's where poisoned messages that can't be processed are placed.
There are two ways of placing messages onto the dead letter queue. The service bus itself dead-letters messages that have exceeded the maximum number of delivery attempts. You can also explicitly dead-letter a message that you have received using the DeadLetter() method.
Create your messages with a very short TTL via the BrokeredMessage.TimeToLive property.
The Subscription must have EnableDeadLetteringOnMessageExpiration set to true.
Though late, adding to the answers of @Mikee and @Ben Morris may help someone. You can follow @Mike's suggestion and use message.DeadLetter() or message.DeadLetterAsync() to dead-letter a message. Another option is to set a very short or 0-second TimeToLive so that the messages are moved to the dead-letter queue.
After you do either of these and try to view the messages in the active queue, you may sometimes find that the message is still there (which is what you are currently facing). The reason is that messages dead-lettered due to TTLExpiredException, HeaderSizeExceeded, or any other system-defined error, as well as messages dead-lettered manually with DeadLetter(), are cleaned up periodically by an asynchronous "garbage collection" program. This does not happen immediately, as one might expect.
When you perform a Peek operation, you can still see the message in the active queue. You have to wait for the garbage collector to run, or you can perform a Receive operation, which forces the garbage collector to run first, thereby moving the messages to the dead-letter queue before retrieval.
