I am using RabbitMQ 3.7.0 with Node.js, using this library in TypeScript. All consumers are created with {noAck: false, exclusive: false}, and all queues with {prefetch: 1, expires: 14400000, durable: false} (4 hours).
I have a lot of queues that do not have any pending acks but still hold hundreds of messages. Besides this, the queues do not expire when they should, which causes performance issues after a couple of days. I haven't found any explanation for why the messages are still in the queue after they have been acked.
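For reference, this is roughly how the queues and consumers are set up (a simplified sketch assuming amqplib; the queue name is just an example):

```typescript
import * as amqp from "amqplib";

async function setup(): Promise<void> {
  const connection = await amqp.connect("amqp://localhost");
  const channel = await connection.createChannel();

  // Deliver at most one unacknowledged message per consumer on this channel.
  await channel.prefetch(1);

  // Non-durable queue that should be deleted 4 hours after it was last used.
  await channel.assertQueue("session-queue", {
    durable: false,
    expires: 14400000, // maps to the x-expires queue argument (ms)
  });

  // Manual acknowledgements, non-exclusive consumer.
  await channel.consume(
    "session-queue",
    (msg) => {
      if (msg === null) return; // consumer was cancelled
      // ... handle the message ...
      channel.ack(msg); // explicit ack once processing is done
    },
    { noAck: false, exclusive: false }
  );
}
```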
Thanks
Edit (1) - missing information pointed out by the answers
As far as I can tell, there are no active consumers on these queues. Is it possible to confirm this in the management tool?
What complicates the scenario is that I can't simply delete a queue when its last consumer disconnects, because I may need it to recover a broken session; and if I don't delete the queue after the last consumer, it just keeps being updated. But the expires parameter should be enough to handle this.
Some snapshots from the management tool:
I suggest you read the documentation for how expires works: https://www.rabbitmq.com/ttl.html#queue-ttl
If your queue is not being expired, it most likely means you still have a consumer registered on it. There are a couple of other cases where queues won't be expired.
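As a quick way to confirm whether a queue still has a consumer (which also answers the management-tool question in your edit), you can query the management plugin's HTTP API; a rough sketch, with host, credentials, vhost and queue name as placeholders:

```typescript
// Quick check against the RabbitMQ management HTTP API (management plugin must be
// enabled; global fetch assumes Node 18+).
async function checkQueue(queueName: string): Promise<void> {
  const vhost = encodeURIComponent("/");
  const auth = Buffer.from("guest:guest").toString("base64");

  const res = await fetch(`http://localhost:15672/api/queues/${vhost}/${queueName}`, {
    headers: { Authorization: `Basic ${auth}` },
  });
  const q = (await res.json()) as any;

  // A queue with consumers > 0 will never hit its x-expires timeout.
  console.log(
    `consumers=${q.consumers} messages=${q.messages} ` +
      `ready=${q.messages_ready} unacked=${q.messages_unacknowledged}`
  );
}
```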
As for the messages that are supposedly acknowledged but remain in the queue, the only explanation I have is that perhaps they were never delivered at all.
Finally, without more information or a reliable way to reproduce the problem, there's not much further assistance that can be given.
NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.
I'm getting a bit confused trying to nail down how peek lock works in Service Bus. In particular I'm using Microsoft.Azure.ServiceBus with Azure Functions and a ServiceBusTrigger.
From what I can make out, the time a message gets locked for is set on the queue itself and defaults to 30 seconds, though it can be set to anything up to 5 minutes.
When a message is peeked from the queue, this lock kicks in.
There is then a setting called maxAutoRenewDuration which, when using Azure Functions, is set in the host.json file under extensions:serviceBus:messageHandlerOptions. This allows the client to automatically request one or more lock extensions until maxAutoRenewDuration is reached. Once you hit this limit, a renewal won't be requested and the lock will be allowed to expire.
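In other words, something along these lines in host.json (the values here are only placeholders to show where the setting lives):

```json
{
  "version": "2.0",
  "extensions": {
    "serviceBus": {
      "messageHandlerOptions": {
        "autoComplete": true,
        "maxConcurrentCalls": 16,
        "maxAutoRenewDuration": "00:05:00"
      }
    }
  }
}
```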
Renewals are best effort and can't be guaranteed, so ideally you come up with a design where messages are typically processed within the lock period specified on the queue.
Have I got this right so far?
Questions I still have are:
Are there any limits on what maxAutoRenewDuration can be set to? One article I read seemed to suggest that it could be set to whatever I need to ensure my message is processed (link). The Microsoft documentation, though, states that the maximum value for this is also limited to 5 minutes (link):
The maxAutoRenewDuration is configurable in host.json, which maps to OnMessageOptions.MaxAutoRenewDuration. The maximum allowed for this setting is 5 minutes according to the Service Bus documentation
Which is correct? I know the lock duration itself can be at most 5 minutes, but it doesn't seem to make sense that the same limit would also apply to maxAutoRenewDuration.
I've read about a setting called MaxLockDuration in some articles (e.g. link). Is this just referring to the lock duration set on the queue itself?
Am I missing anything else? Are the lock duration set on the queue and the maxAutoRenewDuration in my code the main things I need to consider when dealing with locks and renewals?
Thanks
Alan
I understand your confusion. The official docs' explanation of maxAutoRenewDuration seems to be wrong. There is already an open docs issue (https://github.com/MicrosoftDocs/azure-docs/issues/62110) and also a related reference (https://github.com/Azure/azure-functions-host/issues/6500).
To address your questions one by one:
#1: As noted above, there is an open docs issue for this.
#2: MaxLockDuration is a queue-side Service Bus setting which basically means that if you peek-lock a message from the queue, the message is locked for that consumer for that duration. So, unless you complete your message processing or renew the lock within that period, the lock is going to expire.
#3: @sean-feldman's awesome explanation in this thread (https://stackoverflow.com/a/60381046/13791953) should answer that:
What it does is extends the message lease with the broker, "re-locking" it for the competing consumer that is currently handling the message. MaxAutoRenewDuration should be set to the "possibly maximum processing time a lease will be required".
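To make the relationship concrete, here is a rough sketch. It uses the @azure/service-bus JavaScript SDK purely for illustration; as far as I know, its maxAutoLockRenewalDurationInMs option plays the same role as MaxAutoRenewDuration, and in your Functions setup the equivalent knob is maxAutoRenewDuration in host.json:

```typescript
import { ServiceBusClient } from "@azure/service-bus";

// Illustration only - the connection string and queue name are placeholders.
const client = new ServiceBusClient("<connection-string>");

// The queue's LockDuration (a broker-side setting, at most 5 minutes) controls how
// long a single peek-lock lasts. The client-side auto-renew window only keeps
// re-acquiring that lock, so it can be longer than 5 minutes - size it to the
// worst-case processing time for one message.
const receiver = client.createReceiver("my-queue", {
  receiveMode: "peekLock",
  maxAutoLockRenewalDurationInMs: 30 * 60 * 1000, // e.g. 30 minutes
});
```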
I thought my question was related to the post "Azure Service Bus: How to Renew Lock?", but I have already tried RenewLockAsync.
Here is the concern: I am receiving messages from Service Bus with sessions enabled, so I accept the session and then receive its messages. All good; here's the rub.
There are TWO ADDITIONAL processes to complete per message: a manual transform/harvest of the message into some other object, which is then sent out to a Kafka topic (stream). Note it's all async on top of this craziness. My team lead is insistent that the two sub-processes can just be added INTO the receive process (ReceiveAsync), finally calling session.CompleteAsync() AFTER the OTHER two processes complete.
Well, needless to say, with that architecture I'm consistently getting the error "The session lock has expired on the MessageSession. Accept a new MessageSession." I haven't even fleshed out the send-to-Kafka part; it's just mocked, so things will only take longer once it is.
Is it even remotely plausible to call session.CompleteAsync() AFTER the sub-processes, or shouldn't that be done when the message is successfully received, before moving on to the other processing? I thought separate tasks would be more appropriate, but again, he didn't dig that idea.
I appreciate all insight and opinions, thank you!
"The session lock has expired on the MessageSession. Accept a new MessageSession." indicates one of 2 things:
The lock has been open for too long, in which case calling "RenewLockAsync" before it expires would help.
The message lock has been explicitly released, through a call to CompleteAsync, AbandonAsync, DeadLetterAsync, etc. That would indicate a bug, since the lock can not be used after it has been released
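For the first case, here is a rough sketch of keeping the session lock alive while the slow work runs and only then completing the message. It uses the @azure/service-bus JavaScript SDK purely to illustrate the pattern (the queue name and connection string are placeholders); in Microsoft.Azure.ServiceBus the corresponding calls should be the session's RenewSessionLockAsync and CompleteAsync, if I remember the API correctly:

```typescript
import { ServiceBusClient } from "@azure/service-bus";

// Illustrative sketch only - connection string and queue name are placeholders.
async function processOneSession(): Promise<void> {
  const client = new ServiceBusClient("<connection-string>");
  const session = await client.acceptNextSession("my-session-queue");

  try {
    const [message] = await session.receiveMessages(1);
    if (!message) return;

    // Keep the session lock alive while the slow transform / Kafka publish runs.
    const keepAlive = setInterval(() => {
      session.renewSessionLock().catch(console.error);
    }, 30 * 1000);

    try {
      // ... transform the message and publish it to Kafka here ...
      await session.completeMessage(message); // settle only after the work succeeded
    } finally {
      clearInterval(keepAlive);
    }
  } finally {
    await session.close();
    await client.close();
  }
}
```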
I noticed CloudQueue.ApproximateMessageCount returns the number of messages including the expired ones. This is probably a bug. Is there any way to see how many non-expired messages are actually in the queue?
So after doing some digging around, I think I found the behavior you're talking about. From what I can tell, the messages are still in the queue when they expire but aren't retrievable. They seem to remain there for a short period of time and are then cleared out.
If I had to guess, it may be similar to a Service Bus queue in that expired messages are moved to some sort of dead-letter queue, except that with storage queues you can't access that dead-letter queue and it is cleared automatically after some period of time.
I'll update this answer if I find more.
Edit
I confirmed the behavior. It seems expired messages remain in the queue but you can't interact with them. It appears they eventually disappear without any intervention.
It seems that the new Azure SDK extends the visibility timeout to <= 7 days. I know that by default, when I add a message to an Azure queue, the time-to-live is 7 days. If, when I get a message out, I set the visibility timeout to 7 days, does that mean I don't need to delete the message if I don't care about reliability? The message will disappear after 7 days anyway.
I want to take this approach because DeleteMessage is very slow. If I don't delete messages, does that have any impact on the performance of GetMessage?
Based on the documentation for Get Messages, I believe it is certainly possible to set the VisibilityTimeout to 7 days so that messages are fetched only once. However, I see some issues with this approach compared to just deleting the message once processing is done:
What happens when you get the message, start processing it, and somehow the processing fails? If you set the visibility timeout to 7 days, the message would never appear in the queue again, and thus the work it was supposed to trigger never gets done.
Even though the message is hidden, it is still in the queue, so you keep incurring storage charges for it. The cost is trivial, but why keep a message you don't really need?
A lot of systems rely on the Approximate Message Count property of a queue to check the health of the processes driven by the messages in that queue. Please note that even though a message is hidden, it is still in the queue and will therefore be included in the total message count. So if you're building a system which relies on this count for health checks, it will always look unhealthy because you're never deleting the messages.
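In other words, the usual pattern is to keep the visibility timeout just long enough to cover processing and then delete the message explicitly. A rough sketch, using the @azure/storage-queue package for illustration (connection string and queue name are placeholders):

```typescript
import { QueueClient } from "@azure/storage-queue";

// Illustrative sketch only - connection string and queue name are placeholders.
async function processAndDelete(): Promise<void> {
  const queue = new QueueClient("<storage-connection-string>", "work-items");

  // Hide each message only for as long as processing realistically takes,
  // not for the full 7 days.
  const response = await queue.receiveMessages({
    numberOfMessages: 16,
    visibilityTimeout: 300, // seconds
  });

  for (const msg of response.receivedMessageItems) {
    // ... process msg.messageText ...

    // Delete once processing succeeds; if this step is skipped and the
    // visibility timeout lapses, the message simply becomes visible again.
    await queue.deleteMessage(msg.messageId, msg.popReceipt);
  }
}
```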
I'm curious to know why you find deleting messages to be very slow. In my experience this is quite fast. How are you monitoring message deletion?
Rather than hacking around the problem, I think you should drill into understanding why the deletes are slow. Have you enabled logging and looked at the E2ELatency and ServerLatency numbers across all your queue operations? Ideally you shouldn't be seeing a large difference between the two for any of your queue operations. If you do see a large difference, it implies something is happening on the client side that you should investigate further.
For more information on logging take a look at the following articles:
http://blogs.msdn.com/b/windowsazurestorage/archive/tags/analytics+2d00+logging+_2600_amp_3b00_+metrics/
http://msdn.microsoft.com/en-us/library/azure/hh343262.aspx
Information on client-side logging can also be found in this blog post: http://blogs.msdn.com/b/windowsazurestorage/archive/2013/09/07/announcing-storage-client-library-2-1-rtm.aspx
Please let me know what you find.
Jason
Since Azure Service Bus limits the maximum number of concurrent connections to a Queue or Topic to 100, is there a method that we can use to query our Queues/Topics to determine how many concurrent connections there are?
We are aware that we can capture the throttling events, but would very much prefer an active approach, where we can proactively increase or decrease the number of Queues/Topics when the system is under a heavy load.
The use case here is a process waiting for a reply message, where the reply is coming from a long-running process, and the subscription is using a Correlation Filter to facilitate two-way communication between the Publisher and Subscriber. Thus, we must have a BeginReceive() going in order to await the response, and each such Publisher will be consuming a connection for the duration of their wait time. The system already balances load across multiple Topics, but we need a way to be proactive about how many Topics are created, so that we do not get throttled too often, but at the same time not have an excess of Topics for this purpose.
I don't believe it is currently possible to query the listener counts. I think the subscription objects also figure into that, so in theory, if you have up to 2,000 subscriptions per topic and each allows up to 100 connections, that's a lot of potential connections. We just need to keep in mind that subscriptions are cooperative (each gets a copy of every message) and receivers on a subscription are competitive (only one of them gets a given message).
I've also seen unconfirmed reports of performance delays when you start running more than 1,000 subscriptions, so make sure you test this scenario.
But... given your scenario, I'd deduce that performance likely isn't the biggest factor (you have long-running processes already), so introducing a couple of seconds of lag into the workflow likely won't be critical. If that's the case, I'd set the timeout for your BeginReceive to something fairly short (a couple of seconds) and have a sleep/wait delay between attempts. This gives other listeners an opportunity to get messages as well. We might also want to consider an approach where we receive multiple messages and then hand them out to other processes for processing (correlation in this case?).
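Here's a rough sketch of that short-timeout / pause-between-attempts idea. It uses the @azure/service-bus JavaScript package purely for illustration (topic, subscription and connection string are placeholders); with BeginReceive the equivalent is passing a short server wait time and sleeping between calls:

```typescript
import { ServiceBusClient, ServiceBusReceivedMessage } from "@azure/service-bus";

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Illustration only - connection string, topic and subscription names are placeholders.
async function waitForReply(): Promise<ServiceBusReceivedMessage | undefined> {
  const client = new ServiceBusClient("<connection-string>");
  const receiver = client.createReceiver("reply-topic", "reply-subscription");

  try {
    for (let attempt = 0; attempt < 100; attempt++) {
      // Keep each receive attempt short...
      const [reply] = await receiver.receiveMessages(1, { maxWaitTimeInMs: 2000 });
      if (reply) {
        // The correlation filter on the subscription means anything received here
        // is the reply this publisher is waiting for.
        await receiver.completeMessage(reply);
        return reply;
      }
      // ...and pause between attempts so other listeners get a turn.
      await sleep(2000);
    }
    return undefined;
  } finally {
    await receiver.close();
    await client.close();
  }
}
```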
Just some thoughts.