Python ServiceBus Message Lock Expiry Issue - python-3.x

I'm running into an issue with message lock expiry in Service Bus using the Python SDK.
I've created a simple tool that clears dead-lettered messages in queues and subscriptions. We have a number of queues and subscriptions, and the tool acts as a clean-up job which I'm just running locally.
The tool works fine; however, because it's a "clean-up" tool, we might have messages banked in the DLQ which have been there for months.
I'm running into an issue where, when I try to complete these messages, it throws an azure.servicebus.exceptions.ServiceBusError: The lock on the message lock has expired exception.
I thought I was able to resolve this by using AutoLockRenewer and actually renewing the lock on the message before completing it; however, the exception still seems to be thrown.
It's strange: when the exception gets thrown the tool stops running, and once I re-run it, it's able to complete messages where it previously couldn't, but it eventually finds a message in another queue/subscription where it breaks again due to the lock. So after each re-run it's able to clear the DLQ in more and more queues/subscriptions; it doesn't break at the same point as the previous run.
This is a snippet of my code:
from azure.servicebus import AutoLockRenewer, ServiceBusClient

renewer = AutoLockRenewer()
with ServiceBusClient.from_connection_string(shared_access_key["primaryConnectionString"]) as client:
    for queue in queues:
        if queue["countDetails"]["deadLetterMessageCount"] > 0:
            with client.get_queue_receiver(queue_name=queue["name"], sub_queue="deadletter") as receiver:
                while len(receiver.receive_messages(max_wait_time=60)) > 0:
                    messages = receiver.receive_messages(max_message_count=50, max_wait_time=60)
                    for message in messages:
                        renewer.register(receiver, message, max_lock_renewal_duration=300)
                        receiver.complete_message(message)

Related

Azure Service Bus - cancelled scheduled messages getting re-queued

I'm using the latest Java bindings (v3.1.3) for Azure Service Bus: https://github.com/Azure/azure-sdk-for-java/tree/master/sdk/servicebus
When I create a new queue client, schedule a message, and cancel it...
QueueClient sendClient = new QueueClient(new ConnectionStringBuilder(connectionString, queueName), ReceiveMode.PEEKLOCK);
long sequenceNumber = sendClient.scheduleMessage(message, instant);
...
sendClient.cancelScheduledMessage(sequenceNumber)
...the code appears to work as intended: The active message count goes to 0. But as soon as the scheduled message gets to the time it was supposed to be scheduled (I tested with 10 seconds and 100 seconds in the future), the message sometimes gets re-queued with a new sequence number. I'm not getting any errors when scheduling or cancelling the messages. Is there something I can do to make sure cancelled messages don't get re-queued?
From my own testing, I found that cancelling a Service Bus message within a short time frame after the scheduled message was sent to the queue does not always process the cancellation as expected. In general we're talking only a few seconds, but the behaviour is not entirely consistent.
My conclusion is that there is some latency between the scheduled message being queued and a cancellation of that same message being registered, which means that cancelling a scheduled message almost straight away after sending it to the queue will not always stop it from being processed.
Therefore, in my environment, I had to provide my own fallback: I check additional custom properties on the Service Bus message, so when it arrives back at my subscriber app, I use an if statement on the status of the custom property to decide whether to ignore it and not process anything further.
This really caught me out for a little while, as my environment is rather complex and I assumed there was some issue in my code somewhere along the line. In the end, once I factored in the above anomaly and started looking at how the Service Bus was responding to the scheduled message cancellation, I was able to overcome this issue.
You can schedule messages either by setting the ScheduledEnqueueTimeUtc property when sending a message through the regular send path, or explicitly with the ScheduleMessageAsync API. The latter immediately returns the scheduled message's SequenceNumber, which you can later use to cancel the scheduled message if needed.
Cancels the enqueuing of an already sent scheduled message, if it was not already enqueued. This is an asynchronous method returning a CompletableFuture which completes when the message is cancelled.
So I suggest using cancelScheduledMessageAsync to cancel the scheduled message.
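The question above uses the Java bindings, but the same schedule-then-cancel flow in the Python SDK from the main question looks roughly like this (a minimal sketch; CONNECTION_STRING and QUEUE_NAME are placeholders):

import datetime
from azure.servicebus import ServiceBusClient, ServiceBusMessage

with ServiceBusClient.from_connection_string(CONNECTION_STRING) as client:
    with client.get_queue_sender(queue_name=QUEUE_NAME) as sender:
        # Schedule a message 100 seconds into the future and keep the
        # sequence number the service returns.
        enqueue_at = datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(seconds=100)
        [sequence_number] = sender.schedule_messages(
            ServiceBusMessage("scheduled payload"), enqueue_at
        )
        # Cancellation is a separate service-side operation; allow for some
        # latency between scheduling and the cancellation being registered.
        sender.cancel_scheduled_messages(sequence_number)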

Azure Function and storage queue, what to do if function fails

I'm working out a scenario where I post a message to an Azure Storage Queue. For testing purposes I've developed a console app, where I get the message and am able to update it with a try count, and when the logic is done, I delete the message.
Now I'm trying to port my code to an Azure Function. One thing that seems to be very different is that when the Azure Function is called, the message is deleted from the queue.
I find it hard to find any documentation on this specific subject and I feel I'm missing something with regard to the concept of combining these two.
My questions:
Am I right, that when you trigger a function on a new queue item, the function takes the message and deletes it from the queue, even if the function fails?
If 1 is correct, how do you make sure that the message is retried and posted to a dead queue for later processing?
The runtime only deletes the queue message when your function successfully processes it (i.e. no error has occurred). When the message is dequeued and passed to your function, it becomes invisible for a period of time (10 minutes). While your function is running, this invisibility is maintained. If your function fails, the message is not deleted; it remains in the queue in an invisible state. After the visibility timeout expires, the message becomes visible in the queue again for reprocessing.
The details of how core WebJobs SDK queue processing works can be found here. On that page, see the section "How to handle poison messages" which addresses your question. Basically you'll get all the right behaviors for free - retry handling, poison message handling, etc. :)
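To illustrate the contract the runtime implements for you, here is a rough sketch using the azure-storage-queue Python SDK directly; the connection string, queue names, and process() function are placeholders, not part of the Functions runtime:

from azure.storage.queue import QueueClient

MAX_DEQUEUE_COUNT = 5  # after this many failed attempts, treat the message as poison

queue = QueueClient.from_connection_string(CONNECTION_STRING, "myqueue")
poison = QueueClient.from_connection_string(CONNECTION_STRING, "myqueue-poison")

for msg in queue.receive_messages(visibility_timeout=600):  # invisible for 10 minutes
    try:
        process(msg.content)            # your processing logic
        queue.delete_message(msg)       # deleted only after success
    except Exception:
        if msg.dequeue_count >= MAX_DEQUEUE_COUNT:
            poison.send_message(msg.content)  # hand off to a poison queue
            queue.delete_message(msg)
        # otherwise do nothing: the message becomes visible again once the
        # visibility timeout expires and will be retried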

How to prevent Azure webjob processing same message multiple times concurrently

I have an Azure WebJob project, that I am running locally on my dev machine. It is listening to an Azure Service Bus message queue. Nothing going on like Topics, just the most basic message queue.
It is receiving/processing the same message multiple times, launching twice immediately when the message is received, then intermittently whilst the message is being processed.
Questions:
How come I am receiving the same message multiple times instantly? It seems that it's re-fetching before a PeekLock is applied?
How come the message is being re-received even though it is still being processed? Can I set the PeekLock duration, or somehow lock the message so it is only processed once?
How can I ensure that each message on the queue is only processed once?
I want to be able to process multiple messages at once, just not the same message multiple times, so setting MaxConcurrentCalls to 1 does not seem to be my answer, or am I misunderstanding that property?
I am using an async function, simple injector and a custom JobActivator, so instead of a static void method, my Function signature is:
public async Task ProcessQueueMessage([ServiceBusTrigger("AnyQueue")] MediaEncoderQueueItem message, TextWriter log) {...}
Inside the job, it is moving some files around on a blob service, and calling (and waiting for) a media encoder from media services. So, whilst the web job itself is not doing a lot of processing, it takes quite a long time (15 minutes, for some files).
The app is launching, and when I post a message to the queue, it responds. However, it is receiving the message multiple times as soon as the message is received:
Executing: 'Functions.ProcessQueueMessage' - Reason: 'New ServiceBus message detected on 'MyQueue'.'
Executing: 'Functions.ProcessQueueMessage' - Reason: 'New ServiceBus message detected on 'MyQueue'.'
Additionally, whilst the task is running (and I see output from the Media Service functionality), it will get another "copy" from the queue.
Finally, after the task has completed, it's still intermittently processing the same message.
I suspect what's happening is the following:
The maximum lock duration (LockDuration) is 5 minutes. If processing of the message finishes in under 5 minutes, the message is marked as completed and removed from the broker. Otherwise, the message will re-appear once processing takes longer than 5 minutes (you lose the lock on the message) and will be consumed again. You can verify this by looking at the DeliveryCount of your message.
To resolve that, you could renew message lock just before it's about to expire using BrokeredMessage.RenewLockAsync().
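For reference, a rough analogue of that lock renewal in the Python SDK from the main question (receiver, message, work_is_done() and do_some_work() are placeholders from a surrounding receive loop); renew more often than the queue's lock duration:

while not work_is_done():                 # hypothetical check for the long-running job
    do_some_work(message)                 # one slice of the work (e.g. polling the encoder)
    receiver.renew_message_lock(message)  # push the lock expiry out again
receiver.complete_message(message)        # complete only once all the work is done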

Automatic reboot whenever there's an uncaught exception in a continuous WebJob

I'm currently creating a continuous WebJob that polls an API and then forwards messages to an Azure Service Bus. I've managed to get this to work just fine, but I have one problem: what if my app crashes for whatever reason? What if there's an uncaught exception, or something goes wrong, and the app stops running? How do I get it to run again?
I created a test app which sends a message to the Service Bus every second, and on the 11th message it crashes due to an intentionally placed NullReferenceException. I did this in order to investigate the behaviour if/when the app crashes.
What happens is that the app runs just fine for the first 10 seconds (as expected). Messages are being sent, and everything looks good. Then after the 10th second, when the exception occurs, nothing happens. No log in Azure saying there was an exception, no reboot - nothing. It just stands there as "running", but messages are no longer being sent.
How do I deal with this? It's essential that the application is able to reboot if it fails. Are there any standard ways to do this? Best practices?
Any help would be appreciated :)
It is always good to handle most of the failure scenarios in the system ourselves rather than letting the hosting environment react to the failures.
My suggestion would be to wrap the work in try/catch blocks in your executable, catch the different kinds of failure scenarios, and instead of throwing the exceptions, log them yourself or retry the operation if required.
For example, if you get junk data to process and it fails, you can retry the operation (say, 3 times) and then finally push a log entry to a dead-letter account so such junk inputs can be taken care of manually (a rough sketch of this pattern is below). Don't let the flow be stopped by throwing the exception; instead handle it yourself by logging a message that flags the need for manual intervention.
In a GUI or web application, if there is an exception, the flow is re-initiated by a user click and the system responds. But since this is a background processor, it is best to avoid all such control-flow blockers.
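A rough sketch of that retry-then-dead-letter pattern, assuming hypothetical process() and send_to_deadletter_store() helpers (the names are illustrative, not from any SDK):

import logging
import time

MAX_ATTEMPTS = 3

def handle(item):
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            process(item)                     # the actual work; may raise on junk input
            return
        except Exception:
            logging.exception("Attempt %d/%d failed", attempt, MAX_ATTEMPTS)
            time.sleep(2 ** attempt)          # simple backoff before retrying
    # All retries failed: park the item for manual inspection instead of crashing.
    send_to_deadletter_store(item)
    logging.error("Item moved to dead-letter store; manual intervention needed")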
Hope this helps.

Azure ServiceBus Retry Delay

I am using the Microsoft Azure Service Bus for queue messages, using WCF for the subscriptions. I am trying to implement retry logic. I use Peek/Lock to view the message and then have to do some local processing on it. If that processing fails, I unlock the message so I can try processing it again. The problem is I need a delay between processing attempts. Currently the message is popped back into the queue and is processed again almost immediately. There needs to be about 2 minutes between attempts.
If you always have to wait 2 minutes before re-processing the message of that particular queue, you could try to configure the lock-timeout on the queue to be 2 minutes (plus the time you expect it will take you to process the message) and then just let the lock expire, instead of unlocking it. This has the downside that you would need to keep an eye on your processing time, and extend the lock's timeout if needed.
Another option could be to receive and complete the message, set a scheduled delivery of 2 minutes into the future, and re-send the message. This has the downside that you need to consume it and ack it, which involves certain risks (e.g. your process dies before you get a chance to re-send the message).
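A sketch of that second option (complete and re-schedule) using the Python SDK from the main question; CONNECTION_STRING and QUEUE_NAME are placeholders and process() stands in for the local processing:

import datetime
from azure.servicebus import ServiceBusClient, ServiceBusMessage

RETRY_DELAY = datetime.timedelta(minutes=2)

with ServiceBusClient.from_connection_string(CONNECTION_STRING) as client:
    with client.get_queue_receiver(queue_name=QUEUE_NAME) as receiver, \
         client.get_queue_sender(queue_name=QUEUE_NAME) as sender:
        for message in receiver.receive_messages(max_wait_time=30):
            try:
                process(message)                       # local processing
                receiver.complete_message(message)
            except Exception:
                # Re-send a copy scheduled 2 minutes out, then complete the
                # original; if the process dies in between, you get a duplicate
                # rather than a lost message.
                retry = ServiceBusMessage(str(message))  # simplified payload copy
                when = datetime.datetime.now(datetime.timezone.utc) + RETRY_DELAY
                sender.schedule_messages(retry, when)
                receiver.complete_message(message)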
"If the message is Peeked in Peek Lock mode from a Queue then you don't have the receive context in the message. You can receive the message in Peek Lock mode, which will lock the message for the interval specified for the 'lock duration' property of the queue. Locked messages cannot be received until its lock expires. Thus, by setting the lock duration to 2 minutes and Receiving messages in Peek Lock mode will solve this issue.
You can either write custom code to update the Lock Duration property. Tools like Service Bus Explorer, Serverless360 etc provides options to update property using graphical user interface."
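If you do want to change the lock duration in code rather than through a tool, a minimal sketch with the azure-servicebus management client (CONNECTION_STRING and QUEUE_NAME are placeholders):

import datetime
from azure.servicebus.management import ServiceBusAdministrationClient

admin = ServiceBusAdministrationClient.from_connection_string(CONNECTION_STRING)
queue_props = admin.get_queue(QUEUE_NAME)
queue_props.lock_duration = datetime.timedelta(minutes=2)  # the service maximum is 5 minutes
admin.update_queue(queue_props)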
