So I have an azure function acting as a queue trigger that calls an internally hosted API.
There doesn't seem to be a definitive answer online on how to handle a message that could not be processed due to issues other than being poisonous.
An example:
My message is received and the function attempts to call the API. The message payload is correct and could be handled however the API/service is down for whatever reason (this time will likely be upwards of 10 minutes). Currently what happens is the message delivery count is reaching its max(10) and then getting pushed to the dead letter queue, which in turn happens for each message after.
I need a way to either not increment the delivery count or reset it upon reaching max. Alternatively I could abandon the peek lock on the message without increment the delivery count as I want to stop processing any message on the queue until the API/service is back up and running.
This way I would ensure that all messages that can be processed will be and will not fall on the dead letter because of connection issues between services.
Any ideas on how to achieve this?
Currently what happens is the message delivery count is reaching its max(10) and then getting pushed to the dead letter queue, which in turn happens for each message after.
As this document states about Exceeding MaxDeliveryCount:
Queues and subscriptions have a QueueDescription.MaxDeliveryCount/SubscriptionDescription.MaxDeliveryCount setting; the default value is 10. Whenever a message has been delivered under a lock (ReceiveMode.PeekLock), but has been either explicitly abandoned or the lock has expired, the message's BrokeredMessage.DeliveryCount is incremented. When the DeliveryCount exceeds the MaxDeliveryCount, the message gets moved to the DLQ specifying the ``MaxDeliveryCountExceeded``` reason code.
This behavior cannot be turned off, but the MaxDeliveryCount can set to a very large number.
According to your requirement, I assumed that you could follow the approaches below to achieve your purpose:
For receiving messages under ReceiveMode.PeekLock
You could specify the Maximum Delivery Count between 1 and 2147483647 under the "SETTINGS > Properties" of your service bus queue on Azure Portal.
For receiving messages under ReceiveMode.ReceiveAndDelete
You could try-catch the exception when your API/service is down, then you could re-send the message to your queue.
Related
I am looking into setting up a web job trigger to read message from service bus queue. What would be the best practice to implement a retry logic in case of any errors handling the downstream systems.
Would we be able to throw an exception so that the message will not be deleted from the queue and will be retried after certain time period?
Appreciate your feedback.
You don't need to define retry logic explicitly. When the message is de-queued from service bus , it gets invisible from queue for certain time period (lock time default 30secs , you can configure it). You try to process the message , if it gets successful you simply call BrokeredMessage.CompleteAsync which means i am done and mark this message as completed. If you have some problem in down stream you can abandon the message by calling BrokeredMessage.AbandonAsync . This will unlock the message and the message appears back in the queue. The message will be picked up by the worker again and process it. Until you get successful or reach the max retry limit after which the message is send to dead letter queue.
I'm wondering if it is possible to send a brokered message to a queue/topic where the message is already in a deferred state?
I'm asking this because I currently have a process that does the following ...
The process starts and a brokered message is sent to a queue (this triggers a function that records the message body as an entity in table storage with a 'Processing' status).
Additional work is done in the process
If we get to the end of the process without any issues, another brokered message is sent to the queue with a completion message (this triggers the same function that updates the entity in table storage with a 'Complete' status).
While this method is mostly working, it feels clunky and fragile. I would really like to be able to send a message to the queue and then have the final step make the message visible on the queue so it can be consumed by the function (Durable Function).
I thought about setting the ScheduledEnqueueTimeUtc, but I can't guarantee when the process will finish (I'm thinking worst case scenario here) so I'm not sure how long to set it.
I also looked at the Defer option for a BrokeredMessage but it seems this can only be set from the receiver and not be in a deferred state initially.
Is what I'm trying to do possible with Service Bus brokered messages? Could I set the Scheduled Enqueue time so some ridiculously long time (e.g. 2 hours) and if it reaches that time it is automatically expired and moved to the Dead Letter queue? Should I send the initial message to the Dead Letter queue and then once the process is complete, retrieve it and resubmit it?
Has anyone had any experience with implementing a process like this ... send a start message and only process the message once a completion notification has been received? I need this to be as robust as possible as I'm dealing with financial transactions in this process.
Hopefully my explanation makes sense.
I'm wondering if it is possible to send a brokered message to a queue/topic where the message is already in a deferred state?
That's not possible. You can only delay a brand new message, not defer it. Deferring required a message to be received first for it to have a SequenceNumber.
Using ScheduledEnqueueTimeUtc has its challenges as you will be sending it in the future, but cannot cancel once processing is over. Instead, you could leverage QueueClient.ScheduleMessageAsync() that returns back SequenceNumber immediately. This way you can set the message far into future, but also cancel it if processing is finished earlier.
I ended up solving this issue by keeping the process of sending two messages, but refactoring my durable function to record the messages in Table Storage, check that both messages have been received and if they have, add a new message to Azure Queue Storage. A second function listens to the queue which starts its process.
After much testing, this appears to be quite a robust solution. It then doesn't matter what order the two messages arrive, or how long they take ... as long as both of them have arrive, that is when the second function will kick off.
Sometimes there are some messages in Azure Queues that are not taken in charge by Azure Functions and also are not visible from StorageExplorer.
These messages are created without any visibility delay.
Is there any way to know what do those messages contain, and why are they not processed by our Azure Functions?
In the image you can see that we have a message in queue but it is not visible in the list and it is there from hours.
The Azure Queue API currently has no way to check invisible messages.
There are several situations in which a message will become invisible:
The message was added with an VisibilityTimeout in the Put Message request. The message will be invisible until this initial timeout expires.
The message has been retrieved (dequeued). Whenever a message is retrieved it will be invisible for the duration of the VisibilityTimeout specified by the Get Messages request, or 30 seconds by default.
The message has expired. Messages expire after 7 days by default, or after the MessageTTL specified in the Put Message request. Note: after a while these messages are automatically deleted, but until that point they are there as invisible message.
Use cases
Initial VisibilityTimeout
Messages are created with an initial VisibilityTimeout so that the message can be created now, but processed later (after the timeout expires), for whatever reason the creator has for wanting to delay this processing.
VisibilityTimeout on retrieving
The intended process for processing queue messages is:
The application dequeues one or more messages, optionally specifying the next VisibilityTimeout. This timeout should be bigger than the time it takes to process the message(s).
The application processes the message(s).
The application deletes the messages. When the processing fails the message(s) are not deleted.
Message(s) for which the process failed will become visible again as soon as their VisibilityTimeout expires, so that they can be re-tried. To prevent endless retries step 2. should start by checking the DequeueCount of the message: if it is bigger than the desired retry-count it should be deleted, instead of processed. It is good practice to copy such messages to a deadletter / poison queue (for example a queue with the original queue name plus a -poison suffix).
MessageTTL
By default messages have a time-to-live of 7 days. If the application processing cannot handle the amount of messages being added, a backlog could build up. Adjusting the TTL will determine what happens to such backlog.
Alternatively the application could crash, so that the backlog would build up until the application would be started again.
It seems that the message is expired. The following steps could reproduce the issue, you could test it.
Add message with a short TTL
After the message has been expired
I have a webjob which process the message only once by using the condition (DevliverCount = 1). Because I don't want other instance to process it if the locktime expired by first webjob. As other webjob try to process the message after locktime expired, the condition (DevliverCount = 1) will not met and comes out of the method which deletes the message from the queue automatically.
The problem over here is if the message state went to never finished (other than success) I wont have message in queue to process. How to handle this situation?
I think part of the problem is that you're trying to use the MaxDeliveryCount property to prevent concurrent message processing:
MaxDeliveryCount
The max delivery count setting is not used to prevent multiple consumers from processing a message at the same time, it's used to prevent "poison messages" where any consumer attempts to process a message whose contents prevent successful processing, and therefore the message would otherwise be processed forever.
I recommend you determine exactly what it is you're trying to accomplish. If you want a simple competing consumers scenario where multiple webjobs consume messages from a single queue, then there are standard ways to accomplish that:
good description of competing consumers
competing consumers with Service Bus queues
You can use MaxDeliveryCount in conjunction with competing consumers... if you want to prevent poison messages you can set MaxDeliveryCount to something larger than 1 and still give other consumers a chance to process messages whose locks expire.
Azure Service Bus supports dead-lettering of poison messages that exceed max delivery count, so you're able to examine such messages offline... they aren't simply deleted forever.
You might also need to add code in your webjobs to renew locks prior to their expiration... otherwise service bus can't differentiate between "valid messages that are taking a long time to process" and "poison messages that can't be processed". Without lock renewal your long-running valid messages will be dead-lettered the same as poison messages, which is almost certainly not what you want.
Good luck!
I am working with Windows Azure Message queues. I want know if is there a method to lock messages in the queue when i get them ?
When you retrieve a message from the queue, it's marked as invisible until you delete it (or the timeout period is reached). When it's marked as invisible, nobody else sees the message. I guess that's as closed to "locked" as you're going to get.
If, while processing, you feel you need more time, you can modify the message and extend the invisibility timeout.
You do need to focus on idempotent operations with Windows Azure queues: Assume that any given message may be processed more than once:
Processing goes beyond invisibility timeout, so some other worker gets the message
VM instance crashes while processing message, causing it to re-appear in the queue and get processed again