Azure queue - can I verify a message will be read only once?

I am using an Azure queue and have several different processes reading from the queue.
My system is built in a way that assumes each message is read only once.
This Microsoft article states that Azure queues provide an at-least-once delivery guarantee, which means two processes could potentially read the same message from the queue.
This StackOverflow thread claims that if I use GetMessage then the message becomes invisible to all other processes for the duration of the invisibility timeout.
Assuming I use GetMessage() and never exceed the invisibility timeout before I call DeleteMessage(), can I assume I will get each message only once?

There is a property on the queue message named DequeueCount, which is the number of times the message has been dequeued; it is maintained by the queue service. You can use this property to identify whether your message has been read before.
https://learn.microsoft.com/en-us/dotnet/api/azure.storage.queues.models.queuemessage.dequeuecount?view=azure-dotnet
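For example, with the current Azure.Storage.Queues SDK, a minimal sketch (connection string and queue name are placeholders) could look like this:

    using Azure.Storage.Queues;
    using Azure.Storage.Queues.Models;

    var queue = new QueueClient("<connection-string>", "my-queue");
    QueueMessage message = await queue.ReceiveMessageAsync();

    if (message != null)
    {
        // DequeueCount is maintained by the queue service; it is 1 on the
        // first delivery, so anything higher means a redelivery.
        if (message.DequeueCount > 1)
        {
            // Possible duplicate: an earlier consumer's invisibility
            // timeout expired before it deleted the message.
        }

        await queue.DeleteMessageAsync(message.MessageId, message.PopReceipt);
    }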

No. The following can happen:
GetMessage()
Add some records in a database...
Generate some files...
DeleteMessage() -> unexpected failure (the process crashes, the instance reboots, network connectivity issues, ...)
In this case your logic was executed, but DeleteMessage never succeeded. Once the invisibility timeout expires, the message will reappear in the queue and be processed once again. You will need to make sure that your processing is idempotent:
Idempotence is the property of certain operations in mathematics and computer science that they can be applied multiple times without changing the result beyond the initial application.
An alternative solution would be to use Service Bus Queues with the ReceiveAndDelete mode (see this page under How to Receive Messages from a Queue). If you receive the message, it will be marked as consumed and never appear again. This way you can be sure it is delivered at-most-once (see the comparison with Storage Queues here). But then again, if something happens while you are processing the message (e.g. the server crashes), you could lose valuable information.
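With the newer Azure.Messaging.ServiceBus SDK, ReceiveAndDelete looks roughly like this (a sketch; connection string and queue name are placeholders):

    using Azure.Messaging.ServiceBus;

    await using var client = new ServiceBusClient("<connection-string>");
    ServiceBusReceiver receiver = client.CreateReceiver("my-queue",
        new ServiceBusReceiverOptions { ReceiveMode = ServiceBusReceiveMode.ReceiveAndDelete });

    // The message is removed from the queue as soon as it is handed over,
    // so a crash after this line loses it for good (at-most-once).
    ServiceBusReceivedMessage message = await receiver.ReceiveMessageAsync();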
Update:
This will simulate at-most-once delivery in Storage queues. The message can arrive multiple times via GetMessage, but will only be processed once by your business logic (with the risk that some of your business logic will never execute):
GetMessage()
DeleteMessage()
AddRecordsToDatabase()
GenerateFiles()
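In code, the delete-first pattern could look something like the sketch below (Azure.Storage.Queues; the two processing steps are stubs for the ones named above):

    using Azure.Storage.Queues;
    using Azure.Storage.Queues.Models;

    var queue = new QueueClient("<connection-string>", "my-queue");
    QueueMessage message = await queue.ReceiveMessageAsync();

    if (message != null)
    {
        // Delete first: the message can never come back, but if either
        // step below fails, its work is lost forever (at-most-once).
        await queue.DeleteMessageAsync(message.MessageId, message.PopReceipt);

        AddRecordsToDatabase(message.MessageText);
        GenerateFiles(message.MessageText);
    }

    void AddRecordsToDatabase(string body) { /* ... */ }
    void GenerateFiles(string body) { /* ... */ }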

Related

Azure Service Bus - Add a message to the queue in a deferred state

I'm wondering if it is possible to send a brokered message to a queue/topic where the message is already in a deferred state?
I'm asking this because I currently have a process that does the following ...
The process starts and a brokered message is sent to a queue (this triggers a function that records the message body as an entity in table storage with a 'Processing' status).
Additional work is done in the process
If we get to the end of the process without any issues, another brokered message is sent to the queue with a completion message (this triggers the same function that updates the entity in table storage with a 'Complete' status).
While this method is mostly working, it feels clunky and fragile. I would really like to be able to send a message to the queue and then have the final step make the message visible on the queue so it can be consumed by the function (Durable Function).
I thought about setting the ScheduledEnqueueTimeUtc, but I can't guarantee when the process will finish (I'm thinking worst case scenario here) so I'm not sure how long to set it.
I also looked at the Defer option for a BrokeredMessage, but it seems this can only be set by the receiver, so the message can't start out in a deferred state.
Is what I'm trying to do possible with Service Bus brokered messages? Could I set the scheduled enqueue time to some ridiculously long time (e.g. 2 hours) so that, if it reaches that time, the message is automatically expired and moved to the Dead Letter queue? Should I send the initial message to the Dead Letter queue and then, once the process is complete, retrieve it and resubmit it?
Has anyone had any experience with implementing a process like this ... send a start message and only process the message once a completion notification has been received? I need this to be as robust as possible as I'm dealing with financial transactions in this process.
Hopefully my explanation makes sense.
I'm wondering if it is possible to send a brokered message to a queue/topic where the message is already in a deferred state?
That's not possible. You can only delay a brand-new message, not defer it. Deferring requires a message to be received first so that it has a SequenceNumber.
Using ScheduledEnqueueTimeUtc has its challenges: you schedule the message for the future, but you cannot cancel it once processing is over. Instead, you could leverage QueueClient.ScheduleMessageAsync(), which returns the SequenceNumber immediately. This way you can schedule the message far into the future, but also cancel it if processing finishes earlier.
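With the newer Azure.Messaging.ServiceBus SDK, the equivalent calls live on ServiceBusSender; a sketch (connection string and queue name are placeholders):

    using Azure.Messaging.ServiceBus;

    await using var client = new ServiceBusClient("<connection-string>");
    ServiceBusSender sender = client.CreateSender("my-queue");

    // Schedule far into the future; the sequence number comes back
    // immediately and is the handle for cancelling later.
    long sequenceNumber = await sender.ScheduleMessageAsync(
        new ServiceBusMessage("payload"),
        DateTimeOffset.UtcNow.AddHours(2));

    // ... do the work ...

    // If processing finishes early, cancel so the message never enqueues.
    await sender.CancelScheduledMessageAsync(sequenceNumber);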
I ended up solving this issue by keeping the process of sending two messages, but refactoring my durable function to record the messages in Table Storage, check that both messages have been received, and if they have, add a new message to Azure Queue Storage. A second function listens to that queue and starts its process.
After much testing, this appears to be quite a robust solution. It doesn't matter what order the two messages arrive in, or how long they take; as long as both of them have arrived, the second function will kick off.

How does Azure Service Bus Queue guarantees at most once delivery?

According to this doc, Service Bus supports two modes: Receive-and-Delete and Peek-Lock.
In Peek-Lock mode, if the consumer crashes/hangs/does a very long GC right after processing the message, but before the message is completed and the lock expires, there's a chance that the same message is delivered twice.
Then how can Microsoft say that Service Bus supports an at-most-once delivery mode? Is it because of the Receive-and-Delete mode, which delivers messages only once? But then again, if something happens while consumers are processing the message, that valuable info is lost.
If so, what is the best way to ensure exactly-once processing using Azure Service Bus as the queue and Azure Functions as consumers?
P.S. The one approach I can think of is storing Message IDs in a blob, but since in my case the number of Message IDs could be very large, storing and loading all of them is not the right approach.
Azure Functions will always consume Service Bus messages in Peek-Lock mode. Exactly-once delivery is basically not possible in the general case: there's always a chance that the consuming application will crash at the wrong time, just before completing the message, and then the message will be re-delivered.
You should strive to implement Effectively-Once processing instead. This is usually achieved with an idempotent message processor.
Storing Message IDs (consumer-side de-duplication) is one option. You could have a policy to clean up old Message IDs to keep the size of such storage manageable. To make this 100% reliable, you would have to store the Message ID in the same transaction as the other modifications done by the processor (see the sketch below).
Other options really depend on your processing scenario. Find a way to make it idempotent, so that processing the same message multiple times is functionally the same as processing it just once.
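As a sketch of the "same transaction" point, assuming a SQL table ProcessedMessages with MessageId as its primary key (all table and column names here are hypothetical):

    using Microsoft.Data.SqlClient;

    async Task ProcessOnceAsync(string messageId, string payload)
    {
        await using var conn = new SqlConnection("<connection-string>");
        await conn.OpenAsync();
        using var tx = conn.BeginTransaction();
        try
        {
            // Record the Message ID first; a duplicate delivery violates
            // the primary key and throws before any business write happens.
            using var dedupe = new SqlCommand(
                "INSERT INTO ProcessedMessages (MessageId) VALUES (@id)", conn, tx);
            dedupe.Parameters.AddWithValue("@id", messageId);
            await dedupe.ExecuteNonQueryAsync();

            // Business write in the same transaction (hypothetical table).
            using var work = new SqlCommand(
                "INSERT INTO Orders (Payload) VALUES (@p)", conn, tx);
            work.Parameters.AddWithValue("@p", payload);
            await work.ExecuteNonQueryAsync();

            tx.Commit();
        }
        catch (SqlException e) when (e.Number == 2627) // primary key violation
        {
            tx.Rollback(); // already processed: drop the duplicate silently
        }
    }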

Amazon SQS better way of handling listeners

I have an SQS queue which has a lot of messages (typically in the thousands). Presently I have multiple listeners (created by threads spawned from the same source), and each listener listens to the queue and receives messages. As soon as a listener receives a message from the queue, that listener deletes it from the queue. The message is processed only after it has been deleted from the queue. I am using a visibility timeout of 30 seconds.
I am not using any locks or anything to handle duplicates, since I am deleting the message from the queue as soon as it is received. I haven't seen a duplicate so far, but I am worried it might happen.
Now, the question is, which is a better way, having multiple listeners this way or listening to the queue in a single thread, and then spinning up new threads to process each message you receive?
Firstly, it is worth understanding the concept of message invisibility timeout.
When a message is retrieved from an Amazon SQS queue (e.g. by your thread), the message is marked as invisible in Amazon SQS. Best practice is for your thread to process the message and delete it only after processing has completed. This way, if the thread fails, the message will automatically become visible on the queue again and another thread can process it.
With your current application design, if a thread fails then the message is lost and will not be retried. You should consider changing your code to delete the message only after it has been processed.
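A minimal sketch of that order with the AWS SDK for .NET (the queue URL and the Process step are placeholders):

    using Amazon.SQS;
    using Amazon.SQS.Model;

    var sqs = new AmazonSQSClient();
    var response = await sqs.ReceiveMessageAsync(new ReceiveMessageRequest
    {
        QueueUrl = "<queue-url>",
        MaxNumberOfMessages = 1,
        VisibilityTimeout = 30 // seconds the message stays invisible
    });

    foreach (Message message in response.Messages)
    {
        Process(message.Body);

        // Delete only after processing succeeded; if Process throws, the
        // message becomes visible again after the timeout and is retried.
        await sqs.DeleteMessageAsync("<queue-url>", message.ReceiptHandle);
    }

    void Process(string body) { /* hypothetical business logic */ }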
Using multiple threads to process messages is recommended, because it will allow higher message throughput by processing messages in parallel. It is also a simpler design, and simple is always best. Your alternate idea of having one process retrieve messages and then firing off threads to process the message is more complex and does not provide any benefits.
Amazon SQS queues can occasionally return the same message more than once. It is rare, but it can happen. The multiple-thread design will probably encounter this more often than the single-thread design, because multiple threads might simultaneously retrieve the same message. However, it could still happen in the single-thread model, too.
If processing the same message twice is a concern, then consider using a FIFO queue (not currently available in every AWS Region). This will guarantee that every message is received only once. Alternatively, your code would need to check whether a particular message has already been processed (eg by checking in a database).
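For reference, FIFO de-duplication is driven from the sending side; a sketch (queue URL and IDs are placeholders):

    using Amazon.SQS;
    using Amazon.SQS.Model;

    var sqs = new AmazonSQSClient();

    // FIFO queues drop messages that repeat a MessageDeduplicationId
    // within a 5-minute window, and order messages per MessageGroupId.
    await sqs.SendMessageAsync(new SendMessageRequest
    {
        QueueUrl = "<queue-url>.fifo",
        MessageBody = "payload",
        MessageGroupId = "orders",
        MessageDeduplicationId = "order-42"
    });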
The multiple-thread design will also allow you to scale horizontally by having multiple systems (even across multiple Availability Zones) process messages, whereas your single-thread design has a single point of failure and is less scalable.

Status as never finished by one of my WebJobs while processing the message

I have a WebJob which processes a message only once by using the condition (DeliveryCount = 1), because I don't want another instance to process it if the lock time expired on the first WebJob. When another WebJob tries to process the message after the lock time has expired, the condition (DeliveryCount = 1) is not met and it exits the method, which deletes the message from the queue automatically.
The problem here is that if the message ends up in a state other than success (never finished), I won't have the message in the queue to process. How do I handle this situation?
I think part of the problem is that you're trying to use the MaxDeliveryCount property to prevent concurrent message processing:
MaxDeliveryCount
The max delivery count setting is not used to prevent multiple consumers from processing a message at the same time, it's used to prevent "poison messages" where any consumer attempts to process a message whose contents prevent successful processing, and therefore the message would otherwise be processed forever.
I recommend you determine exactly what it is you're trying to accomplish. If you want a simple competing consumers scenario where multiple webjobs consume messages from a single queue, then there are standard ways to accomplish that:
good description of competing consumers
competing consumers with Service Bus queues
You can use MaxDeliveryCount in conjunction with competing consumers... if you want to prevent poison messages you can set MaxDeliveryCount to something larger than 1 and still give other consumers a chance to process messages whose locks expire.
Azure Service Bus supports dead-lettering of poison messages that exceed max delivery count, so you're able to examine such messages offline... they aren't simply deleted forever.
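For example, MaxDeliveryCount is configured on the queue itself; a sketch using the Azure.Messaging.ServiceBus.Administration client (names are placeholders):

    using Azure.Messaging.ServiceBus.Administration;

    var admin = new ServiceBusAdministrationClient("<connection-string>");

    // After 10 failed delivery attempts the message is dead-lettered
    // instead of being retried forever.
    await admin.CreateQueueAsync(new CreateQueueOptions("my-queue")
    {
        MaxDeliveryCount = 10
    });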
You might also need to add code in your webjobs to renew locks prior to their expiration... otherwise Service Bus can't differentiate between "valid messages that are taking a long time to process" and "poison messages that can't be processed". Without lock renewal, your long-running valid messages will be dead-lettered just like poison messages, which is almost certainly not what you want.
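With the Azure.Messaging.ServiceBus receiver, lock renewal is a sketch like this (names are placeholders):

    using Azure.Messaging.ServiceBus;

    await using var client = new ServiceBusClient("<connection-string>");
    ServiceBusReceiver receiver = client.CreateReceiver("my-queue");

    ServiceBusReceivedMessage message = await receiver.ReceiveMessageAsync();

    // Long-running work: renew the lock before it expires so the broker
    // knows the message is still being processed, not poisoned.
    await receiver.RenewMessageLockAsync(message);

    // ... finish the work ...
    await receiver.CompleteMessageAsync(message);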
Good luck!

Locking messages in queue with Windows Azure Queues

I am working with Windows Azure message queues. I want to know if there is a method to lock messages in the queue when I get them.
When you retrieve a message from the queue, it's marked as invisible until you delete it (or until the timeout period is reached). While it's marked as invisible, nobody else sees the message. I guess that's as close to "locked" as you're going to get.
If, while processing, you feel you need more time, you can modify the message and extend the invisibility timeout.
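With Azure.Storage.Queues, extending the timeout is an UpdateMessage call; a sketch (names are placeholders):

    using Azure.Storage.Queues;
    using Azure.Storage.Queues.Models;

    var queue = new QueueClient("<connection-string>", "my-queue");
    QueueMessage message = await queue.ReceiveMessageAsync();

    // Push the invisibility window out another 60 seconds; the returned
    // receipt replaces the old pop receipt for later Delete/Update calls.
    UpdateReceipt receipt = await queue.UpdateMessageAsync(
        message.MessageId, message.PopReceipt,
        visibilityTimeout: TimeSpan.FromSeconds(60));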
You do need to focus on idempotent operations with Windows Azure queues: Assume that any given message may be processed more than once:
Processing goes beyond the invisibility timeout, so some other worker gets the message
VM instance crashes while processing message, causing it to re-appear in the queue and get processed again
