How does Azure Service Bus Queue guarantee at-most-once delivery?

According to this doc, Service Bus supports two receive modes: Receive-and-Delete and Peek-Lock.
In Peek-Lock mode, if the consumer crashes, hangs, or goes through a very long GC pause right after processing the message but before the message is completed, and the lock then expires, the same message can be delivered twice.
How, then, can Microsoft say that Service Bus supports an at-most-once delivery mode? Is it because of the Receive-and-Delete mode, which delivers each message only once? But then again, if something happens while the consumer is processing the message, that valuable information is lost.
If so, what is the best way to ensure exactly-once delivery using Azure Service Bus as the queue and Azure Functions as the consumers?
P.S. The one approach I can think of is storing MessageIDs in a blob, but since the number of MessageIDs in my case could be very large, storing and loading all of them is not the right approach.

Azure Functions will always consume Service Bus messages in Peek-Lock mode. Exactly-once delivery is basically not possible in the general case: there is always a chance that the consuming application will crash at the wrong time, just before completing the message, and then the message will be redelivered.
You should strive to implement effectively-once processing. This is usually achieved with an idempotent message processor.
Storing MessageIDs (consumer-side de-duplication) is one option. You could have a policy to clean up old MessageIDs to keep the size of such storage manageable. To make this 100% reliable, you would have to store the MessageID in the same transaction as the other modifications made by the processor.
Other options really depend on your processing scenario. Find a way to make it idempotent, so that processing the same message multiple times is functionally the same as processing it just once.
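As a rough illustration of the "same transaction" point above, here is a minimal sketch of consumer-side de-duplication. It uses an in-memory SQLite database as a stand-in for whatever store the processor writes to; the table names, the `handle_message` function, and the order fields are hypothetical and not part of any Azure SDK.

```python
import sqlite3


def make_store() -> sqlite3.Connection:
    """Create an in-memory database with a dedup table and a business table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed (message_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE orders (order_id TEXT, amount INTEGER)")
    conn.commit()
    return conn


def handle_message(conn: sqlite3.Connection, message_id: str,
                   order_id: str, amount: int) -> bool:
    """Apply the message once; return False if it was already processed."""
    try:
        # "with conn" wraps both inserts in one transaction: it commits if the
        # block succeeds and rolls back if any statement raises.
        with conn:
            conn.execute(
                "INSERT INTO processed (message_id) VALUES (?)", (message_id,)
            )
            conn.execute(
                "INSERT INTO orders (order_id, amount) VALUES (?, ?)",
                (order_id, amount),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key rejected a repeated MessageID: a redelivery.
        return False
```

Because the MessageID and the business write commit together, a crash between the two cannot leave one without the other. The caller can complete the Service Bus message whether `handle_message` returns True or False.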

Related

Will Azure Service Bus .net SDK deliver duplicate messages?

I am currently using the Azure.Messaging.ServiceBus .NET SDK to send high-throughput messages in batches to ASB Standard. While simulating intermittent network connectivity by turning off the internet during a high-throughput batched send, we observed that some extra messages were delivered to the ASB. The count shown in the ASB portal for the queue didn't match our logs. Can this happen with the ASB? Note that we have retry policies enabled for the client. I came across a similar StackOverflow thread and want to know if there has been any recent progress on this.
Yes; the Service Bus guarantee is at-least-once. Duplicates are possible when publishing fails with an ambiguous state and is retried. Using the duplicate detection feature can help to guard against this, but requires that your application take responsibility for assigning unique MessageId values.
Duplicates can also occur if a message is received and is not settled. Once its lock has expired, the message is eligible to be received again.
Applications receiving messages should ensure that processing is idempotent and any needed detection or guards for duplicates are in place.
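To illustrate why duplicate detection depends on the sender assigning stable MessageId values, here is a small sketch. The `DedupQueue` class is only an in-memory stand-in, not the Service Bus API, and it ignores the time window that the real feature applies. The namespace UUID and the business key format are made up for the example.

```python
import uuid

# Hypothetical namespace for deriving IDs; any fixed UUID would do.
ID_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def stable_message_id(business_key: str) -> str:
    """Derive the same MessageId every time for the same logical message."""
    return str(uuid.uuid5(ID_NAMESPACE, business_key))


class DedupQueue:
    """In-memory stand-in for a queue with duplicate detection enabled."""

    def __init__(self) -> None:
        self.seen_ids: set[str] = set()
        self.messages: list[str] = []

    def send(self, message_id: str, body: str) -> bool:
        """Accept the message unless its ID was already seen."""
        if message_id in self.seen_ids:
            return False  # silently dropped, like a detected duplicate
        self.seen_ids.add(message_id)
        self.messages.append(body)
        return True
```

If the ID were a fresh random value on every send attempt, a retry after an ambiguous failure would look like a new message and the duplicate would get through. Deriving the ID from the payload's business key means the retry carries the same ID as the first attempt.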

Replaying Messages in Order

I am implementing a consumer which does processing of messages from a queue where order of messages is of importance. I would like to implement a mechanism using NodeJS where:
1. the consumer function is consuming messages m1, m2, ..., mN from the queue
2. doing an IO intensive operation and process the messages. m -> m'
3. Storing the result m' in a redis cache.
4. acknowledging the queue after each message process (2)
In a different function, I am listening to the message from the cache
sending the processed messages m' to an external system
if the external system was able to process the message, then delete the processed message from the cache
If the external system rejects the processed message, then stop sending messages, discard the unsent processed messages in the cache and reset the offset to the last accepted message in the queue. For example if m12' was the last message accepted by the system, and I have acknowledged m23 from the queue, then I have to discard m13' to m23' and reset the offset so that the consumer can read and start processing from m13 again.
Few assumptions:
The processing m to m' is intensive and I am processing them optimistically, knowing that most of the times there won't be a failure
With the current assumptions and goals, is there any way I can achieve this with RabbitMQ or any Azure equivalent? My client doesn't prefer Kafka or any Azure equivalent of Kafka (Azure Event Hub).
In scenarios where the messages will always be generated in sequence then a simple queue is probably all you need.
Azure Queues are pretty simple to get into, but the general mode of operation for queues is to remove the messages as they are processed successfully.
If you can avoid the scenario where you must "roll back" or re-process from an earlier time, so if you can avoid the orchestration aspect then this would be a much simpler option.
It's the "go back and replay" that you will struggle with. If you can implement two queues in a sequential pattern, where processing messages from one queue successfully pushes the message into the next queue, then we never need to go back, because the secondary consumer can never process ahead of the primary.
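A minimal in-memory sketch of that two-queue chain follows. The `deque` objects stand in for real queues, and `transform` and `send` are hypothetical callbacks for the m -> m' step and the external system. A message leaves a queue only after the next step has accepted it, so nothing ever has to be replayed from an earlier position.

```python
from collections import deque
from typing import Callable


def stage_one(inbox: deque, outbox: deque, transform: Callable) -> None:
    """Move every message from inbox to outbox, transforming m -> m'."""
    while inbox:
        outbox.append(transform(inbox[0]))
        inbox.popleft()  # remove only after the result is queued downstream


def stage_two(outbox: deque, send: Callable) -> list:
    """Send processed messages in order; stop at the first rejection."""
    delivered = []
    while outbox:
        if not send(outbox[0]):
            break  # the rejected message stays at the head for a later retry
        delivered.append(outbox.popleft())
    return delivered
```

When the external system rejects a message, the unsent results simply remain in the second queue in their original order, so there is no need to discard them or reset an offset on the first queue.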
With Azure Event Hubs it is much easier to reset the offset for processing, because the messages stay in the bucket regardless of their read state, (in fact any given message does not have such a state) and the consumer maintains the offset pointer itself. It also has support for multiple consumer groups, which will make a copy of the message available to each consumer.
You can up your plan to maintain the data for up to 7 days without blowing the budget.
There are two problems with large-scale telemetry ingestion services like Azure Event Hubs for your use case:
The order of receipt is less reliable for messages that arrive extremely close together. The hub is designed to receive many messages from many sources concurrently, so its internal architecture cares a lot less about preserving the precise order. It records the precise receipt timestamp on each message, but it does not guarantee that the overall sequence of records will exactly match what you would get by sorting on the receipt timestamp (it's a subtle but important distinction).
Event Hubs (and many client processing code examples) are designed around at-least-once delivery across multiple concurrent consuming threads, not exactly-once delivery. Again, consumers are encouraged to be asynchronous, and the service will try to ensure that failed processing attempts are retried by the next available thread.
So you could use Event Hubs, but you would have to bypass or disable a lot of its features, which is generally a strong signal that it is not the correct fit for your purpose. If you want to explore it, though, you would want to limit the concurrency aspects:
minimise the partition count
You probably want 1 partition for each message producer, or at least for each sequential set; maintaining sequence is simpler inside a single partition
make sure your message sender (producer) only sends to a specific partition
Each producer MUST use a unique partition key
create a consumer group for each of your consumers
process messages one at a time, not in batches
process with a single thread
I have a lot of experience in designing MS Azure based solutions for Industrial IoT (telemetry from PLCs) and Agricultural IoT (Raspberry Pi) device implementations. In almost all cases we think that the order of messaging is important, but unless you are maintaining real-time two-way command and control, you can usually get away with an optimistic approach where each message and any derivatives are or were correct at the time of transmission.
If there is even a remote possibility that a device can be offline for any period of time, then dealing with the stale data flushing through the system when the device comes back online can really play havoc with sequential logic programming.
Take a step back to analyse your solution. Event Hubs does offer a convenient way to roll back the processing to a previous offset, as long as that record is still in the bucket, but can you redesign your logic flow so that you do not have to re-process old data?
What is the requirement that drives this sequence? If it is so important to maintain the sequence, then you should probably process the data with a single consumer that does everything, or look at chaining the queues in a sequential manner.

Azure queue - can I verify a message will be read only once?

I am using an Azure queue and have several different processes reading from the queue.
My system is built in a way that assumes each message is read only once.
This Microsoft article claims Azure queues have an at least once delivery guarantee which potentially means two processes can read the same message from the queue.
This StackOverflow thread claims that if I use GetMessage then the message becomes invisible to all other processes for the invisibility timeout.
Assuming I use GetMessage() and never exceed the message invisibility time before I DeleteMessage, can I assume I will get each message only once?
I think there is a property in queue message named DequeueCount, which is the number of times this message has been dequeued. And it's maintained by queue service. I think you can use this property to identify whether your message had been read before.
https://learn.microsoft.com/en-us/dotnet/api/azure.storage.queues.models.queuemessage.dequeuecount?view=azure-dotnet
No. The following can happen:
GetMessage()
Add some records in a database...
Generate some files...
DeleteMessage() -> Unexpected failure (process that crashes, instance that reboots, network connectivity issues, ...)
In this case your logic was executed without calling DeleteMessage. This means, once the invisibility timeout expires, the message will appear in the queue and be processed once again. You will need to make sure that your process is idempotent:
Idempotence is the property of certain operations in mathematics and
computer science, that they can be applied multiple times without
changing the result beyond the initial application.
An alternative solution would be to use Service Bus Queues with the ReceiveAndDelete mode (see this page under How to Receive Messages from a Queue). When you receive a message, it is marked as consumed and never appears again. This way you can be sure it is delivered at most once (see the comparison with Storage Queues here). But then again, if something happens while you are processing the message (e.g. the server crashes), you could lose valuable information.
Update:
This will simulate an At-Most-Once in storage queues. The message can arrive multiple times via GetMessage, but will only be processed once by your business logic (with the risk that some of your business logic will never execute).
GetMessage()
DeleteMessage()
AddRecordsToDatabase()
GenerateFiles()
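The difference between the two orderings can be shown with a toy simulation. `VisibilityQueue` below is a hypothetical in-memory model with a logical clock, not the Storage Queues client, and the `crash_midway` flag simply stops the consumer between its two steps.

```python
class VisibilityQueue:
    """Toy queue where a received message is hidden for `timeout` ticks."""

    def __init__(self, timeout: int) -> None:
        self.timeout = timeout
        self.now = 0      # logical clock
        self.items = {}   # message id -> [body, visible_at]

    def put(self, msg_id: str, body: str) -> None:
        self.items[msg_id] = [body, 0]

    def get(self):
        """Return (id, body) of the first visible message and hide it."""
        for msg_id, entry in self.items.items():
            if entry[1] <= self.now:
                entry[1] = self.now + self.timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id: str) -> None:
        self.items.pop(msg_id, None)

    def advance(self, ticks: int) -> None:
        self.now += ticks


def consume(queue: VisibilityQueue, work_log: list,
            delete_first: bool, crash_midway: bool) -> None:
    """Receive one message and run the 'business logic' (append to work_log)."""
    received = queue.get()
    if received is None:
        return
    msg_id, body = received
    if delete_first:
        queue.delete(msg_id)
        if crash_midway:
            return  # crashed before the work ran: the message is lost
        work_log.append(body)
    else:
        work_log.append(body)
        if crash_midway:
            return  # crashed before delete: the message will reappear
        queue.delete(msg_id)
```

With `delete_first=False`, a crash followed by a retry after the timeout runs the work twice (at-least-once). With `delete_first=True`, the same crash means the work never runs (at-most-once).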

Detect and Delete Orphaned Queues, Topics, or Subscriptions on Azure Service Bus

If there are no longer any publishers or subscribers reading from or writing to a Queue, Topic, or Subscription, because of crashes or other abnormal terminations (instance restart, etc.), is that Queue/Topic/Subscription effectively orphaned?
I tested this by creating a few Queues, and then terminating the applications. Those Queues were still on the Service Bus a long time later. It seems that they will just stay there forever. That would be wonderful if we WANTED that behavior, but in this case, we do not.
How can we detect and delete these Queues, Topics, and Subscriptions? They will count towards Azure limits, etc, and we cannot have these orphaned processes every time an instance is restarted/patched/crashes.
If it helps make the question clearer, this is a unique situation in which the Queues/Topics/Subscriptions have special names, or special Filters, and a very limited set of publishers (1) and subscribers (1) for a limited time. This is not a case where we want survivability. These are instance-specific response channels. Whether we use Queues or Subscriptions is immaterial. If the instance is gone, so is the need for that Queue (or Subscription).
This is part of a solution where each web role has a dedicated response channel that it monitors. At any time, this web role may have dozens of requests pending via other messaging channels (Queues/Topics), and it is waiting for the answers on multiple threads. We need the response to come back to the thread that placed the message, so that the web role can respond to the caller. It is no good in this situation to simply have a Subscription based on the machine, because it will be receiving messages for other threads. We need each publishing thread to establish a dedicated response channel, so that the only thing on that channel is the response for that thread.
Even if we use Subscriptions (with some kind of instance-related filter) to do a long-polling receive operation on the Subscription, if the web role instance dies, that Subscription will be orphaned, correct?
This question can be boiled down like so:
If there are no more publishers or subscribers to a Queue/Topic/Subscription, then that service is effectively orphaned. How can those orphans be detected and cleaned up?
In this scenario you are looking for the Queue/Subscriptions to be "dynamic" in nature. They would be created and removed based on use as opposed to the current explicit provisioning model for these entities. Service Bus provides you with the APIs to perform create/delete operations so you can plug these on role OnStart/OnStop events appropriately. If those operations fail for some reason then the orphaned entities will exist. Again you can run clean up operation on them based on some unique identifier for the name of the entities. An example of this can be seen here: http://windowsazurecat.com/2011/08/how-to-simplify-scale-inter-role-communication-using-windows-azure-service-bus/
In the near future we will add more metadata and query capabilities to Queues/Topics/Subscriptions so you can see when they were last accessed and make cleanup decisions.
Service Bus Queues are built using the “brokered messaging” infrastructure designed to integrate applications or application components that may span multiple communication protocols, data contracts, trust domains, and/or network environments. This allows for a mechanism to communicate reliably with durable messaging.
If a client (publisher) sends a message to a Service Bus queue and then crashes, the message will be stored on the queue until a consumer reads it. Also, if your consumer dies and restarts, it will just poll the queue and pick up any work that is waiting for it (you can scale out and have multiple consumers reading from the queue to increase throughput). Service Bus Queues allow you to decouple your applications via a durable cloud gateway analogous to MSMQ on-premises (or other queuing technology).
What I'm really trying to say is that you won't get an orphaned queue; you might get poisoned messages that you will need to handle. This blog post gives some very detailed information on Service Bus Queues and their capacity and quotas, which might give you a better understanding: http://msdn.microsoft.com/en-us/library/windowsazure/hh767287.aspx
Re: queue management, you can do this via Visual Studio (1.7 SDK & Tools), or there is an excellent tool called Service Bus Explorer that will make your life easier for queue management: http://code.msdn.microsoft.com/windowsazure/Service-Bus-Explorer-f2abca5a
*Note the default maximum number of queues is 10,000 (per service namespace, this can be increased via a support call)
As Abhishek Lai mentioned there is no orphan detecting capability supported.
Orphan detection can be implemented externally in multiple ways.
For example, whenever you send/receive a message, update a timestamp in an SQL database to indicate that the queue/topic/subscription is still active. This timestamp can then be used to determine orphans.
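A minimal sketch of the timestamp idea follows. It uses a plain dictionary in place of the SQL table, and the entity names and the idle threshold are made up for the example. Deleting the entities it reports would still go through the Service Bus management API.

```python
from datetime import datetime, timedelta


def record_activity(last_activity: dict, entity_name: str, now: datetime) -> None:
    """Call on every send/receive to mark the entity as still in use."""
    last_activity[entity_name] = now


def find_orphans(last_activity: dict, now: datetime,
                 max_idle: timedelta) -> list:
    """Return the names of entities idle for longer than max_idle, sorted."""
    return sorted(
        name for name, seen in last_activity.items() if now - seen > max_idle
    )
```

A periodic cleanup job could call `find_orphans` and delete whatever it returns. The threshold needs to be comfortably longer than the longest legitimate quiet period for a response channel.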
If your process crashes, which is very much possible, message delivery within the queue may be affected, but the queue will still be available to process your requests. Handling application crashes and unreadable messages with Windows Azure Service Bus queues is described here:
The Service Bus provides functionality to help you gracefully recover from errors in your application or difficulties processing a message. If a receiver application is unable to process the message for some reason, then it can call the Abandon method on the received message (instead of the Complete method). This will cause the Service Bus to unlock the message within the queue and make it available to be received again, either by the same consuming application or by another consuming application.
In the event that the application crashes after processing the message but before the Complete request is issued, then the message will be redelivered to the application when it restarts. This is often called At Least Once Processing, that is, each message will be processed at least once but in certain situations the same message may be redelivered. If the scenario cannot tolerate duplicate processing, then application developers should add additional logic to their application to handle duplicate message delivery. This is often achieved using the MessageId property of the message, which will remain constant across delivery attempts.
If there are no longer any processes reading nor writing to a queue, because of crashes or other abnormal terminations (instance restart, etc.), is that queue effectively orphaned?
No. The queue is in place to allow communication to occur via brokered messages. If all your apps die for some reason, the queue still exists and will be there when they come back up; it is the communication channel for loosely coupled applications. Regarding billing: 'Messages are charged based on the number of messages sent to, or delivered by, the Service Bus during the billing month', so you won't be charged if a queue exists but nobody is using it.
I tested this by creating a few queues, and then terminating the
applications. Those queues were still on the machine a long time
later.
The whole point of the queue is to guarantee message delivery between loosely coupled applications. Think of the queue as an entity or application in its own right, with high availability (SLA) because it's hosted in Azure; your producers/consumers can die and restart and the queue will still be active in Azure. *Note: I got a bit confused by your wording re: "still on the machine a long time later". The queue doesn't actually live on your machine; it sits up in Azure in a designated Service Bus namespace. You can view and manage the queues via the tools I pointed out in the previous answer.
How can we detect and delete these queues, as they will count towards
Azure limits, etc.
As stated above, the default maximum number of queues is 10,000 (per service namespace; this can be increased via a support call). Queue management can be done via the tools stated in the other answer. You should only be looking to delete queues when you no longer have producers/consumers looking to write to them (i.e. never again). You can of course create and delete queues in your producer/consumer applications via the namespaceManager.QueueExists; more information here: How to Use Service Bus Queues
If it helps make the question clearer, this is a unique situation in which the queues have special names, and a very limited set of publishers (1) and subscribers (1) for a limited time.
It sounds like you need to use Topics & Subscriptions (How to Use Service Bus Topics/Subscriptions); this link also has a section on 'How to Delete Topics and Subscriptions'. If you have a very limited lifetime, then you could handle topic creation/deletion in your apps; otherwise you could have a separate Queue/Topic/Subscription setup/deletion script to handle this logic...

Competing-Consumers Messaging Pattern in Azure Service Bus

I'm just getting started with Windows Azure Service Bus (Topics & Queues) and I'm trying to implement a Competing-Consumers messaging pattern.
Essentially, I want to have a set of message Producers and a set of message Consumers. Once a message is produced, I want the first available Consumer to process the message. No other Consumers should get the message.
Is there a way to do this in Azure?
Simple. Just make two (or more) receivers that concurrently receive from a single queue and you're done. Any retrieved message goes to exactly one of those receivers, since the cursor over the message log is advanced as a message is taken. Competing consumers are an inherent capability of a networked queue, so there's really nothing special needed.
If you need the opposite (each message goes to each consumer), you make a subscription per consumer, which gives you an isolated cursor over the message log that can move independently of other receivers. For kicks, you can obviously also have competing consumers on a subscription.
Clemens
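The competing-consumers behaviour described above can be sketched locally with Python's thread-safe `queue.Queue` standing in for the Service Bus queue. The function name and the consumer count are illustrative only, and no Azure SDK is involved.

```python
import queue
import threading


def run_competing_consumers(messages: list, consumer_count: int) -> dict:
    """Let several consumer threads drain one shared queue.

    Returns a mapping of consumer id -> messages that consumer received.
    """
    work = queue.Queue()
    for message in messages:
        work.put(message)

    received = {cid: [] for cid in range(consumer_count)}

    def consume(cid: int) -> None:
        while True:
            try:
                message = work.get_nowait()  # each get removes the message
            except queue.Empty:
                return
            received[cid].append(message)

    threads = [
        threading.Thread(target=consume, args=(cid,))
        for cid in range(consumer_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return received
```

How the messages are split between consumers varies from run to run, but every message ends up with exactly one of them, which is the property the question asks for.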
Topics are a feature of brokered messaging, but are a one-to-many "publish/subscribe" pattern. Queues are one-to-one message communication. So yes, it sounds like you should simply use queues. Also see http://msdn.microsoft.com/en-us/library/hh689723(VS.103).aspx.
You probably don't want Topics then, but rather Brokered Messaging.
You can emulate Topic-like functionality in Brokered Messaging by using the message's Label and/or Content Type properties along with the PeekLock receive mode.

Resources