Will Azure Service Bus .net SDK deliver duplicate messages? - azure

I am currently using the Azure.Messaging.ServiceBus .NET SDK to send high-throughput messages in batches to ASB Standard. While simulating intermittent network connectivity by turning off the internet during a high-throughput batched send, we observed that some extra messages were delivered to ASB. The count shown in the ASB portal for the queue didn't match our logs. Can this happen with ASB? Note that we have retry policies enabled for the client. I came across a similar StackOverflow thread and want to know whether there has been any recent progress on this.

Yes; the Service Bus guarantee is at-least-once. Duplicates are possible when publishing fails with an ambiguous state and is retried. Using the duplicate detection feature can help to guard against this, but requires that your application take responsibility for assigning unique MessageId values.
Duplicates can also occur if a message is received and is not settled. Once its lock has expired, the message is eligible to be received again.
Applications receiving messages should ensure that processing is idempotent and any needed detection or guards for duplicates are in place.
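As a rough illustration of the duplicate detection point, here is a minimal sketch using Azure.Messaging.ServiceBus. The queue name, the `order-` ID scheme, the 10-minute window, and the class and method names are assumptions made up for the example. The idea is only that a retried send must reuse the same MessageId, so the ID is derived from a business key rather than generated per attempt.

```csharp
using System;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;
using Azure.Messaging.ServiceBus.Administration;

public static class DuplicateDetectionSketch
{
    // Hypothetical helper: derive the MessageId from a business key so that
    // every retry of the same logical message carries the same ID.
    public static ServiceBusMessage BuildMessage(string orderId, string body) =>
        new ServiceBusMessage(body) { MessageId = $"order-{orderId}" };

    // Duplicate detection is configured on the entity when it is created.
    public static CreateQueueOptions BuildQueueOptions(string queueName) =>
        new CreateQueueOptions(queueName)
        {
            RequiresDuplicateDetection = true,
            // Assumed window; the broker remembers MessageIds for this long.
            DuplicateDetectionHistoryTimeWindow = TimeSpan.FromMinutes(10)
        };

    public static async Task RunAsync(string connectionString, string queueName)
    {
        var admin = new ServiceBusAdministrationClient(connectionString);
        if (!(await admin.QueueExistsAsync(queueName)).Value)
            await admin.CreateQueueAsync(BuildQueueOptions(queueName));

        await using var client = new ServiceBusClient(connectionString);
        ServiceBusSender sender = client.CreateSender(queueName);

        // Simulates an ambiguous failure followed by a retry: both sends
        // carry the same MessageId, so the broker keeps only one copy.
        await sender.SendMessageAsync(BuildMessage("42", "payload"));
        await sender.SendMessageAsync(BuildMessage("42", "payload"));
    }
}
```

This only guards the publishing side within the detection window. The receiving side still needs idempotent processing, as noted above.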

Related

What is the actual meaning, value and usage of Azure Service Bus' "at most once" delivery capability?

The Service Bus documentation states that "the At-Most-Once semantic can be supported by using session state to store the application state and by using transactions to atomically receive messages and update the session state." "Session" here appears to refer to Service Bus' messaging sessions, which include the ability to store arbitrary state. This mechanism lets you enroll state updates in transactions along with operations on messages.
I see how this can be used to reliably maintain the state of an application that is using message sessions. If you can update application state and complete a message in the same transaction, a properly-implemented app could potentially die anywhere in execution, and on resume would be guaranteed to inherit a state that results in successful, in-order continued session processing (sample code is here, though strangely it doesn't actually use transactions, although I see how it could and what that would accomplish).
What I don't see is how any of this translates to "at-most-once" delivery. Nothing about Service Bus, including updates to session state, can be enrolled in a distributed transaction. So what exactly does "at-most-once" mean, and what does it accomplish? And what distinguishing feature of Service Bus allows it to support "at-most-once" delivery when Azure Storage queues do not?
After looking at your post and reading through the doc, I realized it wasn't really explaining at-most-once.
So I reached out to the concerned team and confirmed that it is indeed incorrect. A PR has been raised to fix the doc accordingly.
Instead, sessions and transactions together provide a higher level of consistency which is commonly referred to as exactly-once processing (which can't really be achieved just by the message broker itself but along with a receiver capable of deduplication).
PS: at-most-once is indeed possible by simply using the ReceiveAndDelete mode
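For reference, a minimal sketch of a ReceiveAndDelete receiver with Azure.Messaging.ServiceBus. The class name, queue name parameter, and 5-second wait are assumptions for the example. In this mode the broker removes the message as soon as it hands it over, so a crash during processing loses the message instead of redelivering it.

```csharp
using System;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

public static class AtMostOnceSketch
{
    // ReceiveAndDelete: the message is settled by the broker on delivery.
    public static ServiceBusReceiverOptions BuildOptions() =>
        new ServiceBusReceiverOptions
        {
            ReceiveMode = ServiceBusReceiveMode.ReceiveAndDelete
        };

    public static async Task ReceiveOnceAsync(ServiceBusClient client, string queueName)
    {
        await using ServiceBusReceiver receiver =
            client.CreateReceiver(queueName, BuildOptions());

        // Returns null if nothing arrives within the wait time.
        ServiceBusReceivedMessage message =
            await receiver.ReceiveMessageAsync(TimeSpan.FromSeconds(5));

        if (message != null)
        {
            // No Complete call is needed (or possible) in this mode.
            Console.WriteLine(message.Body.ToString());
        }
    }
}
```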

How does Azure Service Bus Queue guarantee at-most-once delivery?

According to this doc, Service Bus supports two receive modes: Receive-and-Delete and Peek-Lock.
In Peek-Lock mode, if the consumer crashes, hangs, or hits a very long GC pause right after processing the message but before the message is completed, and the lock (visibility timeout) expires, there is a chance that the same message is delivered twice.
How, then, can Microsoft say that Service Bus supports an at-most-once delivery mode? Is it because of the Receive-and-Delete mode, which delivers each message only once? But then again, if something goes wrong while the consumer is processing the message, that valuable information is lost.
If so, what is the best way to ensure exactly-once delivery using Azure Service Bus as the queue and Azure Functions as consumers?
P.S. The one approach I can think of is storing MessageIds in a blob, but since the number of MessageIds could be very large in my case, storing and loading all of them is not the right approach.
Azure Functions always consumes Service Bus messages in Peek-Lock mode. Exactly-once delivery is basically not possible in the general case: there is always a chance that the consuming application crashes at the wrong time, just before completing the message, and then the message is redelivered.
You should strive to implement effectively-once processing. This is usually achieved with an idempotent message processor.
Storing message IDs (consumer-side de-duplication) is one option. You could have a policy to clean up old message IDs to keep the size of such storage manageable. To make this 100% reliable, you would have to store the message ID in the same transaction as the other modifications done by the processor.
Other options really depend on your processing scenario. Find a way to make it idempotent, so that processing the same message multiple times is functionally the same as processing it just once.
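To make the consumer-side de-duplication idea concrete, here is a small in-memory sketch of a message ID store with a retention-based cleanup policy. The class name, method names, and retention value are hypothetical. A real implementation would keep the IDs in the same durable store, and the same transaction, as the processor's other writes, which this sketch does not attempt.

```csharp
using System;
using System.Collections.Generic;

public sealed class ProcessedMessageStore
{
    private readonly Dictionary<string, DateTimeOffset> _seen = new Dictionary<string, DateTimeOffset>();
    private readonly TimeSpan _retention;

    public ProcessedMessageStore(TimeSpan retention) => _retention = retention;

    public int Count => _seen.Count;

    // Returns true the first time an ID is seen within the retention window,
    // false if it is a duplicate that should be skipped.
    public bool TryMarkProcessed(string messageId, DateTimeOffset now)
    {
        Prune(now);
        return _seen.TryAdd(messageId, now);
    }

    // Cleanup policy: forget IDs older than the retention window so the
    // store stays bounded even with a very large number of messages.
    private void Prune(DateTimeOffset now)
    {
        var expired = new List<string>();
        foreach (var entry in _seen)
        {
            if (now - entry.Value > _retention) expired.Add(entry.Key);
        }
        foreach (string id in expired) _seen.Remove(id);
    }
}
```

A handler would call `TryMarkProcessed(message.MessageId, DateTimeOffset.UtcNow)` and run its side effects only when that returns true. The retention window should be longer than the longest delay after which a redelivery can still happen.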

Move all messages from deadletter queue back into main queue of subscription

My service consumes messages from an Azure Service Bus subscription. A dependency of my service was down for a while, which caused a lot of messages to end up in the dead-letter queue (DLQ). Now that the service is back up, I want to reprocess all messages from the DLQ. How can I move/resubmit all messages from the DLQ back into the main queue?
Restrictions:
It's thousands of messages, so manually handling them isn't feasible.
The topic has about ten subscriptions. I don't want to resubmit the messages to the topic, because then all subscriptions would receive the messages, leading to double-processing.
I don't want to run the service against the DLQ directly, because some messages are broken and cause permanent errors, i.e. they would end up in the DLQ again, which would lead to an infinite loop. Moreover, the broken messages are put back at the front of the queue, effectively starving healthy messages that come after the broken ones.
I realize this is a while after the original post but if anyone else stumbles on this problem, there is a fairly handy solution baked into the Service Bus Explorer (which I have found to be incredibly handy with ASB development).
After connecting to your Service Bus and finding the needed namespace, find the desired topic and subscription with the deadletters in it. From there Right Click and Receive Deadletter Queue Messages and hit OK.
From there, highlight which you would like to send back to the main queue and hit Resubmit Selected Messages in Batch Mode.
Thomas, you have probably already found your answer, since this was quite a while ago. Think of the DLQ (or any existing queue that you have) as just another collection variable, like one in a PC app, but residing in the cloud. Just as with an in-memory collection variable in a PC app, you have many ways of using it. Of course there are limitations and differences between these two kinds of collection, but knowing them is how you design your solution as though the DLQ were just another collection variable.
For some queuing implementations, one solution would be to have another instance of the same app pointing at the DLQ, but with a fairly long visibility timeout (e.g. 6, 12, or even 24 hours, depending on your SLA), since you don't want to retry the messages too often. However, this is not applicable to Azure Service Bus, which limits the visibility timeout (lock duration) to at most 5 minutes.
If the DLQ contains broken, unrecoverable jobs, you should fix the app to delete them based on the error raised when the unknown exception occurred. Once the fix is deployed, such jobs are removed by your app and never get sent to the DLQ in the first place, and those already in the DLQ are removed by the fixed app.
The only option to replay DLQ messages is to receive them from DLQ, create new message with same content and send it again to the topic. They will end up at the end of subscription queue.
You can't send messages directly to the subscription. There is a trick to add a metadata property to the message, and then adjust all except one subscription to filter out such messages. It's up to you to decide if it's going to help in your scenario.
As for tooling, we always did that with custom code, because we always needed some extra work to be done, like logging each replayed message for further analysis.
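A minimal sketch of that custom replay code with Azure.Messaging.ServiceBus might look like the following. The `ReplayTarget` property name, the batch size, and the class and method names are assumptions for illustration. The other subscriptions would still need a rule that excludes messages carrying this property (for example a SQL filter along the lines of `NOT EXISTS(ReplayTarget)`), which is not shown here.

```csharp
using System;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

public static class DeadLetterReplaySketch
{
    // Hypothetical marker property that subscription filters can test for.
    public const string ReplayTargetProperty = "ReplayTarget";

    // Creates a new message with the same content and tags it with the
    // subscription that should receive the replay.
    public static ServiceBusMessage CloneForReplay(
        ServiceBusReceivedMessage dead, string subscriptionName)
    {
        var copy = new ServiceBusMessage(dead);
        copy.ApplicationProperties[ReplayTargetProperty] = subscriptionName;
        return copy;
    }

    public static async Task<int> ReplayAsync(
        ServiceBusClient client, string topicName, string subscriptionName)
    {
        // Receiver bound to the subscription's dead-letter sub-queue.
        await using ServiceBusReceiver dlq = client.CreateReceiver(
            topicName, subscriptionName,
            new ServiceBusReceiverOptions { SubQueue = SubQueue.DeadLetter });
        await using ServiceBusSender sender = client.CreateSender(topicName);

        int replayed = 0;
        while (true)
        {
            var batch = await dlq.ReceiveMessagesAsync(
                maxMessages: 50, maxWaitTime: TimeSpan.FromSeconds(5));
            if (batch.Count == 0) break; // DLQ drained (or idle for 5 s)

            foreach (ServiceBusReceivedMessage dead in batch)
            {
                // Send first, then complete: a crash in between leaves the
                // message in the DLQ, so a rerun may replay it twice.
                await sender.SendMessageAsync(CloneForReplay(dead, subscriptionName));
                await dlq.CompleteMessageAsync(dead);
                replayed++;
            }
        }
        return replayed;
    }
}
```

This is also the natural place to log each replayed message, or to skip messages that are known to be permanently broken instead of resending them.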
The quick answer is that you cannot directly move messages back to the main queue of a subscription. This is by design with how Microsoft implemented their topics and subscriptions.
Option #1
There is the option to use Azure Service Bus topic filters https://learn.microsoft.com/en-us/azure/service-bus-messaging/topic-filters and define/tag your messages in a manner that would only allow them to be received on the targeted subscription.
Option #2
The other option would be to change your current implementation. You would set up "delivery queues" (regular service bus queues) and configure each corresponding subscription to auto forward its messages to these delivery queues. Your message processing logic would then listen on these "delivery queues" vs the subscription. Any failures would then result in DLQ messages on these associated "delivery queues" which could then be handled outside of the topic/subscriptions.
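As a sketch of Option #2, the forwarding can be configured when the subscription is created through the administration client. The entity names are placeholders and the class and method names are made up for the example.

```csharp
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus.Administration;

public static class AutoForwardSketch
{
    // A subscription that forwards everything it receives to a regular queue.
    public static CreateSubscriptionOptions BuildForwardingSubscription(
        string topicName, string subscriptionName, string deliveryQueue) =>
        new CreateSubscriptionOptions(topicName, subscriptionName)
        {
            ForwardTo = deliveryQueue
        };

    public static async Task ProvisionAsync(
        ServiceBusAdministrationClient admin,
        string topicName, string subscriptionName, string deliveryQueue)
    {
        // The forwarding target has to exist before the subscription refers to it.
        if (!(await admin.QueueExistsAsync(deliveryQueue)).Value)
            await admin.CreateQueueAsync(deliveryQueue);

        await admin.CreateSubscriptionAsync(
            BuildForwardingSubscription(topicName, subscriptionName, deliveryQueue));
    }
}
```

With this layout the consumer reads from the delivery queue, so failed messages land in that queue's own DLQ and can be resent directly to the queue without touching the topic or the other subscriptions.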

Azure - Send message to all other Roles and wait for response

A really common pattern that I need in multi instance web applications is invalidating MemoryCaches over all instances - and waiting for a confirmation that this has been done. (Because a user might otherwise after a refresh suddenly see old data on another instance)
We can make this with a combination of:
AzureServicebus,
Sending message to a topic
other instances send message back with ReplyTo to the original instance
have a wait loop for waiting on the messages back,
be aware of how many other instances are there in the first place.
probably some timeout because what happens if an instance crashes in between?
I think working out all these little edge cases might be a lot of work - so before we reinvent the wheel - is there already a common pattern or library for this?
(of course one solution would be using a shared cache like Redis, but for some situations a memorycache is a lot faster)
Have a look at Azure Durable Functions, e.g. Fan-In/Fan-Out scenario. They use Azure Storage Queues underneath, but provide higher-level abstractions.
Note that Durable Functions are still in early preview (as of August 2017), so not suitable for production use yet.
I think working out all these little edge cases might be a lot of work - so before we reinvent the wheel - is there already a common pattern or library for this?
Indeed. This sounds like a candidate for a middleware framework such as NServiceBus or MassTransit.
AzureServicebus
Both NServiceBus and MassTransit support Azure Service Bus as the transport.
Sending message to a topic
Both NServiceBus and MassTransit can Publish messages (events) to topics.
other instances send message back with ReplyTo to the original instance
Both NServiceBus and MassTransit can send messages to specific destination. NServiceBus also can Reply to the originator of an incoming message using a request/reply pattern.
have a wait loop for waiting on the messages back
Both NServiceBus and MassTransit support Sagas, also known as Process Coordinator pattern.
be aware of how many other instances are there in the first place.
Not sure about this requirement. When you scale out, you're running with a competing consumer and shouldn't care about number of instances of an endpoint.
probably some timeout because what happens if an instance crashes in between?
If you refer to retries and recovery, then both NServiceBus and MassTransit support retries.
You can use Azure Redis cache pub/sub model to do this.
1) Subscribe to the invalidation channel via the Redis multiplexer (on every instance)
connectionMultiplexer.GetSubscriber().Subscribe(
    "SubscribeChannelName",
    (channel, message) => {
        // Invalidate the local cache here, then publish the confirmation.
        connectionMultiplexer.GetSubscriber().PublishAsync("PublishChannelName", "Cache invalidated for instance").Wait();
    });
2) Publish the cache invalidation and subscribe for confirmation from instances
var connection = ConnectionMultiplexer.Connect("redis connection string");
var redisSubscriber = connection.GetSubscriber();
redisSubscriber.Subscribe(
    "PublishChannelName",
    (channel, message) => {
        // Write logic to verify that all instances have confirmed the cache invalidation.
    });
redisSubscriber.PublishAsync("SubscribeChannelName", "invalidate cache").Wait();
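The "verify that all instances have confirmed" part, together with the timeout for crashed instances, could be sketched roughly as below. This is plain C# with no Redis or Service Bus dependency. The `AckTracker` name and the idea that each confirmation message carries an instance ID are assumptions, and the caller still has to know the set of live instances up front.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class AckTracker
{
    private readonly object _gate = new object();
    private readonly HashSet<string> _pending;
    private readonly TaskCompletionSource<bool> _allAcked =
        new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);

    public AckTracker(IEnumerable<string> expectedInstanceIds)
    {
        _pending = new HashSet<string>(expectedInstanceIds);
        if (_pending.Count == 0) _allAcked.TrySetResult(true);
    }

    // Call this from the confirmation handler; unknown or repeated IDs are ignored.
    public void Acknowledge(string instanceId)
    {
        lock (_gate)
        {
            if (_pending.Remove(instanceId) && _pending.Count == 0)
                _allAcked.TrySetResult(true);
        }
    }

    // True if every expected instance confirmed before the timeout elapsed.
    public async Task<bool> WaitAsync(TimeSpan timeout)
    {
        Task finished = await Task.WhenAny(_allAcked.Task, Task.Delay(timeout));
        return finished == _allAcked.Task;
    }
}
```

The publisher would create the tracker before sending the invalidation, call `Acknowledge` from the confirmation callback, and treat a `false` result from `WaitAsync` as "at least one instance did not answer in time".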

Detect and Delete Orphaned Queues, Topics, or Subscriptions on Azure Service Bus

If there are no longer any publishers or subscribers reading from or writing to a Queue, Topic, or Subscription, because of crashes or other abnormal terminations (instance restart, etc.), is that Queue/Topic/Subscription effectively orphaned?
I tested this by creating a few Queues, and then terminating the applications. Those Queues were still on the Service Bus a long time later. It seems that they will just stay there forever. That would be wonderful if we WANTED that behavior, but in this case, we do not.
How can we detect and delete these Queues, Topics, and Subscriptions? They will count towards Azure limits, etc, and we cannot have these orphaned processes every time an instance is restarted/patched/crashes.
If it helps make the question clearer, this is a unique situation in which the Queues/Topics/Subscriptions have special names, or special Filters, and a very limited set of publishers (1) and subscribers (1) for a limited time. This is not a case where we want survivability. These are instance-specific response channels. Whether we use Queues or Subscriptions is immaterial. If the instance is gone, so is the need for that Queue (or Subscription).
This is part of a solution where each web role has a dedicated response channel that it monitors. At any time, this web role may have dozens of requests pending via other messaging channels (Queues/Topics), and it is waiting for the answers on multiple threads. We need the response to come back to the thread that placed the message, so that the web role can respond to the caller. It is no good in this situation to simply have a Subscription based on the machine, because it will be receiving messages for other threads. We need each publishing thread to establish a dedicated response channel, so that the only thing on that channel is the response for that thread.
Even if we use Subscriptions (with some kind of instance-related filter) to do a long-polling receive operation on the Subscription, if the web role instance dies, that Subscription will be orphaned, correct?
This question can be boiled down like so:
If there are no more publishers or subscribers to a Queue/Topic/Subscription, then that service is effectively orphaned. How can those orphans be detected and cleaned up?
In this scenario you are looking for the Queue/Subscriptions to be "dynamic" in nature. They would be created and removed based on use as opposed to the current explicit provisioning model for these entities. Service Bus provides you with the APIs to perform create/delete operations so you can plug these on role OnStart/OnStop events appropriately. If those operations fail for some reason then the orphaned entities will exist. Again you can run clean up operation on them based on some unique identifier for the name of the entities. An example of this can be seen here: http://windowsazurecat.com/2011/08/how-to-simplify-scale-inter-role-communication-using-windows-azure-service-bus/
In the near future we will add more metadata and query capabilities to Queues/Topics/Subscriptions so you can see when they were last accessed and make cleanup decisions.
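For readers on the current Azure.Messaging.ServiceBus SDK, one way to get this kind of automatic cleanup is the `AutoDeleteOnIdle` setting on the entity, which asks the service to remove the entity after it has been idle for the given time. The sketch below only builds the options. The `responses-` naming scheme and the 10-minute idle time are assumptions for the example.

```csharp
using System;
using Azure.Messaging.ServiceBus.Administration;

public static class EphemeralQueueSketch
{
    // Instance-specific response queue that the service deletes once it has
    // seen no activity for the idle period (the minimum allowed is 5 minutes).
    public static CreateQueueOptions BuildResponseQueue(string instanceId) =>
        new CreateQueueOptions($"responses-{instanceId}")
        {
            AutoDeleteOnIdle = TimeSpan.FromMinutes(10)
        };
}
```

The options would be passed to `ServiceBusAdministrationClient.CreateQueueAsync` when the instance starts. `CreateSubscriptionOptions` has the same `AutoDeleteOnIdle` property for the subscription-based variant.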
Service Bus Queues are built using the “brokered messaging” infrastructure designed to integrate applications or application components that may span multiple communication protocols, data contracts, trust domains, and/or network environments. This allows for a mechanism to communicate reliably with durable messaging.
If a client (publisher) sends a message to a Service Bus queue and then crashes, the message will be stored on the queue until a consumer reads it off the queue. Likewise, if your consumer dies and restarts, it will just poll the queue and pick up any work that is waiting for it (you can scale out and have multiple consumers reading from the queue to increase throughput). Service Bus Queues allow you to decouple your applications via a durable cloud gateway, analogous to MSMQ on-premises (or other queuing technology).
What I'm really trying to say is that you won't get an orphaned queue; you might get poisoned messages that you will need to handle. This blog post gives some very detailed information on Service Bus Queues and their capacity and quotas, which might give you a better understanding: http://msdn.microsoft.com/en-us/library/windowsazure/hh767287.aspx
Re: queue management, you can do this via Visual Studio (1.7 SDK & Tools), or there is an excellent tool called Service Bus Explorer that will make your life easier for queue management: http://code.msdn.microsoft.com/windowsazure/Service-Bus-Explorer-f2abca5a
*Note the default maximum number of queues is 10,000 (per service namespace, this can be increased via a support call)
As Abhishek Lai mentioned there is no orphan detecting capability supported.
Orphan detection can be implemented externally in multiple ways.
For example, whenever you send/receive a message, update a timestamp in an SQL database to indicate that the queue/topic/subscription is still active. This timestamp can then be used to determine orphans.
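The timestamp check itself is simple. Here is a sketch in plain C#, with an in-memory dictionary standing in for the SQL table. The `OrphanDetector` name and the idle threshold are made up for the example.

```csharp
using System;
using System.Collections.Generic;

public static class OrphanDetector
{
    // lastActivity maps an entity name to the last time a send or receive
    // touched it; anything idle for longer than the threshold is reported.
    public static List<string> FindOrphans(
        IReadOnlyDictionary<string, DateTimeOffset> lastActivity,
        DateTimeOffset now,
        TimeSpan idleThreshold)
    {
        var orphans = new List<string>();
        foreach (var entry in lastActivity)
        {
            if (now - entry.Value > idleThreshold) orphans.Add(entry.Key);
        }
        orphans.Sort(StringComparer.Ordinal); // stable order for logging
        return orphans;
    }
}
```

A periodic cleanup job could feed the returned names to the management API's delete operations. The threshold should be comfortably larger than the longest legitimate quiet period of a live instance.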
If your process crashes, which is very much possible, message delivery within the queue will be affected, but the queue will still be available to process your requests. Handling application crashes and unreadable messages with Windows Azure Service Bus queues is described here:
The Service Bus provides functionality to help you gracefully recover from errors in your application or difficulties processing a message. If a receiver application is unable to process the message for some reason, then it can call the Abandon method on the received message (instead of the Complete method). This will cause the Service Bus to unlock the message within the queue and make it available to be received again, either by the same consuming application or by another consuming application.
In the event that the application crashes after processing the message but before the Complete request is issued, then the message will be redelivered to the application when it restarts. This is often called At Least Once Processing, that is, each message will be processed at least once but in certain situations the same message may be redelivered. If the scenario cannot tolerate duplicate processing, then application developers should add additional logic to their application to handle duplicate message delivery. This is often achieved using the MessageId property of the message, which will remain constant across delivery attempts.
If there are no longer any processes reading from or writing to a queue, because of crashes or other abnormal terminations (instance restart, etc.), is that queue effectively orphaned?
No. The queue is in place to allow communication via brokered messages; if all your apps die for some reason, the queue still exists and will be there when they come back up. It is the communication channel for loosely coupled applications. Regarding billing: 'Messages are charged based on the number of messages sent to, or delivered by, the Service Bus during the billing month', so you won't be charged if a queue exists but nobody is using it.
I tested this by creating a few queues, and then terminating the applications. Those queues were still on the machine a long time later.
The whole point of the queue is to guarantee message delivery between loosely coupled applications. Think of the queue as an entity or application in its own right, with high availability (SLA) because it's hosted in Azure; your producers/consumers can die and restart, and the queue will still be active in Azure. Note: I got a bit confused by your wording "still on the machine a long time later". The queue doesn't actually live on your machine; it sits in Azure in a designated Service Bus namespace. You can view and manage the queues via the tools I pointed out in the previous answer.
How can we detect and delete these queues, as they will count towards Azure limits, etc.
As stated above, the default maximum number of queues is 10,000 (per service namespace; this can be increased via a support call), and queue management can be done via the tools mentioned in the other answer. You should only be looking to delete queues when you no longer have producers/consumers looking to write to them (i.e. never again). You can of course create and delete queues from your producer/consumer applications via the namespaceManager, checking for existence first with namespaceManager.QueueExists; more information here: How to Use Service Bus Queues
If it helps make the question clearer, this is a unique situation in which the queues have special names, and a very limited set of publishers (1) and subscribers (1) for a limited time.
It sounds like you need to use Topics & Subscriptions (see How to Use Service Bus Topics/Subscriptions); this link also has a section on 'How to Delete Topics and Subscriptions'. If the entities have a very limited lifetime, you could handle topic creation/deletion in your apps; otherwise you could have a separate Queue/Topic/Subscription setup/deletion script to handle this logic.
