I'm struggling to understand whether anything needs to be done, and if so what, to provide high availability for two different types of Azure resources:
Azure Service Bus
Function Apps
The Service Bus SLA guarantees at least 99.9% availability for most of the service (including Relay, Queues and Topics, and Notification Hubs). Beyond that, replication and partitioned messaging entities are common approaches to high availability. Each partitioned queue or topic consists of multiple fragments, and each fragment is stored in a different messaging store. If the corresponding messaging store is unavailable, Service Bus writes the message to a different fragment, if possible.
The following article may be helpful:
High Availability and Disaster Recovery for Azure Service Bus
For Function Apps running on App Service Plans, Microsoft guarantees that the associated Functions compute will be available 99.95% of the time. So, if possible, you could run your Function App on an App Service Plan and enable the Always On setting.
Related
I have nearly identical service buses in 2 separate regions. I am trying to make them be more region agnostic for consuming applications.
I looked into Azure Service Bus geo-disaster recovery, message replication, and cross-region federation, and they all seem complicated. Instead, I was thinking I could create a Service Bus client that reads from the same topic/subscription name in both regions and treats the messages as if they came from a single region.
While I'm sure this can be implemented, I was wondering: does this functionality exist in any current Microsoft library? Basically, if message A gets published to the East topic/subscription and message B gets published to the Central US topic/subscription, then the client would receive both A and B. The order is not important.
Thanks!
Some functionality of this sort existed in the track 0 Azure Service Bus SDK, but for failover rather than concurrent execution. As it was a client-side feature, it did not get much traction, and it was confusing and complicated.
NServiceBus had a legacy Azure Service Bus transport that supported using more than one namespace concurrently. The feature was deprecated because it was more trouble than it was worth. In addition, Service Bus has since introduced the Premium tier, which handles availability better than multiple Standard namespaces together. Add availability zones on top of that and it's hands down a better option than the complexity of setting up multiple receivers.
If your namespaces are identical, I would suggest consolidating them. One strategy would be to "forward" messages from one namespace to the other using a processor of your own, as there's no cross-namespace forwarding.
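A minimal sketch of such a forwarding processor, with the Service Bus calls abstracted away. The callables `send_to_target`, `complete_on_source`, and `abandon_on_source` are hypothetical placeholders for whatever your SDK provides, and the `(message_id, body)` tuple is an assumed message shape. The point it illustrates is the ordering: settle the message on the source only after the send to the target succeeded, which gives at-least-once forwarding.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical message shape for illustration: (message_id, body).
Message = Tuple[str, str]


def forward_messages(
    received: Iterable[Message],
    send_to_target: Callable[[Message], None],
    complete_on_source: Callable[[Message], None],
    abandon_on_source: Callable[[Message], None],
) -> int:
    """Copy each message to the target namespace, then settle it on the source.

    A message is completed on the source only after the send succeeded, so a
    failed send leaves it available for redelivery (at-least-once forwarding).
    Returns the number of messages that were forwarded.
    """
    forwarded = 0
    for message in received:
        try:
            send_to_target(message)
        except Exception:
            # Send failed: release the message so it can be retried later.
            abandon_on_source(message)
            continue
        complete_on_source(message)
        forwarded += 1
    return forwarded
```

Because a crash between the send and the completion would deliver the same message twice, consumers on the target side should tolerate duplicates.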
We use Azure Service Bus to send messages from one service to another. The producer generates a large number of messages (a couple of million) over 1-2 hours. As a result, our Service Bus (we use the Premium tier) is throttled and we receive errors on both the producer and consumer sides. I wonder whether we could check the load on our ASB through the Azure SDK (we use the ASB Java SDK) and, if it is high, slow down the services that send messages to the queue/topic.
I also understand that we can add more Premium units, but it is the last option we will take.
What we use:
Azure Service Bus Java SDK
Java 9 and Spring Boot 2.0
Azure Service Bus Premium version
Do you have any recommendations for my case? Any recommendation - patterns, frameworks, ASB SDK features would be great.
What you can usually do is check your usage in the portal (from the overview panel). There is no feature in the SDK itself that shows the load on your Service Bus namespace.
Since you're using premium messaging, there is no fixed threshold that tells you whether you've exceeded your messaging unit capacity.
You will get throttling errors as shown in the documentation:
Also be aware that you can scale the number of messaging units up and down according to your usage.
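Since the SDK does not expose the load, one option is to let the throttling errors themselves drive the producer's pace. The sketch below is an additive-increase / multiplicative-decrease (AIMD) controller. It is an illustration only, not a Service Bus SDK feature; the class name and all parameter values are assumptions, and your code would call `on_throttled()` whenever a send fails with a throttling error.

```python
class AdaptiveSendRate:
    """AIMD send-rate controller (rates are in messages per second)."""

    def __init__(self, initial_rate: float, min_rate: float, max_rate: float,
                 increase_step: float = 10.0,
                 decrease_factor: float = 0.5) -> None:
        self.rate = initial_rate
        self.min_rate = min_rate
        self.max_rate = max_rate
        self.increase_step = increase_step
        self.decrease_factor = decrease_factor

    def on_success(self) -> float:
        # A send went through: probe for more capacity, a little at a time.
        self.rate = min(self.max_rate, self.rate + self.increase_step)
        return self.rate

    def on_throttled(self) -> float:
        # The broker pushed back: cut the rate sharply, but never below the floor.
        self.rate = max(self.min_rate, self.rate * self.decrease_factor)
        return self.rate

    def delay_between_sends(self) -> float:
        # Seconds the producer should wait between two sends at the current rate.
        return 1.0 / self.rate
```

The producer would sleep for `delay_between_sends()` between messages (or between batches), so a burst of throttling errors quickly reduces pressure on the namespace while successful sends slowly restore throughput.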
Do Logic Apps have some sort of built-in geo-replication like Azure Scheduler or Key Vault? I can't seem to find any information about it.
I have seen some implementations using API Management, but that is for Logic Apps that use HTTP triggers; in my case I'm using Service Bus triggers.
If there is no geo-replication, what would a disaster recovery implementation look like for my scenario?
I think you are asking three questions: how do I get a geo-redundant Logic Apps deployment, how do I get a geo-redundant Service Bus Messaging deployment, and how do I use them in combination?
I would start with the Service Bus Messaging side, as it is the foundation for the Logic Apps process. To have a geo-redundant Service Bus Messaging queue you have to use the Premium SKU; this article goes into detail on how it works: https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-geo-dr
For the Logic Apps side, you would set up a Logic App in each region (primary and secondary) and point both at the alias for the Service Bus queue. You would then disable the Logic App in the secondary region and enable it only when the primary region's Logic App is not operational. This would have to be done with an endpoint-monitoring script that switches over to the secondary and disables the primary.
As you said, there are more automated options (such as Traffic Manager) when Logic Apps are triggered by HTTP traffic, but since you are reading from queues the recovery is more complex.
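The decision logic of such a monitoring script could look roughly like the sketch below. It only tracks health-check results and decides which region should be active; actually enabling or disabling the Logic Apps is left out. The class name, the threshold of three consecutive failures, and the choice to keep failback manual are all assumptions made for illustration.

```python
class FailoverMonitor:
    """Decides which region's Logic App should be enabled."""

    def __init__(self, failure_threshold: int = 3) -> None:
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0
        self.active_region = "primary"

    def record_health_check(self, primary_healthy: bool) -> str:
        """Record one probe of the primary and return the region to keep enabled."""
        if self.active_region == "secondary":
            # Already failed over; failing back is left as a manual decision.
            return self.active_region
        if primary_healthy:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failure_threshold:
                # Too many failures in a row: switch to the secondary region.
                self.active_region = "secondary"
        return self.active_region
```

Requiring several consecutive failures avoids switching regions on a single transient probe error.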
I'm playing around with creating a Pub/Sub system using Azure Service Bus Topics and an Azure Function app where the functions are decorated with ServiceBusTrigger. I'm using the standard tier for Service Bus and Consumption Plan for the Function app.
The concept is working fine and will serve my purposes, however, I'm having difficulty understanding how this will affect my Azure costs.
Running just one function with that attribute seems to generate a couple of requests per second to the Service Bus. In the end, I expect to have quite a few Topics with Subscriptions and an Azure Function for each Subscription.
If I understand correctly, each of the requests that the ServiceBusTrigger generates counts against the number of Service Bus operations you get. With a few functions, I'll go over 13M ops pretty easily.
Some of these subscribers aren't so time sensitive that they'd need to check the Subscription several times per second. Is there a way to change the frequency that ServiceBusTrigger checks the Subscription to reduce costs? Am I approaching this problem from entirely the wrong angle?
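To put rough numbers on that concern, here is a back-of-the-envelope calculation. It assumes "a couple of requests per second" means a steady 2 requests per second, a 30-day month, and that every request counts as one operation, as the question supposes. The 13M figure is the one quoted above.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def monthly_operations(requests_per_second: float, days: int = 30) -> int:
    """Operations one always-polling function would generate in a month."""
    return round(requests_per_second * SECONDS_PER_DAY * days)


def functions_within_budget(ops_budget: int, requests_per_second: float,
                            days: int = 30) -> int:
    """How many such functions fit inside the monthly operations budget."""
    return ops_budget // monthly_operations(requests_per_second, days)


print(monthly_operations(2))                   # 5184000
print(functions_within_budget(13_000_000, 2))  # 2
```

Under those assumptions a single function accounts for about 5.2M operations a month, so the third function would already exceed 13M.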
I'm currently building a hybrid-cloud solution that needs to write messages to a queue for later processing. It is absolutely imperative that the queue is highly available (99.999+% uptime).
My options are to read/write messages to a local ZeroMQ high availability pair, or an Azure Service Bus. I would prefer to go the Azure Service Bus route, but can't find any documentation regarding high availability configuration for Azure Service Bus.
Has anyone had success setting up Azure Service Bus for high availability? I understand that the SLA for a single instance of any Azure service cannot be changed. I'm thinking more along the lines of the failover capabilities of Azure Web Apps.
The main thing you can do to get effective availability above the SLA when consuming a service is to make sure you handle retries. The key is that most outages are temporary, so tune the retry backoff to cover the edge cases. Some clients use linear or exponential backoff to wait progressively longer for the service to come back up.
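A minimal sketch of retry with exponential backoff, not tied to any Service Bus SDK. The function name, the default values, and the random jitter are choices made for this illustration; `operation` stands for any call that may fail transiently, such as sending a message.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(operation: Callable[[], T], max_attempts: int = 5,
                       base_delay: float = 0.5, max_delay: float = 30.0,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call operation(), retrying on any exception with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        try:
            return operation()
        except Exception:
            attempt += 1
            if attempt >= max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # The upper bound doubles after every failure, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter spreads out retries from many clients.
            sleep(random.uniform(0, delay))
```

In real code you would catch only the exceptions your SDK marks as transient rather than every `Exception`, and the `sleep` parameter exists so the wait can be replaced in tests.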
You can also run more than one Service Bus namespace in different regions for geo-redundancy, and either load-balance messages across the two or keep one as a hot backup. This gets you around regional outages and keeps your service up when one data center is not meeting its local SLA.
You can find the SLA for Azure Service Bus here: legal/sla/service-bus/v1_0/
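As a rough illustration of why a second region helps, the arithmetic below combines the availability of redundant instances. It assumes the regions fail independently and that the client switches between them flawlessly, which is optimistic; the result is an estimate, not an SLA.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def combined_availability(*availabilities: float) -> float:
    """Availability of redundant instances, assuming independent failures.

    The combination is down only when every instance is down at once.
    """
    all_down = 1.0
    for availability in availabilities:
        all_down *= 1.0 - availability
    return 1.0 - all_down


def downtime_minutes_per_year(availability: float) -> float:
    """Expected minutes of downtime per (non-leap) year."""
    return (1.0 - availability) * MINUTES_PER_YEAR


single = 0.999
pair = combined_availability(single, single)
print(f"single namespace: {downtime_minutes_per_year(single):.1f} min/year")
print(f"two regions:      {downtime_minutes_per_year(pair):.2f} min/year")
```

Under those assumptions, two 99.9% namespaces together come to roughly 99.9999%, which is why a redundant pair is the usual route to a target such as 99.999%.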
For Service Bus Relays, we guarantee that at least 99.9% of the time,
properly configured applications will be able to establish a
connection to a deployed Relay. For Service Bus Queues and Topics, we
guarantee that at least 99.9% of the time, properly configured
applications will be able to send or receive messages or perform other
operations on a deployed Queue or Topic. For Service Bus Basic and
Standard Notification Hub tiers, we guarantee that at least 99.9% of
the time, properly configured applications will be able to send
notifications or perform registration management operations with
respect to a Notification Hub. For Event Hubs Basic and Standard
tiers, we guarantee that at least 99.9% of the time, properly
configured applications will be able to send or receive messages or
perform other operations on the Event Hub.
We've had Service Bus Relay up and running for 5+ years and have had one outage. It was an outage at the specific data center the relay was provisioned in, and it touched many services. After that we added redundancy by provisioning a secondary Service Bus Relay namespace in a different data center location. The reconfigured code checked connectivity on every connection attempt and swapped the primary and secondary connections when needed. We treated them as equals, so once we "failed over", that namespace became the primary.
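The swap-on-failure behaviour described above can be sketched as follows. This is not the poster's actual code; the class name and the `can_connect` probe are hypothetical, and the namespaces are represented by plain strings.

```python
from typing import Callable


class FailoverConnector:
    """Chooses between two namespaces that are treated as equals."""

    def __init__(self, primary: str, secondary: str,
                 can_connect: Callable[[str], bool]) -> None:
        self.primary = primary
        self.secondary = secondary
        self._can_connect = can_connect

    def connect(self) -> str:
        """Return the namespace to use, swapping roles if the primary is down."""
        if self._can_connect(self.primary):
            return self.primary
        if self._can_connect(self.secondary):
            # Fail over: the secondary becomes the new primary and stays so,
            # even after the old primary recovers.
            self.primary, self.secondary = self.secondary, self.primary
            return self.primary
        raise ConnectionError("neither namespace is reachable")
```

Because the roles are swapped rather than restored, there is no automatic failback, which matches treating both namespaces as equals.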
Service Bus now supports Geo-disaster recovery and Geo-replication at the namespace level.
https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-geo-dr