How to write to Azure storage when the primary region is down? - azure

Our application uses RA-GZRS for storage, which lets it read data from the secondary region when the primary is down, but it can't write to the secondary.
Is there a solution that lets the application both read from and write to storage when an Azure region goes down?

In Azure Storage, there can be only one region (primary) where you can write. The other region (secondary) will always be read only.
One possible solution is a manual failover, so that the secondary region of your account becomes the primary and you can write to it. However, be aware that manual failover comes with a lot of caveats; make sure you understand them.
You can read more about those things here: https://learn.microsoft.com/en-us/azure/storage/common/storage-initiate-account-failover?tabs=azure-portal#important-implications-of-account-failover.

Please go through this. To quote from the article:
If the primary region becomes unavailable, you can choose to fail over
to the secondary region. After the failover has completed, the
secondary region becomes the primary region, and you can again read
and write data. For more information on disaster recovery and to learn
how to fail over to the secondary region, see Disaster recovery and
storage account failover.
The tutorial here tells you how to build a highly available application that automatically switches between endpoints on a failure. It uses a circuit breaker pattern.
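As a rough illustration of the circuit breaker idea (not the code from the tutorial), here is a minimal Python sketch. The class name, the threshold and cool-down values, and the `read_primary` / `read_secondary` callables are all hypothetical stand-ins for whatever client calls your application makes. The only Azure-specific detail is that read-access accounts expose the secondary at `<account>-secondary.blob.core.windows.net`. Note that this only helps with reads; writes still require a failover.

```python
import time
from typing import Callable, Optional


def secondary_endpoint(account_name: str) -> str:
    """Blob read endpoint in the secondary region for RA-GRS / RA-GZRS accounts."""
    return f"https://{account_name}-secondary.blob.core.windows.net"


class ReadCircuitBreaker:
    """Hypothetical helper: route reads to the secondary after repeated primary failures."""

    def __init__(self, failure_threshold: int = 3, reset_after_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def _use_primary(self) -> bool:
        if self._opened_at is None:
            return True
        # After the cool-down, let a request probe the primary again.
        return self._clock() - self._opened_at >= self.reset_after_s

    def read(self, read_primary: Callable[[], bytes],
             read_secondary: Callable[[], bytes]) -> bytes:
        if self._use_primary():
            try:
                data = read_primary()
            except Exception:
                self._failures += 1
                if self._failures >= self.failure_threshold:
                    self._opened_at = self._clock()  # open (or re-open) the circuit
            else:
                self._failures = 0
                self._opened_at = None  # primary is healthy: close the circuit
                return data
        # Primary failed or the circuit is open: serve the read from the secondary.
        return read_secondary()
```

Every failed primary read falls back to the secondary immediately; once the threshold is reached, the primary is skipped entirely until the cool-down has passed.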

Related

Clarification on Azure SQL Database Backup plan (short term retention)

I am confused about the Azure SQL Database backup plan (short-term backup retention).
As far as I understand:
In the DTU purchasing model, there is no extra charge for backup storage; you only pay for the redundancy type (such as LRS or ZRS).
In the vCore purchasing model, you have to pay for backup storage.
Am I right?
Does that mean I will not have any backups if I do not subscribe to backup storage in vCore?
Further, in the Azure pricing calculator, under vCore with the General Purpose option, there are two redundancy drop-down options (I am not talking about the long-term retention plan). What is the difference between them?
Thanks.
I will not have any backups if I do not subscribe to backup storage in vCore?
Yes, in vCore, if you do not allocate a storage account for backups, you will not be able to perform backup operations, either manually or automatically. If you believe you do not need backups, then you might be a fool ;). Azure will maintain access to your database according to the standard SLAs, but the infrastructure will not provide a way for you to restore your database to a point in time; only backups can adequately do that for you. Storage costs are usually a very small component of your overall spend. Once the backup operation is complete you can download the backup for local storage and then clear the blob, making this aspect virtually cost-free, but you still need a storage account to complete the backup process at all.
In the Azure pricing calculator, under vCore with the General Purpose option, there are two redundancy drop-down options.
Are you referring to Compute Redundancy?
Zone redundancy for Azure SQL Database general purpose tier
The zone redundant configuration utilizes Azure Availability Zones to replicate databases across multiple physical locations within an Azure region. By selecting zone redundancy, you can make your serverless and provisioned general purpose single databases and elastic pools resilient to a much larger set of failures, including catastrophic datacenter outages, without any changes of the application logic. This configuration offers 99.995% availability SLA and RPO=0. For more information see general purpose service tier zone redundant availability.
In the other tiers, these redundancy modes are referred to as LRS (Locally Redundant) and ZRS (Zone Redundant). Think of this as your choice about what happens when your data centre is affected by some sort of geological or political event that takes the server cluster, pod or whole data centre offline.
Locally Redundant offers redundancy only within a geographically local area (often the same physical site). In general this protects against local hardware failures but not usually against scenarios that take the whole data centre offline. This is the minimal level of redundancy that Azure requires for their hardware management and maintenance plans.
Zone Redundant offers redundancy across multiple geographically independent zones but still within the same Azure Region. Each Azure availability zone is an individual physical location with its own independent networking, power, and cooling. ZRS provides a minimum of 99.9999999999% durability for objects during a given year.
There is a third type of redundancy offered in higher tiers: Geo-Redundant Storage (GRS). This keeps redundant copies in the primary region and also maintains additional replicas in a secondary Azure region.
In the case of Azure SQL DB, these terms for Compute (so the actual server and CPU) have almost identical implications to those of Storage Redundancy. With regard to the available options, the pricing calculator is pretty well documented for everything else; use the info tips for quick info and go to the reference pages for the extended information.
The specifics are listed here: Azure Storage redundancy. In short, redundancy in Azure is achieved via replication. That means that an entire workable and usable version of your database is maintained so that, in the event of failure, the replica takes the load.
A special feature of replication is that you can actively use the replicated instance for read-only workloads. This gives us as developers and architects some interesting performance opportunities out of the box, such as moving complex reporting and analytic workloads away from the transactional data manipulations. Traditionally this was a non-trivial configuration.
The RA prefix on redundancy options is an acronym for Read Access.
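To put the 99.995% availability SLA quoted above into perspective, an availability percentage can be converted into the maximum downtime it permits. The snippet below is simple arithmetic, not an Azure API; the function name is made up for illustration, and it assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def max_downtime_minutes_per_year(sla_percent: float) -> float:
    """Maximum yearly downtime (in minutes) that an availability SLA still allows."""
    return (1 - sla_percent / 100) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for sla in (99.99, 99.995):
        minutes = max_downtime_minutes_per_year(sla)
        print(f"{sla}% availability allows about {minutes:.1f} minutes of downtime per year")
```

For 99.995% this works out to roughly 26 minutes per year, compared with roughly 53 minutes for 99.99%.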

Azure Cosmos DB - How to bring the Primary Region Offline and force Automatic failover?

I have provisioned Cosmos DB with the following configurations
West US - Primary => Read & Write
East US - Secondary => Read
I want to prove that when the primary goes offline the secondary will become the primary and function.
I could successfully perform the manual failover; however, I don't know how to take the primary database offline. (Note: I could delete only the secondary replica.)
You cannot take a region "offline", and removing a region is not possible while it is functioning as the primary write region. If you want to test failover, use the manual failover. This is sufficient for BCDR testing scenarios.
Just a supplement here. As @MarkBrown indicated, you can use manual failover, and there is official documentation for it.
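A manual failover boils down to changing the failover priorities of the account's regions so that the desired write region has priority 0. As a hedged sketch, the helper below only builds the list of policies for the setup in the question. The field names `failoverPolicies`, `locationName` and `failoverPriority` are written from memory of the management API's failover priority change operation and should be checked against the current documentation; the function itself is hypothetical and does not call Azure.

```python
from typing import Dict, List


def failover_policies(regions_in_priority_order: List[str]) -> Dict[str, list]:
    """Build a failover policy payload; the first region gets priority 0 (write region).

    Assumption: field names follow the Cosmos DB management API's
    failover priority change request body. Verify before use.
    """
    if len(set(regions_in_priority_order)) != len(regions_in_priority_order):
        raise ValueError("each region may appear only once")
    return {
        "failoverPolicies": [
            {"locationName": region, "failoverPriority": priority}
            for priority, region in enumerate(regions_in_priority_order)
        ]
    }


if __name__ == "__main__":
    # Promote East US to write region; West US becomes the read region.
    print(failover_policies(["East US", "West US"]))
```

Swapping the order of the list and submitting it again would fail back to West US once the test is finished.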

Configure disaster recovery and automatic failover for Azure Key vault?

We have our Azure Key Vault provisioned in East US and our hot standby region is West US.
Does it support geo-replication?
How do I configure Azure Key Vault to support disaster recovery with automatic failover? Would it impact the connection string?
As per the documentation, it's provided by default:
"If individual components within the key vault service fail, alternate
components within the region step in to serve your request to make
sure that there is no degradation of functionality. You don't need to
take any action to start this process, it happens automatically and
will be transparent to you.
In the rare event that an entire Azure region is unavailable, the
requests that you make of Azure Key Vault in that region are
automatically routed (failed over) to a secondary region except in the
case of the Brazil South region. When the primary region is available
again, requests are routed back (failed back) to the primary region.
Again, you don't need to take any action because this happens
automatically."
You can read the whole documentation here: https://learn.microsoft.com/en-us/azure/key-vault/general/disaster-recovery-guidance

Convert Premium To Standard

We have a VM using premium managed disks that is also replicated to another Azure data center using Azure Site Recovery. I am aware of how to convert the premium disks to standard by deallocating the VM and changing the disk type. However, I suspect I will need to stop and remove the disaster recovery replication and then reinitialize VM replication, resulting in the loss of all previous recovery points.
Does anyone know for sure, and what would the process be to convert the disks given the VM replication?
Thx.
You need to reach out to the right Azure Support Team. The Azure Site Recovery support team should be able to give you the correct information; they handle disaster recovery replication scenarios.
You want to make sure you are getting vetted information on issues like this.

How do I make my Windows Azure application resistant to Azure datacenter catastrophic event?

AFAIK Amazon AWS offers so-called "regions" and "availability zones" to mitigate the risk of a partial or complete datacenter outage. It looks like if I have copies of my application in two "regions" and one "region" goes down, my application can continue working as if nothing happened.
Is there something like that with Windows Azure? How do I address risk of datacenter catastrophic outage with Windows Azure?
Within a single data center, your Windows Azure application has the following benefits:
Going beyond one compute instance, your VMs are divided into fault domains, across different physical areas. This way, even if an entire server rack went down, you'd still have compute running somewhere else.
With Windows Azure Storage and SQL Azure, storage is triple replicated. This is not eventual replication - when a write call returns, at least one replica has been written to.
Ok, that's the easy stuff. What if a data center disappears? Here are the features that will help you build DR into your application:
For SQL Azure, you can set up Data Sync. This facility synchronizes your SQL Azure database with either another SQL Azure database (presumably in another data center), or an on-premises SQL Server database. More info here. Since this feature is still considered a Preview feature, you have to go here to set it up.
For Azure storage (tables, blobs), you'll need to handle replication to a second data center, as there is no built-in facility today. This can be done with, say, a background task that pulls data every hour and copies it to a storage account somewhere else. EDIT: Per Ryan's answer, there's data geo-replication for blobs and tables. HOWEVER: Aside from a mention in this blog post in December, and possibly at PDC, this is not live.
For Compute availability, you can set up Traffic Manager to load-balance across data centers. This feature is currently in CTP - visit the Beta area of the Windows Azure portal to sign up.
Remember that, with DR, whether in the cloud or on-premises, there are additional costs (such as bandwidth between data centers, storage costs for duplicate data in a secondary data center, and Compute instances in additional data centers).
Just like with on-premises environments, DR needs to be carefully thought out and implemented.
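The background copy task mentioned above for storage can be sketched as an incremental copy that only transfers items modified since the last run. This is an illustration of the idea only: the dictionaries are hypothetical in-memory stand-ins for two storage accounts, and `copy_changed` is not part of any Azure SDK.

```python
from typing import Dict, Tuple

# Hypothetical stand-in for a storage account: name -> (last-modified timestamp, content)
Blob = Tuple[float, bytes]


def copy_changed(source: Dict[str, Blob], target: Dict[str, Blob], since: float) -> int:
    """Copy every item modified after `since` from source to target; return the count."""
    copied = 0
    for name, (modified, content) in source.items():
        if modified > since:
            target[name] = (modified, content)
            copied += 1
    return copied


if __name__ == "__main__":
    primary = {"a.txt": (100.0, b"old"), "b.txt": (200.0, b"new")}
    standby: Dict[str, Blob] = {}
    # Pretend the previous run finished at t=150, so only b.txt is copied.
    print(copy_changed(primary, standby, since=150.0), sorted(standby))
```

A scheduler would call this once per interval and remember the start time of each run as the next `since` value. Deletions are not propagated in this sketch.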
David's answer is pretty good, but one piece is incorrect. For Windows Azure blobs and tables, your data is actually geographically replicated today between sub-regions (e.g. North and South US). This is an async process that has a target of about a 10 min lag or so. This process is also out of your control and is purely for a data center loss. In total, your data is replicated 6 times in 2 different data centers when you use Windows Azure blobs and tables (impressive, no?).
If a data center was lost, they would flip over your DNS for blob and table storage to the other sub-region and your account would appear online again. This is true only for blobs and tables (not queues, not SQL Azure, etc).
So, for a true disaster recovery, you could use Data Sync for SQL Azure and Traffic Manager for compute (assuming you run a hot standby in another sub-region). If a datacenter was lost, Traffic Manager would route to the new sub-region and you would find your data there as well.
The one failure that you didn't account for is the possibility of an error being replicated across data centers. In that scenario, you may want to consider running Azure PaaS as part of the HP Cloud offering in either a load-balanced or failover scenario.
