Does Azure Synapse Analytics support geo-redundancy like Storage Accounts and Key Vault? If not, how do I implement high availability for Azure Synapse Analytics? I have the following components as part of my Azure Synapse Analytics solution:
SQL Dedicated Pool
SQL Serverless Pool
Spark Pool
Storage Account (ADLS)
Azure DevOps Git Repo
First, designing and documenting a Disaster Recovery plan is a project unto itself. I've been working on one part-time for several months for a client of mine who uses Synapse.
The first task is to define your Recovery Time Objective (RTO, meaning how long before your solution is back up in the event of a disaster) and your Recovery Point Objective (RPO, meaning how many minutes or hours of data you can afford to lose… and with analytics solutions you can usually reload from the source to catch up). If your RTO and RPO are low for an analytics solution (like 2 hours), then you probably need to spin up parallel environments in another region and load data to both environments in parallel. If your RTO and RPO are typical for an analytics solution (24-48 hours), then you can probably survive by ensuring backups are geo-redundant and restoring in the event of an outage. I would recommend you preconfigure your Synapse workspace and other infrastructure before the outage unless you have a trusted infrastructure-as-code solution. If your RPO and RTO are long (like 7 days), it's extremely unlikely an Azure service or region is going to be down for that long.
ADLS supports RA-GRS redundancy, so you could read all the files from the secondary endpoint in its paired region and copy them to another ADLS account in the secondary region. Unfortunately, ADLS accounts don't yet support user-initiated failover.
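As an illustration, here is a minimal sketch of that copy using the azure-storage-blob Python SDK. The account and container names are placeholders, and it assumes RA-GRS is enabled (the read-only secondary endpoint is the account name with a -secondary suffix):

from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

credential = DefaultAzureCredential()

# RA-GRS exposes a read-only secondary endpoint at "<account>-secondary".
secondary = BlobServiceClient(
    account_url="https://mystorageaccount-secondary.blob.core.windows.net",
    credential=credential,
)
# Hypothetical target account in the paired region.
target = BlobServiceClient(
    account_url="https://mydrstorageaccount.blob.core.windows.net",
    credential=credential,
)

src = secondary.get_container_client("datalake")
dst = target.get_container_client("datalake")

for blob in src.list_blobs():
    data = src.download_blob(blob.name).readall()
    dst.upload_blob(name=blob.name, data=data, overwrite=True)

For a real data lake you'd use a tool like AzCopy or a pipeline rather than downloading each blob through the client, but the secondary-endpoint convention is the important part.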
Dedicated SQL pools support built-in geo-redundant backups taken once a day, but you can't control when they are taken. If that isn't acceptable, then you need to proactively create a user-defined restore point, restore it cross-region, and pause the restored SQL pool.
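If you automate that, the restore point can be created through the Azure management REST API. The sketch below is a hedged example using Python and azure-identity; the resource names and api-version are placeholders to verify against the current Microsoft.Synapse REST reference:

import requests
from azure.identity import DefaultAzureCredential

token = DefaultAzureCredential().get_token("https://management.azure.com/.default")

# Hypothetical resource IDs -- substitute your own.
url = (
    "https://management.azure.com/subscriptions/<subscription-id>"
    "/resourceGroups/<resource-group>/providers/Microsoft.Synapse"
    "/workspaces/<workspace>/sqlPools/<sql-pool>/restorePoints"
    "?api-version=2021-06-01"
)
resp = requests.post(
    url,
    headers={"Authorization": f"Bearer {token.token}"},
    json={"restorePointLabel": "pre-failover"},
)
resp.raise_for_status()  # long-running operation; poll the returned status URL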
Synapse serverless SQL pools have no storage of their own, so ensure you have a backup of the schema (views, permissions, external data sources, external tables, etc.) in source control or somewhere similar. The data itself will fail over with ADLS.
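One low-tech way to capture that schema, assuming you can reach the serverless endpoint with pyodbc, is to script the module definitions out of sys.sql_modules and commit the files; the connection values here are placeholders:

import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=<workspace>-ondemand.sql.azuresynapse.net;"
    "DATABASE=<serverless-db>;"
    "Authentication=ActiveDirectoryInteractive;"
)

rows = conn.execute(
    """
    SELECT OBJECT_SCHEMA_NAME(m.object_id) AS schema_name,
           OBJECT_NAME(m.object_id)        AS object_name,
           m.definition
    FROM sys.sql_modules AS m
    """
).fetchall()

# One .sql file per object, ready to commit to source control.
for schema_name, object_name, definition in rows:
    with open(f"{schema_name}.{object_name}.sql", "w", encoding="utf-8") as f:
        f.write(definition)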
For Spark pools, ensure your notebook artifacts are in source control so you can run them in a different Synapse workspace in another region when needed. Document your cluster configs.
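To verify nothing lives only in the workspace, you could enumerate the published notebooks and compare them against your repo. This is a hedged sketch assuming the azure-synapse-artifacts Python package; the endpoint is a placeholder:

from azure.identity import DefaultAzureCredential
from azure.synapse.artifacts import ArtifactsClient

client = ArtifactsClient(
    credential=DefaultAzureCredential(),
    endpoint="https://<workspace>.dev.azuresynapse.net",
)

# Compare this list against the notebooks committed to Git.
for nb in client.notebook.get_notebooks_by_workspace():
    print(nb.name)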
Write out a disaster recovery playbook and do a DR drill periodically (once a quarter or once a year).
Here is another author’s description of the DR plan for Synapse.
Related
I have used several Azure services to upload data from on-premises to Azure SQL DW for Power BI.
SQL Server (On-Prem) -> Azure Data Factory (SSIS IR, SSIS in Azure SQL Database) -> Azure SQL Database
2 Azure services used
However, we find that the data size is growing much larger than the platform was designed for.
We are planning to change to Azure Synapse.
But based on the Microsoft documentation, it seems that Data Factory (Preview) does not come with an SSIS IR.
Here is what comes to mind:
SQL Server (On-Prem) -> Azure Data Factory (SSIS IR, SSIS in Azure SQL Database) -> Azure Synapse
3 Azure services used
I wonder whether there is a better way to use Synapse with SSIS.
Many thanks.
Azure SQL Database now has a Hyperscale service tier.
The Hyperscale service tier in Azure SQL Database provides the following additional capabilities:
- Support for up to 100 TB of database size
- Nearly instantaneous database backups (based on file snapshots stored in Azure Blob storage) regardless of size, with no IO impact on compute resources
- Fast database restores (based on file snapshots) in minutes rather than hours or days (not a size-of-data operation)
- Higher overall performance due to higher log throughput and faster transaction commit times regardless of data volumes
- Rapid scale out: you can provision one or more read-only nodes for offloading your read workload and for use as hot standbys
- Rapid scale up: you can, in constant time, scale up your compute resources to accommodate heavy workloads when needed, and then scale the compute resources back down when not needed
Since Azure Synapse Analytics does not support an SSIS IR, I think scaling up the Azure SQL Database service tier is a good choice for you.
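For reference, the tier change itself is a single T-SQL statement. This is a hedged sketch run through pyodbc; the server, database, and service objective are placeholders, and note that migrating to Hyperscale was a one-way operation at the time of writing:

import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=<server>.database.windows.net;"
    "DATABASE=master;"
    "Authentication=ActiveDirectoryInteractive;",
    autocommit=True,  # ALTER DATABASE cannot run inside a transaction
)
conn.execute(
    "ALTER DATABASE [<database>] "
    "MODIFY (EDITION = 'Hyperscale', SERVICE_OBJECTIVE = 'HS_Gen5_4');"
)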
I just added the Azure Data Factory service to my subscription. During setup I was able to select only one region; what happens if a disaster happens in that region? How does ADF guarantee high availability?
Do we need to wait for recovery, or is there a setup similar to ADLS Gen2 (GRS & RA-GRS)?
I could find no statements about disaster recovery in the official ADF documentation. Based on my research, ADF only provides a cloud-based data integration workflow; DR is actually determined by the data stores ADF connects to. Here are some clues for your reference:
1. The statement about the Location option shown when you create an ADF instance.
2. High availability for the Azure Integration Runtime is affected by the DIU setting (allocation of compute resources): https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-performance-features#data-integration-units
3. High availability for the Self-Hosted Integration Runtime improves if you create multiple nodes in the on-premises environment: https://learn.microsoft.com/en-us/azure/data-factory/create-self-hosted-integration-runtime#high-availability-and-scalability
I want to confirm our understanding of how our Azure SQL databases are being backed up to enable point-in-time restore. We have not currently configured geo-replication to make the database available in another region; we may in the future as some data analysis is done. But my understanding is that the database is still being backed up to a geo-redundant location, so I could do a geo-restore if there were an issue with the data center that houses my SQL database. Is that correct, or do I need to enable geo-replication and pay for a second database in order to have a disaster recovery option if the data center had an issue?
To clarify further: I think this article states what I'm saying in the Geo-Restore section.
https://azure.microsoft.com/en-us/documentation/articles/sql-database-business-continuity/
Thanks
Yes, all databases have a geo-replicated copy for disaster recovery purposes. For more details, please see the following: https://azure.microsoft.com/en-us/blog/azure-sql-database-geo-restore/
Geo-restore uses the same technology as point-in-time restore with one important difference. It restores the database from a copy of the most recent daily backup in geo-replicated blob storage (RA-GRS). For each active database, the service maintains a backup chain that includes a weekly full backup, multiple daily differential backups, and transaction logs saved every 5 minutes. These blobs are geo-replicated; this guarantees that daily backups are available even after a massive failure in the primary region.
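If you ever need to exercise that geo-restore, it can be driven programmatically. Below is a hedged Python sketch using the management REST API's "Recovery" create mode; all resource names and the api-version are placeholders to check against the current Microsoft.Sql REST reference:

import requests
from azure.identity import DefaultAzureCredential

token = DefaultAzureCredential().get_token("https://management.azure.com/.default")

subscription = "<subscription-id>"
# Resource ID of the geo-replicated (recoverable) backup of the source database.
source = (
    f"/subscriptions/{subscription}/resourceGroups/<source-rg>"
    "/providers/Microsoft.Sql/servers/<source-server>"
    "/recoverabledatabases/<database>"
)
url = (
    f"https://management.azure.com/subscriptions/{subscription}"
    "/resourceGroups/<target-rg>/providers/Microsoft.Sql"
    "/servers/<target-server>/databases/<database>"
    "?api-version=2021-11-01"
)
resp = requests.put(
    url,
    headers={"Authorization": f"Bearer {token.token}"},
    json={
        "location": "<target-region>",
        "properties": {"createMode": "Recovery", "sourceDatabaseId": source},
    },
)
resp.raise_for_status()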
Yes, Azure SQL databases are automatically backed up to a different Azure data center using geo-replication. This is an automatic feature of Azure SQL that is baked into the service offering.
Here's a blog post with further information about Azure SQL Data Replication:
https://azure.microsoft.com/en-us/blog/azure-sql-database-standard-geo-replication/
I'm building a web app using Azure & SQL Azure. I'm setting it up so each organization has their own database. Low to moderate traffic per customer organization.
I'm thinking about using SQL Azure Data Sync as part of a failover/backup plan, so that if SQL Azure goes down, my app can switch over to my on-premises SQL Server (read-only mode).
I would also be able to do all of my backups on-prem, instead of in the cloud, which could incur costs.
One issue may be trying to data-sync multiple databases to my on-prem SQL Server (not sure what the limit is on the number of databases that can be synced to one server).
Bandwidth may be an issue, but I'll probably only sync daily.
Does anyone see any other problems with this approach?
Data Sync is OK, but it may or may not be good for your particular DR plan since it's not a transactional sync model.
One option to consider is making a database copy:
-- Creates a transactionally consistent copy of the source database
CREATE DATABASE destination_database_name
    AS COPY OF [source_server_name.]source_database_name;
Then you can create a backup from this copy, store the backup in blob storage, and (optionally) delete the database copy. While this does add an additional cost due to a second database being live, you can keep that cost to a minimum if you delete the database instance after creating a backup and storing to blob storage (remember that databases are amortized daily).
Since your backups would then be in blob storage, you could keep multiple backups in blob storage, and pull a backup to your on-premises server if needed.
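If you script this, note that CREATE DATABASE ... AS COPY OF returns before the copy finishes. Here is a hedged sketch of waiting for the copy to come online before exporting; the connection values and copy name are placeholders:

import time
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=<server>.database.windows.net;"
    "DATABASE=master;"
    "UID=<user>;PWD=<password>;",
    autocommit=True,
)

# The copy appears in sys.databases and stays in the COPYING state until done.
while True:
    row = conn.execute(
        "SELECT state_desc FROM sys.databases "
        "WHERE name = 'destination_database_name'"
    ).fetchone()
    if row is not None and row.state_desc == "ONLINE":
        print("Copy is online; safe to back it up and then drop it.")
        break
    time.sleep(30)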
I know Azure will geo-replicate a copy of the current storage account to another location.
My question is: can I access the other location programmatically, even just read-only?
I ask because this would allow me to build another deployment in a different geo-location for performance and disaster-proofing, like what Azure does. With the current setup, if I use the same source storage from a different geo-location, I have to pay extra bandwidth costs.
You can only access your storage account by its primary name. In the event of failover, that name will be mapped to the alternate datacenter. You cannot access the failover storage directly, nor can you choose when to trigger a failover. For a multi-site setup as you described, you'd need to duplicate your data (which would then add the cost of storage in datacenter #2). This does give you ultimate flexibility in your DR and performance planning, but at an added cost of storage and bandwidth (egress-only).
Last week the storage team announced read-only access to the failover storage: Windows Azure Storage Redundancy Options and Read Access Geo Redundant Storage.
This means you can now deploy your application in a different datacenter which can be used for "full" failover (meaning that the storage will also be available there). Even if it's only read-only, your application will still be online - but simply in "degraded" mode.
The steps on how you can implement this with traffic manager are described here: http://fabriccontroller.net/blog/posts/adding-failover-to-your-application-with-read-access-geo-redundant-storage-and-the-windows-azure-traffic-manager/
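As a sketch of that degraded-mode pattern with today's azure-storage-blob Python SDK (account and container names are placeholders): try the primary endpoint first, and fall back to the read-only RA-GRS secondary if the primary region is unreachable:

from azure.core.exceptions import AzureError
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

credential = DefaultAzureCredential()
ENDPOINTS = [
    "https://mystorageaccount.blob.core.windows.net",            # primary
    "https://mystorageaccount-secondary.blob.core.windows.net",  # RA-GRS, read-only
]

def read_blob(container: str, name: str) -> bytes:
    last_error = None
    for endpoint in ENDPOINTS:
        try:
            client = BlobServiceClient(account_url=endpoint, credential=credential)
            return client.get_blob_client(container, name).download_blob().readall()
        except AzureError as exc:
            last_error = exc  # endpoint unreachable; try the next one
    raise last_error

config = read_blob("assets", "site-config.json")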