SQL Azure Data Sync vs Standard Geo Replication - azure

What is the difference between Data Sync and Standard Geo Replication on SQL Azure databases?
I understand that Active Geo Replication provides the ability to connect to a replicated database whereas Standard does not allow connections. However, how does Data Sync differ? I know it's not immediate replication, but I need to point my BI software at a replica and am debating which configuration to use for replication and disaster recovery.

Data Sync lets you specify what to sync (e.g., which tables) and the sync interval (e.g., 5 minutes, 15 minutes). Replicas are read/write, and you specify how conflicts are resolved (e.g., hub wins, client wins). The databases can also exist independently of one another.
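To make the conflict policies concrete, here is a minimal toy model in Python. It is not the Data Sync API: the `RowVersion` type, the `resolve` function, and the policy strings `hub_wins` / `client_wins` are all hypothetical names made up for illustration, and it assumes each side simply knows whether it changed a row since the last sync.

```python
from dataclasses import dataclass


@dataclass
class RowVersion:
    value: str
    changed: bool  # True if this side modified the row since the last sync


def resolve(hub: RowVersion, member: RowVersion, policy: str) -> str:
    """Return the value both sides hold after one sync round (toy model)."""
    if policy not in ("hub_wins", "client_wins"):
        raise ValueError(f"unknown policy: {policy}")
    if hub.changed and member.changed:
        # Genuine conflict: both sides edited the same row.
        return hub.value if policy == "hub_wins" else member.value
    if member.changed:
        # Only the member changed the row, so its value flows to the hub.
        return member.value
    # Only the hub changed the row, or nobody did.
    return hub.value
```

The policy only matters when both sides changed the same row; otherwise the single change is simply propagated.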

Related

Azure database for mysql - cross region read replica

I'm using Azure Database for MySQL - Flexible Server and we would like to have disaster recovery in another Azure region. According to the documentation below, cross-region read replicas are not supported. My question is: what is the correct way to set up cross-region disaster recovery?
https://learn.microsoft.com/en-us/azure/mysql/flexible-server/concepts-read-replicas
-Suresh
Cross-region read replicas for MySQL Flexible Server are part of our backlog and will be coming soon. Meanwhile, you can leverage data-in replication to achieve the same.
https://techcommunity.microsoft.com/t5/azure-database-for-mysql-blog/cross-region-replication-using-data-in-replication-with-azure/ba-p/3563231
Not all regions support replication for Azure MySQL servers, so the best and fastest way is to create another MySQL server in a region that does support it.
The following link shows how to set up the replication:
https://learn.microsoft.com/en-us/cli/azure/sql/db/replica?view=azure-cli-latest
Try creating the server in westus, for example; as far as I remember, it supports replication to another region.

Is Azure SQL Database a Distributed SQL database?

I am trying to understand the differences between the new CockroachDB and other distributed SQL databases as compared to a cloud-managed database like Azure SQL Database.
It seems there is no difference in the use cases between them:
Like various NoSQL databases, SQL (in general) allows partitioning keys.
I can add cores in Azure to increase the performance as needed, I can also switch to Hyperscale if I have an elastic workload.
I can have read replication across multiple nodes over multiple availability zones (geo-locations)
I can configure data replication in Azure SQL Database too.
It seems to me that a cloud SQL database covers all the use cases the newer distributed databases cover, so why would I want to use a newer product?
Isn't Azure SQL Database basically a distributed database server?
Am I missing something?
Is Azure SQL Server a Distributed SQL database?
No.
Like various NoSQL databases, SQL (in general) allows partitioning keys.
Partitioning in NoSQL databases like Cassandra (and Azure Table Storage) is about distributing partitions to physically distinct nodes, and requires rows to have an explicitly set partition-key value.
Cassandra nodes are physically different machines that can run independently, which gives it excellent resiliency.
Partitioning in SQL Server, Azure SQL, and Azure SQL Managed Instance is about dividing data up into row-groups that exist in the same server for performance, not resiliency.
On on-prem MS SQL Server, these row-groups (well, partitions) can exist in different FILEGROUPs, which means they can exist in different storage volumes to avoid IO bottlenecks, but Azure SQL does not support multiple FILEGROUPs.
The benefits of implementing partitioning, including on Azure SQL, are documented online - and the article explains how it's about performance, not resilience.
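As a rough illustration of the first kind of partitioning (a partition key deciding which physical node owns a row), here is a simplified Python sketch. It is an assumption-laden toy: the `node_for` helper and the node names are hypothetical, and it uses a plain hash-modulo scheme, whereas Cassandra itself uses a token ring with consistent hashing.

```python
import hashlib


def node_for(partition_key: str, nodes: list[str]) -> str:
    """Pick the node that owns a partition key (simplified hash-modulo sketch)."""
    if not nodes:
        raise ValueError("at least one node is required")
    # Use a stable hash so every client maps the same key to the same node.
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Hypothetical node names, for illustration only.
cluster = ["node-a", "node-b", "node-c"]
owner = node_for("customer-42", cluster)
```

In SQL Server style partitioning there is no equivalent of `cluster`: every partition lives on the same server, so the partition function only decides which row-group a row lands in.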
I can add cores in Azure to increase the performance as needed, I can also switch to Hyperscale if I have an elastic workload.
This fact has absolutely nothing to do with distributed databases.
I can have read replication across multiple nodes over multiple availability zones (geo-locations).
I can configure data replication in Azure SQL Database too.
Replication isn't the same thing as a true distributed database:
In Cassandra and other distributed databases, all clients can connect to all nodes and accomplish the same tasks; and you can arbitrarily add and remove nodes while the system is running.
In SQL Server and Azure SQL's replication feature, the replica is strictly a "secondary" that is subordinate to your primary server.
Clients can connect to either the secondary or the primary, but the secondary server can only perform read-only queries, whereas if a client wants to do DML (INSERT/UPDATE/DELETE/MERGE) or DDL (CREATE/ALTER) then the client must connect to the primary server.
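The primary/secondary split described above can be sketched as a small routing helper in Python. This is a deliberately naive illustration: `pick_endpoint` and the endpoint names are hypothetical, and it only inspects the first keyword of the statement. Anything that is not a plain SELECT is sent to the primary, which is the safe default.

```python
def pick_endpoint(sql: str, primary: str, secondary: str) -> str:
    """Route a statement: plain SELECTs may use the read-only secondary,
    everything else (DML, DDL, anything unrecognised) goes to the primary."""
    words = sql.lstrip().split(None, 1)
    first = words[0].upper() if words else ""
    return secondary if first == "SELECT" else primary


# Hypothetical endpoint names, for illustration only.
PRIMARY = "primary.example.invalid"
SECONDARY = "secondary.example.invalid"

read_target = pick_endpoint("SELECT * FROM Orders", PRIMARY, SECONDARY)
write_target = pick_endpoint("UPDATE Orders SET Paid = 1", PRIMARY, SECONDARY)
```

A truly distributed database would not need this helper at all, because any node could accept both kinds of statement.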
It seems to me that a cloud SQL database covers all the use cases the newer distributed databases cover, so why would I want to use a newer product?
It can't: because Azure SQL is not a distributed database it cannot allow any client to read and write to any node or endpoint and have that change replicated to all other nodes (using an eventual consistency model). Instead, Azure SQL requires writes to be performed by the single primary "server".
Note that an Azure SQL "server" or logical server is largely an abstraction that hides what Azure SQL really is: a distinct build of SQL Server's engine that runs in a high-availability Azure Service Fabric environment (which is how cores/RAM can be added and removed while it's running and provides for some kind of local resilience against hardware failure) in a single Azure datacenter.

Azure SQL Read-Scale Out

I have an Azure SQL Database in the WEST-US region and it is geo-replicated to the EAST-US region. Is it possible to enable read scale-out only for the geo-replicated database? I have a tremendous amount of BI load on the secondary region and really want to leverage the read scale-out feature only in the secondary region. All BI ETLs point directly to the secondary endpoint, and I would like to optimize the BI workload with the read scale-out feature.
I found the official document here. However, it isn't clear to me whether I can enable it only for the geo-replicated database.
It doesn't really matter. You don't have to use the local HA readable secondary even if it is enabled. You control this with the read-intent setting (ApplicationIntent=ReadOnly) in your connection string. (It is already there for Premium and vCore databases, and it's just a question of whether you want to enable it.) So if you enable it for both and only use it on the geo-secondary, that works just fine.
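As a sketch of what that looks like in practice, the Python helper below builds an ODBC-style connection string and adds `ApplicationIntent=ReadOnly` only when the caller asks for read intent. The `build_conn_str` function and the server and database names are hypothetical, authentication settings are deliberately left out, and the driver name assumes ODBC Driver 18 for SQL Server is installed.

```python
def build_conn_str(server: str, database: str, read_only: bool) -> str:
    """Build an ODBC connection string; credentials are intentionally omitted."""
    parts = [
        "Driver={ODBC Driver 18 for SQL Server}",
        f"Server=tcp:{server},1433",
        f"Database={database}",
        "Encrypt=yes",
    ]
    if read_only:
        # Read intent asks the service to route the session to a readable replica.
        parts.append("ApplicationIntent=ReadOnly")
    return ";".join(parts) + ";"


# Hypothetical names: BI jobs against the geo-secondary opt in to read intent.
bi_conn = build_conn_str("my-geo-secondary.database.windows.net", "SalesDb", read_only=True)
app_conn = build_conn_str("my-primary.database.windows.net", "SalesDb", read_only=False)
```

Connections that omit the setting keep going to the read-write replica, so only the BI connection strings need to change.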

SQL Azure and CDN

What is the best way to limit latency for SQL Azure in global applications?
My application uses SQL Azure, and I would like to know whether it is possible to connect to a SQL Azure database close to users, based on their network location.
So logically I would need a SQL Azure database with global replication, but not geo-replication, since each copy would need to serve as a master rather than a secondary.
Thank you in advance.
You may want to try Azure Cosmos DB to distribute data globally and obtain low latency, as explained in this article and this documentation.
If you replicate data using SQL Data Sync with Azure SQL Database, take paired regions into consideration, which may reduce latency. With SQL Data Sync, you define a hub database and any number of member databases in other regions, and data can be synced in both directions between the hub and any member database.

Is SQL Azure database backed up across datacenters by default?

I want to confirm our understanding of how our Azure SQL databases are being backed up to enable point-in-time restore. We have not currently configured geo-replication to have the database available in another region. We may in the future, once some data analysis is done. But my understanding is that the database is still being backed up to a geo-redundant location, so I could do a geo-restore if there was an issue with the data center that houses my SQL database. Is that correct, or do I need to enable geo-replication and pay for a second database in order to have a disaster recovery option if the data center had an issue?
To clarify further: I think this article states what I'm saying in the Geo-Restore section.
https://azure.microsoft.com/en-us/documentation/articles/sql-database-business-continuity/
Thanks
Yes, all databases have a geo-replicated copy for disaster recovery purposes. For more details, please see the following: https://azure.microsoft.com/en-us/blog/azure-sql-database-geo-restore/
Geo-restore uses the same technology as point in time restore with one
important difference. It restores the database from a copy of the most
recent daily backup in geo-replicated blob storage (RA-GRS). For each
active database, the service maintains a backup chain that includes a
weekly full backup, multiple daily differential backups, and
transaction logs saved every 5 minutes. These blobs are geo-replicated,
which guarantees that daily backups are available even after a massive
failure in the primary region.
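To illustrate how a chain of full, differential, and log backups fits together for a restore, here is a toy Python sketch. It is not how the service is implemented: the `restore_plan` function and the tuple layout are hypothetical, and it simplifies by restoring only up to the last log backup taken at or before the target time.

```python
from datetime import datetime


def restore_plan(backups, target):
    """backups: list of (kind, taken_at) with kind in {"full", "diff", "log"}.
    Returns the backups to apply, in order, to reach the target time."""
    usable = sorted((b for b in backups if b[1] <= target), key=lambda b: b[1])
    fulls = [b for b in usable if b[0] == "full"]
    if not fulls:
        raise ValueError("no full backup at or before the target time")
    full = fulls[-1]
    # Only the newest differential taken after that full backup is needed.
    diffs = [b for b in usable if b[0] == "diff" and b[1] > full[1]]
    base = diffs[-1] if diffs else full
    # Then every log backup taken after the base, in chronological order.
    logs = [b for b in usable if b[0] == "log" and b[1] > base[1]]
    plan = [full]
    if diffs:
        plan.append(diffs[-1])
    return plan + logs


# Hypothetical timestamps, for illustration only.
chain = [
    ("full", datetime(2024, 1, 7, 0, 0)),
    ("log", datetime(2024, 1, 7, 12, 0)),
    ("diff", datetime(2024, 1, 8, 0, 0)),
    ("log", datetime(2024, 1, 8, 0, 5)),
    ("log", datetime(2024, 1, 8, 0, 10)),
]
plan = restore_plan(chain, datetime(2024, 1, 8, 0, 7))
```

Here `plan` contains the full backup, the differential from the following midnight, and the single log backup taken at 00:05.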
Yes, Azure SQL Databases are automatically backed up to a different Azure data center using Geo-Replication. This is an automatic feature of Azure SQL that is baked into the service offering.
Here's a blog post with further information about Azure SQL Data Replication:
https://azure.microsoft.com/en-us/blog/azure-sql-database-standard-geo-replication/
