I have a requirement to compare data between two tables in two different Azure SQL databases. Both databases have the same schema, so I only need to compare the data. What options do I have, given that I can't run cross-database queries in Azure SQL? Any help is appreciated!
I would first try the tablediff utility, perhaps running on an Azure VM.
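If tablediff is not an option, another route is to pull both tables into application code and diff them by primary key. The following is only a minimal sketch, assuming the tables fit in memory and have a single-column key. In-memory `sqlite3` databases stand in for the two Azure SQL databases (against Azure SQL you would open the connections with a driver such as pyodbc instead), and the `customers` table and the helper names are hypothetical.

```python
import sqlite3

def fetch_rows(conn, table, key):
    """Return {key_value: full_row_tuple} for every row in the table."""
    cur = conn.execute(f"SELECT * FROM {table}")
    cols = [d[0] for d in cur.description]
    k = cols.index(key)
    return {row[k]: row for row in cur}

def diff_tables(conn_a, conn_b, table, key):
    """Compare one table across two connections, matching rows by key."""
    a = fetch_rows(conn_a, table, key)
    b = fetch_rows(conn_b, table, key)
    return {
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
        "changed": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }

def make_db(rows):
    # Stand-in for a real database: same schema, different data.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    return conn

db_a = make_db([(1, "Ann"), (2, "Bob"), (3, "Cy")])
db_b = make_db([(1, "Ann"), (2, "Rob"), (4, "Dee")])
result = diff_tables(db_a, db_b, "customers", "id")
print(result)
```

For large tables you would want to compare per-row hashes or page through the key range rather than load everything at once.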
Related
I'm trying to get my head around Databricks.
I've found documentation stepping through importing data from S3 or Azure Datalake, and then outputting into Azure Synapse Analytics or another Data Warehouse solution.
After a quick play, I've recognised that you can simply save a table in Databricks, access it using SQL, and even pull it into PowerBI as a source.
So my question: for a small Datamart (10 dims, 5 facts), why would I choose to pay for an additional database solution like Azure SQL, Synapse, RDS or other when I could simply leave the data in a table in Databricks and then access it directly from my reporting tool from there?
Thank you in advance.
Andy
Yes, this is very much possible. Note that although Azure SQL and Synapse are both Microsoft offerings, they serve different purposes: Synapse supports MPP, so it is aimed more at big data implementations. Also, the decision does not depend only on how many dimension and fact tables you have; how much data you have, what kind of aggregations you run, and so on are also decisive.
I am using an Azure SQL Database for our team's reporting, and the data size right now is too big for a single database to handle (at least I think so; it has 2 fact tables with around 100m rows each).
The Azure SQL Database is named "operation-db" and the Synapse is named "operation-synapse".
I want to make the transition as smooth as possible for my team, so I'm planning to copy all the tables, views, stored procedures and user-defined functions over to Synapse.
Once I'm done with that, is there a way to rename "operation-synapse" to "operation-db" so the team doesn't have to go to their code base to change the name of the db?
Thanks!
It is not possible to rename a SQL Pool via SQL Server Management Studio and you will receive the following error:
ALTER DATABASE NAME statement is not supported in a Synapse workspace.
To update the name of a SQL pool, use the Azure Synapse Portal or the
Synapse REST API. (Microsoft SQL Server, Error: 49978)
The REST API however does list a move method to change names:
POST https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Synapse/workspaces/{workspaceName}/sqlPools/{sqlPoolName}/move?api-version=2019-06-01-preview
I couldn't get it to work though. YMMV. Not renaming your db shouldn't be a big deal though. Your team should feel comfortable with changing connection strings etc and it will help them understand they are moving to a different product (Synapse) with different characteristics.
Before you move to Synapse, however, have you looked at clustered columnstore indexes in Azure SQL DB? They are the default type of index in a SQL Pool database but are also available in SQL DB. They can compress your data 5-10x, so it might end up not that big at all. Columnstore is great for aggregate queries but less so for point lookups, so have a think about your workload before you migrate.
100 million rows is not big enough for Synapse. The CCI data in each shard will only amount to about one row group (roughly 1 million rows).
Consider using partitioning or a CCI in your SQL DB itself.
Also, what's your usage pattern? If you are doing point lookups and updates, clustered indexes will perform better.
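To put rough numbers on the row group point: a dedicated SQL pool spreads every table across 60 distributions, and a columnstore row group holds at most 1,048,576 rows. A back-of-the-envelope sketch, assuming the rows are spread evenly across distributions (the function names are made up for illustration):

```python
DISTRIBUTIONS = 60             # a dedicated SQL pool shards each table 60 ways
MAX_ROWGROUP_ROWS = 1_048_576  # upper limit of rows in one columnstore row group

def rows_per_distribution(total_rows):
    """Rows landing in each distribution, assuming an even spread."""
    return total_rows / DISTRIBUTIONS

def rowgroups_per_distribution(total_rows):
    """How many full-size row groups each distribution could fill."""
    return rows_per_distribution(total_rows) / MAX_ROWGROUP_ROWS

table_rows = 100_000_000
print(round(rows_per_distribution(table_rows)))
print(round(rowgroups_per_distribution(table_rows), 2))
```

For a 100 million row table that works out to about 1.67 million rows per distribution, so each distribution holds only between one and two row groups.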
You can rename a Synapse database easily using the SSMS GUI. (I've just tried this on v18.8).
Just click once on the database name in the Object Explorer to select it, then press the F2 key to rename it.
The Synapse service must be running (i.e. not paused) for the rename to work.
You can rename a Synapse database using T-SQL. The command is as follows:
ALTER DATABASE [OldSynapseDBName]
MODIFY NAME = [NewSynapseDBName]
Note that you need to be connected to, and issue the command from, the master database; otherwise it will not work.
The command can take about 30 seconds on a 100 GB database, and there are some caveats, such as that the database must not be in use during the operation.
I have been working with on-premises DWH solutions for a long time and am now moving to Azure SQL DWH.
Right now I am doing most of the processing / transformation in Azure Databricks and writing the result set to Azure SQL DWH staging tables.
Now I want to MERGE (UPSERT) the dimensions and load the fact tables.
As MERGE is not supported in Azure SQL DWH, what is the best way to accomplish this?
MERGE is not supported in Azure SQL DWH; the Azure SQL DWH team has said they are planning to support this feature.
Reference: MERGE statement support.
I found this blog, in which Microsoft gives an example that uses UPDATE/INSERT statements instead of MERGE.
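The general shape of the UPDATE/INSERT pattern is: update the dimension rows that already exist from staging, then insert the staging rows that have no match. The sketch below is not the blog's code. It is a minimal illustration that runs against an in-memory `sqlite3` database as a stand-in, so the SQL uses portable correlated subqueries rather than Azure SQL DWH syntax, and the `dim_customer` / `stg_customer` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE stg_customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO dim_customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO stg_customer VALUES (1, 'Ann Smith'), (3, 'Cy');
""")

def upsert_dimension(conn):
    """Emulate MERGE: UPDATE matching rows, then INSERT the missing ones."""
    with conn:  # run both statements in one transaction
        # Step 1: overwrite rows that exist in both staging and the dimension.
        conn.execute("""
            UPDATE dim_customer
            SET name = (SELECT s.name FROM stg_customer s
                        WHERE s.customer_id = dim_customer.customer_id)
            WHERE EXISTS (SELECT 1 FROM stg_customer s
                          WHERE s.customer_id = dim_customer.customer_id)
        """)
        # Step 2: add staging rows that have no counterpart in the dimension.
        conn.execute("""
            INSERT INTO dim_customer (customer_id, name)
            SELECT s.customer_id, s.name
            FROM stg_customer s
            WHERE NOT EXISTS (SELECT 1 FROM dim_customer d
                              WHERE d.customer_id = s.customer_id)
        """)

upsert_dimension(conn)
rows = conn.execute(
    "SELECT customer_id, name FROM dim_customer ORDER BY customer_id"
).fetchall()
print(rows)
```

Running the update before the insert avoids touching freshly inserted rows a second time, and wrapping both in one transaction keeps readers from seeing a half-applied load.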
Hope this helps.
How can I query multiple tables from multiple azure databases?
Imagine I have a "customers" table in database X and a "sales" table in database Y, and I want to join them in a query. How is it possible to do this in Azure?
David is correct -- currently Azure SQL DB doesn't support cross-database joins.
Is it possible for you to combine the two databases into a single DB, but use separate schema to keep the namespaces of objects separate? I am curious about the business reasons you are maintaining separate dbs. You can reach out to me directly at Stuarto Microsoft com.
Assuming you're talking about Azure's SQL Database service, then no, you cannot have queries across two separate database instances. The queries are limited to the single database.
If you require queries spanning multiple databases, you'd need to install SQL Server on a VM.
You can join them if they're hosted on the same server. Try this
SELECT a.userID, b.usersFirstName, b.usersLastName FROM databaseA.dbo.TableA a INNER JOIN databaseB.dbo.TableB b ON a.userID = b.userID
I have a SQL Azure Database Server and I need to query between the Databases but can't figure out how to accomplish this.
Here is the structure of my databases:
Server.X
Database.A
Database.B
Database.C
In Database.A I have a Stored Procedure that needs to retrieve data from Database.B. Normally, I would reference the database like SELECT * FROM [Database.B].[dbo].[MyTable] but this does not appear to be allowed in SQL Azure.
Msg 40515, Level 15, State 1, Line 16
Reference to database and/or server name in 'Database.B.dbo.MyTable' is not supported in this version of SQL Server.
Is there a way to do this on the database end?
In the final version Databases A & C will both need data from Database B.
Update:
As per Illuminati's comment and answer, the situation has changed since this answer was originally accepted and there is now support for cross database queries as per https://azure.microsoft.com/en-us/blog/querying-remote-databases-in-azure-sql-db/
Original Answer (2013):
Cross database queries aren't supported in SQL Azure. Which means you need to either combine the databases to prevent the need in the first place, or query both databases independently and basically join the data in your application.
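Joining in the application means running one query per database and matching the rows yourself. A minimal sketch of that idea, assuming the lookup side fits in memory; in-memory `sqlite3` connections stand in for the two databases, and the `customers` / `sales` tables and the `join_in_app` helper are hypothetical names.

```python
import sqlite3

# Stand-ins for two separate databases that cannot be joined server-side.
db_x = sqlite3.connect(":memory:")
db_x.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
db_x.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])

db_y = sqlite3.connect(":memory:")
db_y.execute("CREATE TABLE sales (sale_id INTEGER, customer_id INTEGER, amount REAL)")
db_y.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(10, 1, 25.0), (11, 2, 40.0), (12, 3, 5.0)],
)

def join_in_app(customers_conn, sales_conn):
    """Inner-join sales to customers in memory, keyed on customer_id."""
    # Query 1: load the smaller side into a lookup dictionary.
    names = dict(customers_conn.execute("SELECT customer_id, name FROM customers"))
    joined = []
    # Query 2: stream the other side and keep only rows with a match.
    query = "SELECT sale_id, customer_id, amount FROM sales ORDER BY sale_id"
    for sale_id, customer_id, amount in sales_conn.execute(query):
        if customer_id in names:
            joined.append((sale_id, names[customer_id], amount))
    return joined

joined = join_in_app(db_x, db_y)
print(joined)
```

The sale for customer 3 is dropped because that customer does not exist in the other database, which matches inner-join behaviour.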
Cross database queries are now supported in SQL Azure
https://azure.microsoft.com/en-us/blog/querying-remote-databases-in-azure-sql-db/
Azure SQL DB is currently previewing the Elastic Database Query feature, which lets you query across Azure SQL databases, with some limitations. You can get detailed information about the feature here.