Azure MySQL to Azure SQL Server - Data Lake Gen2 - Azure

I am creating a Data Factory pipeline to do initial and incremental loads into a Data Lake, from an Azure MySQL database to an Azure SQL Server database.
The initial pipeline that loads data from MySQL into the Data Lake is all good; the data is being persisted as .parquet files.
Now I need to load these files into a SQL Server table with some basic type conversions. What is the best way?
Databricks: mount these .parquet files, standardise them, and load them into SQL Server tables?
Or can I create an external source over these files in SQL Server on Azure and do the standardisation there? We are not on Synapse (DWH) yet.
Or is there a better way?

Since you are already using ADF, you can explore Mapping Data Flows.
https://learn.microsoft.com/en-us/azure/data-factory/concepts-data-flow-overview
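For the Databricks route mentioned in the question, a minimal PySpark sketch might look like the one below: read the landed .parquet files, apply basic type conversions, and write them to the Azure SQL table over JDBC. The storage path, column names, table name and credentials are placeholders, not values from the question.

    # Read the .parquet files landed by the ADF pipeline, standardise types,
    # and write them to an Azure SQL table over JDBC.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    spark = SparkSession.builder.getOrCreate()

    # Placeholder path to the files ADF wrote into the lake.
    df = spark.read.parquet("abfss://raw@<storage-account>.dfs.core.windows.net/mysql/orders/")

    # Example type conversions only; the real casts depend on your schema.
    df_clean = (
        df.withColumn("order_id", col("order_id").cast("int"))
          .withColumn("order_date", col("order_date").cast("timestamp"))
          .withColumn("amount", col("amount").cast("decimal(18,2)"))
    )

    # Placeholder Azure SQL connection details.
    (df_clean.write
        .format("jdbc")
        .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;database=<db>")
        .option("dbtable", "dbo.Orders")
        .option("user", "<user>")
        .option("password", "<password>")
        .mode("append")
        .save())

Mapping Data Flows achieve the same result without managing a cluster yourself, which is usually the simpler option when the pipeline already lives in ADF.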

Related

From Azure SQL Database to Snowflake

I am thinking about using Snowflake as a data warehouse. My databases are in Azure SQL Database, and I would like to know what tools I need to ETL my data from Azure SQL Database to Snowflake.
I think Snowpark could work for the data transformations, but I wonder what other code tools I could use.
Also, I wonder whether I should use Azure Blob Storage as a staging area, or whether Snowflake has its own.
Thanks
You can use Hevo Data, a third-party tool, to migrate data directly from Microsoft SQL Server to Snowflake.
STEPS TO BE FOLLOWED:
1. Make a connection to your Microsoft SQL Server database.
2. Choose a replication mode.
3. Create a Snowflake data warehouse configuration.
Alternatively, you can use SnowSQL to connect Microsoft SQL Server to Snowflake: export the data from SQL Server using SSMS, upload it to either Azure Storage or S3, and then move the data from storage into Snowflake.
REFERENCES:
Microsoft SQL Server to Snowflake
How to move the data from Azure Blob Storage to Snowflake
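On the staging question: Azure Blob Storage works fine as the staging area (Snowflake can also stage files internally with PUT). A minimal sketch of uploading a CSV export to a blob container with the azure-storage-blob package is shown below; the account, container and file names are placeholders.

    # Stage a CSV export from SQL Server in Azure Blob Storage so Snowflake
    # can later COPY it in from an external stage.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient(
        account_url="https://<storage-account>.blob.core.windows.net",
        credential="<account-key-or-sas-token>",
    )
    container = service.get_container_client("snowflake-staging")

    # Upload the exported file; overwrite any previous version.
    with open("orders_export.csv", "rb") as data:
        container.upload_blob(name="exports/orders_export.csv", data=data, overwrite=True)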

Moving data from Teradata to Snowflake

I am trying to move data from Teradata to Snowflake and have created a process that runs TPT scripts to generate files for each table.
The files are also split to achieve concurrency while running COPY INTO in Snowflake.
I need to understand the best way to move those files from an on-prem Linux machine to Azure ADLS, considering the files are terabytes in size.
Does Azure provide any mechanism to move these files, or can we create files on ADLS directly from Teradata?
If you have Azure Blob Storage or ADLS Gen2, the best approach is to load the data into Snowflake via an external table: load the files to blob storage, create an external table over them, and then load the data into Snowflake.
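As an illustration of that external approach, the sketch below uses snowflake-connector-python to point a named external stage (a close cousin of the external table mentioned above) at the Azure container holding the extracts and then runs COPY INTO, with a PATTERN so the split files for one table load in parallel. The connection details, stage URL, SAS token, file format and table name are all placeholders.

    # Point a Snowflake external stage at the Azure container with the
    # Teradata extracts, then COPY the split files into the target table.
    import snowflake.connector

    conn = snowflake.connector.connect(
        account="<account_identifier>",
        user="<user>",
        password="<password>",
        warehouse="<warehouse>",
        database="<database>",
        schema="<schema>",
    )

    with conn.cursor() as cur:
        cur.execute("""
            CREATE STAGE IF NOT EXISTS teradata_stage
              URL = 'azure://<storage-account>.blob.core.windows.net/teradata-extracts/'
              CREDENTIALS = (AZURE_SAS_TOKEN = '<sas-token>')
              FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '"')
        """)
        # The split files load concurrently; PATTERN restricts the COPY to
        # the files belonging to one table.
        cur.execute("""
            COPY INTO my_table
            FROM @teradata_stage
            PATTERN = '.*my_table_part_.*[.]csv'
        """)
    conn.close()

For the upload itself, AzCopy is the usual Azure-provided tool for moving large on-prem files into Blob Storage or ADLS Gen2.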

Extracting and Transforming Data from local MySQL to Azure Synapse Data Warehouse

I'm trying to set up a demo data warehouse in Azure Synapse. I would like to extract data from a local MySQL database, transform and aggregate some of the data, and store it in fact/dimension tables in Azure Synapse Analytics.
Currently I have an instance of Azure SQL Data Warehouse and a Data Factory. I created a connection to my MySQL database in Data Factory, and my thought was that I could use this connector as the input for a new Data Flow, which would transform the dataset and store it in my destination dataset, which is linked to my Azure Synapse data warehouse.
The problem is that Data Factory only supports certain Azure services, such as Azure Data Lake or Azure SQL Database, as sources for a new Data Flow.
What would be the best practice for solving this problem? Create an instance of Azure SQL Database, copy the data from the local MySQL database to the Azure SQL Database, and then use it as the source for a new Data Flow?
Best practice here is to use the Copy Activity in an ADF pipeline to land the data from MySQL as Parquet in Blob Storage or ADLS Gen2, and then transform the data using Data Flows.
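If you script the pipeline rather than build it in the portal, a hedged sketch with the azure-mgmt-datafactory SDK is below. The resource group, factory, dataset and pipeline names are placeholders, and it assumes the MySQL source dataset, the Parquet-on-ADLS sink dataset and a self-hosted integration runtime for the local MySQL database already exist in the factory.

    # Sketch only: create an ADF pipeline whose Copy Activity lands the local
    # MySQL data as Parquet in ADLS Gen2 (a Data Flow can then read the Parquet).
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        CopyActivity, DatasetReference, MySqlSource, ParquetSink, PipelineResource,
    )

    adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    copy_to_lake = CopyActivity(
        name="CopyMySqlToParquet",
        # Both datasets are assumed to be defined in the factory already.
        inputs=[DatasetReference(type="DatasetReference", reference_name="MySqlSourceDataset")],
        outputs=[DatasetReference(type="DatasetReference", reference_name="AdlsParquetDataset")],
        source=MySqlSource(),
        sink=ParquetSink(),
    )

    adf.pipelines.create_or_update(
        "<resource-group>", "<factory-name>", "LandMySqlToLake",
        PipelineResource(activities=[copy_to_lake]),
    )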

Is it possible to read an Azure Databricks table from Azure Data Factory?

I have a table in an Azure Databricks cluster, and I would like to replicate this data into an Azure SQL Database so that other users can analyze it from Metabase.
Is it possible to access Databricks tables through Azure Data Factory?
No, unfortunately not. Databricks tables are typically temporary and last only as long as your job/session is running. See here.
You would need to persist your Databricks table to some storage in order to access it. Change your Databricks job to dump the table to Blob Storage as its final action. In the next step of your Data Factory job, you can then read the dumped data from the storage account and process it further.
Another option may be Databricks Delta, although I have not tried this yet...
If you register the table in the Databricks Hive metastore, then ADF can read from it using the ODBC source in ADF, though this would require an IR (integration runtime).
Alternatively, you could write the table out to external storage such as Blob or the lake; ADF can then read that file and push it to your SQL database.
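A minimal notebook sketch of that "write it out to storage" step is below; the table name and abfss path are placeholders. ADF can then pick the Parquet up and push it into Azure SQL for Metabase.

    # Persist the Databricks table to ADLS Gen2 as Parquet so ADF can read it.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    (spark.table("analytics.my_table")     # placeholder table name
          .write
          .mode("overwrite")
          .parquet("abfss://export@<storage-account>.dfs.core.windows.net/my_table/"))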

Add SQL Server as a data source in Azure Data Lake Analytics

I'm doing some tests with Azure Data Lake Analytics and I can’t add a new SQL Server database as a Data Source. When I click on "Add data source", the only two available options are: "Azure Data Lake Storage Gen1" and "Azure Storage".
What I want is to add one SQL Server database so that I can run U-SQL queries against it.
Our SQL Server firewall is correctly configured to allow access to Azure Services, but I am not allowed to add it as a data source.
How can this be done? Is it a matter of other configuration issues?
Any help would be greatly appreciated.
Per my research, there is no other configuration issue for a SQL Server data source in DLA. Based on this official doc, DLA only supports two data sources: Data Lake Store and Azure Storage.
As a workaround, I suggest using Azure Data Factory to transfer the data from your SQL Server database to Azure Storage, so that you can run U-SQL scripts against that data.
If you have any concerns, please let me know.
