Can we pass the whole data of a CSV from blob storage to an on-premises SQL database using Azure Data Factory?

I am trying to copy all records of a CSV file from blob storage to an on-premises SQL database using Azure Data Factory. I know how to pass records one by one using a Lookup and a Copy activity, but I don't know how to copy all records of the CSV at once.

You can directly use a Copy activity with the blob file as the source and the SQL database table as the sink; all records of the file will be copied into the table.
There is no need for a Lookup activity.

You will have to use a Copy activity to copy the data from Azure Blob Storage to the on-premises SQL database (the on-premises sink requires a self-hosted integration runtime). You can follow the steps below:
Step 1: Add a Copy activity to your Data Factory pipeline.
Step 2: Select an Azure Blob Storage dataset as the source.
Step 3: Select the on-premises SQL database dataset as the sink.
Step 4: Click "Import schema" to map the source columns to the sink columns.
Step 5: Execute the Copy activity. There is no need to use a Lookup activity here.
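If you author the pipeline as JSON rather than in the UI, the resulting Copy activity looks roughly like this (a sketch; the dataset names and column mappings are assumptions, and the sink dataset must point at a linked service that uses your self-hosted integration runtime):

    {
        "name": "CopyCsvToOnPremSql",
        "type": "Copy",
        "inputs": [ { "referenceName": "BlobCsvDataset", "type": "DatasetReference" } ],
        "outputs": [ { "referenceName": "OnPremSqlTableDataset", "type": "DatasetReference" } ],
        "typeProperties": {
            "source": { "type": "DelimitedTextSource" },
            "sink": { "type": "SqlSink" },
            "translator": {
                "type": "TabularTranslator",
                "mappings": [
                    { "source": { "name": "Id" },   "sink": { "name": "Id" } },
                    { "source": { "name": "Name" }, "sink": { "name": "Name" } }
                ]
            }
        }
    }

A single run of this activity moves every row of the file in bulk, so nothing has to be iterated record by record.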

Related

Synapse Analytics: save a query result in blob storage

So, I have some Parquet files stored in containers inside an Azure Blob Storage account. I used a Data Factory pipeline with the "Copy data" activity to extract the data from an on-premises Oracle database.
I'm using Synapse Analytics to run a SQL query over some of the Parquet files stored in the blob container, and I want to save the results of the query to another blob. Which Synapse connector can I use to make this happen? To run the query I'm using the "Develop" menu inside Synapse Analytics.
To persist the results of a serverless SQL query, use CREATE EXTERNAL TABLE AS SELECT (CETAS). This creates a physical copy of the result data in your storage account as a collection of files in a folder; note that you cannot control how many files are produced or how they are named.
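If you later want to run the same CETAS statement from a pipeline instead of interactively in the Develop hub, one option is a Script activity against the serverless pool (a sketch; the linked service name, the external data source MyStorage, and the file format ParquetFormat are assumptions that must already exist in the serverless database):

    {
        "name": "PersistQueryResult",
        "type": "Script",
        "linkedServiceName": { "referenceName": "ServerlessSqlPool", "type": "LinkedServiceReference" },
        "typeProperties": {
            "scripts": [
                {
                    "type": "NonQuery",
                    "text": "CREATE EXTERNAL TABLE dbo.QueryResult WITH (LOCATION = 'output/result/', DATA_SOURCE = MyStorage, FILE_FORMAT = ParquetFormat) AS SELECT * FROM OPENROWSET(BULK 'input/*.parquet', DATA_SOURCE = 'MyStorage', FORMAT = 'PARQUET') AS src"
                }
            ]
        }
    }

The LOCATION folder determines where the result files land; as noted above, the file count and names are chosen by the engine.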

Azure Data Factory: read from CSV and copy row by row to a Cosmos DB

I'm new to Azure Data Factory. I'm trying to solve the following problem:
Read a CSV file from Azure Blob Storage
Parse it row by row and write each row into an existing Cosmos DB
I am currently looking into a solution that does the following:
Copy data from the source (CSV) to a sink (Azure Table Storage)
Use a ForEach activity that iterates over the table and copies the rows into the db
Is this a correct approach, and if it is, how should I set up the dynamic content of the ForEach activity?
Note:
I've tried this solution (link) but I get an error message saying:
Reading or replacing offers is not supported for serverless accounts
which means that Cosmos DB Serverless is not currently supported as a sink for Data Flow in Azure Data Factory.
If you use Lookup + ForEach activities, the ForEach Items expression should be:
@activity('Lookup1').output.value
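In pipeline JSON, that expression goes into the ForEach activity's items property (a sketch; the Lookup name, the variable, and the inner activity are placeholders for whatever per-row work you do):

    {
        "name": "ForEach1",
        "type": "ForEach",
        "dependsOn": [ { "activity": "Lookup1", "dependencyConditions": [ "Succeeded" ] } ],
        "typeProperties": {
            "items": {
                "value": "@activity('Lookup1').output.value",
                "type": "Expression"
            },
            "activities": [
                {
                    "name": "AppendRow",
                    "type": "AppendVariable",
                    "typeProperties": {
                        "variableName": "rows",
                        "value": { "value": "@item()", "type": "Expression" }
                    }
                }
            ]
        }
    }

Inside the loop, @item() refers to the current row from the Lookup output. Also note that Lookup output is capped at 5,000 rows, which is another reason to prefer the direct copy mentioned below.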
That said, your staged solution (copy to Table Storage, then iterate) may be hard to achieve.
Since you have found that Data Flow doesn't support Cosmos DB Serverless, you may want to reference this tutorial: Copy Data From Blob Storage To Cosmos DB Using Azure Data Factory
It uses a Copy activity to copy data from a CSV file in Blob Storage to Azure Cosmos DB directly.

How to perform Data Factory transformations on large datasets in Azure Data Warehouse

We have data warehouse tables on which we perform transformations using ADF.
If I have a group of ADW tables and I need to transform them and land them back in ADW, should I stage the transformed data in Azure Blob Storage, or go directly into the target table?
The ADW tables are in excess of 100 million records.
Is it an acceptable practice to use Blob Storage as the middle piece?
I can think of two ways to do this (neither requires moving the data into Blob Storage):
Do the transformation within SQL DW using a stored procedure, and use ADF to orchestrate the stored procedure call (see the sketch after this list).
Use ADF's Data Flow to read from SQL DW, apply the transformation, and write back to SQL DW.
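For the first option, ADF's Stored Procedure activity does the orchestration (a sketch; the linked service name, procedure name, and parameter are assumptions, and the procedure itself would typically use CTAS to write the transformed rows to a new table):

    {
        "name": "RunDwTransformation",
        "type": "SqlServerStoredProcedure",
        "linkedServiceName": { "referenceName": "SqlDwLinkedService", "type": "LinkedServiceReference" },
        "typeProperties": {
            "storedProcedureName": "dbo.usp_TransformLargeTables",
            "storedProcedureParameters": {
                "BatchDate": { "value": "2021-01-01", "type": "String" }
            }
        }
    }

Because the transformation runs inside SQL DW itself, the 100-million-row tables never leave the service, which is usually the fastest option at that scale.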
Yes, you had better use Blob Storage as the middle piece.
You cannot copy tables from a SQL DW (source) to the same SQL DW (sink) directly. If you try this, you will run into these problems:
Copy Data tool: errors in data mapping, because it copies data back into the same table instead of creating new tables.
Copy activity: "Table is required for Copy activity", i.e. the sink table must already exist.
If you want to copy the data from SQL DW tables to new tables with Data Factory, you need at least two steps (or use the Copy activity's built-in staged copy, sketched below):
Copy the data from the SQL DW tables to Blob Storage (creating CSV files).
Load these CSV files into SQL DW and create the new tables.
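The Copy activity can also do both steps for you in one go via staged copy: with enableStaging set, data is first written to Blob Storage and then loaded into the sink via PolyBase. A sketch, where the dataset names, linked service name, and source query are assumptions:

    {
        "name": "CopyDwToDwViaBlob",
        "type": "Copy",
        "inputs": [ { "referenceName": "SqlDwSourceTable", "type": "DatasetReference" } ],
        "outputs": [ { "referenceName": "SqlDwTargetTable", "type": "DatasetReference" } ],
        "typeProperties": {
            "source": { "type": "SqlDWSource", "sqlReaderQuery": "SELECT * FROM dbo.SourceTable" },
            "sink": { "type": "SqlDWSink", "allowPolyBase": true },
            "enableStaging": true,
            "stagingSettings": {
                "linkedServiceName": { "referenceName": "StagingBlobStorage", "type": "LinkedServiceReference" },
                "path": "staging"
            }
        }
    }
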
Reference tutorials:
Copy and transform data in Azure Synapse Analytics (formerly Azure SQL Data Warehouse) by using Azure Data Factory
Copy and transform data in Azure Blob storage by using Azure Data Factory
Data Factory is good at transferring big data; see Copy performance of Data Factory. It may be faster than the SELECT - INTO Clause (Transact-SQL).
Hope this helps.

Is there any way to filter out some data in a Copy activity when the source is Blob Storage and the sink is a SQL database?

I am trying to copy data from Azure blobs to an Azure SQL database using Azure Data Factory.
The blobs are stored incrementally in the storage account, and each is just JSON containing key-value pairs. I want to filter the data on the basis of one key-value pair before it gets copied into the SQL database.
You could use a stored procedure in the SQL sink to do the filtering before writing into your DB:
https://learn.microsoft.com/en-us/azure/data-factory/connector-azure-sql-database#invoking-stored-procedure-for-sql-sink
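In the Copy activity's sink definition, you point at the procedure instead of a table (a sketch; the procedure name, table type, and parameter name are assumptions). The procedure receives the incoming rows as a table-valued parameter, so it can apply a WHERE on your key-value pair before inserting:

    "sink": {
        "type": "AzureSqlSink",
        "sqlWriterStoredProcedureName": "dbo.spFilterAndInsert",
        "sqlWriterTableType": "JsonRowType",
        "storedProcedureTableTypeParameterName": "rows"
    }

The table type must be created in the database beforehand, with columns matching the fields mapped from the JSON source.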

Azure Data Factory: Moving data from Table Storage to SQL Azure

While moving data from Table Storage to SQL Azure, is it possible to obtain only the delta (the data that hasn't already been moved) using Azure Data Factory?
A more detailed explanation:
There is an Azure Storage table containing some data that is updated periodically, and I want to create a Data Factory pipeline that moves this data to an Azure SQL database. During each run, I only want the newly added data to be written to the SQL DB. Is this possible with Azure Data Factory?
See more information on azureTableSourceQuery and the copy activity at this link: https://azure.microsoft.com/en-us/documentation/articles/data-factory-azure-table-connector/#azure-table-copy-activity-type-properties.
Also see this link for invoking a stored procedure for the SQL sink: https://azure.microsoft.com/en-us/documentation/articles/data-factory-azure-sql-connector/#invoking-stored-procedure-for-sql-sink
You can query on the Timestamp property each run to achieve something similar to a delta copy, but this is not a true delta copy.
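A sketch of such a source query, filtering on the Timestamp system property (LastRunTime is a hypothetical pipeline parameter holding the previous run's watermark):

    "source": {
        "type": "AzureTableSource",
        "azureTableSourceQuery": {
            "value": "Timestamp gt datetime'@{pipeline().parameters.LastRunTime}'",
            "type": "Expression"
        }
    }

Because Timestamp changes on every modification (not just on insert) and deleted entities never appear in the results, this only approximates a true delta, as noted above.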
