Mapping tab in ADF "Copy data" action missing new table storage property/column name

We have an existing Azure Data Factory pipeline that takes data from an Azure Table Storage table and copies it to an Azure SQL table, which is working without issue.
The problem appeared when we added a new data element to the table storage (since it is NoSQL). When I go into the ADF source of the pipeline and refresh the table storage, the new data is not available to map. Is there something I am missing to get this new data element (column) to show up? I know the data is there, since I can see the column in Azure Storage Explorer.

Congratulations on finding the answer:
"I located the answer with additional research. See article: https://stackoverflow.com/questions/44123539/azure-data-factory-changing-azure-table-schema"
This can be beneficial to other community members.

Related

File is not read completely by Copy Data in Azure Data Factory

I'm developing a pipeline that inserts data from a .txt file located in Blob Storage into a table in a SQL database.
Problem: Somehow the activity configuration is not working properly, because it is not reading all the records in the file and consequently is not loading all the data into the database (I noticed this when I opened the file and compared the number of records in the .txt file against the SQL table; also, when I searched the table for records from the last month, I didn't find them).
Note: I checked the character size limit of the SQL table columns and that isn't the problem.
I'd like to share the Copy Data activity configuration as well as the Sink Dataset and Source Dataset (screenshots omitted).
Do you know what I'm doing wrong here? Hope you can help me. Best regards.
As discussed in the comments, when using the Copy activity you have to set the schema before running the activity. By design the schema mapping is left empty and has to be configured by the user, either manually or by asking ADF to import the schema from the dataset.
Note: When using the Auto create table option in the sink, ADF automatically creates the sink table (if it does not exist) based on the source schema, but this is not supported when a stored procedure is specified on the sink side or when staging is enabled.
When using the COPY statement to load data into Azure Synapse Analytics as the sink, the connector supports automatically creating the destination table with DISTRIBUTION = ROUND_ROBIN if it does not exist, based on the source schema.
Refer to the official doc: Copy and transform data in Azure Synapse Analytics by using Azure Data Factory or Synapse pipelines
So Azure Synapse will be used as the sink. Additionally, an Azure Synapse table has to be created that matches the column names, column order, and column data types of the source.
For dynamic mapping:
If you view the pipeline code, you can see in the translator section the JSON equivalent of the mapping configured in the UI.
You can reuse this as a base for dynamic mapping, so that similar files can be copied later without having to configure the schema manually.
Copy the JSON under mappings in the translator.
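As a rough illustration of what that translator section looks like, here is a minimal sketch; the column names and types below are placeholders, not taken from the question:

```json
"translator": {
    "type": "TabularTranslator",
    "mappings": [
        {
            "source": { "name": "CustomerId", "type": "String" },
            "sink": { "name": "CustomerId", "type": "String" }
        },
        {
            "source": { "name": "OrderDate", "type": "DateTime" },
            "sink": { "name": "OrderDate", "type": "DateTime" }
        }
    ]
}
```

The array under mappings is the part you would store in a parameter or variable and pass to the Mapping field as dynamic content when copying similar files.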

How to incrementally load data from Azure Blob storage to Azure SQL Database using Data Factory?

I have a json file stored in Azure Blob Storage and I have loaded it into Azure SQL DB using Data Factory.
Now I would like to find a way in order to load only new records from the file to my database (as the file is being updated every week or so). Is there a way to do it?
Thanks!
You can use the upsert pattern (slowly changing dimension type 1) that is already implemented in Azure Data Factory.
It will add new records and update old records that have changed.
Here is a quick tutorial:
https://www.youtube.com/watch?v=MzHWZ5_KMYo
I would suggest you use the Data Flow activity.
In a data flow you have the Alter Row transformation.
In Alter Row you can use an Upsert if condition.
Here, set the condition to 1 == 1 so every row is marked for upsert.
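As a very rough sketch only (the source and sink settings, stream names, and the key column are assumptions, not from the original answer), the data flow script behind such a flow would look roughly like this, with the Alter Row transformation marking every row for upsert via upsertIf(1 == 1):

```
source(allowSchemaDrift: true,
    validateSchema: false) ~> BlobSource
BlobSource alterRow(upsertIf(1 == 1)) ~> MarkUpsert
MarkUpsert sink(upsertable: true,
    insertable: false,
    updateable: false,
    deletable: false,
    keys: ['Id']) ~> SqlSink
```

The sink must allow upserts and needs a key column (here the hypothetical Id) so the database can decide whether each row is inserted or updated.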

Data Factory to SharePoint list

I've set up a connection from our Data Factory in Azure to a SharePoint site so I can pull some of the lists on the site into Blob storage and then process them into our warehouse. This all works fine and I can see the data I want. However, I don't want to pull all the columns contained in the list I'm after. Looking at the connection, I can specify a query, but anything I put in here has no effect on the data that comes back. Is there a way to specify the columns from a SharePoint list through the Copy activity into Blob storage?
You need to use a select query like the one below:
$select=Title,Number,OrderDate in the Query text field of the Azure Data Factory source.
You can use the Preview data button to validate the results. Please refer to the documentation for using custom OData query options.
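As a sketch of where that query lives in the underlying copy activity JSON (everything beyond the query string itself is an assumption), the source would look roughly like this:

```json
"source": {
    "type": "SharePointOnlineListSource",
    "query": "$select=Title,Number,OrderDate"
}
```

Only the columns named in $select are returned to the copy activity, so the output written to Blob storage contains just those columns.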
I have tried this and it works fine for me.
Thanks
Saurabh

Error trying to copy data from Azure SQL database to Azure Blob Storage

I have created a pipeline in Azure Data Factory (V1). It is a copy pipeline that has an AzureSqlTable dataset as input and an AzureBlob dataset as output. The AzureSqlTable dataset that I use as input is created as the output of another pipeline. In this pipeline I launch a procedure that copies one table entry to a blob CSV file.
I get the following error when launching pipeline:
Copy activity encountered a user error: ErrorCode=UserErrorTabularCopyBehaviorNotSupported,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=CopyBehavior property is not supported if the source is tabular data source.,Source=Microsoft.DataTransfer.ClientLibrary,'.
How can I solve this?
According to the error information, the action as configured is not supported by Azure Data Factory, but using an Azure SQL table as input and an Azure blob dataset as output should be supported.
I also ran a demo test in the Azure portal. You can follow these steps to do the same:
1. Click Copy data in the Azure portal.
2. Set the copy properties.
3. Select the source.
4. Select the destination data store.
5. Complete the deployment.
6. Check the result in Azure and in the storage account.
Update:
If we want to use an existing dataset, we can choose [From Existing Connections].
Update 2:
The Data Factory (V1) copy activity settings only support using an existing Azure Blob Storage / Azure Data Lake Store dataset. For more detail, please refer to this link.
If using Data Factory (V2) is acceptable, we can use an existing Azure SQL dataset.
So, actually, if we don't use this awful "Copy data (PREVIEW)" action and instead add an activity to an existing pipeline rather than creating a new pipeline, everything works. So the solution is to add a copy activity manually to an existing pipeline.
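For reference, here is a minimal sketch of what such a hand-added V1 copy activity could look like (the dataset names and the query are placeholders, not from the original post). Note that the sink carries no copyBehavior property, since that property is what the error flags as unsupported when the source is tabular:

```json
{
    "name": "CopySqlToBlob",
    "type": "Copy",
    "inputs": [ { "name": "AzureSqlTableInput" } ],
    "outputs": [ { "name": "AzureBlobOutput" } ],
    "typeProperties": {
        "source": {
            "type": "SqlSource",
            "sqlReaderQuery": "SELECT * FROM dbo.MyTable"
        },
        "sink": {
            "type": "BlobSink"
        }
    },
    "policy": {
        "concurrency": 1,
        "retry": 3
    }
}
```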

Azure Data Sync - Copy Each SQL Row to Blob

I'm trying to understand the best way to migrate a large set of data, roughly 6 million text rows, from an Azure-hosted SQL Server to Blob storage.
For the most part, these records are archived records and are rarely accessed, so blob storage made sense as a place to hold them.
I have had a look at Azure Data Factory and it seems to be the right option, but I am unsure whether it fulfills the requirements.
Simply put, the scenario is: for each row in the table, I want to create a blob with the contents of one column from that row.
I see the tutorial (i.e. https://learn.microsoft.com/en-us/azure/data-factory/data-factory-copy-activity-tutorial-using-azure-portal) is good at explaining a bulk-to-bulk data pipeline, but I would like to migrate from a bulk dataset to many blobs.
I hope that makes sense; can someone help?
As of now, Azure Data Factory does not have anything built in like a For Each loop in SSIS. You could use a custom .NET activity to do this, but it would require a lot of custom code.
I would ask, if you were transferring this to another database, would you create 6 million tables all with the same structure? What is to be gained by having the separate items?
Another alternative might be converting it to JSON which would be easy using Data Factory. Here is an example I did recently moving data into DocumentDB.
Copy From OnPrem SQL server to DocumentDB using custom activity in ADF Pipeline
SSIS 2016 with the Azure Feature Pack provides Azure tasks such as the Azure Blob Upload Task and the Azure Blob Destination. You might be better off using this; an OLE DB Command or a For Each loop with an Azure Blob destination could be another option.
Good luck!
Azure Data Factory (V2) has a ForEach activity, which can be placed after a Lookup or Get Metadata activity to process each row from SQL into a blob.
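As a very rough sketch of that pattern in V2 pipeline JSON (the dataset names, query, and column names are assumptions, not from the question), a Lookup feeds its rows to a ForEach, which runs one Copy per row. Keep in mind the Lookup activity caps its result set at a few thousand rows, so a table of ~6M rows would have to be processed in batches:

```json
"activities": [
    {
        "name": "LookupArchiveRows",
        "type": "Lookup",
        "typeProperties": {
            "source": {
                "type": "AzureSqlSource",
                "sqlReaderQuery": "SELECT Id FROM dbo.ArchiveTable"
            },
            "dataset": { "referenceName": "AzureSqlArchiveDataset", "type": "DatasetReference" },
            "firstRowOnly": false
        }
    },
    {
        "name": "ForEachRow",
        "type": "ForEach",
        "dependsOn": [ { "activity": "LookupArchiveRows", "dependencyConditions": [ "Succeeded" ] } ],
        "typeProperties": {
            "items": { "value": "@activity('LookupArchiveRows').output.value", "type": "Expression" },
            "activities": [
                {
                    "name": "CopyRowToBlob",
                    "type": "Copy",
                    "inputs": [ { "referenceName": "AzureSqlArchiveDataset", "type": "DatasetReference" } ],
                    "outputs": [
                        {
                            "referenceName": "BlobPerRowDataset",
                            "type": "DatasetReference",
                            "parameters": { "fileName": "@concat(string(item().Id), '.txt')" }
                        }
                    ],
                    "typeProperties": {
                        "source": {
                            "type": "AzureSqlSource",
                            "sqlReaderQuery": "@concat('SELECT Payload FROM dbo.ArchiveTable WHERE Id = ', string(item().Id))"
                        },
                        "sink": { "type": "DelimitedTextSink" }
                    }
                }
            ]
        }
    }
]
```

Here BlobPerRowDataset is assumed to be a delimited-text dataset on Blob storage with a fileName parameter, so each iteration writes one blob named after the row's Id.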
