I am trying to copy an Excel file with multiple tabs from one folder to another in ADLS using a Copy activity in Data Factory. I have selected an Excel dataset as the source dataset and a CSV dataset as the sink dataset, with the name of the tab to copy defined in the dataset properties.
I am getting the below error while running the pipeline:
Only formula cells have cached results Activity ID: 0d26511f-4f82-45df-9e92-62c78f3f02b6
It looks like you are trying to copy multiple worksheets of your Excel file to ADLS using the ADF Copy activity. If that is the case, you will have to pass the worksheet names dynamically to the source dataset and use the same parameter to define the sink file name; this will let you copy multiple sheets of the same Excel file to the desired sink.
For more info, please refer to a similar conversation here: Dynamic sheet name in source dataset: (Excel (Blob storage)) on Azure Data Factory. - Error: Please select a work sheet for your dataset
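For illustration, a minimal sketch of the expressions involved, assuming a hypothetical pipeline array parameter named SheetNames, a ForEach activity iterating over it, and a sheetName parameter defined on both the source and sink datasets:

```
ForEach Items:               @pipeline().parameters.SheetNames
Source dataset sheet name:   @dataset().sheetName
Value passed per iteration:  @item()
Sink dataset file name:      @concat(dataset().sheetName, '.csv')
```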
When converting an Excel file to CSV in a Synapse pipeline or data flow, I need to put the values of certain cells of the Excel file into an additional column.
I was able to convert the Excel file to CSV, but I can't figure out how to read the value of a particular cell and add it as a column.
What I would like to achieve is as follows.
Excel sample file
I want to add "outlet_code" as a column in the "C2" cell of the Excel file.
csv file
You can try the below logic:
First, use a Copy activity to copy the content of the Excel file from A1 to C2 into a CSV (assuming the position of the code remains the same).
Excel source dataset:
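For illustration, a minimal JSON sketch of such an Excel source dataset restricted to the A1:C2 range (the linked service, container, and file names here are placeholders):

```json
{
  "name": "ExcelHeaderRange",
  "properties": {
    "type": "Excel",
    "linkedServiceName": {
      "referenceName": "AzureDataLakeStorage1",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "location": {
        "type": "AzureBlobFSLocation",
        "fileName": "sample.xlsx",
        "fileSystem": "input"
      },
      "sheetName": "Sheet1",
      "range": "A1:C2",
      "firstRowAsHeader": false
    }
  }
}
```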
Leverage a Lookup activity to get the code value.
The Lookup should map to the CSV file generated above.
Then create another Copy activity to actually copy your Excel content into CSV with the current settings you have, plus the below logic to add an additional column:
Add additional column in copy activity using Azure Data Factory
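A sketch of that additional-column setting in the second Copy activity's source, assuming the Lookup activity is named Lookup1 and the code value lands in the third auto-named column (Prop_2) of the intermediate CSV read without headers:

```json
"source": {
  "type": "ExcelSource",
  "additionalColumns": [
    {
      "name": "outlet_code",
      "value": {
        "value": "@activity('Lookup1').output.firstRow.Prop_2",
        "type": "Expression"
      }
    }
  ]
}
```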
I want to copy data from a CSV file (source) in Blob storage to an Azure SQL Database table (sink) via a regular Copy activity, but I also want to copy the file name alongside every entry into the table. I am new to ADF, so the solution is probably easy, but I have not been able to find the answer in the documentation or on the internet so far.
My mapping currently looks like this (I have created a table for the output with a file name column, but this data is not explicitly defined at the column level in the CSV file, so I need to extract it from the metadata and pair it to the column):
At first, I thought I would put dynamic content in there and solve the problem that way, but there is no option to use dynamic content in each individual box, so I do not know how to implement that. My next thought was to use a pre-copy script, but I have not seen how I could use it for this purpose. What is the best way to solve this issue?
In the mapping columns of the Copy activity, you cannot add dynamic content from metadata.
First give the source CSV dataset to the Get Metadata activity, then connect it to the Copy activity like below.
You can add the file name column via Additional columns in the Copy activity source itself, by giving the dynamic content from the Get Metadata activity (after giving it the same source CSV dataset):
@activity('Get Metadata1').output.itemName
If you are sure about the data types of your data, there is no need to go to the mapping; you can execute your pipeline.
Here I am copying the contents of the samplecsv.csv file to a SQL table named output.
My output for your reference:
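Put together, a minimal sketch of the Copy activity JSON for this pattern (activity names match the ones above; the datasets and sink table are yours):

```json
{
  "name": "Copy data1",
  "type": "Copy",
  "dependsOn": [
    { "activity": "Get Metadata1", "dependencyConditions": [ "Succeeded" ] }
  ],
  "typeProperties": {
    "source": {
      "type": "DelimitedTextSource",
      "additionalColumns": [
        {
          "name": "FileName",
          "value": {
            "value": "@activity('Get Metadata1').output.itemName",
            "type": "Expression"
          }
        }
      ]
    },
    "sink": { "type": "AzureSqlSink" }
  }
}
```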
Within Azure ADF, I have an Excel file with 4 tabs named Sheet1-Sheet4.
I would like to loop through the Excel file, creating a CSV per tab.
I have created a SheetsNames parameter in the pipeline with a default value of ["Sheet1","Sheet2","Sheet3","Sheet4"]
How do I use this with the Copy task to loop through the tabs?
Please try this:
Create a SheetsNames parameter in the pipeline with a default value of ["Sheet1","Sheet2","Sheet3","Sheet4"].
Add a For Each activity and type @pipeline().parameters.SheetsNames in the Items option.
Within the For Each activity, add a Copy activity.
Create a Source dataset and add a parameter named sheetName with an empty default value.
Navigate to the Connection settings of the Source dataset and check Edit in the Sheet name option. Then type @dataset().sheetName in it.
Navigate to the Source settings of the Copy data activity and pass @item() to the sheetName parameter.
Create a Sink dataset; its settings are similar to the Source dataset's.
Run the pipeline and get this result:
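A condensed sketch of the resulting ForEach JSON, assuming the sink dataset also exposes a sheetName parameter and builds its file name as @concat(dataset().sheetName, '.csv'):

```json
{
  "name": "ForEach1",
  "type": "ForEach",
  "typeProperties": {
    "items": {
      "value": "@pipeline().parameters.SheetsNames",
      "type": "Expression"
    },
    "activities": [
      {
        "name": "Copy data1",
        "type": "Copy",
        "inputs": [
          {
            "referenceName": "ExcelSource",
            "type": "DatasetReference",
            "parameters": {
              "sheetName": { "value": "@item()", "type": "Expression" }
            }
          }
        ],
        "outputs": [
          {
            "referenceName": "CsvSink",
            "type": "DatasetReference",
            "parameters": {
              "sheetName": { "value": "@item()", "type": "Expression" }
            }
          }
        ],
        "typeProperties": {
          "source": { "type": "ExcelSource" },
          "sink": { "type": "DelimitedTextSink" }
        }
      }
    ]
  }
}
```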
I am trying to copy different CSV files in Blob storage into their very own SQL tables (I want to auto-create these tables). I've seen a lot of questions, but I haven't seen any that answer this.
Currently I have a Get Metadata activity that grabs a list of child items to get the file names, and a ForEach loop, but from there I don't know how to send them to different tables per file.
Updated:
When I run it a second time, it adds new rows to the existing table.
I created a simple test and it works well. This is my csv file stored in Azure Data Lake.
Then we can use a pipeline to copy this CSV file into an Azure SQL table (auto-creating the table).
1. At the GetMetaData1 activity, set the dataset to the folder containing the CSV files, and select First row as header at the dataset.
2. At the ForEach1 activity, we can iterate over the file list via the expression @activity('Get Metadata1').output.childItems.
3. Inside the ForEach1 activity, we can use a Copy data1 activity with the same data source as the GetMetaData1 activity. At the source tab, type in the dynamic content @item().name to get the file name.
At the sink tab, we should select Auto create table.
In the Azure SQL dataset, we should type in the schema name and the dynamic content @replace(item().name,'.csv','') as its table name, because this information is needed to create the table dynamically.
The debug result is as follows:
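A minimal sketch of the sink dataset side, assuming the Azure SQL dataset exposes a tableName parameter and the schema is dbo:

```json
{
  "name": "AzureSqlTable1",
  "properties": {
    "type": "AzureSqlTable",
    "linkedServiceName": {
      "referenceName": "AzureSqlDatabase1",
      "type": "LinkedServiceReference"
    },
    "parameters": { "tableName": { "type": "string" } },
    "typeProperties": {
      "schema": "dbo",
      "table": {
        "value": "@dataset().tableName",
        "type": "Expression"
      }
    }
  }
}
```

Inside the ForEach, the Copy activity then passes @replace(item().name,'.csv','') to tableName and sets "tableOption": "autoCreate" on the AzureSqlSink.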
I have multiple .xlsx files in my Blob storage and I need to copy them to my Azure SQL Database using Azure Data Factory.
I want to keep one Source dataset (Blob Storage - Excel).
So I added two parameters to the dataset:
File (string): blabla.xlsx
Sheet (string): blabla (the name of the sheet in Excel).
Source Dataset
If I go to Copy data with the details already filled in, I get the following error:
'Please select a work sheet for your dataset'
Copy data
If I hardcode the sheet name (blabla), it works, but then I cannot make use of a dynamic sheet name.
Does someone know how I can fix this?
If you want to pass the sheet name dynamically to the dataset, then you will have to have a dataset parameter and a pipeline parameter, and then pass the sheet name value from the pipeline parameter to the dataset parameter, as below:
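For illustration, a minimal sketch of that wiring in the Copy activity JSON, assuming hypothetical pipeline parameters named FileName and SheetName feeding the dataset's File and Sheet parameters:

```json
{
  "name": "Copy data1",
  "type": "Copy",
  "inputs": [
    {
      "referenceName": "ExcelSourceDataset",
      "type": "DatasetReference",
      "parameters": {
        "File": {
          "value": "@pipeline().parameters.FileName",
          "type": "Expression"
        },
        "Sheet": {
          "value": "@pipeline().parameters.SheetName",
          "type": "Expression"
        }
      }
    }
  ],
  "typeProperties": {
    "source": { "type": "ExcelSource" },
    "sink": { "type": "AzureSqlSink" }
  }
}
```

The dataset's Sheet name field then holds @dataset().Sheet, so each run reads whichever worksheet the pipeline parameter supplies.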