When converting an Excel file to CSV in a Synapse pipeline or data flow, I need to put the values of certain Excel cells into an additional column.
I was able to convert the Excel file to CSV, but I can't figure out how to read the value of a particular cell and add it as a column.
What I would like to achieve is as follows.
Excel sample file
I want to add the value in cell "C2" of the Excel file as a column named "outlet_code".
csv file
You can try the logic below:
First, use a Copy activity to copy the content of the Excel file from A1 to C2 into a CSV file (assuming the position of the code stays the same).
Excel source dataset:
Then use a Lookup activity to get the code value.
The Lookup should point to the CSV file generated above.
Finally, create another Copy activity that copies your Excel content into CSV with your current settings, and use the logic below to add an additional column:
Add additional column in copy activity using Azure Data Factory
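Outside Synapse, the effect of these three activities can be sketched in a few lines of Python. This is only an illustration, assuming the sheet has already been read into a list of rows; the sample values, the position of the table, and the helper names `cell_to_indices` and `add_cell_as_column` are hypothetical and not taken from the original file.

```python
import csv
import io
import re


def cell_to_indices(address):
    """Convert an A1-style address such as 'C2' to zero-based (row, col)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", address)
    if match is None:
        raise ValueError(f"not a valid cell address: {address!r}")
    letters, digits = match.groups()
    col = 0
    for ch in letters.upper():
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits) - 1, col - 1


def add_cell_as_column(grid, address, column_name, header_row):
    """Return CSV text for the table that starts at header_row, with the
    value of one cell repeated in an extra column on every data row."""
    row, col = cell_to_indices(address)
    value = grid[row][col]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(list(grid[header_row]) + [column_name])
    for data_row in grid[header_row + 1:]:
        writer.writerow(list(data_row) + [value])
    return out.getvalue()


# Hypothetical sheet: the code sits in C2, the table starts on row 4.
sheet = [
    ["Report", "", ""],
    ["", "Outlet", "A001"],
    ["", "", ""],
    ["item", "qty", "price"],
    ["apple", 3, 1.5],
    ["pear", 2, 2.0],
]
csv_text = add_cell_as_column(sheet, "C2", "outlet_code", header_row=3)
print(csv_text)
```

Reading the cell once and repeating it on every row corresponds to the Lookup followed by the additional column in the second Copy activity.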
Related
I am trying to copy an Excel file with multiple tabs from one folder to another in ADLS using a Copy activity in Data Factory. I have selected an Excel dataset as the source dataset and a CSV dataset as the sink dataset, with the name of the tab to copy defined in the dataset properties.
I am getting the error below while running the pipeline:
Only formula cells have cached results Activity ID: 0d26511f-4f82-45df-9e92-62c78f3f02b6
It looks like you are trying to copy multiple worksheets of your Excel file to ADLS using an ADF Copy activity. If that is the case, you will have to pass the worksheet names dynamically to the source dataset and use the same parameter to define the sink file name. This lets you copy multiple sheets of the same Excel file to the desired sink.
For more information, please refer to a similar conversation here: Dynamic sheet name in source dataset: (Excel (Blob storage)) on Azure Data Factory. - Error: Please select a work sheet for your dataset
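The idea of reusing one parameter for both the source sheet and the sink file name can be sketched in Python. This is an illustration of the pattern only, not ADF code; the workbook contents, the sheet names, and the `export_sheets` helper are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def export_sheets(sheets, out_dir):
    """Write one CSV per worksheet; the sheet name doubles as the file name."""
    written = []
    for sheet_name, rows in sheets.items():
        target = Path(out_dir) / f"{sheet_name}.csv"
        with open(target, "w", newline="", encoding="utf-8") as handle:
            csv.writer(handle).writerows(rows)
        written.append(target.name)
    return written


# Hypothetical workbook with two tabs, already loaded as rows.
workbook = {
    "Sales": [["id", "amount"], [1, 10]],
    "Returns": [["id", "amount"], [2, 5]],
}
with tempfile.TemporaryDirectory() as tmp:
    names = export_sheets(workbook, tmp)
    print(names)
```

In the pipeline, the loop corresponds to iterating over the sheet names and passing each one to the parameterized source and sink datasets.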
I am wondering if it is possible to add a date column to each file uploaded.
For example, a CSV is produced each month. I want to add, say, "December 2020" to each row, and then for the next month's upload add "January 2021" to every row in the CSV file, before copying it into a SQL database.
E.g. for the file name "Latest Rating December 2020" I would want 'December 2020' as a column with the same value for all rows. The naming convention will be the same for each month's upload.
Thanks
I've created a test to add a column to the csv file.
The result is as follows:
We can get the file name via Child Items in a Get Metadata activity.
The dataset points to the container in ADLS.
Then we can declare a variable FileName to store the file name via the expression @activity('Get Metadata1').output.childItems[0].name.
We can use an additional column in the Copy activity, with the expression @concat(split(variables('FileName'),' ')[2],' ',split(variables('FileName'),' ')[3]) to get the value we need. Note that the single quotes contain a space.
In the dataset, we need to key in the dynamic content @variables('FileName') to specify which file is to be copied.
The sink is the same as the source in the Copy activity.
Then we can run debug to confirm it.
I think we can also copy into a SQL table directly by setting the sink to a SQL table.
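The additional-column expression in this answer is plain string splitting, so its behaviour is easy to check outside the pipeline. The Python below is a rough equivalent, not ADF code; the function name `month_year_from_filename` is hypothetical.

```python
def month_year_from_filename(file_name):
    """Mirror the split/concat expression: join the third and fourth
    space-separated tokens of the file name with a single space."""
    parts = file_name.split(" ")
    return parts[2] + " " + parts[3]


print(month_year_from_filename("Latest Rating December 2020"))

# If the stored name carries an extension, it stays attached to the last
# token, so it would need to be stripped before splitting.
print(month_year_from_filename("Latest Rating January 2021.csv"))
```

The approach depends on the naming convention staying fixed: a name with fewer than four space-separated tokens would fail the index lookup.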
I have a question.
I have Excel data like this:
input file
There are more than 500 people.
I want to convert the data to CSV.
expected csv result
The age is not always in the second row; for some people it is in the third row. Names can be duplicated.
I'm really confused. Can I use an Excel feature to do this, or some other way such as coding?
I uploaded the file: https://ufile.io/rxe1l
Add name, age, add etc as column in excel and export as .csv.
In your Excel workbook, switch to the File tab, and then click Save As. Alternatively, you can press F12 to open the same Save As dialog.
In the Save as type box, choose to save your Excel file as CSV (Comma delimited).
Please simply follow this link: convert Excel into CSV
I'm provided with a folder of Excel files. Each represents one form with data entered in specific cells. Each file has the same format, and each would form ONE row of information to be imported into my SQL Server database.
I believe I can loop through each Excel file in the folder; however, I am having trouble finding the right tools to extract these specific cells and merge them into a single row to insert into the table.
Power Query to the rescue! :)
http://excelunplugged.com/2015/02/10/get-data-from-folder-in-power-query/
I ended up writing some VBA instead to move the data into a tabular/list form in one Excel sheet, then used that document to feed SSIS. So far, it does not seem like SSIS can do that initial part.
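The reshaping step described here (one form per file, fixed cells, one output row per form) can be sketched as follows. This is a Python illustration of the idea rather than the VBA that was actually written; it assumes each form has already been read into a list of rows, and the field names, cell positions, and sample values are hypothetical.

```python
def form_to_row(grid, field_cells):
    """Pick the configured cells out of one form and return them as a row."""
    return {field: grid[r][c] for field, (r, c) in field_cells.items()}


# Hypothetical layout: zero-based (row, col) of each field in every form.
FIELD_CELLS = {"name": (0, 1), "age": (1, 1), "city": (3, 1)}

# Two hypothetical forms that share the same layout.
forms = [
    [["Name", "Ann"], ["Age", 34], ["", ""], ["City", "Oslo"]],
    [["Name", "Bob"], ["Age", 41], ["", ""], ["City", "Rome"]],
]

# One row per form, ready to be loaded into a table.
table = [form_to_row(grid, FIELD_CELLS) for grid in forms]
for row in table:
    print(row)
```

Keeping the cell positions in one mapping means a change to the form layout only has to be made in a single place.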
I have a web service which continuously generates/updates a CSV file that can be downloaded externally. I would like to use an Excel file to continuously fetch the content of the CSV file and do calculations and visualization accordingly.
E.g.
My CSV file contains 4 rows and 3 columns.
In an Excel sheet, A1:C4 are used to store the contents of the CSV file. A5:C5 hold the average of each column. And a bar chart of the contents of each column is displayed in A5.
How can I get Excel to download the CSV file by specifying its URL and store it in A1:C4? And how can I make it update its contents automatically when the CSV file is updated?