I have metadata in my Azure SQL DB / CSV file as below, which has the old column names and data types and the new column names.
I want to rename the columns and change the data type of oldfieldname based on that metadata in ADF.
The idea is to store the metadata file in cache and use it in a lookup, but I am not able to do it in the data flow expression builder. Any idea which transform to use or how I should do it?
I have reproduced the above and was able to change the column names and data types as below.
This is the sample CSV file I have taken from Blob storage, which has the metadata of the table.
In your case, take care with the new data types: if the correct types are not given, the ALTER will generate an error because of the data already inside the table.
Create a dataset for this file, give it to a Lookup activity, and do not check the 'First row only' option.
This is my sample SQL table:
Pass the Lookup output array to a ForEach activity.
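Set the ForEach Items to @activity('Lookup1').output.value. Each item should then look something like the sketch below (hypothetical values, shaped to match the metadata file's OldName, NewName and Newtype columns):
[
    {"OldName":"custname","NewName":"CustomerName","Newtype":"varchar(100)"},
    {"OldName":"custid","NewName":"CustomerId","Newtype":"int"}
]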
Inside the ForEach, use a Script activity to execute the script that changes the column name and data type.
Script:
EXEC sp_rename 'mytable2.@{item().OldName}', '@{item().NewName}', 'COLUMN';
ALTER TABLE mytable2
ALTER COLUMN @{item().NewName} @{item().Newtype};
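For the first hypothetical metadata row shown earlier, the Script activity would resolve this to something like:
EXEC sp_rename 'mytable2.custname', 'CustomerName', 'COLUMN';
ALTER TABLE mytable2
ALTER COLUMN CustomerName varchar(100);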
Execute the pipeline; below is my SQL table with the changes applied.
I need to pick timestamp data from a column 'created on' in a CSV file in ADLS. Later I want to query Azure SQL DB in ADF, something like delete from table where created on = 'time stamp'. Please help on how this could be achieved.
Here is a repro that fetches a selected row from the CSV in ADLS.
Create a linked service and a dataset for the source file.
Read the data from the source path with a Lookup activity.
A ForEach activity iterates over the values from the output of the Lookup: @activity('Lookup1').output.value
Inside the ForEach activity, use an Append Variable activity to append the 'created on' value of each item to an array variable (here Date_COL3).
That array variable is then used with an index to pick the required record.
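A minimal sketch of the Append Variable value, assuming the CSV column header is literally 'created on':
@item()['created on']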
Use a Script activity to run the query against the database.
DELETE FROM dbo.test_table WHERE Created_on = '@{variables('Date_COL3')[4]}'
I'm querying an API using Azure Data Factory and the data I receive from the API looks like this.
{
"96":"29/09/2022",
"95":"31/08/2022",
"93":"31/07/2022"
}
When I come to write this data to a table, ADF assumes the column names are the numbers and the dates are stored as rows like this
96          95          93
29/09/2022  31/08/2022  31/07/2022
when I would like it to look like this:
Date        ID
29/09/2022  96
31/08/2022  95
31/07/2022  93
Does anyone have any suggestions on how to handle this? I ideally want to avoid using USPs and dynamic SQL. I really only need the ID for the month before the current one.
PS: the API doesn't support any filtering on this object.
Updates
I'm querying the API using a Web activity, and if I try to store the data in an array variable the activity fails because the output is an object.
When I use a Copy data activity, I've set the sink to automatically create the table, and the mapping looks like this:
mapping image
Thanks
Instead of directly trying to copy the JSON response to the SQL table, convert the response to a string, extract the required values, and insert them into the SQL table.
Look at the following demonstration. I have taken the sample response provided as a parameter (object type) and used a Set variable activity to extract the values.
My parameter:
{"93":"31/07/2022","95":"31/08/2022","96":"29/09/2022"}
Dynamic content used in set variable activity:
@split(replace(replace(replace(replace(string(pipeline().parameters.api_op),' ',''),'"',''),'{',''),'}',''),',')
The output for set variable activity will be:
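For the sample parameter above, this resolves to an array like:
["93:31/07/2022","95:31/08/2022","96:29/09/2022"]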
Now, inside a ForEach activity (pass the previous variable value as the Items value of the ForEach), I used a Copy data activity to copy each row separately to my sink table (auto create option enabled). I have taken a sample JSON file as my source (we are going to ignore all of its columns anyway).
In the copy activity source, create the two required additional columns, id and date, with the following dynamic content:
id: @split(item(),':')[0]
date: @split(item(),':')[1]
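For example, for the item 96:29/09/2022, id resolves to 96 and date to 29/09/2022.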
Configure the sink: select the database, create the dataset, give a name for the table (I have given dbo.opTable) and select Auto create table under the sink settings.
The following image shows the mapping. Delete the column mappings that are not required and use only the additional columns created above.
When I debug the pipeline, it runs successfully and the required values are inserted into the table. The following is the output sink table for reference.
I want to copy data from a CSV file (source) in Blob storage to an Azure SQL Database table (sink) via a regular Copy activity, but I also want to copy the file name alongside every entry into the table. I am new to ADF, so the solution is probably easy, but I have not been able to find the answer in the documentation or on the internet so far.
My mapping currently looks like this (I have created an output table with a file name column, but this data is not defined at the column level in the CSV file, so I need to extract it from the metadata and map it to that column):
At first I thought I would put dynamic content in there and solve the problem that way, but there is no option to use dynamic content in each individual mapping box, so I do not know how to implement that. My next thought was to use a pre-copy script, but I have not seen how it could be used for this purpose. What is the best way to solve this issue?
You cannot add dynamic content for metadata in the column mapping of the copy activity.
First give the source CSV dataset to a Get Metadata activity (with Item name in its field list), then connect it to the copy activity as below.
You can add the file name column via Additional columns in the copy activity source itself, using the same source CSV dataset and the dynamic content of the Get Metadata activity:
@activity('Get Metadata1').output.itemName
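A minimal sketch of the Additional columns entry (the column name FileName is hypothetical; use whatever name your SQL table expects):
Name:  FileName
Value: @activity('Get Metadata1').output.itemName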
If you are sure about the data types of your data, there is no need to adjust the mapping; you can just execute your pipeline.
Here I am copying the contents of the samplecsv.csv file to a SQL table named output.
My output for your reference:
I am trying to read data from a CSV into Azure SQL DB using a copy activity. I have selected the auto create option for the destination table in the sink dataset properties.
The copy activity is working fine, but all columns are getting created as nvarchar(max), i.e. with length -1. I don't want -1 as the default length for my auto-created sink table columns.
Does anyone know how to change the column length or create fixed-length columns during auto table creation in Azure Data Factory?
As you have clearly detailed, the auto create table option in the copy data activity creates a table with generic column definitions. You can run the copy activity initially in this way and then return to the target table and run T-SQL statements to further define the desired column definitions. There is no option to define the table schema with the auto create table option.
ALTER TABLE table_name ALTER COLUMN column_name new_data_type(size);
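For example, assuming a hypothetical auto-created table dbo.MyTable whose Name column came out as nvarchar(max):
ALTER TABLE dbo.MyTable ALTER COLUMN Name NVARCHAR(100);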
Clearly, the new column definition(s) must match the data that was initially copied via the first copy activity, but this is how you would combine the auto create table option with clearly defined column definitions.
It would be useful if you created an Azure Data Factory UserVoice entry to request that the auto create table functionality include the creation of column definitions, as a means of saving the time needed to go back and manually alter the columns to meet specific requirements.
I have a CSV file in Blob storage and I want to push it into a SQL table using Azure Data Factory, but I want a check condition on the CSV data: if any cell in a row has a null value, that row should be copied into an error table. For example, I have ID, name and contact columns in the CSV; if for some record the contact is null (1, 'Gaurav', NULL), that row should be inserted into the error table, and if there is no null in the row, it should go into the master table.
Note: as the sink is SQL on a VM, we can't create anything over there; we have to handle this at the Data Factory level only.
This can be done using a mapping data flow in ADF. One way of doing it is to use a derived column with an expression that does the null check, for example with the isNull() function. That way you can populate a new column with a value for the different cases, which you can then use in a conditional split to redirect the different streams to different sinks. A sketch of the expressions follows.
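A minimal sketch of the data flow expressions, assuming the three columns from the question (ID, name, contact); the derived column and stream names below are hypothetical:
Derived column:
hasNull = isNull(ID) || isNull(name) || isNull(contact)
Conditional split condition for the error stream:
hasNull
Rows matching the condition go to the error table sink; the default stream goes to the master table sink.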