I'm using the Copy Data activity in Microsoft Azure Data Factory to map JSON file attributes from my storage account (source) to MySQL database columns (sink).
Question:
Is it possible to get the Blob URLs within Copy Data from the JSON files and send them to my database? I know it's possible to get the file name in the source tab with "$$FILEPATH", but I'd like to get the complete URL.
OK, since it's a constant value, you can add it as a column to your JSON data and write it into the SQL DB.
Here is a quick demo that I created:
Created a Data Flow activity
In the Data Flow activity, added a JSON file as a source
Flattened the JSON (you have to flatten the JSON in order to write it into a SQL DB, because you can't write a complex data type to a SQL DB)
Added a derived column 'BlobUrl'; here I mapped a new column and added a string value to the data (see the screenshot attached below). You can ingest a variable here instead of a constant value; please check it out here: Passing a variable from pipeline to a dataflow in azure data factory
Saved the result in the SQL DB as a sink
DataFlow Activities:
Derived Column Activity:
Related
I am using Azure Data Factory to transfer data from a SOAP API connection to Snowflake. I understand that Snowflake has to have the data in a variant column or CSV, or we need intermediate storage in Azure to finally land the data in Snowflake. The problem I faced is that the data from the API is a string, and within that string there is XML data. So when I put the data in Blob storage, it's a string. How do I avoid this and get the proper columns when landing the data?
Here, the column is read as a string. Is there a way to parse it into the respective rows? I tried setting the collection reference, but it still does not recognize individual columns. Any input is highly appreciated.
You need to switch to the Advanced editor in the Mapping section of the Copy activity. I took the sample data and repro'd this. Below are the steps.
Img:1 Source dataset preview
In the Mapping section of the Copy activity:
Click Import Schema.
Switch to the Advanced editor.
Give the collection reference value.
Img:2 Mapping settings
I need to pick a timestamp value from a column 'created on' in a CSV file in ADLS. Later, in ADF, I want to query Azure SQL DB with something like delete from table where created on = 'timestamp'. Please help on how this could be achieved.
Here I repro'd fetching a selected row from the CSV in ADLS.
Create a linked service and dataset for the source file.
Read the data from the source path with a Lookup activity.
A ForEach activity iterates over the values from the output of the Lookup: @activity('Lookup1').output.value
Inside the ForEach activity, use an Append Variable activity and set the variable's value from the ForEach item records.
Use it as an indexed variable.
Use a Script activity to run the query against the data.
DELETE FROM dbo.test_table WHERE Created_on = '@{variables('Date_COL3')[4]}'
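At runtime the @{...} string interpolation is resolved before the SQL is sent to the database, so the Script activity ends up executing a plain literal comparison along these lines (the timestamp value is illustrative):

```sql
-- Illustrative only: the literal is whatever variables('Date_COL3')[4] holds at runtime.
DELETE FROM dbo.test_table
WHERE Created_on = '2022-01-15 10:30:00';
```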
I created an Azure Data Factory pipeline that uses a REST data source to pull data from a REST API and copy it to an Azure SQL database. Each row in the REST data source contains approx. 8 fields, but one of those fields contains an array of values. I'm using a Copy Data task. How do I get all values from that field to map into one of my database fields, possibly as a string? I've tried clicking on "Collection Reference" for that field, but if the array field has 5 values, it creates 5 different records in my SQL table for the one source row. If I don't select "Collection Reference", it only grabs the first value in the array.
I looked into using the Data Flow mapping task instead, but that one doesn't seem to support a Rest API dataset as a data source.
Please help.
You can store the output of the REST API as a JSON file in Azure Blob storage with a Copy Data activity. Then you can use that file as the source and do the transformation in a Data Flow. Alternatively, you can use a Lookup activity to get the JSON data and invoke a stored procedure to write it to Azure SQL Database (this way is cheaper and its performance is better).
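A minimal sketch of the stored-procedure route, assuming the JSON from the pipeline is passed in as one string and that the table, column and property names (dbo.ApiData, id, name, tags) are placeholders for your own:

```sql
-- Hypothetical example: shred the REST payload in SQL while keeping the
-- array field as a single JSON string per row.
CREATE OR ALTER PROCEDURE dbo.usp_LoadApiRows
    @payload NVARCHAR(MAX)   -- full JSON array passed from the pipeline
AS
BEGIN
    INSERT INTO dbo.ApiData (Id, Name, Tags)
    SELECT Id, Name, Tags
    FROM OPENJSON(@payload)
    WITH (
        Id   INT           '$.id',
        Name NVARCHAR(200) '$.name',
        Tags NVARCHAR(MAX) '$.tags' AS JSON  -- the array stays as one JSON string
    );
END
```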
I am running a trigger-based pipeline to copy data from Blob storage to a SQL database. In every blob file there is a bunch of JSONs, of which I need to copy just a few, and I can differentiate them on the basis of a key-value pair present in every JSON.
So how do I filter the JSONs containing that value for the common key?
One blob file looks like this. While the copy activity is happening, it should filter the data according to the Event-Name: "...".
Data Factory in general only moves data, it doesn't modify it. What you are trying to do might be done using a staging table in the sink SQL database.
You should first load the JSON values as-is from Blob storage into the staging table, then copy them from the staging table to the real table where you need them, applying your filter logic in the SQL command used to extract them.
Remember that SQL databases have built-in functions to handle JSON values: https://learn.microsoft.com/en-us/sql/relational-databases/json/json-data-sql-server?view=sql-server-2017
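A minimal sketch of that staging-to-target step, assuming a staging table dbo.StagingRaw with a single NVARCHAR(MAX) column RawJson, a target table dbo.Events, and that the discriminating key is Event-Name (all of these names are assumptions):

```sql
-- Copy only the JSONs whose Event-Name matches the value you care about.
INSERT INTO dbo.Events (EventName, Payload)
SELECT JSON_VALUE(RawJson, '$."Event-Name"') AS EventName,
       RawJson                               AS Payload
FROM dbo.StagingRaw
WHERE JSON_VALUE(RawJson, '$."Event-Name"') = 'DesiredEvent';
```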
Hope this helped!
At this time we do not have an option for the copy activity to filter the content (with the exception of a SQL source).
In your scenario it looks like you already know which values need to be omitted. One way to go would be to have a "Stored Procedure" activity after the copy activity which simply deletes the values you don't want from the table; this should be easy to implement, but depending on the volume of data it may lead to performance issues. The other option is to have the JSON file cleaned on the storage side before it is ingested.
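A minimal sketch of such a cleanup procedure, assuming the copied rows land in a table dbo.Events with the raw JSON in a Payload column and that Event-Name is the key you filter on (names are assumptions):

```sql
-- Hypothetical cleanup step: remove every row whose Event-Name is not the one to keep.
CREATE OR ALTER PROCEDURE dbo.usp_CleanupEvents
    @KeepEventName NVARCHAR(100)
AS
BEGIN
    DELETE FROM dbo.Events
    WHERE JSON_VALUE(Payload, '$."Event-Name"') <> @KeepEventName
       OR JSON_VALUE(Payload, '$."Event-Name"') IS NULL;
END
```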
I am using Azure Data Factory V1. We want to copy the JSON data stored as documents in Azure Cosmos DB to an Azure SQL table, using a copy activity.
I figured out how to copy the data by specifying columns in the SQL table that match the property names from the JSON. However, our goal is to copy the entire JSON document as a single field. We are doing this to stay agnostic to the schema within the JSON data.
I have tried specifying a single nvarchar(max) column to store the JSON data, and setting the query on the copy activity to "select c as "FullData" from c". But the copy activity simply generates a NULL.
I think this is because "FullData" is of type JSON on the document end and a string on the SQL end. I also tried to convert the JSON object to a string within the Cosmos DB query, but I couldn't find any API to do so.
I know we could write a custom activity to accomplish what I want, but is this possible with ADF's out-of-the-box functionality?
You can use the jsonPathDefinition, similar to this:
"column_full": "$. "
Refer to this link on how to use jsonFormat with ADF: https://learn.microsoft.com/en-us/azure/data-factory/supported-file-formats-and-compression-codecs#json-format
To use ADF to copy the JSON document as a single field into Azure SQL Database, you need to make sure the result set returned by the specified Cosmos DB query is actually a single "column" containing the entire object as a string.
I don't think Cosmos DB has built-in query syntax for this, but you can create a UDF (user-defined function) in Cosmos DB to convert an object to a string, e.g. with JSON.stringify(), then call that UDF in the SELECT query used as the ADF copy source.
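A minimal sketch, assuming a UDF named stringify has been registered on the collection (the name is an assumption; its body would just wrap JSON.stringify):

```sql
-- Assumes a registered Cosmos DB UDF along the lines of:
--   function stringify(doc) { return JSON.stringify(doc); }
-- The copy activity source query then becomes:
SELECT udf.stringify(c) AS FullData FROM c
```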