Creating JSON Array in Azure Data Factory with multiple Copy Activities output objects - azure

Is it possible to embed the output of a copy activity in Azure Data Factory within an array that is meant to be iterated over in a subsequent ForEach?
My goal is to create an array with the output of several copy activities and then in a ForEach, access the properties of those copy activities with dot notation (Ex: item().rowsRead). Image shows code details.
Specifically, I have 7 copy activities whose output JSON object (described here) would be stored in an array that I then iterate over. In the ForEach I would be checking the properties on each of the copy activities (rowsRead, rowsCopied, etc.) for validation purposes.
https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-monitoring

Yes, the output of a copy activity can be embedded in an array. I tested this by saving the output of two Copy activities into an array. The idea is to concatenate the output into a JSON-formatted string and then convert that string to a JSON object with json().
Declare an array-type variable named CopyInfo to store the outputs. A second array-type variable named JsonArray is used only to inspect the result in debug mode.
In the Append variable1 activity, I use @json(concat('{"activityName":"Copy1","activityObject":',activity('Copy data1').output,'}')) to wrap the output of the Copy data1 activity and convert it from a string to a JSON object.
In the Append variable2 activity, I use @json(concat('{"activityName":"Copy2","activityObject":',activity('Copy data2').output,'}')) to wrap the output of the Copy data2 activity and convert it from a string to a JSON object.
Then I assign the value of the CopyInfo variable to the JsonArray variable.
In the end, the JSON array looks like this:
"name": "JsonArray",
"value": [
    {
        "activityName": "Copy1",
        "activityObject": {
            "dataRead": 643,
            "dataWritten": 643,
            "filesRead": 1,
            "filesWritten": 1,
            ...
        }
    },
    {
        "activityName": "Copy2",
        "activityObject": {
            "dataRead": 643,
            "dataWritten": 643,
            "filesRead": 1,
            "filesWritten": 1,
            ...
        }
    }
]
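To make the data shape concrete, here is a rough Python sketch of the same idea outside ADF: wrap each copy output in an object, collect the objects in an array, then iterate over the array to validate row counts. This is not ADF code. The helper names and the sample values are hypothetical, and rowsRead/rowsCopied are simply the property names mentioned in the question.

```python
import json

def wrap_copy_output(activity_name, output_json):
    """Mimic json(concat('{"activityName":...,"activityObject":', output, '}'))."""
    text = '{"activityName":"' + activity_name + '","activityObject":' + output_json + '}'
    return json.loads(text)

def find_row_mismatches(copy_info):
    """Return the names of activities whose rowsRead differs from rowsCopied."""
    return [
        entry["activityName"]
        for entry in copy_info
        if entry["activityObject"].get("rowsRead") != entry["activityObject"].get("rowsCopied")
    ]

# Hypothetical sample outputs, serialized as JSON text.
copy_info = []
copy_info.append(wrap_copy_output("Copy1", '{"rowsRead": 10, "rowsCopied": 10}'))
copy_info.append(wrap_copy_output("Copy2", '{"rowsRead": 10, "rowsCopied": 8}'))

mismatches = find_row_mismatches(copy_info)
print(mismatches)
```

In the pipeline, the loop body would correspond to a ForEach over the array variable, reading properties such as item().activityObject.rowsRead.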

Related

How to send the output values of a Lookup activity in an email in Data Factory?

I'm trying to send the output values of a Lookup activity as part of the body of a POST request to a Logic App, which takes three parameters: "to", "email_body", "subject".
The LookUp activity depends on a query, and it may return from 2 rows up to 10 rows.
According to Azure, the output of the activity should look like this:
{
"count": 2,
"value": [
{
"column1":value1,
"column2":value2,
"column3":value3
},
{
"column1":value4,
"column2":value5,
"column3":value6
}
]
}
In this case, the query returned 2 rows, but how can I attach every output value to the POST body without having to use @activity('lookup_act').output.value[0].column1 and so on for every value?
The POST body is the following:
{
"email_body": "Hi, the following tables have been updated:
@{activity('lookup_act').output.value[0].column1}
@{activity('lookup_act').output.value[1].column1}",
"subject": "Update on tables",
"to": "email@domain.com"
}
I've tried using @activity('lookup_act').output.value to bring in every value, but it doesn't work.
Is there a way to reference every output value? If so, how can it be done, and can the values be pasted into a table?
Thanks beforehand.
There are two ways to include all the values in the mail:
1. Send the whole Lookup output array in the mail.
First get the results from the Lookup activity, then pass its output converted to a string; otherwise you will get a deserialization error.
{"message":"@string(activity('Lookup1').output.value)",
"dataFactoryName":"@{pipeline().DataFactory}",
"pipelineName":"@{pipeline().Pipeline}",
"receiver":"@{pipeline().parameters.receiver}"}
2. Send the values column by column.
First get the results from the Lookup activity, then use a ForEach loop with one Append variable activity per column, so that each column's values are stored in a single array.
ForEach activity setting:
I added an Append variable activity, created an Idarray variable, and gave it item().id as the value, so that all id values are stored in one array.
Then, in the Web activity, I passed the body below to send all the arrays.
{"message":"@{string(variables('Idarray'))} as Id, @{string(variables('Namearray'))} as Name, @{string(variables('ProfessionArray'))} as Profession",
"dataFactoryName":"@{pipeline().DataFactory}",
"pipelineName":"@{pipeline().Pipeline}",
"receiver":"@{pipeline().parameters.receiver}"}
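For comparison, both options can be sketched in plain Python. This is only an illustration of the logic, not ADF code: the function names are made up, and the sample rows ("table_a", "table_b") are hypothetical stand-ins for the Lookup output shown in the question.

```python
import json

def lookup_to_message(lookup_output):
    """Option 1: serialize the whole 'value' array, like string(...output.value)."""
    return json.dumps(lookup_output["value"])

def column_values(lookup_output, column):
    """Option 2: collect one column across all rows, like Append variable in a ForEach."""
    return [row[column] for row in lookup_output["value"]]

# Hypothetical Lookup output with two rows.
lookup_output = {
    "count": 2,
    "value": [
        {"column1": "table_a", "column2": 1},
        {"column1": "table_b", "column2": 2},
    ],
}

body = {
    "email_body": "Hi, the following tables have been updated: "
    + ", ".join(column_values(lookup_output, "column1")),
    "subject": "Update on tables",
}
print(body["email_body"])
```

Either way, the body no longer needs one hard-coded index per row, so it works whether the query returns 2 rows or 10.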

ADF/Synapse all Objects iterate and remove the Underscore

I want to iterate over a list of objects/tables. One object, "Admin_process", is not picked up because of the underscore between the words. The expectation is to get "Adminprocess" in ADF/Synapse by removing the underscore, so that all objects are passed to the copy operation.
Objects/Tables list
AdminUser
Admin_process
TempUser
That is the current list. However, the object "Admin_process" is not read because of the underscore.
Could someone please tell me how to handle this case?
Thank you,
You can use the replace function in ADF dynamic content.
Please follow the demonstration below.
Here I am using an array parameter whose objects have an Objectname key, with the table names from the list above as values.
[
{
"Objectname": "AdminUser"
},
{
"Objectname": "Admin_process"
},
{
"Objectname": "TempUser"
}
]
Pass the parameter array to the ForEach activity:
To use the replace function, create a Set variable activity and give it the expression below.
@replace(item().Objectname, '_', '')
Output with the required result (underscore removed):
Now you can pass this value to a copy activity inside the same ForEach activity.
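As a quick sanity check of what the expression produces for each item, the same transformation can be written in Python. This is only a sketch of the logic; the function name is hypothetical.

```python
def strip_underscores(objects):
    """Apply the equivalent of replace(item().Objectname, '_', '') to every item."""
    return [obj["Objectname"].replace("_", "") for obj in objects]

objects = [
    {"Objectname": "AdminUser"},
    {"Objectname": "Admin_process"},
    {"Objectname": "TempUser"},
]

cleaned = strip_underscores(objects)
print(cleaned)
```

Names without an underscore pass through unchanged, so the same expression can be applied to every item in the ForEach.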

how to compare 2 JSON files in Azure data factory

I'm new to Azure Data Factory. I want to compare 2 JSON files through Azure Data Factory. We need to get the list of new IDs in the current JSON file that are not in the previous JSON file. Below are the 2 sample JSON files.
Previous JSON file :
{
"count": 2,
"values": [
{
"id": "4e10aa02d0b945ae9dcf5cb9ded9a083"
},
{
"id": "cbc414db-4d08-48f2-8fb7-748c5da45ca9"
}
]
}
Current JSON file:
{
"count": 3,
"values": [
{
"id": "4e10aa02d0b945ae9dcf5cb9ded9a083"
},
{
"id": "cbc414db-4d08-48f2-8fb7-748c5da45ca9"
},
{
"id": "5ea951e3-88d7-40b4-9e3f-d787b94a43c8"
}
]
}
New IDs should go through one activity and old IDs through another.
We are running out of time, so please help me out.
Thanks in advance!
You can simply use an If Condition activity.
If expression:
@equals(activity('Lookup1').output.value,activity('Lookup2').output.value)
I also used a Fail activity on the False branch for better visibility.
--
Lookup1 Activity --> Json1.json
Lookup2 Activity --> Json2.json
This can be done using a single Filter activity.
I have assigned two parameters, "Old_json" and "New_json", for your previous and current JSON files respectively.
In the settings of the Filter activity:
Items: @pipeline().parameters.New_json.values
Condition: @not(contains(pipeline().parameters.Old_json.values,item()))
The Filter activity goes through each item in the new JSON and checks whether it is present in the old JSON. Items that are not present are returned as the output.
Output of the filter activity
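The same filter logic, written out in Python with the two sample files from the question, looks roughly like this. It is only a sketch to show which items survive the condition; the function name is hypothetical.

```python
def new_items(old_json, new_json):
    """Keep items of the new file that do not appear in the old file."""
    old_values = old_json["values"]
    return [item for item in new_json["values"] if item not in old_values]

old_json = {
    "count": 2,
    "values": [
        {"id": "4e10aa02d0b945ae9dcf5cb9ded9a083"},
        {"id": "cbc414db-4d08-48f2-8fb7-748c5da45ca9"},
    ],
}
new_json = {
    "count": 3,
    "values": [
        {"id": "4e10aa02d0b945ae9dcf5cb9ded9a083"},
        {"id": "cbc414db-4d08-48f2-8fb7-748c5da45ca9"},
        {"id": "5ea951e3-88d7-40b4-9e3f-d787b94a43c8"},
    ],
}

added = new_items(old_json, new_json)
print(added)
```

Dropping the "not" from the condition would give the old IDs instead, which covers the second branch the question asks about.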
Thanks @KarthikBhyresh-MT for a helpful answer.
Just to add, if (like me) you want to compare two files (or in my case, a file with the output of a SQL query), but don't care about the order of the records, you can do this using a ForEach activity. This also has the benefit of allowing a more specific error message in the case of a difference between the files.
My first If Condition checks the two files have the same row count, with the expression:
@equals(activity('Select from SQL').output.count, activity('Lookup from CSV').output.count)
The False branch leads to a Fail activity with message:
@concat(pipeline().parameters.TestName, ': CSV has ', string(activity('Lookup from CSV').output.count), ' records but SQL query returned ', string(activity('Select from SQL').output.count))
If this succeeds, flow passes to a ForEach, iterating through items:
@activity('Lookup from CSV').output.value
... which contains an If Condition with expression:
@contains(string(activity('Select from SQL').output.value), string(item()))
The False branch for that If Condition contains an Append variable activity, which appends to a variable I've added to the pipeline called MismatchedRecords. The Value appended is:
@item()
Following the ForEach, a final If Condition then checks whether MismatchedRecords contains any items:
@equals(length(variables('MismatchedRecords')), 0)
... and the False branch contains another Fail activity, with message:
@concat(string(length(variables('MismatchedRecords'))), ' records from CSV not found in SQL. Missing records: ', string(variables('MismatchedRecords')), ' SQL output: ', string(activity('Select from SQL').output.value))
The message contains specific information about the records which could not be matched, to allow further investigation.
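The flow above (count check, per-record containment check, collected mismatches) can be sketched in Python as follows. This is an approximation for illustration, not ADF code: the function name is hypothetical, and json.dumps with sorted keys stands in for string(), whose exact serialization in ADF may differ.

```python
import json

def compare_records(csv_rows, sql_rows):
    """Return an error message describing the difference, or None if the sets match."""
    # First If Condition: both sources must have the same row count.
    if len(csv_rows) != len(sql_rows):
        return f"CSV has {len(csv_rows)} records but SQL query returned {len(sql_rows)}"
    # ForEach + If Condition: is the serialized record found in the serialized SQL output?
    sql_text = json.dumps(sql_rows, sort_keys=True)
    mismatched = [
        row for row in csv_rows
        if json.dumps(row, sort_keys=True) not in sql_text
    ]
    # Final If Condition: fail when any record could not be matched.
    if mismatched:
        return (
            f"{len(mismatched)} records from CSV not found in SQL. "
            f"Missing records: {json.dumps(mismatched)}"
        )
    return None

same_but_reordered = compare_records([{"id": 1}, {"id": 2}], [{"id": 2}, {"id": 1}])
count_differs = compare_records([{"id": 1}], [])
one_missing = compare_records([{"id": 1}, {"id": 3}], [{"id": 1}, {"id": 2}])
print(same_but_reordered, count_differs, one_missing, sep="\n")
```

Because the check is a substring match on serialized text, it depends on both sources serializing a record identically (same property order, same number formatting), which is worth verifying on real data.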

Filter inside ForEach activity in Azure Datafactory

I was trying to filter an array based on values from a different array. Since the Filter activity doesn't allow arrays inside conditions, I am trying to filter one array by iterating over each value in the second array. However, while writing the condition in the Filter activity, I am unable to reference the item() value of the ForEach loop (the value from the second array that the iteration is running on). Is there a way to reference the outer item() inside the Filter activity?
I saw a post suggesting that items("ForEachActivity") can be used to refer to the ForEach activity's values; however, it throws an error: {"code":"BadRequest","message":"ErrorCode=InvalidTemplate, ErrorMessage=The template validation failed: 'The workflow action 'FilterFilter1' at line '1 and column '42236' references the action 'ForEach1' of type 'Http': only the actions of type 'foreach' are allowed to be referenced by 'repeatItems' or 'items' functions","target":"pipeline/TableIeratorPipeline/runid/f332271e-4628-4a3f-95a2-7794e3a4216f","details":null,"error":null}
You can create a variable to save the current item from For Each activity.
My test:
1. Create three variables in the pipeline.
2. Create a ForEach activity and check the Sequential option.
3. Pass the value of the current item to a variable.
4. In the Filter activity, check whether the two values are equal.
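The steps above amount to a nested loop in which the outer item is first copied into a variable so that the inner filter can see it. A rough Python equivalent, with hypothetical names and sample values, might look like this:

```python
def filter_by_each(first_array, second_array):
    """Filter first_array once per value of second_array, collecting the matches."""
    matches = []
    for outer_item in second_array:      # ForEach with the Sequential option checked
        current_item = outer_item        # Set variable: save the outer item()
        # Filter activity: inside it, item() refers to elements of first_array,
        # so the outer value has to be read from the variable instead.
        matches.extend(x for x in first_array if x == current_item)
    return matches

result = filter_by_each(["a", "b", "c"], ["b", "c", "d"])
print(result)
```

The Sequential option matters here because pipeline variables are shared across iterations; with parallel iterations the variable could be overwritten before the Filter reads it.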

How to render JSON using Stream Analytics Query

I have inputs in the form of JSON stored in Blob Storage.
I have an output in the form of an Azure SQL table.
I wrote a query that successfully moves the values of specific JSON properties to the corresponding columns of the Azure SQL table.
Now, for one column, I want to copy the entire JSON payload as a serialized string into a single SQL column, but I can't find a suitable built-in function to do that.
SELECT
CASE
WHEN GetArrayLength(E.event) > 0
THEN GetRecordPropertyValue(GetArrayElement(E.event, 0), 'name')
ELSE ''
END AS EventName
,E.internal.data.id as DataId
,E.internal.data.documentVersion as DocVersion
,E.context.custom As CustomDimensionsPayload
Into OutputTblEvents
FROM InputBlobEvents E
CustomDimensionsPayload should actually contain the JSON as a string.
I made a user defined function which did the job for me:
// JavaScript UDF: serialize the input record to a JSON string.
function main(InputJSON) {
    var InputJSONString = JSON.stringify(InputJSON);
    return InputJSONString;
}
Then, inside the Query, I used the function like this:
SELECT udf.ConvertToJSONString(COLLECT()) AS InputJSON
INTO outputX
FROM inputY
If you want the entire payload converted, just reference the input object itself instead of COLLECT(). I was trying to do the same thing, so I figured I'd add what I did.
I used the same function suggested by PerSchjetne; the query then becomes
SELECT udf.JSONToString(IoTInputStream)
INTO [SQLTelemetry]
FROM [IoTInputStream]
Your output will now be the full JSON string, including all the metadata extras that IOT hub adds on.
