I have a stored procedure that accepts a single input parameter and returns a data set. I want to invoke this stored procedure from my ADF pipeline, pass its result set to another stored procedure, and then do further processing with that second result.
I tried the Stored Procedure activity, but its output doesn't contain the actual data set:
{
"effectiveIntegrationRuntime": "AutoResolveIntegrationRuntime (Australia East)",
"executionDuration": 0,
"durationInQueue": {
"integrationRuntimeQueue": 0
},
"billingReference": {
"activityType": "ExternalActivity",
"billableDuration": [
{
"meterType": "AzureIR",
"duration": 0.016666666666666666,
"unit": "Hours"
}
]
}
}
I also tried the Lookup activity, but its result only contains the first row of the resulting data set:
{
"firstRow": {
"CountryID": 1411,
"CountryName": "Maldives",
"PresidentName": "XXXX"
},
"effectiveIntegrationRuntime": "AutoResolveIntegrationRuntime (Australia East)",
"billingReference": {
"activityType": "PipelineActivity",
"billableDuration": [
{
"meterType": "AzureIR",
"duration": 0.016666666666666666,
"unit": "DIUHours"
}
]
},
"durationInQueue": {
"integrationRuntimeQueue": 0
}
}
My main intention behind using ADF is to reduce the huge amount of time an existing API (.NET Core) currently takes for the same steps. What else can be done? Should I consider any other Azure service(s)?
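For reference, a minimal sketch of the kind of stored procedure being invoked, with a hypothetical name and columns inferred from the Lookup output above (the real procedure will differ):

-- Hypothetical procedure; real name, parameter and columns will differ.
CREATE PROCEDURE dbo.GetCountryDetails
    @CountryID INT
AS
BEGIN
    SET NOCOUNT ON;

    -- Returns the result set that the pipeline would pass on to the
    -- second stored procedure for further processing.
    SELECT CountryID, CountryName, PresidentName
    FROM dbo.Country
    WHERE CountryID = @CountryID;
END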
I'm new to JSONPath and want to write a JSONPath expression that retrieves a property value only if a certain condition is met. The value I'm after is not part of an array, but I've managed to make filtering work in the following JSONPath tool: https://www.site24x7.com/tools/json-path-evaluator.html
Given the following JSON, I only want to extract the value of column2.dimValue if column2.attributeId equals B0:
{
"batchId": 279,
"companyId": "40",
"period": 202208,
"taxCode": "1",
"taxSystem": "",
"transactionDate": "2022-08-05T00:00:00.000",
"transactionNumber": 222006089,
"transactionType": "IF",
"year": 2022,
"accountingInformation": {
"account": "4010",
"column1": {
"attributeId": "H9",
"dimValue": "76"
},
"column2": {
"attributeId": "B0",
"dimValue": "2170103"
},
"column3": {
"attributeId": "",
"dimValue": ""
},
"column4": {
"attributeId": "BF",
"dimValue": "217010330"
},
"column5": {
"attributeId": "10",
"dimValue": "3101"
},
"column6": {
"attributeId": "06",
"dimValue": ""
},
"column7": {
"attributeId": "19",
"dimValue": "K"
}
},
"categories": {
"cat1": "H9",
"cat2": "B0",
"cat3": "",
"cat4": "BF",
"cat5": "10",
"cat6": "06",
"cat7": "19",
"dim1": "76",
"dim2": "2170103",
"dim3": "",
"dim4": "217010330",
"dim5": "3101",
"dim6": "",
"dim7": "K"
},
"amounts": {
"amount": 48.24,
"amount3": 0.0,
"amount4": 0.0,
"currencyAmount": 48.24,
"currencyCode": "NOK",
"debitCreditFlag": 1
},
"invoice": {
"customerOrSupplierId": "58118",
"description": "",
"externalArchiveReference": "",
"externalReference": "2170103",
"invoiceNumber": "220238522",
"ledgerType": "P"
},
"additionalInformation": {
"number": 0,
"orderLineNumber": 0,
"orderNumber": 0,
"sequenceNumber": 1,
"status": "",
"value": 0.0,
"valueDate": "2022-08-05T00:00:00.000"
},
"lastUpdated": {
"updatedAt": "2022-09-05T10:59:11.633",
"updatedBy": "HELVES"
}
}
I've used this JSONPath expression:
$['accountingInformation']['column2'][?(@.attributeId=='B0')].dimValue
This gives the following result:
[
"2170103"
]
I'm using this result in an Azure Data Factory mapping, and it doesn't seem to work because the result is an array.
Can anyone help me with syntax so it returns only the actual value? Is that even possible?
I repro'd the same and below is the approach.
A sample JSON file (shown in the image below) is taken as the source in a Lookup activity.
An If Condition activity is used to check whether column2 has attributeId = 'B0'. The expression is given below:
@equals(activity('Lookup1').output.value[0].accountingInformation.column2.attributeId, 'B0')
In the True case of the If activity, a Set Variable activity is added. A new variable of string type is created and set using the below expression.
@activity('Lookup1').output.value[0].accountingInformation.column2.dimValue
Then a Copy activity is added after the If activity. A dummy dataset is taken as the source, and +New is clicked under Additional columns:
Name: col1
Value: @variables('v2')
In Mapping, Import schemas is clicked, and all columns except the additional column added in the source are deleted.
The pipeline is debugged and the data is copied to the sink without error.
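A rough sketch of how the If Condition and Set Variable pieces might look in the pipeline JSON (activity and variable names are assumed and may differ from the screenshots):

{
    "name": "If Condition1",
    "type": "IfCondition",
    "dependsOn": [ { "activity": "Lookup1", "dependencyConditions": [ "Succeeded" ] } ],
    "typeProperties": {
        "expression": {
            "value": "@equals(activity('Lookup1').output.value[0].accountingInformation.column2.attributeId, 'B0')",
            "type": "Expression"
        },
        "ifTrueActivities": [
            {
                "name": "Set Variable1",
                "type": "SetVariable",
                "typeProperties": {
                    "variableName": "v2",
                    "value": {
                        "value": "@activity('Lookup1').output.value[0].accountingInformation.column2.dimValue",
                        "type": "Expression"
                    }
                }
            }
        ]
    }
}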
I have a pipeline that pulls data from an external source and sinks it into a SQL Server table as staging. The process of getting the raw data already succeeds using 4 'Copy data' activities; because there are so many columns (250), I split them.
The next requirement is to validate those 4 'Copy data' activities by checking for a Succeeded status. The output of a 'Copy data' activity looks like this:
{
"dataRead": 4772214,
"dataWritten": 106918,
"sourcePeakConnections": 1,
"sinkPeakConnections": 1,
"rowsRead": 1366,
"rowsCopied": 1366,
"copyDuration": 8,
"throughput": 582.546,
"errors": [],
"effectiveIntegrationRuntime": "AutoResolveIntegrationRuntime (Southeast Asia)",
"usedDataIntegrationUnits": 4,
"billingReference": {
"activityType": "DataMovement",
"billableDuration": [
{
"meterType": "AzureIR",
"duration": 0.016666666666666666,
"unit": "DIUHours"
}
]
},
"usedParallelCopies": 1,
"executionDetails": [
{
"source": {
"type": "RestService"
},
"sink": {
"type": "AzureSqlDatabase",
"region": "Southeast Asia"
},
"status": "Succeeded",
"start": "2022-04-13T07:16:48.5905628Z",
"duration": 8,
"usedDataIntegrationUnits": 4,
"usedParallelCopies": 1,
"profile": {
"queue": {
"status": "Completed",
"duration": 4
},
"transfer": {
"status": "Completed",
"duration": 4,
"details": {
"readingFromSource": {
"type": "RestService",
"workingDuration": 1,
"timeToFirstByte": 1
},
"writingToSink": {
"type": "AzureSqlDatabase",
"workingDuration": 0
}
}
}
},
"detailedDurations": {
"queuingDuration": 4,
"timeToFirstByte": 1,
"transferDuration": 3
}
}
],
"dataConsistencyVerification": {
"VerificationResult": "NotVerified"
},
"durationInQueue": {
"integrationRuntimeQueue": 0
}
}
Now, I want to get "status": "Succeeded" from the JSON output so I can validate it in an 'If Condition'. So I set the variable's value to the dynamic content @activity('copy_data_Kobo_MBS').output,
but when it runs, I get this error:
The variable 'copy_Kobo_MBS' of type 'Boolean' cannot be initialized
or updated with value of type 'Object'. The variable 'copy_Kobo_MBS'
only supports values of types 'Boolean'.
The question is: how do I get "status": "Succeeded" from the JSON output as the variable's value, so the 'If Condition' can examine it?
You can use the below expression to pull the run status from the copy data activity. As your variable is of Boolean type, you need to evaluate it using the @equals() function, which returns true or false.
@equals(activity('Copy data1').output.executionDetails[0].status, 'Succeeded')
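If all four copy activities need to be validated in a single If Condition or Boolean variable, a sketch along the same lines (the activity names here are assumed) could combine them with and():

@and(
    and(
        equals(activity('Copy data1').output.executionDetails[0].status, 'Succeeded'),
        equals(activity('Copy data2').output.executionDetails[0].status, 'Succeeded')
    ),
    and(
        equals(activity('Copy data3').output.executionDetails[0].status, 'Succeeded'),
        equals(activity('Copy data4').output.executionDetails[0].status, 'Succeeded')
    )
)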
Also, as far as I know, you don't have to extract the status from the copy data activity at all, because you are connecting your copy activity to the Set Variable activity on its success output.
That means your Set Variable activity runs only when your copy data activity has run successfully.
Also, note that
If the copy data activity (or any other activity) fails, then the activities added on the success output of that activity will not run.
If you connect the output of more than one activity to a single activity, it only runs when all the connected activities have run.
You can add activities on the failure or completion outputs to process further.
Example:
In the snip below, the Set Variable activity does not run because the copy data activity was not successful, and the Wait2 activity does not run because not all of its input activities ran successfully.
I'm using a Stored Procedure activity in an ADF v2 pipeline. The issue is that whenever the pipeline fails at the stored procedure activity, I'm not getting the complete error details. Below is the JSON output of that stored procedure activity:
{
"effectiveIntegrationRuntime": "DefaultIntegrationRuntime (West Europe)",
"executionDuration": 416,
"durationInQueue": {
"integrationRuntimeQueue": 0
},
"billingReference": {
"activityType": "ExternalActivity",
"billableDuration": [
{
"meterType": "AzureIR",
"duration": 0.11666666666666667,
"unit": "Hours"
}
]
}
}
Please let me know how I can get the error details for the stored procedure activity in an ADF v2 pipeline.
You should throw the exception in your stored procedure code:
https://learn.microsoft.com/en-us/sql/t-sql/language-elements/throw-transact-sql?view=sql-server-ver15
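A minimal sketch of that approach, assuming a hypothetical procedure name; re-throwing from a CATCH block makes the full SQL error number and message appear in the ADF activity's error output:

-- Hypothetical procedure; wrap your existing logic in TRY/CATCH and re-throw.
CREATE OR ALTER PROCEDURE dbo.usp_DoWork
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRY
        -- ... existing logic here ...
        SELECT 1 / 0;   -- placeholder failure to illustrate
    END TRY
    BEGIN CATCH
        -- Re-throws the original error so ADF surfaces the complete details.
        THROW;
    END CATCH
END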
How can I iterate only the widget part of the Terraform script and get all the widgets in a single dashboard?
locals {
instances = csvdecode(file("${path.module}/sample.csv"))
}
// if we use count it will loop this part
resource "aws_cloudwatch_dashboard" "main" {
dashboard_name = "my-dashboard"
dashboard_body = <<EOF
{
"widgets": [
{
"type":"metric",
"x":0,
"y":0,
"width":12,
"height":6,
"properties":{
"metrics":[
for itr in local.instances.id:
[
"AWS/EC2",
"CPUUtilization",
"InstanceId",
itr // want this section to fetch the value from excel
]
],
"period":300,
"stat":"Average",
"region":"ap-south-1",
"title":"EC2 Instance CPU ",
"annotations": {
"horizontal": [
{
"label": "Untitled annotation",
"value": 2
}]}
}},]}
EOF
}
If your goal is to generate JSON, it's generally better to use jsonencode rather than template_file, because it can handle the JSON syntax details automatically and thus avoid the need to tweak annoying details of a text template to get the JSON right.
For example:
  dashboard_body = jsonencode({
    "widgets": [
      {
        "type": "metric",
        "x": 0,
        "y": 0,
        "width": 12,
        "height": 6,
        "properties": {
          "metrics": [
            for inst in local.instances : [
              "AWS/EC2",
              "CPUUtilization",
              "InstanceId",
              inst.id,
            ]
          ],
          "period": 300,
          # etc, etc
        },
      },
    ],
  })
By using jsonencode you can use any of Terraform's normal language features and functions to produce your data structure, and leave the jsonencode function to turn that into valid JSON syntax at the end.
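As a small worked illustration (assuming sample.csv has an id column, matching the locals block in the question), a hypothetical output could be used to preview what the for expression produces before wiring it into the dashboard body:

# Hypothetical helper output for inspecting the generated metrics list;
# assumes local.instances comes from csvdecode() with an "id" column.
output "ec2_metrics_preview" {
  value = [
    for inst in local.instances : [
      "AWS/EC2",
      "CPUUtilization",
      "InstanceId",
      inst.id,
    ]
  ]
}

# With ids i-00001 and i-00002 in the CSV, this renders roughly as:
#   [["AWS/EC2", "CPUUtilization", "InstanceId", "i-00001"],
#    ["AWS/EC2", "CPUUtilization", "InstanceId", "i-00002"]]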
I tried using two template_file data sources.
sample.csv for test
instance_id
i-00001
i-00002
i-00003
Create a template_file data source for the loop:
data "template_file" "ec2_metric" {
count = length(local.instances)
template = jsonencode([ "AWS/EC2", "CPUUtilization", "InstanceId", element(local.instances.*.instance_id, count.index)])
}
Create a template_file data source for the whole JSON:
data "template_file" "widgets" {
template = <<JSON
{
"widgets": [
{
"type":"metric",
"x":0,
"y":0,
"width":12,
"height":6,
"properties":{
"metrics":[
${join(", \n ", data.template_file.ec2_metric.*.rendered)}
],
"period":300,
"stat":"Average",
"region":"ap-south-1",
"title":"EC2 Instance CPU ",
"annotations": {
"horizontal": [
{
"label": "Untitled annotation",
"value": 2
}]}
}},]}
JSON
}
Use the rendered template in the dashboard resource:
resource "aws_cloudwatch_dashboard" "main" {
dashboard_name = "my-dashboard"
dashboard_body = data.template_file.widgets.rendered
...
}
Test template_file.widgets
output "test" {
value = data.template_file.widgets.rendered
}
result
Outputs:
test = {
"widgets": [
{
"type":"metric",
"x":0,
"y":0,
"width":12,
"height":6,
"properties":{
"metrics":[
["AWS/EC2","CPUUtilization","InstanceId","i-00001"],
["AWS/EC2","CPUUtilization","InstanceId","i-00002"],
["AWS/EC2","CPUUtilization","InstanceId","i-00003"]
],
"period":300,
"stat":"Average",
"region":"ap-south-1",
"title":"EC2 Instance CPU ",
"annotations": {
"horizontal": [
{
"label": "Untitled annotation",
"value": 2
}]}
}},]}
I am loading a SQL Server table using ADF, and after the insert is over I have to do a little manipulation. I tried the below approaches:
Trigger (after insert) - failed; SQL Server is not able to detect the inserted records that I push using ADF. **Seems to be a bug**.
Stored procedure using a user-defined table type - getting this error:
Error Number '156'. Error message from database execution : Incorrect
syntax near the keyword 'select'. Must declare the table variable
"#a".
I have created the below pipeline:
{
"name": "CopyPipeline-xxx",
"properties": {
"activities": [
{
"type": "Copy",
"typeProperties": {
"source": {
"type": "AzureDataLakeStoreSource",
"recursive": false
},
"sink": {
"type": "SqlSink",
"sqlWriterStoredProcedureName": "sp_xxx",
"storedProcedureParameters": {
"stringProductData": {
"value": "str1"
}
},
"writeBatchSize": 0,
"writeBatchTimeout": "00:00:00"
},
"translator": {
"type": "TabularTranslator",
"columnMappings": "col1:col1,col2:col2"
}
},
"inputs": [
{
"name": "InputDataset-3jg"
}
],
"outputs": [
{
"name": "OutputDataset-3jg"
}
],
"policy": {
"timeout": "1.00:00:00",
"concurrency": 1,
"executionPriorityOrder": "NewestFirst",
"style": "StartOfInterval",
"retry": 3,
"longRetry": 0,
"longRetryInterval": "00:00:00"
},
"scheduler": {
"frequency": "Hour",
"interval": 8
},
"name": "Activity-0-xxx_csv->[dbo]_[xxx_staging]"
}
],
"start": "2017-01-09T21:48:53.348Z",
"end": "2099-12-30T18:30:00Z",
"isPaused": false,
"hubName": "hub",
"pipelineMode": "Scheduled"
}
}
and I am using the below stored procedure:
create procedure [dbo].[sp_xxx] @xxx1 [dbo].[ut_xxx] READONLY, @str1 varchar(100) AS
MERGE xxx_dummy AS a
USING @xxx1 AS b
ON (a.col1 = b.col1)
WHEN NOT MATCHED
THEN INSERT(col1, col2)
VALUES(b.col1, b.col2)
WHEN MATCHED
THEN UPDATE SET a.col2 = b.col2;
Please help me to resolve the issue.
I can reproduce your first error. Inserting into a SQL Server table with Azure Data Factory (ADF) appears to use a bulk insert method (similar to BULK INSERT, bcp, SSIS, etc.), and by default these methods do not fire triggers:
insert bulk [dbo].[testADF] ([col1] Int, [col2] Int, [col3] Int, [col4] Int)
with (TABLOCK, CHECK_CONSTRAINTS)
With bcp and BULK INSERT there is a FIRE_TRIGGERS option to change this behaviour, but it appears there is no way to change this setting for ADF. As a workaround, move the logic from your trigger into the stored proc, as sketched below.
If you believe this flag is important, consider creating a feedback item.
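As a sketch of that workaround (the audit table and its columns here are hypothetical, purely to stand in for whatever the trigger was doing), the former trigger logic simply runs inside the existing proc after the MERGE:

-- Sketch only: the existing proc with the former trigger logic folded in.
CREATE PROCEDURE [dbo].[sp_xxx] @xxx1 [dbo].[ut_xxx] READONLY, @str1 VARCHAR(100)
AS
BEGIN
    MERGE xxx_dummy AS a
    USING @xxx1 AS b
        ON (a.col1 = b.col1)
    WHEN NOT MATCHED
        THEN INSERT (col1, col2)
             VALUES (b.col1, b.col2)
    WHEN MATCHED
        THEN UPDATE SET a.col2 = b.col2;

    -- Former AFTER INSERT trigger logic, now run explicitly after the MERGE.
    -- Hypothetical example: record the incoming rows in an audit table.
    INSERT INTO dbo.xxx_audit (col1, col2, loaded_at)
    SELECT b.col1, b.col2, SYSUTCDATETIME()
    FROM @xxx1 AS b;
END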