Azure Data Factory Copy Activity on Failure | Expression not evaluated

I'm trying to run a copy activity in ADF and purposely trying to fail this activity to test my failure logging.
Here is what the pipeline looks like (please note that this copy activity sits inside a "for each" activity and, inside the "for each", an "if condition" activity).
This is how the pipeline looks
I'm expecting the copy activity to fail, but not the "LOG FAILURE" stored procedure, since I want to log the copy activity details in a SQL DB table. Here is what the error says:
In the LOG FAILURE activity:
"errorCode": "InvalidTemplate",
"message": "The expression 'activity('INC_COPY_TO_ADL').output.rowsCopied' cannot be evaluated because property 'rowsCopied' doesn't exist, available properties are 'dataWritten, filesWritten, sourcePeakConnections, sinkPeakConnections, copyDuration, errors, effectiveIntegrationRuntime, usedDataIntegrationUnits, billingReference, usedParallelCopies, executionDetails, dataConsistencyVerification, durationInQueue'.",
"failureType": "UserError",
"target": "LOG_FAILURE"
In the copy activity INC_COPY_TO_ADL (this failure is expected since the SQL query is wrong):
"errorCode": "2200",
"message": "Failure happened on 'Source' side. ErrorCode=SqlOperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=A database operation failed with the following error: 'Invalid object name 'dbo.CustCustomerV3Staging123'.',Source=,''Type=System.Data.SqlClient.SqlException,Message=Invalid object name 'dbo.CustCustomerV3Staging123'.,Source=.Net SqlClient Data Provider,SqlErrorNumber=208,Class=16,ErrorCode=-2146232060,State=1,Errors=[{Class=16,Number=208,State=1,Message=Invalid object name 'dbo.CustCustomerV3Staging123'.,},],'",
"failureType": "UserError",
"target": "INC_COPY_TO_ADL"
I wonder why the LOG FAILURE activity failed (i.e. the expression was not evaluated)? Please note that when the copy activity is correct, the "LOG SUCCESS" stored procedure works okay.
Many thanks.
RA

@rizal activity('INC_COPY_TO_ADL').output.rowsCopied is not part of the copy activity's output when it fails. Try setting a default value for LOG_FAILURE in this case, e.g. -1, and keep LOG_SUCCESS as it is.
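A minimal sketch of that idea, assuming the LOG_FAILURE stored procedure takes a rows-copied parameter: combine coalesce with the null-safe ? operator so the expression still evaluates when the copy fails and the property is missing:

@coalesce(activity('INC_COPY_TO_ADL').output?['rowsCopied'], -1)

Alternatively, hard-code -1 in the LOG_FAILURE parameter and only read rowsCopied in LOG_SUCCESS, as the comment suggests.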

Related

Azure Data Factory - Capture error details of a dataflow activity

I have a data flow, and my requirement is to capture the error details into a variable when it fails and assign this variable to a parameter in the next data flow. I managed to get as far as the second stage (with help) as below, but I'm unable to get this variable assigned to a parameter in the next data flow. The error I get is "Expression cannot be parsed".
To retrieve the data flow error message, connect the data flow activity's failure output to a Set Variable activity and store the error message using the expression:
@string(json(activity('Data flow1').error.message).Message)
Error Message:
Output:
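To then hand the captured message to the next data flow, a rough sketch (the variable name errorMessage and the data flow parameter are assumptions, not from the original post): set the data flow activity's parameter value, as a pipeline expression, to

@variables('errorMessage')

and declare the matching parameter as a string on the data flow side.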

Azure synapse get metadata

I am trying to get a list of all files in a folder with the Get Metadata activity and to pass this list to a ForEach activity, which in turn executes a notebook.
I have a binary dataset, and the field list is set to Child items.
The pipeline crashes every time with the error:
{
"errorCode": "2011",
"message": "Blob operation Failed. ContainerName: tmp, path: /tmp/folder/folder1/.",
"failureType": "UserError",
"target": "Get Metadata",
"details": []
}
The files are in 'folder/folder1'.
It's not my first time working with the Get Metadata activity, and so far it has always worked (in ADF). But this is my first time doing it in Synapse; are there differences? Do you have any ideas what this could be or how I can solve the problem?
Usage of the Get Metadata activity to retrieve the metadata of any data is the same in Azure Data Factory and Azure Synapse pipelines.
Create a binary dataset with a dataset parameter for filename.
Connect the binary dataset to the Get metadata activity.
Pass '*' to the filename parameter value.
Select child items under Field list to get the list of files/subfolders from the folder.
The output gives the list of files from the folder.
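As a rough sketch of the downstream wiring (the activity and parameter names here are assumptions): the ForEach items setting iterates over the Get Metadata output, and each file name is passed on to the notebook:

ForEach Items:            @activity('Get Metadata1').output.childItems
Notebook parameter value: @item().name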

Azure Stream Analytics always gets the same 'OutputDataConversionError.TypeConversionError', even after I remove the datetime column in the Synapse DW SQL pool

I always get the same 'OutputDataConversionError.TypeConversionError', even after I remove the datetime column from the output table in the Synapse DW SQL pool, and I get the same error after deleting and recreating the Stream Analytics job.
The stream input is an Event Hub that receives diagnostic logs from an Azure SQL database. It tested fine.
The stream output is a table in an Azure Synapse Analytics DW SQL pool. It also tested OK.
The query looks like this:
SELECT
    Records.ArrayValue.count AS [count],
    Records.ArrayValue.total AS [total],
    Records.ArrayValue.minimum AS [minimum],
    Records.ArrayValue.maximum AS [maximum],
    Records.ArrayValue.resourceId AS [resourceId],
    CAST(Records.ArrayValue.time AS datetime) AS [time],
    Records.ArrayValue.metricName AS [metricName],
    Records.ArrayValue.timeGrain AS [timeGrain],
    Records.ArrayValue.average AS [average]
INTO
    OrderSynapse
FROM
    dbhub d
CROSS APPLY GetArrayElements(d.records) AS Records
The query passed the test run, but the stream job went into a degraded state with this error:
Source 'dblog' had 1 occurrences of kind 'OutputDataConversionError.TypeConversionError' between processing times '2021-11-12T05:28:08.7922407Z' and '2021-11-12T05:28:08.7922407Z'.
But even after I deleted the stream job, dropped the [time] column from the output table, removed "CAST(Records.ArrayValue.time AS datetime) AS [time]," from the query, and recreated a new stream job, I still got the same error?
Part of the Activity log:
"ErrorCategory": "Diagnostic",
"ErrorCode": "DiagnosticMessage",
"Message": "First Occurred: 11/12/2021 7:39:12 AM | Resource Name: dblog | Message: Source 'dblog' had 1 occurrences of kind 'OutputDataConversionError.TypeConversionError' between processing times '2021-11-12T07:39:12.8681135Z' and '2021-11-12T07:39:12.8681135Z'. ",
"Type": "DiagnosticMessage",
Why? Is there a hidden cache I cannot clear?
It looks like a bug in the output adapter is causing this issue. While the fix is rolling out, you can re-order the field list in the SELECT to match the column order in the destination table.

Azure Data Factory can't handle an empty JSON array in a blob

In an Azure Data Factory dataset, using the Copy activity to load a JSON blob into SQL DB, when the JSON blob is an empty array "[]" the copy activity gets stuck with this error:
{
"errorCode": "2200",
"message": "Failure happened on 'Source' side. ErrorCode=UserErrorTypeInSchemaTableNotSupported,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Failed to get the type from schema table. This could be caused by missing Sql Server System CLR Types.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.InvalidCastException,Message=Unable to cast object of type 'System.DBNull' to type 'System.Type'.,Source=Microsoft.DataTransfer.ClientLibrary,'",
"failureType": "UserError",
"target": "BP_acctset_Blob2SQL",
"details": []
}
Use a Get Metadata activity to get the file size.
Use an If Condition activity to judge whether the size is greater than 2 (an empty array "[]" is only 2 bytes). If true, then execute the copy activity.
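As a sketch, assuming the Get Metadata activity is named Get Metadata1 and Size is included in its field list, the If Condition expression could be:

@greater(activity('Get Metadata1').output.size, 2)

with the Copy activity placed in the True branch.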

Azure Data Factory: Handling inner failure in until/for activity

I have an Azure data factory v2 pipeline containing an until activity.
Inside the until is a copy activity - if this fails, the error is logged, exactly as in this post, and I want the loop to continue.
Azure Data Factory Pipeline 'On Failure'
Although the inner copy activity’s error is handled, the until activity is deemed to have failed because an inner activity has failed.
Is there any way to configure the until activity to continue when an inner activity fails?
Solution
Put the error-handling steps in their own pipeline and run them from an ExecutePipeline activity. You'll need to pass in all the parameters required from the outer pipeline.
You can then use the completion (blue) dependency from the ExecutePipeline (rather than success (green)) so the outer pipeline continues to run despite the inner error.
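In the pipeline JSON this corresponds to a dependency condition of Completed rather than Succeeded, roughly like this (the activity name is an assumption):

"dependsOn": [
    {
        "activity": "ExecutePipelineActivityName",
        "dependencyConditions": [ "Completed" ]
    }
]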
Note that if you want the outer pipeline to know what happened in the inner one, there is currently no way to pass data out of the ExecutePipeline activity to its parent (https://feedback.azure.com/forums/270578-data-factory/suggestions/38690032-add-ability-to-customize-output-fields-from-execut).
To work around this, use a Stored Procedure activity inside the ExecutePipeline to write data to a SQL table, identified by the pipeline run ID. This can be referenced inside the pipeline with @pipeline().RunId.
Then outside the pipeline you can do a lookup in the SQL table, using the run ID to get the right row.
HEALTH WARNING:
For some weird reason, the output of ExecutePipeline is returned not as a JSON object but as a string. So if you try to select a property of the output like @activity('ExecutePipelineActivityName').output.something then you get this error:
Property selection is not supported on values of type 'String'
So, to get the ExecutePipeline's run ID from outside you need:
@json(activity('ExecutePipelineActivityName').output).pipelineRunId
I couldn't find this documented in Microsoft's documentation anywhere, hence posting gory details here.
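Putting the pieces together, a rough sketch (the table name dbo.PipelineErrorLog, the RunId column, and the activity names are assumptions): inside the child pipeline, pass the run ID to the logging stored procedure as

@pipeline().RunId

and in the parent pipeline, retrieve the logged row with a Lookup query such as

@concat('SELECT * FROM dbo.PipelineErrorLog WHERE RunId = ''', json(activity('ExecutePipelineActivityName').output).pipelineRunId, '''')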
