I am trying to get a list of all files in a folder with the Get Metadata activity, so I can pass this list to a ForEach activity, which in turn executes a notebook.
I have a binary dataset, and the field list is set to Child items.
The pipeline fails every time with this error:
{
"errorCode": "2011",
"message": "Blob operation Failed. ContainerName: tmp, path: /tmp/folder/folder1/.",
"failureType": "UserError",
"target": "Get Metadata",
"details": []
}
The files are in 'folder/folder1'.
It's not my first time working with the Get Metadata activity, and so far it has always worked (in ADF). But this is the first time I am doing it in Synapse; are there differences? Do you have any ideas what this could be or how I can solve the problem?
Usage of the Get Metadata activity to retrieve metadata is the same in Azure Data Factory and Azure Synapse pipelines.
Create a binary dataset with a dataset parameter for the filename.
Connect the binary dataset to the Get Metadata activity.
Pass '*' as the filename parameter value.
Select Child items under the Field list to get the list of files/subfolders in the folder.
The output gives the list of files from the folder.
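For reference, here is a minimal sketch of what the activity can look like in pipeline JSON, assuming a binary dataset named BinaryFolderDataset with a filename dataset parameter (both names are placeholders):
{
    "name": "Get Metadata1",
    "type": "GetMetadata",
    "typeProperties": {
        "dataset": {
            "referenceName": "BinaryFolderDataset",
            "type": "DatasetReference",
            "parameters": { "filename": "*" }
        },
        "fieldList": [ "childItems" ]
    }
}
The ForEach activity can then iterate over @activity('Get Metadata1').output.childItems, and each item exposes a name and a type that you can pass on to the notebook.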
I am trying to import data from a CSV file into a Dynamics 365 Account table. As I need to do some transformations, I am using a data flow rather than a basic Copy activity.
I was having difficulty getting the data flow to write to a multi-lookup field, so I tried a Copy activity to see if that worked using the exact same source, sink and mappings. I was able to import the data successfully with the Copy activity. I'm confused as to why the data flow does not work using the same source, sink and mappings. Below are screenshots of the various elements I set up and configured. I would appreciate any suggestions to get the data flow working.
I'm using a cut-down version of what will ultimately be my source CSV file, just so I can concentrate on getting the write to the lookup field working.
Source CSV file
Copy Activity Source
Copy Activity Sink
Dynamics 365 Sink
Dataflow Source
Dataflow Sink
Copy Activity Mapping
Dataflow Mapping
Copy Activity Success
Dataflow Failure
Dataflow Error
Details
{"StatusCode":"DFExecutorUserError","Message":"Job failed due to reason: DF-REST_001 - Rest - Error response received from the server (url:https://##############v9.0/accounts,request body: Some({"accountid":"8b0257ea-de19-4aaa-9945-############","name":"A User","ownerid":"7d64133b-daa8-eb11-9442-############","ownerid#EntityReference":"systemuser"}), request method: POST, status code: 400), response body: Some({"error":{"code":"0x0","message":"An error occurred while validating input parameters: Microsoft.OData.ODataException: A 'PrimitiveValue' node with non-null value was found when trying to read the value of the property 'ownerid'; however, a 'StartArray' node, a 'StartObject' node, or a 'PrimitiveValue' node with null value was expected.\r\n at Microsoft.OData.JsonLight.ODataJsonLightPropertyAndValueDeserializer.ValidateExpandedNestedResourceInfoPropertyValue(IJsonReader jsonReader, Nullable1 isCollection, String propertyName, IEdmTypeReference typeReference)\r\n at Microsoft.OData.JsonLight.ODataJsonLightResourceDeserializ","Details":"com.microsoft.dataflow.Issues: DF-REST_001 - Rest - Error response received from the server (url:https://dev-gc.crm11.dynamics.com/api/data/v9.0/accounts,request body: Some({"accountid":"8b0257ea-de19-4aaa-9945-############","name":"A User","ownerid":"7d64133b-daa8-eb11-9442-############","ownerid#EntityReference":"systemuser"}), request method: POST, status code: 400), response body: Some({"error":{"code":"0x0","message":"An error occurred while validating input parameters: Microsoft.OData.ODataException: A 'PrimitiveValue' node with non-null value was found when trying to read the value of the property 'ownerid'; however, a 'StartArray' node, a 'StartObject' node, or a 'PrimitiveValue' node with null value was expected.\r\n at Microsoft.OData.JsonLight.ODataJsonLightPropertyAndValueDeserializer.ValidateExpandedNestedResourceInfoPropertyValue(IJsonReader jsonReader, Nullable1 isCollection, String propertyName, IEdmTypeReference typeReference)\r\n at Microsoft.OData.JsonLight.ODataJsonLightResourceDeser"}
I am running into the same wall, but a temporary solution is to sink the data flow output to a CSV (or similar) file in ADLS and then use a Copy activity to pick up those files and upsert them into Dynamics.
Other references: https://vishalgrade.com/2020/10/01/how-to-populate-multi-lookup-attribute-in-ce-using-azure-data-factory/
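For illustration, a rough sketch of that Copy activity, assuming a delimited-text source staged in ADLS and the Dynamics connector's DynamicsSink type with its upsert write behavior (the activity name is a placeholder):
{
    "name": "Copy_ADLS_to_Dynamics",
    "type": "Copy",
    "typeProperties": {
        "source": { "type": "DelimitedTextSource" },
        "sink": {
            "type": "DynamicsSink",
            "writeBehavior": "upsert",
            "ignoreNullValues": false
        }
    }
}
The ownerid lookup is then handled through the Copy activity's column mapping (ownerid plus ownerid#EntityReference set to systemuser), the same mapping that already works in the Copy activity shown in the question.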
In Azure Data Factory, using the Copy activity to load a JSON blob into SQL DB, when the JSON blob is an empty array ("[]") the Copy activity fails with this error:
{
"errorCode": "2200",
"message": "Failure happened on 'Source' side. ErrorCode=UserErrorTypeInSchemaTableNotSupported,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Failed to get the type from schema table. This could be caused by missing Sql Server System CLR Types.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.InvalidCastException,Message=Unable to cast object of type 'System.DBNull' to type 'System.Type'.,Source=Microsoft.DataTransfer.ClientLibrary,'",
"failureType": "UserError",
"target": "BP_acctset_Blob2SQL",
"details": []
}
Use a Get Metadata activity to get the file size.
Use an If Condition activity to check whether the size is greater than 2 (an empty array "[]" is only 2 bytes). If true, execute the Copy activity, as sketched below.
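A rough sketch of that If Condition, assuming the Get Metadata activity is named Get Metadata1 and has size in its field list (the names are placeholders):
{
    "name": "If file not empty",
    "type": "IfCondition",
    "typeProperties": {
        "expression": {
            "value": "@greater(activity('Get Metadata1').output.size, 2)",
            "type": "Expression"
        }
    }
}
Place the Copy activity inside the If Condition's True activities, so it only runs when the blob holds more than the two characters of an empty array.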
I'm trying to run a Copy activity in ADF and am purposely trying to fail this activity to test my failure logging.
Here is what the pipeline looks like (please note that this Copy activity sits inside a ForEach activity and, inside the ForEach, an If Condition activity).
This is how the pipeline looks
I'm expecting the copy to fail, but not the "LOG FAILURE" stored procedure, since I want to log the copy activity details in a SQL DB table. Here is what the error says:
In the LOG FAILURE activity:
"errorCode": "InvalidTemplate",
"message": "The expression 'activity('INC_COPY_TO_ADL').output.rowsCopied' cannot be evaluated because property 'rowsCopied' doesn't exist, available properties are 'dataWritten, filesWritten, sourcePeakConnections, sinkPeakConnections, copyDuration, errors, effectiveIntegrationRuntime, usedDataIntegrationUnits, billingReference, usedParallelCopies, executionDetails, dataConsistencyVerification, durationInQueue'.",
"failureType": "UserError",
"target": "LOG_FAILURE"
In the Copy activity INC_COPY_TO_ADL (this is expected since the SQL query is wrong)
"errorCode": "2200",
"message": "Failure happened on 'Source' side. ErrorCode=SqlOperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=A database operation failed with the following error: 'Invalid object name 'dbo.CustCustomerV3Staging123'.',Source=,''Type=System.Data.SqlClient.SqlException,Message=Invalid object name 'dbo.CustCustomerV3Staging123'.,Source=.Net SqlClient Data Provider,SqlErrorNumber=208,Class=16,ErrorCode=-2146232060,State=1,Errors=[{Class=16,Number=208,State=1,Message=Invalid object name 'dbo.CustCustomerV3Staging123'.,},],'",
"failureType": "UserError",
"target": "INC_COPY_TO_ADL"
I wonder why the LOG FAILURE activity failed (i.e. why the expression could not be evaluated)? Please note that when the copy activity is correct, the "LOG SUCCESS" stored procedure works fine.
This is how the pipeline looks
Many thanks.
RA
@rizal activity('INC_COPY_TO_ADL').output.rowsCopied is not part of the Copy activity output in case of failure. Try setting a default value for LOG FAILURE in this case (for example -1) and keep LOG SUCCESS as it is.
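One way to do that (a sketch, assuming the stored procedure has a parameter named RowsCopied; adjust to your own parameter names) is to make the expression null-safe with the ? operator and coalesce, so LOG FAILURE still gets a value when the copy fails:
"storedProcedureParameters": {
    "RowsCopied": {
        "value": "@coalesce(activity('INC_COPY_TO_ADL').output?.rowsCopied, -1)",
        "type": "Int64"
    }
}
Alternatively, LOG FAILURE can log activity('INC_COPY_TO_ADL').error.message, since the error object is what gets populated when the copy fails.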
I am trying to retrieve a CSV file from Azure Blob Storage using Logic Apps.
I set the Azure Storage Explorer path in the parameters, and in the Get blob content action I am using that parameter.
In the Parameters I have set the value as:
concat('Directory1/','Year=',string(int(substring(utcNow(),0,4))),'/Month=',string(int(substring(utcnow(),5,2))),'/Day=',string(int(substring(utcnow(),8,2))),'/myfile.csv')
So during the run time this path should form as:
Directory1/Year=2019/Month=12/Day=30/myfile.csv
but during execution the action fails with the following error message:
{
"status": 400,
"message": "The specifed resource name contains invalid characters.\r\nclientRequestId: 1e2791be-8efd-413d-831e-7e2cd89278ba",
"error": {
"message": "The specifed resource name contains invalid characters."
},
"source": "azureblob-we.azconn-we-01.p.azurewebsites.net"
}
So my question is: how do I write the path to get data from the time-series partitioned path?
Joy Wang's response was partially correct.
Parameters in Logic Apps treat values as plain strings and will not evaluate functions such as concat().
The correct way to use the concat function is inside an expression.
And my solution to the problem is:
concat('container1/','Directory1/','Year=',string(int(substring(utcNow(),0,4))),'/Month=',string(int(substring(utcnow(),5,2))),'/Day=',string(int(substring(utcnow(),8,2))),'/myfile.csv')
You should not use that in the parameters. When you put the line concat('Directory1/','Year=',string(int(substring(utcNow(),0,4))),'/Month=',string(int(substring(utcnow(),5,2))),'/Day=',string(int(substring(utcnow(),8,2))),'/myfile.csv') in a parameter, its type is String, so the logic app treats it as a literal string and the function does not take effect.
You also need to include the container name in the concat(). There is no need for string(int()), because utcNow() and substring() both return strings.
To fix the issue, use the line below directly in the blob path option (my container name is container1):
concat('container1/','Directory1/','Year=',substring(utcNow(),0,4),'/Month=',substring(utcnow(),5,2),'/Day=',substring(utcnow(),8,2),'/myfile.csv')
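For reference, in the Logic App's code view the Get blob content action would then look roughly like this (the action and connection names are whatever your designer generated; the doubled encodeURIComponent is what the designer normally emits for the path):
"Get_blob_content": {
    "type": "ApiConnection",
    "inputs": {
        "host": {
            "connection": {
                "name": "@parameters('$connections')['azureblob']['connectionId']"
            }
        },
        "method": "get",
        "path": "/datasets/default/files/@{encodeURIComponent(encodeURIComponent(concat('container1/','Directory1/','Year=',substring(utcNow(),0,4),'/Month=',substring(utcnow(),5,2),'/Day=',substring(utcnow(),8,2),'/myfile.csv')))}/content",
        "queries": {
            "inferContentType": true
        }
    },
    "runAfter": {}
}
At run time this evaluates to container1/Directory1/Year=2019/Month=12/Day=30/myfile.csv (for 30 December 2019), which the connector resolves against the storage account behind the connection.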
Update:
As mentioned in @Stark's answer, if you want to drop the leading zero from the month and day, you can convert the substring to an int and then back to a string:
concat('container1/','Directory1/','Year=',string(int(substring(utcNow(),0,4))),'/Month=',string(int(substring(utcnow(),5,2))),'/Day=',string(int(substring(utcnow(),8,2))),'/myfile.csv')
I have a base table in an Azure Data Explorer (Kusto) database.
.create table base (info:dynamic)
I have written a function which parses the dynamic column of the base table, extracts a few of its properties, and stores them in another table whenever the base table receives data (from Event Hub). Below are the function and its update policy:
.create function extractBase()
{
base
| evaluate bag_unpack(info)
| project tostring(column1), toreal(column2), toint(column3), todynamic(column4)
}
.alter table target_table policy update
#'[{"IsEnabled": true, "Source": "base", "Query": "extractBase()", "IsTransactional": false, "PropagateIngestionProperties": true}]'
Suppose the base table does not contain an expected column and an ingestion error happens. How do I get the source (row) for the failure?
When using .show ingestion failures, it displays the failure message. There is a column called IngestionSourcePath, but when I browse to that URL I get a Resource Not Found exception.
If an ingestion failure happens, I need to store the particular row of the base table into an IngestionFailure table for further investigation.
In this case, your source data cannot "not have" a column defined by its schema.
If no value was ingested for some column in some row, a null value will be present there and the update policy will not fail.
Here the update policy will break if the original table row does not contain enough columns. Currently the source data for such errors is not emitted as part of the failure message.
In general, the source URI is only useful when you are ingesting data from blobs. In other cases the URI shown in the failed ingestion info is a URI on an internal blob that was created on the fly and no one has access to.
However, there is a command that is missing from the documentation (we will make sure to update it) that allows you to duplicate (dump to a storage container you provide) the source data for the next failed ingestion into a specific table.
The syntax is:
.dup-next-failed-ingest into TableName to h@'Path to Azure blob container'
Here the path to the Azure blob container must include a writeable SAS.
The required permission to run this command is DB admin.