ADF HANA View partition columns

I am trying to extract data from a HANA view using the ADF SAP HANA connector. In the partition option (inside the copy activity), I am using dynamic range partitioning and passing a partition column. This fails with the following error:
ErrorCode=SapHanaFailToGetBoundsDueToInvalidQuery,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Fail
to get bounds by using query. Make sure your query can be nested while
using partition reading and your partition column data type is
acceptable.,Source=Microsoft.DataTransfer.Runtime.SapHanaConnector,''Type=System.Data.Odbc.OdbcException,Message=ERROR
[S1000] [SAP AG][LIBODBCHDB DLL][HDBODBC] General error;260 invalid
column name: FISCALYR: line 4 col 60 (at pos 530),Source=libodbcHDB.dll,'
I tried a different partition column, but that gives the same error. Can anyone tell me the possible cause?
Below is the copy activity input log:
{
  "source": {
    "type": "SapHanaSource",
    "query": "select\r\n\"COCD\" as BUKRS,\r\n\"ASSETNBR\" as ANLN1,\r\n\"ASSETSUBNUMB\" as ANLN2,\r\n\"FISCALYR\" as GJAHR,\r\n\"REALDEPRAREA\" as AFABE,\r\n\"DEPRLASTPOST\" as AFBLPE,\r\n\"DEPRPERDPOST\" as AFBANZ,\r\n\"FISCALYREXPR\" as \r\nFROM \"_SYS_BIC\".\"intel.finance.private/AssetValueHistoricView\" WHERE ?AdfHanaDynamicRangePartitionCondition",
    "partitionOption": "SapHanaDynamicRange",
    "partitionSettings": {
      "partitionColumnName": "FISCALYR"
    }
  },
  "sink": {
    "type": "ParquetSink",
    "storeSettings": {
      "type": "AzureBlobFSWriteSettings"
    },
    "formatSettings": {
      "type": "ParquetWriteSettings"
    }
  },
  "enableStaging": false,
  "parallelCopies": 20,
  "translator": {
    "type": "TabularTranslator",
    "typeConversion": true,
    "typeConversionSettings": {
      "allowDataTruncation": true,
      "treatBooleanAsNumber": false
    }
  }
}
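
One possible cause, offered as a guess rather than a confirmed diagnosis: the error says the query must be nestable, which suggests the connector wraps the source query in an outer query to compute the partition bounds. If so, the outer query would only see the aliases (GJAHR), not the original view column (FISCALYR). The sketch below reproduces that effect with Python's built-in sqlite3 module as a stand-in for HANA. The table asset_view, the sample rows, and the MIN/MAX wrapper are all assumptions for illustration and are not the connector's actual bounds query.

```python
import sqlite3

# In-memory stand-in for the HANA view (hypothetical table and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE asset_view (COCD TEXT, FISCALYR INTEGER)")
conn.executemany(
    "INSERT INTO asset_view VALUES (?, ?)",
    [("1000", 2019), ("1000", 2020), ("2000", 2021)],
)

# The source query renames FISCALYR to GJAHR, as in the question.
inner_query = "SELECT COCD AS BUKRS, FISCALYR AS GJAHR FROM asset_view"


def get_bounds(column):
    """Wrap the source query and ask for MIN/MAX of the partition column."""
    sql = f"SELECT MIN({column}), MAX({column}) FROM ({inner_query}) AS src"
    return conn.execute(sql).fetchone()


# Using the original column name fails: the outer query cannot see it.
try:
    get_bounds("FISCALYR")
    original_name_error = None
except sqlite3.OperationalError as exc:
    original_name_error = str(exc)

# Using the alias exposed by the inner query works.
alias_bounds = get_bounds("GJAHR")

print(original_name_error)
print(alias_bounds)
```

If the pipeline behaves the same way, it may be worth trying GJAHR as partitionColumnName, or selecting the partition column without an alias.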

Related

Azure Data Factory Copy Activity error mapping JSON to SQL

I have an Azure Data Factory Copy Activity that uses a REST request to Elasticsearch as the source and attempts to map the response to a SQL table as the sink. Everything works fine except when it attempts to map the data field, which contains dynamic JSON. I get the following error:
{
  "errorCode": "2200",
  "message": "ErrorCode=UserErrorUnsupportedHierarchicalComplexValue,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=The retrieved type of data JObject with value {\"name\":\"department\"} is not supported yet, please either remove the targeted column or enable skip incompatible row to skip them.,Source=Microsoft.DataTransfer.Common,'",
  "failureType": "UserError",
  "target": "CopyContents_Paged",
  "details": []
}
Here's an example of my mapping configuration:
  "type": "TabularTranslator",
  "mappings": [
    {
      "source": {
        "path": "['_source']['id']"
      },
      "sink": {
        "name": "ContentItemId",
        "type": "String"
      }
    },
    {
      "source": {
        "path": "['_source']['status']"
      },
      "sink": {
        "name": "Status",
        "type": "Int32"
      }
    },
    {
      "source": {
        "path": "['_source']['data']"
      },
      "sink": {
        "name": "Data",
        "type": "String"
      }
    }
  ],
  "collectionReference": "$['hits']['hits']"
}
The JSON in the data object is dynamic so I'm unable to do an explicit mapping for the nested fields within it. That's why I'm trying to just store the entire JSON object under data in a column of a SQL table.
How can I adjust my mapping configuration to allow this to work properly?

I posted this question on the MSDN forums and was told that, when using a tabular sink, you can set the option "mapComplexValuesToString": true so that complex JSON properties are mapped correctly. This resolved my ADF copy activity issue.
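
As a rough illustration of that setting, the Python sketch below adds the flag to a translator definition like the one in the question and prints the resulting JSON. It assumes the flag sits at the top level of the translator object, next to mappings and collectionReference; the helper name enable_complex_value_mapping is made up for this example.

```python
import json


def enable_complex_value_mapping(translator):
    """Return a copy of a TabularTranslator dict with mapComplexValuesToString set."""
    updated = dict(translator)
    # Assumption: the flag belongs at the top level of the translator.
    updated["mapComplexValuesToString"] = True
    return updated


translator = {
    "type": "TabularTranslator",
    "mappings": [
        {
            "source": {"path": "['_source']['data']"},
            "sink": {"name": "Data", "type": "String"},
        }
    ],
    "collectionReference": "$['hits']['hits']",
}

patched = enable_complex_value_mapping(translator)
print(json.dumps(patched, indent=2))
```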

I had the same problem a few days ago. You need to convert your JSON object to a JSON string. That solves the mapping problem (UserErrorUnsupportedHierarchicalComplexValue).
Try it and tell me whether it also resolves your error.
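
The answer does not say where the conversion should happen. As a minimal sketch of what "converting the object to a JSON string" means, the Python below serializes the nested data object before the row reaches a tabular sink. The sample hit and the helper stringify_data are hypothetical, and doing this in a preprocessing step outside ADF is an assumption.

```python
import json

# Hypothetical Elasticsearch hit with a dynamic nested "data" object.
hit = {"_source": {"id": "42", "status": 1, "data": {"name": "department"}}}


def stringify_data(source):
    """Return a copy of the document with "data" serialized to a JSON string."""
    flattened = dict(source)
    flattened["data"] = json.dumps(source["data"], separators=(",", ":"))
    return flattened


row = stringify_data(hit["_source"])
print(row["data"])
```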

How to insert data to bigquery table with custom fields with NodeJS?

I'm using the npm BigQuery module to insert data into BigQuery. I have a custom field, say params, which is of type RECORD and accepts any int, float, or string value as a key-value pair. How can I insert into such a field?
I looked into this, but could not find anything useful:
https://cloud.google.com/nodejs/docs/reference/bigquery/1.3.x/Table#insert

If I understand correctly, you are asking for a map whose values can be of ANY type, which is not supported in BigQuery.
Instead, you can model the map as a repeated record with one typed value column per type, as in the schema below.
Your insert code needs to pick the correct typed value column to set.
{
  "name": "map_field",
  "type": "RECORD",
  "mode": "REPEATED",
  "fields": [
    {
      "name": "key",
      "type": "STRING"
    },
    {
      "name": "int_value",
      "type": "INTEGER"
    },
    {
      "name": "string_value",
      "type": "STRING"
    },
    {
      "name": "float_value",
      "type": "FLOAT"
    }
  ]
}
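
To make the "pick the correct typed value column" step concrete, here is a small sketch of that selection logic. It is written in Python to stay library-free; the same branching should carry over to the Node.js client. The function name to_map_rows and the sample params are assumptions, and the sketch only builds the row payload without calling BigQuery.

```python
def to_map_rows(params):
    """Convert a dict of mixed-type values into rows for the map_field schema."""
    rows = []
    for key, value in params.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool):
            raise TypeError(f"no boolean column in the schema for key {key!r}")
        if isinstance(value, int):
            rows.append({"key": key, "int_value": value})
        elif isinstance(value, float):
            rows.append({"key": key, "float_value": value})
        elif isinstance(value, str):
            rows.append({"key": key, "string_value": value})
        else:
            raise TypeError(
                f"unsupported type for key {key!r}: {type(value).__name__}"
            )
    return rows


params = {"retries": 3, "ratio": 0.75, "region": "eu"}
row = {"map_field": to_map_rows(params)}
print(row)
```

Columns that are not set for a given entry are simply omitted here, on the assumption that they are nullable and will be stored as NULL.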

Azure Data Factory activity copy: Evaluate column in sink table with @pipeline().TriggerTime

With Data Factory V2 I'm trying to implement a stream of data copy from one Azure SQL database to another.
I have mapped all the columns of the source table with the sink table but in the sink table I have an empty column where I would like to enter the pipeline run time.
Does anyone know how to fill this column in the sink table without it being present in the source table?
Below is the code of my copy pipeline:
{
  "name": "FLD_Item_base",
  "properties": {
    "activities": [
      {
        "name": "Copy_Team",
        "description": "copytable",
        "type": "Copy",
        "policy": {
          "timeout": "7.00:00:00",
          "retry": 0,
          "retryIntervalInSeconds": 30,
          "secureOutput": false,
          "secureInput": false
        },
        "typeProperties": {
          "source": {
            "type": "SqlSource"
          },
          "sink": {
            "type": "SqlSink",
            "writeBatchSize": 10000,
            "preCopyScript": "TRUNCATE TABLE Team_new"
          },
          "enableStaging": false,
          "dataIntegrationUnits": 0,
          "translator": {
            "type": "TabularTranslator",
            "columnMappings": {
              "Code": "Code",
              "Name": "Name"
            }
          }
        },
        "inputs": [
          {
            "referenceName": "Team",
            "type": "DatasetReference"
          }
        ],
        "outputs": [
          {
            "referenceName": "Team_new",
            "type": "DatasetReference"
          }
        ]
      }
    ]
  }
}
In my sink table I already have the column data_load, where I would like to insert the pipeline execution date, but I have not mapped it yet.

Based on your situation, you can configure a stored procedure in your SQL Server sink as a workaround.
Please follow the steps from this doc:
Step 1: Configure your Sink dataset:
Step 2: Configure Sink section in copy activity as follows:
Step 3: In your database, define the table type with the same name as sqlWriterTableType. Notice that the schema of the table type should be same as the schema returned by your input data.
CREATE TYPE [dbo].[testType] AS TABLE(
    [ID] [varchar](256) NOT NULL,
    [EXECUTE_TIME] [datetime] NOT NULL
)
GO
Step 4: In your database, define the stored procedure with the same name as sqlWriterStoredProcedureName. It handles the input data from your specified source and merges it into the output table. Note that the parameter name of the stored procedure should be the same as the "tableName" defined in the dataset.
CREATE PROCEDURE convertCsv @ctest [dbo].[testType] READONLY
AS
BEGIN
    MERGE [dbo].[adf] AS target
    USING @ctest AS source
    ON (1=1)
    WHEN NOT MATCHED THEN
        INSERT (id, executeTime)
        VALUES (source.ID, GETDATE());
END

You can consider using a stored procedure on the sink side to apply the source data to the sink table, by setting "sqlWriterStoredProcedureName" on the SqlSink. Pass the pipeline run time to the stored procedure as a parameter and insert it into the sink table.
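
A sketch of what such a sink definition might look like, built as a Python dict and printed as JSON. The procedure name spLoadTeam, the table type TeamType, and the parameter names team and loadTime are hypothetical. The exact shape ADF expects for the parameter entry (in particular the "type": "DateTime" field and the nested expression object) is an assumption to check against the SQL sink documentation.

```python
import json


def build_sql_sink(procedure_name, table_type, table_param, time_param):
    """Build a copy-activity sink that forwards the trigger time to a stored procedure."""
    return {
        "type": "SqlSink",
        "sqlWriterStoredProcedureName": procedure_name,
        "sqlWriterTableType": table_type,
        "storedProcedureTableTypeParameterName": table_param,
        "storedProcedureParameters": {
            # Assumed parameter shape: a typed value holding an ADF expression.
            time_param: {
                "type": "DateTime",
                "value": {
                    "value": "@pipeline().TriggerTime",
                    "type": "Expression",
                },
            }
        },
    }


# All four names below are placeholders for this example.
sink = build_sql_sink("spLoadTeam", "TeamType", "team", "loadTime")
print(json.dumps(sink, indent=2))
```

The stored procedure would then declare a matching datetime parameter and write it to data_load alongside the rows from the table-type parameter.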

Azure ADF sliceIdentifierColumnName is not populating correctly

I've set up an ADF pipeline using a sliceIdentifierColumnName, which worked well and populated the field with a GUID as expected. Recently, however, this field stopped being populated: the refresh works but the sliceIdentifierColumnName field has a value of null, or occasionally the slice load fails because it attempts to populate this field with a value of 1.
The change occurred at a single point in time: before it, everything worked perfectly; after it, the field was repeatedly not populated correctly. I'm sure no changes were made to the pipeline that could have caused this. Any pointers on where I should be looking?
Here is an extract of the pipeline source. I'm reading from a table in Amazon Redshift and writing to an Azure SQL table.
"activities": [
  {
    "type": "Copy",
    "typeProperties": {
      "source": {
        "type": "RelationalSource",
        "query": "$$Text.Format('select * from mytable where eventtime >= \\'{0:yyyy-MM-ddTHH:mm:ssZ}\\' and eventtime < \\'{1:yyyy-MM-ddTHH:mm:ssZ}\\' ' , SliceStart, SliceEnd)"
      },
      "sink": {
        "type": "SqlSink",
        "sliceIdentifierColumnName": "ColumnForADFuseOnly",
        "writeBatchSize": 0,
        "writeBatchTimeout": "00:00:00"
      }
    },
    "inputs": [
      {
        "name": "AmazonRedshiftSomeName"
      }
    ],
    "outputs": [
      {
        "name": "AzureSQLDatasetSomeName"
      }
    ],
    "policy": {
      "timeout": "1.00:00:00",
      "concurrency": 10,
      "style": "StartOfInterval",
      "longRetry": 0,
      "longRetryInterval": "00:00:00"
    },
    "scheduler": {
      "frequency": "Hour",
      "interval": 2
    },
    "name": "Activity-somename2Hour"
  }
],
Also, here is the error output text
Copy activity encountered a user error at Sink:.database.windows.net side: ErrorCode=UserErrorInvalidDataValue,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Column 'ColumnForADFuseOnly' contains an invalid value '1'.,Source=Microsoft.DataTransfer.Common,''Type=System.ArgumentException,Message=Type of value has a mismatch with column typeCouldn't store <1> in ColumnForADFuseOnly Column.
Expected type is Byte[].,Source=System.Data,''Type=System.ArgumentException,Message=Type of value has a mismatch with column type,Source=System.Data,'.
Here is part of the source dataset, it's a table with all datatypes as Strings.
{
"name": "AmazonRedshiftsomename_2hourly",
"properties": {
"structure": [
{
"name": "eventid",
"type": "String"
},
{
"name": "visitorid",
"type": "String"
},
{
"name": "eventtime",
"type": "Datetime"
}
}
Finally, the target table is identical to the source table, mapping each column name to its counterpart in Azure, with the exception of the additional column in Azure named
[ColumnForADFuseOnly] binary NULL,
It is this column which is now either being populated with NULLs or 1.
thanks,

You need to define [ColumnForADFuseOnly] as binary(32). binary with no length modifier defaults to a length of 1, which truncates your sliceIdentifier.
From the SQL Server documentation: "When n is not specified in a data definition or variable declaration statement, the default length is 1. When n is not specified with the CAST function, the default length is 30."

Call stored procedure using ADF

I am loading a SQL Server table using ADF, and after the insertion is over I have to do a little manipulation. I tried the approaches below:
Trigger (after insert): failed. SQL Server is not able to detect the inserted records that I push using ADF. Seems to be a bug.
Stored procedure using a user-defined table type: fails with this error:
Error Number '156'. Error message from database execution : Incorrect syntax near the keyword 'select'. Must declare the table variable "@a".
I have created the pipeline below:
{
  "name": "CopyPipeline-xxx",
  "properties": {
    "activities": [
      {
        "type": "Copy",
        "typeProperties": {
          "source": {
            "type": "AzureDataLakeStoreSource",
            "recursive": false
          },
          "sink": {
            "type": "SqlSink",
            "sqlWriterStoredProcedureName": "sp_xxx",
            "storedProcedureParameters": {
              "stringProductData": {
                "value": "str1"
              }
            },
            "writeBatchSize": 0,
            "writeBatchTimeout": "00:00:00"
          },
          "translator": {
            "type": "TabularTranslator",
            "columnMappings": "col1:col1,col2:col2"
          }
        },
        "inputs": [
          {
            "name": "InputDataset-3jg"
          }
        ],
        "outputs": [
          {
            "name": "OutputDataset-3jg"
          }
        ],
        "policy": {
          "timeout": "1.00:00:00",
          "concurrency": 1,
          "executionPriorityOrder": "NewestFirst",
          "style": "StartOfInterval",
          "retry": 3,
          "longRetry": 0,
          "longRetryInterval": "00:00:00"
        },
        "scheduler": {
          "frequency": "Hour",
          "interval": 8
        },
        "name": "Activity-0-xxx_csv->[dbo]_[xxx_staging]"
      }
    ],
    "start": "2017-01-09T21:48:53.348Z",
    "end": "2099-12-30T18:30:00Z",
    "isPaused": false,
    "hubName": "hub",
    "pipelineMode": "Scheduled"
  }
}
and I am using the stored procedure below:
create procedure [dbo].[sp_xxx] @xxx1 [dbo].[ut_xxx] READONLY, @str1 varchar(100) AS
MERGE xxx_dummy AS a
USING @xxx1 AS b
    ON (a.col1 = b.col1)
WHEN NOT MATCHED
    THEN INSERT (col1, col2)
    VALUES (b.col1, b.col2)
WHEN MATCHED
    THEN UPDATE SET a.col2 = b.col2;
Please help me resolve the issue.

I can reproduce your first error. Inserting into a SQL Server table with Azure Data Factory (ADF) appears to use a bulk insert method (similar to BULK INSERT, bcp, SSIS, etc.), and by default these methods do not fire triggers:
insert bulk [dbo].[testADF] ([col1] Int, [col2] Int, [col3] Int, [col4] Int)
with (TABLOCK, CHECK_CONSTRAINTS)
bcp and BULK INSERT have a FIRE_TRIGGERS option to change this, but there appears to be no way to change the setting for ADF. As a workaround, move the logic from your trigger into the stored procedure.
If you believe this flag is important, consider creating a feedback item.
