Error while connecting DB2/IDAA using ADFV2 - azure

I am trying to connect DB2/IDAA using ADFV2 - while executing simple query "select * from table" - I am getting below error:
Operation on target Copy data from IDAA failed: An error has occurred on the Source side. 'Type = Microsoft.HostIntegration.DrdaClient.DrdaException, Message = Exception or type' Microsoft.HostIntegration.Drda.Common.DrdaException 'was thrown. SQLSTATE = HY000 SQLCODE = -343, Source = Microsoft.HostIntegration.Drda.Requester, '
I checked a lot and tried various options but still it's an issue.
I tried query "select * from table with ur" - query to call with read-only but still get above result.
If I use query like select * from table; commit; - then activity succeeded but no record fetch.
Is anyone have solution ?
I have my linked service setup like this. additional connection properties value is : SET CURRENT QUERY ACCELERATION = ALL

Related

psycopg2 SELECT query with inbuilt functions

I have the following SQL statement where i am reading the database to get the records for 1 day. Here is what i tried in pgAdmin console -
SELECT * FROM public.orders WHERE createdat >= now()::date AND type='t_order'
I want to convert this to the syntax of psycopg2but somehow it throws me errors -
Database connection failed due to invalid input syntax for type timestamp: "now()::date"
Here is what i am doing -
query = f"SELECT * FROM {table} WHERE (createdat>=%s AND type=%s)"
cur.execute(query, ("now()::date", "t_order"))
records = cur.fetchall()
Any help is deeply appreciated.
DO NOT use f strings. Use proper Parameter Passing
now()::date is better expressed as current_date. See Current Date/Time.
You want:
query = "SELECT * FROM public.orders WHERE (createdat>=current_date AND type=%s)"
cur.execute(query, ["t_order"])
If you want dynamic identifiers, table/column names then:
from psycopg2 import sql
query = sql.SQL("SELECT * FROM {} WHERE (createdat>=current_date AND type=%s)").format(sql.Identifier(table))
cur.execute(query, ["t_order"])
For more information see sql.

Synapse Dedicated SQL Pool - Copy Into Failing With Odd error - Python

I'm getting an error when attempting to insert from a temp table into a table that exists in Synapse, here is the relevant code:
def load_adls_data(self, schema: str, table: str, environment: str, filepath: str, columns: list) -> str:
if self.exists_schema(schema):
if self.exists_table(schema, table):
if environment.lower() == 'prod':
schema = "lvl0"
else:
schema = f"{environment.lower()}_lvl0"
temp_table = self.generate_temp_create_table(schema, table, columns)
sql0 = """
IF OBJECT_ID('tempdb..#CopyDataFromADLS') IS NOT NULL
BEGIN
DROP TABLE #CopyDataFromADLS;
END
"""
sql1 = """
{}
COPY INTO #CopyDataFromADLS FROM
'{}'
WITH
(
FILE_TYPE = 'CSV',
FIRSTROW = 1
)
INSERT INTO {}.{}
SELECT *, GETDATE(), '{}' from #CopyDataFromADLS
""".format(temp_table, filepath, schema, table, Path(filepath).name)
print(sql1)
conn = pyodbc.connect(self._synapse_cnx_str)
conn.autocommit = True
with conn.cursor() as db:
db.execute(sql0)
db.execute(sql1)
If I get rid of the insert statement and just do a select from the temp table in the script:
SELECT * FROM #CopyDataFromADLS
I get the same error in either case:
pyodbc.ProgrammingError: ('42000', '[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Not able to validate external location because The remote server returned an error: (409) Conflict. (105215) (SQLExecDirectW)')
I've run the generated code for both the insert and the select in Synapse and they ran perfectly. Google has no real info on this so could someone assist with this? Thanks
pyodbc.ProgrammingError: ('42000', '[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Not able to validate external location because The remote server returned an error: (409) Conflict. (105215) (SQLExecDirectW)')
This error occurs mostly because of authentication or access.
Make sure you have blob storage contributor access.
In the copy into script, add the authentication key for blob storage, unless it is a public blob storage.
I tried to repro this using copy into statement without authentication and got the same error.
After adding authentication using SAS key data is copied successfully.
Refer the Microsoft document for permissions required for bulk load using copy into statements.

Data Factory V2 Query Azure Table Storage but use a lookup Value

I have a SQL watermark table which contains the last date in my destination table
My source data is coming from an Azure Storage Table and the date time is a string
I set up the date time in the watermark table to match the format in the Azure table storage
I create a lookup and a copy task
If I hard code the date into the Query for source and run this works fine CreatedAt ge '2019-03-06T14:03:11.000Z'
But obviously I dont want to hard code this value. I want to use the date from the lookup
But when I replace the hardcoded date with the lookup value
CreatedAt ge 'activity('LookupWatermarkOld').output'
I get an error
{
"errorCode": "2200",
"message":"ErrorCode=FailedStorageOperation,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=A
storage operation failed with the following error 'The remote server returned an error: (400) Bad Request.'.,Source=,
''Type=Microsoft.WindowsAzure.Storage.StorageException,Message=The remote server returned an error: (400) Bad Request.,
Source=Microsoft.WindowsAzure.Storage,StorageExtendedMessage=Syntax
error at position 42 in 'CreatedAt ge 'activity('LookupWatermarkOld').output''.\nRequestId:8c65ced9-b002-0051-79d9-d41d49000000\nTime:2019-03-07T11:35:39.0640233Z,,''Type=System.Net.WebException,Message=The remote server returned an error: (400) Bad Request.,Source=Microsoft.WindowsAzure.Storage,'",
"failureType": "UserError",
"target": "CopyMentions"
}
Can anyone help me with this? How do you use the Lookup value in a Azure Table query?
check this out:
1) Lookup activity. Query field:
SELECT MAX(WatermarkColumnName) as LastId FROM TableName;
Also, make sure that you checked "First row only" option.
2) In Copy Data activity use query. Query field:
#concat('SELECT * FROM TableName as s WHERE s.WatermarkColumnName > ''', activity('LookupActivity').output.firstRow.LastID, '''')
Finally I got some help on this and it works with
CreatedAt gt '#{activity('LookupWatermarkOld').output.firstRow.WaterMarkValue}'
the WaterarkValue is the column name from the SQL Lookup table
The Lookup creates an array so you have to specify the FirstRow from this array
And wrap in '' so its used as a string value
--For recent ADFv2
Use the watermark/lookup/output value in parameter.
Example: ParamUserCount = #{activity('LookupActivity').output.count}
or for output function
and you can use it in query as
Example: "select * from userDetails where usercount = {$ParamUserCount}"
make sure you enclose the query in " " to set as string and parameter in query should be enclosed in { }

Can anyone help me with this error

I am trying to fetch some data from azure data lake to azure datawarehouse, but I am unable to do it I have followed the documentation link
https://learn.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-load-from-azure-data-lake-store
But I am getting this error when I am trying to create an external table, I have created another web/api app but still was not able to access thE application here is the error which I am facing
EXTERNAL TABLE access failed due to internal error: 'Java exception raised on call to HdfsBridge_IsDirExist. Java exception message:
GETFILESTATUS failed with error 0x83090aa2 (Forbidden. ACL verification failed. Either the resource does not exist or the user is not authorized to perform the requested operation.). [0ec4b8e0-b16d-470e-9c98-37818176a188][2017-08-14T02:30:58.9795172-07:00]: Error [GETFILESTATUS failed with error 0x83090aa2 (Forbidden. ACL verification failed. Either the resource does not exist or the user is not authorized to perform the requested operation.). [0ec4b8e0-b16d-470e-9c98-37818176a188][2017-08-14T02:30:58.9795172-07:00]] occurred while accessing external file.'
Here is the script which I am trying to get it to work with
CREATE DATABASE SCOPED CREDENTIAL ADLCredential2
WITH
IDENTITY = '2ec11315-5a30-4bea-9428-e511bf3fa8a1#https://login.microsoftonline.com/24708086-c2ce-4b77-8d61-7e6fe8303971/oauth2/token',
SECRET = '3Htr2au0b0wvmb3bwzv1FekK88YQYZCUrJy7OB3NzYs='
;
CREATE EXTERNAL DATA SOURCE AzureDataLakeStore11
WITH (
TYPE = HADOOP,
LOCATION = 'adl://test.azuredatalakestore.net/',
CREDENTIAL = ADLCredential2
);
CREATE EXTERNAL FILE FORMAT TextFileFormat
WITH
( FORMAT_TYPE = DELIMITEDTEXT
, FORMAT_OPTIONS ( FIELD_TERMINATOR = '|'
, DATE_FORMAT = 'yyyy-MM-dd HH:mm:ss.fff'
, USE_TYPE_DEFAULT = FALSE
)
);
CREATE EXTERNAL TABLE [extccsm].[external_medication]
(
person_id varchar(4000),
encounter_id varchar(4000),
fin varchar(4000),
mrn varchar(4000),
icd_code varchar(4000),
icd_description varchar(300),
priority integer,
optional1 varchar(4000),
optional2 varchar(4000),
optional3 varchar(4000),
load_identifier varchar(4000),
upload_time datetime2,
xx_person_id varchar(4000),--Person ID is the ID that we will use to represent the person through out the process uniquely, This requires initial analysis to determine how to set it
xx_encounter_id varchar(4000),--Encounter ID is the ID that will represent the encounter uniquely through out the process, This requires initial analysis to determine hos to set it based on client data
mod_optional1 varchar(4000),
mod_optional2 varchar(4000),
mod_optional3 varchar(4000),
mod_optional4 varchar(4000),
mod_optional5 varchar(4000),
mod_loadidentifier datetime2
)
WITH
(
LOCATION='\testfiles\procedure_azure.txt000\',
DATA_SOURCE = AzureDataLakeStore11, --DATA SOURCE THE BLOB STORAGE
FILE_FORMAT = TextFileFormat, --TYPE OF FILE FORMAT
REJECT_TYPE = percentage,
REJECT_VALUE = 1,
REJECT_SAMPLE_VALUE = 0
);
Please tell me whats wrong here?
I can reproduce this but it's hard to narrow down exactly. I think it's to do with permissions. From the Azure portal:
Data Lake Store > yourDataLakeAccount > your folder > Access
From there, make sure your AD Application has Read, Write and Execute permission on the relevant files / folders. Start with one file initially. I can reproduce the error by assigning / unassigning the Execute permissions but need to repeat the steps to confirm. I'll retrace my steps but for now concentrate your search here. In my example below, my Azure Active Directory Application is called adwAndPolybase; you can see I've given it Read, Write and Execute. I also experimented with the Advanced and 'Apply to children' options:

Load connecting tables from Cassandra in QlikView with DataSatx ODBC

I am new to both Cassandra (2.0) and QlikView (11).
I have two keyspaces (tables) with large amount of data in Cassandra and I want to load them to QlikView.
Since I can not load the entire set, filtering is necessary.
// In QlikView's edit script
ODBC CONNECT TO [DataStax Cassandra ODBC DSN64];
LOAD idsession,
logintime,
"h_id" as hid;
SQL SELECT *
FROM Cassandra.test.sessions
WHERE logintime > '2015-06-09'
ALLOW FILTERING;
LOAD idhost,
site;
SQL SELECT *
FROM Cassandra.test.hosts
WHERE idhost in hid;
The second query does not work, error from qlikview line 3:16 no viable alternative at input 'hid'.
My question: is it possible to get the h_ids from the first query and only collect the corresponding entities from the second table?
I assume that you can't do an Exists in the DataSyntax ODBC which may help. DataStax doc
This could be done with an external program like (C#) but I really want to do this in QlikView's script file:
// Not complete code
query = select * from sessions where loginTime > '2015-06-09';
foreach (var id in query) {
query2 = "select * from hosts where idhost = " + i;
}
EDIT
This can be solved when loading XML files:
TableA:
LOAD id,
itema
FROM
[C:\test1data.xlsx]
(ooxml, embedded labels);
TableB:
LOAD idb,
itemb,
ida
FROM
[C:\test2data.xlsx]
(ooxml, embedded labels) where(Exists (id,ida));
EDIT2
Besides the great answer from #i_saw_drones another solutions is to loop through ids.
For i = 1 to NoOfRows('Sessions')
Let cur_id = Peek('hid',i - 1,'Sessions');
LOAD
idhost,
site;
SQL SELECT *
FROM Cassandra.test.hosts
WHERE idhost = $(cur_id);
NEXT i
Nevertheless was the performance not the great. It took about 30 minutes to load around 300 K lines from Cassandra. The same queries were tested in a C# program with the connector and it took 9 sec. But that was just the query. Then you should write it to XML and then load it to QlikView.
The reason that the second query fails is because the WHERE clause is expecting to find a literal string list of values to look "in". For example:
LOAD
idhost,
site;
SQL SELECT *
FROM Cassandra.test.hosts
WHERE idhost in ('ID1', 'ID2', 'ID3', 'ID4');
The hid field returned by the first query is a QlikView list and as such cannot be immediately coerced into a string. We have to do a little more scripting to obtain a list of values from the first query in literal form, and then add that to the second query as part of the WHERE clause. The easiest way to do this is to concatenate all of your hids into a string and then use the string as part of your WHERE IN clause.
ODBC CONNECT TO [DataStax Cassandra ODBC DSN64];
MyData:
LOAD
idsession,
logintime,
"h_id" as hid;
SQL SELECT *
FROM Cassandra.test.sessions
WHERE logintime > '2015-06-09'
ALLOW FILTERING;
hid_entries:
LOAD
chr(39) & hids & chr(39) as hids;
LOAD
concat(hid, chr(39) & ',' & chr(39)) as hids;
LOAD DISTINCT
hid
RESIDENT MyData;
LET hid_values = '(' & peek('hids',0,'hid_entries') & ')';
DROP TABLE hid_entries;
LOAD
idhost,
site;
SQL SELECT *
FROM Cassandra.test.hosts
WHERE idhost in $(hid_values);

Resources