ERROR KeyProviderCache: Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider - apache-spark

I am getting an error while inserting data into a Hive table, although the data is inserted into the table successfully.
act = sqlContext.createDataFrame(df, schema)  # build a DataFrame from df with the given schema
act.createOrReplaceTempView("act_view")  # expose it to SQL as a temp view
sqlContext.sql("insert into table project_defect.biweb_t_activity select * from act_view")
It gives me the following error:
KeyProviderCache: Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider
I am using the Hortonworks Platform; if anyone has faced this issue, please suggest a fix.

Related

ERROR org.apache.hadoop.hdfs.KeyProviderCache - Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider !! -chgrp:

I'm trying to fetch data from Hive into Spark SQL in Spark.
I get results for a few tables, but when I try to access the same thing through Hive, it shows this error:
ERROR org.apache.hadoop.hdfs.KeyProviderCache - Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider !!
-chgrp:

Azure Data Factory Error: "incorrect syntax near"

I'm trying to do a simple incremental update from an on-prem database as source to an Azure SQL database, based on a varchar column called "RP" in the on-prem database that contains "date+staticdescription", for example: "20210314MetroFactory".
1. I've created a Lookup activity called Lookup1 that uses a table created in Azure SQL Database and this query:
"Select RP from SubsetwatermarkTable"
2. I've created a Copy data activity whose source settings use this query:
"Select * from SourceDevSubsetTable WHERE RP NOT IN '@{activity('Lookup1').output.value}'"
When debugging, I get this error:
Failure type: User configuration issue
Details: Failure happened on 'Source' side.
'Type=System.Data.SqlClient.SqlException,Message=Incorrect syntax near
'[{"RP":"20210307_1Plant
1KAO"},{"RP":"20210314MetroFactory"},{"RP":"20210312MetroFactory"},{"RP":"20210312MetroFactory"},{"RP":"2'.,Source=.Net
SqlClient Data
Provider,SqlErrorNumber=102,Class=15,ErrorCode=-2146232060,State=1,Errors=[{Class=15,Number=102,State=1,Message=Incorrect
syntax near
'[{"RP":"20210311MetroFactory"},{"RP":"20210311MetroFactory"},{"RP":"202103140MetroFactory"},{"RP":"20210308MetroFactory"},{"RP":"2'.,},],'
Can anyone tell me what I am doing wrong and how to fix it even if it requires creating more activities.
Note: There is no LastModifiedDate column in the table. Also I haven't yet created the StoredProcedure that will update the Lookup table when it is done with the incremental copy.
Steve is right about why it is failing and about the query you need in the Copy Data activity.
As he says, you want a comma-separated list of quoted values to use in your IN clause.
You can get this more easily, though, directly from your Lookup using this query:
select stuff(
    (
        select ',''' + rp + ''''
        from subsetwatermarktable
        for xml path('')
    ), 1, 1, ''
) as in_clause
The sub-query builds the comma-separated list with quotes around each rp value, but with a spurious comma at the start; the outer query's stuff removes it.
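To see what this produces, here is a minimal standalone sketch, assuming a table shaped like the question's subsetwatermarktable with two sample RP values:

declare @t table (rp varchar(50));  -- stand-in for subsetwatermarktable
insert into @t values ('20210314MetroFactory'), ('20210312MetroFactory');

select stuff(
    (
        select ',''' + rp + ''''
        from @t
        for xml path('')  -- concatenates the rows into one string
    ), 1, 1, ''           -- deletes the leading comma
) as in_clause;
-- Returns: '20210314MetroFactory','20210312MetroFactory'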
Now tick the First Row Only box on the Lookup and change your Copy Data source query to:
select *
from SourceDevSubsetTable
where rp not in (@{activity('Lookup1').output.firstRow.in_clause})
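With the two sample values above (assumed purely for illustration), the resolved query sent to the source would be:

select *
from SourceDevSubsetTable
where rp not in ('20210314MetroFactory','20210312MetroFactory')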
The result of @activity('Lookup1').output.value is an array, as your error shows:
[{"RP":"20210307_1Plant 1KAO"},{"RP":"20210314MetroFactory"},{"RP":"20210312MetroFactory"},{"RP":"20210312MetroFactory"},...]
However, your SQL should look like this: Select * from SourceDevSubsetTable WHERE RP NOT IN ('20210307_1Plant 1KAO','20210314MetroFactory',...).
To achieve this in ADF, you need to do something like this:
1. Create three variables: apostrophe (String, holding a single quote), arrayvalues (Array), and stringvalues (String).
2. Loop over the result of @activity('Lookup1').output.value with a ForEach activity and append each quoted RP value to arrayvalues:
ForEach items expression: @activity('Lookup1').output.value
Append variable value expression: @concat(variables('apostrophe'),item().RP,variables('apostrophe'))
3. Cast arrayvalues to a string and add parentheses with a Set variable activity:
expression: @concat('(',join(variables('arrayvalues'),','),')')
4. Copy to your Azure SQL database with this source query:
expression: Select * from SourceDevSubsetTable WHERE RP NOT IN @{variables('stringvalues')}
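Tracing two of the values from the error message through those steps (assumed here just for illustration), the variables evolve like this:

-- after step 2: arrayvalues = [ '20210307_1Plant 1KAO' , '20210314MetroFactory' ]  (each item already wrapped in apostrophes)
-- after step 3: stringvalues = ('20210307_1Plant 1KAO','20210314MetroFactory')
-- step 4 then resolves to:
Select * from SourceDevSubsetTable WHERE RP NOT IN ('20210307_1Plant 1KAO','20210314MetroFactory')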

Multiple Aggregations Across Partition Keys Only Work From Web-Based Azure Data Explorer But Not From Python Client Library

I'm working with the Azure Cosmos DB Core (SQL) API. Using the web-based Data Explorer, I can successfully run a query that has multiple aggregations across multiple partitions. Yet when I attempt to run such a query from my shell using the Azure client library for Python (pip install azure-cosmos==4.0.0), I get an error. I've tried two variations of the query, one including the partition key and one without; both returned the same error message.
container = database.get_container_client('some_container')
query1 = "select c.fmonth, c.fquarter, c.fyear, sum(c.revenue) as actual_revenue__sum, sum(c.predicted_revenue_m1) as predicted_revenue__sum from c where c.fyear=2020 group by c.fmonth, c.fquarter, c.fyear"
query2 = "select c.fmonth, c.fquarter, c.fyear, sum(c.revenue) as actual_revenue__sum, sum(c.predicted_revenue_m1) as predicted_revenue__sum from c where c.date_start >='2020-01-01' and c.date_start < '2021-01-01' group by c.fmonth, c.fquarter, c.fyear"
res = container.query_items(query1, enable_cross_partition_query=True)
Error Message Returned:
CosmosHttpResponseError: (BadRequest) Message: {"Errors":["Cross partition query only supports 'VALUE ' for aggregates."]}
ActivityId: 4961b99e-7032-4eac-ae84-2c8cab03a496, Microsoft.Azure.Documents.Common/2.11.0
It errors out for aggregates across multiple partitions, with enable_cross_partition_query set to True, but no VALUE keyword present.
If you want to use aggregates on multiple partitions, use the VALUE keyword, e.g. select value count(1) from c; this will work.
But the Python SDK doesn't support GROUP BY yet.
Refer to this document: the .NET SDK and the JS SDK support GROUP BY; other SDKs will support it later.
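As a sketch, the question's query1 could be reduced to single VALUE aggregates, one query per measure, since GROUP BY is not available here (queries assumed, built from query1's fields):

-- VALUE form works cross-partition; one aggregate per query, no GROUP BY
select value sum(c.revenue) from c where c.fyear = 2020
select value sum(c.predicted_revenue_m1) from c where c.fyear = 2020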
Hope this can help you.

presto: error while querying information_schema.columns

When I run the SQL select * from hive.information_schema.columns; in the Presto client, I get this error:
Query 20170208_085534_00061_ny9tu failed: outputFormat should not be accessed from a null StorageFormat
However, it succeeds when selecting from other tables in information_schema, like select * from hive.information_schema.tables;
Can anybody help?
Thanks.
This is a bug caused by a table that has metadata we aren't expecting.
I scanned all the tables in Hive and found that some tables' InputFormat/OutputFormat is null. I get the same error if I DESCRIBE TABLENAME in Presto for any table with a null InputFormat/OutputFormat.
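A quick way to check a suspect table from the Hive side (a sketch; the table name is a placeholder):

-- In Hive, print the table's storage metadata and look for
-- null/empty InputFormat and OutputFormat entries in the output.
DESCRIBE FORMATTED some_table;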

Unable to select from SQL Database tables using node-ibm_db

I created a new table in the Bluemix SQL Database service by uploading a csv (baseball.csv) and took the default table name of "baseball".
I created a simple app in Node.js which is just trying to select data from the table with select * from baseball, but I keep getting the following error:
[IBM][CLI Driver][DB2/NT] SQL0204N "USERxxxx.BASEBALL" is an undefined name
Why can't it find my database table?
This issue seems independent of Bluemix; rather, it is a usage error.
This error is typically caused by the following:
The object identified by name is not defined in the database.
User response: Ensure that the object name (including any required qualifiers) is correctly specified in the SQL statement and it exists.
Try running "list tables" from the command prompt to check whether your table name's spelling is correct.
http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.messages.sql.doc/doc/msql00204n.html?cp=SSEPGG_9.7.0%2F2-6-27-0-130
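You can also query the catalog directly to see the table name exactly as DB2 stored it (a sketch; the schema filter is an assumption, adjust to your USERxxxx schema):

-- Unquoted identifiers are folded to upper case at creation time,
-- so the stored name may not match a lower-case query.
SELECT tabschema, tabname
FROM syscat.tables
WHERE tabschema = CURRENT SCHEMA;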
I created the table from the SQL Database web UI in Bluemix and took the default name of baseball. It looks like this creates a case-sensitive table name.
Unfortunately for me, the sql_db library (and all DB2 clients, I believe) folds the unquoted table name to upper case, turning the query into "SELECT * FROM BASEBALL".
The solution was to either:
A. explicitly name my table BASEBALL in the web UI; or
B. modify my SQL query by quoting the table name:
select * from "baseball"
More info at http://www.ibm.com/developerworks/data/library/techarticle/0203adamache/0203adamache.html#N10121
