Presto/Superset query unable to find configuration property of new storage account

We have an Azure HDInsight cluster running Presto, with a Superset app connecting to it. We recently onboarded a new storage account to the cluster by updating core-site.xml, which allows us to create an external table from the Hive View.
We are able to query the external table from the new storage account in the Hive View without issue.
In the Superset app, we are able to locate the external table and see the table schema without issue.
But when we try to query the external table from the Superset app via Presto, it fails with: Presto error: Configuration property storageaccount.dfs.core.windows.net not found
Does anyone know what is missing from our setup? Any advice is appreciated.
Screenshots: core-site.xml setting; external table query successful in Hive View; Presto not able to query the same table.
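For illustration, the kind of core-site.xml entry involved when registering an additional ADLS Gen2 account key usually looks like the sketch below; the account name newstorageaccount and the key value are placeholders, not the real ones.

<!-- Hypothetical core-site.xml property registering an extra ADLS Gen2 storage account key. -->
<!-- "newstorageaccount" and the key value are placeholders. -->
<property>
  <name>fs.azure.account.key.newstorageaccount.dfs.core.windows.net</name>
  <value>BASE64_ENCODED_ACCOUNT_KEY</value>
</property>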

Problem resolved. We simply needed to restart Presto so that it picked up the updated core-site.xml.

Related

Unity Catalog external table upgrade on Azure Databricks fails with the error below

Has anyone come across the issue below during a Unity Catalog table upgrade?
We have configured Unity Catalog in Azure and assigned one workspace.
Added an external location and credentials in Unity Catalog.
Assigned CREATE permission on the external location.
The metastore was created using Azure Data Lake Storage Gen2.
The location is assigned with an abfss path and the connection test looks good.
When we try to upgrade one table, we get the error below:
[UPGRADE_NOT_SUPPORTED.UNSUPPORTED_FILE_SCHEME] Table is not eligible for upgrade from Hive Metastore to Unity Catalog. Reason: Unsupported file system scheme wasbs.
I am not seeing any issue on the Unity Catalog side. Are there any prerequisites on the source /mnt, since it uses wasbs? Usually that should not matter, as we are upgrading our external table using the external credential that was configured.
Do we need to convert the existing mount to the abfss format before starting the Unity Catalog external table upgrade? I am not seeing any reason for it.
We have tried updating the table properties and tested the prerequisites.
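As a minimal sketch (catalog, schema, and table names are placeholders, and this assumes the SYNC command is what performs the upgrade), one way to confirm which file scheme the source table actually points at, and what the upgrade looks like once the location uses abfss, is:

-- Check the storage location/scheme of the Hive metastore table (placeholder names);
-- the Location row shows whether the table still points at a wasbs:// path.
DESCRIBE TABLE EXTENDED hive_metastore.my_schema.my_table;

-- Upgrade the external table into Unity Catalog once its location uses abfss://
SYNC TABLE my_catalog.my_schema.my_table FROM hive_metastore.my_schema.my_table;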

How can I change the database dbt connects to with dbt-databricks?

I'm trying to get data from a different database in Databricks that's not the default one. However, I can't seem to find details about how to go about it.
The docs here only mention that it uses the default database in Databricks; however, my data is not in there.
Can anyone point me to some resources for querying a different database in Databricks?
Thanks
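A minimal sketch, assuming the dbt-databricks adapter (all names below are placeholders): the default schema comes from profiles.yml, but a model can override it and can also reference tables in another database with a fully qualified name.

-- models/my_model.sql (hypothetical): build into a non-default schema and read from a non-default database.
{{ config(schema='other_schema') }}

select *
from other_database.some_table  -- fully qualified reference instead of relying on the profile's default schema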

Is there a way to access the internal metastore of Azure HDInsight to fire queries on Hive metastore tables?

I am trying to access the internal Hive metastore tables like HIVE.SDS, HIVE.TBLS, etc.
I have an HDInsight Hadoop cluster running with the default internal metastore. From the Ambari screen, I got the advanced setting details required for connections, such as javax.jdo.option.ConnectionDriverName, javax.jdo.option.ConnectionURL, and javax.jdo.option.ConnectionUserName, as well as the password.
When I try connecting to the SQL Server instance (the internal Hive metastore) from a local machine, I get a message asking me to add my IP address to the allowed list. However, since this Azure SQL server is not visible in the list of Azure SQL databases in the portal, it is not possible for me to whitelist my IP.
So I tried logging into the cluster via the secure shell user (SSHUSER) and accessing the HIVE database from within the cluster using the metastore credentials provided in Ambari. I am still not able to access it. I am using sqlcmd to connect to SQL Server.
Does HDInsight prevent direct access to internal metastores? Is an external metastore the only way to move ahead? Any leads would be helpful.
Update: I created an external SQL Server instance, used it as an external metastore, and was able to access it programmatically.
No luck with the internal one yet.
There is no way to access the internal metastores of an HDInsight cluster. The internal metastores live in an internal subscription that only PGs (product groups) are able to access.
If you want more control over your metastores, it is recommended to bring your own "external" metastore.
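As a sketch of what querying an external metastore can look like (run e.g. via sqlcmd against the metastore database, as the asker does; TBLS and SDS follow the standard Hive metastore schema, and all connection details are placeholders):

-- List tables with their storage locations from an external Hive metastore database.
SELECT t.TBL_NAME, t.TBL_TYPE, s.LOCATION
FROM TBLS t
JOIN SDS s ON t.SD_ID = s.SD_ID;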

Azure Databricks Cluster Questions

I am new to Azure and am trying to understand the things below. It would be helpful if anyone could share their knowledge on this.
Can a table created in Cluster A be accessed in Cluster B if Cluster A is down?
What is the connection between the cluster and the data in the tables?
You need to have a running process (cluster) to be able to access the metastore and read data, because the data is stored in the customer's location and is not directly accessible from the control plane that runs the UI.
When you write data into a table, that data should be available in another cluster under the following conditions (see the sketch after this list):
both clusters use the same metastore
the user has the correct permissions (which can be enforced via table ACLs)
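For example (all names are placeholders, and this assumes both clusters are attached to the same metastore with table ACLs enabled):

-- On Cluster A: create and populate a table (placeholder names).
CREATE TABLE shared_db.events AS SELECT * FROM staging_events;

-- With table ACLs enabled, grant another user read access.
GRANT SELECT ON TABLE shared_db.events TO `analyst@example.com`;

-- On Cluster B (same metastore): the table remains readable even if Cluster A is terminated.
SELECT count(*) FROM shared_db.events;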

Common metastore across Databricks clusters

I have 3-4 clusters in my Databricks instance on the Azure cloud platform. I want to maintain a common metastore for all the clusters. Let me know if anyone has implemented this.
I recommend configuring an external Hive metastore. By default, Databricks spins up its own metastore behind the scenes, but you can create your own database (Azure SQL works, as do MySQL and Postgres) and specify it during cluster startup (see the config sketch after the list below).
Here are detailed steps:
https://learn.microsoft.com/en-us/azure/databricks/data/metastores/external-hive-metastore
Things to be aware of:
Data tab in Databricks: you can choose the cluster and see the different metastores.
To avoid using a SQL username & password, look at managed identities: https://learn.microsoft.com/en-us/azure/stream-analytics/sql-database-output-managed-identity
Automate external Hive metastore connections by using initialization scripts for your cluster.
Permissions management on your sources: in the case of ADLS Gen2, consider using credential passthrough.
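A minimal sketch of the cluster Spark config for pointing at an external Hive metastore on Azure SQL (all values are placeholders; the full option set is in the Microsoft doc linked above):

# Hypothetical Spark config entered at cluster creation; server, database, user, and secret names are placeholders.
spark.hadoop.javax.jdo.option.ConnectionURL jdbc:sqlserver://myserver.database.windows.net:1433;database=hivemetastore
spark.hadoop.javax.jdo.option.ConnectionDriverName com.microsoft.sqlserver.jdbc.SQLServerDriver
spark.hadoop.javax.jdo.option.ConnectionUserName metastore_user
spark.hadoop.javax.jdo.option.ConnectionPassword {{secrets/my_scope/metastore_password}}
spark.sql.hive.metastore.version 3.1.0
spark.sql.hive.metastore.jars maven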
