How can I set up Unity Catalog in Databricks? - Azure

How can I set up Unity Catalog in Azure Databricks from the Azure portal, including the metastore and storage containers?
I have created a Premium Azure Databricks workspace, but I am still unable to link a metastore to it to run Unity Catalog.

What error do you get, and where are you stuck?
I followed these instructions (they are nicely written):
https://learn.microsoft.com/en-us/azure/databricks/data-governance/unity-catalog/get-started
Importantly, you need the Global Administrator role in Azure, or someone with that role who can grant you the account admin role here:
https://accounts.azuredatabricks.net/login/

Related

How to create a new metastore?

I want to configure Unity Catalog, and one step is creating a metastore in the region where I created my Databricks workspace (I am on Azure).
I created a workspace with the Premium pricing tier, and I am the admin.
Following the documentation, I should go to the Data tab to create metastore.
However, when I open the Data tab, I don't see "Create Metastore" button.
The same in SQL persona:
Could you guide me on how to create a new metastore?
If a metastore is already created in the region, how can I find it?
In order to do this sort of management, you should access the Databricks account portal at the tenant level:
Databricks Account
From there, you can create and manage metastores, as well as assign a metastore to a Databricks workspace, which is what you have created.
Take into account that for most of what you have described, you must be an account admin for the Databricks Account.
As per the official docs (source):
The first Azure Databricks account admin must be an Azure Active Directory Global Administrator at the time that they first log in to the Azure Databricks account console. Upon first login, that user becomes an Azure Databricks account admin and no longer needs the Azure Active Directory Global Administrator role to access the Azure Databricks account. The first account admin can assign users in the Azure Active Directory tenant as additional account admins (who can themselves assign more account admins). Additional account admins do not require specific roles in Azure Active Directory.
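To check whether a metastore already exists, the account-level REST API can be queried instead of clicking through the account console. The sketch below is only an illustration: the endpoint path `/api/2.0/accounts/{account_id}/metastores`, the helper names, and the account ID and token arguments are assumptions that do not come from this thread, so verify them against the current Account API reference. The caller still has to be an account admin.

```python
import json
import urllib.request

# Account console host mentioned in this thread.
ACCOUNT_CONSOLE = "https://accounts.azuredatabricks.net"


def build_list_metastores_request(account_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that lists the account's metastores."""
    # Assumed endpoint path; confirm it in the Account API reference.
    url = f"{ACCOUNT_CONSOLE}/api/2.0/accounts/{account_id}/metastores"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def list_metastores(account_id: str, token: str) -> dict:
    """Send the request and decode the JSON body (needs network access and admin rights)."""
    request = build_list_metastores_request(account_id, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `list_metastores("<your-account-id>", "<your-token>")` with real values should return the metastores visible to the account, which also answers whether one already exists in your region.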
Configure your Unity Catalog metastore
Go to + New, click Notebook, and open the new notebook.
If you already have catalogs with data, use the command below to list them:
# Show all catalogs in the metastore.
display(spark.sql("SHOW CATALOGS"))
If you don't have a catalog, create one:
# Create a catalog.
spark.sql("CREATE CATALOG IF NOT EXISTS catalog_name")
# Set the current catalog.
spark.sql("USE CATALOG catalog_name")
For more information, refer to this official document and notebook.
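The notebook commands above can be wrapped in a small helper so that the catalog name is quoted consistently. This is a minimal sketch: the function names are hypothetical, and `spark` is assumed to be the session object that a Databricks notebook provides.

```python
from typing import List


def quote_identifier(name: str) -> str:
    """Wrap a name in backticks, doubling any backtick inside it."""
    return "`" + name.replace("`", "``") + "`"


def catalog_setup_statements(catalog_name: str) -> List[str]:
    """Return the SQL statements that create a catalog and make it current."""
    ident = quote_identifier(catalog_name)
    return [
        f"CREATE CATALOG IF NOT EXISTS {ident}",
        f"USE CATALOG {ident}",
    ]


def run_catalog_setup(spark, catalog_name: str) -> None:
    """Run each statement through the given Spark session."""
    for statement in catalog_setup_statements(catalog_name):
        spark.sql(statement)
```

In a notebook attached to a Unity Catalog enabled cluster, `run_catalog_setup(spark, "catalog_name")` would issue the same two statements shown above.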
You must be an Azure Databricks account admin to get started with Unity Catalog. The first account admin is set up by an Azure Active Directory Global Administrator for your tenant.
As per official documentation:
The first Azure Databricks account admin must be an Azure Active
Directory Global Administrator at the time that they first log in to
the Azure Databricks account console. Upon first login, that user
becomes an Azure Databricks account admin and no longer needs the
Azure Active Directory Global Administrator role to access the Azure
Databricks account. The first account admin can assign users in the
Azure Active Directory tenant as additional account admins (who can
themselves assign more account admins). Additional account admins do
not require specific roles in Azure Active Directory.
How to identify your Microsoft Azure global administrators for your subscriptions?
The global administrator has access to all administrative features. By default, the person who signs up for an Azure subscription is assigned the global administrator role for the directory. Only global administrators can assign other administrator roles.
Log in to the Azure Databricks account console as a Global Administrator; that account admin can then assign users in the Azure Active Directory tenant as additional account admins.
For more details, refer to Azure Databricks - Get started using Unity Catalog, and also to the MS Q&A thread How to access Azure Databricks account admin?, which addresses a similar issue.

Azure connectivity and access error, error code 403

Error 1: Failed to load one or more resources due to no access, error code 403.
I checked with the answers here but they don't work for me. As the screenshots below suggest, I am the service administrator, owner and contributor of the Synapse workspace. I also allow public access to the Synapse workspace.
Error 2: If I check the access control on Synapse studio portal, it says I am not the synapse administrator but I am actually the service administrator of the entire subscription.
Error 3: Cannot create an SQL pool.
The Azure IAM/RBAC roles are for working with the Azure resource, but the Synapse workspace also has its own access control. You will need to grant permissions/RBAC inside the workspace itself. [documentation]
I recommend using Groups to manage permissions, but you can start by adding yourself as a Synapse Administrator.
For Error 1: You may try the following and let us know.
The article Disabling Public Network Access in Synapse helps to resolve the issue.
For Error 2: Make sure you have the Synapse Administrator role under Manage => Security => Access Control.
For more details, refer to Grant access to SQL pools.

Azure Databricks Unity Catalogue Create metastore button unavailable

I am trying to create a metastore for use with a managed identity in Azure Databricks, but the Data tab only shows Create Table.
Per the documentation, the button should be there. I created the Databricks service and have the Azure Contributor role.
I am an admin of the Databricks workspace. Is it unavailable on Azure?
Well, you don't give details about your environment, so I can only give some ideas about what might be missing.
First, change the environment to "SQL" (click the "Data Science & Engineering" menu at the top left).
Second, do you meet all the requirements? They are listed here: https://learn.microsoft.com/en-us/azure/databricks/data-governance/unity-catalog/get-started#requirements
I think you are missing this permission:
*You must be an Azure Databricks account admin.
The first Azure Databricks account admin must be an Azure Active Directory Global Administrator at the time that they first log in to the Azure Databricks account console. Upon first login, that user becomes an Azure Databricks account admin and no longer needs the Azure Active Directory Global Administrator role to access the Azure Databricks account. The first account admin can assign users in the Azure Active Directory tenant as additional account admins (who can themselves assign more account admins). Additional account admins do not require specific roles in Azure Active Directory.*
To check if you are an Azure Databricks account admin you can access:
https://accounts.azuredatabricks.net/login?next_url=%2Flogin%2F
and verify that you have access to the Databricks administration screen.

Failed to access the Azure Dedicated SQL pool with the given credentials

Our organization has an Azure Synapse Dedicated Pool instance. I am trying to register the Azure Synapse Dedicated Pool with Azure Purview and want to scan the Synapse DB. However, I get the following error every time:
“Failed to access the Azure Dedicated SQL pool with the given credentials”
These are the steps I followed to register the data source:
I opened “Purview Studio”
There I have created a “Collection”
Then I go to “Register Sources”
Then I search for “Azure Synapse Dedicated Pool”
Then I select the subscription where my Azure Synapse Dedicated Pool is present
Then I Registered my Data Source
Now I am trying to create a New Scan for my Synapse Dedicated Pool
The problem starts here. First I selected the subscription, then the resource group, and then the Synapse DB name. I tried two authentication methods for my Synapse instance: the Purview MSI account and SQL authentication. I added my Purview MSI account as a user in the Synapse dedicated pool using the following commands:
CREATE USER [PurviewAccountName] FROM EXTERNAL PROVIDER
GO
EXEC sp_addrolemember 'db_datareader', [PurviewAccountName]
GO
I then tried to test the connection, but it does not work and gives me the following error:
“Failed to access the Azure Dedicated SQL pool with Purview MSI account”
My Azure Synapse Dedicated Pool instance is not publicly accessible; we have put it behind a private link. I can connect to my Azure Synapse instance over VPN from my machine and log in through SSMS and Azure Data Studio.
I also tried SQL authentication, using a SQL username and password stored in Key Vault. I have checked it multiple times and I am confident I have configured it correctly. But when I test the connection, it still shows the following error:
“Failed to access the Azure Dedicated SQL pool with the given credentials”
I read somewhere that I need a self-hosted integration runtime if the Azure Synapse instance is behind a private link.
So I installed the integration runtime on my machine, configured it, and tested the Synapse connection with SQL authentication while connected to the VPN. The self-hosted IR was configured successfully. I tested with both the Azure IR and the self-hosted IR, but no luck: I get the same error.
I have also added the Purview MSI account to the access policy in Key Vault and granted Get and List permissions on keys and secrets.
However, I cannot figure out what I am missing or why I keep getting the same error.
Any help on this would really mean a lot to me.
CREATE USER [PurviewAccountName] FROM EXTERNAL PROVIDER
GO
EXEC sp_addrolemember 'db_datareader', [PurviewAccountName]
GO
According to the official Microsoft documentation, to execute the above commands you must be an Azure Synapse Administrator in the workspace. Your Purview account must also have the Reader role, which can be assigned from Access Control (IAM) on the Azure Synapse workspace resource.
To create SQL pools, Apache Spark pools, and integration runtimes, users must have at least the Azure Contributor role on the workspace. The Contributor role also allows these users to manage the resources, including pausing and scaling. If you are using the Azure portal or Synapse Studio to create SQL pools, Apache Spark pools, and integration runtimes, you need the Azure Contributor role at the resource group level.
To GRANT access to a Dedicated SQL Pool database, the scripts can be run by the workspace creator or any member of the workspace1_SynapseAdministrators group.
Follow the below steps in the Azure Synapse SQL script editor:
Create the USER in the database by running the following command on the target database, selected using the Connect to dropdown:
CREATE USER [<alias@domain.com>] FROM EXTERNAL PROVIDER;
Grant the user a role to access the database:
EXEC sp_addrolemember 'db_owner', '<alias@domain.com>';
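If the same grant has to be repeated for several principals, the two statements can be generated rather than typed by hand. This is a rough sketch only: the helper names are hypothetical, and the default role `db_datareader` is taken from the question rather than from the documentation quoted in this answer. The generated text still has to be run in the Synapse SQL script editor against the target database by someone with sufficient rights.

```python
from typing import List


def bracket(name: str) -> str:
    """Quote a T-SQL identifier with brackets, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def sql_literal(value: str) -> str:
    """Quote a T-SQL string literal, doubling any single quote."""
    return "'" + value.replace("'", "''") + "'"


def grant_statements(principal: str, role: str = "db_datareader") -> List[str]:
    """Return the CREATE USER and role membership statements for one principal."""
    return [
        f"CREATE USER {bracket(principal)} FROM EXTERNAL PROVIDER;",
        f"EXEC sp_addrolemember {sql_literal(role)}, {bracket(principal)};",
    ]


if __name__ == "__main__":
    # "PurviewAccountName" is the placeholder used in the question.
    for statement in grant_statements("PurviewAccountName"):
        print(statement)
```

Passing `role="db_owner"` produces the variant used in the steps above.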

How to use IS_MEMBER('AAD_GROUPNAME') in Azure Synapse Analytics?

We are implementing row-level security in Azure Synapse Analytics, and we want a user to be able to access data only if they are a member of a specific Azure AD group. As per the [documentation][1], this function only checks Windows groups. Is there any workaround, or an ETA for when this feature will be available?
[1]: https://learn.microsoft.com/en-us/sql/t-sql/functions/is-member-transact-sql?view=sql-server-ver15
We tried the query below, but it always returns NULL:
SELECT IS_MEMBER('AAD_Group_Name')
The document you shared clearly mentions that the IS_MEMBER function is not supported for Azure Active Directory groups.
You can raise the feature request here.
Alternatively, you can check this official document about How to set up access control for your Azure Synapse workspace. This will help you to understand and implement control access to a Microsoft Azure Synapse workspace using Azure roles, Azure Synapse roles, SQL permissions, and Git permissions.
