Azure Databricks cluster doesn't have access to mounted ADLS2 - azure

I followed the documentation azure-datalake-gen2-sp-access and mounted an ADLS2 storage account in Databricks, but when I try to view the data from the GUI I get the following error:
Cluster easy-matches-cluster-001 does not have the proper credentials to view the content. Please select another cluster.
I can't find any documentation about this, only something about premium Databricks, so can I only access it with a premium Databricks resource?
Edit1: I can see the mounted storage with dbutils.

After mounting the storage account, run this command to check whether you have data access permissions on the mount point you created:
dbutils.fs.ls("/mnt/<mount-point>")
If you have data access, you will see the files inside the storage account.
If you don't have data access, you will get a 403 error: "This request is not authorized to perform this operation using this permission".
If you are able to mount the storage but unable to access it, check whether the ADLS2 account has the necessary roles assigned.
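A minimal sketch of that check, wrapped so a permission failure is easy to spot (the mount-point name is a placeholder):

    # Verify data-plane access on an existing mount point.
    try:
        for f in dbutils.fs.ls("/mnt/<mount-point>"):
            print(f.path)
    except Exception as e:
        # A 403 here means the mount exists but the principal behind it
        # lacks a data role such as Storage Blob Data Contributor.
        print(f"No data access: {e}")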
I was able to reproduce the same issue. Since you are using an Azure Active Directory application, you have to assign the "Storage Blob Data Contributor" role to the Azure Active Directory application as well.
Below are the steps for granting the Storage Blob Data Contributor role to the registered application:
1. Select your ADLS account. Navigate to Access Control (IAM). Select Add role assignment.
2. Select the role Storage Blob Data Contributor, then search for and select your registered Azure Active Directory application and assign it.
3. Back in the Access Control (IAM) tab, search for your AAD app and check its access.
4. Run dbutils.fs.ls("/mnt/<mount-point>") to confirm access.
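For reference, a minimal sketch of the service-principal mount itself, following the pattern from the azure-datalake-gen2-sp-access doc (the container, account, tenant, and secret-scope names are placeholders):

    # OAuth configuration for mounting ADLS Gen2 with a service principal.
    configs = {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": "<application-id>",
        # Keep the client secret in a Databricks secret scope, not in the notebook.
        "fs.azure.account.oauth2.client.secret":
            dbutils.secrets.get(scope="<scope-name>", key="<secret-key>"),
        "fs.azure.account.oauth2.client.endpoint":
            "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
    }

    dbutils.fs.mount(
        source="abfss://<container>@<storage-account>.dfs.core.windows.net/",
        mount_point="/mnt/<mount-point>",
        extra_configs=configs,
    )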

Solved by unmounting, remounting, and restarting the cluster. I followed this doc: https://learn.microsoft.com/en-us/azure/databricks/kb/dbfs/remount-storage-after-rotate-access-key

If you still encounter the same issue even when the Access Control assignments check out, do the following:
1. Use dbutils.fs.unmount() on each mount point to unmount all storage accounts.
2. Restart the cluster.
3. Remount (see the sketch below).
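A minimal sketch of that sequence, assuming a configs dictionary like the one in the mount example above:

    # Unmount every user mount (DBFS system mounts are left alone).
    for m in dbutils.fs.mounts():
        if m.mountPoint.startswith("/mnt/"):
            dbutils.fs.unmount(m.mountPoint)

    # ... restart the cluster, then remount:
    dbutils.fs.mount(
        source="abfss://<container>@<storage-account>.dfs.core.windows.net/",
        mount_point="/mnt/<mount-point>",
        extra_configs=configs,
    )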

Related

Is there any method by which I can restrict other users from viewing my container in Azure Data Lake Gen2?

Problem statement: There are two different teams working on two different projects for the same client. Both teams have access to the Azure resource group in which the Azure Data Lake storage account has been created. Now the client wants us to use the same Data Lake storage account for both projects, but they also want the team working on a specific set of containers not to have access to the containers the other team uses, and vice versa.
Example:
Azure Data Lake storage account: both teams have access to this
-> container1: only team 1 should have access to this
-> container2: only team 2 should have access to this
Can anyone please suggest how we can achieve this? Thanks in advance!
You can manage access to containers, directories, and blobs by using the access control lists (ACLs) feature of Azure Data Lake Storage Gen2.
You can associate a security principal with an access level for files and directories. Each association is captured as an entry in an access control list (ACL). Each file and directory in your storage account has an access control list. When a security principal attempts an operation on a file or directory, an ACL check determines whether that security principal (user, group, service principal, or managed identity) has the correct permission level to perform the operation.
To manage the ACL on a container, follow the steps below:
1. In the storage account, navigate to the container, directory, or blob. Right-click the object, and then select Manage ACL.
2. The Access permissions tab of the Manage ACL page appears. Use the controls in this tab to manage access to the object.
3. To add a security principal to the ACL, select the Add principal button.
4. Find the security principal by using the search box, and then click the Select button.
You should create a security group in Azure AD for each of your teams, and then maintain permissions on the group rather than for individual users.
Refer: Access control lists (ACLs) in Azure Data Lake Storage Gen2
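For scripted setups, here is a minimal sketch of granting one team's AAD group access to one container via ACLs, using the azure-storage-file-datalake package (the account name and group object ID are placeholders, and the identity running it is assumed to have permission to change ACLs):

    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    service = DataLakeServiceClient(
        account_url="https://<storage-account>.dfs.core.windows.net",
        credential=DefaultAzureCredential(),
    )

    # Grant the team 1 AAD group rwx on container1, recursively; the
    # "default:" entry makes newly created children inherit the permission.
    group_id = "<team1-group-object-id>"
    root = service.get_file_system_client("container1").get_directory_client("/")
    root.update_access_control_recursive(
        acl=f"group:{group_id}:rwx,default:group:{group_id}:rwx"
    )

Repeat the call on container2 with the team 2 group. Note that authorization succeeds if either RBAC or the ACL grants access, so avoid giving the teams broad data-plane RBAC roles at the account level, which would bypass the per-container ACL separation.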

How to configure access to a single blob storage container

I need to enable one external user to access a single directory in a single container in my data lake, in order to upload some data. From what I see in the documentation, it should be possible to simply use RBAC and ACLs, so that the user can authenticate later on using PowerShell and Connect-AzureAD (or obtain an OAuth2 token).
However, I am having trouble with all the inherited permissions. Once I add a user to my Active Directory, he is not able to see anything unless I give him at least Reader access at the subscription level. This gives him at least Reader permission on all the resources in this subscription, which cannot be removed.
Is it possible to configure this access in such a way that my user is only able to see a single data lake, a single container, and a single folder within this container?
If you want just the one user to access only a single directory/container in your storage account, you should rather look at shared access signatures (SAS) or stored access policies.
For SAS: https://husseinsalman.com/securing-access-to-azure-storage-part-4-shared-access-signature/
For SAS built on top of stored access policies: https://husseinsalman.com/securing-access-to-azure-storage-part-5-stored-access-policy/
Once you have configured the permissions just for that directory/container, you can send that shared access signature to the user, and they can use Azure Storage Explorer to perform file upload/delete and other actions on your container.
Download Azure Storage Explorer here: https://azure.microsoft.com/en-us/features/storage-explorer/#overview
For how to use Azure Storage Explorer: https://www.red-gate.com/simple-talk/cloud/azure/using-azure-storage-explorer/
More on using Azure Storage Explorer with Azure Data Lake Gen2: https://medium.com/microsoftazure/guidance-for-using-azure-storage-explorer-with-azure-ad-authorization-for-azure-storage-data-access-663c2c88efb
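A minimal sketch of generating such a container-scoped SAS in Python with the azure-storage-blob package (the account name, key, and container are placeholders; for a directory-scoped SAS on a hierarchical-namespace account, the azure-storage-file-datalake package offers equivalents):

    from datetime import datetime, timedelta, timezone
    from azure.storage.blob import ContainerSasPermissions, generate_container_sas

    # Scope: one container, upload-oriented permissions, 7-day expiry.
    sas_token = generate_container_sas(
        account_name="<storage-account>",
        container_name="<container>",
        account_key="<account-key>",
        permission=ContainerSasPermissions(read=True, write=True, create=True, list=True),
        expiry=datetime.now(timezone.utc) + timedelta(days=7),
    )

    # The user pastes this into Azure Storage Explorer ("Connect via SAS URI").
    sas_url = f"https://<storage-account>.blob.core.windows.net/<container>?{sas_token}"
    print(sas_url)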

Unable to add service principals or groups to the $logs container in ADLS2

I recently enabled storage analytics on an ADLS Gen2 storage account. I can see the $logs container, and the logs are written to it on an hourly basis. But when I try to add a service principal to this container, I get permission denied. I have the Storage Blob Data Contributor role on this storage account; is any special permission required to achieve this?
In general, being able to manage IAM requires higher-level roles to be granted to your account. I assume that you're trying to grant access via the Access Control (IAM) feature / API call. Storage Blob Data Contributor is not sufficient, as it only grants read/write/delete access to containers and blobs.
You need a role that grants you the Microsoft.Authorization/*/write permission (for example, Owner or User Access Administrator) in order to get this working.
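If you want to check this from code, here is a rough sketch using the azure-mgmt-authorization package that lists the role assignments at the storage-account scope and flags roles that can write role assignments (the subscription, resource group, and account names are placeholders, and the wildcard matching is simplified):

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.authorization import AuthorizationManagementClient

    client = AuthorizationManagementClient(DefaultAzureCredential(), "<subscription-id>")

    scope = (
        "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
        "/providers/Microsoft.Storage/storageAccounts/<storage-account>"
    )

    for ra in client.role_assignments.list_for_scope(scope):
        rd = client.role_definitions.get_by_id(ra.role_definition_id)
        actions = [a for p in (rd.permissions or []) for a in (p.actions or [])]
        # Simplified check: "*" and "Microsoft.Authorization/*/write" both
        # allow creating role assignments at this scope.
        if any(a in ("*", "Microsoft.Authorization/*/write") for a in actions):
            print(rd.role_name, "-", ra.principal_id)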
The problem was resolved by adding the SP/groups from the portal at the container level instead of through Storage Explorer.

Azure Storage Explorer - Unable to list resources

I granted the user DataLakeTester the Reader and Storage Blob Data Reader roles on an Azure Data Lake Gen2 storage account.
Under Manage Access, I also granted full rights in both the Access and Default sections.
But when logged into Azure Storage Explorer as this user, I can connect to the data lake successfully but cannot list the containers, and I get the error below. Is there some other role assignment to be done?
The latest version of Storage Explorer now available is 1.11.1. Please update and try again: https://github.com/Microsoft/AzureStorageExplorer/releases
In response to your query: it works fine on my side. Could you try signing out and signing in again?
The RBAC roles you have appear to be sufficient. It can take some time for RBAC changes to propagate, so access in Storage Explorer might not work as expected for a few minutes.

Faster way to grant access privileges to ADLS on HDInsight cluster provisioning?

I have an Azure Data Lake Store (ADLS) containing ~100k files that I need to access from an HDInsight cluster for analysis. When I provision the cluster via the Azure Portal, I use this ADLS as the cluster's storage and assign rwx privileges on all files in the ADLS using a service principal and the "Data Lake Store Access" feature. This feature appears to grant access to each file one at a time, at a rate of about 2k files per minute: it takes over an hour just to grant the permissions!
Is there a faster way to grant a new cluster rwx privileges on its associated ADLS?
Yes, there is a better way to set this all up. On a one-time basis, you need to add permissions for an Azure Active Directory group to all your files and folders. Once that is set up, whenever you create a new HDInsight cluster the service principal simply needs to be made a member of the group.
So, to summarize:
1. Create a new Azure Active Directory group.
2. Propagate permissions in your ADLS account to this group on the appropriate files and folders (see the sketch below).
3. Create your HDInsight cluster, choosing the right service principal when creating it.
4. Add the service principal to the group created in step 1.
Hope this helps and do let me know if you have questions.
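For step 2, a rough sketch of propagating the group's permissions recursively with the azure-datalake-store (Gen1) package (the store name, credentials, and group object ID are placeholders, and the caller is assumed to have rights to change ACLs):

    from azure.datalake.store import core, lib

    # Authenticate as a principal that is allowed to modify ACLs.
    token = lib.auth(
        tenant_id="<tenant-id>",
        client_id="<client-id>",
        client_secret="<client-secret>",
    )
    adls = core.AzureDLFileSystem(token, store_name="<adls-account>")

    # Grant the AAD group rwx on everything under the root, plus a default
    # entry so newly created children inherit the permission.
    group_id = "<aad-group-object-id>"
    adls.modify_acl_entries(
        "/",
        acl_spec=f"group:{group_id}:rwx,default:group:{group_id}:rwx",
        recursive=True,
    )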
