Unable to create mount point in Databricks for ADLS Gen2 - apache-spark

I am trying to create a mount point to ADLS Gen2 in Databricks using a Key Vault-backed secret scope, but I keep getting an error.
I have Contributor access, and I also tried assigning both Storage Blob Data Contributor and Contributor to the SPN, but I still cannot create the mount point.
Any help would be appreciated.
configs= {"fs.azure.account.auth.type":"OAuth",
"fs.azure.account.oauth.provider.type":"org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
"fs.azure.account.oauth2.client.id":"abcdefgh",
"fs.azure.account.oauth2.client.secret":dbutils.secrets.get(scope="myscope",key="mykey"),
"fs.azure.account.oauth2.client.endpoint":"https://login.microsoftonline.com/tenantid/oauth2/token",
"fs.azure.createRemoteFileSystemDuringInitialization": "true"}
dbutils.fs.mount(
source= "abfss://cont1#storageaccount.dfs.core.windows.net/",
mount_point="/mnt/cont1",
extra_configs=configs)
The error I am getting is:
An error occurred while calling o280.mount.
: HEAD https://storageaccount.dfs.core.windows.net/cont1?resource=filesystem&timeout=90
StatusCode=403
StatusDescription=This request is not authorized to perform this operation.

When performing the steps in the "Assign the application to a role" section, make sure that your user account has the Storage Blob Data Contributor role assigned to it.
Repro: I provided Owner permission to the service principal and ran dbutils.fs.ls("mnt/azure/"); it returned the same error message as above.
Solution: Assign the Storage Blob Data Contributor role to the service principal.
After assigning the Storage Blob Data Contributor role to the service principal, I was finally able to get the output without any error message.
For more details, refer to "Tutorial: Azure Data Lake Storage Gen2, Azure Databricks & Spark".
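As a rough verification sketch (reusing the configs dictionary and the placeholder names from the question, and keeping in mind that a new role assignment can take a few minutes to propagate), you can drop any partially created mount, remount with the @ separator between container and storage account, and list the mount point:
# Verification sketch -- "cont1", "storageaccount", and "/mnt/cont1" are the placeholders from the question.
if any(m.mountPoint == "/mnt/cont1" for m in dbutils.fs.mounts()):
    dbutils.fs.unmount("/mnt/cont1")           # remove a partially created mount, if any

dbutils.fs.mount(
    source="abfss://cont1@storageaccount.dfs.core.windows.net/",
    mount_point="/mnt/cont1",
    extra_configs=configs)                     # same OAuth configs as defined in the question

display(dbutils.fs.ls("/mnt/cont1"))           # should list the container contents without a 403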

Related

Is the Databricks access connector a trusted resource in Azure?

I'm trying to set up an external location for Unity Catalog. The connection test succeeded against the storage account, which has network access limited to selected VNets and IPs, but I'm getting a 403 error when accessing the storage from a notebook, even after adding Storage Blob Data Contributor access to the managed identity. Did I miss anything?
My assumption was that, since I added the connector to the trusted resources, it would bypass the network rules.
Databricks throwing a 403 error
The main reason for the 403 error is an authorization issue when accessing the Azure storage account. To avoid access-related issues, assign the application to a role: make sure to assign the Storage Blob Data Contributor role to the service principal.
You only need the Storage Blob Data Contributor role assigned on your storage account for your service principal. To assign the Storage Blob Data Contributor role using the portal, follow this link.
I created a demt1 storage account for this demo. Open Access control (IAM) -> Role assignments.
![Role assignments under Access control (IAM)](https://i.imgur.com/a140fKd.png)
Under Role assignments, select the Storage Blob Data Contributor role created initially.
To check whether the role is assigned, open Access control (IAM) -> Check access -> Check access,
and search for Databricks.
Under Current assignments, the assigned role will be listed.
Open the Databricks workspace and try to access the storage account by mounting an existing container.
Additional Settings
Try adding the Databricks workspace's managed identity as a Storage Blob Data Contributor.
IAM for Databricks Managed Identity
You'll also want to add the relevant IAM conditional access, such as Read / Write permissions.
Conditional Access for Managed Identity
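Once the role assignment is in place for the access connector's managed identity (and the connector is allowed through the storage firewall), a quick check from a notebook is to list the external location path directly. A minimal sketch, where mycontainer and mystorageacct are hypothetical names, not values from the question:
# Sketch: confirm the notebook can reach the storage account governed by the external location.
# "mycontainer" and "mystorageacct" are placeholder names.
path = "abfss://mycontainer@mystorageacct.dfs.core.windows.net/"

try:
    entries = dbutils.fs.ls(path)              # a 403 here usually points to a missing data-plane role
    print(f"OK - {len(entries)} entries visible at {path}")
except Exception as e:
    print(f"Still blocked: {e}")               # re-check the role scope and the firewall/trusted-access settings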

What permissions do I need to create a queue storage trigger function?

I am new to Azure Functions and I want to create a queue trigger function to consume the items in a specific queue. But when I create the queue trigger function in VS Code, it keeps telling me that I lack some permissions.
The client 'live.com#***#gmail.com' with object id '***' does not have authorization to perform action 'Microsoft.Storage/storageAccounts/listKeys/action' over scope '/subscriptions/**/resourceGroups/***/providers/Microsoft.Storage/storageAccounts/***' or the scope is invalid. If access was recently granted, please refresh your credentials.
The permissions I currently have on this queue are as follows:
The permissions I currently have on the storage account are as follows:
I am confused about which permissions I need in order to create a queue-triggered function that consumes items from a specific queue.
Thank you!
When creating a function from VS Code using the Azure SDK, it will try to get the access key of the storage account: by default, this is how you authenticate to the storage. The error you're receiving says that you don't have permission to list the storage access keys.
From this documentation, these are the roles that have the Microsoft.Storage/storageAccounts/listKeys/action RBAC action:
The Reader and Data Access role
The Storage Account Contributor role
The Azure Resource Manager Contributor role
The Azure Resource Manager Owner role
Azure AD roles and Azure roles are two different things. You have to assign the Azure roles to the application you're working with.
In other words, you have to grant the service principal that runs your application the Storage Account Contributor role on the storage account.
Yes! As Thomas said, the roles that grant access to the storage account access keys are Storage Account Contributor along with Reader and Data Access, ARM Contributor, and ARM Owner.
I believe any one of the roles above is sufficient, depending on the level of access you require.
Refer to Azure storage account access keys for more information.
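For context, here is roughly what the generated queue trigger looks like in the Python v2 programming model; the queue name and connection setting below are placeholders, not values from the question. Locally, VS Code resolves that connection setting by listing the storage account keys, which is why one of the roles above is required:
# Sketch of a queue-triggered function (Python v2 programming model).
# "myqueue" and "AzureWebJobsStorage" are placeholder names.
import logging

import azure.functions as func

app = func.FunctionApp()

@app.queue_trigger(arg_name="msg", queue_name="myqueue", connection="AzureWebJobsStorage")
def process_queue_item(msg: func.QueueMessage) -> None:
    # Runs once for each message placed on the queue.
    logging.info("Queue item received: %s", msg.get_body().decode("utf-8"))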

Create a user delegation key from the Azure portal

I am trying to create a user delegation key from the Azure portal.
No matter which privileges I assign to myself, I hit the same error message:
You don't have permissions to grant read access. You can still create a shared access signature, but you'll need an RBAC role with additional permissions before you can grant that level of access to your signature recipient. Learn more about Azure roles for access to blob data.
So far I have the following roles assigned:
And the link provided in the error message says I need one of the following :
Contributor
Storage Account Contributor
Storage Blob Data Contributor
Storage Blob Data Owner
Storage Blob Data Reader
Storage Blob Delegator
So it should work, but it doesn't. What am I missing?
The error usually occurs if you don't have the required roles/permissions assigned to create a user delegation key.
Please note that in order to create a user delegation key, make sure you have a role that includes the following action:
Microsoft.Storage/storageAccounts/blobServices/generateUserDelegationKey
The above action is included in the below roles:
Storage Blob Data Contributor
Storage Blob Data Owner
Storage Blob Data Reader
Storage Blob Delegator
Try assigning either the Storage Blob Data Contributor or the Storage Blob Data Owner role, since you haven't assigned either of those.
Also check the scope at which you assigned the role: make sure to assign the roles at the level of the storage account, the resource group, or the subscription.
I tried this in my environment and got the same error when the roles were not assigned:
After assigning the roles, I was able to create the user delegation key successfully without errors.
If the error still persists, try creating an Azure support ticket.
For more details, please refer to the links below:
Create SAS tokens for containers and blobs with the Azure portal | Microsoft Docs
azure-docs/storage-blob-user-delegation-sas-create-cli.md at main · MicrosoftDocs/azure-docs · GitHub
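If the portal keeps refusing, the same user delegation key can also be requested programmatically once one of the roles above is assigned. A minimal sketch using the azure-identity and azure-storage-blob packages, where mystorageacct and mycontainer are hypothetical names:
# Sketch: request a user delegation key and build a user delegation SAS from it.
# "mystorageacct" and "mycontainer" are placeholder names.
from datetime import datetime, timedelta, timezone

from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient, ContainerSasPermissions, generate_container_sas

account_url = "https://mystorageacct.blob.core.windows.net"
service = BlobServiceClient(account_url=account_url, credential=DefaultAzureCredential())

now = datetime.now(timezone.utc)
# This call fails with an authorization error unless the signed-in identity holds one of the roles listed above.
delegation_key = service.get_user_delegation_key(key_start_time=now, key_expiry_time=now + timedelta(hours=1))

sas = generate_container_sas(
    account_name="mystorageacct",
    container_name="mycontainer",
    user_delegation_key=delegation_key,
    permission=ContainerSasPermissions(read=True, list=True),
    expiry=now + timedelta(hours=1),
)
print(f"{account_url}/mycontainer?{sas}")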

Azure Storage blob delete cache permission issue

I have an Azure account with Owner permission on our subscription. I can see that two role assignments exist on the same subscription: one is Owner and the other is Contributor. I am trying to delete the blob cache with the following Azure CLI command:
az storage blob delete-batch --source <containerName> --account-name <storageAccountName> --auth-mode login
I am getting the error below:
I am not sure why I am getting this error despite having sufficient permissions. Please help.
Attaching the permissions on my subscription:
My access permissions on the storage account:
If you set the --auth-mode parameter to login, it means that you use Azure AD auth to access Azure blob data. If so, the Azure AD security principal you used to log in should be assigned the Storage Blob Data Owner, Storage Blob Data Contributor, or Storage Blob Data Reader role. Otherwise, you have no permission to operate on Azure blobs.
Since your account has only been assigned the Owner role, set the --auth-mode parameter to key instead, which means the client retrieves the account access key and uses it for blob operations. The Owner role has the permissions to do that.
For more details, please refer to here and here
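For reference, the same Azure AD-authenticated batch delete can be sketched with the azure-storage-blob SDK, which makes the role requirement explicit: the loop below only succeeds once the signed-in identity holds one of the Storage Blob Data roles. The account and container names are placeholders, not the questioner's resources:
# Sketch: Azure AD-authenticated equivalent of `az storage blob delete-batch`.
# "mystorageacct" and "mycontainer" are placeholder names.
from azure.identity import DefaultAzureCredential
from azure.storage.blob import ContainerClient

container = ContainerClient(
    account_url="https://mystorageacct.blob.core.windows.net",
    container_name="mycontainer",
    credential=DefaultAzureCredential(),       # analogous to --auth-mode login
)

for blob in container.list_blobs():
    container.delete_blob(blob.name)           # 403 AuthorizationPermissionMismatch without a data-plane role
    print(f"Deleted {blob.name}")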

Azure Databricks: can't connect to Azure Data Lake Storage Gen2

I have a storage account kagsa1 with a container cont1 inside, and I need it to be accessible (mounted) via Databricks.
If I use the storage account key from Key Vault, it works correctly:
configs = {
    "fs.azure.account.key.kagsa1.blob.core.windows.net": dbutils.secrets.get(scope="kv-db1", key="storage-account-access-key")
}

dbutils.fs.mount(
    source="wasbs://cont1@kagsa1.blob.core.windows.net",
    mount_point="/mnt/cont1",
    extra_configs=configs)

dbutils.fs.ls("/mnt/cont1")
..but if I try to connect using Azure Active Directory credentials:
configs = {
    "fs.azure.account.auth.type": "CustomAccessToken",
    "fs.azure.account.custom.token.provider.class": spark.conf.get("spark.databricks.passthrough.adls.gen2.tokenProviderClassName")
}

dbutils.fs.ls("abfss://cont1@kagsa1.dfs.core.windows.net/")
..it fails:
ExecutionError: An error occurred while calling z:com.databricks.backend.daemon.dbutils.FSUtils.ls.
: GET https://kagsa1.dfs.core.windows.net/cont1?resource=filesystem&maxResults=5000&timeout=90&recursive=false
StatusCode=403
StatusDescription=This request is not authorized to perform this operation using this permission.
ErrorCode=AuthorizationPermissionMismatch
ErrorMessage=This request is not authorized to perform this operation using this permission.
The Databricks workspace tier is Premium,
the cluster has the Azure Data Lake Storage credential passthrough option enabled,
the storage account has the hierarchical namespace option enabled,
and the filesystem was initialized with:
spark.conf.set("fs.azure.createRemoteFileSystemDuringInitialization", "true")
dbutils.fs.ls("abfss://cont1#kagsa1.dfs.core.windows.net/")
spark.conf.set("fs.azure.createRemoteFileSystemDuringInitialization", "false")
and I have full access to the container in the storage account:
What am I doing wrong?
Note: When performing the steps in the "Assign the application to a role" section, make sure to assign the Storage Blob Data Contributor role to the service principal.
As part of the repro, I provided Owner permission to the service principal and ran dbutils.fs.ls("mnt/azure/"); it returned the same error message as above.
I then assigned the Storage Blob Data Contributor role to the service principal.
Finally, I was able to get the output without any error message after assigning the Storage Blob Data Contributor role to the service principal.
For more details, refer to "Tutorial: Azure Data Lake Storage Gen2, Azure Databricks & Spark".
Reference: Azure Databricks - ADLS Gen2 throws 403 error message.
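As a rough sketch of the fix (note that with credential passthrough the Storage Blob Data role has to cover whichever Azure AD identity the passed-through token belongs to), the listing from the question should succeed once the assignment propagates, and a passthrough mount can be created with the same configs. The mount point name below is hypothetical:
# Sketch: passthrough mount once a Storage Blob Data role is in place for the passed-through identity.
configs = {
    "fs.azure.account.auth.type": "CustomAccessToken",
    "fs.azure.account.custom.token.provider.class": spark.conf.get("spark.databricks.passthrough.adls.gen2.tokenProviderClassName"),
}

dbutils.fs.mount(
    source="abfss://cont1@kagsa1.dfs.core.windows.net/",
    mount_point="/mnt/cont1-passthrough",      # hypothetical mount point, not from the question
    extra_configs=configs)

dbutils.fs.ls("/mnt/cont1-passthrough")        # should list without the 403 once the role has propagated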
