Azure Databricks: Accessing Blob Storage Behind Firewall

I am reading files from an Azure Blob Storage account (gen 2) in an Azure Databricks notebook. Both services are in the same region (West Europe). Everything works fine, except when I add a firewall in front of the storage account. I have opted to allow "trusted Microsoft services".
However, running the notebook now ends up with an access denied error:
com.microsoft.azure.storage.StorageException: This request is not authorized to perform this operation.
I tried to access the storage directly from Spark and by mounting it with dbutils, but the result is the same.
I would have assumed that Azure Databricks counts as a trusted Microsoft service? Furthermore, I couldn't find solid information on IP ranges for Databricks regions that could be added to the firewall rules.

No, Azure Databricks does not count as a trusted Microsoft service. You can see the supported trusted Microsoft services in the storage account firewall documentation.
From a networking perspective, here are two suggestions:
Find the Azure datacenter IP addresses (original URL deprecated) and scope them to the region where your Azure Databricks workspace is located. Whitelist those IP ranges in the storage account firewall.
Deploy Azure Databricks in your own Azure Virtual Network (Preview), then allow that VNet's subnets in the firewall of the storage account. See "Configure Azure Storage firewalls and virtual networks" for details. You can also use an NSG to restrict inbound and outbound traffic for this VNet. Note: this requires deploying Azure Databricks to your own VNet.
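A minimal sketch of the first suggestion, assuming you have downloaded the "Azure IP Ranges and Service Tags" JSON file that Microsoft publishes for the public cloud. The sample data below is made up and only mimics that file's layout (`values[].name`, `properties.addressPrefixes`), and `prefixes_for_tag` is a hypothetical helper. Two caveats: storage IP rules accept IPv4 only, and, according to the storage firewall documentation, IP rules do not apply to requests that originate in the same region as the storage account, so this route may not help when both services sit in West Europe.

```python
import ipaddress
import json

# Illustrative sample only: it mimics the layout of the downloadable
# Service Tags file; the prefixes are placeholders, not real Azure ranges.
SAMPLE_SERVICE_TAGS = json.loads("""
{
  "values": [
    {"name": "AzureCloud.westeurope",
     "properties": {"region": "westeurope",
                    "addressPrefixes": ["192.0.2.0/24",
                                        "198.51.100.0/25",
                                        "2001:db8::/48"]}},
    {"name": "AzureCloud.northeurope",
     "properties": {"region": "northeurope",
                    "addressPrefixes": ["203.0.113.0/24"]}}
  ]
}
""")


def prefixes_for_tag(service_tags: dict, tag_name: str) -> list[str]:
    """Return the IPv4 prefixes listed under one service tag."""
    for entry in service_tags.get("values", []):
        if entry.get("name") == tag_name:
            prefixes = entry["properties"]["addressPrefixes"]
            # Keep IPv4 only, since storage IP rules do not take IPv6 ranges.
            return [p for p in prefixes
                    if ipaddress.ip_network(p, strict=False).version == 4]
    return []


west_europe = prefixes_for_tag(SAMPLE_SERVICE_TAGS, "AzureCloud.westeurope")
print(west_europe)
```

The resulting list is what you would paste into (or script against) the storage account firewall, and it has to be refreshed whenever the published file changes.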
Hope this helps.

The described scenario only works if you deploy Azure Databricks in your own Azure Virtual Network (VNet). With that you can use service endpoints and add your Databricks VNet to the Blob Storage firewall. With the default deployment this is not supported and not possible.
See the following documentation for more details and a description of how to get the VNet injection feature enabled.
Enabling the mentioned exception does not work, as Azure Databricks is not in the list of trusted services for Blob Storage. See the following documentation for the services that can still access the storage account with the exception enabled.
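As a rough sketch of what the service endpoint setup amounts to on the storage side: the account's network rule set (the `networkAcls` property in the ARM resource model) denies traffic by default and lists the allowed subnets. The helper names and the subscription, resource group, VNet and subnet names are placeholders. A VNet-injected workspace normally has two subnets, and both are listed here on that assumption.

```python
import json


def subnet_id(subscription: str, resource_group: str, vnet: str, subnet: str) -> str:
    """Build the ARM resource ID of a subnet."""
    return (f"/subscriptions/{subscription}/resourceGroups/{resource_group}"
            f"/providers/Microsoft.Network/virtualNetworks/{vnet}/subnets/{subnet}")


def build_network_acls(subnet_ids: list[str], bypass: str = "AzureServices") -> dict:
    """Deny by default and allow only the given subnets."""
    return {
        "defaultAction": "Deny",
        # The trusted-services bypass does not cover Databricks (see above);
        # it is kept here only because the question had it enabled.
        "bypass": bypass,
        "ipRules": [],
        "virtualNetworkRules": [
            {"id": sid, "action": "Allow"} for sid in subnet_ids
        ],
    }


# Placeholder identifiers, not taken from the question.
SUBSCRIPTION = "00000000-0000-0000-0000-000000000000"
acls = build_network_acls([
    subnet_id(SUBSCRIPTION, "rg-databricks", "vnet-databricks", "public-subnet"),
    subnet_id(SUBSCRIPTION, "rg-databricks", "vnet-databricks", "private-subnet"),
])
print(json.dumps(acls, indent=2))
```

Each listed subnet also needs the Microsoft.Storage service endpoint enabled, otherwise the rule has no effect.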

Related

"PackageUriForbidden" error when trying to deploy an Azure Cloud Service ES

I am trying to redeploy an Azure Cloud Service (classic) to an "extended support" one since the former is being deprecated. Following this guide and the prerequisites I have created a virtual network and new storage account. I set up a bunch of permissions and the Connectivity Check for my storage account indicates no problems. However when I try to create and deploy a new Cloud Service (Extended Support) using my (updated) .cscfg, .csdef and .cspkg files I get this error:
Error:AuthorizationFailure, message:This request is not authorized to perform this operation. (Code: PackageUriForbidden)
I've tried setting the container and blob access to public for the deploy files, I have added Network Contributor and Storage Blob Data Contributor to both the subscription and the cloud storage resources for my user account. What am I missing?
I tried deploying Cloud Services (extended support) via the Azure Portal and it deployed successfully.
I uploaded all my cloud service packages to my storage account, used those packages from my storage blobs, and created the Cloud Service ES instance.
I had enabled connections from all networks for my storage account, so I did not receive any authorization error.
It looks like your storage account has the firewall enabled for selected networks (VNets), or has an IP address rule that restricts access to the storage account.
I created a service endpoint for Microsoft.Storage in my VNet.
I added this VNet to the selected networks in the storage account's firewall.
I checked the option that allows Azure services to access the storage account.
Now, when I try to deploy another cloud service from the same storage account with the firewall and VNet rules enabled, I get the same error as you.
I then allowed my client machine's IP in the storage account firewall and was able to add the packages without any error while deploying the Cloud Service.
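To check whether a given client address is covered by the IP rules configured on the storage account, a small local test like the following can help. It is only a sketch: `is_ip_allowed` is a hypothetical helper, the addresses are made up, and the real firewall also evaluates VNet rules and exceptions that are ignored here.

```python
import ipaddress


def is_ip_allowed(client_ip: str, ip_rules: list[str]) -> bool:
    """True if client_ip falls inside any allowed address or CIDR range."""
    addr = ipaddress.ip_address(client_ip)
    return any(
        addr in ipaddress.ip_network(rule, strict=False) for rule in ip_rules
    )


# Made-up rules: one single address and one CIDR range.
rules = ["52.10.20.30", "40.80.0.0/16"]

print(is_ip_allowed("52.10.20.30", rules))  # exact match on the single address
print(is_ip_allowed("40.80.12.7", rules))   # inside the /16 range
print(is_ip_allowed("13.64.1.1", rules))    # not covered by any rule
```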

Restricting access to storage account containing package blob for cloud service (extended support) deployment

I'm nearly done migrating our cloud service (classic) deployments to cloud service (extended support). I'm working now on updating deployment pipelines. My package blob is located in a storage account. I create a SAS for the blob and use an API call to management.azure.com to create/update the deployment, passing ARM template as the body of the request.
This works correctly as long as the storage account with the package blob has its network set to "allow access from all networks". I want to restrict this access. I set the allow access from:
specific IP addresses of our devops servers
our own IP addresses
private vnet/subnets for the cloud service
I also tick the "Allow Azure services on the trusted services list to access this storage account" checkbox.
Yet the API call fails with an error message indicating that access to the blob is not allowed. When I change the storage account network configuration back to "allow access from all networks", everything works correctly.
With lots of searches, I found only one hit explaining the same problem - https://github.com/Azure/azure-powershell/issues/20299 - yet no solution has been suggested other than allowing access from all networks.
I must be missing some trick - but what is it? How can I restrict access to the storage account?

Accessing Azure Storage Accounts with Selected Network Enabled

As per the requirements, I need to enable the firewall with selected networks for Azure Storage accounts. But when I do so, even after adding all required IPs, the Azure Function App and Azure Data Factory go down.
Currently a VNet is unavailable and cannot be created. Managed Identity is not an option, as the Contributor role is unavailable.
Is there a way to configure the Data Factory and Function Apps after enabling the firewall with selected networks for Azure Key Vault and Azure Storage accounts?
The following steps may help as a workaround:
When network rules such as specific IP addresses, IP ranges or subnets are configured on a storage account, that storage account can only be accessed by applications that request data over the specified networks or through the specified Azure resources.
Also, when the Allow Trusted Services option is set to ON while enabling the firewall for a storage account, it allows connectivity from Azure trusted services such as Data Factory, Azure Functions, etc.
See this documentation for the list of trusted services allowed to access a key vault in Azure.
You have to create the VNet and attach it to the Azure Function App, which lets the Function App connect to the storage account.
Regarding "Currently the VNET is unavailable and cannot be created. Managed Identity is not an option as Contributor role unavailable":
To enable a service endpoint for a subnet attached to the storage account, you do not need the Contributor role; you can use a custom role that includes the Microsoft.Network/virtualNetworks/subnets/joinViaServiceEndpoint/action permission.
Refer to the Microsoft documentation for more information.
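For illustration, a custom role carrying the joinViaServiceEndpoint permission could be described with a definition along these lines. The role name, description and subscription ID are placeholders, and the JSON layout is assumed to follow the role definition format accepted by Azure CLI and Azure PowerShell, so check it against the current documentation before use.

```python
import json

# The permission named in the answer above.
JOIN_VIA_SERVICE_ENDPOINT = (
    "Microsoft.Network/virtualNetworks/subnets/joinViaServiceEndpoint/action"
)


def custom_role(name: str, subscription_id: str) -> dict:
    """Build a custom role definition that grants only the join action."""
    return {
        "Name": name,
        "IsCustom": True,
        "Description": "Allows adding subnets to storage firewall rules.",
        "Actions": [JOIN_VIA_SERVICE_ENDPOINT],
        "NotActions": [],
        "AssignableScopes": [f"/subscriptions/{subscription_id}"],
    }


# Placeholder name and subscription ID.
role = custom_role(
    "Subnet Service Endpoint Joiner",
    "00000000-0000-0000-0000-000000000000",
)
print(json.dumps(role, indent=2))
```

The printed JSON can be saved to a file and handed to whichever tooling you use to create role definitions.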

Whitelist Azure Automation account in Azure Storage to read zipped module from blob and upload in azure automation module

I have uploaded the zipped PowerShell module to Azure Blob Storage, and the storage account's networking is set to allow access from selected networks. I am running the command below to upload the module to Azure Automation:
New-AzAutomationModule -ResourceGroupName $automationrg -AutomationAccountName $automationaccount -Name ($Mod.Name).Replace('.zip','') -ContentLink $Blob.ICloudBlob.Uri.AbsoluteUri
After running this command, I get the error below.
[error]{"Message":"Module is not accessible. Exception: This request is not authorized to perform this operation."}
##[error]PowerShell exited with code '1'.
I checked and found that it is a firewall issue, and I cannot select access from all networks. How can I whitelist Azure Automation in the storage account's networking?
Whitelisting all of the Automation account's IP addresses is not an option, because it is impractical to keep up with updates to hundreds of IP addresses.
In this case, you can use Azure Private Link to connect networks to Azure Automation. It has one main limitation, though:
Private Link support with Azure Automation is available only in Azure Commercial and Azure US Government clouds.
One option which might work for you is to use a Hybrid Worker Group in Azure Automation. The systems can be your physical systems that can reach Azure or your Azure VMs. You can then grant access to the IP addresses that are in your Hybrid Runbook Worker group.
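When collecting the addresses of the Hybrid Runbook Workers, keep in mind that storage account IP rules accept only public IPv4 addresses. Workers that reach the account from private (RFC 1918) addresses, such as Azure VMs inside a VNet, would typically need a virtual network rule instead. A sketch of that split, with made-up worker addresses and a hypothetical helper name:

```python
import ipaddress

# Private ranges defined by RFC 1918.
RFC1918 = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def split_worker_ips(ips: list[str]) -> tuple[list[str], list[str]]:
    """Return (addresses usable as IP rules, addresses that are not)."""
    usable, not_usable = [], []
    for ip in ips:
        addr = ipaddress.ip_address(ip)
        is_public_v4 = addr.version == 4 and not any(addr in net for net in RFC1918)
        (usable if is_public_v4 else not_usable).append(ip)
    return usable, not_usable


# Made-up worker addresses for illustration.
workers = ["52.10.20.30", "10.0.0.4", "172.20.1.1", "192.168.1.10"]
usable, not_usable = split_worker_ips(workers)
print("add as IP rules:", usable)
print("need another route (for example a VNet rule):", not_usable)
```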
References:
Use Azure Private Link to securely connect networks to Azure Automation
Azure Automation network configuration details
Access storage account with Automation Account / Runbook - MSFT Q&A

Azure Storage Account Firewall Permissions for Vulnerability Assessment

I have created a storage account for use in storing the results of an Azure Vulnerability Assessment on an Azure SQL Database.
If the firewall on the storage account is disabled, allowing access from all networks, Azure Vulnerability Scans work as expected.
If the firewall is enabled, the Azure Vulnerability Scan on the SQL Database reports an error, saying the storage account is not valid or does not exist.
Checking the box for "Allow Azure services on the trusted services list to access this storage account." in Networking properties for the storage account does not work to resolve this issue, though it is the recommended step in the documentation here: https://learn.microsoft.com/en-us/azure/azure-sql/database/sql-database-vulnerability-assessment-storage
What other steps could resolve this issue, rather than just disabling the firewall?
You have to add the VNet and subnet used by the SQL Managed Instance to the storage account firewall, as mentioned in the document you are following.
After enabling the service endpoint for the subnet, click Add. The VNet should then appear in the list of allowed virtual networks.
After this is done, click Save and the issue should be resolved.
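Before adding the subnet to the storage firewall, it can help to confirm that the Microsoft.Storage service endpoint is actually enabled on it. The sketch below inspects a subnet description shaped like the ARM subnet resource (`properties.serviceEndpoints`). The sample dictionaries are hand-written for illustration and the helper name is hypothetical.

```python
def has_storage_endpoint(subnet: dict) -> bool:
    """True if the subnet lists a provisioned Microsoft.Storage service endpoint."""
    endpoints = subnet.get("properties", {}).get("serviceEndpoints") or []
    return any(
        ep.get("service") == "Microsoft.Storage"
        and ep.get("provisioningState", "Succeeded") == "Succeeded"
        for ep in endpoints
    )


# Hand-written samples; names are placeholders.
subnet_with_endpoint = {
    "name": "sql-subnet",
    "properties": {
        "serviceEndpoints": [
            {"service": "Microsoft.Storage", "provisioningState": "Succeeded"}
        ]
    },
}
subnet_without_endpoint = {"name": "other-subnet", "properties": {}}

print(has_storage_endpoint(subnet_with_endpoint))
print(has_storage_endpoint(subnet_without_endpoint))
```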
Reference:
Store Vulnerability Assessment scan results in a storage account accessible behind firewalls and VNets - Azure SQL Database | Microsoft Docs
