Azure Blob storage networking rules (IP) for Azure Data Warehouse

I need to load external data (in blob storage) into my Azure Data Warehouse using PolyBase. It worked fine when I was using classic Azure Storage.
Recently I had to move our storage to ARM, and I could not figure out how to set up the firewall rule on the ARM storage account for my Azure Data Warehouse. If I set the firewall to "All networks", everything works seamlessly. However, I cannot leave the blob storage wide open.
I tried using nslookup to find the outbound IP of our Azure Data Warehouse and added that value to the storage firewall, but I got the error "This request is not authorized to perform this operation."
Is there a way to find the IP address of an Azure Data Warehouse? Or should I use a different approach to make it work?
Any suggestions are appreciated.
Kevin

Under the section 1.1 Create a Credential, it states:
Don't skip this step if you are using this tutorial as a template for loading your own data. To access data through a credential, use the following script to create a database-scoped credential, and then use it when defining the location of the data source.
-- A: Create a master key.
-- Only necessary if one does not already exist.
-- Required to encrypt the credential secret in the next step.
CREATE MASTER KEY;
-- B: Create a database scoped credential
-- IDENTITY: Provide any string, it is not used for authentication to Azure storage.
-- SECRET: Provide your Azure storage account key.
CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredential
WITH
IDENTITY = 'user',
SECRET = '<azure_storage_account_key>'
;
-- C: Create an external data source
-- TYPE: HADOOP - PolyBase uses Hadoop APIs to access data in Azure blob storage.
-- LOCATION: Provide Azure storage account name and blob container name.
-- CREDENTIAL: Provide the credential created in the previous step.
CREATE EXTERNAL DATA SOURCE AzureStorage
WITH (
TYPE = HADOOP,
LOCATION = 'wasbs://<blob_container_name>@<azure_storage_account_name>.blob.core.windows.net',
CREDENTIAL = AzureStorageCredential
);
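If the external data source is generated from a script, the LOCATION value can be assembled rather than typed by hand. The following is a minimal Python sketch, assuming the usual wasbs://<container>@<account>.blob.core.windows.net pattern; the function name and the sample container and account names are made up for illustration.

```python
def wasbs_location(container: str, account: str) -> str:
    """Build the LOCATION string for a PolyBase external data source.

    Hypothetical helper: the container and account names are supplied
    by the caller and are not validated against Azure naming rules.
    """
    if not container or not account:
        raise ValueError("container and account must be non-empty")
    # The container comes before the '@', the storage account after it.
    return f"wasbs://{container}@{account}.blob.core.windows.net"


if __name__ == "__main__":
    # Placeholder names, not real resources.
    print(wasbs_location("mycontainer", "mystorageaccount"))
```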
Edit: (additional way to access Blobs from ADW through the use of SAS):
You also can create a Storage linked service by using a shared access signature. It provides the data factory with restricted/time-bound access to all/specific resources (blob/container) in the storage.
A shared access signature provides delegated access to resources in your storage account. You can use a shared access signature to grant a client limited permissions to objects in your storage account for a specified time. You don't have to share your account access keys. The shared access signature is a URI that encompasses in its query parameters all the information necessary for authenticated access to a storage resource. To access storage resources with the shared access signature, the client only needs to pass in the shared access signature to the appropriate constructor or method. For more information about shared access signatures, see Shared access signatures: Understand the shared access signature model.
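Because the SAS carries its permissions and lifetime in the query string, those values can be inspected before the URI is handed to a client. Below is a rough Python sketch using only the standard library. It assumes the common SAS parameter names (sp for permissions, se for expiry, sig for the signature); the helper names and the sample URL are invented, and the signature in it is fake.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def sas_summary(sas_url: str) -> dict:
    """Extract permissions, expiry and signature presence from a SAS URL."""
    query = parse_qs(urlsplit(sas_url).query)

    def first(name):
        values = query.get(name)
        return values[0] if values else None

    return {
        "permissions": first("sp"),
        "expiry": first("se"),
        "has_signature": first("sig") is not None,
    }


def is_expired(sas_url: str, now=None) -> bool:
    """Return True if the 'se' timestamp is at or before 'now' (UTC)."""
    expiry = sas_summary(sas_url)["expiry"]
    if expiry is None:
        return False  # no expiry parameter present in the URL
    expires_at = datetime.fromisoformat(expiry.replace("Z", "+00:00"))
    if expires_at.tzinfo is None:
        # Date-only values carry no zone; treat them as UTC.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return now >= expires_at


if __name__ == "__main__":
    # Made-up URL with a fake signature, for illustration only.
    url = ("https://account.blob.core.windows.net/container"
           "?sv=2022-11-02&sr=c&sp=rl&se=2030-01-01T00%3A00%3A00Z&sig=FAKE")
    print(sas_summary(url))
```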
Full document can be found here

Related

Azure blob storage - SAS - Data Factory

I was able to test the blob connection successfully, but when I try to browse for the storage path it shows the error below.
Full error:
Failed to load
Blob operation failed for: Blob Storage on container '' and path '/' get failed with 'The remote server returned an error: (403) Forbidden.'. Possible root causes: (1). Grant service principal or managed identity appropriate permissions to do copy. For source, at least the “Storage Blob Data Reader” role. For sink, at least the “Storage Blob Data Contributor” role. For more information, see https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-blob-storage?tabs=data-factory#service-principal-authentication. (2). It's possible because some IP address ranges of Azure Data Factory are not allowed by your Azure Storage firewall settings. Azure Data Factory IP ranges please refer https://docs.microsoft.com/en-us/azure/data-factory/azure-integration-runtime-ip-addresses. If you allow trusted Microsoft services to access this storage account option in firewall, you must use https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-blob-storage?tabs=data-factory#managed-identity. For more information on Azure Storage firewalls settings, see https://docs.microsoft.com/en-us/azure/storage/common/storage-network-security?tabs=azure-portal.. The remote server returned an error: (403) Forbidden.StorageExtendedMessage=Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
Context: I'm trying to copy data from a SQL database to Snowflake, and I am using Azure Data Factory for that. Since this doesn't publish, I enabled staged copy and connected Blob storage.
I already checked the networking settings, and access is enabled from all networks. I'm not sure what I'm missing here, because I found a YouTube video that has it working, but it didn't show an issue similar to this one. https://www.youtube.com/watch?v=5rLbBpu1f6E.
I also tried leaving the storage path empty, but the trigger run of the copy data pipeline isn't successful either.
Full error from trigger:
Operation on target Copy Contacts failed: Failure happened on 'Sink' side. ErrorCode=FileForbidden,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Error occurred when trying to upload a blob, detailed message: dbo.vw_Contacts.txt,Source=Microsoft.DataTransfer.ClientLibrary,''Type=Microsoft.WindowsAzure.Storage.StorageException,Message=The remote server returned an error: (403) Forbidden.,Source=Microsoft.WindowsAzure.Storage,StorageExtendedMessage=Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
I created a Blob storage account and generated a SAS token for it. I then created a Blob storage linked service using the SAS URI, and it was created successfully.
When I tried to retrieve the path, I got the same error.
I changed the networking settings of the storage account to enable access from all networks.
I then tried to retrieve the path again in Data Factory, and it worked.
Alternatively, you can resolve this issue by allowlisting the required IP addresses in the storage firewall.
From the error message:
'The remote server returned an error: (403) Forbidden.'
It's likely the authentication method you're using doesn't have enough permissions on the blob storage to list the paths. I would recommend using the Managed Identity of the Data Factory to do this data transfer.
Take the name of the Data Factory.
Assign the Storage Blob Data Contributor role, scoped to the container or the storage account, to the ADF managed identity (found by the name from step 1).
On your blob linked service inside Data Factory, choose the managed identity authentication method.
Also, if you stage your data transfer on the blob storage, make sure the user can write to the blob storage and also has bulk permissions on SQL Server.

Trying to set up a linked service through a registered app to Azure Data Lake Storage and keep getting error 24200

I am new to Azure. We have Azure Data Lake Storage set up. I am trying to set up a linked service from Data Factory to Azure Data Lake Storage Gen2. It keeps failing when I test the linked service. As far as I can see, I have granted the "Storage blob contributor" role to the user on the data lake storage. I still keep getting a permission denied error when I test the linked service:
ADLS Gen2 operation failed for: Storage operation '' on container 'testconnection' get failed with 'Operation returned an invalid status code 'Forbidden''. Possible root causes: (1). It's possible because the service principal or managed identity don't have enough permission to access the data. (2). It's possible because some IP address ranges of Azure Data Factory are not allowed by your Azure Storage firewall settings. Azure Data Factory IP ranges please refer https://learn.microsoft.com/en-us/azure/data-factory/azure-integration-runtime-ip-addresses.. Account: 'dlsisrdatapoc001'. ErrorCode: 'AuthorizationFailure'. Message: 'This request is not authorized to perform this operation.'.
What I observe is that when I open the network to all (public) on the data lake storage, it works; when I set the firewall with CIDR ranges, it fails. I couldn't narrow down the cause of the problem. I do have "Allow Azure services on the trusted services list to access this account" checked.
I'm completely lost.
As mentioned in the error description, this error usually occurs if you don't have sufficient permissions to perform the action, or if the required IPs have not been added to the firewall settings of your storage account.
To resolve the error, check that you assigned the Storage Blob Data Contributor role to your managed identity as well as to the user:
Go to Azure Portal -> Storage Accounts -> Your Storage Account -> Access Control (IAM) ->Add role assignment
Make sure to select the managed identity, based on the authentication method you selected while creating linked service.
As mentioned in this MsDoc, make sure to add all the required IPs based on your resource location and service tag.
Download the JSON file to find the IP ranges for the service tag in your resource location, and add them in the firewall settings.
Make sure to select Microsoft.DataFactory/factories as the resource type while choosing CIDR.
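Picking the ranges out of the downloaded JSON by hand is error-prone, so a short script can help. This is a hedged Python sketch: it assumes the file has the layout I have seen in the published service tags download (a top-level "values" list whose entries have "name" and "properties.addressPrefixes"), and it keeps only IPv4 ranges on the assumption that the storage firewall IP rules take IPv4 only. The sample data uses documentation placeholder ranges, not real Data Factory addresses.

```python
import ipaddress


def prefixes_for_tag(service_tags: dict, tag_name: str) -> list:
    """Return the address prefixes listed for one service tag, or []."""
    for entry in service_tags.get("values", []):
        if entry.get("name") == tag_name:
            return entry.get("properties", {}).get("addressPrefixes", [])
    return []


def ipv4_only(prefixes: list) -> list:
    """Keep only the IPv4 CIDR ranges, preserving their order."""
    return [
        p for p in prefixes
        if ipaddress.ip_network(p, strict=False).version == 4
    ]


if __name__ == "__main__":
    # Invented sample in the assumed file layout. In practice, load the
    # downloaded file with json.load() instead.
    sample = {
        "values": [
            {
                "name": "DataFactory.WestEurope",
                "properties": {
                    "addressPrefixes": [
                        "203.0.113.0/24",
                        "2001:db8::/32",
                        "198.51.100.0/24",
                    ]
                },
            }
        ]
    }
    for cidr in ipv4_only(prefixes_for_tag(sample, "DataFactory.WestEurope")):
        print(cidr)
```

Use the tag that matches the region of the integration runtime that actually runs the copy.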
For more details, please refer to the links below:
Error when I am trying to connect between Azure Data factory and Azure Data lake Gen2 by Anushree Garg
Storage Accoung V2 access with firewall, VNET to data factory V2 by Cindy Pau

Azure ADF using Azure Batch throws Shared Access Signature generation error

I am working on a simple Azure Data Factory pipeline where I have added a Batch Service activity and specified the Batch Service account (which I created through a linked service, and the test connection works). In the command I am just running a simple "ls" command, and when I do a debug run I get this error: "Cannot create Shared Access Signature unless Account Key credentials are used." I have the following linked services: "Azure Batch", "Azure Blob Storage" and Key Vault (where we store the access key). All linked service connections are working properly.
Any help on how to fix this error: "Cannot create Shared Access Signature unless Account Key credentials are used."
Azure Batch Linked service:
Azure Storage Linked service:
Azure Data factory pipeline:
The issue happens because you use "Managed Identity" to connect ADF to the Storage. It will say "successful" when doing a connection test on the linked services but when this storage is used for a Batch, it needs to have "Account Key" authentication type (see here).

Secure access to Azure Table Storage via Azure Function

I have data stored in Azure Table Storage and want to secure it such that only my API (a function app) can read and write data.
What is best practice and how can I do this? I thought setting --default-action on the network rules to Deny for the Storage, plus adding a --bypass Logging Metrics AzureServices would shut down access but enable my Azure services, but this did not work.
I then looked at creating a Managed Service Identity (MSI) for the function app and adding RBAC to the Storage Account, but this did not work either. It doesn't look like MSIs are supported for Table Storage (see "Access Azure Table Storage with Azure MSI").
Am I missing or misunderstanding something? How do I secure the data in the tables in the Storage account, and is this even possible?
As the link you provided states, Azure Table Storage does not support Azure MSI (at the time of this answer); it only supports Shared Key (storage account key) and shared access signature (SAS) authorization.
You must use Shared Key authorization to authorize a request made against the Table service if your service is using the REST API to make the request.
To encode the signature string for a request against the Table service made using the REST API, use the following format:
StringToSign = VERB + "\n" +
Content-MD5 + "\n" +
Content-Type + "\n" +
Date + "\n" +
CanonicalizedResource;
You can use Shared Key Lite authorization to authorize a request made against any version of the Table service.
StringToSign = Date + "\n" +
CanonicalizedResource;
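To make the Shared Key Lite format concrete, here is a small Python sketch that signs the two-line string with HMAC-SHA256 over the Base64-decoded account key and builds the Authorization header value. The account name, table path and key below are placeholders I made up, and the function name is hypothetical; treat it as an illustration of the signing step rather than a complete client.

```python
import base64
import hashlib
import hmac


def shared_key_lite_header(account: str, account_key_b64: str,
                           date: str, canonicalized_resource: str) -> str:
    """Build a 'SharedKeyLite account:signature' header value.

    'date' must be the same string sent in the Date or x-ms-date header.
    """
    string_to_sign = f"{date}\n{canonicalized_resource}"
    key = base64.b64decode(account_key_b64)  # the account key is Base64 text
    digest = hmac.new(key, string_to_sign.encode("utf-8"),
                      hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"SharedKeyLite {account}:{signature}"


if __name__ == "__main__":
    # Placeholder key, not a real storage account key.
    fake_key = base64.b64encode(b"not-a-real-key").decode("ascii")
    header = shared_key_lite_header(
        "myaccount", fake_key,
        "Mon, 01 Jan 2024 00:00:00 GMT",
        "/myaccount/mytable",
    )
    print(header)
```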
For more details, you could refer to this article.
To secure Azure Table Storage data, you can apply the following configurations:
Use selected networks instead of public network access. This setting is available under "Firewalls and virtual networks" on the storage account.
As a second step, you can either move the data to Azure Key Vault or use an encryption key stored in Azure Key Vault to encrypt the required fields in Azure Table Storage. This way you won't face Azure Key Vault's throttling limits - https://learn.microsoft.com/en-us/azure/key-vault/general/service-limits#secrets-managed-storage-account-keys-and-vault-transactions

Storage destination needs to have a Service SAS, not an Account SAS. What Does This Mean?

Hello, I have recently been trying to use the Azure Graph request documented here:
https://learn.microsoft.com/en-us/graph/api/user-exportpersonaldata?view=graph-rest-1.0&tabs=http
As stated there, the request takes a storage location, described as: "This is a shared access signature (SAS) URL to an Azure Storage account, to where data should be exported."
Every time I provide my SAS URL I get this error: "Storage destination needs to have a Service SAS, not an Account SAS"
Can someone please help me understand what this means? The documentation it links is not clear.
Storage destination needs to have a Service SAS, not an Account SAS
Difference between Account SAS and Service SAS is described here: https://learn.microsoft.com/en-us/rest/api/storageservices/delegate-access-with-shared-access-signature#types-of-shared-access-signatures.
You're providing an SAS URL for the entire account (e.g. https://account.blob.core.windows.net/?sas-parameters) whereas it is expected that you provide a SAS URL for a specific blob container (e.g. https://account.blob.core.windows.net/blob-container/?sas-parameters).
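One quick way to tell which kind of SAS you have is to look at its query parameters. The Python sketch below is a heuristic for Blob storage URLs only: it assumes an account SAS carries the ss (signed services) and srt (signed resource types) parameters, while a blob service SAS carries sr (signed resource). The function name and sample URLs are invented and the signatures are fake.

```python
from urllib.parse import parse_qs, urlsplit


def sas_kind(sas_url: str) -> str:
    """Guess whether a Blob storage SAS URL is an account or service SAS."""
    params = parse_qs(urlsplit(sas_url).query)
    if "ss" in params and "srt" in params:
        return "account"
    if "sr" in params:
        return "service"
    return "unknown"


if __name__ == "__main__":
    # Made-up URLs with fake signatures.
    account_sas = ("https://account.blob.core.windows.net/"
                   "?sv=2022-11-02&ss=b&srt=co&sp=rw&sig=FAKE")
    service_sas = ("https://account.blob.core.windows.net/blob-container"
                   "?sv=2022-11-02&sr=c&sp=rw&sig=FAKE")
    print(sas_kind(account_sas), sas_kind(service_sas))
```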
There are two possible solutions:
Create a SAS URL for a specific blob container. Or in other words create a Service SAS as the error message is telling you to do. You can do so using a tool like Microsoft Storage Explorer.
Insert the blob container name in your account SAS URL so that it looks something like this: https://account.blob.core.windows.net/blob-container/?sas-parameters.
Please note that if you're using an Account SAS, it should at least have Write permission on Object for Blob service.
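The second option amounts to a small URL rewrite. Here is a minimal Python sketch of it, assuming the account SAS URL has an empty or "/" path; the helper name is hypothetical. The query string is passed through untouched so the signature is not re-encoded.

```python
from urllib.parse import urlsplit, urlunsplit


def with_container(account_sas_url: str, container: str) -> str:
    """Insert a container name into the path of an account SAS URL."""
    parts = urlsplit(account_sas_url)
    if parts.path not in ("", "/"):
        raise ValueError("URL already has a path: " + parts.path)
    # Keep scheme, host and query exactly as given; only the path changes.
    return urlunsplit((parts.scheme, parts.netloc, f"/{container}/",
                       parts.query, parts.fragment))


if __name__ == "__main__":
    # Placeholder URL with fake SAS parameters.
    print(with_container(
        "https://account.blob.core.windows.net/?sv=x&sig=y",
        "blob-container",
    ))
```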

Resources