Azure blob storage - SAS - Data Factory - azure

I was able to test the blob connection and it was successful, but when I attempt to browse the storage path I get the error shown in the screenshot.
Full error:
Failed to load
Blob operation failed for: Blob Storage on container '' and path '/' get failed with 'The remote server returned an error: (403) Forbidden.'. Possible root causes: (1). Grant service principal or managed identity appropriate permissions to do copy. For source, at least the “Storage Blob Data Reader” role. For sink, at least the “Storage Blob Data Contributor” role. For more information, see https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-blob-storage?tabs=data-factory#service-principal-authentication. (2). It's possible because some IP address ranges of Azure Data Factory are not allowed by your Azure Storage firewall settings. Azure Data Factory IP ranges please refer https://docs.microsoft.com/en-us/azure/data-factory/azure-integration-runtime-ip-addresses. If you allow trusted Microsoft services to access this storage account option in firewall, you must use https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-blob-storage?tabs=data-factory#managed-identity. For more information on Azure Storage firewalls settings, see https://docs.microsoft.com/en-us/azure/storage/common/storage-network-security?tabs=azure-portal.. The remote server returned an error: (403) Forbidden.StorageExtendedMessage=Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
Context: I'm trying to copy data from a SQL database to Snowflake, and I am using Azure Data Factory for that. Since the pipeline won't publish without it, I enabled staged copy and connected blob storage.
I already checked the networking settings and they are set to allow all networks. I'm not sure what I'm missing here, because I found a YouTube video where this works, but it didn't show an issue related or similar to this one: https://www.youtube.com/watch?v=5rLbBpu1f6E.
I also tried leaving the storage path empty, but the trigger for the copy data pipeline doesn't succeed either.
Full error from trigger:
Operation on target Copy Contacts failed: Failure happened on 'Sink' side. ErrorCode=FileForbidden,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Error occurred when trying to upload a blob, detailed message: dbo.vw_Contacts.txt,Source=Microsoft.DataTransfer.ClientLibrary,''Type=Microsoft.WindowsAzure.Storage.StorageException,Message=The remote server returned an error: (403) Forbidden.,Source=Microsoft.WindowsAzure.Storage,StorageExtendedMessage=Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.

I created a blob storage account and generated a SAS token for it. I then created a blob storage linked service using the SAS URI, and it was created successfully.
When I tried to retrieve the path, I got the error below.
I changed the networking settings of the storage account by selecting Enabled from all networks.
I then tried to retrieve the path again in Data Factory, and this time it worked; I was able to retrieve the path.
Alternatively, this issue can be resolved by whitelisting the Azure Data Factory IP addresses in the storage account firewall.
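If you want to sanity-check the SAS token and firewall settings outside of Data Factory, a minimal sketch like the one below (assuming the azure-storage-blob Python package and placeholder account, container, and SAS values) lists blobs with the same SAS. A 403 here points at the SAS or the firewall rather than ADF itself; note it tests from your own IP, not from the ADF integration runtime's IP ranges.

# Minimal sketch (placeholder values): verify a SAS token can list blobs.
# Requires: pip install azure-storage-blob
from azure.storage.blob import BlobServiceClient

account_url = "https://<storageaccount>.blob.core.windows.net"  # placeholder
sas_token = "?sv=...&ss=b&srt=sco&sp=rl&se=...&sig=..."          # placeholder SAS

service = BlobServiceClient(account_url=account_url, credential=sas_token)
container = service.get_container_client("<container>")          # placeholder

# A 403 here indicates the SAS permissions or the storage firewall, not ADF.
for blob in container.list_blobs():
    print(blob.name)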

From the error message:
'The remote server returned an error: (403) Forbidden.'
It's likely the authentication method you're using doesn't have enough permissions on the blob storage to list the paths. I would recommend using the managed identity of the Data Factory for this data transfer:
1. Take note of the name of the Data Factory.
2. Assign the Storage Blob Data Contributor role, scoped to the container or the storage account, to the ADF managed identity (from step 1).
3. On your blob linked service inside Data Factory, choose the managed identity authentication method.
Also, if you stage your data transfer on the blob storage, you have to make sure the identity used for staging can write to the blob storage, and that it also has the required bulk permissions on SQL Server.
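As a rough way to confirm that an identity actually has Storage Blob Data Contributor on the staging container, a small sketch along these lines (assuming the azure-identity and azure-storage-blob packages and placeholder names; locally, DefaultAzureCredential resolves to your signed-in identity rather than the factory's managed identity) tries to write a test blob:

# Rough sketch (placeholder names): check that the identity resolved by
# DefaultAzureCredential can write to the staging container, i.e. that it
# effectively has Storage Blob Data Contributor.
# Requires: pip install azure-identity azure-storage-blob
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient(
    account_url="https://<storageaccount>.blob.core.windows.net",  # placeholder
    credential=DefaultAzureCredential(),
)
container = service.get_container_client("<staging-container>")    # placeholder

# Succeeds only with write permission; a 403 means the role assignment is
# missing or has not propagated yet.
container.upload_blob("permission-check.txt", b"ok", overwrite=True)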

Related

ADF Pipeline Errors - RequestContentTooLarge and InvalidContentLink

The ADF Pipeline release to the test Data Factory instance is failing with the following error as shown in the image below.
So, to overcome the above issue, I modified the pipeline by adding an additional Azure Blob File Copy step to store the linked templates in a storage account and referencing them in the pipeline for the deployment. However, after making this change I get another error which states: InvalidContentLink: Unable to download deployment content from 'https://xxx.blob.core.windows.net/adf-arm-templates/ArmTemplate_0.json?***Sanitized Azure Storage Account Shared Access Signature***'. The tracking Id is 'xxxxx-xxxx-x-xxxx-xx'. Please see https://aka.ms/arm-deploy for usage details.
I have tried using the SAS token for both at the Container level and at the Storage Account level. I also have ensured that the agent and the storage account are under same VNets. I have also tried to remove the firewall restrictions but still it gives me the same InvalidContentLink error.
The modified pipeline with the Azure Storage Account step:
How do I resolve this issue?
InvalidContentLink: Unable to download deployment content from 'https://xxx.blob.core.windows.net/adf-arm-templates/ArmTemplate_0.json?Sanitized Azure Storage Account Shared Access Signature'. The tracking Id is 'xxxxx-xxxx-x-xxxx-xx'. Please see https://aka.ms/arm-deploy for usage details.
This error can occur when the linked template you are trying to reference is not present in the storage account.
Make sure you provide the correct URL for the nested template and that it is accessible.
Also, if your storage account has firewall rules enabled, you can't link a nested template from it.
Make sure your Storage Account, Container and Blob are publicly available. To achieve this:
Provide a blob-level Shared Access Signature URL: select the file, click on "...", and then click Generate SAS.
Refer to the linked documentation for more details on nested templates.
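If you prefer to generate the blob-level SAS programmatically instead of through the portal's "..." menu, a sketch like this (assuming the azure-storage-blob package, an account key, and placeholder names; the container and blob names are taken from the error message) produces a read-only, time-limited SAS URL for the ARM template blob:

# Sketch (placeholder values): build a read-only, time-limited SAS URL for a
# single blob, e.g. the nested ARM template.
# Requires: pip install azure-storage-blob
from datetime import datetime, timedelta, timezone
from azure.storage.blob import generate_blob_sas, BlobSasPermissions

account_name = "<storageaccount>"        # placeholder
container_name = "adf-arm-templates"     # container from the error message
blob_name = "ArmTemplate_0.json"
account_key = "<storage-account-key>"    # placeholder

sas = generate_blob_sas(
    account_name=account_name,
    container_name=container_name,
    blob_name=blob_name,
    account_key=account_key,
    permission=BlobSasPermissions(read=True),
    expiry=datetime.now(timezone.utc) + timedelta(hours=1),
)
print(f"https://{account_name}.blob.core.windows.net/{container_name}/{blob_name}?{sas}")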

Event trigger set up on blob storage fails in Azure Data Factory

I am setting up an event trigger on a Blob Storage v2 account in a Data Factory pipeline. When I publish the pipeline, I keep getting the error below. I have only set up the storage recently, but I can't see anything out of place. Do I need to set up an event subscription in blob storage and create the event from the storage itself, since there are options to set up automation there?
The attempt to configure storage notifications for the provided storage account hmtest1 failed. Please ensure that your storage account meets the requirements described at https://aka.ms/storageevents. The error is Failed to retrieve credentials for request=RequestUri=https://management.azure.com/subscriptions
{"code":"InvalidAuthenticationToken","message":"The received access token is not valid: at least one of the claims 'puid' or 'altsecid' or 'oid' should be present. If you are accessing as application please make sure service principal is properly created in the tenant."}}
{"code":"InvalidAuthenticationToken","message":"The received access token is not valid: at least one of the claims 'puid' or 'altsecid' or 'oid' should be present. If you are accessing as application please make sure service principal is properly created in the tenant."}}
AFAIK, in ADF this error occurs when the Data Factory resource provider is not registered for your subscription.
To resolve this, you need to register Data Factory under Resource providers.
Go to Subscriptions -> your subscription -> Resource providers and check whether Microsoft.DataFactory is registered.
If it is showing as NotRegistered, select it and click Register.
After it is successfully registered, create a new Data Factory workspace and check the storage event trigger.
If it still gives the same error, register Microsoft.EventGrid as well and re-check.
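If you'd rather check and register the resource providers from code instead of the portal, a sketch along these lines (assuming the azure-identity and azure-mgmt-resource packages and a placeholder subscription ID) does the equivalent:

# Sketch (placeholder subscription ID): check and register the resource
# providers needed for storage event triggers.
# Requires: pip install azure-identity azure-mgmt-resource
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

client = ResourceManagementClient(DefaultAzureCredential(), "<subscription-id>")

for namespace in ("Microsoft.DataFactory", "Microsoft.EventGrid"):
    provider = client.providers.get(namespace)
    print(namespace, provider.registration_state)
    if provider.registration_state != "Registered":
        client.providers.register(namespace)  # registration can take a few minutes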

Trying to set up a linked service through a registered app to Azure Data Lake Storage and keep getting 24200 error

I am new to Azure. We have Azure Data Lake Storage set up, and I am trying to create a linked service from Data Factory to Azure Data Lake Storage Gen2. It keeps failing when I test the linked service. As far as I can see, I have granted the "Storage blob contributor" role to the user on the data lake storage, yet I still get a permission denied error when I test the linked service.
ADLS Gen2 operation failed for: Storage operation '' on container 'testconnection' get failed with 'Operation returned an invalid status code 'Forbidden''. Possible root causes: (1). It's possible because the service principal or managed identity don't have enough permission to access the data. (2). It's possible because some IP address ranges of Azure Data Factory are not allowed by your Azure Storage firewall settings. Azure Data Factory IP ranges please refer https://learn.microsoft.com/en-us/azure/data-factory/azure-integration-runtime-ip-addresses.. Account: 'dlsisrdatapoc001'. ErrorCode: 'AuthorizationFailure'. Message: 'This request is not authorized to perform this operation.'.
What I could observe is that when I open the network to all (public) on the data lake storage it works, but when I restrict the firewall with CIDR ranges it fails. I couldn't narrow down the cause of the problem. I do have "Allow Azure services on the trusted services list to access this account" checked.
Completely lost
As mentioned in the error description, the error usually occurs if you don't have sufficient permissions to perform the action or if you don't add the required IPs in the firewall settings of your storage account.
To resolve the error, please check if you added the Storage Blob Data Contributor role to your managed identity along with the user like below:
Go to Azure Portal -> Storage Accounts -> Your Storage Account -> Access Control (IAM) -> Add role assignment
Make sure to select the managed identity, based on the authentication method you selected while creating linked service.
As mentioned in this MsDoc, make sure to add all the required IPs based on your resource location and service tag.
Download the service tags JSON file to find the IP ranges for the service tag in your resource location and add them to the firewall settings:
Make sure to select the Resource type as Microsoft.DataFactory/factories when adding the CIDR ranges.
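If it helps, the downloaded service tags JSON can be filtered in a few lines of Python to pull out the Data Factory address prefixes for your region. This is a sketch assuming the standard layout of the ServiceTags_Public_*.json file, with a placeholder file name and region tag:

# Sketch (assumed ServiceTags JSON layout, placeholder file name and region):
# print the DataFactory address prefixes for one region so they can be added
# to the storage firewall.
import json

with open("ServiceTags_Public.json") as f:        # placeholder file name
    tags = json.load(f)

for value in tags["values"]:
    if value["name"] == "DataFactory.WestEurope":  # placeholder region tag
        for prefix in value["properties"]["addressPrefixes"]:
            print(prefix)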
For more details, please refer to the links below:
Error when I am trying to connect between Azure Data factory and Azure Data lake Gen2 by Anushree Garg
Storage Account V2 access with firewall, VNET to data factory V2 by Cindy Pau
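To double-check whether the role assignment or the firewall is the actual blocker, a short sketch like this (assuming the azure-identity and azure-storage-file-datalake packages and placeholder names; the container name is taken from the error message) attempts the same list operation that the linked service test performs:

# Sketch (placeholder names): list paths in an ADLS Gen2 container with the
# identity resolved by DefaultAzureCredential. A 403/AuthorizationFailure here
# points at missing RBAC or the storage firewall.
# Requires: pip install azure-identity azure-storage-file-datalake
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://<storageaccount>.dfs.core.windows.net",  # placeholder
    credential=DefaultAzureCredential(),
)
file_system = service.get_file_system_client("testconnection")    # container from the error

for path in file_system.get_paths():
    print(path.name)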

AuthenticationException when creating Azure ML Dataset from Azure Data Lake Gen2 Datastore

I have an Azure Data Lake Gen2 with public endpoint and a standard Azure ML instance.
I have created both components with my user and I am listed as Contributor.
I want to use data from this data lake in Azure ML.
I have added the data lake as a Datastore using Service Principal authentication.
When I then try to create a Tabular Dataset using the Azure ML GUI, I get the following error:
Access denied
You do not have permission to the specified path or file.
{
"message": "ScriptExecutionException was caused by StreamAccessException.\n StreamAccessException was caused by AuthenticationException.\n 'AdlsGen2-ListFiles (req=1, existingItems=0)' for '[REDACTED]' on storage failed with status code 'Forbidden' (This request is not authorized to perform this operation using this permission.), client request ID '1f9e329b-2c2c-49d6-a627-91828def284e', request ID '5ad0e715-a01f-0040-24cb-b887da000000'. Error message: [REDACTED]\n"
}
I have had our Azure Portal admin, who has admin access to both Azure ML and the Data Lake, try the same, and she gets the same error.
I tried creating the Dataset using the Python SDK and got a similar error:
ExecutionError:
Error Code: ScriptExecution.StreamAccess.Authentication
Failed Step: 667ddfcb-c7b1-47cf-b24a-6e090dab8947
Error Message: ScriptExecutionException was caused by StreamAccessException.
StreamAccessException was caused by AuthenticationException.
'AdlsGen2-ListFiles (req=1, existingItems=0)' for 'https://mydatalake.dfs.core.windows.net/mycontainer?directory=mydirectory/csv&recursive=true&resource=filesystem' on storage failed with status code 'Forbidden' (This request is not authorized to perform this operation using this permission.), client request ID 'a231f3e9-b32b-4173-b631-b9ed043fdfff', request ID 'c6a6f5fe-e01f-0008-3c86-b9b547000000'. Error message: {"error":{"code":"AuthorizationPermissionMismatch","message":"This request is not authorized to perform this operation using this permission.\nRequestId:c6a6f5fe-e01f-0008-3c86-b9b547000000\nTime:2020-11-13T06:34:01.4743177Z"}}
| session_id=75ed3c11-36de-48bf-8f7b-a0cd7dac4d58
I have created Datastores and Datasets for both a normal blob storage and a managed SQL database with no issues, and I only have Contributor access to those, so I cannot understand why I should not be authorized to add the data lake. The fact that our admin gets the same error leads me to believe there is some other issue.
I hope you can help me identify what it is or give me some clue of what more to test.
Edit:
I see I might have duplicated this post: How to connect AMLS to ADLS Gen 2?
I will test that solution and close this post if it works
This was actually a duplicate of How to connect AMLS to ADLS Gen 2?.
The solution is to grant the service principal that Azure ML uses to access the data lake the Storage Blob Data Reader role. Note that you may have to wait a few minutes for the role assignment to take effect.
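For reference, this is roughly how the ADLS Gen2 datastore registration with service principal authentication looks in the azureml-core SDK, with placeholder IDs and names; the service principal identified by client_id is the one that needs the Storage Blob Data Reader role on the storage account:

# Sketch (placeholder IDs/names): register an ADLS Gen2 datastore with service
# principal authentication. The service principal identified by client_id is
# the one that needs the Storage Blob Data Reader role.
# Requires: pip install azureml-core
from azureml.core import Workspace, Datastore

ws = Workspace.from_config()

datastore = Datastore.register_azure_data_lake_gen2(
    workspace=ws,
    datastore_name="mydatalake_ds",              # placeholder
    filesystem="mycontainer",                    # container / file system name
    account_name="mydatalake",                   # storage account name
    tenant_id="<tenant-id>",                     # placeholder
    client_id="<service-principal-app-id>",      # placeholder
    client_secret="<service-principal-secret>",  # placeholder
)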

Azure ML studio export data Azure Storage V2

I already posted my problem here and they suggested I post it here as well.
I am trying to export data from Azure ML to Azure Storage but I have this error:
Error writing to cloud storage: The remote server returned an error: (400) Bad Request.. Please check the url. . ( Error 0151 )
My blob storage configuration is Storage V2 / Standard with Require secure transfer set to enabled.
If I set Require secure transfer to disabled, the export works fine.
How can I export data to my blob storage with Require secure transfer set to enabled?
According to the official tutorial Export to Azure Blob Storage, there are two authentication types for exporting data to Azure Blob Storage: SAS and Account. They are described below.
For Authentication type, choose Public (SAS URL) if you know that the storage supports access via a SAS URL.
A SAS URL is a special type of URL that can be generated by using an Azure storage utility, and is available for only a limited time. It contains all the information that is needed for authentication and download.
For URI, type or paste the full URI that defines the account and the public blob.
For private accounts, choose Account, and provide the account name and the account key, so that the experiment can write to the storage account.
Account name: Type or paste the name of the account where you want to save the data. For example, if the full URL of the storage account is http://myshared.blob.core.windows.net, you would type myshared.
Account key: Paste the storage access key that is associated with the account.
I tried a simple module combination, shown in the figure and Python code below, to reproduce the issue you got.
import pandas as pd

def azureml_main(dataframe1=None, dataframe2=None):
    # Return a small test DataFrame for the Export Data module to write
    dataframe1 = pd.DataFrame(data={'col1': [1, 2], 'col2': [3, 4]})
    return dataframe1,
When I tried to use the Account authentication type with my Blob Storage V2 account, I got the same issue as yours, with error code Error 0151 as below, shown by clicking the View error log button under the View output log link.
Error 0151
There was an error writing to cloud storage. Please check the URL.
This error in Azure Machine Learning occurs when the module tries to write data to cloud storage but the URL is unavailable or invalid.
Resolution
Check the URL and verify that it is writable.
Exception Messages
Error writing to cloud storage (possibly a bad url).
Error writing to cloud storage: {0}. Please check the url.
Based on the error description above, the error seems to be caused by the Export Data module incorrectly generating the blob URL with SAS from the account information. I suspect the module code is old and not compatible with the new V2 storage API or API version. You can report it at feedback.azure.com.
However, when I switched to the SAS authentication type and entered a blob URL with a SAS query string for my container, which I generated via the Azure Storage Explorer tool as below, it worked fine.
Fig 1: Right click on the container of your Blob Storage account, and click the Get Shared Access Signature
Fig 2: Enable the permission Write (recommended to use UTC timezone) and click Create button
Fig 3: Copy the Query string value, and build a blob url with a container SAS query string like https://<account name>.blob.core.windows.net/<container name>/<blob name><query string>
Note: The blob must not already exist in the container, otherwise an Error 0057 will be raised.
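For completeness, the same write can be reproduced outside ML Studio with a few lines of Python using the container SAS from Fig 3 (placeholder account, container, blob name, and SAS query string; the azure-storage-blob package is assumed):

# Sketch (placeholder values): upload a blob via a URL built from the container
# SAS query string, matching the URL format described in Fig 3.
# Requires: pip install azure-storage-blob
from azure.storage.blob import BlobClient

account_name = "<account name>"        # placeholder
container_name = "<container name>"    # placeholder
blob_name = "exported-data.csv"        # must not already exist (cf. Error 0057)
sas_query = "?sv=...&sp=w...&sig=..."  # container SAS with Write permission

blob_url = f"https://{account_name}.blob.core.windows.net/{container_name}/{blob_name}{sas_query}"
BlobClient.from_blob_url(blob_url).upload_blob(b"col1,col2\n1,3\n2,4\n")  # overwrite=False by default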
