Azure ML studio export data Azure Storage V2 - azure

I already posted my problem here and they suggested that I post it here.
I am trying to export data from Azure ML to Azure Storage, but I get this error:
Error writing to cloud storage: The remote server returned an error: (400) Bad Request.. Please check the url. . ( Error 0151 )
My blob storage configuration is Storage V2 / Standard with Require secure transfer set to enabled.
If I set Require secure transfer to disabled, the export works fine.
How can I export data to my blob storage with Require secure transfer set to enabled?

According to the official tutorial Export to Azure Blob Storage, there are two authentication types for exporting data to Azure Blob Storage: SAS and Account. Their descriptions are as below.
For Authentication type, choose Public (SAS URL) if you know that the storage supports access via a SAS URL.
A SAS URL is a special type of URL that can be generated by using an Azure storage utility, and is available for only a limited time. It contains all the information that is needed for authentication and download.
For URI, type or paste the full URI that defines the account and the public blob.
For private accounts, choose Account, and provide the account name and the account key, so that the experiment can write to the storage account.
Account name: Type or paste the name of the account where you want to save the data. For example, if the full URL of the storage account is http://myshared.blob.core.windows.net, you would type myshared.
Account key: Paste the storage access key that is associated with the account.
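(Incidentally, you can sanity-check that the account name/key pair itself is valid outside of Studio with the azure-storage-blob SDK; this is just a sketch with placeholder values, not part of the Export Data module.)

from azure.storage.blob import BlobServiceClient

# Placeholder values -- substitute your own account name and access key.
account_name = "myshared"
account_key = "<storage-account-key>"

service = BlobServiceClient(
    account_url=f"https://{account_name}.blob.core.windows.net",
    credential=account_key,
)

# If the name/key pair is valid, this lists the containers in the account.
print([c.name for c in service.list_containers()])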
I tried a simple module combination, as shown in the figure and Python code below, to reproduce the issue you got.
import pandas as pd

def azureml_main(dataframe1 = None, dataframe2 = None):
    dataframe1 = pd.DataFrame(data={'col1': [1, 2], 'col2': [3, 4]})
    return dataframe1,
When I tried the Account authentication type with my Blob Storage V2 account, I got the same issue as yours, with error code Error 0151 as below, via clicking the View error log button under the View output log link.
Error 0151
There was an error writing to cloud storage. Please check the URL.
This error in Azure Machine Learning occurs when the module tries to write data to cloud storage but the URL is unavailable or invalid.
Resolution
Check the URL and verify that it is writable.
Exception Messages
Error writing to cloud storage (possibly a bad url).
Error writing to cloud storage: {0}. Please check the url.
Based on the error description above, the error seems to be caused by the Export Data module generating an incorrect blob URL with SAS from the account information. I suspect the module code is old and not compatible with the new Storage V2 API or API version. You can report it to feedback.azure.com.
However, when I switched to the SAS authentication type and entered a blob URL with a SAS query string for my container, which I generated via the Azure Storage Explorer tool as below, it worked fine.
Fig 1: Right click on the container of your Blob Storage account, and click the Get Shared Access Signature
Fig 2: Enable the permission Write (recommended to use UTC timezone) and click Create button
Fig 3: Copy the Query string value, and build a blob url with a container SAS query string like https://<account name>.blob.core.windows.net/<container name>/<blob name><query string>
Note: The blob must not already exist in the container; otherwise, Error 0057 will be raised.
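For reference, here is a minimal sketch of how such a blob URL with a container SAS query string can be assembled and tested for writability with the azure-storage-blob SDK (the account, container, blob name, and query string below are placeholders):

from azure.storage.blob import BlobClient

# Placeholder values -- use your own account, container, blob name,
# and the query string copied from Azure Storage Explorer (starts with "?sv=...").
account_name = "myaccount"
container_name = "mycontainer"
blob_name = "exported-data.csv"
sas_query_string = "?sv=...&sig=..."

sas_url = (
    f"https://{account_name}.blob.core.windows.net/"
    f"{container_name}/{blob_name}{sas_query_string}"
)

# The SAS must include Write permission; the blob must not already exist.
blob = BlobClient.from_blob_url(sas_url)
blob.upload_blob(b"col1,col2\n1,3\n2,4\n")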

Related

Azure blob storage - SAS - Data Factory

I was able to test the blob connection successfully, but when I attempt to look for the storage path it shows this error (screenshot).
Full error:
Failed to load
Blob operation failed for: Blob Storage on container '' and path '/' get failed with 'The remote server returned an error: (403) Forbidden.'. Possible root causes: (1). Grant service principal or managed identity appropriate permissions to do copy. For source, at least the “Storage Blob Data Reader” role. For sink, at least the “Storage Blob Data Contributor” role. For more information, see https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-blob-storage?tabs=data-factory#service-principal-authentication. (2). It's possible because some IP address ranges of Azure Data Factory are not allowed by your Azure Storage firewall settings. Azure Data Factory IP ranges please refer https://docs.microsoft.com/en-us/azure/data-factory/azure-integration-runtime-ip-addresses. If you allow trusted Microsoft services to access this storage account option in firewall, you must use https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-blob-storage?tabs=data-factory#managed-identity. For more information on Azure Storage firewalls settings, see https://docs.microsoft.com/en-us/azure/storage/common/storage-network-security?tabs=azure-portal.. The remote server returned an error: (403) Forbidden.StorageExtendedMessage=Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
Context: I'm trying to copy data from a SQL DB to Snowflake, and I am using Azure Data Factory for that. Since this doesn't publish directly, I enabled staged copy and connected blob storage.
I already checked the networking settings and access is enabled from all networks. I'm not sure what I'm missing here, because I found a YouTube video that has it working, but they didn't show an issue related/similar to this one: https://www.youtube.com/watch?v=5rLbBpu1f6E.
I also tried leaving the storage path empty, but the trigger for the copy data pipeline isn't successful either.
Full error from trigger:
Operation on target Copy Contacts failed: Failure happened on 'Sink' side. ErrorCode=FileForbidden,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Error occurred when trying to upload a blob, detailed message: dbo.vw_Contacts.txt,Source=Microsoft.DataTransfer.ClientLibrary,''Type=Microsoft.WindowsAzure.Storage.StorageException,Message=The remote server returned an error: (403) Forbidden.,Source=Microsoft.WindowsAzure.Storage,StorageExtendedMessage=Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
I created a blob storage account and generated a SAS token for it. I created a blob storage linked service using the SAS URI, and it was created successfully.
Image for reference:
When I tried to retrieve the path, I got the below error.
I changed the networking settings of the storage account by enabling access from all networks.
Image for reference:
I tried to retrieve the path again in Data Factory, and it worked successfully; I was able to retrieve the path.
Image for reference:
Another way to resolve this issue is by whitelisting the Azure Data Factory IP addresses.
From the error message:
'The remote server returned an error: (403) Forbidden.'
It's likely the authentication method you're using doesn't have enough permissions on the blob storage to list the paths. I would recommend using the Managed Identity of the Data Factory to do this data transfer.
Take the name of the Data Factory.
Assign the Storage Blob Data Contributor role, scoped to the container or the storage account, to the ADF Managed Identity (the name from step 1).
On your blob linked service inside Data Factory, choose the managed identity authentication method.
Also, if you stage your data transfer on the blob storage, you have to make sure the identity can write to the blob storage and also has bulk permissions on SQL Server.
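If you want to confirm that a given identity actually has the required role before wiring it into the linked service, a quick local check with azure-identity and azure-storage-blob can help. This is only a sketch with a hypothetical account URL and container name, and it tests whatever credential DefaultAzureCredential resolves to (for example your Azure CLI login or a service principal), not the factory's managed identity itself:

from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

# Hypothetical account URL and container -- replace with your own.
service = BlobServiceClient(
    account_url="https://mystorageaccount.blob.core.windows.net",
    credential=DefaultAzureCredential(),
)

# Listing blobs requires at least Storage Blob Data Reader on the account/container.
container = service.get_container_client("staging")
print([b.name for b in container.list_blobs()])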

MainThread: Vaex: Error while Opening Azure Data Lake Parquet file

I tried to open a Parquet file on Azure Data Lake Gen 2 storage using a generated SAS URL (with the datetime limit and token embedded in the URL) with vaex by doing:
vaex.open(sas_url)
and I got the error
ERROR:MainThread:vaex:error opening 'the path which was also the sas_url(can't post it for security reasons)'
ValueError: Do not know how to open (can't publicize the sas url) , no handler for https is known
How do I get vaex to read the file or is there another azure storage that works better with vaex?
I finally found a solution! Vaex can read files in Azure blob storage with this:
import vaex
import adlfs

# Fill in your own storage account details.
storage_account = "..."
account_key = "..."
container = "..."
object_path = "..."

# Build an fsspec-compatible filesystem for Azure Blob / ADLS Gen2,
# then let vaex open the file through it via the abfs:// scheme.
fs = adlfs.AzureBlobFileSystem(account_name=storage_account, account_key=account_key)
df = vaex.open(f"abfs://{container}/{object_path}", fs=fs)
For more details, see the issue where I found the solution: https://github.com/vaexio/vaex/issues/1272
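Since the original question used a SAS URL rather than an account key, it may also help that adlfs accepts a sas_token argument. A sketch with placeholder values, assuming the token grants read access to the blob:

import vaex
import adlfs

# Placeholders -- use your own account, container, path, and SAS token
# (the query-string part of the SAS URL, without the leading "?").
storage_account = "..."
container = "..."
object_path = "..."
sas_token = "sv=...&sig=..."

fs = adlfs.AzureBlobFileSystem(account_name=storage_account, sas_token=sas_token)
df = vaex.open(f"abfs://{container}/{object_path}", fs=fs)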
Vaex is not capable of reading the data from an https source; that's the reason you are getting the error "no handler for https is known".
Also, as per the documentation, vaex supports data input from Amazon S3 buckets and Google Cloud Storage.
Cloud support:
Amazon Web Services S3
Google Cloud Storage
Other cloud storage options
They mention that other cloud storage options are also supported, but there is no supporting documentation anywhere with an example of fetching data from an Azure storage account, let alone using a SAS URL.
Also, please see the API documentation for the vaex library for more info.

Azcopy throws error while executing via Terraform

I am using the AzCopy tool to copy one storage account to another. When I execute the command in a terminal it works perfectly, but when executing the same command using Terraform's local-exec provisioner it throws an error. Please find the code and error below.
Code:
resource "null_resource" "backup" {
provisioner "local-exec" {
command= <<EOF
azcopy cp "https://${var.src_storage_acc_name}.blob.core.windows.net${var.src_sas}" "https://${var.dest_storage_acc_name}.blob.core.windows.net${var.dest_sas}"
EOF
}
}
Error:
Error running command ' azcopy cp "https://strsrc.blob.core.windows.net?[SAS]" "https://strdest.blob.core.windows.net?[SAS]"
': exit status 1. Output: INFO: The parameters you supplied were Source: '"https://strsrc.blob.core.windows.net?[SAS]-REDACTED- of type Local, and Destination: '"https://strdest.blob.core.windows.net?[SAS]-REDACTED- of type Local
INFO: Based on the parameters supplied, a valid source-destination combination could not automatically be found. Please check the parameters you supplied. If they are correct, please specify an exact source and destination type using the --from-to switch. Valid values are two-word phases of the form BlobLocal, LocalBlob etc. Use the word 'Blob' for Blob Storage, 'Local' for the local file system, 'File' for Azure Files, and 'BlobFS' for ADLS Gen2. If you need a combination that is not supported yet, please log an issue on the AzCopy GitHub issues list.
failed to parse user input due to error: the inferred source/destination combination could not be identified, or is currently not supported
Please provide your thoughts on this.
Today I needed to implement a similar task, and I used the azcopy cp command with the --recursive=true option, which is given in the documentation.
It successfully copied all contents of the source container to the destination.
Copy all blob containers, directories, and blobs from one storage account to another by using a SAS token:
azcopy cp "https://[srcaccount].blob.core.windows.net?[SAS]" "https://[destaccount].blob.core.windows.net?[SAS]" --recursive=true
azcopy only supports certain combinations of source and destination types (Blob, Gen1, Gen2, S3, local file system, ...) for the copy sub-command.
azcopy tries to guess the source/destination types based on the URL and parameters.
This error means that either:
(1) you're trying to use a combination that isn't supported. There is nothing you can do here; raise an issue as suggested (they'll probably just ignore it, like this or this). Or:
(2) there is something wrong with your URL. E.g. you have blob.core.windows.net when you should've had dfs.core.windows.net or vice versa. This in turn causes mis-identification of the source and destination types.
If you're sure that the combination is supported, then you can tell azcopy the types using --from-to. Ironically, when you use a combination that isn't supported (e.g. BlobFSBlobFS), it gives the same error message instead of saying "source destination combination not supported".
When dealing with Gen2, you could use blob instead of dfs in the URL to make azcopy use the older (blob/Gen1) APIs to interact with your Gen2 account. Though less performant, it might still work.
'Blob' for Blob Storage
'Local' for the local file system
'File' for Azure Files
'BlobFS' for ADLS Gen2
As of now, the following combinations are supported per the documentation:
local <-> Azure Blob (SAS or OAuth authentication)
local <-> Azure Files (Share/directory SAS authentication)
local <-> Azure Data Lake Storage Gen 2 (SAS, OAuth, or shared key authentication)
Azure Blob (SAS or public) -> Azure Blob (SAS or OAuth authentication)
Azure Blob (SAS or public) -> Azure Files (SAS)
Azure Files (SAS) -> Azure Files (SAS)
Azure Files (SAS) -> Azure Blob (SAS or OAuth authentication)
Amazon Web Services (AWS) S3 (Access Key) -> Azure Block Blob (SAS or OAuth authentication)
Google Cloud Storage (Service Account Key) -> Azure Block Blob (SAS or OAuth authentication) [Preview]

Storage destination needs to have a Service SAS, not an Account SAS. What Does This Mean?

Hello, recently I have been trying to use this Azure Graph request, noted here:
https://learn.microsoft.com/en-us/graph/api/user-exportpersonaldata?view=graph-rest-1.0&tabs=http
Now, when you make that request, as stated in the documentation you provide a storage location, which is "a shared access signature (SAS) URL to an Azure Storage account, to where data should be exported."
Every time I provide my SAS URL, I get this error: "Storage destination needs to have a Service SAS, not an Account SAS"
Can someone please help me understand what this means? The documentation it links to is not clear.
Storage destination needs to have a Service SAS, not an Account SAS
Difference between Account SAS and Service SAS is described here: https://learn.microsoft.com/en-us/rest/api/storageservices/delegate-access-with-shared-access-signature#types-of-shared-access-signatures.
You're providing a SAS URL for the entire account (e.g. https://account.blob.core.windows.net/?sas-parameters), whereas it is expected that you provide a SAS URL for a specific blob container (e.g. https://account.blob.core.windows.net/blob-container/?sas-parameters).
There are two possible solutions:
Create a SAS URL for a specific blob container. Or in other words create a Service SAS as the error message is telling you to do. You can do so using a tool like Microsoft Storage Explorer.
Insert the blob container name in your account SAS URL so that it looks something like this: https://account.blob.core.windows.net/blob-container/?sas-parameters.
Please note that if you're using an Account SAS, it should at least have Write permission on Object for Blob service.
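If you prefer to generate the container-scoped (Service) SAS in code instead of Storage Explorer, the azure-storage-blob SDK can do it. This is only a sketch with placeholder account/container names and key, assuming Write permission and a 24-hour expiry are appropriate for the export:

from datetime import datetime, timedelta, timezone
from azure.storage.blob import ContainerSasPermissions, generate_container_sas

# Placeholder values -- use your own account, container, and account key.
account_name = "myaccount"
container_name = "graph-export"
account_key = "<storage-account-key>"

# A Service SAS scoped to one container, valid for 24 hours, with Write permission.
sas = generate_container_sas(
    account_name=account_name,
    container_name=container_name,
    account_key=account_key,
    permission=ContainerSasPermissions(write=True),
    expiry=datetime.now(timezone.utc) + timedelta(hours=24),
)

sas_url = f"https://{account_name}.blob.core.windows.net/{container_name}?{sas}"
print(sas_url)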

Copying blob to another storage account using REST API gives 404 error

In the Logic App, I'm using the Blob Service REST API to copy a blob between different accounts.
I have SAS signatures on both the source and destination URLs. Not sure what I am doing wrong.
Update
The destination URL (with SAS) is obtained from a Dynamics 365 endpoint. It comes back with an sv value of 2014-02-14. Could this be the problem (the sv is too old, as suggested in the comments)?
I managed to copy the blob in a different way, by reading the contents of the source blob and creating the blob at the destination URL with that content (Put Blob).
Some information for your reference.
I generated the SAS token in the portal and copied a blob from storage account A to B. I tested it in the Logic App, and it works fine.
Generate SAS:
Request URL:
Put https://storageB.blob.core.windows.net/containername/testcopy1?sv=2017-11-09&ss=bfqt&srt=sco&sp=rwdlacup&se=2018-08-27T10:43:40Z&st=2018-08-27T02:43:40Z&spr=https&sig=xxxxxxx
Request Headers:
x-ms-copy-source:https://storageA.blob.core.windows.net/containername/2.5.txt?sv=2017-11-09&ss=bfqt&srt=sco&sp=rwdlacup&se=2018-08-27T10:59:19Z&st=2018-08-27T02:59:19Z&spr=https&sig=xxxxxx
In the LogicApp:
Check in the portal:
Update:
I think this (the old sv version) is obviously the problem.
Refer to: version mentioned in the article
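For completeness, the same server-side copy can be reproduced outside the Logic App with the azure-storage-blob SDK, which issues the Copy Blob operation (the x-ms-copy-source header shown above) under the hood. The URLs below are placeholders modeled on the request above:

from azure.storage.blob import BlobClient

# Placeholder SAS URLs -- substitute your own source and destination URLs.
source_sas_url = "https://storageA.blob.core.windows.net/containername/2.5.txt?<sas>"
dest_sas_url = "https://storageB.blob.core.windows.net/containername/testcopy1?<sas>"

# Server-side copy: the destination service pulls the blob from the source URL.
dest_blob = BlobClient.from_blob_url(dest_sas_url)
result = dest_blob.start_copy_from_url(source_sas_url)
print(result["copy_status"])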
