Azure ADF using Azure Batch throws Shared Access Signature generation error - azure

I am working on a simple Azure Data Factory pipeline where I have simply added a Batch Service activity and specified the Batch Service account in it (which I created through a linked service, and the connection test succeeds). The command just runs a simple "ls", and when I do a debug run I get this error: "Cannot create Shared Access Signature unless Account Key credentials are used." I have the following linked services: "Azure Batch", "Azure Blob Storage" and Key Vault (where we store the access key). All linked service connections are working properly.
Any help on how to fix this error: "Cannot create Shared Access Signature unless Account Key credentials are used."
Azure Batch Linked service:
Azure Storage Linked service:
Azure Data factory pipeline:

The issue happens because you use "Managed Identity" to connect ADF to the Storage account. The connection test on the linked service reports "successful", but when this storage is used for a Batch activity, ADF has to create a Shared Access Signature, so the storage linked service needs to use the "Account Key" authentication type.
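A quick way to spot this before a debug run is to inspect the exported JSON of the storage linked service. The sketch below is only an illustration: it assumes the JSON uses the property names `connectionString` (account key), `sasUri` (SAS) and `serviceEndpoint` (managed identity or service principal) under `properties.typeProperties`, and the function names and sample definition are hypothetical.

```python
def blob_linked_service_auth(linked_service: dict) -> str:
    """Guess the authentication type of an Azure Blob Storage linked service.

    Assumes the exported JSON layout properties -> typeProperties; the
    property names checked here are assumptions, not taken from the question.
    """
    props = linked_service.get("properties", {}).get("typeProperties", {})
    if "sasUri" in props:
        return "sas"
    if "connectionString" in props:
        # The key may be inline or referenced from Key Vault; both count as account key.
        return "account_key"
    if "serviceEndpoint" in props:
        return "identity"
    return "unknown"


def usable_for_batch(linked_service: dict) -> bool:
    """Per the answer above, Batch needs account key authentication."""
    return blob_linked_service_auth(linked_service) == "account_key"


# Hypothetical linked service that uses managed identity.
sample = {
    "name": "AzureBlobStorage1",
    "properties": {
        "type": "AzureBlobStorage",
        "typeProperties": {
            "serviceEndpoint": "https://exampleaccount.blob.core.windows.net/"
        },
    },
}
print(blob_linked_service_auth(sample), usable_for_batch(sample))
```

If the check reports anything other than account key, switching the linked service's authentication type is the fix described above.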

Related

Azure blob storage - SAS - Data Factory

The blob storage test connection succeeds, but when I try to browse for the storage path it shows this error.
Full error:
Failed to load
Blob operation failed for: Blob Storage on container '' and path '/' get failed with 'The remote server returned an error: (403) Forbidden.'. Possible root causes: (1). Grant service principal or managed identity appropriate permissions to do copy. For source, at least the “Storage Blob Data Reader” role. For sink, at least the “Storage Blob Data Contributor” role. For more information, see https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-blob-storage?tabs=data-factory#service-principal-authentication. (2). It's possible because some IP address ranges of Azure Data Factory are not allowed by your Azure Storage firewall settings. Azure Data Factory IP ranges please refer https://docs.microsoft.com/en-us/azure/data-factory/azure-integration-runtime-ip-addresses. If you allow trusted Microsoft services to access this storage account option in firewall, you must use https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-blob-storage?tabs=data-factory#managed-identity. For more information on Azure Storage firewalls settings, see https://docs.microsoft.com/en-us/azure/storage/common/storage-network-security?tabs=azure-portal.. The remote server returned an error: (403) Forbidden.StorageExtendedMessage=Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
Context: I'm trying to copy data from SQL db to Snowflake and I am using Azure Data Factory for that. Since this doesn't publish, I enable the staged copy and connect blob storage.
I already checked the networking settings and they allow all networks. I'm not sure what I'm missing here, because I found a YouTube video that has it working but doesn't show an issue similar to this one: https://www.youtube.com/watch?v=5rLbBpu1f6E.
I also tried leaving the storage path empty, but the triggered Copy data pipeline run doesn't succeed either.
Full error from trigger:
Operation on target Copy Contacts failed: Failure happened on 'Sink' side. ErrorCode=FileForbidden,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Error occurred when trying to upload a blob, detailed message: dbo.vw_Contacts.txt,Source=Microsoft.DataTransfer.ClientLibrary,''Type=Microsoft.WindowsAzure.Storage.StorageException,Message=The remote server returned an error: (403) Forbidden.,Source=Microsoft.WindowsAzure.Storage,StorageExtendedMessage=Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
I created a Blob storage account and generated a SAS token for it. I then created a blob storage linked service using the SAS URI, and it was created successfully.
When I tried to retrieve the path, I got the error.
I changed the storage account's networking settings to "Enabled from all networks".
I then tried to retrieve the path again in Data Factory, and it worked.
Another way to resolve this issue is to whitelist the Data Factory IP addresses in the storage firewall.
From the error message:
'The remote server returned an error: (403) Forbidden.'
It's likely the authentication method you're using doesn't have enough permissions on the blob storage to list the paths. I would recommend using the Managed Identity of the Data Factory to do this data transfer:
1. Take the name of the Data Factory.
2. Assign the "Storage Blob Data Contributor" role, scoped to the container or the storage account, to the ADF Managed Identity (the name from step 1).
3. On your blob linked service inside Data Factory, choose the managed identity authentication method.
Also, if you stage your data transfer on the blob storage, make sure the identity can write to the blob storage and also has bulk permissions on SQL Server.
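The role assignment can also be scripted with the Azure CLI instead of the portal. The sketch below only builds the `az role assignment create` command line rather than running it; the subscription, resource group, account, container and principal ID values are placeholders, and the helper name is hypothetical.

```python
import shlex
from typing import List, Optional


def role_assignment_command(
    principal_id: str,
    subscription_id: str,
    resource_group: str,
    account: str,
    container: Optional[str] = None,
    role: str = "Storage Blob Data Contributor",
) -> List[str]:
    """Build an `az role assignment create` command for the ADF managed identity."""
    scope = (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
    )
    if container:
        # Narrow the scope to a single container instead of the whole account.
        scope += f"/blobServices/default/containers/{container}"
    return [
        "az", "role", "assignment", "create",
        "--assignee", principal_id,
        "--role", role,
        "--scope", scope,
    ]


# Placeholder values, replace with your own before running the command.
cmd = role_assignment_command(
    principal_id="00000000-0000-0000-0000-000000000000",
    subscription_id="<subscription-id>",
    resource_group="<resource-group>",
    account="<storage-account>",
    container="<container>",
)
print(shlex.join(cmd))
```

Scoping to the container keeps the permission as narrow as possible; omit `container` to grant the role on the whole storage account.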

Event trigger set up on blob storage fails in Azure Data Factory

I am setting up an event trigger on a Blob Storage v2 account in a Data Factory pipeline. When I publish the pipeline I keep getting the error below. I only set up the storage recently, but I can't see anything out of place. Do I need to set up an event subscription in blob storage and create the event from the storage account itself, since there are options to set up automation there?
The attempt to configure storage notifications for the provided storage account hmtest1 failed. Please ensure that your storage account meets the requirements described at https://aka.ms/storageevents. The error is Failed to retrieve credentials for request=RequestUri=https://management.azure.com/subscriptions
{"code":"InvalidAuthenticationToken","message":"The received access token is not valid: at least one of the claims 'puid' or 'altsecid' or 'oid' should be present. If you are accessing as application please make sure service principal is properly created in the tenant."}}
AFAIK, in ADF this error occurs when Data Factory is not registered under the subscription's Resource providers.
To resolve this, register Data Factory in the Resource providers.
Go to Subscriptions -> your subscription -> Resource providers and check whether Data Factory is registered.
If it shows as NotRegistered, select it and click Register.
After it is successfully registered, create a new data factory workspace and check the storage event trigger.
If it still gives the same error, register EventGrid as well and re-check.
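The same check can be done from the output of `az provider list`, which reports a `namespace` and a `registrationState` for each provider. The sketch below is a minimal illustration using hard-coded sample data in that shape; the helper name and the sample states are hypothetical.

```python
REQUIRED = ("Microsoft.DataFactory", "Microsoft.EventGrid")


def unregistered(providers, required=REQUIRED):
    """Return the required provider namespaces that are not in the Registered state."""
    state = {
        p["namespace"].lower(): p.get("registrationState", "")
        for p in providers
    }
    return [
        ns for ns in required
        if state.get(ns.lower(), "NotRegistered") != "Registered"
    ]


# Sample data in the shape of `az provider list` output (values are made up).
providers = [
    {"namespace": "Microsoft.DataFactory", "registrationState": "Registered"},
    {"namespace": "Microsoft.EventGrid", "registrationState": "NotRegistered"},
    {"namespace": "Microsoft.Storage", "registrationState": "Registered"},
]

for ns in unregistered(providers):
    # Print the CLI command that would register each missing provider.
    print(f"az provider register --namespace {ns}")
```

In practice the `providers` list would come from parsing the JSON printed by `az provider list`.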

Empty error while executing SSIS package in Azure Data Factory

I have created a simple SSIS project and in this project, I have a package that will delete a particular file in Downloads folder.
I deployed this project to Azure. And when I am trying to execute this package using Azure Data Factory then the pipeline fails with an empty error (I am attaching the screenshot here).
What I have done to fix this error is:
I have added self-hosted IR to Azure-SSIS IR as the proxy to access the data on-premise.
Set the ConnectByProxy as True.
Converted the project to Project Deployment Model.
Please help me out to fix this error and if you need more details then just leave a comment.
Windows Authentication :
To access data stores such as SQL servers/file shares on-premises or Azure Files, check the Windows authentication check box.
If this check box is selected, fill in the Domain, Username, and Password fields with the values for your package execution credentials. For example, to access Azure Files, the domain is Azure, the username is <storage account name>, and the password is <storage account key>.
Using the secrets stored in your Azure Key Vault
As an alternative, you can use secrets stored in your Azure Key Vault as the values. To do so, select the AZURE KEY VAULT check box next to them. Select or edit an existing key vault linked service, or create a new one. Then choose the secret name and version for your value. When you create or edit your key vault linked service, you can select or edit an existing key vault or create a new one. If you haven't already done so, grant the Data Factory managed identity access to your key vault. You can also enter your secret directly in the format <key vault linked service name>/<secret name>/<secret version>.
Note: If you are using Windows Authentication, there are four methods to access data stores with Windows authentication from SSIS packages running on your Azure-SSIS IR; see "Access data stores and file shares with Windows authentication from SSIS packages in Azure" (Docs).
Make sure your setup falls under one of these methods; otherwise it could fail at run time.

Unable to deploy the index and grammar file in KES

I'm using the Knowledge Exploration Service (KES) by Azure. I've prepared a grammar file and an index file. Since they are small, I was able to run the service on my local machine and on an Azure VM.
But now I want to deploy this service. The issue is that when I run the command kes deploy_service, it is unable to download the blob from Azure Storage, and it also fails when I provide the file from my local machine.
I followed the same steps on an Azure VM and received the same errors.
>kes deploy_service Some.grammar Some.index kes-example
00:00:00 Index: Some.index
00:00:00 ERROR: Invalid value for index parameter: 'Some.index' is not a blob URI.
>kes deploy_service Some.grammar https://storagename.blob.core.windows.net/containername/Some.index kes-example
00:00:00 Index: https://storagename.blob.core.windows.net/containername/Bell.index
00:00:02 ERROR: ResourceNotFound: The storage account 'storagename' was not found.
The container has public access. I can download the file via the browser and even via Azure CLI.
What am I missing here?
EDIT: Adding a sample index file which I've uploaded on Azure Storage with public access. This index file was generated using the Academic example in the documentation.
>kes describe_index https://kesstorage.blob.core.windows.net/kess/Academic.index
ERROR: ResourceNotFound: The storage account 'kesstorage' was not found.
kes.exe is using the old Service Management API. It is querying the API for Storage Accounts in your subscription, but this API predates Azure Resource Manager (ARM), and therefore has no knowledge of ARM Storage Accounts. You will need to use a Classic Storage Account instead.
For a tutorial on creating a Classic storage account, see: https://learn.microsoft.com/en-us/azure/storage/common/storage-create-storage-account#create-a-storage-account

Azure blob storage networking rules (Ip) for Azure data warehouse

I need to load external data (in blob storage) to my Azure data warehouse using Polybase. I had it working fine when I was using Classic Azure Storage.
Recently, I had to move our Storage to ARM, and I could not figure out how to set up the firewall rule on the ARM Storage account to allow my Azure data warehouse. If I set the firewall to "All networks", everything works seamlessly. However, I cannot leave the blob storage wide open.
I tried using nslookup to find the outbound IP of our Azure data warehouse and added that value to the Storage firewall; I got a "This request is not authorized to perform this operation." error.
Is there a way I can find the IP address of an Azure data warehouse? Or should I use a different approach to make it work?
Any Suggestions are appreciated.
Kevin
Under the section 1.1 Create a Credential, it states:
Don't skip this step if you are using this tutorial as a template for loading your own data. To access data through a credential, use the following script to create a database-scoped credential, and then use it when defining the location of the data source.
-- A: Create a master key.
-- Only necessary if one does not already exist.
-- Required to encrypt the credential secret in the next step.
CREATE MASTER KEY;
-- B: Create a database scoped credential
-- IDENTITY: Provide any string, it is not used for authentication to Azure storage.
-- SECRET: Provide your Azure storage account key.
CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredential
WITH
IDENTITY = 'user',
SECRET = '<azure_storage_account_key>'
;
-- C: Create an external data source
-- TYPE: HADOOP - PolyBase uses Hadoop APIs to access data in Azure blob storage.
-- LOCATION: Provide Azure storage account name and blob container name.
-- CREDENTIAL: Provide the credential created in the previous step.
CREATE EXTERNAL DATA SOURCE AzureStorage
WITH (
TYPE = HADOOP,
LOCATION = 'wasbs://<blob_container_name>@<azure_storage_account_name>.blob.core.windows.net',
CREDENTIAL = AzureStorageCredential
);
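The LOCATION string is easy to get wrong, since the container and the account are joined with `@` in the form `wasbs://<container>@<account>.blob.core.windows.net`. The sketch below builds that string and checks the names against the Azure naming rules (lowercase letters and digits for accounts; lowercase letters, digits and single hyphens for containers). The function name and the sample names are hypothetical.

```python
import re

# Storage account: 3-24 lowercase letters or digits.
ACCOUNT_RE = re.compile(r"^[a-z0-9]{3,24}$")
# Container: 3-63 chars, lowercase letters/digits, single hyphens between them.
CONTAINER_RE = re.compile(r"^(?=.{3,63}$)[a-z0-9]+(-[a-z0-9]+)*$")


def wasbs_location(container: str, account: str) -> str:
    """Build the LOCATION value for a PolyBase external data source."""
    if not ACCOUNT_RE.match(account):
        raise ValueError(f"invalid storage account name: {account!r}")
    if not CONTAINER_RE.match(container):
        raise ValueError(f"invalid container name: {container!r}")
    return f"wasbs://{container}@{account}.blob.core.windows.net"


# Hypothetical names for illustration.
print(wasbs_location("staging-data", "examplestorage01"))
```

Validating before running CREATE EXTERNAL DATA SOURCE avoids a failure that only shows up later, when the external table is first queried.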
Edit: (additional way to access Blobs from ADW through the use of SAS):
You also can create a Storage linked service by using a shared access signature. It provides the data factory with restricted/time-bound access to all/specific resources (blob/container) in the storage.
A shared access signature provides delegated access to resources in your storage account. You can use a shared access signature to grant a client limited permissions to objects in your storage account for a specified time. You don't have to share your account access keys. The shared access signature is a URI that encompasses in its query parameters all the information necessary for authenticated access to a storage resource. To access storage resources with the shared access signature, the client only needs to pass in the shared access signature to the appropriate constructor or method. For more information about shared access signatures, see Shared access signatures: Understand the shared access signature model.
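Because the shared access signature carries everything in its query string, you can inspect one locally before wiring it into a linked service. The sketch below reads the common `sp` (permissions), `se` (expiry) and `sig` (signature) parameters; the URI, account name and signature value are made up, and the helper names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def describe_sas(sas_uri: str) -> dict:
    """Extract the main fields of a SAS URI without contacting Azure."""
    parts = urlsplit(sas_uri)
    query = {k: v[0] for k, v in parse_qs(parts.query).items()}
    return {
        "account": parts.netloc.split(".")[0],
        "path": parts.path,
        "permissions": query.get("sp"),
        "expiry": query.get("se"),
        "has_signature": "sig" in query,
    }


def is_expired(sas_uri: str, now: Optional[datetime] = None) -> bool:
    """Compare the `se` expiry time with `now` (defaults to the current UTC time)."""
    expiry = describe_sas(sas_uri)["expiry"]
    if expiry is None:
        raise ValueError("SAS URI has no 'se' (expiry) parameter")
    # fromisoformat() on older Python versions does not accept a trailing 'Z'.
    parsed = datetime.fromisoformat(expiry.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return (now or datetime.now(timezone.utc)) >= parsed


# Made-up SAS URI for illustration only.
uri = (
    "https://exampleaccount.blob.core.windows.net/container"
    "?sv=2022-11-02&sr=c&sp=rl&se=2030-01-01T00%3A00%3A00Z&sig=FAKE"
)
print(describe_sas(uri))
```

An expired or read-only token is a common cause of 403 errors, so checking `se` and `sp` first can save a round of debugging.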
Full document can be found here

Resources