Azure Data Factory to Azure Blob Storage Permissions

I'm connecting ADF to blob storage v2 using a managed identity following this doc: Doc1
When testing the connection with my first dataset, the test against the linked service succeeds. But when I test by file path and enter "testfolder" (which exists in the blob container), it fails with the generic forbidden error shown at the end of this post.
However, when I opt to "browse" the folders in the dataset portal, the folder "testfolder" does show up. But when I select it, nothing within that folder is shown.
The Data Factory managed identity has been given the Contributor role, granting full access to manage all resources. Is there some other hidden issue, or a way to narrow the problem down? My instinct is that this is something within the blob container, since I can view the containers but not their contents.
Error message:

It seems that you haven't given the managed identity a data role on Azure Blob Storage. Contributor is a management-plane role; reading blob contents with a managed identity also requires a data-plane role such as Storage Blob Data Reader or Storage Blob Data Contributor.
Please follow these steps:
1. In the storage account, click Access control (IAM), navigate to Role assignments, and add a role assignment.
2. Choose a role according to your need (for example, Storage Blob Data Reader) and select your data factory.
3. A few minutes later, retry choosing the file path.
Hope this helps. A CLI equivalent is sketched below.
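A minimal Azure CLI sketch of the same role assignment, assuming the data factory uses its system-assigned managed identity; the resource group, factory, and storage account names are placeholders:
# Object (principal) ID of the data factory's system-assigned managed identity.
ADF_PRINCIPAL_ID=$(az resource show \
  --resource-group <rg> \
  --name <adf-name> \
  --resource-type "Microsoft.DataFactory/factories" \
  --query identity.principalId -o tsv)
# Resource ID of the storage account, used as the scope of the assignment.
STORAGE_ID=$(az storage account show \
  --resource-group <rg> \
  --name <storage-account> \
  --query id -o tsv)
# Grant a data-plane role; Contributor alone does not allow reading blob data.
az role assignment create \
  --assignee-object-id "$ADF_PRINCIPAL_ID" \
  --assignee-principal-type ServicePrincipal \
  --role "Storage Blob Data Contributor" \
  --scope "$STORAGE_ID"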

Related

ADF Pipeline Errors - RequestContentTooLarge and InvalidContentLink

The ADF Pipeline release to the test Data Factory instance is failing with the following error as shown in the image below.
So, to overcome the above issue, I modified the pipeline by adding an additional step of Azure Blob File Copy to store the linked templates in a storage account and reference it in the pipeline to use it for the deployment. However when I made the above change I am getting another error which states InvalidContentLink: Unable to download deployment content from 'https://xxx.blob.core.windows.net/adf-arm-templates/ArmTemplate_0.json?***Sanitized Azure Storage Account Shared Access Signature***'. The tracking Id is 'xxxxx-xxxx-x-xxxx-xx'. Please see https://aka.ms/arm-deploy for usage details.
I have tried using the SAS token for both at the Container level and at the Storage Account level. I also have ensured that the agent and the storage account are under same VNets. I have also tried to remove the firewall restrictions but still it gives me the same InvalidContentLink error.
The modified pipeline with the Azure Storage Account step:
How do I resolve this issue?
InvalidContentLink: Unable to download deployment content from 'https://xxx.blob.core.windows.net/adf-arm-templates/ArmTemplate_0.json?Sanitized Azure Storage Account Shared Access Signature'. The tracking Id is 'xxxxx-xxxx-x-xxxx-xx'. Please see https://aka.ms/arm-deploy for usage details.
This error can occur because you are trying to link a template that might not be present in the storage account.
Make sure you provide the correct URL for the nested template and that it is accessible.
Also, if your storage account has a firewall rule, you can't link a nested template from it.
Make sure your storage account, container, and blob are publicly available. To achieve this:
Provide a blob-level Shared Access Signature URL: select the file, click "...", and then click Generate SAS (a CLI sketch follows below).
Refer to the linked documentation for more understanding of nested templates.
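If you prefer the CLI over the portal, a blob-level SAS URL can be generated roughly like this; the storage account name and expiry are placeholders, while the container and blob names mirror the ones in the error message above:
# Read-only, blob-level SAS URL for the linked ARM template.
az storage blob generate-sas \
  --account-name <storage-account> \
  --container-name adf-arm-templates \
  --name ArmTemplate_0.json \
  --permissions r \
  --expiry <yyyy-mm-ddTHH:MM:SSZ> \
  --https-only \
  --full-uri \
  --output tsv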

Event trigger set up on blob storage fails in Azure Data Factory

I am setting up an event trigger on a Blob Storage v2 account in a Data Factory pipeline. When I publish the pipeline I keep getting the error below. I have only set up the storage recently, but I can't see anything out of place. Do I need to set up an event subscription in the blob storage and create the event from the storage itself, since there are options to set up automation in there?
The attempt to configure storage notifications for the provided storage account hmtest1 failed. Please ensure that your storage account meets the requirements described at https://aka.ms/storageevents. The error is Failed to retrieve credentials for request=RequestUri=https://management.azure.com/subscriptions
{"code":"InvalidAuthenticationToken","message":"The received access token is not valid: at least one of the claims 'puid' or 'altsecid' or 'oid' should be present. If you are accessing as application please make sure service principal is properly created in the tenant."}}
{"code":"InvalidAuthenticationToken","message":"The received access token is not valid: at least one of the claims 'puid' or 'altsecid' or 'oid' should be present. If you are accessing as application please make sure service principal is properly created in the tenant."}}
AFAIK, in ADF this error occurs when the Data Factory resource provider is not registered on the subscription.
To resolve this, we need to register Data Factory in the resource providers.
Go to Subscriptions -> your subscription -> Resource providers and check whether Data Factory (Microsoft.DataFactory) is registered or not.
If it is showing as NotRegistered, select it and click Register.
After it has registered successfully, create a new data factory workspace and check the storage event trigger.
If it still gives the same error, register EventGrid as well and re-check (CLI equivalents are sketched below).
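The same check and registration can be done with the Azure CLI; a small sketch, assuming the right subscription is already selected:
# Check the registration state of the two providers involved.
az provider show --namespace Microsoft.DataFactory --query registrationState -o tsv
az provider show --namespace Microsoft.EventGrid --query registrationState -o tsv
# Register whichever shows NotRegistered
# (storage event triggers rely on Event Grid under the hood).
az provider register --namespace Microsoft.DataFactory
az provider register --namespace Microsoft.EventGrid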

Azure Data Explorer oneclick Ingest from blob container (UI)

I'm trying to configure and use the Azure Data Explorer OneClick Ingest from blob container (continuous ingest).
Whatever I try the URL is never accepted, I always end up with this error:
Invalid URL. Either the URL leads to a blob instead of a container, or the permissions are incorrect. If you just grant permission, please wait couple of minutes and try again.
The URL I'm using follows this pattern:
https://mystorageaccount.blob.core.windows.net/mycontainer?sp=rl&st=2022-04-26T22:01:42Z&se=2032-04-27T06:01:42Z&spr=https&sv=2020-08-04&sr=c&sig=Z4Mlh7s5%2Fm1890kdfzlkYLSIHHDdGJmTSyYXVYsHdn01o%3D
I'm probably missing something, either in the URL syntax or the SAS generation.
Has anyone successfully used it? Any idea what could be wrong?
Thanks
I finally found out what the issue was.
Probably due to the security in place on my storage account, I had to create a managed private endpoint in the Azure Data Explorer Networking panel, pointing to my storage resource (and then approve that endpoint in the storage account's Networking settings).
https://learn.microsoft.com/en-us/azure/data-explorer/security-network-managed-private-endpoint-create
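For completeness, a container-level SAS with the read and list permissions the one-click wizard expects (the sp=rl, sr=c pattern in the URL above) could be generated like this; the account and container names are taken from the question and the expiry is a placeholder:
# Container-scoped SAS token (sr=c) with read+list permissions (sp=rl).
# Append the output to https://mystorageaccount.blob.core.windows.net/mycontainer?
az storage container generate-sas \
  --account-name mystorageaccount \
  --name mycontainer \
  --permissions rl \
  --expiry <yyyy-mm-ddTHH:MM:SSZ> \
  --https-only \
  --output tsv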

Azure Synapse serverless SQL pool - query execution fails

After completing tutorial 1, I am working on this tutorial 2 from Microsoft Azure team to run the following query (shown in step 3). But the query execution gives the error shown below:
Question: What may be the cause of the error, and how can we resolve it?
Query:
SELECT TOP 100 *
FROM OPENROWSET(
    BULK 'https://contosolake.dfs.core.windows.net/users/NYCTripSmall.parquet',
    FORMAT = 'PARQUET'
) AS [result]
Error:
Warning: No datasets were found that match the expression 'https://contosolake.dfs.core.windows.net/users/NYCTripSmall.parquet'. Schema cannot be determined since no files were found matching the name pattern(s) 'https://contosolake.dfs.core.windows.net/users/NYCTripSmall.parquet'. Please use WITH clause in the OPENROWSET function to define the schema.
NOTE: The path of the file in the container is correct; in fact, I generated the query above just by right-clicking the file inside the container and generating the script, as shown below:
Remarks:
Azure Data Lake Storage Gen2 account name: contosolake
Container name: users
Firewall settings used on the Azure Data lake account:
Azure Data Lake Storage Gen2 account is allowing public access (ref):
Container has required access level (ref)
UPDATE:
The owner of the subscription is someone else, and I did not get the option to check the "Assign myself the Storage Blob Data Contributor role on the Data Lake Storage Gen2 account" box described in item 3 of the Basics tab > Workspace details section of tutorial 1. I also do not have permissions to add roles, although I am the owner of the Synapse workspace. So I am using the workaround described in Configure anonymous public read access for containers and blobs from the Azure team.
--Workaround
If you are unable to grant Storage Blob Data Contributor, use ACLs to grant permissions.
All users that need access to some data in this container also need to have the EXECUTE permission on all parent folders up to the root (the container). Learn more about how to set ACLs in Azure Data Lake Storage Gen2.
Note: Execute permission at the container level needs to be set within Azure Data Lake Gen2. Permissions on the folder can be set within Azure Synapse.
Go to the container holding NYCTripSmall.parquet (a CLI sketch for setting the ACLs follows below).
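A hedged sketch of granting those ACLs with the Azure CLI, assuming the account and container names from the question (contosolake, users) and a placeholder Azure AD object ID for the user or workspace identity that runs the query:
# Merge a read+execute ACL entry for the principal on the container root and
# everything beneath it (EXECUTE is needed on every folder up to the root,
# READ on the file itself). update-recursive merges the given entry instead of
# replacing the existing ACL.
az storage fs access update-recursive \
  --account-name contosolake \
  --file-system users \
  --path "/" \
  --acl "user:<principal-object-id>:r-x" \
  --auth-mode login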
--Update
As per your update in the comments, it seems you would have to do the following.
Contact the owner of the storage account and ask them to perform these tasks (a CLI sketch follows):
Assign the workspace MSI the Storage Blob Data Contributor role on the storage account
Assign you the Storage Blob Data Contributor role on the storage account
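A sketch of those two assignments with the Azure CLI, assuming placeholder resource group, workspace, and user names:
# Resource ID of the storage account, used as the assignment scope.
STORAGE_ID=$(az storage account show -g <rg> -n contosolake --query id -o tsv)
# 1. Assign the Synapse workspace managed identity (MSI) the data role.
WS_MI=$(az synapse workspace show -g <rg> -n <workspace-name> --query identity.principalId -o tsv)
az role assignment create \
  --assignee-object-id "$WS_MI" \
  --assignee-principal-type ServicePrincipal \
  --role "Storage Blob Data Contributor" \
  --scope "$STORAGE_ID"
# 2. Assign the user running the query the same role.
az role assignment create \
  --assignee "<user>@<tenant-domain>" \
  --role "Storage Blob Data Contributor" \
  --scope "$STORAGE_ID"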
--
I was able to get the query results following the tutorial doc you have mentioned for the same dataset.
Since you confirm that the file is present and the path is correct, refresh the linked ADLS source and publish the query before running, in case it is a transient issue.
Two things I suspect are:
Try setting Microsoft network routing in the Network routing settings of the ADLS account.
Check if the built-in pool is online and you have at least Contributor roles on both the Synapse workspace and the storage account (if the credentials currently used to run the query did not create the resources).

shell.azure.com is failing in configuration

While doing something, I got the option to execute shell commands from the Azure portal. It required configuring shell.azure.com the first time.
In the first step it gives the option of selecting a subscription and creating storage. When I select the required subscription and click on Create storage, it gives this error:
Error: 409
{"error":{"code":"StorageAccountAlreadyTaken", "message":"The storage account named ... is already taken"}}
Can't create a storage account. Please try again.
I tried multiple times but to no avail.
I opened Show advanced settings and tried to play with combinations, but there "use existing" storage account is disabled (in advanced settings) and "create storage" is also disabled.
PS: I have rights to create a storage account on the subscription, so that is not an issue.
I also faced the same issue before. You need to directly edit (manually type the name of) the existing storage account in the box; just ignore the "use existing" checkbox. It seems like a UI bug.
When you add the existing storage account in the UI, please make sure the Cloud Shell region matches the storage account region. You can see the supported storage regions at https://learn.microsoft.com/en-us/azure/cloud-shell/persisting-shell-storage.
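If the box keeps rejecting the name, the storage account and file share can also be prepared up front with the Azure CLI and then typed into Advanced settings; a sketch, where the names and resource group are placeholders and the region must be one of the supported Cloud Shell storage regions from the link above:
# Storage account in a Cloud Shell-supported region (name must be globally unique).
az storage account create \
  --resource-group <rg> \
  --name <uniquestorageaccountname> \
  --location <supported-region> \
  --sku Standard_LRS
# File share that Cloud Shell will use to persist your files (mounted as clouddrive).
az storage share create \
  --account-name <uniquestorageaccountname> \
  --name cloudshellshare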
Refer to these similar threads:
Unable to open Cloud Shell because of Storage Account error
Azure Cloud shell requires storage account
