Azure Functions is not authorized to list blobs from a folder in a container in Data Lake - azure

I would like to list the blobs in a folder in a container in Azure Data Lake from an Azure Function.
For authentication I would like to use the system-assigned managed identity of the Function App. I have enabled it on my Function App and, on the Data Lake side, given it the Storage Blob Data Contributor role.
Here is my code:
string dfsUri = "https://<myDataLake>.dfs.core.windows.net";
DataLakeClientOptions options = new DataLakeClientOptions(DataLakeClientOptions.ServiceVersion.V2019_07_07);
DataLakeServiceClient dataLakeServiceClient = new DataLakeServiceClient(new Uri(dfsUri), new Azure.Identity.DefaultAzureCredential(), options);
DataLakeFileSystemClient dataLakeFileSystemClient = dataLakeServiceClient.GetFileSystemClient("my-file-system");
IAsyncEnumerator<PathItem> enumerator = dataLakeFileSystemClient.GetPathsAsync("testfolder").GetAsyncEnumerator();

await enumerator.MoveNextAsync();
PathItem item = enumerator.Current;

while (item != null)
{
    log.LogInformation($"File Name {item.Name}.");

    if (!await enumerator.MoveNextAsync())
    {
        break;
    }

    item = enumerator.Current;
}
If I run this code I get the following error message:
This request is not authorized to perform this operation using this permission.
RequestId:cd00a570-401f-0024-4d21-35badb000000
Time:2021-04-19T13:39:30.8429070Z
Status: 403 (This request is not authorized to perform this operation using this permission.)
ErrorCode: AuthorizationPermissionMismatch
Headers:
Server: Windows-Azure-HDFS/1.0,Microsoft-HTTPAPI/2.0
x-ms-error-code: AuthorizationPermissionMismatch
x-ms-request-id: cd00a570-401f-0024-4d21-35badb000000
x-ms-version: 2019-07-07
x-ms-client-request-id: b0510f6a-5798-476c-a95e-6f206bf2a9cc
Date: Mon, 19 Apr 2021 13:39:29 GMT
Content-Length: 227
Content-Type: application/json; charset=utf-8
But the following code, which lists at the container level, works fine for me:
string dfsUri = "https://<myDataLake>.dfs.core.windows.net";
DataLakeClientOptions options = new DataLakeClientOptions(DataLakeClientOptions.ServiceVersion.V2019_07_07);
DataLakeServiceClient dataLakeServiceClient = new DataLakeServiceClient(new Uri(dfsUri), new Azure.Identity.DefaultAzureCredential(), options);
DataLakeFileSystemClient dataLakeFileSystemClient = dataLakeServiceClient.GetFileSystemClient("my-file-system");
IAsyncEnumerator<PathItem> enumerator = dataLakeFileSystemClient.GetPathsAsync("").GetAsyncEnumerator();

await enumerator.MoveNextAsync();
PathItem item = enumerator.Current;

while (item != null)
{
    log.LogInformation($"File Name {item.Name}.");

    if (!await enumerator.MoveNextAsync())
    {
        break;
    }

    item = enumerator.Current;
}
Can someone tell me what I need to do to list the blobs in a folder within a container?

PART 1
I have done something similar with a user-assigned identity, but it should be the same for both. It sounds like your configuration should be correct, but you can do the following to confirm. In the Function App's Identity section, click on "Azure role assignments":
Do you see anything listed? If not, then your current configuration isn't recognized. You can try the "Add role assignment" feature, which is a pretty straightforward wizard.
PART 2
Again, not exactly your scenario, but it may help. I recently wrote some Azure Functions that connect to Synapse via Managed Identity. I used a ManagedIdentityCredential, not a DefaultAzureCredential:
// Build the credentials
var clientId = SettingsHelper.Get(Settings.ManagedIdentityClientId);
var credential = new ChainedTokenCredential(new ManagedIdentityCredential(clientId),
    new AzureCliCredential());
As you can see, I stored the Managed Identity Client ID value in the Function App Settings. The ChainedTokenCredential is there to support local development via the AzureCliCredential. I used the exact same code to connect to Key Vault.
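For your scenario, here is a minimal sketch of how such a credential could be plugged into your folder-listing code. It assumes the Azure.Identity and Azure.Storage.Files.DataLake packages; the MANAGED_IDENTITY_CLIENT_ID setting name is just a placeholder from my setup, and with a system-assigned identity you could use new ManagedIdentityCredential() with no client ID at all:
// Sketch only - the setting name below is a placeholder; substitute your own configuration key.
var clientId = Environment.GetEnvironmentVariable("MANAGED_IDENTITY_CLIENT_ID");
var credential = new ChainedTokenCredential(new ManagedIdentityCredential(clientId),
    new AzureCliCredential());

var serviceClient = new DataLakeServiceClient(new Uri("https://<myDataLake>.dfs.core.windows.net"), credential);
var fileSystemClient = serviceClient.GetFileSystemClient("my-file-system");

// AsyncPageable<PathItem> supports await foreach, which avoids the manual enumerator handling.
await foreach (PathItem item in fileSystemClient.GetPathsAsync("testfolder"))
{
    log.LogInformation($"File Name {item.Name}.");
}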

Update:
I tested this: the documentation's description is indeed misleading, and the Storage Blob Data Contributor role does grant the required permissions:
var uri = new Uri("https://datalakename.dfs.core.windows.net/");
var tokenCredential = new ManagedIdentityCredential();
DataLakeServiceClient dataLakeServiceClient = new DataLakeServiceClient(uri, tokenCredential);
var fileSystemClient = dataLakeServiceClient.GetFileSystemClient("test");
Pageable<PathItem> items = fileSystemClient.GetPaths();

foreach (var item in items)
{
    Console.WriteLine(item.Name);
}
After adding the Storage Blob Data Contributor RBAC role for the Function App's identity on the Data Lake, I still got the error at first.
But a few minutes later I could fetch the blobs with no problem. I suspect you just need to wait a few minutes for the role assignment to propagate and it will be OK.
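If you want the function to tolerate that propagation delay instead of failing, a rough sketch of my own (not part of the answer above) is to retry the first list call on a 403; adjust the delay and attempt count to taste:
// Sketch only - assumes Azure.Storage.Files.DataLake and an ILogger from the function context.
async Task<List<string>> ListPathsWithRetryAsync(DataLakeFileSystemClient fileSystem, string folder, ILogger log)
{
    for (int attempt = 1; ; attempt++)
    {
        try
        {
            var names = new List<string>();
            await foreach (PathItem item in fileSystem.GetPathsAsync(folder))
            {
                names.Add(item.Name);
            }
            return names;
        }
        catch (RequestFailedException ex) when (ex.Status == 403 && attempt < 5)
        {
            log.LogWarning($"Got 403 on attempt {attempt}; waiting for the role assignment to propagate...");
            await Task.Delay(TimeSpan.FromSeconds(30));
        }
    }
}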

Related

Azure blob read SAS token throws AuthorizationPermissionMismatch exception

I'm trying to generate a SAS token for a blob, so that any user with the token can read the blob. Below is the code I have. I get an exception when I try to read the blob. If I grant "Storage Blob Data Reader" access to the user, then it works. My understanding is that a user with a SAS token should be able to read the blob without being granted specific permission. What am I missing here?
BlobServiceClient blobServiceClient = new BlobServiceClient(new Uri("https://accountname.blob.core.windows.net/"), new DefaultAzureCredential());
UserDelegationKey key = await blobServiceClient.GetUserDelegationKeyAsync(DateTimeOffset.UtcNow,
    DateTimeOffset.UtcNow.AddDays(1));

BlobSasBuilder sasBuilder = new BlobSasBuilder()
{
    BlobContainerName = "containerName",
    BlobName = "file.json",
    Resource = "b",
    StartsOn = DateTimeOffset.UtcNow,
    ExpiresOn = DateTimeOffset.UtcNow.AddHours(1)
};
sasBuilder.SetPermissions(BlobSasPermissions.Read);
string sasToken = sasBuilder.ToSasQueryParameters(key, "accountname").ToString();

UriBuilder fullUri = new UriBuilder()
{
    Scheme = "https",
    Host = string.Format("{0}.blob.core.windows.net", "accountname"),
    Path = string.Format("{0}/{1}", "containerName", "file.json"),
    Query = sasToken
};

var blobClient = new Azure.Storage.Blobs.BlobClient(fullUri.Uri);

using (var stream = await blobClient.OpenReadAsync()) // throws exception
{ }
Exception : Service request failed.
Status: 403 (This request is not authorized to perform this operation using this permission.)
ErrorCode: AuthorizationPermissionMismatch
I believe you are getting this error because the user for whom you are getting the user delegation key does not have permission to access the data in the storage account.
Assigning the Owner role enables the user to manage the storage account itself; it does not give them permission to manage the data.
Please try by assigning the user one of the data roles described here: https://learn.microsoft.com/en-us/azure/storage/blobs/authorize-access-azure-active-directory#azure-built-in-roles-for-blobs.
To learn more about RBAC roles to manage data, please see this link: https://learn.microsoft.com/en-us/azure/storage/blobs/assign-azure-role-data-access?tabs=portal.
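As a quick sanity check (a sketch of my own, not part of the answer above): once the signing identity has one of those data roles, you can confirm data-plane access before generating the delegation SAS. If the call below still throws AuthorizationPermissionMismatch, the role assignment has not taken effect yet:
// Sketch only - uses the same credential that will sign the user delegation key.
var credential = new DefaultAzureCredential();
var serviceClient = new BlobServiceClient(new Uri("https://accountname.blob.core.windows.net/"), credential);
var containerClient = serviceClient.GetBlobContainerClient("containerName");

bool blobIsVisible = (await containerClient.GetBlobClient("file.json").ExistsAsync()).Value;
Console.WriteLine($"Data-plane access OK, blob exists: {blobIsVisible}");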

Azure Function - Managed IDs to write to storage table - failing with 403 AuthorizationPermissionMismatch

I have an Azure Functions application (HTTP trigger) that writes to a storage queue and table. Both fail when I try to switch to a managed identity. This post / question is about just the storage table part.
Here's the code that does the actual writing to the table:
GetStorageAccountConnectionData();

try
{
    WorkspaceProvisioningRecord provisioningRecord = new PBIWorkspaceProvisioningRecord();
    provisioningRecord.status = requestType;
    provisioningRecord.requestId = requestId;
    provisioningRecord.workspace = request;

#if DEBUG
    Console.WriteLine(Environment.GetEnvironmentVariable("AZURE_TENANT_ID"));
    Console.WriteLine(Environment.GetEnvironmentVariable("AZURE_CLIENT_ID"));

    DefaultAzureCredentialOptions options = new DefaultAzureCredentialOptions()
    {
        Diagnostics =
        {
            LoggedHeaderNames = { "x-ms-request-id" },
            LoggedQueryParameters = { "api-version" },
            IsLoggingContentEnabled = true
        },
        ExcludeVisualStudioCodeCredential = true,
        ExcludeAzureCliCredential = true,
        ExcludeManagedIdentityCredential = true,
        ExcludeAzurePowerShellCredential = true,
        ExcludeSharedTokenCacheCredential = true,
        ExcludeInteractiveBrowserCredential = true,
        ExcludeVisualStudioCredential = true
    };
#endif

    DefaultAzureCredential credential = new DefaultAzureCredential();
    Console.WriteLine(connection.storageTableUri);
    Console.WriteLine(credential);

    var serviceClient = new TableServiceClient(new Uri(connection.storageTableUri), credential);
    var tableClient = serviceClient.GetTableClient(connection.tableName);
    await tableClient.CreateIfNotExistsAsync();

    var entity = new TableEntity();
    entity.PartitionKey = provisioningRecord.status;
    entity.RowKey = provisioningRecord.requestId;
    entity["requestId"] = provisioningRecord.requestId.ToString();
    entity["status"] = provisioningRecord.status.ToString();
    entity["workspace"] = JsonConvert.SerializeObject(provisioningRecord.workspace);

    //this is where I get the 403
    await tableClient.UpsertEntityAsync(entity);

    //other stuff...
}
catch (AuthenticationFailedException e)
{
    Console.WriteLine($"Authentication Failed. {e.Message}");
    WorkspaceResponse response = new PBIWorkspaceResponse();
    response.requestId = null;
    response.status = "failure";
    return response;
}
catch (Exception ex)
{
    Console.WriteLine($"whoops! Failed to create storage record:{ex.Message}");
    WorkspaceResponse response = new WorkspaceResponse();
    response.requestId = null;
    response.status = "failure";
    return response;
}
I have the client id/ client secret for this security principal defined in my local.settings.json as AZURE_TENANT_ID/AZURE_CLIENT_ID/AZURE_CLIENT_SECRET.
The code dies trying to do the upsert, and it never hits the AuthenticationFailedException - just the general exception.
The security principal defined in the AZURE_* variables was used to create this entire application, including the storage account.
To manage data inside a storage account (like creating a table, etc.), you will need to assign a different set of permissions. The Owner role is a control-plane role that enables you to manage storage accounts themselves, not the data inside them.
From this link:
Only roles explicitly defined for data access permit a security
principal to access blob data. Built-in roles such as Owner,
Contributor, and Storage Account Contributor permit a security
principal to manage a storage account, but do not provide access to
the blob data within that account via Azure AD.
Even though the text above is about blobs, the same thing applies to tables as well.
Please assign Storage Table Data Contributor to your Managed Identity and then you should not get this error.
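If it still fails after the role assignment, one small diagnostic I would suggest (a sketch of my own, not part of the answer) is to request a storage token directly; this separates credential problems from RBAC or propagation problems:
// Sketch only - asks Azure.Identity for a token scoped to Azure Storage.
// If this throws, the problem is the credential; if it succeeds but the upsert
// still returns 403, the problem is the role assignment (or propagation delay).
var credential = new DefaultAzureCredential();
AccessToken token = await credential.GetTokenAsync(
    new TokenRequestContext(new[] { "https://storage.azure.com/.default" }),
    CancellationToken.None);
Console.WriteLine($"Token acquired, expires {token.ExpiresOn}");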

Access Azure Data Factory V2 programmatically: The Resource Microsoft.DataFactory/dataFactories/ under resource group was not found

I'm trying to access Azure Data Factory V2 programmatically.
First, I created an App Registration in the Azure portal along with a client secret. Then I gave this App Registration Contributor permission on the entire subscription, and also on the resource group where my data factory lives.
Using these credentials I'm able to log in and create a DataFactoryManagementClient:
private void CreateAdfClient()
{
    var authenticationContext = new AuthenticationContext($"https://login.windows.net/{tenantId}");
    var credential = new ClientCredential(clientId: appRegistrationClientId, clientSecret: appRegistrationClientkey);
    var result = authenticationContext.AcquireTokenAsync(resource: "https://management.core.windows.net/", clientCredential: credential).ConfigureAwait(false).GetAwaiter().GetResult();

    if (result == null)
    {
        throw new InvalidOperationException("Failed to obtain the JWT token");
    }

    var token = result.AccessToken;
    var tokenCloudCredentials = new TokenCloudCredentials(subscriptionId, token);
    datafactoryClient = new DataFactoryManagementClient(tokenCloudCredentials);
}
However, when I try to get my pipeline with
var pipeline = datafactoryClient.Pipelines.Get(resourceGroup, dataFactory, pipelineName);
it throws an error:
System.Private.CoreLib: Exception while executing function: StartRawMeasuresSync.
Microsoft.Azure.Management.DataFactories: ResourceNotFound: The Resource 'Microsoft.DataFactory/dataFactories/MyPipeline' under resource group 'MyResGroup' was not found.
I have verified that the resource group, the data factory name, and the pipeline name are correct, but it keeps throwing this error.
I had the same issue, and it was due to referencing the Nuget package for Azure Data Factory v1 instead of v2.
Version 1: Microsoft.Azure.Management.DataFactories
Version 2: Microsoft.Azure.Management.DataFactory
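With the v2 package the client setup changes slightly. Here is a rough sketch assuming the Microsoft.Azure.Management.DataFactory and Microsoft.Rest.ClientRuntime packages (variable names reused from the question):
// Sketch only - v2 takes TokenCredentials plus a SubscriptionId property,
// and Pipelines.Get takes (resourceGroup, factoryName, pipelineName).
var authenticationContext = new AuthenticationContext($"https://login.windows.net/{tenantId}");
var credential = new ClientCredential(appRegistrationClientId, appRegistrationClientkey);
var result = await authenticationContext.AcquireTokenAsync("https://management.azure.com/", credential);

var client = new DataFactoryManagementClient(new TokenCredentials(result.AccessToken))
{
    SubscriptionId = subscriptionId
};

PipelineResource pipeline = client.Pipelines.Get(resourceGroup, dataFactory, pipelineName);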

Create Shared Access Token with Microsoft.WindowsAzure.Storage returns 403

I have a fairly simple method that uses the new Storage API to create a SAS and copy a blob from one container to another.
I am trying to use this to copy a blob BETWEEN STORAGE ACCOUNTS. So I have two storage accounts with the exact same containers, and I am trying to copy a blob from one storage account's container to the other storage account's container.
I don't know if the SDK is built for that, but it seems like it would be a common scenario.
Some additional information:
I create the token on the Destination Container.
Does that token need to be created on both the source and destination? Does it take time to register the token? Do I need to create it for each request, or only once per token "lifetime"?
I should mention that a 403 is a Forbidden HTTP status code.
private static string CreateSharedAccessToken(CloudBlobClient blobClient, string containerName)
{
    var container = blobClient.GetContainerReference(containerName);
    var blobPermissions = new BlobContainerPermissions();

    // The shared access policy provides read/write access to the container for 10 hours:
    blobPermissions.SharedAccessPolicies.Add("SolutionPolicy", new SharedAccessBlobPolicy()
    {
        // To ensure SAS is valid immediately we don't set start time
        // so we can avoid failures caused by small clock differences:
        SharedAccessExpiryTime = DateTime.UtcNow.AddHours(1),
        Permissions = SharedAccessBlobPermissions.Write |
                      SharedAccessBlobPermissions.Read
    });
    blobPermissions.PublicAccess = BlobContainerPublicAccessType.Blob;
    container.SetPermissions(blobPermissions);

    return container.GetSharedAccessSignature(new SharedAccessBlobPolicy(), "SolutionPolicy");
}
Down the line I use this token to call a copy operation, which returns a 403:
var uri = new Uri(srcBlob.Uri.AbsoluteUri + blobToken);
destBlob.StartCopyFromBlob(uri);
My version of Azure.Storage is 2.1.0.2.
Here is the full copy method in case that helps:
private static void CopyBlobs(
    CloudBlobContainer srcContainer, string blobToken,
    CloudBlobContainer destContainer)
{
    var srcBlobList
        = srcContainer.ListBlobs(string.Empty, true, BlobListingDetails.All); // set to none in prod (4perf)

    //// get the SAS token to use for all blobs
    //string token = srcContainer.GetSharedAccessSignature(
    //    new SharedAccessBlobPolicy(), "SolutionPolicy");

    bool pendingCopy = true;

    foreach (var src in srcBlobList)
    {
        var srcBlob = src as ICloudBlob;

        // Determine BlobType:
        ICloudBlob destBlob;
        if (srcBlob.Properties.BlobType == BlobType.BlockBlob)
        {
            destBlob = destContainer.GetBlockBlobReference(srcBlob.Name);
        }
        else
        {
            destBlob = destContainer.GetPageBlobReference(srcBlob.Name);
        }

        // Determine Copy State:
        if (destBlob.CopyState != null)
        {
            switch (destBlob.CopyState.Status)
            {
                case CopyStatus.Failed:
                    log.Info(destBlob.CopyState);
                    break;
                case CopyStatus.Aborted:
                    log.Info(destBlob.CopyState);
                    pendingCopy = true;
                    destBlob.StartCopyFromBlob(destBlob.CopyState.Source);
                    return;
                case CopyStatus.Pending:
                    log.Info(destBlob.CopyState);
                    pendingCopy = true;
                    break;
            }
        }

        // copy using only Policy ID:
        var uri = new Uri(srcBlob.Uri.AbsoluteUri + blobToken);
        destBlob.StartCopyFromBlob(uri);

        //// copy using src blob as SAS
        //var source = new Uri(srcBlob.Uri.AbsoluteUri + token);
        //destBlob.StartCopyFromBlob(source);
    }
}
And finally the account and client (vetted) code:
var credentials = new StorageCredentials("BAR", "FOO");
var account = new CloudStorageAccount(credentials, true);
var blobClient = account.CreateCloudBlobClient();
var sasToken = CreateSharedAccessToken(blobClient, "content");
When I use a REST client this seems to work... any ideas?
Consider also this problem:
var uri = new Uri(srcBlob.Uri.AbsoluteUri + blobToken);
Probably you are calling the "ToString" method of Uri, which produces a "human readable" version of the URL. If the blobToken contains special characters, for example "+", this will cause a malformed-token error on the storage server, which will refuse to give you access.
Use this instead:
String uri = srcBlob.Uri.AbsoluteUri + blobToken;
Shared Access Tokens are not required for this task. I ended up with two accounts and it works fine:
var accountSrc = new CloudStorageAccount(credsSrc, true);
var accountDest = new CloudStorageAccount(credsSrc, true);
var blobClientSrc = accountSrc.CreateCloudBlobClient();
var blobClientDest = accountDest.CreateCloudBlobClient();
// Set permissions on the container.
var permissions = new BlobContainerPermissions {PublicAccess = BlobContainerPublicAccessType.Blob};
srcContainer.SetPermissions(permissions);
destContainer.SetPermissions(permissions);
//grab the blob
var sourceBlob = srcContainer.GetBlockBlobReference("FOO");
var destinationBlob = destContainer.GetBlockBlobReference("BAR");
//create a new blob
destinationBlob.StartCopyFromBlob(sourceBlob);
Since both CloudStorageAccount objects point to the same account, copying without a SAS token would work just fine as you also mentioned.
On the other hand, you need either a publicly accessible blob or a SAS token to copy from another account. So what you tried was correct, but you established a container-level access policy, which can take up to 30 seconds to take effect as also documented in MSDN. During this interval, a SAS token that is associated with the stored access policy will fail with status code 403 (Forbidden), until the access policy becomes active.
One more thing that I would like to point out: when you call Get*BlobReference to create a new blob object, the CopyState property will not be populated until you do a GET/HEAD operation such as FetchAttributes.
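To illustrate the cross-account path described above, here is a rough sketch of my own using the same classic 2.x SDK: sign an ad-hoc read SAS directly on the source blob (no stored access policy, so no propagation delay) and hand that URI to the destination blob:
// Sketch only - srcContainer and destContainer belong to different storage accounts/credentials.
var srcBlob = srcContainer.GetBlockBlobReference("FOO");
var destBlob = destContainer.GetBlockBlobReference("FOO");

// Ad-hoc SAS signed with the *source* account's key; no container-level policy involved.
string sas = srcBlob.GetSharedAccessSignature(new SharedAccessBlobPolicy
{
    Permissions = SharedAccessBlobPermissions.Read,
    SharedAccessExpiryTime = DateTime.UtcNow.AddHours(1)
});

destBlob.StartCopyFromBlob(new Uri(srcBlob.Uri.AbsoluteUri + sas));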

Checking if a queue exists

I have a very basic question about Windows Azure Storage Queue errors/access.
I am trying to find out whether the given storage account already contains a queue with the given name - say "queue1". I do not want to create the queue if it does not exist, so I am not keen on using the CreateIfNotExist method. The permissions I have given to the SAS token are Process and Add (since all I want to do is add a new message to the queue only if it already exists, and throw an error otherwise).
The problem is that when I try to get a reference to a queue with a made-up name and add a message to it, I get a 403. A 403 can also occur when the SAS token does not have permissions, so I cannot be sure what is causing the error.
Is there a way I could explicitly know if the queue exists or not?
I have tried the BeginExists and EndExists methods, but they always return false even when I can see that the queue is there.
Any suggestions?
The Get Queue Metadata REST API operation will return status code 200 if the queue exists or a Queue Service Error Code otherwise.
Regarding authorization:
This operation can be performed by the account owner and by anyone with a shared access signature that has permission to perform this operation.
A GET request to
https://myaccount.queue.core.windows.net/myqueue?comp=metadata
will return a response like:
Response Status:
HTTP/1.1 200 OK
Response Headers:
Transfer-Encoding: chunked
x-ms-approximate-messages-count: 0
Date: Fri, 16 Sep 2011 01:27:38 GMT
Server: Windows-Azure-Queue/1.0 Microsoft-HTTPAPI/2.0
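The same check can be done from the classic .NET SDK instead of raw REST. A rough sketch of my own (not from the answer above): FetchAttributes issues the Get Queue Metadata call, so a 404 means the queue does not exist, while a 403 points at the SAS permissions instead:
// Sketch only - classic Microsoft.WindowsAzure.Storage SDK.
private static bool QueueExists(CloudQueue queue)
{
    try
    {
        queue.FetchAttributes();
        return true;
    }
    catch (StorageException ex) when (ex.RequestInformation?.HttpStatusCode == 404)
    {
        return false;
    }
}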
Are you sure you're getting a 403 error even if the queue does not exist? Based on what you described above, I created a simple console app. The queue does not exist in my storage account. When I try to add a message with a valid SAS token, I get a 404 error:
CloudStorageAccount storageAccount = new CloudStorageAccount(new StorageCredentials("account", "key"), false);
CloudQueueClient client = storageAccount.CreateCloudQueueClient();
CloudQueue queue = client.GetQueueReference("non-existent-queue");
var queuePolicy = new SharedAccessQueuePolicy();
var sas = queue.GetSharedAccessSignature(new SharedAccessQueuePolicy()
{
    SharedAccessExpiryTime = DateTime.UtcNow.AddMinutes(30),
    Permissions = SharedAccessQueuePermissions.Add | SharedAccessQueuePermissions.ProcessMessages | SharedAccessQueuePermissions.Update
}, null);

StorageCredentials creds = new StorageCredentials(sas);
var queue1 = new CloudQueue(queue.Uri, creds);

try
{
    queue1.AddMessage(new CloudQueueMessage("This is a test message"));
}
catch (StorageException excep)
{
    //Get 404 error here
}
Next, I made the SAS token invalid by setting its expiry to 30 minutes before the current time. Now when I run the application, I get a 403 error as expected.
CloudStorageAccount storageAccount = new CloudStorageAccount(new StorageCredentials("account", "key"), false);
CloudQueueClient client = storageAccount.CreateCloudQueueClient();
CloudQueue queue = client.GetQueueReference("non-existent-queue");
var queuePolicy = new SharedAccessQueuePolicy();
var sas = queue.GetSharedAccessSignature(new SharedAccessQueuePolicy()
{
    SharedAccessExpiryTime = DateTime.UtcNow.AddMinutes(-30), //-30 to ensure SAS is invalid
    Permissions = SharedAccessQueuePermissions.Add | SharedAccessQueuePermissions.ProcessMessages | SharedAccessQueuePermissions.Update
}, null);

StorageCredentials creds = new StorageCredentials(sas);
var queue1 = new CloudQueue(queue.Uri, creds);

try
{
    queue1.AddMessage(new CloudQueueMessage("This is a test message"));
}
catch (StorageException excep)
{
    //Get 403 error here
}
There are now Exists and ExistsAsync methods (with various overloads).
Example of the former in use:
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(connectionString);
CloudQueueClient queueClient = storageAccount.CreateCloudQueueClient();
CloudQueue queue = queueClient.GetQueueReference(queueName);
bool doesExist = queue.Exists();
You will want a reference to Microsoft.Azure.Storage.Queue (I believe older 'cloud' assemblies may not have had these members - initially I could only access ExistsAsync before I referenced the right package; once I had added the above via NuGet, Exists was also available).
For more details see the following links:
Exists
ExistsAsync
There is no Exists method in v12 either. I wrote a simple helper method to do the check:
private async Task<bool> QueueExistsAsync(QueueClient queue)
{
    try
    {
        await queue.GetPropertiesAsync();
        return true;
    }
    catch (RequestFailedException ex)
    {
        if (ex.Status == (int)HttpStatusCode.NotFound)
        {
            return false;
        }

        throw;
    }
}
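Example usage of the helper (my own addition; the connection string and queue name are placeholders):
// Sketch only - Azure.Storage.Queues v12.
var queue = new QueueClient(connectionString, "queue1");

if (await QueueExistsAsync(queue))
{
    await queue.SendMessageAsync("This is a test message");
}
else
{
    throw new InvalidOperationException("Queue 'queue1' does not exist.");
}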
