Folder-level access control in ADLS Gen2 for newly onboarded users - Azure

I have a Gen2 storage account and created a container.
The folder structure looks something like this:
StorageAccount
-> Container1
   -> normal-data
      -> Files 1...n
   -> sensitive-data
      -> Files 1...m
I want to give a user read-only access to normal-data only, and NOT to sensitive-data.
This can be achieved by setting ACLs at the folder level and granting access to the security (service) principal.
But the limitation of this approach is that the user can only access files loaded into the directory after the ACL is set up, and cannot access files that were already present in the directory.
Because of this limitation, new users cannot be given full read access (unless they use the same service principal, which is not ideal in my use case).
Please suggest a read-only access method in ADLS Gen2 where:
If files are already present under a folder and a new user is onboarded, they can read all the files under that folder
The new user gets access only to the normal-data folder and NOT to sensitive-data
PS: There is a script for assigning ACLs recursively, but since close to a million files land in the normal-data folder each day, running the recursive ACL script is not feasible for me.

You could create an Azure AD security group and give that group read-only access to the normal-data folder.
Then you can add new users to the security group.
See: https://learn.microsoft.com/en-us/azure/active-directory/fundamentals/active-directory-groups-create-azure-portal
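For illustration, the setup could look roughly like this with the az CLI (a sketch only; the group name, object IDs, and the storage account name are placeholders, and flags may differ slightly between CLI versions):

# Create the group and add the new user to it (names/IDs are placeholders).
az ad group create --display-name "normal-data-readers" --mail-nickname "normal-data-readers"
az ad group member add --group "normal-data-readers" --member-id <user-object-id>
GROUP_ID=$(az ad group show --group "normal-data-readers" --query id -o tsv)

# The group needs execute (traverse) on the container root to reach normal-data.
# Note: `az storage fs access set --acl` replaces the whole ACL on the path, so
# include any existing entries alongside the base user::/group::/other:: entries.
az storage fs access set \
  --acl "user::rwx,group::r-x,other::---,group:${GROUP_ID}:--x" \
  --path / --file-system container1 \
  --account-name <storage-account> --auth-mode login

# Read + execute on normal-data for the group: on the directory itself, on all
# files already in it, and (via the default entry) on files added later.
az storage fs access update-recursive \
  --acl "group:${GROUP_ID}:r-x,default:group:${GROUP_ID}:r-x" \
  --path normal-data --file-system container1 \
  --account-name <storage-account> --auth-mode login

Because the ACL entries reference the group rather than individual users, the recursive pass only has to run once; onboarding a new user afterwards is just a group-membership change, and sensitive-data never gets an entry for the group.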

Related

Storage account FileShare - can I mount it as a drive?

Hi, I have the below questions.
I have a storage account, and inside the storage account I have file shares.
Below is my folder structure:
Root\Account 1
Root\Account 1\ReadOnly
Root\Account 1\ReadAndWrite
Root\Account 2
Root\Account 2\ReadOnly
Root\Account 2\ReadAndWrite
Now my question is: can I map Root\Account 2\ReadOnly or Root\Account 2\ReadAndWrite for my end users as their network-connected shared drive "Z:\"?
I was actually trying to follow the blog post https://husseinsalman.com/securing-access-to-azure-storage-part-5-stored-access-policy/; what I do not understand is how to use the SAS signature to mount it as a network folder.
It's not possible to mount a specific directory; however, you can set permissions on files and directories. You can use Azure Active Directory Domain Services authentication on Azure Files to assign permissions for the directories:
Azure Files identity-based authentication options for SMB access
Configure directory and file level permissions over SMB
If you mount the file share by using SMB, you don't have folder-level control over permissions. However, if you create a shared access signature by using the REST API or client libraries, you can specify read-only or write-only permissions on folders within the share.
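For example, once identity-based authentication is enabled on the share, something along these lines could work from a domain-joined Windows client (account, share, domain, and user names are placeholders; this is an untested sketch, not a recipe):

REM Mount the whole share (a sub-directory cannot be mounted on its own).
net use Z: \\mystorageaccount.file.core.windows.net\myshare

REM With AD DS / Azure AD DS authentication enabled on the share, directory-level
REM permissions can then be managed with normal Windows ACL tooling such as icacls:
icacls "Z:\Account 2\ReadOnly"     /grant "CONTOSO\enduser:(OI)(CI)(RX)"
icacls "Z:\Account 2\ReadAndWrite" /grant "CONTOSO\enduser:(OI)(CI)(M)"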

Azure Data Lake Gen 2 default access control list not being applied to new files

Azure Data Lake Gen 2 has two levels of access control: role-based access control (RBAC) and access control lists (ACLs). RBAC functions at the container level, while ACLs can function at the directory and file level. For child objects of a directory to inherit permissions from the parent, "default" ACL entries need to be set on the parent.
See: https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-access-control#default-permissions-on-new-files-and-directories
My issue is that I'm seeing behavior where child directories inherit their parent's ACL but child files do not.
My steps were as follows:
Create an AAD group, something like "Consumers".
In Microsoft Azure Storage Explorer, create a new directory ("foo"), right click "foo", select "Manage Access", select "Add", add the "Consumers" group to the list, check [x] Access with [x] Read and [x] Execute. Check [x] Default with [x] Read and [x] Execute.
Write an Azure Function that copies blobs from a container to something like "foo/dataset/2020/05/myblob.csv" in the container with managed access.
Drilling down through the directories, the "dataset" directory has the same ACL as "foo", as do "2020" and "05". But "myblob.csv" does not include the "Consumers" group in its ACL at all.
Is this unexpected behavior or am I missing something fundamental here?
It seems to be an issue with Functions, or the Azure Data Lake Gen 2 SDK (C#) used in the Function.
Using Azure Storage Explorer, when I manually add a file under a directory that I've added the "Consumers" group to, the expected ACL is applied. It also works when I add a directory containing a file: both the subdirectory and the file inside it get the expected ACL.
Thanks
[edit] Is this related to the umask when writing the file with the C# SDK? Do I need to override the default mask to allow files to inherit permissions of their parent? https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-access-control#the-mask
[edit2] I think it's related to using DataLakeFileClient.Rename to "move" the blob. I suspect the blob retains its original ACL instead of inheriting the ACL from its new parent. Writing a test...
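One way to check this (a rough az CLI sketch; the container and account names are placeholders) is to compare the parent directory's ACL with the ACL the file ended up with after the Rename:

# Parent directory: should show the "Consumers" entries, including the default: entries
az storage fs access show --path foo/dataset/2020/05 \
  --file-system <container> --account-name <account> --auth-mode login

# The moved file: if Rename preserved the original ACL, the "Consumers" entry will be missing
az storage fs access show --path foo/dataset/2020/05/myblob.csv \
  --file-system <container> --account-name <account> --auth-mode login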
Use Azure Storage Explorer to set the permissions, and use its "propagate access control lists" feature to push the correct permissions down to the existing children.

Grant access to Azure Data Lake Gen2 via ACLs only (no RBAC)

My goal is to restrict access to an Azure Data Lake Gen 2 storage account at the directory level (which should be possible according to Microsoft's promises).
I have two directories, data and sensitive, in a Data Lake Gen 2 container. For a specific user, I want to grant read access to the data directory and prevent any access to the sensitive directory.
Following the documentation, I removed all RBAC assignments for that user (on the storage account as well as the data lake container) so that there is no inherited read access on the directories. Then I added a read ACL entry on the data directory for that user.
My expectation:
The user can directly download files from the data directory.
The user cannot access files in the sensitive directory.
Reality:
When I try to download files from the data directory, I get a 403 ServiceCode=AuthorizationPermissionMismatch:
az storage blob directory download -c containername -s data --account-name XXX --auth-mode login -d "./download" --recursive
RESPONSE Status: 403 This request is not authorized to perform this operation using this permission.
I expect this to work. Otherwise I can only grant access by assigning the Storage Blob Data Reader role, but that applies to every directory and file within the container and cannot be overridden by ACL entries. Did I do something wrong here?
According to my research, if you want to grant a security principal read access to a file, you need to give the security principal Execute permission on the container and on each folder in the hierarchy of folders leading to the file. For more details, please refer to the documentation.
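For illustration, a sketch of that minimal ACL-only setup with the az CLI, reusing the names from the question (the user object ID is a placeholder; note that `az storage fs access set --acl` replaces the whole ACL on a path, so any existing entries must be included in the spec):

# Execute (traverse) on the container root, so the path to `data` can be resolved:
az storage fs access set \
  --acl "user::rwx,group::r-x,other::---,user:<user-object-id>:--x" \
  --path / --file-system containername --account-name XXX --auth-mode login

# Read + execute on the data directory itself (allows listing it):
az storage fs access set \
  --acl "user::rwx,group::r-x,other::---,user:<user-object-id>:r-x" \
  --path data --file-system containername --account-name XXX --auth-mode login

Reading an individual file additionally requires read permission on that file itself, so existing files under data still need an ACL entry (for example via a one-off recursive ACL update), and a default entry on data takes care of files created later.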
I found that I could not get ACLs to work without an RBAC role. I ended up creating a custom "Storage Blob Container Reader" RBAC role in my resource group with only the permission "Microsoft.Storage/storageAccounts/blobServices/containers/read", so that the role itself does not grant listing or reading of the actual blobs.
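A rough sketch of such a custom role and its assignment (subscription, resource group, storage account, and object IDs are placeholders):

cat > container-reader-role.json <<'EOF'
{
  "Name": "Storage Blob Container Reader",
  "Description": "Container metadata read only; blob data access stays governed by ACLs",
  "Actions": [ "Microsoft.Storage/storageAccounts/blobServices/containers/read" ],
  "DataActions": [],
  "AssignableScopes": [ "/subscriptions/<subscription-id>/resourceGroups/<resource-group>" ]
}
EOF

az role definition create --role-definition @container-reader-role.json
az role assignment create --assignee <user-object-id> \
  --role "Storage Blob Container Reader" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<account-name>"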

data factory loses permissions when copying from data lake (gen1) to blob storage

Data factory gives me this error when attempting to copy from data lake gen1 to blob storage:
"message": "Failure happened on 'Sink' side. ErrorCode=UserErrorFailedFileOperation,
'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Upload file failed at
path myblobcontainer\\file_that_im_tryin_to_copy.xml.,Source=Microsoft.DataTransfer.Common,''Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Failed to read a 'AzureDataLakeStore' file. File path: 'SourceFolderInDataLake/2019/09/26/SomeOtherFile.usql'.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Net.WebException,Message=The remote server returned an error: (403) Forbidden.
I have a U-SQL Script activity that executes 1-Patient.usql. The next activity is a Copy Data step, with the Source and Sink configured as shown in the screenshots.
I have configured roles/permissions using this tutorial.
I can solve this issue by going to Data Explorer --> Access, clicking Advanced, and then clicking "Apply to all children"; after that, the copying works fine!
Please note that prior to the Copy Data activity, Data Factory executes the U-SQL script inside Gen1. The script is stored in Gen1 and generates files and folders inside the data lake. There is never any permissions issue running this script.
What am I doing wrong?
I can reproduce your issue. Actually, "Apply folder permissions to sub-folders" is not strictly necessary. The issue is caused by the access control model of Data Lake Gen1; the key to the problem is the order in which files are uploaded and permissions are set.
You could check Access control in Azure Data Lake Storage Gen1 first, then refer to the information below, which is based on my test.
Suppose you add an access permission entry, as in the screenshot, to the root /.
If the file already exists before you set the permission, it will be affected by the operation, i.e. access to the file will be set and you can access it.
But if you upload a file or create a new folder after setting the permission, the new folder and file will not have the access and you will not be able to access them. You can select the file and click Access to check this directly.
After setting the permission above, if you then set "A default permission entry", it will not affect the existing folders and files, but new folders and files you create will get the access, i.e. the old folders and files still have no access while the new ones do. If you want access to the old ones, add the access permissions again as in the screenshot; "Apply folder permissions to sub-folders" follows the same logic.
So in conclusion, if you want your service principal/MSI to access all the files in your data lake, you could choose the third option, "An access permission entry and a default permission entry"; then it will be able to access both existing and new folders/files.
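For reference, a rough az CLI sketch of that third option applied to the Gen1 source folder (the account name, folder path, and the ADF service principal's object ID are placeholders):

# Access entry (affects the folder now) plus a default entry (inherited by folders
# and files created underneath it later). Items that already existed before this call
# still need the portal's "Apply to all children" (or an equivalent recursive pass) once.
# The principal also needs --x on / and every parent folder to traverse to this path.
az dls fs access set-entry --account <gen1-account-name> \
  --path /SourceFolderInDataLake \
  --acl-spec "user:<adf-sp-object-id>:r-x,default:user:<adf-sp-object-id>:r-x"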

Azure Data Lake Service Principal write w/ Data Factory

I have created a service principal and set up the necessary linked services to use the credentials, secret key, etc. in ADF. Here is a rundown of how this is done:
https://learn.microsoft.com/en-us/azure/data-lake-store/data-lake-store-authenticate-using-active-directory
When I execute my pipeline and the files are written to the ADL, I can see the top-level folder (I am logged in as the creator of the ADL service and am also a Contributor on the resource group), but I am unable to drill down any further.
The error I receive basically boils down to an ACL error.
Interestingly, I also note that the execution location is listed as East US 2 when using the service principal.
When I manually authenticate the ADL connection in Data Factory (with my own credentials), everything works absolutely fine, and the 'execution location' is correctly listed as North Europe.
Has anyone ever seen this?
Helpful Reading: https://learn.microsoft.com/en-us/azure/data-lake-store/data-lake-store-access-control
The problem you are running into is likely an ACL issue, as you mentioned. By just having Contributor access, you only have permissions on the management plane of the account, not the data plane.
Here is the mental model for thinking about ACLs:
If you need to be able to read a file, you need read (r) access on that file, and execute (--x) permission on the parent folders all the way up to the root.
If you create a new folder and add a default ACL entry for yourself on it, that entry will apply to all new files and folders created below it.
To address your issue, please ask a Super User (someone from the Owners group) to give you this access.
Alternatively, if you are an owner, you will have RWX access to any file/folder independent of any ACLs.
This should solve your problem.
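Applied to Gen1 with the az CLI, that mental model looks roughly like this (a sketch only; the account name, folder paths, and object ID are placeholders):

# --x on the root and on every folder leading down to the output folder...
az dls fs access set-entry --account <adls-account> --path / \
  --acl-spec "user:<your-object-id>:--x"
az dls fs access set-entry --account <adls-account> --path /toplevelfolder \
  --acl-spec "user:<your-object-id>:--x"

# ...then r-x plus a default entry on the folder whose contents you need to read,
# so items the pipeline writes there later are readable as well.
az dls fs access set-entry --account <adls-account> --path /toplevelfolder/output \
  --acl-spec "user:<your-object-id>:r-x,default:user:<your-object-id>:r-x"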
