We have an Azure Analysis Services setup that pulls data from ADLS Gen2 (JSON files). When we try to process or model the tables from within SSMS, it throws the following error:
Failed to save modifications to the server. Error returned: 'The
credentials provided for the AzureBlobs source are invalid.
When I open up the storage account to all networks, there are no issues. However, I am worried about the security implications of opening up the storage account like that.
My question is: any pointers as to why SSMS would throw such an error?
I tried creating a service principal (SP) as admin on the AAS server and added the same SP to the storage blob as contributor, but no luck.
Adding the contributor role will never solve this problem.
As you can see in practice, this has nothing to do with RBAC roles.
The key to the problem lies in the network: you need to configure the storage firewall.
You have two options:
1. Add the outbound IP of the service you are using to the allowed list of the storage account.
2. Or integrate your service with an Azure VNet, and then add this virtual network to the allow list of the storage firewall.
The file below lists all the Azure service IP addresses that are published. (You need to add the IP addresses of the corresponding service to the allowed list in the storage firewall.)
https://download.microsoft.com/download/7/1/D/71D86715-5596-4529-9B13-DA13A5DE5B63/ServiceTags_Public_20200824.json
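To pick out the ranges for one service from that file, something like the following Python sketch may help. It assumes the file has a top-level "values" list whose entries carry a "name" and a "properties.addressPrefixes" list; the inline sample, its prefixes, and the helper names are made up for illustration and are not taken from the real download.

```python
import json

# Tiny inline sample mimicking the assumed layout of the Service Tags file.
# The tag names are real-looking but the prefixes are placeholders.
SAMPLE = json.loads("""
{
  "values": [
    {"name": "ApiManagement.WestEurope",
     "properties": {"addressPrefixes": ["203.0.113.0/28", "2001:db8::/64"]}},
    {"name": "Storage.WestEurope",
     "properties": {"addressPrefixes": ["198.51.100.0/24"]}}
  ]
}
""")


def prefixes_for_tag(service_tags, tag_name):
    """Return the address prefixes for one service tag (case-insensitive)."""
    for entry in service_tags.get("values", []):
        if entry.get("name", "").lower() == tag_name.lower():
            return list(entry["properties"]["addressPrefixes"])
    return []  # tag not present in the file


def ipv4_only(prefixes):
    """Drop IPv6 prefixes, in case the firewall only accepts IPv4 rules."""
    return [p for p in prefixes if ":" not in p]


if __name__ == "__main__":
    # With the real file you would use json.load(open(path)) instead of SAMPLE.
    for prefix in ipv4_only(prefixes_for_tag(SAMPLE, "ApiManagement.WestEurope")):
        print(prefix)
```

The published file changes regularly, so any allow list built this way needs to be refreshed when a new version appears.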
Related
I am using an Azure API management service to serve as a small API accessing a table storage in my storage account. I am using the table storage REST API (eg: https://learn.microsoft.com/en-us/rest/api/storageservices/query-entities)
I had no problems accessing the tablestorage using sharedkey-lite authorization, running a little script in policies, but due to business needs I needed to restrict access to the storage account.
Because of cost considerations I cannot put the APIM instance inside a VNet (neither external nor internal), so I need to find another way to access the storage account.
I have tried adding the APIM public IP to the firewall exceptions, but that still returned 403 Forbidden.
I have added a managed identity allowing read access to the entire storage account and using the policy expression:
<authentication-managed-identity resource="https://storage.azure.com/"/>
But after digging more into the docs, it seems that table storage is not supported by MSI, only blob and queue (https://learn.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/services-support-managed-identities#azure-storage-blobs-and-queues)
Does anyone have an idea how to access the table storage REST API? I cannot wrap my head around why IP whitelisting does not work.
There are only two ways to solve this 403 Forbidden error:
1. Put the API in a VNet and allow that VNet to access the storage account.
2. Allow the outbound IP of the service to access the storage account. You need to find that outbound IP first.
I'm using Power BI Dataflows to access spreadsheets I have in blob storage. I have configured IAM permissions on the storage account for myself and the Power BI Service user. The network configuration is set to 'Allow trusted Microsoft services to access this storage account' and 'Microsoft network routing endpoint' preferences.
First Test: Storage Account allowing access from all networks
I am able to access the spreadsheet from the Power BI Service and perform transformations.
Second Test: Storage Account allowing only selected networks
In this case, I have added a group of CIDR blocks for other services that need to access the storage account. I have also added the whitelists for the Power BI Service and PowerQueryOnline service using both the deprecated list and new json list.
When running the same connection from Power BI Service Dataflows I now get the 'Invalid Credentials' error message. After turning on logging for the storage account and running another successful test it looks like the requests are coming from private IP addresses (10.0.1.6), not any of the public ranges.
2.0;2020-09-18T12:57:17.0000567Z;ListFilesystems;OAuthSuccess;200;4;4;bearer;restrictiedmobacc;restrictiedmobacc;blob;"https://restrictiedmobacc.dfs.core.windows.net/?resource=account";"/restrictiedmobacc";7a6efbbd-e01f-004c-31bb-8d39a9000000;0;10.0.1.6;2018-06-17;2185;0;184;108;0;;;"gzip, deflate";Monday, 01-Jan-01 00:00:00 GMT;;"Microsoft.Data.Mashup (https://go.microsoft.com/fwlink/?LinkID=304225)";;"f5d7d551-0291-e765-f20d-09a337164e19";"31cae3e8-e77a-4db2-9050-a69c0555d912";"2f6a613f-ba8c-4432-bdb8-9a0ea0a9f51d";"b52893c8-bc2e-47fc-918b-77022b299bbc";"https://storage.azure.com";"https://sts.windows.net/2f6a613f-ba8c-4432-bdb8-9a0ea0a9f51d/";"<MY EMAIL ADDRESS>";;"{"action":"Microsoft.Storage/storageAccounts/blobServices/containers/read", "roleAssignmentId":"9fe216db-d682-462c-b408-4133a454ef1a", "roleDefinitionId":"8e3af657-a8ff-443c-a75c-2fe8c4bcb635", "principals": [{"id": "31cae3e8-e77a-4db2-9050-a69c0555d912", "type":"User"}], "denyAssignmentId":""}"
I'm at a loss as to what to try next; it is a requirement that this storage account not be open to the world. I have read that you can use an On-premises Data Gateway so that you can lock the address range down to that device, but I don't really want to go down that route.
Have you tried to enable a Service endpoint for Azure Storage within the VNet?
The service endpoint routes traffic from the VNet through an optimal path to the Azure Storage service.
Could you also check whether you have whitelisted the URLs listed on this page:
https://learn.microsoft.com/en-us/power-bi/admin/power-bi-whitelist-urls
Kr,
Abdel
After speaking with Microsoft Support I have been told
It is not possible to connect Power BI Service with a storage account that has restricted network access enabled.
However, after doing some reading on Azure Data Factory I noticed a statement...
"Services deployed in the same region as the storage account use private IP addresses for communication. Thus, you cannot restrict access to specific Azure services based on their public outbound IP address range."
Therefore I created a storage account in UK West with our Power BI Service in UK South. Looking at logs on the storage account I can now see requests from Power BI coming over a 51.0.0.0/8 range instead of private addresses. By adding 51.0.0.0/8 to the allowed CIDRs, Power BI Service Dataflows can now access the spreadsheets stored in the Datalake.
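The private-versus-public distinction above can be checked mechanically. Here is a minimal Python sketch, not an official tool: it pulls the requester IP out of a storage log line shaped like the one in the question and classifies it against an allow list. It assumes the 16th semicolon-delimited field is the requester address and carries no port suffix, as in that sample; the function names are hypothetical.

```python
import ipaddress

# Truncated copy of the first fields of the log line quoted in the question.
SAMPLE_LINE = (
    '2.0;2020-09-18T12:57:17.0000567Z;ListFilesystems;OAuthSuccess;200;4;4;'
    'bearer;restrictiedmobacc;restrictiedmobacc;blob;'
    '"https://restrictiedmobacc.dfs.core.windows.net/?resource=account";'
    '"/restrictiedmobacc";7a6efbbd-e01f-004c-31bb-8d39a9000000;0;10.0.1.6;2018-06-17'
)


def requester_ip(log_line):
    """Return field 16 (index 15), assumed to be the requester IP.

    A plain split is enough here because none of the earlier quoted
    fields in the sample contain a semicolon.
    """
    return log_line.split(";")[15]


def classify(ip, allowed_cidrs):
    """Label an address as 'private', 'allowed', or 'blocked'."""
    addr = ipaddress.ip_address(ip)
    if addr.is_private:
        # A public-IP allow list can never match a private source address.
        return "private"
    if any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs):
        return "allowed"
    return "blocked"


if __name__ == "__main__":
    ip = requester_ip(SAMPLE_LINE)
    print(ip, classify(ip, ["51.0.0.0/8"]))
```

If the logged address comes back as private, adding public CIDR blocks to the storage firewall will not help, which matches the same-region behaviour quoted above.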
I'm trying to build a very basic data flow in Azure Data Factory pulling a JSON file from blob storage, performing a transformation on some columns, and storing in a SQL database. I originally authenticated to the storage account using Managed Identity, but I get the error below when attempting to test the connection to the source:
com.microsoft.dataflow.broker.MissingRequiredPropertyException:
account is a required property for [myStorageAccountName].
com.microsoft.dataflow.broker.PropertyNotFoundException: Could not
extract value from [myStorageAccountName] - RunId: xxx
I also see the following message in the Factory Validation Output:
[MyDataSetName] AzureBlobStorage does not support SAS,
MSI, or Service principal authentication in data flow.
With this I assumed that all I would need to do is switch my Blob Storage Linked Service to the Account Key authentication method. However, after I switched to Account Key authentication and selected my subscription and storage account, I get the following error when testing the connection:
Connection failed Fail to connect to
https://[myBlob].blob.core.windows.net/: Error Message: The
remote server returned an error: (403) Forbidden. (ErrorCode: 403,
Detail: This request is not authorized to perform this operation.,
RequestId: xxxx), make sure the
credential provided is valid. The remote server returned an error:
(403) Forbidden.StorageExtendedMessage=, The remote server returned an
error: (403) Forbidden. Activity ID:
xxx.
I've tried selecting from Azure directly and also entering the key manually and get the same error either way. One thing to note is the storage account only allows access to specified networks. I tried connecting to a different, public storage account and am able to access fine. The ADF account has the Storage Account Contributor role and I've added the IP address of where I am working currently as well as the IP range of Azure Data Factory that I found here: https://learn.microsoft.com/en-us/azure/data-factory/azure-integration-runtime-ip-addresses
Also note, I have about 5 copy data tasks working perfectly fine with Managed Identity currently, but I need to start doing more complex operations.
This seems similar to the question "Unable to create a linked service in Azure Data Factory", but the Storage Account Contributor and Owner roles I have assigned should supersede the Reader role suggested in the reply there. I'm also not sure whether that poster is using a public or a private storage account.
Thank you in advance.
At the very bottom of the article listed above about white listing IP ranges of the integration runtime, Microsoft says the following:
When connecting to Azure Storage account, IP network rules have no
effect on requests originating from the Azure integration runtime in
the same region as the storage account. For more details, please refer
this article.
I spoke to Microsoft support about this. The issue is that whitelisting public IP addresses does not work for resources within the same region: because the resources are on the same network, they connect to each other using private IPs rather than public ones.
There are four options to resolve the original issue:
1. Allow access from all networks under Firewalls and Virtual Networks in the storage account (obviously this is a concern if you are storing sensitive data). I tested this and it works.
2. Create a new Azure-hosted integration runtime that runs in a different region. I tested this as well. My ADF data flow runs in the East region; I created a runtime that runs in East 2 and it worked immediately. The issue for me here is that I would have to have this reviewed by security before pushing to prod, because we'd be sending data across the public network. Even though it's encrypted, it's still not as secure as having two resources talking to each other in the same network.
3. Use a separate activity, such as an HDInsight activity like Spark, or an SSIS package. I'm sure this would work, but the issue with SSIS is cost, as we would have to spin up an SSIS DB and then pay for the compute. You also need to execute multiple activities in the pipeline to start and stop the SSIS pipeline before and after execution. Also, I don't feel like learning Spark just for this.
4. Finally, the solution I actually used: I created a new connection that replaced the Blob Storage connection with an Azure Data Lake Storage Gen2 connection for the data set. It worked like a charm. Unlike the Blob Storage connection, Managed Identity is supported for Azure Data Lake Storage Gen2 as per this article. In general, the more specific the connection type, the more likely the features will work for the specific need.
This is what you are facing now:
From the description we know that this is a storage connection error. I also assigned the contributor role to the data factory, but still got the problem.
The problem comes from the network and firewall settings of your storage account. Please check them.
Make sure you have added the client id and enabled the 'Trusted Microsoft services' exception.
Have a look at this doc:
https://learn.microsoft.com/en-us/azure/storage/common/storage-network-security#trusted-microsoft-services
Then go to your ADF and choose these:
After that, it should work.
If I add a firewall rule on my Azure storage account that only allows access from my IP address, I can still successfully access the table and queue storage in that storage account, but when I try to access any of the blobs or file storage, I get an error.
Using Microsoft Azure Storage Explorer, the error I see is 'Unable to retrieve child resources. This request is not authorized to perform this operation'
It seems crazy that the firewall rule would work differently for blob and table storage. Any ideas?
I finally figured this out. It's a bit embarrassing! It turns out that my company has two internet connections, and the firewall decides which one to use based on the name of the resource. So table and blob storage are accessed using different IP addresses. I only put one of these IP addresses in the firewall rules (the one that was being used to access table storage).
Thanks to Jason Ye for trying to help me out!
I realize that when you create a Shared Access Signature (SAS), you can limit the SAS to only be viable from certain IP ranges.
But what I need is, to secure the Azure storage account, such that even if you have the access keys, you would be unable to access anything on the account, unless the request was coming from a set of white-listed IP ranges. Is this at all possible?
As far as I know, Azure doesn’t support the IP limitation on access keys.
You should know that the Azure storage account and its access keys are about Management Plane Security. The keys grant complete access, so sharing your storage account and access keys with someone else is not a good choice.
Depending on your needs, using SAS with IP limits is your best choice. They are useful for providing limited permissions to your storage account to clients that should not have the account key. As such, they are a vital part of the security model for any application using Azure Storage.
Azure Storage security guide
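As a rough illustration of how the IP limit on a SAS works: the restriction travels in the token's `sip` query parameter, either as a single address or as a `start-end` range. The Python sketch below only inspects a token locally to see whether a given client IP would fall inside that range. The service itself does the real enforcement, and the token, signature, and function name here are placeholders.

```python
import ipaddress
from urllib.parse import parse_qs


def sas_allows_ip(sas_token, client_ip):
    """Check a client IP against the 'sip' field of a SAS query string.

    Returns True when the token carries no 'sip' restriction at all.
    """
    params = parse_qs(sas_token.lstrip("?"))
    sip = params.get("sip")
    if not sip:
        return True  # no IP restriction in this token
    # 'sip' is either "a.b.c.d" or "a.b.c.d-e.f.g.h"
    start, _, end = sip[0].partition("-")
    low = ipaddress.ip_address(start)
    high = ipaddress.ip_address(end or start)
    return low <= ipaddress.ip_address(client_ip) <= high


if __name__ == "__main__":
    # Placeholder token: the signature is fake and the token is not usable.
    token = "?sv=2019-12-12&sp=r&sip=168.1.5.60-168.1.5.70&sig=PLACEHOLDER"
    print(sas_allows_ip(token, "168.1.5.65"))
    print(sas_allows_ip(token, "168.1.5.71"))
```

A check like this is only useful for debugging why a request is rejected; it does nothing to restrict the account keys themselves.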
Not sure when this started but there's an option now
https://learn.microsoft.com/en-us/azure/storage/common/storage-network-security
You can essentially restrict all communication with a storage account to a set of IPs, CIDR blocks or Azure VNets.
Go to your storage account > Firewalls and virtual networks > select 'Selected Networks'
Then specify IPs, CIDR blocks or Azure VNets.
NOTE: The moment you turn this on, it essentially blocks any connections regardless of the SAS tokens they present, including access via the Azure Portal (you'll get an access denied on the blades showing your containers) and Storage Explorer. If you have apps running that use this account, make sure your restrictions include them before pressing Save.
If you configured static site hosting, it will also be affected. This affects the whole account, not only blobs. So if you have apps accessing tables or files in the account, make sure you add them to the list.
If you want to upload, edit, download things from your storage account and you're not in the networks specified, you will have to add your current IP.
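If you would rather script the same settings than click through the portal, the rules end up as a "networkAcls" object on the storage account resource. The Python sketch below only builds that object as a plain dict. The field names follow the ARM storage account schema as I understand it, so verify them against the current docs before use; the helper name is hypothetical. It also rejects private ranges up front, since the storage firewall's IP rules are meant for public internet addresses.

```python
import ipaddress
import json


def build_network_acls(ip_rules, vnet_subnet_ids=(), bypass="AzureServices"):
    """Build a 'Selected networks' style ACL block (default deny)."""
    for rule in ip_rules:
        net = ipaddress.ip_network(rule, strict=False)
        # is_private also flags other non-global ranges (loopback,
        # link-local, documentation ranges), not only RFC 1918.
        if net.is_private:
            raise ValueError(f"{rule} is not a public range")
    return {
        "defaultAction": "Deny",          # block everything not listed
        "bypass": bypass,                 # trusted Microsoft services exception
        "ipRules": [{"value": r, "action": "Allow"} for r in ip_rules],
        "virtualNetworkRules": [
            {"id": subnet_id, "action": "Allow"} for subnet_id in vnet_subnet_ids
        ],
    }


if __name__ == "__main__":
    # 51.0.0.0/8 is the range mentioned earlier in the thread.
    print(json.dumps(build_network_acls(["51.0.0.0/8"]), indent=2))
```

Remember to include your own current public IP in the list before applying anything like this, or you will lock yourself out as described above.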