Using Azure Identity credentials for Spark access to Blob store

I'm trying to use Azure RBAC to secure access to storage blobs, and to use Azure Identity to access those blobs from Apache Spark. I see that recent versions of Hadoop-Azure support abfs, and it supports a few token providers: https://hadoop.apache.org/docs/current/hadoop-azure/abfs.html#Azure_Managed_Identity . For production usage, I can use a service principal with an AD app and the associated client id, secret, and endpoint. Or I can even use Managed Identity.
When developing locally, it would be good to be able to do the same with something like DeviceCodeCredential or InteractiveBrowserCredential, i.e. something that makes the user log in to Azure using a browser and uses the returned credentials to get an access token and pass it to Spark. The reason I'd like this is to have users use their own credentials when accessing data, and not have storage keys / SAS tokens / etc. flying about.
Is something like this possible? I could implement a Custom Token Provider that wraps an Azure Identity instance, but was wondering if there were a less nuclear approach.

If you want to authenticate with user credentials, the closest of the supported options is OAuth 2.0: Username and Password. Under the hood it uses the Azure AD ROPC flow, which has limitations; for example, it does not work with user accounts that have MFA enabled.
For local development, I would recommend a service principal, i.e. OAuth 2.0 Client Credentials. A managed identity (MSI) is essentially a service principal managed by Azure, so if you use MSI in production, the move between environments will be smoother. Permissions in Azure can also behave differently for user accounts and service principals in some scenarios (though perhaps not in this one). A Custom Token Provider is also feasible; which option to choose is up to you.
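As a rough illustration of the client credentials option, here is a minimal Python sketch that builds the ABFS settings for one storage account. The function name and the sample values are hypothetical; the property keys and the provider class name follow the hadoop-azure ABFS documentation linked in the question, and should be checked against the Hadoop version in use.

```python
def abfs_client_credentials_conf(account, tenant_id, client_id, client_secret):
    """Return Hadoop ABFS settings for the OAuth 2.0 client credentials flow.

    The keys are scoped to one storage account via the account suffix.
    """
    suffix = f"{account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{suffix}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{suffix}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.endpoint.{suffix}":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
        f"fs.azure.account.oauth2.client.id.{suffix}": client_id,
        f"fs.azure.account.oauth2.client.secret.{suffix}": client_secret,
    }


# Hypothetical values, for illustration only.
conf = abfs_client_credentials_conf("myaccount", "my-tenant-id", "my-client-id", "my-secret")
for key, value in conf.items():
    print(key, "=", "***" if key.startswith("fs.azure.account.oauth2.client.secret") else value)
```

In Spark, each of these keys can be passed with the spark.hadoop. prefix (for example through SparkSession.builder.config) so that it reaches the Hadoop configuration.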

Related

2 different auth types on .Net API?

We have a .net 5 API that runs as the backbone of our service.
Currently we have b2c set up within the API and our angular app to auth users.
But now we have a secondary daemon that needs to authorise onto the api to be able to do its thing.
What would be best practise to achieve this? Since we need to use client credentials and not user interaction.
• You can use the Azure Maps resource in Azure so that a daemon app authenticates securely and is authorized to call the API. Azure Maps supports two authentication methods: Azure Active Directory authentication and Shared Key authentication. The AAD method maps the ‘Client ID’, i.e. the service principal or account used for the REST API request, to the application/API. Shared Key authentication uses the primary or secondary key generated by the Azure Maps account as the subscription key, which is passed with each request to Azure Maps.
• For the detailed steps on using these methods to let a secondary daemon app authorize against an API, refer to the documentation below.
https://learn.microsoft.com/en-us/azure/azure-maps/how-to-secure-daemon-app
The link above covers both Shared Key authentication with Azure Key Vault and Azure AD role-based access control.

How to generate SAS Token to connect to Azure Storage Account - File Share?

In order to connect to Azure shared storage (in particular, File Share) to perform tasks like copying, removing, or modifying files from a remote machine, we need either a SAS (Shared Access Signature) or Active Directory settings enabled (with roles assigned as required).
I want to implement access using the SAS approach. I tried generating a SAS from the UI and from the access keys (found inside the storage account; confidential, and the most important keys for the storage account), and both worked. But the UI approach isn't practical in my case, and the access keys can't be given to anyone apart from the administrator.
So is there a way to generate a SAS using Azure AD credentials, or through some service where we can create an account with a password/key that can then be used to create a SAS token via curl (a REST call), without generating the SAS from the access keys (admin keys)?
The tricky part is letting your users create a SAS token for the file share without granting them permissions on the whole storage account.
You can use a middle-tier application that creates the SAS token, and let the users call that app. An Azure Function with an HTTP trigger can be used, for example. You grant the Azure Function access to the storage account using a Managed Service Identity, and secure access to the function either with Active Directory or with a function key that you distribute to your users.
You can try with this approach:
A SAS token for access to a container, directory, or blob may be secured by using either Azure AD credentials or an account key.
Microsoft recommends that you use Azure AD credentials when possible as a security best practice, rather than using the account key, which can be more easily compromised. When your application design requires shared access signatures, use Azure AD credentials to create a user delegation SAS for superior security.
Create a User delegation SAS
Generate a User Delegation Key:
POST https://myaccount.blob.core.windows.net/?restype=service&comp=userdelegationkey
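A minimal Python sketch of what that POST might carry, assuming the request body is a KeyInfo XML document with Start and Expiry times in UTC. The function name and sample values are hypothetical. The request would also need an Authorization: Bearer header with an Azure AD token and an x-ms-version header, which are not shown here.

```python
from datetime import datetime, timedelta, timezone


def user_delegation_key_request(account, start, hours_valid=1):
    """Return (url, xml_body) for a Get User Delegation Key request.

    start must be a timezone-aware UTC datetime; the key is requested
    for hours_valid hours from that point.
    """
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    expiry = start + timedelta(hours=hours_valid)
    url = f"https://{account}.blob.core.windows.net/?restype=service&comp=userdelegationkey"
    body = (
        '<?xml version="1.0" encoding="utf-8"?>'
        f"<KeyInfo><Start>{start.strftime(fmt)}</Start>"
        f"<Expiry>{expiry.strftime(fmt)}</Expiry></KeyInfo>"
    )
    return url, body


# Hypothetical account name and start time, for illustration only.
url, body = user_delegation_key_request(
    "myaccount", datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
)
print(url)
print(body)
```

The key returned by the service is then used to sign the user delegation SAS, in place of the account key.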

Using service principals / apps for OAuth2 authentication within Azure Data Factory

We aim to collect data from the Azure Management APIs. These APIs provide information on the resources we have running in Azure, the consumed budget, etc (example). Following our design choices, we prefer to exclusively use Azure Data Factory to make the HTTP requests and store the data into our data lakes. This is fairly obvious, using the REST linked service. However, we struggle to correctly set up the OAuth2 authentication dance with this method.
Our first idea was to store the token and the refresh token within the Azure Key Vault. A series of HTTP requests within the pipeline would then test whether the token is still valid or otherwise use the refresh token to get a new token. The downside to this approach is that the token within the Azure Key Vault is never updated, when needed, and that the logic becomes more complex.
Alternatively, we were trying to set up the authorization through combination of a registered app and service principal to our Azure AD account. The REST linked service within Data Factory can be created with a service principal, which would then handle most of the information of the scope and consent. The service principal is also accompanied with a Azure app, which would hold the token etc. Unfortunately, we are unable to make this setup function correctly.
Questions we have:
Can we actually use a service principal / app to store our OAuth2 tokens? If so, will these be automatically refreshed within our app?
How do we assign the correct privileges / authorizations to our app that it can use this (external) API?
Is the additional logic with HTTP calls within Azure Data Factory pipeline needed to update the tokens or can these apps / service principals handle this?
Thank you for your time and help!
It is not a good idea to store the tokens in Key Vault, because they will expire.
In your case, there are two options:
Use a service principal to authenticate
Use a managed identity to authenticate (best practice)
Steps to use a service principal:
1. Register an application with Azure AD and create a service principal.
2. Get the values needed for signing in (tenant ID and application ID) and create a new application secret.
3. To call the Azure REST API, e.g. the Resources - List operation you mentioned, your service principal needs an RBAC role in your subscription.
In the Azure portal, go to Subscription and add your service principal to a role such as Contributor or Owner on the subscription (Reader should be enough for read-only calls).
4. In the linked service, choose service principal authentication and fill in the values from step 2.
Don't forget to replace {subscriptionId} in the Base URL.
https://management.azure.com/subscriptions/{subscriptionId}/resources?api-version=2020-06-01
5. Test the linked service with a copy activity; it should work.
Steps to use a managed identity:
1. Make sure your data factory has MSI (managed identity) enabled. If you created it in the portal or with PowerShell, MSI is enabled automatically.
2. Navigate to the Subscription in the portal and add the role to the MSI, as in step 3 of the service principal steps. Just search for your ADF name in the search bar: the MSI is essentially a service principal with the same name as your ADF, managed by Azure.
3. Then, in the linked service, change the authentication type to managed identity.
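To make the two linked service variants concrete, here is a hedged Python sketch that builds the JSON definition for either one. The helper name and sample values are hypothetical, and the property names (RestService, AadServicePrincipal, ManagedServiceIdentity, aadResourceId and so on) are taken from memory of the Data Factory REST connector documentation, so verify them before relying on this.

```python
import json


def rest_linked_service(name, base_url, aad_resource, service_principal=None):
    """Build a REST linked service definition as a plain dict.

    service_principal: optional dict with keys "id", "key" and "tenant".
    When it is omitted, the definition uses the factory's managed identity.
    """
    props = {"url": base_url, "aadResourceId": aad_resource}
    if service_principal is None:
        props["authenticationType"] = "ManagedServiceIdentity"
    else:
        props.update({
            "authenticationType": "AadServicePrincipal",
            "servicePrincipalId": service_principal["id"],
            "servicePrincipalKey": {
                "type": "SecureString",
                "value": service_principal["key"],
            },
            "tenant": service_principal["tenant"],
        })
    return {"name": name, "properties": {"type": "RestService", "typeProperties": props}}


# Hypothetical name; {subscriptionId} is left as a placeholder on purpose.
definition = rest_linked_service(
    "AzureManagementRest",
    "https://management.azure.com/subscriptions/{subscriptionId}/",
    "https://management.azure.com",
)
print(json.dumps(definition, indent=2))
```

Passing a service_principal dict instead produces the first variant; in practice the secret would normally come from a Key Vault reference rather than an inline value.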
Finally, to answer your questions:
Can we actually use a service principal / app to store our OAuth2 tokens? If so, will these be automatically refreshed within our app?
As mentioned, storing tokens is not a good idea; just use the service principal or MSI to authenticate as in the steps above.
How do we assign the correct privileges / authorizations to our app that it can use this (external) API?
To use the Azure REST API, assign the RBAC roles as above and specify the correct AAD resource, which is https://management.azure.com in this case.
Is the additional logic with HTTP calls within Azure Data Factory pipeline needed to update the tokens or can these apps / service principals handle this?
No additional steps are needed. With the configuration above, Data Factory uses the client credentials flow in the background to get a token for you automatically, then uses that token to call the API.
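For reference, the client credentials exchange that happens in the background looks roughly like the following Python sketch. It only builds the token request (URL and form body) for the Azure AD v1 token endpoint and does not send it; the function name and sample values are hypothetical.

```python
from urllib.parse import urlencode


def client_credentials_token_request(tenant_id, client_id, client_secret,
                                     resource="https://management.azure.com/"):
    """Return (url, form_body) for an OAuth 2.0 client credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "resource": resource,  # the AAD resource the token is issued for
    })
    return url, body


# Hypothetical values, for illustration only.
url, body = client_credentials_token_request("my-tenant-id", "my-client-id", "my-secret")
print(url)
```

The body is sent as application/x-www-form-urlencoded in a POST, and the JSON response contains an access_token together with its expiry. Because no refresh token is involved, the caller simply repeats the request when the token expires.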

How to login on Azure Portal using REST APIs

I plan to implement a C# app that will create Azure resources using REST APIs (API calls to Azure Resource Manager). When calling a REST API you have to authenticate by passing an authentication header "Authorization: Bearer yJ0eXAiOiJKV...".
How do I get this Bearer token? Looking online, all I found involves having a Web App and using its application_id. However, I don't have any application and I don't want to create one.
I can replicate the calls that I intercept with Fiddler, but I think that is not the "recommended" way.
Has anyone faced this problem and found a solution?
Short answer: If you're developing a C# application that is going to use Azure REST APIs, then in order to get the bearer token for authentication you do need to have an Azure AD application registration (no way around that, as it's required for you to be able to authenticate using any of the supported OAuth 2.0 grant flows).
There are a few ways to make things more convenient for you though:
Use CLI to create a service principal for RBAC
From the Azure Portal, open the CLI (Cloud Shell) using its icon in the top toolbar.
Now run the command below:
az ad sp create-for-rbac -n "MyTestSPForAzureRESTAPIs"
This does multiple things for you in a single command and provides a great way to get started with testing the REST APIs.
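The created service principal is added as a "Contributor" to your Azure subscription (newer CLI versions may not assign a role by default, in which case pass --role and --scopes explicitly). You can always go to Subscriptions > Your Subscription > Access control (IAM) and change that as per your requirements.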
You get an application ID as well as Password/client secret that you can then use in C# code to get bearer token.
NOTE: Since this approach gives you a client secret, you should use this only from server side applications (like a web API or Web App or Daemon service). Do NOT use client secrets from a desktop based app (like console app or WPF app) or SPA in a production scenario.
I say this because desktop based apps or SPAs are not secure enough to handle client secrets and there are other different authentication flows recommended for them. If your case happens to be any of those, look at delegated permissions from your Azure AD application where you can prompt an end user for credentials. Just comment on the answer and I can add more specific guidance around those.
Use Managed Identity in case of App Service or Azure Function
If you plan to host the C# application that you mention, using App Service or as an Azure Function, then you can make use of MSI. Even in this case an application will be created in Azure AD, but you don't need to do that or manage keys (change them regularly etc.). It's a great option, highly recommended if it suits your scenario.
Read here for more details: How to use managed identities for App Service and Azure Functions
If you just want to get a bearer token, I recommend signing in to your account from the Azure REST API documentation. After signing in, you can obtain the bearer token there.
If you want to use code to get an access token to access or modify resources, you need to create an identity for an Azure AD application. This identity is known as a service principal. You can then assign the required permissions to the service principal.
For how to register an Azure AD application and assign a role to it, please refer to this document.
The following demo code shows how to get the access token with the application ID and secret key:
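// Requires the ADAL NuGet package (Microsoft.IdentityModel.Clients.ActiveDirectory).
using System.Threading.Tasks;
using Microsoft.IdentityModel.Clients.ActiveDirectory;

public static async Task<string> GetAccessToken(string tenantId, string clientId, string clientSecretKey)
{
    // Authority for the tenant that owns the app registration.
    var context = new AuthenticationContext("https://login.windows.net/" + tenantId);
    var clientCredential = new ClientCredential(clientId, clientSecretKey);
    // Client credentials flow: request a token for the Azure Resource Manager resource.
    var tokenResponse = await context.AcquireTokenAsync("https://management.azure.com/", clientCredential);
    return tokenResponse.AccessToken;
}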
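Once a token has been obtained, it goes into the Authorization header of each Azure Resource Manager call. A minimal Python sketch of that step, using only the standard library; the function name and sample values are hypothetical, and the request is built but not sent.

```python
from urllib.request import Request


def list_resources_request(subscription_id, access_token, api_version="2020-06-01"):
    """Build a GET request for the Resources - List operation with a bearer token."""
    url = (
        f"https://management.azure.com/subscriptions/{subscription_id}"
        f"/resources?api-version={api_version}"
    )
    return Request(url, headers={"Authorization": f"Bearer {access_token}"})


# Hypothetical subscription ID and token, for illustration only.
req = list_resources_request("00000000-0000-0000-0000-000000000000", "my-access-token")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it; a 401 response usually means the token is expired or was issued for a different resource.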

Programmatically access Microsoft identity across Azure, VSTS, and Graph

Is there a way to access Graph, VSTS, and Azure information with a single app? It seems access to each of these requires its own app with origination and callback URLs.
For Azure, I'm using NPM's passport-azure-ad in a node js app.
Ideally, I would like to combine VSTS build info, Azure service usage info, and User profile info.
Each of the services you mentioned has their own API:
Azure REST API
Visual Studio Team Services REST API
Microsoft Graph
This does not, however, mean that they each need their own "app". When you register your application in Azure AD via the Azure Portal, you're able to request access to a number of APIs. Each access_token you receive will be tied to one API (called a "resource"), but you can use the refresh_token to switch the targeted resource.
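A hedged Python sketch of that resource switch, assuming the Azure AD v1 token endpoint and its resource parameter. It only builds the refresh request and does not send it; the function name and sample values are hypothetical.

```python
from urllib.parse import urlencode


def refresh_for_resource(tenant_id, client_id, refresh_token, resource, client_secret=None):
    """Return (url, form_body) that redeems a refresh token for another resource."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"
    params = {
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "resource": resource,  # the API the new access token should target
    }
    if client_secret is not None:
        # Confidential clients (e.g. a server-side web app) also send their secret.
        params["client_secret"] = client_secret
    return url, urlencode(params)


# Hypothetical values: switch from whatever API the user signed in for to Microsoft Graph.
url, body = refresh_for_resource(
    "my-tenant-id", "my-client-id", "my-refresh-token", "https://graph.microsoft.com"
)
print(url)
```

Repeating the call with https://management.azure.com/ as the resource would yield a token for the Azure REST API from the same sign-in.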
The only exception here is the VSTS REST API. While most APIs use the same identity provider, VSTS has its own. So for the purposes of VSTS, you will need to have the user authenticate separately. Obviously, that isn't a great user experience, but there is a useful workaround: Personal Access Tokens.
With this approach, you authenticate the user via Azure AD OAuth and get an access token you can use with Microsoft Graph and the Azure REST API. Once you've authenticated them, you can ask them to provide a Personal Access Token to access VSTS. This allows you to forgo asking the user to authenticate a second time, since you'll store their PAT and use it for any calls to VSTS.
First, there is an "Allow scripts to access OAuth token" option in the Phase of a Build/Release definition. You can check this option and access the token through the System.AccessToken variable.
To grant permissions for that identity, you need to grant the permission(s) to the Project Collection Build Service (xxxx) account.
Secondly, there are some tasks related to Azure (e.g. Azure PowerShell) that can access Azure resources (the AAD application is associated with the Azure endpoint).
You can retrieve the necessary information in multiple tasks, store each result in a variable through Logging Commands (##vso[task.setvariable variable=name]value), and then combine them.
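As a small illustration of the logging command, a script step can set a variable for later tasks simply by printing a specially formatted line to standard output. The sketch below assumes the task.setvariable form with a variable= property; the helper name and the variable name are hypothetical.

```python
def set_variable_command(name, value):
    """Format a logging command that sets a build/release variable."""
    return f"##vso[task.setvariable variable={name}]{value}"


# Printing the line to stdout is what makes the agent pick it up.
print(set_variable_command("azureUsageSummary", "42"))
```

Later tasks in the same phase can then read the value as $(azureUsageSummary).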
