Azure service principal reset

I have a k8s cluster running in Azure, and I have always reset the service principal credentials using the Azure CLI: az ad sp credential reset --name <xyz> --years 2. Afterwards I update the AKS cluster with the new service principal credentials, see update AKS Cluster credentials. After this the cluster is restarted.
For a production environment I want to avoid restarting the cluster after resetting the credentials, so I was thinking of reusing the same password as before, i.e. old password = new password. This is achieved using az ad sp credential reset --name <xyz> --years 2 --password <1234>
Now my question is: should I update the AKS cluster with new service principal credentials even if I use the same password as before? Has anyone tried this before?

AFAIK it's not required to update or restart the AKS cluster with new service principal credentials if you are using the same password as before.
Since the password at the AKS level stays the same and you are only resetting the service principal with that same password so that it doesn't expire, nothing changes from the cluster's point of view.
Note: if you reset with a password different from the one currently in use, then you do have to update the AKS cluster credentials.
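For reference, a minimal sketch of the flow described in that note, i.e. resetting to a different (newly generated) secret and then pushing it to the cluster; the resource group and cluster names are placeholders:

SP_ID=$(az aks show -g <RGName> -n <AKSName> --query servicePrincipalProfile.clientId -o tsv)
# reset with a newly generated secret (different from the old one)
SP_SECRET=$(az ad sp credential reset --name "$SP_ID" --years 2 --query password -o tsv)
# push the new secret to the cluster; this is the step that triggers the restart mentioned above
az aks update-credentials -g <RGName> -n <AKSName> \
  --reset-service-principal \
  --service-principal "$SP_ID" \
  --client-secret "$SP_SECRET"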
For additional benefits, you can use a managed identity rather than a service principal, as rickvdbosch and Philip Welz suggested in the comments.
You can now update an AKS cluster currently working with service principals to work with managed identities by using the following CLI command.
az aks update -g <RGName> -n <AKSName> --enable-managed-identity
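If you want to confirm the conversion afterwards, one quick check (same placeholder names) is:

az aks show -g <RGName> -n <AKSName> --query identity -o json

which should return a SystemAssigned identity block once the cluster is no longer using the service principal.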

Related

how to update the credentials for Azure Kubernetes Service (AKS)

I get this issue on my AKS cluster:
Your service principal has expired or is invalid. Please check if you are using the correct secret or if the key has expired.
I think the problem is that the current service principal has already expired and needs to be reset.
I have followed the instructions in this link, but am still unable to do it, even using an account with Owner access.
Using a Global Administrator user:
root#root~ % SP_ID=$(az aks show --resource-group myclusterk8s --name myclusterk8s \
--query servicePrincipalProfile.clientId -o tsv)
I get the id from the first command, but when I run the second command, the id from the first command cannot be used:
root#root ~ % SP_SECRET=$(az ad sp credential reset --name "$SP_ID" --query password -o tsv)
ERROR: Resource '6ec95333-4a5f-4sd1-8478-4b367a4b3711' does not exist or one of its queried reference-property objects are not present.
Does anyone have any suggestions on how to solve this?
Resource 'appID' does not exist or one of its queried reference-property objects are not present.
The error suggests that the service principal does not exist in your Azure AD.
Firstly, you need to check if the service principal exists in your Azure Active Directory.
If the service principal does not exist, you can create a new service principal and update the AKS cluster with the new service principal credentials.
If the service principal still exists and you are facing the issue, you can create a new client secret in the Azure Portal. Then you can update the AKS cluster with the new credentials.
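To illustrate, a minimal sketch of those checks, using the resource group and cluster names from the question and assuming jq is available for parsing the JSON output:

# does the service principal referenced by the cluster still exist?
SP_ID=$(az aks show --resource-group myclusterk8s --name myclusterk8s \
  --query servicePrincipalProfile.clientId -o tsv)
az ad sp show --id "$SP_ID"

# if it is gone, create a replacement and point the cluster at it
SP_JSON=$(az ad sp create-for-rbac --skip-assignment -o json)
az aks update-credentials --resource-group myclusterk8s --name myclusterk8s \
  --reset-service-principal \
  --service-principal "$(echo "$SP_JSON" | jq -r .appId)" \
  --client-secret "$(echo "$SP_JSON" | jq -r .password)"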

Kubernetes pods using invalid Azure Active Directory access tokens

When deploying new jobs and services to Azure Kubernetes Service cluster, the pods fail to request valid AAD access tokens with all permissions available. If new permissions were added on the same day, before or after a deployment, the tokens still do not pick them up. This issue has been observed with permissions granted to Active Directory Groups over Key Vaults, Storage Accounts, and SQL databases scopes so far.
Example: I have a .NET 5.0 C# API running on 3 pods with antiaffinity rules located each on a separate node. The application reads information from a SQL database. I made a release and added the database permissions afterwards. Things I have tried so far to make the application reset the access tokens:
kubectl delete pods --all -n <namespace> which essentially created 3 new pods again failing due to insufficient permissions.
kubectl apply -f deployment.yaml to deploy a new version of the image running in the containers, again all 3 pods kept failing.
kubectl delete -f deployment.yaml followed by kubectl apply -f deployment.yaml to erase the old kubernetes object and create a new one. This resolved the issue on 2/3 pods, however, the third one kept failing due to insufficient permissions.
kubectl delete namespace <namespace> to erase the entire namespace with all configuration available and recreated it again. Surprisingly, again 2/3 pods were running with the correct permissions and the last one did not.
The commands were run more than one hour after the permissions were added to the group. The database tokens are valid for 24 hours, and when I have seen this issue occur with cronjobs, I had to wait a day for the task to execute correctly (none of the above steps worked in the cronjob scenario). The validity of the tokens kept changing, which implied that the pods are requesting new access tokens that again exclude the most recently added permissions. The only solution I have found that works 100% of the time is to destroy the cluster and recreate it, which is not viable in any production scenario.
The failing pod from my example was the one always running on node 00 which made me think there may be an extra caching layer on the first initial node of the cluster. However, I still do not understand why the other 2 pods were running with no issue and also what is the way to restart my pods or refresh the access token to minimise the wait time until resolution.
Kubernetes version: 1.21.7.
The cluster has no AKS-managed AAD or pod-identity enabled. All RBAC is granted to the cluster MSI via AD groups.
Please check whether the steps below work around the issue in your case.
To access the Kubernetes resources, you must have access to the AKS cluster, the Kubernetes API, and the Kubernetes objects. Ensure that you're either a cluster administrator or a user with the appropriate permissions to access the AKS cluster.
Things you need to do, if you haven't already:
Enable Azure RBAC on your existing AKS cluster, using:
az aks update -g myResourceGroup -n myAKSCluster --enable-azure-rbac
Create a Role that allows read access to the relevant Pods and Services:
Add the necessary roles (Azure Kubernetes Service Cluster User Role, Azure Kubernetes Service RBAC Reader/Writer/Admin/Cluster Admin) to the user; see Microsoft Docs. A sketch of these role assignments follows after these steps.
Also check Troubleshooting
Also check whether you need the "Virtual Machine Contributor" and Storage Account Contributor roles on the resource group containing the pods, and whether the namespace is specified for that pod in case you have missed it (Stack Overflow reference). Also check whether a firewall is restricting network access to that pod.
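As mentioned above, a sketch of those role assignments, assuming placeholder resource names and the object id of your user or AD group:

AKS_ID=$(az aks show --resource-group myResourceGroup --name myAKSCluster --query id -o tsv)

# lets the user fetch kubeconfig credentials for the cluster
az role assignment create --assignee <user-or-group-object-id> \
  --role "Azure Kubernetes Service Cluster User Role" --scope "$AKS_ID"

# grants read access to Kubernetes objects via Azure RBAC (swap Reader for Writer/Admin as needed)
az role assignment create --assignee <user-or-group-object-id> \
  --role "Azure Kubernetes Service RBAC Reader" --scope "$AKS_ID"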
Resetting the kubeconfig context using the az aks get-credentials command may clear a previously cached authentication token for the user:
az aks get-credentials --resource-group myResourceGroup --name myAKSCluster --overwrite-existing (Reference)
Please also check the other references below:
kubernetes - Permissions error - Stack Overflow
create-role-assignments-for-users-to-access-cluster | microsoft docs
user can't access to AKS cluster with RBAC enabled (github.com)
kubernetes - Stack Overflow

Azure Kubernetes Error when running "az aks get-credentials" command

Azure Admins created a cluster for us.
On VM I installed "az cli" and "kubectl".
With my account, in the Azure Portal I can see the Kubernetes Service and the Resource Group to which it belongs.
From the level of that cluster in Azure Portal I can see that I have a role:
"AKS Cluster Admin Operator"
I am logged in on the VM with my account. I need to configure my kubectl to work with our cluster.
When I try to execute:
az aks get-credentials --resource-group FRONT-AKS-NA2 --name front-aks
I am getting error:
ForbiddenError: The client 'my_name#my_comp.COM' with object id
'4ea46ad637c6' does not have authorization to perform action
'Microsoft.ContainerService/managedClusters/listClusterUserCredential/action'
over scope
'/subscriptions/89e05d73-8862-4007-a700-0f895fc0f7ea/resourceGroups/FRONT-AKS-NA2/providers/Microsoft.ContainerService/managedClusters/front-aks'
or the scope is invalid. If access was recently granted, please
refresh your credentials.
In my case, refreshing the recently granted credentials with this command helped:
az account set --subscription "your current subscription name"
It triggered a re-login and fixed the issue.
Well, I see from the comments that you already got the solution, so I'll just explain the difference. Hope it helps!
When you use the az aks get-credentials command without the --admin parameter, the CLI uses the default: cluster user. The cluster user only works if you integrate AKS with AAD. But you said you only have the AKS Cluster Admin Operator role, so the appropriate parameter is --admin. You can get more details here.
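For illustration, the two variants look like this (using the resource group and cluster names from the question):

# cluster user credentials (default) - requires AAD integration on the cluster
az aks get-credentials --resource-group FRONT-AKS-NA2 --name front-aks

# cluster admin credentials - matches the admin-style role you already have
az aks get-credentials --resource-group FRONT-AKS-NA2 --name front-aks --admin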
That said, it's a little dangerous. If the AKS cluster is just for testing, there is no problem, but if it's for production, I recommend integrating with AAD and then giving the appropriate permissions to the user, because the admin user has all permissions, which is not safe.

How do I get my AKS cluster to authenticate to my ACR?

A few weeks ago, I was able to use the Azure CLI to create my Container Registry (ACR) and Kubernetes (AKS) cluster. I could push images to my ACR and have AKS pull images successfully - everything worked great. Every now and then, I would have to refresh my login with az acr login --name <acrName>, but not a big deal.
Today, I found that when I go to deploy an updated image to my AKS cluster, I got a status of ImagePullBackOff:
Failed to pull image "MY_ACR.azurecr.io/MY_IMAGE:v1": rpc error: code = Unknown desc = Error response from daemon: Get https://MY_ACR.azurecr.io/v2/MY_IMAGE/manifests/v1: unauthorized: authentication required, visit https://aka.ms/acr/authorization for more information.
I couldn't remember what I needed to do to make this work, so I went through my original steps and created an entirely new resource group, ACR, AKS cluster, and service principal connecting them. I pushed images to my ACR and was able to apply my Kubernetes manifest, and everything worked again.
A couple hours later, when I applied an updated manifest, I again got the same error message. As part of my setup, I created a service principal:
az ad sp create-for-rbac --skip-assignment
az role assignment create --assignee <principal's appId> --scope <my ACR's id> --role Reader
I also used --role acrpull. It seems like the authentication has timed out, and the documentation for Authenticate with an Azure container registry says that individual AD identities will time out after 3 hours, but even after running az acr login --name <acrName>, I'm not able to fix the issue.
What are the required steps to get my AKS cluster to be able to authenticate again to my ACR?
I'll note that I also attached the ACR according to the documentation at Authenticate with Azure Container Registry from Azure Kubernetes Service by running:
az aks update -n cluster_name -g resource_group --attach-acr acr_name
I also tried using the ACR id instead of the name. After a minute or so the command completed, and even half an hour or more later I got the same permissions issue.
The easiest way to integrate AKS with ACR is to leverage the --attach-acr option during cluster creation. This lets AKS manage the service principal for you and handle the token refreshes.
https://learn.microsoft.com/en-us/azure/aks/cluster-container-registry-integration#create-a-new-aks-cluster-with-acr-integration
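For example, a minimal sketch with placeholder names; newer CLI versions also provide az aks check-acr to validate that the attachment actually lets the nodes pull:

# attach the registry when the cluster is created, instead of wiring up a service principal by hand
az aks create --resource-group resource_group --name cluster_name \
  --attach-acr acr_name --generate-ssh-keys

# validate that the cluster can authenticate to and pull from the registry
az aks check-acr --resource-group resource_group --name cluster_name \
  --acr acr_name.azurecr.io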

Azure AKS User Credentials Login to K8 Dashboard and RBAC Built-in Roles

According to the documentation, Azure Kubernetes Service Cluster User Role allows access to Microsoft.ContainerService/managedClusters/listClusterUserCredential/action API call only.
My user is part of an AD group that has Azure Kubernetes Service Cluster User Role permissions on the AKS cluster and all the cluster role and cluster role bindings have been applied via kubectl.
I can double check and verify that access to dashboard and permissions work with these steps:
1. az login
2. az aks get-credentials --resource-group rg --name aks
3. kubectl proxy
4. Open web connection
5. Get prompt on terminal to login via device code flow
6. Return to web connection on dashboard
7. I can correctly verify that my permissions apply, i.e. deleting a job does not work, and this falls in line with my kubectl clusterrole bindings to the Azure AD group.
However when I try to use the az aks browse command to open the browser automatically like this, i.e. without kubectl proxy:
1. az login
2. az aks get-credentials --resource-group rg --name aks
3. az aks browse --resource-group rg --name aks
I keep getting the following error:
The client 'xxx' with object id 'yyyy' does not have authorization to perform action
'Microsoft.ContainerService/managedClusters/read' over scope
'/subscriptions/qqq/resourceGroups/rg/providers/Microsoft.ContainerService/managedClusters/aks'
or the scope is invalid. If access was recently granted, please refresh your credentials.
A dirty solution was to apply the Reader role on the AKS cluster for that AD group; then this issue goes away. But why does az aks browse require the Microsoft.ContainerService/managedClusters/read permission, and why is that not included in the Azure Kubernetes Service Cluster User Role?
What is happening here?
Currently, the command az aks browse --resource-group rg --name aks isn't working with more recent versions of AKS; you can find the full details here:
https://github.com/MicrosoftDocs/azure-docs/issues/23789
Also, your current problem might be that your user 'xxx' doesn't have the right IAM access level at the subscription/resource group level.
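If that turns out to be the case, a minimal sketch of granting read access at the cluster scope (the names and object id are placeholders) would be:

AKS_ID=$(az aks show --resource-group rg --name aks --query id -o tsv)
# Reader covers Microsoft.ContainerService/managedClusters/read, which az aks browse needs
az role assignment create --assignee <user-or-group-object-id> \
  --role "Reader" --scope "$AKS_ID"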
