Azure AKS - give entire cluster access to Azure Key Vault

I'm trying to find a way to give an entire AKS cluster access to Azure Key Vault. I have temporarily got this working by following the below process:
Go to the VMSS of the cluster -> Identity -> Set System Assigned Status to 'On'
Add this Managed identity as an access policy to Key Vault.
This works; however, whenever I stop and start the cluster, I have to re-create this managed identity and re-add it to Key Vault. I have tried using user-assigned identities for the VMSS as well, but that does not seem to work.
I also cannot use the Azure pod identity/CSI features for other reasons, so I'm just looking for a simple way to give my cluster permanent access to Key Vault.
Thanks in advance
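For reference, the workaround described above looks roughly like this from the CLI. This is only a sketch; the node resource group, VMSS name, and vault name are placeholders you would replace with your own:
# Turn on the system-assigned identity on the node pool VMSS (placeholder names)
az vmss identity assign --resource-group MC_myRG_myAKSCluster_eastus --name aks-nodepool1-12345678-vmss
# Read back its principal ID
PRINCIPAL_ID=$(az vmss identity show --resource-group MC_myRG_myAKSCluster_eastus --name aks-nodepool1-12345678-vmss --query principalId -o tsv)
# Add it to the Key Vault access policy with read access to secrets
az keyvault set-policy --name my-keyvault --object-id $PRINCIPAL_ID --secret-permissions get list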

A Pod is the smallest deployable unit in Kubernetes. A Pod is a group of one or more containers that are deployed together on the same host (node).
A Pod runs on a node, which is controlled by the control plane (master).
Pods use OS-level virtualization and consume VMSS resources as needed while they run.
When you stop the cluster/nodes, pods lose their resources and are removed, so there are no pods under the VMSS until you restart. When you restart your cluster/node, new pods are created with different names and different IP addresses.
From this GitHub discussion, I found that the MIC (Managed Identity Controller) removes the identity from the underlying VMSS when no pods are configured to use that identity. So you have to recreate the managed identity for the VMSS.
You can refer to this link for a better understanding of how to access Key Vault from Azure AKS.
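As a rough sketch of that approach (assuming a managed-identity AKS cluster; the resource group, cluster, and vault names are placeholders), you can instead grant the cluster's kubelet identity access to the vault, since that user-assigned identity is managed together with the cluster rather than being one you attach to the VMSS yourself:
# Look up the kubelet identity that the node pools use (placeholder names)
KUBELET_OBJECT_ID=$(az aks show --resource-group myResourceGroup --name myAKSCluster --query identityProfile.kubeletidentity.objectId -o tsv)
# Grant it read access to secrets in the vault
az keyvault set-policy --name my-keyvault --object-id $KUBELET_OBJECT_ID --secret-permissions get list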

Related

What happens to Azure AD Pod identity once the pod dies

I am planning to assign a pod identity to one of my applications. However, I am unable to understand what happens to the assigned pod identity when the pod restarts or dies on its own.
Does the pod get assigned a new identity on its own?
I'm not sure about your end-to-end setup, but if you are using it with a Service Account annotated for workload identity, the mapping will stay in place even if the pod restarts.
AZURE_AUTHORITY_HOST and the azure-identity-token volume get auto-injected again when the pod restarts. Instead of a bare pod you can also use a Deployment and attach the Service Account to it.
As mentioned in the official doc, it is a service-account-to-AAD mapping, so as long as your service account is referenced in the pod or Deployment config, it will get the secret and other values (see the sketch after the list below).
Azure AD Workload Identity supports the following mappings:
one-to-one (a service account referencing an AAD object)
many-to-one (multiple service accounts referencing the same AAD object).
one-to-many (a service account referencing multiple AAD objects by changing the client ID annotation).
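A minimal sketch of that mapping (this assumes Azure AD Workload Identity is already installed on the cluster; the namespace, service account name, and client ID are placeholders):
# Create a service account and annotate it with the AAD application's client ID
kubectl create serviceaccount workload-sa --namespace my-namespace
kubectl annotate serviceaccount workload-sa --namespace my-namespace azure.workload.identity/client-id=<AAD_APP_CLIENT_ID>
# Pods (or a Deployment's pod template) that set serviceAccountName: workload-sa get
# AZURE_AUTHORITY_HOST and the projected azure-identity-token volume re-injected on every restart.
# Note: newer releases of the mutating webhook may also require the pod label azure.workload.identity/use: "true".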

Which identity does AKS use to create Azure resources?

I would like to know under whose authority AKS is creating the resource.
I'm trying to create an Internal Loadbalancer in AKS, but it fails without permissions.
However, I don't know who to give that privilege to.
The account that connected to AKS or the managed identity of AKS ? Or something else ?
Is the account that connected to AKS in the first place the same as the account that creates the AKS resources ?
It would be great if you could tell me the source of the information as well, as I need the documentation to explain it to my boss.
Best regards.
I'm trying to create an Internal Loadbalancer in AKS, but it fails without permissions. However, I don't know who to give that privilege to. The account that connected to AKS or the managed identity of AKS? Or something else?
You will have to provide the required permissions to the managed identity of the AKS cluster. So for your requirement to create an ILB in AKS, you need to give the Network Contributor role to the identity.
You can refer to this Microsoft documentation on how to delegate access for AKS to other Azure resources.
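A rough sketch of that role assignment (the resource names and scope are placeholders; the scope should cover the VNet or resource group used by the internal load balancer):
# Find the cluster's control-plane managed identity (for a system-assigned identity cluster)
PRINCIPAL_ID=$(az aks show --resource-group myResourceGroup --name myAKSCluster --query identity.principalId -o tsv)
# Grant it Network Contributor on the networking scope
az role assignment create --assignee $PRINCIPAL_ID --role "Network Contributor" --scope /subscriptions/<subscription-id>/resourceGroups/<vnet-resource-group>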
Is the account that connected to AKS in the first place the same as the account that creates the AKS resources?
The account that connects to AKS is the same as the account that created the AKS resources from the Azure portal (a user account), but it is different when Azure resources are accessed from inside AKS (a managed identity / service principal).
For more information you can refer to this Microsoft documentation.

Kubernetes pods using invalid Azure Active Directory access tokens

When deploying new jobs and services to Azure Kubernetes Service cluster, the pods fail to request valid AAD access tokens with all permissions available. If new permissions were added on the same day, before or after a deployment, the tokens still do not pick them up. This issue has been observed with permissions granted to Active Directory Groups over Key Vaults, Storage Accounts, and SQL databases scopes so far.
Example: I have a .NET 5.0 C# API running on 3 pods with antiaffinity rules located each on a separate node. The application reads information from a SQL database. I made a release and added the database permissions afterwards. Things I have tried so far to make the application reset the access tokens:
kubectl delete pods --all -n <namespace> which essentially created 3 new pods again failing due to insufficient permissions.
kubectl apply -f deployment.yaml to deploy a new version of the image running in the containers, again all 3 pods kept failing.
kubectl delete -f deployment.yaml followed by kubectl apply -f deployment.yaml to erase the old kubernetes object and create a new one. This resolved the issue on 2/3 pods, however, the third one kept failing due to insufficient permissions.
kubectl delete namespace <namespace> to erase the entire namespace with all configuration available and recreated it again. Surprisingly, again 2/3 pods were running with the correct permissions and the last one did not.
The commands were run more than one hour after the permissions were added to the group. The database tokens are active for 24 hours, and when I have seen this issue occur with cronjobs, I had to wait 1 day for the task to execute correctly (none of the above steps worked in a cronjob scenario). The validity of the tokens kept changing, which implied that the pods are requesting new access tokens, again excluding the most recently added permissions. The only solution I have found that works 100% of the time is to destroy the cluster and recreate it, which is not viable in any production scenario.
The failing pod from my example was the one always running on node 00 which made me think there may be an extra caching layer on the first initial node of the cluster. However, I still do not understand why the other 2 pods were running with no issue and also what is the way to restart my pods or refresh the access token to minimise the wait time until resolution.
Kubernetes version: 1.21.7.
The cluster has no AKS-managed AAD or pod-identity enabled. All RBAC is granted to the cluster MSI via AD groups.
Please check whether the below can be worked around in your case.
To access the Kubernetes resources, you must have access to the AKS cluster, the Kubernetes API, and the Kubernetes objects. Ensure that you're either a cluster administrator or a user with the appropriate permissions to access the AKS cluster.
Things you need to do, if you haven't already:
Enable Azure RBAC on your existing AKS cluster, using:
az aks update -g myResourceGroup -n myAKSCluster --enable-azure-rbac
Create a role that allows read access to all other Pods and Services:
Add the necessary roles (Azure Kubernetes Service Cluster User Role, Azure Kubernetes Service RBAC Reader/Writer/Admin/Cluster Admin) to the user. See Microsoft Docs.
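For example, one of those role assignments could look roughly like this (the assignee object ID and resource names are placeholders):
# Scope the role to the AKS cluster resource
AKS_ID=$(az aks show --resource-group myResourceGroup --name myAKSCluster --query id -o tsv)
# Give a user or AAD group read access through Azure RBAC for Kubernetes authorization
az role assignment create --assignee <user-or-group-object-id> --role "Azure Kubernetes Service RBAC Reader" --scope $AKS_ID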
Also check Troubleshooting
Also check whether you need "Virtual Machine Contributor" and Storage Account Contributor on the resource group containing the pods, and see whether the namespace is specified for that pod in case you have missed it (Stack Overflow reference). Also check whether a firewall is restricting network access for that pod.
Resetting the kubeconfig context using the az aks get-credentials command may clear the previously cached authentication token for a given user:
az aks get-credentials --resource-group myResourceGroup --name myAKSCluster --overwrite-existing
Reference
Please do check Other References below:
kubernetes - Permissions error - Stack Overflow
create-role-assignments-for-users-to-access-cluster | microsoft docs
user can't access to AKS cluster with RBAC enabled (github.com)
kubernetes - Stack Overflow

AKS is not connecting to Key Vault when autoscaling is enabled. We used the AKS pool identity for connecting to Key Vault

Issue:
We have deployed AKS from an ARM template with managed identity, using the default kubenet networking. Autoscaling was disabled while deploying. We deployed the cluster with a two-node configuration.
Recently the utilisation of one of our nodes reached 95%. We have set the pod replica count to 1, so the pod is present on a single node only. To overcome this problem we tried to enable node autoscaling, keeping the node minimum at 2 and the maximum at 5. All of this autoscaling was enabled from the portal.
The main issue is that when we enable autoscaling, the agent pool's system-assigned managed identity gets disabled (refer to pic 1), due to which we are not able to connect to Key Vault.
Whenever the node gets scaled, the AKS pool's system-assigned managed identity gets disabled.
We are using the AKS pool's system-assigned managed identity and adding it to the Key Vault access policy; then we can get values from Key Vault. But when the nodes autoscale, the system-assigned managed identity gets disabled, so the Key Vault connection breaks. Is there any way to stay connected automatically even when the nodes autoscale?
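For reference, a quick way to inspect the state described above after a scale event (placeholder names; this only checks the VMSS, it is not a fix):
# Show whether the agent pool VMSS still has a system-assigned identity and what its principal ID is
az vmss identity show --resource-group MC_myRG_myAKSCluster_eastus --name aks-agentpool-12345678-vmss --query "{type:type, principalId:principalId}" -o table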

Error when trying to get Azure Kubernetes Service to use Cluster Load balancer from Service

I'm working to get Streamsets Data Collector running in Azure Kubernetes Service (AKS), and when I run kubectl .... the service appears to be up; however, it's giving this error. This is an RBAC AKS cluster, so I think I need to give the service principal permissions AND/OR do a cluster role binding to that service in Kubernetes. Any ideas?
The error shows invalid client. It probably means that the original service principal secret of your AKS cluster is invalid or expired. See the similar error here.
Following that link, you can find the original client secret used when you deployed the AKS cluster, so that you can re-add it as a key to the Service Principal. On the master and node VMs in the Kubernetes cluster, the service principal credentials are stored in the file /etc/kubernetes/azure.json.
On the VM page, go to Run command, choose RunShellScript, enter cat /etc/kubernetes/azure.json, and click "Run"; then you can find the property aadClientSecret.
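If you prefer the CLI over the portal and your node pool is a scale set, roughly the same check can be done with a run command against the node VMSS (the node resource group and VMSS name are placeholders):
# Read the stored service principal credentials from a node instance; look for aadClientId / aadClientSecret
az vmss run-command invoke --resource-group MC_myRG_myAKSCluster_eastus --name aks-nodepool1-12345678-vmss --instance-id 0 --command-id RunShellScript --scripts "cat /etc/kubernetes/azure.json"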
For more details, you could read Service principals with Azure Kubernetes Service (AKS)
