Azure AKS failed to create

I'm trying to create an AKS cluster with this command:
az aks create --node-vm-size Standard_A2 --resource-group dev --name cluster --node-count 1 --generate-ssh-keys --debug
It successfully creates the AD app for the cluster.
However, it then shows this error:
Operation failed with status: 'Bad Request'. Details: Service
principal clientID: not found in Active Directory tenant
.
The clientId is the ID of the app it created in AD.
I have no idea where it gets the tenant GUID from.
Does anybody know how I can solve this issue?
Info about my subscription:
One account, one directory (Default), two subscriptions (trial expired, and bizspark one).

In my experience, I had to pass the clientId and clientSecret to the az aks command to be able to create the AKS cluster. I don't think it's a permissions issue (I definitely have permission to create new service principals in my subscriptions), but rather a bug.
az aks create --resource-group aks --name aks --location westeurope --service-principal guid --client-secret 'secret'
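A minimal sketch of that workaround as a reusable shell function, assuming you have already created a service principal yourself (for example with az ad sp create-for-rbac, whose output includes appId and password fields). The function name create_aks_with_sp and its argument order are illustrative and not from the original answer.

```shell
# Hypothetical helper: create an AKS cluster with an explicitly supplied
# service principal instead of letting the CLI create one.
# Usage: create_aks_with_sp <resource-group> <cluster-name> <appId> <password>
create_aks_with_sp() {
  az aks create \
    --resource-group "$1" \
    --name "$2" \
    --node-count 1 \
    --generate-ssh-keys \
    --service-principal "$3" \
    --client-secret "$4"
}
```

For the cluster in the question this would be called as create_aks_with_sp dev cluster "<appId>" "<password>", with the two placeholders filled in from the service principal's output.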

Related

az aks with --admin switch does not require a password?

If I connect to my AKS cluster with,
az aks get-credentials --resource-group <rgname> --name <clustername> --admin
it does not require any credentials. Is this expected? Or is it using my "az login" credentials and passing them through? My cluster is enabled for AD access, but I was reading that the --admin flag can be used to force it to use the k8s admin. Should this be blocked for security reasons?
Sorry, quite new to AKS and Kubernetes.
Yes, the command below does not require any additional credentials to connect to AKS. Being signed in with az login is enough for anyone who has access to the subscription in which the AKS cluster was created.
az aks get-credentials --resource-group <rgname> --name <clustername> --admin
--admin flag can be used to force it to use the k8s admin. Should this be blocked for security reasons?
Yes, you are correct: this should be blocked for security purposes. Unfortunately, turning --admin access on or off with a simple switch in the az aks commands is still in preview, so it is not recommended for production use as of now.
For more information on how to disable local user accounts (--admin) in Azure Kubernetes Service, you can refer to this document.
There is also a workaround given in this GitHub discussion, which you can go through as well.
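As a rough sketch of what disabling the local admin account looks like, assuming your Azure CLI version exposes the --disable-local-accounts flag on az aks update (the answer above describes this capability as preview, so it may need a newer CLI or an extension). The function name disable_local_accounts is illustrative.

```shell
# Hypothetical helper: turn off local accounts on an existing cluster.
# Assumes the installed az CLI supports --disable-local-accounts.
# Usage: disable_local_accounts <resource-group> <cluster-name>
disable_local_accounts() {
  az aks update \
    --resource-group "$1" \
    --name "$2" \
    --disable-local-accounts
}
```

With local accounts disabled, az aks get-credentials --admin is expected to stop handing out a usable admin kubeconfig, so make sure AD-based access works before trying this.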

Permissions to create AKS cluster with Gateway

Looking into using AKS but having some issues/queries when creating.
I can create a cluster using the following CLI command:
az aks create --resource-group myRG --name aks --node-count 2 --generate-ssh-keys --enable-managed-identity --attach-acr myACR
And this creates it fine without any issues. I want to add Application Gateway which is causing me issues and have narrowed it down to the following call:
az aks create --resource-group myRG --name aks --node-count 2 --generate-ssh-keys --enable-managed-identity --attach-acr myACR --network-plugin azure
Setting --network-plugin to azure makes it fail with the following error:
ValidationError: Directory permission is needed for the current user to register the application. For how to configure, please refer 'https://learn.microsoft.com/azure/azure-resource-manager/resource-group-create-service-principal-portal'. Original error: Insufficient privileges to complete the operation.
Setting the --debug flag on the call displays the following:
urllib3.connectionpool : Starting new HTTPS connection (1): graph.windows.net:443
urllib3.connectionpool : https://graph.windows.net:443 "POST /1f141cfd-a6c5-4e9a-bf84-7116c141e5f4/applications?api-version=1.6 HTTP/1.1" 403 219
msrest.http_logger : Response status: 403
...
msrest.http_logger : {"odata.error":{"code":"Authorization_RequestDenied","message":{"lang":"en","value":"Insufficient privileges to complete the operation."},"requestId":"65e9cb16-4df1-4824-bd33-3ee6f691ed07","date":"2020-10-19T10:58:49"}}
So I can see that it's because I do not have permissions to create an app in Azure AD.
I do not have access to Azure AD as it's a corporate one, but wondering the following:
What are the lowest permissions required for me to be able to create AKS in AD?
Can the Azure AD owner create me an application for me to then use? How would I reference this when creating the cluster?
Is Managed Identities preferred over Service Principals? Can managed identities be used with existing AD objects?
I have a feeling some of my questions show my lack of understanding of how permissions work with Azure AKS/AD, but I am having trouble because there are not a lot of readable errors (I can't even access the Azure AD pane within the portal).

How do I get my AKS cluster to authenticate to my ACR?

A few weeks ago, I was able to use the Azure CLI to create my Container Registry (ACR) and Kubernetes (AKS) cluster. I could push images to my ACR and have AKS pull images successfully - everything worked great. Every now and then, I would have to refresh my login with az acr login --name <acrName>, but not a big deal.
Today, I found that when I go to deploy an updated image to my AKS cluster, I got a status of ImagePullBackOff:
Failed to pull image "MY_ACR.azurecr.io/MY_IMAGE:v1": rpc error: code = Unknown desc = Error response from daemon: Get https://MY_ACR.azurecr.io/v2/MY_IMAGE/manifests/v1: unauthorized: authentication required, visit https://aka.ms/acr/authorization for more information.
I couldn't remember what I needed to do to make this work, so I went through my original steps and created an entirely new resource group, ACR, AKS cluster, and service principal connecting them. I pushed images to my ACR and was able to apply my Kubernetes manifest, and everything worked again.
A couple hours later, when I applied an updated manifest, I again got the same error message. As part of my setup, I created a service principal:
az ad sp create-for-rbac --skip-assignment
az role assignment create --assignee <principal's appId> --scope <my ACR's id> --role Reader
I also used --role acrpull. It seems like the authentication has timed out, and the documentation for Authenticate with an Azure container registry says that individual AD identities will time out after 3 hours, but even after running az acr login --name <acrName>, I'm not able to fix the issue.
What are the required steps to get my AKS cluster to be able to authenticate again to my ACR?
I'll note that I also attached the ACR according to the documentation at Authenticate with Azure Container Registry from Azure Kubernetes Service by running:
az aks update -n cluster_name -g resource_group --attach-acr acr_name
I also tried using the ACR id instead of the name. After a minute or so, the command completed, and even a half hour+ later, I get the same permissions issue.
The easiest way to integrate AKS with ACR is to use the --attach-acr option during cluster creation. This has AKS manage the service principal for you and handle the token refreshes.
https://learn.microsoft.com/en-us/azure/aks/cluster-container-registry-integration#create-a-new-aks-cluster-with-acr-integration
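For an existing cluster, a sketch of re-attaching the registry and then checking that the nodes can reach it. This assumes az aks check-acr is available in your CLI version and that the registry uses the default <name>.azurecr.io login server; the function name reattach_and_check_acr is illustrative.

```shell
# Hypothetical helper: attach an ACR to an existing cluster, then validate
# that the cluster can authenticate to it.
# Usage: reattach_and_check_acr <resource-group> <cluster-name> <acr-name>
reattach_and_check_acr() {
  # Grants the cluster's identity pull access on the registry.
  az aks update --resource-group "$1" --name "$2" --attach-acr "$3" &&
  # Checks registry access from the cluster; assumes the default login server.
  az aks check-acr --resource-group "$1" --name "$2" --acr "${3}.azurecr.io"
}
```

Note that az acr login only refreshes the credentials of your local Docker client. It does not change what the cluster nodes use to pull images.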

Azure AKS User Credentials Login to K8 Dashboard and RBAC Built-in Roles

According to the documentation, Azure Kubernetes Service Cluster User Role allows access to Microsoft.ContainerService/managedClusters/listClusterUserCredential/action API call only.
My user is part of an AD group that has Azure Kubernetes Service Cluster User Role permissions on the AKS cluster and all the cluster role and cluster role bindings have been applied via kubectl.
I can double check and verify that access to dashboard and permissions work with these steps:
1. az login
2. az aks get-credentials --resource-group rg --name aks
3. kubectl proxy
4. Open web connection
5. Get prompt on terminal to login via device code flow
6. Return to web connection on dashboard
7. I can correctly verify that my permissions apply, i.e. deleting a job does not work and this falls in line with my kubectl clusterrole bindings to the Azure AD group.
However when I try to use the az aks browse command to open the browser automatically like this, i.e. without kubectl proxy:
1. az login
2. az aks get-credentials --resource-group rg --name aks
3. az aks browse --resource-group rg --name aks
I keep getting the following error:
The client 'xxx' with object id 'yyyy' does not have authorization to perform action
'Microsoft.ContainerService/managedClusters/read' over scope
'/subscriptions/qqq/resourceGroups/rg/providers/Microsoft.ContainerService/managedClusters/aks'
or the scope is invalid. If access was recently granted, please refresh your credentials.
A dirty solution was to apply the Reader role on the AKS cluster for that AD group. The issue then goes away, but why does az aks browse require the Microsoft.ContainerService/managedClusters/read permission, and why is that not included in the Azure Kubernetes Service Cluster User Role?
What is happening here?
Currently, the command
az aks browse --resource-group rg --name aks
isn't working with more recent versions of AKS. You can find the full details here:
https://github.com/MicrosoftDocs/azure-docs/issues/23789
Your current problem might also be that your user XXX doesn't have the right IAM access level at the subscription or resource group level.
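The Reader workaround mentioned in the question can be sketched as follows, scoped to the cluster resource only rather than the whole resource group. The function name grant_cluster_reader is illustrative, and the third argument is assumed to be the object ID of the AD group.

```shell
# Hypothetical helper: grant the built-in Reader role on a single AKS cluster.
# Usage: grant_cluster_reader <resource-group> <cluster-name> <assignee-object-id>
grant_cluster_reader() {
  # Look up the full resource ID of the cluster to use as the role scope.
  cluster_id=$(az aks show --resource-group "$1" --name "$2" --query id --output tsv) || return 1
  az role assignment create \
    --assignee "$3" \
    --role Reader \
    --scope "$cluster_id"
}
```

Scoping the assignment to the cluster ID keeps the extra read access as narrow as possible while still covering Microsoft.ContainerService/managedClusters/read.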

Renewing ACS Kubernetes cluster

I have an ACS Kubernetes cluster running on Azure VMSS. I recently renewed my ACS service principal by adding the new key to /etc/kubernetes/azure.json on the master and worker nodes and restarting them. The issue is that new nodes created as part of scaling do not get the new service principal key.
Updating azure.json is not enough.
To update your cluster with new credentials, you should use the az aks update-credentials command:
az aks update-credentials \
--resource-group myResourceGroup \
--name myAKSCluster \
--reset-service-principal \
--service-principal $SP_ID \
--client-secret $SP_SECRET
After that, the cluster autoscaler will use the updated principal for new instances.
Update:
For an ACS cluster, you have to update the service principal manually on each worker node.
Alternatively, you can use a custom script extension, which you can integrate with an Azure Resource Manager template or run through the Azure Virtual Machines REST API.
