I'm trying to implement Azure Key Vault such that API keys, credentials and other Kubernetes secrets are read into production and staging environments. Ultimately, I'd like to try to expand that to local development environments so devs don't have to mess with it at all. It is just read in when they start their cluster.
Anyway, I'm following this to enable Pod Identities:
https://learn.microsoft.com/en-us/azure/aks/use-azure-ad-pod-identity
When I get to this step, I'm modifying the:
az aks create -g myResourceGroup -n myAKSCluster --enable-managed-identity --enable-pod-identity --network-plugin azure
To the following because I'm trying to change an existing cluster:
az aks update -g myResourceGroup -n myAKSCluster --enable-managed-identity --enable-pod-identity --network-plugin azure
This doesn't work and figured out I need to run each flag one at a time, so I had to run --enable-managed-identity first since --enable-pod-identity depends on it.
At any rate, when I get to the --enable-pod-identity I get the following error:
Operation failed with status: 'Bad Request'. Details: Network plugin kubenet is not supported to use with PodIdentity addon.
So I try the --network-plugin azure and get:
az: error: unrecognized arguments: --network-plugin azure
Apparently this is flag is not available with update.
Poking around in the Azure portal for the AKS resource, I do see kubenet listed, but I'm not able to change it.
So, the question: Is it possible to change the Network Plugin on existing cluster or do I need to start a new?
EDIT: Looks like others are having similar issues on existing clusters:
https://github.com/Azure/AKS/issues/2094
Is it possible to change the Network Plugin on the existing cluster or do
I need to start a new?
It's impossible to change the network plugin on the existing cluster, so you need to create a new cluster and set the network plugin with azure at the creation time. You can find there is no parameter --network-plugin in the CLI command az aks update even if you install the aks-preview extension. It means it does not support changing the network plugin of the existing cluster.
Related
I removed two nodes of my Kubernetes cluster manually first calling "kubectl drain " and then "kubectl delete " for each. While the cluster seems to work without a problem the Azure UI shows me exactly two nodes more than I see when I use "kubectl get nodes". So when I configure Kubernetes to have 9 nodes in the Azure UI only 7 nodes are there if I take a look with kubectl. Scaling up or down does not solve the problem as Azure is always off by two nodes.
How can I solve this problem? Is there a way I can notify Azure that a node has been deleted?
If you want to solve the issue, you need to have a deeper understanding of the k8s cluster.
When you use the command kubectl delete to remove the node from the agent pool, it means the agent pool won't have control over that node. But it does not mean you really delete the machine. So you can find the number of the machine does not change in the Azure portal. This is the truth you find.
How can I solve this problem? Is there a way I can notify Azure that a
node has been deleted?
Here are two questions. For the first, you can express it in this way:
How to restore the node that deleted before to the agent pool?
It's simple to solve. You only need to restart the kubelet service in that node. For example, you use the VMSS as the agent pool of the AKS and that node instance id is 4. Then you can do it like this:
az vmss run-command invoke --resource-group group_name --name vmss_name --instance-id 4 --command-id RunShellScript --scripts "service kubelet restart"
For the second one, you can only use the Azure command to let Azure know the update. Here it means you can scale the agent pool, for example, using the Azure CLI command:
az aks nodepool --resource-group group_name --name agentpool_name --cluster-name cluster_name --node-count 2
I have k8s cluster on Azure and can not access the dashboard.
To access it I was doing aks browse --resource-group <res_group> --name <cluster_name>
It does not open after accidentally deleted the kube-dashboard pod.
Error:
Couldn't find the Kubernetes dashboard pod.
Did try to enable-disbale dashboard add-on on Azure.
Re-install k8s-dashboard. (Azure did not allow)
Any ideas on how to solve the issue and restart the dashboard?
Did find the following solution that worked for me:
Created another Azure k8s cluster. For each cluster Azure makes a dashboard
deployment.
Copied the yaml files with the command:
kubectl get deployment -n kube-system <kubernetes-dasboard-xxx>
for each "deployment, replicaSet, service and pod related to dashboard"
Recreated them into the old not working cluster.
Upgraded-downgraded the cluster version to re-deploy the objects.
Depends on your k8s version, AKS doesn't enable dashboard while creating a new cluster. You can find details in below link.
https://learn.microsoft.com/en-us/azure/aks/kubernetes-dashboard
And I suggest you, can directly install dashboard from kubernetes dashboard page, it is installing dashboard another namespace(it it better actually) and you can create and RBAC account to see all resources as an admin privileges.
https://github.com/kubernetes/dashboard
https://github.com/kubernetes/dashboard/blob/master/docs/user/access-control/creating-sample-user.md
And also you can use --enable-addons
https://learn.microsoft.com/en-us/azure/aks/kubernetes-dashboard
I am trying to experiment with Preview feature available in Azure AKS as per documentation available we need to have the following requirements
Kubernetes version 1.12.4 or later
Azure CLI version 2.0.55 or later.
add aks preview :- az extension add --name aks-preview
register scale set provider:- az feature register --name VMSSPreview --namespace Microsoft.ContainerService
ensure that it is registerd
created AKS cluster with terraform
when i try to apply following command
az aks update --resource-group rg-euwest-d04-dvag-001 --name k8s-euwest-d04-dvag-dfs-dfsapp-001 --enable-cluster-autoscaler --min-count 3 --max-count 5
error
Operation failed with status: 'Bad Request'. Details: AgentPool
'' has set auto scaling as enabled but is not on Virtual
Machine Scale Sets, this is not allowed
As per my understanding, it is not supported at this time through terraform or from Azure Portal but only possible from Azure CLI
Your cluster needs to be created via Azure CLI to enable autoscaling. So if you have created on evia Azure portal, you need to delete it and create new one through Azure CLI. Ref: https://github.com/MicrosoftDocs/azure-docs/issues/29199
I have a brand new kubernetes cluster on AKS.
I disabled the addons with the azure-cli as described in documentation:
az aks disable-addons --addons http_application_routing --name myAKSCluster --resource-group myResourceGroup --no-wait
The portal shows no domain associated to the cluster.
But with kubelets I still see all the pods and deployments related to the addon.
I tried to delete deployments and stuff with kubectl, but the deployments recreates themself.
Have anybody experienced the same?
Thanks!
there is a known issue with 1.12.6
Unable to disable addons on deployed clusters
AKS Engineering is diagnosing an issue around existing/deployed clusters being unable to disable Kubernetes addons within the addon-manager. When we have identified and repaired the issue we will roll out the required hot fix to all regions.
This impacts all addons including monitoring, http application routing, etc.
https://github.com/Azure/AKS/releases
All of our AKS clusters have the following error reported in Azure Portal:
This container service is in a failed state. Click here to open a new support request.
It seems we also cannot edit the cluster. When trying to scale out the nodes, I am getting the following error:
Failed to save container service 'test-aks'. Error: Operation is not allowed while cluster is being upgrading or failed in upgrade
When looking into the AKS properties, I see there is a provisioning state of "Failed":
We don't know how to troubleshoot this problem.
Use the az aks scale command to scale the cluster nodes using Azure CLI as described here: https://learn.microsoft.com/en-us/azure/aks/scale-cluster#scale-the-cluster-nodes
az aks show --resource-group myResourceGroup --name myAKSCluster --query agentPoolProfiles
This will show you the descriptive error message in Azure CLI. It is likely that you exceeded the limit for the core quota.
More details discussed on this thread: https://github.com/Azure/AKS/issues/542
For the issue that you shows:
This container service is in a failed state. Click here to open a new
support request.
It also happened to me. Usually, there is some limitation to the user for the use of resources. On my side, I just can use 10 vCpu. So I got the error when I scale up for more nodes if the vCpu have none left. I think it's also a possible reason for you. You can take a check.