Azure Kubernetes Service (AKS) and the primary node pool

Foreword
When you create a Kubernetes cluster on AKS, you specify the type of VMs you want to use for your nodes (--node-vm-size). I read that you can't change this after you create the cluster, which would mean that you'd be scaling horizontally instead of vertically whenever you add resources.
However, you can create different node pools in an AKS cluster that use different VM types for their nodes. So I thought: if you want to "change" the VM type you chose initially, maybe you could add a new node pool and remove the old one ("nodepool1")?
I tried that through the following steps:
Create a node pool named "stda1v2" with a VM type of "Standard_A1_v2"
Delete "nodepool1" (az aks nodepool delete --cluster-name ... -g ... -n nodepool1)
Unfortunately I was met with the error "Primary agentpool cannot be deleted".
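Concretely, the attempt above corresponds to commands roughly like the following (the resource group and cluster names are placeholders, not from the original question):

```shell
# Add a replacement pool with the desired VM size (names are illustrative)
az aks nodepool add \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name stda1v2 \
  --node-vm-size Standard_A1_v2 \
  --node-count 1

# Then try to remove the original pool -- this is the step that fails
az aks nodepool delete \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name nodepool1
```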
Question
What is the purpose of the "primary agentpool" which cannot be deleted, and does it matter (a lot) what type of VM I choose when I create the AKS cluster (in a real world scenario)?
Can I create other node pools and let the primary one live its life? Will it cause trouble in the future if I have node pools that use larger VMs for their nodes while the primary one still uses "Standard_A1_v2", for example?

The primary node pool is the first node pool in the cluster, and you cannot delete it because that is currently not supported. You can create and delete additional node pools and just leave the primary one as it is. It will not cause any trouble.
For the primary node pool I suggest picking a VM size that makes sense in the long run (since you cannot change it). The B-series would be a good fit, since those VMs are cheap and their CPU/memory ratio is good for average workloads.
P.S. You can always scale the primary node pool to 0 nodes, or cordon the node and shut it down. You will have to repeat this after each upgrade, but otherwise it will work.
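A sketch of that workaround with the az CLI (cluster, pool, and node names are placeholders; whether scaling a system pool to 0 is accepted may depend on your API version):

```shell
# Scale the primary pool down, as suggested above
az aks nodepool scale \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name nodepool1 \
  --node-count 0

# Alternatively, keep the node but stop scheduling new pods onto it
kubectl cordon aks-nodepool1-12345678-vmss000000
```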

It looks like this functionality was introduced around the time of your question, allowing you to add new system nodepools and delete old ones, including the initial nodepool. After encountering the same error message myself while trying to tidy up a cluster, I discovered I had to set another nodepool to a system type in order to delete the first.
There's more info about it here, but in short, Azure node pools are split into two types ('modes', as they call them): System and User. When you create a single pool to begin with, it will be of the System type (favouring system pod scheduling), so it might be good to have a dedicated pool of a node or two for system use, then a second User node pool for the actual app pods.
So if you wish to delete your only system pool, you need to first create another nodepool with the --mode switch set to 'system' (with your preferred VM size etc.), then you'll be able to delete the first (and nodepool modes can't be changed after the fact, only on creation).
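Based on that, the migration could look roughly like this (all names, sizes, and counts below are placeholders, not values from the question):

```shell
# 1. Create a replacement pool in System mode with the VM size you want
az aks nodepool add \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name sysnewpool \
  --mode System \
  --node-vm-size Standard_D2s_v3 \
  --node-count 2

# 2. Now that another System pool exists, the original pool can be deleted
az aks nodepool delete \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name nodepool1
```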

Related

AKS nodepool in a failed state, PODS all pending

Yesterday I was using kubectl in my command line and was getting this message after trying any command. Everything was working fine the previous day and I had not touched anything in my AKS cluster.
Unable to connect to the server: x509: certificate has expired or is not yet valid: current time 2022-01-11T12:57:51-05:00 is after 2022-01-11T13:09:11Z
After doing some googling to solve this issue, I found a guide about rotating certificates:
https://learn.microsoft.com/en-us/azure/aks/certificate-rotation
Following the rotation guide fixed my certificate issue; however, all my pods were still in a pending state, so I then followed this guide: https://learn.microsoft.com/en-us/azure/aks/update-credentials
Then one of my node pools, which is of type User, started working again, but the one of type System is still in a failed state with all pods pending.
I am not sure of the next steps I should be taking to solve this issue. Does anyone have any recommendations? I was going to delete the nodepool and make a new one but I can't do that either because it is the last system node pool.
I assume you are using an API version older than 2020-03-01 for creating the AKS cluster.
A few limitations apply when you create and manage AKS clusters that support system node pools:
• An API version of 2020-03-01 or greater must be used to set a node pool mode. Clusters created on API versions older than 2020-03-01 contain only user node pools, but can be migrated to contain system node pools by following the update pool mode steps.
• The mode of a node pool is a required property and must be explicitly set when using ARM templates or direct API calls.
You can use the Bicep/JSON code provided in the MS document to create the AKS cluster, as it uses an upgraded API version.
You can also follow this MS document if you want to create a new AKS cluster with a system node pool or add a dedicated system node pool to an existing AKS cluster.
The following command adds a dedicated node pool of mode System with a count of three nodes.
az aks nodepool add \
--resource-group myResourceGroup \
--cluster-name myAKSCluster \
--name systempool \
--node-count 3 \
--node-taints CriticalAddonsOnly=true:NoSchedule \
--mode System

AKS nodepool vmSize upgrades vs nodepool delete

I have a case where I want to perform an in-place upgrade of an AKS cluster node pool's vmSize; deleting the full cluster is not possible.
One alternative I have looked into is to perform `az aks nodepool delete` and then recreate the pool with a new vmSize.
The question here is: what is really happening under the hood? Drain all nodes and then delete?
Should we first drain all the nodes in sequence, and then run the command, to achieve zero downtime? We are running multiple node pools.
Any other suggestions?
Why don't you add a new node pool, migrate your workload to the new node pool, and then delete the old one?
You could also import this new node pool with Terraform if you use it.
Or is it the system node pool you are talking about?
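A rough sketch of that migration (pool names, VM size, and node count are placeholders; the drain flags you need depend on your workloads):

```shell
# 1. Add the new pool with the desired VM size
az aks nodepool add \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name newpool \
  --node-vm-size Standard_D4s_v3 \
  --node-count 3

# 2. Cordon and drain each old node so workloads move to the new pool
#    (AKS labels nodes with agentpool=<pool-name>)
for node in $(kubectl get nodes -l agentpool=oldpool -o name); do
  kubectl cordon "$node"
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
done

# 3. Delete the old pool once its workloads have been rescheduled
az aks nodepool delete \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name oldpool
```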

MongoDB change stream - Duplicate records / Multiple listeners

My question is an extension of the earlier discussion here:
Mongo Change Streams running multiple times (kind of): Node app running multiple instances
In my case, the application is deployed on Kubernetes pods. There will be at least 3 pods and a maximum of 5 pods. The solution mentioned in the above link suggests using <this instance's id> in the $mod operator. Since the application is deployed to K8s pods, pod names are dynamic. How can I achieve a similar solution for my scenario?
If you are running a stateless workload, I am not sure why you want to fix the name of the pod (Deployment).
Fixed pod names are only possible with StatefulSets.
You should be using a StatefulSet instead of a Deployment or replication controller (RC); note that replication controllers have been replaced by ReplicaSets anyway.
StatefulSet pods have a unique identity comprised of an ordinal. For any StatefulSet with N replicas, each pod in the StatefulSet will be assigned an integer ordinal, from 0 up through N-1, which is unique across the set.
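With a StatefulSet, each pod can derive a stable instance id from the ordinal at the end of its hostname and plug it into the $mod filter. A minimal sketch (the pod name "listener-2" and replica count are assumptions for illustration):

```shell
# In a StatefulSet pod the hostname ends with the pod's ordinal,
# e.g. "listener-2" for the third replica ("listener" is an assumed name).
POD_NAME="listener-2"   # inside a pod this would come from $(hostname)
REPLICAS=3              # assumed total number of listener pods

# Strip everything up to the last "-" to get the ordinal
ORDINAL="${POD_NAME##*-}"

# Each listener then filters the change stream so exactly one pod handles
# each document, e.g. with a $mod match on a numeric key (hypothetical field):
#   { $expr: { $eq: [ { $mod: [ "$fullDocument.shardKey", 3 ] }, <ORDINAL> ] } }
echo "pod ordinal: $ORDINAL of $REPLICAS replicas"
```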

Storage account connectivity method for AKS

I'm setting up a storage account so I can dynamically create and use a persistent volume with Azure Files in Azure Kubernetes Service (AKS). I'm doing this to have:
A PV and PVC for the database
A place to store the application files
AKS does create a storage account in the MC_<resource-group>_<aks-name>_<region> resource group that is automatically created. However, that storage account is destroyed if the node size/VM is changed (not the node count), so it shouldn't be used, since you'll lose your files and database if you ever need a node size/VM with more resources.
Neither this documentation nor any other I've come across says what the best practice is for the connectivity method:
Public endpoint (all networks)
Public endpoint (selected networks)
Private endpoint
The first option sounds like a bad idea.
The second option allows me to select a virtual network, and there are two choices:
MC_<resource-group>_<aks-name>_<region>... again, this doesn't seem like a good idea, because if the node size/VM is changed, the connection will be broken.
aks-vnet-<number>... not sure what this is, but it looks like it is part of the previous resource group, so it will also be destroyed in the previously mentioned scenario.
The third option contains a number of options, some of which are included in the second option.
So how should I securely set this up for AKS to share files with the application and persist database files?
EDIT
Looking at both the "Firewalls and virtual networks" and "Private endpoint connections" settings for the storage account that comes with the AKS node, it looks like it is just set up for "All networks"... so maybe having that be where my actual PV and PVC are stored isn't such an issue...? Could use some clarity on the topic.
Not sure where the problem lies. All the assets generated by AKS are tied to the AKS lifecycle; if you delete AKS, it will delete the MC_* resource group (and that is 100% right). Not sure what you mean about the storage account being destroyed; it wouldn't get destroyed unless you remove the PVC and the reclaim policy is set to delete.
Reading: https://learn.microsoft.com/en-us/azure/aks/azure-files-dynamic-pv
As for the networking part, "selected networks" with the AKS nodes' network selected should be the way to go. You can figure that network out by looking at the AKS nodes or the AKS agent pool definition(s). I don't think this is configurable using only Kubernetes primitives, so it would be a manual/scripted action after the storage account is created.
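That manual step can be scripted with the az CLI; a sketch, assuming illustrative vnet/subnet and account names (yours will differ, and the subnet needs the Microsoft.Storage service endpoint enabled):

```shell
# Allow the AKS nodes' subnet through the storage account firewall
az storage account network-rule add \
  --resource-group myResourceGroup \
  --account-name mystorageaccount \
  --vnet-name aks-vnet-12345678 \
  --subnet aks-subnet

# Then deny all other traffic by default
az storage account update \
  --resource-group myResourceGroup \
  --name mystorageaccount \
  --default-action Deny
```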

Is it possible to change subnet in Azure AKS deployment?

I'd like to move an instance of Azure Kubernetes Service to another subnet in the same virtual network. Is it possible or the only way to do this is to recreate the AKS instance?
No, it is not possible; you need to redeploy AKS.
Edit, 08.02.2023: it's actually possible to some extent now: https://learn.microsoft.com/en-us/azure/aks/configure-azure-cni-dynamic-ip-allocation#configure-networking-with-dynamic-allocation-of-ips-and-enhanced-subnet-support---azure-cli
I'm not sure it can be updated on an existing cluster without recreating it (or the node pool), though.
I know it's an old thread, but I'm responding in case someone finds it useful. You cannot change the subnet of the AKS cluster directly. However, you can always change the subnets of the underlying components. In our case, we had a simple setup of 2 nodes and a LoadBalancer. We created a new subnet and changed the subnets on these individual components. It worked for us, but do check the services and the pods to ensure everything keeps working correctly.
