I wanted to install Kubeflow on Azure, so I started off creating an Azure Kubernetes Service (AKS) cluster with a single node (a B4MS virtual machine). During the installation, I didn't enable the virtual node pool option. After creating the AKS cluster, I ran the command "$ kubectl describe node aks-agentpool-3376354-00000" to check the specs. The allocatable number of Pods was 110 and I was able to install Kubeflow without any issues. However, some time later I wanted an AKS cluster with the virtual node pool enabled so I could use GPUs for training. So I deleted the old cluster and created a new AKS cluster with the same B4MS virtual machine and with the virtual node pool option enabled. This time when I ran the same command to describe the node specs, the allocatable number of Pods was 30 and the Kubeflow installation failed due to a lack of pods to provision.
Can someone explain why the number of allocatable Pods changes when the virtual node option is enabled or disabled? How do I keep the number of allocatable Pods at 110 while having the virtual node pool option enabled?
Thank you in advance!
Virtual node pools require the Advanced Networking configuration of AKS, which brings in the Azure CNI network plugin.
The default pod count per node on AKS when using Azure CNI is 30.
https://learn.microsoft.com/en-us/azure/aks/configure-azure-cni#maximum-pods-per-node
This is why you are now getting a maximum of 30 pods per node.
This can be increased when you provision your cluster with the Azure CLI.
https://learn.microsoft.com/en-us/cli/azure/ext/aks-preview/aks?view=azure-cli-latest#ext-aks-preview-az-aks-create
--max-pods -m
The maximum number of pods deployable to a node.
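For example, a minimal sketch of creating a cluster with a higher pod limit via the Azure CLI (the resource group, cluster name, and VM size below are placeholders, not taken from the question):
# placeholder names; the subnet must have enough free IPs for 110 pods per node with Azure CNI
az aks create \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --node-count 1 \
  --node-vm-size Standard_B4ms \
  --network-plugin azure \
  --max-pods 110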
What happens when I stop aks cluster and start?
Will my pods remain in the same state?
Do the node pool and nodes inside that stop?
Do the services inside the cluster still run and cost me if it is a load balancer?
Stopping the cluster will remove all the pods; starting it again will create new pods with the same names, but the IP addresses of the pods will change.
Pods are only scheduled once in their lifetime. Once a Pod is scheduled (assigned) to a Node, the Pod runs on that Node until it stops or is terminated.
Do the node pool and nodes inside that stop? Do the services inside the cluster still run and cost me if it is a load balancer?
Yes, it will stop the nodes and the complete node pool as well. Services inside the cluster will also stop, and they will not incur cost either.
Reference : https://learn.microsoft.com/en-us/azure/aks/start-stop-cluster?tabs=azure-cli
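For reference, a minimal sketch of the corresponding Azure CLI commands (the cluster and resource group names are placeholders):
# stop the cluster (the node pools and control plane are deallocated)
az aks stop --name myAKSCluster --resource-group myResourceGroup
# start it again later
az aks start --name myAKSCluster --resource-group myResourceGroup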
I have an Azure Kubernetes cluster running with Azure CNI (virtual network) as the network plugin. The cluster is running in one subnet of the network.
In another subnet, I have a virtual machine running with a private IP of 10.1.0.4.
Now I have a pod in the Kubernetes cluster that is trying to connect to the virtual machine, but it's not able to do so.
Also, ping 10.1.0.4 from inside the pod gives a timeout.
Please help me figure out what I am doing wrong so that I can connect the pod to the VM.
• You cannot directly establish communication between an AKS cluster pod and a virtual machine, as the IP assigned to a pod/node in an AKS cluster is drawn from a subset of the larger CIDR address range assigned while deploying the cluster. Communication between nodes within the cluster is readily possible, but communication with resources outside AKS is restricted, because it is governed by the Azure CNI framework policy, which directs traffic leaving the cluster in a regulated and conditional way.
• Thus, the above can only be achieved by introducing an intermediate service, such as an internal load balancer, between AKS and the VMs, since the CIDR ranges of the VM and the AKS pods are different. Leveraging the Azure plugin to deploy an internal load balancer as a service through AKS is the way to achieve communication between an AKS pod and a VM deployed in Azure.
To deploy the internal load balancer through YAML manifests in AKS for communication with VMs, refer to the link below for details:
https://fabriciosanchez-en.azurewebsites.net/implementing-virtual-machine-to-pod-communication-in-azure-kubernetes-service-aks/
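As a rough sketch (the service name, selector label, and port are assumptions for illustration, not taken from the question), an internal load balancer Service in AKS looks like this:
apiVersion: v1
kind: Service
metadata:
  name: internal-app                    # hypothetical service name
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  selector:
    app: internal-app                   # assumes your pods carry the label app=internal-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
The annotation asks the Azure cloud provider to give the load balancer a private IP inside the cluster's virtual network, which the VM at 10.1.0.4 can then reach.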
I know that with Azure AKS, the master components are fully managed by the service. But I'm a little confused when it comes to picking the node pools. I understand that there are two kinds of pools, system and user, where the user node pools host my application pods. I read in the official documentation that system node pools serve the primary purpose of hosting critical system pods such as CoreDNS and tunnelfront. And I'm aware that we can rely on system nodes alone to create and run our Kubernetes cluster.
My question: by "system node", do they mean the MASTER node? If so, why do we have the option not to create user nodes (worker nodes, by analogy)? As we know, in an on-prem Kubernetes solution we cannot create a cluster with master nodes only.
I'd appreciate any help.
System node pools in AKS do not contain master nodes. Master nodes in AKS are 100% managed by Azure and are outside your VNet. A system node pool contains worker nodes on which AKS automatically assigns the label kubernetes.azure.com/mode: system; that's about it. AKS then uses that label to deploy critical pods like tunnelfront, which is used to create a secure communication channel from your nodes to the control plane. You need at least one system node pool per cluster, and they have the following restrictions (see the sketch after this list for adding a user node pool alongside one):
System pools osType must be Linux.
System pools must contain at least one node, and user node pools may contain zero or more nodes.
System node pools require a VM SKU of at least 2 vCPUs and 4 GB of memory, but burstable VMs (B series) are not recommended.
System node pools must support at least 30 pods as described by the minimum and maximum value formula for pods.
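As an illustration (the resource group, cluster, and pool names are placeholders), a user node pool for application pods can be added alongside the default system pool with:
az aks nodepool add \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name userpool \
  --mode User \
  --node-count 2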
Kops lets us create a Kubernetes cluster along with a bastion that has SSH access to the cluster nodes.
With this setup is it still considered safe to use kubectl to interact with the Kubernetes API server?
kubectl can also be used to get a shell on the pods. Does this need any restrictions?
What are the precautionary steps that need to be taken if any?
Should the Kubernetes API server also be made accessible only through the bastion?
Deploying a Kubernetes cluster with the default Kops settings isn't secure at all and shouldn't be used in production as such. There are multiple configuration settings that can be adjusted using the kops edit command. The following points should be considered after creating a Kubernetes cluster via Kops (a combined example is sketched after the list):
Cluster Nodes in Private Subnets (existing private subnets can be specified using --subnets with the latest version of kops)
Private API LoadBalancer (--api-loadbalancer-type internal)
Restrict API Loadbalancer to certain private IP range (--admin-access 10.xx.xx.xx/24)
Restrict SSH access to Cluster Node to particular IP (--ssh-access xx.xx.xx.xx/32)
A hardened image can also be provisioned for the cluster nodes (--image)
The authorization mode must be RBAC. With the latest Kubernetes versions, RBAC is enabled by default.
Audit logs can be enabled via configuration in kops edit cluster, for example:
kubeAPIServer:
  auditLogMaxAge: 10
  auditLogMaxBackups: 1
  auditLogMaxSize: 100
  auditLogPath: /var/log/kube-apiserver-audit.log
  auditPolicyFile: /srv/kubernetes/audit.yaml
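Putting several of the flags above together, a hedged sketch of a more locked-down cluster creation (the cluster name, state store, zone, and IP ranges are placeholder assumptions):
kops create cluster \
  --name cluster.example.com \
  --state s3://my-kops-state-store \
  --zones us-east-1a \
  --topology private \
  --networking calico \
  --bastion \
  --api-loadbalancer-type internal \
  --admin-access 10.0.0.0/24 \
  --ssh-access 203.0.113.10/32 \
  --authorization RBAC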
Kops provides reasonable defaults, so the simple answer is: it is reasonably safe to use Kops-provisioned infrastructure as is after provisioning.