What is the maximum number of Horizontal Pod Autoscalers (HPAs) in AKS?
Background:
We have a service with 100 ports.
We are creating a container per port and using an HPA for each container.
Can we have an AKS cluster with 100 HPAs?
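For illustration, a minimal sketch of one such HPA (the names here are made up; each port's container is assumed to run in its own Deployment, so something like this would be repeated for each of the 100 ports):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: port-5001-hpa              # hypothetical name, one HPA per port
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: port-5001-deployment     # hypothetical Deployment serving port 5001
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70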
Related
I have an AKS cluster with a single node pool using the Azure CNI network plugin and a /21 subnet (around 2000 IPs) dedicated to this single node pool. The node pool is configured with 3 availability zones.
Consider I have another cluster with a single node pool with the same configuration as above, but with the node pool's availability zones set to None.
The workloads are the same in both node pools, but will the node pool deployed across 3 availability zones consume more IPs than the one deployed with availability zones set to None?
Availability zones are for high availability, separated by a physical boundary within the region. Yes, it will consume separate IPs for each node pool, but selecting more than one availability zone still creates the same (single) node pool and consumes the same number of IPs; the nodes are simply spread across the zones.
I have tested this in my environment: with 3 availability zones it occupied 3 IP addresses from the selected subnet.
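For comparison, a rough sketch of the two node pool configurations with the Azure CLI (resource names are placeholders); the only difference is the --zones flag, and both pools draw node IPs from the same subnet:

# Node pool spread across 3 availability zones
az aks nodepool add \
  --resource-group myRG \
  --cluster-name myAKS \
  --name zonedpool \
  --node-count 3 \
  --zones 1 2 3

# Node pool with availability zones set to None (no --zones flag)
az aks nodepool add \
  --resource-group myRG \
  --cluster-name myAKS \
  --name nozonepool \
  --node-count 3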
I have an Azure Kubernetes Service (AKS) cluster running with Azure CNI (virtual network) as the network plugin. The cluster is running in one subnet of the virtual network.
On another subnet, I have a virtual machine with a private IP of 10.1.0.4.
Now I have a pod in the Kubernetes cluster which is trying to connect to the virtual machine, but it is not able to do so.
Also, ping 10.1.0.4 from inside the pod times out.
Please help me figure out what I am doing wrong so that I can connect the pod to the VM.
• You cannot directly create communication between an AKS cluster pod and a virtual machine, as the IP assigned to a pod/node in an AKS cluster is a subset of the larger CIDR address range assigned while deploying the cluster. Communication within the cluster between the nodes is uninterrupted and readily possible, but the same with resources outside AKS is restricted, as it is governed by the Azure CNI framework policy, which directs the Kubernetes cluster to send traffic outbound of the cluster in a regulated and conditional way.
• Thus, the above can only be achieved by introducing an intermediate service, such as an internal load balancer, between the AKS cluster and the VMs, as the CIDR ranges of the VM and the AKS cluster are different. So, leveraging the Azure plugin to deploy an internal load balancer as a Service through AKS is the way to achieve communication between an AKS pod and a VM deployed in Azure.
To deploy the internal load balancer through YAML files in AKS for external communication with VMs, kindly refer to the link below for details:
https://fabriciosanchez-en.azurewebsites.net/implementing-virtual-machine-to-pod-communication-in-azure-kubernetes-service-aks/
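As a rough sketch (not taken from the linked article; the Service name, labels, and ports are hypothetical), an internal load balancer is exposed from AKS by annotating a Service of type LoadBalancer:

apiVersion: v1
kind: Service
metadata:
  name: my-app-internal                 # hypothetical Service name
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  selector:
    app: my-app                         # hypothetical pod label
  ports:
  - port: 80                            # port the VM will call
    targetPort: 8080                    # hypothetical container port

The Service then receives a private IP from the cluster's virtual network, which the VM can reach from its own subnet.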
I am using Kubernetes in Azure with Virtual Nodes; this is a plugin that creates virtual nodes using Azure Container Instances.
The instructions to set this up require creating an AKSNet/AKSSubnet, which seems to automatically come along with a VMSS called something like aks-control-xxx-vmss. I followed the instructions in the link below.
https://learn.microsoft.com/en-us/azure/aks/virtual-nodes-cli
This comes with a single-instance VM that I am being charged full price for regardless of the container instances I create, and I am charged extra for every container instance I provision onto my virtual node pool, even if they should fit on just one VM. These resources do not seem to be related.
I am currently disputing this as unexpected billing with Microsoft, but the process has been very slow, so I am asking here to find out if anyone else has had this experience.
The main questions I have are:
Can I use Azure Container Instances without the VMSS?
If not, can I somehow make this VM visible to my cluster so I can at least use it to provision containers onto and get some value out of it?
Have I just done something wrong?
Update, NB: this is not my control-plane node; that is a B2s on which I can see my system containers running.
Any advice would be a great help.
Can I use Azure Container Instances without the VMSS?
In an AKS cluster, you currently cannot have virtual nodes without a node pool of type VirtualMachineScaleSets or AvailabilitySet. An AKS cluster has at least one node, an Azure virtual machine (VM) that runs the Kubernetes node components and container runtime. [Reference] Every AKS cluster must contain at least one system node pool with at least one node. System node pools serve the primary purpose of hosting critical system pods such as CoreDNS, kube-proxy, and metrics-server. However, application pods can be scheduled on system node pools if you wish to have only one pool in your AKS cluster.
For more information on System Node Pools please check this document.
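As an aside, a quick way to see which of your node pools are System or User pools (resource names here are placeholders):

az aks nodepool list \
  --resource-group myRG \
  --cluster-name myAKS \
  --query "[].{Name:name, Mode:mode, Count:count}" \
  --output table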
In fact, if you run kubectl get pods -n kube-system -o wide you will see all the system pods running on the VMSS-backed node pool node including the aci-connector-linux-xxxxxxxx-xxxxx pod which connects the cluster to the virtual node, as shown below:
NAME                                  READY   STATUS    RESTARTS   AGE   IP            NODE                                NOMINATED NODE   READINESS GATES
aci-connector-linux-859d9ff5-24tgq    1/1     Running   0          49m   10.240.0.30   aks-nodepool1-29819654-vmss000000   <none>           <none>
azure-cni-networkmonitor-7zcvf        1/1     Running   0          58m   10.240.0.4    aks-nodepool1-29819654-vmss000000   <none>           <none>
azure-ip-masq-agent-tdhnx             1/1     Running   0          58m   10.240.0.4    aks-nodepool1-29819654-vmss000000   <none>           <none>
coredns-autoscaler-6699988865-k7cs5   1/1     Running   0          58m   10.240.0.31   aks-nodepool1-29819654-vmss000000   <none>           <none>
coredns-d4866bcb7-4r9tj               1/1     Running   0          49m   10.240.0.12   aks-nodepool1-29819654-vmss000000   <none>           <none>
coredns-d4866bcb7-5vkhc               1/1     Running   0          58m   10.240.0.28   aks-nodepool1-29819654-vmss000000   <none>           <none>
coredns-d4866bcb7-b7bzg               1/1     Running   0          49m   10.240.0.11   aks-nodepool1-29819654-vmss000000   <none>           <none>
coredns-d4866bcb7-fltbf               1/1     Running   0          49m   10.240.0.29   aks-nodepool1-29819654-vmss000000   <none>           <none>
coredns-d4866bcb7-n94tg               1/1     Running   0          57m   10.240.0.34   aks-nodepool1-29819654-vmss000000   <none>           <none>
konnectivity-agent-7564955db-f4fm6    1/1     Running   0          58m   10.240.0.4    aks-nodepool1-29819654-vmss000000   <none>           <none>
kube-proxy-lntqs                      1/1     Running   0          58m   10.240.0.4    aks-nodepool1-29819654-vmss000000   <none>           <none>
metrics-server-97958786-bmmv9         1/1     Running   1          58m   10.240.0.24   aks-nodepool1-29819654-vmss000000   <none>           <none>
However, you can deploy Azure Container Instances [How-to] without an AKS cluster altogether. For scenarios where you need full container orchestration, including service discovery across multiple containers, automatic scaling, and coordinated application upgrades, we recommend Azure Kubernetes Service (AKS).
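For example, a standalone container instance can be created directly with the Azure CLI, with no cluster involved (the resource group and container name below are placeholders):

# Run a single public container instance outside of any AKS cluster
az container create \
  --resource-group myRG \
  --name standalone-aci \
  --image mcr.microsoft.com/azuredocs/aci-helloworld \
  --ports 80 \
  --ip-address Public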
If not can I somehow make this VM visible to my cluster so I can at least use it to provision containers onto and get some value out of it?
Absolutely, you can. In fact, if you do a kubectl get nodes and the node from the VMSS-backed node pool (in your case aks-control-xxx-vmss-x) shows STATUS as Ready, then it is available to the kube-scheduler for scheduling workloads. Please check this document.
If you do a kubectl describe node virtual-node-aci-linux you should find the following in the output:
...
Labels:    alpha.service-controller.kubernetes.io/exclude-balancer=true
           beta.kubernetes.io/os=linux
           kubernetes.azure.com/managed=false
           kubernetes.azure.com/role=agent
           kubernetes.io/hostname=virtual-node-aci-linux
           kubernetes.io/role=agent
           node-role.kubernetes.io/agent=
           node.kubernetes.io/exclude-from-external-load-balancers=true
           type=virtual-kubelet
...
Taints:    virtual-kubelet.io/provider=azure:NoSchedule
...
In the document that you are following, to schedule the container on the virtual node, a nodeSelector and tolerations are defined in the Deploy a sample app section as follows:
...
      nodeSelector:
        kubernetes.io/role: agent
        beta.kubernetes.io/os: linux
        type: virtual-kubelet
      tolerations:
      - key: virtual-kubelet.io/provider
        operator: Exists
      - key: azure.com/aci
        effect: NoSchedule
If you remove this part from the Deployment manifest, or do not specify this part in the manifest of a workload that you are deploying, then the corresponding resource(s) will be scheduled on a VMSS-backed node.
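For instance, a minimal Deployment like the sketch below (name and image are placeholders) carries no virtual-node nodeSelector or toleration, so it will only ever land on the VMSS-backed node:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vmss-scheduled-app          # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vmss-scheduled-app
  template:
    metadata:
      labels:
        app: vmss-scheduled-app
    spec:
      containers:
      - name: app
        image: mcr.microsoft.com/azuredocs/aci-helloworld   # placeholder image
        ports:
        - containerPort: 80
      # no nodeSelector / tolerations: the virtual node's
      # virtual-kubelet.io/provider taint keeps this pod off it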
Have I just done something wrong?
Maybe you can evaluate the answer to this based on my responses to your earlier questions. However, here's a little more to help you understand:
If a node doesn't have sufficient compute resources to run a requested pod, that pod can't progress through the scheduling process. The pod can't start unless additional compute resources are available within the node pool.
When the cluster autoscaler notices pods that can't be scheduled because of node pool resource constraints, the number of nodes within the node pool is increased to provide the additional compute resources. When those additional nodes are successfully deployed and available for use within the node pool, the pods are then scheduled to run on them.
If your application needs to scale rapidly, some pods may remain in a state waiting to be scheduled until the additional nodes deployed by the cluster autoscaler can accept the scheduled pods. For applications that have high burst demands, you can scale with virtual nodes and Azure Container Instances.
This however does not mean that we can dispense with the VMSS or Availability Set backed node pools.
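If you do want the VMSS-backed pool itself to grow with demand, the cluster autoscaler can be enabled on it; a rough sketch with placeholder resource names:

# Let the VMSS-backed node pool scale between 1 and 5 nodes
az aks nodepool update \
  --resource-group myRG \
  --cluster-name myAKS \
  --name nodepool1 \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 5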
I am using AWS EKS (managed Kubernetes service) and Fargate (managed nodes) to deploy a pod running a Node.js React service on port 5000. The pod switches continuously from the "Running" state to the "Terminating" state immediately after deployment to Fargate. Eventually, it settles on "Running". Other pods are running fine on Fargate.
I am unable to view the logs due to Kubernetes reporting net/http: TLS handshake timeout.
The service is fronted by AWS Application Load Balancer (ALB). In the target group, I can see continuous registration and deregistration of the pod/node IP.
How can I troubleshoot this further?
Some ways to troubleshoot:
With kubectl, if your pods are run with a K8s deployment:
kubectl describe deployment <deployment-name> 👈 check for events
With kubectl, before the pod goes into Terminating
kubectl logs <pod-id>
kubectl describe pod <pod-id> 👈 check for events
Check the EKS control plane logs in the S3 bucket you are sending them to.
The idea here is to troubleshoot with the Kubernetes tools.
It appears the React service was taking a long time to start due to the compute allocation of 0.25 vCPU and 0.5 GB, and eventually failing after 10 minutes. We set the following resource requests and limits in the deployment manifest, and the pod now starts within a couple of minutes without problems.
        resources:
          limits:
            cpu: 1000m
            memory: 2000Mi
          requests:
            cpu: 800m
            memory: 1500Mi
I wanted to install Kubeflow on Azure, so I started off by creating an Azure Kubernetes Service (AKS) cluster with a single node (a B4MS virtual machine). During the installation, I didn't enable the virtual node pool option. After creating the AKS cluster, I ran the command "$ kubectl describe node aks-agentpool-3376354-00000" to check the specs. The allocatable number of pods was 110 and I was able to install Kubeflow without any issues. However, some time later I wanted an AKS cluster with the virtual node pool option enabled so I could use GPUs for training. So I deleted the old cluster and created a new AKS cluster with the same B4MS virtual machine and with the virtual node pool option enabled. This time when I ran the same command as above to describe the node specs, the allocatable number of pods was 30 and the Kubeflow installation failed due to a lack of pods to provision.
Can someone explain to me why the number of allocatable pods changes when the virtual node option is enabled or disabled? How do I keep the number of allocatable pods at 110 while having the virtual node pool option enabled?
Thank you in advance!
Virtual node pools require the use of the Advanced Networking configuration of AKS, which brings in the Azure CNI network plugin.
The default maximum pod count per node on AKS when using Azure CNI is 30 pods.
https://learn.microsoft.com/en-us/azure/aks/configure-azure-cni#maximum-pods-per-node
This is the main reason why you are now getting a maximum of 30 pods per node.
This can be updated to a bigger number when provisioning your cluster with the Azure CLI.
https://learn.microsoft.com/en-us/cli/azure/ext/aks-preview/aks?view=azure-cli-latest#ext-aks-preview-az-aks-create
--max-pods -m
The maximum number of pods deployable to a node.
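For example, a rough sketch of creating an Azure CNI cluster whose nodes allow 110 pods each (the resource names and subnet ID are placeholders):

az aks create \
  --resource-group myRG \
  --name myAKS \
  --network-plugin azure \
  --vnet-subnet-id <subnet-resource-id> \
  --max-pods 110 \
  --node-count 1
  # depending on your VNet ranges you may also need --service-cidr / --dns-service-ip

Keep in mind that with Azure CNI each pod consumes an IP address from the subnet, so the subnet must be large enough for roughly node count × max pods.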