On AKS, are system node pools the same as master nodes? - azure

I know that with Azure AKS, the master components are fully managed by the service. But I'm a little confused when it comes to picking the node pools. I understand that there are two kinds of pools, system and user, where user node pools host my application pods. I read in the official documentation that system node pools serve the primary purpose of hosting critical system pods such as CoreDNS and tunnelfront, and I'm aware that we can rely on system nodes alone to create and run our Kubernetes cluster.
My question: do they mean by "system node" the MASTER node? If so, why do we have the option to not create user node pools (worker nodes, by analogy)? Because, as we know, in an on-prem Kubernetes solution we cannot create a Kubernetes cluster with master nodes only.
I'd appreciate any help.

System node pools in AKS do not contain master nodes. Master nodes in AKS are 100% managed by Azure and sit outside your VNet. A system node pool contains worker nodes to which AKS automatically assigns the label kubernetes.azure.com/mode: system; that's about it. AKS then uses that label to deploy critical pods like tunnelfront, which is used to create a secure communication channel from your nodes to the control plane. You need at least one system node pool per cluster, and system pools have the following restrictions (see the CLI sketch after the list):
System pools osType must be Linux.
System pools must contain at least one node, and user node pools may contain zero or more nodes.
System node pools require a VM SKU of at least 2 vCPUs and 4 GB of memory; burstable VMs (B series) are not recommended.
System node pools must support at least 30 pods as described by the minimum and maximum value formula for pods.
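For illustration, adding a user node pool next to the system pool could look roughly like this Azure CLI sketch (resource group, cluster and pool names are placeholders):
az aks nodepool add \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name userpool \
  --mode User \
  --node-count 3
Note that application pods can still be scheduled onto the system pool unless you taint it; the user pool simply gives you a place to run and scale application workloads separately from the critical system pods.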

Related

Can I stop an AKS cluster and start it again?

What happens when I stop an AKS cluster and start it again?
Will my pods remain in the same state?
Do the node pools and the nodes inside them stop?
Do the services inside the cluster still run and cost me money, for example if one of them is a load balancer?
Stopping the cluster will remove all the pods; starting it again will create new pods with the same names, but the pod IP addresses will change.
Pods are only scheduled once in their lifetime. Once a Pod is scheduled (assigned) to a Node, the Pod runs on that Node until it stops or is terminated.
Do the node pools and the nodes inside them stop? Do the services inside the cluster still run and cost me money, for example a load balancer?
Yes, it will stop the nodes and the complete node pool as well. Services inside the cluster will also stop, and they will not incur cost while stopped.
Reference: https://learn.microsoft.com/en-us/azure/aks/start-stop-cluster?tabs=azure-cli
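For reference, the stop and start operations are plain Azure CLI calls; a minimal sketch (cluster and resource group names are placeholders) looks like:
az aks stop --name myAKSCluster --resource-group myResourceGroup
az aks start --name myAKSCluster --resource-group myResourceGroup
az aks show --name myAKSCluster --resource-group myResourceGroup --query powerState
The last command should report the cluster's current power state, so you can confirm it actually stopped before assuming billing for the nodes has ended.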

AKS Cluster with virtual node enabled and without virtual node enabled

I wanted to install Kubeflow on Azure, so I started off by creating an Azure Kubernetes Service (AKS) cluster with a single node (a B4MS virtual machine). During the installation I didn't enable the virtual node option. After creating the AKS cluster, I ran "$ kubectl describe node aks-agentpool-3376354-00000" to check the specs. The allocatable number of pods was 110, and I was able to install Kubeflow without any issues. However, some time later I wanted an AKS cluster with virtual nodes enabled so I could use GPUs for training. So I deleted the old cluster and created a new AKS cluster with the same B4MS virtual machine and with the virtual node option enabled. This time, when I ran the same command to describe the node specs, the allocatable number of pods was 30 and the Kubeflow installation failed due to a lack of allocatable pods.
Can someone explain why the number of allocatable pods changes when the virtual node option is enabled or disabled? How do I keep the number of allocatable pods at 110 while having the virtual node option enabled?
Thank you in advance!
Virtual nodes require the advanced networking configuration of AKS, which brings in the Azure CNI network plugin.
The default pod count per node on AKS when using Azure CNI is 30 pods.
https://learn.microsoft.com/en-us/azure/aks/configure-azure-cni#maximum-pods-per-node
This is the main reason why you are now getting a maximum of 30 pods per node.
This can be increased when using the Azure CLI to provision your cluster (see the sketch below).
https://learn.microsoft.com/en-us/cli/azure/ext/aks-preview/aks?view=azure-cli-latest#ext-aks-preview-az-aks-create
--max-pods -m
The maximum number of pods deployable to a node.
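As a hedged example, creating a cluster with Azure CNI and a higher pod limit might look like the following (all names, the subnet ID and the VM size are placeholders; with Azure CNI the per-node limit can be raised up to 250):
az aks create \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --network-plugin azure \
  --vnet-subnet-id <subnet-resource-id> \
  --node-vm-size Standard_B4ms \
  --node-count 1 \
  --max-pods 110
Note that the maximum pods value is applied when the cluster or node pool is created, so it is easiest to set it up front rather than trying to change it afterwards.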

Subnets in kubernetes

I'm still experimenting with migrating my Service Fabric application to Kubernetes. In Service Fabric I have two node types (effectively subnets), and for each service I configure which subnet it will be deployed to. I cannot see an equivalent option in the Kubernetes YAML file. Is this possible in Kubernetes?
First of all, they are not effectively subnets; they could all be in the same subnet. But with AKS you have node pools. Similarly to Service Fabric, those could be in different subnets (inside the same VNet, afaik). Then you would use nodeSelectors to assign pods to nodes in a specific node pool.
The same principle applies if you are creating a Kubernetes cluster yourself: you would need to label the nodes and use nodeSelectors to target specific nodes for your deployments.
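As a minimal sketch, assuming an AKS node pool named backendpool (AKS labels each node with agentpool=<pool name>), a deployment pinned to that pool could look like:
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      nodeSelector:
        agentpool: backendpool   # hypothetical pool name
      containers:
      - name: backend
        image: nginx             # placeholder image
EOF
On a self-managed cluster you would first label the nodes yourself (kubectl label node <node-name> pool=backend) and select on that label instead.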
In Azure, the AKS cluster can be deployed to a specific subnet. If you are looking for deployment-level isolation, deploy the two node types to different namespaces in the k8s cluster. That way the node types get isolation and can be reached using the service name and namespace combination.
I want the backend services that access my SQL database to be in a different subnet from the front end. That way I can limit access to the DB to the backend subnet only.
This is an older way to approach network security. A more modern way is called Zero Trust Networking; see e.g. BeyondCorp at Google or the book Zero Trust Networks.
Limit who can access what
The Kubernetes way to limit what service can access what service is by using Network Policies.
A network policy is a specification of how groups of pods are allowed to communicate with each other and other network endpoints.
NetworkPolicy resources use labels to select pods and define rules which specify what traffic is allowed to the selected pods.
This is a more software-defined way to limit access than the older subnet-based approach, and the rules are more declarative, using Kubernetes labels instead of hard-to-understand IP numbers.
This should be used in combination with authentication.
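As a small sketch of such a policy (the labels and port are made up for the example), the following would allow only backend pods to reach the database pods:
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-backend-to-db
spec:
  podSelector:
    matchLabels:
      app: db              # pods being protected
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: backend     # only backend pods may connect
    ports:
    - protocol: TCP
      port: 5432           # example database port
EOF
Keep in mind that NetworkPolicy objects are only enforced if the cluster's network plugin supports them (on AKS that means enabling a network policy option such as Azure or Calico).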
For inspiration, also read We built network isolation for 1,500 services to make Monzo more secure.
In the Security team at Monzo, one of our goals is to move towards a completely zero trust platform. This means that in theory, we'd be able to run malicious code inside our platform with no risk – the code wouldn't be able to interact with anything dangerous without the security team granting special access.
Istio Service Entry
In addition, if using Istio, you can limit access to external services by using Istio Service Entry. It is possible to use custom Network Policies for the same behavior as well, e.g. Cilium.
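As a rough example, assuming the mesh's outbound traffic policy is set to REGISTRY_ONLY so unlisted external hosts are blocked, a ServiceEntry whitelisting a single external database host might look like:
kubectl apply -f - <<'EOF'
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: external-sql
spec:
  hosts:
  - mydb.example.net        # hypothetical external host
  location: MESH_EXTERNAL
  ports:
  - number: 1433
    name: tcp-sql
    protocol: TCP
  resolution: DNS
EOF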

kubectl exec vs ssh using bastion

Kops lets us create a Kubernetes cluster along with a bastion that has SSH access to the cluster nodes.
With this setup, is it still considered safe to use kubectl to interact with the Kubernetes API server?
kubectl can also be used to get a shell on the pods. Does this need any restrictions?
What are the precautionary steps that need to be taken if any?
Should the Kubernetes API server also be made accessible only through the bastion?
Deploying a Kubernetes cluster with the default Kops settings isn't secure at all and shouldn't be used in production as such. There are multiple configuration settings that can be changed using the kops edit command. The following points should be considered after creating a Kubernetes cluster via Kops:
Cluster Nodes in Private Subnets (existing private subnets can be specified using --subnets with the latest version of kops)
Private API LoadBalancer (--api-loadbalancer-type internal)
Restrict API Loadbalancer to certain private IP range (--admin-access 10.xx.xx.xx/24)
Restrict SSH access to Cluster Node to particular IP (--ssh-access xx.xx.xx.xx/32)
A hardened image can also be provisioned for the cluster nodes (--image)
The authorization mode must be RBAC. With recent Kubernetes versions, RBAC is enabled by default.
Audit logs can be enabled via configuration in kops edit cluster:
kubeAPIServer:
  auditLogMaxAge: 10
  auditLogMaxBackups: 1
  auditLogMaxSize: 100
  auditLogPath: /var/log/kube-apiserver-audit.log
  auditPolicyFile: /srv/kubernetes/audit.yaml
Kops provides reasonable defaults, so the simple answer is: it is reasonably safe to use Kops-provisioned infrastructure as-is after provisioning. A sketch of a hardened create command that combines the flags above follows.
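To make that concrete, a private, RBAC-enabled cluster along the lines of the list above could be created roughly like this (cluster name, state store, zone and CIDRs are placeholders):
kops create cluster \
  --name k8s.example.com \
  --state s3://my-kops-state-store \
  --zones eu-west-1a \
  --topology private \
  --networking calico \
  --bastion \
  --api-loadbalancer-type internal \
  --admin-access 10.0.0.0/24 \
  --ssh-access 203.0.113.10/32 \
  --authorization RBAC \
  --yes
Exact flag names can vary slightly between Kops versions, so check kops create cluster --help for the release you are using.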

How does failover work when the primary VM scale set gets restarted?

Above is a sample configuration for Azure Service Fabric.
I created it with the wizard, deployed one ASP.NET Core application, and I am able to access it from outside.
Now, if you look at the image below, the Service Fabric cluster is being accessed via sfclustertemp.westus2.cloudapp.azure.com, and I am able to reach the application at
sfclustertemp.westus2.cloudapp.azure.com/api/values.
Now, if I restart the primary VM scale set, it should transfer the load to the secondary one. I thought this would happen automatically, but it does not, as the second load balancer has a different DNS name. (If I specify the other DNS name, the application is accessible.)
My understanding is that the cluster has one ID, so it is common to both load balancers.
Is such a configuration possible?
Maybe you could use Azure Traffic Manager with health probes.
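A rough Azure CLI sketch of that idea (profile, endpoint and DNS names are made up; the second target is whatever DNS name the secondary load balancer has):
az network traffic-manager profile create \
  --name sfclusterTM \
  --resource-group myResourceGroup \
  --routing-method Priority \
  --unique-dns-name sfclustertemp-tm \
  --protocol HTTP --port 80 --path "/api/values"
az network traffic-manager endpoint create \
  --name primaryNodeType \
  --profile-name sfclusterTM \
  --resource-group myResourceGroup \
  --type externalEndpoints \
  --target sfclustertemp.westus2.cloudapp.azure.com \
  --priority 1
az network traffic-manager endpoint create \
  --name secondaryNodeType \
  --profile-name sfclusterTM \
  --resource-group myResourceGroup \
  --type externalEndpoints \
  --target <secondary-lb-dns-name> \
  --priority 2
Clients would then use the Traffic Manager DNS name, and the health probe would fail requests over to the secondary endpoint when the primary one stops answering.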
However, instead of using multiple node types for fail-over options during reboot, have a look at 'Durability tiers'. Using Silver or Gold will have the effect that reboots are performed sequentially on machine groups (grouped by fault domain), instead of all at once.
The durability tier is used to indicate to the system the privileges that your VMs have with the underlying Azure infrastructure. In the primary node type, this privilege allows Service Fabric to pause any VM level infrastructure request (such as a VM reboot, VM reimage, or VM migration) that impact the quorum requirements for the system services and your stateful services.
There is a misconception about what an SF cluster is.
On your diagram, the part you describe on the left as 'Service Fabric' does not belong there.
Service Fabric is nothing more than applications and services deployed onto the cluster nodes. When you create a cluster, you define a primary node type, and that is where Service Fabric will deploy the services used for managing the cluster.
A node type will be formed by:
A VM Scale Set: machines with OS and SF services installed
A load balancer with dns and IP, forwarding requests to the VM Scale Set
So what you describe there should be represented as:
NodeTypeA (Primary)
Load Balancer (cluster domain + IP)
VM Scale Set
SF management services (explorer, DNS)
Your applications
NodeTypeB
Load Balancer (other dns + IP)
VM Scale Set
Your applications
Given that:
The first concern is: if the primary node type goes down, you will lose your cluster, because the management services won't be available to manage your service instances.
Second: you shouldn't rely on node types for this kind of reliability; you should increase the reliability of your cluster by adding more nodes to the node types.
Third: if the concern is a data center outage, you could:
Create a custom cluster that spans multiple regions
Add a reverse proxy or API gateway in front of your services to route requests to wherever your service is running.
