I found a mention of an agent node in the AKS documentation, but I'm not finding a definition of it. Can anyone please explain it to me? I also want to know whether it is an Azure concept or a Kubernetes concept.
Regards,
In Kubernetes, the term node refers to a compute node. Depending on its role, it is usually referred to as a control plane node or a worker node. From the docs:
A Kubernetes cluster consists of a set of worker machines, called nodes, that run containerized applications. Every cluster has at least one worker node.
The worker node(s) host the Pods that are the components of the application workload. The control plane manages the worker nodes and the Pods in the cluster. In production environments, the control plane usually runs across multiple computers and a cluster usually runs multiple nodes, providing fault-tolerance and high availability.
Agent nodes in AKS are simply the worker nodes (not to be confused with the kubelet, which is the primary "node agent" that runs on each worker node). So the underlying concept is a Kubernetes one; "agent node" is just the name AKS uses for it.
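As a quick illustration, listing the nodes of an AKS cluster only ever shows these agent/worker nodes, since the control plane is managed by Azure and hidden from you:
# the agent (worker) nodes are what "kubectl get nodes" returns in AKS;
# the managed control plane does not appear in this list
kubectl get nodes -o wide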
I'm deciding whether I should use vanilla Kubernetes or Azure Kubernetes Service (AKS) for my CI build agents.
What control will I lose if I use AKS? SSH access inside the cluster? Turning the VMs on and off? And what about cost? I see that AKS uses VM pricing; is there anything beyond that?
There are several limitations that come to mind, but none of them should restrict your use case:
You lose control over the master nodes (the control plane). That shouldn't be an issue in your use case, and I can hardly imagine a situation where it would be a limitation. You can still SSH into the worker nodes in AKS.
You lose fine-grained control over the size of the worker nodes. Node pools become the abstraction for controlling VM size. In a self-managed cluster you can attach VMs of completely different sizes to the cluster; in AKS, all the nodes in the same pool must be the same size, although you can create several node pools with different VM sizes (see the sketch after this list).
You can't choose the node OS in AKS (it's Ubuntu-based).
You're not flexible in choosing network plugins for Kubernetes: it's either kubenet or Azure CNI. That's fine as long as you're not running unusual applications that require L2 networking; more info here.
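A minimal sketch of working around that single-size restriction by adding a second node pool (the resource group, cluster, and pool names are placeholders):
# add a second pool whose VMs are a different size than the default pool
az aks nodepool add \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name largepool \
  --node-count 2 \
  --node-vm-size Standard_D8s_v3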
There are definitely benefits of AKS:
You're not managing the control plane, which is a real pain reliever.
AKS can scale its nodes dynamically, which may be a good option for bursty workloads like build agents, although scaling nodes up does introduce some additional delay (see the sketch after this list).
Cluster (control plane and data plane) upgrades are just a couple of clicks in the Azure portal.
The control plane is free in AKS (in contrast to, e.g., EKS on Amazon); you pay only for the worker nodes. You can calculate your price here.
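For the bursty build-agent case, here is a sketch of turning on the cluster autoscaler on an existing cluster (names and node-count bounds are placeholders):
# let AKS add and remove agent nodes automatically within the given bounds
az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 5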
I am going to start with an example. Say I have an AKS cluster with three nodes. Each of these nodes runs a set of pods, let's say 5 pods. That's 15 pods running on my cluster in total, 5 pods per node, 3 nodes.
Now let's say that my nodes are not fully utilized at all and I decide to scale down to 2 nodes instead of 3.
When I choose to do this within Azure and change my node count from 3 to 2, Azure will shut down the 3rd node. However, it will also delete all the pods that were running on the 3rd node. How do I make my cluster reschedule the pods from the 3rd node onto the 1st or 2nd node so that I don't lose them and their contents?
The only way I feel safe to scale down on nodes right now is to do the rescheduling manually.
Assuming you are using Kubernetes Deployments (or ReplicaSets), then it should do this for you. Your Deployment is configured with a desired number of replicas; when you remove a node, the scheduler will see that the current number of running replicas is lower than the desired number and will create new ones on the remaining nodes.
If you are just deploying pods without a Deployment, then this won't happen, and the only option is to redeploy manually, which is why you want to use a Deployment.
Bear in mind, though, that what gets created are new pods; you are not moving the previously running pods. Any state on the previous pods that was not persisted will be gone. This is how it is intended to work.
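If you want to move the pods off a node gracefully before reducing the node count, one option (assuming the workloads are managed by Deployments; the node name below is only an illustration of the usual AKS naming pattern) is to drain the node first and watch the replacement pods come up:
# evict everything from the node that is about to be removed; the Deployment
# controller recreates the evicted pods on the remaining nodes
kubectl drain aks-nodepool1-12345678-2 --ignore-daemonsets
# watch the replacement pods start on the other nodes
kubectl get pods -o wide --watch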
We are running a lot of connectors on-premises and we need to move to Azure. These on-premises machines run the Kafka Connect API on 4 nodes. We deploy this API by executing the following on all of these machines:
# put the connector plugin JARs on the classpath, then start a Connect worker in distributed mode
export CLASSPATH=/path/to/connectors-jars
/usr/hdp/current/kafka-broker/bin/connect-distributed.sh distributed.properties
We have Kafka deployed on Azure Kafka for HDInsight. We need at least 2 nodes running the distributed Connect API, and we don't know where to deploy them:
On head nodes (we still don't know what those are for)
On worker nodes (where kafka brokers live)
On edge nodes
We also have Azure AKS running containers. Should we deploy the distributed Connect API on AKS?
where kafka brokers live
Ideally, no. Connect uses a lot of memory when batching large numbers of records. That memory is better left to the page cache for the broker.
On edge nodes
Probably not. That is where your users interact with your cluster. You wouldn't want them poking at your configurations or accidentally disrupting the processes in other ways. For example, we had someone fill up an edge node's local disk because they were copying large volumes of data in and out of the "edge".
On head nodes
Maybe? But then again, those are only for cluster admin services, and they probably have little memory.
A better solution: run dedicated instances outside of HDInsight in Azure that only run Kafka Connect. Perhaps run them as containers in Kubernetes, since they are completely stateless services and only need access to your sources, sinks, and Kafka brokers for transferring data. This way they can be upgraded and configured separately from what Hortonworks and HDInsight provide.
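A minimal sketch of that last approach on AKS (the image tag, replica count, and broker addresses below are assumptions for illustration; the remaining distributed.properties settings such as the group id, storage topics, and converters would be supplied the same way):
# run two stateless Connect workers as a Kubernetes Deployment
kubectl create deployment kafka-connect \
  --image=confluentinc/cp-kafka-connect:7.6.0 --replicas=2
# point the workers at the HDInsight brokers (addresses are placeholders);
# the rest of the worker configuration goes in as environment variables too
kubectl set env deployment/kafka-connect \
  CONNECT_BOOTSTRAP_SERVERS=broker1:9092,broker2:9092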
I started using the AKS service with a 3-node setup. As I was curious, I peeked at the provisioned VMs which are used as nodes. I noticed I can get root on them and that there are updates waiting to be installed. As I couldn't find anything in the docs, my question is: who is in charge of managing the AKS nodes (VMs)?
Do I have to do this myself, or what is the idea here?
Thank you in advance.
Azure automatically applies security patches to the nodes in your cluster on a nightly schedule. However, you are responsible for ensuring that nodes are rebooted as required.
You have several options for performing node reboots:
Manually, through the Azure portal or the Azure CLI.
By upgrading your AKS cluster. Cluster upgrades automatically cordon and drain nodes, then bring them back up with the latest Ubuntu image. You can update the OS image on your nodes without changing Kubernetes versions by specifying the current cluster version in az aks upgrade (see the sketch after this list).
Using Kured, an open-source reboot daemon for Kubernetes. Kured runs as a DaemonSet and monitors each node for the presence of a file indicating that a reboot is required. It then manages OS reboots across the cluster, following the same cordon and drain process described earlier.
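A sketch of the second option, re-imaging the nodes by "upgrading" to the version the cluster is already running (the resource group and cluster names are placeholders):
# find the version the cluster is currently running
az aks show --resource-group myResourceGroup --name myAKSCluster \
  --query kubernetesVersion -o tsv
# "upgrade" to that same version: the nodes are cordoned, drained, and brought
# back with a fresh OS image, while the Kubernetes version stays the same
az aks upgrade --resource-group myResourceGroup --name myAKSCluster \
  --kubernetes-version <version-from-the-previous-command>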
I have set up a VM cluster using Azure Container Service. The container orchestrator is DC/OS. There are 3 master nodes and 3 slave agents.
I have a Docker app that I am trying to launch on my cluster using Marathon. Each time I launch it, I notice that the CPU utilization of 3 of the nodes is always 0, i.e. the app is never scheduled on them. The other 3 nodes reach almost 100% CPU utilization as I scale the application. At that point the scaling stops and Marathon shows the state "waiting" for resource offers from Mesos.
I don't understand why Marathon is not scheduling more containers despite there being empty nodes when I try to scale the application.
I know that Marathon runs on the Master nodes; is it unaware of the presence of the slave agents? (Assuming that the 3 free nodes are the slaves.)
Here is the config file of the application: pastebin-config-file
How can I make full use of the machines using Marathon?
Tasks are not scheduled on the masters; those are reserved for managing the cluster. The 3 nodes that stay at 0% CPU are the masters, so Marathon can only place containers on the 3 agent nodes, and scaling stops once those agents are full.
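If you want to confirm which machines are actually offering resources to Marathon, one way (assuming you can reach the Mesos master from inside the cluster; leader.mesos is the default Mesos-DNS name in DC/OS) is to query the master's agent list:
# the /slaves endpoint on the Mesos master lists the registered agents and their
# resources; the master nodes themselves never appear in this list
curl -s http://leader.mesos:5050/slaves | python -m json.tool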