I have deployed 5 apps using Azure Container Instances. They are working fine, but the issue is that all containers are currently running all the time, which gets expensive.
What I want to do is start/stop instances on demand, using a master container or VM that runs all the time to orchestrate this.
E.g.:
This master service gets a request to spin up service number 3 for 2 hours and then shut it down; all other containers stay off until they receive a similar request.
For my use case, each service will be used for less than 5 hours a day most of the time.
Now, I know Kubernetes is an engine made to manage containers, but all the examples I have found are for high-scale services, not for 5 services with only one container each. I'm also not sure whether Kubernetes allows keeping all the containers off most of the time.
What I was thinking of is handling all this through some API, but I'm not finding any service in Azure that allows something like this; I have only found options to create new containers, not to start and stop existing ones.
EDIT:
Also, these apps run processes that are too heavy to host on a serverless platform.
The solution is to define a Horizontal Pod Autoscaler for your deployment.
The Horizontal Pod Autoscaler automatically scales the number of pods in a replication controller, deployment or replica set based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics). Note that Horizontal Pod Autoscaling does not apply to objects that can’t be scaled, for example, DaemonSets.
The Horizontal Pod Autoscaler is implemented as a Kubernetes API resource and a controller. The resource determines the behavior of the controller. The controller periodically adjusts the number of replicas in a replication controller or deployment to match the observed average CPU utilization to the target specified by user.
The configuration file should look like this:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-images-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment
  minReplicas: 2
  maxReplicas: 100
  targetCPUUtilizationPercentage: 75
scaleTargetRef should refer to your deployment definition, minReplicas can be set as low as you need (note that a plain HPA does not scale all the way down to zero replicas), and you can pick targetCPUUtilizationPercentage according to your preferences. This approach should help you save money, because extra pods are terminated once CPU utilization drops below the target.
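For reference, roughly the same autoscaler can also be created imperatively with kubectl; this is just a sketch reusing the deployment name from the example above:
$ kubectl autoscale deployment example-deployment --min=1 --max=100 --cpu-percent=75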
Kubernetes official documentation: kubernetes-hpa.
GKE autoscaler documentation: gke-autoscaler.
Useful blog about saving cash using GCP: kubernetes-google-cloud.
Related
I am a newbie to Kubernetes and I am experimenting with these pods.
I have 3 pods running on 3 different nodes. One of the pods is at 90%+ CPU usage and I want to create a health check for that.
Is there any way to create a health check in Kubernetes?
If I specify an 80% CPU limit, will Kubernetes create a new pod or not?
You need a Horizontal Pod Autoscaler to scale pods. There is a simple guide that will walk you through creating one. Here's a resource example from the mentioned guide:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
As mentioned in the other answer, you are supposed to create a Horizontal Pod Autoscaler for that Deployment object. The Kubernetes metrics server continuously watches CPU utilization for each pod, and once the usage crosses the threshold, i.e. "averageUtilization: 50" in the example above, a new pod gets spawned: here, as soon as an existing pod reaches 50% of the CPU provided to it.
And this is different from a health check: the health of a pod decides whether traffic is sent to it or not, i.e. via liveness and readiness probes.
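For completeness, a minimal sketch of such probes in a container spec (the /healthz path and port 8080 are illustrative assumptions, not taken from the question):
containers:
- name: app
  image: my-app:latest            # illustrative image name
  livenessProbe:                  # restarts the container if this check keeps failing
    httpGet:
      path: /healthz
      port: 8080
    initialDelaySeconds: 10
    periodSeconds: 15
  readinessProbe:                 # removes the pod from Service endpoints while failing
    httpGet:
      path: /healthz
      port: 8080
    periodSeconds: 5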
Make sure you specify resource requests and limits for the pod in the deployment file you create, so that the HPA has a reference CPU value against which it calculates the utilization percentage.
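As a sketch, the relevant part of the Deployment's container spec could look like this (the values are illustrative; the HPA computes utilization as a percentage of the CPU request):
containers:
- name: app
  image: my-app:latest            # illustrative image name
  resources:
    requests:
      cpu: 200m                   # the HPA's utilization percentage is relative to this
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 512Mi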
I'm trying to run a deployment of my NodeJS application in EKS, with a ReplicaSet dictating that 3 pods of the application should run. However, I want to make some logic exclusive to one of the pods, calling it the "master" version of the application.
Is it possible to either a) have a different environment variable like IS_MASTER passed to just that pod, or b) otherwise tell from within the application that it's running on the "master" pod, without multiple deployments?
You can give each pod a sticky identity using StatefulSets:
https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
Quoting the docs:
Like a Deployment, a StatefulSet manages Pods that are based on an identical container spec. Unlike a Deployment, a StatefulSet maintains a sticky identity for each of their Pods. These pods are created from the same spec, but are not interchangeable: each has a persistent identifier that it maintains across any rescheduling.
Pods will have the hostnames foo-{0..N-1} given N replicas, so you can do a simple check for a master: if the hostname is foo-0, then it is the master.
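One way to make that check easy inside the container is to expose the pod name through the Downward API as an environment variable; a minimal sketch (the variable name POD_NAME is only an illustration):
env:
- name: POD_NAME                  # resolves to e.g. "foo-0" for the first StatefulSet replica
  valueFrom:
    fieldRef:
      fieldPath: metadata.name
The application can then treat the pod whose name ends in -0 as the master.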
My question is an extension of the earlier discussion here:
Mongo Change Streams running multiple times (kind of): Node app running multiple instances
In my case, the application is deployed on Kubernetes pods. There will be at least 3 pods and a maximum of 5 pods. The solution mentioned in the above link suggests using <this instance's id> in the $mod operator. Since the application is deployed to K8s pods and pod names are dynamic, how can I achieve a similar solution for my scenario?
If you are running a stateless workload (Deployment), I am not sure why you want to fix the pod names.
Fixing pod names is only possible with StatefulSets.
You should use a StatefulSet instead of a Deployment or replication controller (RC); note that replication controllers have been replaced by ReplicaSets.
StatefulSet Pods have a unique identity comprised of an ordinal. For any StatefulSet with N replicas, each Pod in the StatefulSet will be assigned an integer ordinal, from 0 up through N-1, which is unique across the Set.
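A minimal StatefulSet sketch showing the pieces that produce the stable foo-0 ... foo-N-1 names (all names and the image are illustrative):
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: foo
spec:
  serviceName: foo-headless       # headless Service that owns the per-pod DNS records
  replicas: 3
  selector:
    matchLabels:
      app: foo
  template:
    metadata:
      labels:
        app: foo
    spec:
      containers:
      - name: app
        image: my-app:latest      # illustrative image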
We have a K8s cluster on Azure (AKS). On this cluster, we added a load balancer during setup, which installed an nginx-ingress controller.
Looking at the deployments:
addon-http-application-routing-default-http-backend 1
addon-http-application-routing-external-dns 1
addon-http-application-routing-nginx-ingress-controller 1
I see there is 1 of each running. I find very little information about whether these should be scaled (there is 1 pod each), and if they should, how?
I've tried running
kubectl scale deployment addon-http-application-routing-nginx-ingress-controller --replicas=3
This temporarily scales it to 3 pods, but after a few moments it is downscaled again.
So again, are these supposed to be scaled? Why? How?
EDIT
For those that missed it like I did: the AKS HTTP application routing addon is not ready for production; it is there to quickly set you up and start experimenting, which is why I wasn't able to scale it properly.
Read more
That's generally how you do it:
$ kubectl scale deployment addon-http-application-routing-nginx-ingress-controller --replicas=3
However, I suspect you have an HPA configured which will scale up/down depending on the load or some metrics and has the minReplicas spec set to 1. You can check with:
$ kubectl get hpa
$ kubectl describe hpa <hpa-name>
If that's the case you can scale up by just patching the HPA:
$ kubectl patch hpa <hpa-name> -p '{"spec": {"minReplicas": 3}}'
or edit it manually:
$ kubectl edit hpa <hpa-name>
More information on HPAs here.
And yes, the ingress controllers are supposed to be scaled up and down depending on the load.
In AKS, being a managed service, these "system" workloads like kube-dns and the ingress controller are managed by the service itself and cannot be modified by the user (they're labeled with addonmanager.kubernetes.io/mode: Reconcile, which forces the running configuration to reflect what's on disk at /etc/kubernetes/addons on the masters).
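If you want to verify that on your own cluster, listing the deployment's labels should show the Reconcile mode (assuming the addon lives in the kube-system namespace):
$ kubectl get deployment addon-http-application-routing-nginx-ingress-controller -n kube-system --show-labels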
Imagine there is a cluster with lots of different deployments running on it. Some pods use PersistentVolumes (Azure Disks). There is a limit in Azure on how many disks can be attached to a VM, and this leads to scheduling errors like:
Status=409 Code="OperationNotAllowed" Message="The maximum number of data disks allowed to be attached to a VM of this size is 8
Pods stay in
Waiting: ContainerCreating
state forever, even though some nodes had far fewer pods with attached disks at the moment of scheduling. It would be great to limit the number of pods with attached disks per node so this error never happens. I believe
podAntiAffinity
is what I need, and I know I can restrict pods with the same label from being scheduled on the same node, but I don't know how to allow it only until a node reaches the maximum number of pods with disks.
My installation is AKS.
az acs create \
--orchestrator-type=kubernetes \
--orchestrator-version 1.7.9 \
--resource-group <resource_group_here> \
--name=<name_here> \
...
KUBE_MAX_PD_VOLS is what you are looking for. By default its value is 16 for Azure Disks, so you can either use instances that have the same limit of attached disks (16) or set it to a preferable value. You can see where it is declared on GitHub.
You should set this environment variable in your scheduler declaration. I found my scheduler declaration in /etc/kubernetes/manifests/kube-scheduler.yaml. This is what it looks like now:
apiVersion: "v1"
kind: "Pod"
metadata:
name: "kube-scheduler"
...
spec:
containers:
- name: "kube-scheduler"
...
env:
- name: KUBE_MAX_PD_VOLS
value: "8"
...
Note the KUBE_MAX_PD_VOLS setting under spec.containers.env; it prevents the scheduler from placing more than 8 disks on any single node.
This way pods spread among the nodes without any issues; pods that cannot fit stay in the Pending state until a node with enough capacity becomes available.
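As a quick sanity check, you can list the pods that are still waiting for a node with free disk slots by filtering on the Pending phase:
$ kubectl get pods --all-namespaces --field-selector=status.phase=Pending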