Is there a way to enable "requests per second" based scaling on Azure Kubernetes Service?
The HPA in AKS allows horizontal pod scaling based on CPU and memory constraints, but there is no straightforward way to do it for requests per second.
Is there a way to use advanced metrics with the metrics server bundled with AKS?
If you use Azure Application Gateway as your ingress controller, you can use its metrics with the Horizontal Pod Autoscaler. You can find documentation for this use case here. Note that this feature still appears to be in beta.
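As a rough sketch of what that looks like: once an adapter (e.g. the azure-k8s-metrics-adapter) exposes Application Gateway metrics through the external metrics API, an HPA can target a requests-per-second style metric. The metric name appgw-request-count and the deployment name my-app below are placeholders, not names the documentation guarantees:

```bash
# Hedged sketch: scale a deployment on an Application Gateway request-count
# metric served through the external metrics API by a metrics adapter.
# Metric name, deployment name, and targets are illustrative placeholders.
kubectl apply -f - <<'EOF'
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-rps
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metric:
          name: appgw-request-count   # assumed name exposed by the adapter
        target:
          type: AverageValue
          averageValue: "200"         # aim for ~200 requests per pod
EOF
```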
Related
I want to find the node scaling time on Azure Kubernetes Service (AKS) using logs.
It's possible with some assumptions.
This information is taken from the Azure AKS documentation (consider getting familiar with it; it describes how to enable the logs, where to look, and so on):
To diagnose and debug autoscaler events, logs and status can be retrieved from the autoscaler add-on.
AKS manages the cluster autoscaler on your behalf and runs it in the managed control plane. You can enable control plane logs to see the logs and operations from the CA (cluster autoscaler).
The same cluster-autoscaler is used across different platforms, and each platform can have some specific setup (e.g. for Azure AKS). Based on that, the logs should contain events like:
status, scaleUp, scaleDown, eventResult
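As an illustrative, hedged example of timing those events: assuming you have enabled the cluster-autoscaler diagnostic category on the cluster and routed it to a Log Analytics workspace, a query along these lines lets you see when scale operations started and finished (the workspace GUID is a placeholder):

```bash
# Hedged sketch: query cluster-autoscaler events from Log Analytics to
# measure node scaling time. Assumes the "cluster-autoscaler" diagnostic
# category is enabled and flowing into the workspace given below.
az monitor log-analytics query \
  --workspace "00000000-0000-0000-0000-000000000000" \
  --analytics-query "AzureDiagnostics
    | where Category == 'cluster-autoscaler'
    | where log_s has 'scaleUp' or log_s has 'scaleDown'
    | project TimeGenerated, log_s
    | order by TimeGenerated asc"
```

The gap between the first scaleUp event and the corresponding node-ready event gives you an approximation of the node scaling time.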
I am planning to deploy 15 different applications initially and would end up with 300+ applications on Azure Kubernetes Service, using Prometheus and Grafana for monitoring.
I have deployed both Prometheus and Grafana in a separate namespace on a dedicated node.
How do I enable horizontal pod scaling for Prometheus and Grafana?
You can scale your applications based on custom metrics gathered by Prometheus and presented in the Grafana dashboard.
In order to do that you'll need the Prometheus Adapter, which implements the custom metrics API and enables the HorizontalPodAutoscaler controller to retrieve metrics through the custom.metrics.k8s.io API. You can define your own metrics through the adapter's configuration, so the HPA can scale based on those stats.
Here you can find a short guide that would get you started.
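As a minimal sketch of the setup, assuming your Prometheus server is reachable at the service URL below (adjust the namespace and service names to your deployment):

```bash
# Hedged sketch: install the Prometheus Adapter so the HPA controller can
# read custom metrics. Service URL, namespace, and release name are
# placeholders for your own setup.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus-adapter prometheus-community/prometheus-adapter \
  --namespace monitoring \
  --set prometheus.url=http://prometheus-server.monitoring.svc \
  --set prometheus.port=9090

# Verify that the custom metrics API is being served:
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1
```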
I have my model hosted on ACI compute. I'm trying to investigate what it would take to support auto-scaling of the underlying instances. If auto-scaling isn't possible, is there documentation on how to manually scale the endpoint?
Basically, I need to support high availability on this model endpoint.
A thought that I had was to manually publish the model to two endpoints and then add a load balancer in front. Seems a little hacky...
Thanks!
We usually recommend deploying to AKS for high availability; ACI is intended for low-scale dev/test scenarios, while an AKS web service can autoscale its replicas. https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-azure-kubernetes-service
Is there equivalent functionality in Azure to an AWS Auto Scaling Group or a GCP Instance Group? All I can find is the Azure Virtual Machine Scale Set, which always seems to use a load balancer. The closest resource I found is an Azure Automation runbook, which is a bit more complex for my use case.
I just need to spin up virtual machines based on the current VM's health thresholds, and/or use it for vertical scaling by simply changing the instance type.
You can create an Azure VMSS without a load balancer; you may need to assign a public IP address to each VM, which is now supported. In your case it sounds like you just want one node in the VMSS, so you can use Autoscale.
https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-networking#public-ipv4-per-virtual-machine
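For illustration, a scale set with no load balancer and a public IP per instance can be created along these lines (resource names, image, and credentials below are placeholders to adapt):

```bash
# Hedged sketch: create a VMSS with no load balancer ("" disables it) and a
# public IPv4 address per instance. All names and the image are placeholders.
az vmss create \
  --resource-group myResourceGroup \
  --name myScaleSet \
  --image UbuntuLTS \
  --instance-count 1 \
  --load-balancer "" \
  --public-ip-per-vm \
  --admin-username azureuser \
  --generate-ssh-keys
```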
The equivalent Azure service for AWS Auto Scaling Group or GCP Instance Group is Azure Autoscale.
I'll provide a basic overview of Azure Autoscale, taken from here.
Azure Autoscale supports the most common scaling scenarios based on a schedule and, optionally, triggered scaling operations based on runtime metrics (such as processor utilization, queue length, or built-in and custom counters). You can configure simple autoscaling policies for a solution quickly and easily by using the Azure portal. For more detailed control, you can make use of the Azure Service Management REST API or the Azure Resource Manager REST API. The Azure Monitoring Service Management Library and the Microsoft Insights Library (in preview) are SDKs that allow collecting metrics from different resources, and perform autoscaling by making use of the REST APIs. For resources where Azure Resource Manager support isn't available, or if you are using Azure Cloud Services, the Service Management REST API can be used for autoscaling. In all other cases, use Azure Resource Manager.
The mentioned article is a great resource.
It also provides information about:
Types of scaling (vertical vs. horizontal).
Configuring autoscaling for an Azure solution.
How to use Azure Autoscale.
Application design considerations for implementing autoscaling.
Also check out this resource on how to auto-scale a cloud service.
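As a concrete sketch of the metric-triggered scenario described above: an autoscale setting can be attached to a scale set and given a simple CPU-based scale-out rule (resource and setting names below are placeholders):

```bash
# Hedged sketch: create an autoscale setting on a VMSS with min/max/default
# instance counts, then add a rule that scales out by one instance when
# average CPU exceeds 70% over five minutes. All names are placeholders.
az monitor autoscale create \
  --resource-group myResourceGroup \
  --resource myScaleSet \
  --resource-type Microsoft.Compute/virtualMachineScaleSets \
  --name myAutoscaleSetting \
  --min-count 1 --max-count 5 --count 2

az monitor autoscale rule create \
  --resource-group myResourceGroup \
  --autoscale-name myAutoscaleSetting \
  --condition "Percentage CPU > 70 avg 5m" \
  --scale out 1
```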
Problem statement:
As per my understanding, we can run Elasticsearch, Kibana, Logstash, etc. as pods in a Kubernetes cluster for log management, but they are memory-intensive applications. AWS provides various managed services like CloudWatch, CloudTrail, and a managed ELK stack for log management.
Do we have a similar substitute in Azure as well, i.e. some managed service?
You can use AKS with Azure Monitor (reading). I'm not sure you can apply this to a non-AKS cluster (at least not in a straightforward fashion).
Onboarding (for AKS clusters) is really simple and can be done using various methods (portal included).
You can read more in the docs I've linked (for example, about its capabilities).
Azure Monitor for containers is available now; once it is integrated, cluster metrics as well as logs will be automatically collected and made available through Log Analytics.
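For reference, enabling it on an existing cluster is a single add-on command (cluster and resource group names below are placeholders; a default Log Analytics workspace is created if you don't specify one):

```bash
# Hedged sketch: enable the Azure Monitor for containers add-on on an
# existing AKS cluster. Names are placeholders for your own resources.
az aks enable-addons \
  --addons monitoring \
  --name myAKSCluster \
  --resource-group myResourceGroup
```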