I am running an optimization model (using Google.OrTools) built on .NET Framework. When I run it locally, the application uses more than 99% CPU, so my team decided to move it to an Azure scale set, where I have one VM configured to scale out to 10 VMs. The problem I face is the same >99% CPU, but only on my main VM: even though new VMs have been added (scaled out), CPU usage on those VMs is <1%. I am now confused about how scale sets in Azure work.
In the case above, it seems the job is not being shared with the other VMs. How can I resolve this?
Please note that I am running my application as a console application, and this job does not make frequent connections to the database or disk; it is a purely mathematical problem.
The customer will use Azure VMSS as the front endpoint (or backend pool).
Azure VMSS autoscale reduces the management overhead of monitoring and tuning your scale set as customer demand changes over time.
Azure VMSS uses an Azure load balancer to route traffic to all VMSS instances; in this way, CPU usage stays consistent across all instances.
If your service reaches 99% CPU without serving other requests or connections, it means you should resize that VM to a larger size.
First, whether your workload can scale out rather than scale up is determined by the workload itself, not by your preferences or your budget.
An Azure scale set includes some backend VMs and a load balancer. The load balancer distributes requests to the backend servers.
Your workload can take advantage of an Azure scale set if it consists of multiple, independent requests. The canonical example of this kind of workload is a web server. Running this kind of workload on an Azure scale set doesn't usually require any changes to code.
You might be able to run your workload on a scale set if you have a single request that can be broken down into smaller pieces that can be processed independently. For this kind of parallel processing to work, you'd probably have to rewrite some of your code. The load balancer would see these smaller pieces as multiple requests.
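For instance, a single large computation can sometimes be rewritten as independent chunks whose partial results are aggregated at the end. The sketch below is purely illustrative (the sum-of-squares stand-in and all names are hypothetical, not part of OR-Tools or any Azure API); the same decomposition is what would let a load balancer or queue farm the pieces out to scale-set VMs:

```python
from multiprocessing import Pool

def solve_chunk(bounds):
    # Hypothetical sub-problem standing in for one independent piece
    # of a larger mathematical job (here: sum of squares over a range).
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

def solve_parallel(n, workers=4):
    # Split the full range [0, n) into independent chunks; on a scale
    # set, each VM would take one chunk instead of a local process.
    step = max(1, n // workers)
    chunks = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    with Pool(workers) as pool:
        partials = pool.map(solve_chunk, chunks)
    return sum(partials)  # aggregate the partial results

if __name__ == "__main__":
    # Same answer as the sequential computation, produced in pieces.
    print(solve_parallel(10_000) == sum(i * i for i in range(10_000)))
```

The key property is that the chunks share no state, so it makes no difference whether they run as local processes or on separate VMs.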
Other ways to improve mathematical performance include:
using a different, more appropriate language,
running your code on a GPU rather than a CPU, or
leveraging a third-party system, like Wolfram Mathematica.
I'm sure there are other ways.
Imagine you have 10 physical machines in the lab. How would you split up this task to run faster, on all the machines?
A scale set is a collection of VMs. To make use of scale sets, and autoscale, your compute intensive job needs to be parallelizable. For example, if you can split it into many sub-tasks, then each VM in the scale set can request a sub-task, compute it, send the result somewhere for aggregation, and request another task.
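The request/compute/report loop just described can be sketched as a toy model, using an in-memory queue and threads to stand in for a real task queue (e.g. an Azure Storage queue) and the scale-set VMs; all names here are illustrative, not an Azure API:

```python
import queue
import threading

def worker(tasks, results):
    # Each "VM" repeatedly requests a sub-task, computes it, and
    # reports the result somewhere for aggregation.
    while True:
        try:
            lo, hi = tasks.get_nowait()
        except queue.Empty:
            return  # no work left; the instance would sit idle or scale in
        results.put(sum(i * i for i in range(lo, hi)))

def run_job(n, vms=4, chunk=1_000):
    tasks, results = queue.Queue(), queue.Queue()
    for lo in range(0, n, chunk):          # coordinator enqueues sub-tasks
        tasks.put((lo, min(lo + chunk, n)))
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(vms)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    total = 0                              # aggregate the partial results
    while not results.empty():
        total += results.get()
    return total

print(run_job(10_000) == sum(i * i for i in range(10_000)))  # True
```

Because each worker pulls its next sub-task only when it finishes the previous one, the load spreads itself across however many instances happen to be running, which is exactly what autoscale relies on.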
Here is an example of a compute intensive task running on 1000 VMs in a scale set: https://techcommunity.microsoft.com/t5/Microsoft-Ignite-Content-2017/The-journey-to-provision-and-manage-a-thousand-VM-application/td-p/99113
Related
I need an ASP.NET Core application in Azure to have redundancy: if one application instance fails, another should take over its tasks online. I didn't find anything I could use as a guide. Thanks for your help.
Azure VMs HA options:
Use Availability Set: An availability set is a logical grouping of VMs that allows Azure to understand how your application is built, in order to provide redundancy and availability. (SLA 99.95%)
Scale Sets: Azure virtual machine scale sets let you create and manage a group of load balanced VMs. The number of VM instances can automatically increase or decrease in response to demand or a defined schedule. Scale sets provide high availability to your applications, and allow you to centrally manage, configure, and update many VMs.
Load Balancing
Also follow this decision tree as a starting point to choose whatever fits your needs.
I have a workload which needs to run a few times per week. It requires some heavy computational work and runs for about an hour (with 16 cores and 32 GB of memory). It is possible to run it in a container.
Azure offers many different possibilities to run containers. (I have no knowledge of most of the Azure services, so my conclusions might be wrong.) At first I thought Azure Container Instances was perfect for this scenario, but it only offers containers with up to 4 vCPUs and 16 GB of memory. There is no need for orchestration with a single container, so Azure Kubernetes Service and Azure Service Fabric come with too much overhead. Similarly, Azure Batch also offers computational clusters, which are not needed for a single workload.
Which Azure service is the best fit for this use case?
While a "best fit" question is likely to be closed, here's a suggestion anyway.
Don't dismiss AKS. You can easily create a one-node cluster using a VM that fits your required configuration. With the default (free) control plane SLA you don't pay for the master nodes, and you can stop your cluster after each run to stop being charged. No need to bother about orchestration; see this as a VM that has everything needed to run your container, which you'll use like an ACI.
We have a problem with our production cluster. We are using Azure Service Fabric with the Silver tier. Our issue is that when a single instance gets a spike, i.e. high CPU and memory utilization, the load balancer does not transfer requests to other nodes. A single node reaches 90% utilization and we cannot even RDP into that node during that time. I have seen Microsoft articles about adding placement constraints, but that didn't work either. We cannot apply rules to the load balancer because we have integrated APIM with Service Fabric. I had multiple calls with Microsoft and still haven't gotten a workable solution. I need a solution to this problem.
I know we have an issue in one of our services and we are already working on it, but we need Service Fabric to handle this scenario as well.
If one or more of your services generates CPU/memory spikes (rather than consistently high utilization), it will be very hard to balance such behavior.
Anyway, you can do two things to mitigate it:
Use resource governance to restrict the amount of CPU and memory that the problematic service can consume.
Microsoft released FabricObserver, which can be used to extend the monitoring of your SF cluster. Have a look at how you can leverage AppObserver to report the CPU and memory usage of a single service (process) as LoadMetrics, and use that to balance the cluster.
I am getting a System.OutOfMemoryException from a blob-triggered Azure Function. Do I need to scale up or scale out the App Service plan to fix this problem?
What is the difference between these two?
For your original question: if your function is hitting memory limits, scale up the App Service plan of your Azure service, since the plan you already have has too little memory. If you have multiple functions running in the same App Service plan, then scale out.
From the docs,
Scale up means:
A scale up operation is the Azure Web Sites cloud equivalent of moving your non-cloud web site to a bigger physical server. So, scale up operations are useful to consider when your site is hitting a quota, signaling that you are outgrowing your existing mode or options. In addition, scaling up can be done on virtually any site without worrying about the implications of multi-instance data consistency. Two examples of scale up operations in Windows Azure Web Sites are:
Scale out means:
A scale out operation is the equivalent of creating multiple copies of your web site and adding a load balancer to distribute the demand between them. When you scale out a web site in Windows Azure Web Sites there is no need to configure load balancing separately, since this is already provided by the platform.
Diagram depicting the difference between the two:
You need to scale up your app service plan.
"Scale up" means upgrading the capacity of the host where the app is hosted, e.g. increasing the memory from 1.75 GB to 3.5 GB.
"Scale out" means upgrading the capacity of the app by increasing the number of host instances.
In short, scale up is vertical scaling, where you add more resources to increase the capacity of the underlying hardware/infrastructure.
Scale out is horizontal scaling, where you add more instances of the same app to process/handle requests simultaneously.
If you choose to scale out, you will get more VMs, and your workload is balanced across those VMs. If you choose to scale up, your VM gets more punch to handle the current workload. More VMs, or more power to your current VM.
A VM scale set can be used to create multiple VMs based on business requirements, and Azure Batch is also used to execute jobs on multiple VMs.
What is the exact difference between Azure Batch and a VM scale set?
Azure Batch is a Platform as a Service offering that provides an entire platform for scheduling jobs, submitting tasks, and obtaining their results. Jobs and tasks are submitted to node pools, which can be comprised of VMSS compute resources.
A VMSS, on the other hand, is Infrastructure as a Service that provides compute resources for any intended purpose. While you can spin up your own VMSS for running tasks, you would also have to implement your own job, task, and compute coordination service around it in order to replicate the Azure Batch service offerings.
At a high-level, Azure Batch provides two fundamental pieces for scheduling Batch and HPC workloads in the cloud:
Managed infrastructure
Cloud-native job scheduling
Azure Batch presents infrastructure at a managed layer "above" VMSS and CloudServices. Azure Batch orchestrates the pieces underneath to provide a concept called Batch pools, which provide potentially higher scale (as multiple deployments can be orchestrated together transparently) and higher resiliency to failures as Batch automatically recovers virtual machines or cloud service instances which have degraded.
Additionally, and just as important, Azure Batch provides cloud-native job scheduling. This portion is fully managed, i.e., you don't have to run a scheduler yourself. In a nutshell, Azure Batch provides concepts for job queues and tasks, which you can define via the programmatic API/SDK or the available tooling. Azure Batch operates on these concepts to execute the work you define (e.g., a command line with dependencies or a Docker container); tasks can even span multiple nodes (e.g., MPI jobs). Azure Batch can retry failed tasks, potentially on different nodes within the pool. It also provides an autoscale system that dynamically resizes your infrastructure (Batch pools) in response to node metrics and the number of jobs/tasks executing in the system.
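As a toy model of these scheduling concepts (this is not the Batch API; the round-robin scheduler, node names, and task function are all made up for illustration), a job can be seen as a queue of tasks, where a task that fails on one node is retried, possibly on another node, up to a retry limit:

```python
import collections

def run_job(job_tasks, nodes, max_retries=3):
    # job_tasks: list of (task_id, fn); fn(node) returns a result or raises.
    # A naive round-robin "scheduler" stands in for Batch's managed one.
    pending = collections.deque((tid, fn, 0) for tid, fn in job_tasks)
    results, turn = {}, 0
    while pending:
        tid, fn, attempts = pending.popleft()
        node = nodes[turn % len(nodes)]
        turn += 1
        try:
            results[tid] = fn(node)
        except Exception:
            if attempts + 1 < max_retries:
                pending.append((tid, fn, attempts + 1))  # retry, likely on another node
            else:
                results[tid] = None  # task permanently failed
    return results

def flaky_task(node):
    # Simulated task that fails on a degraded node but succeeds elsewhere.
    if node == "node-0":
        raise RuntimeError("simulated node failure")
    return 42

results = run_job([("t1", flaky_task)], nodes=["node-0", "node-1"])
print(results)  # {'t1': 42}: first attempt failed on node-0, retry succeeded
```

The real service layers durable state, per-task retry policies, and pool autoscaling on top of this basic loop, which is the part you would otherwise have to build yourself on raw VMSS.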
Please refer to the technical overview as a starting point.
Azure Batch's intent is to run jobs; VMSS runs workloads. Technically they overlap a fair bit, but a job is something rather short-lived/bursty, whereas a workload has to be running all the time.
A VM scale set is used to provide automatic scaling for an application and load balancing of traffic. VM scale sets are good for running web application/API-based workloads where automatic scaling of the application is handled and traffic load balancing is done.
Azure Batch is for tasks, scheduling jobs, running intrinsically parallel and tightly coupled workloads. It can provide scaling and load balancing of different nodes/VM that would be used for performing a high computation job. It would probably not be a suitable target for long-running services. A common scenario for Batch involves scaling out intrinsically parallel work, such as the rendering of images for 3D scenes, on a pool of compute nodes.