Azure API app: strange delay before sending response

I'm experiencing a really strange problem.
My API is taking a long time to respond and I can't see why from AppInsights.
Attached is the end-to-end transaction view. It shows that the API call took 16.7 seconds. It's clear that a dependency call took 3.8 seconds. Some DB operation took several ms. But what's causing the long delay? What happened in the red rectangle marked with a question mark?
The web app is hosted with a P1V2 app service plan. The plan is shared by 3 API apps. Could this be a problem?

"The web app is hosted with a P1V2 app service plan. The plan is shared by 3 API apps. Could this be a problem?"
No, that shouldn't be the problem: P1V2 is a premium service plan. Combining multiple apps into a single app service plan is a normal way to reduce cost, and you can add apps to an existing plan as long as the plan has enough resources.
Here are a few observations.
A P1V2 plan has 3.5 GB of RAM and 1 vCPU, so it is worth cross-checking the utilization metrics.
The API first calls the token endpoint (GET /msi/token) and from there forwards to the PUT call shown in the portal; that stretch accounts for close to 16 seconds.
The call in the red box is not the issue, because the SQL there executes in only 16 milliseconds.
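If the unexplained gap lines up with that token call, timing the token acquisition explicitly makes it easy to confirm. A minimal sketch, assuming the API happens to be Python and uses azure-identity for the managed identity token (the management scope shown is only illustrative):

    # Hypothetical helper: time the managed-identity token call so the gap in
    # AppInsights can be attributed (or ruled out). Create the credential once
    # and reuse it so cached tokens are used on subsequent requests.
    import time
    import logging
    from azure.identity import ManagedIdentityCredential

    credential = ManagedIdentityCredential()

    def get_arm_token() -> str:
        started = time.perf_counter()
        token = credential.get_token("https://management.azure.com/.default")
        logging.info("MSI token acquired in %.0f ms",
                     (time.perf_counter() - started) * 1000)
        return token.token

Logging that duration alongside the downstream PUT call's duration shows which leg actually fills the 16 seconds.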

Related

Azure Functions scalability issue

I am using Azure Functions on the App Service Plan. My understanding is that for every new execution the Azure Function will create a new App Service, execute the function, and then shut down the App Service. There would be nothing shared between the multiple App Services that are spawned by multiple requests.
However, when I test my Function (which does video processing), one request takes around 2-3 minutes, but with multiple simultaneous requests the time increases to 10-15 minutes. My questions are: is my understanding above correct? If not, what resource is shared amongst these App Services? And how should I decide between my scaling options (manual vs auto)?
"My understanding is for every new execution the Azure Function will create a new App Service" Nope it will not run new instance each time. Generally if there is no load on AF it will stop all instances.
Then, when the first request/event comes in, it starts the first instance; this is why we have cold starts in serverless. After that, the scale controller measures the instance's memory and CPU consumption and decides whether to scale, but that is not instant. So if you send N requests to process video, they can all land on that first instance and increase its load. The Function App will then scale out because of the CPU spike, but that does not help the earlier requests, since they are still handled by the first instance. Keep in mind that for non-HTTP triggers, new instances are allocated at most once every 30 seconds, which means your Function App needs a CPU spike of at least 30 seconds before a new instance is added: https://learn.microsoft.com/en-us/azure/azure-functions/event-driven-scaling
I am not sure Azure Functions are a good option for video processing. A function should be used for quick work, usually no more than 30 seconds I would say, and there are execution-time limits that depend on how you run it: https://learn.microsoft.com/en-us/azure/azure-functions/functions-premium-plan?tabs=portal
I am not sure what type of video processing you are doing, but I would have a look at Azure Media Services.
The other option, as you mentioned, is Batch jobs with low-priority VMs: https://azure.microsoft.com/en-au/blog/announcing-public-preview-of-azure-batch-low-priority-vms/ - it is actually a good fit for the use case you have: media processing and transcoding, rendering, and so on.
A small addition to Vova's answer: if you're running your Function in an App Service (also known as a Dedicated Plan), it will by default only scale instances within the possibilities of the App Service Plan you defined. This means that all of the instances of your Function App run on the same virtual machine. That is most probably the reason you're seeing increasing request times with more requests.
If you want your Functions to scale beyond the capabilities of that plan, you will need to manually scale or enable autoscaling for the App Service plan.
An App Service plan defines a set of compute resources for an app to run. These compute resources are analogous to the server farm in conventional hosting.
and
Using an App Service plan, you can manually scale out by adding more VM instances. You can also enable autoscale, though autoscale will be slower than the elastic scale of the Premium plan. [...] You can also scale up by choosing a different App Service plan.
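For the manual scale-out described above, here is a minimal sketch using the azure-mgmt-web Python SDK (assuming Python tooling is acceptable; the subscription, resource group, and plan names are placeholders):

    # Scale out the plan that hosts the Function App by raising its instance count.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.web import WebSiteManagementClient

    client = WebSiteManagementClient(DefaultAzureCredential(), "<subscription-id>")

    plan = client.app_service_plans.get("<resource-group>", "<plan-name>")
    plan.sku.capacity = 3  # e.g. go from 1 to 3 instances
    client.app_service_plans.begin_create_or_update(
        "<resource-group>", "<plan-name>", plan).result()

All Function Apps and Web Apps on that plan then share the additional instances.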
If you run your Function App on the Consumption Plan (the true serverless hosting option, since it enables scaling to zero):
The Consumption plan scales automatically, even during periods of high load.
If you need longer execution times than the Consumption Plan allows, but the App Service Plan doesn't seem to be the best hosting environment for your Functions, there's also the Premium Plan.
The Azure Functions Elastic Premium plan is a dynamic scale hosting option for function apps.
Premium plan hosting provides the following benefits to your functions:
Avoid cold starts with perpetually warm instances.
Virtual network connectivity.
Unlimited execution duration, with 60 minutes guaranteed.
Premium instance sizes: one core, two core, and four core instances.
More predictable pricing, compared with the Consumption plan.
High-density app allocation for plans with multiple function apps.
More info on all the different Azure Functions hosting options.

I am seeing 502 errors reported in Diagnose and Solve for my Azure App Service

Within the Web App Down page in Diagnose and Solve for my Azure App Service I am seeing a series of 502 errors that have been occurring consistently for the past few hours. My site is unreachable upon browsing. I have tried restarting the app, and this has not helped. There have been no recent code deployments or configuration changes that led to this error.
Looking at the Microsoft Documentation I see:
https://learn.microsoft.com/en-us/azure/application-gateway/application-gateway-troubleshooting-502#cause-1
This seems to be a connectivity issue with the backend address pool sitting behind a gateway, which should be managed by Azure.
As you said, 502s generally indicate being unable to connect to back-end instances.
A solution can be to scale your app service plan up or down while staying within the same tier (i.e. standard vs premium), so that your inbound virtual IP does not change, wait ~5 minutes, and then scale back.
Examples: S1 -> S2 or P2v2 -> P1v2
This operation, also referred to as the "scaling trick", allocates new instances for the app service plan hosting your web apps as well as a new internal load balancer.
In the event that there is a process hang-up caused by another resource running on the same hardware hosting your instance(s) and your site, this is the most efficient way to move your site to a new instance. Essentially, this functions as a hard reset beyond the capabilities of the traditional restart.
Lastly, because Azure bills by the hour and this temporary scale lasts only about 5 minutes, you will face either negligible cost or no cost at all, even if you have to scale up to stay within the same app service plan tier (i.e. standard vs premium).
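If you would rather script the scaling trick than click through the portal, a hedged sketch with the azure-mgmt-web Python SDK (the resource names are placeholders) could look like this:

    # Scale up within the same tier, wait ~5 minutes, then scale back.
    import time
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.web import WebSiteManagementClient

    client = WebSiteManagementClient(DefaultAzureCredential(), "<subscription-id>")

    def set_sku(sku_name: str) -> None:
        plan = client.app_service_plans.get("<resource-group>", "<plan-name>")
        plan.sku.name = sku_name
        plan.sku.size = sku_name
        client.app_service_plans.begin_create_or_update(
            "<resource-group>", "<plan-name>", plan).result()

    set_sku("S2")      # temporary scale up, staying in the Standard tier
    time.sleep(300)    # give the new instances and load balancer time to come up
    set_sku("S1")      # scale back to the original size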
For future reference, to prevent this issue from recurring: if you have multiple instances running for your app, consider enabling the health check feature: https://learn.microsoft.com/en-us/azure/azure-monitor/platform/autoscale-get-started#route-traffic-to-healthy-instances-app-service
You can find other best practices here: https://azure.github.io/AppService/2020/05/15/Robust-Apps-for-the-cloud.html

Azure Function Apps vs Azure App Services for compute-intensive work

I have 2 questions: the first related to hosting, the second related to which SDK/library to use.
I need to write a kind of work-allocation scheduler that assigns work to people. It will run, say, every hour, execute compute-intensive logic in the background, and push the results to our database. The input may be the number of days to create a schedule for, the number of people available, and the count of tasks to be done. So it is primarily compute intensive.
Should I host it in an App Service or in an Azure Function (TimerTrigger)? This scheduler runs purely as a background job and is never called from the UI or any backend API.
If I go the App Service way, I have the choice of either Hangfire or a WebJob. How should I decide which is better for me?
Certainly, quick execution at a lower cost is my criterion for moving ahead.
One consideration for Azure function is how long the processing will take. Azure functions have a maximum time limit that depends on hosting plan. When you create a function app in Azure, you must choose a hosting plan for your app. There are three hosting plans available for Azure Functions: Consumption plan, Premium plan, and Dedicated (App Service) plan. An overview of hosting plans and their timeout durations is here: Azure Functions scale and hosting.
Unlimited duration is available in the Premium and Dedicated plans (unlimited execution duration, with 60 minutes guaranteed).
Maximum duration for Consumption plan is 10 minutes.
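For the TimerTrigger route mentioned in the question, a minimal sketch using the Python v2 programming model (the schedule, function name, and placeholder logic are illustrative):

    # Runs at the top of every hour (NCRONTAB: second minute hour day month day-of-week).
    import logging
    import azure.functions as func

    app = func.FunctionApp()

    @app.timer_trigger(schedule="0 0 * * * *", arg_name="timer", run_on_startup=False)
    def build_work_schedule(timer: func.TimerRequest) -> None:
        logging.info("Starting work-allocation run")
        # ... compute-intensive allocation logic here ...
        # ... then push the results to the database ...

Just keep the limits above in mind: on the Consumption plan this run must finish within 10 minutes; otherwise the Premium or Dedicated plan (or a WebJob) is the safer home for it.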

Azure WebJob sporadically fails with OutOfMemoryException

I am trying to figure out the reason why an Azure WebJob sporadically fails with OutOfMemoryException.
There are 3 App Services assigned to the same Standard 3 App Service plan (4 cores, 7GB RAM). One of them consists of a set of static HTML, CSS and JS files. Another App service is an ASP.NET MVC application. The last one is a set of WebJobs (1 continuous and 3 triggered).
One of the triggered WebJobs is an import which runs at night when there is no traffic from users. Sometimes the import fails with an OutOfMemoryException. According to the Azure metrics graphs, RAM utilization of the Service Plan never exceeds 50%, with an average of about 40%. The App Service hosting the job uses up to 1.3 GB of RAM.
My question is: how can an OutOfMemoryException occur when the graphs show no evidence of critical RAM usage?
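One thing the plan-level graphs cannot tell you is what the WebJob process itself sees: the metrics aggregate all apps on the plan, and a process can hit its own limit (for example, a 32-bit worker process is capped well below the plan's 7 GB) without the plan looking busy. A small diagnostic sketch, assuming the WebJob can run a Python script and that psutil is an acceptable dependency:

    # Log the job's own bitness and working set at key points of the import,
    # so it can be compared against the plan-level RAM graphs.
    import sys
    import psutil

    def log_memory(stage: str) -> None:
        rss_mb = psutil.Process().memory_info().rss / (1024 * 1024)
        bitness = 64 if sys.maxsize > 2**32 else 32
        print(f"[{stage}] {bitness}-bit process, working set ~{rss_mb:.0f} MB")

    log_memory("before batch")
    # ... run one batch of the import ...
    log_memory("after batch")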

How to change basic to standard tier in Azure

My app is deployed in Azure on the Basic tier, which has 10GB of space. Now it is showing a usage warning on the server, so I want to change the scale from Basic to Standard. Which instance size should I choose (Small - 1 core, Medium - 2 cores, or Large - 4 cores)? Also, while saving, the following notifications are shown:
In Standard mode, if a web app is stopped, billing continues, and changing the scaling for an app affects other apps. Are you sure you want to continue?
This will scale the following web apps in the East US 2 region. This can take several minutes to complete. Your web apps will keep running during the process.
Please help.
To answer your question, here is a table of App Service sizes, in which you can see that the Standard tier has 50GB and the Premium tier has 500GB of disk space.
To answer your other questions:
The reality is that you pay for the App Service Plan, and each plan can host dozens of Apps. Think of it as a platform that runs all the time and hosts your Apps: if you stop one App, the platform is still running (because you might have other Apps on it), and thus you are still charged for it.
As I said, because what you pay for is the App Service Plan, scaling the Plan will automatically scale all the Apps contained in it; that's the reason for the second message.
Think of the App Service Plan as a server on which you run your Apps: whether the Apps in it are running or stopped, the Plan keeps charging until you delete the Plan itself.
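To make the plan-to-app relationship concrete, here is a small sketch with the azure-mgmt-web Python SDK (the names are placeholders) that prints the plan's current SKU and every app that would be affected by scaling it:

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.web import WebSiteManagementClient

    client = WebSiteManagementClient(DefaultAzureCredential(), "<subscription-id>")

    # The plan is what you pay for; every app listed below shares it and scales with it.
    plan = client.app_service_plans.get("<resource-group>", "<plan-name>")
    print(f"Plan SKU: {plan.sku.name} ({plan.sku.tier}), instances: {plan.sku.capacity}")

    for site in client.app_service_plans.list_web_apps("<resource-group>", "<plan-name>"):
        print(f"  hosts: {site.name} ({site.state})")

For reference, the Standard sizes S1/S2/S3 correspond to the Small (1 core), Medium (2 cores), and Large (4 cores) options shown in the portal.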
