Azure: can changes in system memory and CPU slow down app responses?

We had a problem on 24 February between 12:35 and 12:50 PM UTC. Our app started responding slowly, which led to failures, although Availability and Performance did not show the app as overloaded.
Could the change in system memory and CPU have slowed down our app's responses? If so, how can we avoid this problem next time?
Here is a screenshot of our app slowdown chart: Web App Slow
During this time we detected a change in physical memory and CPU: Memory Analysis, High CPU Analysis
The Availability and Performance tools detected the top 5 slow request executions. They are requests from our app service to external services. Could these requests affect the overall performance of the app service or the App Service plan?

Yes. Insufficient compute resources (an application using high memory or CPU) can lead to slow performance.
Other causes of performance issues include application-level problems, network requests taking a long time, inefficient application code or database queries, or the application crashing due to an exception. To isolate and avoid such issues in future, you may try these steps.
First, review the service health for any reported issues during that time frame:
You can track the health of the service on the Azure portal
From the screenshots you have shared, it looks like there were 2 5xx errors and 48 4xx errors. You can review the logs to get more details on the issue.
-Access Kudu (https://.scm.azurewebsites.net/) to analyze logs and collect diagnostic dumps as required.
Enable diagnostics logging for apps in Azure App Service
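As a rough illustration of what reviewing the logs can look like once web server logging is enabled, the sketch below counts responses by status class (2xx, 4xx, 5xx). It assumes the log lines are in W3C extended format with a `#Fields:` header that includes an `sc-status` column. The function name and the sample lines are made up for illustration and are not output from the app in question.

```python
from collections import Counter


def status_class_counts(lines):
    """Count HTTP responses per status class from W3C extended log lines."""
    fields = None
    counts = Counter()
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # The header names the columns used by the data lines that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#") or fields is None:
            continue  # other directives, or data before any header
        parts = line.split()
        try:
            status = parts[fields.index("sc-status")]
        except (ValueError, IndexError):
            continue  # no sc-status column, or a truncated line
        if status.isdigit():
            counts[status[0] + "xx"] += 1
    return counts


# Made-up sample lines, for illustration only.
sample = [
    "#Fields: date time cs-method cs-uri-stem sc-status time-taken",
    "2000-01-01 12:36:01 GET /api/orders 200 120",
    "2000-01-01 12:36:02 GET /api/orders 500 30015",
    "2000-01-01 12:36:03 GET /missing 404 15",
]
print(dict(status_class_counts(sample)))
```

Grouping by class first makes it easier to see whether the failures are mostly client-side (4xx) or server-side (5xx) before digging into individual requests.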
Typically, in Azure App Service, you can increase performance and throughput by adjusting the scale at which you run your application. I'm not sure which App Service plan (ASP) you're using.
If you have multiple apps under a single ASP, the compute resources are shared by all of those running apps. Based on your requirements and usage, you may consider either moving your App Service plan to a higher pricing tier (scaling up) or adding instances (scaling out).

Related

Azure Web App Working Memory Set vs App Service Plan Memory Usage

I am doing performance testing for my API hosted in Azure App Service. My API response time is increasing whenever there is a spike in the Working Memory Set graph. But my app service plan is showing 50-55% of memory with only one instance running.
Can you clarify why the API response time increases every time there is a spike in the 'Working Memory Set', even though my App Service plan memory is only around 50%?
Response Time Graph
Working Memory Set Graph
This is tough to answer without access to your site logs, but in general your API consumes resources when it's called and then rests again, similar to how your heart rate increases when you start running but should return to normal when you stop.
Is the response time not within your allowed time range? What is the experience from the customer side? Performance testing on cloud environments can be a slippery slope.
I would suggest using the built-in Diagnose and Solve blade of your web app, which has the same troubleshooting tools a support engineer would use to assist you in a paid technical support ticket. This should help tell you whether there are any issues with your site that might be impacting performance.
Also, please note that if you're running on the free or shared tier that perf testing is not really applicable as we do not support running production apps on those tiers.
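One way to check whether the spikes really line up with slow responses is to export both metrics at the same sampling interval and compute their correlation. The following is a minimal sketch, assuming you already have two equal-length lists of samples. The function name is hypothetical and it is plain Python, not an Azure API.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 samples")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)
```

A coefficient near 1 only shows that the two series move together. It does not show that memory causes the latency; both may simply rise with request volume.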

I am seeing 502 errors reported in Diagnose and Solve for my Azure App Service

Within the Web App Down page in Diagnose and Solve for my Azure App Service I am seeing a series of 502 errors that have been occurring consistently for the past few hours. My site is unreachable upon browsing. I have tried restarting the app, and this has not helped. There have been no recent code deployments or configuration changes that led to this error.
Looking at the Microsoft Documentation I see:
https://learn.microsoft.com/en-us/azure/application-gateway/application-gateway-troubleshooting-502#cause-1
This seems to be an issue with the connectivity to the back end address pool that is behind a gateway which should be managed by Azure.
As you said, 502s generally indicate being unable to connect to back-end instances.
One solution is to scale your App Service plan up or down while staying within the same tier (e.g. Standard or Premium), so that your inbound virtual IP does not change, wait ~5 minutes, and then scale back.
Examples: S1 -> S2 or P2v2 -> P1v2
This operation, also referred to as the "scaling trick", allocates both new instances to the app service plan hosting your web apps, as well as a new internal load balancer.
In the event that there is a process hang-up caused by another resource running on the same hardware hosting your instance(s) and your site, this is the most efficient way to move your site to a new instance. Essentially, this functions as a hard reset beyond the capabilities of the traditional restart.
Lastly, because Azure bills by the hour and this temporary scale lasts only 5 minutes, you will face either negligible cost or no cost at all, even if you need to scale up to remain in the same App Service plan tier (i.e. Standard vs Premium).
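The scale-and-scale-back steps described above can be sketched as a pair of Azure CLI invocations. This is only a sketch, assuming the Azure CLI is installed and logged in. `az appservice plan update --sku` is the CLI command for changing a plan's SKU. The plan and resource group names are placeholders, and the SKU-family check is a naming heuristic added for illustration, not an Azure rule. The code only builds the commands and does not run them.

```python
import re


def sku_family(sku):
    """Strip the size digit from a SKU name: 'S1' -> 'S', 'P2v2' -> 'Pv2'."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+)([A-Za-z]+\d+)?", sku)
    if not match:
        raise ValueError(f"unrecognised SKU name: {sku!r}")
    return match.group(1).upper() + (match.group(3) or "").lower()


def scaling_trick_commands(plan, resource_group, current_sku, temporary_sku):
    """Return two az commands: scale to the temporary SKU, then back again."""
    if current_sku == temporary_sku:
        raise ValueError("the temporary SKU must differ from the current one")
    if sku_family(current_sku) != sku_family(temporary_sku):
        # Heuristic: stay in the same family so the tier does not change.
        raise ValueError("both SKUs should belong to the same tier family")
    base = ["az", "appservice", "plan", "update",
            "--name", plan, "--resource-group", resource_group, "--sku"]
    return [base + [temporary_sku], base + [current_sku]]


# Placeholder names; wait roughly 5 minutes between the two commands.
for command in scaling_trick_commands("my-plan", "my-rg", "S1", "S2"):
    print(" ".join(command))
```

Building the commands as argument lists makes it easy to review them before passing each one to `subprocess.run` or pasting it into a shell.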
For future reference, to prevent this issue from recurring, if you have multiple instances running for your app, please consider enabling the Health check feature: https://learn.microsoft.com/en-us/azure/azure-monitor/platform/autoscale-get-started#route-traffic-to-healthy-instances-app-service
You can find other best practices here: https://azure.github.io/AppService/2020/05/15/Robust-Apps-for-the-cloud.html

High CPU usage was detected for the kudu app for Azure App service

I noticed that our app was experiencing high CPU usage. In the diagnostics I found the below message.
High CPU usage was detected for the kudu app for 'DemoApiApp'(39.1%) on only one instance out of 4 instances in your app service plan. The affected instance had a peak overall usage of 87.8% during this time. High CPU usage in the kudu process is most often caused by web job usage. Affected instance name: RD0003FF1C445A
Note that apps in the same App Service plan share the same compute resources. To determine whether the new app has the necessary resources, you need to understand the capacity of the existing App Service plan and the expected load for the new app. Overloading an App Service plan can potentially cause downtime for your new and existing apps. Refer to App Service limits for more details.
As specified in the documentation, isolate your app into a new App Service plan when:
-The app is resource-intensive.
-You want to scale the app independently from the other apps in the existing plan.
-The app needs resources in a different geographical region.
If your process is running slower than expected, or the latency of HTTP requests is higher than normal and the CPU usage of the process is also high, you can remotely profile your process and get the CPU sampling call stacks to analyze the process activity and code hot paths. Refer to Remote Profiling support in Azure App Service for more details.
Hope this helps.

Azure Long Response Time bottleneck?

How does one diagnose performance (response time) bottlenecks in Azure?
I've got a .NET Core website on Azure that consists of a web app service and one SQL database.
I've set up a load test and deployed it via the cloud to hit the website. Configuration properties for the load test agents are:
4 cores
start at 10 simultaneous users, + 10 every 20 seconds, up to 150 users
5 second think time between requests
Web app resource allocation is as follows
2 instances of
4 Core, 7GB RAM (S3 Standard)
This image shows the hardware utilization during the load test (2 tests shown, around 1pm and 1:30pm)
Seems reasonable, except my response times are, in my opinion, too slow, considering the hardware isn't stressed at all. For instance, at 10 users, my response time starts at 20ms, but at 150 users (at the end of the test), I'm seeing 5 second response times.
For the last portion of the test, my requests per second were at about 50.
Database performance, at 100DTUs, doesn't seem to be a factor:
What else can I do to diagnose slow response times? If the web server hardware isn't pegged, and the database isn't even sneezing, what other knobs can I turn on Azure?
Long response times can have many causes, for example bandwidth restrictions, limited resources, poor application design, or dependencies on tightly coupled components. For more information about how to troubleshoot slow web app performance, please refer to the document. Here are some snippets from it.
Enable diagnostics logging for your web app.
WebApp provides diagnostic functionality for logging information from both the web server and the web application.
We can enable Detailed Error Logging, Failed Request Tracing, and Web Server Logging for web server diagnostics.
Use Kudu Debug console (https://.scm.azurewebsites.net/)
Kudu provides environment settings for your application, the log stream, and diagnostic dumps.
We can also use Azure Application Insights to monitor the usage and performance of our app, which gives us more detailed information about requests, exceptions, response times, and so on.
The more detail we have about application exceptions, failed requests, server logs, and application logs, the easier it is to diagnose the problem.
There are also some related articles about how to diagnose a Web App and how to use Application Insights:
Enable diagnostics logging for web apps in Azure App Service
Monitor performance in web applications
Diagnose exceptions in your web apps with Application Insights
Using Search in Application Insights
Try using the New Relic extension.
It provides great insight into response times and a lot more with the free account.
You can also enable Application Insights on the web app. It will provide you with details on response times and other metrics.
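Before turning more knobs, it can help to sanity-check the load test figures from the question against a simple closed-loop model (Little's Law). This is only a sketch: it assumes each simulated user sends one request, waits for the response, then pauses for the think time. The function names are hypothetical.

```python
import math


def ramp_duration_s(start_users, step_users, step_interval_s, max_users):
    """Seconds needed for the step ramp to reach max_users."""
    steps = math.ceil((max_users - start_users) / step_users)
    return steps * step_interval_s


def closed_loop_rps(users, response_time_s, think_time_s):
    """Throughput when each user cycles through request, response, think."""
    return users / (response_time_s + think_time_s)


def requests_in_flight(rps, response_time_s):
    """Little's Law: average concurrency = arrival rate x time in system."""
    return rps * response_time_s


# Figures taken from the question above.
print(ramp_duration_s(10, 10, 20, 150))   # seconds until 150 users
print(closed_loop_rps(150, 5.0, 5.0))     # requests/s under this model
print(requests_in_flight(50, 5.0))        # concurrency implied by 50 requests/s
```

Under this model, 150 users with a 5 second response time and 5 second think time produce about 15 requests per second, while the question reports about 50. At 5 second responses, 50 requests per second would imply roughly 250 requests in flight. The thread does not explain the gap. It may mean each simulated user step issues several HTTP requests, which would be worth checking in the load test definition before concluding where the bottleneck is.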

How to change basic to standard tier in Azure

My app is deployed in Azure on the Basic tier, which has 10 GB of space. It is now showing a usage warning on the server, so I want to change the scale from Basic to Standard. Which instance size should I choose (Small: 1 core, Medium: 2 cores, Large: 4 cores)? Also, while saving, the following notifications are shown:
In Standard mode, if a web app is stopped, billing continues, and changing the scaling for an app affects other apps. Are you sure you want to continue?
This will scale the following web apps in the East US 2 region. This can take several minutes to complete. Your web apps will keep running during the process.
Please help.
To answer your question, here is a table with App Service sizes, in which you can see that the Standard tier has 50 GB and the Premium tier has 500 GB of disk space.
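As a small illustration of choosing a tier by disk space, the sketch below picks the smallest tier whose disk quota covers current usage plus some headroom. The quotas are the figures quoted in this thread (10 GB Basic, 50 GB Standard, 500 GB Premium) and may not match current Azure limits. The function name and the 20% headroom default are assumptions made for the example.

```python
# Disk quotas as quoted in this thread; check the current limits before relying on them.
QUOTED_DISK_GB = [("Basic", 10), ("Standard", 50), ("Premium", 500)]


def smallest_tier_for(used_gb, headroom=0.2):
    """Return the smallest quoted tier that fits used_gb plus headroom, or None."""
    required_gb = used_gb * (1 + headroom)
    for tier, disk_gb in QUOTED_DISK_GB:
        if required_gb <= disk_gb:
            return tier
    return None


# An app that has filled its 10 GB on Basic needs at least the next tier up.
print(smallest_tier_for(10))
```

Disk space only settles the tier. The instance size (Small, Medium, Large) is a separate choice that depends on how much CPU and memory the apps in the plan need.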
To answer your other questions:
The reality is that you pay for the App Service plan, and each plan can host dozens of apps. Think of it as a platform that runs all the time and hosts your apps: if you stop one app, the platform is still running (because you might have other apps running on it), and thus you are still charged for it.
As I said, because what you pay for is the App Service plan, scaling the plan automatically scales all the apps contained in it. That is the reason for the second message.
Think of the App Service plan as a server on which you run your apps. As long as the plan exists with at least one app in it (running or stopped), it keeps charging. Billing stops only once the plan itself is deleted, which by default happens when you delete the last app in it.

Resources