Currently my App Service in Azure performs well in staging environments, but in production I see unusual spikes in application response times, and in a few scenarios the App Service has to be restarted.
I am trying to analyse this issue and tried to generate a thread dump using Kudu Lite, but the container crashes when we try this. I am currently working with Microsoft on that.
Meanwhile, what are the best practices or approaches to understand this? I have dug into the Application Insights logs, but there was not much information about hung worker threads or whether the thread pool is exhausted.
Please advise how to analyse this situation and get to the bottom of the problem.
Thanks in advance!
In your App Service plan, open "Diagnose and solve problems" and analyze the CPU usage of your app on all instances, including the breakdown of usage for all apps on your server. Check the CPU utilization on each instance serving your app, and identify the app and the corresponding process causing high CPU (shown as a percentage). Also check the guidance on troubleshooting performance degradation of your service.
You can use the Kudu console to download a diagnostic dump: Kudu -> Tools -> Diagnostic dump.
Once you download the diagnostic dump, you have log files and a deployment directory.
You can check the log files for details of the spike.
Refer here for more information
I am trying to track down when our frontend started to run so slowly. Recently I created new app services within the same App Service plan.
So now I have six apps (2 frontend, 4 backend) running under the same App Service plan on the Basic pricing tier. We also use Kudu for deployments.
Could that be the reason? Or how can I look for the reason?
This is the overview of that service plan.
I would appreciate any ideas and suggestions.
@user122222 This is a high CPU issue and not a slow request issue as others have pointed out.
An immediate action you can take is to scale up. If you are using a B1 instance in the Basic tier, try scaling up to a B3, which will give you more CPU cores and RAM. See if that provides relief. If so, you likely need to remain at this instance level. At this point it would also be worthwhile to analyze your number of requests. You should scale up when you are running many sites or resource-intensive sites, and you should scale out when you are receiving a high number of requests.
My money is on an issue in your code that is causing a deadlock or something similar. Your CPU usage graph is stuck at 100% over many hours. Even an overloaded App Service plan (ASP) will show a few dips over the course of a few hours.
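To make the "stuck at 100% with no dips" observation concrete, here is a minimal sketch that measures the longest unbroken run of pegged CPU samples. It assumes you have exported CPU percentage samples at a fixed interval (for example, from the plan's metrics chart). The function names, the 95% threshold, and the sample data are all hypothetical choices for illustration.

```python
from typing import Sequence


def longest_pegged_run(samples: Sequence[float], threshold: float = 95.0) -> int:
    """Return the length of the longest unbroken run of samples >= threshold."""
    longest = 0
    current = 0
    for value in samples:
        if value >= threshold:
            current += 1
            longest = max(longest, current)
        else:
            # Any dip below the threshold ends the current run.
            current = 0
    return longest


def looks_stuck(samples: Sequence[float], min_run: int, threshold: float = 95.0) -> bool:
    """True if the CPU stayed pegged for at least min_run consecutive samples."""
    return longest_pegged_run(samples, threshold) >= min_run


# Hypothetical 5-minute samples: a busy but healthy plan still dips now and then.
busy = [100, 100, 60, 100, 98, 40, 100]
# Hypothetical samples from a plan that never dips (40 samples = 200 minutes).
stuck = [100] * 40

print(longest_pegged_run(busy))      # 2
print(looks_stuck(busy, min_run=36))   # False
print(looks_stuck(stuck, min_run=36))  # True
```

A long unbroken run points toward a runaway process or spin loop rather than ordinary load, which is the distinction the paragraph above is drawing.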
To troubleshoot high CPU usage, start with the Diagnose and solve problems blade in your App Service plan. This is the same troubleshooting tool that a support engineer would use in a paid technical support case. Use it to troubleshoot high CPU rather than slow requests: based on your screenshot, the CPU appears to be the cause of the slow requests.
This can tell you which app in the ASP is causing the issue, and sometimes even which process in that app is responsible. Beyond this, I'd suggest creating and analyzing a memory dump of the problematic web app. More steps on how to do that here.
Please try to restart the worker instance.
https://learn.microsoft.com/en-us/rest/api/appservice/app-service-plans/reboot-worker#code-try-0
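As a rough sketch of what that REST call targets, the snippet below only builds the request URL for the Reboot Worker operation (a POST against the App Service plan's worker). The path shape follows my understanding of the linked reference page, so please verify it there. The function name and all argument values are hypothetical placeholders, and the API version is left as a required argument because it should be copied from the documentation rather than guessed.

```python
from urllib.parse import quote


def reboot_worker_url(subscription_id: str, resource_group: str,
                      plan_name: str, worker_name: str, api_version: str) -> str:
    """Build the management URL for rebooting one worker of an App Service plan.

    Assumption: the path layout matches the linked Reboot Worker reference.
    The request itself would be an authenticated POST to this URL.
    """
    path = (
        f"/subscriptions/{quote(subscription_id, safe='')}"
        f"/resourceGroups/{quote(resource_group, safe='')}"
        f"/providers/Microsoft.Web/serverfarms/{quote(plan_name, safe='')}"
        f"/workers/{quote(worker_name, safe='')}/reboot"
    )
    return f"https://management.azure.com{path}?api-version={quote(api_version, safe='')}"


# Placeholder values only; substitute your own IDs and the documented API version.
print(reboot_worker_url("my-subscription-id", "my-resource-group",
                        "my-plan", "my-worker", "YYYY-MM-DD"))
```

The snippet deliberately stops at building the URL. Sending the request needs a bearer token, and the "Try it" button on the linked page handles that for you.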
Right now my website is slow, and when I look at xxxx-cd-hp it looks like the picture below. CPU: 90, 36%. Is this still normal?
Apparently the CPU percentage increases at certain times, maybe because many users are accessing the site.
How can I solve this problem?
CPU time, or process time, indicates how much processing time on the CPU a process has used since it started. CPU percentage = process time / total CPU time * 100.
Suppose the process has been running for 5 hours and its CPU time is 5 hours on a single-core machine. That means the process has been using 100% of the CPU's resources. This may be either a good or a bad thing, depending on whether you want to keep resource consumption low or use the full power of the system.
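As a small worked example of that formula, here is a sketch that computes the average CPU percentage over a process's lifetime. It assumes that "total CPU time" means elapsed wall-clock time multiplied by the number of cores. That reading is mine, as is the function name.

```python
def cpu_percentage(process_cpu_seconds: float, elapsed_seconds: float, cores: int = 1) -> float:
    """Average CPU utilization of a process over its lifetime, as a percentage.

    Assumes total available CPU time = elapsed wall-clock time * number of cores.
    """
    if elapsed_seconds <= 0 or cores < 1:
        raise ValueError("elapsed_seconds must be positive and cores must be >= 1")
    total_cpu_seconds = elapsed_seconds * cores
    return process_cpu_seconds / total_cpu_seconds * 100.0


five_hours = 5 * 3600  # seconds

# 5 hours of CPU time over 5 hours on one core: fully utilized.
print(cpu_percentage(five_hours, five_hours, cores=1))  # 100.0

# The same CPU time on a 4-core machine is a quarter of the capacity.
print(cpu_percentage(five_hours, five_hours, cores=4))  # 25.0
```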
App Service Diagnostics is an intelligent and interactive experience to help you troubleshoot your app with no configuration required. When you run into issues with your app, App Service Diagnostics points out what’s wrong to guide you to the right information to more easily troubleshoot and resolve issues. To access App Service diagnostics, navigate to your App Service web app in the Azure portal. In the left navigation, click on Diagnose and solve problems.
I deployed an Azure web app back in July and it's been running flawlessly up until about three weeks ago. At that time, I would notice my CPU utilization constantly between 80% to 100%, with no corresponding increase in traffic. The first time I saw this, after concluding it wasn't my app, or increased traffic, causing this, I restarted the web app service and the CPU utilization returned to its normal 5% to 15%. Then after a couple days it started to do it again. And, again, a restart solved the issue.
My question is this. Is this normal to have to restart the web service every day or so? And, if so, why?
Assuming no changes have been made to your code and you have not seen a corresponding increase in traffic, it is not normal. An Azure Web App with no app deployed should almost always stay at 0% CPU utilization. I say "almost always" because Microsoft does run diagnostic and monitoring tools in the background that can cause some very temporary spikes. See here for a thread on that particular issue.
My recommendations are:
1. When the CPU pegs and stays pegged, log into your SCM site. Check the Process Explorer and confirm that it's your w3wp.exe that has pegged the CPU. (Note that there's a separate w3wp.exe for your SCM site.)
2. Ensure that you don't have any Site Extensions or WebJobs that are losing their mind. You can check your installed Site Extensions on the SCM site under the Site Extensions -> Installed tab. Any WebJobs will show up in your SCM Process Explorer as separate processes from step #1.
3. Log into the Azure Portal and browse to your Web App's management blade. Go to the Diagnose and Solve Problems blade. From here, you can try "Metrics per Instance" and go through all of the Perf Counters to see if they give you a clue as to what's wrong. For example, I had SignalR go nuts once and only found it by seeing that my thread count was out of control.
4. On the Diagnose and Solve Problems blade, you can also check Application Events.
5. You may have some light shed on this by installing Application Insights on your web application. It has a free tier that will likely have enough space to troubleshoot for a few days. If this is something going bananas with your code, you may get some insight here.
6. I'm including failed request tracing logs here for completeness. But these would likely show up in Application Insights.
If you've exhausted all of these possibilities, file a support ticket with Microsoft. As the above link shows, they have access to diagnostic tools that we don't and can eliminate the possibility of a runaway diagnostics or infrastructure process. I don't know how much help they can be if the CPU spike is due to your own w3wp.exe that's spiking the CPU.
Of course, if your app is seriously easy to redeploy and it's not a ridiculous hassle, you can just re-provision it and see if you see the same behavior.
I currently have 4 websites hosted in an S2 hosting plan and this evening received a CPU percentage alert. I went to the management portal and checked all of the sites hosted in the plan, but found no reason for it to be so high. After checking site by site and finding no evidence of what could be causing the problem, I stopped every site. Much to my surprise, the CPU usage did not drop and has stayed at a staggering 50% for the last 30 minutes. Is there any way to find out what is causing this? Do you have any idea whether it could be a bug in the Azure sites service?
Thanks in advance.
A couple of things to check for:
- Do you have any WebJobs on that system? They also consume resources but don't show up in all reports.
- You can also check the Kudu Process Monitor to see if there are any other processes running (maybe you've been hacked and someone is running something on your box?). If you've never used the Kudu tool, it is quite handy. To get to it in your browser, put '.scm' after the site name in your URL. For example, if your site is
'mysite.azurewebsites.net'
the Kudu tools are at
'mysite.scm.azurewebsites.net'
There is a process explorer in there where you can see which processes are running under your account.
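The '.scm' rule above can be sketched as a tiny helper. The function name is hypothetical, and it assumes the site uses the default azurewebsites.net host name, as in the example.

```python
def kudu_host(site_host: str) -> str:
    """Derive the Kudu (SCM) host name by inserting '.scm' after the site name.

    Assumes the default '<sitename>.azurewebsites.net' host name.
    """
    suffix = ".azurewebsites.net"
    if not site_host.endswith(suffix) or site_host == suffix:
        raise ValueError(f"expected a host like 'mysite{suffix}', got {site_host!r}")
    site_name = site_host[: -len(suffix)]
    return f"{site_name}.scm{suffix}"


print(kudu_host("mysite.azurewebsites.net"))  # mysite.scm.azurewebsites.net
```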
I am relatively new to Azure. I have a website that has been running for a couple of months with not too much traffic. When users are on the system, the various dashboard monitors go up, and they flatline the rest of the time. This week, the CPU time went way up when there were no requests and no data going in or out of the site. Is there a way to determine the cause of this CPU activity when the site is not active? It doesn't make sense to me that CPU activity is being assigned to my site when there is no site activity.
If your website has significant processing at application start, it is possible your VM got rebooted or your app pool recycled and your onstart handler got executed again (which would cause CPU to spike without any request).
You can analyze this by adding application logs to your Application_Start event (but after initializing trace). There is another comment detailing how to enable logging, but you can also consult this link.
You need to collect data to understand what's going on. So first thing I would say is:
1. Go to Azure management portal -> your website (assuming you are using Azure websites) -> dashboard -> operation logs. Try to see whether there is any suspicious activity going on.
2. Download the logs for your site using any FTP client and analyze what's happening. If there is not much data, I would suggest adding more logging in your application to see what is happening or which module is spinning.
A great way to detect CPU spikes, and even to find slow-running areas of your application, is to use a profiler like New Relic. It's a free add-on for Azure that collects data and presents it in a dashboard. You might find it useful for determining the exact cause of the CPU spike.
We regularly use it to monitor the performance of our applications. I would recommend it.