We have hosted application on IIS 8,it becomes slower over a period of time.When app pool recycles after 29 hours it is back to normal performance but we are not getting outofmemory exception.
application getting slower over a period and is back to normal performance on default app pool recycle
How slow is it ? Don't back on manual interpretation here. Check IIS logs (Default Location: C:\inetpub\logs\logfiles\w3svc_website_ID) and look at the time-taken field. The time-taken will be in milliseconds and time field would be in UTC.
Based on the IIS logs, set up a FREB rule as per - https://blogs.msdn.microsoft.com/docast/2016/04/28/troubleshooting-iis-request-performance-slowness-issues-using-freb-tracing/
In order to analyze FREB log, open it in IE and click on compact view. Observe the time field on the right hand side. Once you observe a definite jump in time field, that's your culprit. if you do not observe any changes in the time field, then the last module is the possible culprit.
You can also capture manual hang dumps of the w3wp process using debug diag during the time of issue. Install debug diag from - https://www.microsoft.com/en-us/download/details.aspx?id=49924
Capture a manual hang dump during the time of issue. Search specifically for "Create a user dump file for a process" in this page and follow the steps mentioned under it - https://support.microsoft.com/en-us/help/919792/how-to-use-the-debug-diagnostics-tool-to-troubleshoot-a-process-that-h
Once you have the required dump file, run debug diag analysis engine on this and look out for the results. Let me know if you need help in dump analysis
Related
We have a long running ASP.NET WebApp in Azure which has no real endpoints exposed – it serves a single functional purpose primarily reading and manipulating database data, effectively a batched, scheduled task, triggered by a timer every 30 seconds.
The app runs fine most of the time but we are seeing occasional issues where the CPU load for the app goes close to the maximum for the AppServicePlan, instantaneously rather than gradually, and stops executing any more timer triggers and we cannot find anything explicitly in the executing code to account for it (no signs of deadlocks etc. and all code paths have try/catch so there should be no unhandled exceptions). More often than not we see errors getting a connection to a database but it’s not clear if those are cause or symptoms.
Note, this is the only resource within the AppService Plan. The Azure SQL database is in the same region and whilst utilised by other apps is very lightly used by them and they also exhibit none of the issues seen by the problem app.
It feels like this is infrastructure related but we have been unable to find anything to explain what is happening so if anyone has any suggestions for where we should be looking they would be gratefully received. We have enabled basic Application Insights (not SDK) but other than seeing CPU load spike prior to loss of app response there is little information of interest given our limited knowledge of how to best utilise Insights.
According to your description, I thought of two points to troubleshoot your problem. First of all, you can track the running status of your program through the code, and put a log at the beginning and end of your batch scheduled tasks to record the status of each run. If possible, record request and response information and start and end information. This can completely record the time and running status of your task.
Secondly, you can record logs before the program starts database operations, and whether the database connection is successful. The best case is to be able to record, what business will trigger CPU load when operating, and track the specific operating conditions, in order to specifically analyze what causes the database connection failure.
Because you cannot reproduce your problem, you can only guess the cause of the problem. If you still can't find where the problem is through the above two points, then modify your timer appropriately, and let the program trigger once every 5 minutes instead of 30s.
UPDATE: I've figured it out. See the end of this question.
I have an Azure App Service running four sites. One of the sites has two deployment slots in addition to the primary one. Recently I've been seeing really high CPU utilization for the App Service plan as a whole.
The dark orange line shows the CPU percentage. This is just after restarting all my sites, which brought it down to this level.
However, when I look at the CPU use reported by each site, it's really low.
The darker blue line shows the CPU time, which is basically nothing. I did this for all of my sites, and all the graphs look the same. Basically, it seems that none of my sites are causing the issue.
A couple of the sites have web jobs, so I took a look at the logs but everything is running fine there. The jobs run for a few seconds every few hours.
So my question is: how can I determine the source of this CPU utilization? Any pointers would be greatly appreciated.
UPDATE: Thanks to the replies below, I was able to get more detail into what was happening. I ended up getting what I needed from SCM / Kudu tools. You can get here by going to your web app in Azure and choosing Advanced Tools from the side nav. From the Kudu dashboard, choose Process Explorer. The value in the Total CPU Time column is not directly useful, because it's the time in seconds that the process has run since it started, which might have been minutes or days ago.
However, if you make a record of the value at intervals, you can look at the change over time, and one process might jump out at you. In my case, it was my WebJobs process. Every 60 seconds, this one process was consuming about 10 seconds of processor time, just within one environment.
The great thing about this Kudu dashboard is, if you can catch the problem while it is actually happening, you can hit the Start Profiling button and capture a diagnostic session. You can then open this up in Visual Studio and get some nice details about where the CPU time is being spent.
Just in case anyone else is seeing similar issues, I'll provide more details about my particular case. As I mentioned, my WebJobs exe was the culprit, and I found that all the CPU time was being spent in StackExchange.Redis.SocketManager, which manages connections to Azure Redis Cache. In my main web app, I create only one connection, as recommended. But Since my web jobs only run every once in a while, I was creating a new connection to Azure Redis Cache each time one ran, which apparently can lead to issues. I changed my code to create the Redis Cache connection once when the WebJob process starts up and use the existing connection when any individual WebJob runs.
Time will tell if this really fixes the issue, but I think it will. When the problem occurred, it always fit the same pattern: After a few days of running fine, my CPU would slowly ramp up over the course of about 12 hours. My thinking is that each time a WebJob ran, it created a connection object, which at first didn't produce trouble, but gradually as WebJobs ran every hour or two, cruft was building up until finally some critical threshold was met and the CPU usage would take off.
Hope this helps someone out there. Best wishes!
May be you should go to webApp scm?
%yourAppName%.scm.azurewebsites.com;
There is a page, that can show you all process, that runned now on your web app. (something like Console > Process).
Also you can go to support page (from scm right corner).
You can find some more info about your performance there, and make memory dump (not for this problem, but it useful for performance issues).
According to your description, I assumed that you could leverage the Crash Diagnoser extension to capture dump files from your Web Apps and WebJobs when the CPUs usage percentage is higher than the specific threshold to isolate this issue. For more details, you could refer to this official blog.
I have a Azure Web Site running for 6 months and on Friday 1st April 2016 at 09:50pm the CPU was very high and this had a impact on the performance of the web site. Stopping and restarting the web service solved the problem but it came back at 13:00pm. Since then the CPU has stayed high and making the web site un-useable
I've tried all monitoring tools, Daas, Event Logs, checked for Open Connections and ensure my software is closing or disposing objects correctly.
But the CPU is still high. Only way to resolve is to restart the web service but I dont want to keep doing this.
Has anyone else experience a similar problem and what was the solutions.
The only thing from the event logs that look an issues is the odd "A network-related or instance-specific error occurred while establishing a connection to SQL Server", which could be because the SQL Aure is not available.
Please help
Hmmm, high cpu means that your web site is executing code, perhaps a wrong loop on some not frequent code path.
The brute force way to identify what code is being executed, would be to add tracing to your solution by System.Diagnostics.Trace.WriteLine("I am here") and then check the Azure Application Log.
Another way would be to attach the Visual Studio Debugger during high cpu and check what is being executed
The other way would be to take a dump or minidump from kudu site and analyze it with WinDbg:
1)What thread is conuming cpu:
!runaway
2) What is this thread doing:
!clrstack
hth,
Aldo
I am relatively new to Azure. I have a website that has been running for a couple of months with not too much traffic...when users are on the system, the various dashboard monitors go up and then flat line the rest of the time. This week, the CPU time when way up when there were no requests and data going in or out of the site. Is there a way to determine the cause of this CPU activity when the site is not active? It doesn't make sense to me that I should have CPU activity being assigned to my site when there is to site activity.
If your website has significant processing at application start, it is possible your VM got rebooted or your app pool recycled and your onstart handler got executed again (which would cause CPU to spike without any request).
You can analyze this by adding application logs to your Application_Start event (but after initializing trace). There is another comment detailing how to enable logging, but you can also consult this link.
You need to collect data to understand what's going on. So first thing I would say is:
1. Go to Azure management portal -> your website (assuming you are using Azure websites) -> dashboard -> operation logs. Try to see whether there is any suspicious activity going on.
download the logs for your site using any ftp client and analyze what's happening. If there is not much data, I would suggest adding more logging in your application to see what is happening or which module is spinning.
A great way to detect CPU spikes and even determine slow running areas of your application is to use a profiler like New Relic. It's a free add on for Azure that collects data and provides you with a dashboard of data. You might find it useful to determine the exact cause of the CPU spike.
We regularly use it to monitor the performance of our applications. I would recommend it.
We have a Windows Server 2003 web server, and on that server runs about 5-6 top level Sharepoint sites, with a different application pool for each one.
There is one W3WP process that keeps pegging 100% for most of the day (happened yesterday and today) and it's connected (found by doing "Cscript iisapp.vbs" at the command line and matching ProcessID) to a particular Sharepoint site...which is nearly unusable.
What kind of corrective action can I take? These are the following ideas I had
1) Stopping and restarting the Web Site in IIS - For some reason this didn't stop the offending W3WP process??? Any ideas why not?
2) Stopping and restarting the associated Application Pool.
3) Recycling the associated Application Pool.
Any of those sound like the right idea? If not what are some good things to try? I can't do an iisreset since I don't want to alter service to the other, much more heavily used, Sharepoint sites.
If I truly NEED to do some diagnostic work please point me in the right direction. I'm not the Sharepoint admin guy (he's out of town so I'm filling in even though I'm just a developer) but I'll do my best.
If you need any information just let me know and I'll look it up (slowly though, as that one process is pegging the entire machine).
It's not an IISReset that you need. You have a piece of code that is running amok with your memory. Most likely it's not actually a CPU problem but a paging problem. I've encountered this a few times with data structures in memory that grow too large to page in/out effectively and eventually the attempt to page data just begins consuming everything. The steps I would recommend are:
1) Go get the IIS Debug Diagnostics tools. And learn how to use them.
2) If possible, remove the session state from InProc to a state server or a sql server (since this requires serialization of all classes that go into session this may not be possible). This will help alleviate some process related memory issues.
3) Go to your application pool and adjust the number of worker processes upward. Remove Rapid fail protection (this will allow the site to continue serving pages even if rapid catastrophic errors occur).
The IIS debug diagnostics will record a LOT of data, but you can specify specific "catch" alerts that will detect hangs, excessive cpu usage etc. It will capture gigs of data, so be ready for a long wait when attempting to view the logs.
Turns out someone tried to install some features that went haywire.
So he wrote a stsadm script to uninstall those features
Processor was still pegging.
I restarted the IIS Application Pool for that IIS process, didn't fix it.
So then I restarted IIS for that site and that resolved the processor issue.