inexplicable high cpu percentage on azure hosting plan - azure

I currently have 4 websites hosted in an S2 hosting plan, and this evening I received a CPU percentage alert. I went to the management portal and checked every site in the hosting plan, but found no reason for the CPU to be so high. After checking site by site and finding no evidence of the cause, I stopped every site. Much to my surprise, the CPU usage did not drop; it has stayed at a staggering 50% for the last 30 minutes. Is there any way to find out what is causing this? Could it be a bug in the Azure sites service?
Thanks in advance.

A couple of things to check for:
- Do you have any WebJobs on that system? They also consume resources but don't show up in all reports.
You can also check the Kudu Process Monitor to see if any other processes are running (maybe you've been hacked and someone is running something on your box?). If you've never used the Kudu tool, it is quite handy. To get to it in your browser, put '.scm' after the site name in your URL. For example, if your site is
'mysite.azurewebsites.net'
the Kudu tools are at
'mysite.scm.azurewebsites.net'
There is a process explorer in there that shows which processes are running under your account.
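The URL rule above is mechanical enough to script. A minimal sketch, assuming the site uses a default hostname of the form 'name.azurewebsites.net' (it does not handle custom domains); the function name `scm_host` is made up for illustration:

```python
def scm_host(site_host: str) -> str:
    """Insert '.scm' after the site name (the first DNS label)."""
    name, sep, rest = site_host.partition(".")
    if not sep or not name:
        raise ValueError(
            f"expected a host like 'mysite.azurewebsites.net', got {site_host!r}"
        )
    return f"{name}.scm.{rest}"


print(scm_host("mysite.azurewebsites.net"))  # mysite.scm.azurewebsites.net
```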

Related

Web App Service - How often should it be restarted?

I deployed an Azure web app back in July and it ran flawlessly up until about three weeks ago. Since then, I have noticed my CPU utilization sitting constantly between 80% and 100%, with no corresponding increase in traffic. The first time I saw this, after concluding that neither my app nor increased traffic was causing it, I restarted the web app service and the CPU utilization returned to its normal 5% to 15%. Then after a couple of days it started again. And, again, a restart solved the issue.
My question is this: is it normal to have to restart the web service every day or so? And, if so, why?
Assuming no changes have been made to your code and you have not seen a corresponding increase in traffic, it is not normal. An Azure Web App with no app deployed should almost always stay at 0% CPU utilization. I say "almost always" because Microsoft does run diagnostic and monitoring tools in the background that can cause some very temporary spikes. See here for a thread on that particular issue.
My recommendations are:
1. When CPU pegs and stays pegged, log into your SCM site. Check the Process Explorer and confirm that it's your w3wp.exe that has pegged the CPU. (Note that there's a separate w3wp.exe for your SCM site.)
2. Ensure that you don't have any Site Extensions or WebJobs that are losing their minds. You can check your installed Site Extensions on the SCM site under the Site Extensions -> Installed tab. Any WebJobs will show up in your SCM Process Explorer as processes separate from the w3wp.exe in step #1.
3. Log into the Azure Portal and browse to your Web App's management blade. Go to the Diagnose and Solve Problems blade. From here, you can try "Metrics per Instance" and go through all of the Perf Counters to see if they give you a clue as to what's wrong. For example, I had SignalR go nuts once and only found it by seeing that my thread count was out of control.
4. On the Diagnose and Solve Problems blade, you can also check Application Events.
5. Installing Application Insights on your web application may shed some light on this. It has a free tier that will likely have enough space to troubleshoot for a few days. If this is something going bananas in your code, you may get some insight here.
6. I'm including failed request tracing logs here for completeness, but these would likely show up in Application Insights.
7. If you've exhausted all of these possibilities, file a support ticket with Microsoft. As the above link shows, they have access to diagnostic tools that we don't and can eliminate the possibility of a runaway diagnostics or infrastructure process. I don't know how much help they can be if it's your own w3wp.exe that's spiking the CPU.
Of course, if your app is seriously easy to redeploy and it's not a ridiculous hassle, you can just re-provision it and see if you see the same behavior.
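To make the first two checks concrete: if you note each process's cumulative CPU time twice, some seconds apart, the difference tells you which process is actually burning CPU. A rough sketch of that arithmetic follows. The snapshot shape, the process names, and the numbers are all invented for illustration; how you collect the figures (for example, by reading them off the Process Explorer by hand) is up to you.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot shape: pid -> (process name, cumulative CPU seconds).
Snapshot = Dict[int, Tuple[str, float]]


def cpu_share(before: Snapshot, after: Snapshot,
              interval_s: float) -> List[Tuple[int, str, float]]:
    """Return (pid, name, percent of one core) rows, busiest process first."""
    rows = []
    for pid, (name, cpu_after) in after.items():
        if pid not in before:
            continue  # process started mid-interval; no baseline to compare
        cpu_before = before[pid][1]
        rows.append((pid, name, 100.0 * (cpu_after - cpu_before) / interval_s))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Made-up readings taken 30 seconds apart.
before = {4120: ("w3wp.exe", 310.0), 5332: ("w3wp.exe (scm)", 12.0), 6100: ("MyWebJob.exe", 40.0)}
after = {4120: ("w3wp.exe", 311.5), 5332: ("w3wp.exe (scm)", 12.3), 6100: ("MyWebJob.exe", 67.0)}

for pid, name, pct in cpu_share(before, after, interval_s=30.0):
    print(f"{pid:>6} {name:<16} {pct:5.1f}%")
```

With these sample readings the hypothetical WebJob tops the list at about 90% of one core, while the site's own w3wp.exe is nearly idle, which is the situation step 2 is meant to catch.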

Website going in and out, no activities shown for past 3 days on Azure Web App

Our website (hosted on Azure) has been going in and out all morning: it works for 5 minutes, stops loading, then comes back again. Here's the error message I receive. I've tried restarting the site from Azure Web App a few times and the problem persists.
I've also tracked activity on the Azure dashboard, and nothing has been recorded for the last 3 days.
http://i60.tinypic.com/mie6v5.png
Please let me know how to fix this issue, thanks.
P.S.: We have a Standard subscription, and I'm thinking this might be due to the Service Bus - West US and Australia Southeast - Partial Service Interruption as reported on http://azure.microsoft.com/en-us/status/#current
This may be because you are using the Free tier, which limits CPU to 5 min per hour, or the Shared tier, whose limit is slightly higher. To keep your site up and responsive, enable Always On in the site's configure tab. You will need to scale your site to the Basic or Standard tier first.
I would also recommend using End Point Monitoring to verify your site is actively handling requests. You can set this in the configure tab as well.
Lastly, I recommend you use the /support tool in Kudu, as it provides a richer set of graphs to monitor site activity. To access it, type this URL in your browser: https://mysite.scm.azurewebsites.net/support, where mysite is the name of your site. Best of all, this tool can also analyze your site and help you troubleshoot issues.
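If you want an independent check alongside the portal's endpoint monitoring, a single probe is easy to script. This is only a sketch of the idea, not how Azure's monitoring works: the names `probe` and `ProbeResult` are made up here, and "healthy" is assumed to mean any status below 400.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, NamedTuple, Optional


class ProbeResult(NamedTuple):
    ok: bool
    status: Optional[int]  # None when no HTTP response was received
    seconds: float


def probe(url: str, timeout: float = 10.0,
          opener: Callable = urllib.request.urlopen) -> ProbeResult:
    """Issue one GET and report whether the site answered without an error status."""
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        status = err.code  # the server answered, but with a 4xx/5xx
    except (urllib.error.URLError, OSError):
        status = None  # DNS failure, refused connection, or timeout
    elapsed = time.monotonic() - start
    return ProbeResult(status is not None and status < 400, status, elapsed)


# Example call (not executed here): probe("https://mysite.azurewebsites.net/")
```

Running this from a machine outside Azure every minute or so and logging the result gives a record of exactly when the site stops answering, which helps when the dashboard shows nothing.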
Hope that is helpful.

Intermittent Microsoft Azure Web Site access failure

I have a number of small MVC apps deployed as Microsoft Windows Azure websites. This has been working for several months.
Yesterday I rolled out a new one, and the deployment was unremarkable, everything worked fine. But a couple of hours later, access to the site was unavailable. The symptoms were that when the browser tried to navigate to the URL for that site, it would try to load for several minutes and then just give up with a completely blank page.
I attempted to stop and restart the site, and it worked once, but the symptoms came back several minutes later. Then I tried to stop and restart, and it didn't work.
I deployed the identical app to three additional URLs. Again, they all work fine immediately after deployment but fail at some later point. They do not all seem to fail at once. Sometimes restarting the site will fix the problem, and sometimes not.
IMPORTANT: If I wait for some period of time, the site may start to work again on its own.
However, deploying four versions of the app so that our users can go to a backup one if the primary one is not working is not optimal.
Any words of wisdom as to how I might go about debugging this?
ADDITIONAL INFO NOV 25, 2013:
When sites are failing, the IIS logs show either 500 or 502 Internal Service Errors. Our own MVC code is never hit, not even app_start.
You can start by checking the logs and remote debugging
http://www.drdobbs.com/windows/azure-sdk-22-supports-visual-studio-2013/240163499
Are the apps working locally?
Might not be the same problem, but from time to time our Azure instances will get the blue question mark of death as a status.
The reason we found out was that Microsoft will do upgrades on instances from time to time. If you have just one instance in a cloud service/role, then from time to time they will do maintenance and during that time it will be dead.
I have confirmed this with their support.
The only way to get around this that I know of is to create two instances. Then Microsoft guarantees ~99% availability.
Of course I also confirmed with them that this means twice the cost. =/
If that's not the issue I would enable RDP and get onto the machine to see what the problem is. Microsoft has these tools to help debug problems: http://blogs.msdn.com/b/kwill/archive/2013/08/26/azuretools-the-diagnostic-utility-used-by-the-windows-azure-developer-support-team.aspx
First, you should always run multiple instances of your web role with more than 1 upgrade domain. This is configurable in the service definition (CSDEF). Without this, you don't get an SLA from Microsoft, so you can't really complain that the VMs go down.
Second, to figure out what might be going on with these boxes, you should have both logs (my preference is to roll my own with page blobs or table storage), AND you should always have RDP access to a pre-production environment (production as well if you're not too fussed about security). Once on the box, look through the event viewer for errors.
Third, when an outage occurs check out the azure service dashboard (http://www.windowsazure.com/en-us/support/service-dashboard/) for outages.
Lastly, contact Microsoft support. It may take a few hours, but they are pretty good.
Given that it is happening repeatedly and for extended periods of time (more than 5 minutes), I would bet there's something wrong with your hosted service. Again, RDP in and poke around. Good luck.
To debug your sites try to enable diagnostic logs:
http://www.windowsazure.com/en-us/develop/net/common-tasks/diagnostics-logging-and-instrumentation/
Another nice way to look around your site is using the debug console:
https://github.com/projectkudu/kudu/wiki/Kudu-console
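Once logging is on and you have downloaded the IIS logs, a quick tally of status codes shows how often the 500s and 502s occur. A minimal sketch, assuming the logs are in the W3C extended format with a `#Fields:` header that includes `sc-status`; the sample lines below are invented, and real logs carry more fields:

```python
from collections import Counter
from typing import Iterable


def status_counts(lines: Iterable[str]) -> Counter:
    """Count sc-status values in W3C extended log lines."""
    counts: Counter = Counter()
    status_idx = None
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names each column; find where sc-status sits.
            fields = line[len("#Fields:"):].split()
            status_idx = fields.index("sc-status") if "sc-status" in fields else None
            continue
        if not line or line.startswith("#") or status_idx is None:
            continue
        parts = line.split()
        if len(parts) > status_idx:
            counts[parts[status_idx]] += 1
    return counts


# Made-up sample lines for illustration only.
sample = """\
#Fields: date time cs-method cs-uri-stem sc-status time-taken
2013-11-25 10:00:01 GET / 200 120
2013-11-25 10:05:42 GET / 502 30015
2013-11-25 10:05:50 GET /home 500 15
2013-11-25 10:06:10 GET / 502 30002
""".splitlines()

counts = status_counts(sample)
print({status: n for status, n in counts.items() if status.startswith("5")})
```

To run it against a real file, pass an open file object, for example `status_counts(open(path))`, where `path` is wherever you saved the log.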

Determining Cause of Suspended Website on Windows Azure

I have a Website hosted on Windows Azure. This website is a custom ASP.NET MVC 4 site hosted as a shared web site instance. Within the past couple of days, I've started to get large spikes in CPU Time. These spikes have been sustained and have caused my web site to get suspended. However, I'm not sure how to determine the cause of these spikes. Here is what I've done so far:
I attempted to look at the diagnostics via the Windows Azure FTP drop. I did not see anything there.
I reviewed my Google Analytics to see if there was anything out of the ordinary. The site had 20 visitors yesterday. So nothing crazy.
How can I identify the culprit of the CPU spike? Once it spikes, it just sits there for hours. I'm not sure what would cause this.
Thank you
Have you tried running your site on your local box and simulating your visitor traffic, exercising all your website's features?
Testing locally is 1000's of times easier and more revealing than trying to debug a site that's running live.
If you still can't find anything wrong when running locally, consider adding logging and tracing at strategic points in your site so that you can see how often your site executes complex operations and how long they take.
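The site in question is ASP.NET MVC, so treat the following Python only as an illustration of the idea: wrap each suspect operation so that every call logs its duration and updates a running call count and total. The names `timed`, `stats`, and `build_report` are hypothetical.

```python
import functools
import logging
import time
from collections import defaultdict

log = logging.getLogger("timing")
stats = defaultdict(lambda: {"calls": 0, "total_s": 0.0})


def timed(func):
    """Log each call's duration and keep a running count and total per function."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            entry = stats[func.__name__]
            entry["calls"] += 1
            entry["total_s"] += elapsed
            log.info("%s took %.3f s (call #%d)", func.__name__, elapsed, entry["calls"])
    return wrapper


@timed
def build_report(n):  # hypothetical stand-in for a complex operation
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    build_report(100_000)
    build_report(200_000)
    print(dict(stats))
```

An operation whose call count or total time keeps climbing while traffic stays flat is a good first suspect for a sustained CPU spike.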

Web Site Availability in Windows Azure [duplicate]

Possible Duplicate:
App pool timeout for azure web sites
I am working on an asp.net mvc 4 app that is hosted in Windows Azure. This app will not have a lot of traffic as people will intermittently (once an hour) use it. I wanted to try using Windows Azure.
My app is currently set to use the FREE web site mode. I noticed that after 30 minutes, the site takes a long time (> 5 seconds) to load. After that initial load, it's fast. Then, if no one uses it for another 30 minutes, it takes > 5 seconds to load again.
I then tried upping the web site mode to a SHARED instance. I experienced the same problem there.
I then tried upping the web site mode to a RESERVED instance. The problem then goes away.
While I'd like to use Windows Azure, paying $50+ a month for a RESERVED instance is pretty expensive for a site that few have used up to this point. However, I can't have the initial lag. That will just deter the few users I have. You could say you get what you pay for. At the same time, I have a hard time believing others are experiencing this problem and not complaining. There has to be something I'm missing.
I figure the problem has to do with the application pool resetting. However, I can't seem to figure out a way around this. Is anyone familiar with this issue? Is there a way to fix it on a FREE or SHARED instance?
Thank you!
This is expected behavior based on how Windows Azure Web Sites work. The app pool they live in is spun up "on demand" and then hangs around for a time period.
For a detailed explanation (and a shameless plug), you can check out my article on this: http://www.simple-talk.com/dotnet/.net-framework/windows-azure-websites-%e2%80%93-a-new-hosting-model-for-windows-azure/
In summary:
Web Sites are hosted in a process on a farm of machines running IIS. If a site is idle for some time, the process is torn down automatically. Also, if the box is under a lot of pressure from the other sites on it, the idle timeout may come down quite a bit (even as low as five minutes). When the next call comes in, you'll see the process spun up again (likely on a completely different server). This is because you are in a shared environment (similar to how Heroku works). Once you move to Reserved, you are the ONLY person on that virtual machine, and if you suffer from noisy neighbor issues in processing, it's because of your own stuff.
There are ways to keep your site "up", such as having a job that pings the url frequently; however, given that the idle timeout is somewhat fluid it may not solve every case. You can check out a recent post by Sandrino on how to use Azure Mobile Services as a job scheduler: http://fabriccontroller.net/blog/posts/job-scheduling-in-windows-azure/ . There are also 3rd party services available that can do the ping for you automatically.
To be honest, the web sites are a great feature for quick development and test, or even relatively low traffic sites as you are talking about. If you need a high level of uptime and better performance then you'll want to look at Reserved, or another option if the cost isn't in line with expectations.
This isn't an Azure problem. It is a "feature" of any web site hosted in IIS. The default time-out for app pools is 20 minutes. Read about App Pool timeouts here - http://technet.microsoft.com/en-us/library/cc771956(v=ws.10).aspx - one method is to create a keep alive page and ping the page every 10 minutes or so.
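The keep-alive idea from the answers above can be sketched as a small loop. This is an assumption-laden illustration, not a recommended production job: the function names and the 600-second default (the "every 10 minutes or so" suggestion) are choices made here, and since the idle timeout on a shared box is reported above to drop as low as five minutes, a shorter interval may be needed.

```python
import time
import urllib.request
from typing import Callable, Optional


def ping(url: str) -> bool:
    """Return True if the URL answers with a status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            return resp.status < 400
    except OSError:  # covers URLError, HTTPError, and timeouts
        return False


def keep_alive(url: str, interval_s: float = 600.0,
               max_pings: Optional[int] = None,
               ping_fn: Callable[[str], bool] = ping,
               sleep_fn: Callable[[float], None] = time.sleep) -> int:
    """Ping url every interval_s seconds; return the number of failed pings."""
    failures = 0
    sent = 0
    while max_pings is None or sent < max_pings:
        if not ping_fn(url):
            failures += 1
        sent += 1
        sleep_fn(interval_s)
    return failures


# Example call (not executed here); the keep-alive page path is hypothetical:
# keep_alive("https://mysite.azurewebsites.net/keepalive", interval_s=240)
```

The `ping_fn` and `sleep_fn` parameters exist only so the loop can be exercised without real network calls or waiting; leave them at their defaults in normal use.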

Resources