Web Site Availability in Windows Azure [duplicate] - azure

Possible Duplicate:
App pool timeout for azure web sites
I am working on an asp.net mvc 4 app that is hosted in Windows Azure. This app will not have a lot of traffic as people will intermittently (once an hour) use it. I wanted to try using Windows Azure.
My app is currently set to use the FREE web site mode. I noticed that after 30 minutes of inactivity, the site takes a long time (> 5 seconds) to load. After that initial load, it's fast. Then, if no one uses it for another 30 minutes, it takes > 5 seconds to load again.
I then tried upping the web site mode to a SHARED instance. I experienced the same problem there.
I then tried upping the web site mode to a RESERVED instance. The problem then goes away.
While I'd like to use Windows Azure, paying $50+ a month for a RESERVED instance is pretty expensive for a site that few have used up to this point. However, I can't have the initial lag. That will just deter the few users I have. You could say you get what you pay for. At the same time, I have a hard time believing others are experiencing this problem and not complaining. There has to be something I'm missing.
I figure the problem has to do with the application pool resetting. However, I can't seem to figure out a way around this. Is anyone familiar with this issue? Is there a way to fix it on a FREE or SHARED instance?
Thank you!

This is expected behavior based on how Windows Azure Web Sites work. The app pool they live in is spun up "on demand" and then hangs around for a time period.
For a detailed explanation (and shameless plug) you can check out my article on this: http://www.simple-talk.com/dotnet/.net-framework/windows-azure-websites-%e2%80%93-a-new-hosting-model-for-windows-azure/
In summary:
Web Sites are hosted in a process on a farm of machines running IIS. If a site is idle for some time, the process is torn down automatically. Also, if the box is under a lot of pressure from the other sites on it, the idle timeout may come down quite a bit (even as low as five minutes). When the next call comes in you'll see the process spun up again (likely on a completely different server). This is because you are in a shared environment (and is similar to how Heroku works). Once you move to reserved, you are the ONLY person on that virtual machine, and if you suffer from noisy neighbor issues in processing it's because of your own stuff.
There are ways to keep your site "up", such as having a job that pings the url frequently; however, given that the idle timeout is somewhat fluid it may not solve every case. You can check out a recent post by Sandrino on how to use Azure Mobile Services as a job scheduler: http://fabriccontroller.net/blog/posts/job-scheduling-in-windows-azure/ . There are also 3rd party services available that can do the ping for you automatically.
To be honest, the web sites are a great feature for quick development and test, or even relatively low traffic sites as you are talking about. If you need a high level of uptime and better performance then you'll want to look at Reserved, or another option if the cost isn't in line with expectations.

This isn't an Azure problem. It is a "feature" of any web site hosted in IIS. The default time-out for app pools is 20 minutes. Read about App Pool timeouts here - http://technet.microsoft.com/en-us/library/cc771956(v=ws.10).aspx - one method is to create a keep alive page and ping the page every 10 minutes or so.
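A minimal sketch of the keep-alive idea, assuming a hypothetical keep-alive page URL and a 10-minute interval. The names `fetch_status` and `keep_alive` are made up for illustration. As noted above, the idle timeout on a shared instance is somewhat fluid, so this may not prevent every cold start.

```python
import time
import urllib.request


def fetch_status(url, timeout=30):
    """Issue a real HTTP GET against the page and return the status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def keep_alive(url, interval_seconds, rounds, fetch=fetch_status, sleep=time.sleep):
    """Request `url` `rounds` times, waiting `interval_seconds` between requests.

    Returns one entry per request: the status code on success, or the
    exception if the request failed, so a failed ping never stops the loop.
    """
    outcomes = []
    for attempt in range(rounds):
        try:
            outcomes.append(fetch(url))
        except OSError as exc:  # urllib's URLError/HTTPError and timeouts are OSErrors
            outcomes.append(exc)
        if attempt < rounds - 1:
            sleep(interval_seconds)
    return outcomes
```

For example, `keep_alive("https://mysite.azurewebsites.net/keepalive", 600, 6)` would request the (hypothetical) page six times, ten minutes apart. In practice the loop would run from a scheduler or a third-party monitoring service rather than a long-lived script.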

Related

Diagnosing ASP.NET Azure WebApp issue

For the past month, one of our web applications hosted as a WebApp on Azure has been having some kind of problem, and I cannot find the root cause.
This WebApp is hosted on Azure on a 2 x B2 App Service Plan. On the same App Service Plan there is another WebApp that is currently working without any issue.
This WebApp is an ASP.NET WebApi application and exposes a set of REST APIs.
Effect: for no apparent reason (at least from what I know so far), the ThreadCount metric starts to climb, sometimes very slowly, sometimes within a few minutes. When that happens, no requests seem to be served and the service is dead.
Solution: a simple restart of the application (and this means a restart of the AppPool) causes an immediate drop in ThreadCount, and everything works as usual again.
Other observations: there is no "periodicity" in this event. It has happened in the evening, in the morning and in the afternoon. Evening seems to be the preferred timeframe, but I wouldn't say there is any correlation.
What I measured through Azure Monitoring Metrics:
- Request Count seems to oscillate normally. There is no peak that would cause the increase in ThreadCount.
- CPU and Memory seem to be normal, nothing strange.
- Response time looks normal, like the other metrics.
- Connections (which should be related to sockets) oscillate normally, so I'd rule out something related to DB connections.
What may I do in order to understand what's going on?
After a lot of research, this turned out to be caused by incorrect use of Dependency Injection (with Ninject) in an application that wasn't designed for it.
To diagnose it, I discovered a very helpful feature in Azure. You can reach it by opening the app that is having the problem, clicking "Diagnose and solve problems", then "Diagnostic tools", and then selecting "Collect .NET profiler report". In that panel, after configuring the storage for the diagnostic files, you can select "Add thread report".
From those reports you can easily understand what's going wrong.
Hope this helps.

inexplicable high cpu percentage on azure hosting plan

I currently have 4 websites hosted in an S2 hosting plan and this evening received a CPU percentage alert. I went to the management portal and checked all of the sites hosted in the plan, but found no reason for it to be so high. After checking site by site and finding no evidence of what could be causing the problem, I stopped every site. Much to my surprise, the CPU usage did not drop, and it has been at a staggering 50% for the last 30 minutes. Is there any way to find out what is causing this? Do you guys have any idea if it could be a bug in the Azure sites service?
Thanks in advance.
A couple of things to check for:
- do you have any webjobs on that system? They also consume resources but don't show up in all reports.
You can also check the Kudu Process Monitor to see if there are any other processes running (maybe you've been hacked and someone is running something on your box?) If you've never used the Kudu tool, it is quite handy - to get to it in your browser, put '.scm' after the sitename in your url. For example, if your site is
'mysite.azurewebsites.net'
the Kudu tools are at
'mysite.scm.azurewebsites.net'
There is a process explorer in there that shows which processes are running under your account.
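The host-name rule above is easy to get wrong by hand, so here is a small sketch of it. The helper name `kudu_host` is made up for illustration; it simply inserts `.scm` after the site name, as described.

```python
def kudu_host(site_host):
    """Return the Kudu host for a site host by inserting '.scm' after the site name.

    The site name is taken to be the first DNS label of the host.
    """
    name, dot, rest = site_host.partition(".")
    if not name or not dot or not rest:
        raise ValueError(f"expected a host like 'mysite.azurewebsites.net', got {site_host!r}")
    return f"{name}.scm.{rest}"


print(kudu_host("mysite.azurewebsites.net"))  # mysite.scm.azurewebsites.net
```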

Intermittent Microsoft Azure Web Site access failure

I have a number of small MVC apps deployed as Microsoft Windows Azure websites. This has been working for several months.
Yesterday I rolled out a new one, and the deployment was unremarkable, everything worked fine. But a couple of hours later, access to the site was unavailable. The symptoms were that when the browser tried to navigate to the URL for that site, it would try to load for several minutes and then just give up with a completely blank page.
I attempted to stop and restart the site, and it worked once, but the symptoms came back several minutes later. Then I tried to stop and restart, and it didn't work.
I deployed the identical app to three additional URLs. Again, immediately on deployment, they all work fine, however, they fail at some interval in the future. They seem to not all fail at once. Sometimes restarting the site will fix the problem, and sometimes not.
IMPORTANT: If I wait for some period of time, the site may start to work again on its own.
However, deploying four versions of the app so that our users can go to a backup one if the primary one is not working is not optimal.
Any words of wisdom as to how I might go about debugging this?
ADDITIONAL INFO NOV 25, 2013:
When sites are failing, the IIS logs show either 500 or 502 Internal Service Errors. Our own MVC code is never hit, not even app_start.
You can start by checking the logs and remote debugging
http://www.drdobbs.com/windows/azure-sdk-22-supports-visual-studio-2013/240163499
Are the apps working locally?
Might not be the same problem, but from time to time our Azure instances will get the blue question mark of death as a status.
The reason we found out was that Microsoft will do upgrades on instances from time to time. If you have just one instance in a cloud service/role, then from time to time they will do maintenance and during that time it will be dead.
I have confirmed this with their support.
The only way to get around this that I know of is to create two instances. Then Microsoft guarantees ~99% availability.
Of course I also confirmed with them that this means twice the cost. =/
If that's not the issue I would enable RDP and get onto the machine to see what the problem is. Microsoft has these tools to help debug problems: http://blogs.msdn.com/b/kwill/archive/2013/08/26/azuretools-the-diagnostic-utility-used-by-the-windows-azure-developer-support-team.aspx
First, you should always run multiple instances of your web role with more than 1 upgrade domain. This is configurable in the service definition (CSDEF). Without this, you don't get an SLA from Microsoft, so you can't really complain that the VMs go down.
Second, to figure out what might be going on with these boxes, you should have both logs (my preference is to roll my own with page blobs or table storage), AND you should always have RDP access to a pre-production environment (production as well if you're not too fussed about security). Once on the box, look through the event viewer for errors.
Third, when an outage occurs check out the azure service dashboard (http://www.windowsazure.com/en-us/support/service-dashboard/) for outages.
Lastly, contact Microsoft support. It may take a few hours, but they are pretty good.
Given that it is happening repeatedly and for extended periods of time (more than 5 minutes), I would bet there's something wrong with your hosted service. Again, RDP in and poke around. Good luck.
To debug your sites try to enable diagnostic logs:
http://www.windowsazure.com/en-us/develop/net/common-tasks/diagnostics-logging-and-instrumentation/
Another nice way to look around your site is using the debug console:
https://github.com/projectkudu/kudu/wiki/Kudu-console

Determining Cause of Suspended Website on Windows Azure

I have a Website hosted on Windows Azure. This website is a custom ASP.NET MVC 4 site hosted as a shared web site instance. Within the past couple of days, I've started to get large spikes in CPU Time. These spikes have been sustained and have caused my web site to get suspended. However, I'm not sure how to determine the cause of these spikes. Here is what I've done so far:
I attempted to look at the diagnostics via the Windows Azure FTP drop. I did not see anything there.
I reviewed my Google Analytics to see if there was anything out of the ordinary. The site had 20 visitors yesterday. So nothing crazy.
How can I identify the culprit of the CPU spike? Once it spikes, it just sits there for hours. I'm not sure what would cause this.
Thank you
Have you tried running your site on your local box and simulating your visitor traffic, exercising all your website's features?
Testing locally is 1000's of times easier and more revealing than trying to debug a site that's running live.
If you still can't find anything wrong when running locally, consider adding logging and tracing at strategic points in your site so that you can see how often complex operations execute and how long they take.
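As a rough illustration of tracing at strategic points, here is a sketch of a timing wrapper. It is written in Python only to show the idea; the logger name `site.trace` and the helper `traced` are hypothetical, and an ASP.NET site would use its own logging framework for the same purpose.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("site.trace")  # hypothetical logger name


@contextmanager
def traced(operation, clock=time.perf_counter, log=logger):
    """Log how long the wrapped block took, even if it raises."""
    start = clock()
    try:
        yield
    finally:
        elapsed_ms = (clock() - start) * 1000.0
        log.info("%s took %.1f ms", operation, elapsed_ms)
```

Wrapping a suspect operation, for example `with traced("build_report"): ...`, writes one log line per call. Counting those lines shows how often the operation runs, and the logged durations show which calls line up with the CPU spikes.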

DotNetNuke on Windows Azure Websites performance

I am evaluating the Windows Azure WebSites Preview (WAWS I think, not sure with all these changing names and acronyms that Microsoft loves to mutate) with DotNetNuke (DNN), which I have also been using for years on a "non cloud" V-Server. Installation was a breeze. I only tried the free shared instance, and I have tested with 1 and with 3 active instances with similar results.
First-hit performance was always a problem with my previous DNN installations. When a website was idle for a while (15 minutes or so) the process would stop, and the next unlucky visitor would wait at least around 20 seconds. With some IIS tweaking it was possible to minimize this problem, but I had the best results with a monitoring service that requests a page from DNN every five minutes and keeps the process up.
While surfing the DNN page usually performs well on WAWS, I immediately noticed that the "first hit" problem is an issue with DNN on WAWS so I configured a monitoring service for the page. That did not help and the monitoring service will always report that the site is down. Almost as if WAWS was trying to avoid keeping the site up since it detected that only a monitoring service was requesting the page.
Also, when navigating on the DNN pages and then pausing for just a minute or two, I will often get an "Internet Explorer could not load this page" error with no specific error code.
Do others have experience with the DNN performance on WAWS or maybe know why the "first hit" is such a problem?
I suspect that Microsoft is actively trying to avoid the keep-alive tricks that many ASP.Net devs use. WAWS, like many shared hosting platforms, relies on only having a certain number of active websites on the server at any one time in order to achieve higher server densities and keep the cost of hosting under control. This is one of the reasons that they can offer this service for free.
I think what you want to look into is "keep alive."
What you are experiencing is the ASP.NET process for your application getting killed due to inactivity. When the process isn't in memory and the site is accessed, IIS has to spin it back up. That is the 10 - 20 second lag you get on the first access, while the process starts again and/or just-in-time compiles.
You can schedule some 3rd party monitoring services to check your site every 10 minutes via an HTTP request that will keep your site up. Just pinging it will not keep it up.
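To check whether the lag really is a cold start rather than general slowness, one option is to time two back-to-back requests after the site has been idle. This is only a sketch: `timed_get` and `cold_vs_warm` are made-up helper names, and the URL in the usage note is a placeholder.

```python
import time
import urllib.request


def timed_get(url, timeout=60):
    """Fetch `url` with an HTTP GET and return the elapsed seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()  # include the body transfer in the measurement
    return time.perf_counter() - start


def cold_vs_warm(url, get=timed_get):
    """Return (first, second) request durations for two consecutive requests."""
    first = get(url)
    second = get(url)
    return first, second
```

If `cold_vs_warm("http://example.com/")` run after an idle period returns a first duration many times larger than the second, that is consistent with the process having been torn down and spun back up.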
