Slow publishing in Azure Umbraco website

I have an Umbraco website set up in Azure. The front-end website loads fine, but the back end takes more than 15 seconds from when you hit "Save and Publish" to when it shows the check mark denoting success. I set up a test website on an Azure VM pointing to the same Azure SQL database that the Umbraco Azure website uses, and I don't get this problem there.

I just spent some time debugging this same scenario. We had a situation where once someone saved a node, it would take 30 seconds before the UI became responsive again. A network trace confirmed that we were waiting on API calls back from Umbraco.
We were on an S0 SQL instance, so I bumped it to S1 and the performance got worse (I'm guessing indexes were rebuilding).
We already had a few Azure-specific config options set (like useTempStorage="Sync" in our ExamineSettings.config). We ended up adding the line below to web.config, and our saves went from 30-35s to 1-2s!
<add key="umbracoContentXMLUseLocalTemp" value="true" />
This is from the load balancing guide available here - https://our.umbraco.org/documentation/Getting-Started/Setup/Server-Setup/load-balancing/flexible

Umbraco's Save and Publish is fairly DB intensive. An S1 instance is possibly not beefy enough. We use S2 for our dev sites, and that can take up to 5 seconds to save and publish, depending on whether anything else is hitting the database.
You may also have other code running that's slowing things down. Things like Examine indexing can be quite slow on Azure. It could also be a plugin that's slowing things down.
How complex are your DocTypes? Also, which version of Umbraco are you running? Some older versions have bugs that cause performance issues on Azure.

Related

Diagnosing ASP.NET Azure WebApp issue

For the past month, one of our web applications hosted as a WebApp on Azure has been having a problem, and I cannot find the root cause.
This WebApp is hosted on Azure on a 2 x B2 App Service Plan. On the same App Service Plan there is another WebApp that is currently working without any issue.
This WebApp is an ASP.NET WebApi application and exposes a set of REST APIs.
Effect: for no apparent reason (at least as far as I know by now), the ThreadCount metric starts to climb, sometimes very slowly, sometimes within a few minutes. When this happens, no requests seem to be served and the service is dead.
Workaround: a simple restart of the application (and this means a restart of the AppPool) causes an immediate drop of the ThreadCount, and everything works as usual again.
Other observations: there is no "periodicity" in this event. It happened in the evening, in the morning and in the afternoon. It seems that evening is a preferred timeframe, but I won't say there is any correlation.
What I measured through Azure Monitoring Metric:
- Request Count seems to oscillate normally. There is no peak that would cause the increase in ThreadCount.
- CPU and Memory seem to be normal, nothing strange.
- Response time looks normal, like the other metrics.
- Connections (which should be related to sockets) oscillates normally. So I'd exclude something related to DB connections.
What can I do to understand what's going on?
After a lot of research, this turned out to be related to incorrect usage of Dependency Injection (using Ninject) in an application that wasn't designed to use it.
To diagnose it, I discovered a very helpful feature in Azure. You can reach it by opening the app that is having the problem, clicking "Diagnose and solve problems", then "Diagnostic tools", and then selecting "Collect .NET profiler report". In that panel, after configuring the storage for the diagnostic files, you can select "Add thread report".
In those reports you can easily see what's going wrong.
Hope this helps.

How does one know why an Azure WebSite instance(WebApp) was shutdown?

By looking at my Pingdom reports I have noticed that my WebSite instance is getting recycled. Basically, Pingdom is used to keep my site warm. When I look deeper into the Azure logs, i.e. /LogFiles/kudu/trace, I notice a number of small XML files with "shutdown" or "startup" suffixes, e.g.:
2015-07-29T20-05-05_abc123_002_Shutdown_0s.xml
While I suspect this might be to do with MS patching VMs, I am not sure. My application is not showing any raised exceptions, hence my suspicion that it is happening at the OS level. Is there a way to find out why my instance is being shut down?
I also admit I am using a single S2 instance, scalable to three depending on CPU usage. We may have to review this and use a 2-3 setup. Obviously this doubles the costs.
EDIT
I have looked at my Operation Logs and all I see is "UpdateWebsite" with a status of "succeeded", but nothing for the times at which I saw the above files. So it seems that the instance is being shut down, but the event is not appearing in the Operation Log. Why would this be? I had about 5 shutdowns yesterday, yet the last Operation Log entry was on 29/7.
An example of one of yesterday's shutdown xml file:
2015-08-05T13-26-18_abc123_002_Shutdown_1s.xml
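To line these trace files up against other logs, it can help to pull the timestamp and event type out of each file name. Below is a minimal Python sketch, not anything Kudu itself provides. The assumed name layout (timestamp, identifier, number, event, duration in seconds) is inferred only from the two file names quoted in this question, and `parse_trace_name` and `shutdown_times` are hypothetical helper names.

```python
import re
from datetime import datetime

# Assumed layout, inferred only from the file names quoted in the question:
#   <timestamp>_<identifier>_<number>_<Event>_<duration>s.xml
TRACE_NAME = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2})"
    r"_(?P<ident>[^_]+)_(?P<number>\d+)"
    r"_(?P<event>[A-Za-z]+)_(?P<seconds>\d+)s\.xml$"
)


def parse_trace_name(name):
    """Return (timestamp, event) for a trace file name, or None if it does not match."""
    match = TRACE_NAME.match(name)
    if match is None:
        return None
    stamp = datetime.strptime(match.group("stamp"), "%Y-%m-%dT%H-%M-%S")
    return stamp, match.group("event")


def shutdown_times(names):
    """Collect the timestamps of all Shutdown traces, oldest first."""
    times = []
    for name in names:
        parsed = parse_trace_name(name)
        if parsed is not None and parsed[1] == "Shutdown":
            times.append(parsed[0])
    return sorted(times)
```

Feeding it the directory listing of /LogFiles/kudu/trace gives a list of shutdown times that can be compared with Pingdom's reports.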
You should see entries regarding backend maintenance in the operation logs.
As for keeping your site alive, standard plans allow you to use the "Always On" feature, which pretty much does what Pingdom is doing to keep your website warm. Just enable it in the Configure tab of the portal.
Configure web apps in Azure App Service
https://azure.microsoft.com/en-us/documentation/articles/web-sites-configure/
Every site on Azure runs 2 applications. One is yours and the other is the scm endpoint (a.k.a. Kudu). These "shutdown" traces are for the Kudu app, not for your site.
If you want similar traces for your site, you'll have to implement them yourself, just like Kudu does. If you don't have Always On enabled, Kudu gets shut down after an hour of inactivity (as far as I remember).
Aside from that, as you mentioned, Azure will shut down your app during machine upgrades, though I don't think these shutdowns result in operation log events.
Are you seeing any side effects? Is this causing downtime?
When upgrades to the service are going on, your site might get moved to a different machine. We bring the site up on a new machine before shutting it down on the old one and letting connections drain, so this should not result in any perceivable downtime.

Intermittent Microsoft Azure Web Site access failure

I have a number of small MVC apps deployed as Microsoft Windows Azure websites. This has been working for several months.
Yesterday I rolled out a new one, and the deployment was unremarkable, everything worked fine. But a couple of hours later, access to the site was unavailable. The symptoms were that when the browser tried to navigate to the URL for that site, it would try to load for several minutes and then just give up with a completely blank page.
I attempted to stop and restart the site, and it worked once, but the symptoms came back several minutes later. Then I tried to stop and restart, and it didn't work.
I deployed the identical app to three additional URLs. Again, immediately on deployment, they all work fine, however, they fail at some interval in the future. They seem to not all fail at once. Sometimes restarting the site will fix the problem, and sometimes not.
IMPORTANT: If I wait for some period of time, the site may start to work again on its own.
However, deploying four versions of the app so that our users can go to a backup one if the primary one is not working is not optimal.
Any words of wisdom as to how I might go about debugging this?
ADDITIONAL INFO NOV 25, 2013:
When sites are failing, the IIS logs show either 500 or 502 Internal Service Errors. Our own MVC code is never hit, not even app_start.
You can start by checking the logs and remote debugging
http://www.drdobbs.com/windows/azure-sdk-22-supports-visual-studio-2013/240163499
Are the apps working locally?
Might not be the same problem, but from time to time our Azure instances will get the blue question mark of death as a status.
The reason we found out was that Microsoft will do upgrades on instances from time to time. If you have just one instance in a cloud service/role, then from time to time they will do maintenance and during that time it will be dead.
I have confirmed this with their support.
The only way to get around this that I know of is to create two instances. Then Microsoft guarantees ~99% availability.
Of course I also confirmed with them that this means twice the cost. =/
If that's not the issue I would enable RDP and get onto the machine to see what the problem is. Microsoft has these tools to help debug problems: http://blogs.msdn.com/b/kwill/archive/2013/08/26/azuretools-the-diagnostic-utility-used-by-the-windows-azure-developer-support-team.aspx
First, you should always run multiple instances of your web role with more than 1 upgrade domain. The upgrade domain count is configurable in the service definition (CSDEF), and the instance count in the service configuration (CSCFG). Without this, you don't get an SLA from Microsoft, so you can't really complain that the VMs go down.
Second, to figure out what might be going on with these boxes, you should have both logs (my preference is to roll my own with page blobs or table storage), AND you should always have RDP access to a pre-production environment (production as well if you're not too fussed about security). Once on the box, look through the event viewer for errors.
Third, when an outage occurs check out the azure service dashboard (http://www.windowsazure.com/en-us/support/service-dashboard/) for outages.
Lastly, contact Microsoft support. It may take a few hours, but they are pretty good.
Since it is happening repeatedly and for extended periods of time (more than 5 minutes), I would bet there's something wrong with your hosted service. Again, RDP in and poke around. Good luck.
To debug your sites try to enable diagnostic logs:
http://www.windowsazure.com/en-us/develop/net/common-tasks/diagnostics-logging-and-instrumentation/
Another nice way to look around your site is using the debug console:
https://github.com/projectkudu/kudu/wiki/Kudu-console

Windows Azure reliability (my server just lost its drives & sites, then 20 minutes later they reappeared)

I am on a Windows Azure trial to evaluate migrating a number of commercial ASP.NET sites to Azure from dedicated hosting. All was going OK ... until just now!
Some background - the sites are set up under Web Roles (i.e. as opposed to Web Sites) using SQL Azure and SQL Reporting. The site content was under the X: drive (there was also a B: drive that seemed to be mapped to the same location). There are several days left of the trial.
Without any apparent warning my test sites suddenly stopped working. Examining the server (through RDP) I saw that the B: and X: drives had disappeared (just C:, D: and E: were left, I think), and in IIS the application pools and sites had disappeared. In the Portal, however, nothing seemed to have changed: the same services and config seemed to be there.
Then about 20 minutes later the missing drives, app pools and sites reappeared and my test sites started working again! However, the B: drive was gone and now there was an F: drive (showing the same as X:); also the MS ReportViewer 2008 control that I had installed earlier in the day was gone. It is almost as if the server had been replaced with another (but the IIS config was restored from the original).
As you can imagine, this makes me worried! If this is something that could happen in production there is no way I would consider hosting commercial sites for clients on Azure (unless there is some redundancy system available to keep a site up when such a failure occurs).
Can anyone explain what may have happened, if this is possible/predictable under a live subscription, and if so how to work around it?
One other thing to keep in mind is that an Azure Web Role is not persistent. I'm not sure how you installed the MS Report Viewer 2008 control but anything you add or install outside of a deployment package when you push your solution to Azure is not guaranteed to be available at some future point.
I admit that I don't fully understand the full picture when it comes to the overall architecture of Azure but I do know that Web Roles can and do re-create themselves from time to time. When the role recycles, it returns to the state as it was when it was installed. This is why Microsoft suggests using at least 2 instances of your role because while one or the other may recycle they will never recycle both at the same time, part of what guarantees the 99.9% uptime.
You might also want to consider an Azure VM. They are persistent but require you to maintain the server in terms of updates and software much in the way I suspect you are already doing with your dedicated hosting.
I've been hosting my solution in a large (4 core) web role, also using SQL Azure, for about two years and have had great success with it. I have roughly 3,000 users and rarely see the utilization of my web role go over 2% (meaning I've got a lot of room to grow). Overall it is a great hosting solution in my opinion.
According to the Azure SLA, Microsoft guarantees uptime of 99.9% or higher on all its products per billing month. (20 minutes in a 30-day month is about 0.046% downtime, against the 0.1% that a 99.9% SLA allows. I'm not being critical, just suggesting that they are still within their SLA.)
The current status shows that SQL databases were having issues in US North last night, but all services appear to be up currently.
Personally, I have seen the dashboard go down and report very weird problems, but the services that I programmed against worked just fine all the way through it. When I experienced this problem it was reported on the Azure status page, the platform status and the Twitter feed.
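The arithmetic behind the SLA comparison can be checked with a short calculation. This is a minimal Python sketch assuming a 30-day billing month. The function names are made up for illustration, and the real SLA terms define downtime more precisely than this.

```python
MINUTES_PER_DAY = 24 * 60


def downtime_percent(outage_minutes, days_in_month=30):
    """Express an outage as a percentage of the billing month."""
    return 100.0 * outage_minutes / (days_in_month * MINUTES_PER_DAY)


def allowed_downtime_minutes(sla_percent, days_in_month=30):
    """Minutes of downtime per month that still meet the given SLA percentage."""
    return (100.0 - sla_percent) / 100.0 * days_in_month * MINUTES_PER_DAY


print(round(downtime_percent(20), 3))            # 0.046
print(round(allowed_downtime_minutes(99.9), 1))  # 43.2
```

So a single 20-minute outage uses a little under half of the roughly 43 minutes per month that a 99.9% guarantee leaves room for.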
While I have seen bumps, they are few and far between, and I find reliability to be perceptibly higher than other providers that I have worked with.
As for workarounds I would suggest a standard mode for your websites and increasing instances of the site. You might try looking into the new add ins that are available with the latest Azure release. Active Cloud Monitoring by Metrichub might be what you require.
It sounds like you're expecting the web role to act as a Virtual Machine instance.
Web Roles aren't persistent (the machine can be destroyed and recreated at any time), so you should do any additional required set up as a 'startup task' in your Azure project (never install software manually).
Because of this you need at least 2 instances so that rolling upgrades (i.e. Windows security patches, hotfixes and so on) can be performed automatically without having your entire deployment taken offline.
If this doesn't suit your use case then you should look at Azure Virtual Machines, but you'll need to manage updates and so on yourself. It's usually better to use Web Roles properly as you can then do scaling and so on a lot more easily.

Determining Cause of Suspended Website on Windows Azure

I have a Website hosted on Windows Azure. This website is a custom ASP.NET MVC 4 site hosted as a shared web site instance. Within the past couple of days, I've started to get large spikes in CPU Time. These spikes have been sustained and have caused my web site to get suspended. However, I'm not sure how to determine the cause of these spikes. Here is what I've done so far:
I attempted to look at the diagnostics via the Windows Azure FTP drop. I did not see anything there.
I reviewed my Google Analytics to see if there was anything out of the ordinary. The site had 20 visitors yesterday. So nothing crazy.
How can I identify the culprit of the CPU spike? Once it spikes, it just sits there for hours. I'm not sure what would cause this.
Thank you
Have you tried running your site on your local box and simulating your visitor traffic, exercising all your website's features?
Testing locally is thousands of times easier and more revealing than trying to debug a site that's running live.
If you still can't find anything wrong when running locally, consider adding logging and tracing at strategic points in your site so that you can see how often complex operations run and how long they take.
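As a rough illustration of that kind of timing instrumentation, here is a minimal sketch. It is written in Python only to keep it short; in an ASP.NET MVC site the same idea would be done with the framework's own logging and a stopwatch. The `timed` helper and the operation name are hypothetical.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("site.timing")


@contextmanager
def timed(operation, durations=None):
    """Log how long the wrapped block takes; optionally record it in `durations`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        logger.info("%s took %.3f s", operation, elapsed)
        if durations is not None:
            # Keep every sample so call counts and slow outliers are both visible.
            durations.setdefault(operation, []).append(elapsed)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    stats = {}
    with timed("render_home_page", stats):  # hypothetical operation name
        sum(i * i for i in range(100_000))
    print(len(stats["render_home_page"]), "call(s) recorded")
```

Wrapping the few operations you suspect, then looking at which ones run far more often or far longer than expected, usually narrows down a sustained CPU spike faster than reading traffic reports.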

Resources