Does the Diagnostic Logging setting turn itself off by design?

I have enabled diagnostic logging (Error level only, to the file system or blob) on my Azure website several times and confirmed that it is working. When I come back and check the next day, it is switched off. I can't seem to find any documentation that suggests this is by design.

If you're logging to the file system, then yes, it disables itself after 12 hours. You can see this in the help bubble.
The reason is that it could affect site performance due to excessive writing to the (slow) file system.
However, if you set it up for blob, it stays on until you turn it off yourself.

If you turn on Application Logging to the file system, then yes, it will turn itself off after 12 hours. You can see this in the portal if you hover over the information icon for Application Logging. This behavior is also documented for reference.
The reason it is disabled after 12 hours has to do with the limited amount of storage you have on the local file system, which is 1 GB to 250 GB depending on your App Service Plan (size).
If you enable application logging to Azure Storage (blob), then you have up to 500 TB of potential storage. In this scenario, your logging should not get disabled after 12 hours.
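For illustration, here is a minimal sketch of how application logging can be wired up in an ASP.NET Core app so that App Service routes it to blob storage (this assumes the Microsoft.Extensions.Logging.AzureAppServices package; the blob name is an arbitrary choice):

    using Microsoft.Extensions.Logging.AzureAppServices;

    var builder = WebApplication.CreateBuilder(args);

    // Registers the App Service file and blob logger providers; they only
    // activate when the corresponding diagnostics setting is switched on
    // in the portal (file system or blob).
    builder.Logging.AddAzureWebAppDiagnostics();

    // Optional tuning of the blob logger; the container itself comes from
    // the portal's "Application Logging (Blob)" configuration.
    builder.Services.Configure<AzureBlobLoggerOptions>(options =>
    {
        options.BlobName = "applog.txt";
    });

    var app = builder.Build();

    app.MapGet("/", (ILogger<Program> logger) =>
    {
        logger.LogError("This entry is routed to the configured blob container.");
        return "logged";
    });

    app.Run();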

Related

How does one know why an Azure Website instance (Web App) was shut down?

By looking at my Pingdom reports I have noted that my website instance is getting recycled. Basically, Pingdom is used to keep my site warm. When I look deeper into the Azure logs, i.e. /LogFiles/kudu/trace, I notice a number of small XML files with "shutdown" or "startup" suffixes, e.g.:
2015-07-29T20-05-05_abc123_002_Shutdown_0s.xml
While I suspect this might be due to MS patching VMs, I am not sure. My application is not showing any raised exceptions, hence my suspicion that it is happening at the OS level. Is there a way to find out why my instance is being shut down?
I should also mention that I am using one S2 instance, scalable to three depending on CPU usage. We may have to review this and use a 2-3 setup. Obviously this doubles the costs.
EDIT
I have looked at my Operation Logs and all I see is "UpdateWebsite" with a status of "succeeded", but nothing for the times I saw the above files. So it seems that the instance is being shut down, but the event is not appearing in the Operation Log. Why would this be? I had about 5 yesterday, yet the last Operation Log entry was 29/7.
An example of one of yesterday's shutdown XML files:
2015-08-05T13-26-18_abc123_002_Shutdown_1s.xml
You should see entries regarding backend maintenance in the operation logs.
As for keeping your site alive, standard plans allow you to use the "Always On" feature, which does pretty much what Pingdom is doing to keep your website warm. Just enable it on the Configure tab of the portal.
Configure web apps in Azure App Service
https://azure.microsoft.com/en-us/documentation/articles/web-sites-configure/
Every site on Azure runs two applications: one is yours and the other is the SCM endpoint (a.k.a. Kudu). These "shutdown" traces are for the Kudu app, not for your site.
If you want similar traces for your site, you'll have to implement them yourself, just like Kudu does; see the sketch below. If you don't have Always On enabled, Kudu gets shut down after an hour of inactivity (as far as I remember).
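A minimal sketch of what that could look like in an ASP.NET app, assuming a Global.asax and a hypothetical lifecycle.log file (HOME resolves to D:\home on App Service):

    using System;
    using System.IO;
    using System.Web;

    public class Global : HttpApplication
    {
        // Write our own startup/shutdown markers, similar in spirit to the
        // files Kudu drops under /LogFiles/kudu/trace.
        private static readonly string LogPath = Path.Combine(
            Environment.GetEnvironmentVariable("HOME") ?? @"D:\home",
            @"LogFiles\lifecycle.log");

        protected void Application_Start()
        {
            File.AppendAllText(LogPath,
                DateTime.UtcNow.ToString("o") + " Startup" + Environment.NewLine);
        }

        protected void Application_End()
        {
            // Fires on graceful shutdowns (recycles, scale events, upgrades).
            File.AppendAllText(LogPath,
                DateTime.UtcNow.ToString("o") + " Shutdown" + Environment.NewLine);
        }
    }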
Aside from that, as you mentioned, Azure will shut down your app during machine upgrades, though I don't think these shutdowns result in operational log events.
Are you seeing any side effects? Is this causing downtime?
When upgrades to the service are going on, your site might get moved to a different machine. We bring the site up on a new machine before shutting it down on the old one and letting connections drain, so this should not result in any perceivable downtime.

Azure - 12 hour logging

So I'm trying to familiarise myself with Azure and have started work on a website which is currently deployed to Azure on git commit. I decided I had to look at logging, and so turned on application diagnostics in the Azure portal. I logged via a trace statement in my code and, sure enough, it writes to a log file.
I noticed that, on hovering over the info icon beside the "application logging (filesystem)" toggle, it notes that logging will be turned off after 12 hours. I presumed that meant diagnostic logging would be turned off after 12 hours, but over 20 hours later that seems not to be the case.
Does the 12 hours refer to the retention of log files after creation, or genuinely that logging will (at some point) be switched off?
From the little I've read, if I want durable logging I need to consider pushing log files to blob storage or Azure Tables (possibly writing directly). Are my thoughts on the 12-hour retention correct?
Thanks
Tim
This 12-hour limit is about application logging to text file(s): if you use an ILogger instance to log data (e.g. logger.LogInformation(...)), then this feature will be disabled after 12 hours.
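For illustration, the kind of trace statement the question mentions might look like this (a hypothetical ASP.NET MVC controller; entries are only captured while the portal toggle is on):

    using System.Diagnostics;
    using System.Web.Mvc;

    public class HomeController : Controller
    {
        public ActionResult Index()
        {
            // With "application logging (filesystem)" switched on, these
            // entries land under D:\home\LogFiles\Application until the
            // 12-hour window elapses and the feature turns itself off.
            Trace.TraceInformation("Index page requested");
            Trace.TraceError("Example error entry");
            return View();
        }
    }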

Azure Websites automated and manual backups are not created

Whilst accepting that backups in Windows Azure Websites are a preview feature, I can't seem to get them working at all. My site is approximately 3 GB and on the Standard tier. The settings are configured to back up to a geo-redundant storage account with no other containers. There is no database selected; I'm only backing up the files.
In the Admin Portal, if I use the manual Backup Now button, a 0-byte file is created within the designated storage account, dated 01/01/0001 00:00:00. However, even after several days, it is not replaced with the "actual" file.
If I use the automated backup scheduler, nothing happens at all - no errors, no 0 byte files.
Can anyone shed any light on this please?
The backup/restore feature is still in preview mode and officially supports only 2 GB of data. From the error message you posted ("backup is currently in progress") it seems you probably hit a bug which was fixed last week (the result of that bug was that some lingering backups blocked subsequent backups).
Please try it again, you should be able to invoke it now. If you find another error message in operational logs, feel free to post it here (just leave the RequestId in it unscrambled - we can correlate using that) and we can take a look.
However, as I mentioned at the beginning, more than 2 GB is not fully supported yet (you might not be able to do, e.g., a roundtrip with your data: backup and then restore).
Thanks,
Petr

Caching Diagnostics recommends 20GB of local storage(!). Why?

I installed the Azure 1.8 tools/SDK and it upgraded my project's co-located caching from preview to final. However, it also decided to add 20 GB to the role's local storage (DiagnosticStore). I manually dialed it down to 500 MB, but then I get the following message on the role's property page (cloud project => Roles => right-click role => Properties, i.e. the GUI for ServiceDefinition.csdef):
Caching Diagnostics recommends 20GB of local storage. If you decrease
the size of local storage a full redeployment is required which will
result in a loss of virtual IP addresses for this Cloud Service.
I don't know who signed off on this operating model within MS, but it begs a simple "Why?". For better understanding, I'm breaking that "Why" into 3 sub-questions for caching in Azure SDK 1.8:
Why are the diagnostics of caching coupled with the caching itself? We just need caching for performance...
Why is the recommendation a whopping 20 GB? What happens if I dial it down to 500 MB?
Slightly off-topic but still related: why does decreasing local storage require a full redeployment? This is especially painful since Azure doesn't provide any strong controls for reserving IP addresses. So if you need to work with third parties that use whitelisted IPs - too bad!?
PS: I did contemplate breaking this into 3 separate questions, but given that they are tightly coupled, this seemed like a more helpful approach for future readers.
The diagnostic store is used for storing cache diagnostic data, which includes server logs, crash dumps, counter data, etc. This data can be automatically uploaded to Azure Storage by configuring cache diagnostics (the CacheDiagnostics.ConfigureDiagnostics call in the OnStart method; without this call, data is generated on the local VM but not uploaded to Azure Storage). The amount of data collected is controlled by the diagnostic level (the higher the level, the more data is collected), which can be changed dynamically. More details on cache diagnostics are available at: http://msdn.microsoft.com/en-us/library/windowsazure/hh914135.aspx
Since you enabled the cache, it comes with a default diagnostic level that should help in diagnosing cache issues if they happen. This data is stored locally unless you call the ConfigureDiagnostics method in OnStart (which uploads the data to Azure Storage).
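For reference, a minimal sketch of that OnStart wiring, following the pattern described in the linked article (exact namespaces and assembly names depend on the SDK version, so treat this as an outline rather than the definitive implementation):

    using Microsoft.ApplicationServer.Caching;
    using Microsoft.WindowsAzure.Diagnostics;
    using Microsoft.WindowsAzure.ServiceRuntime;

    public class WebRole : RoleEntryPoint
    {
        public override bool OnStart()
        {
            // Let the cache plug-in register its diagnostic data sources
            // (server logs, crash dumps, counters) with the WAD configuration.
            DiagnosticMonitorConfiguration dmConfig =
                DiagnosticMonitor.GetDefaultInitialConfiguration();
            CacheDiagnostics.ConfigureDiagnostics(dmConfig);

            // Without starting the monitor, the data stays on the local VM
            // and is never uploaded to Azure Storage.
            DiagnosticMonitor.Start(
                "Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString",
                dmConfig);
            return base.OnStart();
        }
    }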
If a lower storage value is provided (say 2 GB), then higher diagnostic levels cannot be used, since they need more space (a crash dump by itself can take upwards of 12 GB for XL VMs). And if you want higher levels, you would have to upgrade the deployment with a change to the diagnostic store size, which defeats the purpose: being able to change the diagnostic level without a redeployment/upgrade/update/restart. That is the reason a limit of 20 GB is set, to cater to all diagnostic levels (and they can be changed in a running deployment with a cscfg change).
Questions #1 and #2 are answered above.
Hope this helps.
I'll answer question #3: local storage decreases are one of the only deployment changes that can't be done in place (increases are fine, as are VM size changes and several other changes now possible without redeploying). See this post for details on in-place updates.

What happens when the disk holding logs on Azure is full?

Our website is currently deployed to Azure and we are writing trace logs using Azure diagnostics. We then ship the logs to blob storage periodically and read them using Cerebrata's Windows Diagnostics Manager software. I would like to know what happens when the disk holding the logs on Azure is full, i.e. before the logs are shipped. When do the logs get purged? And is it any different if the logs are not shipped? My concern is that the site may somehow fall over when exceptions are raised (if at all) when trying to write to a full disk.
Many Thanks
If you are using Windows Azure Diagnostics, then it will age out the logs on disk (deleting the oldest files first). You have a quota that is specified in your wad-control-container in blob storage on a per-instance basis. By default, this is 4 GB (you can change it). All of your traces, counters, and event logs need to fit in this 4 GB of disk space. You can also set separate quotas per data source if you like. The diagnostics manager takes care of managing the data sources and the quota.
Now, there was a bug in older versions of the SDK where the disk could get full and diagnostics stopped working. You can tell whether you might be impacted by this bug by RDP'ing into an instance and trying to navigate to the C:\Resources\Directory\\Monitor directory. If you are denied access, then you are likely to hit this bug. If you can view this directory as a normal admin on the machine, you should not be impacted. There was a permission issue in an older SDK version where deletes to this directory failed. Unfortunately, the only symptom of this bug is that suddenly you won't get data transferred out anymore; there is no overt failure.
Are you using System.Diagnostics.Trace to "write" your logs, or are you writing to log files directly?
Either way, there is a rollover: if you hit the storage quota before the logs have been transferred, the oldest logs are deleted. But you can easily increase the default (4 GB!) log quota; see the sketch below.
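As a hedged sketch of raising that quota with the classic WAD 1.x API (the 8 GB figure is arbitrary, and the DiagnosticStore local resource in ServiceDefinition.csdef must be at least as large):

    using System;
    using Microsoft.WindowsAzure.Diagnostics;
    using Microsoft.WindowsAzure.ServiceRuntime;

    public class WebRole : RoleEntryPoint
    {
        public override bool OnStart()
        {
            DiagnosticMonitorConfiguration config =
                DiagnosticMonitor.GetDefaultInitialConfiguration();

            // Raise the overall local diagnostics quota (default is 4 GB).
            config.OverallQuotaInMB = 8192;

            // Give trace logs their own share and ship them out regularly,
            // so the rollover rarely has to delete anything.
            config.Logs.BufferQuotaInMB = 1024;
            config.Logs.ScheduledTransferPeriod = TimeSpan.FromMinutes(5);

            DiagnosticMonitor.Start(
                "Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString",
                config);
            return base.OnStart();
        }
    }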
Please take a look at the following articles and posts, which describe diagnostics in Windows Azure in detail:
http://blogs.msdn.com/b/golive/archive/2012/04/21/windows-azure-diagnostics-from-the-ground-up.aspx
http://www.windowsazure.com/en-us/develop/net/common-tasks/diagnostics/
http://msdn.microsoft.com/en-us/library/windowsazure/hh411544.aspx
