On some nights our site stops logging at the time the db backup happens - in the last week, we've had 5 days with no issues and 2 days where we had to recycle the IIS app pool to get logging going again. We are logging at DEBUG level. The last item before it stops is a DEBUG-level log.
Our theory is that it only breaks when a request happens at the time of the db backup.
Any ideas as to the potential cause, or a reliable solution?
Log4net stops logging to the database if there is a connection problem. You can set the ReconnectOnError flag on your AdoNetAppender so that it tries to reopen the connection (make sure that you use a very short connection timeout, or else your application may hang while it retries).
I have a Web API application hosted under an on-prem (not Azure) IIS which logs to Insights and is showing Request durations very similar to the times reported in the IIS logs.
However, every so often, there is an entry in the IIS logs with a TimeTakenMS duration which is much longer than the Insights Request duration.
For example, a relatively simple request to read a small amount of data from the DB is logged with a total duration by Insights as 984.126ms but IIS is logging it with TimeTakenMS as 43718.
I have conclusively linked the two requests (they show a unique URL). I have eliminated application start up/recycle times (the application is clearly already started and serving other requests and the recycle boundary is hours away from the logged time stamps) and I have eliminated exceptions (Insights is not logging any exceptions at this time).
I also have a Stopwatch set up in the controller for every web method, and I TrackTrace the elapsed milliseconds to Insights in a finally block, just before the method returns.
What other factors could cause an IIS hosted application to fail to log actual execution times to Insights but cause IIS to log much longer times?
I've considered network + processing time, but 42 seconds seems an unreasonably long duration (even for the start up time of this particular application.)
I am building a Blazor Server intranet application for my customer. One of the requirements is that they can stay logged in indefinitely. If they start inputting some data on a Friday afternoon, they should be able to return on Monday morning and continue working without interruption.
I observed that the client side was getting disconnected from the server about once per day. When this happened I would see the dreaded Blazor error “Reconnection failed. Try reloading the page if you’re unable to reconnect.”. If I click the link to Reload, it immediately reconnects to my server, but any work in progress is lost.
I found the root cause: by default, IIS is recycling the application pool every 29 hours. When this happens, the Blazor SignalR connection is getting interrupted, and hence the code running in the browser times out and disconnects.
I am able to work around this issue by disabling application pool recycling altogether. So far, it looks like that works fine (I could keep connectivity for the past 3 days). But I am worried this may not be safe long term, since application pool recycling helps deal with issues such as memory leaks, fragmentation, etc.
So, my question is: is it possible to configure IIS in a way that lets me recycle the application pool AND also keep my Blazor Server connection available during that recycle period?
When you recycle an application pool, HTTP.SYS holds onto the client connection in kernel mode while the user mode worker process recycles. After the process recycle, HTTP.SYS transparently routes the new requests to the new worker process. Thus, the client never "loses all connectivity" to the server - the TCP connection is never lost - and never notices the process recycle.
I believe your problem is with the applications running in your application pool that store state within the process, such as whether a user is logged in or not. Every time the process recycles, that state is automatically lost... which is by design, since that is what a process recycle accomplishes. As a result, your users "lose all connectivity" and "have to log back into their applications" to re-establish that lost state. The only way to fix this is for your applications to store their state outside of the IIS worker process, so that they are friendly to being recycled.
The following blog entry talks more about what is going on:
https://learn.microsoft.com/en-us/archive/blogs/david.wang/why-do-i-lose-asp-session-state-on-iis6
I'm working on configuring an Azure Log Analytics alert (using KQL) to capture the IIS Stop & Start events (from Events table) in my OMS Workspace, and if the alert query finds that there's no corresponding IIS Start event log generated from a PaaS Role for a particular IIS Stop event log- the user should get notified by an alert so that he can bring IIS back up.
Problem: Let’s say I set up my alert to run over a Time Period & Frequency of 15 mins. If the alert triggers at 10:30 AM, it will scan the IIS logs from 10:15:01 AM to 10:29:59 AM. Now, suppose an IIS Stop event got logged at around 10:28 AM; the respective IIS Start log (if any) will then be logged a couple of minutes later, around 10:31 AM or 10:32 AM, and hence it will fall outside the alert’s monitoring time period. This will create a false positive failure scenario (IIS got started back up, but my alert didn’t capture the Start event log), and thus it might lead to some unnecessary IIS Start/Reset operations on my PaaS roles.
Attaching a representative quick sketch to explain it figuratively.
Please let me know if there's any possible approach to achieve this. Any suggestions are welcome. Thanks in advance!
Current implementation as follows.
Here we can see False Alert generated at 10:30.
You can see the approach below, where we select the last 10 minutes of data (overlapping) every 5 minutes.
For the below case you can generate the alert
See if it helps you.
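The overlapping-window idea can be sketched in Python, purely to illustrate the matching logic (the real alert would be a KQL query). The event tuples, the function name `unmatched_stops` and the 5-minute grace period are assumptions made for this sketch; the answer above only specifies a 10-minute look-back evaluated every 5 minutes. The grace period is what lets a Stop at 10:28 wait for its Start at 10:31 before anything fires.

```python
from datetime import datetime, timedelta

def unmatched_stops(events, run_time,
                    lookback=timedelta(minutes=10),
                    grace=timedelta(minutes=5)):
    """Return timestamps of Stop events with no later Start.

    events   : iterable of (timestamp, kind), kind is "Stop" or "Start"
               (hypothetical shape, not the real Event table schema)
    run_time : moment the alert rule is evaluated
    lookback : how far back each run looks (10 min, run every 5 min)
    grace    : assumed minimum age of a Stop before it may raise an alert
    """
    window_start = run_time - lookback
    # Half-open window (window_start, run_time] so a Stop is judged only once.
    in_window = [(t, k) for t, k in events if window_start < t <= run_time]
    starts = [t for t, k in in_window if k == "Start"]

    alerts = []
    for t, kind in in_window:
        if kind != "Stop":
            continue
        if run_time - t < grace:
            continue  # too recent: the next run's window will still cover it
        if not any(s > t for s in starts):
            alerts.append(t)
    return alerts

# Scenario from the question: Stop at 10:28, Start at 10:31.
day = datetime(2024, 1, 1)
events = [(day.replace(hour=10, minute=28), "Stop"),
          (day.replace(hour=10, minute=31), "Start")]
print(unmatched_stops(events, day.replace(hour=10, minute=30)))  # Stop is too recent
print(unmatched_stops(events, day.replace(hour=10, minute=35)))  # Start is in the window
```

With a 5-minute frequency, every Stop is between 5 and 10 minutes old at exactly one run, so it is evaluated once and only after its Start has had time to be logged.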
We have an Azure worker role that exposes a RESTful WCF service (using System.ServiceModel.Web) through a ServiceHost. Performance is excellent under heavy traffic, but the response time is significantly higher (more than five seconds) on the first request after the role has been idle for some time. Does anyone know what might cause this?
The default AppPool idle timeout is 20 minutes. Might you be running into this? If so, you can add something like this to a startup script to disable the idle timeout:
%windir%\system32\inetsrv\appcmd set config -section:applicationPools -applicationPoolDefaults.processModel.idleTimeout:00:00:00
Here's another answer I posted, to a different question, discussing this further.
We've configured our application pool to recycle at a regular time interval of 180 minutes. But the worker processes are getting recycled every 60 minutes.
Is this a known issue, or do we need to configure something else?
Thanks
As stated in What causes an application pool in IIS to recycle?:
You might want to turn on full AppPool Recycle Event logs:
cscript adsutil.vbs Set w3svc/AppPools/DefaultAppPool/LogEventOnRecycle 255
You also might want to take a look at this Scott Guthrie blog article, which shows how to write code in Global.asax to log the actual cause of an Application.End event: http://weblogs.asp.net/scottgu/archive/2005/12/14/433194.aspx