I rephrased the question to avoid focusing on C# exception as an exception.
By design, log4net is a silent utility or, as the FAQ describes it, a fail-stop system.
However, such behaviour is the opposite of what I need. What I need is this:
incorrect configuration -- notification event
database provider missing -- notification event
cannot log into database -- notification event
and so on and on.
So is it possible (and if yes, how?) to configure log4net to make a notification on error instead of silently ignoring it?
Are you talking about log4net internal debugging?
You can enable this as described here: http://logging.apache.org/log4net/release/faq.html#trouble-file-perm.
This will make sure that internal log4net exceptions are logged too.
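For reference, a minimal sketch of the internal debugging setup the FAQ describes, placed in the application's config file. The `log4net.Internal.Debug` key and the `TextWriterTraceListener` type are real; the listener name and the output path `C:\tmp\log4net.txt` are placeholders I picked for illustration, so adjust them to your environment.

```xml
<configuration>
  <appSettings>
    <!-- Turns on log4net's internal diagnostic output -->
    <add key="log4net.Internal.Debug" value="true" />
  </appSettings>
  <system.diagnostics>
    <trace autoflush="true">
      <listeners>
        <!-- Assumption: example listener name and file path, change as needed -->
        <add name="textWriterTraceListener"
             type="System.Diagnostics.TextWriterTraceListener"
             initializeData="C:\tmp\log4net.txt" />
      </listeners>
    </trace>
  </system.diagnostics>
</configuration>
```

With this in place, configuration problems and appender failures that log4net would otherwise swallow should show up in the trace file. This only surfaces the messages; it does not make log4net throw.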
EDIT Can log4net throw exceptions at run time?
No. log4net is not reliable. It is a best-effort and fail-stop logging system. By fail-stop, we mean that log4net will not throw unexpected exceptions at run-time potentially causing your application to crash. See http://logging.apache.org/log4net/release/faq.html.
I have Azure Durable functions with Event Grid as a trigger point which is pointing to blob storage.
I have 8 activity functions and 1 orchestrator.
Based on the file types I receive, one of the activity functions is executed.
However, I keep receiving the crash message shown in the image.
The error message you shared indicates that the function failed with "System.ExecutionEngineException".
Generally, System.ExecutionEngineException is thrown when the CLR detects that something has gone horribly wrong.
This can happen some considerable time after the problem occurred. This is because the exception is usually a result of corruption of internal data structures - the CLR discovers that something has got into a state that makes no sense. It throws an uncatchable exception because it's not safe to proceed.
Moreover, the stack trace in your screenshot points to an issue in DurableTask.AzureStorage.TimeoutHandler+ <ExecuteWithTimeout.
You can use the memory dump generated by the Proactive Crash Monitoring tool to identify the function crash and the call stack of the crashing thread.
Please create a technical support ticket by following the link, where the technical support team can help you troubleshoot the issue, or open a discussion on the Microsoft Q&A Community.
Microsoft is aware of it, and it's currently as designed. It shouldn't affect your apps. https://github.com/Azure/azure-functions-durable-extension/issues/1965#issuecomment-931637193
I'm using NServiceBus (v 4.0.5) on an Azure virtual machine using the Azure Service Bus transport (v 4.0.5). The NServiceBus.Host service has been crashing on an occasional basis but lately has been crashing more often than not. The exception thrown is:
Application: NServiceBus.Host.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: Microsoft.ServiceBus.Common.CallbackException
Stack:
at Microsoft.ServiceBus.Common.Fx+IOCompletionThunk.UnhandledExceptionFrame(UInt32, UInt32, System.Threading.NativeOverlapped*)
at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)
I'm using a dedicated machine running the generic host service, and I have 3 machines which send messages to it (I don't use pub/sub).
What I've tried
Rebooting / restarting the service manually.
Researching the error: not many people seem to have received this message, and for the people that have, their response did not apply to my situation.
Verifying the dead letter queue: several messages are placed in the dead letter queue (over 400 in the past 6 months), but I could not correlate any specific message types to the crash (at least 40% of my message types have been found in the dead letter queue). I'm assuming that most of these messages have been added to the DLQ because the service is failing.
Checking application logs: my application logs exceptions to a log4net log, however no exceptions were logged during the time of the crashes.
Checking event logs: nothing relevant was found except for the main error message noted above.
Upgrading NServiceBus to 4.4.2 and WindowsAzureServiceBus package to 5.1.1: due to NuGet package conflicts upgrading is proving to be painful. I'm using Microsoft.Data.OData 5.4.0 and Microsoft.Data.Edm 5.4.0, but the NServiceBus.Azure package depends on v5.2.0 of these assemblies. I could discard the nuget package dependencies and add the references myself, but I'd like to know why the WindowsAzureServiceBus package depends specifically on v5.2.0 before doing this.
Any thoughts or ideas would be helpful.
Thank you!
I will look into this. It sounds like a bug, most likely an unhandled exception coming from the Azure Service Bus (though it doesn't necessarily originate there).
I've created a github issue here: https://github.com/Particular/NServiceBus.Azure/issues/133
Are you able to reproduce the issue? And what has changed between the time when you saw it occasionally and now, when it happens often?
One thing you could do is add an event handler for all exceptions occurring on the AppDomain and log those as well. That should theoretically catch anything, and if there is an inner exception to this callback exception you could catch it this way.
On the strict dependency of the packages: this is mostly done because the NuGet package manager does not apply binding redirects to the app.config of worker roles, which tripped up way too many users in the past (it often manifests itself as an infinitely rebooting worker role). So go ahead and override.
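A minimal sketch of such a handler, assuming you register it early in host startup. `GlobalExceptionLogging` and the `log` callback are hypothetical names for illustration (in your case the callback would forward to your log4net logger); `AppDomain.CurrentDomain.UnhandledException` is the standard .NET event. Note that the handler only observes the exception, it does not stop the process from terminating.

```csharp
using System;

public static class GlobalExceptionLogging
{
    // Hypothetical helper: call once at startup, passing your own logging callback.
    public static void Register(Action<string, Exception> log)
    {
        AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
        {
            // ExceptionObject is typed as object; it is null here if a non-Exception was thrown.
            var ex = e.ExceptionObject as Exception;
            LogChain(log, "UnhandledException", ex);
        };
    }

    // Walks the InnerException chain so the root cause behind a wrapper
    // such as CallbackException is logged as well.
    public static void LogChain(Action<string, Exception> log, string source, Exception ex)
    {
        int depth = 0;
        for (var current = ex; current != null; current = current.InnerException)
        {
            log(source + " (depth " + depth + ")", current);
            depth++;
        }
    }
}
```

Usage would look something like `GlobalExceptionLogging.Register((where, ex) => logger.Error(where, ex));`, where `logger` is whatever logger instance you already have.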
Encountering a strange issue with one of our queues (for production, no less). When I try to put a message onto the queue, it's throwing an exception that simply states:
A timeout has occurred during the operation
The messages do seem to be making it onto the queue, as evidenced by the fact that I can see the queue length increasing in the management portal. However, the client application is not receiving any messages.
The management portal shows that there have been several failed requests, and also several internal server exceptions; though unfortunately I don't see any way to get more details about those failed requests and errors.
I'm somewhat at a loss as to what may have caused this, how to get more information about what's wrong, and how to move ahead in troubleshooting this. Any help would be greatly appreciated.
edit: I should mention, just for completeness' sake, that I did not make any changes to the clients that I'm aware of; this issue just sort of started happening all of a sudden.
edit #2, woke up this morning, and things have magically returned to normal. Still not sure what happened, so I'd like to change the tone of the question to solicit suggestions as to how this kind of thing may be mitigated and/or troubleshooted (troubleshot? troubleshat? :) ) better
I have experienced this scenario too. When I tried to create a new service bus namespace, and pointed my app to this new namespace, it worked for me. This suggests that it might be some hardware failure going on (on the node where your sb-namespace resides).
Be sure to use transient failure handling, for example http://www.nuget.org/packages/EnterpriseLibrary.WindowsAzure.TransientFaultHandling/
But you might also need to use a "second level retry" for errors that are not transient. This you have to code yourself.
To be more fault tolerant you can also use the new feature of paired namespaces. Here is a good resource: http://msdn.microsoft.com/en-us/library/dn292562.aspx
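As a rough idea of what a hand-rolled second level retry could look like, here is a sketch with exponential backoff. The `Retry` class, its parameters and the `shouldRetry` predicate are all hypothetical names of mine, not part of the Service Bus SDK or the Transient Fault Handling block; you would wrap your own send call in it and decide which exceptions are worth retrying.

```csharp
using System;
using System.Threading;

public static class Retry
{
    // Runs the operation, retrying with a doubling delay until it succeeds,
    // maxAttempts is reached, or shouldRetry rejects the exception.
    public static T Execute<T>(Func<T> operation, int maxAttempts,
                               TimeSpan initialDelay, Func<Exception, bool> shouldRetry)
    {
        var delay = initialDelay;
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception ex)
            {
                // Give up: rethrow the original exception with its stack trace intact.
                if (attempt >= maxAttempts || !shouldRetry(ex))
                {
                    throw;
                }
                Thread.Sleep(delay);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }
}
```

For a second level retry you would typically use a much longer initial delay (minutes rather than seconds) than the first-level transient handling, and log each failed attempt so the ops side can see that something is wrong.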
Hth
//Peter
Our application sends out log4j emails when an Exception is thrown. Because we're doing batch processing, the exception may occur every minute (or more often) until the error is resolved.
Is there a way to set up log4j so that it buffers exceptions and consolidates them, so that we can reduce the number of email alerts that get sent out?
Would also accept a third party service or nagios plugin that can do this sort of consolidation.
I am envisioning that even though exceptions get thrown every minute for an hour, we have some tool, service, or other mechanism that can consolidate log4j logs (or any application error log that gets emailed out) so that we have more control over alerts.
The goal is to reduce the noise of alerts going out to the ops folks. They need to know the alert occurred and is still occurring, but don't need to be spammed every minute.
This works for Logback. There is a plan to include support for log4j.
Have a look at the new Whisper appender: Whisper on Github.
Statutory disclaimer: I'm the author.
We are developing several Azure-based applications in C# and are attempting to centralize some common code in a utility library. One of the common functions is Diagnostic monitoring setup.
We created a class that simplifies the configuration of diag collection, log transfer, etc.
The main issue we are facing is that when we run our code while the class lives in a different assembly from the WebRole or WorkerRole, the diagnostic information is never collected and transferred to azure table storage. If we move the class to the same project as the Web/Worker role, then everything works as expected.
Is there something that either the DiagnosticMonitor.GetDefaultInitialConfiguration(); or the DiagnosticMonitor.Start(StorageConnectionStringKey, _diagConfig); doesn't like about being in another assembly? I'm stumped!
Any insight would be appreciated.
Thanks,
Matt
Which part is not working here? Trace Logs not getting transferred? That seems to be the one that most people have issues with.
We do something similar and have no issues. Typically when you don't see stuff getting transferred it's because the current process where the listener is getting configured is not always the same one where tracing occurs (especially when dynamically adding to trace listener collection). Notably, a lot of users find this issue with web apps in Windows Azure.
What are you expecting to see transferred? Perf counters? Traces? Event Logs? etc.