I had a lot of trouble on a production server. Some routing caused the application pool to crash with event ID 1011:
Event Type: Warning
Event Source: W3SVC
Event Category: None
Event ID: 1011
Date: 1/21/2009
Time: 9:08:17 AM
User: N/A
Computer: xxxxxxxxxxxxx
Description:
A process serving application pool 'DefaultAppPool' suffered a fatal communication error with the World Wide Web Publishing Service. The process id was '3788'. The data field contains the error number.
8007006d
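The value in the data field is an HRESULT, and splitting it into its fields can give a first hint. The sketch below is only an illustration (the helper name `decode_hresult` and the small lookup table are my own, not part of any Microsoft tool). For 8007006d it yields facility 7 (Win32) and code 109, which is ERROR_BROKEN_PIPE. That would be consistent with the worker process dying while the WWW service was still talking to it, although the event itself does not say why.

```python
# Only a few well-known Win32 error codes are named here; this is not a full table.
WIN32_NAMES = {
    2: "ERROR_FILE_NOT_FOUND",
    5: "ERROR_ACCESS_DENIED",
    109: "ERROR_BROKEN_PIPE",
}
FACILITY_WIN32 = 7


def decode_hresult(value):
    """Split a 32-bit HRESULT into failure bit, facility, and code."""
    value &= 0xFFFFFFFF
    facility = (value >> 16) & 0x7FF   # bits 16-26
    code = value & 0xFFFF              # bits 0-15
    name = WIN32_NAMES.get(code) if facility == FACILITY_WIN32 else None
    return {
        "failure": bool(value >> 31),  # top bit set means failure
        "facility": facility,
        "code": code,
        "win32_name": name,
    }


# The data field from the event above, parsed as hexadecimal.
print(decode_hresult(int("8007006d", 16)))
```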
I spent a few very hard hours before I found the problem.
Thanks to Tess Ferrandez and her blog post I found it.
Always double-check your multithreaded code in ASP.NET applications. When an unhandled exception occurs on a background thread, the application pool crashes, and it's damn hard to find out why.
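The usual defence is to make sure nothing can escape from a thread's entry point without being caught and logged. Here is a minimal, language-neutral sketch of that pattern in Python (in ASP.NET the equivalent would be a try/catch around the body of the thread's entry method). The names `guarded`, `failures`, and `flaky_job` are hypothetical and exist only for this illustration.

```python
import logging
import threading

log = logging.getLogger("background")
failures = []  # collected here only so the effect is easy to inspect


def guarded(fn):
    """Wrap a thread entry point so no exception leaves the thread unlogged."""
    def wrapper(*args, **kwargs):
        try:
            fn(*args, **kwargs)
        except Exception as exc:
            failures.append((fn.__name__, repr(exc)))
            log.exception("background work %s failed", fn.__name__)
    return wrapper


def flaky_job():
    # Stand-in for real background work that goes wrong.
    raise ValueError("boom")


worker = threading.Thread(target=guarded(flaky_job))
worker.start()
worker.join()
print(failures)
```

With the failure recorded at the thread boundary, there is at least a log entry to start from instead of only a crashed process.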
Tess's blog was a little advanced for me. I had to search around for quite a bit before I found the right articles that helped me debug my dump files. This article will help others who want to debug their crashing asp.net application pools but don't know how to start.
Our API app is in UAT on Azure with a service plan (Standard 3, large). What should we do if app availability is zero? The app responds slowly or times out. When I restart the application, it returns to normal. (We are using parallel programming with async/await.)
How can we find the root cause of the slowness?
Ensure that the Always On feature is enabled.
Such problems may be caused by application level issues, such as:
network requests taking a long time
application code or database queries being inefficient
application using high memory/CPU
application crashing due to an exception
You could enable web server diagnostics to fetch more details on the issue.
Detailed Error Logging - Detailed error information for HTTP status codes that indicate a failure (status code 400 or greater). This may contain information that can help determine why the server returned the error code.
Failed Request Tracing - Detailed information on failed requests, including a trace of the IIS components used to process the request and the time taken in each component. This can be useful if you are attempting to improve web app performance or isolate what is causing a specific HTTP error.
Web Server Logging - Information about HTTP transactions using the W3C extended log file format. This is useful when determining overall web app metrics, such as the number of requests handled or how many requests are from a specific IP address.
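Once web server logging is on, the W3C extended log files can be scanned for slow or failing requests. The following is a rough sketch rather than an official tool: it assumes the log includes the `time-taken` field (reported by IIS in milliseconds), and the sample lines and the helper names `parse_w3c_log` and `slow_requests` are made up for illustration.

```python
def parse_w3c_log(lines):
    """Yield one dict per request, keyed by the latest #Fields: directive."""
    fields = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            values = line.split()
            if len(values) == len(fields):  # skip malformed rows
                yield dict(zip(fields, values))


def slow_requests(rows, threshold_ms):
    """Return the rows whose time-taken is at or above the threshold."""
    return [r for r in rows if int(r.get("time-taken", 0)) >= threshold_ms]


# Hypothetical sample data, not taken from a real server.
SAMPLE = """\
#Fields: date time cs-method cs-uri-stem sc-status time-taken
2009-01-21 09:08:17 GET /api/orders 200 120
2009-01-21 09:08:19 GET /api/report 500 30500
""".splitlines()

rows = list(parse_w3c_log(SAMPLE))
for row in slow_requests(rows, threshold_ms=1000):
    print(row["cs-uri-stem"], row["sc-status"], row["time-taken"])
```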
Also, Azure Application Insights collects telemetry from your application to help analyze its operation and performance. You can use this information to identify problems that may be occurring or to identify improvements to the application that would most impact users. This tutorial takes you through the process of analyzing the performance of both the server components of your application and the perspective of the client: https://learn.microsoft.com/en-us/azure/application-insights/app-insights-tutorial-performance
Ref: https://learn.microsoft.com/en-us/azure/app-service/app-service-web-troubleshoot-performance-degradation
Background / Issue
Having a strange issue running a Ghost blog on Azure. The site seems to run fine for a while, but every once in a while, I'll receive a 500 error with no further information. The next request always appears to succeed (in tests so far).
The error seems to happen after a period of inactivity. Since I'm currently just getting set up, I'm using an Azure "Free" instance, so I'm wondering if some sort of resource conservation is causing it behind the scenes (which will be alleviated when I upgrade).
Any idea what could be causing this issue? I'm sort of at a loss for where to start since the logs don't necessarily help me in this case. I'm new to Node.js (and Node.js on Azure), and since this is my first foray, any tips/tricks on where to look would be helpful as well.
Some specific questions:
When receiving an error like this, is there anywhere I can go to see any output, or is it pretty much guaranteed that Node didn't output anything?
On Azure free instances, does some sort of resource conservation take place which might cause the app to be shut down (and thus for me to see these errors only after a period of inactivity)?
The Full Error
The full text of the error is below (I've turned debugging on for this reason):
iisnode encountered an error when processing the request.
HRESULT: 0x2
HTTP status: 500
HTTP reason: Internal Server Error
You are receiving this HTTP 200 response because system.webServer/iisnode/@devErrorsEnabled configuration setting is 'true'.
In addition to the log of stdout and stderr of the node.exe process, consider using debugging and ETW traces to further diagnose the problem.
The node.exe process has not written any information to stderr or iisnode was unable to capture this information. Frequent reason is that the iisnode module is unable to create a log file to capture stdout and stderr output from node.exe. Please check that the identity of the IIS application pool running the node.js application has read and write access permissions to the directory on the server where the node.js application is located. Alternatively you can disable logging by setting system.webServer/iisnode/@loggingEnabled element of web.config to 'false'.
I think it might be something in the Azure web config rather than Ghost itself. So look for logs based on that because Ghost is not throwing that error. I found this question that might help you out:
How to debug Azure 500 internal server error
Good luck!
I've recently tried to upgrade my WebRole from Azure SDK v1.6 to v1.7. This appears to have worked OK. I can build and run the role in my devfabric just fine. When I try to deploy the upgraded project to the real cloud, the instances never start. They just sit in the "busy" state. Interestingly, they don't do the typical "recycle loop", they just sit at "busy" forever.
When I log into the instances with RDP, I see the following error in the event logs:
The application '/' belonging to site '1' has an invalid AppPoolId 'DefaultAppPool' set. Therefore, the application will be ignored.
Followed by:
Site 1 was disabled because the root application defined for the site is invalid. See the previous event log message for information about why the root application is invalid.
Looking in IIS manager confirms that there is no AppPool called "DefaultAppPool". There also are none of the typical AppPools with GUIDs for names that Azure creates. Unsurprisingly, none of my sites exist either.
So how do I resolve this?
I had the same issue after upgrading to v1.7, but upon looking at the Windows Azure logs in the Azure VM I noticed the following exception:
An unhandled exception occurred. Type: System.ArgumentException Process ID: 2340
Process Name: DiagnosticsAgent
Thread ID: 1
AppDomain Unhandled Exception for role Backend_IN_0
Exception: Endpoint http://xxxx.blob.core.windows.net/ is not a secure connection.
So I changed the Diagnostics connection string to use https instead of http and voilà, that solved my problem.
Hope that works for you; I'd been pulling my hair out for two days.
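If you want to catch this before deploying, a quick check of the connection string is easy to script. This is only a sketch under an assumption: that the diagnostics setting uses the usual storage connection string layout (`DefaultEndpointsProtocol=...;AccountName=...;AccountKey=...`). The helper names and the account values below are placeholders, not anything from the original project.

```python
def parse_connection_string(cs):
    """Split 'Key=Value;Key=Value' into a dict, keeping '=' inside values."""
    parts = {}
    for segment in cs.split(";"):
        if segment.strip():
            key, _, value = segment.partition("=")
            parts[key.strip()] = value.strip()
    return parts


def uses_https(cs):
    """True when the connection string asks for the https endpoint protocol."""
    protocol = parse_connection_string(cs).get("DefaultEndpointsProtocol", "")
    return protocol.lower() == "https"


# Placeholder values for illustration only.
bad = "DefaultEndpointsProtocol=http;AccountName=xxxx;AccountKey=abc123=="
good = "DefaultEndpointsProtocol=https;AccountName=xxxx;AccountKey=abc123=="

for cs in (bad, good):
    print(uses_https(cs))
```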
According to the IIS documentation, rapid fail protection, once activated, deactivates an application pool if a "failure" occurs. However, I could not find a definition of what counts as a "failure". In my web application I have a special exception that I would like IIS to treat as a "failure".
Does anyone have an idea? Thanks
This appears to have a list, for Server 2003 at least: http://web.archive.org/web/20130511004652/http://technet.microsoft.com/en-us/library/cc787273(WS.10).aspx
The WWW service shuts down an application pool whenever a worker process in the application pool fails often enough to equal or exceed the Rapid-Fail Protection (RFP) interval time window (for example: five failures in five minutes). The WWW service detects failure whenever:
A worker process does not start within the startup time limit.
A worker process does not shut down within the shutdown time limit.
A worker process shuts itself down because of a fatal error and sends the WWW service an error code.
A worker process fails to respond to a ping message.
The WWW service detects that a worker process is sending non-standard communications (the worker process may have been taken over).
(updated with archive.org to fix broken link, and replicated detail here)
The documentation for configuring rapid fail protection alludes to a "failure" meaning a worker process crash.
Through experimentation I've noticed that you should expect something like the following in Windows Event Application Logs for a w3wp.exe crash:
An unhandled exception occurred and the process was terminated.
Application ID: /LM/W3SVC/1/ROOT
Process ID: 2628
Exception: System.SomeUnhandledException
Indeed with rapid fail protection enabled with the default configuration, 5 such events within 5 minutes of each other cause the application pool to stop, and you'll see a further Windows Event Application Log similar to:
Application pool 'my-test-application-pool' is being automatically disabled due to a series of failures in the process(es) serving that application pool.
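To make the "5 failures within 5 minutes" behaviour concrete, here is a small model of it. This is only my mental model written as a sliding-window counter, not how IIS is actually implemented; the class name `RapidFailMonitor` and its parameters are invented for the example.

```python
from collections import deque


class RapidFailMonitor:
    """Toy model: disable the pool after N failures inside a time window."""

    def __init__(self, max_failures=5, window_seconds=300):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self.disabled = False
        self._failures = deque()

    def record_failure(self, timestamp):
        """Record a worker process crash at `timestamp` (in seconds)."""
        self._failures.append(timestamp)
        # Forget crashes that have fallen out of the window.
        while timestamp - self._failures[0] > self.window_seconds:
            self._failures.popleft()
        if len(self._failures) >= self.max_failures:
            self.disabled = True
        return self.disabled


burst = RapidFailMonitor()
for t in (0, 60, 120, 180, 240):    # five crashes in four minutes
    burst.record_failure(t)

spread = RapidFailMonitor()
for t in (0, 120, 240, 360, 480):   # five crashes over eight minutes
    spread.record_failure(t)

print(burst.disabled, spread.disabled)
```

In this model the burst trips the protection while the spread-out crashes never do, which matches what I saw when experimenting with the default settings.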
When I run my WorkerRole C# application on Azure, after a while WaWorkerHost.exe crashes due to the following exception:
Application: WaWorkerHost.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.Runtime.CallbackException
Stack:
at System.Runtime.Fx+IOCompletionThunk.UnhandledExceptionFrame(UInt32, UInt32, System.Threading.NativeOverlapped*)
at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)
I have an application that generates load on a web server. I don't care about the actual response, but I want to control the number of requests made per second.
Therefore I have a Timer that fires every second and generates a number of requests. I have tried the following options:
Parallel.For with WebRequests
For loop with async WebRequests
For loop with ThreadPool.QueueUserWorkItem(do webrequest)
When the number of requests increases (8+ req/sec), the exception occurs. It is the same exception for all three options. When I run the role in the local DevelopmentFabric, all three options work just fine. If someone could give me some pointers on what might be going wrong, I would appreciate it. If you have other ideas for generating this type of load from Azure and C#, please share your thoughts.
The author answered the question in a comment on the original post, but for better visibility I'm reposting it here:
Turned out to be an IntelliTrace issue, see
http://social.msdn.microsoft.com/Forums/en-ZA/windowsazuretroubleshooting/thread/543da280-2e5c-4e1a-b416-9999c7a9b841:
...
After redeploying my solution with IntelliTrace disabled, the issues
were resolved, and my WorkerRole stayed healthy.
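On the load-generation part of the question, one way to pace requests is to start a fixed batch on every tick and collect any errors instead of letting them escape. This is a sketch in Python's asyncio, not the original C# code; `generate_load`, `fake_send`, and the `tick_seconds` parameter are all hypothetical names, and `fake_send` stands in for a real HTTP call.

```python
import asyncio


async def generate_load(send, requests_per_tick, ticks, tick_seconds=1.0):
    """Start `requests_per_tick` calls to `send` on each tick, then gather."""
    tasks = []
    for _ in range(ticks):
        for _ in range(requests_per_tick):
            tasks.append(asyncio.create_task(send()))
        await asyncio.sleep(tick_seconds)
    # return_exceptions=True keeps one failed request from raising here.
    return await asyncio.gather(*tasks, return_exceptions=True)


async def fake_send():
    # Placeholder for an actual web request.
    await asyncio.sleep(0)
    return "ok"


results = asyncio.run(generate_load(fake_send, requests_per_tick=3, ticks=2,
                                    tick_seconds=0.01))
print(len(results))
```

Exceptions raised by individual requests come back as items in `results`, so they can be counted or logged after the run.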