IIS 6 hangs, then app pool resets. IIS Debug Diag dump attached

IIS 6.0 hangs, then the app pool resets after approximately 3 minutes. This is an ASP site; after a reset it works fine for a few seconds, then hangs again. All other app pools on this instance of IIS 6 function correctly. There do not appear to be any performance issues with this machine. I took a memory dump using IIS Debug Diagnostics, and this is the rendered analysis. Can anyone please lend some support?
Analysis Summary

Type: Warning
Description: Detected possible blocking or leaked critical section at ntdll!LdrpLoaderLock owned by thread 24 in w3wp.exe__SupportSiteAppPool__PID__3960__Date__07_23_2009__Time_02_22_36PM__551__Manual Dump.dmp
Impact of this lock:
66.67% of executing ASP Requests blocked
22.58% of threads blocked (Threads 6 22 23 27 28 29 30)
The following functions are trying to enter this critical section:
ntdll!LdrLockLoaderLock+133
ntdll!LdrpGetProcedureAddress+128
ntdll!LdrpInitializeThread+68
The following module(s) are involved with this critical section:
C:\WINDOWS\system32\ntdll.dll from Microsoft Corporation
Recommendation: The entry-point function for a dynamic link library (DLL) should perform only simple initialization or termination tasks, however this thread (24) is loading a dll using the LoadLibrary API. Follow the guidance in the MSDN documentation for DllMain to avoid access violations and deadlocks while loading and unloading libraries. Please follow up with the vendor Microsoft Corporation for C:\WINDOWS\system32\mscoree.dll

Type: Warning
Description: Detected possible blocking or leaked critical section at asp!g_ViperReqMgr+2c owned by thread 8 in w3wp.exe__SupportSiteAppPool__PID__3960__Date__07_23_2009__Time_02_22_36PM__551__Manual Dump.dmp
Impact of this lock:
6.45% of threads blocked (Threads 7 9)
The following functions are trying to enter this critical section:
asp!CViperActivity::PostAsyncRequest+72
The following module(s) are involved with this critical section:
\\?\C:\WINDOWS\system32\inetsrv\asp.dll from Microsoft Corporation
Recommendation: The following vendors were identified for follow up based on root cause analysis: Microsoft Corporation. Please follow up with the vendors identified above. Consider the following approach to determine root cause for this critical section problem:
Enable 'lock checks' in Application Verifier.
Download Application Verifier from the following URL: Microsoft Application Verifier
Enable 'lock checks' for this process by running the following command: Appverif.exe -enable locks -for w3wp.exe
See the following document for more information on Application Verifier: Testing Applications with AppVerifier
Use a DebugDiag crash rule to monitor the application for exceptions.

Your ASP Classic App is failing because all threads are blocked. I suggest running Process Monitor on the web server to see what handles are taken up where. I don't see a lot of repetition in your stack trace that would indicate a problem with a particular dll.

Given the information provided it sounds like a problem with the application itself rather than IIS. Have you made sure there aren't any crazy tight loops or excessive/extremely heavy DB loads, possibly some PInvoke calls or just something out of the ordinary for a webapp that are killing the application/runtime and causing the pool to die?

I think you should try some tools like Fiddler. With those you can get an exact idea of what is taking time when your site loads. From the log it seems that the problem is in the application itself. So avoid excessive loops, cache data from the database and reuse it, and don't store large objects in session or application state.

Related

Understanding DebugDiag Tool

I have been trying to understand what is causing high memory usage by processes on my Windows server. I installed DebugDiag 1.2 to try to find the problem.
Here is what runs on my server:
I have an IIS server with a decent number of application pools (68). Each application pool hosts at least 4 applications.
Recently, I have faced problems related to high memory usage, causing the server to run at 97% memory usage or higher.
It was working fine when I took the screenshot below. However, the memory usage easily climbs higher.
Task Manager: [screenshot]
With that being said, I have been trying to understand how to use Microsoft's DebugDiag 1.2 to find something (part of the source code, a SQL procedure) that might help me locate what is causing the problem.
I read that we can't limit the memory for each IIS application pool, so I guess the solution would be to optimize the application. But first I need to know where to start.
I hope someone can help me out.

Azure Web Site CPU High at random intervals of the day

I have an Azure Web Site that has been running for 6 months. On Friday 1 April 2016 at 9:50 pm the CPU went very high, which hurt the performance of the web site. Stopping and restarting the web service solved the problem, but it came back at 13:00. Since then the CPU has stayed high, making the web site unusable.
I've tried all the monitoring tools (DaaS, Event Logs), checked for open connections, and ensured my software is closing or disposing objects correctly.
But the CPU is still high. The only way to resolve it is to restart the web service, but I don't want to keep doing this.
Has anyone else experienced a similar problem, and what was the solution?
The only thing in the event logs that looks like an issue is the odd "A network-related or instance-specific error occurred while establishing a connection to SQL Server", which could be because SQL Azure is not available.
Please help
Hmm, high CPU means that your web site is executing code, perhaps a bad loop on some infrequently used code path.
The brute-force way to identify what code is being executed would be to add tracing to your solution with System.Diagnostics.Trace.WriteLine("I am here") and then check the Azure Application Log.
Another way would be to attach the Visual Studio Debugger during high CPU and check what is being executed.
A third way would be to take a dump or minidump from the Kudu site and analyze it with WinDbg:
1) Which thread is consuming CPU:
!runaway
2) What is this thread doing (requires the SOS extension to be loaded):
!clrstack
hth,
Aldo

COM Runtime Breakdown in Multithreaded Server Application

We are experiencing intermittent catastrophic failures of the COM runtime in a large server application.
Here's what we have:
A server process running as a Windows service hosts numerous free-threaded COM components written in C++/ATL. Multiple client processes written in C++/MFC and .NET use these components via cross-process COM calls (including .NET interop) on the same machine. The OS is Windows Server 2008 Terminal Server (32-bit).
The entire software suite was developed in-house, we have the source code for all components. A tracing toolkit writes out errors and exceptions generated during operation.
What is happening:
After some random period of smooth sailing (5 days to 3 weeks) the server's COM runtime appears to fall apart with any combination of these symptoms:
RPC_E_INVALID_HEADER (0x80010111) - "OLE received a packet with an invalid header" returned to the caller on cross-process calls to server component methods
Calls to CoCreateInstance (CCI) fail for the CLSCTX_LOCAL_SERVER context
CoInitializeEx(COINIT_MULTITHREADED) calls fail with CO_E_INIT_TLS (0x80004006)
All in-process COM activity continues to run, CCI works for CLSCTX_INPROC_SERVER.
The overall system remains responsive, SQL Server works, no signs of problems outside of our service process.
System resources are OK, no memory leaks, no abnormal CPU usage, no thrashing
The only remedy is to restart the broken service.
Other (related) observations:
The number of cores on the CPU has an adverse effect - a six core Xeon box fails after roughly 5 days, smaller boxes take 3 weeks or longer.
.NET Interop might be involved, as running a lot of calls across interop from .NET clients to unmanaged COM server components also adversely affects the system.
Switching on the tracing code inside the server process prolongs the working time to the next failure.
Tracing does introduce some partial synchronization and thus can hide multithreaded race condition effects. On the other hand, running on more cores with hyperthreading runs more threads in parallel and increases the failure rate.
Has anybody experienced similar behaviour or actually come across the RPC_E_INVALID_HEADER HRESULT? There is virtually no useful information to be found on that specific error and its potential causes.
Are there ways to peek inside the COM Runtime to obtain more useful information about COM's private resource pool usage like memory, handles, synchronization primitives? Can a process' TLS slot status be monitored (CO_E_INIT_TLS)?
We are confident that we have pinned down the cause of this defect to a resource leak in the .NET Framework 4.0.
Installations of our server application running on .NET 4.0 (clr.dll: 4.0.30319.1) show the intermittent COM runtime breakdown and are easily fixed by updating the .NET framework to version 4.5.1 (clr.dll: 4.0.30319.18444)
Here's how we identified the cause:
Searches on the web turned up an entry in an MSDN forum: http://social.msdn.microsoft.com/Forums/pt-BR/f928f3cc-8a06-48be-9ed6-e3772bcc32e8/windows-7-x64-com-server-ole32dll-threads-are-not-cleaned-up-after-they-end-causing-com-client?forum=vcmfcatl
The OP there described receiving the HRESULT RPC_X_BAD_STUB_DATA (0x800706f7) from CoCreateInstanceEx(CLSCTX_LOCAL_SERVER) after running a COM server with an interop app for some length of time (a month or so). He tracked the issue down to a thread resource leak that was observable indirectly via an incrementing variable inside ole32.dll : EventPoolEntry::s_initState that causes CCI to fail once its value becomes 0xbfff...
An inspection of EventPoolEntry::s_initState in our faulty installations revealed that its value started out at approx. 0x8000 after a restart and then constantly gained between 100 and 200+ per hour with the app running under normal load. As soon as s_initState hit 0xbfff, the app failed with all the symptoms described in our original question. The OP in the MSDN forum suspected a COM thread-local resource leak as he observed asymmetrical calls to thread initialization and thread cleanup - 5 x init vs. 3 x cleanup.
By automatically tracing the value of s_initState over the course of several days we were able to demonstrate that updating the .NET framework to 4.5.1 from the original 4.0 completely eliminates the leak.
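The figures quoted above allow a rough time-to-failure estimate. The following is a minimal sketch, assuming a constant growth rate and using only the values reported in this answer (start near 0x8000, failure at 0xbfff, 100 to 200 increments per hour); the function names are illustrative and the samples passed to `growth_rate` would come from whatever tracing you already have.

```python
# Values quoted in the answer above; nothing here is measured independently.
START = 0x8000  # approximate s_initState right after a service restart
LIMIT = 0xBFFF  # value at which cross-process COM calls started failing


def hours_until_failure(current, growth_per_hour):
    """Hours left before `current` reaches LIMIT at a constant growth rate."""
    if growth_per_hour <= 0:
        raise ValueError("growth_per_hour must be positive")
    return max(LIMIT - current, 0) / growth_per_hour


def growth_rate(samples):
    """Average growth per hour between the first and last (hour, value) sample."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    return (v1 - v0) / (t1 - t0)


if __name__ == "__main__":
    for rate in (100, 200):
        hours = hours_until_failure(START, rate)
        print(f"{rate}/hour -> {hours:.1f} h ({hours / 24:.1f} days)")
```

At 100 to 200 increments per hour the headroom of 0x3fff (16383) lasts roughly 3.4 to 6.8 days, which is the same order of magnitude as the roughly 5 days reported for the six-core machine.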

Creating objects suddenly begins failing after they have been loaded in memory successfully

Behavior:
Application is loaded and being used as expected.
Suddenly, a particular DLL can no longer be loaded. The error message is:
ActiveX component cannot create object.
In each case, the object had been created successfully many times before failure. All objects are marked for "retain in memory".
This error is cleared when the application pool is recycled. It may be hours or months before it is seen again.
Issue has happened within two hours of a refresh, as well as never happened in months of uptime.
Issue has happened with hundreds of simultaneous users (heavy usage) and also with 1-3 users.
While the issue is occurring, the process running that application pool cannot create the object that is failing. However it can create any other objects. Memory, CPU, and other resources all remain at normal usage. In addition, other processes (such as a stand-alone exe) can successfully create the object.
The first instance of the issue appeared in mid 2008. There have been fewer than fifty instances since then, despite a pool of hundreds of servers for it to occur on. All instances except one have failed on the same DLL.
DLL Failure Info:
most common - generic data structure implementing a b-tree, has no references other than to its interface. Code consists of arrays and one use of the vb6 Event functionality. The object has not been changed in any way since 2005.
one-time - interop to a .NET module. the failure is occurring when trying to create the interop object, not the .NET object. This object is updated a few times each year.
Application Environment:
IIS hosted application
VB6, classic ASP, some interop to minor .NET components
Windows Server 2003 / Windows Server 2008 (both have independently had the problem)
Attempts to Reproduce:
Using scripts (and real-life humans) to run the same end-user workflows that our logs reported the days before the issue occurred.
Using scripts to create/destroy suspected objects as fast as possible from multiple simultaneous sessions.
Wild speculation.
No intentional success, but it does manifest randomly on the servers on its own.
Troubleshooting:
Code reviews
Test harnesses to investigate upper limits of object creation / destruction
Verification of ability to create object outside of the process experiencing the issue
Monitoring of resources over time on servers under load
Review of IIS, error, and event logs to determine events leading up to issue
Questions:
Any ideas on how to reproduce the issue?
What could cause this behavior?
Ideas for bypassing the first two questions in favor of a fast solution?
The DLL isn't on a network drive is it? You can get "glitches" where the drive is not available momentarily that then means COM can't do what it needs and could then fail to notice the drive is available again.
I used Process Monitor to debug a similar problem when accessing the ADO/OLEDB stack. It turned out the environment got corrupted at some point, and the ADO classes are registered with InprocServer32 as a REG_EXPAND_SZ value pointing to %CommonProgramFiles%\System\ado\msado15.dll or similar on x64 OSes.
Also, when you register an application with Restart Manager, on failure the process gets restarted by the winlogon process, whose environment is different from explorer's and unfortunately is missing %CommonProgramFiles% -- ouch!
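To illustrate why a missing variable breaks such a registration, here is a minimal sketch of %NAME% expansion against a given environment. It is a simplified stand-in written for this example, not the Windows API itself; the `expand` helper and the sample environment are assumptions. It mirrors the behaviour that an undefined variable is left in the string unexpanded, so the resulting path does not exist.

```python
import re

# Matches %NAME% tokens as they appear in REG_EXPAND_SZ values.
_TOKEN = re.compile(r"%([^%]+)%")


def expand(path, env):
    """Expand %NAME% tokens from `env` (names compared case-insensitively).

    Returns (expanded_path, missing_names). Unknown tokens stay in place,
    which is what makes the final path unusable.
    """
    lookup = {name.upper(): value for name, value in env.items()}
    missing = []

    def replace(match):
        name = match.group(1)
        value = lookup.get(name.upper())
        if value is None:
            missing.append(name)
            return match.group(0)  # leave the token untouched
        return value

    return _TOKEN.sub(replace, path), missing


if __name__ == "__main__":
    registered = r"%CommonProgramFiles%\System\ado\msado15.dll"
    healthy = {"CommonProgramFiles": r"C:\Program Files\Common Files"}
    print(expand(registered, healthy))
    print(expand(registered, {}))  # the variable is missing in this process
```

Running the same check with the environment block of the failing process (Process Monitor or Process Explorer can show it) would indicate whether this is what is happening.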
This seems like a random failure; some race condition.
Try VMware to record the state of the machine you run this DLL on. When the error happens you can replay the recording and inspect the memory contents. That way you won't have to keep trying to catch the error; at least you will have a solid record of it.
While I can't provide a solution, try catching the error and retry loading the dll when this happens after a refresh to the environment.
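The catch-and-retry idea can be sketched as follows. This is a generic illustration in Python, not VB6 or ASP code; `factory` is a hypothetical callable standing in for whatever creates the object, and the attempt count and delays are arbitrary. Note that the question reports the failure persisting until the application pool is recycled, so a retry only helps if the failure turns out to be transient.

```python
import time


def create_with_retry(factory, attempts=3, delay_seconds=0.5,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call `factory()` until it succeeds or `attempts` is exhausted.

    Waits delay_seconds * attempt_number between tries and re-raises the
    last error if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return factory()
        except retry_on as exc:
            last_error = exc
            if attempt < attempts:
                sleep(delay_seconds * attempt)
    raise last_error
```

Logging each failed attempt with a timestamp would also give a record of how long the condition lasts, which the question currently lacks.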

How do I confirm whether Application Warm-Up plugin works?

I have a web application that's consuming a WCF service. Both are slow to warm up after an IIS reset or app pool recycle. So, as a possible solution I installed Application Warm-Up for IIS 7.5 and set it up for both the web site and the WCF service.
My concern is, it doesn't seem to make any difference - first time I hit the site it still takes long time to bring it up. I checked event logs, there are no errors. So I'm wondering if anything special needs to be done for that module to work.
In IIS manager, when you go into the site, then into Application Warm-Up, the right-hand side has an "Actions" pane. I think you need the following two things:
Click Add Request and add at least one URL, e.g. /YourService.svc
Click Settings, and check "Start Application Pool 'your pool' when service started"
Do you have both of these? If you don't have the second setting checked, then I think the warmup won't happen until a user hits the site (which probably defeats the purpose of the warmup module in your case).
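One way to check whether the warm-up actually ran is to recycle the pool, wait a moment, and then compare the latency of the first request with later ones. The following is a minimal sketch of that measurement, assuming Python is available on a client machine; the URL in the usage line is a placeholder and the helper names are made up for this example.

```python
import time
import urllib.request


def http_get(url, timeout=120):
    """Fetch `url`, read the whole body, and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
        return response.status


def time_requests(url, count=3, get=http_get, clock=time.perf_counter):
    """Return the elapsed seconds for `count` sequential requests to `url`."""
    timings = []
    for _ in range(count):
        start = clock()
        get(url)
        timings.append(clock() - start)
    return timings


if __name__ == "__main__":
    # Placeholder URL; run this right after recycling the application pool.
    for i, seconds in enumerate(time_requests("http://localhost/YourService.svc"), 1):
        print(f"request {i}: {seconds:.2f} s")
```

If the first request is still much slower than the following ones, the warm-up request probably did not run before the first user hit.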
There is a new module from Microsoft that is part of IIS 8.0 and supersedes the previous warm-up module. This Application Initialization Module for IIS 7.5 is available as a separate download.
The module will create a warm-up phase where you can specify a number of requests that must complete before the server starts accepting requests. Most importantly it will provide overlapping processes so that the user will not be served by the newly started process before it is ready.
I have answered a similar question with more details at How to warm up an ASP.NET MVC application on IIS 7.5?.
After you have made whatever software/code optimizations are possible, allow me to point out that every piece of code still has to be processed by the hardware CPU. Our server skyrocketed in performance when we went to a multicore CPU, installed more gigabytes of RAM, and connected a UTP-6 cable instead of the standard UTP 5e cable to the server... That doesn't fix your problem, but if you are as obsessed with speed as we are, you will be interested in the various dimensions that bottleneck speed.

Resources