Diagnosing Sporadic Lockups in Website Running on IIS

Goal
Determine the cause of the sporadic lockups of our web application running on IIS.
Problem
An application we are running on IIS sporadically locks up throughout the day. When it locks up, it does so across all worker processes and on every load-balanced instance.
Environment and Application
The application is running on 4 different Windows Server 2016 machines, load balanced by HAProxy using a round-robin scheme. The IIS application pools hosting this website are configured with 4 worker processes each, and the application they host is a 32-bit application. The IIS instances do not use a shared configuration file, but the application pools for this application are all configured identically.
This is the only application in its IIS application pool. It is an ASP.NET Web API application targeting .NET 4.6.1, and it does not create threads of its own.
Theory
My theory is that requests are coming in that take ~5-30 minutes to complete, and every machine gets tied up servicing them, so the machines look "locked up". The company rolled its own logging mechanism, and from those logs I can confirm we have requests in that ~5-30 minute range. The team responsible for the application has cleaned up many of these, but I am still seeing ~5 minute requests in the log.
I do not have access to the machines personally, so our systems team has captured memory dumps of the application when this happens. In the dumps I generally see ~50 threads running, all of them in our code, scattered across the application rather than stopped on any common piece of code. When the application is running correctly, dumps show 3-4 running threads. I have also looked at performance counters such as ASP.NET\Requests Queued, but it never shows any requests queued, and during these periods CPU, memory, disk and network usage all look normal. In WinDbg, none of the threads show high CPU time except the finalizer thread, which as far as I know lives for the entire lifetime of the process.
Conclusion
I am looking for a means to prove or disprove my theory as to why we are locking up as well as any metrics or tools I should look at.

So this issue came down to our application running a query that stitched a table with 2,000,000 records in it to another table. Memory would become so fragmented that the garbage collector was spending more time trying to find places to put objects, and moving them around, than it was running our code. This is why it appeared that our application was still working and why there were no exceptions. Oddly, IIS would time out the requests but would continue processing the threads.
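If you need to corroborate a GC-bound diagnosis like this, one low-effort check is the ".NET CLR Memory\% Time in GC" performance counter for the worker process. A minimal sketch, assuming the w3wp instance name (with several workers the instances appear as w3wp, w3wp#1, and so on, so confirm the instance name in PerfMon first):

using System;
using System.Diagnostics;
using System.Threading;

class GcTimeMonitor
{
    static void Main()
    {
        // ".NET CLR Memory" exposes "% Time in GC" per managed process.
        // "w3wp" is an assumed instance name; adjust to the worker you care about.
        using (var gcTime = new PerformanceCounter(".NET CLR Memory", "% Time in GC", "w3wp"))
        {
            // Sample every 5 seconds until stopped (Ctrl+C).
            while (true)
            {
                // Values consistently well above ~10-20% suggest the runtime is spending
                // far more time collecting than running application code.
                Console.WriteLine($"{DateTime.Now:T}  % Time in GC = {gcTime.NextValue():F1}");
                Thread.Sleep(5000);
            }
        }
    }
}

The same counter can of course be watched directly in PerfMon alongside ASP.NET\Requests Queued during a lockup.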

Related

ASP.NET Core 2.2 experiencing high CPU usage

I have an ASP.NET Core 2.2 web service hosted on Azure (S2 plan). The problem is that my application sometimes hits high CPU usage (almost 99%). What I have done so far is check Process Explorer on Azure, where I see a lot of processes consuming CPU. Maybe someone knows whether it is normal for these processes to consume CPU?
Currently I have no idea where they come from; maybe it is normal to have them there.
Shortly about my application:
Currently there is not much traffic, about 500-600 requests a day. Most of the requests communicate with MS SQL by querying records, adding records, etc.
I am also using MS WebSockets, but the high CPU happens when no WebSocket client is connected to the web service, so I doubt that is the cause. I tried using Apache ab for load testing, but there is no consistent pattern where a load test reliably produces high CPU; sometimes it happens during load testing and sometimes it does not.
I have just updated the screenshot of processes; I see that lots of threads are being locked/used while FluentMigrator is running its logging.
Update*
I will remove the FluentMigrator logging middleware from the Configure method and keep an eye on the situation.
UPDATE**
So I removed the FluentMigrator logging. Since then I have not noticed any CPU usage over 90%.
But I am still confused: my CPU usage keeps bouncing up and down. Is this a healthy CPU usage graph or not?
Also, I ran a load test against the WebSocket server.
I made a script that calls some WebSocket functions every 100 ms from 6-7 clients, so every 100 ms there are about 7 calls to the WebSocket server from different clients. Each function queries or inserts some data (approximately 3-4 queries per WebSocket function).
What I did notice is that on Azure SQL S1 (20 DTU), after about 2 minutes I run out of SQL pool connections. If I increase it to 100 DTU, it handles the 7 clients properly without any 'no connection pool' errors.
So the first question: is this kind of CPU fluctuation normal?
Second: should I be getting a 'no SQL connection free' error with this kind of load test on a 10 DTU Azure SQL database? I am afraid that by creating a scoped service from my singleton WebSocket service I am leaking connections.
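On the connection-leak worry: if a singleton WebSocket service resolves scoped services (such as a DbContext) straight from the root provider, they are never disposed and pooled SQL connections can leak. A minimal sketch of the usual scope-per-operation pattern; the WebSocketMessageHandler and AppDbContext names are placeholders, not taken from the actual app:

using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;

// Placeholder DbContext standing in for whatever the app registers as a scoped service.
public class AppDbContext : DbContext
{
    public AppDbContext(DbContextOptions<AppDbContext> options) : base(options) { }
}

// Registered as a singleton alongside the WebSocket handling code.
public class WebSocketMessageHandler
{
    private readonly IServiceScopeFactory _scopeFactory;

    public WebSocketMessageHandler(IServiceScopeFactory scopeFactory)
    {
        _scopeFactory = scopeFactory;
    }

    public async Task HandleMessageAsync(string payload)
    {
        // Resolve scoped services from a short-lived scope so the DbContext
        // (and the SQL connection it holds) is disposed after each message,
        // instead of being captured for the lifetime of the singleton.
        using (var scope = _scopeFactory.CreateScope())
        {
            var db = scope.ServiceProvider.GetRequiredService<AppDbContext>();
            // ... run the 3-4 queries/inserts for this WebSocket call ...
            await db.SaveChangesAsync();
        }
    }
}

If the scoped context is instead resolved once in the singleton's constructor, it and its connection live for the lifetime of the process, which would fit the pool-exhaustion symptoms described above.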
This question is getting too long; maybe I should move this to a new topic?
At this stage I would say you need to profile your application and figure out which areas of your code are CPU-intensive. In the past I have used dotTrace, which highlights the most expensive methods in a call tree.
Once you know which areas of your code base are the least efficient, you can begin to refactor them. This could be as simple as changing some small operations, adding caching for queries, or using distributed locking, for example.
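On the caching suggestion, a minimal sketch using the built-in IMemoryCache from Microsoft.Extensions.Caching.Memory (the cache key, expiry, and the ReportService/ReportDto names are illustrative, not from your app, and it assumes services.AddMemoryCache() is registered in Startup):

using System;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;

public class ReportDto { }

public class ReportService
{
    private readonly IMemoryCache _cache;

    public ReportService(IMemoryCache cache)
    {
        _cache = cache;
    }

    public Task<ReportDto> GetDailyReportAsync()
    {
        // Cache the result of an expensive query for a short period so repeated
        // requests don't hit SQL (and burn CPU building the result) every time.
        return _cache.GetOrCreateAsync("daily-report", entry =>
        {
            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5);
            return LoadDailyReportFromDatabaseAsync();   // placeholder for the real query
        });
    }

    private Task<ReportDto> LoadDailyReportFromDatabaseAsync() =>
        Task.FromResult(new ReportDto());
}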
I believe the reason the other DLLs show CPU usage is that your code is calling methods inside those DLLs.

Can IIS Handle Thousands of AppPools/Worker Processes?

I am currently investigating the feasibility of an architecture where we would have potentially thousands of AppPools, and therefore worker processes, one for each of our micro-services, running in IIS (10+). (It is one of a few options under consideration.)
I understand the overhead of each worker process; my current estimate is that each worker will use about 20-30 MB. Server resourcing should not be too much of an issue, as we will likely provision servers with 32-64 GB of RAM. On top of this, not all workers would be active at all times, so we should gain headroom when AppPools are idle.
My question: Can IIS handle this many AppPools/Worker processes?
I don't see a reason it shouldn't, given sufficient resources, but I have not been able to find any documentation about it after some brief searching.
So I'll add some answers to my own question here as I did a little bit of testing.
Server
Intel Xeon - X5550
32 GB RAM
Windows Server 2012 R2
Application
Created a barebones, Web API-only ASP.NET application with a single controller and action.
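For reference, a "single controller and action" Web API app amounts to little more than the following sketch (the controller name and response are illustrative, not taken from the test app):

using System.Web.Http;

// Smallest useful ASP.NET Web API controller: one GET action that returns 200 OK.
public class PingController : ApiController
{
    [HttpGet]
    public IHttpActionResult Get()
    {
        return Ok("pong");
    }
}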
When installed in IIS this is the observed memory footprint.
Memory (idle) = ~5,172 KB
Memory (running) = ~26,000 KB
Prep
I created some PowerShell scripts (sorry, I can't share them as they leverage our closed-source deployment scripts) to do the following (a rough C# equivalent of the Create step is sketched after this list):
Create - creates a unique folder for each application to prevent possible resource sharing
Launch - makes a web request to the application
Cleanup - deletes all applications, pools and folders
Recycle - unloads the application and sets it back to an idle state
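Since the original scripts can't be shared, here is a rough sketch of what the Create step amounts to, written against the Microsoft.Web.Administration API instead of PowerShell; the naming scheme, paths, and binding are placeholders and the code must run elevated:

using System.IO;
using Microsoft.Web.Administration;   // ships with IIS; also available as a NuGet package

class CreateTestApp
{
    static void Main(string[] args)
    {
        string name = args.Length > 0 ? args[0] : "LoadTestApp001";      // placeholder naming scheme
        string physicalPath = Path.Combine(@"C:\LoadTest", name);

        // Unique folder per application so nothing is shared between instances.
        Directory.CreateDirectory(physicalPath);
        // (deploy the barebones Web API bits into physicalPath here)

        using (var serverManager = new ServerManager())
        {
            // One application pool per application, default (on-demand) start mode.
            ApplicationPool pool = serverManager.ApplicationPools.Add(name);
            pool.ManagedRuntimeVersion = "v4.0";

            // One site per application, bound to a unique host header on port 80.
            Site site = serverManager.Sites.Add(name, "http", $"*:80:{name.ToLower()}.test.local", physicalPath);
            site.Applications["/"].ApplicationPoolName = pool.Name;

            serverManager.CommitChanges();
        }
    }
}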
Test
Below are my results observed from PerfMon
As you will note, I could not get all 1,000 applications running at once. I ran into a few things:
Firing a call to all 1,000 applications so that they are all running simultaneously is not as easy as it sounds.
The ASP.NET Temporary Internet Files folder is on C:\, which ran out of space.
Things began running slowly because memory was being paged.
Conclusion
It seems that IIS really has no hard limit on the number of worker processes; the core constraint is the resources on the machine.
What is interesting is that it is unlikely all applications would be running simultaneously, so one can take advantage of the fact that IIS only keeps memory provisioned for the worker processes that are actually active, idling the rest out.

Web Api Requests Queueing up forever on IIS (in state: ExecuteRequestHandler)

I'm currently experiencing some hangs in our production environment, and after some investigation I'm seeing a lot of requests queued up in the worker process of the application pool. The common thread is that every request that is queued for a long time is a Web API request; I'm using both MVC and Web API in the app.
The requests stay queued for about 3 hours, and when the application pool is recycled they immediately start queueing up again.
They are all in the ExecuteRequestHandler state.
Any ideas on where I should continue digging?
Your requests can stall for a number of reasons:
they are waiting on an I/O operation, e.g. a database query or a web service call
they are looping or performing operations on a large data set
they are performing CPU-intensive operations
some combination of the above
In order to find out what your requests are doing, start by getting the URLs of the requests that are taking a long time.
You can do this from the command line as follows:
c:\windows\system32\inetsrv\appcmd list requests
If it's not obvious from the URLs and a look at the code, you need to take a process dump of w3wp.exe on the server. Once you have a process dump you will need to load it into WinDbg in order to analyze what's taking up all the CPU cycles. Covering WinDbg fully is a big topic, but briefly, here's what you need to do:
load the SOS dll (the managed debugging extension)
run the !runaway command to get a list of the longest-running threads
dive into a long-running thread by selecting it and running the !clrstack command
There are many blogs on using WinDbg; here is one example. A great resource for analyzing these types of issues is Tess Ferrandez's blog.
This is a really long shot without having first-hand access to your system, but try checking the handler mappings in the IIS Manager GUI for your Web API. Compare them with the IIS settings of your DEV or any other environment where it works.
If this isn't the issue, then do a comparison of all the other IIS settings for that app.
Good luck.

I'm not sure how to correctly configure my server setup

This is kind of a multi-tiered question; my end goal is to establish the best way to set up my server, which will host a website as well as a service (using Socket.io) for an iOS (and eventually an Android) app. Both the app service and the website are going to be written in Node.js, as I need high concurrency and scaling for the app service, and I figured that while I'm at it I may as well do the website in Node too, since (from my understanding) it wouldn't be much different in performance terms from something like Apache.
Also, the website has a lower priority than the app service; the app service should receive significantly more traffic than the website (though in the long run this may change). Money isn't my greatest priority here, but it is a limiting factor; I feel that having a service with 99.9% uptime (as 100% uptime appears to be virtually impossible in the long run) is more important than saving money at the cost of more downtime.
Firstly, I understand that having one Node process per CPU core is the best way to fully utilise a multi-core CPU, and after researching I understand that running more than one per core is inefficient because the CPU has to context-switch between the processes. Why is it, then, that whenever I see example code for the built-in cluster module in Node.js, the master process creates a number of workers equal to the number of cores? That would mean 9 processes on an 8-core machine (1 master process and 8 worker processes). Is it because the master process is usually only there to restart workers if they crash or exit, and therefore does so little that it doesn't matter that it shares a CPU core with another Node process?
If that is the case, I am planning to have the workers provide the app service and have the master process manage the workers, but also host a webpage providing statistics about the server's state and other relevant information (number of clients connected, worker restart count, error logs, etc.). Is this a bad idea? Would it be better to run this webpage on a separate worker and leave the master process to just manage the workers?
So overall I want the following elements: a service to handle requests from the app (my main source of traffic); a website (fairly simple, a couple of pages and a registration form); an SQL database to store user information; a webpage (probably hosted locally on the server machine) which only I can access and which shows information about the server (users connected, worker restarts, server logs, other useful information, etc.); and apparently nginx would be a good idea in front of multiple Node processes accepting connections from the app. After doing research I've also found that it would probably be best to host on a VPS initially. My first thought was that, since the app service's traffic will most likely be fairly low at the start, I could run all of these elements on one VPS. Or would it be better to run them on separate VPSes, except for the website and the server-status webpage, which I could run on the same one? I guess that way, if there is a hardware failure and something goes down, not everything does, and I could run 2 instances of the app service on 2 different VPSes so that if one goes down the other is still functioning. Would this just be overkill? I doubt I will need multiple app service instances to support the traffic load for a while, but it would help reduce the apparent downtime for users.
Maybe this all depends on what I value more and have the time for: a more complex server setup that costs more and is maybe a little unnecessary but provides a consistent and reliable service, or a cheaper and simpler setup that may suffer downtime due to coding errors and server hardware issues.
It's also worth noting that I've never had any real experience with production-level servers, so in some ways I've jumped in at the deep end with this. I feel like I've come a long way in the past half year and have a fairly good grasp of what I need to do; I could just use some advice from someone with experience who knows what roadblocks I may come across along the way and whether I'm causing myself unnecessary problems with this kind of setup.
Any advice is greatly appreciated, thanks for taking the time to read my question.

Running multiple virtual directories on IIS - any performance issues?

I need to run 8-10 instances of my application on IIS 6.0 that are all identical but point to different backends (handled via config files, which would differ for each virtual directory). I want to create multiple virtual directories that point to different versions of the app, and I want to know whether there is any significant performance penalty for this. The server (Windows Server 2003) is a quad-core with 4 GB of RAM, and a single install of the app barely touches the CPU or memory, so that doesn't seem to be a concern. This doesn't seem to justify another server, especially since some of the instances will be very lightly used. Obviously, performance depends on the server and the application, but are there any concerns with this situation?
IIS on Windows Server 2003 is built to handle lots of sites, so the number of sites itself is not a concern. The resource needs of your application are much more of a factor, i.e., how much I/O, CPU, thread, and database resources does it consume?
We have a quad-core Windows Server 2003 server here handling several hundred sites with no problem, but a single resource-intensive app can eat a whole server on its own.
If you find your application is CPU-bound, you can put each instance in its own application pool and then limit the amount of CPU each pool can use, so that no one instance can bottleneck any of the others.
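On IIS 7 and later that per-pool CPU limit can also be set programmatically through Microsoft.Web.Administration; a rough sketch follows (the pool name and the 30% figure are just examples, and on IIS 6 / Server 2003 the equivalent lives in the application pool's CPU monitoring settings in the GUI):

using System;
using Microsoft.Web.Administration;   // IIS 7+ management API

class LimitPoolCpu
{
    static void Main()
    {
        using (var serverManager = new ServerManager())
        {
            // "InstanceAPool" is a placeholder pool name.
            ApplicationPool pool = serverManager.ApplicationPools["InstanceAPool"];

            // Limit is expressed in 1/1000ths of a percent, so 30000 = 30% CPU.
            pool.Cpu.Limit = 30000;
            pool.Cpu.ResetInterval = TimeSpan.FromMinutes(5);

            // KillW3wp terminates the worker if it exceeds the limit within the
            // reset interval; newer IIS versions (8+) also offer throttling actions.
            pool.Cpu.Action = ProcessorAction.KillW3wp;

            serverManager.CommitChanges();
        }
    }
}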
I suggest you add a few at a time and see how it goes.
No concerns. If you run into any performance issues, it won't be with IIS for 10 apps that size.
You should consider using multiple application pools. If you do that, and the CPU, memory, I/O and network resources of the server are in order, then there is no performance issue.
It is possible to run them all in the same application pool, but then you add thread pool contention to the list, because all of the applications will share one thread pool, and if the worker process is 32-bit there is a memory limit (around 1.5 GB) for the w3wp process.
We routinely run 15-20 per server on a 10-server load-balanced farm and don't come across any issues.
The short answer is no, there should be no concerns.
In effect, you are asking whether IIS can host 8-10 websites... of course it can. You might want to configure them as individual websites rather than virtual directories, and perhaps with individual application pools, so that each instance is entirely independent.
You mention that these aren't very demanding applications; assuming they aren't all hitting the same Access database, I can't see any problems.
