Is there a way of telling which threads are redundant and remove them?
Is this something that might need to be done? or do threads self terminate if they encounter any error or are used for too long?
You can get all the active threads and kill them. Go through the link below.
http://coldfusion-tip.blogspot.com/search/label/thread%20kill
You can control the number of available threads in the ColdFusion Administrator under "Server Settings > Request Tuning > Maximum number of simultaneous Template requests" will set the number of available workers/threads ready to accept requests handed off from the webserver. You can fine tune some of the other thread settings (direct CFC/Remoting requests, report requests etc) there as well. The thread pool should recycle itself normally if it encounters a ColdFusion error but there are "hung" threads.
The thread pool should stay at the number set in the administrator. You can set a timeout on a thread's maximum runtime under "Server Settings > Settings > Timeout Requests after seconds" setting.
Using the built-in server monitor under Server Monitoring > Server Monitor You can go to Alerts -> Alert Configuration and set a number of thread killing conditions based on timeouts, memory, etc. but this does increase the load on your system.
This section of the Performance tuning for ColdFusion applications article from Adobe gives you some tips/advice about threads.
Fusion Reactor is a great commercial product for those who need powerful control for high-availability over their ColdFusion server.
Also checkout CFTracker, a free/OSS CF Server Monitoring Project.
Related
So I'm trying to migrate a Legacy website from an AWS VM to an Azure VM and we're trying to get the same level of performance. The problem is I'm pretty new to setting up sites on IIS.
The authors of the application are long gone and we struggle with the application for many reasons. One of the problems with the site is when it's "warming up" it pulls back a ton of data to store in memory for the entire day. This involves executing long running stored procs and in memory processes which means first load of certain pages takes up to 7 minutes. It then uses a combination of in memory data and output caching to deliver the pages.
Sessions do seem to be in use although the site is capable of recovering session data from the database in some more relatively long running database operations so sessions are better to stick with where possible which is why I'm avoiding a web garden.
That's a little bit of background, however my question is really about upping the performance on IIS. When I went through their settings on the AWS box they had something call NUMA enabled with what appears to be the default settings and then the maximum worker processes set to 0 which seems to enable NUMA. I don't know why they enabled NUMA or if it was necessary, but I am trying to get as close to a like for like transition as possible and if it gives extra performance in this application we'll probably need it!
On the Azure box I can see options to set the maximum worker processes to 0 but no NUMA options. My question is whether NUMA is enabled with those default options or is there something further I need to do to enable NUMA.
Both are production sized VMs but the one on Azure I'm working with is a Standard D16s_v3 with 16 vCores and 64Gb RAM. We are load balancing across a few of them.
If you don't see the option in the Azure VM it's because the server is using symmetric processing and isn't NUMA aware.
Now to optimize your loading a bit:
HUGE CAVEAT - if you have memory leak type issues, don't do this! To ensure you don't, put on a private bytes limit roughly 70% the size of memory on the server. If you see that get hit/issue an IIS recycle (that event is logged by default) then you may want to ignore further steps. Either that or mess around with perfmon (or more easily iteratively check peak bytes in task manager where you'll have to add that column in the details pane)
Change your app pool startup mode to: AlwaysRunning
Change your web app to preloadenabled=true
Set an initialization page in your web.config (so that preloading knows what to load).
*Edit forgot some steps. Make sure your idle timeout is clear or set it to midnight.
Make sure you don't have the default recycle time enabled, clear that out.
If you want to get fancy you can add a loading page and set an http refresh or due further customizations seen below:
https://learn.microsoft.com/en-us/iis/get-started/whats-new-in-iis-8/iis-80-application-initialization
One of our customers has set the application pool to throttle under load at 35%, and at times they noticed the following event
Event ID: 5210
CPU time for application pool 'abc' has been throttled.
They noticed such events show up in the event viewer log, even though the CPU utilization on the web server is not pegged high, for example < 60%
Would like to know:
• Under what condition does the event id 5210 get generated?
• How does IIS detect contention on the CPU? Is it based on a performance counter etc?
The cause of the issue is there is something misconfiguration with your Application Pool.
open iis manager.
Right-click on the appropriate Application Pool and click Advanced Settings.
Set the Limit to 0 and Limit Action value to NoAction.
IIS 8.0 CPU Throttling: Sand-boxing Sites and Applications
This is strange question: you said you had set AppPool to ThrottleUnderLoad and set Limit to 35%. Why do you wonder when it hits the limit and generates a log record? The most common value for the throttle limit is 80%, which allows to throttle the load a little when there are too many requests and the CPU resources are not enough. I don't see sense to set the limit to 35%, this will slow web responses from your server when it has plenty of CPU time.
I suggest increasing the value. This would decrease the number of 5210 warnings.
Goal
Determine the cause of the sporadic lock ups of our web application running on IIS.
Problem
An application we are running on IIS sporadically locks up throughout the day. When it locks up it will lock up on all workers and on all load balanced instance.
Environment and Application
The application is running on 4 different Windows Server 2016 machines. The machines are load balanced using ha-proxy using a round robin load balancing scheme. The IIS application pools this website is hosted in are configured to have 4 workers each and the application it hosts is a 32-bit application. The IIS instances are not using a shared configuration file but the application pools for this application are all configured the same.
This application is the only application in the IIS application pool. The application is an ASP.NET web API and is using .NET 4.6.1. The application is not creating threads of its own.
Theory
My theory for why this is happening is that we have requests that are coming in that are taking ~5-30 minutes to complete. Every machine gets tied up servicing these requests so they look "locked up". The company rolled their own logging mechanism and from that I can tell we have requests that are taking ~5-30 minutes to complete. The team responsible for the application has cleaned up many of these but I am still seeing ~5 minute requests in the log.
I do not have access to the machines personally so our systems team has gotten memory dumps of the application when this happens. In the dumps I generally will see ~50 threads running and all of them are in our code. These threads will be all over our application and do not seem to be stopped on any common piece of code. When the application is running correctly the dumps have 3-4 threads running. Also I have looked at performance counters like the ASP.NET\Requests Queued but it never seems to have any requests queued. During these times the CPU, Memory, Disk and Network usage look normal. Using windbg none of the threads seem to have a high CPU time other than the finalizer thread which as far as I know should live the entire time.
Conclusion
I am looking for a means to prove or disprove my theory as to why we are locking up as well as any metrics or tools I should look at.
So this issue came down to our application using query in stitch on a table with 2,000,000 records in it to another table. Memory would become so fragmented that the Garbage Collector was spending more time trying to find places to put objects and moving them around than it was running our code. This is why it appeared that our application was still working and why their was no exceptions. Oddly IIS would time out the requests but would continue processing the threads.
I'm currently experiencing some hangs on production environment, and after some investigation I'm seeing a lot of request queued up in the worker process of the Application Pool. The common thing is that every request that is queued for a long time is a web api request, I'm using both MVC and Web API in the app.
The requests are being queued for about 3 hours, when the application pool is recycled they immediately start queueing up.
They are all in ExecuteRequestHandler state
Any ideas for where should I continue digging?
Your requests can be stalled for a number of reasons:
they are waiting on I/O operation e.g database, web service call
they are looping or performing operations on a large data set
cpu intensive operations
some combination of the above
In order to find out what your requests are doing, start by getting the urls of the requests taking a long time.
You can do this in the cmd line as follows
c:\windows\system32\inetsrv\appcmd list requests
If its not obvious from the urls and looking at the code, you need to do a process dump of the w3wp.exe on the server. Once you have a process dump you will need to load it into windbg in order to analyze what's taking up all the cpu cycles. Covering off windbg is pretty big, but here's briefly what you need to do:
load the SOS dll (managed debug extension)
call the !runaway command
to get list of long running threads dive into a long running thread
by selecting it and calling !clrstack command
There are many blogs on using windbg. Here is one example. A great resource on analyzing these types of issues is Tess Ferrandez's blog.
This is a really long shot without having first hand access to your system but try and check the Handler mappings in IIS Manager gui for your WebApi. Compare it with IIS settings of your DEV or any other Env where it works.
IF this isnt the issue then do a comparison of all other IIS settings for that App.
Good luck.
We have a Windows Server 2003 web server, and on that server runs about 5-6 top level Sharepoint sites, with a different application pool for each one.
There is one W3WP process that keeps pegging 100% for most of the day (happened yesterday and today) and it's connected (found by doing "Cscript iisapp.vbs" at the command line and matching ProcessID) to a particular Sharepoint site...which is nearly unusable.
What kind of corrective action can I take? These are the following ideas I had
1) Stopping and restarting the Web Site in IIS - For some reason this didn't stop the offending W3WP process??? Any ideas why not?
2) Stopping and restarting the associated Application Pool.
3) Recycling the associated Application Pool.
Any of those sound like the right idea? If not what are some good things to try? I can't do an iisreset since I don't want to alter service to the other, much more heavily used, Sharepoint sites.
If I truly NEED to do some diagnostic work please point me in the right direction. I'm not the Sharepoint admin guy (he's out of town so I'm filling in even though I'm just a developer) but I'll do my best.
If you need any information just let me know and I'll look it up (slowly though, as that one process is pegging the entire machine).
It's not an IISReset that you need. You have a piece of code that is running amok with your memory. Most likely it's not actually a CPU problem but a paging problem. I've encountered this a few times with data structures in memory that grow too large to page in/out effectively and eventually the attempt to page data just begins consuming everything. The steps I would recommend are:
1) Go get the IIS Debug Diagnostics tools. And learn how to use them.
2) If possible, remove the session state from InProc to a state server or a sql server (since this requires serialization of all classes that go into session this may not be possible). This will help alleviate some process related memory issues.
3) Go to your application pool and adjust the number of worker processes upward. Remove Rapid fail protection (this will allow the site to continue serving pages even if rapid catastrophic errors occur).
The IIS debug diagnostics will record a LOT of data, but you can specify specific "catch" alerts that will detect hangs, excessive cpu usage etc. It will capture gigs of data, so be ready for a long wait when attempting to view the logs.
Turns out someone tried to install some features that went haywire.
So he wrote a stsadm script to uninstall those features
Processor was still pegging.
I restarted the IIS Application Pool for that IIS process, didn't fix it.
So then I restarted IIS for that site and that resolved the processor issue.