PerfMon and w3wp? - iis

I'm looking at "Process:w3wp*:% Processor Time" in PerfMon and am struggling to follow something. I have traces running for w3wp and then w3wp#1 - w3wp#6, which are the six sites running on the server.
w3wp's trace doesn't appear to be related to the total of #1-#6 ?
e.g.
'#1 can have a %Processor higher than w3wp, and conversely w3wp can have near 100% when ALL the other %'s are very low.
I'm trying to find a performance bottleneck in our server and the obvious one is that the CPU tops out. We are going to add another CPU (as it's on VM) but I'd like to try to understand what I am looking at...and what can be done to alleviate the issue?
Why is w3wp often close to 100% even though the individual sites are very low? What might be causing w3wp to be so high if its not a particluar site?
ps. If anyone has a way I can save an image here I can post the graph. TY
pps. IIS7 on Win2008.

Answered by Nick Craver in this thread: .net performance counter - Process(w3wp)\% Processor Time
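For anyone who wants to log these counters outside the PerfMon UI and line them up over time, here is a minimal sketch using the Windows PDH API. The counter paths mirror the ones in the question; the instance names (w3wp, w3wp#1, ...) are whatever PerfMon happens to show on your server, so treat them as placeholders.

    /* Minimal PDH sketch -- compile with MSVC: cl pdhsample.c (pdh.lib linked via the pragma). */
    #include <windows.h>
    #include <pdh.h>
    #include <stdio.h>
    #pragma comment(lib, "pdh.lib")

    int main(void)
    {
        PDH_HQUERY query;
        PDH_HCOUNTER cpuTotal, w3wp;
        PDH_FMT_COUNTERVALUE total, proc;
        int i;

        PdhOpenQueryA(NULL, 0, &query);
        /* Counter paths taken from the question; adjust the instance
           name (w3wp, w3wp#1, ...) to the worker process you care about. */
        PdhAddEnglishCounterA(query, "\\Processor(_Total)\\% Processor Time", 0, &cpuTotal);
        PdhAddEnglishCounterA(query, "\\Process(w3wp)\\% Processor Time", 0, &w3wp);

        PdhCollectQueryData(query);      /* first sample primes the rate counters */
        for (i = 0; i < 30; i++) {
            Sleep(1000);
            PdhCollectQueryData(query);
            PdhGetFormattedCounterValue(cpuTotal, PDH_FMT_DOUBLE, NULL, &total);
            PdhGetFormattedCounterValue(w3wp, PDH_FMT_DOUBLE, NULL, &proc);
            /* Note: Process(...)\% Processor Time is scaled to a single core, so a
               busy worker process can legitimately read above 100% on a multi-core box. */
            printf("machine total %6.1f%%   w3wp %6.1f%%\n",
                   total.doubleValue, proc.doubleValue);
        }
        PdhCloseQuery(query);
        return 0;
    }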

Related

Linux Webserver - htop shows extreme cpu usage?

I am a bit confused about what the tool "htop" shows as CPU usage and load average. I was asked to have a look at a webserver which is performing incredibly slowly.
I googled a bit and always found the statement that everything above 1.00 in load average is terrible when you only have one CPU in the machine.
However, my "htop" experience looks like this:
[htop screenshot]
Can someone please tell me what exactly is going on here? Is this bad or do I misunderstand everything?
Thank you for your help.
In your screenshot the CPU usage bars are colored green and red. Press '?' in htop to bring up the help screen. There you will see that the green color is CPU usage by normal-priority userspace applications and the red color is for kernel threads.
Basically, in your screenshot all the CPU cores are 100% busy, and most of that time is being spent in the kernel.
Yes, this is bad. Further investigation is needed to tell what exactly is going on here.
The htop screenshot shows each core of the CPU and the usage of each. What you really want to be looking at are the processes and how much CPU they are consuming.
There's an article here which explains it in more detail: http://www.thegeekstuff.com/2011/09/linux-htop-examples
Good luck!
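If you want to double-check what those colored bars are telling you without htop, the same user-versus-kernel split can be computed directly from /proc/stat. A rough sketch (aggregate "cpu" line only, sampled twice five seconds apart):

    /* Rough sketch: compute the user/system/idle split that htop's colored
       CPU bars visualize, straight from the aggregate line of /proc/stat. */
    #include <stdio.h>
    #include <unistd.h>

    struct cpu { unsigned long long user, nice, sys, idle, iowait, irq, softirq; };

    static void read_cpu(struct cpu *c)
    {
        FILE *f = fopen("/proc/stat", "r");
        fscanf(f, "cpu %llu %llu %llu %llu %llu %llu %llu",
               &c->user, &c->nice, &c->sys, &c->idle,
               &c->iowait, &c->irq, &c->softirq);
        fclose(f);
    }

    int main(void)
    {
        struct cpu a, b;
        double user, sys, idle, total;

        read_cpu(&a);
        sleep(5);
        read_cpu(&b);

        user  = (b.user + b.nice) - (a.user + a.nice);
        sys   = (b.sys + b.irq + b.softirq) - (a.sys + a.irq + a.softirq);
        idle  = (b.idle + b.iowait) - (a.idle + a.iowait);
        total = user + sys + idle;

        /* roughly: "green" time vs "red" time */
        printf("user %.1f%%  kernel %.1f%%  idle+iowait %.1f%%\n",
               100.0 * user / total, 100.0 * sys / total, 100.0 * idle / total);
        return 0;
    }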

Profiling resource usage - CPU, memory, hard-drive - of a long-running process on Linux?

We have a process that takes about 20 hours to run on our Linux box. We would like to make it faster, and as a first step need to identify bottlenecks. What is our best option to do so?
I am thinking of sampling the process's CPU, RAM, and disk usage every N seconds. So unless you have other suggestions, my specific questions would be:
How large should N be?
Which tool can provide accurate readings of these stats, with minimal interference or disruption from the fact that the tool itself is running?
Any other tips, nuggets of wisdom, or references to other helpful documents would be appreciated, since this seems to be one of those tasks where you can make a lot of time-consuming mistakes and false starts as a newbie.
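To make the question concrete, the kind of sampler I have in mind is roughly the following. It's only a sketch: it assumes a Linux /proc layout, and the PID and interval are passed on the command line.

    /* Hypothetical sampler sketch: append CPU time and RSS for one PID every N
       seconds.  Run as ./sampler <pid> <interval-seconds> and post-process the output. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>
    #include <unistd.h>

    /* CPU time (user+system, in clock ticks) consumed so far by the target PID */
    static unsigned long long cpu_ticks(const char *pid)
    {
        char path[64], buf[4096];
        unsigned long long utime = 0, stime = 0;
        char *p;
        FILE *f;

        snprintf(path, sizeof path, "/proc/%s/stat", pid);
        f = fopen(path, "r");
        if (!f) return 0;
        if (fgets(buf, sizeof buf, f)) {
            p = strrchr(buf, ')');   /* skip "pid (comm)" -- comm may contain spaces */
            if (p)
                sscanf(p + 2, "%*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %llu %llu",
                       &utime, &stime);
        }
        fclose(f);
        return utime + stime;
    }

    /* resident set size of the target PID, in pages */
    static long rss_pages(const char *pid)
    {
        char path[64];
        long rss = 0;
        FILE *f;

        snprintf(path, sizeof path, "/proc/%s/statm", pid);
        f = fopen(path, "r");
        if (!f) return 0;
        fscanf(f, "%*d %ld", &rss);          /* second field of statm is RSS */
        fclose(f);
        return rss;
    }

    int main(int argc, char **argv)
    {
        int interval;
        long page_kb;
        unsigned long long prev, now;

        if (argc != 3) {
            fprintf(stderr, "usage: %s <pid> <interval-seconds>\n", argv[0]);
            return 1;
        }
        interval = atoi(argv[2]);
        page_kb  = sysconf(_SC_PAGESIZE) / 1024;
        prev     = cpu_ticks(argv[1]);

        for (;;) {                            /* one line per sample; redirect to a file */
            sleep(interval);
            now = cpu_ticks(argv[1]);
            printf("%ld cpu_ticks=%llu rss_kb=%ld\n",
                   (long)time(NULL), now - prev, rss_pages(argv[1]) * page_kb);
            fflush(stdout);
            prev = now;
        }
    }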
First of all, what you want and what you are asking for are two different things.
Monitoring is what you need when you run the process for the first time, i.e. when you don't know its resource utilization (CPU, memory, disk, etc.).
You can follow the procedure below to drill down to the bottleneck:
Monitor system resources (generally a 10-20 second interval is fine with Munin, Ganglia, or another tool).
From this you should be able to identify whether your hardware is the bottleneck, i.e. whether you are running out of resources, e.g. 100% CPU utilization, very low free memory, high I/O wait, etc.
If that is your case, then think about upgrading the hardware or tuning the existing setup.
Then tune your application/utility. Use profilers/loggers to find out which method or process is taking the time, and try to tune it (a crude timing sketch follows below). If your code is single-threaded, consider parallelism. If a database is involved, try to tune your queries and DB parameters.
Then run the test again with monitoring to drill down further. :)
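For the "which method is taking the time" step, even a crude wall-clock logger around the suspect phases can be enough before reaching for a full profiler. A sketch, where load_input/crunch are hypothetical stand-ins for whatever your process actually does:

    /* Crude phase timer: wrap the suspect sections of the 20-hour job and
       log wall-clock time per phase to stderr. */
    #include <stdio.h>
    #include <time.h>

    static double now_sec(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    }

    #define TIMED(label, stmt) do {                                       \
            double t0 = now_sec();                                        \
            stmt;                                                         \
            fprintf(stderr, "%-12s %.2f s\n", label, now_sec() - t0);     \
        } while (0)

    /* hypothetical phases standing in for the real work */
    static void load_input(void) { /* ... */ }
    static void crunch(void)     { /* ... */ }

    int main(void)
    {
        TIMED("load_input", load_input());
        TIMED("crunch",     crunch());
        return 0;
    }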
I think a graphical representation would be helpful for solving your problem, and I'd advise you to try Munin.
It's a resource monitoring tool with a web interface. By default it monitors disk I/O, memory, CPU, load average, network usage... It's light and easy to install. It's also easy to develop your own plugins and set alert thresholds.
http://munin-monitoring.org/
Here is an example of what you can get from Munin: http://demo.munin-monitoring.org/munin-monitoring.org/demo.munin-monitoring.org/
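To give an idea of how small a custom plugin can be, here is a sketch of one that graphs the resident set size of a single process. Munin just runs the executable, once with the argument "config" and then periodically for values; the hard-coded PID and the "rss" field name are made up for the example.

    /* Sketch of a minimal Munin plugin in C; plugins can be any executable. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    #define TARGET_PID "12345"    /* hypothetical PID of the long-running process */

    int main(int argc, char **argv)
    {
        long rss = 0;
        FILE *f;

        /* Munin calls the plugin once with "config" to get the graph definition... */
        if (argc > 1 && strcmp(argv[1], "config") == 0) {
            puts("graph_title Long-running process RSS");
            puts("graph_vlabel kilobytes");
            puts("rss.label resident set size");
            return 0;
        }

        /* ...and then periodically with no argument to collect a value. */
        f = fopen("/proc/" TARGET_PID "/statm", "r");
        if (f) {
            fscanf(f, "%*d %ld", &rss);   /* second field of statm is RSS in pages */
            fclose(f);
        }
        printf("rss.value %ld\n", rss * (sysconf(_SC_PAGESIZE) / 1024));
        return 0;
    }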

IIS Memory Leak in 64 bit only mode

I am having an issue where w3wp.exe is gaining around 100 MB per page load on a specific page (not the entire site). The page is not that memory-intensive and should not require so much memory.
I modified a single setting, "Enable 32-bit Applications", and set it to true, and now the leak is gone; however, I need to understand why this might be happening. It is happening only on one server; the other servers we test on do not see this issue. When Enable 32-bit Applications is disabled (false), the results from the ANTS memory profiler are attached below. Does anyone have any idea what's going on? Please note the only thing growing is "Unused Memory" / "Free Space".
Can you go into the list of classes and sort by the memory occupied?
I was too quick to judge this. After I disabled most of the work that one page was doing, the growth stopped, but looking at other pages I saw a similar pattern, though it stopped at a lower memory limit, say around 450 MB. Then I upped our private memory limit to 2 GB instead of 1 GB and re-enabled the "leaking" code. The memory shot up to 1.05 GB in 3 refreshes. 20 refreshes later, it is not changing significantly.
This is a case of the IIS 64-bit app pool allocating way more memory than is necessary. Since it isn't actually leaking, this question was invalid.
Anyway, if you notice the same behavior with Enable 32-bit Applications, I hope this helps you.

linux CPU cache slowdown

We're getting overnight lockups on our embedded (Arm) linux product but are having trouble pinning it down. It usually takes 12-16 hours from power on for the problem to manifest itself. I've installed sysstat so I can run sar logging, and I've got a bunch of data, but I'm having trouble interpreting the results.
The targets only have 512 MB RAM (we have other models which have 1 GB, but they see this issue much less often), and have no disk swap files, to avoid wearing the eMMCs.
Some kind of paging / virtual memory event is initiating the problem. In the sar logs, pgpgin/s, pgscand/s, pgsteal/s, and majflt/s all increase steadily before snowballing to crazy levels. This pushes the CPU load up to correspondingly high levels (30-60 on dual-core Arm chips). At the same time, the frmpg/s values go very negative, whilst campg/s goes highly positive. The upshot is that the system is trying to allocate a large number of cache pages all at once. I don't understand why this would be.
The target then essentially locks up until it's rebooted, or someone kills the main GUI process, or it crashes and is restarted (we have a monolithic GUI application that runs all the time and generally does all the serious work on the product). The network shuts down; telnet blocks forever, as do /proc filesystem queries and things that rely on them, like top. The memory allocation profile of the main application in this test is dominated by reading data in from file and caching it as textures in video memory (shared with main RAM) in an LRU cache using OpenGL ES 2.0. Most of the time it'll be accessing a single file (they are about 50 MB in size), but I guess it could be triggered by having to suddenly use a new file and trying to cache all 50 MB of it in one go. I haven't yet done the test (putting more logging in) to correlate this event with these system effects.
The odd thing is that the actual free and cached RAM levels don't show an obvious lack of memory (I have seen the oom-killer swoop in to kill the main application with >100 MB free and 40 MB of cache RAM). The main application's memory usage seems reasonably well behaved, with a VmRSS value that seems pretty stable. Valgrind hasn't found any progressive leaks that would happen during operation.
The behaviour seems like that of a system frantically swapping out to disk and making everything run dog slow as a result, but I don't know if this is a known effect in a free<->cache RAM exchange system.
My problem is superficially similar to question: linux high kernel cpu usage on memory initialization but that issue seemed driven by disk swap file management. However, dirty page flushing does seem plausible for my issue.
I haven't tried playing with the various vm files under /proc/sys/vm yet. vfs_cache_pressure and possibly swappiness would seem good candidates for some tuning, but I'd like some insight into good values to try here. The documentation for vfs_cache_pressure seems vague about what the quantitative difference would be between setting it to 200 as opposed to 10000.
The other interesting fact is that it is a progressive problem. It might take 12 hours for the effect to happen the first time. If the main app is killed and restarted, it seems to happen every 3 hours after that. A full cache purge might push this back out, though.
Here's a link to the log data, with two files: sar1.log, which is the complete output of sar -A, and overview.log, an extract of free/cached memory, CPU load, MainGuiApp memory stats, and the -B and -R sar outputs for the interesting period between midnight and 3:40am:
https://drive.google.com/folderview?id=0B615EGF3fosPZ2kwUDlURk1XNFE&usp=sharing
So, to sum up, what's my best plan here? Tune vm to tend to recycle pages more often to make it less bursty? Are my assumptions about what's happening even valid given the log data? Is there a cleverer way of dealing with this memory usage model?
Thanks for your help.
Update 5th June 2013:
I've tried the brute-force approach and set up a script which echoes 3 to /proc/sys/vm/drop_caches every hour. This seems to be maintaining the steady state of the system right now, and the sar -B stats stay on the flat portion, with very few major faults and 0.0 pgscand/s. However, I don't understand why keeping the cache RAM very low mitigates a problem where the kernel is apparently trying to add the universe to cache RAM.
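For reference, the hourly job boils down to the following (running as root is required, and scheduling it via cron or similar is assumed; sync first so dirty pages are written back before the drop):

    /* What the brute-force script amounts to: flush dirty pages, then ask the
       kernel to drop the page cache plus reclaimable dentries and inodes. */
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        FILE *f;

        sync();                                   /* write back dirty pages first */
        f = fopen("/proc/sys/vm/drop_caches", "w");
        if (!f) { perror("drop_caches"); return 1; }
        fputs("3\n", f);                          /* 3 = pagecache + dentries/inodes */
        fclose(f);
        return 0;
    }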

virtual memory consumption of pthreads

Hello, I developed a multi-threaded TCP server application that allows 10 concurrent connections, receives continuous requests from them, and, after some processing of the requests, responds to the clients. I'm running it on a TI OMAP L137 processor-based board running MontaVista Linux. Threads are created per client, i.e. 10 threads, and it's pre-threaded. Its physical memory usage is about 1.5% and its CPU usage is about 2% according to ps, top, and meminfo. Its virtual memory usage rises up to 80 MB, whereas I only have 48 MB (I reduced it from U-Boot to reserve some memory for the DSP). Any help is appreciated; how can I reduce it? (/proc/sys/vm/.. tricks don't help. :)
Thanks.
You can try using a drop-in garbage-collecting replacement for malloc() and see if that solves your problem. If it does, find the leaks and fix them, then get rid of the garbage collector.
It's 'interesting' to chase these kinds of problems on platforms that most heap analyzers and profilers (e.g. Valgrind) don't fully (if at all) support.
On another note, given the constraints... I'm assuming you have decreased the default thread stack size? I think the default is 8 MB; you probably don't need that much. See pthread_attr_setstacksize() if you haven't adjusted it.
Edit:
You can check the default stack size with pthread_attr_getstacksize(). If it is at 8 MB, you've already blown your ceiling during thread creation: 10 threads (as you mentioned) x 8 MB of stack is 80 MB of virtual address space, which matches the figure you're seeing.
Most of that VM is probably just stacks. Of course, it's virtual, so it doesn't get committed if you don't use it.
(I'm wondering if the default thread stack size has anything to do with ulimit -s.)
Apparently yes, according to this other SO question.
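To put that suggestion into code, here is a sketch that creates the ten per-client threads with an explicit, smaller stack instead of the inherited default; the worker function and the 256 KB figure are only illustrative:

    /* Sketch: create the per-client threads with a small explicit stack
       instead of the default (often ulimit -s, e.g. 8 MB, which with 10
       threads alone accounts for ~80 MB of virtual memory). */
    #include <pthread.h>
    #include <stdio.h>

    #define NUM_CLIENTS 10
    #define STACK_SIZE  (256 * 1024)   /* illustrative; must be >= PTHREAD_STACK_MIN */

    static void *client_worker(void *arg)
    {
        /* ... handle one client connection ... */
        (void)arg;
        return NULL;
    }

    int main(void)
    {
        pthread_attr_t attr;
        pthread_t tid[NUM_CLIENTS];
        size_t def;
        int i;

        pthread_attr_init(&attr);
        pthread_attr_getstacksize(&attr, &def);
        printf("default stack size: %zu bytes\n", def);

        pthread_attr_setstacksize(&attr, STACK_SIZE);
        for (i = 0; i < NUM_CLIENTS; i++)
            pthread_create(&tid[i], &attr, client_worker, NULL);

        for (i = 0; i < NUM_CLIENTS; i++)
            pthread_join(tid[i], NULL);

        pthread_attr_destroy(&attr);
        return 0;
    }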
Does it rise to that level and stay there? Or does it eventually run out of memory? If the former, you simply need to figure out a way to have a smaller working set. If the latter, you have a memory leak and need to fix it.
