I understand that older versions of Procmon and its predecessors (Filemon, Regmon, etc.) used kernel-mode drivers to hook the kernel. However, PatchGuard prevents SSDT hooking and the like on 64-bit Vista and later.
It is my understanding that Procmon now uses a minifilter driver for file I/O monitoring and ETW for network monitoring. However, I am not clear on how it monitors registry access and process/image/thread events. Does it also use ETW for these?
There are a bunch of callbacks for monitoring support in the kernel (since XP):
registry -> http://msdn.microsoft.com/en-us/library/windows/hardware/ff545879(v=vs.85).aspx
process/image/thread notify - PsSetCreateProcessNotifyRoutineEx / PsSetLoadImageNotifyRoutine / PsSetCreateThreadNotifyRoutine -> http://msdn.microsoft.com/en-us/library/windows/hardware/ff559917(v=vs.85).aspx
object manager callbacks for handles monitoring -> http://msdn.microsoft.com/en-us/library/windows/hardware/ff558692(v=vs.85).aspx
On XP there were some limitations, but since Vista they are fully functional. There is no need to patch any internal tables for any monitoring activity (a minimal registration sketch follows).
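For illustration, here is a minimal sketch of how a driver could register two of the callbacks above. This is not Procmon's actual code: the routine names and the altitude string are made up, and error handling plus unregistration on unload are omitted. Note that PsSetCreateProcessNotifyRoutineEx additionally requires the driver image to be linked with /INTEGRITYCHECK.

    #include <ntddk.h>

    static LARGE_INTEGER g_RegCookie;   // cookie returned by CmRegisterCallbackEx

    // Invoked for every registry operation system-wide.
    static NTSTATUS MonRegistryCallback(PVOID Context, PVOID Argument1, PVOID Argument2)
    {
        UNREFERENCED_PARAMETER(Context);
        UNREFERENCED_PARAMETER(Argument2);
        REG_NOTIFY_CLASS op = (REG_NOTIFY_CLASS)(ULONG_PTR)Argument1;
        DbgPrint("registry operation %d\n", (int)op);
        return STATUS_SUCCESS;              // a pure monitor never blocks the operation
    }

    // Invoked on process creation (CreateInfo != NULL) and process exit (CreateInfo == NULL).
    static VOID MonProcessNotify(PEPROCESS Process, HANDLE ProcessId,
                                 PPS_CREATE_NOTIFY_INFO CreateInfo)
    {
        UNREFERENCED_PARAMETER(Process);
        if (CreateInfo != NULL && CreateInfo->ImageFileName != NULL) {
            DbgPrint("process %p created: %wZ\n", ProcessId, CreateInfo->ImageFileName);
        } else if (CreateInfo == NULL) {
            DbgPrint("process %p exited\n", ProcessId);
        }
    }

    extern "C" NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
    {
        UNREFERENCED_PARAMETER(RegistryPath);

        UNICODE_STRING altitude = RTL_CONSTANT_STRING(L"380000");   // illustrative altitude
        NTSTATUS status = CmRegisterCallbackEx(MonRegistryCallback, &altitude,
                                               DriverObject, NULL, &g_RegCookie, NULL);
        if (!NT_SUCCESS(status)) {
            return status;
        }

        // FALSE = register (TRUE would remove a previously registered routine).
        return PsSetCreateProcessNotifyRoutineEx(MonProcessNotify, FALSE);
    }

PsSetLoadImageNotifyRoutine and PsSetCreateThreadNotifyRoutine are registered in exactly the same style, which covers image-load and thread events without any hooking.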
As far as I understand, Linux does not have IO Completion Ports.
That's probably the reason why, in Scala (on the JVM), the developer should explicitly notify the API about blocking operations (for example by wrapping them in scala.concurrent.blocking).
However, the Task Parallel Library does not seem to bother the developer with such details; everything works out of the box.
But how, then, is thread starvation avoided in the case of Mono/.NET Core on Linux?
I am new to tracing on Linux. I have a multi-threaded C++ user application. The threads wake up periodically (on an OS timer) and go back to sleep after doing some processing. I want to visualise:
1) When the threads start and stop running
2) Which cores the threads are running on.
I have installed lttng and Trace Compass onto an Ubuntu 14.04 LTS machine. But I don't know how to use these tools to achieve my objective.
I have read the following lttng doc section:
http://lttng.org/docs/#doc-tracing-your-own-user-application
In order to collect my trace, must I define custom LTTng tracepoint definitions (in a tracepoint provider header file) and insert tracepoints into my user application, or is there a simpler way of achieving my goal?
Best regards
David
You can take a kernel trace, enabling at least the sched_switch event, to obtain information about which thread is running on which CPU. Opening such a trace in Trace Compass and looking at the Control Flow View should show the status of all threads, so you can search for the ones that correspond to your application.
In addition, you could also instrument your application with userspace tracepoints, as you mentioned. This would allow you to track userspace states, going further than what is available in just the kernel trace.
You might be interested in this example/tutorial, which shows how to instrument a simple application and how to write a Trace Compass configuration file to display application states graphically.
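If you only need a handful of ad-hoc events rather than full tracepoint definitions, lttng-ust also ships a printf-style helper, tracef(), which needs no provider header. A rough sketch under that assumption (the worker loop, messages and build flags are illustrative and may need adjusting for your distro):

    // Build, roughly:  g++ -std=c++11 -pthread app.cpp -o app -llttng-ust
    // Record, roughly:
    //   lttng create mysession
    //   lttng enable-event --kernel sched_switch
    //   lttng enable-event --userspace 'lttng_ust_tracef:*'
    //   lttng start ; ./app ; lttng stop ; lttng destroy
    #include <lttng/tracef.h>
    #include <chrono>
    #include <thread>

    static void worker(int id)
    {
        for (int i = 0; i < 10; ++i) {
            tracef("worker %d: processing start (iteration %d)", id, i);
            std::this_thread::sleep_for(std::chrono::milliseconds(5));   // simulated work
            tracef("worker %d: processing done (iteration %d)", id, i);
            std::this_thread::sleep_for(std::chrono::milliseconds(95));  // periodic sleep
        }
    }

    int main()
    {
        std::thread t1(worker, 1), t2(worker, 2);
        t1.join();
        t2.join();
        return 0;
    }

Correlating these userspace events with the kernel sched_switch events in Trace Compass then shows both when each thread was running and on which CPU it was scheduled.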
Is there a way to enable Performance Counters to monitor Node.js application performance in Windows Azure?
I haven't experimented with it myself yet, but there is node-perfmon, which is a wrapper around typeperf. It says it allows you to write performance counters, as well as do simple memory/CPU monitoring. Is this the sort of monitoring you were looking for?
Just adding more to the above answers...
For application stats monitoring in Node.js you can use Hummingbird. It reports status over HTTP, so you can integrate the code into your Node.js app and expose one port to serve the monitoring data. There is no need to use Azure Storage diagnostics, and all the information is available in real time on the same machine. It's still in pre-alpha, but it handles its few tasks really well.
http://projects.nuttnet.net/hummingbird/
I know about the Node.js "monitor" plugin, which is the best option on Linux machines for system-specific performance and also uses HTTP to provide the system-specific data. I am not sure whether it can be ported to Windows Server, but if it can, it is a great choice. Read more about monitor usage here:
http://www.sys-con.com/node/2275314
You may want to also look at these, they aren't directly using perfmon, but allow you to monitor the performance of your Node.js server:
http://search.npmjs.org/#/Probes.js
http://search.npmjs.org/#/nodetime
The NPM registry is a great tool for finding Node.js packages.
I am using Oracle 11g and I have an application coded with the Spring Framework. When I configure the database on a Sun Fire X4170 running Linux, the machine's CPU utilization is around 80-100%. However, when I move the same database to a Sun M3000 server running a Unix OS (supposedly a more powerful machine), the application performance goes down and CPU utilization stays at 90-100%. I can't figure out whether it's the application that is causing this utilization or the database design.
I should add that the database is not relational; the relationships are handled by the application.
Well you certainly can find some interesting opinions on the intertubes.
"Oracle does not have a true server architecture (others have it). Rather than performing classic server tasks, such as multi-threading, caching of data pages, parallel processing (split a query across many devices) etc. within itself, it uses the o/s to do all that. That means for each user process (PL/SQL connection) there is one unix process; 1000 users means 1000 unix processes, all competing for the same resources."
You might note that Oracle has had
a connection pooling architecture (multi-threaded server) since version 7 (1992).
a cache for data pages (known helpfully as the buffer cache) since forever
parallel query (splitting a query across many processes) since version 7.1 (1993)
splitting queries across multiple servers since OPS (version 6) or across distributed databases (version 5)
It's also noteworthy that even if all of that were correct rather than incorrect, it wouldn't actually help you determine the root cause.
"Especially noteworthy, because it uses file system files (not raw partitions), and the "caching" is outside, it relies heavily on (and is very sensitive to) the file system cache that you have set up. Likewise, Oracle needs a massive amount of memory for these processes."
Oracle certainly can use raw partitions, again dating back to the last millennium. Moreover, if you wish to cache within the database (using the buffer cache that PerformanceDBA has forgotten about) and bypass the filesystem cache, this feature is available on all current filesystems. Oracle also supplies its own combined filesystem/volume manager in ASM, which you can use if you wish.
Oracle is also rather well instrumented (and if you have access to DTrace, so is Solaris): it can certainly tell you which sessions and processes are using the CPU and what the time the application spends in the database is consumed by (down to individual block read times, if you care), so it is very amenable to profiling. I'd recommend that you check out "Thinking Clearly about Performance", available at http://www.method-r.com/downloads/cat_view/38-papers-and-articles and written by one of the top Oracle performance experts in the world. If you have access to the Oracle Diagnostics Pack, then checking out first ADDM reports and then AWR reports would be profitable.
Trying to avoid a flame war here.
I should probably have separated the "how to find out" part of my response more clearly from my responses to PerformanceDBA's comments about server architecture. I share Stephanie's suspicions about the Spring framework, but without properly scoped measurement evidence there is no point in blaming any particular part of the environment; that would just be bias. Fortunately, the instrumentation built into the Oracle kernel allows you to trace and then profile the slow sessions to determine exactly where the issue lies. So I would do the following:
1) enable tracing for a representative session (you can use the dbms_monitor package for that).
2) also gather an execution plan for the statement(s) involved with the gather_plan_statistics hint.
3) profile the trace file by time using an appropriate profiler (tkprof, orasrp, Method R Profiler).
Then investigate the problem statements in order of their contribution to response time.
If you can't carry out the above, then you can use ADDM and/or AWR if licensed for the Diagnostics Pack (as I originally suggested), or Statspack if not. ADDM naturally concentrates on time consumers; if you are forced down the Statspack route, I suggest you do the same.
The M3000 is certainly a more powerful machine, but it is more suitable for true servers. The X4170 with hyper-threads is more suited for file servers.
I'm not so certain about that. Have any data to support that claim?
An M3000 has one SPARC64 VII processor with 4 cores (tech specs), while an X4170 has 1 or 2 Intel Xeon 5500 "Nehalem-EP" processors, each with 4 cores (tech specs). I know that I would expect much more from even a single-processor Nehalem-EP system than from the M3000. Obviously the data will vary slightly with the workload, but I know where I'd put my money.
From this answer: When is a C++ terminate handler the Right Thing(TM)?
It would be nice to have a list of resources that 'are' and 'are not' automatically cleaned up by the OS when an application quits. In your answer, it would be nice if you could specify the OS/resource and preferably a link to some documentation (if appropriate).
The obvious one:
Memory: yes, automatically cleaned up.
Question: are there any exceptions?
There are some obscure resources that Windows does not clean up when an app crashes or exits without explicitly releasing them, mostly because the OS doesn't know if they're important to leave around or not.
Temporary files -- as others have mentioned.
Globally registered WNDCLASSes ("No window classes registered by a DLL are unregistered when the DLL is unloaded. A DLL must explicitly unregister its classes when it is unloaded." MSDN) If your global window class also has a class DC, then that DC will leak as well.
Global ATOMs (a relatively limited resource).
Window message IDs created with RegisterWindowMessage. These are designed to leak, since there's no UnregisterWindowMessage.
Semaphores and Events aren't technically leaked, but when the owning application goes away without signalling them, other processes can hang. This is not true for a Mutex: if the owning application goes away, other processes waiting on that Mutex are released (see the sketch after this list).
There may be some residual weirdness on Windows XP and earlier if you don't unregister a hot key before exiting. Other applications may be unable to register the same hot key.
On Windows XP and earlier, it's not uncommon to have a zombie console window live on after a process crashes. (Specifically, a GUI application that also creates a console window.) It shows up on the task bar. All you can do is minimize, restore, or move the window.
Buggy drivers can be aggravated by apps that don't explicitly release resources when they exit. Non-paged pool leaks are fairly common.
Data copied to the clipboard. I guess that doesn't really count because it's owned by the OS at that point, not the application that put it there.
Globally installed hooks aren't unloaded when the installing process crashes before removing the hook.
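To make the Mutex point above concrete, here is a hedged sketch (the mutex name is invented): a process waiting on a mutex whose owner died is woken with WAIT_ABANDONED instead of hanging, which is exactly what an unsignalled Event or Semaphore will not do for you.

    #include <windows.h>
    #include <cstdio>

    int main()
    {
        HANDLE mtx = CreateMutexW(nullptr, FALSE, L"Local\\DemoCrashMutex");
        if (mtx == nullptr) {
            return 1;
        }

        switch (WaitForSingleObject(mtx, INFINITE)) {
        case WAIT_OBJECT_0:
            std::printf("acquired normally\n");
            break;
        case WAIT_ABANDONED:
            // The previous owner exited or crashed without calling ReleaseMutex.
            // We now own the mutex, but any state it protected may be inconsistent.
            std::printf("acquired an abandoned mutex\n");
            break;
        }

        ReleaseMutex(mtx);
        CloseHandle(mtx);
        return 0;
    }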
Temporary files are a good example of something that will not be cleaned up - the handle is released but the file isn't deleted.
In Windows, just about anything you can get a handle to should in fact be managed by the OS - that's why you only get a handle. This includes, but is not limited to, the following (list copied from the MSDN docs for the CloseHandle() API):
Communications device
Console input
Console screen buffer
Event
File
File mapping
Job
Mailslot
Mutex
Named pipe
Process
Semaphore
Socket
Thread
Token
All of these should be recovered by the OS when an application closes, though possibly not immediately, depending on their use by other processes.
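As a small, hedged demonstration (the object names and the temp file are invented): a process that exits without calling CloseHandle does not leak the kernel objects themselves, because the OS closes every handle in the process's handle table at exit; only on-disk artifacts such as the file's contents survive, which is the temporary-file caveat mentioned elsewhere in this thread.

    #include <windows.h>
    #include <cstdio>

    int main()
    {
        // Create a few kernel objects and deliberately never close the handles.
        HANDLE ev  = CreateEventW(nullptr, TRUE, FALSE, L"Local\\DemoEvent");
        HANDLE mtx = CreateMutexW(nullptr, FALSE, L"Local\\DemoMutex");
        HANDLE f   = CreateFileW(L"demo.tmp", GENERIC_WRITE, 0, nullptr,
                                 CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);

        std::printf("handles: %p %p %p\n", ev, mtx, f);

        // No CloseHandle calls: at process exit the kernel closes every handle in
        // this process's handle table. The event and mutex are destroyed (assuming
        // no other process holds a handle to them); the file handle is released,
        // but demo.tmp itself stays on disk -- only handles are reclaimed.
        return 0;
    }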
Other operating systems work in the same way. It's hard to imagine an OS worth its name (I exclude embedded systems etc.) where this is not the case - resource management is the #1 raison d'être of an operating system.
Any exception is a bug - applications can and do crash and do contain leaks. An OS needs to be reliable and not exhaust resources even in the face of poorly written applications. This also applies to non-OS resources. Services that hand out resources to processes need to free those resources when the process exits. If they don't it is a bug that needs to be fixed.
If you're looking for program artifacts which can persist beyond process exit, on Windows you have at least the following (a small sketch of the first two items follows the list):
Registry keys that are created without REG_OPTION_VOLATILE
Files created without FILE_FLAG_DELETE_ON_CLOSE
Event log entries
Paper that was used for print jobs
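A rough sketch of the first two items (the key and file names are invented): a key created without REG_OPTION_VOLATILE and a file created without FILE_FLAG_DELETE_ON_CLOSE both outlive the process, while their counterparts do not (a volatile key survives only until its hive is unloaded, and a delete-on-close file vanishes with its last handle).

    #include <windows.h>

    int main()
    {
        HKEY persistentKey = nullptr;
        HKEY volatileKey = nullptr;

        // Non-volatile: persists after process exit and across reboots until deleted.
        RegCreateKeyExW(HKEY_CURRENT_USER, L"Software\\DemoPersistent", 0, nullptr,
                        REG_OPTION_NON_VOLATILE, KEY_WRITE, nullptr, &persistentKey, nullptr);

        // Volatile: persists after process exit, but is discarded when the hive unloads.
        RegCreateKeyExW(HKEY_CURRENT_USER, L"Software\\DemoVolatile", 0, nullptr,
                        REG_OPTION_VOLATILE, KEY_WRITE, nullptr, &volatileKey, nullptr);

        // Delete-on-close: the file is removed when its last handle is closed,
        // so it cannot persist beyond the process.
        HANDLE scratch = CreateFileW(L"scratch.tmp",
                                     GENERIC_READ | GENERIC_WRITE | DELETE,
                                     FILE_SHARE_READ | FILE_SHARE_DELETE, nullptr,
                                     CREATE_ALWAYS, FILE_FLAG_DELETE_ON_CLOSE, nullptr);

        RegCloseKey(persistentKey);
        RegCloseKey(volatileKey);
        CloseHandle(scratch);   // scratch.tmp is deleted here
        return 0;
    }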