VTune hotspots analysis reports my program's execution time (elapsed time) as 60 seconds, of which only 10 seconds are reported as "CPU Time". I'm trying to figure out where the remaining 50 seconds were spent. Using Windows Process Monitor's File System Activity, I see my program spent 5 seconds doing disk I/O. That still leaves 45 seconds unaccounted for.
There are two threads in my program; according to VTune, one of them consumed 99% of the CPU time. I don't see how these two threads, given their execution profiles, could explain the lost time.
Any thoughts?
I would like to measure the CPU usage of a running database server when I execute a query.
The goal is to get the wall-clock time, total CPU time, user CPU time, and kernel (system) CPU time, so I can estimate how much time is spent on computation and how much on I/O.
The server is dedicated to this experiment and the CPU usage is close to 0% when no query is running, so my plan is to
start the monitor
run the query
stop the monitor and collect the CPU usage during the interval
The monitor can either give a sum of CPU time over that period or a list of sampling results that I can sum up myself.
I have searched for similar problems and tried several solutions, but they do not satisfy my needs.
pidstat: pidstat seems good, but the granularity is too coarse. The smallest interval is 1 second, and I need a finer interval such as 100 ms.
mpstat: the same problem as pidstat; the interval is too large.
top: top can run in batch mode, but the sampling interval is also large (2-3 s), and it does not provide a user/kernel time breakdown.
Thank you all for your suggestions!
Try using the "time" or "times" command. You might need a workaround to use these commands.
time program_name
https://man7.org/linux/man-pages/man1/time.1.html
"times" needs to be executed from the same shell from where database server has been started.
https://man7.org/linux/man-pages/man2/times.2.html
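If you know the server's PID, the same per-process counters that times(2) reports are also exposed in /proc/<pid>/stat (the utime and stime fields, in clock ticks). Below is a rough Java sketch of reading them immediately before and after the query and taking the difference; the PID argument, the class name, and the tick rate of 100 (USER_HZ) are assumptions, not something prescribed by the man pages.

import java.nio.file.Files;
import java.nio.file.Path;

public class CpuDelta {
    // Clock ticks per second (USER_HZ); assumed to be 100, which is typical on Linux.
    static final long TICKS_PER_SEC = 100;

    // Returns {utime, stime} in clock ticks for the given PID, read from /proc/<pid>/stat.
    static long[] cpuTicks(long pid) throws Exception {
        String stat = Files.readString(Path.of("/proc/" + pid + "/stat"));
        // Skip past the ")" that ends the command name; utime and stime are fields 14 and 15
        // of the full line, i.e. indexes 11 and 12 of the remaining space-separated fields.
        String[] f = stat.substring(stat.lastIndexOf(')') + 2).split(" ");
        return new long[] { Long.parseLong(f[11]), Long.parseLong(f[12]) };
    }

    public static void main(String[] args) throws Exception {
        long pid = Long.parseLong(args[0]);   // PID of the database server process

        long wallStart = System.nanoTime();
        long[] before = cpuTicks(pid);

        // ... run the query here ...

        long[] after = cpuTicks(pid);
        double wall = (System.nanoTime() - wallStart) / 1e9;
        double user = (after[0] - before[0]) / (double) TICKS_PER_SEC;
        double sys  = (after[1] - before[1]) / (double) TICKS_PER_SEC;

        System.out.printf("wall=%.3fs user=%.3fs sys=%.3fs cpu=%.3fs%n",
                wall, user, sys, user + sys);
    }
}

Because this takes a before/after difference rather than sampling, the 1-second interval limitation of pidstat and top does not apply.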
I am trying to find simple examples of what exactly the wait time and execution time are when determining the size of a thread pool. According to Brian Goetz:
For tasks that may wait for I/O to complete -- for example, a task
that reads an HTTP request from a socket -- you will want to increase
the pool size beyond the number of available processors, because not
all threads will be working at all times. Using profiling, you can
estimate the ratio of waiting time (WT) to service time (ST) for a
typical request. If we call this ratio WT/ST, for an N-processor
system, you'll want to have approximately N*(1+WT/ST) threads to keep
the processors fully utilized.
I really didn't understand what he meant by input/output. Who is doing the I/O tasks?
Imagine a task that reads some data from disk. What actually happens:
Open file.
Wait for the (spinning) disk to wake from sleep, position its head at the right spot, and let the desired blocks pass underneath the head, until all bytes arrive in a buffer.
Read from the buffer.
The whole task takes 0.1 s to complete. Of this 0.1 s, 10 percent is spent on steps 1 and 3 and the remaining 90 percent on step 2. So 0.01 s is "working time" (service time) and 0.09 s is "wait time" spent waiting for the disk.
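Plugging those example numbers into the formula from the question: WT/ST = 0.09/0.01 = 9, so on an N-processor machine you would want roughly N * (1 + 9) = 10N threads. A minimal Java sketch of that calculation; the class name and the hard-coded times are just the example values above, not real measurements.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolSizing {
    public static void main(String[] args) {
        int n = Runtime.getRuntime().availableProcessors();

        // From the disk-read example above: 0.01 s of CPU work per task, 0.09 s waiting on the disk.
        double serviceTime = 0.01;   // ST: time actually using the CPU
        double waitTime    = 0.09;   // WT: time blocked waiting for I/O

        // Goetz's rule of thumb: N * (1 + WT/ST)
        int poolSize = (int) (n * (1 + waitTime / serviceTime));
        System.out.println("Suggested pool size: " + poolSize);  // 10 * N

        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        // ... submit the I/O-bound tasks to 'pool' ...
        pool.shutdown();
    }
}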
While working with a Tomcat process on Linux, we observed that the TIME field shows 5506:34 (cumulative CPU time). From what we found while exploring, this is the CPU time the process has spent running over its entire lifetime.
Since this is a Java process, we also observed that memory was almost full and the process needed a restart.
My question is: what exactly is this cumulative CPU time? And why is this specific process taking more CPU time than the other processes?
It is the total time the CPU spends executing a process. If the process uses multiple threads, their CPU times are summed (cumulated).
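For a Java process such as Tomcat, one way to see this cumulation from inside the JVM is ThreadMXBean, which reports CPU time per thread; summing it over the currently live threads gives a rough picture of how the process total builds up. This is only a sketch: it misses threads that have already terminated, so it underestimates the lifetime figure top shows.

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class CumulativeCpu {
    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isThreadCpuTimeSupported()) {
            System.out.println("Per-thread CPU time is not supported on this JVM/OS");
            return;
        }
        long totalNanos = 0;
        for (long id : threads.getAllThreadIds()) {
            long cpu = threads.getThreadCpuTime(id);  // nanoseconds, -1 if unavailable
            if (cpu > 0) {
                totalNanos += cpu;
            }
        }
        // Roughly the per-thread contributions that accumulate into the process's TIME+ column.
        System.out.printf("Cumulative CPU time of live threads: %.2f s%n", totalNanos / 1e9);
    }
}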
The hotspots view (CPU view) seems to show incorrect time units for inherent times. I tried profiling an application that copies a physical file 200 times concurrently. The application completed in 1.2 seconds, while the JProfiler snapshot shows a particular method taking 122 seconds. That seems strange.
Has anyone worked with JProfiler?
This looks OK. JProfiler shows elapsed times, not CPU times. By default, the CPU views cumulate all threads, so with 200 concurrent threads the displayed times can be on the order of 200 times the measurements for a single thread. Here, 122 s spread over 200 threads is about 0.6 s per thread, which fits comfortably inside the 1.2-second wall-clock run.
You can use the thread selector at the top to switch to a single thread; then you will see times that correspond to the total run time.
I am aware that the output of the time command can show a greater time under user than under real in multi-processor cases, but recently, while profiling a program, I saw that real was substantially greater than user + sys.
$ time ./test.o
real 0m5.576s
user 0m1.270s
sys 0m0.540s
Can anybody explain why such a behaviour is caused?
That's normal behavior.
'real' is the wall-clock time. In your example, it literally took 5.576 seconds to run './test.o'.
'user' is the user CPU time: (roughly) the CPU time your process spent executing its own user-space code. This is essentially the time the CPU spent actually executing './test.o': 1.270 seconds.
And finally, 'sys' is the system CPU time: (roughly) the CPU time the kernel spent working on behalf of your process: 0.540 seconds.
If you add sys + user, you get the amount of time your CPU had to spend executing the program.
real - (user + sys) is, then, the time spent not running your program. 3.766 seconds were spent between invocation and termination not running your program--probably waiting for the CPU to finish running other programs, waiting on disk I/O, etc.
Time your process spends sleeping (e.g., waiting for I/O) is not counted by either "user" or "system", but "real" time still elapses.
Try:
time cat
...then wait 10 seconds and hit ctrl-D.
There are at least two possibilities:
The system is busy with other competing processes
The program is sleeping a lot, or doing other operations that cause it to wait, like I/O (waiting for user input, disk I/O, network I/O, etc.); see the sketch below.
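To see the second case in isolation, run time against a program that does nothing but sleep. A hypothetical Sleeper class like the one below should show a real time of roughly 3 seconds (plus JVM startup), while user + sys stays close to zero, because time spent blocked in sleep counts toward real but not toward CPU time.

public class Sleeper {
    public static void main(String[] args) throws InterruptedException {
        // The process is blocked in sleep, consuming essentially no CPU time.
        Thread.sleep(3000);
    }
}

time java Sleeper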