In Linux, we can use "cat /proc/process-id/sched" to get scheduling information; nr_switches, nr_voluntary_switches and nr_involuntary_switches tell us how many times the process has been scheduled. Is there any similar method to get a thread's scheduling times?
thanks in advance!
It's hard to know what you mean by "scheduling times". If you mean kernel/user run ticks then /proc/xxx/stat looks like it has some details about the runtimes.
Under Linux, the threads of a process can be found under /proc/xxx/task/yyy. Each directory corresponds to a thread associated with the parent process.
utime %lu Amount of time that this process has been scheduled in user mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK)). This includes guest time, guest_time (time spent running a virtual CPU, see below), so that applications that are not aware of the guest time field do not lose that time from their calculations.
stime %lu Amount of time that this process has been scheduled in kernel mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK)).
I'd check the proc manpages for a list of the available files.
man proc
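The same stat layout applies per thread in /proc/<pid>/task/<tid>/stat, and on kernels that expose /proc/<pid>/sched the per-thread /proc/<pid>/task/<tid>/sched file carries the same nr_switches counters. Below is a minimal C sketch, not taken from any answer here, that pulls utime and stime for one thread; the field positions follow proc(5), but the function name and the bare-bones error handling are my own simplifications.

#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Returns 0 on success; utime/stime are in clock ticks
 * (divide by sysconf(_SC_CLK_TCK) to get seconds). */
static int thread_cpu_ticks(pid_t pid, pid_t tid,
                            unsigned long *utime, unsigned long *stime)
{
    char path[64];
    snprintf(path, sizeof(path), "/proc/%d/task/%d/stat", (int)pid, (int)tid);

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    char buf[1024];
    if (!fgets(buf, sizeof(buf), f)) { fclose(f); return -1; }
    fclose(f);

    /* comm can contain spaces, so parse from the last ')' onwards. */
    char *p = strrchr(buf, ')');
    if (!p)
        return -1;

    /* Fields after comm: state ppid pgrp session tty_nr tpgid flags
     * minflt cminflt majflt cmajflt utime stime ... (see proc(5)). */
    if (sscanf(p + 2, "%*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %lu %lu",
               utime, stime) != 2)
        return -1;
    return 0;
}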
Related
While working with a Tomcat process on Linux we observed that the TIME field shows 5506:34 (cumulative CPU time). From what we found while exploring, this is the time the process has spent running on the CPU during its entire lifetime.
Since this is a Java process, we also observed that memory was almost full and the process needed a restart.
My question is: what exactly is this cumulative CPU time, and why is this specific process taking more CPU time when there are other processes too?
It is the total time the CPU spends on a process. If the process uses multiple threads, their times are accumulated.
How do I execute a process for n CPU cycles on Linux? I have a batch processing system on a multi-core server and would like to ensure that each task gets exactly the same amount of CPU time. Once the CPU amount is consumed I would like to stop the process. So far I have tried to do something with the utime and stime fields of /proc/pid/stat, but I did not succeed.
I believe it is impossible to give exactly the same number of cycles to several processes (a CPU cycle is often less than a nanosecond). You could, however, execute a process for x CPU seconds. For that, use setrlimit(2) with RLIMIT_CPU.
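As a rough illustration of the RLIMIT_CPU approach, the sketch below forks a child, caps it at 10 CPU-seconds (soft) and 12 (hard), and then execs a placeholder "./task" binary; the limits and the path are made-up values, not anything from the question.

#include <stdio.h>
#include <unistd.h>
#include <sys/resource.h>
#include <sys/wait.h>

int main(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        /* Soft limit: SIGXCPU after 10 CPU-seconds; hard limit: killed at 12. */
        struct rlimit rl = { .rlim_cur = 10, .rlim_max = 12 };
        if (setrlimit(RLIMIT_CPU, &rl) != 0) {
            perror("setrlimit");
            _exit(1);
        }
        execl("./task", "./task", (char *)NULL);   /* placeholder batch task */
        perror("execl");
        _exit(1);
    }
    int status;
    waitpid(pid, &status, 0);
    if (WIFSIGNALED(status))
        printf("task stopped by signal %d\n", WTERMSIG(status));
    return 0;
}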
Your batch processor could also manage time itself; see time(7). You could use timers (see timer_create(2) & timerfd_create(2)), have an event loop around poll(2), and measure time with clock_gettime(2).
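A minimal sketch of that self-managed variant, polling the process's own CPU clock with clock_gettime(2) and CLOCK_PROCESS_CPUTIME_ID and stopping once a budget is spent; the 5-second budget and the commented-out do_work_chunk() are placeholders, not part of any real batch system.

#include <stdio.h>
#include <time.h>

/* CPU time consumed by this process so far, in seconds. */
static double cpu_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
    const double budget = 5.0;        /* CPU-seconds allowed for this task */
    while (cpu_seconds() < budget) {
        /* do_work_chunk();  placeholder: one slice of the batch job */
    }
    printf("CPU budget of %.1f s consumed, stopping\n", budget);
    return 0;
}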
I'm not sure it is useful to write your own batch processing system. You could use an existing one such as batch, slurm or gnqs (see also commercial products like pbsworks, lsf, ...).
I have used this piece of code to try to set the same high priority while executing a program:
CPU_ZERO(&cmask);                     /* clear the set before adding a CPU */
CPU_SET(CPU_NUM, &cmask);
if (pthread_setaffinity_np(pid, sizeof(cmask), &cmask) < 0) {
    LOG_ERROR("Could not set cpu affinity to core %d", CPU_NUM);
    goto exit_err;
}
errno = 0;
setpriority(PRIO_PROCESS, 0, -19);    /* nice -19: near-highest priority, needs privileges */
The purpose of the program is to perform a computation on fixed-size chunks (80 bytes each) of input.
But when executing the program, the time elapsed for this computation varies from 30% to 150%.
When plotting the computation time values, I was expecting a quite smooth graph where the deviation would be something like 10%-15%, but instead there is more than 40%!
So I would like to ask whether the CPU is interleaving the execution of my program with others, and if so, whether I could force the CPU to run ONLY a specific program.
Thanks in advance !
P.S. I haven't found a thread that could answer my question yet...
The most relevant is :) :
Linux reserve a processor for a group of processes (dynamically)
To try to reduce jitter, some of the things you can do are:
Ensure you've turned off CPU frequency scaling.
Set scheduling policy to SCHED_FIFO for that program.
Try and pin your process to a single processor if you have more than one (a sketch of both is shown after this list).
Try and run as few other processes at the same time while you're measuring your program.
Don't trigger sources of time-related non-determinism (e.g. disk I/O).
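A minimal sketch of the SCHED_FIFO and pinning items, using sched_setscheduler(2) and sched_setaffinity(2); the CPU number and real-time priority are arbitrary placeholders, and both calls normally need root or CAP_SYS_NICE.

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    /* Pin the current process to CPU 2 so the scheduler cannot migrate it. */
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(2, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        perror("sched_setaffinity");

    /* Real-time FIFO policy: the task runs until it blocks or a
     * higher-priority real-time task preempts it. */
    struct sched_param sp = { .sched_priority = 50 };
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        perror("sched_setscheduler");

    /* ... run the measured workload here ... */
    return 0;
}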
It is probably useful to skim through How to build a Linux RT application, because accurate measurement is in the same domain - it's possible to be more extreme though:
Ensure your program doesn't use dynamic memory allocations.
Use a realtime Linux kernel.
Prevent Linux from scheduling non-specific userspace programs on a given CPU.
Even disable timer ticks on a given CPU (CONFIG_TASK_ISOLATION).
Modern desktop/server processors are so complicated that trying to precisely measure a single program's execution time with low variance is extremely hard. Things like the various caches and pipeline starting states can perturb execution times in any number of ways, so there are always going to be limits.
We just discovered a peculiar feature of the Linux "top" tool.
The feature is that the CPU time summed over all threads is less than the time displayed for the entire process. This is observed when our application spawns more than 50 threads and runs for several minutes.
So the question is: what is that extra time consumed not by any thread but by the process itself? How is that possible?
As I understand it, the information about process and thread CPU usage is taken from the /proc/<pid>/stat and /proc/<pid>/task/<tid>/stat files. Who fills these files, and why is the time in <pid>/stat not the sum of all <tid>/stat times?
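To reproduce the comparison outside of top, here is a rough sketch of my own (not how top itself does it) that sums utime+stime over the currently listed threads and prints it next to the whole-process value; the field parsing follows proc(5) and the pid is taken from the command line.

#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* utime+stime (clock ticks) read from a /proc .../stat file, 0 on error. */
static unsigned long cpu_ticks(const char *path)
{
    char buf[1024];
    FILE *f = fopen(path, "r");
    if (!f)
        return 0;
    if (!fgets(buf, sizeof(buf), f)) { fclose(f); return 0; }
    fclose(f);

    char *p = strrchr(buf, ')');          /* skip "pid (comm)" safely */
    unsigned long ut = 0, st = 0;
    if (p)
        sscanf(p + 2, "%*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %lu %lu",
               &ut, &st);
    return ut + st;
}

int main(int argc, char **argv)
{
    const char *pid = argc > 1 ? argv[1] : "self";
    char path[256];

    snprintf(path, sizeof(path), "/proc/%s/stat", pid);
    unsigned long process_total = cpu_ticks(path);

    unsigned long thread_sum = 0;
    snprintf(path, sizeof(path), "/proc/%s/task", pid);
    DIR *d = opendir(path);
    struct dirent *e;
    while (d && (e = readdir(d)) != NULL) {
        if (e->d_name[0] == '.')
            continue;
        snprintf(path, sizeof(path), "/proc/%s/task/%s/stat", pid, e->d_name);
        thread_sum += cpu_ticks(path);
    }
    if (d)
        closedir(d);

    printf("process: %lu ticks, sum over live threads: %lu ticks\n",
           process_total, thread_sum);
    return 0;
}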
I am aware that the output of the time command can show greater time under the user section than the real section for multi-processor cases, but recently, I was trying to profile a program when I saw that real was substantially greater than user + sys.
$ time ./test.o
real 0m5.576s
user 0m1.270s
sys 0m0.540s
Can anybody explain why such behaviour occurs?
That's the normal behavior.
"Real" is is the wall-clock time. In your example, it literally took 5.576 seconds to run './test.o'
'user' is the User CPU time, or (roughly) CPU time used by user-space processes. This is essentially the time your CPU spent actually executing './test.o'. 1.270 seconds.
And finally, 'sys' is System CPU time, or (roughly) CPU time used by your kernel. 0.540 seconds.
If you add sys + user, you get the amount of time your CPU had to spend executing the program.
real - (user + sys) is, then, the time spent not running your program. 3.766 seconds were spent between invocation and termination not running your program--probably waiting for the CPU to finish running other programs, waiting on disk I/O, etc.
Time your process spends sleeping (e.g., waiting for I/O) is not counted by either "user" or "system", but "real" time still elapses.
Try:
time cat
...then wait 10 seconds and hit ctrl-D.
There are at least two possibilities:
The system is busy with other competing processes
The program is sleeping a lot, or doing other operations which cause it to wait, like i/o (waiting for user input, disk i/o, network i/o, etc.)
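A toy example, if you want to see the effect directly: the program below sleeps for 3 seconds and then burns roughly 1 second of CPU. Run it under time and real should come out near 4 s while user + sys stays near 1 s, because the sleeping counts toward neither.

#include <time.h>
#include <unistd.h>

int main(void)
{
    sleep(3);                             /* waiting: shows up in "real" only */

    /* Busy-loop until about 1 second of CPU time has been accumulated. */
    struct timespec ts;
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts);
    time_t start = ts.tv_sec;
    volatile unsigned long x = 0;
    do {
        x++;
        clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts);
    } while (ts.tv_sec - start < 1);
    return 0;
}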