Is the UNIX `time` command accurate enough for benchmarks? [closed] - linux

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
Let's say I wanted to benchmark two programs: foo.py and bar.py.
Are a couple thousand runs, averaging the results of `time python foo.py` and `time python bar.py`, adequate for profiling and comparing their speed?
Edit: Additionally, if the execution of each program was sub-second (assume it wasn't for the above), would time still be okay to use?

time produces good enough results for benchmarks that run for over one second; for shorter runs, the time spent exec()ing the process may be large compared to the run time itself.
However, when benchmarking you should watch out for context switching: another process may contend with your benchmark for the CPU and increase its run time. To avoid contention with other processes, run the benchmark like this:
sudo chrt -f 99 /usr/bin/time --verbose <benchmark>
Or
sudo chrt -f 99 perf stat -ddd <benchmark>
sudo chrt -f 99 runs your benchmark in the FIFO real-time class with priority 99, which makes it the top-priority process and avoids context switching (you can edit /etc/security/limits.conf so that using real-time priorities does not require a privileged process).
It also makes time report all the available stats, including the number of context switches your benchmark incurred, which should normally be 0; otherwise you may want to rerun the benchmark.
perf stat -ddd is even more informative than /usr/bin/time and displays such information as instructions-per-cycle, branch and cache misses, etc.
It is also better to disable CPU frequency scaling and turbo boost, so that the CPU frequency stays constant during the benchmark and the results are consistent.
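The real-time scheduling trick can also be applied from inside the program itself. A minimal Python sketch using os.sched_setscheduler (Linux-only; it needs root or the CAP_SYS_NICE capability, so the sketch falls back gracefully when unprivileged; the helper name is just an example):

```python
import os

def try_fifo_priority(priority=99):
    """Try to move the current process into the SCHED_FIFO real-time
    class, like `chrt -f 99` does. Returns True on success, False when
    the process lacks root/CAP_SYS_NICE."""
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
        return True
    except PermissionError:
        return False

print("got real-time priority:", try_fifo_priority())
```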

Nowadays, imo, there is no reason to use time for benchmarking purposes. Use perf stat instead. It gives you much more useful information, can repeat the benchmarking process any given number of times, and does statistics on the results, i.e. it calculates the mean value and the variance. It is much more reliable and just as simple to use as time:
perf stat -r 10 -d <your app and arguments>
The -r 10 will run your app 10 times and report statistics over the runs. -d outputs some more data, such as cache misses.
So while time might be reliable enough for long-running applications, it definitely is not as reliable as perf stat. Use that instead.
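If perf is not available, the repeat-and-aggregate idea behind perf stat -r is easy to approximate by hand. A rough Python sketch (wall-clock only, no hardware counters; the function name and the example command are just illustrations):

```python
import statistics
import subprocess
import time

def bench(cmd, runs=10):
    """Run cmd `runs` times and return (mean, stdev) of its wall-clock
    time in seconds; similar in spirit to `perf stat -r`, but without
    the hardware counters perf reads."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)

mean, stdev = bench(["true"], runs=5)
print(f"{mean:.6f} +- {stdev:.6f} s")
```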
Addendum: if you really want to keep using time, at least don't use the bash builtin, but the real binary in verbose mode:
/usr/bin/time -v <some command with arguments>
The output then looks like this:
Command being timed: "ls"
User time (seconds): 0.00
System time (seconds): 0.00
Percent of CPU this job got: 0%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.00
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 1968
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 93
Voluntary context switches: 1
Involuntary context switches: 2
Swaps: 0
File system inputs: 8
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
Note especially that this measures the peak RSS, which is often all you need to compare a patch's effect on peak memory consumption: compare the value before and after, and a significant decrease in the RSS peak means you did something right.
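The same peak-RSS figure is also available from inside a process. In Python, for instance, the standard resource module exposes it (on Linux, ru_maxrss is in kilobytes; on macOS it is in bytes):

```python
import resource

# Touch ~10 MiB so there is something to measure, then read the peak
# resident set size of the current process; on Linux, ru_maxrss is in
# kilobytes.
data = bytearray(10 * 1024 * 1024)
peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print(f"peak RSS: {peak_kb} kB")
```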

Yes, time is accurate enough, and you'll only need to run your programs a dozen times (provided each run lasts more than a second, or at least a significant fraction of a second, i.e. more than 200 milliseconds). Of course, the file system will be hot (i.e. small files will already be cached in RAM) for most runs except the first, so take that into account.
The reason you want the timed run to last at least a few tenths of a second is the accuracy and granularity of the time measurement. Don't expect better than a hundredth of a second of accuracy (you would need a special kernel configuration to get one millisecond).
From inside the application, you could use clock, clock_gettime, gettimeofday, getrusage, times (they surely have Python equivalents).
Don't forget to read the time(7) man page.
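In Python, the usual equivalents of those calls live in the time and resource modules. A small sketch measuring some arbitrary work:

```python
import resource
import time

start_wall = time.perf_counter()  # high-resolution wall clock (clock_gettime)
start_cpu = time.process_time()   # CPU time of this process (like clock())

total = sum(i * i for i in range(1_000_000))  # some work to measure

wall = time.perf_counter() - start_wall
cpu = time.process_time() - start_cpu
usage = resource.getrusage(resource.RUSAGE_SELF)  # like getrusage(2)
print(f"wall {wall:.4f}s, cpu {cpu:.4f}s, user {usage.ru_utime:.4f}s")
```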

Yes. The time command gives both elapsed time as well as consumed CPU. The latter is probably what you should focus on, unless you're doing a lot of I/O. If elapsed time is important, make sure the system doesn't have other significant activity while running your test.

Related

How can a daemon stay active while using no memory?

I came up with this question while using the ps aux command.
Here I can see that a few processes are at 0% CPU, 0% MEM, 0 VSZ and 0 RSS.
If a daemon is not using any memory, why and how could it be displayed in the first place? I kind of understand that 0% CPU usage means the process is not currently in use, but wouldn't 0% MEM mean no process at all?
I wanted to check if this was somehow systems daemon specific so I made a simple C program with an infinite loop, without any variables.
int main(void)
{
    while (1) {}
}
This time VSZ and RSS have actual values, while MEM stays at 0%.
What is happening here ?
%MEM is probably not fully documented on your system. The AIX manual for the ps command says:
%MEM
Calculated as the sum of the number of working segment and code
segment pages in memory times 4 (that is, the RSS value), divided by
the size of the real memory in use, in the machine in KB, times 100,
rounded to the nearest full percentage point. This value attempts to
convey the percentage of real memory being used by the process.
Unfortunately, like RSS, it tends to exaggerate the cost of a process
that is sharing program text with other processes. Further, the
rounding to the nearest percentage point causes all of the processes
in the system that have RSS values under 0.005 times real memory size
to have a %MEM of 0.0.
As you might have suspected from examining the output, some rounding has been applied: if the value is too low, 0.0 is printed.
Also, this measures a percentage of real memory usage, which means it doesn't reflect the size of the process but only the part of the process that is actually mapped into real memory.
In your first case, 0.0 for CPU just means that the process exists but is doing practically nothing; it is probably in a waiting state (or consuming a very small percentage of the processing power), not that it "is not currently in use". In your second case, your process is active, in fact very busy (this is what the 97.7% reflects), but what it does is useless (an infinite loop doing nothing).
To understand all of this, you may read about process state, process scheduling and virtual memory.
While Jean-Baptiste's answer is correct as far as it goes, I believe it's more significant here that all of the processes you're noting, with 0 in all three memory fields, are kernel threads. Their memory is all kernel memory, and doesn't show up in top or ps. You can tell a process is a kernel thread on Linux both by its command being wrapped in brackets and by it consuming no memory in the VSZ column. (That's the column that represents basically everything that could be considered the process's memory; it's only 0 for kernel threads, and only because they don't properly report their memory.)
Also note that with a start time in 2018 and having consumed no more than 1 minute 41 seconds, none of those jobs are really very active.
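Kernel threads can also be spotted programmatically: they have no user-space image, so their /proc/[pid]/cmdline is empty. A Linux-only Python sketch (note that zombie processes also show an empty cmdline, so this is only a heuristic):

```python
import os

def is_kernel_thread(pid):
    """Kernel threads have no user-space image, so their
    /proc/[pid]/cmdline is empty (zombies look the same, so treat
    this as a heuristic)."""
    try:
        with open(f"/proc/{pid}/cmdline", "rb") as f:
            return f.read() == b""
    except FileNotFoundError:
        return False  # process already exited

kthreads = [int(p) for p in os.listdir("/proc")
            if p.isdigit() and is_kernel_thread(int(p))]
print(f"{len(kthreads)} kernel threads visible")
```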

Does load average affect performance?

For example, first I run a benchmark program when the load average is 0.00.
Then I run some CPU-consuming task to push the load average to 10.00, and kill it.
Now the CPU usage is 0 but the load average is 10.00. If I run the benchmark program again, will the load average affect the result?
No, but that doesn't mean your benchmark will run the same.
The answer to your question is no. The load average is a reported value. It is meant to give you an idea of the state of the system, averaged over several periods of time. Since it is averaged, it takes time for it to go back to 0 after a heavy load was placed on the system.
This is just a report, however. Your system isn't really loaded, and the CPU isn't currently taken. A new benchmark you'll run is unaffected by the system's state 5 minutes ago.
That said, what is true for the CPU may not be true for memory. If your load generator used a lot of memory, the kernel might have pushed less-used memory into swap, and it will also have reduced the amount of memory available for the file cache. Depending on your benchmark, that might affect its performance.
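Since the load average is just a reported value, it is easy to read programmatically if you want your benchmark harness to wait for a quiet system, e.g. in Python:

```python
import os

# The three values are the 1-, 5- and 15-minute load averages: a moving
# average of runnable (and, on Linux, uninterruptible) tasks. They
# describe recent history, not the current instant.
one, five, fifteen = os.getloadavg()
print(f"load average: {one:.2f} {five:.2f} {fifteen:.2f}")
```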

How to measure if a program was run in parallel over multiple cores in Linux?

I want to know if my program was run in parallel over multiple cores. I can get the perf tool to report how many cores were used in the computation, but not if they were used at the same time (in parallel).
How can this be done?
You can try running top in another terminal while the program is running. It will show the usage of all the cores on your machine.
A few possible solutions:
Use htop on another terminal as your program is being executed. htop shows the load on each CPU separately, so on an otherwise idle system you'd be able to tell if more than one core is involved in executing your program.
It is also able to show each thread separately, and the overall CPU usage of a program is aggregated, which means that parallel programs will often show CPU usage percentages over 100%.
Execute your program using the time command or shell builtin. For example, under bash on my system:
$ dd if=/dev/zero bs=1M count=100 2>/dev/null | time -p xz -T0 > /dev/null
real 0.85
user 2.74
sys 0.14
It is obvious that the total CPU time (user+sys) is significantly higher than the elapsed wall-clock time (real). That indicates the parallel use of multiple cores. Keep in mind, however, that a program that is either inefficient or I/O-bound could have a low overall CPU usage despite using multiple cores at the same time.
Use top and monitor the CPU usage percentage. This method is even less specific than time and has the same weakness regarding parallel programs that do not make full use of the available processing power.
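The user+sys versus real comparison can also be automated around a child process. A Python sketch using os.times(), which includes the CPU time of waited-for children (the helper name and example command are just illustrations):

```python
import os
import subprocess
import time

def parallelism_estimate(cmd):
    """Run cmd and return its total CPU time divided by wall-clock
    time; a ratio well above 1 means several cores were busy at once."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    wall = time.perf_counter() - start
    after = os.times()
    # children_user/children_system accumulate once the child is reaped.
    cpu = (after.children_user - before.children_user +
           after.children_system - before.children_system)
    return cpu / wall

print(f"ratio: {parallelism_estimate(['true']):.2f}")
```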

How does proc stats work

I have done a lot of reading and testing of the proc directory in OS's using the Linux kernel. I have been using Linux myself for many years now, but I needed to get into more details for a small private project. Specifically how the stat files work. I knew the basics, but not enough to create actual calculations with the data in them.
The problem is that the files in proc do not seem to contain what they should, not according to what I have read versus my tests.
For example: the cpu line in the root stat file should contain the total uptime of the CPU multiplied by the number of cores (and/or CPUs), in jiffies. So to get the system uptime, you would add each number in the row together, divide by the number of cores/CPUs, and divide again by whatever a jiffy is defined to be on that particular system. At least this is the formula that I keep finding when searching this subject. If this were true, the result should be equal to the first number in /proc/uptime, but this is not the case, and I have tested this on several machines with different numbers of cores, both 32-bit and 64-bit systems. I can never get these two to match up.
Also the stat file for each pid has an uptime part (field 21, I think). But I cannot figure out what this number should be matched against to calculate a process uptime in seconds. From what I have read, it should contain the total CPU jiffies as they were when the process was started. So if this is true, one would simply subtract it from the current total CPU jiffies and divide by whatever a jiffy is on that system? But again, I cannot seem to get this to add up to reality.
Then there is the problem of finding out what a jiffy is. I found a formula where /proc/stat was used together with /proc/uptime, with some dividing by the number of cores/CPUs, to get that number. But this does not work, and I would not expect it to when the values of those two files do not add up, as mentioned in my first problem above. I did however come up with a different approach: simply reading the first line of /proc/stat twice within a second. Then I could compare and see how many jiffies the system had added in that second, and divide that by the number of cores. This works on normal Linux systems, but it fails on Android in most cases. Android is constantly attaching/detaching cores depending on need, which means the divisor differs between reads. It is no problem as long as the core count matches across both reads, but if a core goes active during the second read, it does not work.
And last, I do not quite get the part about dividing by the number of cores. If each core writes all of its work time and idle time to the total line in /proc/stat, then it would make sense, as that line would actually contain the total uptime times the number of cores. But if this were true, each of the cpu lines would add up to the same number, and they don't. This means that dividing by the number of cores should produce an incorrect result. But that would also mean that CPU monitoring tools are making calculation errors, as they all seem to use this method.
Example:
/proc/stat
cpu 20455737 116285 4584497 104527701 1388173 366 102373 0 0 0
cpu0 4833292 5490 1413887 91023934 1264884 358 94250 0 0 0
cpu1 5785289 47944 1278053 4439797 45015 1 4235 0 0 0
cpu2 4748431 20922 926839 4552724 33455 2 2745 0 0 0
cpu3 5088724 41928 965717 4511246 44819 3 1141 0 0 0
The lines cpu0, cpu1, cpu2 and cpu3 do not add up to the same total result. This means that taking the total of the general cpu line and dividing by 4 should be incorrect.
/proc/uptime
1503361.21 3706840.53
All of the above output was taken from a system that should be using clock ticks of 100. Now if you take the total of the general cpu line, divide it by 100 and then by 4 (the number of cores), you will not get the value in the uptime file.
And if you take the total of the general cpu line, divide it by the uptime from /proc/uptime and then by 4, you will not get the 100 that is this kernel's clock tick rate.
So why is nothing adding up as it should? How do I get the clock ticks of a kernel, even on systems that attach/detach cores constantly? How do I get the total real uptime of a process? How do I get the real uptime from /proc/stat?
(This answer is based on the 4.0 kernel.)
The first number on each line of /proc/stat is the total time which each CPU has spent executing non-"nice" tasks in user mode. The second is the total time spent executing "nice" tasks in user mode.
Naturally, there will be random variations across CPUs. For example, the processes running on one CPU may make more syscalls, or slower syscalls, than those running on another CPU. Or one CPU may happen to run more "nice" tasks than another.
The first number in /proc/uptime is the "monotonic boot time" -- the amount of time which has passed since the system was last booted (including time which passed while the system was suspended). The second number is the total amount of time which all CPUs have spent idling.
There is also a task-specific stat file for each PID in the corresponding subdirectory under /proc. This one starts with a PID number, a name in brackets, and a state code (represented by a character). The 19th number after that is the start time of the process, in ticks.
All of this information is not very hard to find simply by browsing the Linux source code. I recommend you clone a local copy of Linus' repo and use grep to find the details you need. As a tip, the process-specific files in /proc are implemented in fs/proc/base.c. The /proc/stat file which you asked about is implemented in fs/proc/stat.c. proc/uptime is implemented in fs/proc/uptime.c.
man proc is also a good source of information.
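Putting the pieces together in Python: the kernel tick rate comes from sysconf(SC_CLK_TCK), and a process's uptime follows from its start time (field 22 of /proc/[pid]/stat, in ticks) and the first number of /proc/uptime. A Linux-only sketch:

```python
import os

HZ = os.sysconf("SC_CLK_TCK")  # kernel clock ticks per second, usually 100

def process_uptime(pid):
    """Seconds since the process started: system uptime minus the
    process start time (field 22 of /proc/[pid]/stat, in ticks)."""
    with open("/proc/uptime") as f:
        system_uptime = float(f.read().split()[0])
    with open(f"/proc/{pid}/stat") as f:
        stat = f.read()
    # The command name in parentheses may itself contain spaces, so
    # split only after the last closing parenthesis; field 22
    # (starttime) is then at index 19 of the remaining fields.
    starttime_ticks = int(stat.rpartition(")")[2].split()[19])
    return system_uptime - starttime_ticks / HZ

print(f"HZ={HZ}, this process started {process_uptime(os.getpid()):.2f}s ago")
```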

How can I determine max memory usage of a process in Linux?

I have a program that's running in two different modes. I want to compare the two modes with regard to runtime and memory requirements. Determining runtime is easy with using time. In fact, in this case it's really easy because the program reports both the CPU time and the wallclock time at the end of the test. However, determining memory usage is a bit harder.
How can I get details of the memory usage of the process throughout its lifetime? I want to know both the maximum usage and the average. In fact, ideally I'd like some graph of memory usage throughout the life of the run.
time has a verbose mode which gives you the maximum and average resident set size.
(The resident set size is the portion of a process's memory that is held in RAM).
$ /usr/bin/time -v command_that_needs_to_measured |& grep resident
Maximum resident set size (kbytes): 6596
Average resident set size (kbytes): 0
Remember to use the binary /usr/bin/time, which has a -v option. You can view its documentation by running man time. If you fail to specify its path, bash's built-in time will run instead, which doesn't have a -v option. You can view its documentation in the bash man page or by running help time.
Valgrind's massif tool can give you a chart of memory usage over time. See http://valgrind.org/docs/manual/ms-manual.html
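If you want a rough graph rather than just the peak, you can also sample the RSS over the process's lifetime yourself. A Linux-only Python sketch that polls VmRSS in /proc/[pid]/status (a tool like massif is far more precise; this merely samples at an interval):

```python
import subprocess
import sys
import time

def sample_rss(cmd, interval=0.05):
    """Launch cmd and record its resident set size (VmRSS, in kB)
    every `interval` seconds until it exits."""
    samples = []
    proc = subprocess.Popen(cmd)
    while proc.poll() is None:
        try:
            with open(f"/proc/{proc.pid}/status") as f:
                for line in f:
                    if line.startswith("VmRSS:"):
                        samples.append(int(line.split()[1]))
        except FileNotFoundError:
            break  # process exited between poll() and open()
        time.sleep(interval)
    return samples

rss = sample_rss([sys.executable, "-c", "import time; time.sleep(0.3)"])
print(f"{len(rss)} samples, peak {max(rss) if rss else 0} kB")
```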
