I want to get execution time from task_struct - linux

In a kernel module I am accessing the task_struct and returning stime + utime.
I want to convert the result to milliseconds. In what units are stime and utime stored in task_struct?
I can also read them from /proc/[pid]/stat. Are the units there different?

Advanced Unix Programming by Marc Rochkind covers this topic to some degree (page 55-ish, if I remember correctly). Pardon me if I paraphrase what he states better.
utime represents user time: the CPU time spent executing the process's own instructions. It is CPU time only and doesn't include time spent waiting to run.
stime is the CPU time spent executing system calls on behalf of the process.
The units are clock ticks.
The number of clock ticks per second can be determined with the sysconf function, as sysconf(_SC_CLK_TCK).
I hope this helps.

Given a task_struct, the total on-CPU time in nanoseconds is stored in task->se.sum_exec_runtime.
This does not appear to add up to task->utime + task->stime unless those are first adjusted via task_cputime_adjusted(task, &utime, &stime).
For anyone else trying to implement this kind of functionality, I highly recommend reading proc(5). In this case, searching for utime leads you to /proc/[pid]/stat (or /proc/self/stat), which provides utime in its 14th column. The implementation can be found in fs/proc/array.c, which is where I found the call to task_cputime_adjusted. Verify the module's output against, e.g., awk '{ print $14 }' /proc/[pid]/stat for utime.


What is meaning of timestamp in perf?

I'd like to use 'perf' to measure real execution time of a function. 'perf script' command gives timestamp when the function is called.
Xorg 1523 [001] 25712.423702: probe:sock_write_iter: (ffffffff95cd8b80)
The timestamp field's format is X.Y. How can I understand this value? Is it X.Y seconds?
X.Y is the timestamp in units of seconds.microseconds.
How this value is displayed can be looked at here. You can pass the switch --ns to perf script to display the timestamps in seconds.nanoseconds format too.
To understand this value, you need to understand how the perf module calculates timestamps. Each event can be associated with a different clock function to compute its timestamps. By default, perf uses the sched_clock function to compute timestamps for an event, more details here.
event->clock = &local_clock;
But you can use the -k switch along with perf record command to associate an event with various clockids.
-k, --clockid
Sets the clock id to use for the various time fields in the
perf_event_type records. See clock_gettime(). In particular
CLOCK_MONOTONIC and CLOCK_MONOTONIC_RAW are supported, some
events might also allow CLOCK_BOOTTIME, CLOCK_REALTIME and
CLOCK_TAI.
Adding the -k switch to the perf record command enables various clock functions depending on which clockid you use, as can be seen here.
The sched_clock function returns the number of nanoseconds since the system was started. A particular architecture may or may not provide its own implementation of sched_clock(). If no local implementation is provided, the system jiffy counter is used as sched_clock().
Note that all of the above code snippets are from Linux kernel 5.6.7.
X.Y is the raw format of the date. perf's time function operates in nanoseconds, so the value needs to be converted to a human-readable format. Use this website to convert it to a date, http://www.timestamp.fr/ , or use bash:
date -d @25712

How do I know the last sched time of a process

I have currently run into an issue where a process seems stuck somehow: it just doesn't get scheduled, and its status is always 'S'. I have monitored the sched_switch_task trace through debugfs for a while and didn't see the process get scheduled. So I would like to know when the kernel last scheduled this process.
Thanks a lot.
It might be possible using the info in /proc/pid#/sched file.
In there you can find these parameters (depending on the OS version, mine is opensuse 3.16.7-21-desktop):
se.exec_start : 593336938.868448
...
se.statistics.wait_start : 0.000000
se.statistics.sleep_start : 593336938.868448
se.statistics.block_start : 0.000000
The values represent timestamps relative to the system boot time, but in a unit which may depend on your system (in my example the unit is 0.5 msec, for a total value of ~6 days 20 hours and change).
Of the last 3 parameters listed above, at most one appears to be non-zero at any time, and I suspect that the non-zero value represents the time when the process last entered the corresponding state (with the process actively running when all are zero).
So if your process is indeed stuck, the non-zero value would have been recorded when it got stuck.
Note: this is mostly based on observations and assumptions. I didn't find these parameters documented anywhere, so take them with a grain of salt.
There is plenty of other scheduling info in that file, but it is mostly stats and, without documentation, difficult to use.

Importance of do_fast_gettimeoffset( ) in linux

I was reading the book "Understanding the Linux Kernel", and it says that the "number of microseconds is calculated by do_fast_gettimeoffset( )". It also says that this is "to count the number of microseconds that have elapsed within the current second."
I couldn't understand what the author means by the last sentence. Could anyone explain more on that?
If you want to understand the linux kernel, you should be aware that that book has been outdated for a long time and that do_fast_gettimeoffset no longer exists.
do_get_fast_time returns the number of seconds, and is always fast.
do_gettimeoffset returns the number of microseconds since the start of the second, and might be slow.

Starting point for CLOCK_MONOTONIC

As I understand it, on Linux the starting point for CLOCK_MONOTONIC is boot time. In my current work I prefer to use the monotonic clock instead of CLOCK_REALTIME (for calculations), but at the same time I need to provide human-friendly timestamps (with year/month/day) in reporting. They don't have to be very precise, so I was thinking of combining the monotonic counter with the boot time.
Where can I get this time on a Linux system using API calls?
Assuming the Linux kernel starts the uptime counter at the same time as it starts keeping track of the monotonic clock, you can derive the boot time (relative to the Epoch) by subtracting uptime from the current time.
Linux offers the system uptime in seconds via the sysinfo structure; the current time in seconds since the Epoch can be acquired on POSIX compliant libraries via the time function.
#include <stddef.h>
#include <stdio.h>
#include <time.h>
#include <sys/sysinfo.h>

int main(void) {
    /* get uptime in seconds */
    struct sysinfo info;
    sysinfo(&info);

    /* calculate boot time in seconds since the Epoch */
    const time_t boottime = time(NULL) - info.uptime;

    /* get monotonic clock time */
    struct timespec monotime;
    clock_gettime(CLOCK_MONOTONIC, &monotime);

    /* calculate current time in seconds since the Epoch */
    time_t curtime = boottime + monotime.tv_sec;

    /* get realtime clock time for comparison */
    struct timespec realtime;
    clock_gettime(CLOCK_REALTIME, &realtime);

    printf("Boot time = %s", ctime(&boottime));
    printf("Current time = %s", ctime(&curtime));
    printf("Real Time = %s", ctime(&realtime.tv_sec));
    return 0;
}
Unfortunately, the monotonic clock may not match up relative to boot time exactly. When I tested out the above code on my machine, the monotonic clock was a second off from the system uptime. However, you can still use the monotonic clock as long as you take the respective offset into account.
Portability note: although Linux may return current monotonic time relative to boot time, POSIX machines in general are permitted to return current monotonic time from any arbitrary -- yet consistent -- point in time (often the Epoch).
As a side note, you may not need to derive boot time as I did. I suspect there is a way to get the boot time via the Linux API, as there are many Linux utilities which display the boot time in a human-readable format. For example:
$ who -b
system boot 2013-06-21 12:56
I wasn't able to find such a call, but inspection of the source code for some of these common utilities may reveal how they determine the human-readable boot time.
In the case of the who utility, I suspect it utilizes the utmp file to acquire the system boot time.
http://www.kernel.org/doc/man-pages/online/pages/man2/clock_getres.2.html:
CLOCK_MONOTONIC
Clock that cannot be set and represents monotonic time since some
unspecified starting point.
This means that you can use CLOCK_MONOTONIC for interval calculations and other things, but you can't really convert it to a human-readable representation.
Moreover, you probably want CLOCK_MONOTONIC_RAW instead of CLOCK_MONOTONIC:
CLOCK_MONOTONIC_RAW (since Linux 2.6.28; Linux-specific)
Similar to CLOCK_MONOTONIC, but provides access to a raw hardware-based
time that is not subject to NTP adjustments.
Keep using CLOCK_REALTIME for human-readable times.
CLOCK_MONOTONIC is not affected by discontinuous jumps in the system time. For example, if an administrator sets the system clock manually, CLOCK_MONOTONIC has no way of knowing (nor does it need to). It is, however, still subject to the gradual rate adjustments made by adjtime() and NTP.
For this reason, don't use CLOCK_MONOTONIC if you need human-readable timestamps.
See Difference between CLOCK_REALTIME and CLOCK_MONOTONIC? for a discussion.

printk's in under a nanosecond? - getnstimeofday() questions

I'm doing module programming. I want to measure the performance impact of some printk's I'm doing. I have a setup in code like this.
In "declare-y" beginning part of the code:
struct timespec ts_start,ts_end,test_of_time;
In a method:
{
    //..other stuff
    getnstimeofday(&ts_start);
    printk("mkdir being hijacked\n");
    printk("pid is %d ", current->pid);
    printk("call #: 39 \n");
    printk("user_id of process: %d, effuid: %d\n\n", current->uid, current->euid);
    getnstimeofday(&ts_end);
    test_of_time = timespec_sub(ts_end,ts_start);
    printk("%lu", test_of_time.tv_nsec);
    return val;
}
I run dmesg and strangely see the value 0. I highly doubt it took 0 nanoseconds for this to happen. What is amiss here?
Thanks
What version of the kernel are you using? You likely don't actually have nanosecond resolution on your timer. If you want to measure the time the printks take, you should run them in a loop so that they take a finite and measurable amount of time. It won't be completely accurate (e.g. the first printk will likely be slower than subsequent ones due to cache misses, etc.), but that should give you a ballpark idea.
If you want to see why this happens, try allocating a big buffer, spinning in a loop writing the values of getnstimeofday into the buffer for a while, and then outputting the buffer to somewhere you can analyze it. You'll probably be able to see the actual clock resolution in the data.
