Linux clock_gettime(CLOCK_MONOTONIC) strange non-monotonic behavior - linux

Folks, in my application I'm using clock_gettime(CLOCK_MONOTONIC) to measure the delta time between frames (a typical approach in gamedev), and from time to time I see strange behavior from clock_gettime(): the returned values are occasionally not monotonic (i.e., the previous time is bigger than the current time).
Currently, if such a paradox happens, I simply skip the current frame and start processing the next one.
The question is: how can this be possible at all? Is it a bug in the Linux POSIX implementation of clock_gettime()? I'm using Ubuntu Server Edition 10.04 (kernel 2.6.32-24, x86_64), gcc-4.4.3.

man clock_gettime says:
CLOCK_MONOTONIC_RAW (since Linux 2.6.28; Linux-specific)
Similar to CLOCK_MONOTONIC, but provides access to a raw hardware-based time that is not subject to NTP adjustments.
Since CLOCK_MONOTONIC_RAW is not subject to NTP adjustments, I guess CLOCK_MONOTONIC could be.
We had similar problems with Red Hat Enterprise Linux 5.0 with the 2.6.18 kernel and one specific Itanium processor. We couldn't reproduce it with other processors on the same OS. It was fixed in RHEL 5.3 with a slightly newer kernel and some Red Hat patches.

Looks like an instance of
commit 0696b711e4be45fa104c12329f617beb29c03f78
Author: Lin Ming <ming.m.lin#intel.com>
Date: Tue Nov 17 13:49:50 2009 +0800
timekeeping: Fix clock_gettime vsyscall time warp
Since commit 0a544198 "timekeeping: Move NTP adjusted clock
multiplier to struct timekeeper" the clock multiplier of vsyscall is updated with
the unmodified clock multiplier of the clock source and not with the
NTP adjusted multiplier of the timekeeper.
This causes user space observable time warps:
new CLOCK-warp maximum: 120 nsecs, 00000025c337c537 -> 00000025c337c4bf
See here for a patch. This was included in 2.6.32.19, but may not have been backported by the Debian team(?). You should check it out.

Try CLOCK_MONOTONIC_RAW.

Sure sounds like a bug to me. Perhaps you should report it in Ubuntu's bug tracker.

It's a Linux bug. No adjustment to a monotonic clock can make it go backwards. You're using a very old kernel and a very old distribution.
Edit: are you sure you need to skip the frame? If you call clock_gettime() again, what happens?

Related

Is clock_nanosleep affected by adjtime and NTP?

Usually CLOCK_MONOTONIC_RAW is used for obtaining a clock that is not affected by NTP or adjtime(). However, clock_nanosleep() doesn't support CLOCK_MONOTONIC_RAW, and trying to use it anyway will fail with return code 95, Operation not supported (kernel 4.6.0).
Does clock_nanosleep() somehow take these clock adjustments into account or will the sleep time be affected by it?
What are the alternatives if a sleeping time is required which should not be affected by clock adjustments?
CLOCK_MONOTONic_RAW has never had support in clock_nanosleep() since it was introduced in Linux 2.6.28. It was also explicitly fixed to not have this support in 2.6.32 because of oopses. The code has been refactored several times since then, but there is still no support for CLOCK_MONOTONIC_RAW in clock_nanosleep(), and I wasn't able to find any comments on why that is.
At the very minimum, the fact that there was a patch that explicitly disabled this functionality and it passed all reviews tells us that it doesn't look like a big problem for kernel developers. So, at the moment (4.7) the only things CLOCK_MONOTONIC_RAW supports are clock_getres() and clock_gettime().
Speaking of adjustments, as already noted by Rich, CLOCK_MONOTONIC is subject to rate adjustments simply by the nature of this clock. This happens because hrtimer_interrupt() runs its queues with an adjusted monotonic time value (ktime_get_update_offsets_now() -> timekeeping_get_ns() -> timekeeping_delta_to_ns(), which operates on xtime_nsec, which is subject to adjustment). Actually, looking at this code, I'm probably no longer surprised that CLOCK_MONOTONIC_RAW has no support in clock_nanosleep() (and probably won't in the future): the adjusted monotonic clock seems to be the basis for hrtimers.
As for alternatives, I think there are none. nanosleep() uses the same CLOCK_MONOTONIC, setitimer() has its own set of timers, alarm() uses ITIMER_REAL (same as setitimer()), that (with some indirection) is also our good old friend CLOCK_MONOTONIC. What else do we have? I guess nothing.
As an unrelated side note, there is an interesting observation in that if you call clock_nanosleep() for relative interval (that is not TIMER_ABSTIME) then CLOCK_REALTIME actually becomes a synonym for CLOCK_MONOTONIC.

What is the plan to upgrade time_t for 32-bit Linux?

On 32-bit Linux, time_t is currently 32 bits. It will overflow in less than 25 years (within mortgage terms), and Linux is being used embedded in devices with lifetimes longer than 10 years (cars). Is there an upgrade plan for this platform?
There is no "set" time or time frame by which all Linux kernels will be using a 64-bit time_t. In fact, the general consensus right now is that it will not change anytime soon. No one is really that worried about it yet; just like Y2K, it will cause problems in code that relies on the current width of time_t.
A few operating systems use a workaround: a wrapper that lets time_t be either a 32-bit or a 64-bit integer.
Others have simply forcibly upgraded time_t to a 64-bit integer.
For more information please refer to this link:
https://en.wikipedia.org/wiki/Year_2038_problem
There were some good articles about it (specifically syscalls) on LWN. Have a look at System call conversion for year 2038

Linux: does the starttime field in /proc/[pid]/stat use HZ (jiffies) or USER_HZ (_SC_CLK_TCK)?

I want to measure the running duration of a process from outside that process on Linux. I found that /proc/[pid]/stat has a field named starttime, described on the man page as "The time in jiffies the process started after system boot".
Also, /proc/uptime provides the elapsed time ET in seconds since system boot. Theoretically I can derive the running time from these two files by
running time = ET - starttime / (jiffies per second)
As for jiffies, I thought it referred to the kernel's CONFIG_HZ (250 on Ubuntu 12.04) rather than USER_HZ (100 on Ubuntu 12.04, obtained via "getconf CLK_TCK"), as described in http://www.makelinux.net/books/lkd2/ch10lev1sec3. However, I tested it and found that starttime in fact uses USER_HZ on Ubuntu 12.04. This confused me. Could someone explain it to me? Thanks a lot!
Your man page was probably out-of-date at the time you retrieved it. Here's a more current page which states the following:
(22) starttime %llu
The time the process started after system boot. In
kernels before Linux 2.6, this value was expressed
in jiffies. Since Linux 2.6, the value is expressed
in clock ticks (divide by sysconf(_SC_CLK_TCK)).
In older kernels (before Linux 2.6), the time really was represented in kernel jiffies. However, this behavior changed to now provide time in clock ticks -- jiffies scaled via the USER_HZ constant as you expect.

How to improve real-time performance of 1ms timer in Linux?

I'm working on an embedded Linux project, using an Arago distribution with a kernel that is probably around version 3.3.
I have configured a high-resolution Linux timer to wake-up my process once per millisecond. This works ok but there are two issues with the timing:
A jitter in the wake-up time
Variability of the processing time when awake, despite the fact that the processing done by the process is constant.
I attribute these problems to Linux's less-than-real-time scheduling behavior, and I need to investigate ways of improving the real-time performance.
I have checked that the kernel is configured with the CONFIG_PREEMPT kernel option, which is good for real-time.
I have also applied the SCHED_FIFO scheduling class to my process:
struct sched_param schedparm;
memset(&schedparm, 0, sizeof(schedparm)); // needs <sched.h> and <string.h>
schedparm.sched_priority = 1; // lowest RT priority
if (sched_setscheduler(0, SCHED_FIFO, &schedparm) == -1)
    perror("sched_setscheduler"); // fails with EPERM without root/CAP_SYS_NICE
but that made no difference.
I guess that a logical step is to apply the PREEMPT_RT patch to the kernel build, but I haven't identified how to do that yet.
Is there anything else I can do to improve the jitter / duration variability?
Or can anyone suggest an accessible tutorial on how to apply the PREEMPT_RT patch?
It seems PREEMPT_RT is the logical next step. Did you try this tutorial?
https://rt.wiki.kernel.org/index.php/RT_PREEMPT_HOWTO
Update: I suggest you look at how others build a preemptive kernel, e.g. here:
https://aur.archlinux.org/packages/linux-rt/
You can read the PKGBUILD to understand what is done.
If you use Debian testing, or LMDE for that matter, they have a precompiled PREEMPT_RT kernel in the repo for the x86 and amd64 architectures.
apt-get install linux-image-rt-686-pae
or
apt-get install linux-image-rt-amd64

clock source in Linux

In a system running kernel version 2.6.38, I see this sysfs file, which shows the current clock source (it happens to be tsc): /sys/devices/system/clocksource/clocksource0/current_clocksource
But it looks like this sysfs file was introduced relatively recently; in 2.6.9 I don't see it. In versions that don't have this sysfs file, is there an easy way to see the clock source? When I compare clock_gettime() outputs across these versions, 2.6.9 seems to be at microsecond granularity while 2.6.38 is at nanosecond granularity, hence I'm wondering what the clock source in 2.6.9 is.
You could try grepping for clocksource and TSC in the dmesg output.
FWIW, the high-resolution timers (which enabled nanosecond resolution, among many other things) were introduced in the 2.6.21 kernel or thereabouts; older kernels don't have that, as you have found out.
