epoll_wait and changing system time back

epoll_wait and changing system time back - linux

My app uses epoll_wait to perform a timed wait for IO events. If no event happens, epoll_wait is supposed to return after the timeout and my app continues.
During testing, someone turn the system clock back by a day and the part of my app that uses epoll_wait stopped working for 24 hours. Obviously, this is a problem.
I've rummaged around looking for something that might allow my app to know that the time has changed (for example, a signal) but I haven't found anything.
Is there any way to deal with abrupt time changes like this?

Use a timer based upon a monotonic time (e.g., timer_create(2)) to generate a signal and do a blocking epoll_wait, and check for -1 return code and errno set to EINTR.

The Linux kernel implementation of epoll_wait uses CLOCK_MONOTONIC so it should be immune to system clock changes. (I investigated this since I was seeing a similar problem and wanted to discount epoll_wait as a possible cause.) Before v2.6.37 it used to use jiffies which are also monotonic.
I've submitted a patch to the epoll_wait man page to clarify this.

Related

Is there a way to make time pass faster in linux

I'm not quite sure how timekeeping works in linux short of configuring an NTP server and such.
I am wondering if there is a way for me to make time tick faster in linux. I would like for example for 1 second to tick 10000 times faster than normal.
For clarification I don't want to make time jump like resetting a clock, I would like to increase the tick rate whatever it may be.

This is often needed functionality for simulations and replaying incoming data or events as fast as possible.
The way people solve this issue is that they have an event loop, e.g. libevent or boost::asio. The current time is obtained from the event loop (e.g. the time when epoll has returned) and stored in the event loop variable current time. Instead of using gettimeofday or clock_gettime the time is read from that current time variable. All timers are driven by the event loop current time.
When simulating/replaying, the event loop current time gets assigned the timestamp of the next event, hence eliminating time durations between the events and replaying the events as fast as possible. And your timers still work and fire in between the events as they would in the real-time but without the delays. For this to work your saved event stream that your replay must contain a timestamp of each event, of course.

How to call a function every n milliseconds in "real world" time exactly?

If I understand correctly, setInterval(() => console.log('hello world'), 1000) will place the function to some queue of tasks to run. But if there are other tasks in-front of it, it won't run exactly at 1000 millisecond or every time.
In a single complex program, is it possible to also make calls to some function every n millisecond exactly in real world time with node.js ?

If I understand correctly, setInterval(() => console.log('hello world'), 1000) will place the function to some queue of tasks to run. But if there are other tasks in-front of it, it won't run exactly at 1000 millisecond or every time.
That is correct. It won't run exactly at the desired time if node.js happens to be busy doing something else when the timer is ready to run. node.js will wait until it finishes it's other task before running the timer callback. You can think of node.js as if it has a one-track mind (can only do one thing at a time) and timers don't ever interrupt existing tasks that are running.
In a single complex program, is it possible to also make calls to some function every n millisecond exactly in real world time with node.js ?
No, it is not possible to do that in node.js. node.js runs your Javascript as single-threaded, it's event driven and not-preemptive. All of these mean that you cannot rely on code running at a precise real-world time.
What happens under the covers in node.js is that you set a timer for a specific time in the future. That timer goes is registered with the node.js event loop so that each time it gets through the event loop, it will check if there are any pending timers. But, it only gets through the event loop when other code that was running before the timer was ready to fire finishes running. Here's the sequence of events:
Run some code
Set timer for some time in the future (say time X)
Run some more code
Nothing to do for awhile
Run some more code (while this code is running, time X passes - the time for your timer to run)
Previous block of code finishes running and control returns back to the node.js event loop at time X + n (some time after the timer X was supposed to fire).
Event loop checks to see if there are any pending timers. It finds a timer and calls its callback at time X + n.
So, the only way that your timer gets called at approximately time X is if node.js has nothing else to do at exactly time X. If your program is ever doing anything else, you can't guarantee that your program will be free at exactly time X to run the timer exactly when you want it to run. node.js is NOT a real-time system in any way. single-threaded and non-pre-emptive mean that a timer may have to wait for node.js to finish some other things before it gets to run and thus there is no guarantee that the timer will run exactly on time. Instead, it will run as not before time X when the interpreter is next free to return back to the event loop (done running whatever else might have been running at the time). This could be close to time X or it could be a significant time after time X.
If you really need something to run precisely at a specific time, then you likely need a pre-emptive system (not node.js) that is much more real-time than node.js is.
You could create a "work-around" in node.js by firing up another node.js process (you could use the child_process module) and start a program in that other process that has nothing else to do except serve your timer and execute the code associated with that timer. Then, at least you timer won't be pre-empted by some other Javascript task that might be running and will get to run pretty close to the desired time. Keep in mind that even this work-around still isn't a true real-time system, but might serve some purposes.
Otherwise, you probably want to write this in a more real-time system language that has pre-emptive timers (probably even with thread priorities).

But if there are other tasks in-front of it, it won't run exactly at 1000 millisecond or every time.
Your question is actually operating system specific, assuming the computer is running some (usual) operating system (like Windows, Android, Linux, MacOSX, etc...). I recommend reading Operating Systems: Three Easy Pieces to learn more.
In practice, your computer has many other processes managed by its operating system. Some of them might be running. Your computer might be in a situation where it is loaded enough by other processes to the point of not being able to run your tasks or threads exactly every second. Read about thrashing.
You might want to use some genuine real-time operating system. But then, node.js probably won't run on it.
How to call a function every n milliseconds in “real world” time exactly?
You cannot do that reliably. Because your node.js process (it is actually single threaded, at the system threads level, see pthreads(7) and jfriend00's answer) might not get enough resources from your OS (so if other processes are loading your computer too much, node.js would be starved and won't be able to progress like you want; be also aware of possible priority inversions).
On Linux, see also shed(7), chrt(1), renice(1)

I suggest to make a cron which will run at every n seconds. If your program is complex and it may take more time then you can go with async.
npm install cron
var CronJob = require('cron').CronJob;
new CronJob('* * * * * *', function() {
console.log('You will see this message every second');
callYourFunc();
}, null, true, 'America/Los_Angeles');
For more read this link

Perhaps you could spawn a worker thread and block it while it’s waiting to do the work, in the way suggested by CertainPerformance in the comments. It may not be the most elegant way to do it but at least you can put the blocking logic aside so that it doesn’t affect the rest of the application.
Check out the example in the docs if you’re unfamiliar with the cluster module: https://nodejs.org/docs/latest-v10.x/api/cluster.html

nodejs prioritise function execution

I've edited a library (ddp-client) to make use of a heartbeat timer, which sends out a ping every X seconds. However, I'm also doing some work with the bluetooth hardware, which I believe is responsible for pings sometimes not returning in time (because the bluetooth seems to block the event loop temporarily). Is there a way to prioritise a certain function on the event loop, so it will always be executed before others? I don't think setImmediate would be suitable here, since I don't know exactly when the response message from the server would arrive.
The implementation of the timer is roughly as follows:
every X seconds
if(ping outstanding) {
//Did not resolve in time
closeConnection()
} else {
ping outstanding = true
sendPing()
}
This works perfectly fine if I run it without the bluetooth module. When I enable the bluetooth module, pings sometimes do not get resolved because the time taken to scan for bluetooth is sometimes longer than the interval of the timer, leading to a disconnect, while it's actually still connected.

Is there a way to prioritise a certain function on the event loop, so it will always be executed before others?
No. node.js does not have a way for one piece of code to pre-empt another and always have priority. Any code that "hogs" the CPU a bit or otherwise blocks the event loop a bit should either be fixed to not do that or it can be moved into it's own child process and you can communicate with it via any one of the many interprocess communication schemes.
Or, alternatively, if the ping timer is really, really important to run on time, then maybe it should be in its own child process where it can always just run as scheduled with no chance of something else interrupting it.
Implementing precise timers like this is one thing that node.js is just not good at. Because it runs all your Javascript in a single thread, keeping a server instantly responsive or keeping timers running precisely on time requires that nobody ever blocks the event loop or hogs the CPU for longer than your timing threshold. The usual work-around is to move things into their own child process where they get their own priority with the CPU.

Open MPI Virtual Timer Expired

I'm using Open MPI 1.8 on Gentoo 3.13 to manage the data transfer from one program to another via a server/client concept. Both the server and the clients are launched via mpiexec as separate processes. After some days (this is quite a heavy computation...), I sometimes receive the error
mpiexec noticed that process rank 0 with PID 17213 on node XXX exited on signal 26 (Virtual timer expired).
Unfortunately, the error is not reproducible in a reliable way, i.e., the error does not appear always and not always at the same point in the program flow. I also experienced this error on other machines. I already tracked the issue down to the ITIMER_VIRTUAL which, upon expiration, delivers SIGVTALRM (see, e.g., http://man7.org/linux/man-pages/man2/setitimer.2.html). In the BUGS section of the man page, it says that
Under very heavy loading, an ITIMER_REAL timer may expire before the signal from a previous expiration has been delivered. The second signal in such an event will be lost.
I wonder if something similar might also hold for ITIMER_VIRTUAL? Did anyone experience similar problems and can confirm the error?
The only workaround I can think of is to invoke setitimer(...) and try to manipulate the timer myself. However, I hope there is another way since I can't always modify the clients' source code. Any suggestions?

Since this question has not been answered officially, I will do it on behalf of Hristo (#HristoIliev: I hope this is ok for you). As was pointed out in the first comment to my question, there is not a single hint in the Open MPI source code which can have caused the virtual timer expiration. Indeed, the timer problem was related to a third-party library which made the code crash after an unpredictable time (depending on the current loading of the machine).

What would cause WaitForSingleObject with finite timeout not to return?

The title says it all. I am using C++ Builder to submit a form to an Internet server using TIdHTTP->Post(), to get a response. Since that call can get stuck if there is a network problem or a server problem, I am trying to run it in a separate thread. When the Post() returns, I signal the Event that I am waiting for with WaitForSingleObject, using a timeout of 1000. At one point, I was processing messages after the timeouts, but now I am just repeating the WaitForSingleObject call with a timeout of 1000 again, until the event is signaled or my total timeout period (20 seconds) has elapsed. If the timeout elapses, I would call Disconnect() on the TIdHTTP and try again.
However, I have not been able to get this to work reliably, although it usually works. I am using CodeSite to log the progress, and I can see that, on occasion, WaitForSingleObject is called, but does not return (ever). Since WaitForSingleObject is being called on the main thread, the application is then unresponsive until it is killed.
While one must always think of memory corruption when a C++ program stalls, I don't think that is what is going on. The stall is always at the WaitForSingleObject call, and if it was a memory corruption issue, I would expect that, at least sometimes, something else would go wrong.
The MSDN page for WaitForSingleObject says that the timer does not count down while the computer is asleep, and the monitor does go blank after a while, but the computer continues to run, and in any case WaitForSingleObject does not return once the mouse is moved and the monitor comes back on.
So, again, my question. What could be causing WaitForSingleObject with a finite timeout (1000 msecs) to never return?

So, the answer to my question is "Something Else". In this case, I finally tracked it down to a library I was using that also used threads. It worked with a previous version of RAD Studio, but disabling that library fixed this issue. I am moving to the current version of that library and will re-test.
I had read about problems with WFSO causing blocks, and even where Sleep might never return if there were too many threads running (http://msdn.microsoft.com/en-us/library/windows/desktop/ms686298(v=vs.85).aspx), so I thought there might be something I didn't know about threads and WFSO causing this.
Thanks to everyone for your helpful comments which pointed me in the right direction, and especially to Remy Lebeau for his suggestions on managing Post timeouts with Indy, which he maintains.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string