Node.js: When does process.hrtime start?

What is the starting time for process.hrtime (node.js version of microTime)? I couldn't find any event that starts it. Is the timestamp of the start saved anywhere?
I need my server to be able to measure latency to the client in both directions, and for that I need a reliable microtime measurement (plus for some other things).

The starting time is arbitrary; its actual value alone is meaningless. Call hrtime when you want to start the stopwatch, and call it again when your operation is done. Subtract the former from the latter and you have the elapsed time.
process.hrtime()
Returns the current high-resolution real time in a
[seconds, nanoseconds] tuple Array. It is relative to an arbitrary
time in the past. It is not related to the time of day and therefore
not subject to clock drift. The primary use is for measuring
performance between intervals.
You may pass in the result of a previous call to process.hrtime() to
get a diff reading, useful for benchmarks and measuring intervals:
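For example, a minimal sketch of that diff usage (the empty loop is just a stand-in for whatever work you want to time):

const start = process.hrtime();              // reading relative to the arbitrary origin

for (let i = 0; i < 1e6; i++) {}             // stand-in for the work being measured

const [sec, nano] = process.hrtime(start);   // passing the earlier reading returns the elapsed time
console.log(`elapsed: ${sec}s ${(nano / 1e6).toFixed(3)}ms`);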

Related

Random slowdowns in node.js execution

I have an optimization algorithm written in node.js that uses cpu time (measured with performance.now()) as a heuristic.
However, I noticed that occasionally some trivial lines of code would cost much more than usual.
So I wrote a test program:
const _ = require("lodash"); // assumed from the _.mean/_.max/_.min usage below

const timings = [];
while (true) {
  const start = performance.now();
  // can add any trivial line of code here, or just nothing
  const end = performance.now();
  const dur = end - start;
  if (dur > 1) {
    // Bail out with a summary as soon as one iteration appears to take more than 1 ms.
    throw [
      "dur > 1",
      {
        start,
        end,
        dur,
        timings,
        avg: _.mean(timings),
        max: _.max(timings),
        min: _.min(timings),
        last: timings.slice(-10),
      },
    ];
  }
  timings.push(dur);
}
The measurements showed an average of 0.00003 ms and a peak of >1 ms (the second highest was <1 ms but of the same order of magnitude).
The possible reasons I can think of are:
1. the average timing isn't the actual time for executing the code (some compiler optimization)
2. performance.now isn't accurate somehow
3. cpu scheduling related - process wasn't running normally but still counted in performance.now
4. occasionally node is doing something extra behind the scenes (GC etc)
5. something happening on the hardware/os level - caching / page faults etc
Is any of these a likely reason, or is it something else?
Whichever the cause is, is there a way to make a more accurate measurement for the algorithm to use?
The outliers are currently causing the algorithm to misbehave, and without knowing how to resolve this issue the best option is to use the moving average cost as a heuristic, but that has its own downsides.
Thanks in advance!
------- Edit
I appreciate that performance.now() will never be perfectly accurate, but I was a bit surprised that it could span 3-4 orders of magnitude (as opposed to 2 orders of magnitude, or ideally 1).
Would anyone have any idea/pointers as to how performance.now() works and thus what's likely the major contributor to the error range?
It'd be nice to know if the cause is due to something node/v8 doesn't have control over (hardware/os level) vs something it does have control over (a node bug/options/gc related), so I can decide whether there's a way to reduce the error range before considering other tradeoffs with using an alternative heuristic.
------- Edit 2
Thanks to @jfriend00 I now realize performance.now() doesn't measure the actual CPU time the node process executed, but just the wall-clock time since the process started.
The question now is
if there's an existing way to get actual CPU time
is this a feature request for node/v8
unless the node process doesn't have enough information from the OS to provide this
You're unlikely to be able to accurately measure the time for one trivial line of code. In fact, the overhead of executing performance.now() is probably many times higher than the time to execute one trivial line of code. You have to be careful that what you're measuring takes substantially longer to execute than the uncertainty or overhead of the measurement itself. Measuring very small execution times is not going to be an accurate endeavor.
1, 3 and 5 in your list are also all possibilities. You aren't guaranteed that your code gets a dedicated CPU core that is never interrupted to service some other thread in the system. On my Windows system, even when my nodejs app is the only "app" running, there are hundreds of other threads devoted to various OS services that may or may not request some time to run while my nodejs app is running, eventually taking a time slice of the CPU core my nodejs app was using.
And, as best I know, performance.now() is just getting a high resolution timer from the OS that's relative to some epoch time. It has no idea when your thread is and isn't running on a CPU core and wouldn't have any way to adjust for that. It just gets a high resolution timestamp which you can compare to some other high resolution timestamp. The time elapsed is not CPU time for your thread. It's just clock time elapsed.
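As a rough illustration of the overhead point above (a minimal sketch; the exact numbers will vary by machine and Node version):

const { performance } = require("perf_hooks"); // also available as a global in recent Node versions

// Estimate the cost of the measurement itself: a million back-to-back performance.now() calls.
const N = 1e6;
const t0 = performance.now();
for (let i = 0; i < N; i++) {
  performance.now();
}
const t1 = performance.now();
console.log(`average cost of one performance.now() call: ${(((t1 - t0) / N) * 1e6).toFixed(1)} ns`);

Anything whose true cost is in the same ballpark as that number cannot be measured meaningfully one call at a time.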
Is any of these a likely reason, or is it something else?
Yes, they all sound likely.
is there a way to make a more accurate measurement for the algorithm to use?
No, sub-millisecond time measurements are generally not reliable, and almost never a good idea. (Doesn't matter whether a timing API promises micro/nanosecond precision or whatever; chances are that (1) it doesn't hold up in practice, and (2) trying to rely on it creates more problems than it solves. You've just found an example of that.)
Even measuring milliseconds is fraught with peril. I once investigated a case of surprising performance, where it turned out that on that particular combination of hardware and OS, after 16ms of full load the CPU ~tripled its clock rate, which of course had nothing to do with the code that appeared to behave weirdly.
EDIT to reply to edited question:
The question now is
if there's an existing way to get actual CPU time
No.
is this a feature request for node/v8
No, because...
unless the node process doesn't have enough information from the OS to provide this
...yes.

Executing a function periodically on accurate and precise intervals

I want to implement an accurate and precise countdown timer for my application. I started with the most simple implementation, which was not accurate at all.
loop {
    // Code which can take up to 10 ms to finish
    ...
    let interval = std::time::Duration::from_millis(1000);
    std::thread::sleep(interval);
}
As the code before the sleep call can take some time to finish, I cannot run the next iteration at the intended interval. Even worse, if the countdown timer is run for 2 minutes, the 10 milliseconds from each iteration add up to 1.2 seconds. So, this version is not very accurate.
I can account for this delay by measuring how much time this code takes to execute.
loop {
    let start = std::time::Instant::now();
    // Code which can take up to 10 ms to finish
    ...
    let interval = std::time::Duration::from_millis(1000);
    // saturating_sub avoids a panic if the work ever takes longer than the interval
    std::thread::sleep(interval.saturating_sub(start.elapsed()));
}
Even though this seems to be precise to within milliseconds, I wanted to know if there is a way to implement this that is even more accurate and precise, and/or how it is usually done in software.
For precise timing, you basically have to busy wait: while start.elapsed() < interval {}. This is also called "spinning" (you might have heard of a "spin lock"). Of course, this is far more CPU intensive than using the OS-provided sleep functionality (which often transitions the CPU into some low-power mode).
To improve upon that slightly, instead of doing absolutely nothing in the loop body, you could:
Call thread::yield_now().
Call std::hint::spin_loop()
Unfortunately, I can't really tell you what timing guarantees these two functions give you. But from the documentation it seems like spin_loop will result in more precise timing.
Also, you very likely want to combine the "spin waiting" with std::thread::sleep so that you sleep the majority of the time with the latter method. That saves a lot of power/CPU-resources. And hey, there is even a crate for exactly that: spin_sleep. You should probably just use that.
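Since the rest of this page is Node-focused, here is a rough JavaScript sketch of the same sleep-then-spin idea; the function name and the ~2 ms spin margin are illustrative choices, not anything the spin_sleep crate prescribes:

const { performance } = require("perf_hooks");
const { setTimeout: sleep } = require("timers/promises");

// Sleep coarsely for most of the interval, then busy-wait until the absolute deadline.
async function preciseWaitUntil(deadlineMs) {
  const coarse = deadlineMs - performance.now() - 2; // leave ~2 ms for spinning
  if (coarse > 0) await sleep(coarse);
  while (performance.now() < deadlineMs) { /* spin */ }
}

(async () => {
  // Five ticks roughly 1000 ms apart, scheduled against absolute deadlines
  // so small overshoots don't accumulate.
  let next = performance.now() + 1000;
  for (let i = 0; i < 5; i++) {
    await preciseWaitUntil(next);
    console.log(`tick ${i} at ${performance.now().toFixed(3)} ms`);
    next += 1000;
  }
})();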
Finally, just in case you are not aware: for several use cases of these "timings", there are other functions you can use. For example, if you want to render a frame every 60th of a second, you want to use some API that synchronizes your loop with the refresh rate/v-blanking of the monitor directly, instead of manually sleeping.

Measuring a feature's share of a web service's execution time

I have a piece of code that includes a specific feature that I can turn on and off. I want to know the execution time of the feature.
I need to measure this externally, i.e. by simply measuring execution time with a load test tool. Assume that I cannot track the feature's execution time internally.
Now, I execute two runs (on/off) and simply assume that the difference between the resulting execution time is my feature's execution time.
I know that it is not entirely correct to do this as I'm looking at two separate runs that may be influenced by networking, programmatic overhead, or the gravitational pull of the moon. Still, I hope I can assume that the result will still be viable if I have a sufficiently large number of requests.
Now for the real question. I do the above using the average response time. Which is not perfect, but more or less ok.
My question is, what if I now use a percentile (say, 95th) instead?
Would my imperfect subtract-A-from-B approach become significantly more imperfect when using percentiles?
I would stick to the percentiles, as the "average" approach can mask the problem. For example, if you have very low response times during the initial phase of the test when the load is low, and very high response times during the main phase when the load is immense, the arithmetic mean will give you okay-ish values, while the percentiles will tell you that the response time for 95% of requests was X or higher.
More information: Understanding Your Reports: Part 3 - Key Statistics Performance Testers Need to Understand
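As a small illustration of that masking effect with made-up numbers (a hypothetical test with 900 fast responses during ramp-up and 100 slow ones under full load):

const times = [
  ...Array.from({ length: 900 }, () => 50),   // 900 fast responses (~50 ms) during ramp-up
  ...Array.from({ length: 100 }, () => 2000), // 100 slow responses (~2000 ms) under full load
];

const mean = times.reduce((a, b) => a + b, 0) / times.length;
const p95 = [...times].sort((a, b) => a - b)[Math.floor(times.length * 0.95)];

console.log({ mean, p95 }); // mean = 245 ms looks okayish; p95 = 2000 ms exposes the problem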

What time are the process.hrtime() and process.hrtime.bigint() functions referring to in Node.js?

process.hrtime() is the legacy version of the process.hrtime.bigint() method
process.hrtime(); // -> [27511, 516453000] (seconds, remaining nanoseconds)
process.hrtime.bigint(); // -> 27511516453000n (nanoseconds)
27511516453000 nanoseconds is 7.6420879036111113 hours.
When I tested this, the time was 11:54 UTC (14:54 local time).
The 7.64 hours doesn't refer to the current time, so what is it referring to?
The meaning of these values is defined in the documentation (although it's easy to miss, since it's just one line):
These times are relative to an arbitrary time in the past, and not related to the time of day and therefore not subject to clock drift.
So in other words, hrtime is only useful for calculating the time relative to another point in time. If you call hrtime now, and then again ten seconds in the future, the result of subtracting the former from the latter will equal ten seconds. The values returned by those two calls, however, have no real meaning in isolation from each other.
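For example, using the bigint variant from the question (a minimal sketch; the empty loop just stands in for real work):

const before = process.hrtime.bigint();   // nanoseconds since some arbitrary origin

for (let i = 0; i < 1e6; i++) {}          // stand-in for the work being measured

const after = process.hrtime.bigint();
console.log(`elapsed: ${(after - before) / 1000n} µs`); // only the difference is meaningful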

What's a good technique to store a time-dependent metric in Redis?

I have some metrics (like counts of logged in users, or SQL queries, or whatever), and I want to gather some time-dependent stats on a regular basis.
For example I want to know how many users were registered in some particular year, month, week, day or even hour.
I thought maybe Redis can be a good fit. But I can't imagine a good strategy for storing such stats. The only idea I have is to store independent counters for days, weeks, etc, and bump them all at once.
How do you do it? I need a good trick. Or maybe Redis isn't any good for my task.
If all you need is a count for each period, the multiple counter approach you suggest is a good one. Incrementing 5 counters in a single pipelined transaction is O(1), while set operations are O(log n + m) with potentially large values of n/m.
The set solution Frank suggested does have its place - I use something similar where I need to know which actions happened rather than just how many. Obviously storing details of each action takes more memory than the counters, but with the amount of RAM typically available these days you can store millions of records before that becomes a problem.
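As a rough sketch of the counter approach from Node.js (assuming the ioredis client; the logins:* key names are purely illustrative, and a week counter would be added the same way):

const Redis = require("ioredis"); // assumed client; any client with MULTI/INCR support works
const redis = new Redis();

// Bump one counter per period (year, month, day, hour) in a single pipelined MULTI/EXEC.
async function recordLogin(date = new Date()) {
  const y = date.getUTCFullYear();
  const m = String(date.getUTCMonth() + 1).padStart(2, "0");
  const d = String(date.getUTCDate()).padStart(2, "0");
  const h = String(date.getUTCHours()).padStart(2, "0");
  await redis
    .multi()
    .incr(`logins:${y}`)
    .incr(`logins:${y}-${m}`)
    .incr(`logins:${y}-${m}-${d}`)
    .incr(`logins:${y}-${m}-${d}:${h}`)
    .exec();
}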
I would just use a sorted set where the score is the timestamp in seconds since the epoch (Unix time). Say you have a sorted set of logins and you want to see how many logins occurred in the year 2010: just convert 20101231 23:59:59 and 20100101 00:00:00 to seconds and use those as the max and min arguments to ZCOUNT.
The obvious difficulty here is handling the time conversion yourself, but it's actually very easy because it's the standard Unix format. You can use the date command with +%s (on Linux at least), or the system calls time(), localtime() and mktime(), as well as any of the myriad ways available within specific languages that are built on top of these system calls.
I am sure there is some equivalent paradigm on Windows, but I don't have any experience there.
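And a minimal sketch of the sorted-set approach, again assuming ioredis; the key and member names are made up:

const Redis = require("ioredis"); // assumed client
const redis = new Redis();

// Record each login with its Unix timestamp (seconds) as the score.
async function recordLogin(userId) {
  const now = Math.floor(Date.now() / 1000);
  await redis.zadd("logins", now, `${userId}:${now}`); // member must be unique per event
}

// Count logins during calendar year 2010 (UTC).
async function loginsIn2010() {
  const min = Date.UTC(2010, 0, 1) / 1000;                 // 2010-01-01 00:00:00
  const max = Date.UTC(2010, 11, 31, 23, 59, 59) / 1000;   // 2010-12-31 23:59:59
  return redis.zcount("logins", min, max);
}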
