The code below executes a long-running function (a sleep, to keep it simple), then calls itself again using setTimeout. I am using nodejs 5.1.0.
var start_time = Math.ceil(new Date() /1000)
function sleep(delay) {
var start = new Date().getTime();
while (new Date().getTime() < start + delay);
}
function iteration(){
console.log("sleeping at :", Math.ceil(new Date()/1000 - start_time));
sleep(10000); // sleep 10 secs
console.log("waking up at:", Math.ceil(new Date()/1000 - start_time));
setTimeout(iteration, 0);
}
iteration();
At the moment setTimeout is called, node is not doing anything else (the event loop is empty, in theory?), so I expect the callback to be executed immediately. In other words, I expect the program to sleep immediately after it wakes up.
However here is what happens:
>iteration();
sleeping at : 0
waking up at: 10
undefined
> sleeping at : 10 //Expected: Slept immediately after waking up
waking up at: 20
sleeping at : 30 //Not expected: Slept 10 secs after waking up
waking up at: 40
sleeping at : 50 //Not expected: Slept 10 secs after waking up
waking up at: 60
sleeping at : 70 //Not expected. Slept 10 secs after waking up
waking up at: 80
sleeping at : 90 //Not expected: Slept 10 secs after waking up
waking up at: 100
sleeping at : 110 //Not expected: Slept 10 secs after waking up
As you can see, the program sleeps immediately after it wakes up the first time (which is what I expect), but not after that. In fact, it waits exactly another 10 seconds between waking up and sleeping.
If I use setImmediate things work fine. Any idea what I am missing?
EDIT:
Notes:
1- In my real code, I have a complex long algorithm and not a simple sleep function. The code above just reproduces the problem in a simpler use case.
That means I can't use setTimeout(iteration, 10000)
2- The reason why I use setTimeout with a delay of 0 is that my code blocks nodejs for so long that other, smaller tasks get delayed. By calling setTimeout, nodejs gets back to the event loop to execute those small tasks before continuing to work on my main long-running code
3- I completely understand that setInterval(function, 0) will run as soon as possible and not immediately after, however that does not explain why it runs with 10 seconds delay given that nodejs is not doing anything else
4- My code in Chrome runs without any problems. Nodejs bug?
First, look at the node.js documentation:
It is important to note that your callback will probably not be called
in exactly delay milliseconds - Node.js makes no guarantees about the
exact timing of when the callback will fire, nor of the ordering
things will fire in. The callback will be called as close as possible
to the time specified.
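To see how far the actual delay can drift from the requested one, it helps to measure it. The following is a minimal sketch and not part of the quoted documentation; measureTimeoutDelay is a hypothetical helper name introduced only for illustration.

```javascript
// Hypothetical helper: reports how long a setTimeout callback really took to fire.
function measureTimeoutDelay(requestedMs, onResult) {
  var scheduledAt = Date.now();
  setTimeout(function () {
    onResult(Date.now() - scheduledAt); // elapsed wall-clock time in ms
  }, requestedMs);
}

measureTimeoutDelay(0, function (actualMs) {
  console.log("requested 0 ms, fired after", actualMs, "ms");
});
```

Calling the helper right before a blocking loop should show the reported delay growing by roughly the time the loop blocks, since the callback cannot run until the current function returns.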
To make your function more async you could make something like:
function iteration(callback){
console.log("sleeping at :", Math.ceil(new Date()/1000));
sleep(10000); // sleep 10 secs
console.log("waking up at:", Math.ceil(new Date()/1000));
return setTimeout(callback, 0, iteration); // pass the function itself; callback(iteration) would run it immediately instead of scheduling it
}
iteration(iteration);
The next problem is your sleep function. It uses a while loop that blocks your whole node.js process to do something the setTimeout function of node.js already does for you.
function sleep(delay) {
var start = new Date().getTime();
while (new Date().getTime() < start + delay);
}
What happens now is that your sleep function blocks the process. This happens not only when you call the function directly, but also when node.js invokes it after the asynchronous setTimeout call.
And that is not a bug. It has to be like that, because the while loop could compute something your iteration function needs to know, especially when it is called after a setTimeout. So the sleep function runs first, and the scheduled function is then called as soon as possible.
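The question mentions that setImmediate behaves as expected. A minimal sketch of that variant follows, under the assumption that the long algorithm can be cut into slices; busyWork and runSlices are hypothetical names, and the 50 ms slice length is chosen only to keep the demo short.

```javascript
// Hypothetical stand-in for one slice of the long-running algorithm.
function busyWork(ms) {
  var start = Date.now();
  while (Date.now() - start < ms); // deliberately blocks, like sleep() in the question
}

// Run `slices` blocking slices, returning to the event loop between them.
function runSlices(slices, sliceMs, onDone) {
  var gaps = [];      // idle time observed between consecutive slices, in ms
  var lastEnd = null; // timestamp at which the previous slice finished
  function step(remaining) {
    if (lastEnd !== null) gaps.push(Date.now() - lastEnd);
    busyWork(sliceMs);
    lastEnd = Date.now();
    if (remaining > 1) {
      setImmediate(step, remaining - 1); // schedule the next slice via the event loop
    } else {
      onDone(gaps);
    }
  }
  step(slices);
}

runSlices(3, 50, function (gaps) {
  console.log("gaps between slices (ms):", gaps);
});
```

Because each slice returns to the event loop before the next one starts, pending I/O callbacks and expired timers get a chance to run in the gaps.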
Edited answer after question change
I am very new to Scala and following the Scala Book Concurrency section (from docs.scala-lang.org). Based off of the example they give in the book, I wrote a very simple code block to test using Futures:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Failure, Success}
object Main {
def main(args: Array[String]): Unit = {
val a = Future{Thread.sleep(10*100); 42}
a.onComplete {
case Success(x) => println(a)
case Failure(e) => e.printStackTrace
}
Thread.sleep(5000)
}
}
When compiled and run, this properly prints out:
Future(Success(42))
to the console. I'm having trouble wrapping my head around why the Thread.sleep() call comes after the onComplete callback method. The intuitive approach, at least to me, would be to call Thread.sleep() before the callback, so that by the time the main thread gets to the onComplete method, a has been assigned a value. If I move the Thread.sleep() call to before a.onComplete, nothing prints to the console. I'm probably overthinking this, but any help clarifying would be greatly appreciated.
When you use the Thread.sleep() after registering the callback
a.onComplete {
case Success(x) => println(a)
case Failure(e) => e.printStackTrace
}
Thread.sleep(5000)
then the thread that is executing the body of the future has the time to sleep one second and to set the 42 as the result of successful future execution. By that time (after approx. 1 second), the onComplete callback is already registered, so the thread calls this as well, and you see the output on the console.
The sequence is essentially:
t = 0: Daemon thread begins the computation of 42
t = 0: Main thread creates and registers callback.
t = 1: Daemon thread finishes the computation of 42
t = 1 + eps: Daemon thread finds the registered callback and invokes it with the result Success(42).
t = 5: Main thread terminates
t = 5 + eps: program is stopped.
(I'm using eps informally as a placeholder for some reasonably small time interval; + eps means "almost immediately thereafter".)
If you swap the a.onComplete and the outer Thread.sleep as in
Thread.sleep(5000)
a.onComplete {
case Success(x) => println(a)
case Failure(e) => e.printStackTrace
}
then the thread that is executing the body of the future will compute the result 42 after one second, but it would not see any registered callbacks (it would have to wait four more seconds until the callback is created and registered on the main thread). But once 5 seconds have passed, the main thread registers the callback and exits immediately. Even though by that time it has the chance to know that the result 42 has already been computed, the main thread does not attempt to execute the callback, because it's none of its business (that's what the threads in the execution context are for). So, right after registering the callback, the main thread exits immediately. With it, all the daemon threads in the thread pool are killed, and the program exits, so that you don't see anything in the console.
The usual sequence of events is roughly this:
t = 0: Daemon thread begins the computation of 42
t = 1: Daemon thread finishes the computation of 42, but cannot do anything with it.
t = 5: Main thread creates and registers the callback
t = 5 + eps: Main thread terminates, daemon thread is killed, program is stopped.
so that there is (almost) no time when the daemon thread could wake up, find the callback, and invoke it.
A lot of things in Scala are functions and don't necessarily look like it. The argument to onComplete is one of those things. What you've written is
a.onComplete {
case Success(x) => println(a)
case Failure(e) => e.printStackTrace
}
What that translates to after all the Scala magic is effectively (modulo PartialFunction shenanigans)
a.onComplete({ value =>
value match {
case Success(x) => println(a)
case Failure(e) => e.printStackTrace
}
})
onComplete isn't actually doing any work. It's just setting up a function that will be called at a later date. So we want to do that as soon as we can, and the Scala scheduler (specifically, the ExecutionContext) will invoke our callback at its convenience.
So I've written the following function to show what I mean:
use std::{thread, time};
const TARGET_FPS: u64 = 60;
fn main() {
let mut frames = 0;
let target_ft = time::Duration::from_micros(1000000 / TARGET_FPS);
println!("target frame time: {:?}",target_ft);
let mut time_slept = time::Duration::from_micros(0);
let start = time::Instant::now();
loop {
let frame_time = time::Instant::now();
frames+=1;
if frames == 60 {
break
}
if let Some(i) = (target_ft).checked_sub(frame_time.elapsed()) {
time_slept+=i;
thread::sleep(i)
}
}
println!("time elapsed: {:?}",start.elapsed());
println!("time slept: {:?}",time_slept);
}
The idea of the function is to execute 60 cycles at 60fps, then exit with the time elapsed and the total time spent sleeping during the loop. Ideally, since I'm executing 60 cycles at 60fps with no real calculations happening in between, it should take about one second to execute and spend basically the entire second sleeping. Instead, when I run it, it returns:
target frame time: 16.666ms
time elapsed: 1.8262798s
time slept: 983.2533ms
As you can see, even though it was only told to sleep for a total of 983ms, the 60 cycles took nearly 2 seconds to complete. Because of this nearly 50% inaccuracy, a loop told to run at 60fps instead runs at only 34fps.
The docs say: "The thread may sleep longer than the duration specified due to scheduling specifics or platform-dependent functionality. It will never sleep less." But is this really just from that? Am I doing something wrong?
I switched to using spin_sleep::sleep(i) from https://crates.io/crates/spin_sleep and it seems to have fixed it. I guess it must just be Windows inaccuracies then... Still strange that thread::sleep on Windows would be that far off for something as simple as a game loop.
My question is about performance in my NodeJS app...
If my program runs 12 iterations of 1.250.000 each = 15.000.000 iterations altogether, dedicated servers at Amazon take the following time to process it:
r3.large: 2 vCPU, 6.5 ECU, 15 GB memory --> 123 minutes
4.8xlarge: 36 vCPU, 132 ECU, 60 GB memory --> 102 minutes
I have some code similar to the code below...
start();
function start() {
for (var i = 0; i < 12; i++) {
function2(); // Iterates over a collection that contains data split up into date intervals. This function is actually also recursive, because it runs through the data many times (max 50-100 times) due to the different interval sizes...
}
}
function function2() {
return new Promise(function (resolve) {
for (var i = 0; i < 1250000; i++) {
new Promise(function (resolveInner) {
resolveInner(function3()); // function2 simply iterates through all possible combinations and calls function3 with all given values/combinations
});
}
resolve(); // placeholder: the real code resolves with the best result/combination
});
}
function function3() {
return new Promise(function (resolve) {
// This function simply makes some calculations based on the given values/combination and then returns the result to function2, which in the end decides which result/combination was the best...
resolve(); // placeholder for the calculated result
});
}
This is equal to 0.411 milliseconds (411 microseconds) per iteration!
When I look at performance and memory usage in the taskbar, the CPU is not running at 100% but more like 50% the entire time.
The memory usage starts very low but KEEPS growing by gigabytes every minute until the process is done, and the allocated memory is only released when I press CTRL+C in the Windows CMD. So it's as if the NodeJS garbage collection doesn't work optimally, or maybe it's simply the design of the code again...
When I execute the app, I use the memory option like this:
node --max-old-space-size="50000" server.js
PLEASE tell me everything you think I can do to make my program FASTER!
Thank you all - so much!
It's not that the garbage collector doesn't work optimally but that it doesn't work at all - you don't give it any chance to.
When developing the tco module that does tail call optimization in Node, I noticed a strange thing. It seemed to leak memory and I didn't know why. It turned out that it was because of a few console.log() calls in various places that I used for testing, to see what was going on: seeing the result of a recursive call millions of levels deep took some time, so I wanted to see something while it was running.
Your example is pretty similar to that.
Remember that Node is single-threaded. When your computations run, nothing else can - including the GC. Your code is completely synchronous and blocking - even though it's generating millions of promises in a blocking manner. It is blocking because it never reaches the event loop.
Consider this example:
var a = 0, b = 10000000;
function numbers() {
while (a < b) {
console.log("Number " + a++);
}
}
numbers();
It's pretty simple - you want to print 10 million numbers. But when you run it, it behaves very strangely - for example, it prints numbers up to some point, then it stops for several seconds, then it keeps going, or maybe starts thrashing if you're using swap, or maybe gives you this error that I just got right after seeing the Number 8486:
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory
Aborted
What's going on here is that the main thread is blocked in a synchronous loop where it keeps creating objects but the GC has no chance to release them.
For such long running tasks you need to divide your work and get into the event loop once in a while.
Here is how you can fix this problem:
var a = 0, b = 10000000;
function numbers() {
var i = 0;
while (a < b && i++ < 100) {
console.log("Number " + a++);
}
if (a < b) setImmediate(numbers);
}
numbers();
It does the same - it prints numbers from a to b but in bunches of 100 and then it schedules itself to continue at the end of the event loop.
Output of $(which time) -v node numbers1.js 2>&1 | egrep 'Maximum resident|FATAL'
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory
Maximum resident set size (kbytes): 1495968
It used 1.5GB of memory and crashed.
Output of $(which time) -v node numbers2.js 2>&1 | egrep 'Maximum resident|FATAL'
Maximum resident set size (kbytes): 56404
It used 56MB of memory and finished.
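The same batching idea can be applied to a promise-based loop like the one in the question. The sketch below is only an illustration under stated assumptions: nextTurn, score and findBest are hypothetical names, score is a made-up stand-in for the real calculation, the batch size of 10000 is arbitrary, and the code assumes a Node.js version with async/await support.

```javascript
// Resolve on a later turn of the event loop, so other pending callbacks can run.
function nextTurn() {
  return new Promise(function (resolve) { setImmediate(resolve); });
}

// Hypothetical stand-in for the per-combination calculation (function3 in the question).
function score(i) {
  return (i * 7919) % 1000;
}

// Evaluate `total` combinations in batches, yielding to the event loop between batches.
async function findBest(total, batchSize) {
  var best = -1;
  for (var i = 0; i < total; i++) {
    var s = score(i);
    if (s > best) best = s;
    if ((i + 1) % batchSize === 0) await nextTurn(); // end of a batch: give up the thread
  }
  return best;
}

findBest(1250000, 10000).then(function (best) {
  console.log("best score:", best);
});
```

Only one promise per batch is created here instead of one per iteration, which keeps the number of short-lived objects low as well.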
See also these answers:
How to write non-blocking async function in Express request handler
How node.js server serve next request, if current request have huge computation?
Maximum call stack size exceeded in nodejs
Node; Q Promise delay
How to avoid jimp blocking the code node.js
As stated in https://lwn.net/Articles/308545/ hrtimer callbacks run in hard interrupt context with irqs disabled.
But what about SMP?
Can a second callback for another hrtimer run on another core
while a first callback is already running, or do they exclude each other on all cores, so that no locking is needed between them?
edit:
When a handler for a "regular" hardware IRQ (let's call it X) is running on a core, all IRQs are disabled on only that core, but IRQ X is disabled on the whole system, so two handlers for X never run concurrently.
How do hrtimer interrupts behave in this regard?
Do they all share the same quasi IRQ, or is there one IRQ per hrtimer?
edit:
Did some experiments with two timers A and B:
// starting timer A to fire as fast as possible...
A_ktime = ktime_set(0, 1); // 1 NS
hrtimer_start( &A, A_ktime, HRTIMER_MODE_REL );
// starting timer B to fire 10 us later
B_ktime = ktime_set(0, 10000); // 10 us
hrtimer_start( &B, B_ktime, HRTIMER_MODE_REL );
Put some printks into the callbacks and a huge delay into the one for timer A
// fired after 1 NS
enum hrtimer_restart A(struct hrtimer *timer)
{
int i;
printk("timer A: %lu\n",jiffies);
for(i=0;i<10000;i++){ // delay 10 seconds (1000 jiffies with HZ 100)
udelay(1000);
}
printk("end timer A: %lu\n",jiffies);
return HRTIMER_NORESTART;
}
// fired after 10 us
enum hrtimer_restart B(struct hrtimer *timer)
{
printk("timer B: %lu\n",jiffies);
return HRTIMER_NORESTART;
}
The result was reproducible, something like:
[ 6.217393] timer A: 4294937914
[ 16.220352] end timer A: 4294938914
[ 16.224059] timer B: 4294938915
Timer B fired about 1000 jiffies after the start of timer A, although it was set up to fire less than one jiffy after it.
When driving this further and increasing the delay to 70 seconds,
I got 7000 jiffies between the start of the timer A callback and the timer B callback.
[ 6.218258] timer A: 4294937914
[ 76.220058] end timer A: 4294944914
[ 76.224192] timer B: 4294944915
edit:
Locking is probably required, because hrtimers
can be enqueued on any CPU. If two of them are enqueued on the same CPU, they may delay each other, but there is no guarantee.
from hrtimer.h:
* On SMP it is possible to have a "callback function running and enqueued"
* status. It happens for example when a posix timer expired and the callback
* queued a signal. Between dropping the lock which protects the posix timer
* and reacquiring the base lock of the hrtimer, another CPU can deliver the
* signal and rearm the timer.
Is there any difference between these two approaches?
val upload = for {
done <- Future {
println("uploadingStart")
uploadInAmazonS3 //take 10 to 12 sec
println("uploaded")
}
} yield done
and
println("uploadingStart")
val upload = for {
done <- Future {
uploadInAmazonS3 //take 10 to 12 sec
}
}yield done
println("uploaded")
I want to know the difference in terms of thread blocking.
Is the thread blocked in the first case while executing these three lines
println("uploadingStart")
uploadInAmazonS3 //take 10 to 12 sec
println("uploaded")
and not blocked in the other case? Is that so?
Or are the threads equally busy in both cases?
The code within the Future will be executed by some thread from the ExecutionContext (thread pool).
Yes, the thread which executes this part
println("uploadingStart")
uploadInAmazonS3 //take 10 to 12 sec
println("uploaded")
will be blocked, but not the calling thread (the main thread).
In the second case, both println statements are executed by the main thread. Since the main thread simply proceeds after creating the future, the println statements are executed without any delay.
The difference is that in the former code the println calls are executed when the future is actually performed, whereas in the second one they are run when the future is declared (prepared, but not yet executed).