the difference between `setTimeout` and `setImmediate` - node.js

the nodejs document say
Schedules "immediate" execution of callback after I/O events' callbacks and before timers set by setTimeout and setInterval are triggered. Returns an immediateObject for possible use with clearImmediate.
but I write a test code as follows:
server = http.createServer((req, res)->
res.end()
)
setImmediate(()->
console.log 'setImmediate'
)
setTimeout(()->
console.log 'setTimeout'
, 0)
process.nextTick(()->
console.log 'nextTick'
)
server.listen(8280, ()->
console.log 'i/o event'
)
why the setTimeout always output befeore setImmediate

SetTimeOut - This type of function will be call after set time which is 0 in your case but it follows event loop. And event loop does not grantee that it will work after 0 seconds. In fact it only guarantees that function will be called after completing set time.
But, function can be called any time after completing time when node event queue is free to take up callback function
Source to understand event loop - https://www.youtube.com/watch?v=8aGhZQkoFbQ
SetImmediate - This will get called as and when it goes to stack and does not follow cycle of callback in event loop.

The main advantage to using setImmediate() over setTimeout() is setImmediate() will always be executed before any timers if scheduled within an I/O cycle, independently of how many timers are present.
However, if executed in the main module, the execution order will be nondeterministic and dependent on the performance of the process
see https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick/ for more information

Related

Why using sync functions in nodejs is known as "blocking the event loop"?

If I understand correctly, the EventLoop is the mechanism that node uses to resolve asynchronous operations and then passing them to the call stack, right? My question is, when I use a synchronous method (for example pbkdf2Sync) it will get stuck in the call stack until it is finished but it won't be moved to the EventLoop because is not an async operation, so why is this known as blocking the event loop if in reality it is blocking everything? not JUST the event loop (that as far as I understand, can continue working and will pass the callbacks to the call stack when they're finished)
Is my understanding of the NodeJs inner workins completly wrong? This topic specifically is kinda hard to understand because every resource I read differs in some way or another, so even if I think I get the bigger picture, these are the "details" that confuse me.
What is blocking
Why using sync functions in nodejs is known as "blocking the event loop"?
In a nutshell, it's because the event loop can only process the next event when you return from whatever your current Javascript is doing and allow the event loop to look for the next event. A sync function blocks the interpreter until it finishes. So, the entire time a sync function is working and you're waiting for it to return, the interpreter is blocked and control is not returned back to the event loop. This blocks the event loop and also blocks your Javascript from running.
Single Thread
Nodejs runs your Javascript all with a single thread. Other threads are used internally, but your Javascript itself runs only in a single thread (we're assuming there is no use of WorkerThreads in your code). So, when you make a synchronous function call, that single thread that runs your Javascript is busy and blocked until the synchronous function call returns and can then continue executing more of your Javascript.
This blocks everything. It blocks running more of your Javascript after the synchronous function call and it blocks getting back to the event loop to run any other event handlers that are pending such as incoming network events, timers, completion events from other things such as disk I/O, etc... So, while this is blocking the event loop, it's also blocking running more of your own code after the function call.
Non-Blocking, Asynchronous Operations
On the other hand, asynchronous functions such as fs.readFile(), for example, don't block. They initiate their operation and return immediately. This allows the interpreter to continue running any more of your own Javascript after the call to fs.readFile() and it also allows you to return from whatever event triggered your work in the first place which will return control back to the event loop so it can service other waiting events or other events that will trigger in the future. fs.readFile() then does most of its work in native code (behind the scenes) outside of the main thread that runs your Javascript. So, these type of asynchronous functions don't block the event loop - instead they cooperate with the event loop so that other things can get run while waiting for the completion of the asynchronous operation that was previously initiated. When they complete, they insert an event into the event loop that causes the event loop to call the completion callback at it's earliest convenience (when it's not blocked).
Differences in Blocking
It's also worth noting that functions that represent both synchronous and asynchronous operations both block the execution of your Javascript and block the event loop until they return. The difference is that an asynchronous operation returns from the function nearly immediately, long before the asynchronous operation itself is complete and communicates its completion and/or eventual result back via a promise, callback or event (which are all callbacks at the lowest level of the event loop). The synchronous operation does not return until the operation itself is complete. So, the asynchronous operation only blocks for a very short duration while the operation is being initiated whereas the synchronous operation blocks for the entire duration of the operation (until it completes).
More About the Event Loop
So, in what moment during the event loop is my javascript code executed?
When control returns back to the event loop, it goes through several different phases looking for things to do. When it finds something to do, that "something" results in calling a Javascript callback that starts running some of your Javascript. For example if the "something to do" is a setTimeout() timer that is ready to fire, then it will call the Javascript callback that was passed to setTimeout(). That callback runs to its completion and only when your Javascript returns from that callback does the event loop regain control and get to look for the next event to run and call its callback.
it won't be moved to the EventLoop because is not an async operation
This is not really the correct way to think about things. Things are not really "moved to the event loop".
A synchronous operation is just a blocking function call that returns when it returns and execution of any other Javascript is blocked until that blocking function call returns. Things are blocked because the single threaded interpreter running your Javascript is stuck waiting for this function to finish. It can't do anything else and the event loop is also blocked because it can't do anything until the interpreter returns control back to the event loop.
An asynchronous operation, on the other hand, initiates some operation (let's say it issues an http request to some other host) and then immediately returns, long before it has the result of that http request. Since this asynchronous operation returns before it has its result, it is considered non-blocking and because it returns quickly, you can then return from whatever event caused your code to run and that will then return control back to the event loop. That allows the event loop to then look for other events to handle and run their corresponding callbacks. Meanwhile, the asynchronous operation that was previously started has some native code associated with it (that may or may not be running in an native code OS thread - depending upon what type of operation it is). But regardless, that native code is configured such that when the asynchronous operation completes, it will insert an event into the appropriate event queue. So, at some future point when nodejs has control back in the event loop, it will find that event and run the Javascript callback associated with that event, thus notifying the original Javascript code that the asynchronous operation is now complete and providing some sort of result or error code.
Example
As a simple example, let's say you run this code:
// timer that wants to fire in 1 second
setTimeout(function() {
console.log("timer fired")
}, 1000);
// loop that blocks for 5 seconds
const start = Date.now();
while (Date.now() - start < 5000) { }
console.log("blocking loop finished");
This will output:
blocking loop finished
timer fired
Even though the timer was set to run 1 second from now, the while() loop blocked everything for 5 seconds so it wasn't until after the while() loop finished and returned back to the system that the event loop could look at what it had to do next and call the callback associated with the timer.
This while loop is similar to a blocking function such as pbkdf2Sync(). Both block the interpreter and don't return until they are finished and therefore nothing in the event loop gets a chance to run until they are finished.
A Simple Analogy
Here's a simple analogy. Imagine you need to contact your cable company to troubleshoot a problem. There are two ways for you to get ahold of them. The first way is that you call them and sit on the phone on hold for an hour waiting for a customer service representative to pick up your call. You can't really do much else while you're sitting on hold because you have to be right there waiting and ready to respond when someone finally comes on the line to help you. Thus, you are "blocked" from doing a lot of other things. This is a "blocking, synchronous operation". You can't really do much else while you're waiting on hold.
The second way you can call the cable company is to request a callback sometime in the future and when a representative is available, they will call you back. As soon as you're done requesting the callback, you can go about your business doing other things. You have to be able to answer an incoming call when it arrives, but other than that you're not blocked from doing many other things. This is a "non-blocking, asynchronous operation".
But, if while you're supposed to be able to answer your telephone callback from the second scenario above, you make another call and are on the phone for 30 minutes, the cable company is blocked from reaching you on the phone for the duration of your other call. In our analogy, you are "blocking the event loop" during your other call as the incoming event from the cable company can't be processed while you're on a call.

In Node.js, setImmediate() callbacks get executed on current event loop tick, or on the next one?

In the official docs, "setImmediate() vs setTimeout()" section, it says:
"setImmediate() is designed to execute a script once the current poll phase completes", and as i understand, it means also current tick/event loop cycle.
However later in the "process.nextTick() vs setImmediate()" section, it says:
"setImmediate() fires on the following iteration or 'tick' of the event loop".
So which one is the correct answer, did i miss anything here?
Thanks in advance.
Docs relevant page: https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick/
As you can see by the image below:
The diagram is from Event Loop and the Big Picture — NodeJS Event Loop Part 1 By Deepal Jayasekara
"The boxes around the circle illustrate the queues of the different phases we seen before, and the two boxes in the middle illustrate two special queues that need to be exhausted before moving from one phase to the other. The “next tick queue” processes callbacks registered via nextTick(), ..."
From Synk - Node.js Event-Loop: How even quick Node.js async functions can block the Event-Loop, starve I/O which reference the article above
So the difference between setTimeout, setImmediate and nextTick is:
setTimeout schedules a script to be run after a minimum threshold in ms has elapsed.1
setImmediate runs after the poll phase (the pool phase is the IO events queue in the diagram)
nextTick runs between phases.
1 The order in which the timers are executed will vary depending on the context in which they are called., see here for more
Simple answer:
setImmediate runs on the next tick.
Details:
setImmediate and setTimeout both run in different phases on the event loop.
That is why when setTimeout is set with zero delay it will not run immediately. Use setImmediate which always run on the next tick of the event loop.
but
process.nextTick basically does not care about phases of event loop. The callback you assign to this method will get executed after the current operation is complete and before event loop continues.
The document you linked has a very good explanation about the differences. I'll just grab your attention to few points here.
In setimmediate-vs-settimeout the order of call is nondeterministic UNLESS they are scheduled within an I/O cycle like below example.
Now add the process.nextTick to the example
const fs = require('fs');
// I/O cycle
fs.readFile(__filename, () => {
setTimeout(() => {
console.log('timeout');
}, 0);
setImmediate(() => {
console.log('immediate');
});
});
process.nextTick(() => {
console.log('nextTick')
})
Output:
nextTick
immediate
timeout
If you run the above code whatever is in process.nextTick function will get executed first. The reason is, it runs before event loop continues. Remember, both setTimeout and setImmediate are event loop related but not process.nextTick.
Be very careful how you use the process.nextTick. You can end up in bad situations if you are not using it properly.
The concept of tick is IMO out of scope for us as JS engineers, as there is no native API for us to find or count which tick we are in right now. Sometimes the poll phase waits for the incoming requests to be completed and the event loop stays in the same tick. However, we can indeed try and make sense of the concept with proper code examples.
I asked the same question on the official node.js dev doc and they haven't gotten back to me yet https://github.com/nodejs/nodejs.dev/issues/2343.
Based on the code I ran and observed, the documentation doesn't really explain it clearly. But here it goes:
setImmediate runs on the next tick of the event loop in the main module. Eg: The initial code from index.js/main.js.
// Both of the codes will be executed in the next tick of the event loop.
// Depending on the event-loop start time, if the timer is executed in the next tick, then setImmediate will also do the same, as the check-phase follows the timers phase.
// If the timer is delayed further then only setImmediate will be executed in the next tick and the timer a tick after that.
setTimeout(() => console.log('May be I run first'), 0);
setImmediate(() => console.log('May be I run first'));
setImmediate runs on the same iteration of the event loop inside an I/O callback. Because timer callbacks are always executed after setImmediate callbacks.
fs.read('/file/path', () => {
// check phase follows the I/O phase and poll phase.
// And when the poll phase is idle then the check phase is run right after.
setImmediate(() => console.log('I am printed first always'));
// Timers will be deferred to the next tick as timers-phase are the start of a "tick".
setTimeout(() => console.log('I will be called in the next tick'), 0);
setTimeout(() => console.log('I will be called may be tick after next one'), 100);
});

How does nodejs works asynchronously?

I get the response from /route1 until /route2 logs "ROUTE2".
But I studied that nodejs puts functions like setTimeout() in external threads and the main thread continues to work. Shouldn't the for loop executed in an external thread?
app.get("/route1",(req,res)=>
{
res.send("Route1");
});
app.get("/route2",(req,res)=>
{
setTimeout(()=>
{
console.log("ROUTE2");
for(let i=0;i<1000000000000000;i++);
res.send("Route2");
},10000);
})
Node.js uses an Event Loop to handle Asynchronous operations. And by Asynchronous operations I mean all I/O operations like interaction with the system's disk or network, Callback functions and Timers(setTimeout, setInterval). If you put Synchronous operations like a long running for loop inside a Timer or a Callback, it will block the event loop from executing the next Callback or I/O Operation.
In above case both /route1 and /route2 are Callbacks, so when you hit them, each callback is put into the Event Loop.
Below scenarios will give clear understanding on Asynchronous nature of Node.js:
Hit /route2 first and within 10 Seconds(Since setTimeout is for 10000 ms) hit /route1
In this case you will see the output Route 1 because setTimeout for 10 secs still isn't complete. So everything is asynchronous.
Hit /route2 first and after 10 Seconds(Since setTimeout is for 10000 ms) hit /route1
In this case you will have to wait till the for loop execution is complete
for(let i=0;i<1000000000000000;i++);
because a for loop is still a Synchronous operation so /route1 will have to wait till /route2 ends
Refer below Node.js Guides for more details:
Blocking vs Non-Blocking
The Node.js Event Loop

Single thread synchronous and asynchronous confusion

Assume makeBurger() will take 10 seconds
In synchronous program,
function serveBurger() {
makeBurger();
makeBurger();
console.log("READY") // Assume takes 5 seconds to log.
}
This will take a total of 25 seconds to execute.
So for NodeJs lets say we make an async version of makeBurgerAsync() which also takes 10 seconds.
function serveBurger() {
makeBurgerAsync(function(count) {
});
makeBurgerAsync(function(count) {
});
console.log("READY") // Assume takes 5 seconds to log.
}
Since it is a single thread. I have troubling imagine what is really going on behind the scene.
So for sure when the function run, both async functions will enter event loops and console.log("READY") will get executed straight away.
But while console.log("READY") is executing, no work is really done for both async function right? Since single thread is hogging console.log for 5 seconds.
After console.log is done. CPU will have time to switch between both async so that it can run a bit of each function each time.
So according to this, the function doesn't necessarily result in faster execution, async is probably slower due to switching between event loop? I imagine that, at the end of the day, everything will be spread on a single thread which will be the same thing as synchronous version?
I am probably missing some very big concept so please let me know. Thanks.
EDIT
It makes sense if the asynchronous operations are like query DB etc. Basically nodejs will just say "Hey DB handle this for me while I'll do something else". However, the case I am not understanding is the self-defined callback function within nodejs itself.
EDIT2
function makeBurger() {
var count = 0;
count++; // 1 time
...
count++; // 999999 times
return count;
}
function makeBurgerAsync(callback) {
var count = 0;
count++; // 1 time
...
count++; // 999999 times
callback(count);
}
In node.js, all asynchronous operations accomplish their tasks outside of the node.js Javascript single thread. They either use a native code thread (such as disk I/O in node.js) or they don't use a thread at all (such as event driven networking or timers).
You can't take a synchronous operation written entirely in node.js Javascript and magically make it asynchronous. An asynchronous operation is asynchronous because it calls some function that is implemented in native code and written in a way to actually be asynchronous. So, to make something asynchronous, it has to be specifically written to use lower level operations that are themselves asynchronous with an asynchronous native code implementation.
These out-of-band operations, then communicate with the main node.js Javascript thread via the event queue. When one of these asynchronous operations completes, it adds an event to the Javascript event queue and then when the single node.js thread finishes what it is currently doing, it grabs the next event from the event queue and calls the callback associated with that event.
Thus, you can have multiple asynchronous operations running in parallel. And running 3 operations in parallel will usually have a shorter end-to-end running time than running those same 3 operations in sequence.
Let's examine a real-world async situation rather than your pseudo-code:
function doSomething() {
fs.readFile(fname, function(err, data) {
console.log("file read");
});
setTimeout(function() {
console.log("timer fired");
}, 100);
http.get(someUrl, function(err, response, body) {
console.log("http get finished");
});
console.log("READY");
}
doSomething();
console.log("AFTER");
Here's what happens step-by-step:
fs.readFile() is initiated. Since node.js implements file I/O using a thread pool, this operation is passed off to a thread in node.js and it will run there in a separate thread.
Without waiting for fs.readFile() to finish, setTimeout() is called. This uses a timer sub-system in libuv (the cross platform library that node.js is built on). This is also non-blocking so the timer is registered and then execution continues.
http.get() is called. This will send the desired http request and then immediately return to further execution.
console.log("READY") will run.
The three asynchronous operations will complete in an indeterminate order (whichever one completes it's operation first will be done first). For purposes of this discussion, let's say the setTimeout() finishes first. When it finishes, some internals in node.js will insert an event in the event queue with the timer event and the registered callback. When the node.js main JS thread is done executing any other JS, it will grab the next event from the event queue and call the callback associated with it.
For purposes of this description, let's say that while that timer callback is executing, the fs.readFile() operation finishes. Using it's own thread, it will insert an event in the node.js event queue.
Now the setTimeout() callback finishes. At that point, the JS interpreter checks to see if there are any other events in the event queue. The fs.readfile() event is in the queue so it grabs that and calls the callback associated with that. That callback executes and finishes.
Some time later, the http.get() operation finishes. Internal to node.js, an event is added to the event queue. Since there is nothing else in the event queue and the JS interpreter is not currently executing, that event can immediately be serviced and the callback for the http.get() can get called.
Per the above sequence of events, you would see this in the console:
READY
AFTER
timer fired
file read
http get finished
Keep in mind that the order of the last three lines here is indeterminate (it's just based on unpredictable execution speed) so that precise order here is just an example. If you needed those to be executed in a specific order or needed to know when all three were done, then you would have to add additional code in order to track that.
Since it appears you are trying to make code run faster by making something asynchronous that isn't currently asynchronous, let me repeat. You can't take a synchronous operation written entirely in Javascript and "make it asynchronous". You'd have to rewrite it from scratch to use fundamentally different asynchronous lower level operations or you'd have to pass it off to some other process to execute and then get notified when it was done (using worker processes or external processes or native code plugins or something like that).

How to know when the Promise is actually resolved in Node.js?

When we are using Promise in nodejs, given a Promise p, we can't know when the Promise p is actually resolved by logging the currentTime in the "then" callback.
To prove that, I wrote the test code below (using CoffeeScript):
# the procedure we want to profile
getData = (arg) ->
new Promise (resolve) ->
setTimeout ->
resolve(arg + 1)
, 100
# the main procedure
main = () ->
beginTime = new Date()
console.log beginTime.toISOString()
getData(1).then (r) ->
resolveTime = new Date()
console.log resolveTime.toISOString()
console.log resolveTime - beginTime
cnt = 10**9
--cnt while cnt > 0
return cnt
main()
When you run the above code, you will notice that the resolveTime (the time your code run into the callback function) is much later than 100ms from the beginTime.
So If we want to know when the Promise is actually resolved, HOW?
I want to know the exact time because I'm doing some profiling via logging. And I'm not able to modify the Promise p 's implementation when I'm doing some profiling outside of the black box.
So, Is there some function like promise.onUnderlyingConditionFulfilled(callback) , or any other way to make this possible?
This is because you have a busy loop that apparently takes longer than your timer:
cnt = 10**9
--cnt while cnt > 0
Javascript in node.js is single threaded and event driven. It can only do one thing at a time and it will finish the current thing it's doing before it can service the event posted by setTimeout(). So, if your busy loop (or any other long running piece of Javascript code) takes longer than you've set your timer for, the timer will not be able to run until this other Javascript code is done. "single threaded" means Javascript in node.js only does one thing at a time and it waits until one thing returns control back to the system before it can service the next event waiting to run.
So, here's the sequence of events in your code:
It calls the setTimeout() to schedule the timer callback for 100ms from now.
Then you go into your busy loop.
While it's in the busy loop, the setTimeout() timer fires inside of the JS implementation and it inserts an event into the Javascript event queue. That event can't run at the moment because the JS interpreter is still running the busy loop.
Then eventually it finishes the busy loop and returns control back to the system.
When that is done, the JS interpreter then checks the event queue to see if any other events need servicing. It finds the timer event and so it processes that and the setTimeout() callback is called.
That callback resolves the promise which triggers the .then() handler to get called.
Note: Because of Javascript's single threaded-ness and event-driven nature, timers in Javascript are not guaranteed to be called exactly when you schedule them. They will execute as close to that as possible, but if other code is running at the time they fire or if their are lots of items in the event queue ahead of you, that code has to finish before the timer callback actually gets to execute.
So If we want to know when the Promise is actually resolved, HOW?
The promise is resolved when your busy loop is done. It's not resolved at exactly the 100ms point (because your busy loop apparently takes longer than 100ms to run). If you wanted to know exactly when the promise was resolved, you would just log inside the setTimeout() callback right where you call resolve(). That would tell you exactly when the promise was resolved though it will be pretty much the same as where you're logging now. The cause of your delay is the busy loop.
Per your comments, it appears that you want to somehow measure exactly when resolve() is actually called in the Promise, but you say that you cannot modify the code in getData(). You can't really do that directly. As you already know, you can measure when the .then() handler gets called which will probably be no more than a couple ms after resolve() gets called.
You could replace the promise infrastructure with your own implementation and then you could instrument the resolve() callback directly, but replacing or hooking the promise implementation probably influences the timing of things even more than just measuring from the .then() handler.
So, it appears to me that you've just over-constrained the problem. You want to measure when something inside of some code happens, but you don't allow any instrumentation inside that code. That just leaves you with two choices:
Replace the promise implementation so you can instrument resolve() directly.
Measure when .then() is triggered.
The first option probably has a heisenberg uncertainty issue in that you've probably influenced the timing by more than you should if you replace or hook the promise implementation.
The second option measures an event that happens just slightly after the actual .resolve(). Pick which one sounds closest to what you actually want.

Resources